
Chapter 2: Tools, the "Hands and Feet" of an AI Agent

news 2026/5/13 7:59:23

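The core architecture of an AI Agent essentially mirrors how humans handle problems: gather information through the senses, analyze and decide with the brain, then act with the hands and feet. In an AI Agent, this maps to three core modules: perception, decision-making, and execution. The three modules work together to form the complete closed loop that lets an AI Agent operate autonomously.

This chapter starts with the execution module and its core building block, the Tool. For now, perception and decision-making are left to our own brains.

The execution module is the AI Agent's "vehicle for action", comparable to human hands and feet. It carries out the plan produced by the decision module, completes the concrete task, and reports both the progress and the final result back to the decision module, closing the loop.

The execution module's core capabilities are "tool calling" and "action execution". This is also one of its key differences from traditional AI tools: a traditional AI tool can only respond passively, whereas an AI Agent's execution module can actively call tools to complete complex tasks, for example calling a mail tool to send an email or a document tool to compile meeting minutes.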

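Take the Doubao desktop client as an example of a traditional AI tool. Even when explicitly asked to write files to the local disk, it cannot do so.

It can only tell you which commands to run, which files to create, and what code goes in them, and then leaves you to copy everything over and run it by hand.

By contrast, Cursor, Trae, or the Claude Code CLI can carry the task through from start to finish in one go.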

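The core of the execution module is the Tool. Its working logic is simple and has two stages: "execute" and "feedback".

To get the same capability, we can develop a few tools for the agent to call, just as Cursor does.

For example: reading files, writing files, and running commands.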
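The "execute" and "feedback" stages can be pictured with a minimal sketch that uses no framework at all. Everything in it is hypothetical: the in-memory `files` map stands in for a real disk, and `toolbox` and `runTool` are illustrative names rather than part of any library. Running commands is left out because it cannot be simulated meaningfully here.

```javascript
// Hypothetical in-memory "disk" so the sketch runs without touching real files.
const files = new Map([['notes.txt', 'hello agent']]);

// Each tool is a plain function registered under a name (illustrative names only).
const toolbox = new Map([
  ['read_file', ({ filePath }) => {
    if (!files.has(filePath)) throw new Error(`No such file: ${filePath}`);
    return files.get(filePath);
  }],
  ['write_file', ({ filePath, content }) => {
    files.set(filePath, content);
    return `Wrote ${content.length} characters to ${filePath}`;
  }],
]);

// "Execute" one requested action, then "feed back" a result record,
// whether the action succeeded or failed.
function runTool(name, args) {
  const fn = toolbox.get(name);
  if (!fn) return { ok: false, output: `Unknown tool: ${name}` };
  try {
    return { ok: true, output: fn(args) };
  } catch (error) {
    return { ok: false, output: error.message };
  }
}

console.log(runTool('write_file', { filePath: 'todo.txt', content: 'ship it' }));
console.log(runTool('read_file', { filePath: 'todo.txt' }));
console.log(runTool('read_file', { filePath: 'missing.txt' }));
```

Returning a record on failure instead of throwing is a design choice in this sketch: the decision side always receives feedback it can react to, even when the action did not work.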

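I. Environment Setup

PS: Before starting development, you also need access to a usable large language model. A Chinese one works fine, such as Qwen or Zhipu; both give new users free tokens that last a long time.

Now to the main topic: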

    # Create the directory
    mkdir tool-test
    # Enter the directory
    cd tool-test
    # Initialize the project
    npm init -y

Install the dependencies:

    # LangChain framework
    pnpm i @langchain/openai @langchain/core
    # Utility libraries
    pnpm i zod dotenv

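zod: runtime data validation plus TypeScript type inference

dotenv: loads environment variables from a .env file

Create a .env file and put your model service configuration in it: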

    # OpenAI configuration
    OPENAI_MODEL_NAME=your model name
    OPENAI_API_KEY=your API key
    OPENAI_BASE_URL=your API service URL
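A missing or misspelled variable tends to surface only later, as an opaque API error, so it can help to check the configuration up front. The sketch below is one possible approach: `missingEnvKeys` is a hypothetical helper, not part of dotenv, and it assumes the three variable names from the .env file above.

```javascript
// The three variables the .env file is expected to define.
const REQUIRED_KEYS = ['OPENAI_MODEL_NAME', 'OPENAI_API_KEY', 'OPENAI_BASE_URL'];

// Hypothetical helper: returns the names that are absent or blank in `env`.
function missingEnvKeys(env, keys = REQUIRED_KEYS) {
  return keys.filter((key) => !env[key] || env[key].trim() === '');
}

// In the real project, dotenv would already have populated process.env by this point.
const missing = missingEnvKeys(process.env);
if (missing.length > 0) {
  console.warn(`Missing configuration: ${missing.join(', ')}`);
}
```

Passing the environment in as an argument, rather than reading process.env inside the helper, keeps the check easy to try out with a hand-built object.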

Next, write a hello-world program: create hello-langchain.mjs in the src directory.

II. Developing a File-Reading Tool

Create tool-file-read.mjs in the src directory:

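    // Load environment variables from .env into process.env
    import 'dotenv/config';
    import { ChatOpenAI } from '@langchain/openai';
    import { tool } from '@langchain/core/tools';
    // ToolMessage is needed once tool results are sent back to the model
    import { SystemMessage, HumanMessage, ToolMessage } from '@langchain/core/messages';
    // Zod: defines the schema for the tool's arguments and describes them to the model
    import { z } from 'zod';
    // Node's built-in file system module (synchronous reads and writes)
    import fs from 'node:fs';

    // Instantiate the chat model; the endpoint and key come from environment variables
    const model = new ChatOpenAI({
      model: process.env.OPENAI_MODEL_NAME, // model name, e.g. gpt-4o-mini
      apiKey: process.env.OPENAI_API_KEY, // API key
      temperature: 0, // temperature 0 for more deterministic output
      configuration: {
        baseURL: process.env.OPENAI_BASE_URL, // custom API base URL (proxy or third-party compatible service)
      },
    });

    // Define the "read file" tool: an async function body plus metadata (name, description, argument schema).
    // The function receives one object shaped like the schema, so filePath is destructured from it.
    const readFileTool = tool(
      async ({ filePath }) => {
        const content = fs.readFileSync(filePath, 'utf8');
        return `The content of ${filePath} is: ${content}`;
      },
      {
        name: 'read_file', // identifier the model uses to request this tool
        description: 'A tool for reading file contents. Use it when the user asks to read a file or view its contents. It takes a file path as input.',
        schema: z.object({
          filePath: z.string().describe('Path of the file to read'),
        }),
      }
    );

    // Tool list; more tools can be added here
    const tools = [readFileTool];

    // Bind the tools to the model so that invoke() can return tool_calls
    const modelWithTools = model.bindTools(tools);

    // Conversation messages: the system prompt first, then the user's task
    const messages = [
      new SystemMessage(
        `You are an assistant that reads file contents. When the user asks to read a file or view its contents, you can use the tool below, which takes a file path as input.

    Workflow:
    1. Choose the appropriate tool based on the user's request
    2. Use the tool to read the file and wait for its contents
    3. Analyze and explain the file contents

    Available tools:
    - read_file: reads the contents of a file
    `),
      new HumanMessage(`Read the file at "src/tool-file-read.mjs" and explain the code`),
    ];

    // Run one inference with tools bound (the model may only name the tool to call; it does not run the tool itself)
    const response = await modelWithTools.invoke(messages);
    // Print the model's response (content, tool_calls, etc.)
    console.log(response);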

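The overall flow of the code is roughly:

Create the model instance
Create the tool
Bind the tool to the model instance
Set the system prompt (SystemMessage)
Provide the user prompt (HumanMessage)
Invoke the model and receive its reply (AIMessage)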

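There are four message types: SystemMessage, HumanMessage, AIMessage, and ToolMessage.

SystemMessage: defines who the AI is, what it can do, what capabilities it has, and the rules for its answers and behavior
HumanMessage: the information entered by the user
AIMessage: the AI's reply
ToolMessage: the result returned by a tool call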
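To make the relationship between the four message types concrete, the sketch below models a conversation as plain objects rather than LangChain classes. The `role` strings, the sample file path, and the `unansweredToolCalls` helper are assumptions made for illustration; the field names `tool_calls` and `tool_call_id` mirror those used in this chapter's code.

```javascript
// A hypothetical conversation in which the model requests one tool call.
const history = [
  { role: 'system', content: 'You are an assistant that reads files.' },
  { role: 'human', content: 'Read src/app.mjs and explain it.' },
  // The AI message asks for a tool instead of answering directly.
  {
    role: 'ai',
    content: '',
    tool_calls: [{ id: 'call_1', name: 'read_file', args: { filePath: 'src/app.mjs' } }],
  },
  // The tool message carries the result and points back to the request by id.
  { role: 'tool', tool_call_id: 'call_1', content: 'console.log("hi");' },
  { role: 'ai', content: 'The file prints "hi" to the console.' },
];

// Hypothetical check: list the ids of tool calls that never received a tool message.
function unansweredToolCalls(messages) {
  const answered = new Set(
    messages.filter((m) => m.role === 'tool').map((m) => m.tool_call_id)
  );
  return messages
    .filter((m) => m.role === 'ai')
    .flatMap((m) => m.tool_calls ?? [])
    .map((call) => call.id)
    .filter((id) => !answered.has(id));
}

console.log(unansweredToolCalls(history)); // prints []
```

The id is what lets a model that requested several tools at once tell which result belongs to which request.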

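Running the file prints the model's response. It is an AIMessage, and when the model decides to use a tool, its tool_calls field lists the calls it is requesting.

Next we use tool_calls to actually execute the tool. The code below replaces the final invoke and console.log statements of the file, and it needs ToolMessage to be imported from @langchain/core/messages.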

    let response = await modelWithTools.invoke(messages);
    // Print the model's response (content, tool_calls, etc.)
    // console.log(response);

    // Append the model's response to the message list
    messages.push(response);

    while (response.tool_calls?.length > 0) {
      console.log(`\n[Detected ${response.tool_calls.length} tool call(s)]`);

      // Run all requested tool calls concurrently; Promise.all keeps the results in request order
      const toolResults = await Promise.all(response.tool_calls.map(async (toolCall) => {
        const toolName = toolCall.name;
        const toolArgs = toolCall.args;
        // Named matchedTool so that it does not shadow the imported tool() helper
        const matchedTool = tools.find((t) => t.name === toolName);
        if (!matchedTool) {
          throw new Error(`Tool not found: ${toolName}`);
        }

        console.log(`\nExecuting tool call: ${toolName}(${JSON.stringify(toolArgs)})`);
        try {
          const toolResult = await matchedTool.invoke(toolArgs);
          // Return a ToolMessage; tool_call_id links the result to the request
          return new ToolMessage({
            content: toolResult,
            tool_call_id: toolCall.id,
          });
        } catch (error) {
          console.error(`Tool call failed: ${toolName}, error: ${error.message}`);
          // Report the failure to the model as well, so it can react to it
          return new ToolMessage({
            content: `Tool call failed: ${toolName}, error: ${error.message}`,
            tool_call_id: toolCall.id,
          });
        }
      }));

      // Update the message history with the tool results
      messages.push(...toolResults);

      // Call the model again, now with the tool results included
      console.log(`Calling the model again with ${messages.length} messages`);
      response = await modelWithTools.invoke(messages);
      messages.push(response);
    }

    console.log(`\n[Final response]`);
    console.log(response.content);

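The flow is roughly:

1. Append the model's reply (AIMessage) to the conversation list.

2. For each entry in the tool_calls array, find the matching tool in the tools array.

3. Invoke that tool with the arguments the model extracted.

4. Pass the tool result back to the model as a ToolMessage so that it can continue its answer.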
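The four steps above can be rehearsed without a network connection. In the sketch below, `fakeModel` is a scripted stand-in for a real LLM, `tools` is a hypothetical registry with a canned result, and `maxRounds` is an added safeguard that the chapter's loop does not have. Real model and tool calls are asynchronous; that is left out here to keep the flow easy to follow.

```javascript
// Scripted stand-in for an LLM: it requests a tool first,
// then answers once a tool result is present in the history.
function fakeModel(messages) {
  const lastToolMessage = [...messages].reverse().find((m) => m.role === 'tool');
  if (!lastToolMessage) {
    return {
      role: 'ai',
      content: '',
      tool_calls: [{ id: 'call_1', name: 'read_file', args: { filePath: 'a.txt' } }],
    };
  }
  return { role: 'ai', content: `Summary: ${lastToolMessage.content}`, tool_calls: [] };
}

// Hypothetical tool registry; the tool returns canned text instead of reading a file.
const tools = [{ name: 'read_file', invoke: ({ filePath }) => `contents of ${filePath}` }];

function runAgent(messages, maxRounds = 5) {
  let response = fakeModel(messages);
  messages.push(response); // step 1: keep the model's reply in the history
  for (let round = 0; response.tool_calls.length > 0 && round < maxRounds; round++) {
    for (const call of response.tool_calls) {
      const found = tools.find((t) => t.name === call.name); // step 2: look up the tool
      const content = found
        ? found.invoke(call.args) // step 3: run it with the model's arguments
        : `Tool not found: ${call.name}`;
      messages.push({ role: 'tool', tool_call_id: call.id, content }); // step 4: feed back
    }
    response = fakeModel(messages);
    messages.push(response);
  }
  return response.content;
}

const messages = [{ role: 'human', content: 'Read a.txt' }];
console.log(runAgent(messages)); // prints "Summary: contents of a.txt"
console.log(messages.length); // prints 4: human, ai (tool request), tool, ai (answer)
```

The round limit is a design choice of this sketch: a model that keeps requesting tools would otherwise loop forever. A missing tool is also reported back as a message rather than thrown, which differs from the chapter's code.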

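Run it.

As the output shows, the model detected the need for the read_file tool and requested it, the file was read, and the model then analyzed the contents and explained the code.

With that, the model has gained the ability to "read".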

Finally, put the project under version control: run git init and add a .gitignore that excludes node_modules and the sensitive .env file.

