
Qwen3-4B-Instruct-2507 Tool Calling in Practice: A Step-by-Step Guide to Building an Intelligent Q&A System

news 2026/7/22 19:37:17


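1. Introduction

As AI technology advances, building an efficient and reliable question-answering system has become a common need for companies and developers. Qwen3-4B-Instruct-2507 is an open-source large language model that performs well at instruction following and tool calling, which makes it a good fit for this kind of system.

This tutorial starts from scratch: you will deploy Qwen3-4B-Instruct-2507 as a model service with vLLM, build an interactive front end with Chainlit, and end up with a working Q&A system that can call tools. It is aimed at both AI developers and technical hobbyists who want to learn the core steps of deploying and calling a large model.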

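2. Environment Setup and Model Deployment

2.1 Hardware and Software Requirements

Before you start, make sure your system meets the following requirements:

GPU: 24 GB of VRAM or more recommended (for example, NVIDIA A10G or RTX 4090)
System memory: 64 GB or more recommended
Storage: the model files take about 8 GB; reserve 15 GB
Python: 3.9 or later (recent vLLM releases no longer support 3.8)
CUDA: 11.7 or 12.x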
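The storage and VRAM figures above can be sanity-checked with some quick arithmetic. The sketch below is a rough estimate only: it assumes about 4 billion parameters stored in BF16 (2 bytes each), and the layer count, KV-head count, and head dimension passed to `kv_cache_gib` are assumed values for illustration. Read the real ones from the model's `config.json` before relying on the result.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    return num_params * bytes_per_param / 1024**3


def kv_cache_gib(tokens: int, num_layers: int, num_kv_heads: int,
                 head_dim: int, bytes_per_value: int = 2) -> float:
    """Approximate KV-cache memory for `tokens` cached tokens, in GiB.

    The leading factor 2 accounts for storing both keys and values.
    """
    per_token_bytes = 2 * num_layers * num_kv_heads * head_dim * bytes_per_value
    return tokens * per_token_bytes / 1024**3


# Assumption: ~4e9 parameters in BF16.
weights = weight_memory_gib(4e9)

# Assumption: these architecture numbers are illustrative, not taken from the article.
cache = kv_cache_gib(tokens=32768, num_layers=36, num_kv_heads=8, head_dim=128)

print(f"weights: {weights:.1f} GiB, KV cache for 32k tokens: {cache:.1f} GiB")
```

Under these assumptions the weights account for most of the 8 GB of model files, and the rest of the GPU memory is what the server can use for the KV cache and activations.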

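2.2 Deploying the Model Service with vLLM

vLLM is a high-performance inference engine that is well suited to serving large language models. The steps for deploying Qwen3-4B-Instruct-2507 are as follows.

Install vLLM and its dependencies: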

pip install vllm

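Start the model service. Use the OpenAI-compatible entrypoint, because the Chainlit client in section 3 connects to the /v1 endpoints. The two tool-call flags are needed for tool_choice="auto" in section 4, and --max-model-len caps the context length to reduce memory use (adjust or drop it to suit your GPU):

python -m vllm.entrypoints.openai.api_server \
  --model Qwen/Qwen3-4B-Instruct-2507 \
  --tensor-parallel-size 1 \
  --gpu-memory-utilization 0.9 \
  --max-model-len 32768 \
  --enable-auto-tool-choice \
  --tool-call-parser hermes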

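Check the service status. The command below assumes the server output was redirected to /root/workspace/llm.log; if it was not, read the log in the terminal where the server is running: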

cat /root/workspace/llm.log

If you see output similar to the following, the server has started and the model is loaded:

INFO: Started server process [12345]
INFO: Uvicorn running on http://0.0.0.0:8000
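Besides reading the log, you can ask the server which models it is serving. The sketch below assumes the server exposes the OpenAI-compatible `/v1/models` endpoint on `http://localhost:8000`; the helper names `parse_model_ids` and `served_model_ids` are made up for this example.

```python
import json
import urllib.request


def parse_model_ids(payload: dict) -> list[str]:
    """Extract model ids from an OpenAI-style /v1/models response body."""
    return [entry["id"] for entry in payload.get("data", [])]


def served_model_ids(base_url: str = "http://localhost:8000",
                     timeout: float = 5.0) -> list[str]:
    """Query the running server and return the ids of the models it serves.

    Raises urllib.error.URLError if the server is not reachable.
    """
    with urllib.request.urlopen(f"{base_url}/v1/models", timeout=timeout) as resp:
        return parse_model_ids(json.load(resp))
```

Calling `served_model_ids()` once the server is up should return a list containing the name passed to `--model`; a connection error usually means the server is still loading or is listening on a different port.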

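3. Building the Interactive Front End

3.1 Basic Chainlit Setup

Chainlit is a lightweight Python framework for quickly building web interfaces for LLM applications. First install the required packages:

pip install chainlit openai

Create a file named app.py with the following basic setup: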

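import chainlit as cl
from openai import OpenAI

# Point the OpenAI client at the local vLLM server. The key is a placeholder.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="none")

@cl.on_message
async def main(message: cl.Message):
    # Forward the user's message to the model. This synchronous call blocks until the reply is ready.
    response = client.chat.completions.create(
        model="Qwen/Qwen3-4B-Instruct-2507",
        messages=[{"role": "user", "content": message.content}],
        temperature=0.7,
    )
    # Send the model's reply back to the chat UI.
    await cl.Message(content=response.choices[0].message.content).send()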
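The handler above sends only the latest message, so the model has no memory of earlier turns. If you want multi-turn context, one option is to keep a bounded message history per chat session. The helper below is a minimal sketch; the name `append_turn` and the default limit of 20 messages are assumptions for this example, not part of Chainlit or the OpenAI client.

```python
def append_turn(history: list[dict], role: str, content: str,
                max_messages: int = 20) -> list[dict]:
    """Return a new history with one turn appended, keeping only the newest messages.

    The input list is not modified, so it is safe to keep the old version around.
    """
    if max_messages < 1:
        raise ValueError("max_messages must be at least 1")
    updated = [*history, {"role": role, "content": content}]
    return updated[-max_messages:]
```

In app.py you could load the list with `cl.user_session.get("history") or []`, append the user turn, pass the result as `messages`, then append the assistant reply and store it back with `cl.user_session.set("history", ...)`.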

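3.2 Starting the Chainlit Service

Run the following command to start the front end (port 8001 avoids a clash with the vLLM server on port 8000):

chainlit run app.py --port 8001

Open http://localhost:8001 in a browser. You will see a simple chat interface and can start talking to the model.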

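4. Implementing Tool Calling

4.1 Defining the Tool Functions

For the model to call external tools, you first need to define the functions it can use. Add the following code to app.py: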

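from datetime import datetime

def get_weather(location: str, date: str = "today"):
    """Return the weather for the given location and date."""
    # A real weather API call would go here; this returns fixed placeholder data.
    return f"The weather in {location} for {date} is sunny, 25°C"

def get_current_time():
    """Return the current local time."""
    return datetime.now().strftime("%Y-%m-%d %H:%M:%S")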
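One way to connect these functions to the names the model emits in a tool call is a small registry that maps each name to its function. The sketch below repeats the two tool functions so that it runs on its own; `TOOL_REGISTRY` and `dispatch_tool` are hypothetical names introduced for this example. It also handles unknown tool names and malformed arguments, since the argument string is generated by the model and cannot be trusted to be valid.

```python
import json
from datetime import datetime


def get_weather(location: str, date: str = "today") -> str:
    """Placeholder weather lookup (fixed data, no real API call)."""
    return f"The weather in {location} for {date} is sunny, 25°C"


def get_current_time() -> str:
    """Return the current local time."""
    return datetime.now().strftime("%Y-%m-%d %H:%M:%S")


# Map the tool names exposed to the model onto the Python functions.
TOOL_REGISTRY = {
    "get_weather": get_weather,
    "get_current_time": get_current_time,
}


def dispatch_tool(name: str, arguments_json: str) -> str:
    """Run the named tool with JSON-encoded arguments and return a text result."""
    func = TOOL_REGISTRY.get(name)
    if func is None:
        return f"Unknown tool: {name}"
    try:
        args = json.loads(arguments_json or "{}")
        return str(func(**args))
    except (json.JSONDecodeError, TypeError) as exc:
        # Invalid JSON, or arguments that do not match the function signature.
        return f"Could not run {name}: {exc}"
```

With a registry, adding a tool means adding one dictionary entry, with no new branch in the message handler.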

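4.2 Tool Call Handling Logic

Modify the main handler to support tool calling (this version replaces the main function from section 3.1):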

import json  # add next to the other imports at the top of app.py

# Tool schemas in the OpenAI format: each entry is wrapped in {"type": "function", "function": {...}}.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the weather for a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string"},
                    "date": {"type": "string"},
                },
                "required": ["location"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "get_current_time",
            "description": "Get the current time",
            "parameters": {"type": "object", "properties": {}},
        },
    },
]

@cl.on_message
async def main(message: cl.Message):
    response = client.chat.completions.create(
        model="Qwen/Qwen3-4B-Instruct-2507",
        messages=[{"role": "user", "content": message.content}],
        temperature=0.7,
        tools=TOOLS,
        tool_choice="auto",
    )
    response_message = response.choices[0].message
    tool_calls = response_message.tool_calls
    if tool_calls:
        # Run each requested tool and show its result directly in the chat.
        for tool_call in tool_calls:
            function_name = tool_call.function.name
            function_args = json.loads(tool_call.function.arguments)
            if function_name == "get_weather":
                weather = get_weather(**function_args)
                await cl.Message(content=weather).send()
            elif function_name == "get_current_time":
                current_time = get_current_time()
                await cl.Message(content=current_time).send()
    else:
        # No tool was requested, so show the model's text reply (content can be None).
        await cl.Message(content=response_message.content or "").send()
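The handler above shows the raw tool output to the user. A common alternative is to send the tool results back to the model so that it can phrase the final answer itself. In the OpenAI chat format this means replaying the assistant turn that requested the tools, followed by one `tool` message per call. The sketch below only builds that message list; `build_followup_messages` and its plain-dict inputs are assumptions made for this example.

```python
def build_followup_messages(user_text: str, tool_calls: list[dict],
                            tool_results: dict[str, str]) -> list[dict]:
    """Build the message list for the second model call.

    tool_calls:   dicts with "id", "name" and "arguments" (a JSON string)
    tool_results: maps each tool call id to the text returned by the tool
    """
    # The assistant turn that asked for the tools must be replayed verbatim.
    assistant_turn = {
        "role": "assistant",
        "content": "",
        "tool_calls": [
            {
                "id": call["id"],
                "type": "function",
                "function": {"name": call["name"], "arguments": call["arguments"]},
            }
            for call in tool_calls
        ],
    }
    # One "tool" message per call, linked back through tool_call_id.
    tool_turns = [
        {"role": "tool", "tool_call_id": call["id"], "content": tool_results[call["id"]]}
        for call in tool_calls
    ]
    return [{"role": "user", "content": user_text}, assistant_turn, *tool_turns]
```

Passing the returned list as `messages` in a second `client.chat.completions.create` call should produce a natural-language answer based on the tool output, which you can then send to the chat instead of the raw result.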