Qwen3-1.7B Use Case: A Complete Workflow for Quickly Building an Intelligent Q&A Assistant

news 2026/6/30 13:38:01

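1. Project Overview and Preparation

1.1 About the Qwen3-1.7B Model

Qwen3-1.7B is a lightweight member of the Qwen (Tongyi Qianwen) family of open-source language models from Alibaba, with 1.7 billion parameters. It delivers solid inference quality while keeping hardware requirements modest, which makes it well suited to quickly building AI applications.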

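1.2 Use Cases for an Intelligent Q&A Assistant

An intelligent Q&A assistant can be applied in several areas:

Automated answers from an enterprise knowledge base
E-commerce customer-service bots
Q&A systems for education
Automated technical-support replies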

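1.3 Environment Setup

Make sure the following are in place:

A deployed Qwen3-1.7B image environment
A basic Python development environment (version 3.8 or later)
Network access (for API calls)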
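
The checklist above can be verified with a short script. This is a minimal sketch using only the standard library; the helper names `python_version_ok` and `port_is_open` are made up for illustration, and port 8000 is assumed from the service address used later in this article.

```python
import socket
import sys


def python_version_ok(minimum=(3, 8)):
    """Return True if the running interpreter meets the minimum version."""
    return sys.version_info[:2] >= minimum


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("Python 3.8+:", python_version_ok())
    # Port 8000 is the model service address assumed later in this article
    print("Model service reachable:", port_is_open("localhost", 8000))
```

A `False` on the second line usually means the model service has not started yet or listens on a different port.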

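2. Quick Start and Model Invocation

2.1 Starting the Jupyter Environment

Open a terminal and run the following command to start the container:

docker run -it --gpus all -p 8888:8888 qwen3-1.7b-image

Then open http://localhost:8888 in a browser to reach the Jupyter interface.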

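2.2 Basic Model Invocation

The following code calls Qwen3-1.7B through the LangChain framework:

from langchain_openai import ChatOpenAI

# Initialize the model client
chat_model = ChatOpenAI(
    model="Qwen3-1.7B",
    temperature=0.5,  # controls sampling randomness
    base_url="http://localhost:8000/v1",  # local service address
    api_key="EMPTY",  # no real API key is needed
    # Extra request fields; whether they are honored depends on the serving backend
    extra_body={
        "enable_thinking": True,  # enable chain-of-thought
        "return_reasoning": True,  # return the reasoning process
    },
    streaming=True,  # enable streaming output
)

# Simple Q&A test
response = chat_model.invoke("Tell me about yourself")
print(response.content)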
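
With thinking mode enabled, Qwen3 models wrap their reasoning in `<think>...</think>` tags. Whether the server moves that text into a separate field or leaves it inline in the reply depends on the backend. The sketch below assumes the inline case and separates the reasoning from the final answer; `split_reasoning` is a hypothetical helper, not part of LangChain.

```python
import re

THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text):
    """Split model output into (reasoning, answer).

    Assumes reasoning, if present, is wrapped in <think>...</think> tags.
    """
    reasoning = "\n".join(m.strip() for m in THINK_BLOCK.findall(text))
    answer = THINK_BLOCK.sub("", text).strip()
    return reasoning, answer


# Hand-written sample output, not a real model response
raw = "<think>The user wants a greeting.</think>\nHello! How can I help?"
reasoning, answer = split_reasoning(raw)
print(answer)
```

Text without any tags passes through unchanged as the answer, so the helper is safe to apply to every response.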

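3. Building a Complete Q&A System

3.1 System Architecture

A complete Q&A system typically consists of the following components:

User interface layer (web, app, or CLI)
Request-processing middleware
Core Q&A engine (Qwen3-1.7B)
Knowledge-base integration module
Logging and monitoring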
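
One way the components above might fit together is sketched below. Everything here is a stand-in: `QAPipeline` is a hypothetical class, the retriever and engine are stub functions, and the sample policy text is invented for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class QAPipeline:
    """Wires the layers together; every component here is a stand-in."""
    retrieve: Callable[[str], str]  # knowledge-base integration module
    generate: Callable[[str], str]  # core Q&A engine (e.g. a model call)
    log: List[str] = field(default_factory=list)  # logging and monitoring

    def handle(self, raw_question: str) -> str:
        # Request-processing middleware: normalize and validate the input
        question = raw_question.strip()
        if not question:
            return "Please enter a question."
        context = self.retrieve(question)
        answer = self.generate(f"Context: {context}\nQuestion: {question}")
        self.log.append(f"Q={question!r} A={answer!r}")
        return answer


# User interface layer: here simply a direct function call with stub components
pipeline = QAPipeline(
    retrieve=lambda q: "Returns are accepted within 30 days.",
    generate=lambda prompt: f"(stub answer for a {len(prompt)}-character prompt)",
)
print(pipeline.handle("  What is the return policy?  "))
```

Passing the retriever and engine in as plain callables keeps each layer replaceable, so the stubs can later be swapped for a real vector store and model client without touching `handle`.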

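3.2 Core Features

3.2.1 Enhanced Basic Q&A

def enhanced_qa(question, chat_history=None):
    # Avoid a mutable default argument; treat None as an empty history
    chat_history = chat_history or []
    # Build the conversation context from the last three turns
    context = "\n".join(
        f"User: {q}\nAssistant: {a}" for q, a in chat_history[-3:]
    )
    prompt = f"""
Based on the conversation history and the question below, give a professional, accurate answer:

Conversation history:
{context}

New question: {question}

Make sure the answer:
1. Is factually accurate
2. Is concise and clear
3. Cites reference material where necessary
"""
    # Call the model
    response = chat_model.invoke(prompt)
    return response.content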
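
The function above flattens the history into one prompt string. An alternative worth considering is to keep each turn as a separate chat message, the format that OpenAI-compatible endpoints accept in their `messages` field. The sketch below shows that conversion; `build_messages` and its default system prompt are assumptions for illustration.

```python
def build_messages(question, chat_history=None, max_turns=3,
                   system_prompt="You are a professional, accurate assistant."):
    """Build an OpenAI-style message list from (question, answer) pairs."""
    messages = [{"role": "system", "content": system_prompt}]
    for past_question, past_answer in (chat_history or [])[-max_turns:]:
        messages.append({"role": "user", "content": past_question})
        messages.append({"role": "assistant", "content": past_answer})
    messages.append({"role": "user", "content": question})
    return messages


history = [("What is Qwen3?", "A family of open-source language models.")]
messages = build_messages("How many parameters does the 1.7B model have?", history)
for message in messages:
    print(message["role"], ":", message["content"])
```

Keeping roles explicit lets the server apply the model's own chat template instead of relying on ad hoc "User:" and "Assistant:" labels inside a single string.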

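3.2.2 Knowledge Base Integration

# In recent LangChain releases these classes live in langchain_community
from langchain_community.vectorstores import FAISS
from langchain_community.embeddings import HuggingFaceEmbeddings

# Load the local knowledge base
embeddings = HuggingFaceEmbeddings(model_name="BAAI/bge-small-zh")
knowledge_base = FAISS.load_local(
    "path_to_knowledge_base",
    embeddings,
    # Recent versions require this opt-in because the index is unpickled;
    # enable it only for index files you created yourself
    allow_dangerous_deserialization=True,
)

def search_knowledge(question):
    docs = knowledge_base.similarity_search(question, k=3)
    return "\n\n".join(doc.page_content for doc in docs)

def knowledge_enhanced_qa(question):
    # Retrieve related knowledge
    related_info = search_knowledge(question)
    # Build the augmented prompt
    prompt = f"""
Answer the question professionally, using the reference information below:

Reference information:
{related_info}

Question: {question}

Requirements:
1. Base the answer on the reference information
2. If the information is insufficient, say so explicitly
3. Do not make up information that does not exist
"""
    return chat_model.invoke(prompt).content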
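
To make the retrieval step less of a black box, here is a toy version of what a top-k similarity search does conceptually. It is a brute-force sketch with hand-made three-dimensional vectors and cosine similarity; a real FAISS index uses learned embeddings, an optimized index structure, and possibly a different metric such as L2 distance.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vector, documents, k=3):
    """Return the k document texts whose vectors are most similar to the query."""
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(query_vector, doc[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


# Hand-made 3-dimensional vectors, purely for illustration
documents = [
    ("shipping policy", [0.9, 0.1, 0.0]),
    ("return policy", [0.7, 0.7, 0.0]),
    ("company history", [0.0, 0.1, 0.9]),
]
print(top_k([1.0, 0.0, 0.0], documents, k=2))
```

The query vector points almost entirely along the first axis, so the two documents with the largest first component rank highest.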

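4. Advanced Features and Optimization

4.1 Streaming Output

from IPython.display import display, Markdown
import time

def stream_response(question):
    response = ""
    # Create one output area and update it in place instead of
    # printing a new copy of the text for every chunk
    handle = display(Markdown(""), display_id=True)
    for chunk in chat_model.stream(question):
        response += chunk.content
        handle.update(Markdown(response))  # live display in Jupyter
        time.sleep(0.05)  # throttle the output speed
    return response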
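
Outside Jupyter, the same idea works in a plain terminal by writing each chunk as it arrives. The sketch below uses a hard-coded list of strings as a stand-in for the model stream; `stream_to_terminal` is a hypothetical helper.

```python
import sys


def stream_to_terminal(chunks, out=sys.stdout):
    """Write text chunks as they arrive and return the full response."""
    parts = []
    for text in chunks:
        out.write(text)
        out.flush()  # push each chunk to the screen immediately
        parts.append(text)
    out.write("\n")
    return "".join(parts)


# Stand-in for a real stream; with LangChain this could be
# (chunk.content for chunk in chat_model.stream(question))
fake_chunks = iter(["Qwen3", "-1.7B ", "is ", "a ", "lightweight ", "model."])
full_text = stream_to_terminal(fake_chunks)
```

Accepting any iterable of strings keeps the function independent of the client library, which also makes it easy to exercise without a running model.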

4.2 Conversation History Management

class ConversationManager:
    def __init__(self, max_history=5):
        self.history = []
        self.max_history = max_history

    def add_interaction(self, question, answer):
        self.history.append((question, answer))
        # Drop the oldest turn once the limit is exceeded
        if len(self.history) > self.max_history:
            self.history.pop(0)

    def get_context(self):
        return "\n".join(f"Q: {q}\nA: {a}" for q, a in self.history)

    def ask(self, question):
        context = self.get_context()
        prompt = f"Conversation history:\n{context}\n\nNew question: {question}"
        answer = chat_model.invoke(prompt).content
        self.add_interaction(question, answer)
        return answer
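
The manual length check in `add_interaction` can also be expressed with `collections.deque`, which discards the oldest entry automatically once `maxlen` is reached. A minimal sketch, with `BoundedHistory` as a hypothetical name and no model calls involved:

```python
from collections import deque


class BoundedHistory:
    """Keeps only the most recent max_history (question, answer) pairs."""

    def __init__(self, max_history=5):
        # A deque with maxlen discards the oldest item automatically on append
        self.turns = deque(maxlen=max_history)

    def add(self, question, answer):
        self.turns.append((question, answer))

    def as_context(self):
        return "\n".join(f"Q: {q}\nA: {a}" for q, a in self.turns)


history = BoundedHistory(max_history=2)
for i in range(1, 4):
    history.add(f"question {i}", f"answer {i}")
print(history.as_context())
```

With a limit of two, the first turn is dropped when the third is added, so only turns 2 and 3 remain in the context.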

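4.3 Performance Optimization Tips

Batch requests: process several questions at once to improve throughput

def batch_qa(questions):
    formatted = [f"Question: {q}\nPlease give a detailed answer" for q in questions]
    # batch() sends the requests concurrently; a plain loop over invoke()
    # would process them one at a time
    return [reply.content for reply in chat_model.batch(formatted)]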
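
The throughput gain comes from overlapping the time each request spends waiting on the network. The sketch below illustrates the principle with a thread pool and a stub that sleeps instead of calling a model; `answer_concurrently` and `slow_stub` are hypothetical names.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def answer_concurrently(questions, answer_fn, max_workers=4):
    """Apply answer_fn to every question in parallel, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(answer_fn, questions))


def slow_stub(question):
    # Stand-in for a network-bound model call
    time.sleep(0.1)
    return f"answer to: {question}"


start = time.perf_counter()
answers = answer_concurrently(["a", "b", "c", "d"], slow_stub)
elapsed = time.perf_counter() - start
print(answers, f"{elapsed:.2f}s")
```

Four sequential calls to the stub would take roughly 0.4 seconds; run in parallel, the elapsed time should be closer to that of a single call. Actual gains against a real server depend on how many requests it can process at once.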

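Cache common questions: avoid repeated computation

from functools import lru_cache

@lru_cache(maxsize=100)
def cached_qa(question):
    # Only exact repeats of the same question string hit the cache
    return chat_model.invoke(question).content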

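Timeout control: avoid long waits

import requests
from requests.exceptions import Timeout

def qa_with_timeout(question):
    try:
        response = requests.post(
            "http://localhost:8000/v1/chat/completions",
            json={
                "model": "Qwen3-1.7B",
                "messages": [{"role": "user", "content": question}],
            },
            timeout=10,  # 10-second timeout
        )
    except Timeout:
        return "The request timed out. Please try again later."
    response.raise_for_status()
    # OpenAI-compatible responses carry the text in choices[0].message.content
    return response.json()["choices"][0]["message"]["content"]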
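
A timeout is often paired with a limited number of retries. The sketch below shows exponential backoff around an arbitrary callable; `call_with_retry` is a hypothetical helper, and it catches the built-in `TimeoutError` so the example runs without a server. With the `requests` library, the exception to catch would be `requests.exceptions.Timeout` instead.

```python
import time


def call_with_retry(fn, retries=3, base_delay=0.5, sleep=time.sleep):
    """Call fn(); on TimeoutError wait base_delay * 2**attempt and try again."""
    for attempt in range(retries):
        try:
            return fn()
        except TimeoutError:
            if attempt == retries - 1:
                raise  # out of attempts: let the caller see the error
            sleep(base_delay * 2 ** attempt)


attempts = []


def flaky():
    # Stand-in for a request that times out twice before succeeding
    attempts.append(1)
    if len(attempts) < 3:
        raise TimeoutError("simulated timeout")
    return "ok"


delays = []
result = call_with_retry(flaky, sleep=delays.append)
print(result, delays)
```

Injecting `sleep` as a parameter lets the example record the planned delays (0.5 s, then 1.0 s) without actually waiting.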

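5. Deployment and Launch

5.1 Building the API Service

Create the Q&A endpoint with FastAPI:

from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from pydantic import BaseModel

app = FastAPI()

# Allow the browser page in section 5.2 to call this API from another origin;
# restrict allow_origins in production
app.add_middleware(CORSMiddleware, allow_origins=["*"], allow_methods=["*"], allow_headers=["*"])

class Question(BaseModel):
    text: str

@app.post("/ask")
async def ask_question(question: Question):
    # chat_model is the ChatOpenAI client from section 2.2;
    # ainvoke avoids blocking the event loop inside an async handler
    response = await chat_model.ainvoke(question.text)
    return {"answer": response.content}

# Launch command: uvicorn api:app --host 0.0.0.0 --port 8001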
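
Once the service is running, the endpoint can be exercised from Python before any front end exists. This is a minimal client sketch using only the standard library; `build_ask_request` and `ask` are hypothetical helpers, and the base URL assumes the port from the launch command above.

```python
import json
import urllib.request


def build_ask_request(text, base_url="http://localhost:8001"):
    """Build (but do not send) a POST request for the /ask endpoint."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/ask",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(text, timeout=10):
    """Send the request and return the "answer" field of the JSON reply."""
    with urllib.request.urlopen(build_ask_request(text), timeout=timeout) as reply:
        return json.loads(reply.read().decode("utf-8"))["answer"]


request = build_ask_request("Tell me about yourself")
print(request.get_method(), request.full_url)
```

Calling `ask("Tell me about yourself")` sends the request; it needs the API service to be up, so the example above only builds and inspects the request.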

5.2 Front-End Integration Example

A simple HTML front end:

<!DOCTYPE html>
<html>
<head>
  <title>Intelligent Q&amp;A Assistant</title>
  <script>
    async function askQuestion() {
      const question = document.getElementById('question').value;
      const response = await fetch('http://localhost:8001/ask', {
        method: 'POST',
        headers: {'Content-Type': 'application/json'},
        body: JSON.stringify({text: question})
      });
      const data = await response.json();
      // textContent renders the answer as plain text, so model output
      // is never interpreted as HTML
      document.getElementById('answer').textContent = data.answer;
    }
  </script>
</head>
<body>
  <h1>Qwen3-1.7B Intelligent Assistant</h1>
  <textarea id="question" rows="4" cols="50"></textarea><br>
  <button onclick="askQuestion()">Ask</button>
  <div id="answer" style="margin-top:20px; border:1px solid #ccc; padding:10px;"></div>
</body>
</html>

5.3 Monitoring and Logging

Add basic monitoring:

import logging
import time
from datetime import datetime

logging.basicConfig(filename='qa_log.log', level=logging.INFO)

def log_interaction(question, answer, response_time):
    timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    logging.info(f"""
[{timestamp}] Q&A record
Question: {question}
Answer: {answer}
Response time: {response_time:.2f} s
""")

# Record each interaction inside the Q&A function
def logged_qa(question):
    start_time = time.time()
    answer = chat_model.invoke(question).content
    response_time = time.time() - start_time
    log_interaction(question, answer, response_time)
    return answer
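
The timing logic can be factored into a decorator so that any Q&A function is measured the same way. The sketch below collects latencies in a list and prints simple statistics; `timed`, `latencies`, and `stub_qa` are hypothetical names, and the stub sleeps instead of calling a model.

```python
import functools
import statistics
import time

latencies = []  # response times in seconds, one entry per call


def timed(fn):
    """Decorator that records how long each call to fn takes."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            # Record the duration even when fn raises
            latencies.append(time.perf_counter() - start)
    return wrapper


@timed
def stub_qa(question):
    # Stand-in for a model call
    time.sleep(0.01)
    return f"answer to: {question}"


for q in ["a", "b", "c"]:
    stub_qa(q)
print(f"calls={len(latencies)} mean={statistics.mean(latencies):.3f}s "
      f"max={max(latencies):.3f}s")
```

In a long-running service the list would grow without bound, so a fixed-size buffer or a metrics library would be a more realistic choice than a module-level list.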