WeKnora API Development Guide: RESTful Interfaces Explained, with Practical Examples

news 2026/7/1 8:17:57

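1. Introduction

If you are looking for a document understanding and semantic retrieval framework, WeKnora is worth a look. It is an open-source, LLM-based knowledge base system from Tencent, and it exposes a RESTful API that you can integrate into your own applications.

Whether you want to build an internal knowledge management system or a question-answering application, the API covers the main building blocks: knowledge bases, document processing, and Q&A. This article walks through the API from basic concepts to a worked example so that you can get started quickly.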

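2. Environment Setup and Quick Start

2.1 Installing WeKnora

First make sure a WeKnora service is deployed. If you have not installed it yet, you can deploy it quickly with Docker:

    # Clone the repository
    git clone https://github.com/Tencent/WeKnora.git
    cd WeKnora

    # Copy the environment configuration file
    cp .env.example .env

    # Start all services
    ./scripts/start_all.sh

Once the services are up, the API service listens on port 8080 by default and the web interface on port 80.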
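Before calling the API, it can help to confirm that both ports actually accept connections. The following is a minimal sketch that only tests TCP reachability; it does not call any WeKnora endpoint, and the helper name `is_port_open` is made up for this illustration.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out)
        # when nothing is listening on the port.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For the default setup described above, `is_port_open("localhost", 8080)` and `is_port_open("localhost", 80)` should both return `True` once the services have started.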

2.2 Obtaining an API Key

Before you start calling the API, you need credentials. In this guide, that is the access token returned at login:

    # Register a user (first use only)
    curl -X POST "http://localhost:8080/api/v1/auth/register" \
      -H "Content-Type: application/json" \
      -d '{
        "username": "your_username",
        "email": "your_email@example.com",
        "password": "your_password"
      }'

    # Log in to obtain a token
    curl -X POST "http://localhost:8080/api/v1/auth/login" \
      -H "Content-Type: application/json" \
      -d '{
        "email": "your_email@example.com",
        "password": "your_password"
      }'

After a successful login you receive an access token. Every subsequent API call must carry this token in its request headers.
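As a rough sketch of how the token ends up in the request headers: the helper below turns a parsed login response into the headers used by the client code later in this guide. The field name `access_token` is an assumption (the article does not show the login response body), so it is a parameter, and `build_auth_headers` is a hypothetical helper name.

```python
def build_auth_headers(login_response: dict, token_field: str = "access_token") -> dict:
    """Build request headers from a parsed login response.

    token_field is an ASSUMPTION about the response shape; adjust it to
    whatever field your deployment actually returns.
    """
    token = login_response.get(token_field)
    if not token:
        raise ValueError(f"login response has no '{token_field}' field")
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
```

Failing loudly on a missing token is deliberate: an empty `Bearer` header would otherwise surface later as a less obvious authentication error.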

3. Core API Reference

3.1 Knowledge Base Management API

The knowledge base is WeKnora's central concept; it is how documents are organized and managed. Some common knowledge base operations:

    import os

    import requests


    class WeKnoraClient:
        def __init__(self, base_url, token):
            self.base_url = base_url
            self.token = token  # kept for requests that build their own headers
            self.headers = {
                "Authorization": f"Bearer {token}",
                "Content-Type": "application/json"
            }

        def create_knowledge_base(self, name, description=""):
            """Create a knowledge base."""
            url = f"{self.base_url}/api/v1/knowledge-bases"
            data = {
                "name": name,
                "description": description
            }
            response = requests.post(url, json=data, headers=self.headers)
            return response.json()

        def list_knowledge_bases(self):
            """List knowledge bases."""
            url = f"{self.base_url}/api/v1/knowledge-bases"
            response = requests.get(url, headers=self.headers)
            return response.json()

        def upload_document(self, kb_id, file_path):
            """Upload a document to a knowledge base."""
            url = f"{self.base_url}/api/v1/knowledge-bases/{kb_id}/documents"
            with open(file_path, 'rb') as f:
                files = {'file': (os.path.basename(file_path), f)}
                # Send only the Authorization header so that requests can set
                # the multipart Content-Type (including the boundary) itself.
                response = requests.post(url, files=files, headers={
                    "Authorization": f"Bearer {self.token}"
                })
            return response.json()

3.2 Document Processing API

After a document is uploaded, WeKnora parses and processes it automatically:

    # Add these methods to the WeKnoraClient class from section 3.1

        def check_document_status(self, kb_id, doc_id):
            """Check the processing status of a document."""
            url = f"{self.base_url}/api/v1/knowledge-bases/{kb_id}/documents/{doc_id}/status"
            response = requests.get(url, headers=self.headers)
            return response.json()

        def list_documents(self, kb_id):
            """List the documents in a knowledge base."""
            url = f"{self.base_url}/api/v1/knowledge-bases/{kb_id}/documents"
            response = requests.get(url, headers=self.headers)
            return response.json()
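Because processing is asynchronous, a caller usually has to poll the status until the document is ready. The sketch below shows one way to do that. It takes the status lookup as a callable so that it does not depend on the client class, and the state names `"completed"` and `"failed"` as well as the `"status"` key are assumptions, since the article does not list the values the API returns.

```python
import time
from typing import Callable, Sequence


def wait_for_document(
    get_status: Callable[[], dict],
    done_states: Sequence[str] = ("completed",),  # ASSUMED state name
    failed_states: Sequence[str] = ("failed",),   # ASSUMED state name
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll get_status() until the document is done, failed, or timed out."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        state = status.get("status")  # ASSUMED key in the status response
        if state in done_states:
            return status
        if state in failed_states:
            raise RuntimeError(f"document processing failed: {status}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"document not ready after {timeout} seconds")
        sleep(interval)
```

With the client from section 3.1, a call might look like `wait_for_document(lambda: client.check_document_status(kb_id, doc_id))`. Passing `sleep` as a parameter keeps the function easy to test without real delays.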

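3.3 Question Answering API

This is the most frequently used API. It answers questions based on the content of a knowledge base:

    # Add these methods to the WeKnoraClient class from section 3.1

        def ask_question(self, kb_id, question, stream=False):
            """Ask a question against a knowledge base.

            Returns the parsed JSON response, or an iterator over the
            decoded response lines when stream=True.
            """
            url = f"{self.base_url}/api/v1/knowledge-bases/{kb_id}/ask"
            data = {
                "question": question,
                "stream": stream
            }
            if stream:
                # Streaming lives in its own generator; a `yield` in this
                # method would turn the non-streaming path into a generator too.
                return self._stream_answer(url, data)
            # Regular response
            response = requests.post(url, json=data, headers=self.headers)
            return response.json()

        def _stream_answer(self, url, data):
            """Yield the non-empty lines of a streaming response."""
            response = requests.post(url, json=data, headers=self.headers, stream=True)
            for line in response.iter_lines():
                if line:
                    yield line.decode('utf-8')


    # Usage example
    client = WeKnoraClient("http://localhost:8080", "your_token")
    response = client.ask_question("kb_123", "How do I configure the network parameters?")
    print(response["answer"])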

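4. Worked Example: Building a Customer Support Assistant

Let's look at a practical case that uses the WeKnora API. Suppose we want to build a customer support assistant for a software product.

4.1 Initializing the Knowledge Base

First create a dedicated knowledge base for the product documentation:

    def setup_product_knowledge_base(client):
        """Set up the product knowledge base."""
        # Create the knowledge base
        kb_response = client.create_knowledge_base(
            "Product Help Docs",
            "Product usage guides and frequently asked questions"
        )
        kb_id = kb_response["id"]

        # Upload the product documents
        documents = [
            "user_manual.pdf",
            "faq.docx",
            "troubleshooting_guide.md"
        ]
        for doc in documents:
            upload_response = client.upload_document(kb_id, doc)
            print(f"Uploaded {doc}: {upload_response['status']}")

        return kb_id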

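4.2 Implementing the Q&A Service

Create a simple Q&A service:

    import threading

    from flask import Flask, request, jsonify

    app = Flask(__name__)
    client = None
    kb_id = None


    @app.route('/ask', methods=['POST'])
    def ask():
        """Question-answering endpoint."""
        # silent=True returns None instead of raising on a missing or invalid JSON body
        data = request.get_json(silent=True) or {}
        question = data.get('question')
        if not question:
            return jsonify({"error": "question must not be empty"}), 400
        if client is None or kb_id is None:
            # Background initialization has not finished yet
            return jsonify({"error": "service is still initializing"}), 503
        try:
            response = client.ask_question(kb_id, question)
            return jsonify({
                "answer": response["answer"],
                "sources": response.get("sources", [])
            })
        except Exception as e:
            return jsonify({"error": str(e)}), 500


    def initialize_service():
        """Initialize the service."""
        global client, kb_id
        # Initialize the client
        client = WeKnoraClient("http://localhost:8080", "your_token")
        # Set up the knowledge base
        kb_id = setup_product_knowledge_base(client)


    # Initialize in the background
    threading.Thread(target=initialize_service).start()

    if __name__ == '__main__':
        app.run(port=5000)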

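4.3 Adding Streaming Responses

For a better user experience, the service can also stream its answers:

    import json

    from flask import Response, request


    @app.route('/ask-stream', methods=['POST'])
    def ask_stream():
        """Streaming question-answering endpoint."""
        data = request.get_json(silent=True) or {}
        question = data.get('question')

        def generate():
            try:
                # Forward each upstream chunk as one server-sent event
                for chunk in client.ask_question(kb_id, question, stream=True):
                    yield f"data: {chunk}\n\n"
            except Exception as e:
                yield f"data: {json.dumps({'error': str(e)})}\n\n"

        return Response(generate(), mimetype='text/event-stream')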
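On the consuming side, the `text/event-stream` body produced by `/ask-stream` has to be split back into individual payloads. Below is a minimal sketch of such a parser. It handles only `data:` fields and comment lines, which is all the endpoint above emits; `iter_sse_data` is a hypothetical helper name, not part of WeKnora or Flask.

```python
from typing import Iterable, Iterator, List


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each server-sent event in `lines`."""
    buffer: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        if line.startswith(":"):
            continue  # comment / keep-alive line
        if line.startswith("data:"):
            value = line[len("data:"):]
            # A single leading space after the colon is not part of the payload
            if value.startswith(" "):
                value = value[1:]
            buffer.append(value)
    # Lenient choice: flush an event that was not terminated by a blank line
    if buffer:
        yield "\n".join(buffer)
```

With `requests`, the input could come from `response.iter_lines(decode_unicode=True)` on a request made with `stream=True`. Multiple `data:` lines within one event are joined with newlines, so a chunk that itself contains line breaks survives the round trip.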

5. Advanced Features and Best Practices

5.1 Processing Documents in Bulk

If you need to process a large number of documents, upload them in parallel rather than one at a time:

    import os
    from concurrent.futures import ThreadPoolExecutor


    def batch_upload_documents(client, kb_id, document_folder):
        """Upload every supported document in a folder."""
        documents = []
        for filename in os.listdir(document_folder):
            if filename.endswith(('.pdf', '.docx', '.txt', '.md')):
                documents.append(os.path.join(document_folder, filename))

        def upload_file(file_path):
            # Catch per-file errors so one failure does not abort the batch
            try:
                response = client.upload_document(kb_id, file_path)
                return {"file": file_path, "status": "success", "response": response}
            except Exception as e:
                return {"file": file_path, "status": "error", "error": str(e)}

        # Upload in parallel with a thread pool
        with ThreadPoolExecutor(max_workers=5) as executor:
            results = list(executor.map(upload_file, documents))

        return results

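5.2 Monitoring and Error Handling

In production, solid monitoring and error handling matter:

    import time

    import requests
    from requests.adapters import HTTPAdapter
    from urllib3.util.retry import Retry


    class RobustWeKnoraClient(WeKnoraClient):
        def __init__(self, base_url, token, max_retries=3):
            super().__init__(base_url, token)
            self.max_retries = max_retries
            self.session = requests.Session()

            # Configure the transport-level retry policy.
            # Note: by default urllib3 does not retry POST requests, so this
            # policy mainly covers GET calls made through self.session.
            retry_strategy = Retry(
                total=max_retries,
                backoff_factor=0.1,
                status_forcelist=[429, 500, 502, 503, 504]
            )
            adapter = HTTPAdapter(max_retries=retry_strategy)
            self.session.mount("http://", adapter)
            self.session.mount("https://", adapter)

        def ask_question_with_retry(self, kb_id, question):
            """Ask a question (non-streaming) with application-level retries."""
            url = f"{self.base_url}/api/v1/knowledge-bases/{kb_id}/ask"
            data = {"question": question, "stream": False}
            for attempt in range(self.max_retries):
                try:
                    # Go through the session so the request is actually issued
                    # here and its errors can be caught and retried.
                    response = self.session.post(url, json=data, headers=self.headers)
                    response.raise_for_status()
                    return response.json()
                except requests.exceptions.RequestException:
                    if attempt == self.max_retries - 1:
                        raise
                    time.sleep(2 ** attempt)  # exponential backoff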

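5.3 Performance Tips

Connection pooling: use a session object so that connections are reused
Batch operations: keep the number of API calls as low as possible
Caching: cache the answers to frequently asked questions
Asynchronous processing: use asynchronous calls for operations that do not need a real-time result

For example, several questions can be sent concurrently with aiohttp:

    import asyncio

    import aiohttp


    async def async_ask_questions(session, base_url, token, kb_id, questions):
        """Ask several questions concurrently.

        `session` is an open aiohttp.ClientSession supplied by the caller,
        so its connection pool is reused across calls.
        """
        headers = {
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json"
        }
        url = f"{base_url}/api/v1/knowledge-bases/{kb_id}/ask"

        async def ask_one(question):
            data = {"question": question, "stream": False}
            # The context manager releases the connection back to the pool
            async with session.post(url, json=data, headers=headers) as resp:
                return await resp.json()

        # Results come back in the same order as `questions`
        return await asyncio.gather(*(ask_one(q) for q in questions))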
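The caching tip above can be sketched as a small in-memory cache with a time-to-live. This is an illustration under stated assumptions rather than a WeKnora feature: `AnswerCache` is a hypothetical class, the question-asking function is injected as a callable, and questions are matched only after trivial normalization (whitespace and case), not by meaning.

```python
import time
from typing import Callable, Dict, Tuple


class AnswerCache:
    """Cache answers per normalized question for ttl_seconds."""

    def __init__(
        self,
        ask: Callable[[str], dict],
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ):
        self._ask = ask
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, dict]] = {}

    @staticmethod
    def _key(question: str) -> str:
        # Collapse whitespace and ignore case so near-identical inputs share an entry
        return " ".join(question.split()).lower()

    def get(self, question: str) -> dict:
        key = self._key(question)
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # still fresh: skip the API call
        answer = self._ask(question)
        self._entries[key] = (now, answer)
        return answer
```

With the client from section 3, it could be wired up as `cache = AnswerCache(lambda q: client.ask_question(kb_id, q))`. Entries are never evicted other than by being overwritten after expiry, so a long-running service would want a size limit as well.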