
Agent Memory System Engineering: Making AI Truly Remember What Matters

news 2026/7/10 16:07:30

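A stateless AI assistant starts every conversation from scratch, which is one of the core reasons today's applications feel poor to use. This article systematically breaks down the engineering of an Agent memory system, from short-term working memory to a long-term knowledge base, to build an AI Agent with "real memory".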

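## The Four Layers of a Memory System

Human memory is layered: there is immediate working memory (the current task), episodic memory (specific events), semantic memory (general knowledge), and procedural memory (skills and habits). An AI Agent's memory system needs the same layered design.

The four-layer memory architecture:

```
┌──────────────────────────────────────┐
│ Working memory (in-context memory)   │  Current conversation + task context
├──────────────────────────────────────┤
│ Episodic memory                      │  Summaries of past interactions
├──────────────────────────────────────┤
│ Semantic memory                      │  User preferences + domain knowledge
├──────────────────────────────────────┤
│ Procedural memory                    │  Tool-use experience + lessons from errors
└──────────────────────────────────────┘
```

Most Agent applications implement only working memory (that is, the conversation history). The missing three layers are the root cause of the gap in user experience.

## Working Memory Management

Working memory is the LLM's context window. The core challenge is keeping the most important information inside a limited window.

### Sliding Window Strategy

The simplest approach is to keep only the most recent N turns of the conversation:

```python
class SlidingWindowMemory:
    def __init__(self, max_turns: int = 20):
        self.max_turns = max_turns
        self.messages = []

    def add_message(self, role: str, content: str):
        self.messages.append({"role": role, "content": content})
        # When the window overflows, drop the oldest user-assistant pair
        # (system prompts are kept and do not count toward the limit).
        while self._dialogue_length() > self.max_turns * 2:
            for i, msg in enumerate(self.messages):
                if msg["role"] == "user":
                    self.messages.pop(i)  # remove the user message
                    # remove the matching assistant reply, if there is one
                    if i < len(self.messages) and self.messages[i]["role"] == "assistant":
                        self.messages.pop(i)
                    break
            else:
                break  # no user message left to drop; avoid looping forever

    def _dialogue_length(self) -> int:
        return sum(1 for m in self.messages if m["role"] != "system")

    def get_context(self) -> list:
        return self.messages.copy()
```

### Summary Compression Strategy

When the conversation history exceeds the limit, compress the older content into a summary:

```python
from openai import OpenAI

client = OpenAI()


class SummarizingMemory:
    def __init__(self, max_tokens: int = 4000, compression_threshold: int = 3000):
        self.messages = []
        self.summary = ""
        self.max_tokens = max_tokens
        self.compression_threshold = compression_threshold

    def add_message(self, role: str, content: str):
        self.messages.append({"role": role, "content": content})
        self.compress_if_needed()

    def estimate_tokens(self, messages: list) -> int:
        # Rough estimate: about 0.5 tokens per character
        total_chars = sum(len(m["content"]) for m in messages)
        return int(total_chars * 0.5)

    def compress_if_needed(self):
        over_budget = self.estimate_tokens(self.messages) > self.compression_threshold
        if over_budget and len(self.messages) >= 2:
            # Compress the older half of the messages
            split = len(self.messages) // 2
            old_messages = self.messages[:split]
            old_text = "\n".join(f"{m['role']}: {m['content']}" for m in old_messages)
            if self.summary:
                # Carry the previous summary forward so it is not overwritten
                old_text = f"Earlier summary: {self.summary}\n\n{old_text}"
            response = client.chat.completions.create(
                model="gpt-4o-mini",
                messages=[{
                    "role": "user",
                    "content": (
                        "Compress the following conversation history into a concise "
                        "summary (under 200 words), keeping the key information:\n\n"
                        f"{old_text}"
                    ),
                }],
            )
            self.summary = response.choices[0].message.content
            self.messages = self.messages[split:]

    def get_context(self) -> list:
        context = []
        if self.summary:
            context.append({
                "role": "system",
                "content": f"Conversation history summary: {self.summary}",
            })
        context.extend(self.messages)
        return context
```

## Episodic Memory: Remembering "What Happened"

Episodic memory stores records of specific events. For an Agent, this means remembering what the user did, what they said, and which conclusions were reached.

### Data Structure for Episodic Memory

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, List


@dataclass
class Episode:
    episode_id: str
    timestamp: datetime
    summary: str           # short description of this interaction
    key_facts: List[str]   # extracted key facts
    user_intent: str       # user intent label
    outcome: str           # result (success / failure / in progress)
    tags: List[str]        # topic tags (to aid retrieval)
    importance: float = 1.0  # read by the forgetting mechanism; defaults to 1.0

    def to_searchable_text(self) -> str:
        return f"{self.summary} {' '.join(self.key_facts)} {' '.join(self.tags)}"
```

### Writing Episodic Memory

After each conversation ends, extract and store an episode automatically. In the code below, `llm`, `embed`, `vector_store`, and `generate_id` are placeholders for your own LLM client, embedding function, vector database client, and ID generator.

```python
import json
from dataclasses import asdict
from datetime import datetime


async def extract_and_store_episode(conversation: list, user_id: str):
    """Extract an episodic memory from a conversation."""
    conversation_text = "\n".join([
        f"{m['role']}: {m['content']}"
        for m in conversation
    ])

    extraction_prompt = f"""
    Analyze the following conversation and extract an episodic memory. Return JSON:
    {{
        "summary": "one-sentence description of this interaction",
        "key_facts": ["fact 1", "fact 2"],
        "user_intent": "user intent category",
        "outcome": "success/failure/in progress",
        "tags": ["topic tag"]
    }}

    Conversation:
    {conversation_text}
    """

    response = await llm.complete(extraction_prompt)
    episode_data = json.loads(response)

    episode = Episode(
        episode_id=generate_id(),
        timestamp=datetime.now(),
        **episode_data
    )

    # Store in the vector database (for semantic retrieval)
    await vector_store.upsert(
        id=episode.episode_id,
        vector=await embed(episode.to_searchable_text()),
        # Include user_id so that retrieval can filter by user
        metadata={**asdict(episode), "user_id": user_id},
    )
```

## Semantic Memory: User Profiles and Knowledge Accumulation

Semantic memory stores structured user preferences and domain knowledge. It is the core of a personalized AI experience.

### User Profile System

```python
class UserProfile:
    def __init__(self, user_id: str):
        self.user_id = user_id
        self.preferences = {}          # preference settings
        self.expertise = {}            # areas of expertise and skill level
        self.communication_style = {}  # preferred communication style
        self.context = {}              # work / life background

    def update_preference(self, key: str, value, confidence: float = 1.0):
        """Update a user preference, weighted by confidence."""
        if key not in self.preferences:
            self.preferences[key] = {"value": value, "confidence": confidence, "count": 1}
        else:
            # Update the confidence incrementally
            current = self.preferences[key]
            new_count = current["count"] + 1
            if current["value"] == value:
                new_confidence = min(1.0, current["confidence"] + 0.1)
            else:
                new_confidence = max(0.1, current["confidence"] - 0.2)
                if new_confidence < 0.3:
                    current["value"] = value  # the preference has changed
            self.preferences[key] = {
                "value": current["value"],
                "confidence": new_confidence,
                "count": new_count,
            }

    def to_system_prompt(self) -> str:
        """Generate a personalized system prompt fragment."""
        lines = ["What we know about the user:"]
        for key, pref in self.preferences.items():
            if pref["confidence"] > 0.7:
                lines.append(f"- {key}: {pref['value']}")
        if self.communication_style:
            style = self.communication_style
            lines.append(
                f"- Preferred communication style: {style.get('tone', 'professional')}, "
                f"{style.get('detail_level', 'moderate')}"
            )
        return "\n".join(lines)
```

### Updating the Knowledge Base

When the Agent learns something new from the user, it should store it in semantic memory. Here `extract_from_conversation`, `load_user_profile`, and `save_user_profile` are placeholders for your own extraction and persistence helpers.

```python
async def update_knowledge_base(user_id: str, conversation: list):
    """Extract knowledge from a conversation and update the knowledge base."""
    extraction_prompt = """
    Extract user information worth remembering from the following conversation:
    1. Explicit preferences (such as "I like / dislike ...")
    2. Background information (job, projects, tech stack)
    3. Important constraints or requirements the user mentioned
    4. Communication style preferences

    Extract only explicitly stated information; do not infer.
    Return a JSON array where each item contains {type, key, value}.
    """

    knowledge_items = await extract_from_conversation(conversation, extraction_prompt)

    profile = await load_user_profile(user_id)
    for item in knowledge_items:
        profile.update_preference(item["key"], item["value"])
    await save_user_profile(user_id, profile)
```

## Procedural Memory: Learning from Tool-Use Experience

Procedural memory stores the experience an Agent accumulates while calling tools and executing tasks: which approaches worked and which have failed before.

```python
from dataclasses import dataclass


@dataclass
class ToolExperience:
    tool_name: str
    use_case: str              # description of the usage scenario
    parameters_pattern: dict   # parameter pattern that succeeded
    success_rate: float
    common_errors: list        # common errors and their fixes
    best_practices: list       # best practices


class ProceduralMemory:
    def __init__(self):
        self.tool_experiences = {}

    def record_success(self, tool_name: str, params: dict, context: str):
        """Record a successful tool call."""
        key = f"{tool_name}:{context}"
        if key not in self.tool_experiences:
            self.tool_experiences[key] = ToolExperience(
                tool_name=tool_name,
                use_case=context,
                parameters_pattern=params,
                success_rate=1.0,
                common_errors=[],
                best_practices=[],
            )
        else:
            exp = self.tool_experiences[key]
            exp.success_rate = exp.success_rate * 0.9 + 1.0 * 0.1  # EMA

    def record_failure(self, tool_name: str, params: dict, error: str, context: str):
        """Record a failed tool call and its error message."""
        key = f"{tool_name}:{context}"
        if key not in self.tool_experiences:
            # First observation is a failure: start tracking it instead of dropping it
            self.tool_experiences[key] = ToolExperience(
                tool_name=tool_name,
                use_case=context,
                parameters_pattern={},
                success_rate=0.0,
                common_errors=[error],
                best_practices=[],
            )
            return
        exp = self.tool_experiences[key]
        exp.success_rate = exp.success_rate * 0.9 + 0.0 * 0.1
        if error not in exp.common_errors:
            exp.common_errors.append(error)

    def get_advice(self, tool_name: str, context: str) -> str:
        """Return advice on using a tool."""
        key = f"{tool_name}:{context}"
        if key not in self.tool_experiences:
            return ""
        exp = self.tool_experiences[key]
        advice = []
        if exp.success_rate > 0.8:
            advice.append(f"Historical successful pattern: {exp.parameters_pattern}")
        if exp.common_errors:
            advice.append(f"Avoid: {', '.join(exp.common_errors[:3])}")
        return "\n".join(advice)
```

## Memory Retrieval: Finding the Most Relevant Memories

Storage is the foundation, but retrieval is what matters. When responding, the Agent needs to find the most relevant historical memories quickly.

```python
class MemoryRetriever:
    def __init__(self, vector_store, user_profile_store):
        self.vector_store = vector_store
        self.user_profile = user_profile_store

    async def retrieve_relevant_memories(
        self,
        current_query: str,
        user_id: str,
        top_k: int = 5
    ) -> dict:
        """Retrieve the memories most relevant to the current query."""
        # 1. Episodic memory retrieval (vector similarity)
        query_embedding = await embed(current_query)
        episodes = await self.vector_store.search(
            vector=query_embedding,
            filter={"user_id": user_id},
            limit=top_k
        )

        # 2. User profile (loaded in full; it is small)
        profile = await self.user_profile.load(user_id)

        # 3. Combine into context
        context_parts = []
        if profile:
            context_parts.append(f"User background:\n{profile.to_system_prompt()}")
        if episodes:
            episode_texts = []
            for ep in episodes:
                episode_texts.append(
                    f"[{ep.timestamp.strftime('%Y-%m-%d')}] {ep.summary}"
                )
            context_parts.append(
                "Relevant history:\n" + "\n".join(episode_texts)
            )

        return {
            "context": "\n\n".join(context_parts),
            "episodes": episodes,
            "profile": profile
        }
```

## Forgetting Mechanism

Memories that accumulate without limit degrade retrieval quality, so a sensible forgetting mechanism is essential:

```python
import math
from datetime import datetime


class MemoryDecay:
    """Memory decay based on the Ebbinghaus forgetting curve."""

    def __init__(self, vector_store, load_all_episodes):
        self.vector_store = vector_store
        # Async callable: user_id -> list of Episode (storage-specific)
        self.load_all_episodes = load_all_episodes

    @staticmethod
    def calculate_retention(days_since_created: float, initial_importance: float = 1.0) -> float:
        """Compute the retention rate of a memory."""
        # Ebbinghaus formula: R = e^(-t/S), where S is the stability.
        # Higher importance means slower forgetting; the floor avoids division by zero.
        stability = max(initial_importance, 0.01) * 10
        return math.exp(-days_since_created / stability)

    async def cleanup_old_memories(self, user_id: str, threshold: float = 0.1):
        """Delete memories whose retention rate is below the threshold."""
        all_episodes = await self.load_all_episodes(user_id)
        now = datetime.now()

        to_delete = []
        for episode in all_episodes:
            # Use fractional days rather than truncating to whole days
            days_old = (now - episode.timestamp).total_seconds() / 86400
            retention = self.calculate_retention(days_old, episode.importance)
            if retention < threshold:
                to_delete.append(episode.episode_id)

        if to_delete:
            await self.vector_store.delete(to_delete)
            print(f"Cleaned up {len(to_delete)} expired memories")
```

## Production Deployment Considerations

Storage choices:

- Vector database (Qdrant/Pinecone/Weaviate): stores the embedding vectors for episodic memory
- PostgreSQL/MongoDB: stores structured user profiles
- Redis: working memory cache (TTL controls automatic expiry)

Privacy and security:

- Memory contents should be encrypted at rest
- Give users an interface to view and delete their own memories
- Tell users clearly which information is being remembered

Scalability:

- With a large user base, memory retrieval needs to be sharded by user ID
- Consider hot/cold tiering of memories (recent hot data vs. historical cold data)

## Summary

An Agent memory system is the key step in the evolution from "tool" to "partner". All four layers (working memory + episodic memory + semantic memory + procedural memory) are needed, but the engineering rollout can proceed in phases:

1. Phase 1: implement working memory management (summary compression)
2. Phase 2: build the user profile (the core of semantic memory)
3. Phase 3: introduce episodic memory (retrieval over past conversations)
4. Phase 4: add procedural memory (accumulated tool experience)

Each phase delivers a perceptible improvement in user experience, which makes this a direction worth sustained investment.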
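As a closing worked example for the forgetting mechanism described in the article, the sketch below inverts the retention formula R = e^(-t/S) to estimate how long a memory survives before cleanup. The stability factor (importance × 10) and the 0.1 threshold are taken from the article's decay code. The function names `retention` and `days_until_forgotten` are illustrative and not part of the original implementation.

```python
import math


def retention(days: float, importance: float = 1.0) -> float:
    """R = exp(-t / S) with S = importance * 10, matching the article's decay rule."""
    stability = importance * 10
    return math.exp(-days / stability)


def days_until_forgotten(importance: float = 1.0, threshold: float = 0.1) -> float:
    """Solve exp(-t / S) = threshold for t, which gives t = -S * ln(threshold)."""
    stability = importance * 10
    return -stability * math.log(threshold)


if __name__ == "__main__":
    for importance in (0.5, 1.0, 2.0):
        # Prints 11.5, 23.0 and 46.1 days respectively
        print(importance, round(days_until_forgotten(importance), 1))
```

With the default settings a memory of importance 1.0 is eligible for deletion after roughly 23 days, and doubling the importance doubles that lifetime. If that is too aggressive for a given product, the stability factor or the threshold are the two parameters to tune.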
