Diagnosing and Governing Hallucination in LLM Agents: A Complete Approach from Detection to Mitigation
Introduction
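AI agents driven by large language models (LLMs) are reshaping automated workflows across industries. From customer service to code generation, from research assistance to decision support, the range of what agents can do keeps expanding. Yet one fundamental problem has long troubled the field: hallucination. When an agent confidently outputs wrong information, fabricates nonexistent citations, or introduces logical fallacies into a reasoning chain, the consequences can go far beyond a single wrong answer. In healthcare, an incorrect diagnostic suggestion can endanger a patient's life; in finance, fabricated data analysis can lead to serious investment mistakes; in software development, a call to a nonexistent API can drag an entire project into debugging hell.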
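Unlike the "explicit failures" of traditional software systems, LLM hallucinations tend to be dressed in professional, fluent, persuasive language, which makes them extremely hard for end users to spot. This article systematically breaks down a complete approach to governing hallucination in LLM agents: from classification and diagnosis to detection methods, from confidence calibration to external verification, ending with practical mitigation strategies that help developers build more trustworthy AI agent systems.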
1. The Three Faces of Hallucination: A Typology
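To govern hallucination, you first have to learn to recognize it. In real production environments, agent hallucination is not a single phenomenon. It shows up in at least three distinct types, and each type differs substantially in its causes, its symptoms, and the methods that detect it.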
1.1 Factual Hallucination
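Factual hallucination is the most intuitive type and the easiest to recognize. It appears as an incorrect statement about objective facts: inventing people, places, or data that do not exist, misdescribing historical events, or distorting scientific concepts.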
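Typical case: when asked "Which Chinese scientists are important in the field of quantum computing?", a knowledge Q&A agent generated a plausible-sounding introduction: "Professor Zhang Mingyuan proposed the 'quantum entanglement stability theorem' in 2021, which laid the foundation for topological quantum computing." The passage is fluent and internally consistent, but both "Professor Zhang Mingyuan" and the so-called "quantum entanglement stability theorem" were fabricated by the model.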
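Cause analysis: the core cause of factual hallucination lies in the LLM training objective. The model is optimized to predict the next token, not to be factually accurate. When the training corpus lacks the relevant information, or when multiple sources conflict, the model tends to "generate the most fluent text" rather than "output the most accurate answer". In addition, the knowledge cutoff leaves the model unaware of recent events, yet it will still attempt to answer questions about them.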
1.2 Reasoning Hallucination
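Reasoning hallucination is harder to see because it happens inside the agent's "thinking process". Even when every premise is factually correct, the agent can make a mistake in the logical derivation and reach an absurd conclusion. This type is especially common in tasks that require multi-step reasoning, such as mathematical calculation, logical inference, and causal analysis.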
Typical case: the agent is asked, "If Company A's revenue grew 20% this year and its profit grew 10%, while the industry average was 15% revenue growth and 12% profit growth, has Company A's profitability improved or declined?"
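The agent might "reason" like this: "Revenue growth is above the industry average (20% > 15%), which shows Company A is gaining market share. Profit growth is below the industry average (10% < 12%), which shows costs are rising. Therefore Company A's profitability is improving, because the long-term value of a larger market share outweighs short-term fluctuations in profit."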
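The problem with this conclusion is that it confuses "how fast something is growing" with "the level of profitability", and it ignores the direct signal that profit growth is plainly below the industry average. More importantly, the agent introduced an unverified assumption into its reasoning: "the long-term value of a larger market share".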
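The correct direction of the answer can be checked with a few lines of arithmetic. The sketch below assumes that "profitability" means profit margin (profit divided by revenue), which the original question does not define; the function name `margin_change` is illustrative only.

```python
def margin_change(revenue_growth: float, profit_growth: float) -> float:
    """Relative change in profit margin, given growth rates as fractions.

    Assumes profitability is measured as profit / revenue, so
    new_margin / old_margin = (1 + profit_growth) / (1 + revenue_growth).
    """
    return (1 + profit_growth) / (1 + revenue_growth) - 1


company_a = margin_change(revenue_growth=0.20, profit_growth=0.10)
industry = margin_change(revenue_growth=0.15, profit_growth=0.12)

print(f"Company A margin change: {company_a:.1%}")  # -8.3%
print(f"Industry margin change:  {industry:.1%}")   # -2.6%
```

Under this assumption, Company A's margin shrinks by about 8.3%, a sharper decline than the industry's 2.6%, which is the opposite of the agent's conclusion.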
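Cause analysis: reasoning hallucination stems from the fragility of LLMs on complex logical chains. Models do well on simple reasoning, but as the number of steps grows, or when the task requires backtracking or handling counterfactual conditions, errors compound: the probability that the whole chain stays correct falls off roughly exponentially with its length. Chain-of-Thought (CoT) prompting helps but does not eliminate the problem, because the model may only be "generating a plausible-looking reasoning process" rather than actually carrying out rigorous logical operations.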
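A rough way to see why long chains are fragile is to treat each step as independently correct with a fixed probability. This is a simplifying assumption made here for illustration, not a claim about how any particular model behaves, and the 95% per-step accuracy is a made-up figure.

```python
def chain_success_probability(step_accuracy: float, num_steps: int) -> float:
    """Probability that every step in a chain is correct.

    Assumes steps fail independently with the same accuracy, which
    real reasoning chains do not strictly satisfy.
    """
    return step_accuracy ** num_steps


for steps in (1, 5, 10, 20):
    p = chain_success_probability(0.95, steps)
    print(f"{steps:>2} steps: chain correct {p:.1%}, at least one error {1 - p:.1%}")
```

With these illustrative numbers, a chain that is 95% reliable per step is fully correct only about 60% of the time after ten steps.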
1.3 Citation Hallucination
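Citation hallucination is the type that academics and professionals detest most. While generating content, the agent fabricates sources that look authoritative: nonexistent papers, wrong authors, fake DOIs, and even legal provisions that do not exist at all.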
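Typical case: when asked about "China's latest regulatory requirements for cross-border data transfers", a legal assistant agent cited "Article 18 of the Measures for Security Assessment of Outbound Data Transfers (《数据出境安全评估办法》): 'Enterprises that process the personal information of more than 1 million users shall submit an annual outbound data transfer report to the national cyberspace administration before March 31 of each year.'" In fact, the Measures contain only 20 articles in total, and none of them includes this provision.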
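Cause analysis: citation hallucination results from the interaction between training data and model architecture. Academic corpora are full of templated expressions such as "according to the XX study" and "as Smith et al. (2023) point out". The model learns this "citation format" without building a reliable mapping between the cited content and the real literature. When explicitly asked to provide citations, it is more inclined to "generate a citation that looks correctly formatted" than to retrieve a real source.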
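One way to catch a fabricated provision like the one in the case above is to compare the quoted text against a trusted copy of the source. The following is a minimal sketch under stated assumptions: the `Citation` class, the `check_citation` function, and the tiny in-memory corpus are all hypothetical, and the corpus text is placeholder content rather than real legal text.

```python
from dataclasses import dataclass


@dataclass
class Citation:
    source_title: str
    article_number: int
    quoted_text: str


def _normalize(text: str) -> str:
    """Collapse whitespace and lowercase so trivial differences do not matter."""
    return " ".join(text.split()).lower()


def check_citation(citation: Citation, corpus: dict[str, dict[int, str]]) -> str:
    """Return 'verified', 'unknown_source', 'unknown_article', or 'quote_mismatch'."""
    articles = corpus.get(citation.source_title)
    if articles is None:
        return "unknown_source"
    article_text = articles.get(citation.article_number)
    if article_text is None:
        return "unknown_article"
    # Plain substring check; a production system would likely need fuzzier matching.
    if _normalize(citation.quoted_text) not in _normalize(article_text):
        return "quote_mismatch"
    return "verified"


# Placeholder corpus: the title and article texts are invented for illustration.
corpus = {
    "Example Measures": {
        1: "These measures apply to example data transfers.",
        2: "Operators shall keep records of example transfers.",
    }
}

fabricated = Citation("Example Measures", 2, "Operators shall file an annual report by March 31.")
genuine = Citation("Example Measures", 2, "operators shall keep records")
missing = Citation("Nonexistent Statute", 1, "anything")

print(check_citation(fabricated, corpus))  # quote_mismatch
print(check_citation(genuine, corpus))     # verified
print(check_citation(missing, corpus))     # unknown_source
```

Note that the article number alone is not enough: in the legal case above, Article 18 falls within the 20 articles that exist, so only a comparison of the quoted content reveals the fabrication.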
2. Hallucination Detection: From Internal Signals to External Verification
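Detection is the first step in governance. An effective hallucination detection system needs several layers of detection, ranging from the model's internal confidence signals to cross-checking against external knowledge bases.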
2.1 Detection Methods Based on Internal State
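When an LLM generates a token, its internal probability distribution carries rich information. By analyzing these internal states, we can make an initial assessment of hallucination risk without relying on external resources.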
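Confidence entropy analysis: when the model's predicted probability distribution for a token is highly dispersed (high entropy), the model is usually in an "uncertain" state, and the risk of hallucination is higher.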
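To make the entropy signal concrete, the following standard-library sketch compares a peaked distribution with a flat one. The two probability lists are made-up values chosen for illustration, and `shannon_entropy` is a helper name introduced here, not part of any library.

```python
import math


def shannon_entropy(probs: list[float]) -> float:
    """Shannon entropy in nats: H = -sum(p * ln p), skipping zero-probability entries."""
    return -sum(p * math.log(p) for p in probs if p > 0)


# Hypothetical next-token distributions over four candidate tokens.
confident = [0.97, 0.01, 0.01, 0.01]  # one token dominates
uncertain = [0.25, 0.25, 0.25, 0.25]  # no clear preference

print(f"confident: {shannon_entropy(confident):.3f} nats")
print(f"uncertain: {shannon_entropy(uncertain):.3f} nats")
```

The flat distribution reaches the maximum possible entropy for four outcomes, ln 4 (about 1.386 nats), while the peaked one stays far below it.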
import torch
import numpy as np
from transformers import AutoTokenizer, AutoModelForCausalLM

def calculate_token_entropy(logits, top_k=10):
    """
    Compute the prediction entropy over the top-k tokens.
    High entropy indicates the model is uncertain and may hallucinate.
    """
    probs