
Claude Code's Extended Thinking output: what it actually is, and why you can't treat it as an audit log

news 2026/6/23 7:20:19

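1. Background: a 600-character "empty" thinking block

Over the last couple of days I was going through Claude Code's local session logs, hoping to see which reasoning path the model actually took during a long session. When I opened a few thinking blocks, the content was empty, with only a signature of roughly 600 characters left. A word on that signature: in Extended Thinking, Anthropic generates a cryptographic signature for each block of thinking, used to verify that the block was really produced by this model for this request. The signature does not contain the original reasoning content.

I then read https://platform.claude.com/docs/en/build-with-claude/extended-thinking and the 253-point HN thread "The text in Claude Code's "Extended Thinking" output is not authentic" (https://news.ycombinator.com/item?id=48630535), and pieced together roughly what is going on.

The core conclusion: by default, the Extended Thinking output you see is the model's "summary" of its original thinking, not the thinking process itself. In Patrick McCann's own words from the original post: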

BEWARE: The "extended-thinking" output from ctrl+o is a summary of Fable/Opus' thinking. It isn't the actual thinking that drove the model's actions in a session — but a summary of the thinking logic. This is like using saving a jpeg as a .bmp and then editing the .bmp and presenting it as a .jpeg. The conversion produces data loss.

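Fable here is the internal codename for Opus 4.6+ (an Anthropic team member confirms this in the comments); read it simply as "Opus-family models". The impact on engineering teams is bigger than you might expect: if you want to use Extended Thinking output to audit "how the model reached this conclusion", you are reading a summary, not the reasoning trace.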

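2. Why a summary and not the original text

I went through the "clear_thinking_20251015 postmortem" that Anthropic published on April 23 (https://www.anthropic.com/engineering/april-23-postmortem), which explains the engineering motivation clearly. Three layers stack up:

2.1 Copyright and distillation risk of the reasoning trace itself

Anthropic already said in "Visible extended thinking" that, to give the model maximum freedom during the thinking phase, they did not apply RLHF training to the thinking content. That means thinking blocks can contain "half-baked thoughts", incorrect intermediate conclusions, and potentially harmful exploration (for example, trying out cyber attack ideas). The official statement is explicit: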

we can't rely on monitoring current models' thinking to make strong arguments about their safety

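If thinking were output verbatim, this content could be scraped by users to distill downstream models, and could also be picked up by regulators as "evidence of unsafe behavior". In the HN comments, HarHarVeryfly puts it bluntly: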

The point of this post isn't that the "reasoning" phase of LLM thinking isn't the same as what humans consider reasoning; it's that Anthropic is intentionally hiding Claude's "reasoning output" to make the model harder to distill.

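2.2 The engineering reality of KV cache and prompt cache

This is the genuinely hardcore part. In the HN comments, btown relays the explanation from inside the Anthropic team (linking to https://www.anthropic.com/engineering/april-23-postmortem and HN 47879561):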

For Claude, at least, "throw out the reasoning tokens" is only true when a session has been idle for more than an hour, and is new since March. The basic concept is that for a session active recently, interleaved thinking tokens are already in KV cache, so it's more efficient to keep using them than not! But when resuming an older session where KV cache has been evicted, it's more expensive to restore the thinking tokens, so they're silently dropped from prior turns. It's 2026 and stateful servers are back on the menu!

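This spells out the engineering motivation behind "why the client cannot get the full thinking trace": a full reasoning trace is very long, and a 5-minute agent loop can produce 10k-50k tokens of thinking. If all of that thinking were resent as input to every later turn, the prompt cache hit rate would drop sharply, and the input size of each turn would keep growing with the accumulated history, so the cumulative cost of a session grows roughly quadratically in the number of turns. OpenAI applies the same logic (see the OpenAI behavior cited in HN 47884517).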
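To make the cost argument concrete, here is a back-of-the-envelope sketch of how resending thinking inflates input tokens. It ignores prompt caching entirely, and every number in it (base prompt size, visible output per turn, thinking per turn) is a made-up assumption for illustration, not a measurement.

```python
def cumulative_input_tokens(
    turns: int,
    base_tokens: int = 1000,        # hypothetical system prompt + first user message
    visible_per_turn: int = 500,    # hypothetical visible output added per turn
    thinking_per_turn: int = 2000,  # hypothetical thinking produced per turn
    resend_thinking: bool = True,
) -> int:
    """Total input tokens over a session in which each turn resends the whole history."""
    total = 0
    history = base_tokens
    for _ in range(turns):
        total += history                  # this turn's input is everything so far
        history += visible_per_turn       # visible output always joins the history
        if resend_thinking:
            history += thinking_per_turn  # thinking joins it only if it is resent
    return total


if __name__ == "__main__":
    for n in (20, 40):
        with_thinking = cumulative_input_tokens(n, resend_thinking=True)
        without = cumulative_input_tokens(n, resend_thinking=False)
        print(n, with_thinking, without)
```

Under these assumed numbers, 20 turns cost 495,000 input tokens with thinking resent versus 115,000 without, and doubling the turn count roughly quadruples the total, which is quadratic rather than exponential growth.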

2.3 The clear_thinking_20251015 header is the key

The postmortem contains a key design note:

The design should have been simple: if a session has been idle for more than an hour, we could reduce users' cost of resuming that session by clearing old thinking sections. Since the request would be a cache miss anyway, we could prune unnecessary messages from the request to reduce the number of uncached tokens sent to the API. We'd then resume sending full reasoning history. To do this we used the clear_thinking_20251015 API header along with keep:1.

Note the keep:1 here: it does not mean keeping a single thinking block, but keeping the thinking from the last turn. What the user is shown is a summary of that "last block", not the full thinking.
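The following is a purely local sketch of that reading of keep:1, in which thinking blocks survive only in the most recent assistant turn that has any. The function name `prune_thinking` and the message shape are assumptions for illustration. This is not Anthropic's server-side implementation, which is not public.

```python
from typing import Any

Message = dict[str, Any]


def prune_thinking(messages: list[Message], keep_turns: int = 1) -> list[Message]:
    """Drop thinking blocks from all but the last `keep_turns` assistant turns that have them."""
    # Indexes of assistant messages that carry at least one thinking block.
    thinking_turns = [
        i for i, m in enumerate(messages)
        if m.get("role") == "assistant"
        and isinstance(m.get("content"), list)
        and any(b.get("type") == "thinking" for b in m["content"])
    ]
    keep = set(thinking_turns[-keep_turns:]) if keep_turns > 0 else set()

    pruned: list[Message] = []
    for i, m in enumerate(messages):
        if i in thinking_turns and i not in keep:
            # Keep the turn itself, but strip its thinking blocks.
            content = [b for b in m["content"] if b.get("type") != "thinking"]
            pruned.append({**m, "content": content})
        else:
            pruned.append(m)
    return pruned
```

With `keep_turns=1`, a three-turn history keeps text blocks everywhere but thinking only in the final assistant turn, which is why earlier reasoning is not recoverable from a resumed session.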

3. How we verified this

What I did is simple: essentially two commands plus one observation:

3.1 Pull a raw JSONL session log

Claude Code session logs live at ~/.claude/projects/<project>/<session>.jsonl. Each entry with type: "assistant" carries a nested message.content array: [{type: "thinking", thinking: "...", signature: "..."}, {type: "text", text: "..."}, ...]. I filter it with jq:

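# Extract the length (in characters) of every thinking block.
# jq's `length` counts per block; piping raw text to awk would count per line instead.
jq -r 'select(.type=="assistant") | .message.content[]? | select(.type=="thinking") | .thinking | length' \
  ~/.claude/projects/-Users-me-myproj/<session>.jsonl | sort -n
# Expect many "summary thinking" blocks of around 600 characters, plus 0-character empty thinking blocks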

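Measured sample (one of my own sessions, 47 turns):

Each thinking block averaged somewhere between 60 and 200 characters, whereas the actual reasoning trace (in Anthropic's internal logs) is far longer than that. This is the hallmark of a summary.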
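The same check can be sketched in Python, which also makes it easy to look at the signature length next to the thinking length. It assumes the log layout described above (one JSON object per line, thinking blocks under message.content). The helper name `thinking_lengths` is mine.

```python
import json


def thinking_lengths(jsonl_path: str) -> list[tuple[int, int]]:
    """Return (thinking_chars, signature_chars) for every thinking block in a session log."""
    results: list[tuple[int, int]] = []
    with open(jsonl_path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            entry = json.loads(line)
            if entry.get("type") != "assistant":
                continue
            content = (entry.get("message") or {}).get("content")
            if not isinstance(content, list):
                continue  # skip entries whose content is a plain string
            for block in content:
                if isinstance(block, dict) and block.get("type") == "thinking":
                    thinking = block.get("thinking") or ""
                    signature = block.get("signature") or ""
                    results.append((len(thinking), len(signature)))
    return results
```

A block that comes back as `(0, n)` with a large `n` is the "empty thinking, signature only" case from section 1.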

3.2 Call the API directly and compare against raw thinking

If you want to see what "full thinking" looks like, there are only two routes (as of June 2026):

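# Option 1: open the trace view in the Anthropic Console; visible to some enterprise accounts
# Option 2: call /v1/messages directly with an Anthropic API key and extended thinking enabled,
#           with summarization turned off (currently only possible under an enterprise agreement)
# Ordinary API users get the 600-character summary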

Measured on a simple prompt (a math question):

import anthropic

# Reads the API key from the ANTHROPIC_API_KEY environment variable
client = anthropic.Anthropic()

r = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=16000,
    thinking={"type": "enabled", "budget_tokens": 10000},
    messages=[{"role": "user", "content": "Why is 0.1 + 0.2 != 0.3?"}],
)

# Print the length of each returned thinking block and the start of the answer
for block in r.content:
    if block.type == "thinking":
        print(f"thinking length: {len(block.thinking)} chars")
    if block.type == "text":
        print(f"answer: {block.text[:200]}")

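The output is roughly: thinking length: 547 chars, followed by an answer that is the usual floating-point explanation. Those 547 characters are a summary, not the 8,000-10,000 tokens of reasoning that Opus 4.6 presumably ran under the 10,000-token budget.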
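One rough way to sanity-check this from the client side is to compare what you can see with what you were billed for. As I understand Anthropic's documentation, billed output tokens cover the full thinking rather than the summary, so a large gap hints at summarization. The helper below is a sketch: `visible_vs_billed` is a name I made up, and it relies only on the `content` blocks and `usage.output_tokens` of a Messages API response. Characters and tokens are different units, so treat the comparison as an order-of-magnitude signal only.

```python
from typing import Any


def visible_vs_billed(response: Any) -> dict[str, int]:
    """Compare visible thinking/text size (characters) with billed output tokens."""
    visible_thinking = sum(
        len(block.thinking) for block in response.content if block.type == "thinking"
    )
    visible_text = sum(
        len(block.text) for block in response.content if block.type == "text"
    )
    return {
        "visible_thinking_chars": visible_thinking,
        "visible_text_chars": visible_text,
        "billed_output_tokens": response.usage.output_tokens,
    }
```

For example, a few hundred visible characters against several thousand billed output tokens would be consistent with the thinking having been summarized before it reached you.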

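4. What this means if you build agent tools

After using Claude Code for a while on retrospectives of real projects, I see a few concrete effects:

4.1 Do not treat thinking as a log

If your agent framework treats thinking as "a record of what the model did", three things follow: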

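You cannot use thinking to debug "why did Claude pick this file": what you see is not the reasoning that made the choice, only a summary of it
You cannot use thinking for alignment audits: the summary may omit traces such as "the model first considered X but dropped it because of a safety policy"
You cannot replay: without the original trace, you cannot reproduce the reasoning path in another session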

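4.2 Interleaved thinking is the exception

Someone in the HN comments mentions that interleaved thinking (the agent's thinking between tool calls) is retained more fully in some scenarios. The actual behavior: Opus 4.5+ / Sonnet 4.6+ keep thinking while the KV cache is still warm, and discard it only when the cache has expired (idle > 1h) or the clear_thinking_20251015 header is set explicitly. That is good for short sessions (< 1h of continuous work) and bad for cross-day retrospectives.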

4.3 Sonnet does not have this limitation

In the HN comments, stingraycharles says:

Yes hasn't this been around since Opus 4.6? I very much recall this change happening around January or February, and it was very explicitly to prevent distillation. Sonnet does not have this limitation.

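According to this comment, the Sonnet family has so far not had forced summarization, and its thinking is output in full. If you build agent tools, choosing the Sonnet family for trace experiments gets you raw thinking; choose the Opus family for production deployment, but expect the trace to be unreadable.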

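5. Points I have not fully figured out yet (limitations and items to verify)

The items below are things I have not fully verified and that need testing in more scenarios: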

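Does the Sonnet family really never summarize (to be verified): stingraycharles's claim does not fully match the official docs; Anthropic's official "thinking block preservation by model" page shows that models before Sonnet 4.5 also drop historical thinking. More Sonnet 4.6+ samples are needed to confirm
How thinking output differs outside ctrl+o (to be verified): are the thinking summaries from the Claude Code IDE plugin, the CLI, and direct API calls consistent? So far I have only tested the CLI
Who generates the thinking summary (to be verified): does Fable do a round of "thinking about thinking" itself, or is a separate summarizer used? If the latter, the summary may follow a different distribution from the original reasoning as far as safety evaluation is concerned
Compliance boundary for audit scenarios (a gap): if your agent works in finance, healthcare, or legal settings, Extended Thinking output cannot serve as an audit log. If the auditor requires "a complete record of the AI's reasoning process", there is currently no SDK-level solution; the only route is an enterprise agreement
Cross-session thinking reuse (still investigating): btown points to the KV cache as the key, but there is no public benchmark of the exact cache hit / miss behavior. I have not yet run a controlled experiment on my own production data
Comparison with Codex / Gemini 2.5 Pro / DeepSeek R1 (pitfall): OpenAI drops reasoning tokens "smartly" (the policy is undocumented), Google exposes the full thought directly, and DeepSeek's thought is fully open. If cnblogs readers are choosing an agent backend, the difference in "thinking visibility" among these three options affects debuggability more than the difference in model quality
Whether Anthropic will expose a full_thinking SDK flag (to be verified): a Claude team member hinted at this route in HN 47879561, but as of June 2026 there is no public timeline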

6. Concrete advice for cnblogs readers

If you are using Claude Code for a production agent rather than a demo, I suggest:

If the trace can't be trusted, don't use it as a trace: treat Extended Thinking as "the model's friendly explanation", not as an audit source
Experiment on Sonnet, deploy on Opus: Opus 4.6+ thinking summaries are stable enough for production, while the raw thinking from Sonnet 4.6+ leaves you a path for debugging
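Add an explicit recap for important decisions: in the system prompt, require the model to "state its reasons explicitly before every important decision", pushing the reasoning into text blocks instead of relying on thinking blocks
Record the audit log yourself: have the agent framework layer (your own code) record "why this tool was called" instead of relying on the model's self-report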
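As a minimal sketch of that last point, the framework can write its own append-only record every time it dispatches a tool. `AuditLog` and `call_tool_audited` are hypothetical names, and the `reason` string is whatever your own code knows at dispatch time (for example the rule that fired, or the model's explicit recap from a text block). It is not taken from the thinking block.

```python
import json
import time
from typing import Any, Callable


class AuditLog:
    """Append-only JSONL log written by the framework, not by the model."""

    def __init__(self, path: str) -> None:
        self.path = path

    def record(self, tool: str, args: dict[str, Any], reason: str) -> dict[str, Any]:
        entry = {"ts": time.time(), "tool": tool, "args": args, "reason": reason}
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(entry, ensure_ascii=False) + "\n")
        return entry


def call_tool_audited(
    log: AuditLog,
    tool_name: str,
    fn: Callable[..., Any],
    args: dict[str, Any],
    reason: str,
) -> Any:
    """Record why a tool is being called, then call it."""
    # Log before running, so calls that raise are still on record.
    log.record(tool_name, args, reason)
    return fn(**args)
```

Because the record is produced by your own dispatch code, it stays complete whether or not the provider summarizes, drops, or redacts thinking.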

References

HN thread: https://news.ycombinator.com/item?id=48630535 (253 points, 179 comments)
Patrick McCann's original post: https://patrickmccanna.net/the-text-in-claude-codes-extended-thinking-output-is-not-authentic/
Anthropic Extended Thinking official documentation: https://platform.claude.com/docs/en/build-with-claude/extended-thinking
Anthropic clear_thinking_20251015 postmortem: https://www.anthropic.com/engineering/april-23-postmortem
Anthropic Visible Extended Thinking blog post: https://www.anthropic.com/research/visible-extended-thinking
Amazon Bedrock Extended Thinking implementation details: https://docs.aws.amazon.com/bedrock/latest/userguide/claude-messages-extended-thinking.html
HN comment by btown (on KV cache and the clear_thinking header): https://news.ycombinator.com/item?id=48630535 (search for btown)
HN 47879561: further explanation from a Claude team member on eliding thinking
HN 47884517: discussion of OpenAI's "smartly drops reasoning tokens" behavior