claw-code source analysis, from "inventory" to "runtime": why must a harness do inventory before I/O?

news 2026/7/26 1:59:47

Note: this article analyzes the open-source repository claw-code (the Python/Rust porting workspace that its README describes as Rewriting Project Claw Code).

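1. What the question is asking

Inventory: in a harness, the finite set of command names, tool names, and their metadata that the system acknowledges as existing. It answers which entries are built in, which are plugins, which can be called by the model, and what each entry's responsibility and source hint are.

I/O (input/output): behavior that actually causes side effects in the outside world, such as reading the disk, spawning processes, calling network APIs, or modifying the user's repository.

The core thesis: in an agent system, opening up I/O without a stable, enumerable, filterable inventory ties namespace, routing, permissions, and auditing to ad hoc logic, and every later tool or command addition turns into surgery across the whole codebase. The Python porting layer of claw-code writes this ordering into the code structure itself, as a chain: snapshot JSON → in-memory tuples → routing/registry → (only then) simulated execution.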

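2. What the "inventory layer" looks like in the source

2.1 Data source: the `reference_data` snapshots

The authoritative enumeration of commands and tools comes from JSON files checked into the repository, rather than from scanning the disk at runtime and guessing names: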

src/reference_data/commands_snapshot.json
src/reference_data/tools_snapshot.json

commands.py and tools.py read the JSON at module load time, parse it into the immutable tuples PORTED_COMMANDS and PORTED_TOOLS, and cache the result:

    # commands.py (SNAPSHOT_PATH is a module-level constant in each file)
    @lru_cache(maxsize=1)
    def load_command_snapshot() -> tuple[PortingModule, ...]:
        raw_entries = json.loads(SNAPSHOT_PATH.read_text())
        return tuple(
            PortingModule(
                name=entry['name'],
                responsibility=entry['responsibility'],
                source_hint=entry['source_hint'],
                status='mirrored',
            )
            for entry in raw_entries
        )

    PORTED_COMMANDS = load_command_snapshot()

    # tools.py
    @lru_cache(maxsize=1)
    def load_tool_snapshot() -> tuple[PortingModule, ...]:
        raw_entries = json.loads(SNAPSHOT_PATH.read_text())
        return tuple(
            PortingModule(
                name=entry['name'],
                responsibility=entry['responsibility'],
                source_hint=entry['source_hint'],
                status='mirrored',
            )
            for entry in raw_entries
        )

    PORTED_TOOLS = load_tool_snapshot()

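Takeaway: the inventory is decoupled from the code. The JSON can be diffed, audited, and evolved alongside parity work, while the Python side consumes only "mirrored" entries, which avoids the non-reproducibility that comes with dynamic discovery at runtime.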
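The loading step can be tried in isolation. The following is a minimal sketch, not the repository's code: the `PortingModule` dataclass is an assumed shape reconstructed from the four keyword arguments seen in the loaders, `load_snapshot` is a hypothetical helper, and the two sample entries are invented for the demo.

```python
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Tuple


@dataclass(frozen=True)
class PortingModule:
    # Assumed shape: only the fields the article's loaders pass in.
    name: str
    responsibility: str
    source_hint: str
    status: str = 'mirrored'


def load_snapshot(path: Path) -> Tuple[PortingModule, ...]:
    """Parse a snapshot file into an immutable tuple of frozen records."""
    raw_entries = json.loads(path.read_text(encoding='utf-8'))
    return tuple(
        PortingModule(
            name=entry['name'],
            responsibility=entry['responsibility'],
            source_hint=entry['source_hint'],
        )
        for entry in raw_entries
    )


# Hypothetical snapshot content, written to a temp file for the demo.
sample_entries = [
    {'name': 'BashTool', 'responsibility': 'run shell commands', 'source_hint': 'tools/BashTool'},
    {'name': 'FileReadTool', 'responsibility': 'read files', 'source_hint': 'tools/FileReadTool'},
]
with tempfile.TemporaryDirectory() as tmp_dir:
    snapshot_path = Path(tmp_dir) / 'tools_snapshot.json'
    snapshot_path.write_text(json.dumps(sample_entries), encoding='utf-8')
    inventory = load_snapshot(snapshot_path)

print([module.name for module in inventory])
```

Because both the tuple and the records are immutable, later layers can filter or index the inventory but cannot quietly add a name to it.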

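2.2 Building "views" on top of the inventory: filtering, simple mode, permissions

get_tools() does not reinvent the inventory; it layers policy on top of PORTED_TOOLS (simple mode, whether to include MCP, permission context):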

    def get_tools(
        simple_mode: bool = False,
        include_mcp: bool = True,
        permission_context: ToolPermissionContext | None = None,
    ) -> tuple[PortingModule, ...]:
        tools = list(PORTED_TOOLS)
        if simple_mode:
            tools = [module for module in tools if module.name in {'BashTool', 'FileReadTool', 'FileEditTool'}]
        if not include_mcp:
            tools = [
                module for module in tools
                if 'mcp' not in module.name.lower() and 'mcp' not in module.source_hint.lower()
            ]
        return filter_tools_by_permission_context(tuple(tools), permission_context)

assemble_tool_pool in tool_pool.py merely wraps that "currently allowed subset" in a report object: first the full set (the inventory), then the pool (a view under policy):

    def assemble_tool_pool(
        simple_mode: bool = False,
        include_mcp: bool = True,
        permission_context: ToolPermissionContext | None = None,
    ) -> ToolPool:
        return ToolPool(
            tools=get_tools(
                simple_mode=simple_mode,
                include_mcp=include_mcp,
                permission_context=permission_context,
            ),
            simple_mode=simple_mode,
            include_mcp=include_mcp,
        )

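Takeaway: permissions and product modes are filters over the inventory, not if-else branches scattered across every I/O call site. Without an inventory, the filters have nothing to attach to.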
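The article does not show the body of filter_tools_by_permission_context, so the sketch below is one plausible way such a filter could work. The `deny_names` and `deny_prefixes` fields, the `blocks` method, the `Tool` record, and the sample tools are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tool:
    name: str
    source_hint: str


@dataclass(frozen=True)
class ToolPermissionContext:
    # Assumed fields: exact names and name prefixes to hide, stored lowercase.
    deny_names: frozenset = frozenset()
    deny_prefixes: tuple = ()

    def blocks(self, tool_name: str) -> bool:
        lowered = tool_name.lower()
        # str.startswith accepts a tuple; an empty tuple never matches.
        return lowered in self.deny_names or lowered.startswith(self.deny_prefixes)


def filter_tools_by_permission_context(tools, context: Optional[ToolPermissionContext] = None):
    """Return the subset of the inventory that the context allows."""
    if context is None:
        return tools
    return tuple(tool for tool in tools if not context.blocks(tool.name))


# Hypothetical three-entry inventory.
ALL_TOOLS = (
    Tool('BashTool', 'tools/BashTool'),
    Tool('FileReadTool', 'tools/FileReadTool'),
    Tool('mcp_search', 'services/mcp'),
)
context = ToolPermissionContext(deny_names=frozenset({'bashtool'}), deny_prefixes=('mcp',))
visible = filter_tools_by_permission_context(ALL_TOOLS, context)
print([tool.name for tool in visible])
```

The policy lives in one function that takes the full set and returns a subset, so no individual tool implementation needs to know that it was hidden.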

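2.3 The command "graph" is still a division of the inventory

command_graph.py splits the same PORTED_COMMANDS into builtin / plugin-like / skill-like groups based on source_hint. The topology comes from a metadata field, not from behavior at execution time: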

    def build_command_graph() -> CommandGraph:
        commands = get_commands()
        builtins = tuple(
            module for module in commands
            if 'plugin' not in module.source_hint.lower() and 'skills' not in module.source_hint.lower()
        )
        plugin_like = tuple(module for module in commands if 'plugin' in module.source_hint.lower())
        skill_like = tuple(module for module in commands if 'skills' in module.source_hint.lower())
        return CommandGraph(builtins=builtins, plugin_like=plugin_like, skill_like=skill_like)
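One property of these three predicates is easy to miss: an entry whose source_hint contains both 'plugin' and 'skills' lands in both non-builtin groups, so the groups cover the inventory but are not necessarily disjoint. The sketch below replays the same predicates on four invented commands; the `Command` record, the `partition` helper, and every source_hint value are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Command:
    name: str
    source_hint: str


def partition(commands):
    """Apply the same three substring predicates as build_command_graph."""
    builtins = tuple(
        c for c in commands
        if 'plugin' not in c.source_hint.lower() and 'skills' not in c.source_hint.lower()
    )
    plugin_like = tuple(c for c in commands if 'plugin' in c.source_hint.lower())
    skill_like = tuple(c for c in commands if 'skills' in c.source_hint.lower())
    return builtins, plugin_like, skill_like


# Hypothetical inventory; the last hint matches both predicates on purpose.
COMMANDS = (
    Command('help', 'commands/help.ts'),
    Command('review', 'plugins/review.ts'),
    Command('summarize', 'skills/summarize.ts'),
    Command('bridge', 'plugins/skills/bridge.ts'),
)

builtins, plugin_like, skill_like = partition(COMMANDS)
print([c.name for c in builtins])
print([c.name for c in plugin_like])
print([c.name for c in skill_like])
```

Here 'bridge' shows up in both plugin_like and skill_like, so summing the three group sizes overcounts the inventory by one. Any report derived from the graph would need to account for that if such hints exist in the real snapshot.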

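3. How the "I/O layer" is deliberately deferred in this repository

3.1 Execution entry points: `execute_*` is name validation plus a descriptive message first

No genuinely dangerous I/O is wired to execute_tool. The current implementation is a mirrored shim: only when the name is found in the inventory does it return a string describing how the call would be handled: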

    def execute_tool(name: str, payload: str = '') -> ToolExecution:
        module = get_tool(name)
        if module is None:
            return ToolExecution(
                name=name,
                source_hint='',
                payload=payload,
                handled=False,
                message=f'Unknown mirrored tool: {name}',
            )
        action = f"Mirrored tool '{module.name}' from {module.source_hint} would handle payload {payload!r}."
        return ToolExecution(
            name=module.name,
            source_hint=module.source_hint,
            payload=payload,
            handled=True,
            message=action,
        )

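Commands work the same way (execute_command). Takeaway: the standard rhythm for evolving a harness is to get the calling convention (name, payload, return structure) working within the inventory first, and only then attach a real backend. With the order reversed, a failure during debugging cannot be attributed cleanly to either routing or I/O.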

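3.2 Registry: `ExecutionRegistry` is built entirely from the inventory

build_execution_registry() iterates over PORTED_COMMANDS / PORTED_TOOLS to produce lookup objects, so registry size = number of inventory entries: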

    def build_execution_registry() -> ExecutionRegistry:
        return ExecutionRegistry(
            commands=tuple(MirroredCommand(module.name, module.source_hint) for module in PORTED_COMMANDS),
            tools=tuple(MirroredTool(module.name, module.source_hint) for module in PORTED_TOOLS),
        )

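At runtime, routing results are used to fetch executors from the registry. Without an inventory the registry cannot be built, and routing results have no stable handler to land on.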
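The lookup side of the registry is not shown in the article. The sketch below assumes an exact-name lookup that returns None for unknown names, which is consistent with how `registry.tool(match.name)` is used as a truth test in bootstrap_session, but the real matching rule may differ. The tools-only registry, the inventory pairs, and the routed names are invented for the demo.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class MirroredTool:
    name: str
    source_hint: str

    def execute(self, payload: str) -> str:
        # No side effects: only describe what a real backend would receive.
        return f"Mirrored tool '{self.name}' from {self.source_hint} would handle payload {payload!r}."


@dataclass(frozen=True)
class ExecutionRegistry:
    tools: Tuple[MirroredTool, ...]

    def tool(self, name: str) -> Optional[MirroredTool]:
        # Exact-name lookup (an assumption; the real lookup rule is not shown).
        for candidate in self.tools:
            if candidate.name == name:
                return candidate
        return None


def build_execution_registry(inventory) -> ExecutionRegistry:
    """Build one executor per inventory entry, and nothing else."""
    return ExecutionRegistry(tools=tuple(MirroredTool(name, hint) for name, hint in inventory))


# Hypothetical inventory of (name, source_hint) pairs.
INVENTORY = (('BashTool', 'tools/BashTool'), ('FileReadTool', 'tools/FileReadTool'))
registry = build_execution_registry(INVENTORY)

# Only names that resolve in the registry ever reach execute().
routed_names = ['FileReadTool', 'DeleteEverythingTool']
results = [registry.tool(name).execute('README.md') for name in routed_names if registry.tool(name)]
print(results)
```

The unknown name is dropped before any handler runs, which is the property the section relies on: a string that is not in the inventory has nowhere to execute.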

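4. `PortRuntime`: routing's hard dependency on the inventory

The input to PortRuntime.route_prompt is the user prompt, but the only things it can match are modules in PORTED_COMMANDS and PORTED_TOOLS. It scores tokens against name / source_hint / responsibility and produces a bounded number of RoutedMatch entries: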

    class PortRuntime:
        def route_prompt(self, prompt: str, limit: int = 5) -> list[RoutedMatch]:
            # Tokenize: treat '/' and '-' as separators and lowercase every token.
            tokens = {token.lower() for token in prompt.replace('/', ' ').replace('-', ' ').split() if token}
            by_kind = {
                'command': self._collect_matches(tokens, PORTED_COMMANDS, 'command'),
                'tool': self._collect_matches(tokens, PORTED_TOOLS, 'tool'),
            }
            selected: list[RoutedMatch] = []
            # Take the first match of each kind, if there is one.
            for kind in ('command', 'tool'):
                if by_kind[kind]:
                    selected.append(by_kind[kind].pop(0))
            # Fill the remaining slots with the leftovers, highest score first.
            leftovers = sorted(
                [match for matches in by_kind.values() for match in matches],
                key=lambda item: (-item.score, item.kind, item.name),
            )
            selected.extend(leftovers[: max(0, limit - len(selected))])
            return selected[:limit]
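The helper _collect_matches is not shown in the article. As a rough sketch of what "scoring tokens against name / source_hint / responsibility" could look like, the version below counts how many prompt tokens appear as a substring of any of the three fields. The substring rule, the `score_module` and `tokenize` helpers, the record shapes, and the sample tools are all assumptions; the repository's actual scoring may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortingModule:
    name: str
    responsibility: str
    source_hint: str


@dataclass(frozen=True)
class RoutedMatch:
    kind: str
    name: str
    source_hint: str
    score: int


def tokenize(prompt: str) -> set:
    # Same tokenization as route_prompt: '/' and '-' act as separators.
    return {token.lower() for token in prompt.replace('/', ' ').replace('-', ' ').split() if token}


def score_module(tokens, module: PortingModule) -> int:
    """Count tokens that occur in any metadata field (assumed scoring rule)."""
    haystacks = (module.name.lower(), module.source_hint.lower(), module.responsibility.lower())
    return sum(1 for token in tokens if any(token in haystack for haystack in haystacks))


def collect_matches(tokens, modules, kind: str) -> list:
    matches = []
    for module in modules:
        score = score_module(tokens, module)
        if score > 0:
            matches.append(RoutedMatch(kind, module.name, module.source_hint, score))
    # Highest score first; ties broken by name so the order is deterministic.
    return sorted(matches, key=lambda item: (-item.score, item.name))


# Hypothetical inventory.
TOOLS = (
    PortingModule('BashTool', 'run shell commands', 'tools/BashTool'),
    PortingModule('FileReadTool', 'read files from disk', 'tools/FileReadTool'),
    PortingModule('FileEditTool', 'edit files in place', 'tools/FileEditTool'),
)

matches = collect_matches(tokenize('read file'), TOOLS, 'tool')
print([(match.name, match.score) for match in matches])
```

Whatever the real scoring rule is, the search space is the fixed tuple passed in, so the router cannot return a name that the inventory does not contain.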

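The flow of bootstrap_session follows a clear order:

Build the context and run setup (environment introspection)
QueryEnginePort.from_workspace() (which also pulls in manifest / summary related state)
history records commands={len(PORTED_COMMANDS)}, tools={len(PORTED_TOOLS)}, explicitly treating inventory size as session metadata
route_prompt → build_execution_registry() → execute the shim only for matched names
Pass matched_commands / matched_tools / the inferred denials to submit_message / stream_submit_message on QueryEnginePort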

    def bootstrap_session(self, prompt: str, limit: int = 5) -> RuntimeSession:
        context = build_port_context()
        setup_report = run_setup(trusted=True)
        setup = setup_report.setup
        history = HistoryLog()
        engine = QueryEnginePort.from_workspace()
        history.add('context', f'python_files={context.python_file_count}, archive_available={context.archive_available}')
        history.add('registry', f'commands={len(PORTED_COMMANDS)}, tools={len(PORTED_TOOLS)}')
        matches = self.route_prompt(prompt, limit=limit)
        registry = build_execution_registry()
        command_execs = tuple(
            registry.command(match.name).execute(prompt)
            for match in matches
            if match.kind == 'command' and registry.command(match.name)
        )
        tool_execs = tuple(
            registry.tool(match.name).execute(prompt)
            for match in matches
            if match.kind == 'tool' and registry.tool(match.name)
        )
        denials = tuple(self._infer_permission_denials(matches))
        stream_events = tuple(engine.stream_submit_message(
            prompt,
            matched_commands=tuple(match.name for match in matches if match.kind == 'command'),
            matched_tools=tuple(match.name for match in matches if match.kind == 'tool'),
            denied_tools=denials,
        ))
        turn_result = engine.submit_message(
            prompt,
            matched_commands=tuple(match.name for match in matches if match.kind == 'command'),
            matched_tools=tuple(match.name for match in matches if match.kind == 'tool'),
            denied_tools=denials,
        )
        # The excerpt ends here; the rest of the function is not shown.

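Takeaway: routing is a search problem defined over a finite inventory, and I/O should act only on symbols that routing has already resolved. When I/O is written first, common anti-patterns are "guessing paths from strings" and "extracting shell fragments with regexes", neither of which is enumerable or auditable.

The permission-denial example (_infer_permission_denials) is likewise built on already-matched tool names (for example, bash-like tools), which shows that a deny-list or gate needs name semantics, and the names come from the inventory.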
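The body of _infer_permission_denials is not shown, so the following is only a guess at its general shape: a gate that inspects the names of already-routed matches and flags bash-like tools. The `PermissionDenial` record, the reason string, the `'bash' in name` rule, and the sample matches are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoutedMatch:
    kind: str
    name: str


@dataclass(frozen=True)
class PermissionDenial:
    # Hypothetical record; the real return type is not shown in the article.
    tool_name: str
    reason: str


def infer_permission_denials(matches) -> list:
    """Flag bash-like tools among matches that routing has already resolved."""
    denials = []
    for match in matches:
        if match.kind == 'tool' and 'bash' in match.name.lower():
            denials.append(PermissionDenial(match.name, 'shell execution stays gated in this sketch'))
    return denials


# Hypothetical routing result: two tools and one command.
matches = (
    RoutedMatch('tool', 'BashTool'),
    RoutedMatch('tool', 'FileReadTool'),
    RoutedMatch('command', 'bash-help'),
)
denials = infer_permission_denials(matches)
print([denial.tool_name for denial in denials])
```

The gate never parses the prompt or a shell string. It only looks at the kind and name of resolved matches, which is why it needs the inventory to exist first.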

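5. `QueryEnginePort`: sessions and budget, still taking the "matched set" as input

submit_message does not "discover" tools on its own. It receives the matched_commands, matched_tools, and denied_tools that the caller has already computed, then handles the summary, usage accounting, transcript, and compaction policy: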

    # Excerpt from the body of submit_message.
    summary_lines = [
        f'Prompt: {prompt}',
        f'Matched commands: {", ".join(matched_commands) if matched_commands else "none"}',
        f'Matched tools: {", ".join(matched_tools) if matched_tools else "none"}',
        f'Permission denials: {len(denied_tools)}',
    ]
    output = self._format_output(summary_lines)
    # Project the usage for this turn before committing it to the session.
    projected_usage = self.total_usage.add_turn(prompt, output)
    stop_reason = 'completed'
    if projected_usage.input_tokens + projected_usage.output_tokens > self.config.max_budget_tokens:
        stop_reason = 'max_budget_reached'
    self.mutable_messages.append(prompt)
    self.transcript_store.append(prompt)
    self.permission_denials.extend(denied_tools)
    self.total_usage = projected_usage
    self.compact_messages_if_needed()
    return TurnResult(
        prompt=prompt,
        output=output,
        matched_commands=matched_commands,
        matched_tools=matched_tools,
        permission_denials=denied_tools,
        usage=self.total_usage,
        stop_reason=stop_reason,
    )

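render_summary() in turn aggregates the manifest plus the command/tool backlog (again sourced from the inventory), which shows that the system surface presented to users and maintainers shares its source with the inventory.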
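The budget check in submit_message depends on add_turn, whose implementation the article does not show. The sketch below assumes an immutable usage record and estimates tokens by whitespace-separated word count, which is a stand-in and not a real tokenizer. `UsageSummary`, `stop_reason_for`, and the sample strings are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageSummary:
    input_tokens: int = 0
    output_tokens: int = 0

    def add_turn(self, prompt: str, output: str) -> 'UsageSummary':
        # Assumption: approximate token counts with whitespace word counts.
        return UsageSummary(
            input_tokens=self.input_tokens + len(prompt.split()),
            output_tokens=self.output_tokens + len(output.split()),
        )


def stop_reason_for(usage: UsageSummary, max_budget_tokens: int) -> str:
    """Mirror the comparison used in the submit_message excerpt."""
    if usage.input_tokens + usage.output_tokens > max_budget_tokens:
        return 'max_budget_reached'
    return 'completed'


total_usage = UsageSummary()
projected = total_usage.add_turn('list the files', 'Matched tools: FileReadTool')

print(projected)
print(stop_reason_for(projected, max_budget_tokens=10))
print(stop_reason_for(projected, max_budget_tokens=5))
```

Returning a new record from add_turn keeps the running total untouched until the caller decides to commit it, which matches the "project first, assign afterwards" order in the excerpt.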

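6. The bootstrap-stage narrative: `bootstrap_graph` hard-codes the order

build_bootstrap_graph() describes the whole startup chain as string-named stages, with "setup() + commands/agents parallel load" placed before "query engine submit loop":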

    def build_bootstrap_graph() -> BootstrapGraph:
        return BootstrapGraph(
            stages=(
                'top-level prefetch side effects',
                'warning handler and environment guards',
                'CLI parser and pre-action trust gate',
                'setup() + commands/agents parallel load',
                'deferred init after trust',
                'mode routing: local / remote / ssh / teleport / direct-connect / deep-link',
                'query engine submit loop',
            )
        )

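This matches the code shown earlier: load, trust, and mode selection come first, and only then does the submit loop begin. In a more complete product, the "commands/agents parallel load" step is where inventory and policy are assembled; without it, the query loop has no stable tool surface to present to the model or the user.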
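Since the stages are plain strings in a tuple, the ordering claim can be checked mechanically. This is a small sketch of such a check; the `stage_index` helper is hypothetical, and the stage strings are copied from the excerpt above.

```python
# Stage names copied from the build_bootstrap_graph excerpt.
STAGES = (
    'top-level prefetch side effects',
    'warning handler and environment guards',
    'CLI parser and pre-action trust gate',
    'setup() + commands/agents parallel load',
    'deferred init after trust',
    'mode routing: local / remote / ssh / teleport / direct-connect / deep-link',
    'query engine submit loop',
)


def stage_index(fragment: str) -> int:
    """Return the position of the first stage whose name contains the fragment."""
    for index, stage in enumerate(STAGES):
        if fragment in stage:
            return index
    raise ValueError(f'no stage contains {fragment!r}')


inventory_stage = stage_index('commands/agents parallel load')
trust_gate_stage = stage_index('pre-action trust gate')
submit_stage = stage_index('query engine submit loop')

# The inventory-loading stage sits after the trust gate and before the submit loop.
print(trust_gate_stage < inventory_stage < submit_stage)
```

A check like this could sit in a test suite so that reordering the stages becomes a deliberate, reviewed change.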

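7. Parity audit: the inventory is the anchor for "measurable consistency"

parity_audit.py computes ratio statistics between the archive side and the current Python tree for root files, directories, command entries, and tool entries. Command/tool coverage is tied directly to commands_snapshot / tools_snapshot and the archive: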

    @dataclass(frozen=True)
    class ParityAuditResult:
        archive_present: bool
        root_file_coverage: tuple[int, int]
        directory_coverage: tuple[int, int]
        total_file_ratio: tuple[int, int]
        command_entry_ratio: tuple[int, int]
        tool_entry_ratio: tuple[int, int]
        missing_root_targets: tuple[str, ...]
        missing_directory_targets: tuple[str, ...]

Takeaway: without an inventory there is no engineering metric like "entry coverage", and porting progress degrades into a feeling rather than data.
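The `tuple[int, int]` fields suggest (covered, total) pairs, although the article does not say how they are computed. Under that reading, a coverage figure could be derived as in the sketch below. The `coverage` and `as_percent` helpers and the tool names in both lists are hypothetical.

```python
from typing import Iterable, Tuple


def coverage(expected_names: Iterable[str], present_names: Iterable[str]) -> Tuple[int, int]:
    """Return (covered, total): how many expected names are present."""
    expected = set(expected_names)
    return (len(expected & set(present_names)), len(expected))


def as_percent(ratio: Tuple[int, int]) -> float:
    covered, total = ratio
    # An empty expectation counts as 0% rather than dividing by zero.
    return 0.0 if total == 0 else 100.0 * covered / total


# Hypothetical name lists: what the archive defines vs. what the snapshot mirrors.
archive_tools = ['BashTool', 'FileReadTool', 'FileEditTool', 'GrepTool']
snapshot_tools = ['BashTool', 'FileReadTool', 'FileEditTool']

tool_entry_ratio = coverage(archive_tools, snapshot_tools)
missing = sorted(set(archive_tools) - set(snapshot_tools))

print(tool_entry_ratio, as_percent(tool_entry_ratio), missing)
```

Both the ratio and the list of missing names fall out of simple set operations, which is only possible because each side is an explicit list of names.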

8. Conclusion: why inventory must come before I/O (in the context of this repository)

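Dimension	If I/O comes first	Inventory first (this repository's approach)
Namespace	Any string may trigger side effects	Only names in `PORTED_*` can enter the execution chain
Routing	Hard to define what counts as a legitimately matched tool	`route_prompt` scores over a fixed set of modules
Permissions	Permission logic is scattered across individual syscalls	`ToolPermissionContext` and denial inference attach to module names and metadata
Audit / replay	Logs drift apart from the real capability surface	history / TurnResult record which registered symbols were matched
Porting	No parity checks or snapshot diffs are possible	JSON snapshots + audit quantify progress
Evolution	Every new tool requires changes in many places	Add or remove a JSON entry → registry and routing inherit it automatically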