
A2UI: Replacing LLM Text Output with Dynamic UI

news 2026/3/27 5:26:03

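A2UI (Agent to UI) is a JSON-based streaming UI protocol that lets an AI agent dynamically generate, control, and respond to user interfaces. Technically, it treats the UI as a pure data payload that the frontend runtime parses and maps to native components.
The backend only ever emits JSON in a fixed format; the complexity lives in the frontend runtime. After the first conversation turn, an LLM call produces JSON that is rendered as UI. Each click on that UI triggers another LLM call, which produces new JSON and therefore a new UI.
Demo: a2ui.org/quickstart/ (requires a fast, reliable network connection).
Initial interaction flow and message structure
In an A2UI system, every interaction starts with a standard JSON-RPC message. When the user types into the chat box, the frontend wraps the input in a message/send request: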

{ "jsonrpc": "2.0", "method": "message/send", "params": { "message": { "messageId": "28f374eb-5682-454a-9366-2350761362e4", "role": "user", "parts": [ { "kind": "text", "text": "Find the 5 top-rated Chinese restaurants in New York" } ] } }, "id": 1 }
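As a minimal sketch, the request above could be assembled by a small helper. The names `SendRequest` and `buildSendRequest` are hypothetical and not part of any A2UI SDK; only the field layout is taken from the message shown above.

```typescript
// Hypothetical types: they mirror the message/send request shown above.
interface TextPart { kind: "text"; text: string; }
interface SendRequest {
  jsonrpc: "2.0";
  method: "message/send";
  params: { message: { messageId: string; role: "user"; parts: TextPart[] } };
  id: number;
}

// Wraps the user's chat input in a JSON-RPC 2.0 envelope.
function buildSendRequest(text: string, messageId: string, id: number): SendRequest {
  return {
    jsonrpc: "2.0",
    method: "message/send",
    params: { message: { messageId, role: "user", parts: [{ kind: "text", text }] } },
    id,
  };
}

const request = buildSendRequest(
  "Find the 5 top-rated Chinese restaurants in New York",
  "28f374eb-5682-454a-9366-2350761362e4",
  1,
);
```

The caller supplies `messageId` and `id` explicitly here so that the sketch stays deterministic; a real client would generate them.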

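When the backend agent receives this request, it calls the large language model (LLM). The LLM's job is not only to produce a text reply but also to generate UI description JSON that conforms to the A2UI specification. A typical LLM response contains three core DataParts (beginRendering, surfaceUpdate, and dataModelUpdate):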

[
  { "beginRendering": { "surfaceId": "restaurant-surface", "root": "main-container" } },
  { "surfaceUpdate": {
      "surfaceId": "restaurant-surface",
      "components": [
        { "id": "main-container", "component": { "Column": { "children": { "explicitList": ["header-text", "result-list"] } } } },
        { "id": "header-text", "component": { "Text": { "text": { "path": "title" }, "usageHint": "h1" } } },
        { "id": "result-list", "component": { "List": { "children": { "template": { "componentId": "item-card", "dataBinding": "/restaurants" } } } } },
        { "id": "item-card", "component": { "Card": { "child": "item-btn" } } },
        { "id": "item-btn", "component": { "Button": { "child": "btn-label", "action": { "name": "view_detail", "context": [{ "key": "id", "value": { "path": "id" } }] } } } },
        { "id": "btn-label", "component": { "Text": { "text": { "path": "name" } } } }
      ]
  } },
  { "dataModelUpdate": {
      "surfaceId": "restaurant-surface",
      "path": "/",
      "contents": [
        { "key": "title", "valueString": "Recommended restaurants" },
        { "key": "restaurants", "valueMap": [
          { "key": "r1", "valueMap": [
            { "key": "id", "valueString": "rest_001" },
            { "key": "name", "valueString": "Xi'an Famous Foods" }
          ] }
        ] }
      ]
  } }
]

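Conclusion: the core idea of A2UI is "UI as data". The agent defines the component structure as an adjacency list via surfaceUpdate (a flat list of components that reference each other by id) and injects business data via dataModelUpdate. The frontend runtime maintains this state and renders it recursively.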
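To make the adjacency-list idea concrete, here is a rough sketch of how a runtime might turn the flat component list into a tree starting from the `root` id. `buildTree`, `ComponentEntry`, and `TreeNode` are hypothetical names chosen for illustration; the child-reference fields (`children.explicitList`, `children.template.componentId`, `child`) are the ones that appear in the example payload above.

```typescript
type Children = { explicitList?: string[]; template?: { componentId: string; dataBinding: string } };
type Props = { children?: Children; child?: string; [key: string]: unknown };
interface ComponentEntry { id: string; component: Record<string, Props>; }
interface TreeNode { id: string; type: string; children: TreeNode[]; }

// Resolves id references recursively; unknown ids and cycles are rejected.
function buildTree(components: ComponentEntry[], rootId: string): TreeNode {
  const byId = new Map<string, ComponentEntry>();
  for (const entry of components) byId.set(entry.id, entry);

  const visit = (id: string, ancestors: Set<string>): TreeNode => {
    const entry = byId.get(id);
    if (!entry) throw new Error(`Unknown component id: ${id}`);
    if (ancestors.has(id)) throw new Error(`Cycle detected at: ${id}`);
    const first = Object.entries(entry.component)[0];
    if (!first) throw new Error(`Component ${id} has no type`);
    const [type, props] = first;
    const childIds = [
      ...(props.children?.explicitList ?? []),
      ...(props.children?.template ? [props.children.template.componentId] : []),
      ...(props.child !== undefined ? [props.child] : []),
    ];
    const nextAncestors = new Set(ancestors).add(id);
    return { id, type, children: childIds.map((childId) => visit(childId, nextAncestors)) };
  };
  return visit(rootId, new Set<string>());
}

const components: ComponentEntry[] = [
  { id: "main-container", component: { Column: { children: { explicitList: ["header-text", "result-list"] } } } },
  { id: "header-text", component: { Text: { text: { path: "title" }, usageHint: "h1" } } },
  { id: "result-list", component: { List: { children: { template: { componentId: "item-card", dataBinding: "/restaurants" } } } } },
  { id: "item-card", component: { Card: { child: "item-btn" } } },
  { id: "item-btn", component: { Button: { child: "btn-label" } } },
  { id: "btn-label", component: { Text: { text: { path: "name" } } } },
];
const tree = buildTree(components, "main-container");
```

Keeping the wire format flat means a later surfaceUpdate can replace a single entry by id without resending the whole tree, which is presumably why the protocol favors it over nested JSON.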

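Second LLM call triggered by user interaction
When the user clicks a button in the UI (for example the restaurant button item-btn generated above), the frontend does not perform any local navigation. Instead, it produces a userAction and sends it back to the backend: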

{ "userAction": { "name": "view_detail", "surfaceId": "restaurant-surface", "sourceComponentId": "item-btn", "timestamp": "2026-01-14T01:00:00Z", "context": { "id": "rest_001" } } }
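The following is a sketch of how a runtime might derive that userAction from the Button's `action` definition. `buildUserAction` is a hypothetical helper, and it assumes that a relative `path` in the action context is looked up in the data item the button was instantiated for.

```typescript
type ContextValue = { path: string } | { literalString: string };
interface ActionDef { name: string; context: { key: string; value: ContextValue }[]; }
interface UserActionMessage {
  userAction: {
    name: string;
    surfaceId: string;
    sourceComponentId: string;
    timestamp: string;
    context: Record<string, unknown>;
  };
}

// Assumption: `scope` is the data item the clicked component is bound to.
function buildUserAction(
  action: ActionDef,
  surfaceId: string,
  sourceComponentId: string,
  scope: Record<string, unknown>,
  now: Date = new Date(),
): UserActionMessage {
  const context: Record<string, unknown> = {};
  for (const { key, value } of action.context) {
    // Path bindings are resolved at click time; literals are copied as-is.
    context[key] = "path" in value ? scope[value.path] : value.literalString;
  }
  return {
    userAction: { name: action.name, surfaceId, sourceComponentId, timestamp: now.toISOString(), context },
  };
}

const viewDetail: ActionDef = { name: "view_detail", context: [{ key: "id", value: { path: "id" } }] };
const itemScope = { id: "rest_001", name: "Xi'an Famous Foods" };
const actionMessage = buildUserAction(
  viewDetail, "restaurant-surface", "item-btn", itemScope, new Date("2026-01-14T01:00:00Z"),
);
```

Note that `Date.prototype.toISOString` includes milliseconds, so the timestamp format differs slightly from the example message above.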

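The backend agent passes this action to the LLM as additional context, and the LLM generates a new piece of UI JSON (for example, a detailed booking form for that restaurant).
LLM response after the interaction: a dynamically generated detail view
From the userAction, the LLM infers that the user wants the details of rest_001 and generates the second round of UI JSON, which overwrites or updates the current rendering surface: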

[
  { "beginRendering": { "surfaceId": "restaurant-surface", "root": "detail-container" } },
  { "surfaceUpdate": {
      "surfaceId": "restaurant-surface",
      "components": [
        { "id": "detail-container", "component": { "Column": { "children": { "explicitList": ["detail-header", "booking-form"] } } } },
        { "id": "detail-header", "component": { "Text": { "text": { "path": "restaurant_name" }, "usageHint": "h2" } } },
        { "id": "booking-form", "component": { "Card": { "child": "form-fields" } } },
        { "id": "form-fields", "component": { "Column": { "children": { "explicitList": ["phone-input", "date-picker", "confirm-btn"] } } } },
        { "id": "phone-input", "component": { "TextField": { "label": { "literalString": "Booking phone" }, "text": { "path": "/form/phone" } } } },
        { "id": "date-picker", "component": { "DateTimeInput": { "label": { "literalString": "Booking date" }, "value": { "path": "/form/date" }, "enableDate": true } } },
        { "id": "confirm-btn", "component": { "Button": { "child": "confirm-text", "primary": true, "action": { "name": "submit_booking", "context": [ { "key": "restaurantId", "value": { "literalString": "rest_001" } }, { "key": "phone", "value": { "path": "/form/phone" } } ] } } } },
        { "id": "confirm-text", "component": { "Text": { "text": { "literalString": "Confirm booking" } } } }
      ]
  } },
  { "dataModelUpdate": {
      "surfaceId": "restaurant-surface",
      "path": "/",
      "contents": [
        { "key": "restaurant_name", "valueString": "Xi'an Famous Foods - Ninth Avenue" },
        { "key": "form", "valueMap": [] }
      ]
  } }
]
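The `contents` array in dataModelUpdate is a list of typed key/value entries rather than a plain object. A runtime would probably decode it into an ordinary object before binding. The sketch below handles only the two entry kinds that appear in this article (`valueString` and `valueMap`); `decodeContents` is a hypothetical name.

```typescript
// Only the entry kinds shown in the payloads above are modeled here.
interface ContentEntry { key: string; valueString?: string; valueMap?: ContentEntry[]; }

// Recursively converts key/value entries into a plain nested object.
function decodeContents(entries: ContentEntry[]): Record<string, unknown> {
  const result: Record<string, unknown> = {};
  for (const entry of entries) {
    if (entry.valueMap !== undefined) {
      result[entry.key] = decodeContents(entry.valueMap);
    } else if (entry.valueString !== undefined) {
      result[entry.key] = entry.valueString;
    }
  }
  return result;
}

const detailModel = decodeContents([
  { key: "restaurant_name", valueString: "Xi'an Famous Foods - Ninth Avenue" },
  { key: "form", valueMap: [] },
]);
// detailModel is { restaurant_name: "Xi'an Famous Foods - Ninth Avenue", form: {} }
```

The empty `form` map matters: it gives the TextField and DateTimeInput bindings (`/form/phone`, `/form/date`) a parent object to write into.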

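Key points:

Context inheritance: the LLM remembers the restaurant selected in the previous step.
Dynamic component switching: what was a list view is now a booking form with TextField and DateTimeInput.
Data binding on the way back: the action of confirm-btn binds by path, so whatever the user typed into phone-input is sent back to the backend as context.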

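Standard component library and rendering protocol
A2UI provides a standard component library that currently contains 18 atomic components in four categories: display, layout, interaction, and form.
Display components (Text, Image, Icon, Video, Audio)
These components present content. Take Image as an example; its properties support responsive fitting: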

{ "id": "poster", "component": { "Image": { "url": { "literalString": "https://cdn.example.com/img.png" }, "usageHint": "mediumFeature", "fit": "cover" } } }

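Layout components (Row, Column, List, Card, Tabs, Divider, Modal)
Layout components build the component tree through their children property. The template mechanism of the List component deserves particular attention: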

{ "List": { "direction": "vertical", "children": { "template": { "componentId": "reusable-card", "dataBinding": "/dataSourceArray" } } } }

The runtime iterates over the data collection that dataBinding points to and instantiates the template component named by componentId once per element.
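The template expansion described above can be sketched as follows. `expandTemplate` and `readPath` are hypothetical helpers. The sketch accepts both arrays and keyed maps as the bound collection, since the earlier restaurant example binds `/restaurants` to a map keyed by `r1`.

```typescript
interface ListTemplate { componentId: string; dataBinding: string; }
interface TemplateInstance { componentId: string; key: string; scope: unknown; }

// Walks a slash-separated path through nested objects or arrays.
function readPath(root: unknown, path: string): unknown {
  let node: unknown = root;
  for (const segment of path.split("/").filter((s) => s.length > 0)) {
    if (node === null || typeof node !== "object") return undefined;
    node = (node as Record<string, unknown>)[segment];
  }
  return node;
}

// One instance per element; each instance carries its element as the data scope
// against which relative paths inside the template would be resolved.
function expandTemplate(template: ListTemplate, model: unknown): TemplateInstance[] {
  const source = readPath(model, template.dataBinding);
  if (source === null || typeof source !== "object") return [];
  return Object.entries(source).map(([key, scope]) => ({
    componentId: template.componentId,
    key,
    scope,
  }));
}

const listModel = {
  dataSourceArray: [{ name: "First" }, { name: "Second" }],
};
const instances = expandTemplate(
  { componentId: "reusable-card", dataBinding: "/dataSourceArray" },
  listModel,
);
// instances has two entries with keys "0" and "1", both using "reusable-card"
```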
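Interaction and form components (Button, TextField, CheckBox, MultipleChoice, Slider, DateTimeInput)
Components such as TextField support two-way data binding. As the user types, the runtime automatically updates its internal data model: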

{ "TextField": { "label": { "literalString": "Contact phone" }, "text": { "path": "/form/phone" }, "textFieldType": "number" } }
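The write half of two-way binding amounts to setting a value at a path in the data model whenever the input changes. A minimal sketch, assuming a plain-object data model and a hypothetical `setAtPath` helper that creates missing intermediate objects:

```typescript
type DataModel = Record<string, unknown>;

// Writes `value` at an absolute path such as "/form/phone".
function setAtPath(model: DataModel, path: string, value: unknown): void {
  const segments = path.split("/").filter((s) => s.length > 0);
  const last = segments.pop();
  if (last === undefined) throw new Error("Path must name a field");
  let node: DataModel = model;
  for (const segment of segments) {
    const next = node[segment];
    // Create intermediate objects so a binding can target a not-yet-existing branch.
    if (next === null || typeof next !== "object") node[segment] = {};
    node = node[segment] as DataModel;
  }
  node[last] = value;
}

const formModel: DataModel = { form: {} };
// Simulates what an input handler on the TextField would do on each change.
setAtPath(formModel, "/form/phone", "5550100");
// formModel is now { form: { phone: "5550100" } }
```

A production runtime would also notify the components bound to that path so they re-render; that part is omitted here.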

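Data binding system (DynamicValue)
Rather than hard-coding display values, A2UI expresses content-bearing properties (such as text, label, and url in the examples above) as DynamicValue structures, which support two modes: literal (an inline value, e.g. literalString) and path (a binding to a location in the data model):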

Type	Data model example	JSON binding	Rendered result
DynamicString	{"name": "Alice"}	{"path": "name"}	"Alice"
DynamicNumber	{"price": 99}	{"path": "price"}	99
DynamicBoolean	{"valid": true}	{"path": "valid"}	true

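Data paths can be absolute (starting with /) or relative (used inside a template loop). This keeps UI structures highly reusable: switching the data source is enough to change what is displayed.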
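A rough sketch of DynamicValue resolution under those rules follows. `resolveValue` and `lookup` are hypothetical names, only the `literalString` and `path` forms seen in this article are modeled, and the relative-path behavior (resolving against the current template item) is an assumption based on the description above.

```typescript
type DynamicValue = { literalString: string } | { path: string };

// Walks a slash-separated path through nested objects.
function lookup(root: unknown, path: string): unknown {
  let node: unknown = root;
  for (const segment of path.split("/").filter((s) => s.length > 0)) {
    if (node === null || typeof node !== "object") return undefined;
    node = (node as Record<string, unknown>)[segment];
  }
  return node;
}

// Absolute paths start at the model root; relative paths start at `scope`,
// which defaults to the root when the component is not inside a template.
function resolveValue(value: DynamicValue, model: unknown, scope: unknown = model): unknown {
  if ("literalString" in value) return value.literalString;
  const base = value.path.startsWith("/") ? model : scope;
  return lookup(base, value.path);
}

const bindingModel = {
  title: "Recommended restaurants",
  restaurants: { r1: { id: "rest_001", name: "Xi'an Famous Foods" } },
};
const itemName = resolveValue({ path: "name" }, bindingModel, bindingModel.restaurants.r1);
const pageTitle = resolveValue({ path: "/title" }, bindingModel, bindingModel.restaurants.r1);
const label = resolveValue({ literalString: "Confirm booking" }, bindingModel);
```

Here `itemName` resolves against the list item, `pageTitle` ignores the item scope because the path is absolute, and `label` bypasses the data model entirely.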
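Boundaries of customization
To understand A2UI, you need to be clear about what the LLM can do and what it is forbidden to do. This is similar to the separation between "design system" and "business logic" in modern web development.
What the LLM fully controls (What & Where)
The LLM decides the business logic and structure of the interface; this part is fully dynamic: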

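Component tree topology: whether text or an image comes first, and whether to wrap things in a Row or a Column.
Component type: choosing the best-suited component for the context (for example, generating a TextField when user input is needed).
Data mapping: which branch of the data model each UI component reads.
Layout weight: the weight property (the JSON counterpart of flex-grow) sets the share of space a component occupies.
Interaction logic: the name and parameter names of the Action sent to the backend when a button is clicked.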

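What the frontend runtime enforces (How it looks)
To keep the LLM from generating interfaces that look odd, are unusable, or are unsafe, styling authority is locked into the frontend:

Visual styling (CSS): the LLM cannot specify concrete hex colors (such as #FF0011) or pixel values (such as 14px). All colors, corner radii, padding, and font families are predefined in the frontend theme.
Interaction details: the scale animation on button press, the scroll inertia of lists, the border highlight when an input gains focus.
Responsive rules: how content wraps on narrow phones versus wide displays is handled by fixed CSS media queries in the frontend.
Underlying implementation: whether Button is rendered as a <button> or as a simulated <div> in HTML is invisible to the LLM.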

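Dimension	LLM-controlled (JSON description)	Frontend-controlled (hardcoded/theme)
Content	✅ Text content, image URLs, option lists	❌ Font family, line height, letter spacing
Structure	✅ Component hierarchy, ordering	❌ DOM nesting inside a component
Style	⚠️ Semantic hints (usageHint="h1")	✅ Concrete px values, hex colors, shadows
Layout	✅ Horizontal/vertical, flex ratios	❌ Concrete gap spacing, responsive breakpoints
Interaction	✅ Which Action a click triggers	✅ Visual feedback on click, animated transitions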
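The "Style" row can be illustrated with a small sketch: the LLM sends only a semantic `usageHint`, and the frontend maps it to concrete values from its own theme. The theme tokens and the `styleForUsageHint` helper below are hypothetical, and the pixel values are purely illustrative.

```typescript
interface TextStyle { fontSize: string; fontWeight: string; }

// Hypothetical theme owned by the frontend; the LLM never sees these values.
const textTheme: Record<string, TextStyle> = {
  h1: { fontSize: "28px", fontWeight: "700" },
  h2: { fontSize: "22px", fontWeight: "600" },
  body: { fontSize: "14px", fontWeight: "400" },
};

// Unknown or missing hints fall back to the body style, so a payload cannot
// introduce a style the theme does not already define.
function styleForUsageHint(usageHint?: string): TextStyle {
  if (usageHint !== undefined && Object.prototype.hasOwnProperty.call(textTheme, usageHint)) {
    return textTheme[usageHint];
  }
  return textTheme.body;
}

const headingStyle = styleForUsageHint("h1");
const rejectedStyle = styleForUsageHint("#FF0011");
```

Because the lookup only ever returns entries from `textTheme`, a hint such as `"#FF0011"` is treated as an unknown token and falls back to the body style instead of being interpreted as a color.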

Technical architecture summary
An A2UI renderer is essentially a state machine. Its core logic is as follows:

// Core renderer logic (pseudocode: Surface, A2uiMessage, registry, and the
// helper methods called below are assumed to be defined elsewhere)
class A2uiRenderer {
  private surfaceState: Map<string, Surface> = new Map();

  processMessages(payload: A2uiMessage) {
    if (payload.surfaceUpdate) {
      // 1. Update the flat component registry
      this.updateComponents(payload.surfaceUpdate);
    }
    if (payload.dataModelUpdate) {
      // 2. Update the reactive data source
      this.updateDataModel(payload.dataModelUpdate);
    }
    // 3. Trigger a recursive re-render of the UI
    this.requestRender();
  }

  renderNode(nodeId: string) {
    const config = this.getComponentConfig(nodeId);
    const componentClass = registry.get(config.type);
    // Resolve data bindings against the current data model
    const props = this.resolveProps(config.props, this.dataModel);
    return new componentClass(props, config.children);
  }
}

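Summary: A2UI as a "protocol" rather than a "framework"
From an engineering perspective, A2UI resembles a "strongly constrained low-code protocol". The reasoning is as follows:

It is declarative: the backend sends no JavaScript code, only a snapshot of the UI state.
It is closed-loop: every valid frontend operation (Action) makes the backend generate a new JSON response, which drives the interface into its next state.
It is safe: because the frontend owns the interpretation of CSS and the underlying DOM structure, a malicious LLM cannot inject style or script tags to mount an XSS attack or break UI conventions, provided the runtime renders bound strings as text rather than as HTML.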
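One way a runtime might enforce the safety property is to validate every incoming component against a fixed catalog before rendering. The sketch below uses the 18 component names listed in this article; `validateComponents` is a hypothetical helper, and a real renderer's catalog and checks may differ.

```typescript
// The 18 component names listed in this article.
const CATALOG: ReadonlySet<string> = new Set([
  "Text", "Image", "Icon", "Video", "Audio",
  "Row", "Column", "List", "Card", "Tabs", "Divider", "Modal",
  "Button", "TextField", "CheckBox", "MultipleChoice", "Slider", "DateTimeInput",
]);

interface RawComponent { id: string; component: Record<string, unknown>; }

// Returns a list of problems; an empty list means every entry uses a known type.
function validateComponents(components: RawComponent[]): string[] {
  const problems: string[] = [];
  for (const { id, component } of components) {
    const types = Object.keys(component);
    if (types.length !== 1) {
      problems.push(`${id}: expected exactly one component type, got ${types.length}`);
    } else if (!CATALOG.has(types[0])) {
      problems.push(`${id}: unknown component type "${types[0]}"`);
    }
  }
  return problems;
}

const report = validateComponents([
  { id: "title", component: { Text: { text: { literalString: "Hello" } } } },
  { id: "evil", component: { script: { src: "https://example.com/x.js" } } },
]);
// report contains a single problem for the "evil" entry
```

Rejecting unknown types at the boundary means an unexpected payload fails closed: nothing outside the catalog ever reaches the component registry.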

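The core value of this design is that it gives the AI agent "hands and feet": the agent can guide users through complex business processes directly via the interface instead of existing only as a chat window.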
