
A Full-Duplex Voice Interaction System with LiveKit

news 2026/7/4 7:44:51

Overview

Set up a full-duplex voice interaction system locally (macOS). It has three parts:

livekit-cli: scaffolds the agent project, where the voice interaction logic lives (ASR, TTS, LLM, interruption handling)
livekit-server: the LiveKit server
A frontend test page

Interaction flow in dev mode

Detailed call chain

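Startup
- agent.py dev starts the local agent server
- It registers with the LiveKit server and waits for jobs to be assigned
User connection
- The user opens the web frontend
- The frontend connects to a room on the LiveKit server over WebRTC
- LiveKit sees that the room needs an agent and notifies your agent.py over WebSocket
Voice processing pipeline (runs entirely inside agent.py)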
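
The "full-duplex" part of the pipeline comes down to interruption (barge-in): the agent keeps listening while it speaks and stops playback when the user starts talking. The toy sketch below illustrates only that control flow with plain asyncio. It does not use the LiveKit Agents API; `speak`, `handle_turn`, and `run_session` are hypothetical names, and the STT, LLM, and TTS stages are stubbed out.

```python
import asyncio


async def speak(text: str, spoken: list[str]) -> None:
    """Stand-in for TTS playback: 'plays' one word at a time."""
    for word in text.split():
        spoken.append(word)
        await asyncio.sleep(0.01)  # pretend each word takes time to play


async def handle_turn(user_text: str, spoken: list[str]) -> None:
    """One turn: transcribed text -> reply -> playback (all stages stubbed)."""
    reply = f"you said {user_text}"  # stand-in for the LLM call
    await speak(reply, spoken)


async def run_session(events: list[tuple[float, str]]) -> list[str]:
    """Feed (delay_seconds, user_text) events; a new utterance interrupts playback."""
    spoken: list[str] = []
    current = None  # the asyncio.Task for the turn being played, if any
    for delay, user_text in events:
        await asyncio.sleep(delay)
        if current is not None and not current.done():
            current.cancel()  # barge-in: stop the agent mid-sentence
            try:
                await current
            except asyncio.CancelledError:
                pass
            spoken.append("<interrupted>")
        current = asyncio.create_task(handle_turn(user_text, spoken))
    if current is not None:
        await current  # let the final reply finish
    return spoken


if __name__ == "__main__":
    events = [(0, "one two three four five six seven eight"), (0.025, "stop")]
    print(asyncio.run(run_session(events)))
```

In the real agent, the "new utterance" signal would come from VAD and turn detection rather than a scripted event list, but the idea of cancelling the in-flight reply is the same.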

livekit-cli

References: docs, GitHub
You first need to register a free [LiveKit Cloud](https://cloud.livekit.io/) account.
Project setup

brew install livekit-cli  # macOS, version: 2.16
lk cloud auth
lk agent init my-agent --template agent-starter-python  # for Node.js, choose agent-starter-node
cd my-agent
uv sync  # fails on Python 3.10; I used Python 3.13
uv run src/agent.py download-files  # Required for the Silero VAD, turn detector, and noise cancellation plugins.

The STT/LLM/TTS components in src/agent.py can be swapped for custom ones (I used Aliyun's ASR/TTS: uv pip install aliyun-python-sdk-core). Then start the agent in one of three modes:

uv run python src/agent.py console  # console mode: test directly in the terminal, no livekit-server needed
uv run python src/agent.py dev  # development mode: connects to livekit-server
uv run python src/agent.py start  # production mode: connects to livekit-server

The .env.local file

# Changed to the local server's values; lk cloud auth initializes these with the LiveKit Cloud settings
LIVEKIT_API_KEY="devkey"
LIVEKIT_API_SECRET="secret"
LIVEKIT_URL="ws://localhost:7880"
ALIYUN_ACCESS_KEY_ID=xxx
ALIYUN_ACCESS_KEY_SECRET=xxx
ALIYUN_APP_KEY=xxx
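
Because `lk cloud auth` fills this file with LiveKit Cloud values, it is easy to leave the agent pointing at the cloud instead of the local server. The sketch below is one way to sanity-check the file before starting the agent. It uses only the standard library; `parse_env` and `check_local_dev` are hypothetical helpers, not part of LiveKit, and the parser handles only simple KEY=VALUE lines.

```python
from urllib.parse import urlparse

REQUIRED_KEYS = ("LIVEKIT_API_KEY", "LIVEKIT_API_SECRET", "LIVEKIT_URL")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines; skips blank lines and '#' comment lines."""
    env: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.split(" #", 1)[0].strip()  # drop a trailing inline comment
        env[key.strip()] = value.strip("\"'")  # drop surrounding quotes
    return env


def check_local_dev(env: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means the file looks local."""
    problems = [f"missing {key}" for key in REQUIRED_KEYS if not env.get(key)]
    url = urlparse(env.get("LIVEKIT_URL", ""))
    if url.scheme not in ("ws", "wss"):
        problems.append("LIVEKIT_URL should start with ws:// or wss://")
    elif url.hostname not in ("localhost", "127.0.0.1"):
        problems.append("LIVEKIT_URL does not point at the local server")
    return problems


if __name__ == "__main__":
    with open(".env.local", encoding="utf-8") as fh:
        for problem in check_local_dev(parse_env(fh.read())):
            print("warning:", problem)
```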

livekit-server

Reference: docs

brew install livekit
livekit-server --dev --bind 0.0.0.0  # default API key: devkey, API secret: secret

Frontend testing

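For the web, you can use livekit-examples/agent-starter-react. Android, iOS, and other platforms are also supported, but I have not tried them yet. For now I had Claude Code write a playground.html that talks to livekit-server.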

Test procedure

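# 1. Start the local LiveKit server
livekit-server --dev --bind 0.0.0.0
# 2. Start the agent
uv run python src/agent.py dev
# Hot reload in dev mode spawns child processes, so Ctrl+C sometimes fails to exit cleanly. Use Ctrl+\ (force quit) instead to kill the whole process tree.
# 3. Create a room + dispatch (new terminal)
lk dispatch create --new-room --agent-name lk-agent --dev
# The output gives you the room name, e.g. room-xxxxx
# 4. Generate a token for that room
lk token create --room room-xxxxx --identity tester --join --allow-source microphone --valid-for 1h --dev --token-only
# 5. Open the test page
# Option 1: double-click playground.html to open it
# Option 2: start an HTTP server from the directory containing playground.html
python3 -m http.server 8081 &
open http://localhost:8081/playground.html
# Paste the token → Connect → allow microphone access → speak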
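
For intuition about what the token in step 4 contains, the sketch below builds an HS256-signed JWT with the standard library. The claim layout (API key as `iss`, identity as `sub`, and a `video` grant with `room`, `roomJoin`, and `canPublishSources`) reflects my understanding of LiveKit access tokens and should be treated as an assumption; `make_join_token` is a hypothetical helper. For real use, prefer `lk token create` or an official LiveKit server SDK.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_join_token(api_key, api_secret, room, identity, ttl_seconds=3600, now=None):
    """Build a JWT granting `identity` permission to join `room` (assumed claim layout)."""
    now = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {
        "iss": api_key,           # which API key signed the token
        "sub": identity,          # participant identity
        "nbf": now,
        "exp": now + ttl_seconds,
        "video": {                # assumed name of the room-permission grant
            "room": room,
            "roomJoin": True,
            "canPublishSources": ["microphone"],
        },
    }
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(
        api_secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256
    ).digest()
    return f"{signing_input}.{_b64url(signature)}"


if __name__ == "__main__":
    # devkey / secret are the dev-mode defaults of livekit-server --dev
    print(make_join_token("devkey", "secret", "room-xxxxx", "tester"))
```

The room name must match the one printed by step 3, since the grant is scoped to a single room.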