当前位置：首页 > news >正文

OmAgent与本地模型部署：使用Ollama和LocalAI的完整教程

news 2026/6/10 18:48:57

OmAgent与本地模型部署：使用Ollama和LocalAI的完整教程

【免费下载链接】OmAgent[EMNLP-2024] Build multimodal language agents for fast prototype and production项目地址: https://gitcode.com/gh_mirrors/om/OmAgent

OmAgent是一个强大的多模态语言智能体框架，让开发者能够轻松构建复杂的AI代理系统。本教程将指导您如何在OmAgent中使用Ollama和LocalAI部署本地模型，实现完全本地化的AI应用部署方案，无需依赖外部API服务。😊

🚀 为什么选择本地模型部署？

在AI应用开发中，本地模型部署提供了诸多优势：

数据隐私保护：敏感数据无需离开本地环境
成本控制：避免API调用费用，特别适合高频使用场景
网络独立性：不依赖互联网连接，保证服务稳定性
自定义灵活性：完全控制模型参数和配置

📦 OmAgent项目架构概览

OmAgent采用模块化设计，核心组件包括：

omagent-core：核心框架，包含工作流引擎、记忆系统、工具系统等
examples：丰富的示例项目，涵盖多种应用场景
docs：详细的概念文档和教程
docker：Conductor服务器的Docker部署配置

🔧 准备工作：环境搭建

1. 安装OmAgent核心库

首先克隆项目并安装核心依赖：

git clone https://gitcode.com/gh_mirrors/om/OmAgent cd OmAgent pip install omagent-core

2. 部署Conductor服务器

OmAgent使用Conductor作为工作流编排引擎，通过Docker快速部署：

cd docker docker-compose up -d

这将启动三个服务：

Conductor服务器（端口8080）：工作流编排引擎
Redis（端口6379）：短时记忆存储
Elasticsearch（端口9200）：日志和索引存储

🦙 方案一：使用Ollama部署本地LLM

1. 安装和启动Ollama

Ollama是一个简单易用的本地大语言模型运行环境：

# 安装Ollama（根据您的操作系统选择） curl -fsSL https://ollama.com/install.sh | sh # 启动Ollama服务 ollama serve # 拉取模型（例如Llama 3.2） ollama pull llama3.2:1b

2. 配置OmAgent使用Ollama

修改LLM配置文件examples/step1_simpleVQA/configs/llms/gpt.yml：

name: OpenaiGPTLLM model_id: llama3.2:1b # Ollama模型名称 api_key: ${env| custom_openai_key, abcd} # API密钥非必需 endpoint: ${env| custom_openai_endpoint, http://localhost:11434/v1} temperature: 0 vision: true

3. 设置环境变量

export custom_openai_endpoint="http://localhost:11434/v1" export custom_openai_key="dummy_key" # Ollama不需要真实密钥

4. 编译容器配置

cd examples/step1_simpleVQA python compile_container.py

这将生成container.yaml文件，其中包含了所有组件的配置。

🤖 方案二：使用LocalAI部署完整AI栈

1. 安装LocalAI

LocalAI支持多种模型后端，包括Whisper语音识别和文本嵌入模型：

# 安装LocalAI # 请参考 https://github.com/mudler/LocalAI 的安装指南 # 下载所需模型 # Whisper语音识别模型 wget -P /usr/share/local-ai/models/ https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-model-whisper-base.en.bin # 文本嵌入模型 wget -P /usr/share/local-ai/models/ https://huggingface.co/hugging-quants/Llama-3.2-1B-Instruct-Q4_K_M-GGUF/resolve/main/llama-3.2-1b-instruct-q4_k_m.gguf

2. 创建LocalAI配置文件

嵌入模型配置(embedding.yaml)：

name: text-embedding-ada-002 backend: llama-cpp embeddings: true parameters: model: llama-3.2-1b-instruct-q4_k_m.gguf

Whisper配置(whisper.yaml)：

name: whisper backend: whisper parameters: model: ggml-model-whisper-base.en.bin

3. 启动LocalAI服务

# 初始运行（链接配置文件） local-ai run /path/to/embedding.yaml local-ai run /path/to/whisper.yaml # 后续运行 local-ai run

4. 配置OmAgent使用LocalAI

修改examples/video_understanding/configs/llms/json_res.yml：

name: OpenaiTextEmbeddingV3 model_id: text-embedding-ada-002 dim: 2048 endpoint: ${env| custom_openai_endpoint, http://localhost:8080/v1} api_key: ${env| custom_openai_key, openai_api_key}

更新视频处理器配置examples/video_understanding/configs/workers/video_preprocessor.yml：

name: VideoPreprocessor llm: ${sub|gpt4o} use_cache: true scene_detect_threshold: 27 frame_extraction_interval: 5 stt: name: STT endpoint: http://localhost:8080/v1 api_key: ${env| custom_openai_key, openai_api_key} model_id: whisper output_parser: name: DictParser text_encoder: ${sub| text_encoder}

🎯 验证部署：运行示例应用

1. 简单视觉问答示例

cd examples/step1_simpleVQA python run_cli.py

这将启动一个命令行交互界面，您可以上传图片并提问，OmAgent将使用本地部署的模型进行回答。

2. 视频理解应用

cd examples/video_understanding python run_webpage.py

打开浏览器访问http://127.0.0.1:7860，您将看到一个完整的视频理解Web界面，支持上传视频并提问。

🔍 配置详解：关键文件说明

容器配置文件 (`container.yaml`)

这是OmAgent的核心配置文件，管理所有组件的依赖关系：

# Redis连接配置 RedisConnector: name: RedisConnector host: value: localhost env_var: REDIS_HOST port: value: 6379 env_var: REDIS_PORT # Conductor服务器配置 ConductorConnector: name: ConductorConnector host: value: http://localhost:8080 env_var: CONDUCTOR_HOST

工作流定义

OmAgent使用基于图的工作流引擎，在examples/step1_simpleVQA/run_cli.py中可以看到：

# 初始化简单VQA工作流 workflow = ConductorWorkflow(name="step1_simpleVQA") # 配置工作流任务 task1 = simple_task(task_def_name="InputInterface", task_reference_name="input_task") task2 = simple_task( task_def_name="SimpleVQA", task_reference_name="simple_vqa", inputs={"user_instruction": task1.output("user_instruction")}, ) # 配置工作流执行流程 workflow >> task1 >> task2