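An In-Depth Look at the HappyHorse 1.1 Video Generation Model on Alibaba Cloud Model Studio (Bailian): Capabilities and Hands-On Python/Java Integration
This article is based on the official HappyHorse 1.1 documentation and API reference released in June 2026 for Alibaba Cloud Model Studio (Bailian / DashScope). It covers the architecture upgrades, the three generation modes (T2V/I2V/R2V), the full API parameter reference, and production-oriented examples using the Python DashScope SDK and asynchronous HTTP calls from Java.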
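1. Background and Positioning
In June 2026, Alibaba officially launched the HappyHorse 1.1 (快乐小马 1.1) video generation model on the Alibaba Cloud Model Studio platform. As a major iteration on HappyHorse 1.0, version 1.1 does not simply scale up parameter count. Instead, it systematically addresses five core pain points of AI video in industrial content production (short dramas, e-commerce ads, brand marketing, game CG trailers):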
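- Weak motion expressiveness (sluggish movement, ghosting) → stronger motion modeling and temporal consistency
- Poor subject consistency (characters or products "changing faces" across shots) → R2V mode with up to 9 reference images
- Weak instruction following (only the first half of a long prompt is honored) → improved long-context semantic understanding and shot planning
- A heavy "AI look" (oily sheen, over-sharpening) → realistic skin texture with less smearing
- Audio out of sync with video → native joint audio-video generation, with support for describing ambient sound, score, and BGM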
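HappyHorse 1.1 is built on a single-stream Transformer architecture with unified encoding. Text, image, video, and audio are processed in one shared representation space, which enables natively synchronized audio and video generation and sets the model apart from silent-only video diffusion models.
Official site: https://happyhorsecn.cn/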
2. Core Generation Modes
At the API level, HappyHorse 1.1 exposes three separate Model IDs, one per generation paradigm:
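| Mode | API Model ID | Input requirements | Typical scenarios |
|---|---|---|---|
| T2V (Text-to-Video) | `happyhorse-1.1-t2v` | Text prompt (required) | Concept shorts, mood shots, ideas with no source material |
| I2V (Image-to-Video) | `happyhorse-1.1-i2v` | Text prompt + first-frame image URL (required) | Animating product shots, bringing static posters to life |
| R2V (Reference-to-Video) | `happyhorse-1.1-r2v` | Text prompt + 1 to 9 reference images (type `reference_image`) | Consistent characters across short-drama shots, preserving the identity of a brand ambassador or product |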
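R2V is the flagship capability of 1.1. You pass in multi-view reference images (a character's front view, side view, props, and so on), and the model keeps the appearance of those subjects consistent throughout generation. This targets the long-standing AIGC video problem of the subject looking like a different person in every frame.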
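3. API Technical Specification
3.1 Endpoint and Authentication
Endpoint (the China North 2 (Beijing) region is recommended):
```
POST https://dashscope.aliyuncs.com/api/v1/services/aigc/video-generation/video-synthesis
```
If you use a workspace-specific domain:
```
https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/api/v1/services/aigc/video-generation/video-synthesis
```
Authentication:
```
Authorization: Bearer YOUR_DASHSCOPE_API_KEY
```
Async flag (required for HTTP calls):
```
X-DashScope-Async: enable
```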
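As a minimal sketch of how the endpoint and headers above fit together, the following builds a task-creation request with the Python standard library only. Nothing is sent. The helper name `build_submit_request` and the placeholder key are illustrative assumptions and are not part of the official SDK.

```python
import json
import urllib.request

ENDPOINT = ("https://dashscope.aliyuncs.com/api/v1/services/aigc/"
            "video-generation/video-synthesis")


def build_submit_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build the POST request for task creation (hypothetical helper; nothing is sent)."""
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "X-DashScope-Async": "enable",  # async flag, required for HTTP calls
            "Content-Type": "application/json",
        },
    )


req = build_submit_request(
    "sk-xxxxxx",  # placeholder key
    {"model": "happyhorse-1.1-t2v", "input": {"prompt": "a white horse"}},
)
print(req.get_method(), req.full_url)
print(dict(req.header_items()))
```

To actually submit the task you would pass `req` to `urllib.request.urlopen`. Note that `urllib` normalizes header capitalization internally, which is harmless because HTTP header names are case-insensitive.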
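3.2 Common Request Body Structure
The `media` array below shows both item types for illustration only. T2V requests omit `media`, I2V sends a `first_frame` item, and R2V sends `reference_image` items.
```json
{
  "model": "happyhorse-1.1-t2v",
  "input": {
    "prompt": "A detailed description of the desired video content...",
    "media": [
      { "type": "first_frame", "url": "https://.../img.png" },
      { "type": "reference_image", "url": "https://.../ref1.jpg" }
    ]
  },
  "parameters": {
    "resolution": "720P",
    "ratio": "16:9",
    "duration": 5,
    "watermark": true,
    "seed": 42
  }
}
```
Key fields: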
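| Field | Type | Description |
|---|---|---|
| `prompt` | String | Required; at most 5,000 English characters or 2,500 Chinese characters; describes the subject, action, scene, camera work, and audio requirements |
| `media` | Array | Optional for T2V; I2V passes a `first_frame` item; R2V passes 1 to 9 `reference_image` items |
| `resolution` | String | Output resolution tier, e.g. `720P` |
| `ratio` | String | Aspect ratio, e.g. `16:9` |
| `duration` | Integer | 3 to 15 seconds, default 5 |
| `watermark` | Boolean | Whether to add a watermark to the generated video |
| `seed` | Integer | Random seed in [0, 2147483647]; fixing the seed improves reproducibility |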
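The limits in the table can be checked client-side before a task is submitted. The sketch below is one way to do that. The helper names are hypothetical, and the prompt-length rule (each CJK character counts as 2 units against a 5,000-unit budget) is an assumption derived from the "5,000 English or 2,500 Chinese characters" wording, not a documented algorithm.

```python
from typing import Optional

MAX_PROMPT_UNITS = 5000   # assumed budget: 5,000 English or 2,500 Chinese characters
MAX_SEED = 2147483647


def prompt_units(prompt: str) -> int:
    """Count CJK ideographs as 2 units and every other character as 1 (assumed rule)."""
    return sum(2 if "\u4e00" <= ch <= "\u9fff" else 1 for ch in prompt)


def build_parameters(prompt: str, duration: int = 5, seed: Optional[int] = None) -> dict:
    """Validate the documented limits and return the checked `parameters` subset."""
    if not prompt or prompt_units(prompt) > MAX_PROMPT_UNITS:
        raise ValueError("prompt is empty or exceeds the length limit")
    if not 3 <= duration <= 15:
        raise ValueError("duration must be between 3 and 15 seconds")
    params = {"duration": duration}
    if seed is not None:
        if not 0 <= seed <= MAX_SEED:
            raise ValueError("seed must be in [0, 2147483647]")
        params["seed"] = seed
    return params


print(build_parameters("A white horse gallops through a wheat field", duration=5, seed=42))
```

Failing fast on the client avoids spending a request (and a rate-limit slot) on a task the service would reject anyway.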
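3.3 Response and Asynchronous Polling
Task creation response (HTTP 202):
```json
{
  "output": {
    "task_id": "a1b2c3d4-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
    "task_status": "PENDING"
  },
  "request_id": "xxxx"
}
```
Polling endpoint:
```
GET https://dashscope.aliyuncs.com/api/v1/tasks/{task_id}
Header: Authorization: Bearer YOUR_API_KEY
```
Example response in the terminal state `SUCCEEDED`:
```json
{
  "output": {
    "task_status": "SUCCEEDED",
    "video_url": "https://dashscope-result.oss-cn-xxxx.aliyuncs.com/xxxxx/result.mp4"
  },
  "usage": { "duration": 5 }
}
```
A `task_id` is valid for 24 hours. The `video_url` usually expires as well (typically after 24 to 72 hours), so in production you should copy the file to your own storage promptly.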
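Because the result URL expires, a pipeline typically interprets the poll response and archives the file right away. The following is a minimal sketch using only the standard library. The function names `extract_video_url` and `archive_video` are hypothetical, and the status values are the ones that appear in this article.

```python
import json
import shutil
import urllib.request
from typing import Optional


def extract_video_url(poll_body: str) -> Optional[str]:
    """Return the video URL on success, None while the task is still in progress."""
    output = json.loads(poll_body).get("output", {})
    status = output.get("task_status")
    if status == "SUCCEEDED":
        return output["video_url"]
    if status in ("FAILED", "CANCELED"):
        raise RuntimeError(f"Task ended with status {status}: {output}")
    return None  # PENDING / RUNNING: keep polling


def archive_video(video_url: str, dest_path: str) -> None:
    """Stream the generated video to local storage before the URL expires."""
    with urllib.request.urlopen(video_url, timeout=60) as resp, open(dest_path, "wb") as f:
        shutil.copyfileobj(resp, f)


sample = '{"output": {"task_status": "SUCCEEDED", "video_url": "https://example.com/result.mp4"}}'
print(extract_video_url(sample))
```

In a real service `archive_video` would more likely upload to object storage than write to local disk. The point is only that the copy should happen as soon as the task succeeds.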
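4. Python Examples (DashScope SDK)
4.1 Environment Setup
Quote the version specifier so the shell does not treat `>` as a redirect:
```bash
pip install "dashscope>=1.25.16"
export DASHSCOPE_API_KEY="sk-xxxxxx"
```
Windows PowerShell:
```powershell
$env:DASHSCOPE_API_KEY="sk-xxxxxx"
```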
4.2 T2V (Text-to-Video)
```python
# -*- coding: utf-8 -*-
"""HappyHorse 1.1 - Text to Video (T2V) example."""
import os

import dashscope
from dashscope import VideoSynthesis

dashscope.base_http_api_url = "https://dashscope.aliyuncs.com/api/v1"
API_KEY = os.getenv("DASHSCOPE_API_KEY")


def happyhorse_t2v(prompt: str, duration: int = 5, resolution: str = "720P") -> str:
    # async_call() submits the task and returns immediately with a task_id;
    # call() would block until the task finishes, making the wait() below redundant.
    rsp = VideoSynthesis.async_call(
        api_key=API_KEY,
        model="happyhorse-1.1-t2v",
        prompt=prompt,
        duration=duration,
        resolution=resolution,
        ratio="16:9",
        watermark=False,
    )
    if rsp.status_code != 200:
        raise RuntimeError(f"Submit failed [{rsp.status_code}]: {rsp.message}")
    print(f"[T2V] Task submitted: {rsp.output.task_id}")

    # wait() polls internally until the task reaches a terminal state.
    # Its first parameter is named `task` (a task_id string or the submit response).
    result = VideoSynthesis.wait(task=rsp, api_key=API_KEY)
    if result.status_code != 200 or result.output.task_status != "SUCCEEDED":
        raise RuntimeError(f"Task failed: {result.output}")
    print(f"[T2V] Video URL: {result.output.video_url}")
    return result.output.video_url


if __name__ == "__main__":
    happyhorse_t2v(
        prompt="Golden wheat fields rippling in the wind at sunset as a white horse "
               "gallops through, slow motion, cinematic, warm tones",
        duration=5,
        resolution="720P",
    )
```
4.3 I2V (Image-to-Video)
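```python
# Reuses VideoSynthesis and API_KEY from section 4.2.
def happyhorse_i2v(prompt: str, image_url: str, duration: int = 5) -> str:
    rsp = VideoSynthesis.async_call(
        api_key=API_KEY,
        model="happyhorse-1.1-i2v",
        prompt=prompt,
        # First frame. The source passes `image_url`; DashScope's earlier i2v
        # examples name this argument `img_url`, so verify against your SDK version.
        image_url=image_url,
        duration=duration,
        resolution="720P",
        ratio="16:9",
        watermark=False,
    )
    if rsp.status_code != 200:
        raise RuntimeError(f"Submit failed [{rsp.status_code}]: {rsp.message}")

    result = VideoSynthesis.wait(task=rsp, api_key=API_KEY)
    if result.status_code != 200 or result.output.task_status != "SUCCEEDED":
        raise RuntimeError(f"Task failed: {result.output}")
    print(f"[I2V] Video URL: {result.output.video_url}")
    return result.output.video_url


# Usage:
# happyhorse_i2v(
#     prompt="The camera slowly pushes in on the model's face, a light breeze moves her hair, gentle gaze",
#     image_url="https://your-cdn.com/model_front.jpg",
# )
```
4.4 R2V (Reference-to-Video, up to 9 reference images)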
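```python
# Reuses VideoSynthesis and API_KEY from section 4.2.
def happyhorse_r2v(prompt: str, ref_urls: list, duration: int = 5) -> str:
    """ref_urls: list of public image URLs, 1 to 9 items."""
    if not 1 <= len(ref_urls) <= 9:
        raise ValueError("R2V expects between 1 and 9 reference images")
    media = [{"type": "reference_image", "url": u} for u in ref_urls]

    rsp = VideoSynthesis.async_call(
        api_key=API_KEY,
        model="happyhorse-1.1-r2v",
        prompt=prompt,
        media=media,  # R2V passes the reference image list through `media`
        duration=duration,
        resolution="720P",
        ratio="16:9",
        watermark=False,
    )
    if rsp.status_code != 200:
        raise RuntimeError(f"Submit failed [{rsp.status_code}]: {rsp.message}")

    result = VideoSynthesis.wait(task=rsp, api_key=API_KEY)
    if result.status_code != 200 or result.output.task_status != "SUCCEEDED":
        raise RuntimeError(f"Task failed: {result.output}")
    print(f"[R2V] Video URL: {result.output.video_url}")
    return result.output.video_url


# Usage:
# happyhorse_r2v(
#     prompt="A woman in a red qipao (character1) walks forward slowly and opens a folding fan "
#            "(character2), her earrings (character3) swaying with the motion, medium shot to facial close-up",
#     ref_urls=[
#         "https://your-cdn.com/girl.jpg",
#         "https://your-cdn.com/fan.jpg",
#         "https://your-cdn.com/earring.jpg",
#     ],
# )
```
⚠️ Note: R2V reference images must be publicly reachable HTTPS URLs. In the DashScope SDK they are passed through the `media` argument, which corresponds to the `input.media` array in the raw HTTP API.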
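The two constraints in the note (HTTPS only, at most 9 images) are easy to enforce before submission. A minimal sketch, assuming a hypothetical helper `build_reference_media`; it only checks the URL shape and cannot confirm that a URL is actually reachable from the public internet.

```python
from urllib.parse import urlparse

MAX_REFERENCE_IMAGES = 9


def build_reference_media(ref_urls: list) -> list:
    """Validate reference URLs and return the `media` list for an R2V request."""
    if not 1 <= len(ref_urls) <= MAX_REFERENCE_IMAGES:
        raise ValueError(f"expected 1 to {MAX_REFERENCE_IMAGES} reference images, got {len(ref_urls)}")
    for url in ref_urls:
        parsed = urlparse(url)
        if parsed.scheme != "https" or not parsed.netloc:
            raise ValueError(f"reference image must be an HTTPS URL: {url!r}")
    return [{"type": "reference_image", "url": url} for url in ref_urls]


media = build_reference_media([
    "https://your-cdn.com/girl.jpg",
    "https://your-cdn.com/fan.jpg",
])
print(media)
```

The returned list can be passed as `media=` in the SDK call or placed under `input.media` in a raw HTTP body.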
5. Java Example (Asynchronous HTTP + Polling)
On the Java side, OkHttp 4.x with Jackson is recommended, calling the asynchronous HTTP API directly (the same interface the Python SDK uses underneath).
5.1 Maven Dependencies
```xml
<dependencies>
    <dependency>
        <groupId>com.squareup.okhttp3</groupId>
        <artifactId>okhttp</artifactId>
        <version>4.12.0</version>
    </dependency>
    <dependency>
        <groupId>com.fasterxml.jackson.core</groupId>
        <artifactId>jackson-databind</artifactId>
        <version>2.17.0</version>
    </dependency>
</dependencies>
```
5.2 HappyHorseClient.java
```java
package com.example.ai;

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import okhttp3.*;

import java.io.IOException;
import java.util.*;
import java.util.concurrent.TimeUnit;

/**
 * Alibaba Cloud Model Studio HappyHorse 1.1 video generation: asynchronous HTTP wrapper for Java.
 * Supports T2V / I2V / R2V. Requires Java 9+ (uses Map.of).
 */
public class HappyHorseClient {

    private static final String API_KEY = System.getenv("DASHSCOPE_API_KEY");
    private static final String BASE_URL =
            "https://dashscope.aliyuncs.com/api/v1/services/aigc/video-generation/video-synthesis";
    private static final String TASK_URL = "https://dashscope.aliyuncs.com/api/v1/tasks/";

    private static final OkHttpClient CLIENT = new OkHttpClient.Builder()
            .connectTimeout(30, TimeUnit.SECONDS)
            .readTimeout(60, TimeUnit.SECONDS)
            .build();
    private static final ObjectMapper MAPPER = new ObjectMapper();

    /** Reads the response body once; ResponseBody.string() can only be consumed a single time. */
    private static String readBody(Response resp) throws IOException {
        ResponseBody body = resp.body();
        return body != null ? body.string() : "";
    }

    /**
     * Submits a video generation task.
     *
     * @param model     happyhorse-1.1-t2v / happyhorse-1.1-i2v / happyhorse-1.1-r2v
     * @param prompt    the prompt
     * @param mediaList input.media array (first_frame for I2V, reference_image for R2V, null for T2V)
     * @param duration  3 to 15 seconds
     * @return taskId
     */
    public static String submitTask(String model, String prompt,
                                    List<Map<String, String>> mediaList, int duration) throws IOException {
        // Build input
        Map<String, Object> input = new LinkedHashMap<>();
        input.put("prompt", prompt);
        if (mediaList != null && !mediaList.isEmpty()) {
            input.put("media", mediaList);
        }

        // Build parameters
        Map<String, Object> params = new LinkedHashMap<>();
        params.put("resolution", "720P");
        params.put("ratio", "16:9");
        params.put("duration", duration);
        params.put("watermark", false);

        // Build body
        Map<String, Object> body = new LinkedHashMap<>();
        body.put("model", model);
        body.put("input", input);
        body.put("parameters", params);
        String jsonBody = MAPPER.writeValueAsString(body);

        Request request = new Request.Builder()
                .url(BASE_URL)
                .addHeader("Authorization", "Bearer " + API_KEY)
                .addHeader("X-DashScope-Async", "enable")
                .post(RequestBody.create(jsonBody, MediaType.parse("application/json")))
                .build();

        try (Response resp = CLIENT.newCall(request).execute()) {
            String text = readBody(resp);
            if (!resp.isSuccessful()) {
                throw new RuntimeException("Submit failed: " + resp.code() + " - " + text);
            }
            JsonNode taskId = MAPPER.readTree(text).path("output").path("task_id");
            if (taskId.isMissingNode()) {
                throw new RuntimeException("No task_id in response: " + text);
            }
            return taskId.asText();
        }
    }

    /** Polls the task until it reaches SUCCEEDED / FAILED / CANCELED. */
    public static String pollTask(String taskId) throws IOException, InterruptedException {
        Request req = new Request.Builder()
                .url(TASK_URL + taskId)
                .addHeader("Authorization", "Bearer " + API_KEY)
                .get()
                .build();

        while (true) {
            JsonNode output;
            // Close the response before sleeping so the connection is not held open
            try (Response resp = CLIENT.newCall(req).execute()) {
                String text = readBody(resp);
                if (!resp.isSuccessful()) {
                    throw new RuntimeException("Poll failed: " + resp.code() + " - " + text);
                }
                output = MAPPER.readTree(text).path("output");
            }
            String status = output.path("task_status").asText();
            if ("SUCCEEDED".equals(status)) {
                return output.path("video_url").asText();
            } else if ("FAILED".equals(status) || "CANCELED".equals(status)) {
                throw new RuntimeException("Task " + taskId + " ended with status: " + status
                        + " | output=" + output);
            }
            // PENDING / RUNNING: keep waiting
            Thread.sleep(5000L);
        }
    }

    // ====== Demo ======
    public static void main(String[] args) throws Exception {
        // --- T2V example ---
        String taskId = submitTask(
                "happyhorse-1.1-t2v",
                "Nightfall on a modern city street lit by neon signs, traffic streaming past, cyberpunk style",
                null, 5);
        System.out.println("T2V TaskID: " + taskId);
        System.out.println("T2V Video: " + pollTask(taskId));

        // --- I2V example ---
        List<Map<String, String>> i2vMedia = Collections.singletonList(
                Map.of("type", "first_frame", "url", "https://your-cdn.com/frame.jpg"));
        taskId = submitTask(
                "happyhorse-1.1-i2v",
                "The camera slowly orbits the subject, a light breeze blowing",
                i2vMedia, 5);
        System.out.println("I2V Video: " + pollTask(taskId));

        // --- R2V example (multiple reference images) ---
        List<Map<String, String>> r2vMedia = Arrays.asList(
                Map.of("type", "reference_image", "url", "https://your-cdn.com/char1.jpg"),
                Map.of("type", "reference_image", "url", "https://your-cdn.com/prop1.jpg"));
        taskId = submitTask(
                "happyhorse-1.1-r2v",
                "The character (character1) walks toward the camera holding the prop (character2), "
                        + "smiling, in a classical Chinese courtyard",
                r2vMedia, 5);
        System.out.println("R2V Video: " + pollTask(taskId));
    }
}
```
Notes on the Java example:
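- All calls are asynchronous: POST first to obtain a `task_id`, then poll `/api/v1/tasks/{task_id}` with GET.
- `X-DashScope-Async: enable` cannot be omitted; without it the API returns "current user api does not support synchronous calls".
- For I2V, the `media` element has `type` set to `"first_frame"`; for R2V it is `"reference_image"`, with a limit of 9 images.
- In production, add exponential backoff and a maximum retry count to the polling loop, and retry on 429 (rate limiting).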
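The last recommendation can be sketched as follows, in Python for brevity. This is one possible shape under stated assumptions, not the SDK's behavior: `fetch_status` is a caller-supplied function returning the `output` object of a poll response, and `RateLimited` is a hypothetical exception that the caller raises on HTTP 429. The `sleep` parameter is injectable so the loop can be tested without real delays.

```python
import random
import time
from typing import Callable

TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELED"}


class RateLimited(Exception):
    """Hypothetical marker raised by fetch_status when the API answers HTTP 429."""


def poll_with_backoff(fetch_status: Callable[[], dict], max_attempts: int = 20,
                      base_delay: float = 2.0, max_delay: float = 30.0,
                      sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll until a terminal state, doubling the delay each attempt up to max_delay."""
    for attempt in range(max_attempts):
        try:
            output = fetch_status()
            if output.get("task_status") in TERMINAL_STATES:
                return output
        except RateLimited:
            pass  # treat a 429 like "not ready yet" and back off below
        delay = min(max_delay, base_delay * 2 ** attempt)
        sleep(delay + random.uniform(0, delay * 0.1))  # small jitter avoids synchronized retries
    raise TimeoutError(f"task did not finish within {max_attempts} polls")


# Simulated poll sequence: one 429, two in-progress responses, then success.
responses = iter([RateLimited(), {"task_status": "PENDING"},
                  {"task_status": "RUNNING"}, {"task_status": "SUCCEEDED"}])


def fake_fetch() -> dict:
    item = next(responses)
    if isinstance(item, Exception):
        raise item
    return item


delays = []
result = poll_with_backoff(fake_fetch, sleep=delays.append)
print(result["task_status"], len(delays))
```

The same structure carries over to the Java client: replace the fixed `Thread.sleep(5000L)` with a delay that grows per attempt, and bound the loop with a maximum attempt count.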
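6. Prompt Writing Best Practices
To get the best results from HappyHorse 1.1, a prompt should include the following elements:
- Subject description: clothing, hairstyle, color, material, expression, and action of the person or product (e.g. "wearing an ivory silk qipao, hair in a bun, holding a round fan, a faint smile")
- Scene and camera language: background environment plus camera movement (e.g. "courtyard of an old Shanghai shikumen house, slanting morning light, medium shot → facial close-up, slow push-in")
- Style and texture: realistic, anime, film stock, cinematic (e.g. "ARRI Alexa 35 film look, low saturation, natural skin texture without smoothing")
- Audio requirements (if any): BGM type, ambient sound, speech pace (e.g. "light, upbeat guzheng music in the background, birdsong ambience, no narration")
- Constraints: duration, resolution, aspect ratio
Bad: A girl walking
Good: Generate a 5-second 720P video: a young woman in a moon-white embroidered qipao walks slowly out from the corridor of an old residence, faces the camera and smiles, gently waving a round fan; the background is a Suzhou garden with gray tiles and white walls; soft side-backlighting, cinematic, light and upbeat pipa music in the background; the face keeps the features of the reference image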
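When prompts are generated in bulk, it helps to assemble them from the elements listed above so that none is forgotten. The helper below is a hypothetical sketch; the field names and the comma-joined layout are assumptions for illustration, not a format the model requires.

```python
def build_prompt(subject: str, scene: str, style: str = "",
                 audio: str = "", constraints: str = "") -> str:
    """Join the non-empty prompt elements in a fixed order: subject, scene, style, audio, constraints."""
    parts = [subject, scene, style, audio, constraints]
    return ", ".join(p.strip() for p in parts if p and p.strip())


prompt = build_prompt(
    subject="a young woman in a moon-white embroidered qipao walks out slowly and smiles at the camera",
    scene="Suzhou garden with gray tiles and white walls, medium shot to facial close-up",
    style="cinematic, soft side-backlighting, natural skin texture",
    audio="light, upbeat pipa music in the background",
)
print(prompt)
```

Keeping each element as a separate argument also makes it straightforward to vary one dimension (for example the camera language) while holding the rest fixed during prompt testing.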
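7. Billing and Caveats
| Item | Notes |
|---|---|
| Billing unit | Billed per second of generated video; the rate depends on the selected resolution tier |
| 2026 launch promotion | Limited-time discount of about 40% (720P ≈ ¥0.54/s, 1080P ≈ ¥0.72/s; the console is authoritative) |
| New users | Usually receive some free quota (e.g. 10 seconds); see the Model Studio console for details |
| Concurrency limits | Default QPS/RPM limits apply; enterprise-level high concurrency requires a quota increase request |
| Reference image requirements | JPG/PNG/WEBP, public HTTPS URL, ≥512×512 recommended; too many small images hurt consistency |
| Watermark | Controlled by the `watermark` request parameter |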
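Since billing is per generated second, a batch budget is simple arithmetic: cost = duration in seconds × per-second rate. The sketch below uses the approximate promotional prices quoted in the table, which are assumptions that may change; the console remains the source of truth. `Decimal` is used to avoid floating-point rounding in currency values.

```python
from decimal import Decimal

# Approximate promotional per-second prices in CNY, taken from the table above.
PRICE_PER_SECOND = {
    "720P": Decimal("0.54"),
    "1080P": Decimal("0.72"),
}


def estimate_cost(resolution: str, duration_seconds: int, clips: int = 1) -> Decimal:
    """Estimate the cost in CNY of generating `clips` videos of the given length."""
    if resolution not in PRICE_PER_SECOND:
        raise ValueError(f"no price known for resolution {resolution!r}")
    return PRICE_PER_SECOND[resolution] * duration_seconds * clips


print(estimate_cost("720P", 5))             # one 5-second 720P clip
print(estimate_cost("1080P", 15, clips=100))  # a batch of 100 maximum-length 1080P clips
```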
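8. Summary
The core advance of HappyHorse 1.1 is that it moves AI video from "occasionally produces a good clip" to "usable in a production pipeline". In particular, the combination of R2V multi-reference subject consistency, native audio-video synchronization, and shot planning from long prompts gives it practical value for short-drama, e-commerce advertising, and brand content teams. Through the asynchronous Model Studio API, both Python data scripts and Java backend microservices can integrate it using a standard asynchronous task pattern.
Before a full integration, run small-scale prompt tests and confirm that the angle coverage of your character reference images matches the level of detail in your prompts, then scale up to batch production.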
