HappyHorse 1.1:
Three Paths to Video.
HappyHorse 1.1 is Alibaba's upgraded multimodal video model — improving motion continuity, prompt following, and identity preservation over HappyHorse 1.0. Generate from text alone, animate a single first frame, or direct multi-image reference scenes inside VidTool AI.
Generate sample clips with HappyHorse 1.1 in the VidTool AI workspace
What Makes HappyHorse 1.1 Different?
HappyHorse 1.1 puts three generation workflows behind a single engine selection. The mode switches automatically based on what you upload: text-to-video with no images, image-to-video with exactly one image, or reference-to-video with two to nine ordered reference images.
Reference-to-video is where HappyHorse 1.1 stands out for production work: upload multiple character, product, or style references and describe how they interact using [Image 1], [Image 2], and so on in your prompt. The model preserves identity and visual consistency across the generated clip.
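The switching rule described above can be sketched as a small function. This is an illustration of the documented image-count rule only, not VidTool AI's actual implementation; the function name `select_mode` and the returned mode strings are hypothetical.

```python
def select_mode(image_count: int) -> str:
    """Return the generation mode implied by the number of uploaded images.

    Hypothetical helper: it mirrors the rule stated on this page
    (0 images, exactly 1 image, or 2 to 9 images) and nothing more.
    """
    if image_count < 0:
        raise ValueError("image_count cannot be negative")
    if image_count == 0:
        return "text-to-video"
    if image_count == 1:
        return "image-to-video"
    if image_count <= 9:
        return "reference-to-video"
    raise ValueError("reference-to-video accepts at most 9 images")


if __name__ == "__main__":
    for count in (0, 1, 4):
        print(count, "->", select_mode(count))
```

Raising an error above nine images is a design choice for this sketch; the workspace may handle extra uploads differently.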
Four Capabilities That Change the Workflow
Each one maps to a real production pattern — from prompt-only drafts to multi-reference storytelling.
Text-to-Video
Generate from a prompt alone with full aspect ratio control — no reference images required. Ideal for concept exploration and scene drafting.
Image-to-Video
Upload exactly one image to use as the first frame of the clip. The model derives motion and composition from your prompt while preserving the visual identity of the source image.
Reference-to-Video
Upload 2–9 reference images and cite them in your prompt as [Image 1], [Image 2], and so on, in upload order. Suited to multi-character scenes and brand-consistent outputs.
720p & 1080p
Output at 720p or 1080p across 9 aspect ratios, with duration from 3 to 15 seconds.
HappyHorse 1.1 Technical Specifications
The exact parameters available when you run HappyHorse 1.1 inside the VidTool AI workspace.
- Generation modes: Text-to-video (0 images) / image-to-video (1 image) / reference-to-video (2–9 images)
- Resolutions: 720p / 1080p
- Aspect ratios: 16:9, 9:16, 1:1, 4:3, 3:4, 4:5, 5:4, 9:21, 21:9
- Duration: 3–15 seconds
- Reference images: Up to 9 images (reference-to-video)
- Prompt convention (reference-to-video): Use [Image 1], [Image 2], … matching upload order
- Provider: Alibaba HappyHorse
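As a rough illustration, the parameters listed above can be checked before a generation is started. The sketch below is a hypothetical pre-flight check, not part of any VidTool AI or Alibaba API: the function name `validate_settings`, its arguments, and the message wording are all assumptions. It skips the aspect ratio check for single-image uploads because, as noted on this page, the ratio follows the source image in that mode.

```python
from typing import List, Optional

# Values copied from the specification list on this page.
RESOLUTIONS = {"720p", "1080p"}
ASPECT_RATIOS = {"16:9", "9:16", "1:1", "4:3", "3:4", "4:5", "5:4", "9:21", "21:9"}
MIN_SECONDS, MAX_SECONDS = 3, 15
MAX_IMAGES = 9


def validate_settings(
    resolution: str,
    aspect_ratio: Optional[str],
    duration_seconds: float,
    image_count: int,
) -> List[str]:
    """Return a list of problems; an empty list means the settings fit the spec."""
    problems = []
    if resolution not in RESOLUTIONS:
        problems.append(f"unsupported resolution: {resolution}")
    # With exactly one image the ratio follows the source, so it is not checked.
    if image_count != 1 and aspect_ratio not in ASPECT_RATIOS:
        problems.append(f"unsupported aspect ratio: {aspect_ratio}")
    if not MIN_SECONDS <= duration_seconds <= MAX_SECONDS:
        problems.append(f"duration must be {MIN_SECONDS} to {MAX_SECONDS} seconds")
    if not 0 <= image_count <= MAX_IMAGES:
        problems.append(f"image count must be 0 to {MAX_IMAGES}")
    return problems


if __name__ == "__main__":
    print(validate_settings("1080p", "16:9", 5, 0))
    print(validate_settings("4k", "2:1", 20, 10))
```

Whether the workspace accepts fractional durations is not stated here, so the sketch only checks the 3 to 15 second range.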
How to Generate Video with HappyHorse 1.1
From text-only generation to multi-reference scenes in four steps.
Choose your workflow
No uploads for text-to-video, one image for image-to-video, or 2–9 images for reference-to-video — the engine switches automatically.
Write a detailed prompt
For reference-to-video, label subjects with [Image 1], [Image 2], etc. in the same order as your uploads. Describe action, camera, and atmosphere clearly.
Set resolution, ratio & duration
Pick 720p or 1080p, choose an aspect ratio, and set a duration from 3 to 15 seconds. The aspect ratio control is hidden in single-image image-to-video mode, where the ratio follows the source image.
Generate & download
Preview the output in the workspace, iterate on prompt or references, and download the finished clip.
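The labelling rule from the prompt step above (labels such as [Image 1] and [Image 2] matching upload order) can be checked locally before generating. The sketch below is a hypothetical helper, not a VidTool AI feature: the name `check_image_labels` and the idea of flagging unused uploads are assumptions made for illustration.

```python
import re
from typing import List, Tuple

# Matches labels written exactly as on this page, e.g. "[Image 3]".
IMAGE_LABEL = re.compile(r"\[Image (\d+)\]")


def check_image_labels(prompt: str, image_count: int) -> Tuple[List[int], List[int]]:
    """Compare [Image N] labels in a prompt with the number of uploads.

    Returns (out_of_range, unused):
      out_of_range: label numbers with no matching upload
      unused: upload positions the prompt never mentions
    """
    referenced = {int(n) for n in IMAGE_LABEL.findall(prompt)}
    expected = set(range(1, image_count + 1))
    out_of_range = sorted(referenced - expected)
    unused = sorted(expected - referenced)
    return out_of_range, unused


if __name__ == "__main__":
    prompt = "[Image 1] hands the bottle from [Image 2] to [Image 3]."
    print(check_image_labels(prompt, 3))
```

An unused upload may be intentional, for example a style reference, so that second list is best read as a warning rather than an error.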
Frequently Asked Questions about HappyHorse 1.1
Technical questions about HappyHorse 1.1, answered plainly.