HappyHorse 1.1 Available Now

HappyHorse 1.1: Three Paths to Video.

HappyHorse 1.1 is Alibaba's upgraded multimodal video model — improving motion continuity, prompt following, and identity preservation over HappyHorse 1.0. Generate from text alone, animate a single first frame, or direct multi-image reference scenes inside VidTool AI.

Generate sample clips with HappyHorse 1.1 in the VidTool AI workspace

What Makes HappyHorse 1.1 Different?

HappyHorse 1.1 puts three generation workflows behind a single engine, and VidTool AI picks the workflow from what you upload: text-to-video with no images, image-to-video with exactly one image, or reference-to-video with two to nine ordered reference images.
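The switching rule is simple enough to write down. The sketch below is illustrative only: `select_mode` is a hypothetical helper, not part of any VidTool AI or HappyHorse API, and it encodes nothing beyond the image counts stated above.

```python
def select_mode(image_count: int) -> str:
    """Map the number of uploaded images to a HappyHorse 1.1 workflow.

    Hypothetical helper: it mirrors the rule described on this page,
    not an official client library.
    """
    if image_count < 0:
        raise ValueError("image_count cannot be negative")
    if image_count == 0:
        return "text-to-video"
    if image_count == 1:
        return "image-to-video"
    if image_count <= 9:
        return "reference-to-video"
    raise ValueError("HappyHorse 1.1 accepts at most 9 reference images")


for n in (0, 1, 2, 9):
    print(n, "->", select_mode(n))
```

Because the mode depends only on the upload count, adding or removing a single image can move a job from one workflow to another.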

Reference-to-video is where HappyHorse 1.1 stands out for production work: upload multiple character, product, or style references and describe how they interact using [Image 1], [Image 2], and so on in your prompt. The model preserves identity and visual consistency across the generated clip.

HAPPYHORSE 1.1 CAPABILITIES

Four Capabilities That Change the Workflow

Each one maps to a real production pattern — from prompt-only drafts to multi-reference storytelling.

Text-to-Video

Generate from a prompt alone with full aspect ratio control — no reference images required. Ideal for concept exploration and scene drafting.

Image-to-Video

Upload exactly one image to serve as the first frame. The model derives motion and composition from your prompt while preserving the visual identity of the source image.

Reference-to-Video

Upload 2–9 reference images and cite them as [Image 1], [Image 2], etc. in your prompt. Suited to multi-character scenes and brand-consistent outputs.

720p & 1080p

Output at 720p or 1080p across 9 aspect ratios, with duration from 3 to 15 seconds.

HappyHorse 1.1 Technical Specifications

The exact parameters available when you run HappyHorse 1.1 inside the VidTool AI workspace.

Generation modes: Text-to-video (0 images) / image-to-video (1 image) / reference-to-video (2–9 images)
Resolutions: 720p / 1080p
Aspect ratios: 16:9, 9:16, 1:1, 4:3, 3:4, 4:5, 5:4, 9:21, 21:9
Duration: 3–15 seconds
Reference images: Up to 9 images (reference-to-video)
Prompt convention (R2V): Use [Image 1], [Image 2], … matching upload order
Provider: Alibaba HappyHorse

How to Generate Video with HappyHorse 1.1

From text-only generation to multi-reference scenes in four steps.

1. Choose your workflow

No uploads for text-to-video, one image for image-to-video, or 2–9 images for reference-to-video. The engine switches automatically.

2. Write a detailed prompt

For reference-to-video, label subjects with [Image 1], [Image 2], etc. in the same order as your uploads. Describe action, camera, and atmosphere clearly.

3. Set resolution, ratio & duration

Pick 720p or 1080p, choose an aspect ratio, and set a duration from 3 to 15 seconds. The aspect ratio control is hidden for single-image image-to-video, where the ratio follows the source image.

4. Generate & download

Preview the output in the workspace, iterate on prompt or references, and download the finished clip.
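Since the ratio follows the source image in single-image runs (step 3), it can help to know what ratio a source image implies before uploading it. The sketch below reduces pixel dimensions to their simplest ratio. `source_ratio` is a hypothetical helper, and this page does not say whether the output keeps the exact source proportions or snaps to the nearest listed ratio, so treat the result as a preview only.

```python
from math import gcd


def source_ratio(width: int, height: int) -> str:
    """Reduce pixel dimensions to their simplest width:height ratio."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    divisor = gcd(width, height)
    return f"{width // divisor}:{height // divisor}"


print(source_ratio(1920, 1080))  # 16:9
print(source_ratio(1080, 1920))  # 9:16
print(source_ratio(1024, 1024))  # 1:1
```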

FAQ

Frequently Asked Questions about HappyHorse 1.1

Technical questions about HappyHorse 1.1, answered plainly.

What is HappyHorse 1.1?

HappyHorse 1.1 is Alibaba's upgraded video generation model on VidTool AI. It improves motion quality, prompt adherence, and identity preservation compared to HappyHorse 1.0.

How does VidTool AI choose the generation mode?

Zero uploaded images → text-to-video. Exactly one image → image-to-video. Two to nine images → reference-to-video. The mode switches automatically — you do not select it manually.

How does reference-to-video prompting work?

Upload 2–9 images in order, then reference them in your prompt as [Image 1], [Image 2], and so on. The first uploaded image is [Image 1], the second is [Image 2], etc.
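A mismatch between tags and uploads is easy to introduce when you reorder or remove images. The following sketch checks a prompt against the number of uploads. `check_image_tags` is a hypothetical helper, and flagging an upload that is never mentioned is a choice made for this example, not a documented requirement.

```python
import re

# Matches tags written exactly as "[Image N]" with N a positive integer.
IMAGE_TAG = re.compile(r"\[Image (\d+)\]")


def check_image_tags(prompt: str, uploaded: int) -> list[str]:
    """Return a list of problems; an empty list means tags and uploads line up."""
    used = {int(n) for n in IMAGE_TAG.findall(prompt)}
    problems = []
    for n in sorted(used):
        if not 1 <= n <= uploaded:
            problems.append(f"[Image {n}] has no matching upload")
    for n in range(1, uploaded + 1):
        if n not in used:
            problems.append(f"upload {n} is never referenced as [Image {n}]")
    return problems


prompt = "The woman in [Image 1] hands the bottle from [Image 2] to the man in [Image 3]."
print(check_image_tags(prompt, 3))  # []
print(check_image_tags(prompt, 2))  # ['[Image 3] has no matching upload']
```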

What resolutions and durations are supported?

Resolutions: 720p and 1080p. Duration: 3 to 15 seconds. Aspect ratios (text-to-video and reference-to-video): 16:9, 9:16, 1:1, 4:3, 3:4, 4:5, 5:4, 9:21, 21:9.

Why does image-to-video require exactly one image?

HappyHorse image-to-video uses a single first-frame image as the visual anchor. The model animates from that frame using your optional prompt. Multiple images route to reference-to-video instead.

Can I use HappyHorse 1.1 for text-to-video with images?

No. Text-to-video does not accept reference images. Remove all uploads to use pure text-to-video, or add images to switch into image-to-video or reference-to-video.

What types of content is HappyHorse 1.1 best suited for?

Multi-character scenes, brand videos with consistent assets, product animation from a hero still, social clips, and any workflow that benefits from ordered multi-image references.

How does HappyHorse 1.1 compare to Kling 3.0 on VidTool AI?

HappyHorse 1.1 emphasizes three auto-switching workflows including reference-to-video with up to nine images. Kling 3.0 emphasizes quality tiers (std/pro/4K), optional sound, and first/last frame control with up to two images.

UNLEASH YOUR CREATIVITY

Ready to produce your first HappyHorse 1.1 masterpiece?

Access HappyHorse 1.1 instantly within your unified VidTool AI workspace — generate, preview, and download professional videos in minutes.