Video models
Compare every video generation model available in Kyoso.
To pick a model, type @ in the agent input. Different models support different durations, aspect ratios, and frame attachment options. Video generations typically take 1–8 minutes. Faster models prioritize speed and are great for quick iteration, while slower models spend more time producing higher-quality results.
Models
Fast models (1–4 min)
Great for quick iteration, drafts, and exploring ideas.
| Model | Best for | Durations | Aspect ratios | Frames you can attach | Audio |
|---|---|---|---|---|---|
| LTX 2 Fast | Open-source 4K with synced audio for fast iteration | 4 / 8s | 16:9, 9:16 | Start | Generates audio |
| Grok Imagine | Real-time cinematic, movie-style physics | 3–15s | 7 ratios (widest range) | Start, video ref | — |
| Sora 2 | Realistic, detailed, long-form video | 4 / 8 / 12s | 16:9, 9:16 | Start | — |
| Seedance | Motion-focused video from images and keyframes | — | — | Start, end | — |
| Veo 3.1 Fast | Fast, cinematic video generation | 4 / 6 / 8s | 16:9, 9:16 | Start, end | Generates audio |
| Seedance 2 | Next-gen motion and dance-focused text-to-video | — | — | — | — |
Longer models (5–8 min)
Take more time but produce higher-quality, more detailed output.
| Model | Best for | Durations | Aspect ratios | Frames you can attach | Audio |
|---|---|---|---|---|---|
| Kling O3 | Real-time, action-focused 4K video | 3–15s | 16:9, 9:16, 1:1 | Start, end, video ref | Keeps source audio |
| Kling O1 | Vivid, action-focused synthesis | 5 / 10s | 16:9, 9:16, 1:1 | Start, end, video ref | — |
| Kling V3 Standard | Physics-accurate 4K with camera control | 3–15s | 16:9, 9:16, 1:1 | Start | Generates audio |
| Kling V3 Pro | Premium 4K with maximum detail and accuracy | — | — | Start | — |
How to choose
- Need fast results? Start with LTX 2 Fast, Grok Imagine, or Sora 2, which typically deliver in under 2 minutes.
- Need the highest quality? Kling V3 Pro, Kling V3 Standard, or Kling O3 take longer but produce more detailed output.
- Need start and end frames? Use Kling O3, Veo 3.1 Fast, Seedance, or Kling O1.
- Need synced audio? Use Veo 3.1 Fast, Kling V3 Standard, or LTX 2 Fast.
- Need a video as a style reference? Use Kling O3, Kling O1, or Grok Imagine.
- Need long clips? Sora 2 goes up to 12s; Kling O3, Kling V3 Standard, and Grok Imagine go up to 15s.
- Need vertical, square, and landscape support? Grok Imagine has the widest aspect ratio range (7 ratios). Kling O3, Kling O1, and Kling V3 Standard also cover 16:9, 9:16, and 1:1.
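
The checklist above amounts to filtering the comparison tables by requirement. As a rough sketch of that logic (not a Kyoso API), the Python below transcribes the tier, longest listed duration, and attachable frames for each model and returns the ones that meet every stated need. The names `VideoModel`, `MODELS`, and `shortlist` are hypothetical and exist only for this illustration. Models with no listed duration are excluded whenever a minimum length is requested.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VideoModel:
    name: str
    tier: str                   # "fast" (1-4 min) or "longer" (5-8 min)
    max_seconds: Optional[int]  # longest listed duration; None if the table lists none
    frames: frozenset           # attachable inputs: "start", "end", "video_ref"


# Values transcribed from the two tables on this page (illustrative only).
MODELS = [
    VideoModel("LTX 2 Fast", "fast", 8, frozenset({"start"})),
    VideoModel("Grok Imagine", "fast", 15, frozenset({"start", "video_ref"})),
    VideoModel("Sora 2", "fast", 12, frozenset({"start"})),
    VideoModel("Seedance", "fast", None, frozenset({"start", "end"})),
    VideoModel("Veo 3.1 Fast", "fast", 8, frozenset({"start", "end"})),
    VideoModel("Seedance 2", "fast", None, frozenset()),
    VideoModel("Kling O3", "longer", 15, frozenset({"start", "end", "video_ref"})),
    VideoModel("Kling O1", "longer", 10, frozenset({"start", "end", "video_ref"})),
    VideoModel("Kling V3 Standard", "longer", 15, frozenset({"start"})),
    VideoModel("Kling V3 Pro", "longer", None, frozenset({"start"})),
]


def shortlist(frames=(), min_seconds=None, tier=None):
    """Return names of models meeting every stated requirement, in table order."""
    needed = frozenset(frames)
    names = []
    for model in MODELS:
        # The model must accept every frame type the caller wants to attach.
        if not needed <= model.frames:
            continue
        if tier is not None and model.tier != tier:
            continue
        # Skip models with no listed duration when a minimum length is required.
        if min_seconds is not None and (
            model.max_seconds is None or model.max_seconds < min_seconds
        ):
            continue
        names.append(model.name)
    return names


print(shortlist(frames={"start", "end"}))
# ['Seedance', 'Veo 3.1 Fast', 'Kling O3', 'Kling O1']
print(shortlist(frames={"video_ref"}, min_seconds=15))
# ['Grok Imagine', 'Kling O3']
```

Combining requirements narrows the list quickly. For example, asking for a video reference and a 15s clip leaves only Grok Imagine and Kling O3, and adding `tier="fast"` leaves Grok Imagine alone.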