Generation

Common questions about image and video generation in Kyoso.

How long do generations take?

Image generations typically take 8–40 seconds. Video generations take 1–8 minutes. The exact time depends on the model you choose — faster models are great for quick drafts and iteration, while slower models spend more time producing higher-quality output. Current load can also affect timing.

What happens when a generation fails?

The agent informs you about the failure. You are not charged for failed generations — credits are pre-deducted when a tool is called, and if the generation fails, the deducted amount is automatically rolled back.
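
The pre-deduct and roll-back behavior described above can be sketched in a few lines. This is an illustration only, not Kyoso's implementation: `CreditLedger`, `run_tool`, and the credit amounts are hypothetical names and values chosen for the example.

```python
class CreditLedger:
    """Hypothetical in-memory credit balance (not Kyoso's real API)."""

    def __init__(self, balance):
        self.balance = balance

    def run_tool(self, cost, generate):
        """Pre-deduct `cost`, run `generate`, and refund the cost if it raises."""
        if cost > self.balance:
            raise ValueError("not enough credits")
        self.balance -= cost          # credits are deducted up front
        try:
            return generate()
        except Exception:
            self.balance += cost      # failure: roll the deduction back
            raise


def failing_generation():
    # Stand-in for a generation that fails.
    raise RuntimeError("generation failed")


ledger = CreditLedger(balance=100)
ledger.run_tool(10, lambda: "image.png")    # success: balance drops to 90
try:
    ledger.run_tool(10, failing_generation)
except RuntimeError:
    pass                                     # failure: balance stays at 90
print(ledger.balance)  # 90
```

In this sketch, only the successful call costs credits; the failed call leaves the balance where it was before the tool ran.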

Which model is best for text in images?

Ideogram V3 — it's specifically strong at rendering crisp text and logos. See Image models.

Which model is best for photorealism?

For photorealistic images, try Imagen 4 or Seedream 4.5. For photorealistic video, try Sora 2 or Kling V3 Standard.

Are generations private?

Generations are private to your organization. Other orgs cannot see your content.

Are generations stored permanently?

Yes. There is no lifecycle policy for generations at the moment — your images, videos, and animations are stored indefinitely and remain available unless you delete them.

How many generations can I run at once?

You can queue up to 5 generations in a single agent session.
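
As a rough mental model, the limit behaves like a fixed-size queue per session. The sketch below is hypothetical: `AgentSession`, `queue`, and `finish` are invented names, and the idea that a finished generation frees its slot is an assumption, not documented behavior.

```python
MAX_QUEUED = 5  # the per-session limit stated above


class AgentSession:
    """Hypothetical session that tracks queued generations."""

    def __init__(self):
        self.queued = []

    def queue(self, prompt):
        """Queue a generation; return False when the session is already full."""
        if len(self.queued) >= MAX_QUEUED:
            return False
        self.queued.append(prompt)
        return True

    def finish(self, prompt):
        """Assumption: a finished generation frees its slot in the queue."""
        self.queued.remove(prompt)


session = AgentSession()
results = [session.queue(f"prompt {i}") for i in range(6)]
print(results)  # [True, True, True, True, True, False]
```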

Why did a tool or generation fail?

Tools can sometimes fail due to model policies, poor-quality reference images or videos, unsupported formats, or unexpected errors. Common examples:

  • Apply Motion — the reference video lacks clear motion, the image doesn't contain a recognizable character, or the video file is too large.
  • Multi-Scene Video — uneven number of keyframe images, or human faces that models struggle to animate.

In all cases, the agent will inform you about the issue and suggest how to fix it. You can retry with adjusted inputs or a different model.
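
If you script your own workflow around generations, retrying with a different model can be sketched as a simple fallback loop. Everything here is hypothetical: `generate_with_fallback` and `fake_generate` are illustrative helpers, and `model-a` and `model-b` are placeholder names, not Kyoso models or API calls.

```python
def generate_with_fallback(models, generate):
    """Try each model in order; return (model, result) from the first success."""
    errors = {}
    for model in models:
        try:
            return model, generate(model)
        except Exception as exc:
            errors[model] = str(exc)   # remember why this model failed
    raise RuntimeError(f"all models failed: {errors}")


def fake_generate(model):
    # Stand-in for a real generation call; the first model always fails here.
    if model == "model-a":
        raise ValueError("reference video lacks clear motion")
    return f"video from {model}"


model, result = generate_with_fallback(["model-a", "model-b"], fake_generate)
print(model, result)  # model-b video from model-b
```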
