Video Generation with AI Gateway

Published: February 19, 2026 at 08:00 AM EST

6 min read

Source: Vercel Blog

How Video Generation Differs from Image Generation

Video prompts can carry motion cues such as camera moves, object actions, and timing.
Audio direction is optional and can be added to the same prompt.
Each provider exposes different capabilities through provider‑specific options that unlock fundamentally different generation modes.
See the [Provider‑Specific Options Documentation] for details.

Types of Video Generation Supported by AI Gateway

Generation Mode	Description	Typical Use‑Case
Text‑to‑Video	Describe what you want; the model handles visuals, motion, and optionally audio. Great for hyper‑realistic, production‑quality footage from a simple text prompt.	Ad creatives, explainer videos, social content
Programmatic Video (API)	Generate videos on demand for your app, platform, or content pipeline. No licensing fees or production work required, just prompts and outputs.	Automated video pipelines at scale
Image‑to‑Video	Turn a starting image, optionally guided by a text prompt, into polished video clips for social media, ads, or storytelling with natural motion and cinematic quality.	Creative content generation, product animation
Reference‑to‑Video	Provide reference images or videos of a person/character; the model extracts appearance & voice to generate new scenes starring them with a consistent identity.	Spokesperson content, consistent brand characters
First‑and‑Last‑Frame	Define start and end states (two images); the model generates a seamless transition between them.	Before/after reveals, time‑lapse, outfit swaps
Video Editing / Style Transfer	Supply a source video URL and describe the desired transformation; the model applies a new style while preserving original motion.	Watercolor‑style videos, artistic re‑renders

Example Models & Their Typical Workflows

Model (Provider)	Generation Mode	Example Prompt / Use‑Case
`klingai/kling-v2.6-t2v`	Text‑to‑Video	“Generate a 30‑second cinematic travel video of a sunrise over the Alps, with gentle camera pans and ambient orchestral music.”
`google/veo-3.1-generate-001`	Text‑to‑Video (high‑fidelity)	“Create a photorealistic kitchen scene with a chef chopping vegetables, realistic lighting, and synchronized sound effects.”
`klingai/kling-v2.6-i2v`	Image‑to‑Video	Provide a product photo URL + “Add a slow 360° rotation and subtle lighting changes.”
`klingai/kling-v3.0-i2v-imagelastFrame`	First‑and‑Last‑Frame	Upload “before” and “after” product images → “Generate a smooth transition showing the product assembling.”
`alibaba/wan-v2.6-r2v-flash`	Reference‑to‑Video	Supply two reference images of a dog → “Create a short video of the dog playing fetch in a park, preserving its identity.”
`xai/grok-imagine-video`	Video Editing / Style Transfer	Source video URL + “Apply a watercolor painting style while keeping the original motion.”

Tip: For multi‑reference generation (e.g., multiple characters), include tags like character1, character2, etc., in the prompt. See the [Wan Prompt Guide] for best practices.
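As a rough illustration of the tip above, the sketch below pairs each reference image with a tag of the form character1, character2, and so on, then checks that the prompt mentions every tag. The helper name `tag_references` and the returned mapping are hypothetical; the tag syntax Wan actually expects is covered in the Wan Prompt Guide, not in this post.

```python
import re


def tag_references(reference_urls, prompt):
    """Pair each reference with a characterN tag and verify the prompt uses it.

    Hypothetical helper: the tag names follow the tip above, but the request
    shape a provider expects is an assumption, not a documented format.
    """
    tags = {f"character{i}": url for i, url in enumerate(reference_urls, start=1)}
    # A tag counts as used only when it appears as a whole word in the prompt.
    missing = [tag for tag in tags if not re.search(rf"\b{tag}\b", prompt)]
    if missing:
        raise ValueError(f"prompt never mentions: {', '.join(missing)}")
    return tags


refs = ["https://example.com/dog.png", "https://example.com/cat.png"]
prompt = "character1 chases character2 around a sunny park"
print(tag_references(refs, prompt))
```

Failing early on an unused tag is a design choice for this sketch: it catches a reference that was uploaded but never cast in the scene before any generation request is made.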

Model‑Creator Capabilities Overview

Provider	Text‑to‑Video	Image‑to‑Video	First‑&‑Last‑Frame	Reference‑to‑Video	Audio Generation	Video Editing
xAI	✅	✅	✅	✅	✅	✅
Alibaba Wan	✅	✅	✅	✅	✅	❌
Kling	✅	✅	✅	❌	✅	❌
Google Veo	✅	✅	❌	❌	✅	✅
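One way to use the matrix above in code is to transcribe it into a lookup and filter providers by the modes a workflow needs. The sketch below does only that; the mode keys such as `first_last_frame` are names invented for this example, and the values simply mirror the table rather than querying AI Gateway.

```python
# Capability matrix transcribed from the table above (True = ✅, False = ❌).
# The mode keys are hypothetical names chosen for this sketch.
MODES = (
    "text_to_video", "image_to_video", "first_last_frame",
    "reference_to_video", "audio_generation", "video_editing",
)

CAPABILITIES = {
    "xAI":         (True, True, True, True, True, True),
    "Alibaba Wan": (True, True, True, True, True, False),
    "Kling":       (True, True, True, False, True, False),
    "Google Veo":  (True, True, False, False, True, True),
}


def providers_supporting(*modes):
    """Return providers, in table order, that support every requested mode."""
    for mode in modes:
        if mode not in MODES:
            raise KeyError(f"unknown mode: {mode}")
    columns = [MODES.index(mode) for mode in modes]
    return [name for name, row in CAPABILITIES.items()
            if all(row[c] for c in columns)]


print(providers_supporting("video_editing"))
print(providers_supporting("first_last_frame", "reference_to_video"))
```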

Getting Started

1. Programmatic Access (One API, One Auth Flow)

AI SDK 6 lets you generate videos programmatically using the same interface you already use for text and images.
• One API endpoint
• Unified authentication
• Central observability dashboard for your entire AI pipeline

# Example: generate a 10-second video from a text prompt.
# Assumes your API key is exported as AI_GATEWAY_API_KEY (placeholder variable name).
curl -X POST https://api.ai-gateway.com/v1/video \
  -H "Authorization: Bearer $AI_GATEWAY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
        "model": "klingai/kling-v2.6-t2v",
        "prompt": "A futuristic city skyline at dusk, drone fly-through, synthwave soundtrack",
        "duration_seconds": 10,
        "aspect_ratio": "16:9"
      }'
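For pipelines that assemble requests in code, the same call can be sketched in Python using only the standard library. The endpoint and the field names (`model`, `prompt`, `duration_seconds`, `aspect_ratio`) are copied from the cURL example above and should be treated as illustrative; `build_video_request` and the `AI_GATEWAY_API_KEY` variable are hypothetical names. The function only builds the request, so it can be inspected before anything is sent.

```python
import json
import os
import urllib.request

# Endpoint and field names are copied from the cURL example above;
# treat them as illustrative rather than a confirmed API contract.
ENDPOINT = "https://api.ai-gateway.com/v1/video"


def build_video_request(prompt, model="klingai/kling-v2.6-t2v",
                        duration_seconds=10, aspect_ratio="16:9", api_key=None):
    """Build (but do not send) the POST request for a text-to-video job."""
    # AI_GATEWAY_API_KEY is a placeholder environment variable name.
    api_key = api_key or os.environ.get("AI_GATEWAY_API_KEY", "")
    if not api_key:
        raise ValueError("an API key is required")
    body = {
        "model": model,
        "prompt": prompt,
        "duration_seconds": duration_seconds,
        "aspect_ratio": aspect_ratio,
    }
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_video_request(
    "A futuristic city skyline at dusk, drone fly-through", api_key="test-key"
)
print(req.get_method(), req.full_url)
# Sending is left to the caller, e.g. urllib.request.urlopen(req).
```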

2. No‑Code Playground

Each model page includes an embedded, configurable playground where you can:

• Compare providers side‑by‑side
• Tweak prompts and provider options in real time
• Download results without writing a single line of code

Access the playground via AI Gateway → Model List → Video Generation.

Provider Spotlights

Provider	Strengths	Notable Model(s)
xAI – Grok Imagine	Fast, strong instruction following; video editing & style transfer in seconds.	`xai/grok-imagine-video`
Alibaba – Wan	Reference‑based generation, multi‑shot storytelling, identity preservation across scenes.	`alibaba/wan-v2.6-r2v-flash`
Kling	Excellent image‑to‑video and native audio; the new 3.0 models support multi‑shot video with automatic scene transitions.	`klingai/kling-v3.0-i2v-imagelastFrame`
Google – Veo	Highest visual fidelity, realistic physics, native audio generation with cinematic lighting.	`google/veo-3.1-generate-001`

Documentation & Resources

[Video Generation Documentation] – Full reference guide.
[Video Generation Quick‑Start] – Step‑by‑step tutorials and sample scripts.
Changelogs – Detailed examples and prompt updates for each model.

Quick Reference Tables

Generation Types

Type	Required Inputs	Optional	Typical Output
Text‑to‑Video	Text prompt	Aspect ratio, duration, audio cues	Full‑length video
Image‑to‑Video	Image URL (or upload)	Text prompt for motion, audio	Animated clip
First‑and‑Last‑Frame	Two images	Prompt for transition style	Seamless transition video
Reference‑to‑Video	Images or video clips of a character	Prompt describing new scenes	Video starring the referenced entity
Video Editing	Source video URL	Style description, audio overlay	Stylized video
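The "Required Inputs" column above lends itself to a small pre-flight check before a request is built. The following is a minimal sketch under stated assumptions: the type keys and field names (`first_frame`, `source_video_url`, and so on) are hypothetical and only mirror the table, not an AI Gateway schema.

```python
# Required inputs per generation type, transcribed from the table above.
# The key and field names are hypothetical, not an AI Gateway schema.
REQUIRED_INPUTS = {
    "text_to_video": {"prompt"},
    "image_to_video": {"image"},
    "first_and_last_frame": {"first_frame", "last_frame"},
    "reference_to_video": {"references"},
    "video_editing": {"source_video_url"},
}


def missing_inputs(generation_type, inputs):
    """Return the sorted required fields that are absent or empty."""
    required = REQUIRED_INPUTS[generation_type]
    return sorted(field for field in required if not inputs.get(field))


print(missing_inputs("first_and_last_frame", {"first_frame": "before.png"}))
print(missing_inputs("image_to_video", {"image": "product.png", "prompt": "slow pan"}))
```

Optional inputs such as a motion prompt or audio cues are deliberately not checked, since the table lists them as optional for every type.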

Model‑Creator Capabilities

Provider	Text‑to‑Video	Image‑to‑Video	First‑&‑Last‑Frame	Reference‑to‑Video	Audio	Video Editing
xAI	✅	✅	✅	✅	✅	✅
Wan	✅	✅	✅	✅	✅	❌
Kling	✅	✅	✅	❌	✅	❌
Veo	✅	✅	❌	❌	✅	✅

Next Steps

Read the full docs – familiarize yourself with provider‑specific options.
Pick a model – start with the playground to experiment.
Integrate via API – use the sample cURL request (or SDK) to embed video generation into your product.

Happy creating! 🚀
