Build a Realtime Video Restyling Agent with Gemini 3 + Decart AI
Source: Dev.to
Overview
Google’s Gemini 3 (released Nov 18, 2025) provides multimodal reasoning and tool use for building AI applications that respond accurately. Pair it with Decart AI’s Mirage LSD, billed as the first live‑stream diffusion model for zero‑latency video restyling at 24 FPS, and you can create an agent that restyles video in real time.
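# The imports and the create_agent signature are missing from this excerpt; the
# ones below are assumed from the Vision Agents package layout and the launcher call.
import logging

from vision_agents.core import Agent, User, cli
from vision_agents.core.agents import AgentLauncher
from vision_agents.plugins import decart, deepgram, elevenlabs, gemini, getstream

logger = logging.getLogger(__name__)


async def create_agent(**kwargs) -> Agent:
    """Build an agent that pairs a Decart restyling processor with a Gemini LLM."""
    # The processor restyles the video according to its current prompt.
    processor = decart.RestylingProcessor(
        initial_prompt="Change the video style to a cute animated movie with vibrant colours",
        model="mirage_v2",
    )
    llm = gemini.LLM(model="gemini-3-pro-preview")
    agent = Agent(
        edge=getstream.Edge(),
        agent_user=User(name="Story teller", id="agent"),
        instructions="You will use the Decart processor to change the style of the video and the user's background.",
        llm=llm,
        tts=elevenlabs.TTS(voice_id="N2lVS1w4EtoT3dr4eOWO"),
        stt=deepgram.STT(),
        processors=[processor],
    )

    # Expose prompt changes to the LLM as a callable tool.
    @llm.register_function(
        description="Changes the prompt of the Decart processor, which in turn changes the video style and background."
    )
    async def change_prompt(prompt: str) -> str:
        await processor.update_prompt(prompt)
        return f"Prompt changed to {prompt}"

    return agent


async def join_call(agent: Agent, call_type: str, call_id: str, **kwargs) -> None:
    """Join the call and start the agent."""
    await agent.create_user()
    call = await agent.create_call(call_type, call_id)
    logger.info("🤖 Starting Agent...")
    with await agent.join(call):
        logger.info("Joining call")
        logger.info("LLM ready")
        await agent.finish()  # Run until the call ends


if __name__ == "__main__":
    cli(AgentLauncher(create_agent=create_agent, join_call=join_call))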
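To see why registering change_prompt is enough for a spoken request to change the style, here is a minimal, library-free sketch of the decorator-registry pattern. It does not reproduce Vision Agents internals: ToolRegistry, FakeProcessor, and demo are hypothetical names invented for illustration, and the real LLM plugin may dispatch tool calls differently.

```python
import asyncio
from typing import Awaitable, Callable, Dict


class ToolRegistry:
    """Hypothetical stand-in for an LLM's function registry (not the Vision Agents API)."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., Awaitable[str]]] = {}
        self.descriptions: Dict[str, str] = {}

    def register_function(self, description: str):
        """Return a decorator that stores the coroutine function under its own name."""
        def decorator(func: Callable[..., Awaitable[str]]):
            self._tools[func.__name__] = func
            self.descriptions[func.__name__] = description
            return func
        return decorator

    async def call(self, name: str, **arguments: str) -> str:
        """Look up a registered tool by name and await it with the given arguments."""
        if name not in self._tools:
            raise KeyError(f"Unknown tool: {name}")
        return await self._tools[name](**arguments)


class FakeProcessor:
    """Hypothetical processor that only records the current prompt."""

    def __init__(self, initial_prompt: str) -> None:
        self.prompt = initial_prompt

    async def update_prompt(self, prompt: str) -> None:
        self.prompt = prompt


registry = ToolRegistry()
processor = FakeProcessor("cute animated movie")


@registry.register_function(description="Changes the restyling prompt.")
async def change_prompt(prompt: str) -> str:
    await processor.update_prompt(prompt)
    return f"Prompt changed to {prompt}"


async def demo() -> str:
    # Simulates the model choosing to call change_prompt after hearing a style request.
    return await registry.call("change_prompt", prompt="Neon Nostalgia")


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The closure is the important part: change_prompt captures the processor instance, so the model only has to supply a prompt string and never touches the video pipeline directly.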
Running the Agent
Store your API credentials as environment variables (or place them in a .env file at the project root):
export GOOGLE_API_KEY=your_google_key
export DECART_API_KEY=your_decart_key
export ELEVENLABS_API_KEY=your_elevenlabs_key
export DEEPGRAM_API_KEY=your_deepgram_key
export STREAM_API_KEY=your_stream_key
export STREAM_API_SECRET=your_stream_secret
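Because the agent depends on six credentials across five services, a quick preflight check can save a confusing failure mid-call. The sketch below uses only the standard library; missing_keys and REQUIRED_KEYS are hypothetical helpers, not part of Vision Agents, and the check only confirms that each variable is non-empty, not that the key is valid.

```python
import os
from typing import List, Mapping

# Variable names taken from the export list above.
REQUIRED_KEYS = (
    "GOOGLE_API_KEY",
    "DECART_API_KEY",
    "ELEVENLABS_API_KEY",
    "DEEPGRAM_API_KEY",
    "STREAM_API_KEY",
    "STREAM_API_SECRET",
)


def missing_keys(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variable names that are unset or blank in env."""
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]
```

One way to use it is to call missing_keys() at the top of main.py and exit with a message listing the returned names if the list is non-empty.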
Then launch the script:
uv run main.py
A browser tab opens with a video‑call interface and joins you to the call automatically. Grant camera and microphone access, then speak a command such as “Make my video Studio Ghibli.” The feed transforms in real time.
Example Interaction
You: "Make it Neon Nostalgia."
Agent: "OK, I've updated the video style to Neon Nostalgia."
You: "Make it a War Zone."
Agent: "OK, I've updated the video style to a War Zone."
Resources
- Vision Agents – official site | GitHub repository
- Decart AI – homepage | Decart plugin for Vision Agents
- Gemini 3 documentation
- Stream (WebRTC)
Give it a spin and experiment with styles—post‑apocalyptic Paris, Van Gogh’s Starry Night, or anything you can imagine. 🎨