Realtime Multimodal AI on Ray-Ban Meta Glasses with Gemini Live & LiveKit
Source: Dev.to

Architecture
The setup involves several layers to ensure low‑latency, secure communication between the wearable device and the AI:
- Meta Ray‑Ban Glasses – Capture video and audio, connecting via Bluetooth to your phone.
- Phone (Android/iOS) – Acts as the gateway, connecting via WebRTC to LiveKit Cloud.
- LiveKit Cloud – Serves as a secure, high‑performance proxy for the Gemini Live API.
- Gemini Live API – Processes the stream via WebSockets, enabling real‑time multimodal interaction.

Backend: Building the Gemini Live Agent
We use the LiveKit Agents framework to act as a secure WebRTC proxy for the Gemini Live API. This agent joins the LiveKit room, listens to the audio, and processes the video stream from the glasses.
Setting up the Assistant
The core of our agent is the AgentSession. We use the google.beta.realtime.RealtimeModel to interface with Gemini and enable video_input in the RoomOptions so the agent can “see.”
@server.rtc_session()
async def entrypoint(ctx: JobContext):
    ctx.log_context_fields = {"room": ctx.room.name}

    session = AgentSession(
        llm=google.beta.realtime.RealtimeModel(
            model="gemini-2.5-flash-native-audio-preview-12-2025",
            proactivity=True,
            enable_affective_dialog=True,
        ),
        vad=ctx.proc.userdata["vad"],
    )

    await session.start(
        room=ctx.room,
        agent=Assistant(),
        room_options=room_io.RoomOptions(
            video_input=True,
        ),
    )

    await ctx.connect()
    await session.generate_reply()
By setting video_input=True, the agent automatically requests the video track from the room, which in this case is the 1 FPS stream coming from the glasses.
Running the Agent
To start your agent in development mode and make it accessible globally via LiveKit Cloud:
uv run agent.py dev
Find the full Gemini Live vision agent example in the LiveKit docs.
Connection & Authentication
CLI Token Generation
For testing and demos, you can quickly generate a short‑lived access token using the LiveKit CLI:
lk token create \
  --api-key <your-api-key> \
  --api-secret <your-api-secret> \
  --join \
  --room <room-name> \
  --identity <participant-identity> \
  --valid-for 24h
In a production environment, always issue tokens from a secure backend to keep your API secrets safe (see LiveKit’s authentication guide).
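What the CLI produces is a standard HS256 JWT carrying a LiveKit video grant, which is why minting tokens server-side is straightforward. The structure can be sketched with the standard library alone; claim names below follow LiveKit's published token format, but treat this as illustrative rather than a replacement for the official server SDKs:

```python
import base64
import hashlib
import hmac
import json
import time

def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def make_livekit_token(api_key: str, api_secret: str, identity: str,
                       room: str, ttl_seconds: int = 86400) -> str:
    """Build an HS256 JWT with a LiveKit room-join grant."""
    header = {"alg": "HS256", "typ": "JWT"}
    now = int(time.time())
    payload = {
        "iss": api_key,            # API key identifies your project
        "sub": identity,           # participant identity
        "nbf": now,
        "exp": now + ttl_seconds,  # e.g. 24h, like --valid-for 24h
        "video": {"room": room, "roomJoin": True},
    }
    signing_input = (b64url(json.dumps(header).encode()) + "." +
                     b64url(json.dumps(payload).encode()))
    sig = hmac.new(api_secret.encode(), signing_input.encode(),
                   hashlib.sha256).digest()
    return signing_input + "." + b64url(sig)
```

In practice, use a LiveKit server SDK (e.g. livekit-api for Python) behind an authenticated endpoint; it wraps exactly this signing step and keeps the API secret off the client.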
Frontend: Meta Wearables Integration
This example targets Android devices (e.g., Google Pixel). You’ll need the Meta Wearables Toolkit and the sample project.
- Clone the sample – Get the Android client example.
- Configure local.properties – Add your GitHub token as required by the Meta SDK.
- Update connection details – In StreamScreen.kt, replace the server URL and token with your LiveKit details:

  // streamViewModel.connectToLiveKit
  connectToLiveKit(
      url = "wss://your-project.livekit.cloud",
      token = "your-generated-token"
  )

- Run the app – Connect your device via USB and deploy from Android Studio.
Conclusion
By bridging Meta Wearables with Gemini Live via LiveKit, we’ve created a powerful, low‑latency vision AI experience. The architecture is scalable and secure, providing a foundation for the next generation of wearable AI applications.
Happy hacking! 🚀