Somnium Audio Dream Journal

Published: January 6, 2026 at 09:55 PM EST
Source: Dev.to

Education Track: Build Apps with Google AI Studio
This post is my submission for the DEV Education Track: Build Apps with Google AI Studio.

What I Built

I built Somnium, a mystical, voice‑first dream journal that acts as a bridge to your subconscious. Instead of typing out dreams in the middle of the night, users simply record their voice. The app uses Google’s Gemini API to transcribe the audio, analyze the dream using Jungian psychology, detect emotional themes, and even generate a surrealist image representing the dreamscape.

Key Prompts & Features

  • Multimodal audio processing – leveraged the gemini-3-flash-preview model to handle raw audio blobs directly.
  • Analysis Prompt:

    “You are an expert Jungian dream analyst… Transcribe the audio… Analyze for hidden meanings… Identify archetypes… Rate the intensity of primary emotions.”

  • Visual Generation – used the analysis output to craft a dynamic prompt for gemini-2.5-flash-image, requesting “Abstract Expressionism mixed with Dreamcore” based on the specific emotions and themes found in the dream.
  • Real‑time audio visualizer and an emotion radar chart (implemented with Recharts).
  • Auto‑tagging system – AI suggests relevant keywords for each journal entry.
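To make the visual generation step more concrete, here is a minimal sketch of how an analysis result could be turned into an image prompt and into rows for a radar chart. The DreamAnalysis shape, its field names, the 0 to 10 intensity scale, and the helpers buildImagePrompt and toRadarData are all hypothetical and are not taken from the app's source; only the style phrase comes from the post.

```typescript
// Hypothetical shape of the structured analysis; field names are assumptions.
interface EmotionScore {
  emotion: string;
  intensity: number; // assumed 0-10 scale
}

interface DreamAnalysis {
  transcript: string;
  archetypes: string[];
  themes: string[];
  emotions: EmotionScore[];
}

// Build the text prompt for the image model from the analysis output.
function buildImagePrompt(analysis: DreamAnalysis, topN = 3): string {
  // Keep only the strongest emotions so the prompt stays focused.
  const dominant = analysis.emotions
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => b.intensity - a.intensity)
    .slice(0, topN)
    .map((e) => e.emotion);

  return [
    "Abstract Expressionism mixed with Dreamcore.", // style phrase from the post
    `Dominant emotions: ${dominant.join(", ")}.`,
    `Themes: ${analysis.themes.join(", ")}.`,
    `Archetypes: ${analysis.archetypes.join(", ")}.`,
  ].join(" ");
}

// Map emotion scores to flat rows, the kind of array a radar chart consumes.
// The key names "subject" and "value" are illustrative choices.
function toRadarData(
  analysis: DreamAnalysis
): { subject: string; value: number }[] {
  return analysis.emotions.map((e) => ({
    subject: e.emotion,
    value: e.intensity,
  }));
}
```

Keeping both helpers as pure functions means the prompt wording and the chart data can be checked without calling the API.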

Demo

Somnium Audio Dream Journal Demo

My Experience

Building with the Google GenAI SDK was surprisingly intuitive, especially its structured output support.

  • Multimodal Ease – I didn’t need a separate speech‑to‑text library. Passing the audio blob directly to Gemini with a prompt to “transcribe and analyze” handled both tasks in a single request, reducing latency and code complexity.
  • JSON Schema – With the responseSchema configuration, Gemini returned data (e.g., emotion scores and archetype lists) in a clean JSON format that my React components could render immediately, without parsing errors.
  • Chaining Outputs – The ability to feed the text analysis output into an image generation prompt created a cohesive user experience where the visuals truly matched the “vibe” of the dream interpretation.
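
The single-request and schema points above can be sketched roughly as follows. This is a plain-object illustration, not the app's actual code: the request shape (inlineData, responseMimeType, responseSchema) mirrors what I understand the Google GenAI SDK's generateContent call to accept, so treat it as an assumption and check it against the SDK docs. The schema fields, buildAnalysisRequest, and parseAnalysis are hypothetical names, and the prompt is passed in as a parameter because the post only quotes it in abbreviated form.

```typescript
// Model name as given in the post.
const ANALYSIS_MODEL = "gemini-3-flash-preview";

// Hypothetical response schema; the field names are assumptions.
const analysisSchema = {
  type: "OBJECT",
  properties: {
    transcript: { type: "STRING" },
    archetypes: { type: "ARRAY", items: { type: "STRING" } },
    emotions: {
      type: "ARRAY",
      items: {
        type: "OBJECT",
        properties: {
          emotion: { type: "STRING" },
          intensity: { type: "NUMBER" },
        },
        required: ["emotion", "intensity"],
      },
    },
    tags: { type: "ARRAY", items: { type: "STRING" } },
  },
  required: ["transcript", "archetypes", "emotions", "tags"],
};

// Build one request carrying both the audio and the instructions.
// audioBase64 is the recorded audio, already base64-encoded by the caller.
function buildAnalysisRequest(audioBase64: string, mimeType: string, prompt: string) {
  return {
    model: ANALYSIS_MODEL,
    contents: [
      { inlineData: { mimeType, data: audioBase64 } },
      { text: prompt },
    ],
    config: {
      responseMimeType: "application/json",
      responseSchema: analysisSchema,
    },
  };
}

interface ParsedAnalysis {
  transcript: string;
  archetypes: string[];
  emotions: { emotion: string; intensity: number }[];
  tags: string[];
}

function isStringArray(v: unknown): v is string[] {
  return Array.isArray(v) && v.every((x) => typeof x === "string");
}

// Parse the JSON text of a response and check its shape before rendering.
function parseAnalysis(jsonText: string): ParsedAnalysis {
  const data = JSON.parse(jsonText) as Record<string, unknown>;
  const emotions = data["emotions"];
  if (
    typeof data["transcript"] !== "string" ||
    !isStringArray(data["archetypes"]) ||
    !isStringArray(data["tags"]) ||
    !Array.isArray(emotions) ||
    !emotions.every(
      (e) => typeof e?.emotion === "string" && typeof e?.intensity === "number"
    )
  ) {
    throw new Error("Response does not match the expected analysis shape");
  }
  return data as unknown as ParsedAnalysis;
}
```

The shape check in parseAnalysis is a cautious design choice on my part: even with a schema configured, validating once at the boundary keeps the React components free of defensive checks.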