Somnium Audio Dream Journal

Published: January 6, 2026 at 09:55 PM EST
Source: Dev.to

Education Track: Build Apps with Google AI Studio
This post is my submission for DEV Education Track: Build Apps with Google AI Studio.

What I Built

I built Somnium, a mystical, voice‑first dream journal that acts as a bridge to your subconscious. Instead of typing out dreams in the middle of the night, users simply record their voice. The app uses Google’s Gemini API to transcribe the audio, analyze the dream using Jungian psychology, detect emotional themes, and even generate a surrealist image representing the dreamscape.

Key Prompts & Features

  • Multimodal audio processing – leveraged the gemini-3-flash-preview model to handle raw audio blobs directly.
  • Analysis Prompt:

    “You are an expert Jungian dream analyst… Transcribe the audio… Analyze for hidden meanings… Identify archetypes… Rate the intensity of primary emotions.”

  • Visual Generation – used the analysis output to craft a dynamic prompt for gemini-2.5-flash-image, requesting “Abstract Expressionism mixed with Dreamcore” based on the specific emotions and themes found in the dream.
  • Real‑time audio visualizer and an emotion radar chart (implemented with Recharts).
  • Auto‑tagging system – AI suggests relevant keywords for each journal entry.
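
A minimal sketch of how the analysis output might feed both the radar chart and the image prompt. The `DreamSummary` shape, the field names, the 0 to 10 intensity scale, and both helper functions are assumptions made for illustration; the post does not show the actual data model.

```typescript
// Hypothetical shape of the parsed analysis; field names are assumptions.
interface EmotionScore {
  name: string;
  intensity: number; // assumed scale: 0 (absent) to 10 (overwhelming)
}

interface DreamSummary {
  themes: string[];
  emotions: EmotionScore[];
}

// One row per emotion, the array-of-objects form that chart libraries
// such as Recharts take as their `data` prop.
interface RadarPoint {
  emotion: string;
  intensity: number;
}

function toRadarData(summary: DreamSummary): RadarPoint[] {
  return summary.emotions.map((e) => ({
    emotion: e.name,
    // Clamp so an out-of-range score cannot distort the chart.
    intensity: Math.min(10, Math.max(0, e.intensity)),
  }));
}

// Build the image prompt from the strongest emotions plus the themes.
function buildImagePrompt(summary: DreamSummary, topN: number = 2): string {
  const strongest = toRadarData(summary)
    .sort((a, b) => b.intensity - a.intensity)
    .slice(0, topN)
    .map((p) => p.emotion);
  return (
    "Abstract Expressionism mixed with Dreamcore. " +
    `Dominant emotions: ${strongest.join(", ")}. ` +
    `Themes: ${summary.themes.join(", ")}.`
  );
}

const sampleSummary: DreamSummary = {
  themes: ["falling", "ocean"],
  emotions: [
    { name: "fear", intensity: 8 },
    { name: "awe", intensity: 6 },
    { name: "joy", intensity: 2 },
  ],
};

const radarData = toRadarData(sampleSummary);
const imagePrompt = buildImagePrompt(sampleSummary);
```

In this sketch, `radarData` would be handed to the chart component and `imagePrompt` sent to the image model, so both views derive from the same analysis result.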

Demo

Somnium Audio Dream Journal Demo

My Experience

Building with the Google GenAI SDK was surprisingly intuitive, especially regarding Structured Output.

  • Multimodal Ease – I didn’t need a separate speech‑to‑text library. Passing the audio blob directly to Gemini with a prompt to “transcribe and analyze” handled both tasks in a single request, reducing latency and code complexity.
  • JSON Schema – Using the responseSchema configuration ensured Gemini always returned data (e.g., emotion scores and archetype lists) in a clean JSON format that my React components could render immediately without parsing errors.
  • Chaining Outputs – The ability to feed the text analysis output into an image generation prompt created a cohesive user experience where the visuals truly matched the “vibe” of the dream interpretation.
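
The single-request flow with structured output can be sketched roughly as follows. This is an illustration under stated assumptions, not the app's actual code: the schema fields, `ANALYSIS_PROMPT`, `buildAnalysisRequest`, and `parseAnalysis` are hypothetical names, and the object is written as plain data in the shape the Google GenAI SDK's `generateContent` call accepts (an `inlineData` audio part, plus `responseMimeType` and `responseSchema` under `config`). The small runtime guard is an optional safety net on top of the schema.

```typescript
// Hypothetical result shape; the post does not show the real schema.
interface EmotionScore {
  name: string;
  intensity: number;
}

interface DreamAnalysis {
  transcript: string;
  archetypes: string[];
  emotions: EmotionScore[];
}

// Placeholder only: the full analysis prompt is elided in the post.
const ANALYSIS_PROMPT = "<analysis prompt goes here>";

// Response schema using uppercase type names (assumed to match the
// SDK's Type enum values).
const analysisSchema = {
  type: "OBJECT",
  properties: {
    transcript: { type: "STRING" },
    archetypes: { type: "ARRAY", items: { type: "STRING" } },
    emotions: {
      type: "ARRAY",
      items: {
        type: "OBJECT",
        properties: { name: { type: "STRING" }, intensity: { type: "NUMBER" } },
        required: ["name", "intensity"],
      },
    },
  },
  required: ["transcript", "archetypes", "emotions"],
};

// The recording travels as a base64 inlineData part next to the text
// prompt, so transcription and analysis share one request.
function buildAnalysisRequest(audioBase64: string, mimeType: string) {
  return {
    model: "gemini-3-flash-preview",
    contents: [
      {
        role: "user",
        parts: [
          { inlineData: { mimeType, data: audioBase64 } },
          { text: ANALYSIS_PROMPT },
        ],
      },
    ],
    config: {
      responseMimeType: "application/json",
      responseSchema: analysisSchema,
    },
  };
}

function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((x) => typeof x === "string");
}

// Parse the model's JSON text and verify the shape before rendering.
function parseAnalysis(json: string): DreamAnalysis {
  const data: any = JSON.parse(json);
  const emotionsOk =
    Array.isArray(data?.emotions) &&
    data.emotions.every(
      (e: any) => typeof e?.name === "string" && typeof e?.intensity === "number"
    );
  if (typeof data?.transcript !== "string" || !isStringArray(data.archetypes) || !emotionsOk) {
    throw new Error("Response does not match the expected analysis shape");
  }
  return data as DreamAnalysis;
}
```

With the SDK installed, the object returned by `buildAnalysisRequest` would be passed to `ai.models.generateContent(...)` and the response text handed to `parseAnalysis` before it reaches any React component.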