DoodleMates: Building a Multimodal Creature Generator
Source: Dev.to

This post is my submission for DEV Education Track: Build Apps with Google AI Studio.
I set out to build DoodleMates, an app that turns any photo and personality traits into a unique 3D doodle creature.
The core of the app is a single multimodal API call. The prompt is written to use both the image and the text input:
“Analyze the image’s aesthetic and colors, then generate a detailed 3D doodle‑style creature sticker that reflects ‘[User’s Personality Notes]’ and matches the image’s style.”
I used the Studio’s multimodal capabilities and the Prompt Engineering interface to iterate quickly on visual style and consistency.
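For readers who want to move from the Studio UI to code, here is a minimal sketch of how the prompt and photo could be packed into one request body. It is not the code behind DoodleMates. The helper names (`build_prompt`, `build_request`) are hypothetical, and the `contents` / `parts` / `inline_data` layout is my assumption about the Gemini REST request shape, so check it against the current API reference. The sketch only builds the payload; it does not pick a model or send anything.

```python
import base64
import json

# The prompt from this post, with the personality notes as a placeholder.
# Straight quotes are used here instead of the curly quotes in the article.
PROMPT_TEMPLATE = (
    "Analyze the image's aesthetic and colors, then generate a detailed "
    "3D doodle-style creature sticker that reflects '{notes}' and matches "
    "the image's style."
)


def build_prompt(personality_notes: str) -> str:
    """Fill the template with whitespace-normalized personality notes."""
    notes = " ".join(personality_notes.split())
    if not notes:
        raise ValueError("personality notes must not be empty")
    return PROMPT_TEMPLATE.format(notes=notes)


def build_request(image_bytes: bytes, mime_type: str, personality_notes: str) -> dict:
    """Combine the text prompt and the photo into a single request body.

    Assumption: the API accepts one text part and one base64-encoded
    inline image part in the same message.
    """
    return {
        "contents": [
            {
                "parts": [
                    {"text": build_prompt(personality_notes)},
                    {
                        "inline_data": {
                            "mime_type": mime_type,
                            "data": base64.b64encode(image_bytes).decode("ascii"),
                        }
                    },
                ]
            }
        ]
    }


if __name__ == "__main__":
    # Placeholder bytes stand in for a real photo.
    payload = build_request(b"not a real image", "image/png", "curious, sleepy, loves rain")
    print(json.dumps(payload, indent=2))
```

Keeping the template and the payload builder separate means the wording can change without touching the request structure.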
Demo
Input
The user shares a photo and simple text notes.
Output
The generated, custom DoodleMate.
My Experience
What I Learned 💡
- True Multimodal Simplicity – The model elegantly handles fundamentally different inputs (an image and a block of text) and produces a unified, creative output (a new image) without needing separate APIs for analysis and generation.
- Prompt as Code – Tweaking words like “3D sticker,” “whimsical,” or “charming” acted like visual parameters, allowing me to refine the aesthetic without writing traditional code.
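To make the "prompt as code" idea concrete, here is a small hypothetical sketch that treats style words as parameters and produces prompt variants to compare side by side. The function name `styled_prompt` and the "Style: ..." prefix are my own illustration, not something AI Studio requires or something the app uses.

```python
from typing import Iterable

# Hypothetical base instruction; swap in whatever prompt you are iterating on.
BASE_PROMPT = "Generate a creature sticker that matches the image's style."


def styled_prompt(style_words: Iterable[str], base: str = BASE_PROMPT) -> str:
    """Prefix the base prompt with a de-duplicated list of style words."""
    kept = []
    seen = set()
    for word in style_words:
        cleaned = word.strip()
        key = cleaned.casefold()  # compare case-insensitively, keep original casing
        if cleaned and key not in seen:
            seen.add(key)
            kept.append(cleaned)
    if not kept:
        return base
    return f"Style: {', '.join(kept)}. {base}"


if __name__ == "__main__":
    # Each variant changes only the "visual parameters", not the base prompt.
    for words in (["3D sticker"], ["3D sticker", "whimsical"], ["3D sticker", "charming"]):
        print(styled_prompt(words))
```

Changing one word per variant makes it easier to see which term is responsible for a shift in the generated look.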
What Was Surprising 🤯
- Speed of Prototyping – I went from a simple concept to a working core engine for a highly custom image‑to‑image application in less than an hour. Testing the API directly in the Studio environment made refining the prompt incredibly fast, a game‑changer for solo developers.
If you’re looking for a quick, creative project, using Google AI Studio for multimodal tasks is the perfect way to turn pixels into personality!

