DoodleMates: Building a Multimodal Creature Generator

Published: December 3, 2025 at 02:40 AM EST

Source: Dev.to

This post is my submission for DEV Education Track: Build Apps with Google AI Studio.

I set out to build DoodleMates, an app that turns any photo and personality traits into a unique 3D doodle creature.
The core functionality relies on a single multimodal API call. I wrote the prompt so that it uses both inputs, the image and the text:

“Analyze the image’s aesthetic and colors, then generate a detailed 3D doodle‑style creature sticker that reflects ‘[User’s Personality Notes]’ and matches the image’s style.”
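To make the single call concrete, here is a minimal Python sketch of how the photo and the personality notes could be bundled into one request body. It is an illustration, not the code behind the app: the helper names `build_prompt` and `build_request` are hypothetical, and the `contents` / `parts` / `inline_data` layout is my assumption about the general shape of a Gemini-style request, so check the current API reference before relying on it. Nothing is sent over the network here.

```python
import base64
import json

# Prompt from the post, with the personality notes left as a placeholder.
PROMPT_TEMPLATE = (
    "Analyze the image's aesthetic and colors, then generate a detailed "
    "3D doodle-style creature sticker that reflects '{notes}' "
    "and matches the image's style."
)


def build_prompt(personality_notes: str) -> str:
    """Fill the user's personality notes into the prompt template."""
    return PROMPT_TEMPLATE.format(notes=personality_notes.strip())


def build_request(image_bytes: bytes, mime_type: str, personality_notes: str) -> dict:
    """Bundle one text part and one inline image part into a single body.

    Assumption: the field names below mirror a Gemini-style JSON request;
    they are illustrative and may not match the exact current schema.
    """
    return {
        "contents": [
            {
                "parts": [
                    {"text": build_prompt(personality_notes)},
                    {
                        "inline_data": {
                            "mime_type": mime_type,
                            # Binary image data is base64-encoded for JSON transport.
                            "data": base64.b64encode(image_bytes).decode("ascii"),
                        }
                    },
                ]
            }
        ]
    }


if __name__ == "__main__":
    placeholder_photo = b"placeholder image bytes"  # stand-in for a real photo
    body = build_request(placeholder_photo, "image/png", "curious, loves rainy days")
    print(json.dumps(body, indent=2))
```

The point of the sketch is that the image and the text travel together as parts of one message, so there is no separate analysis step before generation.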

I used Google AI Studio’s multimodal capabilities and its Prompt Engineering interface to iterate quickly on the visual style and its consistency.

Demo

Input

The user shares a photo and simple text notes.

Output

The generated, custom DoodleMate.

My Experience

What I Learned 💡

True Multimodal Simplicity – The model elegantly handles fundamentally different inputs (an image and a block of text) and produces a unified, creative output (a new image) without needing separate APIs for analysis and generation.
Prompt as Code – Tweaking words like “3D sticker,” “whimsical,” or “charming” acted like visual parameters, allowing me to refine the aesthetic without writing traditional code.
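
One way to picture the "Prompt as Code" idea is to treat the style words as function parameters and generate every combination to compare side by side. The Python sketch below is only an illustration under that assumption: `styled_prompt` and `prompt_variants` are hypothetical helpers, and the wording is adapted from the prompt earlier in the post rather than copied from the app.

```python
from itertools import product


def styled_prompt(notes: str, form: str = "3D sticker", mood: str = "whimsical") -> str:
    """Build one prompt, with the style words exposed as parameters."""
    return (
        "Analyze the image's aesthetic and colors, then generate a detailed, "
        f"{mood} {form} of a doodle-style creature that reflects '{notes}' "
        "and matches the image's style."
    )


def prompt_variants(notes: str, forms: list[str], moods: list[str]) -> list[str]:
    """Return one prompt per (form, mood) combination for side-by-side testing."""
    return [styled_prompt(notes, form=f, mood=m) for f, m in product(forms, moods)]


if __name__ == "__main__":
    # Style words taken from the post; the notes are a made-up example.
    variants = prompt_variants(
        "curious, loves rainy days",
        forms=["3D sticker"],
        moods=["whimsical", "charming"],
    )
    for prompt in variants:
        print(prompt)
```

Each variant can then be pasted into the Studio to see how a single word shifts the result.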

What Was Surprising 🤯

Speed of Prototyping – I went from a simple concept to a functional core engine for a highly custom, image‑to‑image application in less than an hour. Testing the API directly in the Studio environment made iterating on the perfect prompt incredibly fast, a game‑changer for solo developers.

If you’re looking for a quick, creative project, using Google AI Studio for multimodal tasks is the perfect way to turn pixels into personality!
