Engineering Context for Local and Cloud AI: Personas, Content Intelligence, and Zero-Prompt UX
Source: Dev.to
Introduction
In the previous article we covered how DocuMentor AI’s hybrid architecture seamlessly adapts between Chrome’s local Gemini Nano and cloud AI. We built a system that automatically routes tasks based on capabilities and performance constraints.
But having a robust execution layer is only half the battle. The other half? Engineering the context that goes into those models.
Most AI tools give you a powerful model and a blank text box, then expect you to figure out what to ask. It’s like handing someone a professional camera and saying “take a good photo” – technically possible, but the burden is entirely on the user.
DocuMentor takes a different approach: zero‑prompt UX through intelligent context engineering. Users never write prompts. They click a feature (Quick Scan, Deep Analysis, Cheat Sheet), and the extension handles the rest—assembling the right persona elements, extracting the right page sections, and shaping everything into a request the AI can’t misinterpret.
This article breaks down how that works: the philosophy behind zero‑prompt design, the persona system that personalizes every response, and the content‑intelligence layer that knows exactly what to send to the AI.
The Zero‑Prompt Philosophy
Most technical‑documentation tools and AI‑powered browsers present the same UX pattern: a blank chat input with a placeholder like “Ask me anything about this page.”

On the surface this seems user‑friendly, but for anyone who doesn’t know what to ask, a blank box creates three problems:
- Cognitive load – Users must think about how to phrase their question.
- Intent ambiguity – Small wording changes lead to wildly different answers.
- Generic responses – Without context about the user, the AI gives one‑size‑fits‑all answers.
I built DocuMentor to solve my own problem: I spend hours each week scanning documentation, blog posts, and API references trying to extract what I need quickly. Sometimes I want a TL;DR to decide if it’s worth reading. Other times I need a cheat sheet for future reference. And sometimes I just want to know “Should I care about this?”
These are specific, recurring needs. Why should I have to articulate them from scratch every time?
Feature‑First Design
Instead of a blank chat box, DocuMentor exposes four purpose‑built features:
| Feature | What It Does |
|---|---|
| Quick Scan | Instant insights: TL;DR, “should I read this?”, related resources, page architecture |
| Deep Analysis | Comprehensive overview, code patterns, video recommendations, learning resources with reasoning |
| Cheat Sheet | Condensed, actionable summary optimized for quick lookup |
| AskMe | Targeted chat: select text or images and ask specific questions |
Each feature represents a pre‑crafted intent. Users don’t have to think about how to ask; they just pick the outcome they want. The extension then crafts the prompt, selects the right page sections, and applies the user’s persona.
This isn’t just about convenience. It’s about eliminating ambiguity. When a user clicks Quick Scan, there’s zero room for misinterpretation. The AI knows exactly what format to return, what level of detail to provide, and what the user cares about.
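The feature‑first idea can be sketched as a lookup from feature to pre‑crafted intent. This is a minimal illustration, not DocuMentor’s actual implementation; all names and formats here are assumptions:

```typescript
// Hypothetical sketch: each feature resolves to a pre-crafted intent,
// so the user never writes a prompt. Names are illustrative.
type Feature = "quickScan" | "deepAnalysis" | "cheatSheet" | "askMe";

interface Intent {
  instruction: string;  // what the AI must do
  outputFormat: string; // the structure it must return
}

const INTENTS: Record<Feature, Intent> = {
  quickScan: {
    instruction:
      "Produce a TL;DR, a should-I-read-this verdict, and related resources.",
    outputFormat: "json:{tldr, verdict, resources[]}",
  },
  deepAnalysis: {
    instruction:
      "Produce a comprehensive overview, code patterns, and learning resources with reasoning.",
    outputFormat: "json:{overview, patterns[], resources[]}",
  },
  cheatSheet: {
    instruction:
      "Produce a condensed, actionable reference optimized for quick lookup.",
    outputFormat: "markdown",
  },
  askMe: {
    instruction: "Answer the user's question about the selected text.",
    outputFormat: "markdown",
  },
};

// A click resolves directly to an unambiguous request skeleton.
function intentFor(feature: Feature): Intent {
  return INTENTS[feature];
}
```

Because the intent is fixed at design time, the model never has to guess the output shape from free‑form user wording.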
Persona‑Driven Personalization
After building the initial feature set, I realized something critical: none of these features should return generic answers.
A “Should I read this?” recommendation means nothing without knowing who’s asking. A senior AI engineer doesn’t need an intro to neural networks, whereas a junior frontend developer does. Same feature, same page, completely different answers.
That’s when I introduced the persona system—a user profile that shapes every AI response.
What’s in a Persona
| Component | Description |
|---|---|
| Role | AI/ML Engineer, Frontend Developer, Backend Engineer, etc. |
| Seniority | Beginner, Intermediate, Senior |
| Skills | Programming languages, frameworks, concepts – each with a proficiency level (Beginner, Intermediate, Advanced) |
| Learning Goals | What the user wants to master right now (e.g., “Master LangGraph for production AI agents”) |
| Learning Preferences | Text, video, or mixed |

Figure: The five components of a DocuMentor persona.
The challenge wasn’t just collecting this information—it was knowing which elements matter for which features.
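As a rough sketch, the five components above could be modeled like this; the field names and shapes are my assumptions, not DocuMentor’s actual schema:

```typescript
// Hypothetical persona shape mirroring the table above.
type Seniority = "beginner" | "intermediate" | "senior";
type Proficiency = "beginner" | "intermediate" | "advanced";

interface Skill {
  name: string; // e.g. "React", "LangGraph"
  level: Proficiency;
}

interface Persona {
  role: string; // e.g. "AI/ML Engineer"
  seniority: Seniority;
  skills: Skill[];
  learningGoals: string[]; // e.g. "Master LangGraph for production AI agents"
  learningPreference: "text" | "video" | "mixed";
}

// Example persona used in the scenarios later in this article.
const example: Persona = {
  role: "Frontend Developer",
  seniority: "beginner",
  skills: [{ name: "React", level: "intermediate" }],
  learningGoals: ["Master React for production apps"],
  learningPreference: "mixed",
};
```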
Mapping Persona Elements to Features
| Feature | Relevant Persona Elements |
|---|---|
| Quick Scan | Role, Seniority, Skills, Learning Goals |
| Deep Analysis | Role, Seniority, Skills, Learning Goals, Learning Preferences |
| Cheat Sheet | Role, Seniority, Skills |
| AskMe | All elements (depends on the specific query) |
For example, learning preferences are irrelevant for cheat sheets (the user already decided they want text), while skills and goals are critical for “Should I read this?” recommendations. Sending irrelevant persona data adds noise and wastes tokens—especially on local AI with tight context limits.
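One way to implement this mapping is a per‑feature field whitelist, so irrelevant persona data never reaches the prompt. A minimal sketch, with assumed names:

```typescript
// Sketch: route only the persona fields each feature needs.
// Field and feature names are illustrative.
type PersonaField =
  | "role"
  | "seniority"
  | "skills"
  | "learningGoals"
  | "learningPreference";

const PERSONA_FIELDS: Record<string, PersonaField[]> = {
  quickScan: ["role", "seniority", "skills", "learningGoals"],
  deepAnalysis: ["role", "seniority", "skills", "learningGoals", "learningPreference"],
  cheatSheet: ["role", "seniority", "skills"],
  askMe: ["role", "seniority", "skills", "learningGoals", "learningPreference"],
};

// Strip the persona down to what this feature needs, so irrelevant
// fields never consume context tokens.
function personaSlice(
  persona: Record<PersonaField, unknown>,
  feature: string,
): Partial<Record<PersonaField, unknown>> {
  const slice: Partial<Record<PersonaField, unknown>> = {};
  for (const field of PERSONA_FIELDS[feature] ?? []) {
    slice[field] = persona[field];
  }
  return slice;
}
```

On a ~4K‑token local model, dropping even a few hundred tokens of unused persona text leaves meaningfully more room for page content.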
Persona‑Driven Recommendations
Scenario: You’re a Junior Front‑end Developer learning React and you land on an article about advanced state‑management patterns.
DocuMentor’s “Should I read this?” feature might say:
Yes, read this.
This covers `useReducer` and Context API patterns that will level up your React skills. It assumes familiarity with `useState`, which you have. The examples are practical and match your learning goal: mastering React for production apps.
Scenario: You’re a Senior Backend Engineer who knows React but isn’t focused on front‑end work.
DocuMentor’s recommendation:
Skip this.
You already understand these patterns from your React experience. This won’t advance your current goal (mastering distributed systems). If you need a refresher later, the cheat‑sheet feature has you covered.
The same page, the same feature, but completely different recommendations because the persona tells the AI who is asking and why they care.
This isn’t personalization for its own sake. It’s about respecting the user’s time. Generic AI tools waste time by forcing you to read irrelevant content or by hiding important insights. Persona‑driven AI acts like a knowledgeable colleague who knows your background and priorities.
Content Intelligence: Strategic Page Decomposition
Early on I made the naive mistake most AI developers make: I fed the entire page HTML to the model, assuming it could “figure it out.”
That failed spectacularly:
| Problem | Why it hurts |
|---|---|
| Context overflow | Raw HTML easily exceeds Chrome AI’s ~4K‑token limit |
| Noise drowning signal | Ads, navigation, footers, and JavaScript compete with the actual content |
| Hallucinations | Small models like Gemini Nano get confused by irrelevant information |
First fix → Content extraction
I used Mozilla’s Readability library (with a custom fallback for pages where Readability fails) to extract clean, readable text.
Even after cleaning, a new problem emerged: not every feature needs the same information.
| Feature | Information needed |
|---|---|
| Summaries & cheat sheets | Full article content |
| Video recommendations | Only a summary of the page |
| “Learn Resources” suggestions | Page links & navigation context (no article body) |
Sending everything to every feature wastes tokens, increases latency, and reduces relevance.
Solution: Strategic page decomposition.
DocuMentor’s purpose‑driven sections
- Main content – Core article text (extracted via Readability)
- Table of contents – Page structure & hierarchy
- Page links – URLs embedded in the content
- Code blocks – Extracted separately for pattern analysis
- Breadcrumbs & navigation – Metadata about where the page fits in the documentation

Figure: Page sections are strategically routed to different features based on what information is actually relevant.
Feature‑to‑section mapping
| Feature | Content Sections Used |
|---|---|
| Summary | Main content |
| Cheat Sheet | Main content + code blocks + page links |
| Video Recommendations | Summary only |
| Learn Resources | Summary + page links + breadcrumbs + navigation |
| Code Patterns (Deep Analysis) | Code blocks + surrounding context |
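The decomposition and the routing table above could be expressed together like this. A sketch under assumed names; the real section boundaries and feature keys may differ:

```typescript
// Sketch: the decomposed page, and which sections each feature receives.
interface PageSections {
  mainContent: string;     // extracted via Readability
  tableOfContents: string[];
  pageLinks: string[];
  codeBlocks: string[];
  breadcrumbs: string[];
  summary: string;         // generated once, reused by lightweight features
}

const SECTIONS_FOR: Record<string, (keyof PageSections)[]> = {
  summary: ["mainContent"],
  cheatSheet: ["mainContent", "codeBlocks", "pageLinks"],
  videoRecs: ["summary"],
  learnResources: ["summary", "pageLinks", "breadcrumbs"],
  codePatterns: ["codeBlocks", "mainContent"],
};

// Assemble only the sections a feature actually uses.
function sectionsFor(
  feature: string,
  page: PageSections,
): Partial<PageSections> {
  const out: Partial<PageSections> = {};
  for (const key of SECTIONS_FOR[feature] ?? []) {
    (out as Record<string, unknown>)[key] = page[key];
  }
  return out;
}
```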
Concrete example: Video recommendations
A naïve approach would send the full 10K‑word article to the model, then ask it to find relevant YouTube videos. That would:
- Burn most of Chrome AI’s token budget on a single feature
- Slow down the response (the model processes 10 K words before calling the YouTube API)
- Risk quota errors on low‑VRAM devices
DocuMentor’s optimized flow
1. Generate a summary of the page (≈200‑300 words).
2. Send the summary plus the user persona to the AI.
3. The AI creates an optimal YouTube search query based on the topic and the user’s learning goals.
4. The extension calls the YouTube Data API (outside the AI).
5. The AI ranks the top 10 results for relevance to the user’s goals and the page summary.
6. Return the top 3 videos with personalized descriptions.
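The optimized flow above can be sketched as an orchestrator with the AI and YouTube calls injected as plain functions. Everything here is illustrative; none of these names come from DocuMentor’s actual code:

```typescript
// Sketch of the video-recommendation flow, with the AI and
// YouTube Data API calls injected so the orchestration is testable.
interface Video { id: string; title: string; }

interface Deps {
  summarize: (content: string) => Promise<string>;
  buildQuery: (summary: string, persona: string) => Promise<string>;
  searchYouTube: (query: string) => Promise<Video[]>; // Data API call, outside the AI
  rank: (videos: Video[], summary: string, persona: string) => Promise<Video[]>;
}

async function recommendVideos(
  content: string,
  persona: string,
  deps: Deps,
): Promise<Video[]> {
  const summary = await deps.summarize(content);         // ~200-300 words, not the full article
  const query = await deps.buildQuery(summary, persona); // AI crafts the search query
  const results = await deps.searchYouTube(query);       // top results from the Data API
  const ranked = await deps.rank(results.slice(0, 10), summary, persona);
  return ranked.slice(0, 3);                             // top 3, personalized
}
```

Note that the expensive full‑content pass happens exactly once (the summary); every later step works on the ~300‑word artifact.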
Result: ~10× faster and ~1/10th the tokens compared with sending the full content. Because the AI only sees relevant information (summary + persona), the recommendations are more accurate.
This pattern repeats across every feature: content intelligence isn’t about giving the AI more information—it’s about giving it the right information.
Adaptive Prompting Across Providers
One final layer of context engineering: how you shape the request matters as much as what you send.
DocuMentor runs on two AI providers:
| Provider | Characteristics |
|---|---|
| Gemini Nano (local) | • Simple, directive instructions • One reasoning task per prompt • Defensive output parsing (often returns malformed JSON) |
| Gemini 2.0 Flash (cloud) | • Rich, multi‑step instructions • Tool‑calling support • Reliable structured output |
The persona and content sections stay the same, but the prompt framing changes based on the model’s reasoning capacity.
Example: Video recommendations
| Provider | Prompt strategy |
|---|---|
| Gemini Nano (local) | Sequential decomposition – each step is a separate AI call: 1️⃣ Generate search query → 2️⃣ Call API → 3️⃣ Rank results → 4️⃣ Format output |
| Gemini Flash (cloud) | Single tool‑augmented call – the model receives the summary, persona, and a single instruction to generate a query, fetch results via the YouTube tool, rank, and format all in one request |
By adapting prompts to each provider’s strengths, DocuMentor maximizes accuracy, speed, and token efficiency across both local and cloud environments.
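In code, provider‑adaptive framing can be as simple as branching on the provider and emitting either a sequence of single‑task prompts or one tool‑augmented instruction. A hedged sketch with invented prompt text:

```typescript
// Sketch: same context, different framing per provider.
// Provider names and prompt wording are illustrative.
type Provider = "nano" | "flash";

function framePrompts(provider: Provider, context: string): string[] {
  if (provider === "nano") {
    // Local model: one simple, directive task per call.
    return [
      `Summarize this page in 200-300 words:\n${context}`,
      `From the summary, write one YouTube search query. Return only the query.`,
      `Rank these search results for relevance. Return JSON only.`,
    ];
  }
  // Cloud model: one rich, tool-augmented instruction.
  return [
    `Given this summary and persona, generate a search query, fetch results ` +
    `with the YouTube tool, rank them, and return the top 3 as JSON.\n${context}`,
  ];
}
```

The caller then runs the array either as sequential calls (local) or as a single request (cloud), which is exactly the routing the table above describes.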
How It Works
With Gemini Flash, the model generates the query, calls the YouTube tool, ranks results, and formats the output, all in one request; with Gemini Nano, each of those steps runs as its own call.
Users never see this complexity. They click “Video Recommendations,” and the system automatically routes to the appropriate provider and prompt strategy.
What’s Next
This is just the first version of DocuMentor’s context‑engineering system. Two areas I’m exploring for future iterations:
1. User‑customizable feature prompts
Let users add personalized instructions to individual features. For example:
- “In summaries, always include a brief definition of core concepts.”
- “For video recommendations, prioritize short tutorials under 15 minutes.”
- “When suggesting resources, focus on official documentation over blog posts.”
This would let users fine‑tune the experience without overthinking every request.
2. Dynamic personas
Right now, personas are static. But a full‑stack developer might want to view a page as a frontend engineer one day and a backend engineer the next, depending on context.
Future versions could let users switch personas per page or even infer persona adjustments based on the content type (e.g., automatically apply a security‑focused lens when reading about authentication).
The goal remains the same: personalization without overthinking. AI should adapt to you, not the other way around.
Final Thoughts
Building effective AI features isn’t just about picking the right model or writing clever prompts. It’s about engineering the context that goes into those prompts:
- Zero‑prompt UX – Features replace chat boxes, eliminating user guesswork.
- Persona‑driven personalization – Every response adapts to role, skills, goals, and preferences.
- Content intelligence – Strategic decomposition ensures features get exactly what they need.
The result: an AI tool that feels less like a chatbot and more like a knowledgeable colleague who understands what you’re trying to accomplish.
If you want to see this in action, try DocuMentor AI on a technical article or documentation page. And if you find it useful, the best way to support this work is to leave a review and share it with someone who might benefit.
I’d also love to hear from you: What other aspects of building DocuMentor would you like to hear about? Drop a comment or reach out—your feedback shapes what I write next.