Beyond Simple RAG: Building an Agentic Workflow with Next.js, Python, and Supabase

The Problem: “Chat with PDF” is the new Hello World
Building a basic RAG app is easy today. You upload a 5‑page PDF, split it into 1,000‑character chunks, and it works.
When I tried this with a 500‑page university textbook, the standard pipeline fell apart. I didn’t want a chatbot; I wanted a tutor. So I built Learneazy.io.
The Secret Sauce: 3‑Layer Semantic Indexing
Most RAG apps treat a document as one giant blob of text. Textbooks, however, have a natural hierarchy (Index → Chapters → Content). I mirrored that structure in the database using a Python (Flask) microservice built on PyMuPDF; a sketch of the extraction follows the layer descriptions below.
Layer 1 – The “Skeleton” (Table of Contents)
Purpose: Quick, high‑level structural queries.
Layer 2 – The “Container” (Chapter‑wise Chunks)
Purpose: Context‑aware searches. When you ask about “Thermodynamics in Chapter 4,” only Chapter 4 is searched.
Layer 3 – The “Deep Dive” (Granular Chunks)
Purpose: Answering specific, deep‑dive questions where every nuance matters.
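Here's a minimal sketch of that extraction step, assuming a PDF with an embedded table of contents. The function name, the 1,000-character chunk size, and the 200-character overlap are illustrative, not the production code:

```python
import fitz  # PyMuPDF

def build_layers(pdf_path: str):
    doc = fitz.open(pdf_path)

    # Layer 1 - the skeleton: PyMuPDF exposes the embedded TOC as
    # [level, title, start_page] entries.
    skeleton = [{"level": lvl, "title": title, "page": page}
                for lvl, title, page in doc.get_toc()]

    # Layer 2 - the container: slice the full text between consecutive
    # top-level TOC entries to get one chunk per chapter.
    chapters = []
    top = [e for e in skeleton if e["level"] == 1]
    for i, entry in enumerate(top):
        start = entry["page"] - 1  # TOC pages are 1-based
        end = top[i + 1]["page"] - 1 if i + 1 < len(top) else doc.page_count
        text = "".join(doc[p].get_text() for p in range(start, end))
        chapters.append({"title": entry["title"], "text": text})

    # Layer 3 - the deep dive: split each chapter into small,
    # overlapping chunks for precise vector retrieval.
    granular = []
    size, step = 1000, 800  # 200-char overlap
    for ch in chapters:
        for offset in range(0, len(ch["text"]), step):
            granular.append({"chapter": ch["title"],
                             "text": ch["text"][offset:offset + size]})

    return skeleton, chapters, granular
```

Each layer then lands in its own Supabase table, so a query can target exactly the granularity it needs.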
The Brain: Agentic Routing
A hierarchical index is useless without a mechanism to decide which layer to query. I implemented a LangChain Agent equipped with custom tools for each layer. The agent acts as a router:
- User: “How many chapters are there?” → Agent: Calls Index Tool (fast, cheap).
- User: “Summarize Chapter 3.” → Agent: Calls Chapter Tool (high context).
- User: “Explain the formula for X.” → Agent: Calls Deep Dive Tool (high precision).
This routing logic reduced token usage by ~40% and dramatically improved accuracy.
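Here's a compressed sketch of that router, assuming LangChain's tool-calling agent API with Gemini as the LLM. The `query_layer` helper, the tool names, and the prompt wording are placeholders, not the production code:

```python
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.tools import tool
from langchain_google_genai import ChatGoogleGenerativeAI

def query_layer(layer: str, question: str) -> str:
    # Hypothetical stand-in for the real vector search against each layer.
    return f"[results from {layer} for: {question}]"

@tool
def index_tool(question: str) -> str:
    """Answer structural questions (chapter counts, titles) from the TOC layer."""
    return query_layer("skeleton", question)

@tool
def chapter_tool(question: str) -> str:
    """Answer broad questions (summaries, themes) from chapter-level chunks."""
    return query_layer("chapters", question)

@tool
def deep_dive_tool(question: str) -> str:
    """Answer precise questions (formulas, definitions) from granular chunks."""
    return query_layer("granular", question)

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a tutor. Answer with the cheapest tool that suffices."),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

tools = [index_tool, chapter_tool, deep_dive_tool]
llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash")
agent = AgentExecutor(agent=create_tool_calling_agent(llm, tools, prompt),
                      tools=tools)

agent.invoke({"input": "How many chapters are there?"})  # routes to index_tool
```

The tool docstrings do the heavy lifting: the LLM reads them to decide which layer fits the question, so there is no hand-written if/else routing.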
Beyond Chat: The Flashcard Engine
The hardest technical challenge was enabling users to say: “Generate 10 flashcards for Chapter 5.” The AI couldn’t simply guess; it needed a grounded workflow.
- Topic Extraction: Scan the Chapter Layer to identify key themes (e.g., “Mitochondria,” “Krebs Cycle”).
- Context Retrieval: Perform a targeted vector search in the Deep Dive Layer for those topics to obtain precise definitions.
- Synthesis: Use the LLM (Google Gemini) to format the grounded facts into strict Q&A pairs.
The result: flashcards generated from the user’s specific material, not generic internet knowledge.
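Stitched together, the engine looks roughly like this. The `chapters` table name, the `match_chunks` pgvector RPC, and the Gemini model name are assumptions for illustration:

```python
import cohere
import google.generativeai as genai
from supabase import create_client

co = cohere.Client("COHERE_API_KEY")
supabase = create_client("SUPABASE_URL", "SUPABASE_SERVICE_KEY")
genai.configure(api_key="GEMINI_API_KEY")
llm = genai.GenerativeModel("gemini-1.5-flash")

def generate_flashcards(chapter_title: str, n: int = 10) -> str:
    # 1. Topic extraction: pull key themes from the chapter-level chunk.
    chapter = (supabase.table("chapters").select("text")
               .eq("title", chapter_title).single().execute().data)
    topics = llm.generate_content(
        f"List the {n} key topics in this chapter, one per line:\n"
        f"{chapter['text'][:8000]}"
    ).text.splitlines()

    # 2. Context retrieval: targeted vector search in the deep-dive layer.
    contexts = []
    for topic in topics:
        vec = co.embed(texts=[topic], model="embed-english-v3.0",
                       input_type="search_query").embeddings[0]
        hits = supabase.rpc("match_chunks", {  # hypothetical pgvector RPC
            "query_embedding": vec,
            "chapter": chapter_title,
            "match_count": 3,
        }).execute().data
        contexts.extend(h["text"] for h in hits)

    # 3. Synthesis: format the grounded facts into strict Q&A pairs.
    return llm.generate_content(
        f"Using ONLY the material below, write {n} flashcards as "
        f"'Q: ... / A: ...' pairs:\n" + "\n---\n".join(contexts)
    ).text
```

One detail worth noting: Cohere's v3 embeddings distinguish `input_type="search_document"` (at indexing time) from `input_type="search_query"` (at query time), and mixing them up degrades retrieval quality.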
The Stack: Why Microservices?
| Component | Technology | Rationale |
|---|---|---|
| Frontend | Next.js 16 (React 19) | Snappy, responsive UI |
| Processing Service | Python (Flask) | Superior PDF manipulation and chunking logic |
| Embeddings | Cohere (embed-english-v3.0) | Fine‑tuned for RAG retrieval quality, better than OpenAI for this use case |
| Database | Supabase (PostgreSQL + pgVector) | Stores vectors alongside user data (auth, metadata) for a streamlined backend |

Source code: github.com/Abhinav-Sriharsha/Learneazy
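To make the microservice split concrete, the processing service boils down to a small Flask app that the Next.js frontend calls on upload. The route name and response shape here are illustrative:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/ingest", methods=["POST"])
def ingest():
    # Next.js posts the raw PDF; Flask owns parsing and chunking because
    # Python's PDF ecosystem (PyMuPDF) is much stronger than Node's.
    pdf = request.files["file"]
    pdf.save("/tmp/upload.pdf")
    skeleton, chapters, granular = build_layers("/tmp/upload.pdf")  # see earlier sketch
    # Embedding and upserting each layer into Supabase would happen here.
    return jsonify({"chapters": len(chapters), "chunks": len(granular)})

if __name__ == "__main__":
    app.run(port=5001)
```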