Beyond Simple RAG: Building an Agentic Workflow with Next.js, Python, and Supabase

Published: December 24, 2025 at 02:51 PM EST
2 min read
Source: Dev.to

Cover image: the flow of the RAG application

The Problem: “Chat with PDF” is the new Hello World

Building a basic RAG app is easy today: upload a 5‑page PDF, split it into 1,000‑character chunks, embed them, and it works.

When I tried this with a 500‑page university textbook, the standard pipeline fell apart. I didn’t want a chatbot; I wanted a tutor. So I built Learneazy.io.

The Secret Sauce: 3‑Layer Semantic Indexing

Most RAG apps treat a document as one giant blob of text. Textbooks, however, have a natural hierarchy (Index → Chapters → Content). I mirrored that structure in the database using a Python (Flask) microservice with PyMuPDF.

Layer 1 – The “Skeleton” (Table of Contents)

Purpose: Quick, high‑level structural queries.
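
PyMuPDF makes this layer cheap to build. A minimal sketch, assuming the TOC is embedded in the PDF (the function name is mine, not from the Learneazy source):

```python
# Sketch: extract the "skeleton" (table of contents) with PyMuPDF.
import fitz  # PyMuPDF

def extract_skeleton(pdf_path: str) -> list[dict]:
    doc = fitz.open(pdf_path)
    # get_toc() returns [level, title, start_page] for each heading
    entries = [
        {"level": lvl, "title": title, "start_page": page}
        for lvl, title, page in doc.get_toc()
    ]
    # Derive each heading's end page from the next heading at the same depth or above.
    for i, e in enumerate(entries):
        nxt = next((x for x in entries[i + 1:] if x["level"] <= e["level"]), None)
        e["end_page"] = nxt["start_page"] - 1 if nxt else doc.page_count
    return entries
```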

Layer 2 – The “Container” (Chapter‑wise Chunks)

Purpose: Context‑aware searches. When you ask about “Thermodynamics in Chapter 4,” only Chapter 4 is searched.

Layer 3 – The “Deep Dive” (Granular Chunks)

Purpose: Answering specific, deep‑dive questions where every nuance matters.
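
From the skeleton's page ranges, the two lower layers fall out naturally. A sketch, assuming LangChain's text splitter and illustrative chunk sizes (the post doesn't state the production values):

```python
# Sketch: build Layer 2 (chapter-wise) and Layer 3 (granular) chunks
# from the skeleton extracted above. Chunk sizes are illustrative guesses.
import fitz  # PyMuPDF
from langchain_text_splitters import RecursiveCharacterTextSplitter

def chapter_text(doc: fitz.Document, start: int, end: int) -> str:
    # The TOC reports 1-indexed pages; PyMuPDF pages are 0-indexed.
    return "\n".join(doc[p].get_text() for p in range(start - 1, end))

def build_layers(doc: fitz.Document, skeleton: list[dict]):
    splitter = RecursiveCharacterTextSplitter(chunk_size=800, chunk_overlap=100)
    chapters, deep_chunks = [], []
    for ch in skeleton:
        text = chapter_text(doc, ch["start_page"], ch["end_page"])
        chapters.append({"title": ch["title"], "content": text})   # Layer 2
        for piece in splitter.split_text(text):                    # Layer 3
            deep_chunks.append({"chapter": ch["title"], "content": piece})
    return chapters, deep_chunks
```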

The Brain: Agentic Routing

A hierarchical index is useless without a mechanism to decide which layer to query. I implemented a LangChain Agent equipped with custom tools for each layer. The agent acts as a router:

  • User: “How many chapters are there?” → Agent: Calls Index Tool (fast, cheap).
  • User: “Summarize Chapter 3.” → Agent: Calls Chapter Tool (high context).
  • User: “Explain the formula for X.” → Agent: Calls Deep Dive Tool (high precision).

This routing logic reduced token usage by ~40% and dramatically improved accuracy.
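
The post doesn't include the agent code, but a tool-calling router along these lines matches the description. This is a sketch: the retrieval helpers, prompt wording, and model name are my assumptions, not Learneazy's implementation.

```python
# Sketch of the router: one LangChain tool per layer, Gemini as the LLM.
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.tools import tool
from langchain_google_genai import ChatGoogleGenerativeAI

# Hypothetical retrieval helpers standing in for the real layer queries.
def search_index(q):       return f"TOC lookup for: {q}"
def search_chapters(c):    return f"Full text of chapter: {c}"
def search_deep_chunks(q): return f"Top-k granular chunks for: {q}"

@tool
def index_tool(query: str) -> str:
    """Answer structural questions (chapter count, titles) from the table of contents."""
    return search_index(query)

@tool
def chapter_tool(chapter: str) -> str:
    """Return one chapter's full text for summaries and broad questions."""
    return search_chapters(chapter)

@tool
def deep_dive_tool(query: str) -> str:
    """Vector-search the granular chunks for precise, detail-level questions."""
    return search_deep_chunks(query)

llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash")
prompt = ChatPromptTemplate.from_messages([
    ("system", "Pick the cheapest tool that can answer the question."),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])
tools = [index_tool, chapter_tool, deep_dive_tool]
executor = AgentExecutor(agent=create_tool_calling_agent(llm, tools, prompt), tools=tools)
answer = executor.invoke({"input": "How many chapters are there?"})["output"]
```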

Beyond Chat: The Flashcard Engine

The hardest technical challenge was enabling users to say: “Generate 10 flashcards for Chapter 5.” The AI couldn’t simply guess; it needed a grounded workflow.

  1. Topic Extraction: Scan the Chapter Layer to identify key themes (e.g., “Mitochondria,” “Krebs Cycle”).
  2. Context Retrieval: Perform a targeted vector search in the Deep Dive Layer for those topics to obtain precise definitions.
  3. Synthesis: Use the LLM (Google Gemini) to format the grounded facts into strict Q&A pairs.

The result: flashcards generated from the user’s specific material, not generic internet knowledge.
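
Stitched together, the workflow might look like the sketch below. The Cohere embed call and Supabase client calls are real APIs; the table name, the match_deep_chunks RPC, the prompts, and the helper shape are my assumptions.

```python
# Sketch of the grounded flashcard workflow: extract topics, retrieve context,
# synthesize Q&A pairs. `llm` is a chat model such as the Gemini client above.
import cohere
from supabase import create_client

co = cohere.Client("YOUR_COHERE_KEY")  # placeholder credentials
supabase = create_client("YOUR_SUPABASE_URL", "YOUR_SUPABASE_KEY")

def generate_flashcards(chapter_id: int, n: int, llm) -> str:
    # 1. Topic extraction: key themes from the Chapter Layer (Layer 2).
    chapter = supabase.table("chapters").select("content").eq("id", chapter_id).execute()
    topics = llm.invoke(f"List {n} key topics in:\n{chapter.data[0]['content'][:8000]}")

    # 2. Context retrieval: targeted vector search in the Deep Dive Layer (Layer 3).
    emb = co.embed(texts=[topics.content], model="embed-english-v3.0",
                   input_type="search_query").embeddings[0]
    hits = supabase.rpc("match_deep_chunks", {
        "query_embedding": emb, "match_count": 10, "filter_chapter": chapter_id,
    }).execute()
    context = "\n".join(h["content"] for h in hits.data)

    # 3. Synthesis: format the grounded facts into strict Q&A pairs.
    return llm.invoke(
        f"Using ONLY this context, write {n} flashcards as 'Q: ... A: ...' pairs:\n{context}"
    ).content
```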

The Stack: Why Microservices?

| Component | Technology | Rationale |
| --- | --- | --- |
| Frontend | Next.js 16 (React 19) | Snappy, responsive UI |
| Processing Service | Python (Flask) | Superior PDF manipulation and chunking logic |
| Embeddings | Cohere (embed-english-v3.0) | Fine‑tuned for RAG retrieval quality; better than OpenAI for this use case |
| Database | Supabase (PostgreSQL + pgvector) | Stores vectors alongside user data (auth, metadata) for a streamlined backend |

Source code: github.com/Abhinav-Sriharsha/Learneazy
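
For completeness, a hedged sketch of the pgvector side that the flashcard RPC above assumes. The dimension 1024 matches embed-english-v3.0; the table, function, and connection string are illustrative, not the project's actual schema.

```python
# Sketch: run once as a migration against the Supabase Postgres instance.
import psycopg2

DDL = """
create extension if not exists vector;

create table if not exists deep_chunks (
    id bigserial primary key,
    chapter_id bigint,
    content text,
    embedding vector(1024)  -- embed-english-v3.0 produces 1024-dim vectors
);

create or replace function match_deep_chunks(
    query_embedding vector(1024),
    match_count int default 5,
    filter_chapter bigint default null
) returns table (id bigint, content text, similarity float)
language sql stable as $$
    select deep_chunks.id,
           deep_chunks.content,
           1 - (deep_chunks.embedding <=> query_embedding) as similarity
    from deep_chunks
    where filter_chapter is null or deep_chunks.chapter_id = filter_chapter
    order by deep_chunks.embedding <=> query_embedding
    limit match_count;
$$;
"""

with psycopg2.connect("postgresql://...") as conn:  # Supabase connection string (elided)
    with conn.cursor() as cur:
        cur.execute(DDL)  # committed when the connection context exits
```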
