Beyond Simple RAG: Building an Agentic Workflow with Next.js, Python, and Supabase

Published: December 24, 2025 at 02:51 PM EST
2 min read
Source: Dev.to

Cover image: the flow of the RAG application

The Problem: “Chat with PDF” is the new Hello World

Building a basic RAG app is easy today: upload a 5‑page PDF, split it into 1,000‑character chunks, embed them, and it works.

When I tried this with a 500‑page university textbook, the standard pipeline fell apart. I didn’t want a chatbot; I wanted a tutor. So I built Learneazy.io.

The Secret Sauce: 3‑Layer Semantic Indexing

Most RAG apps treat a document as one giant blob of text. Textbooks, however, have a natural hierarchy (Index → Chapters → Content). I mirrored that structure in the database using a Python (Flask) microservice with PyMuPDF.

Layer 1 – The “Skeleton” (Table of Contents)

Purpose: Quick, high‑level structural queries.
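
PyMuPDF makes this layer cheap to build. A minimal sketch, assuming the TOC is embedded in the PDF (the function name is mine, not from the Learneazy source):

```python
# Sketch: extract the "skeleton" (table of contents) with PyMuPDF.
import fitz  # PyMuPDF

def extract_skeleton(pdf_path: str) -> list[dict]:
    doc = fitz.open(pdf_path)
    # get_toc() returns [level, title, start_page] for each heading
    entries = [
        {"level": lvl, "title": title, "start_page": page}
        for lvl, title, page in doc.get_toc()
    ]
    # Derive each heading's end page from the next heading at the same depth or above.
    for i, e in enumerate(entries):
        nxt = next((x for x in entries[i + 1:] if x["level"] <= e["level"]), None)
        e["end_page"] = nxt["start_page"] - 1 if nxt else doc.page_count
    return entries
```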

Layer 2 – The “Container” (Chapter‑wise Chunks)

Purpose: Context‑aware searches. When you ask about “Thermodynamics in Chapter 4,” only Chapter 4 is searched.

Layer 3 – The “Deep Dive” (Granular Chunks)

Purpose: Answering specific, deep‑dive questions where every nuance matters.
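
From the skeleton's page ranges, the two lower layers fall out naturally. A sketch, assuming LangChain's text splitter and illustrative chunk sizes (the post doesn't state the production values):

```python
# Sketch: build Layer 2 (chapter-wise) and Layer 3 (granular) chunks
# from the skeleton extracted above. Chunk sizes are illustrative guesses.
import fitz  # PyMuPDF
from langchain_text_splitters import RecursiveCharacterTextSplitter

def chapter_text(doc: fitz.Document, start: int, end: int) -> str:
    # The TOC reports 1-indexed pages; PyMuPDF pages are 0-indexed.
    return "\n".join(doc[p].get_text() for p in range(start - 1, end))

def build_layers(doc: fitz.Document, skeleton: list[dict]):
    splitter = RecursiveCharacterTextSplitter(chunk_size=800, chunk_overlap=100)
    chapters, deep_chunks = [], []
    for ch in skeleton:
        text = chapter_text(doc, ch["start_page"], ch["end_page"])
        chapters.append({"title": ch["title"], "content": text})   # Layer 2
        for piece in splitter.split_text(text):                    # Layer 3
            deep_chunks.append({"chapter": ch["title"], "content": piece})
    return chapters, deep_chunks
```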

The Brain: Agentic Routing

A hierarchical index is useless without a mechanism to decide which layer to query. I implemented a LangChain Agent equipped with custom tools for each layer. The agent acts as a router:

  • User: “How many chapters are there?” → Agent: Calls Index Tool (fast, cheap).
  • User: “Summarize Chapter 3.” → Agent: Calls Chapter Tool (high context).
  • User: “Explain the formula for X.” → Agent: Calls Deep Dive Tool (high precision).

This routing logic reduced token usage by ~40% and dramatically improved accuracy.
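
The post doesn't include the agent code, but a tool-calling router along these lines matches the description. This is a sketch: the retrieval helpers, prompt wording, and model name are my assumptions, not Learneazy's implementation.

```python
# Sketch of the router: one LangChain tool per layer, Gemini as the LLM.
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.tools import tool
from langchain_google_genai import ChatGoogleGenerativeAI

# Hypothetical retrieval helpers standing in for the real layer queries.
def search_index(q):       return f"TOC lookup for: {q}"
def search_chapters(c):    return f"Full text of chapter: {c}"
def search_deep_chunks(q): return f"Top-k granular chunks for: {q}"

@tool
def index_tool(query: str) -> str:
    """Answer structural questions (chapter count, titles) from the table of contents."""
    return search_index(query)

@tool
def chapter_tool(chapter: str) -> str:
    """Return one chapter's full text for summaries and broad questions."""
    return search_chapters(chapter)

@tool
def deep_dive_tool(query: str) -> str:
    """Vector-search the granular chunks for precise, detail-level questions."""
    return search_deep_chunks(query)

llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash")
prompt = ChatPromptTemplate.from_messages([
    ("system", "Pick the cheapest tool that can answer the question."),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])
tools = [index_tool, chapter_tool, deep_dive_tool]
executor = AgentExecutor(agent=create_tool_calling_agent(llm, tools, prompt), tools=tools)
answer = executor.invoke({"input": "How many chapters are there?"})["output"]
```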

Beyond Chat: The Flashcard Engine

The hardest technical challenge was enabling users to say: “Generate 10 flashcards for Chapter 5.” The AI couldn’t simply guess; it needed a grounded workflow.

  1. Topic Extraction: Scan the Chapter Layer to identify key themes (e.g., “Mitochondria,” “Krebs Cycle”).
  2. Context Retrieval: Perform a targeted vector search in the Deep Dive Layer for those topics to obtain precise definitions.
  3. Synthesis: Use the LLM (Google Gemini) to format the grounded facts into strict Q&A pairs.

The result: flashcards generated from the user’s specific material, not generic internet knowledge.
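
Stitched together, the workflow might look like the sketch below. The Cohere embed call and Supabase client calls are real APIs; the table name, the match_deep_chunks RPC, the prompts, and the helper shape are my assumptions.

```python
# Sketch of the grounded flashcard workflow: extract topics, retrieve context,
# synthesize Q&A pairs. `llm` is a chat model such as the Gemini client above.
import cohere
from supabase import create_client

co = cohere.Client("YOUR_COHERE_KEY")  # placeholder credentials
supabase = create_client("YOUR_SUPABASE_URL", "YOUR_SUPABASE_KEY")

def generate_flashcards(chapter_id: int, n: int, llm) -> str:
    # 1. Topic extraction: key themes from the Chapter Layer (Layer 2).
    chapter = supabase.table("chapters").select("content").eq("id", chapter_id).execute()
    topics = llm.invoke(f"List {n} key topics in:\n{chapter.data[0]['content'][:8000]}")

    # 2. Context retrieval: targeted vector search in the Deep Dive Layer (Layer 3).
    emb = co.embed(texts=[topics.content], model="embed-english-v3.0",
                   input_type="search_query").embeddings[0]
    hits = supabase.rpc("match_deep_chunks", {
        "query_embedding": emb, "match_count": 10, "filter_chapter": chapter_id,
    }).execute()
    context = "\n".join(h["content"] for h in hits.data)

    # 3. Synthesis: format the grounded facts into strict Q&A pairs.
    return llm.invoke(
        f"Using ONLY this context, write {n} flashcards as 'Q: ... A: ...' pairs:\n{context}"
    ).content
```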

The Stack: Why Microservices?

| Component | Technology | Rationale |
| --- | --- | --- |
| Frontend | Next.js 16 (React 19) | Snappy, responsive UI |
| Processing Service | Python (Flask) | Superior PDF manipulation and chunking logic |
| Embeddings | Cohere (embed-english-v3.0) | Fine‑tuned for RAG retrieval quality; better than OpenAI for this use case |
| Database | Supabase (PostgreSQL + pgvector) | Stores vectors alongside user data (auth, metadata) for a streamlined backend |

Source code: github.com/Abhinav-Sriharsha/Learneazy
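
For completeness, a hedged sketch of the pgvector side that the flashcard RPC above assumes. The dimension 1024 matches embed-english-v3.0; the table, function, and connection string are illustrative, not the project's actual schema.

```python
# Sketch: run once as a migration against the Supabase Postgres instance.
import psycopg2

DDL = """
create extension if not exists vector;

create table if not exists deep_chunks (
    id bigserial primary key,
    chapter_id bigint,
    content text,
    embedding vector(1024)  -- embed-english-v3.0 produces 1024-dim vectors
);

create or replace function match_deep_chunks(
    query_embedding vector(1024),
    match_count int default 5,
    filter_chapter bigint default null
) returns table (id bigint, content text, similarity float)
language sql stable as $$
    select deep_chunks.id,
           deep_chunks.content,
           1 - (deep_chunks.embedding <=> query_embedding) as similarity
    from deep_chunks
    where filter_chapter is null or deep_chunks.chapter_id = filter_chapter
    order by deep_chunks.embedding <=> query_embedding
    limit match_count;
$$;
"""

with psycopg2.connect("postgresql://...") as conn:  # Supabase connection string (elided)
    with conn.cursor() as cur:
        cur.execute(DDL)  # committed when the connection context exits
```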
