Day 1: Foundations of Agentic AI - RAG and Vector Stores
Source: Dev.to
Note
This blog post is part of the 4‑Day Series – Agentic AI with LangChain/LangGraph.
Welcome to the first day of our series on building Agentic AI applications with LangChain and LangGraph. Over the next four days we’ll evolve a simple script into a sophisticated, human‑in‑the‑loop multi‑agent system.
Today we start with the bedrock of modern AI apps: Retrieval‑Augmented Generation (RAG).
The Problem: LLMs Have Amnesia
Large Language Models (LLMs) like GPT‑4 are frozen in time. They don’t know about your private data, your company’s wiki, or today’s news. If you ask them about it, they might “hallucinate” (make things up).
The Solution: RAG
RAG is a technique to inject knowledge into the LLM’s prompt before it answers. It works in two phases:
- Ingestion – preparing your data for search.
- Retrieval – finding the right data and sending it to the LLM.
Let’s build this from scratch using LangChain.
Prerequisites
npm install langchain @langchain/core @langchain/openai dotenv
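The snippets that follow assume an OpenAI API key in a .env file and a chat model instance called model. Here is a minimal setup sketch (the model name and temperature are just examples, swap in whatever you prefer):
import "dotenv/config"; // loads OPENAI_API_KEY from your .env file
import { ChatOpenAI } from "@langchain/openai";
// The chat model the retrieval chain will use later in this post
const model = new ChatOpenAI({ model: "gpt-4o-mini", temperature: 0 });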
The RAG Architecture
graph LR
A[Documents] -->|Split| B[Chunks]
B -->|Embed| C[Vector Store]
D[User Question] -->|Embed| C
C -->|Retrieve| E[Context]
E -->|Combine| F[LLM Prompt]
F -->|Generate| G[Answer]
Step 1: Loading and Splitting Data
LLMs have a context window (a limit on how much text they can read at once). We can’t feed them an entire book, so we break the text into smaller pieces.
Key utilities
- TextLoader: reads files from disk.
- RecursiveCharacterTextSplitter: smartly breaks text into chunks (e.g., 200 characters each) while trying to keep paragraphs together.
import { TextLoader } from "langchain/document_loaders/fs/text";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
// 1. Load the file
const loader = new TextLoader("info.txt");
const docs = await loader.load();
// 2. Split into chunks
const splitter = new RecursiveCharacterTextSplitter({
chunkSize: 200, // characters per chunk
chunkOverlap: 20, // overlap to preserve context between chunks
});
const splitDocs = await splitter.splitDocuments(docs);
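If you want to sanity-check the splitter, log what it produced. Each entry is a Document with pageContent and metadata (the exact metadata fields depend on the loader):
// Inspect the chunks
console.log(`Created ${splitDocs.length} chunks`);
console.log(splitDocs[0].pageContent); // first ~200-character chunk
console.log(splitDocs[0].metadata); // e.g. { source: "info.txt", ... }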
Step 2: Embeddings and Vector Stores
Keyword search (Ctrl + F) is brittle. Instead we use semantic search: convert text into embeddings (lists of numbers) where similar concepts are mathematically close.
Key utilities
- OpenAIEmbeddings: turns text into vectors.
- MemoryVectorStore: a simple in‑memory database that stores these vectors.
import { OpenAIEmbeddings } from "@langchain/openai";
import { MemoryVectorStore } from "langchain/vectorstores/memory";
// 3. Index the chunks
const vectorStore = await MemoryVectorStore.fromDocuments(
splitDocs,
new OpenAIEmbeddings()
);
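Before wiring up a full chain, you can query the store directly with similaritySearch, which embeds the question and returns the k closest chunks (the question text here is just an example):
// Fetch the 2 chunks closest in meaning to the question
const matches = await vectorStore.similaritySearch(
  "What is LangGraph inspired by?",
  2
);
matches.forEach((doc, i) => console.log(`Match ${i + 1}:`, doc.pageContent));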
Step 3: The Retrieval Chain
When a user asks a question:
- Turn the question into an embedding.
- Find the most similar chunks in the vector store.
- Paste those chunks into a prompt template.
- Send the prompt to the LLM.
LangChain abstracts this with createRetrievalChain.
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { createStuffDocumentsChain } from "langchain/chains/combine_documents";
import { createRetrievalChain } from "langchain/chains/retrieval";
// Create the "Answerer" (LLM + Prompt)
const combineDocsChain = await createStuffDocumentsChain({
llm: model,
prompt: ChatPromptTemplate.fromTemplate(`
Answer based on this context:
{context}
Question: {input}
`),
});
// Create the full pipeline (Retriever + Answerer)
const retrievalChain = await createRetrievalChain({
retriever: vectorStore.asRetriever(),
combineDocsChain,
});
// Run it!
const response = await retrievalChain.invoke({
input: "What is LangGraph inspired by?",
});
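The result object from createRetrievalChain includes the generated answer and the retrieved documents (exposed as answer and context in recent LangChain versions), so you can check what the LLM actually saw:
console.log(response.answer); // the LLM's answer
console.log(response.context.map((d) => d.pageContent)); // the chunks used as context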
Summary
You’ve just built a system that can “read”. It doesn’t rely solely on the LLM’s training data; it uses your data. This is the foundation. Tomorrow we’ll turn this linear script into an agent that can decide when to use this knowledge.