Day 1: Foundations of Agentic AI - RAG and Vector Stores

Published: December 3, 2025 at 08:12 PM EST
3 min read
Source: Dev.to

Note

This blog post is part of the 4‑Day Series – Agentic AI with LangChain/LangGraph.

Welcome to the first day of our series on building Agentic AI applications with LangChain and LangGraph. Over the next four days we’ll evolve a simple script into a sophisticated, human‑in‑the‑loop multi‑agent system.

Today we start with the bedrock of modern AI apps: Retrieval‑Augmented Generation (RAG).

The Problem: LLMs Have Amnesia

Large Language Models (LLMs) like GPT‑4 are frozen in time. They don’t know about your private data, your company’s wiki, or today’s news. If you ask them about it, they might “hallucinate” (make things up).

The Solution: RAG

RAG is a technique to inject knowledge into the LLM’s prompt before it answers. It works in two phases:

  1. Ingestion – preparing your data for search.
  2. Retrieval – finding the right data and sending it to the LLM.

Let’s build this from scratch using LangChain.

Prerequisites

npm install @langchain/openai langchain dotenv
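
The OpenAI classes used below read your API key from process.env. A minimal setup sketch, assuming the key lives in a .env file at the project root (file name and placeholder value are just examples):

# .env
OPENAI_API_KEY=sk-...

Load it once at the start of your script, before creating any LangChain objects:

import "dotenv/config";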

The RAG Architecture

graph LR
    A[Documents] -->|Split| B[Chunks]
    B -->|Embed| C[Vector Store]
    D[User Question] -->|Embed| C
    C -->|Retrieve| E[Context]
    E -->|Combine| F[LLM Prompt]
    F -->|Generate| G[Answer]

Step 1: Loading and Splitting Data

LLMs have a context window (a limit on how much text they can read at once). We can’t feed them an entire book in a single prompt; we need to break it into smaller chunks.

Key utilities

  • TextLoader: reads files from disk.
  • RecursiveCharacterTextSplitter: smartly breaks text into chunks (e.g., 200 characters) while trying to keep paragraphs together.

import { TextLoader } from "langchain/document_loaders/fs/text";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";

// 1. Load the file
const loader = new TextLoader("info.txt");
const docs = await loader.load();

// 2. Split into chunks
const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 200,    // characters per chunk
  chunkOverlap: 20,  // overlap to preserve context between chunks
});
const splitDocs = await splitter.splitDocuments(docs);
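
It’s worth sanity‑checking the split before indexing anything. A quick sketch (the exact counts depend on your info.txt):

// Each chunk is a Document with pageContent and metadata
console.log(`Produced ${splitDocs.length} chunks`);
console.log(splitDocs[0].pageContent);      // roughly the first 200 characters
console.log(splitDocs[0].metadata.source);  // "info.txt"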

Step 2: Embeddings and Vector Stores

Keyword search (Ctrl + F) is brittle. Instead we use semantic search: convert text into embeddings (lists of numbers) where similar concepts are mathematically close.
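
For intuition, you can call the embeddings model directly and look at the raw vector. A minimal sketch using the OpenAIEmbeddings class introduced below (assumes OPENAI_API_KEY is set):

import { OpenAIEmbeddings } from "@langchain/openai";

const embeddings = new OpenAIEmbeddings();
const vector = await embeddings.embedQuery("What is LangGraph?");

console.log(vector.length);       // vector dimensionality (e.g., 1536)
console.log(vector.slice(0, 5));  // just numbers, e.g. [0.012, -0.034, ...]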

Key utilities

  • OpenAIEmbeddings: turns text into vectors.
  • MemoryVectorStore: a simple in‑memory database to store these vectors.

import { OpenAIEmbeddings } from "@langchain/openai";
import { MemoryVectorStore } from "langchain/vectorstores/memory";

// 3. Index the chunks
const vectorStore = await MemoryVectorStore.fromDocuments(
  splitDocs,
  new OpenAIEmbeddings()
);
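
Before wiring the store into a chain, you can query it directly to see semantic search in action. A small sketch (the question wording is just an example):

// Find the 2 chunks most similar to the question
const results = await vectorStore.similaritySearch(
  "What is LangGraph inspired by?",
  2
);

for (const doc of results) {
  console.log(doc.pageContent);
}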

Step 3: The Retrieval Chain

When a user asks a question:

  1. Turn the question into an embedding.
  2. Find the most similar chunks in the VectorStore.
  3. Paste those chunks into a prompt template.
  4. Send the prompt to the LLM.

LangChain abstracts this with createRetrievalChain.

import { ChatOpenAI } from "@langchain/openai";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { createStuffDocumentsChain } from "langchain/chains/combine_documents";
import { createRetrievalChain } from "langchain/chains/retrieval";

// The chat model that answers questions (any OpenAI chat model works here)
const model = new ChatOpenAI({ model: "gpt-4o-mini" });

// Create the "Answerer" (LLM + Prompt)
const combineDocsChain = await createStuffDocumentsChain({
  llm: model,
  prompt: ChatPromptTemplate.fromTemplate(`
    Answer based on this context:
    {context}

    Question: {input}
  `),
});

// Create the full pipeline (Retriever + Answerer)
const retrievalChain = await createRetrievalChain({
  retriever: vectorStore.asRetriever(),
  combineDocsChain,
});

// Run it!
const response = await retrievalChain.invoke({
  input: "What is LangGraph inspired by?",
});
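
The result contains both the generated answer and the retrieved chunks (field names per createRetrievalChain’s default output keys), which is handy for seeing what the LLM was actually given:

console.log(response.answer);   // the grounded answer
console.log(response.context);  // the Document chunks that were retrieved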

Summary

You’ve just built a system that can “read”. It doesn’t rely solely on the LLM’s training data; it uses your data. This is the foundation. Tomorrow we’ll turn this linear script into an agent that can decide when to use this knowledge.

Source Code
