Building a Local RAG AI Agent for Airline Reviews with Ollama
Source: Dev.to

I wanted to explore how far I could go with a fully local AI agent using Retrieval‑Augmented Generation (RAG). As a small curiosity‑driven evening project, I built an agent that can answer questions about airline reviews — entirely offline, fast, and inexpensive.
Tech Stack Overview
- Language: Python
- LLM runtime: Ollama
- Models: llama3.2 for question answering, mxbai-embed-large for embeddings
- Vector store: Chroma
- Libraries: langchain, langchain-ollama, langchain-chroma, pandas
- Dataset: Airline Reviews (CSV) from Kaggle
Why Ollama?
I installed Ollama on a Linux cloud server, but it also runs smoothly on most modern PCs and laptops.
- Easy to run locally
- Cheap (no API costs)
- Fast enough for experimentation
Perfect for side projects and learning.
Dataset Preparation
The dataset comes from Kaggle and contains airline reviews in CSV format.
To make vector ingestion faster and lighter, I created a reduced CSV version that keeps only the columns relevant for semantic search (review text, airline name, rating, etc.). This significantly improved:
- Embedding generation time
- Vector store loading speed
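For context, here is a rough sketch of how such a reduction could be done with pandas. The column names are placeholders, not the exact Kaggle column names:

import pandas as pd

# Keep only the columns used for semantic search (column names are illustrative)
df = pd.read_csv("airline_reviews.csv")
reduced = df[["airline_name", "review_text", "rating"]].dropna(subset=["review_text"])
reduced.to_csv("airline_reviews_reduced.csv", index=False)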
High-Level Architecture
- Load airline reviews from CSV using pandas.
- Generate embeddings with mxbai-embed-large.
- Store vectors in Chroma.
- Retrieve relevant reviews for a user question.
- Pass retrieved reviews + question to llama3.2.
- Generate an answer strictly based on retrieved content.
This is classic RAG, but fully local.
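To make the retrieval step concrete, here is a small, self-contained sketch using two toy review strings; the k value is my own choice for illustration:

from langchain_chroma import Chroma
from langchain_ollama import OllamaEmbeddings

# Build a tiny in-memory store from two made-up reviews
store = Chroma.from_texts(
    texts=[
        "Emirates cabin crew were friendly and attentive on my flight.",
        "The seats on the A380 were comfortable for a long-haul trip.",
    ],
    embedding=OllamaEmbeddings(model="mxbai-embed-large"),
)

# Retrieve the reviews most similar to the question
retriever = store.as_retriever(search_kwargs={"k": 2})
for doc in retriever.invoke("How do passengers feel about Emirates crew?"):
    print(doc.page_content)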
Prompt Design
The prompt is explicit and restrictive to avoid hallucinations:
You are an expert in answering questions about airline reviews.
Use the provided reviews to answer the question as accurately as possible.
Here are some relevant reviews: {reviews}
Here is the question to answer: {question}
IMPORTANT: Base your answer ONLY on the reviews provided above. If no reviews are provided, say "No reviews were found."
This single instruction already improved answer reliability a lot.
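If you prefer wiring the prompt by hand instead of a prebuilt chain, a minimal variation (my own sketch, keeping the {reviews} and {question} placeholders exactly as above) looks like this:

from langchain_core.prompts import ChatPromptTemplate
from langchain_ollama import OllamaLLM

template = """You are an expert in answering questions about airline reviews.
Use the provided reviews to answer the question as accurately as possible.
Here are some relevant reviews: {reviews}
Here is the question to answer: {question}
IMPORTANT: Base your answer ONLY on the reviews provided above. If no reviews are provided, say "No reviews were found." """

# Pipe the filled-in prompt straight into the local model
chain = ChatPromptTemplate.from_template(template) | OllamaLLM(model="llama3.2")
answer = chain.invoke({
    "reviews": "Retrieved review text goes here.",
    "question": "How was the onboard food?",
})
print(answer)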
Minimal Python Setup (Conceptual)
import pandas as pd
from langchain_core.prompts import PromptTemplate
from langchain_ollama import OllamaEmbeddings, OllamaLLM
from langchain_chroma import Chroma
from langchain.chains import RetrievalQA

# 1. Load the reduced CSV
df = pd.read_csv("airline_reviews_reduced.csv")

# 2. Create the embedding model
embeddings = OllamaEmbeddings(model="mxbai-embed-large")

# 3. Build the vector store (from_texts takes plain strings; from_documents would need Document objects)
vector_store = Chroma.from_texts(
    texts=df["review_text"].astype(str).tolist(),
    embedding=embeddings,
    collection_name="airline_reviews",
)

# 4. Set up the LLM
llm = OllamaLLM(model="llama3.2")

# 5. RetrievalQA chain — its "stuff" chain expects {context} and {question} placeholders
prompt = PromptTemplate.from_template(
    "You are an expert in answering questions about airline reviews.\n"
    "Use the provided reviews to answer the question as accurately as possible.\n"
    "Here are some relevant reviews: {context}\n"
    "Here is the question to answer: {question}\n"
    "IMPORTANT: Base your answer ONLY on the reviews provided above. "
    'If no reviews are provided, say "No reviews were found."'
)
qa = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=vector_store.as_retriever(),
    chain_type_kwargs={"prompt": prompt},
)

# 6. Ask a question
result = qa.invoke({"query": "How do passengers generally feel about Emirates?"})
print(result["result"])
The code is intentionally basic and readable, focusing on clarity rather than abstraction layers.
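One optional tweak, not shown above: Chroma can persist the collection to disk so embeddings are generated only once. The directory name below is arbitrary; the rest follows the langchain-chroma API.

from langchain_chroma import Chroma
from langchain_ollama import OllamaEmbeddings

embeddings = OllamaEmbeddings(model="mxbai-embed-large")

# First run: embed the reviews and write the collection to disk
vector_store = Chroma.from_texts(
    texts=["review texts go here"],
    embedding=embeddings,
    collection_name="airline_reviews",
    persist_directory="./chroma_airline_reviews",
)

# Later runs: reopen the persisted collection without re-embedding
vector_store = Chroma(
    collection_name="airline_reviews",
    embedding_function=embeddings,
    persist_directory="./chroma_airline_reviews",
)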
Example Results
✅ Example 1: Valid Question
Question: How do passengers generally feel about Emirates?
Result: The agent retrieved multiple relevant reviews and summarized them correctly.

❌ Example 2: No Relevant Data
Question: Is Honda CRA a good SUV car?
Result: Since no relevant reviews were retrieved, the agent responded:
No reviews were found…

This fallback behavior prevents the model from fabricating answers.
GitHub Repository
The full source code is available here:
👉 GitHub Repo: local-AI-agent-RAG
You can clone the repository and run the project following the instructions in the README.
Final Thoughts
This project was built out of curiosity and for fun—a way to experiment with local RAG systems without overcomplicating things. The same approach can scale further: with a larger dataset and more performant hardware, you can build something significantly faster, more accurate, and production‑ready.
For now, it serves as a solid proof of concept and a reminder that meaningful AI projects don’t always need massive infrastructure to get started 🚀