Building a Local RAG AI Agent for Airline Reviews with Ollama
Source: Dev.to

I wanted to explore how far I could go with a fully local AI agent using Retrieval‑Augmented Generation (RAG). As a small curiosity‑driven evening project, I built an agent that can answer questions about airline reviews — entirely offline, fast, and inexpensive.
Tech Stack Overview
- Language: Python
- LLM runtime: Ollama
- Models: llama3.2 for question answering, mxbai-embed-large for embeddings
- Vector store: Chroma
- Libraries: langchain, langchain-ollama, langchain-chroma, pandas
- Dataset: Airline Reviews (CSV) from Kaggle
Why Ollama?
I installed Ollama on a Linux cloud server, but it also runs smoothly on most modern PCs and laptops.
- Easy to run locally
- Cheap (no API costs)
- Fast enough for experimentation
Perfect for side projects and learning.
Dataset Preparation
The dataset comes from Kaggle and contains airline reviews in CSV format.
To make vector ingestion faster and lighter, I created a reduced CSV version that keeps only the columns relevant for semantic search (review text, airline name, rating, etc.). This significantly improved:
- Embedding generation time
- Vector store loading speed
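For context, here is a rough sketch of how such a reduction could be done with pandas. The column names are placeholders, not the exact Kaggle column names:

import pandas as pd

# Keep only the columns used for semantic search (column names are illustrative)
df = pd.read_csv("airline_reviews.csv")
reduced = df[["airline_name", "review_text", "rating"]].dropna(subset=["review_text"])
reduced.to_csv("airline_reviews_reduced.csv", index=False)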
High-Level Architecture
- Load airline reviews from CSV using pandas.
- Generate embeddings with mxbai-embed-large.
- Store vectors in Chroma.
- Retrieve relevant reviews for a user question.
- Pass retrieved reviews + question to llama3.2.
- Generate an answer strictly based on retrieved content.
This is classic RAG, but fully local.
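To make the retrieval step concrete, here is a small, self-contained sketch using two toy review strings; the k value is my own choice for illustration:

from langchain_chroma import Chroma
from langchain_ollama import OllamaEmbeddings

# Build a tiny in-memory store from two made-up reviews
store = Chroma.from_texts(
    texts=[
        "Emirates cabin crew were friendly and attentive on my flight.",
        "The seats on the A380 were comfortable for a long-haul trip.",
    ],
    embedding=OllamaEmbeddings(model="mxbai-embed-large"),
)

# Retrieve the reviews most similar to the question
retriever = store.as_retriever(search_kwargs={"k": 2})
for doc in retriever.invoke("How do passengers feel about Emirates crew?"):
    print(doc.page_content)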
Prompt Design
The prompt is explicit and restrictive to avoid hallucinations:
You are an expert in answering questions about airline reviews.
Use the provided reviews to answer the question as accurately as possible.
Here are some relevant reviews: {reviews}
Here is the question to answer: {question}
IMPORTANT: Base your answer ONLY on the reviews provided above. If no reviews are provided, say "No reviews were found."
This single instruction already improved answer reliability a lot.
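If you prefer wiring the prompt by hand instead of a prebuilt chain, a minimal variation (my own sketch, keeping the {reviews} and {question} placeholders exactly as above) looks like this:

from langchain_core.prompts import ChatPromptTemplate
from langchain_ollama import OllamaLLM

template = """You are an expert in answering questions about airline reviews.
Use the provided reviews to answer the question as accurately as possible.
Here are some relevant reviews: {reviews}
Here is the question to answer: {question}
IMPORTANT: Base your answer ONLY on the reviews provided above. If no reviews are provided, say "No reviews were found." """

# Pipe the filled-in prompt straight into the local model
chain = ChatPromptTemplate.from_template(template) | OllamaLLM(model="llama3.2")
answer = chain.invoke({
    "reviews": "Retrieved review text goes here.",
    "question": "How was the onboard food?",
})
print(answer)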
Minimal Python Setup (Conceptual)
import pandas as pd
from langchain_core.prompts import PromptTemplate
from langchain_ollama import OllamaEmbeddings, OllamaLLM
from langchain_chroma import Chroma
from langchain.chains import RetrievalQA

# 1. Load the reduced CSV
df = pd.read_csv("airline_reviews_reduced.csv")

# 2. Create the embedding model
embeddings = OllamaEmbeddings(model="mxbai-embed-large")

# 3. Build the vector store (from_texts takes plain strings; from_documents would need Document objects)
vector_store = Chroma.from_texts(
    texts=df["review_text"].astype(str).tolist(),
    embedding=embeddings,
    collection_name="airline_reviews",
)

# 4. Set up the LLM
llm = OllamaLLM(model="llama3.2")

# 5. RetrievalQA chain — its "stuff" chain expects {context} and {question} placeholders
prompt = PromptTemplate.from_template(
    "You are an expert in answering questions about airline reviews.\n"
    "Use the provided reviews to answer the question as accurately as possible.\n"
    "Here are some relevant reviews: {context}\n"
    "Here is the question to answer: {question}\n"
    "IMPORTANT: Base your answer ONLY on the reviews provided above. "
    'If no reviews are provided, say "No reviews were found."'
)
qa = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=vector_store.as_retriever(),
    chain_type_kwargs={"prompt": prompt},
)

# 6. Ask a question
result = qa.invoke({"query": "How do passengers generally feel about Emirates?"})
print(result["result"])
The code is intentionally basic and readable, focusing on clarity rather than abstraction layers.
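One optional tweak, not shown above: Chroma can persist the collection to disk so embeddings are generated only once. The directory name below is arbitrary; the rest follows the langchain-chroma API.

from langchain_chroma import Chroma
from langchain_ollama import OllamaEmbeddings

embeddings = OllamaEmbeddings(model="mxbai-embed-large")

# First run: embed the reviews and write the collection to disk
vector_store = Chroma.from_texts(
    texts=["review texts go here"],
    embedding=embeddings,
    collection_name="airline_reviews",
    persist_directory="./chroma_airline_reviews",
)

# Later runs: reopen the persisted collection without re-embedding
vector_store = Chroma(
    collection_name="airline_reviews",
    embedding_function=embeddings,
    persist_directory="./chroma_airline_reviews",
)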
Example Results
✅ Example 1: Valid Question
Question: How do passengers generally feel about Emirates?
Result: The agent retrieved multiple relevant reviews and summarized them correctly.

❌ Example 2: No Relevant Data
Question: Is Honda CRA a good SUV car?
Result: Since no relevant reviews were retrieved, the agent responded:
No reviews were found…

This fallback behavior prevents the model from fabricating answers.
GitHub Repository
The full source code is available here:
👉 GitHub Repo: local-AI-agent-RAG
You can clone the repository and run the project following the instructions in the README.
Final Thoughts
This project was built out of curiosity and for fun—a way to experiment with local RAG systems without overcomplicating things. The same approach can scale further: with a larger dataset and more performant hardware, you can build something significantly faster, more accurate, and production‑ready.
For now, it serves as a solid proof of concept and a reminder that meaningful AI projects don’t always need massive infrastructure to get started 🚀