Build a Simple Local Pune Travel AI with FAISS + Ollama LLM - POC

Published: February 6, 2026 at 02:40 AM EST
5 min read
Source: Dev.to

Ever wondered how to create your own local AI assistant for city tours or travel recommendations? In this proof‑of‑concept we build a Pune Grand Tour AI using a FAISS vector database for embeddings and an Ollama LLM for generating answers. No Docker, no cloud costs — just local Python and embeddings.

🔹 Purpose of this POC

  • Local AI Assistant – a mini‑ChatGPT specialized for Pune tourism.
  • Quick Retrieval – embeddings enable fast similarity search over a curated dataset.
  • Cost‑efficient – no cloud vector DB; FAISS runs entirely locally.
  • Hands‑on AI Exploration – learn the practical pipeline: embeddings → vector DB → LLM.

🔹 Why FAISS?

FAISS (Facebook AI Similarity Search) is a high‑performance library for:

  • Storing vector embeddings.
  • Performing fast similarity (nearest‑neighbor) search.
  • Running locally without any cloud infrastructure.

Key point: All Pune data fits comfortably in memory, so FAISS gives us lightning‑fast retrieval.
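Under the hood, similarity search just ranks stored vectors by how close they are to the query vector. Here is a toy, standard-library-only sketch of that idea using made-up 3-dimensional vectors (real sentence embeddings have hundreds of dimensions, and FAISS performs this search far faster and at much larger scale):

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors: higher means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Tiny fake "vector store": document text -> made-up embedding
docs = {
    "Shaniwar Wada is a historic fort in Pune": [0.9, 0.1, 0.2],
    "Aga Khan Palace is a memorial and museum": [0.2, 0.8, 0.3],
    "Sinhagad Fort is a popular trekking spot": [0.8, 0.2, 0.1],
}

query_vec = [0.85, 0.15, 0.15]  # pretend embedding of "famous forts"

# Rank all documents by similarity to the query (brute-force nearest neighbor)
ranked = sorted(docs, key=lambda d: cosine(docs[d], query_vec), reverse=True)
print(ranked[0])  # the fort documents rank highest
```

FAISS replaces this brute-force loop with optimized (and optionally approximate) index structures, which is what makes retrieval feel instant even over large collections.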

🔹 Dataset

We use a simple text file (pune_places_chunks.txt) that contains Pune’s:

  • Historical forts
  • Monuments
  • Museums
  • Tourist spots

Each line or chunk represents one document. Example:

[PLACE] Shaniwar Wada
Shaniwar Wada is a historic fort located in Pune, built in 1732 by Peshwa Bajirao I.
It served as the administrative center of the Maratha Empire.

[PLACE] Aga Khan Palace
The Aga Khan Palace is known for its association with Mahatma Gandhi, who was interned there during the Quit India Movement.

🔹 Step 1 – Create Embeddings & FAISS Vector Store

File: ingest.py

from langchain_community.document_loaders import TextLoader
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS

# 1️⃣ Load processed Pune data
loader = TextLoader("../data/processed/pune_places_chunks.txt")
documents = loader.load()

# 2️⃣ Create embeddings
embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)

# 3️⃣ Build FAISS vector store
vectorstore = FAISS.from_documents(documents, embeddings)

# 4️⃣ Persist the DB locally
vectorstore.save_local("../embeddings/pune_faiss")

print("Pune embeddings created successfully")

Run:

python ingest.py

Expected output

Pune embeddings created successfully

Explanation

  • TextLoader reads the dataset file into LangChain Document objects.
  • HuggingFaceEmbeddings converts each chunk into a vector.
  • FAISS.from_documents builds a searchable vector store.
  • save_local persists the FAISS index for later use.
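One caveat: TextLoader returns the whole file as a single Document, so all the places end up in one chunk. To get one searchable chunk per place, as the dataset format implies, the file can be split on the [PLACE] marker before indexing. LangChain's CharacterTextSplitter with a custom separator is one option; a minimal stdlib sketch of the same idea:

```python
def split_places(raw_text: str) -> list[str]:
    """Split the raw dataset on the [PLACE] marker, yielding one chunk per place."""
    chunks = []
    for part in raw_text.split("[PLACE]"):
        part = part.strip()
        if part:  # skip the empty piece before the first marker
            chunks.append("[PLACE] " + part)
    return chunks

raw = (
    "[PLACE] Shaniwar Wada\n"
    "Shaniwar Wada is a historic fort located in Pune, built in 1732.\n\n"
    "[PLACE] Aga Khan Palace\n"
    "The Aga Khan Palace is associated with Mahatma Gandhi.\n"
)

chunks = split_places(raw)
print(len(chunks))  # 2 -- one chunk per place
```

Each chunk can then be wrapped in a Document (e.g. Document(page_content=chunk)) and passed to FAISS.from_documents, so that similarity search returns individual places rather than the entire file.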

🔹 Step 2 – Query FAISS with Ollama LLM

File: chat.py

from langchain_community.vectorstores import FAISS
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_ollama import OllamaLLM

# 1️⃣ Load embeddings
embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)
print("Embeddings loaded")

# 2️⃣ Load FAISS DB (allow pickle deserialization)
vectordb = FAISS.load_local(
    "../embeddings/pune_faiss",
    embeddings,
    allow_dangerous_deserialization=True
)
print("FAISS Vector DB loaded")

# 3️⃣ Ask a question
question = "Tell me famous places to visit in Pune"
docs = vectordb.similarity_search(question, k=3)

if not docs:
    print("No documents retrieved. Check embeddings folder.")
    exit(1)

context = "\n".join([d.page_content for d in docs])
print(f"Retrieved docs count: {len(docs)}")
print("Context preview (first 300 chars):")
print(context[:300])

# 4️⃣ Initialise Ollama LLM
llm = OllamaLLM(model="llama3")
print("Ollama LLM loaded")

# 5️⃣ Build prompt
prompt = f"""
You are a Pune travel guide AI.
Answer using only the context below.

Context:
{context}

Question:
{question}
"""

# 6️⃣ Generate response
response = llm.invoke(prompt)

print("\nPune AI says:\n")
print(response)

Run:

python chat.py

Sample output

Embeddings loaded
FAISS Vector DB loaded
Retrieved docs count: 3
Context preview (first 300 chars):
[PLACE] Shaniwar Wada
Shaniwar Wada is a historic fort located in Pune, built in 1732...
Ollama LLM loaded

Pune AI says:

Pune is famous for Shaniwar Wada, Sinhagad Fort, Aga Khan Palace, and Dagdusheth Ganpati Temple.

Explanation

  • similarity_search fetches the top‑k most relevant documents.
  • The retrieved context is concatenated and sent to the Ollama LLM.
  • The LLM produces a human‑like answer based solely on the provided context.
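Because the model sees only what we place in the prompt, it helps to be explicit about what it should do when the context lacks the answer. A slightly hardened variant of the prompt-building step (the "say you don't know" instruction is an addition of mine, not part of the original script):

```python
def build_prompt(context: str, question: str) -> str:
    """Assemble a RAG prompt that discourages answers outside the context."""
    return (
        "You are a Pune travel guide AI.\n"
        "Answer using ONLY the context below.\n"
        "If the answer is not in the context, say you don't know "
        "instead of guessing.\n\n"
        f"Context:\n{context}\n\n"
        f"Question:\n{question}\n"
    )

prompt = build_prompt(
    "[PLACE] Shaniwar Wada\nBuilt in 1732 by Peshwa Bajirao I.",
    "When was Shaniwar Wada built?",
)
print(prompt.count("Context:"))  # 1 -- context appears exactly once
```

The resulting string can be passed to llm.invoke(prompt) exactly as in chat.py; this kind of refusal instruction is a common way to reduce hallucinated answers in RAG pipelines.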

🔹 Step 3 – Make It Interactive with Streamlit

File: app.py

import streamlit as st
from langchain_community.vectorstores import FAISS
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_ollama import OllamaLLM

st.title("Pune Grand Tour AI")
st.write("Ask about Pune's forts, monuments, and travel tips!")

@st.cache_resource
def load_vectorstore():
    embeddings = HuggingFaceEmbeddings(
        model_name="sentence-transformers/all-MiniLM-L6-v2"
    )
    return FAISS.load_local(
        "../embeddings/pune_faiss",
        embeddings,
        allow_dangerous_deserialization=True,
    )

@st.cache_resource
def load_llm():
    return OllamaLLM(model="llama3")

vectordb = load_vectorstore()
llm = load_llm()

question = st.text_input("Ask a question about Pune:")

if question:
    docs = vectordb.similarity_search(question, k=3)
    if not docs:
        st.warning("No documents found!")
    else:
        context = "\n".join([d.page_content for d in docs])
        prompt = f"""
You are a Pune travel guide AI.

Context:
{context}

Question:
{question}
"""
        response = llm.invoke(prompt)

        st.subheader("Retrieved Context")
        st.text(context[:500] + ("..." if len(context) > 500 else ""))

        st.subheader("AI Answer")
        st.write(response)

Run the app (install Streamlit first if you haven't):

pip install streamlit
streamlit run app.py

Now you have a fully interactive web UI where users can type any Pune‑related query and receive answers powered by local embeddings and an Ollama LLM.

🎉 You’re all set!

You’ve built a complete local AI assistant for Pune tourism:

  1. Ingest raw text → embeddings → FAISS index.
  2. Retrieve relevant chunks with similarity search.
  3. Generate answers with an Ollama LLM.
  4. Interact via a Streamlit web app.

All of this runs on your own machine—no cloud costs, no Docker, just Python. Happy touring!


🔹 Key Benefits of this POC

  • Fully Local – No cloud or Docker dependency.
  • Fast Retrieval – FAISS provides instant similarity search.
  • Context‑aware AI – Ollama LLM answers based on curated Pune knowledge.
  • Expandable – Add more documents, images, or travel tips.
  • Interactive UI – Streamlit allows anyone to use the AI easily.

🔹 Common Issues & Fixes

  • ValueError about dangerous deserialization when loading the index – pass allow_dangerous_deserialization=True to FAISS.load_local, and only for indexes you built yourself.
  • Connection errors from Ollama – make sure the Ollama server is running locally and the model has been pulled (ollama pull llama3).
  • File-not-found errors for ../embeddings/pune_faiss – run ingest.py first and double‑check the relative paths.

🔹 Use Cases

  • Local city‑guide AI for tourism apps
  • Educational assistant for geography/history lessons
  • Personal knowledge assistant for any curated dataset
  • Prototype for RAG (Retrieval‑Augmented Generation) projects

🔹 Conclusion

With FAISS + Ollama LLM + Streamlit, you can build fast, local, context‑aware AI assistants without relying on cloud services or Docker. This Pune AI POC demonstrates how a specialized knowledge base can power a chatbot capable of giving accurate, context‑specific answers.
