I built the missing UI for Gemini's File Search (managed RAG) API

Published: January 17, 2026 at 08:30 PM EST
5 min read
Source: Dev.to

Retrieval Augmented Generation (RAG) has become the standard architecture for building AI apps that know about your specific data.

Usually, building a RAG pipeline involves a lot of moving parts: spinning up a vector database (like Pinecone or Weaviate), writing Python scripts to chunk your PDFs, generating embeddings, and managing the retrieval logic.

Google recently released a feature called Gemini File Search that simplifies this drastically.
It is a fully‑managed RAG pipeline built directly into the Gemini API. It handles the chunking, embedding, and storage for you. On top of that, the pricing model is arguably the most compelling feature.

  • Unlike traditional vector databases where you often pay for hourly pod usage or storage size, Gemini File Search offers free storage and free embedding generation at query time.
  • You only pay a one‑time fee when you first index the documents (currently $0.15 per 1 million tokens) and the standard input/output token costs for the model’s response.

This makes it incredibly cost‑effective for side projects and scaling applications alike, as you are not bleeding money on idle vector storage.
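To make the pricing concrete, here is a back‑of‑the‑envelope calculator. The $0.15 per 1M tokens rate is the one‑time indexing fee quoted above; the ~4 characters per token estimate is a common rule of thumb, not an official figure:

```typescript
// One-time indexing cost at $0.15 per 1M tokens (the rate quoted above).
const INDEXING_COST_PER_MILLION_TOKENS = 0.15;

// Rough heuristic: ~4 characters per token (an approximation, not official).
function estimateTokens(charCount: number): number {
  return Math.ceil(charCount / 4);
}

function indexingCostUSD(tokens: number): number {
  return (tokens / 1_000_000) * INDEXING_COST_PER_MILLION_TOKENS;
}

// Example: ~1M characters of PDFs is roughly 250k tokens, paid for exactly once.
const tokens = estimateTokens(1_000_000);
console.log(tokens, indexingCostUSD(tokens).toFixed(4)); // 250000 "0.0375"
```

After that one‑time fee, storage and query‑time embedding are free; you only pay normal model input/output token costs on each query.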

But there is a catch

Gemini File Search is completely “headless.” There is no console to view your files, no drag‑and‑drop uploader, and no way to test a knowledge base without writing a script. If you want to delete a file or check if your chunking strategy is working, you have to write code.

I got tired of writing throwaway scripts just to manage my knowledge bases, so I built the Gemini File Search Manager.
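For context, this is roughly what those throwaway scripts look like with the @google/genai SDK. Treat it as pseudo‑code: the method names reflect my reading of the File Search docs at the time of writing, so check the current SDK reference before copying anything.

```js
// Pseudo-code: the kind of one-off management script the UI replaces.
// Method names are assumptions based on the File Search docs; verify them.
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

// List stores just to see what exists...
const stores = await ai.fileSearchStores.list();

// ...or delete a single stale document by hand.
await ai.fileSearchStores.documents.delete({
  name: "fileSearchStores/.../documents/...",
});
```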

Gemini File Search Manager Dashboard

What is Gemini File Search Manager?

An open‑source, local‑first web interface that acts as a control plane for the Gemini File Search API. Built with Next.js, it lets you manage your RAG pipeline visually.

You can check out the code and run it locally here:

👉 https://github.com/prashantrohilla-max/gemini-file-search-manager

Why I built it & the problems it solves

Visualizing the “Black Box”

When you use the File Search API programmatically, you’re often flying blind. You create a Store, upload a file, and hope it processed correctly.

The dashboard shows:

  • All active stores
  • Document counts per store
  • Ingestion status (active, pending, failed)

Drag‑and‑Drop Ingestion (No more scripts)

Uploading a file via the API is a multi‑step process: upload the bytes, wait for the operation to complete, then link it to a store.

The manager provides a drag‑and‑drop interface that handles the whole orchestration. It accepts PDF, TXT, MD, CSV, JSON, and the 100+ other formats Gemini supports.

Upload Interface

One of the most powerful features of the Gemini API is Custom Chunking and Metadata. Normally you’d craft complex JSON objects in code; the UI now lets you:

  • Adjust maxTokensPerChunk and maxOverlapTokens
  • Add metadata tags (e.g., author, year) for later filtering
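Behind those form fields, the UI assembles an upload config object along these lines. The field names maxTokensPerChunk and maxOverlapTokens come from the API; the exact nesting and the whiteSpaceConfig wrapper are my reading of the docs, so verify against the current reference:

```typescript
// Sketch of the upload config the UI builds from the form (nesting is an
// assumption based on the File Search docs; field names are from the API).
const uploadConfig = {
  chunkingConfig: {
    whiteSpaceConfig: {
      maxTokensPerChunk: 200, // upper bound on tokens per chunk
      maxOverlapTokens: 20,   // tokens shared between adjacent chunks
    },
  },
  // Metadata tags attached at ingestion time, usable for filtering later.
  customMetadata: [
    { key: "author", stringValue: "Smith" },
    { key: "year", numericValue: 2020 },
  ],
};
```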

The RAG Playground

After uploading data, you need to verify that the model can actually find it. The Playground view for each Store lets you:

  • Chat with your specific documents (conversational history is preserved)
  • Select different models (Gemini 3 Pro Preview, Gemini 3 Flash, Gemini 2.5 Pro, etc.)
  • Filter by metadata using AIP‑160 syntax (e.g., author = "Smith" AND year > 2020)
  • View citations – the UI parses groundingMetadata from the API response and shows exactly which document chunks were used to generate the answer.
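The filter box passes a raw AIP‑160 string through to the API. A tiny helper (mine, not part of the SDK) makes composing those expressions less error‑prone:

```typescript
// Hypothetical helper: joins individual AIP-160 clauses into the single
// metadata filter string the playground sends with each query.
function buildMetadataFilter(clauses: string[]): string {
  return clauses.join(" AND ");
}

const filter = buildMetadataFilter(['author = "Smith"', "year > 2020"]);
// filter === 'author = "Smith" AND year > 2020'
```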

RAG Playground

Under the Hood: The Tech Stack

| Component  | Technology                              |
| ---------- | --------------------------------------- |
| Framework  | Next.js 16 (App Router with React 19)   |
| Styling    | Tailwind CSS 4 + shadcn/ui              |
| State Mgmt | TanStack Query                          |
| SDK        | @google/genai                           |

Solving the Async Polling Challenge

When you upload a file to Gemini, it doesn’t become active immediately. The API returns an Operation object and the file enters a PROCESSING state.

To avoid freezing the browser, the manager implements a polling mechanism:

// Pseudo‑code: poll until the ingestion operation finishes or fails,
// with a cap on attempts so a stuck operation can't loop forever.
const pollOperation = async (operationId, maxAttempts = 100) => {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await fetchOperationStatus(operationId);
    if (status === 'DONE' || status === 'FAILED') return status;
    await new Promise((r) => setTimeout(r, 3000)); // poll every 3 seconds
  }
  throw new Error('Polling timed out');
};

After the initial upload completes, the app polls the operations endpoint every 3 seconds until the file’s status changes to ACTIVE (or an error occurs). This keeps the UI responsive while providing real‑time feedback on ingestion progress.

Background Processing

When a document is uploaded, the server immediately starts generating embeddings in the background. Once the embeddings are ready, it automatically invalidates the cache and updates the UI—the document status changes from a spinner to a green checkmark.

For the document list itself, TanStack Query handles background refetching every 5 seconds to catch any status changes.
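That refetch behavior is essentially one option in TanStack Query. The sketch below is illustrative: listDocuments and the query key are placeholder names, not the repo's actual code.

```typescript
// Placeholder fetcher standing in for the app's real API call.
async function listDocuments(storeId: string): Promise<{ name: string }[]> {
  return [];
}

// Options passed to TanStack Query's useQuery: refetch every 5 seconds so
// status changes (PROCESSING -> ACTIVE) show up without a manual reload.
const documentsQueryOptions = {
  queryKey: ["documents", "store-123"],
  queryFn: () => listDocuments("store-123"),
  refetchInterval: 5_000, // milliseconds
};
```

Because refetches happen in the background, the existing list stays on screen while fresh data loads, so the UI never flickers back to a spinner.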

Streaming Chat Responses

The chat playground uses Server‑Sent Events (SSE) to stream responses in real‑time. As the model generates text, it appears character by character in the UI. When the stream completes, the grounding metadata (citations) is extracted and displayed below the response.
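On the client, consuming an SSE stream mostly comes down to splitting incoming text on `data:` lines. Here is a minimal pure parser for that generic SSE wire format; it is a sketch, not the app's exact implementation:

```typescript
// Minimal SSE frame parser: extracts the payload of each `data:` line.
// A sketch of the generic SSE format, not the app's exact code.
function parseSSEChunk(chunk: string): string[] {
  return chunk
    .split("\n")
    .filter((line) => line.startsWith("data:"))
    .map((line) => line.slice("data:".length).trim());
}

// Each parsed payload gets appended to the chat message as it arrives.
const payloads = parseSSEChunk('data: {"text":"Hel"}\n\ndata: {"text":"lo"}\n\n');
// payloads -> ['{"text":"Hel"}', '{"text":"lo"}']
```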

Security

Since this is a tool for developers, I didn’t want to deal with user accounts or databases. The app runs locally and uses your environment variables.

  1. Create a .env.local file with your GEMINI_API_KEY.
  2. The app reads the key on the server side only.

Your key never leaves your machine and is never exposed to the client browser.

Quick Start

You can have the project running in about 2 minutes.

1. Clone the repo

git clone https://github.com/prashantrohilla-max/gemini-file-search-manager
cd gemini-file-search-manager
npm install

2. Add your API key

Create a .env.local file:

GEMINI_API_KEY=your_key_here

3. Run the app

npm run dev

Open http://localhost:3000, and you’re ready to go.

What’s Next?

  • Structured Outputs – to test data‑extraction workflows.
  • URL import – a feature to import content directly from web pages.
  • Persistent chat sessions – currently lost when navigating away or stopping the server; persistence is on the roadmap.

The project is open source and MIT‑licensed. If you find it useful, please ⭐ the repository or submit a PR!

Repo: https://github.com/prashantrohilla-max/gemini-file-search-manager
