I deleted my entire AI microservice and just used Postgres (here is why) 🐘⚡

Published: March 1, 2026 at 06:04 AM EST
3 min read
Source: Dev.to

Introduction

A few months ago I needed to build a feature that everyone is asking for right now: “Let our users chat with their messy data.” In my case it was a massive dump of chaotic customer support tickets.

The Problem with the Typical Stack

If you Google how to build this, the internet immediately suggests bolting a whole extra stack onto your primary database:

  • A vector DB for embeddings
  • A graph database for relationships
  • LangChain to glue everything together
  • A separate Python microservice to run it all

I fell for it and built it. The result: when a user deleted a ticket in our main database, its vector representation still lived in the vector DB, and the AI kept hallucinating answers based on deleted data.

The “Aha” Moment 💡

I decided to stop treating AI as a magical entity that requires a bespoke ecosystem and went back to the most reliable backend tool: Postgres.

Postgres + pgvector = Peace of Mind 🧘‍♂️

  1. Enable the pgvector extension.
  2. Store both the raw data and the embeddings in the same table.

```sql
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE support_tickets (
  id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id uuid REFERENCES users(id) ON DELETE CASCADE,
  issue_text text,
  embedding vector(1536) -- OpenAI's embedding size
);
```

Notice the ON DELETE CASCADE: when a user is removed, their tickets go with them, and because each embedding lives in the same row as its ticket, the embedding disappears automatically too. No sync job, no stale vectors.
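To make that concrete, here is a tiny runnable demonstration of the cascade semantics. It uses SQLite (Python's stdlib) as a stand-in for Postgres, since ON DELETE CASCADE behaves the same way; the embedding column is just a placeholder blob in this sketch:

```python
# Demo: ON DELETE CASCADE removes dependent rows (and their embeddings)
# automatically. SQLite used as a stand-in for Postgres here.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite requires this opt-in
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY)")
conn.execute("""
    CREATE TABLE support_tickets (
        id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES users(id) ON DELETE CASCADE,
        issue_text TEXT,
        embedding BLOB  -- stand-in for vector(1536)
    )
""")
conn.execute("INSERT INTO users (id) VALUES (1)")
conn.execute(
    "INSERT INTO support_tickets (user_id, issue_text) VALUES (1, 'login broken')"
)

# Deleting the user cascades to the ticket row, embedding included.
conn.execute("DELETE FROM users WHERE id = 1")
remaining = conn.execute("SELECT COUNT(*) FROM support_tickets").fetchone()[0]
print(remaining)  # 0 -- the ticket and its embedding vanished with the user
```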

Dumping the Heavy Frameworks 🗑️

Most AI frameworks hide the actual API calls behind layers of abstraction, making debugging difficult. With Postgres the entire Retrieval‑Augmented Generation (RAG) pipeline reduces to a raw SQL query and a standard API fetch:

  1. User asks a question.
  2. Convert the question into an embedding.
  3. Run a cosine‑similarity search directly in SQL:

```sql
SELECT issue_text
FROM support_tickets
ORDER BY embedding <=> $1
LIMIT 5;
```

  4. Feed those five results into the OpenAI/Anthropic API using a strict JSON schema.
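For intuition, here is what pgvector's `<=>` operator computes, sketched in plain Python: cosine distance, i.e. 1 minus cosine similarity. Handy for sanity-checking query results locally:

```python
# Plain-Python equivalent of pgvector's <=> (cosine distance) operator.
import math

def cosine_distance(a, b):
    """1 - cosine similarity: 0 for identical directions, 1 for orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

print(cosine_distance([1.0, 0.0], [2.0, 0.0]))  # 0.0 (same direction)
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))  # 1.0 (orthogonal)
```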

That’s it—no microservices, no $80/month vector‑DB bills, just a regular monolithic backend doing its job efficiently.
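The whole retrieval step fits in one small function. A minimal sketch, assuming a psycopg-style connection object and an `embed` callable wrapping your embedding API (the names here are illustrative, not from the original post):

```python
# Sketch of the retrieval half of the RAG flow, assuming a psycopg-style
# DB connection and an `embed` callable (e.g. a wrapper around the
# OpenAI embeddings endpoint). Hypothetical names, not a specific library.

def retrieve_context(conn, embed, question, k=5):
    """Return the k tickets most similar to the question."""
    qvec = embed(question)  # question -> embedding vector
    with conn.cursor() as cur:
        # <=> is pgvector's cosine-distance operator; with real psycopg
        # you would register pgvector's type adapter for the parameter.
        cur.execute(
            "SELECT issue_text FROM support_tickets "
            "ORDER BY embedding <=> %s LIMIT %s",
            (qvec, k),
        )
        return [row[0] for row in cur.fetchall()]
```

The returned snippets are what you then hand to the LLM as context.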

Why I’m Sharing This ☕

I spend a lot of time testing architectures, making mistakes, and stripping away marketing fluff so you can build things that actually work in production. If this post saved you from an unnecessary architecture rewrite, a massive SaaS bill, or a weekend of debugging sync errors, consider sponsoring my work on GitHub:

👉 Sponsor AmaLS367 on GitHub

Sponsorship gives me the freedom to keep experimenting, breaking things, and open‑sourcing production‑ready boilerplates that save you time.

TL;DR

For this feature, one Postgres database with pgvector replaced an entire separate AI stack (vector DB, LangChain, Python microservice), and ON DELETE CASCADE keeps embeddings in sync with the source data for free.

I’m curious—how are you managing your embeddings right now? Are you using a separate DB or keeping it monolithic? Let me know below! 👇
