I don’t hate SQL. I hate metadata friction.
Source: Dev.to
I don’t struggle with writing SQL, but I struggle with everything around it. You know the drill:
- Open BigQuery console.
- Write a simple query on top of INFORMATION_SCHEMA.
- Realize you forgot the partition column.
- Copy an old query.
- Tweak it.
- Realize you need another column.
- Open docs.
- Switch to Airflow to check if you can get the answer faster.
- Go back to the BigQuery console.
- Forget what you were checking.
None of this is hard. It’s just constant, low‑level friction repeated every other day.
So I built something small to reduce that friction.
The problem
I regularly ask questions like:
- Which tables changed schema recently?
- What jobs are consuming the most slots?
- Why was BigQuery slow yesterday?
- Why are BigQuery costs skyrocketing?
BigQuery exposes many answers through INFORMATION_SCHEMA, but the queries are rarely trivial. They’re long, require joins, and you need to remember field names—so I constantly find myself going back to the docs for small details.
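For a sense of what "rarely trivial" means, a slot-usage question typically turns into something like the query below against `INFORMATION_SCHEMA.JOBS_BY_PROJECT` (a sketch from memory; the column names should be checked against the BigQuery docs):

```python
# The sort of INFORMATION_SCHEMA query this workflow replaces.
# Column names (total_slot_ms, creation_time, user_email) are from
# memory and may need verifying against the BigQuery documentation.
SLOT_HOGS_SQL = """
SELECT
  user_email,
  SUM(total_slot_ms) / 1000 / 3600 AS slot_hours
FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE creation_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
GROUP BY user_email
ORDER BY slot_hours DESC
LIMIT 10
"""
print(SLOT_HOGS_SQL.strip())
```

Not hard, but exactly the kind of thing you end up rewriting from the docs every time.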
I started modeling the metadata in dbt and created a few summarized tables for:
- Jobs
- Storage
- Schema changes
- Dependencies
- Slot usage
Nothing revolutionary—just structuring the data the way I actually use it. That helped, but I was still writing the same kinds of queries repeatedly.
So I threw an LLM at it
Once I had clean dbt models with decent descriptions, the next step felt obvious: instead of writing SQL, why not just ask questions? Most BI tools advertise Text‑to‑SQL solutions for business users; why wouldn’t it work for engineers as well?
Current setup
- dbt models summarize BigQuery metadata.
- The data is synced into DuckDB (to avoid scanning BigQuery every time and because I don’t trust the LLMs with live data).
- I pass the schema + column descriptions to an LLM.
- The LLM generates SQL.
- SQL runs against DuckDB.
- Results are shown in a small terminal UI.
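The loop above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the function names are made up, the LLM call is stubbed, and sqlite3 stands in for DuckDB so the sketch runs anywhere:

```python
import sqlite3

# Stand-in for the synced DuckDB file; in the real setup this would be
# duckdb.connect("metadata.duckdb") holding the dbt-built summary tables.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE job_summary (job_id TEXT, total_slot_ms INTEGER, state TEXT)")
con.execute(
    "INSERT INTO job_summary VALUES "
    "('job_1', 120000, 'DONE'), ('job_2', 9500000, 'RUNNING')"
)

def describe_schema(con):
    """Collect table/column names to include in the LLM prompt."""
    cols = con.execute("PRAGMA table_info(job_summary)").fetchall()
    return "job_summary(" + ", ".join(c[1] for c in cols) + ")"

def llm_generate_sql(question, schema):
    """Stubbed LLM call: a real version would send the schema plus
    column descriptions and the question to a model API."""
    return ("SELECT job_id, total_slot_ms FROM job_summary "
            "ORDER BY total_slot_ms DESC LIMIT 1")

question = "which job is consuming the most slots?"
sql = llm_generate_sql(question, describe_schema(con))
rows = con.execute(sql).fetchall()
print(rows)  # the TUI would render these rows plus a short summary
```

The point of the design is that the LLM only ever sees the schema and descriptions; the query itself runs against a local copy of the data.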

It’s basically a TUI where I can “chat” with my metadata.
A real example
One week BigQuery was painfully slow. There was likely slot contention, probably caused by a new transformation that had been deployed earlier that week.
Instead of digging manually, I asked:
why was bigquery slow this week?
The system analyzed slot usage and long‑running jobs and identified a new transformation model that was timing out after six hours. I checked with the team responsible, and they said they’d fix it.
A few days later I asked:
are the slot timeouts gone?
It confirmed that the long‑running jobs had disappeared. That’s when I realized this was actually useful for first‑pass investigations.

It’s not perfect
There were issues immediately.
- Long answers – even simple questions triggered overly detailed reports. I split the UI into two modes: Fast and Analytics.
- Hallucinations – BigQuery job IDs look like UUIDs. Even when the query result was correct, the LLM sometimes invented job IDs in its explanation, which is unacceptable.
I tried adding validation steps, but they quickly became too expensive in terms of tokens. Right now I return raw query results along with summaries.
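One cheap alternative to an LLM validation pass (not what the post describes, just one way to catch this particular class of hallucination) is a plain string check: extract anything UUID-shaped from the explanation and require it to appear in the raw result rows:

```python
import re

# BigQuery job IDs in this context look like UUIDs, per the post.
UUID_RE = re.compile(
    r"\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b",
    re.IGNORECASE,
)

def hallucinated_ids(explanation: str, result_rows: list) -> set:
    """Return UUID-like IDs mentioned in the explanation that never
    appear in the actual query results."""
    seen = {str(value) for row in result_rows for value in row}
    mentioned = set(UUID_RE.findall(explanation))
    return mentioned - seen

rows = [("9f8d3a21-0b6c-4e2f-a1d4-5c7e9b0f1a2b", 9_500_000)]
text = ("Job 9f8d3a21-0b6c-4e2f-a1d4-5c7e9b0f1a2b used the most slots; "
        "job 11111111-2222-3333-4444-555555555555 also looked slow.")
print(hallucinated_ids(text, rows))  # the second ID was invented
```

It costs zero tokens, which is the whole appeal compared to a second model call.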

I also hit an infinite loop where the LLM queried the data, interpreted it, then queried again, and so on. Luckily I only had $5 in credits, so I added strict usage limits after that.
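A guard along these lines (a minimal sketch with made-up limits; the post only says "strict usage limits") is enough to cut off a runaway query-interpret-query loop:

```python
class BudgetExceeded(Exception):
    pass

class UsageGuard:
    """Hard caps on LLM round-trips and token spend, checked on
    every call. Limits here are illustrative, not from the post."""
    def __init__(self, max_calls=5, max_tokens=20_000):
        self.max_calls = max_calls
        self.max_tokens = max_tokens
        self.calls = 0
        self.tokens = 0

    def charge(self, tokens: int) -> None:
        self.calls += 1
        self.tokens += tokens
        if self.calls > self.max_calls or self.tokens > self.max_tokens:
            raise BudgetExceeded(
                f"stopped after {self.calls} calls / {self.tokens} tokens"
            )

guard = UsageGuard(max_calls=3)
try:
    for _ in range(10):           # query -> interpret -> query again...
        guard.charge(tokens=500)  # would wrap the real model call
except BudgetExceeded as exc:
    print(exc)  # the loop is cut off instead of burning credits
```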
Why DuckDB?
Mainly cost and speed. I didn’t want accidental expensive scans and slow iteration loops. So I sync the relevant metadata into DuckDB and query that instead.

It works well at this scale, though obviously you can’t mirror unlimited history.
What this tool actually is
This is not a BI or an observability tool. It’s just a faster way to ask operational questions without writing repetitive metadata queries.
It can be used for:
- First‑level investigations
- Sanity checks
- Avoiding copy‑paste SQL
That’s it. I use it when I need quick answers. It doesn’t replace proper analysis, but it lowers the activation energy. And that alone makes it useful.
Open‑sourcing?
I’m considering open‑sourcing it. Right now, it works with BigQuery metadata, but technically you could plug in any BigQuery dataset with proper table definitions.
Before I invest more time to support additional data‑engineering ops metadata, I’d love to know:
- Would you use something like this?
- Does this solve a real annoyance for you?
- What would immediately make it unusable?
- What would make it indispensable?
If there’s interest, I’ll clean up the repo and open‑source it.
I’m trying to validate whether this is just my personal itch or something broader. Honest (including brutal) feedback is welcome.