gni-compression is on npm — What a month of building a domain-adaptive LLM compressor taught me
Two functions
```js
const { compress, decompress } = require('gni-compression')

const compressed = await compress(Buffer.from(longContext))
```
No warm‑up. No session state. The domain knowledge is baked into a pre‑trained dictionary (gcdict.bin) bundled with the package — trained on real LLM conversation corpora.
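gni-compression's internals aren't shown here, but the dictionary idea itself is easy to demonstrate with Node's built-in zlib, whose deflate/inflate options accept a preset dictionary. This is a sketch of the general technique, not the package's actual format; the dictionary string is a made-up stand-in for gcdict.bin:

```js
const zlib = require('zlib')

// Stand-in for a trained domain dictionary like gcdict.bin:
// phrases the compressor can reference instead of emitting literally.
const dictionary = Buffer.from(
  'sudo apt-get install update upgrade package repository ubuntu kernel'
)

const msg = Buffer.from('sudo apt-get install the package from the ubuntu repository')

const plain = zlib.deflateSync(msg)
const primed = zlib.deflateSync(msg, { dictionary })

console.log(`no dictionary:   ${msg.length} -> ${plain.length} bytes`)
console.log(`with dictionary: ${msg.length} -> ${primed.length} bytes`)

// The decoder needs the same dictionary, which is why gcdict.bin ships in the package.
const roundTrip = zlib.inflateSync(primed, { dictionary })
console.log(roundTrip.equals(msg)) // true
```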
The numbers
| Corpus | GN Ratio | Brotli‑6 Ratio |
|---|---|---|
| Ubuntu IRC | 8.4× | 1.2× |
Ubuntu IRC is the surprising one. Messages average 67 bytes — too short for Brotli to do much (1.2×). GN gets 8.4× because IRC vocabulary is extremely consistent. Short repetitive messages are where a domain dictionary wins hardest.
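You can see Brotli's short-message floor yourself with Node's zlib: on a 67-ish byte IRC-style line, header and window overhead eat most of the gains.

```js
const zlib = require('zlib')

// A typical short IRC-style message (~67 bytes).
const msg = Buffer.from('anyone know why apt-get update 404s on the security mirror today?')

const br = zlib.brotliCompressSync(msg)
console.log(`${msg.length} -> ${br.length} bytes (${(msg.length / br.length).toFixed(2)}x)`)
```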
Why the numbers are what they are
When I swept the minimum phrase length, I found the vocabulary isn’t a smooth distribution — it’s two clusters with a gap:
- minLen 4→5: token count drops 68% (short filler tokens)
This means compression cuts filler preferentially. That’s probably why we see a small, consistent downstream quality improvement when feeding compressed context back to models — the signal‑to‑noise ratio improves.
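For context, here's a hypothetical reconstruction of that sweep, not the actual trainer: the real tool presumably works over multi-word phrases, but counting distinct whitespace tokens against a length cutoff shows the shape of the measurement (corpus.txt is a placeholder):

```js
const fs = require('fs')

// Placeholder corpus file; the actual training corpora aren't published here.
const corpus = fs.readFileSync('corpus.txt', 'utf8')

// Count distinct tokens.
const counts = new Map()
for (const tok of corpus.split(/\s+/)) {
  counts.set(tok, (counts.get(tok) || 0) + 1)
}

// Sweep the minimum-length cutoff and report how many candidates survive.
for (let minLen = 3; minLen <= 8; minLen++) {
  let kept = 0
  for (const tok of counts.keys()) {
    if (tok.length >= minLen) kept++
  }
  console.log(`minLen ${minLen}: ${kept} candidate tokens`)
}
```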
Why I built it
I’m building NN Dash, a persistent AI agent scaffold that routes across Claude, GPT, and local Ollama. The goal is to make a long‑running AI relationship essentially free. GN is what makes multi‑thousand‑message context sessions viable without the token bill killing it.
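The integration point in a scaffold like that is small. A hedged sketch, assuming the compressed output stays model-readable as the quality note above implies; `buildPrompt` and `turns` are hypothetical names, not part of NN Dash:

```js
const { compress } = require('gni-compression')

// Hypothetical: pack archived conversation turns before they re-enter the prompt.
async function buildPrompt(turns, latestMessage) {
  const history = turns.join('\n')
  const packed = await compress(Buffer.from(history)) // assumed model-readable output
  return `${packed.toString()}\n\n${latestMessage}`
}
```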
Use it
```bash
npm install gni-compression
```

```js
const { compress, decompress } = require('gni-compression')
```
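A round-trip sketch; the snippet at the top only exercises compress, so the decompress call and its return type are assumed symmetric here:

```js
const assert = require('assert')
const { compress, decompress } = require('gni-compression')

async function main() {
  const original = Buffer.from('...long LLM conversation context...')
  const compressed = await compress(original)
  const restored = await decompress(compressed) // assumed inverse of compress
  assert(Buffer.from(restored).equals(original))
  console.log(`${original.length} -> ${compressed.length} bytes`)
}

main()
```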
Source: MIT‑licensed.
Feedback on the numbers, methodology, or use cases welcome.