KODA Format: A Schema-First Data Format to Reduce LLM Token Usage (~40%)
Source: Dev.to
Introduction
When building applications with large language models (LLMs), one of the most overlooked costs is how structured data is represented. Most systems use JSON, but JSON is inefficient for LLM input.
The Problem with JSON
JSON repeats field names for every record, wasting tokens.
Example
    [
      {"id": 1, "title": "Bug", "state": "open"},
      {"id": 2, "title": "Fix", "state": "closed"}
    ]
Each object repeats the keys id, title, and state.
If you send 1,000 records, those keys are repeated 1,000 times, leading to:
- Tokens wasted
- Higher costs
- Smaller effective context window
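You can see the overhead directly with the standard library. The sketch below compares the size of a repeated-keys JSON payload against a rough positional equivalent; it uses character counts, which are only a proxy for tokens, but the ratio makes the redundancy obvious:

```python
import json

# Build 1,000 records that all share the same three keys.
records = [{"id": i, "title": f"Item {i}", "state": "open"} for i in range(1000)]

as_json = json.dumps(records)

# A rough positional equivalent: keys listed once, values pipe-delimited.
header = "id|title|state"
as_positional = header + "\n" + "\n".join(
    f'{r["id"]}|{r["title"]}|{r["state"]}' for r in records
)

# Character counts are only a proxy for tokens, but the gap is clear.
print(len(as_json), len(as_positional))
```

Every record in the JSON version pays for the keys again; the positional version pays for them exactly once.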
KODA: Knowledge‑Oriented Data Abstraction
KODA is a schema‑first data format designed to reduce token usage when sending structured data to LLMs. It works by:
- Defining the structure once (schema‑first)
- Encoding values positionally
- Eliminating repeated keys found in JSON
Optimized For
- Retrieval‑augmented generation (RAG) pipelines
- Tool‑calling systems
- Agent workflows
- High‑volume structured LLM input
KODA Syntax Example
    KODA/1
    @META
    schemas:issue
    counts:issue=2
    @SCHEMA
    issue:id title state
    @DATA:issue
    1|Bug|open
    2|Fix|closed
No repeated keys—only the schema definition and the values.
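The library's actual parsing API is not shown in this article, but the positional layout is simple enough to decode by hand. Here is a minimal sketch; the function name `decode_koda` and the section handling are my own assumptions, not the library's interface:

```python
def decode_koda(text: str) -> dict:
    """Decode a simple KODA document into {schema_name: [record dicts]}."""
    schemas = {}              # schema name -> list of field names
    data = {}                 # schema name -> decoded records
    section, current = None, None
    for line in text.strip().splitlines():
        if line.startswith("KODA/") or line == "@META":
            section = "meta"  # version and metadata lines are skipped here
        elif line == "@SCHEMA":
            section = "schema"
        elif line.startswith("@DATA:"):
            section, current = "data", line.split(":", 1)[1]
            data[current] = []
        elif section == "schema":
            name, fields = line.split(":", 1)
            schemas[name] = fields.split()
        elif section == "data":
            # Values come back as strings; type recovery would need the schema.
            data[current].append(dict(zip(schemas[current], line.split("|"))))
    return data

doc = """KODA/1
@META
schemas:issue
counts:issue=2
@SCHEMA
issue:id title state
@DATA:issue
1|Bug|open
2|Fix|closed"""

print(decode_koda(doc))
```

Note that a positional format pushes typing onto the schema: everything decodes as a string unless the schema says otherwise.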
Token Reduction Results
Token counts were measured with the GPT‑4o‑mini tokenizer on real datasets.
| Case | JSON Tokens | KODA Tokens | Reduction |
|---|---|---|---|
| Repetitive Logs | 3,202 | 1,233 | 61.5% |
| GitHub Issues | 4,137 | 2,576 | 37.7% |
| Small Dataset | 26 | 35 | −34.6% |
KODA performs best on large, repetitive structured data. For very small datasets, the schema overhead can outweigh the benefits.
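The reduction column follows directly from the two token counts, so it is easy to sanity-check:

```python
# Reduction = 1 - (KODA tokens / JSON tokens), using the figures from the table.
cases = {
    "Repetitive Logs": (3202, 1233),
    "GitHub Issues": (4137, 2576),
    "Small Dataset": (26, 35),
}
for name, (json_tok, koda_tok) in cases.items():
    reduction = (1 - koda_tok / json_tok) * 100
    print(f"{name}: {reduction:.1f}%")
# → Repetitive Logs: 61.5%, GitHub Issues: 37.7%, Small Dataset: -34.6%
```

The negative number for the small dataset is the schema overhead showing up: with only a couple of records, the fixed cost of the header exceeds the per-record savings.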
Why Tokens Matter
- Tokens = cost (API pricing)
- Tokens = latency (processing time)
- Tokens = context capacity (how much data the model can see)
Reducing tokens by ~30–40%:
- Lowers API costs
- Increases usable context
- Improves overall system efficiency
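To put the savings in concrete terms, here is a back-of-the-envelope calculation. The price per million tokens and the monthly volume are illustrative assumptions, not real pricing; check your provider's current rates:

```python
# Illustrative assumptions only -- substitute your own numbers:
price_per_million_tokens = 0.60    # USD per 1M input tokens (assumed)
monthly_input_tokens = 50_000_000  # structured data sent to the model (assumed)
reduction = 0.35                   # 35%, midpoint of the ~30-40% range above

baseline_cost = monthly_input_tokens / 1_000_000 * price_per_million_tokens
savings = baseline_cost * reduction
print(f"baseline: ${baseline_cost:.2f}/month, saved: ${savings:.2f}/month")
```

The same reduction also frees roughly a third of the context window previously spent on structural redundancy.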
How KODA Works
- Schema → defined once
- Data → streamed positionally
This removes structural redundancy.
Python Example
    from koda import Schema, Field, encode

    schema = Schema("user", [
        Field("id"),
        Field("name"),
        Field("email", optional=True),
        Field("active", default="true"),
    ])

    data = [
        {"id": 1, "name": "Alice", "email": "alice@example.com"},
        {"id": 2, "name": "Bob"},
    ]

    koda_str = encode(data, schema)
    print(koda_str)
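The `encode` call above comes from the library. To see roughly what such an encoder does without installing anything, here is a dependency-free sketch; it is my own simplification (not the library's implementation), filling missing values from defaults or an empty field:

```python
def encode_koda(name, fields, rows, defaults=None):
    """Encode a list of dicts positionally; missing values fall back to defaults or ''."""
    defaults = defaults or {}
    lines = [
        "KODA/1",
        "@META",
        f"schemas:{name}",
        f"counts:{name}={len(rows)}",
        "@SCHEMA",
        f"{name}:" + " ".join(fields),
        f"@DATA:{name}",
    ]
    for row in rows:
        # Emit values in schema order; absent fields use a default or stay empty.
        lines.append("|".join(str(row.get(f, defaults.get(f, ""))) for f in fields))
    return "\n".join(lines)

users = [
    {"id": 1, "name": "Alice", "email": "alice@example.com"},
    {"id": 2, "name": "Bob"},
]
print(encode_koda("user", ["id", "name", "email", "active"], users,
                  defaults={"active": "true"}))
```

Bob's missing email shows up as an empty slot between two pipes, which is how a positional format represents optional fields without spending tokens on a key.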
Comparison of Formats
| Format | Token Efficiency | Readability | Best Use Case |
|---|---|---|---|
| JSON | Low | High | APIs |
| YAML | Medium | Medium | Config files |
| TOON | High | Medium | LLM structured data |
| KODA | High | Low | LLM pipelines |
When to Use KODA
Use KODA if you are:
- Sending large structured datasets to LLMs
- Building RAG pipelines
- Working with tool calls or agents
- Optimizing token usage in production systems
Do not use KODA for:
- Small datasets (1–2 records)
- Irregular or deeply nested JSON
- Human‑authored configuration files (JSON is preferable)
Frequently Asked Questions
- Is KODA a replacement for JSON? No. KODA is a transport format for LLM pipelines, acting as an optimization layer. For general use, JSON remains the better choice.
- Does KODA work with any LLM? It is designed for LLM input; it works best on large structured datasets.
- What workloads benefit most? RAG pipelines, tool calls, and other structured LLM input scenarios.
Getting Started
- Repository:
- Install: `pip install koda`
If you’re sending structured data to LLMs, you’re likely wasting tokens. KODA offers a simple way to reduce that overhead. Feedback and contributions are welcome.