KODA Format: A Schema-First Data Format to Reduce LLM Token Usage (~40%)

Published: May 4, 2026 at 04:28 AM EDT
3 min read
Source: Dev.to

Introduction

When building applications with large language models (LLMs), one of the most overlooked costs is how structured data is represented. Most systems use JSON, but JSON is inefficient for LLM input.

The Problem with JSON

JSON repeats field names for every record, wasting tokens.

Example

[
  {"id": 1, "title": "Bug", "state": "open"},
  {"id": 2, "title": "Fix", "state": "closed"}
]

Each object repeats the keys id, title, and state.
If you send 1,000 records, each of those keys is repeated 1,000 times (3,000 redundant key occurrences in total), leading to:

  • Tokens wasted
  • Higher costs
  • Smaller effective context window
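To make this overhead concrete, the sketch below builds a hypothetical 1,000-record dataset (not from the article) and counts how many times the quoted keys appear in the serialized JSON:

```python
import json

# Hypothetical dataset: 1,000 records sharing the same three keys.
records = [{"id": i, "title": f"Issue {i}", "state": "open"} for i in range(1000)]
payload = json.dumps(records)

# Every record repeats the quoted keys "id", "title", and "state".
repeated_keys = sum(payload.count(f'"{k}"') for k in ("id", "title", "state"))
print(repeated_keys)  # prints 3000
```

Those 3,000 key occurrences carry no information beyond what a single schema line would: the structure never changes from record to record.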

KODA: Knowledge‑Oriented Data Abstraction

KODA is a schema‑first data format designed to reduce token usage when sending structured data to LLMs. It works by:

  1. Defining the structure once (schema‑first)
  2. Encoding values positionally
  3. Eliminating repeated keys found in JSON

Optimized For

  • Retrieval‑augmented generation (RAG) pipelines
  • Tool‑calling systems
  • Agent workflows
  • High‑volume structured LLM input

KODA Syntax Example

KODA/1
@META
schemas:issue
counts:issue=3

@SCHEMA
issue:id title state

@DATA:issue
1|Bug|open
2|Fix|closed

No repeated keys—only the schema definition and the values.
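The grammar is simple enough to decode by hand. The parser below is a minimal sketch based only on the example above (section markers `@META`, `@SCHEMA`, `@DATA:name`, space-separated field lists, pipe-separated rows); it is not the official KODA implementation, and the assumed grammar may not cover the full format:

```python
def parse_koda(text):
    """Parse a KODA document into {schema_name: [record_dict, ...]}."""
    schemas, records = {}, {}
    section, current = None, None
    for raw in text.strip().splitlines():
        line = raw.strip()
        if not line or line.startswith("KODA/"):
            continue
        if line == "@META":
            section = "meta"
        elif line == "@SCHEMA":
            section = "schema"
        elif line.startswith("@DATA:"):
            section, current = "data", line.split(":", 1)[1]
            records.setdefault(current, [])
        elif section == "schema":
            name, fields = line.split(":", 1)
            schemas[name] = fields.split()
        elif section == "data":
            # Positional values: zip row values against the schema's field names.
            records[current].append(dict(zip(schemas[current], line.split("|"))))
        # @META lines (schemas:, counts:) are not needed to decode the data.
    return records

doc = """KODA/1
@META
schemas:issue
counts:issue=2

@SCHEMA
issue:id title state

@DATA:issue
1|Bug|open
2|Fix|closed"""

print(parse_koda(doc))
```

Running this recovers the same records the JSON version carried, showing that the positional encoding loses no information as long as the schema travels with the data.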

Token Reduction Results

Measured with a GPT‑4o‑mini tokenizer on real datasets.

| Case            | JSON Tokens | KODA Tokens | Reduction |
|-----------------|-------------|-------------|-----------|
| Repetitive Logs | 3,202       | 1,233       | 61.5%     |
| GitHub Issues   | 4,137       | 2,576       | 37.7%     |
| Small Dataset   | 26          | 35          | −34.6%    |

KODA performs best on large, repetitive structured data. For very small datasets, the schema overhead can outweigh the benefits.
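You can reproduce the trend (though not the exact figures, which were measured with the GPT-4o-mini tokenizer) using a crude word/punctuation token proxy. Everything below is a sketch with made-up data, not the article's benchmark:

```python
import json
import re

def rough_tokens(s):
    # Crude proxy: word and punctuation chunks. NOT the GPT-4o-mini
    # tokenizer; it only illustrates the trend, not the table's numbers.
    return len(re.findall(r"\w+|[^\w\s]", s))

records = [{"id": i, "title": f"Bug {i}", "state": "open"} for i in range(100)]

json_text = json.dumps(records)
koda_text = "KODA/1\n@SCHEMA\nissue:id title state\n@DATA:issue\n" + \
    "\n".join(f"{r['id']}|{r['title']}|{r['state']}" for r in records)

print(rough_tokens(json_text), rough_tokens(koda_text))
```

On repetitive data like this, the KODA rendering comes out substantially smaller; with only one or two records, the header lines dominate and the advantage disappears, matching the negative result for the small dataset above.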

Why Tokens Matter

  • Tokens = cost (API pricing)
  • Tokens = latency (processing time)
  • Tokens = context capacity (how much data the model can see)

Reducing tokens by ~30–40 %:

  • Lowers API costs
  • Increases usable context
  • Improves overall system efficiency

How KODA Works

  • Schema → defined once
  • Data → streamed positionally

This removes structural redundancy.

Python Example

from koda import Schema, Field, encode

schema = Schema("user", [
    Field("id"),
    Field("name"),
    Field("email", optional=True),
    Field("active", default="true")
])

data = [
    {"id": 1, "name": "Alice", "email": "alice@example.com"},
    {"id": 2, "name": "Bob"}
]

koda_str = encode(data, schema)
print(koda_str)
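The `koda` package and its `Schema`/`Field`/`encode` API are taken as shown in the post. For readers without the library, here is a standalone sketch of what `encode` might produce, assuming the output format from the syntax example earlier (the `encode_koda` helper is hypothetical, and this sketch omits the `@META` section):

```python
def encode_koda(name, fields, rows, defaults=None):
    """Render rows positionally under a single schema line (hypothetical helper)."""
    defaults = defaults or {}
    lines = ["KODA/1", "@SCHEMA", f"{name}:{' '.join(fields)}", f"@DATA:{name}"]
    for row in rows:
        # Missing fields fall back to a declared default, else empty.
        lines.append("|".join(str(row.get(f, defaults.get(f, ""))) for f in fields))
    return "\n".join(lines)

print(encode_koda(
    "user",
    ["id", "name", "email", "active"],
    [{"id": 1, "name": "Alice", "email": "alice@example.com"},
     {"id": 2, "name": "Bob"}],
    defaults={"active": "true"},
))
```

Note how Bob's missing optional `email` becomes an empty positional slot (`2|Bob||true`) rather than an omitted key, which is how a positional format keeps rows aligned with the schema.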

Comparison of Formats

| Format | Token Efficiency | Readability | Best Use Case       |
|--------|------------------|-------------|---------------------|
| JSON   | Low              | High        | APIs                |
| YAML   | Medium           | Medium      | Config files        |
| TOON   | High             | Medium      | LLM structured data |
| KODA   | High             | Low         | LLM pipelines       |

When to Use KODA

Use KODA if you are:

  • Sending large structured datasets to LLMs
  • Building RAG pipelines
  • Working with tool calls or agents
  • Optimizing token usage in production systems

Do not use KODA for:

  • Small datasets (1–2 records)
  • Irregular or deeply nested JSON
  • Human‑authored configuration files (JSON is preferable)

Frequently Asked Questions

  • Is KODA a replacement for JSON?
    No. KODA is a transport format for LLM pipelines, acting as an optimization layer. For general use, JSON remains the better choice.

  • Does KODA work with any LLM?
    It is designed for LLM input; it works best on large structured datasets.

  • What workloads benefit most?
    RAG pipelines, tool calls, and other structured LLM input scenarios.

Getting Started

  • Repository:
  • Install:
pip install koda

If you’re sending structured data to LLMs, you’re likely wasting tokens. KODA offers a simple way to reduce that overhead. Feedback and contributions are welcome.
