Building A Binary Compiler For Node.js
Source: Dev.to
Overview
A binary compiler takes a data format and produces pure, flat binary data.
data → buffer
The reason these things exist (protobufs, flatbuffers, you name it) is simple: the current standard for internet communication is bloated. Yes, JSON is insanely inefficient, but that’s the tax we pay for debuggability and readability.
This number right here → 1
is 200% larger in JSON than in pure binary.
'1' = 8 bits
'"' = 8 bits
'"' = 8 bits
----------------
Total: 24 bits
In pure binary it’s just 8 bits.
But that’s not even the crazy part. Every value in JSON also requires a key:
{
  "key": "1"
}
Take a second and guess how many bits that is now.
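Here's a quick check in Node (JSON is UTF-8 text, so each character costs a byte):
const json = JSON.stringify({ key: "1" }); // '{"key":"1"}'
console.log(Buffer.byteLength(json) * 8);  // 88 bits of JSON...
console.log(Buffer.from([1]).length * 8);  // ...for a value that fits in 8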
Don’t take my word for it—here’s proof.
Example Object
const obj = {
  age: 26,
  randNum16: 16000,
  randNum32: 33000,
  hello: "world",
  thisisastr: "a long string lowkey",
}
Size Comparison
obj.json   98 bytes
obj.bin    86 bytes   # ← no protocol (keys + values serialized)
obj2.bin   41 bytes   # ← with protocol (values only, protocol owns the keys)
Even with keys encoded, pure binary is still way leaner, and the savings compound quickly as payloads grow.
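If you want to reproduce the comparison, the idea is just: write each encoding to disk and stat the files. Something like this works (gen is the binary serializer built further down in the post; the fs calls match the imports in the setup section, and the file names are only illustrative):
const payloads = {
  "obj.json": Buffer.from(JSON.stringify(obj)),
  "obj.bin": gen(obj),        // keys + values
  "obj2.bin": gen(obj, true), // values only
};

for (const [file, buf] of Object.entries(payloads)) {
  writeFileSync(file, buf);
  const fd = openSync(file, "r");
  console.log(file, fstatSync(fd).size, "bytes");
  closeSync(fd);
}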
Performance
Execution time comparison (milliseconds):
Name | Count | Min(ms) | Max(ms) | Mean(ms) | Median(ms) | StdDev(ms)
-----+-------+---------+---------+----------+------------+-----------
JSON | 100 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000
TEMU | 100 | 0.000 | 1.000 | 0.010 | 0.000 | 0.099
The longest run took 1 ms, which is misleading—it’s a single outlier in 100 samples. My guess? Node allocating the initial buffer.
Why a Binary Compiler?
I’ve been building more real‑time systems and tools for Node.js lately, and JSON is a bandwidth hog. It’s borderline inconceivable to build a low‑latency system with JSON as your primary transport.
That’s why things like protobuf exist, especially for server‑to‑server communication. They’re insanely fast and way leaner.
Instead of using a generic solution, I’m experimenting with rolling my own, mainly as groundwork for the stuff I want to bring to Node.js:
- tessera.js – a C++ N‑API renderer powered by raw bytes (SharedArrayBuffers); see "How I built a renderer for Node.js"
- shard – a sub‑100 nanosecond latency profiler (native C/C++ usually sits around 5–40 ns)
- nexus – a Godot‑like game engine for Node.js
This experiment is really about preparing solid binary encoders/decoders for those projects.
Building the Compiler
Note: this is experimental. I literally opened VS Code and just started coding—no research, no paper‑reading.
Honestly, this is the best way to learn anything: build it badly from intuition first, then see how experts do it. You’ll notice there’s zero thought put into naming, just pure flow. That’s intentional; it’s how I prototype.
Utils and Setup
import { writeFileSync, fstatSync, openSync, closeSync } from "fs";

const obj = {
  age: 26,
  randNum16: 16000,
  randNum32: 33000,
  hello: "world",
  thisisastr: "a long string lowkey",
  // stack: ['c++', "js", "golang"],
  // hobbies: ["competitive gaming", "hacking node.js", "football"]
};
const TYPES = {
  numF: 1,   // float
  numI8: 2,  // int8
  numI16: 3, // int16
  numI32: 4, // int32
  string: 5,
  array: 6,
};
function getObjectKeysAndValues(obj) {
  // JS preserves property order per spec
  const keys = Object.keys(obj);
  const values = Object.values(obj);
  return [keys, values];
}

function isFloat(num) {
  return !Number.isInteger(num);
}
Serializing Keys
Simple protocol:
[allKeysLen | keyLen | key] → buffer
function serKeys(keys) {
  // 2-byte header (total section length) + 1 length byte + key bytes per key
  let len = 2;
  for (const k of keys) {
    len += 1 + Buffer.byteLength(k, "utf8");
  }
  const buf = Buffer.allocUnsafe(len);
  buf.writeUInt16BE(len, 0); // allKeysLen, header included
  let writer = 2;
  for (const k of keys) {
    const byteLen = Buffer.byteLength(k, "utf8");
    if (byteLen > 255)
      throw new Error(`Key too long: "${k}" (${byteLen} bytes)`);
    buf.writeUInt8(byteLen, writer++);
    writer += buf.write(k, writer, "utf8");
  }
  return buf;
}
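With the serializer above, two short keys come out as a 12-byte section: a 2-byte total length, then each key prefixed by its own length byte:
console.log(serKeys(["age", "hello"]));
// <Buffer 00 0c 03 61 67 65 05 68 65 6c 6c 6f>
//   00 0c = total section length (12), 03 = "age", 05 = "hello"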
Deserializing is just the reverse: read length → read key → move pointer.
function deserKeys(buf) {
  let reader = 2; // skip the 2-byte allKeysLen header
  const keys = [];
  while (reader < buf.length) {
    const len = buf.readUInt8(reader++);
    keys.push(buf.subarray(reader, reader + len).toString("utf8"));
    reader += len;
  }
  return keys;
}
Serializing Values
Each value gets a 1-byte type tag. Integers are range-checked (num >= -128 && num <= 127 fits in an i8, num >= -32768 && num <= 32767 in an i16, and so on) and written in the smallest width that fits:
i8  -> 8 bits
i16 -> 16 bits
i32 -> 32 bits
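The compiler below leans on two value serializers, seNumber and seString. Here's a minimal sketch of them; the exact bodies matter less than the invariant that they mirror the reader in unserData (a type tag, then the payload, with strings carrying a 2-byte length prefix). Big-endian writes and UTF-8 strings are assumed, to match the deserializer:
function seNumber(num) {
  // Floats: tag + 4-byte FloatBE
  if (isFloat(num)) {
    const buf = Buffer.allocUnsafe(5);
    buf.writeInt8(TYPES.numF, 0);
    buf.writeFloatBE(num, 1);
    return buf;
  }
  // Integers: pick the smallest signed width that fits
  if (num >= -128 && num <= 127) {
    const buf = Buffer.allocUnsafe(2);
    buf.writeInt8(TYPES.numI8, 0);
    buf.writeInt8(num, 1);
    return buf;
  }
  if (num >= -32768 && num <= 32767) {
    const buf = Buffer.allocUnsafe(3);
    buf.writeInt8(TYPES.numI16, 0);
    buf.writeInt16BE(num, 1);
    return buf;
  }
  const buf = Buffer.allocUnsafe(5);
  buf.writeInt8(TYPES.numI32, 0);
  buf.writeInt32BE(num, 1);
  return buf;
}

function seString(str) {
  // Tag + 2-byte length + UTF-8 bytes
  const byteLen = Buffer.byteLength(str, "utf8");
  const buf = Buffer.allocUnsafe(3 + byteLen);
  buf.writeInt8(TYPES.string, 0);
  buf.writeInt16BE(byteLen, 1);
  buf.write(str, 3, "utf8");
  return buf;
}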
The Compiler
Serialize
function gen(obj, protocol = false) {
  if (typeof obj !== "object") throw new Error("Must be Object");

  let cache = new Map();
  const [keys] = getObjectKeysAndValues(obj);

  // protocol = false → keys travel with the payload
  // protocol = true  → values only, the receiver already knows the keys
  let serk;
  if (!protocol) {
    serk = serKeys(keys);
  }

  // First pass: serialize each value and measure the total size
  let length = 0;
  for (const key of keys) {
    let buf;
    switch (typeof obj[key]) {
      case "number":
        buf = seNumber(obj[key]);
        break;
      case "string":
        buf = seString(obj[key]);
        break;
      default:
        continue; // unsupported types are skipped for now
    }
    length += buf.length;
    cache.set(key, buf);
  }

  // Second pass: copy the cached chunks into one flat buffer
  const dataBuf = Buffer.allocUnsafe(length);
  let writer = 0;
  for (const key of keys) {
    const b = cache.get(key);
    if (b) {
      b.copy(dataBuf, writer);
      writer += b.length;
    }
  }

  return protocol ? dataBuf : Buffer.concat([serk, dataBuf]);
}
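Usage is a one-liner per mode; protocol mode simply drops the key section:
const withKeys = gen(obj);         // [serialized keys][serialized values]
const valuesOnly = gen(obj, true); // [serialized values] only
console.log(withKeys.length > valuesOnly.length); // true: the key section is gone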
Deserialize
function unserData(buf) {
  let reader = 0;
  let data = [];
  while (reader < buf.length) {
    const t = buf.readInt8(reader++); // type tag
    switch (t) {
      case 1: // float
        data.push(buf.readFloatBE(reader));
        reader += 4;
        break;
      case 2: // int8
        data.push(buf.readInt8(reader++));
        break;
      case 3: // int16
        data.push(buf.readInt16BE(reader));
        reader += 2;
        break;
      case 4: // int32
        data.push(buf.readInt32BE(reader));
        reader += 4;
        break;
      case 5: { // string: 2-byte length, then UTF-8 bytes
        const len = buf.readInt16BE(reader);
        reader += 2;
        data.push(buf.subarray(reader, reader + len).toString("utf8"));
        reader += len;
        break;
      }
    }
  }
  return data;
}
Unified Parser
function ungen(buf, protocol = false) {
  if (!protocol) {
    const keysLen = buf.readInt16BE(0);
    const keysBuf = buf.subarray(0, keysLen);
    deserKeys(keysBuf); // keys are parsed, but this prototype only returns values
    return unserData(buf.subarray(keysLen));
  }
  return unserData(buf);
}
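A quick round trip (assuming the seNumber/seString sketch from earlier). Note that the prototype hands back a flat value array instead of rebuilding the object:
console.log(ungen(gen(obj)));
// [ 26, 16000, 33000, 'world', 'a long string lowkey' ]
console.log(ungen(gen(obj, true), true)); // same values from the keyless buffer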
Running It
Sanity Check
let samples = { JSON: [], TEMU: [] };

function J() {
  const start = process.hrtime.bigint();
  JSON.parse(JSON.stringify(obj));
  const end = process.hrtime.bigint();
  // BigInt division truncates to whole milliseconds
  samples.JSON.push((end - start) / 1_000_000n);
}

function T() {
  const start = process.hrtime.bigint();
  const b = gen(obj, true);
  ungen(b, true); // protocol mode: the buffer has no key section
  const end = process.hrtime.bigint();
  samples.TEMU.push((end - start) / 1_000_000n);
}
Sampling
const WARMUP = 100_000;
const SAMPLE = 100;

// Warm each path up so the JIT settles, discard those samples, then measure
for (let i = 0; i < WARMUP; i++) J();
samples.JSON = [];
for (let i = 0; i < SAMPLE; i++) J();

for (let i = 0; i < WARMUP; i++) T();
samples.TEMU = [];
for (let i = 0; i < SAMPLE; i++) T();

console.dir(samples.TEMU);
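The table in the Performance section is just a summary of these sample arrays; a small helper along these lines (summarize is hypothetical, not part of the script above) produces the same columns:
function summarize(name, xs) {
  const ms = xs.map(Number).sort((a, b) => a - b);
  const mean = ms.reduce((s, x) => s + x, 0) / ms.length;
  const median = ms[Math.floor(ms.length / 2)];
  const stddev = Math.sqrt(ms.reduce((s, x) => s + (x - mean) ** 2, 0) / ms.length);
  console.log(name, ms.length, ms[0], ms[ms.length - 1], mean, median, stddev);
}

summarize("JSON", samples.JSON);
summarize("TEMU", samples.TEMU);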
It works.
The real question is: after doing proper research, how much better can this get?