98 Bytes That Prove Your Document Existed

Published: (February 20, 2026 at 11:48 PM EST)
Source: Dev.to

ATL Protocol Checkpoint – Fixed‑Size Wire Format

A checkpoint in the ATL Protocol is a signed snapshot of the transparency‑log state.
It captures the root hash of the Merkle tree at a specific tree size, at a specific moment in time, from a specific log instance.
If you hold a checkpoint and its signature verifies, you know exactly which log state the key holder committed to at that point.

Note – The entire wire format is 98 bytes long, fixed‑size, with no variable‑length fields and no parser required.

Byte Layout

| Offset | Size (bytes) | Field |
|--------|--------------|-------|
| 0      | 18           | Magic bytes: "ATL-Protocol-v1-CP" (ASCII) |
| 18     | 32           | Origin ID (SHA-256 of instance UUID) |
| 50     | 8            | Tree size (u64, little-endian) |
| 58     | 8            | Timestamp (u64, little-endian, Unix nanoseconds) |
| 66     | 32           | Root hash (SHA-256) |
| Total  | 98           | (signed blob) |

The Ed25519 signature (64 bytes) and the key ID (32 bytes) are stored separately – they are not part of the 98‑byte blob. This separation is a deliberate design decision (explained below).

Why a Fixed‑Size Binary Blob?

1. No Ambiguity in Signed Contexts

Typical serialization formats (JSON, Protobuf, CBOR, MessagePack) are great for APIs and configuration, but they introduce ambiguity when used for data that must be cryptographically signed:

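| Format      | Ambiguity source |
|-------------|------------------|
| JSON        | Key order, whitespace, and number formatting can differ, giving different byte sequences for the same logical object (hence RFC 8785, the JSON Canonicalization Scheme). |
| Protobuf    | Serialization is not canonical: field order is not guaranteed, so implementations may emit different bytes for the same message. |
| CBOR        | Multiple valid encodings exist for the same value unless deterministic encoding is enforced. |
| MessagePack | Similar to CBOR: the same value can be encoded in more than one way (for example, an integer in several widths). |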

When you sign a checkpoint, you need to know exactly what you signed: “these exact 98 bytes.” If the serialization is ambiguous, verification becomes ambiguous, and two implementations may produce incompatible signatures.
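
To make the ambiguity concrete, the sketch below hand-decodes a CBOR unsigned integer. It is a minimal illustration rather than a full CBOR decoder: the helper name `decode_cbor_uint` is hypothetical and only major type 0 is handled. Both inputs are well-formed CBOR for the value 1, yet they differ on the wire, so a signature computed over one would not verify against the other.

```rust
/// Decode a CBOR unsigned integer (major type 0) from the front of `input`.
/// Returns the value and the number of bytes consumed, or None if the input
/// is not a complete major-type-0 item. Hypothetical helper for illustration.
fn decode_cbor_uint(input: &[u8]) -> Option<(u64, usize)> {
    let first = *input.first()?;
    if first >> 5 != 0 {
        return None; // not major type 0
    }
    let info = first & 0x1f;
    // Additional info 0..=23 holds the value itself; 24..=27 announce
    // 1, 2, 4, or 8 following bytes.
    let extra = match info {
        0..=23 => return Some((u64::from(info), 1)),
        24 => 1,
        25 => 2,
        26 => 4,
        27 => 8,
        _ => return None,
    };
    let bytes = input.get(1..1 + extra)?;
    // CBOR multi-byte integers are big-endian.
    let value = bytes.iter().fold(0u64, |acc, b| (acc << 8) | u64::from(*b));
    Some((value, 1 + extra))
}

fn main() {
    let short = [0x01u8]; // value stored in the initial byte
    let long = [0x18u8, 0x01]; // same value in a one-byte argument

    assert_eq!(decode_cbor_uint(&short), Some((1, 1)));
    assert_eq!(decode_cbor_uint(&long), Some((1, 2)));
    assert_ne!(short.as_slice(), long.as_slice());
    println!("same value, different bytes");
}
```

A fixed layout sidesteps this entirely: there is only one byte sequence for a given set of field values.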

2. Zero‑Parsing Overhead

A 98‑byte blob can be read deterministically in any language:

  1. Read 18 bytes → magic.
  2. Read 32 bytes → origin ID.
  3. Read 8 bytes (little‑endian) → tree size.
  4. Read 8 bytes (little‑endian) → timestamp.
  5. Read 32 bytes → root hash.

No length prefixes, delimiters, or TLV structures are needed. The bytes are the canonical form.
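
The five reads above can be sketched as a decoder. This is an illustration under assumptions, not the project's code: the article only shows the encoding direction, so `from_bytes`, `take`, `sample_blob`, and the field-only `Checkpoint` struct here are hypothetical names chosen to mirror the byte layout table.

```rust
const CHECKPOINT_MAGIC: &[u8; 18] = b"ATL-Protocol-v1-CP";
const CHECKPOINT_BLOB_SIZE: usize = 98;

/// Hypothetical struct holding only the four signed fields.
#[derive(Debug, PartialEq)]
struct Checkpoint {
    origin: [u8; 32],
    tree_size: u64,
    timestamp: u64,
    root_hash: [u8; 32],
}

/// Copy `N` bytes starting at `offset` into a fixed-size array.
fn take<const N: usize>(blob: &[u8; CHECKPOINT_BLOB_SIZE], offset: usize) -> [u8; N] {
    let mut out = [0u8; N];
    out.copy_from_slice(&blob[offset..offset + N]);
    out
}

/// Decode a blob by fixed offsets; returns None if the magic does not match.
fn from_bytes(blob: &[u8; CHECKPOINT_BLOB_SIZE]) -> Option<Checkpoint> {
    if blob[0..18] != CHECKPOINT_MAGIC[..] {
        return None;
    }
    Some(Checkpoint {
        origin: take::<32>(blob, 18),
        tree_size: u64::from_le_bytes(take::<8>(blob, 50)),
        timestamp: u64::from_le_bytes(take::<8>(blob, 58)),
        root_hash: take::<32>(blob, 66),
    })
}

/// Build a blob by hand with made-up field values for the demo.
fn sample_blob() -> [u8; CHECKPOINT_BLOB_SIZE] {
    let mut blob = [0u8; CHECKPOINT_BLOB_SIZE];
    blob[0..18].copy_from_slice(CHECKPOINT_MAGIC);
    blob[18..50].copy_from_slice(&[0xAA; 32]);
    blob[50..58].copy_from_slice(&42u64.to_le_bytes());
    blob[58..66].copy_from_slice(&1_700_000_000_000_000_000u64.to_le_bytes());
    blob[66..98].copy_from_slice(&[0xBB; 32]);
    blob
}

fn main() {
    let cp = from_bytes(&sample_blob()).expect("magic should match");
    println!("tree_size={} timestamp={}", cp.tree_size, cp.timestamp);
}
```

Because the input type is `&[u8; 98]`, a wrong-length buffer is rejected by the type system before any field is read.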

Rust Implementation – to_bytes()

pub fn to_bytes(&self) -> [u8; CHECKPOINT_BLOB_SIZE] {
    let mut blob = [0u8; CHECKPOINT_BLOB_SIZE];
    blob[0..18].copy_from_slice(CHECKPOINT_MAGIC);
    blob[18..50].copy_from_slice(&self.origin);
    blob[50..58].copy_from_slice(&self.tree_size.to_le_bytes());
    blob[58..66].copy_from_slice(&self.timestamp.to_le_bytes());
    blob[66..98].copy_from_slice(&self.root_hash);
    blob
}
  • No allocations – a stack‑allocated array.
  • No error paths – the return type is [u8; 98], not Vec<u8> or Result.
  • Deterministic – always produces the exact data that was signed.

Signature & Key ID – Stored Separately

The signature and key ID are not part of the signed blob.

Why keep them separate?

  • Chicken‑and‑egg problem – you cannot sign data that already contains its own signature.
  • Dual‑format risk – many systems define a “signing input” (blob without signature) and a “stored format” (blob + signature). This creates two serialization rules for the same logical object, leading to bugs where verification uses the wrong format.

Verification Example

pub fn verify(&self, verifier: &CheckpointVerifier) -> AtlResult<()> {
    // Fast‑reject on key‑ID mismatch (cheap SHA‑256 compare)
    if self.key_id != verifier.key_id {
        return Err(AtlError::InvalidSignature(format!(
            "key_id mismatch: checkpoint has {}, verifier has {}",
            hex::encode(self.key_id),
            hex::encode(verifier.key_id)
        )));
    }

    // Re‑create the exact signed blob
    let blob = self.to_bytes();
    verifier.verify(&blob, &self.signature)
}
  • The key ID is the SHA-256 of the public key, so the check is a cheap 32-byte comparison. It runs before the expensive Ed25519 verification and provides a fast-rejection path.

Magic Bytes – “ATL‑Protocol‑v1‑CP”

The first 18 bytes serve two purposes:

  1. Format identification – If a JPEG, Protobuf message, or random 98‑byte buffer is fed to a checkpoint parser, the magic bytes will not match, resulting in a clear InvalidCheckpointMagic error instead of obscure downstream failures.
  2. Versioning – The v1 baked into the magic string ties the wire‑format version to the data itself. If the format ever changes (e.g., new fields, different hash algorithm), the magic string can be updated to "ATL-Protocol-v2-CP". A v1 parser encountering a v2 checkpoint will reject it cleanly rather than mis‑interpreting the bytes.

Eighteen bytes is generous for a magic string, but the cost is only a few extra bytes per checkpoint, and in return the identifier is self-describing and carries the format version.
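
Both purposes can be shown with one comparison. The sketch below is an assumption-laden illustration: the error name `InvalidCheckpointMagic` comes from the text above, while the `ParseError` enum and the helpers `check_magic` and `blob_with_magic` are hypothetical.

```rust
const CHECKPOINT_MAGIC: &[u8; 18] = b"ATL-Protocol-v1-CP";

#[derive(Debug, PartialEq)]
enum ParseError {
    /// Variant named after the error mentioned in the article; the real
    /// error type may look different.
    InvalidCheckpointMagic,
}

/// Accept a blob only if its first 18 bytes are the v1 magic string.
fn check_magic(blob: &[u8; 98]) -> Result<(), ParseError> {
    if blob[..18] == CHECKPOINT_MAGIC[..] {
        Ok(())
    } else {
        Err(ParseError::InvalidCheckpointMagic)
    }
}

/// Build an otherwise-zeroed blob that starts with the given magic.
fn blob_with_magic(magic: &[u8; 18]) -> [u8; 98] {
    let mut blob = [0u8; 98];
    blob[..18].copy_from_slice(magic);
    blob
}

fn main() {
    // A v1 blob passes.
    assert_eq!(check_magic(&blob_with_magic(b"ATL-Protocol-v1-CP")), Ok(()));

    // A future v2 blob is rejected by a v1 parser instead of being misread.
    assert_eq!(
        check_magic(&blob_with_magic(b"ATL-Protocol-v2-CP")),
        Err(ParseError::InvalidCheckpointMagic)
    );

    // So is arbitrary data that happens to be 98 bytes long.
    assert_eq!(
        check_magic(&[0xFF; 98]),
        Err(ParseError::InvalidCheckpointMagic)
    );
    println!("magic checks behave as described");
}
```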

TL;DR

  • 98‑byte fixed binary → no ambiguity, no parsing complexity.
  • Signature & key ID stored separately → avoids chicken‑and‑egg and dual‑format pitfalls.
  • Magic bytes → format identification + versioning.
  • Rust to_bytes() → deterministic, allocation‑free, always the signed data.

This design may look obvious in hindsight, but many implementations get it wrong by mixing variable‑length encodings or by conflating the signed and stored representations. Keeping the signed blob immutable and minimal eliminates those classes of bugs.

Magic Bytes (Hex Representation)

The checkpoint blob starts with a human‑readable string rather than a short binary magic number.
The magic bytes are:

41544C2D50726F746F636F6C2D76312D4350

which corresponds to the ASCII text ATL-Protocol-v1-CP.
Using a readable string makes the blob easy to spot in hex dumps, log files, and debugging sessions.
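
The hex string can be reproduced from the ASCII text with a few lines of Rust. The helper name `to_hex_upper` is made up for this example.

```rust
/// Render bytes as uppercase hex with no separators.
fn to_hex_upper(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{:02X}", b)).collect()
}

fn main() {
    let magic = b"ATL-Protocol-v1-CP";
    assert_eq!(magic.len(), 18);
    // Prints 41544C2D50726F746F636F6C2D76312D4350
    println!("{}", to_hex_upper(magic));
}
```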

Timestamp

  • Field type: u64
  • Encoding: Unix nanoseconds (not seconds or milliseconds)

The range of a u64 in nanoseconds covers from 1970 up to roughly the year 2554, which is more than sufficient.
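
The year 2554 figure follows from simple integer arithmetic. The sketch below assumes an average Gregorian year of 365.2425 days (31,556,952 seconds); the constant and function names are illustrative.

```rust
const NANOS_PER_SECOND: u64 = 1_000_000_000;
const SECONDS_PER_YEAR: u64 = 31_556_952; // 365.2425 days * 86,400 s

/// Whole years that fit into a u64 nanosecond counter.
fn representable_years() -> u64 {
    u64::MAX / NANOS_PER_SECOND / SECONDS_PER_YEAR
}

fn main() {
    let years = representable_years(); // 584
    println!("range: 1970 to about {}", 1970 + years); // about 2554
}
```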

Why nanosecond precision?
A transparency log can process multiple entries within the same millisecond. With millisecond resolution, two checkpoints could carry identical timestamps, making their order ambiguous. Nanosecond resolution keeps timestamps distinct for checkpoints issued microseconds apart, provided the system clock actually resolves that finely.

The timestamp is generated by current_timestamp_nanos() and clamped to u64::MAX to handle the (theoretical) case where system time exceeds the representable range.
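
A function with this behavior might look like the sketch below. The name `current_timestamp_nanos` comes from the text; the body is an assumption, as is the separate `clamp_nanos` helper and the choice to map a pre-1970 clock to 0. `Duration::as_nanos()` returns a `u128`, which is why a clamp is needed at all.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Saturate a u128 nanosecond count into a u64.
fn clamp_nanos(nanos: u128) -> u64 {
    if nanos > u128::from(u64::MAX) {
        u64::MAX
    } else {
        nanos as u64
    }
}

/// Nanoseconds since the Unix epoch, clamped to the u64 range.
/// Assumption: a clock set before 1970 yields 0 rather than an error.
fn current_timestamp_nanos() -> u64 {
    let since_epoch = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .unwrap_or_default();
    clamp_nanos(since_epoch.as_nanos())
}

fn main() {
    println!("now: {} ns since epoch", current_timestamp_nanos());
    // Values beyond the u64 range saturate instead of wrapping.
    assert_eq!(clamp_nanos(u128::from(u64::MAX) + 1), u64::MAX);
}
```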

Little‑Endian Encoding

Both u64 fields (tree size and timestamp) are encoded little‑endian.

  • This is an explicit design choice, not a default.
  • Most current hardware (x86, ARM in its default configuration, RISC-V) is little-endian, so encoding a u64 as little-endian needs no byte swap on the most common platforms.
  • This avoids a class of byte-swapping bugs on those platforms, and the explicit to_le_bytes() calls keep the output identical on big-endian machines.

Test for Endianness

// test_endianness: 0x0102_0304_0506_0708 encodes as
// [0x08, 0x07, 0x06, 0x05, 0x04, 0x03, 0x02, 0x01]

The test test_endianness exists because byte order can be trivially correct on one platform and silently wrong on another. It documents and verifies the encoding as a property of the format.
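
A check along those lines could be written as follows. Only the constant and the expected bytes come from the comment above; the shift-based `encode_u64_le_manual` helper is an addition for illustration, showing an encoding that does not depend on host byte order and agrees with `to_le_bytes()`.

```rust
/// Encode by explicit shifts: byte i holds bits 8*i..8*i+7 of the value.
fn encode_u64_le_manual(value: u64) -> [u8; 8] {
    let mut out = [0u8; 8];
    for (i, byte) in out.iter_mut().enumerate() {
        *byte = (value >> (8 * i)) as u8;
    }
    out
}

fn main() {
    let value: u64 = 0x0102_0304_0506_0708;
    let expected = [0x08, 0x07, 0x06, 0x05, 0x04, 0x03, 0x02, 0x01];

    // Least significant byte comes first in both encoders.
    assert_eq!(value.to_le_bytes(), expected);
    assert_eq!(encode_u64_le_manual(value), expected);

    // Decoding restores the original value.
    assert_eq!(u64::from_le_bytes(expected), value);
    println!("little-endian encoding verified");
}
```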

Human‑Readable JSON Representation

Checkpoints need a readable form for APIs, debugging, and storage systems that don’t handle raw binary well.

  • A CheckpointJson struct is provided with string encodings:

    • Hashes: strings prefixed with "sha256:"
    • Signatures: strings prefixed with "base64:"
  • Conversion methods:

    • to_json() – binary checkpoint → JSON
    • from_json() – JSON → binary checkpoint

Important: Cryptographic operations always work on the 98‑byte binary blob, never on the JSON. The Ed25519 signature is computed over to_bytes(), not over serde_json::to_string().

To enforce this, the main Checkpoint struct does not derive Serialize or Deserialize. Attempting to serialize it directly will cause a compile‑time error, forcing callers to use the explicit conversion methods.
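
As a rough idea of what the string encoding for hashes could look like, here is a sketch using only the standard library. The article confirms the "sha256:" prefix but not the payload encoding, so lowercase hex is an assumption, and `encode_hash` and `decode_hash` are hypothetical names, not the project's API.

```rust
const HASH_PREFIX: &str = "sha256:";

/// Render a 32-byte hash as "sha256:" followed by 64 hex digits.
/// Assumption: the payload is lowercase hex.
fn encode_hash(hash: &[u8; 32]) -> String {
    let mut out = String::with_capacity(HASH_PREFIX.len() + 64);
    out.push_str(HASH_PREFIX);
    for byte in hash {
        out.push_str(&format!("{:02x}", byte));
    }
    out
}

/// Parse the string form back into 32 bytes; None on any malformed input.
fn decode_hash(text: &str) -> Option<[u8; 32]> {
    let hex = text.strip_prefix(HASH_PREFIX)?;
    if hex.len() != 64 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    let mut out = [0u8; 32];
    for (i, slot) in out.iter_mut().enumerate() {
        *slot = u8::from_str_radix(&hex[2 * i..2 * i + 2], 16).ok()?;
    }
    Some(out)
}

fn main() {
    let hash = [0xAB; 32];
    let text = encode_hash(&hash);
    println!("{}", text);
    // The string form is for display and transport only; it round-trips
    // back to the exact bytes that go into the signed blob.
    assert_eq!(decode_hash(&text), Some(hash));
}
```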

Signature Trust Model (ATL Protocol v2.0)

  • The Ed25519 signature on a checkpoint is an integrity check, not a trust anchor.
  • It proves: “this checkpoint was issued by the holder of this private key.”
  • It does not prove: “you should trust this key.”

External Trust Anchors

  1. RFC 3161 TSA timestamps – a trusted third‑party timestamping authority attests to when the checkpoint existed.
  2. Bitcoin OTS (OpenTimestamps) – the checkpoint hash is anchored in the Bitcoin blockchain, providing an immutable timestamp that no single party can forge.

These anchors establish when a checkpoint existed; the Ed25519 signature merely binds the checkpoint to a specific log instance.

Consequence: If the Ed25519 signing key is later compromised, past checkpoints that were already anchored remain trustworthy, because the external anchors are independent of the signing key.

Test Suite (Wire‑Format Coverage)

| Test | Purpose |
|------|---------|
| test_checkpoint_blob_size | Verifies the blob is exactly 98 bytes. |
| test_magic_bytes | Checks the first 18 bytes equal "ATL-Protocol-v1-CP". |
| test_endianness | Confirms little-endian encoding of u64 fields. |
| test_wire_format_layout | Ensures every field is at the correct byte offset. |
| test_sign_and_verify | Round-trip: create checkpoint → sign → verify. |
| test_verify_wrong_key_fails | Signature from key A does not verify with key B. |
| test_verify_tampered_data_fails | Flipping a byte in the checkpoint makes verification fail. |
| test_verify_tampered_signature_fails | Flipping a byte in the signature makes verification fail. |
| test_json_roundtrip | Binary → JSON → binary yields identical bytes. |
| test_empty_tree_checkpoint | A checkpoint with tree_size = 0 is valid. |

Each test name documents a specific property of the format; a failing test immediately tells you what broke and why it matters.

Blob Size Breakdown (Why 98 bytes?)

| Bytes | Meaning |
|-------|---------|
| 18    | Magic bytes / format identifier ("ATL-Protocol-v1-CP"). |
| 32    | Origin identifier (which log instance). |
| 8     | Tree size (number of entries). |
| 8     | Timestamp (nanoseconds). |
| 32    | Root hash (cryptographic commitment to the entire log). |

Total: 98 bytes (the signed statement).

All 98 bytes are essential:

  • Magic bytes prevent misidentification.
  • Origin avoids cross‑log confusion.
  • Tree size and timestamp locate the snapshot in the log’s history.
  • Root hash commits to every entry ever written.

Removing any field would make the checkpoint ambiguous or forgeable.

What Is Not Inside the Signed Blob

Anything beyond the 98 bytes (the signature itself, key IDs, metadata, or annotations) belongs outside the signed statement. The signed blob is the immutable statement; everything else is commentary.

Implementation Details

  • Repository: (Apache‑2.0)
  • File discussed: src/core/ (the checkpoint implementation lives here).
  • checkpoint.rs: implements the checkpoint wire format and handles serialization, signing, and verification.
  • File size: 1080 lines of Rust code.
  • Signature scheme: Ed25519, using the ed25519-dalek crate.