Stop decompressing entire archives to get one file — introducing ARCX

Published: March 19, 2026 at 04:43 PM EDT
Source: Dev.to

Overview

Most archive formats make a simple task unnecessarily expensive: you need one file, so you download and decompress everything.
I built ARCX, a compressed archive format designed to fix that.

ARCX combines cross‑file compression (like tar+zstd) with indexed random access (like ZIP), so you can retrieve a single file from a large archive in milliseconds without decompressing the rest.

GitHub:

Install

cargo install arcx

Benchmarks (across 5 real‑world datasets)

| Dataset | ARCX Bytes Read | TAR+ZSTD Bytes Read | Reduction |
|---|---|---|---|
| Python ML | 326 KB | 63.1 MB | 198× less |
| Build Artifacts | 714 KB | 140.4 MB | 202× less |

  • Other 3 datasets: ≈ 200 ms per file retrieval from a ~200 MB archive, with up to 200× less data read vs tar+zstd
  • Compression overhead: within ~3% of tar+zstd

Use Cases

  • CI/CD pipelines (artifact retrieval)
  • Cloud storage with partial reads
  • Large codebases
  • Package registries

Modern systems often need one file, immediately, rather than the entire archive.

How ARCX Works

  1. Block‑based compression – the archive is split into independently compressed blocks.
  2. Binary manifest index – stored at the end of the archive, mapping each file to its block offset.
  3. Direct offset reads – a client can:
    • Look up the file in the index.
    • Seek to the relevant block.
    • Decompress only that block.

This replaces scanning or decompressing the full archive with a simple manifest lookup and a single block read.
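
The lookup, seek, and single-block decompress steps can be sketched in a few dozen lines of Rust. This is a toy illustration, not ARCX's actual on-disk format or API: the layout (blocks, then a manifest, then an 8-byte footer holding the manifest offset), the tab-separated text manifest, the function names `build` and `read_file`, and the run-length codec standing in for zstd are all assumptions made to keep the example dependency-free.

```rust
use std::io::{self, Cursor, Read, Seek, SeekFrom};

// Stand-in codec (run-length encoding); a real archive would use zstd here.
fn compress(data: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut i = 0;
    while i < data.len() {
        let mut run = 1;
        while run < 255 && i + run < data.len() && data[i + run] == data[i] {
            run += 1;
        }
        out.extend_from_slice(&[run as u8, data[i]]); // (count, byte) pair
        i += run;
    }
    out
}

fn decompress(data: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    for pair in data.chunks_exact(2) {
        out.extend(std::iter::repeat(pair[1]).take(pair[0] as usize));
    }
    out
}

// Hypothetical layout: [block]...[manifest][8-byte little-endian manifest offset].
// Manifest line: name, block offset, block length, offset inside block, file length.
// (Names containing tabs or newlines are not handled in this sketch.)
fn build(files: &[(&str, &[u8])], files_per_block: usize) -> Vec<u8> {
    let mut archive: Vec<u8> = Vec::new();
    let mut manifest = String::new();
    for group in files.chunks(files_per_block) {
        // Files in one group share a block, so they are compressed together.
        let mut raw: Vec<u8> = Vec::new();
        let mut spans = Vec::new();
        for (name, data) in group {
            spans.push((*name, raw.len(), data.len()));
            raw.extend_from_slice(data);
        }
        let (offset, block) = (archive.len(), compress(&raw));
        for (name, start, len) in spans {
            manifest += &format!("{name}\t{offset}\t{}\t{start}\t{len}\n", block.len());
        }
        archive.extend_from_slice(&block);
    }
    let manifest_offset = archive.len() as u64;
    archive.extend_from_slice(manifest.as_bytes());
    archive.extend_from_slice(&manifest_offset.to_le_bytes());
    archive
}

fn num(s: &str) -> io::Result<usize> {
    s.parse::<usize>().map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}

// Fetch one file: footer -> manifest -> seek to one block -> decompress only that block.
fn read_file<R: Read + Seek>(r: &mut R, wanted: &str) -> io::Result<Option<Vec<u8>>> {
    let manifest_end = r.seek(SeekFrom::End(-8))?;
    let mut footer = [0u8; 8];
    r.read_exact(&mut footer)?;
    let manifest_offset = u64::from_le_bytes(footer);

    r.seek(SeekFrom::Start(manifest_offset))?;
    let mut manifest = vec![0u8; (manifest_end - manifest_offset) as usize];
    r.read_exact(&mut manifest)?;
    let text = String::from_utf8_lossy(&manifest);

    for line in text.lines() {
        let f: Vec<&str> = line.split('\t').collect();
        if f.len() != 5 || f[0] != wanted {
            continue;
        }
        let (offset, block_len, start, len) = (num(f[1])?, num(f[2])?, num(f[3])?, num(f[4])?);
        r.seek(SeekFrom::Start(offset as u64))?;
        let mut block = vec![0u8; block_len];
        r.read_exact(&mut block)?;
        return Ok(decompress(&block).get(start..start + len).map(|s| s.to_vec()));
    }
    Ok(None)
}

fn main() -> io::Result<()> {
    let files = [
        ("a.txt", "aaaaaaaaaaaa".as_bytes()),
        ("b.txt", "bbbbbbbbbbbb".as_bytes()),
        ("c.txt", "hello".as_bytes()),
        ("d.txt", "world".as_bytes()),
    ];
    let mut archive = Cursor::new(build(&files, 2)); // two files per block
    let got = read_file(&mut archive, "c.txt")?; // the first block is never read
    assert_eq!(got.as_deref(), Some("hello".as_bytes()));
    Ok(())
}
```

The design trade-off sits in `files_per_block`: larger blocks give the compressor more cross-file redundancy to exploit, while smaller blocks mean fewer bytes to read and decompress for any single file.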

Format Comparison

| Format | Compression Strength | Access Speed |
|---|---|---|
| ZIP | weaker | fast |
| tar+zstd | strong | slow |
| ARCX | strong | fast |

Limitations & Future Work

  • Unlike tar, ARCX is not designed for streaming. The manifest is written at the end, so the archive must be complete before it can be read.
  • Remote/S3 range‑read workflows have not been fully benchmarked yet.
  • Metadata/index overhead is still being optimized for very large file counts.
  • Full extraction benchmarks in Rust are still in progress.
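
Because the manifest sits at the end, a remote client would likely need a few HTTP range requests rather than one. The sketch below only plans the `Range` header values and performs no network I/O. The 8-byte footer, the helper names `byte_range` and `suffix_range`, and every offset and length in `main` are hypothetical, chosen for illustration; they are not taken from the ARCX format.

```rust
/// Inclusive HTTP byte range covering `len` bytes starting at `offset`.
fn byte_range(offset: u64, len: u64) -> Option<String> {
    if len == 0 {
        return None; // an empty span cannot be written as first-last
    }
    Some(format!("bytes={}-{}", offset, offset + len - 1))
}

/// Suffix range: the final `n` bytes of the object, whatever its total size.
fn suffix_range(n: u64) -> String {
    format!("bytes=-{n}")
}

fn main() {
    const FOOTER_LEN: u64 = 8; // hypothetical footer holding the manifest offset

    // Request 1: the footer. A suffix range works without knowing the archive size;
    // the response's Content-Range header then reveals the total length.
    println!("footer:   {}", suffix_range(FOOTER_LEN));

    // Request 2: the manifest (made-up size and offset).
    let (archive_len, manifest_offset) = (200_000_000u64, 199_900_000u64);
    let manifest_len = archive_len - FOOTER_LEN - manifest_offset;
    println!("manifest: {:?}", byte_range(manifest_offset, manifest_len));

    // Request 3: the one block named by the manifest entry (also made up).
    println!("block:    {:?}", byte_range(41_943_040, 327_680));
}
```

A client could fold the first two requests into one by fetching a larger suffix and hoping the manifest fits inside it, at the cost of sometimes over-reading.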

Still early – feedback welcome.
