6cy: A Content-Addressed Archive Format in Rust
Source: Dev.to
I built 6cy as an experimental archive format focused on content‑addressing, deduplication, and practical streaming workflows. This post is a short, concrete overview of the design and its current status.
Core Architectural Features
Streaming‑First Design
- Optimized for single‑pass read and write operations.
- Data can be appended without seeking back, fitting network streams and large‑scale pipelines.
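To make the single-pass idea concrete, here is a minimal sketch of an append-only writer and a forward-only reader. The framing (a little-endian u32 length followed by the payload) and the names StreamWriter, append_block, and read_block are assumptions for illustration, not the 6cy wire format or API.

```rust
use std::io::{self, Read, Write};

/// Hypothetical framing, not the 6cy wire format: each block is a
/// little-endian u32 payload length followed by the payload bytes.
pub struct StreamWriter<W: Write> {
    sink: W,
    bytes_written: u64,
}

impl<W: Write> StreamWriter<W> {
    pub fn new(sink: W) -> Self {
        StreamWriter { sink, bytes_written: 0 }
    }

    /// Appends one block and returns the offset at which it starts.
    /// Only `Write` is required (no `Seek`), so the sink can be a pipe or socket.
    pub fn append_block(&mut self, payload: &[u8]) -> io::Result<u64> {
        if payload.len() > u32::MAX as usize {
            return Err(io::Error::new(io::ErrorKind::InvalidInput, "block too large"));
        }
        let offset = self.bytes_written;
        self.sink.write_all(&(payload.len() as u32).to_le_bytes())?;
        self.sink.write_all(payload)?;
        self.bytes_written += 4 + payload.len() as u64;
        Ok(offset)
    }

    pub fn into_inner(self) -> W {
        self.sink
    }
}

/// Reads the next block in a forward pass. Returns `None` when the stream
/// ends before a full length prefix could be read.
pub fn read_block<R: Read>(src: &mut R) -> io::Result<Option<Vec<u8>>> {
    let mut len_buf = [0u8; 4];
    match src.read_exact(&mut len_buf) {
        Ok(()) => {}
        Err(ref e) if e.kind() == io::ErrorKind::UnexpectedEof => return Ok(None),
        Err(e) => return Err(e),
    }
    let mut payload = vec![0u8; u32::from_le_bytes(len_buf) as usize];
    src.read_exact(&mut payload)?;
    Ok(Some(payload))
}
```

Because the writer tracks offsets itself instead of asking the sink for its position, the same code works whether the output is a file, a Vec<u8>, or a network stream.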
Data Recoverability
- Blocks are self‑describing and carry integrity checks, including a content hash.
- Periodic checkpoints (recovery map) let readers recover data even if the archive is truncated or partially corrupted.
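The sketch below shows one way self-describing blocks can support recovery: a scanner walks the bytes, accepts any block whose header and checksum validate, and resynchronizes past damaged regions. The magic marker, header layout, and FNV-1a checksum are all assumptions chosen for the example. The recovery-map checkpoints mentioned above are not modeled here.

```rust
/// Hypothetical marker that starts every block in this sketch.
const MAGIC: [u8; 4] = *b"BLK1";
/// Header = magic (4) + payload length (4) + checksum (4).
const HEADER_LEN: usize = 12;

/// 32-bit FNV-1a, used only as a stand-in integrity check.
fn fnv1a(data: &[u8]) -> u32 {
    let mut h: u32 = 0x811c_9dc5;
    for &b in data {
        h ^= u32::from(b);
        h = h.wrapping_mul(0x0100_0193);
    }
    h
}

fn read_u32(bytes: &[u8]) -> u32 {
    u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]])
}

/// Frames a payload as header + payload.
pub fn encode_block(payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(HEADER_LEN + payload.len());
    out.extend_from_slice(&MAGIC);
    out.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    out.extend_from_slice(&fnv1a(payload).to_le_bytes());
    out.extend_from_slice(payload);
    out
}

/// Returns every payload whose header and checksum validate.
/// On any mismatch the scanner advances one byte and tries again,
/// so blocks after a damaged region can still be found.
pub fn recover_blocks(data: &[u8]) -> Vec<Vec<u8>> {
    let mut recovered = Vec::new();
    let mut pos = 0;
    while pos + HEADER_LEN <= data.len() {
        if &data[pos..pos + 4] == &MAGIC[..] {
            let len = read_u32(&data[pos + 4..pos + 8]) as usize;
            let sum = read_u32(&data[pos + 8..pos + 12]);
            let start = pos + HEADER_LEN;
            if let Some(end) = start.checked_add(len) {
                if end <= data.len() && fnv1a(&data[start..end]) == sum {
                    recovered.push(data[start..end].to_vec());
                    pos = end;
                    continue;
                }
            }
        }
        pos += 1;
    }
    recovered
}
```

A byte-by-byte resync is the simplest possible strategy. Periodic checkpoints would let a reader skip straight to known-good offsets instead of scanning.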
Codec Polymorphism
- Multiple compression algorithms can coexist in one archive (e.g., Zstd, LZ4).
- Each block can pick the best codec for its data, allowing the writer to trade speed vs. ratio per block.
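As a rough illustration of per-block codec choice, the sketch below tags each block with a one-byte codec id and picks whichever encoding is smaller. To stay dependency-free it uses two toy codecs, Stored and a simple run-length encoding, as stand-ins for real ones such as Zstd or LZ4. The tag values and function names are assumptions.

```rust
/// Stand-in codecs; a real archive would offer Zstd, LZ4, and so on.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Codec {
    Stored = 0,
    Rle = 1,
}

/// Run-length encoding as (count, byte) pairs, with runs capped at 255.
fn rle_encode(data: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut i = 0;
    while i < data.len() {
        let byte = data[i];
        let mut run = 1;
        while run < 255 && i + run < data.len() && data[i + run] == byte {
            run += 1;
        }
        out.push(run as u8);
        out.push(byte);
        i += run;
    }
    out
}

fn rle_decode(data: &[u8]) -> Option<Vec<u8>> {
    if data.len() % 2 != 0 {
        return None;
    }
    let mut out = Vec::new();
    for pair in data.chunks(2) {
        out.extend(std::iter::repeat(pair[1]).take(pair[0] as usize));
    }
    Some(out)
}

/// Encodes one block: a codec tag byte followed by the encoded body.
/// The writer keeps the RLE form only when it is actually smaller.
pub fn encode_block(payload: &[u8]) -> Vec<u8> {
    let rle = rle_encode(payload);
    let (codec, body) = if rle.len() < payload.len() {
        (Codec::Rle, rle)
    } else {
        (Codec::Stored, payload.to_vec())
    };
    let mut block = vec![codec as u8];
    block.extend_from_slice(&body);
    block
}

/// Decodes a block by dispatching on its tag; unknown tags yield `None`.
pub fn decode_block(block: &[u8]) -> Option<Vec<u8>> {
    let (&tag, body) = block.split_first()?;
    match tag {
        0 => Some(body.to_vec()),
        1 => rle_decode(body),
        _ => None,
    }
}
```

Because the tag travels with the block, the reader needs no global setting to decode an archive that mixes codecs.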
Plugin Architecture
- A simple plugin ABI and manifest allow closed‑source or third‑party codecs to be loaded as binary plugins without changing the core codebase.
- Keeps the core implementation small and auditable.
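The following is a simplified, in-process model of the plugin idea: a manifest, a codec interface, and a registry that rejects incompatible plugins. It does not load binary plugins, and every name here (Manifest, CodecPlugin, Registry, HOST_ABI_VERSION, and the manifest fields) is hypothetical rather than taken from the 6cy plugin ABI.

```rust
use std::collections::HashMap;

/// Hypothetical ABI version that the host accepts in this sketch.
pub const HOST_ABI_VERSION: u32 = 1;

/// Hypothetical manifest fields; the real 6cy manifest may differ.
pub struct Manifest {
    pub name: String,
    pub codec_id: u16,
    pub abi_version: u32,
}

/// The interface a codec plugin exposes to the core.
pub trait CodecPlugin {
    fn manifest(&self) -> Manifest;
    fn compress(&self, input: &[u8]) -> Vec<u8>;
    fn decompress(&self, input: &[u8]) -> Option<Vec<u8>>;
}

#[derive(Default)]
pub struct Registry {
    codecs: HashMap<u16, Box<dyn CodecPlugin>>,
}

impl Registry {
    /// Registers a plugin after checking its ABI version and codec id.
    pub fn register(&mut self, plugin: Box<dyn CodecPlugin>) -> Result<(), String> {
        let m = plugin.manifest();
        if m.abi_version != HOST_ABI_VERSION {
            return Err(format!("{}: unsupported ABI version {}", m.name, m.abi_version));
        }
        if self.codecs.contains_key(&m.codec_id) {
            return Err(format!("codec id {} is already registered", m.codec_id));
        }
        self.codecs.insert(m.codec_id, plugin);
        Ok(())
    }

    /// Looks up a codec by the id stored in a block header.
    pub fn get(&self, codec_id: u16) -> Option<&dyn CodecPlugin> {
        self.codecs.get(&codec_id).map(|p| &**p)
    }
}

/// Example plugin: an identity codec that stores bytes unchanged.
pub struct Identity {
    pub abi_version: u32,
}

impl CodecPlugin for Identity {
    fn manifest(&self) -> Manifest {
        Manifest { name: "identity".to_string(), codec_id: 42, abi_version: self.abi_version }
    }
    fn compress(&self, input: &[u8]) -> Vec<u8> {
        input.to_vec()
    }
    fn decompress(&self, input: &[u8]) -> Option<Vec<u8>> {
        Some(input.to_vec())
    }
}
```

The core only ever sees the trait and the manifest, which is what keeps it small: adding a codec means registering another implementation, not editing the reader or writer.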
Metadata‑First Indexing
- A central index maps files to blocks, enabling fast listing and random extraction.
- Readers do not need to scan the entire archive to find files.
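A minimal sketch of index-driven access, assuming an in-memory index and uncompressed payloads: listing touches only the index, and extraction seeks directly to the referenced byte ranges. BlockRef is named in the architecture section, but its fields here (offset, len) and the Index type are assumptions.

```rust
use std::collections::BTreeMap;
use std::io::{self, Read, Seek, SeekFrom};

/// Hypothetical layout of a block reference: where the payload starts
/// in the archive and how many bytes it spans.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BlockRef {
    pub offset: u64,
    pub len: u32,
}

/// Maps each file path to the ordered blocks that make up its content.
#[derive(Default)]
pub struct Index {
    files: BTreeMap<String, Vec<BlockRef>>,
}

impl Index {
    pub fn insert(&mut self, path: &str, blocks: Vec<BlockRef>) {
        self.files.insert(path.to_string(), blocks);
    }

    /// Lists file paths in sorted order without reading any block data.
    pub fn list(&self) -> Vec<&str> {
        self.files.keys().map(|k| k.as_str()).collect()
    }

    /// Extracts one file by seeking to each of its blocks in turn.
    /// Payloads are treated as uncompressed in this sketch.
    pub fn extract<R: Read + Seek>(&self, archive: &mut R, path: &str) -> io::Result<Vec<u8>> {
        let blocks = self
            .files
            .get(path)
            .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, "no such file in index"))?;
        let mut out = Vec::new();
        for b in blocks {
            archive.seek(SeekFrom::Start(b.offset))?;
            let mut buf = vec![0u8; b.len as usize];
            archive.read_exact(&mut buf)?;
            out.extend_from_slice(&buf);
        }
        Ok(out)
    }
}
```

The cost of extracting one file depends on that file's blocks, not on the size of the archive.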
Rust Reference Implementation
- The canonical implementation is written in Rust, prioritizing memory safety, clear error handling, and predictable performance.
- The repository serves as the reference for the format.
Architecture at a Glance
[ Data Blocks (content‑addressed) ... ]
[ Central Index (BlockRefs + metadata) ]
[ Superblock (index offset, codecs, UUID) ]
- Blocks carry block headers, compressed payloads, and a content hash.
- The index contains BlockRef entries that point to blocks or to remote archive references.
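Content addressing is what makes block-level deduplication fall out naturally: a block's address is derived from its bytes, so identical content maps to the same stored block. The sketch below assumes an in-memory store and uses 64-bit FNV-1a as a non-cryptographic stand-in for the real content hash. BlockStore and its methods are hypothetical names.

```rust
use std::collections::HashMap;

/// 64-bit FNV-1a, a non-cryptographic stand-in for the real content hash.
/// A production format would use a cryptographic hash to make collisions impractical.
fn content_hash(data: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325;
    for &b in data {
        h ^= u64::from(b);
        h = h.wrapping_mul(0x0000_0100_0000_01b3);
    }
    h
}

#[derive(Default)]
pub struct BlockStore {
    /// Stored payloads, in write order.
    blocks: Vec<Vec<u8>>,
    /// Content hash -> position in `blocks`.
    by_hash: HashMap<u64, usize>,
}

impl BlockStore {
    /// Stores `data` unless a block with the same hash is already present.
    /// Returns the content hash, which acts as the block's address.
    pub fn put(&mut self, data: &[u8]) -> u64 {
        let hash = content_hash(data);
        if !self.by_hash.contains_key(&hash) {
            self.by_hash.insert(hash, self.blocks.len());
            self.blocks.push(data.to_vec());
        }
        hash
    }

    /// Fetches a block by its content hash.
    pub fn get(&self, hash: u64) -> Option<&[u8]> {
        self.by_hash.get(&hash).map(|&i| self.blocks[i].as_slice())
    }

    /// Number of distinct blocks actually stored.
    pub fn stored_blocks(&self) -> usize {
        self.blocks.len()
    }
}
```

Writing the same bytes twice returns the same address and stores only one copy, which is the effect you would expect to see when packing the same file multiple times.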
Quickstart (try it)
git clone https://github.com/byte271/6cy
cd 6cy
# Build (Linux / macOS)
cargo build --release
# Pack files
./target/release/6cy pack -o test.6cy path/to/file1 path/to/file2
# Inspect
./target/release/6cy info test.6cy
Windows users: run the same commands in PowerShell or WSL. Be mindful of line endings and file ordering when testing cross‑platform determinism.
Current Status
- Implemented features: streaming write/read, central index, block‑level deduplication, content hashes, root hash, solid mode, plugin hooks.
- Validation in progress: cross‑platform determinism (Linux/Windows), crash‑recovery semantics, fuzz testing, performance benchmarks.
- State: experimental / stabilizing. The spec and implementation are evolving but now have a stable core surface.
How to Help
If you want to try the format, please:
- Clone the repo and run the quickstart above.
- Pack the same file multiple times to see deduplication in action.
- Open issues for bugs or missing documentation.
- Submit PRs for tests, examples, or platform fixes.
Open Source & Contact
Code and issues: