How We Built a Sub-Millisecond Crypto Feed in C++
Source: Dev.to
Most crypto market data APIs give you top‑of‑book prices with 100 ms+ latency. We wanted full L2 order books from 21 exchanges, all normalized into a single WebSocket stream, at sub‑millisecond speed. So we built it.
This post covers the core engineering decisions behind **[Microverse Systems](https://microversesystems.com/)** — a free, real‑time order‑book API that aggregates depth‑of‑market data across major crypto exchanges.
---
## The Problem
If you're building a trading bot, an arbitrage scanner, or even a simple price dashboard, you hit the same wall: every exchange has its own WebSocket protocol, its own message format, its own rate limits.
* **Binance** sends JSON.
* **Bybit** sends JSON but structures it differently.
* Some exchanges batch updates; others stream individual changes.
Normalizing all of this in Python or Node means you spend more time parsing messages than actually using the data. And if latency matters to your strategy, the language overhead alone puts you at a disadvantage.
---
## Why C++
We chose C++ for the core feed handler—not because we enjoy debugging segfaults, but because it was the only way to hit our latency targets:
- **Zero‑copy message parsing** – Incoming WebSocket frames are parsed in‑place using pointer arithmetic rather than deserializing into intermediate objects. This avoids heap allocations on the hot path.
- **Lock‑free order‑book structures** – Each exchange’s order book is maintained in a lock‑free data structure that allows readers (subscriber threads) to access snapshots without blocking the writer (the feed‑handler thread).
- **Kernel‑bypass networking** – On our production boxes we use DPDK to bypass the kernel’s TCP/IP stack entirely. This shaves off ~15 µs per packet compared to standard socket reads.
The result is internal tick‑to‑publish latency under 50 µs for most exchanges. The bottleneck is almost always the exchange’s own WebSocket server, not our processing.
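To make the zero‑copy idea concrete, here is a minimal sketch rather than the production parser. It assumes a price level arrives as an ASCII fragment such as `["50000.10","1.5"]` inside the receive buffer, and it reads the digits through `std::string_view` slices so nothing is copied or heap‑allocated. The names `parse_fixed`, `parse_level`, and `Level`, and the 8‑digit fixed‑point scale, are illustrative assumptions.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <optional>
#include <string_view>

// Hypothetical helper: parse an unsigned ASCII decimal such as "50000.10"
// into a fixed-point integer with `Scale` fractional digits, reading the
// bytes in place. Returns std::nullopt on malformed input or overflow.
template <int Scale = 8>
std::optional<std::int64_t> parse_fixed(std::string_view text) {
    constexpr std::int64_t kMax = std::numeric_limits<std::int64_t>::max();
    std::int64_t value = 0;
    int frac_digits = 0;
    bool seen_dot = false;
    bool seen_digit = false;
    for (char c : text) {
        if (c == '.') {
            if (seen_dot) return std::nullopt;  // second '.' is malformed
            seen_dot = true;
            continue;
        }
        if (c < '0' || c > '9') return std::nullopt;
        seen_digit = true;
        if (seen_dot) {
            if (frac_digits == Scale) continue;  // truncate extra precision
            ++frac_digits;
        }
        const int digit = c - '0';
        if (value > (kMax - digit) / 10) return std::nullopt;  // overflow
        value = value * 10 + digit;
    }
    if (!seen_digit) return std::nullopt;
    for (; frac_digits < Scale; ++frac_digits) {  // pad to the full scale
        if (value > kMax / 10) return std::nullopt;
        value *= 10;
    }
    return value;
}

struct Level {
    std::int64_t price;  // fixed-point, 8 fractional digits
    std::int64_t qty;    // fixed-point, 8 fractional digits
};

// Extract the two quoted fields of a fragment like ["50000.10","1.5"].
// Each field is a view into the caller's buffer, never a copy.
std::optional<Level> parse_level(std::string_view frag) {
    auto next_quoted = [&frag]() -> std::optional<std::string_view> {
        const auto open = frag.find('"');
        if (open == std::string_view::npos) return std::nullopt;
        const auto close = frag.find('"', open + 1);
        if (close == std::string_view::npos) return std::nullopt;
        std::string_view field = frag.substr(open + 1, close - open - 1);
        frag.remove_prefix(close + 1);
        return field;
    };
    const auto price_text = next_quoted();
    const auto qty_text = next_quoted();
    if (!price_text || !qty_text) return std::nullopt;
    const auto price = parse_fixed(*price_text);
    const auto qty = parse_fixed(*qty_text);
    if (!price || !qty) return std::nullopt;
    return Level{*price, *qty};
}
```

Calling `parse_level` on a slice of the frame buffer yields integer price and quantity without constructing a `std::string` or a JSON tree. A real handler would also need to deal with escapes, signs, and exponent notation, which this sketch ignores.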
---
## Architecture Overview
    Exchange WS Feeds ──► C++ Feed Handlers ──► Normalized Book Builder
                                                           │
                                                           ▼
                                            Snapshot Cache (shared memory)
                                                           │
                                                   ┌───────┴───────┐
                                                   ▼               ▼
                                           WebSocket Gateway   REST API
                                             (user‑facing)   (historical)
Each exchange gets its own feed‑handler process. These are independent—if Bybit’s feed dies, it doesn’t take down Binance. The handlers write normalized book updates into a shared‑memory ring buffer that the WebSocket gateway reads from.
The gateway fans out to subscribers. When a new client connects and requests, say, `BTC/USDT` on Binance, it receives an immediate full‑depth snapshot from the cache, then a stream of incremental updates.
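As an illustration of the handler‑to‑gateway hand‑off, the following is a minimal single‑producer, single‑consumer ring buffer. It is a sketch under assumptions: the `BookUpdate` layout and the `SpscRing` name are hypothetical, and the buffer lives in ordinary process memory here. Placing it in a shared mapping would additionally require constructing it inside that mapping and confirming that the atomics are lock‑free on the target platform.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <type_traits>

// Hypothetical normalized update record (not the real wire format).
struct BookUpdate {
    std::uint32_t instrument_id;
    std::int64_t price;  // fixed-point
    std::int64_t qty;    // fixed-point
    bool is_bid;
};

template <typename T, std::size_t Capacity>
class SpscRing {
    static_assert(Capacity != 0 && (Capacity & (Capacity - 1)) == 0,
                  "Capacity must be a power of two");
    static_assert(std::is_trivially_copyable_v<T>,
                  "slots are copied bytewise, so T must be trivially copyable");

public:
    // Producer side only (one feed-handler thread). Returns false when full.
    bool try_push(const T& item) {
        const auto head = head_.load(std::memory_order_relaxed);
        const auto tail = tail_.load(std::memory_order_acquire);
        if (head - tail == Capacity) return false;
        slots_[head & (Capacity - 1)] = item;
        // Release: the slot write becomes visible before the new head does.
        head_.store(head + 1, std::memory_order_release);
        return true;
    }

    // Consumer side only (one gateway thread). Returns nullopt when empty.
    std::optional<T> try_pop() {
        const auto tail = tail_.load(std::memory_order_relaxed);
        const auto head = head_.load(std::memory_order_acquire);
        if (tail == head) return std::nullopt;
        T item = slots_[tail & (Capacity - 1)];
        // Release: the slot read completes before the slot is handed back.
        tail_.store(tail + 1, std::memory_order_release);
        return item;
    }

private:
    std::array<T, Capacity> slots_{};
    // Separate cache lines so producer and consumer do not false-share.
    alignas(64) std::atomic<std::uint64_t> head_{0};
    alignas(64) std::atomic<std::uint64_t> tail_{0};
};
```

Neither side ever takes a lock: the producer only writes `head_`, the consumer only writes `tail_`, and a full buffer is reported to the caller instead of blocking.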
---
## The Normalization Layer
This is where most of the complexity lives. Every exchange represents order books slightly differently:
| Exchange | Snapshot / Diff Details |
|----------|--------------------------|
| **Binance** | Initial snapshot + diff updates with `firstUpdateId` / `lastUpdateId` for sequencing |
| **Bybit** | Periodic snapshots + delta updates with a sequence number |
| **OKX** | Batches multiple instruments in a single message with checksums |
| **Kraken** | Completely different depth model with `republish` flags |
Our normalization layer maintains a state machine per exchange per instrument. It handles:
- **Initial sync** – Requesting a snapshot, buffering diffs until the snapshot arrives.
- **Sequence validation** – Detecting gaps and re‑syncing.
- **Cross‑normalization** – Converting all price/quantity to the same decimal format.
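The first two steps can be sketched as a small state machine. This is an assumption‑laden illustration, not the actual implementation: it models a Binance‑style feed where each diff covers an ID range and the snapshot carries the last ID it includes. `Diff` and `BookSync` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <vector>

// Hypothetical diff event covering update IDs [first_id, last_id].
struct Diff {
    std::uint64_t first_id;
    std::uint64_t last_id;
};

class BookSync {
public:
    // True while the caller should request (or wait for) a snapshot.
    bool needs_snapshot() const { return !synced_; }

    // Feed one diff. Returns the diffs that are safe to apply right now.
    std::vector<Diff> on_diff(const Diff& d) {
        if (!synced_) {  // initial sync: buffer until the snapshot arrives
            pending_.push_back(d);
            return {};
        }
        if (d.last_id <= last_applied_) return {};  // already covered
        if (d.first_id > last_applied_ + 1) {       // gap: force a re-sync
            synced_ = false;
            pending_.clear();
            pending_.push_back(d);
            return {};
        }
        last_applied_ = d.last_id;
        return {d};
    }

    // Feed the snapshot. Returns buffered diffs that continue from it.
    // If a gap shows up while draining, needs_snapshot() turns true again.
    std::vector<Diff> on_snapshot(std::uint64_t snapshot_last_id) {
        last_applied_ = snapshot_last_id;
        synced_ = true;
        std::deque<Diff> buffered;
        buffered.swap(pending_);
        std::vector<Diff> ready;
        for (const Diff& d : buffered) {
            for (const Diff& ok : on_diff(d)) ready.push_back(ok);
        }
        return ready;
    }

private:
    bool synced_ = false;
    std::uint64_t last_applied_ = 0;
    std::deque<Diff> pending_;
};
```

Diffs that the snapshot already covers are dropped, the first diff that straddles the snapshot ID is accepted, and any later jump in IDs flips the instrument back into the waiting state.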
We checksum the book state after every update and compare it against exchange‑provided checksums where available (OKX, Kraken). If there’s a mismatch, we force a full re‑sync.
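A checksum comparison of this kind might look like the sketch below. The CRC‑32 routine is the standard reflected variant; the way levels are joined into the input string (bids and asks interleaved, fields separated by `:`, top 25 levels) is an assumption loosely modeled on OKX‑style checksums and should be checked against each exchange's documentation. `LevelText`, `checksum_input`, and `book_matches` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>
#include <vector>

// Standard CRC-32 (reflected polynomial 0xEDB88320), computed bit by bit.
std::uint32_t crc32_ieee(std::string_view data) {
    std::uint32_t crc = 0xFFFFFFFFu;
    for (unsigned char byte : data) {
        crc ^= byte;
        for (int bit = 0; bit < 8; ++bit) {
            // The mask is all ones when the low bit is set, else zero.
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

// Price and quantity kept exactly as the exchange sent them, because
// reformatting the digits would change the checksum.
struct LevelText {
    std::string price;
    std::string qty;
};

// Assumed layout: bid1:ask1:bid2:ask2..., each level as "price:qty".
std::string checksum_input(const std::vector<LevelText>& bids,
                           const std::vector<LevelText>& asks,
                           std::size_t depth = 25) {
    std::string out;
    auto append = [&out](const LevelText& level) {
        if (!out.empty()) out += ':';
        out += level.price;
        out += ':';
        out += level.qty;
    };
    for (std::size_t i = 0; i < depth; ++i) {
        if (i < bids.size()) append(bids[i]);
        if (i < asks.size()) append(asks[i]);
    }
    return out;
}

// A false result means the local book has diverged and needs a re-sync.
bool book_matches(const std::vector<LevelText>& bids,
                  const std::vector<LevelText>& asks,
                  std::uint32_t expected) {
    return crc32_ieee(checksum_input(bids, asks)) == expected;
}
```

Some exchanges publish the checksum as a signed 32‑bit integer, in which case the expected value would need a cast before comparison.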
---
## What We Ship to Users
The API is intentionally simple. Connect via WebSocket and send a subscribe message:
```json
{
  "action": "subscribe",
  "exchange": "binance",
  "symbol": "BTC/USDT",
  "depth": 25
}
```
You receive a full snapshot, then a stream of incremental updates. All exchanges use the same message format, so there is no need to learn 21 different APIs.
- No API key required
- No rate limits on the WebSocket stream
We want this to be the easiest way to get institutional‑grade market data without paying institutional‑grade prices (it’s free).
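On the client side, a snapshot followed by incremental updates usually means keeping a local copy of the book. The sketch below shows one way to do that. The update fields and the convention that a zero quantity removes a level are assumptions for illustration, not the documented message format, and `LevelUpdate` and `LocalBook` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <optional>
#include <vector>

// Hypothetical decoded update: one price level on one side of the book.
struct LevelUpdate {
    bool is_bid;
    std::int64_t price;  // fixed-point
    std::int64_t qty;    // fixed-point; assumed: 0 removes the level
};

class LocalBook {
public:
    // Replace the whole book with the levels from a snapshot.
    void load_snapshot(const std::vector<LevelUpdate>& levels) {
        bids_.clear();
        asks_.clear();
        for (const auto& level : levels) apply(level);
    }

    // Apply one incremental update.
    void apply(const LevelUpdate& u) {
        if (u.is_bid) {
            upsert(bids_, u);
        } else {
            upsert(asks_, u);
        }
    }

    std::optional<std::int64_t> best_bid() const {
        if (bids_.empty()) return std::nullopt;
        return bids_.begin()->first;
    }

    std::optional<std::int64_t> best_ask() const {
        if (asks_.empty()) return std::nullopt;
        return asks_.begin()->first;
    }

private:
    template <typename Side>
    static void upsert(Side& side, const LevelUpdate& u) {
        if (u.qty == 0) {
            side.erase(u.price);
        } else {
            side[u.price] = u.qty;
        }
    }

    // Bids sort from highest price down, asks from lowest price up,
    // so begin() is the top of book on both sides.
    std::map<std::int64_t, std::int64_t, std::greater<>> bids_;
    std::map<std::int64_t, std::int64_t> asks_;
};
```

Because every exchange arrives in the same shape, one such class can serve all subscribed venues.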
---
## Lessons Learned
- **Shared memory is underrated.** We initially passed data between feed handlers and the gateway over Unix sockets. Switching to `mmap`‑backed ring buffers cut internal latency by 10× and eliminated a whole class of back‑pressure issues.
- **Exchange WebSocket connections are fragile.** Binance, for example, can silently stop sending updates without closing the connection. We now have heartbeat monitors on every feed that force a reconnect if no message arrives within 2× the expected interval.
- **Don’t trust exchange timestamps.** Some exchanges report timestamps in seconds, some in milliseconds, some with timezone offsets, some without. We stamp everything with our own receive time and treat exchange timestamps as advisory.
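The heartbeat rule described above (reconnect when nothing arrives within 2× the expected interval) can be sketched as follows. `HeartbeatMonitor` is a hypothetical name, and the time points are passed in explicitly so the logic is easy to test. The sketch uses a monotonic clock, which matches the advice to rely on local receive time and not on exchange timestamps.

```cpp
#include <cassert>
#include <chrono>

class HeartbeatMonitor {
public:
    using Clock = std::chrono::steady_clock;  // monotonic, immune to wall-clock jumps

    explicit HeartbeatMonitor(Clock::duration expected_interval,
                              Clock::time_point now = Clock::now())
        : timeout_(expected_interval * 2), last_message_(now) {}

    // Call for every frame received from the exchange, including pings.
    void on_message(Clock::time_point now = Clock::now()) {
        last_message_ = now;
    }

    // True when the feed has been silent for more than 2x the expected
    // interval; the caller should then drop the socket and reconnect.
    bool is_stale(Clock::time_point now = Clock::now()) const {
        return now - last_message_ > timeout_;
    }

private:
    Clock::duration timeout_;
    Clock::time_point last_message_;
};
```

A feed handler would call `on_message()` from its read loop and poll `is_stale()` from a timer, since a silently stalled connection never produces an error of its own.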
---
## Try It
The API is live now at microversesystems.com. The docs contain code samples for Python, Node, and Rust, and there’s also a live dashboard.
Happy coding!
**Explore Real‑Time Order Books**
Visit **[a.microversesystems.com](https://a.microversesystems.com/)** to watch the order books update live.
If you're building anything that needs crypto market data—trading bots, analytics dashboards, academic research—give it a try. We'd love your feedback.
*Built by [Microverse Systems](https://microversesystems.com/). Questions? Drop a comment or open an issue on our [GitHub](https://github.com/microversesystems).*