JSONB vs. BSON: Tracing PostgreSQL and MongoDB Wire Protocols
Source: Dev.to
Overview
MongoDB’s BSON and PostgreSQL’s JSONB are both binary JSON formats, but they serve different roles. JSONB is purely an internal storage format for JSON data in PostgreSQL; clients still send and receive JSON as text. BSON, on the other hand, is MongoDB’s native data format: application drivers use it to communicate with the database, and it also serves as the on-disk storage format.
Key Differences
- Purpose
  - JSONB: optimized for storage and querying inside PostgreSQL.
  - BSON: designed for efficient network transmission and on-disk storage in MongoDB.
- Data Types
  - JSONB supports exactly the JSON types (objects, arrays, strings, numbers, booleans, null); numbers are stored as PostgreSQL numeric values.
  - BSON adds types such as ObjectId, Date, Binary, and Decimal128, and distinguishes 32-bit integers, 64-bit integers, and doubles.
- Size Overhead
  - BSON stores a type byte for every element and length prefixes for documents, arrays, and strings. JSONB has its own per-element headers, so which format is smaller depends on the data; neither is guaranteed to be more compact.
- Indexing
  - PostgreSQL can index JSONB columns with GIN indexes, or index extracted values with B-tree expression indexes.
  - MongoDB indexes fields of BSON documents directly, including nested fields.
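To make the type bytes and length prefixes concrete, here is a minimal sketch of a BSON encoder in Python. It is illustrative only: the `encode_bson` helper is hypothetical, handles just flat documents with string, int32, boolean, and null values, and is not how a real driver is implemented. It compares the result with compact JSON text, not with JSONB, whose internal layout is only visible inside PostgreSQL.

```python
import json
import struct


def encode_bson(doc: dict) -> bytes:
    """Encode a flat dict of str/int/bool/None values as BSON (illustrative subset)."""
    body = b""
    for key, value in doc.items():
        name = key.encode("utf-8") + b"\x00"      # keys are null-terminated C strings
        if isinstance(value, bool):               # bool must be tested before int
            body += b"\x08" + name + (b"\x01" if value else b"\x00")
        elif isinstance(value, int):
            # Assumes the value fits in a signed 32-bit integer (type 0x10).
            body += b"\x10" + name + struct.pack("<i", value)
        elif isinstance(value, str):
            data = value.encode("utf-8") + b"\x00"
            # The string length prefix counts the trailing null byte.
            body += b"\x02" + name + struct.pack("<i", len(data)) + data
        elif value is None:
            body += b"\x0a" + name                # null has no payload
        else:
            raise TypeError(f"unsupported type in this sketch: {type(value).__name__}")
    # The document length counts its own 4 bytes and the final 0x00 terminator.
    return struct.pack("<i", len(body) + 5) + body + b"\x00"


doc = {"foo": "bar", "n": 42, "ok": True}
bson_bytes = encode_bson(doc)
json_bytes = json.dumps(doc, separators=(",", ":")).encode("utf-8")
print(bson_bytes.hex(" "))
print(len(bson_bytes), "bytes as BSON,", len(json_bytes), "bytes as compact JSON text")
```

The per-element type byte and the fixed-width integer are what let a reader skip over fields without parsing them, which is the trade BSON makes for the extra bytes.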
Tracing PostgreSQL Wire Protocol
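PostgreSQL communicates with clients using its own message-based wire protocol, by default on TCP port 5432. pg_recvlogical does not show raw protocol messages, but it streams the logical decoding output of a replication slot, which shows the changes the server emits:
# Stream logical decoding output to stdout (assumes the slot debug_slot already exists)
pg_recvlogical \
--dbname=postgres \
--slot=debug_slot \
--start \
--file=- \
--verbose
To capture the wire traffic itself, use tcpdump:
sudo tcpdump -i any -w pg.pcap port 5432
You can then open the capture file in Wireshark and apply the display filter pgsql to decode the protocol messages. This only works for unencrypted connections; with TLS enabled, the payload is not readable.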
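If you want to read a capture without Wireshark, the framing of regular PostgreSQL messages is simple: one type byte, a big-endian 32-bit length that includes itself but not the type byte, then the payload. The sketch below builds and splits such messages. The helper names `build_query` and `split_messages` are hypothetical, and the code assumes the stream starts after the startup phase, since the startup message has no type byte.

```python
import struct


def build_query(sql: str) -> bytes:
    """Build a simple-query ('Q') frontend message carrying SQL text."""
    payload = sql.encode("utf-8") + b"\x00"       # query string is null-terminated
    # The length field covers itself (4 bytes) plus the payload, not the type byte.
    return b"Q" + struct.pack("!I", len(payload) + 4) + payload


def split_messages(stream: bytes):
    """Yield (type, payload) for each complete message in a byte stream."""
    offset = 0
    while offset + 5 <= len(stream):
        msg_type = stream[offset:offset + 1].decode("ascii")
        (length,) = struct.unpack_from("!I", stream, offset + 1)
        end = offset + 1 + length
        if end > len(stream):                     # incomplete trailing message
            break
        yield msg_type, stream[offset + 5:end]
        offset = end


# A query sending JSON as plain text, followed by a Terminate ('X') message.
stream = build_query("SELECT '{\"foo\":\"bar\"}'::jsonb") + b"X\x00\x00\x00\x04"
for msg_type, payload in split_messages(stream):
    print(msg_type, payload)
```

Note that the JSON document travels as text inside the SQL string; nothing in the client-side message is JSONB.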
Tracing MongoDB Wire Protocol
MongoDB’s wire protocol frames its messages around BSON documents. A simple way to capture the traffic is to run tcpdump while a client (for example mongodump) talks to the server:
# Capture traffic on default MongoDB port 27017
sudo tcpdump -i any -w mongo.pcap port 27017
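In Wireshark, use the display filter mongo to decode the messages. With current servers and drivers you’ll mostly see OP_MSG; legacy opcodes such as OP_QUERY and OP_INSERT appear mainly with older versions. In every case the payload is made of BSON documents.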
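Every MongoDB message starts with a 16-byte header of four little-endian int32 fields: messageLength, requestID, responseTo, and opCode. The following sketch parses that header and builds a minimal OP_MSG frame around an empty BSON document. The helpers `parse_header` and `build_op_msg` are hypothetical names, and the sample frame is a placeholder for illustrating framing, not a valid command.

```python
import struct

# A few opcodes; OP_MSG (2013) is what current drivers use for commands.
OPCODES = {1: "OP_REPLY", 2002: "OP_INSERT", 2004: "OP_QUERY", 2013: "OP_MSG"}
HEADER = struct.Struct("<iiii")  # messageLength, requestID, responseTo, opCode


def parse_header(message: bytes) -> dict:
    """Decode the standard 16-byte message header."""
    length, request_id, response_to, op_code = HEADER.unpack_from(message, 0)
    return {
        "length": length,
        "request_id": request_id,
        "response_to": response_to,
        "op": OPCODES.get(op_code, f"unknown ({op_code})"),
    }


def build_op_msg(request_id: int, bson_doc: bytes) -> bytes:
    """Frame one BSON document as OP_MSG: flag bits, section kind 0, then the body."""
    body = struct.pack("<I", 0) + b"\x00" + bson_doc
    # messageLength counts the header itself plus everything after it.
    return HEADER.pack(HEADER.size + len(body), request_id, 0, 2013) + body


empty_doc = bytes.fromhex("0500000000")           # BSON encoding of {}
sample = build_op_msg(1, empty_doc)
print(parse_header(sample))
```

Unlike PostgreSQL, which uses network byte order, all integers here are little-endian, which is worth remembering when reading hex dumps of both protocols side by side.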
Example: Decoding a BSON Document
Suppose you captured the following raw bytes (hex):
12 00 00 00 02 66 6f 6f 00 04 00 00 00 62 61 72 00 00
Interpretation (this is the document {"foo": "bar"}):
| Offset | Bytes | Meaning |
|---|---|---|
| 0-3 | 12 00 00 00 | Document length (18 bytes, little-endian int32) |
| 4 | 02 | Element type: UTF-8 string |
| 5-7 | 66 6f 6f | Key: foo |
| 8 | 00 | Null terminator for key |
| 9-12 | 04 00 00 00 | String length (4, including the null terminator) |
| 13-15 | 62 61 72 | Value: bar |
| 16 | 00 | Null terminator for value |
| 17 | 00 | End of document marker |
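The same walk through the bytes can be done in code. Below is a minimal decoder sketch for the document {"foo": "bar"}, whose BSON encoding is 18 bytes long. The `decode_bson` helper is hypothetical and supports only string and int32 elements; a real application would use the `bson` package shipped with a MongoDB driver instead.

```python
import struct


def decode_bson(data: bytes) -> dict:
    """Decode a flat BSON document containing only string and int32 elements."""
    (length,) = struct.unpack_from("<i", data, 0)
    if length != len(data) or data[-1] != 0:
        raise ValueError("declared length or terminator does not match the buffer")
    doc = {}
    pos = 4                                        # skip the length prefix
    while data[pos] != 0:                          # 0x00 marks the end of the document
        type_tag = data[pos]
        key_end = data.index(b"\x00", pos + 1)     # key is a null-terminated C string
        key = data[pos + 1:key_end].decode("utf-8")
        pos = key_end + 1
        if type_tag == 0x02:                       # UTF-8 string
            (str_len,) = struct.unpack_from("<i", data, pos)
            # str_len includes the trailing null, which is not part of the value.
            doc[key] = data[pos + 4:pos + 4 + str_len - 1].decode("utf-8")
            pos += 4 + str_len
        elif type_tag == 0x10:                     # int32
            (value,) = struct.unpack_from("<i", data, pos)
            doc[key] = value
            pos += 4
        else:
            raise NotImplementedError(f"type 0x{type_tag:02x} not handled in this sketch")
    return doc


raw = bytes.fromhex("12 00 00 00 02 66 6f 6f 00 04 00 00 00 62 61 72 00 00")
print(decode_bson(raw))
```

The length check at the top is deliberate: a length prefix that disagrees with the number of bytes actually present is a common sign of a truncated capture or a mis-transcribed dump.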
When to Use Which Format
- Use JSONB when working within PostgreSQL and you need powerful SQL querying capabilities, indexing, and transactional guarantees.
- Use BSON when interacting with MongoDB, especially when you need MongoDB‑specific data types or want to leverage its flexible schema and sharding features.
Understanding the wire protocols and the underlying binary formats helps when debugging performance issues, building custom drivers, or simply gaining deeper insight into how these databases operate under the hood.