Which Document class is best to use in Java to read MongoDB documents?
Source: Dev.to
TL;DR
The answer is in the title – use Document.
Overview
BSON is a binary serialization format (similar to protobuf) that MongoDB uses for efficient storage and network transfer.
Instead of scanning and rewriting the whole byte sequence to read or write fields, you work with an in‑memory object that exposes convenient methods.
On the server side MongoDB uses a mutable BSON object; on the client side the Java driver provides several classes that implement the Bson interface.
Below is a concise guide to the five document classes available in the MongoDB Java driver, their characteristics, and when to choose each one.
1. Document (recommended for most applications)
- Type: Map<String, Object> (backed by a LinkedHashMap to preserve insertion order)
- Key characteristics
  - Loosely‑typed: values are plain Java objects (String, Integer, Date, …)
  - Flexible: easy to work with dynamically‑structured documents
  - Map API: all standard Map operations are available
- When to use
  - You want a concise, flexible representation that “just works”.
  - No strict BSON‑type safety is required.
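As a rough illustration of the characteristics listed above, the following sketch builds a Document, reads values through its typed getters, and then modifies it through the Map API. It assumes the org.bson library (shipped with the MongoDB Java driver) is on the classpath; the class name DocumentExample and all field names are made up for this example.

```java
import org.bson.Document;

import java.util.Arrays;
import java.util.List;

// Illustrative only: class name, field names, and values are hypothetical.
class DocumentExample {

    static Document buildOrder() {
        // Values are plain Java objects; append() returns the same Document for chaining.
        return new Document("customer", "Ada")
                .append("quantity", 3)
                .append("tags", Arrays.asList("gift", "express"))
                .append("address", new Document("city", "Paris"));
    }

    public static void main(String[] args) {
        Document order = buildOrder();

        // Typed getters cast the stored Object to the requested type.
        String customer = order.getString("customer");
        int quantity = order.getInteger("quantity");
        List<String> tags = order.getList("tags", String.class);
        String city = order.get("address", Document.class).getString("city");

        // Standard Map operations are available because Document is a Map.
        order.put("status", "NEW");
        order.remove("tags");

        System.out.println(customer + ", " + quantity + ", " + tags + ", " + city);
        System.out.println(order.keySet());
    }
}
```

Because values are stored as plain Object, a getter such as getString throws a ClassCastException at runtime if the stored value has a different type. That is the trade-off behind "loosely-typed".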
2. BsonDocument
- Type: Map<String, BsonValue> (also backed by a LinkedHashMap)
- Key characteristics
  - Type‑safe: every value must be a BSON library type (BsonString, BsonInt32, BsonDocument, …)
  - Stricter API: compile‑time safety, but more verbose code
  - Map API: same as Document, but with BsonValue values
- When to use
  - You need explicit control over BSON types (e.g., precise type handling, interacting with APIs that require BsonDocument).
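To make the contrast with Document concrete, here is a minimal sketch of the same kind of data held in a BsonDocument. It assumes the org.bson library is on the classpath; the class name BsonDocumentExample and the field names are hypothetical.

```java
import org.bson.BsonDocument;
import org.bson.BsonInt32;
import org.bson.BsonString;
import org.bson.BsonValue;

// Illustrative only: class name, field names, and values are hypothetical.
class BsonDocumentExample {

    static BsonDocument buildOrder() {
        // Every value has to be wrapped in a BsonValue subtype.
        return new BsonDocument("customer", new BsonString("Ada"))
                .append("quantity", new BsonInt32(3));
    }

    public static void main(String[] args) {
        BsonDocument order = buildOrder();

        // Accessors return the BSON wrapper; getValue() unwraps the Java value.
        String customer = order.getString("customer").getValue();
        int quantity = order.getInt32("quantity").getValue();

        // order.put("quantity", 3);  // does not compile: 3 is not a BsonValue

        // The generic Map accessor returns a BsonValue that knows its BSON type.
        BsonValue raw = order.get("quantity");
        System.out.println(customer + ", " + quantity + ", " + raw.getBsonType());
    }
}
```

The compile-time guarantee only covers that every value is a BsonValue. Asking for the wrong type, for example getString on an int32 field, still fails at runtime with a BsonInvalidOperationException.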
3. RawBsonDocument
- Type: immutable wrapper around a raw byte array (the BSON document)
- Key characteristics
  - Immutable: mutation operations throw UnsupportedOperationException
  - Lazy parsing: data is parsed only when accessed – ideal for pass‑through scenarios
  - Memory‑efficient: no intermediate Java objects are created
  - Conversion: decode(codec) can turn it into any other document type when needed
- When to use
  - Maximum performance & memory efficiency for whole‑document operations.
  - Scenarios such as:
    - Reading documents and forwarding them unchanged (e.g., change streams, client‑side encryption).
    - Working with very large documents you don’t need to inspect field‑by‑field.
    - High‑throughput pipelines where parsing overhead matters.
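The sketch below shows these properties in a small, self-contained form: it serializes a document into a RawBsonDocument, reads a single field, confirms that mutation is rejected, and decodes the whole thing into a Document. It assumes the org.bson library is on the classpath. In a real application the raw bytes would typically come from the driver rather than being built locally, and the class and field names here are hypothetical.

```java
import org.bson.BsonDocument;
import org.bson.BsonInt32;
import org.bson.BsonString;
import org.bson.Document;
import org.bson.RawBsonDocument;
import org.bson.codecs.BsonDocumentCodec;
import org.bson.codecs.DocumentCodec;

// Illustrative only: class name, field names, and values are hypothetical.
class RawBsonDocumentExample {

    static RawBsonDocument sample() {
        BsonDocument source = new BsonDocument("customer", new BsonString("Ada"))
                .append("quantity", new BsonInt32(3));
        // Encode once; the result wraps the resulting byte array.
        return new RawBsonDocument(source, new BsonDocumentCodec());
    }

    static boolean rejectsMutation(RawBsonDocument raw) {
        try {
            raw.put("status", new BsonString("NEW"));
            return false;
        } catch (UnsupportedOperationException e) {
            return true; // expected: the wrapper is immutable
        }
    }

    public static void main(String[] args) {
        RawBsonDocument raw = sample();

        // Size of the underlying BSON bytes.
        int sizeInBytes = raw.getByteBuffer().remaining();

        // Reading one field scans the bytes; nothing is cached as Java objects.
        String customer = raw.getString("customer").getValue();

        // Full materialization only when it is actually needed.
        Document decoded = raw.decode(new DocumentCodec());

        System.out.println(sizeInBytes + " bytes, customer=" + customer);
        System.out.println("mutation rejected: " + rejectsMutation(raw));
        System.out.println(decoded.toJson());
    }
}
```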
4. JsonObject
- Type: simple wrapper around a JSON string (String)
- Key characteristics
  - Does not implement Map; it’s just a string holder with optional validation.
  - No parsing is performed – the raw JSON stays as‑is.
  - Supports MongoDB Extended JSON format.
- When to use
  - Your application primarily deals with JSON (e.g., REST APIs, logging, persisting documents as JSON).
  - You want to avoid the cost of converting to/from a Map.
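A small sketch of the "string holder" behaviour: the JSON text goes in and comes out unchanged, and it is only parsed when you explicitly ask for another representation. This assumes a version of the org.bson library that includes org.bson.json.JsonObject; the class name JsonObjectExample and the sample JSON are hypothetical.

```java
import org.bson.BsonDocument;
import org.bson.json.JsonObject;

// Illustrative only: class name and sample JSON are hypothetical.
class JsonObjectExample {

    // Extended JSON: $numberInt marks the value as a 32-bit integer.
    static final String JSON =
            "{\"customer\": \"Ada\", \"quantity\": {\"$numberInt\": \"3\"}}";

    static JsonObject wrap(String json) {
        return new JsonObject(json);
    }

    public static void main(String[] args) {
        JsonObject order = wrap(JSON);

        // getJson() hands back the wrapped text.
        System.out.println(order.getJson());

        // Parsing is a separate, explicit step.
        BsonDocument parsed = BsonDocument.parse(order.getJson());
        System.out.println(parsed.getInt32("quantity").getValue());
    }
}
```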
5. BasicDBObject (legacy)
- Type: extends BasicBSONObject and implements DBObject (does not implement Map).
- Key characteristics
  - Exists only for backward compatibility with driver versions prior to 3.0.
  - Lacks modern Map convenience methods.
  - May cause binary‑compatibility issues.
- When to use
  - Migrating old code that already uses BasicDBObject.
  - Do not use for new development – the driver documentation recommends avoiding it.
Inter‑class Conversions
All five classes implement the Bson interface, so they can be used interchangeably in driver operations (though performance may differ).
| From → To | Example |
|---|---|
| JSON → BsonDocument | BsonDocument.parse(jsonString) |
| Raw bytes → other type | rawBsonDocument.decode(codec) |
| Document → BsonDocument | Use a CodecRegistry/DocumentCodec (or document.toBsonDocument(... )) |
| BsonDocument → Document | documentCodec.decode(bsonDocument.asBsonReader(), DecoderContext.builder().build()) |
| JsonObject → JSON string | jsonObject.getJson() |
| RawBsonDocument ↔ Document | Encode/decode via a Codec (e.g., DocumentCodec) |
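A few of the conversions in the table can be sketched as small helper methods. This is one possible arrangement, assuming the org.bson library is on the classpath and using a plain DocumentCodec with its default settings; the class name ConversionExample, the helper method names, and the sample JSON are hypothetical.

```java
import org.bson.BsonDocument;
import org.bson.Document;
import org.bson.RawBsonDocument;
import org.bson.codecs.DecoderContext;
import org.bson.codecs.DocumentCodec;

// Illustrative only: class and helper names are hypothetical.
class ConversionExample {

    private static final DocumentCodec DOCUMENT_CODEC = new DocumentCodec();

    // JSON → BsonDocument
    static BsonDocument fromJson(String json) {
        return BsonDocument.parse(json);
    }

    // BsonDocument → Document
    static Document toDocument(BsonDocument bson) {
        return DOCUMENT_CODEC.decode(bson.asBsonReader(), DecoderContext.builder().build());
    }

    // Document → RawBsonDocument (encode through a codec)
    static RawBsonDocument toRaw(Document document) {
        return new RawBsonDocument(document, DOCUMENT_CODEC);
    }

    // RawBsonDocument → Document (decode through a codec)
    static Document fromRaw(RawBsonDocument raw) {
        return raw.decode(DOCUMENT_CODEC);
    }

    public static void main(String[] args) {
        BsonDocument bson = fromJson("{\"customer\": \"Ada\", \"quantity\": 3}");
        Document document = toDocument(bson);
        Document roundTripped = fromRaw(toRaw(document));

        System.out.println(document.toJson());
        System.out.println("round trip equal: " + document.equals(roundTripped));
    }
}
```

Since RawBsonDocument is itself a BsonDocument subclass, toRaw also serves as one way to get from Document to a BsonDocument-typed value when an API requires it.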
Quick Decision Guide
| Requirement | Preferred Class |
|---|---|
| General purpose, flexible, easy‑to‑use | Document |
| Need strict BSON‑type safety | BsonDocument |
| Want immutable, raw‑byte representation for performance | RawBsonDocument |
| Working exclusively with JSON strings | JsonObject |
| Maintaining legacy code (pre‑3.0 driver) | BasicDBObject |
Summary
- Document is the default choice for most applications – it balances flexibility, simplicity, and functionality.
- Choose BsonDocument when you need compile‑time type safety.
- Pick RawBsonDocument for high‑performance, memory‑efficient pass‑through scenarios.
- Use JsonObject when your workflow revolves around raw JSON.
- Reserve BasicDBObject for legacy migrations only.
All classes can be converted between each other via the driver’s codecs, allowing you to start with the most convenient type and switch later if performance or type‑safety requirements change.
Overview
This part looks at how RawBsonDocument and Document differ in parsing strategy, memory usage, and mutability. It also compares the availability of similar concepts in Oracle and PostgreSQL, and highlights MongoDB’s advantages for modern applications.
RawBsonDocument vs. Document
RawBsonDocument
- Parsing strategy – Reads a BSON document sequentially, examining each field’s type and name until it finds the requested key.
- Field handling – Only the matching field is decoded with RawBsonValueHelper.decode; all other fields are skipped without parsing.
- Nested structures – For nested documents and arrays, it reads only their sizes and wraps the corresponding byte ranges in new RawBsonDocument or RawBsonArray instances, keeping the contents as raw bytes.
- Performance – Provides fast, single‑field lookup while remaining memory‑efficient and keeping the document immutable.
- Use case – Ideal for large documents where only a few fields are needed, or for documents that are mostly passed through without inspection.
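The sequential lookup described above can be illustrated without the driver at all. The following is a deliberately simplified sketch of the idea, not the driver's implementation: it walks a hand-encoded BSON byte array, compares each field name with the requested key, and skips non-matching values by their encoded size. It only looks up int32 fields, handles a subset of BSON types, and every name in it (BsonScanSketch, findInt32, sample) is hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Simplified sketch of a sequential BSON field lookup; NOT the driver's code.
class BsonScanSketch {

    /** Returns the value of a top-level int32 field, or null if absent or not int32. */
    static Integer findInt32(byte[] bson, String key) {
        ByteBuffer buf = ByteBuffer.wrap(bson).order(ByteOrder.LITTLE_ENDIAN);
        buf.getInt();                              // total document size (unused here)
        while (true) {
            byte type = buf.get();
            if (type == 0x00) return null;         // end-of-document marker
            String name = readCString(buf);
            if (name.equals(key)) {
                return type == 0x10 ? buf.getInt() : null;
            }
            skipValue(buf, type);                  // jump over the value without decoding it
        }
    }

    static String readCString(ByteBuffer buf) {
        int start = buf.position();
        while (buf.get() != 0) { /* advance to the NUL terminator */ }
        int length = buf.position() - start - 1;
        return new String(buf.array(), buf.arrayOffset() + start, length, StandardCharsets.UTF_8);
    }

    static void skipValue(ByteBuffer buf, byte type) {
        int size;
        switch (type) {
            case 0x01: case 0x09: case 0x11: case 0x12: size = 8; break;  // double, date, timestamp, int64
            case 0x02: size = 4 + buf.getInt(buf.position()); break;      // string: length prefix + bytes
            case 0x03: case 0x04: size = buf.getInt(buf.position()); break; // document/array: size includes itself
            case 0x05: size = 4 + 1 + buf.getInt(buf.position()); break;  // binary: length + subtype + bytes
            case 0x07: size = 12; break;                                  // ObjectId
            case 0x08: size = 1; break;                                   // boolean
            case 0x0A: size = 0; break;                                   // null
            case 0x10: size = 4; break;                                   // int32
            case 0x13: size = 16; break;                                  // decimal128
            default: throw new IllegalArgumentException("Unsupported BSON type: " + type);
        }
        buf.position(buf.position() + size);
    }

    /** Hand-encoded {name: "Ada", nested: {x: 1}, quantity: 3}. */
    static byte[] sample() {
        ByteBuffer b = ByteBuffer.allocate(53).order(ByteOrder.LITTLE_ENDIAN);
        b.putInt(53);
        b.put((byte) 0x02).put(cstr("name")).putInt(4).put(cstr("Ada"));
        b.put((byte) 0x03).put(cstr("nested")).putInt(12)
                .put((byte) 0x10).put(cstr("x")).putInt(1).put((byte) 0x00);
        b.put((byte) 0x10).put(cstr("quantity")).putInt(3);
        b.put((byte) 0x00);
        return b.array();
    }

    static byte[] cstr(String s) {
        return (s + "\0").getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] doc = sample();
        System.out.println(findInt32(doc, "quantity")); // the nested document is skipped, not parsed
        System.out.println(findInt32(doc, "x"));        // x only exists inside "nested"
    }
}
```

Note how the nested document is skipped using only its 4-byte size prefix. This is the same property that lets a raw wrapper hand back a nested byte range without decoding its contents.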
Document
- Parsing strategy – Uses a fully deserialized LinkedHashMap.
- Eager parsing – All fields are parsed into Java objects when the Document is created.
- Field access – containsKey and other lookups are simple HashMap operations.
- Mutability – The document is fully mutable, supporting put, remove, clear, etc.
- Memory usage – Consumes more memory because every field is materialized as a Java object.
- Use case – Suited for small‑to‑medium‑sized documents, scenarios where many fields are accessed, or when the document needs frequent modification.
Note: Document does not use RawBsonDocument for parsing or field access because that would be inefficient; the two classes serve different purposes.
BSON‑like Support in Other Databases
Oracle
- No BSON – Oracle stores JSON in OSON (Oracle’s binary JSON format) rather than BSON.
- Closest equivalent – OracleJsonObject, one of the OracleJsonValue types, can be exposed as a javax.json.JsonObject or mapped to a domain object.
- Efficiency – The OSON API works directly on the underlying bytes without fully parsing the document. It maintains a local dictionary of field names, a sorted array of hash codes, and compact field‑ID/value offset arrays, allowing the driver to locate a field in place via binary search.
- JSON text – If JSON text is required, simply call ResultSet.getString(), which converts the OSON image to JSON on the client side.
PostgreSQL
- No native Java JSON object API – Both json and jsonb columns are returned as text via the JDBC driver.
- Parsing responsibility – Applications must parse the returned string with a separate JSON library.
- Binary storage – Although jsonb stores data in a binary format on the server, this efficiency does not cross the wire; the client still receives text and performs a full parse before accessing fields.
MongoDB’s Advantage
- Domain‑object centric – MongoDB lets you work directly with your domain objects without an intermediate ORM layer.
- Document class – Serves as the document object model (DOM), offering flexible, map‑style access and a natural Java interface.
- Transparent conversion – The driver automatically converts BSON from the network into usable Java objects, enabling immediate use in the application.