Implementing a JSON Schema Validator from Scratch - Week 2
Source: Dev.to
Background
After two weeks of reading, I’ve finally finished the JSON Schema specifications (specifically the Core and Validation specs). I now have a pretty good idea of what a JSON Schema validator should look like and a few thoughts on the specs.
Overall, the authors did a great job. Almost every minute detail of the system is covered: how it should work, how implementations should handle certain cases, what a received schema should look like, and what to do if it doesn’t.
Challenges with the Specs
I faced some difficulties reading the specs. This could be due to ambiguities in the documents or simply my lack of experience, as this was my first time working through a specification. The main issues were:
- Ambiguity about the intended audience.
- Lack of clarity in certain sections.
Intended Audience Ambiguity
The specs address three different entities:
- Schema Authors – people who write schemas to use the validator.
- Validator Implementers – people who implement/write the validator code.
- Specification Extenders – people who create custom keywords or vocabularies.
The documents rarely specify which audience a particular paragraph targets. In some cases, the same sentence starts by addressing one group and ends by addressing another. The only way to know who is being addressed is to fully understand the surrounding context, which is difficult when you’re new to the material.
Specific Areas of Confusion
- Lexical vs. dynamic scopes – unclear explanations made it hard to grasp the distinction.
- Meta‑schemas – I discussed this in detail in last week’s post, but the spec’s wording was confusing.
- anyOf example from chapter 11 – the example was difficult to follow without additional context.
Additional examples and clearer wording would have made reading the specs much easier.
Tools That Helped
I’d like to give a shout‑out to Google’s NotebookLM. I tried several tools to help me understand the specs, and NotebookLM was the most helpful.
Planned Implementation Scope
The specifications state that some keywords and features are mandatory, while others are optional. For my initial implementation I will:
- Support only draft 2020‑12
- Exclude the $vocabulary keyword
- Support the detailed output format
- Exclude short‑circuiting
- Support only the annotation functionality of the format keyword
- Exclude dereferencing of JSON pointers that use the schema’s parent or ancestor base URI (per section 9.2.1 of the Core specs)
- Exclude remote schema fetching
The goal is to lower the difficulty as much as possible for this first fully functional validator, given that this is my first attempt at a spec‑compliant software system.
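To make the narrowed reference handling in the list above more concrete, here is a minimal Python sketch of what "same-document only" dereferencing could look like. It is an illustration under assumptions, not the actual implementation: the post does not name an implementation language, and the function name resolve_local_pointer is hypothetical. It accepts only fragment references of the form #/..., applies JSON Pointer (RFC 6901) unescaping, and refuses everything that would need a base URI or a remote fetch.

```python
from urllib.parse import unquote


def resolve_local_pointer(root, ref):
    """Resolve a fragment-only JSON Pointer reference against `root`.

    Hypothetical helper: anything that is not a same-document fragment
    (for example a reference relying on a base URI or a remote schema)
    is rejected rather than fetched.
    """
    if not ref.startswith("#"):
        raise NotImplementedError(f"non-local reference not supported: {ref!r}")

    # The fragment is percent-encoded in URI form; decode it first.
    pointer = unquote(ref[1:])
    if pointer == "":
        return root  # "#" refers to the whole document
    if not pointer.startswith("/"):
        # A plain-name fragment would be an anchor, which this sketch skips.
        raise ValueError(f"not a JSON Pointer fragment: {ref!r}")

    node = root
    for raw in pointer.split("/")[1:]:
        # RFC 6901 unescaping: "~1" becomes "/", then "~0" becomes "~".
        token = raw.replace("~1", "/").replace("~0", "~")
        if isinstance(node, list):
            node = node[int(token)]
        elif isinstance(node, dict):
            node = node[token]
        else:
            raise KeyError(token)
    return node


schema = {"$defs": {"a/b": {"type": "string"}}}
print(resolve_local_pointer(schema, "#/$defs/a~1b"))  # prints {'type': 'string'}
```

Rejecting unsupported references loudly, instead of silently ignoring them, keeps the reduced scope visible to whoever passes in a schema that needs the excluded features.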
Future Work
Once I have a basic, fully functional validator, I may consider adding:
- Additional drafts – designing the validator to be architecturally resilient should make it straightforward to add new drafts later.
- Short‑circuiting – if I can implement it in a way that complies with the specs.
- format keyword assertions – if I find learning value or fun in extending this functionality.
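The difference between the initial scope and these future items can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the planned design: the evaluate function, its short_circuit flag, and the flat result dictionary are all hypothetical, only three keywords are handled, and the result is not the spec's detailed output format. With short_circuit=False every keyword is evaluated and every failure is collected; with short_circuit=True evaluation stops at the first failure. The format keyword only produces an annotation and never fails validation.

```python
def is_integer(value):
    # JSON Schema treats 1.0 as an integer; Python bools must be excluded.
    if isinstance(value, bool):
        return False
    return isinstance(value, int) or (isinstance(value, float) and value.is_integer())


TYPE_CHECKS = {
    "string": lambda v: isinstance(v, str),
    "integer": is_integer,
}


def evaluate(schema, instance, short_circuit=False):
    """Evaluate a tiny keyword subset: type, minLength, format."""
    errors, annotations = [], []
    for keyword, value in schema.items():
        if keyword == "type":
            if not TYPE_CHECKS[value](instance):
                errors.append(f"type: expected {value}")
        elif keyword == "minLength":
            # minLength only constrains strings; other types pass.
            if isinstance(instance, str) and len(instance) < value:
                errors.append(f"minLength: shorter than {value}")
        elif keyword == "format":
            # Annotation only: record the value, never assert on it.
            annotations.append(("format", value))
        if short_circuit and errors:
            break  # stop at the first failing keyword
    return {
        "valid": not errors,
        "errors": errors,
        # Annotations from a failing schema are dropped.
        "annotations": annotations if not errors else [],
    }


schema = {"type": "integer", "minLength": 5, "format": "email"}
print(len(evaluate(schema, "ab")["errors"]))                      # prints 2
print(len(evaluate(schema, "ab", short_circuit=True)["errors"]))  # prints 1
```

Both modes agree on the valid flag; they differ only in how much detail is reported, which is why exhaustive evaluation pairs naturally with a detailed output format.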