AAoM-02: XML Parser with W3C Conformance

Published: January 13, 2026 at 09:32 PM EST

Source: Dev.to

Skill

I’m still using Claude Code (Opus 4.5) with the MoonBit system prompt and IDE skill.
I also created a new skill, moonbit-lang, to teach the AI best practices and common pitfalls of the MoonBit language. Its header looks like this:

---
name: moonbit-lang
description: "MoonBit language reference and coding conventions. Use when writing MoonBit code, asking about syntax, or encountering MoonBit-specific errors. Covers error handling, FFI, async, and common pitfalls."
---

# MoonBit Language Reference

@reference/fundamentals.md
@reference/error-handling.md
@reference/ffi.md
@reference/async-experimental.md
@reference/package.md
@reference/toml-parser-parser.mbt

In this skill doc I also mention the official file‑I/O package moonbitlang/x/fs, which the AI is not familiar with.
The complete skill doc and references can be accessed on GitHub, where I continuously update the skills I use.

The AI (both Codex and Claude) reads only the description at startup and the rest on demand. I keep the skill doc simple because, in my experience, excessively long documents hinder the AI’s ability to understand the details.

Problem

XML remains ubiquitous in configuration files, data interchange, and legacy systems. A conformant XML parser must handle:

Element tags, attributes, and namespaces
Entity references

Below is a simple test that parses a minimal document and inspects the resulting event stream:

let xml = "\n\n\n"
let reader = Reader::from_string(xml)
let events : Array[Event] = []
for {
  match reader.read_event() {
    Eof => {
      events.push(Eof)
      break
    }
    event => events.push(event)
  }
}
inspect(
  to_libxml_format(events),
  content="[DocType(\"doc\"), Empty({name: \"doc\", attributes: []}), Eof]",
)

A not‑well‑formed example:

test "w3c/not-wf/not_wf_sa_001" {
  // Attribute values must start with attribute names, not "?".
  let xml = "\n\n\n"
  let reader = Reader::from_string(xml)
  let has_error = for {
    try reader.read_event() catch {
      _ => break true
    } noraise {
      Eof => break false
      _ => continue
    }
  }
  inspect(has_error, content="true")
}

In total, 735 tests were generated, amounting to roughly 14k lines of code. After I added a few hand-written tests, the suite contained 800 tests.
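To make the generation step concrete, here is a minimal Python sketch of how a script could render one not-well-formed case into a MoonBit test like the one above. It is not the actual tests-gen script: the names `moonbit_string`, `NOT_WF_TEMPLATE`, and `render_not_wf_test` are hypothetical, and the template is simply copied from the test shown in this post.

```python
# Hypothetical sketch: render one W3C "not-wf" case as MoonBit test source.
# The template body mirrors the not-well-formed test shown in this post.

NOT_WF_TEMPLATE = '''test "w3c/not-wf/{test_id}" {{
  let xml = {xml_literal}
  let reader = Reader::from_string(xml)
  let has_error = for {{
    try reader.read_event() catch {{
      _ => break true
    }} noraise {{
      Eof => break false
      _ => continue
    }}
  }}
  inspect(has_error, content="true")
}}
'''


def moonbit_string(text: str) -> str:
    """Quote text as a double-quoted literal, escaping backslashes,
    quotes, and common control characters (assumed escape rules)."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
        .replace("\t", "\\t")
    )
    return f'"{escaped}"'


def render_not_wf_test(test_id: str, xml_text: str) -> str:
    """Return the source text of one generated test."""
    return NOT_WF_TEMPLATE.format(
        test_id=test_id,
        xml_literal=moonbit_string(xml_text),
    )


if __name__ == "__main__":
    print(render_not_wf_test("demo_001", "<doc>\n<a</a>\n</doc>"))
```

Escaping the document before it goes into the template matters because conformance inputs routinely contain quotes, backslashes, and raw newlines.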

Parser Implementation

Since quick‑xml was the initial reference, Claude followed a pull‑parser architecture inspired by it, which I thought was acceptable for our goal. The API looks like this:

let reader = @xml.Reader::from_string(xml)
for {
  match reader.read_event() {
    Eof => break
    Start(elem) => println("Start: \{elem.name}")
    End(name)   => println("End: \{name}")
    Text(content) => println("Text: \{content}")
    _ => continue
  }
}
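For readers unfamiliar with pull parsing, the same control flow can be sketched in Python with the standard library's `XMLPullParser`. This is only an analogy, not the MoonBit API: the function name `describe_events` is made up for illustration, and unlike the reader above, `XMLPullParser` does not surface text as separate events.

```python
from xml.etree.ElementTree import XMLPullParser


def describe_events(xml_text: str) -> list[str]:
    """Feed a document to a pull parser and describe each event.

    The caller drives the loop and asks for events, instead of the
    parser calling back into user code.
    """
    parser = XMLPullParser(events=("start", "end"))
    parser.feed(xml_text)
    parser.close()  # queued events stay readable after close()

    lines = []
    for kind, elem in parser.read_events():
        if kind == "start":
            lines.append(f"Start: {elem.tag}")
        else:
            lines.append(f"End: {elem.tag}")
    return lines


if __name__ == "__main__":
    for line in describe_events("<doc><a>hi</a></doc>"):
        print(line)
```

Because the consumer decides when to read the next event, it can stop early or process a large document without building a full tree first.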

Because lxml returns a tree while our parser emits events, I asked Claude to implement a to_libxml_format function that transforms our event stream into the exact format produced by lxml. This made test comparison straightforward.
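The tree-versus-events mismatch can be illustrated from the other direction. The Python sketch below flattens a parsed tree into an event list, so that both sides of a comparison share one shape. It uses the standard library's ElementTree rather than lxml, and the event tuples and function names are assumptions for illustration, not the real to_libxml_format output.

```python
import xml.etree.ElementTree as ET


def element_events(elem):
    """Yield hypothetical event tuples for one element and its subtree."""
    attrs = list(elem.attrib.items())
    # A tree cannot tell <doc/> from <doc></doc>, so both become "Empty".
    if len(elem) == 0 and not elem.text:
        yield ("Empty", elem.tag, attrs)
        return
    yield ("Start", elem.tag, attrs)
    if elem.text:
        yield ("Text", elem.text)
    for child in elem:
        yield from element_events(child)
        if child.tail:  # text that follows the child inside this element
            yield ("Text", child.tail)
    yield ("End", elem.tag)


def tree_to_events(xml_text: str) -> list:
    """Parse a document into a tree, then flatten it to an event list."""
    root = ET.fromstring(xml_text)
    return [*element_events(root), ("Eof",)]


if __name__ == "__main__":
    for event in tree_to_events('<doc a="1"><b/>tail<c>x</c></doc>'):
        print(event)
```

The comment about `<doc/>` and `<doc></doc>` shows why some normalization is unavoidable: a tree discards distinctions that an event stream preserves, so the comparison format has to pick one canonical form.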

The basic implementation took about 4 hours of AI‑only work (aside from occasional “Please continue” prompts). The most complex feature was DTD parsing and validation. I used Claude’s plan mode to structure the implementation. Below is a summary of that plan:

[Figure: plan summary]

Project Summary

[Figure: project summary diagram]

After about 1 hour, DTD support was implemented and 726 tests passed.
It then took another 3 hours to handle edge cases such as:

Entity value expansion
Text‑splitting details
UTF‑8 BOM handling
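As an illustration of the last item, here is a minimal Python sketch of UTF-8 BOM handling. It shows the general technique only; how the MoonBit parser actually deals with the BOM is not described in this post, and the function names are hypothetical.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Drop a leading UTF-8 byte order mark (EF BB BF) if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def decode_document(data: bytes) -> str:
    """Decode UTF-8 input so the parser never sees U+FEFF as content."""
    return strip_utf8_bom(data).decode("utf-8")


if __name__ == "__main__":
    print(decode_document(b"\xef\xbb\xbf<doc/>"))
```

Without this step, the BOM decodes to the character U+FEFF in front of the first `<`, and a strict parser would see content before the document starts.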

Results

At the end of the effort, 800 W3C conformance tests passed.

59 tests were skipped by the tests‑gen script because:
- Some were valid but rejected by lxml.
- Others were not well‑formed but passed by lxml.

These were marked as “lxml implementation quirks”.
Because these edge cases are intricate, I did not verify each skipped test individually; the remaining 800 tests gave me sufficient confidence.
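The skip rule can be sketched as a filter that compares the suite's expected verdict with what the reference parser actually does. The Python below is an assumption-laden illustration: it uses the standard library's expat-based ElementTree as a stand-in for lxml, and `Case`, `reference_accepts`, and `partition_cases` are hypothetical names, not parts of the real tests-gen script.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Case:
    test_id: str
    xml_text: str
    expected_well_formed: bool  # verdict stated by the test suite


def reference_accepts(xml_text: str) -> bool:
    """Return True if the reference parser parses the document."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


def partition_cases(cases):
    """Split cases into (kept, skipped).

    A case is skipped when the reference parser disagrees with the
    suite, in either direction.
    """
    kept, skipped = [], []
    for case in cases:
        if reference_accepts(case.xml_text) == case.expected_well_formed:
            kept.append(case)
        else:
            skipped.append(case)
    return kept, skipped


if __name__ == "__main__":
    demo = [
        Case("ok", "<doc/>", True),
        Case("bad", "<doc>", False),
        Case("quirk", "<doc/>", False),
    ]
    kept, skipped = partition_cases(demo)
    print([c.test_id for c in kept], [c.test_id for c in skipped])
```

A filter like this only reports where the oracle and the suite disagree. It cannot say which of the two is wrong, which is why the skipped cases would need manual review before being trusted either way.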

Supported Features

XML 1.0 + Namespaces 1.0
Pull‑parser API for memory‑efficient streaming
Writer API for XML generation
DTD support with entity expansion

Reflections

What Worked Well?

Using an official test suite – The W3C conformance tests uncovered obscure edge cases (character references, DTD quirks, namespace handling, etc.) that I would never have thought to test manually.
Switching reference implementations – quick‑xml is intentionally lenient, which made conformance testing difficult. Switching to libxml2 gave me a strict reference.
Planning mode for complex features – Breaking DTD parsing into a plan kept the work organized; without it, I would have jumped between unrelated bugs.

Challenges Encountered

Claude often tried to modify the tests instead of fixing the parser:

Changing test expectations to match incorrect output.
Updating the test generator to skip failing tests.
Marking tests as “lenient” and skipping them.

I had to repeatedly remind Claude: “Update the MoonBit implementation, not the tests.”

Other recurring issues:

Forgetting project conventions (e.g., not using the moon‑ide skill for navigation, using match (try? expr) instead of try/catch/noraise).
Adding these conventions to CLAUDE.md helped but didn’t eliminate the problem.

I found a related discussion on Reddit (link) that suggests a bug in Opus 4.5 and Sonnet 4.5. Hopefully it will be fixed soon.

Future Work

I expect to implement or port many more parsers. My plan is to turn what I learned about writing parsers and generating standards-based test scripts into reusable skills or commands, so that the next project can build on this groundwork.

Time Investment (≈ 10 hours)

Activity	Hours
Collaborative exploration of test‑generation script	2
Autonomous implementation of basic features	4
Planning & implementing DTD, namespaces, entities	1
Handling edge cases (fixing 17 test failures)	3

The code is available on GitHub:
