YAML to JSON in CI Pipelines: Why It Breaks More Often Than You Expect
Source: Dev.to
Why CI Pipelines Often Need JSON
Many CI tools accept YAML as input but convert it internally, or forward the configuration as JSON:
- APIs expect strict JSON
- Schema validators work on JSON
- Helm renders YAML → JSON
- Custom build steps serialize configs to JSON
So even if your repo uses YAML, JSON is almost always involved downstream.
The Hidden Problem: YAML Is More Permissive Than JSON
YAML is designed for humans, which leads to subtle but dangerous issues.
1️⃣ Duplicate Keys (The Silent Killer)
The YAML specification requires mapping keys to be unique, but many parsers accept duplicates without raising an error:
env:
  VAR1: value1
  VAR1: value2 # duplicate key
Most YAML parsers will silently overwrite the first value. After conversion, JSON ends up with:
{
  "env": {
    "VAR1": "value2"
  }
}
Your pipeline passes, but the original intent is lost—one of the most common CI bugs.
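One way to surface the overwrite instead of losing it is to have the parser reject repeated keys. The following is a minimal sketch assuming PyYAML; `UniqueKeyLoader` is a hypothetical name, not something PyYAML ships, and it assumes scalar keys and skips merge keys (`<<`).

```python
import yaml


class UniqueKeyLoader(yaml.SafeLoader):
    """SafeLoader variant that raises on duplicate mapping keys (hypothetical helper)."""

    def construct_mapping(self, node, deep=False):
        seen = set()
        for key_node, _value_node in node.value:
            # Leave YAML merge keys ("<<") to the parent implementation.
            if key_node.tag == "tag:yaml.org,2002:merge":
                continue
            key = self.construct_object(key_node, deep=deep)
            if key in seen:
                raise yaml.constructor.ConstructorError(
                    None, None, f"found duplicate key {key!r}", key_node.start_mark
                )
            seen.add(key)
        return super().construct_mapping(node, deep=deep)


DOC = """
env:
  VAR1: value1
  VAR1: value2
"""

# The stock safe loader keeps only the last value.
print(yaml.safe_load(DOC))

# The strict loader refuses the document instead.
try:
    yaml.load(DOC, Loader=UniqueKeyLoader)
except yaml.constructor.ConstructorError as exc:
    print("rejected:", exc.problem)
```

Running a check like this as an early CI step turns a silent overwrite into a visible failure, pointing at the line of the repeated key.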
2️⃣ Indentation Errors That “Parse” but Break Logic
YAML indentation defines structure. This looks valid at a glance:
steps:
  name: build
  run:
    echo "Building"
Depending on the parser, this may serialize incorrectly or fail schema validation after conversion. CI tools often don’t validate YAML deeply before passing it along.
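A cheap guard against this class of error is to assert the expected shape right after parsing, before anything is serialized. The sketch below assumes `steps` is meant to be a list of mappings with string `name` and `run` fields, which the snippet implies but does not state. `check_steps` and both sample dictionaries are hypothetical.

```python
def check_steps(config):
    """Return a list of problems with the assumed `steps` layout."""
    problems = []
    steps = config.get("steps")
    if not isinstance(steps, list):
        problems.append(f"steps should be a list, got {type(steps).__name__}")
        return problems
    for i, step in enumerate(steps):
        if not isinstance(step, dict):
            problems.append(f"steps[{i}] should be a mapping")
            continue
        for field in ("name", "run"):
            if not isinstance(step.get(field), str):
                problems.append(f"steps[{i}].{field} should be a string")
    return problems


# What a parser returns when the list dash is missing: a mapping, not a list.
mis_indented = {"steps": {"name": "build", "run": 'echo "Building"'}}
# The structure that was probably intended.
intended = {"steps": [{"name": "build", "run": 'echo "Building"'}]}

print(check_steps(mis_indented))
print(check_steps(intended))
```

For real pipelines a JSON Schema validator does the same job more thoroughly. The point here is only that the check runs on the parsed structure, not on the text.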
3️⃣ Anchors & Aliases Don’t Translate Cleanly
YAML supports reuse:
defaults: &defaults
  timeout: 30
  retries: 2
job:
  <<: *defaults
  script: make test
After conversion, some tools:
- Inline the values
- Drop anchors entirely
- Fail schema validation
JSON has no concept of anchors, so the result can be unpredictable.
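To see which of these outcomes a given toolchain produces, it helps to run the example through the parser the pipeline uses. The sketch below assumes PyYAML, which resolves the merge key at load time and inlines the values. Other parsers may behave differently, since the `<<` merge key comes from YAML 1.1 and not every parser supports it.

```python
import json

import yaml

DOC = """
defaults: &defaults
  timeout: 30
  retries: 2
job:
  <<: *defaults
  script: make test
"""

data = yaml.safe_load(DOC)

# The merge key is resolved during loading, so `job` carries plain copies
# of the default values and no trace of the anchor survives in the JSON.
print(json.dumps(data, indent=2))
```

Note that the `defaults` block itself also ends up in the JSON as a top-level key, which a strict schema may reject as an unknown property.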
4️⃣ Data Types Change Without Warning
YAML guesses types:
enabled: yes
Parsers that follow YAML 1.1 (PyYAML among them) read unquoted yes, no, on, and off as booleans, so this may convert to:
{
  "enabled": true
}
If your API expects the string "yes", this breaks compatibility. Quoting the value (enabled: "yes") keeps it a string.
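The coercion is easy to reproduce locally. This sketch assumes PyYAML. The keys `country`, `version`, and `label` are made up for illustration and do not appear in the original config.

```python
import json

import yaml

DOC = """
enabled: yes
country: no
version: 1.10
label: "yes"
"""

data = yaml.safe_load(DOC)

# Unquoted yes/no become booleans, 1.10 becomes the float 1.1,
# and only the quoted value stays a string.
print(json.dumps(data))
# {"enabled": true, "country": false, "version": 1.1, "label": "yes"}
```

The `version` line shows that numbers are affected as well: the trailing zero is gone by the time the value reaches JSON.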
Common CI Conversion Methods (and Their Limits)
| Method | Notes |
|---|---|
| Python (PyYAML) | See the snippet below the table. |
| Helm toJson | Works inside Helm charts, but inherits Helm's handling of anchors and types. |
| Helm templating | Good for Kubernetes manifests, but still subject to the same YAML-to-JSON pitfalls. |
import yaml, json
with open("config.yaml") as f:  # the context manager closes the file after parsing
    json_data = json.dumps(yaml.safe_load(f))
Best Practice: Validate Before Conversion
Before converting YAML to JSON in CI:
- Validate indentation
- Detect duplicate keys
- Confirm data types
- Inspect the final JSON output
Perform these checks before the config reaches your API or deployment step.
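The "confirm data types" item can be automated with a small comparison against the types the downstream API expects. This is a sketch under assumptions: `type_mismatches`, the `EXPECTED` map, and the sample `parsed` dictionary are all hypothetical, and a real pipeline would more likely use a schema validator.

```python
def type_mismatches(config, expected, path=""):
    """Compare parsed values against an expected-type map; return mismatch messages."""
    problems = []
    for key, want in expected.items():
        where = f"{path}.{key}" if path else key
        if key not in config:
            problems.append(f"{where}: missing")
        elif isinstance(want, dict):
            if isinstance(config[key], dict):
                problems.extend(type_mismatches(config[key], want, where))
            else:
                problems.append(f"{where}: expected mapping")
        elif type(config[key]) is not want:
            # Exact type match on purpose: bool is a subclass of int in Python.
            problems.append(
                f"{where}: expected {want.__name__}, got {type(config[key]).__name__}"
            )
    return problems


# Types the (hypothetical) API expects, and what the YAML parser produced.
EXPECTED = {"enabled": str, "job": {"timeout": int, "script": str}}
parsed = {"enabled": True, "job": {"timeout": 30, "script": "make test"}}

print(type_mismatches(parsed, EXPECTED))
# ['enabled: expected str, got bool']
```

Failing the build when the returned list is non-empty keeps a coerced value from reaching the API.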
A Practical Debugging Tip (Saved Me Many Times)
When a CI pipeline fails after conversion, try the following:
- Paste the YAML into a strict YAML → JSON converter.
- Inspect the JSON output for:
  - Missing fields
  - Overwritten keys
  - Unexpected booleans or numbers
A handy browser‑based tool is jsonviewertool.com/yaml-to-json. It runs fully client‑side and helps spot structural issues quickly.
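The same inspection can be scripted when the converted JSON is too large to read by eye. The sketch below uses only the standard library and lists every value that is not a string, which is where coerced booleans and numbers show up. `non_string_scalars` and the sample `converted` document are hypothetical.

```python
import json


def non_string_scalars(value, path="$"):
    """Yield (path, value) for every boolean, number, or null in decoded JSON."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from non_string_scalars(child, f"{path}.{key}")
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from non_string_scalars(child, f"{path}[{index}]")
    elif not isinstance(value, str):
        yield path, value


# Stand-in for the JSON a pipeline produced from YAML.
converted = '{"enabled": true, "job": {"timeout": 30, "script": "make test"}}'

for path, value in non_string_scalars(json.loads(converted)):
    print(path, "=", json.dumps(value))
```

Each printed path is a candidate for review: either the type is intended, or the YAML source needs quotes.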
When Should You Avoid Conversion Entirely?
If possible:
- Keep YAML all the way through (e.g., Helm → Kubernetes).
- Define configs natively in JSON when APIs are involved.
Conversion should be intentional, not accidental.
Final Thoughts
YAML → JSON conversion isn’t hard—but it’s deceptively dangerous. Most CI failures caused by it:
- Don’t throw errors
- Pass validation
- Break production behavior later
Treat conversion as a validation step, not just a formatting step. Your CI pipelines—and future self—will thank you.