From O(n²) to O(n): Building a Streaming Markdown Renderer for the AI Era
If you’ve built an AI chat application, you’ve probably noticed something frustrating: the longer the conversation gets, the slower the rendering becomes.
The reason is simple — every time the AI outputs a new token, traditional markdown parsers re‑parse the entire document from scratch. This is a fundamental architectural problem, and it only gets worse as AI outputs get longer.
We built Incremark to fix this.
The Uncomfortable Truth About AI in 2025
If you’ve been following AI trends, you know the numbers are getting crazy:
| Year | Typical Output |
|---|---|
| 2022 | GPT‑3.5 responses? A few hundred words, no big deal |
| 2023 | GPT‑4 cranks it up to 2,000–4,000 words |
| 2024‑2025 | Reasoning models (o1, DeepSeek R1) are outputting 10,000+ word “thinking processes” |
We’re moving from 4K‑token conversations to 32K, even 128K. And here’s the thing nobody talks about: rendering 500 words and rendering 50,000 words of Markdown are completely different engineering problems.
Most markdown libraries? They were built for blog posts, not for AI that thinks out loud.
Why Your Markdown Parser is Lying to You
Here’s what happens under the hood when you stream AI output through a traditional parser:
```
Chunk 1:   Parse 100 chars ✓
Chunk 2:   Parse 200 chars (100 old + 100 new)
Chunk 3:   Parse 300 chars (200 old + 100 new)
...
Chunk 100: Parse 10,000 chars 😰
```
Total work: 100 + 200 + 300 + … + 10,000 = 505,000 character operations.
That’s O(n²). The cost doesn’t just grow — it explodes.
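You can watch the quadratic blow-up with a tiny TypeScript simulation (no real parser involved; counting touched characters stands in for parse work):

```ts
// Simulate the naive strategy: re-parse the entire accumulated buffer
// on every incoming chunk, and count how many characters get touched.
function naiveStreamCost(chunks: string[]): number {
  let buffer = ''
  let charsParsed = 0
  for (const chunk of chunks) {
    buffer += chunk
    charsParsed += buffer.length // a full re-parse touches every old character again
  }
  return charsParsed
}

// 100 chunks of 100 characters each → 505,000 touched characters, as above.
const chunks = Array.from({ length: 100 }, () => 'x'.repeat(100))
console.log(naiveStreamCost(chunks)) // 505000
```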
For an ~18 KB AI response (the largest file in our benchmarks), this means:
| Library | Parsing time |
|---|---|
| ant‑design‑x | 1,657 ms |
| markstream‑vue | 5,755 ms (almost 6 seconds of parsing!) |
These are popular, well‑maintained libraries. The problem isn’t bad code — it’s the wrong architecture.
The Key Insight
Once a markdown block is “complete”, it will never change.
Think about it. When the AI outputs:
```markdown
# Heading

This is a paragraph.

```
After that second blank line, the paragraph is done. Locked in. No matter what comes next — code blocks, lists, more paragraphs — that paragraph will never be touched again.
So why are we re‑parsing it 500 times?
How Incremark Actually Works
We built Incremark around this insight. The core algorithm:
- Detect stable boundaries — blank lines, new headings, fence closings.
- Cache completed blocks — never touch them again.
- Only re‑parse the pending block — the one still receiving input.
```
Chunk 1:   Parse 100 chars → cache stable blocks
Chunk 2:   Parse only ~100 new chars
Chunk 3:   Parse only ~100 new chars
...
Chunk 100: Parse only ~100 new chars
```
Total work: 100 × 100 = 10,000 character operations.
That’s over 50× less work for this example, and the gap widens as documents grow. Each character is parsed at most once → O(n).
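Below is a minimal TypeScript sketch of that caching loop. It is not Incremark’s actual source (real boundary detection also has to respect open code fences and new headings), but blank-line splitting shows the shape:

```ts
interface Block {
  source: string
  stable: boolean
}

// Minimal incremental parser sketch (NOT Incremark's real internals):
// blocks before the last blank line are frozen and cached; only the
// still-growing tail is re-parsed when a new chunk arrives.
class IncrementalParser {
  private stable: Block[] = []
  private pending = ''

  push(chunk: string): Block[] {
    this.pending += chunk
    const parts = this.pending.split(/\n[ \t]*\n/)
    // Every part before the final one ends at a stable boundary.
    for (const source of parts.slice(0, -1)) {
      this.stable.push({ source, stable: true }) // cached, never touched again
    }
    this.pending = parts[parts.length - 1]
    // Only this last block costs parse time on this chunk.
    return [...this.stable, { source: this.pending, stable: false }]
  }
}
```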
Complete Benchmark Data
We benchmarked 38 real markdown files (AI conversations, docs, code‑analysis reports; no synthetic data), totaling 6,484 lines and 128.55 KB. A representative sample, plus the overall totals:
| File | Lines | Size | Incremark | Streamdown | markstream‑vue | ant‑design‑x |
|---|---|---|---|---|---|---|
| test‑footnotes‑simple.md | 15 | 0.09 KB | 0.3 ms | 0.0 ms | 1.4 ms | 0.2 ms |
| simple‑paragraphs.md | 16 | 0.41 KB | 0.9 ms | 0.9 ms | 5.9 ms | 1.0 ms |
| introduction.md | 34 | 1.57 KB | 5.6 ms | 12.6 ms | 75.6 ms | 12.8 ms |
| footnotes.md | 52 | 0.94 KB | 1.7 ms | 0.2 ms | 10.6 ms | 1.9 ms |
| concepts.md | 91 | 4.29 KB | 12.0 ms | 50.5 ms | 381.9 ms | 53.6 ms |
| comparison.md | 109 | 5.39 KB | 20.5 ms | 74.0 ms | 552.2 ms | 85.2 ms |
| complex‑html‑examples.md | 147 | 3.99 KB | 9.0 ms | 58.8 ms | 279.3 ms | 57.2 ms |
| FOOTNOTE_FIX_SUMMARY.md | 236 | 3.93 KB | 22.7 ms | 0.5 ms | 535.0 ms | 120.8 ms |
| OPTIMIZATION_SUMMARY.md | 391 | 6.24 KB | 19.1 ms | 208.4 ms | 980.6 ms | 217.8 ms |
| BLOCK_TRANSFORMER_ANALYSIS.md | 489 | 9.24 KB | 75.7 ms | 574.3 ms | 1984.1 ms | 619.9 ms |
| test‑md‑01.md | 916 | 17.67 KB | 87.7 ms | 1441.1 ms | 5754.7 ms | 1656.9 ms |
| Total (38 files) | 6,484 | 128.55 KB | 519.4 ms | 3,190.3 ms | 14,683.9 ms | 3,728.6 ms |
Being Honest: Where We’re Slower
You’ll notice something odd: for footnotes.md and FOOTNOTE_FIX_SUMMARY.md, Streamdown appears much faster.
| File | Incremark | Streamdown | Why? |
|---|---|---|---|
| footnotes.md | 1.7 ms | 0.2 ms | Streamdown doesn’t support footnotes |
| FOOTNOTE_FIX_SUMMARY.md | 22.7 ms | 0.5 ms | Same — it just skips them |
This isn’t a performance issue — it’s a feature difference.
When Streamdown encounters [^1] footnote syntax, it simply ignores it. Incremark fully implements footnotes, and we had to solve a tricky streaming‑specific problem: references often arrive before definitions.
```
Chunk 1: "See footnote[^1] for details..."   // reference arrives first
Chunk 2: "More content..."
Chunk 3: "[^1]: This is the definition"      // definition arrives later
```
Traditional parsers assume a complete document. We built “optimistic references” that gracefully handle incomplete links/images during streaming, then resolve them when definitions arrive.
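One way to implement that, sketched in TypeScript (illustrative only, not Incremark’s actual code): emit a placeholder for every reference immediately, track unresolved ids, and patch the placeholders once definitions stream in.

```ts
// Illustrative sketch of "optimistic references" (not Incremark's code).
class FootnoteRegistry {
  private definitions = new Map<string, string>()
  private unresolved = new Set<string>()

  // Called when [^id] appears mid-stream, possibly before its definition.
  reference(id: string): string {
    if (!this.definitions.has(id)) this.unresolved.add(id)
    return `<sup><a href="#fn-${id}">[${id}]</a></sup>` // optimistic placeholder
  }

  // Called when "[^id]: text" finally arrives; marks the reference resolvable
  // so the renderer can patch its placeholder on the next pass.
  define(id: string, text: string): void {
    this.definitions.set(id, text)
    this.unresolved.delete(id)
  }
}
```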
We also fully implement math blocks ($…$) and custom containers (:::tip) because those are common in AI‑generated content.
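Concretely, that means streamed input like this renders progressively instead of being silently dropped:

```markdown
Inline math $e^{i\pi} + 1 = 0$ and block math:

$$
\sum_{k=1}^{n} k = \frac{n(n+1)}{2}
$$

:::tip
Containers like this are common in AI-generated docs.
:::
```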
Where We Actually Shine
Excluding footnote files, look at standard markdown performance:
| File | Lines | Incremark | Streamdown | Advantage |
|---|---|---|---|---|
| concepts.md | 91 | 12.0 ms | 50.5 ms | 4.2× |
| comparison.md | 109 | 20.5 ms | 74.0 ms | 3.6× |
| complex‑html‑examples.md | 147 | 9.0 ms | 58.8 ms | 6.6× |
| OPTIMIZATION_SUMMARY.md | 391 | 19.1 ms | 208.4 ms | 10.9× |
| test‑md‑01.md | 916 | 87.7 ms | 1441.1 ms | 16.4× |
The pattern is clear: the larger the document, the bigger our advantage.
For the largest file (17.67 KB):
| Library | Time | Relative |
|---|---|---|
| Incremark | 88 ms | — |
| ant‑design‑x | 1,657 ms | 18.9× slower |
| markstream‑vue | 5,755 ms | 65.6× slower |
When to Use Incremark
✅ Use Incremark for:
- AI chat with streaming output (Claude, ChatGPT, etc.)
- Long‑form AI content (reasoning models, code generation)
- Real‑time markdown editors
- Content requiring footnotes, math, or custom containers
- 100K+ token conversations
⚠️ Consider alternatives for:
- One‑time static markdown rendering (just use `marked` directly)
- Very small files (e.g., a few lines of text)
Getting started takes a few lines. Here’s the Vue 3 version:
```vue
<script setup>
import { ref } from 'vue'
import { IncremarkContent } from '@incremark/vue'

const content = ref('')
const isFinished = ref(false)

async function handleStream(stream) {
  for await (const chunk of stream) {
    content.value += chunk
  }
  isFinished.value = true
}
</script>

<template>
  <!-- Prop names here are illustrative; check the Incremark docs for the exact API -->
  <IncremarkContent :content="content" :finished="isFinished" />
</template>
```
We support Vue 3, React 18, and Svelte 5 with identical APIs—one core, three frameworks, zero behavior differences.
What’s Next
Version 0.3.0 is just the beginning.
The AI world is moving toward longer outputs, more complex reasoning traces, and richer formatting. Traditional parsers can’t keep up—their O(n²) architecture guarantees it.
We built Incremark because we needed it. We hope you find it useful too.
- 📚 Docs:
- 💻 GitHub:
- 🎮 Live Demos:
- Vue:
- React:
- Svelte:
If this saved you debugging time, a ⭐️ on GitHub would mean a lot. Questions? Open an issue or drop a comment below.