From O(n²) to O(n): Building a Streaming Markdown Renderer for the AI Era
If you’ve built an AI chat application, you’ve probably noticed something frustrating: the longer the conversation gets, the slower the rendering becomes.
The reason is simple — every time the AI outputs a new token, traditional markdown parsers re‑parse the entire document from scratch. This is a fundamental architectural problem, and it only gets worse as AI outputs get longer.
We built Incremark to fix this.
The Uncomfortable Truth About AI in 2025
If you’ve been following AI trends, you know the numbers are getting crazy:
| Year | Typical Output |
|---|---|
| 2022 | GPT‑3.5 responses? A few hundred words, no big deal |
| 2023 | GPT‑4 cranks it up to 2,000–4,000 words |
| 2024‑2025 | Reasoning models (o1, DeepSeek R1) are outputting 10,000+ word “thinking processes” |
We’re moving from 4K‑token conversations to 32K, even 128K. And here’s the thing nobody talks about: rendering 500 words and rendering 50,000 words of Markdown are completely different engineering problems.
Most markdown libraries? They were built for blog posts, not for AI that thinks out loud.
Why Your Markdown Parser is Lying to You
Here’s what happens under the hood when you stream AI output through a traditional parser:
```
Chunk 1:   Parse 100 chars ✓
Chunk 2:   Parse 200 chars (100 old + 100 new)
Chunk 3:   Parse 300 chars (200 old + 100 new)
...
Chunk 100: Parse 10,000 chars 😰
```
Total work: 100 + 200 + 300 + … + 10,000 = 505,000 character operations.
That’s O(n²). The cost doesn’t just grow — it explodes.
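You can watch the quadratic blow-up with a tiny TypeScript simulation (no real parser involved; counting touched characters stands in for parse work):

```ts
// Simulate the naive strategy: re-parse the entire accumulated buffer
// on every incoming chunk, and count how many characters get touched.
function naiveStreamCost(chunks: string[]): number {
  let buffer = ''
  let charsParsed = 0
  for (const chunk of chunks) {
    buffer += chunk
    charsParsed += buffer.length // a full re-parse touches every old character again
  }
  return charsParsed
}

// 100 chunks of 100 characters each → 505,000 touched characters, as above.
const chunks = Array.from({ length: 100 }, () => 'x'.repeat(100))
console.log(naiveStreamCost(chunks)) // 505000
```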
For an ~18 KB AI response (the largest file in our benchmarks), this means:
| Library | Parsing time |
|---|---|
| ant‑design‑x | 1,657 ms |
| markstream‑vue | 5,755 ms (almost 6 seconds of parsing!) |
These are popular, well‑maintained libraries. The problem isn’t bad code — it’s the wrong architecture.
The Key Insight
Once a markdown block is “complete”, it will never change.
Think about it. When the AI outputs:
```markdown
# Heading

This is a paragraph.

```
After that second blank line, the paragraph is done. Locked in. No matter what comes next — code blocks, lists, more paragraphs — that paragraph will never be touched again.
So why are we re‑parsing it 500 times?
How Incremark Actually Works
We built Incremark around this insight. The core algorithm:
- Detect stable boundaries — blank lines, new headings, fence closings.
- Cache completed blocks — never touch them again.
- Only re‑parse the pending block — the one still receiving input.
```
Chunk 1:   Parse 100 chars → cache stable blocks
Chunk 2:   Parse only ~100 new chars
Chunk 3:   Parse only ~100 new chars
...
Chunk 100: Parse only ~100 new chars
```
Total work: 100 × 100 = 10,000 character operations.
That’s over 50× less work for this example, and the gap widens as documents grow. Each character is parsed at most once → O(n).
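Below is a minimal TypeScript sketch of that caching loop. It is not Incremark’s actual source (real boundary detection also has to respect open code fences and new headings), but blank-line splitting shows the shape:

```ts
interface Block {
  source: string
  stable: boolean
}

// Minimal incremental parser sketch (NOT Incremark's real internals):
// blocks before the last blank line are frozen and cached; only the
// still-growing tail is re-parsed when a new chunk arrives.
class IncrementalParser {
  private stable: Block[] = []
  private pending = ''

  push(chunk: string): Block[] {
    this.pending += chunk
    const parts = this.pending.split(/\n[ \t]*\n/)
    // Every part before the final one ends at a stable boundary.
    for (const source of parts.slice(0, -1)) {
      this.stable.push({ source, stable: true }) // cached, never touched again
    }
    this.pending = parts[parts.length - 1]
    // Only this last block costs parse time on this chunk.
    return [...this.stable, { source: this.pending, stable: false }]
  }
}
```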
Complete Benchmark Data
We benchmarked 38 real markdown files (AI conversations, docs, code‑analysis reports; no synthetic data), totaling 6,484 lines and 128.55 KB. A representative sample, plus the overall totals:
| File | Lines | Size | Incremark | Streamdown | markstream‑vue | ant‑design‑x |
|---|---|---|---|---|---|---|
| test‑footnotes‑simple.md | 15 | 0.09 KB | 0.3 ms | 0.0 ms | 1.4 ms | 0.2 ms |
| simple‑paragraphs.md | 16 | 0.41 KB | 0.9 ms | 0.9 ms | 5.9 ms | 1.0 ms |
| introduction.md | 34 | 1.57 KB | 5.6 ms | 12.6 ms | 75.6 ms | 12.8 ms |
| footnotes.md | 52 | 0.94 KB | 1.7 ms | 0.2 ms | 10.6 ms | 1.9 ms |
| concepts.md | 91 | 4.29 KB | 12.0 ms | 50.5 ms | 381.9 ms | 53.6 ms |
| comparison.md | 109 | 5.39 KB | 20.5 ms | 74.0 ms | 552.2 ms | 85.2 ms |
| complex‑html‑examples.md | 147 | 3.99 KB | 9.0 ms | 58.8 ms | 279.3 ms | 57.2 ms |
| FOOTNOTE_FIX_SUMMARY.md | 236 | 3.93 KB | 22.7 ms | 0.5 ms | 535.0 ms | 120.8 ms |
| OPTIMIZATION_SUMMARY.md | 391 | 6.24 KB | 19.1 ms | 208.4 ms | 980.6 ms | 217.8 ms |
| BLOCK_TRANSFORMER_ANALYSIS.md | 489 | 9.24 KB | 75.7 ms | 574.3 ms | 1984.1 ms | 619.9 ms |
| test‑md‑01.md | 916 | 17.67 KB | 87.7 ms | 1441.1 ms | 5754.7 ms | 1656.9 ms |
| Total (38 files) | 6,484 | 128.55 KB | 519.4 ms | 3,190.3 ms | 14,683.9 ms | 3,728.6 ms |
Being Honest: Where We’re Slower
You’ll notice something odd: for footnotes.md and FOOTNOTE_FIX_SUMMARY.md, Streamdown appears much faster.
| File | Incremark | Streamdown | Why? |
|---|---|---|---|
| footnotes.md | 1.7 ms | 0.2 ms | Streamdown doesn’t support footnotes |
| FOOTNOTE_FIX_SUMMARY.md | 22.7 ms | 0.5 ms | Same — it just skips them |
This isn’t a performance issue — it’s a feature difference.
When Streamdown encounters [^1] footnote syntax, it simply ignores it. Incremark fully implements footnotes, and we had to solve a tricky streaming‑specific problem: references often arrive before definitions.
```
Chunk 1: "See footnote[^1] for details..."   // reference arrives first
Chunk 2: "More content..."
Chunk 3: "[^1]: This is the definition"      // definition arrives later
```
Traditional parsers assume a complete document. We built “optimistic references” that gracefully handle incomplete links/images during streaming, then resolve them when definitions arrive.
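One way to implement that, sketched in TypeScript (illustrative only, not Incremark’s actual code): emit a placeholder for every reference immediately, track unresolved ids, and patch the placeholders once definitions stream in.

```ts
// Illustrative sketch of "optimistic references" (not Incremark's code).
class FootnoteRegistry {
  private definitions = new Map<string, string>()
  private unresolved = new Set<string>()

  // Called when [^id] appears mid-stream, possibly before its definition.
  reference(id: string): string {
    if (!this.definitions.has(id)) this.unresolved.add(id)
    return `<sup><a href="#fn-${id}">[${id}]</a></sup>` // optimistic placeholder
  }

  // Called when "[^id]: text" finally arrives; marks the reference resolvable
  // so the renderer can patch its placeholder on the next pass.
  define(id: string, text: string): void {
    this.definitions.set(id, text)
    this.unresolved.delete(id)
  }
}
```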
We also fully implement math blocks ($…$) and custom containers (:::tip) because those are common in AI‑generated content.
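Concretely, that means streamed input like this renders progressively instead of being silently dropped:

```markdown
Inline math $e^{i\pi} + 1 = 0$ and block math:

$$
\sum_{k=1}^{n} k = \frac{n(n+1)}{2}
$$

:::tip
Containers like this are common in AI-generated docs.
:::
```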
Where We Actually Shine
Excluding footnote files, look at standard markdown performance:
| File | Lines | Incremark | Streamdown | Advantage |
|---|---|---|---|---|
| concepts.md | 91 | 12.0 ms | 50.5 ms | 4.2× |
| comparison.md | 109 | 20.5 ms | 74.0 ms | 3.6× |
| complex‑html‑examples.md | 147 | 9.0 ms | 58.8 ms | 6.6× |
| OPTIMIZATION_SUMMARY.md | 391 | 19.1 ms | 208.4 ms | 10.9× |
| test‑md‑01.md | 916 | 87.7 ms | 1441.1 ms | 16.4× |
The pattern is clear: the larger the document, the bigger our advantage.
For the largest file (17.67 KB):
| Library | Time | Relative |
|---|---|---|
| Incremark | 88 ms | — |
| ant‑design‑x | 1,657 ms | 18.9× slower |
| markstream‑vue | 5,755 ms | 65.6× slower |
When to Use Incremark
✅ Use Incremark for:
- AI chat with streaming output (Claude, ChatGPT, etc.)
- Long‑form AI content (reasoning models, code generation)
- Real‑time markdown editors
- Content requiring footnotes, math, or custom containers
- 100K+ token conversations
⚠️ Consider alternatives for:
- One‑time static markdown rendering (just use `marked` directly)
- Very small files (e.g., a few lines of text)
Getting started takes a few lines. Here’s the Vue 3 version:
```vue
<script setup>
import { ref } from 'vue'
import { IncremarkContent } from '@incremark/vue'

const content = ref('')
const isFinished = ref(false)

async function handleStream(stream) {
  for await (const chunk of stream) {
    content.value += chunk
  }
  isFinished.value = true
}
</script>

<template>
  <!-- Prop names here are illustrative; check the Incremark docs for the exact API -->
  <IncremarkContent :content="content" :finished="isFinished" />
</template>
```
We support Vue 3, React 18, and Svelte 5 with identical APIs—one core, three frameworks, zero behavior differences.
What’s Next
Version 0.3.0 is just the beginning.
The AI world is moving toward longer outputs, more complex reasoning traces, and richer formatting. Traditional parsers can’t keep up—their O(n²) architecture guarantees it.
We built Incremark because we needed it. We hope you find it useful too.
- 📚 Docs:
- 💻 GitHub:
- 🎮 Live Demos:
- Vue:
- React:
- Svelte:
If this saved you debugging time, a ⭐️ on GitHub would mean a lot. Questions? Open an issue or drop a comment below.