From O(n²) to O(n): Building a Streaming Markdown Renderer for the AI Era

Published: January 7, 2026 at 10:32 PM EST
5 min read
Source: Dev.to

If you’ve built an AI chat application, you’ve probably noticed something frustrating: the longer the conversation gets, the slower the rendering becomes.

The reason is simple — every time the AI outputs a new token, traditional markdown parsers re‑parse the entire document from scratch. This is a fundamental architectural problem, and it only gets worse as AI outputs get longer.

We built Incremark to fix this.


The Uncomfortable Truth About AI in 2025

If you’ve been following AI trends, you know the numbers are getting crazy:

| Year | Typical Output |
| --- | --- |
| 2022 | GPT‑3.5 responses? A few hundred words, no big deal |
| 2023 | GPT‑4 cranks it up to 2,000–4,000 words |
| 2024–2025 | Reasoning models (o1, DeepSeek R1) are outputting 10,000+ word “thinking processes” |

We’re moving from 4K‑token conversations to 32K, even 128K. And here’s the thing nobody talks about: rendering 500 words and rendering 50,000 words of Markdown are completely different engineering problems.

Most markdown libraries? They were built for blog posts, not for AI that thinks out loud.


Why Your Markdown Parser is Lying to You

Here’s what happens under the hood when you stream AI output through a traditional parser:

Chunk 1: Parse 100 chars ✓
Chunk 2: Parse 200 chars (100 old + 100 new)
Chunk 3: Parse 300 chars (200 old + 100 new)
...
Chunk 100: Parse 10,000 chars 😰

Total work: 100 + 200 + 300 + … + 10,000 = 505,000 character operations.

That’s O(n²). The cost doesn’t just grow — it explodes.
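Here is the shape of that naive loop, as a minimal sketch (using marked as the stand-in parser; any full-document parser behaves the same way):

import { marked } from 'marked'

let buffer = ''

// Naive streaming render: every chunk triggers a full re-parse.
async function renderNaive(stream: AsyncIterable<string>, el: HTMLElement) {
  for await (const chunk of stream) {
    buffer += chunk
    // Re-parses ALL accumulated text: O(n) work per chunk, O(n²) overall.
    el.innerHTML = await marked.parse(buffer)
  }
}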

For a 20 KB AI response, this means:

| Library | Parsing time |
| --- | --- |
| ant‑design‑x | 1,657 ms |
| markstream‑vue | 5,755 ms (almost 6 seconds of parsing!) |

These are popular, well‑maintained libraries. The problem isn’t bad code — it’s the wrong architecture.


The Key Insight

Once a markdown block is “complete”, it will never change.

Think about it. When the AI outputs:

# Heading

This is a paragraph.

After that second blank line, the paragraph is done. Locked in. No matter what comes next — code blocks, lists, more paragraphs — that paragraph will never be touched again.

So why are we re‑parsing it 500 times?


How Incremark Actually Works

We built Incremark around this insight. The core algorithm:

  1. Detect stable boundaries — blank lines, new headings, fence closings.
  2. Cache completed blocks — never touch them again.
  3. Only re‑parse the pending block — the one still receiving input.

Chunk 1: Parse 100 chars → cache stable blocks
Chunk 2: Parse only ~100 new chars
Chunk 3: Parse only ~100 new chars
...
Chunk 100: Parse only ~100 new chars

Total work: 100 × 100 = 10,000 character operations.

That’s roughly 50× less work. Each character is parsed at most once → O(n).
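In code, the core loop looks something like this. It is a simplified sketch of the caching strategy, not Incremark's actual internals; parse stands in for any block-level markdown parser:

// Find the last "stable" boundary: a blank line outside a code fence.
function lastStableBoundary(text: string): number {
  let inFence = false
  let offset = 0
  let boundary = 0
  for (const line of text.split('\n')) {
    if (/^(```|~~~)/.test(line)) inFence = !inFence
    offset += line.length + 1
    if (!inFence && line.trim() === '') boundary = offset
  }
  return boundary
}

class IncrementalRenderer {
  private stableHtml = ''  // completed blocks, parsed once and cached
  private pending = ''     // the one block still receiving input

  feed(chunk: string, parse: (md: string) => string): string {
    this.pending += chunk
    const cut = lastStableBoundary(this.pending)
    if (cut > 0) {
      // Everything before the boundary is locked in: parse once, cache forever.
      this.stableHtml += parse(this.pending.slice(0, cut))
      this.pending = this.pending.slice(cut)
    }
    // Only the short pending tail is re-parsed on each chunk.
    return this.stableHtml + parse(this.pending)
  }
}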


Complete Benchmark Data

We benchmarked 38 real markdown files — AI conversations, docs, code‑analysis reports (not synthetic data). Total: 6,484 lines, 128.55 KB.

| File | Lines | Size | Incremark | Streamdown | markstream‑vue | ant‑design‑x |
| --- | --- | --- | --- | --- | --- | --- |
| test‑footnotes‑simple.md | 15 | 0.09 KB | 0.3 ms | 0.0 ms | 1.4 ms | 0.2 ms |
| simple‑paragraphs.md | 16 | 0.41 KB | 0.9 ms | 0.9 ms | 5.9 ms | 1.0 ms |
| introduction.md | 34 | 1.57 KB | 5.6 ms | 12.6 ms | 75.6 ms | 12.8 ms |
| footnotes.md | 52 | 0.94 KB | 1.7 ms | 0.2 ms | 10.6 ms | 1.9 ms |
| concepts.md | 91 | 4.29 KB | 12.0 ms | 50.5 ms | 381.9 ms | 53.6 ms |
| comparison.md | 109 | 5.39 KB | 20.5 ms | 74.0 ms | 552.2 ms | 85.2 ms |
| complex‑html‑examples.md | 147 | 3.99 KB | 9.0 ms | 58.8 ms | 279.3 ms | 57.2 ms |
| FOOTNOTE_FIX_SUMMARY.md | 236 | 3.93 KB | 22.7 ms | 0.5 ms | 535.0 ms | 120.8 ms |
| OPTIMIZATION_SUMMARY.md | 391 | 6.24 KB | 19.1 ms | 208.4 ms | 980.6 ms | 217.8 ms |
| BLOCK_TRANSFORMER_ANALYSIS.md | 489 | 9.24 KB | 75.7 ms | 574.3 ms | 1984.1 ms | 619.9 ms |
| test‑md‑01.md | 916 | 17.67 KB | 87.7 ms | 1441.1 ms | 5754.7 ms | 1656.9 ms |
| Total (38 files) | 6,484 | 128.55 KB | 519.4 ms | 3,190.3 ms | 14,683.9 ms | 3,728.6 ms |
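For context, the measurement itself has a simple shape: feed each file to a renderer in fixed-size chunks to simulate streaming, and time the whole run. A minimal sketch (the chunk size and function name here are assumptions, not the project's actual benchmark code):

import { performance } from 'node:perf_hooks'

// Simulate streaming arrival and measure total parsing time.
function timeStreamingParse(
  render: (chunk: string) => void,
  text: string,
  chunkSize = 64,
): number {
  const start = performance.now()
  for (let i = 0; i < text.length; i += chunkSize) {
    render(text.slice(i, i + chunkSize))
  }
  return performance.now() - start
}

Wiring in the naive loop shows the O(n²) behavior; wiring in an incremental renderer like the sketch above shows O(n).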

Being Honest: Where We’re Slower

You’ll notice something odd: for footnotes.md and FOOTNOTE_FIX_SUMMARY.md, Streamdown appears much faster.

| File | Incremark | Streamdown | Why? |
| --- | --- | --- | --- |
| footnotes.md | 1.7 ms | 0.2 ms | Streamdown doesn’t support footnotes |
| FOOTNOTE_FIX_SUMMARY.md | 22.7 ms | 0.5 ms | Same — it just skips them |

This isn’t a performance issue — it’s a feature difference.

When Streamdown encounters [^1] footnote syntax, it simply ignores it. Incremark fully implements footnotes, and we had to solve a tricky streaming‑specific problem: references often arrive before definitions.

Chunk 1: "See footnote[^1] for details..."   // reference first
Chunk 2: "More content..."
Chunk 3: "[^1]: This is the definition"      // definition later

Traditional parsers assume a complete document. We built “optimistic references” that gracefully handle incomplete links/images during streaming, then resolve them when definitions arrive.
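A sketch of the idea (simplified; these names are illustrative, not Incremark's API):

class FootnoteResolver {
  private pending = new Set<string>()

  // A reference renders immediately as an optimistic placeholder link.
  reference(id: string): string {
    this.pending.add(id) // the definition may not have arrived yet
    return `<sup><a href="#fn-${id}">[${id}]</a></sup>`
  }

  // When the definition streams in, the anchor target appears and the
  // earlier placeholder resolves without re-rendering old blocks.
  define(id: string, text: string): string {
    this.pending.delete(id)
    return `<p id="fn-${id}">${id}. ${text}</p>`
  }

  // Refs still unresolved when the stream ends can be cleaned up here.
  unresolved(): string[] {
    return [...this.pending]
  }
}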

We also fully implement math blocks ($…$) and custom containers (:::tip) because those are common in AI‑generated content.
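For example, streamed content like this renders progressively as it arrives:

Einstein’s relation is $E = mc^2$.

:::tip
Custom containers like this one show up constantly in AI‑generated docs.
:::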


Where We Actually Shine

Excluding footnote files, look at standard markdown performance:

| File | Lines | Incremark | Streamdown | Advantage |
| --- | --- | --- | --- | --- |
| concepts.md | 91 | 12.0 ms | 50.5 ms | 4.2× |
| comparison.md | 109 | 20.5 ms | 74.0 ms | 3.6× |
| complex‑html‑examples.md | 147 | 9.0 ms | 58.8 ms | 6.6× |
| OPTIMIZATION_SUMMARY.md | 391 | 19.1 ms | 208.4 ms | 10.9× |
| test‑md‑01.md | 916 | 87.7 ms | 1441.1 ms | 16.4× |

The pattern is clear: the larger the document, the bigger our advantage.

For the largest file (17.67 KB):

| Library | Time | Relative |
| --- | --- | --- |
| Incremark | 88 ms | 1× (baseline) |
| ant‑design‑x | 1,657 ms | 18.9× slower |
| markstream‑vue | 5,755 ms | 65.6× slower |


When to Use Incremark

Use Incremark for:

  • AI chat with streaming output (Claude, ChatGPT, etc.)
  • Long‑form AI content (reasoning models, code generation)
  • Real‑time markdown editors
  • Content requiring footnotes, math, or custom containers
  • 100K+ token conversations

⚠️ Consider alternatives for:

  • One‑time static markdown rendering (just use marked directly)
  • Very small files (e.g., a few lines of text)
Here is what wiring a stream into the Vue binding looks like (the template props below are assumptions for illustration; check the docs for the exact API):

<script setup>
import { ref } from 'vue'
import { IncremarkContent } from '@incremark/vue'

const content = ref('')
const isFinished = ref(false)

// Append each chunk as it arrives; only the pending block gets re-parsed.
async function handleStream(stream) {
  for await (const chunk of stream) {
    content.value += chunk
  }
  isFinished.value = true
}
</script>

<template>
  <!-- Prop names assumed for illustration; see the Incremark docs. -->
  <IncremarkContent :content="content" :finished="isFinished" />
</template>

We support Vue 3, React 18, and Svelte 5 with identical APIs — one core, three frameworks, zero behavior differences.


What’s Next

Version 0.3.0 is just the beginning.

The AI world is moving toward longer outputs, more complex reasoning traces, and richer formatting. Traditional parsers can’t keep up — their O(n²) architecture guarantees it.

We built Incremark because we needed it. We hope you find it useful too.

  • 📚 Docs:
  • 💻 GitHub:
  • 🎮 Live Demos:
    • Vue:
    • React:
    • Svelte:

If this saved you debugging time, a ⭐️ on GitHub would mean a lot. Questions? Open an issue or drop a comment below.
