I tried Cloudflare’s “Markdown for Agents” idea in NGINX (Rust module) — early prototype

Published: March 2, 2026 at 02:21 AM EST
Source: Dev.to

Introduction

Cloudflare recently shipped Markdown for Agents: if a client sends Accept: text/markdown, Cloudflare can fetch your HTML and return a Markdown variant.
Inspired by this, I built a self‑hostable NGINX dynamic module that does something similar on your own infrastructure. This is a very early starter prototype, mainly meant to help people try the workflow and share feedback.

Repository:

How it works

Client    Request header           Result
Browser   Accept: text/html        Original HTML
Agent     Accept: text/markdown    NGINX converts upstream HTML to Markdown and returns text/markdown

  • No application changes are required; the conversion sits entirely at the reverse‑proxy layer.
  • Best for: documentation, blogs, news, knowledge‑base pages.
  • Not suitable for: APIs, streaming responses, or authenticated pages (unless you handle caching carefully).

Agents and LLM tools often fetch full HTML and waste tokens on navigation, footers, cookie banners, layout markup, scripts, and noisy attributes. A Markdown variant can make downstream parsing cheaper and more predictable.
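
The Accept-based switch described above can be illustrated with a small Python sketch. This is not the module's actual negotiation code (I have not confirmed how it parses the header); the function names `parse_accept` and `wants_markdown` are hypothetical, and the rule used here is an assumption: serve Markdown only when text/markdown is listed explicitly with a non-zero q-value and is at least as preferred as text/html.

```python
def parse_accept(header: str) -> list[tuple[str, float]]:
    """Parse an Accept header into (media_type, q) pairs."""
    items = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media = fields[0].lower()
        if not media:
            continue
        q = 1.0  # q defaults to 1.0 when the parameter is absent
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        items.append((media, q))
    return items


def wants_markdown(header: str) -> bool:
    """Assumed rule: explicit text/markdown, q > 0, and not ranked below text/html."""
    prefs = dict(parse_accept(header))
    md = prefs.get("text/markdown", 0.0)
    html = prefs.get("text/html", 0.0)
    return md > 0 and md >= html
```

Under this rule a browser-style header that only mentions text/html and `*/*` keeps getting HTML, because a wildcard match is deliberately not treated as a request for Markdown.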

Installation

# Install the module
curl -sSL https://raw.githubusercontent.com/cnkang/nginx-markdown-for-agents/main/tools/install.sh | sudo bash

# Test and reload NGINX
sudo nginx -t && sudo nginx -s reload

Note: Dynamic modules must match your exact NGINX patch version (nginx -v). If a matching build isn’t available, you may need to compile the module yourself.
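
To make the version-match requirement concrete, here is a hedged Python sketch that compares the version printed by nginx -v against the version a module build targets. The helper names and the `module_built_for` string are hypothetical; how the project actually labels its builds is not something the post states.

```python
import re


def nginx_version(nginx_v_output: str) -> tuple[int, ...]:
    """Extract (major, minor, patch) from text such as 'nginx version: nginx/1.25.3'."""
    match = re.search(r"nginx/(\d+)\.(\d+)\.(\d+)", nginx_v_output)
    if match is None:
        raise ValueError(f"no nginx version found in {nginx_v_output!r}")
    return tuple(int(part) for part in match.groups())


def module_matches(nginx_v_output: str, module_built_for: str) -> bool:
    """True only when major, minor, and patch all agree."""
    wanted = tuple(int(part) for part in module_built_for.split("."))
    return nginx_version(nginx_v_output) == wanted
```

Note that nginx -v writes to stderr, so when calling it from a script you would capture that stream, for example with `subprocess.run(["nginx", "-v"], capture_output=True, text=True).stderr`.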

Verifying the Markdown Variant

# Request Markdown
curl -sD - -o /dev/null -H "Accept: text/markdown" http://localhost:8080/ | grep -iE 'content-type|vary'
# Expected output:
# content-type: text/markdown; charset=utf-8
# vary: Accept

Verifying the HTML Variant

curl -sD - -o /dev/null -H "Accept: text/html" http://localhost:8080/ | grep -i 'content-type'

Sample Request

curl -s -H "Accept: text/markdown" http://localhost:8080/ | head -40
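
If you would rather script these checks than eyeball curl output, something like the following Python sketch could work. It assumes the same http://localhost:8080/ endpoint used above; `check_variant` and `fetch_headers` are hypothetical helper names, and only the standard library is used.

```python
from urllib.request import Request, urlopen


def check_variant(headers: dict[str, str], expected_type: str,
                  require_vary_accept: bool = False) -> list[str]:
    """Return a list of problems found in the response headers (empty means OK)."""
    lowered = {k.lower(): v for k, v in headers.items()}
    problems = []
    content_type = lowered.get("content-type", "")
    # Compare only the media type, ignoring parameters such as charset.
    if content_type.split(";")[0].strip().lower() != expected_type:
        problems.append(f"content-type is {content_type!r}, expected {expected_type}")
    if require_vary_accept:
        vary = [v.strip().lower() for v in lowered.get("vary", "").split(",")]
        if "accept" not in vary:
            problems.append("Vary does not include Accept")
    return problems


def fetch_headers(url: str, accept: str) -> dict[str, str]:
    """Fetch a URL with the given Accept header and return its response headers."""
    request = Request(url, headers={"Accept": accept})
    with urlopen(request, timeout=10) as response:
        return dict(response.headers.items())
```

A run against a local instance might then call `check_variant(fetch_headers("http://localhost:8080/", "text/markdown"), "text/markdown", require_vary_accept=True)` and treat a non-empty list as a failure.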

Configuration

Start small: enable the filter on a single route first.

load_module modules/ngx_http_markdown_filter_module.so;

http {
    markdown_filter off;

    server {
        listen 8080;

        location /docs/ {
            markdown_filter on;

            # Recommended: avoid upstream compression for clean conversion
            proxy_set_header Accept-Encoding "";

            proxy_pass http://backend;
        }
    }
}

To return the original HTML when conversion fails, set:

markdown_on_error pass;

Limiting Work

markdown_max_size 10m;
markdown_timeout 5s;
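
The fail-open behavior and the size limit can be pictured with a short Python sketch. This is an illustration of the idea, not the module's code: `respond` and the `convert` callback are hypothetical, the timeout is left out for brevity, and passing oversized pages through as HTML is my assumption about what the limit does.

```python
from typing import Callable

MAX_SIZE = 10 * 1024 * 1024  # mirrors "markdown_max_size 10m" (NGINX "m" means mebibytes)


def respond(html: bytes, convert: Callable[[bytes], bytes],
            max_size: int = MAX_SIZE) -> tuple[str, bytes]:
    """Return (content_type, body), falling back to the original HTML."""
    if len(html) > max_size:
        # Assumption: oversized pages are passed through unconverted.
        return ("text/html", html)
    try:
        return ("text/markdown", convert(html))
    except Exception:
        # Fail open: a conversion error yields the original HTML.
        return ("text/html", html)
```

The point of the design is that a bad page degrades to what the client would have received without the module, rather than to an error response.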

Metrics Endpoint (localhost only)

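location /markdown-metrics {
    # Explicit localhost restriction using NGINX's standard allow/deny directives
    allow 127.0.0.1;
    allow ::1;
    deny all;
    markdown_metrics;
}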

Cache Considerations

If you cache at NGINX or a CDN, ensure variants are split by the Accept header:

proxy_cache_key "$scheme$request_method$host$request_uri$http_accept";
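
One thing to watch: raw Accept values differ between clients, so keying on the full header can store many copies of the same HTML. A possible refinement is to collapse the header into two buckets before it enters the key. The Python sketch below only illustrates that idea; `variant_bucket` and `cache_key` are hypothetical names, the substring test is a simplification, and the bucket rule would have to match whatever the module actually decides. In NGINX itself this kind of derived value is typically built with a map block over $http_accept.

```python
def variant_bucket(accept: str) -> str:
    """Collapse an Accept header into 'markdown' or 'html' (simplified rule)."""
    return "markdown" if "text/markdown" in accept.lower() else "html"


def cache_key(scheme: str, method: str, host: str, uri: str, accept: str) -> str:
    """Build a key shaped like the proxy_cache_key above, with a bucket instead of the raw header."""
    return f"{scheme}{method}{host}{uri}{variant_bucket(accept)}"
```

With this shape, two clients sending different HTML-oriented Accept headers share one cache entry, while a text/markdown request gets its own.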

Caveats

  • Edge cases will exist (weird HTML, giant pages, odd encodings).
  • The module focuses on HTML → Markdown only (no PDFs or arbitrary binaries).
  • Caching needs care (variant keys + auth‑aware behavior).

If you encounter a broken page, a very slow page, or a caching issue, please open an issue with:

  • A sample URL (or anonymized HTML)
  • Output of nginx -v
  • Whether the upstream is compressed
  • Any cache/CDN in front

References

  • Cloudflare inspiration – Blog:
  • Cloudflare docs:
  • Project repository: