rakers — a headless JS renderer in Rust
Source: Dev.to
Overview
A lot of useful content on the web only exists after JavaScript runs. Server‑side rendering has made a comeback, but many sites still ship a near‑empty HTML skeleton and populate it entirely client‑side. To extract that content—for archiving, testing, or processing—you need to run the JavaScript first.
The standard answer is a headless browser (Playwright, Puppeteer, or a self‑hosted Chrome). These work, but they are heavy: Chrome’s footprint is around 300 MB, startup takes one to two seconds, and running it in CI requires care. For many use cases the full browser is overkill—you don’t need CSS layout, GPU compositing, or WebGL; you just need the DOM after scripts have run.
rakers is an attempt to find the floor on that problem. It is a single ~10 MB binary that parses HTML, runs the JavaScript, and returns the post‑execution HTML.
Pipeline
- Parse – html5ever (Servo’s HTML5 parser, published as a standalone crate) turns the input into a DOM tree.
- Execute – <script> tags are collected in document order, inline and external alike. External scripts are fetched synchronously. All scripts are evaluated in a sandboxed JS context that exposes a browser‑compatible global environment: document, window, console, XMLHttpRequest, localStorage, setTimeout, and more.
- Serialize – The post‑execution DOM is serialized back to HTML and written to stdout (or a file).
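To make the "collected in document order, inline and external alike" step concrete, here is a minimal sketch using only the Rust standard library. It is not the rakers implementation: rakers walks the DOM produced by html5ever, whereas this sketch uses a naive string scanner that assumes lowercase tag names and double-quoted attributes. The names `ScriptSource`, `collect_scripts`, and `attr_value` are hypothetical and exist only for illustration.

```rust
/// Where a script's source text comes from (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
enum ScriptSource {
    /// Source text found between <script> and </script>.
    Inline(String),
    /// Value of the `src` attribute; the body is ignored in that case.
    External(String),
}

/// Naive scanner that returns scripts in document order.
/// Assumes lowercase tags and double-quoted attributes; a real HTML parser
/// such as html5ever handles the general case.
fn collect_scripts(html: &str) -> Vec<ScriptSource> {
    const OPEN: &str = "<script";
    const CLOSE: &str = "</script>";
    let mut scripts = Vec::new();
    let mut rest = html;
    while let Some(open) = rest.find(OPEN) {
        let after_open = &rest[open + OPEN.len()..];
        // Locate the end of the opening tag to isolate its attributes.
        let Some(tag_end) = after_open.find('>') else { break };
        let attrs = &after_open[..tag_end];
        let body_and_rest = &after_open[tag_end + 1..];
        let Some(close) = body_and_rest.find(CLOSE) else { break };
        let body = &body_and_rest[..close];
        match attr_value(attrs, "src") {
            Some(url) => scripts.push(ScriptSource::External(url.to_string())),
            None => scripts.push(ScriptSource::Inline(body.to_string())),
        }
        // Continue scanning after this script's closing tag.
        rest = &body_and_rest[close + CLOSE.len()..];
    }
    scripts
}

/// Returns the value of a double-quoted attribute, if present.
/// The leading space in the needle keeps `data-src` from matching `src`.
fn attr_value<'a>(attrs: &'a str, name: &str) -> Option<&'a str> {
    let needle = format!(" {name}=\"");
    let start = attrs.find(&needle)? + needle.len();
    let end = attrs[start..].find('"')? + start;
    Some(&attrs[start..end])
}
```

Calling `collect_scripts` on a page with one external and one inline script yields an `External` entry followed by an `Inline` entry, in the same order as they appear in the markup.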
JavaScript Engine
The JS engine is QuickJS via the rquickjs crate. QuickJS supports ES2023, is compact, and embeds cleanly into Rust without much ceremony. For environments without a C compiler there is an optional boa_engine backend, a pure‑Rust JS engine with a smaller compatibility surface but zero native dependencies.
Browser Environment Stub
The trickiest part is not the JS engine—it is the fake browser environment. Frameworks like React, Vue, and Svelte expect a fairly complete DOM API. Since rakers has no real layout engine, those APIs are implemented as a JavaScript stub in bootstrap.js that is injected before any page script runs.
The stub covers the most common patterns:
- Element creation and mutation
- getElementById, querySelector, querySelectorAll
- Event listener registration
- classList
- localStorage
- history.pushState
- XHR (backed by a synchronous Rust fetch via ureq)
Because XHR is synchronous, frameworks that load templates at runtime via XHR—like RiotJS—actually retrieve them.
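The ordering described in this section (stub first, then page scripts, with external sources fetched synchronously) can be sketched as a simple evaluation queue. This is an illustration under stated assumptions, not the rakers source: `PageScript` and `build_eval_queue` are hypothetical names, the `fetch` callback stands in for the blocking HTTP client, and skipping a script whose fetch fails is an assumption about error handling that the article does not confirm.

```rust
/// A page script as found in the document (hypothetical type for this sketch).
enum PageScript {
    Inline(String),
    External(String),
}

/// Builds the ordered list of sources to hand to the JS engine:
/// the environment stub first, then every page script in document order.
/// `fetch` is a blocking callback, so each external script is fully
/// retrieved before the next entry is considered.
fn build_eval_queue<F>(bootstrap: &str, scripts: &[PageScript], mut fetch: F) -> Vec<String>
where
    F: FnMut(&str) -> Result<String, String>,
{
    // The stub goes first so page code sees `document`, `window`, etc.
    let mut queue = vec![bootstrap.to_string()];
    for script in scripts {
        match script {
            PageScript::Inline(src) => queue.push(src.clone()),
            PageScript::External(url) => {
                // Assumption: a failed fetch skips that script instead of
                // aborting the whole render.
                if let Ok(src) = fetch(url) {
                    queue.push(src);
                }
            }
        }
    }
    queue
}
```

Because the callback blocks, no event loop or promise machinery is needed to preserve order; the queue is simply evaluated front to back.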
Known Gaps
- Layout‑dependent properties (offsetWidth, getBoundingClientRect) return zero.
- Native ES modules are skipped.
Despite these gaps, the stub is good enough to run 21 of the 23 TodoMVC implementations, providing a reasonable compatibility benchmark across React, Vue, Angular, Svelte, Preact, Mithril, Elm, Backbone, Ember, Knockout, and others.
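As an illustration of the ES module gap listed above, the sketch below shows one way a renderer could decide which <script> elements to run based on their type attribute. The rule used here (a missing or empty type, text/javascript, or application/javascript counts as a classic script; everything else, including module, is skipped) is an assumption for this sketch, and the function names are hypothetical. The article does not describe how rakers makes this decision.

```rust
/// Decides whether a <script> with the given `type` attribute would be run
/// as a classic script. Assumption for this sketch: only a missing or empty
/// type and the two common JavaScript MIME types count as classic.
fn is_classic_script(type_attr: Option<&str>) -> bool {
    match type_attr {
        None => true,
        Some(raw) => {
            // Attribute values are compared case-insensitively, ignoring
            // surrounding whitespace.
            let t = raw.trim().to_ascii_lowercase();
            matches!(t.as_str(), "" | "text/javascript" | "application/javascript")
        }
    }
}

/// Keeps only the sources that would be executed, preserving document order.
/// Each input pair is (type attribute, script source).
fn executable_scripts<'a>(scripts: &[(Option<&'a str>, &'a str)]) -> Vec<&'a str> {
    scripts
        .iter()
        .filter(|(ty, _)| is_classic_script(*ty))
        .map(|(_, src)| *src)
        .collect()
}
```

Under this rule a page that ships its application only as `<script type="module">` would produce no executed scripts, which is consistent with the gap described above.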
Usage Examples
# Basic usage
rakers https://example.com
# Pipe input from curl
curl -s https://example.com | rakers
# Use a SOCKS5 proxy
rakers --proxy socks5://127.0.0.1:9050 https://example.com
# Extract a specific selector
rakers --selector "article.post" https://example.com
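The same invocations can be driven from a Rust program with std::process::Command. The sketch below is a thin wrapper, not an official API: `rakers_args` and `render` are hypothetical helpers, the binary is assumed to be on PATH under the name rakers, and passing --proxy and --selector together is an assumption, since the examples above only show them separately.

```rust
use std::io;
use std::process::Command;

/// Builds the argument list for one rakers invocation.
/// Only the flags shown in the usage examples are used.
fn rakers_args(url: &str, proxy: Option<&str>, selector: Option<&str>) -> Vec<String> {
    let mut args = Vec::new();
    if let Some(p) = proxy {
        args.push("--proxy".to_string());
        args.push(p.to_string());
    }
    if let Some(s) = selector {
        args.push("--selector".to_string());
        args.push(s.to_string());
    }
    // The URL goes last, as in the command-line examples.
    args.push(url.to_string());
    args
}

/// Runs rakers and returns the rendered HTML it writes to stdout.
/// Assumes a `rakers` binary is available on PATH.
fn render(url: &str, proxy: Option<&str>, selector: Option<&str>) -> io::Result<String> {
    let output = Command::new("rakers")
        .args(rakers_args(url, proxy, selector))
        .output()?;
    if !output.status.success() {
        return Err(io::Error::new(
            io::ErrorKind::Other,
            format!("rakers exited with {}", output.status),
        ));
    }
    // Treat non-UTF-8 output as an error rather than silently altering it.
    String::from_utf8(output.stdout).map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}
```

For example, `render("https://example.com", None, Some("article.post"))` corresponds to the last command above and returns the extracted HTML as a String on success.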
Installation
Installation instructions are provided in the project’s README. Typically you can install via cargo:
cargo install rakers
Limitations
- No CSS engine, layout, WebGL, IndexedDB, or service workers.
- Anything that depends on those features will not work.
- Sites that fingerprint the JS environment or require a real navigator.plugins array will also fail.
- For such cases a real headless browser remains the appropriate tool.
Value Proposition
The proposition is narrow but real: for sites that render via a self‑contained JS bundle with no exotic browser dependencies, rakers is faster, smaller, and simpler to deploy than any solution that ships a copy of Chromium.