Generating PDFs from HTML in Node.js (and why I stopped using Puppeteer)
Source: Dev.to
What’s actually wrong with Puppeteer
- Memory – Chromium is a full browser. Each instance consumes 300–500 MB. If a render crashes, the browser process may be left running, and enough orphaned processes will eventually exhaust server memory under real traffic.
- Cold starts – Spinning up a Chromium instance takes 1–3 seconds, every time. On serverless functions that scale to zero, every cold invocation pays this latency.
- Fonts and assets – Puppeteer runs sandboxed. Anything loaded from a relative path or file:// URL either fails silently or renders incorrectly. PDFs that look fine locally can appear broken in production.
- Server dependencies – Chromium requires libraries such as libglib, libnss, libatk, etc., which are not present on a vanilla Ubuntu server. Each new environment becomes a fresh debugging session, and Docker images grow by ~400 MB.
None of this is Puppeteer’s fault; it’s a browser‑automation tool being asked to do something it wasn’t really designed for.
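To make the memory point concrete, here is a back-of-the-envelope sketch using the 300–500 MB per-instance range from the list above. The 2 GB server size and the 512 MB reserved for the OS and the app are assumptions I picked for illustration, not measurements.

```javascript
// Rough capacity estimate for concurrent Chromium instances.
// perInstanceMb comes from the 300-500 MB range quoted above;
// reservedMb (memory kept for the OS and the app itself) is an assumed value.
function maxChromiumInstances(totalMb, perInstanceMb, reservedMb = 512) {
  const usableMb = Math.max(0, totalMb - reservedMb);
  return Math.floor(usableMb / perInstanceMb);
}

const serverMb = 2048; // hypothetical 2 GB server
const bestCase = maxChromiumInstances(serverMb, 300);
const worstCase = maxChromiumInstances(serverMb, 500);

console.log(`Concurrent renders on ${serverMb} MB: ${worstCase} to ${bestCase}`);
```

Under those assumptions a 2 GB box handles only a handful of simultaneous renders, and that is before any crashed browser process is left holding memory.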
The other options people try
wkhtmltopdf
Uses WebKit to render HTML. It’s fast and lightweight, with no browser process to manage.
Drawback: No release since 2020, and the project has since been archived. CSS support is frozen around 2013 (no flexbox, grid, or CSS variables), so modern layouts will break.
PDFKit / jsPDF
Describe the document in code—place text, draw lines, set fonts. Very precise and works well for fixed‑layout documents.
Drawback: You can’t reuse HTML templates. Every design change requires editing code, and even a simple invoice with a dynamic table becomes verbose.
An API
Send HTML, get a PDF back. The rendering infrastructure is handled by a third party—no Chromium to manage, no system dependencies, nothing to deploy. This is where most teams land after exhausting the alternatives.
What using an API actually looks like
Here’s a basic Node.js example using LightningPDF:
const response = await fetch("https://lightningpdf.dev/api/v1/pdf/generate", {
  method: "POST",
  headers: {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    html: `
      <h2>Invoice #1042</h2>
      <p>Due: March 31, 2026</p>
    `,
    options: { format: "A4" }
  })
});

// The response JSON carries the PDF as a base64 string in data.pdf
const { data } = await response.json();
const pdfBuffer = Buffer.from(data.pdf, "base64");
That’s it. Your existing HTML templates work as‑is, Tailwind classes render without a build step, and the migration from Puppeteer is mostly just removing the browser setup/teardown code.
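In production you will probably want the request wrapped in a function that fails loudly. The following is a minimal sketch, not the official client: it assumes that failed requests come back with a non-2xx status and that successful ones use the `{ data: { pdf } }` shape from the example. The `generatePdf` name and the injectable `fetchImpl` parameter are my own additions to make the function easy to test.

```javascript
// Sketch of a wrapper around the generate request.
// Assumptions: errors use non-2xx status codes, and success responses
// look like { data: { pdf: "<base64>" } } as in the example above.
const API_URL = "https://lightningpdf.dev/api/v1/pdf/generate";

async function generatePdf(html, { apiKey, fetchImpl = fetch, options = { format: "A4" } } = {}) {
  const response = await fetchImpl(API_URL, {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${apiKey}`,
      "Content-Type": "application/json"
    },
    body: JSON.stringify({ html, options })
  });

  // Surface HTTP failures instead of trying to decode an error body as a PDF
  if (!response.ok) {
    throw new Error(`PDF request failed with HTTP ${response.status}`);
  }

  const payload = await response.json();
  if (!payload.data || typeof payload.data.pdf !== "string") {
    throw new Error("PDF response did not contain data.pdf");
  }

  // Decode the base64 string into raw PDF bytes
  return Buffer.from(payload.data.pdf, "base64");
}
```

Calling it looks like `const pdf = await generatePdf(html, { apiKey })`, after which the buffer can be written to disk with `fs.promises.writeFile` or streamed back to the client.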
Templates for documents you generate repeatedly
If you generate invoices or reports with a stable structure, build the template once in a visual designer and pass data at render time:
body: JSON.stringify({
  template_id: "invoice-001",
  data: {
    company: "Acme Corp",
    invoice_number: "1042",
    items: [
      { name: "Web development", quantity: 10, price: 150 },
      { name: "Design review", quantity: 2, price: 200 }
    ]
  }
})
No HTML string concatenation in your app code—the template lives separately and gets filled in at render time.
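In practice the data usually comes from your own records, so a small mapping function keeps the template fields in one place. This is a sketch under assumptions: the output fields (`company`, `invoice_number`, `items`) match the example above, while the input record shape (`customerName`, `number`, `lines`) and the `buildInvoiceRequest` name are hypothetical.

```javascript
// Maps an application-side invoice record onto the template fields used above.
// The input record shape is hypothetical; adapt it to your own data model.
function buildInvoiceRequest(invoice) {
  if (!Array.isArray(invoice.lines) || invoice.lines.length === 0) {
    throw new Error("Invoice must have at least one line item");
  }
  return {
    template_id: "invoice-001",
    data: {
      company: invoice.customerName,
      invoice_number: String(invoice.number),
      items: invoice.lines.map((line) => ({
        name: line.description,
        quantity: line.qty,
        price: line.unitPrice
      }))
    }
  };
}

// The serialized result is what goes into the fetch call's body
const requestBody = JSON.stringify(buildInvoiceRequest({
  customerName: "Acme Corp",
  number: 1042,
  lines: [{ description: "Web development", qty: 10, unitPrice: 150 }]
}));
```

Keeping the mapping in one function means a renamed template field is a one-line change rather than a search through the codebase.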
Batch generation
For bulk jobs (end‑of‑month invoices, report runs) use the async endpoint and receive a webhook when the PDFs are ready:
const response = await fetch("https://lightningpdf.dev/api/v1/pdf/async", {
  method: "POST",
  headers: {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    template_id: "monthly-statement",
    data: { user_id: "usr_123", month: "February" },
    webhook_url: "https://yourapp.com/webhooks/pdf-ready"
  })
});
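On the receiving side you need an endpoint at the `webhook_url` path. Below is a minimal sketch using Node's built-in `http` module. I don't know the actual webhook payload, so the `job_id` and `status` fields are placeholders; check the provider's documentation for the real schema before relying on this.

```javascript
const http = require("node:http");

// Assumed payload: JSON with a job identifier and a status field.
// The field names (job_id, status) are placeholders, not a documented schema.
function parseWebhook(rawBody) {
  const event = JSON.parse(rawBody);
  if (typeof event.job_id !== "string") {
    throw new Error("Webhook payload is missing job_id");
  }
  return { jobId: event.job_id, status: event.status ?? "unknown" };
}

// Builds a server that accepts POSTs on the path registered as webhook_url
function createWebhookServer(onReady) {
  return http.createServer((req, res) => {
    if (req.method !== "POST" || req.url !== "/webhooks/pdf-ready") {
      res.writeHead(404).end();
      return;
    }
    let raw = "";
    req.setEncoding("utf8");
    req.on("data", (chunk) => { raw += chunk; });
    req.on("end", () => {
      try {
        onReady(parseWebhook(raw));
        res.writeHead(204).end(); // acknowledge receipt with no body
      } catch (err) {
        res.writeHead(400).end(); // malformed or unexpected payload
      }
    });
  });
}

// Usage: createWebhookServer((event) => console.log(event)).listen(3000);
```

The handler only acknowledges the event and hands it to a callback, so the slow work (downloading the PDF, emailing it) can happen outside the request.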
Rough performance numbers
| Approach | Typical render time | Memory overhead |
|---|---|---|
| Puppeteer (self‑hosted) | 2–4 s | 300–500 MB per instance |
| wkhtmltopdf | 0.5–1 s | Low |
| API (simple docs) | < 100 ms | Not your problem |
| API (complex CSS) | 1–3 s | Not your problem |
The speed difference for simple documents is significant. A Go‑native renderer can produce a basic invoice in under 100 ms, whereas Chromium only becomes necessary for complex HTML/CSS.
Is this worth it for a small project?
Probably yes, mainly because of deployment complexity. Even if you generate only 20 PDFs a month, avoiding Chromium installation on every server is valuable. Most PDF APIs offer a free tier that covers low volume.
For anything with real traffic or batch generation, the benefits are clearer—you no longer worry about memory limits or process management.
What are you currently using for PDF generation? If you’ve found a way to make self‑hosted Puppeteer work well in production, I’d love to hear about it.