I built a Screenshot & Metadata API that extracts 50+ fields from any URL

Published: March 13, 2026 at 08:41 AM EDT
4 min read
Source: Dev.to

What it does

  • URL to Screenshot – capture any webpage as PNG, JPEG, or WebP
  • URL to PDF – generate PDFs with custom format, margins, orientation
  • Metadata Extraction – 50+ fields from any URL (see below)
  • HTML to Image – render custom HTML/CSS to PNG/JPEG/WebP

The Metadata Endpoint

A single GET request extracts:

Basic SEO

  • title, description, keywords, author, language, charset, viewport, robots, canonical URL, generator

Open Graph

  • og:title, og:description, og:image (+ dimensions), og:url, og:type, og:site_name, og:locale

Twitter Card

Icons & Theme

  • favicon, apple-touch-icon, manifest, theme-color, color-scheme

Content Analysis

  • first h1 text, h2 count, internal links count, external links count, images count, images without alt text, forms count, scripts count, stylesheets count, word count

Structured Data

  • JSON‑LD (Schema.org) parsed and returned

Feeds

  • RSS/Atom feeds auto‑detected

Raw dump

  • All meta tags as key‑value pairs
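
As a rough Python counterpart to the field list above, the sketch below calls the metadata endpoint using only the standard library and pulls out a few fields. The `data` envelope and the `title`, `og_image`, and `word_count` keys are taken from the JavaScript example in this post. The helper names `fetch_metadata` and `pick_fields` are illustrative, and any other key name is an assumption to check against a real response.

```python
import json
import urllib.parse
import urllib.request

API_HOST = "screenshot-pdf-api.p.rapidapi.com"


def fetch_metadata(url, api_key, timeout=30):
    """Call GET /v1/metadata for `url` and return the decoded JSON body."""
    # urlencode percent-encodes the target URL so its own "?" and "&" survive.
    query = urllib.parse.urlencode({"url": url})
    request = urllib.request.Request(
        f"https://{API_HOST}/v1/metadata?{query}",
        headers={"X-RapidAPI-Key": api_key, "X-RapidAPI-Host": API_HOST},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def pick_fields(payload, names):
    """Return the requested keys from payload["data"]; missing keys become None."""
    data = payload.get("data") or {}
    return {name: data.get(name) for name in names}


# Usage (needs a real key and network access):
# payload = fetch_metadata("https://github.com", "YOUR_KEY")
# print(pick_fields(payload, ["title", "og_image", "word_count"]))
```

Returning `None` for absent keys keeps callers from crashing on pages that simply do not set a given tag.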

Quick Start (Python)

import requests

headers = {
    "X-RapidAPI-Key": "YOUR_KEY",
    "X-RapidAPI-Host": "screenshot-pdf-api.p.rapidapi.com"
}

# Screenshot a website
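response = requests.get(
    "https://screenshot-pdf-api.p.rapidapi.com/v1/screenshot",
    headers=headers,
    params={"url": "https://github.com", "width": 1280, "format": "png"},
    timeout=30,  # rendering a page can be slow; don't wait forever
)
response.raise_for_status()  # fail loudly instead of saving an error body as a PNG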

with open("screenshot.png", "wb") as f:
    f.write(response.content)

print(f"Saved {len(response.content)} bytes")

Quick Start (JavaScript)

// Screenshot
const response = await fetch(
  "https://screenshot-pdf-api.p.rapidapi.com/v1/screenshot?url=https://github.com&format=png",
  {
    headers: {
      "X-RapidAPI-Key": "YOUR_KEY",
      "X-RapidAPI-Host": "screenshot-pdf-api.p.rapidapi.com"
    }
  }
);
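if (!response.ok) throw new Error(`Screenshot failed: HTTP ${response.status}`);
const blob = await response.blob(); // raw image bytes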

// Metadata
const meta = await fetch(
  "https://screenshot-pdf-api.p.rapidapi.com/v1/metadata?url=https://github.com",
  {
    headers: {
      "X-RapidAPI-Key": "YOUR_KEY",
      "X-RapidAPI-Host": "screenshot-pdf-api.p.rapidapi.com"
    }
  }
);
const data = await meta.json();
console.log(data.data.title);      // "GitHub · Build and ship software..."
console.log(data.data.og_image);   // "https://..."
console.log(data.data.word_count); // 834

cURL

# Screenshot
curl -o screenshot.png \
  -H "X-RapidAPI-Key: YOUR_KEY" \
  -H "X-RapidAPI-Host: screenshot-pdf-api.p.rapidapi.com" \
  "https://screenshot-pdf-api.p.rapidapi.com/v1/screenshot?url=https://github.com"

# Full page capture
curl -o fullpage.png \
  -H "X-RapidAPI-Key: YOUR_KEY" \
  -H "X-RapidAPI-Host: screenshot-pdf-api.p.rapidapi.com" \
  "https://screenshot-pdf-api.p.rapidapi.com/v1/screenshot?url=https://en.wikipedia.org&full_page=true"

Endpoints

Endpoint                 | Description                     | Tier
-------------------------|---------------------------------|------
GET /v1/screenshot       | Screenshot URL to PNG/JPEG/WebP | Free
GET /v1/health           | API status & queue depth        | Free
GET /v1/pdf              | Generate PDF from URL           | Basic
GET /v1/metadata         | Extract 50+ metadata fields     | Basic
POST /v1/screenshot/html | Render HTML/CSS to image        | Pro
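
For client code that wants to avoid calling an endpoint its plan does not cover, the endpoint table can be turned into a small lookup. This is only a sketch: the tier names come from the table, but the ordering Free < Basic < Pro, the idea that higher tiers include lower-tier endpoints, and the `can_call` helper are all assumptions.

```python
# Minimum tier per endpoint, copied from the endpoint table.
ENDPOINT_TIERS = {
    ("GET", "/v1/screenshot"): "Free",
    ("GET", "/v1/health"): "Free",
    ("GET", "/v1/pdf"): "Basic",
    ("GET", "/v1/metadata"): "Basic",
    ("POST", "/v1/screenshot/html"): "Pro",
}

# Assumption: tiers are ordered and each tier includes the ones below it.
TIER_ORDER = ["Free", "Basic", "Pro"]


def can_call(plan, method, path):
    """Return True if `plan` meets the minimum tier listed for the endpoint."""
    required = ENDPOINT_TIERS[(method.upper(), path)]
    return TIER_ORDER.index(plan) >= TIER_ORDER.index(required)
```

For example, `can_call("Free", "GET", "/v1/metadata")` is false under these assumptions, since the table lists metadata extraction as a Basic endpoint.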

Screenshot Parameters

Param     | Default  | Description
----------|----------|----------------------------------------
url       | required | URL to capture
width     | 1280     | Viewport width
height    | 800      | Viewport height
format    | png      | png, jpeg, webp
quality   | 85       | JPEG/WebP quality (1‑100)
full_page | false    | Capture entire scrollable page
delay     | 0        | Wait N seconds before capture (0‑5)
selector  | null     | CSS selector to capture specific element
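
The parameter table can be mirrored by a small client-side builder that applies the documented defaults and ranges before a request is sent. This is a sketch, not part of the API: the function name `screenshot_url` and the argument name `image_format` are illustrative, and how the server itself responds to out-of-range values is not stated in this post.

```python
import urllib.parse

BASE_URL = "https://screenshot-pdf-api.p.rapidapi.com/v1/screenshot"
ALLOWED_FORMATS = {"png", "jpeg", "webp"}


def screenshot_url(url, width=1280, height=800, image_format="png",
                   quality=85, full_page=False, delay=0, selector=None):
    """Validate the documented parameters and return the full request URL."""
    if image_format not in ALLOWED_FORMATS:
        raise ValueError(f"format must be one of {sorted(ALLOWED_FORMATS)}")
    if not 1 <= quality <= 100:
        raise ValueError("quality must be between 1 and 100")
    if not 0 <= delay <= 5:
        raise ValueError("delay must be between 0 and 5 seconds")

    params = {
        "url": url,
        "width": width,
        "height": height,
        "format": image_format,
        "quality": quality,
        # The cURL example passes full_page=true, so booleans are lowercased.
        "full_page": "true" if full_page else "false",
        "delay": delay,
    }
    # Only send a selector when one was given.
    if selector is not None:
        params["selector"] = selector
    # urlencode percent-encodes the target URL and any selector characters.
    return f"{BASE_URL}?{urllib.parse.urlencode(params)}"
```

Checking ranges locally saves a round trip and an API call against the quota when a value is obviously wrong.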

Use Cases

  • Social media previews – generate Open Graph images
  • PDF reports – convert dashboards and pages to PDF
  • Web scraping – screenshot + metadata from one API
  • Thumbnails – generate website thumbnails at scale
  • SEO auditing – check OG tags, missing alt text, structured data
  • Link previews – build rich preview cards
  • Visual regression testing – automated screenshots for QA
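
To make the SEO auditing use case concrete, here is a minimal sketch that turns one metadata record (the object under `data` in the response) into a list of warnings. The `title` and `og_image` keys appear in the JavaScript example; `og_title`, `og_description`, `images_without_alt`, and `json_ld` are guessed names that follow the same snake_case style and should be checked against a real response.

```python
def audit_metadata(data):
    """Return a list of human-readable SEO warnings for one metadata record."""
    warnings = []
    # `title` and `og_image` are shown in the post; the other key names
    # used below are assumptions, not confirmed field names.
    for key in ("title", "og_title", "og_description", "og_image"):
        if not data.get(key):
            warnings.append(f"missing {key}")
    missing_alt = data.get("images_without_alt") or 0
    if missing_alt > 0:
        warnings.append(f"{missing_alt} image(s) without alt text")
    if not data.get("json_ld"):
        warnings.append("no JSON-LD structured data")
    return warnings
```

Running this over a list of URLs gives a quick per-page checklist without parsing any HTML on the client side.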

Pricing vs Competitors

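Feature             | This API   | ScreenshotOne | URLBox
--------------------|------------|---------------|-------
Free tier           | 20/day     | 100 one‑time  | None
Basic plan          | $9/mo      | $17/mo        | $19/mo
Metadata extraction | 50+ fields | No            | No
JSON‑LD parsing     | Yes        | No            | No
Content analysis    | Yes        | No            | No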

Try it

Built with FastAPI + Playwright (headless Chromium). Hosted on a Hetzner VPS.
