How to add browser automation to any MCP server using PageBolt

Published: March 2, 2026 at 07:15 AM EST
6 min read
Source: Dev.to

Introduction

You’re building an AI agent with Claude. Your agent needs to interact with the web — take screenshots, generate PDFs, record demo videos, inspect page structure.

You could:

  • Write Python scripts to call Puppeteer (fragile, maintenance burden)
  • Manage your own headless browser pool (infrastructure overhead)
  • Use PageBolt’s MCP server (tool call, done)

The third option takes about 5 minutes.

PageBolt provides an open MCP server that gives Claude, Cursor, and Windsurf native access to browser‑automation tools. Your agent calls take_screenshot() directly. No Python. No subprocess management. No infrastructure.


What is PageBolt MCP?

MCP (Model Context Protocol) is a standard for AI agents to call external tools. PageBolt implements the MCP spec so AI agents can:

  • Take screenshots of any URL
  • Generate PDFs from HTML or web pages
  • Record browser interactions as narrated videos
  • Inspect page structure (CSS selectors, element text)
  • Run multi‑step browser sequences

All of these are available as direct tool calls from Claude Code, Cursor, and Windsurf. PageBolt MCP translates each call into a hosted API request, so there is no infrastructure to run on your end.
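To make "direct tool call" concrete: under the hood, an MCP client sends the server a JSON-RPC 2.0 message with the method tools/call, naming the tool and its arguments. Here is a minimal sketch of that message for take_screenshot. The argument names (url, device, fullPage) come from the examples later in this article and are assumptions about the tool's input schema, not a documented contract.

```python
import json


def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


# Argument names are assumed from this article's examples.
request = build_tool_call(
    1,
    "take_screenshot",
    {"url": "https://example.com", "device": "iphone_14_pro", "fullPage": True},
)
print(request)
```

You never write this message yourself. Claude, Cursor, or Windsurf generates it when the model decides to use the tool.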

Installation (2 minutes)

Step 1 – Install PageBolt MCP globally

npm install -g pagebolt-mcp

Step 2 – Get your API key

Sign up for a PageBolt account (free tier: 100 requests/month) and copy your API key.

Step 3 – Configure Claude Desktop

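Edit claude_desktop_config.json. On macOS it lives at ~/Library/Application Support/Claude/claude_desktop_config.json; on Windows, at %APPDATA%\Claude\claude_desktop_config.json: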

{
  "mcpServers": {
    "pagebolt": {
      "command": "pagebolt-mcp",
      "env": {
        "PAGEBOLT_API_KEY": "YOUR_API_KEY_HERE"
      }
    }
  }
}

Restart Claude Desktop. Done.
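If the config file already lists other MCP servers, pasting the snippet over it would wipe them out. A small merge script avoids that. This is a sketch, not an official installer: add_mcp_server is a hypothetical helper, and the demo writes to a temporary directory instead of your real config file.

```python
import json
import tempfile
from pathlib import Path


def add_mcp_server(config_path: Path, name: str, command: str, env: dict) -> dict:
    """Add or update one entry under "mcpServers", keeping existing entries."""
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    servers = config.setdefault("mcpServers", {})
    servers[name] = {"command": command, "env": env}
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Demo against a throwaway file that already contains another (made-up) server.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "claude_desktop_config.json"
    demo_path.write_text(
        '{"mcpServers": {"other": {"command": "other-mcp"}}}', encoding="utf-8"
    )
    updated = add_mcp_server(
        demo_path,
        "pagebolt",
        "pagebolt-mcp",
        {"PAGEBOLT_API_KEY": "YOUR_API_KEY_HERE"},
    )
    written = json.loads(demo_path.read_text(encoding="utf-8"))

print(sorted(updated["mcpServers"]))
```

Point config_path at your actual config file once you are happy with the result.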

Step 4 – Configure Cursor (optional)

Edit .cursor/mcp.json in your project root (or the global Cursor settings):

{
  "mcpServers": {
    "pagebolt": {
      "command": "pagebolt-mcp",
      "env": {
        "PAGEBOLT_API_KEY": "YOUR_API_KEY_HERE"
      }
    }
  }
}

Restart Cursor. Your agent can now call PageBolt tools.

Tools available

Once installed, your agent can call the following functions.

take_screenshot(url, options)

Capture a screenshot of any URL.

Agent: "Take a screenshot of example.com on mobile"
Tool: take_screenshot("https://example.com", {
  "device": "iphone_14_pro",
  "fullPage": true
})
Result: base64 PNG image
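If you handle the result outside the chat (in a script or a test harness, say), you need to decode the base64 payload before saving it. A minimal sketch, assuming the payload is a plain base64 string with no data-URL prefix. The sample input is a stand-in built from the PNG signature, not a real screenshot.

```python
import base64

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_png(b64_data: str) -> bytes:
    """Decode a base64 string and check that it starts with the PNG signature."""
    raw = base64.b64decode(b64_data, validate=True)
    if not raw.startswith(PNG_SIGNATURE):
        raise ValueError("decoded data is not a PNG image")
    return raw


# Stand-in for a tool result: the 8-byte PNG signature plus padding bytes.
sample_result = base64.b64encode(PNG_SIGNATURE + b"\x00" * 8).decode("ascii")
png_bytes = decode_png(sample_result)
print(len(png_bytes))
```

Write png_bytes to disk with open(path, "wb") once the signature check passes.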

generate_pdf(url, options)

Generate a PDF from a URL or HTML.

Agent: "Generate a PDF of the checkout page"
Tool: generate_pdf("https://mystore.com/checkout", {
  "format": "A4",
  "margin": "1in"
})
Result: base64 PDF

record_video(steps, narration)

Record a browser interaction as an MP4 with narration.

Agent: "Record a demo of adding a product to cart and checking out"
Tool: record_video([
  {"action": "navigate", "url": "https://mystore.com"},
  {"action": "click", "selector": "button.add-to-cart"},
  {"action": "click", "selector": "a.checkout"}
], {
  "narration": true,
  "audioGuide": "Adding product to cart. Proceeding to checkout."
})
Result: base64 MP4 video

inspect_page(url)

Get a structured map of page elements (buttons, inputs, and links) with their CSS selectors.

Agent: "Inspect the login page and find the submit button"
Tool: inspect_page("https://myapp.com/login")
Result: {
  "forms": [{
    "selector": "form.login-form",
    "inputs": [
      {"selector": "#email", "type": "text", "placeholder": "Email"},
      {"selector": "#password", "type": "password", "placeholder": "Password"}
    ],
    "buttons": [{"selector": "button[type=submit]", "text": "Sign In"}]
  }]
}
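To feed this result into a later step, you pull out the selector you need. The sketch below walks the sample result shown above. The real response shape may differ, and find_button_selector is a hypothetical helper.

```python
import json
from typing import Optional

# Sample result copied from the example above; the real shape may differ.
page_map = json.loads("""
{
  "forms": [{
    "selector": "form.login-form",
    "inputs": [
      {"selector": "#email", "type": "text", "placeholder": "Email"},
      {"selector": "#password", "type": "password", "placeholder": "Password"}
    ],
    "buttons": [{"selector": "button[type=submit]", "text": "Sign In"}]
  }]
}
""")


def find_button_selector(result: dict, label: str) -> Optional[str]:
    """Return the selector of the first form button whose text matches label."""
    wanted = label.strip().lower()
    for form in result.get("forms", []):
        for button in form.get("buttons", []):
            if button.get("text", "").strip().lower() == wanted:
                return button.get("selector")
    return None


submit_selector = find_button_selector(page_map, "sign in")
print(submit_selector)
```

The match ignores case, so "sign in" finds the "Sign In" button.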

run_sequence(steps, options)

Execute a multi‑step browser automation sequence.

Agent: "Run through the checkout flow and tell me if it succeeds"
Tool: run_sequence([
  {"action": "navigate", "url": "https://mystore.com/checkout"},
  {"action": "fill", "selector": "input[name=email]", "value": "test@example.com"},
  {"action": "click", "selector": "button[type=submit]"},
  {"action": "wait_for", "selector": ".order-confirmation", "timeout": 5000}
], {
  "captureScreenshots": true
})
Result: success/failure + screenshot evidence
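A malformed step wastes a request, so it can pay to check the steps before sending them. A minimal sketch: the required fields per action are inferred from this article's examples only, not from a published schema, and validate_steps is a hypothetical helper.

```python
# Required fields per action, inferred from this article's examples only.
REQUIRED_FIELDS = {
    "navigate": {"url"},
    "click": {"selector"},
    "fill": {"selector", "value"},
    "wait_for": {"selector"},
}


def validate_steps(steps: list) -> list:
    """Return a list of problems; an empty list means the steps look well-formed."""
    problems = []
    for index, step in enumerate(steps):
        action = step.get("action")
        if action not in REQUIRED_FIELDS:
            problems.append(f"step {index}: unknown action {action!r}")
            continue
        missing = REQUIRED_FIELDS[action] - set(step)
        if missing:
            problems.append(f"step {index}: {action} is missing {sorted(missing)}")
    return problems


checkout_steps = [
    {"action": "navigate", "url": "https://mystore.com/checkout"},
    {"action": "fill", "selector": "input[name=email]", "value": "test@example.com"},
    {"action": "click", "selector": "button[type=submit]"},
    {"action": "wait_for", "selector": ".order-confirmation", "timeout": 5000},
]
checkout_problems = validate_steps(checkout_steps)
broken_problems = validate_steps(
    [{"action": "fill", "selector": "#email"}, {"action": "hover"}]
)
print(checkout_problems)
print(broken_problems)
```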

Practical examples

Example 1 – Screenshot‑taking agent

User: "Take a screenshot of each competitor's pricing page"
Agent: Calls take_screenshot() for each URL
Result: PNG images for comparison analysis

Your agent can now autonomously capture and analyze screenshots without writing any Puppeteer code.

Example 2 – Demo video generator

User: "Record a narrated demo of our checkout flow"
Agent: Calls record_video() with checkout steps + narration script
Result: MP4 demo video auto‑generated, saved to S3

Instead of manually recording a 5‑minute screencast, your agent does it in ~30 seconds.

Example 3 – PDF report automation

User: "Generate a PDF report of all product pages"
Agent:
  1. Inspect each product page with inspect_page()
  2. Collect relevant data (titles, prices, images)
  3. Build an HTML template
  4. Call generate_pdf() on the template
Result: Consolidated PDF report
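Step 3 is the only part the agent has to build itself. A minimal sketch of such a template, using only the standard library. The product data is made up for illustration, and how you pass the resulting HTML to generate_pdf() depends on the tool's actual input schema.

```python
from html import escape


def build_report_html(title: str, products: list) -> str:
    """Render collected product data as a simple HTML table."""
    rows = "\n".join(
        f"<tr><td>{escape(p['title'])}</td><td>{escape(p['price'])}</td></tr>"
        for p in products
    )
    return (
        "<!DOCTYPE html>\n"
        f'<html><head><meta charset="utf-8"><title>{escape(title)}</title></head>\n'
        f"<body><h1>{escape(title)}</h1>\n"
        "<table><tr><th>Product</th><th>Price</th></tr>\n"
        f"{rows}\n"
        "</table></body></html>"
    )


# Hypothetical data, standing in for what the agent collected in step 2.
products = [
    {"title": "Mug <Large>", "price": "$12.00"},
    {"title": "Tote Bag", "price": "$18.50"},
]
report_html = build_report_html("Product report", products)
print(report_html)
```

Escaping every value matters here: scraped titles can contain characters like < and & that would otherwise break the markup.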

Example 4 – Automated web testing

User: "Test if our checkout flow works on mobile and desktop"
Agent:
  1. Run checkout sequence on iPhone preset
  2. Run checkout sequence on desktop preset
  3. Compare screenshots for visual regression
Result: Pass/fail with evidence (screenshots)

Real‑World Use Case: AI Agent That Demos Products

User: "Record a demo of the new dashboard feature"

Agent:
1. Calls inspect_page() on staging dashboard
2. Gets CSS selectors for "Add Widget" button, filter controls, etc.
3. Calls record_video() with step‑by‑step interactions
   - Navigate to dashboard
   - Click "Add Widget"
   - Select "Sales Chart"
   - Configure chart options
   - Click "Save"
4. Adds narration: "Adding a sales chart widget to your dashboard..."
5. Returns MP4 file

Result: Professional demo video, auto‑generated, ready to ship

No manual screencast. No waiting for a video editor. Done in ~30 seconds.

Cost Comparison: Manual vs. Agent‑Driven

| Task | Manual | AI Agent (PageBolt MCP) |
|---|---|---|
| Screenshot 10 URLs | 2 min | 10 s |
| Record 5‑minute demo | 20 min | 1 min |
| Generate PDF report | 15 min | 2 min |
| Test checkout flow | 10 min | 30 s |

Total time saved per week: ≈ 10 hours
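For a sense of what that weekly figure implies, you can add up the table. The times are the ones listed above. The only assumption is that "one pass" means doing each of the four tasks once.

```python
# (task, manual seconds, agent seconds), taken from the table above.
TASKS = [
    ("Screenshot 10 URLs", 2 * 60, 10),
    ("Record 5-minute demo", 20 * 60, 60),
    ("Generate PDF report", 15 * 60, 2 * 60),
    ("Test checkout flow", 10 * 60, 30),
]

manual_total = sum(manual for _, manual, _ in TASKS)  # 2820 s = 47 min
agent_total = sum(agent for _, _, agent in TASKS)     # 220 s = 3 min 40 s
saved_per_pass = manual_total - agent_total           # 2600 s

# How many passes per week add up to 10 hours saved?
passes_for_ten_hours = 10 * 3600 / saved_per_pass

print(f"Saved per pass: {saved_per_pass / 60:.1f} min")
print(f"Passes per week for 10 hours saved: {passes_for_ten_hours:.1f}")
```

One pass saves about 43 minutes, so 10 hours a week corresponds to roughly 14 passes.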

Getting Started

  1. Install – npm install -g pagebolt-mcp
  2. Get API key – Sign up for a PageBolt account and copy your key
  3. Configure – Add the key to Claude Desktop / Cursor config
  4. Use – Your agent now has browser tools natively

No infrastructure. No setup. Your AI agent is now a power user of the web. 🚀

Limits and Caveats

  • Authentication: Pass cookies/headers if needed
  • JavaScript rendering: Full Chromium, waits for network idle
  • Localhost: Not accessible (we’re a hosted service)
  • Rate limits: Free tier = 100 requests/month; paid tiers scale
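Because localhost is unreachable and every request counts against your quota, a cheap pre-flight check on the URL can save you a wasted call. A minimal sketch with the standard library: looks_publicly_reachable is a hypothetical helper, and it only catches obvious cases. It does not resolve DNS names, so a hostname that points at a private address still passes.

```python
import ipaddress
from urllib.parse import urlparse


def looks_publicly_reachable(url: str) -> bool:
    """Return False for URLs that clearly target this machine or a private network."""
    parsed = urlparse(url)
    host = parsed.hostname
    if parsed.scheme not in ("http", "https") or not host:
        return False
    if host == "localhost" or host.endswith(".localhost"):
        return False
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        return True  # A DNS name: cannot be classified without resolving it.
    return not (address.is_loopback or address.is_private or address.is_link_local)


for candidate in (
    "https://example.com/pricing",
    "http://localhost:3000",
    "http://127.0.0.1:8080/login",
    "http://192.168.1.10/",
):
    print(candidate, looks_publicly_reachable(candidate))
```

For pages that only exist on your machine, you would need a publicly reachable staging URL or a tunnel.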

Next Steps

Your agent can now:

  • ✅ Screenshot any website
  • ✅ Generate PDFs on demand
  • ✅ Record and narrate videos
  • ✅ Inspect page structure
  • ✅ Run complex browser sequences

Use these tools to automate tasks that previously required manual work or fragile Python scripts.

Start free — 100 requests/month, no credit card. Add browser automation to your MCP server in 5 minutes.
