How to add browser automation to any MCP server using PageBolt

Published: March 2, 2026 at 07:15 AM EST

6 min read

Source: Dev.to

Introduction

You’re building an AI agent with Claude. Your agent needs to interact with the web — take screenshots, generate PDFs, record demo videos, inspect page structure.

You could:

1. Write Python scripts to call Puppeteer (fragile, maintenance burden)
2. Manage your own headless browser pool (infrastructure overhead)
3. Use PageBolt’s MCP server (tool call, done)

Option 3 takes 5 minutes.

PageBolt provides an open MCP server that gives Claude, Cursor, and Windsurf native access to browser‑automation tools. Your agent calls take_screenshot() directly. No Python. No subprocess management. No infrastructure.

What is PageBolt MCP?

MCP (Model Context Protocol) is a standard for AI agents to call external tools. PageBolt implements the MCP spec so AI agents can:

Take screenshots of any URL
Generate PDFs from HTML or web pages
Record browser interactions as narrated videos
Inspect page structure (CSS selectors, element text)
Run multi‑step browser sequences

All of these are exposed as direct tool calls in Claude Code, Cursor, and Windsurf. PageBolt MCP translates each call into a hosted API request, so there is no infrastructure to run on your end.
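To make "tool call" concrete: an MCP client sends each invocation as a JSON-RPC 2.0 request with the method `tools/call`. The sketch below builds such a request for the screenshot tool. The argument names are borrowed from the example later in this article, and the flat argument layout is an assumption; the server's published tool schema is the authority on the real shape.

```python
import json

def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)

# Argument names are assumptions taken from this article's examples.
request = build_tool_call(
    1,
    "take_screenshot",
    {"url": "https://example.com", "device": "iphone_14_pro", "fullPage": True},
)
print(request)
```

In practice the MCP client (Claude Desktop, Cursor) builds and sends this message for you; the sketch only shows what travels over the wire.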

Installation (2 minutes)

Step 1 – Install PageBolt MCP globally

npm install -g pagebolt-mcp

Step 2 – Get your API key

Step 3 – Configure Claude Desktop

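Edit claude_desktop_config.json (on macOS: ~/Library/Application Support/Claude/claude_desktop_config.json; on Windows: %APPDATA%\Claude\claude_desktop_config.json):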

{
  "mcpServers": {
    "pagebolt": {
      "command": "pagebolt-mcp",
      "env": {
        "PAGEBOLT_API_KEY": "YOUR_API_KEY_HERE"
      }
    }
  }
}

Restart Claude Desktop. Done.
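If the config file already lists other MCP servers, pasting the snippet over it would drop them. A minimal sketch of merging the PageBolt entry into an existing config instead; the `filesystem` entry is a hypothetical placeholder for whatever you already have registered.

```python
import copy
import json

def add_pagebolt_server(config: dict, api_key: str) -> dict:
    """Return a copy of an MCP client config with a "pagebolt" server entry added."""
    updated = copy.deepcopy(config)
    servers = updated.setdefault("mcpServers", {})
    servers["pagebolt"] = {
        "command": "pagebolt-mcp",
        "env": {"PAGEBOLT_API_KEY": api_key},
    }
    return updated

# Hypothetical existing config with another server already registered.
existing = {"mcpServers": {"filesystem": {"command": "some-other-server"}}}
merged = add_pagebolt_server(existing, "YOUR_API_KEY_HERE")
print(json.dumps(merged, indent=2))
```

The function works on a copy, so the original dictionary is left untouched until you decide to write the merged result back to disk.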

Step 4 – Configure Cursor (optional)

Edit .cursor/mcp.json in your project root (or ~/.cursor/mcp.json to enable it for all projects):

{
  "mcpServers": {
    "pagebolt": {
      "command": "pagebolt-mcp",
      "env": {
        "PAGEBOLT_API_KEY": "YOUR_API_KEY_HERE"
      }
    }
  }
}

Restart Cursor. Your agent can now call PageBolt tools.
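If the tools do not show up after a restart, the usual suspects are a JSON syntax error or a key that is still the placeholder. A small sketch of a sanity check you could run over the file's contents; the specific checks are illustrative assumptions, not an official validator.

```python
import json

def find_config_problems(text: str) -> list[str]:
    """Return human-readable problems found in an MCP config file's contents."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg} (line {exc.lineno})"]
    servers = config.get("mcpServers") if isinstance(config, dict) else None
    if not isinstance(servers, dict) or not isinstance(servers.get("pagebolt"), dict):
        return ['no "pagebolt" entry under "mcpServers"']
    entry = servers["pagebolt"]
    problems = []
    if entry.get("command") != "pagebolt-mcp":
        problems.append('"command" should be "pagebolt-mcp"')
    env = entry.get("env") if isinstance(entry.get("env"), dict) else {}
    key = env.get("PAGEBOLT_API_KEY", "")
    if not key or key == "YOUR_API_KEY_HERE":
        problems.append("PAGEBOLT_API_KEY is missing or still the placeholder")
    return problems

sample = '{"mcpServers": {"pagebolt": {"command": "pagebolt-mcp", "env": {"PAGEBOLT_API_KEY": "YOUR_API_KEY_HERE"}}}}'
print(find_config_problems(sample))
```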

Tools available

Once installed, your agent can call the following functions.

`take_screenshot(url, options)`

Capture a screenshot of any URL.

Agent: "Take a screenshot of example.com on mobile"
Tool: take_screenshot("https://example.com", {
  "device": "iphone_14_pro",
  "fullPage": true
})
Result: base64 PNG image
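If you handle the result outside the agent, it helps to decode it and confirm it really is a PNG before saving. A minimal sketch, assuming the tool returns the image as a plain base64 string (the exact response envelope is not shown in this article):

```python
import base64

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of every PNG file

def decode_png(b64_data: str) -> bytes:
    """Decode a base64 string and confirm the bytes start with the PNG signature."""
    raw = base64.b64decode(b64_data, validate=True)
    if not raw.startswith(PNG_SIGNATURE):
        raise ValueError("decoded data is not a PNG image")
    return raw

# Stand-in payload: the PNG signature plus filler bytes, not a real screenshot.
sample = base64.b64encode(PNG_SIGNATURE + b"\x00" * 16).decode("ascii")
image_bytes = decode_png(sample)
print(len(image_bytes))  # 24
```

From here, writing `image_bytes` to a file opened in binary mode gives you a normal .png on disk.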

`generate_pdf(url, options)`

Generate a PDF from a URL or HTML.

Agent: "Generate a PDF of the checkout page"
Tool: generate_pdf("https://mystore.com/checkout", {
  "format": "A4",
  "margin": "1in"
})
Result: base64 PDF

`record_video(steps, narration)`

Record a browser interaction as an MP4 with narration.

Agent: "Record a demo of adding a product to cart and checking out"
Tool: record_video([
  {"action": "navigate", "url": "https://mystore.com"},
  {"action": "click", "selector": "button.add-to-cart"},
  {"action": "click", "selector": "a.checkout"}
], {
  "narration": true,
  "audioGuide": "Adding product to cart. Proceeding to checkout."
})
Result: base64 MP4 video

`inspect_page(url)`

Get a structured map of page elements—buttons, inputs, links with CSS selectors.

Agent: "Inspect the login page and find the submit button"
Tool: inspect_page("https://myapp.com/login")
Result: {
  "forms": [{
    "selector": "form.login-form",
    "inputs": [
      {"selector": "#email", "type": "text", "placeholder": "Email"},
      {"selector": "#password", "type": "password", "placeholder": "Password"}
    ],
    "buttons": [{"selector": "button[type=submit]", "text": "Sign In"}]
  }]
}
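The agent normally reads this structure itself, but the lookup it performs is simple enough to sketch. The helper below walks a result shaped like the example above and returns the selector of a button by its label; the helper name is hypothetical and the result shape is taken from this article's sample, not from a published schema.

```python
def find_button_selector(page_map: dict, button_text: str):
    """Return the CSS selector of the first form button whose text matches, else None."""
    wanted = button_text.strip().lower()
    for form in page_map.get("forms", []):
        for button in form.get("buttons", []):
            if button.get("text", "").strip().lower() == wanted:
                return button["selector"]
    return None

# Result copied from the inspect_page example above.
page_map = {
    "forms": [{
        "selector": "form.login-form",
        "inputs": [
            {"selector": "#email", "type": "text", "placeholder": "Email"},
            {"selector": "#password", "type": "password", "placeholder": "Password"},
        ],
        "buttons": [{"selector": "button[type=submit]", "text": "Sign In"}],
    }]
}
print(find_button_selector(page_map, "sign in"))  # button[type=submit]
```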

`run_sequence(steps, options)`

Execute a multi‑step browser automation sequence.

Agent: "Run through the checkout flow and tell me if it succeeds"
Tool: run_sequence([
  {"action": "navigate", "url": "https://mystore.com/checkout"},
  {"action": "fill", "selector": "input[name=email]", "value": "test@example.com"},
  {"action": "click", "selector": "button[type=submit]"},
  {"action": "wait_for", "selector": ".order-confirmation", "timeout": 5000}
], {
  "captureScreenshots": true
})
Result: success/failure + screenshot evidence
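Since each request counts against your quota, it can be worth catching malformed steps before sending them. A sketch of a pre-flight check; the required keys per action are inferred from the examples in this article and are an assumption, not the server's actual schema.

```python
# Required keys per action, inferred from this article's examples (an assumption).
REQUIRED_KEYS = {
    "navigate": {"url"},
    "click": {"selector"},
    "fill": {"selector", "value"},
    "wait_for": {"selector"},
}

def validate_steps(steps: list[dict]) -> list[str]:
    """Return a list of problems; an empty list means every step looks well formed."""
    errors = []
    for index, step in enumerate(steps):
        action = step.get("action")
        if action not in REQUIRED_KEYS:
            errors.append(f"step {index}: unknown action {action!r}")
            continue
        missing = REQUIRED_KEYS[action] - set(step)
        if missing:
            errors.append(f"step {index}: {action} is missing {sorted(missing)}")
    return errors

checkout_steps = [
    {"action": "navigate", "url": "https://mystore.com/checkout"},
    {"action": "fill", "selector": "input[name=email]", "value": "test@example.com"},
    {"action": "click", "selector": "button[type=submit]"},
    {"action": "wait_for", "selector": ".order-confirmation", "timeout": 5000},
]
print(validate_steps(checkout_steps))  # []
```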

Practical examples

Example 1 – Screenshot‑taking agent

User: "Take a screenshot of each competitor's pricing page"
Agent: Calls take_screenshot() for each URL
Result: PNG images for comparison analysis

Your agent can now autonomously capture and analyze screenshots without writing any Puppeteer code.

Example 2 – Demo video generator

User: "Record a narrated demo of our checkout flow"
Agent: Calls record_video() with checkout steps + narration script
Result: MP4 demo video auto‑generated, saved to S3

Instead of manually recording a 5‑minute screencast, your agent produces it in about a minute.

Example 3 – PDF report automation

User: "Generate a PDF report of all product pages"
Agent:
  1. Inspect each product page with inspect_page()
  2. Collect relevant data (titles, prices, images)
  3. Build an HTML template
  4. Call generate_pdf() on the template
Result: Consolidated PDF report
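Step 3 is the only part that does not involve a PageBolt tool. A minimal sketch of it, assuming the agent has collected a title and price per product (the field names and sample data are hypothetical); values are HTML-escaped so that scraped text cannot break the markup.

```python
from html import escape

def build_report_html(products: list[dict]) -> str:
    """Render collected product data as a simple HTML table."""
    rows = "\n".join(
        f"<tr><td>{escape(p['title'])}</td><td>{escape(p['price'])}</td></tr>"
        for p in products
    )
    return (
        "<!DOCTYPE html>\n<html><body>\n<h1>Product report</h1>\n"
        "<table>\n<tr><th>Title</th><th>Price</th></tr>\n"
        f"{rows}\n</table>\n</body></html>"
    )

# Hypothetical data standing in for what the agent collected in step 2.
products = [
    {"title": "Mug & Saucer", "price": "$12.00"},
    {"title": "Tea <Pot>", "price": "$30.00"},
]
html_doc = build_report_html(products)
print(html_doc)
```

The resulting string is what would be handed to generate_pdf() in step 4.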

Example 4 – Automated Web Testing

User: "Test if our checkout flow works on mobile and desktop"
Agent:
  1. Run checkout sequence on iPhone preset
  2. Run checkout sequence on desktop preset
  3. Compare screenshots for visual regression
Result: Pass/fail with evidence (screenshots)
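The comparison step can be as strict or as tolerant as you like. The sketch below is the strictest possible variant: it checks each preset's screenshot against a stored SHA-256 baseline and passes only on a byte-identical match. This is a much cruder technique than the pixel-tolerant diffing that dedicated visual regression tools use; the preset names and byte strings are stand-ins.

```python
import hashlib

def matches_baseline(screenshot: bytes, baseline_digest: str) -> bool:
    """True only if the screenshot is byte-identical to the recorded baseline."""
    return hashlib.sha256(screenshot).hexdigest() == baseline_digest

def check_presets(screenshots: dict[str, bytes], baselines: dict[str, str]) -> dict[str, bool]:
    """Compare each preset's screenshot with the baseline stored for that preset."""
    return {
        preset: matches_baseline(data, baselines.get(preset, ""))
        for preset, data in screenshots.items()
    }

# Stand-in bytes; real input would be the decoded screenshots.
baselines = {
    "iphone": hashlib.sha256(b"mobile-render-v1").hexdigest(),
    "desktop": hashlib.sha256(b"desktop-render-v1").hexdigest(),
}
current = {"iphone": b"mobile-render-v1", "desktop": b"desktop-render-v2"}
results = check_presets(current, baselines)
print(results)  # {'iphone': True, 'desktop': False}
```

Pages with timestamps, ads, or animations will fail an exact match on every run, so treat this as a starting point only.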

Real‑World Use Case: AI Agent That Demos Products

User: "Record a demo of the new dashboard feature"

Agent:
1. Calls inspect_page() on staging dashboard
2. Gets CSS selectors for "Add Widget" button, filter controls, etc.
3. Calls record_video() with step‑by‑step interactions
   - Navigate to dashboard
   - Click "Add Widget"
   - Select "Sales Chart"
   - Configure chart options
   - Click "Save"
4. Adds narration: "Adding a sales chart widget to your dashboard..."
5. Returns MP4 file

Result: Professional demo video, auto‑generated, ready to ship

No manual screencast. No waiting for a video editor. Done in ~30 seconds.

Cost Comparison: Manual vs. Agent‑Driven

| Task | Manual | AI Agent (PageBolt MCP) |
| --- | --- | --- |
| Screenshot 10 URLs | 2 min | 10 s |
| Record 5‑minute demo | 20 min | 1 min |
| Generate PDF report | 15 min | 2 min |
| Test checkout flow | 10 min | 30 s |

Total time saved per week: ≈ 10 hours
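The weekly figure depends on how often these tasks run, which the table does not state. A quick worked calculation from the table's own numbers shows what one pass through all four tasks saves and how many passes per week the 10-hour estimate implies:

```python
# (task, manual seconds, agent seconds), transcribed from the table above.
TASKS = [
    ("Screenshot 10 URLs", 2 * 60, 10),
    ("Record 5-minute demo", 20 * 60, 60),
    ("Generate PDF report", 15 * 60, 2 * 60),
    ("Test checkout flow", 10 * 60, 30),
]

manual_total = sum(manual for _, manual, _ in TASKS)  # 2820 s = 47 min
agent_total = sum(agent for _, _, agent in TASKS)     # 220 s = 3 min 40 s
saved_per_pass = manual_total - agent_total           # 2600 s

# How many passes per week would add up to the quoted 10 hours?
passes_for_ten_hours = (10 * 3600) / saved_per_pass

print(f"saved per pass: {saved_per_pass / 60:.1f} min")           # 43.3 min
print(f"passes needed for 10 h: {passes_for_ten_hours:.1f}")      # 13.8
```

So the estimate corresponds to running each of the four tasks roughly 14 times a week, or about twice per day.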

Getting Started

Install
```
npm install -g pagebolt-mcp
```
Get API key – Sign up at
Configure – Add the key to Claude Desktop / Cursor config
Use – Your agent now has browser tools natively

No infrastructure. No setup. Your AI agent is now a power user of the web. 🚀

Limits and Caveats

Authentication: Pass cookies/headers if needed
JavaScript rendering: Full Chromium, waits for network idle
Localhost: Not accessible (we’re a hosted service)
Rate limits: Free tier = 100 requests/month; paid tiers scale
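Because localhost is out of reach and the free tier is capped, it can be worth filtering out URLs a hosted service cannot see before spending a request on them. A minimal sketch using only the standard library; it checks the hostname as written and does not resolve DNS, so a public-looking name that points at a private address would still pass.

```python
import ipaddress
from urllib.parse import urlparse

def is_publicly_addressable(url: str) -> bool:
    """Best-effort check that a URL does not point at localhost or a private IP."""
    host = urlparse(url).hostname
    if host is None:
        return False  # no scheme or no host, e.g. "example.com/page"
    if host == "localhost" or host.endswith(".localhost"):
        return False
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        return True  # a DNS name; assumed public without resolving it
    return not (address.is_private or address.is_loopback or address.is_link_local)

for candidate in ("https://example.com", "http://localhost:3000", "http://192.168.1.5/admin"):
    print(candidate, is_publicly_addressable(candidate))
```

For pages that only exist on your machine, you would need to expose them through a tunnel or a staging deployment first.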

Next Steps

Your agent can now:

✅ Screenshot any website
✅ Generate PDFs on demand
✅ Record and narrate videos
✅ Inspect page structure
✅ Run complex browser sequences

Use these tools to automate tasks that previously required manual work or fragile Python scripts.

Start free — 100 requests/month, no credit card. Add browser automation to your MCP server in 5 minutes.
