Proxy Bandwidth Optimization: Cut Costs Without Sacrificing Performance
Source: Dev.to
Residential and mobile proxy bandwidth is expensive, typically $5‑50 per GB, so every wasted byte is wasted money. A typical web page weighs 2‑5 MB; if you only need a price or a title, roughly 99 % of that transfer goes to images, CSS, JavaScript, and ads. Heavy resources, poor caching, and failed requests add up quickly.
Techniques to Reduce Proxy Bandwidth
Block Images and Media in Headless Browsers
# playwright_sync_example.py
from playwright.sync_api import sync_playwright


def create_optimized_page(browser):
    page = browser.new_page()
    # Abort requests for images, media, fonts, and stylesheets
    page.route("**/*.{png,jpg,jpeg,gif,svg,webp}", lambda route: route.abort())
    page.route("**/*.{mp4,webm,mp3}", lambda route: route.abort())
    page.route("**/*.{woff,woff2,ttf,eot}", lambda route: route.abort())
    page.route("**/*.css", lambda route: route.abort())
    # Abort common analytics, tracking, and ad URLs
    page.route("**/analytics*", lambda route: route.abort())
    page.route("**/tracking*", lambda route: route.abort())
    page.route("**/ads*", lambda route: route.abort())
    return page
Blocking these resources can cut bandwidth by 60‑80 %.
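URL globs miss images and fonts served from extensionless URLs. One alternative is to filter on the resource type Playwright reports for each request. The following is a minimal sketch, assuming a single catch-all route; `BLOCKED_TYPES`, `BLOCKED_URL_HINTS`, `should_block`, and `handle_route` are illustrative names, not part of Playwright, and the hint list is an assumption you would tune per site.

```python
# Resource types (as reported by Playwright's request.resource_type) that are
# rarely needed when extracting text data. Adjust for your target site.
BLOCKED_TYPES = {"image", "media", "font", "stylesheet"}

# Substrings that suggest analytics or ad traffic (illustrative, not exhaustive).
BLOCKED_URL_HINTS = ("analytics", "tracking", "/ads")


def should_block(resource_type: str, url: str) -> bool:
    """Return True when a request is unlikely to be needed for data extraction."""
    if resource_type in BLOCKED_TYPES:
        return True
    lowered = url.lower()
    return any(hint in lowered for hint in BLOCKED_URL_HINTS)


def handle_route(route) -> None:
    """Route handler: abort unwanted requests, let everything else through."""
    request = route.request
    if should_block(request.resource_type, request.url):
        route.abort()
    else:
        route.continue_()
```

Register it once with `page.route("**/*", handle_route)`. Keeping the decision in a plain function makes the block list easy to test without launching a browser.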
Prefer Direct HTTP Requests Over Headless Browsers
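import requests

# Headless browser: downloads 3‑5 MB per page
# Direct HTTP: downloads 50‑200 KB per page
# `url` and `proxy` (a requests-style proxies dict) are assumed to be defined.
response = requests.get(
    url,
    proxies=proxy,
    # Add "br" only if the brotli package is installed; requests cannot decode it otherwise.
    headers={"Accept-Encoding": "gzip, deflate"},
    timeout=15,
)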
Use a headless browser only when JavaScript rendering is required.
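One way to apply this rule is to try the cheap HTTP path first and fall back to a browser only when the data is missing from the raw HTML. The sketch below assumes you supply two fetch callables and a marker string that identifies the data you need; `fetch_html`, `http_fetch`, `browser_fetch`, and `required_marker` are hypothetical names used for illustration.

```python
from typing import Callable, Tuple


def fetch_html(
    url: str,
    http_fetch: Callable[[str], str],
    browser_fetch: Callable[[str], str],
    required_marker: str,
) -> Tuple[str, str]:
    """Try the cheap HTTP path first; render in a browser only if needed.

    Returns (html, method), where method is "http" or "browser".
    """
    html = http_fetch(url)
    # If the data we need is already in the raw HTML, skip the browser.
    if required_marker in html:
        return html, "http"
    # Marker missing: the content is probably injected by JavaScript.
    return browser_fetch(url), "browser"
```

A fallback pays for both transfers, so it is worth recording which sites needed the browser and sending those straight to it on later runs.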
Enable Compression
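headers = {
    # Ask the server for a compressed response; it may still reply uncompressed.
    # `requests` transparently decompresses gzip and deflate. Add "br" only if
    # the brotli package is installed, since Brotli decoding depends on it.
    "Accept-Encoding": "gzip, deflate",
}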
Compression typically reduces the HTML payload by 70‑80 %.
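To get a feel for the effect, you can gzip a payload locally and compare sizes. This is a rough sketch using only the standard library; `compression_savings` is an illustrative helper, and the sample markup is synthetic and far more repetitive than a real page, so treat its ratio as an upper bound rather than a typical result.

```python
import gzip


def compression_savings(payload: bytes) -> float:
    """Return the fraction of bytes saved by gzip-compressing payload."""
    if not payload:
        return 0.0
    compressed = gzip.compress(payload)
    return 1 - len(compressed) / len(payload)


# Synthetic, highly repetitive markup; real pages usually compress less well.
sample_html = (
    "<div class='product'><span class='price'>19.99</span></div>\n" * 500
).encode()

savings = compression_savings(sample_html)
print(f"gzip saved {savings:.0%} of {len(sample_html)} bytes")
```

Running the same function on a saved copy of a real target page gives a more honest estimate for your workload.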
Use Structured APIs Instead of HTML Scraping
# Scraping HTML: ~200 KB per product
html_resp = requests.get("https://site.com/product/123")
# Using API: ~2 KB per product (100× smaller)
api_resp = requests.get("https://api.site.com/products/123")
APIs return compact JSON, dramatically lowering bandwidth.
Implement Local Caching
import hashlib
import time


class ProxyCache:
    def __init__(self, cache_ttl=3600):
        self.cache = {}
        self.ttl = cache_ttl

    def get(self, url):
        key = hashlib.md5(url.encode()).hexdigest()
        entry = self.cache.get(key)
        if entry and time.time() - entry["timestamp"] < self.ttl:
            return entry["data"]  # Cache hit: zero bandwidth
        return None

    def set(self, url, data):
        key = hashlib.md5(url.encode()).hexdigest()
        self.cache[key] = {"data": data, "timestamp": time.time()}
Cache hits eliminate network traffic entirely.
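A cache only saves bandwidth if every request checks it first. The wrapper below is a small sketch of that pattern; `fetch_with_cache` and the `fetch` callable are hypothetical, and `cache` can be any object with `get(url)` and `set(url, data)` methods, such as the `ProxyCache` class above.

```python
from typing import Callable


def fetch_with_cache(url: str, cache, fetch: Callable[[str], str]) -> str:
    """Return the cached body for url, fetching through the proxy only on a miss.

    `fetch` is assumed to perform the real proxied request and return the body.
    """
    cached = cache.get(url)
    if cached is not None:
        return cached      # hit: no proxy traffic
    body = fetch(url)      # miss: pay for the transfer once
    cache.set(url, body)
    return body
```

Because the fetch function is passed in, the same wrapper works for direct HTTP requests and for headless-browser fetches.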
Use Conditional Requests
# First request
resp = requests.get(url, proxies=proxy)
etag = resp.headers.get("ETag")
last_modified = resp.headers.get("Last-Modified")

# Subsequent requests
headers = {}
if etag:
    headers["If-None-Match"] = etag
if last_modified:
    headers["If-Modified-Since"] = last_modified

resp = requests.get(url, proxies=proxy, headers=headers)
if resp.status_code == 304:
    # Content unchanged – minimal bandwidth used
    pass
Conditional GETs avoid downloading unchanged content, saving 95 %+ of the payload.
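A 304 response carries no body, so the saving only works if you keep the earlier copy and reuse it. The helpers below sketch that bookkeeping under the assumption that you store the previous response's headers and body yourself; `conditional_headers` and `resolve_body` are illustrative names, not part of `requests`.

```python
from typing import Dict, Mapping


def conditional_headers(previous_headers: Mapping[str, str]) -> Dict[str, str]:
    """Build If-None-Match / If-Modified-Since from an earlier response's headers.

    Pass resp.headers from requests (case-insensitive) or a plain dict that
    uses the exact keys "ETag" and "Last-Modified".
    """
    headers: Dict[str, str] = {}
    etag = previous_headers.get("ETag")
    last_modified = previous_headers.get("Last-Modified")
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def resolve_body(status_code: int, new_body: str, cached_body: str) -> str:
    """On 304 Not Modified the response is empty, so reuse the stored copy."""
    return cached_body if status_code == 304 else new_body
```

Servers that send neither validator produce an empty header dict, in which case the request is simply an ordinary GET.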
Smart Retry Logic
import requests


def smart_retry(url, proxy_manager, max_retries=3):
    for attempt in range(max_retries):
        proxy = proxy_manager.get_fresh_proxy()  # Different proxy each time
        try:
            response = requests.get(url, proxies=proxy, timeout=10)
            if response.status_code == 200:
                return response
            if response.status_code in (403, 429):
                proxy_manager.mark_failed(proxy)
                continue  # Try a different proxy
        except requests.Timeout:
            proxy_manager.mark_slow(proxy)
            continue
        except requests.ConnectionError:
            # Covers proxy connection failures (ProxyError is a subclass)
            proxy_manager.mark_failed(proxy)
            continue
    return None
Avoid immediate retries on the same proxy to reduce repeated bandwidth waste.
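The retry function above relies on a `proxy_manager` object that the article does not define. The class below is a hypothetical stand-in showing one way its three methods could behave: round-robin rotation that skips proxies marked as failed. `SimpleProxyManager`, the three-timeout threshold, and the placeholder proxy URLs are all assumptions for illustration.

```python
from typing import Dict, List


class SimpleProxyManager:
    """Round-robin proxy pool that skips proxies marked as failed."""

    def __init__(self, proxy_urls: List[str]):
        if not proxy_urls:
            raise ValueError("at least one proxy URL is required")
        self._urls = list(proxy_urls)
        self._failed = set()
        self._slow_counts: Dict[str, int] = {}
        self._index = 0

    def get_fresh_proxy(self) -> Dict[str, str]:
        """Return the next healthy proxy in the dict format requests expects."""
        for _ in range(len(self._urls)):
            url = self._urls[self._index % len(self._urls)]
            self._index += 1
            if url not in self._failed:
                return {"http": url, "https": url}
        raise RuntimeError("no healthy proxies left")

    def mark_failed(self, proxy: Dict[str, str]) -> None:
        """Take a blocked proxy (for example after 403/429) out of rotation."""
        self._failed.add(proxy["http"])

    def mark_slow(self, proxy: Dict[str, str]) -> None:
        """Count timeouts; retire a proxy after three of them (arbitrary threshold)."""
        url = proxy["http"]
        self._slow_counts[url] = self._slow_counts.get(url, 0) + 1
        if self._slow_counts[url] >= 3:
            self._failed.add(url)


# Placeholder addresses; substitute your provider's endpoints.
manager = SimpleProxyManager(["http://proxy1.example:8000", "http://proxy2.example:8000"])
```

Note that this sketch raises `RuntimeError` once every proxy has been retired, so a caller would need to handle that case or replenish the pool.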
Bandwidth Reduction Summary
| Technique | Bandwidth Reduction |
|---|---|
| Block images/media | 60‑80 % |
| HTTP vs. headless browser | 90‑95 % |
| Enable compression | 70‑80 % |
| Use APIs vs. scraping | 95‑99 % |
| Caching | 100 % on cache hits |
| Conditional requests | 95 %+ on unchanged content |
Cost Comparison
| Method | Per‑Page Size | Bandwidth per 10,000 Pages | Cost at $10/GB |
|---|---|---|---|
| Headless, no optimization | 3 MB | 30 GB | $300 |
| Headless, blocked resources | 500 KB | 5 GB | $50 |
| Direct HTTP, compressed | 50 KB | 500 MB | $5 |
| API requests | 2 KB | 20 MB | $0.20 |
Optimization can reduce your proxy costs by up to 99 %.
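The figures in the cost table follow from simple arithmetic: page size times page count gives bandwidth, and bandwidth times the per-GB price gives cost. The sketch below reproduces them, assuming decimal units (1 GB = 1,000,000 KB) and the 10,000-page workload implied by the table (3 MB × 10,000 = 30 GB); `bandwidth_cost` is an illustrative helper.

```python
def bandwidth_cost(page_size_kb: float, pages: int, price_per_gb: float) -> float:
    """Dollar cost of fetching `pages` pages, using decimal units (1 GB = 1,000,000 KB)."""
    gigabytes = page_size_kb * pages / 1_000_000
    return gigabytes * price_per_gb


PAGES = 10_000        # workload implied by the table's bandwidth column
PRICE_PER_GB = 10.0   # dollars, the rate used in the table

for label, size_kb in [
    ("Headless, no optimization", 3000),
    ("Headless, blocked resources", 500),
    ("Direct HTTP, compressed", 50),
    ("API requests", 2),
]:
    print(f"{label}: ${bandwidth_cost(size_kb, PAGES, PRICE_PER_GB):.2f}")
```

Swapping in your own page sizes, page counts, and provider rate gives a quick estimate before committing to a scraping approach.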