Real-Time Proxy Monitoring: Building Dashboards with Python and Grafana

Published: March 9, 2026, 00:18 GMT+8
3 min read
Source: Dev.to

Key Metrics to Track

  • Success rate – percentage of requests that return HTTP 200
  • Response time – average and P95 latency per proxy
  • Bandwidth usage – data consumed per proxy and overall
  • Error distribution – error types (timeouts, 403, 429, CAPTCHA)
  • IP uniqueness – number of unique IPs actually used
  • Pool health – ratio of active to dead proxies
  • Rotation frequency – how often IPs are swapped
  • Geographic distribution – where exit IPs are located
  • Cost per successful request – actual cost accounting
  • Blacklist rate – how many IPs are currently blocked
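As an illustration, the last few metrics fall out of raw request counts. A minimal sketch, assuming hypothetical field names for per-proxy stats (these names are not from the original):

```python
def pool_summary(stats, monthly_cost):
    """Derive overall success rate and cost-per-success from raw counts.

    stats: list of dicts with hypothetical fields
           {"proxy": str, "requests": int, "successes": int}
    monthly_cost: total pool spend over the same window.
    """
    total = sum(s["requests"] for s in stats)
    ok = sum(s["successes"] for s in stats)
    return {
        "success_rate": ok / total if total else 0.0,
        "cost_per_success": monthly_cost / ok if ok else float("inf"),
    }


summary = pool_summary(
    [{"proxy": "p1", "requests": 800, "successes": 760},
     {"proxy": "p2", "requests": 200, "successes": 140}],
    monthly_cost=50.0,
)
# 900 successes out of 1000 requests -> success_rate 0.9
```

The same two numbers are what the Grafana panels below compute continuously, just over a sliding time window instead of a monthly total.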

Architecture Overview

Your Application
    |
    v
Proxy Middleware (collects metrics)
    |
    v
Prometheus (stores time‑series data)
    |
    v
Grafana (visualizes dashboards)
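One way to stand up the bottom two boxes locally is Docker Compose. A minimal sketch, with assumed image tags and default ports (not from the original); note that a containerized Prometheus cannot reach `localhost:8000` on the host without adjusting the scrape target:

```yaml
version: "3"
services:
  prometheus:
    image: prom/prometheus
    volumes:
      # Mount the prometheus.yml shown later in this article
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    ports:
      - "9090:9090"
  grafana:
    image: grafana/grafana
    ports:
      - "3000:3000"
```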

Proxy Wrapper with Metrics

import time
import requests
from prometheus_client import Counter, Histogram, Gauge, start_http_server

# Define metrics
REQUEST_COUNT = Counter(
    "proxy_requests_total",
    "Total proxy requests",
    ["proxy", "status", "target_domain"]
)

RESPONSE_TIME = Histogram(
    "proxy_response_seconds",
    "Response time in seconds",
    ["proxy"],
    buckets=[0.1, 0.5, 1, 2, 5, 10, 30]
)

ACTIVE_PROXIES = Gauge(
    "proxy_pool_active",
    "Number of active proxies in pool"
)

BANDWIDTH = Counter(
    "proxy_bandwidth_bytes",
    "Bandwidth consumed in bytes",
    ["proxy", "direction"]
)

# Expose metrics for Prometheus to scrape (port matches prometheus.yml below)
start_http_server(8000)

class MonitoredProxy:
    def __init__(self, proxy_url):
        self.proxy_url = proxy_url
        self.proxy_dict = {"http": proxy_url, "https": proxy_url}

    def request(self, url, **kwargs):
        start = time.time()
        domain = url.split("/")[2]  # netloc; assumes an absolute http(s) URL

        # Pop timeout so it is not passed twice through **kwargs
        timeout = kwargs.pop("timeout", 15)

        try:
            response = requests.get(
                url,
                proxies=self.proxy_dict,
                timeout=timeout,
                **kwargs
            )
            duration = time.time() - start

            # Record metrics
            REQUEST_COUNT.labels(
                proxy=self.proxy_url,
                status=str(response.status_code),
                target_domain=domain
            ).inc()

            RESPONSE_TIME.labels(proxy=self.proxy_url).observe(duration)

            BANDWIDTH.labels(
                proxy=self.proxy_url, direction="response"
            ).inc(len(response.content))

            return response

        except Exception:
            duration = time.time() - start
            RESPONSE_TIME.labels(proxy=self.proxy_url).observe(duration)
            REQUEST_COUNT.labels(
                proxy=self.proxy_url,
                status="error",
                target_domain=domain
            ).inc()
            raise

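The ACTIVE_PROXIES gauge defined above is never touched by the wrapper itself; pool-management code would normally keep it current. A minimal sketch of a rotating pool that does so — the class name, round-robin strategy, and failure threshold are all assumptions, not from the original:

```python
try:
    from prometheus_client import Gauge
    POOL_ACTIVE = Gauge("proxy_pool_active_demo", "Active proxies in the demo pool")
except ImportError:
    # Stand-in so the sketch also runs without prometheus_client installed
    class POOL_ACTIVE:
        @staticmethod
        def set(_):
            pass


class RotatingPool:
    """Round-robin over healthy proxies; evict after repeated failures."""

    def __init__(self, proxy_urls, max_failures=3):
        self.failures = {url: 0 for url in proxy_urls}
        self.max_failures = max_failures
        self._i = 0
        POOL_ACTIVE.set(len(self.active()))

    def active(self):
        return [u for u, n in self.failures.items() if n < self.max_failures]

    def next_proxy(self):
        healthy = self.active()
        if not healthy:
            raise RuntimeError("proxy pool exhausted")
        url = healthy[self._i % len(healthy)]
        self._i += 1
        return url

    def mark_failure(self, url):
        self.failures[url] += 1
        POOL_ACTIVE.set(len(self.active()))
```

Each `next_proxy()` result would be handed to a `MonitoredProxy`, and `mark_failure()` called whenever its `request()` raises, so the gauge tracks the pool-health metric from the list above.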
Prometheus Configuration (prometheus.yml)

scrape_configs:
  - job_name: "proxy_monitor"
    scrape_interval: 15s
    static_configs:
      - targets: ["localhost:8000"]

Core Dashboard Panels (Grafana)

  • Success rate

    rate(proxy_requests_total{status="200"}[5m]) /
    rate(proxy_requests_total[5m]) * 100
  • Average response time

    rate(proxy_response_seconds_sum[5m]) /
    rate(proxy_response_seconds_count[5m])
  • Error distribution

    sum by (status) (rate(proxy_requests_total{status!="200"}[5m]))
  • Bandwidth per hour

    sum(rate(proxy_bandwidth_bytes_total[1h])) * 3600
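Before wiring an expression into a panel, you can sanity-check it against Prometheus's instant-query HTTP API. A small sketch; the server address is an assumption (default local Prometheus):

```python
import json
import urllib.parse
import urllib.request

PROM_URL = "http://localhost:9090"  # assumed local Prometheus

SUCCESS_RATE = (
    'rate(proxy_requests_total{status="200"}[5m]) / '
    "rate(proxy_requests_total[5m]) * 100"
)


def build_query_url(base, expr):
    """Build an instant-query URL for the /api/v1/query endpoint."""
    return base + "/api/v1/query?" + urllib.parse.urlencode({"query": expr})


def run_query(base, expr):
    """Execute the query and return the result vector."""
    with urllib.request.urlopen(build_query_url(base, expr)) as resp:
        return json.load(resp)["data"]["result"]
```

If `run_query(PROM_URL, SUCCESS_RATE)` returns an empty list, the expression matched no series — usually a label or metric-name typo — before you ever open Grafana.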

Alert Rules (alert_rules.yml)

groups:
  - name: proxy_alerts
    rules:
      - alert: LowSuccessRate
        expr: |
          rate(proxy_requests_total{status="200"}[5m]) /
          rate(proxy_requests_total[5m]) < 0.8
        for: 5m
        annotations:
          summary: Proxy success rate below 80%
      - alert: HighLatency
        expr: |
          rate(proxy_response_seconds_sum[5m]) /
          rate(proxy_response_seconds_count[5m]) > 5
        for: 5m
        annotations:
          summary: Average proxy latency above 5 seconds

Lightweight CSV Logger (Alternative)

If Prometheus and Grafana feel like overkill, you can log to CSV instead:

import csv
from datetime import datetime

def log_request(proxy, url, status, latency, bytes_received):
    with open("proxy_log.csv", "a", newline="") as f:
        writer = csv.writer(f)
        writer.writerow([
            datetime.now().isoformat(),
            proxy,
            url,
            status,
            round(latency, 3),
            bytes_received
        ])

Then analyze the CSV with pandas (or any data-analysis tool) to spot trends and problem proxies.
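For a first pass, even the standard library is enough. A sketch that aggregates the log written by `log_request` above, assuming the same column order:

```python
import csv
import statistics
from collections import defaultdict


def summarize(path="proxy_log.csv"):
    """Per-proxy success rate and median latency from the CSV log.

    Columns (matching log_request above):
    timestamp, proxy, url, status, latency, bytes_received
    """
    latencies = defaultdict(list)
    counts = defaultdict(lambda: [0, 0])  # proxy -> [successes, total]
    with open(path, newline="") as f:
        for ts, proxy, url, status, latency, nbytes in csv.reader(f):
            counts[proxy][1] += 1
            if status == "200":
                counts[proxy][0] += 1
            latencies[proxy].append(float(latency))
    return {
        proxy: {
            "success_rate": ok / total,
            "median_latency": statistics.median(latencies[proxy]),
        }
        for proxy, (ok, total) in counts.items()
    }
```

Sorting the result by `success_rate` ascending surfaces the proxies worth evicting first.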

Further Reading

For more proxy monitoring setups and infrastructure guides, visit DataResearchTools
