Building a Transparent AI Pipeline: 59 Weeks of Automated Political Scoring with Claude API

Published: March 18, 2026
5 min read
Source: Dev.to


![Cover image for Building a Transparent AI Pipeline: 59 Weeks of Automated Political Scoring with Claude API](https://media2.dev.to/dynamic/image/width=1000,height=420,fit=cover,gravity=auto,format=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqi3e292ud2evff6smts3.png)


I've been running an automated AI pipeline for over a year that ingests news articles, clusters them into political events, and scores each event on two independent axes. Here's how it works, what I learned, and why I made everything transparent.

---

## The Problem

Political events have two dimensions that are rarely measured together:

- **How much institutional damage does this cause?** (democratic health)  
- **How much media attention does it get?** (distraction economics)

When these are wildly mismatched — high damage, low attention — something important is being missed. I built [The Distraction Index](https://distractionindex.org/) to detect these gaps automatically.

---

## Architecture Overview

```text
News Sources (GDELT + GNews + Google News RSS)
    ↓ every 4 hours
Ingestion Pipeline (/api/ingest)
    ↓ dedup + store
Clustering (Claude Haiku) → group articles into events
    ↓
Dual-Axis Scoring (Claude Sonnet) → Score A + Score B
    ↓
Weekly Freeze → immutable snapshot
```

Tech stack: Next.js 16 (App Router), Supabase (PostgreSQL), Claude API, Vercel
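The "dedup + store" step in the diagram is the part that keeps three overlapping news sources from triple-counting the same story. A minimal sketch in TypeScript — the `Article` shape and `normalizeUrl` rules are my illustration, not the project's actual code:

```typescript
// Sketch of the dedup step: drop articles whose canonical URL has been seen.
// The Article shape and normalization rules are illustrative assumptions.
interface Article {
  url: string;
  title: string;
  publishedAt: string;
}

// Strip query strings, fragments, and trailing slashes so syndicated
// copies of the same story collide on one key.
function normalizeUrl(raw: string): string {
  const u = new URL(raw);
  return (u.origin + u.pathname).replace(/\/$/, "").toLowerCase();
}

// Return only articles not already in `seen`, updating `seen` as we go.
function dedupe(batch: Article[], seen: Set<string>): Article[] {
  const fresh: Article[] = [];
  for (const a of batch) {
    const key = normalizeUrl(a.url);
    if (!seen.has(key)) {
      seen.add(key);
      fresh.push(a);
    }
  }
  return fresh;
}
```

In production the `seen` set would be a unique index in Supabase rather than in-memory state, but the collision rule is the interesting part.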


## Why Two Models?

Cost optimization was critical. Running everything through Sonnet would cost ~$300/month. Instead:

| Model | Role | Cost (per 1M tokens) |
| --- | --- | --- |
| Claude Haiku | Article clustering | $0.25 |
| Claude Sonnet | Scoring (institutional impact) | $3.00 |

Result: ~$30/month for a production pipeline processing articles every 4 hours.
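A quick back-of-envelope model makes the split concrete. Prices come from the table above; the per-run token volumes are illustrative assumptions, not the project's measured usage:

```typescript
// Cost model for the Haiku/Sonnet split. Prices per 1M tokens are from
// the table above; token volumes per run are illustrative assumptions.
const PRICE_PER_MTOK = { haiku: 0.25, sonnet: 3.0 } as const;

function monthlyCost(
  tokensPerRun: { haiku: number; sonnet: number },
  runsPerDay: number,
): number {
  const runs = runsPerDay * 30; // 30-day month
  const mtok = (n: number) => (n * runs) / 1_000_000;
  return (
    mtok(tokensPerRun.haiku) * PRICE_PER_MTOK.haiku +
    mtok(tokensPerRun.sonnet) * PRICE_PER_MTOK.sonnet
  );
}
```

The point the arithmetic makes: routing the high-volume clustering work through Haiku (12× cheaper) is what turns a ~$300/month bill into ~$30.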


## The Dual Scoring System

### Score A: Constitutional Damage (0–100)

Seven weighted governance drivers, each scored 0–5:

| Driver | Weight | What it measures |
| --- | --- | --- |
| Judicial Independence | 0.18 | Court stacking, ruling defiance |
| Press Freedom | 0.15 | Journalist targeting, access restrictions |
| Voting Rights | 0.15 | Disenfranchisement, election interference |
| Environmental Policy | 0.12 | Regulatory rollbacks, enforcement gaps |
| Civil Liberties | 0.15 | Due process, privacy, free assembly |
| International Norms | 0.10 | Treaty violations, alliance damage |
| Fiscal Governance | 0.15 | Budget manipulation, oversight bypass |

Each driver score is multiplied by severity modifiers (durability × reversibility × precedent) and mechanism/scope modifiers.
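As a sketch of the aggregation — the weights are from the table, but the exact combination formula isn't reproduced here, so this assumes a weighted mean of the 0–5 driver scores, scaled to 0–100, then damped by multiplicative severity modifiers in [0, 1]:

```typescript
// Hedged sketch of Score A aggregation. Weights are from the driver table;
// the weighted-mean-then-multiply structure is an assumption.
const DRIVER_WEIGHTS: Record<string, number> = {
  judicialIndependence: 0.18,
  pressFreedom: 0.15,
  votingRights: 0.15,
  environmentalPolicy: 0.12,
  civilLiberties: 0.15,
  internationalNorms: 0.10,
  fiscalGovernance: 0.15, // weights sum to 1.00
};

interface SeverityModifiers {
  durability: number;   // each assumed to lie in [0, 1]
  reversibility: number;
  precedent: number;
}

function scoreA(
  drivers: Record<string, number>, // each driver scored 0–5
  mods: SeverityModifiers,
): number {
  let weighted = 0;
  for (const [name, w] of Object.entries(DRIVER_WEIGHTS)) {
    weighted += w * (drivers[name] ?? 0);
  }
  const base = (weighted / 5) * 100; // 0–5 weighted mean → 0–100
  const severity = mods.durability * mods.reversibility * mods.precedent;
  return Math.round(base * severity);
}
```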

### Score B: Distraction/Hype (0–100)

Two-layer model:

| Layer | Weight | Description |
| --- | --- | --- |
| Layer 1 | 55% | Raw media hype: volume, social amplification, cross-platform spread, emotional framing, celebrity involvement |
| Layer 2 | 45% | Strategic manipulation indicators: timing relative to damage events, coordinated messaging, deflection patterns |

Layer 2 is modulated by an intentionality score (0–15). Low intentionality drops Layer 2's weight to 10%.
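The gating logic, as a sketch. The 55/45 and 10% weights are from the text; the "low intentionality" threshold and the choice to renormalize Layer 1 to 90% are my assumptions, since the post doesn't spell them out:

```typescript
// Sketch of two-layer Score B with intentionality gating.
// Threshold and Layer 1 renormalization are assumptions.
function scoreB(
  layer1: number,        // raw media hype, 0–100
  layer2: number,        // strategic manipulation indicators, 0–100
  intentionality: number // 0–15
): number {
  const lowIntent = intentionality < 5; // assumed cutoff
  const w2 = lowIntent ? 0.10 : 0.45;  // Layer 2 weight drops when intent is weak
  const w1 = 1 - w2;                   // assumed: Layer 1 absorbs the rest
  return Math.round(w1 * layer1 + w2 * layer2);
}
```

The effect is that a pure media circus with no evidence of deliberate timing still scores on hype, but the "manipulation" signal barely moves the needle.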


## Classification

Events are classified by dominance margin:

| Class | Condition |
| --- | --- |
| Damage (List A) | Score A exceeds Score B by ≥ 10 points |
| Distraction (List B) | Score B exceeds Score A by ≥ 10 points |
| Noise (List C) | Neither dominates |
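The dominance-margin rule from the table, as a function:

```typescript
// Classify an event by dominance margin (default 10 points, per the table).
type EventClass = "A" | "B" | "C"; // Damage, Distraction, Noise

function classify(a: number, b: number, margin = 10): EventClass {
  if (a - b >= margin) return "A";
  if (b - a >= margin) return "B";
  return "C";
}
```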

## The Smokescreen Index

The most interesting feature: automatic pairing of high-distraction events with concurrent high-damage events. When a B-dominant event (media spectacle) co-occurs with an A-dominant event (institutional harm) that received less coverage, the system flags it as a potential smokescreen.

- 210+ pairs identified across 59 weeks.
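The pairing rule can be sketched as a join over scored events. The field names and the definition of "concurrent" (same ISO week) are assumptions for illustration:

```typescript
// Sketch of smokescreen pairing: for each B-dominant (distraction) event,
// find concurrent A-dominant (damage) events that drew less coverage.
// Field names and the same-week concurrency rule are assumptions.
interface ScoredEvent {
  id: string;
  week: string;        // e.g. "2026-W11"
  cls: "A" | "B" | "C";
  coverage: number;    // article count as a coverage proxy
}

function smokescreenPairs(events: ScoredEvent[]): Array<[string, string]> {
  const pairs: Array<[string, string]> = [];
  for (const b of events.filter((e) => e.cls === "B")) {
    for (const a of events.filter((e) => e.cls === "A")) {
      if (a.week === b.week && a.coverage < b.coverage) {
        pairs.push([b.id, a.id]); // spectacle paired with under-covered harm
      }
    }
  }
  return pairs;
}
```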

## Radical Transparency

Every scoring formula, weight, and AI prompt is published at /methodology. This was a deliberate design choice — if you're scoring political events, your methodology must be auditable.

Key transparency features:

- **Immutable weekly snapshots** – once a week freezes, scores cannot be silently changed.
- **Append-only corrections** – post-freeze corrections are timestamped and linked to the original.
- **Published prompts** – the exact Claude prompts used for scoring are documented.
- **Open source** – full codebase on GitHub.
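One way to implement the first two features is to hash each frozen snapshot so later tampering is detectable, and record corrections as separate rows that reference the digest. This is my illustration of the pattern, not the site's actual schema:

```typescript
// Sketch of an immutable weekly freeze: hash the snapshot so any later
// edit is detectable, and model corrections as append-only records.
import { createHash } from "node:crypto";

interface Snapshot {
  week: string;
  scores: Record<string, number>;
}

function freeze(snap: Snapshot): { snap: Readonly<Snapshot>; digest: string } {
  const digest = createHash("sha256")
    .update(JSON.stringify(snap))
    .digest("hex");
  return { snap: Object.freeze(snap), digest };
}

// A correction never mutates the frozen snapshot; it points at it by digest.
interface Correction {
  refDigest: string;
  field: string;
  newValue: number;
  at: string; // ISO timestamp
}
```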

## What I Learned

1. **Publishing your prompts is terrifying.** When your prompt templates are public, anyone can argue with your framing. That's the point — but it requires thick skin and a willingness to iterate.
2. **Immutability prevents model drift.** Without frozen snapshots, you can't tell whether score changes come from real-world changes or model updates. Immutability is essential for longitudinal analysis.
3. **The two-axis approach reveals patterns.** Single-dimension scoring (left/right, reliable/unreliable) misses the key insight: damage and distraction are independent variables. Some events are both; some are neither.
4. **Cost optimization matters for indie projects.** The Haiku-for-clustering, Sonnet-for-scoring split keeps costs at ~$30/month. Without this, the project wouldn't be sustainable as a solo effort.


## The Numbers (after 59 weeks)

- 1,500+ scored events
- 11,800+ ingested articles
- 210+ smokescreen pairs
- 288 tests passing
- 1,071 pages indexed

## Try It

- Live site: [distractionindex.org](https://distractionindex.org/)
- Methodology: /methodology
- Source code:

I’d love feedback on the scoring methodology!

