Building Clusterflick: A London Cinema Aggregator

Published: February 6, 2026 at 01:20 PM EST

Source: Dev.to

Overview

I’ve been working on a personal project called Clusterflick: a single source for every movie showing across London. It currently tracks 240 venues across 5 event platforms, pulling in 1,398 events and over 30,000 showings. What began as a simple desire to have cinema times on my calendar quickly grew into a full data pipeline running on GitHub Actions, a statically generated Next.js site, and a cluster of Raspberry Pis in my living room.

Challenges

Movie matching is deceptively hard – title + year or title + director often isn’t enough to uniquely identify a film, and some cinema listings give so little information that even a person couldn’t reliably identify the film.
Scraping at scale without a budget – GitHub runner IPs get blocked, so a Raspberry Pi cluster now handles the trickier sources.
Using LLMs for data quality – When fuzzy matching falls short, large language models have proven surprisingly useful for resolving ambiguous movie lookups against The Movie DB.
Keeping it cheap – The entire system runs at near‑zero infrastructure cost: GitHub Actions for orchestration, GitHub Releases as storage, and static site generation to avoid hosting fees.
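
To illustrate why title + year matching runs out of road, here is a minimal TypeScript sketch. It is not Clusterflick's actual code: the `Listing` and `Candidate` shapes, the function names, the one-year tolerance, and the candidate ids are all assumptions made up for this example. It normalises titles, filters candidates by title and (when present) year, and only accepts a match when exactly one candidate survives.

```typescript
// Hypothetical shapes for illustration; not taken from the Clusterflick codebase.
interface Listing {
  title: string;
  year?: number; // many cinema listings omit the year entirely
}

interface Candidate {
  id: number; // made-up ids, not real database ids
  title: string;
  year: number;
}

// Lowercase, strip accents and punctuation, collapse runs of whitespace.
function normalizeTitle(title: string): string {
  return title
    .toLowerCase()
    .normalize("NFD")
    .replace(/[\u0300-\u036f]/g, "")
    .replace(/[^a-z0-9]+/g, " ")
    .trim();
}

// Return the candidate id only when exactly one candidate matches; otherwise null.
function matchListing(listing: Listing, candidates: Candidate[]): number | null {
  const wanted = normalizeTitle(listing.title);
  const matches = candidates.filter((c) => {
    if (normalizeTitle(c.title) !== wanted) return false;
    // Assumed tolerance: allow one year of slack when the listing gives a year.
    return listing.year === undefined || Math.abs(c.year - listing.year) <= 1;
  });
  const first = matches[0];
  return matches.length === 1 && first !== undefined ? first.id : null;
}

const candidates: Candidate[] = [
  { id: 1, title: "Nosferatu", year: 1922 },
  { id: 2, title: "Nosferatu", year: 2024 },
];

console.log(matchListing({ title: "NOSFERATU" }, candidates)); // → null (ambiguous)
console.log(matchListing({ title: "Nosferatu", year: 2024 }, candidates)); // → 2
```

A `null` result marks the point where something smarter has to take over, whether that is extra metadata such as the director or, as described above, an LLM resolving the ambiguous lookup.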

Open Source

The whole project is open source on GitHub. If any of this sounds interesting, I’d love to hear from others working on similar scraping, aggregation, or data‑pipeline projects.
