How I built a local-first Reddit tool without cloud infrastructure

Published: December 15, 2025, 9:15 PM EST
4 min read
Source: Dev.to

Okay so this is going to sound backwards. But hear me out.

I spent two weeks building a cloud‑based Reddit scraper—authentication, Postgres, the whole thing—then threw it all away and started over with a desktop app that has zero cloud infrastructure. Best decision I made on this project. Here’s why.

The cloud version kept getting blocked

I’m building a tool called Reddit Toolbox. It scrapes subreddits and filters posts by comment count, which is useful for Reddit marketing or research.

The first version was a web app: Next.js frontend, Python backend on a VPS, Supabase for auth and data. Reddit, however, really doesn’t like servers making requests. My VPS IP was flagged within literally five minutes of testing. Rotating proxies, residential IPs, and throttling didn’t help reliably. Every time I thought I’d solved it, Reddit updated its detection and I was back to square one. Users kept asking “why am I getting blocked?” and I had no good answer.

The obvious solution I kept ignoring

A friend suggested, “Why don’t you just run it on the user’s machine?” I worried about distribution, manual updates, lack of usage tracking, and revenue models. But the fact remained: if the app runs from a home IP, Reddit sees a normal person browsing. No proxy tricks, no cat‑and‑mouse with detection—just works.

What I rebuilt

I discarded the web stack and started fresh with Python + PyQt6.

# core of the scraper – embarrassingly simple
import requests

def scrape_subreddit(name, limit=100):
    url = f"https://www.reddit.com/r/{name}.json?limit={limit}"

    # Just a GET request from the user's IP.
    response = requests.get(url, headers={
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)'
    }, timeout=10)

    if response.status_code == 200:
        return response.json()['data']['children']
    else:
        # Fallback to RSS if JSON is blocked (scrape_via_rss below)
        return scrape_via_rss(name)
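
For context, the comment-count filter sits naturally on top of this: each post in the JSON payload carries a num_comments field under data, so filtering is a one-liner. The subreddit name and threshold here are just illustrative.

posts = scrape_subreddit("webdev")
popular = [p for p in posts if p["data"]["num_comments"] >= 50]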

The RSS fallback is important—sometimes Reddit blocks the JSON API for certain patterns but leaves RSS open, so the tool almost never fails completely.
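
The post doesn’t show scrape_via_rss, but a minimal sketch might look like this, parsing the subreddit’s Atom feed with the standard library. The feed URL shape and the reduced return format are my assumptions, not the actual app’s code.

import requests
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

def scrape_via_rss(name):
    # Reddit exposes each subreddit as an Atom feed (assumed URL shape)
    url = f"https://www.reddit.com/r/{name}/.rss"
    response = requests.get(url, headers={
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)'
    }, timeout=10)
    response.raise_for_status()
    root = ET.fromstring(response.content)
    posts = []
    for entry in root.iter(f"{ATOM}entry"):
        # The feed carries less metadata than the JSON API (no comment
        # counts), so the fallback returns a reduced shape
        posts.append({
            "title": entry.findtext(f"{ATOM}title"),
            "link": entry.find(f"{ATOM}link").attrib.get("href"),
        })
    return posts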

For data storage I use SQLite—a single file that lives next to the app. Users can back it up by copying that file. No database server, no connection strings, no “is my DB running?” debugging at 2 am.
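
A minimal sketch of what that storage layer could look like; the table schema and file name are my guesses for illustration, not the app’s actual layout.

import sqlite3

def save_posts(posts, db_path="reddit_toolbox.db"):
    # One file next to the app; created automatically on first run
    con = sqlite3.connect(db_path)
    con.execute("""
        CREATE TABLE IF NOT EXISTS posts (
            id            TEXT PRIMARY KEY,
            subreddit     TEXT,
            title         TEXT,
            num_comments  INTEGER
        )
    """)
    con.executemany(
        "INSERT OR IGNORE INTO posts VALUES (?, ?, ?, ?)",
        [
            (p["data"]["id"], p["data"]["subreddit"],
             p["data"]["title"], p["data"]["num_comments"])
            for p in posts
        ],
    )
    con.commit()
    con.close()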

The part that scared me: how to make money

With a web app, gating features behind auth is easy. With a desktop app, users can download the binary, crack it, patch it, or share it. I spent too long thinking about DRM, then realized the people who would crack a $15/month tool weren’t going to pay anyway. I opted for a simple approach:

  • The app phones home once per session to check subscription status via a quick Supabase API call.
  • If the user is offline or blocks the call, the app still works—defaulting to free‑tier limits (15 scrapes/day).

Could someone bypass this? Sure. Do I care? Not really. The paying customers are happy, and the rest weren’t going to become customers anyway.
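
A sketch of that check, assuming the fail-open behavior described above; the endpoint URL and response shape are invented for illustration (the real app calls a Supabase API).

import requests

FREE_TIER_DAILY_LIMIT = 15  # scrapes/day when unlicensed or offline

def check_subscription(license_key):
    # Hypothetical endpoint, stands in for the actual Supabase call.
    # Any failure (offline, blocked, server down) degrades to free tier.
    try:
        response = requests.get(
            "https://example.com/api/subscription",
            params={"key": license_key},
            timeout=5,
        )
        response.raise_for_status()
        return response.json().get("active", False)
    except requests.RequestException:
        return False  # free-tier limits apply; the app keeps working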

Trade‑offs I accepted

  • No cross‑device sync. Data lives on one machine; export/import is required for another device. Slight inconvenience, but most users work from a single machine.
  • Manual updates. I’m working on an auto‑updater, but for now users download new versions themselves. Some feedback suggests people actually prefer knowing exactly when their software changes.
  • Zero telemetry. I have no insight into how people use the app. I might add opt‑in analytics later, but it’s nice not to drown in dashboards.

Results after a few weeks

  • Zero support tickets about blocking. The web version generated daily tickets; the local version doesn’t.
  • App size is ~50 MB. An Electron version would be ~150 MB+. PyQt6 feels native and lightweight.
  • Positive user feedback. Users appreciate the simplicity—no sign‑up required. One email said, “finally a tool that doesn’t want my email before I can try it.” That made my week.

When to go local‑first

Local‑first isn’t suitable for everything. You’ll still need a server for:

  • Real‑time collaboration
  • Mobile apps that sync across devices
  • Anything with social features

But for single‑user productivity tools that talk to external APIs—especially APIs that actively fight scrapers—local‑first can eliminate unnecessary cloud complexity.

If you want to see what I built, it’s called Reddit Toolbox. Search “Reddit Toolbox wappkit” or check wappkit.com. A free tier is available.

Happy to answer questions about PyQt, the architecture, or why I now hate proxies.
