Publishing Pipeline - Refactoring

Published: January 15, 2026 at 11:53 AM EST
Source: Dev.to


Where We Started

The original setup had clear limitations:

  • One script, one platform
  • Tight coupling between:
    • content loading
    • publishing logic
    • database access
  • Minimal state tracking
  • Hard to debug
  • Hard to extend
  • Impossible to reason about once it grew past a few hundred lines

In short: it worked, but it didn’t scale — neither technically nor mentally.

Original Flow

[Figure: original pipeline diagram]

Characteristics of the Old Design

  • One script owns everything
  • Platform logic intertwined
  • Hard to add new platforms
  • Any change risks breaking unrelated behavior
  • Re‑running safely is difficult

The New Architecture: Clear Responsibilities Everywhere

Over the past weeks, the pipeline was redesigned around layers, each with a single responsibility.

1. Pipeline Layer (Orchestration)

Responsible only for flow:

  • Load posts
  • Process media
  • Inject backlinks
  • Publish to enabled platforms
  • Track results

No platform‑specific logic lives here.
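A minimal sketch of what such an orchestration layer could look like. The names `Post`, `Publisher`, and `Pipeline` are assumptions made for illustration, not the project's actual classes, and the media and backlink steps are placeholders that return the post unchanged.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Protocol


@dataclass(frozen=True)
class Post:
    """Hypothetical immutable post record."""
    slug: str
    body: str


class Publisher(Protocol):
    """Anything with a name and a publish() method can be plugged in."""
    name: str

    def publish(self, post: Post) -> str: ...


@dataclass
class Pipeline:
    """Controls flow only; all platform decisions live in the publishers."""
    publishers: list[Publisher]
    results: list[tuple[str, str, str]] = field(default_factory=list)

    def run(self, posts: list[Post]) -> None:
        for post in posts:
            prepared = self._inject_backlinks(self._process_media(post))
            for publisher in self.publishers:
                status = publisher.publish(prepared)
                # Track one (slug, platform, status) row per attempt.
                self.results.append((post.slug, publisher.name, status))

    def _process_media(self, post: Post) -> Post:
        return post  # placeholder for the real media step

    def _inject_backlinks(self, post: Post) -> Post:
        return post  # placeholder for the real backlink step
```

The pipeline never branches on a platform name; it only iterates over whatever publishers it was given.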

2. Publisher Layer (Per‑Platform Semantics)

Each platform gets:

  • a Publisher (what to publish, when, and how)
  • a Client (how to talk to the API)

This distinction turned out to be crucial. For example:

| Platform | Update Support |
| --- | --- |
| WordPress | Supports updates |
| Dev.to | Supports updates with constraints |
| X (Twitter) | Publish‑once‑only |

That difference now lives exactly where it belongs: inside the publisher, not smeared across the pipeline.
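One way to encode these differences is a small policy enum that each publisher consults. This is a sketch under assumptions: `UpdatePolicy`, `POLICIES`, and `decide_action` are hypothetical names, and the `constraints_ok` flag merely stands in for whatever constraints Dev.to actually imposes, which are not spelled out here.

```python
from enum import Enum


class UpdatePolicy(Enum):
    UPDATE_IN_PLACE = "update_in_place"   # e.g. WordPress
    CONDITIONAL = "conditional"           # e.g. Dev.to
    PUBLISH_ONCE = "publish_once"         # e.g. X (Twitter)


# Hypothetical mapping from platform key to policy.
POLICIES = {
    "wordpress": UpdatePolicy.UPDATE_IN_PLACE,
    "devto": UpdatePolicy.CONDITIONAL,
    "x": UpdatePolicy.PUBLISH_ONCE,
}


def decide_action(policy, *, already_published, content_changed, constraints_ok=True):
    """Return 'publish', 'update', or 'skip' for one post on one platform."""
    if not already_published:
        return "publish"
    if policy is UpdatePolicy.PUBLISH_ONCE or not content_changed:
        return "skip"
    if policy is UpdatePolicy.CONDITIONAL and not constraints_ok:
        return "skip"
    return "update"
```

With this shape, a changed post that already exists on X is skipped, while the same post on WordPress is updated, and the pipeline does not need to know why.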

3. Client Layer (Pure API Communication)

Clients handle API communication and nothing else:

  • Authenticate
  • Send requests
  • Return normalized results

They contain no:

  • Database access
  • Publishing decisions
  • Content logic

This makes them:

  • Testable
  • Replaceable
  • Reusable
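To illustrate why such a client is easy to test, here is a sketch with an injected transport function. Everything in it is assumed for the example: `FooClient`, `PublishResult`, the `/posts` path, and the bearer-token header are hypothetical and do not describe any real platform API.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Optional

# A transport takes (path, headers, payload) and returns (status_code, json_body).
Transport = Callable[[str, dict, dict], tuple[int, dict]]


@dataclass(frozen=True)
class PublishResult:
    """Normalized result, identical in shape for every platform."""
    ok: bool
    remote_id: Optional[str] = None
    url: Optional[str] = None
    error: Optional[str] = None


class FooClient:
    """Hypothetical client: authenticates, sends, normalizes. Nothing else."""

    def __init__(self, token: str, transport: Transport) -> None:
        self._token = token
        self._transport = transport

    def create_post(self, title: str, body: str) -> PublishResult:
        headers = {"Authorization": f"Bearer {self._token}"}
        status, payload = self._transport("/posts", headers, {"title": title, "body": body})
        if 200 <= status < 300:
            return PublishResult(ok=True, remote_id=str(payload["id"]), url=payload.get("url"))
        return PublishResult(ok=False, error=f"HTTP {status}")
```

Because the transport is passed in, a test can hand the client a plain function instead of a network connection, and a real HTTP library can be swapped in without touching publishers or the pipeline.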

4. Database Layer (State & Deduplication)

The database now tracks:

  • What was published
  • Where it was published
  • Which content hash was used
  • Which media assets already exist
  • Canonical URLs

This enables:

  • Idempotent runs
  • Safe re‑publishing
  • Deduplication of media
  • Correct canonical handling across platforms
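The idempotency idea can be sketched with the standard library's `sqlite3` and `hashlib`. The table layout and the function names below are assumptions for illustration; the project's real schema and database engine may differ.

```python
import hashlib
import sqlite3

# Hypothetical schema: one row per (post, platform) pair.
SCHEMA = """
CREATE TABLE IF NOT EXISTS publications (
    slug          TEXT NOT NULL,
    platform      TEXT NOT NULL,
    content_hash  TEXT NOT NULL,
    canonical_url TEXT,
    PRIMARY KEY (slug, platform)
)
"""


def content_hash(body: str) -> str:
    """Stable fingerprint of the post body."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def needs_publish(db: sqlite3.Connection, slug: str, platform: str, body: str) -> bool:
    """True if the post was never published there, or its content changed."""
    row = db.execute(
        "SELECT content_hash FROM publications WHERE slug = ? AND platform = ?",
        (slug, platform),
    ).fetchone()
    return row is None or row[0] != content_hash(body)


def record_publication(db, slug, platform, body, canonical_url):
    """Store (or overwrite) the state after a successful publish."""
    db.execute(
        "INSERT OR REPLACE INTO publications VALUES (?, ?, ?, ?)",
        (slug, platform, content_hash(body), canonical_url),
    )
    db.commit()
```

A second run over unchanged content then finds a matching hash for every row and has nothing to do, which is what makes re-runs safe.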

New Flow Diagram

New pipeline diagram


What Changed

  • Pipeline only controls flow.
  • Publishers decide what to do.
  • Clients only talk to APIs.
  • Database provides state and idempotency.
  • Each part is testable and replaceable.

This is the turning point where the system stops being fragile.

Why This Is Faster (and Calmer)

One unexpected benefit: performance and reliability improved naturally—not because of clever optimizations, but because:

  • Small modules load faster
  • Less global state stays in memory
  • Failures are isolated
  • Logs are precise and meaningful
  • Re‑runs are cheap and safe

Instead of hoping nothing breaks, the pipeline now knows when nothing needs to happen.

Adding a New Platform Is Now Boring (That’s a Compliment)

To add a new platform today, the steps are:

  1. Write a client (platforms/foo_client.py)
  2. Write a publisher (publishers/foo.py)
  3. Wire it into the composition root
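The three steps might look roughly like this when condensed into one file. This is a sketch, not the project's code: `FooClient`, `FooPublisher`, `build_publishers`, and the config keys `enabled` and `token` are all hypothetical, and the client returns a fake identifier instead of calling an API.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class FooClient:
    """Step 1: would live in platforms/foo_client.py."""
    token: str

    def create_post(self, title: str, body: str) -> str:
        return f"foo:{title}"  # stand-in for a real API call


@dataclass
class FooPublisher:
    """Step 2: would live in publishers/foo.py."""
    client: FooClient
    name: str = "foo"

    def publish(self, title: str, body: str) -> str:
        return self.client.create_post(title, body)


def build_publishers(config: dict[str, dict]) -> list:
    """Step 3: the composition root, the only place that knows concrete classes."""
    factories = {
        "foo": lambda cfg: FooPublisher(FooClient(token=cfg["token"])),
    }
    return [
        factories[name](cfg)
        for name, cfg in config.items()
        if cfg.get("enabled")
    ]
```

Adding another platform means adding one more entry to `factories`; nothing else in the wiring changes.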

No changes required to:

  • Media handling
  • Backlinks
  • Canonical logic
  • Database schema
  • Pipeline flow

That’s the kind of boring you (well, I certainly) want.

Some highlights that now simply work:

  • Media deduplication via hashing → upload once, reuse everywhere
  • Featured images correctly attached per post
  • Backlinks injected deterministically
  • Canonical URLs handled centrally and propagated correctly
  • Per‑platform semantics respected (no more accidental re‑posting)

None of this required hacks—only the right separation of concerns.
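As an illustration of "upload once, reuse everywhere", here is a small sketch of hash-based media deduplication. `MediaStore` and its `upload` callback are hypothetical, and an in-memory dict stands in for the database table that would hold known assets.

```python
import hashlib
from typing import Callable, Dict


class MediaStore:
    """Uploads each distinct file once, keyed by its SHA-256 digest."""

    def __init__(self, upload: Callable[[bytes], str]) -> None:
        self._upload = upload              # returns the remote URL of the asset
        self._url_by_hash: Dict[str, str] = {}

    def ensure_uploaded(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self._url_by_hash:
            # First time this exact content is seen: upload and remember it.
            self._url_by_hash[digest] = self._upload(data)
        return self._url_by_hash[digest]
```

Keying on the content hash rather than the filename means a renamed copy of the same image is still recognized as already uploaded.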

What I Learned (The Unexpected Part)

This project wasn’t about learning Python—but it turned into exactly that. In three weeks, I learned more about:

  • Python module design
  • Data modeling
  • Immutability vs. mutation
  • Error handling
  • API semantics
  • Architectural boundaries

Key insight: Python becomes dramatically easier once the architecture is clean. Most complexity wasn’t Python at all—it was unclear responsibility.

What’s Next: Roadmap to v1.4.0

The current version 1.3.1 closes the stabilization phase.

The next milestone, 1.4.0, will focus on capabilities.

Planned Features

  1. Platform‑Aware Update Strategies
    • Publish‑once (X)
    • Update‑in‑place (WordPress)
    • Conditional update (Dev.to)
    • Future: quote‑tweet or follow‑up logic

  2. Rate Limiting & Scheduling
    • Per‑platform throttling
    • Optional delayed publishing
    • CI‑safe dry‑run mode

  3. Observability Improvements
    • Structured logs
    • Optional JSON output
    • Better failure summaries
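Since these features are still planned, the following is only a guess at what structured logs with optional JSON output could look like using the standard `logging` module. `JsonFormatter` and the extra fields `platform` and `slug` are assumptions chosen for the example.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Renders each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "message": record.getMessage(),
            # Extra fields are optional; they are None when not supplied.
            "platform": getattr(record, "platform", None),
            "slug": getattr(record, "slug", None),
        }
        return json.dumps(payload, sort_keys=True)


def make_logger(name: str, as_json: bool) -> logging.Logger:
    """Build a logger that emits JSON or plain text, depending on a flag."""
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter() if as_json else logging.Formatter("%(levelname)s %(message)s"))
    logger = logging.getLogger(name)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    return logger
```

A call such as `make_logger("pipeline", as_json=True).info("published", extra={"platform": "devto", "slug": "hello"})` would then emit a single machine-readable line per event.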

New Platforms (Starting with One)

At least one new platform will be added in v1.4.0.

Candidates

  • LinkedIn (high priority)
  • Xing
  • Indeed (content‑syndication angle)

Later roadmap

  • Substack
  • Patreon
  • Possibly others

Thanks to the current architecture, adding these is now a contained task, not a rewrite.

And Yes — This Will Become a Series

This post is likely the first of a small series covering:

  • Python for infrastructure‑minded people
  • Real‑world refactoring
  • Automation that survives growth
  • Designing for change, not just success

Not tutorials — but honest write‑ups of systems that evolved under real constraints.

Final Thoughts

What started as a utility script has become a small platform.
Not because it needed to be fancy — but because it needed to be understandable, extendable, and trustworthy.

And that, more than anything, made the difference.

Down the road, it might become a service that others can use for their own writing. Since I develop and run my own Kubernetes cloud, nothing should stand in the way of offering it to tenants as a containerized solution.
