When Not to Reach for Microservices: A Startup's First 18 Months

Published: (June 13, 2026 at 07:03 AM EDT)
7 min read
Source: Dev.to

Book: Ship It — The Pragmatic Startup Tech Stack

Also by me: Thinking in Go (2-book series) — Complete Guide to Go Programming + Hexagonal Architecture in Go

My project: Hermes IDE | GitHub (an IDE for developers who ship with Claude Code and other AI coding tools)

Me: xgabriel.com | GitHub

You are four people. You have no paying customers. The deck says you will “scale to millions.” So you draw a diagram with seven boxes: auth service, billing service, notification service, an API gateway, a message broker, and two databases you are sure you will need later. Three weeks in, nobody has shipped a feature. You are debugging why the billing service can’t reach the user service in your local Docker setup. Somewhere in that mess is the product you were supposed to be building.

This is the most common self-inflicted wound in early-stage engineering. Microservices are an answer to a question startups don’t ask for a long time. The question is organizational, not technical.

Read the canonical sources and the framing is consistent. Martin Fowler and James Lewis, in their 2014 microservices article, tie service boundaries to teams “organized around business capabilities.” That echoes Conway’s Law from 1968: systems mirror the communication structure of the org that builds them. Microservices let independent teams deploy independently. That is the whole point.

Team A ships the billing service on Tuesday without coordinating with Team B, who owns search. Separate repos, separate pipelines, separate on-call rotations. No shared release train, no blocking on each other.

Now count your teams. You have one. Maybe two engineers who sit next to each other. When one team owns every service, you get the costs of independent deployment with none of the benefits. You coordinate every change anyway, because you are the same people. You just added network calls between functions that used to be method calls.

Fowler himself wrote a follow-up, MonolithFirst, arguing that almost every successful microservices story started as a monolith that got too big, and almost every system built as microservices from scratch ran into serious trouble. The pattern is empirical, not ideological.

A method call either returns or throws. A network call has a third outcome: it hangs.
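
To make that third outcome concrete, here is a minimal sketch using only Python's standard library. The helper names `start_silent_server` and `call_with_timeout` are made up for illustration, and the "service" is just a local socket that takes connections and never replies. Without the timeout, the `recv` call would block indefinitely.

```python
import socket


def start_silent_server() -> tuple[socket.socket, int]:
    """Listen on a free local port and never reply (hypothetical stand-in
    for a service that has stopped responding)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen()
    return server, server.getsockname()[1]


def call_with_timeout(port: int, timeout_s: float) -> str:
    """Make one request and classify the result as 'ok', 'error' or 'timeout'."""
    try:
        with socket.create_connection(("127.0.0.1", port), timeout=timeout_s) as conn:
            conn.sendall(b"GET user 42\n")
            data = conn.recv(1024)  # waits for a reply until the timeout fires
            return "ok" if data else "error"
    except socket.timeout:
        # The third outcome: no answer and no failure, just silence.
        return "timeout"
    except OSError:
        # Connection refused, reset, and similar hard failures.
        return "error"


if __name__ == "__main__":
    server, port = start_silent_server()
    print(call_with_timeout(port, 0.2))
    server.close()
```

The timeout turns a hang into an exception you can handle, but somebody still has to pick the number, and it is rarely the same number for every call.
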
Now multiply that across every service boundary you drew.

Here is a request that touches two services. In a monolith it looks like this:

    def place_order(user_id, cart):
        user = users.get(user_id)
        if not user.payment_ok:
            raise PaymentError()
        order = orders.create(user_id, cart)
        return order

That runs in one process, one transaction. If orders.create fails, the whole thing rolls back. You reason about it by reading top to bottom.

Split users and orders into separate services and the same logic grows a tail of failure modes:

    def place_order(user_id, cart):
        # network call: can time out, return 500,
        # or succeed but the response gets lost
        user = user_service.get(user_id)
        if not user.payment_ok:
            raise PaymentError()
        # second network call: what if THIS fails
        # after the user check already passed?
        order = order_service.create(user_id, cart)
        return order
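
Take just one of the failure modes named in those comments: the call succeeds but the response gets lost. The sketch below is one way to picture it, under loud assumptions: `FakeOrderService` is a hypothetical in-memory stand-in for the remote order service, and `LostResponse` is a made-up exception, not any real library's API. The client reuses one key across retries so the server can recognize a repeat.

```python
import uuid


class LostResponse(Exception):
    """The request was processed, but the reply never reached the caller."""


class FakeOrderService:
    """Hypothetical in-memory stand-in for a remote order service."""

    def __init__(self, lose_first_n_responses: int = 0) -> None:
        self.orders_by_key: dict[str, dict] = {}
        self._responses_to_lose = lose_first_n_responses

    def create(self, idempotency_key: str, user_id: int, cart: list[str]) -> dict:
        # A key we have seen before means a retry: reuse the stored order.
        order = self.orders_by_key.get(idempotency_key)
        if order is None:
            order = {"id": len(self.orders_by_key) + 1, "user_id": user_id, "cart": cart}
            self.orders_by_key[idempotency_key] = order
        # Simulate the order being saved while the reply is dropped in transit.
        if self._responses_to_lose > 0:
            self._responses_to_lose -= 1
            raise LostResponse()
        return order


def create_order_with_retry(service: FakeOrderService, user_id: int,
                            cart: list[str], max_attempts: int = 3) -> dict:
    """Retry a lost call, sending the same idempotency key every time."""
    key = str(uuid.uuid4())  # one key per logical order, not per attempt
    last_error = None
    for _ in range(max_attempts):
        try:
            return service.create(key, user_id, cart)
        except LostResponse as exc:
            last_error = exc
    raise RuntimeError("order service did not answer") from last_error


if __name__ == "__main__":
    service = FakeOrderService(lose_first_n_responses=2)
    print(create_order_with_retry(service, 1, ["book"]))
    print(len(service.orders_by_key))
```

Generate a fresh key inside the loop instead and every lost response becomes a duplicate order. That is a one-line bug a method call could never have.
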

Now you need timeouts, retries, idempotency keys so a retried order doesn’t double-charge, and a story for partial failure. There is no shared transaction across two databases, so you reach for a saga or an outbox pattern to keep them consistent. Each of those is a real piece of engineering with its own bugs.

You also lose the cheap things. A join across two tables becomes an API call plus in-memory stitching. A stack trace that once spanned the whole request now stops at a service boundary, so you bolt on distributed tracing to see across the gap. Local development needs the whole constellation running, so someone writes a Docker Compose file that nobody fully understands.

None of this work ships a feature. It is rent you pay for a building you don’t occupy yet.

The fear that sells premature microservices is that you will have to rewrite the monolith later, and it gets the risk backwards. Rewriting a clean monolith into services is a known, bounded problem. You do it when you have the revenue and the team to justify it, and you do it knowing the actual seams because production traffic has shown you where they are.

Guessing the seams up front is the unbounded problem. You draw boundaries based on a whiteboard mental model of a product that does not exist yet. The product changes. The boundaries were wrong. Now you are paying network and coordination costs to move logic across services that should have been one service all along, which is far more painful than moving a function between modules.

Amazon Prime Video’s engineering team wrote about this in 2023: they moved a monitoring tool from a distributed serverless architecture back to a monolith and cut infrastructure cost by 90 percent. The direction of that migration is the point. Even inside a company that runs microservices at planet scale, the right call for a specific workload was to collapse it back.

You keep the option to split by building the monolith well, not by splitting early. Use modules with clear interfaces.
Keep your domain logic out of your web framework. A well-structured monolith is a microservice architecture that hasn’t been forced apart yet.

    src/
      billing/   # clear public interface, owns its tables
      orders/    # talks to billing through a function call
      users/     # for now
      web/       # http handlers, thin
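
As a rough sketch of what a "clear public interface" between two of those packages could look like, the snippet below squeezes `billing` and `orders` into one file. The names `Billing`, `InProcessBilling`, `Charge`, and `create_order` are assumptions for illustration, not taken from any real codebase.

```python
from dataclasses import dataclass
from typing import Protocol


# --- billing/ : the only surface other packages may import (hypothetical) ---
@dataclass(frozen=True)
class Charge:
    user_id: int
    amount_cents: int


class Billing(Protocol):
    """The contract that orders/ depends on."""

    def charge(self, user_id: int, amount_cents: int) -> Charge: ...


class InProcessBilling:
    """Today's implementation: a plain object in the same process."""

    def __init__(self) -> None:
        # Stands in for the tables that billing owns.
        self.charges: list[Charge] = []

    def charge(self, user_id: int, amount_cents: int) -> Charge:
        charge = Charge(user_id, amount_cents)
        self.charges.append(charge)
        return charge


# --- orders/ : knows the Billing contract, not how it is implemented ---
def create_order(billing: Billing, user_id: int, amount_cents: int) -> dict:
    charge = billing.charge(user_id, amount_cents)  # a function call, for now
    return {"user_id": user_id, "paid_cents": charge.amount_cents}


if __name__ == "__main__":
    billing = InProcessBilling()
    print(create_order(billing, 7, 1999))
```

If billing ever has to move out, a class with the same `charge` method that makes a network call could be passed in instead, and `create_order` would not need to change.
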

Each top-level package is a future service if it ever needs to be. The boundaries live in code, enforced by review and imports, not by the network. That costs you nothing today and saves you the rewrite-the-seams pain later.

Don’t split because traffic grew. A bigger Hetzner box, a read replica, and a cache handle far more load than most startups ever see. Vertical scaling buys you a long runway, and it is boring in the good way.

Split when you hit one of these, and not before:

- Team friction is real and measurable. Two or more teams keep blocking each other on a shared deploy. People are waiting on a release train. That is the Conway’s Law signal, and it is the only one that actually justifies the cost.
- One component has a genuinely different scaling profile. A video transcoder, an ML inference path, or a job runner that needs GPUs or 10x the memory of everything else. Isolating that workload is a real win because it scales on a different axis.
- A clear blast-radius boundary matters. A piece of the system whose failure must never take down checkout, where the isolation is worth the operational cost.
- You need different runtimes. One part is Python for the ML libraries, the rest is Go. A process boundary is the honest way to draw that line.

Notice what is not on the list: “we might scale,” “it’s cleaner,” “the architecture blog said so,” or “investors will ask.” Those are reasons to write a better monolith, not to distribute one.

Build a monolith. One deployable, one database, modules with sharp internal boundaries. Put it on a single server you understand. Add a read replica and a cache before you add a service. Keep your domain logic framework-agnostic so the seams are visible when you do need them.

Spend the time you save on the thing that actually kills startups, which is not having a product anyone wants. The distributed-systems tax is real money and real hours, and before product-market fit you cannot afford either.
Pay it when an org problem forces your hand, with production traffic telling you exactly where the lines go.

The companies you admire that run hundreds of services almost all started with one. They earned the complexity. You get to skip it until you have.

If this matched how you think about early infrastructure, the same reasoning runs through Ship It. The book works through hosting, databases, and architecture decisions at each budget and stage, with the same “what to pick and when to upgrade” framing. It is the longer version of this post: choosing complexity on purpose instead of by default.
