How We Built a Programmatic SEO Engine Serving 80K+ Pages on WordPress (Without Using wp_posts)

Published: (March 24, 2026 at 07:28 AM EDT)
7 min read
Source: Dev.to

Source: Dev.to

When we set out to build startup‑cost.com, we knew traditional WordPress wouldn’t cut it. We needed to serve 79 000+ unique pages – one for every combination of 479 cities and 167 business types – with real cost data, real‑time calculations, and solid performance.

Most people hear “80 K pages on WordPress” and assume we’re crazy. WordPress is a blogging platform, right? Well, yes – but under the hood it’s a flexible PHP framework with a powerful rewrite engine. We just had to throw away the parts that don’t scale and build our own.

Here’s the story of how we did it without a single row in wp_posts.

The Problem with wp_posts at Scale

WordPress stores all content in a single table called wp_posts. For a blog or a small‑business site with a few hundred pages this works fine. But when you start pushing tens of thousands of rows into that table, things fall apart quickly:

IssueWhy it hurts
Query performance degradesWordPress joins wp_posts with wp_postmeta for almost every query. With 80 K posts, each with 10+ meta fields, you end up with 800 K+ rows in wp_postmeta. Queries that used to take ~5 ms now take ~500 ms.
Admin panel becomes unusableLoading the “All Posts” screen with 80 K entries is a nightmare – pagination is there, but counting the total takes forever.
Revision history eats diskAuto‑saves create 3‑4× the actual content in revision rows when content is regenerated programmatically.
XML sitemaps chokePopular sitemap plugins try to query all posts at once; with 80 K rows they either timeout or exhaust memory.
Imports/exports breakWordPress export generates a single XML file. An 80 K‑post WXR file is practically impossible to handle.

We needed a fundamentally different approach.

Why Not Use a Static Site Generator or a Different CMS?

Fair question. We evaluated Hugo, Next.js, and a custom Node.js app. WordPress still gave us specific advantages:

  • Hosting is dirt cheap – Shared WordPress hosting costs a few dollars a month and handles our traffic fine.
  • Plugin ecosystem – We still use utility plugins for SEO, caching, and other tasks.
  • Familiar deployment – Our team knows WordPress inside‑out. No learning curve, no new CI/CD pipeline needed.
  • PHP is fast enough – With OpCache and clean query patterns, PHP 8 serves pages in double‑digit milliseconds.

The key insight: we don’t have to use WordPress the way it was designed. We can treat it as a routing and rendering framework while storing our data however we want.

The Architecture – High Level

We built a custom WordPress plugin that bypasses wp_posts entirely. The concept rests on three pillars:

Custom Database Tables

Instead of stuffing everything into posts and postmeta, we created dedicated tables with proper schemas, data types, and indexes. Think of it as a mini‑application database that lives inside WordPress’s MySQL instance.

  • One table per entity (cities, business types, cost metrics).
  • Columns match the actual data model – no generic key‑value pairs.
  • Proper indexing, normalization, and queries that hit exactly the data they need.

Result: a meta‑based lookup on 80 K posts might take hundreds of milliseconds; a direct indexed query on a purpose‑built table returns in single‑digit milliseconds.

Virtual URL Routing

WordPress’s rewrite API is powerful but underused. Most developers only touch it via the permalink settings screen. Under the hood you can register completely custom URL patterns that map to your own rendering logic.

  • Define URL patterns WordPress recognizes.
  • When a request arrives, WordPress matches the URL, extracts slugs, and hands control to our plugin.
  • No post is created, no row in wp_posts is touched – the URL exists purely because we told WordPress to recognize the pattern.

This gives us full control over URL structure while still benefiting from WordPress’s request lifecycle, caching hooks, and plugin compatibility.

Dynamic Content Generation

Programmatic SEO works only when each page delivers genuine value. We don’t just swap city names in a template; each page shows real cost figures, real calculations, and real comparisons.

  • Rendering layer pulls data from our custom tables.
  • Runs computations specific to each city‑business combination.
  • Outputs fully‑formed HTML with unique title, meta description, schema markup, etc.

Google can detect thin, templated content. Genuine, data‑driven value on each page is what makes programmatic SEO work long‑term.

Performance Results

Benchmarks were run on the same server (shared hosting, PHP 8.1, MariaDB 10.6).

Metricwp_posts approachCustom approach
Average page load (TTFB)~800 ms~120 ms
Database query time~200 ms (meta joins)~15 ms (indexed lookup)
Admin dashboard load30+ secondsInstant (no admin overhead)
Sitemap generationTimeout at 60 s~2 seconds
Memory usage per request~64 MB~12 MB

The difference is dramatic. Custom tables with proper indexes make WordPress perform like a purpose‑built application.

Sitemap Strategy

Google’s limits: 50 000 URLs per file and 50 MB uncompressed. With 80 K pages we need multiple sitemaps plus an index.

  • Generate sitemaps programmatically, chunking URLs into manageable files.
  • Create a sitemap index that references all chunks.
  • Run the generation on a weekly cron.
  • Google Search Console picks them up without issues, and we can track indexing progress per sitemap chunk.

Internal Linking

With 80 K pages, internal linking is critical both for SEO and for helping users navigate:

  • Each city page links to all business types available in that city.
  • Each business page links to top cities for that business type.
  • Each detail page links to related pages based on geographic and thematic proximity.
  • A dense internal‑link graph helps search engines discover and understand the site structure.

Caching

We use a multi‑tier caching strategy:

  • Not all 80 K pages get equal traffic.
    • Popular city/business combinations are pre‑cached with longer TTLs.
    • Long‑tail pages are cached on demand with shorter TTLs.
  • We also cache at the database‑query level for frequently accessed aggregations.

What We Learned

  • WordPress can handle massive scale – but only if you step outside its default content model. The platform is more flexible than people give it credit for.
  • Custom tables are not “hacky” – They’re the right tool when your data doesn’t fit the post/meta pattern. WordPress itself uses custom tables for comments, users, and options.
  • Virtual routing is powerful – WordPress rewrite rules can handle complex URL patterns. You don’t need a custom framework just because your URLs don’t map to posts.
  • Programmatic SEO needs real value – Google detects thin, templated content. Every page needs genuinely different, useful data. That’s the difference between spam and a real product.
  • Start with the data model – We spent more time designing our database schema than writing rendering code. Good data modeling pays off at every layer.
  • Monitor, then optimize – We started without caching and added it only where monitoring showed bottlenecks. Premature optimization would have added complexity we didn’t need.

Check out the result: startup-cost.com – startup cost estimates for 167 business types across 479 cities worldwide.

Built by Kavela Ltd – we build data‑driven web tools at scale.

0 views
Back to Blog

Related posts

Read more »