I Built a 287,000-Page Website. Here's What I Learned About Programmatic SEO.
Source: Dev.to
Most SEO advice boils down to the same thing
- Pick a keyword.
- Write an article.
- Wait three months.
- Repeat.
If you want 10× the traffic, you write 10× the content. That math doesn’t work when you’re a one‑person team.
A different approach
About a year ago I started experimenting with a programmatic method:
- Instead of writing articles one by one, I built a system that generates pages programmatically.
- One template + one data pipeline = thousands of output pages.
- Each page targets a specific long‑tail keyword.
The effort goes into building the machine, not feeding it.
The project: a stock comparison engine
- Input – two ticker symbols.
- Output – side‑by‑side breakdown: financials, dividends, growth metrics, etc.
- Scale – every meaningful stock pair across 12 languages.
Result: 287,000 pages built by a single person. No content team.
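To make the scale concrete: the page set is roughly the number of stock pairs multiplied by the number of languages. Here is a minimal sketch of that enumeration. The URL pattern, the language codes, the placeholder tickers, and the helper name `page_paths` are illustrative assumptions, not the site's actual scheme.

```python
from itertools import combinations

def page_paths(tickers, languages):
    """Yield one URL path per (unordered ticker pair, language)."""
    # combinations() yields each unordered pair exactly once, so
    # AAA-vs-BBB and BBB-vs-AAA do not become two near-identical pages.
    for a, b in combinations(sorted(tickers), 2):
        for lang in languages:
            yield f"/{lang}/compare/{a.lower()}-vs-{b.lower()}/"

# Three placeholder tickers and two languages: 3 pairs x 2 languages = 6 pages.
paths = list(page_paths(["CCC", "AAA", "BBB"], ["en", "de"]))
print(len(paths))  # 6
print(paths[0])    # /en/compare/aaa-vs-bbb/
```

Because the pair count grows quadratically with the number of tickers, the page count is set by the filter that decides which pairs are "meaningful", more than by the template.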
What actually happened
Tech stack
| Component | Role |
|---|---|
| Astro | Static site generation – fast, SEO‑friendly, handles thousands of routes |
| Supabase PostgreSQL | Data layer |
| yfinance (Python library) | Pulls financial data from Yahoo Finance |
| Local Llama 3 | Generates narrative sections for each page |
| Cloudflare CDN | Serves the static files |
Total monthly cost: under $50.
Core insight
Separate data from presentation.
- The database stores structured financial data for 8,000+ tickers.
- Templates define how that data is rendered into comparison pages.
- AI fills the gaps – generating human‑readable analysis unique to each pair.
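The split between data and presentation can be sketched in a few lines. This is only an illustration: the `TickerFacts` fields, the placeholder symbols, and the made-up numbers are assumptions, and the real site renders through Astro templates, not Python strings.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TickerFacts:
    """Structured data for one ticker (hypothetical subset of fields)."""
    symbol: str
    pe_ratio: float
    dividend_yield_pct: float

def render_comparison(a: TickerFacts, b: TickerFacts, narrative: str) -> str:
    """Presentation only: turn two data records plus a narrative into a page body."""
    rows = [
        ("P/E ratio", f"{a.pe_ratio:.1f}", f"{b.pe_ratio:.1f}"),
        ("Dividend yield", f"{a.dividend_yield_pct:.2f}%", f"{b.dividend_yield_pct:.2f}%"),
    ]
    lines = [f"| Metric | {a.symbol} | {b.symbol} |", "|---|---|---|"]
    lines += [f"| {name} | {left} | {right} |" for name, left, right in rows]
    return "\n".join(lines) + "\n\n" + narrative

# Placeholder records; the values are invented for the example.
page = render_comparison(
    TickerFacts("AAA", 20.0, 1.5),
    TickerFacts("BBB", 12.34, 3.0),
    "Narrative text generated separately goes here.",
)
print(page)
```

The renderer never fetches or computes financial data, so the same records can feed a different template without touching the pipeline.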
Architecture diagram
Data Layer (Supabase PostgreSQL)
↓
ETL Pipeline (Python + yfinance)
↓
Content Generation (Llama 3 – local)
↓
Static Site Generator (Astro)
↓
CDN (Cloudflare)
Each layer is independent, making swaps (e.g., Astro → Next.js, Llama 3 → Claude) painless.
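One way to keep a layer swappable is to have the rest of the pipeline depend on a small interface and not on a specific model. The sketch below assumes a hypothetical `NarrativeGenerator` protocol and a stand-in backend; it does not call Llama 3 or any hosted API.

```python
from typing import Protocol

class NarrativeGenerator(Protocol):
    """Anything with a generate(prompt) -> str method can serve as the content layer."""
    def generate(self, prompt: str) -> str: ...

class CannedGenerator:
    """Stand-in backend for local testing; a real backend would call a model here."""
    def generate(self, prompt: str) -> str:
        return f"[draft for: {prompt}]"

def build_narrative(generator: NarrativeGenerator, pair: tuple[str, str]) -> str:
    # The pipeline only knows about the protocol, so swapping the model
    # means passing a different object, not editing this function.
    prompt = f"Compare {pair[0]} and {pair[1]}"
    return generator.generate(prompt)

print(build_narrative(CannedGenerator(), ("AAA", "BBB")))
# [draft for: Compare AAA and BBB]
```

The same idea applies at the other boundaries: if the generator writes plain files or rows that the site builder reads, the builder can be replaced without changing the generator.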
Long‑tail coverage
- A page for every stock pair captures searches nobody else targets.
- Example: “Random small‑cap vs other random small‑cap” has almost zero competition.
- Individual page volume is tiny, but aggregate volume across 287k pages adds up.
Multilingual support
- Once the template and data pipeline are ready, translating to 12 languages is mostly a matter of translating the template strings and regenerating the pages.
- Numbers, tickers, percentages stay the same.
- Used AI for the initial pass, then manually polished the most important pages.
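As a rough illustration of why translation is cheap once the template exists, only the string table changes per language while the data passes through untouched. The `STRINGS` keys and the German wording below are assumptions for the example, and locale-specific number formatting is deliberately left out.

```python
# Per-language template strings; the financial data itself is language-neutral.
STRINGS = {
    "en": {"title": "{a} vs {b}: dividend yield", "yield": "Dividend yield"},
    "de": {"title": "{a} vs. {b}: Dividendenrendite", "yield": "Dividendenrendite"},
}

def render_title(lang: str, a: str, b: str) -> str:
    """Fill the translated title pattern with the (untranslated) ticker symbols."""
    return STRINGS[lang]["title"].format(a=a, b=b)

def render_yield_row(lang: str, value_pct: float) -> str:
    """Translated label, same number in every language."""
    return f"{STRINGS[lang]['yield']}: {value_pct:.2f}%"

for lang in STRINGS:
    print(render_title(lang, "AAA", "BBB"))
    print(render_yield_row(lang, 1.5))
```

Adding a language then means adding one more entry to the string table and regenerating, which matches the workflow described above.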
Schema markup at scale
- Every page includes FinancialProduct, FAQPage, and BreadcrumbList schema markup.
- Because it’s programmatic, you write the schema once and it applies everywhere – something impossible to do manually on 287k pages.
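"Write the schema once" can look like the sketch below: small functions that turn page data into schema.org JSON-LD. The function names and the example.com URLs are placeholders, and FinancialProduct is omitted for brevity; this is not the site's actual code.

```python
import json

def breadcrumb_jsonld(crumbs):
    """crumbs: list of (name, url) pairs from the site root to the current page."""
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            {"@type": "ListItem", "position": i, "name": name, "item": url}
            for i, (name, url) in enumerate(crumbs, start=1)
        ],
    }

def faq_jsonld(questions):
    """questions: list of (question, answer) pairs shown on the page."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {"@type": "Question", "name": q,
             "acceptedAnswer": {"@type": "Answer", "text": a}}
            for q, a in questions
        ],
    }

def script_tag(data):
    """Serialize to a JSON-LD script tag; escape '</' so text cannot close the tag early."""
    payload = json.dumps(data).replace("</", "<\\/")
    return f'<script type="application/ld+json">{payload}</script>'

crumbs = [("Home", "https://example.com/"),
          ("AAA vs BBB", "https://example.com/en/compare/aaa-vs-bbb/")]
print(script_tag(breadcrumb_jsonld(crumbs)))
```

Each page calls the same functions with its own data, so a fix to the markup propagates to every page on the next build.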
Stability
- The site is fully static and generated from structured data at build time, so there’s very little that can go wrong at runtime.
- The site serves plain HTML files – no server‑side processing, no DB queries on page load, nothing to crash.
The hidden problem: Google won’t index 287k pages from a brand‑new domain
- Indexed pages: ~2,500 (≈ 0.9 % index rate).
- Root causes:
- No domain authority – zero backlinks, no brand recognition.
- Content similarity – narratives followed similar patterns, triggering Google’s “helpful content” filter for thin pages.
- Crawl budget – Googlebot was only visiting 200‑500 pages per day; at that rate it would take years to crawl everything, and Google won’t crawl pages it deems low‑value.
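The numbers above can be checked with a few lines of arithmetic. The sketch assumes, optimistically, that every crawled URL is a new page and that nothing is ever recrawled.

```python
def days_to_crawl(total_pages: int, pages_per_day: int) -> float:
    """Days needed to visit every page once at a constant crawl rate."""
    return total_pages / pages_per_day

total = 287_000
for rate in (200, 500):
    days = days_to_crawl(total, rate)
    print(f"{rate}/day -> {days:.0f} days (~{days / 365:.1f} years)")
# 200/day -> 1435 days (~3.9 years)
# 500/day -> 574 days (~1.6 years)

index_rate = 2_500 / total
print(f"index rate: {index_rate:.1%}")  # index rate: 0.9%
```

Even at the top of the observed range, a single full pass takes more than a year and a half, which is why reducing the page count is on the table at all.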
The counter‑intuitive solution: fewer pages, not more
1. Cut pages ruthlessly
   - Reduce from 287k to 5k–30k pages per language.
   - Keep only comparison pairs with real search demand (validated via Search Console & keyword tools).
2. Thicken remaining pages
   - Add sector context, historical trend analysis, dividend deep‑dives, and custom AI‑generated insights that are truly specific to each stock pair.
   - Aim for each page to stand alone as a useful resource.
3. Build backlinks
   - Start with directory submissions (boring but necessary).
   - Move to industry‑specific directories.
   - Conduct targeted outreach to achieve Domain Rating 15+ within 6 months.
4. Optimize crawl budget
   - Split the massive sitemap into smaller, topic‑based sitemaps.
   - Improve internal linking so Googlebot discovers important pages via site structure, not just the sitemap.
   - Use `noindex` tags for low‑value pages.
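Splitting the sitemap by topic can be sketched with the standard library alone. The sketch assumes "topic" means something like a sector label attached to each page. The file naming and the example.com base URL are placeholders; the 50,000-URL cap per file comes from the sitemaps.org protocol.

```python
from collections import defaultdict
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000  # per-file URL limit in the sitemaps.org protocol

def build_sitemaps(pages, base_url):
    """pages: iterable of (topic, path). Returns {filename: xml_string}."""
    by_topic = defaultdict(list)
    for topic, path in pages:
        by_topic[topic].append(path)

    files = {}
    for topic, paths in sorted(by_topic.items()):
        # Chunk each topic so no single file exceeds the protocol limit.
        for n, start in enumerate(range(0, len(paths), MAX_URLS), start=1):
            urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
            for path in paths[start:start + MAX_URLS]:
                url = ET.SubElement(urlset, "url")
                ET.SubElement(url, "loc").text = base_url + path
            files[f"sitemap-{topic}-{n}.xml"] = ET.tostring(urlset, encoding="unicode")

    # A sitemap index lists every per-topic file.
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for name in list(files):
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = f"{base_url}/{name}"
    files["sitemap-index.xml"] = ET.tostring(index, encoding="unicode")
    return files

files = build_sitemaps(
    [("tech", "/en/compare/aaa-vs-bbb/"), ("energy", "/en/compare/ccc-vs-ddd/")],
    "https://example.com",
)
print(sorted(files))
```

Per-topic files also make Search Console's per-sitemap coverage numbers more useful, since each file maps to one slice of the site.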
Current metrics (transparent snapshot)
| Metric | Current |
|---|---|
| Total pages | 287,000 |
| Pages indexed | ~2,500 |
| Index rate | 0.9 % |
| Domain Rating | 0 |
| Backlinks | 0 |
| Monthly revenue | $0 |
| Monthly cost | ~ $50 |
Not pretty, but the infrastructure works:
- Data pipeline ✅
- Content generation ✅
- Site loads fast, passes Core Web Vitals ✅
- Proper schema markup on every page ✅
The only thing broken is Google’s trust – and that’s a solvable problem.
Programmatic SEO – Lessons Learned & Playbook
1. Indexing & Growth
- Indexing is the bottleneck. Once pages start getting indexed, growth should compound, because every additional indexed page adds its own long‑tail traffic.
- Each indexed page targets low‑competition keywords (virtually nobody else is ranking).
- With 12 languages the addressable market is massive.
2. Start Smaller
“If I could do it over, I’d launch with 5,000 pages instead of 287,000.”
- Get a modest set of pages indexed first.
- Prove the model works before scaling.
- Launching with hundreds of thousands of pages on a brand‑new domain is essentially asking Google to ignore you.
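Choosing a modest launch set can be as simple as ranking pairs by a demand signal and keeping the top N. In this sketch the `impressions_by_pair` mapping stands in for an export from Search Console or a keyword tool, and the figures are invented for the example.

```python
def select_launch_pairs(impressions_by_pair, limit=5_000, min_impressions=1):
    """Return up to `limit` pairs, highest demand first, dropping pairs below the floor."""
    ranked = sorted(
        ((pair, n) for pair, n in impressions_by_pair.items() if n >= min_impressions),
        # Sort by demand descending, then by pair name so ties are deterministic.
        key=lambda item: (-item[1], item[0]),
    )
    return [pair for pair, _ in ranked[:limit]]

# Invented demand figures for placeholder tickers.
demo = {
    ("AAA", "BBB"): 120,
    ("AAA", "CCC"): 0,
    ("BBB", "CCC"): 45,
    ("CCC", "DDD"): 45,
}
print(select_launch_pairs(demo, limit=2))
# [('AAA', 'BBB'), ('BBB', 'CCC')]
```

Pairs with no measurable demand never become pages, which keeps the launch set small enough for Google to crawl in days instead of years.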
3. Your Data Source = Your Moat
- Anyone can copy a template; the real advantage is a comprehensive, hard‑to‑replicate data source.
- Example: Financial data via `yfinance` – free, structured, covers thousands of entities.
- Ask yourself: What data can I obtain that others can’t easily replicate at scale?
4. Template + AI Hybrid – The Sweet Spot
| Approach | Pros | Cons |
|---|---|---|
| Pure template‑based pages | Fast, cheap | Thin content → flagged by Google |
| Pure AI‑generated pages | Highly unique | Expensive, inconsistent quality |
| Hybrid (structured data + AI narrative) | Scalable, unique, higher quality | Requires careful orchestration |
- Render structured data with templates.
- Use AI to generate narrative sections that add value and uniqueness.
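One possible way to wire the hybrid together is to derive pair-specific facts from the structured data first, then hand only those facts to the model. The dictionary keys, the prompt wording, and the sample values below are assumptions for illustration, not the prompts actually used on the site.

```python
def build_prompt(a: dict, b: dict) -> str:
    """a, b: records with 'symbol', 'sector', and 'dividend_yield_pct' (assumed keys)."""
    # Derived facts are what make the narrative specific to this pair.
    higher = a if a["dividend_yield_pct"] >= b["dividend_yield_pct"] else b
    gap = abs(a["dividend_yield_pct"] - b["dividend_yield_pct"])
    same_sector = a["sector"] == b["sector"]
    facts = [
        f"{a['symbol']} dividend yield: {a['dividend_yield_pct']:.2f}%",
        f"{b['symbol']} dividend yield: {b['dividend_yield_pct']:.2f}%",
        f"Higher yield: {higher['symbol']} by {gap:.2f} percentage points",
        "Same sector: " + ("yes" if same_sector else "no"),
    ]
    return "Write a short comparison using only these facts:\n- " + "\n- ".join(facts)

# Placeholder records with invented values.
prompt = build_prompt(
    {"symbol": "AAA", "sector": "Tech", "dividend_yield_pct": 1.5},
    {"symbol": "BBB", "sector": "Energy", "dividend_yield_pct": 3.0},
)
print(prompt)
```

Computing the comparisons in code and passing them as facts keeps the numbers under the template's control and leaves the model responsible only for wording.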
5. Budget for Backlinks from Day One
- Great content won’t earn links automatically when the domain has zero authority.
- Make backlink acquisition a core part of the launch plan, not an afterthought.
6. Patience Is a Must
- Programmatic SEO is not a “launch and rank tomorrow” strategy.
- Think of it as building an infrastructure that compounds over time.
- The first 6 months will feel slow—that’s normal.
7. Documentation & Transparency
- I’m documenting everything:
- Full technical architecture
- Every prompt used for content generation
- Monetization roadmap
- Mistakes and lessons learned
8. The Programmatic SEO Blueprint
A complete guide that includes:
- Niche selection
- Data architecture
- AI content generation workflow
- Astro/Next.js implementation details
- SEO infrastructure & indexing solutions
- Monetization strategy
- All code examples are MIT‑licensed
9. Final Thoughts & Call to Action
- If you’re thinking about building a programmatic SEO site, go for it—just start smaller.
- Follow me for more updates on building programmatic SEO sites and on how the indexing situation evolves.
Prepared by the author of the Programmatic SEO Blueprint.