Google Only Indexed 2% of My 100,000-Page Site. Here's What I'm Doing About It.
Source: Dev.to
The Problem
Current GSC Indexing Report
| Status | Pages | What It Means |
|---|---|---|
| Crawled — not indexed | 51,061 | Google visited, read the page, and said “no thanks.” |
| Discovered — not indexed | 28,016 | Google knows the URL exists but won’t even bother crawling it. |
| Indexed | 1,920 | The 2% that made the cut. |
| Redirects | 2,648 | Pages I intentionally removed. |
The most painful line is “Crawled — not indexed.” Googlebot is spending crawl budget on 51 k pages, processing them, and deciding they aren’t worth indexing. This is not a discovery problem – it’s a quality problem.
Even worse: the indexed count dropped from 2,246 to 1,920 in one week. Google is actively de‑indexing pages it previously accepted.
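To make the arithmetic behind those numbers explicit, here is a small sketch that recomputes the index rate and the week-over-week drop. The counts are copied from the report above; `TOTAL_PAGES` is an assumption based on the round 100,000 figure in the title, and the variable names are mine.

```python
# Counts copied from the GSC report above.
# TOTAL_PAGES is an assumption: the round figure from the article title.
TOTAL_PAGES = 100_000
PREVIOUS_INDEXED = 2_246  # indexed count one week earlier
report = {
    "crawled_not_indexed": 51_061,
    "discovered_not_indexed": 28_016,
    "indexed": 1_920,
    "redirects": 2_648,
}


def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


index_rate = pct(report["indexed"], TOTAL_PAGES)
weekly_change = report["indexed"] - PREVIOUS_INDEXED
weekly_change_pct = pct(weekly_change, PREVIOUS_INDEXED)

print(f"Index rate: {index_rate}%")  # 1.92%
print(f"Weekly change: {weekly_change} pages ({weekly_change_pct}%)")  # -326 pages (-14.51%)
```

Losing roughly 14.5% of the indexed set in a single week is what makes the trend more worrying than the absolute count.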
Root Causes I’ve Identified
Thin‑Content Signals
- Original pages: 200‑300 words of templated analysis.
- For a brand‑new domain, that’s insufficient to prove unique value.
- Google's increasingly AI-driven indexing appears to treat thousands of shallow, similarly structured pages as low quality.
- What I’ve done: expanded pages to 600‑800 words with deeper, ticker‑specific analysis. Reputation damage, however, takes time to reverse.
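One way to spot pages that still look templated is to combine a word-count floor with a similarity check against a sibling page. The sketch below uses Jaccard similarity over three-word shingles. It is only an illustration, not how Google scores pages: the function names are hypothetical, `min_words=600` mirrors the 600-word target above, and `max_similarity=0.5` is an arbitrary assumption.

```python
def shingles(text: str, size: int = 3) -> set[tuple[str, ...]]:
    """Return the set of overlapping word n-grams ("shingles") in text."""
    words = text.lower().split()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a: str, b: str, size: int = 3) -> float:
    """Jaccard similarity of two texts' shingle sets (0.0 = disjoint, 1.0 = identical)."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def looks_thin(text: str, sibling: str,
               min_words: int = 600, max_similarity: float = 0.5) -> bool:
    """Flag a page that is short OR nearly identical to a sibling page.

    Both thresholds are assumptions chosen for illustration.
    """
    too_short = len(text.split()) < min_words
    too_similar = jaccard(text, sibling) > max_similarity
    return too_short or too_similar


# Two templated sentences that differ only in the ticker symbol.
a = "AAPL closed higher today on strong volume"
b = "MSFT closed higher today on strong volume"
print(round(jaccard(a, b), 2))  # 0.67
```

Swapping only the ticker leaves four of six distinct shingles shared, which is the kind of overlap a templated site produces at scale.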
Zero Domain Authority
- No backlinks in any tool I’ve checked.
- For reference, one audit described a site with 8M discovered pages but only 650k indexed; the culprit was a lack of trust signals.
- What I’m doing: published five articles on Medium, Dev.to, and Hashnode (still a drop in the ocean) and plan many more that naturally link back.
Crawl‑Budget Economics
- Google allocates crawl budget based on perceived importance.
- With 100 k+ URLs on a domain it doesn’t trust, most budget is wasted on pages that will never index.
- This creates a vicious cycle: low authority → less crawling → fewer indexed pages → less traffic → still low authority.
Shifting the Programmatic‑SEO Playbook
1. Subtract, then Add
- Removed all comparison pages (the thinnest content).
- Those 2,648 redirects in GSC prove the subtraction.
- Principle: Fewer, better pages > more pages.
2. Enrich Every Stock Page
- Added unique sections that can’t be auto‑generated:
  - Related news
  - Analyst ratings
  - Earnings timelines
  - Market‑context commentary
- Goal: convince Google’s quality classifiers that each page is a legitimate analysis, not just a data table with a paragraph bolted on.
3. Strengthen Internal Linking
- Built widgets: “Related Stocks,” “Popular in This Sector.”
- Cross‑link stock pages, sector pages, and ETF pages.
- Helps Google understand topical relationships and spreads whatever authority the domain has.
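A "Related Stocks" widget can be sketched as a function that links each ticker to a few peers in the same sector. This is a simplified illustration rather than the actual StockVS implementation: `build_related` and the ticker-to-sector mapping are hypothetical. Peers are picked in a ring so that every page receives inbound links, instead of all pages pointing at the alphabetically first tickers.

```python
from collections import defaultdict


def build_related(sectors: dict[str, str], limit: int = 3) -> dict[str, list[str]]:
    """Map each ticker to up to `limit` same-sector peers.

    `sectors` maps ticker -> sector name (hypothetical input shape).
    """
    by_sector: dict[str, list[str]] = defaultdict(list)
    for ticker, sector in sorted(sectors.items()):
        by_sector[sector].append(ticker)

    related: dict[str, list[str]] = {}
    for peers in by_sector.values():
        n = len(peers)
        k = min(limit, n - 1)  # a ticker never links to itself
        for i, ticker in enumerate(peers):
            # Walk forward around the sorted peer list, wrapping at the end.
            related[ticker] = [peers[(i + step) % n] for step in range(1, k + 1)]
    return related


sectors = {"AAPL": "Tech", "MSFT": "Tech", "NVDA": "Tech", "XOM": "Energy"}
links = build_related(sectors, limit=2)
print(links["AAPL"])  # ['MSFT', 'NVDA']
print(links["NVDA"])  # ['AAPL', 'MSFT']
print(links["XOM"])   # []
```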
4. Build External Authority
- Five articles live across three platforms, each linking back to StockVS.
- Future content will focus on the journey of building a large‑scale site, resonating with developers and SEOs.
- This article itself is part of that strategy.
5. Leverage Multilingual Arbitrage
- GSC data shows non‑English pages get more impressions than English ones.
- Dutch pages lead, followed by German and Polish.
- Competition for “[ticker] analyse” (Dutch) is dramatically lower than “[ticker] analysis” (English).
- Validation that the multilingual approach can win in less‑competitive SERPs.
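The per-language comparison can be reproduced from a GSC performance export by grouping impressions on the language segment of each URL. The sketch assumes a hypothetical URL layout in which localized pages live under a prefix such as `/nl/` or `/de/` and English pages have no prefix. The sample rows are made-up placeholders, not real data.

```python
from collections import Counter
from urllib.parse import urlparse

# Assumption: localized pages use these path prefixes; anything else is English.
KNOWN_LANGS = {"nl", "de", "pl"}


def language_of(url: str, default: str = "en") -> str:
    """Infer the page language from the first path segment of the URL."""
    first_segment = urlparse(url).path.strip("/").split("/")[0]
    return first_segment if first_segment in KNOWN_LANGS else default


def impressions_by_language(rows: list[tuple[str, int]]) -> Counter:
    """Sum impressions per language for (url, impressions) rows."""
    totals: Counter = Counter()
    for url, impressions in rows:
        totals[language_of(url)] += impressions
    return totals


# Placeholder rows for illustration only.
rows = [
    ("https://example.com/nl/aapl", 30),
    ("https://example.com/de/aapl", 20),
    ("https://example.com/aapl", 10),
    ("https://example.com/nl/msft", 5),
]
print(impressions_by_language(rows).most_common())
# [('nl', 35), ('de', 20), ('en', 10)]
```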
What the New Indexing Reality Looks Like
- Content depth – go beyond what any template can auto‑generate.
- Authority signals – backlinks, brand searches, engagement metrics.
- Technical hygiene – clean sitemaps, proper canonicals, fast load times, no crawl traps.
- Patience – Google’s re‑evaluation cycle for domains recovering from thin‑content signals is slow.
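For the sitemap part of technical hygiene, one reasonable rule is to list only URLs that are self-canonical and not redirected, so Google is never pointed at pages I don't want indexed. The sketch below builds such a sitemap with the standard library. The page records and their `url`, `canonical`, and `redirected` fields are hypothetical.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages: list[dict]) -> str:
    """Return sitemap XML listing only self-canonical, non-redirected pages.

    Each page dict is assumed to have 'url', 'canonical', and 'redirected' keys.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        if page["redirected"] or page["canonical"] != page["url"]:
            continue  # skip removed pages and non-canonical duplicates
        url_el = ET.SubElement(urlset, "url")
        ET.SubElement(url_el, "loc").text = page["url"]
    return ET.tostring(urlset, encoding="unicode")


pages = [
    {"url": "https://example.com/aapl", "canonical": "https://example.com/aapl", "redirected": False},
    {"url": "https://example.com/aapl-vs-msft", "canonical": "https://example.com/aapl", "redirected": True},
]
print(build_sitemap(pages))
```

The sitemap protocol caps a single file at 50,000 URLs, so a site of this size needs several sitemap files tied together by a sitemap index.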
Weekly Metrics I Track
| Metric | Why It Matters |
|---|---|
| Crawled‑not‑indexed count | Should decrease as content quality improves. |
| Indexed page count | The north‑star metric. |
| Impressions on non‑English pages | Shows multilingual arbitrage effectiveness. |
| Domain Rating (Ahrefs) | Currently 0; any movement indicates growing authority. |
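The weekly check can be reduced to comparing two snapshots and asking whether each metric moved in the direction I want. This is a minimal sketch: the metric keys and the `review` function are hypothetical names, and only metrics present in both snapshots are evaluated.

```python
# Desired direction per metric: -1 means "should go down", +1 means "should go up".
DESIRED_DIRECTION = {
    "crawled_not_indexed": -1,
    "indexed": +1,
    "non_english_impressions": +1,
    "domain_rating": +1,
}


def review(previous: dict[str, float], current: dict[str, float]) -> dict[str, str]:
    """Label each metric found in both snapshots as improving, worsening, or flat."""
    verdicts: dict[str, str] = {}
    for metric, direction in DESIRED_DIRECTION.items():
        if metric not in previous or metric not in current:
            continue  # no data for this metric in one of the snapshots
        delta = current[metric] - previous[metric]
        if delta == 0:
            verdicts[metric] = "flat"
        elif delta * direction > 0:
            verdicts[metric] = "improving"
        else:
            verdicts[metric] = "worsening"
    return verdicts


# Indexed counts from this article: 2,246 last week, 1,920 this week.
print(review({"indexed": 2246}, {"indexed": 1920}))  # {'indexed': 'worsening'}
```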
Call to Action
If you’re building a programmatic‑SEO site and hitting the same indexing wall, let’s talk. The old “just build more pages” playbook is dead. In 2026, success hinges on:
- Depth – each page must earn its place individually.
- Authority – earn backlinks and brand trust.
- Technical excellence – keep Google’s crawlers happy.
- Patience – give Google time to reassess.
What’s working for you?
I write about building large‑scale SEO sites, AI‑powered content generation, and the tools I use to manage it all. If you're into programmatic SEO, check out my **Programmatic SEO Blueprint** — it covers the architecture, data pipelines, and multilingual strategy I use for StockVS.
For AI‑powered SEO workflows, I've also built a set of **Claude Skills** that handle everything from content auditing to cross‑platform publishing.