How Google shows search results in 0.3 seconds (they cheat, and you should too)
Source: Dev.to
You type “best jollof rice in Lagos” into Google. Three hundred milliseconds later, you have ten perfect results. In that time, Google supposedly searched through hundreds of billions of web pages, ranked them by relevance, personalized results based on your location and history, and sent everything back to your phone.
Except they didn’t. That’s physically impossible. Here’s what actually happened, and why it changes everything about how you should build fast applications.
The illusion of real‑time search
When you press search, you’re not actually searching the internet. You’re searching Google’s pre‑computed index of the internet, which is a completely different thing.
Imagine a library with one million books and someone asks you to find every book that mentions “jollof rice.” Scanning every page would take days or weeks. Instead, you could spend months beforehand creating an index—a massive book that lists every word that appears in the library and exactly which books and pages contain that word.
When someone asks for “jollof rice,” you just open the index to the “jollof” section, see the list of books and page numbers, do the same for “rice,” find the books that appear in both lists, and hand them the answer in seconds. You did the hard work before anyone asked the question.
That’s exactly what Google does, just at planetary scale.
What actually happens when you search
Let’s break down the steps that occur in those three hundred milliseconds:
- Front‑end servers (GFE) – Your query hits Google’s globally distributed front‑end servers, located close to you (e.g., a West African GFE if you’re in Lagos).
- Spelling correction – Google checks for typos and auto‑corrects if needed. This runs on Google’s internal infrastructure called Borg (the precursor to Kubernetes).
- Index shards – The corrected query is sent to Google’s index shards. The index is split into thousands of shards distributed across data centers worldwide. Each shard holds a portion of the inverted index plus the actual document data.
- Top‑k fetch – Each relevant shard quickly fetches the top ~1,000 possible documents that match the query. The index is already sorted and optimized for this lookup.
- Ranking – All those results are sent to Google’s ranking system, which uses over two hundred signals (location, search history, freshness, backlinks, mobile‑friendliness, page speed, etc.) to sort them in milliseconds.
- Formatting – The top ten results are formatted and sent back to your browser. The whole process takes under half a second.
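The fan-out in the steps above can be sketched in miniature. This is not Google's actual architecture, just a toy illustrating the pattern: each shard holds a slice of the index and returns its best candidates with a relevance score, and a ranker merges them into a single top-k list. The shard contents and scores are invented for the example.

```python
import heapq

# Hypothetical shards: each maps a term to (doc_id, relevance_score) postings.
SHARDS = [
    {"jollof": [("doc1", 0.9), ("doc3", 0.4)]},
    {"jollof": [("doc7", 0.8)], "rice": [("doc7", 0.6)]},
]

def search(term, k=10):
    # Fan the query out to every shard and collect candidate documents.
    candidates = []
    for shard in SHARDS:
        candidates.extend(shard.get(term, []))
    # "Ranking": keep only the k highest-scoring candidates.
    return heapq.nlargest(k, candidates, key=lambda pair: pair[1])

print(search("jollof"))  # [('doc1', 0.9), ('doc7', 0.8), ('doc3', 0.4)]
```

The real system scores with hundreds of signals instead of one number, but the shape is the same: cheap parallel lookups, then a merge-and-rank step over a few thousand candidates rather than billions of pages.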
Critical insight: Google isn’t searching billions of pages in real time; it’s looking up pre‑computed results in an index that is continuously updated in the background.
The inverted index: Google’s secret weapon
An inverted index stores words and the list of documents that contain them, rather than storing whole documents.
Normal storage
Document 1 contains “I love jollof rice”
Document 2 contains “Best jollof spots in Lagos”
Inverted index
- “jollof” appears in: Document 1, Document 2
- “rice” appears in: Document 1
- “Lagos” appears in: Document 2
When someone searches for “jollof Lagos,” the system instantly knows Document 2 is the only one containing both words—no document scanning required.
Google’s inverted index tracks hundreds of billions of web pages and trillions of words. Because it is pre‑computed and sharded across thousands of servers, lookups are extremely fast.
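A minimal inverted index fits in a few lines of Python. This toy version uses two documents like the example above (Document 2's text is invented here); building the index is the slow, up-front step, and answering a multi-word query is just an intersection of posting lists.

```python
from collections import defaultdict

# Toy corpus: document id -> text.
docs = {
    1: "I love jollof rice",
    2: "best jollof spots in lagos",
}

# Build step (slow, done once): word -> set of document ids containing it.
index = defaultdict(set)
for doc_id, text in docs.items():
    for word in text.lower().split():
        index[word].add(doc_id)

def lookup(query):
    # Query step (fast): intersect the posting lists of every query term.
    postings = [index[w] for w in query.lower().split()]
    return set.intersection(*postings) if postings else set()

print(lookup("jollof lagos"))  # {2}
```

Note that `lookup` never touches the document text at query time; all the scanning happened when the index was built.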
Incremental indexing: How Google stays current
The web changes constantly—new pages appear, old pages update, sites go offline. Rebuilding the entire index from scratch each time would be impossible.
Google uses incremental indexing:
- Crawlers continuously browse the web, looking for new or changed content.
- When a change is detected, only the relevant portions of the index are updated.
- Popular news sites are crawled every few minutes; smaller sites may be crawled every few days or weeks.
The index is never perfectly up‑to‑date, but it’s current enough that users don’t notice. Breaking news can appear in search results within minutes; a small blog post might take hours or days.
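Incremental indexing can be sketched on top of the same toy index: when a crawler detects that one document changed, only that document's postings are touched. The diff-based update below is an illustrative pattern, not Google's implementation.

```python
from collections import defaultdict

index = defaultdict(set)   # word -> set of document ids
current_text = {}          # last-indexed text per document

def reindex(doc_id, new_text):
    # Incremental update: adjust only this document's postings,
    # never rebuild the whole index.
    old_words = set(current_text.get(doc_id, "").lower().split())
    new_words = set(new_text.lower().split())
    for word in old_words - new_words:
        index[word].discard(doc_id)   # word removed from the page
    for word in new_words - old_words:
        index[word].add(doc_id)       # word added to the page
    current_text[doc_id] = new_text

reindex(1, "jollof rice recipe")
reindex(1, "jollof rice history")  # only "recipe"/"history" postings change
```

The cost of an update is proportional to how much one page changed, not to the size of the index, which is what makes continuous crawling feasible.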
Why 99% of apps are slow
Most applications are slow because they perform full database scans or complex calculations on every request. Google’s approach is the opposite: pre‑compute everything you possibly can.
Example: Food‑delivery app
Slow approach
- User opens app.
- App gets user’s location.
- App queries the database: “Find all restaurants, calculate distance, filter within 5 km, sort by rating.”
- Database scans thousands of rows, performs distance calculations, and returns results after a few seconds.
Google‑style approach
- Every few minutes, pre‑compute which restaurants are near every major neighborhood.
- Store these lists in fast memory (e.g., Redis).
- When the user opens the app, look up the pre‑computed list for the relevant neighborhood.
- Results appear in a few hundred milliseconds.
The expensive work (distance calculations, filtering, sorting) is moved to background jobs, not request time.
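The Google-style approach above can be sketched as follows. All names and coordinates are invented, a plain dict stands in for Redis, and the distance math is a rough city-scale approximation; the point is the split between a background `precompute` job and a cheap request-time lookup.

```python
import math

# Hypothetical data; in production this would live in a database and Redis.
RESTAURANTS = [
    {"name": "Mama Put", "lat": 6.45, "lon": 3.45, "rating": 4.8},
    {"name": "Suya Spot", "lat": 6.60, "lon": 3.35, "rating": 4.5},
]
NEIGHBORHOODS = {"Lekki": (6.44, 3.47), "Ikeja": (6.60, 3.35)}
cache = {}  # stand-in for Redis: neighborhood -> sorted restaurant list

def km_between(a, b):
    # Rough equirectangular distance; fine for a city-scale sketch.
    dlat = (a[0] - b[0]) * 111
    dlon = (a[1] - b[1]) * 111 * math.cos(math.radians(a[0]))
    return math.hypot(dlat, dlon)

def precompute():
    # Background job: all the expensive work happens here, every few minutes.
    for name, center in NEIGHBORHOODS.items():
        nearby = [r for r in RESTAURANTS
                  if km_between(center, (r["lat"], r["lon"])) <= 5]
        cache[name] = sorted(nearby, key=lambda r: -r["rating"])

def nearby_restaurants(neighborhood):
    # Request time: a single cache lookup, no scanning or distance math.
    return cache.get(neighborhood, [])

precompute()
```

The request handler does O(1) work regardless of how many restaurants exist; growth in the catalog only makes the background job slower, which users never see.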
Real examples where pre‑computation matters
- YouTube recommendations – Algorithms run continuously in the background, updating recommendation lists for hundreds of millions of users. When you open the app, you see a list that was pre‑computed minutes or hours ago.
- Instagram feed – The feed is assembled in the background based on who you follow and your engagement history. Pull‑to‑refresh simply fetches the pre‑built feed.
- E‑commerce search (Amazon) – Inverted indexes of products by keywords, categories, and attributes allow instant lookups instead of scanning the entire product catalog.
- Banking dashboards – Account balances and transaction histories are aggregated and cached for dashboard views; the main transaction database handles new writes, while reporting views are pre‑computed.
The trade‑off: freshness vs. speed
Pre‑computation introduces a single major trade‑off: the data may be slightly stale. If Google’s index was updated ten minutes ago and a website changed five minutes ago, the search results won’t reflect that change yet.
For most use cases, this is acceptable. Users expect fast responses, not perfect real‑time accuracy. The slight lag is a small price to pay for the dramatic performance gains that pre‑computed indexes provide.
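The trade-off can be made explicit in code by giving every cached answer a staleness budget. This is a generic read-through cache sketch (the key name and budget are arbitrary): serve the pre-computed value while it is fresh enough, recompute only when it expires.

```python
import time

STALENESS_BUDGET = 600  # seconds of staleness the product will tolerate
cache = {}              # key -> (value, computed_at)

def get(key, compute):
    entry = cache.get(key)
    now = time.time()
    if entry and now - entry[1] < STALENESS_BUDGET:
        return entry[0]   # fast path: pre-computed, possibly slightly stale
    value = compute()     # slow path: recompute and refresh the cache
    cache[key] = (value, now)
    return value

calls = []
get("top10", lambda: calls.append(1) or "results")
get("top10", lambda: calls.append(1) or "results")
# the expensive compute ran once; the second call was served from cache
```

Tuning `STALENESS_BUDGET` is the whole trade-off in one number: smaller means fresher data and more recomputation, larger means faster and cheaper but staler results.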