Database-Level Caching with Materialized Views and Summary Tables: The Art of Precomputed Truth
Source: Dev.to
Background
Let me rewind to a Tuesday afternoon I’d rather forget.
We had a Rails monolith that had grown fat and happy over five years. The dashboard—a beautiful, chart‑filled monster—was running a 12‑second query every time the CEO clicked “refresh.” Twelve seconds of GROUP BY, COUNT(DISTINCT), and LEFT JOIN hell across a million‑row events table.
The CEO didn’t yell. He just stared at the spinning cursor and said, “This used to be fast.” That silence was worse.
I’d already tried everything.
- Redis caching? Stale data on first load.
- Counter caches? Fine for counts, useless for complex roll‑ups.
- Pagination? The dashboard needed totals.
I was staring at the ceiling at 2 AM when I whispered: “What if I just… pre‑compute the answer?”
Materialized views and summary tables aren’t new—they’re as old as data warehouses. But in Rails, with ActiveRecord’s object‑relational map guiding our every thought, we forget that the database itself can be a cache: a smart, transactional, ACID‑compliant cache that never lies.
This is the journey of learning to think in sets, not objects. Senior Rails devs who’ve optimized N+1 queries to death: your next frontier is the precomputed column.
The Lie We Tell Ourselves: “Indexes Are Enough”
We throw indexes at everything—composite, partial, expression‑based. And indexes are magic—until they aren’t. When your query aggregates millions of rows, the database still has to read those rows. Even with a covering index, you’re doing work per row, per request.
I remember running EXPLAIN ANALYZE on that CEO dashboard:
Aggregate (cost=12483.67..12483.68 rows=1 width=32)
-> Seq Scan on events (cost=0.00..10483.33 rows=400068 width=32)
Filter: (created_at > '2024-01-01')
Seq scan. Four hundred thousand rows. Every. Single. Request.
Indexes reduced the scan to an index scan, but the aggregate still looped over hundreds of thousands of index entries. The database was doing the same work over and over—like a chef who grates a block of cheese for every single omelet, instead of pre‑grating a bowl in the morning.
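The cheese analogy can be made concrete in plain Ruby. This is a toy model with no database and purely illustrative numbers and names, not a benchmark: aggregating on every request costs O(rows) each time, while a precomputed summary is a single hash lookup.

```ruby
# 400k fake events, spread over 30 "days"
EVENTS = Array.new(400_000) { |i| { day: i % 30, amount: 1 } }

# The "12-second dashboard": scan and aggregate on every request.
def revenue_per_request(events, day)
  events.select { |e| e[:day] == day }.sum { |e| e[:amount] }
end

# The "pre-grated cheese bowl": aggregate once, then just read the answer.
SUMMARY = EVENTS.group_by { |e| e[:day] }
                .transform_values { |es| es.sum { |e| e[:amount] } }

def revenue_precomputed(day)
  SUMMARY.fetch(day, 0)
end
```

Both paths return the same numbers; only the work per request differs.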
That’s when I discovered materialized views: the pre‑grated cheese bowl.
The First Step: Materialized Views as Heavy‑Lifters
A materialized view is a query whose results are stored physically on disk. You refresh it on a schedule, or after relevant data changes. Reads are instantaneous—milliseconds instead of seconds.
Here’s the one that saved my CEO’s dashboard:
CREATE MATERIALIZED VIEW daily_sales_summary AS
SELECT
date(created_at) AS day,
product_id,
COUNT(*) AS units_sold,
SUM(amount_cents) AS revenue_cents,
COUNT(DISTINCT user_id) AS unique_buyers
FROM orders
WHERE status = 'completed'
GROUP BY day, product_id;
In Rails:
class DailySalesSummary < ApplicationRecord
  scope :recent, -> { where(day: 30.days.ago..Date.today) }
end
# Dashboard query becomes:
revenue = DailySalesSummary.recent.sum(:revenue_cents)
From 12 seconds to 42 ms. The CEO’s cursor stopped spinning. I felt like a wizard.
But materialized views come with a curse: staleness. The data is only as fresh as your last REFRESH. We started with a cron job running every hour—fine for a dashboard, not fine for a real‑time leaderboard.
That’s when I learned about REFRESH MATERIALIZED VIEW CONCURRENTLY (available since PostgreSQL 9.4, provided the view has a unique index), which lets readers keep querying the old contents while the refresh runs. It still recomputes the whole view, though. For truly incremental updates, you need the art of the summary table.
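For the record, here is what the refresh side looks like in PostgreSQL (the index name is illustrative); the unique index is what makes the concurrent variant legal:

```sql
-- CONCURRENTLY requires a unique index covering every row of the view
CREATE UNIQUE INDEX daily_sales_summary_uniq
  ON daily_sales_summary (day, product_id);

-- Rebuilds the view without locking out readers, who keep seeing the
-- old contents until the refresh commits
REFRESH MATERIALIZED VIEW CONCURRENTLY daily_sales_summary;
```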
The Art of the Summary Table: Listen, Then Update
A summary table (a.k.a. aggregation table) is a regular PostgreSQL table you maintain with triggers or ActiveRecord callbacks. It’s a materialized view you update incrementally—only the rows that change.
We built one for a gamification feature: user points from dozens of actions (comments, likes, shares). The raw user_actions table grew by 50 k rows/day. Real‑time leaderboard queries were killing us.
Migration
# db/migrate/create_user_points_summaries.rb
create_table :user_points_summaries, id: false do |t|
  t.integer :user_id, null: false
  t.integer :total_points, default: 0
  t.integer :daily_points, default: 0
  t.integer :weekly_points, default: 0
  t.datetime :last_calculated_at
  t.timestamps
end
add_index :user_points_summaries, :user_id, unique: true
Incremental update
class UserAction < ApplicationRecord
  after_create_commit :update_summary

  private

  def update_summary
    UserPointsSummary.transaction do
      summary = UserPointsSummary.find_or_create_by!(user_id: user_id)
      summary.lock!  # SELECT ... FOR UPDATE; must run inside a transaction
      summary.total_points += point_value
      summary.daily_points = calculate_daily_points
      summary.weekly_points = calculate_weekly_points
      summary.last_calculated_at = Time.current
      summary.save!
    end
  end
end
Notice the lock? Without it, two concurrent actions on the same user can race and lose an increment. lock! issues SELECT … FOR UPDATE inside the transaction, serializing updates per user. That’s fine because concurrent actions from a single user are rare, but for hot rows or global aggregates you’d need a different pattern (e.g., a queued job).
The beauty of summary tables? They’re always fresh. Every write triggers an incremental update. Reads are O(1). The database becomes a materialized stream.
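The invariant behind every summary table is that the incrementally maintained numbers always match a full recomputation over the raw rows. A plain-Ruby toy of that invariant (no database, hypothetical names):

```ruby
class PointsSummary
  attr_reader :totals

  def initialize
    @actions = []          # the raw user_actions "table"
    @totals  = Hash.new(0) # the summary: user_id => total_points
  end

  # The write path: append the raw row and update the summary in the same
  # step, like the after_create callback does.
  def record_action(user_id, points)
    @actions << [user_id, points]
    @totals[user_id] += points
  end

  # The expensive query the summary table replaces: full scan + GROUP BY.
  def recompute
    @actions.group_by(&:first).transform_values { |rows| rows.sum(&:last) }
  end
end
```

If `totals` ever disagrees with `recompute`, your incremental maintenance has a bug; in production, a periodic reconciliation job that asserts this equality is cheap insurance.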
The Challenge: Keeping It Atomic
When you maintain summary tables, you open the door to inconsistency. What if the after_create_commit callback raises, leaving the summary behind the raw data? What if you update the summary in a separate transaction and the original action rolls back? The solution is either to run the summary update in an after_create callback, so it shares the source write’s transaction and both commit or roll back together, or to use out-of-band processing (e.g., a background job) that retries on failure.
In practice:
class UserAction < ApplicationRecord
  after_commit :queue_summary_update, on: :create

  private

  def queue_summary_update
    SummaryUpdateJob.perform_later(id)
  end
end
The job runs in a separate transaction, guaranteeing that the UserAction row is persisted before attempting the summary mutation. If the job fails, Sidekiq (or your queue) will retry, preserving eventual consistency without risking partial writes.
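Why does full recalculation survive retries where a naive increment doesn’t? A minimal plain-Ruby sketch of the difference (hypothetical names, no database):

```ruby
# Two ways a retried job can update a summary row. The increment
# double-counts when the job runs twice; the recomputation converges to
# the same value no matter how many times it runs (idempotent).
def increment_refresh!(summaries, user_id, points)
  summaries[user_id] = summaries.fetch(user_id, 0) + points  # NOT retry-safe
end

def full_refresh!(summaries, actions, user_id)
  # Derive the row purely from the raw data, then overwrite (upsert-style).
  summaries[user_id] = actions.select { |a| a[:user_id] == user_id }
                              .sum { |a| a[:points] }
end
```

This is why queue retries pair naturally with recompute-and-upsert jobs rather than with increments.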
Concretely, the retry-safe version recalculates from scratch:
class UserAction < ApplicationRecord
  after_create_commit :schedule_summary_refresh

  private

  def schedule_summary_refresh
    # Non-critical path: hand off to a background job
    # (action_id doubles as an idempotency key)
    RefreshUserPointsJob.perform_later(user_id, id)
  end
end
class RefreshUserPointsJob < ApplicationJob
  def perform(user_id, action_id = nil)
    # Recalculate from scratch for this user using the raw table.
    # Idempotent and safe, even if called multiple times.
    total_points = UserAction.where(user_id: user_id).sum(:point_value)

    UserPointsSummary.upsert(
      # daily_points and weekly_points are recalculated the same way
      { user_id: user_id, total_points: total_points },
      unique_by: :user_id
    )
  end
end
This trades real-time freshness for eventual consistency. For a leaderboard that’s fine, but for the CEO dashboard we stuck with hourly materialized views.
The art is knowing which battles to fight.
Takeaways
- Indexes are not a panacea for heavy aggregations.
- Materialized views give you instant reads at the cost of staleness.
- Summary tables provide real-time freshness when you need it, but require careful transactional handling.
- Think set-based first; drop down to object-level tricks only when you’ve exhausted the database’s strengths.
The next time a dashboard spins for minutes, remember: the database can be your cache, too. 🚀
The Real Magic: Combining Both Worlds
After two years of refinement, I now have a three‑tier caching strategy inside the database:
| Tier | Technique | Freshness | Use case |
|---|---|---|---|
| L1 | In‑memory (Rails cache) | Seconds | User‑specific, hot |
| L2 | Summary table (trigger‑updated) | Millisecond‑level | Real‑time counters |
| L3 | Materialized view (scheduled refresh) | Hour/Day | Analytics dashboards |
The dashboard that started this journey now uses:
- a materialized view for daily aggregates,
- a summary table for “today so far,” and
- a small bit of JavaScript to poll the real‑time summary every 30 seconds.
Result: Twelve seconds became 80 ms. The CEO doesn’t even look at the spinner anymore; he just trusts the numbers.
The Human Lesson: Caching Is a Taxonomy of Time
You can’t cache everything. What you can do is classify your data by how fresh it needs to be.
| Freshness needed | Recommended store |
|---|---|
| Real‑time counters | Summary table |
| Yesterday’s numbers | Materialized view |
| Last year’s reports | Regular table with good indexes |
I’ve stopped reaching for Redis as the default. Sometimes the best cache is the one already inside your database—the one that understands transactions, consistency, and the shape of your data.
Materialized views and summary tables feel archaic. They aren’t shiny, but they’re reliable. For a senior Rails engineer who’s seen too many caching layers collapse under their own complexity, that reliability is the ultimate art.
Now go pre‑compute something beautiful. And when a junior asks, “why not just add an index?”, tell them about the CEO and the spinning cursor. Some lessons need to be lived.