When MySQL InnoDB's Engine Redo Log Saved Your Startup

Published: December 6, 2025 at 10:07 PM EST
4 min read
Source: Dev.to

The Lighthouse Keeper Analogy

Think of InnoDB’s redo log as a lighthouse keeper’s journal.

Before modern GPS, lighthouse keepers kept meticulous logs:

  • “3:15 PM — Ship passing north.”
  • “4:22 PM — Lit beacon for fog.”
  • “5:03 PM — Wind shifted west.”

If anything went wrong, investigators could reconstruct what happened by reading the log.

InnoDB works the same way. Before changing anything, it writes to the redo log first:

“I’m about to update row 47 in table users. Here’s what I’m changing.”

This write‑ahead logging (WAL) is your safety net for crashes, power failures, and cosmic rays flipping bits in RAM.

Note: The lighthouse keeper can become a bottleneck.
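The write-ahead idea can be sketched in a few lines of Python: every change is appended to a log file and fsynced *before* the in-memory state is touched, so a crash can always be repaired by replaying the log. This is a toy model of the principle, not InnoDB's actual record format.

```python
import json
import os

class ToyWAL:
    """Toy write-ahead log: describe the change durably, then apply it."""

    def __init__(self, path):
        self.path = path
        self.state = {}  # stands in for the buffer pool (in RAM)
        self.log = open(path, "a", encoding="utf-8")

    def update(self, key, value):
        record = json.dumps({"key": key, "value": value})
        self.log.write(record + "\n")  # 1. write the intent to the log...
        self.log.flush()
        os.fsync(self.log.fileno())    # 2. ...make it durable first
        self.state[key] = value        # 3. only then change the data

    @staticmethod
    def recover(path):
        """Rebuild state after a crash by replaying the log from the start."""
        state = {}
        with open(path, encoding="utf-8") as f:
            for line in f:
                rec = json.loads(line)
                state[rec["key"]] = rec["value"]
        return state
```

Replaying after a simulated crash reconstructs exactly the state the writer had, which is the guarantee crash recovery relies on.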

The Transaction Patterns That Nearly Killed Us

Our application processed financial transactions—fast. We handled ~5,000 writes per second at peak.

A typical transaction looked like:

-- Running thousands of times per second
START TRANSACTION;

UPDATE accounts 
SET balance = balance - 100 
WHERE user_id = ?;

INSERT INTO transaction_log 
(user_id, amount, timestamp) 
VALUES (?, 100, NOW());

UPDATE user_metadata 
SET last_transaction = NOW() 
WHERE user_id = ?;

COMMIT;

Looks innocent, right? Wrong. This pattern was silently strangling our redo log system.

The Redo Log: A Technical Deep Dive (Without the Boring Parts)

Visualizing what happens inside InnoDB:

┌─────────────────────────────────────────────────────┐
│  Transaction Flow                                    │
├─────────────────────────────────────────────────────┤
│  1. Client: "UPDATE accounts..."                     │
│          │                                          │
│          ▼                                          │
│  2. InnoDB: Write to REDO LOG (on disk)              │
│          [Sequential write - fast! ⚡]              │
│          │                                          │
│          ▼                                          │
│  3. InnoDB: Update buffer pool (in RAM)              │
│          [Random access - still fast! 💨]          │
│          │                                          │
│          ▼                                          │
│  4. Client: "COMMIT"                                 │
│          │                                          │
│          ▼                                          │
│  5. InnoDB: Flush redo to disk (fsync)               │
│          [This is where we wait... 🐌]             │
└─────────────────────────────────────────────────────┘

Key insight: Redo writes are sequential (fast), but fsync is extremely slow, especially under heavy load. At 5,000 transactions per second, each requiring a disk flush, we were asking storage to perform 5,000 synchronous operations per second—something even SSDs hate.
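You can feel the fsync cost directly with a small micro-benchmark: appending to a file is cheap because the OS buffers it, but forcing every append to durable storage with `os.fsync` is what hurts. Absolute numbers vary wildly by storage, so treat this as a demonstration, not a benchmark suite.

```python
import os
import tempfile
import time

def timed_appends(path, n, fsync_each):
    """Append n small records; optionally fsync after every one."""
    start = time.perf_counter()
    with open(path, "ab") as f:
        for i in range(n):
            f.write(b"redo-record-%d\n" % i)
            f.flush()
            if fsync_each:
                os.fsync(f.fileno())  # force the write to durable storage
    return time.perf_counter() - start

if __name__ == "__main__":
    d = tempfile.mkdtemp()
    buffered = timed_appends(os.path.join(d, "buffered.log"), 1000, False)
    durable = timed_appends(os.path.join(d, "durable.log"), 1000, True)
    print(f"buffered: {buffered:.4f}s   fsync-per-write: {durable:.4f}s")
```

On most hardware the fsync-per-write run is orders of magnitude slower, which is the same wall a database hits at 5,000 flushes per second.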

The Investigation: Four Variables That Changed Everything

1. innodb_flush_log_at_trx_commit

-- Safest, slowest
SET GLOBAL innodb_flush_log_at_trx_commit = 1;

-- OS cache, faster
SET GLOBAL innodb_flush_log_at_trx_commit = 2;

-- Unsafe, fastest
SET GLOBAL innodb_flush_log_at_trx_commit = 0;

Setting | Durability    | Performance | Use case
--------|---------------|-------------|---------------
1       | Full ACID     | Slowest     | Financial data
2       | OS cache      | Fast        | Web apps
0       | Risky         | Fastest     | Analytics

We moved to 2, accepting that an OS crash or power loss could lose up to about one second of committed transactions; in exchange we got roughly a 10× throughput boost.
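The three settings differ only in when the log is written to the OS and when it is fsynced. As a rough model (my simplification, not InnoDB source): setting 1 fsyncs at every commit, while settings 2 and 0 fsync only about once per second in the background. Counting the fsyncs explains the throughput gap:

```python
def fsyncs_per_second(commits_per_sec, setting):
    """Rough model of fsync frequency for innodb_flush_log_at_trx_commit."""
    if setting == 1:
        return commits_per_sec  # flush to durable storage at every commit
    if setting in (0, 2):
        return 1                # background flush roughly once per second
    raise ValueError("setting must be 0, 1, or 2")

for s in (1, 2, 0):
    n = fsyncs_per_second(5000, s)
    print(f"setting={s}: ~{n} fsyncs/sec at 5,000 commits/sec")
```

At our load, setting 1 demanded 5,000 fsyncs per second from storage; settings 2 and 0 demand about one.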

2. innodb_log_file_size

The redo log is a circular buffer:

[Checkpoint] ---> ███████ write position ---> (wrap)

If the write pointer catches the checkpoint → stall.
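A toy model makes the stall visible: the writer advances freely until the unflushed span (write position minus checkpoint) would exceed capacity, at which point it must wait for the checkpointer to catch up. This is a hypothetical simulation, not InnoDB internals.

```python
class ToyRedoLog:
    """Circular redo log: writes stall when they would lap the checkpoint."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.write_lsn = 0       # total bytes ever written
        self.checkpoint_lsn = 0  # everything before this is safely on disk
        self.stalls = 0

    def write(self, nbytes):
        if self.write_lsn + nbytes - self.checkpoint_lsn > self.capacity:
            self.stalls += 1     # writer caught the checkpoint: stall
            return False
        self.write_lsn += nbytes
        return True

    def checkpoint(self, nbytes):
        """Checkpointer flushes dirty pages, freeing log space to reuse."""
        self.checkpoint_lsn = min(self.checkpoint_lsn + nbytes, self.write_lsn)
```

With a 100-byte log, two 60-byte writes back to back stall until a checkpoint frees space; a bigger log simply makes that window rarer.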

SHOW VARIABLES LIKE 'innodb_log_file_size';

We increased ours from 48 MB to 2 GB × 2 files = 4 GB.
Rule of thumb: Large enough to hold ≈ 1 hour of peak writes.
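The rule of thumb is simple arithmetic: sample `Innodb_os_log_written` at the start and end of your peak period and size the log to hold about an hour of that rate. A hedged sketch (the one-hour target is this article's rule, not a MySQL default):

```python
def recommended_redo_bytes(os_log_written_start, os_log_written_end,
                           sample_seconds, target_seconds=3600):
    """Size the redo log to hold ~target_seconds of peak redo volume."""
    bytes_per_sec = (os_log_written_end - os_log_written_start) / sample_seconds
    return int(bytes_per_sec * target_seconds)

# e.g. 600 MB of redo written over a 10-minute peak sample
rec = recommended_redo_bytes(0, 600 * 1024**2, sample_seconds=600)
print(f"recommended capacity: {rec / 1024**3:.1f} GiB")  # → 3.5 GiB
```

That output lines up with the 4 GB we ended up provisioning.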

3. innodb_flush_method

# innodb_flush_method is not dynamic -- set it in my.cnf and restart mysqld
[mysqld]
innodb_flush_method = O_DIRECT

O_DIRECT bypasses the OS page cache, so pages aren't buffered twice (once in InnoDB's buffer pool, once by the OS). This cut our I/O wait by 30%.

4. Batch Your Commits

Before:

5000 commits/sec → 5000 fsync calls

After batching:

100 commits/sec → 100 fsync calls

Result: 50× fewer disk operations.
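Batching can be as simple as buffering work and committing once per N operations instead of once per operation. A hedged sketch with a generic commit callback (the class and batch size are illustrations, not our production code):

```python
class CommitBatcher:
    """Group writes so one commit (one fsync) covers many operations."""

    def __init__(self, commit_fn, batch_size=50):
        self.commit_fn = commit_fn  # e.g. executes the batch and COMMITs
        self.batch_size = batch_size
        self.pending = []
        self.commits = 0

    def add(self, op):
        self.pending.append(op)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.pending:
            self.commit_fn(self.pending)  # one transaction for the whole batch
            self.commits += 1
            self.pending = []
```

Feeding 5,000 operations through a batch size of 50 yields exactly 100 commits, the 50× reduction above. The trade-off: a crash can lose the uncommitted tail of a batch, so pick a size your durability budget allows.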

The Read‑Only Transaction Secret Weapon

START TRANSACTION READ ONLY;

SELECT SUM(balance)
FROM accounts
WHERE region = 'US-WEST';

COMMIT;

Because InnoDB knows nothing will be modified, it:

  • Skips allocating a transaction ID
  • Skips read‑write transaction bookkeeping
  • Generates no redo or undo records

Result: ≈ 4× faster for reporting queries.

Monitoring the Redo Log

-- Sizes reported in MB; Innodb_os_log_written is cumulative since startup
SELECT 
  'Redo Log Written (MB)' AS metric,
  ROUND(
    (SELECT VARIABLE_VALUE 
     FROM performance_schema.global_status 
     WHERE VARIABLE_NAME = 'Innodb_os_log_written') / 1024 / 1024,
    2
  ) AS value
UNION ALL
SELECT 
  'Log Waits',
  (SELECT VARIABLE_VALUE 
   FROM performance_schema.global_status 
   WHERE VARIABLE_NAME = 'Innodb_log_waits')
UNION ALL
-- Innodb_checkpoint_age is exposed by Percona Server and MariaDB; on stock
-- MySQL, derive it from SHOW ENGINE INNODB STATUS
-- (Log sequence number minus Last checkpoint at)
SELECT 
  'Checkpoint Age (MB)',
  ROUND(
    (SELECT VARIABLE_VALUE 
     FROM performance_schema.global_status 
     WHERE VARIABLE_NAME = 'Innodb_checkpoint_age') / 1024 / 1024,
    2
  );

If Innodb_log_waits > 0 → Your redo log is too small.
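These checks are easy to automate. A hedged sketch that evaluates a dict of global status values (fetch them with the MySQL driver of your choice; the threshold is this article's rule, not a universal one):

```python
def redo_log_health(status):
    """Flag redo-log problems from performance_schema.global_status values."""
    findings = []
    if int(status.get("Innodb_log_waits", 0)) > 0:
        findings.append("log waits > 0: redo log is too small, grow it")
    written_mb = int(status.get("Innodb_os_log_written", 0)) / 1024 / 1024
    findings.append(f"redo written so far: {written_mb:.1f} MB")
    return findings
```

Wire this into whatever runs your other health checks; the point is that `Innodb_log_waits` should stay at zero, permanently.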

The Production Optimization Playbook

  1. Profile Before Optimizing

    SET GLOBAL slow_query_log = 1;
    SET GLOBAL long_query_time = 0.1;
    SET GLOBAL log_slow_extra = 1;
  2. Right‑Size the Redo Log
    Use daily metrics to size the redo log to hold your peak hour's write volume, with headroom (≈ 2×).

  3. Test Durability Settings With Benchmarks
    Compare innodb_flush_log_at_trx_commit values = 1, 2, 0.

  4. Use Read‑Only Transactions for reporting and analytics.

Lessons Learned

  • The redo log is both your safety net and your bottleneck.
  • Every durability guarantee has a performance cost.
  • Log file sizing is mission‑critical.
  • Read‑only transactions are massively underused.
  • Batching writes is one of the most powerful optimizations at scale.

Further Reading

  • MySQL InnoDB Documentation: Redo Log
  • Percona: Tuning InnoDB Redo Logging
  • High Performance MySQL (O’Reilly)

Tags: #mysql #database #performance #devops
