Kleppmann's Designing Data-Intensive Applications (DDIA) is the book I keep returning to whenever I have to reason about a system whose state lives in more than one place. It's not a recipe book; it's a vocabulary book.
What stuck with me
- Replication, partitioning, and consistency are three orthogonal axes, not a single dial. Most production confusion I've seen comes from conflating them.
- The CAP theorem is overrated as a design tool — the more useful frame is the latency / consistency trade-off you actually pay for under partial failure.
- Logs are everywhere. Once you see write-ahead logs, replication logs, and event logs as variants of the same idea, a lot of architectures collapse into the same shape.
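To make the point about logs concrete, here is a toy sketch of my own, not code from the book. The names `Log`, `KVState`, and `catch_up` are hypothetical, and everything lives in memory, so it ignores durability, networking, and concurrency. It only shows that crash recovery, follower catch-up, and a derived view can all be "replay an append-only log from some offset".

```python
from dataclasses import dataclass, field


@dataclass
class Log:
    """Append-only sequence of (key, value) records. Hypothetical, in-memory only."""
    entries: list = field(default_factory=list)

    def append(self, key, value):
        self.entries.append((key, value))
        return len(self.entries) - 1  # offset of the record just written

    def read_from(self, offset):
        return self.entries[offset:]


@dataclass
class KVState:
    """Key-value state derived purely by applying log records in order."""
    data: dict = field(default_factory=dict)
    applied: int = 0  # offset of the next record to apply

    def catch_up(self, log):
        for key, value in log.read_from(self.applied):
            self.data[key] = value
            self.applied += 1


log = Log()
log.append("x", 1)
log.append("y", 2)
log.append("x", 3)

# Write-ahead-log style: rebuild state after a crash by replaying from offset 0.
recovered = KVState()
recovered.catch_up(log)

# Replication-log style: a follower that had applied only the first record
# resumes from its own offset and converges on the same state.
follower = KVState(data={"x": 1}, applied=1)
follower.catch_up(log)

# Event-log style: a different consumer derives its own view of the same
# records (here, how many writes each key has received).
write_counts = {}
for key, _ in log.read_from(0):
    write_counts[key] = write_counts.get(key, 0) + 1
```

The three consumers differ only in where they start reading and what they build from the records, which is roughly the "same shape" I mean.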
Who should read it
Anyone who's about to choose a database for a new service and would otherwise pick the one their last team used. Read chapters 1, 5, 7, and 9 first if you're short on time.
