Mastering CAP & BASE Theory with Gemini: From Distributed Principles to Nacos & Redis Reality
Source: Dev.to
Core concepts
CAP theorem
The CAP theorem (also known as Brewer’s Theorem) is a cornerstone for understanding distributed system design. It states that a distributed system cannot perfectly guarantee all three of the following properties at the same time:
- Consistency (C) – All nodes see the same data at the same time.
  Example: Checking inventory at any branch returns exactly the same result.
- Availability (A) – Every request receives a non‑error response, meaning the system is always “online” — though the response is not guaranteed to reflect the most recent write.
- Partition Tolerance (P) – The system continues to operate even when network failures split nodes into isolated groups (a partition).
In real networks, partitions are inevitable, so a distributed system typically must trade off between CP and AP.
CP vs. AP
CP (Consistency + Partition tolerance)
- When the network fails, the system stops serving requests to keep data strictly consistent across nodes.
- Idea: It is better to return no result than to return incorrect or stale data.
- Example: Bank transfers. If two servers are disconnected, the system must lock the account to prevent double withdrawals.
- Cost: The system becomes unavailable during the fault.
AP (Availability + Partition tolerance)
- Even if the network is partitioned, the system continues to respond to requests.
- Idea: Data might not be the latest, or different users might see different results, but the service remains usable.
- Example: Social media likes. During a partition, a like may be visible to a friend a few seconds later, which is acceptable.
- Cost: Immediate consistency is sacrificed.
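The two trade‑offs above can be sketched with a pair of toy replicas. This is an illustration of the behavioral difference, not a real consensus or replication implementation; the class names and values are assumptions for the example.

```python
class CPReplica:
    """CP style: refuses to answer while partitioned (consistency first)."""
    def __init__(self):
        self.data = {}
        self.partitioned = False

    def read(self, key):
        if self.partitioned:
            # Better no result than a possibly wrong one.
            raise RuntimeError("unavailable: cannot reach quorum")
        return self.data.get(key)


class APReplica:
    """AP style: keeps answering while partitioned (availability first)."""
    def __init__(self):
        self.data = {}
        self.partitioned = False

    def read(self, key):
        return self.data.get(key)  # always responds, possibly stale


cp, ap = CPReplica(), APReplica()
cp.data["balance"] = 100
ap.data["likes"] = 41  # the newer value 42 never replicated over

cp.partitioned = ap.partitioned = True

try:
    cp.read("balance")
except RuntimeError as e:
    print(e)             # CP: no answer during the partition
print(ap.read("likes"))  # AP: answers, but the count may lag
```

The bank-transfer and social-media-likes examples map directly onto these two behaviors: the CP replica goes dark to protect correctness, while the AP replica serves a stale like count.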
Nacos: AP vs. CP modes
- AP mode (default) – Used for ephemeral instances (Ephemeral Nodes). After registration, instances keep a heartbeat with the server. During a partition, Nacos prioritizes service availability; short‑term inconsistency is acceptable. This uses Nacos’s Distro protocol.
- CP mode – Used for persistent instances (Persistent Nodes). Instance metadata is persisted to disk and requires strong consistency across nodes. If consensus cannot be reached due to a network failure, the system sacrifices availability. This uses a Raft‑based consensus protocol.
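The core mechanic behind ephemeral instances — register, keep heartbeating, get evicted when heartbeats stop — can be sketched in a few lines. The interval, the eviction rule, and the instance names below are illustrative assumptions, not Nacos’s actual defaults or its Distro protocol.

```python
HEARTBEAT_INTERVAL = 5.0              # assumed seconds between heartbeats
EVICT_AFTER = 3 * HEARTBEAT_INTERVAL  # assumed eviction threshold


class EphemeralRegistry:
    """Toy heartbeat-based registry for ephemeral service instances."""
    def __init__(self):
        self.last_beat = {}           # instance -> last heartbeat time

    def heartbeat(self, instance, now):
        self.last_beat[instance] = now

    def healthy_instances(self, now):
        # Instances whose last heartbeat is recent enough stay listed.
        return [inst for inst, t in self.last_beat.items()
                if now - t <= EVICT_AFTER]


reg = EphemeralRegistry()
reg.heartbeat("order-service@10.0.0.1:8080", now=0.0)
reg.heartbeat("order-service@10.0.0.2:8080", now=0.0)
reg.heartbeat("order-service@10.0.0.1:8080", now=14.0)  # .2 stopped beating

print(reg.healthy_instances(now=16.0))  # only the .1 instance survives
```

The key point is that no durable state is involved: an ephemeral instance exists only as long as its heartbeats keep arriving, which is why losing some of them during a partition is tolerable.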
Mapping to service concerns
- Service discovery usually leans toward AP. If the registry refuses requests to stay consistent, every microservice that depends on it may fail — a far worse outcome than briefly stale instance lists, which client‑side retries can mask.
- Configuration management can lean toward CP. Critical settings (e.g., database passwords, rate‑limit values) need to be propagated consistently and immediately to all nodes.
BASE theory
Once you understand the CAP trade‑off between Consistency and Availability, BASE can be seen as a practical compromise for distributed systems. The core idea is that, since strong consistency is hard to achieve, we accept a more flexible approach so the system remains usable most of the time.
BASE is an acronym for:
- Basically Available (BA) – During failures, the system may lose some availability but should not completely crash.
  Example: A page that normally loads in 0.1 s might take 2 s, or some non‑core functionality may be temporarily disabled to protect core services.
- Soft State (S) – The system’s data is allowed to be in an intermediate state; replication between nodes may be delayed, which is acceptable for overall availability.
- Eventually Consistent (E) – The system does not require data to be consistent at all times, but guarantees that after some time all replicas will converge to the same final state.
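All three properties show up in a minimal asynchronous-replication sketch: the replica lags the primary (soft state) yet converges once the replication log drains (eventual consistency). The names and values are illustrative.

```python
from collections import deque

primary, replica = {}, {}
replication_log = deque()   # writes queued for later replication


def write(key, value):
    primary[key] = value
    replication_log.append((key, value))  # replicate later, not now


def drain():
    # Stands in for "after some time": ship all pending writes.
    while replication_log:
        key, value = replication_log.popleft()
        replica[key] = value


write("profile:1", "alice")
stale = replica.get("profile:1")  # soft state: replica not yet updated
drain()                           # eventual consistency: replicas converge
print(stale, replica["profile:1"])  # None alice
```

Between `write` and `drain`, a reader hitting the replica sees the intermediate state — exactly the window BASE declares acceptable.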
Redis Cluster and BASE
Redis Cluster (cluster mode) is generally designed to be AP (Availability + Partition Tolerance). It does not pursue strong consistency; instead, it achieves eventual consistency via the BASE principles.
- Basically Available (BA) – Redis Cluster splits data into 16,384 hash slots across its masters. If a master fails, one of its replicas is promoted so its slots stay served; the cluster keeps responding as long as every slot remains covered (or, with cluster-require-full-coverage set to no, for whichever slots remain covered).
- Soft State (S) – After a master writes data, it returns success to the client immediately, then replicates to slaves asynchronously. This means the master and slaves can be inconsistent at any given moment.
- Eventually Consistent (E) – Under normal conditions, slaves catch up with the master within milliseconds.
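The 16,384‑slot mapping is concrete enough to compute: Redis Cluster derives a key’s slot as CRC16(key) mod 16384, and when the key contains a non‑empty `{...}` section (a “hash tag”), only that section is hashed. A minimal sketch:

```python
def crc16(data: bytes) -> int:
    # CRC16-CCITT (XMODEM), the checksum Redis Cluster uses for key slots.
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc


def key_slot(key: str) -> int:
    # Hash only the first non-empty {...} section if one exists, so
    # related keys can be forced onto the same slot.
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end > start + 1:
            key = key[start + 1:end]
    return crc16(key.encode()) % 16384


print(key_slot("key1"))  # some slot in 0..16383
# Both keys hash only "user1000", so they land on the same slot:
print(key_slot("{user1000}.following") == key_slot("{user1000}.followers"))
```

Hash tags are how multi‑key operations stay possible in cluster mode: keys that must live together are given the same tag.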
Example scenario
Step 1: Write SET key1 value1 to master node A.
Step 2: Node A writes to memory and immediately replies “OK”.
Step 3: Before A replicates the data to slave A1, A suddenly loses power and goes down.
Step 4: The cluster promotes slave A1 to become the new master.
Result: The value1 you just wrote is lost.
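The four steps above can be replayed as a toy simulation. It only mimics the sequencing — acknowledge first, replicate later, crash in between — not Redis itself.

```python
master = {}
slave = {}
pending_replication = []  # writes acked but not yet shipped to the slave


def set_key(key, value):
    master[key] = value
    pending_replication.append((key, value))
    return "OK"                       # step 2: ack before replicating


ack = set_key("key1", "value1")       # steps 1-2
# Step 3: master loses power; the pending replication never ships.
pending_replication.clear()
master = None

new_master = slave                    # step 4: slave promoted to master
print(ack, new_master.get("key1"))    # OK None -> the acked write is gone
```

This is the BASE trade‑off made visible: the client was told “OK”, yet the write vanished. When this window matters, Redis offers the WAIT command, which blocks the client until a given number of replicas have acknowledged the write, trading latency for durability.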