Answer
ZooKeeper guarantees data consistency through the ZAB protocol (ZooKeeper Atomic Broadcast), which is its core replication mechanism.
ZAB Protocol
The ZAB protocol includes two modes:
Crash Recovery Mode:
- Entered when the Leader fails or the cluster starts
- Elects a new Leader
- Synchronizes and recovers data across nodes
Message Broadcast Mode:
- Runs while the Leader is working normally
- Handles client write requests
- Broadcasts transactions to all Followers
Write Request Consistency Guarantee
Write Request Flow:
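- Client sends a write request to any node
- If that node is a Follower, it forwards the request to the Leader
- Leader creates a transaction proposal and assigns it a monotonically increasing zxid
- Leader broadcasts the proposal to all Followers
- Each Follower writes the proposal to its transaction log and returns an ACK (the transaction is not applied yet)
- Leader commits the transaction after receiving ACKs from a majority of the ensemble, counting itself
- Leader broadcasts a commit message to all Followers
- Followers apply the committed transaction, and the node the client is connected to returns success to the client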
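The flow above can be sketched as a toy, in-memory model. This is only an illustration under simplifying assumptions, not ZooKeeper's implementation: the class names `Leader` and `Follower`, the `reachable` flag, and the plain integer zxid are all hypothetical, and there is no networking or disk I/O.

```python
from dataclasses import dataclass, field


@dataclass
class Follower:
    """Toy follower: logs proposals, applies them only after a commit."""
    reachable: bool = True
    log: list = field(default_factory=list)      # proposals written to the log
    applied: list = field(default_factory=list)  # committed transactions

    def propose(self, zxid, op):
        if not self.reachable:
            return False              # no ACK from an unreachable follower
        self.log.append((zxid, op))
        return True                   # ACK

    def commit(self, zxid):
        if not self.reachable:
            return
        for entry in self.log:
            if entry[0] == zxid:
                self.applied.append(entry)


class Leader:
    """Toy leader: commits a write only when a majority has acknowledged it."""

    def __init__(self, followers):
        self.followers = followers
        self.counter = 0
        self.applied = []

    def write(self, op):
        self.counter += 1
        zxid = self.counter           # simplified zxid: a plain counter
        acks = 1                      # the leader's own log write counts
        acks += sum(f.propose(zxid, op) for f in self.followers)
        ensemble = len(self.followers) + 1
        if acks <= ensemble // 2:
            return False              # no majority: the proposal is not committed
        self.applied.append((zxid, op))
        for f in self.followers:
            f.commit(zxid)
        return True
```

With three nodes and one Follower unreachable, `write` still succeeds because two of three nodes acknowledged the proposal. With both Followers unreachable it returns `False`.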
Consistency Guarantee:
- All write requests must be processed through Leader
- Leader commits only after a majority of the ensemble (the Leader included) has acknowledged the proposal
- All nodes apply committed transactions in the same order
- zxid guarantees global order of transactions
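A zxid is a 64-bit number: the high 32 bits hold the Leader epoch and the low 32 bits hold a counter that increases with each proposal in that epoch. The helper names below (`make_zxid`, `split_zxid`) are hypothetical and only illustrate why comparing zxids as integers gives a total order across Leader changes.

```python
COUNTER_BITS = 32
COUNTER_MASK = (1 << COUNTER_BITS) - 1


def make_zxid(epoch, counter):
    """Pack an epoch and a per-epoch counter into a single 64-bit zxid."""
    return (epoch << COUNTER_BITS) | (counter & COUNTER_MASK)


def split_zxid(zxid):
    """Return (epoch, counter) for a zxid."""
    return zxid >> COUNTER_BITS, zxid & COUNTER_MASK
```

Because the epoch sits in the high bits, any transaction from a newer epoch compares greater than every transaction from an older one, regardless of the counter values.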
Read Request Consistency
Read Request Characteristics:
- Read requests are served locally by whichever node the client is connected to
- May return stale data if that node lags behind the Leader (reads are not linearizable by default)
- No Leader participation required, high performance
Strong Consistency Read:
- Call the sync() method before the read, so the connected node catches up with the Leader
- A read issued after sync() reflects all writes committed before the sync
- Sacrifices performance for consistency
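The difference between a plain read and a sync-then-read can be illustrated with a toy model. `ToyLeader` and `ToyFollower` are hypothetical names, and `sync` here simply replays the leader's committed list; the real client API is asynchronous and works over the network.

```python
class ToyLeader:
    def __init__(self):
        self.committed = []           # list of (zxid, path, value)

    def write(self, path, value):
        zxid = len(self.committed) + 1
        self.committed.append((zxid, path, value))
        return zxid


class ToyFollower:
    def __init__(self, leader):
        self.leader = leader
        self.data = {}
        self.last_zxid = 0

    def catch_up(self):
        """Apply every committed transaction not yet seen, in zxid order."""
        for zxid, path, value in self.leader.committed:
            if zxid > self.last_zxid:
                self.data[path] = value
                self.last_zxid = zxid

    def read(self, path):
        return self.data.get(path)    # served locally, may be stale

    def sync(self):
        self.catch_up()               # toy stand-in for the sync() round trip
```

A follower that has applied only the first of two writes returns the old value from `read`. After `sync`, the same `read` returns the new value.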
Data Synchronization Mechanism
Data Synchronization After Leader Election:
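- Leader holds the latest data: election favors the candidate with the highest epoch and zxid (ties broken by server ID), so the new Leader already has every committed transaction
- Follower connects to Leader: Sends its latest zxid
- Leader sends differential data:
  - If the Follower is behind, the Leader sends the missing transactions
  - If the Follower holds transactions the Leader does not have (proposals from the old Leader that were never committed), the Leader tells it to roll them back
- Follower synchronizes data: Applies the transactions sent by the Leader
- Synchronization complete: Follower can start serving requests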
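The Leader's decision in the steps above can be sketched as a small function. This is a simplified assumption-laden model: `plan_sync` is a hypothetical name, zxids are plain integers, and the labels `DIFF` and `TRUNC` are borrowed from ZooKeeper's sync packet types. Real ZooKeeper can also send a full snapshot when a Follower is too far behind, which is left out here.

```python
def plan_sync(leader_log, follower_last_zxid):
    """Decide how to bring a follower in line with the leader.

    leader_log: sorted list of zxids the leader holds.
    Returns ("DIFF", missing_zxids) or ("TRUNC", zxid_to_roll_back_to).
    """
    leader_last = leader_log[-1] if leader_log else 0
    if follower_last_zxid <= leader_last:
        # Follower is behind or equal: send whatever it is missing.
        missing = [z for z in leader_log if z > follower_last_zxid]
        return ("DIFF", missing)
    # Follower is ahead: it must discard everything after the leader's last zxid.
    return ("TRUNC", leader_last)
```

For example, a Follower at zxid 2 talking to a Leader holding zxids 1 to 4 receives transactions 3 and 4, while a Follower at zxid 5 talking to a Leader whose last zxid is 3 is told to truncate back to 3.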
Consistency Levels
ZooKeeper provides the following consistency guarantees:
- Sequential Consistency: Updates from a client are applied in the order in which they were sent
- Atomicity: Transactions either fully succeed or fully fail
- Single System Image: A client sees the same view of the service regardless of the node it connects to, and never sees an older view after failing over to another node
- Reliability: Once a transaction is committed, it persists until a client overwrites it
Consistency Trade-offs
Choice in CAP Theory:
- CP System: Guarantees consistency and partition tolerance
- Sacrifices Availability: During a network partition, nodes that cannot reach a majority stop serving clients, and writes succeed only on the side that still has a quorum
Practical Impact:
- Higher write request latency (waiting for majority confirmation)
- Excellent read request performance (can read from any node)
- Suitable for read-heavy, write-light scenarios
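The majority rule behind both the availability trade-off and the write latency is simple arithmetic. The function names below are hypothetical; the sketch assumes a plain majority quorum.

```python
def quorum_size(ensemble_size):
    """Smallest number of nodes that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many nodes can be lost while a majority remains."""
    return ensemble_size - quorum_size(ensemble_size)


def can_serve_writes(ensemble_size, reachable_nodes):
    """True if this side of a partition still holds a quorum."""
    return reachable_nodes >= quorum_size(ensemble_size)
```

A 5-node ensemble needs 3 nodes for a quorum and tolerates 2 failures. A 6-node ensemble needs 4 and still tolerates only 2, which is why odd ensemble sizes are the usual choice.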
Version Number Mechanism
Each ZNode maintains three version numbers in its Stat structure:
- dataVersion (the version field): Data version number, increments each time the node's data is set
- cversion: Child node version number, increments when children are created or deleted
- aversion: ACL version number, increments when the node's ACL changes
CAS Operation:
- Version numbers implement optimistic locking
- Pass the expected version when calling setData or delete to detect concurrent modification
- The update fails with a bad-version error if the version does not match; passing -1 skips the check
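The version check can be modeled in a few lines. `ToyZNode` and `BadVersionError` are hypothetical stand-ins for illustration only; in the real Java client the corresponding call is `setData(path, data, version)`.

```python
class BadVersionError(Exception):
    """Raised when the caller's expected version is out of date."""


class ToyZNode:
    def __init__(self, data=b""):
        self.data = data
        self.version = 0

    def set_data(self, data, expected_version=-1):
        """Update data only if expected_version matches; -1 skips the check."""
        if expected_version != -1 and expected_version != self.version:
            raise BadVersionError(
                f"expected {expected_version}, current {self.version}"
            )
        self.data = data
        self.version += 1
        return self.version
```

A typical usage pattern is read, compute, then write with the version obtained from the read. If another client updated the node in between, the write raises and the caller re-reads and retries.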