How does Zookeeper guarantee data consistency? What is the working principle of the ZAB protocol?

February 21, 16:24

Answer

Zookeeper guarantees data consistency through ZAB (Zookeeper Atomic Broadcast), the protocol it uses to replicate every write to the ensemble in a single agreed order and to recover after a Leader failure.

ZAB Protocol

The ZAB protocol includes two modes:

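  1. Crash Recovery Mode:

    • Entered when the cluster starts, the Leader fails, or the Leader loses contact with a majority of Followers
    • Elects a new Leader
    • Synchronizes Followers with the new Leader before any new writes are accepted
  2. Message Broadcast Mode:

    • Runs while the Leader is healthy and a majority of Followers are in sync with it
    • Handles client write requests
    • Broadcasts each transaction to all Followers in two phases (propose, then commit)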

Write Request Consistency Guarantee

Write Request Flow:

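  1. Client sends a write request to whichever server it is connected to
  2. If that server is a Follower, it forwards the request to the Leader
  3. Leader turns the request into a transaction proposal and assigns it the next zxid, a 64-bit ID that only ever increases (high 32 bits: Leader epoch; low 32 bits: counter within that epoch)
  4. Leader broadcasts the proposal to all Followers
  5. Each Follower appends the proposal to its transaction log and returns an ACK; it does not apply the transaction yet
  6. Leader commits the transaction once a majority of the ensemble (its own vote included) has acknowledged it
  7. Leader broadcasts a COMMIT message to all Followers
  8. Each Follower applies the committed transaction to its in-memory data; the server that holds the client's connection then returns the result to the client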
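
The propose, ACK, commit exchange above can be sketched as a small in-memory simulation. This is only an illustration of the majority rule, not Zookeeper's implementation: the Server class, leader_write function, and the "/config" path are hypothetical names invented for the example, and networking, disk persistence, and failure detection are all left out.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """Hypothetical in-memory stand-in for one server in the ensemble."""
    name: str
    log: list = field(default_factory=list)    # proposals logged, not yet applied
    data: dict = field(default_factory=dict)   # committed state visible to reads
    alive: bool = True

    def on_proposal(self, zxid, key, value):
        """Log the proposal and acknowledge it; do not apply it yet."""
        if not self.alive:
            return False
        self.log.append((zxid, key, value))
        return True

    def on_commit(self, zxid):
        """Apply a previously logged proposal to the visible state."""
        for logged_zxid, key, value in self.log:
            if logged_zxid == zxid:
                self.data[key] = value


def leader_write(leader, followers, zxid, key, value):
    """Propose, count ACKs (the leader counts itself), commit on a majority."""
    ensemble = [leader] + followers
    acks = sum(1 for s in ensemble if s.on_proposal(zxid, key, value))
    if acks <= len(ensemble) // 2:
        return False  # no majority: the proposal is never committed
    for s in ensemble:
        if s.alive:
            s.on_commit(zxid)
    return True


leader = Server("s1")
followers = [Server("s2"), Server("s3", alive=False)]
committed = leader_write(leader, followers, zxid=1, key="/config", value="v1")
print(committed, followers[0].data, followers[1].data)
```

With three servers and one of them down, two ACKs still form a majority, so the write commits on s1 and s2 while s3 stays empty until it rejoins and synchronizes.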

Consistency Guarantee:

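  • All write requests are processed by the Leader, the only server that assigns their order
  • A transaction commits only after a quorum (a majority of voting servers, counting the Leader) has logged it
  • Every server applies committed transactions in the same order
  • The zxid defines that total order: a transaction with a larger zxid is always applied after one with a smaller zxid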
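
A zxid is a 64-bit number whose high 32 bits hold the Leader epoch and whose low 32 bits hold a counter within that epoch, which is why plain integer comparison yields the global order. A minimal sketch of that layout follows; the helper names make_zxid and split_zxid and the sample values are made up for illustration.

```python
COUNTER_BITS = 32
COUNTER_MASK = (1 << COUNTER_BITS) - 1


def make_zxid(epoch, counter):
    """Pack an epoch and a per-epoch counter into one 64-bit zxid."""
    return (epoch << COUNTER_BITS) | (counter & COUNTER_MASK)


def split_zxid(zxid):
    """Return (epoch, counter) for a zxid."""
    return zxid >> COUNTER_BITS, zxid & COUNTER_MASK


last_of_epoch_1 = make_zxid(1, 500)
first_of_epoch_2 = make_zxid(2, 1)

print(hex(first_of_epoch_2))               # 0x200000001
print(first_of_epoch_2 > last_of_epoch_1)  # True
print(split_zxid(first_of_epoch_2))        # (2, 1)
```

Because the epoch sits in the high bits, any transaction proposed by a newer Leader compares greater than every transaction from an older epoch, no matter how large the old counter was.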

Read Request Consistency

Read Request Characteristics:

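  • A read is served locally by whichever server the client is connected to
  • The result may be stale: a Follower that has not yet applied the latest commit returns the older value (the view can lag behind the Leader, but a client never sees it move backward)
  • No Leader participation is required, so read throughput grows with the number of servers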

Strong Consistency Read:

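  • Call sync() before the read; the connected server catches up with the Leader before it serves that client's later requests
  • A read issued after sync() therefore reflects every write committed before the sync() call
  • The cost is an extra round trip through the Leader, trading performance for freshness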
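
The difference between a plain read and a sync-then-read can be illustrated with a toy model. This is a sketch under simplifying assumptions, not the real client API: the Leader and Follower classes and the "/flag" path are hypothetical, and catch_up stands in for the Follower receiving pending commits from the Leader.

```python
class Leader:
    def __init__(self):
        self.committed = []  # (zxid, path, value) tuples in commit order

    def commit(self, path, value):
        zxid = len(self.committed) + 1
        self.committed.append((zxid, path, value))
        return zxid


class Follower:
    def __init__(self, leader):
        self.leader = leader
        self.last_applied = 0  # zxid of the last transaction applied locally
        self.data = {}

    def catch_up(self):
        """Apply every committed transaction this follower has not seen yet."""
        for zxid, path, value in self.leader.committed[self.last_applied:]:
            self.data[path] = value
            self.last_applied = zxid

    def read(self, path):
        """Plain read: answer from local state, which may lag the leader."""
        return self.data.get(path)

    def sync_then_read(self, path):
        """Model of sync() followed by a read: catch up first, then answer."""
        self.catch_up()
        return self.read(path)


leader = Leader()
follower = Follower(leader)
leader.commit("/flag", "old")
follower.catch_up()
leader.commit("/flag", "new")  # the follower has not applied this yet

stale = follower.read("/flag")
fresh = follower.sync_then_read("/flag")
print(stale, fresh)  # old new
```

In the Java client, sync() is an asynchronous call that takes a path and a callback, so the read is normally issued from the callback or queued right after it on the same session.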

Data Synchronization Mechanism

Data Synchronization After Leader Election:

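  1. Leader determines latest data: Election votes favor the highest epoch, then the largest zxid, then the largest server ID, so the new Leader holds every committed transaction
  2. Follower connects to Leader: Sends the last zxid in its log
  3. Leader sends differential data:
    • If the Follower is behind, send the missing transactions (DIFF), or a full snapshot (SNAP) if it is too far behind
    • If the Follower holds transactions the Leader does not have, require it to truncate its log back to the Leader's state (TRUNC)
  4. Follower synchronizes data: Applies what the Leader sent
  5. Synchronization complete: Once a quorum of Followers is in sync, the Leader starts broadcasting and Followers can process requests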
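
The Leader's choice in step 3 can be sketched as a comparison between the Follower's last zxid and the range of transactions the Leader still holds. This is a deliberately simplified model, not Zookeeper's actual logic: the function name choose_sync, the SNAP fallback label, and the small integer zxids are assumptions made for the example.

```python
def choose_sync(leader_log, follower_zxid):
    """Return (mode, payload) for a follower.

    leader_log is a sorted list of committed zxids the leader still retains.
    """
    oldest, newest = leader_log[0], leader_log[-1]
    if follower_zxid > newest:
        # The follower logged proposals the new leader never committed:
        # it must drop everything after the leader's newest zxid.
        return "TRUNC", newest
    if follower_zxid < oldest:
        # Too far behind to replay from the retained log: send a snapshot.
        return "SNAP", newest
    # Within range: send only the transactions the follower is missing.
    missing = [z for z in leader_log if z > follower_zxid]
    return "DIFF", missing


leader_log = [5, 6, 7, 8]
print(choose_sync(leader_log, 6))   # ('DIFF', [7, 8])
print(choose_sync(leader_log, 10))  # ('TRUNC', 8)
print(choose_sync(leader_log, 2))   # ('SNAP', 8)
```

A Follower that is already up to date falls into the DIFF branch with an empty list, so it has nothing to apply and can start serving right away.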

Consistency Levels

Zookeeper provides the following consistency guarantees:

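  1. Sequential Consistency: Updates from a client are applied in the order they were sent, and every server applies updates in the same order
  2. Atomicity: An update either succeeds completely or fails completely; there are no partial results
  3. Single System Image: A client sees the same view of the service whichever server it connects to, and never sees an older view after reconnecting to a different server within the same session
  4. Reliability: Once an update has been applied, it persists until a client overwrites it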

Consistency Trade-offs

Choice in CAP Theory:

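  • CP System: Favors consistency and partition tolerance
  • Sacrifices Availability: During a network partition only the side that holds a majority of servers keeps accepting writes; servers on the minority side stop serving them, and no writes are accepted while a Leader election is in progress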
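
The availability cost follows directly from majority arithmetic, which a few lines make concrete. The function names below are hypothetical helpers for the example, and the sketch assumes a plain majority quorum with every server voting.

```python
def quorum_size(ensemble_size):
    """Smallest number of servers that forms a majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many servers can be lost while a majority remains."""
    return ensemble_size - quorum_size(ensemble_size)


def can_serve_writes(partition_size, ensemble_size):
    """A partition can commit writes only if it contains a majority."""
    return partition_size >= quorum_size(ensemble_size)


for n in (3, 4, 5):
    print(n, quorum_size(n), tolerated_failures(n))

print(can_serve_writes(3, 5), can_serve_writes(2, 5))  # True False
```

The loop prints 3 2 1, 4 3 1, and 5 3 2: a fourth server raises the quorum without tolerating any extra failure, which is why ensembles are usually sized with an odd number of servers.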

Practical Impact:

  • Higher write latency: every write goes through the Leader and waits for a quorum to acknowledge it
  • Excellent read performance: reads are served locally by any server
  • Best suited to read-heavy, write-light workloads

Version Number Mechanism

Each ZNode maintains three version numbers:

  1. dataVersion: Data version number; starts at 0 and increments each time the ZNode's data is set
  2. cversion: Child version number; increments each time a child node is created or deleted
  3. aversion: ACL version number; increments each time the ZNode's ACL is changed

CAS Operation:

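  • Version numbers give optimistic locking: read the data together with its version, then write back conditionally
  • Pass the version you read to setData() (or delete()); the write succeeds only if the ZNode is still at that version
  • If another client changed the ZNode in between, the update fails with a BadVersion error and the caller should re-read and retry; passing -1 as the version skips the check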
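
The read, modify, conditional-write loop can be sketched without a running cluster. The ZNode class, BadVersionError, and increment function below are hypothetical stand-ins that mimic the version check of setData(); they are not part of any Zookeeper client library.

```python
class BadVersionError(Exception):
    """Raised when the caller's expected version is out of date."""


class ZNode:
    def __init__(self, data):
        self.data = data
        self.version = 0

    def get(self):
        """Return the data together with the version it was read at."""
        return self.data, self.version

    def set(self, data, expected_version):
        """Write only if the node is still at expected_version (-1 = any)."""
        if expected_version != -1 and expected_version != self.version:
            raise BadVersionError(
                f"expected {expected_version}, found {self.version}")
        self.data = data
        self.version += 1
        return self.version


def increment(node, max_retries=5):
    """Optimistic increment: re-read and retry when the version moved."""
    for _ in range(max_retries):
        value, version = node.get()
        try:
            node.set(value + 1, version)
            return value + 1
        except BadVersionError:
            continue
    raise RuntimeError("too much contention")


counter = ZNode(0)
_, seen = counter.get()
counter.set(10, seen)        # succeeds, version becomes 1
try:
    counter.set(99, seen)    # still using version 0, so it is rejected
except BadVersionError as exc:
    print("rejected:", exc)
print(increment(counter), counter.version)  # 11 2
```

The second write loses because it carries a version that is no longer current, which is exactly how two clients updating the same ZNode avoid silently overwriting each other.
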
Tags: Zookeeper