Role of ZooKeeper in Kafka
ZooKeeper plays a key role in Kafka clusters, coordinating and managing the cluster's metadata and state. Kafka 2.8 introduced KRaft mode (ZooKeeper-less), which became production-ready in Kafka 3.3 and the only mode in Kafka 4.0, but ZooKeeper remains a core component of many existing Kafka clusters.
Core Roles of ZooKeeper
1. Broker Registration and Discovery
Function Description:
- Each Broker registers with ZooKeeper when starting
- ZooKeeper maintains the Broker list and status
- Producers and Consumers discover Brokers through ZooKeeper
Implementation Mechanism:
ZooKeeper Node Structure:
```shell
/brokers/ids/[broker_id] -> broker information
```
Key Information:
- Broker ID
- Broker address and port
- Broker status (active/inactive)
- Broker's Rack information
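The registration znode holds a small JSON document. A sketch of reading it, with an illustrative payload in the shape Kafka writes under /brokers/ids/[broker_id] (field values here are made up):

```python
import json

# Illustrative payload in the style of /brokers/ids/[broker_id].
# The node is ephemeral: it disappears when the broker's ZooKeeper
# session expires, which is how liveness is tracked.
broker_znode = '''
{
  "listener_security_protocol_map": {"PLAINTEXT": "PLAINTEXT"},
  "endpoints": ["PLAINTEXT://kafka1:9092"],
  "host": "kafka1",
  "port": 9092,
  "rack": "rack-1",
  "version": 4
}
'''

info = json.loads(broker_znode)
print(info["host"], info["port"], info["rack"])
```

Clients and the Controller read these nodes (and watch /brokers/ids for children changes) to discover the live broker set.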
2. Controller Election
Function Description:
- Elect one Broker as Controller in the Kafka cluster
- Controller manages partition state and replica allocation
- ZooKeeper coordinates the Controller election process
Election Process:
- All Brokers compete to create the /controller ephemeral node
- The Broker that successfully creates it becomes the Controller
- The other Brokers watch the /controller node
- When the Controller fails, the ephemeral node expires and a new election is triggered
Controller Responsibilities:
- Manage Partition Leader election
- Manage Partition replica allocation
- Manage Topic creation and deletion
- Manage cluster metadata changes
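The election above amounts to a first-writer-wins race on an ephemeral znode. A minimal simulation of that race, with a plain dict standing in for the znode tree (no real ZooKeeper client involved):

```python
# Simulate the /controller race: the first broker to create the
# ephemeral node becomes Controller; every later attempt fails.
znodes = {}

def try_become_controller(broker_id):
    """Mimics a conditional znode create: succeeds only if absent."""
    if "/controller" in znodes:
        return False  # node already exists: another broker won
    znodes["/controller"] = {"brokerid": broker_id}
    return True

# Brokers 2, 0, 1 race in this order; only the first create succeeds.
results = {b: try_become_controller(b) for b in [2, 0, 1]}
controller = znodes["/controller"]["brokerid"]

# On Controller failure its session expires, the ephemeral node is
# deleted, and the watching brokers race again.
del znodes["/controller"]
```

In real ZooKeeper the conditional create is atomic across the ensemble, and the losing brokers set a watch on /controller to learn when to re-run the race.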
3. Topic Metadata Management
Function Description:
- Store Topic partition information
- Store Topic replica allocation information
- Store Topic configuration information
ZooKeeper Node Structure:
```shell
/brokers/topics/[topic_name] -> partition information
/config/topics/[topic_name] -> Topic configuration
/admin/delete_topics/[topic_name] -> Topics marked for deletion
```
Stored Content:
- Topic's Partition count
- Replica distribution for each Partition
- Topic's override configuration (e.g., retention.ms)
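The partition znode is again JSON, mapping each partition to its ordered replica list (the first replica is the preferred leader). A sketch with an illustrative payload:

```python
import json

# Illustrative payload in the style of /brokers/topics/[topic_name]:
# partition id -> ordered broker ids holding its replicas.
topic_znode = '{"version": 2, "partitions": {"0": [1, 2, 3], "1": [2, 3, 1], "2": [3, 1, 2]}}'

meta = json.loads(topic_znode)
partition_count = len(meta["partitions"])
replication_factor = len(meta["partitions"]["0"])
print(partition_count, replication_factor)  # 3 partitions, replication factor 3
```

Rotating the replica lists across brokers, as in this payload, is how Kafka spreads preferred leaders evenly over the cluster.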
4. Consumer Group Management
Function Description (legacy consumer):
- Manage Consumer Group member information
- Store Consumer Group Offset commits
- Coordinate Consumer Group Rebalance
Note: this applies to the legacy (pre-0.9) consumer. Since Kafka 0.9, the consumer stores Offsets in the internal __consumer_offsets topic, and group membership and Rebalance are handled by a broker-side Group Coordinator rather than by ZooKeeper.
ZooKeeper Node Structure:
```shell
/consumers/[group_id]/ids/[consumer_id] -> Consumer information
/consumers/[group_id]/offsets/[topic]/[partition] -> Offset
/consumers/[group_id]/owners/[topic]/[partition] -> Partition owner
```
Managed Content:
- Consumer Group member list
- Topics subscribed by each Consumer
- Offset for each Partition
- Allocation relationship between Partitions and Consumers
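The legacy-consumer paths can be assembled with a small hypothetical helper (the functions are illustrative; the path layout follows the /consumers subtree described above):

```python
# Hypothetical helpers that build legacy-consumer znode paths.
def offset_path(group, topic, partition):
    """Znode storing the committed offset for one partition."""
    return f"/consumers/{group}/offsets/{topic}/{partition}"

def owner_path(group, topic, partition):
    """Znode recording which consumer currently owns the partition."""
    return f"/consumers/{group}/owners/{topic}/{partition}"

print(offset_path("analytics", "clicks", 0))
# -> /consumers/analytics/offsets/clicks/0
```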
5. ACL Permission Management
Function Description:
- Store Kafka's access control lists
- Manage user and permission information
ZooKeeper Node Structure:
```shell
/kafka-acl/Topic/[topic_name] -> Topic permissions
/kafka-acl/Cluster/kafka-cluster -> Cluster permissions
/kafka-acl/Group/[group_id] -> Consumer Group permissions
```
6. Configuration Management
Function Description:
- Store cluster-level configuration
- Store Topic-level configuration
- Store Client-level configuration
ZooKeeper Node Structure:
```shell
/config/brokers/[broker_id] -> Broker configuration
/config/topics/[topic_name] -> Topic configuration
/config/clients/[client_id] -> Client configuration
```
ZooKeeper and Kafka Interaction
Broker Startup Process
1. Connect to ZooKeeper
- Broker connects to the ZooKeeper cluster
- Establishes a session
2. Register Broker
- Creates an ephemeral node under /brokers/ids/[broker_id]
- Registers Broker information (address, port, rack)
3. Participate in Controller Election
- Tries to create the /controller node
- The first Broker to succeed becomes Controller
4. Load Metadata
- Reads Topic information from ZooKeeper
- Reads configuration information from ZooKeeper
Topic Creation Process
1. Create Topic Node
- Create the Topic node under /brokers/topics/[topic_name]
- Store partition and replica assignment information
2. Create Configuration Node
- Create the configuration node under /config/topics/[topic_name]
- Store Topic configuration overrides
3. Notify Controller
- Controller watches /brokers/topics for changes
- Controller executes partition allocation and Leader election
Consumer Group Rebalance Process
1. Consumer Joins Group
- Consumer creates an ephemeral node under /consumers/[group_id]/ids/
- Registers its subscription information
2. Trigger Rebalance
- Membership changes are detected through ZooKeeper watches
- The Rebalance process starts
3. Allocate Partitions
- Each Consumer deterministically computes the allocation plan
- Updates the /consumers/[group_id]/owners/ nodes
4. Commit Offset
- Consumer commits Offsets to ZooKeeper
- Updates the /consumers/[group_id]/offsets/ nodes
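The allocation step is deterministic given the sorted member list, so every consumer can compute the same plan independently. A sketch of range-style assignment, the legacy consumer's default strategy (consumer names and counts here are illustrative):

```python
# Range-style assignment: sort consumers, split the partition list into
# contiguous chunks; the first (n_partitions % n_consumers) consumers
# receive one extra partition each.
def range_assign(consumers, n_partitions):
    consumers = sorted(consumers)
    base, extra = divmod(n_partitions, len(consumers))
    assignment, start = {}, 0
    for i, c in enumerate(consumers):
        count = base + (1 if i < extra else 0)
        assignment[c] = list(range(start, start + count))
        start += count
    return assignment

print(range_assign(["c1", "c2", "c3"], 7))
# {'c1': [0, 1, 2], 'c2': [3, 4], 'c3': [5, 6]}
```

Because every member sorts the same inputs the same way, no coordination is needed to agree on the plan; only the resulting owner znodes must be written.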
ZooKeeper Configuration
Kafka Configuration
```properties
# ZooKeeper connection address
zookeeper.connect=localhost:2181
# ZooKeeper connection timeout
zookeeper.connection.timeout.ms=6000
# ZooKeeper session timeout
zookeeper.session.timeout.ms=6000
# ZooKeeper sync time
zookeeper.sync.time.ms=2000
```
ZooKeeper Configuration
```properties
# Maximum concurrent connections per client IP
maxClientCnxns=60
# Data directory
dataDir=/var/lib/zookeeper
# Base time unit (ms)
tickTime=2000
# Ticks a follower may take for initial connect and sync
initLimit=10
# Ticks a follower may lag behind the leader
syncLimit=5
# Client port
clientPort=2181
```
ZooKeeper High Availability
ZooKeeper Cluster Deployment
Deployment Architecture:
- At least 3 ZooKeeper nodes
- Odd number of nodes (avoid split brain)
- Distributed on different physical machines
Configuration Example:
```properties
tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
initLimit=5
syncLimit=2
server.1=zoo1:2888:3888
server.2=zoo2:2888:3888
server.3=zoo3:2888:3888
```
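Why an odd node count: writes need a strict majority (quorum), so an ensemble of n nodes tolerates floor((n-1)/2) failures, and growing from 3 to 4 nodes adds no fault tolerance. A quick check:

```python
# Fault tolerance of a ZooKeeper ensemble: a majority quorum of the
# configured servers must survive for the cluster to keep serving.
def tolerated_failures(n_nodes):
    quorum = n_nodes // 2 + 1
    return n_nodes - quorum

for n in [3, 4, 5]:
    print(n, "nodes ->", tolerated_failures(n), "failures tolerated")
# 3 -> 1, 4 -> 1, 5 -> 2
```

This is why 3- and 5-node ensembles are the standard deployments: each added pair of nodes buys exactly one more tolerated failure.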
Failure Recovery
ZooKeeper Failure:
- As long as a majority (quorum) of nodes survives, the cluster continues to serve
- A failed node rejoins and resynchronizes automatically once restarted
Kafka Failure:
- When a Broker fails, its ephemeral node expires; ZooKeeper notifies the Controller, which triggers new Partition Leader elections
- When the Controller fails, the /controller node expires and a new Controller is elected
KRaft Mode (ZooKeeper-less)
KRaft Mode Introduction
Kafka 2.8 introduced KRaft mode, removing the dependency on ZooKeeper; it became production-ready in Kafka 3.3 and is the only supported mode as of Kafka 4.0.
Advantages:
- Simplify deployment and operations
- Reduce component dependencies
- Improve performance
- Better scalability
Architecture Changes:
- Use internal metadata storage instead of ZooKeeper
- Controller cluster manages metadata
- Broker communicates directly with Controller
KRaft Mode Configuration
```properties
# Run both broker and controller roles in this process
process.roles=broker,controller
# Controller quorum voters (id@host:port)
controller.quorum.voters=1@localhost:9093,2@localhost:9094,3@localhost:9095
# Listener addresses
listeners=PLAINTEXT://:9092,CONTROLLER://:9093
# Metadata log directory
metadata.log.dir=/var/lib/kafka/metadata
```
Best Practices
1. ZooKeeper Cluster Planning
- At least 3 nodes
- Distributed across different racks
- Use dedicated disks
2. Monitor ZooKeeper
- Monitor ZooKeeper latency
- Monitor ZooKeeper connection count
- Monitor ZooKeeper node status
3. Optimize ZooKeeper Performance
- Adjust JVM parameters
- Optimize network configuration
- Use SSD storage
4. Backup ZooKeeper Data
- Regularly backup ZooKeeper data directory
- Establish disaster recovery plans
- Test backup and recovery processes
By understanding ZooKeeper's role in Kafka, you can better design, deploy, and operate Kafka clusters, ensuring system stability and reliability.