What is the role of ZooKeeper in Kafka?

February 21, 16:54

Role of ZooKeeper in Kafka

ZooKeeper plays a key role in Kafka clusters that run in ZooKeeper mode: it coordinates and stores the cluster's metadata and state. Kafka 2.8 introduced KRaft mode (ZooKeeper-less) as an early-access feature, it became production-ready in Kafka 3.3, and Kafka 4.0 removed ZooKeeper support entirely. ZooKeeper is still a core component of clusters running earlier versions.

Core Roles of ZooKeeper

1. Broker Registration and Discovery

Function Description:

  • Each Broker registers with ZooKeeper when starting
  • ZooKeeper maintains the Broker list and status
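  • Brokers and the Controller discover each other through ZooKeeper; modern Producers and Consumers do not talk to ZooKeeper but fetch Broker metadata from the Brokers listed in bootstrap.servers (only legacy clients before Kafka 0.9 used ZooKeeper directly)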

Implementation Mechanism:

shell
ZooKeeper Node Structure: /brokers/ids/[broker_id] -> broker information

Key Information:

  • Broker ID
  • Broker address and port
  • Broker liveness (the registration node is ephemeral, so it exists only while the Broker's ZooKeeper session is alive)
  • Broker's Rack information
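
To make the registration data concrete, here is a minimal Python sketch that parses a Broker registration payload. The sample JSON is a made-up illustration of what a node under /brokers/ids/ might hold; the exact field set varies by Kafka version, and `parse_broker_registration` is a hypothetical helper, not part of any Kafka library.

```python
import json

# Hypothetical sample of the JSON stored at /brokers/ids/1 (all values are made up).
SAMPLE_REGISTRATION = """
{
  "version": 4,
  "host": "kafka1.example.com",
  "port": 9092,
  "jmx_port": -1,
  "endpoints": ["PLAINTEXT://kafka1.example.com:9092"],
  "rack": "rack-a"
}
"""


def parse_broker_registration(broker_id, payload):
    """Extract the key fields listed above from a broker znode payload."""
    data = json.loads(payload)
    return {
        "id": broker_id,  # taken from the znode name, not from the payload
        "host": data.get("host"),
        "port": data.get("port"),
        "endpoints": data.get("endpoints", []),
        "rack": data.get("rack"),  # None when no rack is configured
    }


broker = parse_broker_registration(1, SAMPLE_REGISTRATION)
print(broker["host"], broker["port"], broker["rack"])
```

In a real cluster the payload would be read with a ZooKeeper client rather than taken from a string constant.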

2. Controller Election

Function Description:

  • Elect one Broker as Controller in the Kafka cluster
  • Controller manages partition state and replica allocation
  • ZooKeeper coordinates the Controller election process

Election Process:

  1. All Brokers compete to create /controller ephemeral node
  2. The Broker that successfully creates becomes Controller
  3. Other Brokers monitor the /controller node
  4. When the Controller fails, its session expires, the ephemeral /controller node is deleted, and the remaining Brokers compete again
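
The election steps above can be sketched with a toy in-memory stand-in for ZooKeeper. `InMemoryZk` and `elect_controller` are hypothetical names used only for illustration; a real Broker uses a ZooKeeper client and watches instead of a Python dictionary.

```python
class InMemoryZk:
    """Toy stand-in for ZooKeeper: ephemeral nodes vanish with their session."""

    def __init__(self):
        self.nodes = {}  # path -> (owner_session, data)

    def create_ephemeral(self, path, session, data):
        """Return True if this session created the node, False if it already exists."""
        if path in self.nodes:
            return False
        self.nodes[path] = (session, data)
        return True

    def close_session(self, session):
        """Simulate a broker crash: drop every node owned by the session."""
        self.nodes = {p: v for p, v in self.nodes.items() if v[0] != session}


def elect_controller(zk, broker_ids):
    """Each broker tries to create /controller; the first to succeed wins."""
    for broker_id in broker_ids:
        if zk.create_ephemeral("/controller", session=broker_id, data=broker_id):
            return broker_id
    return zk.nodes["/controller"][1]  # another broker already holds the role


zk = InMemoryZk()
first = elect_controller(zk, [2, 1, 3])  # broker 2 tries first and wins
zk.close_session(first)                  # the controller fails, its node disappears
second = elect_controller(zk, [1, 3])    # the survivors compete again
print(first, second)  # prints: 2 1
```

The sketch leaves out watches: in practice the losing Brokers are notified when /controller is deleted rather than polling for it.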

Controller Responsibilities:

  • Manage Partition Leader election
  • Manage Partition replica allocation
  • Manage Topic creation and deletion
  • Manage cluster metadata changes

3. Topic Metadata Management

Function Description:

  • Store Topic partition information
  • Store Topic replica allocation information
  • Store Topic configuration information

ZooKeeper Node Structure:

shell
/brokers/topics/[topic_name] -> partition information
/config/topics/[topic_name] -> Topic configuration
/admin/delete_topics/[topic_name] -> Topics to be deleted

Stored Content:

  • Topic's Partition count
  • Replica distribution for each Partition
  • Topic's override configuration (e.g., retention.ms)
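
As a rough illustration of the stored content, the sketch below derives the Partition count and replica distribution from a Topic node payload. The sample JSON (topic "orders", broker IDs 1 to 3) is an assumed example, and `summarize_topic` is a hypothetical helper.

```python
import json

# Hypothetical payload for /brokers/topics/orders: partition id -> replica broker ids.
SAMPLE_TOPIC_ZNODE = '{"version": 1, "partitions": {"0": [1, 2], "1": [2, 3], "2": [3, 1]}}'


def summarize_topic(payload):
    """Return the partition count, replication factor, and replica assignment."""
    data = json.loads(payload)
    # JSON object keys are strings, so convert partition ids back to integers.
    assignment = {int(p): replicas for p, replicas in data["partitions"].items()}
    first_replicas = next(iter(assignment.values()))
    return {
        "partition_count": len(assignment),
        "replication_factor": len(first_replicas),
        "assignment": assignment,
    }


summary = summarize_topic(SAMPLE_TOPIC_ZNODE)
print(summary["partition_count"], summary["replication_factor"])  # prints: 3 2
```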

4. Consumer Group Management

Function Description:

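  • Manage Consumer Group member information
  • Manage Consumer Group Offset commits
  • Coordinate Consumer Group Rebalance
  • This applies only to the legacy ZooKeeper-based consumer; with the consumer introduced in Kafka 0.9, a Broker acting as Group Coordinator manages membership and Offsets are stored in the internal __consumer_offsets topic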

ZooKeeper Node Structure:

shell
/consumers/[group_id]/ids/[consumer_id] -> Consumer information
/consumers/[group_id]/offsets/[topic]/[partition] -> Offset
/consumers/[group_id]/owners/[topic]/[partition] -> Partition owner

Managed Content:

  • Consumer Group member list
  • Topics subscribed by each Consumer
  • Offset for each Partition
  • Allocation relationship between Partitions and Consumers

5. ACL Permission Management

Function Description:

  • Store Kafka's access control lists
  • Manage user and permission information

ZooKeeper Node Structure:

shell
/kafka-acl/Topic/[topic_name] -> Topic permissions
/kafka-acl/Cluster/kafka-cluster -> Cluster permissions
/kafka-acl/Group/[group_id] -> Consumer Group permissions

6. Configuration Management

Function Description:

  • Store cluster-level configuration
  • Store Topic-level configuration
  • Store Client-level configuration

ZooKeeper Node Structure:

shell
/config/brokers/[broker_id] -> Broker configuration
/config/topics/[topic_name] -> Topic configuration
/config/clients/[client_id] -> Client configuration

ZooKeeper and Kafka Interaction

Broker Startup Process

  1. Connect to ZooKeeper

    • Broker connects to ZooKeeper cluster
    • Create session
  2. Register Broker

    • Create ephemeral node under /brokers/ids/
    • Register Broker information
  3. Participate in Controller Election

    • Try to create /controller node
    • Compete to become Controller
  4. Load Metadata

    • Read Topic information from ZooKeeper
    • Read configuration information from ZooKeeper

Topic Creation Process

  1. Create Topic Node

    • Create Topic node under /brokers/topics/
    • Store partition and replica information
  2. Create Configuration Node

    • Create configuration node under /config/topics/
    • Store Topic configuration
  3. Notify Controller

    • Controller monitors Topic changes
    • Controller executes partition allocation
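
The first step can be illustrated with a simplified sketch that builds the payload for a Topic node. The round-robin placement here is a deliberate simplification and not Kafka's actual assignment algorithm (which also randomizes the starting Broker and can take racks into account); `assign_replicas` and `topic_znode` are hypothetical helpers.

```python
import json


def assign_replicas(broker_ids, num_partitions, replication_factor):
    """Simplified round-robin placement: partition p starts at broker p mod N."""
    if replication_factor > len(broker_ids):
        raise ValueError("replication factor exceeds number of brokers")
    brokers = sorted(broker_ids)
    assignment = {}
    for partition in range(num_partitions):
        assignment[str(partition)] = [
            brokers[(partition + offset) % len(brokers)]
            for offset in range(replication_factor)
        ]
    return assignment


def topic_znode(topic, assignment):
    """Return the node path and JSON payload that would describe the topic."""
    path = f"/brokers/topics/{topic}"
    payload = json.dumps({"version": 1, "partitions": assignment})
    return path, payload


plan = assign_replicas([1, 2, 3], num_partitions=3, replication_factor=2)
path, payload = topic_znode("orders", plan)
print(path)
print(payload)
```

The first replica in each list is conventionally the preferred leader, which is why the starting Broker rotates from one Partition to the next.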

Consumer Group Rebalance Process

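  1. Consumer Joins Group

    • In the legacy ZooKeeper-based consumer (before Kafka 0.9), the Consumer creates an ephemeral node under /consumers/[group_id]/ids/
    • Register Consumer information
  2. Trigger Rebalance

    • Consumers watch the ids node and detect member changes (in Kafka 0.9+ a Broker acting as Group Coordinator detects them instead)
    • Start Rebalance process
  3. Allocate Partitions

    • Each Consumer computes the allocation plan (in Kafka 0.9+ the group's Leader Consumer formulates it and the Group Coordinator distributes it)
    • Update /consumers/[group_id]/owners/ node
  4. Commit Offset

    • Consumer commits Offset to ZooKeeper (in Kafka 0.9+ Offsets go to the __consumer_offsets topic instead)
    • Update /consumers/[group_id]/offsets/ node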
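
The allocation step can be sketched as a range-style assignment for a single Topic: Consumers are sorted, each receives a contiguous block of Partitions, and any remainder goes to the first Consumers in the list. This is a minimal sketch of the idea, not Kafka's own assignor code; `range_assign` and the consumer names are hypothetical.

```python
def range_assign(consumers, num_partitions):
    """Split partitions 0..num_partitions-1 into contiguous ranges per consumer."""
    if not consumers:
        raise ValueError("at least one consumer is required")
    members = sorted(consumers)  # every member must use the same ordering
    base, extra = divmod(num_partitions, len(members))
    plan, start = {}, 0
    for index, member in enumerate(members):
        count = base + (1 if index < extra else 0)  # first `extra` members get one more
        plan[member] = list(range(start, start + count))
        start += count
    return plan


before = range_assign(["c1", "c2"], 5)
after = range_assign(["c1", "c2", "c3"], 5)  # c3 joins, triggering a rebalance
print(before)
print(after)
```

Comparing `before` and `after` shows why a membership change triggers a Rebalance: Partition ownership shifts for the existing members as well as the new one.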

ZooKeeper Configuration

Kafka Configuration

properties
# ZooKeeper connection address
zookeeper.connect=localhost:2181
# ZooKeeper connection timeout
zookeeper.connection.timeout.ms=6000
# ZooKeeper session timeout
zookeeper.session.timeout.ms=6000
# ZooKeeper sync time
zookeeper.sync.time.ms=2000
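
In production, zookeeper.connect usually lists several hosts and may end with a chroot path so that Kafka keeps its nodes under a dedicated subtree. The sketch below splits such a value into its parts; the host names zk1 to zk3 and the /kafka chroot are assumed examples, and `parse_zookeeper_connect` is a hypothetical helper.

```python
def parse_zookeeper_connect(value):
    """Split 'host1:port1,host2:port2/chroot' into (servers, chroot)."""
    hosts_part, slash, chroot = value.partition("/")
    servers = []
    for entry in hosts_part.split(","):
        entry = entry.strip()
        host, sep, port = entry.rpartition(":")
        if not sep:  # no explicit port: fall back to the conventional 2181
            host, port = entry, "2181"
        servers.append((host, int(port)))
    return servers, (slash + chroot) if slash else "/"


servers, chroot = parse_zookeeper_connect("zk1:2181,zk2:2181,zk3:2181/kafka")
print(servers, chroot)
```

Note that the chroot appears once, at the end of the whole string, not after each host.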

ZooKeeper Configuration

properties
# Client connection limit
maxClientCnxns=60
# Data directory
dataDir=/var/lib/zookeeper
# Tick time
tickTime=2000
# Initial sync timeout
initLimit=10
# Sync timeout
syncLimit=5
# Client port
clientPort=2181
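
initLimit and syncLimit are expressed in ticks, not milliseconds, and the session timeout a client requests is clamped to a range derived from tickTime. The following sketch works through that arithmetic, assuming minSessionTimeout and maxSessionTimeout are left at their defaults of 2 and 20 ticks; the function names are hypothetical.

```python
def zk_timeouts(tick_time_ms, init_limit, sync_limit):
    """Convert tick-based limits into milliseconds."""
    return {
        "init_timeout_ms": init_limit * tick_time_ms,
        "sync_timeout_ms": sync_limit * tick_time_ms,
        "min_session_timeout_ms": 2 * tick_time_ms,   # default minSessionTimeout
        "max_session_timeout_ms": 20 * tick_time_ms,  # default maxSessionTimeout
    }


def negotiated_session_timeout(requested_ms, tick_time_ms):
    """Clamp a client's requested session timeout to the server's allowed range."""
    low, high = 2 * tick_time_ms, 20 * tick_time_ms
    return max(low, min(requested_ms, high))


limits = zk_timeouts(tick_time_ms=2000, init_limit=10, sync_limit=5)
print(limits)
print(negotiated_session_timeout(6000, 2000))  # within range, stays 6000
print(negotiated_session_timeout(1000, 2000))  # too small, raised to 4000
```

With tickTime=2000, a Kafka zookeeper.session.timeout.ms of 6000 falls inside the 4000 to 40000 ms window, so it is accepted unchanged.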

ZooKeeper High Availability

ZooKeeper Cluster Deployment

Deployment Architecture:

  • At least 3 ZooKeeper nodes
  • Odd number of nodes (a majority quorum prevents split brain; an even node count adds cost without tolerating any additional failures)
  • Distributed on different physical machines

Configuration Example:

properties
tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
initLimit=5
syncLimit=2
server.1=zoo1:2888:3888
server.2=zoo2:2888:3888
server.3=zoo3:2888:3888

Failure Recovery

ZooKeeper Failure:

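  • As long as a majority (quorum) of nodes is alive, the cluster continues to serve requests
  • A failed minority node resynchronizes from the leader when it rejoins; if the majority is lost, the cluster becomes unavailable until a quorum is restored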
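
The majority rule can be checked with a little arithmetic. The sketch below computes the quorum size and the number of tolerated failures for several cluster sizes; the function names are hypothetical.

```python
def quorum_size(node_count):
    """Smallest number of nodes that forms a strict majority."""
    return node_count // 2 + 1


def tolerated_failures(node_count):
    """How many nodes may fail while a majority is still alive."""
    return node_count - quorum_size(node_count)


for n in (3, 4, 5):
    print(n, quorum_size(n), tolerated_failures(n))
# prints:
# 3 2 1
# 4 3 1
# 5 3 2
```

A 4-node cluster tolerates the same single failure as a 3-node cluster, which is why odd sizes are recommended.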

Kafka Failure:

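  • Broker failure: the Broker's ephemeral node disappears, the Controller is notified and elects new Leaders for the affected Partitions
  • Controller failure: the /controller node disappears and the remaining Brokers elect a new Controller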

KRaft Mode (ZooKeeper-less)

KRaft Mode Introduction

Kafka 2.8 introduced KRaft mode as an early-access feature (production-ready since Kafka 3.3), removing the dependency on ZooKeeper.

Advantages:

  • Simplify deployment and operations
  • Reduce component dependencies
  • Improve performance (faster Controller failover and metadata propagation)
  • Better scalability

Architecture Changes:

  • Use internal metadata storage (a Raft-replicated log, the __cluster_metadata topic) instead of ZooKeeper
  • Controller cluster manages metadata
  • Broker communicates directly with Controller

KRaft Mode Configuration

properties
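# Enable KRaft mode
process.roles=broker,controller
# Unique ID of this node (must match one of the IDs in controller.quorum.voters)
node.id=1
# Controller list
controller.quorum.voters=1@localhost:9093,2@localhost:9094,3@localhost:9095
# Listen addresses
listeners=PLAINTEXT://:9092,CONTROLLER://:9093
# Listener name used by the Controller quorum
controller.listener.names=CONTROLLER
# Metadata directory
metadata.log.dir=/var/lib/kafka/metadata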
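
Each entry in controller.quorum.voters has the form id@host:port. As a minimal sketch, the value from the configuration above can be parsed as follows; `parse_quorum_voters` is a hypothetical helper, not a Kafka API.

```python
def parse_quorum_voters(value):
    """Parse 'id@host:port,...' into a dict of node id -> (host, port)."""
    voters = {}
    for entry in value.split(","):
        node_id, _, endpoint = entry.strip().partition("@")
        host, _, port = endpoint.rpartition(":")
        voters[int(node_id)] = (host, int(port))
    return voters


voters = parse_quorum_voters("1@localhost:9093,2@localhost:9094,3@localhost:9095")
print(voters)
print(len(voters) // 2 + 1)  # votes needed for a majority: 2
```

As with ZooKeeper, the Controller quorum needs a majority of voters alive, so three voters tolerate one failure.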

Best Practices

1. ZooKeeper Cluster Planning

  • At least 3 nodes
  • Distributed across different racks
  • Use dedicated disks

2. Monitor ZooKeeper

  • Monitor ZooKeeper latency
  • Monitor ZooKeeper connection count
  • Monitor ZooKeeper node status
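
One way to collect these three metrics is ZooKeeper's `mntr` four-letter command, which returns tab-separated key/value lines. The sketch below is a minimal example under two assumptions: that `mntr` is allowed on the server (recent ZooKeeper versions require it to be listed in 4lw.commands.whitelist), and that the sample output, which is made up, resembles the real one. `parse_mntr` and `fetch_mntr` are hypothetical helpers.

```python
import socket


def parse_mntr(text):
    """Turn tab-separated 'key<TAB>value' lines into a dict of strings."""
    metrics = {}
    for line in text.splitlines():
        key, sep, value = line.partition("\t")
        if sep:  # skip lines that are not key/value pairs
            metrics[key.strip()] = value.strip()
    return metrics


def fetch_mntr(host="localhost", port=2181, timeout=5.0):
    """Send 'mntr' to a ZooKeeper server and parse the reply (needs a live server)."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(b"mntr")
        chunks = []
        while True:
            chunk = conn.recv(4096)
            if not chunk:  # the server closes the connection after replying
                break
            chunks.append(chunk)
    return parse_mntr(b"".join(chunks).decode("utf-8"))


# Made-up sample reply, used here so the sketch runs without a server.
SAMPLE_MNTR = (
    "zk_avg_latency\t0\n"
    "zk_num_alive_connections\t12\n"
    "zk_server_state\tfollower\n"
)
metrics = parse_mntr(SAMPLE_MNTR)
print(metrics["zk_avg_latency"], metrics["zk_num_alive_connections"], metrics["zk_server_state"])
```

Values are kept as strings because their types differ by metric and version; convert them as needed before alerting on thresholds.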

3. Optimize ZooKeeper Performance

  • Adjust JVM parameters
  • Optimize network configuration
  • Use SSD storage

4. Backup ZooKeeper Data

  • Regularly backup ZooKeeper data directory
  • Establish disaster recovery plans
  • Test backup and recovery processes

By understanding ZooKeeper's role in Kafka, you can better design, deploy, and operate Kafka clusters, ensuring system stability and reliability.

Tags: Kafka