Role of ZooKeeper in the cluster?

Problem description

Suppose I have a cluster hosting one topic that has three partitions, so the ZooKeeper (ZK) cluster is hosting 3 broker instances.

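As per my understanding:

  1. The Producer will interact with ZooKeeper to publish a message on a broker.
  2. ZK will internally decide, based on the load on each broker instance, which partition the message should be published to. The broker will also interact with ZK to maintain the offset per consumer instance.
  3. Similarly, the Consumer will interact with ZooKeeper to consume messages from a broker. ZK will get the message out of the right broker based on load.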

But I got confused after reading the text quoted below, from the section "Workflow of Queue Messaging / Consumer Group" of a Kafka tutorial. Is my understanding above wrong? Based on the quote, it looks like the producer and consumer do not interact directly with ZooKeeper. Is it the other way around, with ZK interacting with the producer and consumer? If so, who (ZooKeeper or the broker) decides which broker instance a message needs to be published to or consumed from?

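"The ZooKeeper service is mainly used to notify producers and consumers about the presence of any new broker in the Kafka system or the failure of a broker in the Kafka system. Based on the notification received from ZooKeeper regarding the presence or failure of a broker, the producer and consumer make a decision and start coordinating their tasks with some other broker. Basically, Apache ZooKeeper is a distributed configuration and synchronization service."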

Recommended answer

You seem to have two things mixed up: most of what you think is done by the Kafka brokers is actually done by the clients, and most of what you think is done by ZooKeeper is actually done by the brokers.

Kafka is a very scalable system because the clients do a lot of the processing. The parts not done by the clients are done by the brokers (and by the special broker components called the Controller and the Coordinators). ZooKeeper does very little other than store state and some configuration for the brokers, in a very reliable way.

To address your points:

1) Incorrect. The new Producer does not interact directly with ZooKeeper. The Producer talks directly to the brokers, either to publish messages or to make metadata requests that tell it which broker is the leader for the partition it wants to publish to.
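To make the leader lookup concrete, here is a minimal sketch in plain Python. It is a toy model, not the Kafka client API: `BrokerAddress`, `METADATA`, `find_leader`, the topic name `orders` and the host names are all made up for illustration. It only shows the idea that the producer resolves "which broker leads this partition" from metadata it got from a broker, with ZooKeeper nowhere in the path.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class BrokerAddress:
    """Hypothetical description of one broker (not a Kafka client class)."""
    broker_id: int
    host: str
    port: int


# Invented metadata, standing in for what a broker would return to a
# metadata request: (topic, partition) -> broker currently leading it.
METADATA: Dict[Tuple[str, int], BrokerAddress] = {
    ("orders", 0): BrokerAddress(1, "broker-1.example", 9092),
    ("orders", 1): BrokerAddress(2, "broker-2.example", 9092),
    ("orders", 2): BrokerAddress(3, "broker-3.example", 9092),
}


def find_leader(
    topic: str,
    partition: int,
    metadata: Dict[Tuple[str, int], BrokerAddress] = METADATA,
) -> BrokerAddress:
    """Return the broker a producer would send records to for this partition."""
    try:
        return metadata[(topic, partition)]
    except KeyError:
        raise LookupError(f"no leader known for {topic}-{partition}") from None


# The producer would open a connection to this broker and publish there.
leader = find_leader("orders", 2)
```

In this sketch the metadata is a static dict; a real client would refresh it from the brokers when leadership changes.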

2) Incorrect. ZK does not "decide" anything. ZK is a replicated, fault-tolerant storage system that the brokers use to save information and state for the cluster. The decision on which partition to publish to is made in the Producer and depends on the key of the message being published and on the client-side partitioner algorithm. Partitions are not assigned based on load; they are assigned based on the key or, if the key is null, using a round-robin algorithm. The broker will NOT interact with ZK to maintain the offset per consumer instance. Consumers keep track of their own offsets and store them (occasionally, via offset commits) in the __consumer_offsets topic on the brokers.
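The client-side partition choice described in point 2 can be sketched roughly as follows. This is an illustration under assumptions, not Kafka's actual partitioner: the class name `SketchPartitioner` is hypothetical, and `zlib.crc32` is used only as a stand-in hash (the Java client hashes keys with murmur2, and newer Java clients spread unkeyed records with a sticky strategy rather than strict round-robin).

```python
import itertools
import zlib
from typing import Optional


class SketchPartitioner:
    """Toy client-side partitioner: hash keyed records, round-robin the rest."""

    def __init__(self, num_partitions: int) -> None:
        self.num_partitions = num_partitions
        # Endless 0, 1, ..., n-1, 0, 1, ... sequence for records without a key.
        self._round_robin = itertools.cycle(range(num_partitions))

    def partition(self, key: Optional[bytes]) -> int:
        if key is None:
            # No key: take the next partition in turn, regardless of load.
            return next(self._round_robin)
        # Stand-in hash (assumption); the same key always maps to the
        # same partition as long as the partition count does not change.
        return zlib.crc32(key) % self.num_partitions


partitioner = SketchPartitioner(num_partitions=3)
keyed = [partitioner.partition(b"user-42") for _ in range(3)]
unkeyed = [partitioner.partition(None) for _ in range(4)]
```

Note that nothing in the sketch consults a broker or ZooKeeper: the producer needs only the key and the partition count.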

3) Incorrect. The new Consumer will NOT interact directly with ZooKeeper to consume messages from a broker. ZK will NOT get the message out of the right broker based on load. Consumers talk directly to the brokers, and they join and leave consumer groups via RPCs sent to the brokers using the Kafka protocol.
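As a rough picture of the consumer side, the sketch below models a consumer that fetches straight from a broker, tracks its own position, and commits that position back to the broker. Everything here is an in-memory toy with hypothetical names (`ToyBroker`, `ToyConsumer`, the group `billing`, the topic `orders`); the `committed` dict merely plays the role that the __consumer_offsets topic plays in a real cluster, and group membership is left out.

```python
from typing import Dict, List, Tuple


class ToyBroker:
    """In-memory stand-in for a broker: partition logs plus committed offsets."""

    def __init__(self) -> None:
        self.logs: Dict[Tuple[str, int], List[str]] = {}
        # Plays the role of the __consumer_offsets topic in this sketch.
        self.committed: Dict[Tuple[str, str, int], int] = {}

    def append(self, topic: str, partition: int, message: str) -> None:
        self.logs.setdefault((topic, partition), []).append(message)

    def fetch(self, topic: str, partition: int, offset: int,
              max_records: int = 10) -> List[str]:
        log = self.logs.get((topic, partition), [])
        return log[offset:offset + max_records]

    def commit(self, group: str, topic: str, partition: int, offset: int) -> None:
        self.committed[(group, topic, partition)] = offset

    def committed_offset(self, group: str, topic: str, partition: int) -> int:
        return self.committed.get((group, topic, partition), 0)


class ToyConsumer:
    """Tracks its own position and commits it to the broker on request."""

    def __init__(self, broker: ToyBroker, group: str, topic: str,
                 partition: int) -> None:
        self.broker, self.group = broker, group
        self.topic, self.partition = topic, partition
        # Resume from the last offset committed for this group, if any.
        self.position = broker.committed_offset(group, topic, partition)

    def poll(self) -> List[str]:
        records = self.broker.fetch(self.topic, self.partition, self.position)
        self.position += len(records)  # the consumer advances its own offset
        return records

    def commit(self) -> None:
        self.broker.commit(self.group, self.topic, self.partition, self.position)


broker = ToyBroker()
for text in ("a", "b", "c"):
    broker.append("orders", 0, text)

first = ToyConsumer(broker, "billing", "orders", 0)
first_batch = first.poll()
first.commit()

# A restarted consumer in the same group resumes after the committed offset.
broker.append("orders", 0, "d")
restarted = ToyConsumer(broker, "billing", "orders", 0)
second_batch = restarted.poll()
```

The restarted consumer picks up only the message appended after the commit, because the committed offset lives with the broker and ZooKeeper is not involved at any step.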
