Problem description
1. Concurrent consumption on the same topic and same partition
Suppose I have 100 partitions for a given topic (e.g. Purchases). I can easily consume these 100 partitions (e.g. Electronics, Clothing, and so on) in parallel using a consumer group with 100 consumers in it.
However, that assigns one consumer to each subset of the total data on Purchases. What if I want to consume just one subset of the data with 100 consumers concurrently? For example, all of my consumers only want to read the Electronics partition of the Purchases topic.
Is there a way to consume this partition concurrently?
In general, I just want all my consumers to receive the same data set concurrently.
From the information I've gathered, it seems to me that consumers CANNOT consume from replicas: Consuming from a replica
Can I produce the same data to multiple topics, like Purchase-1[Electronics] and Purchase-2[Electronics], so that I can then consume them concurrently? Is this a recommended approach?
2. Concurrent production to the same topic and same partition
When multiple producers are producing to the same topic and the same partition, we can only write to the partition leader, and replicas are only there for fault tolerance. Does this mean there isn't any concurrency (i.e. each commit must wait in line)?
Recommended answer
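- If these 100 consumers belong to different consumer groups, they can consume from the same topic and partition simultaneously. In that case, you need to make sure each consumer can handle the load from all 100 partitions, since a consumer that subscribes to the whole topic in its own group no longer shares the partitions with other group members.
- Producers can produce to the same topic partition concurrently, but the actual order in which messages are written to the partition is decided by the partition leader.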
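To make the first point concrete, here is a minimal sketch of how one might give each of the 100 consumers its own consumer group. It only builds configuration dictionaries; the broker address `localhost:9092`, the `purchases-reader` prefix, and the helper name `independent_group_configs` are assumptions for illustration, not something from the question.

```python
def independent_group_configs(n, bootstrap="localhost:9092", prefix="purchases-reader"):
    """Return n consumer configs, each with a distinct group.id.

    Because no two consumers share a group, the group coordinator does not
    split partitions between them, so each one can read the same partition.
    The bootstrap address and prefix are placeholder assumptions.
    """
    return [
        {
            "bootstrap.servers": bootstrap,
            "group.id": f"{prefix}-{i}",        # unique group per consumer
            "auto.offset.reset": "earliest",    # start from the oldest record
            "enable.auto.commit": False,        # commit offsets explicitly
        }
        for i in range(n)
    ]


configs = independent_group_configs(100)

# With the confluent-kafka package installed, each config could be used
# roughly like this (not executed here, and partition 0 is an assumption):
#   from confluent_kafka import Consumer, TopicPartition
#   consumer = Consumer(configs[0])
#   consumer.assign([TopicPartition("Purchases", 0)])
```

Each consumer then tracks its own offsets, so every one of them sees the full contents of the partition independently.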
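The second point can be illustrated with a toy in-memory model. This is not Kafka code: `PartitionLeader` and `run_producer` are hypothetical names, and a real broker batches and pipelines requests. The sketch only shows the idea that several producers may send at the same time while a single leader assigns the final offsets.

```python
import threading


class PartitionLeader:
    """Toy stand-in for a partition leader: one append-only log, one lock."""

    def __init__(self):
        self._log = []
        self._lock = threading.Lock()

    def append(self, producer_id, payload):
        # The leader, not the producer, decides each record's offset.
        with self._lock:
            offset = len(self._log)
            self._log.append((offset, producer_id, payload))
            return offset

    def records(self):
        with self._lock:
            return list(self._log)


def run_producer(leader, producer_id, count):
    # Each producer sends its own sequence 0..count-1 in order.
    for seq in range(count):
        leader.append(producer_id, seq)


leader = PartitionLeader()
threads = [
    threading.Thread(target=run_producer, args=(leader, pid, 50))
    for pid in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

log = leader.records()
```

In this model the interleaving between producers varies from run to run, but offsets are always contiguous and each producer's own records keep their send order.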