Question
Why does co-partitioning two KStreams in Kafka require both streams to have the same number of partitions, as stated in the documentation?
Answer
As the name "co-partitioning" indicates, the goal is for records that come from different topics but share the same key to be processed by the same Kafka Streams application instance. If the topics do not have the same number of partitions, this cannot be guaranteed.
Assume topic A has 2 partitions and topic B has 3. A record with key X could then be hashed to partition A-0 in one topic and B-1 in the other (i.e., different partition numbers), while a different key Y might be hashed to A-0 but B-2. In that case no fixed pairing of A partitions with B partitions would bring every key together.
Only if both topics have the same number of partitions (and are written with the same partitioning strategy) do records with the same key end up in the same partition number in each topic, which allows A-0/B-0, A-1/B-1, etc. to be processed together.
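To make the argument concrete, here is a minimal sketch of the "hash modulo partition count" idea. It is an illustration only: the class CoPartitionSketch and its methods are hypothetical, and String.hashCode() stands in for the murmur2 hash that Kafka's default partitioner applies to the serialized key bytes, so the actual partition numbers in a real cluster will differ.

```java
import java.util.List;

public class CoPartitionSketch {

    // Stand-in for Kafka's default partitioner (an assumption for illustration):
    // the real one hashes the serialized key bytes with murmur2, but the
    // "non-negative hash modulo partition count" structure is the same.
    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    // True if every key maps to the same partition number in both topics.
    static boolean coPartitioned(List<String> keys, int partitionsA, int partitionsB) {
        for (String key : keys) {
            if (partitionFor(key, partitionsA) != partitionFor(key, partitionsB)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> keys = List.of("X", "Y");

        // Topic A has 2 partitions, topic B has 3: partition numbers diverge.
        for (String key : keys) {
            System.out.printf("key %s -> A-%d, B-%d%n",
                    key, partitionFor(key, 2), partitionFor(key, 3));
        }
        System.out.println("2 vs 3 partitions aligned: " + coPartitioned(keys, 2, 3));

        // Same partition count on both topics: the same key always gets the same number.
        System.out.println("3 vs 3 partitions aligned: " + coPartitioned(keys, 3, 3));
    }
}
```

With this stand-in hash, key X lands on A-0 but B-1, so the 2-vs-3 check fails, while the 3-vs-3 check passes for any set of keys because the same function is applied with the same modulus.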