Problem description
I am trying to debug an issue, and to do that I want to prove that each distinct key only goes to one partition if the cluster is not rebalancing.
So I was wondering: for a given topic, is there a way to determine which partition a key is sent to?
Recommended answer
You need the key as a byte[] keyBytes. Assuming it isn't null, you can use org.apache.kafka.common.utils.Utils and run the following.
Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
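To try the expression above without the Kafka client on the classpath, here is a minimal stand-alone sketch. The class name KeyPartitionSketch, the method partitionForKey, the example key, and the partition count are all hypothetical. The murmur2 method is a re-implementation that is assumed to match Utils.murmur2; in real code, prefer calling the Kafka Utils class directly.

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch only; names here are hypothetical, not part of Kafka.
final class KeyPartitionSketch {

    // 32-bit murmur2, re-implemented here on the assumption that it
    // matches org.apache.kafka.common.utils.Utils.murmur2.
    static int murmur2(byte[] data) {
        int length = data.length;
        int seed = 0x9747b28c;
        final int m = 0x5bd1e995;
        final int r = 24;

        int h = seed ^ length;
        int length4 = length / 4;

        // Mix the input four bytes at a time (little-endian).
        for (int i = 0; i < length4; i++) {
            int i4 = i * 4;
            int k = (data[i4] & 0xff)
                    + ((data[i4 + 1] & 0xff) << 8)
                    + ((data[i4 + 2] & 0xff) << 16)
                    + ((data[i4 + 3] & 0xff) << 24);
            k *= m;
            k ^= k >>> r;
            k *= m;
            h *= m;
            h ^= k;
        }

        // Mix in the remaining 1 to 3 bytes, if any.
        int tail = length & ~3;
        switch (length % 4) {
            case 3:
                h ^= (data[tail + 2] & 0xff) << 16;
                // fall through
            case 2:
                h ^= (data[tail + 1] & 0xff) << 8;
                // fall through
            case 1:
                h ^= data[tail] & 0xff;
                h *= m;
                break;
            default:
                break;
        }

        h ^= h >>> 13;
        h *= m;
        h ^= h >>> 15;
        return h;
    }

    // Clears the sign bit so the result is non-negative, like Utils.toPositive.
    static int toPositive(int number) {
        return number & 0x7fffffff;
    }

    // Same shape as the expression in the answer.
    static int partitionForKey(byte[] keyBytes, int numPartitions) {
        return toPositive(murmur2(keyBytes)) % numPartitions;
    }

    public static void main(String[] args) {
        byte[] keyBytes = "order-42".getBytes(StandardCharsets.UTF_8); // example key
        int numPartitions = 6; // assumed partition count of the topic
        System.out.println(partitionForKey(keyBytes, numPartitions));
    }
}
```

The result only matches what a producer does if the producer uses the default partitioning for keyed records and numPartitions is the topic's actual partition count.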
For strings or JSON, the key is UTF-8 encoded, and the Utils class has helper functions to get those bytes.
For Avro, such as Confluent serialized values, it's a bit more complicated (a magic byte, then a schema ID, then the data). See the Confluent wire format documentation.
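As a rough illustration of the two cases, the sketch below builds key bytes for a string key and for a Confluent-framed key. The class and method names are hypothetical, the schema ID and payload are placeholder values, and the Avro payload is assumed to be already serialized elsewhere. The framing follows the layout described above: one magic byte (0), a 4-byte big-endian schema ID, then the data.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Illustrative sketch only; names here are hypothetical.
final class KeyBytesSketch {

    // String or JSON keys: the UTF-8 bytes of the text.
    static byte[] utf8Key(String key) {
        return key.getBytes(StandardCharsets.UTF_8);
    }

    // Confluent-style framing: magic byte, schema ID, then the Avro payload.
    static byte[] confluentFramedKey(int schemaId, byte[] avroPayload) {
        ByteBuffer buffer = ByteBuffer.allocate(1 + 4 + avroPayload.length);
        buffer.put((byte) 0);    // magic byte
        buffer.putInt(schemaId); // ByteBuffer writes big-endian by default
        buffer.put(avroPayload); // already-serialized Avro data (assumed)
        return buffer.array();
    }

    public static void main(String[] args) {
        byte[] stringKey = utf8Key("order-42");
        byte[] avroKey = confluentFramedKey(17, new byte[] {0x02, 0x54}); // placeholder values
        System.out.println(stringKey.length + " " + avroKey.length);
    }
}
```

Whichever form applies, the bytes fed into the hash must be exactly the bytes the producer's key serializer emits, otherwise the computed partition will not match.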
"only goes to 1 partition"
This isn't a guarantee that each partition holds only one key. Hashes can collide, so different keys can land in the same partition.
It makes more sense to say that a given key isn't in more than one partition.
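The distinction can be seen in a small sketch: each key maps to exactly one partition, yet with more keys than partitions some partition must hold several keys. The class name, keys, and partition count are hypothetical, and Arrays.hashCode is used as a stand-in hash, not Kafka's murmur2; any deterministic hash shows the same effect.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only; names here are hypothetical.
final class SharedPartitionSketch {

    // Stand-in hash (NOT Kafka's murmur2), reduced to a partition index.
    static int partitionFor(String key, int numPartitions) {
        int hash = Arrays.hashCode(key.getBytes(StandardCharsets.UTF_8));
        return (hash & 0x7fffffff) % numPartitions;
    }

    // Groups keys by the partition they map to.
    static Map<Integer, List<String>> group(List<String> keys, int numPartitions) {
        Map<Integer, List<String>> byPartition = new TreeMap<>();
        for (String key : keys) {
            byPartition
                .computeIfAbsent(partitionFor(key, numPartitions), p -> new ArrayList<>())
                .add(key);
        }
        return byPartition;
    }

    public static void main(String[] args) {
        // Five keys over three partitions: at least one partition gets two or more keys.
        System.out.println(group(Arrays.asList("a", "b", "c", "d", "e"), 3));
    }
}
```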
"if the cluster is not rebalancing"
Rebalancing will still preserve the partition a key maps to.