Question
What difference could the choice of partitioner make to my throughput and latency? I have gone through all three partitioners. One thing I am clear on is that the ordered partitioners carry overhead, so I will not use them. Now I am a bit confused about choosing between the Random and Murmur3 partitioners.
Recommended answer
The main difference between the two is in how each generates the token hash values. The Random partitioner uses the JDK's native MD5 hash (because it was both convenient for the developers and standard across all JDKs). But since Cassandra doesn't really need a cryptographic hash, that function takes much longer than it needs to.
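As a rough illustration of the MD5-based approach, the sketch below derives a token from a partition key by taking the MD5 digest, reading it as a signed big-endian integer, and taking the absolute value. This mirrors how the Random partitioner is commonly described, but it is an approximation written in Python rather than Cassandra's own Java code, and the function name `md5_token` is hypothetical.

```python
import hashlib


def md5_token(partition_key: bytes) -> int:
    """Derive a RandomPartitioner-style token from a partition key (sketch)."""
    # MD5 always yields a 16-byte (128-bit) digest.
    digest = hashlib.md5(partition_key).digest()
    # Read the digest as a signed big-endian integer, then drop the sign,
    # which leaves a non-negative token no larger than 2**127.
    signed_value = int.from_bytes(digest, byteorder="big", signed=True)
    return abs(signed_value)


if __name__ == "__main__":
    for key in (b"alice", b"bob", b"carol"):
        print(key.decode(), md5_token(key))
```

The point to notice is that every read and write pays for a full cryptographic digest, even though the token is only used for placement.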
With the Murmur3 partitioner, the token hashing does only what Cassandra needs it to do, which is to generate a token that ensures even distribution across the nodes. This results in an improvement of 3 to 5 times in token hashing performance, which ultimately translates into the overall 10% gain that Carlo mentioned above.
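To make "even distribution across the nodes" concrete, here is a minimal sketch that hashes a batch of keys to tokens and counts how many land on each node when every node owns an equal slice of the token range. Murmur3 is not in the Python standard library, so the sketch uses MD5 as a stand-in hash; the even spread it shows is the property both partitioners aim for. The names `token_for`, `owner`, and `distribution`, and the equal-slice ownership model, are assumptions made for illustration, not Cassandra's actual ring logic.

```python
import hashlib
from collections import Counter

# Upper bound of the token range produced by token_for() below.
TOKEN_SPACE = 2**127


def token_for(key: bytes) -> int:
    """Hash a key to a non-negative token (MD5 used as a stand-in hash)."""
    digest = hashlib.md5(key).digest()
    return abs(int.from_bytes(digest, "big", signed=True))


def owner(token: int, node_count: int) -> int:
    """Map a token to a node index, each node owning an equal contiguous slice."""
    # min() guards the single edge case where token == TOKEN_SPACE.
    return min(token * node_count // TOKEN_SPACE, node_count - 1)


def distribution(keys, node_count: int) -> list:
    """Return the number of keys assigned to each node, in node order."""
    counts = Counter(owner(token_for(k), node_count) for k in keys)
    return [counts[i] for i in range(node_count)]


if __name__ == "__main__":
    keys = [f"user-{i}".encode() for i in range(10_000)]
    print(distribution(keys, 4))
```

With a well-mixing hash, each of the four counts should come out close to a quarter of the keys, regardless of how the keys themselves are patterned.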
It should also be noted that DataStax warns that the partitioners are not compatible, which means that once you start with one partitioner, you cannot (easily) convert to the other. Therefore, I would pick the newer, slightly faster Murmur3 partitioner.