Question
I have a question about partition distribution in Cassandra.
My problem is that my partitions are not evenly sized, and some partitions are accessed more often than others, so I'm afraid I'll end up with hot spots on some partitions sooner or later.
For example:
- I have two partitions: A and B.
- The size of A is 10; the size of B is 5.
- I read the full A partition twice as often as I read B.
- I have three nodes (1, 2, and 3), with a replication factor of 2.
Result:
- Node 1 (A), Node 2 (B, A), Node 3 (B)
- Node 1: size 10, read load 1.0
- Node 2: size 15, read load 1.5
- Node 3: size 5, read load 0.5
My nodes 1 and 2 are overloaded.
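The per-node numbers above can be reproduced with a small sketch. The replica placement (A on nodes 1 and 2, B on nodes 2 and 3) is taken from the example; the names `partitions`, `replicas`, and `node_load` are made up for illustration, and, as in the example, each replica is charged the full read rate of its partition.

```python
from collections import defaultdict

# Sizes and relative read rates from the example (A is read twice as often as B).
partitions = {
    "A": {"size": 10, "reads": 1.0},
    "B": {"size": 5, "reads": 0.5},
}

# Replica placement from the example, replication factor 2.
replicas = {"A": [1, 2], "B": [2, 3]}


def node_load(partitions, replicas):
    """Sum partition size and read rate over every node holding a replica."""
    size = defaultdict(int)
    reads = defaultdict(float)
    for key, info in partitions.items():
        for node in replicas[key]:
            size[node] += info["size"]
            reads[node] += info["reads"]
    return dict(size), dict(reads)


size, reads = node_load(partitions, replicas)
for node in sorted(size):
    print(f"Node {node}: size {size[node]}, read load {reads[node]}")
```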
I started researching my problem and found the concept of virtual nodes, but I'm not sure what it actually means.
Will a single partition key be assigned to different virtual nodes (1 partition key -> n token ranges)?
Or can a partition key be stored in only one virtual node?
Do I have to partition my keys myself by adding some partition info (like a random % 10 or something), or is there a way to make Cassandra do it automatically?
Recommended answer
No. Each partition key is mapped to exactly one virtual node and its replicas.
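To see why, here is a simplified model of a token ring with virtual nodes. It is only a sketch, not Cassandra's actual code: the MD5-based `token` function is a stand-in for the real partitioner hash, the vnode tokens are drawn at random, and the replica walk only roughly resembles simple clockwise placement. `build_ring` and `replicas_for` are hypothetical helper names.

```python
import bisect
import hashlib
import random


def token(key: str) -> int:
    # Stand-in hash for illustration; Cassandra's default Murmur3Partitioner
    # uses a different hash function.
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


def build_ring(nodes, vnodes_per_node=8, seed=0):
    """Give each physical node several random tokens (its virtual nodes)."""
    rng = random.Random(seed)
    return sorted(
        (rng.getrandbits(64), node)
        for node in nodes
        for _ in range(vnodes_per_node)
    )


def replicas_for(ring, key, rf=2):
    """Find the vnode owning the key's token, then walk the ring clockwise
    until rf distinct physical nodes have been collected."""
    tokens = [t for t, _ in ring]
    i = bisect.bisect_left(tokens, token(key)) % len(ring)
    owners = []
    while len(owners) < rf:
        node = ring[i % len(ring)][1]
        if node not in owners:
            owners.append(node)
        i += 1
    return owners


ring = build_ring([1, 2, 3])
print(replicas_for(ring, "A"))  # the same key always yields the same replicas
```

The key hashes to a single token, that token falls into exactly one vnode's range, and the whole partition lives there and on its replicas. Virtual nodes spread many different keys more evenly across physical nodes, but they do not split one partition.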
To avoid hotspots, it is useful to add a sharding key (random number % n) to the partition key. Otherwise, try to choose your partition key so that it does not cause hotspots.
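A minimal sketch of the sharding-key idea, using an in-memory dict as a stand-in for a table whose partition key is the pair (key, shard). `NUM_SHARDS`, `store`, `write`, and `read_all` are illustrative names, not part of any Cassandra API, and the shard count of 10 is an arbitrary assumption.

```python
import random
from collections import defaultdict

NUM_SHARDS = 10  # the "n" in "random number % n"; arbitrary for this sketch

# Stand-in for a table whose partition key is (key, shard).
store = defaultdict(list)


def write(key, row, rng=random):
    """Append the row to one randomly chosen shard of the logical partition."""
    shard = rng.randrange(NUM_SHARDS)
    store[(key, shard)].append(row)


def read_all(key):
    """Read the logical partition by querying every shard and merging."""
    rows = []
    for shard in range(NUM_SHARDS):
        rows.extend(store.get((key, shard), []))
    return rows


for i in range(100):
    write("A", i)

print(len(read_all("A")))  # 100
```

The trade-off is that each (key, shard) pair hashes to its own token, so the data and the reads for a hot key are spread over up to n partitions, but reading the full logical partition now takes n queries instead of one.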