
Question

I've watched a video on the Cassandra database, which turned out to be very effective and really explained a lot about Cassandra. I've also read some articles and books about Cassandra, but the thing I could not understand is how Cassandra scales horizontally. By scaling horizontally I mean adding more nodes to gain more space. As I understand it, each node holds identical data, i.e. if one node has 1 TB of data and it is replicated to the other nodes, then all n nodes will each contain 1 TB of data. Am I missing something here?

Recommended answer

Yes, you are missing something. Data does not need to be duplicated n times, where n is the number of nodes. You would typically configure your replication factor (RF) to be lower than the number of nodes (N).

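For example, RF = 3, N = 5. This means each row is stored on 3 of the 5 nodes; RF counts the total number of copies, so there is no separate original in addition to the replicas. The nodes are not chosen at random: Cassandra derives them from the row's partition key, so different rows land on different sets of nodes and each node holds roughly RF/N (here 3/5) of the data rather than all of it. If one node goes down, every row still has at least 2 copies on the other nodes.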
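To make the RF = 3, N = 5 case concrete, here is a minimal sketch of how replicas might be spread over a ring of nodes. It is a toy model, not Cassandra's actual placement logic: the helper names `replicas_for` and `per_node_share` are hypothetical, the MD5-based hash stands in for the real partitioner, and tokens, virtual nodes, and rack or datacenter awareness are ignored.

```python
import hashlib
from collections import Counter


def replicas_for(key: str, nodes: list[str], rf: int) -> list[str]:
    """Pick rf consecutive nodes on a toy ring, starting at the key's hash position.

    Assumption: a simplified stand-in for partition-key-based placement.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    start = int.from_bytes(digest, "big") % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(rf)]


def per_node_share(total_tb: float, rf: int, n: int) -> float:
    """Approximate data held per node when rf copies are spread evenly over n nodes."""
    return total_tb * rf / n


nodes = [f"node{i}" for i in range(5)]

# Count how many row copies each node ends up holding for 10,000 rows.
counts = Counter()
for i in range(10_000):
    for node in replicas_for(f"row-{i}", nodes, rf=3):
        counts[node] += 1

print(dict(counts))                # per-node copy counts, roughly even
print(per_node_share(1.0, 3, 5))   # TB per node for 1 TB of data, RF = 3, N = 5
print(per_node_share(1.0, 5, 100)) # TB per node for 1 TB of data, RF = 5, N = 100
```

Under these assumptions, 1 TB of data with RF = 3 on 5 nodes costs about 0.6 TB per node, not 1 TB, and adding nodes while keeping RF fixed lowers the per-node share further. That is where the horizontal scaling comes from.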

This works better in larger clusters, e.g. RF = 5, N = 100, where each node stores only about RF/N = 5% of the dataset.

Higher RF improves data redundancy and read speed, but decreases write speed, so there is a balance. If your RF is very high, like RF = N, you get very high data redundancy, high resilience to node failures, and high read throughput. On the other hand, write throughput will be very limited, as every write has to be replicated to all the nodes, and adding nodes no longer adds capacity because every node stores the full dataset. If one node goes down in this scenario, a write may fail, depending on the consistency level the client requests (for example ALL), because the required number of replicas cannot acknowledge it.
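The dependence on client settings can be sketched with a little arithmetic. The snippet below is an illustration only: `required_acks` and `write_succeeds` are hypothetical helpers, not part of any Cassandra driver, and only three consistency levels are modelled (ONE, QUORUM as a majority of RF, and ALL).

```python
def required_acks(rf: int, level: str) -> int:
    """Number of replica acknowledgements needed for a given consistency level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return rf // 2 + 1  # majority of the replicas
    if level == "ALL":
        return rf
    raise ValueError(f"unsupported consistency level: {level}")


def write_succeeds(rf: int, replicas_down: int, level: str) -> bool:
    """True if enough replicas are alive to acknowledge the write."""
    return rf - replicas_down >= required_acks(rf, level)


# RF = N = 5 with one node down: ALL cannot be satisfied, QUORUM and ONE can.
for level in ("ONE", "QUORUM", "ALL"):
    print(level, write_succeeds(rf=5, replicas_down=1, level=level))
```

In this model a single failed replica only blocks writes when the client insists on acknowledgements from every replica; a lower consistency level trades some consistency guarantees for availability.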

