Problem description
What is the difference between:
a) nodetool rebuild
b) nodetool repair [-pr]
In other words, what exactly do the respective commands do?
Recommended answer
nodetool rebuild: similar to the bootstrapping process (when you add a new node to the cluster), but for a datacenter. The process is mainly streaming from the nodes that are already live to the new nodes, which start out empty. Defining the key ranges for the new nodes is very fast; after that, the rest can be seen as a copy operation.
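As a rough illustration of "assign ranges, then copy", here is a toy Python model. It is only a sketch, not how Cassandra implements streaming: the 0-99 token ring, the `in_range` and `rebuild` helpers, and the sample rows are all made up for the example.

```python
# Toy model of rebuild: an empty node copies every row that falls inside
# its assigned token ranges from a node that is already live.
# All names and values here are hypothetical.

def in_range(token, start, end):
    """True if token lies in the ring range (start, end]."""
    if start < end:
        return start < token <= end
    return token > start or token <= end  # range wraps around the ring


def rebuild(source_rows, owned_ranges):
    """Return the rows a new, empty node would receive from a live node."""
    return {
        token: value
        for token, value in source_rows.items()
        if any(in_range(token, start, end) for start, end in owned_ranges)
    }


live_node = {5: "a", 20: "b", 42: "c", 77: "d", 95: "e"}
new_node = rebuild(live_node, owned_ranges=[(10, 50), (90, 10)])
print(new_node)
```

No comparison with existing data takes place: the new node has nothing to compare against, so everything in its ranges is simply copied.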
nodetool repair -pr: this is not a copy operation. The node being repaired is not empty; it already contains data, but if the replication factor is greater than 1, that data needs to be compared with the data on the other replicas, and any difference is corrected. With -pr, the node repairs only the token ranges for which it is the primary replica. The process involves a lot of network traffic, but most of it is not row data: the node being repaired requests a Merkle tree (basically a tree of hashes) to verify whether both nodes hold the same information. Where the trees differ, it requests a full stream of the sections of data that differ, so that all replicas end up with the same data. Streaming these hashes is faster than streaming the whole data set before verification; this works under the assumption that most data will be the same on both nodes, with only some differences here and there. Repair also matters for deletes: it propagates tombstones (the markers written when data is deleted) to every replica. The tombstones themselves are purged later by compaction, once gc_grace_seconds has passed, so repair should complete on every node within that window; otherwise deleted data can reappear.
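The "compare hashes first, stream only what differs" idea can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not Cassandra's implementation: the 0-99 token ring, the four equal buckets, and the helper names are hypothetical, and differing ranges are copied one way from the remote node, whereas real repair reconciles replicas in both directions by write timestamp.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def bucket_of(token, buckets, ring_size):
    """Index of the equal-width token range that contains token."""
    return min(token // (ring_size // buckets), buckets - 1)


def leaf_hashes(rows, buckets, ring_size):
    """One hash per token range, computed over the rows in that range."""
    groups = [[] for _ in range(buckets)]
    for token in sorted(rows):
        groups[bucket_of(token, buckets, ring_size)].append(f"{token}={rows[token]}")
    return [h("|".join(group).encode()) for group in groups]


def root(hashes):
    """Fold the leaf hashes pairwise up to a single root hash."""
    level = list(hashes)
    while len(level) > 1:
        level = [h(b"".join(level[i:i + 2])) for i in range(0, len(level), 2)]
    return level[0]


def repair(local, remote, buckets=4, ring_size=100):
    """Copy only the mismatching ranges from remote; return their indices."""
    local_leaves = leaf_hashes(local, buckets, ring_size)
    remote_leaves = leaf_hashes(remote, buckets, ring_size)
    if root(local_leaves) == root(remote_leaves):
        return []  # replicas agree, no row data is streamed
    mismatched = [i for i, (a, b) in enumerate(zip(local_leaves, remote_leaves)) if a != b]
    for token, value in remote.items():
        if bucket_of(token, buckets, ring_size) in mismatched:
            local[token] = value  # "stream" only the differing ranges
    return mismatched


node_a = {5: "a", 30: "b", 60: "c"}
node_b = {5: "a", 30: "b", 60: "c-new", 80: "d"}
streamed = repair(node_a, node_b)
print(streamed, node_a == node_b)
```

Only the ranges whose hashes disagree are transferred; the rows in the two matching ranges never cross the network, which is the saving the answer describes.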
Hope it helps!