Cassandra and Tombstones: creating a row, deleting the row, recreating the row = performance?

Question

Could someone please explain what effect the following process has on tombstones:

The sequence is executed sequentially in one thread (so it happens at a relatively high "speed", i.e. with no long pauses between the actions).

My Questions:

1.) What effect does this have on the creation of tombstones? After step 2.) a tombstone exists. But what happens to that tombstone if a new, slightly changed row is created again under the same key (step 3.) of the process)? Can Cassandra "reanimate" the tombstones efficiently?

2.) How much worse is the process described above compared to selectively deleting only the "date" field and then creating the "logincount" field? (The targeted approach will most likely perform better, but working out which fields have been deleted is much more complex than simply deleting the whole row and recreating it from scratch with the correct data.)

Remark/Update:

What I actually want to do is set the "date" field to null. But this does not work in Cassandra: nulls are not allowed as values. So if I want to set a field to null, I have to delete it. I am afraid that this explicit second delete request will have a negative performance impact compared to just setting the field to null. And, as described, I first have to find out which fields have been nullified and previously had a value (I have to compare all attributes to determine this).

Thank you very much!
Markus

Solution

I would like to belatedly clarify some things here.

First, with respect to Theodore's answer:

1) All rows have a tombstone field internally for simplicity, so when the new row is merged with the tombstone, it just becomes "row with new data, that also remembers that it was once deleted at time X." So there is no real penalty in that respect.

2) It is incorrect to say that "If you create and delete a column value rapidly enough that no flush takes place in the middle... the tombstone [is] simply discarded"; tombstones are always persisted, for correctness. Perhaps the situation Theodore was thinking of was the other way around: if you delete and then insert a new column value, the new column replaces the tombstone (just as it would any obsolete value). This is different from the row case, since the Column is the "atom" of storage.

3) Given (2), the delete-row-and-insert-new-one is likely to be more performant if there are many columns to be deleted over time. But for a single column the difference is negligible.
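
To make points 1) and 2) concrete, here is a minimal toy model in Python. It is not Cassandra's implementation, only a sketch of timestamp-based "last write wins" reconciliation under simplifying assumptions: timestamps are plain integers, ties are not modeled, and the names `Cell`, `Row`, `merge` and `live_columns` are made up for illustration. The column names `date` and `logincount` come from the question; the values are invented.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

NO_DELETION = -1  # sentinel: the row has never been deleted


@dataclass(frozen=True)
class Cell:
    """One column write: a value, or a column tombstone when value is None."""
    timestamp: int
    value: Optional[bytes]


@dataclass
class Row:
    """Toy row: a row-level deletion time plus per-column cells."""
    deleted_at: int = NO_DELETION
    cells: Dict[str, Cell] = field(default_factory=dict)


def merge(a: Row, b: Row) -> Row:
    """Reconcile two versions of a row; the newest timestamp wins."""
    merged = Row(deleted_at=max(a.deleted_at, b.deleted_at))
    for name in a.cells.keys() | b.cells.keys():
        candidates = [r.cells[name] for r in (a, b) if name in r.cells]
        newest = max(candidates, key=lambda c: c.timestamp)
        # Cells that are not newer than the row tombstone are shadowed by it.
        if newest.timestamp > merged.deleted_at:
            merged.cells[name] = newest
    return merged


def live_columns(row: Row) -> Dict[str, bytes]:
    """Columns a reader would see; column tombstones are filtered out."""
    return {n: c.value for n, c in row.cells.items() if c.value is not None}


# Row case: create (t=1), delete the whole row (t=2), recreate (t=3).
created = Row(cells={"name": Cell(1, b"markus"), "date": Cell(1, b"2011-01-01")})
deleted = Row(deleted_at=2)
recreated = Row(cells={"name": Cell(3, b"markus"), "logincount": Cell(3, b"1")})
state = merge(merge(created, deleted), recreated)

# Column case: delete one column (t=5), then insert a new value (t=6).
column_deleted = Row(cells={"date": Cell(5, None)})
column_rewritten = Row(cells={"date": Cell(6, b"2011-02-02")})
column_state = merge(column_deleted, column_rewritten)
```

In this model `state` still carries `deleted_at == 2` next to the new cells, which mirrors "row with new data, that also remembers that it was once deleted at time X". In `column_state` the newer cell simply replaces the column tombstone.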

Finally, regarding Tyler's answer, in my opinion it is more idiomatic to simply delete the column in question than to change its value to an empty [byte]string.
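
For the targeted approach, the bookkeeping the question worries about (finding which fields were nullified and previously had a value) can be sketched as a small diff. This is only an illustration: `plan_update` is a hypothetical helper, not part of any Cassandra client, and it assumes the old and new row are available as dictionaries in which a missing key or `None` means "no value".

```python
from typing import Dict, List, Optional, Tuple


def plan_update(
    old: Dict[str, Optional[bytes]],
    new: Dict[str, Optional[bytes]],
) -> Tuple[List[str], Dict[str, bytes]]:
    """Return (columns to delete, columns to write) to turn `old` into `new`.

    A column counts as null when it is missing or mapped to None.
    """
    # Delete only columns that had a value before and are null now.
    to_delete = sorted(
        name for name, value in old.items()
        if value is not None and new.get(name) is None
    )
    # Write only columns that are non-null and actually changed.
    to_write = {
        name: value for name, value in new.items()
        if value is not None and old.get(name) != value
    }
    return to_delete, to_write


old_row = {"name": b"markus", "date": b"2011-01-01"}
new_row = {"name": b"markus", "date": None, "logincount": b"1"}
deletes, writes = plan_update(old_row, new_row)
```

Here `deletes` contains only `date` and `writes` contains only `logincount`, so the unchanged `name` column is neither rewritten nor tombstoned.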


08-20 13:12