Problem description
I am having trouble counting the rows of a very large table in Cassandra.
A simple statement:
SELECT COUNT(*) FROM my.table;
fails with a timeout error:
OperationTimedOut: errors={}, ...
I increased client_timeout in the ~/.cassandra/cqlshrc file:
[connection]
client_timeout = 900
The statement now runs for longer, but it still ends with the OperationTimedOut error. How can I count the rows in the table?
Recommended answer
You can run several smaller counts over split token ranges instead of one count over the whole table. With the default Murmur3Partitioner, Cassandra uses a token range from -2^63 to +2^63-1. By splitting this range you can run queries like these:
select count(*) from my.table where token(partitionKey) > -9223372036854775808 and token(partitionKey) < 0;
select count(*) from my.table where token(partitionKey) >= 0 and token(partitionKey) <= 9223372036854775807;
Add the two counts together and you have the total. If these queries still time out, split them again into smaller token ranges.
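If two ranges are not enough, the split can be generated rather than written by hand. The following is a minimal sketch, assuming the Murmur3Partitioner token range given above; the helper names `split_token_range` and `count_queries` are made up for illustration, and the script only prints CQL statements, it does not connect to a cluster. It uses inclusive bounds on both ends so that every token falls into exactly one range.

```python
# Sketch: build COUNT(*) queries over equal slices of the token range.
# Assumes the Murmur3Partitioner range -2^63 .. 2^63-1; helper names are hypothetical.

MIN_TOKEN = -2**63
MAX_TOKEN = 2**63 - 1


def split_token_range(n_splits):
    """Return n_splits contiguous (start, end) pairs, both inclusive,
    that together cover MIN_TOKEN..MAX_TOKEN without gaps or overlap."""
    total = MAX_TOKEN - MIN_TOKEN + 1  # 2**64 tokens in total
    bounds = [MIN_TOKEN + (total * i) // n_splits for i in range(n_splits + 1)]
    return [(bounds[i], bounds[i + 1] - 1) for i in range(n_splits)]


def count_queries(table, partition_key, n_splits):
    """Build one COUNT(*) statement per token sub-range.
    For a composite partition key, pass the columns comma-separated."""
    return [
        f"SELECT COUNT(*) FROM {table} "
        f"WHERE token({partition_key}) >= {start} "
        f"AND token({partition_key}) <= {end};"
        for start, end in split_token_range(n_splits)
    ]


if __name__ == "__main__":
    # Print four queries; run each one in cqlsh and add up the results.
    for query in count_queries("my.table", "partitionKey", 4):
        print(query)
```

Raising the number of splits makes each query scan less data, so it is more likely to finish within the timeout; the total is still the sum of all the partial counts.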
Check out this tool, which does essentially that: https://github.com/brianmhess/cassandra-count