Problem description
I have a data model like the one below:
CREATE TABLE appstat.nodedata (
nodeip text,
timestamp timestamp,
flashmode text,
physicalusage int,
readbw int,
readiops int,
totalcapacity int,
writebw int,
writeiops int,
writelatency int,
PRIMARY KEY (nodeip, timestamp)
) WITH CLUSTERING ORDER BY (timestamp DESC)
where nodeip is the partition key and timestamp is the clustering key (sorted in descending order so the latest rows come first).
Sample data in this table:
SELECT * from nodedata WHERE nodeip = '172.30.56.60' LIMIT 2;
nodeip | timestamp | flashmode | physicalusage | readbw | readiops | totalcapacity | writebw | writeiops | writelatency
--------------+---------------------------------+-----------+---------------+--------+----------+---------------+---------+-----------+--------------
172.30.56.60 | 2017-12-08 06:13:07.161000+0000 | yes | 34 | 57 | 19 | 27 | 8 | 89 | 57
172.30.56.60 | 2017-12-08 06:12:07.161000+0000 | yes | 70 | 6 | 43 | 88 | 79 | 83 | 89
This works as expected: whenever I need the statistics, I can fetch the data by partition key, as in the query above.
(The logic above seems similar to my previous question, "Aggregation in Cassandra across partitions", but the expectation here is different.)
I have a value for each column (readbw, latency, etc.) populated every minute on all 4 nodes.
Now, if I need the max value of a column (for example, readbw), I can get it with the following query:
SELECT max(readbw) FROM nodedata WHERE nodeip IN ('172.30.56.60','172.30.56.61','172.30.56.60','172.30.56.63') AND timestamp < 1512652272989 AND timestamp > 1512537899000;
1) First question: Is there a way to perform a max aggregation on a column (readbw) across all nodes without using an IN query?
2) Second question: Is there a way in Cassandra, whenever I insert data for Node 1, Node 2, Node 3 and Node 4, to have it aggregated and stored in another table, so that I can read the aggregated value of each column from that aggregated table?
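To make the second question concrete, here is a minimal Python sketch of the kind of roll-up being described, assuming the aggregation is done by the application rather than by Cassandra itself. The function names, the METRICS tuple, and the 172.30.56.61 row are hypothetical and only for illustration; the two 172.30.56.60 rows are the sample rows shown above.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Columns to roll up (an assumption; pick whichever columns matter).
METRICS = ("readbw", "readiops", "writebw", "writeiops", "writelatency")


def minute_bucket(ts):
    """Truncate a datetime to the start of its minute."""
    return ts.replace(second=0, microsecond=0)


def rollup_max(rows):
    """Return {minute: {metric: max over all nodes}} for nodedata-shaped rows."""
    out = defaultdict(dict)
    for row in rows:
        bucket = out[minute_bucket(row["timestamp"])]
        for metric in METRICS:
            value = row[metric]
            bucket[metric] = max(bucket.get(metric, value), value)
    return dict(out)


UTC = timezone.utc
SAMPLE_ROWS = [
    # The two sample rows from the question.
    {"nodeip": "172.30.56.60", "timestamp": datetime(2017, 12, 8, 6, 13, 7, 161000, tzinfo=UTC),
     "readbw": 57, "readiops": 19, "writebw": 8, "writeiops": 89, "writelatency": 57},
    {"nodeip": "172.30.56.60", "timestamp": datetime(2017, 12, 8, 6, 12, 7, 161000, tzinfo=UTC),
     "readbw": 6, "readiops": 43, "writebw": 79, "writeiops": 83, "writelatency": 89},
    # Hypothetical row for a second node, made up for this example.
    {"nodeip": "172.30.56.61", "timestamp": datetime(2017, 12, 8, 6, 13, 5, 0, tzinfo=UTC),
     "readbw": 60, "readiops": 10, "writebw": 12, "writeiops": 40, "writelatency": 30},
]

if __name__ == "__main__":
    for minute, maxima in sorted(rollup_max(SAMPLE_ROWS).items()):
        print(minute.isoformat(), maxima)
```

Each per-minute result could then be written to a separate table keyed by minute (its name and schema would be a design choice, not something given in this post), so reads of the aggregated values touch a single partition.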
If any of my points is unclear, please let me know.
Thanks,
Harry
Recommended answer
If you are using DSE (DataStax Enterprise) Cassandra, you can enable Spark and write the aggregation queries there.
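The answer gives no code, so the following is only a rough PySpark sketch of what such an aggregation query might look like. It assumes a SparkSession that already has the Spark Cassandra connector available and is pointed at the cluster; the function names ms_to_utc and max_readbw are hypothetical. Because the whole table is read through Spark, no IN list of node IPs is needed.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def ms_to_utc(ms):
    """Convert epoch milliseconds (as used in the question's query) to a UTC datetime."""
    return EPOCH + timedelta(milliseconds=ms)


def max_readbw(spark, start_ms, end_ms):
    """Max readbw over every node (partition) within the time window.

    Assumption: `spark` is a SparkSession configured with the Spark
    Cassandra connector and the cluster's contact points.
    """
    # Imported lazily so ms_to_utc can be used without Spark installed.
    from pyspark.sql import functions as F

    df = (
        spark.read
        .format("org.apache.spark.sql.cassandra")
        .options(keyspace="appstat", table="nodedata")
        .load()
    )
    windowed = df.filter(
        (F.col("timestamp") > F.lit(ms_to_utc(start_ms)))
        & (F.col("timestamp") < F.lit(ms_to_utc(end_ms)))
    )
    # agg() without groupBy yields exactly one row; the value is None if no rows match.
    return windowed.agg(F.max("readbw").alias("max_readbw")).first()["max_readbw"]
```

With the bounds from the question this would be called as max_readbw(spark, 1512537899000, 1512652272989). The same pattern extends to the other columns by adding more aggregate expressions to agg().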