Problem description
I'm new to Cassandra and I'm trying to figure out how I should store data so that I can perform fast reads in parallel. I have read that partitioning data can cause performance issues. Is it possible to read data from Cassandra tables in the same partition in parallel?
Recommended answer
DataStax's Oliver Michallat has a good blog post that discusses this.
In that article, he describes how to code parallel queries to solve the issues associated with multi-partition-key queries.
His example starts from a single multi-key query (issued from Java) such as:
SELECT * FROM users WHERE id IN (
e6af74a8-4711-4609-a94f-2cbfab9695e5,
281336f4-2a52-4535-847c-11a4d3682ec1);
A better way is to issue one query per key asynchronously and collect the results through a "future", like this:
Future<List<ResultSet>> future = ResultSets.queryAllAsList(session,
"SELECT * FROM users WHERE id = ?",
UUID.fromString("e6af74a8-4711-4609-a94f-2cbfab9695e5"),
UUID.fromString("281336f4-2a52-4535-847c-11a4d3682ec1")
);
for (ResultSet rs : future.get()) {
... // here is where you process the result set
}
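The `ResultSets.queryAllAsList` helper is not defined in the snippet above, so its implementation is not shown here. As a minimal sketch of the fan-out idea behind it, the following uses only `java.util.concurrent.CompletableFuture` from the standard library. The class `ParallelReads`, the method `queryAll`, and the `fakeQuery` function are hypothetical names introduced for illustration; in a real application the `query` function would presumably wrap the driver's asynchronous execute call for a single partition key.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;
import java.util.stream.Collectors;

public class ParallelReads {

    // Hypothetical helper: issues one asynchronous query per key and
    // returns a single future holding the results in key order.
    public static <K, R> CompletableFuture<List<R>> queryAll(
            Function<K, CompletableFuture<R>> query, List<K> keys) {
        // Start every query before waiting on any of them.
        List<CompletableFuture<R>> futures = keys.stream()
                .map(query)
                .collect(Collectors.toList());

        // allOf completes once every individual future has completed,
        // so the join() calls below do not block.
        return CompletableFuture
                .allOf(futures.toArray(new CompletableFuture[0]))
                .thenApply(done -> futures.stream()
                        .map(CompletableFuture::join)
                        .collect(Collectors.toList()));
    }

    public static void main(String[] args) {
        // Stand-in for a real single-partition query (assumption).
        Function<String, CompletableFuture<String>> fakeQuery =
                id -> CompletableFuture.supplyAsync(() -> "row for " + id);

        List<String> rows = queryAll(fakeQuery, Arrays.asList("id-1", "id-2")).join();
        rows.forEach(System.out::println);
    }
}
```

Because each query targets exactly one partition key, different nodes can serve the requests concurrently instead of one coordinator gathering all the keys of an `IN` clause.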
As for querying data from within the same partition, of course you can. I assume you mean queries with differing clustering keys (otherwise there would be no point), and those should work in much the same way as the example above.
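One way to picture parallel reads inside a single partition is to cut the clustering-key range into slices and issue one query per slice. The sketch below only computes the slices; the table `events`, its partition key `sensor_id`, its numeric clustering column `ts`, and the names `ClusteringSlices`, `split`, and `SLICE_QUERY` are all assumptions made up for illustration, not something from the question or the blog post.

```java
import java.util.ArrayList;
import java.util.List;

public class ClusteringSlices {

    // Hypothetical table: events(sensor_id, ts, ...) with partition key
    // sensor_id and clustering column ts. One query is bound per slice.
    static final String SLICE_QUERY =
            "SELECT * FROM events WHERE sensor_id = ? AND ts >= ? AND ts < ?";

    // Splits [start, end) into at most `parts` contiguous, non-overlapping
    // half-open ranges; each long[] is {fromInclusive, toExclusive}.
    public static List<long[]> split(long start, long end, int parts) {
        List<long[]> slices = new ArrayList<>();
        if (parts <= 0 || end <= start) {
            return slices;
        }
        // Ceiling division so the slices cover the whole range.
        long step = Math.max(1, (end - start + parts - 1) / parts);
        for (long from = start; from < end; from += step) {
            slices.add(new long[] {from, Math.min(from + step, end)});
        }
        return slices;
    }

    public static void main(String[] args) {
        for (long[] s : split(0, 100, 4)) {
            System.out.println(SLICE_QUERY + "  with ts in [" + s[0] + ", " + s[1] + ")");
        }
    }
}
```

Each slice could then be bound to `SLICE_QUERY` and executed asynchronously. All slices of one partition are served by the same set of replicas, so the speed-up is likely smaller than when the queries are spread across different partitions.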