Cassandra 1.1 composite keys, columns, and filtering in CQL 3

Question

I would like to have a table as follows:

CREATE TABLE ProductFamilies (
  ID varchar,
  PriceLow int,
  PriceHigh int,
  MassLow int,
  MassHigh int,
  MnfGeo int,
  MnfID bigint,
  Data varchar,
  PRIMARY KEY (ID)
);

There are 13 fields in total, most of which represent buckets. Data is a JSON document of product family IDs, which is then used in a subsequent query. Given how Cassandra works, the column names under the hood will be the values. I wish to filter on these.

I wish to run queries as follows:

SELECT Data FROM MyApp.ProductFamilies WHERE ID IN (?, ?, ?) AND PriceLow >= ?
AND PriceHigh <= ? AND MassLow >= ? AND MassHigh <= ? and MnfGeo >= ? AND
MnfGeo <= ?




  1. I read that Cassandra can only execute WHERE predicates against composite row keys or indexed columns. Is this still true? If so, I would have to make the columns preceding Data part of the PK.
  2. Is it still the case that one has to include all columns from left to right and cannot skip any?
  3. Are there any non-optimal points in my design?
  4. I would like to add a column "Materials", which is an array of the possible materials in a product family. Think pizza toppings, and querying "WHERE Materials IN ('Pineapple')". Short of creating a separate inverted index of materials and then manually intersecting it with the results of the above query, is there any other [more elegant] way of handling this in Cassandra?


Recommended answer

Basically, to support your queries you need either the following cassandra-cli column family definition or the CQL 3 table that comes after it:

create column family ProductFamilies with
comparator='CompositeType(UTF8Type, Int32Type, Int32Type, Int32Type, Int32Type, Int32Type, LongType, UTF8Type)'
and key_validation_class='UTF8Type'

CREATE TABLE ProductFamilies (
  ID varchar,
  PriceLow int,
  PriceHigh int,
  MassLow int,
  MassHigh int,
  MnfGeo int,
  MnfID bigint,
  Data varchar,
  PRIMARY KEY (ID, PriceLow, PriceHigh, MassLow, MassHigh, MnfGeo, MnfID, Data)
);

This works provided you do not skip any column from left to right (a column you do not filter on still needs at least a * in its place) and all your values are stored in the column names rather than in the column value.

One more thing you should understand about composite columns is that a column slice must be contiguous. So pricelow >= 10 and pricelow <= 40 will return a contiguous slice, but further filtering that result set by masslow and other columns will not work, because the result is no longer a contiguous slice. By the way, pricelow = 10 and masslow <= 20 and masslow >= 10 should work [tested with phpcassa], as it once again results in a contiguous slice.
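The contiguity rule can be illustrated with a small simulation. This is only a sketch in plain Python, not Cassandra code: it assumes a simplified row whose composite column names have just two adjacent components, (pricelow, masslow), and the sample values are hypothetical. It checks whether the columns matching a predicate sit next to each other in sorted order.

```python
# Hypothetical composite column names for one row, as (pricelow, masslow) tuples.
# Cassandra keeps the columns of a row sorted by the comparator; sorted() mimics that.
columns = sorted([
    (10, 5), (10, 12), (10, 15), (10, 25),
    (20, 12), (30, 50), (40, 18), (50, 11),
])

def matching_positions(predicate):
    """Return the sorted-order positions of the columns that satisfy the predicate."""
    return [i for i, col in enumerate(columns) if predicate(col)]

def is_contiguous(positions):
    """True when the positions form one unbroken run, i.e. a single slice."""
    return all(b - a == 1 for a, b in zip(positions, positions[1:]))

# Range on the leading component only: the matches form one slice.
by_price = matching_positions(lambda c: 10 <= c[0] <= 40)

# Range on the leading component plus a range on the next one: matches are scattered.
by_price_and_mass = matching_positions(
    lambda c: 10 <= c[0] <= 40 and 10 <= c[1] <= 20
)

# Equality on the leading component plus a range on the next one: one slice again.
by_eq_then_range = matching_positions(
    lambda c: c[0] == 10 and 10 <= c[1] <= 20
)

print(by_price, is_contiguous(by_price))
print(by_price_and_mass, is_contiguous(by_price_and_mass))
print(by_eq_then_range, is_contiguous(by_eq_then_range))
```

In this toy model, only the first and third predicates select an unbroken run of columns, which is the shape of request a single column slice can serve.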

Otherwise, create one or more secondary indexes on any of your columns. You can then query based on column values, provided you always have at least one indexed field in the query. See http://www.datastax.com/docs/1.1/ddl/indexes

Regarding your Materials question: as far as I know, if it is going to be a multivalued column there is no option other than an inverted index.
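As a rough illustration of the inverted-index approach, the sketch below uses in-memory Python structures in place of Cassandra storage. The family IDs, material names, and the helper names build_inverted_index and filter_by_material are all hypothetical; in practice the index would presumably live in its own column family keyed by material, and the candidate IDs would come from the bucket query.

```python
from collections import defaultdict

# Hypothetical product families and their materials (illustrative data only).
family_materials = {
    "fam-1": ["Pineapple", "Ham"],
    "fam-2": ["Mushroom"],
    "fam-3": ["Pineapple", "Olive"],
}

def build_inverted_index(materials_by_family):
    """Map each material to the set of family IDs that contain it."""
    index = defaultdict(set)
    for family_id, materials in materials_by_family.items():
        for material in materials:
            index[material].add(family_id)
    return index

materials_index = build_inverted_index(family_materials)

def filter_by_material(candidate_ids, material):
    """Intersect IDs returned by the main query with the index entry for a material."""
    return sorted(set(candidate_ids) & materials_index.get(material, set()))

# Pretend these IDs came back from the price/mass bucket query.
range_query_ids = ["fam-1", "fam-2"]
print(filter_by_material(range_query_ids, "Pineapple"))  # ['fam-1']
```

The intersection happens on the client, which is the manual step the question hoped to avoid; the sketch only shows that the step itself is small.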

It would be great if @jbellis could verify this.

