Does Crossfilter require a flat data structure?

Question

All the examples of Crossfilter I've found use a flat structure like this:

[
  { name: "Rusty",  type: "human", legs: 2 },
  { name: "Alex",   type: "human", legs: 2 },
  ...
  { name: "Fiona",  type: "plant", legs: 0 }
]


or:

date,open,high,low,close,volume,oi
11/01/1985,115.48,116.78,115.48,116.28,900900,0
11/04/1985,116.28,117.07,115.82,116.04,753400,0
11/05/1985,116.04,116.57,115.88,116.44,876800,0

I have hundreds of MBs of flat files I process to yield a 1-2MB JSON object with a structure roughly like:

{
  "meta": {"stuff": "here"},
  "data": {
    "accountName": {
      // rolled up by week
      "2013-05-20": {
        // any of several "dimensions"
        "byDay": {
          "2013-05-26": {
            "values": {
              "thing1": 1,
              "thing2": 2,
              "etc": 3
            }
          },
          "2013-05-27": {
            "values": {
              "thing1": 4,
              "thing2": 5,
              "etc": 6
            }
          }
          // and so on for day
        },
        "bySource": {
          "sourceA": {
            "values": {
              "thing1": 2,
              "thing2": 6,
              "etc": 7
            }
          },
          "sourceB": {
            "values": {
              "thing1": 3,
              "thing2": 1,
              "etc": 2
            }
          }
        }
      }
    }
  }
}

I'd like to display this as a table:

Group: byDay* || bySource || byWhatever

           | thing1 | thing2 | etc
2013-05-26 |      1 |      2 |   3
2013-05-27 |      4 |      5 |   6

or:

Group: byDay || bySource* || byWhatever

           | thing1 | thing2 | etc
sourceA    |      2 |      6 |   7
sourceB    |      3 |      1 |   2

Flattening this JSON structure would be difficult and yield a very large object.

I'd love to take advantage of Crossfilter's wonderful features, but I'm unsure if it's possible.

Is it possible for me to define/explain my current structure to Crossfilter? Perhaps there's another way I could approach this? I'll readily admit that I don't have a good grasp on dimensions and many other key Crossfilter concepts.

Recommended answer

Crossfilter works on an array of records, with each element of the array being mapped to one or more values via dimensions (which are defined using accessor functions).

Even if your data contains aggregate results, you can use this with Crossfilter, but note that it's technically impossible to combine data that has been aggregated across different dimensions, such as combining the "by day" and "by source" data in your example above. You could create a Crossfilter for each aggregated dimension, e.g. one for "by day", and run queries and groups on this, but I'm not sure how useful that would be compared with what you already have.
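To see why the "by day" and "by source" rollups can't be recombined, here is a small sketch. The two raw tables are made up for illustration (they are not from the question), and `rollUp` is a hypothetical helper. Both tables produce the same thing1 totals as in the example above, yet they disagree on every (day, source) cell, so the rollups alone can't tell you which one is the real data.

```javascript
// Two hypothetical raw datasets: thing1 counts per (day, source).
const rawA = [
  { day: "2013-05-26", source: "sourceA", thing1: 1 },
  { day: "2013-05-26", source: "sourceB", thing1: 0 },
  { day: "2013-05-27", source: "sourceA", thing1: 1 },
  { day: "2013-05-27", source: "sourceB", thing1: 3 },
];
const rawB = [
  { day: "2013-05-26", source: "sourceA", thing1: 0 },
  { day: "2013-05-26", source: "sourceB", thing1: 1 },
  { day: "2013-05-27", source: "sourceA", thing1: 2 },
  { day: "2013-05-27", source: "sourceB", thing1: 2 },
];

// Sum thing1 grouped by a single field, like a pre-computed rollup.
function rollUp(rows, field) {
  const totals = {};
  for (const row of rows) {
    totals[row[field]] = (totals[row[field]] || 0) + row.thing1;
  }
  return totals;
}

// Both datasets give identical rollups by day and by source...
console.log(rollUp(rawA, "day"), rollUp(rawB, "day"));
console.log(rollUp(rawA, "source"), rollUp(rawB, "source"));

// ...even though the underlying cells differ.
const cellsDiffer = rawA.some((row, i) => row.thing1 !== rawB[i].thing1);
console.log(cellsDiffer);
```

Once the per-cell records are gone, a filter on one aggregated dimension has nothing to act on in the other.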

As for memory usage, are you sure flattening your nested structure would really be that problematic? Bear in mind that each record (element of the flattened array) can contain references to strings and other objects in your nested structure, so you wouldn't necessarily use up all that much memory.
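As a rough illustration of flattening by reference, the sketch below walks a cut-down copy of the structure from the question (one account, one week) and emits one record per leaf. The `flatten` function and the record field names (`account`, `week`, `grouping`, `key`) are assumptions made for this example. Each record points at the existing `values` object rather than copying it.

```javascript
// Cut-down sample in the shape described in the question.
const report = {
  meta: { stuff: "here" },
  data: {
    accountName: {
      "2013-05-20": {
        byDay: {
          "2013-05-26": { values: { thing1: 1, thing2: 2, etc: 3 } },
          "2013-05-27": { values: { thing1: 4, thing2: 5, etc: 6 } },
        },
        bySource: {
          sourceA: { values: { thing1: 2, thing2: 6, etc: 7 } },
          sourceB: { values: { thing1: 3, thing2: 1, etc: 2 } },
        },
      },
    },
  },
};

// Walk account -> week -> grouping -> key and emit one record per leaf.
// `values` is stored by reference, so the numbers are not duplicated.
function flatten(data) {
  const records = [];
  for (const [account, weeks] of Object.entries(data)) {
    for (const [week, groupings] of Object.entries(weeks)) {
      for (const [grouping, entries] of Object.entries(groupings)) {
        for (const [key, entry] of Object.entries(entries)) {
          records.push({ account, week, grouping, key, values: entry.values });
        }
      }
    }
  }
  return records;
}

const records = flatten(report.data);

// Keep each pre-aggregated grouping in its own array; mixing them would double count.
const byDayRecords = records.filter((r) => r.grouping === "byDay");
console.log(records.length, byDayRecords.map((r) => r.key));
```

An array like `byDayRecords` is the kind of input you could pass to `crossfilter(...)`, with a dimension defined by an accessor such as `r => r.key`. That wiring is left out here so the sketch runs without the library.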

10-27 13:22