

I have been playing with crossfilter and found it great, but I hit a wall recently. I don't know if my problem needs crossfilter at all, so I'm happy to hear about any alternative solutions.


[{ year: "1987", country: "UK", product: "pineapple", tons_available: 10, tons_sold: 8},
{ year: "1987", country: "US", product: "pineapple", tons_available: 34, tons_sold: 18},
{ year: "1987", country: "UK", product: "pear", tons_available: 4, tons_sold: 3},
{ year: "1987", country: "US", product: "pear", tons_available: 23, tons_sold: 20},
{ year: "1988", country: "UK", product: "pineapple", tons_available: 12, tons_sold: 3},
{ year: "1988", country: "US", product: "pineapple", tons_available: 56, tons_sold: 6},
{ year: "1988", country: "UK", product: "pear", tons_available: 32, tons_sold: 32},
{ year: "1988", country: "US", product: "pear", tons_available: 31, tons_sold: 8},
and on, and on...]

I want to aggregate the data by year, and I was able to do so for metrics that already exist in the records, such as "tons_available" and "tons_sold".

// data_to_filter is the crossfilter instance; the dimension is keyed on year.
var by_year = data_to_filter.dimension(function(d) { return d.year; });
var tons_sold_by_year = by_year.group()
                         .reduceSum(function(d) { return d.tons_sold; });

However, I cannot find a way to create metrics on top of the aggregated object. For example, I would like to compute a sell-through rate (str) per year. For this, I would need to sum the fields per year and only then divide: tons_sold / tons_available.


Ideally, this would give me JSON formatted as:

[{ year: "1987", tons_available: 71, tons_sold: 49, str: 0.69 },
{ year: "1988", tons_available: 131, tons_sold: 49, str: 0.37 },
and on, and on...]


It seems to me that the aggregated object only keeps the summed variable, which results in just one key/value pair per year.


Is there a way to achieve what I am after?





You can do this by defining reduceAdd, reduceRemove and reduceInitial functions for your specific use case and then passing them to reduce.


var cf   = crossfilter(data),
    year = cf.dimension(function(d) { return d.year; });

// Called for each record that enters a group.
var reduceAdd = function(p, v) {
    p.tons_available += v.tons_available;
    p.tons_sold      += v.tons_sold;
    // Guard against 0/0 when the group is empty.
    p.str            = p.tons_available ? p.tons_sold/p.tons_available : 0;
    return p;
};

// Called for each record that a filter removes from a group.
var reduceRemove = function(p, v) {
    p.tons_available -= v.tons_available;
    p.tons_sold      -= v.tons_sold;
    p.str            = p.tons_available ? p.tons_sold/p.tons_available : 0;
    return p;
};

// Returns the starting value for every group.
var reduceInitial = function() {
    return {
        tons_available : 0,
        tons_sold      : 0,
        str            : 0
    };
};

var json = year.group().reduce(reduceAdd,reduceRemove,reduceInitial).orderNatural().top(Infinity);



When Crossfilter adds records to a group, it uses reduceAdd to recalculate the group's value. When a filter removes records from a group, it uses reduceRemove to update the value. The reduction also needs an initial value, which is supplied by reduceInitial (note that it's a function). See this permalink from the docs: https://github.com/square/crossfilter/wiki/API-Reference#group_reduce.
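To make the add/remove cycle concrete, here is a minimal sketch that calls the three reducers by hand on the 1987 records from the question, without loading Crossfilter at all. It only imitates what the library does internally. The `rate` helper and the `records1987` and `group1987` names are assumptions made for this illustration; `rate` avoids the NaN you would otherwise get from 0/0 once every record has been removed from a group.

```javascript
// Hypothetical helper: returns 0 instead of NaN for an empty group.
function rate(sold, available) {
    return available > 0 ? sold / available : 0;
}

function reduceInitial() {
    return { tons_available: 0, tons_sold: 0, str: 0 };
}

function reduceAdd(p, v) {
    p.tons_available += v.tons_available;
    p.tons_sold      += v.tons_sold;
    p.str             = rate(p.tons_sold, p.tons_available);
    return p;
}

function reduceRemove(p, v) {
    p.tons_available -= v.tons_available;
    p.tons_sold      -= v.tons_sold;
    p.str             = rate(p.tons_sold, p.tons_available);
    return p;
}

// The four 1987 records from the question.
var records1987 = [
    { year: "1987", country: "UK", product: "pineapple", tons_available: 10, tons_sold: 8 },
    { year: "1987", country: "US", product: "pineapple", tons_available: 34, tons_sold: 18 },
    { year: "1987", country: "UK", product: "pear", tons_available: 4, tons_sold: 3 },
    { year: "1987", country: "US", product: "pear", tons_available: 23, tons_sold: 20 }
];

// Adding: start from the initial value and fold every record in.
var group1987 = records1987.reduce(reduceAdd, reduceInitial());
console.log(group1987.tons_available, group1987.tons_sold); // 71 49

// Removing: imitate a filter on another dimension that excludes the US records.
records1987
    .filter(function(d) { return d.country === "US"; })
    .forEach(function(d) { reduceRemove(group1987, d); });
console.log(group1987.tons_available, group1987.tons_sold); // 14 11
```

Because str is recomputed from the running sums on every add and remove, it stays consistent with whatever filters are currently applied.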


I used the input that you gave above and this is the json that I got as a result:

[{key: "1988", value: {
    str: 0.37404580152671757,
    tons_available: 131,
    tons_sold: 49}},
 {key: "1987", value: {
    str: 0.6901408450704225,
    tons_available: 71,
    tons_sold: 49}}]


It's not exactly the output you asked for, but it's pretty close.
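If you want the flat shape from the question, one option is to post-process the grouped array with plain JavaScript. The sketch below hard-codes the result shown above so that it runs on its own. The `grouped` and `flat` names are made up for this example, and rounding str to two decimals is an assumption based on the 0.69 and 0.37 in the question.

```javascript
// Grouped result in Crossfilter's key/value shape, copied from the output above.
var grouped = [
    { key: "1988", value: { str: 0.37404580152671757, tons_available: 131, tons_sold: 49 } },
    { key: "1987", value: { str: 0.6901408450704225, tons_available: 71, tons_sold: 49 } }
];

// Flatten each group into a single object, then sort by year ascending.
var flat = grouped
    .map(function(g) {
        return {
            year: g.key,
            tons_available: g.value.tons_available,
            tons_sold: g.value.tons_sold,
            str: Math.round(g.value.str * 100) / 100 // two decimals
        };
    })
    .sort(function(a, b) {
        return a.year < b.year ? -1 : a.year > b.year ? 1 : 0;
    });

console.log(JSON.stringify(flat));
```

The explicit sort is there because the group values are objects, so ordering by value is not meaningful here. Calling all() on the group instead of top(Infinity) should also give you the groups ordered by key.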

