Problem description
I'm using Mike Bostock's crossfilter library to filter and sort large datasets. My problem: given a dataset with multiple dimensions, how can I sort on more than one dimension at a time?
Test data set:
[
{ cat: "A", val:1 },
{ cat: "B", val:2 },
{ cat: "A", val:11 },
{ cat: "B", val:5 },
{ cat: "A", val:3 },
{ cat: "B", val:2 },
{ cat: "A", val:11 },
{ cat: "B", val:100 }
]
Example of desired output, sorting by cat, val (ascending):
[
{ cat: "A", val:1 },
{ cat: "A", val:3 },
{ cat: "A", val:11 },
{ cat: "A", val:11 },
{ cat: "B", val:2 },
{ cat: "B", val:2 },
{ cat: "B", val:5 },
{ cat: "B", val:100 }
]
The approach I've used thus far is string concatenation on the desired dimensions:
var combos = cf.dimension(function(d) { return d.cat + '|' + d.val; });
This works fine with multiple string-based dimensions, but won't work with numeric dimensions, because the concatenated keys are compared as strings rather than in natural numeric order ('4' > '11'). I think I could make this work with zero-padding on the numbers, but this could get expensive for a large dataset, so I'd prefer to avoid it. Is there another way that might work here, using crossfilter?
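To make the failure concrete, here is a minimal sketch in plain JavaScript. It assumes a plain array in place of a crossfilter dimension, and the names `naiveKey`, `paddedKey`, and `byKey` are hypothetical. The padding width of 6 is an arbitrary choice for illustration, and the padded variant only handles non-negative integers.

```javascript
// Plain arrays stand in for a crossfilter dimension here (an assumption).
const rows = [
  { cat: "A", val: 4 },
  { cat: "A", val: 11 },
  { cat: "A", val: 3 }
];

// Naive key: the numeric part is compared character by character.
const naiveKey = d => d.cat + '|' + d.val;

// Zero-padded key: a fixed width makes string order match numeric order
// (non-negative integers only; width 6 is an arbitrary assumption).
const paddedKey = d => d.cat + '|' + String(d.val).padStart(6, '0');

// Sort a copy of the rows by the string key and return the values.
const byKey = keyFn => rows
  .slice()
  .sort((a, b) => (keyFn(a) < keyFn(b) ? -1 : keyFn(a) > keyFn(b) ? 1 : 0))
  .map(d => d.val);

const naiveOrder = byKey(naiveKey);   // [11, 3, 4], wrong numeric order
const paddedOrder = byKey(paddedKey); // [3, 4, 11]
```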
Bonus points for any solution that allows different dimensions to have different sort directions (ascending/descending).
Clarification: Yes, I may need to switch to a native Array.sort implementation. But the whole point of using crossfilter is that it's very, very fast, especially for large datasets, and it caches sort order in a way that makes repeated sorts even faster. So I'm really looking for a crossfilter-based answer here.
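For reference, the native Array.sort fallback mentioned above might look like the sketch below. It is only a baseline for comparison and does none of crossfilter's caching. The `compareBy` helper and its `{ key, dir }` shape are hypothetical names introduced for illustration.

```javascript
const data = [
  { cat: "A", val: 1 },
  { cat: "B", val: 2 },
  { cat: "A", val: 11 },
  { cat: "B", val: 5 },
  { cat: "A", val: 3 },
  { cat: "B", val: 2 },
  { cat: "A", val: 11 },
  { cat: "B", val: 100 }
];

// Build a comparator from a list of keys; dir is 1 (ascending) or -1 (descending).
function compareBy(keys) {
  return (a, b) => {
    for (const { key, dir } of keys) {
      if (a[key] < b[key]) return -dir;
      if (a[key] > b[key]) return dir;
    }
    return 0; // equal on every key
  };
}

// cat ascending, then val ascending (the desired output in the question).
const ascending = data.slice().sort(
  compareBy([{ key: 'cat', dir: 1 }, { key: 'val', dir: 1 }])
);

// cat ascending, then val descending (the mixed-direction bonus case).
const mixed = data.slice().sort(
  compareBy([{ key: 'cat', dir: 1 }, { key: 'val', dir: -1 }])
);
```

Each call re-sorts the whole array, which is the cost the question is trying to avoid for large datasets.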
Recommended answer
Here's what I ended up doing:
- I still use string concatenation on a single new dimension, but
- I convert the measure to a positive, comparable decimal before turning it into a string, using crossfilter to get the min/max:
// cf is the existing crossfilter instance
var vals = cf.dimension(function(d) { return d.val }),
    min = vals.bottom(1)[0].val,
    offset = min < 0 ? Math.abs(min) : 0,
    max = vals.top(1)[0].val + offset,
    valAccessor = function(d) {
        // offset ensures positive numbers, fraction ensures sort order
        return ((d.val + offset) / max).toFixed(8);
    },
    combos = cf.dimension(function(d) {
        return d.cat + '|' + valAccessor(d);
    });
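As a worked example, the sketch below applies the same key construction to the test data set from the question. It is an illustration under stated assumptions: `Math.min` and `Math.max` over a plain array stand in for the crossfilter `bottom(1)` and `top(1)` calls so the snippet runs without the library, and `comboKey` is a hypothetical name.

```javascript
const data = [
  { cat: "A", val: 1 },
  { cat: "B", val: 2 },
  { cat: "A", val: 11 },
  { cat: "B", val: 5 },
  { cat: "A", val: 3 },
  { cat: "B", val: 2 },
  { cat: "A", val: 11 },
  { cat: "B", val: 100 }
];

// Stand-ins for vals.bottom(1)[0].val and vals.top(1)[0].val (an assumption).
const values = data.map(d => d.val);
const min = Math.min(...values);
const offset = min < 0 ? Math.abs(min) : 0;
const max = Math.max(...values) + offset;

// Same accessor as in the answer: a fixed-width fraction in [0, 1].
const valAccessor = d => ((d.val + offset) / max).toFixed(8);
const comboKey = d => d.cat + '|' + valAccessor(d);

// Sort a copy of the rows by the combined string key.
const sorted = data.slice().sort((a, b) => {
  const ka = comboKey(a), kb = comboKey(b);
  return ka < kb ? -1 : ka > kb ? 1 : 0;
});

const firstKey = comboKey(sorted[0]);      // "A|0.01000000"
const sortedVals = sorted.map(d => d.val); // [1, 3, 11, 11, 2, 2, 5, 100]
```

Because every fraction is rendered with the same number of digits, plain string comparison agrees with numeric order. Values that differ by less than about 1e-8 of the full range would collapse to the same key.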
This has the advantage of handling negative numbers properly - not possible with zero-padding, as far as I can tell. It seems to be just as fast. The downside is that it requires creating a new dimension on the numeric column, but in my case I usually need that dimension anyway.
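The answer does not address the mixed-direction bonus from the question. One possible extension, which is an assumption and not part of the original answer, is to invert the normalized fraction for a descending numeric part. The sketch below uses hypothetical sample rows with negative values, plain-array min/max in place of crossfilter, and a hypothetical `makeKey` helper. It assumes `max` is greater than zero.

```javascript
// Hypothetical sample rows with negative values; not from the original post.
const rows = [
  { cat: "B", val: 5 },
  { cat: "A", val: -5 },
  { cat: "A", val: 5 },
  { cat: "A", val: 0 },
  { cat: "B", val: -5 }
];

const vals = rows.map(d => d.val);
const min = Math.min(...vals);
const offset = min < 0 ? Math.abs(min) : 0;
const max = Math.max(...vals) + offset; // assumed to be > 0

// makeKey is a hypothetical helper: descending = true flips the numeric part.
function makeKey(descending) {
  return d => {
    const fraction = (d.val + offset) / max; // in [0, 1]
    const part = descending ? 1 - fraction : fraction;
    return d.cat + '|' + part.toFixed(8);
  };
}

// Sort a copy of the rows by the string key and label each row.
const sortBy = keyFn => rows
  .slice()
  .sort((a, b) => (keyFn(a) < keyFn(b) ? -1 : keyFn(a) > keyFn(b) ? 1 : 0))
  .map(d => d.cat + d.val);

const ascending = sortBy(makeKey(false)); // cat ascending, val ascending
const descending = sortBy(makeKey(true)); // cat ascending, val descending
```

In crossfilter, the same key function could be passed to `cf.dimension`, giving one dimension per direction combination. Whether that cost is acceptable depends on the dataset.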