Fitting a cumulative percentage line to a sorted histogram with d3 to produce a Pareto chart histogram

Problem description

This is what I have so far: https://gist.github.com/daluu/fc1cbcab68852ed3c5fa and http://bl.ocks.org/daluu/fc1cbcab68852ed3c5fa. I'm trying to replicate Excel functionality.

The line fits the default histogram just fine as in the base/original http://bl.ocks.org/daluu/f58884c24ff893186416. And I'm able to sort the histogram in descending frequency, although in doing so, I switched x scales (from linear to ordinal). I can't seem to map the line to the sorted histogram correctly at this point. It should look like the following examples in terms of visual representation:

  • the Excel screenshot in a comment in my gist referenced above
  • the pareto chart sorted histogram in this SO post
  • the pareto chart (similar to but not exactly a sorted histogram) made with d3 here

What's the best design approach to get the remaining part working? Should I have started with a single x scale and not need to switch from linear to ordinal? If so, I'm not sure how to apply the histogram layout correctly using an ordinal scale or how not to use a linear x scale as a source of input to the histogram layout and still get the desired output.

Using the same ordinal scale with the code I have so far, the line looks ok but it's not the curve I am expecting to see.

Any help is appreciated.

Recommended answer

The main issue with the line is that the cumulative distribution needs to be recalculated after the bars are sorted, or, if you're going for a static Pareto chart, it needs to be calculated in the target sort order from the start. For this purpose I've created a small function to do the calculation:

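// Recomputes the cumulative share for each bin, in the bins' current order.
// `dataset` is not defined here; it is presumably the raw data array in the
// enclosing scope, so dataset.length is the total count and d.y a bin's count.
function calcCDF(data){
  data.forEach(function(d,i){
    if(i === 0){
      d.cum = d.y/dataset.length
    }else{
      d.cum = (d.y/dataset.length) + data[i-1].cum
    }
  })
  return data
}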
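
As a rough illustration of computing the cumulative values in the target sort order, here is a minimal sketch. It assumes each bin is an object with a count `y`, and it takes the total number of observations as an argument instead of reading an outer `dataset` variable. The helper names `sortForPareto` and `withCumulative` and the example bins are made up for this sketch; they are not part of d3 or of the original fiddle.

```javascript
// Hypothetical helper: returns a new array with the bins sorted by
// descending count. The bin objects themselves are shared, not copied.
function sortForPareto(bins) {
  return bins.slice().sort(function (a, b) { return b.y - a.y; });
}

// Hypothetical helper: writes the running share of observations into d.cum,
// walking the bins in whatever order the array is currently in.
function withCumulative(bins, total) {
  var running = 0;
  bins.forEach(function (d) {
    running += d.y / total;
    d.cum = running;
  });
  return bins;
}

// Made-up example bins: x is the bin start, y is the count.
var bins = [{ x: 0, y: 2 }, { x: 1, y: 5 }, { x: 2, y: 3 }];
var total = 10;

// Sort first, then accumulate, so the line climbs along the sorted bars.
var pareto = withCumulative(sortForPareto(bins), total);
```

Because the accumulation runs over the already sorted array, the largest bar contributes first and the line takes the concave shape expected of a Pareto chart. Calling `withCumulative(bins, total)` on the unsorted array restores the regular cumulative distribution.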

In my case, I'm toggling the Pareto sort on and off and recalculating the d.cum property each time. One could in theory create two cumulative distribution properties up front, i.e. d.cum for the regularly ordered distribution and, say, d.ParetoCum for the sorted cumulative, but I'm using d.cum in a tooltip and decided against that.

For the axis, I'm using a single ordinal scale, which I think is cleaner, but it required some work to make the labels meaningful for number ranges, since tick marks and labels no longer delineate the bins as they would with a linear scale. My solution here was to use the number range itself as the tick label, e.g. "1 - 1.99", and to add a function that alternates the tick padding (I got that solution a while ago from "Alternating tick padding in d3.js").
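
The labelling idea can be sketched roughly as follows. This assumes each bin carries a start `x` and a width `dx`, and that the data has a resolution of 0.01. The names `binLabel`, `formatBound`, and `tickPadding` are hypothetical and are not taken from the fiddle.

```javascript
// Hypothetical helper: rounds to two decimals and drops trailing zeros.
function formatBound(v) {
  return String(Math.round(v * 100) / 100);
}

// Hypothetical helper: builds a range label such as "1 - 1.99" for a bin.
// `step` is the assumed data resolution, used to show an inclusive upper bound.
function binLabel(d, step) {
  return formatBound(d.x) + " - " + formatBound(d.x + d.dx - step);
}

// Hypothetical helper: alternates the vertical padding of tick labels so
// neighbouring range labels do not overlap. Even ticks get `base`,
// odd ticks get `base + extra`.
function tickPadding(i, base, extra) {
  return base + (i % 2) * extra;
}

// Made-up example bins with unit width.
var exampleBins = [{ x: 0, dx: 1 }, { x: 1, dx: 1 }, { x: 2, dx: 1 }];
var labels = exampleBins.map(function (d) { return binLabel(d, 0.01); });
var paddings = exampleBins.map(function (d, i) { return tickPadding(i, 6, 12); });
```

The `labels` array would then serve as the domain of the ordinal scale, with the bars and the line sharing that one scale, and the padding values would be applied to the tick text elements after the axis is drawn.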

For the bar sorting, I'm using this d3 example as a reference, in case you need to understand it in the context of a simpler, smaller example.

See this fiddle, which incorporates all of the above. If you want to use it, I would suggest adding a check so the user cannot toggle off both the bars and the line (I left a note in the code; it should be trivial).
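
One way such a check might look is sketched below. It assumes the visibility of the two series is kept in a plain object with `bars` and `line` flags. The function name `toggleSeries` and the state shape are hypothetical and may not match how the fiddle tracks its toggles.

```javascript
// Hypothetical helper: flips the visibility of one series ("bars" or "line")
// and returns the new state. If the flip would hide both series, the toggle
// is rejected and the previous state is returned unchanged.
function toggleSeries(visible, key) {
  var next = { bars: visible.bars, line: visible.line };
  next[key] = !visible[key];
  if (!next.bars && !next.line) {
    return visible;
  }
  return next;
}

// Example: hiding the line is allowed, but hiding the bars afterwards is not.
var state = { bars: true, line: true };
state = toggleSeries(state, "line"); // bars stay visible, line is hidden
state = toggleSeries(state, "bars"); // rejected, bars remain visible
```

The click handlers for the two toggles would call this helper and redraw only when the returned state differs from the current one.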
