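I feel there should be a library available that makes two things simpler: a) finding the mode of an array of doubles, and b) gracefully reducing the precision until a particular frequency is reached.
Imagine an array like this: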
double[] a = {1.12, 1.15, 1.13, 2.0, 3.4, 3.44, 4.1, 4.2, 4.3, 4.4};
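If I am looking for a frequency of 3, it would go from 2 decimal places to 1 decimal place and finally return 1.1 as my mode. If I had a frequency requirement of 4, it would return 4 as my mode.
I do have a piece of code that works the way I want and returns what I expect, but I feel there should be a more efficient way to accomplish this, or an existing library that would help me do it. My code is attached below; I would be interested in thoughts and comments on a different approach I should take. I count iterations to limit how far the precision is reduced.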
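// Needs java.math.BigDecimal, java.text.DecimalFormat and java.util.HashMap;
// log is assumed to be a logger defined in the enclosing class.
public static double findMode(double[] r, int frequencyReq)
{
    double mode = 0d;
    int frequency = 0;
    int iterations = 4; // number of decimal places to start with; limits how far precision drops
    while(frequency < frequencyReq && iterations > 0){
        // Fresh counter for each precision level, so counts from finer
        // roundings do not carry over into coarser ones.
        HashMap<Double, BigDecimal> counter = new HashMap<Double, BigDecimal>();
        // Build a pattern with 'iterations' optional decimal places, e.g. "#.##".
        String roundFormatString = "#.";
        for(int j=0; j<iterations; j++){
            roundFormatString += "#";
        }
        DecimalFormat roundFormat = new DecimalFormat(roundFormatString);
        // Count how often each rounded value occurs.
        for(int i=0; i<r.length; i++){
            double element = Double.valueOf(roundFormat.format(r[i]));
            if(!counter.containsKey(element))
                counter.put(element, new BigDecimal(0));
            counter.put(element,counter.get(element).add(new BigDecimal(1)));
        }
        // Keep the most frequent rounded value seen so far.
        for(Double key : counter.keySet()){
            if(counter.get(key).compareTo(new BigDecimal(frequency))>0){
                mode = key;
                frequency = counter.get(key).intValue();
                log.debug("key: " + key + " Count: " + counter.get(key));
            }
        }
        iterations--;
    }
    return mode;
}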
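Edit
Another way to restate the problem, following Paul's comment: the goal is to find a number that has at least frequency array elements in its neighborhood, with the radius of the neighborhood as small as possible.

Best answer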
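Here is a solution for the reformulated problem:
The goal is to locate a number that has at least frequency array elements in its neighborhood, with the radius of the neighborhood as small as possible.
(I took the liberty of swapping the order of 1.15 and 1.13 in the input array.)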
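The basic idea is: the input is already sorted (so elements that are close in value are adjacent in the array), and we know how many elements we want in our neighborhood. So we loop over the array once, measuring the distance between each left element and the element frequency - 1 positions to its right. Those two elements and everything between them make up exactly frequency elements, so they form a neighborhood. Then we simply take the smallest such distance. (My method returns the result in a complicated way; you may want to do that better.)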
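As a minimal sketch of this scan, assuming it is acceptable to sort a copy of the input first (so the question's original, unsorted array works too): the class name TightestWindow and the helper densestCenter are made up for this illustration and are not part of the answer's code.

```java
import java.util.Arrays;

class TightestWindow {
    /** Center of the narrowest span that covers k of the values (hypothetical helper). */
    static double densestCenter(double[] values, int k) {
        if (k < 1 || k > values.length) {
            throw new IllegalArgumentException("k must be between 1 and values.length");
        }
        double[] sorted = values.clone();
        Arrays.sort(sorted); // the window scan relies on ascending order
        int best = 0;        // start index of the narrowest window found so far
        for (int i = 1; i + k - 1 < sorted.length; i++) {
            // width of the window holding sorted[i] .. sorted[i + k - 1]
            double width = sorted[i + k - 1] - sorted[i];
            if (width < sorted[best + k - 1] - sorted[best]) {
                best = i;
            }
        }
        return (sorted[best] + sorted[best + k - 1]) / 2.0;
    }

    public static void main(String[] args) {
        double[] a = {1.12, 1.15, 1.13, 2.0, 3.4, 3.44, 4.1, 4.2, 4.3, 4.4};
        System.out.println(densestCenter(a, 3));
        System.out.println(densestCenter(a, 4));
    }
}
```

For the sample array this prints values close to 1.135 for a frequency of 3 and 4.25 for a frequency of 4, which agrees with the centers in the output listed further down.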
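This approach is not quite equivalent to your original question (it does not work in fixed steps of decimal digits), but maybe it is what you really want :-)
You will have to find a better way of formatting the results of the code below, though.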
package de.fencing_game.paul.examples;

import java.util.Arrays;

/**
 * Searching for dense points in a distribution.
 *
 * Inspired by http://stackoverflow.com/questions/5329628/finding-a-mode-with-decreasing-precision.
 */
public class InpreciseMode {

    /** our input data, should be sorted ascending. */
    private double[] data;

    public InpreciseMode(double ... data) {
        this.data = data;
    }

    /**
     * Searches for the smallest neighbourhood (by diameter) which
     * contains at least minSize elements.
     *
     * @param minSize number of elements required, between 1 and data.length
     * @return an array of two arrays:
     *   { { the middle point of the neighbourhood,
     *       the diameter of the neighbourhood },
     *     all the elements of the neighbourhood }
     *
     * TODO: better return an object of a class encapsulating these.
     */
    public double[][] findSmallNeighbourhood(int minSize) {
        // Without this check an out-of-range minSize would leave the
        // indices at -1 and fail with an ArrayIndexOutOfBoundsException.
        if (minSize < 1 || minSize > data.length) {
            throw new IllegalArgumentException(
                "minSize must be between 1 and " + data.length);
        }
        int currentLeft = -1;
        int currentRight = -1;
        double currentMinDiameter = Double.POSITIVE_INFINITY;
        // Slide a window of minSize consecutive elements over the sorted data.
        for(int i = 0; i + minSize-1 < data.length; i++) {
            double diameter = data[i+minSize-1] - data[i];
            if(diameter < currentMinDiameter) {
                currentMinDiameter = diameter;
                currentLeft = i;
                currentRight = i + minSize-1;
            }
        }
        return
            new double[][] {
                {
                    (data[currentRight] + data[currentLeft])/2.0,
                    currentMinDiameter
                },
                Arrays.copyOfRange(data, currentLeft, currentRight+1)
            };
    }

    public void printSmallNeighbourhoods() {
        for(int frequency = 2; frequency <= data.length; frequency++) {
            double[][] found = findSmallNeighbourhood(frequency);
            System.out.printf("There are %d elements in %f radius "+
                              "around %f:%n %s.%n",
                              frequency, found[0][1]/2, found[0][0],
                              Arrays.toString(found[1]));
        }
    }

    public static void main(String[] params) {
        InpreciseMode m =
            new InpreciseMode(1.12, 1.13, 1.15, 2.0, 3.4, 3.44, 4.1,
                              4.2, 4.3, 4.4);
        m.printSmallNeighbourhoods();
    }
}
The output is:
There are 2 elements in 0,005000 radius around 1,125000:
[1.12, 1.13].
There are 3 elements in 0,015000 radius around 1,135000:
[1.12, 1.13, 1.15].
There are 4 elements in 0,150000 radius around 4,250000:
[4.1, 4.2, 4.3, 4.4].
There are 5 elements in 0,450000 radius around 3,850000:
[3.4, 3.44, 4.1, 4.2, 4.3].
There are 6 elements in 0,500000 radius around 3,900000:
[3.4, 3.44, 4.1, 4.2, 4.3, 4.4].
There are 7 elements in 1,200000 radius around 3,200000:
[2.0, 3.4, 3.44, 4.1, 4.2, 4.3, 4.4].
There are 8 elements in 1,540000 radius around 2,660000:
[1.12, 1.13, 1.15, 2.0, 3.4, 3.44, 4.1, 4.2].
There are 9 elements in 1,590000 radius around 2,710000:
[1.12, 1.13, 1.15, 2.0, 3.4, 3.44, 4.1, 4.2, 4.3].
There are 10 elements in 1,640000 radius around 2,760000:
[1.12, 1.13, 1.15, 2.0, 3.4, 3.44, 4.1, 4.2, 4.3, 4.4].
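The TODO in findSmallNeighbourhood suggests returning an object instead of nested arrays. One possible shape for that, as a sketch only: the class Neighbourhood and its members fromArrays and radius are hypothetical names invented here, and the sample values in main are copied from the three-element line of the output above.

```java
import java.util.Arrays;
import java.util.Locale;

/** Hypothetical result type; none of these names come from the original answer. */
final class Neighbourhood {
    final double center;
    final double diameter;
    final double[] elements;

    Neighbourhood(double center, double diameter, double[] elements) {
        this.center = center;
        this.diameter = diameter;
        this.elements = elements;
    }

    double radius() {
        return diameter / 2.0;
    }

    /** Wraps the double[][] layout that findSmallNeighbourhood returns. */
    static Neighbourhood fromArrays(double[][] found) {
        return new Neighbourhood(found[0][0], found[0][1], found[1].clone());
    }

    @Override
    public String toString() {
        // Locale.ROOT keeps '.' as the decimal separator on every machine.
        return String.format(Locale.ROOT,
                "%d elements within radius %.3f of %.3f: %s",
                elements.length, radius(), center, Arrays.toString(elements));
    }

    public static void main(String[] args) {
        // Center 1.135 and diameter 0.03 (radius 0.015) as in the output above.
        double[][] found = { {1.135, 0.03}, {1.12, 1.13, 1.15} };
        System.out.println(fromArrays(found));
    }
}
```

Formatting with an explicit locale is a design choice of this sketch: the printf call in the answer uses the default locale, which is why the output above shows decimal commas.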