重复计算百分位数的快速算法?

本文介绍了重复计算百分位数的快速算法?的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

在算法中，我必须计算数据集的 75th percentile价值.现在我正在这样做:

In an algorithm I have to calculate the 75th percentile of a data set whenever I add a value. Right now I am doing this:

获取值x
在后面已经排序好的数组中插入x
向下交换 x 直到数组被排序
读取位置array[array.size * 3/4]

Get value x
Insert x in an already sorted array at the back
swap x down until the array is sorted
Read the element at position array[array.size * 3/4]

点 3 是 O(n)，其余是 O(1)，但这仍然很慢，尤其是当数组变大时.有什么办法可以优化这个吗?

Point 3 is O(n), and the rest is O(1), but this is still quite slow, especially if the array gets larger. Is there any way to optimize this?

更新

谢谢尼基塔！由于我使用的是 C++，因此这是最容易实现的解决方案.代码如下:

Thanks Nikita! Since I am using C++ this is the solution easiest to implement. Here is the code:

template<class T>
class IterativePercentile {
public:
  /// Percentile has to be in range [0, 1(
  IterativePercentile(double percentile)
    : _percentile(percentile)
  { }

  // Adds a number in O(log(n))
  void add(const T& x) {
    if (_lower.empty() || x <= _lower.front()) {
      _lower.push_back(x);
      std::push_heap(_lower.begin(), _lower.end(), std::less<T>());
    } else {
      _upper.push_back(x);
      std::push_heap(_upper.begin(), _upper.end(), std::greater<T>());
    }

    unsigned size_lower = (unsigned)((_lower.size() + _upper.size()) * _percentile) + 1;
    if (_lower.size() > size_lower) {
      // lower to upper
      std::pop_heap(_lower.begin(), _lower.end(), std::less<T>());
      _upper.push_back(_lower.back());
      std::push_heap(_upper.begin(), _upper.end(), std::greater<T>());
      _lower.pop_back();
    } else if (_lower.size() < size_lower) {
      // upper to lower
      std::pop_heap(_upper.begin(), _upper.end(), std::greater<T>());
      _lower.push_back(_upper.back());
      std::push_heap(_lower.begin(), _lower.end(), std::less<T>());
      _upper.pop_back();
    }
  }

  /// Access the percentile in O(1)
  const T& get() const {
    return _lower.front();
  }

  void clear() {
    _lower.clear();
    _upper.clear();
  }

private:
  double _percentile;
  std::vector<T> _lower;
  std::vector<T> _upper;
};

重复计算百分位数的快速算法

问题描述

推荐答案