Problem description
I have a list of 100,000 objects. Every list element has a "weight" associated with it that is a positive int from 1 to N.
What is the most efficient way to select a random element from the list? I want the behavior that my distribution of randomly chosen elements is the same as the distribution of weights in the list.
For example, if I have a list L = {1,1,2,5}, I want the 4th element to be selected 5/9ths of the time, on average.
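To make the target distribution concrete, here is a minimal Python sketch of the example. It only pins down what "same distribution as the weights" means and gives an O(n)-per-draw baseline using the standard library's `random.choices`; the helper name `pick_index` is made up for illustration and is not the efficient solution being asked for.

```python
import random
from fractions import Fraction

weights = [1, 1, 2, 5]          # the example list L
total = sum(weights)            # 9

# Target probability of each position: weight / total.
target = [Fraction(w, total) for w in weights]

def pick_index(weights, rng=random):
    """Baseline weighted pick, O(n) per draw; returns a 0-based index."""
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]
```

Here `target[3]` is `Fraction(5, 9)`, matching the 5/9 stated for the 4th element.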
Assume inserts and deletes are common on this list, so any approach using "integral area tables" would need to be updated often. I am hoping there is a solution with O(1) runtime and O(1) extra memory.
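For reference, this is roughly what the table-based approach looks like, assuming "integral area table" means a running-total (prefix-sum) array. The function names `build_prefix` and `sample_index` are illustrative only. Sampling is fast, but a single delete forces an O(n) rebuild, which is the cost described above.

```python
import bisect
import itertools
import random

def build_prefix(weights):
    """Running totals: prefix[i] = weights[0] + ... + weights[i]."""
    return list(itertools.accumulate(weights))

def sample_index(prefix, rng=random):
    """O(log n) draw: pick r in [1, S], then find the first prefix >= r."""
    r = rng.randint(1, prefix[-1])
    return bisect.bisect_left(prefix, r)

weights = [1, 1, 2, 5]
prefix = build_prefix(weights)       # [1, 2, 4, 9]

# Deleting the first element invalidates every later running total,
# so the table has to be rebuilt (or patched) in O(n).
del weights[0]
prefix = build_prefix(weights)       # [1, 3, 8]
```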
Recommended answer
You can use an augmented binary search tree to store the elements, along with the sum of the weights in each subtree. This lets you insert and delete elements and weights however you want. Both sampling and updates require O(lg n) time per operation, and space usage is O(n).
Sampling is accomplished by generating a random integer in [1, S], where S is the sum of all weights (S is stored at the root of the tree), and performing binary search using the weight-sums stored for each subtree.