问题描述
我在 Vector 中有一组对象,我想从中选择一个随机子集(例如,返回 100 个项目;随机选择 5 个).在我第一次(非常仓促)通过时,我做了一个非常简单但可能过于聪明的解决方案:
I have a set of objects in a Vector from which I'd like to select a random subset (e.g. 100 items coming back; pick 5 randomly). In my first (very hasty) pass I did an extremely simple and perhaps overly clever solution:
Vector itemsVector = getItems();
Collections.shuffle(itemsVector);
itemsVector.setSize(5);
虽然这具有美观和简单的优点,但我怀疑它不会很好地扩展,即 Collections.shuffle() 至少必须是 O(n).我不太聪明的选择是
While this has the advantage of being nice and simple, I suspect it's not going to scale very well, i.e. Collections.shuffle() must be O(n) at least. My less clever alternative is
Vector itemsVector = getItems();
Random rand = new Random(System.currentTimeMillis()); // would make this static to the class
List subsetList = new ArrayList(5);
for (int i = 0; i < 5; i++) {
// be sure to use Vector.remove() or you may get the same item twice
subsetList.add(itemsVector.remove(rand.nextInt(itemsVector.size())));
}
关于从集合中抽取随机子集的更好方法有什么建议吗?
Any suggestions on better ways to draw out a random subset from a Collection?
推荐答案
Jon Bentley 在Programming Pearls"或More Programming Pearls"中讨论了这个问题.你需要小心你的 N of M 选择过程,但我认为显示的代码工作正常.您可以只对前 N 个位置进行随机洗牌,而不是随机洗牌所有项目 - 当 N <<
Jon Bentley discusses this in either 'Programming Pearls' or 'More Programming Pearls'. You need to be careful with your N of M selection process, but I think the code shown works correctly. Rather than randomly shuffle all the items, you can do the random shuffle only shuffling the first N positions - which is a useful saving when N << M.
Knuth 还讨论了这些算法 - 我相信那将是第 3 卷排序和搜索",但我的集合已被打包等待搬家,所以我无法正式检查.
Knuth also discusses these algorithms - I believe that would be Vol 3 "Sorting and Searching", but my set is packed pending a move of house so I can't formally check that.
这篇关于从集合中选择随机子集的最佳方法?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!