Question
I'm working with a large dictionary, and I also need to work on small random samples from it. How can I get such a sample (for example, of length 2)?
Here is a toy model:
dy = {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}
I need to perform some task on dy that involves all the entries. To keep it simple, let us say I need to add all the values together:
s = 0
for key in dy.keys():
    # add up every value in the dictionary
    s = s + dy[key]
Now I also need to perform the same task on a random sample of dy, and for that I need a random sample of the keys of dy. The simplest solution I can think of is:
sam = list(dy.keys())[:2]
That way I have a list of two keys of the dictionary which are somehow random. So, going back to my task, the only change I need in the code is:
s = 0
for key in sam:
    # add up only the values of the sampled keys
    s = s + dy[key]
The point is that I do not fully understand how dy.keys() is constructed, so I can't foresee any future issues.
Recommended answer
Given your example:
dy = {'a':1, 'b':2, 'c':3, 'd':4, 'e':5}
The sum of all the values is more simply written as:
s = sum(dy.values())
Then, if memory allows, you can sample the values directly:
import random
values = list(dy.values())
s = sum(random.sample(values, 2))
Or sample the keys instead. random.sample requires a sequence (set-like objects such as dy.keys() were accepted only before Python 3.11), so convert the keys to a list first:
from operator import itemgetter
import random
keys = random.sample(list(dy), 2)
s = sum(itemgetter(*keys)(dy))
Or simply:
s = sum(dy[k] for k in random.sample(list(dy), 2))
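If the sample is needed for more than one task, it may be handier to keep it as a small dictionary. The following is only a sketch: sample_dict and its seed parameter are hypothetical names introduced here for illustration, not part of the original answer or of the random module.

```python
import random

def sample_dict(d, k, seed=None):
    """Return a new dict holding k randomly chosen items of d."""
    rng = random.Random(seed)        # seed is optional; it makes runs repeatable
    keys = rng.sample(list(d), k)    # sample() needs a sequence, so copy the keys
    return {key: d[key] for key in keys}

dy = {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}
sub = sample_dict(dy, 2, seed=0)
s = sum(sub.values())
```

Note that list(d) copies all the keys, so this has the same memory caveat as the approaches above.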
An alternative is to use heapq, e.g.:
import heapq
import random
s = sum(heapq.nlargest(2, dy.values(), key=lambda L: random.random()))
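As for how dy.keys() is built: assuming Python 3.7 or later, dictionaries keep insertion order, so slicing the list of keys always returns the first keys inserted rather than a random pick. A small check of that behaviour:

```python
dy = {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}

# Slicing the key list is deterministic: it follows insertion order.
first_two = list(dy.keys())[:2]
again = list(dy.keys())[:2]

print(first_two)           # ['a', 'b']
print(first_two == again)  # True
```

This is why the slicing approach from the question is not a random sample, and why one of the random.sample or heapq variants is the safer choice.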