Problem description
I have a Mongo collection of documents. Every document has one field that is either 0 or 1. I need to randomly sample 1000 records from the database and count the number of documents in which that field is 1. I need to repeat this sampling 1000 times. How do I do it?
Recommended answer
For MongoDB 3.0 and earlier, I use an old trick from the SQL days (which I think Wikipedia uses for its random page feature). I store a random number between 0 and 1 in every object I need to randomize; let's call that field "r". You then add an index on "r":
db.coll.ensureIndex({r: 1});
Now, to get x random objects, you use:
var startVal = Math.random();
db.coll.find({r: {$gt: startVal}}).sort({r: 1}).limit(x);
This gives you random objects in a single find query. Depending on your needs, this may be overkill, but if you are going to do lots of sampling over time, it is a very efficient approach that puts little load on your backend.
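To connect this back to the original question (draw 1000 documents, count the ones where the field is 1, repeat 1000 times), here is a minimal sketch in plain JavaScript. It does not talk to MongoDB: an in-memory array sorted by "r" stands in for the indexed collection, and the field name "flag" and all function names are hypothetical. It also wraps around to the smallest "r" values when startVal lands so close to 1 that fewer than x documents lie above it, a case the single find query above would return short.

```javascript
// In-memory stand-in for the collection. Each document has a 0/1 field
// (hypothetically named "flag") and the random key "r" described above.
function makeDocs(n) {
  const docs = [];
  for (let i = 0; i < n; i++) {
    docs.push({ flag: i % 2, r: Math.random() });
  }
  // Sorting by r plays the role of the index on "r".
  docs.sort((a, b) => a.r - b.r);
  return docs;
}

// Mirrors find({r: {$gt: startVal}}).sort({r: 1}).limit(x), then wraps
// around to the lowest r values if fewer than x documents were found.
function sampleByR(sortedDocs, x, startVal) {
  let start = sortedDocs.findIndex((d) => d.r > startVal);
  if (start === -1) start = sortedDocs.length;
  const head = sortedDocs.slice(start, start + x);
  // Take the remainder from the front, never overlapping with head.
  const tail = sortedDocs.slice(0, Math.min(x - head.length, start));
  return head.concat(tail);
}

// Count the sampled documents whose 0/1 field equals 1.
function countOnes(sample) {
  return sample.filter((d) => d.flag === 1).length;
}

// Repeat the draw-and-count step, using a fresh startVal each round.
function repeatedCounts(sortedDocs, x, rounds) {
  const counts = [];
  for (let i = 0; i < rounds; i++) {
    counts.push(countOnes(sampleByR(sortedDocs, x, Math.random())));
  }
  return counts;
}

const docs = makeDocs(5000);
const counts = repeatedCounts(docs, 1000, 1000);
console.log(counts.length, counts.slice(0, 5));
```

Note that each draw is a contiguous window in "r" order, so two draws with nearby startVal values share many documents. If the 1000 repetitions need to be statistically independent, one option is to reassign "r" between rounds, at the cost of extra writes.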