问题描述
我希望从庞大的(1亿条记录)mongodb
中获得随机记录.
I am looking to get a random record from a huge (100 million record) mongodb
.
最快,最有效的方法是什么?数据已经存在,并且没有可以生成随机数并获得随机行的字段.
What is the fastest and most efficient way to do so? The data is already there and there are no field in which I can generate a random number and obtain a random row.
有什么建议吗?
推荐答案
从MongoDB 3.2版本开始,您可以使用 $sample
聚合管道运算符:
Starting with the 3.2 release of MongoDB, you can get N random docs from a collection using the $sample
aggregation pipeline operator:
// Get one random document from the mycoll collection.
db.mycoll.aggregate([{ $sample: { size: 1 } }])
如果要从集合的过滤子集中选择随机文档,请在管道之前添加$match
阶段:
If you want to select the random document(s) from a filtered subset of the collection, prepend a $match
stage to the pipeline:
// Get one random document matching {a: 10} from the mycoll collection.
db.mycoll.aggregate([
{ $match: { a: 10 } },
{ $sample: { size: 1 } }
])
如注释中所述,当size
大于1时,返回的文档样本中可能有重复项.
As noted in the comments, when size
is greater than 1, there may be duplicates in the returned document sample.
这篇关于来自MongoDB的随机记录的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!