python - PyMongo: cursor iteration

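I recently started testing MongoDB through the shell and PyMongo. I've noticed that when I get a cursor back and iterate over it, the iteration itself seems to be the bottleneck. Is there a way to return more than one document per iteration?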

Pseudocode:

for line in file:
    value = line[a:b]
    cursor = collection.find({"field": value})
    for entry in cursor:
        ...  # deal with a single entry each time

What I'm hoping to do is something like this:

for line in file:
    value = line[a:b]
    cursor = collection.find({"field": value})
    for all_entries in cursor:
        ...  # deal with all entries at once rather than iterating each time

I tried using batch_size() as described in this question, raising the value all the way up to 1000000, but it doesn't seem to have any effect (or I'm doing it wrong).

Any help is greatly appreciated. Please go easy on this Mongo newbie!

- - Edit - -

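Thanks, Caleb. I think you've pointed out what I was really trying to ask, which is: is there a way to run some sort of collection.findAll() or cursor.fetchAll() command, as there is with the cx_Oracle module? The problem isn't storing the data, but retrieving it from MongoDB as fast as possible.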

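As far as I can tell, the speed at which the data is returned to me is dictated by my network, since Mongo has to fetch each record individually. Is that correct?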

Best answer

Have you considered an approach like this:

for line in file:
    value = line[a:b]
    cursor = collection.find({"field": value})
    entries = list(cursor)  # or pull them out with a loop or comprehension -- just get all the docs
    # then process entries as a list, either singly or in batch

Or, something like:

entries = {}
for line in file:
    value = line[a:b]
    cursor = collection.find({"field": value})
    entries[value] = list(cursor)
# after the loop, all the cursors are out of scope and closed
for value in entries:
    ...  # process entries[value], either singly or in batch

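Basically, as long as you have enough RAM to hold your result sets, you should be able to pull them off the cursors and hold on to them before processing. This is unlikely to be significantly faster, but it takes any slowdown caused by the cursors themselves out of the picture, and it frees you to process the data in parallel if you're set up for that.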
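As a rough illustration of the idea above, the sketch below drains each cursor into a list and then hands the stored result sets to a thread pool. The helper names (fetch_grouped, process_batch, process_in_parallel), the default field name, and the worker count are assumptions made for this example, not part of PyMongo. The only PyMongo behavior relied on is that collection.find() returns an iterable cursor, so list(cursor) pulls every matching document into memory.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_grouped(collection, values, field="field"):
    """Run one find() per value and materialize each cursor into a list."""
    entries = {}
    for value in values:
        cursor = collection.find({field: value})
        # list() drains the cursor, so all matching documents end up in RAM
        entries[value] = list(cursor)
    return entries


def process_batch(docs):
    """Hypothetical batch handler: here it only counts the documents."""
    return len(docs)


def process_in_parallel(entries, max_workers=4):
    """Hand each materialized result set to a worker thread."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(process_batch, entries.values())
        # map() preserves input order, so results line up with the keys
        return dict(zip(entries.keys(), results))


# Usage against a real server (database and collection names are assumed):
#   from pymongo import MongoClient
#   collection = MongoClient()["mydb"]["mycoll"]
#   entries = fetch_grouped(collection, ["a", "b"])
#   counts = process_in_parallel(entries)
```

Threads only help if the per-batch work releases the GIL (I/O, for instance); for CPU-bound processing a process pool may be the better fit, at the cost of pickling the documents.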

Regarding python - PyMongo: cursor iteration, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/6680659/