Problem description
I'm running a $geoNear query on my sharded cluster (6 nodes with 3 replica sets, each consisting of 2 shardsvr and 1 arbiter). I expect the query to return 1.1m documents, but I am receiving only ~130.xxx documents. I am using the Java driver to issue the query and process the data (for now, I'm just counting the documents that get returned). I am using MongoDB 3.2.9 and the latest Java driver.
The mongod log shows the following error which is caused by the output document getting larger than 16MB:
2016-10-10T12:00:22.933+0200 W COMMAND [conn22] Too many geoNear results for query { location: { $nearSphere: { type: "Point", coordinates: [ 10.xxxx, 52.xxxxx] }, $maxDistance: 3900.0 } }, truncating output.
2016-10-10T12:00:22.951+0200 I COMMAND [conn22] command mydb.data command: geoNear { geoNear: "data", near: { type: "Point", coordinates: [ 10.xxxx, 52.xxxxx ] },
num: 50000000, maxDistance: 3900.0, query: {}, spherical: true, distanceMultiplier: 1.0, includeLocs: true } keyUpdates:0 writeConflicts:0 numYields:890 reslen:16777310
locks:{ Global: { acquireCount: { r: 1784 } }, Database: { acquireCount: { r: 892 } }, Collection: { acquireCount: { r: 892 } } } protocol:op_query 589ms
2016-10-10T12:00:23.183+0200 I COMMAND [conn22] getmore mydb.data query: { aggregate: "data", pipeline: [ { $geoNear: { near: { type: "Point", coordinates: [ 10.xxxx, 52.xxxxx ] },
distanceField: "dist.calculated", limit: 50000000, maxDistance: 3900.0, query: {}, spherical: true, distanceMultiplier: 1.0, includeLocs: "dist.location" } }, { $project: { _id: false,
dist: { calculated: true } } } ], fromRouter: true, cursor: { batchSize: 0 } } cursorid:170255616227 ntoreturn:0 cursorExhausted:1 keyUpdates:0 writeConflicts:0 numYields:0 nreturned:43558
reslen:1568108 locks:{ Global: { acquireCount: { r: 1786 } }, Database: { acquireCount: { r: 893 } }, Collection: { acquireCount: { r: 893 } } } 820ms
Query:
db.data.aggregate([
  {
    $geoNear: {
      near: {
        type: "Point",
        coordinates: [
          10.xxxx,
          52.xxxxx
        ]
      },
      distanceField: "dist.calculated",
      maxDistance: 3900,
      num: 50000000,
      includeLocs: "dist.location",
      spherical: true
    }
  }
])
Note that I issued the query both with and without the num parameter; both attempts fail with the error shown above.
I expected the query to return the results in chunks once the document size limit (16 MB) is exceeded. What am I missing? How can I retrieve all the data?
The query also fails with the same error in the mongod logs when I add a group stage:
db.data.aggregate([
  {
    $geoNear: {
      near: {
        type: "Point",
        coordinates: [
          10.xxxx,
          52.xxxxxx
        ]
      },
      distanceField: "dist.calculated",
      maxDistance: 3900,
      includeLocs: "dist.location",
      num: 2000000,
      spherical: true
    }
  },
  {
    $group: {
      _id: "$root_document"
    }
  }
])
Recommended answer
MongoDB staff member Lungang Fang answered my enquiry on the MongoDB user group in the meantime. Below is his answer:
There are two options to consider:
- If you don't need all the results, you can limit the "geoNear" aggregation result size using the num, limit, or maxDistance options.
- If you require all of the results, you can use the find() operator, which is not limited to the BSON maximum size since it returns a cursor.
Below is a test I did on MongoDB 3.2.10, for your information.
Create a "2dsphere" index for the designated collection:
db.coll.createIndex({location: '2dsphere'})
Create and insert several big documents:
// Build a padding string of roughly 15 MB.
var padding = '';
for (var j = 0; j < 15; j++) {
  for (var i = 1024*128; i > 0; --i) {
    padding += '12345678';
  }
}
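As a rough check on why documents of this size trigger the truncation, the padding arithmetic can be worked through in plain JavaScript. This is only a sketch and not part of the original answer; the constant names are illustrative, and it assumes one byte per character because the padding is plain ASCII.

```javascript
// Size of the padding string built by the nested loops above.
const outerIterations = 15;
const innerIterations = 1024 * 128;
const chunkLength = '12345678'.length;     // 8 characters per append

const paddingBytes = outerIterations * innerIterations * chunkLength;
const bsonLimitBytes = 16 * 1024 * 1024;   // 16 MB BSON document limit

// How many whole padded documents fit under the 16 MB limit?
const docsPerReply = Math.floor(bsonLimitBytes / paddingBytes);

console.log(paddingBytes);    // 15728640
console.log(bsonLimitBytes);  // 16777216
console.log(docsPerReply);    // 1
```

Each document therefore carries about 15 MB of padding, so a result that has to hold more than one of them in a single BSON document goes over the limit.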
db.coll.insert({location:{type:"Point", coordinates:[-73.861, 40.73]}, padding:padding})
db.coll.insert({location:{type:"Point", coordinates:[-73.862, 40.73]}, padding:padding})
db.coll.insert({location:{type:"Point", coordinates:[-73.863, 40.73]}, padding:padding})
db.coll.insert({location:{type:"Point", coordinates:[-73.864, 40.73]}, padding:padding})
db.coll.insert({location:{type:"Point", coordinates:[-73.865, 40.73]}, padding:padding})
db.coll.insert({location:{type:"Point", coordinates:[-73.866, 40.73]}, padding:padding})
Query using "geoNear"; the server log shows "Too many geoNear results …, truncating output":
db.coll.aggregate(
  [
    {
      $geoNear: {
        near: {type:"Point", coordinates:[-73.86, 40.73]},
        distanceField: "dist.calculated",
        maxDistance: 150000000,
        spherical: true
      }
    },
    {$project: {location:1}}
  ]
)
Query using "find"; all expected documents are returned:
// This and the following "var" are necessary to keep the padding string from flooding the screen.
var cursor = db.coll.find(
  {
    location: {
      $near: {
        $geometry: {type:"Point", coordinates:[-73.86, 40.73]},
        // Query operators take a "$" prefix, so the distance cap is $maxDistance (in meters).
        $maxDistance: 150000
      }
    }
  }
)
// It is necessary to iterate through the cursor. Otherwise, the query is not actually executed.
var x = cursor.next()
x._id
var x = cursor.next()
x._id
...
Regards, Lungang
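The manual next() calls in the test above can be replaced by a loop that drains the cursor, which matches the asker's goal of simply counting the returned documents. The following is a minimal sketch, assuming a cursor object that exposes hasNext() and next() the way the mongo shell cursor does. countAll and fakeCursor are hypothetical names, and the stand-in cursor exists only so the snippet runs outside the shell.

```javascript
// Count every document a cursor yields without keeping the documents in memory.
// Works with any object exposing hasNext()/next(), such as a mongo shell cursor.
function countAll(cursor) {
  let count = 0;
  while (cursor.hasNext()) {
    cursor.next();   // consume one document and discard it
    count += 1;
  }
  return count;
}

// Hypothetical stand-in for a shell cursor, backed by an in-memory array.
function fakeCursor(docs) {
  let i = 0;
  return {
    hasNext: () => i < docs.length,
    next: () => docs[i++],
  };
}

const total = countAll(fakeCursor([{ _id: 1 }, { _id: 2 }, { _id: 3 }]));
console.log(total); // 3
```

In the shell, the same function could be given the cursor returned by the find() query with $near shown above.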