Problem description
We recently decided to revisit some of our MongoDB indexes and came across a peculiar result when using a compound index that contains a multikey part.
It's important to note that we're using v2.4.5.
TL;DR: When using a compound index with a multikey part, one end of the range on a non-multikey field used for range restriction is dropped from the index bounds.
I'll explain the problem with an example:
Create some data
db.demo.insert(
[{ "foo" : 1, "attr" : [ { "name" : "a" }, { "name" : "b" }, { "name" : "c" } ]},
{ "foo" : 2, "attr" : [ { "name" : "b" }, { "name" : "c" }, { "name" : "d" } ]},
{ "foo" : 3, "attr" : [ { "name" : "c" }, { "name" : "d" }, { "name" : "e" } ]},
{ "foo" : 4, "attr" : [ { "name" : "d" }, { "name" : "e" }, { "name" : "f" } ]}])
Index
db.demo.ensureIndex({'attr.name': 1, 'foo': 1})
Query & Explain
Query on 'attr.name' but constrain the range of the non-multikey field 'foo':
db.demo.find({foo: {$lt:3, $gt: 1}, 'attr.name': 'c'}).hint('attr.name_1_foo_1').explain()
{
    "cursor" : "BtreeCursor attr.name_1_foo_1",
    "isMultiKey" : true,
    "n" : 1,
    "nscannedObjects" : 2,
    "nscanned" : 2,
    "nscannedObjectsAllPlans" : 2,
    "nscannedAllPlans" : 2,
    "scanAndOrder" : false,
    "indexOnly" : false,
    "nYields" : 0,
    "nChunkSkips" : 0,
    "millis" : 0,
    "indexBounds" : {
        "attr.name" : [
            [
                "c",
                "c"
            ]
        ],
        "foo" : [
            [
                -1.7976931348623157e+308,
                3
            ]
        ]
    }
}
As you can see, the range of 'foo' is not as defined in the query: one end is completely ignored, which results in nscanned being larger than it should be.
Changing the order of the range operands changes which end is dropped:
db.demo.find({foo: {$gt: 1, $lt:3}, 'attr.name': 'c'}).hint('attr.name_1_foo_1').explain()
{
    "cursor" : "BtreeCursor attr.name_1_foo_1",
    "isMultiKey" : true,
    "n" : 1,
    "nscannedObjects" : 2,
    "nscanned" : 2,
    "nscannedObjectsAllPlans" : 2,
    "nscannedAllPlans" : 2,
    "scanAndOrder" : false,
    "indexOnly" : false,
    "nYields" : 0,
    "nChunkSkips" : 0,
    "millis" : 0,
    "indexBounds" : {
        "attr.name" : [
            [
                "c",
                "c"
            ]
        ],
        "foo" : [
            [
                1,
                1.7976931348623157e+308
            ]
        ]
    }
}
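To make the cost concrete, here is a minimal plain-JavaScript sketch (runnable in Node.js, no MongoDB needed) that models the index as a sorted list of ('attr.name', 'foo') keys built from the sample documents and counts how many keys fall inside each set of bounds. The helper names buildKeys and countScanned are hypothetical, and the model is a simplification rather than what the server actually does. It assumes the dropped end is unbounded and the remaining end is exclusive, which is consistent with the nscanned values above.

```javascript
// Sample documents from the example above.
const docs = [
  { foo: 1, attr: [{ name: "a" }, { name: "b" }, { name: "c" }] },
  { foo: 2, attr: [{ name: "b" }, { name: "c" }, { name: "d" }] },
  { foo: 3, attr: [{ name: "c" }, { name: "d" }, { name: "e" }] },
  { foo: 4, attr: [{ name: "d" }, { name: "e" }, { name: "f" }] },
];

// Simplified model of a multikey index on { 'attr.name': 1, foo: 1 }:
// one (name, foo) key per array element, sorted by name and then foo.
function buildKeys(documents) {
  const keys = [];
  for (const doc of documents) {
    for (const element of doc.attr) {
      keys.push({ name: element.name, foo: doc.foo });
    }
  }
  keys.sort((a, b) => {
    if (a.name !== b.name) return a.name < b.name ? -1 : 1;
    return a.foo - b.foo;
  });
  return keys;
}

// Count the keys for one name whose foo lies strictly between lo and hi
// (assumption: both ends are treated as exclusive).
function countScanned(keys, name, lo, hi) {
  return keys.filter((k) => k.name === name && k.foo > lo && k.foo < hi).length;
}

const keys = buildKeys(docs);
const upperOnly = countScanned(keys, "c", -Infinity, 3); // lower end dropped
const lowerOnly = countScanned(keys, "c", 1, Infinity);  // upper end dropped
const bothEnds = countScanned(keys, "c", 1, 3);          // range as written in the query

console.log({ upperOnly, lowerOnly, bothEnds });
// { upperOnly: 2, lowerOnly: 2, bothEnds: 1 }
```

In this model each one-sided scan touches 2 keys, matching the reported nscanned, while the two-sided range would touch only 1.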
We're either missing some multikey index basics, or we're facing a bug.
We've gone through similar topics, including:
- https://groups.google.com/forum/#!searchin/mongodb-user/multikey$20bounds/mongodb-user/RKrsyzRwHrE/_i0SxdJV5qcJ
- Order of $lt and $gt in MongoDB range query
Unfortunately, these posts address a different use case, where a range is set on the multikeyed value.
Other things we've tried to do:
- Change the compound index ordering, starting with the non-multikey field.
- Put the 'foo' value inside each of the subdocuments in the 'attr' array, index by ('attr.name', 'attr.foo') and do an $elemMatch on 'attr' with a range constraint on 'foo'.
- Use an $and operator when defining the range:
db.demo.find({'attr.name': 'c', $and: [{foo: {$lt: 3}}, {foo: {$gt: 1}}]})
- Use MongoDB v2.5.4
None of the above had any effect (v2.5.4 made things worse by dropping both ends of the range completely).
We would appreciate any help!
Many thanks,
Roi
Recommended answer
With compound indexes where one of the indexed fields is an array, MongoDB will only use either a lower or upper bound for the range query to ensure correct matches are returned. See SERVER-958 for an example where constraining to both upper and lower index bounds would not find the expected document.
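The underlying reason can be sketched in plain JavaScript (Node.js, no MongoDB needed). Without $elemMatch, each range condition on an array field may be satisfied by a different element, so a document whose array holds 1 and 10 matches {$gt: 5, $lt: 8}, yet none of its index keys lie between 5 and 8. The field name a, the helper names, and the one-key-per-element index model are assumptions for illustration; this is the general array-matching rule, not a reproduction of the exact case in SERVER-958.

```javascript
// A document with an array field; a multikey index stores one key per element.
const doc = { _id: 1, a: [1, 10] };

// Simplified model of matching { a: { $gt: gt, $lt: lt } } without $elemMatch:
// each condition only needs SOME element to satisfy it.
function matchesWithoutElemMatch(values, gt, lt) {
  return values.some((v) => v > gt) && values.some((v) => v < lt);
}

// Index keys that fall inside the given bounds (both ends exclusive).
function keysInBounds(values, lo, hi) {
  return values.filter((v) => v > lo && v < hi);
}

const isMatch = matchesWithoutElemMatch(doc.a, 5, 8);
const bothBounds = keysInBounds(doc.a, 5, 8);
const lowerBoundOnly = keysInBounds(doc.a, 5, Infinity);

console.log(isMatch);        // true: 10 > 5 and 1 < 8
console.log(bothBounds);     // []: a two-sided scan would miss the document
console.log(lowerBoundOnly); // [ 10 ]: a one-sided scan still reaches it
```

Scanning with only one bound and checking the other condition against the fetched document keeps the result correct, at the price of scanning more keys.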
If your range query is on the array field, you can potentially use the $elemMatch operator to optimise your query within the expected index bounds. As of MongoDB 2.4, the $elemMatch operator does not work on non-array fields, so unfortunately this doesn't help your use case. You can watch/upvote SERVER-6050: Consider allowing $elemMatch applied to non arrays in the MongoDB issue tracker.
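As a rough illustration of why $elemMatch permits tighter bounds on an array field: it requires a single element to satisfy every condition, so any matching document has at least one index key inside the two-sided range. The following plain-JavaScript sketch (Node.js, no MongoDB needed) models only that matching rule; the function name and the sample arrays are hypothetical.

```javascript
// Simplified model of { a: { $elemMatch: { $gt: gt, $lt: lt } } }:
// ONE element must satisfy both conditions at once.
function matchesWithElemMatch(values, gt, lt) {
  return values.some((v) => v > gt && v < lt);
}

const spread = [1, 10];    // conditions met only by different elements
const inside = [1, 6, 10]; // 6 satisfies both conditions

console.log(matchesWithElemMatch(spread, 5, 8)); // false
console.log(matchesWithElemMatch(inside, 5, 8)); // true
```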
There is also an open issue, SERVER-7959: Potentially unexpected scans with compound indexes when some fields are multikey, describing this behaviour.