Problem description
I have a collection that is a log of activity on objects, like this:
{
"_id" : ObjectId("55e3fd1d7cb5ac9a458b4567"),
"object_id" : "1",
"activity" : [
{
"action" : "test_action",
"time" : ISODate("2015-08-31T00:00:00.000Z")
},
{
"action" : "test_action",
"time" : ISODate("2015-08-31T00:00:22.000Z")
}
]
}
{
"_id" : ObjectId("55e3fd127cb5ac77478b4567"),
"object_id" : "2",
"activity" : [
{
"action" : "test_action",
"time" : ISODate("2015-08-31T00:00:00.000Z")
}
]
}
{
"_id" : ObjectId("55e3fd0f7cb5ac9f458b4567"),
"object_id" : "1",
"activity" : [
{
"action" : "test_action",
"time" : ISODate("2015-08-30T00:00:00.000Z")
}
]
}
If I run the following query:
db.objects.find({
"createddate": { $gte: ISODate("2015-08-30T00:00:00.000Z") },
"activity.action": "test_action"
}).count()
It returns the count of documents containing "test_action" (3 in this set), but I need the count of all test_action entries (4 in this set). How do I do that?
Recommended answer
The most "performant" way to do this is to skip the $unwind altogether and simply $group to count. Essentially, "filter" each array, take the $size of the result, and $sum those sizes:
db.objects.aggregate([
{ "$match": {
"createddate": {
"$gte": ISODate("2015-08-30T00:00:00.000Z")
},
"activity.action": "test_action"
}},
{ "$group": {
"_id": null,
"count": {
"$sum": {
"$size": {
"$setDifference": [
{ "$map": {
"input": "$activity",
"as": "el",
"in": {
"$cond": [
{ "$eq": [ "$$el.action", "test_action" ] },
"$$el",
false
]
}
}},
[false]
]
}
}
}
}}
])
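One caveat with the $map plus $setDifference approach: set operators treat their inputs as sets, so entries that are exactly identical collapse into one. The sample data has distinct times per entry, so it is unaffected. The following plain JavaScript sketch only emulates that behavior outside MongoDB; the array contents are hypothetical, and JSON.stringify is an assumed stand-in for MongoDB's value comparison.

```javascript
// Hypothetical array with two identical entries (same action, same time).
const activity = [
  { action: "test_action", time: "2015-08-31T00:00:00.000Z" },
  { action: "test_action", time: "2015-08-31T00:00:00.000Z" },
  { action: "other_action", time: "2015-08-31T00:00:05.000Z" }
];

// Emulates $map + $cond: keep matching entries, replace the rest with false.
const mapped = activity.map(el => (el.action === "test_action" ? el : false));

// Emulates $setDifference against [false]: duplicates collapse because
// the result is a set. JSON.stringify approximates value equality here.
const asSet = [...new Set(mapped.map(el => JSON.stringify(el)))]
  .filter(s => s !== "false");

const setBasedCount = asSet.length;
const plainFilterCount = activity.filter(el => el.action === "test_action").length;

console.log(setBasedCount, plainFilterCount);
```

If exact duplicates can occur in the array and each one should be counted, a plain filter that preserves duplicates is the safer choice.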
Later releases of MongoDB (3.2 and newer) have $filter, which makes this much simpler:
db.objects.aggregate([
{ "$match": {
"createddate": {
"$gte": ISODate("2015-08-30T00:00:00.000Z")
},
"activity.action": "test_action"
}},
{ "$group": {
"_id": null,
"count": {
"$sum": {
"$size": {
"$filter": {
"input": "$activity",
"as": "el",
"cond": {
"$eq": [ "$$el.action", "test_action" ]
}
}
}
}
}
}}
])
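To make the difference between the two counts concrete, here is a minimal plain JavaScript sketch that mirrors the filter, size, and sum steps on the three sample documents from the question. It runs without MongoDB; the helper name countActions is hypothetical, and the createddate condition is left out because that field is not shown in the sample documents.

```javascript
// Sample documents copied from the question (only the fields used here).
const docs = [
  { object_id: "1", activity: [
    { action: "test_action", time: new Date("2015-08-31T00:00:00.000Z") },
    { action: "test_action", time: new Date("2015-08-31T00:00:22.000Z") }
  ]},
  { object_id: "2", activity: [
    { action: "test_action", time: new Date("2015-08-31T00:00:00.000Z") }
  ]},
  { object_id: "1", activity: [
    { action: "test_action", time: new Date("2015-08-30T00:00:00.000Z") }
  ]}
];

// Hypothetical helper: per document, keep the matching entries (like $filter),
// take the array length (like $size), and add the lengths up (like $sum).
function countActions(documents, action) {
  return documents.reduce(
    (total, doc) =>
      total + doc.activity.filter(el => el.action === action).length,
    0
  );
}

// What find(...).count() reports: documents with at least one matching entry.
const matchingDocs = docs.filter(doc =>
  doc.activity.some(el => el.action === "test_action")
).length;

// What the aggregation reports: matching array entries across all documents.
const matchingEntries = countActions(docs, "test_action");

console.log(matchingDocs, matchingEntries); // 3 4
```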
Using $unwind causes the documents to de-normalize and effectively creates a copy per array entry. Where possible you should avoid this due to the often extreme cost. Filtering and counting array entries per document is much faster by comparison, as is a simple $match and $group pipeline compared to many stages.
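The copy-per-entry effect can be sketched in plain JavaScript. This is only an emulation of what $unwind does to the document stream, not MongoDB itself, and the three documents below are made-up illustration data rather than the question's collection.

```javascript
// Illustrative documents (hypothetical data).
const docs = [
  { _id: 1, activity: [{ action: "test_action" }, { action: "test_action" }] },
  { _id: 2, activity: [{ action: "test_action" }] },
  { _id: 3, activity: [{ action: "other_action" }, { action: "test_action" }] }
];

// Emulates $unwind: one copy of the parent document per array entry.
const unwound = docs.flatMap(doc =>
  doc.activity.map(el => ({ ...doc, activity: el }))
);

// Counting after the unwind means matching against every copy.
const viaUnwind = unwound.filter(
  d => d.activity.action === "test_action"
).length;

// Per-document approach: filter and count in place, no intermediate copies.
const perDocument = docs.reduce(
  (n, doc) =>
    n + doc.activity.filter(el => el.action === "test_action").length,
  0
);

console.log(docs.length, unwound.length, viaUnwind, perDocument);
```

Both approaches give the same count, but the unwind path materializes one intermediate document per array entry (5 here, from 3 inputs), which is where the cost grows with large arrays.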