MongoDB Aggregation Framework


Problem description

I have a document that's structured as follows:

{
  '_id' => 'Star Wars',
  'count' => 1234,
  'spelling' => [ ( 'Star wars' => 10, 'Star Wars' => 15, 'sTaR WaRs' => 5) ]
}

I would like to get the top N documents (by descending count), but with only one spelling per document (the one with the highest value). Is there a way to do this with the aggregation framework?

I can easily get the top 10 results (using $sort and $limit). But how do I get only one spelling for each of them?

So for example, if I have the following three records:

{
  '_id' => 'star_wars',
  'count' => 1234,
  'spelling' => [ ( 'Star wars' => 10, 'Star Wars' => 15, 'sTaR WaRs' => 5) ]
}
{
  '_id' => 'willow',
  'count' => 2211,
  'spelling' => [ ( 'willow' => 300, 'Willow' => 550) ]
}
{
  '_id' => 'indiana_jones',
  'count' => 12,
  'spelling' => [ ( 'indiana Jones' => 10, 'Indiana Jones' => 25, 'indiana jones' => 5) ]
}

and I ask for the top 2 results, I'd expect to get:

{
  '_id' => 'willow',
  'count' => 2211,
  'spelling' => 'Willow'
}
{
  '_id' => 'star_wars',
  'count' => 1234,
  'spelling' => 'Star Wars'
}

(or something to this effect)

Thanks!

Recommended answer

Your schema as designed would make using anything but MapReduce difficult, because you've stored data values (the spellings) as the keys of the object. So I adjusted your schema to better match MongoDB's capabilities (shown in JSON format for this example):

{
  '_id' : 'star_wars',
  'count' : 1234,
  'spellings' : [
    { spelling: 'Star wars', total: 10},
    { spelling: 'Star Wars', total : 15},
    { spelling: 'sTaR WaRs', total : 5} ]
}

Note that spellings is now an array of objects with specific key names: spelling for the text and total for the number (I didn't know what that number actually represents, so I've called it total in my examples).
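If existing documents need to be migrated to this layout, the conversion can be sketched in plain JavaScript. This is only an illustration: toSpellingsArray is a hypothetical helper name, and it assumes the original spelling field is a single object that maps each spelling to its number, which the notation in the question does not make fully clear.

```javascript
// Hypothetical helper (not part of MongoDB): convert the original
// key/value "spelling" map into the array-of-objects layout shown above.
// Assumption: doc.spelling is one plain object of { spellingText: number }.
function toSpellingsArray(doc) {
  const spellings = Object.entries(doc.spelling).map(([spelling, total]) => ({
    spelling,
    total,
  }));
  return { _id: doc._id, count: doc.count, spellings };
}

// Sample input taken from the question.
const original = {
  _id: 'star_wars',
  count: 1234,
  spelling: { 'Star wars': 10, 'Star Wars': 15, 'sTaR WaRs': 5 },
};

const converted = toSpellingsArray(original);
console.log(JSON.stringify(converted, null, 2));
```

Each key of the old map becomes the spelling field of one array element, so the aggregation stages can address it by name.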

On to the aggregation:

db.so.aggregate([
    { $unwind: '$spellings' },
    { $project: {
        'spelling' : '$spellings.spelling',
        'total': '$spellings.total',
        'count': '$count'
        }
    },
    { $sort : { total : -1 } },
    { $group : { _id : '$_id',
        count: { $first: '$count' },
        largest : { $first : '$total' },
        spelling : { $first: '$spelling' }
        }
    }
])

1. Unwind all of the data so the aggregation pipeline can access the individual values of the array.
2. Flatten the data to include the key aspects needed by the pipeline. In this case, the specific spelling, the total, and the count.
3. Sort on the total, descending, so that the final grouping can use $first.
4. Group by _id so that only the $first (highest-total) spelling for each _id is returned. The count is returned with $first as well; because of the way the data was flattened, every intermediate document for a given _id carries the same count field.
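To make the effect of each stage concrete, here is a rough in-memory emulation in plain JavaScript. It does not use MongoDB at all; it only mimics what the unwind, project, sort, and group stages do to two of the sample records, and the variable names (docs, rows, groups, result) are made up for the illustration.

```javascript
// Plain JavaScript emulation of the pipeline stages (no MongoDB involved).
const docs = [
  { _id: 'star_wars', count: 1234, spellings: [
    { spelling: 'Star wars', total: 10 },
    { spelling: 'Star Wars', total: 15 },
    { spelling: 'sTaR WaRs', total: 5 } ] },
  { _id: 'willow', count: 2211, spellings: [
    { spelling: 'willow', total: 300 },
    { spelling: 'Willow', total: 550 } ] },
];

// $unwind + $project: one flat row per array element,
// each row still carrying the parent's _id and count.
const rows = docs.flatMap(d =>
  d.spellings.map(s => ({
    _id: d._id,
    count: d.count,
    spelling: s.spelling,
    total: s.total,
  })));

// $sort: highest total first.
rows.sort((a, b) => b.total - a.total);

// $group with $first: keep only the first row seen for each _id,
// which after the sort is the row with the largest total.
const groups = new Map();
for (const r of rows) {
  if (!groups.has(r._id)) {
    groups.set(r._id, {
      _id: r._id,
      count: r.count,
      largest: r.total,
      spelling: r.spelling,
    });
  }
}

const result = [...groups.values()];
console.log(result);
```

The sort has to come before the grouping: $first simply takes whichever document reaches the group first, so it only picks the most frequent spelling when the rows arrive ordered by total.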

Results:

[
{
    "_id" : "star_wars",
    "count" : 1234,
    "largest" : 15,
    "spelling" : "Star Wars"
},
{
    "_id" : "indiana_jones",
    "count" : 12,
    "largest" : 25,
    "spelling" : "Indiana Jones"
},
{
    "_id" : "willow",
    "count" : 2211,
    "largest" : 550,
    "spelling" : "Willow"
}
]
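The results above contain every document and are not ordered by count, whereas the question asked for the top N by descending count. One possible way to get there, sketched under the assumption that the adjusted schema is in use, is to append a $sort on count and a $limit to the same pipeline. buildTopNPipeline and topN are hypothetical names introduced for this sketch; so is the collection name used in the answer.

```javascript
// Sketch: the answer's pipeline plus two extra stages for "top N by count".
// buildTopNPipeline and topN are illustrative names, not MongoDB APIs.
function buildTopNPipeline(topN) {
  return [
    // One document per element of the spellings array.
    { $unwind: '$spellings' },
    // Flatten to the fields the later stages need.
    { $project: {
        spelling: '$spellings.spelling',
        total: '$spellings.total',
        count: '$count',
    } },
    // Highest total first, so $first picks the most frequent spelling.
    { $sort: { total: -1 } },
    { $group: {
        _id: '$_id',
        count: { $first: '$count' },
        largest: { $first: '$total' },
        spelling: { $first: '$spelling' },
    } },
    // Added stages: order the grouped documents by count, keep the top N.
    { $sort: { count: -1 } },
    { $limit: topN },
  ];
}

const pipeline = buildTopNPipeline(2);
// In the mongo shell this would be run as: db.so.aggregate(pipeline)
console.log(JSON.stringify(pipeline, null, 2));
```

With the three sample records and topN set to 2, this should return willow followed by star_wars, which matches the output requested in the question.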
