


Does the aggregation framework introduced in MongoDB 2.2 have any special performance improvements over map/reduce?


If yes, why, how, and by how much?


(I have already run a test myself, and the performance was nearly the same.)



Every test I have personally run (including ones using your own data) shows the aggregation framework being several times faster than map/reduce, and usually an order of magnitude faster.


Taking just 1/10th of the data you posted, and warming the cache first rather than clearing the OS cache (because I want to measure the performance of the aggregation, not how long it takes to page in the data), I got this:


MapReduce: 1,058ms
Aggregation Framework: 133ms


Removing the $match from the aggregation pipeline and {query:} from mapReduce (because both would just use an index, and that's not what we want to measure) and grouping the entire dataset by key2, I got:


MapReduce: 18,803ms
Aggregation Framework: 1,535ms


Those are very much in line with my previous experiments.
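
For anyone who wants to repeat the second comparison (grouping the whole dataset by key2, with no $match and no {query:}), here is a minimal sketch for the legacy mongo shell. It is not the exact script I ran: the collection name `db.testcol` is hypothetical, counting documents per key2 is an assumed workload, and `mapFn`, `reduceFn`, and `timeMs` are illustrative helper names. Only the key2 field comes from the data above.

```javascript
// Sketch only: `db.testcol` is a hypothetical collection name, and counting
// documents per key2 is an assumed workload; only key2 comes from the post.

// Aggregation framework: a single $group stage, no $match.
var pipeline = [
  { $group: { _id: "$key2", count: { $sum: 1 } } }
];

// Map/reduce equivalent: emit 1 per document, sum per key, no {query: ...}.
function mapFn() {
  emit(this.key2, 1);
}

function reduceFn(key, values) {
  var total = 0;
  for (var i = 0; i < values.length; i++) {
    total += values[i];
  }
  return total;
}

// Run fn once and return the elapsed wall-clock time in milliseconds.
function timeMs(fn) {
  var start = new Date().getTime();
  fn();
  return new Date().getTime() - start;
}

// Only touches the database inside the mongo shell, where `db` is defined.
if (typeof db !== "undefined") {
  var coll = db.testcol;

  // Warm the cache first so paging data in from disk is not part of the timing.
  coll.find().itcount();

  print("Aggregation Framework: " + timeMs(function () {
    coll.aggregate(pipeline);
  }) + "ms");

  print("MapReduce: " + timeMs(function () {
    coll.mapReduce(mapFn, reduceFn, { out: { inline: 1 } });
  }) + "ms");
}
```

A single run of each is noisy, so it is worth repeating the timings a few times and comparing typical values rather than one measurement.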


