Problem description
I have a MongoDB that stores product data for 204.639.403 items. The data has already been split up, by the item's country, into four logical databases running in the same MongoDB process on the same physical machine.
Here is a list with the number of documents per logical database:
- CoUk: 56.719.977
- De: 61.216.165
- Fr: 52.280.460
- It: 34.422.801
My problem is that the database write performance is getting worse; writes to the largest of the four databases (De) in particular have become really bad. According to iotop, the mongod process uses 99% of the IO time with less than 3MB of writes and 1.5MB of reads per second. This leads to long database locks; according to mongostat, 100%+ lock has become the norm, even when all processes writing to and reading from the other country databases have been stopped. The current slave reaches a load of up to 6 while the replica set master has a load of 2-3 at the same time, so it causes replication lag, too.
Each database has the same data and index structure; I am using the largest database (De) for the further examples.
This is a random item taken from the database, just as an example. The structure is optimized to gather all important data with a single read:
{
"_id" : ObjectId("533b675dba0e381ecf4daa86"),
"ProductId" : "XGW1-E002F-DW",
"Title" : "Sample item",
"OfferNew" : {
"Count" : 7,
"LowestPrice" : 2631,
"OfferCondition" : "NEW"
},
"Country" : "de",
"ImageUrl" : "http://….jpg",
"OfferHistoryNew" : [
…
{
"Date" : ISODate("2014-06-01T23:22:10.940+02:00"),
"Value" : {
"Count" : 10,
"LowestPrice" : 2171,
"OfferCondition" : "NEW"
}
}
],
"Processed" : ISODate("2014-06-09T23:22:10.940+02:00"),
"Eans" : [
"9781241461959"
],
"OfferUsed" : {
"Count" : 1,
"LowestPrice" : 5660,
"OfferCondition" : "USED"
},
"Categories" : [
NumberLong(186606),
NumberLong(541686),
NumberLong(288100),
NumberLong(143),
NumberLong(15777241)
]
}
Typical queries range from simple ones, such as lookups by ProductId or by an EAN, to refinements by category sorted by the A rank, or refinements by category and an A rank range (1 up to 10.000, for example) sorted by the B rank… .
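For illustration, such queries might look roughly like the following in the mongo shell; the field names RankA and RankB are assumptions derived from the index names shown further below, and the concrete values are taken from the sample document:

// Simple lookups by ProductId or EAN
db.Item.find({ ProductId: "XGW1-E002F-DW" })
db.Item.find({ Eans: "9781241461959" })

// Refinement by category, sorted by the A rank
db.Item.find({ Categories: NumberLong(186606) }).sort({ RankA: 1 }).limit(50)

// Refinement by category and an A rank range, sorted by the B rank
db.Item.find({ Categories: NumberLong(186606), RankA: { $gte: 1, $lte: 10000 } }).sort({ RankB: -1 }).limit(50)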
These are the stats from the largest DB:
{
"ns" : "De.Item",
"count" : 61216165,
"size" : 43915150656,
"avgObjSize" : 717,
"storageSize" : 45795192544,
"numExtents" : 42,
"nindexes" : 6,
"lastExtentSize" : 2146426864,
"paddingFactor" : 1,
"systemFlags" : 0,
"userFlags" : 1,
"totalIndexSize" : 41356824320,
"indexSizes" : {
"_id_" : 2544027808,
"RankA_1" : 1718096464,
"Categories_1_RankA_1_RankB_-1" : 16383534832,
"Eans_1" : 2846073776,
"Categories_1_RankA_-1" : 15115290064,
"ProductId_1" : 2749801376
},
"ok" : 1
}
It is worth mentioning that the index size is nearly half of the storage size.
Each country DB has to handle 3-5 million updates/inserts per day; my target is to perform the write operations in less than five hours during the night.
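To put that target into numbers, a rough back-of-the-envelope calculation (assuming the writes are spread evenly over the five-hour window):

// 3-5 million writes per country DB within a 5 hour nightly window:
5000000 / (5 * 3600)   // ≈ 278 sustained write operations per second (upper bound)
3000000 / (5 * 3600)   // ≈ 167 sustained write operations per second (lower bound)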
Currently it is a replica set with two servers, each with 32GB RAM and a RAID1 of 2TB HDDs. Simple optimizations like the deadline I/O scheduler and noatime have already been made.
I have worked out some optimization strategies:
- Reduce the number of indexes:
  - The default _id could use the ProductId instead of the default MongoId, which would save 6-7% of the total index size per DB.
  - Try to remove the Categories_1_RankA_-1 index; maybe the BrowseNodes_1_RankA_1_RankB_-1 index can handle those queries as well. Does sorting still perform well when the full index is not used? Another way would be to store the data matching Categories_1_RankA_1_RankB_-1 in another collection that references the main collection (see the sketch after this list).
- Sharding: I have heard that every shard has to hold the whole database index?
- I am afraid the query structure might not fit well into a sharded environment. Using the ProductId as the shard key does not seem to fit all query types, and sharding by category is complicated as well, since a single item can be listed in multiple main and sub categories… My concerns may be unfounded; I have never used sharding in a production environment.
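As a rough sketch of the index-reduction idea (only an illustration; the commands assume the collection and index names from the stats above and should be verified with explain() before touching production): a compound index such as Categories_1_RankA_1_RankB_-1 can already answer queries that filter on one category and sort by the A rank through its prefix, which may make the narrower single-field variant redundant.

// Check which index the planner picks for a category filter sorted by the A rank
db.Item.find({ Categories: NumberLong(186606) }).sort({ RankA: -1 }).explain()

// If Categories_1_RankA_1_RankB_-1 is chosen and performs well, the narrower
// index can be dropped to shrink the total index footprint
db.Item.dropIndex("Categories_1_RankA_-1")

// Using the ProductId as _id would also make the separate ProductId_1 index
// unnecessary (only feasible for new inserts or via a full migration)
db.Item.insert({ _id: "B000EXAMPLE", Title: "Hypothetical item", Country: "de" })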
But there should be other optimization strategies, too, that did not come to my mind, and I would like to hear about them!
Which optimization strategy sounds most promising, or is a mixture of several optimizations needed here?

Recommended answer
Most likely you are running into issues due to record growth, see http://docs.mongodb.org/manual/core/write-performance/#document-growth.
Mongo prefers records of fixed (or at least bounded) size. Increasing the record size beyond the pre-allocated storage will cause the document to be moved to another location on disk, multiplying your I/O with each write. Consider pre-allocating "enough" space for your average document on insert if your document sizes are relatively homogeneous. Otherwise, consider splitting rapidly growing nested arrays into a separate collection, thereby replacing updates with inserts. Also check your fragmentation and consider compacting your databases from time to time, so that you have a higher density of documents per block, which will cut down on hard page faults.
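A minimal sketch of both suggestions in the mongo shell, assuming the field names from the sample document above; the OfferHistory collection name, the _padding field and the concrete sizes are invented for illustration:

// Option 1: pre-allocate space at insert time by padding the new document to
// roughly its expected final size, then removing the padding again; the record
// keeps its allocated size on disk, so later growth does not force a move
var pad = new Array(2048).join("x");
db.Item.insert({ ProductId: "XGW1-E002F-DW", Country: "de", _padding: pad });
db.Item.update({ ProductId: "XGW1-E002F-DW" }, { $unset: { _padding: "" } });

// Option 2: move the growing OfferHistoryNew array into its own collection, so
// each new price point becomes an insert instead of growing an existing document
db.OfferHistory.insert({
    ItemId: "XGW1-E002F-DW",   // reference back to the Item document
    Country: "de",
    Date: new Date(),
    Value: { Count: 10, LowestPrice: 2171, OfferCondition: "NEW" }
});
db.OfferHistory.ensureIndex({ ItemId: 1, Date: -1 });

With the history in a separate collection, reading the current offer data still needs only one fetch of the Item document, and the history is read separately only when it is actually needed.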