Problem description
I'm having problems with what seems to be a simple sharding setup in mongo.
I have two shards, a single mongos instance, and a single config server set up like this:
Machine A - 10.0.44.16 - config server, mongos
Machine B - 10.0.44.10 - shard 1
Machine C - 10.0.44.11 - shard 2
I have a collection called 'Seeds' that has a shard key 'SeedType' which is a field that is present on every document in the collection, and contains one of four values (take a look at the sharding status below). Two of the values have significantly more entries than the other two (two of them have 784,000 records each, and two have about 5,000).
The behavior I'm expecting to see is that records in the 'Seeds' collection with InventoryPOS will end up on one shard, and the ones with InventoryOnHand will end up on the other.
However, it seems that all records for both of the larger shard key values end up on the primary shard.
Here's my sharding status text (other collections removed for clarity):
--- Sharding Status ---
sharding version: { "_id" : 1, "version" : 3 }
shards:
{ "_id" : "shard0000", "host" : "10.44.0.11:27019" }
{ "_id" : "shard0001", "host" : "10.44.0.10:27017" }
databases:
{ "_id" : "admin", "partitioned" : false, "primary" : "config" }
{ "_id" : "TimMulti", "partitioned" : true, "primary" : "shard0001" }
TimMulti.Seeds chunks:
{ "SeedType" : { $minKey : 1 } } -->> { "SeedType" : "PBI.AnalyticsServer.KPI" } on : shard0000 { "t" : 2000, "i" : 0 }
{ "SeedType" : "PBI.AnalyticsServer.KPI" } -->> { "SeedType" : "PBI.Retail.InventoryOnHand" } on : shard0001 { "t" : 2000, "i" : 7 }
{ "SeedType" : "PBI.Retail.InventoryOnHand" } -->> { "SeedType" : "PBI.Retail.InventoryPOS" } on : shard0001 { "t" : 2000, "i" : 8 }
{ "SeedType" : "PBI.Retail.InventoryPOS" } -->> { "SeedType" : "PBI.Retail.SKU" } on : shard0001 { "t" : 2000, "i" : 9 }
{ "SeedType" : "PBI.Retail.SKU" } -->> { "SeedType" : { $maxKey : 1 } } on : shard0001 { "t" : 2000, "i" : 10 }
Am I doing something wrong?
Semi-unrelated question:
What is the best way to atomically transfer an object from one collection to another without blocking the entire mongo service?
Thanks in advance, Tim
Recommended answer
Sharding really isn't meant to be used this way. You should choose a shard key with some variation (or make a compound shard key) so that MongoDB can make reasonable-size chunks. One of the points of sharding is that your application doesn't have to know where your data is.
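To see why a four-value shard key leaves MongoDB so little room, here is a minimal Python sketch that does not touch a real cluster. It assumes that documents sharing the same shard key value cannot be separated into different chunks, and it uses made-up data that mirrors the counts in the question. The compound key `(SeedType, seq)` and the `seq` counter are hypothetical, introduced only for illustration.

```python
from collections import Counter


def max_nonempty_chunks(shard_key_values):
    """Upper bound on non-empty chunks: one per distinct shard key value.

    Documents with an identical shard key value always fall in the same
    chunk range, so they cannot be split apart.
    """
    return len(set(shard_key_values))


def largest_unsplittable_group(shard_key_values):
    """Size of the biggest group of documents that must share one chunk."""
    return max(Counter(shard_key_values).values())


# Hypothetical data mirroring the record counts given in the question.
seed_types = (
    ["PBI.Retail.InventoryPOS"] * 784_000
    + ["PBI.Retail.InventoryOnHand"] * 784_000
    + ["PBI.AnalyticsServer.KPI"] * 5_000
    + ["PBI.Retail.SKU"] * 5_000
)

# Shard key on SeedType alone, as in the question.
single_key = seed_types

# Hypothetical compound key: SeedType plus a made-up per-document counter.
compound_key = [(seed_type, seq) for seq, seed_type in enumerate(seed_types)]

print(max_nonempty_chunks(single_key))           # 4
print(largest_unsplittable_group(single_key))    # 784000
print(max_nonempty_chunks(compound_key))         # 1578000
print(largest_unsplittable_group(compound_key))  # 1
```

With SeedType alone, each of the two large values is a single block of 784,000 documents that can only move as a whole. With more variation in the key, the same data can be cut into chunks of whatever size the server prefers.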
If you want to manually shard, you should do that: start unlinked MongoDB servers and route things yourself from the client side.
Finally, if you're really dedicated to this setup, you could migrate the chunk yourself (there's a moveChunk command).
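As a rough sketch of what a manual migration could look like from Python, the helper below builds a `moveChunk` admin command document for the chunk that contains a given `SeedType` value. The function names are hypothetical, and `client` is assumed to be a `pymongo.MongoClient` connected to the mongos (not to a shard directly). The namespace and shard id are taken from the sharding status in the question.

```python
def build_move_chunk_command(namespace, seed_type, target_shard):
    """Build the moveChunk command document.

    The command name has to be the first key; Python dicts keep
    insertion order, so a plain dict is enough here.
    """
    return {
        "moveChunk": namespace,           # "<database>.<collection>"
        "find": {"SeedType": seed_type},  # any key inside the chunk to move
        "to": target_shard,               # destination shard id
    }


def move_chunk(client, namespace, seed_type, target_shard):
    """Send moveChunk through the admin database.

    Assumption: `client` behaves like a pymongo.MongoClient that is
    connected to the mongos router.
    """
    command = build_move_chunk_command(namespace, seed_type, target_shard)
    return client.admin.command(command)


# Move the chunk starting at InventoryPOS off shard0001 and onto shard0000.
command = build_move_chunk_command(
    "TimMulti.Seeds", "PBI.Retail.InventoryPOS", "shard0000"
)
print(command)
```

Building the command separately from sending it makes it easy to print and check the document before running it against a live cluster.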
The balancer moves chunks based on how much is mapped in memory (run serverStatus and look at the "mapped" field). It can take a while: MongoDB doesn't want your data flying all over the place in production, so it's pretty conservative.
Semi-unrelated answer: you can't do it atomically with sharding (eval isn't atomic across multiple servers). You'll have to do a findOne, insert, remove.
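The findOne, insert, remove sequence might be sketched like this in Python. `move_document` is a hypothetical helper; it assumes the two collection objects expose pymongo-style `find_one`, `insert_one`, and `delete_one` methods. The `InMemoryCollection` class is only a stand-in so the sketch runs without a server, and the sample document is made up.

```python
def move_document(source, target, query):
    """Copy one matching document to `target`, then delete it from `source`.

    This is NOT atomic: if the process dies between the insert and the
    delete, the document exists in both collections. Inserting first
    means a failure leaves a duplicate rather than losing the document.
    """
    doc = source.find_one(query)
    if doc is None:
        return None
    target.insert_one(doc)
    source.delete_one({"_id": doc["_id"]})
    return doc


class InMemoryCollection:
    """Tiny stand-in for a collection, for demonstration only."""

    def __init__(self, docs=()):
        self.docs = {d["_id"]: dict(d) for d in docs}

    def find_one(self, query):
        # Equality match on every field in the query.
        for doc in self.docs.values():
            if all(doc.get(k) == v for k, v in query.items()):
                return dict(doc)
        return None

    def insert_one(self, doc):
        if doc["_id"] in self.docs:
            raise KeyError("duplicate _id")
        self.docs[doc["_id"]] = dict(doc)

    def delete_one(self, query):
        doc = self.find_one(query)
        if doc is not None:
            del self.docs[doc["_id"]]


pending = InMemoryCollection([{"_id": 1, "SeedType": "PBI.Retail.SKU"}])
done = InMemoryCollection()

moved = move_document(pending, done, {"_id": 1})
print(moved)              # {'_id': 1, 'SeedType': 'PBI.Retail.SKU'}
print(len(pending.docs))  # 0
print(len(done.docs))     # 1
```

Because the steps are separate operations, readers of either collection can briefly see the document in both places, so downstream code should tolerate that.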