Question
I have been learning RavenDB recently and would like to put it to use.
I was wondering what advice or suggestions people have about building a system that is ready to scale, specifically by sharding the data across servers, but that can start on a single server and grow only as needed.
Is it advisable, or even possible, to create multiple databases on a single instance and implement sharding across them? Scaling would then simply be a matter of spreading these databases across machines.
My first impression is that this approach would work, but I would be interested to hear the opinions and experiences of others.
Update 1:
I have been thinking more about this topic. My problem with the "sort it out later" approach is that it seems difficult to spread data evenly across servers in that situation. I will not have a string key that I can range on (A-E, F-M, ...); the keys will be numeric.
This leaves two options that I can see. One is to break the data at boundaries, so 1-50000 is on shard 1 and 50001-100000 is on shard 2, but on a site that ages, like this one, the original shards will be doing a lot less work. The alternative is a strategy that round-robins across the shards and puts the shard id into the key, but it suffers if you need to move a document to a new shard: that changes the key and breaks any URLs that used it.
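To make the two options concrete, here is a minimal Python sketch. It is not RavenDB code: the function names and the "shard/id" key format are assumptions made up for illustration, and only the 50000 boundary comes from the example above.

```python
RANGE_SIZE = 50_000  # boundary size from the example above


def shard_for_range(doc_id: int) -> int:
    """Option 1: range sharding. Ids 1-50000 go to shard 1, 50001-100000 to shard 2, and so on."""
    return (doc_id - 1) // RANGE_SIZE + 1


def round_robin_key(doc_id: int, shard_count: int) -> str:
    """Option 2: round-robin sharding with the shard id embedded in the key.

    The "shard/id" key format is hypothetical, chosen only for this sketch.
    """
    shard = (doc_id - 1) % shard_count + 1
    return f"{shard}/{doc_id}"


if __name__ == "__main__":
    # New documents always land on the newest range shard.
    print(shard_for_range(50_000), shard_for_range(50_001))
    # The same document gets a different key once the shard count changes.
    print(round_robin_key(5, 2), round_robin_key(5, 3))
```

With range sharding, every new id lands on the last shard, so older shards go quiet. With the shard id in the key, the key for document 5 differs between a two-shard and a three-shard setup, which is the URL-breaking problem described above.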
So my new idea, which again I am putting out there for comment, is to create a bucketing system from day one. It works like stuffing the shard id into the key, except that you start with a large number of buckets, say 1000, and distribute documents evenly between them. When it comes time to split the load across shards, you can move buckets 501-1000 to the new server and write your shard logic so that buckets 1-500 go to shard 1 and 501-1000 go to shard 2. When a third server comes online, you pick another range of buckets and adjust.
To my eye, this gives you the ability to split into as many shards as you originally created buckets, spreading the load evenly in terms of both quantity and age, without having to change keys.
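A rough Python sketch of the bucketing idea, to show what I mean. It does not use any RavenDB API; the "bucket/id" key format, the modulo rule for picking a bucket, and the shard names are all assumptions for illustration.

```python
BUCKET_COUNT = 1000  # fixed on day one and never changed afterwards


def bucket_for(doc_id: int) -> int:
    """Spread documents evenly over buckets 1..BUCKET_COUNT (assumed modulo rule)."""
    return (doc_id - 1) % BUCKET_COUNT + 1


def make_key(doc_id: int) -> str:
    """Embed the bucket, not the shard, in the key (hypothetical "bucket/id" format)."""
    return f"{bucket_for(doc_id)}/{doc_id}"


# Shard maps: (first_bucket, last_bucket, shard_name). Only these change over time.
ONE_SERVER = [(1, 1000, "shard1")]
TWO_SERVERS = [(1, 500, "shard1"), (501, 1000, "shard2")]


def shard_for_key(key: str, shard_map: list[tuple[int, int, str]]) -> str:
    """Look up which shard currently owns the bucket named in the key."""
    bucket = int(key.split("/", 1)[0])
    for first, last, shard in shard_map:
        if first <= bucket <= last:
            return shard
    raise ValueError(f"no shard owns bucket {bucket}")


if __name__ == "__main__":
    key = make_key(1501)
    # Same key before and after the split; only the owning shard differs.
    print(key, shard_for_key(key, ONE_SERVER), shard_for_key(key, TWO_SERVERS))
```

The key for a document is computed once and never changes. Adding a server only means moving a range of buckets and editing the shard map, so existing URLs keep working.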
Thoughts?
Recommended answer
It is possible, but really unnecessary. You can start with one instance and scale when necessary by setting up sharding later.
See also:
http://ravendb.net/documentation/docs-sharding
http://ayende.com/blog/4830/ravendb-auto-sharding-bundle-design-early-thoughts
http://ravendb.net/documentation/replication/sharding