Problem description
We are working on an SSD-backed key-value solution with the following properties:
- Throughput: 10000 TPS; 50/50 puts/gets;
- Latency: 1ms average, 99.9th percentile 10ms
- Data volume: ~1 billion values, ~150 bytes each; 64-bit keys; random access, 20% of data fits RAM
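To put the figures above in perspective, here is a rough back-of-the-envelope sizing sketch. It uses only the numbers from the list, plus two assumptions that are not in the question: that "20% of data fits RAM" refers to the raw key+value bytes, and that every update ends up rewriting one whole 4 KiB page on the device (the `page_bytes` parameter is hypothetical).

```python
def sizing(values=1_000_000_000, value_bytes=150, key_bytes=8,
           tps=10_000, write_percent=50, page_bytes=4096):
    """Rough sizing from the stated requirements (no store overhead included)."""
    record_bytes = value_bytes + key_bytes        # 64-bit key + ~150-byte value
    raw_bytes = values * record_bytes             # total key+value payload
    ram_bytes = raw_bytes // 5                    # assumption: 20% of raw payload
    writes_per_sec = tps * write_percent // 100   # 50/50 puts/gets
    return {
        "raw_bytes": raw_bytes,
        "ram_bytes": ram_bytes,
        "writes_per_sec": writes_per_sec,
        # Logical payload written per second by the application.
        "payload_write_bytes_per_sec": writes_per_sec * record_bytes,
        # Assumption: each small random update dirties one full device page.
        "page_write_bytes_per_sec": writes_per_sec * page_bytes,
        # At 99.9th percentile, 0.1% of operations may exceed the limit.
        "ops_per_sec_allowed_over_limit": tps // 1000,
    }

if __name__ == "__main__":
    for name, value in sizing().items():
        print(f"{name}: {value:,}")
```

Under these assumptions the payload is about 158 GB and the application writes well under 1 MB/s of payload, so raw bandwidth is not the constraint; the difficulty is the latency of small random writes.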
We tried KyotoCabinet, LevelDB, and RethinkDB on commodity SSDs, with different Linux IO schedulers, ext3/xfs file systems; made a number of tests using Rebench; and found that in all cases:
- Read-only throughput/latency are very good
- Write/update-only throughput is moderate, but there are many high-latency outliers
- Mixed read/write workload causes catastrophic oscillation in throughput/latency, even with direct access to the block device (bypassing the file system)
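For reference, checking a run against the stated SLA (1 ms average, 10 ms at the 99.9th percentile) could look something like the sketch below. This is not how the original measurements were taken; `check_sla` and `percentile_permille` are hypothetical helper names, and the percentile uses the simple nearest-rank definition.

```python
def percentile_permille(samples, permille):
    """Nearest-rank percentile; permille=999 means the 99.9th percentile."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Integer ceiling of len * permille / 1000, avoiding float rounding.
    rank = max(1, -(-len(ordered) * permille // 1000))
    return ordered[rank - 1]

def check_sla(latencies_ms, avg_limit_ms=1.0, p999_limit_ms=10.0):
    """Compare measured per-operation latencies (in ms) against the SLA."""
    avg = sum(latencies_ms) / len(latencies_ms)
    p999 = percentile_permille(latencies_ms, 999)
    return {
        "avg_ms": avg,
        "p999_ms": p999,
        "avg_ok": avg <= avg_limit_ms,
        "p999_ok": p999 <= p999_limit_ms,
    }

if __name__ == "__main__":
    # Made-up samples: 998 fast operations and 2 slow outliers.
    samples = [0.5] * 998 + [50.0] * 2
    print(check_sla(samples))
```

With the made-up samples in the usage example, the average stays within the limit while the 99.9th percentile does not, which is the pattern a few high-latency outliers produce.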
The picture below illustrates such behavior for KyotoCabinet (horizontal axis is time, three periods are clearly visible - read-only, mixed, update only).
The question is: is it possible to achieve low latency for the described SLAs using SSDs, and which key-value stores are recommended?
Highly variable write latency is a common attribute of SSDs (especially consumer models). There is a pretty good explanation of why in this AnandTech review.
In summary, SSD write performance worsens over time as the wear-leveling overhead increases. As the number of free pages on the drive decreases, the NAND controller must start defragmenting pages, which contributes to latency. The controller must also maintain an LBA-to-block map to track the random distribution of data across the various NAND blocks. As this map grows, operations on the map (inserts, deletions) get slower.
You aren't going to be able to solve a low-level hardware issue with a software approach; you will need to either move up to an enterprise-level SSD or relax your latency requirements.
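The free-page effect can be illustrated with a toy model. The sketch below is not the algorithm of any real controller; it assumes a simplified flash translation layer with a single open block, greedy victim selection, and uniformly random overwrites, and all names and parameters are hypothetical. It counts how many physical page writes the device performs per host write as the drive fills up.

```python
import random

def simulate(num_blocks=64, pages_per_block=32, utilization=0.7,
             host_writes=20_000, seed=0):
    """Return physical page writes per host write (write amplification)."""
    rng = random.Random(seed)
    num_lbas = int(num_blocks * pages_per_block * utilization)
    lba_to_block = {}                           # LBA -> block with current copy
    valid = [set() for _ in range(num_blocks)]  # valid LBAs stored in each block
    used = [0] * num_blocks                     # pages programmed since last erase
    free = list(range(1, num_blocks))           # erased blocks ready for use
    open_block = 0
    physical_writes = 0

    def program(lba):
        """Append one page to the open block and update the map."""
        nonlocal open_block, physical_writes
        if used[open_block] == pages_per_block:
            open_block = free.pop()
        old = lba_to_block.get(lba)
        if old is not None:
            valid[old].discard(lba)             # the previous copy becomes stale
        valid[open_block].add(lba)
        used[open_block] += 1
        lba_to_block[lba] = open_block
        physical_writes += 1

    def collect():
        """Greedy GC: relocate the fullest-of-stale block's valid pages, erase it."""
        full = (b for b in range(num_blocks)
                if b != open_block and used[b] == pages_per_block)
        victim = min(full, key=lambda b: len(valid[b]))
        for lba in list(valid[victim]):
            program(lba)                        # extra writes caused by GC
        used[victim] = 0
        free.append(victim)

    for _ in range(host_writes):
        while len(free) < 2:                    # keep one erased block in reserve
            collect()
        program(rng.randrange(num_lbas))
    return physical_writes / host_writes

if __name__ == "__main__":
    for u in (0.5, 0.7, 0.9):
        print(f"utilization {u:.0%}: write amplification {simulate(utilization=u):.2f}")
```

In this model every relocated page is work done in the background of a host write, so the fuller the drive, the more often a host write has to wait behind relocation and erase activity. Real drives differ in many details (over-provisioning, TRIM, caching), so the numbers are only indicative of the trend.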