Problem description
We're using a Ruby web app with a Redis server for caching. Is there any point in testing Memcached instead?
What will give us better performance? Any pros or cons between Redis and Memcached?
Points to consider:
- Read/write speed.
- Memory usage.
- Disk I/O dumping.
- Scaling.
Summary (TL;DR)
Updated June 3rd, 2017
Redis is more powerful, more popular, and better supported than memcached. Memcached can only do a small fraction of the things Redis can do. Redis is better even where their features overlap.
For anything new, use Redis.
Memcached vs Redis: Direct Comparison
Both tools are powerful, fast, in-memory data stores that are useful as a cache. Both can help speed up your application by caching database results, HTML fragments, or anything else that might be expensive to generate.
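To make the caching use case concrete, here is a minimal sketch of the usual "check the cache, compute on a miss" pattern. It assumes `client` behaves like a redis-py `redis.Redis` object (`get`, and `set` with an `ex` expiry in seconds). The helper name `fetch_or_compute`, the key, and the TTL are made up for illustration.

```python
import json


def fetch_or_compute(client, key, ttl_seconds, compute):
    """Return the cached value for `key`, computing and caching it on a miss.

    `client` is assumed to be a redis-py style client; `compute` is any
    zero-argument callable that produces a JSON-serializable value.
    """
    cached = client.get(key)
    if cached is not None:
        return json.loads(cached)

    value = compute()  # the expensive part, e.g. a slow database query
    client.set(key, json.dumps(value), ex=ttl_seconds)
    return value
```

The same shape works with memcached clients, though the method names and the expiry argument differ between libraries.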
Points to Consider
When used for the same thing, here is how they compare using the original question's "Points to Consider":
- Read/write speed: Both are extremely fast. Benchmarks vary by workload, versions, and many other factors but generally show redis to be as fast or almost as fast as memcached. I recommend redis, but not because memcached is slow. It's not.
- Memory usage: Redis is better.
- memcached: You specify the cache size and as you insert items the daemon quickly grows to a little more than this size. There is never really a way to reclaim any of that space, short of restarting memcached. All your keys could be expired, you could flush the database, and it would still use the full chunk of RAM you configured it with.
- redis: Setting a max size is up to you. Redis will never use more than it has to and will give you back memory it is no longer using.
- I stored 100,000 ~2KB strings (~200MB) of random sentences into both. Memcached RAM usage grew to ~225MB. Redis RAM usage grew to ~228MB. After flushing both, redis dropped to ~29MB and memcached stayed at ~225MB. They are similarly efficient in how they store data, but only one is capable of reclaiming it.
- Disk I/O dumping: A clear win for redis since it does this by default and has very configurable persistence. Memcached has no mechanisms for dumping to disk without 3rd party tools.
- Scaling: Both give you tons of headroom before you need more than a single instance as a cache. Redis includes tools to help you go beyond that while memcached does not.
memcached
Memcached is a simple volatile cache server. It allows you to store key/value pairs where the value is limited to being a string up to 1MB.
It's good at this, but that's all it does. You can access those values by their key at extremely high speed, often saturating available network or even memory bandwidth.
When you restart memcached your data is gone. This is fine for a cache. You shouldn't store anything important there.
If you need high performance or high availability there are 3rd party tools, products, and services available.
redis
Redis can do the same jobs as memcached can, and can do them better.
Redis can act as a cache as well. It can store key/value pairs too. In redis they can even be up to 512MB.
You can turn off persistence and it will happily lose your data on restart too. If you want your cache to survive restarts it lets you do that as well. In fact, that's the default.
It's super fast too, often limited by network or memory bandwidth.
If one instance of redis/memcached isn't enough performance for your workload, redis is the clear choice. Redis includes cluster support and comes with high availability tools (redis-sentinel) right "in the box". Over the past few years redis has also emerged as the clear leader in 3rd party tooling. Companies like Redis Labs, Amazon, and others offer many useful redis tools and services. The ecosystem around redis is much larger. The number of large scale deployments is now likely greater than for memcached.
The Redis Superset
Redis is more than a cache. It is an in-memory data structure server. Below you will find a quick overview of things Redis can do beyond being a simple key/value cache like memcached. Most of redis' features are things memcached cannot do.
Documentation
Redis is better documented than memcached. While this can be subjective, it seems to be more and more true all the time.
redis.io is a fantastic easily navigated resource. It lets you try redis in the browser and even gives you live interactive examples with each command in the docs.
There are now 2x as many stackoverflow results for redis as memcached. 2x as many Google results. More readily accessible examples in more languages. More active development. More active client development. These measurements might not mean much individually, but in combination they paint a clear picture that support and documentation for redis is greater and much more up-to-date.
Persistence
By default redis persists your data to disk using a mechanism called snapshotting. If you have enough RAM available it's able to write all of your data to disk with almost no performance degradation. It's almost free!
In snapshot mode there is a chance that a sudden crash could result in a small amount of lost data. If you absolutely need to make sure no data is ever lost, don't worry, redis has your back there too with AOF (Append Only File) mode. In this persistence mode data can be synced to disk as it is written. This can reduce maximum write throughput to however fast your disk can write, but should still be quite fast.
There are many configuration options to fine tune persistence if you need, but the defaults are very sensible. These options make it easy to set up redis as a safe, redundant place to store data. It is a real database.
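As a rough illustration of switching to AOF, the sketch below turns it on at runtime with CONFIG SET. It assumes `client` is a redis-py style client exposing `config_set`; the helper name `enable_append_only` is hypothetical. `appendonly` and `appendfsync` are the standard redis.conf directives, where `always` syncs on every write and `everysec` syncs roughly once per second.

```python
def enable_append_only(client, fsync="everysec"):
    """Turn on AOF persistence at runtime (sketch, assumes a redis-py style client)."""
    if fsync not in ("always", "everysec", "no"):
        raise ValueError("fsync must be 'always', 'everysec', or 'no'")
    client.config_set("appendonly", "yes")
    client.config_set("appendfsync", fsync)
```

To keep the setting across restarts you would put the same two directives in redis.conf rather than relying on a runtime change.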
Many Data Types
Memcached is limited to strings, but Redis is a data structure server that can serve up many different data types. It also provides the commands you need to make the most of those data types.
Strings (commands)
Simple text or binary values that can be up to 512MB in size. This is the only data type redis and memcached share, though memcached strings are limited to 1MB.
Redis gives you more tools for leveraging this datatype by offering commands for bitwise operations, bit-level manipulation, floating point increment/decrement support, range queries, and multi-key operations. Memcached doesn't support any of that.
Strings are useful for all sorts of use cases, which is why memcached is fairly useful with this data type alone.
Hashes (commands)
Hashes are sort of like a key value store within a key value store. They map between string fields and string values. Field->value maps using a hash are slightly more space efficient than key->value maps using regular strings.
Hashes are useful as a namespace, or when you want to logically group many keys. With a hash you can grab all the members efficiently, expire all the members together, delete all the members together, etc. Great for any use case where you have several key/value pairs that need to be grouped.
One example use of a hash is for storing user profiles between applications. A redis hash stored with the user ID as the key will allow you to store as many bits of data about a user as needed while keeping them stored under a single key. The advantage of using a hash instead of serializing the profile into a string is that you can have different applications read/write different fields within the user profile without having to worry about one app overriding changes made by others (which can happen if you serialize stale data).
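A small sketch of that user profile idea, assuming a redis-py style `client` with `hset` (using its `mapping` argument) and `hgetall`. The `user:<id>` key layout and both helper names are just illustrative choices.

```python
def update_profile(client, user_id, **fields):
    """Write only the given fields of a user's profile hash.

    Other fields stored under the same key are left untouched, which is
    what lets several applications share one profile safely.
    """
    client.hset(f"user:{user_id}", mapping=fields)


def load_profile(client, user_id):
    """Read every field of the profile in a single round trip."""
    return client.hgetall(f"user:{user_id}")
```

For example, a billing app could call `update_profile(client, 42, plan="pro")` while a mail app calls `update_profile(client, 42, email="a@example.com")`, and neither overwrites the other's field.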
Lists (commands)
Redis lists are ordered collections of strings. They are optimized for inserting, reading, or removing values from the top or bottom (aka: left or right) of the list.
Redis provides many commands for leveraging lists, including commands to push/pop items, push/pop between lists, truncate lists, perform range queries, etc.
Lists make great durable, atomic queues. These work great for job queues, logs, buffers, and many other use cases.
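Here is a minimal job queue sketch built on a list, assuming a redis-py style `client` with `lpush` and `rpop`. Pushing on the left and popping on the right gives first-in, first-out order. The helper names and the JSON job format are assumptions for the example.

```python
import json


def enqueue(client, queue, job):
    """Add a job (any JSON-serializable value) to the head of the list."""
    client.lpush(queue, json.dumps(job))


def dequeue(client, queue):
    """Take the oldest job off the tail of the list, or None if it is empty."""
    raw = client.rpop(queue)
    return None if raw is None else json.loads(raw)
```

A real worker would more likely use the blocking variant (BRPOP) so it can wait for work instead of polling.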
Sets (commands)
Sets are unordered collections of unique values. They are optimized to let you quickly check if a value is in the set, quickly add/remove values, and to measure overlap with other sets.
These are great for things like access control lists, unique visitor trackers, and many other things. Most programming languages have something similar (usually called a Set). This is like that, only distributed.
Redis provides several commands to manage sets. Obvious ones like adding, removing, and checking the set are present. So are less obvious commands like popping/reading a random item and commands for performing unions and intersections with other sets.
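A quick sketch of a unique visitor tracker using sets, assuming a redis-py style `client` with `sadd` (which returns how many members were newly added) and `sinter`. The `visitors:<day>` key naming and the helper names are made up for the example.

```python
def record_visit(client, day, visitor_id):
    """Record a visit; return True if this visitor was not seen yet that day."""
    return client.sadd(f"visitors:{day}", visitor_id) == 1


def returning_visitors(client, day_a, day_b):
    """Visitors who showed up on both days (set intersection)."""
    return client.sinter(f"visitors:{day_a}", f"visitors:{day_b}")
```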
Sorted Sets (commands)
Sorted Sets are also collections of unique values. These ones, as the name implies, are ordered. They are ordered by a score, then lexicographically.
This data type is optimized for quick lookups by score. Getting the highest, lowest, or any range of values in between is extremely fast.
If you add users to a sorted set along with their high score, you have yourself a perfect leader-board. As new high scores come in, just add them to the set again with their high score and it will re-order your leader-board. Also great for keeping track of the last time users visited and who is active in your application.
Storing values with the same score causes them to be ordered lexicographically (think alphabetically). This can be useful for things like auto-complete features.
Many of the sorted set commands are similar to commands for sets, sometimes with an additional score parameter. Also included are commands for managing scores and querying by score.
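The leader-board idea can be sketched in a few lines, assuming a redis-py style `client` whose `zadd` takes a `{member: score}` mapping and whose `zrevrange` returns `(member, score)` pairs when `withscores=True`. The helper names are hypothetical.

```python
def submit_score(client, board, player, score):
    """Store (or replace) a player's score on the board."""
    client.zadd(board, {player: score})


def top_players(client, board, count=10):
    """Return up to `count` (player, score) pairs, highest score first."""
    return client.zrevrange(board, 0, count - 1, withscores=True)
```

Note that ZADD replaces the existing score, so this sketch assumes callers only submit a score when it is a new personal best.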
Geo
Redis has several commands for storing, retrieving, and measuring geographic data. This includes radius queries and measuring distances between points.
Technically geographic data in redis is stored within sorted sets, so this isn't a truly separate data type. It is more of an extension on top of sorted sets.
Bitmap and HyperLogLog
Like geo, these aren't completely separate data types. These are commands that allow you to treat string data as if it's either a bitmap or a hyperloglog.
Bitmaps are what the bit-level operators I referenced under Strings are for. This data type was the basic building block for reddit's recent collaborative art project: r/Place.
HyperLogLog allows you to use a constant, extremely small amount of space to count almost unlimited unique values with shocking accuracy. Using only ~12KB you could efficiently count the number of unique visitors to your site, even if that number is in the millions.
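Counting unique visitors with HyperLogLog might look like the sketch below. It assumes a redis-py style `client` with `pfadd` and `pfcount`; passing several keys to PFCOUNT gives the estimated size of their union. The `uniques:<day>` key naming and helper names are illustrative, and the result is an approximation, not an exact count.

```python
def count_visit(client, day, visitor_id):
    """Feed one visitor into that day's HyperLogLog."""
    client.pfadd(f"uniques:{day}", visitor_id)


def unique_visitors(client, *days):
    """Estimated number of distinct visitors across the given days."""
    return client.pfcount(*[f"uniques:{day}" for day in days])
```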
Transactions and Atomicity
Commands in redis are atomic: each command runs to completion before the next one starts, so you can be sure that as soon as you write a value to redis that value is visible to all clients connected to redis. There is no wait for that value to propagate. Technically memcached is atomic as well, but with redis adding all this functionality beyond memcached it is worth noting and somewhat impressive that all these additional data types and features are also atomic.
While not quite the same as transactions in relational databases, redis also has transactions that use "optimistic locking" (WATCH/MULTI/EXEC).
Pipelining
Redis provides a feature called 'pipelining'. If you have many redis commands you want to execute you can use pipelining to send them to redis all-at-once instead of one-at-a-time.
Normally when you execute a command to either redis or memcached, each command is a separate request/response cycle. With pipelining, redis can buffer several commands and execute them all at once, responding with all of the responses to all of your commands in a single reply.
This can allow you to achieve even greater throughput on bulk importing or other actions that involve lots of commands.
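A bulk write using pipelining could be sketched like this, assuming a redis-py style `client` whose `pipeline()` returns an object that queues commands until `execute()` is called. `transaction=False` asks for plain pipelining without wrapping the batch in MULTI/EXEC. The helper name `bulk_set` is made up.

```python
def bulk_set(client, items):
    """Send many SET commands in one batch and return their replies."""
    pipe = client.pipeline(transaction=False)
    for key, value in items.items():
        pipe.set(key, value)  # queued locally, nothing is sent yet
    return pipe.execute()     # one round trip for the whole batch
```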
Pub/Sub
Redis has commands dedicated to pub/sub functionality, allowing redis to act as a high speed message broadcaster. This allows a single client to publish messages to many other clients connected to a channel.
Redis does pub/sub as well as almost any tool. Dedicated message brokers like RabbitMQ may have advantages in certain areas, but because the same server can also give you persistent durable queues and other data structures your pub/sub workloads likely need, Redis will often prove to be the best and simplest tool for the job.
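A bare-bones publisher and subscriber loop might look like the following. It assumes a redis-py style `client` with `publish`, and a pubsub object (what `client.pubsub()` returns after `subscribe`) whose `listen()` yields dicts with `type`, `channel`, and `data` keys. The helper names and the JSON payload format are assumptions.

```python
import json


def publish_event(client, channel, event):
    """Broadcast an event; returns how many subscribers received it."""
    return client.publish(channel, json.dumps(event))


def handle_messages(pubsub, handler):
    """Pass each published message to `handler(channel, event)`.

    Subscribe confirmations and other non-message items are skipped.
    """
    for message in pubsub.listen():
        if message["type"] == "message":
            handler(message["channel"], json.loads(message["data"]))
```

With a real server `listen()` blocks and keeps yielding, so `handle_messages` would normally run in its own thread or process.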
Lua Scripting
You can kind of think of lua scripts like redis's own SQL or stored procedures. It's both more and less than that, but the analogy mostly works.
Maybe you have complex calculations you want redis to perform. Maybe you can't afford to have your transactions roll back and need guarantees every step of a complex process will happen atomically. These problems and many more can be solved with lua scripting.
The entire script is executed atomically, so if you can fit your logic into a lua script you can often avoid messing with optimistic locking transactions.
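As one example of logic that fits nicely in a script, here is a sketch of a simple fixed-window rate limiter. The Lua part increments a counter and sets its expiry on first use, all in one atomic step. It assumes a redis-py style `client` whose `eval` takes the script, the number of keys, then keys and arguments. The script, the helper name `allow_request`, and the key naming are my own illustration, not something redis ships.

```python
RATE_LIMIT_LUA = """
local current = redis.call('INCR', KEYS[1])
if current == 1 then
  redis.call('EXPIRE', KEYS[1], ARGV[1])
end
return current
"""


def allow_request(client, key, window_seconds, limit):
    """Count this request and report whether it is within the limit."""
    count = client.eval(RATE_LIMIT_LUA, 1, key, window_seconds)
    return count <= limit
```

Doing the same thing from the client with separate INCR and EXPIRE calls would leave a gap where a crash could create a counter that never expires, which is the kind of problem the atomic script avoids.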
Scaling
As mentioned above, redis includes built in support for clustering and is bundled with its own high availability tool called redis-sentinel.
Conclusion
Without hesitation I would recommend redis over memcached for any new projects, or existing projects that don't already use memcached.
The above may sound like I don't like memcached. On the contrary: it is a powerful, simple, stable, mature, and hardened tool. There are even some use cases where it's a little faster than redis. I love memcached. I just don't think it makes much sense for future development.
Redis does everything memcached does, often better. Any performance advantage for memcached is minor and workload specific. There are also workloads for which redis will be faster, and many more workloads that redis can do which memcached simply can't. The tiny performance differences seem minor in the face of the giant gulf in functionality and the fact that both tools are so fast and efficient they may very well be the last piece of your infrastructure you'll ever have to worry about scaling.
There is only one scenario where memcached makes more sense: where memcached is already in use as a cache. If you are already caching with memcached then keep using it, if it meets your needs. It is likely not worth the effort to move to redis and if you are going to use redis just for caching it may not offer enough benefit to be worth your time. If memcached isn't meeting your needs, then you should probably move to redis. This is true whether you need to scale beyond memcached or you need additional functionality.