问题描述
结论: SHA-1 与抵御原像攻击一样安全,但它易于计算,这意味着更容易进行暴力破解或字典攻击.(对于 SHA-256 等后继者也是如此.)根据情况,设计为计算成本高的哈希函数(例如 bcrypt)可能是更好的选择.
Conclusion: SHA-1 is as safe as anything against preimage attacks, however it is easy to compute, which means it is easier to mount a bruteforce or dictionary attack. (The same is true for successors like SHA-256.) Depending on the circumstances, a hash function which was designed to be computationally expensive (such as bcrypt) might be a better choice.
有些人经常抛出诸如SHA-1 已损坏"之类的言论,所以我试图了解这到底是什么意思.假设我有一个 SHA-1 密码哈希数据库,一个拥有最先进的 SHA-1 破解算法和一个拥有 100,000 台机器的僵尸网络的攻击者可以访问它.(控制 10 万台家用计算机意味着他们每秒可以执行大约 10^15 次操作.)他们需要多少时间
Some people throw around remarks like "SHA-1 is broken" a lot, so I'm trying to understand what exactly that means. Let's assume I have a database of SHA-1 password hashes, and an attacker whith a state of the art SHA-1 breaking algorithm and a botnet with 100,000 machines gets access to it. (Having control over 100k home computers would mean they can do about 10^15 operations per second.) How much time would they need to
- 找出任何一个用户的密码?
- 找出给定用户的密码?
- 找出所有用户的密码?
- 找到一种以用户身份登录的方法?
- 找到一种以特定用户身份登录的方法吗?
如果密码被加盐会如何改变?加盐的方法(前缀、后缀、两者,或者更复杂的东西,比如异或)重要吗?
How does that change if the passwords are salted? Does the method of salting (prefix, postfix, both, or something more complicated like xor-ing) matter?
这是我目前的理解,经过一些谷歌搜索.如果我误解了什么,请更正答案.
Here is my current understanding, after some googling. Please correct in the answers if I misunderstood something.
- 如果没有盐,彩虹攻击会立即找到所有密码(除了极长的密码).
- 如果有足够长的随机盐,找出密码的最有效方法是蛮力或字典攻击.碰撞和原像攻击都无助于找出实际密码,因此针对 SHA-1 的加密攻击在这里无济于事.使用什么算法甚至无关紧要 - 甚至可以使用 MD5 或 MD4,并且密码同样安全(存在细微差别,因为计算 SHA-1 哈希值较慢).
- 为了评估同样安全"的安全性,我们假设单次 sha1 运行需要 1000 次操作,并且密码包含大写、小写和数字(即 60 个字符).这意味着攻击者每天可以测试 10*60*60*24/1000 ~= 10 个潜在密码.对于蛮力攻击,这意味着在 3 小时内测试最多 9 个字符、一周最多 10 个字符、一年最多 11 个字符的所有密码.(每增加一个字符需要 60 倍的时间.)字典攻击要快得多(即使攻击者使用一台计算机也可以在数小时内完成攻击),但只能找到弱密码.
- 以用户身份登录,攻击者不需要找出确切的密码;找到一个产生相同散列的字符串就足够了.这称为第一原像攻击.据我所知,没有针对 SHA-1 的原像攻击.(暴力攻击需要 2 次操作,这意味着我们的理论攻击者需要 10 年才能成功.理论上可能性的限制约为 2 操作,攻击需要几年时间.)有 对 SHA-1 的简化版本 的原像攻击的影响可以忽略不计(对于使用 44 步而不是 80 步的简化 SHA-1,攻击时间从 2 次操作减少到 2).存在针对 SHA-1 的碰撞攻击,这在理论上是可能的(我发现的最好的 将时间从 2 减少到 2),但那些是即使没有加盐,对密码哈希也无用.
- If there is no salt, a rainbow attack will immediately find all passwords (except extremely long ones).
- If there is a sufficiently long random salt, the most effective way to find out the passwords is a brute force or dictionary attack. Neither collision nor preimage attacks are any help in finding out the actual password, so cryptographic attacks against SHA-1 are no help here. It doesn't even matter much what algorithm is used - one could even use MD5 or MD4 and the passwords would be just as safe (there is a slight difference because computing a SHA-1 hash is slower).
- To evaluate how safe "just as safe" is, let's assume that a single sha1 run takes 1000 operations and passwords contain uppercase, lowercase and digits (that is, 60 characters). That means the attacker can test 10*60*60*24 / 1000 ~= 10 potential password a day. For a brute force attack, that would mean testing all passwords up to 9 characters in 3 hours, up to 10 characters in a week, up to 11 characters in a year. (It takes 60 times as much for every additional character.) A dictionary attack is much, much faster (even an attacker with a single computer could pull it off in hours), but only finds weak passwords.
- To log in as a user, the attacker does not need to find out the exact password; it is enough to find a string that results in the same hash. This is called a first preimage attack. As far as I could find, there are no preimage attacks against SHA-1. (A bruteforce attack would take 2 operations, which means our theoretical attacker would need 10 years to pull it off. Limits of theoretical possibility are around 2 operations, at which the attack would take a few years.) There are preimage attacks against reduced versions of SHA-1 with negligible effect (for the reduced SHA-1 which uses 44 steps instead of 80, attack time is down from 2 operations to 2). There are collision attacks against SHA-1 which are well within theoretical possibility (the best I found brings the time down from 2 to 2), but those are useless against password hashes, even without salting.
简而言之,使用 SHA-1 存储密码似乎非常安全.我错过了什么吗?
In short, storing passwords with SHA-1 seems perfectly safe. Did I miss something?
更新: Marcelo 指出了一篇文章,其中提到了 2 中的第二次原像攻击 操作.(正如 Thomas 解释的那样,这种攻击是一种假设结构,不适用于现实生活场景.)不过,我仍然不明白这对使用 SHA-1 作为密钥推导函数有何危险.通常是否有充分的理由认为碰撞攻击或第二次原像攻击最终可以转化为第一次原像攻击?
Update: Marcelo pointed out an article which mentions a second preimage attack in 2 operations. ( As Thomas explains, this attack is a hypothetical construct which does not apply to real-life scenarios.) I still don't see how this spells danger for the use of SHA-1 as a key derivation function, though. Are there generally good reasons to think that a collision attack or a second preimage attack can be eventually turned into a first preimage attack?
推荐答案
对您的问题的简短回答是:SHA-1 尽可能安全.MD5 也可以,即使是 MD4;但这可能会让一些投资者感到紧张.对于公共关系,最好使用更好"的哈希函数,例如SHA-256,即使您将其输出截断为 160 或 128 位(以节省存储成本).一些 SHA-3 第 2 轮候选人 似乎比 SHA-1 更快,同时可以说更安全";但它们仍然有点新,因此现在坚持使用 SHA-256 或 SHA-512 将是更安全的途径.这会让你看起来很专业和谨慎,这很好.
The short answer to your question is: SHA-1 is as secure as you can get. MD5 would be fine too, even MD4; but it could make some investors nervous. For public relations, it is best to use a "better" hash function, e.g. SHA-256, even if you truncate its output to 160 or 128 bits (to save on storage cost). Some of the SHA-3 round-2 candidates appear to be faster than SHA-1 while being arguably "more secure"; yet they are still a bit new, so sticking to SHA-256 or SHA-512 would be a safer route right now. It would make you look professional and cautious, which is good.
请注意,尽可能安全"与绝对安全"不同.请参阅下文以获得相当冗长的解释.
Note that "as secure as you can get" is not the same as "perfectly safe". See below for rather lengthy explanations.
关于已知攻击:
对 MD4、MD5 和 SHA-1 的已知攻击是关于碰撞的,不会影响原像抵抗力.已经证明 MD4 有一些弱点,在尝试破解 HMAC/MD4 时可以(仅在理论上)利用这些弱点,但这不适用于您的问题.Kesley 和 Schneier 的论文中的 2 秒原像攻击是一种通用权衡,仅适用于非常长的输入(2 字节;即一百万太字节 -- 请注意 106+60 是如何超过 160 的;这就是您看到权衡中没有任何魔法的地方).
The known attacks on MD4, MD5 and SHA-1 are about collisions, which do not impact preimage resistance. It has been shown that MD4 has a few weaknesses which can be (only theoretically) exploited when trying to break HMAC/MD4, but this does not apply to your problem. The 2 second preimage attack in the paper by Kesley and Schneier is a generic trade-off which applies only to very long inputs (2 bytes; that's a million terabytes -- notice how 106+60 exceeds 160; that's where you see that the trade-off has nothing magic in it).
此消息的其余部分假设您使用的哈希函数(例如 SHA-1)是一个黑匣子",没有攻击者可以使用的特殊属性.即使使用损坏的"哈希函数 MD5 和 SHA-1,这也是您现在所拥有的.
The rest of this message assumes that the hash function you use (e.g. SHA-1) is a "black box" with no special property that the attacker may use. That's what you have right now even with the "broken" hash functions MD5 and SHA-1.
关于彩虹桌:
彩虹攻击"实际上是字典或蛮力攻击的成本分摊.它是 时间的衍生物- 内存权衡首先由 Hellman 在 1980 年描述.假设您有 N 个可能的密码(这是您的字典的大小,或 2 如果您考虑使用 n 位输出对哈希函数进行暴力破解,则存在分时攻击,您可以预先计算 N 哈希值密码并将它们存储在一个大表中.如果对哈希输出进行排序,则可以通过一次查找获得密码.rainbow table 是一种巧妙的方式来存储空间大大减少的表格.您只存储 N/t 个散列密码,并通过 O(t) 次查找破解密码.Rainbow 表可让您以虚拟方式处理比实际存储的表大得多的预计算表.
The "rainbow attack" is actually cost-sharing of a dictionary or brute force attack. It is a derivative from the time-memory trade-off first described by Hellman in 1980. Assuming that you have N possible passwords (that's the size of your dictionary, or 2 if you consider brute-forcing a hash function with an output of n bits), there is a time-sharing attack in which you precompute the N hashed passwords and store them in a big table. If you sort the hash outputs, you can get your password in a single lookup. A rainbow table is a smart way to store that table with a much reduced space. You store only N/t hashed passwords, and you crack passwords with O(t) lookups. Rainbow tables allow you to virtually handle precomputed tables much larger than what you can realistically store.
然而,不管是不是彩虹,攻击者仍然必须至少进行一次全面攻击.这可以看作是几个连续的优化层:
However, rainbow or not, the attacker still has to run the full attack at least once. This can be viewed as several successive optimization layers:
- 蛮力/字典攻击需要花费 N 来破解每个密码.
- 使用预先计算好的表,攻击者支付N一次的费用,然后可以攻击许多个密码,每次只需很少的额外费用密码.
- 如果预先计算的表是彩虹表,那么N可以大一些,因为降低了存储成本.N 上的瓶颈在于攻击者可以聚集的 CPU 能力,而不是他的硬盘大小.
- The brute-force / dictionary attack has cost N for cracking each password.
- With a pre-computed table, the attacker pays that cost N once and can thereafter attack many passwords with very small extra cost per password.
- If the pre-computed table is a rainbow table, then N can be somewhat bigger, because storage cost is reduced. The bottleneck on N becomes the CPU power that the attacker can muster, not the size of his harddisks.
如果 N 足够大以至于散列 N 个密码的 CPU 成本是可笑的,那么这种攻击是不可行的,无论是使用彩虹表还是不是.这意味着具有 80 位或更多输出的(抗原像的)哈希函数足以使蛮力攻击不可行.
If N is large enough that the CPU-cost of hashing N passwords is ludicrous, then such an attack is not feasible, regardless of whether rainbow tables are used or not. This means that a (preimage-resistant) hash function with an output of 80 bits or more is enough to make brute-force attack infeasible.
关于盐:
salts 是一种打败预计算的方法.在上面的描述中,salt 将攻击者带回到第 1 步:salting 防止攻击者在多个被攻击的密码之间共享 O(N) 成本.预先计算的表格,更重要的是彩虹表格,不再可行.
Salts are a way to defeat pre-computations. In the description above, the salt brings back the attacker to step 1: salting prevents the attacker from sharing the O(N) cost between several attacked passwords. Pre-computed tables, a fortiori rainbow tables, are no longer feasible.
你想要加盐是因为当散列数据包含在密码中时,即适合随机人类大脑的东西,那么N可能非常低:人类在选择和记住密码方面真的很糟糕.这就是字典攻击"的含义:假设许多用户密码将位于该特别选择的空间中,使用潜在密码(字典")的减少空间.
You want salting because when the hashed data consists in passwords, i.e. something which fits within the brain of a random human being, then N can be quite low: humans are really bad at choosing and remembering passwords. This is what "dictionary attacks" are about: that's using a reduced space of potential passwords (the "dictionary") under the assumption that many user passwords will be in that specially selected space.
因此加盐至少会阻止攻击者使用预先计算的表格,特别是预先计算的彩虹表格.这假设攻击者将能够破解一两个密码;我们不希望他以很少的额外开销破解其他 1000 个密码.
Hence salting will at least prevent the attacker from using pre-computed tables, in particular pre-computed rainbow tables. This assumes that the attacker will be able to break one password or two; we do not want him to break 1000 other passwords with little extra overhead.
此外,腌制有利于公共关系.
Also, salting is good for public relations.
关于 SHA-1 成本:
SHA-1 的基本成本是对 64 字节块进行散列.这就是 SHA-1 的工作原理:填充数据,然后拆分为 64 字节的块.在 Intel Core2 系统上处理单个块的成本约为 500 个时钟周期,这是单核的.MD5 和 MD4 更快,分别计算大约 400 和 250 个周期.不要忘记,大多数现代 CPU 都有多个内核,因此要相应地增加.
The elementary cost of SHA-1 is about hashing a 64-byte block. That's how SHA-1 works: data is padded, then split into 64-byte blocks. The cost of processing a single block is about 500 clock cycles on an Intel Core2 system, and that's for a single core. MD5 and MD4 are faster, count about 400 and 250 cycles, respectively. Do not forget that most modern CPU have several cores, so multiply accordingly.
一些盐渍计划规定了大量的盐;例如进入散列函数的实际上是单个 128 位盐的 40000 个连续副本,然后是密码本身.这使得密码散列对于合法用户和攻击者而言都更加昂贵(在我的示例中是 10000 倍).这是否是一个好主意取决于设置.对于在桌面系统上登录,这很好:用户甚至不会注意到散列密码需要 10 毫秒,而不是 1 微秒;但是攻击者的成本已经上升了非常明显的 10000 倍.在每秒有数千个客户端的共享服务器上,总成本可能会变得令人望而却步.从概念上讲,将合法用户和攻击者的标准提高相同的因素最终并不是很好的安全性;但在某些特定情况下,这可能是一个有价值的想法.
Some salting schemes prescribe huge salts; e.g. what enters the hash function is actually 40000 successive copies of a single 128-bit salt, followed by the password itself. This makes password hashing more expensive (by a factor of 10000 with my example), both for the legitimate user and for the attacker. Whether this is a good idea depends on the setup. For login on a desktop system, this is good: the user will not even notice that it took 10ms to hash his password, instead of 1µs; but the cost for the attacker has risen by a very noticeable factor 10000. On shared servers with thousands of clients per second, the aggregate cost may become prohibitive. Conceptually, raising the bar by the same factor for the legitimate user and the attacker is not ultimately good security; but it can be a worthwhile idea in some specific situations.
关于在线攻击:
以上所有内容都是关于击败离线攻击.离线攻击是攻击者拥有测试"密码所需的所有数据的攻击;例如攻击者可以获得包含散列密码的数据库副本.在离线攻击中,攻击者仅受其计算资源的限制.相反,在线攻击是一种攻击,其中攻击者的每次猜测都必须经过诚实的验证者(例如,攻击者只是尝试登录被攻击的系统).通过强制限制每秒可以尝试的密码数量来阻止在线攻击.极端的例子是在三个错误的 PIN 码后关闭的智能卡.
All of the above is about defeating offline attacks. An offline attack is an attack where the attacker has all the data he needs in order to "test" passwords; e.g. the attacker could get a copy of the database holding the hashed passwords. In an offline attack, the attacker is limited only by his computational resources. Conversely, an online attack is an attack where each guess by the attacker must go through an honest verifier (e.g. the attacker simply tries to log in on the attacked system). Online attacks are thwarted by enforcing limits on how many passwords can be tried per second. Extreme examples are smartcards which shut down after three wrong PINs.
通常,为了密码安全,安排系统不让攻击者进行离线攻击会带来更多回报.这就是 Unix 系统所做的:过去在世界可读的 /etc/password
文件中的散列密码现在在 /etc/shadow
文件中除少数特权应用程序外,防止读取访问.这里的假设是,如果攻击者可以读取/etc/shadow
,那么他可能对系统有足够的控制权,他不再真正需要密码......
Usually, for password security, it pays off much more to arrange the system for not letting an attacker build an offline attack. That's what Unix systems do: the hashed passwords, which used to be in the world-readable /etc/password
file, are now in the /etc/shadow
file which is protected against read access, except by a few privileged applications. The assumption here is that if the attacker can read /etc/shadow
, then he probably has enough control over the system that he does not really need passwords anymore...
这篇关于SHA-1 密码存储安全吗?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!