How to hash long passwords (> 72 characters) with Blowfish

Question

Last week I read a lot of articles about password hashing, and Blowfish seems to be (one of) the best hashing algorithms right now, but that's not the topic of this question!

Blowfish only considers the first 72 characters of the entered password:

<?php
$password = "Wow. This is a super secret and super, super long password. Let's add some special ch4r4ct3rs a#d everything is fine :)";
$hash = password_hash($password, PASSWORD_BCRYPT);
var_dump($password);

$input = substr($password, 0, 72);
var_dump($input);

var_dump(password_verify($input, $hash));
?>

The output is:

string(119) "Wow. This is a super secret and super, super long password. Let's add some special ch4r4ct3rs a#d everything is fine :)"
string(72) "Wow. This is a super secret and super, super long password. Let's add so"
bool(true)

As you can see, only the first 72 characters matter. Twitter uses Blowfish, aka bcrypt, to store its passwords (https://shouldichangemypassword.com/twitter-hacked.php), and guess what: change your Twitter password to one longer than 72 characters and you can log in to your account by entering only the first 72 characters.

There are a lot of different opinions about "peppering" passwords. Some people say it's unnecessary, because you have to assume that the secret pepper string is also known/published, so it doesn't strengthen the hash. I have a separate database server, so it's quite possible that only the database is leaked and not the constant pepper.

In this case (pepper not leaked) you make a dictionary-based attack more difficult (correct me if this isn't right). If your pepper string is also leaked: not that bad. You still have the salt, and the password is as well protected as a hash without a pepper.

So I think peppering the password is at least not a bad choice.

My suggestion for getting a Blowfish hash of a password with more than 72 characters (and a pepper) is:

<?php
$pepper = "foIwUVmkKGrGucNJMOkxkvcQ79iPNzP5OKlbIdGPCMTjJcDYnR";

// Generate Hash
$password = "Wow. This is a super secret and super, super long password. Let's add some special ch4r4ct3rs a#d everything is fine :)";
$password_peppered = hash_hmac('sha256', $password, $pepper);
$hash = password_hash($password_peppered, PASSWORD_BCRYPT);

// Check
$input = substr($password, 0, 72);
$input_peppered = hash_hmac('sha256', $input, $pepper);

var_dump(password_verify($input_peppered, $hash));
?>

This is based on this question: password_verify returns false.

What is the safer way? Getting a SHA-256 hash first (which returns 64 characters), or considering only the first 72 characters of the password?

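Pros of hashing first

  • The user can't log in by entering just the first 72 characters
  • You can add the pepper without exceeding the character limit
  • The output of hash_hmac would probably have more entropy than the password itself
  • The password is hashed by two different functions

Cons of hashing first

  • Only 64 characters are used to build the Blowfish hash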


Edit 1: This question addresses only the PHP integration of Blowfish/bcrypt. Thanks for the comments!

Recommended answer

The problem here is basically a problem of entropy. So let's start looking there:

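The number of bits of entropy per character is:

  • Hex characters
    • Bits: 4
    • Values: 16
    • Entropy in 72 characters: 288 bits
  • Alphanumeric
    • Bits: 6
    • Values: 62
    • Entropy in 72 characters: 432 bits
  • "Common" symbols
    • Bits: 6.5
    • Values: 94
    • Entropy in 72 characters: 468 bits
  • Full bytes
    • Bits: 8
    • Values: 255
    • Entropy in 72 characters: 576 bits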
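The per-character figures in the list above follow from taking log2 of the number of possible values. As a rough sketch (the array keys and the helper name `bitsPerChar` are made up for this illustration), the exact values come out slightly different from the rounded 6 and 6.5 used in the list:

```php
<?php
// Bits of entropy per uniformly random character = log2(number of possible values).
// The value counts and the 72-character length mirror the list above.
$alphabets = [
    'hex'            => 16,
    'alphanumeric'   => 62,
    'common symbols' => 94,
    'full bytes'     => 255, // the list uses 255; a likely reason is that bcrypt input cannot carry a NUL byte
];

function bitsPerChar(int $values): float
{
    return log($values, 2);
}

foreach ($alphabets as $name => $values) {
    $bits = bitsPerChar($values);
    printf("%-15s %.2f bits/char, %.0f bits in 72 chars\n", $name, $bits, 72 * $bits);
}
```

This only models characters picked uniformly at random; human-chosen passwords carry far less entropy per character, which is the point the rest of the answer builds on.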

So, how we act depends on what type of characters we expect.

The first problem with your code is that your "pepper" hash step outputs hex characters (since the fourth parameter to hash_hmac() is not set).

Therefore, by hashing your pepper in, you're effectively cutting the maximum entropy available to the password by a factor of 2 (from 576 to 288 possible bits).
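To make the hex-versus-raw difference concrete, here is a small sketch. The pepper and password values are placeholders invented for the example; only the `hash_hmac()` call and its fourth parameter come from the discussion above.

```php
<?php
// Hypothetical pepper and password, for illustration only.
$pepper   = 'example-pepper-not-a-real-secret';
$password = 'correct horse battery staple';

// Default: lowercase hex string, 4 bits of information per character.
$hex = hash_hmac('sha256', $password, $pepper);

// Fourth argument true: raw binary, 8 bits per byte.
$raw = hash_hmac('sha256', $password, $pepper, true);

var_dump(strlen($hex));           // int(64)
var_dump(strlen($raw));           // int(32)
var_dump(bin2hex($raw) === $hex); // bool(true)
```

Both strings carry the same 256 bits, but the hex form spends twice as many of bcrypt's 72 input bytes on them. One caveat if you experiment with raw output: it can contain NUL bytes, and bcrypt implementations typically treat a NUL as the end of the input, so raw digests are usually encoded (for example with base64) before being passed on.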

However, SHA-256 only provides 256 bits of entropy in the first place. So you're effectively cutting a possible 576 bits down to 256 bits. Your hash step immediately, by its very definition, loses at least 50% of the possible entropy in the password.

You could partially solve this by switching to SHA-512, where you'd only reduce the available entropy by about 11% (64 of 576 bits). But that's still a not-insignificant difference. Those 64 bits reduce the number of permutations by a factor of 1.8e19. That's a big number... and that's just the factor it reduces it by...
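The SHA-512 figures can be checked with a few lines of arithmetic. This is just a worked calculation under the answer's own assumptions (72 full bytes in, 512 bits out); the variable names are invented for the example.

```php
<?php
$maxBcryptBits = 72 * 8; // 576 bits: 72 bytes at 8 bits each
$sha512Bits    = 512;    // size of a SHA-512 digest

$lostBits    = $maxBcryptBits - $sha512Bits;     // 64
$lostPercent = 100 * $lostBits / $maxBcryptBits; // roughly 11.1
$factor      = 2 ** $lostBits;                   // overflows to a float, roughly 1.8e19

printf("%d bits lost (%.1f%%), factor %.1e\n", $lostBits, $lostPercent, $factor);
```

The factor is simply 2 raised to the number of lost bits, which is why a seemingly small percentage corresponds to such a large reduction in possible inputs.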

The underlying issue is that there are three types of passwords over 72 characters. The impact that this style of system has on them will be very different:

Note: from here on out I'm assuming we're comparing to a pepper system which uses SHA-512 with raw output (not hex).

• High entropy random passwords

These are your users using password generators which generate what amount to large keys for passwords. They are random (generated, not human-chosen) and have high entropy per character. These types use high bytes (characters > 127) and some control characters.

For this group, your hashing function will significantly reduce the entropy available to bcrypt.

Let me say that again. For users who use high entropy, long passwords, your solution reduces the strength of their password by a measurable amount (64 bits of entropy lost for a 72 character password, and more for longer passwords).

• Medium entropy random passwords

This group is using passwords containing common symbols, but no high bytes or control characters. These are your typable passwords.

For this group, you are going to unlock slightly more entropy (not create it, but allow more entropy to fit into the bcrypt password). When I say slightly, I mean slightly. The break-even occurs when you max out the 512 bits that SHA-512 has. Therefore, the peak is at 78 characters.

Let me say that again. For this class of passwords, you can only store an additional 6 characters before you run out of entropy.

• Low entropy non-random passwords

This is the group using alphanumeric characters that are probably not randomly generated. Something like a Bible quote or such. These phrases have approximately 2.3 bits of entropy per character.

For this group, you can unlock significantly more entropy (not create it, but allow more to fit into the bcrypt password input) by hashing. The break-even is around 223 characters before you run out of entropy.

Let's say that again. For this class of passwords, pre-hashing definitely increases security significantly.
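The break-even lengths quoted for the three groups all come from dividing the 512-bit digest size by the assumed entropy per character. A minimal sketch (the function name `breakEvenLength` is invented here; the per-character figures are the ones used in the answer):

```php
<?php
// Longest password whose entropy still fits in a digest of $digestBits,
// assuming $bitsPerChar bits of entropy per character.
function breakEvenLength(int $digestBits, float $bitsPerChar): float
{
    return $digestBits / $bitsPerChar;
}

printf("%.1f\n", breakEvenLength(512, 8.0)); // 64.0  (full bytes: already below bcrypt's 72)
printf("%.1f\n", breakEvenLength(512, 6.5)); // 78.8  (common symbols)
printf("%.1f\n", breakEvenLength(512, 2.3)); // 222.6 (natural-language phrases)
```

Anything above 72 is a gain over plain bcrypt and anything below it is a loss, which is why the first group is hurt, the second barely helped, and the third helped the most.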

These kinds of entropy calculations don't really matter much in the real world. What matters is guessing entropy. That's what directly affects what attackers can do. That's what you want to maximize.

While little research has gone into guessing entropy, there are some points I'd like to make.

The chances of randomly guessing 72 correct characters in a row are extremely low. You're more likely to win the Powerball lottery 21 times than to have this collision... That's how big a number we're talking about.

But we may not stumble on it by pure chance. In the case of phrases, the chance of the first 72 characters being the same is a whole lot higher than for a random password. But it's still trivially low (you're more likely to win the Powerball lottery 5 times, based on 2.3 bits per character).

Practically, it doesn't really matter. The chances of someone guessing the first 72 characters right, where the later ones make a significant difference, are so low that it's not worth worrying about. Why?

Well, let's say you're taking a phrase. If the person can get the first 72 characters right, they are either really lucky (not likely), or it's a common phrase. If it's a common phrase, the only variable is how long to make it.

Let's take an example. Let's take a quote from the Bible (just because it's a common source of long text, not for any other reason):

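You shall not covet your neighbor's house. You shall not covet your neighbor's wife, or his manservant or maidservant, his ox or donkey, or anything that belongs to your neighbor.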

That's about 180 characters. The 72-character cutoff falls in the middle of the second "neighbor's", right around the g. If you guessed that much, you're likely not stopping mid-word, but continuing with the rest of the verse (since that's how the password is likely to be used). Therefore, your "hash" didn't add much.

BTW: I am ABSOLUTELY NOT advocating using a Bible quote. In fact, the exact opposite.

You're not really going to help people who use long passwords much by hashing first. Some groups you can definitely help. Some you can definitely hurt.

But in the end, none of it is overly significant. The numbers we are dealing with are just WAY too high. The difference in entropy isn't going to be much.

You're better off leaving bcrypt as it is. You're more likely to screw up the hashing (literally, you've done it already, and you're not the first or last to make that mistake) than the attack you're trying to prevent is to happen.

Focus on securing the rest of the site. And add a password entropy meter to the password box on registration to indicate password strength (and to tell users when a password is overlong, so they may wish to change it)...
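A very rough server-side sketch of such a meter is below. The function names and the character-pool sizes are assumptions made for this example, and the estimate is a naive upper bound (length times log2 of the pool size), not a measure of guessing entropy; a real meter would also need to account for dictionary words and common patterns.

```php
<?php
// Naive upper-bound estimate: length * log2(size of the character pool in use).
// It ignores dictionary words and patterns, so real guessing entropy is lower.
function estimatePasswordBits(string $password): float
{
    $pool = 0;
    if (preg_match('/[a-z]/', $password)) { $pool += 26; }
    if (preg_match('/[A-Z]/', $password)) { $pool += 26; }
    if (preg_match('/[0-9]/', $password)) { $pool += 10; }
    if (preg_match('/[^a-zA-Z0-9]/', $password)) { $pool += 32; } // rough count of ASCII punctuation
    return $pool > 0 ? strlen($password) * log($pool, 2) : 0.0;
}

// bcrypt only reads the first 72 bytes of its input, so flag anything longer.
function exceedsBcryptLimit(string $password): bool
{
    return strlen($password) > 72;
}

printf("%.1f bits\n", estimatePasswordBits('Tr0ub4dor&3'));
var_dump(exceedsBcryptLimit(str_repeat('a', 73))); // bool(true)
```

Note that `strlen()` counts bytes, which matches how bcrypt sees the input: a password with multi-byte UTF-8 characters hits the limit in fewer than 72 visible characters.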

That's my $0.02 at least (or possibly way more than $0.02)...

There is literally no research into feeding one hash function into bcrypt. Therefore, it's unclear at best whether feeding a "peppered" hash into bcrypt will ever cause unknown vulnerabilities (we know doing hash1(hash2($value)) can expose significant vulnerabilities around collision resistance and preimage attacks).

Considering that you're already thinking about storing a secret key (the "pepper"), why not use it in a way that's well studied and understood? Why not encrypt the hash prior to storing it?

Basically, after you hash the password, feed the entire hash output into a strong encryption algorithm. Then store the encrypted result.

Now, an SQL injection attack will not leak anything useful, because the attackers don't have the cipher key. And if the key is leaked, the attackers are no better off than if you used a plain hash (which is provable, something the pepper "pre-hash" doesn't provide).

Note: if you choose to do this, use a library. For PHP, I strongly recommend Zend Framework 2's Zend\Crypt package. It's actually the only one I'd recommend at this point in time. It's been strongly reviewed, and it makes all the decisions for you (which is a very good thing)...

Something like:

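use Zend\Crypt\BlockCipher;

// Wrapper class added so the two methods have somewhere to live;
// the class name and constructor are illustrative.
class PasswordStore
{
    private $key;  // secret encryption key, kept outside the database
    private $cost; // bcrypt cost factor

    public function __construct($key, $cost = 10) {
        $this->key = $key;
        $this->cost = $cost;
    }

    // Hash with bcrypt first, then encrypt the resulting hash for storage.
    public function createHash($password) {
        $hash = password_hash($password, PASSWORD_BCRYPT, ["cost" => $this->cost]);

        // Note: the mcrypt extension was removed from PHP core in 7.2;
        // check the library documentation for the adapter to use on newer versions.
        $blockCipher = BlockCipher::factory('mcrypt', array('algo' => 'aes'));
        $blockCipher->setKey($this->key);
        return $blockCipher->encrypt($hash);
    }

    // Decrypt the stored value back to the bcrypt hash, then verify as usual.
    public function verifyHash($password, $hash) {
        $blockCipher = BlockCipher::factory('mcrypt', array('algo' => 'aes'));
        $blockCipher->setKey($this->key);
        $hash = $blockCipher->decrypt($hash);

        return password_verify($password, $hash);
    }
}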
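For readers without that library at hand, the same hash-then-encrypt idea can be sketched with PHP's bundled OpenSSL extension. This is a swapped-in illustration using AES-256-GCM, not the Zend\Crypt approach recommended above; the function names `encryptHash` and `decryptHash` and the storage layout (nonce, tag and ciphertext concatenated and base64-encoded) are assumptions made for the example, and a reviewed library remains the safer choice.

```php
<?php
// Sketch only: encrypts a bcrypt hash with AES-256-GCM via PHP's OpenSSL extension.
// $key must be 32 random bytes kept outside the database (hypothetical setup).
function encryptHash(string $hash, string $key): string
{
    $iv = random_bytes(12); // fresh 96-bit nonce for every encryption
    $ciphertext = openssl_encrypt($hash, 'aes-256-gcm', $key, OPENSSL_RAW_DATA, $iv, $tag);
    if ($ciphertext === false) {
        throw new RuntimeException('Encryption failed');
    }
    // Store nonce, 16-byte authentication tag and ciphertext together.
    return base64_encode($iv . $tag . $ciphertext);
}

function decryptHash(string $stored, string $key): string
{
    $raw = base64_decode($stored, true);
    if ($raw === false || strlen($raw) < 28) {
        throw new RuntimeException('Malformed stored value');
    }
    $iv         = substr($raw, 0, 12);
    $tag        = substr($raw, 12, 16);
    $ciphertext = substr($raw, 28);
    $hash = openssl_decrypt($ciphertext, 'aes-256-gcm', $key, OPENSSL_RAW_DATA, $iv, $tag);
    if ($hash === false) {
        throw new RuntimeException('Decryption failed');
    }
    return $hash;
}

$key    = random_bytes(32);
$stored = encryptHash(password_hash('secret', PASSWORD_BCRYPT), $key);
var_dump(password_verify('secret', decryptHash($stored, $key))); // bool(true)
```

The password itself only ever goes through bcrypt; the key is applied to the finished hash, so losing the key leaves an attacker with ordinary bcrypt hashes and nothing weaker.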
    

And it's beneficial because you're using all of the algorithms in ways that are well understood and well studied (relatively, at least). Remember:

Anyone, from the most clueless amateur to the best cryptographer, can create an algorithm that he himself can't break.

• Bruce Schneier

08-01 17:30