Question
I'm using SHA-1 to detect duplicates in a program that handles files. It does not need to be cryptographically strong and may be reversible. I found this list of fast hash functions: https://code.google.com/p/xxhash/
What should I choose if I want a faster function whose collision rate on random data is close to SHA-1's?
Maybe a 128-bit hash is good enough for file deduplication? (vs. 160-bit SHA-1)
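To get a feel for whether 128 bits is enough, the birthday bound approximates the collision probability for n hashed items and a b-bit hash as n²/2^(b+1). A minimal sketch (the item count of one billion is an illustrative assumption, not from the question):

```python
# Approximate birthday-bound collision probability: p ≈ n^2 / 2^(b+1)
def collision_probability(n_items: int, bits: int) -> float:
    return n_items ** 2 / 2 ** (bits + 1)

# Even with a billion hashed chunks, a 128-bit hash stays at a
# vanishingly small collision probability; 160-bit SHA-1 is lower still.
p128 = collision_probability(10 ** 9, 128)
p160 = collision_probability(10 ** 9, 160)
print(f"128-bit: {p128:.3e}, 160-bit: {p160:.3e}")
```

Roughly 1.5e-21 for 128 bits versus 3.4e-31 for 160 bits, so for deduplication purposes the shorter digest loses nothing in practice.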
In my program the hash is computed on chunks of 0–512 KB.
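A per-chunk hashing loop like the one described might look like this (a sketch; the chunk size matches the question, but the function names are my own):

```python
import hashlib
from io import BytesIO

CHUNK_SIZE = 512 * 1024  # 512 KB, as in the question

def chunk_digests(stream):
    """Yield a SHA-1 digest for each up-to-512 KB chunk of a stream."""
    while True:
        chunk = stream.read(CHUNK_SIZE)
        if not chunk:
            break
        yield hashlib.sha1(chunk).hexdigest()

# Duplicate chunks produce identical digests, so a dict keyed on the
# digest is enough to detect them.
seen = {}
for i, digest in enumerate(chunk_digests(BytesIO(b"\x00" * (3 * CHUNK_SIZE)))):
    seen.setdefault(digest, []).append(i)
print(seen)  # all three zero-filled chunks share one digest
```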
Answer
Maybe this can help you: https://softwareengineering.stackexchange.com/questions/49550/which-hashing-algorithm-is-best-for-uniqueness-and-speed
I don't know much about xxHash, but it also looks promising.
MurmurHash is very fast, and version 3 supports a 128-bit length; I would choose this one. (It has Java and Scala implementations.)
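MurmurHash3 and xxHash live in third-party packages (e.g. `mmh3` and `xxhash` on PyPI), so this sketch uses the standard library's BLAKE2b truncated to 128 bits just to show the shape of a 128-bit dedup key; the non-cryptographic libraries are used analogously:

```python
import hashlib

def dedup_key(data: bytes) -> bytes:
    # 128-bit digest via BLAKE2b's configurable digest size. For raw
    # speed you could swap in a non-cryptographic 128-bit hash such as
    # MurmurHash3 or XXH128 from the third-party mmh3/xxhash packages
    # (exact function names in those packages are an assumption here).
    return hashlib.blake2b(data, digest_size=16).digest()

key = dedup_key(b"some file chunk")
print(len(key))  # 16 bytes = 128 bits
```

Whichever function you pick, the deduplication logic only needs the digests of equal inputs to match and accidental collisions to be negligible at the 128-bit birthday bound.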