Problem description
Summary: Is there a way to do this? Here's what I mean: suppose I have an unsigned int value and multiply it several times (with overflow, which is expected). Is it then possible to recover the original value?
Details:
It's all about the Rabin-Karp rolling hash. I have the hash of a long string, for example "abcd", and the hash of a shorter substring, for example "cd". How can I calculate the hash of "ab" in O(1), using the two given hashes?
What I have now as an algorithm:
- subtract the "cd" hash from the "abcd" hash (remove the last elements from the polynomial)
- divide the result by p ^ len( "cd" ), where p is the base (a prime number).
That is:
abcd: a * p ^ 3 + b * p ^ 2 + c * p ^ 1 + d * p ^ 0
cd: c * p ^ 1 + d * p ^ 0
The hash of ab is then:
(
( a * p ^ 3 + b * p ^ 2 + c * p ^ 1 + d * p ^ 0 ) -
( c * p ^ 1 + d * p ^ 0 )
)
/ ( p ^ 2 )
= a * p ^ 1 + b * p ^ 0
This works as long as there is no overflow (that is, if p is a small number). But once the arithmetic overflows, it stops working.
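The difference between the two cases can be sketched as follows. This is only an illustration, assuming a 64-bit unsigned `hash_t` and character codes used directly as coefficients; the helper names `poly_hash`, `ipow`, and `prefix_hash_by_division` are hypothetical and not part of the original question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

using hash_t = std::uint64_t;  // assumption: 64-bit unsigned hash type

// Polynomial hash s[0]*p^(n-1) + ... + s[n-1]*p^0, wrapping modulo 2^64.
hash_t poly_hash(const std::string& s, hash_t p)
{
    hash_t h = 0;
    for (unsigned char c : s)
    {
        h = h * p + c;  // Horner's rule
    }
    return h;
}

// p^n by plain repeated multiplication (wraps modulo 2^64).
hash_t ipow(hash_t p, std::size_t n)
{
    hash_t r = 1;
    while (n-- > 0)
    {
        r *= p;
    }
    return r;
}

// The approach from the question: subtract, then use integer division.
// Only meaningful while none of the intermediate values has wrapped.
hash_t prefix_hash_by_division(hash_t whole, hash_t suffix,
                               std::size_t suffix_len, hash_t p)
{
    return (whole - suffix) / ipow(p, suffix_len);
}

void demo_division()
{
    // Small base: nothing wraps, so the division recovers hash("ab").
    const hash_t small = 31;
    assert(prefix_hash_by_division(poly_hash("abcd", small),
                                   poly_hash("cd", small), 2, small)
           == poly_hash("ab", small));

    // Large base: the products wrap, and integer division no longer undoes them.
    const hash_t big = 2305843009213693951ULL;
    assert(prefix_hash_by_division(poly_hash("abcd", big),
                                   poly_hash("cd", big), 2, big)
           != poly_hash("ab", big));
}
```

Integer division is not the inverse of multiplication once values wrap modulo 2^64, which is why the second check comes out different.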
Is there a trick for this?
P.S. The c++ tag is there because of the integer overflow, which is language-specific (and different from Python, Scheme, etc.).
Recommended answer
The extended Euclidean algorithm is a good solution for this, but it's rather complicated and hard to implement. There's a better one.
There's another way to do this (thanks to a friend of mine (: )
There's a nice article on Wikipedia, "Modular multiplicative inverse", that covers using Euler's theorem in the case when m and a are coprime:
a ^ ( φ(m) - 1 ) ≡ a ^ ( -1 ) (mod m)
where φ(m) is Euler's totient function.
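To make the formula concrete, here is a small sketch with a tiny modulus. The helper names `pow_mod` and `inverse_by_euler` are made up for illustration, and the code assumes m is below 2^32 so that intermediate products fit in 64 bits.

```cpp
#include <cassert>
#include <cstdint>

// (base ^ exp) mod m by repeated squaring.
// Assumption: m < 2^32, so base * base never exceeds 64 bits.
std::uint64_t pow_mod(std::uint64_t base, std::uint64_t exp, std::uint64_t m)
{
    std::uint64_t result = 1 % m;
    base %= m;
    while (exp > 0)
    {
        if (exp & 1)
        {
            result = result * base % m;
        }
        base = base * base % m;
        exp >>= 1;
    }
    return result;
}

// Inverse of a modulo m via Euler's theorem.
// The caller supplies phi_m = φ(m) and must ensure gcd(a, m) == 1.
std::uint64_t inverse_by_euler(std::uint64_t a, std::uint64_t m, std::uint64_t phi_m)
{
    return pow_mod(a, phi_m - 1, m);
}

void demo_euler()
{
    // m = 16, φ(16) = 8: the inverse of 3 is 3^7 mod 16 = 11, and 3 * 11 = 33 ≡ 1.
    assert(inverse_by_euler(3, 16, 8) == 11);
    assert(3 * inverse_by_euler(3, 16, 8) % 16 == 1);
}
```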
In my case, m (the modulus) is the number of values of the hash type: 2^32, 2^64, etc. (64-bit in my case).
This means we only need to find the value of φ(m). But think about it: m == 2 ^ 64, which guarantees that m is coprime with every odd number and NOT coprime with any even number. So φ(m) is simply the number of all values divided by 2, that is, 2 ^ 63. (It also means the base a must be odd for an inverse to exist.)
Also, we know that the hash type is unsigned, as otherwise we would have some issues. So -1 converts to the largest value, 2 ^ 64 - 1, and halving it gives 2 ^ 63 - 1, which is exactly φ(m) - 1. This gives us the chance to do this:
hash_t x = -1;
x /= 2;
hash_t a_reverse = fast_pow( a, x );
For 64-bit numbers, x is a really big number (19 digits: 9 223 372 036 854 775 807), but fast_pow is really fast, and we can cache the inverse in case we need it for more than one query.
fast_pow is a well-known algorithm:
hash_t fast_pow( hash_t source, hash_t pow )
{
    if( 0 == pow )
    {
        return 1;
    }
    if( 0 != pow % 2 )
    {
        return source * fast_pow( source, pow - 1 );
    }
    else
    {
        return fast_pow( source * source, pow / 2 );
    }
}
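The same square-and-multiply idea can also be written without recursion. The following is a sketch rather than the answer's own code: `fast_pow_iter` and `inverse_mod_2_64` are hypothetical names, and `hash_t` is assumed to be a 64-bit unsigned type so that all arithmetic wraps modulo 2^64.

```cpp
#include <cassert>
#include <cstdint>

using hash_t = std::uint64_t;  // assumption: 64-bit unsigned hash type

// Iterative exponentiation by squaring; all products wrap modulo 2^64.
hash_t fast_pow_iter(hash_t base, hash_t exp)
{
    hash_t result = 1;
    while (exp != 0)
    {
        if (exp & 1)
        {
            result *= base;  // this bit of the exponent is set
        }
        base *= base;        // square for the next bit
        exp >>= 1;
    }
    return result;
}

// Inverse of an odd value modulo 2^64, using the exponent φ(2^64) - 1 = 2^63 - 1.
hash_t inverse_mod_2_64(hash_t a)
{
    assert(a % 2 == 1);  // even values have no inverse modulo a power of two
    const hash_t exponent = static_cast<hash_t>(-1) / 2;  // 2^63 - 1
    return fast_pow_iter(a, exponent);
}
```

For any odd a, `a * inverse_mod_2_64(a)` wraps around to 1, which is the property the rest of the answer relies on.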
Addition: for example:
hash_t base = 2305843009213693951; // 9th mersenne prime
hash_t x = 1234567890987654321;
x *= fast_pow( base, 123456789 ); // x * ( base ^ 123456789 )
hash_t y = -1;
y /= 2;
hash_t base_reverse = fast_pow( base, y );
x *= fast_pow( base_reverse, 123456789 ); // x * ( base_reverse ^ 123456789 )
assert( x == 1234567890987654321 ) ;
Works perfectly and fast.
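Tying this back to the original question, the hash of "ab" can be recovered from the hashes of "abcd" and "cd" by multiplying with the inverse base instead of dividing. This is a minimal sketch under the same assumptions as above (64-bit unsigned `hash_t`, odd base); `poly_hash` and `prefix_hash` are hypothetical helper names, and `fast_pow` here is an iterative rewrite of the function shown earlier.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

using hash_t = std::uint64_t;  // assumption: 64-bit unsigned hash type

// Exponentiation by squaring, wrapping modulo 2^64.
hash_t fast_pow(hash_t base, hash_t exp)
{
    hash_t result = 1;
    while (exp != 0)
    {
        if (exp & 1)
        {
            result *= base;
        }
        base *= base;
        exp >>= 1;
    }
    return result;
}

// Polynomial hash s[0]*p^(n-1) + ... + s[n-1]*p^0, wrapping modulo 2^64.
hash_t poly_hash(const std::string& s, hash_t p)
{
    hash_t h = 0;
    for (unsigned char c : s)
    {
        h = h * p + c;
    }
    return h;
}

// hash(prefix) = (hash(whole) - hash(suffix)) * p_inverse ^ len(suffix)
hash_t prefix_hash(hash_t whole, hash_t suffix,
                   std::size_t suffix_len, hash_t p_inverse)
{
    return (whole - suffix) * fast_pow(p_inverse, suffix_len);
}

void demo_prefix_hash()
{
    const hash_t p = 2305843009213693951ULL;                       // odd base
    const hash_t p_inverse = fast_pow(p, static_cast<hash_t>(-1) / 2);

    assert(prefix_hash(poly_hash("abcd", p), poly_hash("cd", p), 2, p_inverse)
           == poly_hash("ab", p));
}
```

To get a true O(1) query, the powers of `p_inverse` would need to be precomputed for every suffix length of interest; calling `fast_pow` per query costs O(log len) multiplications.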