本文介绍了可变大小布尔数组的哈希算法的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一些布尔数组,它们的大小不是恒定的,并且我需要一个强大而快速的哈希算法,以使它们发生哈希冲突的机会降到最低。

I have some boolean arrays that their sizes are not constant, And I need a strong and fast hash algorithm to give minimum chance of hash collision for them.

我的自己的想法是计算每个布尔数组的整数值,但是例如,这2个数组将给出3的相同哈希值:

[0,1,1]和[1,1]

My own idea was calculating the integer value of each boolean array but for example these 2 arrays will give same hash of 3:
[0 , 1, 1] and [1, 1]

在计算整数值后,我想乘以数组的大小,但是这个想法也很烂,因为哈希冲突的可能性很高。

I thought to multiply the size of array after calculating integer value, but this idea also sucks, because there is a high chance of hash collision.

有人有个好主意吗?

推荐答案

插入前哨 true 元素放在数组的开头,然后将数组解释为二进制数。对于少于32个元素的数组,这是完美的散列(无冲突)。对于较大的数组,我建议对小于2 的大质数进行模运算。

Insert a sentinel true element at the start of the array, then interpret the array as a binary number. This is a perfect hash (no collisions) for arrays with less than 32 elements. For larger arrays I suggest doing the arithmetic modulo a large prime less than 2.

示例:

Array       | Binary | Decimal
------------+--------+---------
[ 0, 1, 1 ] |  01011 |      11
[ 1, 1 ]    |  00111 |       7

这与将数组解释为二进制数并使用<$进行按位或c $ c> 1<< n ,其中 n 是数组的大小。

This is the same as interpreting the array as a binary number and taking the bitwise OR with 1 << n, where n is the size of the array.

实现:

int hash(int[] array)
{
    int h = (1 << array.length);
    for (int i = 0; i < array.length; i++)
    {
        h = h | (array[i] << (array.length - i - 1));
    }
    return h;
}

这篇关于可变大小布尔数组的哈希算法的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!

08-21 20:28