问题描述
我们知道,某些Python的数据结构使用哈希表来存储set
或dictionary
之类的项目.因此,这些对象中没有顺序.但似乎,对于某些数字序列而言,这是不正确的.
As we know, Some of Python's data structures use hash tables for storing items like set
or dictionary
. So there is no order in these objects. But it seems that, for some sequences of numbers that's not true.
例如,请考虑以下示例:
For example consider the following examples :
>>> set([7,2,5,3,6])
set([2, 3, 5, 6, 7])
>>> set([4,5,3,0,1,2])
set([0, 1, 2, 3, 4, 5])
但是,如果我们进行一些小的更改,则无法排序:
But it isn't sorted if we make a small change :
>>> set([8,2,5,3,6])
set([8, 2, 3, 5, 6])
所以问题是:Python的哈希函数如何在整数序列上工作?
So the question is: How does Python's hash function work on integer sequences?
推荐答案
尽管SO中有很多关于hash
及其顺序的问题,但是没有人解释哈希函数的算法.
Although there are a lot of questions in SO about hash
and its order,but no one of them explains the algorithm of hash function.
所以您在这里需要知道的是python如何计算哈希表中的索引.
So all you need here is know that how python calculate the indices in hash table.
如果您通过 hashtable.c
在CPython源代码中的文件中,您会在_Py_hashtable_set
函数中看到以下几行内容,该行显示了python计算哈希表键索引的方式:
If you go through the hashtable.c
file in CPython source code you'll see the following lines in _Py_hashtable_set
function which shows the way python calculate the index of hash table keys :
key_hash = ht->hash_func(key);
index = key_hash & (ht->num_buckets - 1);
因此,因为整数的哈希值是整数本身*(-1除外),所以索引基于数据结构的数目和长度(ht->num_buckets - 1
),并且它是按位计算的,且介于和号码.
So as the hash value of integers is the integer itself * (except for -1) the index is based on the number and the length of your data structure (ht->num_buckets - 1
) and it calculated with Bitwise-and between (ht->num_buckets - 1)
and the number.
现在考虑下面的示例,其中set
使用哈希表:
Now consider the following example with set
that use hash-table :
>>> set([0,1919,2000,3,45,33,333,5])
set([0, 33, 3, 5, 45, 333, 2000, 1919])
对于数字33
,我们有:
33 & (ht->num_buckets - 1) = 1
实际上是:
'0b100001' & '0b111'= '0b1' # 1 the index of 33
注意在这种情况下,(ht->num_buckets - 1)
是8-1=7
或0b111
.
Note in this case (ht->num_buckets - 1)
is 8-1=7
or 0b111
.
对于1919
:
'0b11101111111' & '0b111' = '0b111' # 7 the index of 1919
对于333
:
'0b101001101' & '0b111' = '0b101' # 5 the index of 333
以及上述示例:
>>> set([8,2,5,3,6])
set([8, 2, 3, 5, 6])
'0b1000' & '0b100'='0b0' # for 8
'0b110' & '0b100'='0b100' # for 8
这篇关于Python的哈希函数顺序背后的逻辑是什么?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!