Question
If we look at it from a Java perspective, we can say that a hashmap lookup takes constant time. But what about the internal implementation? It still would have to search through the particular bucket (the one the key's hashcode mapped to) for different matching keys. Then why do we say that hashmap lookup takes constant time? Please explain.
Answer
Under the appropriate assumptions on the hash function being used, we can say that hash table lookups take expected O(1) time (assuming you're using a standard hashing scheme like linear probing or chained hashing). This means that on average, the amount of work that a hash table does to perform a lookup is at most some constant.
Intuitively, if you have a "good" hash function, you would expect elements to be distributed more or less evenly throughout the hash table, meaning that the number of elements in each bucket would be close to the total number of elements divided by the number of buckets. If the hash table implementation keeps this number low (say, by adding more buckets every time the ratio of elements to buckets exceeds some constant), then the expected amount of work ends up being some baseline amount of work to choose which bucket should be scanned, plus "not too much" work looking at the elements there, because in expectation there will only be a constant number of elements in that bucket.
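As a rough sketch of that idea (a simplified illustration, not java.util.HashMap's actual implementation), here is a chained hash table that doubles its bucket array whenever the load factor exceeds a threshold, keeping the expected bucket length constant:

```java
import java.util.LinkedList;

// Minimal chained hash table sketch. Each bucket is a list of {key, value}
// pairs; resizing keeps elements-per-bucket bounded by MAX_LOAD on average.
class ChainedHashTable<K, V> {
    private static final double MAX_LOAD = 0.75; // resize threshold
    private LinkedList<Object[]>[] buckets;
    private int size = 0;

    @SuppressWarnings("unchecked")
    ChainedHashTable() { buckets = new LinkedList[8]; }

    private int indexFor(K key) {
        return (key.hashCode() & 0x7fffffff) % buckets.length;
    }

    @SuppressWarnings("unchecked")
    V get(K key) {
        LinkedList<Object[]> bucket = buckets[indexFor(key)]; // O(1) bucket choice
        if (bucket != null) {
            for (Object[] entry : bucket) {       // expected O(1) entries scanned
                if (entry[0].equals(key)) return (V) entry[1];
            }
        }
        return null;
    }

    void put(K key, V value) {
        if ((double) (size + 1) / buckets.length > MAX_LOAD) resize();
        int i = indexFor(key);
        if (buckets[i] == null) buckets[i] = new LinkedList<>();
        for (Object[] entry : buckets[i]) {
            if (entry[0].equals(key)) { entry[1] = value; return; }
        }
        buckets[i].add(new Object[]{key, value});
        size++;
    }

    @SuppressWarnings("unchecked")
    private void resize() {
        LinkedList<Object[]>[] old = buckets;
        buckets = new LinkedList[old.length * 2];
        size = 0;
        // Re-insert everything; after doubling, the load factor is halved.
        for (LinkedList<Object[]> bucket : old) {
            if (bucket == null) continue;
            for (Object[] entry : bucket) put((K) entry[0], (V) entry[1]);
        }
    }
}
```

The key point is in `get`: choosing the bucket is constant work, and the resize policy keeps the expected length of the scan inside the bucket constant too.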
This doesn't mean that hash tables have guaranteed O(1) behavior. In fact, in the worst case, the hashing scheme will degenerate and all elements will end up in one bucket, making lookups take time Θ(n) in the worst case. This is why it's important to design good hash functions.
For more information, you might want to read an algorithms textbook to see the formal derivation of why hash tables support lookups so efficiently. This is usually included as part of a typical university course on algorithms and data structures, and there are many good resources online.
Fun fact: there are certain types of hash tables (cuckoo hash tables, dynamic perfect hash tables) where the worst case lookup time for an element is O(1). These hash tables work by guaranteeing that each element can only be in one of a few fixed positions, with insertions sometimes scrambling around elements to try to make everything fit.
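As an illustration of why cuckoo hashing gives worst-case O(1) lookups (a simplified sketch, not a production implementation; `h2` below is a made-up second hash function for demonstration), note that a lookup only ever inspects two slots, one per hash function:

```java
// Cuckoo hashing sketch: every key lives in table1[h1(key)] or table2[h2(key)],
// so contains() checks at most two slots -- worst-case O(1).
class CuckooTable {
    private final String[] table1 = new String[16];
    private final String[] table2 = new String[16];

    private int h1(String key) {
        return (key.hashCode() & 0x7fffffff) % table1.length;
    }
    private int h2(String key) {
        // Illustrative second hash derived from hashCode; real implementations
        // use independent hash functions.
        return ((key.hashCode() * 31 + 7) & 0x7fffffff) % table2.length;
    }

    boolean contains(String key) {
        return key.equals(table1[h1(key)]) || key.equals(table2[h2(key)]);
    }

    void insert(String key) {
        String cur = key;
        for (int kicks = 0; kicks < 32; kicks++) {  // bound the displacement chain
            int i = h1(cur);
            String evicted = table1[i];
            table1[i] = cur;                         // claim the slot
            if (evicted == null) return;
            cur = evicted;                           // displaced key moves to its other slot
            int j = h2(cur);
            String evicted2 = table2[j];
            table2[j] = cur;
            if (evicted2 == null) return;
            cur = evicted2;
        }
        // Real implementations rebuild the table with new hash functions here.
        throw new IllegalStateException("cycle detected; rehash needed");
    }
}
```

The "scrambling around" the answer mentions is the eviction loop in `insert`: a displaced element is pushed to its alternate slot, possibly displacing another, until everything fits or the table must be rebuilt.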
Hope this helps!