本文介绍了为什么在搜索键是 O(n) 时,哈希表查找只有 O(1) 时间?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

从技术上讲,根据我在这里阅读的帖子,哈希表在最坏的情况下确实是 O(n) 时间查找.但我不明白内部机制如何保证平均为 O(1) 时间.

Technically speaking, based on posts I've read here, Hash table is indeed O(n) time lookup in the worst case. But I don't get how the internal mechanics guarantee it to be O(1) time on average.

我的理解是,给定一些 n 个元素,理想情况是有 n 个桶,这会导致 O(1) 空间.这就是我卡住的地方.假设我想查找一个键是否在字典中,这肯定需要 O(n) 时间.那么,当我想通过使用其键的哈希值来搜索元素是否在哈希表中时,为什么会有所不同呢?简而言之,使用原始键值进行搜索需要 O(n) 时间,但使用哈希值是 O(1) 时间.这是为什么?

My understanding is that given some n elements, the ideal case is there are n buckets, which result in O(1) space. This is where I'm stuck. Suppose I want to lookup whether a key is in the dictionary, this definitely takes me O(n) time. So why does it make a difference when I want to search whether an element is in the hash table by using its key's hash value? To put it concisely, searching using raw key values gives O(n) time but using hash values it's O(1) time. Why is that?

难道我还需要一个一个地查找哈希值来查看哪个匹配吗?为什么散列会立即让我知道要检索哪个元素或这样的元素是否存在?

Wouldn't I still need to look up the hash values one by one to see which one matches? Why does the hashing let me know immediately which element to retrieve or whether such an element exists or not?

推荐答案

我认为您混淆了术语,并且通过考虑存储桶也使事情复杂化.

I think you are confusing terminology and also complicating matters by thinking about buckets.

让我们想象一个哈希表,它被实现为长度为 n 的数组 a.假设我们有 n 个可能的键和一个完美的散列函数 H,它将每个键 k 映射到唯一索引 ia中的code>.

Let's imagine a hash table that is implemented in as an array a of length n. Let's also imagine we have n possible keys and a perfect hash function H that maps each key k to a unique index i in a.

让我们通过将 a 中的每个值设置为 nil 来初始化我们的哈希表.

Let's initialize our hash table by setting every value in a to nil.

我们可以将键值对(k1, v1)插入到我们的哈希表中,方法是将值放在数组中的适当位置:

We can insert a key, value pair (k1, v1) into our hash table by placing the value at the appropriate position in the array:

a[H(k1)] = v1

现在假设稍后我们忘记了 k1 是否在哈希表中,我们想检查它是否存在.为此,我们只需查找 a[H(k1)] 并查看是否存在 any 值,即 a[H(k1)] != nil.这显然是一个常数时间查找.

Now let's say that later we forgot if k1 is in the hash table and we want to check if it's there. To do this we simply look up a[H(k1)] and see if any value is there, i.e. a[H(k1)] != nil. This is clearly a constant time lookup.

但是,如果我们想查看 v1,甚至是其他一些 v2 是否在哈希表中的任何位置,该怎么办?这并不容易,因为我们没有将 vi 映射到数组中某个位置的函数.它可以与任何键相关联.所以查看它是否存在于表中的唯一方法是扫描整个数组,检查每个值:

But what if we want to see if v1, or even some other v2 is anywhere in our hashtable? This is not so easy because we have no function that maps a vi to a position in our array. It could be associated with any key. So the only way to see if it exists in the table is to scan the entire array, checking every value:

for i in 0..n-1:
  if a[i] == v2:
    return true
return false

为了更具体一点,假设您的键是名称,而您的值是居住城市.现在比较询问Bob Jones 在哈希表中吗?"到哈希表中有来自纽约的人吗?".我们可以散列Bob Jones"并查看对应的数组位置是否有任何内容(因为这就是Bob Jones"的插入方式),但我们没有类似的快速方法来查找New York".

To make this a bit more concrete, imagine your keys are names, and your values are cities of residence. Now compare asking "Is Bob Jones in the hash table?" to "Is there anyone from New York in the hash table?". We can hash "Bob Jones" and see if there's anything in the corresponding array position (because that's how "Bob Jones" would have been inserted), but we have no similarly quick way to look up "New York".

我假设这就是您要问的,并且您对术语有些混淆.如果这不是您想要的,请发表评论.

I am assuming this is what you are asking, and you have confused the terminology a bit. Please comment if this is not what you wanted.

这篇关于为什么在搜索键是 O(n) 时,哈希表查找只有 O(1) 时间?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!

07-30 02:39