本文介绍了为什么Java的hashCode不支持通用哈希?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

一些哈希表方案,例如或,依赖以及通过从通用散列函数族中挑选新的散列函数来获取表现冲突的数据集合并解决冲突的能力。

前一段时间,我试图在Java中实现一个哈希表,并支持杜鹃散列,并且遇到了麻烦,因为所有的Java对象都有一个 hashCode 函数,对于每个对象, hashCode 返回的值是固定的(当然,除非对象改变)。这意味着如果用户没有提供通用散列函数的外部系列,则不可能构建依赖于通用散列的散列表。

最初,我认为我可以通过直接将通用哈希函数应用于对象的 hashCode s来解决此问题,但是这不起作用,因为如果两个对象具有相同的 hashCode ,那么您应用于这些哈希码的任何确定性函数,选择散列函数,将导致相同的值,从而导致冲突。



看起来这对Java的设计是不利的。这意味着即使语言设计者可能认为这样的表适合于语言设计,也完全禁止使用基于通用散列的表的 HashMap 和其他散列容器。这也使得第三方库设计者更难以建立这种哈希表。



我的问题是:是否有一个原因是Java选择设计 hashCode ,而不考虑使用多个散列函数散列对象的可能性我明白,像链式散列或二次探测这样的许多好的散列方案并不需要它,但似乎这个决定使得很难在Java对象上使用某些类别的算法。

解决方案

简单即可。 Java允许类设计者提供他们自己的 hashCode ,正如你所提到的,它对于普通散列表来说足够好,并且可以是

此外,还有一个问题就是如何解决这个问题,当设计Java Collections API时,标准库中的通用哈希表已经足够大胆了。 C从来没有过他们。 C ++在STL中将它们作为 hash_set hash_map ,但是它们并没有成为标准。现在,在C ++ 0x中,哈希表再次被用于标准化。


Some hash table schemes, such as cuckoo hashing or dynamic perfect hashing, rely on the existence of universal hash functions and the ability to take a collection of data exhibiting collisions and resolve those collisions by picking a new hash function from the family of universal hash functions.

A while ago I was trying to implement a hash table in Java backed by cuckoo hashing and ran into trouble because while all Java objects have a hashCode function, the value that hashCode returns is fixed for each object (unless, of course, the objects change). This means that without the user providing an external family of universal hash functions, it's impossible to build a hash table that relies on universal hashing.

Inititially I thought that I could get around this by applying a universal hash function to the object's hashCodes directly, but this doesn't work because if two objects have the same hashCode, then any deterministic function you apply to those hash codes, even a randomly-chosen hash function, will result in the same value and thus cause a collision.

It seems like this would be detrimental to Java's design. It means that HashMap and other hash containers are completely prohibited from using tables based on universal hashing, even if the language designers may think that such tables would be appropriate in the language design. It also makes it harder for third-party library designers to build hash tables of this sort as well.

My question is: is there a reason that Java opted to design hashCode without considering the possibility of hashing objects with multiple hash functions? I understand that many good hashing schemes like chained hashing or quadratic probing don't require it, but it seems as though the decision makes it hard to use certain classes of algorithms on Java objects.

解决方案

Simplicity. Java allows class designers to provide their own hashCode, which as you mention is good enough for "ordinary" hash tables, and can be hard enough to understand.

Besides, when the Java Collections API was designed, having generic hash tables in the standard library was bold enough a move already. C has never had them. C++ had them in the STL as hash_set and hash_map, but those didn't make it into the standard. Only now, in C++0x, are hash tables being considered for standardization again.

这篇关于为什么Java的hashCode不支持通用哈希?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!

09-03 07:27