Problem description
I have updated a Java application to Java 8. The application relies heavily on HashMaps. When I run the benchmarks, I see unpredictable behavior: for some inputs the application runs faster than before, but for larger inputs it is consistently slower.
I've checked the profiler, and the most time-consuming operation is HashMap.get. I suspect the slowdown is due to the HashMap changes in Java 8, but that may not be the cause, since I have changed some other parts as well.
Is there an easy way to hook the original Java 7 HashMap into my Java 8 application, so that I change only the HashMap implementation and can see whether the performance difference is still there?
The following is a minimal program that tries to simulate what my application is doing. The basic idea is that I need to share nodes in the application. At some point at runtime, a node should be retrieved, or created if it does not already exist, based on some integer properties. The following uses only two integers, but in the real application I have one-, two- and three-integer keys.
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

public class Test1 {

    static int max_k1 = 500;
    static int max_k2 = 500;

    static Map<Node, Node> map;
    static Random random = new Random();

    public static void main(String[] args) {
        for (int i = 0; i < 15; i++) {
            long start = System.nanoTime();
            run();
            long end = System.nanoTime();
            System.out.println((end - start) / 1000_000);
        }
    }

    private static void run() {
        map = new HashMap<>();
        for (int i = 0; i < 10_000_000; i++) {
            Node key = new Node(random.nextInt(max_k1), random.nextInt(max_k2));
            Node val = getOrElseUpdate(key);
        }
    }

    private static Node getOrElseUpdate(Node key) {
        Node val;
        if ((val = map.get(key)) == null) {
            val = key;
            map.put(key, val);
        }
        return val;
    }

    private static class Node {

        private int k1;
        private int k2;

        public Node(int k1, int k2) {
            this.k1 = k1;
            this.k2 = k2;
        }

        @Override
        public int hashCode() {
            int result = 17;
            result = 31 * result + k1;
            result = 31 * result + k2;
            return result;
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj)
                return true;
            if (!(obj instanceof Node))
                return false;
            Node other = (Node) obj;
            return k1 == other.k1 && k2 == other.k2;
        }
    }
}
The benchmarking is primitive, but still, this is the result of 15 runs on Java 8:
8143 7919 7984 7973 7948 7984 7931 7992 8038 7975 7924 7995 6903 7758 7627
and this is for Java 7:
7247 6955 6510 6514 6577 6489 6510 6570 6497 6482 6540 6462 6514 4603 6270
The benchmark is primitive, so I would appreciate it if someone familiar with JMH or other benchmarking tools could run it, but from what I observe, the results are better on Java 7. Any ideas?
Solution
Your hashCode() is very poor. In the example you posted, you have 250000 unique values but only 15969 unique hash codes. Because of the large number of collisions, Java 8 swaps the bucket lists for trees. In your case this only adds overhead, because many elements not only share the same position in the hash table but also have the same hash code. The tree ends up as a linked list anyway.
There are a couple of ways to fix this:
- Improve your hashCode. return k1 * 500 + k2; resolves the issue.
- Use THashMap. Open addressing should work better in case of collisions.
- Make Node implement Comparable. This will be used by HashMap to construct a balanced tree in case of conflicts.
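The collision figures quoted in the answer can be checked directly. The following is a minimal sketch and not part of the original answer: the class name HashCodeCheck and the helper countDistinct are made up for illustration, and the Node variant shows just one possible compareTo ordering (by k1, then k2), which is an assumption rather than something the question specifies.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.IntBinaryOperator;

// Hypothetical helper class, named here only for illustration.
class HashCodeCheck {
    static final int MAX_K1 = 500;
    static final int MAX_K2 = 500;

    // Hash function from the question; it simplifies to 16337 + 31 * k1 + k2.
    static int originalHash(int k1, int k2) {
        int result = 17;
        result = 31 * result + k1;
        result = 31 * result + k2;
        return result;
    }

    // Replacement suggested in the answer; collision-free while 0 <= k2 < MAX_K2.
    static int improvedHash(int k1, int k2) {
        return k1 * MAX_K2 + k2;
    }

    // Counts the distinct hash codes over every (k1, k2) pair in range.
    static int countDistinct(IntBinaryOperator hash) {
        Set<Integer> seen = new HashSet<>();
        for (int k1 = 0; k1 < MAX_K1; k1++) {
            for (int k2 = 0; k2 < MAX_K2; k2++) {
                seen.add(hash.applyAsInt(k1, k2));
            }
        }
        return seen.size();
    }

    // Assumed variant of Node: improved hash plus a Comparable ordering
    // (k1 first, then k2) that is consistent with equals.
    static final class Node implements Comparable<Node> {
        final int k1;
        final int k2;

        Node(int k1, int k2) {
            this.k1 = k1;
            this.k2 = k2;
        }

        @Override
        public int hashCode() {
            return improvedHash(k1, k2);
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (!(obj instanceof Node)) return false;
            Node other = (Node) obj;
            return k1 == other.k1 && k2 == other.k2;
        }

        @Override
        public int compareTo(Node other) {
            int c = Integer.compare(k1, other.k1);
            return c != 0 ? c : Integer.compare(k2, other.k2);
        }
    }

    public static void main(String[] args) {
        System.out.println("keys:     " + MAX_K1 * MAX_K2);               // 250000
        System.out.println("original: " + countDistinct(HashCodeCheck::originalHash)); // 15969
        System.out.println("improved: " + countDistinct(HashCodeCheck::improvedHash)); // 250000
    }
}
```

The original function collides so often because 31 * k1 + k2 only spans 0 to 15968 when both keys are below 500; for example, (0, 31) and (1, 0) hash to the same value. Note that k1 * 500 + k2 stays unique only while k2 is below 500, so a different multiplier would be needed for other key ranges.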