Problem description
I was recently playing with some benchmarks and found some very interesting results that I can't explain right now. Here is the benchmark:
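import java.util.Arrays;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.Param;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Warmup;

@BenchmarkMode(Mode.Throughput)
@Fork(1)
@State(Scope.Thread)
@Warmup(iterations = 10, time = 1, batchSize = 1000)
@Measurement(iterations = 10, time = 1, batchSize = 1000)
public class ArrayCopy {

    @Param({"1", "5", "10", "100", "1000"})
    private int size;

    private int[] ar;

    // Fill the source array once per parameter value.
    @Setup
    public void setup() {
        ar = new int[size];
        for (int i = 0; i < size; i++) {
            ar[i] = i;
        }
    }

    // Allocate the target, then copy with System.arraycopy.
    @Benchmark
    public int[] SystemArrayCopy() {
        final int length = size;
        int[] result = new int[length];
        System.arraycopy(ar, 0, result, 0, length);
        return result;
    }

    // Allocate the target, then copy element by element.
    @Benchmark
    public int[] javaArrayCopy() {
        final int length = size;
        int[] result = new int[length];
        for (int i = 0; i < length; i++) {
            result[i] = ar[i];
        }
        return result;
    }

    // Let Arrays.copyOf allocate and copy.
    @Benchmark
    public int[] arraysCopyOf() {
        final int length = size;
        return Arrays.copyOf(ar, length);
    }
}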
Results:
Benchmark (size) Mode Cnt Score Error Units
ArrayCopy.SystemArrayCopy 1 thrpt 10 52533.503 ± 2938.553 ops/s
ArrayCopy.SystemArrayCopy 5 thrpt 10 52518.875 ± 4973.229 ops/s
ArrayCopy.SystemArrayCopy 10 thrpt 10 53527.400 ± 4291.669 ops/s
ArrayCopy.SystemArrayCopy 100 thrpt 10 18948.334 ± 929.156 ops/s
ArrayCopy.SystemArrayCopy 1000 thrpt 10 2782.739 ± 184.484 ops/s
ArrayCopy.arraysCopyOf 1 thrpt 10 111665.763 ± 8928.007 ops/s
ArrayCopy.arraysCopyOf 5 thrpt 10 97358.978 ± 5457.597 ops/s
ArrayCopy.arraysCopyOf 10 thrpt 10 93523.975 ± 9282.989 ops/s
ArrayCopy.arraysCopyOf 100 thrpt 10 19716.960 ± 728.051 ops/s
ArrayCopy.arraysCopyOf 1000 thrpt 10 1897.061 ± 242.788 ops/s
ArrayCopy.javaArrayCopy 1 thrpt 10 58053.872 ± 4955.749 ops/s
ArrayCopy.javaArrayCopy 5 thrpt 10 49708.647 ± 3579.826 ops/s
ArrayCopy.javaArrayCopy 10 thrpt 10 48111.857 ± 4603.024 ops/s
ArrayCopy.javaArrayCopy 100 thrpt 10 18768.866 ± 445.238 ops/s
ArrayCopy.javaArrayCopy 1000 thrpt 10 2462.207 ± 126.549 ops/s
So there are two strange things here:
- Arrays.copyOf is 2 times faster than System.arraycopy for small arrays (sizes 1, 5, 10). However, on a large array of size 1000, Arrays.copyOf becomes almost 2 times slower. I know that both methods are intrinsics, so I would expect the same performance. Where does this difference come from?
- Manual copy for a 1-element array is faster than System.arraycopy. It's not clear to me why. Does anybody know?
VM version: JDK 1.8.0_131, VM 25.131-b11
Recommended answer
Your SystemArrayCopy benchmark is not semantically equivalent to arraysCopyOf.
It will be if you replace
System.arraycopy(ar, 0, result, 0, length);
with
System.arraycopy(ar, 0, result, 0, Math.min(ar.length, length));
With this change the performance of both benchmarks will also become similar.
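The equivalence can be checked outside JMH. Below is a minimal sketch, not part of the original benchmark: the class name CopyOfEquivalence and the helper copyWithMin are hypothetical names chosen for illustration. The helper allocates the target and copies Math.min(src.length, length) elements, and the result is compared with Arrays.copyOf for lengths shorter than, equal to, and longer than the source.

```java
import java.util.Arrays;

final class CopyOfEquivalence {

    // Hypothetical helper: allocate the target, then copy at most src.length elements.
    static int[] copyWithMin(int[] src, int length) {
        int[] result = new int[length];
        System.arraycopy(src, 0, result, 0, Math.min(src.length, length));
        return result;
    }

    public static void main(String[] args) {
        int[] ar = {0, 1, 2, 3, 4};
        // Compare both approaches for truncating, exact, and padding lengths.
        for (int length : new int[] {0, 3, 5, 8}) {
            boolean same = Arrays.equals(copyWithMin(ar, length), Arrays.copyOf(ar, length));
            System.out.println("length=" + length + " same=" + same);
        }
    }
}
```

For each of these lengths the two arrays should compare equal, including the zero padding when the requested length exceeds the source length.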
Why is the first variant slower then?
- Without knowing how length relates to ar.length, the JVM needs to perform an additional bounds check and be prepared to throw IndexOutOfBoundsException when length > ar.length.
- This also breaks the optimization that eliminates redundant zeroing. Every allocated array must be initialized with zeros. However, the JIT can skip zeroing if it sees that the array is filled right after creation. But -prof perfasm clearly shows that the original SystemArrayCopy benchmark spends a significant amount of time clearing the allocated array:
0,84% 0x000000000365d35f: shr $0x3,%rcx
0,06% 0x000000000365d363: add $0xfffffffffffffffe,%rcx
0,69% 0x000000000365d367: xor %rax,%rax
0x000000000365d36a: shl $0x3,%rcx
21,02% 0x000000000365d36e: rep rex.W stos %al,%es:(%rdi) ;*newarray
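The semantic difference behind the extra bounds check can be observed directly. The following is a small sketch under stated assumptions: BoundsCheckDemo and arraycopyThrows are hypothetical names, and the example only illustrates the behavior of the two calls, not the JIT's internal reasoning. System.arraycopy rejects a copy length larger than the source array, while Arrays.copyOf accepts it and pads the result with zeros.

```java
import java.util.Arrays;

final class BoundsCheckDemo {

    // Hypothetical helper: reports whether System.arraycopy rejects the given copy length.
    static boolean arraycopyThrows(int[] src, int length) {
        int[] result = new int[length];
        try {
            System.arraycopy(src, 0, result, 0, length);
            return false;
        } catch (IndexOutOfBoundsException e) {
            // Thrown when length exceeds src.length.
            return true;
        }
    }

    public static void main(String[] args) {
        int[] ar = {1, 2, 3};
        System.out.println(arraycopyThrows(ar, 3));                 // false
        System.out.println(arraycopyThrows(ar, 5));                 // true
        System.out.println(Arrays.toString(Arrays.copyOf(ar, 5)));  // [1, 2, 3, 0, 0]
    }
}
```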
Manual copy appeared faster for small arrays because, unlike System.arraycopy, it does not perform any runtime calls to VM functions.