Problem description
I'm playing with jmh, and in the section about looping it says that measuring a method inside a loop gives a misleadingly low per-operation cost. I tried it myself:
@Benchmark
@OperationsPerInvocation(1)
public int measurewrong_1() {
    return reps(1);
}

@Benchmark
@OperationsPerInvocation(1000)
public int measurewrong_1000() {
    return reps(1000);
}

// the loop being measured; x and y are plain int fields of the benchmark class
private int reps(int reps) {
    int s = 0;
    for (int i = 0; i < reps; i++) {
        s += (x + y);
    }
    return s;
}
and got the following result:
Benchmark                       Mode  Cnt  Score   Error  Units
MyBenchmark.measurewrong_1      avgt   15  2.425 ± 0.137  ns/op
MyBenchmark.measurewrong_1000   avgt   15  0.036 ± 0.001  ns/op
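(Note that with @OperationsPerInvocation(1000), JMH divides the measured time by 1000, so 0.036 ns/op corresponds to roughly 36 ns for one reps(1000) call, versus about 2.4 ns for a reps(1) call.)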
It indeed shows that MyBenchmark.measurewrong_1000 is dramatically faster per operation than MyBenchmark.measurewrong_1. But I cannot really understand what optimization the JVM does to achieve this performance improvement. What do they mean by saying the loop is unrolled/pipelined?
Loop unrolling makes pipelining possible: a pipelined CPU (for example, a classic RISC design) can execute the unrolled instructions in parallel. So if your CPU is able to keep 5 operations in flight at once, the loop can be unrolled in the following way:
// pseudo code: the loop body is replicated 5 times per iteration
int pipelines = 5;
for (int i = 0; i < length; i += pipelines) {
    s += (x + y);
    s += (x + y);
    s += (x + y);
    s += (x + y);
    s += (x + y);
}
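To make the "execute in parallel" part more concrete, here is a further, purely illustrative rewrite using independent partial sums: with a single accumulator every addition waits for the previous one, whereas independent accumulators let a pipelined CPU keep several additions in flight at once. This is only a sketch of the idea, not what the JVM literally emits:

// pseudo code (illustrative only): independent accumulators break the
// dependency chain on s, so several additions can overlap in the pipeline
int s0 = 0, s1 = 0, s2 = 0, s3 = 0, s4 = 0;
for (int i = 0; i < length; i += 5) {
    s0 += (x + y);
    s1 += (x + y);
    s2 += (x + y);
    s3 += (x + y);
    s4 += (x + y);
}
int s = s0 + s1 + s2 + s3 + s4;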
Pipeline stages: IF = Instruction Fetch, ID = Instruction Decode, EX = Execute, MEM = Memory access, WB = Register write back
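As a rough sketch (assuming the textbook five-stage RISC pipeline, since the original diagram is not reproduced here), consecutive instructions overlap like this:

cycle     1    2    3    4    5    6    7
instr 1   IF   ID   EX   MEM  WB
instr 2        IF   ID   EX   MEM  WB
instr 3             IF   ID   EX   MEM  WB

Each instruction still passes through five stages, but a new instruction can start every cycle, so the unrolled additions stream through the pipeline back to back.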
From Oracle White paper:
More information about pipelining: Classic RISC pipeline