Problem description
I recently wrote a computation-intensive algorithm in Java, and then translated it to C++. To my surprise, the C++ executed considerably slower. I have now written a much shorter Java test program and a corresponding C++ program (see below). My original code featured a lot of array access, as does the test code. The C++ takes 5.5 times longer to execute (see the comment at the end of each program).
Conclusions after 21 comments below:
Test code:
- g++ -o ... : Java 5.5 times faster
- g++ -O3 -o ... : Java 2.9 times faster
- g++ -fprofile-generate -march=native -O3 -o ... (run, then g++ -fprofile-use etc): Java 1.07 times faster
My original project (much more complex than test code):
- Java 1.8 times faster
- C++ 1.9 times faster
- C++ 2 times faster
Software environment:
Ubuntu 16.04 (64 bit).
Netbeans 8.2 / jdk 8u121 (java code executed inside netbeans)
g++ (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
Compilation: g++ -o cpp_test cpp_test.cpp
Java code:
public class JavaTest {
    public static void main(String[] args) {
        final int ARRAY_LENGTH = 100;
        final int FINISH_TRIGGER = 100000000;
        int[] intArray = new int[ARRAY_LENGTH];
        for (int i = 0; i < ARRAY_LENGTH; i++) intArray[i] = 1;
        int i = 0;
        boolean finished = false;
        long loopCount = 0;
        System.out.println("Start");
        long startTime = System.nanoTime();
        while (!finished) {
            loopCount++;
            intArray[i]++;
            if (intArray[i] >= FINISH_TRIGGER) finished = true;
            else if (i < (ARRAY_LENGTH - 1)) i++;
            else i = 0;
        }
        System.out.println("Finish: " + loopCount + " loops; " +
            ((System.nanoTime() - startTime)/1e9) + " secs");
        // 5 executions in range 5.98 - 6.17 secs (each 9999999801 loops)
    }
}
C++ code:
//cpp_test.cpp:
#include <iostream>
#include <sys/time.h>
int main() {
    const int ARRAY_LENGTH = 100;
    const int FINISH_TRIGGER = 100000000;
    int *intArray = new int[ARRAY_LENGTH];
    for (int i = 0; i < ARRAY_LENGTH; i++) intArray[i] = 1;
    int i = 0;
    bool finished = false;
    long long loopCount = 0;
    std::cout << "Start\n";
    timespec ts;
    clock_gettime(CLOCK_REALTIME, &ts);
    long long startTime = (1000000000*ts.tv_sec) + ts.tv_nsec;
    while (!finished) {
        loopCount++;
        intArray[i]++;
        if (intArray[i] >= FINISH_TRIGGER) finished = true;
        else if (i < (ARRAY_LENGTH - 1)) i++;
        else i = 0;
    }
    clock_gettime(CLOCK_REALTIME, &ts);
    double elapsedTime =
        ((1000000000*ts.tv_sec) + ts.tv_nsec - startTime)/1e9;
    std::cout << "Finish: " << loopCount << " loops; ";
    std::cout << elapsedTime << " secs\n";
    // 5 executions in range 33.07 - 33.45 secs (each 9999999801 loops)
}
Recommended answer
The only time I could get the C++ program to outperform Java was when using profiling information. This suggests that something in the runtime information (which Java gets by default) allows for faster execution.
There's not much going on in your program apart from a non-trivial if statement. That is, without analysing the entire program, it's hard to predict which branch is most likely to be taken. This leads me to believe that this is a branch misprediction issue. Modern CPUs use instruction pipelining, which allows for higher CPU throughput. However, this requires a prediction of which instructions will execute next. If the guess is wrong, the instruction pipeline must be flushed and the correct instructions loaded (which takes time).
At compile time, the compiler doesn't have enough information to predict which branch is most likely. CPUs do a bit of branch prediction as well, but this is generally along the lines of "loops will loop again" and "an if will take its if branch (rather than its else)".
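One way to hand the compiler the information it lacks, without a profiling run, is a manual branch hint. The following is a minimal sketch, assuming GCC or Clang (the `__builtin_expect` builtin is specific to those compilers). The function name, its parameters, the `LIKELY`/`UNLIKELY` macros, and the use of `std::vector` instead of a raw array are illustrative choices, not part of the original test program. Whether the hints change the generated code or the timings reported above has not been measured here.

```cpp
#include <cstdint>
#include <vector>

// GCC/Clang-specific hint macros; other compilers need a different spelling.
#define LIKELY(x)   __builtin_expect(!!(x), 1)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)

// Same loop as the test program, with the array length and finish trigger
// passed as (hypothetical) parameters so it can be tried on small inputs.
// Assumes array_length >= 1 and finish_trigger >= 2.
std::int64_t run_hinted_loop(int array_length, int finish_trigger) {
    std::vector<int> values(array_length, 1);
    int i = 0;
    bool finished = false;
    std::int64_t loop_count = 0;
    while (!finished) {
        loop_count++;
        values[i]++;
        if (UNLIKELY(values[i] >= finish_trigger)) {
            finished = true;   // taken exactly once, on the last iteration
        } else if (LIKELY(i < array_length - 1)) {
            i++;               // the common case: advance to the next element
        } else {
            i = 0;             // taken once per pass over the array
        }
    }
    return loop_count;
}
```

The hints only state which way each condition usually goes; the loop computes the same result as the unhinted version, so it can be swapped into the benchmark to check whether it makes any difference on a given compiler and CPU.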
Java, however, has the advantage of being able to use information at runtime as well as at compile time. This allows Java to identify the middle branch as the one that occurs most frequently, and so to have this branch predicted for the pipeline.
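To make the claim about the middle branch concrete, the loop can be instrumented to count how often each of its three branches is taken. This is only a sketch of the kind of information a runtime profile could contain; it is not a description of how the JVM actually collects or uses it. The `BranchCounts` struct and `count_branches` function are hypothetical names introduced for illustration.

```cpp
#include <cstdint>
#include <vector>

// Number of times each of the three branches in the test loop is taken.
struct BranchCounts {
    std::int64_t finish = 0;   // if (value >= trigger)
    std::int64_t advance = 0;  // else if (i < length - 1), the "middle" branch
    std::int64_t wrap = 0;     // else
};

// Runs the same loop as the test program and tallies the branch outcomes.
// Assumes array_length >= 1 and finish_trigger >= 2.
BranchCounts count_branches(int array_length, int finish_trigger) {
    std::vector<int> values(array_length, 1);
    BranchCounts counts;
    int i = 0;
    bool finished = false;
    while (!finished) {
        values[i]++;
        if (values[i] >= finish_trigger) {
            finished = true;
            counts.finish++;
        } else if (i < array_length - 1) {
            i++;
            counts.advance++;
        } else {
            i = 0;
            counts.wrap++;
        }
    }
    return counts;
}
```

With an array length of 100 and a trigger of 1000, the counts are 1 for the finish branch, 98802 for the middle branch, and 998 for the wrap branch, so the middle branch accounts for about 99% of all iterations. The same ratio holds for the full-size run in the question.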