C++11 tuple performance

Question

I am about to make my code more general by using std::tuple in many places, including for single elements: for example, tuple<double> instead of double. Before doing so, I decided to check the performance of this particular case.

Here is a simple performance benchmark:

#include <tuple>
#include <iostream>

using std::cout;
using std::endl;
using std::get;
using std::tuple;

int main(void)
{

#ifdef TUPLE
    using double_t = std::tuple<double>;
#else
    using double_t = double;
#endif

    constexpr int count = 1e9;
    auto array = new double_t[count];

    long long sum = 0;
    for (int idx = 0; idx < count; ++idx) {
#ifdef TUPLE
        sum += get<0>(array[idx]);
#else
        sum += array[idx];
#endif
    }
    delete[] array;
    cout << sum << endl; // just "external" side effect for variable sum.
}

Running it gives these results:

$ g++ -DTUPLE -O2 -std=c++11 test.cpp && time ./a.out
0

real    0m3.347s
user    0m2.839s
sys     0m0.485s

$ g++  -O2 -std=c++11 test.cpp && time ./a.out
0

real    0m2.963s
user    0m2.424s
sys     0m0.519s

I thought that tuple is a purely compile-time template and that every get<> call is just an ordinary variable access in this case. The amount of memory allocated is also the same in both builds of this test. Why is there a difference in execution time?

Edit: The problem was in the initialization of the tuple<> object. To make the test more accurate, one line must be changed:

     constexpr int count = 1e9;
-    auto array = new double_t[count];
+    auto array = new double_t[count]();

     long long sum = 0;

After that, one can observe similar results:

$ g++ -DTUPLE -g -O2 -std=c++11 test.cpp && (for i in $(seq 3); do time ./a.out; done) 2>&1 | grep real
real    0m3.342s
real    0m3.339s
real    0m3.343s

$ g++ -g -O2 -std=c++11 test.cpp && (for i in $(seq 3); do time ./a.out; done) 2>&1 | grep real
real    0m3.349s
real    0m3.339s
real    0m3.334s


Recommended answer

A default-constructed tuple value-initializes all of its elements (so everything is 0), whereas the plain doubles allocated with new double_t[count] are default-initialized, which leaves them uninitialized.

In the generated assembly, the following initialization loop is present only when using tuples. Otherwise the two versions are equivalent.

.L2:
    movq    $0, (%rdx)
    addq    $8, %rdx
    cmpq    %rcx, %rdx
    jne .L2

