Question
I am porting a Matlab algorithm with lots of coefficient-wise array operations to C++. The operations look like this example, but are often much more complex:
Eigen::Array<double, Eigen::Dynamic, 1> tx2(6);  // size must match the 6 values below
tx2 << 1, 2, 3, 4, 5, 6;
Eigen::Array<double, Eigen::Dynamic, 1> tx1(6);
tx1 << 7, 8, 9, 10, 11, 12;
Eigen::Array<double, Eigen::Dynamic, 1> x = (tx1 + tx2) / 2;  // coefficient-wise mean
The C++ code turned out to be noticeably slower than Matlab (around 20%). As a next step I tried turning on the Intel MKL backend of Eigen, which made literally no difference to performance. Is it possible that MKL does not improve coefficient-wise vector operations? Is there a way to test whether I linked MKL successfully? Are there faster alternatives to the Eigen vector classes? Thanks in advance!
I'm using VS 2013 on an i7-3820 running Win7 64-bit. A longer example would be:
Array<double, Dynamic, 1> ts = (k2 / (6 * b.pow(3)) + k / b - b / 2) - (k2 / (6 * a.pow(3)) + k / a - a / 2);
Array<double, Dynamic, 1> tp1 = -2 * r2*(b - a)/ (rp.pow(2));
Array<double, Dynamic, 1> tp2 = -2 * r2*rp*log(b / a) / rm2;
Array<double, Dynamic, 1> tp3 = r2*(b.pow(-1) - a.pow (-1)) / 2;
Array<double, Dynamic, 1> tp4 = 16 * r2.pow(2)*(r2.pow(2) + 1)*log((2 * rp*b - rm2) / (2 * rp*a - rm2)) / (rp.pow(3)*rm2);
Array<double, Dynamic, 1> tp5 = 16 * r2.pow(3)*((2 * rp*b - rm2).pow(-1) - (2 * rp*a - rm2).pow(-1)) / rp.pow(3);
Array<double, Dynamic, 1> tp = tp1 + tp2 + tp3 + tp4 + tp5;
Array<double, Dynamic, 1> f = (ts + tp) / (2 * ds*ds);
Relevant part of CMakeLists:
set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${OpenMP_C_FLAGS}")
set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${OpenMP_CXX_FLAGS}")
target_link_libraries(MK ${VTK_LIBRARIES} ${Boost_LIBRARIES} mkl_intel_lp64_dll.lib mkl_intel_thread_dll.lib mkl_core_dll.lib libiomp5md.lib)
So far I have only defined EIGEN_USE_MKL_ALL.
Recommended answer
Replace calls to pow(2), pow(3), and the like with square() and cube(). The same goes for pow(-1), which is better replaced by a division. I expect Matlab is able to do all these kinds of optimizations for you, but in C++ such compile-time optimizations would only be possible by working at the compiler level.