Problem description
I've been testing various open-source libraries for solving linear systems of equations in C++. So far the fastest I've found is Armadillo used together with the OpenBLAS package. Solving a dense N×N linear system with N = 5000 takes about 8.3 seconds on my system, which is very fast (without OpenBLAS installed, it takes about 30 seconds).
One reason for this speedup is that Armadillo with OpenBLAS seems to use multiple threads. It runs on two of my cores, whereas Armadillo without OpenBLAS uses only one. I have an i7 processor, so I want to increase the number of cores used and test further. I'm on Ubuntu, so according to the OpenBLAS documentation I can run this in the terminal:
export OPENBLAS_NUM_THREADS=4
However, running the code again doesn't seem to increase either the number of cores being used or the speed. Am I doing something wrong, or is 2 the maximum when using Armadillo's "solve(A,b)" command? I wasn't able to find Armadillo's source code anywhere to take a look.
Incidentally, does anybody know which method Armadillo/OpenBLAS uses for solving Ax = b (standard LU decomposition with parallelism, or something else)? Thanks!
Edit: The number of cores being stuck at 2 actually seems to be a bug that occurs when OpenBLAS is installed with the Synaptic package manager; see here. Reinstalling from source allows it to detect how many cores I actually have (8). Now I can use export OPENBLAS_NUM_THREADS=4 and so on to control it.
Recommended answer
Armadillo doesn't prevent OpenBLAS from using more cores. It's possible that the current implementation of OpenBLAS simply chooses 2 cores for certain operations.
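To rule out a mistake on the environment side, it can help to check what the process itself sees. The following is a minimal sketch using only the C++ standard library; the helper names (parse_thread_count, expected_threads, expected_threads_from_environment) are made up for illustration, and the rule "a library cannot use more threads than the cores it detects" is an assumption, not documented OpenBLAS behaviour.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <thread>

// Parse a thread-count string such as the value of OPENBLAS_NUM_THREADS.
// Returns 0 when the value is missing, malformed, or not a sensible count.
int parse_thread_count(const char* value) {
    if (value == nullptr || *value == '\0') return 0;
    char* end = nullptr;
    const long n = std::strtol(value, &end, 10);
    if (*end != '\0' || n <= 0 || n > 4096) return 0;
    return static_cast<int>(n);
}

// Hypothetical rule of thumb: the thread count one would expect to be usable,
// ASSUMING the library cannot exceed the number of cores it detects.
int expected_threads(int requested, unsigned detected) {
    assert(requested >= 0);
    if (detected == 0) return requested;                     // detection unavailable
    if (requested == 0) return static_cast<int>(detected);   // no explicit request
    return std::min(requested, static_cast<int>(detected));
}

// Combine the environment variable with what the standard library reports.
// hardware_concurrency() counts hardware threads and may return 0 if unknown.
int expected_threads_from_environment() {
    const int requested = parse_thread_count(std::getenv("OPENBLAS_NUM_THREADS"));
    return expected_threads(requested, std::thread::hardware_concurrency());
}
```

If expected_threads_from_environment() returns 4 inside the program but only two cores are busy, the variable is reaching the process and the limit is more likely inside the installed library, which matches the packaging issue described in the question's edit.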
You can read Armadillo's source code directly in the downloadable package (it is open source), in the folder "include". Specifically, have a look at the file "include/armadillo_bits/fn_solve.hpp", which contains the user-accessible solve() function, and the file "include/armadillo_bits/auxlib_meat.hpp", which contains the wrapper and housekeeping code for calling the torturous BLAS and LAPACK functions.
If you already have Armadillo installed on your machine, have a look at "/usr/include/armadillo_bits" or "/usr/local/include/armadillo_bits".
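As for the method itself, the files above are the place to confirm what solve() actually calls. LAPACK's standard driver for general dense square systems (gesv) is based on LU factorization with partial pivoting; whether solve() dispatches to it for a given matrix is an assumption here and worth checking in auxlib_meat.hpp. To show the idea only, here is a minimal, unoptimized sketch of that algorithm in plain C++. The function name lu_solve and the row-major storage are choices made for this example, and the code is not how LAPACK or OpenBLAS implement it (they use blocked, multithreaded kernels).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Solve A x = b for a dense n-by-n matrix A stored row-major.
// A and b are taken by value because the factorization overwrites them.
std::vector<double> lu_solve(std::vector<double> A, std::vector<double> b) {
    const std::size_t n = b.size();
    assert(A.size() == n * n);

    // Factor P*A = L*U in place: U ends up on and above the diagonal,
    // the multipliers of the unit-lower-triangular L below it.
    for (std::size_t k = 0; k < n; ++k) {
        // Partial pivoting: choose the row with the largest entry in column k.
        std::size_t p = k;
        for (std::size_t i = k + 1; i < n; ++i)
            if (std::fabs(A[i * n + k]) > std::fabs(A[p * n + k])) p = i;
        if (A[p * n + k] == 0.0) throw std::runtime_error("matrix is singular");
        if (p != k) {
            for (std::size_t j = 0; j < n; ++j) std::swap(A[k * n + j], A[p * n + j]);
            std::swap(b[k], b[p]);  // apply the same row permutation to b
        }
        for (std::size_t i = k + 1; i < n; ++i) {
            A[i * n + k] /= A[k * n + k];  // multiplier l_ik
            for (std::size_t j = k + 1; j < n; ++j)
                A[i * n + j] -= A[i * n + k] * A[k * n + j];
        }
    }

    // Forward substitution: solve L y = P b (L has a unit diagonal).
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < i; ++j) b[i] -= A[i * n + j] * b[j];

    // Back substitution: solve U x = y.
    std::vector<double> x(n);
    for (std::size_t i = n; i-- > 0;) {
        double s = b[i];
        for (std::size_t j = i + 1; j < n; ++j) s -= A[i * n + j] * x[j];
        x[i] = s / A[i * n + i];
    }
    return x;
}
```

The factorization step costs about (2/3)N^3 floating-point operations, and most of that work sits in the innermost update loop, which is the part an optimized BLAS can spread across threads.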