Getting GNU Octave to work with a multi-core processor (multithreading)

Problem description

I want to be able to program multiple threads with GNU Octave so that it will utilize multiple processors.

I installed GNU Octave on Fedora 17 Linux with the following command:

yum install octave

This installed the latest version of Octave, 3.6.2, on my computer. It works great, but when you multiply two huge matrices together it bogs down the one CPU that Octave uses. It would be nice if the matrix multiplication utilized all of the cores, since in this case the CPU is obviously the bottleneck.

Can Octave fully utilize multi-core processors and run on multiple threads? Is there a library or compile-time flag for this?

Recommended answer

Solution

Octave itself is a single-threaded application that runs on one core. You can, however, get Octave to use libraries such as ATLAS that utilize multiple cores. So while Octave itself only uses one core, when it encounters a heavy operation it calls functions in ATLAS that use many CPUs.

I was able to do this. First compile ATLAS from source and make it available to your system so that Octave can find it and use its library functions. ATLAS tunes itself to your system and number of cores. When you build Octave from source and specify ATLAS, Octave uses it, so when Octave does a heavy operation such as a huge matrix multiplication, ATLAS decides how many CPUs to use.

I was unable to get this to work on Fedora, but I could get it to work on Gentoo.

I used these two links:

ftp://ftp.gnu.org/gnu/octave/

http://math-atlas.sourceforge.net/

I ran the following Octave code before and after installing ATLAS:

tic
bigMatrixA = rand(3000000,80);
bigMatrixB = rand(80,30);
bigMatrixC = bigMatrixA * bigMatrixB;
toc
disp("done");

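Using multiple processors, the matrix multiplication runs much faster, roughly 6x faster than with a single core according to the timings below: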

Without Atlas: Elapsed time is 3.22819 seconds.
With Atlas:    Elapsed time is 0.529 seconds.
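
Note that the timing above also includes the cost of generating the random matrices, because tic is called before rand. As a minimal sketch (not the code used for the numbers above), the multiplication can be timed on its own. The matrix sizes here are assumed smaller values chosen so the sketch runs quickly, and the variable names are illustrative only.

```octave
% Assumed sizes: smaller than the matrices in the post so the sketch runs quickly.
m = 300000; k = 80; n = 30;

% Build the inputs before starting the timer so only the multiply is measured.
A = rand(m, k);
B = rand(k, n);

timer_id = tic;
C = A * B;                 % dispatched to whichever BLAS Octave is linked against
elapsed = toc(timer_id);

flops = 2 * m * k * n;     % one multiply and one add per term of each dot product
printf("cores reported by nproc: %d\n", nproc());
printf("multiply took %.4f s (%.2f GFLOP/s)\n", elapsed, flops / elapsed / 1e9);
```

Running the same script before and after switching BLAS libraries gives a like-for-like comparison of the multiplication alone.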

The three libraries I am using that speed things up are blas-atlas, cblas-atlas, and lapack-atlas.

If Octave can use these instead of the default BLAS and LAPACK libraries, then it will utilize multiple cores.

Getting Octave to compile from source with ATLAS is not easy and takes some programming skill.

Drawbacks to using ATLAS:

ATLAS incurs a lot of overhead when it splits your Octave program's work across multiple threads. Sure, it goes much faster if all you are doing is huge matrix multiplications, but most commands can't be multi-threaded by ATLAS. If extracting every bit of processing power and speed out of your cores is the top priority, you'll have much better luck writing your program so that it runs in parallel with itself. (Split your program into 8 equivalent programs that each work on 1/8th of the problem, run them all simultaneously, and reassemble the results when all are done.)
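
The split-and-reassemble idea can be sketched as follows. This is an illustration under stated assumptions, not the answerer's code: the matrix sizes and all variable names are made up, and the chunks are processed one after another with cellfun so that the sketch runs in plain Octave. Actually running the chunks simultaneously needs something extra, for example separate Octave processes or the parcellfun function from the Octave Forge parallel package (shown only as a comment, assuming that package is installed).

```octave
% Assumed sizes for illustration; the post used much larger matrices.
A = rand(8000, 80);
B = rand(80, 30);

num_chunks = 8;                       % "8 equivalent programs" from the text
n = rows(A);
edges = round(linspace(0, n, num_chunks + 1));   % row boundaries of each chunk

% Split A into row blocks; each block is an independent 1/8th of the problem.
chunks = cell(1, num_chunks);
for c = 1:num_chunks
  chunks{c} = A(edges(c) + 1 : edges(c + 1), :);
end

% Process each block independently (sequentially here).
partial = cellfun(@(blk) blk * B, chunks, "UniformOutput", false);

% Hypothetical parallel variant, assuming the Octave Forge "parallel" package:
%   pkg load parallel
%   partial = parcellfun(nproc(), @(blk) blk * B, chunks, "UniformOutput", false);

% Reassemble the partial results in their original row order.
C = vertcat(partial{:});
```

Splitting by rows works here because each row of the product depends only on the matching row of A, so the pieces never need to communicate.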

ATLAS helps a single-threaded Octave program behave a little more like a multi-threaded application, but it is no silver bullet. ATLAS won't make your single-threaded Octave program max out your 2-, 4-, 6-, or 8-core processor. You'll notice a performance boost, but the boost will leave you searching for a better way to use all the processors. The answer is writing your program to run in parallel with itself, and this takes a lot of programming skill.

Suggestions

Put your energy into vectorizing your heaviest operations and distributing the work over n simultaneously running threads. If you are waiting too long for a process to run, the lowest-hanging fruit for speeding it up is most likely a more efficient algorithm or data structure.
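
As a small illustration of what vectorizing means here, the sketch below computes the same sum of squares twice: once with an explicit loop, which the interpreter executes one element at a time, and once as a single array expression. The data and variable names are made up for the example.

```octave
x = rand(1, 100000);   % made-up data for illustration

% Loop version: every iteration goes through the interpreter.
total_loop = 0;
for i = 1:numel(x)
  total_loop = total_loop + x(i)^2;
end

% Vectorized version: one expression over the whole array.
total_vec = sum(x .^ 2);

printf("loop: %.6f, vectorized: %.6f\n", total_loop, total_vec);
```

Expressing the work as whole-array or matrix operations is also what lets a tuned BLAS such as ATLAS take over the heavy lifting.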
