Problem description
Please look at this code.
Single-threaded program: [...]. Compiled with: [...]
Multithreaded with OpenMP: http://pastebin.com/fbe4gZSn. Compiled with: [...]
I tested it on a dual-core system (so two threads run in parallel). But the multithreaded version is slower than the single-threaded one, and its timing is unstable (try running it a few times). What's wrong? Where did I make a mistake?
Some tests:
Single-thread:
Layers  Neurons  Inputs  Time (ns)
10      200      200       1898983
10      500      500      11009094
10      1000     1000     48116913
Multi-thread:
Layers  Neurons  Inputs  Time (ns)
10      200      200       2518262
10      500      500      13861504
10      1000     1000     53446849
I don't understand what is wrong.
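Because the timings above vary from run to run, one way to compare the two builds more fairly is to repeat the measurement and keep the fastest run. The following is only a minimal sketch: the helper name best_time_ns and its parameters are hypothetical and do not come from the original programs, which may time themselves differently.

```cpp
#include <cassert>
#include <chrono>
#include <limits>

// Hypothetical helper (not part of the original programs):
// runs `work` `repeats` times and returns the fastest wall-clock
// time in nanoseconds, which damps scheduling noise.
template <class Work>
long long best_time_ns(Work&& work, int repeats)
{
    assert(repeats > 0);
    using clock = std::chrono::steady_clock;  // monotonic, suited to intervals
    long long best = std::numeric_limits<long long>::max();
    for (int r = 0; r < repeats; ++r) {
        const auto start = clock::now();
        work();
        const auto stop = clock::now();
        const long long ns =
            std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
        if (ns < best) best = ns;
    }
    return best;
}
```

It could be called as best_time_ns([&] { /* run the network once */ }, 20) for each build, comparing the two minima rather than single runs.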
Solution
Is your goal here to study OpenMP, or to make your program faster? If the latter, it would be more worthwhile to write multiply-add code, reduce the number of passes, and incorporate SIMD.
Step 1: Combine loops and use multiply-add:
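// remove the variable 'temp' completely
// (std::swap needs <utility>)
for (int i = 0; i < LAYERS; i++)
{
    for (int j = 0; j < NEURONS; j++)
    {
        outputs[j] = 0;
        // Note: k always equals l here, so every neuron j reads the same
        // weights[i][0..INPUTS-1]. If weights[i] holds a separate row per
        // neuron, k must keep advancing across j instead of resetting.
        for (int k = 0, l = 0; l < INPUTS; l++, k++)
        {
            // accumulate the product directly (multiply-add)
            outputs[j] += inputs[l] * weights[i][k];
        }
        outputs[j] = sigmoid(outputs[j]);
    }
    // this layer's outputs become the next layer's inputs
    std::swap(inputs, outputs);
}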
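As a self-contained illustration of the fused loop above, here is a hedged sketch that also shows one place an OpenMP directive could go. Several things here are assumptions, since the original source is only on pastebin: the function name run_layers, the logistic form of sigmoid, and a flat row-major layout in which weights[i] stores INPUTS consecutive weights for each neuron. Whether the parallel version is actually faster depends on the problem size and machine; this sketch makes no claim about that.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Assumed activation; the original sigmoid() is not shown in the post.
inline double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

// Hypothetical helper. Assumes weights[i] is a flat, row-major array with
// one row of inputs.size() weights per neuron, and neurons == inputs.size()
// so that each layer's outputs can feed the next layer.
std::vector<double> run_layers(const std::vector<std::vector<double>>& weights,
                               std::vector<double> inputs, std::size_t neurons)
{
    const std::size_t n_inputs = inputs.size();
    assert(neurons == n_inputs);
    std::vector<double> outputs(neurons);
    for (const auto& layer : weights) {
        assert(layer.size() == neurons * n_inputs);
        // Each iteration writes only outputs[j], so neurons are independent.
        // Without -fopenmp the pragma is ignored and the loop runs serially.
        #pragma omp parallel for
        for (long j = 0; j < static_cast<long>(neurons); ++j) {
            const double* row = &layer[static_cast<std::size_t>(j) * n_inputs];
            double sum = 0.0;  // local accumulator, no 'temp' array
            for (std::size_t l = 0; l < n_inputs; ++l)
                sum += inputs[l] * row[l];
            outputs[j] = sigmoid(sum);
        }
        std::swap(inputs, outputs);  // outputs become the next layer's inputs
    }
    return inputs;  // after the final swap this holds the last layer's outputs
}
```

Accumulating into a local sum keeps each thread's work in a register and leaves a single write per neuron, which is the point of combining the loops.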