本文介绍了为什么这个OpenMP程序比单线程慢?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

请看这段代码。



单线程程序:。编译:

具有openMP的多线程:
编译为:

Multithread with openMP: http://pastebin.com/fbe4gZSnCompiled with:

I tested it on a dual core system (so we have two threads running in parallel). But multi-threaded version is slower than the single-threaded one (and shows unstable time, try to run it few times). What's wrong? Where did I make mistake?

Some tests:

Single-thread:

Layers Neurons Inputs --- Time (ns)

10 200 200 --- 1898983

10 500 500 --- 11009094

10 1000 1000 --- 48116913

Multi-thread:

Layers Neurons Inputs --- Time (ns)

10 200 200 --- 2518262

10 500 500 --- 13861504

10 1000 1000 --- 53446849

I don't understand what is wrong.

解决方案

Is your goal here to study OpenMP, or to make your program faster? If the latter, it would be more worthwhile to write multiply-add code, reduce the number of passes, and incorporate SIMD.

Step 1: Combine loops and use multiply-add:

// remove the variable 'temp' completely
for(int i=0;i<LAYERS;i++)
{
  for(int j=0;j<NEURONS;j++)
  {
    outputs[j] = 0;

    for(int k=0,l=0;l<INPUTS;l++,k++)
    {
      outputs[j] += inputs[l] * weights[i][k];
    }

    outputs[j] = sigmoid(outputs[j]);
  }

  std::swap(inputs, outputs);
}

这篇关于为什么这个OpenMP程序比单线程慢?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!

09-03 06:14