OpenMP vs. C++11 threads

Question

In the following example, the C++11 threads take about 50 seconds to execute, but the OMP threads take only 5 seconds. Any ideas why? (I can assure you it still holds true if you are doing real work instead of doNothing, or if you do it in a different order, etc.) I'm on a 16-core machine, too.

#include <iostream>
#include <omp.h>
#include <chrono>
#include <vector>
#include <thread>

using namespace std;

void doNothing() {}

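// Runs the selected benchmark and returns the elapsed wall-clock time in seconds.
// algorithmToRun == 1: std::thread; algorithmToRun == 2: OpenMP.
double run(int algorithmToRun)
{
    // steady_clock is monotonic, which makes it the safer choice for measuring intervals.
    auto startTime = std::chrono::steady_clock::now();

    for(int j=1; j<100000; ++j)
    {
        if(algorithmToRun == 1)
        {
            // Creates and joins 16 new threads on every outer iteration.
            vector<thread> threads;
            for(int i=0; i<16; i++)
            {
                threads.emplace_back(doNothing);
            }
            for(auto& t : threads) t.join();
        }
        else if(algorithmToRun == 2)
        {
            // Requires OpenMP to be enabled at compile time (e.g. -fopenmp with GCC).
            #pragma omp parallel for num_threads(16)
            for(unsigned i=0; i<16; i++)
            {
                doNothing();
            }
        }
    }

    auto endTime = std::chrono::steady_clock::now();
    std::chrono::duration<double> elapsed_seconds = endTime - startTime;

    // Returned as double so fractional seconds are not truncated.
    return elapsed_seconds.count();
}

int main()
{
    double cppt = run(1);
    double ompt = run(2);

    cout<<cppt<<endl;
    cout<<ompt<<endl;

    return 0;
}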

Recommended answer

OpenMP runtimes typically keep a pool of worker threads and reuse it for every parallel region. Spinning up and tearing down threads is expensive. OpenMP avoids this overhead, so all it's doing is the actual work and the minimal shared-memory shuttling of the execution state. In your std::thread code you are spinning up and tearing down a new set of 16 threads on every iteration of the outer loop, which means roughly 1.6 million thread creations and joins over the whole run.
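
The pooling idea can be sketched with plain C++11 primitives. This is only a minimal illustration under stated assumptions, not how any particular OpenMP runtime is implemented: SimplePool and runBatchesOnPool are hypothetical names introduced here, and the queue-plus-condition-variable design is just one simple way to reuse threads. The point it shows is that the 16 workers are created once and then handed each batch of tasks, instead of being created and joined per batch.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical minimal pool: workers are created once and reused for every batch.
class SimplePool {
public:
    explicit SimplePool(std::size_t workerCount) {
        for (std::size_t i = 0; i < workerCount; ++i)
            workers_.emplace_back([this] { workerLoop(); });
    }

    ~SimplePool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        workAvailable_.notify_all();
        for (auto& w : workers_) w.join();
    }

    SimplePool(const SimplePool&) = delete;
    SimplePool& operator=(const SimplePool&) = delete;

    // Queue one task; some idle worker will pick it up.
    void submit(std::function<void()> task) {
        assert(static_cast<bool>(task));  // an empty std::function cannot be run
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push(std::move(task));
            ++pending_;
        }
        workAvailable_.notify_one();
    }

    // Block until every submitted task has finished (plays the role of join()).
    void wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        allDone_.wait(lock, [this] { return pending_ == 0; });
    }

private:
    void workerLoop() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                workAvailable_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // stopping_ is set and the queue is drained
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();  // run outside the lock so other workers can proceed
            {
                std::lock_guard<std::mutex> lock(mutex_);
                if (--pending_ == 0) allDone_.notify_all();
            }
        }
    }

    std::vector<std::thread> workers_;
    std::queue<std::function<void()>> tasks_;
    std::mutex mutex_;
    std::condition_variable workAvailable_;
    std::condition_variable allDone_;
    std::size_t pending_ = 0;
    bool stopping_ = false;
};

// Runs `batches` rounds of `tasksPerBatch` tasks on a pool that is created once,
// and returns how many tasks actually executed.
int runBatchesOnPool(int batches, int tasksPerBatch) {
    std::atomic<int> executed{0};
    SimplePool pool(16);  // 16 workers, matching the thread count in the question
    for (int j = 0; j < batches; ++j) {
        for (int i = 0; i < tasksPerBatch; ++i)
            pool.submit([&executed] { ++executed; });
        pool.wait();  // barrier at the end of each batch
    }
    return executed.load();
}
```

Calling runBatchesOnPool(100000, 16) mirrors the loop in the question while creating only 16 threads in total. A pool like this still pays for locking and wake-ups on every task, so it should not be expected to match a tuned OpenMP runtime; it only removes the per-iteration thread creation and teardown.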
