How to set up FFTW with threads and different precisions?

Question

I need to use FFTW with different arithmetic precisions and multithreaded plans.

Do I need to set up multithreading for every precision, like this:

fftwf_init_threads();
fftwf_plan_with_nthreads(nthreads);
fftw_init_threads();
fftw_plan_with_nthreads(nthreads);
fftwl_init_threads();
fftwl_plan_with_nthreads(nthreads);

Or can I write just this in my start routine:

fftw_init_threads();
fftw_plan_with_nthreads(nthreads);

Is that enough?

Recommended answer

The FFTW libraries for different precisions are completely independent of one another. Hence, you need to set up multithreading for every precision you use by calling the corresponding function for each one (the example below uses the fftwq_ quad-precision prefix; the fftwl_ long-double prefix from the question works the same way):

int nbthreads=2;
fftw_init_threads();
fftw_plan_with_nthreads(nbthreads);
fftwf_init_threads();
fftwf_plan_with_nthreads(nbthreads);
fftwq_init_threads();
fftwq_plan_with_nthreads(nbthreads);

...

fftw_cleanup_threads();
fftwf_cleanup_threads();
fftwq_cleanup_threads();

Here is some sample code to convince you. Once the corresponding libraries are built, it can be compiled on Unix with:

gcc main.c -o main -lfftw3_threads -lfftw3 -lfftw3f_threads -lfftw3f -lfftw3q_threads -lfftw3q -lm -lpthread -Wall

#include <stdlib.h>
#include <stdio.h>
#include <time.h>

#include <fftw3.h>

int main ( ){
    int n = 40000000;
    int nf = n*2;
    int nq =n/32;

    fftw_complex *in;
    fftw_complex *in2;
    fftw_complex *out;
    fftw_plan plan_backward;
    fftw_plan plan_forward;

    fftwf_complex *inf;
    fftwf_complex *in2f;
    fftwf_complex *outf;
    fftwf_plan plan_backwardf;
    fftwf_plan plan_forwardf;

    fftwq_complex *inq;
    fftwq_complex *in2q;
    fftwq_complex *outq;
    fftwq_plan plan_backwardq;
    fftwq_plan plan_forwardq;

    //thread parameters
    int nbthreads=2;
    fftw_init_threads();
    fftw_plan_with_nthreads(nbthreads);
    fftwf_init_threads();
    fftwf_plan_with_nthreads(nbthreads);
    fftwq_init_threads();
    fftwq_plan_with_nthreads(nbthreads);

    in = fftw_malloc ( sizeof ( fftw_complex ) * n );
    out = fftw_malloc ( sizeof ( fftw_complex ) * n );

    inf = fftwf_malloc ( sizeof ( fftwf_complex ) * nf );
    outf = fftwf_malloc ( sizeof ( fftwf_complex ) * nf );

    inq = fftwq_malloc ( sizeof ( fftwq_complex ) * nq );
    outq = fftwq_malloc ( sizeof ( fftwq_complex ) * nq);

    // forward fft
    plan_forward = fftw_plan_dft_1d ( n, in, out, FFTW_FORWARD, FFTW_ESTIMATE );
    fftw_execute ( plan_forward );

    plan_forwardf = fftwf_plan_dft_1d ( nf, inf, outf, FFTW_FORWARD, FFTW_ESTIMATE );
    fftwf_execute ( plan_forwardf );

    plan_forwardq = fftwq_plan_dft_1d ( nq, inq, outq, FFTW_FORWARD, FFTW_ESTIMATE );
    fftwq_execute ( plan_forwardq );

    // backward fft
    in2 = fftw_malloc ( sizeof ( fftw_complex ) * n );
    plan_backward = fftw_plan_dft_1d ( n, out, in2, FFTW_BACKWARD, FFTW_ESTIMATE );
    fftw_execute ( plan_backward );

    in2f = fftwf_malloc ( sizeof ( fftwf_complex ) * nf );
    plan_backwardf = fftwf_plan_dft_1d ( nf, outf, in2f, FFTW_BACKWARD, FFTW_ESTIMATE );
    fftwf_execute ( plan_backwardf );

    in2q = fftwq_malloc ( sizeof ( fftwq_complex ) * nq );
    plan_backwardq = fftwq_plan_dft_1d ( nq, outq, in2q, FFTW_BACKWARD, FFTW_ESTIMATE );
    fftwq_execute ( plan_backwardq);

    // destroy the plans and free the buffers first
    fftw_destroy_plan ( plan_forward );
    fftw_destroy_plan ( plan_backward );

    fftw_free ( in );
    fftw_free ( in2 );
    fftw_free ( out );

    fftwf_destroy_plan ( plan_forwardf );
    fftwf_destroy_plan ( plan_backwardf );

    fftwf_free ( inf );
    fftwf_free ( in2f );
    fftwf_free ( outf );

    fftwq_destroy_plan ( plan_forwardq );
    fftwq_destroy_plan ( plan_backwardq );

    fftwq_free ( inq );
    fftwq_free ( in2q );
    fftwq_free ( outq );

    // cleanup_threads leaves any existing plan of that precision undefined,
    // so call it only after the plans have been destroyed
    fftw_cleanup_threads();
    fftwf_cleanup_threads();
    fftwq_cleanup_threads();

    return 0;
}
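
The sample allocates three buffers per precision (in, out, in2), so it needs a fair amount of memory. A rough estimate can be sketched as follows, assuming the default element types (a complex element is two reals) and typical x86-64 sizes of 8 bytes for double, 4 for float and 16 for quad precision. The function names are hypothetical.

```c
#include <stdint.h>

/* Bytes needed for `arrays` buffers of `n` complex elements, where each
   complex element consists of two reals of `real_size` bytes. */
uint64_t complex_buffer_bytes(uint64_t n, uint64_t real_size, uint64_t arrays)
{
    return n * 2u * real_size * arrays;
}

/* Total for the sample program: n double, 2n float and n/32 quad
   elements, three buffers each. The real sizes are assumptions. */
uint64_t sample_total_bytes(uint64_t n)
{
    uint64_t d = complex_buffer_bytes(n, 8, 3);       /* double, n      */
    uint64_t f = complex_buffer_bytes(n * 2, 4, 3);   /* float,  nf=2n  */
    uint64_t q = complex_buffer_bytes(n / 32, 16, 3); /* quad,   nq=n/32 */
    return d + f + q;
}
```

Under these assumptions, sample_total_bytes(40000000) gives 3,960,000,000 bytes, so reduce n if the machine has less than about 4 GB of free memory.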

You can comment out fftwq_init_threads(); fftwq_plan_with_nthreads(nbthreads); and fftwq_cleanup_threads(); and then monitor CPU usage to confirm that FFTW does not use multithreading for quad precision in that case.

To build fftw3 for the different precisions with Intel SSE2 SIMD support, run:

./configure --enable-threads --enable-shared --enable-sse2
make 
make install
./configure --enable-threads --enable-shared --enable-sse2 --enable-float
make 
make install
./configure --enable-threads --enable-shared --enable-quad-precision
make
make install

SSE2 is not supported for quad precision, which is why the last configure line omits --enable-sse2.

