Fast way to implement 2D convolution in C

Problem description

I am trying to implement a vision algorithm, which includes a prefiltering stage with a 9x9 Laplacian-of-Gaussian filter. Can you point to a document which explains fast filter implementations briefly? I think I should make use of FFT for most efficient filtering.

Recommended answer

Are you sure you want to use FFT? That will be a whole-array transform, which will be expensive. If you've already decided on a 9x9 convolution filter, you don't need any FFT.

Generally, the cheapest way to do convolution in C is to set up a loop that moves a pointer over the array, summing the convolved values at each point and writing the data to a new array. This loop can then be parallelised using your favourite method (compiler vectorisation, MPI libraries, OpenMP, etc).
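The loop described above could look roughly like the following sketch. It assumes a row-major `float` image that has already been padded with a border of `k / 2` elements on every side, and the function and parameter names are illustrative rather than taken from the answer. The kernel is flipped so that the result is a true convolution; for a symmetric kernel such as a Laplacian-of-Gaussian the flip makes no difference.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Direct 2D convolution with a square k x k kernel (k odd).
 * in     : padded image, (h + k - 1) x (w + k - 1) floats, row-major
 * out    : result, h x w floats, row-major
 * kernel : k x k floats, row-major
 * All names are hypothetical; this is a sketch, not a tuned routine.
 */
void convolve2d_padded(const float *in, float *out, size_t h, size_t w,
                       const float *kernel, size_t k)
{
    assert(k % 2 == 1);
    const size_t stride = w + (k - 1); /* row length of the padded input */

    for (size_t y = 0; y < h; ++y) {
        for (size_t x = 0; x < w; ++x) {
            /* Top-left corner of the k x k window centred on (y, x). */
            const float *win = in + y * stride + x;
            float sum = 0.0f;
            for (size_t ky = 0; ky < k; ++ky) {
                for (size_t kx = 0; kx < k; ++kx) {
                    /* Flipped window index: this is what makes it a
                       convolution rather than a correlation. */
                    sum += win[(k - 1 - ky) * stride + (k - 1 - kx)]
                         * kernel[ky * k + kx];
                }
            }
            out[y * w + x] = sum;
        }
    }
}
```

Because the input is padded, the inner loops contain no boundary checks, which also leaves the compiler free to vectorise them.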

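Regarding boundaries:

  • If you assume the values are 0 outside the boundaries, add a 4-element border of zeros around your 2D array of points. This avoids the need for `if` statements to handle the boundaries, which are expensive.
  • If your data wraps at the boundaries (i.e. it is periodic), either use a modulo or add a 4-element border that copies the opposite side of the grid (abcdefg -> fgabcdefgab for 2 points). **Note: this is what you implicitly assume with any kind of Fourier transform, including FFT**. If your data is not periodic, you need to account for that before doing any FFT.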

The border is 4 points wide because a 9x9 kernel overlaps the boundary by at most 4 points outside the main grid. In general, a (2n+1) x (2n+1) kernel needs a border of n points.
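The two border strategies can be sketched as a pair of small helpers. The names `pad_zero` and `pad_wrap` are hypothetical, `r` is the border width (4 for a 9x9 kernel), and images are assumed to be row-major `float` arrays.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy an h x w image into the centre of a (h + 2r) x (w + 2r) buffer
   whose border is zero. Relies on all-zero bytes being 0.0f, which
   holds for IEEE 754 floats. */
void pad_zero(const float *src, float *dst, size_t h, size_t w, size_t r)
{
    const size_t stride = w + 2 * r;
    memset(dst, 0, (h + 2 * r) * stride * sizeof *dst);
    for (size_t y = 0; y < h; ++y)
        memcpy(dst + (y + r) * stride + r, src + y * w, w * sizeof *src);
}

/* Fill a (h + 2r) x (w + 2r) buffer so that the border repeats the
   opposite side of the image (periodic data). */
void pad_wrap(const float *src, float *dst, size_t h, size_t w, size_t r)
{
    assert(h > 0 && w > 0);
    const size_t stride = w + 2 * r;
    for (size_t y = 0; y < h + 2 * r; ++y) {
        /* Source row (y - r) mod h, written to stay non-negative. */
        const size_t sy = (y + h - r % h) % h;
        for (size_t x = 0; x < w + 2 * r; ++x) {
            const size_t sx = (x + w - r % w) % w;
            dst[y * stride + x] = src[sy * w + sx];
        }
    }
}
```

The modulo is paid once while building the padded copy, so the convolution loop itself can stay free of both branches and modulo operations.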

If you need this convolution to be really fast, and/or your grid is large, consider partitioning it into smaller pieces that can be held in the processor's cache, and thus calculated far more quickly. This also goes for any GPU-offloading you might want to do (they are ideal for this type of floating-point calculation).
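One way to read the partitioning advice is a tiled loop like the sketch below. `TILE` is a hypothetical tile edge that would need tuning for the target cache, the input is again assumed to be padded by `k / 2` on every side, and the OpenMP directive is only mentioned in a comment as one possible way to parallelise over tiles.

```c
#include <assert.h>
#include <stddef.h>

#define TILE 64 /* hypothetical tile edge; tune for the target cache */

static size_t min_size(size_t a, size_t b) { return a < b ? a : b; }

/* Same direct convolution, but visiting the output in TILE x TILE
   blocks so each block's input window is more likely to stay cached. */
void convolve2d_tiled(const float *in, float *out, size_t h, size_t w,
                      const float *kernel, size_t k)
{
    assert(k % 2 == 1);
    const size_t stride = w + (k - 1); /* row length of the padded input */

    /* Tiles write disjoint parts of `out`, so this outer loop is a
       candidate for something like "#pragma omp parallel for". */
    for (size_t ty = 0; ty < h; ty += TILE) {
        const size_t y_end = min_size(ty + TILE, h);
        for (size_t tx = 0; tx < w; tx += TILE) {
            const size_t x_end = min_size(tx + TILE, w);
            for (size_t y = ty; y < y_end; ++y) {
                for (size_t x = tx; x < x_end; ++x) {
                    const float *win = in + y * stride + x;
                    float sum = 0.0f;
                    for (size_t ky = 0; ky < k; ++ky)
                        for (size_t kx = 0; kx < k; ++kx)
                            sum += win[(k - 1 - ky) * stride + (k - 1 - kx)]
                                 * kernel[ky * k + kx];
                    out[y * w + x] = sum;
                }
            }
        }
    }
}
```

Whether tiling helps depends on the image size and the cache hierarchy, so it is worth measuring against the plain loop before adopting it.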

08-20 04:23