A fast way to implement 2D convolution in C


Question

I am trying to implement a vision algorithm that includes a prefiltering stage with a 9x9 Laplacian-of-Gaussian filter. Can you point me to a document that briefly explains fast filter implementations? I think I should use an FFT for the most efficient filtering.

Recommended answer

Are you sure you want to use an FFT? It is a whole-array transform, which will be expensive. If you have already decided on a 9x9 convolution filter, you do not need an FFT at all.

Generally, the cheapest way to do convolution in C is to set up a loop that moves a pointer over the array, sums the convolved values at each point, and writes the result to a new array. This loop can then be parallelised using your favourite method (compiler vectorisation, MPI libraries, OpenMP, etc.).
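The loop described above might look roughly like the following. This is a minimal sketch, not a tuned implementation: the function name `convolve9x9`, the row-major `float` layout, and the assumption that the input already carries a 4-element border on every side are all illustrative choices, not part of the original answer.

```c
#include <assert.h>
#include <stddef.h>

#define KSIZE 9            /* kernel width and height */
#define KHALF (KSIZE / 2)  /* border width: 4 for a 9x9 kernel */

/*
 * Hypothetical helper: convolve a rows x cols image with a 9x9 kernel.
 * `padded` is the image surrounded by a KHALF-wide border, stored
 * row-major with (rows + 2*KHALF) x (cols + 2*KHALF) elements.
 * `kernel` holds KSIZE*KSIZE row-major coefficients.
 * `out` receives rows x cols results.
 */
void convolve9x9(const float *padded, float *out,
                 size_t rows, size_t cols, const float *kernel)
{
    assert(padded != NULL && out != NULL && kernel != NULL);
    const size_t pcols = cols + 2 * KHALF;  /* row stride of the padded input */

    for (size_t r = 0; r < rows; ++r) {
        for (size_t c = 0; c < cols; ++c) {
            /* Top-left corner of the 9x9 window centred on output (r, c). */
            const float *win = padded + r * pcols + c;
            float sum = 0.0f;
            for (size_t kr = 0; kr < KSIZE; ++kr) {
                for (size_t kc = 0; kc < KSIZE; ++kc) {
                    /* Kernel is read flipped: convolution, not correlation. */
                    sum += win[kr * pcols + kc]
                         * kernel[(KSIZE - 1 - kr) * KSIZE + (KSIZE - 1 - kc)];
                }
            }
            out[r * cols + c] = sum;
        }
    }
}
```

Each output row depends only on the read-only input, so the outer loop is a natural place for a directive such as `#pragma omp parallel for` if OpenMP is the chosen method. Because the border is already present, the inner loops contain no boundary checks.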

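Regarding boundaries:

If you assume the values are 0 outside the boundaries, add a 4-element border of zeros to your 2D array of points. This avoids the need for `if` statements to handle the boundaries, which are expensive.

If your data wraps at the boundaries (i.e. it is periodic), either use a modulo or add a 4-element border that copies the opposite side of the grid (abcdefg -> fgabcdefgab for 2 points). **Note: this is what you implicitly assume with any kind of Fourier transform, including the FFT.** If that is not the case, you would need to account for it before any FFT is done.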

The border is 4 points wide because a 9x9 kernel overlaps at most 4 points beyond the main grid. In general, a (2n+1) x (2n+1) kernel needs a border of n points.
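The two border strategies can be sketched as below for a general border width `n` (4 for a 9x9 kernel). The helper names `pad_zero` and `pad_periodic` and the row-major `float` layout are assumptions made for illustration; the caller is assumed to supply a destination buffer of (rows + 2n) x (cols + 2n) elements.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the image into the middle of `padded`
 * and fill the n-wide border with zeros. */
void pad_zero(const float *in, float *padded,
              size_t rows, size_t cols, size_t n)
{
    assert(in != NULL && padded != NULL);
    const size_t prows = rows + 2 * n;
    const size_t pcols = cols + 2 * n;

    for (size_t i = 0; i < prows * pcols; ++i)
        padded[i] = 0.0f;
    for (size_t r = 0; r < rows; ++r)
        memcpy(padded + (r + n) * pcols + n, in + r * cols,
               cols * sizeof *in);
}

/* Hypothetical helper: same layout, but the border copies the opposite
 * side of the grid, i.e. the data is treated as periodic. */
void pad_periodic(const float *in, float *padded,
                  size_t rows, size_t cols, size_t n)
{
    assert(in != NULL && padded != NULL && rows > 0 && cols > 0);
    const size_t prows = rows + 2 * n;
    const size_t pcols = cols + 2 * n;

    for (size_t pr = 0; pr < prows; ++pr) {
        /* Source row (pr - n) mod rows, written to avoid unsigned underflow. */
        const size_t r = (pr + rows - n % rows) % rows;
        for (size_t pc = 0; pc < pcols; ++pc) {
            const size_t c = (pc + cols - n % cols) % cols;
            padded[pr * pcols + pc] = in[r * cols + c];
        }
    }
}
```

With a single row `abcdefg` and `n = 2`, `pad_periodic` yields rows of the form `fgabcdefgab`, matching the example in the answer. The modulo is paid once while building the border rather than inside the convolution loop.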

If you need this convolution to be really fast, and/or your grid is large, consider partitioning it into smaller pieces that fit in the processor's cache and can therefore be computed far more quickly. The same applies to any GPU offloading you might want to do (GPUs are ideal for this type of floating-point calculation).
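One way to read the partitioning advice is to walk the output in square tiles, so that the input rows a tile needs stay in cache while it is processed. The sketch below assumes a zero- or periodically-padded row-major `float` input with a 4-element border; the name `convolve9x9_tiled` and the tile edge `TILE` are hypothetical, and whether tiling actually helps depends on the image size and the target cache, so it is worth measuring.

```c
#include <assert.h>
#include <stddef.h>

#define KSIZE 9
#define KHALF (KSIZE / 2)
#define TILE 64  /* assumed tile edge; tune for the target cache */

static size_t min_size(size_t a, size_t b) { return a < b ? a : b; }

/* Hypothetical helper: 9x9 convolution over a padded input, computed
 * tile by tile. `padded` has (rows + 2*KHALF) x (cols + 2*KHALF)
 * elements, `kernel` has KSIZE*KSIZE, `out` has rows x cols. */
void convolve9x9_tiled(const float *padded, float *out,
                       size_t rows, size_t cols, const float *kernel)
{
    assert(padded != NULL && out != NULL && kernel != NULL);
    const size_t pcols = cols + 2 * KHALF;

    for (size_t r0 = 0; r0 < rows; r0 += TILE) {
        const size_t r1 = min_size(r0 + TILE, rows);  /* clip last tile */
        for (size_t c0 = 0; c0 < cols; c0 += TILE) {
            const size_t c1 = min_size(c0 + TILE, cols);

            /* Plain direct convolution restricted to this tile. */
            for (size_t r = r0; r < r1; ++r) {
                for (size_t c = c0; c < c1; ++c) {
                    const float *win = padded + r * pcols + c;
                    float sum = 0.0f;
                    for (size_t kr = 0; kr < KSIZE; ++kr)
                        for (size_t kc = 0; kc < KSIZE; ++kc)
                            sum += win[kr * pcols + kc]
                                 * kernel[(KSIZE - 1 - kr) * KSIZE
                                          + (KSIZE - 1 - kc)];
                    out[r * cols + c] = sum;
                }
            }
        }
    }
}
```

Tiles are independent of one another, so they are also a convenient unit of work to hand to separate threads or to a GPU.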
