Problem description
I've written a simple serial 1D convolution function (below). I'm also experimenting with GPU convolution implementations. This is mostly for my own curiosity; I'm trying to learn the performance tradeoffs among various non-FFT implementation strategies.
Avoiding branching will be important for my GPU convolution experiments, since branching is expensive on Nvidia GPUs. One of my friends mentioned that there's a way to implement the code below without if/else statements, but he couldn't remember how it works.
How can I implement this convolution without using any if/else statements?
Here's my basic 1D serial code in C++:
#include <algorithm>
#include <vector>
using namespace std;

vector<int> myConv1d(vector<int> vec, vector<int> kernel)
{
    int paddedLength = vec.size() + kernel.size() - 1;
    vector<int> convolved(paddedLength); //zeros
    reverse(kernel.begin(), kernel.end()); //flip the kernel (if we don't flip it, then we have correlation instead of convolution)
    for(int outputIdx=0; outputIdx<paddedLength; outputIdx++) //index into 'convolved' vector
    {
        int vecIdx = outputIdx - (int)kernel.size() + 1; //aligns with leftmost element of kernel (cast keeps the arithmetic signed)
        for(int kernelIdx=0; kernelIdx<(int)kernel.size(); kernelIdx++)
        {
            if( (vecIdx+kernelIdx) >= 0 && (vecIdx+kernelIdx) < (int)vec.size() ) //TODO: FIND A WAY TO REMOVE THIS
            {
                convolved[outputIdx] += kernel[kernelIdx]*vec[vecIdx+kernelIdx];
            }
        }
    }
    return convolved;
}
A few quick notes:
- I did find some related posts, but I didn't quite understand the strategy avoiding conditional statements.
- I've also written a 2D convolution implementation, and I'm hoping to apply the results of this SO post to the 2D version as well.
- This is NOT homework. It's marginally related to one of our research projects, but it's mostly for the sake of learning.
Recommended answer
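// Clamp the kernel index range so that vecIdx + kernelIdx always stays inside vec.
int lowerBound = std::max( 0, -vecIdx );
int upperBound = std::min( (int)kernel.size(), (int)vec.size() - vecIdx );
for( int kernelIdx = lowerBound; kernelIdx < upperBound; kernelIdx++ )
    convolved[outputIdx] += kernel[kernelIdx]*vec[vecIdx+kernelIdx];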
I'm not sure I understand the question.
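To show how the clamped bounds fit into the original function, here is a minimal sketch of the whole routine. The name `convolveClamped` and the early return for empty inputs are assumptions added for illustration; they are not part of the original question or answer. The bounds check moves out of the inner loop, so the loop body has no if/else. `std::max` and `std::min` typically compile to select-style instructions rather than jumps, but that depends on the compiler and target.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical name: full 1D convolution with the bounds check hoisted
// out of the inner loop by clamping the kernel index range.
std::vector<int> convolveClamped(const std::vector<int>& vec, std::vector<int> kernel)
{
    // Assumption: an empty input yields an empty result.
    if (vec.empty() || kernel.empty())
        return {};

    const int vecSize = static_cast<int>(vec.size());
    const int kernelSize = static_cast<int>(kernel.size());
    const int paddedLength = vecSize + kernelSize - 1;

    std::vector<int> convolved(paddedLength); // zero-initialized
    std::reverse(kernel.begin(), kernel.end()); // flip the kernel (convolution, not correlation)

    for (int outputIdx = 0; outputIdx < paddedLength; outputIdx++)
    {
        // Position in vec that lines up with the leftmost kernel element.
        const int vecIdx = outputIdx - kernelSize + 1;

        // Only visit kernel indices for which vecIdx + kernelIdx is inside vec.
        const int lowerBound = std::max(0, -vecIdx);
        const int upperBound = std::min(kernelSize, vecSize - vecIdx);

        for (int kernelIdx = lowerBound; kernelIdx < upperBound; kernelIdx++)
            convolved[outputIdx] += kernel[kernelIdx] * vec[vecIdx + kernelIdx];
    }
    return convolved;
}
```

For example, `convolveClamped({1, 2, 3}, {1, 2})` returns `{1, 4, 7, 6}`, the same result the original `myConv1d` produces for those inputs.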