Question
I am confused about the difference between the intended uses of device pointers and cudaArray structures. Could someone please explain why I would use one versus the other? My basic problem is that, after looking through the documentation and reading the book "CUDA by Example," I do not understand the intent of the API designers.
From what I have seen, it seems that cudaArray should be used for textures and pointers should be used for directly accessing memory. It also seems that 3D textures can only be created using a cudaArray. Should all textures be allocated using cudaArray? Numerous examples seem not to. Also, why is there a function cudaMallocArray and cudaMallocArray3D, but no equivalent cudaMallocArray2D? Conversely, there is a cudaBindTexture and cudaBindTexture2D, but no cudaBindTexture3D?
Recommended answer
cudaArray is an opaque block of memory that is optimized for binding to textures. Textures can use memory laid out along a space-filling curve, which allows for a better texture cache hit rate due to better 2D spatial locality. Copying data to a cudaArray will cause it to be rearranged into such a layout.
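To make the locality argument concrete, here is a minimal sketch of one well-known space-filling curve, the Z-order (Morton) curve. The actual layout inside a cudaArray is opaque and not documented, so this is only an illustration of the general idea, not a description of what the hardware does; spreadBits and mortonIndex are hypothetical helper names.

```cuda
#include <cstdint>
#include <cstdio>

// Spread the low 16 bits of v so that a zero bit separates each original bit.
__host__ __device__ inline uint32_t spreadBits(uint32_t v) {
    v &= 0x0000FFFFu;
    v = (v | (v << 8)) & 0x00FF00FFu;
    v = (v | (v << 4)) & 0x0F0F0F0Fu;
    v = (v | (v << 2)) & 0x33333333u;
    v = (v | (v << 1)) & 0x55555555u;
    return v;
}

// Z-order index: bits of x land on even positions, bits of y on odd positions.
__host__ __device__ inline uint32_t mortonIndex(uint32_t x, uint32_t y) {
    return spreadBits(x) | (spreadBits(y) << 1);
}

int main() {
    const uint32_t width = 1024;  // hypothetical image width in elements
    const uint32_t x = 5, y = 6;

    // Row-major layout: the element directly below is a full row away.
    uint32_t rowMajorGap = ((y + 1) * width + x) - (y * width + x);

    // Z-order layout: for this pair the element directly below is nearby.
    uint32_t mortonGap = mortonIndex(x, y + 1) - mortonIndex(x, y);

    printf("row-major gap: %u, Z-order gap: %u\n", rowMajorGap, mortonGap);
    return 0;
}
```

The gap is not small for every pair of neighbours (it grows when a neighbour falls across a power-of-two block boundary), but on average vertically adjacent elements end up much closer in memory than in a row-major layout.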
So, storing data in a cudaArray is an optimization technique which can yield better texture cache hit rates. On early CUDA architectures, a kernel also could not access a cudaArray directly; it could only read it through a texture. However, architectures of compute capability >= 2.0 can read and write the array via CUDA surfaces.
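As a rough sketch of surface access, the following writes into a cudaArray from a kernel. It assumes the surface object API (cudaSurfaceObject_t), which needs compute capability 3.0 or later; the older surface reference API that compute capability 2.0 devices used has, as far as I know, been removed from recent toolkits. The function and kernel names are hypothetical.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

#define CUDA_CHECK(call) do { cudaError_t e = (call); if (e != cudaSuccess) { \
    fprintf(stderr, "%s failed: %s\n", #call, cudaGetErrorString(e)); exit(1); } } while (0)

// Each thread writes y * width + x into element (x, y) through the surface.
__global__ void fillThroughSurface(cudaSurfaceObject_t surf, int width, int height) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < width && y < height) {
        // Surface functions take the x coordinate in bytes, not elements.
        surf2Dwrite(static_cast<float>(y * width + x), surf,
                    x * static_cast<int>(sizeof(float)), y);
    }
}

// Allocates a 2D cudaArray, fills it from a kernel, and returns a host copy.
std::vector<float> fillArrayViaSurface(int width, int height) {
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    cudaArray_t array = nullptr;
    // The cudaArraySurfaceLoadStore flag is required for surface access.
    CUDA_CHECK(cudaMallocArray(&array, &desc, width, height, cudaArraySurfaceLoadStore));

    cudaResourceDesc resDesc = {};
    resDesc.resType = cudaResourceTypeArray;
    resDesc.res.array.array = array;
    cudaSurfaceObject_t surf = 0;
    CUDA_CHECK(cudaCreateSurfaceObject(&surf, &resDesc));

    dim3 block(16, 16);
    dim3 grid((width + block.x - 1) / block.x, (height + block.y - 1) / block.y);
    fillThroughSurface<<<grid, block>>>(surf, width, height);
    CUDA_CHECK(cudaGetLastError());
    CUDA_CHECK(cudaDeviceSynchronize());

    std::vector<float> host(static_cast<size_t>(width) * height);
    CUDA_CHECK(cudaMemcpy2DFromArray(host.data(), width * sizeof(float), array, 0, 0,
                                     width * sizeof(float), height,
                                     cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaDestroySurfaceObject(surf));
    CUDA_CHECK(cudaFreeArray(array));
    return host;
}

int main() {
    const int width = 64, height = 32;
    std::vector<float> host = fillArrayViaSurface(width, height);
    printf("element (3, 2) = %.0f\n", host[2 * width + 3]);
    return 0;
}
```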
Determining if you should use a cudaArray or a regular buffer in global memory comes down to the intended usage and access patterns for the memory. It will be project specific.
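One point of comparison that may help with the question's "Should all textures be allocated using cudaArray?": a regular device pointer can also back a texture. The sketch below assumes the texture object API and binds a texture to plain linear memory obtained from cudaMalloc; the same pointer could equally be read with ordinary indexing. The function and kernel names are hypothetical.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

#define CUDA_CHECK(call) do { cudaError_t e = (call); if (e != cudaSuccess) { \
    fprintf(stderr, "%s failed: %s\n", #call, cudaGetErrorString(e)); exit(1); } } while (0)

// Reads element i through the texture path and stores it in plain global memory.
__global__ void copyViaTexture(cudaTextureObject_t tex, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = tex1Dfetch<float>(tex, i);
    }
}

// Copies input to the device, reads it back through a texture over linear memory.
std::vector<float> readLinearThroughTexture(const std::vector<float>& input) {
    const int n = static_cast<int>(input.size());
    const size_t bytes = n * sizeof(float);
    float* dIn = nullptr;
    float* dOut = nullptr;
    CUDA_CHECK(cudaMalloc(&dIn, bytes));   // ordinary device pointers, no cudaArray
    CUDA_CHECK(cudaMalloc(&dOut, bytes));
    CUDA_CHECK(cudaMemcpy(dIn, input.data(), bytes, cudaMemcpyHostToDevice));

    cudaResourceDesc resDesc = {};
    resDesc.resType = cudaResourceTypeLinear;
    resDesc.res.linear.devPtr = dIn;
    resDesc.res.linear.desc = cudaCreateChannelDesc<float>();
    resDesc.res.linear.sizeInBytes = bytes;

    cudaTextureDesc texDesc = {};
    texDesc.readMode = cudaReadModeElementType;

    cudaTextureObject_t tex = 0;
    CUDA_CHECK(cudaCreateTextureObject(&tex, &resDesc, &texDesc, nullptr));

    const int threads = 128;
    copyViaTexture<<<(n + threads - 1) / threads, threads>>>(tex, dOut, n);
    CUDA_CHECK(cudaGetLastError());

    std::vector<float> output(n);
    CUDA_CHECK(cudaMemcpy(output.data(), dOut, bytes, cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaDestroyTextureObject(tex));
    CUDA_CHECK(cudaFree(dIn));
    CUDA_CHECK(cudaFree(dOut));
    return output;
}

int main() {
    std::vector<float> input = {1.0f, 2.0f, 3.0f, 4.0f};
    std::vector<float> output = readLinearThroughTexture(input);
    printf("output[2] = %.0f\n", output[2]);
    return 0;
}
```

A texture over linear memory keeps the row-major layout, so it gives up the 2D-locality benefit described above; whether that matters depends on the access pattern.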
cudaMallocArray() actually allocates a 2D array, so I think the issue is just inconsistent naming. Maybe it would have been more logical to call it cudaMallocArray2D().
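A minimal sketch of that 2D usage, assuming the texture object API: cudaMallocArray() takes both a width and a height (passing a height of 0 gives a 1D array), and the resulting array is sampled with tex2D. The function and kernel names are hypothetical.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

#define CUDA_CHECK(call) do { cudaError_t e = (call); if (e != cudaSuccess) { \
    fprintf(stderr, "%s failed: %s\n", #call, cudaGetErrorString(e)); exit(1); } } while (0)

// Fetches the texel at integer position (x, y); +0.5 addresses the texel centre.
__global__ void fetchTexel2D(cudaTextureObject_t tex, int x, int y, float* out) {
    *out = tex2D<float>(tex, x + 0.5f, y + 0.5f);
}

// Fills a width x height cudaArray with y * width + x and samples texel (x, y).
float sampleArrayTexel(int width, int height, int x, int y) {
    std::vector<float> host(static_cast<size_t>(width) * height);
    for (int r = 0; r < height; ++r)
        for (int c = 0; c < width; ++c)
            host[r * width + c] = static_cast<float>(r * width + c);

    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    cudaArray_t array = nullptr;
    CUDA_CHECK(cudaMallocArray(&array, &desc, width, height));  // 2D allocation
    CUDA_CHECK(cudaMemcpy2DToArray(array, 0, 0, host.data(), width * sizeof(float),
                                   width * sizeof(float), height,
                                   cudaMemcpyHostToDevice));

    cudaResourceDesc resDesc = {};
    resDesc.resType = cudaResourceTypeArray;
    resDesc.res.array.array = array;

    cudaTextureDesc texDesc = {};
    texDesc.addressMode[0] = cudaAddressModeClamp;
    texDesc.addressMode[1] = cudaAddressModeClamp;
    texDesc.filterMode = cudaFilterModePoint;      // no interpolation
    texDesc.readMode = cudaReadModeElementType;
    texDesc.normalizedCoords = 0;                  // coordinates in texels

    cudaTextureObject_t tex = 0;
    CUDA_CHECK(cudaCreateTextureObject(&tex, &resDesc, &texDesc, nullptr));

    float* dOut = nullptr;
    CUDA_CHECK(cudaMalloc(&dOut, sizeof(float)));
    fetchTexel2D<<<1, 1>>>(tex, x, y, dOut);
    CUDA_CHECK(cudaGetLastError());

    float result = 0.0f;
    CUDA_CHECK(cudaMemcpy(&result, dOut, sizeof(float), cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaFree(dOut));
    CUDA_CHECK(cudaDestroyTextureObject(tex));
    CUDA_CHECK(cudaFreeArray(array));
    return result;
}

int main() {
    printf("texel (3, 2) = %.0f\n", sampleArrayTexel(64, 32, 3, 2));
    return 0;
}
```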
I haven't used 3D textures. Hopefully, someone will answer and let us know why there's no need for cudaBindTexture3D().
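For reference, here is a hedged sketch of how a 3D texture can be set up. In the runtime API the 3D allocation function is named cudaMalloc3DArray (presumably what the question calls cudaMallocArray3D). This sketch assumes the texture object API, where no bind call is needed at all. In the legacy texture reference API, arrays of any dimensionality were bound with cudaBindTextureToArray, which is one plausible reason a separate cudaBindTexture3D was never needed. The function and kernel names below are hypothetical.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

#define CUDA_CHECK(call) do { cudaError_t e = (call); if (e != cudaSuccess) { \
    fprintf(stderr, "%s failed: %s\n", #call, cudaGetErrorString(e)); exit(1); } } while (0)

// Fetches the texel at integer position (x, y, z) from a 3D texture.
__global__ void fetchTexel3D(cudaTextureObject_t tex, int x, int y, int z, float* out) {
    *out = tex3D<float>(tex, x + 0.5f, y + 0.5f, z + 0.5f);
}

// Fills a w x h x d cudaArray with (z * h + y) * w + x and samples texel (x, y, z).
float sampleVolumeTexel(int w, int h, int d, int x, int y, int z) {
    std::vector<float> host(static_cast<size_t>(w) * h * d);
    for (size_t i = 0; i < host.size(); ++i) host[i] = static_cast<float>(i);

    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    cudaExtent extent = make_cudaExtent(w, h, d);  // in elements for arrays
    cudaArray_t array = nullptr;
    CUDA_CHECK(cudaMalloc3DArray(&array, &desc, extent));

    cudaMemcpy3DParms copy = {};
    copy.srcPtr = make_cudaPitchedPtr(host.data(), w * sizeof(float), w, h);
    copy.dstArray = array;
    copy.extent = extent;
    copy.kind = cudaMemcpyHostToDevice;
    CUDA_CHECK(cudaMemcpy3D(&copy));

    cudaResourceDesc resDesc = {};
    resDesc.resType = cudaResourceTypeArray;
    resDesc.res.array.array = array;

    cudaTextureDesc texDesc = {};
    for (int i = 0; i < 3; ++i) texDesc.addressMode[i] = cudaAddressModeClamp;
    texDesc.filterMode = cudaFilterModePoint;
    texDesc.readMode = cudaReadModeElementType;
    texDesc.normalizedCoords = 0;

    cudaTextureObject_t tex = 0;
    CUDA_CHECK(cudaCreateTextureObject(&tex, &resDesc, &texDesc, nullptr));

    float* dOut = nullptr;
    CUDA_CHECK(cudaMalloc(&dOut, sizeof(float)));
    fetchTexel3D<<<1, 1>>>(tex, x, y, z, dOut);
    CUDA_CHECK(cudaGetLastError());

    float result = 0.0f;
    CUDA_CHECK(cudaMemcpy(&result, dOut, sizeof(float), cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaFree(dOut));
    CUDA_CHECK(cudaDestroyTextureObject(tex));
    CUDA_CHECK(cudaFreeArray(array));
    return result;
}

int main() {
    printf("texel (1, 2, 1) = %.0f\n", sampleVolumeTexel(8, 4, 2, 1, 2, 1));
    return 0;
}
```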