Question
I am having trouble using cudaMemset on a device variable. Is it possible to pass a reference to the device variable to cudaMemset, or is this just a matter of missing compiler flags or libraries? I am using CUDA 4.1, and
Here is my example code:
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

// device variable and kernel
__device__ float d_test;

int main() {
    if (cudaMemset(&d_test, 0, sizeof(float)) != cudaSuccess)
        printf("Error!\n");
}
Its output:
Error!
Recommended answer
Your problem is that d_test (as it appears in the host symbol table) isn't a valid device address, so the runtime cannot use &d_test directly. The solution is to call the cudaGetSymbolAddress API function, which reads the address of the device symbol from the context at runtime. Here is a slightly expanded version of your demonstration case which should work correctly:
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

// device variable and kernel
__device__ float d_test;

// Print the CUDA error string with file and line, then optionally exit.
// file is const char * because __FILE__ expands to a string literal.
inline void gpuAssert(cudaError_t code, const char *file, int line, bool Abort=true)
{
    if (code != cudaSuccess) {
        fprintf(stderr, "GPUassert: %s %s %d\n", cudaGetErrorString(code), file, line);
        if (Abort) exit(code);
    }
}

#define gpuErrchk(ans) { gpuAssert((ans), __FILE__, __LINE__); }

int main()
{
    float * _d_test;  // host variable that will hold a device address

    gpuErrchk( cudaFree(0) );  // force context creation
    gpuErrchk( cudaGetSymbolAddress((void **)&_d_test, "d_test") );
    gpuErrchk( cudaMemset(_d_test,0,sizeof(float)) );
    gpuErrchk( cudaThreadExit() );  // deprecated in favor of cudaDeviceReset()
    return 0;
}
Here, we read the address of the device symbol d_test from the context into the host pointer variable _d_test. The value it holds is a device address, so it must not be dereferenced on the host, but it can be passed to host-side API functions like cudaMemset, cudaMemcpy, etc.
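The listing above looks the symbol up by its name as a string, which the CUDA 4.1 runtime accepts. From CUDA 5.0 onward the string form was removed and the symbol itself has to be passed. The following is a minimal sketch for newer toolkits, not part of the original answer: the helper check and the variables d_ptr, h_value and zero are hypothetical names chosen for illustration. It also shows cudaMemcpyToSymbol and cudaMemcpyFromSymbol as an alternative that avoids the explicit address lookup.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

__device__ float d_test;

// Hypothetical helper (not from the original answer): report a failed
// runtime API call and exit.
static void check(cudaError_t code, const char *what)
{
    if (code != cudaSuccess) {
        fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(code));
        exit(EXIT_FAILURE);
    }
}

int main()
{
    float *d_ptr = NULL;   // host variable that will hold a device address
    float h_value = 1.0f;  // deliberately non-zero so the read-back is visible

    // Assumes CUDA 5.0 or later: pass the symbol itself, not the string "d_test".
    check(cudaGetSymbolAddress((void **)&d_ptr, d_test), "cudaGetSymbolAddress");

    // The looked-up address can be used with the ordinary memory API.
    check(cudaMemset(d_ptr, 0, sizeof(float)), "cudaMemset");
    check(cudaMemcpy(&h_value, d_ptr, sizeof(float), cudaMemcpyDeviceToHost),
          "cudaMemcpy");
    printf("after cudaMemset: d_test = %f\n", h_value);

    // Alternative: copy to and from the symbol without looking up its address.
    const float zero = 0.0f;
    h_value = 1.0f;
    check(cudaMemcpyToSymbol(d_test, &zero, sizeof(float)), "cudaMemcpyToSymbol");
    check(cudaMemcpyFromSymbol(&h_value, d_test, sizeof(float)),
          "cudaMemcpyFromSymbol");
    printf("after cudaMemcpyToSymbol: d_test = %f\n", h_value);

    check(cudaDeviceReset(), "cudaDeviceReset");
    return 0;
}
```

Compiled with nvcc and run on a machine with a working CUDA device, both printed values should be zero. On CUDA 4.1 itself, the string-based lookup from the answer is the form to use.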