Problem description
Hey there, I have the following piece of code:
#if USE_CONST == 1
__constant__ double PNT[ SIZE ];
#else
__device__ double *PNT;
#endif
Later on I have:
#if USE_CONST == 0
cudaMalloc((void **)&PNT, sizeof(double)*SIZE);
cudaMemcpy(PNT, point, sizeof(double)*SIZE, cudaMemcpyHostToDevice);
#else
cudaMemcpyToSymbol(PNT, point, sizeof(double)*SIZE);
#endif
where point is defined somewhere earlier in the code. With USE_CONST=1 everything works as expected, but with USE_CONST=0 it does not. I access the array in my kernel function via
PNT[index]
What is the difference between the two variants that causes the problem? Thanks!
Recommended answer
The correct usage of cudaMemcpyToSymbol prior to CUDA 4.0 is:
cudaMemcpyToSymbol("PNT", point, sizeof(double)*SIZE);
or alternatively:
double *cpnt;
cudaGetSymbolAddress((void **)&cpnt, "PNT");
cudaMemcpy(cpnt, point, sizeof(double)*SIZE, cudaMemcpyHostToDevice);
which might be a bit faster if you plan to access the symbol from the host API more than once.
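For readers on a newer toolkit: from CUDA 5.0 onward the quoted-string form of the symbol name is no longer accepted, and the symbol is passed directly. The following is a minimal sketch of both constant-memory variants in that style. The SIZE value, the sum_kernel function and the CUDA_CHECK macro are hypothetical names introduced only for this illustration; they do not come from the question.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define SIZE 16  // assumed size, for illustration only

// Hypothetical helper: report a failing runtime call and exit main().
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t e_ = (call);                                      \
        if (e_ != cudaSuccess) {                                      \
            fprintf(stderr, "%s failed: %s\n", #call,                 \
                    cudaGetErrorString(e_));                          \
            return 1;                                                 \
        }                                                             \
    } while (0)

__constant__ double PNT[SIZE];

// Hypothetical kernel: sums the constant array so the copy can be checked.
__global__ void sum_kernel(double *out)
{
    double s = 0.0;
    for (int i = 0; i < SIZE; ++i) s += PNT[i];
    *out = s;
}

int main()
{
    double point[SIZE];
    for (int i = 0; i < SIZE; ++i) point[i] = 1.0;

    // Variant 1: copy by symbol (no quotes around PNT).
    CUDA_CHECK(cudaMemcpyToSymbol(PNT, point, sizeof(double) * SIZE));

    // Variant 2: look the address up once, then use plain cudaMemcpy.
    double *cpnt = nullptr;
    CUDA_CHECK(cudaGetSymbolAddress((void **)&cpnt, PNT));
    CUDA_CHECK(cudaMemcpy(cpnt, point, sizeof(double) * SIZE,
                          cudaMemcpyHostToDevice));

    double *d_out = nullptr;
    double h_out = 0.0;
    CUDA_CHECK(cudaMalloc((void **)&d_out, sizeof(double)));
    sum_kernel<<<1, 1>>>(d_out);
    CUDA_CHECK(cudaGetLastError());
    CUDA_CHECK(cudaMemcpy(&h_out, d_out, sizeof(double),
                          cudaMemcpyDeviceToHost));
    printf("sum of PNT = %f\n", h_out);

    CUDA_CHECK(cudaFree(d_out));
    return 0;
}
```

Caching the address as in variant 2 only pays off when the symbol is written from the host repeatedly; for a one-time upload the direct cudaMemcpyToSymbol call is simpler.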
EDIT: I misunderstood the question. For the global memory version, do something similar to the second constant memory version:
double *gpnt;
cudaGetSymbolAddress((void **)&gpnt, "PNT");
cudaMemcpy(gpnt, point, sizeof(double)*SIZE, cudaMemcpyHostToDevice);
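One caveat worth spelling out: in the question PNT is declared as a __device__ pointer, not an array. Host code cannot take &PNT directly, and the address returned by cudaGetSymbolAddress is that of the pointer variable itself rather than of a SIZE-element buffer. A common way to handle this case, sketched here under assumptions rather than taken from the answer, is to allocate the buffer through an ordinary host-side pointer and then store that pointer value in the device symbol. The names d_buf, read_kernel and CUDA_CHECK, and the SIZE value, are hypothetical, and the symbol is passed without quotes as in CUDA 5.0 and later.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define SIZE 16  // assumed size, for illustration only

// Hypothetical helper: report a failing runtime call and exit main().
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t e_ = (call);                                      \
        if (e_ != cudaSuccess) {                                      \
            fprintf(stderr, "%s failed: %s\n", #call,                 \
                    cudaGetErrorString(e_));                          \
            return 1;                                                 \
        }                                                             \
    } while (0)

// Device-side pointer variable, declared as in the question.
__device__ double *PNT;

// Hypothetical kernel: reads one element through the device pointer.
__global__ void read_kernel(double *out, int index)
{
    *out = PNT[index];
}

int main()
{
    double point[SIZE];
    for (int i = 0; i < SIZE; ++i) point[i] = (double)i;

    // 1. Allocate and fill the buffer through a host-side handle.
    double *d_buf = nullptr;
    CUDA_CHECK(cudaMalloc((void **)&d_buf, sizeof(double) * SIZE));
    CUDA_CHECK(cudaMemcpy(d_buf, point, sizeof(double) * SIZE,
                          cudaMemcpyHostToDevice));

    // 2. Store the pointer VALUE in the device symbol PNT.
    //    Only sizeof(double *) bytes are copied here, not the array.
    CUDA_CHECK(cudaMemcpyToSymbol(PNT, &d_buf, sizeof(d_buf)));

    // 3. Kernels can now index PNT as in the question.
    double *d_out = nullptr;
    double h_out = 0.0;
    CUDA_CHECK(cudaMalloc((void **)&d_out, sizeof(double)));
    read_kernel<<<1, 1>>>(d_out, 3);
    CUDA_CHECK(cudaGetLastError());
    CUDA_CHECK(cudaMemcpy(&h_out, d_out, sizeof(double),
                          cudaMemcpyDeviceToHost));
    printf("PNT[3] = %f\n", h_out);

    CUDA_CHECK(cudaFree(d_out));
    CUDA_CHECK(cudaFree(d_buf));
    return 0;
}
```

If the size is known at compile time, declaring the global-memory variant as a __device__ array instead of a pointer would let the cudaGetSymbolAddress approach from the answer apply unchanged.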