CUDA: cudaMemcpyToSymbol does not copy data

Problem description

I am having problems using cudaMemcpyToSymbol. I have code that works just fine; a cut-down version is this:

mykernel.h file:
__global__
void foo(float* out);

mykernel.cu file:
#include <stdint.h>
#include "mykernel.h"
__global__
void foo(float* out)
{
    uint32_t idx = blockIdx.x * blockDim.x + threadIdx.x;
    out[idx] = 10;
}

main.cu file:
#include "mykernel.h"
int main()
{
    // initialization and declaration stuff here

    foo<<<1,1,1>>>(my_global_memory);

    // read back global memory and investigate values
}

The above code works perfectly. Now I want to replace the value 10 with a value coming from constant memory. So what I did was:


  • add __constant__ float my_const_var; to the mykernel.h file.
  • replace the last line of my kernel with out[idx] = my_const_var; in mykernel.cu.
  • add float value = 10.0f; cudaMemcpyToSymbol(my_const_var, &value, sizeof(value)); before my kernel invocation in main.cu.

After doing all that, it looks like cudaMemcpyToSymbol doesn't copy the actual value, because I get a result of 0 instead of 10. In addition, I always check for CUDA errors and there are none. Can someone tell me what I am doing wrong, and why cudaMemcpyToSymbol does not copy the value to the symbol? I am using a GeForce 9600M (compute capability 1.1) with the latest drivers on Debian Linux and CUDA SDK 5.0. I also tried running cuda-memcheck and got no errors.

Recommended answer

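Since you are attempting to access a variable in one compilation unit that is defined in another compilation unit (main.cu and mykernel.cu), this requires separate device compilation. Without it, each .cu file that includes mykernel.h gets its own copy of my_const_var, so the value that main.cu copies with cudaMemcpyToSymbol never reaches the copy that the kernel in mykernel.cu reads, and no error is reported.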

Unfortunately, separate compilation is only available for devices of compute capability 2.0 or greater.
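For hardware that does meet that requirement, a rough sketch of a separate-compilation layout might look like the listing below. The extern declaration in the header, the single definition in mykernel.cu, the helper copy_and_run, and the build line in the closing comment are all assumptions for illustration, not something taken from the original post. The three files are shown as one listing with the #include "mykernel.h" lines left as comments.

```cuda
// ---- mykernel.h (assumed layout) ----
// Declaration only: every file that includes this refers to the same symbol.
extern __constant__ float my_const_var;
__global__ void foo(float* out);

// ---- mykernel.cu ----
// #include "mykernel.h"
__constant__ float my_const_var;  // the single definition

__global__ void foo(float* out)
{
    unsigned int idx = blockIdx.x * blockDim.x + threadIdx.x;
    out[idx] = my_const_var;
}

// ---- main.cu ----
// #include "mykernel.h"
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: set the constant, launch foo once, and return out[0],
// or -1.0f if any CUDA call reports an error.
float copy_and_run(float value)
{
    float* d_out = NULL;
    float h_out = -1.0f;
    if (cudaMalloc((void**)&d_out, sizeof(float)) != cudaSuccess) return -1.0f;
    bool ok = cudaMemcpyToSymbol(my_const_var, &value, sizeof(value)) == cudaSuccess;
    if (ok) {
        foo<<<1, 1>>>(d_out);
        // cudaMemcpy waits for the kernel and also surfaces launch errors.
        ok = cudaMemcpy(&h_out, d_out, sizeof(float), cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(d_out);
    return ok ? h_out : -1.0f;
}

int main()
{
    std::printf("out[0] = %f\n", copy_and_run(10.0f));
    return 0;
}

// Assumed build line for the three-file layout (relocatable device code on,
// target of compute capability 2.0 or newer; adjust -arch to your GPU):
//   nvcc -arch=sm_20 -rdc=true main.cu mykernel.cu -o app
```

The point of the split is that the header only declares the symbol while exactly one .cu file defines it, so the device linker can resolve both files to the same constant. Newer toolkits no longer accept sm_20, so the -arch value is only a placeholder for whatever the target device supports.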

You can work around this on pre-cc 2.0 devices by putting all the CUDA code that must reference a given variable in the same file (the file where the variable is declared).
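As a minimal sketch of that workaround, the constant, the kernel, and the cudaMemcpyToSymbol call can all live in one .cu file. The file name single_file.cu and the helpers check and run_foo are hypothetical names introduced here, and the launch configuration is simplified to one block of one thread.

```cuda
// single_file.cu (hypothetical name): everything that touches my_const_var
// lives in one compilation unit, so host and kernel see the same symbol.
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

__constant__ float my_const_var;  // defined in the same file as its users

__global__ void foo(float* out)
{
    unsigned int idx = blockIdx.x * blockDim.x + threadIdx.x;
    out[idx] = my_const_var;
}

// Hypothetical helper: abort with a message if a CUDA call failed.
static void check(cudaError_t err, const char* what)
{
    if (err != cudaSuccess) {
        std::fprintf(stderr, "%s: %s\n", what, cudaGetErrorString(err));
        std::exit(EXIT_FAILURE);
    }
}

// Hypothetical helper: copy `value` into the constant, run foo once,
// and return the single element the kernel wrote.
float run_foo(float value)
{
    float* d_out = NULL;
    float h_out = 0.0f;
    check(cudaMalloc((void**)&d_out, sizeof(float)), "cudaMalloc");
    check(cudaMemcpyToSymbol(my_const_var, &value, sizeof(value)),
          "cudaMemcpyToSymbol");
    foo<<<1, 1>>>(d_out);
    check(cudaGetLastError(), "kernel launch");
    check(cudaMemcpy(&h_out, d_out, sizeof(float), cudaMemcpyDeviceToHost),
          "cudaMemcpy");
    check(cudaFree(d_out), "cudaFree");
    return h_out;
}

int main()
{
    std::printf("out[0] = %f\n", run_foo(10.0f));
    return 0;
}
```

Because there is only one compilation unit, no relocatable device code is needed, and a plain nvcc single_file.cu build with an -arch value matching the device should be enough.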
