This article covers a problem with assigning a value to a device variable in CUDA and how to resolve it.
Problem description
I'm having trouble trying to assign a value to a device variable and then copying this to a host variable.
I start with d_test and h_test = 0.0. I have a simple kernel to set the device variable, d_test, to 1.0. I then copy this to the host variable h_test and print. The problem is that when I print I get h_test = 0.0. What am I doing wrong? Here's the code:
Instead of cudaMemcpy, use cudaMemcpyFromSymbol when copying from a global __device__ variable.
Here is the complete solution:
    // -*- mode: C -*-
    #include <stdio.h>
    #include <stdlib.h>
    #include <cuda_runtime.h>

    // device variable and kernel
    __device__ float d_test;

    __global__ void kernel1() { d_test = 1.0f; }

    int main() {
        // initialise variables
        float h_test = 0.0f;
        float zero = 0.0f;
        // Zero the device variable through the symbol API. Calling
        // cudaMemset(&d_test, ...) from the host does not work, because
        // &d_test is not a device address in host code.
        cudaMemcpyToSymbol(d_test, &zero, sizeof(float));

        // invoke kernel
        kernel1<<<1, 1>>>();

        // Copy device variable to host and print. The symbol is passed
        // directly; string names such as "d_test" are not accepted
        // since CUDA 5.0.
        cudaMemcpyFromSymbol(&h_test, d_test, sizeof(float), 0,
                             cudaMemcpyDeviceToHost);
        printf("%f\n", h_test);
        return 0;
    }

Output:

    $ nvcc test.cu -run
    1.000000
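As a complement to the answer above, here is a minimal sketch of an alternative that keeps plain `cudaMemcpy`: first ask the runtime for the real device address of the `__device__` variable with `cudaGetSymbolAddress`, then use that pointer like any other device pointer. The `CHECK` macro and the helper `read_d_test_via_address` are hypothetical names introduced only for this illustration; they are not part of the original answer.

```cuda
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

// Hypothetical helper macro (an assumption, not from the original post):
// abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "CUDA error: %s (line %d)\n",             \
                    cudaGetErrorString(err_), __LINE__);              \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// device variable and kernel, as in the answer
__device__ float d_test;

__global__ void kernel1() { d_test = 1.0f; }

// Hypothetical helper: runs the kernel and reads d_test back through
// an ordinary device pointer obtained from the symbol.
float read_d_test_via_address(void) {
    float h_test = 0.0f;
    float *d_ptr = NULL;

    // Resolve the symbol to an actual device address.
    CHECK(cudaGetSymbolAddress((void **)&d_ptr, d_test));

    // With a real device pointer, cudaMemset and cudaMemcpy are valid.
    CHECK(cudaMemset(d_ptr, 0, sizeof(float)));

    kernel1<<<1, 1>>>();
    CHECK(cudaGetLastError());  // catches kernel launch errors

    // cudaMemcpy on the default stream waits for the kernel to finish.
    CHECK(cudaMemcpy(&h_test, d_ptr, sizeof(float), cudaMemcpyDeviceToHost));
    return h_test;
}

int main(void) {
    printf("%f\n", read_d_test_via_address());
    return 0;
}
```

Checking every return code is the main design choice here: in the original problem the failing copy reported an error that was never inspected, so `h_test` silently stayed at 0.0.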