Passing a 2D char array to a CUDA kernel

Question

I need help transferring a char[][] to a CUDA kernel. This is my code:

__global__
void kernel(char** BiExponent){
  for(int i=0; i<500; i++)
     printf("%c",BiExponent[1][i]); // I want print line 1
}

int main(){
  char (*Bi2dChar)[500] = new char [5000][500];
  char **dev_Bi2dChar;

  ...//HERE I INPUT DATA TO Bi2dChar

  size_t host_orig_pitch = 500 * sizeof(char);
  size_t pitch;
  cudaMallocPitch((void**)&dev_Bi2dChar, &pitch, 500 * sizeof(char), 5000);
  cudaMemcpy2D(dev_Bi2dChar, pitch, Bi2dChar, host_orig_pitch, 500 * sizeof(char), 5000, cudaMemcpyHostToDevice);
  kernel <<< 1, 512 >>> (dev_Bi2dChar);
  free(Bi2dChar); cudaFree(dev_Bi2dChar);
}

I use: nvcc.exe -gencode=arch=compute_20,code=\"sm_20,compute_20\" --use-local-env --cl-version 2012 -ccbin

Thanks for your help.

Recommended answer

cudaMemcpy2D doesn't actually handle 2-dimensional (i.e. double-pointer, **) arrays in C. Note that the documentation indicates it expects single pointers, not double pointers: both its destination and source arguments are plain void* pointers to linear memory.

Generally speaking, moving arbitrary double-pointer C arrays between the host and the device is more complicated than moving a single-pointer array.
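To illustrate where the extra complexity comes from, here is a minimal host-side sketch. A char** view needs a separate table of row pointers on top of the character data, while a contiguous char[rows][cols] block has no such table. The helper name make_row_table is hypothetical and not part of the original post or of CUDA.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: builds the table of row pointers that a char** view
// requires on top of a contiguous rows x cols buffer.
inline std::vector<char*> make_row_table(char* flat, std::size_t rows, std::size_t cols) {
    assert(flat != nullptr);
    std::vector<char*> table(rows);
    for (std::size_t r = 0; r < rows; ++r) {
        table[r] = flat + r * cols;  // address of the first element of row r
    }
    return table;
}
```

On the device, every entry of such a table would have to hold a device address, so the table itself has to be built and copied in addition to the data. Casting a contiguous allocation directly to char**, as the code in the question does, makes the kernel interpret character bytes as addresses.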

If you really want to handle the double-pointer array, search for "CUDA 2D array" using the search box in the upper right-hand corner of this page, and you'll find various examples of how to do it (for example, the answer given by @talonmies).

Often, an easier approach is simply to "flatten" the array so it can be referenced by a single pointer, i.e. char[] instead of char[][], and then use index arithmetic to simulate 2-dimensional access.
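As a minimal host-side sketch of that index arithmetic, assuming row-major storage: element (row, col) of an array whose rows are num_cols elements long sits at offset row * num_cols + col. The helper names flat_index and at are hypothetical, introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: maps a (row, col) pair onto a flat, row-major buffer
// whose rows are num_cols elements long.
inline std::size_t flat_index(std::size_t row, std::size_t col, std::size_t num_cols) {
    return row * num_cols + col;
}

// Reads element (row, col) from a flattened 2D char array.
inline char at(const std::vector<char>& flat, std::size_t row, std::size_t col,
               std::size_t num_cols) {
    assert(col < num_cols);
    assert(flat_index(row, col, num_cols) < flat.size());
    return flat[flat_index(row, col, num_cols)];
}
```

The same expression can be used inside a kernel on a char* argument. Note that the multiplier is the row length (the number of columns), not the number of rows.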

Your flattened code would look something like this (the code you provided is an uncompilable, incomplete snippet, so mine is too):

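#include <cstdio>

#define XDIM 5000  // number of rows
#define YDIM 500   // characters per row

__global__
void kernel(char* BiExponent){
  // Row r starts at offset r*YDIM in the flattened array
  for(int i=0; i<YDIM; i++)
     printf("%c",BiExponent[(1*YDIM)+i]); // I want print line 1
}

int main(){
  char (*Bi2dChar)[YDIM] = new char [XDIM][YDIM];
  char *dev_Bi2dChar;

  ...//HERE I INPUT DATA TO Bi2dChar

  cudaMalloc((void**)&dev_Bi2dChar, XDIM*YDIM * sizeof(char));
  // cudaMemcpy takes (dst, src, byte count, kind); there is no pitch argument
  cudaMemcpy(dev_Bi2dChar, &(Bi2dChar[0][0]), XDIM*YDIM * sizeof(char), cudaMemcpyHostToDevice);
  kernel <<< 1, 1 >>> (dev_Bi2dChar); // one thread, so the row is printed once
  cudaDeviceSynchronize(); // wait for the kernel so its printf output is flushed
  delete[] Bi2dChar; cudaFree(dev_Bi2dChar); // new[] pairs with delete[], not free()
}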

If you want a pitched array, you can create it similarly, but you will still do so as a single-pointer array, not a double-pointer array.
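For the pitched case, the only change to the index arithmetic is that the row stride becomes the pitch in bytes, as reported by cudaMallocPitch, rather than the logical row width. The following host-side sketch shows that arithmetic on an ordinary buffer; the helper names pitched_row and pitched_at are hypothetical, and the pitch value is supplied by the caller rather than obtained from CUDA.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: returns a pointer to the start of `row` in a pitched
// allocation. `pitch` is the row stride in BYTES, which may be larger than
// the logical row width because of alignment padding.
inline char* pitched_row(char* base, std::size_t pitch, std::size_t row) {
    assert(base != nullptr);
    return base + row * pitch;
}

// Accesses element (row, col) of a pitched char array through a single pointer.
inline char& pitched_at(char* base, std::size_t pitch, std::size_t row, std::size_t col) {
    assert(col < pitch);
    return pitched_row(base, pitch, row)[col];
}
```

Because the elements here are char, byte offsets and element offsets coincide. For wider element types, the row address would be computed on a char* first and then cast to the element pointer type.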

08-15 02:38