Converting between raw pointers and Thrust iterators

Question

I want to use the Thrust library to calculate the prefix sum of a device array in CUDA. My array is allocated with cudaMalloc(). My requirement is as follows:

main()
{
     Launch kernel 1 on data allocated through cudaMalloc()
     // This kernel will populate some data d.
     Use thrust to calculate prefix sum of d.
     Launch kernel 2 on prefix sum.
}

I want to use Thrust somewhere between my kernels, so I need a way to convert raw pointers to device iterators and back. What is wrong with the following code?

int main()
{
    int *a;
    cudaMalloc((void**)&a,N*sizeof(int));
    thrust::device_ptr<int> d=thrust::device_pointer_cast(a);
    thrust::device_vector<int> v(N);
    thrust::exclusive_scan(a,a+N,v);
    return 0;
}


Recommended answer

A complete working example from your latest edit would look like this:

#include <thrust/device_ptr.h>
#include <thrust/device_vector.h>
#include <thrust/scan.h>
#include <thrust/fill.h>
#include <thrust/copy.h>
#include <cstdio>

int main()
{
    const int N = 16;
    int * a;
    cudaMalloc((void**)&a, N*sizeof(int));
    thrust::device_ptr<int> d = thrust::device_pointer_cast(a);
    thrust::fill(d, d+N, 2);
    thrust::device_vector<int> v(N);
    thrust::exclusive_scan(d, d+N, v.begin());

    int v_[N];
    thrust::copy(v.begin(), v.end(), v_);
    for(int i=0; i<N; i++)
        printf("%d %d\n", i, v_[i]);

    return 0;
}

Here is what you got wrong:

  1. N is not defined anywhere
  2. passing the raw device pointer a rather than the device_ptr d as the input iterator to exclusive_scan
  3. passing the device_vector v to exclusive_scan rather than the appropriate iterator v.begin()
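As an aside on item 2: more recent Thrust releases (1.7 and later, as far as I know) also let you pass raw device pointers directly, provided you give the thrust::device execution policy as the first argument so that Thrust knows the pointers refer to device memory. A minimal sketch under that assumption; the helper name scan_raw is made up for illustration:

```cuda
#include <thrust/execution_policy.h>
#include <thrust/scan.h>
#include <cuda_runtime.h>

// Hypothetical helper: exclusive prefix sum over raw device pointers.
// d_in and d_out must both point to device memory holding at least n ints.
void scan_raw(const int* d_in, int* d_out, int n)
{
    // thrust::device selects the device backend explicitly,
    // so no device_ptr wrapper is needed around the raw pointers.
    thrust::exclusive_scan(thrust::device, d_in, d_in + n, d_out);
}
```

Without the policy argument, Thrust would treat the raw pointers as host iterators, which is why the original code needed device_ptr.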

Attention to detail was all that was lacking to make this work. And work it does:

$ nvcc -arch=sm_12 -o thrust_kivekset thrust_kivekset.cu
$ ./thrust_kivekset

0 0
1 2
2 4
3 6
4 8
5 10
6 12
7 14
8 16
9 18
10 20
11 22
12 24
13 26
14 28
15 30
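To see why the second column reads 0, 2, 4, ...: an exclusive scan writes to each output position the sum of all inputs before it, so with every input equal to 2 the i-th output is 2*i. A host-only sketch of that definition (plain C++, not Thrust; the function name is made up here) can serve as a reference when checking device results:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical host-side reference: exclusive prefix sum starting from 0.
std::vector<int> exclusive_scan_reference(const std::vector<int>& in)
{
    std::vector<int> out(in.size());
    int running = 0;                      // sum of all elements seen so far
    for (std::size_t i = 0; i < in.size(); ++i) {
        out[i] = running;                 // excludes in[i] itself
        running += in[i];
    }
    return out;
}
```

Calling it on sixteen 2s gives the same sixteen values as the output above.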






Edit:

thrust::device_vector.data() will return a thrust::device_ptr which points to the first element of the vector. thrust::device_ptr.get() will return a raw device pointer. Therefore

cudaMemcpy(v_, v.data().get(), N*sizeof(int), cudaMemcpyDeviceToHost);

and

thrust::copy(v.begin(), v.end(), v_);

are functionally equivalent in this example.
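Putting both directions together for the kernel / scan / kernel layout in the question, a rough sketch might look like the following. The two kernels and the function name scan_between_kernels are placeholders invented for illustration, and the scan is done in place, which Thrust's scan documentation permits:

```cuda
#include <thrust/device_ptr.h>
#include <thrust/scan.h>
#include <cuda_runtime.h>

// Hypothetical kernel 1: fills d with some data (here, every element is 2).
__global__ void populate(int* d, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) d[i] = 2;
}

// Hypothetical kernel 2: consumes the prefix sum (here, adds 1 to each entry).
__global__ void consume(int* prefix, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) prefix[i] += 1;
}

// Runs kernel 1, a Thrust scan, then kernel 2 on one cudaMalloc'ed buffer.
void scan_between_kernels(int* d_raw, int n)
{
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    populate<<<blocks, threads>>>(d_raw, n);

    // Raw pointer -> device_ptr, so Thrust knows the memory is on the device.
    thrust::device_ptr<int> d = thrust::device_pointer_cast(d_raw);
    thrust::exclusive_scan(d, d + n, d);  // in-place scan

    // device_ptr -> raw pointer again for the next kernel launch.
    int* back = thrust::raw_pointer_cast(d);
    consume<<<blocks, threads>>>(back, n);
}
```

Here thrust::raw_pointer_cast(d) and d.get() yield the same raw pointer; with a device_vector v, thrust::raw_pointer_cast(v.data()) plays the same role.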

