Problem description
I am new to Thrust. All the Thrust presentations and examples I have seen show only host code.
Can I pass a device_vector to my own kernel, and if so, how? Which operations on it are permitted inside kernel/device code?
Recommended answer
As originally written, Thrust is purely a host-side abstraction and cannot be used inside kernels. You can, however, pass the device memory encapsulated in a thrust::device_vector to your own kernel like this:
thrust::device_vector< Foo > fooVector;
// Do something thrust-y with fooVector
Foo* fooArray = thrust::raw_pointer_cast( fooVector.data() );
// Pass raw array and its size to kernel
someKernelCall<<< x, y >>>( fooArray, fooVector.size() );
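To make the snippet above concrete, here is a minimal self-contained sketch of the same pattern. The kernel name scaleKernel, the wrapper scaleOnDevice, and the launch configuration are illustrative assumptions, not part of the original answer.

```cuda
#include <cstddef>
#include <cuda_runtime.h>
#include <thrust/device_vector.h>
#include <thrust/host_vector.h>
#include <thrust/sequence.h>

// Hypothetical kernel: multiplies each of the n elements by `factor`.
// It sees only a raw pointer and a size, never the device_vector itself.
__global__ void scaleKernel(float* data, std::size_t n, float factor)
{
    std::size_t i = static_cast<std::size_t>(blockIdx.x) * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] *= factor;
    }
}

// Hypothetical host wrapper: fills a device_vector with 0, 1, ..., n-1,
// scales it with the hand-written kernel, and returns a host copy.
thrust::host_vector<float> scaleOnDevice(std::size_t n, float factor)
{
    thrust::device_vector<float> d(n);
    if (n == 0) {
        return thrust::host_vector<float>();   // nothing to launch
    }
    thrust::sequence(d.begin(), d.end());      // the "thrust-y" step

    // Extract the raw device pointer owned by the vector.
    float* raw = thrust::raw_pointer_cast(d.data());

    const unsigned int threads = 256;
    const unsigned int blocks = static_cast<unsigned int>((n + threads - 1) / threads);
    scaleKernel<<<blocks, threads>>>(raw, n, factor);
    cudaDeviceSynchronize();

    // Copy device -> host through Thrust.
    thrust::host_vector<float> h = d;
    return h;
}
```

Calling scaleOnDevice(4, 2.0f) from host code should give a host_vector holding 0, 2, 4, 6. The device_vector must outlive the kernel launch, because the raw pointer is only valid while the vector owns the allocation.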
You can also use device memory that was not allocated by Thrust in Thrust algorithms by wrapping the bare CUDA device pointer in a thrust::device_ptr.
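A minimal sketch of that reverse direction, assuming memory obtained from cudaMalloc. The function name fillAndSum and the -1 failure return value are hypothetical choices made only for this illustration.

```cuda
#include <cstddef>
#include <cuda_runtime.h>
#include <thrust/device_ptr.h>
#include <thrust/fill.h>
#include <thrust/reduce.h>

// Hypothetical helper: allocates n ints with plain cudaMalloc, then fills
// and sums them with Thrust algorithms. Returns -1 if allocation fails
// (an illustrative convention, not a Thrust one).
int fillAndSum(std::size_t n, int value)
{
    int* raw = nullptr;
    if (cudaMalloc(&raw, n * sizeof(int)) != cudaSuccess) {
        return -1;
    }

    // Wrap the bare pointer so Thrust knows it refers to device memory.
    thrust::device_ptr<int> first(raw);
    thrust::device_ptr<int> last = first + n;

    thrust::fill(first, last, value);           // runs on the device
    int sum = thrust::reduce(first, last, 0);   // result is returned to the host

    cudaFree(raw);                              // Thrust does not own this memory
    return sum;
}
```

For example, fillAndSum(1000, 3) should return 3000. The wrapper does not take ownership, so the memory still has to be released with cudaFree.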
Edited four and a half years later to add that, as per @JackOLantern's answer, Thrust 1.8 added a sequential execution policy (thrust::seq), which means you can run single-threaded versions of Thrust's algorithms on the device. Note that it still isn't possible to pass a thrust::device_vector directly to a kernel, and device vectors can't be used directly in device code.
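As a rough illustration of the sequential policy, the sketch below has each CUDA thread call thrust::reduce with thrust::seq on its own row of a row-major matrix. The names rowSumKernel and rowSums, and the matrix layout, are assumptions made for the example. It requires Thrust 1.8 or later.

```cuda
#include <cstddef>
#include <cuda_runtime.h>
#include <thrust/device_vector.h>
#include <thrust/execution_policy.h>
#include <thrust/host_vector.h>
#include <thrust/reduce.h>
#include <thrust/sequence.h>

// Hypothetical kernel: one thread per row. Each thread runs a
// single-threaded thrust::reduce over its row via thrust::seq.
__global__ void rowSumKernel(const int* matrix, int* sums, int rows, int cols)
{
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < rows) {
        const int* first = matrix + static_cast<std::size_t>(row) * cols;
        sums[row] = thrust::reduce(thrust::seq, first, first + cols, 0);
    }
}

// Hypothetical host wrapper: builds a rows x cols matrix holding
// 0, 1, 2, ... in row-major order and returns the per-row sums.
thrust::host_vector<int> rowSums(int rows, int cols)
{
    if (rows <= 0 || cols <= 0) {
        return thrust::host_vector<int>();
    }
    thrust::device_vector<int> matrix(static_cast<std::size_t>(rows) * cols);
    thrust::sequence(matrix.begin(), matrix.end());
    thrust::device_vector<int> sums(static_cast<std::size_t>(rows));

    const int threads = 128;
    const int blocks = (rows + threads - 1) / threads;
    // The kernel still receives raw pointers, not the device_vectors.
    rowSumKernel<<<blocks, threads>>>(thrust::raw_pointer_cast(matrix.data()),
                                      thrust::raw_pointer_cast(sums.data()),
                                      rows, cols);
    cudaDeviceSynchronize();

    thrust::host_vector<int> h = sums;
    return h;
}
```

With rows = 2 and cols = 3 the matrix holds 0 through 5, so the returned sums should be 3 and 12.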
Note that in some cases it is also possible to use the thrust::device execution policy to have parallel Thrust execution launched by a kernel as a child grid. This requires separate compilation/device linkage and hardware that supports dynamic parallelism. I am not certain whether all Thrust algorithms support this, but it certainly works with some.