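I am currently trying to use the thrust::upper_bound function, and I am having trouble with the arguments I supply to it. I would like to use the CUDA vector types, in particular double3, but when I use this type I get several Thrust library errors.
The code block I am running is as follows: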
/********************************************************************************
 eos_search_gpu()
 purpose --- kernel to find the upper bound index for the
             interpolation values
 arguments --
   y     --- input double3 values for which we are searching
   my    --- input int number of values for which we are searching
   x     --- input double3 array of structs containing the data table
             values for x, y, and f corresponding to structs
             ".x", ".y", and ".z"
   n     --- input int number of data values in the table
   dim_x --- input int number of data values in the x-direction of table
   j[]   --- input/output int[] array of ints that contains
             the index of the (x,y,f) position of the upper bound
 library calls --
   __host__ __device__ ForwardIterator thrust::upper_bound(
       const thrust::detail::execution_policy_base<DerivedPolicy>& exec,
       ForwardIterator first,
       ForwardIterator last,
       const LessThanComparable & value
   )
   exec  --- the execution policy to use for parallelization
   first --- the beginning of the ordered sequence
   last  --- the end of the ordered sequence
   value --- the value to be searched.
   Returns: the furthermost iterator i, such that value < *i is false
   const detail::seq_t thrust::seq
     an execution policy which requires an algorithm invocation to execute
     sequentially in the current thread.
********************************************************************************/
__global__ void eos_search_gpu(const double3* y, const int my,
                               const double3* x, const int n,
                               const int dim_x, int * j){
  int i = threadIdx.x + blockDim.x * blockIdx.x;
  if ( i < my) {
    const double ptr = thrust::upper_bound(thrust::seq, x[0].y , x[n-1].y, y[i].y);
    j[i] = (ptr - x[i].y - 1);
  }
}
The error messages shown are as follows:
/opt/cudatoolkit/9.1/bin/../targets/x86_64-linux/include/thrust/iterator/iterator_traits.h(45): error: a class or namespace qualified name is required
detected during:
instantiation of class "thrust::iterator_traits<T> [with T=double]"
/opt/cudatoolkit/9.1/bin/../targets/x86_64-linux/include/thrust/iterator/detail/iterator_traits.inl(53): here
instantiation of class "thrust::iterator_difference<Iterator> [with Iterator=double]"
/opt/cudatoolkit/9.1/bin/../targets/x86_64-linux/include/thrust/system/detail/sequential/binary_search.h(102): here
instantiation of "ForwardIterator thrust::system::detail::sequential::upper_bound(thrust::system::detail::sequential::execution_policy<DerivedPolicy> &, ForwardIterator, ForwardIterator, const T &, StrictWeakOrdering) [with DerivedPolicy=thrust::detail::seq_t, ForwardIterator=double, T=double, StrictWeakOrdering=thrust::system::detail::generic::detail::binary_search_less]"
/opt/cudatoolkit/9.1/bin/../targets/x86_64-linux/include/thrust/detail/binary_search.inl(83): here
instantiation of "ForwardIterator thrust::upper_bound(const thrust::detail::execution_policy_base<DerivedPolicy> &, ForwardIterator, ForwardIterator, const T &, StrictWeakOrdering) [with DerivedPolicy=thrust::detail::seq_t, ForwardIterator=double, T=double, StrictWeakOrdering=thrust::system::detail::generic::detail::binary_search_less]"
/opt/cudatoolkit/9.1/bin/../targets/x86_64-linux/include/thrust/system/detail/generic/binary_search.inl(225): here
instantiation of "ForwardIterator thrust::system::detail::generic::upper_bound(thrust::execution_policy<DerivedPolicy> &, ForwardIterator, ForwardIterator, const T &) [with DerivedPolicy=thrust::detail::seq_t, ForwardIterator=double, T=double]"
/opt/cudatoolkit/9.1/bin/../targets/x86_64-linux/include/thrust/detail/binary_search.inl(69): here
instantiation of "ForwardIterator thrust::upper_bound(const thrust::detail::execution_policy_base<DerivedPolicy> &, ForwardIterator, ForwardIterator, const LessThanComparable &) [with DerivedPolicy=thrust::detail::seq_t, ForwardIterator=double, LessThanComparable=double]"
Interpolation_cuda.cu(254): here
/opt/cudatoolkit/9.1/bin/../targets/x86_64-linux/include/thrust/iterator/iterator_traits.h(45): error: global-scope qualifier (leading "::") is not allowed
detected during:
instantiation of class "thrust::iterator_traits<T> [with T=double]"
/opt/cudatoolkit/9.1/bin/../targets/x86_64-linux/include/thrust/iterator/detail/iterator_traits.inl(53): here
instantiation of class "thrust::iterator_difference<Iterator> [with Iterator=double]"
/opt/cudatoolkit/9.1/bin/../targets/x86_64-linux/include/thrust/system/detail/sequential/binary_search.h(102): here
instantiation of "ForwardIterator thrust::system::detail::sequential::upper_bound(thrust::system::detail::sequential::execution_policy<DerivedPolicy> &, ForwardIterator, ForwardIterator, const T &, StrictWeakOrdering) [with DerivedPolicy=thrust::detail::seq_t, ForwardIterator=double, T=double, StrictWeakOrdering=thrust::system::detail::generic::detail::binary_search_less]"
/opt/cudatoolkit/9.1/bin/../targets/x86_64-linux/include/thrust/detail/binary_search.inl(83): here
instantiation of "ForwardIterator thrust::upper_bound(const thrust::detail::execution_policy_base<DerivedPolicy> &, ForwardIterator, ForwardIterator, const T &, StrictWeakOrdering) [with DerivedPolicy=thrust::detail::seq_t, ForwardIterator=double, T=double, StrictWeakOrdering=thrust::system::detail::generic::detail::binary_search_less]"
/opt/cudatoolkit/9.1/bin/../targets/x86_64-linux/include/thrust/system/detail/generic/binary_search.inl(225): here
instantiation of "ForwardIterator thrust::system::detail::generic::upper_bound(thrust::execution_policy<DerivedPolicy> &, ForwardIterator, ForwardIterator, const T &) [with DerivedPolicy=thrust::detail::seq_t, ForwardIterator=double, T=double]"
/opt/cudatoolkit/9.1/bin/../targets/x86_64-linux/include/thrust/detail/binary_search.inl(69): here
instantiation of "ForwardIterator thrust::upper_bound(const thrust::detail::execution_policy_base<DerivedPolicy> &, ForwardIterator, ForwardIterator, const LessThanComparable &) [with DerivedPolicy=thrust::detail::seq_t, ForwardIterator=double, LessThanComparable=double]"
Interpolation_cuda.cu(254): here
/opt/cudatoolkit/9.1/bin/../targets/x86_64-linux/include/thrust/iterator/iterator_traits.h(45): error: expected a ";"
detected during:
instantiation of class "thrust::iterator_traits<T> [with T=double]"
/opt/cudatoolkit/9.1/bin/../targets/x86_64-linux/include/thrust/iterator/detail/iterator_traits.inl(53): here
instantiation of class "thrust::iterator_difference<Iterator> [with Iterator=double]"
/opt/cudatoolkit/9.1/bin/../targets/x86_64-linux/include/thrust/system/detail/sequential/binary_search.h(102): here
instantiation of "ForwardIterator thrust::system::detail::sequential::upper_bound(thrust::system::detail::sequential::execution_policy<DerivedPolicy> &, ForwardIterator, ForwardIterator, const T &, StrictWeakOrdering) [with DerivedPolicy=thrust::detail::seq_t, ForwardIterator=double, T=double, StrictWeakOrdering=thrust::system::detail::generic::detail::binary_search_less]"
/opt/cudatoolkit/9.1/bin/../targets/x86_64-linux/include/thrust/detail/binary_search.inl(83): here
instantiation of "ForwardIterator thrust::upper_bound(const thrust::detail::execution_policy_base<DerivedPolicy> &, ForwardIterator, ForwardIterator, const T &, StrictWeakOrdering) [with DerivedPolicy=thrust::detail::seq_t, ForwardIterator=double, T=double, StrictWeakOrdering=thrust::system::detail::generic::detail::binary_search_less]"
/opt/cudatoolkit/9.1/bin/../targets/x86_64-linux/include/thrust/system/detail/generic/binary_search.inl(225): here
instantiation of "ForwardIterator thrust::system::detail::generic::upper_bound(thrust::execution_policy<DerivedPolicy> &, ForwardIterator, ForwardIterator, const T &) [with DerivedPolicy=thrust::detail::seq_t, ForwardIterator=double, T=double]"
/opt/cudatoolkit/9.1/bin/../targets/x86_64-linux/include/thrust/detail/binary_search.inl(69): here
instantiation of "ForwardIterator thrust::upper_bound(const thrust::detail::execution_policy_base<DerivedPolicy> &, ForwardIterator, ForwardIterator, const LessThanComparable &) [with DerivedPolicy=thrust::detail::seq_t, ForwardIterator=double, LessThanComparable=double]"
Interpolation_cuda.cu(254): here
I would like to know whether Thrust supports the use of CUDA vector types, or whether I am doing something wrong.
Best answer
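You need to satisfy all of the input types that the Thrust algorithm expects. You are not doing that: almost every quantity you have defined fails to line up with what Thrust expects.
For starters, we need actual iterators. In device code, this means pointers. Your call passes x[0].y and x[n-1].y, which are double values rather than pointers, which is why the error trace reports ForwardIterator=double. Thrust needs to be able to dereference the iterator/pointer, and then you have to tell Thrust what to do with the resulting quantity. For that we need a properly defined functor. You may wish to read the thrust quick start guide to learn about functor definition and usage. Finally, the sensible pointer/iterator here is a pointer to double3, so we need to make nearly everything work with double3. Note that we need to choose the version of upper_bound that accepts our own custom functor, so that we can operate properly on double3 quantities (what you get when you dereference the iterator/pointer).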
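As a rough host-side illustration of that call shape, here is a minimal sketch that uses std::upper_bound from the C++ standard library in place of thrust::upper_bound, so that it builds without CUDA. The struct Double3 (a stand-in for CUDA's double3), the comparator CompareY, and the helper upper_bound_by_y are hypothetical names introduced only for this example, and the table is assumed to be sorted in ascending order of .y.

```cpp
#include <algorithm>

// Hypothetical stand-in for CUDA's double3 so the sketch builds without CUDA.
struct Double3 {
    double x, y, z;
};

// Comparator in the spirit of my_comp_functor: orders elements by .y only.
struct CompareY {
    bool operator()(const Double3& a, const Double3& b) const {
        return a.y < b.y;
    }
};

// Pointers act as the iterators; the comparator tells the algorithm how to
// compare the Double3 values obtained by dereferencing them.
// Returns a pointer to the first element whose .y is greater than value.y,
// or table + n if there is no such element.
// Assumption: table[0..n) is sorted in ascending order of .y.
const Double3* upper_bound_by_y(const Double3* table, int n,
                                const Double3& value) {
    return std::upper_bound(table, table + n, value, CompareY{});
}
```

The first two arguments are pointers into the table and the searched value is a whole Double3, so every argument has the kind of type the algorithm expects.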
Something like this may help:
#include <thrust/binary_search.h>
#include <thrust/execution_policy.h>

// Comparison functor: orders two elements by their .y member.
struct my_comp_functor{
  template <typename T>
  __host__ __device__
  bool operator()(T &t1, T &t2) {
    return (t1.y < t2.y);}
};

__global__ void eos_search_gpu(const double3* y, const int my,
                               const double3* x, const int n,
                               const int dim_x, int * j, my_comp_functor my_comp){
  int i = threadIdx.x + blockDim.x * blockIdx.x;
  if ( i < my) {
    // The pointers x and x+n serve as the iterators; the result is also a pointer.
    const double3 *ptr = thrust::upper_bound(thrust::seq, x, x+n, y[i], my_comp);
    j[i] = (ptr[0].y - x[i].y - 1);
  }
}

int main(){
  double3 *d_y, *d_x;
  int *d_j;
  // Placeholder allocations; the buffers are never filled with data.
  cudaMalloc(&d_y, 1024);
  cudaMalloc(&d_x, 1024);
  cudaMalloc(&d_j, 1024);
  struct my_comp_functor my_obj;
  // Launched with my = 0 and n = 0, so no thread actually performs a search.
  eos_search_gpu<<<1,1>>>(d_y, 0, d_x, 0, 0, d_j, my_obj);
  cudaDeviceSynchronize();
}
(The code above compiles for me on CUDA 9.2 without any compile errors, but it is obviously not designed to be functional or useful.)
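Finally, it seems odd to me that you are stuffing a double quantity into j[i] (which is an integer), but it's your code. Also, I may have the ordering wrong in that functor, so you may need to change < to >. Note that I have added a parameter to the kernel, which affects how you call it: you will need to instantiate a my_comp_functor object in host code and pass it to the kernel in the proper position. Finally, it seems you are doing a vectorized search; note that Thrust has vectorized searches available, which might eliminate the need for this kernel.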
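The two open points above (what to store in j[i], and whether the comparison should be < or >) can be sketched on the host as follows. This is only an illustration under stated assumptions: it uses std::upper_bound rather than Thrust, Double3 is a hypothetical stand-in for double3, upper_bound_indices is a hypothetical helper, and the table is assumed to be sorted in ascending order of .y. It also assumes Thrust follows the standard library convention of calling the comparator as comp(value, element).

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for CUDA's double3 so the sketch builds without CUDA.
struct Double3 {
    double x, y, z;
};

// For each query, store the integer index of its upper bound in the table.
// Assumption: table is sorted in ascending order of .y.
std::vector<int> upper_bound_indices(const std::vector<Double3>& table,
                                     const std::vector<Double3>& queries) {
    // std::upper_bound calls comp(value, element), so "<" matches a table
    // sorted ascending by .y; a table sorted descending would need ">".
    auto by_y = [](const Double3& value, const Double3& element) {
        return value.y < element.y;
    };

    std::vector<int> j;
    j.reserve(queries.size());
    for (const Double3& q : queries) {
        auto it = std::upper_bound(table.begin(), table.end(), q, by_y);
        // Iterator difference is an element count (an integer), unlike a
        // difference of .y values, which would be a double.
        j.push_back(static_cast<int>(it - table.begin()));
    }
    return j;
}
```

In the kernel, the analogous integer would be ptr - x. The kernel in the question subtracts 1 from its result; applied here, that would give the index of the last element whose .y does not exceed the query, or -1 when no such element exists.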
Regarding compiler-errors - Thrust support for CUDA vector types, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/51274740/