How does cuBLAS implement asynchronous scalar variable transfer?

Problem description

Many cuBLAS and cuSPARSE functions take scalar arguments that can be passed through either a host pointer or a device pointer, such as the alpha and beta parameters described here: http://docs.nvidia.com/cuda/cublas/#cublas-lt-t-gt-gemm

How is this actually implemented? If the scalar is in host memory, I assume the library would need to allocate memory on the device and then call cudaMemcpyAsync to copy the data. However, calling cudaMalloc would make the function call synchronous. How can we solve this problem?

Recommended answer