I have implemented the following class using CUDA streams:
class CudaStreams
{
    private:
        int nStreams_;
        cudaStream_t* streams_;
        cudaStream_t active_stream_;
    public:
        // default constructor
        CudaStreams() { }
        // streams initialization
        void InitStreams(const int nStreams = 1) {
            nStreams_ = nStreams;
            // allocate and initialize an array of stream handles
            streams_ = (cudaStream_t*) malloc(nStreams_*sizeof(cudaStream_t));
            for(int i = 0; i < nStreams_; i++) CudaSafeCall(cudaStreamCreate(&(streams_[i])));
            active_stream_ = streams_[0];}
        // default destructor
        ~CudaStreams() {
            for(int i = 0; i<nStreams_; i++) CudaSafeCall(cudaStreamDestroy(streams_[i])); }
};
If I now run this simple code
void main( int argc, char** argv)
{
    streams.InitStreams(1);
    streams.~CudaStreams();
    cudaDeviceReset();
}
then after the cudaDeviceReset() call I receive the following message:
Unhandled exception 0x772f15de in test.exe: 0x00000000.
What should I do before invoking the destructor to avoid this issue when using cudaDeviceReset()?
EDIT
If I add free(streams_); in the destructor, namely
~CudaStreams() {
    for(int i = 0; i<nStreams_; i++) CudaSafeCall(cudaStreamDestroy(streams_[i])); // *
    free(streams_); }
I receive the following error message
cudaSafeCall() failed at C:\Users\Documents\Project\Library\CudaStreams.cuh:79 : unknown error
where line 79 is the one marked with * in the destructor.
Furthermore, if I use the same instructions as in the initialization and the destructor directly inside the code, namely
void main( int argc, char** argv)
{
    int nStreams_ = 3;
    cudaStream_t* streams_ = (cudaStream_t*) malloc(nStreams_*sizeof(cudaStream_t));
    for(int i = 0; i < nStreams_; i++) CudaSafeCall(cudaStreamCreate(&(streams_[i])));
    for(int i = 0; i<nStreams_; i++) CudaSafeCall(cudaStreamDestroy(streams_[i]));
    free(streams_);
    cudaDeviceReset();
}
everything works well. Perhaps it is something connected to a bad use of the class?
There are two problems here, both related to the destructor of your class and scope.
Firstly, let's start with a version of your main() which will work correctly:
int main( int argc, char** argv)
{
    {
        CudaStreams streams;
        streams.InitStreams(1);
    }
    cudaDeviceReset();
    return 0;
}
This works correctly because the destructor for streams is called exactly once (when streams falls out of scope), and before cudaDeviceReset is called.
Your original main() (or a compilable version of it, but more about that later...) fails for two reasons. Let's look at it again:
int main( int argc, char** argv)
{
    CudaStreams streams;
    streams.InitStreams(1);
    streams.~CudaStreams();
    cudaDeviceReset();
    return 0;
}
Here you explicitly call the destructor for streams (which you should almost never do), then cudaDeviceReset, and then the destructor is called again at the return statement when streams falls out of scope. The automatic call of the destructor after the context has been destroyed is the source of the segfault/exception: the cudaStreamDestroy calls are trying to work on streams without a valid CUDA context. So the solution is to make sure that no object whose destructor makes CUDA API calls falls out of scope (or has its destructor called explicitly) when there is no context.
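To make the ordering argument concrete without needing a GPU, here is a minimal sketch in plain C++. MockRuntime, StreamOwner and the two functions are hypothetical names invented for this illustration; MockRuntime merely stands in for the CUDA runtime and records the order of events, it does not reproduce real CUDA behavior.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the CUDA runtime; it only records event order.
struct MockRuntime {
    bool context_alive = true;
    std::vector<std::string> log;

    // Plays the role of cudaStreamDestroy.
    void StreamDestroy() {
        log.push_back(context_alive ? "destroy: ok" : "destroy: no context");
    }
    // Plays the role of cudaDeviceReset.
    void DeviceReset() {
        context_alive = false;
        log.push_back("reset");
    }
};

// Minimal analogue of CudaStreams: it releases its stream in the destructor.
class StreamOwner {
public:
    explicit StreamOwner(MockRuntime& rt) : rt_(rt) {}
    ~StreamOwner() { rt_.StreamDestroy(); }
    StreamOwner(const StreamOwner&) = delete;
    StreamOwner& operator=(const StreamOwner&) = delete;
private:
    MockRuntime& rt_;
};

// The object lives in its own block, so its destructor runs before the reset.
std::vector<std::string> ScopedBeforeReset() {
    MockRuntime rt;
    {
        StreamOwner streams(rt);
    }
    rt.DeviceReset();
    return rt.log;
}

// The reset happens while the object is still alive, so the destructor
// runs afterwards, when there is no context left.
std::vector<std::string> ResetBeforeScopeEnd() {
    MockRuntime rt;
    {
        StreamOwner streams(rt);
        rt.DeviceReset();
    }
    return rt.log;
}
```

In this sketch ScopedBeforeReset() logs the destroy before the reset, while ResetBeforeScopeEnd() logs the reset first and then a destroy with no context, which is the situation that crashes in the real program.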
If we made a third version like this:
int main( int argc, char** argv)
{
    {
        CudaStreams streams;
        streams.InitStreams(1);
        streams.~CudaStreams();
    }
    cudaDeviceReset();
    return 0;
}
you will get a CUDA runtime error, because the destructor gets called twice. The first (explicit) call will work. The second (implicit, when streams falls out of scope) will produce a runtime error: you have a valid context, but are now trying to destroy non-existent streams.
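If early cleanup is really needed, one common pattern (an assumption on my part, not something your class does) is to put the cleanup into an ordinary member function that can safely be called more than once, and have the destructor call it. The sketch below is plain C++ with hypothetical names: MockStreamTable stands in for the runtime's stream handles and counts destroy calls made on handles that no longer exist.

```cpp
#include <vector>

// Hypothetical stand-in for the runtime's table of stream handles.
struct MockStreamTable {
    std::vector<bool> alive;
    int invalid_destroys = 0;  // destroy calls on already-destroyed handles

    int Create() {
        alive.push_back(true);
        return static_cast<int>(alive.size()) - 1;
    }
    void Destroy(int handle) {
        if (alive[handle]) alive[handle] = false;
        else ++invalid_destroys;
    }
};

// Owns a set of handles; Release() may be called any number of times.
class StreamSet {
public:
    StreamSet(MockStreamTable& table, int count) : table_(table) {
        for (int i = 0; i < count; ++i) handles_.push_back(table_.Create());
    }
    ~StreamSet() { Release(); }
    StreamSet(const StreamSet&) = delete;
    StreamSet& operator=(const StreamSet&) = delete;

    void Release() {
        for (int handle : handles_) table_.Destroy(handle);
        handles_.clear();  // a later call finds nothing left to destroy
    }
private:
    MockStreamTable& table_;
    std::vector<int> handles_;
};

// Explicit early cleanup followed by the automatic destructor call.
int InvalidDestroysAfterEarlyRelease() {
    MockStreamTable table;
    {
        StreamSet streams(table, 3);
        streams.Release();  // explicit cleanup instead of streams.~StreamSet()
    }                       // the destructor calls Release() again: no-op
    return table.invalid_destroys;
}
```

Here the second cleanup finds an empty handle list, so no handle is destroyed twice. The same ordering rule still applies: both the Release() call and the end of the object's scope have to come before the device reset.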
As a final comment/question: How hard would it have been to post an actual compilable version of the code you showed in your original question? It literally required 5 extra lines to make it into a proper repro case someone else could actually compile and run. I find it a bit unreasonable to expect others to make an effort to answer what are basically debugging questions if you are not willing to make a similar effort in providing useful code and information which makes everyone's life that much easier. Think about it. [end of rant]