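I am trying to create a class for measuring GPU time and use it in my CUDA program, but for some reason I cannot get it to compile. My header and source files look like this: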

GPUTimer.h:

#ifndef GPUTIMER_H_
#define GPUTIMER_H_

class GPUTimer
{
    public:
        GPUTimer();
        virtual ~GPUTimer();

        void start_timer(cudaStream_t stream_id = 0);
        void stop_timer(cudaStream_t stream_id = 0);
        void print_elapsed_time();

    public:
        bool m_Started;
        bool m_Stopped;
        cudaEvent_t m_StartEvent;
        cudaEvent_t m_StopEvent;
};

#endif /* GPUTIMER_H_ */

GPUTimer.cpp:
#include "GPUTimer.h"
#include "kernels.h"

GPUTimer::GPUTimer()
{
    m_Started = false;
    m_Stopped = false;
}

GPUTimer::GPUTimer() : m_Started(false), m_Stopped(false)
{
    cudaEventCreate(&m_StartEvent); CUDA_CHECK;
    cudaEventCreate(&m_StopEvent);  CUDA_CHECK;
}

GPUTimer::~GPUTimer()
{
    cudaEventDestroy(m_StartEvent); CUDA_CHECK;
    cudaEventDestroy(m_StopEvent);  CUDA_CHECK;
}

// Start event timer
void GPUTimer::start_timer(cudaStream_t stream_id = 0)
{
    cudaEventRecord(m_StartEvent, stream_id); CUDA_CHECK;
    m_Started = true;
    m_Stopped = false;
}

// End event timer
void GPUTimer::stop_timer(cudaStream_t stream_id = 0)
{
   if(!m_Started)
   {
       std::cout << "Timer hasn't started yet. Please call start_timer() before!" << std::endl;
       return;
   }
   cudaEventRecord(m_StopEvent, stream_id); CUDA_CHECK;
   m_Started = false;
   m_Stopped = true;
}

// Print elapsed time
void GPUTimer::print_elapsed_time()
{
    if(!m_Stopped)
    {
        std::cout << "Timer hasn't stopped yet. Please call stop_timer() before!" << std::endl;
        return;
    }
    cudaEventSynchronize(m_StopEvent);
    float elapsed_time = 0.0f;
    cudaEventElapsedTime(&elapsed_time, m_StartEvent, m_StopEvent);

    std::cout << "Elapsed GPU Time: " << elapsed_time         << " msec" << std::endl;
    std::cout << "Elapsed GPU Time: " << elapsed_time / 1000  << " secs" << std::endl;
    std::cout << "Elapsed GPU Time: " << elapsed_time / 60000 << " mins" << std::endl;
}

Inside kernels.h I include <cuda_runtime.h>, but when I try to compile the program, it says that cudaStream_t has not been declared:
error: ‘cudaStream_t’ has not been declared
         void start_timer(cudaStream_t stream_id = 0);

Any idea what the problem might be?

Best answer

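You should add #include <cuda_runtime.h> (or #include "kernels.h") to the header file GPUTimer.h. Otherwise, when the compiler reads the code in GPUTimer.h, cudaStream_t really is undeclared at that point. Your header files should be self-sufficient: each one should include the headers that declare all the types it uses.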

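Regarding the comment: the default argument should be given only in the function declaration (the header), not repeated in the out-of-class definition (the source file). C++ does not allow a default argument to be specified twice for the same parameter, which is a good thing, because otherwise you could accidentally provide two different values.
So write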

void GPUTimer::start_timer(cudaStream_t stream_id)
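
As a rough illustration of the rule above, here is a minimal sketch that compiles without the CUDA toolkit. The names Timer, StreamId, and demo_default_argument are hypothetical stand-ins, not part of the original code: StreamId replaces cudaStream_t, and start_timer simply returns the stream id so the effect of the default is visible.

```cpp
#include <cassert>
#include <iostream>

// Hypothetical stand-in for cudaStream_t so the sketch builds without CUDA.
using StreamId = int;

class Timer
{
    public:
        // Declaration: the default argument is written here, and only here.
        StreamId start_timer(StreamId stream_id = 0);
};

// Definition: same parameter list, but without "= 0".
// Repeating the default here would be rejected by the compiler.
StreamId Timer::start_timer(StreamId stream_id)
{
    std::cout << "recording on stream " << stream_id << std::endl;
    return stream_id;
}

// Exercises both call forms: with the default and with an explicit value.
void demo_default_argument()
{
    Timer t;
    assert(t.start_timer() == 0);   // default from the declaration is used
    assert(t.start_timer(3) == 3);  // explicit argument overrides the default
}
```

Calling demo_default_argument() from main is enough to try it. The same pattern should carry over to stop_timer in the original class.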
