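This article explains why GPU code performs poorly when compiled with the nvcc compiler's -G option and what to do about it. It may be a useful reference for anyone running into the same problem.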

Problem description

I am running some tests and noticed that compiling with the -G option gives me worse performance than compiling without it.

I have checked the NVIDIA documentation:

--device-debug (-G)
    Generate debug information for device code.

But that does not tell me why performance is so bad. Where and when is this debug information generated, and what could be causing the slowdown?

Solution

Using the -G switch disables most of the compiler optimizations that nvcc would otherwise apply to device code. For this reason, the resulting code will often run slower than code compiled without -G.

This is easy to see by running cuobjdump -sass myexecutable on the executable from each build and comparing the generated device code. You will generally see less device code in the build without -G, and you can also see the differences in specific optimizations.
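To make the comparison concrete, here is a minimal sketch of a micro-benchmark that can be built both ways. The file name bench.cu, the kernel name poly, and the output names are made up for illustration, and the kernel is only an assumption about what a reasonably optimizable workload might look like. The nvcc and cuobjdump command lines in the header comment use only the options discussed above.

```cuda
// bench.cu: hypothetical micro-benchmark (file and kernel names are illustrative).
//
// Build twice and run each binary:
//   nvcc    -o bench_opt bench.cu    (default: device optimizations enabled)
//   nvcc -G -o bench_dbg bench.cu    (device debug: most optimizations disabled)
//
// Dump the generated device code for each build and compare:
//   cuobjdump -sass bench_opt > opt.sass
//   cuobjdump -sass bench_dbg > dbg.sass

#include <cstdio>
#include <cuda_runtime.h>

// Arithmetic-heavy kernel, so the optimizer has something to work on.
__global__ void poly(const float *in, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float x = in[i];
        float acc = 0.0f;
        for (int k = 0; k < 256; ++k) {
            acc = acc * x + 1.0f;   // Horner-style polynomial evaluation
        }
        out[i] = acc;
    }
}

int main()
{
    const int n = 1 << 22;
    const size_t bytes = n * sizeof(float);

    float *in = nullptr, *out = nullptr;
    if (cudaMalloc(&in, bytes) != cudaSuccess ||
        cudaMalloc(&out, bytes) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }
    cudaMemset(in, 0, bytes);   // all-zero input is enough for timing

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    poly<<<blocks, threads>>>(in, out, n);   // warm-up launch, not timed
    cudaDeviceSynchronize();

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);
    poly<<<blocks, threads>>>(in, out, n);   // timed launch
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    printf("kernel time: %.3f ms\n", ms);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(in);
    cudaFree(out);
    return 0;
}
```

Running both binaries and comparing the printed kernel times, along with the sizes of opt.sass and dbg.sass, should show the effect described above. The exact numbers depend on the GPU and the toolkit version.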

One reason for this is that highly optimized device code may eliminate actual lines of source code and actual source variables, which can make the code very difficult to debug. Therefore, to enable debugging, most optimizations are disabled with -G.
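As a rough illustration of that point, consider the small kernel below. It is a hypothetical example (the names scale, original, scaled, and unused are made up), and what the compiler actually does with it depends on the toolkit version and target architecture. In an optimized build, the compiler is free to drop the computation of unused entirely and to keep original and scaled only in registers with no clear mapping back to the source. In a -G build, those lines and variables are expected to remain visible to a debugger.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel, written only to show source lines and variables
// that an optimizing build may eliminate.
__global__ void scale(const float *in, float *out, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    float original = in[i];              // named temporary
    float scaled   = original * factor;  // second named temporary
    float unused   = scaled + 1.0f;      // never read: an optimizer may drop this line
    (void)unused;                        // silences the unused-variable warning

    out[i] = scaled;
}

int main()
{
    const int n = 4;
    float h_in[n]  = {1.0f, 2.0f, 3.0f, 4.0f};
    float h_out[n] = {0.0f};

    float *d_in = nullptr, *d_out = nullptr;
    cudaMalloc(&d_in, sizeof(h_in));
    cudaMalloc(&d_out, sizeof(h_out));
    cudaMemcpy(d_in, h_in, sizeof(h_in), cudaMemcpyHostToDevice);

    scale<<<1, n>>>(d_in, d_out, 2.0f, n);

    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);

    for (int i = 0; i < n; ++i) {
        printf("%g ", h_out[i]);
    }
    printf("\n");

    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}
```

Building this with and without -G and comparing the cuobjdump -sass output for each build is one way to check whether the extra addition survives in the generated code.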

Also note that with Thrust, using the -G switch may result in unpredictable behavior. Newer versions of Thrust should behave better, but there may still be unexpected issues when compiling Thrust code with -G.

That concludes this article on poor GPU performance when compiling with the nvcc compiler's -G option. We hope the recommended answer is helpful.

07-23 09:46