Problem description
I want to try the gst_inst_128bit instruction. For the same program, nvvp reports a lot of gst_inst_128bit instructions executed, while the profiler in Nsight reports four times as many gst_inst_32bit instructions. It should be the same program in both cases. How could this happen?
The experiment was run on Linux with CUDA 5.0 on a GTX 580. The program only copies data from one array to another in a kernel function.
In main:
cudaMalloc((void**)&dev_a, NUM * sizeof(float));
cudaMalloc((void**)&dev_b, NUM * sizeof(float));
kernel<<<grid,block>>>((uint4 *)dev_a, (uint4 *)dev_b);
Kernel:
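__global__ void kernel(uint4 *a, uint4 *b){
    // Global thread index; assumes blockDim.x == THREAD_NUM.
    unsigned int id = blockIdx.x * THREAD_NUM + threadIdx.x;
    // Each thread copies LOOP/4 uint4 elements (16 bytes each), one grid-sized stride apart.
    for(unsigned int i = 0; i < LOOP/4; i++){
        b[id + i * GRID_NUM * THREAD_NUM] = a[id + i * GRID_NUM * THREAD_NUM];
    }
    return;
}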
Recommended answer
The profiler in Nsight EE and the standalone Visual Profiler on Linux are based on the same codebase. Please make sure:
- You are using the same executable.
- The environment variable values (e.g. LD_LIBRARY_PATH) are the same.
Please note that the Nsight EE launch UI may be slightly confusing. When you click "Profile" after debugging the debug build, it may actually profile the debug executable, because it tries to keep all the custom launch settings (e.g. command-line arguments, working folder, etc.) you may have set up. From the main menu, click Run->Profile Configurations... to see the settings Nsight uses when profiling your application.
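If it is not obvious which binary each profiler launches, one way to check is to have the program report it itself. The following is only a rough sketch, not part of either profiler: it is host-side C++ that should also compile under nvcc, it assumes Linux (it reads the /proc/self/exe symlink), and the helper names running_executable, env_or_unset and report_launch_context are made up for illustration.

```cpp
#include <cstdio>
#include <cstdlib>
#include <string>
#include <unistd.h>   // readlink (POSIX)

// Returns the absolute path of the running executable, or "" on failure.
// Assumption: Linux, where /proc/self/exe is a symlink to the binary.
std::string running_executable() {
    char buf[4096];
    ssize_t n = readlink("/proc/self/exe", buf, sizeof(buf) - 1);
    if (n < 0) return "";
    return std::string(buf, static_cast<size_t>(n));
}

// Returns the value of an environment variable, or "(unset)" if it is not set.
std::string env_or_unset(const char *name) {
    const char *value = std::getenv(name);
    return value ? std::string(value) : std::string("(unset)");
}

// Hypothetical helper: call once at the top of main() before launching the kernel.
void report_launch_context() {
    std::fprintf(stderr, "executable:      %s\n", running_executable().c_str());
    std::fprintf(stderr, "LD_LIBRARY_PATH: %s\n",
                 env_or_unset("LD_LIBRARY_PATH").c_str());
}
```

Run the application once from nvvp and once from Nsight EE and compare the two reports in the console output. If the executable paths differ (for example, one points at a debug build directory), the two profilers are not measuring the same binary, which could explain the different store instruction counts.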