CUDA unknown error (ErrNo: 30) on cudaMalloc()

Problem description

I have searched for the cause without any luck. The call fails even in a program as simple as this:

#include <cuda_runtime.h>
#include <iostream>

using namespace std;

int main() {
  int* n;
  // Print the numeric cudaError_t value returned by the allocation.
  cout << cudaMallocManaged(&n, 4 * sizeof(int)) << endl;
  return 0;
}

The return code is 30 (unknown error). cudaMalloc also fails with the same code.

This is my hardware:

$ lspci | grep NV
01:00.0 3D controller: NVIDIA Corporation GF117M [GeForce 610M/710M/820M / GT 620M/625M/630M/720M] (rev a1)

$ nvidia-smi
Sat Mar  7 14:02:04 2015       
+------------------------------------------------------+                       
| NVIDIA-SMI 331.113    Driver Version: 331.113        |                       
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  NVS 5200M           Off  | 0000:01:00.0     N/A |                  N/A |
| N/A   53C  N/A     N/A /  N/A |    279MiB /  1023MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Compute processes:                                               GPU Memory |
|  GPU       PID  Process name                                     Usage      |
|=============================================================================|
|    0            Not Supported                                               |
+-----------------------------------------------------------------------------+

I am using Ubuntu 14.10 with CUDA 6.0 from the official repository (assuming Ubuntu has not messed it up).

It is a Lenovo T430s laptop and the card is behind Optimus, which might cause problems. I have tested the same code on another machine, where it works.

Update 1

OK, nvidia_uvm is not loaded...

$ lsmod | grep nv
nvidia              10744914  65 
nvram                  14362  1 thinkpad_acpi
drm                   310919  6 i915,drm_kms_helper,nvidia

$ sudo modprobe nvidia_uvm
modprobe: ERROR: ../libkmod/libkmod-module.c:816 kmod_module_insert_module() could not find module by name='nvidia_331_updates_uvm'
modprobe: ERROR: could not insert 'nvidia_331_updates_uvm': Function not implemented

Update 2

OK, I reinstalled nvidia-331-updates-uvm and the module was loaded.

$ lsmod | grep nv
nvidia_uvm             34855  0 
nvidia              10744914  66 nvidia_uvm
nvram                  14362  1 thinkpad_acpi
drm                   310919  6 i915,drm_kms_helper,nvidia

However, the code still returns error 30.

Update 3

After some more testing (mainly running as root), I now get error 71: operation not supported. However, if I use only cudaMalloc, it succeeds. I will also check whether my device supports unified memory addressing.

Update 4

OK, my card only supports SM 2.1, so it does not support Unified Memory.
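
The conclusion in Update 4 can be sketched as a small check. This is only an illustration: the function name supportsManagedMemory and its parameters are hypothetical, and it assumes the CUDA 6.0 rule that Unified Memory needs compute capability 3.0 or higher. In a real program the two numbers would come from the major and minor fields of cudaDeviceProp, as filled in by cudaGetDeviceProperties. Other requirements, such as a 64-bit host, are not covered here.

```cpp
// Hypothetical helper: returns true when the compute capability is at
// least 3.0, the minimum assumed here for cudaMallocManaged (CUDA 6.0).
// ccMajor and ccMinor stand in for cudaDeviceProp::major and ::minor.
bool supportsManagedMemory(int ccMajor, int ccMinor) {
  // Encode "major.minor" as one integer so 2.1 becomes 21 and 3.0 becomes 30.
  return ccMajor * 10 + ccMinor >= 30;
}
```

For the NVS 5200M in the question (SM 2.1), supportsManagedMemory(2, 1) is false, which matches the observation that cudaMalloc succeeds while cudaMallocManaged reports "operation not supported".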

Solution

AFAIK the nvidia_uvm kernel module is required for CUDA to work.

You need to install the package that contains that kernel module, e.g. nvidia-331-uvm, and enable its autoloading by installing the nvidia-modprobe package:

sudo apt-get install nvidia-modprobe nvidia-331-uvm

If you don't want to reboot after installing nvidia-modprobe, you can try running your program as root (e.g. sudo ./a.out): the module should be loaded when the program runs as root.
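
To confirm from inside a program that the module is actually loaded, one option is to scan the kernel's module list, much as lsmod does. The sketch below is an assumption-laden illustration, not part of the original answer: moduleLoaded is a hypothetical helper, and it relies on the module name being the first whitespace-separated field of each line, which holds for both /proc/modules and lsmod output.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: returns true if `name` is the first field of any
// line read from `modules`. Matching only the first field avoids false
// positives from lines such as "nvidia ... nvidia_uvm", where the name
// appears only in the "used by" column.
bool moduleLoaded(std::istream& modules, const std::string& name) {
  std::string line;
  while (std::getline(modules, line)) {
    std::istringstream fields(line);
    std::string first;
    if (fields >> first && first == name) {
      return true;
    }
  }
  return false;
}
```

On Linux, the stream could be a std::ifstream opened on /proc/modules, with "nvidia_uvm" as the name to look for before making the first CUDA call.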
