Problem description
Sometimes I mistakenly start two Keras trainings at the same time on the same GPU (from two different scripts), which either crashes my machine or breaks both trainings.
I would like my script to be able to check whether a training is already running, so that it can either switch to another GPU or stop the new training.
The only hint I found while searching for an answer is to use nvidia-smi to check which processes are running on the GPUs.
An example of nvidia-smi output:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 411.63 Driver Version: 411.63 |
|-------------------------------+----------------------+----------------------+
| GPU Name TCC/WDDM | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 TITAN Xp WDDM | 00000000:03:00.0 Off | N/A |
| 42% 67C P2 81W / 250W | 10114MiB / 12288MiB | 54% Default |
+-------------------------------+----------------------+----------------------+
| 1 TITAN Xp WDDM | 00000000:04:00.0 Off | N/A |
| 35% 58C P2 144W / 250W | 10315MiB / 12288MiB | 73% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 11660 C ...\conda\envs\tensorflow18-gpu\python.exe N/A |
| 1 1532 C+G Insufficient Permissions N/A |
| 1 5388 C+G C:\Windows\explorer.exe N/A |
| 1 6648 C+G Insufficient Permissions N/A |
| 1 7396 C+G ...t_cw5n1h2txyewy\ShellExperienceHost.exe N/A |
| 1 7688 C+G ...dows.Cortana_cw5n1h2txyewy\SearchUI.exe N/A |
| 1 9808 C ...\conda\envs\tensorflow18-gpu\python.exe N/A |
| 1 10820 C+G Insufficient Permissions N/A |
| 1 11232 C+G ...x64__8wekyb3d8bbwe\Microsoft.Photos.exe N/A |
+-----------------------------------------------------------------------------+
In this case, python.exe is running on both GPU 0 and GPU 1.
Is there a more direct solution? Thanks.
Recommended answer
You can try this Python package, GPUtil.
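As a rough illustration of how GPUtil could be used for this, here is a minimal sketch. It assumes GPUtil is installed (pip install gputil) and relies on GPUtil.getGPUs(), which returns objects with id, load and memoryUtil attributes. The helper names pick_free_gpu and select_gpu_or_exit and the 10% thresholds are hypothetical choices made for this example, not part of GPUtil. Note that this checks how busy each GPU is, not whether the process using it is specifically a Keras training.

```python
import os


def pick_free_gpu(gpus, max_load=0.1, max_memory=0.1):
    """Return the id of the first GPU that looks idle, or None if all look busy.

    `gpus` is any iterable of objects exposing `id`, `load` and `memoryUtil`
    (fractions between 0 and 1), such as the objects from GPUtil.getGPUs().
    The thresholds are illustrative assumptions, not recommended values.
    """
    for gpu in gpus:
        # If a value is NaN (GPUtil may report that when it cannot read it),
        # the comparison is False, so the GPU is conservatively treated as busy.
        if gpu.load <= max_load and gpu.memoryUtil <= max_memory:
            return gpu.id
    return None


def select_gpu_or_exit():
    """Hypothetical helper: pin this process to an idle GPU, or stop."""
    import GPUtil  # third-party package, imported here so the module loads without it

    gpu_id = pick_free_gpu(GPUtil.getGPUs())
    if gpu_id is None:
        raise SystemExit("All GPUs look busy; not starting a new training.")
    # These must be set before Keras/TensorFlow initialises CUDA.
    os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"  # match nvidia-smi numbering
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    return gpu_id
```

Calling select_gpu_or_exit() at the very top of the training script, before importing Keras, would then either restrict the script to an idle GPU or abort. With the nvidia-smi output shown in the question (both GPUs above 80% memory use), this sketch would abort. GPUtil also provides getAvailable and getFirstAvailable, which apply similar load and memory thresholds and may be enough on their own.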