Problem Description
I have a simple neural network model, and I apply either cuda() or DataParallel() to the model as follows:
model = torch.nn.DataParallel(model).cuda()
or,
model = model.cuda()
When I don't use DataParallel but simply move my model to the GPU with cuda(), I need to explicitly convert the batch inputs to cuda() before passing them to the model; otherwise it returns the following error:
torch.index_select received an invalid combination of arguments - got (torch.cuda.FloatTensor, int, torch.LongTensor)
But with DataParallel, the code works fine; everything else stays the same. Why does this happen? Why don't I need to move the batch inputs to cuda() explicitly when I use DataParallel?
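For reference, a minimal sketch that reproduces this behavior, assuming a hypothetical toy model and random batch (the actual model and inputs are not shown in the post) and a CUDA-capable machine:

import torch

model = torch.nn.Linear(10, 2).cuda()  # model parameters live on the GPU
inputs = torch.randn(4, 10)            # batch tensor still lives on the CPU

# out = model(inputs)       # raises the device-mismatch error quoted above
out = model(inputs.cuda())  # works once the batch is moved to the same device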
Recommended Answer
Because DataParallel allows CPU inputs: its first step is to transfer the inputs to the appropriate GPUs.
Source: https://discuss.pytorch.org/t/cuda-vs-dataparallel-why-the-difference/4062/3
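A minimal sketch of the behavior the answer describes, assuming the same hypothetical toy model as above: DataParallel's forward pass scatters the inputs to the GPUs itself, so a CPU batch is accepted as-is.

import torch

model = torch.nn.Linear(10, 2)
dp_model = torch.nn.DataParallel(model).cuda()  # replicate the model on the GPUs
inputs = torch.randn(4, 10)                     # batch stays on the CPU

out = dp_model(inputs)  # works: DataParallel scatters the batch to the GPUs first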