Problem Description
I have a simple neural network model, and I apply either cuda() or DataParallel() to the model as follows:
model = torch.nn.DataParallel(model).cuda()
or,
model = model.cuda()
When I don't use DataParallel but simply move my model to the GPU with cuda(), I need to explicitly convert the batch inputs to cuda() before passing them to the model; otherwise it returns the following error:
torch.index_select received an invalid combination of arguments - got (torch.cuda.FloatTensor, int, torch.LongTensor)
But with DataParallel, the code works fine; everything else stays the same. Why does this happen? Why don't I need to move the batch inputs to cuda() explicitly when I use DataParallel?
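For reference, a minimal sketch that reproduces this behavior, assuming a hypothetical toy model and random batch (the actual model and inputs are not shown in the post) and a CUDA-capable machine:

import torch

model = torch.nn.Linear(10, 2).cuda()  # model parameters live on the GPU
inputs = torch.randn(4, 10)            # batch tensor still lives on the CPU

# out = model(inputs)       # raises the device-mismatch error quoted above
out = model(inputs.cuda())  # works once the batch is moved to the same device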
Recommended Answer
Because DataParallel allows CPU inputs: its first step is to transfer the inputs to the appropriate GPUs.
Source: https://discuss.pytorch.org/t/cuda-vs-dataparallel-why-the-difference/4062/3
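A minimal sketch of the behavior the answer describes, assuming the same hypothetical toy model as above: DataParallel's forward pass scatters the inputs to the GPUs itself, so a CPU batch is accepted as-is.

import torch

model = torch.nn.Linear(10, 2)
dp_model = torch.nn.DataParallel(model).cuda()  # replicate the model on the GPUs
inputs = torch.randn(4, 10)                     # batch stays on the CPU

out = dp_model(inputs)  # works: DataParallel scatters the batch to the GPUs first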