Question
I have a list of LongTensors and another list of labels. I'm new to PyTorch and RNNs, so I'm quite confused as to how to implement minibatch training for the data I have. There is much more to this data, but I want to keep it simple, so I can focus on understanding just the minibatch training part. I'm doing multiclass classification based on the final hidden state of an LSTM/GRU trained on variable-length inputs. I managed to get it working with a batch size of 1 (basically SGD), but I'm struggling with implementing minibatches.
Do I have to pad the sequences to the maximum size and create a new, larger tensor that holds all the elements? I mean something like this:
inputs = pad(sequences)  # pad() is a placeholder for some padding routine
train = DataLoader(inputs, batch_size=batch_size, shuffle=True)
for i, data in enumerate(train):
    # do stuff using LSTM and/or GRU models
Is this the accepted way of doing minibatch training on custom data? I couldn't find any tutorials on loading custom data using DataLoader (but I assume that's the way to create batches in PyTorch?)
Another doubt I have is with regard to padding. The reason I'm using an LSTM/GRU is the variable length of the input. Doesn't padding defeat the purpose? Is padding necessary for minibatch training?
Answer
Yes. The issue with minibatch training on variable-length sequences is that you can't stack tensors of different lengths together.
Usually, one would do something like this:
import random
import torch

for e in range(epochs):
    random.shuffle(sequences)  # shuffles in place; random.shuffle returns None
    for mb in range(len(sequences) // mb_size):  # integer division
        batch = torch.stack(sequences[mb * mb_size:(mb + 1) * mb_size])
and then you apply your neural network to the batch. But because your sequences have different lengths, torch.stack will fail. So what you actually have to do is pad your sequences with zeros so that they all have the same length (at least within a minibatch). So you have 2 options:
1) At the very beginning, pad all your sequences with initial zeros so that they all have the same length as the longest sequence in your data (see the first sketch after these options).
OR
2) On the fly, for each minibatch, before stacking the sequences together, pad all the sequences that will go into the minibatch with initial zeros so that they all have the same length as the longest sequence in that minibatch (see the second sketch below).
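Below is a minimal sketch of option 1 (padding everything up front). It assumes sequences is the list of 1-D LongTensors and labels is the list of labels from the question; the batch size of 32 is an arbitrary choice, not part of the original answer.

import torch
from torch.utils.data import DataLoader, TensorDataset

max_len = max(seq.size(0) for seq in sequences)

# Left-pad each sequence with zeros ("initial zeros") up to the global
# maximum length, then stack the now equal-length tensors into a single
# (num_sequences, max_len) matrix.
padded = torch.stack([torch.cat([seq.new_zeros(max_len - seq.size(0)), seq])
                      for seq in sequences])

dataset = TensorDataset(padded, torch.tensor(labels))
train = DataLoader(dataset, batch_size=32, shuffle=True)

for batch_inputs, batch_labels in train:
    pass  # feed batch_inputs to the LSTM/GRU here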
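And a sketch of option 2, padding on the fly via a custom collate_fn. pad_sequence is a real PyTorch utility that pads to the longest sequence in each batch; the names sequences and labels, assumed integer labels, and the batch size are again assumptions carried over from the question.

import torch
from torch.nn.utils.rnn import pad_sequence
from torch.utils.data import DataLoader

def collate(batch):
    seqs, labels = zip(*batch)
    # pad_sequence appends trailing zeros up to the longest sequence in
    # the batch; flipping each sequence before padding and flipping the
    # result back turns those into the initial zeros described above.
    padded = pad_sequence([s.flip(0) for s in seqs], batch_first=True).flip(1)
    return padded, torch.tensor(labels)

train = DataLoader(list(zip(sequences, labels)),
                   batch_size=32, shuffle=True, collate_fn=collate)

for batch_inputs, batch_labels in train:
    pass  # batch_inputs has shape (batch_size, longest_in_this_batch)

Option 2 typically wastes less computation, since each minibatch is padded only to its own longest sequence rather than to the longest sequence in the entire dataset.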