This article looks at how to handle variable-length input for a sequence-to-sequence autoencoder in Keras. The question and the recommended answer below should be a useful reference for anyone facing the same problem.

Problem Description

I implemented a sequence-to-sequence encoder-decoder, but I am having problems varying the target length at prediction time. It works for the same length as the training sequences, but not for a different one. What do I need to change?

from keras.models import Model
from keras.layers import Input, LSTM, Dense
import numpy as np

num_encoder_tokens = 2
num_decoder_tokens = 2
encoder_seq_length = None
decoder_seq_length = None
batch_size = 100
epochs = 2000
hidden_units = 10
timesteps = 10

input_seqs = np.random.random((1000, 10, num_encoder_tokens))
target_seqs = np.random.random((1000, 10, num_decoder_tokens))



#define training encoder
encoder_inputs = Input(shape=(None, num_encoder_tokens))
encoder = LSTM(hidden_units, return_state=True)
encoder_outputs, state_h, state_c = encoder(encoder_inputs)
encoder_states = [state_h, state_c]
#define training decoder
decoder_inputs = Input(shape=(None, num_decoder_tokens))
decoder_lstm = LSTM(hidden_units, return_sequences=True, return_state=True)
decoder_outputs, _, _ = decoder_lstm(decoder_inputs, initial_state=encoder_states)
decoder_dense = Dense(num_decoder_tokens, activation='tanh')
decoder_outputs = decoder_dense(decoder_outputs)
model = Model([encoder_inputs, decoder_inputs], decoder_outputs)

#Run training
model.compile(optimizer='adam', loss='mse')
model.fit([input_seqs, target_seqs], target_seqs, batch_size=batch_size, epochs=epochs)

#new target data
target_seqs = np.random.random((2000, 10, num_decoder_tokens))


# define inference encoder
encoder_model = Model(encoder_inputs, encoder_states)
# define inference decoder
decoder_state_input_h = Input(shape=(hidden_units,))
decoder_state_input_c = Input(shape=(hidden_units,))
decoder_states_inputs = [decoder_state_input_h, decoder_state_input_c]
decoder_outputs, state_h, state_c = decoder_lstm(decoder_inputs, initial_state=decoder_states_inputs)
decoder_states = [state_h, state_c]
decoder_outputs = decoder_dense(decoder_outputs)
decoder_model = Model([decoder_inputs] + decoder_states_inputs, [decoder_outputs] + decoder_states)

# Initialize states from the encoder
states_values = encoder_model.predict(input_seqs)

and here it wants the same batch size as in input_seqs, and does not accept target_seqs having a batch of 2000
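The mismatch is visible directly in the array shapes (a quick illustrative check using the variables defined above; not part of the original question):

print(states_values[0].shape)   # (1000, hidden_units): states for 1000 input sequences
print(target_seqs.shape)        # (2000, 10, num_decoder_tokens): 2000 target sequences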

target_seq = np.zeros((1, 1, num_decoder_tokens))
output = list()
for t in range(timesteps):
    output_tokens, h, c = decoder_model.predict([target_seqs] + states_values)
    output.append(output_tokens[0, 0, :])
    states_values = [h, c]
    target_seq = output_tokens

What do I need to change so that the model accepts a variable length of input?

Recommended Answer

You can create in your data a word/token that means end_of_sequence.

You keep the length to a maximum and probably use a Masking(mask_value) layer to avoid processing the undesired steps.

In both the inputs and outputs, you add the end_of_sequence token and complete the missing steps with mask_value.

Example:

  • the longest sequence has 4 steps
  • make it 5 to add an end_of_sequence token:
    • [step1, step2, step3, step4, end_of_sequence]
    • [step1, step2, end_of_sequence, mask_value, mask_value]

Then your shapes will be (batch, 5, features).
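Below is a minimal sketch of this padding-plus-masking setup (not from the original answer). The mask_value of -1.0 and the all-ones end_of_sequence vector are illustrative choices; in practice you would pick values that never occur in your real data:

from keras.layers import Input, LSTM, Masking
import numpy as np

num_tokens = 2
hidden_units = 10
max_len = 5          # longest sequence (4 steps) plus one end_of_sequence step
mask_value = -1.0    # illustrative; must never occur in the real data
eos = np.ones(num_tokens)   # illustrative end_of_sequence vector

def pad_sequence(seq):
    # append the end_of_sequence token, then fill the remaining steps with mask_value
    seq = np.vstack([seq, eos[None, :]])
    padding = np.full((max_len - len(seq), num_tokens), mask_value)
    return np.vstack([seq, padding])

# e.g. a 2-step sequence becomes [step1, step2, end_of_sequence, mask_value, mask_value]
batch = np.stack([pad_sequence(np.random.random((n, num_tokens))) for n in (2, 3, 4)])

inputs = Input(shape=(max_len, num_tokens))
masked = Masking(mask_value=mask_value)(inputs)   # masked steps are skipped by the LSTM
encoder_outputs, state_h, state_c = LSTM(hidden_units, return_state=True)(masked)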

Another approach is described in your other question, where the user loops over each step manually and checks whether the result of that step is the end_of_sequence token: Difference between two Sequence to Sequence Models keras (with and without RepeatVector)
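As a rough illustration of that manual loop (a sketch under stated assumptions, not the linked answer's code), built on top of the question's encoder_model and decoder_model; the is_end_of_sequence predicate is a hypothetical helper that recognizes the EOS token in your output space:

import numpy as np

def decode_sequence(input_seq, encoder_model, decoder_model,
                    num_decoder_tokens, is_end_of_sequence, max_steps=50):
    # encode one input sequence of shape (1, steps, features)
    states_values = encoder_model.predict(input_seq)
    # start from an all-zeros step, exactly one sequence in the batch
    target_seq = np.zeros((1, 1, num_decoder_tokens))
    output = []
    for _ in range(max_steps):  # safety cap so the loop always terminates
        output_tokens, h, c = decoder_model.predict([target_seq] + states_values)
        step = output_tokens[0, 0, :]
        if is_end_of_sequence(step):  # hypothetical predicate for the EOS token
            break
        output.append(step)
        states_values = [h, c]        # feed the states and last prediction back in
        target_seq = output_tokens
    return np.array(output)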

If this is an autoencoder, there is also another possibility for variable lengths, where you take the length directly from the input (you must feed batches with only one sequence each, no padding/masking): How to apply LSTM-autoencoder to variant-length time-series data?
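A minimal sketch of that idea, using toy data like the question's: because both time axes are declared as None, the model can be trained one sequence per batch, each with its own length and no padding (feeding the input as the decoder input here is purely for illustration):

from keras.models import Model
from keras.layers import Input, LSTM, Dense
import numpy as np

features = 2
hidden_units = 10

# the time axis is None, so every batch may come with its own length
encoder_inputs = Input(shape=(None, features))
_, state_h, state_c = LSTM(hidden_units, return_state=True)(encoder_inputs)
decoder_inputs = Input(shape=(None, features))
decoder_outputs = LSTM(hidden_units, return_sequences=True)(
    decoder_inputs, initial_state=[state_h, state_c])
outputs = Dense(features, activation='tanh')(decoder_outputs)
autoencoder = Model([encoder_inputs, decoder_inputs], outputs)
autoencoder.compile(optimizer='adam', loss='mse')

# one sequence per batch: each call may use a different length, no padding needed
for n in (4, 7, 10):
    seq = np.random.random((1, n, features))
    autoencoder.train_on_batch([seq, seq], seq)  # reconstruct the input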

This is yet another approach, where we store the input length explicitly in a reserved element of the latent vector and later read it back (this must also run with only one sequence per batch, no padding): Variable length output in keras
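The following is only a rough numpy-level sketch of that idea, not the linked answer's actual code; the reserved last slot of the h state, the MAX_LEN scale, and the helper names are all illustrative, and the decoder would still have to be trained to respect this convention:

import numpy as np

MAX_LEN = 100.0  # assumed scaling constant for the stored length

def encode_with_length(encoder_model, seq):
    # seq has shape (1, steps, features); the encoder returns [h, c]
    h, c = encoder_model.predict(seq)
    h[:, -1] = seq.shape[1] / MAX_LEN   # reserved element carries the scaled length
    return [h, c]

def read_length(states_values):
    # recover how many decoding steps to run from the reserved element
    return int(round(states_values[0][0, -1] * MAX_LEN))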

That concludes this article on variable-length input for sequence-to-sequence autoencoders. We hope the recommended answer is helpful.
