如何将上一个时间步的输出作为输入传递到下一个时间步

如何将上一个时间步的输出作为输入传递到下一个时间步

本文介绍了Tensorflow:如何将上一个时间步的输出作为输入传递到下一个时间步的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

这是这个问题的副本 如何将最后一个输出 y(t-1) 作为输入以在 tensorflow RNN 中生成 y(t)?

我想将 RNN 在时间步 T 的输出作为时间步 T+1 的输入传递.input_RNN(T+1) = output_RNN(T)根据文档, tf.nn.rnn 以及 tf.nn.dynamic_rnn 函数明确地将完整输入用于所有时间步.

I want to pass the output of RNN at time-step T as the input at time-step T+1. input_RNN(T+1) = output_RNN(T)As per the documentation, the tf.nn.rnn as well as tf.nn.dynamic_rnn functions explicitly take the complete input to all time-steps.

我在 https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/ops/seq2seq.py它使用循环并调用 cell(input,state) 函数.单元格可以是 lstm 或 gru 或任何其他 rnn 单元格.我检查了文档以找到 cell() 参数的数据类型和形状,但我只找到了 cell(num_neurons) 形式的构造函数.我想知道将输出传递给输入的正确方法.我不想使用其他库/包装器,比如在 tensorflow 上构建的 keras.有什么建议吗?

I checked the seq2seq example at https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/ops/seq2seq.pyIt uses a loop and calls the cell(input,state) function. The cell can be lstm or gru or any other rnn cell. I checked the documentation to find the data type and shape of the arguments to cell(), but I found only the contructor of the form cell(num_neurons).I would like to know the correct way of passing output to input. I don't want to use other libraries/wrappers like keras built over tensorflow. Any suggestions?

推荐答案

一种方法是编写您自己的 RNN 单元以及您自己的多 RNN 单元.通过这种方式,您可以在内部存储最后一个 RNN 单元的输出,并在下一个时间步中访问它.查看此博文了解更多信息.您还可以添加例如编码器或解码器直接在单元格中,以便您可以在将数据馈送到单元格之前或从单元格检索数据之后处理数据.

One way to do this is to write your own RNN cell, together with your own Multi-RNN cell. This way you can internally store the output of the last RNN cell and just access it in the next time step. Check this blogpost for more info. You can also add e.g. encoder or decoders directly in the cell, so that you can process the data before feeding it to the cell or after retrieving it from the cell.

另一种可能性是使用 tf.nn.raw_rnn 函数,它可以让您控制在调用 RNN 单元之前和之后发生的事情.下面的代码片段展示了如何使用这个函数,学分转到 这篇文章.

Another possibility is to use the function tf.nn.raw_rnn which lets you control what happens before and after the calls to the RNN cells. The following code snippet shows how to use this function, credits go to this article.

from tensorflow.python.ops.rnn import _transpose_batch_time
import tensorflow as tf


def sampling_rnn(self, cell, initial_state, input_, seq_lengths):

    # raw_rnn expects time major inputs as TensorArrays
    max_time = ...  # this is the max time step per batch
    inputs_ta = tf.TensorArray(dtype=tf.float32, size=max_time, clear_after_read=False)
    inputs_ta = inputs_ta.unstack(_transpose_batch_time(input_))  # model_input is the input placeholder
    input_dim = input_.get_shape()[-1].value  # the dimensionality of the input to each time step
    output_dim = ...  # the dimensionality of the model's output at each time step

        def loop_fn(time, cell_output, cell_state, loop_state):
            """
            Loop function that allows to control input to the rnn cell and manipulate cell outputs.
            :param time: current time step
            :param cell_output: output from previous time step or None if time == 0
            :param cell_state: cell state from previous time step
            :param loop_state: custom loop state to share information between different iterations of this loop fn
            :return: tuple consisting of
              elements_finished: tensor of size [bach_size] which is True for sequences that have reached their end,
                needed because of variable sequence size
              next_input: input to next time step
              next_cell_state: cell state forwarded to next time step
              emit_output: The first return argument of raw_rnn. This is not necessarily the output of the RNN cell,
                but could e.g. be the output of a dense layer attached to the rnn layer.
              next_loop_state: loop state forwarded to the next time step
            """
            if cell_output is None:
                # time == 0, used for initialization before first call to cell
                next_cell_state = initial_state
                # the emit_output in this case tells TF how future emits look
                emit_output = tf.zeros([output_dim])
            else:
                # t > 0, called right after call to cell, i.e. cell_output is the output from time t-1.
                # here you can do whatever ou want with cell_output before assigning it to emit_output.
                # In this case, we don't do anything
                next_cell_state = cell_state
                emit_output = cell_output

            # check which elements are finished
            elements_finished = (time >= seq_lengths)
            finished = tf.reduce_all(elements_finished)

            # assemble cell input for upcoming time step
            current_output = emit_output if cell_output is not None else None
            input_original = inputs_ta.read(time)  # tensor of shape (None, input_dim)

            if current_output is None:
                # this is the initial step, i.e. there is no output from a previous time step, what we feed here
                # can highly depend on the data. In this case we just assign the actual input in the first time step.
                next_in = input_original
            else:
                # time > 0, so just use previous output as next input
                # here you could do fancier things, whatever you want to do before passing the data into the rnn cell
                # if here you were to pass input_original than you would get the normal behaviour of dynamic_rnn
                next_in = current_output

            next_input = tf.cond(finished,
                                 lambda: tf.zeros([self.batch_size, input_dim], dtype=tf.float32),  # copy through zeros
                                 lambda: next_in)  # if not finished, feed the previous output as next input

            # set shape manually, otherwise it is not defined for the last dimensions
            next_input.set_shape([None, input_dim])

            # loop state not used in this example
            next_loop_state = None
            return (elements_finished, next_input, next_cell_state, emit_output, next_loop_state)

    outputs_ta, last_state, _ = tf.nn.raw_rnn(cell, loop_fn)
    outputs = _transpose_batch_time(outputs_ta.stack())
    final_state = last_state

    return outputs, final_state

附带说明:目前尚不清楚在训练期间依赖模型的输出是否是一个好主意.尤其是在开始时,模型的输出可能非常糟糕,因此您的训练可能永远不会收敛或可能学不到任何有意义的东西.

As a side note: It is not clear if relying on the model's outputs during training is a good idea. Especially in the beginning, the outputs of the model can be quite bad, so your training might never converge or might not learn anything meaningful.

这篇关于Tensorflow:如何将上一个时间步的输出作为输入传递到下一个时间步的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!

08-13 16:48