Tensorflow LSTM - matrix multiplication on an LSTM cell

This article discusses how to handle matrix multiplication on an LSTM cell in Tensorflow. The question and answer below may be a useful reference for anyone facing the same problem.

Problem Description

I'm making an LSTM neural network in Tensorflow.

The input tensor is of size 92.

import tensorflow as tf
from tensorflow.contrib import rnn
import data

test_x, train_x, test_y, train_y = data.get()

# Parameters
learning_rate = 0.001
epochs = 100
batch_size = 64
display_step = 10

# Network Parameters
n_input = 28   # input size
n_hidden = 128 # number of hidden units
n_classes = 20 # output size

# Placeholders
x = tf.placeholder(dtype=tf.float32, shape=[None, n_input])
y = tf.placeholder(dtype=tf.float32, shape=[None, n_classes])

# Network
def LSTM(x):
    W = tf.Variable(tf.random_normal([n_hidden, n_classes]), dtype=tf.float32) # weights
    b = tf.Variable(tf.random_normal([n_classes]), dtype=tf.float32) # biases

    x_shape = 92

    x = tf.transpose(x)
    x = tf.reshape(x, [-1, n_input])
    x = tf.split(x, x_shape)

    lstm = rnn.BasicLSTMCell(
        num_units=n_hidden,
        forget_bias=1.0
    )
    outputs, states = rnn.static_rnn(
        cell=lstm,
        inputs=x,
        dtype=tf.float32
    )

    output = tf.matmul( outputs[-1], W ) + b

    return output

# Train Network
def train(x):
    prediction = LSTM(x)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        output = sess.run(prediction, feed_dict={x: train_x})
        print(output)

train(x)

I'm not getting any errors, but I'm feeding an input tensor of size 92, and the matrix multiplication in the LSTM function returns a list containing a single result vector, when the desired number is 92: one result vector per input.

Is the problem that I'm matrix multiplying only the last item in the outputs array? Like this:

output = tf.matmul( outputs[-1], W ) + b

Instead of this:

output = tf.matmul( outputs, W ) + b

This is the error I get when I do the latter:

ValueError: Shape must be rank 2 but is rank 3 for 'MatMul' (op: 'MatMul') with input shapes: [92,?,128], [128,20].

Recommended Answer

static_rnn is for making the simplest recurrent neural net (see the tf documentation for static_rnn). The input to it should be a sequence of tensors. Let's say you want to input 4 words, say "Hi", "how", "are", "you". Your input placeholder should then consist of four n-dimensional vectors (n being the size of each input vector), one corresponding to each word.
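
To make that concrete, here is a minimal sketch of that idea; it is not from the original answer, and the vector size n = 50 and num_units = 128 are just illustrative values. static_rnn expects a Python list with one [batch, n] tensor per time step:

import tensorflow as tf
from tensorflow.contrib import rnn

n = 50        # size of each input vector (illustrative)
seq_len = 4   # four time steps: "Hi", "how", "are", "you"

# One placeholder holding the whole sequence: [batch, time, features]
words = tf.placeholder(tf.float32, shape=[None, seq_len, n])

# static_rnn wants a Python list of seq_len tensors, each of shape [batch, n]
inputs = tf.unstack(words, num=seq_len, axis=1)

cell = rnn.BasicLSTMCell(num_units=128)
outputs, state = rnn.static_rnn(cell=cell, inputs=inputs, dtype=tf.float32)
# len(outputs) == 4, and each element has shape [batch, 128]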

I think there's something wrong with your placeholder. You should initialize it with the number of inputs to the RNN. 28 is the number of dimensions in each vector. I believe 92 is the length of the sequence (more like 92 LSTM cells).
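
If that reading is correct (92 time steps, each a 28-dimensional vector), the placeholder and input preparation could look roughly like this; it is a sketch under that assumption, not a confirmed fix:

n_input = 28   # dimensions of each vector
n_steps = 92   # length of the sequence (one LSTM step per vector)

x = tf.placeholder(tf.float32, shape=[None, n_steps, n_input])
# train_x would then be fed with shape [batch_size, 92, 28]

# List of 92 tensors, each [batch, 28], as static_rnn expects
x_seq = tf.unstack(x, num=n_steps, axis=1)

lstm = rnn.BasicLSTMCell(num_units=n_hidden, forget_bias=1.0)
outputs, states = rnn.static_rnn(cell=lstm, inputs=x_seq, dtype=tf.float32)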

In the output list you will get a set of vectors equal in number to the length of the sequence, each of size equal to the number of hidden units.
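
And if the goal is one result vector per time step (92 per example rather than just the last one), one way to sketch it is to stack the static_rnn outputs and apply W and b to every step at once; n_steps here is the assumed sequence length of 92:

# outputs is a Python list of n_steps tensors, each of shape [batch, n_hidden]
stacked = tf.stack(outputs, axis=1)                            # [batch, n_steps, n_hidden]
flat = tf.reshape(stacked, [-1, n_hidden])                     # [batch * n_steps, n_hidden]
projected = tf.matmul(flat, W) + b                             # [batch * n_steps, n_classes]
all_outputs = tf.reshape(projected, [-1, n_steps, n_classes])  # [batch, n_steps, n_classes]

Reshaping to rank 2 before tf.matmul sidesteps the ValueError from the question, since the MatMul op only accepts 2-D inputs.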

That concludes this article on Tensorflow LSTM and matrix multiplication on an LSTM cell; hopefully the recommended answer above is helpful.
