Problem Description
I am trying to use Keras' functional API to build a recurrent neural network, but I ran into some problems with the output shape; any help would be appreciated.
My code:
import tensorflow as tf
from tensorflow.python.keras.datasets import mnist
from tensorflow.python.keras.layers import Dense, CuDNNLSTM, Dropout
from tensorflow.python.keras.models import Sequential
from tensorflow.python.keras.utils import normalize
from tensorflow.python.keras.utils import np_utils
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = normalize(x_train, axis=1), normalize(x_test, axis=1)
y_train = np_utils.to_categorical(y_train, 10)
y_test = np_utils.to_categorical(y_test, 10)
feature_input = tf.keras.layers.Input(shape=(28, 28))
x = tf.keras.layers.CuDNNLSTM(128, kernel_regularizer=tf.keras.regularizers.l2(l=0.0004), return_sequences=True)(feature_input)
y = tf.keras.layers.Dense(10, activation='softmax')(x)
model = tf.keras.Model(inputs=feature_input, outputs=y)
opt = tf.keras.optimizers.Adam(lr=1e-3, decay=1e-5)
model.compile(optimizer=opt, loss="sparse_categorical_crossentropy", metrics=['accuracy'])
model.fit(x_train, y_train, epochs=3, validation_data=(x_test, y_test))
Error:
Answer
Your data (targets) has shape (60000, 10). Your model's output ('dense') has shape (None, length, 10).
Here, None is the batch size (variable), length is the middle dimension, which represents the "time steps" of the LSTM, and 10 is the number of units in the Dense layer.
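As a quick check (a minimal sketch, assuming the model and data defined in the question), you can compare the two shapes directly:

print(model.output_shape)  # (None, 28, 10) -- one 10-way prediction per image row (time step)
print(y_train.shape)       # (60000, 10)    -- one one-hot label per image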
Now, you don't actually have any sequence of time steps to process with an LSTM, so this doesn't quite make sense: the model is interpreting the "image rows" as sequential time steps and the "image columns" as independent features. (If this was not your intention, you simply got lucky that it didn't raise an error for feeding an image into an LSTM.)
Anyway, you can fix this error with return_sequences=False (which discards the length dimension of the output sequence). That does not mean this model is optimal for this case; a corrected sketch follows below.
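A minimal sketch of the corrected model, assuming the data preparation from the question above. Note that because y_train and y_test were one-hot encoded with to_categorical, the loss is switched to categorical_crossentropy here (sparse_categorical_crossentropy expects integer labels); keep the original loss only if you feed the raw integer labels instead. CuDNNLSTM still requires a GPU; tf.keras.layers.LSTM works on CPU as well.

feature_input = tf.keras.layers.Input(shape=(28, 28))
# return_sequences=False: keep only the last time step, so the LSTM output is (None, 128)
x = tf.keras.layers.CuDNNLSTM(128, kernel_regularizer=tf.keras.regularizers.l2(l=0.0004), return_sequences=False)(feature_input)
y = tf.keras.layers.Dense(10, activation='softmax')(x)  # output shape: (None, 10), matching the targets
model = tf.keras.Model(inputs=feature_input, outputs=y)
opt = tf.keras.optimizers.Adam(lr=1e-3, decay=1e-5)
# categorical_crossentropy because the targets were one-hot encoded with to_categorical;
# use sparse_categorical_crossentropy only with raw integer labels
model.compile(optimizer=opt, loss="categorical_crossentropy", metrics=['accuracy'])
model.fit(x_train, y_train, epochs=3, validation_data=(x_test, y_test))

With return_sequences=False, only the final hidden state of the LSTM is kept, so the Dense layer produces a single (None, 10) prediction per image instead of one prediction per image row.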