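I am trying to build a multi-layer, multi-class, multi-label LSTM in Tensorflow. I have been trying to adapt this tutorial to my data.
However, I get an error saying that dimensions do not match when the RNN is built:
ValueError: Dimensions must be equal, but are 1000 and 923 for 'rnn/while/rnn/multi_rnn_cell/cell_0/lstm_cell/MatMul_1' (op: 'MatMul') with input shapes: [?,1000], [923,2000].
I cannot pin down which variable is wrong in the code that builds the architecture: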
def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    initial = tf.constant(0.0, shape=shape)
    return tf.Variable(initial)

def lstm(x, weight, bias, n_steps, n_classes):
    cell = rnn_cell.LSTMCell(cfg.n_hidden_cells_in_layer, state_is_tuple=True)
    multi_layer_cell = tf.nn.rnn_cell.MultiRNNCell([cell] * 2)
    # FIXME : ERROR binding x to LSTM as it is
    output, state = tf.nn.dynamic_rnn(multi_layer_cell, x, dtype=tf.float32)
    # FIXME : ERROR
    output_flattened = tf.reshape(output, [-1, cfg.n_hidden_cells_in_layer])
    output_logits = tf.add(tf.matmul(output_flattened, weight), bias)
    output_all = tf.nn.sigmoid(output_logits)
    output_reshaped = tf.reshape(output_all, [-1, n_steps, n_classes])
    # ??? switch batch size with sequence size. ???
    # then gather last time step values
    output_last = tf.gather(tf.transpose(output_reshaped, [1, 0, 2]), n_steps - 1)
    return output_last, output_all
And here are my placeholders, the loss function, and all that jazz:
x_test, y_test = load_multiple_vector_files(test_filepaths)
x_valid, y_valid = load_multiple_vector_files(valid_filepaths)
n_input, n_steps, n_classes = get_input_target_lengths(check_print=False)
# FIXME n_input should be the problem
x = tf.placeholder("float", [None, n_steps, n_input])
y = tf.placeholder("float", [None, n_classes])
y_steps = tf.placeholder("float", [None, n_classes])
weight = weight_variable([cfg.n_hidden_layers, n_classes])
bias = bias_variable([n_classes])
y_last, y_all = lstm(x, weight, bias, n_steps, n_classes)
#all_steps_cost=tf.reduce_mean(-tf.reduce_mean((y_steps * tf.log(y_all))+(1 - y_steps) * tf.log(1 - y_all),reduction_indices=1))
all_steps_cost = -tf.reduce_mean((y_steps * tf.log(y_all)) + (1 - y_steps) * tf.log(1 - y_all))
last_step_cost = -tf.reduce_mean((y * tf.log(y_last)) + ((1 - y) * tf.log(1 - y_last)))
loss_function = (cfg.alpha * all_steps_cost) + ((1 - cfg.alpha) * last_step_cost)
optimizer = tf.train.AdamOptimizer(learning_rate=cfg.learning_rate).minimize(loss_function)
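I am fairly sure the culprit is my x placeholder, which makes the layers and their matrix dimensions mismatch. The linked example uses bare constants, so it is hard to see what each value actually represents.
Can anyone help me out here? :)
Update:
I made an educated guess about the mismatched dimensions.
One of them is 2 * hidden_width, i.e. the hidden layer receiving its new input plus its old recurrent input. The other, however, is input_width + hidden_width, as if it were trying to set up a recurrence of hidden-layer width for the input layer.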
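The two shapes in the error message can be decoded with a little arithmetic. The sketch below assumes the usual TensorFlow `LSTMCell` layout, in which a single kernel of shape `[input_width + hidden_width, 4 * hidden_width]` multiplies the concatenation of the current input and the previous hidden state. The variable names are illustrative and do not come from the original code.

```python
# Shapes copied from the error message: MatMul of [?, 1000] with [923, 2000].
lhs_width = 1000                      # width of the tensor fed into the MatMul
kernel_rows, kernel_cols = 923, 2000  # shape of the LSTM kernel

# Assumption: the kernel stores the four LSTM gates side by side.
hidden_width = kernel_cols // 4
# Assumption: kernel rows = input_width + hidden_width.
input_width = kernel_rows - hidden_width

print("hidden_width:", hidden_width)
print("input_width:", input_width)
print("lhs is 2 * hidden_width:", lhs_width == 2 * hidden_width)
print("lhs matches kernel rows:", lhs_width == kernel_rows)
```

Under these assumptions the kernel was sized for an input of width 423 plus a hidden state of width 500, while the tensor that reaches it is 500 + 500 wide, which is consistent with the 2 * hidden_width versus input_width + hidden_width reading above.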
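Best answer
I found my mistake: I was setting up the weight variable with the constant n_hidden_layers (the number of hidden layers) instead of n_hidden_cells_in_layer (the number of cells in each layer).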
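A minimal sketch of why that constant matters, using a plain-Python shape check rather than TensorFlow. `matmul_shape` is a hypothetical helper, and the sizes (2 layers, 500 cells per layer, 10 classes, 64 flattened rows) are made-up values for illustration, not the asker's configuration. In the question's code, the corresponding change would presumably be to create the weight as `weight_variable([cfg.n_hidden_cells_in_layer, n_classes])`.

```python
def matmul_shape(a_shape, b_shape):
    """Return the result shape of a 2-D matrix product, or raise ValueError."""
    rows, inner_a = a_shape
    inner_b, cols = b_shape
    if inner_a != inner_b:
        raise ValueError(
            "Dimensions must be equal, but are %d and %d" % (inner_a, inner_b))
    return (rows, cols)


# Illustrative sizes only (assumptions, not the asker's real configuration).
n_hidden_layers = 2            # number of stacked LSTM layers
n_hidden_cells_in_layer = 500  # number of cells (units) in each layer
n_classes = 10
batch_times_steps = 64         # rows of the flattened RNN output

# Shape of the RNN output after reshaping to [-1, n_hidden_cells_in_layer].
output_flattened = (batch_times_steps, n_hidden_cells_in_layer)

# Weight sized by the layer count: inner dimensions 500 and 2 do not match.
try:
    matmul_shape(output_flattened, (n_hidden_layers, n_classes))
    wrong_shape_ok = True
except ValueError as err:
    wrong_shape_ok = False
    print("wrong weight shape:", err)

# Weight sized by the cells per layer: the product is well defined.
logits_shape = matmul_shape(output_flattened, (n_hidden_cells_in_layer, n_classes))
print("logits shape:", logits_shape)
```

The projection weight has to share its first dimension with the width of the flattened RNN output, which is the number of cells per layer and has nothing to do with how many layers are stacked.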
On the topic of tensorflow - dimension mismatch when building an LSTM RNN in Tensorflow, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/49728273/