I'm trying to understand recurrent networks in TensorFlow using a toy sequence-classification problem.
Data:
import numpy as np
import tensorflow as tf

half_len = 500
pos_ex = [1, 2, 3, 4, 5]  # Positive sequence.
neg_ex = [1, 2, 3, 4, 6]  # Negative sequence.
num_input = len(pos_ex)
data = np.concatenate((np.stack([pos_ex] * half_len), np.stack([neg_ex] * half_len)), axis=0)
labels = np.asarray([0, 1] * half_len + [1, 0] * half_len).reshape((2 * half_len, -1))
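For reference, this is what the resulting arrays look like (a quick sanity check I added, assuming the code above has run):

print(data.shape)             # (1000, 5): 500 positive rows, then 500 negative rows
print(labels.shape)           # (1000, 2): one-hot labels
print(labels[0], labels[-1])  # [0 1] [1 0]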
Model:
_, x_width = data.shape
X = tf.placeholder("float", [None, x_width])
Y = tf.placeholder("float", [None, num_classes])

# Output-layer parameters mapping the final hidden state to class logits.
# (n_hidden, num_classes and learning_rate are hyperparameters defined elsewhere.)
weights = tf.Variable(tf.random_normal([n_hidden, num_classes]))
bias = tf.Variable(tf.random_normal([num_classes]))

def lstm_model():
    from tensorflow.contrib import rnn
    # Split the [batch, 5] input into a list of five [batch, 1] time steps.
    x = tf.split(X, num_input, 1)
    rnn_cell = rnn.BasicLSTMCell(n_hidden)
    outputs, states = rnn.static_rnn(rnn_cell, x, dtype=tf.float32)
    # Classify from the output of the last time step.
    return tf.matmul(outputs[-1], weights) + bias
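As an aside, tf.split is what turns the flat [batch, 5] placeholder into the list of five per-time-step tensors that static_rnn expects; a quick check (my own sketch, reusing the placeholders above):

x_steps = tf.split(X, num_input, 1)
print(len(x_steps))      # 5 tensors, one per time step
print(x_steps[0].shape)  # (?, 1)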
Training:
logits = lstm_model()
prediction = tf.nn.softmax(logits)
# Define loss and optimizer
loss_op = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=Y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)
train_op = optimizer.minimize(loss_op)
# Train...
My training accuracy stays around 0.5, which puzzles me given how simple the problem is. Note how the loss plateaus near 0.693 ≈ ln 2, i.e. chance level for two classes:
Step 1, Minibatch Loss = 82.2726, Training Accuracy = 0.453
Step 25, Minibatch Loss = 6.7920, Training Accuracy = 0.547
Step 50, Minibatch Loss = 0.8528, Training Accuracy = 0.500
Step 75, Minibatch Loss = 0.6989, Training Accuracy = 0.500
Step 100, Minibatch Loss = 0.6929, Training Accuracy = 0.516
However, changing the toy data to:
pos_ex = [1, 2, 3, 4, 5]
neg_ex = [1, 2, 3, 4, 100]
makes the accuracy quickly reach 1.0. Can anyone explain why this network fails on such a simple task? Thanks.
The code above is based on this tutorial.
Best Answer
Have you tried lowering the learning rate?
In your second example, the gap separating the two classes in the last coordinate is much larger (100 vs. 5 instead of 6 vs. 5). In principle that should make no difference, but it does affect which learning rates work, since larger inputs produce larger gradients in the input weights.
If you normalize the data (scale each coordinate to lie between -1 and 1) and pick a suitable step size, both problems should be solved in roughly the same number of steps.
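A minimal normalization sketch (my addition, not part of the original answer); the epsilon guards the constant coordinates, whose min and max coincide:

# Scale each coordinate (column) into [-1, 1] before feeding it to the model.
lo = data.min(axis=0)
hi = data.max(axis=0)
# The 1e-8 avoids division by zero for constant columns (coordinates 1-4 here).
data_scaled = 2.0 * (data - lo) / (hi - lo + 1e-8) - 1.0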
Edit: after playing with the toy example a bit, it turns out to work fine even without normalization:
import tensorflow as tf
import numpy as np
from tensorflow.contrib import rnn
# Meta parameters
n_hidden = 10
num_classes = 2
learning_rate = 1e-2
input_dim = 5
num_input = 5
# inputs
X = tf.placeholder("float", [None, input_dim])
Y = tf.placeholder("float", [None, num_classes])
# Model
def lstm_model():
    # input layer
    x = tf.split(X, num_input, 1)
    # LSTM layer
    rnn_cell = rnn.BasicLSTMCell(n_hidden)
    outputs, states = rnn.static_rnn(rnn_cell, x, dtype=tf.float32)
    # final layer - softmax
    weights = tf.Variable(tf.random_normal([n_hidden, num_classes]))
    bias = tf.Variable(tf.random_normal([num_classes]))
    return tf.matmul(outputs[-1], weights) + bias
# logits and prediction
logits = lstm_model()
prediction = tf.nn.softmax(logits)
# Define loss and optimizer
loss_op = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=Y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)
train_op = optimizer.minimize(loss_op)
# -----------
# Train func
# -----------
def train(data, labels):
    with tf.Session() as session:
        session.run(tf.global_variables_initializer())
        for i in range(1000):
            _, loss, onehot_pred = session.run([train_op, loss_op, prediction],
                                               feed_dict={X: data, Y: labels})
            acc = np.mean(np.argmax(onehot_pred, axis=1) == np.argmax(labels, axis=1))
            print('Iteration {} accuracy: {}'.format(i, acc))
            if acc == 1:
                print('---> Finished after {} iterations'.format(i + 1))
                break
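Note (my observation): each call to train() re-runs tf.global_variables_initializer(), so the two experiments below start from freshly re-sampled random weights in the same graph rather than sharing learned state.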
# -----------
# Train 1
# -----------
# data generation
half_len = 500
pos_ex = [1, 2, 3, 4, 5] # Positive sequence.
neg_ex = [1, 2, 3, 4, 6] # Negative sequence.
data = np.concatenate((np.stack([pos_ex]*half_len), np.stack([neg_ex]*half_len)), axis=0)
labels = np.asarray([0, 1] * half_len + [1, 0] * half_len).reshape((2 * half_len, -1))
train(data,labels)
# -----------
# Train 2
# -----------
# data generation
half_len = 500
pos_ex = [1, 2, 3, 4, 5] # Positive sequence.
neg_ex = [1, 2, 3, 4, 100] # Negative sequence.
data = np.concatenate((np.stack([pos_ex]*half_len), np.stack([neg_ex]*half_len)), axis=0)
labels = np.asarray([0, 1] * half_len + [1, 0] * half_len).reshape((2 * half_len, -1))
train(data,labels)
The output is:
Iteration 0 accuracy: 0.5
Iteration 1 accuracy: 0.5
Iteration 2 accuracy: 0.5
Iteration 3 accuracy: 0.5
Iteration 4 accuracy: 0.5
Iteration 5 accuracy: 0.5
Iteration 6 accuracy: 0.5
Iteration 7 accuracy: 0.5
Iteration 8 accuracy: 0.5
Iteration 9 accuracy: 0.5
Iteration 10 accuracy: 1.0
---> Finished after 11 iterations
Iteration 0 accuracy: 0.5
Iteration 1 accuracy: 0.5
Iteration 2 accuracy: 0.5
Iteration 3 accuracy: 0.5
Iteration 4 accuracy: 0.5
Iteration 5 accuracy: 0.5
Iteration 6 accuracy: 0.5
Iteration 7 accuracy: 0.5
Iteration 8 accuracy: 0.5
Iteration 9 accuracy: 1.0
---> Finished after 10 iterations
Good luck!
On "python - Can't train a toy LSTM in TensorFlow", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/49495364/