I'm trying to implement weight noise regularization, as Alex Graves did in his PhD thesis, but I'm having several problems implementing the method. The algorithm should look like this:
while stopping criteria not met do
    Randomize training set order
    for each example in the training set do
        Add zero mean Gaussian noise to weights
        Run forward and backward pass to calculate the gradient
        Restore original weights
        Update weights with gradient descent algorithm
Can anyone shed some light on this?
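For concreteness, here is a minimal sketch of how the add/restore steps can be expressed as TensorFlow assign ops. This is my own illustration, not code from the thesis; the shape [100, 50] and stddev values are placeholders, and the sampled noise is cached so that exactly the same values can be subtracted afterwards:

import tensorflow as tf

# Hypothetical weight matrix and a non-trainable cache for the noise
W = tf.Variable(tf.truncated_normal([100, 50], stddev=0.1))
noise_cache = tf.Variable(tf.zeros([100, 50]), trainable=False)

# Sample fresh zero-mean Gaussian noise and remember it in the cache
sample = noise_cache.assign(tf.random_normal([100, 50], stddev=0.01))
# Perturb the weights with the freshly sampled noise
add_noise_op = W.assign_add(sample)
# Restore the original weights by subtracting the same cached noise
remove_noise_op = W.assign_sub(noise_cache)

Per example one would then run add_noise_op, the forward/backward pass, remove_noise_op, and the update op, in that order; getting TensorFlow to respect exactly that ordering is what I'm struggling with.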
EDIT 09/16/16
Here is my code:
# e.g.: log filter bank or MFCC features
# Has size [batch_size, max_stepsize, num_features], but the
# batch_size and max_stepsize can vary along each step
inputs = tf.placeholder(tf.float32, [None, None, num_features])

# Here we use sparse_placeholder, which will generate the
# SparseTensor required by the ctc_loss op.
targets = tf.sparse_placeholder(tf.int32)

# 1d array of size [batch_size]
seq_len = tf.placeholder(tf.int32, [None])

# Defining the cell
# Can be:
#   tf.nn.rnn_cell.RNNCell
#   tf.nn.rnn_cell.GRUCell
cell = tf.nn.rnn_cell.LSTMCell(num_hidden, state_is_tuple=True)

# Stacking rnn cells
stack = tf.nn.rnn_cell.MultiRNNCell([cell] * num_layers,
                                    state_is_tuple=True)

# The second output is the last state and we will not use it
outputs, _ = tf.nn.dynamic_rnn(stack, inputs, seq_len, dtype=tf.float32)

shape = tf.shape(inputs)
batch_s, max_timesteps = shape[0], shape[1]

# Reshaping to apply the same weights over the timesteps
outputs = tf.reshape(outputs, [-1, num_hidden])

# Truncated normal with mean 0 and stddev=0.1
# Tip: Try another initialization
# see https://www.tensorflow.org/versions/r0.9/api_docs/python/contrib.layers.html#initializers
W = tf.Variable(tf.truncated_normal([num_hidden,
                                     num_classes],
                                    stddev=0.1))
# Zero initialization
# Tip: Is tf.zeros_initializer the same?
b = tf.Variable(tf.constant(0., shape=[num_classes]))

# Doing the affine projection
logits = tf.matmul(outputs, W) + b

# Reshaping back to the original shape
logits = tf.reshape(logits, [batch_s, -1, num_classes])

# Time major
logits = tf.transpose(logits, (1, 0, 2))

loss = tf.contrib.ctc.ctc_loss(logits, targets, seq_len)
cost = tf.reduce_mean(loss)

optimizer = tf.train.MomentumOptimizer(initial_learning_rate,
                                       0.9).minimize(cost)

# Option 2: tf.contrib.ctc.ctc_beam_search_decoder
# (it's slower but you'll get better results)
decoded, log_prob = tf.contrib.ctc.ctc_greedy_decoder(logits, seq_len)

# Inaccuracy: label error rate
ler = tf.reduce_mean(tf.edit_distance(tf.cast(decoded[0], tf.int32),
                                      targets))
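For completeness, a training step for this graph would be driven roughly as sketched below. This is only an illustration: batch_inputs, batch_seq_len and the sparse target triple (target_indices, target_values, target_shape) are assumed to come from a data pipeline, and the sparse placeholder must be fed with a tf.SparseTensorValue:

with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())
    # Sparse placeholders are fed with an (indices, values, shape) triple
    sparse_targets = tf.SparseTensorValue(target_indices,
                                          target_values,
                                          target_shape)
    feed = {inputs: batch_inputs,
            targets: sparse_targets,
            seq_len: batch_seq_len}
    batch_cost, _ = sess.run([cost, optimizer], feed_dict=feed)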
EDIT 09/27/16
I realized that I need to change the optimizer in order to add the weight noise regularizer. However, I don't know how to insert it into my code.
# These come from TensorFlow's internal modules, as used by
# tf.contrib.layers.optimize_loss:
from tensorflow.python.framework import ops
from tensorflow.python.ops import control_flow_ops

variables = tf.trainable_variables()
with tf.variable_scope(self.name or "OptimizeLoss", [loss, global_step]):
    update_ops = set(ops.get_collection(ops.GraphKeys.UPDATE_OPS))
    # Make sure update ops are run before computing the loss.
    if update_ops:
        loss = control_flow_ops.with_dependencies(list(update_ops), loss)

    add_noise_ops = [tf.no_op()]
    if self.weights_noise_scale is not None:
        add_noise_ops, remove_noise_ops = self._noise_ops(
            variables, self.weights_noise_scale)
        # Make sure noise is added to the weights before computing the loss.
        loss = control_flow_ops.with_dependencies(add_noise_ops, loss)

    # Compute gradients.
    gradients = self._opt.compute_gradients(
        loss, variables,
        colocate_gradients_with_ops=self.colocate_gradients_with_ops)

    # Optionally add gradient noise.
    if self.gradient_noise_scale is not None:
        gradients = self._add_scaled_noise_to_gradients(
            gradients, self.gradient_noise_scale)
    # Optionally clip gradients by global norm.
    if self.clip_gradients_by_global_norm is not None:
        gradients = self._clip_gradients_by_global_norm(
            gradients, self.clip_gradients_by_global_norm)
    # Optionally clip gradients by value.
    if self.clip_gradients_by_value is not None:
        gradients = self._clip_gradients_by_value(
            gradients, self.clip_gradients_by_value)
    # Optionally clip gradients by norm.
    if self.clip_gradients_by_norm is not None:
        gradients = self._clip_gradients_by_norm(
            gradients, self.clip_gradients_by_norm)

    self._grads = [g[0] for g in gradients]
    self._vars = [g[1] for g in gradients]

    # Create gradient updates.
    # The weight noise must be removed before the gradient update rule
    # is applied -- this is the part I don't know how to wire in.
    grad_updates = self._opt.apply_gradients(gradients,
                                             global_step=global_step,
                                             name="train")

    # Ensure the train_tensor computes grad_updates.
    train_tensor = control_flow_ops.with_dependencies([grad_updates], loss)
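The snippet above calls self._noise_ops but never shows it, so here is one plausible sketch of that helper, under my own assumptions about its contract. To get the ordering right, it returns the per-variable noise caches instead of ready-made remove ops (so the unpacking would become add_noise_ops, noise_caches = self._noise_ops(...)), and the remove ops are built after the gradients, under an explicit control dependency:

def _noise_ops(self, variables, scale):
    # Sketch: returns (add_noise_ops, noise_caches). Each variable
    # gets a non-trainable cache so that exactly the same sample
    # that was added can later be subtracted.
    add_ops, caches = [], []
    for var in variables:
        cache = tf.Variable(tf.zeros(var.get_shape()), trainable=False)
        sample = cache.assign(tf.random_normal(var.get_shape(),
                                               stddev=scale))
        add_ops.append(var.assign_add(sample))  # W <- W + noise
        caches.append(cache)
    return add_ops, caches

# ... then, after compute_gradients: subtract the noise only once the
# (noisy) gradients exist, and apply the update only after that.
# Assumes self._vars is in the same order as the noise caches.
with tf.control_dependencies([g for g in self._grads if g is not None]):
    remove_noise_ops = [var.assign_sub(cache)
                        for var, cache in zip(self._vars, noise_caches)]
with tf.control_dependencies(remove_noise_ops):
    grad_updates = self._opt.apply_gradients(gradients,
                                             global_step=global_step,
                                             name="train")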
Can anyone shed some light on this?
Thanks :)
Best Answer
To tackle this, I would build two graphs: one for training and another for evaluation. The latter would not add noise to the weights. To add random noise to the weights, you can do the following:
W = tf.Variable(tf.truncated_normal([num_hidden,
                                     num_classes],
                                    stddev=0.1))
noise = tf.truncated_normal([num_hidden, num_classes],
                            stddev=0.001)
W = W + noise
The tf.truncated_normal tensor will add a small amount of random noise to your weights.

Original question on Stack Overflow: https://stackoverflow.com/questions/39515760/
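A minimal sketch of that two-graph idea, assuming variable sharing via tf.get_variable (the build_logits function, the "model" scope name and the train/eval input tensors are my own placeholders):

def build_logits(inputs, add_noise, reuse=None):
    # The weights live in one shared scope; only the training graph
    # perturbs them with noise before the affine projection.
    with tf.variable_scope("model", reuse=reuse):
        W = tf.get_variable("W", [num_hidden, num_classes],
                            initializer=tf.truncated_normal_initializer(stddev=0.1))
        b = tf.get_variable("b", [num_classes],
                            initializer=tf.constant_initializer(0.))
        if add_noise:
            W = W + tf.truncated_normal([num_hidden, num_classes],
                                        stddev=0.001)
        return tf.matmul(inputs, W) + b

train_logits = build_logits(train_outputs, add_noise=True)
eval_logits = build_logits(eval_outputs, add_noise=False, reuse=True)

Since tf.truncated_normal is re-sampled on every session run, the training graph sees fresh noise at each step, while the evaluation graph always reads the clean weights.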