TensorFlow: parameters are not updated during training

Problem description

I'm implementing a classification model using TensorFlow

The problem that I'm facing is that my weights and error are not being updated when I run the training step. As a result, my network keeps returning the same results.

I developed my model based on the MNIST example on the TensorFlow website.

import numpy as np
import tensorflow as tf
sess = tf.InteractiveSession()

#load dataset
dataset = np.loadtxt('char8k.txt', dtype='float', comments='#', delimiter=",")
Y = np.asmatrix( dataset[:,0] )
X = np.asmatrix( dataset[:,1:1201] )

m = 11527
labels = 26

# Y is converted to a one-hot matrix of shape 11527x26
Yt = np.zeros((m,labels))

for i in range(0,m):
    index = int(Y[0,i]) - 1  # labels are loaded as floats; cast to int for indexing
    Yt[i,index]= 1

Y = Yt
Y = np.asmatrix(Y)

#------------------------------------------------------------------------------

#graph settings

x = tf.placeholder(tf.float32, shape=[None, 1200])
y_ = tf.placeholder(tf.float32, shape=[None, 26])


Wtest = tf.Variable(tf.truncated_normal([1200,26], stddev=0.001))
W = tf.Variable(tf.truncated_normal([1200,26], stddev=0.001))
b = tf.Variable(tf.zeros([26]))
sess.run(tf.initialize_all_variables())

y = tf.nn.softmax(tf.matmul(x,W) + b)

cross_entropy = -tf.reduce_sum(y_*tf.log(y))

train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)
Wtest = W


for i in range(10):
  print("iteration:")
  print(i)
  Xbatch = X[np.random.randint(X.shape[0],size=100),:]
  Ybatch = Y[np.random.randint(Y.shape[0],size=100),:]
  train_step.run(feed_dict={x: Xbatch, y_: Ybatch})
  print("weight update")
  print(Wtest==W)  # monitors the weight update

  correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
  accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
  print("accuracy:")
  print(accuracy.eval(feed_dict={x: X, y_: Y}))
  print(" ")
  print(" ")

Recommended answer

The issue probably arises from how you initialize the weight matrix, W. If it is initialized to all zeroes, all of the neurons will follow the same gradient in each step, which leads to the network not training. Replacing the line

W = tf.Variable(tf.zeros([1200,26]))

...with something like

W = tf.Variable(tf.truncated_normal([1200,26], stddev=0.001))

...should cause it to start training.
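One way to confirm that training has actually started is to compare a snapshot of the weight values taken before a step with the values after it. The sketch below is a NumPy stand-in for the question's softmax model, not TensorFlow code; the synthetic data, the sizes, and names such as `W_before` and `changed_each_step` are assumptions made for illustration. It copies the values instead of binding a second name to the same object, because an alias such as `Wtest = W` refers to the very thing it is compared against. It also draws one set of random indices per batch and uses it for both inputs and labels, whereas the two independent `np.random.randint` calls in the question pair each input row with an unrelated label.

```python
import numpy as np

rng = np.random.RandomState(0)

# Hypothetical stand-in data: 200 samples, 20 features, 4 classes.
n_samples, n_features, n_classes = 200, 20, 4
X = rng.rand(n_samples, n_features)
labels = rng.randint(0, n_classes, size=n_samples)
Y = np.eye(n_classes)[labels]                  # one-hot labels

# Small random initial weights, zero biases, as in the question.
W = rng.normal(scale=0.001, size=(n_features, n_classes))
b = np.zeros(n_classes)


def softmax(z):
    z = z - z.max(axis=1, keepdims=True)       # stabilise before exponentiating
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


changed_each_step = []
for step in range(10):
    idx = rng.randint(n_samples, size=50)      # one index set for inputs AND labels
    Xb, Yb = X[idx], Y[idx]

    W_before = W.copy()                        # snapshot of the values, not an alias

    # Gradient of the summed cross-entropy with respect to the logits.
    grad_logits = softmax(Xb @ W + b) - Yb
    W -= 0.01 * (Xb.T @ grad_logits)           # in-place gradient descent step
    b -= 0.01 * grad_logits.sum(axis=0)

    changed_each_step.append(not np.array_equal(W_before, W))

print(changed_each_step)
```

In TensorFlow 1.x, the array returned by `sess.run(W)` before and after `train_step.run(...)` could play the role of the snapshot.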

This question on the CrossValidated site has a good explanation of why you should not initialize all of your weights to zero.
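The symmetry argument can be checked numerically. The sketch below is a minimal NumPy illustration, assuming a hypothetical two-layer network (sigmoid hidden layer, softmax output) with made-up sizes and random data; the question's model has only a single layer, so this is not the questioner's code. It computes the gradients by hand and tests whether every hidden unit receives the same gradient.

```python
import numpy as np


def hidden_layer_gradients(W1, W2, x, y):
    """Return (dW1, dW2) of the mean cross-entropy for a tiny
    sigmoid-hidden / softmax-output network without biases."""
    h = 1.0 / (1.0 + np.exp(-(x @ W1)))            # hidden activations
    logits = h @ W2
    logits = logits - logits.max(axis=1, keepdims=True)
    p = np.exp(logits)
    p = p / p.sum(axis=1, keepdims=True)           # softmax probabilities
    delta_out = (p - y) / x.shape[0]               # gradient w.r.t. logits
    dW2 = h.T @ delta_out
    delta_hidden = (delta_out @ W2.T) * h * (1.0 - h)
    dW1 = x.T @ delta_hidden
    return dW1, dW2


rng = np.random.RandomState(0)
x = rng.rand(8, 5)                                 # 8 samples, 5 features (assumed)
y = np.eye(3)[rng.randint(0, 3, size=8)]           # one-hot labels, 3 classes

# All-zero initialisation: every hidden unit gets an identical gradient.
dW1_zero, dW2_zero = hidden_layer_gradients(
    np.zeros((5, 4)), np.zeros((4, 3)), x, y)
zero_init_symmetric = bool(
    np.allclose(dW1_zero, dW1_zero[:, :1])         # columns = hidden units
    and np.allclose(dW2_zero, dW2_zero[:1, :]))    # rows = hidden units

# Random initialisation: the hidden units receive different gradients.
W1_rand = rng.normal(scale=0.5, size=(5, 4))
W2_rand = rng.normal(scale=0.5, size=(4, 3))
dW1_rand, dW2_rand = hidden_layer_gradients(W1_rand, W2_rand, x, y)
random_init_symmetric = bool(np.allclose(dW1_rand, dW1_rand[:, :1]))

print("zero init symmetric:", zero_init_symmetric)
print("random init symmetric:", random_init_symmetric)
```

Because identical gradients keep identical weights identical after every update, the hidden units in the zero-initialised case can never become different from one another.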
