Problem Description
Hello, this is my first time working with TensorFlow. I tried to adapt the example here, TensorFlow-Examples, to use this code for a regression problem on the Boston housing dataset. Basically, I only changed the cost function, the dataset, the number of inputs, and the number of targets, but when I run it the MLP doesn't converge (I use a very low learning rate). I tested it with the Adam optimizer and with gradient descent, but the behavior is the same. I appreciate your suggestions and ideas!
Observation: when I run the program without the modifications described above, the cost function always decreases.
Here is the evolution when I run the model; the cost function oscillates even with a very low learning rate. In the worst case, I would expect the model to converge to a value: for example, epoch 944 shows a cost of 0.226754842, and if no better value is found afterwards, that value should hold until the optimization finishes.
Epoch: 0942 cost= 0.445707272
Epoch: 0943 cost= 0.389314095
Epoch: 0944 cost= 0.226754842
Epoch: 0945 cost= 0.404150135
Epoch: 0946 cost= 0.382190095
Epoch: 0947 cost= 0.897880572
Epoch: 0948 cost= 0.481954243
Epoch: 0949 cost= 0.269408980
Epoch: 0950 cost= 0.427961614
Epoch: 0951 cost= 1.206053280
Epoch: 0952 cost= 0.834200084
from __future__ import print_function
# Import MNIST data
#from tensorflow.examples.tutorials.mnist import input_data
#mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)
import tensorflow as tf
import ToolInputData as input_data
ALL_DATA_FILE_NAME = "boston_normalized.csv"
## Load the complete database, then split it into training, validation and test sets
completedDatabase = input_data.Databases(databaseFileName=ALL_DATA_FILE_NAME, targetLabel="MEDV",
                                         trainPercentage=0.70, valPercentage=0.20, testPercentage=0.10,
                                         randomState=42, inputdataShuffle=True, batchDataShuffle=True)
# Parameters
learning_rate = 0.0001
training_epochs = 1000
batch_size = 5
display_step = 1
# Network Parameters
n_hidden_1 = 10 # 1st layer number of neurons
n_hidden_2 = 10 # 2nd layer number of neurons
n_input = 13 # number of features of my database
n_classes = 1 # one target value (float)
# tf Graph input
x = tf.placeholder("float", [None, n_input])
y = tf.placeholder("float", [None, n_classes])
# Create model
def multilayer_perceptron(x, weights, biases):
    # Hidden layer with ReLU activation
    layer_1 = tf.add(tf.matmul(x, weights['h1']), biases['b1'])
    layer_1 = tf.nn.relu(layer_1)
    # Hidden layer with ReLU activation
    layer_2 = tf.add(tf.matmul(layer_1, weights['h2']), biases['b2'])
    layer_2 = tf.nn.relu(layer_2)
    # Output layer with linear activation
    out_layer = tf.matmul(layer_2, weights['out']) + biases['out']
    return out_layer
# Store layers weight & bias
weights = {
    'h1': tf.Variable(tf.random_normal([n_input, n_hidden_1])),
    'h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2])),
    'out': tf.Variable(tf.random_normal([n_hidden_2, n_classes]))
}
biases = {
    'b1': tf.Variable(tf.random_normal([n_hidden_1])),
    'b2': tf.Variable(tf.random_normal([n_hidden_2])),
    'out': tf.Variable(tf.random_normal([n_classes]))
}
# Construct model
pred = multilayer_perceptron(x, weights, biases)
# Define loss and optimizer
cost = tf.reduce_mean(tf.square(pred-y))
#cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(pred, y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
# Initializing the variables
init = tf.initialize_all_variables()
# Launch the graph
with tf.Session() as sess:
    sess.run(init)
    # Training cycle
    for epoch in range(training_epochs):
        avg_cost = 0.
        total_batch = int(completedDatabase.train.num_examples / batch_size)
        # Loop over all batches
        for i in range(total_batch):
            batch_x, batch_y = completedDatabase.train.next_batch(batch_size)
            # Run optimization op (backprop) and cost op (to get loss value)
            _, c = sess.run([optimizer, cost], feed_dict={x: batch_x, y: batch_y})
            # Compute average loss
            avg_cost += c / total_batch
        # Display logs per epoch step
        if epoch % display_step == 0:
            print("Epoch:", '%04d' % (epoch + 1), "cost=", "{:.9f}".format(avg_cost))
    print("Optimization Finished!")
Recommended Answer
You stated that your labels are in the range [0, 1], but I cannot see that the predictions are in the same range. To make them comparable to the labels, you should transform them to the same range before returning, for example using the sigmoid function:
out_layer = tf.matmul(...)
out = tf.sigmoid(out_layer)
return out
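Applied to the model function from the question, the change would look like this; a minimal sketch in which only the output layer differs from the original code:

def multilayer_perceptron(x, weights, biases):
    # Hidden layer with ReLU activation
    layer_1 = tf.nn.relu(tf.add(tf.matmul(x, weights['h1']), biases['b1']))
    # Hidden layer with ReLU activation
    layer_2 = tf.nn.relu(tf.add(tf.matmul(layer_1, weights['h2']), biases['b2']))
    # Squash the output into [0, 1] so predictions live in the same range as the labels
    out_layer = tf.matmul(layer_2, weights['out']) + biases['out']
    return tf.sigmoid(out_layer)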
Maybe this fixes the stability problem. You might also want to increase the batch size a bit, for example to 20 examples per batch. If that improves performance, you can probably also raise the learning rate a little.
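In terms of the parameters in the question, that suggestion amounts to something like the following; the exact values are starting points to tune, not known-good settings:

batch_size = 20        # larger batches average out noisy per-batch gradients
learning_rate = 0.001  # raise the rate only once training is stable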