Tensorflow: Simple linear regression using CSV data

Question

I am a complete beginner with TensorFlow, and I have been tasked with doing a simple linear regression on my CSV data, which contains two columns, Height and State of Charge (SoC); both values are floats. In the CSV file, Height is the first column and SoC is the second.

I am supposed to predict SoC from Height.

I'm completely lost as to what I have to add in the "Fit all training data" portion of the code. I've looked at other linear regression models and their code is mind-boggling, such as this one:

with tf.Session() as sess:
    sess.run(init)
    for epoch in range(training_epochs):
        sess.run(training_step, feed_dict={X: train_x, Y: train_y})
        cost_history = np.append(cost_history, sess.run(cost, feed_dict={X: train_x, Y: train_y}))

    # calculate mean square error
    pred_y = sess.run(y_, feed_dict={X: test_x})
    mse = tf.reduce_mean(tf.square(pred_y - test_y))
    print("MSE: %.4f" % sess.run(mse))

# plot cost
plt.plot(range(len(cost_history)), cost_history)
plt.axis([0, training_epochs, 0, np.max(cost_history)])
plt.show()

fig, ax = plt.subplots()
ax.scatter(test_y, pred_y)
ax.plot([test_y.min(), test_y.max()], [test_y.min(), test_y.max()], 'k--', lw=3)
ax.set_xlabel('Measured')
ax.set_ylabel('Predicted')
plt.show()

Using this guide, I have just been able to read the data from my CSV file without error:

Full code:

import tensorflow as tf
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
rng = np.random

from numpy import genfromtxt
from sklearn.datasets import load_boston

# Parameters
learning_rate = 0.01
training_epochs = 1000
display_step = 50
n_samples = 221

X = tf.placeholder("float") # create symbolic variables
Y = tf.placeholder("float")

filename_queue = tf.train.string_input_producer(["battdata.csv"],shuffle=False)

reader = tf.TextLineReader(skip_header_lines=1)
key, value = reader.read(filename_queue)

# Default values, in case of empty columns. Also specifies the type of the
# decoded result.
record_defaults = [[1.], [1.]]
col1, col2= tf.decode_csv(
    value, record_defaults=record_defaults)
features = tf.stack([col1])

# Set model weights
W = tf.Variable(rng.randn(), name="weight")
b = tf.Variable(rng.randn(), name="bias")

# Construct a linear model
pred = tf.add(tf.multiply(col1, W), b) # XW + b <- y = mx + b  where W is gradient, b is intercept

# Mean squared error
cost = tf.reduce_sum(tf.pow(pred-col2, 2))/(2*n_samples)

# Gradient descent
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

# Initializing the variables
init = tf.global_variables_initializer()

with tf.Session() as sess:
    # Start populating the filename queue.
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)
    sess.run(init)

    # Fit all training data
    for epoch in range(training_epochs):
        _, cost_value = sess.run([optimizer,cost])
        for (x, y) in zip(col2, col1):
                sess.run(optimizer, feed_dict={X: x, Y: y})

            #Display logs per epoch step
        if (epoch+1) % display_step == 0:
            c = sess.run(cost, feed_dict={X: col2, Y:col1})
            print( "Epoch:", '%04d' % (epoch+1), "cost=", "{:.9f}".format(c), \
                "W=", sess.run(W), "b=", sess.run(b))

        print("Optimization Finished!")
        training_cost = sess.run(cost, feed_dict={X: col2, Y: col1})
        print ("Training cost=", training_cost, "W=", sess.run(W), "b=", sess.run(b), '\n')

        #Graphic display
        plt.plot(train_X, train_Y, 'ro', label='Original data')
        plt.plot(train_X, sess.run(W) * col2 + sess.run(b), label='Fitted line')
        plt.legend()
        plt.show()

    coord.request_stop()
    coord.join(threads)

Error:

C:\Users\Shiina\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\framework\ops.py in __iter__(self)
    514       TypeError: when invoked.
    515     """
--> 516     raise TypeError("'Tensor' object is not iterable.")
    517
    518   def __bool__(self):

TypeError: 'Tensor' object is not iterable.

Recommended answer

The error occurs because you are trying to iterate over tensors in for (x, y) in zip(col2, col1), which is not allowed: col1 and col2 are symbolic tensors that only produce values when they are evaluated through sess.run. The other issue with the code is that you have set up input pipeline queues and are then also trying to feed data in through feed_dict, which is wrong. Your training part should look something like this:

with tf.Session() as sess:
    # Start populating the filename queue.
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)
    sess.run(init)

    # Fit all training data. Each sess.run call pulls the next record
    # from the input queue, so no feed_dict is needed.
    for epoch in range(training_epochs):
        _, cost_value = sess.run([optimizer, cost])

        # Display logs per epoch step
        if (epoch+1) % display_step == 0:
            c = sess.run(cost)
            print("Epoch:", '%04d' % (epoch+1), "cost=", "{:.9f}".format(c),
                  "W=", sess.run(W), "b=", sess.run(b))

    print("Optimization Finished!")
    training_cost = sess.run(cost)
    print("Training cost=", training_cost, "W=", sess.run(W), "b=", sess.run(b), '\n')

    # Plot data after completing training
    train_X = []
    train_Y = []
    for i in range(input_size):  # Your input data size to loop through once
        # Call pred to get the prediction with the updated weights.
        # Lowercase names avoid overwriting the X and Y placeholders.
        x, y = sess.run([col1, pred])
        train_X.append(x)
        train_Y.append(y)

    # Graphic display
    plt.plot(train_X, train_Y, 'ro', label='Fitted values')
    plt.legend()
    plt.show()

    coord.request_stop()
    coord.join(threads)
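
As a cross-check on what the graph above should converge to, the same model and cost can be sketched in plain NumPy, outside TensorFlow. This is only an illustrative sketch and not part of the original answer: the inline CSV text is made-up data standing in for battdata.csv, and the learning rate and epoch count were chosen for that toy data. It minimizes the same cost, sum((pred - SoC)^2) / (2 * n_samples), with full-batch gradient descent.

```python
import io
import numpy as np

# Hypothetical stand-in for battdata.csv: a header row, then Height,SoC pairs.
# The values are made up for illustration (SoC = 2 * Height + 1 exactly).
csv_text = "Height,SoC\n0.0,1.0\n0.25,1.5\n0.5,2.0\n0.75,2.5\n1.0,3.0\n"

# For the real file, pass the path "battdata.csv" instead of the StringIO object.
data = np.genfromtxt(io.StringIO(csv_text), delimiter=",", skip_header=1)
height, soc = data[:, 0], data[:, 1]
n_samples = len(height)

# Assumed hyperparameters, tuned for the toy data above.
learning_rate = 0.1
training_epochs = 5000
W, b = 0.0, 0.0

for epoch in range(training_epochs):
    pred = W * height + b          # linear model: pred = W * Height + b
    error = pred - soc
    # Gradients of cost = sum(error ** 2) / (2 * n_samples)
    grad_W = np.sum(error * height) / n_samples
    grad_b = np.sum(error) / n_samples
    W -= learning_rate * grad_W
    b -= learning_rate * grad_b

cost = np.sum((W * height + b - soc) ** 2) / (2 * n_samples)
print("W=%.4f b=%.4f cost=%.9f" % (W, b, cost))
```

Unlike the queue-based graph, which reads one record per sess.run call, this sketch uses every row in each update, so the two will not follow the same path during training. With real data, the learning rate may need to be lowered if Height is not on a small scale.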


08-11 17:09