Below is a simple linear regression / ML snippet I adapted. For every initial weight and bias I try (e.g. weight = np.array([0.03, 0.04, 0.02]), bias = 0.01), training blows up (it just won't converge).
I'm not sure whether there is a bug in the code, or how to pick good initial values (weights and bias) so that it converges.
# Adapted from http://ml-cheatsheet.readthedocs.io/en/latest/linear_regression.html
import numpy as np
from numpy import genfromtxt

def predict(X, weight, bias):
    return np.dot(X, weight) + bias

def cost_function(X, Y, weight, bias):
    # Mean squared error over all rows (companies) in the dataset
    companies = X.shape[0]
    return np.sum((predict(X, weight, bias) - Y) ** 2) / companies

def update_weights(X, Y, weight, bias, learning_rate):
    companies = X.shape[0]
    # Vectorized partial derivatives of the summed squared error:
    #   dW[j] = sum_i 2 * X[i, j] * (prediction_i - Y[i])
    #   db    = sum_i 2 * (prediction_i - Y[i])
    dW = 2 * np.dot(X.T, predict(X, weight, bias) - Y)
    db = 2 * np.sum(predict(X, weight, bias) - Y)
    # Subtract because the derivatives point in the direction of steepest ascent
    return weight - (dW / companies) * learning_rate, bias - (db / companies) * learning_rate

def train(X, Y, weight, bias, learning_rate, iters):
    cost_history = []
    for i in range(iters):
        weight, bias = update_weights(X, Y, weight, bias, learning_rate)
        # Calculate cost for auditing purposes
        cost = cost_function(X, Y, weight, bias)
        cost_history.append(cost)
        # Log progress
        if i % 100 == 0:
            print("iter: " + str(i) + " cost: " + str(cost) + "\n")
    return weight, bias, cost_history

# Advertising.csv is from http://www-bcf.usc.edu/~gareth/ISL/Advertising.csv
if __name__ == "__main__":
    my_data = genfromtxt('Advertising.csv', delimiter=',')
    X = my_data[1:, 1:4]   # TV, radio and newspaper columns
    Y = my_data[1:, 4]     # sales
    a, b, _ = train(X, Y, np.array([0.03, 0.04, 0.02]), 0.01, 0.001, 1000)
The problem is that no matter what values I use for the initial weight and bias (i.e. weight = np.array([0.03, 0.04, 0.02]), bias = 0.01), the values blow up. It just won't converge.
train(X, Y, weight, bias, 0.001, 1000)
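For reference, the closed-form least-squares solution gives a baseline for what the weights should converge to. This is a minimal sketch (not part of the original script), assuming the same Advertising.csv layout as above; np.linalg.lstsq is standard NumPy:

import numpy as np
from numpy import genfromtxt

# Build the design matrix A = [X | 1] so the last coefficient is the bias.
my_data = genfromtxt('Advertising.csv', delimiter=',')
X = my_data[1:, 1:4]
Y = my_data[1:, 4]
A = np.hstack([X, np.ones((X.shape[0], 1))])

# Solve the least-squares problem directly; coef[:3] are the weights,
# coef[3] is the bias.
coef, residuals, rank, _ = np.linalg.lstsq(A, Y, rcond=None)
print("weights:", coef[:3], "bias:", coef[3])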
Update 1
When I run the original snippet, I get:
$ python linearRegression_multi.py
iter: 0 cost: 212337.75728564826
/Users/joe/anaconda3/lib/python3.6/site-packages/numpy/core/_methods.py:32: RuntimeWarning: overflow encountered in reduce
return umr_sum(a, axis, dtype, out, keepdims)
linearRegression_multi.py:11: RuntimeWarning: overflow encountered in square
return np.sum((predict(X, weight, bias) - Y) **2) / companies
iter: 100 cost: inf
linearRegression_multi.py:34: RuntimeWarning: invalid value encountered in subtract
return weight - dW * learning_rate / companies , bias - db * learning_rate / companies
iter: 200 cost: nan
iter: 300 cost: nan
iter: 400 cost: nan
iter: 500 cost: nan
iter: 600 cost: nan
iter: 700 cost: nan
iter: 800 cost: nan
iter: 900 cost: nan
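To see the scale of the blow-up in that log, it helps to inspect the very first gradient step. A short diagnostic sketch, assuming the X, Y and predict defined above (variable names here are illustrative):

import numpy as np

# Magnitude of the first weight update with learning_rate = 0.001.
weight = np.array([0.03, 0.04, 0.02])
bias = 0.01
error = predict(X, weight, bias) - Y
dW = 2 * np.dot(X.T, error)
step = (dW / X.shape[0]) * 0.001
print("first weight step:", step)

The raw TV column runs into the hundreds, so even a modest prediction error produces a weight step of order 1, which overshoots the optimum and then amplifies the error on every iteration until it overflows.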
Best Answer
Found the cause of the problem! The learning rate in this case was too high.
Changing it to 0.00001 works. That is, replace the last line of the original snippet with the line below and it converges.
a,b, _ = train(X, Y, np.array([0.03, 0.04, 0.02]), 0.01, 0.00001, 1000)
The output is as follows:
$ python te.py
iter: 0 cost: 23.07411798374272
iter: 100 cost: 6.479930413738248
iter: 200 cost: 5.097751463999494
iter: 300 cost: 4.528064099014893
iter: 400 cost: 4.263917598438141
iter: 500 cost: 4.1398851132621655
iter: 600 cost: 4.081383875535448
iter: 700 cost: 4.053584811192947
iter: 800 cost: 4.040172367398533
iter: 900 cost: 4.033501506011401
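As an aside, rather than shrinking the learning rate, a common alternative is to standardize the features so that a single step size suits all gradient components. A minimal sketch under that assumption, reusing the X, Y and train defined above (X_std, means and stds are names introduced here for illustration):

import numpy as np

# Scale each feature column to zero mean and unit variance.
means = X.mean(axis=0)
stds = X.std(axis=0)
X_std = (X - means) / stds

# With standardized inputs, the original learning rate of 0.001 should
# decrease the cost steadily instead of overflowing.
a, b, _ = train(X_std, Y, np.array([0.03, 0.04, 0.02]), 0.01, 0.001, 1000)

The trade-off is that the fitted weights then apply to the scaled features, so new inputs must be transformed with the same means and stds before calling predict.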
Regarding "python - Can't get multivariate linear regression to converge", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/48981687/