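Below is a simple linear regression / ML script that I adapted. For every initial weight and bias I try (e.g. weight = np.array([0.03, 0.04, 0.02]), bias = 0.01), training blows up (it simply does not converge).
I don't know whether there is a bug in the code, or how to choose good initial values (weights and bias) so that it converges.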

# Adapted from http://ml-cheatsheet.readthedocs.io/en/latest/linear_regression.html
import numpy as np
from numpy import genfromtxt


def predict(X, weight, bias):
    return np.dot(X, weight) + bias

def cost_function(X, Y, weight, bias):
    companies = X.shape[0]
    return np.sum((predict(X, weight, bias) - Y) **2) / companies



def update_weights(X, Y, weight, bias, learning_rate):
    companies = X.shape[0]

    dW = 2 * np.dot(X.T,  predict(X, weight, bias) - Y)
    db = 2 * np.sum(predict(X, weight, bias) - Y)
    """
    for i in range(companies):
        # Calculate partial derivatives
        # -2x(y - (mx + b))
        dw += -2*X[i] * (sales[i] - (weight*X[i] + bias))

        # -2(y - (mx + b))
        db += -2*(sales[i] - (weight*X[i] + bias))
    """
    #print(dW, db)
    # We subtract because the derivatives point in direction of steepest ascent
    #weight -= (dW / companies) * learning_rate
    #bias -= (db / companies) * learning_rate

    return weight - (dW / companies) * learning_rate, bias - (db / companies) * learning_rate

def train(X, Y, weight, bias, learning_rate, iters):
    cost_history = []

    for i in range(iters):
        weight,bias = update_weights(X, Y, weight, bias, learning_rate)

        #Calculate cost for auditing purposes
        cost = cost_function(X, Y, weight, bias)
        cost_history.append(cost)

        # Log Progress
        if i % 100 == 0:
            print ("iter: "+str(i) + " cost: "+str(cost) + "\n")

    return weight, bias, cost_history

#the Advertising.csv is from http://www-bcf.usc.edu/~gareth/ISL/Advertising.csv
if __name__ == "__main__":
    my_data = genfromtxt('Advertising.csv', delimiter=',')
    X = my_data[1:, 1:4:1]
    Y = my_data[1:, 4]  # the sales
    a,b, _ = train(X, Y, np.array([0.03, 0.04, 0.02]), 0.01, 0.001, 1000)

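The problem is that whatever values I use for the initial weights and bias (e.g. weight = np.array([0.03, 0.04, 0.02]), bias = 0.01), the values blow up.
Training simply does not converge.
train(X, Y, weight, bias, 0.001, 1000)
Update 1
When I run the code snippet above, I get: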
$ python linearRegression_multi.py
iter: 0 cost: 212337.75728564826

/Users/joe/anaconda3/lib/python3.6/site-packages/numpy/core/_methods.py:32: RuntimeWarning: overflow encountered in reduce
  return umr_sum(a, axis, dtype, out, keepdims)
linearRegression_multi.py:11: RuntimeWarning: overflow encountered in square
  return np.sum((predict(X, weight, bias) - Y) **2) / companies
iter: 100 cost: inf

linearRegression_multi.py:34: RuntimeWarning: invalid value encountered in subtract
  return weight - dW * learning_rate / companies , bias - db * learning_rate / companies
iter: 200 cost: nan

iter: 300 cost: nan

iter: 400 cost: nan

iter: 500 cost: nan

iter: 600 cost: nan

iter: 700 cost: nan

iter: 800 cost: nan

iter: 900 cost: nan
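
One way to check whether the gradient code itself is buggy is a finite-difference check. The following is only a minimal sketch and not part of the original post: predict and cost_function are repeated from the script above, while analytic_gradients, numeric_gradients and the random demo data are hypothetical names introduced for illustration. It compares the same gradient expressions used in update_weights against central differences of the cost.

```python
import numpy as np

def predict(X, weight, bias):
    return np.dot(X, weight) + bias

def cost_function(X, Y, weight, bias):
    return np.sum((predict(X, weight, bias) - Y) ** 2) / X.shape[0]

def analytic_gradients(X, Y, weight, bias):
    # Same expressions as in update_weights, already divided by the sample count
    n = X.shape[0]
    residual = predict(X, weight, bias) - Y
    return 2 * np.dot(X.T, residual) / n, 2 * np.sum(residual) / n

def numeric_gradients(X, Y, weight, bias, eps=1e-6):
    # Central differences on the cost, one parameter at a time
    dW = np.zeros_like(weight)
    for j in range(weight.size):
        step = np.zeros_like(weight)
        step[j] = eps
        dW[j] = (cost_function(X, Y, weight + step, bias)
                 - cost_function(X, Y, weight - step, bias)) / (2 * eps)
    db = (cost_function(X, Y, weight, bias + eps)
          - cost_function(X, Y, weight, bias - eps)) / (2 * eps)
    return dW, db

# Random stand-in data (an assumption; NOT the Advertising.csv values)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 3))
Y_demo = rng.normal(size=50)
w0, b0 = np.array([0.03, 0.04, 0.02]), 0.01

dW_a, db_a = analytic_gradients(X_demo, Y_demo, w0, b0)
dW_n, db_n = numeric_gradients(X_demo, Y_demo, w0, b0)
gradients_match = bool(np.allclose(dW_a, dW_n, atol=1e-5)
                       and np.isclose(db_a, db_n, atol=1e-5))
print("gradients match:", gradients_match)
```

If the two sets of gradients agree, the divergence is more likely caused by the step size than by the derivative formulas.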

Best answer

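Found the cause of the problem! The learning rate is too high for this case.
Changing it from 0.001 to 0.00001 works. That is, replace the last line of the original snippet with the line below and training converges.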

a,b, _ = train(X, Y, np.array([0.03, 0.04, 0.02]), 0.01, 0.00001, 1000)

The output is as follows:
python te.py
iter: 0 cost: 23.07411798374272

iter: 100 cost: 6.479930413738248

iter: 200 cost: 5.097751463999494

iter: 300 cost: 4.528064099014893

iter: 400 cost: 4.263917598438141

iter: 500 cost: 4.1398851132621655

iter: 600 cost: 4.081383875535448

iter: 700 cost: 4.053584811192947

iter: 800 cost: 4.040172367398533

iter: 900 cost: 4.033501506011401
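
Why a smaller learning rate helps can be made concrete. For a mean-squared-error cost, plain gradient descent is stable only when the learning rate is below 2 divided by the largest eigenvalue of the Hessian, and unscaled features with values in the hundreds make that eigenvalue large. The sketch below is an illustration under assumptions: max_stable_learning_rate is a hypothetical helper, and X_demo is synthetic data with column ranges loosely chosen to resemble advertising budgets, not the real Advertising.csv.

```python
import numpy as np

def max_stable_learning_rate(X):
    """Upper bound on the step size for gradient descent on mean squared error.

    With the bias folded in as a column of ones, the cost is quadratic with
    Hessian H = (2 / n) * A.T @ A, and the update used in update_weights
    converges only if learning_rate < 2 / lambda_max(H).
    """
    n = X.shape[0]
    A = np.hstack([X, np.ones((n, 1))])
    hessian = 2.0 / n * A.T @ A
    lambda_max = np.linalg.eigvalsh(hessian)[-1]  # eigvalsh sorts ascending
    return 2.0 / lambda_max

# Synthetic stand-in for three budget columns (an assumption, not the real data)
rng = np.random.default_rng(0)
X_demo = rng.uniform(low=0.0, high=[300.0, 50.0, 100.0], size=(200, 3))

bound = max_stable_learning_rate(X_demo)
print("largest stable learning rate:", bound)
print("0.001 is stable:  ", 0.001 < bound)
print("0.00001 is stable:", 0.00001 < bound)
```

On data of this scale the bound falls between the two learning rates discussed above, which is consistent with 0.001 diverging and 0.00001 converging. Running the same helper on the actual X would give the bound for the real dataset.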

For "python - can't make multiple linear regression converge", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/48981687/
