Cost function

  • Formula:

    $$J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$

    where the parameter vector $\theta \in \mathbb{R}^{n+1}$ ($n$ features plus the intercept term $\theta_0$).

  • Vectorized (a one-line Octave version of this form follows the implementation below):

    $$J(\theta) = \frac{1}{2m} (X\theta - \vec{y})^T (X\theta - \vec{y})$$

Octave implementation:

function J = computeCost(X, y, theta)
%COMPUTECOST Compute cost for linear regression
%   J = COMPUTECOST(X, y, theta) computes the cost of using theta as the
%   parameter for linear regression to fit the data points in X and y

m = length(y);  % number of training examples
J = 0;          % you need to return this variable correctly

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta.
%               You should set J to the cost.

prediction = X * theta;         % m x 1 vector of predicted values
sqerror = (prediction - y).^2;  % element-wise squared errors
J = 1/(2*m) * sum(sqerror);     % average squared error, halved

% =========================================================================

end
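
The sum-based code above follows the summation formula; the vectorized form $J(\theta) = \frac{1}{2m}(X\theta - \vec{y})^T(X\theta - \vec{y})$ collapses to a single line. Here is a minimal sketch on a tiny made-up dataset (the numbers are purely illustrative):

% Tiny illustrative dataset (made-up numbers): 3 examples, 1 feature.
X = [1 1; 1 2; 1 3];   % first column of ones is the intercept term
y = [1; 2; 3];
theta = [0; 1];        % a perfect fit here, so the cost should be 0
m = length(y);

% Fully vectorized cost: (1/2m) * (X*theta - y)' * (X*theta - y)
J = (X*theta - y)' * (X*theta - y) / (2*m);
disp(J)                          % prints 0

% Sanity check against the sum-based version above
disp(computeCost(X, y, theta))   % also 0

Both versions return the same value; the vectorized one simply replaces the element-wise square and sum with an inner product.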

Gradient descent for multiple variables

  • Formula:

    $$\text{repeat until convergence:} \quad \theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta)$$

    That is,

    $$\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)} \qquad (\text{simultaneously update for } j = 0, \dots, n)$$
  • Matrix form:

    Gradient descent can be written as

    $$\theta := \theta - \alpha \nabla J(\theta)$$

    where $\nabla J(\theta)$ is the vector of partial derivatives,

    $$\nabla J(\theta) = \begin{bmatrix} \dfrac{\partial J(\theta)}{\partial \theta_0} \\[4pt] \dfrac{\partial J(\theta)}{\partial \theta_1} \\[4pt] \vdots \\[4pt] \dfrac{\partial J(\theta)}{\partial \theta_n} \end{bmatrix}$$

    Each partial derivative works out to

    $$\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$$

    Vectorizing this gives

    $$\nabla J(\theta) = \frac{1}{m} X^T (X\theta - \vec{y})$$

    so the final matrix form of gradient descent is (see the sketch after this list):

    $$\theta := \theta - \frac{\alpha}{m} X^T (X\theta - \vec{y})$$
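
As a quick numerical sanity check (a minimal sketch on made-up numbers, not part of the original exercise), the vectorized gradient $\frac{1}{m} X^T (X\theta - \vec{y})$ matches the element-wise partial derivatives computed one $\theta_j$ at a time:

% Made-up data: 3 examples, 2 features plus intercept (illustrative only)
X = [1 2 3; 1 4 5; 1 6 7];
y = [10; 20; 30];
theta = [0.1; 0.2; 0.3];
m = length(y);

% Vectorized gradient: (1/m) * X' * (X*theta - y)
grad_vec = (1/m) * X' * (X*theta - y);

% Element-wise partial derivatives, one theta_j at a time
grad_loop = zeros(size(theta));
for j = 1:length(theta)
    grad_loop(j) = (1/m) * sum((X*theta - y) .* X(:, j));
end

disp(max(abs(grad_vec - grad_loop)))   % ~0: the two forms agree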

Octave version:

function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
%GRADIENTDESCENT Performs gradient descent to learn theta
%   theta = GRADIENTDESCENT(X, y, theta, alpha, num_iters) updates theta by
%   taking num_iters gradient steps with learning rate alpha

m = length(y);                    % number of training examples
J_history = zeros(num_iters, 1);

for iter = 1:num_iters
    % ====================== YOUR CODE HERE ======================
    % Instructions: Perform a single gradient step on the parameter vector
    %               theta.
    %
    % Hint: While debugging, it can be useful to print out the values
    %       of the cost function (computeCost) and gradient here.

    predictions = X * theta;            % hypothesis values for all examples
    updates = X' * (predictions - y);   % X^T (X*theta - y)
    theta = theta - alpha * (1/m) * updates;

    % ============================================================

    % Save the cost J in every iteration
    J_history(iter) = computeCost(X, y, theta);
end

end
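
Putting the two functions together, a minimal usage sketch (the data, alpha, and num_iters are made-up illustrative values):

% Made-up training data: 4 examples, 1 feature (illustrative only)
data_x = [1; 2; 3; 4];
y = [6; 5; 7; 10];
m = length(y);

X = [ones(m, 1) data_x];   % prepend a column of ones for theta_0
theta = zeros(2, 1);       % start from the zero vector
alpha = 0.01;              % learning rate (would need tuning in practice)
num_iters = 1500;

[theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters);

disp(theta)            % learned parameters
disp(J_history(1))     % cost after the first iteration
disp(J_history(end))   % cost after the last iteration (should be lower)

When alpha is small enough, J_history decreases on every iteration; plotting it against the iteration number is the usual way to check that the learning rate is well chosen.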