ML: Cost Function and Gradient Descent for Multiple Variables (Linear Regression with Multiple Variables)

Cost function

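Formula:

$$J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$

where $h_\theta(x) = \theta^T x$ and the parameter $\theta$ is a vector in $\mathbb{R}^{n+1}$ (equivalently, a matrix in $\mathbb{R}^{(n+1) \times 1}$).

Vectorized:

$$J(\theta) = \frac{1}{2m} (X\theta - y)^T (X\theta - y)$$

Here $X$ is the $m \times (n+1)$ matrix whose $i$-th row is $(x^{(i)})^T$ with $x_0^{(i)} = 1$, and $y$ is the $m \times 1$ vector of targets.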

Octave implementation:

function J = computeCost(X, y, theta)

%COMPUTECOST Compute cost for linear regression

%   J = COMPUTECOST(X, y, theta) computes the cost of using theta as the

%   parameter for linear regression to fit the data points in X and y

% Initialize some useful values

m = length(y); % number of training examples

% You need to return the following variables correctly

J = 0;

% ====================== YOUR CODE HERE ======================

% Instructions: Compute the cost of a particular choice of theta

%               You should set J to the cost.

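prediction = X * theta;            % m x 1 vector of hypothesis values X*theta

sqerror = (prediction - y) .^ 2;   % element-wise squared errors

J = 1 / (2 * m) * sum(sqerror);    % trailing semicolon keeps J from being echoed on every call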

% =========================================================================

end
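
As a quick sanity check of the cost formula, here is a minimal sketch on a toy data set. The data (points lying exactly on y = 1 + 2x) and the names `cost`, `J_zero`, and `J_fit` are assumptions made up for illustration; they are not part of the exercise. The anonymous function uses the vectorized expression, which should give the same value as the sum-of-squares form used in computeCost.

```octave
% Toy data (assumed for illustration): the points lie exactly on y = 1 + 2*x.
x = [1; 2; 3];
y = [3; 5; 7];
m = length(y);
X = [ones(m, 1), x];          % prepend the intercept column x_0 = 1

% Vectorized cost: J = (X*theta - y)' * (X*theta - y) / (2*m)
cost = @(theta) (X * theta - y)' * (X * theta - y) / (2 * m);

J_zero = cost([0; 0]);        % errors are -y, so J = (9 + 25 + 49) / 6
J_fit  = cost([1; 2]);        % perfect fit, so every error is 0

printf("J([0;0]) = %.4f, J([1;2]) = %.4f\n", J_zero, J_fit);
```

With theta = [0; 0] the squared errors are 9, 25, and 49, so the cost is 83/6; with theta = [1; 2] the line passes through every point and the cost is zero.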

Gradient descent for multiple variables

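Formula:

Repeat until convergence:

$$\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)} \qquad \text{for } j = 0, 1, \dots, n$$

That is, with all $\theta_j$ updated simultaneously,

$$\begin{aligned}
\theta_0 &:= \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_0^{(i)} \\
\theta_1 &:= \theta_1 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_1^{(i)} \\
&\;\;\vdots \\
\theta_n &:= \theta_n - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_n^{(i)}
\end{aligned}$$

Matrix form:

Gradient descent can be written as

$$\theta := \theta - \alpha \nabla J(\theta)$$

where $\nabla J(\theta)$ is the gradient of the cost,

$$\nabla J(\theta) = \begin{bmatrix} \dfrac{\partial J(\theta)}{\partial \theta_0} \\ \dfrac{\partial J(\theta)}{\partial \theta_1} \\ \vdots \\ \dfrac{\partial J(\theta)}{\partial \theta_n} \end{bmatrix}$$

Each partial derivative works out to

$$\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$$

Vectorizing this, with $x_j$ denoting the column of $X$ that holds feature $j$,

$$\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m} x_j^T (X\theta - y), \qquad \nabla J(\theta) = \frac{1}{m} X^T (X\theta - y)$$

So the final matrix version of gradient descent is

$$\theta := \theta - \frac{\alpha}{m} X^T (X\theta - y)$$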
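
One way to gain confidence in the vectorized gradient, (1/m) * X' * (X*theta - y), is to compare it against a finite-difference approximation of the cost. The sketch below does that on toy data; the data, the test point `theta`, the step size `h`, and the names `grad` and `num_grad` are all assumptions chosen for illustration.

```octave
% Toy data (assumed for illustration).
X = [1 1; 1 2; 1 3];          % first column is the intercept term x_0 = 1
y = [3; 5; 7];
m = length(y);
theta = [0.5; -1];            % arbitrary point at which to check the gradient

cost = @(t) sum((X * t - y) .^ 2) / (2 * m);

% Analytic gradient from the vectorized formula.
grad = (1 / m) * X' * (X * theta - y);

% Central-difference approximation of each partial derivative.
h = 1e-6;
num_grad = zeros(size(theta));
for j = 1:numel(theta)
  e = zeros(size(theta));
  e(j) = h;                   % perturb only theta_j
  num_grad(j) = (cost(theta + e) - cost(theta - e)) / (2 * h);
end

printf("max abs difference: %g\n", max(abs(grad - num_grad)));
```

Because the cost is quadratic in theta, the central difference has no truncation error here, so any remaining difference comes from floating-point rounding.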

Octave version:

function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)

%GRADIENTDESCENT Performs gradient descent to learn theta

%   theta = GRADIENTDESCENT(X, y, theta, alpha, num_iters) updates theta by

%   taking num_iters gradient steps with learning rate alpha

% Initialize some useful values

m = length(y); % number of training examples

J_history = zeros(num_iters, 1);

for iter = 1:num_iters

	% ====================== YOUR CODE HERE ======================

	% Instructions: Perform a single gradient step on the parameter vector

	%               theta.

	%

	% Hint: While debugging, it can be useful to print out the values

	%       of the cost function (computeCost) and gradient here.

	%

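	predictions = X * theta;                     % m x 1 vector of hypothesis values

	updates = X' * (predictions - y);            % X' * (X*theta - y), i.e. m times the gradient

	theta = theta - alpha * (1 / m) * updates;   % updates every theta_j simultaneously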

	% ============================================================

	% Save the cost J in every iteration

	J_history(iter) = computeCost(X, y, theta);

end

end
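
To see the update rule in action, here is a self-contained sketch that runs the same vectorized step on a toy data set. The data (points lying exactly on y = 1 + 2x), the learning rate `alpha = 0.1`, and the iteration count `num_iters = 2000` are assumptions picked for illustration, not values from the exercise. The loop is written inline rather than calling gradientDescent so that it runs on its own.

```octave
% Toy data (assumed for illustration): the points lie exactly on y = 1 + 2*x.
x = [1; 2; 3];
y = [3; 5; 7];
m = length(y);
X = [ones(m, 1), x];          % prepend the intercept column x_0 = 1

alpha = 0.1;                  % learning rate (assumed value)
num_iters = 2000;             % number of gradient steps (assumed value)
theta = zeros(2, 1);
J_history = zeros(num_iters, 1);

for iter = 1:num_iters
  % Vectorized update: theta := theta - (alpha/m) * X' * (X*theta - y)
  theta = theta - (alpha / m) * (X' * (X * theta - y));
  % Cost after the update, as in gradientDescent above.
  J_history(iter) = sum((X * theta - y) .^ 2) / (2 * m);
end

printf("theta = [%.4f; %.4f], final cost = %g\n", theta(1), theta(2), J_history(end));
```

With these settings theta should approach [1; 2] and J_history should decrease toward zero. If the cost grows instead, the learning rate is likely too large for the data at hand.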