regularization
overfitting
- Reduce the number of features:
    - manually select which features to keep
    - use a model selection algorithm
- Regularization:
    - keep all the features, but reduce the magnitude of the parameters $\theta$
Regularization works well when we have a lot of features, each of which contributes a little to predicting $y$.
cost function
Suppose we want $\theta_3,\theta_4$ to be really small; consider adding large penalty terms (1000 is just an arbitrary large constant):

$$\min_\theta\ \frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)^2 + 1000\,\theta_3^2 + 1000\,\theta_4^2$$

Then $\theta_3,\theta_4$ will $\to 0$, and the hypothesis becomes less prone to overfitting.
regularization:

$$J(\theta) = \frac{1}{2m}\left[\sum_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)^2 + \lambda\sum_{j=1}^{n}\theta_j^2\right]$$

where the added term $\lambda\sum_{j=1}^{n}\theta_j^2$ is the regularization term and $\lambda$ is the regularization parameter. Using the above cost function with the extra summation, we can smooth the output of our hypothesis function to reduce overfitting. If $\lambda$ is chosen to be too large, it may smooth out the function too much and cause underfitting.
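A minimal MATLAB sketch of this cost (the function name `linearCostReg` is hypothetical; it assumes `X` is the $m \times (n+1)$ design matrix whose first column is all ones):

```matlab
function J = linearCostReg(theta, X, y, lambda)
% Regularized linear regression cost (sketch).
% X: m x (n+1) design matrix with a leading column of ones
% y: m x 1 targets, theta: (n+1) x 1 parameters
m = length(y);
h = X * theta;                                      % h_theta(x) for every example
J = (1 / (2 * m)) * sum((h - y) .^ 2) ...
    + (lambda / (2 * m)) * sum(theta(2:end) .^ 2); % the penalty skips theta_0
end
```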
regularized linear regression
We will separate out $\theta_0$, since it is not penalized. Gradient descent becomes:

$$\theta_0 := \theta_0 - \alpha\,\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)x_0^{(i)}$$

$$\theta_j := \theta_j - \alpha\left[\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)x_j^{(i)} + \frac{\lambda}{m}\theta_j\right] \qquad j \in \{1,2,\dots,n\}$$

The update rule for $j \ge 1$ can also be represented as:

$$\theta_j := \theta_j\left(1 - \alpha\frac{\lambda}{m}\right) - \alpha\,\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)x_j^{(i)}$$

Since $1 - \alpha\frac{\lambda}{m} < 1$, every iteration shrinks $\theta_j$ slightly before performing the usual gradient step.

The normal equation:

$$\theta = \left(X^TX + \lambda L\right)^{-1}X^Ty, \qquad L = \mathrm{diag}(0, 1, 1, \dots, 1)$$

where $L$ is $(n+1)\times(n+1)$; with $\lambda > 0$, the matrix $X^TX + \lambda L$ is always invertible.
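A sketch of both routes in MATLAB, assuming `alpha` (learning rate) and `num_iters` are already defined and `X` again carries a leading column of ones:

```matlab
% Gradient descent for regularized linear regression (sketch)
m = length(y);
for iter = 1:num_iters
    h = X * theta;
    grad = (1 / m) * (X' * (h - y));                          % unregularized gradient
    grad(2:end) = grad(2:end) + (lambda / m) * theta(2:end);  % no penalty on theta_0
    theta = theta - alpha * grad;
end

% Normal equation with regularization: L = diag(0, 1, ..., 1)
L = eye(size(X, 2));
L(1, 1) = 0;
theta = (X' * X + lambda * L) \ (X' * y);
```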
regularized logistic regression
The cost function:

$$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log h_\theta(x^{(i)}) + (1-y^{(i)})\log\left(1-h_\theta(x^{(i)})\right)\right] + \frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2$$

Gradient descent takes a similar form to linear regression, which excludes $\theta_0$ from the penalty:

$$\theta_j := \theta_j - \alpha\left[\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)x_j^{(i)} + \frac{\lambda}{m}\theta_j\right] \qquad j \in \{1,2,\dots,n\}$$

Note: the regularization sum starts at $j = 1$, so $\theta_0$ is never penalized.
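A sketch of `costFunctionReg` matching the fminunc call below (the inlined sigmoid is an assumption; substitute your own helper if you have one):

```matlab
function [J, grad] = costFunctionReg(theta, X, y, lambda)
% Regularized logistic regression cost and gradient (sketch)
m = length(y);
h = 1 ./ (1 + exp(-X * theta));                     % sigmoid hypothesis
J = (1 / m) * sum(-y .* log(h) - (1 - y) .* log(1 - h)) ...
    + (lambda / (2 * m)) * sum(theta(2:end) .^ 2); % no penalty on theta_0
grad = (1 / m) * (X' * (h - y));
grad(2:end) = grad(2:end) + (lambda / m) * theta(2:end);
end
```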
matlab functions
contour()
```matlab
u = linspace(-1, 1.5, 50);
v = linspace(-1, 1.5, 50);
z = zeros(length(u), length(v));
for i = 1:length(u)
    for j = 1:length(v)
        z(i, j) = f(u(i), v(j));  % f is the function being plotted; u and v determine z
    end
end
contour(u, v, z', levels, 'ShowText', 'on', 'LineWidth', 2);
% levels can be a number, giving the number of contour lines to draw
% levels can be a vector, which specifies the values of z at which to draw contour lines
% contour also accepts name-value properties such as 'ShowText' and 'LineWidth'
```
fminunc
```matlab
options = optimset('GradObj', 'on', 'MaxIter', 400);
% 'GradObj' tells fminunc that our cost function returns both the cost and the gradient
[theta, J, exit_flag] = fminunc(@(t)(costFunctionReg(t, X, y, lambda)), initial_theta, options);
% fminunc takes a handle to the cost function, the initial theta, and the options,
% then returns the optimal theta, the final cost J, and an exit flag
```