Regularized Linear Regression

#Math #Computers

$\displaystyle \hat{\mathbf{w}}=(\mathbf{X}^{\top}\mathbf{X}+\lambda \mathbf{I})^{-1}\mathbf{X}^{\top}\mathbf{y}$

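The closed-form solution for the optimal weights/parameters of the linear model under L2 (ridge) regularization.
$\displaystyle \lambda \ge 0$ is the regularization parameter; larger values shrink the weights toward zero.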
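
A minimal NumPy sketch of the closed-form solution, assuming a design matrix `X` of shape `(n, d)` and a target vector `y` of shape `(n,)`. The function name `ridge_closed_form` and the toy data are illustrative and not part of the note. It solves the linear system instead of forming the inverse explicitly, which is usually the more numerically stable choice.

```python
import numpy as np

def ridge_closed_form(X, y, lam):
    """Solve (X^T X + lam * I) w = X^T y for w."""
    d = X.shape[1]
    A = X.T @ X + lam * np.eye(d)  # regularized Gram matrix
    b = X.T @ y
    return np.linalg.solve(A, b)

# Toy data (hypothetical): 50 samples, 3 features, small noise
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=50)

w_hat = ridge_closed_form(X, y, lam=1.0)
```

With $\lambda = 0$ this reduces to ordinary least squares, provided $\mathbf{X}^{\top}\mathbf{X}$ is invertible.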

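$\displaystyle \nabla J(\mathbf{w})=2(\mathbf{X}^{\top}\mathbf{X}\mathbf{w}-\mathbf{X}^{\top}\mathbf{y}+\lambda \mathbf{w})$

The gradient of the regularized objective $\displaystyle J(\mathbf{w})=\lVert \mathbf{X}\mathbf{w}-\mathbf{y}\rVert^{2}+\lambda\lVert \mathbf{w}\rVert^{2}$. Setting it to zero recovers the closed-form solution above.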
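
The gradient can also drive an iterative solver. The sketch below is one possible plain gradient descent, with a fixed step size `lr` and step count `n_steps` chosen by hand for this toy problem. Those values, the function names, and the data are assumptions, not part of the note. It then compares the result against the closed-form weights.

```python
import numpy as np

def ridge_gradient(X, y, w, lam):
    """Gradient 2 (X^T X w - X^T y + lam * w)."""
    return 2.0 * (X.T @ (X @ w) - X.T @ y + lam * w)

def ridge_gradient_descent(X, y, lam, lr=1e-3, n_steps=5000):
    """Plain gradient descent from w = 0 with a fixed step size."""
    w = np.zeros(X.shape[1])
    for _ in range(n_steps):
        w = w - lr * ridge_gradient(X, y, w, lam)
    return w

# Toy data (hypothetical): 50 samples, 3 features, small noise
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=50)
lam = 1.0

w_gd = ridge_gradient_descent(X, y, lam)
# Closed-form weights for comparison
w_closed = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)
```

The step size has to be small relative to the largest eigenvalue of $2(\mathbf{X}^{\top}\mathbf{X}+\lambda \mathbf{I})$ for the iteration to converge, so `lr` would need retuning for differently scaled data.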