Regularized Linear Regression

#Math #Computers

$\displaystyle \hat{\mathbf{w}}=(\mathbf{X}^{\top}\mathbf{X}+\lambda \mathbf{I})^{-1}\mathbf{X}^{\top}\mathbf{y}$

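  • The optimal weights/parameters of the linear model, i.e. the minimizer of the regularized least-squares cost
  • $\displaystyle \lambda \geq 0$ is the regularization parameter; larger values shrink the weights toward zero, and $\lambda = 0$ recovers ordinary least squares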
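
A minimal NumPy sketch of the closed-form solution above, to make the formula concrete. The function name `ridge_closed_form` and the toy data are made up for illustration; the sketch also assumes every weight is regularized (no separate unpenalized bias term) and solves the linear system instead of forming the inverse explicitly.

```python
import numpy as np

def ridge_closed_form(X, y, lam):
    """Return w_hat = (X^T X + lam * I)^{-1} X^T y."""
    n_features = X.shape[1]
    A = X.T @ X + lam * np.eye(n_features)
    b = X.T @ y
    # Solving A w = b is numerically safer than computing inv(A) @ b.
    return np.linalg.solve(A, b)

# Toy data, chosen only so the numbers are easy to check by hand.
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
y = np.array([1.0, 2.0, 3.0])

w_ols = ridge_closed_form(X, y, lam=0.0)    # lam = 0: ordinary least squares
w_ridge = ridge_closed_form(X, y, lam=1.0)  # lam = 1: weights shrink toward zero
```

On this toy data the regularized weights have a smaller norm than the unregularized ones, which is the shrinkage effect of $\lambda$.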

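$\displaystyle \nabla J(\mathbf{w})=2(\mathbf{X}^{\top}\mathbf{X}\mathbf{w}-\mathbf{X}^{\top}\mathbf{y}+\lambda \mathbf{w})$

  • Gradient of the regularized cost $\displaystyle J(\mathbf{w})=\lVert \mathbf{X}\mathbf{w}-\mathbf{y}\rVert_2^2+\lambda\lVert \mathbf{w}\rVert_2^2$
  • Setting the gradient to zero and solving for the weights gives the closed-form solution above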
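
As a rough check of the gradient formula, the sketch below runs plain gradient descent with it and compares the result with the closed-form weights. The function names, the toy data, the step size `lr=0.1`, and the step count are assumptions picked for this small example, not values from the note; a different dataset would generally need a different step size.

```python
import numpy as np

def ridge_gradient(X, y, w, lam):
    """Gradient 2 (X^T X w - X^T y + lam * w) of the regularized cost."""
    return 2.0 * (X.T @ X @ w - X.T @ y + lam * w)

def ridge_gradient_descent(X, y, lam, lr=0.1, n_steps=500):
    """Plain gradient descent starting from w = 0 (illustrative settings)."""
    w = np.zeros(X.shape[1])
    for _ in range(n_steps):
        w = w - lr * ridge_gradient(X, y, w, lam)
    return w

# Toy data, made up for illustration.
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
y = np.array([1.0, 2.0, 3.0])
lam = 1.0

w_gd = ridge_gradient_descent(X, y, lam)

# Closed-form weights for comparison; the gradient should vanish there.
w_closed = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)
grad_at_closed = ridge_gradient(X, y, w_closed, lam)
```

Here `w_gd` and `w_closed` should agree to numerical precision, and `grad_at_closed` should be approximately the zero vector.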