Ridge (Statistics)

#Math
Regularization method that tends to shrink all coefficients to non-zero values

$\displaystyle {\lambda}\sum_{j = 1}^{J}\beta_{j} ^{2}=\lambda \lVert \beta\rVert_{2}$

This ridge term effectively minimizes the coefficients for any feature $\displaystyle \beta$
$\displaystyle J$ is the number of parameters
Effectively uses the L2 norm on each parameter
This term is added to the loss function