Regularization method that shrinks all coefficients toward zero, typically leaving them small but non-zero

  • The ridge term penalizes large coefficients, so the coefficient of every feature is shrunk toward zero
  • is the number of parameters
  • Uses the squared L2 norm of the coefficient vector, i.e. the sum of the squared parameters
  • This term is added to the loss function
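
A minimal NumPy sketch of the points above, under stated assumptions: the base loss is taken to be the sum of squared errors, there is no intercept, and the names `ridge_loss`, `ridge_fit`, and `lam` (the penalty strength) are illustrative and do not come from these notes. It shows the penalty being added to the loss and the fitted coefficients shrinking without reaching zero.

```python
import numpy as np

def ridge_loss(X, y, w, lam):
    """Sum of squared errors plus lam times the sum of squared coefficients."""
    residual = X @ w - y
    return float(residual @ residual + lam * (w @ w))

def ridge_fit(X, y, lam):
    """Closed-form minimizer of ridge_loss (assumes no intercept term)."""
    p = X.shape[1]  # number of parameters
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + 0.1 * rng.normal(size=50)

w_ols = ridge_fit(X, y, lam=0.0)     # no penalty: ordinary least squares
w_ridge = ridge_fit(X, y, lam=10.0)  # with penalty: coefficients shrink

print("OLS coefficients:  ", w_ols)
print("Ridge coefficients:", w_ridge)
```

Increasing `lam` pulls the coefficient vector closer to zero, but in general no individual coefficient becomes exactly zero.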