• Probability that our prediction is a particular class, given some data, when there are multiple classes
  • Tends to push the term with the largest score close to 1 and the others close to 0
  • The denominator is a normalization term: it makes the terms sum to 1, so this is a valid probability
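
The formula these bullets describe is not reproduced in the text. Assuming it is the standard softmax function (an assumption, since the notes do not name it), a minimal sketch of the behavior looks like the following. The function name `softmax` and the example scores are illustrative, not taken from the original.

```python
import math

def softmax(scores):
    """Map a list of real-valued class scores to probabilities.

    Assumes the standard softmax: exp(score_j) / sum_k exp(score_k).
    """
    # Subtracting the max leaves the result unchanged but avoids overflow in exp.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    # The denominator is the normalization term: it makes the outputs sum to 1.
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three classes; the third is clearly the largest.
probs = softmax([1.0, 2.0, 8.0])
print(probs)
```

With one score well above the rest, the corresponding probability ends up close to 1 and the others close to 0. When the scores are close together, the probabilities are spread more evenly.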