Softmax

#Computers

$\displaystyle P(Y=i|X)= \frac{e^{\beta_{0,i}+\beta_{1,i}X}}{\sum_{j=1}^{K}e^{\beta_{0,j}+\beta_{1,j}X}}$

  • Probability that our prediction $\displaystyle Y$ is class $\displaystyle i$, given some data $\displaystyle X$, when there are $\displaystyle K$ classes
  • Exponentiating amplifies differences between the class scores, so when one score is much larger than the rest, its term ends up close to 1 and the others close to 0
  • The denominator is a normalization term: dividing by the sum over all $\displaystyle K$ classes makes the outputs sum to 1, so this is a valid probability distribution
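
A minimal Python sketch of the formula above, for a single scalar input. The function names `softmax` and `class_probabilities`, and the coefficient values, are made up for illustration and are not from any particular library. Subtracting the largest score before exponentiating is a common numerical-stability trick, not part of the formula itself; it cancels between numerator and denominator, so the result is unchanged.

```python
import math

def softmax(scores):
    """Turn a list of real-valued scores into probabilities that sum to 1."""
    # Shifting by the max score does not change the result (the factor
    # cancels in the ratio) but keeps exp() from overflowing.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)  # the normalization term (the denominator)
    return [e / total for e in exps]

def class_probabilities(x, intercepts, slopes):
    """P(Y=i | X=x) for each class i, using score beta_{0,i} + beta_{1,i} * x."""
    scores = [b0 + b1 * x for b0, b1 in zip(intercepts, slopes)]
    return softmax(scores)

# Hypothetical coefficients for K = 3 classes (illustrative values only).
intercepts = [0.0, 0.5, -0.5]   # beta_{0,i}
slopes = [1.0, 2.0, -1.0]       # beta_{1,i}

probs = class_probabilities(2.0, intercepts, slopes)
print(probs)       # one probability per class
print(sum(probs))  # should be 1 up to floating-point rounding
```

With these values the second class has by far the largest score at `x = 2.0`, so it takes most of the probability mass, which is the "one term close to 1" behaviour described in the list above.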