Softmax
#Computers
$\displaystyle P(Y=i|X)= \frac{e^{\beta_{0,i}+\beta_{1,i}X}}{\sum_{j=1}^{K}e^{\beta_{0,j}+\beta_{1,j}X}}$
- Probability that our prediction $\displaystyle Y$ is of a class $\displaystyle i$ given some data $\displaystyle X$ when there are $\displaystyle K$ classes
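- Tends to push the probability of the class with the largest score $\displaystyle \beta_{0,i}+\beta_{1,i}X$ close to 1 and the others close to 0, most strongly when that score is well above the rest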
- The denominator is a normalization term: it makes the $\displaystyle K$ class probabilities sum to 1, so this is a valid probability distribution
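
A minimal Python sketch of the formula above, assuming a single scalar feature $\displaystyle X$ and one intercept/slope pair per class. The function name `softmax_probs` and the coefficient values are made up for illustration. Subtracting the maximum score before exponentiating is a common numerical-stability step that is not part of the formula itself; it leaves the result unchanged because the same factor cancels in the numerator and denominator.

```python
import math


def softmax_probs(x, intercepts, slopes):
    """Return P(Y=i | X=x) for each class i under the softmax formula."""
    # Linear score for each class: beta_{0,i} + beta_{1,i} * x
    scores = [b0 + b1 * x for b0, b1 in zip(intercepts, slopes)]
    # Shift by the largest score so math.exp never overflows
    # (illustrative stability trick; the probabilities are unchanged).
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    # Denominator: the normalization term, summed over all K classes
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical coefficients for K = 3 classes
probs = softmax_probs(2.0, intercepts=[0.0, 0.5, -1.0], slopes=[1.0, 3.0, -2.0])
print(probs)       # the second class has the largest score and takes most of the mass
print(sum(probs))  # sums to 1 up to floating-point rounding
```

With these values the class scores are 2.0, 6.5, and -5.0, so the second class ends up with a probability close to 1 while the other two are close to 0. When all scores are equal, every class gets probability $\displaystyle 1/K$.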