A model that predicts the probability of a data point belonging to a certain class. This probability may then be used to classify (e.g. ) if desired

Topics

  • Negative log of the product of log likelihood…

Training

Use gradient descent to minimize the negative log likelihood of the dataset (same as finding the Maximum Likelihood Estimate) by modifying , or the weights of the model

  • The log likelihood of the dataset
  • is the same as

  • is the error of the th training sample

Testing

  • is the dimensionality of the data