Backpropagation

#Computers
A method to tune the parameters a neural network. Often done by gradient descent, where the change in the loss function is calculated chain rule using computation graphs

$\displaystyle \frac{\partial J }{\partial \Theta^{(L)}}=(a^{(L)}-y)(a^{(L-1)})^{T}$

  • The derivative of the loss function with respect to the last layer's parameters