Projects the data onto the direction along which its variance is largest. This direction turns out to be the eigenvector of the sample covariance matrix with the largest eigenvalue.

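  $Z = XW$

  • $Z \in \mathbb{R}^{n \times k}$ is the lower-dimensional representation of the data, where $n$ is the number of datapoints and $k$ is the number of principal components
  • $X \in \mathbb{R}^{n \times d}$ is the data, where $d$ is the dimensionality of the data
  • $W \in \mathbb{R}^{d \times k}$ is the transformation matrix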
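The projection can be sketched in NumPy as follows. This is a minimal illustration, not a reference implementation: the function name `pca_project` and the symbols `X` (data, n x d), `W` (transformation matrix, d x k), and `Z` (projected data, n x k) are assumed names, and the data is centered before projecting, which is the usual convention.

```python
import numpy as np

def pca_project(X, k):
    """Project X (n x d) onto its top-k principal components."""
    # Center the data so the covariance is taken about the mean.
    X_centered = X - X.mean(axis=0)
    # Sample covariance matrix (d x d); rowvar=False treats columns as variables.
    cov = np.cov(X_centered, rowvar=False)
    # eigh is meant for symmetric matrices; eigenvalues come back in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Reorder so the largest eigenvalue comes first, then keep k eigenvectors.
    order = np.argsort(eigvals)[::-1]
    W = eigvecs[:, order[:k]]  # d x k transformation matrix
    Z = X_centered @ W         # n x k lower-dimensional representation
    return Z, W

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
Z, W = pca_project(X, k=2)
print(Z.shape, W.shape)  # (100, 2) (5, 2)
```

The columns of `W` are orthonormal, and the columns of `Z` are ordered by decreasing variance.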

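  • If you keep the $k$ largest eigenvalues $\lambda_1 \ge \dots \ge \lambda_k$ of the covariance matrix, the proportion of variance you retain after PCA is $\sum_{i=1}^{k} \lambda_i / \sum_{i=1}^{d} \lambda_i$, where $d$ is the dimensionality of the data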
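As a small sketch of that ratio, assuming the eigenvalues of the covariance matrix are already available (the helper name `retained_variance` and the example eigenvalues are made up for illustration):

```python
import numpy as np

def retained_variance(eigvals, k):
    """Fraction of total variance kept by the k largest eigenvalues."""
    # Sort descending so the first k entries are the largest eigenvalues.
    eigvals = np.sort(np.asarray(eigvals, dtype=float))[::-1]
    return eigvals[:k].sum() / eigvals.sum()

# Hypothetical eigenvalues: keeping the top 2 retains (4 + 2) / 8 of the variance.
print(retained_variance([4.0, 2.0, 1.0, 1.0], k=2))  # 0.75
```

A common use is to pick the smallest number of components whose ratio exceeds some chosen threshold.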

  • The goal of PCA is to maximize the variance $w^\top \Sigma w$ of the projected data by varying $w$, with the constraint that $w$ has unit norm ($w^\top w = 1$)
  • $\Sigma$ is the covariance matrix
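A short derivation may help connect this constrained problem to the eigenvector statement at the top. The notation is assumed for illustration: $w$ is the projection direction, $\Sigma$ is the covariance matrix, and $\lambda$ is a Lagrange multiplier for the unit-norm constraint.

```latex
\begin{aligned}
\mathcal{L}(w, \lambda) &= w^\top \Sigma w - \lambda\,(w^\top w - 1) \\
\nabla_w \mathcal{L} &= 2\Sigma w - 2\lambda w = 0
  \;\Longrightarrow\; \Sigma w = \lambda w \\
w^\top \Sigma w &= \lambda\, w^\top w = \lambda
\end{aligned}
```

Every stationary point is therefore an eigenvector of $\Sigma$, and the variance it captures equals its eigenvalue, so the maximum is reached at the eigenvector with the largest eigenvalue.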

Training