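Perceptron convergence theorem (Novikoff 1962; Block 1962).
Consider a dataset $\{(x_1, y_1), \dots, (x_N, y_N)\}$ with labels $y_n \in \{-1, +1\}$ and $\|x_n\|_2 \le R$ for all $n$.
Suppose there exist $u \in \mathbb{R}^D$ with $\|u\|_2 = 1$ and $\gamma > 0$ such that $\gamma \le y_n u^\top x_n$ for all $n$.
Here $u$ can be thought of as a candidate weight vector $w$ that separates the dataset with margin at least $\gamma$; the unit-norm condition is what makes $\gamma$ a geometric margin.
Then the PerceptronTrainingAlgorithm makes at most $R^2 / \gamma^2$ mistakes on the training sequence.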
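The bound can be checked numerically. The sketch below is only an illustration under stated assumptions: the notes do not show the PerceptronTrainingAlgorithm itself, so the code assumes the standard mistake-driven update (zero initial weights, no bias term, update on $y_n w^\top x_n \le 0$). The function name `perceptron_mistakes`, the synthetic data, and the choice of unit-norm separator `u = (0.6, 0.8)` are all hypothetical choices made for the example.

```python
import numpy as np


def perceptron_mistakes(X, y, max_epochs=1000):
    """Assumed standard perceptron: zero init, no bias, update on each mistake.

    Returns the final weight vector and the total number of updates (mistakes).
    """
    w = np.zeros(X.shape[1])
    mistakes = 0
    for _ in range(max_epochs):
        clean_pass = True
        for x_n, y_n in zip(X, y):
            if y_n * (w @ x_n) <= 0:   # misclassified (or on the boundary)
                w += y_n * x_n         # perceptron update
                mistakes += 1
                clean_pass = False
        if clean_pass:                 # a full pass without mistakes: converged
            break
    return w, mistakes


# Synthetic, linearly separable data (illustrative only).
rng = np.random.default_rng(0)
u = np.array([0.6, 0.8])               # unit-norm separator, ||u||_2 = 1
X = rng.uniform(-1.0, 1.0, size=(200, 2))
X = X[np.abs(X @ u) >= 0.1]            # discard points too close to the boundary
y = np.sign(X @ u)                     # labels in {-1, +1}

R = np.linalg.norm(X, axis=1).max()    # R >= ||x_n||_2 for all n
gamma = (y * (X @ u)).min()            # margin achieved by u on this dataset
bound = R**2 / gamma**2

w, mistakes = perceptron_mistakes(X, y)
print(f"mistakes = {mistakes}, bound R^2/gamma^2 = {bound:.1f}")
```

The bound is usually loose in practice: it depends only on $R$ and $\gamma$, not on $N$ or $D$, so the observed mistake count is typically well below $R^2/\gamma^2$.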