A way to classify a data point based on its “similarity distance” to the nearest neighbors.
Topics
Steps to Classify (based on this animation)
- Initialize
- Place the data point in a feature space
- Calculate Distance
- Find the Euclidean distance between the data point and every other labeled data point
- May have to normalize feature values to z-scores along each coordinate so that no single feature dominates the distance
- Sort Distance
- Sort the distances between the data point and the other labeled data points in increasing order
- Predict
- For classification, the most common class among the $k$-nearest neighbors determines the data point's class
- For regression, use the average of the $k$-nearest neighbors' labels
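The steps above can be sketched in plain Python. This is a minimal illustration, not the course's reference implementation; `zscore` and `knn_classify` are hypothetical helper names:

```python
import math
from collections import Counter

def zscore(points):
    """Normalize each feature (coordinate) to mean 0 and std 1."""
    d = len(points[0])
    means = [sum(p[i] for p in points) / len(points) for i in range(d)]
    stds = [math.sqrt(sum((p[i] - means[i]) ** 2 for p in points) / len(points)) or 1.0
            for i in range(d)]
    return [[(p[i] - means[i]) / stds[i] for i in range(d)] for p in points]

def knn_classify(query, data, labels, k=3):
    """Compute distances, sort them, then majority-vote over the k nearest labels."""
    dists = [(math.dist(query, x), y) for x, y in zip(data, labels)]
    dists.sort(key=lambda t: t[0])            # sort by distance only
    votes = Counter(y for _, y in dists[:k])  # labels of the k nearest points
    return votes.most_common(1)[0][0]
```

In practice you would z-score the training data first and apply the same means and standard deviations to the query point before classifying.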
Choices of
- Too small
- Results in overfitting: the prediction is sensitive to individual noisy points
- Too large
- Results in underfitting: the prediction is dominated by the overall majority of the data
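A tiny self-contained example of this tradeoff (the data set and the `predict` helper are made up for illustration): with $k = 1$ a single mislabeled point flips the prediction, while $k = 3$ smooths it out.

```python
import math
from collections import Counter

def predict(query, data, labels, k):
    """Majority label among the k nearest points by Euclidean distance."""
    order = sorted(range(len(data)), key=lambda i: math.dist(query, data[i]))
    return Counter(labels[i] for i in order[:k]).most_common(1)[0][0]

# A class-'a' cluster near the origin with one mislabeled 'b' point inside it,
# plus a genuine 'b' cluster far away.
data   = [[0, 0], [0, 1], [1, 0], [0.5, 0.5], [5, 5], [5, 6], [6, 5]]
labels = ['a',    'a',    'a',    'b',        'b',    'b',    'b']

predict([0.4, 0.4], data, labels, k=1)  # the single noisy neighbor decides: 'b'
predict([0.4, 0.4], data, labels, k=3)  # two 'a' votes outweigh it: 'a'
```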
CS M146
- $\mathrm{nn}_i(x)$ gives the index in $\{1, \dots, N\}$ of the $i$-th nearest neighbor of $x$, the data point we're interested in classifying
- $\mathrm{nn}(x)$ is an array of indices, one for each data sample point, sorted by distance to $x$
- We use the square of the L2 norm, $\|x - x_n\|_2^2$, to calculate distance here, but other measures could work
- $\mathrm{knn}(x) = \{\mathrm{nn}_1(x), \dots, \mathrm{nn}_k(x)\}$ gives the set of indices of the $k$ nearest neighbors to $x$
- $v_c = \sum_{n \in \mathrm{knn}(x)} \mathbb{1}[y_n = c]$ gives the number of "votes" for a particular class/label $c$
- We iterate over each of the $k$-nearest neighbors and increment $v_c$ by one every time the neighbor's label equals the class of interest
- The prediction is the class that maximizes the number of votes: $\hat{y} = \arg\max_c v_c$
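These definitions translate almost line-for-line into code. A sketch, with function names (`sq_l2`, `knn_indices`, `predict_label`) that are my own, not the course's:

```python
def sq_l2(x, z):
    """Squared L2 norm ||x - z||^2, the distance measure used above."""
    return sum((a - b) ** 2 for a, b in zip(x, z))

def knn_indices(x, data, k):
    """The set knn(x): indices of the k nearest data points to x."""
    return sorted(range(len(data)), key=lambda n: sq_l2(x, data[n]))[:k]

def predict_label(x, data, labels, k):
    """Count v_c for each class c among the k nearest, then take the argmax."""
    votes = {}
    for n in knn_indices(x, data, k):
        votes[labels[n]] = votes.get(labels[n], 0) + 1
    return max(votes, key=votes.get)
```

Note that the squared L2 norm gives the same neighbor ordering as the plain L2 norm, so the square root can be skipped.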
Training Time Complexity
- Training takes $O(1)$ time: kNN is a lazy learner that just stores the training set, so there is nothing to fit. The spatial complexity, however, is definitely $O(Nd)$, since all $N$ points with $d$ features each must be kept
Classifying Time Complexity
- To classify, we need to calculate the distances from our point to all $N$ data points, where each distance calculation scales linearly with the number of features $d$, so this step takes $O(Nd)$ time. We must also sort these distances before finding the $k$-nearest neighbors, which takes $O(N \log N)$ time
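As a side note, a full sort isn't strictly required: a heap can select the $k$ smallest distances in $O(N \log k)$ instead of $O(N \log N)$. A sketch of this standard optimization:

```python
import heapq
import math

def k_nearest(query, data, k):
    """Return the k smallest (distance, index) pairs, in increasing order."""
    dists = [(math.dist(query, x), i) for i, x in enumerate(data)]  # O(N * d)
    return heapq.nsmallest(k, dists)                                # O(N log k)
```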