Machine Learning for OpenCV
上QQ阅读APP看书,第一时间看更新

Understanding the k-NN algorithm

The k-NN algorithm is arguably one of the simplest machine learning algorithms. The reason for this is that we basically only need to store the training dataset. Then, in order to make a prediction for a new data point, we only need to find the closest data point in the training dataset-its nearest neighbor.

In a nutshell, the k-NN algorithm argues that a data point probably belongs to the same class as its neighbors. Think about it: if our neighbor is a Reds fan, we're probably Reds fans, too; otherwise we would have moved away a long time ago. The same can be said for the Blues.

Of course, some neighborhoods might be a little more complicated. In this case, we would not just consider our closest neighbor (where k=1), but instead our k nearest neighbors. To stick with our example as mentioned earlier, if we were Reds fans, we probably wouldn't move into a neighborhood where the majority of people are Blues fans.

That's all there is to it.