
Understanding feature matching
Once we have extracted features and their descriptors from two (or more) images, we can start asking whether some of these features show up in both (or all) images. For example, if we have descriptors for both our object of interest (self.desc_train) and the current video frame (desc_query), we can try to find regions of the current frame that look like our object of interest.
This is done by the following method, which makes use of FLANN:
good_matches = self.match_features(desc_query)
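The details of this method are the subject of the next section, but as a rough preview, here is a minimal sketch of FLANN-based matching with a ratio test. It is written as a standalone function that takes both descriptor sets, and the ratio threshold and index parameters are illustrative assumptions rather than the actual implementation; a KD-tree index like this one assumes float descriptors such as SIFT's:

import cv2

FLANN_INDEX_KDTREE = 1  # KD-tree index, suitable for float descriptors
flann = cv2.FlannBasedMatcher(dict(algorithm=FLANN_INDEX_KDTREE, trees=5),
                              dict(checks=50))

def match_features(desc_train, desc_query, ratio=0.7):
    # For every query descriptor, find its two nearest train descriptors ...
    good_matches = []
    for pair in flann.knnMatch(desc_query, desc_train, k=2):
        # ... and keep the best one only if it clearly beats the runner-up
        if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance:
            good_matches.append(pair[0])
    return good_matches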
The process of finding frame-to-frame correspondences can be formulated as a nearest-neighbor search: for every descriptor in one set, we look for the closest descriptor in the other set.
The first set of descriptors is usually called the train set because, in machine learning, these descriptors are used to train a model, such as a model of the object that we want to detect. In our case, the train set corresponds to the descriptors of the template image (our object of interest). Hence, we call our template image the train image (self.img_train).
The second set is usually called the query set because we continually ask whether it contains our train image. In our case, the query set corresponds to the descriptors of each incoming frame. Hence, we call each frame the query image (img_query).
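To make the naming concrete, here is a small, hypothetical example of how both descriptor sets might be produced. The file names and the choice of SIFT are assumptions for illustration:

import cv2

# Hypothetical inputs: a template image and one video frame
img_train = cv2.imread('template.png', cv2.IMREAD_GRAYSCALE)  # train image
img_query = cv2.imread('frame.png', cv2.IMREAD_GRAYSCALE)     # query image

# Detect keypoints and compute descriptors for both images
sift = cv2.SIFT_create()
kp_train, desc_train = sift.detectAndCompute(img_train, None)
kp_query, desc_query = sift.detectAndCompute(img_query, None)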
Features can be matched in a number of ways, for example, with the help of a brute-force matcher (cv2.BFMatcher) that, for each descriptor in the first set, finds the closest descriptor in the second set by trying every candidate (an exhaustive search).
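For instance, a brute-force match between the two descriptor sets could look like the following sketch. The choice of the L2 norm assumes float descriptors such as SIFT's; binary descriptors such as ORB's would use cv2.NORM_HAMMING instead:

# Exhaustively compare every query descriptor against every train descriptor
bf = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True)
matches = bf.match(desc_query, desc_train)  # query set first, train set second
matches = sorted(matches, key=lambda m: m.distance)  # closest matches first

Brute-force matching is simple and exact, but its cost grows with the product of the two set sizes, which is why approximate methods such as FLANN become attractive for real-time use.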
In the next section, we'll learn how to match features across images with FLANN.