The point of the application is to recognize an image from a predefined list of images. The SIFT descriptors of the images in the list have already been extracted and saved to files. Nothing interesting here:

std::vector<cv::KeyPoint> detectedKeypoints;
cv::Mat objectDescriptors;

// Extract data
cv::SIFT sift;
sift.detect(image, detectedKeypoints);
sift.compute(image, detectedKeypoints, objectDescriptors);

// Save the file
cv::FileStorage fs(file, cv::FileStorage::WRITE);
fs << "descriptors" << objectDescriptors;
fs << "keypoints" << detectedKeypoints;


Then the device takes a picture, and its SIFT descriptors are extracted in the same way. The idea is to compare these descriptors to the ones from the files, which I am doing with the FLANN matcher from OpenCV. I am trying to quantify the similarity image by image, so that after going through the whole list I have the best match.

const cv::Ptr<cv::flann::IndexParams>& indexParams = new cv::flann::KDTreeIndexParams(1);
const cv::Ptr<cv::flann::SearchParams>& searchParams = new cv::flann::SearchParams(64);

// Match using Flann
cv::Mat indexMat;
cv::FlannBasedMatcher matcher(indexParams, searchParams);
std::vector< cv::DMatch > matches;
matcher.match(objectDescriptors, readDescriptors, matches);


After matching, I understand that I get a list of the closest distances found between the feature vectors. I find the minimum distance and, using it, I can count "good matches" and even get a list of the respective points:

// Count the number of matches where the distance is less than 2 * min_dist
int goodCount = 0;
for (size_t i = 0; i < matches.size(); i++)
{
    if (matches[i].distance < 2 * min_dist)
    {
        goodCount++;
        // Save the points for the homography calculation
    }
}

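The snippet above uses `min_dist` without showing how it is obtained. The following is a minimal sketch of the whole filtering step, assuming that `min_dist` is simply the smallest distance over all matches. The `Match` struct is a hypothetical stand-in for `cv::DMatch` so that the example compiles without OpenCV, and the names `minDistance` and `filterGoodMatches` are made up for illustration.

```cpp
#include <algorithm>
#include <limits>
#include <vector>

// Hypothetical stand-in for cv::DMatch (same three fields used here).
struct Match {
    int queryIdx;
    int trainIdx;
    float distance;
};

// Smallest distance among all matches; +infinity for an empty list.
float minDistance(const std::vector<Match>& matches) {
    float best = std::numeric_limits<float>::infinity();
    for (const Match& m : matches)
        best = std::min(best, m.distance);
    return best;
}

// Keep the matches whose distance is strictly below factor * min_dist.
std::vector<Match> filterGoodMatches(const std::vector<Match>& matches,
                                     float factor = 2.0f) {
    std::vector<Match> good;
    const float threshold = factor * minDistance(matches);
    for (const Match& m : matches)
        if (m.distance < threshold)
            good.push_back(m);
    return good;
}
```

Note that with a strict `<` comparison, a minimum distance of exactly 0 rejects every match, so a small lower bound on the threshold may be worth adding. The size of the returned vector plays the role of `goodCount`.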

I'm showing the easy parts of the code just to make this easier to follow; I know some of it doesn't need to be here.

Continuing, I was hoping that simply counting the number of good matches like this would be enough, but it turned out to mostly just point me to the image with the most descriptors. What I tried after this was computing the homography. The aim was to compute it and see whether it is a valid homography or not. The hope was that a good match, and only a good match, would have a homography that is a good transformation. I created the homography simply by calling cv::findHomography on obj and scene, which are both std::vector<cv::Point2f>. I checked the validity of the homography using some code I found online:

bool niceHomography(const cv::Mat& H)
{
    std::cout << H << std::endl;

    // Determinant of the upper-left 2x2 block; negative means the orientation is flipped
    const double det = H.at<double>(0, 0) * H.at<double>(1, 1) - H.at<double>(1, 0) * H.at<double>(0, 1);
    if (det < 0)
    {
        std::cout << "Homography: bad determinant" << std::endl;
        return false;
    }

    // Norm of the first column of the 2x2 block (scale along x)
    const double N1 = sqrt(H.at<double>(0, 0) * H.at<double>(0, 0) + H.at<double>(1, 0) * H.at<double>(1, 0));
    if (N1 > 4 || N1 < 0.1)
    {
        std::cout << "Homography: bad first column" << std::endl;
        return false;
    }

    // Norm of the second column of the 2x2 block (scale along y)
    const double N2 = sqrt(H.at<double>(0, 1) * H.at<double>(0, 1) + H.at<double>(1, 1) * H.at<double>(1, 1));
    if (N2 > 4 || N2 < 0.1)
    {
        std::cout << "Homography: bad second column" << std::endl;
        return false;
    }

    // Norm of the perspective terms in the third row
    const double N3 = sqrt(H.at<double>(2, 0) * H.at<double>(2, 0) + H.at<double>(2, 1) * H.at<double>(2, 1));
    if (N3 > 0.002)
    {
        std::cout << "Homography: bad third row" << std::endl;
        return false;
    }

    return true;
}


I don't understand the math behind this, so while testing I sometimes replaced this function with a simple check of whether the determinant of the homography was positive. The problem is that I kept having issues here. The homographies were either all bad (with the full check), or good when they shouldn't have been (when I was checking only the determinant).

I figured I should actually use the homography: for a number of points, compute their position in the destination image from their position in the source image, then measure how far each computed position is from the matched point. I would compare the average of these distances across images, and ideally the correct image would give an obviously smaller average. This did not work at all. All the distances were colossal. I thought I might have applied the homography the wrong way around, but swapping obj and scene gave similar results.
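
For reference, here is a minimal sketch of the distance computation described above, written without OpenCV so that each step is visible. The `Point2` struct, the row-major `Homography` array, and the function names are assumptions made for this example. The one detail that is easy to miss is the division by the third homogeneous coordinate `w` (in OpenCV, `cv::perspectiveTransform` performs this division).

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in for cv::Point2f.
struct Point2 {
    double x;
    double y;
};

// Row-major 3x3 homography: element (r, c) is stored at H[3 * r + c].
using Homography = std::array<double, 9>;

// Map p through H, including the division by the homogeneous coordinate w.
// If w is 0 the point maps to infinity and the result is not finite.
Point2 project(const Homography& H, const Point2& p) {
    const double w = H[6] * p.x + H[7] * p.y + H[8];
    return {(H[0] * p.x + H[1] * p.y + H[2]) / w,
            (H[3] * p.x + H[4] * p.y + H[5]) / w};
}

// Mean Euclidean distance between project(H, src[i]) and dst[i].
// Returns +infinity for empty or mismatched inputs.
double meanReprojectionError(const Homography& H,
                             const std::vector<Point2>& src,
                             const std::vector<Point2>& dst) {
    if (src.empty() || src.size() != dst.size())
        return std::numeric_limits<double>::infinity();
    double sum = 0.0;
    for (std::size_t i = 0; i < src.size(); ++i) {
        const Point2 q = project(H, src[i]);
        sum += std::hypot(q.x - dst[i].x, q.y - dst[i].y);
    }
    return sum / static_cast<double>(src.size());
}
```

Here `src` and `dst` must be in the same order as the point lists the homography was estimated from: a matrix estimated from `src` to `dst` maps `src` points into the `dst` image, not the other way around.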

Other things I tried were SURF descriptors instead of SIFT, BFMatcher (brute force) instead of FLANN, taking the n smallest distances for every image instead of a number depending on the minimum distance, and selecting distances based on a global maximum distance. None of these approaches gave me definitively good results, and I feel stuck now.
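
To make the "n smallest distances" variant concrete, here is a minimal sketch that scores an image by the mean of its n smallest match distances. The function name `meanOfSmallest` and the `-1.0` "no score" return value are choices made for this example, not part of the original code.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Mean of the n smallest distances; uses every value if fewer than n exist.
// Returns -1.0 for empty input or n == 0 as a "no score" marker.
double meanOfSmallest(std::vector<float> distances, std::size_t n) {
    if (distances.empty() || n == 0)
        return -1.0;
    n = std::min(n, distances.size());
    const auto middle = distances.begin() + static_cast<std::ptrdiff_t>(n);
    // Only the first n elements need to end up sorted.
    std::partial_sort(distances.begin(), middle, distances.end());
    return std::accumulate(distances.begin(), middle, 0.0) / static_cast<double>(n);
}
```

Because every image is scored over the same number of matches, this score does not grow with the number of descriptors in the image, unlike a raw count of good matches.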


My only remaining strategy would be to sharpen the images, or even turn them into binary images using a local threshold or a segmentation algorithm. I am looking for any suggestions, or for any mistakes anyone can see in my work.


I don't know whether this is relevant, but I added some of the images I am testing this on. In many of the test images, most of the SIFT vectors come from the frame (higher contrast) rather than from the painting. This is why I'm thinking sharpening the images might work, but I don't want to go deeper in case something I did previously is wrong.


The gallery of images is here, with the descriptions in the titles. The images are of quite high resolution; please view them in case they give some hints.



You can test whether the lines drawn between matched points in the source image and the target image are roughly parallel. If the match is not correct, there will be a lot of noise and the lines won't be parallel.

See the attached image, which shows a correct match (using SURF and BF): all the lines are mostly parallel (though I should point out that this is an easy example).
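
One way to turn the parallel-lines idea into a number is sketched below, assuming the two images are laid side by side as in a typical match visualization, so that each match becomes a line from a point in the left image to a point in the right image shifted by `xOffset` (the width of the left image). The function name `lineParallelism` and the `Point2` struct are made up for this example. The score is the mean resultant length of the line directions: 1 when all lines are parallel, lower as the directions scatter.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for cv::Point2f.
struct Point2 {
    double x;
    double y;
};

// Mean resultant length of the match-line directions, in [0, 1].
// left[i] and right[i] are the two endpoints of match i; xOffset shifts
// the right image to its side-by-side position. Returns 0 when there are
// no usable lines or the inputs have different sizes.
double lineParallelism(const std::vector<Point2>& left,
                       const std::vector<Point2>& right,
                       double xOffset) {
    if (left.empty() || left.size() != right.size())
        return 0.0;
    double sumCos = 0.0;
    double sumSin = 0.0;
    std::size_t used = 0;
    for (std::size_t i = 0; i < left.size(); ++i) {
        const double dx = right[i].x + xOffset - left[i].x;
        const double dy = right[i].y - left[i].y;
        const double len = std::hypot(dx, dy);
        if (len == 0.0)
            continue;  // a zero-length line has no direction
        sumCos += dx / len;  // unit direction vector of this line
        sumSin += dy / len;
        ++used;
    }
    if (used == 0)
        return 0.0;
    return std::hypot(sumCos, sumSin) / static_cast<double>(used);
}
```

This only behaves as described when the object has a similar orientation and scale in both images; a correct match of a strongly rotated or scaled object also produces non-parallel lines, so the score is a heuristic rather than a validity test.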

