Recognizing an image from a list of OpenCV SIFT descriptors using FLANN matching

Problem Description

The point of the application is to recognize an image from an already set list of images. The images in the list have had their SIFT descriptors extracted and saved in files. Nothing interesting here:

#include <opencv2/opencv.hpp>
#include <opencv2/nonfree/features2d.hpp> // SIFT is in the nonfree module in OpenCV 2.4

std::vector<cv::KeyPoint> detectedKeypoints;
cv::Mat objectDescriptors;

// Extract keypoints and their SIFT descriptors
cv::SIFT sift;
sift.detect(image, detectedKeypoints);
sift.compute(image, detectedKeypoints, objectDescriptors);

// Save them to a file
cv::FileStorage fs(file, cv::FileStorage::WRITE);
fs << "descriptors" << objectDescriptors;
fs << "keypoints" << detectedKeypoints;
fs.release();

Then the device takes a picture. SIFT descriptors are extracted in the same way. The idea now was to compare the descriptors to the ones from the files. I am doing that using the FLANN matcher from OpenCV. I am trying to quantify the similarity, image by image. After going through the whole list I should have the best match.

// Match against the stored descriptors using FLANN (one KD-tree, 64 checks)
cv::Ptr<cv::flann::IndexParams> indexParams = new cv::flann::KDTreeIndexParams(1);
cv::Ptr<cv::flann::SearchParams> searchParams = new cv::flann::SearchParams(64);

cv::FlannBasedMatcher matcher(indexParams, searchParams);
std::vector<cv::DMatch> matches;
matcher.match(objectDescriptors, readDescriptors, matches);

After matching, I understand that I get a list of the closest distances found between the feature vectors. I find the minimum distance and, using it, I can count "good matches" and even get a list of the respective points:

// Find the smallest match distance first (computed outside the original snippet)
double min_dist = matches[0].distance;
for (size_t i = 1; i < matches.size(); i++)
{
    if (matches[i].distance < min_dist)
        min_dist = matches[i].distance;
}

// Count the number of matches where the distance is less than 2 * min_dist
int goodCount = 0;
for (int i = 0; i < objectDescriptors.rows; i++)
{
    if (matches[i].distance < 2 * min_dist)
    {
        ++goodCount;
        // Save the points for the homography calculation
        obj.push_back(detectedKeypoints[matches[i].queryIdx].pt);
        scene.push_back(readKeypoints[matches[i].trainIdx].pt);
    }
}

I'm showing only the simple parts of the code to make this easier to follow; I know some of it doesn't need to be here.

Continuing, I was hoping that simply counting the number of good matches like this would be enough, but it turned out mostly just to point me to the image with the most descriptors. What I tried after this was computing the homography. The aim was to compute it and see whether it's a valid homography or not. The hope was that a good match, and only a good match, would have a homography that is a good transformation. Creating the homography was done simply using cv::findHomography on obj and scene, which are std::vector<cv::Point2f>. I checked the validity of the homography using some code I found online:

bool niceHomography(cv::Mat H)
{
    std::cout << H << std::endl;

    // Determinant of the top-left 2x2 (the affine part): a negative value
    // means the transformation flips orientation, which a valid match
    // should not do
    const double det = H.at<double>(0, 0) * H.at<double>(1, 1) - H.at<double>(1, 0) * H.at<double>(0, 1);
    if (det < 0)
    {
        std::cout << "Homography: bad determinant" << std::endl;
        return false;
    }

    // Norm of the first column: bounds how much the x axis is scaled
    const double N1 = sqrt(H.at<double>(0, 0) * H.at<double>(0, 0) + H.at<double>(1, 0) * H.at<double>(1, 0));
    if (N1 > 4 || N1 < 0.1)
    {
        std::cout << "Homography: bad first column" << std::endl;
        return false;
    }

    // Norm of the second column: bounds how much the y axis is scaled
    const double N2 = sqrt(H.at<double>(0, 1) * H.at<double>(0, 1) + H.at<double>(1, 1) * H.at<double>(1, 1));
    if (N2 > 4 || N2 < 0.1)
    {
        std::cout << "Homography: bad second column" << std::endl;
        return false;
    }

    // Norm of the bottom-row perspective terms: large values mean extreme
    // perspective distortion, which is implausible for a planar painting
    const double N3 = sqrt(H.at<double>(2, 0) * H.at<double>(2, 0) + H.at<double>(2, 1) * H.at<double>(2, 1));
    if (N3 > 0.002)
    {
        std::cout << "Homography: bad third row" << std::endl;
        return false;
    }

    return true;
}
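
For context, here is a minimal sketch of how these pieces could fit together, assuming obj and scene were filled by the matching loop shown earlier; CV_RANSAC and the reprojection threshold are standard cv::findHomography arguments, with 3.0 as an illustrative choice:

#include <opencv2/calib3d/calib3d.hpp>

// Sketch: compute the homography from the matched points and validate it.
// Assumes obj and scene are the std::vector<cv::Point2f> filled above.
bool imageMatches(const std::vector<cv::Point2f>& obj,
                  const std::vector<cv::Point2f>& scene)
{
    if (obj.size() < 4)  // findHomography needs at least 4 point pairs
        return false;

    // CV_RANSAC discards outlier correspondences while fitting H
    cv::Mat H = cv::findHomography(obj, scene, CV_RANSAC, 3.0);
    if (H.empty())
        return false;

    return niceHomography(H);  // the validity check above
}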

I don't understand the math behind this, so while testing I sometimes replaced this function with a simple check of whether the determinant of the homography was positive. The problem is that I kept having issues here. The homographies were either all bad, or good when they shouldn't have been (when I was checking only the determinant).

I figured I should actually use the homography and, for a number of points, just compute their position in the destination image from their position in the source image. Then I would compare the average distances, and in the case of the correct image I would ideally get an obviously smaller average distance. This did not work at all. All the distances were colossal. I thought I might have applied the homography the wrong way around when calculating the positions, but switching obj and scene with each other gave similar results.
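
To make the reprojection idea concrete, here is a hedged sketch using cv::perspectiveTransform. Note the direction: with H = findHomography(obj, scene, ...), H maps obj (query) coordinates into scene (train) coordinates, so the points must be passed in that order.

#include <cmath>

// Sketch: average distance between scene points and obj points projected
// through H. A correct image should give a clearly smaller average error.
double meanReprojectionError(const cv::Mat& H,
                             const std::vector<cv::Point2f>& obj,
                             const std::vector<cv::Point2f>& scene)
{
    if (obj.empty())
        return 0.0;

    std::vector<cv::Point2f> projected;
    cv::perspectiveTransform(obj, projected, H);  // apply H to every point

    double total = 0.0;
    for (size_t i = 0; i < scene.size(); ++i)
    {
        const cv::Point2f d = projected[i] - scene[i];
        total += std::sqrt(d.x * d.x + d.y * d.y);
    }
    return total / scene.size();
}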

Other things I tried were SURF descriptors instead of SIFT, BFMatcher (brute force) instead of FLANN, getting the n smallest distances for every image instead of a number depending on the minimum distance, and getting distances depending on a global maximum distance. None of these approaches gave me definitively good results, and I feel stuck now.
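
For what it's worth, a minimal sketch of the "n smallest distances" scoring mentioned above; the value of n and the summed-distance score are illustrative assumptions, not tuned values:

#include <algorithm>
#include <limits>

// Sketch: score a candidate image by the sum of its n smallest match
// distances (cv::DMatch::operator< compares by distance). Lower is better.
// matches is taken by value so the caller's ordering is untouched.
double scoreBySmallestDistances(std::vector<cv::DMatch> matches, size_t n)
{
    if (matches.empty())
        return std::numeric_limits<double>::max();

    n = std::min(n, matches.size());
    std::partial_sort(matches.begin(), matches.begin() + n, matches.end());

    double sum = 0.0;
    for (size_t i = 0; i < n; ++i)
        sum += matches[i].distance;
    return sum;
}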

My only next strategy would be to sharpen the images, or even turn them into binary images using some local threshold or an algorithm used for segmentation. I am looking for any suggestions or mistakes anyone can see in my work.
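
If the preprocessing route is explored, here is a rough sketch of both ideas on a grayscale cv::Mat: unsharp masking for sharpening, and cv::adaptiveThreshold for local-threshold binarization. The blur sigma, blend weights, block size, and offset are illustrative guesses, not tuned values.

#include <opencv2/imgproc/imgproc.hpp>

// Sketch: sharpen by unsharp masking, i.e. subtracting a blurred copy
cv::Mat sharpen(const cv::Mat& gray)
{
    cv::Mat blurred, sharp;
    cv::GaussianBlur(gray, blurred, cv::Size(0, 0), 3.0);
    cv::addWeighted(gray, 1.5, blurred, -0.5, 0, sharp);
    return sharp;
}

// Sketch: binarize with a local (adaptive) threshold
cv::Mat binarize(const cv::Mat& gray)
{
    cv::Mat binary;
    cv::adaptiveThreshold(gray, binary, 255, cv::ADAPTIVE_THRESH_GAUSSIAN_C,
                          cv::THRESH_BINARY, 31, 5);
    return binary;
}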

I don't know whether this is relevant, but I have added some of the images I am testing on. Many times, most of the SIFT vectors in the test images come from the frame (which has higher contrast) rather than from the painting. This is why I think sharpening the images might work, but I don't want to go deeper in case something I did earlier is wrong.

The gallery of images is here, with the descriptions in the titles. The images are quite high resolution; please take a look in case they give some hints.

Answer

You can try to test whether, when matching, the lines between the source image and the target image are relatively parallel. If it's not a correct match, you'll have a lot of noise and the lines won't be parallel.
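
A rough sketch of how that test could be implemented, assuming the keypoints and matches from the question: place the two images side by side (as cv::drawMatches does), take the angle of each line connecting a matched pair, and measure how widely the angles spread. The variance threshold is an illustrative assumption.

#include <cmath>
#include <vector>

// Sketch: check whether the match lines in a side-by-side view are roughly
// parallel. srcWidth is the width of the source image, used to offset the
// target keypoints the way cv::drawMatches lays them out.
bool linesRoughlyParallel(const std::vector<cv::KeyPoint>& srcKeypoints,
                          const std::vector<cv::KeyPoint>& dstKeypoints,
                          const std::vector<cv::DMatch>& matches,
                          int srcWidth)
{
    if (matches.empty())
        return false;

    std::vector<double> angles;
    for (size_t i = 0; i < matches.size(); ++i)
    {
        const cv::Point2f& p = srcKeypoints[matches[i].queryIdx].pt;
        cv::Point2f q = dstKeypoints[matches[i].trainIdx].pt;
        q.x += srcWidth;  // shift target points into the side-by-side frame
        angles.push_back(std::atan2(q.y - p.y, q.x - p.x));
    }

    // A correct match gives tightly clustered angles; noise spreads them out
    double mean = 0.0;
    for (size_t i = 0; i < angles.size(); ++i)
        mean += angles[i];
    mean /= angles.size();

    double variance = 0.0;
    for (size_t i = 0; i < angles.size(); ++i)
        variance += (angles[i] - mean) * (angles[i] - mean);
    variance /= angles.size();

    return variance < 0.05;  // illustrative threshold (radians squared)
}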

See the attached image, which shows a correct match (using SURF and BF): all the lines are mostly parallel (though I should point out that this is an easy example).
