Problem description
I am developing an application that uses SIFT + RANSAC and homography to find an object (OpenCV C++, Java). The problem I am facing is that RANSAC performs poorly when there are many outliers.
For this reason I would like to try what the author of SIFT said works quite well: voting.
I have read that we should vote in a 4-dimensional feature space, where the 4 dimensions are:
- Location [x, y] (some call it translation)
- Scale
- Orientation
With OpenCV it is easy to get the scale and orientation of a match with:
cv::KeyPoint::octave
cv::KeyPoint::angle
I am having a hard time understanding how to calculate the location.
I have found an interesting slide where, with only one match, we are able to draw a bounding box.
But I don't understand how that bounding box can be drawn from just one match. Any help?
Recommended answer
You are looking for the largest set of matched features that fit a geometric transformation from image 1 to image 2. In this case it is the similarity transformation, which has 4 parameters: translation (dx, dy), scale change ds, and rotation d_theta.
Let's say you have matched two features: f1 from image 1 and f2 from image 2. Let (x1, y1) be the location of f1 in image 1, let s1 be its scale, and let theta1 be its orientation. Similarly, you have (x2, y2), s2, and theta2 for f2.
The translation between the two features is (dx, dy) = (x2 - x1, y2 - y1).
The scale change between the two features is ds = s2 / s1.
The rotation between the two features is d_theta = theta2 - theta1.
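The three formulas above can be sketched for a single match as follows. This is only an illustration: the function name similarity_vote and the (x, y, scale, angle) tuple layout are hypothetical, and the angle is assumed to be in degrees (as OpenCV's KeyPoint angle is). The wrap-around of the angle difference is an addition not stated in the answer.

```python
def similarity_vote(kp1, kp2):
    """Return (dx, dy, ds, d_theta) for one matched pair of features.

    Each keypoint is a hypothetical (x, y, scale, angle_in_degrees) tuple.
    """
    x1, y1, s1, theta1 = kp1
    x2, y2, s2, theta2 = kp2
    dx, dy = x2 - x1, y2 - y1          # translation
    ds = s2 / s1                       # scale change
    # Rotation, wrapped into [-180, 180) so that 350 -> 10 degrees
    # counts as +20 rather than -340 (an assumption of this sketch).
    d_theta = (theta2 - theta1 + 180.0) % 360.0 - 180.0
    return dx, dy, ds, d_theta


f1 = (10.0, 20.0, 2.0, 350.0)   # feature in image 1
f2 = (40.0, 60.0, 4.0, 10.0)    # matched feature in image 2
print(similarity_vote(f1, f2))  # (30.0, 40.0, 2.0, 20.0)
```

Each match produces one such 4-tuple, which is what gets voted into the Hough space.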
So dx, dy, ds, and d_theta are the dimensions of your Hough space. Each bin corresponds to a similarity transformation.
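A minimal sketch of the voting step, assuming each match has already been reduced to a (dx, dy, ds, d_theta) tuple. The function name hough_vote and the bin widths (20 pixels, one bin per factor of 2 in scale, 30 degrees) are illustrative assumptions, not values given in the answer. Each match casts a single vote here; voting into neighbouring bins as well, which softens bin-boundary effects, is left out for brevity.

```python
import math
from collections import defaultdict


def hough_vote(votes, xy_bin=20.0, angle_bin=30.0):
    """Bin (dx, dy, ds, d_theta) votes; return (best_bin, match_indices)."""
    bins = defaultdict(list)
    for i, (dx, dy, ds, d_theta) in enumerate(votes):
        key = (
            math.floor(dx / xy_bin),
            math.floor(dy / xy_bin),
            math.floor(math.log2(ds)),        # one bin per factor of 2 in scale
            math.floor(d_theta / angle_bin),
        )
        bins[key].append(i)
    # The fullest bin is the most supported similarity transformation.
    best = max(bins, key=lambda k: len(bins[k]))
    return best, bins[best]


votes = [
    (30.0, 40.0, 2.0, 20.0),
    (32.0, 41.0, 2.1, 22.0),
    (35.0, 45.0, 2.3, 25.0),
    (-200.0, 10.0, 0.5, -90.0),   # an outlier match
]
best_bin, members = hough_vote(votes)
print(best_bin, members)  # (1, 2, 1, 0) [0, 1, 2]
```

The matches in the winning bin can then be treated as inliers, for example by averaging their parameters or passing only those matches to a later fitting step.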
Once you have performed Hough voting and found the maximum bin, that bin gives you a transformation from image 1 to image 2. One thing you can do is take the bounding box of image 1 and transform it using that transformation: apply the corresponding translation, rotation, and scaling to the corners of the image. Typically, you pack the parameters into a transformation matrix and use homogeneous coordinates. This will give you the bounding box in image 2 corresponding to the object you've detected.
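A sketch of the corner transformation, in plain Python with a 3x3 homogeneous matrix. One detail here is an assumption the answer does not spell out: rotation and scaling are applied about the matched feature's location in image 1 (the anchor), and the result is then shifted by (dx, dy), so that the feature lands exactly on its match. The names similarity_matrix and transform_points are hypothetical, and angles are in degrees.

```python
import math


def similarity_matrix(dx, dy, ds, d_theta_deg, anchor):
    """3x3 matrix for p2 = ds * R * (p1 - anchor) + anchor + (dx, dy)."""
    ax, ay = anchor
    c = ds * math.cos(math.radians(d_theta_deg))
    s = ds * math.sin(math.radians(d_theta_deg))
    tx = ax + dx - (c * ax - s * ay)
    ty = ay + dy - (s * ax + c * ay)
    return [[c, -s, tx],
            [s, c, ty],
            [0.0, 0.0, 1.0]]


def transform_points(m, points):
    """Apply a 3x3 homogeneous matrix to a list of (x, y) points."""
    out = []
    for x, y in points:
        xh = m[0][0] * x + m[0][1] * y + m[0][2]
        yh = m[1][0] * x + m[1][1] * y + m[1][2]
        w = m[2][0] * x + m[2][1] * y + m[2][2]
        out.append((xh / w, yh / w))
    return out


width, height = 100.0, 50.0   # size of image 1 (example values)
corners = [(0.0, 0.0), (width, 0.0), (width, height), (0.0, height)]
# Parameters from a single match whose feature sits at (10, 20) in image 1.
m = similarity_matrix(30.0, 40.0, 2.0, 90.0, anchor=(10.0, 20.0))
for corner in transform_points(m, corners):
    print(corner)
```

Because one match already fixes all four parameters, this is also how a bounding box can be drawn from a single match; the Hough vote only decides which match, or bin of matches, to trust.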