This article shows how a 3x3 homography can be used to augment a cube at a specific position in an image; the question and answer below may be a useful reference if you face the same problem.

Problem Description

I can track 4 coordinates across different images of the same scene by computing the 3x3 homography between them. Doing this, I can overlay other 2D images onto these coordinates. I am wondering whether I could use this homography to augment a cube at this position instead, using OpenGL? I think the 3x3 matrix alone doesn't give enough information, but if I know the camera calibration matrix, can I get enough to create a modelview matrix to do this?



Thank you for any help you can give.

Solution

If you have the camera calibration matrix (the intrinsic parameters) and the homography, you do have enough information, because the homography between two views of the same planar object is defined as:

H = K [R | T]

where K is the 3x3 calibration matrix, and R (a 3x3 rotation matrix) and T (a 3x1 translation vector) form the view transform (from object coordinates to camera coordinates); since the object is planar, only the part of this transform acting on the plane z = 0 actually enters H, as made explicit below. There is a lot to say about how to compute R and T from H. One way is to compute a direct solution; the other is to use a non-linear minimization technique to estimate R and T. Obviously, the latter method is better, since it gives a better approximate solution. The former is just a way to start doing augmented reality ;)



Let's see how to derive R and T using the direct method. If h1, h2 and h3 are the column vectors of H, they are defined in terms of K, R and T as:



H = K [r1 r2 t]   (remember that we are speaking of points with z = 0)



where r1 is the first column vector of R, r2 the second, and t is the translation vector. Then:



r1 = l1 * (K^-1) h1

r2 = l2 * (K^-1) h2

r3 = r1 x r2   (the cross product of r1 and r2)

t = l3 * (K^-1) h3

where l1, l2, l3 are scaling factors (real values):

l1 = 1 / norm((K^-1) * h1)

l2 = 1 / norm((K^-1) * h2)

l3 = (l1 + l2) / 2
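To make the recipe concrete, here is a minimal NumPy sketch of the direct decomposition described above. The function name pose_from_homography, the sign normalization, and the final SVD step (projecting the estimated columns onto the nearest proper rotation matrix) are my own additions for robustness, not part of the original answer; H is assumed to map points (x, y, 1) of the object plane z = 0 to image points.

```python
import numpy as np

def pose_from_homography(K, H):
    """Direct decomposition of a plane-induced homography H into (R, t).

    K : 3x3 intrinsic calibration matrix.
    H : 3x3 homography mapping plane points (x, y, 1), i.e. z = 0, to image points.
    """
    Kinv = np.linalg.inv(K)
    h1, h2, h3 = (Kinv @ H).T              # columns of K^-1 * H

    # H is only defined up to scale (including sign); pick the sign that
    # places the plane in front of the camera (positive depth for t).
    if h3[2] < 0:
        h1, h2, h3 = -h1, -h2, -h3

    l1 = 1.0 / np.linalg.norm(h1)          # scale factors from the answer above
    l2 = 1.0 / np.linalg.norm(h2)
    l3 = (l1 + l2) / 2.0

    r1 = l1 * h1
    r2 = l2 * h2
    r3 = np.cross(r1, r2)                  # third column as the cross product
    t = l3 * h3

    # With noisy data r1 and r2 are not exactly orthonormal, so project
    # [r1 r2 r3] onto the closest proper rotation matrix (orthogonal Procrustes).
    U, _, Vt = np.linalg.svd(np.column_stack((r1, r2, r3)))
    R = U @ np.diag([1.0, 1.0, np.linalg.det(U @ Vt)]) @ Vt
    return R, t
```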



Keep in mind that this solution should be refined with a non-linear minimization method (you can use it as the starting point, for example). You can also use a distortion model to correct for lens distortion, but that step is optional (you will get good results even without it).
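As one readily available way to do such a refinement, the sketch below assumes OpenCV is installed and reuses pose_from_homography from the sketch above as the seed; K, obj_pts and img_pts are hypothetical example values standing in for your intrinsics and four tracked correspondences. cv2.solvePnP with the iterative flag minimizes the reprojection error; it is not the Lu-Hager algorithm mentioned below, but it serves the same purpose.

```python
import cv2
import numpy as np

# Hypothetical example data: intrinsics and the four tracked correspondences.
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])
obj_pts = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0]], dtype=np.float64)  # plane z = 0
img_pts = np.array([[120, 80], [400, 95], [390, 370], [110, 350]], dtype=np.float64)

# Homography from the four correspondences, then the direct pose as a seed.
H, _ = cv2.findHomography(obj_pts[:, :2], img_pts)
R0, t0 = pose_from_homography(K, H)        # from the previous sketch
rvec0, _ = cv2.Rodrigues(R0)               # rotation matrix -> Rodrigues vector

# Non-linear (Levenberg-Marquardt) refinement of the reprojection error.
ok, rvec, tvec = cv2.solvePnP(obj_pts, img_pts, K, None,
                              rvec=rvec0, tvec=t0.reshape(3, 1),
                              useExtrinsicGuess=True,
                              flags=cv2.SOLVEPNP_ITERATIVE)

R_refined, _ = cv2.Rodrigues(rvec)         # refined 3x3 rotation
t_refined = tvec.ravel()                   # refined translation
```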



If you want to use a minimization method to compute a better approximation of R and T, there are many different options. I suggest you read the paper



"Fast and globally convergent pose estimation from video images", Lu, Hager



which presents one of the best algorithms out there for your purpose.
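Finally, since the question mentions OpenGL: once R and t are known, they can be packed into a 4x4 modelview matrix. The sketch below is a common convention-conversion recipe, not something taken from the answer above; it assumes R and t follow the usual computer-vision camera convention (x right, y down, z into the scene) and flips the y and z axes to match OpenGL's eye space (y up, z towards the viewer) before flattening to column-major order, which is what glLoadMatrixf or a shader uniform expects.

```python
import numpy as np

def modelview_from_pose(R, t):
    """Pack (R, t) into a column-major 4x4 OpenGL modelview matrix."""
    cv_to_gl = np.diag([1.0, -1.0, -1.0])   # flip y and z: CV camera -> GL eye space
    M = np.eye(4)
    M[:3, :3] = cv_to_gl @ R
    M[:3, 3] = cv_to_gl @ t
    return M.flatten(order="F")             # column-major, 16 floats
```

Together with a projection matrix built from K, loading this matrix lets you draw the cube with its base on the tracked plane (the plane z = 0 of the object coordinate system).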

