Question
I am building a custom vision application with Microsoft's CustomVision.ai.
I am following this tutorial.
When you tag images in object detection projects, you need to specify the region of each tagged object using normalized coordinates.
I have an XML file containing the annotations for an image, e.g. one named sample_1.jpg:
<annotation>
<filename>sample_1.jpg</filename>
<size>
<width>410</width>
<height>400</height>
<depth>3</depth>
</size>
<object>
<bndbox>
<xmin>159</xmin>
<ymin>15</ymin>
<xmax>396</xmax>
<ymax>302</ymax>
</bndbox>
</object>
</annotation>
Following the tutorial, I have to convert the bounding box coordinates from xmin, xmax, ymin, ymax to normalized x, y, w, h coordinates.
Can anyone provide a conversion function?
Recommended answer
Assuming (xmin, ymin) and (xmax, ymax) are the top-left and bottom-right corners of your bounding box, respectively, then:
x = xmin
y = ymin
w = xmax - xmin
h = ymax - ymin
You then need to normalize these, which means expressing them as a proportion of the whole image, so simply divide each value by the corresponding image dimension (width for x and w, height for y and h):
x = xmin / width
y = ymin / height
w = (xmax - xmin) / width
h = (ymax - ymin) / height
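Since the question asks for a function, here is a minimal Python sketch of the formulas above. The names corners_to_normalized and regions_from_annotation are made up for illustration, and the parsing step assumes the annotation has exactly the layout shown in the question (a size element with width and height, and one bndbox per object).

```python
import xml.etree.ElementTree as ET


def corners_to_normalized(xmin, ymin, xmax, ymax, width, height):
    """Convert pixel corners (top-left origin) to normalized (x, y, w, h)."""
    x = xmin / width
    y = ymin / height
    w = (xmax - xmin) / width
    h = (ymax - ymin) / height
    return x, y, w, h


def regions_from_annotation(xml_text):
    """Return one normalized (x, y, w, h) tuple per <bndbox> in the annotation.

    Assumes the XML layout from the question: <size> holds the image
    width and height, and each <bndbox> holds xmin, ymin, xmax, ymax.
    """
    root = ET.fromstring(xml_text)
    size = root.find("size")
    width = int(size.find("width").text)
    height = int(size.find("height").text)

    regions = []
    for box in root.iter("bndbox"):
        corners = [int(box.find(tag).text)
                   for tag in ("xmin", "ymin", "xmax", "ymax")]
        regions.append(corners_to_normalized(*corners, width, height))
    return regions
```

For the sample annotation in the question, corners_to_normalized(159, 15, 396, 302, 410, 400) gives roughly (0.388, 0.0375, 0.578, 0.7175).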
This assumes a top-left origin; you will have to apply a shift if that is not the case.
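As an illustration of such a shift, here is a sketch for one possible case: the annotation measures y upward from the bottom-left corner, while the target format still expects a top-left origin. Both of these are assumptions, not something stated in the question, and the function name is hypothetical. Only y changes, because the top edge of the box is then height - ymax pixels below the top of the image.

```python
def bottom_left_corners_to_normalized(xmin, ymin, xmax, ymax, width, height):
    """Convert corners given with a bottom-left origin (y pointing up)
    to normalized (x, y, w, h) relative to a top-left origin.

    Assumption: ymax is the top edge of the box measured from the
    bottom of the image, so its distance from the top is height - ymax.
    """
    x = xmin / width
    y = (height - ymax) / height
    w = (xmax - xmin) / width
    h = (ymax - ymin) / height
    return x, y, w, h
```

The width and height of the box are unaffected by the choice of origin, so w and h are computed exactly as before.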