Question
I am trying to detect objects in images of different shapes (not square). I used the faster_rcnn_inception_v2 model, where I can use an image resizer that maintains the aspect ratio of the image, and the output is satisfactory:
image_resizer {
  keep_aspect_ratio_resizer {
    min_dimension: 100
    max_dimension: 600
  }
}
Now, for faster performance, I want to train it using the ssd_inception_v2 model. The sample configuration uses a fixed-shape resize, as below:
image_resizer {
  fixed_shape_resizer {
    height: 300
    width: 300
  }
}
But the problem is that I get a very poor detection result because of that fixed resize. I tried changing it to keep_aspect_ratio_resizer as used earlier in faster_rcnn_inception_v2, and I get the following error:
InvalidArgumentError (see above for traceback): ConcatOp : Dimensions of inputs should match: shape[0] = [1,100,500,3] vs. shape[1] = [1,100,439,3]
How can I configure SSD models to resize images while maintaining the aspect ratio?
Answer
SSD and Faster R-CNN work quite differently from one another, so even though F-RCNN has no such constraint, for SSD you need input images that always have the same size (strictly, you need the feature maps to always have the same size, but the best way to ensure that is to always use the same input size). This is because it ends with fully connected layers, for which you need to know the size of the feature maps; whereas F-RCNN has only convolutions (which work on any input size) up to the ROI-pooling layer (which doesn't need a fixed image size).
So you need to use a fixed-shape resizer for SSD. In the best case, your data always has the same width/height ratio; in that case, just use a fixed_shape_resizer with the same ratio. Otherwise, you'll have to choose an image size (w, h) yourself more or less arbitrarily (some kind of average over your data would do). From then on, you have several options:
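For example, if your data is mostly around a 4:3 width/height ratio, the resizer could keep that ratio; the 400×300 numbers below are an arbitrary illustration, not values from the question:

```
image_resizer {
  fixed_shape_resizer {
    height: 300
    width: 400
  }
}
```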
Just let TF reshape the input to (w, h) with the resizer, without preprocessing. The problem is that the images will be deformed, which may (or may not, depending on your data and the objects you're trying to detect) be a problem.
Crop all the images into sub-images with the same aspect ratio as (w, h). Problem: you'll lose part of the images, or you'll have to run more inferences per image.
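The cropping option can be sketched as follows. This is a hypothetical numpy helper, not part of the Object Detection API: it keeps the largest centered window whose aspect ratio matches the target, discarding the rest of the image.

```python
import numpy as np

def crop_to_aspect(image, target_w, target_h):
    """Center-crop an (H, W, C) array to the target_w/target_h aspect ratio.

    Hypothetical helper: keeps the largest centered window with the
    requested ratio; pixels outside that window are lost.
    """
    h, w = image.shape[:2]
    target_ratio = target_w / target_h
    if w / h > target_ratio:
        # Image is too wide: shrink the width, keep the full height.
        new_w = int(round(h * target_ratio))
        x0 = (w - new_w) // 2
        return image[:, x0:x0 + new_w]
    else:
        # Image is too tall: shrink the height, keep the full width.
        new_h = int(round(w / target_ratio))
        y0 = (h - new_h) // 2
        return image[y0:y0 + new_h, :]

# Example: a 600x439 image cropped for a square 300x300 resizer
img = np.zeros((600, 439, 3), dtype=np.uint8)
cropped = crop_to_aspect(img, 300, 300)  # -> shape (439, 439, 3)
```

Sliding several such windows over each image (instead of one centered crop) is what leads to the "more inferences per image" cost mentioned above.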
Pad all the images (with black pixels or random white noise) to get images with the same aspect ratio as (w, h). You'll have to do some coordinate translation on the output bounding boxes: the coordinates you get will be relative to the padded image, and you translate them back to the original image by multiplying each normalized coordinate by padded_size/original_size on the corresponding axis. The problem is that some objects will be downsized (relative to the full image size) more than others, which may or may not be a problem depending on your data and what you're trying to detect.
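The padding option and the box translation can be sketched like this; both helpers are hypothetical numpy code, not Object Detection API functions, and the sizes reuse the shapes from the error message above (a 100×439 image padded to 100×500):

```python
import numpy as np

def pad_to_aspect(image, target_w, target_h, fill=0):
    """Pad an (H, W, C) array on the bottom/right to target_w/target_h ratio.

    Hypothetical helper: no pixels are lost; only the normalized box
    coordinates change, since the canvas grows.
    """
    h, w = image.shape[:2]
    target_ratio = target_w / target_h
    if w / h < target_ratio:
        new_h, new_w = h, int(np.ceil(h * target_ratio))
    else:
        new_h, new_w = int(np.ceil(w / target_ratio)), w
    padded = np.full((new_h, new_w, image.shape[2]), fill, dtype=image.dtype)
    padded[:h, :w] = image  # original image sits in the top-left corner
    return padded

def box_to_original(box, padded_size, original_size):
    """Map a normalized [ymin, xmin, ymax, xmax] box predicted on the padded
    image back to normalized coordinates in the original image, multiplying
    each axis by padded_size / original_size."""
    ph, pw = padded_size
    oh, ow = original_size
    ymin, xmin, ymax, xmax = box
    return [ymin * ph / oh, xmin * pw / ow, ymax * ph / oh, xmax * pw / ow]

img = np.zeros((100, 439, 3), dtype=np.uint8)
padded = pad_to_aspect(img, 500, 100)          # -> shape (100, 500, 3)
box = box_to_original([0.1, 0.2, 0.5, 0.8],
                      padded.shape[:2], img.shape[:2])
```

Padding bottom/right (rather than centering) keeps the translation a pure scaling with no offset, which is why the mapping above is a single multiplication per axis.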