Problem description
Say my original raw dataset has 100 images, and I apply the random_horizontal_flip data augmentation, which by default flips an image horizontally with 50% probability. For the sake of example, suppose it flips 50 of the 100 images. So:
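- Does this mean my algorithm will now train on 150 images (the 100 originals plus 50 flipped versions), or will it still train on 100 images, 50 of which are flipped versions of the originals?
- Does the answer to the first question generalize to all data augmentation options provided by the TensorFlow Object Detection API?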
I read as much of the official documentation as I could and looked through the preprocessor code, but couldn't find an answer.
Recommended answer
The default augmentation probability, which is 50%, is applied independently to each image each time it is fed to the model, so the dataset itself does not grow. The number of images your model is trained on depends on how long you train, that is, on the number of training steps or epochs.
Say your batch size is 1 and you train for 100 steps (a single pass over the 100 images): your algorithm will be trained on 100 images, about 50 of which will be flipped versions of the originals. In this case the model never sees the unflipped versions of those 50 images, because training is too short.
Say your batch size is 1 and you train for 200 steps (two passes over the 100 images): your algorithm will be trained on 200 images, about 100 of which will be flipped versions of the originals.
As a result, as long as training runs long enough that each image is drawn several times, the 50% flip probability has roughly the same effect as doubling the dataset by adding a flipped copy of every image.
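The counting above can be checked with a small simulation. This is only a sketch in plain Python, not the Object Detection API's actual input pipeline: `simulate_training` is a hypothetical helper, images are represented by integer IDs, and an epoch is taken to mean one full pass over the dataset with batch size 1.

```python
import random


def simulate_training(num_images=100, num_epochs=1, flip_prob=0.5, seed=0):
    """Simulate on-the-fly random horizontal flipping.

    Returns (samples_processed, distinct_variants_seen), where a variant
    is an (image_id, flipped) pair.
    """
    rng = random.Random(seed)
    seen = set()
    samples_processed = 0
    for _ in range(num_epochs):
        for image_id in range(num_images):
            # The flip decision is drawn again every time the image is fed.
            flipped = rng.random() < flip_prob
            seen.add((image_id, flipped))
            samples_processed += 1
    return samples_processed, len(seen)


if __name__ == "__main__":
    for epochs in (1, 2, 10):
        processed, distinct = simulate_training(num_epochs=epochs)
        print(f"epochs={epochs}: processed={processed}, distinct variants={distinct}")
```

With one pass the model processes exactly 100 samples, never 150. Only with more passes does the number of distinct variants it has seen approach the 200 of a fully doubled dataset.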
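If you add the vertical flip (random_vertical_flip) on top of the horizontal flip, and assuming the two flips are applied independently, each image can appear in four orientations (original, horizontally flipped, vertically flipped, or both), so the effect is like quadrupling your dataset.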
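To make the effect of combining the two flips concrete, the possible outcomes can be enumerated. The sketch below assumes the horizontal and vertical flips are decided independently, each with probability 0.5. The helper names are hypothetical and nothing here calls the Object Detection API. Under that assumption there are four possible orientations per image: the original plus three flipped variants.

```python
import random
from collections import Counter
from itertools import product


def flip_combinations():
    """All (horizontal_flip, vertical_flip) outcomes for one image."""
    return list(product([False, True], repeat=2))


def simulate_two_flips(num_draws=10_000, prob=0.5, seed=0):
    """Count how often each combination occurs under independent flips."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(num_draws):
        horizontal = rng.random() < prob
        vertical = rng.random() < prob
        counts[(horizontal, vertical)] += 1
    return counts


if __name__ == "__main__":
    print("possible orientations:", len(flip_combinations()))
    for combo, count in sorted(simulate_two_flips().items()):
        print(combo, count)
```

Each combination should turn up in roughly a quarter of the draws, so over a long enough training run every image is seen in all of these orientations.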