Problem description
When training a CNN, many authors mention randomly cropping images from the center of the original image, giving a data augmentation factor of 2048. Can anyone please elaborate on what this means?
Recommended answer
I believe you are referring to the data augmentation scheme in ImageNet Classification with Deep Convolutional Neural Networks. The 2048x aspect of their scheme goes as follows:
- First, all images are rescaled down to 256x256.
- Then, for each image, they take random crops of size 224x224.
- For each random 224x224 crop, they additionally augment by taking the horizontal reflection of that 224x224 patch.
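The steps above can be sketched in a few lines of plain Python. This is a minimal illustration and not the authors' code: the function name `random_crop_and_flip`, the nested-list image representation, and the 50% flip probability are all assumptions made for the example.

```python
import random

def random_crop_and_flip(image, crop_size=224, rng=random):
    """Return a random crop_size x crop_size patch, flipped half the time.

    `image` is assumed to be a list of rows, each row a list of pixels.
    """
    height, width = len(image), len(image[0])
    # Pick a top-left corner that keeps the whole crop inside the image.
    top = rng.randint(0, height - crop_size)
    left = rng.randint(0, width - crop_size)
    patch = [row[left:left + crop_size] for row in image[top:top + crop_size]]
    # Horizontal reflection: reverse every row (assumed probability 0.5).
    if rng.random() < 0.5:
        patch = [row[::-1] for row in patch]
    return patch

# Hypothetical 256x256 "image" whose pixels record their own coordinates.
image = [[(r, c) for c in range(256)] for r in range(256)]
patch = random_crop_and_flip(image, rng=random.Random(0))
print(len(patch), len(patch[0]))
```

Passing a seeded `random.Random` instance makes the crop reproducible, which is handy when debugging an augmentation pipeline.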
So my guess as to how they arrive at the 2048x data augmentation factor:
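- There are 32 * 32 = 1024 possible 224x224 crops of a 256x256 image. To see this, just observe that 256 - 224 = 32, so there are 32 possible horizontal offsets and 32 possible vertical offsets.
- Taking the horizontal reflection of each crop doubles that number.
- 1024 * 2 = 2048.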
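The arithmetic above can be checked with a small sketch. The function name and the `offsets_inclusive` flag are hypothetical, introduced only for this example. One caveat worth knowing: if offsets 0 through 32 are all counted, there are 33 positions per axis, not 32, so the 32 * 32 count is the one that matches the quoted 2048 figure, not a strict enumeration of every valid crop.

```python
def augmentation_factor(image_size=256, crop_size=224, offsets_inclusive=False):
    """Count distinct (crop position, flip) combinations for a square image."""
    # The answer's guess counts image_size - crop_size offsets per axis.
    # Setting offsets_inclusive=True also counts the final offset.
    offsets = image_size - crop_size + (1 if offsets_inclusive else 0)
    crops = offsets * offsets
    return crops * 2  # each crop also has a horizontal reflection

print(augmentation_factor())                        # 2048
print(augmentation_factor(offsets_inclusive=True))  # 2178
```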
The center crop aspect of your question stems from the fact that the original images are not all the same size. So what the authors did was rescale each rectangular image so that its shortest side had length 256, and then take the center 256x256 crop from the result, thereby bringing the entire dataset to 256x256. Once all the images are 256x256, they can apply the (up to) 2048x data augmentation scheme described above.
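As a rough sketch of that preprocessing step, the snippet below only computes the resized dimensions and the center crop box for a given input size. It does not touch pixel data. The function name `center_crop_plan` and the rounding choice are assumptions for illustration, and an image library would still be needed to do the actual resizing.

```python
def center_crop_plan(width, height, target=256):
    """Return the resized (width, height) and the center crop box.

    The box is (left, top, right, bottom) in resized-image coordinates.
    """
    # Scale so that the shortest side becomes `target` pixels long.
    scale = target / min(width, height)
    # max() guards against floating-point rounding dropping below target.
    new_width = max(target, round(width * scale))
    new_height = max(target, round(height * scale))
    # Center the target x target window inside the resized image.
    left = (new_width - target) // 2
    top = (new_height - target) // 2
    return (new_width, new_height), (left, top, left + target, top + target)

# Hypothetical 500x375 input image.
print(center_crop_plan(500, 375))  # ((341, 256), (42, 0, 298, 256))
```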