Problem description
I am currently developing a CNN model (an autoencoder) with Keras. This time my inputs have shape (47,47,3), that is, a 47x47 image with 3 (RGB) channels.
I have worked with some CNNs in the past, but this time my input dimensions are prime numbers (47 pixels). I think this is causing issues with my implementation, specifically when using MaxPooling2D and UpSampling2D in my model. I noticed that some dimensions are lost when max pooling and then upsampling.
Using model.summary(), I can see that after passing my (47,47,3) input through a Conv2D(24) and a MaxPooling2D with a (2,2) kernel (that is, 24 filters and half the spatial size), I get an output shape of (24,24,24).
Now, if I try to reverse that with an UpSampling2D layer with a (2,2) kernel (doubling the spatial size) and convolving again, I get a (48,48,3) shaped output. That is one more row and one more column than needed.
To this I thought, "no problem, just choose a kernel size that gives you the desired 47 pixels when upsampling", but given that 47 is a prime number, it seems to me that there is no kernel size that can do that.
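The shape arithmetic behind this can be checked with a few lines of plain Python. This is only a minimal sketch, assuming the usual Keras output-size rules for pooling with a stride equal to the pool size (ceil(n/2) with padding='same', floor((n-2)/2)+1 with the default padding='valid'). The helper names pooled_size and upsampled_size are hypothetical and not part of Keras.

```python
import math


def pooled_size(n, pool=2, padding="same"):
    """Spatial size after max pooling with stride equal to the pool size."""
    if padding == "same":
        return math.ceil(n / pool)
    # "valid": windows that do not fit entirely inside the input are dropped.
    return (n - pool) // pool + 1


def upsampled_size(n, factor=2):
    """Spatial size after upsampling by an integer factor."""
    return n * factor


# Pool then upsample a side of length 47 under both padding modes.
roundtrip_same = upsampled_size(pooled_size(47, padding="same"))
roundtrip_valid = upsampled_size(pooled_size(47, padding="valid"))

# Integer upsampling factors that could produce exactly 47 from a smaller size.
divisors_of_47 = [k for k in range(1, 48) if 47 % k == 0]

print(roundtrip_same, roundtrip_valid, divisors_of_47)
```

Neither padding mode brings the side length back to 47, and because 47 has no divisors other than 1 and itself, no integer upsampling factor applied to a smaller feature map can land on 47 either.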
Is there any way to bypass this problem that does not involve changing my input dimensions to a non-prime number? Maybe I am missing something in my approach, or maybe Keras has some feature I am unaware of that could help here.
Recommended answer
I advise you to use ZeroPadding2D and Cropping2D. You can pad your image asymmetrically with 0s and obtain an even-sized image without resizing it. This should solve the problem with upsampling. Moreover, remember to set padding='same' in all of your convolutional layers.
Just to give you an example strategy for performing such operations:
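- If the size of your feature map is odd before a pooling layer, zero-pad it to make it even.
- After the corresponding upsampling operation, use cropping to bring the feature map back to its original odd size.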
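The strategy above could look roughly like the following in Keras. This is a minimal sketch rather than a tuned architecture: the filter counts, the 3x3 kernels, the activations, and the choice to pad on the bottom and right are all assumptions made for illustration, and it assumes Keras is available through TensorFlow.

```python
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(47, 47, 3))

# Pad one row at the bottom and one column on the right: 47 -> 48.
x = layers.ZeroPadding2D(padding=((0, 1), (0, 1)))(inputs)

# Encoder: padding="same" keeps the spatial size through the convolution.
x = layers.Conv2D(24, (3, 3), activation="relu", padding="same")(x)
x = layers.MaxPooling2D((2, 2))(x)  # 48 -> 24

# Decoder: upsample back to the padded size.
x = layers.Conv2D(24, (3, 3), activation="relu", padding="same")(x)
x = layers.UpSampling2D((2, 2))(x)  # 24 -> 48
x = layers.Conv2D(3, (3, 3), activation="sigmoid", padding="same")(x)

# Crop the same row and column that were padded: 48 -> 47.
outputs = layers.Cropping2D(cropping=((0, 1), (0, 1)))(x)

autoencoder = keras.Model(inputs, outputs)
autoencoder.summary()
```

With more than one pooling stage, the same idea applies at each stage where the feature map has an odd size: pad before pooling and crop after the matching upsampling.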