Problem description
When I implemented batch normalization in Python from scratch, I got confused. A paper presents some figures comparing normalization methods, and I think they may not be correct: both the description and the figure look wrong to me.
Description from the paper:
Figure from the paper:
As far as I am concerned, the representation of batch normalization in the original paper is not correct. I am posting the issue here for discussion.
I think batch normalization should look like the following figure.
The key point is how to calculate the mean and std. With feature maps of shape (batch_size, channel_number, width, height), the candidates are
mean = X.mean(axis=(0, 2, 3), keepdims=True)
or
mean = X.mean(axis=(0, 1), keepdims=True)
Which one is correct?
You should calculate the mean and std across all pixels in the images of the batch, so use axis=(0, 2, 3). If the channels have roughly the same distribution, you may calculate the mean and std across channels as well; in that case, just call mean() and std() without the axis parameter.
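To make the axis=(0, 2, 3) suggestion concrete, here is a minimal NumPy sketch of the normalization step only. The function name batch_norm_forward, the eps value, and the random test tensor are my own assumptions for illustration; the learnable scale and shift and the running statistics used at inference time are deliberately left out.

```python
import numpy as np

def batch_norm_forward(X, eps=1e-5):
    """Normalize X of shape (N, C, W, H) with one mean/variance per channel.

    Hypothetical helper: eps is an assumed small constant for numerical
    stability; no learnable scale/shift and no running averages.
    """
    # Reduce over batch and both spatial axes; result has shape (1, C, 1, 1)
    mean = X.mean(axis=(0, 2, 3), keepdims=True)
    var = X.var(axis=(0, 2, 3), keepdims=True)
    # keepdims=True lets the statistics broadcast back against X
    return (X - mean) / np.sqrt(var + eps)

# Made-up input: batch of 8, 3 channels, 4x5 feature maps
rng = np.random.default_rng(0)
X = rng.normal(loc=3.0, scale=2.0, size=(8, 3, 4, 5))
Y = batch_norm_forward(X)

# After normalization each channel should have mean ~0 and std ~1
per_channel_mean = Y.mean(axis=(0, 2, 3))
per_channel_std = Y.std(axis=(0, 2, 3))
print(per_channel_mean, per_channel_std)
```

For the "same distribution across channels" variant mentioned above, the two reductions would become X.mean() and X.var() with no axis argument, giving a single scalar pair for the whole batch.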
The figure in the article is correct: it takes the mean and std across H and W (the image dimensions) for each batch. Obviously, the channel dimension is not shown in the 3D cube.
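One way to see which dimension survives each reduction is to look at the shapes of the resulting statistics. This is only a shape check on a made-up tensor (the sizes 8, 3, 4, 5 are arbitrary assumptions), not a statement about what the paper's figure draws.

```python
import numpy as np

# Layout from the question: (batch_size, channel_number, width, height)
X = np.zeros((8, 3, 4, 5))

# Reduce over batch and spatial axes: one statistic per channel
per_channel = X.mean(axis=(0, 2, 3), keepdims=True)

# Reduce over batch and channel axes: one statistic per spatial position
per_position = X.mean(axis=(0, 1), keepdims=True)

# No axis argument: a single scalar for the whole tensor
global_mean = X.mean()

print(per_channel.shape)     # (1, 3, 1, 1)
print(per_position.shape)    # (1, 1, 4, 5)
print(np.ndim(global_mean))  # 0
```

With axis=(0, 2, 3) the channel axis is the only one kept, which is the per-channel behaviour the first answer recommends; axis=(0, 1) instead keeps a separate value for every pixel location.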