


A module's parameters are changed during training; that is, they are what is learned when a neural network is trained. But what is a buffer, and is it learned during training?


The PyTorch doc for the register_buffer() method reads

As you already observed, model parameters are learned and updated using SGD during the training process.
However, sometimes there are other quantities that are part of a model's "state" and should be
- saved as part of the state_dict,
- moved to cuda() or cpu() with the rest of the model's parameters,
- cast to float/half/double with the rest of the model's parameters.
Registering these quantities as the model's buffers allows PyTorch to track them and save them like regular parameters, but prevents PyTorch from updating them through the optimizer: buffers are not returned by parameters(), so an optimizer built from model.parameters() never touches them.
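
To make the distinction concrete, here is a minimal sketch. The Scaler module and its offset buffer are hypothetical names invented for illustration; only nn.Parameter, register_buffer(), named_parameters(), named_buffers() and state_dict() are PyTorch's own.

```python
import torch
import torch.nn as nn


class Scaler(nn.Module):
    """Hypothetical module: a learnable weight plus a fixed offset."""

    def __init__(self):
        super().__init__()
        # Learnable: returned by parameters(), updated by the optimizer.
        self.weight = nn.Parameter(torch.ones(3))
        # Not learnable: part of the module's state, but not a parameter.
        self.register_buffer("offset", torch.tensor([1.0, 2.0, 3.0]))

    def forward(self, x):
        return x * self.weight + self.offset


model = Scaler()
param_names = [name for name, _ in model.named_parameters()]
buffer_names = [name for name, _ in model.named_buffers()]
state_keys = list(model.state_dict().keys())

# Take one SGD step; only the parameter is handed to the optimizer.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
weight_before = model.weight.detach().clone()
offset_before = model.offset.clone()
loss = model(torch.ones(3)).sum()
loss.backward()
optimizer.step()

# Casting the module casts its floating-point buffers as well.
model.double()
print(param_names, buffer_names, state_keys, model.offset.dtype)
```

After the step, weight has moved while offset still holds its initial values, and both names appear in the state_dict.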

An example of a buffer can be found in the _BatchNorm module, where running_mean, running_var and num_batches_tracked are registered as buffers and updated by accumulating statistics of the data forwarded through the layer. This is in contrast to the weight and bias parameters, which learn an affine transformation of the data using regular SGD optimization.
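
The BatchNorm behaviour can be checked with a short sketch like the one below. The batch shape and the shift by 5.0 are arbitrary choices for illustration. No optimizer and no gradients are involved, yet the running statistics still change in training mode.

```python
import torch
import torch.nn as nn

bn = nn.BatchNorm1d(4)

# weight and bias are parameters; the running statistics are buffers.
param_names = sorted(name for name, _ in bn.named_parameters())
buffer_names = sorted(name for name, _ in bn.named_buffers())

mean_before = bn.running_mean.clone()  # initialised to zeros

# Training mode: the forward pass itself updates the buffers.
bn.train()
with torch.no_grad():
    bn(torch.randn(8, 4) + 5.0)  # a batch whose mean is near 5
mean_after = bn.running_mean.clone()
batches_seen = int(bn.num_batches_tracked)

# Eval mode: the buffers are used for normalisation but left unchanged.
bn.eval()
with torch.no_grad():
    bn(torch.randn(8, 4) + 5.0)
mean_eval = bn.running_mean.clone()

print(param_names, buffer_names, batches_seen)
```

Here running_mean moves away from zero after the training-mode forward pass and stays fixed in eval mode, which is the accumulation described above.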

