Problem Description
I was trying to build a CNN with PyTorch and had difficulty with max pooling. I have taken cs231n, held by Stanford. As I recall, max pooling can be used as a dimension-reduction step: for example, given a (1, 20, height, width) input to max_pool2d (assuming my batch_size is 1) and a (1, 1) kernel, I would like an output of shape (1, 1, height, width), which means the kernel should slide over the channel dimension. However, the PyTorch docs say the kernel only slides over height and width. Thanks to @ImgPrcSng on the PyTorch forum, who told me to use max_pool3d, which turned out to work well. But there is still a reshape operation between the output of the conv2d layer and the input of the max_pool3d layer, so it is hard to wrap everything in an nn.Sequential, and I wonder whether there is another way to do this.
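For reference, a minimal sketch of the max_pool3d workaround described above, assuming a 20-channel feature map of size 32×32 (the shapes are purely illustrative); the extra unsqueeze/squeeze is the reshape step the question wants to avoid:

import torch
import torch.nn.functional as F

x = torch.randn(1, 20, 32, 32)                       # conv2d output: (N, C, H, W)
x5d = x.unsqueeze(1)                                 # reshape to (N, 1, C, H, W) for max_pool3d
pooled = F.max_pool3d(x5d, kernel_size=(20, 1, 1))   # pool across all 20 channels
pooled = pooled.squeeze(1)                           # back to (N, 1, H, W)
print(pooled.shape)                                  # torch.Size([1, 1, 32, 32])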
Recommended Answer
Would something like this work?
from torch.nn import MaxPool1d
import torch.nn.functional as F


class ChannelPool(MaxPool1d):
    """Max-pools over the channel dimension of an (N, C, H, W) tensor."""

    def forward(self, input):
        n, c, h, w = input.size()
        # Flatten the spatial dims and move channels last: (N, H*W, C)
        input = input.view(n, c, h * w).permute(0, 2, 1)
        # max_pool1d pools over the last dimension, i.e. the channels
        pooled = F.max_pool1d(
            input,
            self.kernel_size,
            self.stride,
            self.padding,
            self.dilation,
            self.ceil_mode,
            self.return_indices,
        )
        _, _, c = pooled.size()  # new (reduced) channel count
        # Move channels back to dim 1 and restore the spatial layout
        pooled = pooled.permute(0, 2, 1)
        return pooled.view(n, c, h, w)
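This drops straight into an nn.Sequential; a small usage sketch, assuming a 20-channel conv output (layer sizes are illustrative):

import torch
from torch import nn

model = nn.Sequential(
    nn.Conv2d(3, 20, kernel_size=3, padding=1),
    ChannelPool(kernel_size=20),   # pool all 20 channels down to 1
)
out = model(torch.randn(1, 3, 32, 32))
print(out.shape)  # torch.Size([1, 1, 32, 32])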
Or, using einops:
from torch.nn import MaxPool1d
import torch.nn.functional as F
from einops import rearrange


class ChannelPool(MaxPool1d):
    # Same ChannelPool as above, written with einops rearrange instead of view/permute
    def forward(self, input):
        n, c, h, w = input.size()
        pool = lambda x: F.max_pool1d(
            x,
            self.kernel_size,
            self.stride,
            self.padding,
            self.dilation,
            self.ceil_mode,
            self.return_indices,
        )
        # Flatten the spatial dims, pool over channels, then restore the layout
        return rearrange(
            pool(rearrange(input, "n c h w -> n (h w) c")),
            "n (h w) c -> n c h w",
            n=n,
            h=h,
            w=w,
        )
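The einops variant behaves the same and is constructed the same way (e.g. ChannelPool(kernel_size=20)); the rearrange patterns simply name the axes explicitly, which replaces the manual view/permute bookkeeping of the first version.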