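TL;DR
- nn.Conv2d expects its input tensor in (batch_size, channels, height, width) layout, but the example input x is in (batch_size, height, width, channels) layout, so its dimensions must be permuted before the convolution.
- Note that TensorFlow uses the "NHWC" format (batch, height, width, channels) by default, whereas PyTorch uses "NCHW" (batch, channels, height, width).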
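As a quick illustration of the layout conversion, the sketch below permutes a random NHWC tensor into NCHW and back. The variable names (x_nhwc, x_nchw, x_back) are made up for this example and do not appear in the original code.

```python
import torch

# NHWC tensor with the same shape as the example input in this post
x_nhwc = torch.randn(8, 64, 64, 3)

# permute(0, 3, 1, 2): new dim 0 <- old 0 (N), new dim 1 <- old 3 (C),
# new dim 2 <- old 1 (H), new dim 3 <- old 2 (W)
x_nchw = x_nhwc.permute(0, 3, 1, 2)
print(x_nchw.shape)  # torch.Size([8, 3, 64, 64])

# The same element, addressed in both layouts
assert x_nhwc[2, 10, 20, 1] == x_nchw[2, 1, 10, 20]

# permute returns a view with reordered strides, not a copy
print(x_nchw.is_contiguous())  # False

# The inverse permutation restores the NHWC layout
x_back = x_nchw.permute(0, 2, 3, 1)
assert torch.equal(x_back, x_nhwc)
```

If a later operation needs contiguous memory (for example .view), calling .contiguous() on the permuted tensor makes a copy in the new layout.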
Error message
RuntimeError: Given groups=1, weight of size [16, 3, 2, 3],
expected input[8, 65, 66, 3] to have 3 channels,
but got 65 channels instead
Code that raises the error
import torch
import torch.nn as nn

def down_shifted_conv2d(x, num_filters, filters_size=(2, 3), stride=1, **kwargs):
    batch_size, H, W, channels = x.shape  # x is NHWC
    # F.pad reads this tuple in pairs starting from the last dimension:
    # channels (0, 0), width, height, batch (0, 0)
    padding = (0, 0,
               (filters_size[1] - 1) // 2, (filters_size[1] - 1) // 2,
               filters_size[0] - 1, 0,
               0, 0)
    x_paded = nn.functional.pad(x, padding)
    print(x_paded.shape)  # torch.Size([8, 65, 66, 3]): still NHWC
    conv_layer = nn.Conv2d(in_channels=channels, out_channels=num_filters,
                           kernel_size=filters_size,
                           stride=stride, **kwargs)
    # Bug: Conv2d treats dimension 1 (the padded height, 65) as the channel dimension
    return conv_layer(x_paded)

# Example usage
x = torch.randn(8, 64, 64, 3)  # Example input with batch size 8, height and width 64, and 3 channels
num_filters = 16
output = down_shifted_conv2d(x, num_filters)  # raises the RuntimeError shown above
print(output.shape)
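To see where the shapes in the error message come from, here is a minimal reproduction outside the function. It is only a sketch: the padding values are written out by hand for the default 2x3 kernel, and the names padded, conv and raised are made up for this example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(8, 64, 64, 3)  # NHWC

# F.pad consumes the tuple in pairs, starting from the LAST dimension:
# (C_left, C_right, W_left, W_right, H_top, H_bottom, N_front, N_back)
padded = F.pad(x, (0, 0, 1, 1, 1, 0, 0, 0))
print(padded.shape)  # torch.Size([8, 65, 66, 3])

conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=(2, 3))
print(conv.weight.shape)  # torch.Size([16, 3, 2, 3])

# Conv2d reads dimension 1 as channels, finds 65 instead of 3, and raises
try:
    conv(padded)
    raised = False
except RuntimeError as err:
    raised = True
    print(err)
```

The padding itself is applied to the right dimensions; the failure comes only from handing the still-NHWC tensor to nn.Conv2d.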
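Explanation
- In PyTorch, nn.Conv2d expects its input in (batch_size, channels, height, width) layout. The original input x is in (batch_size, height, width, channels) layout, so the tensor's dimensions must be reordered.
- permute is the function that reorders tensor dimensions: pass the indices of the existing dimensions in the order you want them to appear. Here, permute(0, 3, 1, 2) turns NHWC into NCHW.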
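import torch.nn as nn
import torch.nn.functional as F

def down_shifted_conv2d(x, num_filters, filters_size=(2, 3), stride=1, **kwargs):
    batch_size, H, W, channels = x.shape
    # Transpose the input tensor to (batch_size, channels, height, width)
    x = x.permute(0, 3, 1, 2)
    # Padding for NCHW, read from the last dimension: (W_left, W_right, H_top, H_bottom)
    padding = ((filters_size[1] - 1) // 2, (filters_size[1] - 1) // 2,
               filters_size[0] - 1, 0)
    x_paded = F.pad(x, padding)
    # Convolve the padded NCHW tensor
    conv_layer = nn.Conv2d(in_channels=channels, out_channels=num_filters,
                           kernel_size=filters_size, stride=stride, **kwargs)
    return conv_layer(x_paded)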
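The order of the two steps is a matter of taste: permuting first and padding with four values should give the same tensor as padding the NHWC input with eight values and permuting afterwards. The sketch below checks this for the default 2x3 kernel; kh, kw, side and top are names introduced only for this example.

```python
import torch
import torch.nn.functional as F

x = torch.randn(8, 64, 64, 3)  # NHWC
kh, kw = 2, 3
side = (kw - 1) // 2  # columns added on the left and right
top = kh - 1          # rows added at the top only

# Option 1: permute to NCHW, then pad the last two dimensions (W first, then H)
a = F.pad(x.permute(0, 3, 1, 2), (side, side, top, 0))

# Option 2: pad the NHWC tensor (C, W, H, N pairs), then permute to NCHW
b = F.pad(x, (0, 0, side, side, top, 0, 0, 0)).permute(0, 3, 1, 2)

print(a.shape, b.shape)  # both torch.Size([8, 3, 65, 66])
assert torch.equal(a, b)
```

Permuting first keeps the padding tuple shorter, because F.pad only needs entries for the trailing dimensions that actually change.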
Corrected code
import torch
import torch.nn as nn

def down_shifted_conv2d(x, num_filters, filters_size=(2, 3), stride=1, **kwargs):
    batch_size, H, W, channels = x.shape
    # Pad each of the four dimensions in order; F.pad reads the pairs
    # starting from the last dimension (channels, width, height, batch)
    padding = (0, 0,
               (filters_size[1] - 1) // 2, (filters_size[1] - 1) // 2,
               filters_size[0] - 1, 0,
               0, 0)
    x_paded = nn.functional.pad(x, padding)
    # Reorder NHWC -> NCHW so that Conv2d finds the channels at dimension 1
    x_paded = x_paded.permute(0, 3, 1, 2)
    # Apply the convolution
    conv_layer = nn.Conv2d(in_channels=channels, out_channels=num_filters,
                           kernel_size=filters_size,
                           stride=stride, **kwargs)
    return conv_layer(x_paded)

# Example usage
x = torch.randn(8, 64, 64, 3)
num_filters = 16
output = down_shifted_conv2d(x, num_filters)
print(output.shape)  # torch.Size([8, 16, 64, 64]), in NCHW layout
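One thing to keep in mind: the function above builds a new nn.Conv2d on every call, so the weights are re-initialized each time and are never registered with an optimizer. If the layer is meant to be trained, one possible approach is to wrap the same logic in an nn.Module that creates the convolution once. The class name DownShiftedConv2d and its in_channels argument are hypothetical and not part of the original code; this is a sketch, not the author's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DownShiftedConv2d(nn.Module):
    """Hypothetical module version of down_shifted_conv2d; expects NHWC input."""

    def __init__(self, in_channels, num_filters, filters_size=(2, 3), stride=1, **kwargs):
        super().__init__()
        kh, kw = filters_size
        # NCHW padding, read from the last dimension: (W_left, W_right, H_top, H_bottom)
        self.padding = ((kw - 1) // 2, (kw - 1) // 2, kh - 1, 0)
        # Created once, so the weights persist across calls and can be trained
        self.conv = nn.Conv2d(in_channels, num_filters,
                              kernel_size=(kh, kw), stride=stride, **kwargs)

    def forward(self, x_nhwc):
        x = x_nhwc.permute(0, 3, 1, 2)  # NHWC -> NCHW
        x = F.pad(x, self.padding)
        return self.conv(x)             # output stays in NCHW

layer = DownShiftedConv2d(in_channels=3, num_filters=16)
x = torch.randn(8, 64, 64, 3)
out = layer(x)
print(out.shape)  # torch.Size([8, 16, 64, 64])
```

The output is left in NCHW, matching the corrected function; if downstream code expects NHWC, out.permute(0, 2, 3, 1) converts it back.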