Input: 128 preprocessed sequences of 30 frames each (64×64 silhouettes cropped to 64×44 during preprocessing)
ipts[128, 30, 64, 44] -> unsqueeze a channel dim -> sils[128, 1, 30, 64, 44]
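A minimal sketch of this step; the tuple unpacking follows OpenGait's GaitSet.forward and is an assumption from that codebase:

# Sketch only -- unpacking mirrors OpenGait's GaitSet.forward (assumed):
ipts, labs, _, _, seqL = inputs
sils = ipts[0]                # [128, 30, 64, 44]
if len(sils.size()) == 4:
    sils = sils.unsqueeze(1)  # add channel dim -> [128, 1, 30, 64, 44]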
GaitSet(
(set_block1): SetBlockWrapper(
sils is first transposed and reshaped to torch.Size([3840, 1, 64, 44]) (128 × 30 frames folded into the batch dim)
(forward_block): Sequential(
(0): BasicConv2d(
(conv): Conv2d(1, 32, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), bias=False)
)
(1): LeakyReLU(negative_slope=0.01, inplace=True)
(2): BasicConv2d(
(conv): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
)
(3): LeakyReLU(negative_slope=0.01, inplace=True)
(4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
)
yielding torch.Size([3840, 32, 32, 22]),
then reshaped and transposed back to torch.Size([128, 32, 30, 32, 22]) -> outs
)
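What SetBlockWrapper does can be reconstructed from the shapes above; a minimal sketch (the actual OpenGait implementation may differ in details):

import torch.nn as nn

class SetBlockWrapper(nn.Module):
    # Fold the frame dim into the batch dim so the 2D convs see single frames.
    def __init__(self, forward_block):
        super().__init__()
        self.forward_block = forward_block

    def forward(self, x):
        n, c, s, h, w = x.size()                       # [128, 1, 30, 64, 44]
        x = x.transpose(1, 2).reshape(n * s, c, h, w)  # [3840, 1, 64, 44]
        out = self.forward_block(x)                    # [3840, 32, 32, 22]
        _, c2, h2, w2 = out.size()
        return out.reshape(n, s, c2, h2, w2).transpose(1, 2).contiguous()
        # -> [128, 32, 30, 32, 22]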
(set_pooling): PackSequenceWrapper()
max-pools outs over the frame dim (dim=2) to get gl[128, 32, 32, 22]
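PackSequenceWrapper here applies torch.max over the frame dim; a sketch for the fixed-length case in this walkthrough (the variable-length path driven by seqL is omitted):

import torch
import torch.nn as nn

class PackSequenceWrapper(nn.Module):
    # Apply a pooling function over the frame dim (dim=2).
    def __init__(self, pooling_func):
        super().__init__()
        self.pooling_func = pooling_func

    def forward(self, seqs, dim=2):
        return self.pooling_func(seqs, dim)

set_pooling = PackSequenceWrapper(torch.max)
gl = set_pooling(outs)[0]  # torch.max returns (values, indices); keep values
# outs [128, 32, 30, 32, 22] -> gl [128, 32, 32, 22]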
(gl_block2): Sequential(
gl_block2 is in fact a deep copy of set_block2's conv stack (see the sketch after this block)
input: gl[128, 32, 32, 22]
(0): BasicConv2d(
(conv): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
)
(1): LeakyReLU(negative_slope=0.01, inplace=True)
(2): BasicConv2d(
(conv): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
)
(3): LeakyReLU(negative_slope=0.01, inplace=True)
(4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
) -> gl[128, 64, 16, 11]
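A sketch of how that deep copy is set up, with the layer layout printed above (plain Conv2d stands in for BasicConv2d):

import copy
import torch.nn as nn

set_block2 = nn.Sequential(
    nn.Conv2d(32, 64, 3, 1, 1, bias=False), nn.LeakyReLU(inplace=True),
    nn.Conv2d(64, 64, 3, 1, 1, bias=False), nn.LeakyReLU(inplace=True),
    nn.MaxPool2d(2, 2),
)
gl_block2 = copy.deepcopy(set_block2)     # same layers, independent weights
set_block2 = SetBlockWrapper(set_block2)  # only the set branch is wrapped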
(set_block2): SetBlockWrapper(
likewise transposes and reshapes outs[128, 32, 30, 32, 22] first (to [3840, 32, 32, 22])
(forward_block): Sequential(
(0): BasicConv2d(
(conv): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
)
(1): LeakyReLU(negative_slope=0.01, inplace=True)
(2): BasicConv2d(
(conv): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
)
(3): LeakyReLU(negative_slope=0.01, inplace=True)
(4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
) yields torch.Size([128, 64, 30, 16, 11]) -> outs
)
(set_pooling): PackSequenceWrapper()
outs is max-pooled over the frame dim again and summed with gl, yielding torch.Size([128, 64, 16, 11]) -> gl
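Putting this stage together, a sketch using the names defined in the sketches above:

gl = gl_block2(gl)              # [128, 32, 32, 22]     -> [128, 64, 16, 11]
outs = set_block2(outs)         # [128, 32, 30, 32, 22] -> [128, 64, 30, 16, 11]
gl = gl + set_pooling(outs)[0]  # temporal max, then elementwise sum -> [128, 64, 16, 11]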
(gl_block3): Sequential(
input: gl[128, 64, 16, 11]
(0): BasicConv2d(
(conv): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
)
(1): LeakyReLU(negative_slope=0.01, inplace=True)
(2): BasicConv2d(
(conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
)
(3): LeakyReLU(negative_slope=0.01, inplace=True)
) -> gl[128, 128, 16, 11]
(set_block3): SetBlockWrapper(
the input here is outs[128, 64, 30, 16, 11]
(forward_block): Sequential(
(0): BasicConv2d(
(conv): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
)
(1): LeakyReLU(negative_slope=0.01, inplace=True)
(2): BasicConv2d(
(conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
)
(3): LeakyReLU(negative_slope=0.01, inplace=True)
)
) -> outs[128, 128, 30, 16, 11]
(set_pooling): PackSequenceWrapper()
outs is max-pooled over the frame dim to [128, 128, 16, 11],
then summed with gl, yielding torch.Size([128, 128, 16, 11]) -> gl
(HPP): HorizontalPoolingPyramid()
input: outs[128, 128, 16, 11]
for each bin count in [16, 8, 4, 2, 1], outs is viewed into a z of the matching shape, e.g. bin=8 gives z[128, 128, 8, 22]
then the max and mean of z over the last dim are summed, giving [128, 128, 16], [128, 128, 8], and so on
concatenating these gives feature1[128, 128, 31] (31 = 16+8+4+2+1)
gl[128, 128, 16, 11] goes through the same pyramid,
giving feature2[128, 128, 31]
concatenating the two gives feature[128, 128, 62]
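A sketch of HorizontalPoolingPyramid consistent with the shapes above:

import torch
import torch.nn as nn

class HorizontalPoolingPyramid(nn.Module):
    # Split the map into b horizontal strips per scale, pool each strip
    # by max+mean, and concatenate the parts of all scales.
    def __init__(self, bin_num=(16, 8, 4, 2, 1)):
        super().__init__()
        self.bin_num = bin_num

    def forward(self, x):
        n, c = x.size()[:2]                # x: [128, 128, 16, 11]
        features = []
        for b in self.bin_num:
            z = x.view(n, c, b, -1)        # b=8 -> [128, 128, 8, 22]
            z = z.max(-1)[0] + z.mean(-1)  # [n, c, b]
            features.append(z)
        return torch.cat(features, -1)     # [128, 128, 31], 31 = 16+8+4+2+1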
(Head): SeparateFCs()
input: feature[128, 128, 62]
yields embs[128, 256, 62]
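SeparateFCs maps each of the 62 parts with its own 128->256 weight matrix; a sketch (initialization details assumed):

import torch
import torch.nn as nn

class SeparateFCs(nn.Module):
    # One independent linear map per horizontal part, no weight sharing.
    def __init__(self, parts_num=62, in_channels=128, out_channels=256):
        super().__init__()
        self.fc_bin = nn.Parameter(torch.randn(parts_num, in_channels, out_channels))

    def forward(self, x):
        x = x.permute(2, 0, 1)       # [128, 128, 62] -> [62, 128, 128]
        out = x.matmul(self.fc_bin)  # batched matmul -> [62, 128, 256]
        return out.permute(1, 2, 0).contiguous()  # -> [128, 256, 62]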
(loss_aggregator): LossAggregator(
(losses): ModuleDict(
(triplet): TripletLoss()
)
)
)
Finally, the return value is:
{
'training_feat': {
'triplet': {
'embeddings': embs[128, 256, 62],
'labels': the 128 sequence labels}},
'visual_summary': {
'image/sils': sils viewed as [3840, 1, 64, 44]},
'inference_feat': {
'embeddings': embs[128, 256, 62]}}