Partial Convolution (PConv) helps improve a model's performance on small-object detection. Much recent research has focused on reducing the number of floating-point operations (FLOPs). However, a reduction in FLOPs does not necessarily lead to a comparable reduction in latency; the main cause is low floating-point operations per second (FLOPS), i.e., poor computational throughput. To achieve faster networks, the authors revisit popular operators and show that such low FLOPS mainly stems from frequent memory access by the operators, especially depthwise convolution. They therefore propose a new partial convolution (PConv), which extracts spatial features more efficiently by reducing redundant computation and memory access at the same time.
Paper: Run, Don’t Walk: Chasing Higher FLOPS for Faster Neural Networks
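Before the code, a quick back-of-the-envelope comparison makes the trade-off concrete. This is a sketch using the FLOPs and memory-access formulas from the paper; the concrete tensor sizes below are arbitrary example values, not from the paper. With a partial ratio of 1/4, PConv convolves only c_p = c/4 channels, so its FLOPs are h·w·k²·c_p², only 1/16 of a regular convolution, and its memory access is roughly h·w·2·c_p, about 1/4 of a regular convolution:

h, w, c, k = 56, 56, 128, 3   # example feature-map size / channels / kernel (arbitrary)
c_p = c // 4                  # PConv touches only 1/4 of the channels

regular_flops = h * w * k**2 * c**2    # standard conv, c -> c channels
pconv_flops = h * w * k**2 * c_p**2    # 3x3 conv on c_p channels only

regular_mem = h * w * 2 * c + k**2 * c**2      # read input + write output + weights
pconv_mem = h * w * 2 * c_p + k**2 * c_p**2

print(f"FLOPs ratio:  {pconv_flops / regular_flops:.4f}")   # 0.0625, i.e. 1/16
print(f"memory ratio: {pconv_mem / regular_mem:.4f}")       # ~0.22, approaching 1/4 as h*w grows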
Implementation:
import torch
import torch.nn as nn


class Conv(nn.Module):
    # Minimal stand-in for the Conv2d + BatchNorm + SiLU block used in
    # YOLO-style codebases, where this snippet originates; any 1x1
    # pointwise convolution block would serve the same purpose here.
    def __init__(self, c1, c2, k=1):
        super().__init__()
        self.conv = nn.Conv2d(c1, c2, k, bias=False)
        self.bn = nn.BatchNorm2d(c2)
        self.act = nn.SiLU()

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))


class PConv(nn.Module):
    # Partial convolution: run a 3x3 conv over only 1/n_div of the input
    # channels, leave the rest untouched, then fuse all channels with a
    # 1x1 pointwise conv.
    def __init__(self, dim, ouc, n_div=4, forward='split_cat'):
        super().__init__()
        self.dim_conv3 = dim // n_div              # channels fed through the 3x3 conv
        self.dim_untouched = dim - self.dim_conv3  # channels passed through unchanged
        self.partial_conv3 = nn.Conv2d(self.dim_conv3, self.dim_conv3, 3, 1, 1, bias=False)
        self.conv = Conv(dim, ouc, k=1)
        if forward == 'slicing':
            self.forward = self.forward_slicing
        elif forward == 'split_cat':
            self.forward = self.forward_split_cat
        else:
            raise NotImplementedError

    def forward_slicing(self, x):
        # only for inference: write the conv result back into a channel slice in place
        x = x.clone()  # !!! Keep the original input intact for the residual connection later
        x[:, :self.dim_conv3, :, :] = self.partial_conv3(x[:, :self.dim_conv3, :, :])
        x = self.conv(x)
        return x

    def forward_split_cat(self, x):
        # for training/inference: split, convolve the first chunk, concatenate back
        x1, x2 = torch.split(x, [self.dim_conv3, self.dim_untouched], dim=1)
        x1 = self.partial_conv3(x1)
        x = torch.cat((x1, x2), 1)
        x = self.conv(x)
        return x
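A quick sanity check of the module (a hypothetical usage snippet with arbitrary shapes, not from the original post): the two forward variants perform the same computation and differ only in how they avoid copying the untouched channels, so in eval mode they should produce identical outputs.

x = torch.randn(1, 64, 32, 32)     # (batch, channels, height, width)
m = PConv(dim=64, ouc=128).eval()  # eval mode so BatchNorm uses running statistics

with torch.no_grad():
    y_split = m.forward_split_cat(x)  # training/inference path
    y_slice = m.forward_slicing(x)    # inference-only path

print(y_split.shape)                     # torch.Size([1, 128, 32, 32])
print(torch.allclose(y_split, y_slice))  # True: same result, different memory strategy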