• Adding MobileOne to YOLOv7


    Code: apple/ml-mobileone on GitHub, the official implementation of the research paper "An Improved One millisecond Mobile Backbone".

    Paper: https://arxiv.org/abs/2206.04040

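    MobileOne comes from Apple. Its authors report an inference time of just 1 ms on an iPhone 12, which is where the "One" in the name comes from. MobileOne's quick path to deployment shows the potential of re-parameterization on mobile devices: simple, efficient, and plug-and-play.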

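    The left part of Figure 3 makes up one complete MobileOne building block. It has two parts: the upper part is based on depthwise convolution and the lower part on pointwise convolution. Both terms come from MobileNet. A depthwise convolution is essentially a grouped convolution whose number of groups g equals the number of input channels, while a pointwise convolution is a 1×1 convolution.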
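To make the two terms concrete, here is a minimal sketch of the weight-tensor shape each convolution has under PyTorch's `(out_channels, in_channels // groups, kH, kW)` layout. The helper name `conv_weight_shape` is made up for this illustration; it is not part of MobileOne or YOLOv7.

```python
def conv_weight_shape(c_in, c_out, k, groups=1):
    """Weight shape of a 2D conv in PyTorch layout: (c_out, c_in // groups, k, k)."""
    if c_in % groups or c_out % groups:
        raise ValueError("c_in and c_out must both be divisible by groups")
    return (c_out, c_in // groups, k, k)


channels = 32

# Depthwise 3x3: one group per input channel, so each filter sees a single channel.
depthwise = conv_weight_shape(channels, channels, k=3, groups=channels)

# Pointwise: an ordinary 1x1 convolution that mixes information across channels.
pointwise = conv_weight_shape(channels, channels, k=1)

print(depthwise)  # (32, 1, 3, 3)
print(pointwise)  # (32, 32, 1, 1)
```

The depthwise filter never mixes channels and the pointwise filter never looks at neighbouring pixels, which is why the two are used as a pair.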

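    The depthwise module in Figure 3 has three branches. The leftmost branch is a 1×1 convolution; the middle branch is an over-parameterized 3×3 convolution, i.e. k parallel 3×3 convolutions; the right branch is a shortcut connection containing a BN layer. Both the 1×1 and the 3×3 convolutions here are depthwise convolutions (grouped convolutions with g equal to the number of input channels).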
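Because convolution is linear, these parallel branches can be collapsed into a single 3×3 kernel. The following is a rough single-channel sketch in plain Python, under two simplifying assumptions: BN is left out (so the shortcut is a pure identity), and the helper names `conv2d_same`, `add_maps` and `pad_1x1_to_3x3` are invented for this example.

```python
def conv2d_same(image, kernel):
    """Single-channel 2D cross-correlation, stride 1, zero padding, odd square kernel."""
    k = len(kernel)
    pad = k // 2
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            for di in range(k):
                for dj in range(k):
                    y, x = i + di - pad, j + dj - pad
                    if 0 <= y < h and 0 <= x < w:
                        out[i][j] += image[y][x] * kernel[di][dj]
    return out


def add_maps(*maps):
    """Element-wise sum of equally sized 2D lists."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*maps)]


def pad_1x1_to_3x3(weight):
    """Place a 1x1 weight at the centre of an otherwise zero 3x3 kernel."""
    return [[0.0, 0.0, 0.0], [0.0, weight, 0.0], [0.0, 0.0, 0.0]]


image = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
kernels_3x3 = [  # k = 2 over-parameterized 3x3 branches
    [[0.5, -1.0, 0.25], [2.0, 1.5, -0.5], [0.0, 1.0, -2.0]],
    [[1.0, 0.0, -1.0], [0.5, 0.5, 0.5], [-0.25, 0.0, 0.25]],
]
scale_weight = 3.0  # the 1x1 depthwise branch is one scalar per channel

# Training-time view: run every branch and add the outputs.
branch_outputs = [conv2d_same(image, kern) for kern in kernels_3x3]
branch_outputs.append(conv2d_same(image, [[scale_weight]]))
branch_outputs.append(image)  # shortcut, treated as identity because BN is left out here
multi_branch = add_maps(*branch_outputs)

# Inference-time view: add the kernels first, then run one 3x3 convolution.
merged_kernel = add_maps(*kernels_3x3,
                         pad_1x1_to_3x3(scale_weight),
                         pad_1x1_to_3x3(1.0))
single_branch = conv2d_same(image, merged_kernel)

print(multi_branch == single_branch)
```

With BN included, each branch is first folded into a kernel and a bias, and the same summation then applies to both.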

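    The pointwise module in Figure 3 has two branches. The left branch is an over-parameterized 1×1 convolution made of k parallel 1×1 convolutions; the right branch is a skip connection containing a BN layer. During training, MobileOne is a stack of these building blocks. Once training is finished, re-parameterization converts the building block on the left of Figure 3 into the structure on the right of Figure 3.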
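The step that makes this conversion possible is folding each BN layer into the convolution before it: with std = sqrt(running_var + eps), the folded weight is weight * gamma / std and the folded bias is beta - running_mean * gamma / std. Below is a minimal sketch for a single channel, where a 1×1 convolution is just a multiplication. All numeric values and the helper names `fold_bn` and `bn` are made up for illustration.

```python
import math


def fold_bn(weight, gamma, beta, running_mean, running_var, eps=1e-5):
    """Fold BN statistics into a per-channel conv weight; returns (weight, bias)."""
    std = math.sqrt(running_var + eps)
    return weight * gamma / std, beta - running_mean * gamma / std


def bn(y, gamma, beta, running_mean, running_var, eps=1e-5):
    """Inference-time batch norm applied to a scalar activation."""
    return gamma * (y - running_mean) / math.sqrt(running_var + eps) + beta


# Hypothetical per-channel values for k = 2 conv-BN branches and the BN-only skip.
conv_branches = [
    {"weight": 0.8, "gamma": 1.2, "beta": 0.1, "running_mean": 0.3, "running_var": 0.5},
    {"weight": -0.4, "gamma": 0.9, "beta": -0.2, "running_mean": -0.1, "running_var": 2.0},
]
skip_bn = {"gamma": 1.1, "beta": 0.05, "running_mean": 0.2, "running_var": 1.5}

x = 1.7

# Training-time output: sum of all branches.
train_out = sum(
    bn(br["weight"] * x, br["gamma"], br["beta"], br["running_mean"], br["running_var"])
    for br in conv_branches
)
train_out += bn(x, **skip_bn)

# Fused weight and bias: fold each branch, then add them up.
# The skip branch acts on an identity kernel, i.e. a weight of 1.
folded = [fold_bn(**br) for br in conv_branches] + [fold_bn(1.0, **skip_bn)]
fused_weight = sum(w for w, _ in folded)
fused_bias = sum(b for _, b in folded)
fused_out = fused_weight * x + fused_bias

print(math.isclose(train_out, fused_out, rel_tol=1e-9))
```

The fold is only valid with fixed running statistics, which is why it happens after training rather than during it.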

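    I use the yolov7tiny network structure as the example here; modifying v7 works much the same way. Rather than replacing the whole backbone with MobileOne, I keep every ELAN block of v7tiny and replace the 3*3 convolutions inside each block with the re-parameterized depthwise separable convolution from Figure 3. This preserves the overall network structure while bringing the re-parameterized MobileOne block into it.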

    [-1, 1, Conv, [32, 1, 1, None, 1]],
    [-2, 1, Conv, [32, 1, 1, None, 1]],
    [-1, 1, Conv, [32, 3, 1, None, 1]], # replaced
    [-1, 1, Conv, [32, 3, 1, None, 1]], # replaced
    [[-1, -2, -3, -4], 1, Concat, [1]],
    [-1, 1, Conv, [64, 1, 1, None, 1]],

    That is, the two 3*3 Conv lines marked in the snippet above are the part being replaced.
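As a back-of-the-envelope check on what this swap saves, the sketch below counts convolution weights for the two marked layers before and after the replacement. It counts weights of the fused inference-time structure only (no BN parameters, no biases, none of the extra training-time branches), and `conv_weights` is a helper name invented here.

```python
def conv_weights(c_in, c_out, k, groups=1):
    """Number of weights in a bias-free 2D convolution."""
    return c_out * (c_in // groups) * k * k


c = 32  # channel width of the two marked convolutions

# Original: two standard 3x3 convolutions, 32 -> 32 each.
standard = 2 * conv_weights(c, c, k=3)

# Replacement after fusion: each becomes a depthwise 3x3 followed by a pointwise 1x1.
separable = 2 * (conv_weights(c, c, k=3, groups=c) + conv_weights(c, c, k=1))

print(standard)   # 18432
print(separable)  # 2624
```

This covers only the two replaced layers in one block; the 1*1 convolutions around them are unchanged.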

    I have simplified the structure above here; see the CSDN blog post "yolov7简化yaml配置文件" (simplifying the YOLOv7 YAML configuration file).

    First, create yolov7-tiny-ELANMO.yaml:

    # parameters
    nc: 80  # number of classes
    depth_multiple: 1.0  # model depth multiple
    width_multiple: 1.0  # layer channel multiple
    activation: nn.ReLU()

    # anchors
    anchors:
      - [10,13, 16,30, 33,23]  # P3/8
      - [30,61, 62,45, 59,119]  # P4/16
      - [116,90, 156,198, 373,326]  # P5/32

    # yolov7-tiny backbone
    backbone:
      # [from, number, module, args]
      # ELANMO args: c2, k=1, s=1, p=None, g=1, num_blocks_per_stage=1, num_conv_branches=4
      [[-1, 1, Conv, [32, 3, 2, None, 1]],  # 0-P1/2
       [-1, 1, Conv, [64, 3, 2, None, 1]],  # 1-P2/4
       [-1, 1, ELANMO, [64, 1, 1, None, 1, 1, 4]],  # 2
       [-1, 1, MP, []],  # 3-P3/8
       [-1, 1, ELANMO, [128, 1, 1, None, 1, 1, 4]],  # 4
       [-1, 1, MP, []],  # 5-P4/16
       [-1, 1, ELANMO, [256, 1, 1, None, 1, 1, 4]],  # 6
       [-1, 1, MP, []],  # 7-P5/32
       [-1, 1, ELANMO, [512, 1, 1, None, 1, 1, 4]],  # 8
      ]

    # yolov7-tiny head
    head:
      [[-1, 1, SPPCSPCSIM, [256]],  # 9
       [-1, 1, Conv, [128, 1, 1, None, 1]],
       [-1, 1, nn.Upsample, [None, 2, 'nearest']],
       [6, 1, Conv, [128, 1, 1, None, 1]],  # route backbone P4
       [[-1, -2], 1, Concat, [1]],  # 13
       [-1, 1, ELANMO, [128, 1, 1, None, 1, 1, 4]],  # 14

       [-1, 1, Conv, [64, 1, 1, None, 1]],
       [-1, 1, nn.Upsample, [None, 2, 'nearest']],
       [4, 1, Conv, [64, 1, 1, None, 1]],  # route backbone P3
       [[-1, -2], 1, Concat, [1]],
       [-1, 1, ELANMO, [64, 1, 1, None, 1, 1, 4]],  # 19

       [-1, 1, Conv, [128, 3, 2, None, 1]],
       [[-1, 14], 1, Concat, [1]],
       [-1, 1, ELANMO, [128, 1, 1, None, 1, 1, 4]],  # 22

       [-1, 1, Conv, [256, 3, 2, None, 1]],
       [[-1, 9], 1, Concat, [1]],
       [-1, 1, ELANMO, [256, 1, 1, None, 1, 1, 4]],  # 25

       [19, 1, Conv, [128, 3, 1, None, 1]],
       [22, 1, Conv, [256, 3, 1, None, 1]],
       [25, 1, Conv, [512, 3, 1, None, 1]],

       [[26,27,28], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
      ]
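A quick way to sanity-check the `from` indices in a config like this is to trace channel counts layer by layer. The sketch below does that with simplified rules (Conv, ELANMO and SPPCSPCSIM output `args[0]`; Concat sums its inputs; MP and nn.Upsample keep the channel count). It assumes width_multiple is 1.0 and a 3-channel input, and `trace_channels` is a hypothetical helper rather than the repository's parse_model.

```python
# (from, module, args) per layer, copied from the config; repeat counts dropped.
LAYERS = [
    (-1, "Conv", [32, 3, 2, None, 1]),              # 0
    (-1, "Conv", [64, 3, 2, None, 1]),              # 1
    (-1, "ELANMO", [64, 1, 1, None, 1, 1, 4]),      # 2
    (-1, "MP", []),                                 # 3
    (-1, "ELANMO", [128, 1, 1, None, 1, 1, 4]),     # 4
    (-1, "MP", []),                                 # 5
    (-1, "ELANMO", [256, 1, 1, None, 1, 1, 4]),     # 6
    (-1, "MP", []),                                 # 7
    (-1, "ELANMO", [512, 1, 1, None, 1, 1, 4]),     # 8
    (-1, "SPPCSPCSIM", [256]),                      # 9
    (-1, "Conv", [128, 1, 1, None, 1]),             # 10
    (-1, "nn.Upsample", [None, 2, "nearest"]),      # 11
    (6, "Conv", [128, 1, 1, None, 1]),              # 12
    ([-1, -2], "Concat", [1]),                      # 13
    (-1, "ELANMO", [128, 1, 1, None, 1, 1, 4]),     # 14
    (-1, "Conv", [64, 1, 1, None, 1]),              # 15
    (-1, "nn.Upsample", [None, 2, "nearest"]),      # 16
    (4, "Conv", [64, 1, 1, None, 1]),               # 17
    ([-1, -2], "Concat", [1]),                      # 18
    (-1, "ELANMO", [64, 1, 1, None, 1, 1, 4]),      # 19
    (-1, "Conv", [128, 3, 2, None, 1]),             # 20
    ([-1, 14], "Concat", [1]),                      # 21
    (-1, "ELANMO", [128, 1, 1, None, 1, 1, 4]),     # 22
    (-1, "Conv", [256, 3, 2, None, 1]),             # 23
    ([-1, 9], "Concat", [1]),                       # 24
    (-1, "ELANMO", [256, 1, 1, None, 1, 1, 4]),     # 25
    (19, "Conv", [128, 3, 1, None, 1]),             # 26
    (22, "Conv", [256, 3, 1, None, 1]),             # 27
    (25, "Conv", [512, 3, 1, None, 1]),             # 28
    ([26, 27, 28], "Detect", ["nc", "anchors"]),    # 29
]


def trace_channels(layers, c_in=3):
    """Return the output channel count of every layer (a list for Detect)."""
    out = []
    for i, (src, module, args) in enumerate(layers):
        sources = src if isinstance(src, list) else [src]
        chans = []
        for s in sources:
            idx = i + s if s < 0 else s  # negative indices are relative
            chans.append(c_in if idx < 0 else out[idx])
        if module in ("Conv", "ELANMO", "SPPCSPCSIM"):
            out.append(args[0])
        elif module == "Concat":
            out.append(sum(chans))
        elif module == "Detect":
            out.append(chans)  # Detect consumes several feature maps
        else:  # MP and nn.Upsample keep the channel count
            out.append(chans[0])
    return out


channels = trace_channels(LAYERS)
print(channels[13], channels[21], channels[24])  # 256 256 512
print(channels[29])  # [128, 256, 512]
```

The three Detect inputs come out as 128, 256 and 512 channels, matching the P3, P4 and P5 branches labelled in the config.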

    Add the following to common.py:

    import torch.nn.functional as F


    class SEBlock(nn.Module):
        """ Squeeze and Excite module.

            Pytorch implementation of `Squeeze-and-Excitation Networks` -
            https://arxiv.org/pdf/1709.01507.pdf
        """

        def __init__(self,
                     in_channels: int,
                     rd_ratio: float = 0.0625) -> None:
            """ Construct a Squeeze and Excite Module.

            :param in_channels: Number of input channels.
            :param rd_ratio: Input channel reduction ratio.
            """
            super(SEBlock, self).__init__()
            self.reduce = nn.Conv2d(in_channels=in_channels,
                                    out_channels=int(in_channels * rd_ratio),
                                    kernel_size=1,
                                    stride=1,
                                    bias=True)
            self.expand = nn.Conv2d(in_channels=int(in_channels * rd_ratio),
                                    out_channels=in_channels,
                                    kernel_size=1,
                                    stride=1,
                                    bias=True)

        def forward(self, inputs: torch.Tensor) -> torch.Tensor:
            """ Apply forward pass. """
            b, c, h, w = inputs.size()
            # Squeeze: global average pool over the full spatial extent.
            x = F.avg_pool2d(inputs, kernel_size=[h, w])
            x = self.reduce(x)
            x = F.relu(x)
            x = self.expand(x)
            x = torch.sigmoid(x)
            x = x.view(-1, c, 1, 1)
            # Excite: rescale every channel of the input.
            return inputs * x


    class MobileOneBlock(nn.Module):
        """ MobileOne building block.

            This block has a multi-branched architecture at train-time
            and plain-CNN style architecture at inference time
            For more details, please refer to our paper:
            `An Improved One millisecond Mobile Backbone` -
            https://arxiv.org/pdf/2206.04040.pdf
        """

        def __init__(self,
                     in_channels: int,
                     out_channels: int,
                     kernel_size: int,
                     stride: int = 1,
                     padding: int = 0,
                     dilation: int = 1,
                     groups: int = 1,
                     inference_mode: bool = False,
                     use_se: bool = False,
                     num_conv_branches: int = 1) -> None:
            """ Construct a MobileOneBlock module.

            :param in_channels: Number of channels in the input.
            :param out_channels: Number of channels produced by the block.
            :param kernel_size: Size of the convolution kernel.
            :param stride: Stride size.
            :param padding: Zero-padding size.
            :param dilation: Kernel dilation factor.
            :param groups: Group number.
            :param inference_mode: If True, instantiates model in inference mode.
            :param use_se: Whether to use SE-ReLU activations.
            :param num_conv_branches: Number of linear conv branches.
            """
            super(MobileOneBlock, self).__init__()
            self.inference_mode = inference_mode
            self.groups = groups
            self.stride = stride
            self.kernel_size = kernel_size
            self.in_channels = in_channels
            self.out_channels = out_channels
            self.num_conv_branches = num_conv_branches

            # Check if SE-ReLU is requested
            if use_se:
                self.se = SEBlock(out_channels)
            else:
                self.se = nn.Identity()
            self.activation = nn.ReLU()

            if inference_mode:
                self.reparam_conv = nn.Conv2d(in_channels=in_channels,
                                              out_channels=out_channels,
                                              kernel_size=kernel_size,
                                              stride=stride,
                                              padding=padding,
                                              dilation=dilation,
                                              groups=groups,
                                              bias=True)
            else:
                # Re-parameterizable skip connection
                self.rbr_skip = nn.BatchNorm2d(num_features=in_channels) \
                    if out_channels == in_channels and stride == 1 else None

                # Re-parameterizable conv branches
                rbr_conv = list()
                for _ in range(self.num_conv_branches):
                    rbr_conv.append(self._conv_bn(kernel_size=kernel_size,
                                                  padding=padding))
                self.rbr_conv = nn.ModuleList(rbr_conv)

                # Re-parameterizable scale branch
                self.rbr_scale = None
                if kernel_size > 1:
                    self.rbr_scale = self._conv_bn(kernel_size=1,
                                                   padding=0)

        def forward(self, x: torch.Tensor):
            """ Apply forward pass. """
            # Inference mode forward pass.
            if self.inference_mode:
                return self.activation(self.se(self.reparam_conv(x)))

            # Multi-branched train-time forward pass.
            # Skip branch output
            identity_out = 0
            if self.rbr_skip is not None:
                identity_out = self.rbr_skip(x)

            # Scale branch output
            scale_out = 0
            if self.rbr_scale is not None:
                scale_out = self.rbr_scale(x)

            # Other branches
            out = scale_out + identity_out
            for ix in range(self.num_conv_branches):
                out += self.rbr_conv[ix](x)

            return self.activation(self.se(out))

        def reparameterize(self):
            """ Following works like `RepVGG: Making VGG-style ConvNets Great Again` -
            https://arxiv.org/pdf/2101.03697.pdf. We re-parameterize multi-branched
            architecture used at training time to obtain a plain CNN-like structure
            for inference.
            """
            if self.inference_mode:
                return
            kernel, bias = self._get_kernel_bias()
            self.reparam_conv = nn.Conv2d(in_channels=self.rbr_conv[0].conv.in_channels,
                                          out_channels=self.rbr_conv[0].conv.out_channels,
                                          kernel_size=self.rbr_conv[0].conv.kernel_size,
                                          stride=self.rbr_conv[0].conv.stride,
                                          padding=self.rbr_conv[0].conv.padding,
                                          dilation=self.rbr_conv[0].conv.dilation,
                                          groups=self.rbr_conv[0].conv.groups,
                                          bias=True)
            self.reparam_conv.weight.data = kernel
            self.reparam_conv.bias.data = bias

            # Delete un-used branches
            for para in self.parameters():
                para.detach_()
            self.__delattr__('rbr_conv')
            self.__delattr__('rbr_scale')
            if hasattr(self, 'rbr_skip'):
                self.__delattr__('rbr_skip')

            self.inference_mode = True

        def _get_kernel_bias(self):
            """ Method to obtain re-parameterized kernel and bias.
            Reference: https://github.com/DingXiaoH/RepVGG/blob/main/repvgg.py#L83

            :return: Tuple of (kernel, bias) after fusing branches.
            """
            # get weights and bias of scale branch
            kernel_scale = 0
            bias_scale = 0
            if self.rbr_scale is not None:
                kernel_scale, bias_scale = self._fuse_bn_tensor(self.rbr_scale)
                # Pad scale branch kernel to match conv branch kernel size.
                pad = self.kernel_size // 2
                kernel_scale = torch.nn.functional.pad(kernel_scale,
                                                       [pad, pad, pad, pad])

            # get weights and bias of skip branch
            kernel_identity = 0
            bias_identity = 0
            if self.rbr_skip is not None:
                kernel_identity, bias_identity = self._fuse_bn_tensor(self.rbr_skip)

            # get weights and bias of conv branches
            kernel_conv = 0
            bias_conv = 0
            for ix in range(self.num_conv_branches):
                _kernel, _bias = self._fuse_bn_tensor(self.rbr_conv[ix])
                kernel_conv += _kernel
                bias_conv += _bias

            kernel_final = kernel_conv + kernel_scale + kernel_identity
            bias_final = bias_conv + bias_scale + bias_identity
            return kernel_final, bias_final

        def _fuse_bn_tensor(self, branch):
            """ Method to fuse batchnorm layer with preceding conv layer.
            Reference: https://github.com/DingXiaoH/RepVGG/blob/main/repvgg.py#L95

            :param branch: Conv-BN nn.Sequential branch, or a bare nn.BatchNorm2d skip branch.
            :return: Tuple of (kernel, bias) after fusing batchnorm.
            """
            if isinstance(branch, nn.Sequential):
                kernel = branch.conv.weight
                running_mean = branch.bn.running_mean
                running_var = branch.bn.running_var
                gamma = branch.bn.weight
                beta = branch.bn.bias
                eps = branch.bn.eps
            else:
                assert isinstance(branch, nn.BatchNorm2d)
                if not hasattr(self, 'id_tensor'):
                    # Build an identity kernel so the BN-only skip can be fused like a conv.
                    input_dim = self.in_channels // self.groups
                    kernel_value = torch.zeros((self.in_channels,
                                                input_dim,
                                                self.kernel_size,
                                                self.kernel_size),
                                               dtype=branch.weight.dtype,
                                               device=branch.weight.device)
                    for i in range(self.in_channels):
                        kernel_value[i, i % input_dim,
                                     self.kernel_size // 2,
                                     self.kernel_size // 2] = 1
                    self.id_tensor = kernel_value
                kernel = self.id_tensor
                running_mean = branch.running_mean
                running_var = branch.running_var
                gamma = branch.weight
                beta = branch.bias
                eps = branch.eps
            std = (running_var + eps).sqrt()
            t = (gamma / std).reshape(-1, 1, 1, 1)
            return kernel * t, beta - running_mean * gamma / std

        def _conv_bn(self,
                     kernel_size: int,
                     padding: int) -> nn.Sequential:
            """ Helper method to construct conv-batchnorm layers.

            :param kernel_size: Size of the convolution kernel.
            :param padding: Zero-padding size.
            :return: Conv-BN module.
            """
            mod_list = nn.Sequential()
            mod_list.add_module('conv', nn.Conv2d(in_channels=self.in_channels,
                                                  out_channels=self.out_channels,
                                                  kernel_size=kernel_size,
                                                  stride=self.stride,
                                                  padding=padding,
                                                  groups=self.groups,
                                                  bias=False))
            mod_list.add_module('bn', nn.BatchNorm2d(num_features=self.out_channels))
            return mod_list


    class ELANMO(nn.Module):
        # Yolov7 ELANMO with args(ch_in, ch_out, kernel, stride, padding, groups, num_blocks, num_conv, activation)
        def __init__(self, c1, c2, k=1, s=1, p=None, g=1,
                     num_blocks_per_stage=1,
                     num_conv_branches=4,
                     act=True,
                     down_sample=False,
                     use_se=False,
                     inference_mode=False):
            """ Construct a ELAN module with MobileOneBlock.

            :param c1: Number of channels in the input.
            :param c2: Number of channels produced by the block.
            :param k: Size of the convolution kernel.
            :param s: Stride size.
            :param p: Zero-padding size.
            :param g: Group number.
            :param num_blocks_per_stage: Number of depthwise + pointwise MobileOne block pairs in each stage.
            :param num_conv_branches: Number of linear conv branches.
            :param act: If True, use activations.
            :param down_sample: If True, the first conv block of a stage uses stride 2.
            :param use_se: Whether to use SE-ReLU activations.
            :param inference_mode: If True, instantiates model in inference mode.
            """
            super().__init__()
            c_ = int(c2 // 2)
            c_out = c_ * 4  # four c_-wide feature maps are concatenated in forward()
            self.inference_mode = inference_mode
            self.in_planes = c_
            self.down_sample = down_sample
            self.use_se = use_se
            self.num_blocks_per_stage = num_blocks_per_stage
            self.num_conv_branches = num_conv_branches
            # self.cur_layer_idx = 1

            self.cv1 = Conv(c1, c_, k=k, s=s, p=p, g=g, act=act)
            self.cv2 = Conv(c1, c_, k=k, s=s, p=p, g=g, act=act)
            # cv3 and cv4 take the place of the two 3*3 convs of the original ELAN block.
            self.cv3 = self._make_stage(c_, self.num_blocks_per_stage, num_se_blocks=0)
            self.cv4 = self._make_stage(c_, self.num_blocks_per_stage, num_se_blocks=0)
            self.cv5 = Conv(c_out, c2, k=k, s=s, p=p, g=g, act=act)

        def _make_stage(self,
                        planes: int,
                        num_blocks: int,
                        num_se_blocks: int) -> nn.Sequential:
            """ Build a stage of MobileOne model.

            :param planes: Number of output channels.
            :param num_blocks: Number of blocks in this stage.
            :param num_se_blocks: Number of SE blocks in this stage.
            :return: A stage of MobileOne model.
            """
            # Get strides for all layers
            strides = [2 if self.down_sample else 1] + [1] * (num_blocks - 1)
            blocks = []
            for ix, stride in enumerate(strides):
                use_se = False
                if num_se_blocks > num_blocks:
                    raise ValueError("Number of SE blocks cannot "
                                     "exceed number of layers.")
                if ix >= (num_blocks - num_se_blocks):
                    use_se = True

                # Depthwise conv
                blocks.append(MobileOneBlock(in_channels=self.in_planes,
                                             out_channels=self.in_planes,
                                             kernel_size=3,
                                             stride=stride,
                                             padding=1,
                                             groups=self.in_planes,
                                             inference_mode=self.inference_mode,
                                             use_se=use_se,
                                             num_conv_branches=self.num_conv_branches))
                # Pointwise conv
                blocks.append(MobileOneBlock(in_channels=self.in_planes,
                                             out_channels=planes,
                                             kernel_size=1,
                                             stride=1,
                                             padding=0,
                                             groups=1,
                                             inference_mode=self.inference_mode,
                                             use_se=use_se,
                                             num_conv_branches=self.num_conv_branches))
                self.in_planes = planes
                # self.cur_layer_idx += 1
            return nn.Sequential(*blocks)

        def forward(self, x):
            x1 = self.cv1(x)
            x2 = self.cv2(x)
            x3 = self.cv3(x2)
            x4 = self.cv4(x3)
            x5 = torch.cat((x1, x2, x3, x4), 1)
            return self.cv5(x5)

    Add ELANMO to parse_model in yolo.py:

    if m in (Conv, GhostConv, Bottleneck, GhostBottleneck, SPP, SPPF, DWConv, MixConv2d, Focus, CrossConv,
             BottleneckCSP, C3, C3TR, C3SPP, C3Ghost, nn.ConvTranspose2d, DWConvTranspose2d, C3x, SPPCSPC, RepConv,
             RFEM, ELAN, SPPCSPCSIM, ELANMO):
        c1, c2 = ch[f], args[0]
        if c2 != no:  # if not output
            c2 = make_divisible(c2 * gw, 8)

        args = [c1, c2, *args[1:]]
        if m in [BottleneckCSP, C3, C3TR, C3Ghost, C3x]:
            args.insert(2, n)  # number of repeats
            n = 1
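To see what this branch does for an ELANMO entry, the sketch below reproduces just the argument assembly in isolation. `build_module_args` and the local `make_divisible` are simplified stand-ins written for this example, and `no=255` assumes 3 anchors per scale with 80 classes.

```python
import math


def make_divisible(x, divisor):
    """Round x up to the nearest multiple of divisor (simplified stand-in)."""
    return math.ceil(x / divisor) * divisor


def build_module_args(c1, yaml_args, width_multiple=1.0, no=255):
    """Prepend the input channels and scale the output channels, as the branch above does."""
    c2 = yaml_args[0]
    if c2 != no:  # leave the detection output width untouched
        c2 = make_divisible(c2 * width_multiple, 8)
    return [c1, c2, *yaml_args[1:]]


# Layer 2 of the config: ELANMO fed by layer 1, which outputs 64 channels.
args = build_module_args(c1=64, yaml_args=[64, 1, 1, None, 1, 1, 4])

# Positional order of the ELANMO constructor parameters.
ELANMO_PARAMS = ["c1", "c2", "k", "s", "p", "g",
                 "num_blocks_per_stage", "num_conv_branches"]
named = dict(zip(ELANMO_PARAMS, args))

print(args)  # [64, 64, 1, 1, None, 1, 1, 4]
print(named["num_blocks_per_stage"], named["num_conv_branches"])  # 1 4
```

The last two YAML values land on num_blocks_per_stage and num_conv_branches, so act, down_sample, use_se and inference_mode keep their defaults.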

    Also add the reparameterize() call to BaseModel.fuse() in yolo.py:

    def fuse(self):  # fuse model Conv2d() + BatchNorm2d() layers
        LOGGER.info('Fusing layers... ')
        for m in self.model.modules():
            if isinstance(m, (Conv, DWConv)) and hasattr(m, 'bn'):
                m.conv = fuse_conv_and_bn(m.conv, m.bn)  # update conv
                delattr(m, 'bn')  # remove batchnorm
                m.forward = m.forward_fuse  # update forward
            if isinstance(m, RepConv):
                # print(f" fuse_repvgg_block")
                m.fuse_repvgg_block()
                # m.switch_to_deploy()
            if hasattr(m, 'reparameterize'):
                # MobileOneBlock: collapse the training-time branches into one conv
                m.reparameterize()
        self.info()
        return self
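ELANMO itself has no reparameterize() method, yet the hasattr check still reaches every MobileOneBlock because modules() walks the whole module tree recursively. The toy sketch below mimics that traversal in plain Python; `Node`, `FakeMobileOneBlock` and `FakeELANMO` are invented stand-ins, not the real classes.

```python
class Node:
    """Tiny stand-in for nn.Module: yields itself and all descendants."""

    def __init__(self, *children):
        self.children = list(children)

    def modules(self):
        yield self
        for child in self.children:
            yield from child.modules()


class FakeMobileOneBlock(Node):
    """Leaf module that knows how to switch itself to inference mode."""

    def __init__(self):
        super().__init__()
        self.inference_mode = False

    def reparameterize(self):
        self.inference_mode = True


class FakeELANMO(Node):
    """Container with no reparameterize() of its own."""


model = Node(FakeELANMO(FakeMobileOneBlock(), FakeMobileOneBlock()), Node())

# Same duck-typed dispatch as in fuse(): call the method wherever it exists.
for m in model.modules():
    if hasattr(m, "reparameterize"):
        m.reparameterize()

fused = [m.inference_mode for m in model.modules()
         if isinstance(m, FakeMobileOneBlock)]
print(fused)  # [True, True]
```

Checking for the method rather than the class also means any other block that exposes reparameterize() is picked up without further changes to fuse().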

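    Switch to the new configuration file and run yolo.py.

    Parameter count and computation of the original yolov7tiny:

    As you can see, the parameter count and computation are much lower than for tiny.

    After exporting to ONNX you can inspect the network structure. The figure below shows the original v7tiny structure:

    Structure with the MobileOne block added, before the re-parameterized branches are fused:

    This structure looks rather complicated, but it is fine once fused.

    Structure after fusing the re-parameterized branches:

    After fusion, it looks as though the two 3*3 convolutions in ELAN have simply been replaced by depthwise separable convolutions.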

  • Original article: https://blog.csdn.net/athrunsunny/article/details/132784492