• YOLO improvement: replacing the backbone with VanillaNet


    Paper: https://arxiv.org/pdf/2305.12972.pdf

    Code: GitHub - huawei-noah/VanillaNet

    Introduction to VanillaNet

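    At the heart of foundation models is the philosophy of "more is different", exemplified by the remarkable success of computer vision and natural language processing. However, the challenges of optimization and the inherent complexity of Transformer models call for a paradigm shift towards simplicity. In this study, we introduce VanillaNet, a neural network architecture that embraces elegance in design. By avoiding high depth, shortcuts, and intricate operations such as self-attention, VanillaNet is refreshingly concise yet remarkably powerful. Each layer is carefully crafted to be compact and straightforward, and the nonlinear activation functions are pruned after training to restore the original architecture. VanillaNet overcomes the challenges of inherent complexity, making it ideal for resource-constrained environments. Its easy-to-understand and highly simplified architecture opens new possibilities for efficient deployment. Extensive experiments show that VanillaNet delivers performance on par with well-known deep neural networks and vision Transformers, showcasing the power of minimalism in deep learning. This work has significant potential to redefine and challenge the status quo of foundation models, setting a new path for elegant and effective model design.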

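    Over the past decades, researchers have reached some consensus on the basic design of neural networks. Most state-of-the-art image classification architectures consist of three parts:

    1. A stem block, which transforms the input image from 3 channels into multiple channels and downsamples it.
    2. A main body, which learns useful information. It usually has four stages, each built by stacking identical blocks. After each stage, the number of feature channels expands while the height and width shrink. Different networks use and stack different kinds of blocks to build deep models.
    3. A fully connected layer for the classification output.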

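    Despite their success, existing deep networks rely on a large number of complex layers to extract high-level features for downstream tasks. For example, the well-known ResNet needs 34 or 50 layers with shortcuts to exceed 70% top-1 accuracy on ImageNet. The base version of ViT consists of 62 layers, because the K, Q, and V in self-attention need multiple layers to compute. As AI chips grow larger, the bottleneck of neural network inference speed is no longer FLOPs or parameter count, since modern GPUs parallelize computation easily. Instead, complex designs and large depth are what hold these networks back. We therefore propose the vanilla network, VanillaNet, whose architecture is shown in Figure 1. We follow the popular design of stem, main body, and fully connected layer. Unlike existing deep networks, we use only one layer in each stage, building an extremely simple network with as few layers as possible. The network uses no shortcuts (shortcuts add memory-access time) and no complex modules such as self-attention.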

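    In deep learning, it is common to improve a model's performance by giving it more capacity during training. To this end, we use a deep training technique to raise the capacity of the proposed VanillaNet during training.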

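    Optimization strategy 1: deep training, shallow inference

    To increase the nonlinearity of the VanillaNet architecture, we first propose a deep training strategy: during training, one convolutional layer is split into two, and the following nonlinear operation is inserted between them: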

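    Here A is a conventional nonlinear activation function, the simplest choice being ReLU. λ gradually becomes 1 as the model is optimized, at which point the two convolutional layers can be merged into one, leaving the structure of VanillaNet unchanged.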
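    As a minimal sketch of why this works: the formula itself is not reproduced in this post, so the form $A'(x) = (1-\lambda)A(x) + \lambda x$ used below is taken from the VanillaNet paper and should be read as an assumption here. With $A$ = ReLU it is the same as a leaky ReLU with negative slope $\lambda$, which matches the `leaky_relu(x, self.act_learn)` call in the `common.py` code of this post. At $\lambda = 1$ the activation is the identity, so two stacked linear maps collapse into one. All helper names and weights are illustrative, and plain Python lists stand in for tensors.

```python
def relu(x):
    return max(0.0, x)


def deep_train_act(x, lam):
    # Assumed form from the paper: A'(x) = (1 - lam) * ReLU(x) + lam * x
    return (1.0 - lam) * relu(x) + lam * x


def leaky_relu(x, negative_slope):
    # Scalar leaky ReLU: identity for x >= 0, scaled by the slope otherwise
    return x if x >= 0 else negative_slope * x


def matvec(w, v):
    return [sum(wij * vj for wij, vj in zip(row, v)) for row in w]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def two_layer(w2, w1, x, lam):
    # Two linear layers (think 1x1 convs at one pixel) with A' in between
    hidden = [deep_train_act(h, lam) for h in matvec(w1, x)]
    return matvec(w2, hidden)


W1 = [[1.0, -2.0], [0.5, 3.0]]   # illustrative weights
W2 = [[2.0, 1.0], [-1.0, 4.0]]
x = [1.0, 2.0]

out_two_layers = two_layer(W2, W1, x, lam=1.0)   # lam = 1: activation is the identity
out_merged = matvec(matmul(W2, W1), x)           # one merged layer, W2 @ W1
print(out_two_layers, out_merged)
```

    Both results agree when `lam=1.0`; with `lam=0.0` the ReLU is fully active and the two-layer output differs from the merged one, which is why the merge is only valid once λ has reached 1.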

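    Optimization strategy 2: a different activation function

    Since the goal is to increase the nonlinearity of VanillaNet, a more direct question is whether there is an activation function that is more strongly nonlinear and also parallelizes well and runs fast. To get both, we propose a series-inspired activation function that stacks several ReLUs, each with its own weight and bias: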

    The function is then refined further to improve its ability to perceive information.
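    A minimal scalar sketch of this stacking, assuming the form $A_s(x) = \sum_{i=1}^{n} a_i A(x + b_i)$ from the VanillaNet paper (the formula is not reproduced in this post, so treat it as an assumption). The weights and biases are made-up values. In the `activation` class of this post, the refinement shows up as a depthwise convolution applied after ReLU, so each output also mixes neighbouring positions; this sketch leaves that part out.

```python
def relu(x):
    return max(0.0, x)


def series_act(x, weights, biases):
    # Weighted sum of shifted ReLUs: sum_i a_i * ReLU(x + b_i)
    return sum(a * relu(x + b) for a, b in zip(weights, biases))


weights = [1.0, -0.5, 0.25]   # illustrative a_i
biases = [0.0, -1.0, -2.0]    # illustrative b_i, giving kinks at x = 0, 1, 2

for x in (-1.0, 0.5, 1.5, 3.0):
    print(x, series_act(x, weights, biases))
```

    A single ReLU has one kink; this stack has three, with slopes 0, 1, 0.5 and 0.75 on the successive segments, which is the sense in which it is "more nonlinear" while still being cheap to evaluate in parallel.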

    Changes in YOLO

    The example below uses the yolov7-tiny backbone. To use it with v5 or v8, copy the backbone section into the v5 or v8 config.

    First, the config file yolov7-tiny-vanilla.yaml:

    # parameters
    nc: 80  # number of classes
    depth_multiple: 1.0  # model depth multiple
    width_multiple: 1.0  # layer channel multiple
    activation: nn.ReLU()

    # anchors
    anchors:
      - [10,13, 16,30, 33,23]  # P3/8
      - [30,61, 62,45, 59,119]  # P4/16
      - [116,90, 156,198, 373,326]  # P5/32

    # yolov7-tiny backbone
    backbone:
      # [from, number, module, args] c2, k=1, s=1, p=None, g=1, act=True
      [[-1, 1, VanillaStem, [64, 4, 4, None, 1]],  # 0-P2/4
       [-1, 1, VanillaBlock, [256, 1, 2, None, 1]],  # 1-P3/8
       [-1, 1, VanillaBlock, [512, 1, 2, None, 1]],  # 2-P4/16
       [-1, 1, VanillaBlock, [1024, 1, 2, None, 1]],  # 3-P5/32
      ]

    # yolov7-tiny head
    head:
      [[1, 1, Conv, [128, 1, 1, None, 1]],  # 4
       [2, 1, Conv, [256, 1, 1, None, 1]],  # 5
       [3, 1, Conv, [512, 1, 1, None, 1]],  # 6

       [-1, 1, SPPCSPCSIM, [256]],  # 7

       [-1, 1, Conv, [128, 1, 1, None, 1]],
       [-1, 1, nn.Upsample, [None, 2, 'nearest']],
       [5, 1, Conv, [128, 1, 1, None, 1]],  # route backbone P4
       [[-1, -2], 1, Concat, [1]],  # 11

       [-1, 1, ELAN, [128, 1, 1, None, 1]],  # 12

       [-1, 1, Conv, [64, 1, 1, None, 1]],
       [-1, 1, nn.Upsample, [None, 2, 'nearest']],
       [4, 1, Conv, [64, 1, 1, None, 1]],  # route backbone P3
       [[-1, -2], 1, Concat, [1]],  # 16

       [-1, 1, ELAN, [64, 1, 1, None, 1]],  # 17

       [-1, 1, Conv, [128, 3, 2, None, 1]],
       [[-1, 12], 1, Concat, [1]],

       [-1, 1, ELAN, [128, 1, 1, None, 1]],  # 20

       [-1, 1, Conv, [256, 3, 2, None, 1]],
       [[-1, 7], 1, Concat, [1]],

       [-1, 1, ELAN, [256, 1, 1, None, 1]],  # 23

       [17, 1, Conv, [128, 3, 1, None, 1]],
       [20, 1, Conv, [256, 3, 1, None, 1]],
       [23, 1, Conv, [512, 3, 1, None, 1]],

       [[24,25,26], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
      ]
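    To sanity-check that the `from` indices in the head line up, one can track each layer's cumulative stride by hand. The sketch below is a hypothetical bookkeeping helper, not part of YOLO. It hard-codes a simplified view of the config above (source index, module name, stride factor) and assumes that `VanillaStem` downsamples by 4, that `VanillaBlock` and stride-2 `Conv` downsample by 2, and that `nn.Upsample` halves the stride.

```python
# (from, module, stride factor) for layers 0..26 of the config above
LAYERS = [
    (-1, "VanillaStem", 4), (-1, "VanillaBlock", 2),
    (-1, "VanillaBlock", 2), (-1, "VanillaBlock", 2),       # 0-3 backbone
    (1, "Conv", 1), (2, "Conv", 1), (3, "Conv", 1),         # 4-6
    (-1, "SPPCSPCSIM", 1),                                  # 7
    (-1, "Conv", 1), (-1, "Upsample", 0.5), (5, "Conv", 1),
    ([-1, -2], "Concat", 1), (-1, "ELAN", 1),               # 8-12
    (-1, "Conv", 1), (-1, "Upsample", 0.5), (4, "Conv", 1),
    ([-1, -2], "Concat", 1), (-1, "ELAN", 1),               # 13-17
    (-1, "Conv", 2), ([-1, 12], "Concat", 1), (-1, "ELAN", 1),   # 18-20
    (-1, "Conv", 2), ([-1, 7], "Concat", 1), (-1, "ELAN", 1),    # 21-23
    (17, "Conv", 1), (20, "Conv", 1), (23, "Conv", 1),      # 24-26
]


def layer_strides(layers):
    """Return the cumulative stride of every layer; raise if a Concat mixes strides."""
    strides = []
    for i, (src, name, factor) in enumerate(layers):
        sources = src if isinstance(src, list) else [src]
        inputs = []
        for s in sources:
            idx = i + s if s < 0 else s          # negative indices are relative
            inputs.append(1 if idx < 0 else strides[idx])   # idx < 0: the input image
        if len(set(inputs)) != 1:
            raise ValueError(f"layer {i} ({name}) mixes strides {inputs}")
        strides.append(int(inputs[0] * factor))
    return strides


strides = layer_strides(LAYERS)
print([strides[i] for i in (24, 25, 26)])   # strides of the three Detect inputs
```

    Under these assumptions the three Detect inputs come out at strides 8, 16 and 32, matching the P3/8, P4/16 and P5/32 anchor groups.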

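    The ELAN module here is a simplified version of the corresponding modules in the original config file.

    For the simplification, see: "Simplifying the yolov7 network yaml config file" (athrunsunny's blog, CSDN).

    To keep the overall parameter count from drifting too far from v5s and v7-tiny, the VanillaBlock channel counts are set smaller than in the paper.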
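    To get a rough feel for how the channel widths drive the parameter count, the sketch below counts the learnable parameters of one `VanillaBlock` as it is defined later in this post, in both training and deploy form. This is back-of-envelope arithmetic: it assumes every convolution has a bias and every BatchNorm contributes two learnable vectors, and `vanilla_block_params` is an illustrative helper name.

```python
def vanilla_block_params(dim, dim_out, act_num=3):
    """Return (training_params, deploy_params) for one VanillaBlock."""
    k = 2 * act_num + 1                      # side of the depthwise activation kernel
    conv1 = dim * dim + dim                  # 1x1 conv, dim -> dim, with bias
    bn1 = 2 * dim                            # BatchNorm weight + bias
    conv2 = dim * dim_out + dim_out          # 1x1 conv, dim -> dim_out, with bias
    bn2 = 2 * dim_out
    act_train = dim_out * k * k + 2 * dim_out    # depthwise kernel + its BatchNorm
    act_deploy = dim_out * k * k + dim_out       # depthwise kernel + fused bias
    training = conv1 + bn1 + conv2 + bn2 + act_train
    deploy = conv2 + act_deploy              # conv1 is merged into conv2
    return training, deploy


# Channel widths of the three VanillaBlock layers in the config above
for dim, dim_out in ((64, 256), (256, 512), (512, 1024)):
    print(dim, dim_out, vanilla_block_params(dim, dim_out))
```

    The last block dominates, since the 1x1 convolution grows with `dim * dim_out`; this is the term that shrinking the channel widths reduces.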

    Add the following to common.py:

    # common.py normally imports these already; repeated here for completeness.
    import torch
    import torch.nn as nn


    class activation(nn.ReLU):
        """ReLU followed by a depthwise conv (plus BatchNorm while training)."""

        def __init__(self, dim, act_num=3, deploy=False):
            super(activation, self).__init__()
            self.act_num = act_num
            self.deploy = deploy
            self.dim = dim
            # One (2 * act_num + 1) x (2 * act_num + 1) depthwise kernel per channel.
            self.weight = torch.nn.Parameter(torch.randn(dim, 1, act_num * 2 + 1, act_num * 2 + 1))
            if deploy:
                self.bias = torch.nn.Parameter(torch.zeros(dim))
            else:
                self.bias = None
                self.bn = nn.BatchNorm2d(dim, eps=1e-6)
            nn.init.trunc_normal_(self.weight, std=.02)

        def forward(self, x):
            if self.deploy:
                return torch.nn.functional.conv2d(
                    super(activation, self).forward(x),
                    self.weight, self.bias, padding=self.act_num, groups=self.dim)
            else:
                return self.bn(torch.nn.functional.conv2d(
                    super(activation, self).forward(x),
                    self.weight, padding=self.act_num, groups=self.dim))

        def _fuse_bn_tensor(self, weight, bn):
            # Fold BatchNorm statistics into the (bias-free) depthwise kernel.
            kernel = weight
            running_mean = bn.running_mean
            running_var = bn.running_var
            gamma = bn.weight
            beta = bn.bias
            eps = bn.eps
            std = (running_var + eps).sqrt()
            t = (gamma / std).reshape(-1, 1, 1, 1)
            return kernel * t, beta + (0 - running_mean) * gamma / std

        def switch_to_deploy(self):
            kernel, bias = self._fuse_bn_tensor(self.weight, self.bn)
            self.weight.data = kernel
            self.bias = torch.nn.Parameter(torch.zeros(self.dim))
            self.bias.data = bias
            self.__delattr__('bn')
            self.deploy = True


    class VanillaStem(nn.Module):
        # s and g are accepted only to match the yaml argument layout;
        # the conv stride is chosen from ada_pool below.
        def __init__(self, in_chans=3, dims=96,
                     k=0, s=0, p=None, g=0, act_num=3, deploy=False, ada_pool=None, **kwargs):
            super().__init__()
            self.deploy = deploy
            stride, padding = (4, 0) if not ada_pool else (3, 1)
            if self.deploy:
                self.stem = nn.Sequential(
                    nn.Conv2d(in_chans, dims, kernel_size=k, stride=stride, padding=padding),
                    activation(dims, act_num, deploy=self.deploy)
                )
            else:
                self.stem1 = nn.Sequential(
                    nn.Conv2d(in_chans, dims, kernel_size=k, stride=stride, padding=padding),
                    nn.BatchNorm2d(dims, eps=1e-6),
                )
                self.stem2 = nn.Sequential(
                    nn.Conv2d(dims, dims, kernel_size=1, stride=1),
                    nn.BatchNorm2d(dims, eps=1e-6),
                    activation(dims, act_num)
                )
            # Negative slope of the leaky ReLU between the two convs; at 1 it is the identity.
            self.act_learn = 1
            self.apply(self._init_weights)

        def _init_weights(self, m):
            if isinstance(m, (nn.Conv2d, nn.Linear)):
                nn.init.trunc_normal_(m.weight, std=.02)
                nn.init.constant_(m.bias, 0)

        def forward(self, x):
            if self.deploy:
                x = self.stem(x)
            else:
                x = self.stem1(x)
                x = torch.nn.functional.leaky_relu(x, self.act_learn)
                x = self.stem2(x)
            return x

        def _fuse_bn_tensor(self, conv, bn):
            # Fold BatchNorm statistics into the preceding conv's weight and bias.
            kernel = conv.weight
            bias = conv.bias
            running_mean = bn.running_mean
            running_var = bn.running_var
            gamma = bn.weight
            beta = bn.bias
            eps = bn.eps
            std = (running_var + eps).sqrt()
            t = (gamma / std).reshape(-1, 1, 1, 1)
            return kernel * t, beta + (bias - running_mean) * gamma / std

        def switch_to_deploy(self):
            self.stem2[2].switch_to_deploy()
            # Fold BN into the first conv.
            kernel, bias = self._fuse_bn_tensor(self.stem1[0], self.stem1[1])
            self.stem1[0].weight.data = kernel
            self.stem1[0].bias.data = bias
            # Fold BN into the 1x1 conv, then merge it into the first conv.
            kernel, bias = self._fuse_bn_tensor(self.stem2[0], self.stem2[1])
            self.stem1[0].weight.data = torch.einsum('oi,icjk->ocjk', kernel.squeeze(3).squeeze(2),
                                                     self.stem1[0].weight.data)
            self.stem1[0].bias.data = bias + (self.stem1[0].bias.data.view(1, -1, 1, 1) * kernel).sum(3).sum(2).sum(1)
            self.stem = torch.nn.Sequential(*[self.stem1[0], self.stem2[2]])
            self.__delattr__('stem1')
            self.__delattr__('stem2')
            self.deploy = True


    class VanillaBlock(nn.Module):
        def __init__(self, dim, dim_out, k=0, stride=2, p=None, g=0, ada_pool=None, act_num=3, deploy=False):
            super().__init__()
            # Negative slope of the leaky ReLU between the two convs; at 1 it is the identity.
            self.act_learn = 1
            self.deploy = deploy
            if self.deploy:
                self.conv = nn.Conv2d(dim, dim_out, kernel_size=1)
            else:
                self.conv1 = nn.Sequential(
                    nn.Conv2d(dim, dim, kernel_size=1),
                    nn.BatchNorm2d(dim, eps=1e-6),
                )
                self.conv2 = nn.Sequential(
                    nn.Conv2d(dim, dim_out, kernel_size=1),
                    nn.BatchNorm2d(dim_out, eps=1e-6)
                )
            if not ada_pool:
                self.pool = nn.Identity() if stride == 1 else nn.MaxPool2d(stride)
            else:
                self.pool = nn.Identity() if stride == 1 else nn.AdaptiveMaxPool2d((ada_pool, ada_pool))
            self.act = activation(dim_out, act_num, deploy=self.deploy)

        def forward(self, x):
            if self.deploy:
                x = self.conv(x)
            else:
                x = self.conv1(x)
                # We use leakyrelu to implement the deep training technique.
                x = torch.nn.functional.leaky_relu(x, self.act_learn)
                x = self.conv2(x)
            x = self.pool(x)
            x = self.act(x)
            return x

        def _fuse_bn_tensor(self, conv, bn):
            # Fold BatchNorm statistics into the preceding conv's weight and bias.
            kernel = conv.weight
            bias = conv.bias
            running_mean = bn.running_mean
            running_var = bn.running_var
            gamma = bn.weight
            beta = bn.bias
            eps = bn.eps
            std = (running_var + eps).sqrt()
            t = (gamma / std).reshape(-1, 1, 1, 1)
            return kernel * t, beta + (bias - running_mean) * gamma / std

        def switch_to_deploy(self):
            kernel, bias = self._fuse_bn_tensor(self.conv1[0], self.conv1[1])
            self.conv1[0].weight.data = kernel
            self.conv1[0].bias.data = bias
            # kernel, bias = self.conv2[0].weight.data, self.conv2[0].bias.data
            kernel, bias = self._fuse_bn_tensor(self.conv2[0], self.conv2[1])
            # Merge the two 1x1 convs into a single conv.
            self.conv = self.conv2[0]
            self.conv.weight.data = torch.matmul(kernel.transpose(1, 3),
                                                 self.conv1[0].weight.data.squeeze(3).squeeze(2)).transpose(1, 3)
            self.conv.bias.data = bias + (self.conv1[0].bias.data.view(1, -1, 1, 1) * kernel).sum(3).sum(2).sum(1)
            self.__delattr__('conv1')
            self.__delattr__('conv2')
            self.act.switch_to_deploy()
            self.deploy = True
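    The BatchNorm folding done by `_fuse_bn_tensor` can be checked by hand on a single channel. The sketch below uses plain Python scalars instead of tensors and made-up numbers, and follows the same formulas as the code above: scale the kernel by $\gamma / \sqrt{\sigma^2 + \epsilon}$ and set the bias to $\beta + (b - \mu)\,\gamma / \sqrt{\sigma^2 + \epsilon}$. The function names are illustrative.

```python
import math


def conv_then_bn(x, w, b, mean, var, gamma, beta, eps=1e-6):
    # Reference path: a one-weight "conv" followed by inference-mode BatchNorm
    y = w * x + b
    return gamma * (y - mean) / math.sqrt(var + eps) + beta


def fuse_bn(w, b, mean, var, gamma, beta, eps=1e-6):
    # Same arithmetic as _fuse_bn_tensor, on scalars
    std = math.sqrt(var + eps)
    t = gamma / std
    return w * t, beta + (b - mean) * t


def max_fusion_error(inputs, w, b, mean, var, gamma, beta):
    fused_w, fused_b = fuse_bn(w, b, mean, var, gamma, beta)
    return max(abs(conv_then_bn(x, w, b, mean, var, gamma, beta) - (fused_w * x + fused_b))
               for x in inputs)


# Illustrative values for one channel
error = max_fusion_error([-2.0, 0.0, 3.5], w=0.8, b=0.1,
                         mean=0.3, var=4.0, gamma=1.5, beta=-0.2)
print(error)
```

    The error is at floating-point rounding level, which is what lets `switch_to_deploy` drop the BatchNorm layers without changing the inference output.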

    Also make the following changes in yolo.py.

    1. In the parse_model function:

    if m in (Conv, GhostConv, Bottleneck, GhostBottleneck, SPP, SPPF, DWConv, MixConv2d, Focus, CrossConv,
             BottleneckCSP, C3, C3TR, C3SPP, C3Ghost, nn.ConvTranspose2d, DWConvTranspose2d, C3x, SPPCSPC, RepConv,
             RFEM, ELAN, SPPCSPCSIM, VanillaBlock, VanillaStem):
        c1, c2 = ch[f], args[0]
        if c2 != no:  # if not output
            c2 = make_divisible(c2 * gw, 8)

        args = [c1, c2, *args[1:]]
        if m in [BottleneckCSP, C3, C3TR, C3Ghost, C3x]:
            args.insert(2, n)  # number of repeats
            n = 1
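    To see what this branch does to a Vanilla entry, the sketch below replays the argument rewrite for the `VanillaStem` and first `VanillaBlock` lines of the config. `make_divisible` here is a simplified stand-in written for this example (round up to a multiple of the divisor), `rewrite_args` is a hypothetical helper, and `no` is computed as anchors per scale times (classes + 5); the real `parse_model` handles more cases.

```python
import math


def make_divisible(x, divisor):
    # Simplified stand-in: round x up to the nearest multiple of divisor
    return math.ceil(x / divisor) * divisor


def rewrite_args(c1, args, gw=1.0, no=3 * (80 + 5)):
    # Mirrors the branch above: prepend the input channels c1 and
    # scale the output channels c2 by the width multiple gw
    c2 = args[0]
    if c2 != no:
        c2 = make_divisible(c2 * gw, 8)
    return [c1, c2, *args[1:]]


stem_args = rewrite_args(3, [64, 4, 4, None, 1])
# -> VanillaStem(in_chans=3, dims=64, k=4, s=4, p=None, g=1)
block_args = rewrite_args(64, [256, 1, 2, None, 1])
# -> VanillaBlock(dim=64, dim_out=256, k=1, stride=2, p=None, g=1)
print(stem_args, block_args)
```

    This is why the yaml entries list only the output channels first: the input channel count is filled in from the previous layer, and the remaining yaml values land on `k`, `s`/`stride`, `p` and `g` in the constructor signatures.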

    2. In the fuse function of BaseModel:

    if isinstance(m, (VanillaStem, VanillaBlock)):
        # print(m)
        m.deploy = True
        m.switch_to_deploy()

    Finally, test yolov7-tiny-vanilla.yaml in yolo.py.

  • Original article: https://blog.csdn.net/athrunsunny/article/details/134451582