YOLOv7 Fundamentals: A Primer

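*Disclaimer:
1. This method is provided for reference only.
2. Some steps are adapted from other bloggers' posts; links to the sources are provided.*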

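Scenario 1: MP

Scenario 2: Efficient Aggregation Networks

Scenario 3: SPPCSPC

Scenario 4: Structural re-parameterization

Scenario 5: Label assignment --> specific method: simOTA

Scenario 6: Compound model scaling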

…

Scenario 1: MPC-B and MPC-N

1.1 MPC-B

   [-1, 1, MP, []],
   [-1, 1, Conv, [128, 1, 1]],
   [-3, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [128, 3, 2]],
   [[-1, -3], 1, Concat, [1]],  

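import torch.nn as nn


class MP(nn.Module):
    # Max-pooling downsample: with k=2 the feature map height and width are halved
    def __init__(self, k=2):
        super(MP, self).__init__()
        self.m = nn.MaxPool2d(kernel_size=k, stride=k)

    def forward(self, x):
        return self.m(x)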
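
To see why the two MPC-B branches can be concatenated, it helps to trace the shapes by hand. The sketch below is not taken from the YOLOv7 code: `conv_out` and `mpc_b_shape` are hypothetical helper names, the 3x3 convolution is assumed to use padding 1, and the input height and width are assumed to be even.

```python
def conv_out(size, k, s, p):
    """Spatial output size of a conv/pool layer (floor mode, no dilation)."""
    return (size + 2 * p - k) // s + 1


def mpc_b_shape(h, w, c_branch=128):
    """Trace (channels, height, width) through the two MPC-B branches."""
    # Branch 1: MaxPool2d(k=2, s=2) followed by a 1x1 Conv with stride 1.
    h1, w1 = conv_out(h, 2, 2, 0), conv_out(w, 2, 2, 0)
    h1, w1 = conv_out(h1, 1, 1, 0), conv_out(w1, 1, 1, 0)
    # Branch 2: 1x1 Conv with stride 1, then a 3x3 Conv with stride 2 and padding 1.
    h2, w2 = conv_out(h, 1, 1, 0), conv_out(w, 1, 1, 0)
    h2, w2 = conv_out(h2, 3, 2, 1), conv_out(w2, 3, 2, 1)
    assert (h1, w1) == (h2, w2), "branches must agree spatially before Concat"
    # Concat along the channel dimension adds the channel counts.
    return 2 * c_branch, h1, w1


print(mpc_b_shape(160, 160))  # (256, 80, 80)
```

Both branches halve the resolution, one by pooling and one by a strided convolution, so the concatenated output has twice the per-branch channel count at half the spatial size.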

1.2 MPC-N

   # MPC-H
   [-1, 1, MP, []],
   [-1, 1, Conv, [128, 1, 1]],
   [-3, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [128, 3, 2]],
   [[-1, -3, 63], 1, Concat, [1]],

   #MPC-H
   [-1, 1, MP, []],
   [-1, 1, Conv, [256, 1, 1]],
   [-3, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 2]],
   [[-1, -3, 51], 1, Concat, [1]],

…

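Scenario 2: Efficient Aggregation Networks

VoVNet: a new backbone network for real-time object detection

1.1 VoVNet

Strongly recommended reading first --> Scenario 10: New model: DenseNet

VoVNet paper link

1.2 VoVNet v2

Paper link: VoVNet v2 is proposed in the CenterMask paper

Scenario 3: Common attention mechanisms --> SENet

1.3 The ELAN structure in YOLOv7

Scenario 5: CSPNet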

   [-1, 1, Conv, [128, 1, 1]],
   [-2, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -3, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [512, 1, 1]],  

1.4 The ELAN-H structure in YOLOv7

   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1]], 
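
The config rows above follow the `[from, number, module, args]` layout, where a negative `from` counts back from the layer being built. As a rough illustration of how the Concat inputs are resolved, the sketch below walks the ELAN and ELAN-H rows and tracks output channels. `trace_channels`, `ELAN`, `ELAN_H`, and the input channel counts 256 and 512 are hypothetical names and values chosen for this example. It assumes the first Conv argument is the output channel count and handles only relative (negative) indices.

```python
def trace_channels(rows, c_in):
    """Return the output channels of every row in a block.

    rows : list of [from, number, module, args] entries
    c_in : channels of the layer feeding the block (index -1 for the first row)
    """
    outs = [c_in]  # outs[0] is the block input; row i is stored at outs[i + 1]
    for frm, _number, module, args in rows:
        sources = frm if isinstance(frm, list) else [frm]
        assert all(f < 0 for f in sources), "only relative indices are handled"
        src_channels = [outs[f] for f in sources]  # negative index = count back
        if module == "Conv":
            outs.append(args[0])            # first arg is the output channels
        elif module == "Concat":
            outs.append(sum(src_channels))  # channel-wise concatenation
        else:
            raise ValueError(f"unsupported module: {module}")
    return outs[1:]


ELAN = [
    [-1, 1, "Conv", [128, 1, 1]],
    [-2, 1, "Conv", [128, 1, 1]],
    [-1, 1, "Conv", [128, 3, 1]],
    [-1, 1, "Conv", [128, 3, 1]],
    [-1, 1, "Conv", [128, 3, 1]],
    [-1, 1, "Conv", [128, 3, 1]],
    [[-1, -3, -5, -6], 1, "Concat", [1]],
    [-1, 1, "Conv", [512, 1, 1]],
]

ELAN_H = [
    [-1, 1, "Conv", [256, 1, 1]],
    [-2, 1, "Conv", [256, 1, 1]],
    [-1, 1, "Conv", [128, 3, 1]],
    [-1, 1, "Conv", [128, 3, 1]],
    [-1, 1, "Conv", [128, 3, 1]],
    [-1, 1, "Conv", [128, 3, 1]],
    [[-1, -2, -3, -4, -5, -6], 1, "Concat", [1]],
    [-1, 1, "Conv", [256, 1, 1]],
]

print(trace_channels(ELAN, 256))    # [128, 128, 128, 128, 128, 128, 512, 512]
print(trace_channels(ELAN_H, 512))  # [256, 256, 128, 128, 128, 128, 1024, 256]
```

ELAN concatenates four of the six intermediate outputs (4 x 128 = 512 channels), while ELAN-H concatenates all six (2 x 256 + 4 x 128 = 1024 channels) before the final 1x1 Conv.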

…

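Scenario 3: SPPCSPC

Improvements to spatial pyramid pooling: SPP / SPPF / ASPP / RFB / SPPCSPC

1.1 SPPCSPC

What is the design rationale behind it? Discussion is welcome. Many bloggers draw this structure incorrectly; the version shown here is the correct one.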

class SPPCSPC(nn.Module):
    # CSP https://github.com/WongKinYiu/CrossStagePartialNetworks
    def __init__(self, c1, c2, n=1, shortcut=False, g=1, e=0.5, k=(5, 9, 13)):
        super(SPPCSPC, self).__init__()
        c_ = int(2 * c2 * e)  # hidden channels
        self.cv1 = Conv(c1, c_, 1, 1)
        self.cv2 = Conv(c1, c_, 1, 1)
        self.cv3 = Conv(c_, c_, 3, 1)
        self.cv4 = Conv(c_, c_, 1, 1)
        self.m = nn.ModuleList([nn.MaxPool2d(kernel_size=x, stride=1, padding=x // 2) for x in k])
        self.cv5 = Conv(4 * c_, c_, 1, 1)
        self.cv6 = Conv(c_, c_, 3, 1)
        self.cv7 = Conv(2 * c_, c2, 1, 1)

    def forward(self, x):
        x1 = self.cv4(self.cv3(self.cv1(x)))
        y1 = self.cv6(self.cv5(torch.cat([x1] + [m(x1) for m in self.m], 1)))
        y2 = self.cv2(x)
        return self.cv7(torch.cat((y1, y2), dim=1))
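
Two details of the SPPCSPC code are easy to check by hand: the stride-1 max pools keep the spatial size, and `cv5` receives `4 * c_` channels because the unpooled tensor is concatenated with the three pooled copies. The sketch below is plain arithmetic and does not use the real module; `pool_out` and `sppcspc_channels` are hypothetical helper names.

```python
def pool_out(size, k, s=1, p=None):
    """Output size of MaxPool2d(kernel_size=k, stride=s, padding=k // 2)."""
    p = k // 2 if p is None else p
    return (size + 2 * p - k) // s + 1


def sppcspc_channels(c2, e=0.5, k=(5, 9, 13)):
    """Channel counts inside SPPCSPC, following the constructor above."""
    c_ = int(2 * c2 * e)  # hidden channels
    return {
        "hidden": c_,
        "cv5_in": (1 + len(k)) * c_,  # x1 plus one pooled copy per kernel size
        "cv7_in": 2 * c_,             # pooled branch y1 plus shortcut branch y2
        "out": c2,
    }


print([pool_out(20, k) for k in (5, 9, 13)])  # [20, 20, 20]
print(sppcspc_channels(512))
```

Because every pool preserves height and width, the four tensors passed to `torch.cat` in `forward` differ only in content and can be stacked along the channel axis.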

1.2 SPP and SPPF
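
SPP runs max pools of size 5, 9 and 13 in parallel, while SPPF applies one size-5 pool three times in series. With stride 1 and "same" padding the two give the same feature maps, since chaining 5-wide windows covers 9 and then 13 positions. The sketch below checks this in one dimension with plain Python lists. `max_pool_1d` is a hypothetical helper, and clipping the window at the borders stands in for the negative-infinity padding that PyTorch max pooling uses.

```python
import random


def max_pool_1d(x, k):
    """Stride-1 max pooling with 'same' padding; borders are clipped."""
    r = k // 2
    n = len(x)
    return [max(x[max(0, i - r): min(n, i + r + 1)]) for i in range(n)]


random.seed(0)
x = [random.random() for _ in range(32)]

# SPP: three parallel pools with growing windows.
spp = [max_pool_1d(x, k) for k in (5, 9, 13)]

# SPPF: one 5-wide pool applied three times in series.
y1 = max_pool_1d(x, 5)
y2 = max_pool_1d(y1, 5)
y3 = max_pool_1d(y2, 5)
sppf = [y1, y2, y3]

print(spp == sppf)  # True
```

The 2D case behaves the same way because a square max pool can be applied along rows and then along columns. SPPF therefore reaches the same receptive fields with smaller kernels.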

1.3 SPP3

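import warnings

import torch
import torch.nn as nn


class SPP3(nn.Module):
    # Conv is the Conv + BN + activation block from YOLOv7's models/common.py
    def __init__(self, c1, c2, k1):
        super().__init__()
        c_ = c1 // 2  # hidden channels
        k1, k2, k3 = 3, 5, 7  # pooling kernel sizes; note that the k1 argument is overwritten here
        self.cn1 = Conv(c1, c_, 1, 1)
        self.cn2 = Conv(c_ * 4, c2, 1, 1)
        self.m1 = nn.AvgPool2d(kernel_size=k1, stride=1, padding=k1 // 2)
        self.m2 = nn.AvgPool2d(kernel_size=k2, stride=1, padding=k2 // 2)
        self.m3 = nn.AvgPool2d(kernel_size=k3, stride=1, padding=k3 // 2)

    def forward(self, x):
        x = self.cn1(x)
        with warnings.catch_warnings():
            warnings.simplefilter('ignore')  # suppress pooling warnings
            m1 = self.m1(x)
            m2 = self.m2(m1)
            m3 = self.m3(m1)  # as in the original: m3 also pools m1, not m2
        return self.cn2(torch.cat([x, m1, m2, m3], 1))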

…

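Scenario 4: Structural re-parameterization

1.1 Preface

Structural re-parameterization: decoupling training-time and inference-time structures through parameter transformation

Model compression explained (6): structural re-parameterization, which can aggressively boost performance or compress losslessly

1.2 ACNet

ACNet paper link

[CNN architecture design] A painless trick for accuracy gains: ACNet

3x3 conv + 1x3 conv + 3x1 conv = a free accuracy boost | ICCV 2019

Training stage

Inference stage

Summary

1.3 RepVGG

Training stage

Inference stage

Different versions

1.4 RepConv in YOLOv7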

# RepConv in YOLOv7 is borrowed from RepVGG
# Re-parameterized structure
import numpy as np
import torch
import torch.nn as nn

# autopad(k, p) comes from YOLOv7's models/common.py: it returns p if given, else k // 2

class RepConv(nn.Module):

    # RepVGG paper: https://arxiv.org/abs/2101.03697

    def __init__(self, c1, c2, k=3, s=1, p=None, g=1, act=True, deploy=False):
        super(RepConv, self).__init__()

        self.deploy = deploy    # True builds the fused inference (deployment) structure
        self.groups = g         # number of groups for grouped convolution; the default 1 is an ordinary convolution
        self.in_channels = c1   # input channels
        self.out_channels = c2  # output channels

        assert k == 3
        assert autopad(k, p) == 1    # with padding=1, a 3x3 conv at stride 1 keeps the spatial size unchanged

        padding_11 = autopad(k, p) - k // 2  # padding of the 1x1 branch: 1 - 1 = 0

        self.act = nn.SiLU() if act is True else (act if isinstance(act, nn.Module) else nn.Identity())

        # In the inference model the basic block is a single plain Conv2d
        if deploy:
            self.rbr_reparam = nn.Conv2d(c1, c2, k, s, autopad(k, p), groups=g, bias=True)

        else:
            # In the training model the basic block combines identity, 1x1 conv + BN and 3x3 conv + BN

            # Identity branch: the input goes straight into a BN layer; it exists only when c2 == c1 and s == 1
            self.rbr_identity = (nn.BatchNorm2d(num_features=c1) if c2 == c1 and s == 1 else None)

            # Ordinary 3x3 conv + BN
            self.rbr_dense = nn.Sequential(
                nn.Conv2d(c1, c2, k, s, autopad(k, p), groups=g, bias=False),
                nn.BatchNorm2d(num_features=c2),
            )

            # Ordinary 1x1 conv + BN
            self.rbr_1x1 = nn.Sequential(
                nn.Conv2d( c1, c2, 1, s, padding_11, groups=g, bias=False),
                nn.BatchNorm2d(num_features=c2),
            )

    
    def forward(self, inputs):

        # hasattr() checks whether the object has the given attribute: in deploy mode rbr_reparam exists, i.e. the re-parameterized conv
        if hasattr(self, "rbr_reparam"):
            return self.act(self.rbr_reparam(inputs))     # inference: a single Conv2d followed by the activation (SiLU by default)

        if self.rbr_identity is None:  # no identity branch (c1 != c2 or s != 1), so it contributes 0
            id_out = 0
        else:
            id_out = self.rbr_identity(inputs)

        # (3x3 conv + BN) + (1x1 conv + BN) + (identity + BN)
        return self.act(self.rbr_dense(inputs) + self.rbr_1x1(inputs) + id_out)
    


    # RepVGG conversion: turn the training-mode block into its deploy-mode equivalent
    def repvgg_convert(self):
        kernel, bias = self.get_equivalent_kernel_bias()
        return (
            kernel.detach().cpu().numpy(),
            bias.detach().cpu().numpy(),
        )



    # The next function follows the paper and turns the 1x1 conv and the identity into 3x3 convs ------------------
    # It returns two values: kernel3x3 + self._pad_1x1_to_3x3_tensor(kernel1x1) + kernelid, and bias3x3 + bias1x1 + biasid,
    # i.e. the weight and the bias of the equivalent 3x3 conv.

    def get_equivalent_kernel_bias(self):

        kernel3x3, bias3x3 = self._fuse_bn_tensor(self.rbr_dense)     # W and b of the 3x3 branch after folding its BN
        kernel1x1, bias1x1 = self._fuse_bn_tensor(self.rbr_1x1)       # W and b of the 1x1 branch after folding its BN
        kernelid, biasid = self._fuse_bn_tensor(self.rbr_identity)

        # A convolution is linear, W(x) + b, so the branches are fused by summing the weights and summing the biases.
        # How the identity branch yields a W and b is shown in _fuse_bn_tensor below.
        return (
            kernel3x3 + self._pad_1x1_to_3x3_tensor(kernel1x1) + kernelid,
            bias3x3 + bias1x1 + biasid,
        )


    # Convert a 1x1 conv kernel into a 3x3 conv kernel
    def _pad_1x1_to_3x3_tensor(self, kernel1x1):
        if kernel1x1 is None:
            return 0
        else:
            return nn.functional.pad(kernel1x1, [1, 1, 1, 1])
    # The 1x1 kernel is padded with one ring of zeros to become a 3x3 kernel
    #                        [0  0  0]
    #    [1]  >>>padding>>>  [0  1  0]
    #                        [0  0  0]


    # Fold the BN of each branch into its conv ("absorbing BN")
    # and return the equivalent kernel W and bias b
    def _fuse_bn_tensor(self, branch):
        if branch is None:
            # The branch does not exist (no identity branch): return W=0, b=0
            return 0, 0

        # 3x3 conv + BN:  self.rbr_dense = nn.Sequential( nn.Conv2d(c1, c2, k, s, autopad(k, p), groups=g, bias=False), nn.BatchNorm2d(num_features=c2),)
        # 1x1 conv + BN:  self.rbr_1x1 = nn.Sequential( nn.Conv2d( c1, c2, 1, s, padding_11, groups=g, bias=False),  nn.BatchNorm2d(num_features=c2), )
        # When branch is the 3x3 or 1x1 conv, read the values below for the later fusion

        if isinstance(branch, nn.Sequential):
            kernel = branch[0].weight              # conv weight; the conv has no bias term here (bias=False)
            running_mean = branch[1].running_mean  # BN running mean
            running_var = branch[1].running_var    # BN running variance
            gamma = branch[1].weight               # BN γ
            beta = branch[1].bias                  # BN β
            eps = branch[1].eps                    # keeps the denominator away from 0

        else:

            # Identity branch: branch is an nn.BatchNorm2d; earlier: self.rbr_identity = (nn.BatchNorm2d(num_features=c1) if c2 == c1 and s == 1 else None)
            assert isinstance(branch, nn.BatchNorm2d)

            if not hasattr(self, "id_tensor"):
                input_dim = self.in_channels // self.groups     # input channels per group (grouped convolution)

                # All-zero array shaped like a conv weight: (in_channels, input_dim, 3, 3)
                kernel_value = np.zeros(
                    (self.in_channels, input_dim, 3, 3), dtype=np.float32
                )

                # Set the kernel centre to 1: as with 1x1 --> 3x3, the value sits at position (1, 1) surrounded by zeros.
                # E.g. with 3 input and 3 output channels there are 3 kernels, each with 3 input-channel slices;
                # identity --> 3x3 conv sets positions (0, 0, 1, 1), (1, 1, 1, 1), (2, 2, 1, 1) to 1
                for i in range(self.in_channels):
                    kernel_value[i, i % input_dim, 1, 1] = 1

                self.id_tensor = torch.from_numpy(kernel_value).to(branch.weight.device)
            # The identity now has an equivalent 3x3 weight

            kernel = self.id_tensor                # conv weight
            running_mean = branch.running_mean     # BN running mean
            running_var = branch.running_var       # BN running variance
            gamma = branch.weight                  # BN γ
            beta = branch.bias                     # BN β
            eps = branch.eps                       # keeps the denominator away from 0


        # BN(conv(x)) = [γ * W / sqrt(var + eps)] * x + (β - γ * mean / sqrt(var + eps))

        std = (running_var + eps).sqrt()

        # kernel is a 4-D tensor while gamma / std is 1-D, so reshape(-1, 1, 1, 1) makes it broadcast against kernel
        t = (gamma / std).reshape(-1, 1, 1, 1)

        return kernel * t, beta - running_mean * gamma / std
    # --------------------------------------------------------------------------------------------------



    def fuse_conv_bn(self, conv, bn):

        std = (bn.running_var + bn.eps).sqrt()
        bias = bn.bias - bn.running_mean * bn.weight / std

        t = (bn.weight / std).reshape(-1, 1, 1, 1)
        weights = conv.weight * t

        bn = nn.Identity()
        conv = nn.Conv2d(in_channels = conv.in_channels,
                              out_channels = conv.out_channels,
                              kernel_size = conv.kernel_size,
                              stride=conv.stride,
                              padding = conv.padding,
                              dilation = conv.dilation,
                              groups = conv.groups,
                              bias = True,
                              padding_mode = conv.padding_mode)

        conv.weight = torch.nn.Parameter(weights)
        conv.bias = torch.nn.Parameter(bias)
        return conv

    def fuse_repvgg_block(self):    
        if self.deploy:
            return
        print(f"RepConv.fuse_repvgg_block")
                
        self.rbr_dense = self.fuse_conv_bn(self.rbr_dense[0], self.rbr_dense[1])
        
        self.rbr_1x1 = self.fuse_conv_bn(self.rbr_1x1[0], self.rbr_1x1[1])
        rbr_1x1_bias = self.rbr_1x1.bias
        weight_1x1_expanded = torch.nn.functional.pad(self.rbr_1x1.weight, [1, 1, 1, 1])
        
        # Fuse self.rbr_identity
        if (isinstance(self.rbr_identity, nn.BatchNorm2d) or isinstance(self.rbr_identity, nn.modules.batchnorm.SyncBatchNorm)):
            # print(f"fuse: rbr_identity == BatchNorm2d or SyncBatchNorm")
            identity_conv_1x1 = nn.Conv2d(
                    in_channels=self.in_channels,
                    out_channels=self.out_channels,
                    kernel_size=1,
                    stride=1,
                    padding=0,
                    groups=self.groups, 
                    bias=False)
            identity_conv_1x1.weight.data = identity_conv_1x1.weight.data.to(self.rbr_1x1.weight.data.device)
            identity_conv_1x1.weight.data = identity_conv_1x1.weight.data.squeeze().squeeze()
            # print(f" identity_conv_1x1.weight = {identity_conv_1x1.weight.shape}")
            identity_conv_1x1.weight.data.fill_(0.0)
            identity_conv_1x1.weight.data.fill_diagonal_(1.0)
            identity_conv_1x1.weight.data = identity_conv_1x1.weight.data.unsqueeze(2).unsqueeze(3)
            # print(f" identity_conv_1x1.weight = {identity_conv_1x1.weight.shape}")

            identity_conv_1x1 = self.fuse_conv_bn(identity_conv_1x1, self.rbr_identity)
            bias_identity_expanded = identity_conv_1x1.bias
            weight_identity_expanded = torch.nn.functional.pad(identity_conv_1x1.weight, [1, 1, 1, 1])            
        else:
            # print(f"fuse: rbr_identity != BatchNorm2d, rbr_identity = {self.rbr_identity}")
            bias_identity_expanded = torch.nn.Parameter( torch.zeros_like(rbr_1x1_bias) )
            weight_identity_expanded = torch.nn.Parameter( torch.zeros_like(weight_1x1_expanded) )            
        

        #print(f"self.rbr_1x1.weight = {self.rbr_1x1.weight.shape}, ")
        #print(f"weight_1x1_expanded = {weight_1x1_expanded.shape}, ")
        #print(f"self.rbr_dense.weight = {self.rbr_dense.weight.shape}, ")

        self.rbr_dense.weight = torch.nn.Parameter(self.rbr_dense.weight + weight_1x1_expanded + weight_identity_expanded)
        self.rbr_dense.bias = torch.nn.Parameter(self.rbr_dense.bias + rbr_1x1_bias + bias_identity_expanded)
                
        self.rbr_reparam = self.rbr_dense
        self.deploy = True

        if self.rbr_identity is not None:
            del self.rbr_identity
            self.rbr_identity = None

        if self.rbr_1x1 is not None:
            del self.rbr_1x1
            self.rbr_1x1 = None

        if self.rbr_dense is not None:
            del self.rbr_dense
            self.rbr_dense = None
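
The fusion performed by `_fuse_bn_tensor` and `get_equivalent_kernel_bias` can be checked numerically without PyTorch. The sketch below is a single-channel toy version in plain Python, not the YOLOv7 implementation: all helper names and the kernel and BN values are made up for the example. It folds each branch's BN into a kernel and bias, sums the three branches into one 3x3 kernel, and compares the result with the three-branch output.

```python
import math


def conv3x3_same(img, kernel, bias=0.0):
    """Single-channel 3x3 convolution, stride 1, zero padding 1."""
    h, w = len(img), len(img[0])
    out = [[bias] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            for di in range(3):
                for dj in range(3):
                    y, x = i + di - 1, j + dj - 1
                    if 0 <= y < h and 0 <= x < w:
                        out[i][j] += kernel[di][dj] * img[y][x]
    return out


def batch_norm(img, gamma, beta, mean, var, eps=1e-5):
    """Inference-mode BN on one channel, using running statistics."""
    std = math.sqrt(var + eps)
    return [[gamma * (v - mean) / std + beta for v in row] for row in img]


def fold_bn(kernel, gamma, beta, mean, var, eps=1e-5):
    """W' = W * gamma / std, b' = beta - mean * gamma / std."""
    std = math.sqrt(var + eps)
    return [[k * gamma / std for k in row] for row in kernel], beta - mean * gamma / std


def pad_1x1(w):
    """Place a 1x1 weight at the centre of an all-zero 3x3 kernel."""
    return [[0.0, 0.0, 0.0], [0.0, w, 0.0], [0.0, 0.0, 0.0]]


img = [[1.0, 2.0, 0.5], [-1.0, 3.0, 2.0], [0.0, 1.5, -2.0]]
k3 = [[0.1, -0.2, 0.3], [0.0, 0.5, -0.1], [0.2, 0.1, -0.3]]
w1 = 0.7
# (gamma, beta, running_mean, running_var) for each branch's BN
bn_dense, bn_1x1, bn_id = (1.2, 0.1, 0.3, 2.0), (0.8, -0.2, -0.1, 0.5), (1.5, 0.05, 0.2, 1.0)

# Training-time structure: three branches, each followed by its own BN.
out_dense = batch_norm(conv3x3_same(img, k3), *bn_dense)
out_1x1 = batch_norm([[w1 * v for v in row] for row in img], *bn_1x1)
out_id = batch_norm(img, *bn_id)
train_out = [[a + b + c for a, b, c in zip(*rows)] for rows in zip(out_dense, out_1x1, out_id)]

# Deploy-time structure: fold every BN, then sum the kernels and the biases.
folded = [fold_bn(k3, *bn_dense), fold_bn(pad_1x1(w1), *bn_1x1), fold_bn(pad_1x1(1.0), *bn_id)]
kernel = [[sum(f[0][r][c] for f in folded) for c in range(3)] for r in range(3)]
bias = sum(f[1] for f in folded)
deploy_out = conv3x3_same(img, kernel, bias)

max_err = max(abs(a - b) for ra, rb in zip(train_out, deploy_out) for a, b in zip(ra, rb))
print(max_err < 1e-9)  # True
```

The identity branch is treated as a 1x1 convolution with weight 1, which is what `id_tensor` encodes in the class above. In the real module the same folding is done per output channel.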

…

Scenario 5: Label assignment --> specific method: simOTA
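
As a rough illustration of the dynamic-k step commonly associated with simOTA, the sketch below assigns predictions to ground-truth boxes from a cost matrix and an IoU matrix. It is a simplified plain-Python sketch and not the YOLOv7 implementation: `dynamic_k_matching`, the candidate count of 10, and the toy matrices are assumptions made for this example. The cost is taken as given here; in practice it is usually built from classification and IoU losses.

```python
def dynamic_k_matching(cost, ious, n_candidate_k=10):
    """Assign each prediction to at most one ground truth.

    cost[g][a] : assignment cost between ground truth g and prediction a
    ious[g][a] : IoU between ground truth g and prediction a
    Returns a list with the ground-truth index per prediction, or -1.
    """
    n_gt, n_pred = len(cost), len(cost[0])
    matched = [[False] * n_pred for _ in range(n_gt)]
    for g in range(n_gt):
        # k for this ground truth = sum of its largest IoUs, at least 1.
        top_ious = sorted(ious[g], reverse=True)[:n_candidate_k]
        k = max(int(sum(top_ious)), 1)
        # Keep the k predictions with the lowest cost.
        for a in sorted(range(n_pred), key=lambda a: cost[g][a])[:k]:
            matched[g][a] = True
    assignment = []
    for a in range(n_pred):
        gts = [g for g in range(n_gt) if matched[g][a]]
        # A prediction claimed by several ground truths keeps the cheapest one.
        assignment.append(min(gts, key=lambda g: cost[g][a]) if gts else -1)
    return assignment


ious = [[0.9, 0.8, 0.6, 0.1],
        [0.0, 0.7, 0.8, 0.6]]
cost = [[0.2, 0.4, 0.9, 3.0],
        [5.0, 0.3, 0.5, 0.6]]
print(dynamic_k_matching(cost, ious))  # [0, 1, 1, -1]
```

In the toy data both ground truths get k = 2. Prediction 1 is selected by both and goes to ground truth 1 because its cost there is lower, and prediction 3 stays unassigned and is treated as background.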

…

Scenario 6: Compound model scaling

1.1 Model scaling

1.2 EfficientNet

EfficientNet paper link

EfficientNet explained in detail

[Easy to follow] EfficientNet explained: why is EfficientNet called the strongest network today?

Notes on Neural Architecture Search (NAS)
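
EfficientNet's compound scaling ties depth, width and input resolution to a single coefficient phi: depth scales as alpha^phi, width as beta^phi and resolution as gamma^phi, with the bases chosen so that alpha * beta^2 * gamma^2 is about 2. The sketch below uses the base values reported in the EfficientNet paper (1.2, 1.1, 1.15). `compound_scale` and `flops_factor` are hypothetical helper names, and the FLOPs estimate is the paper's rough proportionality rather than a measurement.

```python
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # depth, width, resolution bases


def compound_scale(phi):
    """Return (depth, width, resolution) multipliers for coefficient phi."""
    return ALPHA ** phi, BETA ** phi, GAMMA ** phi


def flops_factor(phi):
    """FLOPs grow roughly as depth * width^2 * resolution^2."""
    d, w, r = compound_scale(phi)
    return d * w ** 2 * r ** 2


for phi in range(4):
    d, w, r = compound_scale(phi)
    print(phi, round(d, 3), round(w, 3), round(r, 3), round(flops_factor(phi), 2))
```

Each increment of phi therefore roughly doubles the compute budget while keeping the three dimensions in a fixed ratio.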

…

you did it

Original article: https://blog.csdn.net/qq_41580422/article/details/126317027