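YOLOv5 v6.0: Usage & Improvements

Code version: v6.0 source

v6.0 introduces new P5 and P6 'Nano' models: YOLOv5n and YOLOv5n6.
Nano keeps the YOLOv5s depth multiple of 0.33 but reduces the YOLOv5s width multiple from 0.50 to 0.25, cutting the parameter count from 7.5M to 1.9M, which suits mobile and CPU deployments.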

yolov5-6.0

Usage
Modifications
New. Improvements
else
Pose estimation
Qt interface
※ Appendix: study notes on the YOLOv5 model files

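Usage

Copy the dataset into the yolov5-6.0 folder.
In the data folder, edit test.yaml: set train, val, nc, and names.
In the models folder (using yolov5s): set nc in yolov5s.yaml.
Download the pretrained weights; make sure the weights match the code version.

train.py: edit the arguments
Training results are saved in the runs folder.
To continue training after an interruption: set the default of resume to True.

val.py: edit the arguments to evaluate the model
detect.py: model inference

How to control the speed of video detection in yolov5

Differences between pretrained models with and without the "6" suffix:
Fix for the libiomp5md.dll error during training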

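Modifications

test1: IoU → DIoU_NMS

Reference
One figure to understand IoU, GIoU, DIoU, CIoU
By default, non_max_suppression in YOLOv5 calls torchvision.ops.nms, which uses plain IoU (a weighted "merge" NMS exists as an option).
Replacing the IoU in NMS with DIoU gives DIoU_NMS, which brings some improvement for occluded and overlapping objects.

CIoU loss performs better than DIoU loss, so why use DIoU_NMS rather than CIoU_NMS?
Because CIoU loss adds an extra influence factor on top of DIoU loss that involves the ground-truth box and is used for regression during training. NMS runs at inference time, where no ground-truth information is available, so CIoU_NMS is not applicable.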
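
To make the difference between IoU and DIoU concrete, here is a minimal framework-free sketch for a single pair of boxes. The function name diou and the plain-tuple box format are assumptions made for illustration; the formula follows the DIoU branch of bbox_iou (IoU minus squared center distance over squared enclosing-box diagonal).

```python
def diou(box_a, box_b, eps=1e-9):
    """Return (IoU, DIoU) for two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Intersection and union areas
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter + eps
    iou = inter / union
    # Squared diagonal of the smallest box enclosing both boxes
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    c2 = cw ** 2 + ch ** 2 + eps
    # Squared distance between the two box centers
    rho2 = ((ax1 + ax2 - bx1 - bx2) ** 2 + (ay1 + ay2 - by1 - by2) ** 2) / 4
    return iou, iou - rho2 / c2


# Two 10x10 boxes side by side, overlapping by half
iou_value, diou_value = diou((0, 0, 10, 10), (5, 0, 15, 10))
```

For these two boxes the DIoU is lower than the IoU because the centers are 5 units apart, so at the same threshold DIoU_NMS is less likely to suppress a neighbouring box whose center is far from the kept box.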

utils/general.py
In the non_max_suppression function, replace

i = torchvision.ops.nms(boxes, scores, iou_thres)  # NMS

with

i = NMS(boxes, scores, iou_thres, GIoU=False, DIoU=True, CIoU=False)

Define the NMS function

def NMS(boxes, scores, iou_thres, GIoU=False, DIoU=False, CIoU=False):
    """
    :param boxes: (Tensor[N, 4]) boxes in (x1, y1, x2, y2) format
    :param scores: (Tensor[N]) score of each box
    :param iou_thres: discard boxes whose IoU (or GIoU/DIoU/CIoU) with a kept box is > iou_thres
    :return: keep (Tensor): int64 tensor with the indices of the kept boxes,
             sorted in decreasing order of score
    """
    # Sort indices by confidence, highest first
    B = torch.argsort(scores, dim=-1, descending=True)
    keep = []
    while B.numel() > 0:
        # Take the remaining box with the highest confidence
        index = B[0]
        keep.append(index)
        if B.numel() == 1:
            break
        # Overlap between the kept box and the rest; GIoU, DIoU or CIoU can be selected
        iou = bbox_iou(boxes[index, :], boxes[B[1:], :], GIoU=GIoU, DIoU=DIoU, CIoU=CIoU)
        # Keep only the boxes at or below the threshold
        inds = torch.nonzero(iou <= iou_thres).reshape(-1)
        B = B[inds + 1]
    # Stack the 0-dim index tensors so the result stays int64 on the boxes' device
    if not keep:
        return torch.zeros(0, dtype=torch.long, device=boxes.device)
    return torch.stack(keep)

Define the bbox_iou function
The IoU function used here, bbox_iou, is taken directly from the YOLOv5 code; it compactly covers the GIoU, DIoU, and CIoU computations.

def bbox_iou(box1, box2, x1y1x2y2=True, GIoU=False, DIoU=False, CIoU=False, eps=1e-9):
    # Returns the IoU of box1 to box2. box1 is 4, box2 is nx4
    box2 = box2.T
 
    # Get the coordinates of bounding boxes
    if x1y1x2y2:  # x1, y1, x2, y2 = box1
        b1_x1, b1_y1, b1_x2, b1_y2 = box1[0], box1[1], box1[2], box1[3]
        b2_x1, b2_y1, b2_x2, b2_y2 = box2[0], box2[1], box2[2], box2[3]
    else:  # transform from xywh to xyxy
        b1_x1, b1_x2 = box1[0] - box1[2] / 2, box1[0] + box1[2] / 2
        b1_y1, b1_y2 = box1[1] - box1[3] / 2, box1[1] + box1[3] / 2
        b2_x1, b2_x2 = box2[0] - box2[2] / 2, box2[0] + box2[2] / 2
        b2_y1, b2_y2 = box2[1] - box2[3] / 2, box2[1] + box2[3] / 2
 
    # Intersection area
    inter = (torch.min(b1_x2, b2_x2) - torch.max(b1_x1, b2_x1)).clamp(0) * \
            (torch.min(b1_y2, b2_y2) - torch.max(b1_y1, b2_y1)).clamp(0)
 
    # Union Area
    w1, h1 = b1_x2 - b1_x1, b1_y2 - b1_y1 + eps
    w2, h2 = b2_x2 - b2_x1, b2_y2 - b2_y1 + eps
    union = w1 * h1 + w2 * h2 - inter + eps
 
    iou = inter / union
    if GIoU or DIoU or CIoU:
        cw = torch.max(b1_x2, b2_x2) - torch.min(b1_x1, b2_x1)  # convex (smallest enclosing box) width
        ch = torch.max(b1_y2, b2_y2) - torch.min(b1_y1, b2_y1)  # convex height
        if CIoU or DIoU:  # Distance or Complete IoU https://arxiv.org/abs/1911.08287v1
            c2 = cw ** 2 + ch ** 2 + eps  # convex diagonal squared
            rho2 = ((b2_x1 + b2_x2 - b1_x1 - b1_x2) ** 2 +
                    (b2_y1 + b2_y2 - b1_y1 - b1_y2) ** 2) / 4  # center distance squared
            if DIoU:
                return iou - rho2 / c2  # DIoU
            elif CIoU:  # https://github.com/Zzh-tju/DIoU-SSD-pytorch/blob/master/utils/box/box_utils.py#L47
                v = (4 / math.pi ** 2) * torch.pow(torch.atan(w2 / h2) - torch.atan(w1 / h1), 2)
                with torch.no_grad():
                    alpha = v / ((1 + eps) - iou + v)
                return iou - (rho2 / c2 + v * alpha)  # CIoU
        else:  # GIoU https://arxiv.org/pdf/1902.09630.pdf
            c_area = cw * ch + eps  # convex area
            return iou - (c_area - union) / c_area  # GIoU
    else:
        return iou  # IoU

test2: Use MobileNetV2 as the network structure

Reference
In models/common.py, implement the MobileNetV2 bottleneck (InvertedResidual) and PW_Conv (pointwise convolution).

#mobilenet  Bottleneck  InvertedResidual  
class BottleneckMOB(nn.Module):  
    #c1:inp  c2:oup s:stride  expand_ratio:t  
    def __init__(self, c1, c2, s, expand_ratio):  
        super(BottleneckMOB, self).__init__()  
        self.s = s  
        hidden_dim = round(c1 * expand_ratio)  
        self.use_res_connect = self.s == 1 and c1 == c2  
        if expand_ratio == 1:  
            self.conv = nn.Sequential(  
                # dw  
                nn.Conv2d(hidden_dim, hidden_dim, 3, s, 1, groups=hidden_dim, bias=False),  
                nn.BatchNorm2d(hidden_dim),  
                nn.ReLU6(inplace=True),  
                # pw-linear  
                nn.Conv2d(hidden_dim, c2, 1, 1, 0, bias=False),  
                nn.BatchNorm2d(c2),  
            )  
        else:  
            self.conv = nn.Sequential(  
                # pw  
                nn.Conv2d(c1, hidden_dim, 1, 1, 0, bias=False),  
                nn.BatchNorm2d(hidden_dim),  
                nn.ReLU6(inplace=True),  
                # dw  
                nn.Conv2d(hidden_dim, hidden_dim, 3, s, 1, groups=hidden_dim, bias=False),  
                nn.BatchNorm2d(hidden_dim),  
                nn.ReLU6(inplace=True),  
                # pw-linear  
                nn.Conv2d(hidden_dim, c2, 1, 1, 0, bias=False),  
                nn.BatchNorm2d(c2),  
            )  

    def forward(self, x):  
        if self.use_res_connect:  
            return x + self.conv(x)  
        else:  
            return self.conv(x)  


class PW_Conv(nn.Module):  
    def __init__(self, c1, c2):  # ch_in, ch_out  
        super(PW_Conv, self).__init__()  
        self.conv = nn.Conv2d(c1, c2, 1, 1, 0, bias=False)  
        self.bn = nn.BatchNorm2d(c2)  
        self.act = nn.ReLU6(inplace=True)  

    def forward(self, x):  
        return self.act(self.bn(self.conv(x)))  
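
As a rough check of why this block is cheap, the sketch below counts the convolution weights of one inverted residual block and compares them with a dense 3x3 convolution at the expanded width. The helper names conv_weights and inverted_residual_weights are hypothetical and BatchNorm parameters are ignored; this is back-of-the-envelope arithmetic, not a profiler.

```python
def conv_weights(c_in, c_out, k, groups=1):
    """Weight count of a bias-free Conv2d(c_in, c_out, k, groups=groups)."""
    return (c_in // groups) * c_out * k * k


def inverted_residual_weights(c1, c2, expand_ratio):
    """Conv weights of the block: optional 1x1 expand, 3x3 depthwise, 1x1 linear projection."""
    hidden = round(c1 * expand_ratio)
    total = 0
    if expand_ratio != 1:
        total += conv_weights(c1, hidden, 1)                 # pw (expand)
    total += conv_weights(hidden, hidden, 3, groups=hidden)  # dw
    total += conv_weights(hidden, c2, 1)                     # pw-linear
    return total


hidden = 32 * 6
block = inverted_residual_weights(32, 32, 6)  # expand 32 -> 192, project back to 32
dense = conv_weights(hidden, hidden, 3)       # ordinary 3x3 conv at 192 channels
```

With c1 = c2 = 32 and expand_ratio = 6, the whole block has far fewer weights than one dense 3x3 convolution at the same hidden width, because the 3x3 step is depthwise (groups equal to the channel count).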


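Modify the code in yolov5 that reads the model configuration file (the parse_model function in models/yolo.py) so that it can call the modules above; only the following part needs to change: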

n = n_ = max(round(n * gd), 1) if n > 1 else n  # depth gain  
if m in [nn.Conv2d, Conv, Bottleneck, SPP, DWConv, MixConv2d, Focus, CrossConv, BottleneckCSP, C3, PW_Conv, BottleneckMOB]:  
    c1, c2 = ch[f], args[0]  

Replace the yolov5s backbone with MobileNetV2 by creating a new model configuration file, yolov5-mobilenetV2.yaml

# parameters  
nc: 3  # number of classes  
depth_multiple: 0.33  # model depth multiple  
width_multiple: 0.50  # layer channel multiple  

# anchors  
anchors:  
  - [116,90, 156,198, 373,326]  # P5/32  
  - [30,61, 62,45, 59,119]  # P4/16  
  - [10,13, 16,30, 33,23]  # P3/8  

# YOLOv5 backbone: mobilenet v2  
backbone:  
  # [from, number, module, args]  
  [[-1, 1, nn.Conv2d, [32, 3, 2]],  # 0-P1/2   oup, k, s     640  
   [-1, 1, BottleneckMOB, [16, 1, 1]],  # 1-P2/4   oup, s, t 320  
   [-1, 2, BottleneckMOB, [24, 2, 6]],  #                    320  
   [-1, 1, PW_Conv, [256]],  #4  output p3                   160  
   [-1, 3, BottleneckMOB, [32, 2, 6]],  # 3-P3/8             160  
   [-1, 4, BottleneckMOB, [64, 1, 6]],  # 5                  80  
   [-1, 1, PW_Conv, [512]],  #7 output p4  6                 40  
   [-1, 3, BottleneckMOB, [96, 2, 6]],  # 7                  80  
   [-1, 3, BottleneckMOB, [160, 1, 6,]], #                   40  
   [-1, 1, BottleneckMOB, [320, 1, 6,]], #                   40  
   [-1, 1, nn.Conv2d, [1280, 1, 1]],     #                   40  
   [-1, 1, SPP, [1024, [5, 9, 13]]],  #11     #              40  
  ]  

# YOLOv5 head  
head:  
  [[-1, 3, BottleneckCSP, [1024, False]],  # 12             40  

   [-1, 1, Conv, [512, 1, 1]],                      #       40  
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],      #       40  
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4-7  #       80  
   [-1, 3, BottleneckCSP, [512, False]],  # 16      #       80  

   [-1, 1, Conv, [256, 1, 1]],                      #       80  
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],      #       160  
   [[-1, 3], 1, Concat, [1]],  # cat backbone P3-4          160  
   [-1, 3, BottleneckCSP, [256, False]],            #       160  
   [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1]],  # 21 (P3/8-small)   #        160  

   [-2, 1, Conv, [256, 3, 2]],                     #       160  
   [[-1, 17], 1, Concat, [1]],  # cat head P4      #       160  
   [-1, 3, BottleneckCSP, [512, False]],           #       160  
   [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1]],  # 25 (P4/16-medium)  #       160  

   [-2, 1, Conv, [512, 3, 2]],                     #       160  
   [[-1, 13], 1, Concat, [1]],  # cat head P5-13   #      160  
   [-1, 3, BottleneckCSP, [1024, False]],          #      160  
   [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1]],  # 29 (P5/32-large)           160  

   [[21, 25, 29], 1, Detect, [nc, anchors]],  # Detect(P5, P4, P3)     nc:number class, na:number of anchors  
  ]  


train.py: when training, set the network configuration argument --cfg to --cfg yolov5-mobilenetV2.yaml

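test3: Add the SE attention module

The blogs in reference 1 and reference 2 start from yolov5x; I started from yolov5s.

Configuration file yolov5s_se.yaml: an SELayer is added as the last layer of the backbone

[-1, 1, SELayer, [1024, 4]], #10

Add SELayer to common.py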

class SELayer(nn.Module):
    def __init__(self, c1, r=16):
        super(SELayer, self).__init__()
        self.avgpool = nn.AdaptiveAvgPool2d(1)
        self.l1 = nn.Linear(c1, c1//r, bias=False)
        self.relu = nn.ReLU(inplace=True)
        self.l2 = nn.Linear(c1//r, c1, bias=False)
        self.sig = nn.Sigmoid()
        
    def forward(self, x):
        b, c, _, _ = x.size()
        y = self.avgpool(x).view(b, c)
        y = self.l1(y)
        y = self.relu(y)
        y = self.l2(y)
        y = self.sig(y)
        y = y.view(b, c, 1, 1)
        return x * y.expand_as(x)
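
The squeeze, excitation, and scale steps of SELayer can be traced by hand on a tiny input. The following is a framework-free sketch on nested lists; the function se_forward and the weight layout are assumptions made for illustration and are not part of the YOLOv5 code.

```python
import math


def se_forward(x, w1, w2):
    """x: [C][H][W] nested lists; w1: [C//r][C]; w2: [C][C//r]. Returns x scaled per channel."""
    # Squeeze: global average pooling gives one scalar per channel
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in x]
    # Excitation: Linear -> ReLU -> Linear -> Sigmoid (no bias, as in SELayer above)
    h = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    s = [1.0 / (1.0 + math.exp(-sum(w * v for w, v in zip(row, h)))) for row in w2]
    # Scale: every pixel of channel c is multiplied by its gate s[c]
    return [[[v * s[c] for v in row] for row in ch] for c, ch in enumerate(x)]


x = [[[1.0, 2.0], [3.0, 4.0]], [[10.0, 10.0], [10.0, 10.0]]]  # C=2, H=W=2
w1 = [[0.0, 0.0]]     # one hidden unit (C // r = 1), all-zero weights
w2 = [[0.0], [0.0]]
y = se_forward(x, w1, w2)  # sigmoid(0) = 0.5, so every channel is halved
```

With all-zero weights every gate is sigmoid(0) = 0.5; during training the two linear layers learn gates that differ per channel, which is what lets the layer emphasize some channels over others.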

Add to yolo.py (inside parse_model)

elif m is SELayer:  # ---------- modified part ----------
    channel, re = args[0], args[1]
    channel = make_divisible(channel * gw, 8) if channel != no else channel
    args = [channel, re]

train.py: when training, set the network configuration argument --cfg to --cfg yolov5s_se.yaml

test4: MobileNetV3 (2) and ShuffleNetV2 (3)

Following YOLOv5-ShuffleNetV2, download the five yaml files.
1. Add the module code
Add these imports to models/common.py

from torch import Tensor
from typing import Callable, Any, List

Add all the ShuffleNetV2- and MobileNetV3-related functions to the bottom of common.py

# -------------------------------------------------------------------------
# ShuffleNetV2
def channel_shuffle(x: Tensor, groups: int) -> Tensor:
    batchsize, num_channels, height, width = x.size()
    channels_per_group = num_channels // groups

    # reshape
    x = x.view(batchsize, groups,
               channels_per_group, height, width)

    x = torch.transpose(x, 1, 2).contiguous()

    # flatten
    x = x.view(batchsize, -1, height, width)

    return x


class conv_bn_relu_maxpool(nn.Module):
    def __init__(self, c1, c2):  # ch_in, ch_out
        super(conv_bn_relu_maxpool, self).__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(c1, c2, kernel_size=3, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(c2),
            nn.ReLU(inplace=True),
        )
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)

    def forward(self, x):
        return self.maxpool(self.conv(x))


class ShuffleNetV2_InvertedResidual(nn.Module):
    def __init__(
            self,
            inp: int,
            oup: int,
            stride: int
    ) -> None:
        super(ShuffleNetV2_InvertedResidual, self).__init__()

        if not (1 <= stride <= 3):
            raise ValueError('illegal stride value')
        self.stride = stride

        branch_features = oup // 2
        assert (self.stride != 1) or (inp == branch_features << 1)

        if self.stride > 1:
            self.branch1 = nn.Sequential(
                self.depthwise_conv(inp, inp, kernel_size=3, stride=self.stride, padding=1),
                nn.BatchNorm2d(inp),
                nn.Conv2d(inp, branch_features, kernel_size=1, stride=1, padding=0, bias=False),
                nn.BatchNorm2d(branch_features),
                nn.ReLU(inplace=True),
            )
        else:
            self.branch1 = nn.Sequential()

        self.branch2 = nn.Sequential(
            nn.Conv2d(inp if (self.stride > 1) else branch_features,
                      branch_features, kernel_size=1, stride=1, padding=0, bias=False),
            nn.BatchNorm2d(branch_features),
            nn.ReLU(inplace=True),
            self.depthwise_conv(branch_features, branch_features, kernel_size=3, stride=self.stride, padding=1),
            nn.BatchNorm2d(branch_features),
            nn.Conv2d(branch_features, branch_features, kernel_size=1, stride=1, padding=0, bias=False),
            nn.BatchNorm2d(branch_features),
            nn.ReLU(inplace=True),
        )

    @staticmethod
    def depthwise_conv(
            i: int,
            o: int,
            kernel_size: int,
            stride: int = 1,
            padding: int = 0,
            bias: bool = False
    ) -> nn.Conv2d:
        return nn.Conv2d(i, o, kernel_size, stride, padding, bias=bias, groups=i)

    def forward(self, x: Tensor) -> Tensor:
        if self.stride == 1:
            x1, x2 = x.chunk(2, dim=1)
            out = torch.cat((x1, self.branch2(x2)), dim=1)
        else:
            out = torch.cat((self.branch1(x), self.branch2(x)), dim=1)

        out = channel_shuffle(out, 2)

        return out


# -------------------------------------------------------------------------
# Pelee: A Real-Time Object Detection System on Mobile Devices

class StemBlock(nn.Module):
    def __init__(self, c1, c2, k=3, s=2, p=None, g=1, act=True):
        super(StemBlock, self).__init__()
        self.stem_1 = Conv(c1, c2, k, s, p, g, act)
        self.stem_2a = Conv(c2, c2 // 2, 1, 1, 0)
        self.stem_2b = Conv(c2 // 2, c2, 3, 2, 1)
        self.stem_2p = nn.MaxPool2d(kernel_size=2, stride=2, ceil_mode=True)
        self.stem_3 = Conv(c2 * 2, c2, 1, 1, 0)

    def forward(self, x):
        stem_1_out = self.stem_1(x)
        stem_2a_out = self.stem_2a(stem_1_out)
        stem_2b_out = self.stem_2b(stem_2a_out)
        stem_2p_out = self.stem_2p(stem_1_out)
        out = self.stem_3(torch.cat((stem_2b_out, stem_2p_out), 1))
        return out


# -------------------------------------------------------------------------


# MobileNetV3

class h_sigmoid(nn.Module):
    def __init__(self, inplace=True):
        super(h_sigmoid, self).__init__()
        self.relu = nn.ReLU6(inplace=inplace)

    def forward(self, x):
        return self.relu(x + 3) / 6


class h_swish(nn.Module):
    def __init__(self, inplace=True):
        super(h_swish, self).__init__()
        self.sigmoid = h_sigmoid(inplace=inplace)

    def forward(self, x):
        y = self.sigmoid(x)
        return x * y


class SELayer(nn.Module):
    def __init__(self, channel, reduction=4):
        super(SELayer, self).__init__()
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(channel, channel // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channel // reduction, channel),
            h_sigmoid()
        )

    def forward(self, x):
        b, c, _, _ = x.size()
        y = self.avg_pool(x)
        y = y.view(b, c)
        y = self.fc(y).view(b, c, 1, 1)
        return x * y


class conv_bn_hswish(nn.Module):
    """
    This equals to
    def conv_3x3_bn(inp, oup, stride):
        return nn.Sequential(
            nn.Conv2d(inp, oup, 3, stride, 1, bias=False),
            nn.BatchNorm2d(oup),
            h_swish()
        )
    """

    def __init__(self, c1, c2, stride):
        super(conv_bn_hswish, self).__init__()
        self.conv = nn.Conv2d(c1, c2, 3, stride, 1, bias=False)
        self.bn = nn.BatchNorm2d(c2)
        self.act = h_swish()

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))

    def fuseforward(self, x):
        return self.act(self.conv(x))


class MobileNetV3_InvertedResidual(nn.Module):
    def __init__(self, inp, oup, hidden_dim, kernel_size, stride, use_se, use_hs):
        super(MobileNetV3_InvertedResidual, self).__init__()
        assert stride in [1, 2]

        self.identity = stride == 1 and inp == oup

        if inp == hidden_dim:
            self.conv = nn.Sequential(
                # dw
                nn.Conv2d(hidden_dim, hidden_dim, kernel_size, stride, (kernel_size - 1) // 2, groups=hidden_dim,
                          bias=False),
                nn.BatchNorm2d(hidden_dim),
                h_swish() if use_hs else nn.ReLU(inplace=True),
                # Squeeze-and-Excite
                SELayer(hidden_dim) if use_se else nn.Sequential(),
                # pw-linear
                nn.Conv2d(hidden_dim, oup, 1, 1, 0, bias=False),
                nn.BatchNorm2d(oup),
            )
        else:
            self.conv = nn.Sequential(
                # pw
                nn.Conv2d(inp, hidden_dim, 1, 1, 0, bias=False),
                nn.BatchNorm2d(hidden_dim),
                h_swish() if use_hs else nn.ReLU(inplace=True),
                # dw
                nn.Conv2d(hidden_dim, hidden_dim, kernel_size, stride, (kernel_size - 1) // 2, groups=hidden_dim,
                          bias=False),
                nn.BatchNorm2d(hidden_dim),
                # Squeeze-and-Excite
                SELayer(hidden_dim) if use_se else nn.Sequential(),
                h_swish() if use_hs else nn.ReLU(inplace=True),
                # pw-linear
                nn.Conv2d(hidden_dim, oup, 1, 1, 0, bias=False),
                nn.BatchNorm2d(oup),
            )

    def forward(self, x):
        y = self.conv(x)
        if self.identity:
            return x + y
        else:
            return y


2. Update the parsing code so that YOLOv5 knows about the newly added InvertedResidual modules
Around line 265 of models/yolo.py

        if m in [Conv, GhostConv, Bottleneck, GhostBottleneck, SPP, SPPF, DWConv, MixConv2d, Focus, CrossConv,
                 BottleneckCSP, C3, C3TR, C3SPP, C3Ghost, ShuffleNetV2_InvertedResidual, StemBlock,
                 conv_bn_relu_maxpool, conv_bn_hswish, MobileNetV3_InvertedResidual]:

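3. Configuration
Paste the downloaded yaml files into the models directory and adjust their parameters (see the notes on the configuration parameters).
In train.py, change cfg.
Example: MobileNetV3 Small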

test5: Replace Focal Loss with VFLoss (Varifocal Loss)

VarifocalNet
utils/loss.py
Replace FL (FocalLoss) in ComputeLoss

class VFLoss(nn.Module):
    def __init__(self, loss_fcn, gamma=1.5, alpha=0.25):
        super(VFLoss, self).__init__()
        # loss_fcn must be nn.BCEWithLogitsLoss()
        self.loss_fcn = loss_fcn
        self.gamma = gamma
        self.alpha = alpha
        self.reduction = loss_fcn.reduction
        self.loss_fcn.reduction = 'none'  # required to apply VFL to each element

    def forward(self, pred, true):
        loss = self.loss_fcn(pred, true)  # per-element BCE

        pred_prob = torch.sigmoid(pred)  # prob from logits

        # Positives are weighted by their target score, negatives by alpha * |p - q|^gamma
        focal_weight = true * (true > 0.0).float() + \
            self.alpha * (pred_prob - true).abs().pow(self.gamma) * (true <= 0.0).float()
        loss *= focal_weight

        if self.reduction == 'mean':
            return loss.mean()
        elif self.reduction == 'sum':
            return loss.sum()
        else:
            return loss
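
The weighting in VFLoss is easier to see for a single prediction. The scalar sketch below uses the same formula; the function name varifocal_loss and the sample logits are illustrative assumptions, and the defaults mirror the class above.

```python
import math


def varifocal_loss(logit, target, alpha=0.25, gamma=1.5):
    """Varifocal loss for one logit/target pair.

    target is 0 for a negative and an IoU-like quality score in (0, 1] for a positive.
    """
    p = 1.0 / (1.0 + math.exp(-logit))  # sigmoid
    bce = -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))
    # Positives are weighted by their target score, negatives by alpha * |p - q|^gamma
    weight = target if target > 0.0 else alpha * abs(p - target) ** gamma
    return weight * bce


easy_negative = varifocal_loss(-4.0, 0.0)  # confident and correct background
hard_negative = varifocal_loss(2.0, 0.0)   # background predicted as an object
positive = varifocal_loss(2.0, 0.8)        # positive with quality score 0.8
```

An easy negative contributes almost nothing, a hard negative is penalized much more, and a positive keeps its full BCE term scaled only by its quality score. This asymmetry is the difference from focal loss, which down-weights positives and negatives in the same way.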

test6: Training with the Transformer modules built into v6.0

Transformer paper

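test7: Adding the CBAM module (cbam, bifpn, carafe, bot (CTR3), cooratt, involution)

Purpose: help the network find regions of interest in images that cover a large area.

See "ASFFV5 and adding the CBAM module" (CBAM) and its code.
The author of that code also changed the LeakyReLU in BottleneckCSP to SiLU (common.py, lines 209-210). SiLU works slightly better.

asffv5 detects at the end of the head: Detect can be changed to ASFF_Detect, but it currently fails to run in testing, so it is not used.
involution could not run // runs as of 11.6
Coordinate Attention (cooratt) currently gives the best results.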

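New. Improvements

BottleneckCSP improvements

Change 1, BottleneckCSP: LeakyReLU → SiLU

When the dataset is too small

Three-frame differencing
Web crawling
imgaug + weather augmentation
Change mosaic and mixup in data/hyps/hyp.scratch.yaml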

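For small objects

In yolov5, more data augmentation is not always better.

In data/hyps/hyp.scratch.yaml:

Set mosaic to 0. When there are very many small objects, turning mosaic off actually improves training results.

In data/hyps/hyp.finetune.yaml:

Reduce scale from 0.898 to 0.4 or 0.5.
yolov5: adding a detection layer for small-object recognition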
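
One way to apply such overrides without editing the file by hand is sketched below. It rewrites top-level "key: value" lines in the text of a hyperparameter file using only the standard library; the helper override_hyp and the sample text (including its comments) are assumptions for illustration, not the contents of the real file.

```python
import re


def override_hyp(yaml_text, overrides):
    """Rewrite top-level `key: value` lines, leaving comments and other keys untouched."""
    out = []
    for line in yaml_text.splitlines():
        m = re.match(r"^(\w+):\s*(\S+)(.*)$", line)
        if m and m.group(1) in overrides:
            line = f"{m.group(1)}: {overrides[m.group(1)]}{m.group(3)}"
        out.append(line)
    return "\n".join(out)


# Illustrative sample text, not the real file contents
sample = (
    "scale: 0.898  # image scale (+/- gain)\n"
    "mosaic: 1.0  # image mosaic (probability)\n"
    "mixup: 0.0"
)
patched = override_hyp(sample, {"scale": 0.5, "mosaic": 0.0})
```

Writing the patched text to a new file and passing that file to train.py keeps the original hyperparameter file intact for comparison runs.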

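For class imbalance

Arguments in train.py: existing code already addresses this (the --image-weights option).
It works by sampling images at different frequencies according to the class distribution:
1. Read the ground-truth labels of the samples into a list.
2. Count the boxes of each class in the training list, then give each class a weight equal to the reciprocal of its box count (the more boxes a class has, the smaller its weight), which yields a per-class weight histogram.
3. At each epoch, select the training images according to these class weights, e.g. with random.choices(population, weights=None, *, cum_weights=None, k=1) to rewrite the training image indices; this balances the samples.
Focal loss in utils/loss.py
In object detection, focal loss mainly targets the foreground/background imbalance, i.e. among the anchor boxes there are too many background samples and too few positives.
Using focal loss did not give good results; it actually made them worse.
Class imbalance among training samples, part 2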
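
Steps 1 to 3 above can be sketched in a few lines of standard-library Python. The function names and the toy dataset are hypothetical; the sketch only illustrates inverse-frequency class weights and weighted image sampling, not the exact YOLOv5 implementation.

```python
import random
from collections import Counter


def class_weights(labels_per_image, num_classes):
    """Inverse box-count weight per class, normalized to sum to 1."""
    counts = Counter(c for labels in labels_per_image for c in labels)
    inv = [1.0 / max(counts.get(c, 0), 1) for c in range(num_classes)]
    total = sum(inv)
    return [w / total for w in inv]


def image_weights(labels_per_image, cw):
    """Weight of an image = sum of the class weights of the boxes it contains."""
    return [sum(cw[c] for c in labels) for labels in labels_per_image]


# Hypothetical toy dataset: class ids of the boxes in each image
labels = [[0, 0, 0], [0, 0, 0], [0, 0, 0], [1]]
cw = class_weights(labels, num_classes=2)
iw = image_weights(labels, cw)

# Draw the image indices for one epoch, favouring images with rare classes
rng = random.Random(0)
epoch_indices = rng.choices(range(len(labels)), weights=iw, k=1000)
```

In this toy set class 1 has one box against nine for class 0, so the single image containing it gets the largest weight and is drawn far more often than under uniform sampling.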

For complex backgrounds

Add an attention mechanism: SE, CBAM, CA (see test3 and test7)

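else

Adding attention mechanisms to yolov5, with EPSA as an example

Improving the loss function

yolov5 soft pruning (1): refactoring the model code; (2); (3)
Fusing convolution and BN layers

Rotated objects
Column
Theory: object detection with YOLOv5, how to improve model metrics (precision, recall, mAP, etc.). Datasets, AI

Contains many errors:
Vertical-rotation augmentation; the loss changes how confidence is assigned; all classes take part in NMS
Replacing the PANet layers with BiFPN

YoloV5 + deepsort + Fast-ReID: a complete person re-identification system

YOLO-Fastest: training on your own data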

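Pose estimation

yolov5 + pose estimation
Reproducing the AlphaPose inference demo
AlphaPose_yolov5 reproduction
Reproducing the AlphaPose_yolov4 inference demo

Google's Blaze family of fast face, hand, and body pose algorithms (Zhihu)
Project page
BlazePose: On-device Real-time Body Pose tracking
CVPRW 2020 paper, code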

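Qt interface

Building a UI with PyQt5 for deep-learning object detection + tracking (yolov3 + siamrpn) (3)
YOLOv5 detection UI implemented with PyQt5
Building a YOLOv5 object-detection UI with PyQt
Adding a UI to YoloV5 with PyQt5 (1)
Design and implementation of a pest and rodent identification and control system for catering, based on MobileNet-v3 and YOLOv5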

pip install pyQt5 -i https://pypi.tuna.tsinghua.edu.cn/simple

pip install pyqt5-tools  -i https://pypi.tuna.tsinghua.edu.cn/simple

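※ Appendix: study notes on the YOLOv5 model files

NMS

The detector outputs many candidate boxes with different confidences. NMS picks the box with the highest confidence as the target box and computes its IoU (yolov5 uses plain IoU) with each remaining candidate box.
Candidates whose IoU with the target box exceeds a set threshold are discarded. A value of 0.5 is common; the default in yolov5 is 0.45 (--iou-thres), while --conf-thres sets the confidence threshold. The boxes are sorted by score, the highest-scoring one is added to the output list, and the procedure repeats on the boxes that remain.

To replace NMS with DIoU_NMS: the bbox_iou function in general.py under utils offers several IoU variants to choose from.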
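
The procedure described above fits in a few lines. This is a plain-Python sketch of greedy NMS; the function names and the example boxes are made up for illustration, and a real pipeline would use the tensor version instead.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_thres=0.45):
    """Return the indices of the kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)  # highest remaining score becomes the target box
        keep.append(best)
        # Drop every candidate that overlaps the target box too much
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_thres]
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
kept = greedy_nms(boxes, scores)
```

Here the second box overlaps the first with an IoU of about 0.68 and is suppressed, while the third box does not overlap anything and is kept. Swapping iou for a DIoU function turns this into DIoU_NMS.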

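YOLOv5

This diagram is not quite official, because the bottleneck used in the code is not a plain CSP block, but its layout is a useful reference. This is the fifth generation; it differs slightly from the third, and the fourth generation already performed much better.

Input: 608 * 608 * 3, where 3 is the RGB channels and 608 is the size every image is padded or resized to. Training is done at three scales. This is a data-processing step before Focus.
Why multi-scale training: different scales make the model better at recognizing small-scale objects.

Mosaic augmentation: v4 and v5. Position & small objects
Walkthrough of the yolov5 data augmentation code
'data/hyps/hyp.scratch.yaml'
Images are stitched together with random scaling, random cropping, and random arrangement, which works well for small-object detection.
hsv (color), degrees (rotation angle), translate (translation), scale (scaling), mosaic (1.0: mosaic is used; if the objects are all fairly large it can be turned off, which is better for the model). hyp.scratch holds these values and controls whether each augmentation is applied.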

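backbone
Focus (304 * 304 * 32): a slicing operation. It takes every other pixel to cut the image into four pieces and joins them with concat, turning 608 * 608 * 3 into 304 * 304 * 12; a convolution with 32 kernels then yields 32 channels, which helps later learning.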
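
The slicing step can be checked on a tiny grid. This sketch uses nested lists instead of tensors; focus_slices and focus_shape are illustrative helper names, and the slice order follows the usual even/odd row and column pattern.

```python
def focus_slices(channel):
    """Split one H x W channel into four (H/2) x (W/2) slices by taking every other pixel."""
    return [
        [row[0::2] for row in channel[0::2]],  # even rows, even columns
        [row[0::2] for row in channel[1::2]],  # odd rows, even columns
        [row[1::2] for row in channel[0::2]],  # even rows, odd columns
        [row[1::2] for row in channel[1::2]],  # odd rows, odd columns
    ]


def focus_shape(h, w, c):
    """Shape after slicing and concatenation: half the size, four times the channels."""
    return h // 2, w // 2, c * 4


grid = [[r * 4 + col for col in range(4)] for r in range(4)]  # values 0..15
slices = focus_slices(grid)
out_shape = focus_shape(608, 608, 3)
```

No pixel is discarded: the four slices together contain every value of the input, so spatial resolution is traded for channels before the first convolution.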

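CBL (convolution + BN + activation):
Why is there a BN layer after the convolution, and what does it do? It counters exploding gradients: differently scaled inputs make the network oscillate, and BN keeps the data roughly within a fixed range.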
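
The "keeps the data within a range" effect can be seen with a one-dimensional batch. This is a minimal sketch of the normalization step only; the function batch_norm and its defaults are assumptions, and running statistics and learned parameters are left out.

```python
import math


def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize a batch to zero mean and unit variance, then scale and shift."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in values]


small = batch_norm([0.1, 0.2, 0.3, 0.4])
large = batch_norm([100.0, 200.0, 300.0, 400.0])
```

Both batches come out almost identical even though the inputs differ by a factor of 1000, which is why the layers that follow see inputs in a stable range.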

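Residual network:
The first branch goes through convolution operations; the second branch skips them and is added directly to the output of the first.
ResNet, VGG, etc.

csp1: a new kind of residual structure, built from three convolution layers and X residual units. The residual units are placed inside the block and then combined through a series of convolution and concatenation operations; it is an application of the residual unit. [The X residual components are neither purely serial nor purely parallel; they are a combined operation.]
csp2: does not use residual units.

SPP:
Made up of several pooling layers.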

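yolov5 comes in four versions: s, m, l, x. They differ in network depth and width. The s model is the smallest, with a depth multiple of 0.33 and a width multiple of 0.50. Depth sets how many times the convolution and CSP combinations repeat in the structure; the larger the depth and width multiples, the more layers and the more channels the network has. (s and l are worth trying.)
'models/yolov5s.yaml'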
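
How the two multiples turn a yaml entry into an actual layer can be sketched as follows. The depth rule is the depth-gain line quoted from parse_model earlier in this post; rounding channels up to a multiple of 8 is an assumption about make_divisible, and scaled_layer is a hypothetical helper.

```python
import math


def make_divisible(x, divisor):
    """Round x up to the nearest multiple of divisor (assumed behaviour)."""
    return math.ceil(x / divisor) * divisor


def scaled_layer(number, channels, depth_multiple, width_multiple):
    """Return (repeats, output channels) after applying the depth and width multiples."""
    n = max(round(number * depth_multiple), 1) if number > 1 else number
    c = make_divisible(channels * width_multiple, 8)
    return n, c


s_model = scaled_layer(9, 1024, 0.33, 0.50)  # yolov5s settings
n_model = scaled_layer(9, 1024, 0.33, 0.25)  # Nano: same depth, half the width of s
```

A yaml entry that asks for 9 repeats and 1024 channels becomes 3 repeats and 512 channels in the s model; the Nano model keeps the 3 repeats but halves the channels again to 256.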

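nc: number of classes
anchors: anchor boxes
Initial anchor boxes that YOLOv5 sets for the COCO dataset: [10,13, 16,30, 33,23] (P3/8), [30,61, 62,45, 59,119] (P4/16), [116,90, 156,198, 373,326] (P5/32)
To add another scale, change the backbone.

neck: FPN and PAN
Why upsample and downsample? Upsampling brings the high-level features back to a higher resolution, which strengthens small-object learning; low-level and high-level features are fused and learned together.
FPN is the top-down path that upsamples deep features; PAN adds a bottom-up path that downsamples again, so each path reuses what the other has learned.

The different paths produce three scales → the output.

Loss function: yolov5 uses CIoU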

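Code

models

common.py: everything defined here is used many times

DWConv: depthwise (separable) convolution; not used in yolo (worth trying)
Conv: definition of the convolution layer: Conv2d + BN layer + activation function
Bottleneck: a basic building block of the backbone
BottleneckCSP: a combination of backbone blocks
C3: Bottleneck fused with three convolution layers
SPP
Focus
Contract
Concat: used after Focus cuts the image into four slices; used many times in yolov5
NMS: the default values are set here.
Detections: what the detection results use
Classify: second-stage classification (classifying again, e.g. license plates: first recognize that it is a plate, then recognize the plate number). Flatten: often placed before a fully connected layer.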

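experimental.py: the author's experiments; can be skipped

export.py: can be skipped

ONNX: commonly used on mobile

yolo.py …

Model: the whole model that yolov5 defines.
forward: the forward inference pass.
forward_once: inside the whole model (forward inference, bias handling).
For this file, it is enough to trace through the Model class.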

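utils: utility files

activations.py: defines the activation functions.

These can also be modified.
NMS, multi-scale, activation functions, and warm-up learning-rate scheduling can all be modified.

autoanchor.py: automatic anchor boxes

datasets.py: data processing

general.py: fairly mechanical processing

Matching labels to the model
How the text in the txt files is handled
How image xy coordinates are converted to xywh

google_utils.py

Rarely used

loss.py: the loss functions; this is where better loss functions can be swapped in

plots.py: drawing points and boxes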

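train.py

The file that gets changed the most.

detect.py

Needed for engineering work: output and captured data (how many people, heads, helmets) all come from here.

The part that changes the most: detection → drawing boxes.
Modifications usually start from the original function (better to create a new function than to overwrite the original).

Next comes deciding where the center point falls; person is 0, head is 1, helmet is 2.
First detect person, then check whether the center points of the head and helmet boxes fall inside the person box: if a center falls inside, the box is drawn; if not, it is not drawn.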
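
The center-point rule can be written down directly. This is a minimal sketch assuming detections arrive as (class_id, box) pairs with boxes in (x1, y1, x2, y2) pixels; the function names and the input format are hypothetical and not taken from detect.py.

```python
PERSON, HEAD, HELMET = 0, 1, 2


def center(box):
    """Center point of an (x1, y1, x2, y2) box."""
    x1, y1, x2, y2 = box
    return (x1 + x2) / 2, (y1 + y2) / 2


def inside(point, box):
    """True if the point lies within the box (borders included)."""
    x, y = point
    return box[0] <= x <= box[2] and box[1] <= y <= box[3]


def boxes_to_draw(detections):
    """Keep head/helmet boxes whose center falls inside at least one person box."""
    persons = [box for cls, box in detections if cls == PERSON]
    kept = []
    for cls, box in detections:
        if cls in (HEAD, HELMET) and any(inside(center(box), p) for p in persons):
            kept.append((cls, box))
    return kept


detections = [
    (PERSON, (0, 0, 100, 200)),
    (HEAD, (40, 10, 60, 30)),       # center (50, 20) lies inside the person box
    (HELMET, (300, 10, 320, 30)),   # center (310, 20) lies outside every person box
]
drawn = boxes_to_draw(detections)
```

Only the head box is returned here; the helmet far to the right is dropped because its center does not fall inside any person box.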

Original article: https://blog.csdn.net/zrg_hzr_1/article/details/120934620