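YOLOv5 is an object detection algorithm with excellent speed and accuracy, but in practice we have all kinds of requirements that the official code may not meet. Detecting small objects, for example, calls for a backbone that handles small objects well, while porting to an embedded device calls for a lightweight backbone. Different tasks therefore need different backbones. With that, let's get into today's tutorial, which swaps a ShuffleNetV2 backbone into YOLOv5.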
Official YOLOv5 repository:
GitHub - ultralytics/yolov5: YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
Define the modules you need in models/common.py
First, import the dependencies at the top of common.py:
from torch import Tensor
from typing import Callable, Any, List
Then add the ShuffleNetV2_InvertedResidual class, the channel_shuffle function it depends on, and the conv_bn_relu_maxpool class to the bottom of common.py:
def channel_shuffle(x: Tensor, groups: int) -> Tensor:
    batchsize, num_channels, height, width = x.size()
    channels_per_group = num_channels // groups
    # reshape
    x = x.view(batchsize, groups,
               channels_per_group, height, width)
    x = torch.transpose(x, 1, 2).contiguous()
    # flatten
    x = x.view(batchsize, -1, height, width)
    return x
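To see what the reshape, transpose, flatten sequence above does to the channel order, the same permutation can be traced on plain channel indices. This is a minimal pure-Python sketch rather than the tensor implementation, and `shuffled_channel_order` is a hypothetical helper introduced only for illustration.

```python
def shuffled_channel_order(num_channels: int, groups: int) -> list:
    """Return the source channel index that lands at each output position."""
    if num_channels % groups != 0:
        raise ValueError("num_channels must be divisible by groups")
    channels_per_group = num_channels // groups
    # "reshape": split the flat channel axis into (groups, channels_per_group)
    grid = [
        [g * channels_per_group + c for c in range(channels_per_group)]
        for g in range(groups)
    ]
    # "transpose": swap the two axes -> (channels_per_group, groups)
    transposed = [
        [grid[g][c] for g in range(groups)]
        for c in range(channels_per_group)
    ]
    # "flatten": read the transposed grid row by row
    return [idx for row in transposed for idx in row]


print(shuffled_channel_order(6, 2))  # [0, 3, 1, 4, 2, 5]
```

With two groups, the channels of the two halves end up interleaved, so the next block's channel split sees features that came from both branches.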
class conv_bn_relu_maxpool(nn.Module):
    def __init__(self, c1, c2):  # ch_in, ch_out
        super(conv_bn_relu_maxpool, self).__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(c1, c2, kernel_size=3, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(c2),
            nn.ReLU(inplace=True),
        )
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)

    def forward(self, x):
        return self.maxpool(self.conv(x))
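The "P2/4" label this stem carries in the config follows from the standard convolution and pooling output-size formula, floor((n + 2p - k) / s) + 1, applied twice with k=3, s=2, p=1. A small sketch, assuming a 640-pixel input side (the input size is an assumption, the tutorial does not state one); `out_size` is a hypothetical helper.

```python
def out_size(n: int, kernel: int = 3, stride: int = 2, padding: int = 1) -> int:
    # Output side length of a conv or max-pool layer with dilation 1 and floor rounding.
    return (n + 2 * padding - kernel) // stride + 1


side = 640                          # assumed input resolution
after_conv = out_size(side)         # 3x3 conv, stride 2
after_pool = out_size(after_conv)   # 3x3 max-pool, stride 2
print(after_conv, after_pool)       # 320 160
```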
class ShuffleNetV2_InvertedResidual(nn.Module):
    def __init__(
        self,
        inp: int,
        oup: int,
        stride: int
    ) -> None:
        super(ShuffleNetV2_InvertedResidual, self).__init__()
        if not (1 <= stride <= 3):
            raise ValueError('illegal stride value')
        self.stride = stride
        branch_features = oup // 2
        assert (self.stride != 1) or (inp == branch_features << 1)
        if self.stride > 1:
            self.branch1 = nn.Sequential(
                self.depthwise_conv(inp, inp, kernel_size=3, stride=self.stride, padding=1),
                nn.BatchNorm2d(inp),
                nn.Conv2d(inp, branch_features, kernel_size=1, stride=1, padding=0, bias=False),
                nn.BatchNorm2d(branch_features),
                nn.ReLU(inplace=True),
            )
        else:
            self.branch1 = nn.Sequential()
        self.branch2 = nn.Sequential(
            nn.Conv2d(inp if (self.stride > 1) else branch_features,
                      branch_features, kernel_size=1, stride=1, padding=0, bias=False),
            nn.BatchNorm2d(branch_features),
            nn.ReLU(inplace=True),
            self.depthwise_conv(branch_features, branch_features, kernel_size=3, stride=self.stride, padding=1),
            nn.BatchNorm2d(branch_features),
            nn.Conv2d(branch_features, branch_features, kernel_size=1, stride=1, padding=0, bias=False),
            nn.BatchNorm2d(branch_features),
            nn.ReLU(inplace=True),
        )

    @staticmethod
    def depthwise_conv(
        i: int,
        o: int,
        kernel_size: int,
        stride: int = 1,
        padding: int = 0,
        bias: bool = False
    ) -> nn.Conv2d:
        return nn.Conv2d(i, o, kernel_size, stride, padding, bias=bias, groups=i)

    def forward(self, x: Tensor) -> Tensor:
        if self.stride == 1:
            x1, x2 = x.chunk(2, dim=1)
            out = torch.cat((x1, self.branch2(x2)), dim=1)
        else:
            out = torch.cat((self.branch1(x), self.branch2(x)), dim=1)
        out = channel_shuffle(out, 2)
        return out
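The channel bookkeeping in this block is easy to get wrong when writing the config, so here is a small pure-Python sketch of it. It follows the constructor's checks and the two forward paths but builds no layers, and it raises ValueError where the class uses assert. `block_channels` is a hypothetical helper, not part of torchvision or YOLOv5.

```python
def block_channels(inp: int, oup: int, stride: int) -> dict:
    """Channel counts seen by each branch of the ShuffleNetV2 unit."""
    if not 1 <= stride <= 3:
        raise ValueError("illegal stride value")
    branch_features = oup // 2
    if stride == 1:
        # The input is split in half, so it must already have 2 * branch_features channels.
        if inp != branch_features * 2:
            raise ValueError("stride 1 requires inp == 2 * (oup // 2)")
        shortcut_out = inp // 2         # x1 passes through untouched
        branch2_in = branch_features    # x2 goes through branch2
    else:
        shortcut_out = branch_features  # branch1 projects inp -> branch_features
        branch2_in = inp                # branch2 sees the full input
    return {
        "shortcut_out": shortcut_out,
        "branch2_in": branch2_in,
        "branch2_out": branch_features,
        "out": shortcut_out + branch_features,
    }


print(block_channels(64, 64, 1))  # stride-1 unit: channels are preserved
print(block_channels(8, 64, 2))   # stride-2 unit: channels change from 8 to 64
```

The practical consequence for the config is that a stride-1 row must request the same number of output channels that the previous layer produced.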
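In the first step we defined the ShuffleNetV2_InvertedResidual class in models/common.py, but YOLOv5 does not yet know that the class exists. We still have to register the class name at the point where YOLOv5 parses the model configuration.
Register the new classes in the parse_model function in models/yolo.py, so that when the YAML file is parsed, the string "ShuffleNetV2_InvertedResidual" in the config is mapped to the right class. Compared with stock YOLOv5, two classes need to be registered in total: ShuffleNetV2_InvertedResidual and conv_bn_relu_maxpool.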
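# Note: CoordAtt appears to come from a separate modification rather than stock YOLOv5;
# leave it out if your common.py does not define it.
if m in [Conv, GhostConv, Bottleneck, GhostBottleneck, SPP, SPPF, DWConv, MixConv2d, Focus, CrossConv,
         BottleneckCSP, C3, C3TR, C3SPP, C3Ghost, CoordAtt, ShuffleNetV2_InvertedResidual, conv_bn_relu_maxpool]: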
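Registration matters because modules in this list get their input channel count prepended and their output channel count scaled by width_multiple before they are constructed. The sketch below reproduces that argument rewriting as we understand it from the upstream parse_model, in simplified form. Treat the details as assumptions to check against your own checkout: the rounding up to a multiple of 8, the local re-implementation of make_divisible, and the hypothetical helper name `resolve_args`.

```python
import math


def make_divisible(x: float, divisor: int) -> int:
    # Round up to the nearest multiple of divisor (assumed to match YOLOv5's helper).
    return math.ceil(x / divisor) * divisor


def resolve_args(c1: int, yaml_args: list, width_multiple: float) -> list:
    """Turn YAML args [c2, ...] into constructor args [c1, scaled_c2, ...]."""
    c2 = make_divisible(yaml_args[0] * width_multiple, 8)
    return [c1, c2, *yaml_args[1:]]


# Row "[-1, 1, ShuffleNetV2_InvertedResidual, [128, 2]]" with width_multiple 0.5,
# assuming (for illustration only) that the previous layer produced 8 channels.
print(resolve_args(8, [128, 2], 0.5))  # [8, 64, 2]
```

A class that is missing from the list would, as far as we can tell, receive the YAML args unchanged, so ShuffleNetV2_InvertedResidual would be called without its inp argument and fail at construction.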
Create a new YAML file named yolov5s-shufflenetv2.yaml with the following configuration:
# parameters
nc: 80  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 0.5  # layer channel multiple

# anchors
anchors:
  - [4,5, 8,10, 13,16]  # P3/8
  - [23,29, 43,55, 73,105]  # P4/16
  - [146,217, 231,300, 335,433]  # P5/32

# custom backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, conv_bn_relu_maxpool, [3]],  # 0-P2/4
   [-1, 1, ShuffleNetV2_InvertedResidual, [128, 2]],  # 1-P3/8
   [-1, 3, ShuffleNetV2_InvertedResidual, [128, 1]],  # 2
   [-1, 1, ShuffleNetV2_InvertedResidual, [256, 2]],  # 3-P4/16
   [-1, 7, ShuffleNetV2_InvertedResidual, [256, 1]],  # 4
   [-1, 1, ShuffleNetV2_InvertedResidual, [512, 2]],  # 5-P5/32
   [-1, 3, ShuffleNetV2_InvertedResidual, [512, 1]],  # 6
  ]

# YOLOv5 head
head:
  [[-1, 1, Conv, [128, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P4
   [-1, 1, C3, [128, False]],  # 10
   [-1, 1, Conv, [128, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 2], 1, Concat, [1]],  # cat backbone P3
   [-1, 1, C3, [128, False]],  # 14 (P3/8-small)
   [-1, 1, Conv, [128, 3, 2]],
   [[-1, 11], 1, Concat, [1]],  # cat head P4
   [-1, 1, C3, [128, False]],  # 17 (P4/16-medium)
   [-1, 1, Conv, [128, 3, 2]],
   [[-1, 7], 1, Concat, [1]],  # cat head P5
   [-1, 1, C3, [128, False]],  # 20 (P5/32-large)
   [[14, 17, 20], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]
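Index mistakes in the head are an easy way to break a model after swapping the backbone, because every Concat row and the Detect row refer to other layers by absolute index. The following sketch re-encodes the 22 rows above as (from, kind, stride factor) entries and checks that each Concat joins feature maps of equal stride. The `LAYERS` table and `layer_strides` are hypothetical names, transcribed by hand from the config; YOLOv5 does not provide them.

```python
from fractions import Fraction

HALF = Fraction(1, 2)  # a 2x upsample halves the cumulative stride

# (from, kind, factor): factor multiplies the cumulative stride of the input.
LAYERS = [
    (-1, "stem", 4),               # 0  conv_bn_relu_maxpool
    (-1, "block", 2),              # 1  ShuffleNetV2_InvertedResidual, stride 2
    (-1, "block", 1),              # 2
    (-1, "block", 2),              # 3
    (-1, "block", 1),              # 4
    (-1, "block", 2),              # 5
    (-1, "block", 1),              # 6
    (-1, "conv", 1),               # 7
    (-1, "upsample", HALF),        # 8
    ([-1, 4], "concat", 1),        # 9
    (-1, "c3", 1),                 # 10
    (-1, "conv", 1),               # 11
    (-1, "upsample", HALF),        # 12
    ([-1, 2], "concat", 1),        # 13
    (-1, "c3", 1),                 # 14
    (-1, "conv", 2),               # 15
    ([-1, 11], "concat", 1),       # 16
    (-1, "c3", 1),                 # 17
    (-1, "conv", 2),               # 18
    ([-1, 7], "concat", 1),        # 19
    (-1, "c3", 1),                 # 20
    ([14, 17, 20], "detect", 1),   # 21
]


def layer_strides(layers):
    strides = []

    def stride_of(idx):
        # Index -1 before layer 0 is the input image, which has stride 1.
        return Fraction(1) if idx < 0 else strides[idx]

    for i, (src, kind, factor) in enumerate(layers):
        sources = src if isinstance(src, list) else [src]
        # Negative indices are relative to the current layer, as in the YAML.
        inputs = [stride_of(i + s if s < 0 else s) for s in sources]
        if kind == "concat" and len(set(inputs)) != 1:
            raise ValueError(f"layer {i}: concat inputs have strides {inputs}")
        strides.append(inputs if kind == "detect" else inputs[0] * factor)
    return strides


strides = layer_strides(LAYERS)
print([int(s) for s in strides[-1]])  # [8, 16, 32]
```

The three strides reaching Detect match the P3/8, P4/16 and P5/32 comments in the config.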
With the steps above done, we can point train.py at the config file we just created:
parser.add_argument('--cfg', type=str, default='yolov5s-shufflenetv2.yaml', help='model.yaml path')
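Changing the default is optional, since the same option can also be supplied on the command line. A minimal standalone sketch of how argparse resolves it; this parser only mimics the one line quoted above and is not train.py itself.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--cfg', type=str, default='yolov5s-shufflenetv2.yaml', help='model.yaml path')

# No flag given: the edited default is used.
default_cfg = parser.parse_args([]).cfg
# Flag given: the command-line value overrides the default.
override_cfg = parser.parse_args(['--cfg', 'models/yolov5s.yaml']).cfg

print(default_cfg)
print(override_cfg)
```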
After that, the network can be trained with the yolov5s-shufflenetv2.yaml config file.