• Machine Learning Weekly Notes (Week 38: Semantic Segmentation), 2024.5.6~2024.5.12


    Contents

    Abstract

    1 DeepLabV3+ Implementation Approach

    Prediction

    ① Backbone network

    ② Enhanced feature extraction structure

    ③ Using the features to obtain the prediction


    Abstract

    This week I continued studying semantic segmentation, focusing on part of the DeepLabV3+ implementation: the prediction (forward) pass of the whole model, which I also implemented in code. The input image first passes through the backbone network (MobileNetV2) for feature extraction, which yields two effective feature layers: one whose height and width have been compressed twice relative to the input, and one whose height and width have been compressed four times. An enhanced feature extraction stage then processes these layers, and the model finally outputs the prediction.

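    1 DeepLabV3+ Implementation Approach

    Prediction

    ① Backbone network

    The DeepLabV3+ paper uses the Xception family as its backbone feature extraction network. Besides Xception, other backbones such as ResNet or MobileNetV2 can be used; these notes use MobileNetV2.

    MobileNet is a lightweight deep neural network that Google proposed for mobile phones and other embedded devices.

    MobileNetV2 is the upgraded version of MobileNet. Its key feature is the inverted residual block (Inverted resblock); MobileNetV2 is built almost entirely from these blocks.

    An inverted residual block has two parts:
    The left side is the main branch: a 1x1 convolution first expands the channel dimension, a 3x3 depthwise convolution then extracts features, and a final 1x1 convolution reduces the channel dimension again.
    The right side is the residual branch, which connects the input directly to the output (in the code below it is used only when the stride is 1 and the input and output channel counts match).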
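
To make the expand / depthwise / project structure concrete, here is a minimal sketch that counts the convolution weights in one inverted residual block. It assumes bias-free convolutions and ignores BatchNorm parameters; the function names and the 24-channel example are illustrative choices of mine, not part of the original code.

```python
def inverted_residual_weights(inp, oup, expand_ratio, kernel=3):
    """Count convolution weights (no bias, no BatchNorm) in one inverted residual block."""
    hidden = round(inp * expand_ratio)
    # 1x1 conv that widens the channels; skipped when the expansion ratio is 1
    expand = inp * hidden if expand_ratio != 1 else 0
    # depthwise conv: one k x k filter per channel
    depthwise = hidden * kernel * kernel
    # linear 1x1 conv that narrows the channels again
    project = hidden * oup
    return expand + depthwise + project


def dense_conv_weights(channels_in, channels_out, kernel=3):
    """Weights of an ordinary (dense) k x k convolution, for comparison."""
    return channels_in * channels_out * kernel * kernel


# Hypothetical example: 24 -> 24 channels, expansion ratio 6 (hidden width 144).
block = inverted_residual_weights(24, 24, 6)
dense = dense_conv_weights(144, 144)
print(block, dense)
```

The block works at a hidden width of 144 channels but needs far fewer weights than a dense 3x3 convolution at that width, because the 3x3 step filters each channel separately.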

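    Note that in DeepLabV3+ the backbone usually does not downsample five times. Here "downsampling" means one halving of the height and width; the usual choices are three halvings (output stride 8) or four (output stride 16), and four are used here.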
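
The sketch below illustrates how the number of halvings can be capped: once the target output stride is reached, any later stride-2 stage keeps stride 1 and the dilation is multiplied instead. It is a stage-level simplification under my own naming (`STAGE_STRIDES`, `limit_output_stride`); the stride list is read off the first convolution and the `s` column of the MobileNetV2 settings table in the listing below.

```python
# Stride of each stage: the first 3x3 conv, then the seven rows of the
# (t, c, n, s) settings table of MobileNetV2.
STAGE_STRIDES = [2, 1, 2, 2, 2, 1, 2, 1]


def limit_output_stride(stage_strides, target):
    """Keep stride-2 stages until `target` is reached, then trade stride for dilation.

    Returns (output_stride, final_dilation, plan), where plan holds the
    (stride, dilation) pair chosen for each stage.
    """
    current, dilation, plan = 1, 1, []
    for stride in stage_strides:
        if current * stride > target:
            dilation *= stride  # widen the receptive field instead of shrinking the map
            stride = 1
        current *= stride
        plan.append((stride, dilation))
    return current, dilation, plan


for target in (8, 16):
    out_stride, final_dilation, _ = limit_output_stride(STAGE_STRIDES, target)
    print(f"target {target}: output stride {out_stride}, final dilation {final_dilation}")
```

With a target of 16 only the last stride-2 stage is converted (final dilation 2); with a target of 8 the last two are converted (final dilation 4).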

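    After MobileNetV2 feature extraction we obtain two effective feature layers: one is the input image with its height and width compressed twice (1/4 resolution), and the other is the input image with its height and width compressed four times (1/16 resolution).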
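
As a quick check of these sizes, the sketch below applies the standard convolution output-size formula for a 3x3, stride-2, padding-1 convolution. The 512-pixel input side is an assumed example value, and the helper names are mine.

```python
def conv_out_size(size, kernel=3, stride=2, padding=1, dilation=1):
    """Output length of a convolution along one axis (standard floor formula)."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


def compress(size, times):
    """Apply `times` stride-2 convolutions to one spatial dimension."""
    for _ in range(times):
        size = conv_out_size(size)
    return size


side = 512  # assumed input height/width
print(compress(side, 2), compress(side, 4))  # 128 32
```

For a 512x512 input, the shallow feature layer is therefore 128x128 and the deep feature layer is 32x32 when four halvings are used.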

    import math
    import os

    import torch
    import torch.nn as nn
    import torch.utils.model_zoo as model_zoo

    BatchNorm2d = nn.BatchNorm2d


    def conv_bn(inp, oup, stride):
        # 3x3 convolution + BatchNorm + ReLU6
        return nn.Sequential(
            nn.Conv2d(inp, oup, 3, stride, 1, bias=False),
            BatchNorm2d(oup),
            nn.ReLU6(inplace=True)
        )


    def conv_1x1_bn(inp, oup):
        # 1x1 convolution + BatchNorm + ReLU6
        return nn.Sequential(
            nn.Conv2d(inp, oup, 1, 1, 0, bias=False),
            BatchNorm2d(oup),
            nn.ReLU6(inplace=True)
        )


    class InvertedResidual(nn.Module):
        def __init__(self, inp, oup, stride, expand_ratio):
            super(InvertedResidual, self).__init__()
            self.stride = stride
            assert stride in [1, 2]

            hidden_dim = round(inp * expand_ratio)
            # The residual branch is only valid when the shapes of input and output match
            self.use_res_connect = self.stride == 1 and inp == oup

            if expand_ratio == 1:
                # No expansion: hidden_dim == inp, so the 1x1 expansion conv is skipped
                self.conv = nn.Sequential(
                    # dw
                    nn.Conv2d(hidden_dim, hidden_dim, 3, stride, 1, groups=hidden_dim, bias=False),
                    BatchNorm2d(hidden_dim),
                    nn.ReLU6(inplace=True),
                    # pw-linear
                    nn.Conv2d(hidden_dim, oup, 1, 1, 0, bias=False),
                    BatchNorm2d(oup),
                )
            else:
                self.conv = nn.Sequential(
                    # pw
                    nn.Conv2d(inp, hidden_dim, 1, 1, 0, bias=False),
                    BatchNorm2d(hidden_dim),
                    nn.ReLU6(inplace=True),
                    # dw
                    nn.Conv2d(hidden_dim, hidden_dim, 3, stride, 1, groups=hidden_dim, bias=False),
                    BatchNorm2d(hidden_dim),
                    nn.ReLU6(inplace=True),
                    # pw-linear
                    nn.Conv2d(hidden_dim, oup, 1, 1, 0, bias=False),
                    BatchNorm2d(oup),
                )

        def forward(self, x):
            if self.use_res_connect:
                return x + self.conv(x)
            else:
                return self.conv(x)


    class MobileNetV2(nn.Module):
        def __init__(self, n_class=1000, input_size=224, width_mult=1.):
            super(MobileNetV2, self).__init__()
            block = InvertedResidual
            input_channel = 32
            last_channel = 1280
            interverted_residual_setting = [
                # t, c, n, s
                [1, 16, 1, 1],
                [6, 24, 2, 2],
                [6, 32, 3, 2],
                [6, 64, 4, 2],
                [6, 96, 3, 1],
                [6, 160, 3, 2],
                [6, 320, 1, 1],
            ]

            # building first layer
            assert input_size % 32 == 0
            input_channel = int(input_channel * width_mult)
            self.last_channel = int(last_channel * width_mult) if width_mult > 1.0 else last_channel
            self.features = [conv_bn(3, input_channel, 2)]

            # building inverted residual blocks
            for t, c, n, s in interverted_residual_setting:
                output_channel = int(c * width_mult)
                for i in range(n):
                    # only the first block of each stage applies the stage stride
                    if i == 0:
                        self.features.append(block(input_channel, output_channel, s, expand_ratio=t))
                    else:
                        self.features.append(block(input_channel, output_channel, 1, expand_ratio=t))
                    input_channel = output_channel

            # building last several layers
            self.features.append(conv_1x1_bn(input_channel, self.last_channel))
            # make it nn.Sequential
            self.features = nn.Sequential(*self.features)

            # building classifier
            self.classifier = nn.Sequential(
                nn.Dropout(0.2),
                nn.Linear(self.last_channel, n_class),
            )

            self._initialize_weights()

        def forward(self, x):
            x = self.features(x)
            # global average pooling over width, then height
            x = x.mean(3).mean(2)
            x = self.classifier(x)
            return x

        def _initialize_weights(self):
            for m in self.modules():
                if isinstance(m, nn.Conv2d):
                    n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
                    m.weight.data.normal_(0, math.sqrt(2. / n))
                    if m.bias is not None:
                        m.bias.data.zero_()
                elif isinstance(m, BatchNorm2d):
                    m.weight.data.fill_(1)
                    m.bias.data.zero_()
                elif isinstance(m, nn.Linear):
                    m.weight.data.normal_(0, 0.01)
                    m.bias.data.zero_()


    def load_url(url, model_dir='./model_data', map_location=None):
        # Reuse a cached checkpoint if present, otherwise download it into model_dir
        if not os.path.exists(model_dir):
            os.makedirs(model_dir)
        filename = url.split('/')[-1]
        cached_file = os.path.join(model_dir, filename)
        if os.path.exists(cached_file):
            return torch.load(cached_file, map_location=map_location)
        else:
            return model_zoo.load_url(url, model_dir=model_dir, map_location=map_location)


    def mobilenetv2(pretrained=False, **kwargs):
        model = MobileNetV2(n_class=1000, **kwargs)
        if pretrained:
            model.load_state_dict(load_url('http://sceneparsing.csail.mit.edu/model/pretrained_resnet/mobilenet_v2.pth.tar'), strict=False)
        return model

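
    ② Enhanced feature extraction structure

    In DeepLabV3+, the enhanced feature extraction network can be divided into two parts:
    In the Encoder, the preliminary effective feature layer that has been compressed four times is processed by parallel atrous (dilated) convolutions with different rates; the branch outputs are concatenated and then compressed with a 1x1 convolution.

    In the Decoder, a 1x1 convolution adjusts the channel count of the preliminary effective feature layer that has been compressed twice; the result is stacked with the upsampled output of the atrous-convolution branch, and the stack then passes through two depthwise separable convolution blocks (the code below uses two ordinary 3x3 convolutions in this position).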
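
A small sketch of what the different rates do in the Encoder: a 3x3 kernel with dilation d spans d * (3 - 1) + 1 input pixels, and a padding equal to d keeps the spatial size unchanged at stride 1. The rates 6, 12 and 18 are the ones used by the ASPP code below; the 32-pixel feature side and the helper names are assumptions for illustration.

```python
def effective_kernel(kernel, dilation):
    """Span of input pixels covered by a dilated kernel along one axis."""
    return dilation * (kernel - 1) + 1


def conv_out_size(size, kernel, padding, dilation, stride=1):
    """Output length of a convolution along one axis (standard floor formula)."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


FEATURE_SIDE = 32  # assumed side length of the deep feature map

for rate in (6, 12, 18):
    span = effective_kernel(3, rate)
    out = conv_out_size(FEATURE_SIDE, kernel=3, padding=rate, dilation=rate)
    print(f"rate={rate}: kernel spans {span} pixels, output side {out}")
```

Because every branch preserves the spatial size, the branch outputs can be concatenated along the channel axis without any cropping.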
     

    At this point we have the final effective feature layer, a condensed representation of the features of the whole image.
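
The channel bookkeeping of the Encoder and Decoder can be traced with a few lines of arithmetic. This is only a sketch: the default values mirror the MobileNetV2 configuration in the code below (320 and 24 backbone channels, 256 ASPP channels, 48 shortcut channels), while `num_classes=21` and the function name are hypothetical choices of mine.

```python
def decoder_channel_trace(backbone_channels=320, low_level_channels=24,
                          aspp_out=256, shortcut_out=48, num_classes=21):
    """Return the channel count at each step of the encoder/decoder path."""
    return {
        "aspp_in": backbone_channels,                  # deep feature layer from the backbone
        "aspp_concat": 5 * aspp_out,                   # five parallel branches stacked
        "aspp_fused": aspp_out,                        # 1x1 conv compresses the stack
        "shortcut_in": low_level_channels,             # shallow feature layer from the backbone
        "shortcut": shortcut_out,                      # 1x1 conv on the shallow features
        "decoder_concat": aspp_out + shortcut_out,     # upsampled ASPP output + shortcut
        "decoder_out": aspp_out,                       # after the two 3x3 conv blocks
        "logits": num_classes,                         # final 1x1 classification conv
    }


for name, channels in decoder_channel_trace().items():
    print(f"{name:>15}: {channels}")
```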

    from functools import partial

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    from nets.xception import xception
    from nets.mobilenetv2 import mobilenetv2


    class MobileNetV2(nn.Module):
        def __init__(self, downsample_factor=8, pretrained=True):
            super(MobileNetV2, self).__init__()

            model = mobilenetv2(pretrained)
            # drop the final 1x1 conv; only the feature extractor is needed
            self.features = model.features[:-1]

            self.total_idx = len(self.features)
            # indices of the blocks that downsample with stride 2
            self.down_idx = [2, 4, 7, 14]

            if downsample_factor == 8:
                for i in range(self.down_idx[-2], self.down_idx[-1]):
                    self.features[i].apply(
                        partial(self._nostride_dilate, dilate=2)
                    )
                for i in range(self.down_idx[-1], self.total_idx):
                    self.features[i].apply(
                        partial(self._nostride_dilate, dilate=4)
                    )
            elif downsample_factor == 16:
                for i in range(self.down_idx[-1], self.total_idx):
                    self.features[i].apply(
                        partial(self._nostride_dilate, dilate=2)
                    )

        def _nostride_dilate(self, m, dilate):
            # Replace stride-2 convolutions with stride 1 and dilate the 3x3 kernels
            classname = m.__class__.__name__
            if classname.find('Conv') != -1:
                if m.stride == (2, 2):
                    m.stride = (1, 1)
                    if m.kernel_size == (3, 3):
                        m.dilation = (dilate//2, dilate//2)
                        m.padding = (dilate//2, dilate//2)
                else:
                    if m.kernel_size == (3, 3):
                        m.dilation = (dilate, dilate)
                        m.padding = (dilate, dilate)

        def forward(self, x):
            # shallow features (compressed twice) and deep features
            low_level_features = self.features[:4](x)
            x = self.features[4:](low_level_features)
            return low_level_features, x


    #-----------------------------------------#
    #   ASPP feature extraction module
    #   Extracts features with dilated convolutions
    #   at different dilation rates
    #-----------------------------------------#
    class ASPP(nn.Module):
        def __init__(self, dim_in, dim_out, rate=1, bn_mom=0.1):
            super(ASPP, self).__init__()
            self.branch1 = nn.Sequential(
                nn.Conv2d(dim_in, dim_out, 1, 1, padding=0, dilation=rate, bias=True),
                nn.BatchNorm2d(dim_out, momentum=bn_mom),
                nn.ReLU(inplace=True),
            )
            self.branch2 = nn.Sequential(
                nn.Conv2d(dim_in, dim_out, 3, 1, padding=6*rate, dilation=6*rate, bias=True),
                nn.BatchNorm2d(dim_out, momentum=bn_mom),
                nn.ReLU(inplace=True),
            )
            self.branch3 = nn.Sequential(
                nn.Conv2d(dim_in, dim_out, 3, 1, padding=12*rate, dilation=12*rate, bias=True),
                nn.BatchNorm2d(dim_out, momentum=bn_mom),
                nn.ReLU(inplace=True),
            )
            self.branch4 = nn.Sequential(
                nn.Conv2d(dim_in, dim_out, 3, 1, padding=18*rate, dilation=18*rate, bias=True),
                nn.BatchNorm2d(dim_out, momentum=bn_mom),
                nn.ReLU(inplace=True),
            )
            self.branch5_conv = nn.Conv2d(dim_in, dim_out, 1, 1, 0, bias=True)
            self.branch5_bn = nn.BatchNorm2d(dim_out, momentum=bn_mom)
            self.branch5_relu = nn.ReLU(inplace=True)

            self.conv_cat = nn.Sequential(
                nn.Conv2d(dim_out*5, dim_out, 1, 1, padding=0, bias=True),
                nn.BatchNorm2d(dim_out, momentum=bn_mom),
                nn.ReLU(inplace=True),
            )

        def forward(self, x):
            [b, c, row, col] = x.size()
            #-----------------------------------------#
            #   Five branches in total
            #-----------------------------------------#
            conv1x1 = self.branch1(x)
            conv3x3_1 = self.branch2(x)
            conv3x3_2 = self.branch3(x)
            conv3x3_3 = self.branch4(x)
            #-----------------------------------------#
            #   Fifth branch: global average pooling + convolution
            #-----------------------------------------#
            global_feature = torch.mean(x, 2, True)
            global_feature = torch.mean(global_feature, 3, True)
            global_feature = self.branch5_conv(global_feature)
            global_feature = self.branch5_bn(global_feature)
            global_feature = self.branch5_relu(global_feature)
            global_feature = F.interpolate(global_feature, (row, col), None, 'bilinear', True)

            #-----------------------------------------#
            #   Stack the five branch outputs,
            #   then fuse them with a 1x1 convolution.
            #-----------------------------------------#
            feature_cat = torch.cat([conv1x1, conv3x3_1, conv3x3_2, conv3x3_3, global_feature], dim=1)
            result = self.conv_cat(feature_cat)
            return result


    class DeepLab(nn.Module):
        def __init__(self, num_classes, backbone="mobilenet", pretrained=True, downsample_factor=16):
            super(DeepLab, self).__init__()
            if backbone == "xception":
                #----------------------------------#
                #   Obtain two feature layers
                #   Shallow features  [128,128,256]
                #   Backbone output   [30,30,2048]
                #----------------------------------#
                self.backbone = xception(downsample_factor=downsample_factor, pretrained=pretrained)
                in_channels = 2048
                low_level_channels = 256
            elif backbone == "mobilenet":
                #----------------------------------#
                #   Obtain two feature layers
                #   Shallow features  [128,128,24]
                #   Backbone output   [30,30,320]
                #----------------------------------#
                self.backbone = MobileNetV2(downsample_factor=downsample_factor, pretrained=pretrained)
                in_channels = 320
                low_level_channels = 24
            else:
                raise ValueError('Unsupported backbone - `{}`, Use mobilenet, xception.'.format(backbone))

            #-----------------------------------------#
            #   ASPP feature extraction module
            #   Extracts features with dilated convolutions
            #   at different dilation rates
            #-----------------------------------------#
            self.aspp = ASPP(dim_in=in_channels, dim_out=256, rate=16//downsample_factor)

            #----------------------------------#
            #   Shallow-feature branch
            #----------------------------------#
            self.shortcut_conv = nn.Sequential(
                nn.Conv2d(low_level_channels, 48, 1),
                nn.BatchNorm2d(48),
                nn.ReLU(inplace=True)
            )

            self.cat_conv = nn.Sequential(
                nn.Conv2d(48+256, 256, 3, stride=1, padding=1),
                nn.BatchNorm2d(256),
                nn.ReLU(inplace=True),
                nn.Dropout(0.5),

                nn.Conv2d(256, 256, 3, stride=1, padding=1),
                nn.BatchNorm2d(256),
                nn.ReLU(inplace=True),

                nn.Dropout(0.1),
            )
            self.cls_conv = nn.Conv2d(256, num_classes, 1, stride=1)

        def forward(self, x):
            H, W = x.size(2), x.size(3)
            #-----------------------------------------#
            #   Obtain two feature layers
            #   Shallow features: processed with a convolution
            #   Backbone output: enhanced feature extraction with ASPP
            #-----------------------------------------#
            low_level_features, x = self.backbone(x)
            x = self.aspp(x)
            low_level_features = self.shortcut_conv(low_level_features)

            #-----------------------------------------#
            #   Upsample the enhanced-feature branch,
            #   stack it with the shallow features,
            #   then extract features with convolutions
            #-----------------------------------------#
            x = F.interpolate(x, size=(low_level_features.size(2), low_level_features.size(3)), mode='bilinear', align_corners=True)
            x = self.cat_conv(torch.cat((x, low_level_features), dim=1))
            x = self.cls_conv(x)
            x = F.interpolate(x, size=(H, W), mode='bilinear', align_corners=True)
            return x

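
    ③ Using the features to obtain the prediction

    Steps ① and ② give us the features of the input image; the remaining task is to turn those features into a prediction.

    This takes two steps:
    1. A 1x1 convolution adjusts the number of channels to Num_Classes.
    2. A resize (bilinear upsampling) brings the final output layer back to the width and height of the input image.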
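
To show what these two steps compute, here is a dependency-free sketch on nested lists. It treats the 1x1 convolution as a per-pixel linear map over channels and implements bilinear resizing with aligned corners, the same convention as `align_corners=True` in the code below. The function names and the tiny 2x2 example values are mine, and this is an illustration rather than a replacement for `nn.Conv2d` or `F.interpolate`.

```python
def conv1x1(feature, weight, bias):
    """feature: [C][H][W], weight: [K][C], bias: [K] -> class scores [K][H][W]."""
    channels, height, width = len(feature), len(feature[0]), len(feature[0][0])
    return [
        [
            [
                bias[k] + sum(weight[k][c] * feature[c][y][x] for c in range(channels))
                for x in range(width)
            ]
            for y in range(height)
        ]
        for k in range(len(weight))
    ]


def resize_bilinear(plane, out_h, out_w):
    """Bilinear resize of one [H][W] plane with corner pixels aligned."""
    in_h, in_w = len(plane), len(plane[0])
    scale_y = (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
    scale_x = (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
    out = []
    for i in range(out_h):
        y = i * scale_y
        y0 = int(y)
        y1 = min(y0 + 1, in_h - 1)
        wy = y - y0
        row = []
        for j in range(out_w):
            x = j * scale_x
            x0 = int(x)
            x1 = min(x0 + 1, in_w - 1)
            wx = x - x0
            top = plane[y0][x0] * (1 - wx) + plane[y0][x1] * wx
            bottom = plane[y1][x0] * (1 - wx) + plane[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out


# Hypothetical 2-channel, 2x2 feature map mapped to 3 classes, then resized to 3x3.
feature = [[[0.0, 1.0], [2.0, 3.0]], [[1.0, 1.0], [1.0, 1.0]]]
weight = [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]]
bias = [0.0, 0.0, 0.5]

scores = conv1x1(feature, weight, bias)
upsampled = [resize_bilinear(plane, 3, 3) for plane in scores]
print(upsampled[0])
```

The result has one plane per class at the target resolution, which is the shape of the tensor that `DeepLab.forward` returns.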

    class DeepLab(nn.Module):
        def __init__(self, num_classes, backbone="mobilenet", pretrained=True, downsample_factor=16):
            super(DeepLab, self).__init__()
            if backbone == "xception":
                #----------------------------------#
                #   Obtain two feature layers
                #   Shallow features  [128,128,256]
                #   Backbone output   [30,30,2048]
                #----------------------------------#
                self.backbone = xception(downsample_factor=downsample_factor, pretrained=pretrained)
                in_channels = 2048
                low_level_channels = 256
            elif backbone == "mobilenet":
                #----------------------------------#
                #   Obtain two feature layers
                #   Shallow features  [128,128,24]
                #   Backbone output   [30,30,320]
                #----------------------------------#
                self.backbone = MobileNetV2(downsample_factor=downsample_factor, pretrained=pretrained)
                in_channels = 320
                low_level_channels = 24
            else:
                raise ValueError('Unsupported backbone - `{}`, Use mobilenet, xception.'.format(backbone))

            #-----------------------------------------#
            #   ASPP feature extraction module
            #   Extracts features with dilated convolutions
            #   at different dilation rates
            #-----------------------------------------#
            self.aspp = ASPP(dim_in=in_channels, dim_out=256, rate=16//downsample_factor)

            #----------------------------------#
            #   Shallow-feature branch
            #----------------------------------#
            self.shortcut_conv = nn.Sequential(
                nn.Conv2d(low_level_channels, 48, 1),
                nn.BatchNorm2d(48),
                nn.ReLU(inplace=True)
            )

            self.cat_conv = nn.Sequential(
                nn.Conv2d(48+256, 256, 3, stride=1, padding=1),
                nn.BatchNorm2d(256),
                nn.ReLU(inplace=True),
                nn.Dropout(0.5),

                nn.Conv2d(256, 256, 3, stride=1, padding=1),
                nn.BatchNorm2d(256),
                nn.ReLU(inplace=True),

                nn.Dropout(0.1),
            )
            # Step 1: 1x1 convolution that maps 256 channels to num_classes
            self.cls_conv = nn.Conv2d(256, num_classes, 1, stride=1)

        def forward(self, x):
            H, W = x.size(2), x.size(3)
            #-----------------------------------------#
            #   Obtain two feature layers
            #   Shallow features: processed with a convolution
            #   Backbone output: enhanced feature extraction with ASPP
            #-----------------------------------------#
            low_level_features, x = self.backbone(x)
            x = self.aspp(x)
            low_level_features = self.shortcut_conv(low_level_features)

            #-----------------------------------------#
            #   Upsample the enhanced-feature branch,
            #   stack it with the shallow features,
            #   then extract features with convolutions
            #-----------------------------------------#
            x = F.interpolate(x, size=(low_level_features.size(2), low_level_features.size(3)), mode='bilinear', align_corners=True)
            x = self.cat_conv(torch.cat((x, low_level_features), dim=1))
            x = self.cls_conv(x)
            # Step 2: resize the class scores back to the input height and width
            x = F.interpolate(x, size=(H, W), mode='bilinear', align_corners=True)
            return x

  • Original article: https://blog.csdn.net/DominaterWE/article/details/138746877