• Deep Learning for GIS 10: Running the Faster RCNN Algorithm


    (Unfinished; to be completed.)


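    Get the Faster RCNN source code

    (There are many open-source implementations, and the paper points to one as well, so I won't go into detail here.)

    Swap in your own dataset (images + label files)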

    (The label files need to be generated with an annotation tool such as labelImg.)
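    Assuming the annotation tool is labelImg saving in Pascal VOC XML format (one XML file per image), the label files can be read back with the standard library alone. The sketch below is only an illustration: the sample XML and the function name parse_voc_xml are made up here and are not taken from the project.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in Pascal VOC layout (illustrative values only).
SAMPLE_XML = """
<annotation>
  <filename>img_0001.jpg</filename>
  <size><width>1920</width><height>1080</height><depth>3</depth></size>
  <object>
    <name>boar</name>
    <bndbox><xmin>100</xmin><ymin>200</ymin><xmax>400</xmax><ymax>500</ymax></bndbox>
  </object>
  <object>
    <name>roe_deer</name>
    <bndbox><xmin>50</xmin><ymin>60</ymin><xmax>150</xmax><ymax>260</ymax></bndbox>
  </object>
</annotation>
"""


def parse_voc_xml(xml_text):
    """Return a list of (class_name, xmin, ymin, xmax, ymax) tuples."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        name = obj.findtext("name")
        bndbox = obj.find("bndbox")
        # Coordinates are stored as text; go through float in case a tool wrote "100.0".
        coords = [int(float(bndbox.findtext(tag)))
                  for tag in ("xmin", "ymin", "xmax", "ymax")]
        boxes.append((name, *coords))
    return boxes


for box in parse_voc_xml(SAMPLE_XML):
    print(box)
```

    Checking a few files this way before training is a cheap way to catch misspelled class names or empty annotations.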

    Open a terminal and activate the gpupytorch environment

    Run voc_annotation.py to generate the training files

    E:\DeepLearningModel\Model01>activate gpupytorch
    (gpupytorch) E:\DeepLearningModel\Model01>python voc_annotation.py
    D:\Anaconda\envs\gpupytorch\lib\site-packages\numpy\_distributor_init.py:30: UserWarning: loaded more than 1 DLL from .libs:
    D:\Anaconda\envs\gpupytorch\lib\site-packages\numpy\.libs\libopenblas.PYQHXLVVQ7VESDPUVUADXEVJOBGHJPAY.gfortran-win_amd64.dll
    D:\Anaconda\envs\gpupytorch\lib\site-packages\numpy\.libs\libopenblas64__v0.3.21-gcc_10_3_0.dll
      warnings.warn("loaded more than 1 DLL from .libs:\n%s" %
    Generate txt in ImageSets.
    train and val size 777
    train size 699
    Generate txt in ImageSets done.
    Generate 2007_train.txt and 2007_val.txt for train.

    The full output looks like this:

    (gpupytorch) E:\DeepLearningModel\Model01>python voc_annotation.py
    D:\Anaconda\envs\gpupytorch\lib\site-packages\numpy\_distributor_init.py:30: UserWarning: loaded more than 1 DLL from .libs:
    D:\Anaconda\envs\gpupytorch\lib\site-packages\numpy\.libs\libopenblas.PYQHXLVVQ7VESDPUVUADXEVJOBGHJPAY.gfortran-win_amd64.dll
    D:\Anaconda\envs\gpupytorch\lib\site-packages\numpy\.libs\libopenblas64__v0.3.21-gcc_10_3_0.dll
      warnings.warn("loaded more than 1 DLL from .libs:\n%s" %
    Generate txt in ImageSets.
    train and val size 777
    train size 699
    Generate txt in ImageSets done.
    Generate 2007_train.txt and 2007_val.txt for train.
    Generate 2007_train.txt and 2007_val.txt for train done.
    | leopard | 174 |
    | boar | 491 |
    | roe_deer | 352 |
    (gpupytorch) E:\DeepLearningModel\Model01>
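    The sizes in the log (777 images in the train+val pool, 699 of them for training, which leaves 78 for validation) are consistent with a 9:1 split. A minimal sketch of such a split follows; the 0.9 ratio is inferred from those numbers, and the function name, seed, and ID list are illustrative assumptions rather than the script's actual code.

```python
import random


def split_train_val(image_ids, train_ratio=0.9, seed=0):
    """Randomly split IDs into train and val lists, keeping the original order."""
    ids = list(image_ids)
    num_train = int(len(ids) * train_ratio)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train_set = set(rng.sample(ids, num_train))
    train = [i for i in ids if i in train_set]
    val = [i for i in ids if i not in train_set]
    return train, val


# Hypothetical image IDs standing in for the 777 train+val images.
ids = [f"img_{n:04d}" for n in range(777)]
train, val = split_train_val(ids)
print(len(train), len(val))  # 699 78
```

    Fixing the seed keeps the split stable across runs, so validation numbers from different training runs stay comparable.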

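    Run train.py. Note that the listing below is the FRCNN class, which loads the trained weights from logs/ and runs detection: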

    import colorsys
    import os
    import time

    import numpy as np
    import torch
    import torch.nn as nn
    from PIL import Image, ImageDraw, ImageFont

    from nets.frcnn import FasterRCNN
    from utils.utils import (cvtColor, get_classes, get_new_img_size, resize_image,
                             preprocess_input, show_config)
    from utils.utils_bbox import DecodeBox


    class FRCNN(object):
        _defaults = {
            "model_path"    : 'logs/loss_2024_03_05_22_26_24.pth',
            "classes_path"  : 'model_data/voc_classes.txt',
            "backbone"      : "resnet50",
            "confidence"    : 0.5,
            "nms_iou"       : 0.3,
            'anchors_size'  : [8, 16, 32],
            "cuda"          : True,
        }

        @classmethod
        def get_defaults(cls, n):
            if n in cls._defaults:
                return cls._defaults[n]
            else:
                return "Unrecognized attribute name '" + n + "'"

        def __init__(self, **kwargs):
            # Start from the defaults, then apply any keyword overrides
            self.__dict__.update(self._defaults)
            for name, value in kwargs.items():
                setattr(self, name, value)
                self._defaults[name] = value
            # Load the class names and the number of classes
            self.class_names, self.num_classes = get_classes(self.classes_path)

            self.std = torch.Tensor([0.1, 0.1, 0.2, 0.2]).repeat(self.num_classes + 1)[None]
            if self.cuda:
                self.std = self.std.cuda()
            self.bbox_util = DecodeBox(self.std, self.num_classes)

            #---------------------------------------------------#
            #   Give each class its own color for drawing boxes
            #---------------------------------------------------#
            hsv_tuples = [(x / self.num_classes, 1., 1.) for x in range(self.num_classes)]
            self.colors = list(map(lambda x: colorsys.hsv_to_rgb(*x), hsv_tuples))
            self.colors = list(map(lambda x: (int(x[0] * 255), int(x[1] * 255), int(x[2] * 255)), self.colors))
            self.generate()

            show_config(**self._defaults)

        #---------------------------------------------------#
        #   Load the model
        #---------------------------------------------------#
        def generate(self):
            # Build the network in "predict" mode and load the trained weights
            self.net = FasterRCNN(self.num_classes, "predict", anchor_scales = self.anchors_size, backbone = self.backbone)
            device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
            self.net.load_state_dict(torch.load(self.model_path, map_location=device))
            self.net = self.net.eval()
            print('{} model, anchors, and classes loaded.'.format(self.model_path))

            if self.cuda:
                self.net = nn.DataParallel(self.net)
                self.net = self.net.cuda()

        #---------------------------------------------------#
        #   Detect objects in an image
        #---------------------------------------------------#
        def detect_image(self, image, crop = False, count = False):
            #---------------------------------------------------#
            #   Get the height and width of the input image
            #---------------------------------------------------#
            image_shape = np.array(np.shape(image)[0:2])
            #---------------------------------------------------#
            #   Compute the size after resizing; the shorter side
            #   of the resized image is 600
            #---------------------------------------------------#
            input_shape = get_new_img_size(image_shape[0], image_shape[1])
            #---------------------------------------------------------#
            #   Convert the image to RGB here so that grayscale images
            #   do not raise errors at prediction time.
            #   The code only supports RGB prediction; all other image
            #   types are converted to RGB.
            #---------------------------------------------------------#
            image = cvtColor(image)
            #---------------------------------------------------------#
            #   Resize the original image so its shorter side is 600
            #---------------------------------------------------------#
            image_data = resize_image(image, [input_shape[1], input_shape[0]])
            #---------------------------------------------------------#
            #   Add the batch_size dimension
            #---------------------------------------------------------#
            image_data = np.expand_dims(np.transpose(preprocess_input(np.array(image_data, dtype='float32')), (2, 0, 1)), 0)

            with torch.no_grad():
                images = torch.from_numpy(image_data)
                if self.cuda:
                    images = images.cuda()

                #-------------------------------------------------------------#
                #   roi_cls_locs  adjustment parameters of the proposal boxes
                #   roi_scores    class scores of the proposal boxes
                #   rois          coordinates of the proposal boxes
                #-------------------------------------------------------------#
                roi_cls_locs, roi_scores, rois, _ = self.net(images)
                #-------------------------------------------------------------#
                #   Decode the proposals with the classifier's predictions
                #   to obtain the predicted boxes
                #-------------------------------------------------------------#
                results = self.bbox_util.forward(roi_cls_locs, roi_scores, rois, image_shape, input_shape,
                                                 nms_iou = self.nms_iou, confidence = self.confidence)
                #---------------------------------------------------------#
                #   If no object was detected, return the original image
                #---------------------------------------------------------#
                if len(results[0]) <= 0:
                    return image

                top_label = np.array(results[0][:, 5], dtype = 'int32')
                top_conf = results[0][:, 4]
                top_boxes = results[0][:, :4]

            #---------------------------------------------------------#
            #   Set the font and the border thickness
            #---------------------------------------------------------#
            font = ImageFont.truetype(font='model_data/simhei.ttf', size=np.floor(3e-2 * image.size[1] + 0.5).astype('int32'))
            thickness = int(max((image.size[0] + image.size[1]) // np.mean(input_shape), 1))
            #---------------------------------------------------------#
            #   Count detections per class
            #---------------------------------------------------------#
            if count:
                print("top_label:", top_label)
                classes_nums = np.zeros([self.num_classes])
                for i in range(self.num_classes):
                    num = np.sum(top_label == i)
                    if num > 0:
                        print(self.class_names[i], " : ", num)
                    classes_nums[i] = num
                print("classes_nums:", classes_nums)
            #---------------------------------------------------------#
            #   Optionally crop each detected object
            #---------------------------------------------------------#
            if crop:
                for i, c in list(enumerate(top_label)):
                    top, left, bottom, right = top_boxes[i]
                    top = max(0, np.floor(top).astype('int32'))
                    left = max(0, np.floor(left).astype('int32'))
                    bottom = min(image.size[1], np.floor(bottom).astype('int32'))
                    right = min(image.size[0], np.floor(right).astype('int32'))

                    dir_save_path = "img_crop"
                    if not os.path.exists(dir_save_path):
                        os.makedirs(dir_save_path)
                    crop_image = image.crop([left, top, right, bottom])
                    crop_image.save(os.path.join(dir_save_path, "crop_" + str(i) + ".png"), quality=95, subsampling=0)
                    print("save crop_" + str(i) + ".png to " + dir_save_path)
            #---------------------------------------------------------#
            #   Draw the boxes and labels
            #---------------------------------------------------------#
            for i, c in list(enumerate(top_label)):
                predicted_class = self.class_names[int(c)]
                box = top_boxes[i]
                score = top_conf[i]

                top, left, bottom, right = box

                top = max(0, np.floor(top).astype('int32'))
                left = max(0, np.floor(left).astype('int32'))
                bottom = min(image.size[1], np.floor(bottom).astype('int32'))
                right = min(image.size[0], np.floor(right).astype('int32'))

                label = '{} {:.2f}'.format(predicted_class, score)
                draw = ImageDraw.Draw(image)
                # Note: ImageDraw.textsize was removed in Pillow 10; this line needs an older Pillow
                label_size = draw.textsize(label, font)
                label = label.encode('utf-8')
                # print(label, top, left, bottom, right)

                # Put the label above the box if there is room, otherwise just inside it
                if top - label_size[1] >= 0:
                    text_origin = np.array([left, top - label_size[1]])
                else:
                    text_origin = np.array([left, top + 1])

                # Draw nested rectangles to get a border of the requested thickness
                for t in range(thickness):
                    draw.rectangle([left + t, top + t, right - t, bottom - t], outline=self.colors[c])
                draw.rectangle([tuple(text_origin), tuple(text_origin + label_size)], fill=self.colors[c])
                draw.text(text_origin, str(label,'UTF-8'), fill=(0, 0, 0), font=font)
                del draw

            return image

        #---------------------------------------------------#
        #   Measure the average inference time per image
        #---------------------------------------------------#
        def get_FPS(self, image, test_interval):
            #---------------------------------------------------#
            #   Get the height and width of the input image
            #---------------------------------------------------#
            image_shape = np.array(np.shape(image)[0:2])
            input_shape = get_new_img_size(image_shape[0], image_shape[1])
            #---------------------------------------------------------#
            #   Convert the image to RGB here so that grayscale images
            #   do not raise errors at prediction time.
            #   The code only supports RGB prediction; all other image
            #   types are converted to RGB.
            #---------------------------------------------------------#
            image = cvtColor(image)
            #---------------------------------------------------------#
            #   Resize the original image so its shorter side is 600
            #---------------------------------------------------------#
            image_data = resize_image(image, [input_shape[1], input_shape[0]])
            #---------------------------------------------------------#
            #   Add the batch_size dimension
            #---------------------------------------------------------#
            image_data = np.expand_dims(np.transpose(preprocess_input(np.array(image_data, dtype='float32')), (2, 0, 1)), 0)

            # Run one untimed forward pass first
            with torch.no_grad():
                images = torch.from_numpy(image_data)
                if self.cuda:
                    images = images.cuda()

                roi_cls_locs, roi_scores, rois, _ = self.net(images)
                #-------------------------------------------------------------#
                #   Decode the proposals with the classifier's predictions
                #   to obtain the predicted boxes
                #-------------------------------------------------------------#
                results = self.bbox_util.forward(roi_cls_locs, roi_scores, rois, image_shape, input_shape,
                                                 nms_iou = self.nms_iou, confidence = self.confidence)

            # Time test_interval forward passes (including box decoding)
            t1 = time.time()
            for _ in range(test_interval):
                with torch.no_grad():
                    roi_cls_locs, roi_scores, rois, _ = self.net(images)
                    #-------------------------------------------------------------#
                    #   Decode the proposals with the classifier's predictions
                    #   to obtain the predicted boxes
                    #-------------------------------------------------------------#
                    results = self.bbox_util.forward(roi_cls_locs, roi_scores, rois, image_shape, input_shape,
                                                     nms_iou = self.nms_iou, confidence = self.confidence)
            t2 = time.time()
            tact_time = (t2 - t1) / test_interval
            return tact_time

        #---------------------------------------------------#
        #   Write the detections for one image to a txt file
        #   used for mAP evaluation
        #---------------------------------------------------#
        def get_map_txt(self, image_id, image, class_names, map_out_path):
            f = open(os.path.join(map_out_path, "detection-results/"+image_id+".txt"),"w")
            #---------------------------------------------------#
            #   Get the height and width of the input image
            #---------------------------------------------------#
            image_shape = np.array(np.shape(image)[0:2])
            input_shape = get_new_img_size(image_shape[0], image_shape[1])
            #---------------------------------------------------------#
            #   Convert the image to RGB here so that grayscale images
            #   do not raise errors at prediction time.
            #   The code only supports RGB prediction; all other image
            #   types are converted to RGB.
            #---------------------------------------------------------#
            image = cvtColor(image)
            #---------------------------------------------------------#
            #   Resize the original image so its shorter side is 600
            #---------------------------------------------------------#
            image_data = resize_image(image, [input_shape[1], input_shape[0]])
            #---------------------------------------------------------#
            #   Add the batch_size dimension
            #---------------------------------------------------------#
            image_data = np.expand_dims(np.transpose(preprocess_input(np.array(image_data, dtype='float32')), (2, 0, 1)), 0)

            with torch.no_grad():
                images = torch.from_numpy(image_data)
                if self.cuda:
                    images = images.cuda()

                roi_cls_locs, roi_scores, rois, _ = self.net(images)
                #-------------------------------------------------------------#
                #   Decode the proposals with the classifier's predictions
                #   to obtain the predicted boxes
                #-------------------------------------------------------------#
                results = self.bbox_util.forward(roi_cls_locs, roi_scores, rois, image_shape, input_shape,
                                                 nms_iou = self.nms_iou, confidence = self.confidence)
                #--------------------------------------#
                #   If no object was detected, leave the
                #   file empty and return early
                #--------------------------------------#
                if len(results[0]) <= 0:
                    f.close()  # close the (empty) file before returning
                    return

                top_label = np.array(results[0][:, 5], dtype = 'int32')
                top_conf = results[0][:, 4]
                top_boxes = results[0][:, :4]

            for i, c in list(enumerate(top_label)):
                predicted_class = self.class_names[int(c)]
                box = top_boxes[i]
                score = str(top_conf[i])

                top, left, bottom, right = box
                # Skip classes that are not being evaluated
                if predicted_class not in class_names:
                    continue

                f.write("%s %s %s %s %s %s\n" % (predicted_class, score[:6], str(int(left)), str(int(top)), str(int(right)),str(int(bottom))))

            f.close()
            return
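    The comments in the listing say that each image is resized so that its shorter side is 600 pixels. A minimal sketch of how such a target size can be computed while keeping the aspect ratio is shown below. It is a stand-in written for illustration (the function name is made up), and the project's own get_new_img_size may differ in details such as rounding.

```python
def short_side_resize_shape(height, width, min_side=600):
    """Return (new_height, new_width) with the shorter side scaled to min_side."""
    if width <= height:
        # Width is the shorter side: pin it to min_side and scale the height.
        scale = min_side / width
        return int(scale * height), min_side
    # Height is the shorter side: pin it to min_side and scale the width.
    scale = min_side / height
    return min_side, int(scale * width)


print(short_side_resize_shape(1080, 1920))  # (600, 1066)
print(short_side_resize_shape(800, 600))    # (800, 600)
```

    Because only the shorter side is pinned, wide or tall images keep their proportions instead of being squashed into a square.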

    Run it from the terminal or from your editor/IDE:

    E:\DeepLearningModel\Model01>activate gpupytorch
    (gpupytorch) E:\DeepLearningModel\Model01>python train.py
    D:\Anaconda\envs\gpupytorch\lib\site-packages\numpy\_distributor_init.py:30: UserWarning: loaded more than 1 DLL from .libs:
    D:\Anaconda\envs\gpupytorch\lib\site-packages\numpy\.libs\libopenblas.PYQHXLVVQ7VESDPUVUADXEVJOBGHJPAY.gfortran-win_amd64.dll
    D:\Anaconda\envs\gpupytorch\lib\site-packages\numpy\.libs\libopenblas64__v0.3.21-gcc_10_3_0.dll
      warnings.warn("loaded more than 1 DLL from .libs:\n%s" %
    Number of devices: 1
    initialize network with normal type
    Load weights model_data/voc_weights_resnet.pth.
    Successful Load Key: ['extractor.0.weight', 'extractor.1.weight', 'extractor.1.bias', 'extractor.1.running_mean', 'extractor.1.running_var', 'extractor.1.num_batches_tracked', 'extractor.4.0.conv1.weight', 'extractor.4.0.bn1.weight', 'extractor.4.0.bn1.bias', 'extractor.4.0.bn1.running_mean', 'extractor.4.0.bn1.running_var', 'extractor.4.0.bn1.num_batches_tracked', 'extractor.4.0.conv2.weight', 'extractor.4.0.bn2.weight', 'extractor.4.0.bn2.bias', 'extractor.4.0.bn2.running_mean', 'extractor.4.0.bn2.running_var', 'e ……
    Successful Load Key Num: 324
    Fail To Load Key: ['head.cls_loc.weight', 'head.cls_loc.bias', 'head.score.weight', 'head.score.bias'] ……
    Fail To Load Key num: 4
    温馨提示,head部分没有载入是正常现象,Backbone部分没有载入是错误的。 (Note: it is normal for the head weights not to load; it is an error if the Backbone weights fail to load.)
    Configurations:
    ----------------------------------------------------------------------
    | keys | values|
    ----------------------------------------------------------------------
    | classes_path | model_data/voc_classes.txt|
    | model_path | model_data/voc_weights_resnet.pth|
    | input_shape | [600, 600]|
    | Init_Epoch | 0|
    | Freeze_Epoch | 50|
    | UnFreeze_Epoch | 100|
    | Freeze_batch_size | 4|
    | Unfreeze_batch_size | 2|
    | Freeze_Train | True|
    | Init_lr | 0.0001|
    | Min_lr | 1.0000000000000002e-06|
    | optimizer_type | adam|
    | momentum | 0.9|
    | lr_decay_type | cos|
    | save_period | 5|
    | save_dir | logs|
    | num_workers | 4|
    | num_train | 699|
    | num_val | 78|
    ----------------------------------------------------------------------
    Start Train
    Epoch 1/100:   0%|          | 0/174 [00:00<?, ?it/s<class 'dict'>]
    D:\Anaconda\envs\gpupytorch\lib\site-packages\numpy\_distributor_init.py:30: UserWarning: loaded more than 1 DLL from .libs:
    D:\Anaconda\envs\gpupytorch\lib\site-packages\numpy\.libs\libopenblas.PYQHXLVVQ7VESDPUVUADXEVJOBGHJPAY.gfortran-win_amd64.dll
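    The progress-bar total of 174 follows from the configuration table: 699 training images divided by the frozen-stage batch size of 4, rounded down. A small sketch of that arithmetic is below. The function name is illustrative, and floor division is an assumption that happens to match the 174 in the log; the unfrozen-stage and validation figures are computed here, not taken from the log.

```python
def steps_per_epoch(num_samples, batch_size):
    """Number of full batches in one pass over the data (incomplete last batch dropped)."""
    return num_samples // batch_size


# Values taken from the configuration table in the training log.
num_train, num_val = 699, 78
freeze_batch_size, unfreeze_batch_size = 4, 2

print(steps_per_epoch(num_train, freeze_batch_size))    # 174, as in "0/174"
print(steps_per_epoch(num_train, unfreeze_batch_size))  # 349
print(steps_per_epoch(num_val, freeze_batch_size))      # 19
```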

    Check the results:

    Calculate Map.
    96.35% = boar AP || score_threhold=0.5 : F1=0.81 ; Recall=97.92% ; Precision=69.12%
    94.74% = leopard AP || score_threhold=0.5 : F1=0.90 ; Recall=94.74% ; Precision=85.71%
    94.97% = roe_deer AP || score_threhold=0.5 : F1=0.86 ; Recall=96.88% ; Precision=77.50%
    mAP = 95.35%
    Get map done.
    Epoch:100/100
    Total Loss: 0.505 || Val Loss: 0.621
    Save best model to best_epoch_weights.pth
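    The F1 values and the mAP in this report can be cross-checked from the printed figures: F1 is the harmonic mean of precision and recall, and mAP is the mean of the per-class APs. A short sketch using the numbers above (the dictionary layout and names are illustrative):

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Per-class figures copied from the evaluation output (AP in percent).
report = {
    "boar":     {"ap": 96.35, "recall": 0.9792, "precision": 0.6912},
    "leopard":  {"ap": 94.74, "recall": 0.9474, "precision": 0.8571},
    "roe_deer": {"ap": 94.97, "recall": 0.9688, "precision": 0.7750},
}

for name, r in report.items():
    print(name, round(f1_score(r["precision"], r["recall"]), 2))

mean_ap = sum(r["ap"] for r in report.values()) / len(report)
print("mAP", round(mean_ap, 2))
```

    The recomputed values (0.81, 0.90, 0.86 and 95.35) agree with the report. The gap between high recall and lower precision, most visible for boar, means the model rarely misses an animal but produces a fair number of false positives at the 0.5 score threshold.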

  • Original article: https://blog.csdn.net/weixin_55429615/article/details/136491701