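In the raw output of an object detector, a single object often corresponds to several bounding boxes (bounding box, bb), and these boxes usually overlap one another. How do we pick the most suitable one, that is, the one closest to the ground-truth box? The usual approach is NMS (non-maximum suppression).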

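As the name suggests, non-maximum suppression works roughly as follows: among a group of boxes, the one with the highest confidence (class score) is taken as the reference; any other box that overlaps it heavily is assumed to detect the same object and is removed, i.e. suppressed. Given a set of candidate boxes, NMS is run separately for each class, as follows:
1. Find the box with the highest confidence; it is treated as a target and becomes box 1;
2. Compute the overlap (IoU) between box 1 and every other box of the same class; any box whose IoU exceeds a given threshold is suppressed;
3. Among the remaining boxes, again take the one with the highest confidence as box 2 and suppress the boxes that overlap it;
4. Repeat the steps above until only one box is left; that last box is also kept as a target box.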
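The four steps above can be sketched in plain Python before looking at a NumPy version. This is a minimal sketch, not a reference implementation: the helper names `iou` and `greedy_nms` are made up for illustration, boxes are assumed to be `(start_x, start_y, end_x, end_y)` tuples with inclusive pixel coordinates (hence the `+ 1`), and a box is suppressed when its IoU is not below the threshold. The sample boxes and scores are the ones used in this section's script.

```python
def iou(box_a, box_b):
    """IoU of two boxes given as (start_x, start_y, end_x, end_y).

    Assumption: coordinates are inclusive pixel indices, hence the +1.
    """
    # Corners of the intersection rectangle
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])

    # Width and height are clamped to 0 when the boxes do not overlap
    inter = max(0, x2 - x1 + 1) * max(0, y2 - y1 + 1)
    area_a = (box_a[2] - box_a[0] + 1) * (box_a[3] - box_a[1] + 1)
    area_b = (box_b[2] - box_b[0] + 1) * (box_b[3] - box_b[1] + 1)
    return inter / (area_a + area_b - inter)


def greedy_nms(boxes, scores, threshold):
    """Return the kept (box, score) pairs for one class, highest score first."""
    # Candidate indices sorted by descending confidence
    candidates = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    while candidates:
        # Steps 1 and 3: take the most confident remaining box
        best = candidates.pop(0)
        kept.append((boxes[best], scores[best]))
        # Step 2: drop the candidates that overlap the picked box too much
        candidates = [i for i in candidates
                      if iou(boxes[best], boxes[i]) < threshold]
    # Step 4: the loop ends once no candidate is left
    return kept


sample_boxes = [(187, 82, 337, 317), (150, 67, 305, 282), (246, 121, 368, 304)]
sample_scores = [0.9, 0.75, 0.8]
print(greedy_nms(sample_boxes, sample_scores, threshold=0.4))
# prints [((187, 82, 337, 317), 0.9)]
```

With a threshold of 0.4 only the 0.9 box survives, because the other two overlap it with IoU of roughly 0.53 and 0.41. Raising the threshold to 0.45 would also keep the third box, which shows how sensitive the result is to this parameter.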
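# Requirement
# Use NMS to filter the candidate boxes on an image according to an IoU threshold
"""
1. Import packages
2. Define nms over the candidate bounding boxes
   2.1 Bounding boxes
   2.2 Box coordinates
   2.3 Confidence scores of the boxes
   2.4 Picked boxes
   2.5 Compute the box areas
3. Sort by confidence score
   3.1 Iterate over the boxes
   3.2 Index of the highest confidence score
   3.3 Pick the box with the highest confidence score
4. Compute the intersection coordinates for IoU
   4.1 Compute the intersection area
   4.2 Compute the ratio of intersection to union
5. Image setup: image name
   5.1 Bounding boxes
   5.2 Read the image file
   5.3 Copy the image as the original
   5.4 Drawing parameters
   5.5 IoU threshold
   5.6 Draw the boxes and confidence scores
   5.7 Run the non-max suppression algorithm
   5.8 Draw the boxes and confidence scores after non-maximum suppression
   5.9 Show the images
"""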
import cv2
import numpy as np


# Define nms over the candidate bounding boxes
def nms(bounding_boxes, confidence_score, threshold):
    """
    Non-max Suppression Algorithm

    @param list  Object candidate bounding boxes
    @param list  Confidence score of bounding boxes
    @param float IoU threshold
    @return Rest boxes after nms operation
    """
    # No candidates: nothing to pick
    if len(bounding_boxes) == 0:
        return [], []

    # Bounding boxes
    boxes = np.array(bounding_boxes)

    # Coordinates of bounding boxes
    start_x = boxes[:, 0]
    start_y = boxes[:, 1]
    end_x = boxes[:, 2]
    end_y = boxes[:, 3]

    # Confidence scores of bounding boxes
    score = np.array(confidence_score)

    # Picked bounding boxes
    picked_boxes = []
    picked_score = []

    # Compute areas of bounding boxes (inclusive pixel coordinates, hence +1)
    areas = (end_x - start_x + 1) * (end_y - start_y + 1)

    # Sort by confidence score of bounding boxes (ascending)
    order = np.argsort(score)

    # Iterate bounding boxes
    while order.size > 0:
        # The index of largest confidence score (last element after sorting)
        index = order[-1]

        # Pick the bounding box with largest confidence score
        picked_boxes.append(bounding_boxes[index])
        picked_score.append(confidence_score[index])

        # Compute coordinates of the intersection with every remaining box
        x1 = np.maximum(start_x[index], start_x[order[:-1]])
        x2 = np.minimum(end_x[index], end_x[order[:-1]])
        y1 = np.maximum(start_y[index], start_y[order[:-1]])
        y2 = np.minimum(end_y[index], end_y[order[:-1]])

        # Compute areas of the intersections (0 when boxes do not overlap)
        w = np.maximum(0.0, x2 - x1 + 1)
        h = np.maximum(0.0, y2 - y1 + 1)
        intersection = w * h

        # Compute the ratio between intersection and union (IoU)
        ratio = intersection / (areas[index] + areas[order[:-1]] - intersection)

        # Keep only the boxes whose IoU with the picked box is below the threshold
        left = np.where(ratio < threshold)
        order = order[left]

    return picked_boxes, picked_score
# Image name
image_name = '1.png'

# Bounding boxes
bounding_boxes = [(187, 82, 337, 317), (150, 67, 305, 282), (246, 121, 368, 304)]
confidence_score = [0.9, 0.75, 0.8]

# Read image (cv2.imread returns None when the file cannot be read)
image = cv2.imread(image_name)
if image is None:
    raise FileNotFoundError(image_name)

# Copy image as original
org = image.copy()

# Draw parameters
font = cv2.FONT_HERSHEY_SIMPLEX
font_scale = 1
thickness = 2

# IoU threshold
threshold = 0.4

# Draw bounding boxes and confidence score
for (start_x, start_y, end_x, end_y), confidence in zip(bounding_boxes, confidence_score):
    (w, h), baseline = cv2.getTextSize(str(confidence), font, font_scale, thickness)
    cv2.rectangle(org, (start_x, start_y - (2 * baseline + 5)), (start_x + w, start_y), (0, 255, 255), -1)
    cv2.rectangle(org, (start_x, start_y), (end_x, end_y), (0, 255, 255), 2)
    cv2.putText(org, str(confidence), (start_x, start_y), font, font_scale, (0, 0, 0), thickness)

# Run non-max suppression algorithm
picked_boxes, picked_score = nms(bounding_boxes, confidence_score, threshold)

# Draw bounding boxes and confidence score after non-maximum suppression
for (start_x, start_y, end_x, end_y), confidence in zip(picked_boxes, picked_score):
    (w, h), baseline = cv2.getTextSize(str(confidence), font, font_scale, thickness)
    cv2.rectangle(image, (start_x, start_y - (2 * baseline + 5)), (start_x + w, start_y), (0, 255, 255), -1)
    cv2.rectangle(image, (start_x, start_y), (end_x, end_y), (0, 255, 255), 2)
    cv2.putText(image, str(confidence), (start_x, start_y), font, font_scale, (0, 0, 0), thickness)

# Show image
cv2.imshow('Original', org)
cv2.imshow('NMS', image)
cv2.waitKey(0)

Chapter 11: Overview of the YOLO Series
1. Classic deep learning detection methods

(1) Two-stage: the Faster R-CNN and Mask R-CNN family, which adds a region proposal network (RPN), i.e. pre-selected candidate boxes
Characteristics
Speed is usually low (about 5 FPS), but accuracy is usually good