NMS (Python Implementation)

The previous post covered the NMS interface in OpenCV's Python bindings:
https://blog.csdn.net/HaoZiHuang/article/details/126460067
In this post, let's implement it by hand:

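Step one, design the function signature. It is modelled on OpenCV's cv2.dnn.NMSBoxes, with two differences: boxes and scores are packed into a single dets array in xyxy form (cv2.dnn.NMSBoxes takes them separately, with boxes as x, y, w, h), and nms_threshold comes before score_threshold (cv2.dnn.NMSBoxes takes score_threshold first):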

def py_nms(dets, nms_threshold=0.5, score_threshold=0.5):

dets has shape (m, 5): the first 4 columns are the xyxy coordinates and the last column is the confidence conf.
So first check that the input is valid:

assert len(dets.shape) == 2   # check the rank first, otherwise shape[1] may not exist
assert dets.shape[1] == 5

Start with the simplest step, filtering out the low-confidence boxes:

# Filter out low-confidence boxes first
conf = dets[:, -1]
conf_bool_idx = (conf > score_threshold)
dets = dets[conf_bool_idx]

Next, pull out the xyxy coordinates and the score of every box:

x1 = dets[:, 0]
y1 = dets[:, 1]
x2 = dets[:, 2]
y2 = dets[:, 3]
scores = dets[:, 4]

(All 5 variables above are 1-D arrays.)

Compute the areas and sort by confidence:

# Compute the area of each box and sort the scores in descending order
areas = (x2 - x1 + 1) * (y2 - y1 + 1)
order = scores.argsort()[::-1]

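The +1 is there because the coordinates are treated as inclusive pixel indices: a box running from x1 to x2 covers x2 - x1 + 1 pixels.
The [::-1] is there because argsort sorts in ascending order and we need the confidences from highest to lowest.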
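
To make the two points above concrete, here is a small sketch. The three boxes are made up purely for illustration and are not taken from the sample data used later in this post.

```python
import numpy as np

# Three hypothetical boxes in (x1, y1, x2, y2, score) form, for illustration only.
dets = np.array([
    [0, 0, 9, 9, 0.6],   # covers pixels 0..9 in both directions: 10 x 10
    [0, 0, 4, 9, 0.9],   # 5 x 10 pixels
    [2, 2, 2, 2, 0.7],   # a single pixel, x1 == x2 and y1 == y2
])

x1, y1, x2, y2, scores = dets.T   # one 1-D array per column

areas = (x2 - x1 + 1) * (y2 - y1 + 1)   # 100, 50 and 1
order = scores.argsort()[::-1]          # indices 1, 2, 0 (scores 0.9, 0.7, 0.6)

print(areas)
print(order)
```

Without the +1, the single-pixel box would get an area of 0.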

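Next, compute the intersection between the highest-scoring box and the remaining boxes. First take the highest-scoring box:

i = order[0] # index of the highest-confidence box

Then intersect it with every remaining box: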

xx1 = np.maximum(x1[i], x1[order[1:]])      # rightmost of the two left edges
yy1 = np.maximum(y1[i], y1[order[1:]])      # lowest of the two top edges
xx2 = np.minimum(x2[i], x2[order[1:]])      # leftmost of the two right edges
yy2 = np.minimum(y2[i], y2[order[1:]])      # highest of the two bottom edges

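np.maximum and np.minimum return the element-wise larger and smaller values of their two arguments (arrays or scalars).
The four diagrams below help to think this through:

[Figure: four diagrams]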
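
As a rough illustration of the broadcasting involved, the sketch below intersects one reference box with two others. The boxes and the names ref and others are assumptions made up for this example.

```python
import numpy as np

# Hypothetical reference box and two other boxes, as (x1, y1, x2, y2).
ref = np.array([10, 10, 20, 20])
others = np.array([
    [15, 15, 30, 30],   # overlaps the reference box
    [40, 40, 50, 50],   # completely separate from it
])

# The scalar ref[k] is broadcast against a whole column of `others`.
xx1 = np.maximum(ref[0], others[:, 0])   # 15 and 40
yy1 = np.maximum(ref[1], others[:, 1])   # 15 and 40
xx2 = np.minimum(ref[2], others[:, 2])   # 20 and 20
yy2 = np.minimum(ref[3], others[:, 3])   # 20 and 20

print(xx1, yy1, xx2, yy2)
```

For the second box xx2 is smaller than xx1, which is how a missing overlap shows up in these corner coordinates.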

Next, compute the overlap area:

# Intersection area; 0 when the boxes do not overlap
w = np.maximum(0.0, xx2 - xx1 + 1)
h = np.maximum(0.0, yy2 - yy1 + 1)
inter = w * h

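If two boxes do not overlap, xx2 - xx1 + 1 <= 0 (or the same holds for yy), so w (or h) is clamped to 0 and the intersection area inter = 0.

Note that at this point inter and the x and y variables are all arrays (tensors).

Next, compute the IoU directly. Division by zero is generally not a concern, because $areas > 0$ and $inter \le areas$, so the denominator stays positive: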

ovr = inter / (areas[i] + areas[order[1:]] - inter)
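
For a quick sanity check of the formula, here is a scalar sketch for a single pair of boxes. The helper name iou_xyxy is hypothetical; the two boxes are the 0.99 and 0.96 boxes from the sample data in the complete code.

```python
def iou_xyxy(a, b):
    """IoU of two (x1, y1, x2, y2) boxes, using the inclusive-pixel (+1) convention."""
    w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]) + 1)
    h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]) + 1)
    inter = w * h
    area_a = (a[2] - a[0] + 1) * (a[3] - a[1] + 1)
    area_b = (b[2] - b[0] + 1) * (b[3] - b[1] + 1)
    return inter / (area_a + area_b - inter)

box_a = [20, 40, 80, 90]
box_b = [20, 38, 82, 88]

# inter = 61 * 49 = 2989, areas = 3111 and 3213, union = 3335
iou = iou_xyxy(box_a, box_b)
print(iou)   # 2989 / 3335, roughly 0.896
```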

Next, filter out the boxes whose IoU is above the threshold, in other words keep those whose IoU does not exceed nms_threshold:

inds = np.where(ovr <= nms_threshold)[0]

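Next, update order. The heavily overlapping boxes have already been filtered out, so we can index directly with inds.

Note the +1 on the index: ovr is one element shorter than order, because the highest-confidence box was taken out first and the IoU was computed between it and the rest. Position k in ovr therefore corresponds to position k + 1 in order.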

order = order[inds + 1]
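
A tiny sketch of what goes wrong without the +1. The IoU values here are made up for illustration.

```python
import numpy as np

nms_threshold = 0.5
order = np.array([1, 0, 2, 3, 4])        # box indices, best score first
ovr = np.array([0.0, 0.9, 0.0, 0.86])    # made-up IoUs of box 1 against order[1:]

inds = np.where(ovr <= nms_threshold)[0]   # positions 0 and 2 within order[1:]

wrong = order[inds]        # boxes 1 and 2: shifted by one, and box 1 was already kept
right = order[inds + 1]    # boxes 0 and 3: the ones that actually survived

print(wrong, right)
```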

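Then go back to picking the highest-confidence box and repeat.

A for loop does not fit here because order changes on every iteration, so a while loop is the right tool.
The loop ends when order has no elements left, meaning NMS has processed everything, so the loop keeps running as long as order.size > 0.

To summarize, the strategy is:

Each round, put the index of the highest-confidence box into a list of kept indices, named keep, then run NMS between that box and all the others, suppressing the lower-confidence boxes that overlap it too much.
Then take the highest-confidence box among those that remain, run NMS against the rest, and repeat.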
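
To watch the strategy at work, here is a rough tracing sketch. nms_trace is a hypothetical helper, not part of the implementation in this post: it skips the score filter (every sample score is above 0.5 anyway) and uses boolean masks instead of np.where, but follows the same greedy loop.

```python
import numpy as np

def nms_trace(dets, nms_threshold=0.5):
    """Hypothetical helper: run greedy NMS and record what each round does."""
    x1, y1, x2, y2, scores = dets.T
    areas = (x2 - x1 + 1) * (y2 - y1 + 1)
    order = scores.argsort()[::-1]
    rounds = []
    while order.size > 0:
        i, rest = order[0], order[1:]
        w = np.maximum(0.0, np.minimum(x2[i], x2[rest]) - np.maximum(x1[i], x1[rest]) + 1)
        h = np.maximum(0.0, np.minimum(y2[i], y2[rest]) - np.maximum(y1[i], y1[rest]) + 1)
        inter = w * h
        ovr = inter / (areas[i] + areas[rest] - inter)
        suppressed = rest[ovr > nms_threshold]      # boxes removed in this round
        rounds.append((int(i), suppressed.tolist()))
        order = rest[ovr <= nms_threshold]          # boxes that go on to the next round
    return rounds

dets = np.array([[100, 120, 170, 200, 0.98],
                 [20, 40, 80, 90, 0.99],
                 [20, 38, 82, 88, 0.96],
                 [200, 380, 282, 488, 0.9],
                 [19, 38, 75, 91, 0.8]])

for kept, suppressed in nms_trace(dets):
    print(f"keep box {kept}, suppress {suppressed}")
```

On this data the first round keeps box 1 and suppresses boxes 2 and 4; the next two rounds keep boxes 0 and 3 without suppressing anything.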

Here is the complete code:

import numpy as np
import cv2

def py_nms(dets, nms_threshold=0.5, score_threshold=0.5):
    
    """Pure Python NMS baseline."""
    
    assert len(dets.shape) == 2
    assert dets.shape[1] == 5
    
    # Filter out low-confidence boxes first
    conf = dets[:, -1]
    conf_bool_idx = (conf > score_threshold)
    dets = dets[conf_bool_idx]
    
    # tl_x, tl_y, br_x, br_y and score
    x1 = dets[:, 0]
    y1 = dets[:, 1]
    x2 = dets[:, 2]
    y2 = dets[:, 3]
    scores = dets[:, 4]
    

    # Compute the area of each box and sort the scores in descending order
    areas = (x2 - x1 + 1) * (y2 - y1 + 1)
    order = scores.argsort()[::-1]

    keep = []   # indices of the boxes to keep
    while order.size > 0:
        i = order[0]
        
        keep.append(i)
        
        # Intersection of the highest-scoring box with the remaining boxes
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])

        # Intersection area; 0 when the boxes do not overlap
        w = np.maximum(0.0, xx2 - xx1 + 1)
        h = np.maximum(0.0, yy2 - yy1 + 1)
        inter = w * h
        
        # IoU = intersection / (area1 + area2 - intersection)
        ovr = inter / (areas[i] + areas[order[1:]] - inter)

        # Keep the boxes whose IoU does not exceed the threshold
        inds = np.where(ovr <= nms_threshold)[0]
        order = order[inds + 1]   # +1 because ovr is one element shorter than order

    # Map back to indices into the original, unfiltered dets
    return np.where(conf_bool_idx)[0][keep]
    
if __name__ == '__main__':
    
    dets = np.array([[100,120,170,200,0.98],
                     [20,40,80,90,0.99],
                     [20,38,82,88,0.96],
                     [200,380,282,488,0.9],
                     [19,38,75,91, 0.8]])
                     
    res = py_nms(dets, 0.5, 0.5)
    print(dets[res])
    print(res)
    
    # dets_ = np.array(bbox_list)
    dets_ = dets
    # cv2.dnn.NMSBoxes expects boxes as (x, y, w, h), so convert from xyxy first
    boxes_xywh = dets_[:, :-1].copy()
    boxes_xywh[:, 2:] -= boxes_xywh[:, :2]
    # argument order here is score_threshold, then nms_threshold
    nms_idx = cv2.dnn.NMSBoxes(boxes_xywh, dets_[:, -1], 0.5, 0.5)
    print(dets_[nms_idx])
    print(nms_idx)

Here is the output:

[[ 20.    40.    80.    90.     0.99]
 [100.   120.   170.   200.     0.98]
 [200.   380.   282.   488.     0.9 ]]
[1 0 3]
[[ 20.    40.    80.    90.     0.99]
 [100.   120.   170.   200.     0.98]
 [200.   380.   282.   488.     0.9 ]]
[1 0 3]

One thing worth noting about these two lines:

inds = np.where(ovr <= nms_threshold)[0]
order = order[inds + 1]

If at some step NMS filters out all the remaining boxes, inds comes back as an empty array:

>>> inds
[]
>>> inds.shape
(0,)

So what does inds + 1, an empty array plus 1, return?

>>> inds + 1
[]

An empty array plus one is still an empty array.

And what does indexing order with an empty array return?

>>> order[inds + 1]
[]

Still an empty array. So the next time the loop condition order.size > 0 is checked, it is False and the loop exits.

Hats off to NumPy for handling even this case!
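
The sketch below replays that last round with only one box left. The variable values are assumptions chosen for illustration. It also shows a caveat: this works because np.where returns an integer array, whereas a plain np.array([]) is float64 and cannot be used as an index.

```python
import numpy as np

order = np.array([3])                  # only one box left
ovr = np.array([])                     # so there is nothing to compare it with
inds = np.where(ovr <= 0.5)[0]         # empty, but still an integer array

print(inds.shape, inds.dtype.kind)     # shape (0,), kind 'i'
print((inds + 1).shape)                # still (0,)
print(order[inds + 1].shape)           # still (0,), so order.size becomes 0

# Caveat: an empty float array is rejected as an index.
try:
    order[np.array([])]
except IndexError as exc:
    print("IndexError:", exc)
```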

The last thing to cover is the return statement:

return np.where(conf_bool_idx)[0][keep]

Because the very first step already filtered by the confidence threshold:

# Filter out low-confidence boxes first
conf = dets[:, -1]
conf_bool_idx = (conf > score_threshold)
dets = dets[conf_bool_idx]

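the indices in keep refer to the filtered array, so the final indices have to be mapped back to the array as it was before confidence filtering.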
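
A minimal sketch of that mapping, with made-up scores and a made-up keep list:

```python
import numpy as np

scores = np.array([0.3, 0.9, 0.2, 0.8])   # made-up confidences
mask = scores > 0.5                        # rows 1 and 3 survive the score filter
keep = [1, 0]                              # pretend NMS kept these rows of the FILTERED array

surviving = np.where(mask)[0]              # original row numbers of the survivors: 1 and 3
original_idx = surviving[keep]             # 3 and 1, indices into the unfiltered array

print(original_idx)
```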

Partly based on:

https://zhuanlan.zhihu.com/p/110483330


Original article: https://blog.csdn.net/HaoZiHuang/article/details/126463032