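IOU (intersection over union) measures how much two bounding boxes overlap. It is widely used in object detection to evaluate how accurate a predicted box is. This post briefly covers the principle behind IOU and its implementation.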
Suppose there are two boxes, $M$ and $N$, with areas $A_M$ and $A_N$. Their IOU is

$$IOU(M,N) = \frac{|M\cap N|}{|M\cup N|}$$

where $M\cap N$ is the intersection of regions $M$ and $N$, whose area is written $A_{MN}$, and $M\cup N$ is the union of regions $M$ and $N$. It follows that

$$IOU(M,N) = \frac{|M\cap N|}{|M\cup N|} = \frac{A_{MN}}{A_M+A_N-A_{MN}}$$

So when $A_{MN}=0$ ($M$ and $N$ do not overlap), $IOU(M,N)=0$; when $A_{MN}>0$ ($M$ and $N$ overlap at least partially), $IOU(M,N)>0$; and $IOU(M,N)=1$ if and only if $A_{MN}=A_M=A_N$ (regions $M$ and $N$ coincide exactly, so their areas are equal).
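To make the area formula concrete, here is a minimal plain-Python sketch. The helper name `iou_xyxy` and the `(x1, y1, x2, y2)` tuple layout are assumptions made for this illustration only.

```python
def iou_xyxy(m, n):
    """IOU of two axis-aligned boxes given as (x1, y1, x2, y2) tuples."""
    # Overlap width and height, clamped at 0 when the boxes do not intersect.
    inter_w = max(0.0, min(m[2], n[2]) - max(m[0], n[0]))
    inter_h = max(0.0, min(m[3], n[3]) - max(m[1], n[1]))
    a_mn = inter_w * inter_h
    a_m = (m[2] - m[0]) * (m[3] - m[1])
    a_n = (n[2] - n[0]) * (n[3] - n[1])
    union = a_m + a_n - a_mn
    # Guard against two degenerate (zero-area) boxes.
    return a_mn / union if union > 0 else 0.0


M = (0.0, 0.0, 2.0, 2.0)
N = (1.0, 1.0, 3.0, 3.0)
print(iou_xyxy(M, N))                     # A_MN = 1, A_M = A_N = 4, so IOU = 1 / 7
print(iou_xyxy(M, M))                     # identical boxes: 1.0
print(iou_xyxy(M, (5.0, 5.0, 6.0, 6.0)))  # disjoint boxes: 0.0
```

The three calls correspond to the three cases above: partial overlap, exact coincidence, and no overlap.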
There are two ways to use IOU directly as a loss. One is $Loss_{iou} = -\ln(IOU)$, but the form used more often in practice is $Loss_{iou} = 1-IOU$.
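The two loss forms can be compared side by side with a small sketch. The function name `iou_losses` is hypothetical, and clamping IOU at `eps` before taking the logarithm is an assumption added here so that the log form stays finite when IOU is 0.

```python
import math


def iou_losses(iou, eps=1e-7):
    """Return (-ln(IOU), 1 - IOU) for a scalar IOU in [0, 1]."""
    # Assumption: clamp at eps so that -ln(0) does not blow up.
    log_loss = -math.log(max(iou, eps))
    linear_loss = 1.0 - iou
    return log_loss, linear_loss


for iou in (1.0, 0.5, 0.1):
    print(iou, iou_losses(iou))
```

Both forms are 0 for a perfect match. The log form grows without bound as IOU approaches 0, while the linear form stays within [0, 1].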
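The IOU loss has two drawbacks. First, it only describes the overlap area and says nothing about the distance between the two boxes. Second, for boxes that do not intersect, IOU is 0, so the gradient is 0 and the loss is hard to optimize.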
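The zero-gradient problem can be checked numerically. The sketch below is an illustration under assumptions, not part of the YOLOv5-style code: it uses plain-Python boxes in `(x1, y1, x2, y2)` form and a central finite difference (the helper `loss_slope` is hypothetical) in place of autograd.

```python
def iou_xyxy(m, n):
    """IOU of two axis-aligned boxes given as (x1, y1, x2, y2) tuples."""
    inter_w = max(0.0, min(m[2], n[2]) - max(m[0], n[0]))
    inter_h = max(0.0, min(m[3], n[3]) - max(m[1], n[1]))
    inter = inter_w * inter_h
    union = (m[2] - m[0]) * (m[3] - m[1]) + (n[2] - n[0]) * (n[3] - n[1]) - inter
    return inter / union if union > 0 else 0.0


def loss_slope(pred, target, dx=1e-3):
    """Central finite difference of (1 - IOU) w.r.t. a horizontal shift of pred."""
    def loss(shift):
        moved = (pred[0] + shift, pred[1], pred[2] + shift, pred[3])
        return 1.0 - iou_xyxy(moved, target)
    return (loss(dx) - loss(-dx)) / (2 * dx)


target = (0.0, 0.0, 2.0, 2.0)
print(loss_slope((1.0, 0.0, 3.0, 2.0), target))    # overlapping: non-zero slope
print(loss_slope((3.0, 0.0, 5.0, 2.0), target))    # disjoint, close: slope is 0
print(loss_slope((10.0, 0.0, 12.0, 2.0), target))  # disjoint, far: slope is 0
```

For the two disjoint predictions the loss is flat, so it gives no signal about which direction moves the box toward the target, and it cannot tell the near box from the far one.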
The implementation below follows YOLOv5. It adds an eps term and support for the xywh format, which makes the function more general.
def bbox_iou(box1, box2, xywh=True, eps=1e-7):
    # box1 and box2 are torch tensors that unpack into four components,
    # e.g. shape (4,) for a single box or (4, N) for N boxes.
    if xywh:  # transform from xywh (center x, center y, width, height) to xyxy
        (x1, y1, w1, h1), (x2, y2, w2, h2) = box1, box2
        w1_, h1_, w2_, h2_ = w1 / 2, h1 / 2, w2 / 2, h2 / 2
        b1_x1, b1_x2, b1_y1, b1_y2 = x1 - w1_, x1 + w1_, y1 - h1_, y1 + h1_
        b2_x1, b2_x2, b2_y1, b2_y2 = x2 - w2_, x2 + w2_, y2 - h2_, y2 + h2_
    else:  # x1, y1, x2, y2 = box1
        b1_x1, b1_y1, b1_x2, b1_y2 = box1
        b2_x1, b2_y1, b2_x2, b2_y2 = box2
        w1, h1 = b1_x2 - b1_x1, b1_y2 - b1_y1 + eps
        w2, h2 = b2_x2 - b2_x1, b2_y2 - b2_y1 + eps

    # intersection area: overlap width times overlap height, each clamped at 0
    inter_area = (b1_x2.minimum(b2_x2) - b1_x1.maximum(b2_x1)).clamp(0) * \
                 (b1_y2.minimum(b2_y2) - b1_y1.maximum(b2_y1)).clamp(0)
    union = w1 * h1 + w2 * h2 - inter_area + eps  # eps keeps union from being 0
    iou = inter_area / union
    return iou
GIOU is an improvement on IOU. On top of IOU, it takes the relative position of the two boxes into account, and it also addresses the problem that IOU is hard to optimize when it is 0. GIOU is defined as

$$GIOU(M,N)=IOU(M,N)-\frac{|C-(M\cup N)|}{|C|}$$

where $C$ is the minimum enclosing rectangle of $M$ and $N$.
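Because of the minimum enclosing rectangle, GIOU still varies when $|M\cap N|=0$: IOU is 0, so $GIOU=-\frac{|C-(M\cup N)|}{|C|}$, which approaches $-1$ as the boxes move farther apart. This gives a non-zero gradient where IOU has none, and the term also captures how the non-overlapping parts of the two boxes relate.
However, GIOU has to compute the enclosing rectangle, which adds computation. Also, because the enclosing rectangle changes with every prediction, the optimization can be hard to keep stable.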
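As a quick check of how GIOU behaves for non-overlapping boxes, the following sketch evaluates it on plain-Python `(x1, y1, x2, y2)` tuples. The helper `giou_xyxy` is hypothetical and assumes both boxes have non-zero area.

```python
def giou_xyxy(m, n):
    """GIOU of two axis-aligned boxes given as (x1, y1, x2, y2) tuples."""
    inter_w = max(0.0, min(m[2], n[2]) - max(m[0], n[0]))
    inter_h = max(0.0, min(m[3], n[3]) - max(m[1], n[1]))
    inter = inter_w * inter_h
    union = (m[2] - m[0]) * (m[3] - m[1]) + (n[2] - n[0]) * (n[3] - n[1]) - inter
    iou = inter / union
    # Area of the minimum enclosing rectangle C.
    c_w = max(m[2], n[2]) - min(m[0], n[0])
    c_h = max(m[3], n[3]) - min(m[1], n[1])
    c_area = c_w * c_h
    return iou - (c_area - union) / c_area


target = (0.0, 0.0, 2.0, 2.0)
print(giou_xyxy(target, target))                  # identical boxes: 1.0
print(giou_xyxy(target, (3.0, 0.0, 5.0, 2.0)))    # disjoint, close: -(10 - 8) / 10
print(giou_xyxy(target, (10.0, 0.0, 12.0, 2.0)))  # disjoint, far: -(24 - 8) / 24
```

Both disjoint predictions have IOU 0, but the farther one gets the lower GIOU, so the loss can tell them apart.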
def bbox_giou(box1, box2, xywh=True, eps=1e-7):
    # box1 and box2 are torch tensors that unpack into four components
    if xywh:  # transform from xywh to xyxy
        (x1, y1, w1, h1), (x2, y2, w2, h2) = box1, box2
        w1_, h1_, w2_, h2_ = w1 / 2, h1 / 2, w2 / 2, h2 / 2
        b1_x1, b1_x2, b1_y1, b1_y2 = x1 - w1_, x1 + w1_, y1 - h1_, y1 + h1_
        b2_x1, b2_x2, b2_y1, b2_y2 = x2 - w2_, x2 + w2_, y2 - h2_, y2 + h2_
    else:  # x1, y1, x2, y2 = box1
        b1_x1, b1_y1, b1_x2, b1_y2 = box1
        b2_x1, b2_y1, b2_x2, b2_y2 = box2
        w1, h1 = b1_x2 - b1_x1, b1_y2 - b1_y1 + eps
        w2, h2 = b2_x2 - b2_x1, b2_y2 - b2_y1 + eps

    inter_area = (b1_x2.minimum(b2_x2) - b1_x1.maximum(b2_x1)).clamp(0) * \
                 (b1_y2.minimum(b2_y2) - b1_y1.maximum(b2_y1)).clamp(0)
    union = w1 * h1 + w2 * h2 - inter_area + eps  # eps keeps union from being 0
    iou = inter_area / union

    # width and height of the minimum enclosing rectangle
    cw = b1_x2.maximum(b2_x2) - b1_x1.minimum(b2_x1)
    ch = b1_y2.maximum(b2_y2) - b1_y1.minimum(b2_y1)
    mini_rectangle = cw * ch + eps  # area of the minimum enclosing rectangle
    return iou - (mini_rectangle - union) / mini_rectangle
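DIOU is another improvement based on IOU. It uses the Euclidean distance $p$ between the center points of the two boxes to express how far apart they are, and normalizes $p$ by the diagonal length $c$ of their minimum enclosing rectangle, giving

$$DIOU(M,N) = IOU(M,N)-\frac{p^2}{c^2}$$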
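DIOU works well in practice: it accounts for the relative position of the boxes and measures it with a normalized quantity. As a result it tends to converge fairly quickly in training, which makes it a good choice of box regression loss.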
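A small worked case shows what the center-distance term adds. In the sketch below (plain Python, hypothetical helper `diou_xyxy`, boxes as `(x1, y1, x2, y2)` tuples with non-zero area), a small box lies entirely inside a larger one. IOU is the same wherever the small box sits, and so is GIOU, because the enclosing rectangle equals the union. DIOU still prefers the centered placement.

```python
def diou_xyxy(m, n):
    """DIOU of two axis-aligned boxes given as (x1, y1, x2, y2) tuples."""
    inter_w = max(0.0, min(m[2], n[2]) - max(m[0], n[0]))
    inter_h = max(0.0, min(m[3], n[3]) - max(m[1], n[1]))
    inter = inter_w * inter_h
    union = (m[2] - m[0]) * (m[3] - m[1]) + (n[2] - n[0]) * (n[3] - n[1]) - inter
    iou = inter / union
    # Squared diagonal of the minimum enclosing rectangle.
    c_w = max(m[2], n[2]) - min(m[0], n[0])
    c_h = max(m[3], n[3]) - min(m[1], n[1])
    c2 = c_w ** 2 + c_h ** 2
    # Squared center distance: the coordinate sums are twice the centers.
    p2 = ((m[0] + m[2] - n[0] - n[2]) ** 2 + (m[1] + m[3] - n[1] - n[3]) ** 2) / 4
    return iou - p2 / c2


target = (0.0, 0.0, 4.0, 4.0)
print(diou_xyxy(target, (1.0, 1.0, 3.0, 3.0)))  # centered inside: 0.25
print(diou_xyxy(target, (0.0, 0.0, 2.0, 2.0)))  # in a corner: 0.25 - 2 / 32 = 0.1875
```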
def bbox_diou(box1, box2, xywh=True, eps=1e-7):
    # box1 and box2 are torch tensors that unpack into four components
    if xywh:  # transform from xywh to xyxy
        (x1, y1, w1, h1), (x2, y2, w2, h2) = box1, box2
        w1_, h1_, w2_, h2_ = w1 / 2, h1 / 2, w2 / 2, h2 / 2
        b1_x1, b1_x2, b1_y1, b1_y2 = x1 - w1_, x1 + w1_, y1 - h1_, y1 + h1_
        b2_x1, b2_x2, b2_y1, b2_y2 = x2 - w2_, x2 + w2_, y2 - h2_, y2 + h2_
    else:  # x1, y1, x2, y2 = box1
        b1_x1, b1_y1, b1_x2, b1_y2 = box1
        b2_x1, b2_y1, b2_x2, b2_y2 = box2
        w1, h1 = b1_x2 - b1_x1, b1_y2 - b1_y1 + eps
        w2, h2 = b2_x2 - b2_x1, b2_y2 - b2_y1 + eps

    inter_area = (b1_x2.minimum(b2_x2) - b1_x1.maximum(b2_x1)).clamp(0) * \
                 (b1_y2.minimum(b2_y2) - b1_y1.maximum(b2_y1)).clamp(0)
    union = w1 * h1 + w2 * h2 - inter_area + eps  # eps keeps union from being 0
    iou = inter_area / union

    # width and height of the minimum enclosing rectangle
    cw = b1_x2.maximum(b2_x2) - b1_x1.minimum(b2_x1)
    ch = b1_y2.maximum(b2_y2) - b1_y1.minimum(b2_y1)
    c2 = cw.pow(2) + ch.pow(2) + eps  # squared diagonal of the enclosing rectangle
    # squared distance between the two centers: each sum is twice a center
    # coordinate, so the squared difference is divided by 4
    rho2 = ((b2_x1 + b2_x2 - b1_x1 - b1_x2).pow(2) + (b2_y1 + b2_y2 - b1_y1 - b1_y2).pow(2)) / 4
    return iou - rho2 / c2
CIOU builds on DIOU by also considering the aspect ratio. In many object detection tasks the aspect ratio of the objects is fairly stable, so including it in the loss should, in theory, help box regression converge. CIOU is

$$CIOU(M,N) = IOU(M,N)-\frac{p^2}{c^2}-\alpha v$$
Here $\alpha$ is a weighting parameter, defined as

$$\alpha = \frac{v}{(1-IOU)+v}$$

In practice, $\alpha$ can also be given a fixed value.
The term $v$ measures the difference between the two aspect ratios:

$$v = \frac{4}{\pi^2}\left(\arctan\frac{w_{M}}{h_{M}}-\arctan\frac{w_N}{h_N}\right)^2$$

The aspect ratio ranges over $[0,\infty)$, so $\arctan$ maps it into $[0,\frac{\pi}{2})$, and the factor $\frac{4}{\pi^2}$ normalizes $v$ to $[0,1)$.
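To get a feel for the size of $v$ and $\alpha$, here is a minimal scalar sketch. The function names `aspect_term` and `alpha_weight` are hypothetical, and the small `eps` in the denominator is an assumption that guards against division by zero when IOU is 1 and $v$ is 0.

```python
import math


def aspect_term(w_m, h_m, w_n, h_n):
    """v: normalized squared difference of the two arctan aspect ratios."""
    return (4 / math.pi ** 2) * (math.atan(w_m / h_m) - math.atan(w_n / h_n)) ** 2


def alpha_weight(v, iou, eps=1e-7):
    """alpha = v / ((1 - IOU) + v), with eps added to avoid 0 / 0."""
    return v / ((1.0 - iou) + v + eps)


v_same = aspect_term(2.0, 1.0, 4.0, 2.0)  # same 2:1 ratio, different size
v_diff = aspect_term(2.0, 1.0, 1.0, 2.0)  # 2:1 against 1:2
print(v_same, alpha_weight(v_same, iou=0.5))
print(v_diff, alpha_weight(v_diff, iou=0.5))
```

Boxes with the same aspect ratio give $v=0$ whatever their sizes, so the penalty $\alpha v$ vanishes for them.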
import math


def bbox_ciou(box1, box2, xywh=True, eps=1e-7):
    # box1 and box2 are torch tensors that unpack into four components
    if xywh:  # transform from xywh to xyxy
        (x1, y1, w1, h1), (x2, y2, w2, h2) = box1, box2
        w1_, h1_, w2_, h2_ = w1 / 2, h1 / 2, w2 / 2, h2 / 2
        b1_x1, b1_x2, b1_y1, b1_y2 = x1 - w1_, x1 + w1_, y1 - h1_, y1 + h1_
        b2_x1, b2_x2, b2_y1, b2_y2 = x2 - w2_, x2 + w2_, y2 - h2_, y2 + h2_
    else:  # x1, y1, x2, y2 = box1
        b1_x1, b1_y1, b1_x2, b1_y2 = box1
        b2_x1, b2_y1, b2_x2, b2_y2 = box2
        w1, h1 = b1_x2 - b1_x1, b1_y2 - b1_y1 + eps
        w2, h2 = b2_x2 - b2_x1, b2_y2 - b2_y1 + eps

    inter_area = (b1_x2.minimum(b2_x2) - b1_x1.maximum(b2_x1)).clamp(0) * \
                 (b1_y2.minimum(b2_y2) - b1_y1.maximum(b2_y1)).clamp(0)
    union = w1 * h1 + w2 * h2 - inter_area + eps  # eps keeps union from being 0
    iou = inter_area / union

    # width and height of the minimum enclosing rectangle
    cw = b1_x2.maximum(b2_x2) - b1_x1.minimum(b2_x1)
    ch = b1_y2.maximum(b2_y2) - b1_y1.minimum(b2_y1)
    c2 = cw.pow(2) + ch.pow(2) + eps  # squared diagonal of the enclosing rectangle
    # squared distance between the two centers: each sum is twice a center
    # coordinate, so the squared difference is divided by 4
    rho2 = ((b2_x1 + b2_x2 - b1_x1 - b1_x2).pow(2) + (b2_y1 + b2_y2 - b1_y1 - b1_y2).pow(2)) / 4
    v = (4 / math.pi ** 2) * ((w2 / h2).atan() - (w1 / h1).atan()).pow(2)  # aspect-ratio term v
    alpha = v / (v - iou + (1 + eps))  # weight alpha = v / ((1 - IOU) + v)
    return iou - rho2 / c2 - v * alpha
EIOU replaces the aspect-ratio term with separate width and height terms, written in the same normalized-distance form as the center-point term:

$$EIOU(M,N) = IOU(M,N)-\frac{p(b_M,b_N)^2}{c^2}-\frac{p(w_M,w_N)^2}{c_w^2}-\frac{p(h_M,h_N)^2}{c_h^2}$$

where $b_M$ and $b_N$ are the box centers, and $c_w$ and $c_h$ are the width and height of the minimum enclosing rectangle. EIOU is more direct to compute than CIOU and constrains width and height more explicitly.
def bbox_eiou(box1, box2, xywh=True, eps=1e-7):
    # box1 and box2 are torch tensors that unpack into four components
    if xywh:  # transform from xywh to xyxy
        (x1, y1, w1, h1), (x2, y2, w2, h2) = box1, box2
        w1_, h1_, w2_, h2_ = w1 / 2, h1 / 2, w2 / 2, h2 / 2
        b1_x1, b1_x2, b1_y1, b1_y2 = x1 - w1_, x1 + w1_, y1 - h1_, y1 + h1_
        b2_x1, b2_x2, b2_y1, b2_y2 = x2 - w2_, x2 + w2_, y2 - h2_, y2 + h2_
    else:  # x1, y1, x2, y2 = box1
        b1_x1, b1_y1, b1_x2, b1_y2 = box1
        b2_x1, b2_y1, b2_x2, b2_y2 = box2
        w1, h1 = b1_x2 - b1_x1, b1_y2 - b1_y1 + eps
        w2, h2 = b2_x2 - b2_x1, b2_y2 - b2_y1 + eps

    inter_area = (b1_x2.minimum(b2_x2) - b1_x1.maximum(b2_x1)).clamp(0) * \
                 (b1_y2.minimum(b2_y2) - b1_y1.maximum(b2_y1)).clamp(0)
    union = w1 * h1 + w2 * h2 - inter_area + eps  # eps keeps union from being 0
    iou = inter_area / union

    # width and height of the minimum enclosing rectangle
    cw = b1_x2.maximum(b2_x2) - b1_x1.minimum(b2_x1)
    ch = b1_y2.maximum(b2_y2) - b1_y1.minimum(b2_y1)
    c2 = cw.pow(2) + ch.pow(2) + eps  # squared diagonal of the enclosing rectangle
    # squared distance between the two centers: each sum is twice a center
    # coordinate, so the squared difference is divided by 4
    rho2 = ((b2_x1 + b2_x2 - b1_x1 - b1_x2).pow(2) + (b2_y1 + b2_y2 - b1_y1 - b1_y2).pow(2)) / 4
    rhow = (w1 - w2).pow(2) / (cw.pow(2) + eps)  # normalized squared width difference
    rhoh = (h1 - h2).pow(2) / (ch.pow(2) + eps)  # normalized squared height difference
    return iou - rho2 / c2 - rhow - rhoh
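The width and height terms matter most when two boxes share a center and an aspect ratio but differ in size. In that case the center-distance term is 0 and CIOU's $v$ is 0, so CIOU reduces to IOU, while EIOU still penalizes the size mismatch. The sketch below uses a hypothetical plain-Python helper `eiou_xyxy` on `(x1, y1, x2, y2)` tuples and assumes boxes with non-zero area.

```python
def eiou_xyxy(m, n):
    """EIOU of two axis-aligned boxes given as (x1, y1, x2, y2) tuples."""
    w_m, h_m = m[2] - m[0], m[3] - m[1]
    w_n, h_n = n[2] - n[0], n[3] - n[1]
    inter_w = max(0.0, min(m[2], n[2]) - max(m[0], n[0]))
    inter_h = max(0.0, min(m[3], n[3]) - max(m[1], n[1]))
    inter = inter_w * inter_h
    iou = inter / (w_m * h_m + w_n * h_n - inter)
    # Width and height of the minimum enclosing rectangle.
    c_w = max(m[2], n[2]) - min(m[0], n[0])
    c_h = max(m[3], n[3]) - min(m[1], n[1])
    # Squared center distance: the coordinate sums are twice the centers.
    p2 = ((m[0] + m[2] - n[0] - n[2]) ** 2 + (m[1] + m[3] - n[1] - n[3]) ** 2) / 4
    center_term = p2 / (c_w ** 2 + c_h ** 2)
    width_term = (w_m - w_n) ** 2 / c_w ** 2
    height_term = (h_m - h_n) ** 2 / c_h ** 2
    return iou - center_term - width_term - height_term


target = (0.0, 0.0, 4.0, 4.0)
print(eiou_xyxy(target, (1.0, 1.0, 3.0, 3.0)))  # IOU 0.25, EIOU 0.25 - 0.25 - 0.25 = -0.25
print(eiou_xyxy(target, (0.5, 0.5, 3.5, 3.5)))  # IOU 0.5625, EIOU 0.5625 - 0.125 = 0.4375
```

Both predictions are squares centered on the target, so only the width and height terms separate EIOU from IOU here.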