The authors treat face detection as a general object detection task (face detection is just a sub-task of general object detection) and, building on the YOLOv5 object detection codebase, propose an improved face detector, YOLO5Face.
2 Related Work
Object Detection
traditional
two-stage
one-stage
Face Detection: has to cope with scale, pose, occlusion, expression, makeup, illumination, blur, etc.
YOLO
3 Advantages / Contributions
Design the YOLO5Face face detector
Propose detectors of different sizes for different application requirements
Evaluate on WiderFace and achieve SOTA (on the validation set)
4 Method
1)Network Architecture
Improvements over YOLOv5
add a landmark regression head to the YOLOv5 network, trained with Wing loss
replace the Focus layer with a Stem block structure (Figure 1(d))
change the SPP block to use smaller kernels (13x13, 9x9, 5x5 changed to 7x7, 5x5, 3x3)
add a P6 output block with a stride of 64 (improves detection of large faces)
adjust the data augmentation strategy: up-down flipping is removed, Mosaic does not combine well with small targets, and random cropping works well
design two super light-weight models based on ShuffleNetV2 (YOLOv5n / YOLOv5n-0.5)
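To make the effect of the added P6 output concrete, the sketch below computes the grid size of each output level for a square input. The 640-pixel input and the helper name `grid_sizes` are assumptions for illustration only; the strides 8/16/32 for P3 to P5 are those of standard YOLOv5, and 64 is the P6 stride mentioned above.

```python
# Strides of the detection outputs; P6 (stride 64) is the level added in YOLO5Face.
STRIDES = {"P3": 8, "P4": 16, "P5": 32, "P6": 64}


def grid_sizes(input_size, strides=STRIDES):
    """Return the side length of each output grid for a square input (hypothetical helper)."""
    if any(input_size % s for s in strides.values()):
        raise ValueError("input_size must be divisible by every stride")
    return {name: input_size // s for name, s in strides.items()}


if __name__ == "__main__":
    # Assumed input size of 640; prints the grid side length per output level.
    print(grid_sizes(640))
```

Each P6 cell covers a 64x64 pixel region of the input, which is consistent with the note that this level mainly helps with large faces.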
2)Landmark Regression
L1, L2, and smooth L1: these loss functions are not sensitive to small errors.
The Wing loss adopted here is more sensitive to small errors. The final loss consists of the object detection loss and the landmark loss.
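The Wing loss and the combined loss can be sketched as follows. This is a minimal pure-Python illustration, not the authors' implementation: the piecewise form follows the usual Wing loss definition (logarithmic for small errors, linear for large ones), while the values `w=10.0`, `eps=2.0`, the weight `lambda_l=0.5`, the weighted-sum form of the total loss, and all function names are assumptions.

```python
import math


def wing_loss(error, w=10.0, eps=2.0):
    """Wing loss for one coordinate error (prediction minus ground truth).

    w and eps are assumed example values, not taken from the notes.
    """
    x = abs(error)
    # Constant that makes the log part and the linear part meet at |x| = w.
    c = w - w * math.log(1.0 + w / eps)
    if x < w:
        # Log part: large gradient near zero, so small errors still matter.
        return w * math.log(1.0 + x / eps)
    # Linear part: behaves like L1 for large errors.
    return x - c


def landmark_loss(pred, target, w=10.0, eps=2.0):
    """Sum the Wing loss over all landmark coordinates (e.g. 10 values for 5 points)."""
    return sum(wing_loss(p - t, w, eps) for p, t in zip(pred, target))


def total_loss(object_loss, lm_loss, lambda_l=0.5):
    """Detection loss plus weighted landmark loss (lambda_l is an assumed weight)."""
    return object_loss + lambda_l * lm_loss
```

With these assumed parameters, an error of 0.1 yields a Wing loss of roughly 0.49, whereas L1 would give 0.1 and L2 would give 0.01, which illustrates why Wing loss reacts more strongly to small errors.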
5 Experiments
5.1 Datasets
WiderFace
contains 32,203 images and 393,703 faces
train/validation/test sets by ratio 50%/10%/40%
three levels of difficulty: Easy, Medium, and Hard.
FDDB
5171 faces annotated in 2845 images.
5.2 Ablation Study
Model architecture details and ablation experiments
Stem Block vs. Focus Layer
SPP with Smaller Size Kernels
P6 Output Block
Data Augmentation
Mosaic helps the mAP in the Hard dataset.
Mosaic has to be combined with ignoring small faces; otherwise the performance degrades dramatically.
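One possible reading of "ignoring small faces" is to drop face boxes that become too small once Mosaic has shrunk the source images. The sketch below illustrates that idea only; the function names, the 0.5 shrink factor, and the 8-pixel threshold are assumptions, and the notes do not say how the authors actually implement the rule.

```python
def scale_boxes(boxes, scale):
    """Scale (x1, y1, x2, y2) boxes, e.g. when Mosaic shrinks an image into a tile."""
    return [tuple(v * scale for v in box) for box in boxes]


def drop_small_faces(boxes, min_side=8.0):
    """Keep only boxes whose width and height are both at least min_side pixels.

    min_side is an assumed threshold, not a value from the notes.
    """
    kept = []
    for x1, y1, x2, y2 in boxes:
        if (x2 - x1) >= min_side and (y2 - y1) >= min_side:
            kept.append((x1, y1, x2, y2))
    return kept


if __name__ == "__main__":
    faces = [(0, 0, 40, 40), (0, 0, 10, 10)]
    # After an assumed 0.5x shrink, the 10-pixel face is only 5 pixels wide and is dropped.
    print(drop_small_faces(scale_boxes(faces, 0.5)))
```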
5.3 YOLO5Face for Face Recognition
Landmark comparison vs. RetinaFace: at large face angles the authors' method is somewhat more accurate.
5.4 YOLO5Face on WiderFace Dataset
SCRFD looks very strong.
See the PR curves.
validation dataset: the YOLOv5x6-Face detector achieves 96.9%, 96.0%, 91.6% mAP on the Easy, Medium, and Hard subsets (better than the previous SOTA)
test dataset: the YOLOv5x6-Face detector achieves 95.8%, 94.9%, 90.5% mAP on the Easy, Medium, and Hard subsets (below the SOTA)
we only use multiple scales and left-right flipping without using other test-time augmentation (TTA) methods.
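For multi-scale and left-right flip testing, detections from each augmented pass have to be mapped back to the original image coordinates before they are merged. The sketch below shows only that mapping step for boxes; the function names are hypothetical, the merge step (for example NMS) is not shown, and nothing here is taken from the authors' code.

```python
def unflip_boxes(boxes, image_width):
    """Map (x1, y1, x2, y2) boxes detected on a left-right flipped image back."""
    # Flipping mirrors x around the image width; x1 and x2 swap roles so x1 <= x2 holds.
    return [(image_width - x2, y1, image_width - x1, y2) for x1, y1, x2, y2 in boxes]


def rescale_boxes(boxes, scale):
    """Map boxes detected on an image resized by `scale` back to the original size."""
    return [tuple(v / scale for v in box) for box in boxes]


if __name__ == "__main__":
    # A box detected on a flipped image of width 100.
    print(unflip_boxes([(10, 20, 30, 40)], 100))
    # A box detected on an image upscaled by an assumed factor of 2.
    print(rescale_boxes([(20, 40, 60, 80)], 2.0))
```

Landmarks would additionally need their left and right points swapped after un-flipping, which is omitted here.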