Class Activation Maps for YOLOv5

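In this tutorial we will see how to use EigenCAM, one of the gradient-free methods, with YOLOv5.

This is a much simpler version of the tutorial at https://github.com/jacobgil/pytorch-grad-cam/blob/master/tutorials/Class Activation Maps for Object Detection With Faster RCNN.ipynb, adapted for YOLOv5.

If you want to use another method, such as AblationCAM, refer to that tutorial instead.

As a reminder from the tutorial above, we use gradient-free methods for object detection because most detection frameworks do not support computing gradients.

We will use the YOLOv5 model from ultralytics: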

model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True)

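As you may recall, there are three main things to consider when adapting this library to a new architecture:

The reshape transform. This takes the activations from the model and processes them into a 2D spatial format. For example, some layers do not output a tensor but a tuple of tensors, so we need a function that knows how to dig into the output and find the 2D activations.
In the case of YOLOv5 this is not needed: we already get a 2D spatial tensor.
The target function that guides the class activation map.
For EigenCAM there is no target function. We simply run PCA on the 2D activations.
If we were to use another method, such as AblationCAM, we would need one; in that case, see the Faster R-CNN tutorial above.
The target layer from which the 2D activations are extracted. We will use the second-to-last layer. The last layer in YOLOv5 outputs the detections, so we use the layer just before it. After printing the model and experimenting with it, this turns out to be: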

model.model.model.model[-2]
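
To make the "PCA on the 2D activations" step more concrete, here is a minimal NumPy sketch of the idea: treat every spatial location as a sample, project it onto the first principal component of the channel space, and rescale the result to a heatmap. The function name `eigencam_sketch` and the random input are assumptions for illustration only; this is not the library's own code.

```python
import numpy as np

def eigencam_sketch(activations):
    """Project (C, H, W) activations onto their first principal component.

    Hypothetical helper for illustration; not part of pytorch-grad-cam.
    """
    channels, height, width = activations.shape
    # Each spatial location becomes one sample with `channels` features.
    samples = activations.reshape(channels, height * width).T
    samples = samples - samples.mean(axis=0)
    # Rows of vt are the principal directions in channel space.
    _, _, vt = np.linalg.svd(samples, full_matrices=False)
    projection = (samples @ vt[0]).reshape(height, width)
    # Keep the positive part and rescale to [0, 1] so it can be shown as a heatmap.
    projection = np.maximum(projection, 0)
    return projection / (projection.max() + 1e-7)

# Random stand-in for a real feature map (8 channels, 4x5 spatial grid).
rng = np.random.default_rng(0)
fake_activations = rng.normal(size=(8, 4, 5)).astype(np.float32)
cam_map = eigencam_sketch(fake_activations)
```

Note that the sign of a principal component returned by SVD is arbitrary, so a sketch like this can highlight the complement of the interesting region.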

First, let's write some boilerplate code to run a forward pass on an image and display the detections:

import warnings
warnings.filterwarnings('ignore')  # silence library warnings to keep the output readable
import torch
import cv2
import numpy as np
import requests
import torchvision.transforms as transforms
from pytorch_grad_cam import EigenCAM
from pytorch_grad_cam.utils.image import show_cam_on_image, scale_cam_image
from PIL import Image

COLORS = np.random.uniform(0, 255, size=(80, 3))

def parse_detections(results):
    detections = results.pandas().xyxy[0]
    detections = detections.to_dict()
    boxes, colors, names = [], [], []

    for i in range(len(detections["xmin"])):
        confidence = detections["confidence"][i]
        if confidence < 0.2:
            continue
        xmin = int(detections["xmin"][i])
        ymin = int(detections["ymin"][i])
        xmax = int(detections["xmax"][i])
        ymax = int(detections["ymax"][i])
        name = detections["name"][i]
        category = int(detections["class"][i])
        color = COLORS[category]

        boxes.append((xmin, ymin, xmax, ymax))
        colors.append(color)
        names.append(name)
    return boxes, colors, names


def draw_detections(boxes, colors, names, img):
    for box, color, name in zip(boxes, colors, names):
        xmin, ymin, xmax, ymax = box
        cv2.rectangle(
            img,
            (xmin, ymin),
            (xmax, ymax),
            color, 
            2)

        cv2.putText(img, name, (xmin, ymin - 5),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.8, color, 2,
                    lineType=cv2.LINE_AA)
    return img


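image_url = "https://upload.wikimedia.org/wikipedia/commons/f/f1/Puppies_%284984818141%29.jpg"
# Download the test image and resize it to 640x640.
img = np.array(Image.open(requests.get(image_url, stream=True).raw))
img = cv2.resize(img, (640, 640))
rgb_img = img.copy()         # uint8 RGB copy, used for detection and drawing
img = np.float32(img) / 255  # float image in [0, 1], used for the CAM overlay
transform = transforms.ToTensor()
tensor = transform(img).unsqueeze(0)  # shape (1, 3, 640, 640)

model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True)
model.eval()
model.cpu()
# Second-to-last layer: the last layer emits detections, not a spatial feature map.
target_layers = [model.model.model.model[-2]]

# Run detection on the uint8 image and draw the boxes on a copy.
results = model([rgb_img])
boxes, colors, names = parse_detections(results)
detections = draw_detections(boxes, colors, names, rgb_img.copy())
Image.fromarray(detections)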
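
To see what the confidence filter in `parse_detections` does without downloading the model, the sketch below applies the same 0.2 threshold to a hand-written dictionary shaped like the output of `results.pandas().xyxy[0].to_dict()` (column name mapped to an index-to-value dictionary). The helper name `filter_boxes` and all values are made up for illustration.

```python
def filter_boxes(detections, min_confidence=0.2):
    """Return (box, name) pairs whose confidence reaches the threshold."""
    kept = []
    for i in range(len(detections["xmin"])):
        if detections["confidence"][i] < min_confidence:
            continue
        # Truncate the float coordinates to integer pixel positions.
        box = tuple(int(detections[key][i]) for key in ("xmin", "ymin", "xmax", "ymax"))
        kept.append((box, detections["name"][i]))
    return kept

# Hypothetical detections: one confident box and one below the threshold.
fake_detections = {
    "xmin": {0: 10.7, 1: 300.2},
    "ymin": {0: 20.1, 1: 310.9},
    "xmax": {0: 200.5, 1: 400.0},
    "ymax": {0: 220.9, 1: 420.3},
    "confidence": {0: 0.91, 1: 0.12},
    "class": {0: 16, 1: 16},
    "name": {0: "dog", 1: "dog"},
}
kept = filter_boxes(fake_detections)
```

Only the first box survives here, because the second one has a confidence of 0.12.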

Now let's create our CAM model and run it on the image:

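# Note: newer pytorch-grad-cam releases may no longer accept use_cuda;
# if this call raises a TypeError, drop that argument.
cam = EigenCAM(model, target_layers, use_cuda=False)
grayscale_cam = cam(tensor)[0, :, :]  # take the CAM of the first (only) image in the batch
cam_image = show_cam_on_image(img, grayscale_cam, use_rgb=True)
Image.fromarray(cam_image)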
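
`show_cam_on_image` blends a color heatmap with the input image. The following is a simplified sketch of that kind of blending, assuming a float RGB image in [0, 1] and a CAM in [0, 1]. It writes the activation into the red channel only instead of applying an OpenCV colormap, and the name `overlay_sketch` and the 0.5 weight are assumptions for illustration rather than the library's implementation.

```python
import numpy as np

def overlay_sketch(image, cam, image_weight=0.5):
    """Blend a [0, 1] float RGB image with a red heatmap built from a [0, 1] CAM."""
    heatmap = np.zeros_like(image)
    heatmap[..., 0] = cam  # red channel carries the activation strength
    blended = (1 - image_weight) * heatmap + image_weight * image
    blended = blended / max(blended.max(), 1e-7)  # rescale so the brightest value is 1
    return np.uint8(255 * blended)

# Uniform gray image with a single fully activated pixel at (1, 1).
image = np.full((4, 4, 3), 0.5, dtype=np.float32)
cam = np.zeros((4, 4), dtype=np.float32)
cam[1, 1] = 1.0
overlay = overlay_sketch(image, cam)
```

The activated pixel comes out as the brightest red value, while the rest of the image is dimmed by the blending weight.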


Original article: https://blog.csdn.net/weixin_42990464/article/details/127997708