• How mmcv's imcrop() and imresize() map to OpenCV [basic analysis]


    🥇 Copyright: This article is an original first publication by 【墨理学AI】.

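    The MMCV suite

    MMCV is a foundational library for training deep learning models.

    Many highly starred repositories build on it, which helps developers set up the full workflow of model training and deployment.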

    Implementation of mmcv.imresize(img, (1000, 600), return_scale=True)

    mmcv.imresize(img, (1000, 600), return_scale=True)

    The source code is as follows:

    
    cv2_interp_codes = {
        'nearest': cv2.INTER_NEAREST,
        'bilinear': cv2.INTER_LINEAR,
        'bicubic': cv2.INTER_CUBIC,
        'area': cv2.INTER_AREA,
        'lanczos': cv2.INTER_LANCZOS4
    }
    
    def imresize(img,
                 size,
                 return_scale=False,
                 interpolation='bilinear',
                 out=None,
                 backend=None):
        """Resize image to a given size.
        Args:
            img (ndarray): The input image.
            size (tuple[int]): Target size (w, h).
            return_scale (bool): Whether to return `w_scale` and `h_scale`.
            interpolation (str): Interpolation method, accepted values are
                "nearest", "bilinear", "bicubic", "area", "lanczos" for 'cv2'
                backend, "nearest", "bilinear" for 'pillow' backend.
            out (ndarray): The output destination.
            backend (str | None): The image resize backend type. Options are `cv2`,
                `pillow`, `None`. If backend is None, the global imread_backend
                specified by ``mmcv.use_backend()`` will be used. Default: None.
        Returns:
            tuple | ndarray: (`resized_img`, `w_scale`, `h_scale`) or
            `resized_img`.
        """
        h, w = img.shape[:2]
        if backend is None:
            backend = imread_backend
        if backend not in ['cv2', 'pillow']:
            raise ValueError(f'backend: {backend} is not supported for resize.'
                             f"Supported backends are 'cv2', 'pillow'")
    
        if backend == 'pillow':
            assert img.dtype == np.uint8, 'Pillow backend only support uint8 type'
            pil_image = Image.fromarray(img)
            pil_image = pil_image.resize(size, pillow_interp_codes[interpolation])
            resized_img = np.array(pil_image)
        else:
            resized_img = cv2.resize(
                img, size, dst=out, interpolation=cv2_interp_codes[interpolation])
        if not return_scale:
            return resized_img
        else:
            w_scale = size[0] / w
            h_scale = size[1] / h
            return resized_img, w_scale, h_scale
    
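With `return_scale=True`, the function also returns the per-axis ratio between the target size and the source size. The snippet below is a minimal NumPy-only sketch of that arithmetic. It does not call mmcv, and the image shape and the box are made-up values. Mapping box coordinates into the resized image is an assumed use case, not something stated in the source.

```python
import numpy as np

# Hypothetical input: a blank image of height 300 and width 500.
img = np.zeros((300, 500, 3), dtype=np.uint8)
size = (1000, 600)  # target size in (w, h) order, as in the call above

# Same arithmetic as the return_scale branch of imresize.
h, w = img.shape[:2]
w_scale = size[0] / w
h_scale = size[1] / h

# Assumed use case: map an (x1, y1, x2, y2) box into the resized image.
bbox = np.array([10, 10, 100, 120], dtype=np.float32)
factors = np.array([w_scale, h_scale, w_scale, h_scale], dtype=np.float32)
scaled_bbox = bbox * factors

print(w_scale, h_scale, scaled_bbox)
```

Note that `size` is given as (w, h), while `img.shape[:2]` is (h, w); mixing the two orders is an easy mistake to make.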

    The corresponding OpenCV method is cv2.resize; with the cv2 backend, imresize simply forwards to it:

    resized_img = cv2.resize(
        img, size, dst=out, interpolation=cv2_interp_codes[interpolation])
    

    The imcrop(img, bboxes, scale=1.0, pad_fill=None) method in mmcv

    Usage

    bboxes = np.array([10, 10, 100, 120])
    patch = mmcv.imcrop(img, bboxes)
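imcrop treats a box as (x1, y1, x2, y2) with both corners inclusive, so the patch is one pixel larger along each axis than a plain difference of coordinates suggests. For a box that lies fully inside the image and no padding, the crop reduces to a NumPy slice. A minimal sketch, using a made-up 200x200 image rather than a call to mmcv:

```python
import numpy as np

img = np.zeros((200, 200, 3), dtype=np.uint8)  # hypothetical image
x1, y1, x2, y2 = np.array([10, 10, 100, 120])

# Equivalent slice for a box that already lies inside the image:
# rows y1..y2 and columns x1..x2, both ends included.
patch = img[y1:y2 + 1, x1:x2 + 1, ...]

print(patch.shape)  # (111, 91, 3)
```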
    
    def imcrop(img, bboxes, scale=1.0, pad_fill=None):
        """Crop image patches.
        3 steps: scale the bboxes -> clip bboxes -> crop and pad.
        Args:
            img (ndarray): Image to be cropped.
            bboxes (ndarray): Shape (k, 4) or (4, ), location of cropped bboxes.
            scale (float, optional): Scale ratio of bboxes, the default value
                1.0 means no padding.
            pad_fill (Number | list[Number]): Value to be filled for padding.
                Default: None, which means no padding.
        Returns:
            list[ndarray] | ndarray: The cropped image patches.
        """
        chn = 1 if img.ndim == 2 else img.shape[2]
        if pad_fill is not None:
            if isinstance(pad_fill, (int, float)):
                pad_fill = [pad_fill for _ in range(chn)]
            assert len(pad_fill) == chn
    
        _bboxes = bboxes[None, ...] if bboxes.ndim == 1 else bboxes
        scaled_bboxes = bbox_scaling(_bboxes, scale).astype(np.int32)
        clipped_bbox = bbox_clip(scaled_bboxes, img.shape)
    
        patches = []
        for i in range(clipped_bbox.shape[0]):
            x1, y1, x2, y2 = tuple(clipped_bbox[i, :])
            if pad_fill is None:
                patch = img[y1:y2 + 1, x1:x2 + 1, ...]
            else:
                _x1, _y1, _x2, _y2 = tuple(scaled_bboxes[i, :])
                if chn == 1:
                    patch_shape = (_y2 - _y1 + 1, _x2 - _x1 + 1)
                else:
                    patch_shape = (_y2 - _y1 + 1, _x2 - _x1 + 1, chn)
                patch = np.array(
                    pad_fill, dtype=img.dtype) * np.ones(
                        patch_shape, dtype=img.dtype)
                x_start = 0 if _x1 >= 0 else -_x1
                y_start = 0 if _y1 >= 0 else -_y1
                w = x2 - x1 + 1
                h = y2 - y1 + 1
                patch[y_start:y_start + h, x_start:x_start + w,
                      ...] = img[y1:y1 + h, x1:x1 + w, ...]
            patches.append(patch)
    
        if bboxes.ndim == 1:
            return patches[0]
        else:
            return patches
    
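The padding branch is the least obvious part of imcrop: the patch keeps the size of the unclipped box, and only the region that overlaps the image is copied in. The sketch below re-implements that branch for a single box on a single-channel image using NumPy alone. The name `crop_with_padding` is hypothetical and is not part of mmcv.

```python
import numpy as np


def crop_with_padding(img, bbox, pad_value):
    """Crop one box from a 2-D image, filling out-of-image pixels."""
    img_h, img_w = img.shape[:2]
    _x1, _y1, _x2, _y2 = (int(v) for v in bbox)

    # Clip the box to the image, as bbox_clip does.
    x1 = min(max(_x1, 0), img_w - 1)
    y1 = min(max(_y1, 0), img_h - 1)
    x2 = min(max(_x2, 0), img_w - 1)
    y2 = min(max(_y2, 0), img_h - 1)

    # The patch keeps the size of the unclipped box.
    patch = np.full((_y2 - _y1 + 1, _x2 - _x1 + 1), pad_value, dtype=img.dtype)

    # Offset of the valid region inside the patch.
    x_start = 0 if _x1 >= 0 else -_x1
    y_start = 0 if _y1 >= 0 else -_y1
    w = x2 - x1 + 1
    h = y2 - y1 + 1
    patch[y_start:y_start + h, x_start:x_start + w] = img[y1:y1 + h, x1:x1 + w]
    return patch


img = np.arange(16, dtype=np.uint8).reshape(4, 4)
patch = crop_with_padding(img, (-1, -1, 2, 2), pad_value=255)
print(patch)
```

Here the box sticks out by one pixel at the top and left, so the first row and first column of the 4x4 patch hold the fill value and the remaining 3x3 block is copied from the top-left corner of the image.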

    Helper functions it depends on

    def bbox_clip(bboxes, img_shape):
        """Clip bboxes to fit the image shape.
        Args:
            bboxes (ndarray): Shape (..., 4*k)
            img_shape (tuple[int]): (height, width) of the image.
        Returns:
            ndarray: Clipped bboxes.
        """
        assert bboxes.shape[-1] % 4 == 0
        cmin = np.empty(bboxes.shape[-1], dtype=bboxes.dtype)
        cmin[0::2] = img_shape[1] - 1
        cmin[1::2] = img_shape[0] - 1
        clipped_bboxes = np.maximum(np.minimum(bboxes, cmin), 0)
        return clipped_bboxes
    
    def bbox_scaling(bboxes, scale, clip_shape=None):
        """Scaling bboxes w.r.t the box center.
        Args:
            bboxes (ndarray): Shape(..., 4).
            scale (float): Scaling factor.
            clip_shape (tuple[int], optional): If specified, bboxes that exceed the
                boundary will be clipped according to the given shape (h, w).
        Returns:
            ndarray: Scaled bboxes.
        """
        if float(scale) == 1.0:
            scaled_bboxes = bboxes.copy()
        else:
            w = bboxes[..., 2] - bboxes[..., 0] + 1
            h = bboxes[..., 3] - bboxes[..., 1] + 1
            dw = (w * (scale - 1)) * 0.5
            dh = (h * (scale - 1)) * 0.5
            scaled_bboxes = bboxes + np.stack((-dw, -dh, dw, dh), axis=-1)
        if clip_shape is not None:
            return bbox_clip(scaled_bboxes, clip_shape)
        else:
            return scaled_bboxes
    
    
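To make the "scale, then clip" steps concrete, here is a worked example that repeats the arithmetic of bbox_scaling and bbox_clip inline with NumPy. The scale factor 1.5 and the 100x100 image shape are made-up values chosen so that the numbers stay exact.

```python
import numpy as np

bboxes = np.array([[10, 10, 100, 120]])
scale = 1.5
img_shape = (100, 100)  # hypothetical (height, width)

# Step 1: scale about the box center (same arithmetic as bbox_scaling).
w = bboxes[..., 2] - bboxes[..., 0] + 1   # 91
h = bboxes[..., 3] - bboxes[..., 1] + 1   # 111
dw = (w * (scale - 1)) * 0.5              # 22.75
dh = (h * (scale - 1)) * 0.5              # 27.75
scaled = bboxes + np.stack((-dw, -dh, dw, dh), axis=-1)
scaled = scaled.astype(np.int32)          # truncates toward zero, as in imcrop

# Step 2: clip to the image bounds (same arithmetic as bbox_clip).
upper = np.empty(scaled.shape[-1], dtype=scaled.dtype)
upper[0::2] = img_shape[1] - 1            # x coordinates: width - 1
upper[1::2] = img_shape[0] - 1            # y coordinates: height - 1
clipped = np.maximum(np.minimum(scaled, upper), 0)

print(scaled)   # [[-12 -17 122 147]]
print(clipped)  # [[ 0  0 99 99]]
```

The scaled box extends past the image on every side, which is exactly the case where imcrop either shrinks the patch (no `pad_fill`) or pads it (`pad_fill` given).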

    Analysis of mmcv/image/geometric.py

    From the imports at the top of the file and the basic code, we can see that mmcv integrates the image processing of both OpenCV and PIL into the functions of geometric.py.

    import numbers
    import warnings
    from typing import Optional, Tuple
    
    import cv2
    import numpy as np
    
    from ..utils import to_2tuple
    from .io import imread_backend
    
    try:
        from PIL import Image
    except ImportError:
        Image = None
    
    
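The optional PIL import and the backend check in imresize together form a simple dispatch pattern. The sketch below isolates it in plain Python. `DEFAULT_BACKEND` and `resolve_resize_backend` are hypothetical names standing in for mmcv's `imread_backend` and the validation inside imresize, and the check for a missing Pillow install is an addition for illustration that does not appear in the source shown above.

```python
try:
    from PIL import Image
except ImportError:  # Pillow is optional, as in geometric.py
    Image = None

DEFAULT_BACKEND = 'cv2'  # hypothetical stand-in for mmcv's imread_backend


def resolve_resize_backend(backend=None):
    """Return the backend name to use, validated the way imresize does."""
    if backend is None:
        backend = DEFAULT_BACKEND
    if backend not in ('cv2', 'pillow'):
        raise ValueError(f'backend: {backend} is not supported for resize.')
    if backend == 'pillow' and Image is None:
        # Extra check added for this sketch only.
        raise ImportError('Pillow is not installed')
    return backend


print(resolve_resize_backend())  # cv2
```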

  • Original article: https://blog.csdn.net/sinat_28442665/article/details/126485436