• MindSpore Two-Day Bootcamp 202209: Custom Operators and Data Processing


    Assignment 1: define a sin operator in pyfunc mode
    Write a native Python function that computes sine with numpy.
    Define two functions: one that infers the output tensor's shape (infer_shape) and one that infers its data type (infer_dtype).
    Use the test code from the assignment template to test the pyfunc custom operator built from those functions, and compare it against MindSpore's built-in Sin operator.
    Assignment 2: implement a 3-D tensor addition function with a hybrid custom operator
    The required changes are quite small.

    def sin_by_numpy(x):

        # write your computation logic here
        # hint: use np.sin
        return np.sin(x)

    def infer_shape(x):

        # write your computation logic here
        # hints:
        #    1. x here is the shape of the operator's input tensor
        #    2. sin is elementwise, so the output shape equals the input shape
        return x.shape  # first attempt: this fails, since x is already the shape (a list); see below
    
    # Next we define the two inference functions: one for the output tensor's shape (infer_shape) and one for its dtype (infer_dtype). Note:

    # - the shape inference function receives the shape of the input tensor;
    # - the dtype inference function receives the dtype of the input tensor.
    
    def infer_dtype(x):

        # write your computation logic here
        # hints:
        #    1. x here is the dtype of the operator's input tensor
        #    2. sin keeps the input dtype, so the output dtype equals the input dtype
        return x.dtype  # same slip as above: x is already the dtype; the final code returns x
    
    @ms_kernel
    def tensor_add_3d(x, y):
        result = output_tensor(x.shape, x.dtype)
    
        # write your computation logic here
        # hints:
        #    1. you need a triple nested loop
        #    2. the bound of loop level i is x.shape[i]
        #    3. express the computation element by element, e.g. addition is x[i, j, k] + y[i, j, k]
        for i in range(x.shape[0]):
            for j in range(x.shape[1]):
                for k in range(x.shape[2]):
                    result[i, j, k] = x[i, j, k] + y[i, j, k]
        
        return result
    
    

    (screenshot)
    A list has no .shape attribute, so the first attempt fails.
    After a small fix, here is the final result:
    (screenshot)
    The result comes out smoothly: the verification output matches the custom operator's output exactly.
    (screenshot)

    The final code:

    import numpy as np
    import mindspore as ms
    from mindspore import ops
    from mindspore.ops import ms_kernel
    from mindspore.nn import Cell
    
    ##########################################
    # choose your platform here: CPU or GPU  #
    ##########################################
    
    ms.set_context(mode=ms.GRAPH_MODE, device_target="GPU")
    
    ## Assignment 1: define a sin operator in pyfunc mode

    # First, write a native Python function that computes sine with numpy.

    def sin_by_numpy(x):

        # write your computation logic here
        # hint: use np.sin
        return np.sin(x)
    
    def infer_shape(x):

        # write your computation logic here
        # hints:
        #    1. x here is the shape of the operator's input tensor
        #    2. sin is elementwise, so the output shape equals the input shape
        return x
    
    # Next we define the two inference functions: one for the output tensor's shape (infer_shape) and one for its dtype (infer_dtype). Note:

    # - the shape inference function receives the shape of the input tensor;
    # - the dtype inference function receives the dtype of the input tensor.
    
    def infer_dtype(x):

        # write your computation logic here
        # hints:
        #    1. x here is the dtype of the operator's input tensor
        #    2. sin keeps the input dtype, so the output dtype equals the input dtype
        return x
    
    # Now we build a custom operator from the functions above. Its inputs are:

    # - func: the operator's computation function, here `sin_by_numpy`;
    # - out_shape: the output shape inference function, here `infer_shape`;
    # - out_dtype: the output dtype inference function, here `infer_dtype`;
    # - func_type: the custom operator type, here `"pyfunc"`.
    
    
    sin_by_numpy_op = ops.Custom(func = sin_by_numpy,     # the operator's computation function
                                 out_shape = infer_shape, # the output shape inference function
                                 out_dtype = infer_dtype, # the output dtype inference function
                                 func_type = "pyfunc"     # the custom operator type
                                )
    
    # Now call the operator and run the block below to verify that the definition above is correct.

    input_tensor = ms.Tensor([0.1, 0.2, 0.3, 0.4], dtype=ms.float32)
    result_cus = sin_by_numpy_op(input_tensor)
    print("====================================================")
    print("sin_by_numpy_op", result_cus)
    
    # For comparison, run MindSpore's built-in Sin operator on the same input, and check that its output matches the result above.
    
    sin_ms = ops.Sin()
    result = sin_ms(input_tensor)
    print("====================================================")
    print("sin", result)
    
    ## Assignment 2: implement 3-D tensor addition with a hybrid custom operator

    # First, write a function that adds two 3-D tensors using the MindSpore Hybrid DSL.

    # Notes:

    # - allocate the output with `output_tensor`, used as `output_tensor(shape, dtype)`;
    # - all computation must be written on scalars: when indexing a Tensor, spell out every index;
    # - loops are written as in Python, and loop bounds can use `range`.
    
    @ms_kernel
    def tensor_add_3d(x, y):
        result = output_tensor(x.shape, x.dtype)
    
        # write your computation logic here
        # hints:
        #    1. you need a triple nested loop
        #    2. the bound of loop level i is x.shape[i]
        #    3. express the computation element by element, e.g. addition is x[i, j, k] + y[i, j, k]
        for i in range(x.shape[0]):
            for j in range(x.shape[1]):
                for k in range(x.shape[2]):
                    result[i, j, k] = x[i, j, k] + y[i, j, k]
        
        return result
    
    # Now build a custom operator from the function above.

    # Note that for a `hybrid` function written with `ms_kernel`, shape and dtype inference is automatic.

    # So we only need to pass `func` (`func_type` defaults to `"hybrid"`).
    
    tensor_add_3d_op = ops.Custom(func = tensor_add_3d)
    
    # Now call the operator and run the block below to verify that the definition above is correct.
    
    input_tensor_x = ms.Tensor(np.random.normal(0, 1, [2, 3, 4]).astype(np.float32))
    input_tensor_y = ms.Tensor(np.random.normal(0, 1, [2, 3, 4]).astype(np.float32))
    result_cus = tensor_add_3d_op(input_tensor_x, input_tensor_y)
    print("====================================================")
    print("hubrid, tensor_add_3d_op", result_cus)
    
    # We can also use `pyfunc` mode to verify the definition above.

    # There is no need to redefine the computation function `tensor_add_3d`; just change `func_type` to `"pyfunc"`.

    # Note that in `pyfunc` mode we have to write the inference functions by hand.
    
    def infer_shape_py(x, y):
        return x
    
    def infer_dtype_py(x, y):
        return x
    
    tensor_add_3d_py_func = ops.Custom(func = tensor_add_3d,
                                       out_shape = infer_shape_py,
                                       out_dtype = infer_dtype_py,
                                       func_type = "pyfunc")
    
    result_pyfunc = tensor_add_3d_py_func(input_tensor_x, input_tensor_y)
    print("====================================================")
    print("pyfunc, tensor_add_3d_py", result_pyfunc)
    
    
    

    MindSpore Data Augmentation API Unification
    Assignment 1
    Task: define a Cifar10 data preprocessing pipeline with MindSpore, feed it the real Cifar10 dataset, and obtain the preprocessed results.
    Reference: #I5O5X5: MindSpore data augmentation API unification exercise, part 1
    (screenshots of the notebook cells)
    If you run this under WSL, you may hit the following error:
    qt.qpa.xcb: could not connect to display 172.26.192.1:0
    qt.qpa.plugin: Could not load the Qt platform plugin "xcb" in "/home/kewei/.local/lib/python3.9/site-packages/cv2/qt/plugins" even though it was found.
    This application failed to start because no Qt platform plugin could be initialized. Reinstalling the application may fix this problem.

    Available platform plugins are: xcb, eglfs, minimal, minimalegl, offscreen, vnc, webgl.

    Aborted

    Workaround
    This happens because the IDE cannot forward a GUI back to you; for example, adding a cv2.imshow() call to the code triggers the error above.
    If the script calls cv2.imshow, run it from the command line instead, with MobaXterm (an X server) open on the Windows side; then it works.
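    A minimal sketch of the environment tweak this relies on (the host address is the one from the error message above and is setup-specific; adjust it to your own machine):

    import os
    # point DISPLAY at the X server on the Windows host (e.g. MobaXterm)
    # before cv2 opens any window
    os.environ.setdefault("DISPLAY", "172.26.192.1:0")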
    (screenshot)
    (screenshot)
    Then consult the dataset API list on your own for more varied augmentation operations, and save the preprocessed results locally.
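    The worked cells above appear only as screenshots; a minimal sketch of such a pipeline (the dataset path, augmentation parameters, and output file name are assumptions) could look like this:

    import mindspore.dataset as ds
    import mindspore.dataset.vision as vision

    # load a few Cifar10 samples (assumed local path to the binary dataset)
    dataset = ds.Cifar10Dataset("./cifar-10-batches-bin", usage="train", num_samples=8)

    # chain predefined augmentation ops on the "image" column
    transforms = [
        vision.RandomCrop((32, 32), (4, 4, 4, 4)),  # pad by 4 on each side, crop back to 32x32
        vision.RandomHorizontalFlip(prob=0.5),
        vision.Rescale(1.0 / 255.0, 0.0),           # scale pixel values into [0, 1]
    ]
    dataset = dataset.map(operations=transforms, input_columns="image")

    # save the preprocessed result locally as MindRecord
    dataset.save("./cifar10_preprocessed.mindrecord")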
    Assignment 2
    Task: implement a richer preprocessing pipeline, interleaving MindSpore's predefined data augmentation APIs with custom python function operations.
    Reference: #I5O5X5: MindSpore data augmentation API unification exercise, part 2
    This is straightforward: just plug the operations in.
    For example:
    (screenshot)

    After the modification:
    (screenshot)
    Simply define the augmentation operators up front and invoke them in map (see the final code below).
    The effect:
    (screenshot)
    (screenshot)
    Now let's define a custom pyfunc of our own: call it thresholding.
    Thresholding is a common idea in image processing:

    $$x = \begin{cases} 0 & x < a \\ 255 & x \ge a \end{cases}$$
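    A minimal sketch of such a thresholding pyfunc (the function name and the default threshold `a` are assumptions; the code actually added to the project is shown in the screenshots below):

    import numpy as np

    def threshold_pyfunc(x, a=128):
        # pixels below the threshold become 0, the rest become 255
        return np.where(x < a, 0, 255).astype(np.uint8)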

    Locate the fast-rcnn dataset code
    and add the following content:
    (screenshot)

    (screenshot)
    The additions use three predefined library operators (plus the custom pyfunc):

    # Copyright 2020-2022 Huawei Technologies Co., Ltd
    #
    # Licensed under the Apache License, Version 2.0 (the "License");
    # you may not use this file except in compliance with the License.
    # You may obtain a copy of the License at
    #
    # http://www.apache.org/licenses/LICENSE-2.0
    #
    # Unless required by applicable law or agreed to in writing, software
    # distributed under the License is distributed on an "AS IS" BASIS,
    # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    # See the License for the specific language governing permissions and
    # limitations under the License.
    # ============================================================================
    
    """FasterRcnn dataset"""
    from __future__ import division
    
    import os
    import numpy as np
    from numpy import random
    
    import cv2
    import mindspore as ms
    import mindspore.dataset as de
    import mindspore.dataset.vision as vision
    from mindspore.mindrecord import FileWriter
    
    
    def bbox_overlaps(bboxes1, bboxes2, mode='iou'):
        """Calculate the ious between each bbox of bboxes1 and bboxes2.
    
        Args:
            bboxes1(ndarray): shape (n, 4)
            bboxes2(ndarray): shape (k, 4)
            mode(str): iou (intersection over union) or iof (intersection
                over foreground)
    
        Returns:
            ious(ndarray): shape (n, k)
        """
    
        assert mode in ['iou', 'iof']
    
        bboxes1 = bboxes1.astype(np.float32)
        bboxes2 = bboxes2.astype(np.float32)
        rows = bboxes1.shape[0]
        cols = bboxes2.shape[0]
        ious = np.zeros((rows, cols), dtype=np.float32)
        if rows * cols == 0:
            return ious
        exchange = False
        if bboxes1.shape[0] > bboxes2.shape[0]:
            bboxes1, bboxes2 = bboxes2, bboxes1
            ious = np.zeros((cols, rows), dtype=np.float32)
            exchange = True
        area1 = (bboxes1[:, 2] - bboxes1[:, 0] + 1) * (bboxes1[:, 3] - bboxes1[:, 1] + 1)
        area2 = (bboxes2[:, 2] - bboxes2[:, 0] + 1) * (bboxes2[:, 3] - bboxes2[:, 1] + 1)
        for i in range(bboxes1.shape[0]):
            x_start = np.maximum(bboxes1[i, 0], bboxes2[:, 0])
            y_start = np.maximum(bboxes1[i, 1], bboxes2[:, 1])
            x_end = np.minimum(bboxes1[i, 2], bboxes2[:, 2])
            y_end = np.minimum(bboxes1[i, 3], bboxes2[:, 3])
            overlap = np.maximum(x_end - x_start + 1, 0) * np.maximum(
                y_end - y_start + 1, 0)
            if mode == 'iou':
                union = area1[i] + area2 - overlap
            else:
                union = area1[i] if not exchange else area2
            ious[i, :] = overlap / union
        if exchange:
            ious = ious.T
        return ious
    
    
    class PhotoMetricDistortion:
        """Photo Metric Distortion"""
    
        def __init__(self,
                     brightness_delta=32,
                     contrast_range=(0.5, 1.5),
                     saturation_range=(0.5, 1.5),
                     hue_delta=18):
            self.brightness_delta = brightness_delta
            self.contrast_lower, self.contrast_upper = contrast_range
            self.saturation_lower, self.saturation_upper = saturation_range
            self.hue_delta = hue_delta
    
        def __call__(self, img, boxes, labels):
            # random brightness
            img = img.astype('float32')
    
            if random.randint(2):
                delta = random.uniform(-self.brightness_delta,
                                       self.brightness_delta)
                img += delta
    
            # mode == 0 --> do random contrast first
            # mode == 1 --> do random contrast last
            mode = random.randint(2)
            if mode == 1:
                if random.randint(2):
                    alpha = random.uniform(self.contrast_lower,
                                           self.contrast_upper)
                    img *= alpha
    
            # convert color from BGR to HSV
            img = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    
            # random saturation
            if random.randint(2):
                img[..., 1] *= random.uniform(self.saturation_lower,
                                              self.saturation_upper)
    
            # random hue
            if random.randint(2):
                img[..., 0] += random.uniform(-self.hue_delta, self.hue_delta)
                img[..., 0][img[..., 0] > 360] -= 360
                img[..., 0][img[..., 0] < 0] += 360
    
            # convert color from HSV to BGR
            img = cv2.cvtColor(img, cv2.COLOR_HSV2BGR)
    
            # random contrast
            if mode == 0:
                if random.randint(2):
                    alpha = random.uniform(self.contrast_lower,
                                           self.contrast_upper)
                    img *= alpha
    
            # randomly swap channels
            if random.randint(2):
                img = img[..., random.permutation(3)]
    
            return img, boxes, labels
    
    
    class Expand:
        """expand image"""
    
        def __init__(self, mean=(0, 0, 0), to_rgb=True, ratio_range=(1, 4)):
            if to_rgb:
                self.mean = mean[::-1]
            else:
                self.mean = mean
            self.min_ratio, self.max_ratio = ratio_range
    
        def __call__(self, img, boxes, labels):
            if random.randint(2):
                return img, boxes, labels
    
            h, w, c = img.shape
            ratio = random.uniform(self.min_ratio, self.max_ratio)
            expand_img = np.full((int(h * ratio), int(w * ratio), c),
                                 self.mean).astype(img.dtype)
            left = int(random.uniform(0, w * ratio - w))
            top = int(random.uniform(0, h * ratio - h))
            expand_img[top:top + h, left:left + w] = img
            img = expand_img
            boxes += np.tile((left, top), 2)
            return img, boxes, labels
    
    
    def rescale_with_tuple(img, scale):
        h, w = img.shape[:2]
        scale_factor = min(max(scale) / max(h, w), min(scale) / min(h, w))
        new_size = int(w * float(scale_factor) + 0.5), int(h * float(scale_factor) + 0.5)
        rescaled_img = cv2.resize(img, new_size, interpolation=cv2.INTER_LINEAR)
    
        return rescaled_img, scale_factor
    
    
    def rescale_with_factor(img, scale_factor):
        h, w = img.shape[:2]
        new_size = int(w * float(scale_factor) + 0.5), int(h * float(scale_factor) + 0.5)
        return cv2.resize(img, new_size, interpolation=cv2.INTER_NEAREST)
    
    
    def rescale_column(img, img_shape, gt_bboxes, gt_label, gt_num, config):
        """rescale operation for image"""
        img_data, scale_factor = rescale_with_tuple(img, (config.img_width, config.img_height))
        if img_data.shape[0] > config.img_height:
            img_data, scale_factor2 = rescale_with_tuple(img_data, (config.img_height, config.img_height))
            scale_factor = scale_factor * scale_factor2
    
        gt_bboxes = gt_bboxes * scale_factor
        gt_bboxes[:, 0::2] = np.clip(gt_bboxes[:, 0::2], 0, img_data.shape[1] - 1)
        gt_bboxes[:, 1::2] = np.clip(gt_bboxes[:, 1::2], 0, img_data.shape[0] - 1)
    
        pad_h = config.img_height - img_data.shape[0]
        pad_w = config.img_width - img_data.shape[1]
        assert ((pad_h >= 0) and (pad_w >= 0))
    
        pad_img_data = np.zeros((config.img_height, config.img_width, 3)).astype(img_data.dtype)
        pad_img_data[0:img_data.shape[0], 0:img_data.shape[1], :] = img_data
    
        img_shape = (config.img_height, config.img_width, 1.0)
        img_shape = np.asarray(img_shape, dtype=np.float32)
    
        return (pad_img_data, img_shape, gt_bboxes, gt_label, gt_num)
    
    
    def rescale_column_test(img, img_shape, gt_bboxes, gt_label, gt_num, config):
        """rescale operation for image of eval"""
        img_data, scale_factor = rescale_with_tuple(img, (config.img_width, config.img_height))
        if img_data.shape[0] > config.img_height:
            img_data, scale_factor2 = rescale_with_tuple(img_data, (config.img_height, config.img_height))
            scale_factor = scale_factor * scale_factor2
    
        pad_h = config.img_height - img_data.shape[0]
        pad_w = config.img_width - img_data.shape[1]
        assert ((pad_h >= 0) and (pad_w >= 0))
    
        pad_img_data = np.zeros((config.img_height, config.img_width, 3)).astype(img_data.dtype)
        pad_img_data[0:img_data.shape[0], 0:img_data.shape[1], :] = img_data
    
        img_shape = np.append(img_shape, (scale_factor, scale_factor))
        img_shape = np.asarray(img_shape, dtype=np.float32)
    
        return (pad_img_data, img_shape, gt_bboxes, gt_label, gt_num)
    
    
    def resize_column(img, img_shape, gt_bboxes, gt_label, gt_num, config):
        """resize operation for image"""
        img_data = img
        h, w = img_data.shape[:2]
        img_data = cv2.resize(
            img_data, (config.img_width, config.img_height), interpolation=cv2.INTER_LINEAR)
        h_scale = config.img_height / h
        w_scale = config.img_width / w
    
        scale_factor = np.array(
            [w_scale, h_scale, w_scale, h_scale], dtype=np.float32)
        img_shape = (config.img_height, config.img_width, 1.0)
        img_shape = np.asarray(img_shape, dtype=np.float32)
    
        gt_bboxes = gt_bboxes * scale_factor
    
        gt_bboxes[:, 0::2] = np.clip(gt_bboxes[:, 0::2], 0, img_shape[1] - 1)
        gt_bboxes[:, 1::2] = np.clip(gt_bboxes[:, 1::2], 0, img_shape[0] - 1)
    
        return (img_data, img_shape, gt_bboxes, gt_label, gt_num)
    
    
    def resize_column_test(img, img_shape, gt_bboxes, gt_label, gt_num, config):
        """resize operation for image of eval"""
        img_data = img
        h, w = img_data.shape[:2]
        img_data = cv2.resize(
            img_data, (config.img_width, config.img_height), interpolation=cv2.INTER_LINEAR)
        h_scale = config.img_height / h
        w_scale = config.img_width / w
    
        scale_factor = np.array(
            [w_scale, h_scale, w_scale, h_scale], dtype=np.float32)
        img_shape = np.append(img_shape, (h_scale, w_scale))
        img_shape = np.asarray(img_shape, dtype=np.float32)
    
        gt_bboxes = gt_bboxes * scale_factor
    
        gt_bboxes[:, 0::2] = np.clip(gt_bboxes[:, 0::2], 0, img_shape[1] - 1)
        gt_bboxes[:, 1::2] = np.clip(gt_bboxes[:, 1::2], 0, img_shape[0] - 1)
    
        return (img_data, img_shape, gt_bboxes, gt_label, gt_num)
    
    
    def impad_to_multiple_column(img, img_shape, gt_bboxes, gt_label, gt_num, config):
        """impad operation for image"""
        img_data = cv2.copyMakeBorder(img,
                                      0, config.img_height - img.shape[0], 0, config.img_width - img.shape[1],
                                      cv2.BORDER_CONSTANT,
                                      value=0)
        img_data = img_data.astype(np.float32)
        return (img_data, img_shape, gt_bboxes, gt_label, gt_num)
    
    
    def imnormalize_column(img, img_shape, gt_bboxes, gt_label, gt_num):
        """imnormalize operation for image"""
        # Computed from random subset of ImageNet training images
        mean = np.asarray([123.675, 116.28, 103.53])
        std = np.asarray([58.395, 57.12, 57.375])
        img_data = img.copy().astype(np.float32)
        cv2.cvtColor(img_data, cv2.COLOR_BGR2RGB, img_data)  # inplace
        cv2.subtract(img_data, np.float64(mean.reshape(1, -1)), img_data)  # inplace
        cv2.multiply(img_data, 1 / np.float64(std.reshape(1, -1)), img_data)  # inplace
    
        img_data = img_data.astype(np.float32)
        return (img_data, img_shape, gt_bboxes, gt_label, gt_num)
    
    
    def flip_column(img, img_shape, gt_bboxes, gt_label, gt_num):
        """flip operation for image"""
        img_data = img
        img_data = np.flip(img_data, axis=1)
        flipped = gt_bboxes.copy()
        _, w, _ = img_data.shape
    
        flipped[..., 0::4] = w - gt_bboxes[..., 2::4] - 1
        flipped[..., 2::4] = w - gt_bboxes[..., 0::4] - 1
    
        return (img_data, img_shape, flipped, gt_label, gt_num)
    
    
    def transpose_column(img, img_shape, gt_bboxes, gt_label, gt_num):
        """transpose operation for image"""
        img_data = img.transpose(2, 0, 1).copy()
        img_data = img_data.astype(np.float32)
        img_shape = img_shape.astype(np.float32)
        gt_bboxes = gt_bboxes.astype(np.float32)
        gt_label = gt_label.astype(np.int32)
        gt_num = gt_num.astype(np.bool_)
    
        return (img_data, img_shape, gt_bboxes, gt_label, gt_num)
    
    
    def photo_crop_column(img, img_shape, gt_bboxes, gt_label, gt_num):
        """photo crop operation for image"""
        random_photo = PhotoMetricDistortion()
        img_data, gt_bboxes, gt_label = random_photo(img, gt_bboxes, gt_label)
    
        return (img_data, img_shape, gt_bboxes, gt_label, gt_num)
    
    
    def expand_column(img, img_shape, gt_bboxes, gt_label, gt_num):
        """expand operation for image"""
        expand = Expand()
        img, gt_bboxes, gt_label = expand(img, gt_bboxes, gt_label)
    
        return (img, img_shape, gt_bboxes, gt_label, gt_num)
    
    
    def preprocess_fn(image, box, is_training, config):
        """Preprocess function for dataset."""
    
        def _infer_data(image_bgr, image_shape, gt_box_new, gt_label_new, gt_iscrowd_new_revert):
            image_shape = image_shape[:2]
            input_data = image_bgr, image_shape, gt_box_new, gt_label_new, gt_iscrowd_new_revert
    
            if config.keep_ratio:
                input_data = rescale_column_test(*input_data, config=config)
            else:
                input_data = resize_column_test(*input_data, config=config)
            input_data = imnormalize_column(*input_data)
    
            output_data = transpose_column(*input_data)
            return output_data
    
        def _data_aug(image, box, is_training):
            """Data augmentation function."""
            pad_max_number = config.num_gts
            if pad_max_number < box.shape[0]:
                box = box[:pad_max_number, :]
            image_bgr = image.copy()
            image_bgr[:, :, 0] = image[:, :, 2]
            image_bgr[:, :, 1] = image[:, :, 1]
            image_bgr[:, :, 2] = image[:, :, 0]
            image_shape = image_bgr.shape[:2]
            gt_box = box[:, :4]
            gt_label = box[:, 4]
            gt_iscrowd = box[:, 5]
    
            gt_box_new = np.pad(gt_box, ((0, pad_max_number - box.shape[0]), (0, 0)), mode="constant", constant_values=0)
            gt_label_new = np.pad(gt_label, ((0, pad_max_number - box.shape[0])), mode="constant", constant_values=-1)
            gt_iscrowd_new = np.pad(gt_iscrowd, ((0, pad_max_number - box.shape[0])), mode="constant", constant_values=1)
            gt_iscrowd_new_revert = (~(gt_iscrowd_new.astype(np.bool_))).astype(np.int32)
    
            if not is_training:
                return _infer_data(image_bgr, image_shape, gt_box_new, gt_label_new, gt_iscrowd_new_revert)
    
            flip = (np.random.rand() < config.flip_ratio)
            expand = (np.random.rand() < config.expand_ratio)
            input_data = image_bgr, image_shape, gt_box_new, gt_label_new, gt_iscrowd_new_revert
    
            if expand:
                input_data = expand_column(*input_data)
            if config.keep_ratio:
                input_data = rescale_column(*input_data, config=config)
            else:
                input_data = resize_column(*input_data, config=config)
            input_data = imnormalize_column(*input_data)
            if flip:
                input_data = flip_column(*input_data)
    
            output_data = transpose_column(*input_data)
            return output_data
    
        return _data_aug(image, box, is_training)
    
    
    def create_coco_label(is_training, config):
        """Get image path and annotation from COCO."""
        from pycocotools.coco import COCO
    
        coco_root = config.coco_root
        data_type = config.val_data_type
        if is_training:
            data_type = config.train_data_type
    
        # Classes need to train or test.
        train_cls = config.coco_classes
        train_cls_dict = {}
        for i, cls in enumerate(train_cls):
            train_cls_dict[cls] = i
    
        anno_json = os.path.join(coco_root, config.instance_set.format(data_type))
        if hasattr(config, 'train_set') and is_training:
            anno_json = os.path.join(coco_root, config.train_set)
        if hasattr(config, 'val_set') and not is_training:
            anno_json = os.path.join(coco_root, config.val_set)
        coco = COCO(anno_json)
        classs_dict = {}
        cat_ids = coco.loadCats(coco.getCatIds())
        for cat in cat_ids:
            classs_dict[cat["id"]] = cat["name"]
    
        image_ids = coco.getImgIds()
        image_files = []
        image_anno_dict = {}
    
        for img_id in image_ids:
            image_info = coco.loadImgs(img_id)
            file_name = image_info[0]["file_name"]
            anno_ids = coco.getAnnIds(imgIds=img_id, iscrowd=None)
            anno = coco.loadAnns(anno_ids)
            image_path = os.path.join(coco_root, data_type, file_name)
            annos = []
            for label in anno:
                bbox = label["bbox"]
                class_name = classs_dict[label["category_id"]]
                if class_name in train_cls:
                    x1, x2 = bbox[0], bbox[0] + bbox[2]
                    y1, y2 = bbox[1], bbox[1] + bbox[3]
                    annos.append([x1, y1, x2, y2] + [train_cls_dict[class_name]] + [int(label["iscrowd"])])
    
            image_files.append(image_path)
            if annos:
                image_anno_dict[image_path] = np.array(annos)
            else:
                image_anno_dict[image_path] = np.array([0, 0, 0, 0, 0, 1])
    
        return image_files, image_anno_dict
    
    
    def parse_json_annos_from_txt(anno_file, config):
        """for user defined annotations text file, parse it to json format data"""
        if not os.path.isfile(anno_file):
            raise RuntimeError("Evaluation annotation file {} is not valid.".format(anno_file))
    
        annos = {
            "images": [],
            "annotations": [],
            "categories": []
        }
    
        # set categories field
        for i, cls_name in enumerate(config.coco_classes):
            annos["categories"].append({"id": i, "name": cls_name})
    
        with open(anno_file, "rb") as f:
            lines = f.readlines()
    
        img_id = 1
        anno_id = 1
        for line in lines:
            line_str = line.decode("utf-8").strip()
            line_split = str(line_str).split(' ')
            # set image field
            file_name = line_split[0]
            annos["images"].append({"file_name": file_name, "id": img_id})
            # set annotations field
            for anno_info in line_split[1:]:
                anno = anno_info.split(",")
                x = float(anno[0])
                y = float(anno[1])
                w = float(anno[2]) - float(anno[0])
                h = float(anno[3]) - float(anno[1])
                category_id = int(anno[4])
                iscrowd = int(anno[5])
                annos["annotations"].append({"bbox": [x, y, w, h],
                                             "area": w * h,
                                             "category_id": category_id,
                                             "iscrowd": iscrowd,
                                             "image_id": img_id,
                                             "id": anno_id})
                anno_id += 1
            img_id += 1
    
        return annos
    
    
    def create_train_data_from_txt(image_dir, anno_path):
        """Filter valid image file, which both in image_dir and anno_path."""
    
        def anno_parser(annos_str):
            """Parse annotation from string to list."""
            annos = []
            for anno_str in annos_str:
                anno = anno_str.strip().split(",")
                xmin, ymin, xmax, ymax = list(map(float, anno[:4]))
                cls_id = int(anno[4])
                iscrowd = int(anno[5])
                annos.append([xmin, ymin, xmax, ymax, cls_id, iscrowd])
            return annos
    
        image_files = []
        image_anno_dict = {}
        if not os.path.isdir(image_dir):
            raise RuntimeError("Path given is not valid.")
        if not os.path.isfile(anno_path):
            raise RuntimeError("Annotation file is not valid.")
    
        with open(anno_path, "rb") as f:
            lines = f.readlines()
        for line in lines:
            line_str = line.decode("utf-8").strip()
            line_split = str(line_str).split(' ')
            file_name = line_split[0]
            image_path = os.path.join(image_dir, file_name)
            if os.path.isfile(image_path):
                image_anno_dict[image_path] = anno_parser(line_split[1:])
                image_files.append(image_path)
        return image_files, image_anno_dict
    
    
    def data_to_mindrecord_byte_image(config, dataset="coco", is_training=True, prefix="fasterrcnn.mindrecord", file_num=8):
        """Create MindRecord file."""
        mindrecord_dir = config.mindrecord_dir
        mindrecord_path = os.path.join(mindrecord_dir, prefix)
        writer = FileWriter(mindrecord_path, file_num)
        if dataset == "coco":
            image_files, image_anno_dict = create_coco_label(is_training, config=config)
        else:
            image_files, image_anno_dict = create_train_data_from_txt(config.image_dir, config.anno_path)
    
        fasterrcnn_json = {
            "image": {"type": "bytes"},
            "annotation": {"type": "int32", "shape": [-1, 6]},
        }
        writer.add_schema(fasterrcnn_json, "fasterrcnn_json")
    
        for image_name in image_files:
            with open(image_name, 'rb') as f:
                img = f.read()
            annos = np.array(image_anno_dict[image_name], dtype=np.int32)
            row = {"image": img, "annotation": annos}
            writer.write_raw_data([row])
        writer.commit()
    
    # random horizontal flip
    random_horizontal = vision.RandomHorizontalFlip()
    # random color adjustment
    random_color = vision.RandomColorAdjust(brightness=(0.8, 1), contrast=(0.8, 1), saturation=(0.3, 1))
    # histogram equalization
    equal_op = vision.Equalize()

    def pyfunc(x):
        """Custom python function applied to the data."""
        return x * 2 % 255
    
    
    
    def create_fasterrcnn_dataset(config, mindrecord_file, batch_size=2, device_num=1, rank_id=0, is_training=True,
                                  num_parallel_workers=8, python_multiprocessing=False):
        """Create FasterRcnn dataset with MindDataset."""
        cv2.setNumThreads(0)
        de.config.set_prefetch_size(8)
        ds = de.MindDataset(mindrecord_file, columns_list=["image", "annotation"], num_shards=device_num, shard_id=rank_id,
                            num_parallel_workers=4, shuffle=is_training)
        decode = ms.dataset.vision.Decode()
        ds = ds.map(input_columns=["image"], operations=[decode, random_horizontal, random_color, equal_op, pyfunc])
        compose_map_func = (lambda image, annotation: preprocess_fn(image, annotation, is_training, config=config))
    
        if is_training:
            ds = ds.map(input_columns=["image", "annotation"],
                        output_columns=["image", "image_shape", "box", "label", "valid_num"],
                        column_order=["image", "image_shape", "box", "label", "valid_num"],
                        operations=compose_map_func, python_multiprocessing=python_multiprocessing,
                        num_parallel_workers=num_parallel_workers)
            ds = ds.batch(batch_size, drop_remainder=True)
        else:
            ds = ds.map(input_columns=["image", "annotation"],
                        output_columns=["image", "image_shape", "box", "label", "valid_num"],
                        column_order=["image", "image_shape", "box", "label", "valid_num"],
                        operations=compose_map_func,
                        num_parallel_workers=num_parallel_workers)
            ds = ds.batch(batch_size, drop_remainder=True)
        return ds
    
    

    The effect:
    (screenshot)
    As you can see, our pyfunc ends up acting somewhat like an edge extractor.

  • Original post: https://blog.csdn.net/weixin_54227557/article/details/126798269