• [PyTorch Learning] Transforms


    transforms.py works like a toolbox that holds many tools, such as ToTensor (converts data to the tensor type) and Resize. The input to this toolbox is an image.

    I. Using Transforms

    from PIL import Image
    from torchvision import transforms
    
    # Absolute path: /home/xjy/PycharmProjects/pythonProject/dataset/train/ants/0013035.jpg
    # Relative path: dataset/train/ants/0013035.jpg
    img_path = "dataset/train/ants/0013035.jpg"
    img = Image.open(img_path)
    print(img) # <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=768x512 at 0x7FE4A77C61D0>
    
    tensor_trans = transforms.ToTensor()
    tensor_img = tensor_trans(img)
    print(tensor_img) # tensor([[...]])
    

    First we create a concrete tool, such as transforms.ToTensor(), and then we use that tool to turn an input into an output: result = tool(input). In the code above, that is tensor_img = tensor_trans(img).

    PS. The equivalent code using OpenCV:

    import cv2
    cv_img = cv2.imread(img_path) # returned as an ndarray
    

    II. Displaying Images in TensorBoard

    from PIL import Image
    from torch.utils.tensorboard import SummaryWriter
    from torchvision import transforms
    
    # Absolute path: /home/xjy/PycharmProjects/pythonProject/dataset/train/ants/0013035.jpg
    # Relative path: dataset/train/ants/0013035.jpg
    img_path = "dataset/train/ants/0013035.jpg"
    img = Image.open(img_path)
    print(img) # <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=768x512 at 0x7FE4A77C61D0>
    
    tensor_trans = transforms.ToTensor()
    tensor_img = tensor_trans(img)
    print(tensor_img) # tensor([[...]])
    
    writer = SummaryWriter("logs") # log_dir: logs
    writer.add_image("Tensor_img", tensor_img)
    writer.close()
    

    After running the script, enter tensorboard --logdir=logs --port=6007 in the terminal to view the image.

    III. Commonly Used Transforms

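    Pay attention to each transform's input, output, and purpose. Different functions produce different data types, for example:

    Function        Output type
    Image.open()    PIL Image
    ToTensor()      tensor
    cv2.imread()    ndarray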

    1. ToTensor()

    First, a quick review of how classes are used:

    class Person:
        def __call__(self, name):
            print("__call__" + " Hello " + name)
    
        def hello(self, name):
            print("hello " + name)
    
    person = Person()
    person("Zhangsan") # __call__ Hello Zhangsan
    person.hello("lisi") # hello lisi
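
The same pattern is what lets a transform be used as a tool. The sketch below uses a hypothetical class named Scale (it is not part of torchvision, only an illustration): __init__ stores the configuration and __call__ does the work, so the instance can be called like a function.

```python
class Scale:
    """Hypothetical transform-like class, for illustration only."""

    def __init__(self, factor):
        # The configuration is fixed once, when the tool is created.
        self.factor = factor

    def __call__(self, values):
        # The work happens when the instance is called like a function.
        return [v * self.factor for v in values]


double = Scale(2)            # create the tool, like transforms.ToTensor()
result = double([1, 2, 3])   # use the tool, like tensor_trans(img)
print(result)                # [2, 4, 6]
```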
    

    Next, here is how ToTensor() is used:

    from PIL import Image
    from torch.utils.tensorboard import SummaryWriter
    from torchvision import transforms
    
    img = Image.open("images/pink.jpg")
    print(img)
    # <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=500x375 at 0x7FC0B136F240>
    
    writer = SummaryWriter("logs")
    
    trans_totensor = transforms.ToTensor()
    img_tensor = trans_totensor(img)
    writer.add_image("ToTensor", img_tensor)
    writer.close()
    

    Because add_image() expects the image as a torch.Tensor, numpy.array, or string/blobname, img must first be converted to a tensor.
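
A NumPy array can also be passed to add_image() directly, provided its layout is declared. The sketch below assumes the tensorboard package is installed. It writes to a temporary directory rather than "logs" and uses a synthetic array instead of a real photo (both are assumptions made to keep the example self-contained).

```python
import os
import tempfile

import numpy as np
from torch.utils.tensorboard import SummaryWriter

# Synthetic image in (H, W, C) layout, the layout PIL-to-NumPy or OpenCV gives.
img_array = np.zeros((375, 500, 3), dtype=np.uint8)
img_array[..., 0] = 255

log_dir = tempfile.mkdtemp()  # temporary directory instead of "logs"
writer = SummaryWriter(log_dir)
# add_image assumes (C, H, W) by default, so an (H, W, C) array
# needs dataformats="HWC".
writer.add_image("ndarray_img", img_array, 0, dataformats="HWC")
writer.close()

print(os.listdir(log_dir))  # contains an events.out.tfevents.* file
```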

    2. ToPILImage()

    Purpose: converts a tensor or ndarray into a PIL image.

    3. Normalize()

    Purpose: normalizes a tensor image with the formula result[channel] = (input[channel] - mean[channel]) / std[channel]. For example, if the input lies in [0, 1] and mean and std are both set to 0.5, the result lies in [-1, 1].

    from PIL import Image
    from torch.utils.tensorboard import SummaryWriter
    from torchvision import transforms
    
    writer = SummaryWriter("logs")
    img = Image.open("images/pink.jpg")
    print(img)
    # <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=500x375 at 0x7FC0B136F240>
    
    # ToTensor
    trans_totensor = transforms.ToTensor()
    img_tensor = trans_totensor(img)
    writer.add_image("ToTensor", img_tensor)
    
    # Normalize
    # print(img_tensor[0][0][0])
    trans_norm = transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5]) # mean, std
    img_norm = trans_norm(img_tensor)
    # print(img_norm[0][0][0])
    writer.add_image("Normalize", img_norm)
    
    writer.close()
    

    [Figure: the result as displayed in TensorBoard]
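
The arithmetic behind the Normalize formula can be checked by hand. The helper below is a hypothetical plain-Python function (it is not the torchvision implementation) that applies the same per-channel formula to a single RGB pixel.

```python
def normalize(pixel, mean, std):
    """Apply (input[channel] - mean[channel]) / std[channel] to one RGB pixel."""
    return [(value - m) / s for value, m, s in zip(pixel, mean, std)]


mean = [0.5, 0.5, 0.5]
std = [0.5, 0.5, 0.5]

# The endpoints and midpoint of [0, 1] map to the endpoints and midpoint of [-1, 1].
result = normalize([0.0, 0.5, 1.0], mean, std)
print(result)  # [-1.0, 0.0, 1.0]
```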

    4. Resize()

    Purpose: resizes the input PIL image to the given size. The output is still a PIL image.

    from PIL import Image
    from torch.utils.tensorboard import SummaryWriter
    from torchvision import transforms
    
    writer = SummaryWriter("logs")
    img = Image.open("images/pink.jpg")
    print(img)
    # <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=500x375 at 0x7FC0B136F240>
    
    # Resize
    print(img.size) # (500, 375)
    trans_resize = transforms.Resize((512, 512))
    img_resize = trans_resize(img)
    print(img_resize) # <PIL.Image.Image image mode=RGB size=512x512 at 0x7FAB69DFBE10>
    
    # Display
    trans_totensor = transforms.ToTensor()
    img_resize = trans_totensor(img_resize)
    print(img_resize)
    writer.add_image("Resize", img_resize)
    
    writer.close()
    

    5. Compose()

    Purpose: chains several transforms into one. For example, transforms.Compose([trans_resize_2, trans_totensor]) takes a PIL image as input and outputs a tensor.

    from PIL import Image
    from torch.utils.tensorboard import SummaryWriter
    from torchvision import transforms
    
    writer = SummaryWriter("logs")
    img = Image.open("images/pink.jpg")
    print(img)
    # <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=500x375 at 0x7FC0B136F240>
    
    # ToTensor
    trans_totensor = transforms.ToTensor()
    
    # Compose - resize - 2
    trans_resize_2 = transforms.Resize(512) # proportional scaling: the shorter side becomes 512
    trans_compose = transforms.Compose([trans_resize_2, trans_totensor])
    img_resize_2 = trans_compose(img)
    writer.add_image("Resize", img_resize_2, 1)
    
    writer.close()
    

    6. RandomCrop()

    Purpose: crops the image at a random location.

    from PIL import Image
    from torch.utils.tensorboard import SummaryWriter
    from torchvision import transforms
    
    writer = SummaryWriter("logs")
    img = Image.open("images/pink.jpg")
    print(img)
    # <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=500x375 at 0x7FC0B136F240>
    
    # ToTensor
    trans_totensor = transforms.ToTensor()
    
    # RandomCrop()
    trans_random = transforms.RandomCrop(256) 
    # trans_random = transforms.RandomCrop((256, 300)) 
    trans_compose_2 = transforms.Compose([trans_random, trans_totensor])
    for i in range(10):
        img_crop = trans_compose_2(img)
        writer.add_image("RandomCrop", img_crop, i)
    
    writer.close()
    


    IV. Summary

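    1. Pay attention to input and output types, and consult the official documentation often.
    2. Also pay attention to the parameters a method needs. In PyCharm, place the cursor inside the function's parentheses and press Ctrl+P to show parameter hints.
    3. When you do not know the type of a return value, use print(), print(type()), or the debugger to find out.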
  • Original article: https://blog.csdn.net/XXXXXXJY/article/details/126813843