• Setting up HRNet-Image-Classification and training on a custom dataset



    This is the official code for high-resolution representations for ImageNet classification. We augment HRNet with a classification head, shown in the figure below. First, the four-resolution feature maps are fed into a bottleneck, and the number of output channels is increased to 128, 256, 512, and 1024, respectively. Then the high-resolution representations are downsampled by a stride-2 3x3 convolution that outputs 256 channels and are added to the second-highest-resolution representations. This process is repeated two more times to obtain 1024 channels at the lowest resolution. Finally, a 1x1 convolution transforms the 1024 channels to 2048 channels, followed by global average pooling. The resulting 2048-dimensional representation is fed into the classifier.

    [Figure: HRNet classification head]
    git: https://github.com/HRNet/HRNet-Image-Classification
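The classification head described above can be sketched in PyTorch roughly as follows. This is a simplified illustration, not the repository's code: plain 1x1 convolutions stand in for the bottleneck blocks, the class name `ClsHeadSketch` is made up for this example, and the input widths (32, 64, 128, 256) are an assumption based on the W32 variant.

```python
import torch
import torch.nn as nn


def conv_bn_relu(c_in, c_out, kernel, stride=1):
    """Convolution + batch norm + ReLU, padded so a stride-1 conv keeps the spatial size."""
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, kernel, stride=stride, padding=kernel // 2, bias=False),
        nn.BatchNorm2d(c_out),
        nn.ReLU(inplace=True),
    )


class ClsHeadSketch(nn.Module):
    """Simplified HRNet classification head (hypothetical, for illustration only)."""

    def __init__(self, in_channels=(32, 64, 128, 256), num_classes=1000):
        super().__init__()
        head_channels = (128, 256, 512, 1024)
        # Step 1: widen each of the four branches to 128/256/512/1024 channels.
        # (The real head uses bottleneck blocks; 1x1 convs are a stand-in here.)
        self.widen = nn.ModuleList(
            [conv_bn_relu(c_in, c_out, 1) for c_in, c_out in zip(in_channels, head_channels)]
        )
        # Step 2: stride-2 3x3 convs that halve the resolution and match the next width.
        self.down = nn.ModuleList(
            [conv_bn_relu(head_channels[i], head_channels[i + 1], 3, stride=2) for i in range(3)]
        )
        # Step 3: 1x1 conv from 1024 to 2048 channels, then the linear classifier.
        self.final = conv_bn_relu(1024, 2048, 1)
        self.classifier = nn.Linear(2048, num_classes)

    def forward(self, feats):
        # feats: four feature maps, highest resolution first.
        y = self.widen[0](feats[0])
        for i in range(3):
            y = self.widen[i + 1](feats[i + 1]) + self.down[i](y)
        y = self.final(y)
        y = y.mean(dim=(2, 3))  # global average pooling
        return self.classifier(y)
```

Feeding four feature maps whose side length halves from branch to branch (for example 56, 28, 14, 7) yields one row of `num_classes` logits per image.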


    install 🌊

    1. Install dependencies: pip install -r requirements.txt

    valid ⚡️

    1. Download the pretrained model ( https://github.com/HRNet/HRNet-Image-Classification )
    2. python tools/valid.py --cfg experiments/cls_hrnet_w32_sgd_lr5e-2_wd1e-4_bs32_x100.yaml --testModel models/hrnetv2_w32_imagenet_pretrained.pth

    train 👻

    1. Change NUM_CLASSES

      1. Edit the yaml. The author's default is num_cls = 1000; add NUM_CLASSES under MODEL

        MODEL: 
          NAME: cls_hrnet
          NUM_CLASSES: 2
        
      2. Edit lib/models/cls_hrnet.py (line 308)

        # HighResolutionNet
        self.classifier = nn.Linear(2048, cfg['MODEL']['NUM_CLASSES'])
        
      3. Edit the topk argument of the accuracy call in lib/core/function.py

        prec1, prec5 = accuracy(output, target, (1, 2))
        
      4. Edit accuracy in lib/core/evaluate.py

        # Fixes: "view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead."
        # Add .contiguous() before .view()
        correct_k = correct[:k].contiguous().view(-1).float().sum(0, keepdim=True)
        
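The error quoted in step 4 can be reproduced on a tiny tensor. The sketch below uses a transposed tensor, which is one common way to end up with non-contiguous memory; whether that is exactly what happens inside evaluate.py is an assumption. It also shows that `.contiguous().view(-1)` and `.reshape(-1)` give the same result.

```python
import torch

# Transposing only swaps strides, so the result is not contiguous in memory.
t = torch.arange(6).reshape(3, 2).t()  # shape (2, 3)

try:
    t.view(-1)  # view() needs a memory layout compatible with the new shape
    view_failed = False
except RuntimeError:
    view_failed = True

# Either copy into contiguous memory first, or let reshape() copy when needed.
flat_a = t.contiguous().view(-1)
flat_b = t.reshape(-1)

print(view_failed, flat_a.tolist())
```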
    2. Prepare the dataset

      ├─ datasets
        ├─ images
        │  ├─ train
        │  │  ├─ a
        │  │  ├─ b
        │  ├─ val
        │  │  ├─ a
        │  │  ├─ b
      
    3. train

      python tools/train.py --cfg experiments/cls_hrnet_w32_sgd_lr5e-2_wd1e-4_bs32_x100.yaml
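Before training, it can help to confirm that the folder layout from step 2 matches NUM_CLASSES. The helper below is a hypothetical standard-library sketch, not part of the repository. It assumes one sub-folder per class under both train and val, and sorts the folder names alphabetically, which is how torchvision's ImageFolder assigns class indices (that the training script reads the data this way is an assumption).

```python
import tempfile
from pathlib import Path


def list_classes(split_dir):
    """Return the sorted names of the class sub-folders inside one split."""
    return sorted(p.name for p in Path(split_dir).iterdir() if p.is_dir())


def check_dataset(root):
    """Check that train/ and val/ contain the same class folders and return them."""
    root = Path(root)
    train = list_classes(root / "train")
    val = list_classes(root / "val")
    if train != val:
        raise ValueError(f"class folders differ: train={train}, val={val}")
    return train


# Demo on a throwaway directory that mirrors the layout from step 2.
with tempfile.TemporaryDirectory() as tmp:
    for split in ("train", "val"):
        for name in ("a", "b"):
            (Path(tmp) / split / name).mkdir(parents=True)
    classes = check_dataset(tmp)

print(classes, len(classes))  # len(classes) should equal NUM_CLASSES in the yaml
```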
      
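The topk change in step 1.3 is needed because k can never exceed the number of classes: with two classes, a top-5 request has nothing to pick from, and top-2 accuracy is always 100%. The pure-Python sketch below illustrates the idea; the function name `topk_accuracy` is made up for this example and it is not the repository's `accuracy` implementation.

```python
def topk_accuracy(scores, targets, topk=(1,)):
    """Return the top-k accuracy in percent for each k in `topk`.

    scores: one list of per-class scores per sample.
    targets: the true class index of each sample.
    """
    num_classes = len(scores[0])
    if max(topk) > num_classes:
        raise ValueError(f"k={max(topk)} exceeds the {num_classes} available classes")
    results = []
    for k in topk:
        hits = 0
        for row, target in zip(scores, targets):
            # Class indices ordered from highest to lowest score.
            ranked = sorted(range(num_classes), key=lambda c: row[c], reverse=True)
            hits += target in ranked[:k]
        results.append(100.0 * hits / len(targets))
    return results


scores = [[0.9, 0.1], [0.3, 0.7], [0.6, 0.4]]
targets = [0, 1, 1]
top1, top2 = topk_accuracy(scores, targets, topk=(1, 2))
print(top1, top2)
```

For a two-class problem only the top-1 value is informative; the second value is kept so the call still returns two numbers, as the training loop expects.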

    detect 🛫

    1. Run inference on a single image

      import torch
      import torch.backends.cudnn as cudnn
      import torchvision.transforms as transforms
      from PIL import Image

      from lib.config import config
      from lib.models import cls_hrnet


      def predict():
          # Load the HRNet config (forward slashes work on both Windows and Linux)
          config.merge_from_file(
              '../experiments/cls_hrnet_w32_sgd_lr5e-2_wd1e-4_bs32_x100.yaml')
          config.freeze()

          # Checkpoint to load; its classifier size must match NUM_CLASSES in the yaml
          model_file = r"../models/hrnetv2_w32_imagenet_pretrained.pth"

          cudnn.benchmark = config.CUDNN.BENCHMARK
          cudnn.deterministic = config.CUDNN.DETERMINISTIC
          cudnn.enabled = config.CUDNN.ENABLED
          hrnet = cls_hrnet.get_cls_net(config)

          # Load the weights on the CPU before wrapping the model in DataParallel
          hrnet.load_state_dict(torch.load(model_file, map_location='cpu'))

          gpus = list(config.GPUS)
          hrnet = torch.nn.DataParallel(hrnet, device_ids=gpus).cuda()
          hrnet.eval()  # switch to evaluation mode

          pil_img_path = r'../data/images/000000000139.jpg'
          # Force three channels so grayscale or RGBA files do not break Normalize
          pil_img = Image.open(pil_img_path).convert('RGB')

          normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                           std=[0.229, 0.224, 0.225])
          preprocess = transforms.Compose([
              transforms.Resize(int(config.MODEL.IMAGE_SIZE[0] / 0.875)),
              transforms.CenterCrop(config.MODEL.IMAGE_SIZE[0]),
              transforms.ToTensor(),
              normalize,
          ])
          # Add the batch dimension: (3, H, W) -> (1, 3, H, W)
          batch = preprocess(pil_img).unsqueeze(0).float()

          with torch.no_grad():
              output = hrnet(batch)
              print(output)
              cls_pred = output.argmax(dim=1)
              print(cls_pred)
          # Release cached GPU memory
          torch.cuda.empty_cache()
          return cls_pred


      if __name__ == '__main__':
          predict()
      
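The script above prints raw logits and a class index. To turn those into a readable label and a confidence value, something like the following standard-library sketch may help. The helper names and the example logits are hypothetical, and the class-name list is assumed to be the alphabetically sorted training folder names.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def describe_prediction(logits, class_names):
    """Return the most likely class name and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]


# Hypothetical logits for one image from a two-class model.
label, confidence = describe_prediction([0.2, 1.7], ["a", "b"])
print(label, round(confidence, 3))
```

With a real model, the logits would come from something like `output[0].tolist()` in the inference script.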

    END 🔚

    1. Hope this is a useful reference for anyone who needs it. 😝
  • Original article: https://blog.csdn.net/haiyangyunbao813/article/details/127998525