• TensorMask 0.1 & Detectron2 0.6: Building, Installing, and Testing on Windows [2022.8.7]



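    While surveying instance segmentation methods, I came across the TensorMask model, which requires the Detectron2 framework as a prerequisite. Detectron2's installation document, INSTALL.md, has no instructions for Windows and assumes a Linux gcc & g++ environment. This post documents how I built it on Windows.

    First, my hardware: a Y9000P bought in March 2022, with an i7-12700H processor and an Nvidia RTX 3060 6GB GPU.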

    Dependencies

    I use a Python 3.8.8 virtual environment under Anaconda3 4.12.0, with CUDA 11.3,
    cuDNN 8.2.0, PyTorch 1.10.2+cu113, and Torchvision 0.11.3+cu113 (the release that pairs with PyTorch 1.10.2).
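
Before building anything, it can help to confirm which versions are actually installed in the active environment. The sketch below is only an illustration: the helper name installed_versions is hypothetical, and the package list simply mirrors the ones mentioned above. It reads package metadata only, so it works even if a package fails to import.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Package names as published on PyPI; adjust the list as needed.
    report = installed_versions(["torch", "torchvision", "setuptools"])
    for name, version in report.items():
        print(f"{name}: {version or 'not installed'}")
```

If torch and torchvision do not report a matching +cu113 pair, it is probably worth fixing that first, since the CUDA extensions are compiled against the installed PyTorch.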

    First, install the MSVC C++ build tools. I used the Visual Studio Installer (Enterprise 2019) and installed the "Desktop development with C++" workload.

    Microsoft Visual Studio Community 2022 also works; choose whichever suits you.

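    After installation, add <install path>\2019\Enterprise\VC\Auxiliary\Build\ to the PATH environment variable. In my case Visual Studio lives in the Softwares folder on drive D, so the entry is D:\Softwares\Microsoft Visual Studio\2019\Enterprise\VC\Auxiliary\Build\.

    After confirming, press Win+R and enter cmd, then run vcvars64 followed by cl in the Command Prompt to verify the setup. If cl prints the Microsoft C/C++ compiler version banner instead of a "not recognized" message, the configuration is complete.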
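
The same check can be scripted. This is a minimal sketch, assuming it is run from the same Command Prompt session in which vcvars64 was executed (cl is only placed on PATH for that session); the helper name tools_on_path is hypothetical, and nvcc is included only because the later build steps call it.

```python
import shutil


def tools_on_path(tools):
    """Return {tool: full path or None} for each executable name."""
    return {tool: shutil.which(tool) for tool in tools}


if __name__ == "__main__":
    # cl is the MSVC compiler; vcvars64 is the batch file that sets up
    # its environment; nvcc is the CUDA compiler used by the build.
    for tool, path in tools_on_path(["cl", "vcvars64", "nvcc"]).items():
        print(f"{tool}: {path or 'NOT FOUND on PATH'}")
```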

    Detectron2

    Clone the source with Git (or download and extract the ZIP archive from GitHub), then run the following commands in order:

    pip install opencv-python
    git clone https://github.com/facebookresearch/detectron2.git detectron2
    cd detectron2
    SET DISTUTILS_USE_SDK=1
    vcvars64
    pip install -e .
    

    The installation is complete when the following output appears:

    Installing collected packages: termcolor, tensorboard-plugin-wit, pywin32, pyasn1, mypy-extensions, antlr4-python3-runtime, zipp, urllib3, tomli, tensorboard-data-server, tabulate, six, rsa, pyyaml, pyparsing, pyasn1-modules, protobuf, portalocker, platformdirs, pathspec, oauthlib, MarkupSafe, kiwisolver, idna, future, fonttools, cycler, colorama, cloudpickle, charset-normalizer, cachetools, absl-py, yacs, werkzeug, tqdm, requests, python-dateutil, pydot, packaging, omegaconf, importlib-resources, importlib-metadata, grpcio, google-auth, fairscale, click, timm, requests-oauthlib, matplotlib, markdown, iopath, hydra-core, black, pycocotools, google-auth-oauthlib, fvcore, tensorboard, detectron2
      Running setup.py develop for detectron2
    Successfully installed detectron2-0.6
    

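    To try the demo program, put 000000439715.jpg from COCO 2017 into the demo folder and run the commands below. To save the result, add the --output argument; without it, the result is displayed with OpenCV's imshow.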

    cd demo
    python demo.py --config-file ../configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml --input 000000439715.jpg  --opts MODEL.WEIGHTS detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl 
    

    TensorMask

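    From the detectron2 root directory (go back up with cd .. if you are still inside demo), enter projects\TensorMask and run the install command: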

    cd projects\TensorMask
    pip install -e .
    

    As expected, this fails with an error:

    note: This error originates from a subprocess, and is likely not a problem with pip.
    

    Scrolling up reveals the actual error:

    × python setup.py develop did not run successfully.
    │ exit code: 1
    ╰─> [91 lines of output]
        running develop
        running egg_info
        writing tensormask.egg-info\PKG-INFO
        writing dependency_links to tensormask.egg-info\dependency_links.txt
        writing top-level names to tensormask.egg-info\top_level.txt
        reading manifest file 'tensormask.egg-info\SOURCES.txt'
        writing manifest file 'tensormask.egg-info\SOURCES.txt'
        running build_ext
        building 'tensormask._C' extension
        Emitting ninja build file D:\Workspace\exps\detectron2\projects\TensorMask\build\temp.win-amd64-3.8\Release\build.ninja...
        Compiling objects...
        Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
        D:\Softwares\Anaconda3\envs\detectron2\lib\site-packages\setuptools\command\easy_install.py:144: EasyInstallDeprecationWarning: easy_install command is deprecated. Use build and pip and other standards-based tools.
          warnings.warn(
        D:\Softwares\Anaconda3\envs\detectron2\lib\site-packages\setuptools\command\install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
          warnings.warn(
        [1/1] D:\Softwares\CUDA\v11.3\bin\nvcc --generate-dependencies-with-compile --dependency-output D:\Workspace\exps\detectron2\projects\TensorMask\build\temp.win-amd64-3.8\Release\Workspace\exps\detectron2\projects\TensorMask\tensormask\layers\csrc\SwapAlign2Nat\SwapAlign2Nat_cuda.obj.d --use-local-env -Xcompiler /MD -Xcompiler /wd4819 -Xcompiler /wd4251 -Xcompiler /wd4244 -Xcompiler /wd4267 -Xcompiler /wd4275 -Xcompiler /wd4018 -Xcompiler /wd4190 -Xcompiler /EHsc -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -DWITH_CUDA -ID:\Workspace\exps\detectron2\projects\TensorMask\tensormask\layers\csrc -ID:\Softwares\Anaconda3\envs\detectron2\lib\site-packages\torch\include -ID:\Softwares\Anaconda3\envs\detectron2\lib\site-packages\torch\include\torch\csrc\api\include -ID:\Softwares\Anaconda3\envs\detectron2\lib\site-packages\torch\include\TH -ID:\Softwares\Anaconda3\envs\detectron2\lib\site-packages\torch\include\THC -ID:\Softwares\CUDA\v11.3\include -ID:\Softwares\Anaconda3\envs\detectron2\include -ID:\Softwares\Anaconda3\envs\detectron2\Include "-ID:\Softwares\Microsoft Visual Studio\2019\Enterprise\VC\Tools\MSVC\14.29.30133\ATLMFC\include" "-ID:\Softwares\Microsoft Visual Studio\2019\Enterprise\VC\Tools\MSVC\14.29.30133\include" "-ID:\Windows Kits\10\include\10.0.19041.0\ucrt" "-ID:\Windows Kits\10\include\10.0.19041.0\shared" "-ID:\Windows Kits\10\include\10.0.19041.0\um" "-ID:\Windows Kits\10\include\10.0.19041.0\winrt" "-ID:\Windows Kits\10\include\10.0.19041.0\cppwinrt" -c D:\Workspace\exps\detectron2\projects\TensorMask\tensormask\layers\csrc\SwapAlign2Nat\SwapAlign2Nat_cuda.cu -o D:\Workspace\exps\detectron2\projects\TensorMask\build\temp.win-amd64-3.8\Release\Workspace\exps\detectron2\projects\TensorMask\tensormask\layers\csrc\SwapAlign2Nat\SwapAlign2Nat_cuda.obj 
-D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86
        FAILED: D:/Workspace/exps/detectron2/projects/TensorMask/build/temp.win-amd64-3.8/Release/Workspace/exps/detectron2/projects/TensorMask/tensormask/layers/csrc/SwapAlign2Nat/SwapAlign2Nat_cuda.obj
        D:\Softwares\CUDA\v11.3\bin\nvcc --generate-dependencies-with-compile --dependency-output D:\Workspace\exps\detectron2\projects\TensorMask\build\temp.win-amd64-3.8\Release\Workspace\exps\detectron2\projects\TensorMask\tensormask\layers\csrc\SwapAlign2Nat\SwapAlign2Nat_cuda.obj.d --use-local-env -Xcompiler /MD -Xcompiler /wd4819 -Xcompiler /wd4251 -Xcompiler /wd4244 -Xcompiler /wd4267 -Xcompiler /wd4275 -Xcompiler /wd4018 -Xcompiler /wd4190 -Xcompiler /EHsc -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -DWITH_CUDA -ID:\Workspace\exps\detectron2\projects\TensorMask\tensormask\layers\csrc -ID:\Softwares\Anaconda3\envs\detectron2\lib\site-packages\torch\include -ID:\Softwares\Anaconda3\envs\detectron2\lib\site-packages\torch\include\torch\csrc\api\include -ID:\Softwares\Anaconda3\envs\detectron2\lib\site-packages\torch\include\TH -ID:\Softwares\Anaconda3\envs\detectron2\lib\site-packages\torch\include\THC -ID:\Softwares\CUDA\v11.3\include -ID:\Softwares\Anaconda3\envs\detectron2\include -ID:\Softwares\Anaconda3\envs\detectron2\Include "-ID:\Softwares\Microsoft Visual Studio\2019\Enterprise\VC\Tools\MSVC\14.29.30133\ATLMFC\include" "-ID:\Softwares\Microsoft Visual Studio\2019\Enterprise\VC\Tools\MSVC\14.29.30133\include" "-ID:\Windows Kits\10\include\10.0.19041.0\ucrt" "-ID:\Windows Kits\10\include\10.0.19041.0\shared" "-ID:\Windows Kits\10\include\10.0.19041.0\um" "-ID:\Windows Kits\10\include\10.0.19041.0\winrt" "-ID:\Windows Kits\10\include\10.0.19041.0\cppwinrt" -c D:\Workspace\exps\detectron2\projects\TensorMask\tensormask\layers\csrc\SwapAlign2Nat\SwapAlign2Nat_cuda.cu -o D:\Workspace\exps\detectron2\projects\TensorMask\build\temp.win-amd64-3.8\Release\Workspace\exps\detectron2\projects\TensorMask\tensormask\layers\csrc\SwapAlign2Nat\SwapAlign2Nat_cuda.obj 
-D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86
        D:\Workspace\exps\detectron2\projects\TensorMask\tensormask\layers\csrc\SwapAlign2Nat\SwapAlign2Nat_cuda.cu(438): error: no instance of function template "at::cuda::ATenCeilDiv" matches the argument list
                    argument types are: (int64_t, long)
    
        D:\Workspace\exps\detectron2\projects\TensorMask\tensormask\layers\csrc\SwapAlign2Nat\SwapAlign2Nat_cuda.cu(438): error: no instance of overloaded function "std::min" matches the argument list
                    argument types are: (<error-type>, long)
    
        D:\Workspace\exps\detectron2\projects\TensorMask\tensormask\layers\csrc\SwapAlign2Nat\SwapAlign2Nat_cuda.cu(495): error: no instance of function template "at::cuda::ATenCeilDiv" matches the argument list
                    argument types are: (int64_t, long)
    
        D:\Workspace\exps\detectron2\projects\TensorMask\tensormask\layers\csrc\SwapAlign2Nat\SwapAlign2Nat_cuda.cu(495): error: no instance of overloaded function "std::min" matches the argument list
                    argument types are: (<error-type>, long)
    
        4 errors detected in the compilation of "D:/Workspace/exps/detectron2/projects/TensorMask/tensormask/layers/csrc/SwapAlign2Nat/SwapAlign2Nat_cuda.cu".
        SwapAlign2Nat_cuda.cu
        ninja: build stopped: subcommand failed.
        Traceback (most recent call last):
          File "D:\Softwares\Anaconda3\envs\detectron2\lib\site-packages\torch\utils\cpp_extension.py", line 1717, in _run_ninja_build
            subprocess.run(
          File "D:\Softwares\Anaconda3\envs\detectron2\lib\subprocess.py", line 516, in run
            raise CalledProcessError(retcode, process.args,
        subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.
    

    Note that the fix commonly suggested online, changing ['ninja', '-v'] to ['ninja', '-V'] or ['ninja', '--version'], does not solve the problem. The real cause of the failure is here:

    D:\Workspace\exps\detectron2\projects\TensorMask\tensormask\layers\csrc\SwapAlign2Nat\SwapAlign2Nat_cuda.cu(438): error: no instance of function template "at::cuda::ATenCeilDiv" matches the argument list
                    argument types are: (int64_t, long)
    
        D:\Workspace\exps\detectron2\projects\TensorMask\tensormask\layers\csrc\SwapAlign2Nat\SwapAlign2Nat_cuda.cu(438): error: no instance of overloaded function "std::min" matches the argument list
                    argument types are: (<error-type>, long)
    
        D:\Workspace\exps\detectron2\projects\TensorMask\tensormask\layers\csrc\SwapAlign2Nat\SwapAlign2Nat_cuda.cu(495): error: no instance of function template "at::cuda::ATenCeilDiv" matches the argument list
                    argument types are: (int64_t, long)
    
        D:\Workspace\exps\detectron2\projects\TensorMask\tensormask\layers\csrc\SwapAlign2Nat\SwapAlign2Nat_cuda.cu(495): error: no instance of overloaded function "std::min" matches the argument list
                    argument types are: (<error-type>, long)
    
        4 errors detected in the compilation of "D:/Workspace/exps/detectron2/projects/TensorMask/tensormask/layers/csrc/SwapAlign2Nat/SwapAlign2Nat_cuda.cu".
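
Because the useful lines are buried in a very long log, one way to surface them is to filter the saved output for compiler diagnostics. The sketch below is an assumption-laden illustration: the function name compiler_errors and the regular expression are my own, and the pattern only targets the "file(line): error" form seen in this log. The sample text is abridged from the output above.

```python
import re

# Matches diagnostics of the form "file.cu(438): error ..." as printed
# by nvcc/MSVC in the log above.
ERROR_LINE = re.compile(r"\(\d+\): error\b")


def compiler_errors(log_text):
    """Return the stripped log lines that look like compiler errors, in order."""
    return [
        line.strip()
        for line in log_text.splitlines()
        if ERROR_LINE.search(line)
    ]


sample_log = """\
SwapAlign2Nat_cuda.cu(438): error: no instance of function template "at::cuda::ATenCeilDiv" matches the argument list
            argument types are: (int64_t, long)
ninja: build stopped: subcommand failed.
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.
"""

for line in compiler_errors(sample_log):
    print(line)
```

The trailing CalledProcessError line is deliberately not matched: it only reports that ninja failed, not why.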
    

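    The compile fails in detectron2/projects/TensorMask/tensormask/layers/csrc/SwapAlign2Nat/SwapAlign2Nat_cuda.cu because two different integer types, int64_t and long, are passed to the same function template, so the C++ source cannot be compiled. Open that file and change the data types on lines 438 and 495:

    // Line 438 (original):
    // dim3 grid(std::min(at::cuda::ATenCeilDiv(Y.numel(), 512L), 4096L));
    // Cast numel() to int so every argument has the same type (assumes numel() fits in an int).
    dim3 grid(std::min(at::cuda::ATenCeilDiv((int)Y.numel(), 512), 4096));
    // Line 495 (original):
    // dim3 grid(std::min(at::cuda::ATenCeilDiv(gY.numel(), 512L), 4096L));
    dim3 grid(std::min(at::cuda::ATenCeilDiv((int)gY.numel(), 512), 4096));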
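
To make the type issue concrete: the error message shows the arguments arriving as (int64_t, long). Under MSVC, long is 32 bits and int64_t is long long, so the two arguments are different types; on 64-bit Linux, long is itself 64 bits. The sketch below prints the sizes on whatever platform runs it and reproduces the arithmetic of the patched line in plain Python. The names ceil_div and grid_blocks are hypothetical, and the assumption that ATenCeilDiv computes (a + b - 1) / b is mine, not something shown in the log.

```python
import ctypes


def ceil_div(a, b):
    """Integer ceiling division for positive ints, written as (a + b - 1) // b."""
    return (a + b - 1) // b


def grid_blocks(numel, threads=512, max_blocks=4096):
    """Number of blocks needed for numel elements, capped at max_blocks."""
    return min(ceil_div(numel, threads), max_blocks)


# Size in bytes of the C types involved, on the current platform.
print("long:     ", ctypes.sizeof(ctypes.c_long))      # platform dependent
print("long long:", ctypes.sizeof(ctypes.c_longlong))
print("int64_t:  ", ctypes.sizeof(ctypes.c_int64))

# Block count for a tensor with 1000 elements.
print("blocks:   ", grid_blocks(1000))
```

One caveat with the (int) cast: it would truncate for tensors with more than 2^31 - 1 elements. An alternative I have not tested would be to keep the 64-bit arithmetic and cast the two literals to int64_t instead.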
    

    Run pip install -e . again; this time the build and installation succeed:

    Obtaining file:///D:/Workspace/exps/detectron2/projects/TensorMask
      Preparing metadata (setup.py) ... done
    Installing collected packages: tensormask
      Running setup.py develop for tensormask
    Successfully installed tensormask-0.1
    

    Now try running TensorMask's training script, `train_net.py`:

    python train_net.py --config-file configs/tensormask_R_50_FPN_1x.yaml SOLVER.IMS_PER_BATCH 1
    

    Unsurprisingly, it fails again:

    Traceback (most recent call last):
      File "train_net.py", line 63, in <module>
        launch(
      File "d:\workspace\exps\detectron2\detectron2\engine\launch.py", line 82, in launch
        main_func(*args)
      File "train_net.py", line 55, in main
        trainer = Trainer(cfg)
      File "d:\workspace\exps\detectron2\detectron2\engine\defaults.py", line 378, in __init__
        data_loader = self.build_train_loader(cfg)
      File "d:\workspace\exps\detectron2\detectron2\engine\defaults.py", line 547, in build_train_loader
        return build_detection_train_loader(cfg)
      File "d:\workspace\exps\detectron2\detectron2\config\config.py", line 207, in wrapped
        explicit_args = _get_args_from_config(from_config, *args, **kwargs)
      File "d:\workspace\exps\detectron2\detectron2\config\config.py", line 245, in _get_args_from_config
        ret = from_config_func(*args, **kwargs)
      File "d:\workspace\exps\detectron2\detectron2\data\build.py", line 344, in _train_loader_from_config
        dataset = get_detection_dataset_dicts(
      File "d:\workspace\exps\detectron2\detectron2\data\build.py", line 241, in get_detection_dataset_dicts
        dataset_dicts = [DatasetCatalog.get(dataset_name) for dataset_name in names]
      File "d:\workspace\exps\detectron2\detectron2\data\build.py", line 241, in <listcomp>
        dataset_dicts = [DatasetCatalog.get(dataset_name) for dataset_name in names]
      File "d:\workspace\exps\detectron2\detectron2\data\catalog.py", line 58, in get
        return f()
      File "d:\workspace\exps\detectron2\detectron2\data\datasets\coco.py", line 500, in <lambda>
        DatasetCatalog.register(name, lambda: load_coco_json(json_file, image_root, name))
      File "d:\workspace\exps\detectron2\detectron2\data\datasets\coco.py", line 69, in load_coco_json
        coco_api = COCO(json_file)
      File "D:\Softwares\Anaconda3\envs\detectron2\lib\site-packages\pycocotools\coco.py", line 81, in __init__
        with open(annotation_file, 'r') as f:
    FileNotFoundError: [Errno 2] No such file or directory: 'datasets\\coco/annotations/instances_train2017.json'
    

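    The message says that instances_train2017.json from the COCO dataset cannot be found. Create a directory junction named datasets that points to the local dataset path:

    REM D:\Workspace\data is where I keep my datasets; it must contain the coco folder
    mklink /j datasets D:\Workspace\data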
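
Before launching training again, a quick check that the link resolves to the expected layout can save a round trip. This is a minimal sketch: the helper name missing_coco_paths is hypothetical, the annotation path comes from the traceback above, and the train2017 image folder is my assumption about the layout Detectron2 expects. Detectron2 reads the dataset root from the DETECTRON2_DATASETS environment variable and falls back to ./datasets.

```python
import os
from pathlib import Path


def missing_coco_paths(root):
    """Return the expected COCO train2017 paths that do not exist under root."""
    root = Path(root)
    expected = [
        root / "coco" / "annotations" / "instances_train2017.json",
        root / "coco" / "train2017",  # image folder (assumed layout)
    ]
    return [str(path) for path in expected if not path.exists()]


if __name__ == "__main__":
    dataset_root = os.environ.get("DETECTRON2_DATASETS", "datasets")
    missing = missing_coco_paths(dataset_root)
    for path in missing:
        print("missing:", path)
    if not missing:
        print("COCO train2017 layout looks complete under", dataset_root)
```

Run it from projects\TensorMask, the same directory where the junction was created.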
    

    Running the training command again produces yet another error:

    Traceback (most recent call last):
      File "train_net.py", line 63, in <module>
        launch(
      File "d:\workspace\exps\detectron2\detectron2\engine\launch.py", line 82, in launch
        main_func(*args)
      File "train_net.py", line 55, in main
        trainer = Trainer(cfg)
      File "d:\workspace\exps\detectron2\detectron2\engine\defaults.py", line 396, in __init__
        self.register_hooks(self.build_hooks())
      File "d:\workspace\exps\detectron2\detectron2\engine\defaults.py", line 463, in build_hooks
        ret.append(hooks.PeriodicWriter(self.build_writers(), period=20))
      File "d:\workspace\exps\detectron2\detectron2\engine\defaults.py", line 475, in build_writers
        return default_writers(self.cfg.OUTPUT_DIR, self.max_iter)
      File "d:\workspace\exps\detectron2\detectron2\engine\defaults.py", line 248, in default_writers
        TensorboardXWriter(output_dir),
      File "d:\workspace\exps\detectron2\detectron2\utils\events.py", line 145, in __init__
        from torch.utils.tensorboard import SummaryWriter
      File "D:\Softwares\Anaconda3\envs\detectron2\lib\site-packages\torch\utils\tensorboard\__init__.py", line 4, in <module>
        LooseVersion = distutils.version.LooseVersion
    AttributeError: module 'distutils' has no attribute 'version'
    

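    The cause is that the setuptools version is too new. My environment had 61.2.0, so I force-installed 59.5.0 [3] (for example, with pip install setuptools==59.5.0) and ran the training command again.

    Training starts, and the installation tutorial ends here.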
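
To tell in advance whether an environment is likely to hit the distutils error, the installed setuptools version can be compared against the one that worked here. This is only a rough sketch: the helper names are hypothetical, and treating everything newer than 59.5.0 as suspect is a simplification based on the fix above rather than a statement about exactly which release introduced the change.

```python
from importlib import metadata


def version_tuple(version):
    """Turn '61.2.0' into (61, 2, 0); non-numeric parts are ignored."""
    return tuple(int(part) for part in version.split(".") if part.isdigit())


def is_newer_than(version, pinned="59.5.0"):
    """True if version is strictly newer than the pinned one."""
    return version_tuple(version) > version_tuple(pinned)


if __name__ == "__main__":
    try:
        current = metadata.version("setuptools")
    except metadata.PackageNotFoundError:
        current = None
    if current is None:
        print("setuptools is not installed")
    elif is_newer_than(current):
        print(f"setuptools {current} is newer than 59.5.0; "
              "the tensorboard import may fail")
    else:
        print(f"setuptools {current} should be fine")
```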

    Summary

    2022.8.7

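    Detectron2 had tormented me for several days before this. After going through a lot of material, I found that many suggested fixes amount to changing ['ninja', '-v'] to ['ninja', '-V'], after which the build reports LINK : fatal error LNK1181: cannot open input file .... That error is only a symptom: the ninja compile step failed, so the object files the linker expects were never produced.
    You therefore have to trace upward to the origin of the error, which turned out to be lines 438 and 495 of tensormask/layers/csrc/SwapAlign2Nat/SwapAlign2Nat_cuda.cu, where mismatched data types break the build. The likely reason is that on Linux with gcc & g++, int64_t and long are the same 64-bit integer type and are interchangeable, whereas under MSVC on Windows long is a 32-bit integer and the argument types must match exactly.

    Tracing the error to its root made me take a fresh look at myself. Time pressure had made me forget why I study in the first place: when you find an error, trace it to its root and propose a solution.

    During this time I kept reading up on C++ whenever I could. Someone who learns Python from scratch may never touch C++ and so may not find the right fix. I am sharing this to mark getting back to the original purpose of my study and research, and to help others who are stuck building on Windows. If I have time soon, I will write up training and validation on custom datasets with detectron2.

    If you have questions, reply in the comments or send me a private message. I usually answer quickly; I am a beginner myself.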

    References

    [1] The latest 2022 Detectron 2 (0.6) installation procedure (Lenovo Y9000K laptop + Anaconda + Win 11 + RTX3070)

    [2] Detectron2 0.2.1 installation (Windows 10)

    [3] Solution for AttributeError: module 'distutils' has no attribute 'version'

  • Original article: https://blog.csdn.net/weixin_45921726/article/details/126201128