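Reference documentation link
nnUNet is widely used in medical image segmentation competitions and sits at or near the top of many of their leaderboards.
Running nnUNet in Docker has practical benefits; for example, the built image can be saved and reused for training on other machines.
Prerequisite: a recent NVIDIA GPU driver is already installed on the host
Install the latest docker and nvidia-docker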
sudo apt-get update
sudo apt-get install -y docker.io
# Start Docker
sudo systemctl start docker
# Start Docker automatically at boot
sudo systemctl enable docker
Use docker to pull the PyTorch image that will run on the GPU
docker pull pytorch/pytorch:latest
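Before building anything on top of this image, it can help to confirm that a container started from it actually sees the GPU. The following is a minimal sketch, assuming the NVIDIA container runtime from the installation step is working. The `run` helper and the `RUN` variable are illustrative names introduced here, not part of Docker: by default the script only prints the command, and it executes it when `RUN=1` is set.

```shell
set -eu

IMAGE="pytorch/pytorch:latest"

# Hypothetical helper: print a command, and execute it only when RUN=1.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

# Ask PyTorch inside the container whether a CUDA device is visible.
# torch.cuda.is_available() reports True only if the GPU can be used.
run docker run --rm --gpus all "$IMAGE" \
    python -c "import torch; print(torch.cuda.is_available())"
```

If the check reports False, the problem is usually on the host side (driver or container runtime) rather than in the image.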
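Download the latest code from the official nnUNet repository
nnUNet GitHub repository: https://github.com/MIC-DKFZ/nnUNet
Enter the nnUNet code folder and create a file named Dockerfile with the following content. Note that the environment variable names used in this Dockerfile (nnUNet_raw_data_base, RESULTS_FOLDER) are those of nnUNet v1.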
FROM pytorch/pytorch:latest
RUN apt-get update && \
    apt-get install -y vim && \
    apt-get install -y --no-install-recommends \
        python3-pip \
        python3-setuptools \
        build-essential && \
    apt-get clean && \
    python -m pip install --upgrade pip
WORKDIR /workspace
COPY ./ /workspace
RUN pip install pip -U
RUN pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
RUN pip install -e .
ENV nnUNet_raw_data_base="/workspace/data"
ENV nnUNet_preprocessed="/workspace/data/nnUNet_preprocessed"
ENV RESULTS_FOLDER="/workspace/data/RESULTS_FOLDER"
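Note: the nnUNetData folder on the local disk, which holds the training data, must follow the directory layout that the paths in the Dockerfile expect, with nnUNet_raw_data, nnUNet_preprocessed and RESULTS_FOLDER directly under it. Unless you know exactly what you are doing, do not rename these folders.
Put the prepared TaskXXXDemo training data into nnUNet_raw_data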
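The layout implied by the `ENV` lines in the Dockerfile and by the `nnUNet_raw_data` folder can be sketched as follows. This is an assumption reconstructed from those paths, not an official listing, and `DATA_ROOT` is a hypothetical variable standing for wherever `nnUNetData` lives on the host.

```shell
set -eu

# Hypothetical host location of the data folder; adjust to your own disk.
DATA_ROOT="${DATA_ROOT:-./nnUNetData}"

# Raw task folders (for example TaskXXXDemo) are placed here.
mkdir -p "$DATA_ROOT/nnUNet_raw_data"
# Corresponds to ENV nnUNet_preprocessed in the Dockerfile.
mkdir -p "$DATA_ROOT/nnUNet_preprocessed"
# Corresponds to ENV RESULTS_FOLDER in the Dockerfile.
mkdir -p "$DATA_ROOT/RESULTS_FOLDER"

# List the resulting top-level folders.
ls -1 "$DATA_ROOT"
```

When this folder is mounted at /workspace/data inside the container, the three folders line up with the paths set in the Dockerfile.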
docker build -t nnunet_docker:0.0.1 .
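Here nnunet_docker:0.0.1 is the image name and tag in the form name:version, and the trailing "." sets the build context to the current directory
Once the build finishes, the image can be exported with docker image save and used for training on other machines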
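Moving the image to another machine could look like the sketch below. The archive name is a hypothetical choice, and the `run` helper and `RUN` variable are illustrative: the script prints the commands by default and executes them only when `RUN=1` is set.

```shell
set -eu

IMAGE="nnunet_docker:0.0.1"
# Hypothetical archive name; any writable path works.
ARCHIVE="nnunet_docker_0.0.1.tar"

# Hypothetical helper: print a command, and execute it only when RUN=1.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

# On the build machine: write the image to a tar archive.
run docker image save -o "$ARCHIVE" "$IMAGE"

# On the target machine, after copying the archive over: restore the image.
run docker image load -i "$ARCHIVE"
```

The loaded image keeps its name and tag, so the same docker run commands work on the target machine.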
docker run --gpus all --rm -it nnunet_docker:0.0.1 /bin/bash
This opens an interactive terminal such as
root@02c06f65cc7e:/workspace#
First, check that the GPU is usable inside the container
root@02c06f65cc7e:/workspace# nvidia-smi
Fri Aug  5 15:37:12 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 495.44       Driver Version: 495.44       CUDA Version: 11.5     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:21:00.0 Off |                  N/A |
| 60%   63C    P2   249W / 350W |  18809MiB / 24265MiB |     81%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce ...  Off  | 00000000:49:00.0 Off |                  N/A |
| 56%   61C    P2   246W / 350W |  17557MiB / 24268MiB |     72%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  NVIDIA GeForce ...  Off  | 00000000:4A:00.0 Off |                  N/A |
| 67%   67C    P2   250W / 350W |  20051MiB / 24268MiB |     65%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+
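To run training non-interactively, mount the host data and code folders into the container and execute the training script
docker run --rm -it --gpus all --ipc=host -v /home/lus/Project/datasets/nnUNet/:/workspace/data/ -v /home/lus/myProject/codes/A_project/nnUNet:/workspace/ nnunet_docker:0.0.1 /bin/bash -c "sh tmp_fold1.sh"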
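The contents of tmp_fold1.sh are not shown here. As a purely hypothetical sketch, a script of that kind could call the nnUNet v1 command-line tools to preprocess a task and train fold 1. The task ID placeholder `XXX`, the `3d_fullres` configuration and the `nnUNetTrainerV2` trainer are assumptions to be replaced with your own choices. The `run` helper and `RUN` variable are illustrative: commands are printed by default and executed only when `RUN=1` is set inside the container.

```shell
set -eu

# Placeholder task ID; replace XXX with the number of your TaskXXX folder.
TASK_ID="${TASK_ID:-XXX}"
# Assumed from the script name tmp_fold1.sh.
FOLD=1

# Hypothetical helper: print a command, and execute it only when RUN=1.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

# Plan the experiments and preprocess the raw data for this task.
run nnUNet_plan_and_preprocess -t "$TASK_ID" --verify_dataset_integrity

# Train a single fold with the assumed configuration and trainer.
run nnUNet_train 3d_fullres nnUNetTrainerV2 "$TASK_ID" "$FOLD"
```

Keeping one script per fold makes it straightforward to start each fold in its own container.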
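To share the image through a registry, we take Alibaba Cloud as an example
First register an account for the Alibaba Cloud container registry
https://cr.console.aliyun.com/cn-hangzhou/instances
Register by following the official getting-started guide
Next, use docker tag to give the image we built a name that points at the registry
docker tag [ImageID] registry.cn-hangzhou.aliyuncs.com/------:[image version]
Log in to your account from the terminal
docker login --username=[account] registry.cn-hangzhou.aliyuncs.com
Push the image
docker push registry.cn-hangzhou.aliyuncs.com/-----:[image version]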
RuntimeError: MultiThreadedAugmenter.abort_event was set, something went wrong. Maybe one of your workers crashed. This is not the actual error message! Look further up your stdout to see what caused the error. Please also check whether your RAM was full
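The error above can appear when running nnUNet inside Docker
The fix is to add --ipc=host to the docker run command so that the container shares the host's shared memory. The most likely cause is that Docker's default /dev/shm is small and the data augmentation worker processes exhaust it
docker run --rm -it --gpus all --ipc=host nnunet_docker:0.0.1 /bin/bash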
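One way to see what the flag changes is to compare the size of /dev/shm inside the container with and without it. Docker gives each container a private /dev/shm of 64 MB by default. The sketch below also shows `--shm-size` as an alternative that keeps IPC isolated; the `8g` value is an arbitrary example, and the `run` helper and `RUN` variable are illustrative names (commands are printed by default and executed only when `RUN=1` is set).

```shell
set -eu

IMAGE="nnunet_docker:0.0.1"

# Hypothetical helper: print a command, and execute it only when RUN=1.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

# Default settings: a small private /dev/shm.
run docker run --rm "$IMAGE" df -h /dev/shm

# With --ipc=host the container uses the host's shared memory.
run docker run --rm --ipc=host "$IMAGE" df -h /dev/shm

# Alternative: keep IPC isolated but enlarge /dev/shm (8g is an example size).
run docker run --rm --shm-size=8g "$IMAGE" df -h /dev/shm
```

If sharing the host IPC namespace is not acceptable on a multi-user machine, the `--shm-size` variant is the more conservative choice.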