• Offline installation of nvidia-docker on Ubuntu: the complete process (the simplest fix for nvidia-docker: command not found)


    Problem description

    After installing Docker and the NVIDIA driver, run:

    nvidia-docker
    

    The following error is reported:

    nvidia-docker: command not found
    

    A second possible error:

    Error response from daemon: Unknown runtime specified nvidia.
    See 'docker run --help'.
    

    [Note] To fix the second error, skip straight to section 3.3 (editing daemon.json), then follow steps 4 and 5 in order.
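
The two errors have different causes: the first means the shell cannot find the nvidia-docker wrapper, the second means the Docker daemon has no runtime named nvidia. A minimal sketch for checking which pieces are present, assuming a POSIX shell; `has_command` is a hypothetical helper name, not part of any package:

```shell
# Return success if the given command can be found on the PATH.
has_command() {
    command -v "$1" >/dev/null 2>&1
}

# The wrapper script comes from nvidia-docker2, the runtime binary from
# nvidia-container-runtime (both are in the package list of step 1).
for cmd in nvidia-docker nvidia-container-runtime; do
    if has_command "$cmd"; then
        echo "$cmd: found"
    else
        echo "$cmd: missing, install the packages from step 1"
    fi
done
```

If both commands are found but the second error still appears, the daemon configuration is the likely culprit.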

    Environment

    • Ubuntu 18.04
    • Docker version 18.09.6
    • NVIDIA-SMI 460.32.03 Driver Version: 460.32.03 CUDA Version: 11.2

    Solution

    • 1. Download the five deb files
      libnvidia-container1
      libnvidia-container-tools
      nvidia-container-toolkit
      nvidia-container-runtime
      nvidia-docker2

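      Download link (a mirror site for the nvidia-docker packages, not the Docker website; note that the path points at the ubuntu16.04 directory):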
      http://mirror.cs.uchicago.edu/nvidia-docker/libnvidia-container/stable/ubuntu16.04/amd64/

      Files I downloaded:

      libnvidia-container-tools_1.7.0-1_amd64.deb  
      nvidia-container-toolkit_1.7.0-1_amd64.deb
      libnvidia-container1_1.7.0-1_amd64.deb  
      nvidia-container-runtime_3.7.0-1_all.deb     
      nvidia-docker2_2.8.0-1_all.deb
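
Since the target machine is offline, it is worth confirming that all five packages were actually copied over before installing. A minimal sketch, assuming the files keep the usual `<package>_<version>_<arch>.deb` naming; `missing_debs` is a hypothetical helper:

```shell
# Print the name of every required package that has no matching .deb in $1.
missing_debs() {
    dir=$1
    for pkg in libnvidia-container1 libnvidia-container-tools \
               nvidia-container-toolkit nvidia-container-runtime nvidia-docker2
    do
        # An unmatched glob stays literal, so the -e test fails for it.
        set -- "$dir"/"$pkg"_*.deb
        [ -e "$1" ] || echo "$pkg"
    done
}

# Example: list what is missing from the current directory.
#   missing_debs .
```

No output means every package has a matching file.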
      
    • 2. Install

      Run:

      sudo dpkg -i ./lib*  ./nvidia*
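
dpkg can leave a package unpacked but not configured when a dependency is missing, so it helps to check the state of each package afterwards. A minimal sketch using `dpkg-query`; the helper name `pkg_installed` is illustrative:

```shell
# Succeed only if dpkg reports the package as fully installed.
pkg_installed() {
    dpkg-query -W -f='${Status}' "$1" 2>/dev/null | grep -q 'install ok installed'
}

for pkg in libnvidia-container1 libnvidia-container-tools \
           nvidia-container-toolkit nvidia-container-runtime nvidia-docker2
do
    if pkg_installed "$pkg"; then
        echo "$pkg: installed"
    else
        echo "$pkg: NOT installed"
    fi
done
```

A package stuck on a dependency problem shows up here as NOT installed.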
      
    • 3. If step 2 reports the following error during installation:

      dpkg: dependency problems prevent configuration of nvidia-docker2:
       nvidia-docker2 depends on docker-ce (>= 18.06.0~ce~3-0~ubuntu) | docker-ee (>= 18.06.0~ce~3-0~ubuntu) | docker.io (>= 18.06.0); however:
        Package docker-ce is not installed.
        Package docker-ee is not installed.
        Package docker.io is not installed.
      
      

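      Fix: install Docker. The dependency is an either-or, so any one of docker-ce, docker-ee, or docker.io satisfies it; the steps below install docker-ce.

    • 3.1 Download the following files
      [Note] I installed Docker 18.09.6. (The files for Docker 20.10.2 are listed further below.)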

      	containerd.io_1.2.6-3_amd64.deb  
      	docker-ce_18.09.6~3-0~ubuntu-bionic_amd64.deb  
      	docker-ce-cli_18.09.6~3-0~ubuntu-bionic_amd64.deb
      

      Official download page: https://download.docker.com/linux/ubuntu/dists/bionic/pool/stable/amd64/

      The links I used:

      https://download.docker.com/linux/ubuntu/dists/bionic/pool/stable/amd64/docker-ce_18.09.6~3-0~ubuntu-bionic_amd64.deb
      https://download.docker.com/linux/ubuntu/dists/bionic/pool/stable/amd64/docker-ce-cli_18.09.6~3-0~ubuntu-bionic_amd64.deb
      https://download.docker.com/linux/ubuntu/dists/bionic/pool/stable/amd64/containerd.io_1.2.6-3_amd64.deb
      

      Save the downloaded files to the folder docker_deb/
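
The download can be scripted on any machine that does have internet access, and the folder copied to the offline host afterwards. A minimal sketch, assuming `wget` is available there; the function names are hypothetical and the URLs are the three listed above:

```shell
# Print the three download URLs, one per line.
docker_deb_urls() {
    base=https://download.docker.com/linux/ubuntu/dists/bionic/pool/stable/amd64
    echo "$base/containerd.io_1.2.6-3_amd64.deb"
    echo "$base/docker-ce-cli_18.09.6~3-0~ubuntu-bionic_amd64.deb"
    echo "$base/docker-ce_18.09.6~3-0~ubuntu-bionic_amd64.deb"
}

# Download every file into the target directory (default: docker_deb).
fetch_docker_debs() {
    dest=${1:-docker_deb}
    mkdir -p "$dest"
    docker_deb_urls | while read -r url; do
        wget -P "$dest" "$url"
    done
}

# Example (needs network access):
#   fetch_docker_debs docker_deb
```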

    • 3.2 Install

      	cd docker_deb/
      	sudo dpkg -i ./*deb
      
      Near the end of the installation, dpkg shows this prompt:

      	Configuration file '/etc/docker/daemon.json'
      	 ==> File on system created by you or by a script.
      	 ==> File also in package provided by package maintainer.
      	   What would you like to do about it ?  Your options are:
      	    Y or I  : install the package maintainer's version
      	    N or O  : keep your currently-installed version
      	      D     : show the differences between the versions
      	      Z     : start a shell to examine the situation
      	 The default action is to keep your current version.
      	*** daemon.json (Y/I/N/O/D/Z) [default=N] ? 
      	
      

      Choose N (keep the currently installed daemon.json).
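
For an unattended install, dpkg's `--force-confold` option keeps existing configuration files without prompting, which should have the same effect as answering N here. A minimal sketch; `install_debs_keep_conf` and the `RUN` override are hypothetical names introduced only so the command can be previewed before it is run for real:

```shell
# Install every .deb in directory $1, keeping existing configuration files.
# RUN defaults to sudo; set RUN=echo to print the command instead.
install_debs_keep_conf() {
    ${RUN:-sudo} dpkg -i --force-confold "$1"/*.deb
}

# Preview the command without installing anything:
#   RUN=echo install_debs_keep_conf docker_deb
```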

    • 3.3 Edit the configuration file daemon.json
      First, look at the current /etc/docker/daemon.json:

      {
        "data-root": "/var/lib/docker",
        "exec-opts": ["native.cgroupdriver=systemd"],
        "insecure-registries": ["xxx"],
        "max-concurrent-downloads": 10,
        "live-restore": true,
        "log-driver": "json-file",
        "log-level": "warn",
        "log-opts": {
          "max-size": "50m",
          "max-file": "1"
          },
        "storage-driver": "overlay2"
      }
      

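      Problem: the runtimes and default-runtime keys are missing, which is why Docker reports an unknown runtime. Add the following inside the top-level object, ahead of an existing key (the fragment ends with a comma):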

        "default-runtime": "nvidia",
        "runtimes": {
          "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
          }
        },
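
To make the placement concrete, here is a sketch of the merged file, built from the daemon.json shown earlier plus the two new keys. It writes to a temporary file rather than /etc/docker/daemon.json so the result can be reviewed first; `xxx` is the placeholder registry from the original, and the python3 check is an optional assumption about what is installed:

```shell
# Write the merged configuration to a scratch file for review.
merged=$(mktemp)
cat > "$merged" <<'EOF'
{
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {
      "path": "nvidia-container-runtime",
      "runtimeArgs": []
    }
  },
  "data-root": "/var/lib/docker",
  "exec-opts": ["native.cgroupdriver=systemd"],
  "insecure-registries": ["xxx"],
  "max-concurrent-downloads": 10,
  "live-restore": true,
  "log-driver": "json-file",
  "log-level": "warn",
  "log-opts": {
    "max-size": "50m",
    "max-file": "1"
  },
  "storage-driver": "overlay2"
}
EOF

# Check the JSON syntax if python3 happens to be available.
if command -v python3 >/dev/null 2>&1; then
    python3 -m json.tool "$merged" >/dev/null && echo "valid JSON: $merged"
fi
```

Once the file looks right, back up the existing /etc/docker/daemon.json and copy the merged version over it.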
      
    • 4. Restart the Docker service
      First stop all running containers:

      docker stop $(docker ps -a -q)
      

      Then restart the Docker service:

      sudo systemctl daemon-reload
      sudo systemctl restart docker
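
After the restart, the daemon should list nvidia among its runtimes. A minimal sketch that checks this, assuming `docker info` prints a "Runtimes:" line as the Docker versions I am aware of do; `has_nvidia_runtime` is a hypothetical helper:

```shell
# Read 'docker info' output on stdin; succeed if nvidia is a listed runtime.
has_nvidia_runtime() {
    grep -E '^ *Runtimes:' | grep -qw 'nvidia'
}

# Only query the daemon when the docker client exists.
if command -v docker >/dev/null 2>&1; then
    if docker info 2>/dev/null | has_nvidia_runtime; then
        echo "nvidia runtime is registered"
    else
        echo "nvidia runtime not found: recheck daemon.json"
    fi
fi
```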
      
    • 5. Verify nvidia-docker

      Run:

      nvidia-docker -v
      

      Output:

      Docker version 18.09.6, build 481bc77
      

      The command is now found and prints the Docker version, which shows that nvidia-docker is installed.

    Testing nvidia-docker

    nvidia-docker run -it -d \
        --name your_name \
        -e TZ='Asia/Shanghai' \
        your_ai_image:latest
    

    If no error is reported, nvidia-docker is working.
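
A container that starts without error does not by itself prove that the GPU is visible inside it. A stricter check is to run nvidia-smi in a throwaway container. This is a sketch under assumptions: the nvidia runtime is registered, and the image is your own (`your_ai_image:latest` is the placeholder from the command above). `NVIDIA_VISIBLE_DEVICES` and `NVIDIA_DRIVER_CAPABILITIES` are environment variables read by the NVIDIA container runtime; the function name and the `DOCKER` override are illustrative only:

```shell
# Run nvidia-smi in a throwaway container using the nvidia runtime.
# DOCKER defaults to docker; set DOCKER=echo to print the command instead.
gpu_smoke_test() {
    ${DOCKER:-docker} run --rm --runtime=nvidia \
        -e NVIDIA_VISIBLE_DEVICES=all \
        -e NVIDIA_DRIVER_CAPABILITIES=compute,utility \
        "$1" nvidia-smi
}

# Preview:  DOCKER=echo gpu_smoke_test your_ai_image:latest
# Run:      gpu_smoke_test your_ai_image:latest
```

If the GPU is reachable, the container prints the same kind of table that nvidia-smi prints on the host.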

    Installing Docker 20.10.2

    Files to download:

    containerd.io_1.4.6-1_amd64.deb 
    docker-ce_20.10.2_3-0_ubuntu-bionic_amd64.deb  
    docker-ce-cli_20.10.2_3-0_ubuntu-bionic_amd64.deb
    

    Corresponding download links:

    https://download.docker.com/linux/ubuntu/dists/bionic/pool/stable/amd64/containerd.io_1.4.6-1_amd64.deb
    https://download.docker.com/linux/ubuntu/dists/bionic/pool/stable/amd64/docker-ce-cli_20.10.2~3-0~ubuntu-bionic_amd64.deb
    https://download.docker.com/linux/ubuntu/dists/bionic/pool/stable/amd64/docker-ce_20.10.2~3-0~ubuntu-bionic_amd64.deb
    

    The rest of the process is the same as for Docker 18.09.6.


  • Original article: https://blog.csdn.net/zengNLP/article/details/127007908