Project page: GitHub - utkuozdemir/nvidia_gpu_exporter: Nvidia GPU exporter for Prometheus using the nvidia-smi binary
Procedure:
Setting up the nvidia_gpu_exporter service on Kylin V10
Install nvidia_gpu_exporter
wget https://github.com/utkuozdemir/nvidia_gpu_exporter/releases/download/v0.5.0/nvidia_gpu_exporter_0.5.0_linux_arm64.tar.gz
tar -xvzf nvidia_gpu_exporter_0.5.0_linux_arm64.tar.gz
mv nvidia_gpu_exporter /usr/local/bin
nvidia_gpu_exporter
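The download command above is hard-coded for arm64. On a host with a different CPU the archive name changes, so the URL can be built from the version and the output of uname -m. This is only a minimal sketch: the helper name release_url is hypothetical, and the x86_64 asset suffix is an assumption that should be checked against the project's releases page before use.

```shell
#!/bin/sh
# Hypothetical helper: print the release archive URL for a given version.
# Usage: release_url VERSION [MACHINE]   (MACHINE defaults to `uname -m`)
release_url() {
    version="$1"
    machine="${2:-$(uname -m)}"
    case "$machine" in
        aarch64|arm64) arch="arm64" ;;    # suffix taken from the URL used above
        x86_64|amd64)  arch="x86_64" ;;   # assumed suffix, verify on the releases page
        *) echo "unsupported architecture: $machine" >&2; return 1 ;;
    esac
    echo "https://github.com/utkuozdemir/nvidia_gpu_exporter/releases/download/v${version}/nvidia_gpu_exporter_${version}_linux_${arch}.tar.gz"
}

# Example (not executed here):
#   wget "$(release_url 0.5.0)"
```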
Once the exporter is running, the GPU metrics of this GPU server can be viewed in a web browser, as shown in the figure below.
URL: http://IP:9835/metrics
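The same endpoint can also be checked from the command line. A minimal sketch that counts the GPU samples in the Prometheus text output; the function name count_gpu_samples is hypothetical, and the default metric prefix nvidia_smi_ is an assumption about the exporter's metric names, so pass a different prefix if the output shows otherwise.

```shell
#!/bin/sh
# Hypothetical helper: read Prometheus text format on stdin and print the
# number of sample lines whose metric name starts with the given prefix.
# Comment lines (# HELP / # TYPE) are skipped.
count_gpu_samples() {
    prefix="${1:-nvidia_smi_}"   # assumed default prefix
    grep -v '^#' | grep -c "^${prefix}"
}

# Example (not executed here; replace IP with the server address):
#   curl -s http://IP:9835/metrics | count_gpu_samples
```

A result of 0 suggests the exporter is reachable but is not returning GPU data, for example because nvidia-smi cannot be found or fails on that host.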
Create the nvidia_gpu_exporter service
# vim /etc/systemd/system/nvidia_gpu_exporter.service
[Unit]
Description=Nvidia GPU Exporter
After=network-online.target

[Service]
Type=simple

User=root

ExecStart=/usr/local/bin/nvidia_gpu_exporter

SyslogIdentifier=nvidia_gpu_exporter

Restart=always
RestartSec=1

NoNewPrivileges=yes

ProtectHome=yes
ProtectSystem=strict
ProtectControlGroups=true
ProtectKernelModules=true
ProtectKernelTunables=yes
ProtectHostname=yes
ProtectKernelLogs=yes
ProtectProc=yes

[Install]
WantedBy=multi-user.target
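A frequent cause of a failed start is an ExecStart path that does not match where the binary was actually moved. A minimal sketch for checking this before enabling the service; the function name unit_exec_path is hypothetical, and it only handles a simple single-line ExecStart entry.

```shell
#!/bin/sh
# Hypothetical helper: print the command path of the first ExecStart= line
# in a unit file, ignoring systemd's optional prefix characters (-, @, :, +, !).
unit_exec_path() {
    sed -n 's/^ExecStart=[-@:+!]*\([^ ]*\).*/\1/p' "$1" | head -n 1
}

# Example (not executed here):
#   bin=$(unit_exec_path /etc/systemd/system/nvidia_gpu_exporter.service)
#   [ -x "$bin" ] && echo "binary found: $bin" || echo "missing or not executable: $bin"
```

Where available, systemd-analyze verify can additionally be run against the unit file to report directives that systemd does not accept.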
Enable and start the service
# systemctl daemon-reload
# systemctl enable nvidia_gpu_exporter
# systemctl start nvidia_gpu_exporter.service
# systemctl status nvidia_gpu_exporter.service
Once the service has started successfully, check the metrics page again.
URL: http://IP:9835/metrics
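Right after systemctl start, the endpoint may take a moment to answer. A minimal sketch of a generic retry loop that can wrap the check; the function name retry is hypothetical, and the attempt count and delay in the example are arbitrary values, not recommendations from the project.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to ATTEMPTS times, sleeping DELAY
# seconds after each failed attempt. Returns 0 on the first success,
# 1 if every attempt fails.
# Usage: retry ATTEMPTS DELAY COMMAND [ARGS...]
retry() {
    attempts="$1"
    delay="$2"
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        "$@" && return 0
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Example (not executed here; replace IP with the server address):
#   retry 10 1 curl -sf -o /dev/null http://IP:9835/metrics && echo "exporter is up"
```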