Deployment preparation:
docker-compose.yml
File contents:
version: '2'
services:
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    hostname: prometheus
    restart: always
    ports:
      - '9090:9090'
    volumes:
      - '/home/app/prometheus/config:/config'
      - '/home/app/prometheus/data/prometheus:/prometheus/data'
    command:
      - '--config.file=/config/prometheus.yml'
      - '--web.enable-lifecycle'
  alertmanager:
    image: prom/alertmanager:latest
    # sic: the container is named "altermanager" in this deployment; the logs further down show that name
    container_name: altermanager
    hostname: altermanager
    restart: always
    ports:
      - '9093:9093'
    volumes:
      - '/home/app/prometheus/config:/config'
      - '/home/app/prometheus/data/alertmanager:/alertmanager/data'
    command:
      - '--config.file=/config/alertmanager.yml'
  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    hostname: grafana
    restart: always
    ports:
      - '3000:3000'
    volumes:
      - '/home/app/grafana/config/grafana.ini:/etc/grafana/grafana.ini'
      - '/home/app/grafana/data:/var/lib/grafana'
  node-exporter:
    image: quay.io/prometheus/node-exporter
    container_name: node-exporter
    hostname: node-exporter
    restart: always
    ports:
      - "9100:9100"
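This deployment runs on a local virtual machine, so node-exporter is deployed on the same host as the rest of the stack. In a production environment, node-exporter must be deployed directly on each server that needs to be monitored.
Create the mount directories: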
mkdir -p /home/app/prometheus/config/rules
mkdir -p /home/app/prometheus/data
mkdir -p /home/app/grafana/config
mkdir -p /home/app/grafana/data
Create alertmanager.yml and prometheus.yml in the Prometheus config directory and rules.yml in its rules subdirectory; create grafana.ini in the Grafana config directory.
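The file-creation step above can be sketched as a small shell helper. This is only a minimal sketch: the function name create_config_files is hypothetical, and it assumes the same directory layout as the mkdir commands above, with the base directory passed as an argument.

```shell
#!/bin/sh
# create_config_files is a hypothetical helper name, not part of any tool.
# $1 = base directory (the walkthrough uses /home/app)
create_config_files() {
    base="$1"
    # Make sure the config directories exist before creating files in them.
    mkdir -p "$base/prometheus/config/rules" "$base/grafana/config"
    # Create empty placeholders; their contents are filled in afterwards.
    touch "$base/prometheus/config/prometheus.yml" \
          "$base/prometheus/config/alertmanager.yml" \
          "$base/prometheus/config/rules/rules.yml" \
          "$base/grafana/config/grafana.ini"
}
```

For the layout used here it would be called as: create_config_files /home/app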
prometheus.yml
File contents:
global:
  # Scrape application metrics every 15 seconds.
  scrape_interval: 15s # By default, scrape targets every 15 seconds.
  #
  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
    monitor: 'line-monitor'
# Prometheus rule files
rule_files:
  - "rules/*.yml"
# Prometheus fires alerts by evaluating the configured rule files, but it cannot send e-mail itself; sending notifications is Alertmanager's job.
alerting:
  alertmanagers:
    - scheme: http
      static_configs:
        - targets:
            - "172.16.37.100:9093"
# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ['172.16.37.100:9090']
  # MySQL monitoring, added later
  - job_name: 'uat-mysql'
    static_configs:
      - targets: ['172.16.37.100:9104']
  # Host monitoring, added later
  - job_name: 'node-monitor'
    static_configs:
      - targets: ['172.16.37.100:9100'] # in production, this must be the IP of the server running node-exporter
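Because the compose file starts Prometheus with --web.enable-lifecycle, later edits to prometheus.yml can be applied without restarting the container by sending a POST to the /-/reload endpoint. A minimal sketch follows; the helper names prometheus_url and reload_prometheus are hypothetical, and curl is assumed to be installed on the host.

```shell
#!/bin/sh
# Both function names below are hypothetical helpers for illustration.

# Build a Prometheus URL. $1 = host:port, $2 = path
prometheus_url() {
    printf 'http://%s%s\n' "$1" "$2"
}

# Ask Prometheus to re-read its configuration file.
# This only works when Prometheus runs with --web.enable-lifecycle.
reload_prometheus() {
    curl -fsS -X POST "$(prometheus_url "$1" /-/reload)"
}
```

With the address used in this walkthrough, the call would be: reload_prometheus 172.16.37.100:9090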
alertmanager.yml
File contents:
global:
  resolve_timeout: 1m
  # The smarthost and SMTP sender used for mail notifications.
  # smtp_smarthost: ''
  # smtp_from: ''
  # smtp_auth_username: ''
  # smtp_auth_password: ''

route:
  receiver: 'default-receiver'
  # The labels by which incoming alerts are grouped together. For example,
  # multiple alerts coming in for cluster=A and alertname=LatencyHigh would
  # be batched into a single group.
  # group_by: ['alertname']
  #
  # When a new group of alerts is created by an incoming alert, wait at
  # least 'group_wait' to send the initial notification.
  # This way ensures that you get multiple alerts for the same group that start
  # firing shortly after another are batched together on the first
  # notification.
  # group_wait: 5s
  # When the first notification was sent, wait 'group_interval' to send a batch
  # of new alerts that started firing for that group.
  # group_interval: 30s
  # If an alert has successfully been sent, wait 'repeat_interval' to
  # resend them.
  repeat_interval: 1m
receivers:
  - name: 'default-receiver'
rules.yml
File contents:
# Alert rule: fire when the demo metric is above 140 and stays above it for 3 minutes.
groups:
  - name: demo-alert
    rules:
      - alert: demo-monitor
        expr: demo > 140
        for: 3m
        labels:
          bp: hypertension
        annotations:
          company: "demo"
          desc: "current value:{{ $value }}"
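Before starting the stack, the rules file can be checked with promtool, which ships inside the prom/prometheus image. The sketch below is one possible way to run it through Docker; the function name check_rules and the DRY_RUN switch are hypothetical conveniences, and the promtool path /bin/promtool is assumed from the official image layout.

```shell
#!/bin/sh
# check_rules and DRY_RUN are hypothetical names used for illustration.
# $1 = host directory that is mounted as /config (e.g. /home/app/prometheus/config)
check_rules() {
    config_dir="$1"
    # Assemble the docker command that runs "promtool check rules" on rules.yml.
    set -- docker run --rm --entrypoint /bin/promtool \
        -v "$config_dir:/config:ro" \
        prom/prometheus:latest check rules /config/rules/rules.yml
    if [ "${DRY_RUN:-0}" = 1 ]; then
        # Only print the command instead of running it.
        echo "$*"
    else
        "$@"
    fi
}
```

Running DRY_RUN=1 check_rules /home/app/prometheus/config prints the command without executing it, which is handy for reviewing the mount path first.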
grafana.ini
File contents:
# Mail server configuration
[smtp]
enabled = true
# Outgoing mail server
host = smtp.exmail.qq.com:465
# SMTP account
user = example@qq.com
# SMTP password
password = ******
# Sender address
from_address = gsk-portal@pharmeyes.com
# Sender name
from_name = Grafana
Grant permissions on the directories so that the containers do not fail to start because of permission problems:
chmod -R 777 /home/app
Preparation is now complete; start the deployment:
cd /home/app/prometheus
[root@centos1810-100 prometheus]# docker-compose up -d
Creating network "prometheus_default" with the default driver
Pulling grafana (grafana/grafana:latest)...
latest: Pulling from grafana/grafana
ab6db1bc80d0: Pull complete
f3a8945791a4: Pull complete
6340cc8f982e: Pull complete
e96b26b29a27: Pull complete
e7c17007278e: Pull complete
975497b4745b: Pull complete
5faa5d00171a: Pull complete
e43637e667d7: Pull complete
c804c40bda18: Pull complete
Digest: sha256:f19ce6baedc93dfe44dd4a4dcc84d491d7ae6be2f2c6c7158bce7700c9e9de08
Status: Downloaded newer image for grafana/grafana:latest
Creating node-exporter ...
Creating grafana ...
Creating node-exporter
Creating altermanager ...
Creating prometheus ...
Creating grafana
Creating altermanager
Creating altermanager ... done
[root@centos1810-100 prometheus]# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
7324ee026f92 prom/prometheus:latest "/bin/prometheus --c…" 26 seconds ago Restarting (2) 1 second ago prometheus
0047ba346e44 prom/alertmanager:latest "/bin/alertmanager -…" 26 seconds ago Up 3 seconds 0.0.0.0:9093->9093/tcp, :::9093->9093/tcp altermanager
cb173e817088 grafana/grafana:latest "/run.sh" 26 seconds ago Up 6 seconds 0.0.0.0:3000->3000/tcp, :::3000->3000/tcp grafana
3b2413261721 quay.io/prometheus/node-exporter "/bin/node_exporter" 26 seconds ago Up 3 seconds 0.0.0.0:9100->9100/tcp, :::9100->9100/tcp node-exporter
[root@centos1810-100 prometheus]# chmod -R 777 /home/app
[root@centos1810-100 prometheus]# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
7324ee026f92 prom/prometheus:latest "/bin/prometheus --c…" 38 seconds ago Up 3 seconds 0.0.0.0:9090->9090/tcp, :::9090->9090/tcp prometheus
0047ba346e44 prom/alertmanager:latest "/bin/alertmanager -…" 38 seconds ago Up 14 seconds 0.0.0.0:9093->9093/tcp, :::9093->9093/tcp altermanager
cb173e817088 grafana/grafana:latest "/run.sh" 38 seconds ago Up 18 seconds 0.0.0.0:3000->3000/tcp, :::3000->3000/tcp grafana
3b2413261721 quay.io/prometheus/node-exporter "/bin/node_exporter" 38 seconds ago Up 15 seconds 0.0.0.0:9100->9100/tcp, :::9100->9100/tcp node-exporter
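Beyond docker ps, each service can be probed over HTTP to confirm that it is actually answering. The sketch below assumes the ports from the compose file and uses the health endpoints that Prometheus and Alertmanager (/-/healthy) and Grafana (/api/health) expose, plus the node-exporter /metrics page; the function names stack_health_urls and check_stack are hypothetical, and curl is assumed to be available.

```shell
#!/bin/sh
# stack_health_urls and check_stack are hypothetical helper names.

# Print one probe URL per service. $1 = host name or IP
stack_health_urls() {
    host="$1"
    printf 'http://%s:9090/-/healthy\n' "$host"   # Prometheus
    printf 'http://%s:9093/-/healthy\n' "$host"   # Alertmanager
    printf 'http://%s:3000/api/health\n' "$host"  # Grafana
    printf 'http://%s:9100/metrics\n' "$host"     # node-exporter
}

# Probe every URL and report OK or FAIL for each one.
check_stack() {
    stack_health_urls "$1" | while read -r url; do
        if curl -fsS -o /dev/null --max-time 5 "$url"; then
            echo "OK   $url"
        else
            echo "FAIL $url"
        fi
    done
}
```

For the host in this walkthrough: check_stack 172.16.37.100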
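Open ip:3000 in a browser to reach Grafana; the default username and password are admin/admin:
Add a data source:
Select Prometheus:
Enter the Prometheus IP and save:
Import a Grafana dashboard:
Official Grafana dashboards site
Here I downloaded the dashboard locally and then uploaded it.
After uploading, select the corresponding data source and save:
After saving, view the monitoring results:
The deployment is now complete.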