• Prometheus Installation, Deployment, and Exporter Integration


    Using the Prometheus Monitoring Platform with the Grafana Visualization Platform


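    Prometheus is an open-source monitoring and alerting system for containers and microservices, combined with a time-series database. It is a cloud-native monitoring system that offers rich metrics, high performance, and extensive customization.
    Prometheus collects monitoring data mainly by pulling (scraping) it. A component that gathers metrics and exposes them for Prometheus to pull is called an Exporter.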

    List the ports that are currently in use to avoid port conflicts:

    netstat -ntlp
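
To check only the ports this guide relies on rather than reading the whole listing, the output can be filtered. The following is a minimal sketch, assuming the usual `netstat -ntl` / `ss -ntl` column layout in which the local address is the fourth column; the helper name `port_listed` is made up for this example.

```shell
# port_listed PORT
# Reads `netstat -ntl` (or `ss -ntl`) output on stdin and exits 0 when some
# local address (4th column) ends in ":PORT", 1 otherwise.
port_listed() {
  awk -v pat=":$1\$" '$4 ~ pat { found = 1 } END { exit !found }'
}

# Report every port used in this guide that already has a listener.
for port in 9090 9100 3000; do
  if netstat -ntl 2>/dev/null | port_listed "$port"; then
    echo "port $port is already in use"
  fi
done
```

Matching on `:PORT` at the end of the column keeps 19090 from being mistaken for 9090.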
    

    Open the port in the firewall (9100 is the port node_exporter listens on later in this article):

    firewall-cmd --zone=public --add-port=9100/tcp --permanent
    firewall-cmd --reload
    

    1. Installing Prometheus

    1.1. Download

    Download page: https://prometheus.io/download/
    GitHub: https://github.com/prometheus/prometheus
    This article installs from the binary tarball: prometheus-2.37.0.linux-amd64.tar.gz
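
The download page lists a SHA256 checksum next to each tarball, so the file can be verified before it is unpacked. This is a small sketch that assumes `sha256sum` from coreutils is installed; `verify_sha256` is a helper name invented for this example, and the expected checksum is left as a placeholder to copy from the download page.

```shell
# verify_sha256 FILE EXPECTED_HEX
# Exits 0 when the SHA-256 digest of FILE equals EXPECTED_HEX.
verify_sha256() {
  actual=$(sha256sum "$1" | awk '{ print $1 }')
  [ "$actual" = "$2" ]
}

# Usage (replace the placeholder with the checksum shown on the download page):
#   verify_sha256 prometheus-2.37.0.linux-amd64.tar.gz '<sha256 from download page>' \
#     && echo "checksum OK"
```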

    1.2. Install and run

    Installation is simple: extract the archive into the deployment directory and run the binary directly.

    tar -zxvf prometheus-2.37.0.linux-amd64.tar.gz -C /usr/local
    ln -sv /usr/local/prometheus-2.37.0.linux-amd64 /usr/local/prometheus
    cd /usr/local/prometheus
    ./prometheus  # run in the foreground
    

    Then open http://192.168.28.131:9090/ in a browser (substitute your own server's address).
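
From a script, the same check can be made against the `/-/ready` management endpoint, which Prometheus answers with HTTP 200 once it can serve traffic. The sketch below assumes `curl` is installed and that no login has been configured yet; `http_status` and `wait_ready` are illustrative helper names, not part of Prometheus.

```shell
# http_status URL: print only the HTTP status code of a GET request to URL.
http_status() {
  curl -s -o /dev/null -w '%{http_code}' "$1"
}

# wait_ready BASE_URL [TRIES]
# Polls BASE_URL/-/ready about once per second. Exits 0 on the first 200,
# or 1 after TRIES attempts (default 30).
wait_ready() {
  tries=${2:-30}
  while [ "$tries" -gt 0 ]; do
    [ "$(http_status "$1/-/ready")" = "200" ] && return 0
    tries=$((tries - 1))
    sleep 1
  done
  return 1
}

# Usage:
#   wait_ready http://192.168.28.131:9090 && echo "Prometheus is ready"
```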

    1.3. Available startup flags

    List the available startup flags with the following command:

    ./prometheus -h
    
    usage: prometheus [<flags>]
    
    The Prometheus monitoring server
    
    Flags:
      -h, --help                     Show context-sensitive help (also try
                                     --help-long and --help-man).
          --version                  Show application version.
          --config.file="prometheus.yml"  
                                     Prometheus configuration file path.
          --web.listen-address="0.0.0.0:9090"  
                                     Address to listen on for UI, API, and
                                     telemetry.
          --web.config.file=""       [EXPERIMENTAL] Path to configuration file that
                                     can enable TLS or authentication.
          --web.read-timeout=5m      Maximum duration before timing out read of the
                                     request, and closing idle connections.
          --web.max-connections=512  Maximum number of simultaneous connections.
          --web.external-url=<URL>   The URL under which Prometheus is externally
                                     reachable (for example, if Prometheus is served
                                     via a reverse proxy). Used for generating
                                     relative and absolute links back to Prometheus
                                     itself. If the URL has a path portion, it will
                                     be used to prefix all HTTP endpoints served by
                                     Prometheus. If omitted, relevant URL components
                                     will be derived automatically.
          --web.route-prefix=<path>  Prefix for the internal routes of web
                                     endpoints. Defaults to path of
                                     --web.external-url.
          --web.user-assets=<path>   Path to static asset directory, available at
                                     /user.
          --web.enable-lifecycle     Enable shutdown and reload via HTTP request.
          --web.enable-admin-api     Enable API endpoints for admin control actions.
          --web.enable-remote-write-receiver  
                                     Enable API endpoint accepting remote write
                                     requests.
          --web.console.templates="consoles"  
                                     Path to the console template directory,
                                     available at /consoles.
          --web.console.libraries="console_libraries"  
                                     Path to the console library directory.
          --web.page-title="Prometheus Time Series Collection and Processing Server"  
                                     Document title of Prometheus instance.
          --web.cors.origin=".*"     Regex for CORS origin. It is fully anchored.
                                     Example: 'https?://(domain1|domain2)\.com'
          --storage.tsdb.path="data/"  
                                     Base path for metrics storage. Use with server
                                     mode only.
          --storage.tsdb.retention=STORAGE.TSDB.RETENTION  
                                     [DEPRECATED] How long to retain samples in
                                     storage. This flag has been deprecated, use
                                     "storage.tsdb.retention.time" instead. Use with
                                     server mode only.
          --storage.tsdb.retention.time=STORAGE.TSDB.RETENTION.TIME  
                                     How long to retain samples in storage. When
                                     this flag is set it overrides
                                     "storage.tsdb.retention". If neither this flag
                                     nor "storage.tsdb.retention" nor
                                     "storage.tsdb.retention.size" is set, the
                                     retention time defaults to 15d. Units
                                     Supported: y, w, d, h, m, s, ms. Use with
                                     server mode only.
          --storage.tsdb.retention.size=STORAGE.TSDB.RETENTION.SIZE  
                                     Maximum number of bytes that can be stored for
                                     blocks. A unit is required, supported units: B,
                                     KB, MB, GB, TB, PB, EB. Ex: "512MB". Based on
                                     powers-of-2, so 1KB is 1024B. Use with server
                                     mode only.
          --storage.tsdb.no-lockfile  
                                     Do not create lockfile in data directory. Use
                                     with server mode only.
          --storage.tsdb.allow-overlapping-blocks  
                                     Allow overlapping blocks, which in turn enables
                                     vertical compaction and vertical query merge.
                                     Use with server mode only.
          --storage.tsdb.head-chunks-write-queue-size=0  
                                     Size of the queue through which head chunks are
                                     written to the disk to be m-mapped, 0 disables
                                     the queue completely. Experimental. Use with
                                     server mode only.
          --storage.agent.path="data-agent/"  
                                     Base path for metrics storage. Use with agent
                                     mode only.
          --storage.agent.wal-compression  
                                     Compress the agent WAL. Use with agent mode
                                     only.
          --storage.agent.retention.min-time=STORAGE.AGENT.RETENTION.MIN-TIME  
                                     Minimum age samples may be before being
                                     considered for deletion when the WAL is
                                     truncated Use with agent mode only.
          --storage.agent.retention.max-time=STORAGE.AGENT.RETENTION.MAX-TIME  
                                     Maximum age samples may be before being
                                     forcibly deleted when the WAL is truncated Use
                                     with agent mode only.
          --storage.agent.no-lockfile  
                                     Do not create lockfile in data directory. Use
                                     with agent mode only.
          --storage.remote.flush-deadline=<duration>  
                                     How long to wait flushing sample on shutdown or
                                     config reload.
          --storage.remote.read-sample-limit=5e7  
                                     Maximum overall number of samples to return via
                                     the remote read interface, in a single query. 0
                                     means no limit. This limit is ignored for
                                     streamed response types. Use with server mode
                                     only.
          --storage.remote.read-concurrent-limit=10  
                                     Maximum number of concurrent remote read calls.
                                     0 means no limit. Use with server mode only.
          --storage.remote.read-max-bytes-in-frame=1048576  
                                     Maximum number of bytes in a single frame for
                                     streaming remote read response types before
                                     marshalling. Note that client might have limit
                                     on frame size as well. 1MB as recommended by
                                     protobuf by default. Use with server mode only.
          --rules.alert.for-outage-tolerance=1h  
                                     Max time to tolerate prometheus outage for
                                     restoring "for" state of alert. Use with server
                                     mode only.
          --rules.alert.for-grace-period=10m  
                                     Minimum duration between alert and restored
                                     "for" state. This is maintained only for alerts
                                     with configured "for" time greater than grace
                                     period. Use with server mode only.
          --rules.alert.resend-delay=1m  
                                     Minimum amount of time to wait before resending
                                     an alert to Alertmanager. Use with server mode
                                     only.
          --alertmanager.notification-queue-capacity=10000  
                                     The capacity of the queue for pending
                                     Alertmanager notifications. Use with server
                                     mode only.
          --query.lookback-delta=5m  The maximum lookback duration for retrieving
                                     metrics during expression evaluations and
                                     federation. Use with server mode only.
          --query.timeout=2m         Maximum time a query may take before being
                                     aborted. Use with server mode only.
          --query.max-concurrency=20  
                                     Maximum number of queries executed
                                     concurrently. Use with server mode only.
          --query.max-samples=50000000  
                                     Maximum number of samples a single query can
                                     load into memory. Note that queries will fail
                                     if they try to load more samples than this into
                                     memory, so this also limits the number of
                                     samples a query can return. Use with server
                                     mode only.
          --enable-feature= ...      Comma separated feature names to enable. Valid
                                     options: agent, exemplar-storage,
                                     expand-external-labels,
                                     memory-snapshot-on-shutdown,
                                     promql-at-modifier, promql-negative-offset,
                                     promql-per-step-stats, remote-write-receiver
                                     (DEPRECATED), extra-scrape-metrics,
                                     new-service-discovery-manager, auto-gomaxprocs.
                                     See
                                     https://prometheus.io/docs/prometheus/latest/feature_flags/
                                     for more details.
          --log.level=info           Only log messages with the given severity or
                                     above. One of: [debug, info, warn, error]
          --log.format=logfmt        Output format of log messages. One of: [logfmt,
                                     json]
    
    
    

    1.4. Setting a login user name and password

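    By default, the Prometheus web UI and the related Exporter components can be accessed by anyone without authentication.
    Generate a user name and password hash: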

    # Install httpd-tools
    [root]# yum install -y httpd-tools
    # Generate a bcrypt hash with htpasswd, which is part of httpd-tools
    [root]# htpasswd -nbBC 12 penngo 123456
    
    penngo:$2y$12$HBw06HgxQlm3z6I85OPH.eNqeUCbqP.w7xFnb0ch60RcK9p3ZFLea # bcrypt hash of the password 123456, used in config.yml
    

    Create the user configuration file:

    vi /usr/local/prometheus-2.37.0.linux-amd64/config.yml
    
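    # Contents of config.yml
    basic_auth_users:
    # Multiple users can be listed. Use the hash printed by htpasswd; it differs
    # on every run because the bcrypt salt is random.
      penngo: $2y$12$PoUH4HDg3hxWqqcrfWDUB.f52O/oW0J6wRP5/Epwf5k2qd0XNhFVe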
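
The file can also be generated instead of edited by hand, which avoids copy-and-paste mistakes in the hash. This is only a sketch: `write_web_config` is a hypothetical helper, and it writes a single user in the same layout as the file above.

```shell
# write_web_config FILE USER HASH
# Writes a web config with one basic-auth user. HASH is the bcrypt string
# printed by htpasswd (the part after "user:").
# The body runs in a subshell so the umask change does not leak to the caller.
write_web_config() (
  umask 077   # owner-only permissions, since the file holds password hashes
  printf 'basic_auth_users:\n  %s: %s\n' "$2" "$3" > "$1"
)

# Usage (htpasswd without -b prompts for the password instead of taking it
# from the command line):
#   hash=$(htpasswd -nBC 12 penngo | cut -d: -f2)
#   write_web_config /usr/local/prometheus/config.yml penngo "$hash"
```

Recent Prometheus releases should also be able to validate the result with `promtool check web-config /usr/local/prometheus/config.yml`; confirm with `promtool --help` that your version has this subcommand.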
    

    Add the startup flag --web.config.file=/usr/local/prometheus/config.yml so that the Prometheus web UI can only be accessed after logging in.
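
A quick way to confirm that the login is enforced is to compare the HTTP status with and without credentials. The sketch below assumes `curl` is installed and reuses the user and password from this article; `auth_status` is a helper name made up for the example.

```shell
# auth_status URL [USER:PASSWORD]
# Prints the HTTP status code for URL, sending basic-auth credentials when given.
auth_status() {
  if [ -n "${2:-}" ]; then
    curl -s -o /dev/null -w '%{http_code}' -u "$2" "$1"
  else
    curl -s -o /dev/null -w '%{http_code}' "$1"
  fi
}

# Usage, once Prometheus runs with --web.config.file:
#   auth_status http://192.168.28.131:9090/-/healthy                 # should print 401
#   auth_status http://192.168.28.131:9090/-/healthy penngo:123456   # should print 200
```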

    1.5. Running as a system service

    vi /usr/lib/systemd/system/prometheus.service

    [Unit]
    Description=Prometheus server daemon
    After=network.target
    
    [Service]
    Type=simple
    User=root
    Group=root
    # --web.enable-lifecycle allows reloading the configuration over HTTP:
    #   curl -X POST -u penngo http://127.0.0.1:9090/-/reload
    # Comments must not follow a trailing backslash, or the line continuation breaks.
    ExecStart=/usr/local/prometheus/prometheus \
        --config.file=/usr/local/prometheus/prometheus.yml \
        --web.config.file=/usr/local/prometheus/config.yml \
        --web.enable-lifecycle \
        --storage.tsdb.path=/usr/local/prometheus/data \
        --storage.tsdb.retention.time=15d \
        --web.console.templates=/usr/local/prometheus/consoles \
        --web.console.libraries=/usr/local/prometheus/console_libraries \
        --web.max-connections=512 \
        --web.external-url=http://192.168.28.131:9090 \
        --web.listen-address=0.0.0.0:9090
    Restart=on-failure
    
    [Install]
    WantedBy=multi-user.target
    

    Prometheus service commands:

    systemctl daemon-reload       # tell systemd to reload its unit files
    systemctl enable prometheus   # start at boot
    systemctl disable prometheus  # do not start at boot
    systemctl start prometheus    # start the service
    systemctl restart prometheus  # restart the service
    systemctl stop prometheus     # stop the service
    systemctl status prometheus   # show the service status
    

    2. Installing Grafana

    2.1. Download and install

    Download page: https://grafana.com/grafana/download
    Installation on Linux:
    Ubuntu and Debian (64 Bit)

    sudo apt-get install -y adduser libfontconfig1
    wget https://dl.grafana.com/enterprise/release/grafana-enterprise_8.5.11_amd64.deb
    sudo dpkg -i grafana-enterprise_8.5.11_amd64.deb
    

    Standalone Linux Binaries(64 Bit)

    wget https://dl.grafana.com/enterprise/release/grafana-enterprise-8.5.11.linux-amd64.tar.gz
    tar -zxvf grafana-enterprise-8.5.11.linux-amd64.tar.gz
    

    Red Hat, CentOS, RHEL, and Fedora(64 Bit)

    wget https://dl.grafana.com/enterprise/release/grafana-enterprise-8.5.11-1.x86_64.rpm
    sudo yum install grafana-enterprise-8.5.11-1.x86_64.rpm
    

    OpenSUSE and SUSE

    wget https://dl.grafana.com/enterprise/release/grafana-enterprise-8.5.11-1.x86_64.rpm
    sudo rpm -i --nodeps grafana-enterprise-8.5.11-1.x86_64.rpm
    

    2.2. Managing the grafana service with systemd

    systemctl daemon-reload           # tell systemd to reload its unit files
    systemctl enable grafana-server   # start at boot
    systemctl disable grafana-server  # do not start at boot
    systemctl start grafana-server    # start the service
    systemctl stop grafana-server     # stop the service
    systemctl status grafana-server   # show the service status
    
    ps -ef | grep grafana  # check that the process is running
    

    2.3. File locations

    Binary: /usr/sbin/grafana-server
    Init script: /etc/init.d/grafana-server
    Default environment file: /etc/sysconfig/grafana-server
    Default configuration file: /etc/grafana/grafana.ini
    systemd service name: grafana-server.service
    Default log file: /var/log/grafana/grafana.log
    Default SQLite3 database file: /var/lib/grafana/grafana.db
    

    2.4. Default access URL

    Open http://127.0.0.1:3000 and log in with the default user name/password: admin/admin.

    3. Common Prometheus Exporters

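    Prometheus offers two ways to integrate:
    Client libraries: https://prometheus.io/docs/instrumenting/clientlibs/
    Client libraries for various languages make it easy to instrument your own applications for Prometheus monitoring.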

    Exporters: https://prometheus.io/docs/instrumenting/exporters/
    Ready-made monitoring components that cover databases, hardware, message queues, storage, HTTP services, API services, logs, and more.

    Both kinds of integration are available from the official project as well as from the community.

    3.1. Host monitoring with node_exporter

    3.1.1. Linux host monitoring
    3.1.1.1. Download and install

    Download page: https://prometheus.io/download/
    GitHub: https://github.com/prometheus/node_exporter
    This article uses: node_exporter-1.3.1.linux-amd64.tar.gz

    # Extract into the target directory
    tar -zxvf node_exporter-1.3.1.linux-amd64.tar.gz -C /usr/local
    
    # Start
    /usr/local/node_exporter-1.3.1.linux-amd64/node_exporter --web.listen-address=":9100"
    
    3.1.1.2. Running as a system service

    Create the system service:

    vi /usr/lib/systemd/system/node_exporter.service
    
    # Contents of node_exporter.service
    
    [Unit]
    Description=node_exporter
    Wants=network-online.target
    After=network-online.target
    
    [Service]
    User=root
    Group=root
    Type=simple
    ExecStart=/usr/local/node_exporter-1.3.1.linux-amd64/node_exporter
    
    [Install]
    WantedBy=multi-user.target
    

    node_exporter service commands:

    systemctl daemon-reload          # tell systemd to reload its unit files
    systemctl enable node_exporter   # start at boot
    systemctl disable node_exporter  # do not start at boot
    systemctl start node_exporter    # start the service
    systemctl stop node_exporter     # stop the service
    systemctl status node_exporter   # show the service status
    

    View the exposed metrics locally:

    curl http://127.0.0.1:9100/metrics
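
The endpoint returns plain text with one sample per line (`name value`), plus `# HELP` and `# TYPE` comment lines, so single values are easy to pick out. The following sketch handles only metrics without labels; `metric_value` is an illustrative helper name, and `node_load1` is used as an example of a metric node_exporter normally exposes.

```shell
# metric_value NAME
# Reads Prometheus text exposition format on stdin and prints the value of
# every sample whose first field is exactly NAME (comment lines never match).
metric_value() {
  awk -v name="$1" '$1 == name { print $2 }'
}

# Usage:
#   curl -s http://127.0.0.1:9100/metrics | metric_value node_load1
```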
    
    3.1.1.3. Integrating with Prometheus

    Edit prometheus.yml:

    vi /usr/local/prometheus-2.37.0.linux-amd64/prometheus.yml
    

    prometheus.yml configuration:

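    # Global configuration
    global:
      scrape_interval: 15s # Scrape targets every 15 seconds. The default is every 1 minute.
      evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
      # scrape_timeout is left at the global default (10s).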
    
    # Alertmanager configuration
    alerting:
      alertmanagers:
        - static_configs:
            - targets:
              # - alertmanager:9093
    
    # Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
    rule_files:
      # - "first_rules.yml"
      # - "second_rules.yml"
    
    # A scrape configuration containing exactly one endpoint to scrape:
    # Here it's Prometheus itself.
    scrape_configs:
      # The job name is added as a label `job=` to any timeseries scraped from this config.
      - job_name: "prometheus"
        basic_auth:
          username: penngo
          password: 123456
        # metrics_path defaults to '/metrics'
        # scheme defaults to 'http'.
        static_configs:
          - targets: ["localhost:9090"]
      # Add the following job so that Prometheus scrapes node_exporter
      - job_name: 'node_exporter'
        static_configs:
          - targets: ['192.168.28.136:9100']
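
After editing the file, it is worth validating it before asking the running server to load it. The sketch below assumes the `promtool` binary that ships in the Prometheus tarball is on `PATH`, that `curl` is installed, and that Prometheus was started with `--web.enable-lifecycle` and the basic-auth user from section 1.4; `check_and_reload` is a helper name invented here.

```shell
# check_and_reload CONFIG BASE_URL USER:PASSWORD
# Validates CONFIG with promtool and, only if it passes, asks the running
# server to reload it. Exits non-zero if validation or the reload fails.
check_and_reload() {
  promtool check config "$1" || return 1
  code=$(curl -s -o /dev/null -w '%{http_code}' -X POST -u "$3" "$2/-/reload")
  [ "$code" = "200" ]
}

# Usage:
#   export PATH="/usr/local/prometheus:$PATH"
#   check_and_reload /usr/local/prometheus/prometheus.yml http://127.0.0.1:9090 penngo:123456
```

Without the lifecycle flag, `systemctl restart prometheus` picks up the new configuration instead.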
    
    3.1.2. Windows host monitoring
    3.1.2.1. Download

    GitHub: https://github.com/prometheus-community/windows_exporter
    This article does not cover it. If you need this integration, see the documentation on the windows_exporter project page.

    3.1.3. Viewing the data in Prometheus

    (Screenshot omitted: the monitoring data as displayed in the Prometheus web UI.)

    3.1.4. Visualizing the monitoring data in Grafana

    Use this host monitoring dashboard template: https://grafana.com/grafana/dashboards/16098-1-node-exporter-for-prometheus-dashboard-cn-0417-job/
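
The dashboard needs a Prometheus data source in Grafana. It can be added in the Grafana UI, or provisioned from a file that Grafana reads at startup. The following is a sketch that assumes the package default provisioning directory `/etc/grafana/provisioning/datasources/` and the basic-auth user from section 1.4; `write_datasource` is a hypothetical helper, and the password must not contain double quotes.

```shell
# write_datasource FILE URL USER PASSWORD
# Writes a Grafana data source provisioning file for a Prometheus server
# that is protected with HTTP basic auth.
write_datasource() {
  printf '%s\n' \
    'apiVersion: 1' \
    'datasources:' \
    '  - name: Prometheus' \
    '    type: prometheus' \
    '    access: proxy' \
    "    url: $2" \
    '    isDefault: true' \
    '    basicAuth: true' \
    "    basicAuthUser: $3" \
    '    secureJsonData:' \
    "      basicAuthPassword: \"$4\"" \
    > "$1"
}

# Usage:
#   write_datasource /etc/grafana/provisioning/datasources/prometheus.yaml \
#     http://192.168.28.131:9090 penngo 123456
#   systemctl restart grafana-server
```

Because the file contains the password in clear text, restrict its permissions so that only root and the Grafana service user can read it.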

  • Original article: https://blog.csdn.net/penngo/article/details/126912702