Prometheus Study Notes 2: Data Types and PromQL

Prometheus Queries

The data is exposed on port 9100 at http://192.168.1.11:9100/metrics. Querying one of the exposed metric names in Prometheus yields a simple graph.

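Prometheus Data Model

Prometheus stores the data it scrapes in its TSDB (time series database). The data model is summarized below.

Summary

The metric name is the name of the indicator being collected.
Labels further identify a metric and are the main fields used to filter data in queries.
The metric name plus its label set uniquely identifies a time series.
Each timestamp carries the value recorded at that point in time.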
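To make the "metric name plus labels identifies a series" point concrete, here is a minimal sketch in Go. The seriesKey helper and its output format are illustrative assumptions, not how Prometheus stores series internally.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// seriesKey builds a canonical identifier from a metric name and its labels.
// Labels are sorted by name so that map iteration order does not matter.
func seriesKey(name string, labels map[string]string) string {
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	pairs := make([]string, 0, len(keys))
	for _, k := range keys {
		pairs = append(pairs, fmt.Sprintf("%s=%q", k, labels[k]))
	}
	return name + "{" + strings.Join(pairs, ",") + "}"
}

func main() {
	// Same metric name, different label values: two distinct time series.
	a := seriesKey("http_requests_total", map[string]string{"method": "GET", "code": "200"})
	b := seriesKey("http_requests_total", map[string]string{"method": "POST", "code": "200"})
	fmt.Println(a)
	fmt.Println(b)
	fmt.Println(a == b) // false
}
```

Changing any label value produces a different key, which is why each label combination is stored and queried as its own series.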

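Metric Data Types

Prometheus scrapes metrics from node exporter. There are four metric types:

Counter: records values that only ever increase, such as the total number of HTTP requests. A counter resets to 0 when the process exposing it (here, node exporter) restarts.
Gauge: records values that can go up or down, such as memory usage, queue length, in-flight requests, or current CPU usage.
Histogram: records the distribution of observations, such as how HTTP response times fall into different duration ranges. The ranges, called buckets, must be defined in advance.
Summary: similar to a Histogram, but quantiles are computed on the client side, which can make them more precise for a single instance.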

Counter

Total number of HTTP requests

totalRequests := prometheus.NewCounter(prometheus.CounterOpts{
	Name: "http_requests_total",
	Help: "The total number of handled HTTP requests.",
})

Writing data

totalRequests.Inc()   // Increment by 1.
totalRequests.Add(23) // Increment by 23; a counter cannot be decreased.

Scraped output

# HELP http_requests_total The total number of handled HTTP requests.
# TYPE http_requests_total counter
http_requests_total 7734

Gauge

Queue length

queueLength := prometheus.NewGauge(prometheus.GaugeOpts{
	Name: "queue_length",
	Help: "The number of items in the queue.",
})

Writing data

// Use Set() when you know the absolute value from some other source.
queueLength.Set(0)

// Use these methods when your code directly observes the increase or decrease of something, such as adding an item to a queue.
queueLength.Inc() // Increment by 1.
queueLength.Dec() // Decrement by 1.
queueLength.Add(23)
queueLength.Sub(42)

Scraped output

# HELP queue_length The number of items in the queue.
# TYPE queue_length gauge
queue_length 42

Histogram

Histogram of HTTP request processing durations

requestDurations := prometheus.NewHistogram(prometheus.HistogramOpts{
  Name:    "http_request_duration_seconds",
  Help:    "A histogram of the HTTP request durations in seconds.",
  // Bucket configuration: the first bucket includes all requests finishing in 0.05 seconds, the last one includes all requests finishing in 10 seconds.
  Buckets: []float64{0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10},
})

Writing data

requestDurations.Observe(0.42)

Scraped output

# HELP http_request_duration_seconds A histogram of the HTTP request durations in seconds.
# TYPE http_request_duration_seconds histogram
http_request_duration_seconds_bucket{le="0.05"} 4599
http_request_duration_seconds_bucket{le="0.1"} 24128
http_request_duration_seconds_bucket{le="0.25"} 45311
http_request_duration_seconds_bucket{le="0.5"} 59983
http_request_duration_seconds_bucket{le="1"} 60345
http_request_duration_seconds_bucket{le="2.5"} 114003
http_request_duration_seconds_bucket{le="5"} 201325
http_request_duration_seconds_bucket{le="+Inf"} 227420
http_request_duration_seconds_sum 88364.234
http_request_duration_seconds_count 227420
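The bucket counts above are cumulative: each le bucket counts every observation less than or equal to its bound. The following is a minimal sketch of that bookkeeping. The histogram type here is a toy model written for illustration, not the client library's implementation.

```go
package main

import "fmt"

// histogram is a toy cumulative histogram, not the client library's type.
type histogram struct {
	bounds []float64 // upper bounds (le), ascending; +Inf is implicit
	counts []uint64  // cumulative counts, one per bound plus one for +Inf
	sum    float64   // sum of all observed values
}

func newHistogram(bounds []float64) *histogram {
	return &histogram{bounds: bounds, counts: make([]uint64, len(bounds)+1)}
}

// observe increments every bucket whose upper bound is >= v, plus +Inf.
func (h *histogram) observe(v float64) {
	for i, b := range h.bounds {
		if v <= b {
			h.counts[i]++
		}
	}
	h.counts[len(h.bounds)]++ // the +Inf bucket counts every observation
	h.sum += v
}

func main() {
	h := newHistogram([]float64{0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10})
	for _, v := range []float64{0.03, 0.42, 0.42, 7} {
		h.observe(v)
	}
	for i, b := range h.bounds {
		fmt.Printf("bucket{le=\"%g\"} %d\n", b, h.counts[i])
	}
	fmt.Printf("bucket{le=\"+Inf\"} %d\n", h.counts[len(h.bounds)])
	fmt.Printf("sum %g\ncount %d\n", h.sum, h.counts[len(h.bounds)])
}
```

Because the counts are cumulative, the +Inf bucket always equals the _count series.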

Summary

Distribution of HTTP request processing durations

requestDurations := prometheus.NewSummary(prometheus.SummaryOpts{
    Name:       "http_request_duration_seconds",
    Help:       "A summary of the HTTP request durations in seconds.",
    Objectives: map[float64]float64{
      0.5: 0.05,   // 50th percentile with a max. absolute error of 0.05.
      0.9: 0.01,   // 90th percentile with a max. absolute error of 0.01.
      0.99: 0.001, // 99th percentile with a max. absolute error of 0.001.
    },
  },
)

Writing data

requestDurations.Observe(0.42)

Scraped output

# HELP http_request_duration_seconds A summary of the HTTP request durations in seconds.
# TYPE http_request_duration_seconds summary
http_request_duration_seconds{quantile="0.5"} 0.052
http_request_duration_seconds{quantile="0.90"} 0.564
http_request_duration_seconds{quantile="0.99"} 2.372
http_request_duration_seconds_sum 88364.234
http_request_duration_seconds_count 227420

PromQL

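Expression data types

- Instant vector: a set of time series, each with a single sample at the same timestamp
- Range vector: a set of time series, each with the samples from a time range
- Scalar: a floating-point number
- String: a string value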

1. Instant vector

Query

node_memory_MemTotal_bytes

2. Range vector

Query

go_threads[1m]

Instant vectors and range vectors

3. Time series queries

Instant vector query

http_requests_total

Range vector query

http_requests_total{job="prometheus"}[5m]

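3. Label matching

Regular expressions use RE2 syntax and are fully anchored; an online regex tester helps when checking a pattern.

=   exact match
!=  not equal
=~  regex match
!~  regex does not match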

Examples

http_requests_total{environment=~"staging|testing|development",method!="GET"}
up{ instance=~"192.168.1.12:9100"}

4. Time selection

Range

http_requests_total{job="prometheus"}[5m]
up{ instance=~"192.168.1.12:9100"}[5m]

ms - milliseconds
s - seconds
m - minutes
h - hours
d - days - assuming a day has always 24h
w - weeks - assuming a week has always 7d
y - years - assuming a year has always 365d

Time offset

http_requests_total offset 5m
http_requests_total[5m] offset 1w

Specifying an exact time

http_requests_total @ 1609746000                #2021-01-04T07:40:00+00:00
rate(http_requests_total[5m] @ 1609746000)

An online timestamp converter helps translate between dates and Unix timestamps.
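The value after @ is a Unix timestamp in seconds. If no online converter is at hand, a few lines of Go do the same job. The toUTC helper is just an illustrative name.

```go
package main

import (
	"fmt"
	"time"
)

// toUTC renders a Unix timestamp in seconds as an RFC 3339 string in UTC.
func toUTC(sec int64) string {
	return time.Unix(sec, 0).UTC().Format(time.RFC3339)
}

func main() {
	fmt.Println(toUTC(1609746000)) // 2021-01-04T07:40:00Z
}
```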

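5. Operators

Binary arithmetic operators

+ - * / %  ^

Result types of binary arithmetic

scalar OP scalar => scalar
instant vector OP scalar => instant vector with the metric name dropped
instant vector OP instant vector => instant vector with the metric name dropped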

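Comparison operators

== equal   != not equal   > greater than   < less than   >= greater than or equal   <= less than or equal

scalar OP scalar => returns 0 or 1; the bool modifier is required

1 > bool -1

vector OP scalar => keeps only the samples that satisfy the condition, which works as a filter
vector OP vector => samples on the left are matched against the right; samples with no matching label set or a false comparison are dropped, and the left-hand samples that satisfy the condition are kept

Logical operators

- vector1 and vector2: intersection; the samples of vector1 whose label sets also appear in vector2
- vector1 or vector2: union; all of vector1 plus the samples of vector2 whose label sets are not in vector1
- vector1 unless vector2: complement; the samples of vector1 whose label sets do not appear in vector2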

6. Vector matching

One-to-one

Syntax

  <vector expr> <bin-op> ignoring(<label list>) <vector expr>
  <vector expr> <bin-op> on(<label list>) <vector expr>

Data

method_code:http_errors:rate5m{method="get", code="500"}  24
method_code:http_errors:rate5m{method="get", code="404"}  30
method_code:http_errors:rate5m{method="put", code="501"}  3
method_code:http_errors:rate5m{method="post", code="500"} 6
method_code:http_errors:rate5m{method="post", code="404"} 21
method:http_requests:rate5m{method="get"}  600
method:http_requests:rate5m{method="del"}  34
method:http_requests:rate5m{method="post"} 120

Query

method_code:http_errors:rate5m{code="500"} / ignoring(code) method:http_requests:rate5m
# Matching ignores the code label, so each code="500" series is paired with the request series that has the same method.

Result

{method="get"}  0.04            //  24 / 600
{method="post"} 0.05            //   6 / 120
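The pairing behind this result can be sketched in Go. This is a simplified model with hypothetical names (sample, signature, divideIgnoring): it builds a matching key from all labels except the ignored ones and divides the paired values. Unlike PromQL, it does not raise an error when several series share one key.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// sample is one element of an instant vector; the metric name is omitted.
type sample struct {
	labels map[string]string
	value  float64
}

// signature joins all label pairs except the ignored ones into a matching key.
func signature(labels map[string]string, ignored ...string) string {
	skip := map[string]bool{}
	for _, l := range ignored {
		skip[l] = true
	}
	var parts []string
	for k, v := range labels {
		if !skip[k] {
			parts = append(parts, k+"="+v)
		}
	}
	sort.Strings(parts)
	return strings.Join(parts, ",")
}

// divideIgnoring pairs each left sample with the right sample that has the
// same signature and divides the values; unmatched samples are dropped.
func divideIgnoring(left, right []sample, ignored ...string) map[string]float64 {
	rhs := map[string]float64{}
	for _, s := range right {
		rhs[signature(s.labels, ignored...)] = s.value
	}
	out := map[string]float64{}
	for _, s := range left {
		key := signature(s.labels, ignored...)
		if denom, ok := rhs[key]; ok {
			out[key] = s.value / denom
		}
	}
	return out
}

func main() {
	errors500 := []sample{
		{map[string]string{"method": "get", "code": "500"}, 24},
		{map[string]string{"method": "post", "code": "500"}, 6},
	}
	requests := []sample{
		{map[string]string{"method": "get"}, 600},
		{map[string]string{"method": "del"}, 34},
		{map[string]string{"method": "post"}, 120},
	}
	fmt.Println(divideIgnoring(errors500, requests, "code"))
}
```

The del series has no partner on the left and put has no code="500" series, so neither appears in the result.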

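Many-to-one and one-to-many

Syntax

  <vector expr> <bin-op> ignoring(<label list>) group_left(<label list>) <vector expr>
  <vector expr> <bin-op> ignoring(<label list>) group_right(<label list>) <vector expr>
  <vector expr> <bin-op> on(<label list>) group_left(<label list>) <vector expr>
  <vector expr> <bin-op> on(<label list>) group_right(<label list>) <vector expr>

group_left means the left-hand side is the "many" side, so the result keeps the left-hand label sets; group_right is the mirror image.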

Query

method_code:http_errors:rate5m / ignoring(code) group_left method:http_requests:rate5m

Result

{method="get", code="500"}  0.04            //  24 / 600
{method="get", code="404"}  0.05            //  30 / 600
{method="post", code="500"} 0.05            //   6 / 120
{method="post", code="404"} 0.175           //  21 / 120
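A small Go sketch of the same many-to-one division may help. The errSeries type and ratioGroupLeft function are hypothetical names for this note: several left-hand series with the same method share one right-hand series, and the result keeps the left-hand labels.

```go
package main

import "fmt"

// errSeries is a toy stand-in for one left-hand (error rate) series.
type errSeries struct {
	method, code string
	value        float64
}

// ratioGroupLeft divides every error series by the request rate of its method.
// Matching ignores code, and several left-hand series may share one
// right-hand series, which is what group_left permits.
func ratioGroupLeft(errs []errSeries, requests map[string]float64) map[string]float64 {
	out := map[string]float64{}
	for _, e := range errs {
		total, ok := requests[e.method]
		if !ok {
			continue // no request series for this method, e.g. put
		}
		key := fmt.Sprintf("{method=%q, code=%q}", e.method, e.code)
		out[key] = e.value / total
	}
	return out
}

func main() {
	errs := []errSeries{
		{"get", "500", 24}, {"get", "404", 30}, {"put", "501", 3},
		{"post", "500", 6}, {"post", "404", 21},
	}
	requests := map[string]float64{"get": 600, "del": 34, "post": 120}
	for k, v := range ratioGroupLeft(errs, requests) {
		fmt.Println(k, v) // map iteration order is not fixed
	}
}
```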

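7. Using label grouping

group_left or group_right

These modifiers declare which operand has the higher cardinality; the result keeps the label sets of that side.

- ignoring(code): ignore the code label when matching
- on(label list): match only on the labels in the list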

Share of time each CPU spends in each mode

sum by (instance,cpu) (node_cpu_seconds_total)

node_cpu_seconds_total / on (instance,cpu) group_left sum by (instance,cpu) (node_cpu_seconds_total)

Share of total CPU time spent in each mode

sum by (instance,mode) (node_cpu_seconds_total) / ignoring (mode) group_left  sum by (instance) (node_cpu_seconds_total)

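8. Aggregation operators

List of aggregation operators

- sum (calculate the sum over dimensions)
- min (select the minimum over dimensions)
- max (select the maximum over dimensions)
- avg (calculate the average over dimensions)
- group (all values in the resulting vector are 1)
- stddev (calculate the population standard deviation over dimensions)
- stdvar (calculate the population standard variance over dimensions)
- count (count the number of elements in the vector)
- count_values (count the number of elements with the same value)
- bottomk (the smallest k elements by sample value)
- topk (the largest k elements by sample value)
- quantile (calculate the φ-quantile (0 ≤ φ ≤ 1) over dimensions)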

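General syntax

<aggr-op> [without|by (<label list>)] ([parameter,] <vector expression>)

The parameter is required only for count_values, quantile, topk, and bottomk. The grouping clause may also follow the expression:

<aggr-op>([parameter,] <vector expression>) [without|by (<label list>)]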
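To show what a by clause does, here is a minimal Go sketch of sum by (instance). The cpuSample type, the sumByInstance function, and the sample values are invented for illustration.

```go
package main

import "fmt"

// cpuSample is a toy stand-in for one node_cpu_seconds_total series.
type cpuSample struct {
	instance, cpu, mode string
	seconds             float64
}

// sumByInstance mimics sum by (instance) (...): samples are grouped on the
// instance label, all other labels are dropped, and the values are added up.
func sumByInstance(samples []cpuSample) map[string]float64 {
	out := map[string]float64{}
	for _, s := range samples {
		out[s.instance] += s.seconds
	}
	return out
}

func main() {
	// Made-up values, for illustration only.
	samples := []cpuSample{
		{"192.168.1.11:9100", "0", "idle", 900},
		{"192.168.1.11:9100", "0", "user", 100},
		{"192.168.1.12:9100", "0", "idle", 500},
	}
	fmt.Println(sumByInstance(samples))
}
```

A without clause works the other way around: it names the labels to drop and groups on everything else.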

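9. Function reference

Value adjustment

abs(v instant-vector): returns the absolute value of every sample.
ceil(v instant-vector): rounds every sample value up to the nearest integer.
clamp_max(v instant-vector, max scalar): sample values greater than max are set to max; other values are unchanged.
clamp_min(v instant-vector, min scalar): sample values less than min are set to min; other values are unchanged.
round(v instant-vector, to_nearest=1 scalar): like ceil and floor, but rounds every sample value to the nearest integer; the optional to_nearest argument sets the multiple to round to.
scalar(v instant-vector): given a single-element instant vector, returns the value of that one series as a scalar; any other input yields NaN.
sqrt(v instant-vector): calculates the square root of every element in v.
vector(s scalar): returns the scalar s as a vector with no labels, that is, {} s.
floor(v instant-vector): the opposite of ceil(); rounds every sample value in v down to the nearest integer.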

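Time-related

day_of_month(v=vector(time()) instant-vector): returns the day of the month for each given time in UTC. Range: 1~31.
day_of_week(v=vector(time()) instant-vector): returns the day of the week for each given time in UTC. Range: 0~6, where 0 means Sunday.
days_in_month(v=vector(time()) instant-vector): returns the number of days in the month for each given time in UTC. Range: 28~31.
hour(v=vector(time()) instant-vector): returns the hour of the day for each given time in UTC. Range: 0~23.
minute(v=vector(time()) instant-vector): returns the minute of the hour for each given time in UTC. Range: 0~59.
month(v=vector(time()) instant-vector): returns the month of the year for each given time in UTC. Range: 1~12.
time(): returns the number of seconds since 1970-01-01 UTC.
timestamp(v instant-vector): returns the timestamp of each sample in v as the number of seconds since 1970-01-01 UTC.
year(v=vector(time()) instant-vector): returns the year for each given time in UTC.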

Sorting

sort(v instant-vector): returns the vector elements sorted by sample value in ascending order.
sort_desc(v instant-vector): returns the vector elements sorted by sample value in descending order.

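Other

absent(v instant-vector): returns an empty vector if v has any elements and a one-element vector with the value 1 if it has none; useful for alerting when no time series exists for a metric.
absent_over_time(v range-vector): the same check applied to a range vector.
changes(v range-vector): the number of times the value changed within the range.
resets(v range-vector): takes a range vector and returns, for each time series, the number of counter resets. Any decrease between two consecutive samples is interpreted as a counter reset.
delta(v range-vector): takes a range vector and returns an instant vector with the difference between the first and last value of each series in v. Because the result is extrapolated to cover the full time range, it can be a non-integer even if all sample values are integers.

histogram_quantile(φ float, b instant-vector): calculates the φ-quantile (0 ≤ φ ≤ 1) from the buckets b of a histogram. (See the Prometheus documentation on histograms and summaries for a detailed explanation of φ-quantiles and the histogram metric type.) The samples in b are the observation counts of each bucket. Each sample must carry an le label giving the upper bound of its bucket; samples without it are ignored. The histogram metric type automatically provides time series with the _bucket suffix and the appropriate labels.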
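The idea behind histogram_quantile can be sketched with linear interpolation inside the bucket that contains the target rank. The code below is a simplified model, not the Prometheus implementation; the bucket type, docBuckets, and histogramQuantile are names chosen for this note, and the data is the histogram output shown earlier.

```go
package main

import (
	"fmt"
	"math"
)

// bucket is one cumulative histogram bucket: count of observations <= le.
type bucket struct {
	le    float64
	count float64
}

// docBuckets holds the bucket values from the scraped histogram output above.
var docBuckets = []bucket{
	{0.05, 4599}, {0.1, 24128}, {0.25, 45311}, {0.5, 59983},
	{1, 60345}, {2.5, 114003}, {5, 201325}, {math.Inf(1), 227420},
}

// histogramQuantile estimates the q-quantile from cumulative buckets sorted by
// le, with the +Inf bucket last. It interpolates linearly inside the bucket
// that contains the target rank.
func histogramQuantile(q float64, buckets []bucket) float64 {
	if len(buckets) < 2 || q < 0 || q > 1 {
		return math.NaN()
	}
	total := buckets[len(buckets)-1].count
	if total == 0 {
		return math.NaN()
	}
	rank := q * total
	for i, b := range buckets {
		if b.count < rank {
			continue
		}
		if math.IsInf(b.le, 1) {
			// The rank falls in the +Inf bucket: report the highest finite bound.
			return buckets[i-1].le
		}
		lower, below := 0.0, 0.0
		if i > 0 {
			lower, below = buckets[i-1].le, buckets[i-1].count
		}
		if b.count == below {
			return b.le // empty bucket at rank 0: avoid dividing by zero
		}
		return lower + (b.le-lower)*(rank-below)/(b.count-below)
	}
	return math.NaN()
}

func main() {
	fmt.Printf("%.4f\n", histogramQuantile(0.5, docBuckets))
}
```

For this data the median lands in the le="2.5" bucket, so the estimate is roughly 2.49 seconds. The estimate is only as fine-grained as the bucket layout, which is why bucket boundaries should be chosen around the latencies of interest.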

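Rate calculation and linear regression

increase(v range-vector): returns the increase of each time series over the range. Counter resets are compensated for, and the result is extrapolated to cover the whole range, so it may be a non-integer.
irate(v range-vector): calculates the per-second rate of increase from the last two samples in the range, so it reflects the instantaneous rate.
rate(v range-vector): calculates the per-second average rate of increase of the time series in the range vector.
predict_linear(v range-vector, t scalar): predicts the value of time series v in t seconds. It uses simple linear regression over the samples in the time window to project the trend of the series.
deriv(v range-vector): takes a range vector and returns an instant vector. It uses simple linear regression to calculate the per-second derivative of each time series in v.
exp(v instant-vector): takes an instant vector and returns e raised to the power of each sample value, that is, e^N. Returns +Inf when N is large enough.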

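Aggregation over time for range vectors

avg_over_time(range-vector): the average value of all samples of each series in the range.
min_over_time(range-vector): the minimum value of all samples of each series in the range.
max_over_time(range-vector): the maximum value of all samples of each series in the range.
sum_over_time(range-vector): the sum of all sample values of each series in the range.
count_over_time(range-vector): the number of samples of each series in the range.
quantile_over_time(scalar, range-vector): the φ-quantile (0 ≤ φ ≤ 1) of the sample values of each series in the range.
stddev_over_time(range-vector): the population standard deviation of the sample values of each series in the range.
stdvar_over_time(range-vector): the population standard variance of the sample values of each series in the range.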

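rate vs irate vs increase

- These functions are generally applied to Counter metrics to derive increase and rate values.
- rate computes the average per-second rate of increase, irate uses only the last two samples in the range, and increase returns the growth over the range.
- Which one to use depends on the scenario.
- irate suits fast-moving counters, while rate suits slow-moving counters.
- Over the same range, rate gives a smoother result, but it has a long-tail effect: a sudden jump between two samples, such as the one between V5 and V6, is averaged out.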
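The difference between the three functions is easier to see on numbers. The sketch below is a simplified model with invented sample data. It handles counter resets but skips the extrapolation to the window boundaries that Prometheus applies, so real query results can differ slightly.

```go
package main

import "fmt"

// point is one sample of a counter: a timestamp in seconds and a value.
type point struct {
	t float64
	v float64
}

// increase sums the deltas between consecutive samples. A drop in value is
// treated as a counter reset, so the new value counts in full.
func increase(pts []point) float64 {
	total := 0.0
	for i := 1; i < len(pts); i++ {
		d := pts[i].v - pts[i-1].v
		if d < 0 {
			d = pts[i].v // counter restarted from 0
		}
		total += d
	}
	return total
}

// rate is the average per-second increase across the whole window.
func rate(pts []point) float64 {
	if len(pts) < 2 {
		return 0
	}
	return increase(pts) / (pts[len(pts)-1].t - pts[0].t)
}

// irate uses only the last two samples in the window.
func irate(pts []point) float64 {
	if len(pts) < 2 {
		return 0
	}
	return rate(pts[len(pts)-2:])
}

func main() {
	// Hypothetical counter scraped every 15s, with a burst in the last interval.
	pts := []point{{0, 100}, {15, 115}, {30, 130}, {45, 145}, {60, 460}}
	fmt.Println(increase(pts)) // 360
	fmt.Println(rate(pts))     // 6
	fmt.Println(irate(pts))    // 21
}
```

The burst of 315 in the last interval shows up fully in irate (21 per second) but is spread over the whole minute by rate (6 per second), which is the averaging effect described above.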
Original article: https://blog.csdn.net/weixin_60092693/article/details/126203347