[HBZ Share] A Summary of Elasticsearch Aggregation Functions

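Aggregation categories

Metric aggregations: aggregations that compute an indicator such as the maximum, minimum, sum, or average over a set of documents are called metric aggregations.

Format: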
GET /index/_search
{
  "size": 0,
  "aggs": {
    "aggregation_name": {
      "aggregation_type": {
        "aggregation_field": "field_name"
        // optional parameters
      }
    }
    // more aggregations can be added here
  }
}

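# Explanation
-index: the name of the index to run the aggregation query against.
-size: set to 0 to return only the aggregation results; no matching documents come back, so the hits array in the response is empty.
-aggs: the container that holds the aggregation definitions.

-aggregation_name: the name of the aggregation. It is user-defined, so any name works; the result is returned under this name.
-aggregation_type: the type of aggregation, for example terms, avg, or sum.
-aggregation_field: the field the aggregation runs on. In a real request this key is the literal "field", as in the cases below.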
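To make the template concrete, here is a minimal Python sketch that fills in the three placeholders and prints the resulting request body. The helper name metric_agg_body is made up for illustration and is not part of any Elasticsearch client; the sketch only builds the JSON and does not send it anywhere.

```python
import json


def metric_agg_body(name, agg_type, field):
    """Build a size-0 search body holding one metric aggregation.

    Hypothetical helper for illustration only.
    """
    return {
        "size": 0,  # skip the hits, return only the aggregation result
        "aggs": {
            name: {  # aggregation_name: free to choose
                agg_type: {  # aggregation_type: avg, sum, min, max, ...
                    "field": field  # the field the aggregation runs on
                }
            }
        },
    }


body = metric_agg_body("avg_sales", "avg", "sales")
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted after a GET /index/_search line in the console.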

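Bucket aggregations: these group the documents, much like GROUP BY in SQL, and a [metric aggregation] can then be run on each group. Elasticsearch calls this bucketing.

Format: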
GET /index/_search
{
  "size": 0,
  "aggs": {
    "aggregation_name": {
      "bucket_type": {
        "bucket_option_name": "bucket_option_value",
        ...
      },
      "aggs": {
        "sub_aggregation_name": {
          "sub_aggregation_type": {
            "sub_aggregation_option_name": "sub_aggregation_option_value",
            ...
          }
        }
      }
    }
  }
}
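# Explanation
-index: replace with the name of the index to run the aggregation query against.
-aggregation_name: replace with a custom aggregation name (any name works).
-bucket_type: replace with a specific bucket aggregation type (such as terms, date_histogram, or range).
-bucket_option_name and bucket_option_value: replace with the name and value of an option for that bucket aggregation.

-sub_aggregation_name: replace with the name of the sub-aggregation.
-sub_aggregation_type: replace with a specific sub-aggregation type (such as sum, avg, max, or min).
-sub_aggregation_option_name and sub_aggregation_option_value: replace with the name and value of an option for that sub-aggregation.
-The two aggs keys define two levels of aggregation: the outer aggs is the bucket aggregation, and the inner aggs is a metric aggregation that runs inside each bucket, so its scope is the documents of that one bucket.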
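The nesting is easier to see when the body is assembled step by step. Below is a small Python sketch, assuming a made-up helper bucket_agg_body, that places the bucket options directly under the bucket type and the sub-aggregation in an aggs key next to it, which is the shape the cases later in this article use.

```python
import json


def bucket_agg_body(name, bucket_type, bucket_options, sub_aggs=None):
    """Build a size-0 search body with one bucket aggregation.

    Hypothetical helper for illustration only.
    """
    agg = {bucket_type: dict(bucket_options)}  # options sit under the bucket type
    if sub_aggs:
        agg["aggs"] = sub_aggs  # sub-aggregations are a sibling of the bucket type
    return {"size": 0, "aggs": {name: agg}}


body = bucket_agg_body(
    "product_sales",
    "terms",
    {"field": "product.keyword"},
    sub_aggs={"total_sales": {"sum": {"field": "sales"}}},
)
print(json.dumps(body, indent=2))
```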

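Common aggregation types

Metric aggregations:
(1). Avg Aggregation: computes the average of a document field.
(2). Sum Aggregation: computes the sum of a document field.
(3). Min Aggregation: finds the minimum value of a document field.
(4). Max Aggregation: finds the maximum value of a document field.
Bucket aggregations:
(1). Terms Aggregation: groups documents into buckets by field value.
(2). Date Histogram Aggregation: creates buckets at time intervals over a date/time field.
(3). Range Aggregation: creates buckets from ranges of field values.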

Aggregation cases in practice

Demo data:
# Bulk-insert the data
POST /sales/_bulk
{"index": {}}
{"product": "iPhone", "sales": 4, "date": "2021-04-21"}
{"index": {}}
{"product": "Samsung", "sales": 60, "date": "2022-04-21"}
{"index": {}}
{"product": "iPhone", "sales": 100, "date": "2021-05-21"}
{"index": {}}
{"product": "Samsung", "sales": 80, "date": "2021-05-21"}
{"index": {}}
{"product": "华为手机", "sales": 50, "date": "2021-06-21"}
{"index": {}}
{"product": "华为手机", "sales": 5000, "date": "2021-06-21"}
{"index": {}}
{"product": "华为手机", "sales": 200, "date": "2021-06-21"}
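The bulk body above is newline-delimited JSON: one action line followed by one document line, repeated, with a trailing newline at the end. If you want to generate it from a script instead of typing it out, a rough Python sketch could look like this; the function name to_bulk_body is an assumption for illustration, and the sketch only builds the text and does not talk to a cluster.

```python
import json

# The seven demo documents from the bulk request above.
docs = [
    {"product": "iPhone", "sales": 4, "date": "2021-04-21"},
    {"product": "Samsung", "sales": 60, "date": "2022-04-21"},
    {"product": "iPhone", "sales": 100, "date": "2021-05-21"},
    {"product": "Samsung", "sales": 80, "date": "2021-05-21"},
    {"product": "华为手机", "sales": 50, "date": "2021-06-21"},
    {"product": "华为手机", "sales": 5000, "date": "2021-06-21"},
    {"product": "华为手机", "sales": 200, "date": "2021-06-21"},
]


def to_bulk_body(docs):
    """Return an NDJSON bulk body: an action line, then a source line, per doc."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {}}))  # let Elasticsearch assign the _id
        lines.append(json.dumps(doc, ensure_ascii=False))
    return "\n".join(lines) + "\n"  # a bulk body must end with a newline


payload = to_bulk_body(docs)
print(payload)
```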

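Case 1: Group by product name (product)

GET /sales/_search
{
  "aggs":{// aggregation container
    "product_group":{// aggregation name, free to choose
      "terms":{// group into buckets
        "field":"product.keyword"// grouping field (keyword subfield)
      }
    }
  }
}

Note: the bulk insert above creates product as a text field under the default dynamic mapping, and terms cannot run on a text field unless fielddata is enabled, so the request groups on the product.keyword subfield. If the index maps product as keyword explicitly, use "product" directly.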
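For intuition, the grouping that terms performs on the demo data can be imitated in plain Python. This is only a sketch of the result, not of how Elasticsearch computes it; buckets come back ordered by document count, largest first, and the order among equal counts in this sketch is not meant to match Elasticsearch.

```python
from collections import Counter

# The seven demo documents from the bulk request (date omitted, it is not used here).
docs = [
    {"product": "iPhone", "sales": 4},
    {"product": "Samsung", "sales": 60},
    {"product": "iPhone", "sales": 100},
    {"product": "Samsung", "sales": 80},
    {"product": "华为手机", "sales": 50},
    {"product": "华为手机", "sales": 5000},
    {"product": "华为手机", "sales": 200},
]

# One bucket per distinct product value, holding the number of documents in it.
counts = Counter(doc["product"] for doc in docs)
buckets = [{"key": key, "doc_count": n} for key, n in counts.most_common()]

for bucket in buckets:
    print(bucket)
```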

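Case 2: Return each product name with its total sales

GET /sales/_search
{
  "size": 0,
  "aggs": {
    "product_sales": {
      "terms": {
        "field": "product.keyword"
      },
      "aggs": {
        "total_sales": {
          "sum": {
            "field": "sales"
          }
        }
      }
    }
  }
}

Explanation: the outer aggs first groups the documents by product name, with terms bucketing on the product field (through its product.keyword subfield, since terms needs a keyword field). A second-level metric aggregation then runs inside each bucket and sums the sales field for that bucket.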
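The two levels can be mimicked in a few lines of Python: group first, then sum inside each group. This is a sketch of the expected numbers for the demo data, not of Elasticsearch internals.

```python
from collections import defaultdict

# The seven demo documents from the bulk request (date omitted, it is not used here).
docs = [
    {"product": "iPhone", "sales": 4},
    {"product": "Samsung", "sales": 60},
    {"product": "iPhone", "sales": 100},
    {"product": "Samsung", "sales": 80},
    {"product": "华为手机", "sales": 50},
    {"product": "华为手机", "sales": 5000},
    {"product": "华为手机", "sales": 200},
]

# Outer level: one bucket per product. Inner level: sum of sales inside the bucket.
total_sales = defaultdict(int)
for doc in docs:
    total_sales[doc["product"]] += doc["sales"]

for product, total in total_sales.items():
    print(product, total)
```

For this data the totals are 104 for iPhone, 140 for Samsung, and 5250 for 华为手机.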

Case 3: Get the maximum value (max)

GET /sales/_search
{
  "size": 0,
  "aggs": {
    "max_sales": {
      "max": {
        "field": "sales"
      }
    }
  }
}

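size: 0 returns only the maximum value. Without size, the response also includes matching documents in hits (the first 10 by default).
max_sales: the aggregation name, free to choose
max: a fixed keyword meaning maximum value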
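The result of a single-value metric comes back under aggregations, keyed by the name chosen in the request, with the number in a value field. The Python sketch below reads it from a hand-written, trimmed sample response for the demo data; it is not captured output, and a real response carries more fields (took, _shards, and so on).

```python
# Hand-written, trimmed sample of a search response (an illustration, not real output).
response = {
    "hits": {"hits": []},  # empty because the request used "size": 0
    "aggregations": {
        "max_sales": {"value": 5000.0}  # keyed by the aggregation name
    },
}

max_sales = response["aggregations"]["max_sales"]["value"]
print(max_sales)
```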

Case 4: Get the minimum value (min)

GET /sales/_search
{
  "size": 0,
  "aggs": {
    "min_sales": {
      "min": {
        "field": "sales"
      }
    }
  }
}
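size: 0 returns only the minimum value. Without size, the response also includes matching documents in hits (the first 10 by default).
min_sales: the aggregation name, free to choose
min: a fixed keyword meaning minimum value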

Case 5: Get the average (avg)

GET /sales/_search
{
  "size": 0,
  "aggs": {
    "avg_sales": {
      "avg": {
        "field": "sales"
      }
    }
  }
}
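size: 0 returns only the average. Without size, the response also includes matching documents in hits (the first 10 by default).
avg_sales: the aggregation name, free to choose
avg: a fixed keyword meaning average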

Case 6: Get the sum (sum)

GET /sales/_search
{
  "size": 0,
  "aggs": {
    "total_sales": {
      "sum": {
        "field": "sales"
      }
    }
  }
}
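size: 0 returns only the sum. Without size, the response also includes matching documents in hits (the first 10 by default).
total_sales: the aggregation name, free to choose
sum: a fixed keyword meaning sum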
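As a sanity check for cases 3 to 6, the same four metrics can be computed over the demo sales values in plain Python. This sketch just shows what numbers to expect from the demo data; it assumes the index holds exactly the seven documents inserted earlier.

```python
# The sales values of the seven demo documents, in insertion order.
sales = [4, 60, 100, 80, 50, 5000, 200]

max_sales = max(sales)  # case 3: max
min_sales = min(sales)  # case 4: min
avg_sales = sum(sales) / len(sales)  # case 5: avg
total_sales = sum(sales)  # case 6: sum

print("max:", max_sales)
print("min:", min_sales)
print("avg:", avg_sales)
print("sum:", total_sales)
```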

Case 7: Bucket aggregation with terms in practice
Group by the book_title field and compute the total sales_count within each group
GET /book_sales/_search
{
  "size": 0,
  "aggs": {
    "book_buckets": {
      "terms": {
        "field": "book_title",
        "size": 10
      },
      "aggs": {
        "total_sales": {
          "sum": {
            "field": "sales_count"
          }
        }
      }
    }
  }
}

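book_buckets: the aggregation name, free to choose
terms: a bucket aggregation, i.e. grouping; here it groups by the book_title field
book_title: the field to group on. It must be a keyword field (exact values), not an analyzed text field
size: the maximum number of buckets to return, here the top 10 by document count
inner aggs: a second-level aggregation that runs inside each group
total_sales: the name of the second-level aggregation, free to choose
sum: sums the sales_count field over the documents in each group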
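The effect of size on a terms aggregation is easy to try out in Python. The sketch below uses made-up book records, since the article does not show the contents of book_sales, and a hypothetical helper terms_sum that keeps only the top buckets by document count and sums a field inside each one. Breaking ties by key is a choice made for this sketch.

```python
from collections import defaultdict

# Hypothetical documents; the real book_sales data is not shown in the article.
docs = [
    {"book_title": "Book A", "sales_count": 10},
    {"book_title": "Book A", "sales_count": 5},
    {"book_title": "Book B", "sales_count": 7},
    {"book_title": "Book C", "sales_count": 1},
    {"book_title": "Book A", "sales_count": 2},
    {"book_title": "Book B", "sales_count": 3},
]


def terms_sum(docs, group_field, sum_field, size=10):
    """Group docs by group_field, sum sum_field per group, keep the top `size` groups."""
    groups = defaultdict(lambda: {"doc_count": 0, "total": 0})
    for doc in docs:
        group = groups[doc[group_field]]
        group["doc_count"] += 1
        group["total"] += doc[sum_field]
    # Largest buckets first; ties broken by key so the output is deterministic.
    ranked = sorted(groups.items(), key=lambda kv: (-kv[1]["doc_count"], kv[0]))
    return [
        {"key": key, "doc_count": g["doc_count"], "total_sales": g["total"]}
        for key, g in ranked[:size]
    ]


top_two = terms_sum(docs, "book_title", "sales_count", size=2)
for bucket in top_two:
    print(bucket)
```

With size=2, "Book C" is dropped because it has the fewest documents.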

Case 8: Grouping with Date Histogram
Bucket a date field at a regular time interval, then run further operations and calculations on the documents in each interval

GET /order_history/_search
{
  "size": 0,
  "aggs": {
    "sales_per_month": {
      "date_histogram": {
        "field": "order_date",
        "calendar_interval": "month",
        "format": "yyyy-MM"
      },
      "aggs": {
        "total_sales": {
          "sum": {
            "field": "amount"
          }
        }
      }
    }
  }
}

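size: the number of matching documents to return; size = 0 returns only the aggregation results and none of the documents
sales_per_month: the aggregation name, free to choose
date_histogram: a fixed keyword that groups documents by a date field
field: the field to group on
calendar_interval: the calendar-aware interval used for bucketing. It accepts only a single unit, written as a name (minute, hour, day, week, month, quarter, year) or as 1m, 1h, 1d, 1w, 1M, 1q, 1y; month groups by calendar month. Multiples such as 7d (7 days) or 2h (every 2 hours) are not valid here and go in fixed_interval instead
format: the date format used for each bucket's key_as_string in the output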
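To see what monthly bucketing produces, here is a Python sketch that applies the same idea to the sales demo data from earlier, because the contents of order_history are not shown. It groups by calendar month, formats the key as yyyy-MM, and sums sales per month. One difference to keep in mind: by default Elasticsearch also returns empty buckets for months that fall between the first and last bucket, which this sketch leaves out.

```python
from collections import defaultdict
from datetime import datetime

# The seven demo documents from the sales index (product omitted, it is not used here).
docs = [
    {"sales": 4, "date": "2021-04-21"},
    {"sales": 60, "date": "2022-04-21"},
    {"sales": 100, "date": "2021-05-21"},
    {"sales": 80, "date": "2021-05-21"},
    {"sales": 50, "date": "2021-06-21"},
    {"sales": 5000, "date": "2021-06-21"},
    {"sales": 200, "date": "2021-06-21"},
]


def monthly_totals(docs):
    """Sum sales per calendar month, keyed like format "yyyy-MM"."""
    totals = defaultdict(int)
    for doc in docs:
        month = datetime.strptime(doc["date"], "%Y-%m-%d").strftime("%Y-%m")
        totals[month] += doc["sales"]
    return dict(sorted(totals.items()))  # chronological order, like the buckets


per_month = monthly_totals(docs)
for month, total in per_month.items():
    print(month, total)
```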



Case 9: Bucket aggregation with Range
GET /product_v4/_search
{
  "size": 0,
  "aggs": {
    "price_ranges": {
      "range": {
        "field": "price",
        "ranges": [
          { "to": 100 },
          { "from": 100, "to": 200 },
          { "key": "safaf" , "from": 200 }
        ]
      }
    }
  }
}

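price_ranges: the aggregation name, free to choose
range: a fixed keyword meaning range aggregation
field: the field to aggregate on
ranges: the list of ranges; each one is defined with the key, from, and to parameters
key: the label of the bucket. It is optional; when omitted, Elasticsearch generates one from the bounds (for example 100.0-200.0). If you set it, the parameter must be named key, and the values should be unique
from: the lower bound, inclusive. When omitted, the range has no lower bound (it does not start at 0, so negative values are included)
to: the upper bound, exclusive. When omitted, the range runs from the from value upward with no upper bound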
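The boundary rules are the part that usually trips people up, so here is a small Python sketch of the counting logic: from is inclusive, to is exclusive, and a missing bound means unbounded. The prices are made up because the article does not show the contents of product_v4, and the helper range_buckets is hypothetical. The generated default keys follow the "*-100.0" style that Elasticsearch uses for numeric ranges.

```python
def range_buckets(values, ranges):
    """Count values per range: "from" is inclusive, "to" is exclusive."""
    buckets = []
    for r in ranges:
        low = r.get("from")  # None means no lower bound
        high = r.get("to")  # None means no upper bound
        count = sum(
            1
            for v in values
            if (low is None or v >= low) and (high is None or v < high)
        )
        default_key = "{}-{}".format(
            "*" if low is None else float(low),
            "*" if high is None else float(high),
        )
        buckets.append({"key": r.get("key", default_key), "doc_count": count})
    return buckets


# Hypothetical prices; note that 100 and 200 fall in the bucket that starts at them.
prices = [50, 100, 150, 199, 200]
ranges = [
    {"to": 100},
    {"from": 100, "to": 200},
    {"key": "safaf", "from": 200},
]

result = range_buckets(prices, ranges)
for bucket in result:
    print(bucket)
```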

Original article: https://blog.csdn.net/a645293829/article/details/134321758