• What the result fields of Elasticsearch aggregation and grouping queries mean (Java API version)


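    1. Overview

            After using Elasticsearch aggregation and grouping queries in a project, I found the returned results baffling. After reading through various references, I wrote down what each field means. The list is not necessarily complete and covers only the common fields.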

    2. Grouping queries

    2.1 Field explanations

    // `client` is an ElasticsearchClient; `values` is presumably a List<FieldValue> built from bizCodes.
    SearchResponse<Map> searchResponse = null;
    try {
        searchResponse = client.search(s -> s.index("tbanalyzelist").query(
                q -> q.bool(
                    t -> {
                        t.must(m -> m.match(b -> b.field("machineType.keyword").query(FieldValue.of(machineType))));
                        if (ToolUtil.isNotEmpty(bizCodes)) {
                            t.must(m -> m.terms(b -> b.field("bizCode.keyword").terms(f -> f.value(values))));
                        }
                        t.must(a -> a.range(r -> r.field("duration").gt(JsonData.of(0))));
                        t.must(a -> a.range(r -> r.field("open_time").gt(JsonData.of(startTime)).lte(JsonData.of(endTime1))));
                        return t;
                    }
                )
            )
            //.size(2000000) commented out for now because there is too much data
            .from(1) // pagination: zero-based offset of the first hit to return (use 0 to start at the first hit)
            .size(2) // two documents per page
            .aggregations("byOpenTime", aggregationBuilder ->
                aggregationBuilder.terms(termsAggregationBuilder ->
                    termsAggregationBuilder.field("openTime")
                )),
            Map.class);
    } catch (IOException e) {
        e.printStackTrace();
    }
    // Query result
    System.out.println(searchResponse);
    System.out.println("Took (ms): " + searchResponse.took());
    HitsMetadata<Map> hits = searchResponse.hits();
    System.out.println(hits.total());
    System.out.println("Total number of matching documents: " + hits.total().value());
    // Note: the first hits() returns the hits metadata (total, max_score);
    // the second hits() returns the list of returned documents.
    List<Hit<Map>> hitList = searchResponse.hits().hits();
    // Read the grouping result
    Map<String, Aggregate> aggregations = searchResponse.aggregations();
    System.out.println("aggregations:" + aggregations);
    Aggregate aggregate = aggregations.get("byOpenTime");
    System.out.println("byOpenTime grouping result = " + aggregate);
    LongTermsAggregate lterms = aggregate.lterms();
    Buckets<LongTermsBucket> buckets = lterms.buckets();
    for (LongTermsBucket b : buckets.array()) {
        System.out.println(b.key() + " : " + b.docCount());
    }
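The from and size settings in the query above control pagination. The following is a minimal sketch in plain Java; the `Paging` class and its method names are made up for illustration and are not part of the Elasticsearch client. It shows that from is a zero-based offset, so page p with n documents per page starts at (p - 1) * n, which means from(1) skips the first hit rather than selecting "page 1". It also checks a request against the default value of the index.max_result_window setting (10000), which from + size may not exceed unless the setting is raised.

```java
// Hypothetical helper for illustration; not part of the Elasticsearch client.
class Paging {
    // Default value of the index.max_result_window index setting.
    static final int DEFAULT_MAX_RESULT_WINDOW = 10_000;

    // Zero-based offset to pass to from(...) for a 1-based page number.
    static int offset(int page, int size) {
        if (page < 1 || size < 0) {
            throw new IllegalArgumentException("page must be >= 1 and size >= 0");
        }
        return Math.multiplyExact(page - 1, size);
    }

    // True if from + size stays within the default result window.
    static boolean withinDefaultWindow(int from, int size) {
        return (long) from + size <= DEFAULT_MAX_RESULT_WINDOW;
    }

    public static void main(String[] args) {
        int size = 2;
        System.out.println("page 1 -> from " + offset(1, size));
        System.out.println("page 3 -> from " + offset(3, size));
        System.out.println("size 2000000 fits default window: " + withinDefaultWindow(0, 2_000_000));
    }
}
```

With this helper, the first page of the query above would use from(Paging.offset(1, 2)), that is from(0).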
    • The searchResponse output converted to JSON
    {
      "took":190, // milliseconds spent executing the whole search request
      "timed_out":false, // whether the query timed out; by default a search request has no timeout
      "_shards":{ // shards that took part in the query
        "failed":0, // number of shards that failed
        "successful":1, // number of shards that succeeded
        "total":1, // total number of shards
        "skipped":0 // number of shards skipped
      },
      "hits":{ // matched data
        "total":{ // number of matching documents
          "relation":"gte", // "eq": value is the exact count; "gte": the real count is at least value
          "value":10000 // document count (a lower bound here, because relation is "gte")
        },
        "hits":[ // one entry per returned document
          {
            "_index":"tbanalyzelist", // index name, comparable to a table name in a database
            "_id":"QF2THIQBzxpesqmRtMpw",
            "_score":3.0470734, // relevance score
            "_type":"_doc", // type
            // the source document; this is where the data we stored lives
            "_source":"{
              duration=317.0, // each field is comparable to a column in MySQL
              machineId=ZFB007422,
              bizName=wangyf,
              bizCode=221026172721ZBTQ,
              open_time=1664296386000,
              openTime=2022-09-27,
              machineType=DEV-HL
            }"
          },
          {
            "_index":"tbanalyzelist",
            "_id":"QV2THIQBzxpesqmRtMpw",
            "_score":3.0470734,
            "_type":"_doc",
            "_source":"{
              duration=313.0,
              machineId=ZFB007422,
              bizName=wangyf,
              bizCode=221026172721ZBTQ,
              open_time=1664383009000,
              openTime=2022-09-28,
              machineType=DEV-HL
            }"
          }
        ],
        "max_score":3.0470734 // highest _score among the matching documents
      },
      "aggregations":{ // aggregation results
        "lterms#byOpenTime":{ // aggregation type (lterms) followed by the aggregation name given in the request
          "buckets":[ // the grouping buckets
            {
              "doc_count":20144, // number of documents in this bucket
              "key":"1664150400000",
              "key_as_string":"2022-09-26T00:00:00.000Z"
            },
            {
              "doc_count":19724,
              "key":"1664409600000",
              "key_as_string":"2022-09-29T00:00:00.000Z"
            },
            {
              "doc_count":19715,
              "key":"1664236800000",
              "key_as_string":"2022-09-27T00:00:00.000Z"
            },
            {
              "doc_count":19653,
              "key":"1664323200000",
              "key_as_string":"2022-09-28T00:00:00.000Z"
            },
            {
              "doc_count":19376,
              "key":"1664496000000",
              "key_as_string":"2022-09-30T00:00:00.000Z"
            },
            {
              "doc_count":331,
              "key":"1664064000000",
              "key_as_string":"2022-09-25T00:00:00.000Z"
            }
          ],
          "doc_count_error_upper_bound":0,
          "sum_other_doc_count":0
        }
      }
    }
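    • doc_count_error_upper_bound: an upper bound on the doc_count error of this terms aggregation, that is, the largest document count that a term missing from the final result could have. The error exists because ES is distributed: documents are spread over several shards, each shard aggregates on its own and returns only its top terms, and the coordinating node merges those partial results before responding.
    • sum_other_doc_count: the sum of the document counts of all buckets that were not returned in the response (terms that fell outside the requested number of buckets).
    • doc_count: the number of documents in each bucket.
    • key: the value of the grouping key for the bucket.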
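To make these fields concrete, here is a minimal sketch in plain Java that mimics what a terms aggregation does on a single shard: sort the terms by document count, keep the top `size` of them, and report the remainder as sum_other_doc_count. The `TermsSketch` class is hypothetical and involves no Elasticsearch code; the counts are copied from the sample response above. With all six buckets returned the counts add up to 98943, which fits hits.total reporting "gte" 10000, under the assumption that every matching document has exactly one openTime value.

```java
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical single-shard model of a terms aggregation; not Elasticsearch code.
class TermsSketch {
    // doc_count per key, copied from the sample response.
    static Map<String, Long> sampleCounts() {
        Map<String, Long> counts = new LinkedHashMap<>();
        counts.put("2022-09-26", 20144L);
        counts.put("2022-09-29", 19724L);
        counts.put("2022-09-27", 19715L);
        counts.put("2022-09-28", 19653L);
        counts.put("2022-09-30", 19376L);
        counts.put("2022-09-25", 331L);
        return counts;
    }

    // Keys of the `size` largest buckets, ordered by doc_count descending.
    static List<String> topKeys(Map<String, Long> counts, int size) {
        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue().reversed())
                .limit(size)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    // Sum of doc_count over the buckets that did not make the top `size`.
    static long sumOtherDocCount(Map<String, Long> counts, int size) {
        return counts.values().stream()
                .sorted(Comparator.reverseOrder())
                .skip(size)
                .mapToLong(Long::longValue)
                .sum();
    }

    // Total number of documents across all buckets.
    static long total(Map<String, Long> counts) {
        return counts.values().stream().mapToLong(Long::longValue).sum();
    }

    public static void main(String[] args) {
        Map<String, Long> counts = sampleCounts();
        System.out.println("total = " + total(counts));
        System.out.println("top 3 keys = " + topKeys(counts, 3));
        System.out.println("sum_other_doc_count with size 3 = " + sumOtherDocCount(counts, 3));
        System.out.println("sum_other_doc_count with size 10 = " + sumOtherDocCount(counts, 10));
    }
}
```

The sample response has sum_other_doc_count 0 because only six distinct keys exist, fewer than the default of ten buckets that a terms aggregation returns.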

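    2.2 How to get the bucket data

    // Pick the accessor that matches the key type of the grouped field.
    Buckets<LongTermsBucket> longBuckets = aggregate.lterms().buckets();
    Buckets<StringTermsBucket> stringBuckets = aggregate.sterms().buckets();
    Buckets<DoubleTermsBucket> doubleBuckets = aggregate.dterms().buckets();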
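When the grouped field is a date, as openTime is in the query above, the keys returned through lterms() are epoch milliseconds. The following minimal sketch converts such a key to a calendar date with java.time. The `BucketKeys` class is a hypothetical helper, the key values come from the sample response, and UTC is assumed because the key_as_string values end in Z.

```java
import java.time.Instant;
import java.time.ZoneOffset;

// Hypothetical helper for illustration; not part of the Elasticsearch client.
class BucketKeys {
    // Converts an epoch-millisecond bucket key to an ISO date (yyyy-MM-dd) in UTC.
    static String toUtcDate(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis)
                .atZone(ZoneOffset.UTC)
                .toLocalDate()
                .toString();
    }

    public static void main(String[] args) {
        // In the JSON output the key appears as a string, so parse it first.
        long key = Long.parseLong("1664150400000");
        System.out.println(key + " -> " + toUtcDate(key));
    }
}
```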

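    3. Aggregation queries

            Ignore the query conditions for now. The point is that after a stats aggregation you can read max, count, min, avg, and sum directly from the result.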

    String cinemaId = "15989";
    SearchResponse<Map> searchResponse = null;
    try {
        searchResponse = client.search(
            s -> s.index("tbmaoyan")
                .query(q -> q.bool(t -> {
                        t.must(m -> m.match(f -> f.field("cinemaId.keyword").query(FieldValue.of(cinemaId))));
                        //t.must(m -> m.term(f -> f.field("cinemaId.keyword").value(cinemaId)));
                        //t.must(m -> m.match(f -> f.field("cinemaId").query("36924")));
                        // t.must(m -> m.match(f -> f.field("bizCode").query(FieldValue.of("220104182434IIZF"))));//220104182434IIZF 220120143442CB4C
                        return t;
                    }
                    )
                )
                // .sort(o -> o.field(f -> f.field("openTime").order(SortOrder.Asc)))
                // stats on viewInfo
                .aggregations("sumViewInfo", aggregationBuilder -> aggregationBuilder
                    .stats(statsAggregationBuilder -> statsAggregationBuilder
                        .field("viewInfo")))
                // stats on showInfo
                .aggregations("aggregateShowInfo", aggregationBuilder -> aggregationBuilder
                    .stats(statsAggregationBuilder -> statsAggregationBuilder
                        .field("showInfo")))
                .from(0)
                .size(10000)
            , Map.class
        );
    } catch (IOException e) {
        e.printStackTrace();
    }
    // Query result
    System.out.println(searchResponse);
    System.out.println("Took (ms): " + searchResponse.took());
    HitsMetadata<Map> hits = searchResponse.hits();
    System.out.println(hits.total());
    System.out.println("Total number of matching documents: " + hits.total().value());
    // Note: the first hits() returns the hits metadata; the second returns the list of returned documents.
    List<Hit<Map>> hitList = searchResponse.hits().hits();
    List<Map> hitListCopy = new ArrayList<>();
    for (Hit<Map> mapHit : hitList) {
        String source = mapHit.source().toString();
        System.out.println("Raw document source: " + source);
        Map map = mapHit.source();
        hitListCopy.add(map);
    }
    // Read the aggregation results
    Map<String, Aggregate> aggregations = searchResponse.aggregations();
    System.out.println("aggregations:" + aggregations);
    Aggregate aggregateViewInfo = aggregations.get("sumViewInfo");
    Aggregate aggregateShowInfo = aggregations.get("aggregateShowInfo");
    System.out.println("viewInfo:" + aggregateViewInfo);
    System.out.println("showInfo:" + aggregateShowInfo);
    System.out.println("Count: " + aggregateViewInfo.stats().count());
    System.out.println("Highest split box office: " + aggregateViewInfo.stats().max());
    System.out.println("Lowest split box office: " + aggregateViewInfo.stats().min());
    System.out.println("Average split box office: " + aggregateViewInfo.stats().avg());
    System.out.println("Split box office summed by the aggregation: " + aggregateViewInfo.stats().sum());
    // Sum the same field over the returned hits to cross-check the aggregation.
    double sumViewInfoCopy = hitListCopy.stream().mapToDouble(h -> Double.parseDouble(h.get("viewInfo").toString())).sum();
    System.out.println("********************");
    System.out.println("Split box office summed by the aggregation: " + aggregateViewInfo.stats().sum());
    System.out.println("Split box office summed with a stream: " + sumViewInfoCopy);
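    • The result of searchResponse.aggregations() is similar to the grouping query above, so it is not repeated here.
    • The value of aggregations.get("sumViewInfo")

    • The value of aggregations.get("aggregateShowInfo")

    • Compare the aggregated result with the figure we computed ourselves to check that they agree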
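The comparison can be sketched in plain Java with DoubleSummaryStatistics, which produces the same five values as a stats aggregation: count, min, max, avg, and sum. The `LocalStats` class and the sample viewInfo values below are invented for illustration. Keep in mind that the check only holds when every matching document is among the returned hits. With size(10000), the stream sum will fall short of the aggregated sum as soon as more than 10000 documents match.

```java
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of the Elasticsearch client.
class LocalStats {
    // Computes count/min/max/avg/sum of one field over already-fetched _source maps.
    static DoubleSummaryStatistics statsOf(List<Map<String, Object>> sources, String field) {
        return sources.stream()
                .filter(m -> m.get(field) != null) // skip documents that lack the field
                .mapToDouble(m -> Double.parseDouble(m.get(field).toString()))
                .summaryStatistics();
    }

    // Floating-point sums can differ in the last digits, so compare with a relative tolerance.
    static boolean closeEnough(double a, double b) {
        double scale = Math.max(1.0, Math.max(Math.abs(a), Math.abs(b)));
        return Math.abs(a - b) <= 1e-6 * scale;
    }

    // Builds a one-field document; the values used below are made-up sample data.
    static Map<String, Object> doc(Object viewInfo) {
        Map<String, Object> m = new HashMap<>();
        m.put("viewInfo", viewInfo);
        return m;
    }

    static List<Map<String, Object>> sample() {
        return Arrays.asList(doc(10.5), doc("20"), doc(4.5));
    }

    public static void main(String[] args) {
        DoubleSummaryStatistics s = statsOf(sample(), "viewInfo");
        System.out.println("count=" + s.getCount() + " min=" + s.getMin()
                + " max=" + s.getMax() + " avg=" + s.getAverage() + " sum=" + s.getSum());
        double aggregatedSum = 35.0; // stand-in for aggregateViewInfo.stats().sum()
        System.out.println("sums agree: " + closeEnough(s.getSum(), aggregatedSum));
    }
}
```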

  • Original article: https://blog.csdn.net/m0_74444744/article/details/134438853