• 16. python-es-8.3.3: significant_terms aggregation for significant or unusual terms


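    There is no such thing as a perfect program, but that does not discourage us, because writing programs is a process of continually pursuing perfection. - 侯氏工坊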

    Contents

    • Principle: comparing foreground frequency with background frequency

    significant_terms

    from elasticsearch import Elasticsearch
    import urllib3
    
    urllib3.disable_warnings()
    
    # PUT es_significant_terms
    # {
    #   "mappings": {
    #     "properties": {
    #       "name": {"type": "keyword"},
    #       "type": {"type": "keyword"}
    #     }
    #   }
    # }
    
    # POST es_significant_terms/_bulk
    # {"index": {"_id": 1}}
    # {"name": "es", "type": "lan"}
    # {"index": {"_id": 2}}
    # {"name": "good", "type": "lan"}
    # {"index": {"_id": 3}}
    # {"name": "es", "type": "lan"}
    # {"index": {"_id": 4}}
    # {"name": "elastic", "type": "lan"}
    # {"index": {"_id": 5}}
    # {"name": "es", "type": "te"}
    # {"index": {"_id": 6}}
    # {"name": "good", "type": "te"}
    
    # GET es_significant_terms/_search
    # {
    #   "query": {"term": {
    #     "type": {
    #       "value": "te"
    #     }
    #   }},
    #   "size": 0,
    #   "aggs": {
    #     "my_significant_terms": {
    #       "significant_terms": {
    #         "field": "name",
    #         "min_doc_count": 1
    #       }
    #     }
    #   }
    # }
    
    # Create the Elasticsearch client
    es = Elasticsearch(
        "https://192.168.2.64:9200",
        verify_certs=False,
        basic_auth=("elastic", "MuZkDqdW--VsfDjTcoex"),
        request_timeout=60,
        max_retries=3,
        retry_on_timeout=True,
        node_selector_class="round_robin",
    )
    
    # Refresh the index so that newly indexed documents are visible to search
    es.indices.refresh(index="es_significant_terms")
    
    significant_terms = {
        "my_significant_terms": {
            "significant_terms": {
                "field": "name",
                "min_doc_count": 1
            }
        }
    }
    
    query = {
        "term": {
            "type": {
                "value": "te"
            }
        }
    }
    
    resp = es.search(index="es_significant_terms", size=0, query=query, aggregations=significant_terms)
    
    print(resp['aggregations']['my_significant_terms']['buckets'])
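
The principle named in the contents (comparing foreground frequency with background frequency) can be sketched without a cluster. The snippet below is a minimal illustration, not Elasticsearch's implementation: it reuses the six documents from the `_bulk` request above, treats the documents matching `type == "te"` as the foreground set and the whole index as the background set, and scores each term with the JLH formula, (foreground % - background %) * (foreground % / background %), on the assumption that this matches the default heuristic. The helper name `jlh_scores` is hypothetical.

```python
from collections import Counter

# The same six documents as in the _bulk request above.
docs = [
    {"name": "es", "type": "lan"},
    {"name": "good", "type": "lan"},
    {"name": "es", "type": "lan"},
    {"name": "elastic", "type": "lan"},
    {"name": "es", "type": "te"},
    {"name": "good", "type": "te"},
]


def jlh_scores(docs, field, is_foreground):
    """Score terms by comparing foreground and background frequency.

    Hypothetical helper: a sketch of the JLH idea, assuming
    score = (fg% - bg%) * (fg% / bg%); terms that are not more
    frequent in the foreground than in the background are dropped.
    """
    background = Counter(d[field] for d in docs)
    foreground = Counter(d[field] for d in docs if is_foreground(d))
    fg_total = sum(foreground.values())
    bg_total = len(docs)

    scores = {}
    for term, fg_count in foreground.items():
        fg_pct = fg_count / fg_total          # share within the query results
        bg_pct = background[term] / bg_total  # share within the whole index
        if fg_pct > bg_pct:
            scores[term] = (fg_pct - bg_pct) * (fg_pct / bg_pct)
    return scores


scores = jlh_scores(docs, "name", lambda d: d["type"] == "te")
print(scores)
```

In this sketch `es` makes up half of both the foreground and the background, so it is not treated as significant, while `good` is relatively more frequent among the `te` documents than in the index as a whole and is therefore kept.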
    
  • Original article: https://blog.csdn.net/a13662080711/article/details/126848763