• Elasticsearch bulk data: batch-importing JSON data into ES


    1. Bulk API
    Official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/6.0/docs-bulk.html

    The REST API endpoint is /_bulk, and it expects the following newline delimited JSON (NDJSON) structure:

    action_and_meta_data\n
    optional_source\n
    action_and_meta_data\n
    optional_source\n
    ....
    action_and_meta_data\n
    optional_source\n
    

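    In other words, each operation normally consists of two lines, and every line, including the last, must end with a newline. The first line names the action and its metadata; the second line carries the document source (a delete action has no second line). For example, the following request creates two documents and deletes one: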

    { "create" : { "_index" : "blog", "_type" : "article", "_id" : "3" }}
    { "title":"title1","posttime":"2016-07-02","content":"内容一" }
    
    { "create" : { "_index" : "blog", "_type" : "article", "_id" : "4" }}
    { "title":"title2","posttime":"2016-07-03","content":"内容2" }
    
    { "delete":{"_index" : "blog", "_type" : "article", "_id" : "1" }}
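The two-line layout described above can also be generated programmatically instead of typed by hand. The following is only a minimal sketch: the helper name build_bulk_body and the (action, metadata, source) tuple layout are illustrative assumptions, not part of any Elasticsearch client.

```python
import json


def build_bulk_body(operations):
    """Serialize (action, metadata, source) tuples into an NDJSON bulk body.

    `source` is None for actions that take no second line, such as delete.
    """
    lines = []
    for action, metadata, source in operations:
        # json.dumps without `indent` emits one line, so nothing is pretty printed.
        lines.append(json.dumps({action: metadata}, ensure_ascii=False))
        if source is not None:
            lines.append(json.dumps(source, ensure_ascii=False))
    # Every line, including the last one, is terminated by a newline.
    return "".join(line + "\n" for line in lines)


# Same operations as the example above: two creates and one delete.
operations = [
    ("create", {"_index": "blog", "_type": "article", "_id": "3"},
     {"title": "title1", "posttime": "2016-07-02", "content": "内容一"}),
    ("create", {"_index": "blog", "_type": "article", "_id": "4"},
     {"title": "title2", "posttime": "2016-07-03", "content": "内容2"}),
    ("delete", {"_index": "blog", "_type": "article", "_id": "1"}, None),
]

body = build_bulk_body(operations)
print(body, end="")
```

Writing body to a file in UTF-8 gives a payload in the same shape as the hand-written one.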
    

    The official explanation and example:
    Because this format uses literal \n's as delimiters, please be sure that the JSON actions and sources are not pretty printed. Here is an example of a correct sequence of bulk commands:

    POST _bulk
    { "index" : { "_index" : "test", "_type" : "type1", "_id" : "1" } }
    { "field1" : "value1" }
    { "delete" : { "_index" : "test", "_type" : "type1", "_id" : "2" } }
    { "create" : { "_index" : "test", "_type" : "type1", "_id" : "3" } }
    { "field1" : "value3" }
    { "update" : {"_id" : "1", "_type" : "type1", "_index" : "test"} }
    { "doc" : {"field2" : "value2"} }
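Because a pretty-printed or unterminated body is an easy mistake to make, a quick local check before submitting can help. The sketch below is an assumption-laden illustration (the function check_bulk_body is hypothetical and far less strict than the server's own parser): it only verifies that each non-empty line is single-line JSON, that the action name is one of index, create, update, or delete, and that actions other than delete are followed by a line that parses as JSON.

```python
import json

# Actions that must be followed by a source line; delete takes none.
ACTIONS_WITH_SOURCE = {"index", "create", "update"}
ALL_ACTIONS = ACTIONS_WITH_SOURCE | {"delete"}


def check_bulk_body(body):
    """Return a list of problems found in an NDJSON bulk body (empty if none)."""
    problems = []
    if not body.endswith("\n"):
        problems.append("body does not end with a newline")
    # Positions below count non-empty lines only.
    lines = [line for line in body.split("\n") if line.strip()]
    i = 0
    while i < len(lines):
        try:
            action_line = json.loads(lines[i])
        except ValueError:
            problems.append(f"line {i + 1}: not a complete JSON object (pretty printed?)")
            i += 1
            continue
        if not isinstance(action_line, dict) or len(action_line) != 1:
            problems.append(f"line {i + 1}: expected an object with exactly one action key")
            i += 1
            continue
        action = next(iter(action_line))
        if action not in ALL_ACTIONS:
            problems.append(f"line {i + 1}: unknown action {action!r}")
            i += 1
            continue
        if action in ACTIONS_WITH_SOURCE:
            if i + 1 >= len(lines):
                problems.append(f"line {i + 1}: {action} has no source line")
            else:
                try:
                    json.loads(lines[i + 1])
                except ValueError:
                    problems.append(f"line {i + 2}: source is not single-line JSON")
            i += 2  # skip the action line and its source line
        else:
            i += 1  # delete has no source line
    return problems


good = (
    '{ "index" : { "_index" : "test", "_type" : "type1", "_id" : "1" } }\n'
    '{ "field1" : "value1" }\n'
    '{ "delete" : { "_index" : "test", "_type" : "type1", "_id" : "2" } }\n'
)
print(check_bulk_body(good))
```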
    

    2. Submitting data saved in a file. From the official documentation:
    If you’re providing text file input to curl, you must use the --data-binary flag instead of plain -d. The latter doesn’t preserve newlines. Example:

    $ cat requests
    { "index" : { "_index" : "test", "_type" : "type1", "_id" : "1" } }
    { "field1" : "value1" }
    $ curl -s -H "Content-Type: application/x-ndjson" -XPOST localhost:9200/_bulk --data-binary "@requests"
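The same file submission can be sketched without curl, using only Python's standard library. The function names build_bulk_request and send_bulk and the default URL are assumptions for illustration; the file is read as bytes so that its newlines are passed through unchanged, which is the role --data-binary plays for curl.

```python
import urllib.request


def build_bulk_request(path, url="http://localhost:9200/_bulk"):
    """Build (but do not send) a POST request carrying the file at `path`."""
    # Read raw bytes so the newlines in the file are preserved exactly.
    with open(path, "rb") as f:
        data = f.read()
    # The bulk body must end with a newline; add one if the file lacks it.
    if not data.endswith(b"\n"):
        data += b"\n"
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )


def send_bulk(path, url="http://localhost:9200/_bulk"):
    """Send the bulk file and return the raw response text."""
    with urllib.request.urlopen(build_bulk_request(path, url)) as resp:
        return resp.read().decode("utf-8")


# Usage, assuming a node is reachable at the default URL:
# print(send_bulk("requests"))
```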
    

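    A concrete example: save the following data in a file named request (for example with vim request), making sure the last line ends with a newline:

    { "index" : { "_index" : "test_index", "_type" : "chen", "_id" : "1" } }
    { "field1" : "value1" }
    { "index" : { "_index" : "test_index", "_type" : "chen", "_id" : "2" } }
    { "field1" : "value2" }
    { "index" : { "_index" : "test_index", "_type" : "chen", "_id" : "3" } }
    { "field1" : "value3" }

    Then submit it with the command below. Elasticsearch 6.0 and later reject requests that lack an explicit Content-Type header, so it is included here:

    curl -H "Content-Type: application/x-ndjson" -XPOST '192.168.0.153:9200/_bulk' --data-binary @request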
    

    Check whether the submission succeeded:

    curl -XGET 'http://192.168.0.153:9200/test_index/chen/1?pretty'
    {
      "_index" : "test_index",
      "_type" : "chen",
      "_id" : "1",
      "_version" : 2,
      "found" : true,
      "_source" : {
        "field1" : "value1"
      }
    }
    

    OK, the submission succeeded.
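Fetching one document confirms only that document. The bulk response itself carries a top-level errors flag and one entry per action in items, so it can be scanned for failures directly. The sketch below assumes that response shape; the sample response is made up for illustration and is not output from the cluster used above, and failed_items is a hypothetical helper name.

```python
import json


def failed_items(bulk_response):
    """Return (action, _id, status, reason) for every failed item."""
    failures = []
    # `errors` is false when every action succeeded, so there is nothing to scan.
    if not bulk_response.get("errors"):
        return failures
    for item in bulk_response.get("items", []):
        # Each item is an object with a single key: the action name.
        for action, result in item.items():
            if "error" in result:
                error = result["error"]
                reason = error.get("reason") if isinstance(error, dict) else str(error)
                failures.append((action, result.get("_id"), result.get("status"), reason))
    return failures


# Illustrative response only: one success and one conflict.
sample = json.loads("""
{"took": 5, "errors": true, "items": [
  {"index": {"_index": "test_index", "_type": "chen", "_id": "1",
             "status": 201, "result": "created"}},
  {"create": {"_index": "test_index", "_type": "chen", "_id": "2", "status": 409,
              "error": {"type": "version_conflict_engine_exception",
                        "reason": "document already exists"}}}
]}
""")

for failure in failed_items(sample):
    print(failure)
```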

  • Original article: https://blog.csdn.net/m0_67401746/article/details/126358582