Document: the primary data carrier during indexing and search; it consists of one or more fields that contain data.
Field: a part of a document, consisting of a name and a value.
Term: a unit of search, representing a single word from the text.
Token: an occurrence of a term in a field, consisting of the term's text, its start and end offsets, and a type.
JSON (JavaScript Object Notation) is a lightweight data-interchange format that is easy for humans to read and write. ES wraps input documents, the (potentially complex) query syntax, and query results in XContent, so the data can be represented in readable form as either XML or JSON.
A RESTful API hides Lucene's complexity.
Lucene is an open-source full-text search engine library written in Java. Elasticsearch is essentially Lucene wrapped as a service (via Netty) and accessed with JSON; the underlying engine is Lucene.
Management of distributed clusters and distributed indices is built in, so unlike Solr there is no need to install ZooKeeper separately, which makes distributed deployment easier.
Overall architecture of the search system

Master (master): if the master node goes down, a new master is elected automatically, so there is no single point of failure.
Shard (shard): every shard can have 0 or more replicas (replicas); each replica is a complete copy of its shard, which improves query throughput.
Gateway: manages cluster recovery; it can be configured so that recovery only starts once a given number of nodes have joined the cluster. The gateway is used to recover any failed index: when a node crashes and restarts, Elasticsearch reads all indices and their metadata back through the gateway.
Transport: how nodes, or the cluster and its clients, communicate. TCP is used by default; HTTP (JSON), Thrift, Servlet, Memcached, ZeroMQ and other transport protocols are also supported.
Index (index)
Comparable to a database in an RDBMS. An index is divided into shards (shards), and each shard can have multiple replicas (replica).
Document (document)
Comparable to a row in an RDBMS. A document consists of multiple fields, and a field can occur several times in one document (multivalued).
Document type
One index can store objects serving many different purposes.
Mapping
Nodes and cluster
Shard
Shards are stored on different nodes, and each shard is an independent index. ES sends a query to every relevant shard and merges the results.
Replica (replica)
A replica is an exact copy of a shard; each shard has zero or more replicas. One copy is elected primary (primary) and receives index-changing operations first; the remaining copies are replica shards (replica shard).
Query DSL
Gateway

Elasticsearch uses the document's unique identifier to compute which shard the document should be stored in.
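Conceptually, the default routing rule is a sketch like the following (where `_routing` defaults to the document `_id`):

shard_num = hash(_routing) % number_of_primary_shards

This is also why the number of primary shards cannot be changed after index creation without reindexing.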
Executing a search request

Comparison with a relational database (RDBMS):

| Elasticsearch | RDBMS |
|---|---|
| Cluster | Database |
| Shard | Shard |
| Index | Table |
| Field | Column |
| Document | Row |

Analysis splits a piece of text into individual terms and normalizes them.
An analyzer consists of three parts:
character filter: pre-processing before tokenization, such as stripping HTML tags and converting special characters
tokenizer: splits the text into tokens
token filter: post-processes the tokens (lowercasing, stop words, and so on)
Built-in analyzers
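As a quick illustration, the _analyze API can exercise these stages one by one (assuming a node reachable at node1:9200):

curl -XPOST 'http://node1:9200/_analyze?pretty' -H 'Content-Type: application/json' -d '
{
  "char_filter": ["html_strip"],
  "tokenizer": "standard",
  "text": "<p>Hello <b>Elasticsearch</b>!</p>"
}'
# the html_strip character filter removes the tags before the standard
# tokenizer splits the remainder into the terms [Hello, Elasticsearch]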


Download URL

# version 7.8.1
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.8.1-linux-x86_64.tar.gz
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.8.1-linux-x86_64.tar.gz.sha512
shasum -a 512 -c elasticsearch-7.8.1-linux-x86_64.tar.gz.sha512
tar -xzf elasticsearch-7.8.1-linux-x86_64.tar.gz
cd elasticsearch-7.8.1/
ll
# Cluster name; must be identical on every node in the cluster.
cluster.name: es
# Node name; must be unique within the cluster.
node.name: es-node1
# Whether this node is master-eligible: true means it can become master, false means it cannot
node.master: true
# Whether this node stores data: true/false
node.data: true
# Where index data is stored
path.data: /opt/elasticsearch-7.8.1/data
# Where log files are stored
path.logs: /opt/elasticsearch-7.8.1/logs
# Whether to lock physical memory: true/false
bootstrap.memory_lock: true
# Listen address used to reach this es node
network.host: node1
# HTTP port exposed by es, default 9200
http.port: 9200
# Default TCP transport port, default 9300
transport.tcp.port: 9300
# Guarantees that nodes in the cluster can see N master-eligible nodes. Default is 1; for large clusters use a larger value (2-4)
discovery.zen.minimum_master_nodes: 2
# New in es7.x: addresses of master-eligible seed nodes; these can be elected master after startup
discovery.seed_hosts: ["node1:9300", "node2:9300", "node3:9300"]
discovery.zen.fd.ping_timeout: 1m
discovery.zen.fd.ping_retries: 5
# New in es7.x: required when bootstrapping a brand-new cluster, used to elect the first master
cluster.initial_master_nodes: ["es-node1", "es-node2", "es-node3"]
# Whether to allow cross-origin requests: true is required when using the head plugin
http.cors.enabled: true
# "*" allows all origins
http.cors.allow-origin: "*"
action.destructive_requires_name: true
action.auto_create_index: .security,.monitoring*,.watches,.triggered_watches,.watcher-history*
xpack.security.enabled: false
xpack.monitoring.enabled: true
xpack.graph.enabled: false
xpack.watcher.enabled: false
xpack.ml.enabled: false
[2020-08-17T05:53:23,496][WARN ][o.e.c.c.ClusterFormationFailureHelper] [es-node1] master not discovered yet, this node has not previously joined a bootstrapped (v7+) cluster, and this node must discover master-eligible nodes [node1] to bootstrap a cluster: have discovered [{es-node1}{D73-QBgpTp-Q7RgailBDiQ}{o83R1VLyQZWcs8lAG9o1Ug}{node1}{192.168.199.137:9300}{dimrt}{xpack.installed=true, transform.node=true}, {es-node2}{LghG7C8pRoamT1h9GYBtRA}{CskX0v3wTwGvUNNSp3Rg2A}{node2}{192.168.199.138:9300}{dimrt}{xpack.installed=true, transform.node=true}, {es-node3}{BPDEwYozS4OWGkQGED-P2w}{toxtMJHXT1SOSmDUYz_tjg}{node3}{192.168.199.139:9300}{dimrt}{xpack.installed=true, transform.node=true}]; discovery will continue using [192.168.199.138:9300, 192.168.199.139:9300] from hosts providers and [{es-node1}{D73-QBgpTp-Q7RgailBDiQ}{o83R1VLyQZWcs8lAG9o1Ug}{node1}{192.168.199.137:9300}{dimrt}{xpack.installed=true, transform.node=true}] from last-known cluster state; node term 0, last-accepted version 0 in term 0
cluster.initial_master_nodes must contain the es node.name values, not the server hostnames.
java.lang.IllegalStateException: transport not ready yet to handle incoming requests
No action needed.
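Once the nodes discover each other, cluster formation can be verified with the _cat APIs (hostnames as configured above):

curl 'http://node1:9200/_cat/health?v'
curl 'http://node1:9200/_cat/nodes?v'
# health should report cluster name "es" with status green and 3 nodes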
[root@es1 ~]# adduser es
Initialize a password for this user; Linux checks password complexity, but the warning can be overridden:
[root@es1 ~]# passwd es
Changing password for user es.
New password: 1q23lyc$%
BAD PASSWORD: the password fails the dictionary check - it is too simplistic/systematic
Retype new password:
passwd: all authentication tokens updated successfully.
Grant the user privileges
As root:
vi /etc/sudoers
Add: USERNAME ALL=(ALL) ALL
The following grants passwordless sudo instead:
Add: USERNAME ALL=(ALL) NOPASSWD:ALL
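Because Elasticsearch refuses to run as root (see the startup error further below), the installation directory must also belong to the new user — a typical follow-up step, assuming the paths used earlier:

chown -R es:es /opt/elasticsearch-7.8.1
su - es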
Directory layout

Configuration
elasticsearch.yml
The key settings are cluster.name and node.name; logging is configured in logging.yml.
File-descriptor limits are changed in /etc/security/limits.conf (the current value can be checked with ulimit). If the logs contain OutOfMemoryError entries, set the ES_HEAP_SIZE variable to more than 1024 MB.
Edit the configuration file:
$ vim /opt/elasticsearch/elasticsearch5.6.5/config/elasticsearch.yml
cluster.name: es (cluster name; must match across the whole cluster)
node.name: es-node1 (node name; must be unique within the cluster)
http.port: 9200 # HTTP port
network.host: node1 # bind address; use the host's static IP, never 127.0.0.1 here
path.data: /opt/elasticsearch/data # data directory
path.logs: /opt/elasticsearch/logs # log directory
discovery.zen.ping.unicast.hosts: ["node1","node2","node3"] # initial list of master-eligible nodes, used to discover nodes newly joining the cluster
bootstrap.memory_lock: true
bootstrap.system_call_filter: false # CentOS 6 does not support SecComp, and bootstrap.system_call_filter defaults to true, so it must be set to false. Note: SecComp is short for secure computing mode
http.cors.enabled: true # allow cross-origin requests, default false
http.cors.allow-origin: "*" # when CORS is enabled, "*" (the default) allows all origins
discovery.zen.minimum_master_nodes: 2 # guarantees nodes can see N master-eligible nodes. Default 1; for large clusters use a larger value (2-4)
Backup location for elasticsearch; the path must be created by hand
path.repo: ["/opt/elasticsearch/elasticseaarch-5.6.3/data/backups"]
If the elasticsearch-head plugin is installed, add the following options
http.cors.enabled: true
http.cors.allow-origin: "*"
If the x-pack plugin is installed and its basic auth should be disabled, add the following option
xpack.security.enabled: false
Adjust the JVM heap [this setting is important: size it generously in production, but never above 32 GB]
vim config/jvm.options
-Xms2g ---> -Xms512m
-Xmx2g ---> -Xmx512m
Each machine must be started individually.
Start in the foreground:
$ ./elasticsearch
Start in the background (-d runs it as a daemon):
$ ./elasticsearch -d
$ ./elasticsearch & # this way the logs are still printed to the foreground
Check the process:
$ jps
2369 Elasticsearch
Verify that the installation succeeded
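A minimal check, assuming a node listening on node1:9200:

curl 'http://node1:9200/?pretty'
# a healthy node answers with its name, cluster_name, version and the
# "You Know, for Search" tagline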

[2020-07-25T00:44:08,878][WARN ][o.e.b.ElasticsearchUncaughtExceptionHandler] [es-node1] uncaught exception in thread [main]
org.elasticsearch.bootstrap.StartupException: java.lang.RuntimeException: can not run elasticsearch as root
at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:125) ~[elasticsearch-6.1.1.jar:6.1.1]
at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:112) ~[elasticsearch-6.1.1.jar:6.1.1]
at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:86) ~[elasticsearch-6.1.1.jar:6.1.1]
at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:124) ~[elasticsearch-cli-6.1.1.jar:6.1.1]
at org.elasticsearch.cli.Command.main(Command.java:90) ~[elasticsearch-cli-6.1.1.jar:6.1.1]
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:92) ~[elasticsearch-6.1.1.jar:6.1.1]
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:85) ~[elasticsearch-6.1.1.jar:6.1.1]
Caused by: java.lang.RuntimeException: can not run elasticsearch as root
at org.elasticsearch.bootstrap.Bootstrap.initializeNatives(Bootstrap.java:104) ~[elasticsearch-6.1.1.jar:6.1.1]
at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:171) ~[elasticsearch-6.1.1.jar:6.1.1]
at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:322) ~[elasticsearch-6.1.1.jar:6.1.1]
at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:121) ~[elasticsearch-6.1.1.jar:6.1.1]
... 6 more
ERROR: [2] bootstrap checks failed
[1]: max file descriptors [4096] for elasticsearch process is too low, increase to at least [65536]
[2]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
Fix:
vim /etc/security/limits.conf
# append at the end:
* soft nofile 65536
* hard nofile 131072
vi /etc/sysctl.conf
vm.max_map_count=655360
sysctl -p
org.elasticsearch.bootstrap.StartupException: BindTransportException[Failed to bind to [9300-9400]]; nested: BindException[Cannot assign requested address];
at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:125) ~[elasticsearch-6.1.1.jar:6.1.1]
at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:112) ~[elasticsearch-6.1.1.jar:6.1.1]
at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:86) ~[elasticsearch-6.1.1.jar:6.1.1]
at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:124) ~[elasticsearch-cli-6.1.1.jar:6.1.1]
at org.elasticsearch.cli.Command.main(Command.java:90) ~[elasticsearch-cli-6.1.1.jar:6.1.1]
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:92) ~[elasticsearch-6.1.1.jar:6.1.1]
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:85) ~[elasticsearch-6.1.1.jar:6.1.1]
Check the configured host name (network.host).
[2020-07-25T02:00:20,423][INFO ][o.e.d.z.ZenDiscovery ] [es-node2] failed to send join request to master [{es-node1}{Jqt6xka3Q6e_HM7HP7eazQ}{h6izGMXsQWWW8bOWfr-3_g}{node1}{192.168.199.137:9300}], reason [RemoteTransportException[[es-node1][192.168.199.137:9300][internal:discovery/zen/join]]; nested: IllegalArgumentException[can't add node {es-node2}{Jqt6xka3Q6e_HM7HP7eazQ}{G5EKFO-WR0yZY5gbIKnDDA}{node2}{192.168.199.138:9300}, found existing node {es-node1}{Jqt6xka3Q6e_HM7HP7eazQ}{h6izGMXsQWWW8bOWfr-3_g}{node1}{192.168.199.137:9300} with the same id but is a different node instance]; ]
Cause: the copied elasticsearch directory still contains node data from instance one in its data folder; the data folder of instance two must be emptied.
Fix: delete every file under the es cluster data directory.
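For example, with the data path configured earlier (adjust to your own path.data before running — this deletes all local index data):

rm -rf /opt/elasticsearch-7.8.1/data/*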

Install Node.js
mkdir /opt/nodejs
wget https://nodejs.org/dist/v10.15.2/node-v10.15.2-linux-x64.tar.xz
# unpack
tar -xf node-v10.15.2-linux-x64.tar.xz
# create a symlink
/opt/nodejs/node-v10.15.2-linux-x64
# add to the PATH
vi ~/.bash_profile
export NODE_HOME=/opt/nodejs/node-v10.15.2-linux-x64
export PATH=$PATH:$NODE_HOME/bin
source ~/.bash_profile
# run node -v to verify the installation
yum install git npm
Clone the git project, switch into the head directory and start the head plugin
git clone https://github.com/mobz/elasticsearch-head.git
cd elasticsearch-head/
ls
npm install
# if npm install fails, run
npm install phantomjs-prebuilt@2.1.16 --ignore-scripts
nohup npm run start &
Installed successfully
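The head plugin serves its UI on port 9100 by default, so a quick check is:

curl 'http://localhost:9100'
# or open http://localhost:9100 in a browser and connect it to http://node1:9200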

Usage tips
https://blog.csdn.net/bsh_csn/article/details/53908406
# download link:
https://github.com/NLPchina/elasticsearch-sql/releases/download/5.4.1.0/es-sql-site-standalone.zip
# switch to site-server under the unpacked directory
npm install express --save
# change the service port in site-server/site_configuration.json
# restart es, then start the es-sql front end
node node-server.js &
# update npm
npm install -g npm
# docker install
docker run -d --name elasticsearch-sql -p 9680:8080 -p 9700:9300 -p 9600:9200 851279676/es-sql:6.6.2
# cerebro can be installed from an rpm or built from source:
# the rpm is used here because it is quickest
wget https://github.com/lmenezes/cerebro/releases/download/v0.8.5/cerebro-0.8.5-1.noarch.rpm
# install:
rpm -ivh cerebro-0.8.5-1.noarch.rpm
# list the package contents
rpm -ql cerebro-0.8.5-1
# which shows the config file
/usr/share/cerebro/conf/application.conf
# log files:
/var/log/cerebro
# configuration:
# start with explicit parameters:
bin/cerebro -Dhttp.port=1234 -Dhttp.address=127.0.0.1
# or start with an alternate config file:
bin/cerebro -Dconfig.file=/some/other/dir/alternate.conf
Edit the configuration:
# vim /usr/share/cerebro/conf/application.conf
# A list of known hosts
hosts = [
{
host = "http://192.168.8.102:9200"
name = "ES Cluster"
# headers-whitelist = [ "x-proxy-user", "x-proxy-roles", "X-Forwarded-For" ]
#}
# Example of host with authentication
#{
# host = "http://some-authenticated-host:9200"
# name = "Secured Cluster"
# auth = {
# username = "username"
# password = "secret-password"
# }
}
]
Starting, checking, and stopping cerebro:
# systemctl stop cerebro
# systemctl start cerebro
# systemctl status cerebro
● cerebro.service - Elasticsearch web admin tool
Loaded: loaded (/usr/lib/systemd/system/cerebro.service; enabled; vendor preset: disabled)
Active: active (running) since Thu 2019-12-12 14:36:39 CST; 6s ago
Process: 11484 ExecStartPre=/bin/chmod 755 /run/cerebro (code=exited, status=0/SUCCESS)
For easier troubleshooting, cerebro can also be run directly from the command line:
# /usr/bin/cerebro
Default startup output:
[info] play.api.Play - Application started (Prod) (no global state)
[info] p.c.s.AkkaHttpServer - Listening for HTTP on /0:0:0:0:0:0:0:0:9000
Any host on the network is allowed to log in and access it:
Log in at:
node1:9000
vi /kibana/config/kibana.yaml
server.host: "node3"
elasticsearch.url: "http://node3:9200"
# open port 5601 in the firewall
# start
./kibana -d
curl -O https://artifacts.elastic.co/downloads/kibana/kibana-7.8.1-linux-x86_64.tar.gz
curl https://artifacts.elastic.co/downloads/kibana/kibana-7.8.1-linux-x86_64.tar.gz.sha512 | shasum -a 512 -c -
tar -xzf kibana-7.8.1-linux-x86_64.tar.gz
cd kibana-7.8.1-linux-x86_64/
Commonly used analyzers include standard, keyword, whitespace, and pattern.
The standard analyzer splits a string into individual words and removes most punctuation. The keyword analyzer outputs exactly the string it receives, without any tokenization. The whitespace analyzer splits text on whitespace only. The pattern analyzer splits text using a regular expression. standard is the most commonly used.
See the official documentation for more analyzers: https://www.elastic.co/guide/en/elasticsearch/reference/2.4/analysis-standard-tokenizer.html
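A simple way to compare them is the _analyze API (assuming a local node):

curl -XPOST 'http://localhost:9200/_analyze?pretty' -H 'Content-Type: application/json' -d '
{
  "analyzer": "standard",
  "text": "The QUICK Brown-Foxes."
}'
# standard yields [the, quick, brown, foxes]; with "analyzer": "whitespace"
# the same text yields [The, QUICK, Brown-Foxes.]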
Download the Chinese (IK) analyzer: https://github.com/medcl/elasticsearch-analysis-ik
Unpack it
Enter elasticsearch-analysis-ik-master/ and build from source:
mvn clean install -Dmaven.test.skip=true
Create a directory named ik under the es plugins folder
Move the built /src/elasticsearch-analysis-ik-master/target/releases/elasticsearch-analysis-ik-7.4.0.zip into /opt/elasticsearch/plugins/
If ./elasticsearch then starts normally, the installation succeeded
git clone https://github.com/medcl/elasticsearch-analysis-ik
cd elasticsearch-analysis-ik
#git checkout tags/{version}
git checkout tags/v6.1.1
mvn clean
mvn compile
mvn package
../bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.6.2/elasticsearch-analysis-ik-6.6.2.zip
docker restart container-id
Create an index
curl -XPUT http://localhost:9200/chinese
create a mapping
curl -XPOST http://192.168.199.137:9200/chinese/fulltext/_mapping -H 'Content-Type:application/json' -d'
{
"properties": {
"content": {
"type": "text",
"analyzer": "ik_max_word",
"search_analyzer": "ik_smart"
}
}
}'
index some docs
curl -XPOST localhost:9200/chinese/fulltext/1 -H 'Content-Type:application/json' -d'
{"content":"美国留给伊拉克的是个烂摊子吗"}
'
curl -XPOST http://localhost:9200/chinese/fulltext/2 -H 'Content-Type:application/json' -d'
{"content":"公安部:各地校车将享最高路权"}
'
curl -XPOST http://localhost:9200/chinese/fulltext/3 -H 'Content-Type:application/json' -d'
{"content":"中韩渔警冲突调查:韩警平均每天扣1艘中国渔船"}
'
curl -XPOST http://localhost:9200/chinese/fulltext/4 -H 'Content-Type:application/json' -d'
{"content":"中国驻洛杉矶领事馆遭亚裔男子枪击 嫌犯已自首"}
'
ik_max_word: the most fine-grained segmentation
curl -XPOST node2:9200/chinese/_analyze -H 'Content-Type:application/json' -d'
{
"text": ["中国驻洛杉矶领事馆遭亚裔男子枪击 嫌犯已自首"],
"tokenizer": "ik_max_word"
}
'
ik_smart: performs the most coarse-grained segmentation
query with highlighting
curl -XPOST node1:9200/chinese/_search -H 'Content-Type:application/json' -d'
{
"query" : { "match" : { "content" : "中国" }},
"highlight" : {
"pre_tags" : ["<tag1>", "<tag2>"],
"post_tags" : ["</tag1>", "</tag2>"],
"fields" : {
"content" : {}
}
}
}
'
curl -XPUT http://192.168.199.137:9200/ik_test
curl -XPUT http://192.168.199.137:9200/ik_test_1
curl -XPOST http://192.168.199.137:9200/ik_test/fulltext/_mapping -H 'Content-Type:application/json' -d'
{
"fulltext": {
"_all": {
"analyzer": "ik_max_word",
"search_analyzer": "ik_max_word",
"term_vector": "no",
"store": "false"
},
"properties": {
"content": {
"type": "string",
"store": "no",
"term_vector": "with_positions_offsets",
"analyzer": "ik_max_word",
"search_analyzer": "ik_max_word",
"include_in_all": "true",
"boost": 8
}
}
}
}'
curl -XPOST http://192.168.199.137:9200/ik_test/fulltext/1 -d'
{"content":"美国留给伊拉克的是个烂摊子吗"}
'
curl -XPOST http://192.168.199.137:9200/ik_test/fulltext/2 -d'
{"content":"公安部:各地校车将享最高路权"}
'
curl -XPOST http://192.168.199.137:9200/ik_test/fulltext/3 -d'
{"content":"中韩渔警冲突调查:韩警平均每天扣1艘中国渔船"}
'
curl -XPOST http://192.168.199.137:9200/ik_test/fulltext/4 -d'
{"content":"中国驻洛杉矶领事馆遭亚裔男子枪击 嫌犯已自首"}
'
curl -XPOST http://192.168.199.137:9200/ik_test_1/fulltext/1 -d'
{"content":"美国留给伊拉克的是个烂摊子吗"}
'
curl -XPOST http://192.168.199.137:9200/ik_test_1/fulltext/2 -d'
{"content":"公安部:各地校车将享最高路权"}
'
curl -XPOST http://192.168.199.137:9200/ik_test_1/fulltext/3 -d'
{"content":"中韩渔警冲突调查:韩警平均每天扣1艘中国渔船"}
'
curl -XPOST http://192.168.199.137:9200/ik_test_1/fulltext/4 -d'
{"content":"中国驻洛杉矶领事馆遭亚裔男子枪击 嫌犯已自首"}
'
curl -XPOST http://192.168.199.137:9200/ik_test/fulltext/_search?pretty -d'{
"query" : { "match" : { "content" : "洛杉矶领事馆" }},
"highlight" : {
"pre_tags" : ["<tag1>", "<tag2>"],
"post_tags" : ["</tag1>", "</tag2>"],
"fields" : {
"content" : {}
}
}
}'
curl -XPOST http://192.168.199.137:9200/ik_test_1/fulltext/_search?pretty -d'{
"query" : { "match" : { "content" : "洛杉矶领事馆" }},
"highlight" : {
"pre_tags" : ["<tag1>", "<tag2>"],
"post_tags" : ["</tag1>", "</tag2>"],
"fields" : {
"content" : {}
}
}
}'
# error:
# "analyzer [ik_max_word] not found for field [content]"
Install the ik plugin on all nodes and restart them
Reference: [超详细的Elasticsearch高性能优化实践 - 不言不语技术 - 博客园 (cnblogs.com)](https://www.cnblogs.com/hzcya1995/p/13312071.html)
Swapping is the graveyard of performance
For better ES performance, disabling swap is strongly recommended:
sudo swapoff -a
# alternatively, flush swap once (move its contents back into memory and empty it)
swapoff -a && swapon -a
sysctl -p (applies the change without a reboot)

# /etc/sysctl.conf
vm.swappiness = 1 // 0-100; higher values make the kernel swap more eagerly
# Note: a swappiness of 1 is better than 0, because on some kernel versions swappiness=0 can trigger the OOM killer
# elasticsearch.yml
bootstrap.mlockall: true
| Role | Description | Storage | Memory | CPU | Network |
|---|---|---|---|---|---|
| Data node | Stores and retrieves data | Very high | High | High | Medium |
| Master node | Manages cluster state | Low | Low | Low | Low |
| Ingest node | Transforms incoming data | Low | Medium | High | Medium |
| Machine learning node | Machine learning | Low | Very high | Very high | Medium |
| Coordinating node | Routes requests and merges search results | Low | Medium | Medium | Medium |
Role isolation
# dedicated master node
node.master=true
node.data=false
# dedicated data node
node.master=false
node.data=true
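A coordinating-only node simply disables both roles (and ingest), using the same pre-7.x settings:

node.master=false
node.data=false
node.ingest=false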

# reference links
https://www.elastic.co/cn/blog/how-many-shards-should-i-have-in-my-elasticsearch-cluster
https://blog.csdn.net/laoyang360/article/details/103545432
Elasticsearch indexing;
configuring index structure mappings and knowing which field types are available;
using bulk indexing to speed up the indexing process;
extending the index structure with additional internal information;
understanding, configuring and controlling segment merging;
understanding how routing works and configuring it as needed.
When an index is created with the defaults from elasticsearch.yml, it ends up with 5 shards and 1 replica: 10 Lucene indices spread across the cluster, since every shard has its own copy — effectively 5 shards plus 5 corresponding shard copies.
Creating a document without creating the index first:
curl -H "Content-Type: application/json" -XPUT http://node1:9200/blog/article/1 -d '{"title": "New
version of Elasticsearch released!", "content": "...", "tags":
["announce", "elasticsearch", "release"] }'
# index named blog, type article, custom id "1": add a new document
curl -H 'Content-Type:application/json' -XPUT http://localhost:9600/blog/article/1 -d '
{
"id": "1",
"title": "New version of Elasticsearch released!",
"content": "Version 1.0 released today!",
"priority": 10,
"tags": ["announce", "elasticsearch", "release"]
}'
curl -H 'Content-Type:application/json' -XPOST http://localhost:9600/blog/_search -d '{
"from": 0,
"size": 0,
"_source": {
"includes": [
"id",
"title",
"COUNT"
],
"excludes": []
},
"stored_fields": [
"id",
"title"
],
"aggregations": {
"num": {
"value_count": {
"field": "_index"
}
}
}
}'
curl -H 'Content-Type:application/json' -XPUT http://node1:9200/blog/article/2 -d '
{
"id": "2",
"title": "Create Index Test",
"content": "Success",
"priority": 100,
"tags": ["Test", "elasticsearch", "curl"]
}'

Create an index
curl -XPUT http://192.168.199.136:9200/blog/

Changing automatic index creation
#vim elasticsearch.yml
action.auto_create_index: false
Settings for a new index
curl -H "Content-Type: application/json" -XPUT http://node1:9200/test/ -d '{
"settings" : {
"number_of_shards" : 1,
"number_of_replicas" : 2
}
}'

Delete an index
curl -XDELETE http://node:9200/posts

Elasticsearch guesses the document structure from its JSON:
{
"field1": 10, # field1 is detected as a number (long)
"field2": "10" # field2 is detected as a string
}
curl -XPUT http://192.168.199.136:9200/blog/?pretty -d '{
"mappings" : {
"article": {
"numeric_detection" : true
}
}
}'
# define the date formats that can be recognized
curl -XPUT 'http://192.168.199.136:9200/blog/' -d '{
"mappings" : {
"article" : {
"dynamic_date_formats" : ["yyyy-MM-dd hh:mm"]
}
}
}'
Disabling field type guessing
curl -XPUT 'http://192.168.199.136:9200/blog/' -d '{
"mappings" : {
"article" : {
"dynamic" : "false",
"properties" : {
"id" : { "type" : "string" },
"content" : { "type" : "string" },
"author" : { "type" : "string" }
}
}
}
}'
Schema mapping (or simply: mapping) is used to define the index structure.
A post has the following structure:
a unique identifier;
a name;
a publication date;
contents.
post.json:
{
"mappings": {
"post": {
"properties": {
"id": {"type":"long", "store":"yes",
"precision_step":"0" },
"name": {"type":"string", "store":"yes",
"index":"analyzed" },
"published": {"type":"date", "store":"yes",
"precision_step":"0" },
"contents": {"type":"string", "store":"no",
"index":"analyzed" }
}
}
}
}
curl -XPOST 'http://192.168.199.136:9200/posts' -d @posts.json
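The result can be checked by fetching the mapping back (same host as above):

curl -XGET 'http://192.168.199.136:9200/posts/_mapping?pretty'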
(1) settings: index configuration such as shard count, replica count, translog sync conditions, refresh policy, and so on;
(2) mappings: the internal structure of the index, mainly:
①
_all: the All Field. When enabled, the _all field aggregates the contents of every field, so searches do not have to name a field — multiple fields are searched at once. Enable it with "_all": {"enabled": true}. From ES 6.0 on,
the _all field is disabled; as a replacement, an all-field can be built with copy_to. ②
_source: the Source Field. ES keeps a copy of the source data for every document. If disabled ("_source": {"enabled": false}), queries return only document IDs and the content has to be fetched from the index again via the Fields mechanism, which is very inefficient. When enabled the index is larger; it can be compressed (Compress), and includes/excludes can restrict which fields are stored in _source and which are not; ③
properties: the most important part — the definition of the index structure and document fields.
{
"order": 0, // template priority
"template": "sample_info*", // index-name pattern the template matches
"settings": {...}, // index settings
"mappings": {...}, // mapping definitions for the index's fields
"aliases": {...} // index aliases
}
Index templates are typically used with time-series indices.
— In other words, if a new index has to be created at regular intervals, the template only needs to be configured once; later indices pick up its settings and mappings automatically instead of being configured each time.
Index templates are usually combined with index aliases. Notes on aliases will be added later.
Example: creating an index template for a shop:
(1) Versions before ES 6.0:
PUT _template/shop_template
{
"template": "shop*", // matched via "shop*"
"order": 0, // template weight; when several templates match, the higher value wins
"settings": {
"number_of_shards": 1 // shard count; other settings can be defined here too
},
"aliases": {
"alias_1": {} // alias for the index
},
"mappings": {
"_default_": { // default settings, no longer supported from ES 6.0
"_source": { "enabled": false }, // whether to keep the original field values
"_all": { "enabled": false }, // disable the _all field
"dynamic": "strict" // only allow defined fields; disable automatic type inference
},
"type1": { // default document type set to type1; from ES 6.0 only one type is supported, so this is unnecessary
"_source": {"enabled": false},
"properties": { // field mappings
"@timestamp": { // mapping for a specific field
"type": "date",
"format": "yyyy-MM-dd HH:mm:ss"
},
"@version": {
"doc_values": true,
"index": "not_analyzed", // not analyzed
"type": "string" // string type
},
"logLevel": {
"type": "long"
}
}
}
}
}
(2) ES 6.0 and later:
PUT _template/shop_template
{
"index_patterns": ["shop*", "bar*"], // matches "shop*" and "bar*"; the template field is deprecated
"order": 0, // template weight; when several templates match, the higher value wins
"settings": {
"number_of_shards": 1 // shard count; other settings can be defined here too
},
"aliases": {
"alias_1": {} // alias for the index
},
"mappings": {
// from ES 6.0 only one type is supported, named "_doc"
"_doc": {
"_source": { // whether to keep the original field values
"enabled": false
},
"properties": { // field mappings
"@timestamp": { // mapping for a specific field
"type": "date",
"format": "yyyy-MM-dd HH:mm:ss"
},
"@version": {
"doc_values": true,
"index": "false", // false: not indexed
"type": "text" // text type
},
"logLevel": {
"type": "long"
}
}
}
}
}
curl -XPUT http://114.115.200.44:9600/_template/alarm -H "Content-Type: application/json" -d '
{
"order": 0,
"template": "alarm*",
"settings": {
"index": {
"max_result_window": "10100",
"number_of_shards": "1",
"number_of_replicas": "0"
}
},
"mappings": {
"_doc": {
"_source": {},
"_all": {
"enabled": false
},
"properties": {
"alarm_name": {
"type": "text",
"fielddata": true
},
"alarm_type": {
"type": "text",
"fielddata": true
},
"id": {
"type": "text",
"fielddata": true
},
"alarm_content": {
"type": "text",
"fielddata": true
}
}
}
},
"aliases": {}
}'
# keyword
curl -XPUT http://114.115.200.44:9600/_template/log -H "Content-Type: application/json" -d '
{
"order": 0,
"template": "log*",
"settings": {
"index": {
"max_result_window": "10100",
"number_of_shards": "1",
"number_of_replicas": "0"
}
},
"mappings": {
"_doc": {
"_source": {},
"_all": {
"enabled": false
},
"properties": {
"log_name": {
"type": "keyword",
"fielddata": true
},
"log_type": {
"type": "keyword",
"fielddata": true
},
"id": {
"type": "keyword",
"fielddata": true
},
"log_content": {
"type": "keyword",
"fielddata": true
}
}
}
},
"aliases": {}
}'
put /test
{
"settings":{
"number_of_shards":3,
"number_of_replicas":2
},
"mappings":{
"properties":{
"id":{"type":"long"},
"name":{"type":"text","analyzer":"ik_smart"},
"text":{"type":"text","analyzer":"ik_max_word"}
}
}
}
Notes:
Mapping changes made directly on an index take priority over settings from an index template;
when an index matches multiple templates and their settings conflict, the template weight (the value of the order property) decides — higher values win; order defaults to 0.
The API changed substantially after ES 6.0, so pay close attention to version differences.
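A minimal sketch of the order rule, using two hypothetical templates that both match shop*:

PUT _template/shop_base
{ "template": "shop*", "order": 0, "settings": { "number_of_shards": 2 } }

PUT _template/shop_override
{ "template": "shop*", "order": 1, "settings": { "number_of_shards": 5 } }

# a new index named shop_2020 gets 5 shards: order 1 overrides order 0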
curl -XPUT http://114.115.200.44:9600/alarm/_doc/1 -H "Content-Type: application/json" -d '
{
"id": "6",
"alarm_name": "sql injection",
"alarm_type": "exploit"
}'
(1) Viewing templates:
GET _template // list all templates
GET _template/temp* // templates matching a wildcard
GET _template/temp1,temp2 // several templates at once
GET _template/shop_template // one specific template
(2) Checking whether a template exists:
HEAD _template/shop_tem
Result:
a) if it exists, the response is:
200 - OK
b) if it does not exist, the response is:
404 - Not Found
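With curl, the HEAD request is sent with -I (template name from the example above):

curl -I 'http://node1:9200/_template/shop_template'
# HTTP/1.1 200 OK if the template exists, 404 Not Found otherwise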
Deleting a template:
DELETE _template/shop_template // removes the template created above
If the template does not exist, the following error is thrown:
{
"error" : {
"root_cause" : [
{
"type" : "index_template_missing_exception",
"reason" : "index_template [shop_temp] missing"
}
],
"type" : "index_template_missing_exception",
"reason" : "index_template [shop_temp] missing"
},
"status" : 404
}
If only exact matching matters, define the field as test_field: {"type": "keyword"}.
— The keyword type performs better than text and also saves disk space.
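A minimal mapping sketch (ES 7.x-style, hypothetical index name):

PUT /exact_match_test
{
  "mappings": {
    "properties": {
      "test_field": { "type": "keyword" }
    }
  }
}
# a term query on test_field now matches the stored value exactly, with no analysis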
An inverted index improves query efficiency. It is built by:

tokenizing the text into terms

recording term frequencies, which are used to rank the search results

recording term position information
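A tiny illustration of the resulting postings, assuming two documents doc1 = "quick brown fox" and doc2 = "brown dog":

| Term | Postings (doc: positions) | Frequency |
|---|---|---|
| brown | doc1: 1; doc2: 0 | 2 |
| quick | doc1: 0 | 1 |
| fox | doc1: 2 | 1 |
| dog | doc2: 1 | 1 |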

curl -X POST "http://192.168.16.65:9211/blog/_search" -H 'Content-Type: application/json' -d'
{
"query": {
"match_phrase": {
"title": "小明今晚真的不加班"
}
}
}
'
curl -X POST "http://192.168.16.65:9211/blog/_delete_by_query" -H 'Content-Type: application/json' -d'
{
"query":{
"match":{
"title":"小明今晚真的不加班"
}
}
}
'
curl -X POST "http://node1:9200/_search?format=json&pretty" -H 'Content-Type: application/json' -d'
{
"query":{
"match":{
"message": "test"
}
}
}
'
Multi-field queries
{
"dis_max": {
"queries": [
{
"match": {
"title": {
"query": "Quick brown fox",
"minimum_should_match": "30%"
}
}
},
{
"match": {
"body": {
"query": "Quick brown fox",
"minimum_should_match": "30%"
}
}
}
],
"tie_breaker": 0.3
}
}
{
"multi_match": {
"query": "Quick brown fox",
"type": "best_fields",
"fields": [ "title", "body" ],
"tie_breaker": 0.3,
"minimum_should_match": "30%"
}
}
curl -X POST "localhost:9200/suricata/_delete_by_query?pretty" -H 'Content-Type:application/json' -d '{
"query": {
"range": {
"timestamp": {
"gte": "now-5d",
"lte": "now-1d",
"format": "epoch_millis"
}
}
}
}'
https://www.elastic.co/guide/en/elasticsearch/reference/current/indices.html
curl
curl -X<VERB> '<PROTOCOL>://<HOST>/<PATH>?<QUERY_STRING>' -d '<BODY>'
GET retrieves a resource; POST creates one (and can also update); PUT updates; HEAD checks existence; DELETE removes a resource.
PROTOCOL: http or https
HOST: hostname
PORT: default 9200
QUERY_STRING: optional request parameters; e.g. pretty makes the returned JSON easier to read
BODY: a JSON-formatted request body
curl -i shows the response headers
curl -v shows the full HTTP exchange
Count the documents in the cluster
curl -H "Content-Type: application/json" -XGET 'http://node1:9200/_count?pretty' -d '
{
"query": {
"match_all": {}
}
}'
{
"count" : 0,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
}
}

View the nodes
curl 'http://node1:9200/_cat/nodes'

View the shards
curl 'http://node1:9200/_cat/shards'

List all indices
curl 'http://node1:9200/_cat/indices'

View index information
curl -XGET 'http://node1:9200/blog/_search'

Turn on verbose output:
# show the master node
GET /_cat/master?v
curl -XGET 'http://node1:9200/_cat/master?v'

GET /_cat/master?help
curl -XGET 'http://node1:9200/_cat/master?help'
#output
id | | node id
host | h | host name
ip | | ip address
node | n | node name

curl -XGET 'http://node1:9200/_cat/nodes?help'
id | id,nodeId | unique node id
pid | p | process id
ip | i | ip address
port | po | bound transport port
http_address | http | bound http address
version | v | es version
build | b | es build hash
jdk | j | jdk version
disk.total | dt,diskTotal | total disk space
disk.used | du,diskUsed | used disk space
disk.avail | d,da,disk,diskAvail | available disk space
disk.used_percent | dup,diskUsedPercent | used disk space percentage
heap.current | hc,heapCurrent | used heap
heap.percent | hp,heapPercent | used heap ratio
heap.max | hm,heapMax | max configured heap
ram.current | rc,ramCurrent | used machine memory
ram.percent | rp,ramPercent | used machine memory ratio
ram.max | rm,ramMax | total machine memory
file_desc.current | fdc,fileDescriptorCurrent | used file descriptors
file_desc.percent | fdp,fileDescriptorPercent | used file descriptor ratio
file_desc.max | fdm,fileDescriptorMax | max file descriptors
cpu | cpu | recent cpu usage
load_1m | l | 1m load avg
load_5m | l | 5m load avg
load_15m | l | 15m load avg
uptime | u | node uptime
node.role | r,role,nodeRole | m:master eligible node, d:data node, i:ingest node, -:coordinating node only
master | m | *:current master
name | n | node name
completion.size | cs,completionSize | size of completion
fielddata.memory_size | fm,fielddataMemory | used fielddata cache
fielddata.evictions | fe,fielddataEvictions | fielddata evictions
query_cache.memory_size | qcm,queryCacheMemory | used query cache
query_cache.evictions | qce,queryCacheEvictions | query cache evictions
request_cache.memory_size | rcm,requestCacheMemory | used request cache
request_cache.evictions | rce,requestCacheEvictions | request cache evictions
request_cache.hit_count | rchc,requestCacheHitCount | request cache hit counts
request_cache.miss_count | rcmc,requestCacheMissCount | request cache miss counts
flush.total | ft,flushTotal | number of flushes
flush.total_time | ftt,flushTotalTime | time spent in flush
get.current | gc,getCurrent | number of current get ops
get.time | gti,getTime | time spent in get
get.total | gto,getTotal | number of get ops
get.exists_time | geti,getExistsTime | time spent in successful gets
get.exists_total | geto,getExistsTotal | number of successful gets
get.missing_time | gmti,getMissingTime | time spent in failed gets
get.missing_total | gmto,getMissingTotal | number of failed gets
indexing.delete_current | idc,indexingDeleteCurrent | number of current deletions
indexing.delete_time | idti,indexingDeleteTime | time spent in deletions
indexing.delete_total | idto,indexingDeleteTotal | number of delete ops
indexing.index_current | iic,indexingIndexCurrent | number of current indexing ops
indexing.index_time | iiti,indexingIndexTime | time spent in indexing
indexing.index_total | iito,indexingIndexTotal | number of indexing ops
indexing.index_failed | iif,indexingIndexFailed | number of failed indexing ops
merges.current | mc,mergesCurrent | number of current merges
merges.current_docs | mcd,mergesCurrentDocs | number of current merging docs
merges.current_size | mcs,mergesCurrentSize | size of current merges
merges.total | mt,mergesTotal | number of completed merge ops
merges.total_docs | mtd,mergesTotalDocs | docs merged
merges.total_size | mts,mergesTotalSize | size merged
merges.total_time | mtt,mergesTotalTime | time spent in merges
refresh.total | rto,refreshTotal | total refreshes
refresh.time | rti,refreshTime | time spent in refreshes
refresh.listeners | rli,refreshListeners | number of pending refresh listeners
script.compilations | scrcc,scriptCompilations | script compilations
script.cache_evictions | scrce,scriptCacheEvictions | script cache evictions
search.fetch_current | sfc,searchFetchCurrent | current fetch phase ops
search.fetch_time | sfti,searchFetchTime | time spent in fetch phase
search.fetch_total | sfto,searchFetchTotal | total fetch ops
search.open_contexts | so,searchOpenContexts | open search contexts
search.query_current | sqc,searchQueryCurrent | current query phase ops
search.query_time | sqti,searchQueryTime | time spent in query phase
search.query_total | sqto,searchQueryTotal | total query phase ops
search.scroll_current | scc,searchScrollCurrent | open scroll contexts
search.scroll_time | scti,searchScrollTime | time scroll contexts held open
search.scroll_total | scto,searchScrollTotal | completed scroll contexts
segments.count | sc,segmentsCount | number of segments
segments.memory | sm,segmentsMemory | memory used by segments
segments.index_writer_memory | siwm,segmentsIndexWriterMemory | memory used by index writer
segments.version_map_memory | svmm,segmentsVersionMapMemory | memory used by version map
segments.fixed_bitset_memory | sfbm,fixedBitsetMemory | memory used by fixed bit sets for nested object field types and type filters for types referred in _parent fields
suggest.current | suc,suggestCurrent | number of current suggest ops
suggest.time | suti,suggestTime | time spend in suggest
suggest.total | suto,suggestTotal | number of suggest ops
Selecting output columns (headers):
GET /_cat/nodes?h=ip,port,heapPercent,name
curl -XGET 'http://node1:9200/_cat/nodes?h=ip,port,heapPercent,name'

b: bytes
s: sort
v: verbose
GET /_cat/indices?bytes=b&s=store.size:desc&v
curl -XGET 'http://node1:9200/_cat/indices?bytes=b&s=store.size:desc&v'
#output
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
green open blog pMV25OdfSxmRTO9lUWv9Gw 5 1 2 0 25122 12561
green open cq OqLBJsIuT8aXmVTYBe-Ygg 5 1 0 0 1864 699
curl 'node1:9200/_cat/indices?format=json&pretty'
curl 'node1:9200/_cat/indices?pretty' -H "Accept: application/json"
curl 'node1:9200/_cat/indices?pretty'

yaml
[es@node1 root]$ curl 'node1:9200/_cat/indices?pretty' -H "Accept: application/yaml"
---
- health: "green"
status: "open"
index: "cq"
uuid: "OqLBJsIuT8aXmVTYBe-Ygg"
pri: "5"
rep: "1"
docs.count: "0"
docs.deleted: "0"
store.size: "2.2kb"
pri.store.size: "1.1kb"
- health: "green"
status: "open"
index: "blog"
uuid: "pMV25OdfSxmRTO9lUWv9Gw"
pri: "5"
rep: "1"
docs.count: "2"
docs.deleted: "0"
store.size: "24.5kb"
pri.store.size: "12.2kb"
smile
curl 'node1:9200/_cat/indices?pretty' -H "Accept: application/smile"
:)
▒▒healthDgreen▒statusCopen▒indexAcq▒uuidUOqLBJsIuT8aXmVTYBe-Ygg▒pri@5▒rep@1▒docs.count@0▒docs.deleted@0▒store.sizeD2.2kb▒pri.store.sizeD1.1kb▒▒@DgreenACopenBCblogCUpMV25OdfSxmRTO9lUWv9GwD@5E@1F@2G@0HE24.5kbIE12.2kb▒▒[es@node1 root]$
cbor
[es@node1 root]$ curl 'node1:9200/_cat/indices?pretty' -H "Accept: application/cbor"
▒▒fhealthegreenfstatusdopeneindexbcqduuidvOqLBJsIuT8aXmVTYBe-Yggcpria5crepa1jdocs.counta0ldocs.deleteda0jstore.sizee2.2kbnpri.store.sizee1.1kb▒▒fhealthegreenfstatusdopeneindexdblogduuidvpMV25OdfSxmRTO9lUWv9Gwcpria5crepa1jdocs.counta2ldocs.deleteda0jstore.sizef24.5kbnpri.store.sizef12.2kb▒▒[es@node1 root]$
GET _cat/templates?v&s=order:desc,index_patterns
curl 'node1:9200/_cat/templates?v&s=order:desc,index_patterns'
/_cat/allocation
/_cat/shards
/_cat/shards/{index}
/_cat/master
/_cat/nodes
/_cat/tasks
/_cat/indices
/_cat/indices/{index}
/_cat/segments
/_cat/segments/{index}
/_cat/count
/_cat/count/{index}
/_cat/recovery
/_cat/recovery/{index}
/_cat/health
/_cat/pending_tasks
/_cat/aliases
/_cat/aliases/{alias}
/_cat/thread_pool
/_cat/thread_pool/{thread_pools}
/_cat/plugins
/_cat/fielddata
/_cat/fielddata/{fields}
/_cat/nodeattrs
/_cat/repositories
/_cat/snapshots/{repository}
/_cat/templates
PUT /twitter
curl -XPUT 'http://node1:9200/twitter'
path parameter: the index name
Naming rules: lowercase only; must not contain \, /, *, ?, ", <, >, |, a space, a comma, or #; colons (:) were once allowed but are deprecated and unsupported in 7.0+; names must not start with -, _, or +, and must not be . or .. — names starting with a dot are deprecated, except for hidden indices and internal indices managed by plugins.
query parameters:
include_type_name: boolean; if true, a mapping type is expected in the mapping body
wait_for_active_shards: the number of shard copies that must be active before the operation continues
PUT /test?wait_for_active_shards=2
PUT /test
{
"settings": {
"index.write.wait_for_active_shards": "2"
}
}
master_timeout: how long to wait for the master node; default 30s
timeout: how long to wait for a response
query body:
aliases: index aliases
PUT /test
{
"aliases": {
"alias_1": {},
"alias_2": {
"filter": {
"term": { "user": "kimchy" }
},
"routing": "kimchy"
}
}
}
mappings: mapping definitions
PUT /test
{
"settings": {
"number_of_shards": 1
},
"mappings": {
"properties": {
"field1": { "type": "text" }
}
}
}
settings: index configuration options
curl -H 'Content-Type:application/json' -XPUT http://node1:9200/twitter -d '
{
"settings": {
"index": {
"number_of_shards": 3,
"number_of_replicas": 2
}
}
}'
curl -H 'Content-Type:application/json' -XPUT http://node1:9200/blog/article/2 -d '
{
"id": "2",
"title": "Create Index Test",
"content": "Success",
"priority": 100,
"tags": ["Test", "elasticsearch", "curl"]
}'
curl -H 'Content-Type:application/json' -XPUT http://node1:9200/twitter/article/1 -d '
{
"id": "1",
"title": "Create Index Test",
"content": "Success",
"priority": 100,
"tags": ["Test", "elasticsearch", "curl"]
}'
curl -XDELETE 'http://node1:9200/twitter'
query parameters
allow_no_indices: if a wildcard expression or _all resolves only to missing or closed indices, the request does not return an error
expand_wildcards: which index states wildcards expand to
all: expand to open and closed indices, including hidden ones
open: expand to open indices only
close: expand to closed indices only
hidden: include hidden indices in the expansion; must be combined with open, close, or both
none: wildcard expressions are not accepted
GET /<index>
curl node2:9200/_all/_search?pretty
HEAD /twitter
POST /twitter/_close
[es@node2 bin]$ curl -XPOST 'http://node1:9200/blog/_close'
{"acknowledged":true}
# close all indices
curl -XPOST node2:9200/_all/_close
{"acknowledged":true}
POST /twitter/_open
# open
curl -XPOST 'http://node2:9200/blog/_open'
{"acknowledged":true,"shards_acknowledged":true}
# open all indices
curl -XPOST 'http://node2:9200/_all/_open'
POST /twitter/_shrink/shrunk-twitter-index
curl -H 'Content-Type:application/json' -XPUT node2:9200/blog/_settings -d '
{
"settings": {
"index.number_of_replicas": 0,
"index.routing.allocation.require._name": "shrink_blog_index",
"index.blocks.write": true
}
}'
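Once the shards have relocated and the index is write-blocked, the shrink itself is a sketch like the following (hypothetical target index name):

curl -H 'Content-Type:application/json' -XPOST node2:9200/blog/_shrink/shrunk-blog-index -d '
{
  "settings": {
    "index.number_of_replicas": 1,
    "index.number_of_shards": 1,
    "index.blocks.write": null
  }
}'
# the target shard count must be a factor of the source index's shard count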
# If no filters are given, the default is to select all nodes
GET /_nodes
# Explicitly select all nodes
GET /_nodes/_all
# Select just the local node
GET /_nodes/_local
# Select the elected master node
GET /_nodes/_master
# Select nodes by name, which can include wildcards
GET /_nodes/node_name_goes_here
GET /_nodes/node_name_goes_*
# Select nodes by address, which can include wildcards
GET /_nodes/10.0.0.3,10.0.0.4
GET /_nodes/10.0.0.*
# Select nodes by role
GET /_nodes/_all,master:false
GET /_nodes/data:true,ingest:true
GET /_nodes/coordinating_only:true
GET /_nodes/master:true,voting_only:false
# Select nodes by custom attribute (e.g. with something like `node.attr.rack: 2` in the configuration file)
GET /_nodes/rack:2
GET /_nodes/ra*:2
GET /_nodes/ra*:2*
GET /_cluster/state/<metrics>/<index>
(Optional, string) A comma-separated list of the following options:
_all
Shows all metrics.
blocks
Shows the blocks part of the response.
[root@node1 ~]# curl node2:9200/_cluster/state/blocks?pretty
{
"cluster_name" : "es",
"compressed_size_in_bytes" : 1691,
"blocks" : {
"indices" : {
"blog" : {
"8" : {
"description" : "index write (api)",
"retryable" : false,
"levels" : [
"write"
]
}
}
}
}
}
master_node
Shows the elected master_node part of the response.
[root@node1 ~]# curl node2:9200/_cluster/state/master_node?pretty
{
"cluster_name" : "es",
"compressed_size_in_bytes" : 1691,
"master_node" : "Jqt6xka3Q6e_HM7HP7eazQ"
}
metadata
Shows the metadata part of the response. If you supply a comma separated list of indices, the returned output will only contain metadata for these indices.
nodes
Shows the nodes part of the response.
[root@node1 ~]# curl node2:9200/_cluster/state/nodes?pretty
{
"cluster_name" : "es",
"compressed_size_in_bytes" : 1691,
"nodes" : {
"4hD7E_O7RyeaqjwUNBuxDQ" : {
"name" : "es-node3",
"ephemeral_id" : "5zuQOwjhQ1iD-yUVnNmOGQ",
"transport_address" : "192.168.199.139:9300",
"attributes" : { }
},
"zMljaDu_Te-x5PIhFA-qEg" : {
"name" : "es-node2",
"ephemeral_id" : "jIiW4u0JTHuc2iEKTQ3hvw",
"transport_address" : "192.168.199.138:9300",
"attributes" : { }
},
"Jqt6xka3Q6e_HM7HP7eazQ" : {
"name" : "es-node1",
"ephemeral_id" : "0wcDvzymSRefsPfwBuo6LQ",
"transport_address" : "192.168.199.137:9300",
"attributes" : { }
}
}
}
routing_nodes
Shows the routing_nodes part of the response.
routing_table
Shows the routing_table part of the response. If you supply a comma separated list of indices, the returned output will only contain the routing table for these indices.
version
Shows the cluster state version.
curl node2:9200/_cluster/state/version?pretty
{
"cluster_name" : "es",
"compressed_size_in_bytes" : 1691,
"version" : 70,
"state_uuid" : "6OFUpDwAQamUugC6UDoOtw"
}
<index>
(Optional, string) Comma-separated list or wildcard expression of index names used to limit the request.
GET /_cluster/state/metadata,routing_table/foo,bar
GET /_cluster/state/_all/foo,bar
PUT /_cluster/settings
examples
# presistent update
PUT /_cluster/settings
{
"persistent" : {
"indices.recovery.max_bytes_per_sec" : "50mb"
}
}
# transient update
PUT /_cluster/settings?flat_settings=true
{
"transient" : {
"indices.recovery.max_bytes_per_sec" : "20mb"
}
}
# response
{
...
"persistent" : { },
"transient" : {
"indices.recovery.max_bytes_per_sec" : "20mb"
}
}
# reset a setting
PUT /_cluster/settings
{
"transient" : {
"indices.recovery.max_bytes_per_sec" : null
}
}
# response
{
...
"persistent" : {},
"transient" : {}
}
### dynamic indices.recovery settings
PUT /_cluster/settings
{
"transient" : {
"indices.recovery.*" : null
}
}
Set a disk-usage watermark for write protection
curl -Ss -H 'Content-Type: application/json' -XPUT 'http://localhost:9200/_cluster/settings' -d '{"transient": {"cluster.routing.allocation.disk.watermark.low": "88%"}}'
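The related thresholds can also be set together — a sketch with illustrative percentages:

curl -Ss -H 'Content-Type: application/json' -XPUT 'http://localhost:9200/_cluster/settings' -d '{
  "transient": {
    "cluster.routing.allocation.disk.watermark.low": "85%",
    "cluster.routing.allocation.disk.watermark.high": "90%",
    "cluster.routing.allocation.disk.watermark.flood_stage": "95%"
  }
}'
# low: stop allocating new shards to the node; high: start relocating shards away;
# flood_stage: indices with a shard on the node become read-only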
GET /_cluster/pending_tasks?pretty

Cancel a running task
curl -XPOST localhost:9200/_tasks/ID/_cancel
GET /_tasks/<task_id>
GET /_tasks
GET _tasks
GET _tasks?nodes=nodeId1,nodeId2
GET _tasks?nodes=nodeId1,nodeId2&actions=cluster:*

Query task details
curl -X GET 'localhost:9200/_tasks?pretty&detailed&actions=*reindex,*byquery'

curl -XPOST localhost:9200/_tasks/ID/_cancel
GET /_nodes/stats
GET /_nodes/<node_id>/stats
GET /_nodes/stats/<metric>
GET /_nodes/<node_id>/stats/<metric>
GET /_nodes/stats/<metric>/<index_metric>
GET /_nodes/<node_id>/stats/<metric>/<index_metric>
<metric>
adaptive_selection
Statistics about adaptive replica selection.
breaker
Statistics about the field data circuit breaker.
discovery
Statistics about the discovery.
fs
File system information, data path, free disk space, read/write stats.
http
HTTP connection information.
indices
Indices stats about size, document count, indexing and deletion times, search times, field cache size, merges and flushes.
ingest
Statistics about ingest preprocessing.
jvm
JVM stats, memory pool information, garbage collection, buffer pools, number of loaded/unloaded classes.

os
Operating system stats, load average, mem, swap.

process
Process statistics, memory consumption, cpu usage, open file descriptors.
thread_pool
Statistics about each thread pool, including current size, queue and rejected tasks.
transport
Transport statistics about sent and received bytes in cluster communication.
<index_metric>
completion
docs

fielddata
flush
get
indexing
merge
query_cache
recovery
refresh
request_cache
search
curl 10.218.80.41:9200/_nodes/stats/indices/search?pretty

segments
store
translog
warmer
Routing
curl -H 'Content-Type:application/json' -XPOST node2:9200/blog/_doc?routing=Test -d'
{
"title": "Create Index Test",
"post_date" : "2009-11-15T14:12:12",
"message" : "trying out Elasticsearch"
}'
curl -H 'Content-Type:application/json' -XPOST node2:9200/blog/_search?routing=kimchy -d '
{
"query": {
"bool": {
"must": {
"query_string": {
"query": "some query string here"
}
},
"filter": {
"term": { "user": "kimchy" }
}
}
}
}'
docs: the docs section of the response describes the indexed documents
"docs" : {
"count" : 4, # number of documents
"deleted" : 0
}
store: information about storage
"store" : {
"size_in_bytes" : 6003,
"throttle_time_in_millis" : 0
}
indexing, get, and search: indexing and deletion operations, real-time get, and search
"indexing" : {
"index_total" : 1,
"index_time_in_millis" : 32,
"index_current" : 0,
"index_failed" : 0,
"delete_total" : 0,
"delete_time_in_millis" : 0,
"delete_current" : 0,
"noop_update_total" : 0,
"is_throttled" : false,
"throttle_time_in_millis" : 0
},
"get" : {
"total" : 0,
"time_in_millis" : 0,
"exists_total" : 0,
"exists_time_in_millis" : 0,
"missing_total" : 0,
"missing_time_in_millis" : 0,
"current" : 0
},
"search" : {
"open_contexts" : 0,
"query_total" : 12,
"query_time_in_millis" : 39,
"query_current" : 0,
"fetch_total" : 2,
"fetch_time_in_millis" : 12,
"fetch_current" : 0,
"scroll_total" : 0,
"scroll_time_in_millis" : 0,
"scroll_current" : 0,
"suggest_total" : 0,
"suggest_time_in_millis" : 0,
"suggest_current" : 0
},
Additional sections:
merges: information about Lucene segment merges.
refresh: information about refresh operations.
flush: information about flushes.
warmer: information about warmers and how long they ran.
filter_cache: filter cache statistics.
id_cache: identifier cache statistics.
fielddata: field data cache statistics.
percolate: information about percolator usage.
completion: information about the completion suggester.
segments: information about Lucene segments.
translog: transaction log counts and sizes.
curl node1:9200/blog/_stats?pretty
{
"_shards" : {
"total" : 10,
"successful" : 10,
"failed" : 0
},
"_all" : {
"primaries" : {
"docs" : {
"count" : 1,
"deleted" : 0
},
"store" : {
"size_in_bytes" : 7009
},
"indexing" : {
"index_total" : 1,
"index_time_in_millis" : 32,
"index_current" : 0,
"index_failed" : 0,
"delete_total" : 0,
"delete_time_in_millis" : 0,
"delete_current" : 0,
"noop_update_total" : 0,
"is_throttled" : false,
"throttle_time_in_millis" : 0
},
"get" : {
"total" : 0,
"time_in_millis" : 0,
"exists_total" : 0,
"exists_time_in_millis" : 0,
"missing_total" : 0,
"missing_time_in_millis" : 0,
"current" : 0
},
"search" : {
"open_contexts" : 0,
"query_total" : 12,
"query_time_in_millis" : 39,
"query_current" : 0,
"fetch_total" : 2,
"fetch_time_in_millis" : 12,
"fetch_current" : 0,
"scroll_total" : 0,
"scroll_time_in_millis" : 0,
"scroll_current" : 0,
"suggest_total" : 0,
"suggest_time_in_millis" : 0,
"suggest_current" : 0
},
"merges" : {
"current" : 0,
"current_docs" : 0,
"current_size_in_bytes" : 0,
"total" : 0,
"total_time_in_millis" : 0,
"total_docs" : 0,
"total_size_in_bytes" : 0,
"total_stopped_time_in_millis" : 0,
"total_throttled_time_in_millis" : 0,
"total_auto_throttle_in_bytes" : 104857600
},
"refresh" : {
"total" : 19,
"total_time_in_millis" : 75,
"listeners" : 0
},
"flush" : {
"total" : 1,
"total_time_in_millis" : 9
},
"warmer" : {
"current" : 0,
"total" : 8,
"total_time_in_millis" : 6
},
"query_cache" : {
"memory_size_in_bytes" : 0,
"total_count" : 0,
"hit_count" : 0,
"miss_count" : 0,
"cache_size" : 0,
"cache_count" : 0,
"evictions" : 0
},
"fielddata" : {
"memory_size_in_bytes" : 0,
"evictions" : 0
},
"completion" : {
"size_in_bytes" : 0
},
"segments" : {
"count" : 1,
"memory_in_bytes" : 2863,
"terms_memory_in_bytes" : 2089,
"stored_fields_memory_in_bytes" : 312,
"term_vectors_memory_in_bytes" : 0,
"norms_memory_in_bytes" : 256,
"points_memory_in_bytes" : 2,
"doc_values_memory_in_bytes" : 204,
"index_writer_memory_in_bytes" : 0,
"version_map_memory_in_bytes" : 0,
"fixed_bit_set_memory_in_bytes" : 0,
"max_unsafe_auto_id_timestamp" : -1,
"file_sizes" : { }
},
"translog" : {
"operations" : 1,
"size_in_bytes" : 488,
"uncommitted_operations" : 0,
"uncommitted_size_in_bytes" : 215
},
"request_cache" : {
"memory_size_in_bytes" : 703,
"evictions" : 0,
"hit_count" : 0,
"miss_count" : 2
},
"recovery" : {
"current_as_source" : 0,
"current_as_target" : 0,
"throttle_time_in_millis" : 0
}
},
"total" : {
"docs" : {
"count" : 2,
"deleted" : 0
},
"store" : {
"size_in_bytes" : 14018
},
"indexing" : {
"index_total" : 2,
"index_time_in_millis" : 70,
"index_current" : 0,
"index_failed" : 0,
"delete_total" : 0,
"delete_time_in_millis" : 0,
"delete_current" : 0,
"noop_update_total" : 0,
"is_throttled" : false,
"throttle_time_in_millis" : 0
},
"get" : {
"total" : 0,
"time_in_millis" : 0,
"exists_total" : 0,
"exists_time_in_millis" : 0,
"missing_total" : 0,
"missing_time_in_millis" : 0,
"current" : 0
},
"search" : {
"open_contexts" : 0,
"query_total" : 25,
"query_time_in_millis" : 76,
"query_current" : 0,
"fetch_total" : 4,
"fetch_time_in_millis" : 19,
"fetch_current" : 0,
"scroll_total" : 0,
"scroll_time_in_millis" : 0,
"scroll_current" : 0,
"suggest_total" : 0,
"suggest_time_in_millis" : 0,
"suggest_current" : 0
},
"merges" : {
"current" : 0,
"current_docs" : 0,
"current_size_in_bytes" : 0,
"total" : 0,
"total_time_in_millis" : 0,
"total_docs" : 0,
"total_size_in_bytes" : 0,
"total_stopped_time_in_millis" : 0,
"total_throttled_time_in_millis" : 0,
"total_auto_throttle_in_bytes" : 209715200
},
"refresh" : {
"total" : 38,
"total_time_in_millis" : 158,
"listeners" : 0
},
"flush" : {
"total" : 2,
"total_time_in_millis" : 22
},
"warmer" : {
"current" : 0,
"total" : 16,
"total_time_in_millis" : 6
},
"query_cache" : {
"memory_size_in_bytes" : 0,
"total_count" : 0,
"hit_count" : 0,
"miss_count" : 0,
"cache_size" : 0,
"cache_count" : 0,
"evictions" : 0
},
"fielddata" : {
"memory_size_in_bytes" : 0,
"evictions" : 0
},
"completion" : {
"size_in_bytes" : 0
},
"segments" : {
"count" : 2,
"memory_in_bytes" : 5726,
"terms_memory_in_bytes" : 4178,
"stored_fields_memory_in_bytes" : 624,
"term_vectors_memory_in_bytes" : 0,
"norms_memory_in_bytes" : 512,
"points_memory_in_bytes" : 4,
"doc_values_memory_in_bytes" : 408,
"index_writer_memory_in_bytes" : 0,
"version_map_memory_in_bytes" : 0,
"fixed_bit_set_memory_in_bytes" : 0,
"max_unsafe_auto_id_timestamp" : -1,
"file_sizes" : { }
},
"translog" : {
"operations" : 2,
"size_in_bytes" : 976,
"uncommitted_operations" : 0,
"uncommitted_size_in_bytes" : 430
},
"request_cache" : {
"memory_size_in_bytes" : 2812,
"evictions" : 0,
"hit_count" : 0,
"miss_count" : 5
},
"recovery" : {
"current_as_source" : 0,
"current_as_target" : 0,
"throttle_time_in_millis" : 0
}
}
},
"indices" : {
"blog" : {
"primaries" : {
"docs" : {
"count" : 1,
"deleted" : 0
},
"store" : {
"size_in_bytes" : 7009
},
"indexing" : {
"index_total" : 1,
"index_time_in_millis" : 32,
"index_current" : 0,
"index_failed" : 0,
"delete_total" : 0,
"delete_time_in_millis" : 0,
"delete_current" : 0,
"noop_update_total" : 0,
"is_throttled" : false,
"throttle_time_in_millis" : 0
},
"get" : {
"total" : 0,
"time_in_millis" : 0,
"exists_total" : 0,
"exists_time_in_millis" : 0,
"missing_total" : 0,
"missing_time_in_millis" : 0,
"current" : 0
},
"search" : {
"open_contexts" : 0,
"query_total" : 12,
"query_time_in_millis" : 39,
"query_current" : 0,
"fetch_total" : 2,
"fetch_time_in_millis" : 12,
"fetch_current" : 0,
"scroll_total" : 0,
"scroll_time_in_millis" : 0,
"scroll_current" : 0,
"suggest_total" : 0,
"suggest_time_in_millis" : 0,
"suggest_current" : 0
},
"merges" : {
"current" : 0,
"current_docs" : 0,
"current_size_in_bytes" : 0,
"total" : 0,
"total_time_in_millis" : 0,
"total_docs" : 0,
"total_size_in_bytes" : 0,
"total_stopped_time_in_millis" : 0,
"total_throttled_time_in_millis" : 0,
"total_auto_throttle_in_bytes" : 104857600
},
"refresh" : {
"total" : 19,
"total_time_in_millis" : 75,
"listeners" : 0
},
"flush" : {
"total" : 1,
"total_time_in_millis" : 9
},
"warmer" : {
"current" : 0,
"total" : 8,
"total_time_in_millis" : 6
},
"query_cache" : {
"memory_size_in_bytes" : 0,
"total_count" : 0,
"hit_count" : 0,
"miss_count" : 0,
"cache_size" : 0,
"cache_count" : 0,
"evictions" : 0
},
"fielddata" : {
"memory_size_in_bytes" : 0,
"evictions" : 0
},
"completion" : {
"size_in_bytes" : 0
},
"segments" : {
"count" : 1,
"memory_in_bytes" : 2863,
"terms_memory_in_bytes" : 2089,
"stored_fields_memory_in_bytes" : 312,
"term_vectors_memory_in_bytes" : 0,
"norms_memory_in_bytes" : 256,
"points_memory_in_bytes" : 2,
"doc_values_memory_in_bytes" : 204,
"index_writer_memory_in_bytes" : 0,
"version_map_memory_in_bytes" : 0,
"fixed_bit_set_memory_in_bytes" : 0,
"max_unsafe_auto_id_timestamp" : -1,
"file_sizes" : { }
},
"translog" : {
"operations" : 1,
"size_in_bytes" : 488,
"uncommitted_operations" : 0,
"uncommitted_size_in_bytes" : 215
},
"request_cache" : {
"memory_size_in_bytes" : 703,
"evictions" : 0,
"hit_count" : 0,
"miss_count" : 2
},
"recovery" : {
"current_as_source" : 0,
"current_as_target" : 0,
"throttle_time_in_millis" : 0
}
},
"total" : {
"docs" : {
"count" : 2,
"deleted" : 0
},
"store" : {
"size_in_bytes" : 14018
},
"indexing" : {
"index_total" : 2,
"index_time_in_millis" : 70,
"index_current" : 0,
"index_failed" : 0,
"delete_total" : 0,
"delete_time_in_millis" : 0,
"delete_current" : 0,
"noop_update_total" : 0,
"is_throttled" : false,
"throttle_time_in_millis" : 0
},
"get" : {
"total" : 0,
"time_in_millis" : 0,
"exists_total" : 0,
"exists_time_in_millis" : 0,
"missing_total" : 0,
"missing_time_in_millis" : 0,
"current" : 0
},
"search" : {
"open_contexts" : 0,
"query_total" : 25,
"query_time_in_millis" : 76,
"query_current" : 0,
"fetch_total" : 4,
"fetch_time_in_millis" : 19,
"fetch_current" : 0,
"scroll_total" : 0,
"scroll_time_in_millis" : 0,
"scroll_current" : 0,
"suggest_total" : 0,
"suggest_time_in_millis" : 0,
"suggest_current" : 0
},
"merges" : {
"current" : 0,
"current_docs" : 0,
"current_size_in_bytes" : 0,
"total" : 0,
"total_time_in_millis" : 0,
"total_docs" : 0,
"total_size_in_bytes" : 0,
"total_stopped_time_in_millis" : 0,
"total_throttled_time_in_millis" : 0,
"total_auto_throttle_in_bytes" : 209715200
},
"refresh" : {
"total" : 38,
"total_time_in_millis" : 158,
"listeners" : 0
},
"flush" : {
"total" : 2,
"total_time_in_millis" : 22
},
"warmer" : {
"current" : 0,
"total" : 16,
"total_time_in_millis" : 6
},
"query_cache" : {
"memory_size_in_bytes" : 0,
"total_count" : 0,
"hit_count" : 0,
"miss_count" : 0,
"cache_size" : 0,
"cache_count" : 0,
"evictions" : 0
},
"fielddata" : {
"memory_size_in_bytes" : 0,
"evictions" : 0
},
"completion" : {
"size_in_bytes" : 0
},
"segments" : {
"count" : 2,
"memory_in_bytes" : 5726,
"terms_memory_in_bytes" : 4178,
"stored_fields_memory_in_bytes" : 624,
"term_vectors_memory_in_bytes" : 0,
"norms_memory_in_bytes" : 512,
"points_memory_in_bytes" : 4,
"doc_values_memory_in_bytes" : 408,
"index_writer_memory_in_bytes" : 0,
"version_map_memory_in_bytes" : 0,
"fixed_bit_set_memory_in_bytes" : 0,
"max_unsafe_auto_id_timestamp" : -1,
"file_sizes" : { }
},
"translog" : {
"operations" : 2,
"size_in_bytes" : 976,
"uncommitted_operations" : 0,
"uncommitted_size_in_bytes" : 430
},
"request_cache" : {
"memory_size_in_bytes" : 2812,
"evictions" : 0,
"hit_count" : 0,
"miss_count" : 5
},
"recovery" : {
"current_as_source" : 0,
"current_as_target" : 0,
"throttle_time_in_millis" : 0
}
}
}
}
}
Cluster status:
green: all shards are allocated properly
yellow: all primary shards are allocated, but some or all replicas are not
red: at least one primary shard is unallocated; the cluster is not ready, and queries may return errors or incomplete results
Health of the library and map indices:
curl '192.168.199.136:9200/_cluster/health/library,map/?pretty'
{
"cluster_name" : "es",
"status" : "red",
"timed_out" : true,
"number_of_nodes" : 3,
"number_of_data_nodes" : 3,
"active_primary_shards" : 0,
"active_shards" : 0,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 100.0
}
level: cluster (the default), indices, or shards — controls how detailed the health API response is
timeout: the maximum time the call may take, default 30s
wait_for_status: green, yellow, or red; e.g. with green the health API call returns once the status is green (or the timeout is reached)
wait_for_nodes: how many nodes must be available before the response is returned (or until the timeout). It accepts an integer such as 3, or a simple expression such as >=3 (at least three nodes) or <=3 (at most three nodes)
wait_for_relocating_shards: unset by default; tells Elasticsearch how many relocating shards to wait for (or until the timeout). Setting it to 0 means waiting for all relocating shards.
curl '192.168.199.136:9200/_cluster/health?pretty&wait_for_status=green&wait_for_nodes=>=3&timeout=10s'
{
"cluster_name" : "es",
"status" : "green",
"timed_out" : false,
"number_of_nodes" : 3,
"number_of_data_nodes" : 3,
"active_primary_shards" : 0,
"active_shards" : 0,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 100.0
}
Sample data to search against
populate.sh:
#!/usr/bin/env bash
ADDRESS=$1
if [ -z $ADDRESS ]; then
ADDRESS="localhost:9200"
fi
# Check that Elasticsearch is running
curl -s "http://$ADDRESS" 2>&1 > /dev/null
if [ $? != 0 ]; then
echo "Unable to contact Elasticsearch at $ADDRESS"
echo "Please ensure Elasticsearch is running and can be reached at http://$ADDRESS/"
exit -1
fi
echo "WARNING, this script will delete the 'get-together' and the 'myindex' indices and re-index all data!"
echo "Press Control-C to cancel this operation."
echo
echo "Press [Enter] to continue."
read
# Delete the old index, swallow failures if it doesn't exist
curl -s -XDELETE "$ADDRESS/get-together" > /dev/null
# Create the next index using mapping.json
echo "Creating 'get-together' index..."
curl -s -XPUT -H'Content-Type: application/json' "$ADDRESS/get-together" -d@$(dirname $0)/mapping.json
# Wait for index to become yellow
curl -s "$ADDRESS/get-together/_health?wait_for_status=yellow&timeout=10s" > /dev/null
echo
echo "Done creating 'get-together' index."
echo
echo "Indexing data..."
echo "Indexing groups..."
curl -s -XPOST "$ADDRESS/get-together/_doc/1" -H'Content-Type: application/json' -d'{
"relationship_type": "group",
"name": "Denver Clojure",
"organizer": ["Daniel", "Lee"],
"description": "Group of Clojure enthusiasts from Denver who want to hack on code together and learn more about Clojure",
"created_on": "2012-06-15",
"tags": ["clojure", "denver", "functional programming", "jvm", "java"],
"members": ["Lee", "Daniel", "Mike"],
"location_group": "Denver, Colorado, USA"
}'
echo
curl -s -XPOST "$ADDRESS/get-together/_doc/2" -H'Content-Type: application/json' -d'{
"relationship_type": "group",
"name": "Elasticsearch Denver",
"organizer": "Lee",
"description": "Get together to learn more about using Elasticsearch, the applications and neat things you can do with ES!",
"created_on": "2013-03-15",
"tags": ["denver", "elasticsearch", "big data", "lucene", "solr"],
"members": ["Lee", "Mike"],
"location_group": "Denver, Colorado, USA"
}'
echo
curl -s -XPOST "$ADDRESS/get-together/_doc/3" -H'Content-Type: application/json' -d'{
"relationship_type": "group",
"name": "Elasticsearch San Francisco",
"organizer": "Mik",
"description": "Elasticsearch group for ES users of all knowledge levels",
"created_on": "2012-08-07",
"tags": ["elasticsearch", "big data", "lucene", "open source"],
"members": ["Lee", "Igor"],
"location_group": "San Francisco, California, USA"
}'
echo
curl -s -XPOST "$ADDRESS/get-together/_doc/4" -H'Content-Type: application/json' -d'{
"relationship_type": "group",
"name": "Boulder/Denver big data get-together",
"organizer": "Andy",
"description": "Come learn and share your experience with nosql & big data technologies, no experience required",
"created_on": "2010-04-02",
"tags": ["big data", "data visualization", "open source", "cloud computing", "hadoop"],
"members": ["Greg", "Bill"],
"location_group": "Boulder, Colorado, USA"
}'
echo
curl -s -XPOST "$ADDRESS/get-together/_doc/5" -H'Content-Type: application/json' -d'{
"relationship_type": "group",
"name": "Enterprise search London get-together",
"organizer": "Tyler",
"description": "Enterprise search get-togethers are an opportunity to get together with other people doing search.",
"created_on": "2009-11-25",
"tags": ["enterprise search", "apache lucene", "solr", "open source", "text analytics"],
"members": ["Clint", "James"],
"location_group": "London, England, UK"
}'
echo
echo "Done indexing groups."
echo "Indexing events..."
curl -s -XPOST "$ADDRESS/get-together/_doc/100?routing=1" -H'Content-Type: application/json' -d'{
"relationship_type": {
"name": "event",
"parent": "1"
},
"host": ["Lee", "Troy"],
"title": "Liberator and Immutant",
"description": "We will discuss two different frameworks in Clojure for doing different things. Liberator is a ring-compatible web framework based on Erlang Webmachine. Immutant is an all-in-one enterprise application based on JBoss.",
"attendees": ["Lee", "Troy", "Daniel", "Tom"],
"date": "2013-09-05T18:00",
"location_event": {
"name": "Stoneys Full Steam Tavern",
"geolocation": "39.752337,-105.00083"
},
"reviews": 4
}'
echo
curl -s -XPOST "$ADDRESS/get-together/_doc/101?routing=1" -H'Content-Type: application/json' -d'{
"relationship_type": {
"name": "event",
"parent": "1"
},
"host": "Sean",
"title": "Sunday, Surly Sunday",
"description": "Sort out any setup issues and work on Surlybird issues. We can use the EC2 node as a bounce point for pairing.",
"attendees": ["Daniel", "Michael", "Sean"],
"date": "2013-07-21T18:30",
"location_event": {
"name": "IRC, #denofclojure"
},
"reviews": 2
}'
echo
curl -s -XPOST "$ADDRESS/get-together/_doc/102?routing=1" -H'Content-Type: application/json' -d'{
"relationship_type": {
"name": "event",
"parent": "1"
},
"host": "Daniel",
"title": "10 Clojure coding techniques you should know, and project openbike",
"description": "What are ten Clojure coding techniques that you wish everyone knew? We will also check on the status of Project Openbike.",
"attendees": ["Lee", "Tyler", "Daniel", "Stuart", "Lance"],
"date": "2013-07-11T18:00",
"location_event": {
"name": "Stoneys Full Steam Tavern",
"geolocation": "39.752337,-105.00083"
},
"reviews": 3
}'
echo
curl -s -XPOST "$ADDRESS/get-together/_doc/103?routing=2" -H'Content-Type: application/json' -d'{
"relationship_type": {
"name": "event",
"parent": "2"
},
"host": "Lee",
"title": "Introduction to Elasticsearch",
"description": "An introduction to ES and each other. We can meet and greet and I will present on some Elasticsearch basics and how we use it.",
"attendees": ["Lee", "Martin", "Greg", "Mike"],
"date": "2013-04-17T19:00",
"location_event": {
"name": "Stoneys Full Steam Tavern",
"geolocation": "39.752337,-105.00083"
},
"reviews": 5
}'
echo
curl -s -XPOST "$ADDRESS/get-together/_doc/104?routing=2" -H'Content-Type: application/json' -d'{
"relationship_type": {
"name": "event",
"parent": "2"
},
"host": "Lee",
"title": "Queries and Filters",
"description": "A get together to talk about different ways to query Elasticsearch, what works best for different kinds of applications.",
"attendees": ["Lee", "Greg", "Richard"],
"date": "2013-06-17T18:00",
"location_event": {
"name": "Stoneys Full Steam Tavern",
"geolocation": "39.752337,-105.00083"
},
"reviews": 1
}'
echo
curl -s -XPOST "$ADDRESS/get-together/_doc/105?routing=2" -H'Content-Type: application/json' -d'{
"relationship_type": {
"name": "event",
"parent": "2"
},
"host": "Lee",
"title": "Elasticsearch and Logstash",
"description": "We can get together and talk about Logstash - http://logstash.net with a sneak peek at Kibana",
"attendees": ["Lee", "Greg", "Mike", "Delilah"],
"date": "2013-07-17T18:30",
"location_event": {
"name": "Stoneys Full Steam Tavern",
"geolocation": "39.752337,-105.00083"
},
"reviews": null
}'
echo
curl -s -XPOST "$ADDRESS/get-together/_doc/106?routing=3" -H'Content-Type: application/json' -d'{
"relationship_type": {
"name": "event",
"parent": "3"
},
"host": "Mik",
"title": "Social management and monitoring tools",
"description": "Shay Banon will be there to answer questions and we can talk about management tools.",
"attendees": ["Shay", "Mik", "John", "Chris"],
"date": "2013-03-06T18:00",
"location_event": {
"name": "Quid Inc",
"geolocation": "37.798442,-122.399801"
},
"reviews": 5
}'
echo
curl -s -XPOST "$ADDRESS/get-together/_doc/107?routing=3" -H'Content-Type: application/json' -d'{
"relationship_type": {
"name": "event",
"parent": "3"
},
"host": "Mik",
"title": "Logging and Elasticsearch",
"description": "Get a deep dive for what Elasticsearch is and how it can be used for logging with Logstash as well as Kibana!",
"attendees": ["Shay", "Rashid", "Erik", "Grant", "Mik"],
"date": "2013-04-08T18:00",
"location_event": {
"name": "Salesforce headquarters",
"geolocation": "37.793592,-122.397033"
},
"reviews": 3
}'
echo
curl -s -XPOST "$ADDRESS/get-together/_doc/108?routing=3" -H'Content-Type: application/json' -d'{
"relationship_type": {
"name": "event",
"parent": "3"
},
"host": "Elyse",
"title": "Piggyback on Elasticsearch training in San Francisco",
"description": "We can piggyback on training by Elasticsearch to have some Q&A time with the ES devs",
"attendees": ["Shay", "Igor", "Uri", "Elyse"],
"date": "2013-05-23T19:00",
"location_event": {
"name": "NoSQL Roadshow",
"geolocation": "37.787742,-122.398964"
},
"reviews": 5
}'
echo
curl -s -XPOST "$ADDRESS/get-together/_doc/109?routing=4" -H'Content-Type: application/json' -d'{
"relationship_type": {
"name": "event",
"parent": "4"
},
"host": "Andy",
"title": "Hortonworks, the future of Hadoop and big data",
"description": "Presentation on the work that hortonworks is doing on Hadoop",
"attendees": ["Andy", "Simon", "David", "Sam"],
"date": "2013-06-19T18:00",
"location_event": {
"name": "SendGrid Denver office",
"geolocation": "39.748477,-104.998852"
},
"reviews": 2
}'
echo
curl -s -XPOST "$ADDRESS/get-together/_doc/110?routing=4" -H'Content-Type: application/json' -d'{
"relationship_type": {
"name": "event",
"parent": "4"
},
"host": "Andy",
"title": "Big Data and the cloud at Microsoft",
"description": "Discussion about the Microsoft Azure cloud and HDInsight.",
"attendees": ["Andy", "Michael", "Ben", "David"],
"date": "2013-07-31T18:00",
"location_event": {
"name": "Bing Boulder office",
"geolocation": "40.018528,-105.275806"
},
"reviews": 1
}'
echo
curl -s -XPOST "$ADDRESS/get-together/_doc/111?routing=4" -H'Content-Type: application/json' -d'{
"relationship_type": {
"name": "event",
"parent": "4"
},
"host": "Andy",
"title": "Moving Hadoop to the mainstream",
"description": "Come hear about how Hadoop is moving to the main stream",
"attendees": ["Andy", "Matt", "Bill"],
"date": "2013-07-21T18:00",
"location_event": {
"name": "Courtyard Boulder Louisville",
"geolocation": "39.959409,-105.163497"
},
"reviews": 4
}'
echo
curl -s -XPOST "$ADDRESS/get-together/_doc/112?routing=5" -H'Content-Type: application/json' -d'{
"relationship_type": {
"name": "event",
"parent": "5"
},
"host": "Dave Nolan",
"title": "real-time Elasticsearch",
"description": "We will discuss using Elasticsearch to index data in real time",
"attendees": ["Dave", "Shay", "John", "Harry"],
"date": "2013-02-18T18:30",
"location_event": {
"name": "SkillsMatter Exchange",
"geolocation": "51.524806,-0.099095"
},
"reviews": 3
}'
echo
curl -s -XPOST "$ADDRESS/get-together/_doc/113?routing=5" -H'Content-Type: application/json' -d'{
"relationship_type": {
"name": "event",
"parent": "5"
},
"host": "Dave",
"title": "Elasticsearch at Rangespan and Exonar",
"description": "Representatives from Rangespan and Exonar will come and discuss how they use Elasticsearch",
"attendees": ["Dave", "Andrew", "David", "Clint"],
"date": "2013-06-24T18:30",
"location_event": {
"name": "Alumni Theatre",
"geolocation": "51.51558,-0.117699"
},
"reviews": 3
}'
echo
curl -s -XPOST "$ADDRESS/get-together/_doc/114?routing=5" -H'Content-Type: application/json' -d'{
"relationship_type": {
"name": "event",
"parent": "5"
},
"host": "Yann",
"title": "Using Hadoop with Elasticsearch",
"description": "We will walk through using Hadoop with Elasticsearch for big data crunching!",
"attendees": ["Yann", "Bill", "James"],
"date": "2013-09-09T18:30",
"location_event": {
"name": "SkillsMatter Exchange",
"geolocation": "51.524806,-0.099095"
},
"reviews": 2
}'
echo
echo "Done indexing events."
# Refresh so data is available
curl -s -XPOST "$ADDRESS/get-together/_refresh"
echo
echo "Done indexing data."
echo
echo
echo "Creating Templates."
curl -s -XPUT "http://$ADDRESS/_template/logging_index_all" -H'Content-Type: application/json' -d'{
"template" : "logstash-09-*",
"order" : 1,
"settings" : {
"number_of_shards" : 2,
"number_of_replicas" : 1
},
"aliases" : { "november" : {} }
}'
echo
curl -s -XPUT "http://$ADDRESS/_template/logging_index" -H'Content-Type: application/json' -d '{
"template" : "logstash-*",
"order" : 0,
"settings" : {
"number_of_shards" : 2,
"number_of_replicas" : 1
}
}'
echo
echo "Done Creating Templates."
echo
echo "Adding Dynamic Mapping"
curl -s -XDELETE "http://$ADDRESS/myindex" > /dev/null
curl -s -XPUT "http://$ADDRESS/myindex" -H'Content-Type: application/json' -d'
{
"mappings" : {
"my_type" : {
"dynamic_templates" : [{
"UUID" : {
"match" : "*_guid",
"match_mapping_type" : "string",
"mapping" : {
"type" : "keyword"
}
}
}]
}
}
}'
echo
echo "Done Adding Dynamic Mapping"
echo
echo "Adding Aliases"
curl -s -XDELETE "http://$ADDRESS/november_2014_invoices" > /dev/null
curl -s -XDELETE "http://$ADDRESS/december_2014_invoices" > /dev/null
curl -s -XPUT "http://$ADDRESS/november_2014_invoices"
echo
curl -s -XPUT "http://$ADDRESS/december_2014_invoices" -H'Content-Type: application/json' -d'
{
"mappings" :
{
"invoice" :
{
"properties" :
{
"revenue" : { "type" : "integer" }
}
}
}
}'
echo
curl -s -XPOST "http://$ADDRESS/_aliases" -H'Content-Type: application/json' -d'
{
"actions" : [
{"add" : {"index" : "november_2014_invoices", "alias" : "2014_invoices"}},
{"add" : {"index" : "december_2014_invoices", "alias" : "2014_invoices"}},
{"remove" : {"index" : "myindex", "alias" : "december_2014_invoices"}}
]
}'
echo
echo "Done Adding Aliases"
echo "Adding Filter Alias"
curl -s -XPOST "http://$ADDRESS/_aliases" -H'Content-Type: application/json' -d '
{
"actions" : [
{
"add" : {
"index" : "december_2014_invoices",
"alias" : "bigmoney",
"filter" :
{
"range" :
{
"revenue" :
{
"gt" : 1000
}
}
}
}
}
]
}'
echo
echo "Done Adding Filter Alias"
echo
echo "Adding Routing Alias"
curl -s -XPOST "http://$ADDRESS/_aliases" -H'Content-Type: application/json' -d '
{
"actions" : [
{
"add" : {
"index" : "december_2014_invoices",
"alias" : "2014_invoices",
"search_routing" : "en,es",
"index_routing" : "en"
}
}
]
}'
echo
echo "Done Adding Routing Alias"
Running it fails with:
mapper_parsing_exception","reason":"failed to parse field [relationship_type] of type [text] in document with id '100'
Cloning the 7.x branch from git fixes this:
git clone https://github.com/dakrone/elasticsearch-in-action.git -b 7.x

Limiting scope with index or type names
% curl 'localhost:9200/_search' -d '……' ←—— search the whole cluster
% curl 'localhost:9200/get-together/_search' -d '……' ←—— search the get-together index
% curl 'localhost:9200/get-together/event/_search' -d '……' ←—— search the event type within the get-together index
% curl 'localhost:9200/_all/event/_search' -d '……' ←—— search the event type across all indices
% curl 'localhost:9200/*/event/_search' -d '……'
% curl 'localhost:9200/get-together,other/event,group/_search' -d '……' ←—— search the event and group types in the get-together and other indices
% curl 'localhost:9200/+get-toge*,-get-together/_search' -d '……' ←—— search all indices whose names start with get-toge, excluding get-together itself
query: configured with the query DSL and filter DSL
size: the number of documents to return
from: used together with size for pagination. Note that to return results for page 2 of 10, Elasticsearch must compute the first 20 results; as the result set grows, fetching deep pages becomes an increasingly expensive operation
_source: controls how the _source field is returned; if indexed documents are large and you don't need all of their content, use this to trim the response
% curl 'localhost:9200/get-together/_search?from=10&size=10' ←—— matches all documents; from and size are sent as URL parameters
% curl 'localhost:9200/get-together/_search?sort=date:asc' ←—— matches all documents, returns the default first 10 results, sorted by date ascending
% curl 'node1:9200/get-together/_search?sort=date:asc&_source=title,date'
% curl 'node1:9200/get-together/_search?sort=date:asc&q=title:elasticsearch' ←—— matches all events with "elasticsearch" in the title
curl 'localhost:9200/get-together/_search' -d '{
"query":{
"match_all":{}
},
"from":10, ←——返回从第10 项开始的结果
"size":10 ←——总共返回最多10个结果
}'
curl -X POST "node1:9200/get-together/_search" -H 'Content-Type: application/json' -d'
{
"query": {
"match_all": { }
},
"from":10,
"size":10
}'
% curl 'localhost:9200/_search?q=title:elasticsearch&_source=title,date'
{
"took":2, ←—— time the query took, in milliseconds
"timed_out":false, ←—— whether any shard timed out, i.e., whether only partial results came back
"_shards":{
"total":2, ←—— number of shards that responded successfully or failed
"successful":2,
"failed":0
},
"hits":{ ←—— the reply contains a hits key whose value holds the array of matching documents
"total":7, ←—— total number of matches for this search request
"max_score":0.9904146, ←—— highest score among these results
"hits":[ ←—— the array of hit documents inside the hits element
{
"_index":"get-together", ←—— index of the result document
"_type":"event", ←—— Elasticsearch type of the result document
"_id":"103", ←—— ID of the result document
"_score":0.9904146, ←—— relevance score of the result
"_source":{
"date":"2013-04-17T19:00", ←—— the requested _source fields (here, title and date)
"title":"Introduction to Elasticsearch"
}
},
{
"_index":"get-together",
"_type":"event",
"_id":"105",
"_score":0.9904146,
"_source":{
"date":"2013-07-17T18:30",
"title":"Elasticsearch and Logstash"
}
},
… ←—— remaining hits omitted for brevity
]
}
}
A basic search request using the request body
% curl 'localhost:9200/get-together/_search' -d '{
"query":{ ←——搜索API 中的查询模块
"match_all":{} ←——查询 API 的基本样例
}
}'
% curl 'localhost:9200/get-together/event/_search' -d '{
"query":{
"match":{ ←——match 查询展示了如何搜索标题中有“hadoop”字样的活动
"title":"hadoop" ←——注意查询单词“Hadoop”是以小写的h 开头
}
}
}'
A bool query combines must, must_not, and should clauses:
{
"bool": {
"must": { "match": { "title": "how to make millions" }},
"must_not": { "match": { "tag": "spam" }},
"should": [
{ "match": { "tag": "starred" }},
{ "range": { "date": { "gte": "2014-01-01" }}}
]
}
}
# Match all documents, then filter to those whose salary is 6666 or 7777
GET 51jobs/job/_search
{
"query": {
"bool": {
"must": {
"match_all": {}
},
"filter": {
"terms": {
"salary": [6666,7777]
}
}
}
}
}
# salary equals 6666 or title equals python, and salary is neither 7777 nor 8888
GET 51jobs/job/_search
{
"query": {
"bool": {
"should": [
{"term": {
"salary": {
"value": 6666
}
}},
{"term": {
"title": {
"value": "python"
}
}}
],
"must_not": [
{"term": {
"salary": {
"value": 7777
}
}},
{"term": {
"salary": {
"value": 8888
}
}}
]
}
}
}
# Exclude documents whose title is empty
GET 51jobs/job/_search
{
"query": {
"bool": {
"must_not": [
{"term": {"title": ""}}
]
}
}
}
# Count distinct uid values with a cardinality aggregation (approximate, HyperLogLog-based)
POST user_onoffline_log/_search
{
"query": {
"match_all": {}
},
"size": 0,
"aggs": {
"uid_aggs": {
"cardinality": {
"field": "uid"
}
}
}
}
# Top document per uid: a terms aggregation with a top_hits sub-aggregation
POST /user_onoffline_log/_search
{
"query": {
"match_all": {}
},
"aggs": {
"uid_aggs": {
"terms": {
"field": "uid",
"size": 1
},
"aggs": {
"uid_top": {
"top_hits": {
"sort": [
{
"uid": {
"order": "desc"
}
}
],
"size": 1
}
}
}
}
},
"size": 0
}
# Sum end_time and start_time across matching documents
{
"query": {
"bool": {
"must": [
{
"term": {
"name": "自动封禁_安全事件_大于阈值_挡板"
}
},
{
"range": {
"start_time": {
"gte": 1609387208000
}
}
},
{
"exists": {
"field": "end_time"
}
}
]
}
},
"aggs": {
"sum_end_time": {
"sum": {
"field": "end_time"
}
},
"sum_start_time": {
"sum": {
"field": "start_time"
}
}
}
}
# A generic request-body skeleton
{
"query": {
"bool": {
"must": [
{
"match_all": { }
}
],
"must_not": [ ],
"should": [ ]
}
},
"from": 0,
"size": 10,
"sort": [ ],
"aggs": { }
}
# Count distinct name values
{
"query": {
"match_all": {}
},
"aggs": {
"name_aggs": {
"cardinality": {
"field": "name"
}
}
}
}
# Latest document per name via terms + top_hits
{
"query": {
"match_all": {}
},
"aggs": {
"name_aggs": {
"terms": {
"field": "name"
},
"aggs": {
"name_top": {
"top_hits": {
"sort": [
{
"name": {
"order": "desc"
}
}
]
}
}
}
}
}
}
# Return only selected _source fields
POST https://215.9.167.42/es/soar_case_202012/_search
{
"query": {
"match_all": {}
},
"_source": [
"name",
"duration",
"plain_id"
]
}
{
"query": {
"match_all": {}
},
"_source": [
"inputs.value.attacker_array",
"inputs.value.alarm_source",
"inputs.value.victim_array",
"plain_id"
],
"size": 100
}
curl -XPOST 'abdi.node49:9200/das_logger-v6-2021_09_08d,das_logger-v6-2021_09_07d,das_logger-v6-2021_09_13d,das_logger-v6-2021_09_12d/_search' -H 'Content-Type: application/json' -d '
{
"from": 0,
"size": 0,
"_source": {
"includes": [
"devSubType",
"deviceName"
],
"excludes": []
},
"stored_fields": [
"devSubType",
"deviceName"
],
"aggregations": {
"devSubType": {
"terms": {
"field": "devSubType",
"size": 1000,
"shard_size": 20000,
"min_doc_count": 1,
"shard_min_doc_count": 0,
"show_term_doc_count_error": false,
"order": [
{
"_count": "desc"
},
{
"_key": "asc"
}
]
},
"aggregations": {
"deviceName": {
"terms": {
"field": "deviceName",
"size": 1000,
"shard_size": 20000,
"min_doc_count": 1,
"shard_min_doc_count": 0,
"show_term_doc_count_error": false,
"order": [
{
"_count": "desc"
},
{
"_key": "asc"
}
]
}
}
}
}
}
}'
Drawbacks of from-size pagination
Every shard must compute and sort its top from + size hits for each page, so deep pages get increasingly expensive, and by default Elasticsearch rejects from + size beyond max_result_window (10,000).
GET /{index_name}/_search
{
"from":0,
"size":10
}
// Java High Level REST Client (org.elasticsearch.search.builder.SearchSourceBuilder)
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
searchSourceBuilder.from((page.getPageNum() - 1) * page.getPageSize()); // offset = (page - 1) * pageSize
searchSourceBuilder.size(page.getPageSize());
scroll
You can pass scroll=5m with the initial query; Elasticsearch returns a _scroll_id, a long base64-encoded string, which you pass in on the next request. 5m means the _scroll_id is cached for 5 minutes and then expires automatically; tune it as needed. size sets how many documents each scroll pulls, though with a sharded index a batch may come back larger than the requested size. An example follows.
For example, the first query:
GET /sms/_search?scroll=5m
{
"size": 20,
"query": {
"bool": {
"must": [
{
"match": {
"userId": "9d995c0b90fe4128896a1a84eca213bf"
}
}
]
}
}
}
The response:
{
"_scroll_id": "DnF1ZXJ5VGhlbkZldGNoBgAAAAAATJH1FlFTYzlSZ0VNVGdlM2o0T0dTX2tVUncAAAAAAE0-zBZQUVp6Sy04X1J1NjJCaVZfQUhHWjFnAAAAAABMkfYWUVNjOVJnRU1UZ2UzajRPR1Nfa1VSdwAAAAAATXVxFk83UWRhNGg3UmxTQnpXTEUzd0dreXcAAAAAAEyR9xZRU2M5UmdFTVRnZTNqNE9HU19rVVJ3AAAAAABNPs0WUFFaekstOF9SdTYyQmlWX0FIR1oxZw==",
"took": 6,
......
}
Then pass the _scroll_id from the previous response as shown below to get the next batch of results.
GET /_search/scroll/
{
"scroll":"1m",
"scroll_id":"DnF1ZXJ5VGhlbkZldGNoBgAAAAAATJH1FlFTYzlSZ0VNVGdlM2o0T0dTX2tVUncAAAAAAE0-zBZQUVp6Sy04X1J1NjJCaVZfQUhHWjFnAAAAAABMkfYWUVNjOVJnRU1UZ2UzajRPR1Nfa1VSdwAAAAAATXVxFk83UWRhNGg3UmxTQnpXTEUzd0dreXcAAAAAAEyR9xZRU2M5UmdFTVRnZTNqNE9HU19rVVJ3AAAAAABNPs0WUFFaekstOF9SdTYyQmlWX0FIR1oxZw=="
}
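A minimal bash sketch that pages through every batch (assumes jq is installed; host, index, and query mirror the example above):
ES=localhost:9200
# Open the scroll context with the first request
resp=$(curl -s "$ES/sms/_search?scroll=5m" -H 'Content-Type: application/json' \
  -d '{"size":20,"query":{"match_all":{}}}')
scroll_id=$(echo "$resp" | jq -r '._scroll_id')
hits=$(echo "$resp" | jq '.hits.hits | length')
while [ "$hits" -gt 0 ]; do
  # ... process the current batch in $resp here ...
  resp=$(curl -s "$ES/_search/scroll" -H 'Content-Type: application/json' \
    -d "{\"scroll\":\"5m\",\"scroll_id\":\"$scroll_id\"}")
  scroll_id=$(echo "$resp" | jq -r '._scroll_id')
  hits=$(echo "$resp" | jq '.hits.hits | length')
done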
Besides letting the scroll_id expire, you can also delete it manually:
// deleting a scroll_id by hand
DELETE /_search/scroll
{
"scroll_id" : "DnF1ZXJ5VGhlbkZldGNoBgAAAAAATJH1FlFTYzlSZ0VNVGdlM2o0T0dTX2tVUncAAAAAAE0-zBZQUVp6..."
}
Modifying max_result_window
curl -XPUT http://es-ip:9200/_settings -H 'Content-Type: application/json' -d '{ "index" : { "max_result_window" : 100000}}'
# Set it for a single index
curl -XPUT http://ip:port/{index_name}/_settings -H 'Content-Type: application/json' -d '{"index" : {"max_result_window" : <window size>}}'
# Since 7.0, hits.total is capped at 10,000 by default; set track_total_hits to get the exact count
GET {index_name}/_search
{
"query": {
"match_all": {}
},
"track_total_hits":true
}
2020-07-30 00:12:48,712 main ERROR Unable to locate appender "rolling" for logger config "root"
Fix: edit the log4j2.properties file in the config directory and change logger.deprecation.level = warn to error.
java.io.FileNotFoundException: /opt/elasticsearch/logs/es.log (Permission denied)
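The usual cause is that the log directory is not writable by the user running ES; a sketch of the fix, assuming ES runs as the elasticsearch user (adjust to your install):
# Hand the log directory to the ES process user (user/group names are assumptions)
chown -R elasticsearch:elasticsearch /opt/elasticsearch/logs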

Reference
Watch fielddata usage: _cat/fielddata?v
Reasons a shard can be left unassigned:
1) INDEX_CREATED: unassigned because the index was just created via the create-index API.
2) CLUSTER_RECOVERED: unassigned as a result of a full cluster recovery.
3) INDEX_REOPENED: unassigned as a result of opening or closing an index.
4) DANGLING_INDEX_IMPORTED: unassigned as a result of importing a dangling index.
5) NEW_INDEX_RESTORED: unassigned as a result of restoring into a new index.
6) EXISTING_INDEX_RESTORED: unassigned as a result of restoring into a closed index.
7) REPLICA_ADDED: unassigned because a replica shard was explicitly added.
8) ALLOCATION_FAILED: unassigned because shard allocation failed.
9) NODE_LEFT: unassigned because the node hosting the shard left the cluster.
10) REINITIALIZED: unassigned because the shard moved back from started to initializing (for example, with shadow replicas).
11) REROUTE_CANCELLED: allocation was cancelled as a result of an explicit cancel-reroute command.
12) REALLOCATED_REPLICA: a better replica location was identified, so the existing replica allocation was cancelled, leaving it unassigned.
Using the Cluster Allocation Explain API
GET /_cluster/allocation/explain
{
"index": "myindex",
"shard": 0,
"primary": true
}
#You may also specify an optional current_node request parameter to only explain a shard that is currently located on current_node. The current_node can be specified as either the node id or node name.
GET /_cluster/allocation/explain
{
"index": "myindex",
"shard": 0,
"primary": false,
"current_node": "nodeA"
}
The API response for an unassigned shard:
{
"index" : "idx",
"shard" : 0,
"primary" : true,
"current_state" : "unassigned", #分片的当前状态
"unassigned_info" : {
"reason" : "INDEX_CREATED", # 未分片原因
"at" : "2017-01-04T18:08:16.600Z",
"last_allocation_status" : "no"
},
"can_allocate" : "no", # 是否分配分片
"allocate_explanation" : "cannot allocate because allocation is not permitted to any of the nodes",
"node_allocation_decisions" : [
{
"node_id" : "8qt2rY-pT6KNZB3-hGfLnw",
"node_name" : "node-0",
"transport_address" : "127.0.0.1:9401",
"node_attributes" : {},
"node_decision" : "no", # 是否分配这个分片到确切的节点
"weight_ranking" : 1,
"deciders" : [
{
"decider" : "filter", #导致节点无决策的决策者
"decision" : "NO",
"explanation" : "node does not match index setting [index.routing.allocation.include] filters [_name:\"non_existent_node\"]"
}
]
}
]
}
Viewing unassigned shards

curl -H "Content-Type: application/json" node1:9200/_cluster/allocation/explain?pretty -d '{"index":
"{index}","shard": {shard},"primary": false}'
curl -XGET node1:9200/_cat/shards?h=index,shard,prirep,state,unassigned.reason | grep UNASSIGNED
# One fix: delete the affected index (this destroys its data)
curl -XDELETE '192.168.199.136:9200/index_name/'
If the reason is ALLOCATION_FAILED, retry the failed allocations:
POST _cluster/reroute?retry_failed
curl -XPOST 'localhost:9200/_cluster/reroute' -H 'Content-Type: application/json' -d '{
"commands" : [ {
"allocate" : {
"index" : "rs_wx_test",
"shard" : 1,
"node" : "AfUyuXmGTESHXpwi4OExxx",
"allow_primary" : true
}
}
] }'
for index in $(curl -s -XGET 'http://localhost:9200/_cat/shards' | grep UNASSIGNED | awk '{print $1}' | sort | uniq); do
for shard in $(curl -s -XGET 'http://localhost:9200/_cat/shards' | grep UNASSIGNED | grep $index | awk '{print $2}' | sort | uniq); do
curl -XPOST 'http://localhost:9200/_cluster/reroute' -H 'Content-Type: application/json' -d '{
"commands":[
{
"allocate":{
"index":"'$index'",
"shard":'$shard',
"node":"ali-k-ops-elk1",
"allow_primary":"true"
}
}
]
}'
done
done
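Note that from ES 5.x onward the bare allocate command was split into allocate_replica, allocate_stale_primary, and allocate_empty_primary. A sketch of forcing an empty primary on a newer cluster (this discards whatever data the shard copy held; index and node names follow the example above):
curl -XPOST 'localhost:9200/_cluster/reroute' -H 'Content-Type: application/json' -d '{
"commands" : [ {
"allocate_empty_primary" : {
"index" : "rs_wx_test",
"shard" : 1,
"node" : "AfUyuXmGTESHXpwi4OExxx",
"accept_data_loss" : true
}
} ] }'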
_doc"Document mapping type name can't start with '_', found: [_doc]"
Substituting fulltext or article for the type name then fails to parse the relationship_type field: {"error":{"root_cause":[{"type":"mapper_parsing_exception","reason":"failed to parse [relationship_type]"}],"type":"mapper_parsing_exception","reason":"failed to parse [relationship_type]","caused_by":{"type":"illegal_state_exception","reason":"Can't get text on a START_OBJECT at 2:24"}},"status":400}
curl -XPUT localhost:9200/*/_settings -H 'Content-Type: application/json' -d '{"index" : {"number_of_replicas" : 0 }}'
Resetting replicas on all shards (a fragment that rewrites exported settings files in place):
grep -rl '"number_of_replicas": 0' . | xargs sed -i 's/"number_of_replicas": 0/"number_of_replicas": 1/'
curl -H "Content-Type:application/json" -X PUT "http://node2:9200/_all/_settings" -d '{"index":{"blocks":{"read_only_allow_delete": "false"}}}'
https://my.oschina.net/u/4277979/blog/4719417
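To check the per-node disk usage that triggers this block in the first place, the _cat allocation API helps:
curl 'node2:9200/_cat/allocation?v'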
Tools: elasticdump (npm install elasticdump -g) and esm (https://github.com/medcl/esm).
1) Migrate the shard data of the 7 ES nodes below onto the other 5 physical servers. Servers to be decommissioned:
10.253.79.149
10.253.79.150
10.253.79.155
10.253.79.156
10.253.79.157
10.253.79.158
10.253.79.159
Of these, 10.253.79.149 and 10.253.79.150 are 512 GB / 80-core machines with higher performance, to be used for SAE and ICE.
a) On the ES master node, stop writes to 10.253.79.149, 10.253.79.150, 10.253.79.155, 10.253.79.156, 10.253.79.157, 10.253.79.158 and 10.253.79.159, and move the index shards of these 7 nodes onto the 5 nodes that will stay.
Command:
curl -XPUT localhost:9200/_cluster/settings -H 'Content-Type: application/json' -d '{"transient" : {"cluster.routing.allocation.exclude._ip" : "10.253.79.149,10.253.79.150,10.253.79.155,10.253.79.156,10.253.79.157,10.253.79.158,10.253.79.159"}}'
b) Check the cluster's shard distribution with cerebro:
Relocation is still in progress at this point; shards remain on the nodes being removed.

The migration can take several hours, depending on disk volume.
c) When relocation finishes, check the node shard info again:
Verify that no shard data remains on the migrated nodes.

Also keep an eye on the shards of closed indices during this period:
In cerebro, click closed.

d) Once all data-node shards have finished migrating, shut down the 7 ES servers being decommissioned:
supervisorctl stop elasticSearch
e) With the 7 servers down, check the ES cluster state:
All node shards have been migrated.

f) Update the configuration on the 5 remaining ES servers (all 5; one is shown below) so that a rebooted node does not come back with the old 12-node cluster config:
Config file path: /opt/hansight/enterprise/elasticsearch/config/
In elasticsearch.yml, set:
discovery.zen.ping.unicast.hosts: [ "10.253.79.145:9300","10.253.79.151:9300","10.253.79.152:9300","10.253.79.153:9300","10.253.79.154:9300" ] // IPs and ports of the 5 remaining nodes
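Once the migration is confirmed, it is also worth clearing the transient exclusion so the remaining nodes allocate shards normally again; a sketch, run against any surviving node:
curl -XPUT localhost:9200/_cluster/settings -H 'Content-Type: application/json' -d '{"transient" : {"cluster.routing.allocation.exclude._ip" : null}}'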
The target ES cluster's major version (e.g., the 5 in 5.6.4) must be greater than or equal to the source cluster's major version;
snapshots created on a 1.x cluster cannot be restored on 5.x.
Create a repository on the source ES cluster
A repository must exist before you create snapshots, and one repository can hold multiple snapshot files. The main repository types are:
fs: a shared filesystem; snapshot files are stored on the filesystem
url: a filesystem URL; supported protocols: http, https, ftp, file, jar
s3: AWS S3 object storage; supported via a plugin
hdfs: snapshots stored in HDFS; supported via a plugin
cos: snapshots stored in Tencent Cloud COS object storage; supported via a plugin
To migrate from a self-managed ES cluster to Tencent Cloud ES, an fs repository works directly; note the repository path must be registered in elasticsearch.yml:
path.repo: ["/usr/local/services/test"]
Then call the snapshot API to create the repository:
curl -XPUT http://172.16.0.39:9200/_snapshot/my_backup -H 'Content-Type: application/json' -d '{
"type": "fs",
"settings": {
"location": "/usr/local/services/test"
"compress": true
}
}'
To migrate from another cloud vendor's ES cluster to Tencent Cloud ES, or between Tencent Cloud clusters, use the repository type that vendor provides, e.g., AWS S3, Alibaba Cloud OSS, Tencent Cloud COS:
curl -XPUT http://172.16.0.39:9200/_snapshot/my_s3_repository -H 'Content-Type: application/json' -d '{
{
"type": "s3",
"settings": {
"bucket": "my_bucket_name",
"region": "us-west"
}
}'
Create a snapshot on the source ES cluster
Call the snapshot API to create a snapshot in the repository you just created:
curl -XPUT http://172.16.0.39:9200/_snapshot/my_backup/snapshot_1?wait_for_completion=true
A snapshot can be limited to specific indices, and you can control what it includes; see the official docs for the full set of API parameters.
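For example, a sketch that snapshots only selected indices (the index names here are illustrative):
curl -XPUT 'http://172.16.0.39:9200/_snapshot/my_backup/snapshot_2?wait_for_completion=true' -H 'Content-Type: application/json' -d '{
"indices": "get-together,myindex",
"ignore_unavailable": true,
"include_global_state": false
}'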
Create a repository on the target ES cluster
This works the same as on the source cluster; on Tencent Cloud you can create a COS bucket and point the repository at it.
Move the source cluster's snapshot into the target cluster's repository
Upload the snapshot created on the source cluster into the repository created on the target cluster.
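For fs repositories this amounts to copying the repository directory itself; a sketch assuming the target path is reachable over SSH (host and user are illustrative):
rsync -av /usr/local/services/test/ user@target-es-node:/usr/local/services/test/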
Restore from the snapshot
curl -XPUT http://172.16.0.20:9200/_snapshot/my_backup/snapshot_1/_restore
Check the restore status
curl http://172.16.0.20:9200/_snapshot/_status
select event_type, event_name, count(*) as num from event group by event_name,event_type order by num desc limit 3
{
"from": 0,
"size": 0,
"_source": {
"includes": [
"event_type",
"event_name",
"COUNT"
],
"excludes": []
},
"stored_fields": [
"event_type",
"event_name"
],
"aggregations": {
"event_name": {
"terms": {
"field": "event_name",
"size": 3,
"shard_size": 5000,
"min_doc_count": 1,
"shard_min_doc_count": 0,
"show_term_doc_count_error": false,
"order": [
{
"_count": "desc"
},
{
"_key": "asc"
}
]
},
"aggregations": {
"event_type": {
"terms": {
"field": "event_type",
"size": 3,
"shard_size": 5000,
"min_doc_count": 1,
"shard_min_doc_count": 0,
"show_term_doc_count_error": false,
"order": [
{
"num": "desc"
},
{
"_key": "asc"
}
]
},
"aggregations": {
"num": {
"value_count": {
"field": "_index"
}
}
}
}
}
}
}
}
select devSubType,deviceName from das_logger group by devSubType,deviceName
{
"from": 0,
"size": 0,
"_source": {
"includes": [
"devSubType",
"deviceName"
],
"excludes": []
},
"stored_fields": [
"devSubType",
"deviceName"
],
"aggregations": {
"devSubType": {
"terms": {
"field": "devSubType",
"size": 1000,
"shard_size": 20000,
"min_doc_count": 1,
"shard_min_doc_count": 0,
"show_term_doc_count_error": false,
"order": [
{
"_count": "desc"
},
{
"_key": "asc"
}
]
},
"aggregations": {
"deviceName": {
"terms": {
"field": "deviceName",
"size": 1000,
"shard_size": 20000,
"min_doc_count": 1,
"shard_min_doc_count": 0,
"show_term_doc_count_error": false,
"order": [
{
"_count": "desc"
},
{
"_key": "asc"
}
]
}
}
}
}
}
}
#!/bin/bash
# Remove Elasticsearch indices older than time_ago days (set below)
CMD_ECHO='echo'
SCRIPT_NAME=`basename $0`
LOG_PRINT="eval $CMD_ECHO \"[$SCRIPT_NAME]\" @$(date +"%Y%m%d %T") [INFO] :"
time_ago=7
es_cluster_ip=10.26.22.130
function delete_index(){
comp_date=`date -d "${time_ago} day ago" +"%Y-%m-%d"`
date1="${1} 00:00:00"
date2="${comp_date} 00:00:00"
index_date=`date -d "${date1}" +%s`
limit_date=`date -d "${date2}" +%s`
if [ $index_date -le $limit_date ];then
$LOG_PRINT "$1 will perform the delete task earlier than ${time_ago} days ago" >> tmp.txt
del_date=`echo $1 | awk -F "-" '{print $1"."$2"."$3}'`
echo "=========开始删除========="
curl -XDELETE http://${es_cluster_ip}:9200/devlog-$del_date >> tmp.txt
curl -XDELETE http://${es_cluster_ip}:9200/devbacklog-$del_date >> tmp.txt
curl -XDELETE http://${es_cluster_ip}:9200/testlog-$del_date >> tmp.txt
curl -XDELETE http://${es_cluster_ip}:9200/testbacklog-$del_date >> tmp.txt
curl -XDELETE http://${es_cluster_ip}:9200/uatbacklog-$del_date >> tmp.txt
curl -XDELETE http://${es_cluster_ip}:9200/uatlog-$del_date >> tmp.txt
curl -XDELETE http://${es_cluster_ip}:9200/prodlog-$del_date >> tmp.txt
curl -XDELETE http://${es_cluster_ip}:9200/prodbacklog-$del_date >> tmp.txt
curl -XDELETE http://${es_cluster_ip}:9200/alllogback-$del_date >> tmp.txt
fi
}
# get the date in all index
curl -XGET http://${es_cluster_ip}:9200/_cat/indices|awk -F " " '{print $3}' | egrep "[0-9]*\.[0-9]*\.[0-9]*" |awk -F "-" '{print $NF}' | awk -F "." '{print $((NF-2))"-"$((NF-1))"-"$NF}' | sort | uniq | while read LINE
do
delete_index ${LINE}
done