官方建议Elasticsearch设置堆内存为32G,因为Elasticsearch是Java语言实现的程序,所以:
1)这部分堆内存,首先得包括Elasticsearch从字节码加载验证解析到内存的部分,如局部变量存储虚拟机栈,实例对象存储堆空间等;
2)新的文档写入原理是,首先被添加到内存索引缓存中,然后写入到一个基于磁盘的段;
3)查询时,如果用到filter过滤查询,会有查询结果缓存;
4)当针对一个索引或多个索引运行搜索请求时,每个涉及的分片都会在本地执行搜索并将其本地结果返回到协调节点,协调节点将这些分片级结果组合成“全局”结果集。分片级请求缓存模块将本地结果缓存在每个分片上;
5)当对一个字段进行排序、聚合,或某些过滤,比如地理位置过滤、某些与字段相关的脚本计算等操作,就会需要Field Data Cache
6)ES 底层存储采用 Lucene,Lucene 引入排索引的二级索引 FST,原理上可以理解为前缀树,加速查询
新的文档写入原理是,首先被添加到内存索引缓存中,然后写入到一个基于磁盘的段,这部分内存为Indexing Buffer
indices.memory.index_buffer_size:接受百分比或字节大小值。它默认为10%
indices.memory.min_index_buffer_size:如果index_buffer_size指定为百分比,则此设置可用于指定绝对最小值。默认为48mb.
indices.memory.max_index_buffer_size:如果index_buffer_size指定为百分比,则此设置可用于指定绝对最大值。默认为无界。
https://www.elastic.co/guide/cn/elasticsearch/guide/current/dynamic-indices.html
https://www.elastic.co/guide/en/elasticsearch/reference/8.10/indexing-buffer.html
https://www.elastic.co/guide/en/elasticsearch/reference/8.10/query-cache.html
查看缓存使用情况:
GET /_stats /request_cache
{
"index": {
"uuid": "X8PM2Eq_Tk2ZVpzjYcJ0CQ",
"primaries": {
"request_cache": {
"memory_size_in_bytes": 97792,
"evictions": 0,
"hit_count": 171814,
"miss_count": 20344
}
},
"total": {
"request_cache": {
"memory_size_in_bytes": 186416,
"evictions": 0,
"hit_count": 241527,
"miss_count": 26841
}
}
}
}
https://www.elastic.co/guide/en/elasticsearch/reference/8.10/shard-request-cache.html
https://www.elastic.co/guide/en/elasticsearch/reference/8.10/modules-fielddata.html
https://www.elastic.co/guide/en/elasticsearch/reference/8.10/text.html#fielddata-mapping-param
https://www.elastic.co/guide/en/elasticsearch/reference/8.10/eager-global-ordinals.html
Segments Cache(segments FST数据的缓存),为了加速查询,FST 永驻堆内内存,无法被 GC 回收。该部分内存无法设置大小,减少data node上的segment memory占用,有三种方法:
https://armsword.com/2021/03/26/es-memory-management/
/_cat/nodes?v&h=name,node*,heap*,ram*,fielddataMemory,queryCacheMemory,requestCacheMemory,segmentsMemory
name id node.role heap.current heap.percent heap.max ram.current ram.percent ram.max fielddataMemory queryCacheMemory requestCacheMemory segmentsMemory
name id node.role heap.current heap.percent heap.max ram.current ram.percent ram.max fielddataMemory queryCacheMemory requestCacheMemory segmentsMemory
es-7-master-1 7Doz dim 6.1gb 38 16gb 20.4gb 64 32gb 0b 3mb 205.9kb 484.2kb
es-7-master-0 mxrv dim 8.2gb 51 16gb 20gb 63 32gb 0b 2.9mb 208.9kb 476.3kb
es-7-master-2 uiZk dim 9.5gb 59 16gb 20gb 63 32gb 0b 2.6mb 209.5kb 516.2kb
https://armsword.com/2021/03/26/es-memory-management/
上面的提到的内存都是JVM管理的,ES能控制,即On-heap内存,ES还有Off-heap内存,由Lucene管理,负责缓存倒排索引(Segment Memory)。Lucene 中的倒排索引 segments 存储在文件中,为提高访问速度,都会把它加载到内存中,从而提高 Lucene 性能。
https://armsword.com/2021/03/26/es-memory-management/
GET /cn/_nodes/stats?filter_path=nodes.*.jvm.mem.pools.old
{
"nodes": {
"7DozU2vhRc2xg2RGrdTFWA": {
"jvm": {
"mem": {
"pools": {
"old": {
"used_in_bytes": 2199187968,
"max_in_bytes": 17179869184,
"peak_used_in_bytes": 3432122880,
"peak_max_in_bytes": 17179869184
}
}
}
}
}
}
}
JVM Memory Pressure = used_in_bytes / max_in_bytes
https://www.elastic.co/cn/blog/managing-and-troubleshooting-elasticsearch-memory
熟悉的话,通过直方图看谁占用内存最大
否则通过dominator_tree查看,可以看到调用链
但不熟悉源码,这种分析感觉作用不大
感觉作用亦不大,安心使用es提供的接口吧
https://elasticsearch.cn/article/14432