Postgresql源码（57）HOT更新为什么性能差距那么大？

相关
《Postgresql源码（52）bitmapset分析RelationGetIndexAttrBitmap》
《Postgresql源码（57）HOT更新为什么性能差距那么大？》

背景

PostgreSQL多版本实现中，堆页面是多版本，索引页面是单版本。如果更新一条堆页面，新元组其实是直接append到表上的，旧元组标记不可见。
同时索引也需要增加一条，但索引这一条是可以优化掉的，方法就是新旧表元组组成链表，用旧索引指向旧元组，再从旧元组指向新元组。
这就是PG的HOT更新机制，在8.3版本后引入这个优化，对多版本下的写放大问题有了较大程度的优化。

1 HOT与非HOT更新性能差距

结果

测试环境是kvm虚拟机16C，磁盘性能很差。（不严谨测试）
结果差距大的原因主要是因为两种测试场景IO都打满了，IO影响占比升高造成了较大的差距
- IO都打满时，HOT都在更新表页面，NOHOT还要更新很多索引页面。
后面分析下什么时候会走HOT？（2.2的第六步）
- 满足三个条件：
  - BUFFER不变
  - hot_attrs_checked：PAGE没满
  - modified_attrs：有索引列被修改了

	TPS
HOT	6035.041159
NOHOT	2447.563057

IO状况截取
在这里插入图片描述

nohot

transaction type: ./test_nohot.sql
scaling factor: 1
query mode: prepared
number of clients: 32
number of threads: 32
duration: 120 s
number of transactions actually processed: 293701
latency average = 13.073 ms
latency stddev = 31.841 ms
initial connection time = 5.961 ms
tps = 2447.563057 (without initial connection time)
statement latencies in milliseconds:
         0.001  \set id random(1,10000000)  
        13.064  update t0 set n1 = random()*100 where id = :id;
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16

hot

transaction type: ./test_hot.sql
scaling factor: 1
query mode: prepared
number of clients: 32
number of threads: 32
duration: 120 s
number of transactions actually processed: 724192
latency average = 5.302 ms
latency stddev = 14.019 ms
initial connection time = 6.242 ms
tps = 6035.041159 (without initial connection time)
statement latencies in milliseconds:
         0.001  \set id random(1,10000000)  
         5.321  update t0 set n2 = random()*100 where id = :id;
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16

测试CASE

drop table t0;
create table t0(id int primary key, n1 int, n2 int);
create index idx_t0_id_n1 on t0(id, n1);
insert into t0 select generate_series(1,10000000);


vi test_nohot.sql  
\set id random(1,10000000)  
update t0 set n1 = random()*100 where id = :id;

pgbench -M prepared -n -r -P 1 -f ./test_nohot.sql -c 32 -j 32 -T 120  


vi test_hot.sql  
\set id random(1,10000000)  
update t0 set n2 = random()*100 where id = :id;

pgbench -M prepared -n -r -P 1 -f ./test_hot.sql -c 32 -j 32 -T 120  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18

2 update流程

从ExecUpdate函数进入

更新前执行：触发器、外键触发器、FDW等
检查新元组约束ExecConstraints
执行更新table_tuple_update
插入新索引元组ExecInsertIndexTuples（HOT更新不需要更新索引）

ExecUpdate
  /* 更新前触发器、外键触发器、FDW等 */
  /* BEFORE ROW UPDATE Triggers */
  /* INSTEAD OF ROW UPDATE Triggers */
  /* update in foreign table: let the FDW do it */
  ...
  ExecConstraints
  table_tuple_update
  /* 可能走HOT不需要新索引项 */
  ExecInsertIndexTuples
  /* AFTER ROW UPDATE Triggers */
  /* Process RETURNING if present */
  ...
  return
1
2
3
4
5
6
7
8
9
10
11
12
13
14

2.1 update流程细节（HOT）

再展开上面流程

更新前：在ExecutorState内存中重新组装元组ExecMaterializeSlot
更新前：拿到所有相关索引ExecOpenIndices
更新前：执行：触发器、外键触发器、FDW等
检查：新元组约束ExecConstraints
执行更新：table_tuple_update
1. 返回是否需要更新索引，如果是HOT则不需要更新索引
2. heap_update执行具体更新流程（下面继续展开分析）
插入新索引元组ExecInsertIndexTuples（HOT不需要更新索引）

ExecUpdate
  ExecMaterializeSlot
    tts_buffer_heap_materialize
      // 切换到ExecutorState内存上下文
      // 在新内存池中把元组拼出来
      heap_form_tuple
  /* 更新前触发器、外键触发器、FDW等 */
  ...
  /* 拿到相关索引oid list */
  ExecOpenIndices
  /* BEFORE ROW UPDATE Triggers */
  /* INSTEAD OF ROW UPDATE Triggers */
  /* update in foreign table: let the FDW do it */
  ...
  ExecConstraints
  table_tuple_update
    heapam_tuple_update
      // 执行更新
      heap_update
      // 是否需要更新索引？HOT？
      *update_indexes = result == TM_Ok && !HeapTupleIsHeapOnly(tuple)
  /* 可能走HOT不需要新索引项 */
  if (... && update_indexes)
    ExecInsertIndexTuples
      // 对每个相关索引都需要构造新的index tuple
      for (i = 0; i < numIndices; i++)
        FormIndexDatum
        index_insert
        // checkUnique
  /* AFTER ROW UPDATE Triggers */
  /* Process RETURNING if present */
  ...
  return
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33

2.2 heap_update函数执行分析（HOT）

在展开上面流程中的heap_update函数：

执行：

drop table t0;
create table t0(id int primary key, n1 int, n2 int);
create index idx_t0_id_n1 on t0(id, n1);
insert into t0 select generate_series(1,2), generate_series(1,2)+100, 888;
update t0 set n2 = 0 where id = 2;
1
2
3
4
5

第一步：bitmap配置，找到所有索引列的位置

这里的位图是什么参考这一篇：《Postgresql源码（52）bitmapset分析RelationGetIndexAttrBitmap》

生成三个位图记录索引位置：hot_attrs、key_attrs、id_attrs

          n1  id
            \/
            ||
hot_attrs = 1100000000  ：哪些列上有索引？
key_attrs =  100000000  ：哪些列上有唯一索引？
id_attrs  =  100000000  ：复制标识
1
2
3
4
5
6

第二步：如果PAGE没满，可能走HOT，构造bitmap interesting_attrs

配置：hot_attrs_checked=true

                  n1  id
                    \/
                    ||
interesting_attrs = 1100000000
1
2
3
4

第三步：锁页面

第四步：哪些索引列被修改了？

HeapDetermineColumnsInfo函数构造modified_attrs位图，这里没有索引列被修改

modified_attrs = 0000000000
1

这个位图记录的是有哪些索引列被修改了，注意是索引列

第五步：旧元组是否可见？

HeapTupleSatisfiesUpdate（MVCC）

第六步：决定是否走HOT

满足三个条件：

BUFFER不变
hot_attrs_checked：上面第二步：PAGE没满
modified_attrs：上面第四步：有索引列被修改了

if (newbuf == buffer)
    if (hot_attrs_checked && !bms_overlap(modified_attrs, hot_attrs))
        use_hot_update = true;
1
2
3

第七步：开始HOT更新

一、加标志位：
- 老元组：HEAP_HOT_UPDATED、HEAP_ONLY_TUPLE
- 新元组：HEAP_ONLY_TUPLE
二、配置旧元组的头
三、插入新元组

if (use_hot_update) 
    HeapTupleSetHotUpdated(&oldtup)
        (tup)->t_infomask2 |= HEAP_HOT_UPDATED
    HeapTupleSetHeapOnly(heaptup)
         (tup)->t_infomask2 |= HEAP_ONLY_TUPLE
    HeapTupleSetHeapOnly(newtup)
        (tup)->t_infomask2 |= HEAP_ONLY_TUPLE
1
2
3
4
5
6
7

第八步：记录XLOG

log_heap_update

相关阅读:
Scala面向对象的特征
 分享下最近基于Avalonia UI和MAUI写跨平台时间管理工具的体验
 第一类曲面积分：曲面微元dσ与其投影面积微元dxdy之间的关系推导
 高忆管理：尾盘拉升是好事还是坏事？
关于写一个read_file的很多个为什么&如何用
 中国新能源汽车行业2“十四五”前景规划及未来发展趋向预测报告022-2028年新版
 EOCR-3E420,3EZ,3DE电机保护器与变频器配合使用的方法
 以太坊 layer2: optimism 源码学习 (一)
app简单控件了解——常用布局——相对布局RelativeLayout
笔记-《RabbitMQ实战指南》
原文地址：https://blog.csdn.net/jackgo73/article/details/125515715