案例说明:
KingbaseES V8R3集群failover时两个cluster都会触发,但只有一个cluster会调用脚本去执行真正的切换流程,另一个有对应的打印,但不会调用脚本,只是走相关的流程去修改状态。 那么你需要确认一点,到底是主库还是备库的cluster在执行脚本,关键字:failover_stream.sh ,看cluster.log中有没有这个的名称。本案例通过主库系统宕机,分析了failover 切换过程。
适用版本: KingbaseES V8R3
集群节点状态信息:
- # 集群节点状态信息
- TEST=# show pool_nodes;
- node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay
- ---------+---------------+-------+--------+-----------+---------+------------+-------------------+-------------------
- 0 | 192.168.1.101 | 54321 | up | 0.500000 | primary | 1 | true | 0
- 1 | 192.168.1.102 | 54321 | up | 0.500000 | standby | 0 | false | 0
- (2 rows)
对于KingbaseES V8R3集群failover切换的分析:
- 1. 可以通过主备库的cluster.log,获取整个主备切换的过程。
- 2. 从failover.log日志中获取执行failover过程,注意只有执行failover_stream.sh的节点才会产生failover.log。
- 3. recovery.log日志记录了原主库failover完成后,从主库recovery为新备库的过程。
一、Failover切换过程分析
关闭主库系统:[root@node101 ~]# shutdown -h now
备库cluster.log日志分析:(主库系统关机,备库调用failover_stream.sh执行failover切换)
1、主备库处于正常状态时,备库收到主库的heartbeat。
- 2022-08-30 15:38:03: pid 17612: DETAIL: the last heartbeat from "192.168.1.101:9999" received 0 seconds ago
- 2022-08-30 15:38:06: pid 17612: LOG: watchdog checking if kingbasecluster is alive using heartbeat
- 2022-08-30 15:38:06: pid 17612: DETAIL: the last heartbeat from "192.168.1.101:9999" received 0 seconds ago
2、备库kingbasecluster检测到主库集群服务和数据库服务shutdown,kingbasecluster发生主备切换,当前的kingbasecluster由standby切换为master。
- 2022-08-30 15:38:09: pid 17585: LOG: remote node "192.168.1.101:9999 Linux node101" is shutting down
- 2022-08-30 15:38:09: pid 17585: LOG: watchdog cluster has lost the coordinator node
- 2022-08-30 15:38:09: pid 17585: LOG: unassigning the remote node "192.168.1.101:9999 Linux node101" from watchdog cluster master
- 2022-08-30 15:38:09: pid 17585: LOG: We have lost the cluster master node "192.168.1.101:9999 Linux node101"
- 2022-08-30 15:38:09: pid 17585: LOG: watchdog node state changed from [STANDBY] to [JOINING]
- 2022-08-30 15:38:09: pid 17612: LOG: watchdog checking if kingbasecluster is alive using heartbeat
- 2022-08-30 15:38:09: pid 17612: DETAIL: the last heartbeat from "192.168.1.101:9999" received 1 seconds ago
- 2022-08-30 15:38:10: pid 17547: ERROR: failed to authenticate
- 2022-08-30 15:38:10: pid 17547: DETAIL: the database system is shutting down
- 2022-08-30 15:38:12: pid 17612: LOG: watchdog checking if kingbasecluster is alive using heartbeat
- 2022-08-30 15:38:12: pid 17612: DETAIL: the last heartbeat from "192.168.1.101:9999" received 4 seconds ago
3、备库kingbasecluster对主库的数据库服务根据:health_check_max_retries * health_check_retry_delay配置尝试连接。
- 2022-08-30 15:38:13: pid 17547: LOG: health checking retry count 1
- 2022-08-30 15:38:13: pid 17585: LOG: watchdog node state changed from [JOINING] to [INITIALIZING]
- 2022-08-30 15:38:14: pid 17585: LOG: I am the only alive node in the watchdog cluster
- 2022-08-30 15:38:14: pid 17585: HINT: skipping stand for coordinator state
- 2022-08-30 15:38:14: pid 17585: LOG: watchdog node state changed from [INITIALIZING] to [MASTER]
- 2022-08-30 15:38:14: pid 17585: LOG: I am announcing my self as master/coordinator watchdog node
- 2022-08-30 15:38:15: pid 17612: LOG: watchdog checking if kingbasecluster is alive using heartbeat
- 2022-08-30 15:38:15: pid 17612: DETAIL: the last heartbeat from "192.168.1.101:9999" received 7 seconds ago
- 2022-08-30 15:38:18: pid 17585: LOG: I am the cluster leader node
- 2022-08-30 15:38:18: pid 17585: DETAIL: our declare coordinator message is accepted by all nodes
- 2022-08-30 15:38:18: pid 17585: LOG: setting the local node "192.168.1.102:9999 Linux node102" as watchdog cluster master
4、备库开启升级进程(escalation process),cluster vip漂移到新的master节点。
- 2022-08-30 15:38:18: pid 17585: LOG: I am the cluster leader node. Starting escalation process
- 2022-08-30 15:38:18: pid 17585: LOG: escalation process started with PID:22088
- 2022-08-30 15:38:18: pid 17547: LOG: connect_inet_domain_socket: select() interrupted by certain signal. retrying...
- 2022-08-30 15:38:18: pid 22088: LOG: watchdog: escalation started
- 2022-08-30 15:38:18: pid 17612: LOG: watchdog checking if kingbasecluster is alive using heartbeat
- 2022-08-30 15:38:18: pid 17612: DETAIL: the last heartbeat from "192.168.1.101:9999" received 10 seconds ago
- 2022-08-30 15:38:21: pid 22088: LOG: successfully acquired the delegate IP:"192.168.1.201"
- 2022-08-30 15:38:21: pid 22088: DETAIL: 'if_up_cmd' returned with success
- .......
5、 备库对原主库数据库服务的连接次数超过阈值后,将原主库的数据库服务设置为down状态。
- 2022-08-30 15:39:20: pid 17547: LOG: health checking retry count 10
- 2022-08-30 15:39:22: pid 17612: LOG: watchdog checking if kingbasecluster is alive using heartbeat
- 2022-08-30 15:39:22: pid 17612: DETAIL: the last heartbeat from "192.168.1.101:9999" received 73 seconds ago
- 2022-08-30 15:39:23: pid 17547: LOG: failed to connect to kingbase server on "192.168.1.101:54321", getsockopt() detected error "
- No route to host"
- 2022-08-30 15:39:23: pid 17547: ERROR: failed to make persistent db connection
- 2022-08-30 15:39:23: pid 17547: DETAIL: connection to host:"192.168.1.101:54321" failed
- 2022-08-30 15:39:23: pid 17547: LOG: setting backend node 0 status to NODE DOWN
6、 收到本节点kingbasecluster发出的failover request,并获取到failover lock。
- 2022-08-30 15:39:23: pid 17547: LOG: received degenerate backend request for node_id: 0 from pid [17547]
- 2022-08-30 15:39:23: pid 17585: LOG: new IPC connection received
- 2022-08-30 15:39:23: pid 17585: LOG: watchdog received the failover command from local kingbasecluster on IPC interface
- 2022-08-30 15:39:23: pid 17585: LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from local
- kingbasecluster on IPC interface
- 2022-08-30 15:39:23: pid 17585: LOG: I am the only kingbasecluster node in the watchdog cluster
- 2022-08-30 15:39:23: pid 17585: DETAIL: no need to propagate the failover command [DEGENERATE_BACKEND_REQUEST]
- 2022-08-30 15:39:23: pid 17585: LOG: new IPC connection received
- 2022-08-30 15:39:23: pid 17585: LOG: received the failover command lock request from local kingbasecluster on IPC interface
- 2022-08-30 15:39:23: pid 17585: LOG: local kingbasecluster node "192.168.1.102:9999 Linux node102" is requesting to become a lock
- holder for failover ID: 0
7、 获取到failover lock后,开始执行failover_stream.sh进行failover的切换。
- 2022-08-30 15:39:23: pid 17585: LOG: local kingbasecluster node "192.168.1.102:9999 Linux node102" is the lock holder
- 2022-08-30 15:39:23: pid 17547: LOG: starting degeneration. shutdown host 192.168.1.101(54321)
- 2022-08-30 15:39:23: pid 17547: LOG: Restart all children
- 2022-08-30 15:39:23: pid 17547: LOG: execute command: /home/kingbase/cluster/R3HA/kingbasecluster/bin/failover_stream.sh 192.168.
- 1.102 0 0 192.168.1.101 192.168.1.101 1 0 /home/kingbase/cluster/R3HA/db/data
- 2022-08-30 15:39:25: pid 17612: LOG: watchdog checking if kingbasecluster is alive using heartbeat
- 2022-08-30 15:39:25: pid 17612: DETAIL: the last heartbeat from "192.168.1.101:9999" received 76 seconds ago
- ssh: connect to host 192.168.1.101 port 22: No route to host
- 2022-08-30 15:39:31: pid 17612: LOG: watchdog checking if kingbasecluster is alive using heartbeat
- 2022-08-30 15:39:31: pid 17612: DETAIL: the last heartbeat from "192.168.1.101:9999" received 82 seconds ago
- 2022-08-30 15:39:33: pid 17585: LOG: new IPC connection received
- 2022-08-30 15:39:33: pid 17585: LOG: received the failover command lock request from local kingbasecluster on IPC interface
- 2022-08-30 15:39:33: pid 17585: LOG: local kingbasecluster node "192.168.1.102:9999 Linux node102" is requesting to release [FAIL
- OVER] lock for failover ID 0
8、执行failover切换,并查找新的primary。(当前备库切换为primary)
- 2022-08-30 15:39:33: pid 17547: LOG: find_primary_node_repeatedly: waiting for finding a primary node
- 2022-08-30 15:39:33: pid 17547: LOG: find_primary_node: checking backend no 0
- 2022-08-30 15:39:33: pid 17547: LOG: find_primary_node: checking backend no 1
- 2022-08-30 15:39:33: pid 17547: LOG: find_primary_node: primary node id is 1
- 2022-08-30 15:39:33: pid 17585: LOG: new IPC connection received
- 2022-08-30 15:39:33: pid 17585: LOG: received the failover command lock request from local kingbasecluster on IPC interface
- 2022-08-30 15:39:33: pid 17585: LOG: local kingbasecluster node "192.168.1.102:9999 Linux node102" is requesting to release [FOLL
- OW MASTER] lock for failover ID 0
- 2022-08-30 15:39:33: pid 17585: LOG: local kingbasecluster node "192.168.1.102:9999 Linux node102" has released the [FOLLOW MASTE
- R] lock for failover ID 0
- 2022-08-30 15:39:33: pid 17547: LOG: failover: set new primary node: 1
- 2022-08-30 15:39:33: pid 17547: LOG: failover: set new master node: 1
- 2022-08-30 15:39:33: pid 21501: LOG: worker process received restart request
- 2022-08-30 15:39:33: pid 17585: LOG: new IPC connection received
- 2022-08-30 15:39:33: pid 17585: LOG: received the failover command lock request from local kingbasecluster on IPC interface
- 2022-08-30 15:39:33: pid 17585: LOG: local kingbasecluster node "192.168.1.102:9999 Linux node102" is requesting to resign from a
- lock holder for failover ID 0
- 2022-08-30 15:39:33: pid 17585: LOG: local kingbasecluster node "192.168.1.102:9999 Linux node102" has resigned from the lock hol
- der
- failover done. shutdown host 192.168.1.101(54321)2022-08-30 15:39:33: pid 17547: LOG: failover done. shutdown host 192.168.1.101(
- 54321)
备库Failover.log分析:
1、执行failover_stream.sh进行failover切换
- -----------------2022-08-30 15:39:23 failover beging---------------------------------------
- ----failover-stats is %H = hostname of the new master node [192.168.1.102], %P = old primary node id [0], %d = node id[0], %h = host name [192.168.1.101], %O = old primary host[192.168.1.101] %m = new master node id [1], %M = old master node id [0], %D = database cluster path [/home/kingbase/cluster/R3HA/db/data].
-
- Tips:
- 1)new_master : 192.168.1.102
- 2) new_master_id: 1
- 3) old_primary: 192.168.1.101
- 4) old_primary_id : 0
2、ping 网关测试网络连通性
- ----ping trust ip
- ping trust ip 192.168.1.1 success
3、ssh连接远程原主库,执行关闭数据库服务和释放db vip;并将db vip加载到当前主库节点。
- ----determine whether the faulty db is master or standby
- master down, let 192.168.1.102 become new primary.....
- 2022-08-30 15:39:25 del old primary VIP on 192.168.1.101
- ssh connect host:192.168.1.101 failed, can not stop old primary db and del old primary db vip
- 2022-08-30 15:39:29 add VIP on 192.168.1.102
- ADD VIP NOW AT 2022-08-30 15:39:30 ON enp0s3
- execute: [/sbin/ip addr add 192.168.1.200/24 dev enp0s3 label enp0s3:2]
- execute: /home/kingbase/cluster/R3HA/db/bin/arping -U 192.168.1.200 -I enp0s3 -w 1
- ARPING 192.168.1.200 from 192.168.1.200 enp0s3
- Sent 1 probes (1 broadcast(s))
- Received 0 response(s)
- 2022-08-30 15:39:31 promote begin...let 192.168.1.102 become master
4、通过连接db vip执行“select 33333”测试当前数据库服务状态,如果正常,执行promote,提升为新主库,将同步模式设为async。
- check db if is alive
- ksql "port=54321 user=SUPERMANAGER_V8ADMIN dbname=TEST connect_timeout=10" -c "select 33333;"
- 2022-08-30 15:39:31 kingbase is ok , to prepare execute promote
- execute promote
- server promoting
- check db if is alive after promote
- ksql "port=54321 user=SUPERMANAGER_V8ADMIN dbname=TEST connect_timeout=10" -c "select 33333;"
- 2022-08-30 15:39:32 after execute promote , kingbase status is ok.
- after execute promote, kingbase is ok.
- 2022-08-30 15:39:32 sync to async
- ALTER SYSTEM
- SYS_RELOAD_CONF
- -----------------
- t
- (1 row)
5、连接新主库数据库服务,如果成功,在新主库上执行checkpoint,checkpoint成功,则failover切换完成。
- 2022-08-30 15:39:32 make checkpoint
- check the db to see if it is alive
- ksql "port=54321 user=SUPERMANAGER_V8ADMIN dbname=TEST connect_timeout=10" -c "select 33333;"
- 2022-08-30 15:39:32 kingbase is ok , to prepare execute checkpoint
- execute checkpoint
- CHECKPOINT
- check the db to see if it is alive after execute checkpoint
- ksql "port=54321 user=SUPERMANAGER_V8ADMIN dbname=TEST connect_timeout=10" -c "select 33333;"
- 2022-08-30 15:39:33 after execute checkpoint, kingbase is ok.
- after execute checkpoint, kingbase is ok.
- -----------------2022-08-30 15:39:33 failover end---------------------------------------
二、原主库recovery为新备库
1、在原主库生成recovery.conf文件(系统通过crond计划任务,调用network_rewind.sh脚本自动recovery为备库)[kingbase@node101 data]$ cp ../etc/recovery.done ./recovery.conf
2、查看recovery.log日志信息
- [kingbase@node101 log]$ more recovery.log
- ---------------------------------------------------------------------
1)开启recovery,检测存储读写功能是否正常。
- 2022-08-30 16:42:01 recover beging...
- my pid is 11927,officially began to perform recovery
- 2022-08-30 16:42:01 check read/write on mount point
- 2022-08-30 16:42:01 check read/write on mount point (1 / 6).
- 2022-08-30 16:42:01 stat the directory of the mount point "/home/kingbase/cluster/R3HA/db/data" ...
- 2022-08-30 16:42:01 stat the directory of the mount point "/home/kingbase/cluster/R3HA/db/data" ... OK
- 2022-08-30 16:42:01 create/write the file "/home/kingbase/cluster/R3HA/db/data/rw_status_file_574940445" ...
- 2022-08-30 16:42:01 create/write the file "/home/kingbase/cluster/R3HA/db/data/rw_status_file_574940445" ... OK
- 2022-08-30 16:42:01 stat the file "/home/kingbase/cluster/R3HA/db/data/rw_status_file_574940445" ...
- 2022-08-30 16:42:01 stat the file "/home/kingbase/cluster/R3HA/db/data/rw_status_file_574940445" ... OK
- 2022-08-30 16:42:01 read the file "/home/kingbase/cluster/R3HA/db/data/rw_status_file_574940445" ...
- 2022-08-30 16:42:01 read the file "/home/kingbase/cluster/R3HA/db/data/rw_status_file_574940445" ... OK
- 2022-08-30 16:42:01 delete the file "/home/kingbase/cluster/R3HA/db/data/rw_status_file_574940445" ...
- 2022-08-30 16:42:01 delete the file "/home/kingbase/cluster/R3HA/db/data/rw_status_file_574940445" ... OK
- 2022-08-30 16:42:01 success to check read/write on mount point (1 / 6).
- 2022-08-30 16:42:01 check read/write on mount point ... ok
2)ping 网关检测网络的连通性
- 2022-08-30 16:42:01 check if the network is ok
- ping trust ip 192.168.1.1 success ping times :[3], success times:[2]
3)判断当前节点role(master OR standby),如果是standby节点,并判断是否需要recovery。( I,m status is [1] (1 is down), I will be in recovery.)
- determine if i am master or standby
- node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay
- ---------+---------------+-------+--------+-----------+---------+------------+-------------------+-------------------
- 0 | 192.168.1.101 | 54321 | down | 0.500000 | standby | 0 | false | 0
- 1 | 192.168.1.102 | 54321 | up | 0.500000 | primary | 0 | true | 0
- (2 rows)
-
- i am standby in cluster,determine if recovery is needed
- 2022-08-30 16:42:03 now will del vip [192.168.1.200/24]
- but no 192.168.1.200/24 on my DEV, nothing to do with del
- ksql execute success,but node:node101 does not have correct streaming(or catchup) replication ,will retry ,retry times:[1/10]
- ......
- ksql execute success,but node:node101 does not have correct streaming(or catchup) replication ,will retry ,retry times:[10/10]
- primary node/Im node status is changed, primary ip[192.168.1.102], recovery.conf NEED_CHANGE [0] (0 is need ), I,m status is [1] (1 is down), I will be in recovery.
- node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay
- ---------+---------------+-------+--------+-----------+---------+------------+-------------------+-------------------
- 0 | 192.168.1.101 | 54321 | down | 0.500000 | standby | 0 | false | 0
- 1 | 192.168.1.102 | 54321 | up | 0.500000 | primary | 0 | true | 0
- (2 rows)
4)如果需要做recovery,执行sys_rewind。
- if recover node up, let it down , for rewind
- 2022-08-30 16:42:34 sys_rewind...
- sys_rewind --target-data=/home/kingbase/cluster/R3HA/db/data --source-server="host=192.168.1.102 port=54321 user=SUPERMANAGER_V8ADMIN dbname=TEST"
- datadir_source = /home/kingbase/cluster/R3HA/db/data
- rewinding from last common checkpoint at 0/66000028 on timeline 15
- find last common checkpoint start time from 2022-08-30 16:42:34.727711 CST to 2022-08-30 16:42:34.839545 CST, in "0.111834" seconds.
- reading source file list
- reading target file list
- reading WAL in target
- Rewind datadir file from source
- update the control file: minRecoveryPoint is '0/68024E80', minRecoveryPointTLI is '16', and database state is 'in archive recovery'
- rewind start wal location 0/66000028 (file 0000000F0000000000000066), end wal location 0/68024E80 (file 000000100000000000000068). time from 2022-08-30 16:42:36.727711 CST to 2022-08-30 16:42:37.201678 CST, in "2.473967" seconds.
- Done!
5)sys_rewind成功,修改集群配置文件。
- sed conf change #synchronous_standby_names
- 2022-08-30 16:42:39 file operate
- cp recovery.conf...
- change recovery.conf ip -> primary.ip
- 2022-08-30 16:42:39 no need change recovery.conf, primary node is 192.168.1.102
6)启动新备库数据库服务,删除原复制槽并重建。备库数据库服务启动成功后,配置同步为sync。
- delete pid file if exist
- del the replication_slots if exis
- drop the slot [slot_node101].
- drop the slot [slot_node102].
- 2022-08-30 16:42:39 start up the kingbase...
- waiting for server to start....LOG: redirecting log output to logging collector process
- HINT: Future log output will appear in directory "/home/kingbase/cluster/R3HA/db/data/sys_log".
- done
- server started
- ksql "port=54321 user=SUPERMANAGER_V8ADMIN dbname=TEST connect_timeout=10" -c "select 33333;"
- SYS_CREATE_PHYSICAL_REPLICATION_SLOT
- --------------------------------------
- (slot_node101,)
- (1 row)
-
- 2022-08-30 16:42:41 create the slot [slot_node101] success.
- SYS_CREATE_PHYSICAL_REPLICATION_SLOT
- --------------------------------------
- (slot_node102,)
- (1 row)
-
- 2022-08-30 16:42:41 create the slot [slot_node102] success.
- 2022-08-30 16:42:41 start up standby successful!
- cluster is sync cluster.
- SYNC RECOVER MODE ...
- 2022-08-30 16:42:41 remote primary node change sync
- ALTER SYSTEM
- SYS_RELOAD_CONF
- -----------------
- t
- (1 row)
-
- SYNC RECOVER MODE DONE
7)执行pcp_attach_node将当前节点加入到集群。
- 2022-08-30 16:42:44 attach pool...
- IM Node is 0, will try [pcp_attach_node -U kingbase -W MTIzNDU2 -h 192.168.1.201 -n 0]
- pcp_attach_node -- Command Successful
- node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay
- ---------+---------------+-------+--------+-----------+---------+------------+-------------------+-------------------
- 0 | 192.168.1.101 | 54321 | up | 0.500000 | standby | 0 | false | 0
- 1 | 192.168.1.102 | 54321 | up | 0.500000 | primary | 0 | true | 0
- (2 rows)
-
- 2022-08-30 16:42:45 attach end..
- recovery success,exit script with success
三、总结
通过对原主库执行系统重启服务,如果备库在阈值范围内尝试连接主库数据库服务失败,其failover切换过程和主库系统shutdown是一致的。