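Let's walk through the execution flow of full synchronization and partial synchronization in detail, from both the master's and the slave's point of view.
1. Full synchronization flow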
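First, the master. Once it has determined that a full sync is required, the master goes straight into the processing flow. Note that during a full sync, Redis drives the whole process through the slave's replication state. The steps are:
Set the current slave's replication state to SLAVE_STATE_WAIT_BGSAVE_START (waiting for the bgsave to start) and add it to the slave list;
If the current slave is the first slave, generate a new Replication ID, clear Replication ID2, and initialize the replication backlog;
Start the bgsave. Once the bgsave serving this slave has started, the slave's replication state changes to SLAVE_STATE_WAIT_BGSAVE_END. Three cases have to be considered:
If a child process is already running and the target is disk: reuse the in-progress bgsave and copy the buffered output state, then reply to the slave with a full sync whose offset is the replication offset recorded when that bgsave started;
If a child process is already running but the target is diskless: do not start a bgsave for now, and wait for the periodic check to trigger one;
If no child process is running: start a bgsave, then reply to the slave with a full sync whose offset is the master's current replication offset.
After starting the bgsave, the master goes back to serving other requests. When the bgsave succeeds, serverCron triggers backgroundSaveDoneHandler, which in turn calls updateSlavesWaitingBgsave. That function sends the rdb file, one slave after another, to every slave whose replication state is SLAVE_STATE_WAIT_BGSAVE_END. Before the rdb transfer starts, the slave's state is set to SLAVE_STATE_SEND_BULK; when the transfer completes, it is set to SLAVE_STATE_ONLINE.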
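The master-side state progression can be sketched as a small state machine. This is only a toy model for illustration: the state names mirror the SLAVE_STATE_* constants mentioned above, while `SlaveRecord`, `TRANSITIONS`, and the Python function `update_slaves_waiting_bgsave` are hypothetical names and not Redis code.

```python
from enum import Enum, auto


class SlaveState(Enum):
    WAIT_BGSAVE_START = auto()  # waiting for a bgsave to start
    WAIT_BGSAVE_END = auto()    # bgsave running, waiting for it to finish
    SEND_BULK = auto()          # rdb file is being transferred
    ONLINE = auto()             # transfer done, slave receives the command stream


# Transitions allowed during a full sync, in the order described in the text.
TRANSITIONS = {
    SlaveState.WAIT_BGSAVE_START: {SlaveState.WAIT_BGSAVE_END},
    SlaveState.WAIT_BGSAVE_END: {SlaveState.SEND_BULK},
    SlaveState.SEND_BULK: {SlaveState.ONLINE},
    SlaveState.ONLINE: set(),
}


class SlaveRecord:
    """Hypothetical per-slave bookkeeping kept by the master."""

    def __init__(self, name):
        self.name = name
        self.state = SlaveState.WAIT_BGSAVE_START

    def advance(self, new_state):
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"{self.state.name} -> {new_state.name} not allowed")
        self.state = new_state


def update_slaves_waiting_bgsave(slaves):
    """Toy analogue: after a successful bgsave, start the rdb transfer
    for every slave that was waiting for the bgsave to end."""
    started = []
    for slave in slaves:
        if slave.state is SlaveState.WAIT_BGSAVE_END:
            slave.advance(SlaveState.SEND_BULK)
            started.append(slave.name)
    return started
```

A slave that is still in WAIT_BGSAVE_START when the bgsave finishes is skipped in this sketch, which matches the idea that it has to wait for a later bgsave.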
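Now the slave side. In the preparation phase, the slave sends the psync command and waits for the master's reply. When it receives a full-sync reply, it starts the full synchronization flow, which goes as follows:
If the slave has cascaded slaves of its own, close all network connections to them and free the replication backlog;
Create a temporary rdb file, receive the file stream sent by the master, and write it to that file;
Stop any rdb or aof persistence that is currently in progress;
Rename the temporary file to the real rdb file and load the data from it;
Using the existing network connection to the master, create a client object on the slave, so that the master is treated as a client of the slave;
Set the slave's replication ID and create the replication backlog;
Enter the command propagation phase.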
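The "temporary file, then rename" step on the slave side can be illustrated with a minimal sketch. It is not how Redis implements it in C; `receive_rdb` and its parameters are hypothetical, and the incoming stream is modelled as an iterable of byte chunks rather than a socket.

```python
import os
import tempfile


def receive_rdb(chunks, final_path):
    """Write streamed chunks to a temp file, then atomically rename it.

    The temp file lives in the same directory as the target so that
    os.replace() is a rename within one filesystem.
    """
    directory = os.path.dirname(os.path.abspath(final_path))
    fd, tmp_path = tempfile.mkstemp(prefix="temp-", suffix=".rdb", dir=directory)
    received = 0
    try:
        with os.fdopen(fd, "wb") as tmp:
            for chunk in chunks:
                tmp.write(chunk)
                received += len(chunk)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the data is on disk before renaming
        os.replace(tmp_path, final_path)  # readers never see a half-written file
    except BaseException:
        # On any failure, remove the partial temp file and keep the old rdb.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return received
```

Writing to a temporary name first means that a transfer interrupted halfway leaves the previous rdb file untouched.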
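2. Partial synchronization flow
Compared with full synchronization, partial synchronization is much simpler.
Once the master decides that partial synchronization can be used, it does the following:
Set the slave's state to SLAVE_STATE_ONLINE and add the slave to the slave list;
Reply to the slave with the partial-sync response, "+CONTINUE replid";
Starting from the offset the slave requested, read the commands from the replication backlog and send them to the slave;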
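Serving a slave from the replication backlog can be sketched as follows. This is a simplified model under stated assumptions: the backlog is a bounded byte buffer, offsets count bytes starting at 1, and the requested offset is the next byte the slave wants. `ReplBacklog`, `feed`, and `read_from` are hypothetical names; Redis itself uses a circular buffer in C.

```python
class ReplBacklog:
    """Bounded buffer holding the most recent bytes of the replication stream."""

    def __init__(self, size, start_offset=0):
        self.size = size
        self.buf = bytearray()
        self.master_offset = start_offset  # offset of the last byte written

    def feed(self, data):
        """Append propagated command bytes, dropping the oldest on overflow."""
        self.buf += data
        self.master_offset += len(data)
        overflow = len(self.buf) - self.size
        if overflow > 0:
            del self.buf[:overflow]

    @property
    def first_offset(self):
        """Replication offset of the oldest byte still held in the backlog."""
        return self.master_offset - len(self.buf) + 1

    def read_from(self, psync_offset):
        """Return the bytes from psync_offset onwards, or None when the
        offset is outside the backlog and a full sync would be needed."""
        if psync_offset < self.first_offset or psync_offset > self.master_offset + 1:
            return None
        skip = psync_offset - self.first_offset
        return bytes(self.buf[skip:])
```

For example, with `size=8` and ten bytes fed, the backlog holds offsets 3 to 10, so a request for offset 9 returns the last two bytes while a request for offset 2 returns None.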
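After the slave receives the partial-sync reply, it does the following:
Check whether the master's replication ID has changed. If it has, update the replication ID and move the previous replication ID to replication ID2; if there are cascaded slaves, disconnect them so that they reconnect;
Using the existing connection to the master, create a client object on the slave and get ready to receive commands;
Receive the commands sent by the master and execute them;
Enter the command propagation phase.
Command propagation phase:
This is the same as described in the working-principle section, so it is not repeated here.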
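How a slave might react to the master's reply can be sketched like this. The two reply shapes, "+CONTINUE replid" and "+FULLRESYNC replid offset", are the ones that appear in this article and in the logs; `ReplicaState` and `handle_psync_reply` are hypothetical names, and the sketch leaves out everything except the ID bookkeeping.

```python
from dataclasses import dataclass


@dataclass
class ReplicaState:
    replid: str               # replication ID currently followed
    replid2: str = "0" * 40   # previous replication ID, all zeros if none
    offset: int = 0


def handle_psync_reply(state, reply):
    """Return (mode, id_changed); mode is 'continue' or 'fullresync'.

    When id_changed is True on a 'continue', cascaded slaves would have
    to be disconnected so that they reconnect and resynchronize.
    """
    parts = reply.strip().split()
    if not parts:
        raise ValueError("empty reply")
    if parts[0] == "+CONTINUE":
        new_id = parts[1] if len(parts) > 1 else state.replid
        changed = new_id != state.replid
        if changed:
            state.replid2 = state.replid  # keep the old ID as the secondary ID
            state.replid = new_id
        return ("continue", changed)
    if parts[0] == "+FULLRESYNC" and len(parts) == 3:
        state.replid = parts[1]
        state.offset = int(parts[2])
        return ("fullresync", True)
    raise ValueError(f"unexpected reply: {reply!r}")
```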
Test:
1. At 09:15, set up Redis master-slave replication:
127.0.0.1:6380> slaveof 10.153.119.7 6379
OK
2. Then watch how dump_6379.rdb and dump_6380.rdb change:
[redis@t3-dtpoc-dtpoc-web06 data]$ ls -ltr
total 16
-rw-r----- 1 redis redis 263 Oct 27 09:15 dump_6379.rdb
-rw------- 1 redis redis 263 Oct 27 09:15 dump_6380.rdb
-rw------- 1 redis redis 238 Oct 27 09:18 appendonly_6380.aof
-rw-r----- 1 redis redis 385 Oct 27 09:18 appendonly_6379.aof
3. Check the master and slave log files. Note the line "Background append only file rewriting started by pid 842304": after loading the RDB, the slave triggered an AOF rewrite.
master: redis_6379.log
842155:M 27 Oct 2023 09:15:50.915 * Replica 10.153.119.7:6380 asks for synchronization
842155:M 27 Oct 2023 09:15:50.915 * Partial resynchronization not accepted: Replication ID mismatch (Replica asked for 'bed977d96bdb0fd16fcad8ffa0604fa128eb0c73', my replication IDs are '92f871f75623f33a8c45c1fdd2df84a5fc7fa3dc' and '0000000000000000000000000000000000000000')
842155:M 27 Oct 2023 09:15:50.915 * Replication backlog created, my new replication IDs are 'f7722075b03f46b40a51d5cba0fea064966cfe0b' and '0000000000000000000000000000000000000000'
842155:M 27 Oct 2023 09:15:50.915 * Starting BGSAVE for SYNC with target: disk
842155:M 27 Oct 2023 09:15:50.916 * Background saving started by pid 842303
842303:C 27 Oct 2023 09:15:50.918 * DB saved on disk
842303:C 27 Oct 2023 09:15:50.920 * RDB: 0 MB of memory used by copy-on-write
842155:M 27 Oct 2023 09:15:51.013 * Background saving terminated with success
842155:M 27 Oct 2023 09:15:51.013 * Synchronization with replica 10.153.119.7:6380 succeeded
slave: redis_6380.log
546152:S 27 Oct 2023 09:15:50.484 * Before turning into a replica, using my own master parameters to synthesize a cached master: I may be able to synchronize with the new master with just a partial transfer.
546152:S 27 Oct 2023 09:15:50.484 * REPLICAOF 10.153.119.7:6379 enabled (user request from 'id=35 addr=127.0.0.1:40398 fd=8 name= age=65913 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=46 qbuf-free=32722 obl=0 oll=0 omem=0 events=r cmd=slaveof user=default')
546152:S 27 Oct 2023 09:15:50.914 * Connecting to MASTER 10.153.119.7:6379
546152:S 27 Oct 2023 09:15:50.914 * MASTER <-> REPLICA sync started
546152:S 27 Oct 2023 09:15:50.914 * Non blocking connect for SYNC fired the event.
546152:S 27 Oct 2023 09:15:50.914 * Master replied to PING, replication can continue...
546152:S 27 Oct 2023 09:15:50.915 * Trying a partial resynchronization (request bed977d96bdb0fd16fcad8ffa0604fa128eb0c73:113).
546152:S 27 Oct 2023 09:15:50.916 * Full resync from master: f7722075b03f46b40a51d5cba0fea064966cfe0b:0
546152:S 27 Oct 2023 09:15:50.916 * Discarding previously cached master state.
546152:S 27 Oct 2023 09:15:51.013 * MASTER <-> REPLICA sync: receiving 263 bytes from master to disk
546152:S 27 Oct 2023 09:15:51.013 * MASTER <-> REPLICA sync: Flushing old data
546152:S 27 Oct 2023 09:15:51.014 * MASTER <-> REPLICA sync: Loading DB in memory
546152:S 27 Oct 2023 09:15:51.014 * Loading RDB produced by version 6.0.5
546152:S 27 Oct 2023 09:15:51.014 * RDB age 1 seconds
546152:S 27 Oct 2023 09:15:51.014 * RDB memory usage when created 1.85 Mb
546152:S 27 Oct 2023 09:15:51.014 * MASTER <-> REPLICA sync: Finished with success
546152:S 27 Oct 2023 09:15:51.015 * Background append only file rewriting started by pid 842304
546152:S 27 Oct 2023 09:15:51.039 * AOF rewrite child asks to stop sending diffs.
842304:C 27 Oct 2023 09:15:51.039 * Parent agreed to stop sending diffs. Finalizing AOF...
842304:C 27 Oct 2023 09:15:51.039 * Concatenating 0.00 MB of AOF diff received from parent.
842304:C 27 Oct 2023 09:15:51.039 * SYNC append only file rewrite performed
842304:C 27 Oct 2023 09:15:51.041 * AOF rewrite: 0 MB of memory used by copy-on-write
546152:S 27 Oct 2023 09:15:51.116 * Background AOF rewrite terminated with success
546152:S 27 Oct 2023 09:15:51.117 * Residual parent diff successfully flushed to the rewritten AOF (0.00 MB)
546152:S 27 Oct 2023 09:15:51.117 * Background AOF rewrite finished successfully
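The master log in step 3 rejects the partial resync with "Replication ID mismatch" and falls back to a full sync. The check behind that message can be sketched as below, using the IDs and offset from the log. This is a simplified illustration only: `psync_decision` is a hypothetical function, the "no replication backlog" reason string and the value -1 for `second_replid_offset` are assumptions, and the backlog offset range check is left out.

```python
EMPTY_ID = "0" * 40


def psync_decision(req_id, req_offset, replid, replid2,
                   second_replid_offset, has_backlog):
    """Return (mode, reason) for a psync request; simplified sketch."""
    matches_primary = req_id == replid
    # The secondary ID is only valid up to the offset where the ID switched.
    matches_secondary = req_id == replid2 and req_offset <= second_replid_offset
    if not (matches_primary or matches_secondary):
        return ("full", "Replication ID mismatch")
    if not has_backlog:
        return ("full", "no replication backlog")
    return ("partial", "ok")


# Values taken from the master log above.
mode, reason = psync_decision(
    req_id="bed977d96bdb0fd16fcad8ffa0604fa128eb0c73",
    req_offset=113,
    replid="92f871f75623f33a8c45c1fdd2df84a5fc7fa3dc",
    replid2=EMPTY_ID,
    second_replid_offset=-1,  # assumed: no secondary ID in use
    has_backlog=False,        # the log shows the backlog being created afterwards
)
print(mode, reason)  # full Replication ID mismatch
```

The requested ID matches neither of the master's two IDs, so the sketch returns a full sync, in line with the "Full resync from master" line in the slave log.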