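Normally, data stored with multiple replicas is easy to repair: even if one or two BEs are deleted, Doris's own recovery mechanism brings the data back. Some tables, however, may have been created without multiple replicas, and those take real work to recover.
Case: the data on the master FE node was deleted by mistake and the whole cluster went down. With a single FE, you can use the cloud provider's snapshot feature to restore the metadata storage to a specified point in time. With multiple FE nodes where the master was deleted, use the procedure below.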
Back up the metadata directory first. If the FE cannot start, delete the ROLE and VERSION files from the directory below, but never the image files. If the FE starts normally, nothing needs to be deleted.
/apache-doris-fe-1.2.0-bin-x86_64/doris-meta/image
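The backup-then-delete step can be scripted so that the image files are never touched by accident. The following is a minimal Python sketch, not an official tool: the function name `reset_fe_identity` and the backup location are assumptions chosen for illustration.

```python
import shutil
from pathlib import Path


def reset_fe_identity(image_dir, backup_dir):
    """Back up the FE image directory, then remove only ROLE and VERSION.

    Hypothetical helper: the name and arguments are illustrative.
    """
    image_dir = Path(image_dir)
    backup_dir = Path(backup_dir)
    # copytree raises if backup_dir already exists, so an earlier
    # backup is never silently overwritten.
    shutil.copytree(image_dir, backup_dir)
    removed = []
    for name in ("ROLE", "VERSION"):
        target = image_dir / name
        if target.exists():
            target.unlink()
            removed.append(name)
    # The image.* files are deliberately left in place.
    return removed
```

Run it only when the FE fails to start, for example `reset_fe_identity("/apache-doris-fe-1.2.0-bin-x86_64/doris-meta/image", "/tmp/image-backup")`, where the backup path is a placeholder.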
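Then add the setting below to fe.conf. Once the FE has recovered far enough that you can connect and query the metadata, stop the service and remove the setting. After recovery, edit the newly generated VERSION file and change its cluster id back to the previous one, because it must match the cluster id held by the BEs.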
metadata_failure_recovery=true
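Putting the old cluster id back into the regenerated VERSION file is a small text edit. The sketch below assumes that VERSION is a plain `key=value` file containing a `clusterId` entry; check this against your own backup before relying on it. The function names are hypothetical.

```python
from pathlib import Path

CLUSTER_KEY = "clusterId"  # assumed key name; confirm it in your VERSION file


def read_cluster_id(version_file):
    """Return the cluster id stored in a VERSION file."""
    for line in Path(version_file).read_text().splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == CLUSTER_KEY:
            return value.strip()
    raise ValueError(f"no {CLUSTER_KEY} entry in {version_file}")


def write_cluster_id(version_file, cluster_id):
    """Rewrite the cluster id line, leaving every other line unchanged."""
    path = Path(version_file)
    lines = []
    replaced = False
    for line in path.read_text().splitlines():
        key, sep, _ = line.partition("=")
        if sep and key.strip() == CLUSTER_KEY:
            lines.append(f"{CLUSTER_KEY}={cluster_id}")
            replaced = True
        else:
            lines.append(line)
    if not replaced:
        raise ValueError(f"no {CLUSTER_KEY} entry in {version_file}")
    path.write_text("\n".join(lines) + "\n")
```

With the FE stopped, `write_cluster_id(new_version, read_cluster_id(backed_up_version))` copies the id from the backup into the new file.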
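If the failure is instead caused by FE metadata falling out of sync and the cluster has several FEs, drop the affected node and add it back. With a single FE, use the method above: delete ROLE and VERSION from the FE metadata and recover.
- # Run in the SQL client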
- ALTER SYSTEM DROP FOLLOWER "100.200.0.36:9010";
- ALTER SYSTEM ADD FOLLOWER "100.200.0.36:9010";
-
- # Run on the node being added. 100.200.8.117 is the master node; the joining node must use the same ports as the rest of the cluster
- ./bin/start_fe.sh --helper 100.200.8.117:9010 --daemon
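Once the FE metadata is restored, the BEs can be started. If a BE is managed by supervisord, stop supervisord first. Otherwise its auto-restart competes with your manual starts for the lock and leaves a LOCK file in the meta and stream_load directories under the BE storage directory; delete the LOCK files and start the BE again.
Some BEs may have lost sst files. You can restore the data to a specified time with the cloud disk snapshot feature, but queries may then fail with an error like this:
ERROR 1105 (HY000): errCode = 2, detailMessage = (100.200.8.186)[INTERNAL_ERROR]failed to initialize storage reader. tablet=6708126.992547307.aa48977d2d69366b-180d241750a968ae, res=[INTERNAL_ERROR]fail to find path in version_graph. spec_version: 0-132, backend=100.200.8.186
One option is to overwrite the broken tablets with empty tablets using the command below. The data in those tablets is lost, nearly a day's worth in this case (see the official Doris documentation for details).
ADMIN SET FRONTEND CONFIG ("recover_with_empty_tablet" = "true");
When the repair is done, set it back to false:
ADMIN SET FRONTEND CONFIG ("recover_with_empty_tablet" = "false");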
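The version_graph error shown above carries what the repair needs: the tablet id, the schema hash, the requested version range, and the backend. Here is a minimal Python sketch for pulling those fields out of a pasted message. The function name and the regular expression are illustrative assumptions based only on the error samples in this post, not an official Doris message format.

```python
import re

# Pattern inferred from the sample errors in this post (an assumption).
# The tablet field looks like <tablet_id>.<schema_hash>.<tablet_uid>.
ERROR_PATTERN = re.compile(
    r"tablet=(?P<tablet_id>\d+)\.(?P<schema_hash>\d+)\.\S+?,"
    r".*?spec_version: (?P<start>\d+)-(?P<end>\d+),"
    r"\s*backend=(?P<backend>[\d.]+)"
)


def parse_version_graph_error(message):
    """Extract tablet id, schema hash, version range and backend host."""
    match = ERROR_PATTERN.search(message)
    if match is None:
        raise ValueError("message does not look like a version_graph error")
    return {
        "tablet_id": int(match.group("tablet_id")),
        "schema_hash": int(match.group("schema_hash")),
        "start_version": int(match.group("start")),
        "end_version": int(match.group("end")),
        "backend": match.group("backend"),
    }
```

The `tablet_id` is what `show tablet` expects, and `end_version` is the version the FE believes the replica should have.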
Repair steps:
- ============================================================================
- # Query the failing table
-
- mysql> select * from databasename.tablename limit 10;
- ERROR 1105 (HY000): errCode = 2, detailMessage = (100.20.8.16)[INTERNAL_ERROR]failed to initialize storage reader. tablet=20289334.1750657277.ab4aa3a697bb5bec-ea5fe4f921c9f8ba, res=[INTERNAL_ERROR]fail to find path in version_graph. spec_version: 0-48974, backend=100.20.8.16
-
- # Look up the tablet details, then run the command shown in the DetailCmd column
- mysql> show tablet 20289334;
- +------------------------------------+----------------+----------------+----------------+--------+---------+-------------+---------+--------+-------+-----------------------------------------------------------------------+
- | DbName | TableName | PartitionName | IndexName | DbId | TableId | PartitionId | IndexId | IsSync | Order | DetailCmd |
- +------------------------------------+----------------+----------------+----------------+--------+---------+-------------+---------+--------+-------+-----------------------------------------------------------------------+
- | default_cluster:databasename | tablename | tablename | tablename | 993914 | 6091980 | 20289269 | 6091981 | true | 16 | SHOW PROC '/dbs/993914/6091980/partitions/20289269/6091981/20289334'; |
- +------------------------------------+----------------+----------------+----------------+--------+---------+-------------+---------+--------+-------+-----------------------------------------------------------------------+
- 1 row in set (0.00 sec)
-
-
- mysql> SHOW PROC '/dbs/993914/6091980/partitions/20289269/6091981/20289334';
- +-----------+-----------+---------+-------------------+------------------+---------------+------------+---------------+----------------+----------+--------+-------+--------------+----------------------+---------------------------------------------------+-----------------------------------------------------------------+
- | ReplicaId | BackendId | Version | LstSuccessVersion | LstFailedVersion | LstFailedTime | SchemaHash | LocalDataSize | RemoteDataSize | RowCount | State | IsBad | VersionCount | PathHash | MetaUrl | CompactionStatus |
- +-----------+-----------+---------+-------------------+------------------+---------------+------------+---------------+----------------+----------+--------+-------+--------------+----------------------+---------------------------------------------------+-----------------------------------------------------------------+
- | 20289335 | 992447 | 48974 | 48974 | -1 | NULL | 1750657277 | 23244725 | 0 | 758834 | NORMAL | false | 9 | -1695231754850110287 | http://10.240.8.73:8040/api/meta/header/20289334 | http://10.240.8.73:8040/api/compaction/show?tablet_id=20289334 |
- | 20289337 | 992604 | 48974 | 48974 | -1 | NULL | 1750657277 | 23244725 | 0 | 758834 | NORMAL | false | 9 | -7776024449396720965 | http://10.240.8.9:8040/api/meta/header/20289334 | http://10.240.8.9:8040/api/compaction/show?tablet_id=20289334 |
- | 29024516 | 993073 | 48974 | 48974 | -1 | NULL | 1750657277 | 23220943 | 0 | 758485 | NORMAL | false | 7 | -7256810246008465985 | http://100.20.8.16:8040/api/meta/header/20289334 | http://100.20.8.16:8040/api/compaction/show?tablet_id=20289334 |
- +-----------+-----------+---------+-------------------+------------------+---------------+------------+---------------+----------------+----------+--------+-------+--------------+----------------------+---------------------------------------------------+-----------------------------------------------------------------+
-
- # LstSuccessVersion (48974) is the last version the FE recorded as successful, i.e. the version the FE has but the BE does not
-
- # Check the rowset versions on the BE. The FE's completed version is 48974, but the last rowset present on the BE is [48410-48665], so versions 48666 through 48974 must be padded (this fills in what the FE has but the BE lacks)
-
- curl http://100.20.8.16:8040/api/compaction/show?tablet_id=20289334
-
-
- [root@doris1 ~]# curl http://100.20.8.16:8040/api/compaction/show?tablet_id=20289334
- {
- "cumulative policy type": "SIZE_BASED",
- "cumulative point": 2,
- "last cumulative failure time": "2023-10-19 18:57:10.751",
- "last base failure time": "1970-01-01 08:00:00.000",
- "last cumulative success time": "2023-10-19 18:57:10.751",
- "last base success time": "2023-10-19 18:57:10.751",
- "rowsets": [
- "[0-1] 0 DATA NONOVERLAPPING 0200000000b943b1624293f95b56fef5e5a0a7526615fbb0 0",
- "[2-8892] 1 DATA NONOVERLAPPING 0200000000b943b2624293f95b56fef5e5a0a7526615fbb0 16.68 MB",
- "[8893-44310] 1 DATA NONOVERLAPPING 0200000000fd69e8624293f95b56fef5e5a0a7526615fbb0 4.62 MB",
- "[44311-44940] 1 DATA NONOVERLAPPING 020000000102d9dd624293f95b56fef5e5a0a7526615fbb0 531.21 KB",
- "[44941-47519] 1 DATA NONOVERLAPPING 020000000118285f624293f95b56fef5e5a0a7526615fbb0 252.91 KB",
- "[47520-48409] 1 DATA NONOVERLAPPING 02000000011fd182624293f95b56fef5e5a0a7526615fbb0 64.41 KB",
- "[48410-48665] 1 DATA NONOVERLAPPING 020000000121faaf624293f95b56fef5e5a0a7526615fbb0 21.43 KB"
- ],
- "missing_rowsets": [],
- "stale_rowsets": [],
- "stale version path": []
- }
-
- # Pad the missing versions, 48666 through 48974, on that BE
- curl -X POST "http://100.20.8.16:8040/api/pad_rowset?tablet_id=20289334&start_version=48666&end_version=48974"
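Working out the range to pad by eye is error-prone when the rowset list is long. The sketch below derives the missing ranges from the `compaction/show` JSON and builds the matching `pad_rowset` URL. The function names are hypothetical; the sketch assumes each `rowsets` entry starts with `[start-end]` as in the output above and that the BE HTTP port is 8040 as in the examples.

```python
import json
from urllib.parse import urlencode


def rowset_ranges(compaction_show_json):
    """Parse the "[start-end] ..." entries of the rowsets array."""
    doc = json.loads(compaction_show_json)
    ranges = []
    for entry in doc["rowsets"]:
        span = entry.split("]", 1)[0].lstrip("[")
        start, end = span.split("-")
        ranges.append((int(start), int(end)))
    return sorted(ranges)


def missing_ranges(ranges, fe_version):
    """Return the version ranges absent on the BE, up to fe_version."""
    gaps = []
    expected = 0  # next version that should be covered by a rowset
    for start, end in ranges:
        if start > expected:
            gaps.append((expected, start - 1))
        expected = max(expected, end + 1)
    if expected <= fe_version:
        gaps.append((expected, fe_version))
    return gaps


def pad_rowset_url(be_host, tablet_id, start, end, http_port=8040):
    """Build the pad_rowset URL used in the curl command above."""
    query = urlencode(
        {"tablet_id": tablet_id, "start_version": start, "end_version": end}
    )
    return f"http://{be_host}:{http_port}/api/pad_rowset?{query}"
```

Pass the LstSuccessVersion value from `SHOW PROC` as `fe_version`, and review every returned range before sending the POST, since the padded versions contain no data.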
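- # Recover a dropped database
- RECOVER DATABASE example_db;
- # Recover a dropped table
- RECOVER TABLE table_name;
- # Force drop (skips the recycle bin, so the object cannot be recovered)
- drop xxx force;
- # View the trash
- show trash;
- # Clean the trash on a specific BE
- ADMIN CLEAN TRASH ON("be_ip:9050");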
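1. The number of BE nodes must be >= the replica count.
2. Change the replica count of existing partitions: ALTER TABLE example_db.mysql_table MODIFY PARTITION (*) SET ("replication_num" = "3");
3. For dynamic partitions, change the replica count of future partitions: ALTER TABLE example_db.mysql_table SET ("dynamic_partition.replication_num" = "3");
4. Non-partitioned table: ALTER TABLE example_db.mysql_table SET ("replication_num" = "3");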
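- ADMIN SHOW REPLICA STATUS FROM db_name.table_name;
-
- SHOW PARTITIONS FROM db_name.table_name;
Note: try these on a test table first and confirm that its state matches expectations. Check with SHOW PARTITIONS FROM and SHOW CREATE TABLE; SHOW PARTITIONS FROM reports the actual replica count.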
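The replica-count rules above can be captured in a small generator that refuses to emit statements when there are too few nodes. This is a hedged Python sketch: `replication_statements` and its parameters are hypothetical, and it only assembles the same ALTER TABLE statements listed above without connecting to Doris.

```python
def replication_statements(table, replicas, backend_count,
                           partitioned=True, dynamic=False):
    """Return the ALTER TABLE statements for a new replica count.

    Hypothetical helper: it only builds SQL strings for review.
    """
    # Rule 1: there must be at least as many nodes as replicas.
    if backend_count < replicas:
        raise ValueError(
            f"{replicas} replicas need at least {replicas} nodes, "
            f"only {backend_count} available"
        )
    if not partitioned:
        # Rule 4: non-partitioned table.
        return [f'ALTER TABLE {table} SET ("replication_num" = "{replicas}");']
    # Rule 2: existing partitions.
    statements = [
        f'ALTER TABLE {table} MODIFY PARTITION (*) '
        f'SET ("replication_num" = "{replicas}");'
    ]
    if dynamic:
        # Rule 3: partitions that dynamic partitioning creates later.
        statements.append(
            f'ALTER TABLE {table} '
            f'SET ("dynamic_partition.replication_num" = "{replicas}");'
        )
    return statements
```

For example, `replication_statements("example_db.mysql_table", 3, backend_count=3, dynamic=True)` yields the statements from items 2 and 3 for review before running them by hand.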