1. Deploy the MinIO environment
docker pull minio/minio
Volume mapping between the host and the container:
| Host path | Container path |
|---|---|
| /data/minio/data | /data |
| /data/minio/config | /root/.minio |
Start the container:
- docker run -p 9000:9000 -p 9090:9090 --name minio \
- -d --restart=always \
- -e "MINIO_ACCESS_KEY=admin" \
- -e "MINIO_SECRET_KEY=admin123456" \
- -v /data/minio/data:/data \
- -v /data/minio/config:/root/.minio \
- minio/minio \
- server /data --console-address ":9090"
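Once the container is up, it can help to confirm that the server answers before moving on. The following is a minimal sketch, assuming MinIO is reachable on localhost:9000 as published by the command above and that its liveness endpoint `/minio/health/live` is enabled. The helper names `health_url` and `is_minio_live` are illustrative and not part of any MinIO client.

```python
import urllib.request


def health_url(host: str = "localhost", port: int = 9000) -> str:
    """Build the URL of MinIO's liveness probe for the given API address."""
    return f"http://{host}:{port}/minio/health/live"


def is_minio_live(host: str = "localhost", port: int = 9000,
                  timeout: float = 3.0) -> bool:
    """Return True if the liveness probe answers with HTTP 200, else False."""
    try:
        with urllib.request.urlopen(health_url(host, port), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Connection refused, timeout, or an HTTP error status: treat as not ready.
        return False
```

Calling `is_minio_live()` on the Docker host should return `True` once the server has started; from another container, pass the MinIO container's address instead of `localhost`.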

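2. Prepare the StarRocks environment
Follow the Docker deployment guide in the StarRocks Docs: "Deploy StarRocks with Docker" (deploy_with_docker).
3. Hands-on: querying files on MinIO and full-database backup
Generate a Parquet file with Python: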
- xiuchenggong@xiuchengdeMacBook-Pro ~ % python3
- Python 3.9.10 (main, Jan 15 2022, 11:48:04)
- [Clang 13.0.0 (clang-1300.0.29.3)] on darwin
- Type "help", "copyright", "credits" or "license" for more information.
- >>> import pandas as pd;
- >>> pf = pd.read_csv("/Users/xiuchenggong/test.csv")
- >>> pf.to_parquet("/Users/xiuchenggong/test.parquet",engine="pyarrow")
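The session above assumes that test.csv already exists; its contents are not shown. As a hedged sketch, the snippet below writes a CSV with the two columns (`name`, `id`) and the two rows that the query in section 3.1 returns. The function name `write_sample_csv` is hypothetical, and the real file may have contained different data.

```python
import csv
from pathlib import Path


def write_sample_csv(path: Path) -> Path:
    """Write a two-column CSV (name, id) matching the table queried later."""
    # Rows assumed from the query result shown in section 3.1.
    rows = [("gongxiucheng", 1), ("gongzixi", 2)]
    with path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["name", "id"])  # header row becomes the column names
        writer.writerows(rows)
    return path
```

For example, `write_sample_csv(Path("test.csv"))` produces a file that `pd.read_csv` and `to_parquet` can then convert as in the session above.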
3.1 Query the Parquet data stored on MinIO (Parquet and ORC files can be queried):
- StarRocks > CREATE EXTERNAL TABLE table_1
- -> (
- -> name string,
- -> id int
- -> )
- -> ENGINE=file
- -> PROPERTIES
- -> (
- -> "path" = "s3a://starrocks/test.parquet",
- -> "format" = "parquet",
- -> "aws.s3.enable_ssl" = "false",
- -> "aws.s3.enable_path_style_access" = "true",
- -> "aws.s3.endpoint" = "172.17.0.3:9000",
- -> "aws.s3.access_key" = "0OnU8H9YwTNTJUBC2r7F",
- -> "aws.s3.secret_key" = "vFQ3fIcs90woUS4200L0BYfxelE86iF6cI4vVzYC"
- -> );
- Query OK, 0 rows affected (0.009 sec)
-
- StarRocks > show tables;
- +-------------------+
- | Tables_in_test_db |
- +-------------------+
- | table_1 |
- | test1 |
- | test2 |
- +-------------------+
- 3 rows in set (0.003 sec)
- StarRocks > select * from table_1;
- +--------------+------+
- | name | id |
- +--------------+------+
- | gongxiucheng | 1 |
- | gongzixi | 2 |
- +--------------+------+
- 2 rows in set (0.073 sec)
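The statement above embeds the access key and secret key directly. One possible alternative is to render the DDL from a small script that reads the credentials from the environment. This is only a sketch: the function names `quote` and `file_table_ddl` and the environment variable names `MINIO_AK` and `MINIO_SK` are hypothetical, and the escaping assumes MySQL-style backslash escapes in string literals.

```python
import os


def quote(value: str) -> str:
    """Double-quote a property value, escaping backslashes and quotes."""
    # Assumes MySQL-style backslash escaping inside string literals.
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def file_table_ddl(table: str, columns: dict, path: str, endpoint: str,
                   access_key: str, secret_key: str, fmt: str = "parquet") -> str:
    """Render a CREATE EXTERNAL TABLE ... ENGINE=file statement for MinIO."""
    props = {
        "path": path,
        "format": fmt,
        "aws.s3.enable_ssl": "false",
        "aws.s3.enable_path_style_access": "true",
        "aws.s3.endpoint": endpoint,
        "aws.s3.access_key": access_key,
        "aws.s3.secret_key": secret_key,
    }
    cols = ",\n".join(f"    {name} {typ}" for name, typ in columns.items())
    body = ",\n".join(f"    {quote(k)} = {quote(v)}" for k, v in props.items())
    return (f"CREATE EXTERNAL TABLE {table}\n(\n{cols}\n)\n"
            f"ENGINE=file\nPROPERTIES\n(\n{body}\n);")


ddl = file_table_ddl(
    table="table_1",
    columns={"name": "string", "id": "int"},
    path="s3a://starrocks/test.parquet",
    endpoint="172.17.0.3:9000",
    # MINIO_AK / MINIO_SK are hypothetical variable names chosen for this sketch.
    access_key=os.environ.get("MINIO_AK", "<access-key>"),
    secret_key=os.environ.get("MINIO_SK", "<secret-key>"),
)
print(ddl)
```

The printed statement can be pasted into the StarRocks client; only the property names that appear in the original statement are used.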
3.2 Full backup to MinIO (external tables cannot be backed up)
Create the repository:
- StarRocks > create repository starrocks_backup_01
- -> with broker
- -> on location "s3a://starrocks"
- -> properties(
- -> "aws.s3.enable_ssl" = "false",
- -> "aws.s3.enable_path_style_access" = "true",
- -> "aws.s3.access_key" = "0OnU8H9YwTNTJUBC2r7F",
- -> "aws.s3.secret_key" = "vFQ3fIcs90woUS4200L0BYfxelE86iF6cI4vVzYC",
- -> "aws.s3.endpoint" = "172.17.0.3:9000"
- -> )
- -> ;
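Start the backup. The external table table_1 is dropped first, because external tables cannot be backed up: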
- StarRocks > drop table table_1;
- Query OK, 0 rows affected (0.010 sec)
-
- StarRocks > backup snapshot test_db.snapshot_minio to starrocks_backup_01 properties("type"="full");
- Query OK, 0 rows affected (0.024 sec)
-
-
- StarRocks > show backup\G;
- *************************** 1. row ***************************
- JobId: 11047
- SnapshotName: snapshot_minio
- DbName: test_db
- State: SAVE_META
- BackupObjs: [test_db.test1], [test_db.test2]
- CreateTime: 2023-09-05 01:58:42
- SnapshotFinishedTime: 2023-09-05 01:58:48
- UploadFinishedTime: 2023-09-05 01:58:54
- FinishedTime: NULL
- UnfinishedTasks:
- Progress:
- TaskErrMsg:
- Status: [OK]
- Timeout: 86400
- 1 row in set (0.003 sec)
-
- ERROR: No query specified
-
-
- StarRocks > show backup\G;
- *************************** 1. row ***************************
- JobId: 11047
- SnapshotName: snapshot_minio
- DbName: test_db
- State: FINISHED
- BackupObjs: [test_db.test1], [test_db.test2]
- CreateTime: 2023-09-05 01:58:42
- SnapshotFinishedTime: 2023-09-05 01:58:48
- UploadFinishedTime: 2023-09-05 01:58:54
- FinishedTime: 2023-09-05 01:59:00
- UnfinishedTasks:
- Progress:
- TaskErrMsg:
- Status: [OK]
- Timeout: 86400
- 1 row in set (0.004 sec)
-
- ERROR: No query specified
-
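The backup runs asynchronously, so the job state has to be checked repeatedly until it reaches FINISHED. A minimal sketch of how the vertical output above could be checked from a script, assuming the text of `show backup\G` has already been captured as a string; `parse_vertical` and `backup_finished` are illustrative names, and the sample is an abbreviated copy of the output shown above.

```python
import re


def parse_vertical(output: str) -> list:
    """Parse mysql-client vertical output into one dict per row."""
    rows = []
    for raw in output.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("*"):
            rows.append({})  # a "*** 1. row ***" separator starts a new record
        elif re.match(r"\d+ rows? in set", line):
            break  # footer line: nothing after it belongs to the result
        elif rows and ":" in line:
            # Split on the first colon only, so timestamps keep their colons.
            key, _, value = line.partition(":")
            rows[-1][key.strip()] = value.strip()
    return rows


def backup_finished(output: str) -> bool:
    """True when the first job listed reports State FINISHED."""
    rows = parse_vertical(output)
    return bool(rows) and rows[0].get("State") == "FINISHED"


sample = """\
*************************** 1. row ***************************
               JobId: 11047
        SnapshotName: snapshot_minio
              DbName: test_db
               State: FINISHED
          BackupObjs: [test_db.test1], [test_db.test2]
          CreateTime: 2023-09-05 01:58:42
        FinishedTime: 2023-09-05 01:59:00
              Status: [OK]
1 row in set (0.004 sec)
"""
print(backup_finished(sample))  # True
```

Stopping at the "rows in set" footer also keeps the trailing "ERROR: No query specified" line out of the parsed record.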
Check the files on MinIO:
The backup succeeded.