• ClickHouse 数据插入、更新与删除操作 SQL


    f5a290eafca4532cc7ba12bc985f2166.png


    1.1.数据****操作

    数据操作语言( DML,Data Manipulation Language) 用于在数据库表中添加(插入)、删除和修改(更新)数据。本节主要介绍ClickHouse中的数据插入、更新与删除操作。

    1.1.1.概述

    数据操作语言DML,包括SQL数据更改语句,它修改存储的数据,但不修改数据模型,例如数据库模式或数据库表结构。DML语言常见的语法模式如下:

    INSERT INTO ... VALUES ...
    UPDATE ... SET ... WHERE ...
    DELETE FROM ... WHERE ...
    
    • 1
    • 2
    • 3

    但是,在 ClickHouse 中,UPDATE 与DELETE 是设计在了 ALTER 指令体系中的。

    1.1.2.插入****数据

    一次一条数据插入

    INSERT INTO clickhouse_tutorial.user_tag (user_id, gender, age, active_level, date)
    VALUES (1, 'male', '18', '1', '2022-03-21');
    INSERT INTO clickhouse_tutorial.user_tag (user_id, gender, age, active_level, date)
    VALUES (2, 'female', '16', '2', '2022-03-21');
    
    • 1
    • 2
    • 3
    • 4

    一次多条数据插入

    INSERT INTO clickhouse_tutorial.user_tag (user_id, gender, age, active_level, date)
    VALUES (3, 'female', '20', '3', '2022-03-21'),
           (4, 'female', '22', '4', '2022-03-21');
    
    • 1
    • 2
    • 3

    插入SELECT查询返回数据

    INSERT INTO clickhouse_tutorial.user_tag
    (UserID, WatchID, EventTime, Sex, Age, OS, RegionID, RequestNum, EventDate)
    SELECT
        UserID,
        WatchID,
        EventTime,
        Sex,
        Age,
        OS,
        RegionID,
        RequestNum,
        EventDate
    FROM tutorial.hits_v1
     
    Query id: bfed9d12-b838-4125-9ee2-f61049bf0a56
     
    ↙ Progress: 30.91 million rows, 835.62 MB (6.66 million rows/s., 180.00 MB/s.) (0.0 CPU, 172.08 MB RAM)
    ████████████████████████████████████████████████ 99%
     
    Ok.
     
    0 rows in set. Elapsed: 7.552 sec. Processed 53.24 million rows, 1.37 GB (7.05 million rows/s., 180.96 MB/s.)
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12
    • 13
    • 14
    • 15
    • 16
    • 17
    • 18
    • 19
    • 20
    • 21
    • 22

    1.1.3.UPDATE更新数据

    语法

    ALTER TABLE [db.]table UPDATE column1 = expr1 [, ...] WHERE filter_expr
    
    • 1

    功能说明

    更新表数据。ClickHouse 中ALTER TABLE 前缀种语法与大多数其他支持 SQL 的数据库系统不同。它旨在表明,与 OLTP 数据库中的类似查询不同,这是一项并非为频繁使用而设计的繁重操作。ALTER 查询是通过一种称为“突变”(Mutation)的机制实现的。

    关于 ALTERTABLE … UPDATE命令,详细说明如下:

    1.WHERE子句中的过滤表达式filter_expr的值是UInt8类型,指定要更新的数据行。

    2.不支持更新用于计算主键或分区键的列。

    3.一个UPDATE操作可以包含多个用逗号分隔的命令,例如column1 = expr1,column2= expr2。 UPDATE操作数据处理是同步还是异步,由系统配置项 mutation_sync 设置,可取值为:

    0 - execute asynchronously。

    1 - wait current server。

    2 - wait all replicas if they exist。默认为0,异步后台进程执行,类似于 *MergeTree 表中的合并操作。

    4.对于 *MergeTree 表,Mutation操作通过重写整个数据Part来执行,Mutation不具备原子性。数据 Part一旦准备好就被MutationPart替换,并且在Mutation执行期间,SELECT查询结果中可以看到,来自已经变异Part的数据,以及来自尚未变异Part的数据。

    5.Mutation按照创建顺序排序,并按该顺序应用于每个MutationPart。

    6.在Mutation提交之前插入到表中的数据会被执行Mutation操作,提交之后插入的数据不会执行Mutation操作。

    7.Mutation操作不会阻塞数据插入。

    8.可以查看 system.mutations 表跟踪突变的进度。

    9.即使重新启动 ClickHouse 服务器,成功提交的变更仍将继续执行。一旦提交,就无法回滚突变。

    10.如果Mutation由于某种原因被卡住,可以使用 KILL MUTATION 查询取消它。

    11.已经完成Mutation的条目不会立即删除。保留条目的数量由 finished_mutations_to_keep 存储引擎参数确定。

    12.在系统配置表system.settings中,有关mutation的配置项如下:

    SELECT *
    FROM system.settings
    WHERE name LIKE '%mutation%'
    FORMAT Vertical
     
    Query id: 24f6ca70-7117-41c5-bc3e-dd6615d5ee6d
     
    Row 1:
    ──────
    name:        background_merges_mutations_concurrency_ratio
    value:       2
    changed:     0
    description: Ratio between a number of how many operations could be processed and a number threads to process them. Only has meaning at server startup.
    min:         ????
    max:         ????
    readonly:    0
    type:        Float
     
    Row 2:
    ──────
    name:        mutations_sync
    value:       0
    changed:     0
    description: Wait for synchronous execution of ALTER TABLE UPDATE/DELETE queries (mutations). 0 - execute asynchronously. 1 - wait current server. 2 - wait all replicas if they exist.
    min:         ????
    max:         ????
    readonly:    0
    type:        UInt64
     
    Row 3:
    ──────
    name:        allow_nondeterministic_mutations
    value:       0
    changed:     0
    description: Allow non-deterministic functions in ALTER UPDATE/ALTER DELETE statements
    min:         ????
    max:         ????
    readonly:    0
    type:        Bool
     
    3 rows in set. Elapsed: 0.003 sec.
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12
    • 13
    • 14
    • 15
    • 16
    • 17
    • 18
    • 19
    • 20
    • 21
    • 22
    • 23
    • 24
    • 25
    • 26
    • 27
    • 28
    • 29
    • 30
    • 31
    • 32
    • 33
    • 34
    • 35
    • 36
    • 37
    • 38
    • 39
    • 40
    • 41

    实例讲解

    1、更新之前的数据
    SELECT
        WatchID,
        JavaEnable,
        GoodEvent
    FROM tutorial.hits_v1
    WHERE WatchID = 7043438415214026105
     
    Query id: e0dc9ae5-8f24-48e5-a56d-d107afa1dfe3
     
    ┌─────────────WatchID─┬─JavaEnable─┬─GoodEvent─┐
    │ 7043438415214026105 │          1 │         1 │
    └─────────────────────┴────────────┴───────────┘
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12
    2**、**UPDATE 目标数据行
    ALTER TABLE tutorial.hits_v1
        UPDATE JavaEnable = 0, GoodEvent = 0 WHERE WatchID = 7043438415214026105
     
    Query id: 32ac8c6b-c78c-4a3c-ab72-29ed38fda687
     
    Ok.
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    3**、**查看更新结果
    SELECT
        WatchID,
        JavaEnable,
        GoodEvent
    FROM tutorial.hits_v1
    WHERE WatchID = 7043438415214026105
     
    Query id: 26591216-0cb4-4c6d-9b89-db9ff588d469
     
    ┌─────────────WatchID─┬─JavaEnable─┬─GoodEvent─┐
    │ 7043438415214026105 │          0 │         0 │
    └─────────────────────┴────────────┴───────────┘
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12

    可以看到,目标数据已经被更新。

    4**、**查看 Mutation 执行日志

    我们可以去服务器端查看 UPDATE 操作提交之后的日志:

    2022.03.31 03:06:47.898420 [ 6154029 ] {} TCP-Session: fd75399f-85bc-4d6d-a86b-d69fa899f6d9 Creating query context from session context, user_id: 94309d50-4f52-5250-31bd-74fecac179db, parent context user: default

    2022.03.31 03:06:47.898553 [ 6154029 ] {32ac8c6b-c78c-4a3c-ab72-29ed38fda687} executeQuery: (from 127.0.0.1:52757) ALTER TABLE tutorial.hits_v1 UPDATE JavaEnable = 0, GoodEvent=0 WHERE WatchID=7043438415214026105;

    2022.03.31 03:06:47.898600 [ 6154029 ] {32ac8c6b-c78c-4a3c-ab72-29ed38fda687} ContextAccess (default): Access granted: ALTER UPDATE(JavaEnable, GoodEvent) ON tutorial.hits_v1

    2022.03.31 03:06:47.900244 [ 6154029 ] {32ac8c6b-c78c-4a3c-ab72-29ed38fda687} tutorial.hits_v1 (0fa45bfe-c9ca-4df7-b7bf-7bd268a6225d): Added mutation: mutation_33.txt

    2022.03.31 03:06:47.900311 [ 6154029 ] {32ac8c6b-c78c-4a3c-ab72-29ed38fda687} MemoryTracker: Peak memory usage (for query): 0.00 B.

    2022.03.31 03:06:47.900365 [ 6154029 ] {} TCPHandler: Processed in 0.00198 sec.

    5**、**查看 Mutation 详情

    可以看到“Added mutation: mutation_33.txt”这样一行关键日志。去系统表system.mutations中查看mutation_33.txt的详情如下:

    SELECT *
    FROM system.mutations
    WHERE mutation_id = 'mutation_33.txt'
    FORMAT Vertical
     
    Query id: 3f7c995b-82e3-41fa-bf5b-fc50be8da824
     
    Row 1:
    ──────
    database:                   tutorial
    table:                      hits_v1
    mutation_id:                mutation_33.txt
    command:                    UPDATE JavaEnable = 0, GoodEvent = 0 WHERE WatchID = 7043438415214026105
    create_time:                2022-03-31 03:06:47
    block_numbers.partition_id: ['']
    block_numbers.number:       [33]
    parts_to_do_names:          []
    parts_to_do:                0
    is_done:                    1
    latest_failed_part:         
    latest_fail_time:           1970-01-01 08:00:00
    latest_fail_reason:         
     
    1 rows in set. Elapsed: 0.003 sec.
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12
    • 13
    • 14
    • 15
    • 16
    • 17
    • 18
    • 19
    • 20
    • 21
    • 22
    • 23
    • 24

    1.1.4.DELETE删除****数据

    语法

    ALTER TABLE [db.]table [ON CLUSTER cluster] DELETE WHERE filter_expr
    
    • 1

    功能说明

    删除表数据。

    实例讲解

    1**、**要删除的目标数据行
    SELECT
        WatchID,
        JavaEnable,
        GoodEvent
    FROM tutorial.hits_v1
    WHERE WatchID = 7043438415214026105
     
    Query id: 1444174c-142b-43ec-8ad6-54da1d871277
     
    ┌─────────────WatchID─┬─JavaEnable─┬─GoodEvent─┐
    │ 7043438415214026105 │          0 │         0 │
    └─────────────────────┴────────────┴───────────┘
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12
    2、执行删除操作
    ALTER TABLE tutorial.hits_v1
        DELETE WHERE WatchID = 7043438415214026105
    
    • 1
    • 2
    3**、**验证删除结果
    SELECT
        WatchID,
        JavaEnable,
        GoodEvent
    FROM tutorial.hits_v1
    WHERE WatchID = 7043438415214026105
     
    Query id: fd6a7536-4f2e-4fe3-8b03-bf1aed45302f
     
    Ok.
     
    0 rows in set. Elapsed: 0.018 sec. Processed 8.87 million rows, 70.99 MB (480.03 million rows/s., 3.84 GB/s.)
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12
    4、查看删除操作的服务端日志

    根据 query_id: a303b0d4-564d-48e5-9f32-c7e2df554b1f 查询ClickHouseServer 端日志如下:

    2022.03.31 03:15:10.905286 [ 6154029 ] {} TCP-Session: fd75399f-85bc-4d6d-a86b-d69fa899f6d9 Creating query context from session context, user_id: 94309d50-4f52-5250-31bd-74fecac179db, parent context user: default

    2022.03.31 03:15:10.905429 [ 6154029 ] {a303b0d4-564d-48e5-9f32-c7e2df554b1f} executeQuery: (from 127.0.0.1:52757) ALTER TABLE tutorial.hits_v1 DELETE WHERE WatchID=7043438415214026105;

    2022.03.31 03:15:10.905482 [ 6154029 ] {a303b0d4-564d-48e5-9f32-c7e2df554b1f} ContextAccess (default): Access granted: ALTER DELETE ON tutorial.hits_v1

    2022.03.31 03:15:10.906794 [ 6154029 ] {a303b0d4-564d-48e5-9f32-c7e2df554b1f} InterpreterSelectQuery: MergeTreeWhereOptimizer: condition “isZeroOrNull(WatchID = 7043438415214026105)” moved to PREWHERE

    2022.03.31 03:15:10.911000 [ 6154029 ] {a303b0d4-564d-48e5-9f32-c7e2df554b1f} tutorial.hits_v1 (0fa45bfe-c9ca-4df7-b7bf-7bd268a6225d): Added mutation: mutation_35.txt

    2022.03.31 03:15:10.911081 [ 6154029 ] {a303b0d4-564d-48e5-9f32-c7e2df554b1f} MemoryTracker: Peak memory usage (for query): 0.00 B.

    2022.03.31 03:15:10.911139 [ 6154029 ] {} TCPHandler: Processed in 0.005891 sec.

    可以看到关键日志:“Added mutation: mutation_35.txt”。

    另外,我们还看到了WHERE过滤自动转为PREWHERE优化的日志:“InterpreterSelectQuery: MergeTreeWhereOptimizer: condition “isZeroOrNull(WatchID = 7043438415214026105)” moved to PREWHERE”。

    5**、**查看 mutation 详情

    根据日志内容“Added mutation: mutation_35.txt”,查询 mutation_35.txt详情如下:

    SELECT *
    FROM system.mutations
    WHERE mutation_id = 'mutation_35.txt'
    FORMAT Vertical
     
    Query id: c7e42d42-feef-4a18-a46f-ea97aa9d7b7e
     
    Row 1:
    ──────
    database:                   tutorial
    table:                      hits_v1
    mutation_id:                mutation_35.txt
    command:                    DELETE WHERE WatchID = 7043438415214026105
    create_time:                2022-03-31 03:15:10
    block_numbers.partition_id: ['']
    block_numbers.number:       [35]
    parts_to_do_names:          []
    parts_to_do:                0
    is_done:                    1
    latest_failed_part:         
    latest_fail_time:           1970-01-01 08:00:00
    latest_fail_reason:         
     
    1 rows in set. Elapsed: 0.003 sec.
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12
    • 13
    • 14
    • 15
    • 16
    • 17
    • 18
    • 19
    • 20
    • 21
    • 22
    • 23
    • 24

    1.1.5.EXCHANGE****操作

    语法

    EXCHANGE TABLES|DICTIONARIES [db0.]name_A AND [db1.]name_B
    EXCHANGE DICTIONARIES [db0.]dict_A AND [db1.]dict_B
    
    • 1
    • 2

    功能说明

    1.EXCHANGE 操作以原子操作的方式交换两个表或字典的名称。

    2.EXCHANGE 操作也可以通过使用RENAME 操作来完成,区别是RENAME不是原子操作。 例如,RENAME重命名交换两张表new_table、old_table:

    RENAME TABLE new_table TO tmp, old_table TO new_table, tmp TO old_table;

    直接使用EXCHANGE 命令实现如下:

    EXCHANGE TABLES new_table AND old_table;

    3.EXCHANGE 底层是通过系统调用renameat2() 实现,LinuxKernel3.15+版本才支持。

    4.只有Atomic数据库引擎支持EXCHANGE操作。

    在Atomic 数据库引擎下创建的数据表,支持无锁原子CREATE/DROP/RENAME 操作,并且支持原子EXCHANGE TABLES A and B 直接交换两张表。

    应用场景

    EXCHANGE 命令可以实现 AB 两张表的快速切换。AB 表切换的使用场景很广泛,比如历史表归档、批量抽数、数据同步过程写临时表等等,都可以采用 AB 表切换的思路来实现。

    实例讲解

    1、创建两张表
    drop table if exists tutorial.hits_v2;
    drop table if exists tutorial.hits_v3;
    CREATE TABLE tutorial.hits_v2
    (
        `WatchID` UInt64,
        `UserID` UInt64,
        `JavaEnable` UInt8,
        `Title` String,
        `GoodEvent` Int16,
        `EventTime` DateTime,
        `EventDate` Date,
        `RequestNum` UInt32,
        `RequestTry` UInt8
    )
        ENGINE = MergeTree()
            PARTITION BY toYYYYMM(EventDate)
            ORDER BY (WatchID, EventDate, intHash32(UserID))
            SAMPLE BY intHash32(UserID);
     
    CREATE TABLE tutorial.hits_v3
    (
        `WatchID` UInt64,
        `UserID` UInt64,
        `JavaEnable` UInt8,
        `Title` String,
        `GoodEvent` Int16,
        `EventTime` DateTime,
        `EventDate` Date,
        `RequestNum` UInt32,
        `RequestTry` UInt8
    )
        ENGINE = MergeTree()
            PARTITION BY toYYYYMM(EventDate)
            ORDER BY (WatchID, EventDate, intHash32(UserID))
            SAMPLE BY intHash32(UserID);
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12
    • 13
    • 14
    • 15
    • 16
    • 17
    • 18
    • 19
    • 20
    • 21
    • 22
    • 23
    • 24
    • 25
    • 26
    • 27
    • 28
    • 29
    • 30
    • 31
    • 32
    • 33
    • 34
    • 35
    2**、**交换两张表的名字
    EXCHANGE TABLES tutorial.hits_v1 AND tutorial.hits_v2;
    
    • 1

    上面的命令,在笔者的 MacOS电脑上是报错的:

    Received exception from server (version 22.4.1):
    Code: 48. DB::Exception: Received from 127.0.0.1:9009. DB::Exception: RENAME EXCHANGE is not supported. (NOT_IMPLEMENTED)
    
    • 1
    • 2

    查看内核版本:

    $uname -a
    Darwin C02FJ0KMMD6V 20.3.0 Darwin Kernel Version 20.3.0: Thu Jan 21 00:07:06 PST 2021; root:xnu-7195.81.3~1/RELEASE_X86_64 x86_64
    
    • 1
    • 2

    可见, Mac OS Darwin Kernel Version 20.3.0内核版本,还不支持renameat2函数系统调用。

    EXCHANGE****实现原理

    ClickHouse源代码实现在renameat2.cpp中,相关代码行如下:

    #if !defined(__NR_renameat2)
        #if defined(__x86_64__)
            #define __NR_renameat2 316
        #elif defined(__aarch64__)
            #define __NR_renameat2 276
        #elif defined(__ppc64__)
            #define __NR_renameat2 357
        #elif defined(__riscv)
            #define __NR_renameat2 276
        #else
            #error "Unsupported architecture"
        #endif
    #endif
    ...
    static bool renameat2(const std::string & old_path, const std::string & new_path, int flags)
    {
        if (!supportsRenameat2())
            return false;
        if (old_path.empty() || new_path.empty())
            throw Exception(ErrorCodes::LOGICAL_ERROR, "Cannot rename {} to {}: path is empty", old_path, new_path);
     
        /// int olddirfd (ignored for absolute oldpath), const char *oldpath,
        /// int newdirfd (ignored for absolute newpath), const char *newpath,
        /// unsigned int flags
        if (0 == syscall(__NR_renameat2, AT_FDCWD, old_path.c_str(), AT_FDCWD, new_path.c_str(), flags))
            return true;
     
        /// EINVAL means that filesystem does not support one of the flags.
        /// It also may happen when running clickhouse in docker with Mac OS as a host OS.
        /// supportsRenameat2() with uname is not enough in this case, because virtualized Linux kernel is used.
        /// Other cases when EINVAL can be returned should never happen.
        if (errno == EINVAL)
            return false;
        /// We should never get ENOSYS on Linux, because we check kernel version in supportsRenameat2Impl().
        /// However, we can get in on WSL.
        if (errno == ENOSYS)
            return false;
     
        if (errno == EEXIST)
            throwFromErrno(fmt::format("Cannot rename {} to {} because the second path already exists", old_path, new_path), ErrorCodes::ATOMIC_RENAME_FAIL);
        if (errno == ENOENT)
            throwFromErrno(fmt::format("Paths cannot be exchanged because {} or {} does not exist", old_path, new_path), ErrorCodes::ATOMIC_RENAME_FAIL);
        throwFromErrnoWithPath(fmt::format("Cannot rename {} to {}", old_path, new_path), new_path, ErrorCodes::SYSTEM_ERROR);
    }
    ...
    bool supportsRenameat2()
    {
        static bool supports = supportsRenameat2Impl();
        return supports;
    }
    ...
    static bool supportsRenameat2Impl()
    {
        VersionNumber renameat2_minimal_version(3, 15, 0); // since linux kernel 3.15
        VersionNumber linux_version(Poco::Environment::osVersion());
        return linux_version >= renameat2_minimal_version;
    }
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12
    • 13
    • 14
    • 15
    • 16
    • 17
    • 18
    • 19
    • 20
    • 21
    • 22
    • 23
    • 24
    • 25
    • 26
    • 27
    • 28
    • 29
    • 30
    • 31
    • 32
    • 33
    • 34
    • 35
    • 36
    • 37
    • 38
    • 39
    • 40
    • 41
    • 42
    • 43
    • 44
    • 45
    • 46
    • 47
    • 48
    • 49
    • 50
    • 51
    • 52
    • 53
    • 54
    • 55
    • 56
    • 57

    1.1.6.OPTIMIZE****操作

    语法

    OPTIMIZE TABLE [db.]name [ON CLUSTER cluster]
    [PARTITION partition | PARTITION ID 'partition_id'] [FINAL]
    [DEDUPLICATE [BY expression]]
    
    • 1
    • 2
    • 3

    功能说明

    1.OPTIMIZE操作尝试为数据库表db.table ,初始化一个调度计划外的数据part合并操作。

    2.OPTIMIZE操作仅支持MergeTree系列表引擎、MaterializedView和 Buffer表引擎。

    3.当 OPTIMIZE 与 ReplicatedMergeTree 系列表引擎一起使用时,ClickHouse 创建一个用于合并的任务,并等待在所有副本上执行(replication_alter_partitions_sync=2)或者等待当前副本上执行(replication_alter_partitions_sync=1)。

    4.OPTIMIZE无法修复“Too many parts”错误。

    配置项replication_alter_partitions_sync说明

    系统配置项replication_alter_partitions_sync,用来指定等待副本分区变更操作(ALTER、OPTIMIZE 或 TRUNCATE等)执行的sync策略。

    replication_alter_partitions_sync可取值:

    0 — 不等待,直接异步执行。

    1 — 同步等待在当前副本上执行。

    2 — 同步等待在所有副本上执行。

    实例讲解

    执行OPTIMIZE TABLE命令:

    OPTIMIZE TABLE clickhouse_tutorial.user_tag

    Query id: c03335ce-fee8-42b6-bc36-3e23a8b59b29

    Ok.

    0 rows in set. Elapsed: 0.210 sec.

    查看 Server 端日志:

    2022.03.31 04:52:31.277919 [ 6156710 ] {} TCP-Session: 610442b9-37d5-49e0-821d-c25984fc7f41 Creating query context from session context, user_id: 94309d50-4f52-5250-31bd-74fecac179db, parent context user: default

    2022.03.31 04:52:31.278067 [ 6156710 ] {c03335ce-fee8-42b6-bc36-3e23a8b59b29} executeQuery: (from 127.0.0.1:53198) optimize table clickhouse_tutorial.user_tag;

    2022.03.31 04:52:31.278128 [ 6156710 ] {c03335ce-fee8-42b6-bc36-3e23a8b59b29} ContextAccess (default): Access granted: OPTIMIZE ON clickhouse_tutorial.user_tag

    2022.03.31 04:52:31.278239 [ 6156710 ] {c03335ce-fee8-42b6-bc36-3e23a8b59b29} clickhouse_tutorial.user_tag (edac2738-6508-46fa-aee7-7c5560f4539c) (MergerMutator): Selected 3 parts from 20140320_46_46_0 to 20140320_59_59_0

    2022.03.31 04:52:31.278294 [ 6156710 ] {c03335ce-fee8-42b6-bc36-3e23a8b59b29} DiskLocal: Reserving 11.19 MiB on disk `default`, having unreserved 192.05 GiB.

    2022.03.31 04:52:31.278340 [ 6156710 ] {edac2738-6508-46fa-aee7-7c5560f4539c::20140320_46_59_1} MergeTask::PrepareStage: Merging 3 parts: from 20140320_46_46_0 to 20140320_59_59_0 into Wide

    MergeTreeSequentialSource: Reading 2 marks from part pRegionID, total 1443 rows starting from the beginning of the part

    2022.03.31 04:52:31.470792 [ 6156710 ] {edac2738-6508-46fa-aee7-7c5560f4539c::20140320_46_59_1} MergeTask::MergeProjectionsStage: Merge sorted 3917 rows, containing 3 columns (3 merged, 0 gathered) in 0.099764 sec., 39262.65987731045 rows/sec., 15.67 MiB/sec.

    2022.03.31 04:52:31.479793 [ 6156710 ] {edac2738-6508-46fa-aee7-7c5560f4539c::20140320_46_59_1} MergedBlockOutputStream: filled checksums pRegionID (state Active)

    clickhouse_tutorial.user_tag (edac2738-6508-46fa-aee7-7c5560f4539c): Renaming temporary part tmp_merge_20140320_46_59_1 to 20140320_46_59_1.

    2022.03.31 04:52:31.487058 [ 6156710 ] {edac2738-6508-46fa-aee7-7c5560f4539c::20140320_46_59_1} clickhouse_tutorial.user_tag (edac2738-6508-46fa-aee7-7c5560f4539c) (MergerMutator): Merged 3 parts: from 20140320_46_46_0 to 20140320_59_59_0

    2022.03.31 04:52:31.487648 [ 6156710 ] {c03335ce-fee8-42b6-bc36-3e23a8b59b29} MemoryTracker: Peak memory usage Mutate/Merge: 36.46 MiB.

    2022.03.31 04:52:31.487756 [ 6156710 ] {c03335ce-fee8-42b6-bc36-3e23a8b59b29} MemoryTracker: Peak memory usage (for query): 0.00 B.

    2022.03.31 04:52:31.487861 [ 6156710 ] {} TCPHandler: Processed in 0.20999 sec.

    可以看到执行OPTIMIZE TABLE命名,服务端通过MergerMutator(源码MergeTreeDataMergerMutator.cpp)发起了一个MergeTask(源码MergeTask.cpp)任务,执行了一组数据 Part 的合并(merge)、突变(mutation)和移动(move)操作。

    其中,MergerMutator为后台进程选择数据Part,并执行合并、突变和移动等操作。

    1.1.7.ATTACH操作

    语法

    ATTACH操作与 CREATE 的功能相同。

    挂载数据库表或者字典:

    ATTACH TABLE|DICTIONARY [IF NOT EXISTS] [db.]name [ON CLUSTER cluster] ...
    
    • 1

    从数据文件和指定表结构挂载表:

    ATTACH TABLE name FROM 'path/to/data/' (col1 Type1, ...)
    
    • 1

    该操作使用提供的结构创建一个新表,并将表数据从提供的目录,挂载到 `user_files` 中。

    功能说明

    挂载表或字典,执行ATTACH查询后,服务器将知道表或字典的存在。例如,在将ClickHouse数据库移动到另一台服务器时,可以使用此操作迁移数据。ATTACH 操作不会在磁盘上创建数据,而是假设数据已经在适当的位置,并且只是将有关表或字典的信息添加到服务器。

    并将表数据从提供的目录中附加到 `user_files` 中。

    实例讲解

    从数据文件ATTACH 建表

    1、准备数据文件data.CSV:

    DROP TABLE IF EXISTS clickhouse_tutorial.my_test_table;

    INSERT INTO TABLE FUNCTION file(‘/Users/data/clickhouse/user_files/my_test_table/data.CSV’, ‘CSV’, ‘s String, n UInt8’) VALUES (‘abc’, 3);

    2、使用 ATTACH从表数据文件目录/Users/data/clickhouse/user_files/my_test_table/建表:

    ATTACH TABLE clickhouse_tutorial.my_test_table

    FROM ‘/Users/data/clickhouse/user_files/my_test_table’

    (s String, n UInt8)

    ENGINE = File(CSV);

    3、查看表数据:

    SELECT *

    FROM clickhouse_tutorial.my_test_table

    Query id: 33a9e9d9-e95f-4aa0-806c-6362a2f1baeb

    ┌─s───┬─n─┐

    │ abc │ 3 │

    └─────┴───┘

    1 rows in set. Elapsed: 0.002 sec.

    ATTACH 被分离的表

    1、分离表

    DETACH TABLE clickhouse_tutorial.my_test_table

    2、查询分离表

    SELECT *

    FROM clickhouse_tutorial.my_test_table

    Query id: 6a16501c-23ca-47ed-970c-1a14cf30a897

    0 rows in set. Elapsed: 0.001 sec.

    Received exception from server (version 22.4.1):

    Code: 60. DB::Exception: Received from 127.0.0.1:9009. DB::Exception: Table clickhouse_tutorial.my_test_table doesn’t exist. (UNKNOWN_TABLE)

    3、挂载表

    ATTACH TABLE clickhouse_tutorial.my_test_table

    Query id: 09cbec26-cbb6-41fc-ad58-eb6264ff111f

    Ok.

    0 rows in set. Elapsed: 0.001 sec.

    4、查询表

    SELECT *

    FROM clickhouse_tutorial.my_test_table

    Query id: a9a7cad8-3c06-4397-8762-ad6160091d91

    ┌─s───┬─n─┐

    │ abc │ 3 │

    └─────┴───┘

    1 rows in set. Elapsed: 0.002 sec.

    1.1.8.DETACH****操作

    语法

    DETACH TABLE|VIEW|DICTIONARY [IF EXISTS] [db.]name [ON CLUSTER cluster] [PERMANENTLY]

    功能说明

    DETACH : 分离表、视图或字典。详细说明如下:

    分离操作不会删除表、物化视图、字典数据或元数据。

    如果实体未“永久”分离,则在下一次服务器启动时,服务器将读取元数据并再次召回该表、视图或字典。如果实体被“永久”分离,则不会自动召回。

    无论表或字典是否被永久分离,都可以使用ATTACH操作重新挂载它。

    不能RENAME TABLE、DROPTABLE已经分离的表。

    不能CREATE TABLE 与永久分离表名称相同。

    实例讲解

    创建test表:

    CREATE TABLE clickhouse_tutorial.test ENGINE = Log AS SELECT * FROM numbers(10);
    
    • 1

    查询数据:

    SELECT *
    FROM clickhouse_tutorial.test
     
    Query id: 55175f17-9d56-4909-93e3-99c7b945b02c
     
    ┌─number─┐
    │      0 │
    │      1 │
    │      2 │
    │      3 │
    │      4 │
    │      5 │
    │      6 │
    │      7 │
    │      8 │
    │      9 │
    └────────┘
     
    10 rows in set. Elapsed: 0.002 sec.
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12
    • 13
    • 14
    • 15
    • 16
    • 17
    • 18
    • 19

    DETACH test表:

    DETACH TABLE clickhouse_tutorial.test
     
    Query id: f16d9541-3640-4556-9466-850b99d131aa
     
    Ok.
     
    0 rows in set. Elapsed: 0.001 sec.
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7

    再次查询test 表,提示表不存在::

    SELECT *
    FROM clickhouse_tutorial.test
     
    Query id: 8338bbc9-6d19-4726-a550-c773851ddf58
     
    0 rows in set. Elapsed: 0.001 sec.
     
    Received exception from server (version 22.4.1):
    Code: 60. DB::Exception: Received from 127.0.0.1:9009. DB::Exception: Table clickhouse_tutorial.test doesn't exist. (UNKNOWN_TABLE)
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9

    但是,此时也不能使用表名test创建另外一张表:

    CREATE TABLE clickhouse_tutorial.test
    ENGINE = Log AS
    SELECT *
    FROM numbers(10)
     
    Query id: c338b28a-45c9-41a3-a6c0-f6ddb374df52
     
    0 rows in set. Elapsed: 0.001 sec.
     
    Received exception from server (version 22.4.1):
    Code: 57. DB::Exception: Received from 127.0.0.1:9009. DB::Exception: Table `clickhouse_tutorial`.`test` already exists (detached). (TABLE_ALREADY_EXISTS)
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11

    先自我介绍一下,小编13年上师交大毕业,曾经在小公司待过,去过华为OPPO等大厂,18年进入阿里,直到现在。深知大多数初中级java工程师,想要升技能,往往是需要自己摸索成长或是报班学习,但对于培训机构动则近万元的学费,着实压力不小。自己不成体系的自学效率很低又漫长,而且容易碰到天花板技术停止不前。因此我收集了一份《java开发全套学习资料》送给大家,初衷也很简单,就是希望帮助到想自学又不知道该从何学起的朋友,同时减轻大家的负担。添加下方名片,即可获取全套学习资料哦

  • 相关阅读:
    Go死锁——当Channel遇上Mutex时
    GBase 8c PGXC_GROUP系统表
    2022华数杯A题 B题 C题 思路汇总
    【华为OD机试真题 python】 敏感字段加密【2022 Q4 | 100分】
    世界环境日 | 周大福用心服务推动减碳环保
    Redis为什么变慢了
    文旅媒体有哪些?如何邀请到现场报道?
    Spring中ApplicationListener事件监听机制详解
    浅议信息系统控制在企业中的应用(lunwen+开题报告)
    【推荐系统】推荐系统基础算法-基于协同的推荐算法
  • 原文地址:https://blog.csdn.net/Ajekseg/article/details/126070844