• ClickHouse 运维DBA实用场景


    ClickHouse 运维DBA实用场景

    在 ClickHouse 中有没有一些能够 “安家立命” 的运维 SQL 语句,为此做一次简单的分享。
    在 ClickHouse 默认的 system 数据库下(databse),拥有众多的系统表。我们对 ClickHouse 运行状态的各种信息,就主要来自于这些系统表。

    当前连接数

    CK 对外暴露的原生接口分为 TCP 和 HTTP 两类,通过 system.metrics 即可查询当前的 TCP、HTTP 与内部副本的连接数。

    SELECT *
    FROM system.metrics
    WHERE metric LIKE '%Connection';
    
    ┌─metric────────────────┬─value─┬─description────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
    │ TCPConnection	        │     9 │ Number of connections to TCP server (clients with native interface), also included server-server distributed query connections │
    │ MySQLConnection	    │     0 │ Number of client connections using MySQL protocol                                                                              │
    │ HTTPConnection	    │     1 │ Number of connections to HTTP server                                                                                           │
    │ InterserverConnection	│     0 │ Number of connections from other replicas to fetch parts                                                                       │
    │ PostgreSQLConnection	│     0 │ Number of client connections using PostgreSQL protocol                                                                         │
    └───────────────────────┴───────┴────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11

    当前正在执行的查询

    通过 system.processes 可以查询目前正在执行的查询

    SELECT 
        query_id, 
        user, 
        address, 
        query
    FROM system.processes
    ORDER BY query_id ASC;
    
    ┌─query_id─────────────────────────────┬─user────┬─address────────────┬─query─────────────────────────────────────────────────────────────────────────────┐
    │ 203f1d0e-944e-472d-8d8f-bae548ff9899 │ default │ ::ffff:10.37.119.4SELECT query_id, user, address, query FROM system.processes ORDER BY query_id ASC │
    │ fb7fba85-b2a0-4271-87ff-22da97ae511b │ default │ ::ffff:10.37.119.4INSERT INTO hits_v1 FORMAT TSV                                                    │
    └──────────────────────────────────────┴─────────┴────────────────────┴───────────────────────────────────────────────────────────────────────────────────┘
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12

    可以看到,CH 目前正在执行两条语句,其中第 2 条是 INSERT 查询正在写入数据。

    终止查询

    通过 KILL QUERY 语句,可以终止正在执行的查询:

    KILL QUERY WHERE query_id = 'query_id'
    
    • 1

    例如,终止刚才的 INSERT 查询:

    KILL QUERY WHERE query_id='ff695827-dbf5-45ad-9858-a853946ea140';
    
    KILL QUERY WHERE query_id = 'ff695827-dbf5-45ad-9858-a853946ea140' ASYNC
    
    Ok.
    
    0 rows in set. Elapsed: 0.024 sec.
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7

    众所周知,除了常规的 SELECTINSERT 之外,在 ClickHouse 中还存在一类被称作 Mutation 的操作,也就是 ALTER DELETEALTER UPDATE

    对于 Mutation 操作, ClickHouse 专门提供了 system.mutations 用于查询,例如:

    SELECT 
        database, 
        table, 
        mutation_id, 
        command, 
        create_time, 
        is_done
    FROM system.mutations;
    
    ┌─database─┬─table──────┬─mutation_id────┬─command──────────────────┬─────────create_time─┬─is_done─┐
    │ default  │ testcol_v9 │ mutation_2.txt │ DELETE WHERE ID = 'A003'2020-06-29 01:15:041 │
    └──────────┴────────────┴────────────────┴──────────────────────────┴─────────────────────┴─────────┘
    
    1 rows in set. Elapsed: 0.002 sec.
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12
    • 13
    • 14

    同样的,可以使用 KILL MUTATION 终止正在执行的 Mutation 操作:

    KILL MUTATION WHERE mutation_id = 'mutation_id';
    
    • 1

    存储空间统计

    查询 CH 各个存储路径的空间:

    SELECT 
        name, 
        path, 
        formatReadableSize(free_space) AS free, 
        formatReadableSize(total_space) AS total, 
        formatReadableSize(keep_free_space) AS reserved
    FROM system.disks;
    
    ┌─name──────┬─path──────────────┬─free──────┬─total─────┬─reserved─┐
    │ default/chbase/data/36.35 GiB │ 49.09 GiB │ 0.00 B   │
    │ disk_cold │ /chbase/cloddata/35.35 GiB │ 48.09 GiB │ 1.00 GiB │
    │ disk_hot1 │ /chbase/data/36.35 GiB │ 49.09 GiB │ 0.00 B   │
    └───────────┴───────────────────┴───────────┴───────────┴──────────┘
    
    4 rows in set. Elapsed: 0.001 sec.
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12
    • 13
    • 14
    • 15

    各数据库占用空间统计

    SELECT 
        database, 
        formatReadableSize(sum(bytes_on_disk)) AS on_disk
    FROM system.parts
    GROUP BY database;
    
    ┌─database─┬─on_disk──┐
    │ system   │ 1.59 MiB │
    │ default3.60 GiB │
    └──────────┴──────────┘
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10

    个列字段占用空间统计

    每个列字段的压缩大小、压缩比率以及该列的每行数据大小的占比

    SELECT 
        database, 
        table, 
        column, 
        any(type), 
        sum(column_data_compressed_bytes) AS compressed, 
        sum(column_data_uncompressed_bytes) AS uncompressed, 
        round(uncompressed / compressed, 2) AS ratio, 
        compressed / sum(rows) AS bpr, 
        sum(rows)
    FROM system.parts_columns
    WHERE active AND database != 'system'
    GROUP BY 
        database, 
        table, 
        column
    ORDER BY 
        database ASC, 
        table ASC, 
        column ASC;
    
    ┌─database─┬─table────────┬─column─────────────────────┬─any(type)──────────────────────────────┬─compressed─┬─uncompressed─┬──ratio─┬───────────────────bpr─┬─sum(rows)─┐
    │ default  │ hits_v1      │ AdvEngineID                │ UInt8                                  │     3515342662170675.730.01320478860370556326621706 │
    │ default  │ hits_v1      │ Age                        │ UInt8                                  │    7543552266217063.530.283360953651880926621706 │
    │ default  │ hits_v1      │ BrowserCountry             │ FixedString(2)6549379532434128.130.2460165024735830326621706 │
    │ default  │ hits_v1      │ BrowserLanguage            │ FixedString(2)28190855324341218.890.1058942278154525526621706 │
    │ default  │ hits_v1      │ CLID                       │ UInt32                                 │    231100610648682446.080.0868090872914004826621706 │
    │ default  │ hits_v1      │ ClientEventTime            │ DateTime985187041064868241.083.700690857302683826621706 │
    │ default  │ hits_v1      │ ClientIP                   │ UInt32                                 │   251207661064868244.240.943619691390176126621706 │
    │ default  │ hits_v1      │ ClientIP6                  │ FixedString(16)2508855842594729616.980.942409851569993426621706 │
    │ default  │ hits_v1      │ ClientTimeZone             │ Int16                                  │    8487148532434126.270.318805564151298226621706 │
    │ default  │ hits_v1      │ CodeVersion                │ UInt32                                 │   119769521064868248.890.449894232924065826621706 │
    │ default  │ hits_v1      │ ConnectTiming              │ Int32                                  │   279373731064868243.811.049420837267153426621706 │
    │ default  │ hits_v1      │ CookieEnable               │ UInt8                                  │     20271826621706131.320.00761476368193683826621706 │
    │ default  │ hits_v1      │ CounterClass               │ Int8                                   │     4254922662170662.570.01598289756486680526621706...
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12
    • 13
    • 14
    • 15
    • 16
    • 17
    • 18
    • 19
    • 20
    • 21
    • 22
    • 23
    • 24
    • 25
    • 26
    • 27
    • 28
    • 29
    • 30
    • 31
    • 32
    • 33
    • 34
    • 35
    • 36

    慢查询

    SELECT 
        user, 
        client_hostname AS host, 
        client_name AS client, 
        formatDateTime(query_start_time, '%T') AS started, 
        query_duration_ms / 1000 AS sec, 
        round(memory_usage / 1048576) AS MEM_MB, 
        result_rows AS RES_CNT, 
        result_bytes / 1048576 AS RES_MB, 
        read_rows AS R_CNT, 
        round(read_bytes / 1048576) AS R_MB, 
        written_rows AS W_CNT, 
        round(written_bytes / 1048576) AS W_MB, 
        query
    FROM system.query_log
    WHERE type = 2
    ORDER BY query_duration_ms DESC
    LIMIT 10;
    
    ┌─user────┬─host─────────┬─client────────────┬─started──┬────sec─┬─MEM_MB─┬──RES_CNT─┬────────────────RES_MB─┬────R_CNT─┬─R_MB─┬───W_CNT─┬─W_MB─┬─query───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
    │ default │ ch7.nauu.com │ ClickHouse client │ 01:05:0351.434103188738988706.511466979980088738988707INSERT INTO hits_v1 FORMAT TSV                                                                                                                                          │
    │ default │ ch7.nauu.com │ ClickHouse client │ 01:01:4843.511103188738988706.511466979980088738988707INSERT INTO hits_v1 FORMAT TSV                                                                                                                                          │
    │ default │ ch7.nauu.com │ ClickHouse client │ 17:12:0411.12180118874398446.8216323852539629146635100SELECT id, arrayJoin(arrayConcat(groupArray(a), groupArray(b), groupArray(c))) AS v FROM test_y GROUP BY id ORDER BY v ASC                                              │
    │ default │ ch7.nauu.com │ ClickHouse client │ 17:13:283.992154918874398446.8216323852539629146635100SELECT id, arrayJoin(arrayConcat(groupArray(a), groupArray(b), groupArray(c))) AS v FROM test_y GROUP BY id                                                             │
    │ default │ ch7.nauu.com │ ClickHouse client │ 17:13:123.976154918874398446.8216323852539629146635100SELECT id, arrayJoin(arrayConcat(groupArray(a), groupArray(b), groupArray(c))) AS v FROM test_y GROUP BY id                                                             │
    │ default │ ch7.nauu.com │ ClickHouse client │ 01:25:393.962154918874398446.8216323852539629146635100SELECT id, arrayJoin(arrayConcat(groupArray(a), groupArray(b), groupArray(c))) AS v FROM test_y GROUP BY id                                                             │
    │ default │ ch7.nauu.com │ ClickHouse client │ 04:32:293.114154210000000219.821929931640621050000023100SELECT user_id, argMax(score, create_time) AS score, argMax(deleted, create_time) AS deleted, max(create_time) AS ctime FROM test_a GROUP BY user_id HAVING deleted = 0 │
    │ default │ ch7.nauu.com │ ClickHouse client │ 02:59:563.03154410000000219.753809928894041050000023100SELECT user_id, argMax(score, create_time) AS score, argMax(is_update, create_time) AS is_update, max(create_time) AS ctime FROM test_a GROUP BY user_id                │
    │ default │ ch7.nauu.com │ ClickHouse client │ 02:54:013.019154310000000219.34509277343751050000023000SELECT user_id, argMax(score, create_time) AS score, argMax(delete, create_time) AS delete, max(create_time) AS ctime FROM test_a GROUP BY user_id                      │
    │ default │              │                   │ 03:03:122.8571543100.00022697448730468751050000023100SELECT * FROM view_test_a limit 10                                                                                                                                      │
    └─────────┴──────────────┴───────────────────┴──────────┴────────┴────────┴──────────┴───────────────────────┴──────────┴──────┴─────────┴──────┴─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
    
    10 rows in set. Elapsed: 0.017 sec. Processed 1.44 thousand rows, 200.81 KB (83.78 thousand rows/s., 11.68 MB/s.) 
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12
    • 13
    • 14
    • 15
    • 16
    • 17
    • 18
    • 19
    • 20
    • 21
    • 22
    • 23
    • 24
    • 25
    • 26
    • 27
    • 28
    • 29
    • 30
    • 31
    • 32
    • 33

    副本预警监控

    通过下面的 SQL 语句对副本进行预警监控,其中各个预警的变量可以根据自身情况调整。

    SELECT database, table, is_leader, total_replicas, active_replicas 
      FROM system.replicas 
     WHERE is_readonly 
        OR is_session_expired 
        OR future_parts > 30 
        OR parts_to_check > 20 
        OR queue_size > 30 
        OR inserts_in_queue > 20 
        OR log_max_index - log_pointer > 20 
        OR total_replicas < 2 
        OR active_replicas < total_replicas
    
    ┌─database─┬─table───────────────────────┬─is_leader─┬─total_replicas─┬─active_replicas─┐
    │ default  │ replicated_sales_12         │         000 │
    │ default  │ test_fetch                  │         000 │
    │ default  │ test_sharding_simple2_local │         000 │
    └──────────┴─────────────────────────────┴───────────┴────────────────┴─────────────────┘
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12
    • 13
    • 14
    • 15
    • 16
    • 17
  • 相关阅读:
    Hadoop源码解析之Mapper数量计算公式
    营养电子秤控制板方案
    git config 查看,设置,删除项
    R-CNN,Fast R-CNN详解
    高精度隧道施工人员定位安全管理解决方案
    Idea中使用测试类,提示无可任务No tasks available
    Java中的异常体系
    【Verilog】画出下列wave信号波形图
    仿优酷视频焦点轮播图布局html页面前端源码
    网络安全是什么?如何成为一位优秀的网络安全工程师?
  • 原文地址:https://blog.csdn.net/weixin_44275063/article/details/126125666