• Execution Log (3): Viewing JOIN Operator Execution


    3.3 Viewing JOIN Operator Execution

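    This section describes the trace details of operators such as Join and Aggregate. From the trace you can tell whether an operator ran in parallel and which specific method it used.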

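    The Join operators currently available are:

    Sort Merge Join: mainly used for non-equi joins, e.g. t1.a > t2.b;

    Hash Join: mainly used for equi-joins, e.g. t1.a = t2.a; internally it is subdivided into three methods: RR, One-Pass, and Hybrid;

    NestLoop Join: a nested-loop join, used when none of the other methods apply;

    Complex Join: mainly used for joins containing OR; each side of the OR can be either a Sort Merge Join or a Hash Join.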
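The mapping in the list above can be sketched as a small classifier. This is only an illustration of the rules as stated here, not the optimizer's actual logic: the function name `guess_join_operator` is hypothetical, and the choice to treat any condition containing both an equality and a range comparison as a Hash Join is an assumption of the sketch.

```python
import re


def guess_join_operator(condition: str) -> str:
    """Guess which join operator the list above associates with a condition.

    Hypothetical helper for illustration; a real optimizer considers much more
    than the text of the predicate.
    """
    cond = condition.strip()
    # A condition containing OR is handled by Complex Join.
    if re.search(r"\bor\b", cond, flags=re.IGNORECASE):
        return "Complex Join"
    # A bare '=' (not part of <=, >=, != or ==) is an equi-join.
    if re.search(r"(?<![<>!=])=(?!=)", cond):
        return "Hash Join"
    # Any remaining comparison with < or > is treated as a non-equi join.
    if re.search(r"[<>]", cond):
        return "Sort Merge Join"
    # Nothing else applies, so fall back to a nested loop.
    return "NestLoop Join"


if __name__ == "__main__":
    for text in ("t1.a > t2.b", "t1.a = t2.a", "t1.a = t2.a or t1.b < t2.b"):
        print(text, "->", guess_join_operator(text))
```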

    3.3.1 Sort Merge Join

    BEGIN Join

    cnd(0):

    cnd(0): t1.i < t2.j

    prepare to use sort-merge

    Joining sorters created fo

    inner join (T0 - T1), rows.

    cnd(0) Done(time used: 0.0

    END Join(time used: 0.003s
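For readers unfamiliar with the technique, the following is a minimal textbook-style sketch of a sort-merge join on a condition shaped like t1.i < t2.j. It is not the engine's implementation. The function name and the plain-list inputs are assumptions made for the illustration.

```python
def sort_merge_less_than(left, right):
    """Return all pairs (i, j) with i < j, using sorted inputs and one forward scan."""
    left_sorted = sorted(left)
    right_sorted = sorted(right)
    pairs = []
    start = 0  # index of the first right value strictly greater than the current i
    for i in left_sorted:
        # Because left_sorted is ascending, 'start' never needs to move backwards.
        while start < len(right_sorted) and right_sorted[start] <= i:
            start += 1
        pairs.extend((i, j) for j in right_sorted[start:])
    return pairs


if __name__ == "__main__":
    print(sort_merge_less_than([3, 1], [4, 2, 3]))
```

Sorting both sides first is what lets the scan pointer only move forward, which is the reason sorting pays off for range conditions.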

    3.3.2 Hash Join

    Equi-joins use hash join. There are three kinds of hash join; in general the hybrid hash join is used.
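As general background, a partitioned hash join for an equi-join can be sketched as below: both inputs are split into the same number of partitions by hashing the join key, and each pair of partitions is then joined with an in-memory hash table. This is a generic illustration, not the engine's RR, One-Pass, or Hybrid implementation. The function name, the dict-based rows, and the default of 6 partitions are assumptions of the sketch.

```python
from collections import defaultdict


def partitioned_hash_join(left, right, left_key, right_key, parts=6):
    """Join two lists of dict rows on left_key = right_key, partition by partition."""

    def split(rows, key):
        # Rows with equal keys always land in the same partition number.
        buckets = [[] for _ in range(parts)]
        for row in rows:
            buckets[hash(row[key]) % parts].append(row)
        return buckets

    joined = []
    for left_part, right_part in zip(split(left, left_key), split(right, right_key)):
        # Build a hash table on one side of the partition pair...
        table = defaultdict(list)
        for row in left_part:
            table[row[left_key]].append(row)
        # ...and probe it with the other side.
        for row in right_part:
            for match in table.get(row[right_key], []):
                joined.append((match, row))
    return joined


if __name__ == "__main__":
    orders = [{"O_ORDERKEY": 1}, {"O_ORDERKEY": 2}]
    lineitem = [{"L_ORDERKEY": 1}, {"L_ORDERKEY": 1}, {"L_ORDERKEY": 3}]
    print(len(partitioned_hash_join(orders, lineitem, "O_ORDERKEY", "L_ORDERKEY")))
```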

    1. Serial One-Pass Hash Join

    BEGIN Join

    cnd(0):

    cnd(0): lineitem.L_ORDERKEY = orders.O_ORDERKEY

    prepare to use hash join

    (Note: an equi-join uses hash join.)

    op buffer size: 16777216, tuple width: 16. op buffer can hold 1048576 rows

    (Note: the operator buffer is 16777216 bytes and each row of join columns is 16 bytes wide, so the buffer can hold 16777216 / 16 = 1048576 rows.)

    Begin one-pass hash partitioning: divide 1500000 tuples into 6 parts. mat_buf_size = 2796200

    (Note: the phrase 'one-pass hash partitioning' indicates a one-pass hash join.)

    mat partition thread: divide 1500000 tuples into 6 parts, min 241835
    tuples, max 258974 tuples, avg 250000 tuples.

    Finish one-pass hash partitioning: divide 1500000 tuples into 6 parts. mat_buf_size = 2796200. (time used: 2.104s)

    Begin one-pass hash partitioning: divide 6001215 tuples into 6 parts. mat_buf_size = 2796200

    (Note: this partitions the right side of the join.)

    Finish one-pass hash partitioning: divide 6001215 tuples into 6 parts. mat_buf_size = 2796200. (time used: 5.589s)

    Finish One-Pass Hash Join preparation: divided each side into 6 partitions

    Hash tree is used, size: 16777216

    Begin serial rowid merge-sorting: 6001215 rows

    (Note: the join result is sorted; the word 'serial' in the sort step shows that the join ran serially.)

    Finish serial rowid merge-sorting: 6001215 rows. (time used: 5.806s)

    inner join(T0 - T1), using hash join, produced 6001215 rows.

    cnd(0) Done(time used: 39.754s)

    END Join(time used: 39.755s)
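The checks described in the notes above (which join method was chosen, whether the one-pass variant was used, whether the sort step was serial, and whether the buffer arithmetic adds up) can be automated with a few regular expressions. The sketch below is an assumption-laden illustration: the function name and the summary keys are hypothetical, and the patterns are derived only from the trace lines shown in this section, so other trace variants may not match.

```python
import re


def summarize_join_trace(trace: str) -> dict:
    """Extract a few facts from a Join trace like the one shown above."""
    text = " ".join(trace.split())  # collapse line breaks and repeated spaces
    summary = {}

    m = re.search(r"prepare to use (sort-merge|hash join)", text)
    summary["method"] = m.group(1) if m else None

    # The presence of this phrase marks a one-pass hash join.
    summary["one_pass"] = "one-pass hash partitioning" in text

    # The word before 'rowid merge-sorting' tells whether the sort was serial.
    m = re.search(r"Begin (\w+) rowid merge-sorting", text)
    summary["serial"] = (m.group(1) == "serial") if m else None

    # Buffer capacity in rows should equal buffer size divided by tuple width.
    m = re.search(r"op buffer size: (\d+), tuple width: (\d+)\. op buffer can hold (\d+)", text)
    if m:
        size, width, rows = map(int, m.groups())
        summary["buffer_rows_consistent"] = size // width == rows
    else:
        summary["buffer_rows_consistent"] = None

    m = re.search(r"produced (\d+) rows", text)
    summary["rows"] = int(m.group(1)) if m else None

    m = re.search(r"END Join\(time used: ([\d.]+)s\)", text)
    summary["seconds"] = float(m.group(1)) if m else None
    return summary


SAMPLE_TRACE = """
BEGIN Join
prepare to use hash join
op buffer size: 16777216, tuple width: 16. op buffer can hold 1048576 rows
Begin one-pass hash partitioning: divide 1500000 tuples into 6 parts. mat_buf_size = 2796200
Begin serial rowid merge-sorting: 6001215 rows
inner join(T0 - T1), using hash join, produced 6001215 rows.
END Join(time used: 39.755s)
"""

if __name__ == "__main__":
    print(summarize_join_trace(SAMPLE_TRACE))
```

Fields that cannot be found are reported as None rather than guessed, which keeps the summary honest when a trace is truncated.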

    2. Parallel One-Pass Hash Join

    BEGIN Join

    prepare to use hash join

    op buffer size: 16777216, tuple width: 16. op buffer can hold 1048576

  • Original article: https://blog.csdn.net/aisirea/article/details/128075273