• ROW_NUMBER 开窗函数优化方案(Oracle && KingBase 性能比对)


    帮朋友优化一条很简单的窗口函数 ROW_NUMBER() OVER() , Oracle 迁移 KingBase 项目。


    原始SQL和执行计划

    STUDENT_BAK 表我模拟的数据,3千万行数据。

    SELECT STU_ID, STU_NAME, STU_SEX, STU_AGE, STU_DATE, STU_CLASSID, STU_SAL
    FROM (SELECT SB.*, 
    			 ROW_NUMBER() OVER (PARTITION BY SB.STU_CLASSID,SB.STU_AGE,SB.STU_SEX ORDER BY SB.STU_DATE DESC) RN 
    	FROM STUDENT_BAK SB) X
    WHERE RN = 1;
    

    Oracle 执行计划

    --------------------------------------------------------------------------------------------------------------------
    | Id  | Operation                | Name        | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  | Writes |
    --------------------------------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT         |             |      1 |        |   5400 |00:00:28.92 |     160K|    166K|   5702 |
    |*  1 |  VIEW                    |             |      1 |     30M|   5400 |00:00:28.92 |     160K|    166K|   5702 |
    |*  2 |   WINDOW SORT PUSHED RANK|             |      1 |     30M|  12919 |00:00:28.92 |     160K|    166K|   5702 |
    |   3 |    TABLE ACCESS FULL     | STUDENT_BAK |      1 |     30M|     30M|00:00:01.44 |     160K|    160K|      0 |
    --------------------------------------------------------------------------------------------------------------------
     
    Predicate Information (identified by operation id):
    ---------------------------------------------------
     
    "   1 - filter(""RN""=1)"
    "   2 - filter(ROW_NUMBER() OVER ( PARTITION BY ""SB"".""STU_CLASSID"",""SB"".""STU_AGE"",""SB"".""STU_SEX"" ORDER BY INTERNAL_FUNCTION(""SB"".""STU_DATE"") DESC "
                  )<=1)
    

    KingBase 执行计划

                                                                        QUERY PLAN                                                                    
    --------------------------------------------------------------------------------------------------------------------------------
     Subquery Scan on x  (cost=5085273.94..6210240.94 rows=149996 width=33) (actual time=64787.749..80656.358 rows=5400 loops=1)
       Filter: (x.rn = 1)
       Rows Removed by Filter: 29994600
       ->  WindowAgg  (cost=5085273.94..5835251.94 rows=29999120 width=41) (actual time=64787.747..79956.535 rows=30000000 loops=1)
             ->  Sort  (cost=5085273.94..5160271.74 rows=29999120 width=33) (actual time=64787.728..73281.302 rows=30000000 loops=1)
                   Sort Key: sb.stu_classid, sb.stu_age, sb.stu_sex, sb.stu_date DESC
                   Sort Method: external merge  Disk: 1369904kB
                   ->  Seq Scan on student_bak sb  (cost=0.00..539329.20 rows=29999120 width=33) (actual time=0.008..20064.258 rows=30000000 loops=1)
     Planning Time: 0.123 ms
     Execution Time: 80764.258 ms
    (10 行记录)
    

    Oracle:28秒出结果,KES:80秒出结果,两者执行时间差距很大。


    优化方案一、使用分区表

    将 STUDENT_BAK 表的 STU_CLASSID 列作为 LIST 分区

    Oracle 使用分区表优化后执行计划:

    ----------------------------------------------------------------------------------------------------------------
    | Id  | Operation                 | Name            | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
    ----------------------------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT          |                 |      1 |        |   5400 |00:00:33.62 |     167K|    167K|
    |   1 |  PARTITION LIST ALL       |                 |      1 |     30M|   5400 |00:00:33.62 |     167K|    167K|
    |*  2 |   VIEW                    |                 |    300 |     30M|   5400 |00:00:21.59 |     167K|    167K|
    |*  3 |    WINDOW SORT PUSHED RANK|                 |    300 |     30M|  12919 |00:00:18.20 |     167K|    167K|
    |   4 |     TABLE ACCESS FULL     | STUDENT_PAR_BAK |    300 |     30M|     30M|00:00:01.15 |     167K|    167K|
    ----------------------------------------------------------------------------------------------------------------
     
    Predicate Information (identified by operation id):
    ---------------------------------------------------
     
    "   2 - filter(""RN""=1)"
    "   3 - filter(ROW_NUMBER() OVER ( PARTITION BY ""SB"".""STU_CLASSID"",""SB"".""STU_AGE"",""SB"".""STU_SEX"" ORDER BY "
    "              INTERNAL_FUNCTION(""SB"".""STU_DATE"") DESC )<=1)"
    

    KingBase 使用分区表优化后执行计划:

                                                                             QUERY PLAN                                                                         
    ------------------------------------------------------------------------------------------------------------------------------------------------------------
     Subquery Scan on x  (cost=10416416.10..11541288.60 rows=149983 width=450) (actual time=54656.336..63401.133 rows=5400 loops=1)
       Filter: (x.rn = 1)
       Rows Removed by Filter: 29994600
       ->  WindowAgg  (cost=10416416.10..11166331.10 rows=29996600 width=458) (actual time=54656.333..62742.542 rows=30000000 loops=1)
             ->  Sort  (cost=10416416.10..10491407.60 rows=29996600 width=450) (actual time=54656.303..56466.219 rows=30000000 loops=1)
                   Sort Key: sb.stu_classid, sb.stu_age, sb.stu_sex, sb.stu_date DESC
                   Sort Method: external merge  Disk: 1369904kB
                   ->  Append  (cost=0.00..539445.00 rows=29996600 width=450) (actual time=0.010..20511.558 rows=30000000 loops=1)
                         ->  Seq Scan on student_par_bak_p0 sb  (cost=0.00..1798.15 rows=99989 width=450) (actual time=0.010..67.413 rows=100240 loops=1)
                         ->  Seq Scan on student_par_bak_p1 sb_1  (cost=0.00..1798.15 rows=99989 width=450) (actual time=0.010..67.507 rows=100169 loops=1)
                         ->  Seq Scan on student_par_bak_p2 sb_2  (cost=0.00..1798.15 rows=99989 width=450) (actual time=0.013..65.356 rows=100260 loops=1)
                         ->  Seq Scan on student_par_bak_p3 sb_3  (cost=0.00..1798.15 rows=99989 width=450) (actual time=0.009..65.545 rows=100052 loops=1)
                         ->  Seq Scan on student_par_bak_p4 sb_4  (cost=0.00..1798.15 rows=99989 width=450) (actual time=0.010..65.411 rows=100078 loops=1)
                         ->  Seq Scan on student_par_bak_p5 sb_5  (cost=0.00..1798.15 rows=99989 width=450) (actual time=0.010..65.672 rows=99757 loops=1)
                         ->  Seq Scan on student_par_bak_p6 sb_6  (cost=0.00..1798.15 rows=99989 width=450) (actual time=0.007..66.148 rows=100510 loops=1)
                         ->  Seq Scan on student_par_bak_p7 sb_7  (cost=0.00..1798.15 rows=99989 width=450) (actual time=0.008..65.720 rows=99636 loops=1)
                         ->  Seq Scan on student_par_bak_p8 sb_8  (cost=0.00..1798.15 rows=99989 width=450) (actual time=0.013..63.716 rows=100625 loops=1)
                         ->  Seq Scan on student_par_bak_p9 sb_9  (cost=0.00..1798.15 rows=99989 width=450) (actual time=0.007..64.223 rows=99965 loops=1)
                         ->  Seq Scan on student_par_bak_p10 sb_10  (cost=0.00..1798.15 rows=99989 width=450) (actual time=0.009..65.790 rows=100186 loops=1)
                         ->  Seq Scan on student_par_bak_p11 sb_11  (cost=0.00..1798.15 rows=99989 width=450) (actual time=0.011..65.542 rows=100171 loops=1)
                         ->  Seq Scan on student_par_bak_p12 sb_12  (cost=0.00..1798.15 rows=99989 width=450) (actual time=0.015..64.736 rows=99841 loops=1)
                        /* 此处省略 200 行分区表计划 */
     Planning Time: 0.688 ms
     Execution Time: 63466.714 ms
    (310 行记录)
    

    Oracle:33秒出结果,KES:63秒出结果。

    Oracle使用分区表后反而比原来的单表慢了5秒,但是KES使用分区表以后比原来快了17秒(效果不错)。


    优化方案二、分区表的基础上使用并行

    Oracle 语句添加 HINT parallel(8) 执行计划

    ------------------------------------------------------------------------------------------------------------
    | Id  | Operation                      | Name            | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
    ------------------------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT               |                 |      1 |        |   5400 |00:00:02.80 |     910 |
    |   1 |  PX COORDINATOR                |                 |      1 |        |   5400 |00:00:02.80 |     910 |
    |   2 |   PX SEND QC (RANDOM)          | :TQ10001        |      0 |     30M|      0 |00:00:00.01 |       0 |
    |*  3 |    VIEW                        |                 |      0 |     30M|      0 |00:00:00.01 |       0 |
    |*  4 |     WINDOW SORT PUSHED RANK    |                 |      0 |     30M|      0 |00:00:00.01 |       0 |
    |   5 |      PX RECEIVE                |                 |      0 |     30M|      0 |00:00:00.01 |       0 |
    |   6 |       PX SEND HASH             | :TQ10000        |      0 |     30M|      0 |00:00:00.01 |       0 |
    |*  7 |        WINDOW CHILD PUSHED RANK|                 |      0 |     30M|      0 |00:00:00.01 |       0 |
    |   8 |         PX BLOCK ITERATOR      |                 |      0 |     30M|      0 |00:00:00.01 |       0 |
    |*  9 |          TABLE ACCESS FULL     | STUDENT_PAR_BAK |      0 |     30M|      0 |00:00:00.01 |       0 |
    ------------------------------------------------------------------------------------------------------------
     
    Predicate Information (identified by operation id):
    ---------------------------------------------------
     
    "   3 - filter(""RN""=1)"
    "   4 - filter(ROW_NUMBER() OVER ( PARTITION BY ""SB"".""STU_CLASSID"",""SB"".""STU_AGE"",""SB"".""STU_SEX"" ORDER BY "
    "              INTERNAL_FUNCTION(""SB"".""STU_DATE"") DESC )<=1)"
    "   7 - filter(ROW_NUMBER() OVER ( PARTITION BY ""SB"".""STU_CLASSID"",""SB"".""STU_AGE"",""SB"".""STU_SEX"" ORDER BY "
    "              INTERNAL_FUNCTION(""SB"".""STU_DATE"") DESC )<=1)"
       9 - access(:Z>=:Z AND :Z<=:Z)
     
    Note
    -----
       - Degree of Parallelism is 8 because of hint
    

    KingBase 语句添加 HINT parallel(8 sb hard) 执行计划

                                                                                   QUERY PLAN                                                                                
    -------------------------------------------------------------------------------------------------------------------------------------------------------------------------
     Subquery Scan on x  (cost=9876971.10..11001843.60 rows=149983 width=450) (actual time=52118.006..61222.471 rows=5400 loops=1)
       Filter: (x.rn = 1)
       Rows Removed by Filter: 29994600
       ->  WindowAgg  (cost=9876971.10..10626886.10 rows=29996600 width=458) (actual time=52118.003..60558.094 rows=30000000 loops=1)
             ->  Sort  (cost=9876971.10..9951962.60 rows=29996600 width=450) (actual time=52117.986..54214.307 rows=30000000 loops=1)
                   Sort Key: sb.stu_classid, sb.stu_age, sb.stu_sex, sb.stu_date DESC
                   Sort Method: external merge  Disk: 1369904kB
                   ->  Gather  (cost=0.00..0.00 rows=29996600 width=450) (actual time=2.655..9516.724 rows=30000000 loops=1)
                         Workers Planned: 8
                         Workers Launched: 8
                         ->  Parallel Append  (cost=0.00..0.00 rows=3749575 width=450) (actual time=0.020..10158.207 rows=3333333 loops=9)
                               ->  Parallel Seq Scan on student_par_bak_p0 sb  (cost=0.00..0.00 rows=12499 width=450) (actual time=0.018..400.671 rows=100240 loops=1)
                               ->  Parallel Seq Scan on student_par_bak_p1 sb_1  (cost=0.00..0.00 rows=12499 width=450) (actual time=0.018..410.512 rows=100169 loops=1)
                               ->  Parallel Seq Scan on student_par_bak_p2 sb_2  (cost=0.00..0.00 rows=12499 width=450) (actual time=0.024..344.441 rows=100260 loops=1)
                               ->  Parallel Seq Scan on student_par_bak_p3 sb_3  (cost=0.00..0.00 rows=12499 width=450) (actual time=0.021..337.809 rows=100052 loops=1)
                               ->  Parallel Seq Scan on student_par_bak_p4 sb_4  (cost=0.00..0.00 rows=12499 width=450) (actual time=0.019..406.484 rows=100078 loops=1)
                               ->  Parallel Seq Scan on student_par_bak_p5 sb_5  (cost=0.00..0.00 rows=12499 width=450) (actual time=0.020..385.837 rows=99757 loops=1)
                               ->  Parallel Seq Scan on student_par_bak_p6 sb_6  (cost=0.00..0.00 rows=12499 width=450) (actual time=0.023..259.886 rows=100510 loops=1)
                               ->  Parallel Seq Scan on student_par_bak_p7 sb_7  (cost=0.00..0.00 rows=12499 width=450) (actual time=0.019..293.689 rows=99636 loops=1)
                              /* 此处省略 200 行分区表计划 */
     Planning Time: 0.564 ms
     Execution Time: 61279.150 ms
    (313 行记录)
    

    Oracle:2.8秒出结果,KES:61秒出结果。
    Oracle使用并行以后提升巨大、KES 感觉并行感觉没有对row_number进行很好的优化,和原来分区表执行时间差别不大:61s。


    优化方案三、ROW_NUMBER() OVER() 等价改写

    在原来分区表、8个并行的基础上等价改写 ROW_NUMBER() OVER() 逻辑

    等价改写SQL不放出来,看到这里有想要继续深入学习的读者可以联系我,毕竟我也想了一会才想到实现的方法,纯逻辑转换题。

    Oracle 等价改写ROW_NUMBER() OVER() 计划

    -------------------------------------------------------------------------------------------------------------------
    | Id  | Operation                     | Name            | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |
    -------------------------------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT              |                 |      1 |        |   5400 |00:00:00.69 |     910 |       |
    |   1 |  PX COORDINATOR               |                 |      1 |        |   5400 |00:00:00.69 |     910 |       |
    |   2 |   PX SEND QC (RANDOM)         | :TQ10001        |      0 |     37 |      0 |00:00:00.01 |       0 |       |
    |   3 |    NESTED LOOPS               |                 |      0 |     37 |      0 |00:00:00.01 |       0 |       |
    |   4 |     VIEW                      |                 |      0 |   2700 |      0 |00:00:00.01 |       0 |       |
    |   5 |      SORT GROUP BY            |                 |      0 |   2700 |      0 |00:00:00.01 |       0 | 68608 |
    |   6 |       PX RECEIVE              |                 |      0 |   2700 |      0 |00:00:00.01 |       0 |       |
    |   7 |        PX SEND HASH           | :TQ10000        |      0 |   2700 |      0 |00:00:00.01 |       0 |       |
    |   8 |         SORT GROUP BY         |                 |      0 |   2700 |      0 |00:00:00.01 |       0 |   585K|
    |   9 |          PX BLOCK ITERATOR    |                 |      0 |     30M|      0 |00:00:00.01 |       0 |       |
    |* 10 |           TABLE ACCESS FULL   | STUDENT_PAR_BAK |      0 |     30M|      0 |00:00:00.01 |       0 |       |
    |* 11 |     TABLE ACCESS BY USER ROWID| STUDENT_PAR_BAK |      0 |      1 |      0 |00:00:00.01 |       0 |       |
    -------------------------------------------------------------------------------------------------------------------
     
    Predicate Information (identified by operation id):
    ---------------------------------------------------
     
      10 - access(:Z>=:Z AND :Z<=:Z)
    "  11 - filter((""B"".""STU_CLASSID""=""X2"".""STU_CLASSID"" AND ""B"".""STU_AGE""=""X2"".""STU_AGE"" AND ""B"".""STU_SEX""=""X2"".""STU_SEX"" AND "
    "              ""B"".""STU_DATE""=""X2"".""MAX_STU_DATE""))"
     
    Note
    -----
       - Degree of Parallelism is 8 because of hint
    

    KingBase 等价改写ROW_NUMBER() OVER() 计划

    						
    																QUERY PLAN    
    
    --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
     Hash Join  (cost=5627880.30..6621517.67 rows=281 width=450) (actual time=26827.678..32286.183 rows=5400 loops=1)
       Hash Cond: ((x1.stu_classid = sb.stu_classid) AND (x1.stu_age = sb.stu_age) AND ((x1.stu_sex)::text = (sb.stu_sex)::text) AND ((x1.stu_date)::timestamp without time zone = (((max(
    (sb.stu_date)::timestamp without time zone))::date))::timestamp without time zone))
       CTE x
         ->  Gather  (cost=0.00..0.00 rows=29996600 width=456) (actual time=3.752..6509.459 rows=30000000 loops=1)
               Workers Planned: 8
               Workers Launched: 8
               ->  Parallel Append  (cost=0.00..0.00 rows=3749575 width=456) (actual time=0.018..10115.770 rows=3333333 loops=9)
                     ->  Parallel Seq Scan on student_par_bak_p0 b  (cost=0.00..0.00 rows=12499 width=456) (actual time=0.020..308.911 rows=100240 loops=1)
                     ->  Parallel Seq Scan on student_par_bak_p1 b_1  (cost=0.00..0.00 rows=12499 width=456) (actual time=0.018..308.579 rows=100169 loops=1)
                     ->  Parallel Seq Scan on student_par_bak_p2 b_2  (cost=0.00..0.00 rows=12499 width=456) (actual time=0.017..248.827 rows=100260 loops=1)
                     ->  Parallel Seq Scan on student_par_bak_p3 b_3  (cost=0.00..0.00 rows=12499 width=456) (actual time=0.016..278.987 rows=100052 loops=1)
                     ->  Parallel Seq Scan on student_par_bak_p4 b_4  (cost=0.00..0.00 rows=12499 width=456) (actual time=0.021..316.482 rows=100078 loops=1)
                     ->  Parallel Seq Scan on student_par_bak_p5 b_5  (cost=0.00..0.00 rows=12499 width=456) (actual time=0.016..306.535 rows=99757 loops=1)
                     ->  Parallel Seq Scan on student_par_bak_p6 b_6  (cost=0.00..0.00 rows=12499 width=456) (actual time=0.016..312.291 rows=100510 loops=1)
                     ->  Parallel Seq Scan on student_par_bak_p7 b_7  (cost=0.00..0.00 rows=12499 width=456) (actual time=0.015..187.120 rows=99636 loops=1)
                              /* 此处省略 200 行分区表计划 */
       ->  CTE Scan on x x1  (cost=0.00..599932.00 rows=29996600 width=456) (actual time=3.754..1339.590 rows=30000000 loops=1)
       ->  Hash  (cost=5560387.95..5560387.95 rows=2999660 width=30) (actual time=26808.484..26808.486 rows=5400 loops=1)
             Buckets: 4194304  Batches: 1  Memory Usage: 33095kB
             ->  GroupAggregate  (cost=5042946.60..5530391.35 rows=2999660 width=30) (actual time=20922.491..26805.407 rows=5400 loops=1)
                   Group Key: sb.stu_classid, sb.stu_age, sb.stu_sex
                   ->  Sort  (cost=5042946.60..5117938.10 rows=29996600 width=30) (actual time=20921.360..22292.427 rows=30000000 loops=1)
                         Sort Key: sb.stu_classid, sb.stu_age, sb.stu_sex
                         Sort Method: external merge  Disk: 1174192kB
                         ->  CTE Scan on x sb  (cost=0.00..599932.00 rows=29996600 width=30) (actual time=0.004..10345.811 rows=30000000 loops=1)
     Planning Time: 1.540 ms
     Execution Time: 32535.572 ms
    (318 行记录)
    
    时间:32541.503 ms (00:32.542)
    

    Oracle:0.69秒出结果,KES:32秒出结果。
    Oracle执行速度感觉打了鸡血一样,太猛了,KES改写以后也能从61秒降低到32秒跑出结果

    目前只能将KES的 ROW_NUMBER() OVER() 执行速度优化到32秒,后续再想想有没有更快的优化方案


    结束语

    上面 ROW_NUMBER() OVER() 的三种优化手段,如果在项目上有遇到类似的案例,先要搞清楚环境需求,再选择采用哪种优化手段。

    如果读者有更好的优化思路,欢迎一起交流,进步。

  • 相关阅读:
    升级macOS系统到13之后git报错
    APS软件的技术指标与特色
    使用 BigDecimal 进行浮点数运算
    华为模拟器手把手安装教程-HCIE(华为网络工程师)
    java使用bouncycastle加解密
    并发编程(概念简述)
    java毕业设计产品销售管理系统Mybatis+系统+数据库+调试部署
    【全民Python】Python环境配置和Pychram的安装使用
    Flutter实战-页面参数传递
    算法通关村第十一关|白银|位运算高频算法题【持续更新】
  • 原文地址:https://www.cnblogs.com/yuzhijian/p/17910072.html