• 值得反复研读的表连接之SORT MERGE JOIN方式


    SORT MERGE JOIN 主要用于两表非等值关联。 例如 >,>=,<,<=,<>

    不能用于 instr,substr,like,regexp_like关联(只能走NL)

    SORT MERGE JOIN : 两表关联,先对两表根据连接列进行排序,将较小的表作为驱动表,然后从驱动表中取出连接列的值。去已完成排序的被驱动表中进行数据匹配,成功匹配,表明关联成功。驱动表返回多少行,被驱动表就会被匹配多少次。与NL相似,但NL是从被驱动表的索引中进行数据匹配。而SORT MERGE JOIN 是在内存(PGA)中进行数据匹配。

    1. scott@orclpdb1:orclcdb> select /*+gather_plan_statistics*/
    2. 2 e.ename, e.job, d.dname
    3. from emp e, dept d
    4. 4 where e.deptno >= d.deptno;
    5. ENAME JOB DNAME
    6. ---------- --------- --------------
    7. CLARK MANAGER ACCOUNTING
    8. KING PRESIDENT ACCOUNTING
    9. MILLER CLERK ACCOUNTING
    10. JONES MANAGER ACCOUNTING
    11. FORD ANALYST ACCOUNTING
    12. ADAMS CLERK ACCOUNTING
    13. SMITH CLERK ACCOUNTING
    14. SCOTT ANALYST ACCOUNTING
    15. WARD SALESMAN ACCOUNTING
    16. TURNER SALESMAN ACCOUNTING
    17. ALLEN SALESMAN ACCOUNTING
    18. JAMES CLERK ACCOUNTING
    19. BLAKE MANAGER ACCOUNTING
    20. MARTIN SALESMAN ACCOUNTING
    21. JONES MANAGER RESEARCH
    22. FORD ANALYST RESEARCH
    23. ADAMS CLERK RESEARCH
    24. SMITH CLERK RESEARCH
    25. SCOTT ANALYST RESEARCH
    26. WARD SALESMAN RESEARCH
    27. TURNER SALESMAN RESEARCH
    28. ALLEN SALESMAN RESEARCH
    29. JAMES CLERK RESEARCH
    30. BLAKE MANAGER RESEARCH
    31. MARTIN SALESMAN RESEARCH
    32. WARD SALESMAN SALES
    33. TURNER SALESMAN SALES
    34. ALLEN SALESMAN SALES
    35. JAMES CLERK SALES
    36. BLAKE MANAGER SALES
    37. MARTIN SALESMAN SALES
    38. 31 rows selected.
    39. scott@orclpdb1:orclcdb> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));
    40. PLAN_TABLE_OUTPUT
    41. --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    42. SQL_ID g38n2ng1g2jma, child number 0
    43. -------------------------------------
    44. select /*+gather_plan_statistics*/ e.ename, e.job, d.dname from emp
    45. e, dept d where e.deptno >= d.deptno
    46. Plan hash value: 844388907
    47. -----------------------------------------------------------------------------------------------------------------------------
    48. | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
    49. -----------------------------------------------------------------------------------------------------------------------------
    50. | 0 | SELECT STATEMENT | | 1 | | 31 |00:00:00.01 | 14 | | | |
    51. | 1 | MERGE JOIN | | 1 | 28 | 31 |00:00:00.01 | 14 | | | |
    52. | 2 | TABLE ACCESS BY INDEX ROWID| DEPT | 1 | 4 | 4 |00:00:00.01 | 8 | | | |
    53. | 3 | INDEX FULL SCAN | PK_DEPT | 1 | 4 | 4 |00:00:00.01 | 4 | | | |
    54. |* 4 | SORT JOIN | | 4 | 14 | 31 |00:00:00.01 | 6 | 2048 | 2048 | 2048 (0)|
    55. | 5 | TABLE ACCESS FULL | EMP | 1 | 14 | 14 |00:00:00.01 | 6 | | | |
    56. -----------------------------------------------------------------------------------------------------------------------------
    57. Predicate Information (identified by operation id):
    58. ---------------------------------------------------
    59. 4 - access("E"."DEPTNO">="D"."DEPTNO")
    60. filter("E"."DEPTNO">="D"."DEPTNO")
    61. 24 rows selected.
    62. scott@orclpdb1:orclcdb>

    从上述执行计划中可知道,DEPT为驱动表,EMP为被驱动表。驱动表扫描一次,被驱动表也扫描一次。如下图标注所示

     

     因DEPT 走INDEX FULL SCAN,INDEX FULL SCAN返回的数据是有序的,故DEPT无须排序。EMP走TABLE ACCESS FULL,返回数据是无序的,故此EMP在PGA中进行排序。

    实际工作中一定要注意INDEX FULL SCAN返回的数据量,若INDEX FULL SCAN返回数据行数过多(INDEX FULL SCAN产生了回表,INDEX FULL SCAN单块读,回表也是单块读,应该走全表扫描性能更佳),则应强制更改执行计划,走全表扫描。

    故此将DEPT表强制走全表扫描,执行计划如下:

    1. scott@orclpdb1:orclcdb> select /*+ full(d)*/
    2. 2 e.ename, e.job, d.dname
    3. 3 from emp e, dept d
    4. 4 where e.deptno >= d.deptno;
    5. ENAME JOB DNAME
    6. ---------- --------- --------------
    7. CLARK MANAGER ACCOUNTING
    8. KING PRESIDENT ACCOUNTING
    9. MILLER CLERK ACCOUNTING
    10. JONES MANAGER ACCOUNTING
    11. FORD ANALYST ACCOUNTING
    12. ADAMS CLERK ACCOUNTING
    13. SMITH CLERK ACCOUNTING
    14. SCOTT ANALYST ACCOUNTING
    15. WARD SALESMAN ACCOUNTING
    16. TURNER SALESMAN ACCOUNTING
    17. ALLEN SALESMAN ACCOUNTING
    18. JAMES CLERK ACCOUNTING
    19. BLAKE MANAGER ACCOUNTING
    20. MARTIN SALESMAN ACCOUNTING
    21. JONES MANAGER RESEARCH
    22. FORD ANALYST RESEARCH
    23. ADAMS CLERK RESEARCH
    24. SMITH CLERK RESEARCH
    25. SCOTT ANALYST RESEARCH
    26. WARD SALESMAN RESEARCH
    27. TURNER SALESMAN RESEARCH
    28. ALLEN SALESMAN RESEARCH
    29. JAMES CLERK RESEARCH
    30. BLAKE MANAGER RESEARCH
    31. MARTIN SALESMAN RESEARCH
    32. WARD SALESMAN SALES
    33. TURNER SALESMAN SALES
    34. ALLEN SALESMAN SALES
    35. JAMES CLERK SALES
    36. BLAKE MANAGER SALES
    37. MARTIN SALESMAN SALES
    38. 31 rows selected.
    39. scott@orclpdb1:orclcdb> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));
    40. PLAN_TABLE_OUTPUT
    41. --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    42. SQL_ID f6p4ampywwx9n, child number 0
    43. -------------------------------------
    44. select /*+ full(d)*/ e.ename, e.job, d.dname from emp e, dept d
    45. where e.deptno >= d.deptno
    46. Plan hash value: 1407029907
    47. ------------------------------------------------------------------------
    48. | Id | Operation | Name | E-Rows | OMem | 1Mem | Used-Mem |
    49. ------------------------------------------------------------------------
    50. | 0 | SELECT STATEMENT | | | | | |
    51. | 1 | MERGE JOIN | | 28 | | | |
    52. | 2 | SORT JOIN | | 4 | 2048 | 2048 | 2048 (0)|
    53. | 3 | TABLE ACCESS FULL| DEPT | 4 | | | |
    54. |* 4 | SORT JOIN | | 14 | 2048 | 2048 | 2048 (0)|
    55. | 5 | TABLE ACCESS FULL| EMP | 14 | | | |
    56. ------------------------------------------------------------------------
    57. Predicate Information (identified by operation id):
    58. ---------------------------------------------------
    59. 4 - access("E"."DEPTNO">="D"."DEPTNO")
    60. filter("E"."DEPTNO">="D"."DEPTNO")
    61. Note
    62. -----
    63. - Warning: basic plan statistics not available. These are only collected when:
    64. * hint 'gather_plan_statistics' is used for the statement or
    65. * parameter 'statistics_level' is set to 'ALL', at session or system level
    66. 30 rows selected.
    67. scott@orclpdb1:orclcdb>

    若两表是等值关联,不建议走SORT MERGE JOIN,由于SORT MERGE JOIN将两表都放入PGA中,消耗大量的PGA。 而HASH JOIN 仅需将驱动表放入PGA中,SORT MERGE JOIN 比HASH JOIN 消耗更多的PGA资源。即便SORT MERGE JOIN 中有一个表走INDEX FULL SCAN,另外一个表也需要放入PGA中,放入PGA中的表往往是大表。若走HASH JOIN ,大表会被作为被驱动表,是不会被放入PGA中。故此两表等值连接时,要么走NL(SQL返回行的数据量少),要么走HASH JOIN (SQL返回数据行数据量多),一般情况不走SORT MERGE JOIN。

    Q: 如何优化SORT MERGE JOIN?

    • 若两表关联为等值连接,走的是SORT MERGE JOIN,则可以让其走HASH JOIN .
    • 若两表关联为非等值连接,像>,>=,<,<=,<>, 可以先从业务下手,将非等值转换为等值。若无法进行非等值对等值的转换,则可以考虑增加过滤条件,将两表的数据量减少,外加开启并行查询加速SQL执行速度。
    表连接方式的主要区别
    Table connection wayDriver TablePGAOutput Collectionnon-equivalence connectiondrivred table return rows be scaned
    NL不消耗support等于驱动表返回的行数
    HASH JOIN消耗

    don't support

    1
    SORT MERGE JOIN消耗support1
  • 相关阅读:
    K8S:kubectl陈述式及声明式资源管理
    RPC框架选型
    HyperTerminal 超级终端设置TCP/IP Client和TCP/IP Server
    以OneFlow为例探索MLIR的实际开发流程
    C语言实现 1.在一个二维数组中形成 n 阶矩阵,2.去掉靠边元素,生成新的 n-2 阶矩阵;3.求矩阵主对角线下元素之和:4.以方阵形式输出数组。
    容灾备份 | 看我使用Powershell操作FTP进行数据文件自动上传备份
    kubernetes (k8s)的使用
    jvm YGC和FGC发生的具体场景
    【数据结构】——顺序表
    数据结构:栈和队列
  • 原文地址:https://blog.csdn.net/u011868279/article/details/127448651