• 值得反复研读的表连接之SORT MERGE JOIN方式


    SORT MERGE JOIN 主要用于两表非等值关联。 例如 >,>=,<,<=,<>

    不能用于 instr,substr,like,regexp_like关联(只能走NL)

    SORT MERGE JOIN : 两表关联,先对两表根据连接列进行排序,将较小的表作为驱动表,然后从驱动表中取出连接列的值。去已完成排序的被驱动表中进行数据匹配,成功匹配,表明关联成功。驱动表返回多少行,被驱动表就会被匹配多少次。与NL相似,但NL是从被驱动表的索引中进行数据匹配。而SORT MERGE JOIN 是在内存(PGA)中进行数据匹配。

    1. scott@orclpdb1:orclcdb> select /*+gather_plan_statistics*/
    2. 2 e.ename, e.job, d.dname
    3. from emp e, dept d
    4. 4 where e.deptno >= d.deptno;
    5. ENAME JOB DNAME
    6. ---------- --------- --------------
    7. CLARK MANAGER ACCOUNTING
    8. KING PRESIDENT ACCOUNTING
    9. MILLER CLERK ACCOUNTING
    10. JONES MANAGER ACCOUNTING
    11. FORD ANALYST ACCOUNTING
    12. ADAMS CLERK ACCOUNTING
    13. SMITH CLERK ACCOUNTING
    14. SCOTT ANALYST ACCOUNTING
    15. WARD SALESMAN ACCOUNTING
    16. TURNER SALESMAN ACCOUNTING
    17. ALLEN SALESMAN ACCOUNTING
    18. JAMES CLERK ACCOUNTING
    19. BLAKE MANAGER ACCOUNTING
    20. MARTIN SALESMAN ACCOUNTING
    21. JONES MANAGER RESEARCH
    22. FORD ANALYST RESEARCH
    23. ADAMS CLERK RESEARCH
    24. SMITH CLERK RESEARCH
    25. SCOTT ANALYST RESEARCH
    26. WARD SALESMAN RESEARCH
    27. TURNER SALESMAN RESEARCH
    28. ALLEN SALESMAN RESEARCH
    29. JAMES CLERK RESEARCH
    30. BLAKE MANAGER RESEARCH
    31. MARTIN SALESMAN RESEARCH
    32. WARD SALESMAN SALES
    33. TURNER SALESMAN SALES
    34. ALLEN SALESMAN SALES
    35. JAMES CLERK SALES
    36. BLAKE MANAGER SALES
    37. MARTIN SALESMAN SALES
    38. 31 rows selected.
    39. scott@orclpdb1:orclcdb> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));
    40. PLAN_TABLE_OUTPUT
    41. --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    42. SQL_ID g38n2ng1g2jma, child number 0
    43. -------------------------------------
    44. select /*+gather_plan_statistics*/ e.ename, e.job, d.dname from emp
    45. e, dept d where e.deptno >= d.deptno
    46. Plan hash value: 844388907
    47. -----------------------------------------------------------------------------------------------------------------------------
    48. | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
    49. -----------------------------------------------------------------------------------------------------------------------------
    50. | 0 | SELECT STATEMENT | | 1 | | 31 |00:00:00.01 | 14 | | | |
    51. | 1 | MERGE JOIN | | 1 | 28 | 31 |00:00:00.01 | 14 | | | |
    52. | 2 | TABLE ACCESS BY INDEX ROWID| DEPT | 1 | 4 | 4 |00:00:00.01 | 8 | | | |
    53. | 3 | INDEX FULL SCAN | PK_DEPT | 1 | 4 | 4 |00:00:00.01 | 4 | | | |
    54. |* 4 | SORT JOIN | | 4 | 14 | 31 |00:00:00.01 | 6 | 2048 | 2048 | 2048 (0)|
    55. | 5 | TABLE ACCESS FULL | EMP | 1 | 14 | 14 |00:00:00.01 | 6 | | | |
    56. -----------------------------------------------------------------------------------------------------------------------------
    57. Predicate Information (identified by operation id):
    58. ---------------------------------------------------
    59. 4 - access("E"."DEPTNO">="D"."DEPTNO")
    60. filter("E"."DEPTNO">="D"."DEPTNO")
    61. 24 rows selected.
    62. scott@orclpdb1:orclcdb>

    从上述执行计划中可知道,DEPT为驱动表,EMP为被驱动表。驱动表扫描一次,被驱动表也扫描一次。如下图标注所示

     

     因DEPT 走INDEX FULL SCAN,INDEX FULL SCAN返回的数据是有序的,故DEPT无须排序。EMP走TABLE ACCESS FULL,返回数据是无序的,故此EMP在PGA中进行排序。

    实际工作中一定要注意INDEX FULL SCAN返回的数据量,若INDEX FULL SCAN返回数据行数过多(INDEX FULL SCAN产生了回表,INDEX FULL SCAN单块读,回表也是单块读,应该走全表扫描性能更佳),则应强制更改执行计划,走全表扫描。

    故此将DEPT表强制走全表扫描,执行计划如下:

    1. scott@orclpdb1:orclcdb> select /*+ full(d)*/
    2. 2 e.ename, e.job, d.dname
    3. 3 from emp e, dept d
    4. 4 where e.deptno >= d.deptno;
    5. ENAME JOB DNAME
    6. ---------- --------- --------------
    7. CLARK MANAGER ACCOUNTING
    8. KING PRESIDENT ACCOUNTING
    9. MILLER CLERK ACCOUNTING
    10. JONES MANAGER ACCOUNTING
    11. FORD ANALYST ACCOUNTING
    12. ADAMS CLERK ACCOUNTING
    13. SMITH CLERK ACCOUNTING
    14. SCOTT ANALYST ACCOUNTING
    15. WARD SALESMAN ACCOUNTING
    16. TURNER SALESMAN ACCOUNTING
    17. ALLEN SALESMAN ACCOUNTING
    18. JAMES CLERK ACCOUNTING
    19. BLAKE MANAGER ACCOUNTING
    20. MARTIN SALESMAN ACCOUNTING
    21. JONES MANAGER RESEARCH
    22. FORD ANALYST RESEARCH
    23. ADAMS CLERK RESEARCH
    24. SMITH CLERK RESEARCH
    25. SCOTT ANALYST RESEARCH
    26. WARD SALESMAN RESEARCH
    27. TURNER SALESMAN RESEARCH
    28. ALLEN SALESMAN RESEARCH
    29. JAMES CLERK RESEARCH
    30. BLAKE MANAGER RESEARCH
    31. MARTIN SALESMAN RESEARCH
    32. WARD SALESMAN SALES
    33. TURNER SALESMAN SALES
    34. ALLEN SALESMAN SALES
    35. JAMES CLERK SALES
    36. BLAKE MANAGER SALES
    37. MARTIN SALESMAN SALES
    38. 31 rows selected.
    39. scott@orclpdb1:orclcdb> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));
    40. PLAN_TABLE_OUTPUT
    41. --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    42. SQL_ID f6p4ampywwx9n, child number 0
    43. -------------------------------------
    44. select /*+ full(d)*/ e.ename, e.job, d.dname from emp e, dept d
    45. where e.deptno >= d.deptno
    46. Plan hash value: 1407029907
    47. ------------------------------------------------------------------------
    48. | Id | Operation | Name | E-Rows | OMem | 1Mem | Used-Mem |
    49. ------------------------------------------------------------------------
    50. | 0 | SELECT STATEMENT | | | | | |
    51. | 1 | MERGE JOIN | | 28 | | | |
    52. | 2 | SORT JOIN | | 4 | 2048 | 2048 | 2048 (0)|
    53. | 3 | TABLE ACCESS FULL| DEPT | 4 | | | |
    54. |* 4 | SORT JOIN | | 14 | 2048 | 2048 | 2048 (0)|
    55. | 5 | TABLE ACCESS FULL| EMP | 14 | | | |
    56. ------------------------------------------------------------------------
    57. Predicate Information (identified by operation id):
    58. ---------------------------------------------------
    59. 4 - access("E"."DEPTNO">="D"."DEPTNO")
    60. filter("E"."DEPTNO">="D"."DEPTNO")
    61. Note
    62. -----
    63. - Warning: basic plan statistics not available. These are only collected when:
    64. * hint 'gather_plan_statistics' is used for the statement or
    65. * parameter 'statistics_level' is set to 'ALL', at session or system level
    66. 30 rows selected.
    67. scott@orclpdb1:orclcdb>

    若两表是等值关联,不建议走SORT MERGE JOIN,由于SORT MERGE JOIN将两表都放入PGA中,消耗大量的PGA。 而HASH JOIN 仅需将驱动表放入PGA中,SORT MERGE JOIN 比HASH JOIN 消耗更多的PGA资源。即便SORT MERGE JOIN 中有一个表走INDEX FULL SCAN,另外一个表也需要放入PGA中,放入PGA中的表往往是大表。若走HASH JOIN ,大表会被作为被驱动表,是不会被放入PGA中。故此两表等值连接时,要么走NL(SQL返回行的数据量少),要么走HASH JOIN (SQL返回数据行数据量多),一般情况不走SORT MERGE JOIN。

    Q: 如何优化SORT MERGE JOIN?

    • 若两表关联为等值连接,走的是SORT MERGE JOIN,则可以让其走HASH JOIN .
    • 若两表关联为非等值连接,像>,>=,<,<=,<>, 可以先从业务下手,将非等值转换为等值。若无法进行非等值对等值的转换,则可以考虑增加过滤条件,将两表的数据量减少,外加开启并行查询加速SQL执行速度。
    表连接方式的主要区别
    Table connection wayDriver TablePGAOutput Collectionnon-equivalence connectiondrivred table return rows be scaned
    NL不消耗support等于驱动表返回的行数
    HASH JOIN消耗

    don't support

    1
    SORT MERGE JOIN消耗support1
  • 相关阅读:
    Pytest+Allure+Yaml+PyMsql+Jenkins+Gitlab接口自动化(四)Jenkins配置
    Teamcenter RAC 开发之《Excel模版导出》
    有趣的23000----整理(08)B词根
    【Java基础】线程的概念、特点及创建线程的三种方式
    【Flask】四、flask连接并操作数据库
    vue2 vue3 中指令总结
    Linux的十个常用命令
    Linux的 Bash 简介和使用方法
    Android 9.0系统源码_SystemUI(二)StatusBar系统状态栏的启动流程
    谷粒商城--分布式高级篇P129~P339(完结)
  • 原文地址:https://blog.csdn.net/u011868279/article/details/127448651