SORT MERGE JOIN 主要用于两表非等值关联。 例如 >,>=,<,<=,<>
但不能用于 instr,substr,like,regexp_like关联(只能走NL)
SORT MERGE JOIN : 两表关联,先对两表根据连接列进行排序,将较小的表作为驱动表,然后从驱动表中取出连接列的值。去已完成排序的被驱动表中进行数据匹配,成功匹配,表明关联成功。驱动表返回多少行,被驱动表就会被匹配多少次。与NL相似,但NL是从被驱动表的索引中进行数据匹配。而SORT MERGE JOIN 是在内存(PGA)中进行数据匹配。
- scott@orclpdb1:orclcdb> select /*+gather_plan_statistics*/
- 2 e.ename, e.job, d.dname
- from emp e, dept d
- 4 where e.deptno >= d.deptno;
-
- ENAME JOB DNAME
- ---------- --------- --------------
- CLARK MANAGER ACCOUNTING
- KING PRESIDENT ACCOUNTING
- MILLER CLERK ACCOUNTING
- JONES MANAGER ACCOUNTING
- FORD ANALYST ACCOUNTING
- ADAMS CLERK ACCOUNTING
- SMITH CLERK ACCOUNTING
- SCOTT ANALYST ACCOUNTING
- WARD SALESMAN ACCOUNTING
- TURNER SALESMAN ACCOUNTING
- ALLEN SALESMAN ACCOUNTING
- JAMES CLERK ACCOUNTING
- BLAKE MANAGER ACCOUNTING
- MARTIN SALESMAN ACCOUNTING
- JONES MANAGER RESEARCH
- FORD ANALYST RESEARCH
- ADAMS CLERK RESEARCH
- SMITH CLERK RESEARCH
- SCOTT ANALYST RESEARCH
- WARD SALESMAN RESEARCH
- TURNER SALESMAN RESEARCH
- ALLEN SALESMAN RESEARCH
- JAMES CLERK RESEARCH
- BLAKE MANAGER RESEARCH
- MARTIN SALESMAN RESEARCH
- WARD SALESMAN SALES
- TURNER SALESMAN SALES
- ALLEN SALESMAN SALES
- JAMES CLERK SALES
- BLAKE MANAGER SALES
- MARTIN SALESMAN SALES
-
- 31 rows selected.
-
- scott@orclpdb1:orclcdb> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));
-
- PLAN_TABLE_OUTPUT
- --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
- SQL_ID g38n2ng1g2jma, child number 0
- -------------------------------------
- select /*+gather_plan_statistics*/ e.ename, e.job, d.dname from emp
- e, dept d where e.deptno >= d.deptno
-
- Plan hash value: 844388907
-
- -----------------------------------------------------------------------------------------------------------------------------
- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
- -----------------------------------------------------------------------------------------------------------------------------
- | 0 | SELECT STATEMENT | | 1 | | 31 |00:00:00.01 | 14 | | | |
- | 1 | MERGE JOIN | | 1 | 28 | 31 |00:00:00.01 | 14 | | | |
- | 2 | TABLE ACCESS BY INDEX ROWID| DEPT | 1 | 4 | 4 |00:00:00.01 | 8 | | | |
- | 3 | INDEX FULL SCAN | PK_DEPT | 1 | 4 | 4 |00:00:00.01 | 4 | | | |
- |* 4 | SORT JOIN | | 4 | 14 | 31 |00:00:00.01 | 6 | 2048 | 2048 | 2048 (0)|
- | 5 | TABLE ACCESS FULL | EMP | 1 | 14 | 14 |00:00:00.01 | 6 | | | |
- -----------------------------------------------------------------------------------------------------------------------------
-
- Predicate Information (identified by operation id):
- ---------------------------------------------------
-
- 4 - access("E"."DEPTNO">="D"."DEPTNO")
- filter("E"."DEPTNO">="D"."DEPTNO")
-
-
- 24 rows selected.
-
- scott@orclpdb1:orclcdb>
从上述执行计划中可知道,DEPT为驱动表,EMP为被驱动表。驱动表扫描一次,被驱动表也扫描一次。如下图标注所示
因DEPT 走INDEX FULL SCAN,INDEX FULL SCAN返回的数据是有序的,故DEPT无须排序。EMP走TABLE ACCESS FULL,返回数据是无序的,故此EMP在PGA中进行排序。
实际工作中一定要注意INDEX FULL SCAN返回的数据量,若INDEX FULL SCAN返回数据行数过多(INDEX FULL SCAN产生了回表,INDEX FULL SCAN单块读,回表也是单块读,应该走全表扫描性能更佳),则应强制更改执行计划,走全表扫描。
故此将DEPT表强制走全表扫描,执行计划如下:
- scott@orclpdb1:orclcdb> select /*+ full(d)*/
- 2 e.ename, e.job, d.dname
- 3 from emp e, dept d
- 4 where e.deptno >= d.deptno;
-
- ENAME JOB DNAME
- ---------- --------- --------------
- CLARK MANAGER ACCOUNTING
- KING PRESIDENT ACCOUNTING
- MILLER CLERK ACCOUNTING
- JONES MANAGER ACCOUNTING
- FORD ANALYST ACCOUNTING
- ADAMS CLERK ACCOUNTING
- SMITH CLERK ACCOUNTING
- SCOTT ANALYST ACCOUNTING
- WARD SALESMAN ACCOUNTING
- TURNER SALESMAN ACCOUNTING
- ALLEN SALESMAN ACCOUNTING
- JAMES CLERK ACCOUNTING
- BLAKE MANAGER ACCOUNTING
- MARTIN SALESMAN ACCOUNTING
- JONES MANAGER RESEARCH
- FORD ANALYST RESEARCH
- ADAMS CLERK RESEARCH
- SMITH CLERK RESEARCH
- SCOTT ANALYST RESEARCH
- WARD SALESMAN RESEARCH
- TURNER SALESMAN RESEARCH
- ALLEN SALESMAN RESEARCH
- JAMES CLERK RESEARCH
- BLAKE MANAGER RESEARCH
- MARTIN SALESMAN RESEARCH
- WARD SALESMAN SALES
- TURNER SALESMAN SALES
- ALLEN SALESMAN SALES
- JAMES CLERK SALES
- BLAKE MANAGER SALES
- MARTIN SALESMAN SALES
-
- 31 rows selected.
-
- scott@orclpdb1:orclcdb> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));
-
- PLAN_TABLE_OUTPUT
- --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
- SQL_ID f6p4ampywwx9n, child number 0
- -------------------------------------
- select /*+ full(d)*/ e.ename, e.job, d.dname from emp e, dept d
- where e.deptno >= d.deptno
-
- Plan hash value: 1407029907
-
- ------------------------------------------------------------------------
- | Id | Operation | Name | E-Rows | OMem | 1Mem | Used-Mem |
- ------------------------------------------------------------------------
- | 0 | SELECT STATEMENT | | | | | |
- | 1 | MERGE JOIN | | 28 | | | |
- | 2 | SORT JOIN | | 4 | 2048 | 2048 | 2048 (0)|
- | 3 | TABLE ACCESS FULL| DEPT | 4 | | | |
- |* 4 | SORT JOIN | | 14 | 2048 | 2048 | 2048 (0)|
- | 5 | TABLE ACCESS FULL| EMP | 14 | | | |
- ------------------------------------------------------------------------
-
- Predicate Information (identified by operation id):
- ---------------------------------------------------
-
- 4 - access("E"."DEPTNO">="D"."DEPTNO")
- filter("E"."DEPTNO">="D"."DEPTNO")
-
- Note
- -----
- - Warning: basic plan statistics not available. These are only collected when:
- * hint 'gather_plan_statistics' is used for the statement or
- * parameter 'statistics_level' is set to 'ALL', at session or system level
-
-
- 30 rows selected.
-
- scott@orclpdb1:orclcdb>
若两表是等值关联,不建议走SORT MERGE JOIN,由于SORT MERGE JOIN将两表都放入PGA中,消耗大量的PGA。 而HASH JOIN 仅需将驱动表放入PGA中,SORT MERGE JOIN 比HASH JOIN 消耗更多的PGA资源。即便SORT MERGE JOIN 中有一个表走INDEX FULL SCAN,另外一个表也需要放入PGA中,放入PGA中的表往往是大表。若走HASH JOIN ,大表会被作为被驱动表,是不会被放入PGA中。故此两表等值连接时,要么走NL(SQL返回行的数据量少),要么走HASH JOIN (SQL返回数据行数据量多),一般情况不走SORT MERGE JOIN。
Q: 如何优化SORT MERGE JOIN?
Table connection way | Driver Table | PGA | Output Collection | non-equivalence connection | drivred table return rows be scaned |
NL | 有 | 不消耗 | 少 | support | 等于驱动表返回的行数 |
HASH JOIN | 有 | 消耗 | 多 | don't support | 1 |
SORT MERGE JOIN | 无 | 消耗 | 多 | support | 1 |