两表关联返回少量数据应该走嵌套循环,两表关联返回数据量大的应该走HASH JOIN连接。
HASH JOIN :两表等值关联,将数据量小的表作为驱动表,将数据量大的表作为被驱动表,对驱动表的连接列进行HASH 生成HASH TABLE ,并且也对被驱动表的连接列进行HASH生成HASH TABLE。能够匹配的数据关联上。HASH JOIN 只支持等值连接。
- scott@orclpdb1:orclcdb>
- select /*+gather_plan_statistics use_hash(e,d)*/
- 2 e.ename, e.job, d.dname
- from emp e, dept d
- 4 where e.deptno = d.deptno;
-
- ENAME JOB DNAME
- ---------- --------- --------------
- SMITH CLERK RESEARCH
- ALLEN SALESMAN SALES
- WARD SALESMAN SALES
- JONES MANAGER RESEARCH
- MARTIN SALESMAN SALES
- BLAKE MANAGER SALES
- CLARK MANAGER ACCOUNTING
- SCOTT ANALYST RESEARCH
- KING PRESIDENT ACCOUNTING
- TURNER SALESMAN SALES
- ADAMS CLERK RESEARCH
- JAMES CLERK SALES
- FORD ANALYST RESEARCH
- MILLER CLERK ACCOUNTING
-
- 14 rows selected.
-
- scott@orclpdb1:orclcdb> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));
-
- PLAN_TABLE_OUTPUT
- --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
- SQL_ID d3uaffp94gd3x, child number 0
- -------------------------------------
- select /*+gather_plan_statistics use_hash(e,d)*/ e.ename, e.job,
- d.dname from emp e, dept d where e.deptno = d.deptno
-
- Plan hash value: 615168685
-
- -------------------------------------------------------------------------------------------------------------------------
- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem |
- -------------------------------------------------------------------------------------------------------------------------
- | 0 | SELECT STATEMENT | | 1 | | 14 |00:00:00.01 | 54 | 5 | | | |
- |* 1 | HASH JOIN | | 1 | 14 | 14 |00:00:00.01 | 54 | 5 | 1695K| 1695K| 1079K (0)|
- | 2 | TABLE ACCESS FULL| DEPT | 1 | 4 | 4 |00:00:00.01 | 6 | 5 | | | |
- | 3 | TABLE ACCESS FULL| EMP | 1 | 14 | 14 |00:00:00.01 | 7 | 0 | | | |
- -------------------------------------------------------------------------------------------------------------------------
-
- Predicate Information (identified by operation id):
- ---------------------------------------------------
-
- 1 - access("E"."DEPTNO"="D"."DEPTNO")
-
-
- 21 rows selected.
-
- scott@orclpdb1:orclcdb>
执行计划中离HASH JOIN 最近的表为驱动表,则DEPT为驱动表,EMP为被驱动表。驱动表 扫描一次,被驱动表也扫描一次。
Note: NL中被驱动表需要扫描多次, HASH JOIN 中被驱动表只需扫描一次。

Used-Mem表示HASH JOIN 消耗的PGA , 当驱动表太大、PGA不能完全容纳驱动表时,驱动表就会溢出到临时表空间,进而产生磁盘HASH JOIN。此刻HASH JOIN 会产生严重的性能问题。而NL无须消耗PGA.
NL 会进行值传递。(NL循环一次,驱动表连接列传值给被驱动表连接列)
HASH JOIN 没有传值的过程。(HASH JOIN 的驱动表与被驱动表的连接列都无须创建索引)
OLTP 环境(一般高并发小事务居多):SQL执行计划多以NL为主。SGA 设置较大,PGA设置较小。(NL不消耗PGA)
OLAP环境 (多数SQL都是大规模ETL),SQL返回结果集很多,SQL执行计划通常以HASH连接为主。往往会消耗大量的PGA,故PGA设置较大。
当两表使用外连接进行关联,如果执行计划走的是HASH JOIN ,想要更改驱动表,我们需要使用swap_join_inputs, 而不是leading.
- scott@orclpdb1:orclcdb>
- explain plan for select /*+use_hash(d,e) leading(e)*/
- 2 *
- from dept d
- 4 left join emp e
- 5 on d.deptno = e.deptno;
-
- Explained.
-
- scott@orclpdb1:orclcdb> select * from table(dbms_xplan.display);
-
- PLAN_TABLE_OUTPUT
- --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
- Plan hash value: 3713469723
-
- ---------------------------------------------------------------------------
- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
- ---------------------------------------------------------------------------
- | 0 | SELECT STATEMENT | | 15 | 870 | 6 (0)| 00:00:01 |
- |* 1 | HASH JOIN OUTER | | 15 | 870 | 6 (0)| 00:00:01 |
- | 2 | TABLE ACCESS FULL| DEPT | 4 | 80 | 3 (0)| 00:00:01 |
- | 3 | TABLE ACCESS FULL| EMP | 14 | 532 | 3 (0)| 00:00:01 |
- ---------------------------------------------------------------------------
-
- Predicate Information (identified by operation id):
- ---------------------------------------------------
-
- 1 - access("D"."DEPTNO"="E"."DEPTNO"(+))
-
- Hint Report (identified by operation id / Query Block Name / Object Alias):
- Total hints for statement: 2 (U - Unused (2))
- ---------------------------------------------------------------------------
-
- 1 - SEL$2BFA4EE4
- U - leading(e)
-
- 2 - SEL$2BFA4EE4 / D@SEL$1
- U - use_hash(d,e)
-
- 25 rows selected.
-
- scott@orclpdb1:orclcdb>
从上执行计划中所得,两表关联走的是HASH JOIN , 表明use_hash(d,e)生效,而驱动表为DEPT故设置leading(e) 未生效。因此需要使用swap_join_inputs来更改外连接中HASH JOIN的驱动表。
- scott@orclpdb1:orclcdb> explain plan for select /*+use_hash(d,e) swap_join_inputs(e)*/
- 2 *
- from dept d
- 4 left join emp e
- 5 on d.deptno = e.deptno;
-
- Explained.
-
- scott@orclpdb1:orclcdb> select * from table(dbms_xplan.display);
-
- PLAN_TABLE_OUTPUT
- --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
- Plan hash value: 3590956717
-
- ------------------------------------------------------------------------------
- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
- ------------------------------------------------------------------------------
- | 0 | SELECT STATEMENT | | 15 | 870 | 6 (0)| 00:00:01 |
- |* 1 | HASH JOIN RIGHT OUTER| | 15 | 870 | 6 (0)| 00:00:01 |
- | 2 | TABLE ACCESS FULL | EMP | 14 | 532 | 3 (0)| 00:00:01 |
- | 3 | TABLE ACCESS FULL | DEPT | 4 | 80 | 3 (0)| 00:00:01 |
- ------------------------------------------------------------------------------
-
- Predicate Information (identified by operation id):
- ---------------------------------------------------
-
- 1 - access("D"."DEPTNO"="E"."DEPTNO"(+))
-
- Hint Report (identified by operation id / Query Block Name / Object Alias):
- Total hints for statement: 1 (U - Unused (1))
- ---------------------------------------------------------------------------
-
- 3 - SEL$2BFA4EE4 / D@SEL$1
- U - use_hash(d,e)
-
- 22 rows selected.
-
- scott@orclpdb1:orclcdb>
swap_join_inputs(e) 更改了外连接中的HASH JOIN 的驱动表
Q: 如何优化HASH JOIN ?