How Hash Join Works
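Access count: the driving table and the driven table are each accessed either 0 times or 1 time.
Does the choice of driving table matter: yes.
Is a sort required: no.
Typical use cases:
1. a join between one large table and one small table;
2. the tables have no indexes;
3. the result set is relatively large.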
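Put simply, the principle is this: Oracle first hashes the join column of the driving table into a hash table in the PGA (the other data the query needs from those rows is kept in the PGA as well). It then scans the driven table: it takes the first row, hashes its join column, and probes the small table's hash table in the PGA; if a match is found, the rows are joined. It then takes the second row, and so on.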
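The build and probe phases described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Oracle's implementation: the function name `hash_join`, the dict-based rows, and the `scan_test2` callback are hypothetical, and the sketch assumes the whole hash table fits in memory. Skipping the probe scan when the build side is empty is modeled on the "0 or 1 accesses" rule stated earlier.

```python
from collections import defaultdict

def hash_join(build_rows, scan_probe, key):
    """Join build_rows with the rows produced by scan_probe() on `key`.

    build_rows : iterable of dicts (the driving / small table)
    scan_probe : zero-argument callable that scans the driven table;
                 it is called at most once
    """
    # Build phase: hash the join column of the driving table.
    table = defaultdict(list)
    for row in build_rows:
        if row[key] is not None:          # NULL never equals anything in SQL
            table[row[key]].append(row)

    # Empty hash table: nothing can match, so never scan the driven table.
    if not table:
        return []

    # Probe phase: scan the driven table once, row by row.
    joined = []
    for row in scan_probe():
        for match in table.get(row[key], ()):
            joined.append((match, row))
    return joined

# Hypothetical stand-ins for TEST1 (100 rows) and TEST2 (1000 rows).
scans = {"test2": 0}

def scan_test2():
    scans["test2"] += 1                   # count full scans of the driven table
    return [{"object_id": i} for i in range(1, 1001)]

test1 = [{"object_id": i} for i in range(1, 101)]

joined = hash_join(test1, scan_test2, "object_id")
print(len(joined), scans["test2"])        # 100 1

empty = hash_join([], scan_test2, "object_id")
print(len(empty), scans["test2"])         # 0 1  (driven table not scanned again)
```

The driven table is passed as a callable rather than a list so that the sketch can show it being scanned once when the driving table has rows, and not at all when it has none.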
Let's run an experiment:
SQL> create table test1 as select * from dba_objects where rownum <=100;
SQL> create table test2 as select * from dba_objects where rownum <=1000;
SQL> exec dbms_stats.gather_table_stats(user,'test1');
SQL> exec dbms_stats.gather_table_stats(user,'test2');
SQL> alter session set statistics_level=all;
SQL> select /*+leading(t1) use_hash(t2)*/count(*) from test1 t1, test2 t2 where t1.object_id = t2.object_id;

  COUNT(*)
----------
       100

SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------
SQL_ID  3f2mts0kt82u2, child number 0
-------------------------------------
select /*+leading(t1) use_hash(t2)*/count(*) from test1 t1, test2 t2 where t1.object_id = t2.object_id

Plan hash value: 2544416891
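---- Explanation of the columns:
Starts: the number of times the plan step was executed.
E-Rows: the number of rows the optimizer estimated for the step.
A-Rows: the number of rows the step actually returned. Comparing A-Rows with E-Rows shows which step of the plan was misestimated.
A-Time: the actual elapsed time of the step (HH:MM:SS.FF), including its child steps. This column shows where the SQL spent its time.
Buffers: the logical reads (buffer gets) actually performed by the step, including its child steps.
Reads: physical reads.
OMem, 1Mem: estimates of the memory the step needs. OMem is the estimate for optimal (fully in-memory) execution; 1Mem is the estimate for one-pass execution.
0/1/M: the number of times the step ran in optimal / one-pass / multipass mode.
Used-Mem: the memory the step actually used; the value in parentheses is the mode of the last execution (0 = optimal, 1 = one-pass).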
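The A-Rows versus E-Rows comparison can be automated once the plan rows have been copied into a script. The sketch below is a hypothetical helper, not part of Oracle or DBMS_XPLAN: the name `misestimated_steps`, the dict layout, and the factor of 10 are assumptions. It also assumes the commonly described behaviour that E-Rows is an estimate per execution of a step while A-Rows is the total over all executions, so the estimate is multiplied by Starts before comparing. The sample figures are those of the first experiment's plan; the extra step with Id 99 is made up purely to show a flagged row.

```python
def misestimated_steps(plan_rows, factor=10):
    """Return the ids of plan steps whose actual row count differs from
    the optimizer's estimate by at least `factor` in either direction."""
    flagged = []
    for step in plan_rows:
        # Skip steps with no estimate or that never ran (Starts = 0).
        if step["e_rows"] is None or step["starts"] == 0:
            continue
        expected = step["e_rows"] * step["starts"]   # per-execution estimate -> total
        actual = step["a_rows"]
        # Clamp to 1 so that zero-row steps do not divide by zero.
        ratio = max(expected, 1) / max(actual, 1)
        if ratio >= factor or ratio <= 1 / factor:
            flagged.append(step["id"])
    return flagged

# Figures from the first plan (Id, Starts, E-Rows, A-Rows).
first_plan = [
    {"id": 1, "starts": 1, "e_rows": 1,    "a_rows": 1},
    {"id": 2, "starts": 1, "e_rows": 100,  "a_rows": 100},
    {"id": 3, "starts": 1, "e_rows": 100,  "a_rows": 100},
    {"id": 4, "starts": 1, "e_rows": 1000, "a_rows": 1000},
]
print(misestimated_steps(first_plan))                # []

# Made-up step: the optimizer expected 1 row but 5000 came back.
hypothetical = {"id": 99, "starts": 1, "e_rows": 1, "a_rows": 5000}
print(misestimated_steps(first_plan + [hypothetical]))   # [99]
```

In the first experiment every estimate matches the actual row count, so nothing is flagged.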
------------------------------------------------------------------------------------------------------------------
| Id  | Operation           | Name  | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------
|   1 |  SORT AGGREGATE     |       |      1 |      1 |      1 |00:00:00.01 |      19 |       |       |          |
|*  2 |   HASH JOIN         |       |      1 |    100 |    100 |00:00:00.01 |      19 |  1066K|  1066K| 1162K (0)|
|   3 |    TABLE ACCESS FULL| TEST1 |      1 |    100 |    100 |00:00:00.01 |       4 |       |       |          |
|   4 |    TABLE ACCESS FULL| TEST2 |      1 |   1000 |   1000 |00:00:00.01 |      15 |       |       |          |
------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("T1"."OBJECT_ID"="T2"."OBJECT_ID")

---- Both TEST1 and TEST2 are scanned exactly once (Starts = 1). The 19 buffers of the HASH JOIN are the 4 + 15 buffers of its two child steps.

SQL> select /*+leading(t1) use_hash (t2)*/count(*) from test1 t1, test2 t2 where t1.object_id = t2.object_id and t1.object_id = 99999;

  COUNT(*)
----------
         0

SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------
SQL_ID  f9zwsrs05kg0n, child number 0
-------------------------------------
select /*+leading(t1) use_hash (t2)*/count(*) from test1 t1, test2 t2 where t1.object_id = t2.object_id and t1.object_id = 99999

Plan hash value: 2544416891

------------------------------------------------------------------------------------------------------------------
| Id  | Operation           | Name  | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------
|   1 |  SORT AGGREGATE     |       |      1 |      1 |      1 |00:00:00.01 |       4 |       |       |          |
|*  2 |   HASH JOIN         |       |      1 |      1 |      0 |00:00:00.01 |       4 |   921K|   921K|  176K (0)|
|*  3 |    TABLE ACCESS FULL| TEST1 |      1 |      1 |      0 |00:00:00.01 |       4 |       |       |          |
|*  4 |    TABLE ACCESS FULL| TEST2 |      0 |      1 |      0 |00:00:00.01 |       0 |       |       |          |
------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("T1"."OBJECT_ID"="T2"."OBJECT_ID")
   3 - filter("T1"."OBJECT_ID"=99999)
   4 - filter("T2"."OBJECT_ID"=99999)

---- The filter on TEST1 returns no rows, so the hash table is empty and TEST2 is never scanned (Starts = 0, Buffers = 0).

SQL> select /*+leading(t1) use_hash (t2)*/count(*)
  2  from test1 t1, test2 t2
  3  where t1.object_id = t2.object_id
  4  and 1=2;

  COUNT(*)
----------
         0

SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------
SQL_ID  bnrfbt4ybxnnp, child number 0
-------------------------------------
select /*+leading(t1) use_hash (t2)*/count(*) from test1 t1, test2 t2 where t1.object_id = t2.object_id and 1=2

Plan hash value: 1013001923

---------------------------------------------------------------------------------------------------------
| Id  | Operation            | Name  | Starts | E-Rows | A-Rows |   A-Time   |  OMem |  1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------
|   1 |  SORT AGGREGATE      |       |      1 |      1 |      1 |00:00:00.01 |       |       |          |
|*  2 |   FILTER             |       |      1 |        |      0 |00:00:00.01 |       |       |          |
|*  3 |    HASH JOIN         |       |      0 |    100 |      0 |00:00:00.01 |   921K|   921K|          |
|   4 |     TABLE ACCESS FULL| TEST1 |      0 |    100 |      0 |00:00:00.01 |       |       |          |
|   5 |     TABLE ACCESS FULL| TEST2 |      0 |   1000 |      0 |00:00:00.01 |       |       |          |
---------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(NULL IS NOT NULL)
   3 - access("T1"."OBJECT_ID"="T2"."OBJECT_ID")

---- The FILTER step evaluates 1=2 (shown as NULL IS NOT NULL) to false, so the HASH JOIN and both table scans are never started (Starts = 0).