100 likes | 229 Vues
This guide explores the optimization of SQL join statements and the crucial role of the SQL optimizer in selecting efficient execution plans. It covers the three main factors influencing join operations: access paths, join operations, and join order. We discuss various join types, including Nested Loops, Hash Joins, and Sort-Merge Joins, explaining their costs and appropriate usage scenarios. Additionally, the optimizer’s approach to determining join order and execution plans is analyzed to enhance overall query performance.
E N D
Optimizing Join Statements To choose an execution plan for a join statement, the optimizer must choose: • Access Paths • Join Operation • Join Order These Choices are related. Lets examine the different possible join operations.
The outer table or the driving table The inner Table Nested Loops Join SELECT * FROM emp,dept WHERE emp.deptno=dept.deptno 1 - NESTED LOOPS 2 - TABLE ACCESS (FULL) emp 3 - TABLE ACCESS (BY ROWID) dept 4 - INDEX (UNIQUE SCAN) pk_dept
Nested Loops Join • The cost of nested loops operation is: • Access(O) + Access(I) * Rows(O) • Is good only when the outer table drives the inner table and the subsets accessed are small. • In outer joins the outer table is dictated by the query.
Hash Join 1 - HASH JOIN 3 - HASH (JOIN) 2- Table Access (FULL SCAN) emp 4 - Table Access (FULL SCAN) dept
Hash Join • The cost of a hash join is: • Access (O) + Access(I) + Hash (min(I,O)) • Can be used in equijoins only • Best if Hash(min(I,O) ) can be stored in memory, otherwise it is partitioned. • Not available with RBO. • In outer joins the outer table is dictated by the query.
1 - MERGE JOIN 2 - SORT (JOIN) 4 - SORT (JOIN) 3- Table Access (FULL SCAN) emp 5 - Table Access (FULL SCAN) dept Sort-merge Join
Sort Merge Join • The cost of a sort-merge join is: • Access (O) + Access(I) + Sort (O) + Sort (I) • Can not be used in non-equijoins only. • No distinction between outer/inner tables except when dictated by the query. • Preferred on hash when: • It is not an equijoin, or • Sorts are already required in other parts of the query, or • SORT_AREA_SIZE >> HASH_AREA_SIZE
Cluster Join 1 - NESTED LOOPS TABLE ACCESS (CLUSTER) emp TABLE ACCESS (FULL) dept
Nested Loops • The cost of nested loops operation is: • Access(O) + Access(I) * Rows(O) • Is good only when the outer table drives the inner table • In outer joins the outer table is dictated by the query • The cost of a sort-merge join is: • Access (O) + Access(I) + Sort (O) + Sort (I) • The cost of a hash join is: • Access (O) + Access(I) + Hash (min(I,O)) • Can be used in equijoins only • Not available with RBO
The Join Order • The optimizer: • begins with the joins that results in a single row (using keys & unique indexes). • In outer joins the outer table must appear after the inner table in the join condition. • The possible execution plans are generated and compared according to estimated costs.