Advanced Techniques in Database Schema Capture and Differential Query Optimization
E N D
Presentation Transcript
Topics covered Overview of DEMO • Capturing database schema in QL • Differential query optimization • Dialect DLA • Capturing database schema in DLA
Capturing database schema in QL • Another view on views • Another view on physical design
Tables: current practice • Tables that are base • Tables that are views (defined by queries on base tables) create table EMP ( attribute name on STR; attribute age on INT; attribute loc on STR; primary key (name); ) create view WATEMP ( select name, age from EMP where loc = ‘Waterloo’; ) create view TOREMP ( select name from EMP where loc = ‘Toronto’; )
Tables: an alternative • Tables • Constraints (query containment dependencies) create table EMP ( attribute name on STR; attribute age on INT; attribute loc on STR; primary key (name); updatable; ) create table WATEMP ( attribute name on STR; attribute age on INT; ) create table TOREMP ( attribute name on STR; ) ( select name, age from EMP where loc = ‘Waterloo’ ) ´ ( select name, age from WATEMP ) ( select name from EMP where loc = ‘Toronto’ ) ´ ( select name from TOREMP )
create table EMP ( attribute name on STR; attribute age on INT; attribute loc on STR; primary key (name); updatable; ) create table WATEMP ( attribute name on STR; attribute age on INT; ) create table TOREMP ( attribute name on STR; ) ( select name, age from EMP where loc = ‘Waterloo’ ) ´ ( select name, age from WATEMP ) ( select name from EMP where loc = ‘Toronto’ ) ´ ( select name from TOREMP ) ( select name from EMP ) ´ ( select name from WATEMP ) union all ( select name from TOREMP ) Table: an alternative (cont’d) • Easy to do much more!
Physical design: current practice • Records and fields (implicit) • Clustered indices on base tables (a default file created) • Secondary indices on base tables • Materialized views (a default file created) • Secondary indices on materialized views • Etc. • External engines (collections of tables, views and the above) create index on EMP (loc asc, age desc) alter view WATEMP ( materialized; ) create index on WATEMP (name asc)
Physical design: an alternative • Explicit named record identifiers and stored attributes • Iterator signatures • Secondary indices are also base relations alter table EMP ( record identifier emp; stored (name, age, loc); iterator; ) create table EMPX ( attribute loc on STR; attribute age on INT; attribute eref on RID; stored (loc, age, eref); iterator (loc asc, age desc); ) ( select loc, age, emp from EMP ) ´ ( select loc, age, eref from EMPX )
Physical design: an alternative (cont’d) alter table WATEMP ( record identifier emp; stored (name, age); iterator; ) create table WATEMPX ( attribute name on STR; attribute eref on RID; stored (name, eref); iterator (name asc); ( select name, emp from WATEMP ) ´ ( select name, eref from WATEMPX )
select name, age from EMP as e, PV as :p, (STRu QV) as name, (INTuQV) as age where e.loc = ‘Waterloo’ and e.name = :p and e.name = name and e.age = age Differential query optimization • (based on simple hill climbing) • Expand conjunctive subquery using constraints. • Initialize subplan. • Select next iterator or disjunction from query for subplan; fail if none available. • If disjunction selected, recursively apply steps 1 to 5 on each disjunct; • fail if any recursive application fails. • If expand of subplan contains subquery, return subplan; otherwise repeat from 3. select name, age from EMP where loc = ‘Waterloo’ and name = :p
select name, age from EMP as e, PV as :p, (STRu QV) as name, (INTuQV) as age, WATEMP as w where e.loc = ‘Waterloo’ and e.name = :p and e.name = name and e.age = age and e.name = w.name and e.age = w.age Differential query optimization (cont’d) • Expand conjunctive subquery using constraints. select name, age from EMP as e, PV as :p, (STRu QV) as name, (INTuQV) as age where e.loc = ‘Waterloo’ and e.name = :p and e.name = name and e.age = age ( select name, age from EMP where loc = ‘Waterloo’ ) ´ ( select name, age from WATEMP )
select name, age from EMP as e, PV as :p, (STRu QV) as name, (INTuQV) as age, WATEMP as w, WATEMPX as x where e.loc = ‘Waterloo’ and e.name = :p and e.name = name and e.age = age and e.name = w.name and e.age = w.age and w.name = x.name and w.emp = x.eref Differential query optimization (cont’d) • Expand conjunctive subquery using constraints. select name, age from EMP as e, PV as :p, (STRu QV) as name, (INTuQV) as age, WATEMP as w where e.loc = ‘Waterloo’ and e.name = :p and e.name = name and e.age = age and e.name = w.name and e.age = w.age ( select name, emp from WATEMP ) ´ ( select name, eref from WATEMPX )
select from PV as :p (plan) (query parameter) Differential query optimization (cont’d) select name, age from EMP as e, PV as :p, (STRu QV) as name, (INTuQV) as age, WATEMP as w, WATEMPX as x where e.loc = ‘Waterloo’ and e.name = :p and e.name = name and e.age = age and e.name = w.name and e.age = w.age and w.name = x.name and w.emp = x.eref (expanded query) • Initialize subplan.
Differential query optimization (cont’d) select name, age from EMP as e, PV as :p, (STRu QV) as name, (INTuQV) as age, WATEMP as w, WATEMPX as x where e.loc = ‘Waterloo’ and e.name = :p and e.name = name and e.age = age and e.name = w.name and e.age = w.age and w.name = x.name and w.emp = x.eref (expanded query) • Select next iterator or disjunction from query for subplan; fail if none available. select from PV as :p (plan)
Differential query optimization (cont’d) select name, age from EMP as e, PV as :p, (STRu QV) as name, (INTuQV) as age, WATEMP as w, WATEMPX as x where e.loc = ‘Waterloo’ and e.name = :p and e.name = name and e.age = age and e.name = w.name and e.age = w.age and w.name = x.name and w.emp = x.eref (expanded query) • Select next iterator or disjunction from query for subplan; fail if none available. select name from PV as :p, (STRu QV) as name where name = :p (plan) (nested loops;string copy)
select name, age from EMP as e, PV as :p, (STRu QV) as name, (INTuQV) as age where e.loc = ‘Waterloo’ and e.name = :p and e.name = name and e.age = age (expanded query) Differential query optimization (cont’d) • If expand of subplan contains subquery, return subplan; otherwise repeat from 3. select name from PV as :p, (STRu QV) as name where name = :p (plan)
select name, age from EMP as e, PV as :p, (STRu QV) as name, (INTuQV) as age where e.loc = ‘Waterloo’ and e.name = :p and e.name = name and e.age = age (expanded query) Differential query optimization (cont’d) • If expand of subplan contains subquery, return subplan; otherwise repeat from 3. select name from PV as :p, (STRu QV) as name where name = :p (expanded plan)
Differential query optimization (cont’d) select name, age from EMP as e, PV as :p, (STRu QV) as name, (INTuQV) as age, WATEMP as w, WATEMPX as x where e.loc = ‘Waterloo’ and e.name = :p and e.name = name and e.age = age and e.name = w.name and e.age = w.age and w.name = x.name and w.emp = x.eref (expanded query) • Select next iterator or disjunction from query for subplan; fail if none available. select name from PV as :p, (STRu QV) as name where name = :p (plan)
Differential query optimization (cont’d) select name, age from EMP as e, PV as :p, (STRu QV) as name, (INTuQV) as age, WATEMP as w, WATEMPX as x where e.loc = ‘Waterloo’ and e.name = :p and e.name = name and e.age = age and e.name = w.name and e.age = w.age and w.name = x.name and w.emp = x.eref (expanded query) • Select next iterator or disjunction from query for subplan; fail if none available. select name from PV as :p, (STRu QV) as name, WATEMPX as x where name = :p and x.name = name (plan) (nested loops;index scan)
select name, age from EMP as e, PV as :p, (STRu QV) as name, (INTuQV) as age where e.loc = ‘Waterloo’ and e.name = :p and e.name = name and e.age = age (expanded query) Differential query optimization (cont’d) • If expand of subplan contains subquery, return subplan; otherwise repeat from 3. select name from PV as :p, (STRu QV) as name, WATEMPX as x where name = :p and x.name = name (plan)
select name, age from EMP as e, PV as :p, (STRu QV) as name, (INTuQV) as age where e.loc = ‘Waterloo’ and e.name = :p and e.name = name and e.age = age (expanded query) Differential query optimization (cont’d) • If expand of subplan contains subquery, return subplan; otherwise repeat from 3. select name, age from PV as :p, (STRu QV) as name, WATEMPX as x, WATEMP as w, INT as age, EMP as e where name = :p and x.name = name and w.name = x.name and w.emp = x.eref and e.name = w.name and e.age = w.age and e.age = age and e.loc = ‘Waterloo’ (expandedplan)
Differential query optimization (cont’d) select name, age from EMP as e, PV as :p, (STRu QV) as name, (INTuQV) as age, WATEMP as w, WATEMPX as x where e.loc = ‘Waterloo’ and e.name = :p and e.name = name and e.age = age and e.name = w.name and e.age = w.age and w.name = x.name and w.emp = x.eref (expanded query) • Select next iterator or disjunction from query for subplan; fail if none available. select name from PV as :p, (STRu QV) as name, WATEMPX as x where name = :p and x.name = name (plan)
Differential query optimization (cont’d) select name, age from EMP as e, PV as :p, (STRu QV) as name, (INTuQV) as age, WATEMP as w, WATEMPX as x where e.loc = ‘Waterloo’ and e.name = :p and e.name = name and e.age = age and e.name = w.name and e.age = w.age and w.name = x.name and w.emp = x.eref (expanded query) • Select next iterator or disjunction from query for subplan; fail if none available. select name from PV as :p, (STRu QV) as name, WATEMPX as x, WATEMP as w where name = :p and x.name = name and w.emp = x.eref (plan) (nested loops;pointer navigation)
select name, age from EMP as e, PV as :p, (STRu QV) as name, (INTuQV) as age where e.loc = ‘Waterloo’ and e.name = :p and e.name = name and e.age = age (expanded query) Differential query optimization (cont’d) • If expand of subplan contains subquery, return subplan; otherwise repeat from 3. select name from PV as :p, (STRu QV) as name, WATEMPX as x, WATEMP as w where name = :p and x.name = name and w.emp = x.eref (plan)
select name, age from EMP as e, PV as :p, (STRu QV) as name, (INTuQV) as age where e.loc = ‘Waterloo’ and e.name = :p and e.name = name and e.age = age (expanded query) Differential query optimization (cont’d) • If expand of subplan contains subquery, return subplan; otherwise repeat from 3. select name, age from PV as :p, (STRu QV) as name, WATEMPX as x, WATEMP as w, INT as age, EMP as e where name = :p and x.name = name and w.emp = x.eref and w.name = x.name and e.name = w.name and e.age = w.age and e.age = age and e.loc = ‘Waterloo’ (expandedplan)
Differential query optimization (cont’d) select name, age from EMP as e, PV as :p, (STRu QV) as name, (INTuQV) as age, WATEMP as w, WATEMPX as x where e.loc = ‘Waterloo’ and e.name = :p and e.name = name and e.age = age and e.name = w.name and e.age = w.age and w.name = x.name and w.emp = x.eref (expanded query) • Select next iterator or disjunction from query for subplan; fail if none available. select name from PV as :p, (STRu QV) as name, WATEMPX as x, WATEMP as w where name = :p and x.name = name and w.emp = x.eref (plan)
Differential query optimization (cont’d) select name, age from EMP as e, PV as :p, (STRu QV) as name, (INTuQV) as age, WATEMP as w, WATEMPX as x where e.loc = ‘Waterloo’ and e.name = :p and e.name = name and e.age = age and e.name = w.name and e.age = w.age and w.name = x.name and w.emp = x.eref (expanded query) • Select next iterator or disjunction from query for subplan; fail if none available. select name, age from PV as :p, (STRu QV) as name, WATEMPX as x, WATEMP as w, (INTuQV) as age where name = :p and x.name = name and w.emp = x.eref and age = w.age (plan) (nested loops;field extraction)
select name, age from EMP as e, PV as :p, (STRu QV) as name, (INTuQV) as age where e.loc = ‘Waterloo’ and e.name = :p and e.name = name and e.age = age (expanded query) Differential query optimization (cont’d) • If expand of subplan contains subquery, return subplan; otherwise repeat from 3. select name, age from PV as :p, (STRu QV) as name, WATEMPX as x, WATEMP as w, (INTuQV) as age where name = :p and x.name = name and w.emp = x.eref and age = w.age (plan)
select name, age from EMP as e, PV as :p, (STRu QV) as name, (INTuQV) as age where e.loc = ‘Waterloo’ and e.name = :p and e.name = name and e.age = age (expanded query) Differential query optimization (cont’d) • If expand of subplan contains subquery, return subplan; otherwise repeat from 3. select name, age from PV as :p, (STRu QV) as name, WATEMPX as x, WATEMP as w, (INTuQV) as age, EMP as e where name = :p and x.name = name and w.emp = x.eref and age = w.age and w.name = x.name and e.name = w.name and e.age = w.age and e.age = age and e.loc = ‘Waterloo’ (expandedplan)
Differential query optimization (cont’d) select name, age from PV as :p, // input parameter (STRuQV) as name, // string copy WATEMPX as x, // index scan WATEMP as w, // pointer navigation (INTuQV) as age // field extraction where name = :p and x.name = name and w.emp = x.eref and age = x.age (final plan)
Dialect DLA D ::= (universal concept) | >D (primitive concept) | C (C)I (bottom concept) | ? ; (negation) | :D D – (D)I (intersection) | D1uD2 (D1)IÅ (D2)I (union) | D1tD2 (D1)I[ (D2)I (attribute value restriction) | 8A.D {e : (A)I(e) 2 (D)I} (quantified attribute inverse) | (>n A) {e1 : |{e2 : (A)I(e2) =e1}| ¸n} (quantified attribute inverse) | (6n A) {e1 : n¸ |{e2 : (A)I(e2) =e1}|} (path agreement) | Pf1 = Pf2 {e : (Pf1)I(e) = (Pf2)I(e)} (path disagreement) | Pf1¹Pf2 {e : (Pf1)I(e) ¹ (Pf2)I(e)} (path functional dependency) | C : L1!L2 (next slide) (an individual) | ¤2D Pf ::= id | A . Pf L ::= id | A | L . L | L , L | { L }
Semantics of PFDs (C : L1!L2 )I ´fe1 : 8e22 (C)I : Æ (Pf )I (e1) = (Pf )I (e2) !Æ (Pf )I (e1) = (Pf )I (e2)g Pf2L(L1) Pf 2L(L2) where L(id) ´ {id} L(A) ´ {A . id} L(L1 . L2) ´ {Pf1±Pf2 : Pf12L(L1) ÆPf22L(L2)} L(L1 , L2) ´L(L1) [ L(L2) L({L}) ´L(L) where id ±Pf´Pf (A . Pf1) ±Pf2´A . (Pf1±Pf2)
EMP 28 INT age emp RID @2 emp (RID) name (STR) age (INT) loc (STR) EMP @1 @2 @3 Mary Ann Fred 37 28 33 Waterloo Toronto Waterloo Toronto STR loc name STR Ann 37 INT age emp RID @1 EMP name STR Mary Waterloo STR loc emp RID @3 EMP 33 INT age name STR Fred Sample partial database
DLA: primitive concepts • STR 28 INT age emp RID @2 EMP Toronto STR loc name STR Ann 37 INT age emp RID @1 EMP name STR Mary Waterloo STR loc emp RID @3 EMP 33 INT age name STR Fred
DLA: primitive concepts (cont’d) • STR • EMP 28 INT age emp RID @2 EMP Toronto STR loc name STR Ann 37 INT age emp RID @1 EMP name STR Mary Waterloo STR loc emp RID @3 EMP 33 INT age name STR Fred
DLA: attribute value restriction • STR • EMP • 8loc.STR 28 INT age emp RID @2 EMP Toronto STR loc name STR Ann 37 INT age emp RID @1 EMP name STR Mary Waterloo STR loc emp RID @3 EMP 33 INT age name STR Fred
DLA: path functional dependencies • STR • EMP • 8loc.STR • EMP:{loc}!{name} 28 INT age emp RID @2 EMP Toronto STR loc name STR Ann 37 INT age emp RID @1 EMP name STR Mary Waterloo STR loc emp RID @3 EMP 33 INT age name STR Fred
DLA: quantified attribute inverse • STR • EMP • 8loc.STR • EMP:{loc}!{name} • (> 1loc) 28 INT age emp RID @2 EMP Toronto STR loc name STR Ann 37 INT age emp RID @1 EMP name STR Mary Waterloo STR loc emp RID @3 EMP 33 INT age name STR Fred
DLA: path agreement • STR • EMP • 8loc.STR • EMP:{loc}!{name} • (> 1loc) • name = loc 28 INT age emp RID @2 EMP Toronto STR loc name STR Ann 37 INT age emp RID @1 EMP name STR Mary Waterloo STR loc emp RID @3 EMP 33 INT age name STR Fred
DLA: intersection • STR • EMP • 8loc.STR • EMP:{loc}!{name} • (> 1loc) • name = loc • RIDu(> 1emp) 28 INT age emp RID @2 EMP Toronto STR loc name STR Ann 37 INT age emp RID @1 EMP name STR Mary Waterloo STR loc emp RID @3 EMP 33 INT age name STR Fred
DLA: negation • STR • EMP • 8loc.STR • EMP:{loc}!{name} • (> 1loc) • name = loc • RIDu(> 1emp) • :(> 1emp) 28 INT age emp RID @2 EMP Toronto STR loc name STR Ann 37 INT age emp RID @1 EMP name STR Mary Waterloo STR loc emp RID @3 EMP 33 INT age name STR Fred
28 INT age emp RID @2 EMP Toronto STR loc name STR Ann 37 INT age emp RID @1 EMP name STR Mary Waterloo STR loc emp RID @3 EMP 33 INT age name STR Fred DLA: dependencies Models • EMPv8loc.STR
28 INT age emp RID @2 EMP Toronto STR loc name STR Ann 37 INT age emp RID @1 EMP name STR Mary Waterloo STR loc emp RID @3 EMP 33 INT age name STR Fred DLA: dependencies (cont’d) Models • EMPv8loc.STR • (> 1loc)v:INT
28 INT age emp RID @2 EMP Toronto STR loc name STR Ann 37 INT age emp RID @1 EMP name STR Mary Waterloo STR loc emp RID @3 EMP 33 INT age name STR Fred DLA: dependencies (cont’d) Models • EMPv8loc.STR • (> 1loc)v:INT • EMPvEMP:{Name}!{id}
28 INT age emp RID @2 EMP Toronto STR loc name STR Ann 37 INT age emp RID @1 EMP name STR Mary Waterloo STR loc emp RID @3 EMP 33 INT age name STR Fred DLA: dependencies (cont’d) Models • EMPv8loc.STR • (> 1loc)v:INT • EMPvEMP:{Name}!{id} • Does not model • EMPvSTR
28 INT age emp RID @2 EMP Toronto STR loc name STR Ann 37 INT age emp RID @1 EMP name STR Mary Waterloo STR loc emp RID @3 EMP 33 INT age name STR Fred DLA: dependencies (cont’d) Models • EMPv8loc.STR • (> 1loc)v:INT • EMPvEMP:{Name}!{id} • Does not model • EMPvSTR • EMPvEMP:{Loc}!{Name}
Capturing database schema in DLA (recallalternative view on views) • Tables • QL dependencies create table EMP ( attribute name on STR; attribute age on INT; attribute loc on STR; primary key (name); updatable; ) create table WATEMP ( attribute name on STR; attribute age on INT; ) create table TOREMP ( attribute name on STR; ) ( select name, age from EMP where loc = ‘Waterloo’ ) ´ ( select name, age from WATEMP ) ( select name from EMP where loc = ‘Toronto’ ) ´ ( select name from TOREMP ) ( select name from EMP ) ´ ( select name from WATEMP ) union all ( select name from TOREMP )
DLA schema: capturing tables EMPv (8name.STR) u (8age.INT) u (8loc.STR) WATEMPv (8name.STR) u (8age.INT) TOREMPv8name.STR (signatures; typing) create table EMP ( attribute name on STR; attribute age on INT; attribute loc on STR; primary key (name); updatable; ) create table WATEMP ( attribute name on STR; attribute age on INT; ) create table TOREMP ( attribute name on STR; )
DLA schema: capturing tables EMPv (8name.STR) u (8age.INT) u (8loc.STR) WATEMPv (8name.STR) u (8age.INT) TOREMPv8name.STR EMPvEMP:{name}!{id} create table EMP ( attribute name on STR; attribute age on INT; attribute loc on STR; primary key (name); updatable; ) create table WATEMP ( attribute name on STR; attribute age on INT; ) create table TOREMP ( attribute name on STR; ) (primary keys)
DLA schema: capturing tables EMPv (8name.STR) u (8age.INT) u (8loc.STR) WATEMPv (8name.STR) u (8age.INT) TOREMPv8name.STR EMPvEMP:{name}!{id} CLASS v OBJECTu:DOMAIN DOMAIN v OBJECTu:CLASS EMPvCLASSuUPDATABLEu:WATEMPu:TOREMP WATEMPvCLASSu:EMPu:TOREMP TOREMPvOBJECTu:EMPu:WATEMP INTvDOMAINu:STR STRvDOMAINu:INT create table EMP ( attribute name on STR; attribute age on INT; attribute loc on STR; primary key (name); updatable; ) create table WATEMP ( attribute name on STR; attribute age on INT; ) create table TOREMP ( attribute name on STR; ) (generalization hierarchies)
QL dependencies to DLA dependencies (class assignment) ( select name, age from EMP where loc = ‘Waterloo’ ) ´ ( select name, age from WATEMP ) ( select name from EMP where loc = ‘Toronto’ ) ´ ( select name from TOREMP ) ( select name from EMP ) ´ ( select name from WATEMP ) union all ( select name from TOREMP )