1 / 55

An Introduction to Database System

An Introduction to Database System. Chap 6. Relational Algebra. 6.1 Introduction (1/6). the manipulative part of the relational model is called relational algebra Relational Algebra consists of a collection of operators, such as join, that take relations as their operands and

louis
Télécharger la présentation

An Introduction to Database System

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. An Introduction to Database System Chap 6. Relational Algebra

  2. 6.1 Introduction (1/6) • the manipulative part of the relational model • is called relational algebra • Relational Algebra • consists of a collection of operators, • such as join, that take relations as their operands and • return relations as their result • "original" algebra • Codd's eight operators

  3. 6.1 Introduction (2/6)An Overview of the Original Algebra • the relational algebra as defined by Codd • Four traditional set operators (, , , ) • Intersect () • Union () • Difference () • Cartesian product ()(or simply product) • Four special relational operators (, , ⋈, ) • Restrict () (also known as Select) • Project () • Join (⋈) • Divide ()

  4. 6.1 Introduction (3/6)An Overview of the Original Algebra • simple definition of operators • RESTRICT : • all tuples from a specified relation that satisfy a specified condition PROJECT : • all tuples that remains as subtuples in a specified relation after specified attributes have been eliminated

  5. 6.1 Introduction (4/6)An Overview of the Original Algebra • simple definition of operators • product (Cartesian product) • combination of two tuples • union a x a y b x b y c x c y R1 R2 a b c x y product

  6. 6.1 Introduction (5/6)An Overview of the Original Algebra • simple definition of operator • intersect • difference

  7. A B B C a1 b1 a2 b1 a3 b2 b1 c1 b2 c2 b3 c3 a1 b1 c1 a2 b1 c1 a3 b2 c2 R1 R2 R1 R2 B (Natural) join 6.1 Introduction (6/6)An Overview of the Original Algebra • simple definition of operator • join • divide divide a b c a a a b c x y z x y x z a

  8. 6.2 closure • closure property • output of any operation • is the same kind of object(relations) as the input • S JOIN P • the output from one operation can become input to another • possible to make nested expression • (S JOIN P) where CITY = ‘Athens’ • RENAME operator • to rename attributes within a specified relation • S RENAME CITY AS SCITY

  9. 6.3 A Syntax for the Relational Algebra (1/2) <relational expression> ::= RELATION { <tuple expression commalist> } | <relvar name> | <relational operation> | (<relational expression>) <relational operation> ::= <project> | <nonproject> <project> ::= <relational expression> { [ALL BUT ] < attribute commalist> } <nonproject> ::= <rename> | <union> | <intersect> | <minus> | <times> | <restrict> | <join> | <divide> <rename> ::= <relational expression> RENAME <renaming commalist>

  10. 6.3 A Syntax for the Relational Algebra (2/2) <union> ::= <relational expression> UNION <relational expression> <intersect> ::= <relational expression> INTERSECT <relational expression> <minus> ::= <relational expression> MINUS <relational expression> <times> ::= <relational expression> TIMES <relational expression> <restrict> ::= <relational expression> WHERE <boolean expression> <join> ::= <relational expression> JOIN <relational expression> <divide> ::= <relational expression> DIVIDEBY <relational expression> PER <per> <per> ::= <relational expression> | (<relational expression>, <relational expression>)

  11. 6.4 Semantics • type-compatibility • "same shape" concept • two relations are type compatible if they have identical headings • same set of attributes • defined on same domain • union, intersect, and difference require their operands to be type-compatible

  12. The Suppliers-Parts DB: Tables S# SNAME STATUS CITY S# P# QTY S SP S1 S2 S3 S4 S5 Smith Jones Blake Clark Adams 20 10 30 20 30 London Paris Paris London Athens S1 S1 S1 S1 S1 S1 S2 S2 S3 S4 S4 S4 P1 P2 P3 P4 P5 P6 P1 P2 P2 P2 P4 P5 300 200 400 200 100 100 300 400 200 200 300 400 P# PNAME COLOR WEIGHT CITY P P1 P2 P3 P4 P5 P6 Nut Bolt Screw Screw Cam Cog Red Green Blue Red Blue Red 12 17 17 14 12 19 London Paris Rome London Paris London

  13. 6.4 Semantics : Union • Union • Returns a relation containing all tuples that appear in “either or both” of two specified relations. • The two input relations must be of the same type - i.e., have the same heading. • Syntax: <relational expr> UNION <relational expr> • Example: If A and B are two relations of the same type, A UNION B

  14. 6.4 Semantics : Intersect • Intersect • Returns a relation containing all tuples that appear in “both” of two specified relations. • The two input relations must be of the same type - i.e., have the same heading. • Syntax: <relational expr> INTERSECT <relational expr> • Example: If A and B are two relations of the same type, A INTERSECT B

  15. 6.4 Semantics : Intersect A B S# SNAME • Intersect Example STATUS CITY S# SNAME STATUS CITY S1 S4 Smith Clark 20 20 London London S1 S2 Smith Jones 20 10 London Paris S# SNAME STATUS CITY Q2: A Intersect B S1 Smith 20 London Suppose A and B are derived from S; A is the suppliers in London, and B is the suppliers who supply part P1 Q : Which suppliers are located in London AND supply part P1?

  16. 6.4 Semantics : Difference • Difference • Returns a relation containing all tuples that appear in the first and not the second of two specified relations. • The two input relations must be of the same type - i.e., have the same heading. • Syntax: <relational expr> MINUS <relational expr> • Example: If A and B are two relations of the same type, A MINUS B

  17. S# SNAME STATUS CITY S# SNAME STATUS CITY S1 S4 Smith Clark 20 20 London London S1 S2 Smith Jones 20 10 London Paris Q1: A MINUS B Q2: B MINUS A S# SNAME STATUS CITY S# SNAME STATUS CITY S4 Clark 20 London S2 Jones 10 Paris 6.4 Semantics : Difference A B • Difference Example • Q1: Which suppliers are located in London AND DO NOT supply part P1? • Q2: Which suppliers supply part P1 AND ARE NOT located in London?

  18. a x a y b x b y c x c y A B a b c x y product 6.4 Semantics : Product • Product • Returns a relation containing all possible tuples that are a combination of two tuples, one from each of two specified relations, which do not have common attribute name. • Syntax: <relational expr> TIMES <relational expr> • Example: If A and B are relations with no common attribute name, A TIMES B • Number of tuples (Cardinality): Card(A TIMES B) = Card(A) * Card(B) • Number of attributes (Degree): Degree(A TIMES B)= Degree(A)+Degree(B)

  19. 6.4 Semantics : Product S# P# S# P# ... S4 S4 S5 S5 S5 S5 S5 S5 ... P5 P6 P1 P2 P3 P4 P5 P6 • Product Example • Q: What are all current supplier-number/part-number PAIRS? S1 S2 S3 S4 S5 P1 P2 P3 P4 P5 P6 S1 S1 S1 S1 S1 S1 S2 ... P1 P2 P3 P4 P5 P6 P1 ... A TIMES B A B

  20. 6.4 Semantics : Restrict • Restriction • Returns a relation containing all tuples from a specified relation that satisfy a specified condition. • Syntax: <relational expr> WHERE <boolean expr> <boolean expr> is a restriction condition, e.g., a comparison, that evaluates to a truth value (True/False). • Example: S WHERE CITY = 'London'

  21. Q1: S WHERE CITY = 'London' S# SNAME STATUS CITY S1 S4 Smith Clark 20 20 London London Q2: S WHERE CITY = 'London' OR CITY = 'Athens' S# SNAME STATUS CITY S1 S4 S5 Smith Clark Adams 20 20 30 London London Athens 6.4 Semantics : Restrict • Restrict Example • Q1: Which suppliers are located in London? • Q2: Which suppliers are located in London or Athens?

  22. 6.4 Semantics : Project • Project • Returns a relation containing all tuples that remain in a specified relation after specified attributes have been removed. • Syntax: <relational expr> {[ALL BUT] <attr name list>} [ALL BUT] is optional.

  23. Q: S {CITY} S {ALL BUT S#, SNAME, STATUS} CITY London Paris Athens 6.4 Semantics : Project • Project Example • Q: Which cities are suppliers located in? • Why does the result have three, not five, tuples? • Why don’t we duplicate 'London' and 'Paris'? • “Stating a truth twice doesn’t make it more true.”

  24. Restrict and Project • Restrict yields a "Horizontal" subset of a given relation. • Project yields a "Vertical" subset of a given relation. • Degree(A WHERE C) = Degree(A) • Card(A {C}) <= Card(A) • Why <= ?

  25. A B B C a1 b1 a2 b1 a3 b2 b1 c1 b2 c2 b3 c3 a1 b1 c1 a2 b1 c1 a3 b2 c2 R1 R2 R1 R2 B (Natural) join 6.4 Semantics : Join • JOIN • Returns a relation containing all possible tuples that are a combination of two tuples, one from each of two specified relations, • such that the two tuples contributing to any given combination have a common value for the common attributes of the two relations • (and that common value appears just once, not twice, in the result tuple).

  26. 6.4 Semantics : Join • Syntax: <relational expr> JOIN <relational expr> • Let relations A and B have headings {X1, X2, ..., Xm, Y1, Y2, ..., Yn} {Y1, Y2, ..., Yn, Z1, Z2, ..., Zp} • Then A JOIN B has heading {X1, X2, ..., Xm, Y1, Y2, ..., Yn, Z1, Z2, ..., Zp} • Card(A JOIN B) <= Card(A) * Card(B) • Degree(A JOIN B) <= Degree(A) + Degree(B)

  27. 6.4 Semantics : Join S# SNAME STATUS CITY S# P# QTY S SP S1 S2 S3 S4 S5 Smith Jones Blake Clark Adams 20 10 30 20 30 London Paris Paris London Athens S1 S1 S1 S1 S1 S1 S2 S2 S3 S4 S4 S4 P1 P2 P3 P4 P5 P6 P1 P2 P2 P2 P4 P5 300 200 400 200 100 100 300 400 200 200 300 400

  28. 6.4 Semantics : Join S# SNAME STATUS CITY P# QTY Q: S JOIN SP S1 S1 S1 S1 S1 S1 S2 S2 S3 S4 S4 S4 Smith Smith Smith Smith Smith Smith Jones Jones Blake Clark Clark Clark 20 20 20 20 20 20 10 10 30 20 20 20 London London London London London London Paris Paris Paris London London London P1 P2 P3 P4 P5 P6 P1 P2 P2 P2 P4 P5 300 200 400 200 100 100 300 400 200 200 300 400 • Q11: What suppliers supply what parts? List the info. of each supplier and the parts it supplies.

  29. Product and Join • Product: Card(A TIMES B) = Card(A) * Card(B) Degree(A TIMES B) = Degree(A)+Degree(B) • Join: Card(A JOIN B) <= Card(A) * Card(B) Degree(A JOIN B) <= Degree(A)+Degree(B) • If A and B do not have any common attribute name, (A JOIN B) and (A TIMES B) are identical. • Product is a special case of Join.

  30. 6.4 Semantics : Other Join • some other types of join • (Theta)-Join, Equi-Join, Semi-Join, and Outer-Join • The -joinof relation A on attribute X with relation B on attribute Y is defined in terms of product and restrict: (A TIMES B) WHERE X  Y q : scalar comparison operator (=, ≠, ≤, ≥, <, >) • The equi-join is a special case of -join where  is “=”. • If the headings of relation A and B are {X,Y} and {Y,Z}, A SEMIJOIN B is defined as: (A JOIN B) {X, Y}

  31. 6.4 Semantics : Divide divide • Divide • Takes two unary relations and one binary relation • and returns a relation containing all tuples from one unary relation • that appear in the binary relation matched with ALL tuples in the other unary relation. a b c a a a b c x y z x y x z a

  32. 6.4 Semantics : Divide • Syntax: <rel expr> DIVIDEBY <rel expr> PER <per> • Let relations A, B and C have headings {X}, {Y}, and {X, Y}, • then the division of A by B per C • (A is the dividend, B is the divisor, C is the “mediator”) - A DIVIDEBY B PER C • - is a relation with heading {X} and body consisting of all tuples {X: x} such that a tuple {X:x, Y:y} appears in C for ALL tuples {Y:y} in B. • The result consists of those X values for A whosecorresponding Y values in C include ALL Y values from B.

  33. 6.4 Semantics : Divide DEND MED DOR Q S# S# P# P# S# S1 S2 S3 S4 S5 S1 S1 S1 S1 S1 S1 P1 P2 P3 P4 P5 P6 S2 S2 S3 S4 S4 S4 P1 P2 P2 P2 P4 P5 P1 P2 P3 P4 P5 P6 S1 • Q: Which suppliers supply all parts? DEND DIVIDEBY DOR PER MED (S{S#}) DIVIDEBY (P{P#}) PER SP{S#, P#}

  34. 6.4 Semantics : Divide DEND MED DOR Q • Q: Which suppliers supply all parts named ‘Nut’? DEND DIVIDEBY DOR PER MED S# S# P# P# S# S1 S2 S3 S4 S5 S1 S1 S1 S1 S1 S1 P1 P2 P3 P4 P5 P6 S2 S2 S3 S4 S4 S4 P1 P2 P2 P2 P4 P5 P1 S1 S2 DEND DIVIDEBY DOR PER MED

  35. …. …. …. R1 R2 R3 6.4 Semantics : Associativity and Communativity • associative • UNION, INTERSECT, and TIMES (NOT MINUS) ( R1 UNION R2 ) UNION R3 = R1 UNION ( R2 UNION R3 ) • commutative • UNION, INTERSECT, and TIMES (NOT MINUS) R1 UNION R2 = R2 UNION R1 (R1 UNION R2) UNION R3 = R1 UNION ( R2 UNION R3) R1 UNION R2 = R1 UNION R2

  36. 6.5 Examples • 6.5.1 Get supplier names for suppliers who supply part P2. ((S JOIN SP) WHERE P#='P2') {SNAME} (S JOIN (SP WHERE P#='P2')) {SNAME} There will often be several different ways of formulating any given query

  37. 6.5 Examples • 6.5.2 Get supplier names for suppliers who supply at least one red part. ( ( ( (P WHERE COLOR='Red')JOIN SP ) {S#} ) JOIN S ) {SNAME} ( ( ( ( P WHERE COLOR = 'Red') {P#}) JOIN SP) JOIN S) {SNAME}

  38. 6.5 Examples • 6.5. 3 Get supplier names for suppliers who supply all parts ( (S{S#} DIVIDEBY P{P#} PER SP{S#, P#}) JOIN S) {SNAME} • 6.5.4 Get supplier numbers for suppliers who supply at least all those parts supplied by supplier S2. S {S#} DIVIDEBY (SP WHERE S='S2') {P#} PER SP {S#, P#}

  39. 6.5 Examples • 6.5.5 Get all pairs of supplier numbers such that the two suppliers concerned are "colocated" (i.e., located in the same city). ( ((S RENAME S# AS SA) {SA, CITY} JOIN (S RENAME S# AS SB) {SB, CITY}) WHERE SA < SB) {SA, SB} • The purpose of “where SA < SB” • Except that the Same Supplier • Except that the duplicate comparison (S1=S2, S2=S1)

  40. 6.5 Examples • 6.5.6 Get supplier names for suppliers who do not supply part P2. ((S {S#} MINUS (SP WHERE P#='P2') {S#}) JOIN S) {SNAME}

  41. 6.6 What is the Algebra For (1/3) • primitive operation • minimal set of Codd's operations • restriction, projection, product, union, difference • other three operations • join, intersection, divideby • can be defined in terms of the other five operations • five primitive operations • can not be defined in terms of the other four • A join B • (A TIMES B) where X θ Y • A intersect B • A minus (A minus B)

  42. 6.6 What is the Algebra For ? (2/3) X S# Y Z P# S# P# S1 P1 S1 P1 • X divideby Y per C, • x : set of attributes on X, y : set of attributes on Y, z : set of attributes on Z S2 P2 S1 P2 S3 P3 S1 P3 S4 S2 P1 ( X minus ((Z{z-y} times Y) minus Z) { z-y} ) S# S# P# S1 S1 P1 S2 S1 P2 S1 P3 S# S2 P1 S2 S2 P2 S# P# S2 P3 S2 P2 S2 P3 Result : S# S1

  43. 6.6 What is the Algebra For ? (3/3) • the writing of expression • defining a scope for retrieval and update • defining (named) virtual relations • defining snapshots • defining security rules • defining stability operations • defining integrity rules • relationally complete • if it is at least as powerful as the algebra • measure of relational languages

  44. 6.7 Additional Operators • Numerous additional operators have been proposed since Codd defined his eight. • Examples: • SEMIJOIN, EXTEND, SUMMARIZE, TCLOSE

  45. 6.7 Additional Operators : Semijoin • If the headings of relation A and B are {X,Y} and {Y,Z}, A SEMIJOIN B is defined to be equivalent to: (A JOIN B) {X, Y} • Get S#, SNAME, STATUS, and CITY for suppliers who supply part P2 S SEMIJOIN (SP WHERE P# = P# (‘P2’) )

  46. 6.7 Additional Operators : Semidifference • If the headings of relation A and B are {X,Y} and {Y,Z}, A SEMIMINUS B is defined to be equivalent to: A MINUS ( A SEMIJOIN B) • Get S#, SNAME, STATUS, and CITY for suppliers who do not supply part P2 S SEMIMINUS (SP WHERE P# = P# (‘P2’) )

  47. 6.7 Additional Operators : Extend • syntax • EXTEND term ADD scalar-expression AS attribute • extend P add ( weight * 454 ) as GMWT • GMWT : additional attribute • GMWT can be used in projection, restriction, etc. • Get part numbers and weights in grams. (Extend P ADD (WEIGHT*454) AS GMWT){P#,GMWT} P# p1 p2 p3 p4 p5 p6 PNAME Nut Bolt Screw Screw Cam Cog COLOR Red Green Blue Red Blue Red WEIGHT 12 17 17 14 12 19 CITY London Paris Rome London Paris London GMWT 5448 7718 7718 6356 5448 8626

  48. 6.7 Additional Operators : Summarize • Summarize Operator • “vertical” or “column-wise” computations • EXTEND : “horizontal” or “row-wise” computations • Syntax SUMMARIZE term PER ( attribute-commalist ) ADD aggregate-expression AS attribute • summarize A PER ( A1, A2, .., An) add exp as Z • A1, A2, .., An : distinct attributes • heading of result relation : { A1,...,An, Z} • new Z value : • by evaluating the aggregate expression exp on all tuples of A that have the same values for A1, A2, ..., An

  49. 6.7 Additional Operators : Summarize • Summarize Example summarize SP per (p#) add SUM (QTY) as TOTQTY P# p1 p2 p3 p4 p5 p6 TOTQTY 600 1000 400 500 500 100

  50. 6.7 Additional Operators : Summarize • Aggregate Operations • COUNT, SUM, AVG, MAX, MIN • Derives a single scalar value from the values appearing in some specified attribute of some specified relation. • Syntax: <agg op> (<rel expr> [, attr name]) • <agg op> can be COUNT, SUM, AVG, MAX, MIN. • attr name is optional.

More Related