html5-img
1 / 90

Lecture 3: Relational Algebra and SQL

Lecture 3: Relational Algebra and SQL. Outline. Relational Algebra: 4.2 (except 4.2.5) SQL: 5.2, 5.3, 5.4. Querying the Database. Goal: specify what we want from our database Find all the employees who earn more than $50,000 and pay taxes in New Jersey.

iria
Télécharger la présentation

Lecture 3: Relational Algebra and SQL

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Lecture 3: Relational Algebra and SQL

  2. Outline • Relational Algebra: 4.2 (except 4.2.5) • SQL: 5.2, 5.3, 5.4

  3. Querying the Database • Goal: specify what we want from our database • Find all the employees who earn more than $50,000 and pay taxes in New Jersey. • Could write in C++/Java, but bad idea • Instead use high-level query languages: • Theoretical: Relational Algebra, Datalog • Practical: SQL • Relational algebra: a basic set of operations on relations that provide the basic principles.

  4. Instances of Branch and Staff (part) Relations

  5. Relational Algebra • Operators: relations as input, new relation as output • Five basic RA operators: • Set Operators • union, difference • Selection: s • Projection: p • Cartesian Product: X • Derived operators: • Intersection, complement • Joins (natural,equi-join, theta join, semi-join) • When our relations have attribute names: • Renaming: r

  6. Set Operations: Union • Union: all tuples in R1 or R2 • Notation: R1 U R2 • R1, R2 must have the same schema • R1 U R2 has the same schema as R1, R2 • Example: • ActiveEmployees U RetiredEmployees

  7. Union • R  S • Union of two relations R and S defines a relation that contains all the tuples of R, or S, or both R and S, duplicate tuples being eliminated. • R and S must be union-compatible. • If R and S have I and J tuples, respectively, union is obtained by concatenating them into one relation with a maximum of (I + J) tuples. Example: List all cities where there is either a branch office or a property for rent. • Pcity(Branch) union Pcity(PropertyForRent) or R = Branch[city] union PropertyForRent[city]

  8. Intersection • R  S • Defines a relation consisting of the set of all tuples that are in both R and S. • R and S must be union-compatible. • Expressed using basic operations: R  S = R – (R – S) Example: List all cities where there is both a branch office and at least one property for rent. city(Branch) city(PropertyForRent)

  9. Difference (Minus) • R – S • Defines a relation consisting of the tuples that are in relation R, but not in S. • R and S must be union-compatible. • List all cities where there is a branch office but no properties for rent. city(Branch) – city(PropertyForRent) Or R = Branch [city] - PropertyForRent [city]

  10. Set Operations: Difference • Difference: all tuples in R1 and not in R2 • Notation: R1 – R2 • R1, R2 must have the same schema • R1 - R2 has the same schema as R1, R2 • Example • AllEmployees - RetiredEmployees

  11. Relational Algebra Operations

  12. Set Operations: Selection • Returns all tuples which satisfy a condition • Notation: sc(R) • c is a condition: =, <, >, and, or, not • Output schema: same as input schema • Find all employees with salary > $40,000: • s Salary > 40000(Employee)

  13. Find all employees with salary more than $40,000. s Salary > 40000(Employee)

  14. Restriction (or Selection) • Works on a single relation R and defines a relation that contains only those tuples (rows) of R that satisfy the specified condition (predicate). Example: List all staff with a salary greater than US$10,000. salary > 10000 (Staff) or R = STAFF Where Salary > 10000

  15. Projection • Unary operation: returns certain columns • Eliminates duplicate tuples ! • Notation: P A1,…,An(R) • Input schema R(B1,…,Bm) • Condition: {A1, …, An}  {B1, …, Bm} • Output schema S(A1,…,An) • Example: project social-security number and names: • PSSN, Name (Employee)

  16. Projection • col1, . . . , coln(R) • Works on a single relation R and defines a relation that contains a vertical subset of R, extracting the values of specified attributes and eliminating duplicates. • Produce a list of salaries for all staff, showing only staffNo, fName, lName, and salary details. staffNo, fName, lName, salary(Staff) or Staff [staffNo, fName, lName, salary]

  17. PSSN, Name (Employee)

  18. Cartesian Product • Each tuple in R1 with each tuple in R2 • Notation: R1 x R2 • Input schemas R1(A1,…,An), R2(B1,…,Bm) • Condition: {A1,…,An} ∩ {B1,…Bm} = F • Output schema is S(A1, …, An, B1, …, Bm) • Notation: R1 x R2 • Example: Employee x Dependents • Very rare in practice; but joins are very often

  19. Cartesian Product (Multiplication) • R X S • Defines a relation that is the concatenation of every tuple of relation R with every tuple of relation S. • List the names and comments of all clients who have viewed a property for rent. (clientNo, fName, lName(Client)) X (clientNo, propertyNo, comment (Viewing)) or Client [clientNo, fName, lName] x Viewing [clientNo, propertyNo, comment ]

  20. Example - Cartesian product and Selection • Use selection operation to extract those tuples where Client.clientNo = Viewing.clientNo. sClient.clientNo = Viewing.clientNo((ÕclientNo,fName,lName(Client))  (ÕclientNo,propertyNo,comment(Viewing))) • Cartesian product and Selection can be reduced to a single operation called a Join.

  21. Renaming • Does not change the relational instance • Changes the relational schema only • Notation: rB1,…,Bn (R) • Input schema: R(A1, …, An) • Output schema: S(B1, …, Bn) • Example: • LastName, SocSocNo (Employee)

  22. Renaming Example Employee Name SSN John 999999999 Tony 777777777 • LastName, SocSocNo (Employee) LastName SocSocNo John 999999999 Tony 777777777

  23. Derived Operations • Intersection can be derived: • R1 ∩ R2 = R1 – (R1 – R2) • There is another way to express it (later) • Most importantly: joins, in many variants

  24. Join • Join is a derivative of Cartesian product. • Equivalent to performing a Selection, using join predicate as selection formula, over Cartesian product of the two operand relations. • One of the most difficult operations to implement efficiently in an RDBMS and one reason why RDBMSs have intrinsic performance problems.

  25. Join • Join is a derivative of Cartesian product. • Equivalent to performing a Selection, using join predicate as selection formula, over Cartesian product of the two operand relations. • One of the most difficult operations to implement efficiently in an RDBMS and one reason why RDBMSs have intrinsic performance problems. • Various forms of join operation • Natural join (defined by Codd) • Outer join • Theta join • Equijoin (a particular type of Theta join) • Semijoin

  26. Theta join (-join) • R FS • Defines a relation that contains tuples satisfying the predicate F from the Cartesian product of R and S. • The predicate F is of the form R.ai S.bi where  may be one of the comparison operators (<, , >, , =, ).

  27. Theta join (-join) • Can rewrite Theta join using basic Selection and Cartesian product operations. R FS = F(R  S) • Degree of a Theta join is sum of degrees of the operand relations R and S. If predicate F contains only equality (=), the term Equijoin is used.

  28. Example - Equijoin • List the names and comments of all clients who have viewed a property for rent. (clientNo, fName, lName(Client)) Client.clientNo = Viewing.clientNo (clientNo, propertyNo, comment(Viewing))

  29. EQUIJOIN • SELECT table.column, table.column FROM table1, table2 WHERE table1.column1 = table2.column2;

  30. Natural Join • Notation: R1 R2 • Input Schema: R1(A1, …, An), R2(B1, …, Bm) • Output Schema: S(C1,…,Cp) • Where {C1, …, Cp} = {A1, …, An} U {B1, …, Bm} • Meaning: combine all pairs of tuples in R1 and R2 that agree on the attributes: • {A1,…,An} ∩ {B1,…, Bm} (called the join attributes) • Equivalent to a cross product followed by selection • Example Employee Dependents

  31. Employee Dependents = PName, SSN, Dname(sSSN=SSN2(Employee x rSSN2, Dname(Dependents)) Natural Join Example Employee Name SSN John 999999999 Tony 777777777 Dependents SSN Dname 999999999 Emily 777777777 Joe Name SSN Dname John 999999999 Emily Tony 777777777 Joe

  32. Natural Join • List the names and comments of all clients who have viewed a property for rent. (clientNo, fName, lName(Client)) Join (clientNo, propertyNo, comment(Viewing)) Or Client [clientNo, fName, lName] Join Viewing [clientNo, propertyNo, comment ]

  33. Another Natural Join Example • R= S= • R S=

  34. Natural Join • Given the schemas R(A, B, C, D), S(A, C, E), what is the schema of R S ? • Given R(A, B, C), S(D, E), what is R S ? • Given R(A, B), S(A, B), what is R S ?

  35. Theta Join • A join that involves a predicate • Notation: R1 q R2 where q is a condition • Input schemas: R1(A1,…,An), R2(B1,…,Bm) • {A1,…An} ∩ {B1,…,Bm} = f • Output schema: S(A1,…,An,B1,…,Bm) • Derived operator: R1 q R2 = sq (R1 x R2)

  36. Semijoin • R S = PA1,…,An (R S) • Where the schemas are: • Input: R(A1,…An), S(B1,…,Bm) • Output: T(A1,…,An)

  37. Outer join • To display rows in the result that do not have matching values in the join column, use Outer join. • R S • (Left) outer join is join in which tuples from R that do not have matching values in common columns of S are also included in result relation.

  38. Outer Join • To display rows in the result that do not have matching values in the join column, use Outer join. • R Left Outer Join S • (Left) outer join is join in which tuples from R that do not have matching values in common columns of S are also included in result relation. Example: • Produce a status report on property viewings. propertyNo, street, city(PropertyForRent) Left Outer Join Viewing

  39. Types of Joins • Left Outer Join • keep all of the tuples from the “left” relation • join with the right relation • pad the non-matching tuples with nulls • Right Outer Join • same as the left, but keep tuples from the “right” relation • Full Outer Join • same as left, but keep all tuples from both relations

  40. Left Outer Join name phone name email R= S= • If we do a left outer join on R and S, and we match on the first column, the result is: name phone email

  41. Example - Left Outer join • Produce a status report on property viewings. propertyNo, street, city(PropertyForRent) Viewing

  42. Right Outer Join name email name phone R= S= • If we do a right outer join on R and S, and we match on the first column, the result is: name phone email

  43. Full Outer Join name email name phone R= S= • If we do a full outer join on R and S, and we match on the first column, the result is: name phone email

  44. OUTER JOIN Example 1

  45. OUTER JOIN Example 2

  46. Division • Identify all clients who have viewed all properties with three rooms. (clientNo, propertyNo(Viewing))  (propertyNo(rooms = 3 (PropertyForRent)))

  47. Natural join • R S • An Equijoin of the two relations R and S over all common attributes x. One occurrence of each common attribute is eliminated from the result.

  48. Example - Natural join • List the names and comments of all clients who have viewed a property for rent. (clientNo, fName, lName(Client)) (clientNo, propertyNo, comment(Viewing))

  49. Semijoin • R F S • Defines a relation that contains the tuples of R that participate in the join of R with S. • Can rewrite Semijoin using Projection and Join: • R F S = A(R F S)

More Related