1 / 49

Relational Algebra

Relational Algebra. Instructor: Mohamed Eltabakh meltabakh@cs.wpi.edu. Announcements. Project-Phase 1 is due NOW !!! Project-Phase 2 is out today (Nov. 4) and due on Nov. 11 Submission Guidelines (to make it easier for grading) Submit single file (word or pdf )

epotter
Télécharger la présentation

Relational Algebra

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Relational Algebra Instructor: Mohamed Eltabakh meltabakh@cs.wpi.edu

  2. Announcements • Project-Phase 1 is due NOW !!! • Project-Phase 2 is out today (Nov. 4) and due on Nov. 11 • Submission Guidelines (to make it easier for grading) • Submit single file (word or pdf) • Make sure your name (or username) is specified • Groups submit single copy of the project phases • Pickup your graded hardcopy submission from TAs during their office hours

  3. Relational Model (Recap) • Relations (Tables) + Attributes (Columns) • Integrity constraints • Create “Students” relation • Create “Courses” relation CREATE TABLE Students (sid: CHAR(20) Primary Key, name: CHAR(20), login: CHAR(10), age: INTEGER, gpa: REAL); CREATE TABLE Courses (cid: Varchar(20) Primary Key, name: string, maxCredits : integer, graduateFlag: boolean); • Create “Enrolled” relation CREATE TABLE Enrolled (sid: CHAR(20) Foreign Key References (Students.sid), cid: Varchar(20), enrollDate: date, grade: CHAR(2), Constraints fk_cidForeign Key cid References (Courses.cid));

  4. What about Converting this ERD to Relational Model status date

  5. Query Language • Query Language • Define data retrieval operations for relational model • Express easy access to large data sets in high-level language, not complex application programs • Categories of languages • Procedural: What you want and how to get it • Non-procedural, or declarative: What you want (without how) • SQL: High-level language for relational algebra. • Relational Algebra : Operator semantics based on set or bag theory • Relational algebra form underlying basis (and optimization rules) for SQL

  6. Relational Algebra • Basic operators • Set Operations (Union: ∪, Intersection: ∩ ,difference: – ) • Select: σ • Project: π • Cartesian product: x • rename: ρ • More advanced operators, e.g., grouping and joins • The operators take one or two relations as inputs and produce a new relation as an output • One input  unary operator, two inputs binary operator

  7. Relational Algebra • Allows to build expressions using composition of the available operators • For example, arithmetic expressions are expressions of operators • (w + t) / ((x + y) * 3) • In relational algebra, instead of variables we have relations

  8. Set Operators • Union, Intersection, Difference • Defined only for union compatiblerelations • Relations are union compatible iff • they have same sets of attributes (schema), and • the same types (domains) of attributes • Example : Union compatible or not? • Student (sNumber, sName) • Course (cNumber, cName) Not compatible

  9. Union:  • Consider two relations R and S that are union-compatible R S R S

  10. Union:  • Notation: R ∪ S • Defined as: • R ∪S = {t | t∈R or t∈S} • For R ∪S to be valid, they have to be union-compatible

  11. Difference: - • R – S are the tuples that appear in R and not in S • Defined as: • R – S = {t | t ∈R and t∈ S} R S R – S

  12. Intersection: ∩ • Consider two Relations R and S that are union-compatible R S R∩S

  13. Intersection: ∩ • Notation: R ∩ S • Defined as: • R ∩ S = { t | t ∈ r and t ∈ s } • Note: R ∩ S = R– (R–S) S R

  14. Selection: σ • Select: σc (R): • c is a condition on R’s attributes • Select subset of tuples from R that satisfy selection condition c σ(C ≥ 6) (R) R

  15. Selection: σ • Notation: σc(R) • cis called the selection predicate • Defined as: • σc(R) = {t | t ∈ R and c(t) is true} • cis a formula in propositional calculus consisting of terms connected by : • ∧ (and), ∨ (or), ¬ (not) • Each term is one of: • <attribute> op <attribute> | <attribute> op <constant> • op is one of: =,≠,>,≥.<.≤ • Example of selection: • σ branch_name=“Perryridge” ^ balance>1000(account)

  16. Selection: Example R σ ((A=B) ^ (D>5))(R)

  17. Project: π • πA1, A2, …, An (R), with A1, A2, …, An  attributes AR • returns all tuples in R, but only columns A1, A2, …, An • A1, A2, …, An are called Projection List πA, C (R) R

  18. Cross Product (Cartesian Product): X R X S S R

  19. Cross Product (Cartesian Product): X • Notation R x S • Defined as: • R x S = {t q | t ∈ r and q ∈ s} • Assume that attributes are all unique, otherwise renaming must be used

  20. Renaming: ρ • ρS (R) changes relation name from R to S • ρS(A1, A2, …, An) (R) renames also attributes of R to A1, A2, …, An ρS(X, C, D) (R) ρS (R) R S S

  21. Composition of Operations • Can build expressions using multiple operations • Example: σA=C(R x S) R X S S R σA=C(R x S)

  22. Banking Example • branch (branch_name, branch_city, assets) • customer (customer_name, customer_street, customer_city) • account (account_number, branch_name, balance) • loan (loan_number, branch_name, amount) • depositor (customer_name, account_number) • borrower (customer_name, loan_number)

  23. Example Queries

  24. Example Queries (Cont’d)

  25. Example Queries (Cont’d)

  26. Example Queries (Cont’d)

  27. Example Queries (Cont’d)

  28. More Operators

  29. Natural Join: R ⋈ S • Consider relations • R with attributes AR, and • S with attributes AS. • Let A = AR∩ AS = {A1, A2, …, An} – The common attributes • In English • Natural join R ⋈ Sis a Cartesian Product R X S with equality predicates on the common attributes (Set A)

  30. Natural Join: R ⋈ S • R ⋈ S can be defined as : πAR – A, A, AS - A (σR.A1 = S.A1 AND R.A2 = S.A2 AND … R.An = S.An (R X S)) Project the union of all attributes Equality on common attributes Cartesian Product

  31. Natural Join: R ⋈ S: Example S R R ⋈ S

  32. Theta Join: R ⋈C S • Theta Join is cross product, with condition C • It is defined as : R ⋈C S = (σC (R X S)) Theta join can express both Cartesian Product & Natural Join R S R ⋈R.A>=S.CS

  33. Outer Join • An extension of the join operation that avoids loss of information • Computes the join and then adds tuples form one relation that does not match tuples in the other relation to the result • Uses null values to fill in empty attributes with no matching • Types of outer join between R and S • Left outer (R o⋈ S): preserve all tuples from the left relation R • Right outer (R ⋈o S): preserve all tuples from the right relation S • Full outer (R ⋈ S): preserve all tuples from both relations o

  34. Left Outer Join (R o⋈ S): Example R S R ⋈ S (R o⋈ S)

  35. Right Outer Join (R ⋈o S): Example R S R ⋈ S (R ⋈o S)

  36. Full Outer Join (R ⋈ S): Example o R S R ⋈ S (R ⋈ S) o

  37. Assignment Operator:  • The assignment operation (←) provides a convenient way to express complex queries on multiple line • Write query as a sequence of line consisting of: • Series of assignments • Result expression containing the final answer • Assignment must always be made to a temporary relation variable • May use a variable multiple times in subsequent expressions • Example: • R1  (σ ((A=B) ^ (D>5))(R – S)) ∩ W • R2  R1 ⋈(R.A = T.C)T • Result  R1 U R2

  38. Duplicate Elimination:  (R) • Delete all duplicate records • Convert a bag to a set R  (R)

  39. Extended Projection: πL (R) • Standard project • L contains only column names of R • Extended projection • L may contain expressions and assignment operators π C, VA, X C*3+D(R)

  40. Grouping & Aggregation operator:  • Aggregation function takes a collection of values and returns a single value as a result • avg: average value • min: minimum value • max: maximum value • sum: sum of values • count: number of values • Grouing & Aggregate operation in relational algebra • g1,g2, …gm,F1(A1), F2(A2), …Fn(An) (R) • R is a relation or any relational-algebra expression • g1, g2, …gm is a list of attributes on which to group (can be empty) • Each Fiis an aggregate function applied on attribute Ai within each group

  41. Grouping & Aggregation Operator: Example S R • branch_name.sum(balance)(S) • sum(c)(R)

  42. Summary of Relational-Algebra Operators • Set operators • Union, Intersection, Difference • Selection & Projection & Extended Projection • Joins • Natural, Theta, Outer join • Rename & Assignment • Duplicate elimination • Grouping & Aggregation

  43. Example Queries Find customer names having loans with sum > 20,000 • πcustomer_name (σsum > 20,000 (customer_name, sum sum(amount)(loan ⋈ borrower)))

  44. Example Queries Find the branch name with the largest number of accounts • R1  branch_name. countAccounts count(account_number)(account) • R2  Max  max(countAccounts)(R1) • Result  πbranch_name(R1 ⋈countAccounts = Max R2)

  45. Example Queries Find customers having account balance below 100 or above 10,000 • πcustomer_name (depositor ⋈ • πaccount_number(σbalance <100 OR balance > 10,000(account)))

  46. Example Queries Find customers having account balance below 100 and loans above 10,000 • R1  πcustomer_name (depositor ⋈ πaccount_number(σbalance <100 (account))) • R2  πcustomer_name (borrower ⋈ πloan_number(σamount >10,000 (loan))) • Result  R1 ∩ R2

  47. Example Queries Find account numbers and balances for customers having loans > 10,000 • πaccount_number, balance • ( (depositor ⋈ account) ⋈ • (πcustomer_name (borrower ⋈ (σamount >10,000 (loan)))) • )

  48. Reversed Queries (what does it do)? • πcustomer_name(customer) - • (πcustomer_name(borrower) U πcustomer_name(depositer)) Find customers who neither have accounts nor loans

  49. Reversed Queries (what does it do)? • R1  (MaxLoan  max(amount)(σbranch_name= “ABC”(loan))) • Result  πcustomer_name(borrower ⋈ (R1 ⋈MaxLoan=amount^branch_name= “ABC”loan)) Find customer name with the largest loan from a branch “ABC”

More Related