Download Presentation
## Relational Algebra

- - - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - - -

**Relational Algebra**Murali Mani**What is Relational Algebra?**• Defines operations (data retrieval) for relational model • SQL’s DML (Data Manipulation Language) has data retrieval facilities, which are equivalent to that of relational algebra. • SQL and relational algebra are not for complex operations; they support efficient, easy access of large data sets. Murali Mani**Basics**• Relational Algebra is defined on bags, rather than relations (sets). • Bag or multiset allows duplicate values; but order is not significant. • We can write an expression using relational algebra operators with parentheses • Need closure – an operator on bag returns a bag. • Relational algebra includes set operators, and other operators specific to relational model. Murali Mani**Set Operators**• Union, Intersection, Difference, cross product • Union, Intersection and Difference are defined only for union compatible relations. • Two relations are union compatible if they have the same set of attributes and the types (domains) of the attributes are the same. • Eg of two relations that are not union compatible: • Student (sNumber, sName) • Course (cNumber, cName) Murali Mani**Union: **• Consider two bags R1 and R2 that are union-compatible. Suppose a tuple t appears in R1m times, and in R2n times. Then in the union, t appears m + n times. R1 R2 R1 R2 Murali Mani**Intersection: ∩**• Consider two bags R1 and R2 that are union-compatible. Suppose a tuple t appears in R1m times, and in R2n times. Then in the intersection, t appears min (m, n) times. R1 R2 R1 ∩R2 Murali Mani**Difference: -**• Consider two bags R1 and R2 that are union-compatible. Suppose a tuple t appears in R1m times, and in R2n times. Then in R1 – R2, t appears max (0, m - n) times. R1 R2 R1 – R2 Murali Mani**Bag semantics vs Set semantics**• Union is idempotent for sets: (R1 R2) R2 = R1 R2 • Union is not idempotent for bags. • Intersection is idempotent for sets and bags. • Difference is idempotent for sets, but not for bags. • For sets and bags, R1R2 = R1 – (R1 – R2). Murali Mani**Cross Product (Cartesian Product): X**• Consider two bags R1 and R2. Suppose a tuple t1 appears in R1m times, and a tuple t2 appears in R2n times. Then in R1 X R2, t1t2 appears mn times. R1 XR2 R1 R2 Murali Mani**Basic Relational Operations**• Select, Project, Join • Select: denoted σC (R): selects the subset of tuples of R that satisfies selection condition C. C can be any boolean expression, its clauses can be combined with AND, OR, NOT. σ(C ≥ 6) (R) R Murali Mani**Select**• Select is commutative: σC2 (σC1 (R)) = σC1 (σC2 (R)) • Select is idempotent: σC (σC (R)) = σC (R) • We can combine multiple select conditions into one condition. σC1 (σC2 (… σCn (R)…)) = σC1 AND C2 AND … Cn (R) Murali Mani**Project: πA1, A2, …, An (R)**• Consider relation (bag) R with set of attributes AR. πA1, A2, …, An (R), where A1, A2, …, An AR returns the tuples in R, but only with columns A1, A2, …, An. πA, B (R) R Murali Mani**Project: Bag Semantics vs Set Semantics**• For bags, the cardinality of R = cardinality of πA1, A2, …, An (R). • For sets, cardinality of R ≥ cardinality of πA1,A2, …, An (R). • For sets and bags • project is not commutative • project is idempotent Murali Mani**Natural Join: R ⋈ S**• Consider relations (bags) R with attributes AR, and S with attributes AS. Let A = AR∩ AS. R ⋈ S can be defined as πAR – A, A, AS - A (σR.A1 = S.A1 AND R.A2 =S.A2 AND … R.An=S.An (R X S)) where A = {A1, A2, …, An} The above expression says: select those tuples in R X S that agree in values for each of the A attributes, and project the resulting tuples such that we have only one value for each A attribute. Murali Mani**Natural Join example**R1 R2 R1 ⋈R2 Murali Mani**Theta Join: R ⋈C S**• Theta Join is similar to cartesian product, except that we can specify any condition C. It is defined as R ⋈C S = (σC (R X S)) R1 ⋈ R1.B<R2.BR2 R1 R2 Murali Mani**Outer Join: R ⋈o S**• Similar to natural join, however, if there is a tuple in R, that has no “matching” tuple in S, or a tuple in S that has no matching tuple in R, then that tuple also appears, with null values for attributes in S (or R). R1 ⋈o R2 R1 R2 Murali Mani**Left Outer Join: R ⋈oLS**• Similar to natural join, however, if there is a tuple in R, that has no “matching” tuple in S, then that tuple also appears, with null values for attributes in S (note: a tuple in S that has no matching tuple in R does not appear). R1 ⋈oLR2 R1 R2 Murali Mani**Right Outer Join: R ⋈oRS**• Similar to natural join, however, if there is a tuple in S, that has no “matching” tuple in R, then that tuple also appears, with null values for attributes in R (note: a tuple in R that has no matching tuple in S does not appear). R1 ⋈oRR2 R1 R2 Murali Mani**Renaming: ρS(A1, A2, …, An) (R)**• Rename relation R to S, attributes of R are renamed to A1, A2, …, An • ρS (R) renames relation R to S, keeping the attributes same. ρS(X, C, D) (R2) ρS (R2) R2 S S Murali Mani**Example: Introducing new relations**Find the semijoin of 2 relations R, S. Semijoin denoted R ⋉ S is defined as the tuples in R, such that for a tuple t1 in R, if there exists a tuple t2 in S, and t1 and t2 agree in all attributes common to R and S, then t1 appears in the result. R1 = R ⋈ S R2 = πAR (R1) R ⋉ S = R2 ⋂ R Murali Mani**Duplicate Elimination: (R)**• Convert a bag to a set. (R) R Murali Mani**Extended Projection: πL (R)**• Here L can be • An attribute (just like simple projection) • An expression x → y, where x and y are names of attributes, this renames attribute x to y. • An expression E → z, where E is any expression involving attributes, constants, and arithmetic and string operators. This has an attribute called z whose values are given by E. πB→A, C+D→X, C, D (R) R Murali Mani**Aggregation operators**• MIN, MAX, COUNT, SUM, AVG • AGGB(R) considers only non-null values of R. SUMB (R) COUNTB (R) MINB (R) R AVGB (R) COUNT* (R) MAXB (R) Murali Mani**Aggregation Operators**• MIN, MAX, SUM, AVG must be on any 1 attribute. COUNT can be on any 1 attribute or COUNT* (R) • An aggregation operator returns a bag, not a single value ! But SQL allows treatment as a single value. σB=MAXB (R) (R) Murali Mani**Grouping Operator: GL, AL (R)**• GL, AL (R) groups all attributes in GL, and performs the aggregation specified in AL. starName, MIN (year)→year, COUNT(title) →num (StarsIn) StarsIn Murali Mani**Sorting Operator: L (R)**• It sorts the tuples in R. If L is list A1, A2, …, An, it first sorts by A1, then by A2, and so on. • Sort is used as a last operator in an expression. A,B (R) R Murali Mani**Relational Algebra Operators**• Set Operators: Union, Intersection, Difference, Cartesian Product • Select, Project • Join: Natural Join, Theta Join, (Left/Right) Outer Join • Renaming, Duplicate Elimination • Aggregation: MIN, MAX, COUNT, SUM, AVG • Grouping, Sorting Murali Mani