270 likes | 514 Vues
Relational Algebra. Tim Kaddoura CS157A. Introduction. Relational query languages are languages for describing queries on a relational database Three variants Three variants – Relational Algebra – Relational Calculus – SQL Query languages V.S. programming languages
E N D
Relational Algebra Tim Kaddoura CS157A
Introduction • Relational query languages are languages for describing queries on a relational database • Three variants Three variants – Relational Algebra – Relational Calculus – SQL • Query languages V.S. programming languages – Query languages support easy, efficient access to large data sets.
Some Facts • Introduced by Edgar 'Ted' Coddof IBM Research in 1970. • Concept of mathematical relation as the underlying basis. • The standard database model for most transactional databases today.
What is Algebra? • A language based on operators and a domain of values • Operators map values taken from the domain into other domain values • Hence, an expression involving operators and arguments produces a value in the domain • Consider arithmetic operations +, - and * over integers. – Algebra expressions: 2+3, (46–3)+3, (7*x)+(3*x) • Relational algebra: – Domain: the set of all relations – Expression: referred to as a query
Relational Algebra • Domain: set of relations • Basic operators: select, project, union, set difference, Cartesian product • Derived operators: set intersection, division, join • Procedural: Relational expression specifies query by describing an algorithm for determining the result of an expression
Example Database STUDENT(Id, Name, Password, Address) FACULTY(Id, Name, DeptId, Password, Address) COURSE(CrsCode, DeptId, CrsName, CreditHours) REQUIRES(CrsCode, PrereqCrsCode, EnforcedSince) CLASS(CrsCode, SectionNo, Semester, Year,Textbook, ClassTime, Enrollment, MaxEnrollment, ClassroomId, InstructorId) CLASSROOM(ClassroomId, Seats) TRANSCRIPT(StudId, CrsCode, SectionNo, Semester, Year, Grade)
Examples Queries • Find all the courses student named ‘John Doe’ has completed. • Find all students who are taking a course by Prof. Lee in ‘Fall 2005’. • Find all courses taught by a faculty from the ‘CS’ department in ‘Fall 2005’. • Find all faculty who taught courses both in ‘Fall’ and ‘Spring’ 2005. • Find all faculty who did not teach courses both in ‘Fall’ and ‘Spring’ 2005. • Find all students with 4.0 GPA. • Find all students who completed a course without taking one of its prerequisites.
Relational Algebra • A relation schema is given by R(A1,…,Ak), the name of the relation and the list of the attributes in the relation • A relation is a set of tuples that are valid instances of its schema • Relational algebra expressions take as input relations and produce as output new relations. • After each operation, the attributes of the participating relations are carried to the new relation. The attributes may be renamed, but their domain remains the same.
Basic Operators • Unary (single relation) operators • SELECT ( selection-condition ) • PROJECT ( attribute-list) • Binary (two relation) operators • UNION ( ) • SET DIFFERENCE ( - ) • CARTESIAN PRODUCT ( )
Set Operators • Set operators connect two relations by set operations. • However, for a set operation R1 op R2 to be valid, R1 and R2 should be union-compatible, that is • R1, R2 have the same number of columns • The names of the attributes are the same in both relations • Attributes with the same name in both relations have the same domain
Selection SELECTION: selection-condition( R ) • Select from R all tuples that satisfy the selection-condition • The condition only refers to the attributes in R • An atomic-selection-condition is of the form • relation-attribute operation constant, or • relation-attribute operation relation-attribute • A selection-condition is obtained by boolean combination of atomic selection conditions by means of connectives AND, OR, and NOT.
Selection • Enrollment>MaxEnrollment AND Enrollment>100(CLASS) • Grade=‘A’(TRANSCRIPT) • Year=2005 AND (Semester=‘Fall’ OR Semester=‘Spring’) (CLASS) • Name like ‘%Lee’ (FACULTY)
Projection PROJECT: attribute-list (R) • Return from R all tuples, but remove from all tuples any attribute that is not in the attribute list • The attribute list can only refer to attributes in R Examples: • ID (FACULY) • ID ( Name like ‘%Lee’ FACULTY ) • CrsCode, Textbook (COURSE) • PrereqCrsCode ( CrsCode=‘CSCI4380’ REQUIRES )
Set Operators • Set operators connect two relations by set operations. • However, for a set operation R1 op R2 to be valid, R1 and R2 should be union-compatible, that is • R1, R2 have the same number of columns • The names of the attributes are the same in both relations • Attributes with the same name in both relations have the same domain
Set Operators • Given two relations R1, R2 that are union-compatible, we have that • R1 R2 returns the set of tuples that are in R1 or R2. [UNION] • R1 R2 returns the set of tuples that are both in R1 and R2. [INTERSECTION] • R1 - R2 returns the set of tuples that are in R1, but not in R2. [SET DIFFERENCE] • Note that set difference is the only negative operator, i.e. not in R2.
Set Operators • Name (FACULY) Name (STUDENT) • Address (FACULY) Address (STUDENT) • CrsCode (CLASS) - CrsCode (TRANSCRIPT) • Id (STUDENT) - StudId (TRANSCRIPT) • Problem: The two relations in this case are not union compatible (even though the attribute numbers and domains match, the names do not). • Solution For this?
Rename • Rename all the attributes in the relation • Given a relation with schema R(A1,…,An) • The expression R[B1,…,Bn] is used to rename attribute A1 as B1, …, An as Bn. • The rename operator does not change the domain of the attributes. • The rename operator does not change the number of attributes in a relation (that can be done by the projection operation).
Rename Corrected Expression • Id (STUDENT) - ( StudId (TRANSCRIPT))[Id] Example: • Temp:= InstructorId (Year=2005 AND Semester=‘Spring’COURSE) InstructorId (Year=2005 AND Semester=‘Fall’COURSE) ‘All faculty who taught courses both in ‘Fall’ and ‘Spring’ 2005.’ • Id (FACULTY) - Temp[Id] All faculty who did not teach courses both in ‘Fall’ and ‘Spring’ 2005.’’
Set Difference Find all students with 4.0 GPA or find students with all As (and at least one A grade) or find students who never got anything other than an A Temp1= StudId (Grade=’A’ (TRANSCRIPT)) Students who got at least one non-A grade Temp2 = StudId (TRANSCRIPT) Students who got at least one grade Result = Temp2 - Temp1
Cartesian Product • Given two sets, A={a1,a2,a3}, B={b1,b2}, the Cartesian Product (A B) returns the set of tuples A B={(a1,b1), (a1,b2), (a2,b1), (a2,b2), (a3,b1), (a3,b2)} • The Cartesian product for relations is a generalization of this concept.
Cartesian Product • Given two relations with schema R1(A1,…,An) and R2(B1,….,Bm), and let Temp = R1 R2 • R1 R2 returns a relation Temp with schema Temp(A1,…,An,B1,…,Bm) such that • For all possible tuples t1 in R1, and t2 in R2, there exists a tuple t in Temp, such that t is identical to t1 with respect to A1,…,An and t is identical to t2 with respect to B1,…,Bm.
Cartesian Product Find all the courses student named ‘John Doe’ has completed. CrsCode (Id=StudId(TRANSCRIPT (Name=‘John Doe’STUDENT) ) )
Join • Join is a derived operator, obtained from a Cartesian product followed by a selection condition. • Given R1 join-condition R2 • Where join-condition is a boolean combination of combinations of the form relation-attribute operation relation-attribute where the attributes on the left (and right) hand side come from R1 (and R2). • R1 join-condition R2 = join-condition (R1 R2)
Join • Output the names of all employees that earn more than their managers. Employee.Name(Employee >< MngrId=Manager.Id AND Employee.Salary > Manager.SalaryManager) • RESULT (a table with the following attributes): Employee.Name, Employee.Id, Employee.Salary, Manager.Name, Manager.Id, Manager.Salary
Division • Given R1(A1,…,An,B1,…,Bm) and R2(B1,…,Bm) R1/R2 is the “maximal” set of tuples from A1,…,An(R1) such that R1 contains all tuples in (R / S) x S. • Division is often used for “for-all”queries, e.g. “Find tuples in R that are in a relation with ‘all’ tuples in S.” • Important to note that all the attributes in the dividing relation R2 must exist in the divided relation R1!
Relational Algebra - Division • Find: • R / S • R / V • R / Y
Conclusion • Based on the concept of mathematical relation • Building block: a relation comprising of attributes within domains • Tuples + Schema = Relation