1 / 88

Chapters 3-6 Relational Data Models, Relational Constraints, and Relational Algebra

Chapters 3-6 Relational Data Models, Relational Constraints, and Relational Algebra. Flat file: A two dimensional array of attributes or data items ProductX 1 Bellaire 5 ProductY 2 Sugarland 5

whitney
Télécharger la présentation

Chapters 3-6 Relational Data Models, Relational Constraints, and Relational Algebra

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Chapters 3-6Relational Data Models, Relational Constraints, and Relational Algebra • Flat file: A two dimensional array of attributes or data items ProductX 1 Bellaire 5 ProductY 2 Sugarland 5 ProductZ 3 Houston 5 Computerization 10 Stafford 4 Reorganization 20 Houston 1 Newbenefits 30 Stafford 4 • Database Management Systems (DBMS): A generalized software system that is used to create, manage, and protect data bases Chapters 5-8

  2. Chapters 5-8

  3. Attribute: A name characteristic or property of an entity = column header Entity: A “thing” in the real world with an independent existence physical existence: person, student, car Chapters 5-8

  4. Domain - The valid set of atomic value for an attribute in a relation e.g. SSN set of 9 digits GPA: 0<= GPA <= 4.0 Atomic - each value in the domain is indivisible Name (Fname, Minit, Lname) – not atomic Fname -- atomic Minit -- atomic Lname -- atomic Chapters 5-8

  5. Relational Model Concepts • A Relation is a mathematical concept based on the ideas of sets • The model was first proposed by Dr. E.F. Codd of IBM Research in 1970 in the following paper: • "A Relational Model for Large Shared Data Banks," Communications of the ACM, June 1970 • The above paper caused a major revolution in the field of database management and earned Dr. Codd the coveted ACM Turing Award Chapters 5-8

  6. Informal Definitions • Informally, a relation looks like a table of values. • A relation typically contains a set of rows. • The data elements in each row represent certain facts that correspond to a real-world entity or relationship • In the formal model, rows are called tuples • Each column has a column header that gives an indication of the meaning of the data items in that column • In the formal model, the column header is called an attribute name (or just attribute) Chapters 5-8

  7. Formal Definitions - Schema • The Schema (or description) of a Relation: • Denoted by R(A1, A2, .....An) • R is the name of the relation • The attributes of the relation are A1, A2, ..., An • Example: CUSTOMER (Cust-id, Cust-name, Address, Phone#) • CUSTOMER is the relation name • Defined over the four attributes: Cust-id, Cust-name, Address, Phone# • Each attribute has a domain or a set of valid values. • For example, the domain of Cust-id is 6 digit numbers. Chapters 5-8

  8. Formal Definitions - Tuple • A tuple is an ordered set of values (enclosed in angled brackets ‘< … >’) • Each value is derived from an appropriate domain. • A row in the CUSTOMER relation is a 4-tuple and would consist of four values, for example: • <632895, "John Smith", "101 Main St. Atlanta, GA 30332", "(404) 894-2000"> • This is called a 4-tuple as it has 4 values • A tuple (row) in the CUSTOMER relation. • A relation is a set of such tuples (rows) Chapters 5-8

  9. Formal Definitions - Domain • A domain has a logical definition: • Example: “USA_phone_numbers” are the set of 10 digit phone numbers valid in the U.S. • A domain also has a data-type or a format defined for it. • The USA_phone_numbers may have a format: (ddd)ddd-dddd where each d is a decimal digit. • Dates have various formats such as year, month, date formatted as yyyy-mm-dd, or as dd mm,yyyy etc. • The attribute name designates the role played by a domain in a relation: • Used to interpret the meaning of the data elements corresponding to that attribute • Example: The domain Date may be used to define two attributes named “Invoice-date” and “Payment-date” with different meanings Chapters 5-8

  10. Formal Definitions - State • The relation state is a subset of the Cartesian product of the domains of its attributes • each domain contains the set of all possible values the attribute can take. • Example: attribute Cust-name is defined over the domain of character strings of maximum length 25 • dom(Cust-name) is varchar(25) • The role these strings play in the CUSTOMER relation is that of the name of a customer. Chapters 5-8

  11. Formal Definitions - Summary • Formally, • Given R(A1, A2, .........., An) • r(R) dom (A1) X dom (A2) X ....X dom(An) • R(A1, A2, …, An) is the schema of the relation • R is the name of the relation • A1, A2, …, An are the attributes of the relation • r(R): a specific state (or "value" or “population”) of relation R – this is a set of tuples (rows) • r(R) = {t1, t2, …, tn} where each ti is an n-tuple • ti = <v1, v2, …, vn> where each vjelement-ofdom(Aj) Chapters 5-8

  12. Formal Definitions - Example • Let R(A1, A2) be a relation schema: • Let dom(A1) = {0,1} • Let dom(A2) = {a,b,c} • Then: dom(A1) X dom(A2) is all possible combinations: {<0,a> , <0,b> , <0,c>, <1,a>, <1,b>, <1,c> } • The relation state r(R)  dom(A1) X dom(A2) • For example: r(R) could be {<0,a> , <0,b> , <1,c> } • this is one possible state (or “population” or “extension”) r of the relation R, defined over A1 and A2. • It has three 2-tuples: <0,a> , <0,b> , <1,c> Chapters 5-8

  13. Definition Summary Chapters 5-8

  14. Super key: an attribute or a set of attributes that identifies an entity uniquely (may not be minimal set) SSN SSN, NAME SSN, NAME, MAJOR Chapters 5-8

  15. Candidate key: a super key such that no proper subset of its attributes is itself a super key. So candidate keys must have a minimal identifier. STUID SSNPrimary key: the candidate key that is chosenOR the candidate key that is used to identify tuples in a relation -- unique, must existAlternate key:A candidate key in a relation that is not selectede.g. if primary key is SSN then STUID is a alternate key Chapters 5-8

  16. Concatenated (composite) key: A primary key that is comprised of two or more attributes or data items G RADE_REPORT(STUID, COURSE#, GRADE)  Chapters 5-8

  17. Foreign key: A non-key attribute in one relation that appears as the primary key (or part of the key) in another relationEMPLOYEE(SSN, FNAME, MINIT, DNO) DEPARTMENT(DNUMBER, DNAME, MANAGER) Chapters 5-8

  18. Secondary key: a field that can have duplicate values, and that can be used as search path by the users Chapters 5-8

  19. Chapters 5-8

  20. Referential Integrity Constraints for COMPANY database Chapters 5-8

  21. Chapters 5-8

  22. Relational Algebra Overview • Relational algebra is the basic set of operations for the relational model • These operations enable a user to specify basic retrieval requests (or queries) • The result of an operation is a new relation, which may have been formed from one or more input relations • This property makes the algebra “closed” (all objects in relational algebra are relations) Chapters 5-8

  23. Relational Algebra Overview (continued) • The algebra operations thus produce new relations • These can be further manipulated using operations of the same algebra • A sequence of relational algebra operations forms a relational algebra expression • The result of a relational algebra expression is also a relation that represents the result of a database query (or retrieval request) Chapters 5-8

  24. Relational Algebra Overview • Relational Algebra consists of several groups of operations • Unary Relational Operations • SELECT (symbol:  (sigma)) • PROJECT (symbol:  (pi)) • Relational Algebra Operations From Set Theory • UNION (  ), INTERSECTION ( ), DIFFERENCE (or MINUS, – ) • CARTESIAN PRODUCT ( x ) • Binary Relational Operations • JOIN (several variations of JOIN exist) • DIVISION • Additional Relational Operations • OUTER JOINS, OUTER UNION • AGGREGATE FUNCTIONS (These compute summary of information: for example, SUM, COUNT, AVG, MIN, MAX) Chapters 5-8

  25. Unary Relational Operations: SELECT • The SELECT operation (denoted by  (sigma)) is used to select a subset of the tuples from a relation based on a selection condition. • The selection condition acts as a filter • Keeps only those tuples that satisfy the qualifying condition • Tuples satisfying the condition are selected whereas the other tuples are discarded (filtered out) • Examples: • Select the EMPLOYEE tuples whose department number is 4: DNO = 4 (EMPLOYEE) • Select the employee tuples whose salary is greater than $30,000: SALARY > 30,000 (EMPLOYEE) Chapters 5-8

  26. Unary Relational Operations: SELECT • In general, the select operation is denoted by <selection condition>(R) where • the symbol  (sigma) is used to denote the select operator • the selection condition is a Boolean (conditional) expression specified on the attributes of relation R • tuples that make the condition true are selected • appear in the result of the operation • tuples that make the condition false are filtered out • discarded from the result of the operation Chapters 5-8

  27. Unary Relational Operations: SELECT (contd.) • SELECT Operation Properties • The SELECT operation <selection condition>(R) produces a relation S that has the same schema (same attributes) as R • SELECT  is commutative: • <condition1>(< condition2> (R)) = <condition2> (< condition1> (R)) • Because of commutativity property, a cascade (sequence) of SELECT operations may be applied in any order: • <cond1>(<cond2> (<cond3> (R)) = <cond2> (<cond3> (<cond1> ( R))) • A cascade of SELECT operations may be replaced by a single selection with a conjunction of all the conditions: • <cond1>(< cond2> (<cond3>(R)) =  <cond1> AND < cond2> AND < cond3>(R))) • The number of tuples in the result of a SELECT is less than (or equal to) the number of tuples in the input relation R Chapters 5-8

  28. Select Works on single table and takes rows that meet a specified condition, copy them into a new table • (Table name) Condition(s) SQL (Structured Query language) SELECT * FROM (table name) WHERE condition 1 AND condition 2 AND condition 3… Chapters 5-8

  29. Condition(s) Chapters 5-8 Table

  30. Find employees who work for department number 5.  employee DNO = 5 SQL: SELECT * FROM employee WHERE dno = 5; Chapters 5-8

  31. Chapters 5-8

  32. Query tree  DNO=5 Chapters 5-8 Employee

  33. s(DNO=4 ANDSALARY>25000) OR (DNO=5 ANDSALARY>30000)(EMPLOYEE) • --------------------------------------------------------- • s<cond1>(s<cond2>(R)) = s<cond2>(s<cond1>(R)) • s<cond1>(s<cond2>(. . .(s<condn> (R)) . . .)) = s<cond1> AND <cond2> AND . . . AND <condn>(R) Chapters 5-8

  34. Project Operates on a single table, produces a vertical subset of the table, extract the values of specified columns eliminate duplicate rows place the value in a new table  (table name) column1, column2, column3, … Chapters 5-8

  35. SQL: SELECT column1, column2, column3, … FROM (table name) Chapters 5-8

  36. column(s) Chapters 5-8 Table

  37. E.g. Show the names of all employees employee fname, minit, lname SELECT fname, minit, lname FROM employee; Chapters 5-8

  38. Chapters 5-8

  39. fname,minit,lname Chapters 5-8 Employee

  40. Select & project Show the names of all employees who work for department number 5  ( employee) fname, minit, lname dno = 5 SELECT fname, minit, lname FROM employee WHERE dno = 5; Chapters 5-8

  41. Chapters 5-8

  42. fname,minit,lname  DNO = 5 Chapters 5-8 Employee

  43. Examples of applying SELECT and PROJECT operations Chapters 5-8

  44. PRODUCT(or Cartesian product) R1 x R2 R1 X R2 is a table where width is the width of R1 plus the width of R2 and whose columns are the columns of R1 followed by the columns of R2 If R1 has X rows and M columns R2 has Y rows and N columns R1 X R2 = X * Y rows and M + N columns Chapters 5-8

  45. Cartesian Product Chapters 5-8

  46. Query Tree for Cartesian Product X Chapters 5-8 Table1 Table2

  47. Example of Query Tree Chapters 5-8

  48. Theta Join The result of performing a SELECT operation using a comparison operator theta (=,<, <=, >, <=, <>) on the product Chapters 5-8

  49. Theta Join (>) Chapters 5-8

  50. Theta Join (ID>STUID) Chapters 5-8

More Related