1 / 31

Today’s Class

IS C332: Database Systems and Applications. Today’s Class. Relational Model Relational Algebra SQL Exercises. Natural Join r s. Natural Join Example. B. D. E. A. B. C. D. Relations r, s:. 1 3 1 2 3. a a a b b.     .     . 1 2 4 1 2.   

rocio
Télécharger la présentation

Today’s Class

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. IS C332: Database Systems and Applications Today’s Class Relational Model Relational Algebra SQL Exercises

  2. Natural Join • r s Natural Join Example B D E A B C D • Relations r, s: 1 3 1 2 3 a a a b b           1 2 4 1 2      a a b a b s r A B C D E      1 1 1 1 2      a a a a b     

  3. Joining two relations – Natural Join • Let r and s be relations on schemas R and S respectively. Then, the “natural join” of relations R and S is a relation on schema R S obtained as follows: • Consider each pair of tuples tr from r and ts from s. • If tr and ts have the same value on each of the attributes in RS, add a tuple t to the result, where • t has the same value as tr on r • t has the same value as ts on s

  4. Natural Join • Example: R = (A, B, C, D) S = (E, B, D) • Result schema = (A, B, C, D, E) • rs is defined as:r.A, r.B, r.C, r.D, s.E (r.B = s.Br.D = s.D (r x s))

  5. Preliminaries • A query is applied to relation instances, and the result of a query is also a relation instance. • Schemas of inputrelations for a query are fixed. • The schema for the resultof a given query is also fixed! - determined by definition of query language constructs. • Positional vs. named-field notation: • Positional notation easier for formal definitions, named-field notation more readable. • Both used in SQL

  6. Relational Algebra • Basic operations: • Selection ( ) Selects a subset of rows from relation. • Projection ( ) Deletes unwanted columns from relation. • Cross-product( ) Allows us to combine two relations. • Set-difference ( ) Tuples in reln. 1, but not in reln. 2. • Union( ) Tuples in reln. 1 and in reln. 2. • renaming (  ): Not essential, but (very!) useful. • Additional operations: • Intersection,join, division, • The operators take one or two relations as inputs and produce a new relation as a result. • Since each operation returns a relation, operations can becomposed: algebra is “closed”.

  7. Formal Definition • A basic expression in the relational algebra consists of either one of the following: • A relation in the database • A constant relation • Let E1 and E2 be relational-algebra expressions; the following are all relational-algebra expressions: • E1 E2 • E1–E2 • E1 x E2 • p (E1), P is a predicate on attributes in E1 • s(E1), S is a list consisting of some of the attributes in E1 •  x(E1), x is the new name for the result of E1

  8. Composition of Operations A B C D E • Results of relational operations are relations themselves. • Compositions of operations form a relational-algebra expression. • Can build expressions using multiple operations • Example: A=C(r x s) • r x s • A=C(r x s)         1 1 1 1 2 2 2 2         10 10 20 10 10 10 20 10 a a b b a a b b A B C D E       10 10 20 a a b 1 2 2

  9. Figure 2.1 Relational database for Practice Exercise 2.1. • employee (person name, street, city) • works (person name, company name, salary) • company (company name, city)

  10. Banking Example branch (branch_name, branch_city, assets) customer (customer_name, customer_street, customer_city) account (account_number, branch_name, balance) loan (loan_number, branch_name, amount) depositor (customer_name, account_number) borrower(customer_name, loan_number)

  11. Select Operation Select operation returns a relation that satisfies the given predicate from the original relation. • Notation: p(r) • p is called the selection predicate • Defined as:p(r) = {t | t  rand p(t)} Where p is a formula in propositional calculus consisting of termsconnected by :  (and),  (or),  (not)Each term is one of: <attribute> op <attribute> or <constant> where op is one of: =, , >, . <.  • Example of selection:branch_name=“Perryridge”(account)

  12. Project Operation Returns a relation with only the specified attributes. • Notation: where A1, A2 are attribute names and r is a relation name. • The result is defined as the relation of k columns obtained by erasing the columns that are not listed • Duplicate rows removed from result, since relations are sets • Example: To eliminate the branch_name attribute of accountaccount_number, balance (account)

  13. Union Operation Results in a relation with all of the tuples that appear in either or both of the argument relations. • Notation: r s • Defined as: r s = {t | t  r or t  s} • For r s to be valid. 1. r,s must have the same arity (same number of attributes) 2. The attribute domains must be compatible (example: 2nd column of r deals with the same type of values as does the 2nd column of s) • Example: to find all customers with either an account or a loancustomer_name (depositor)  customer_name (borrower)

  14. Set Difference Operation R – S produces all tuples in R but not in S • Notation r – s • Defined as: r – s = {t | t rand t  s} • Set differences must be taken between compatible relations. • r and s must have the same arity • attribute domains of r and s must be compatible

  15. Cartesian-Product Operation Combines any two relations Output has the attributes of both relations • Notation r x s • Defined as: r x s = {t q | t  r and q  s} • Assume that attributes of r(R) and s(S) are disjoint. (That is, R  S = ). • If attributes of r and s are not disjoint, then renaming must be used. Repeated attribute names are preceded by the relation they originated from. Example: r= borrower × loan (borrower.customer-name, borrower.loan-number, loan.loan-number, loan.branch-name, loan.amount)

  16. Rename Operation • Allows us to name, and therefore to refer to, the results of relational-algebra expressions. • Allows us to refer to a relation by more than one name. • Example: x (E) returns the expression E under the name X • If a relational-algebra expression E has arity n, then returns the result of expression E under the name X, and with the attributes renamed to A1 , A2 , …., An . Useful for naming the unnamed relations returned from other operations.

  17. Set-Intersection Operation Results in a relation that contains only the tuples that appear in both relations. • Notation: r s • Defined as: • rs = { t | trandts } • Assume: • r, s have the same arity • attributes of r and s are compatible • Note: rs = r – (r – s)

  18. SQL Overview • CREATE TABLE <name> ( <field> <domain>, … ) • INSERT INTO <name> (<field names>) VALUES (<field values>) • DELETE FROM <name> WHERE <condition> • UPDATE <name> SET <field name> = <value> WHERE <condition> • SELECT <fields> FROM <name> WHERE <condition>

  19. Creating Relations in SQL CREATE TABLE Students (sidCHAR(20), name CHAR(30), login CHAR(20), age INTEGER, gpaREAL) • Creates the Students relation. Observe that the type (domain)of each field is specified, and enforced by the DBMS whenever tuples are added or modified. • As another example, the Enrolled table holds information about courses that students take. CREATE TABLE Enrolled (sid CHAR(20), cid CHAR(20), grade CHAR(2))

  20. Adding and Deleting Tuples INSERT INTO Students (sid, name, login, age, gpa) VALUES (‘53688’, ‘Smith’, ‘smith@ee’, 18, 3.2) • Can insert a single tuple using: • Can delete all tuples satisfying some condition (e.g., name = Smith): DELETE FROM Students S WHERE S.name = ‘Smith’ Powerful variants of these commands are available; more later!

  21. CREATE TABLE Enrolled (sid CHAR(20) cid CHAR(20), grade CHAR(2), PRIMARY KEY (sid), UNIQUE (cid, grade)) CREATE TABLE Enrolled (sid CHAR(20) cid CHAR(20), grade CHAR(2), PRIMARY KEY (sid,cid)) vs. Primary and Candidate Keys in SQL • Possibly many candidate keys(specified using UNIQUE), one of which is chosen as the primary key. • Keys must be used carefully! • “For a given student and course, there is a single grade.” “Students can take only one course, and no two students in a course receive the same grade.”

  22. Relational Query Languages • A major strength of the relational model: supports simple, powerful querying of data. • Queries can be written intuitively, and the DBMS is responsible for efficient evaluation. • The key: precise semantics for relational queries. • Allows the optimizer to extensively re-order operations, and still ensure that the answer does not change.

  23. The SQL Query Language • The most widely used relational query language. • Current std is SQL-2011 • To find all 18 year old students, we can write: SELECT * FROM Students S WHERE S.age=18 • To find just names and logins, replace the first line: SELECT S.name, S.login

  24. Semantics of a Query • A conceptualevaluation method for the previous query: 1. do FROM clause: compute cross-product of Students and Enrolled 2. do WHERE clause: Check conditions, discard tuples that fail 3. do SELECT clause: Delete unwanted fields • Remember, this is conceptual. Actual evaluation will be much more efficient, but must produce the same answers.

  25. Foreign Keys in SQL • Only students listed in the Students relation should be allowed to enroll for courses. CREATE TABLE Enrolled (sid CHAR(20), cid CHAR(20), grade CHAR(2), PRIMARY KEY(sid,cid), FOREIGN KEY(sid) REFERENCESStudents ) Enrolled Students

  26. Given the following instance of Enrolled we get: Querying Multiple Relations • What does the following query compute? SELECT S.name, E.cid FROM Students S, Enrolled E WHERE S.sid=E.sid AND E.grade='A'

  27. Cross-product of Students and Enrolled Instances

  28. Enforcing Referential Integrity • Consider Students and Enrolled; sid in Enrolled is a foreign key that references Students. • What should be done if an Enrolled tuple with a non-existent student id is inserted? (Reject it!) • What should be done if a Students tuple is deleted? • Also delete all Enrolled tuples that refer to it. • Disallow deletion of a Students tuple that is referred to. • Set sid in Enrolled tuples that refer to it to a default sid. • (In SQL, also: Set sid in Enrolled tuples that refer to it to a special value null, denoting `unknown’ or `inapplicable’.) • Similar if primary key of Students tuple is updated.

  29. Referential Integrity in SQL • SQL/92 and SQL:1999 support all 4 options on deletes and updates. • Default is NO ACTION(delete/update is rejected) • CASCADE (also delete all tuples that refer to deleted tuple) • SET NULL / SETDEFAULT (sets foreign key value of referencing tuple) CREATE TABLE Enrolled (sid CHAR(20), cid CHAR(20), grade CHAR(2), PRIMARY KEY(sid,cid), FOREIGN KEY(sid) REFERENCESStudents ON DELETE CASCADE ON UPDATE SET DEFAULT)

  30. Integrity Constraints (ICs) • IC: condition that must be true for any instance of the database; e.g., domain constraints. • ICs are specified when schema is defined. • ICs are checked when relations are modified. • A legalinstance of a relation is one that satisfies all specified ICs. • DBMS should not allow illegal instances. • If the DBMS checks ICs, stored data is more faithful to real-world meaning. • Avoids data entry errors, too!

  31. Where do ICs Come From? • ICs are based upon the semantics of the real-world that is being described in the database relations. • We can check a database instance to see if an IC is violated, but we can NEVER infer that an IC is true by looking at an instance. • An IC is a statement about all possible instances! • From example, we know name is not a key, but the assertion that sid is a key is given to us. • Key and foreign key ICs are the most common; more general ICs supported too.

More Related