1 / 108

CMSC424: Database Design

CMSC424: Database Design. Lecture 6. SQL - Introduction. Standard DML/DDL for relational DB’s. DML = Data Manipulation Language (queries, updates) DDL = Data Definition Language (create tables, indexes, …). Also includes. View definition Security (Authorization) Integrity constraints

nola
Télécharger la présentation

CMSC424: Database Design

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. CMSC424: Database Design Lecture 6 CMSC424, Spring 2005

  2. SQL - Introduction • Standard DML/DDL for relational DB’s • DML = Data Manipulation Language (queries, updates) • DDL = Data Definition Language (create tables, indexes, …) • Also includes • View definition • Security (Authorization) • Integrity constraints • Transactions CMSC424, Spring 2005

  3. SQL: Basic Structure • SELECT A1, ….., An • FROM r1, ….., rm • WHERE P Equivalent to: • A1,A2,…,An (σP (r1 …  rn)) CMSC424, Spring 2005

  4. A Simple SELECT-FROM-WHERE Query • Similar to SELECT bname FROM loan WHERE amt > 1000  bname( amt > 1000(loan) ) • But not quite In general, SQL will not remove duplicates unless asked to Duplicates are retained (i.e., result not a set) CMSC424, Spring 2005

  5. A Simple SELECT-FROM-WHERE Query • Similar to SELECT DISTINCT bname FROM loan WHERE amt > 1000  bname( amt > 1000(loan) ) • Result: CMSC424, Spring 2005

  6. Next • Formal Semantics of SQL • Bag or multiset semantics CMSC424, Spring 2005

  7. Formal Semantics of SQL • RA can only express SELECT DISTINCT queries • To express SQL, must extend RA to a bag algebra • Bags (aka: multisets) like sets, but can have duplicates e.g: {5, 3, 3} e.g: homes = • Next: will define RA*: a bag version of RA CMSC424, Spring 2005

  8. Formal Semantics of SQL: RA* • *p (r): preserves copies in r e.g: *city = Brighton (homes) = • *A1, …, An (r): no duplicate elimination e.g:  *cname (homes) = CMSC424, Spring 2005

  9. Formal Semantics of SQL: RA* • r* s : additive union • * = r s • r-* s: bag difference e.g: r -* s = s -* r = CMSC424, Spring 2005

  10. Formal Semantics of SQL: RA* • r* s: cartesian product = * CMSC424, Spring 2005

  11. Formal Semantics of SQL • Query: SELECT a1, ….., an FROM r1, ….., rm WHERE p • Semantics: *A1, …, An (*p (r1* … * rm) ) (1) • Query: SELECT DISTINCT a1, ….., an FROM r1, ….., rm WHERE p What is the only operator to change in (1)? • Semantics: A1, …, An (*p (r1* … * rm) ) (2) CMSC424, Spring 2005

  12. Next: More SQL CMSC424, Spring 2005

  13. More SQL: AS • Using AS in FROM clause • Introduces tuple variables e.g: SELECT DISTINCT T.bname FROM branch AS T, branch AS S WHERE T.assets > S.assets returns branch names of branches with non-minimal assets • Using AS in SELECT clause • Renames columns in result (p) e.g: SELECT bname, acct_no, balance * 1.05 AS newbal FROM account • returns: CMSC424, Spring 2005

  14. More SQL: INTO • Used to name query results () e.g: SELECT DISTINCT bname INTO branchnames FROM branch • Intuitively: BranchNames SELECT DISTINCT bname FROM branch CMSC424, Spring 2005

  15. More SQL: Order by • Example: List in alphabetical order the names of all customers with loans at the Perry branch SELECT DISTINCT cname FROM borrower AS b, loan AS l WHERE b.lno = l.lno AND bname = “Perry” ORDER BY cname Result = default: ascending order (asc) Can also write: ORDER BY cname DESC, or ORDER BY cname ASC Like SELECT DISTINCT, very expensive • requires external sort • cannot (usually) fit entire relation in memory. instead must sort in chunks. CMSC424, Spring 2005

  16. More SQL: Aggregate Operators • Aggregate operators: AVG (col): average of values in col SUM (col): sum of values in col COUNT (col): number of values in col MIN (col): minimun value in col MAX (col): maximun value in col • Examples: • Find the average acct balance @ Perry SELECT AVG (bal) FROM account WHERE bname = “Perry” • Find the number of tuples in customer SELECT COUNT (*) FROM customer • Find the number of unique depositors SELECT COUNT (DISTINCT cname) FROM customer CMSC424, Spring 2005

  17. More SQL: Aggregates & Group By • Usually, aggregates used with “GROUP BY” • Example: SELECT bname, COUNT (DISTINCT cname) FROM depositor AS d, account AS a WHERE d.acct_no = a.acct_no GROUP BY bname Extended relational grouping operator: G1, G2, …, GngF1( A1), F2( A2),…, Fn( An)(E) • Result: CMSC424, Spring 2005

  18. More SQL: Aggregates & Group By • Intuition behind “Group By” SELECT bname, COUNT (DISTINCT cname) FROM depositor AS d, account AS a WHERE d.acct_no = a.acct_no GROUP BY bname Step 1: “Group “ result of join Step 2: Aggregate on groups and project on result CMSC424, Spring 2005

  19. More SQL: GROUP BY (cont.) • Another Example: SELECT bname, SUM(assets) as total FROM branch GROUP BY bcity • Result? Multiple names to choose from • Above Query Not Allowed! CMSC424, Spring 2005

  20. More SQL: GROUP BY (cont.) • Another Example: SELECT bname, SUM(assets) as total FROM branch GROUP BY bcity • Above Query Not Allowed! Non-aggregated attributes in SELECT clause (e.g., bname) must also appear in GROUP BY clause SELECT A1, ..., Ak, Agg1(), ...., Aggi() FROM .......... WHERE ............ GROUP BY A1, ..., Ak CMSC424, Spring 2005

  21. More SQL: Having • WHERE :: FROM as HAVING :: GROUP BY • HAVING P: selects rows from result of GROUP BY • Optional (missing HAVING clause = HAVING TRUE) Example: Find names of branches and the average account balance for those branches having an account balance > $1200 SELECT bname, AVG(balance) AS avgFROM accountGROUP BY bnameHAVING avg > 1200 same result as SELECT bname, AVG(balance) AS avg INTO temp FROM account GROUP BY bname SELECT *FROM temp WHERE avg > 1200 + CMSC424, Spring 2005

  22. More SQL: Set/Bag Operations • Set Operations • UNION ≡ U • INTERSECT ≡ ∩ • EXCEPT ≡ - • Bag Operations • UNION ALL ≡ U* • INTERSECT ALL ≡ ∩* • EXCEPT ALL ≡ -* • Duplicate Counting: • Given m copies of t in r, n copies of t in s, how many copies of t in: r UNION ALL s? A: m + n r INTERSECT ALL s? A: min (m, n) r EXCEPT ALL s? A: max (0, m-n) CMSC424, Spring 2005

  23. More SQL: Set/Bag Operations • Example Queries: (SELECT cname FROM depositor) ? (SELECT cname FROM borrower) • ? ≡ UNION  returns names of customers with savings accts, loans or both • ? ≡ INTERSECT  returns names of customers with savings accts and loans • ? ≡ EXCEPT  returns names of customers with savings accts but not loans CMSC424, Spring 2005

  24. SQL: Summary Thus Far CMSC424, Spring 2005

  25. SQL: Summary Thus Far • A kitchen sink query: SELECT bcity, sum(balance) AS totalbalance INTO BranchAcctSummary FROM branch AS b, account AS a WHERE b.bname = a.bname AND assets >= 1M GROUP BY bcity HAVING totalbalance > 700 ORDER BY bcity DESC • Steps 1,2 : FROM, WHERE CMSC424, Spring 2005

  26. SQL: Summary Thus Far • Steps 3,4: GROUP BY, SELECT • Step 6: ORDER BY • Step 5: HAVING • Step 7: INTO • ... CMSC424, Spring 2005

  27. Next: NULLs CMSC424, Spring 2005

  28. More SQL: Nulls • The “dirty little secret” of SQL (major headache for query optimization) • Can be a value of any attribute e.g: branch = • What does this mean? We don’t know Waltham’s assets? Waltham has no assets? …. (Many possible interpretations) CMSC424, Spring 2005

  29. More SQL: Nulls • Arithmetic Operations with Null n + NULL = NULL (similarly for all arithmetic ops: +, -, *, /, mod, …) e.g: branch = SELECT bname, assets * 2 as a2 FROM branch = CMSC424, Spring 2005

  30. More SQL: Nulls • Boolean Operations with Null n < NULL = UNKNOWN (similarly for all boolean ops: >, <=, >=, <>, =, …) e.g: branch = = SELECT * FROM branch WHERE assets = NULL CMSC424, Spring 2005

  31. More SQL: Nulls • Boolean Operations with Null n < NULL = UNKNOWN (similarly for all boolean ops: >, <=, >=, <>, =, …) e.g: branch = = SELECT * FROM branch WHERE assets IS NULL CMSC424, Spring 2005

  32. More SQL: Unknown • Boolean Operations with Unknown n < NULL = UNKNOWN (similarly for all boolean ops: >, <=, >=, <>, =, …) FALSE OR UNKNOWN = UNKNOWN TRUE AND UNKNOWN = UNKNOWN Intuition: substitute each of TRUE, FALSE for unknown. If different answer results, results is unknown Can write: SELECT … FROM … WHERE booleanexp IS UNKNOWN UNKNOWN OR UNKNOWN = UNKNOWN UNKNOWN AND UNKNOWN = UNKNOWN NOT (UNKNOWN) = UNKNOWN CMSC424, Spring 2005

  33. More SQL: Nulls • Given branch = • Aggregate Operations SELECT SUM (assets) = FROM branch NULL is ignored Same for AVG (3.7M), MIN (0.4M), MAX (9M) But COUNT (assets) returns CMSC424, Spring 2005

  34. More SQL: Nulls • Given branch = SELECT SUM (assets) = FROM branch • Same as AVG, MIN, MAX • But COUNT (assets) returns CMSC424, Spring 2005

  35. More and More… • Nested Subqueries • Derived Relations • Views • Modification of the Database • Joined Relations • Data Definition Language • Embedded SQL, ODBC and JDBC • We will discuss some of these things in class… rest through the SQL assignment CMSC424, Spring 2005

  36. Nested Subqueries: Example • Find all customers who have both an account and a loan at the bank. select distinct customer-namefrom borrowerwhere customer-name in (select customer-namefromdepositor) Find all customers who have a loan at the bank but do not have an account at the bank select distinct customer-namefrom borrowerwhere customer-name not in (select customer-namefrom depositor) CMSC424, Spring 2005

  37. Example Query • Find all customers who have both an account and a loan at the Perryridge branch select distinctcustomer-namefrom borrower, loanwhere borrower.loan-number = loan.loan-number andbranch-name = “Perryridge” and(branch-name, customer-name) in (select branch-name, customer-namefrom depositor, accountwhere depositor.account-number = account.account-number) Note: Above query can be written in a much simpler manner. The formulation above is simply to illustrate SQL features. CMSC424, Spring 2005

  38. Set Comparison • Find all branches that have greater assets than some branch located in Brooklyn. select distinct T.branch-namefrom branch as T, branch as Swhere T.assets > S.assets andS.branch-city = ‘Brooklyn’ Same query using > some clause select branch-namefrom branchwhere assets > some (select assetsfrom branchwhere branch-city = ‘Brooklyn’) CMSC424, Spring 2005

  39. 0 5 6 Definition of Some Clause • F <comp> some r t r s.t. (F <comp> t)Where <comp> can be:  (5< some ) = true (read: 5 < some tuple in the relation) 0 ) = false (5< some 5 0 ) = true (5 = some 5 0 (5 some ) = true (since 0  5) 5 (= some)  in However, ( some)  not in CMSC424, Spring 2005

  40. 0 5 6 Definition of all Clause • F <comp> all r t r (F <comp> t) (5< all ) = false 6 ) = true (5< all 10 4 ) = false (5 = all 5 4 (5 all ) = true (since 5  4 and 5  6) 6 (all)  not in However, (= all)  in CMSC424, Spring 2005

  41. Example Query • Find the names of all branches that have greater assets than all branches located in Brooklyn. select branch-namefrom branchwhere assets > all (select assetsfrom branchwhere branch-city = ‘Brooklyn’) CMSC424, Spring 2005

  42. Test for Empty Relations • The exists construct returns the value true if the argument subquery is nonempty. • exists r  r  Ø • not exists r  r = Ø CMSC424, Spring 2005

  43. Example Query • Find all customers who have an account at all branches located in Brooklyn. select distinct S.customer-namefrom depositor as Swhere not exists ( (select branch-namefrom branchwhere branch-city = ‘Brooklyn’)except (select R.branch-namefrom depositor as T, account as Rwhere T.account-number = R.account-number andS.customer-name = T.customer-name)) Note that X – Y = Ø  X Y Note: Cannot write this query using= alland its variants CMSC424, Spring 2005

  44. Test for Absence of Duplicate Tuples • The unique construct tests whether a subquery has any duplicate tuples in its result. • Find all customers who have at most one account at the Perryridge branch. • select T.customer-name • from depositor as T • where unique ( • select R.customer-namefrom account, depositor as Rwhere T.customer-name = R.customer-name andR.account-number = account.account-number andaccount.branch-name = ‘Perryridge’) CMSC424, Spring 2005

  45. Example Query • Find all customers who have at least two accounts at the Perryridge branch. select distinct T.customer-name from depositor T where not unique ( select R.customer-name from account, depositor as R where T.customer-name = R.customer-name and R.account-number = account.account-number and account.branch-name = ‘Perryridge’) CMSC424, Spring 2005

  46. Views • Provide a mechanism to hide certain data from the view of certain users. To create a view we use the command: create view v as<query expression> where: <query expression> is any legal expression The view name is represented by v CMSC424, Spring 2005

  47. Example Queries • A view consisting of branches and their customers create view all-customer as(select branch-name, customer-namefrom depositor, accountwhere depositor.account-number = account.account-number) union(select branch-name, customer-namefrom borrower, loanwhere borrower.loan-number = loan.loan-number) • Find all customers of the Perryridge branch select customer-namefrom all-customerwhere branch-name = ‘Perryridge’ CMSC424, Spring 2005

  48. Derived Relations • Find the average account balance of those branches where the average account balance is greater than $1200. • select branch-name, avg-balancefrom (select branch-name, avg (balance)from accountgroup by branch-name)as result (branch-name, avg-balance)where avg-balance > 1200 • Note that we do not need to use the having clause, since we compute the temporary (view) relation result in the from clause, and the attributes of result can be used directly in the where clause. CMSC424, Spring 2005

  49. Modification of the Database – Deletion • Delete all account records at the Perryridge branch • delete from accountwhere branch-name = ‘Perryridge’ • Delete all accounts at every branch located in Needham city. • delete from accountwhere branch-name in (select branch-namefrom branchwhere branch-city = ‘Needham’)delete fromdepositorwhere account-number in (select account-numberfrom branch, accountwhere branch-city = ‘Needham’and branch.branch-name = account.branch-name) CMSC424, Spring 2005

  50. Example Query • Delete the record of all accounts with balances below the average at the bank. delete from accountwhere balance < (select avg (balance)from account) Problem: as we delete tuples from deposit, the average balance changes Solution used in SQL: • First, compute avg balance and find all tuples to delete • Next, delete all tuples found above (without recomputing avg or retesting the tuples) CMSC424, Spring 2005

More Related