1 / 34

Introduction to Data Management CSE 344

Introduction to Data Management CSE 344. Lecture 9: SQL Wrap-up and RDBMs Architecture. Announcements. Webquiz due on Monday, 1/28 Homework 3 is posted: due on Wednesday, 2/6. Review: Indexes. V(M, N) ;. Suppose we have queries like these:. SELECT * FROM V WHERE M=?. SELECT *

dyami
Télécharger la présentation

Introduction to Data Management CSE 344

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Introduction to Data ManagementCSE 344 Lecture 9: SQL Wrap-upand RDBMs Architecture CSE 344 - Winter 2013

  2. Announcements • Webquiz due on Monday, 1/28 • Homework 3 is posted: due on Wednesday, 2/6 CSE 344 - Winter 2013

  3. Review: Indexes V(M, N); Suppose we have queries like these: SELECT* FROM V WHEREM=? SELECT* FROM V WHEREN=? SELECT* FROM V WHEREM=? and N=? Which of these indexes are helpful for each query? Index on V(M) Index on V(N) Index on V(M,N)

  4. Review: Indexes Suppose V(M,N) contains 10,000 records: (1,1), (1,2), …, (100, 100) V(M, N); SELECT* FROM V WHEREM=3 SELECT* FROM V WHEREN=5 SELECT* FROM V WHEREM=3 and N=5 List of pointers to records (3,1), (3,2), …, (3,100) B+Tree Index on V(M)

  5. Review: Indexes Suppose V(M,N) contains 10,000 records: (1,1), (1,2), …, (100, 100) V(M, N); SELECT* FROM V WHEREM=3 SELECT* FROM V WHEREN=5 SELECT* FROM V WHEREM=3 and N=5 How do we computethis query? B+Tree B+Tree Index on V(M) Index on V(N)

  6. Review: Indexes Suppose V(M,N) contains 10,000 records: (1,1), (1,2), …, (100, 100) V(M, N); SELECT* FROM V WHEREM=3 SELECT* FROM V WHEREN=5 SELECT* FROM V WHEREM=3 and N=5 B+Tree Single pointer to the record (3,5) Index on V(M,N)

  7. Review: Indexes Discussion • Why not create all three indexes V(M), V(N), V(M,N)? • Suppose M is the primary key in V(M, N):V = {(1,1), (2,2), …, (10000, 10000)}How do the two indexes V(M) and V(M,N) compare? Consider their utility for evaluating the predicate M=5 CSE 344 - Winter 2013

  8. Product (pname, price, cid) Company(cid, cname, city) Review: Subqueries in WHERE Find all companies s.t.all their products have price < 200 same as: Find all companies that make only products with price < 200 Universal quantifiers Universal quantifiers are hard !  CSE 344 - Winter 2013

  9. Product (pname, price, cid) Company(cid, cname, city) Review: Subqueries in WHERE Find all companies s.t.all their products have price < 200 1. Find the other companies: i.e. s.t. some product  200 SELECTDISTINCTC.cname FROMCompany C WHEREC.cidIN (SELECTP.cidFROM Product PWHEREP.price >= 200) 2. Find all companies s.t. all their products have price < 200 SELECTDISTINCTC.cname FROMCompany C WHEREC.cidNOTIN (SELECTP.cidFROM Product PWHEREP.price >= 200)

  10. Product (pname, price, cid) Company(cid, cname, city) Review: Subqueries in WHERE Find all companies s.t.all their products have price < 200 Universal quantifiers Using EXISTS: SELECTDISTINCTC.cname FROMCompany C WHERENOT EXISTS (SELECT *FROMProduct PWHEREP.cid = C.cidandP.price >= 200) CSE 344 - Winter 2013

  11. Product (pname, price, cid) Company(cid, cname, city) Review: Subqueries in WHERE Find all companies s.t.all their products have price < 200 Universal quantifiers Using ALL: SELECTDISTINCTC.cname FROMCompany C WHERE 200 > ALL (SELECT priceFROMProduct PWHEREP.cid = C.cid) CSE 344 - Winter 2013

  12. Question for Database Fansand their Friends • Can we unnest the universal quantifier query ? CSE 344 - Winter 2013

  13. Product (pname, price, cid) Company(cid, cname, city) Monotone Queries • Definition A query Q is monotone if: • Whenever we add tuples to one or more input tables, the answer to the query will not lose any of of the tuples Product Company Q Is the mysteryquery monotone? Product Company Q

  14. Monotone Queries • Theorem: A SELECT-FROM-WHERE query (without subqueries or aggregates) is monotone. • Proof. We use the nested loop semantics: if we insert a tuple in a relation Ri, this will not remove any tuples from the answer forx1in R1do for x2in R2do ….. forxninRndo if Conditions output (a1,…,ak) SELECT a1, a2, …, ak FROM R1 AS x1, R2 AS x2, …, Rn AS xn WHERE Conditions CSE 344 - Winter 2013

  15. Product (pname, price, cid) Company(cid, cname, city) Monotone Queries • The query: is not monotone • Consequence: we cannot write it as a SELECT-FROM-WHERE query without nested subqueries Find all companies s.t.all their products have price < 200

  16. Queries that must be nested • Queries with universal quantifiers or with negation • Queries that have complex aggregates CSE 344 - Winter 2013

  17. Practice these queries in SQL Likes(drinker, beer) Frequents(drinker, bar) Serves(bar, beer) Ullman’s drinkers-bars-beers example Find drinkers that frequent some bar that serves some beer they like. x: y. z. Frequents(x, y)Serves(y,z)Likes(x,z) Find drinkers that frequent only bars that serves some beer they like. x: y. Frequents(x, y) (z. Serves(y,z)Likes(x,z)) Find drinkers that frequent some bar that serves only beers they like. x: y. Frequents(x, y)z.(Serves(y,z) Likes(x,z)) Find drinkers that frequent only bars that serves only beer they like. x: y. Frequents(x, y)z.(Serves(y,z) Likes(x,z))

  18. Purchase(pid, product, quantity, price) GROUP BY v.s. Nested Queries SELECT product, Sum(quantity) ASTotalSales FROM Purchase WHERE price > 1 GROUP BY product SELECTDISTINCTx.product, (SELECTSum(y.quantity)FROM Purchase yWHEREx.product = y.product AND price > 1)ASTotalSales FROM Purchase x WHERE price > 1 Why twice ?

  19. Product (pname, price, cid) Company(cid, cname, city) Unnesting Aggregates Find the number of companies in each city SELECTDISTINCTcity, (SELECT count(*) FROM Company Y WHEREX.city = Y.city) FROM Company X Equivalent queries SELECT city, count(*) FROM CompanyGROUP BY city Note: no need for DISTINCT (DISTINCTis the same as GROUPBY) CSE 344 - Winter 2013

  20. Product (pname, price, cid) Company(cid, cname, city) Unnesting Aggregates What if there are no products for a city? Find the number of products made in each city SELECTDISTINCTX.city, (SELECT count(*) FROM Product Y, Company ZWHEREZ.cid=Y.cid ANDZ.city = X.city) FROM Company X They are NOTequivalent !(WHY?) SELECTX.city, count(*) FROMCompany X, Product Y WHEREX.cid=Y.cidGROUP BYX.city

  21. More Unnesting Author(login,name) Wrote(login,url) • Find authors who wrote ³ 10 documents: • Attempt 1: with nested queries This isSQL bya novice SELECTDISTINCTAuthor.name FROM Author WHERE(SELECTcount(Wrote.url)FROMWroteWHEREAuthor.login=Wrote.login) > 10 CSE 344 - Winter 2013

  22. More Unnesting • Find all authors who wrote at least 10 documents: • Attempt 2: SQL style (with GROUP BY) SELECTAuthor.name FROM Author, Wrote WHEREAuthor.login=Wrote.login GROUP BYAuthor.name HAVINGcount(wrote.url) > 10 This isSQL byan expert CSE 344 - Winter 2013

  23. Product (pname, price, cid) Company(cid, cname, city) Finding Witnesses For each city, find the most expensive product made in that city CSE 344 - Winter 2013

  24. Product (pname, price, cid) Company(cid, cname, city) Finding Witnesses For each city, find the most expensive product made in that city Finding the maximum price is easy… SELECTx.city, max(y.price) FROMCompany x, Product y WHEREx.cid = y.cid GROUPBYx.city; But we need the witnesses, i.e. the products with max price CSE 344 - Winter 2013

  25. Product (pname, price, cid) Company(cid, cname, city) Finding Witnesses To find the witnesses, compute the maximum pricein a subquery SELECTDISTINCTu.city, v.pname, v.price FROM Company u, Product v, (SELECTx.city, max(y.price) asmaxprice FROM Company x, Product y WHEREx.cid = y.cid GROUPBYx.city) w WHEREu.cid = v.cid andu.city = w.city andv.price=w.maxprice; CSE 344 - Winter 2013

  26. Product (pname, price, cid) Company(cid, cname, city) Finding Witnesses There is a more concise solution here: SELECTu.city, v.pname, v.price FROM Company u, Product v, Company x, Product y WHEREu.cid = v.cidandu.city = x.cityandx.cid = y.cid GROUPBYu.city, v.pname, v.price HAVINGv.price = max(y.price); CSE 344 - Winter 2013

  27. Product (pname, price, cid) Company(cid, cname, city) Finding Witnesses And another one: SELECTu.city, v.pname, v.price FROM Company u, Product v WHEREu.cid = v.cid andv.price >= ALL (SELECTy.priceFROMCompany x, Product y WHEREu.city=x.cityandx.cid=y.cid); CSE 344 - Winter 2013

  28. Where We Are • Motivation for using a DBMS for managing data • SQL, SQL, SQL • Declaring the schema for our data (CREATE TABLE) • Inserting data one row at a time or in bulk (INSERT/.import) • Modifying the schema and updating the data (ALTER/UPDATE) • Querying the data (SELECT) • Tuning queries (CREATE INDEX) • Next step: More knowledge of how DBMSs work • Client-server architecture • Relational algebra and query execution CSE 344 - Winter 2013

  29. Data Management with SQLite Desktop • So far, we have been managing data with SQLite as follows: • One data file • One user • One DBMS application • But only a limited number of scenarios work with such model User DBMS Application (SQLite) Disk File Data file CSE 344 - Winter 2013

  30. Connection (JDBC, ODBC) Client-Server Architecture Supports many apps and many users simultaneously Client Applications Server Machine • One server running the database • Many clients, connecting via the ODBC or JDBC (Java Database Connectivity) protocol DBMS Server Process (SQL Server) Data files File1 File2 … DISK

  31. Client-Server Architecture • One serverthat runs the DBMS (or RDBMS): • Your own desktop, or • Some beefy system, or • A cloud service (SQL Azure) • Many clientsrun apps and connect to DBMS • Microsoft’s Management Studio (for SQL Server), or • psql(for postgres) • Some Java program (HW5) or some C++ program • Clients “talk” to server using JDBC/ODBC protocol CSE 344 - Winter 2013

  32. DBMS Deployment: 3 Tiers Great for web-based applications Connection (e.g., JDBC) HTTP/SSL DB Server Web Server & App Server Data files Browser

  33. DBMS Deployment: Cloud Great for web-based applications too Data Files HTTP/SSL Web & App Server DB Server Developers Users CSE 344 - Winter 2013

  34. Using a DBMS Server • Client application establishes connection to server • Client must authenticate self • Client submits SQL commands to server • Server executes commands and returns results DBMS CSE 344 - Winter 2013

More Related