1 / 79

Database Management Systems

Database Management Systems. Chapter 5 Advanced Queries. Tables. Organization. Harder Questions Subqueries Not In, LEFT JOIN UNION, Multiple JOIN columns, Recursive JOIN Other SQL Commands DDL: Data Definition Language DML: Data Manipulation Language OLAP Microsoft SQL Server Oracle

sukey
Télécharger la présentation

Database Management Systems

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Database Management Systems Chapter 5 Advanced Queries

  2. Tables

  3. Organization • Harder Questions • Subqueries • Not In, LEFT JOIN • UNION, Multiple JOIN columns, Recursive JOIN • Other SQL Commands • DDL: Data Definition Language • DML: Data Manipulation Language • OLAP • Microsoft SQL Server • Oracle • Microsoft Access Crosstab

  4. How many cats are “in-stock” on 10/1/01? Which cats sold for more than the average price? Which animals sold for more than the average price of animals in their category? Which animals have not been sold? Which customers (who bought something at least once) did not buy anything between 11/1/01 and 12/31/01? Which customers who bought Dogs also bought products for Cats (at any time)? Harder Questions

  5. Which cats sold for more than the average sale price of cats? Assume we know the average price is $170. Usually we need to compute it first. Sub-query for Calculation SELECT SaleAnimal.AnimalID, Animal.Category, SaleAnimal.SalePrice FROM Animal INNER JOIN SaleAnimal ON Animal.AnimalID = SaleAnimal.AnimalID WHERE ((Animal.Category="Cat") AND (SaleAnimal.SalePrice>170)); SELECT SaleAnimal.AnimalID, Animal.Category, SaleAnimal.SalePrice FROM Animal INNER JOIN SaleAnimal ON Animal.AnimalID = SaleAnimal.AnimalID WHERE ((Animal.Category="Cat") AND (SaleAnimal.SalePrice> ( SELECT AVG(SalePrice) FROM Animal INNER JOIN SaleAnimal ON Animal.AnimalID = SaleAnimal.AnimalID WHERE (Animal.Category=“Cat”) ) ) );

  6. List all customers (Name) who purchased one of the following items: 1, 2, 30, 32, 33. Could use Or, but harder to write. In ( . . .) matches any in the list. Query04_13 Query Sets (IN) SELECT Customer.LastName, Customer.FirstName, SaleItem.ItemID FROM (Customer INNER JOIN Sale ON Customer.CustomerID = Sale.CustomerID) INNER JOIN SaleItem ON Sale.SaleID = SaleItem.SaleID WHERE (SaleItem.ItemID In (1,2,30,32,33)) ORDER BY Customer.LastName, Customer.FirstName;

  7. List all customers who bought items for cats. Using IN with a Sub-query SELECT Customer.LastName, Customer.FirstName, SaleItem.ItemID FROM (Customer INNER JOIN Sale ON Customer.CustomerID = Sale.CustomerID) INNER JOIN SaleItem ON Sale.SaleID = SaleItem.SaleID WHERE (SaleItem.ItemID In (SELECT ItemID FROM Merchandise WHERE Category="Cat") );

  8. List all of the customers who bought something in March and who bought something in May. (Two tests on the same data!) LastName First Adkins Inga McCain Sam Grimes Earl Query04_14 SubQuery (IN: Look up a Set) SELECT Customer.LastName, Customer.FirstName FROM Customer INNER JOIN Sale ON Customer.CustomerID = Sale.CustomerID WHERE ((Month([SaleDate])=3)) And Customer.CustomerID In (SELECT CustomerID FROM Sale WHERE (Month([SaleDate])=5) );

  9. Any: value is compared to each item in the list. If it is True for any of the items, the statement is evaluated to True. All: value is compared to each item in the list. If it is True for every item in the list, the statement is evaluated to True (much more restrictive than any. Query04_15 SubQuery (ANY, ALL) SELECT DISTINCTROW Animal.AnimalID, Name, SalePrice, ListPrice FROM Animal INNER JOIN SaleAnimal ON Animal.AnimalID = SaleAnimal.AnimalID WHERE (((SalePrice) > Any (SELECT 0.80*ListPrice FROM Animal INNER JOIN SaleAnimal ON Animal.AnimalID = SaleAnimal.AnimalID WHERE Category = "Cat")) AND ((Category)="Cat"));

  10. Which animals have not been sold? Start with list of all animals. Subtract out list of those who were sold. AnimalID Name Category 12 Leisha Dog 19 Gene Dog 25 Vivian Dog 34 Rhonda Dog 88 Brandy Dog 181 Fish Query04_16 SubQuery: NOT IN (Subtract) SELECT Animal.AnimalID, Animal.Name, Animal.Category FROM Animal WHERE (Animal.AnimalID Not In (SELECT AnimalID From SaleAnimal));

  11. SubQuery: NOT IN (Data) Animal SaleAnimal ID Name Category Breed 2 Fish Angel 4 Gary Dog Dalmation 5 Fish Shark 6 Rosie Cat Oriental Shorthair 7 Eugene Cat Bombay 8 Miranda Dog Norfolk Terrier 9 Fish Guppy 10 Sherri Dog Siberian Huskie 11 Susan Dog Dalmation 12 Leisha Dog Rottweiler ID SaleID SalePrice 2 35 $10.80 4 80 $156.66 6 27 $173.99 7 25 $251.59 8 4 $183.38 10 18 $150.11 11 17 $148.47 Which animals have not been sold?

  12. Which animals have not been sold? LEFT JOIN includes all rows from left table (Animal) But only those from right table (SaleAnimal) that match a row in Animal. Rows in Animal without matching data in Sale Animal will have Null. AnimalID Name Category 12 Leisha Dog 19 Gene Dog 25 Vivian Dog 34 Rhonda Dog 88 Brandy Dog 181 Fish Query04_17 Left Outer Join SELECT Animal.AnimalID, Animal.Name, Animal.Category FROM Animal LEFT JOIN SaleAnimal ON Animal.AnimalID = SaleAnimal.AnimalID WHERE (SaleAnimal.SaleID Is Null);

  13. Left Outer Join (Example) ID SaleID SalePrice 2 35 $10.80 4 80 $156.66 Null Null Null 6 27 $173.99 7 25 $251.59 8 4 $183.38 Null Null Null 10 18 $150.11 11 17 $148.47 Null Null Null ID Name Category Breed 2 Fish Angel 4 Gary Dog Dalmation 5 Fish Shark 6 Rosie Cat Oriental Shorthair 7 Eugene Cat Bombay 8 Miranda Dog Norfolk Terrier 9 Fish Guppy 10 Sherri Dog Siberian Huskie 11 Susan Dog Dalmation 12 Leisha Dog Rottweiler

  14. Which animals have not been sold? Older Syntax for Left Join SELECT ALL FROM Animal, SaleAnimal WHERE Animal.AnimalID *= SaleAnimal.AnimalID And SaleAnimal.SaleID Is Null; Oracle syntax--note that the (+) symbol is on the reversed side. SELECT ALL FROM Animal, SaleAnimal WHERE Animal.AnimalID = SaleAnimal.AnimalID (+) And SaleAnimal.SaleID Is Null;

  15. Don’t know the average, so use a subquery to look it up. Watch parentheses. Query04_18 SubQuery for Computation SELECT SaleAnimal.AnimalID, Animal.Category, SaleAnimal.SalePrice FROM Animal INNER JOIN SaleAnimal ON Animal.AnimalID = SaleAnimal.AnimalID WHERE ((Animal.Category="Cat") AND (SaleAnimal.SalePrice> ( SELECT AVG(SalePrice) FROM Animal INNER JOIN SaleAnimal ON Animal.AnimalID = SaleAnimal.AnimalID WHERE (Animal.Category=“Cat”) ) ) );

  16. List the Animals that have sold for a price higher than the average for animals in that Category. The subquery needs to compute the average for a given category. Problem: Which category? Answer: the category that matches the category from the main part of the query. Problem: How do we refer to it? Both tables are called Animal. This query will not work yet. Correlated Subquery SELECT AnimalID, Name, Category, SalePrice FROM Animal INNER JOIN SaleAnimal ON Animal.AnimalID = SaleAnimal.AnimalID WHERE (SaleAnimal.SalePrice> (SELECT Avg(SaleAnimal.SalePrice) FROM Animal INNER JOIN SaleAnimal ON Animal.AnimalID = SaleAnimal.AnimalID WHERE (Animal.Category = Animal.Category) ) ) ORDER BY SaleAnimal.SalePrice DESC;

  17. List the Animals that have sold for a price higher than the average for animals in that Category. Match category in subquery with top level Rename tables (As) Correlated Subquery Recompute subquery for every row in top level--slow! Better to compute and save Subquery, then use in join. Query04_19 Correlated SubQuery (Avoid) SELECT A1.AnimalID, A1.Name, A1.Category, SaleAnimal.SalePrice FROM Animal As A1 INNER JOIN SaleAnimal ON A1.AnimalID = SaleAnimal.AnimalID WHERE (SaleAnimal.SalePrice> (SELECT Avg(SaleAnimal.SalePrice) FROM Animal As A2 INNER JOIN SaleAnimal ON A2.AnimalID = SaleAnimal.AnimalID WHERE (A2.Category = A1.Category) ) ) ORDER BY SaleAnimal.SalePrice DESC;

  18. Assume small query 100,000 rows 5 categories of 20,000 rows 100,000 * 20,000 = 1 billion rows to read! Correlated Subquery Problem Animal + SaleAnimal Category SalePrice Compute Avg: $37.78 Fish $10.80 Dog $156.66 Fish $19.80 Cat $173.99 Cat $251.59 Dog $183.38 Fish $1.80 Dog $150.11 Dog $148.47 Compute Avg: $174.20 Compute Avg: $37.78 Compute Avg: $169.73 Compute Avg: $169.73 Recompute average for every row in the main query!

  19. Compute the averages once and save query JOIN saved query to main query Two passes through table: 1 billion / 200,000 => 10,000 More Efficient Solution: 2 queries Animal + SaleAnimal Saved Query Category SalePrice Category AvgOfSalePrice Fish $10.80 Dog $156.66 Fish $19.80 Cat $173.99 Cat $251.59 Dog $183.38 Fish $1.80 Dog $150.11 Dog $148.47 Bird $176.57 Cat $169.73 Dog $174.20 Fish $37.78 Mammal $80.72 Reptile $181.83 Spider $118.16 JOIN Animal.Category = Query1.Category

  20. Offices in Los Angeles and New York. Each has an Employee table (East and West). Need to search data from both tables. Columns in the two SELECT lines must match. UNION Operator SELECT EID, Name, Phone, Salary, ‘East’ AS Office FROM EmployeeEast UNION SELECT EID, Name, Phone, Salary, ‘West’ AS Office FROM EmployeeWest EID Name Phone Salary Office 352 Jones 3352 45,000 East 876 Inez 8736 47,000 East 372 Stoiko 7632 38,000 East 890 Smythe 9803 62,000 West 361 Kim 7736 73,000 West

  21. UNION, INTERSECT, EXCEPT List the name of any employee who has worked for both the East and West regions. A B C T1 T2 SELECT EID, Name FROM EmployeeEast INTERSECT SELECT EID, Name FROM EmployeeWest

  22. Sometimes need to JOIN tables on more than one column. PetStore: Category and Breed. Multiple JOIN Columns Animal AnimalID Name Category Breed DateBorn Gender . . . Breed Category Breed SELECT * FROM Breed INNER JOIN Animal ON Breed.Category = Animal.Category AND Breed.Breed = Animal.Breed

  23. Need to connect a table to itself. Common example: Employee(EID, Name, . . ., Manager) A manager is also an employee. Use a second copy of the table and an alias. Reflexive Join Employee EID Name . . . Manager 115 Sanchez 765 462 Miller 115 523 Hawk 115 765 Munoz 886 SQL SELECT Employee.EID, Employee.Name, Employee.Manager, E2.Name FROM Employee INNER JOIN Employee AS E2 ON Employee.Manager = E2.EID Result EID Name Manager Name 115 Sanchez 765 Munoz 462 Miller 115 Sanchez 523 Hawk 115 Sanchez

  24. Used to change data to a different context. Example: Define age categories for the animals. Less than 3 months Between 3 months and 9 months Between 9 months and 1 year Over 1 year Not available in Microsoft Access. It is in SQL Server and Oracle. CASE Function Select AnimalID, CASE WHEN Date()-DateBorn < 90 Then “Baby” WHEN Date()-DateBorn >= 90 AND Date()-DateBorn < 270 Then “Young” WHEN Date()-DateBorn >= 270 AND Date()-DateBorn < 365 Then “Grown” ELSE “Experienced” END FROM Animal;

  25. AccountsReceivable Categorize by Days Late 30, 90, 120+ Three queries? New table for business rules Inequality Join AR(TransactionID, CustomerID, Amount, DateDue) LateCategory(Category, MinDays, MaxDays, Charge, …) Month 30 90 3% Quarter 90 120 5% Overdue 120 9999 10% SELECT * FROM AR INNER JOIN LateCategory ON ((Date() - AR.DateDue) >= LateCategory.MinDays) AND ((Date() - AR.DateDue) < LateCategory.MaxDays)

  26. SQL SELECT • SELECT DISTINCT Table.Column {AS alias} , . . . • FROM Table/Query • INNER JOIN Table/Query ON T1.ColA = T2.ColB • WHERE (condition) • GROUP BY Column • HAVING (group condition) • ORDER BY Table.Column • { Union second select } • TRANSFORM aggfunction {Crosstab values} • SELECT . . . FROM . . . GROUP BY {Crosstab rows} • PIVOT pivotfield {Crosstab columns}

  27. SQL Mnemonic Someone From Ireland Will Grow Horseradish and Onions SELECT FROM INNER JOIN WHERE GROUP BY HAVING ORDER BY SQL is picky about putting the commands in the proper sequence. If you have to memorize the sequence, this mnemonic may be helpful.

  28. SQL Data Definition • Create Schema AuthorizationdbNamepassword • Create TableTableName (Column Type, . . .) • Alter TableTable {Add, Column, Constraint, Drop} • Drop {Table Table | Index Index On table} • Create IndexIndexName ON Table (Column {ASC|DESC})

  29. Syntax Examples CREATE TABLE Customer (CustomerID INTEGER NOT NULL, LastName CHAR (10), more columns ); ALTER TABLE Customer DROP COLUMN ZipCode; ALTER TABLE Customer ADD COLUMN CellPhone CHAR(15);

  30. Queries with “Every” Need EXISTS List the employees who have sold animals from every category. By hand: List the employees and the categories. Go through the SaleAnimal list and check off the animals they have sold.

  31. Query With EXISTS List the Animal categories that have not been sold by an employee (#5). SELECT Category FROM Category WHERE (Category <> "Other") And Category NOT IN (SELECT Animal.Category FROM Animal INNER JOIN (Sale INNER JOIN SaleAnimal ON Sale.SaleID = SaleAnimal.SaleID) ON Animal.AnimalID = SaleAnimal.AnimalID WHERE Sale.EmployeeID = 5) If this query returns any rows, then the employee has not sold every animal. So list all the employees for whom the above query returns no rows: SELECT EmployeeID, LastName FROM Employee WHERE NOT EXISTS (above query slightly modified.)

  32. Query for Every SELECT Employee.EmployeeID, Employee.LastName FROM Employee WHERE Not Exists (SELECT Category FROM Category WHERE (Category <> "Other") And Category NOT IN (SELECT Animal.Category FROM Animal INNER JOIN (Sale INNER JOIN SaleAnimal ON Sale.SaleID = SaleAnimal.SaleID) ON Animal.AnimalID = SaleAnimal.AnimalID WHERE Sale.EmployeeID = Employee.EmployeeID) ); Result: 3 Reasoner

  33. Simpler Query for Every Sometimes it is easier to use Crosstab and the Count function. But some systems do not have Crosstab, and sometimes the lists would be too long. So you need to know both techniques.

  34. SQL: Foreign Key CREATE TABLE Order (OrderID INTEGER NOT NULL, OrderDate DATE, CustomerID INTEGER CONSTRAINT pkorder PRIMARY KEY (OrderID), CONSTRAINT fkorder FOREIGN KEY (CustomerID) REFERENCES Customer (CustomerID) ); Order Customer OrderID OrderDate CustomerID CustomerID LastName FirstName Address … *

  35. SQL Data Manipulation Commands Insert Intotarget (column1 . . .) VALUES (value1 . . .) Insert Intotarget (column1 . . .) SELECT . . . FROM. . . Delete Fromtable WHERE condition Updatetable SET Column=Value,. . . Where condition Note the use of the Select and Where conditions. Synatx is the same--only learn it once. You can also use subqueries.

  36. Copy Old Animal Data INSERT INTO OldAnimals SELECT * FROM Animals WHERE AnimalID IN (SELECT AnimalOrderItem.AnimalID FROM AnimalOrder INNER JOIN AnimalOrderItem ON AnimalOrder.OrderID = AnimalOrderItem.OrderId WHERE (AnimalOrder.OrderDate<#1/1/01#) );

  37. Delete Old Animal Data DELETE FROM Animals WHERE AnimalID IN (SELECT AnimalOrderItem.AnimalID FROM AnimalOrder INNER JOIN AnimalOrderItem ON AnimalOrder.OrderID = AnimalOrderItem.OrderId WHERE (AnimalOrder.OrderDate<#1/1/01#) );

  38. Change the ListPrice of Animals at the PetStore. For cats, increase the ListPrice by 10%. For dogs, increase the ListPrice by 20%. Typically use two similar UPDATE statements. With the CASE function, the statements can be combined. Update Example UPDATE Animal SET ListPrice = ListPrice*1.10 WHERE Category = “Cat” ; UPDATE Animal SET ListPrice = ListPrice*1.20 WHERE Category = “Dog” ;

  39. Break questions into smaller pieces. Test each query. Check the SQL. Look at the data. Check computations Combine into subqueries. Use cut-and-paste to avoid errors. Check for correlated subqueries. Test sample data. Identify different cases. Check final query and subqueries. Verify calculations. Quality: Building Queries Which customers who bought Dogs also bought products for Cats (at any time)? Who bought dogs? Who bought cat products? Dogs and cat products on the same sale. Dogs and cat products at different times. Dogs and never any cat products. Cat products and never any Dogs.  Test SELECT queries before executing UPDATE queries.

  40. Quality Queries: Example Which customers who bought Dogs also bought products for Cats? A. Which customers bought dogs? B. Which customers bought cat products? SELECT DISTINCT Animal.Category, Sale.CustomerID FROM Sale INNER JOIN (Animal INNER JOIN SaleAnimal ON Animal.AnimalID = SaleAnimal.AnimalID) ON Sale.SaleID = SaleAnimal.SaleID WHERE (((Animal.Category)="Dog")) AND Sale.CustomerID IN ( SELECT DISTINCT Sale.CustomerID FROM Sale INNER JOIN (Merchandise INNER JOIN SaleItem ON Merchandise.ItemID = SaleItem.ItemID) ON Sale.SaleID = SaleItem.SaleID WHERE (((Merchandise.Category)="Cat")) );

  41. Data Warehouse Predefined reports Interactive data analysis Operations data Daily data transfer OLTP Database 3NF tables Data warehouse Star configuration Flat files

  42. Existing databases optimized for Online Transaction Processing (OLTP) Online Analytical Processing (OLAP) requires fast retrievals, and only bulk writes. Different goals require different storage, so build separate dta warehouse to use for queries. Extraction, Transformation, Transportation (ETT) Data analysis Ad hoc queries Statistical analysis Data mining (specialized automated tools) Data Warehouse Goals

  43. OLTP v. OLAP

  44. Multidimensional Cube Pet Store Item Sales Amount = Quantity*Sale Price Category Customer Location Time Sale Date

  45. Sales Date: Time Hierarchy Year Roll-up To get higher-level totals Levels Quarter Month Drill-down To get lower-level details Week Day

  46. Star Design Dimension Tables Products Sales Date Fact Table Sales Quantity Amount=SalePrice*Quantity Customer Location

  47. Snowflake Design: SQL Server Analysis

  48. OLAP Computation Issues Compute Quantity*Price in base query, then add to get $23.00 If you use Calculated Measure in the Cube, it will add first and multiply second to get $45.00, which is wrong.

  49. SQL Server Analysis Cube Design • Define the Data Source (choose the database) • Create a new cube—avoid the wizard • Create the OLAPItems query, Amount=Quantity*SalePrice • Pick the fact table: OLAPItems query • Create a new dimension: PetItemSaleDate • Star design • Sale table, SaleDate column, Time dimension • Create a second dimension: PetItemCategory • Star • Merchandise table, Category column • Create a third dimension: PetCustomerLocation • Snowflake • Sale, Customer, City tables • Country, State, City columns • Select the data to measure in the fact table • Measures, New: Quantity, Amount • Save the cube as: PetItemSales • Process the cube (icon) • MOLAP (stores data in a cube layout) • Select the defaults on the other options.

  50. Oracle OLAP Design Create snapshots (Materialized Views) to preprocess queries. Mostly an issue of performance—use Star and bitmap indexes. Need Oracle Express ($$$$$) or have to build tables yourself.

More Related