350 likes | 476 Vues
Chapter 3: SQL. Schema used in examples. p. 75-109 (omit 3.8.2, 3.10.5, 3.11). Data Definition Language. Allows the specification of a set of relations and information about each relation Including:. the schema for each relation the domain of values associated with each attribute
E N D
Chapter 3: SQL Schema used in examples p. 75-109 (omit 3.8.2, 3.10.5, 3.11)
Data Definition Language Allows the specification of a set of relations and information about each relation Including: • the schema for each relation • the domain of values associated with each attribute • integrity constraints • the set of indices to be maintained for each relations • security and authorization information for each relation • the physical storage structure of each relation on disk
Domain Types in SQL • char(n). Fixed length character string with user-specified length n • varchar(n). Variable length character strings with user-specified maximum length n • int.Integer (machine-dependent) • smallint. Small integer (machine-dependent) • numeric(p,d). Fixed point number with user-specified precision of p digits, with n digits to the right of decimal point • real, double precision. Floating point and double-precision floating point numbers with machine-dependent precision • float(n). Floating point number with user-specified precision • More in Chapter 4
Create Table • A relation is defined using create table: • create table r (A1D1, A2D2, ..., An Dn,(integrity-constraint1), ..., (integrity-constraintk)) • r is the name of the relation • each Ai is an attribute name in the schema of relation r • Di is the data type of values in the domain of attribute Ai • Example (requires that branch_name not be null) • create table branch (branch_name char(15) not null,branch_city char(30),assets integer)
Integrity Constraints in Create Table • not null • primary key (A1, ..., An ) Declare branch_name as the primary key for branch . create table branch(branch_name char(15),branch_city char(30),assets integer,primary key (branch_name)) primary key ensures not null in SQL-92 onwards A primary key can have multiple attributes : primary key (branch_name, branch_city) (more integrity constraints later)
Drop and Alter Table • drop tabledeletes a relation from the database • alter table to add attributes to an existing relation: • alter table r add A D • A is an attribute to be added to relation r • D is the domain of A. • all tuples in the relation are assigned null for the new attribute • alter table to drop attributes: • alter table r drop A • A is an attribute of relation r
Basic Query Structure • Typical query:select A1, A2, ..., Anfromr1, r2, ..., rmwhere P • Ai is an attribute • ri s a relation • P is a predicate • Equivalent to the relational algebra • The result of an SQL query is a relation
select • select corresponds to projection in the relational algebra • Find the names of all branches in loan: • select branch_namefrom loan • Relational algebra: • branch_name(loan) • SQL does not permit '–' (branch_name, not branch_name) • SQL names are case-insensitive
select (cont.) • SQL allows duplicates in relations as well as in query results • Keyword distinct eliminates duplicates • Find the names of all branches in the loan relations and remove duplicates • select distinct branch_namefrom loan • Keyword allspecifies that duplicates not be removed • select allbranch_namefrom loan • all is the default
select (cont.) • Asterisk denotes “all attributes” • select *from loan • select can contain arithmetic expressions: • selectloan_number, branch_name, amount 100from loan • returns a relation with attribute amount multiplied by 100
where • where corresponds to selection () in the relational algebra • Find loan number for all loans made at the Aurora branch with loan amounts greater than $1200: • select loan_numberfrom loanwhere branch_name = “Aurora” and amount > 1200 • (SQL uses the logical connectives and, or, and not) • Find the loan number of loans with loan amounts between $90,000 and $100,000: • select loan_numberfrom loanwhere amount90000 and amount100000 • Using between: • select loan_numberfrom loanwhere amountbetween 90000 and 100000
from • from corresponds to Cartesian product in the relational algebra. • Find borrower x loan: • select from borrower, loan • Natural join has to be made explicit • E.g. find the name and loan number of all customers with a loan at the Aurora branch: • select customer_name, borrower.loan_numberfrom borrower, loanwhereborrower.loan_number = loan.loan_number andbranch_name = “Aurora”
MySQL • MySQL can be run from a command line, Query Browser, or application program interface • Free, popular, well-supported • Download from mysql.com Access Its own version of SQL Query by Example (QBE)
The Rename Operation • Renaming relations and attributes is accomplished through as: • old_name as new_name • Find the name and loan number of all customers with a loan at the Aurora branch; replace the column name loan_number with the name loan_id. • select customer_name, borrower.loan_number as loan_id, amount • from borrower, loan • where borrower.loan_number = loan.loan_number and branch_number = “Aurora” • (customer_name, loan_number) (loan_number, branch_name, amount) • = "Aurora" • (customer_name, loan_id, amount)
Tuple Variables • Tuple variables are defined in the from clause via as • Find the customer names and their loan numbers for all customers having a loan at some branch • select customer_name, T.loan_number • from borrower as T, loan as Swhere T.loan_number = S.loan_number • Find the names of all branches that have greater assets than some branch located in Brooklyn.select distinct T.branch_namefrom branch as T, branch as Swhere T.assets > S.assets and S.branch_city = “Brooklyn” • Try: find the names of all customers with more than one loan • select distinct S.customer_name from borrower as S, borrower as T, where S. customer_namer = T. customer_name and • S.loan_number ≠ T.loan_number
String Operations • Patterns are described using two special characters: • % matches any substring. • _ matches any character. • Find the names of all customers whose street includes the substring “Main”. • select customer_namefrom customerwherecustomer_street like '% Main%' • What does this do: • wherecustomer_street like "M_in"
Ordering the Tuples • List the names of all customers having a loan in Perryridge branch alphabetically • select distinct customer_namefrom borrower, loanwhere borrower loan_number = loan.loan_number and branch_name = 'Perryridge' order by customer_name desc • desc for descending order, asc for ascending order • Ascending order is the default
Set Operations • union, intersect, and except correspond to the relational algebra operations • To retain duplicates use union all, intersect all and except all • Find the names of all customers who have a loan, an account, or both: • (selectcustomer_name from depositor)union (selectcustomer_name from borrower) • Find all customers who have both a loan and an account: • (replace union with intersect) • Find all customers who have an account but no loan: • (replace union with except)
Aggregate Functions • These functions operate on the values of a column of a relation • avg: average valuemin: minimum valuemax: maximum valuesum: sum of valuescount: number of values • Find the average account balance at the Aurora branch • select avg (balance)from accountwhere branch_name = “Aurora” • Find the number of tuples in the customer relation • select count (*)from customer • Find the number of depositors in the bank. • select count (distinct customer_name) • from depositor
Aggregate Functions – Group and Having • Sometimes we want to calculate over a group • select branch_name, avg (balance)from accountgroup by branch_name • finds the average for each group • To restrict within those group calculations: branches where the average account balance is more than $1,200 • select branch_name, avg (balance)from accountgroup by branch_namehaving avg (balance) > 1200
Null Values • Tuples may be null for some of their attributes • null signifies unknown or that a value does not exist • Use is null to check for null values: • find all loan numbers in loan with null values for amount • select loan_numberfrom loanwhere amount is null • Aggregate operations (except count(*)) ignore tuples with null values on the aggregated attributes • E.g. • select avg (amount)from loan • ignores null amounts
Nested Subqueries • Find all customers who have both an account and a loan • select distinct customer_namefrom borrowerwhere customer_name in (select customer_namefromdepositor) • Can we do it without nesting? • (select distinct customer_namefrom borrower) • intersect • (select distinct customer_namefrom depositor)
Nested Subqueries • Find the names of all customers who have a loan at the bank but do not have an account at the bank • select distinct customer_namefrom borrowerwhere customer_name not in (select customer_namefrom depositor) • Can we do it without nesting? • (select distinct customer_namefrom borrower)except • (select distinct customer_namefrom depositor)
Set Comparison • Find all branches that have greater assets than some branch located in Brooklyn select distinct T.branch_namefrom branch as T, branch as Swhere T.assets > S.assets andS.branch_city = 'Brooklyn' Same query using > some select branch_namefrom branchwhere assets > some (select assetsfrom branchwhere branch_city = 'Brooklyn')
Example • Find the names of all branches that have greater assets than all branches located in Brooklyn select branch_namefrom branchwhere assets > all (select assetsfrom branchwhere branch_city = 'Brooklyn')
Test for Absence of Duplicate Tuples • unique tests whether a subquery has duplicate tuples in its result • Find all customers with at most one account at Perryridge branch • select T.customer_name • from depositor as T • where unique ( • select R.customer_namefrom account, depositor as Rwhere T.customer_name = R.customer_name andR.account_number = account.account_number andaccount.branch_name = 'Perryridge') • Relational algebra?
Derived Relations • SQL allows a subquery in the from clause • Find the average account balance of those branches where the average account balance is greater than $1200 • select branch_name, avg_balancefrom (select branch_name, avg (balance)from accountgroup by branch_name )as branch_avg ( branch_name, avg_balance )where avg_balance > 1200
Views • A relation that is not of the conceptual model but is made visible to a user as a “virtual relation” • A way to hide certain data from some users • Defined using create view • create view v as < query expression > • where <query expression> is any legal SQL expression • The view name is represented by v. • When a view is created • the query expression is stored in the database • it is substituted into queries using the view
Example • A view consisting of branches and their customers create view all_customer as(select branch_name, customer_namefrom depositor, accountwhere depositor.account_number = account.account_number ) union(select branch_name, customer_namefrom borrower, loanwhere borrower.loan_number = loan.loan_number ) Find all customers of the Aurora branch select customer_namefrom all_customerwhere branch_name = 'Aurora'
Modification of the Database – Deletion • Delete all account records at the Aurora branch • delete from accountwhere branch_name = “Aurora” • Delete all accounts at every branch located in Needham. • delete from accountwhere branch_name in (select branch_namefrom branchwhere branch_city = “Needham”) • What does this mean? • delete from depositorwhere account_number in (select account_numberfrom branch, accountwhere branch_city = “Needham”and branch.branch_name = • account.branch_name)
Example • Delete the record of all accounts with balances below the average at the bank. • delete from accountwhere balance < (select avg (balance)from account) • Problem: deleting tuples from deposit changes the average balance • SQL's solution: 1. first, compute avg (balance) 2. next, delete all tuples found above it (without recomputing avg)
Modification of the Database – Insertion • Add a new tuple to account • insert into accountvalues ('A-9732', 'Perryridge', 1200) • Equivalentlyinsert into account (branch_name, balance, account_number)values ('Perryridge', 1200, 'A-9732') • Add a new tuple to account with balance set to null • insert into accountvalues ('A-777','Perryridge', null )
Modification of the Database – Insertion • Provide a $200 savings account to all loan customers of the Perryridge branch. Let the loan number serve as the account number for the new savings account • What tables need to be modified? • insert into accountselect loan_number, branch_name, 200from loanwhere branch_name = 'Perryridge' • insert into depositorselect customer_name, loan_numberfrom loan, borrowerwhere branch_name = 'Perryridge' and loan.account_number = borrower.account_number
Modification of the Database – Updates • Increase all accounts with balances over $10,000 by 6%, all other accounts receive 5% • Write two update statements: update accountset balance = balance 1.06where balance > 10000 update accountset balance = balance 1.05where balance 10000 • Why is the order is important? • Can be done better using case: updateaccountsetbalance = casewhenbalance <= 10000 thenbalance *1.05elsebalance * 1.06end
Update through a View • Create a view of all loan data in the loan relation, hiding the amount attribute • create view loan_branch as select loan_number, branch_namefrom loan • Add a new tuple to branch_loan • insert into loan_branchvalues ('L-37‘, 'Perryridge‘) • Result: tuple • ('L-37', 'Perryridge', null ) • is inserted into the loan relation