510 likes | 631 Vues
CS 540 Database Management Systems. DB Programming & Schema Design. Announcements. Project definition 5 min presentation + 5 min Q &A Your group General definition, specific challenges Schedule We have a TA! Thursday 3-4pm Piazza. Relational Database Management.
E N D
CS 540 Database Management Systems DB Programming & Schema Design
Announcements • Project definition • 5 min presentation + 5 min Q&A • Your group • General definition, specific challenges • Schedule • We have a TA! • Thursday 3-4pm • Piazza
Relational Database Management Database Programming Physical Storage Conceptual Design Schema Relational Model Entity Relationship(ER) Model Files and Indexes
Is SQL Sufficient? • Using IsParent(parent,child), find all descendants of a given person. • It is not possible to write this query in standard SQL! • We can prove it.
Writing Complex Queries • SQL cannot express some complex queries. • We can add more programming constructs like recursion to SQL: • Recursive SQL • Oracle’s connected by • …
Writing Large DB Programs • We have to write large scale programs that query the database. • SQL does not support features such as graphical user interface development. • SQL is not enough!
Using SQL in a PL • Using SQL in a programming language to write programs that contain many queries • PHP, Java, C, C++,… • CS 275
Stored Procedures • Programming modules written in a procedural language. • Stored in the database. • Compiled and run by RDBMS. • Called by user.
Stored Procedure • Programming modules written in a procedural language. • We cannot use standard procedural languages like C. • RDBMS must support the language.
Stored Procedure • Each RDBMS provides its own language to write stored procedures. • Oracle: PL / SQL • SQL Server: Transact-SQL • …
Example: PL/SQL • Create a stored procedure that inserts the integer numbers between 1 and 100 into table NumOddEven(numVal, oddEven). CREATE OR REPLACE PROCEDURE sample IS BEGIN FOR i IN 1.. 100 LOOP IF MOD(i,2) = 0 THEN -- i is even INSERT INTO numOddEven VALUES (i, ‘i is even’); ELSE INSERT INTO numOddEven VALUES (i, ‘iis odd’); END IF; END LOOP; COMMIT; END
Example: PL/SQL • Called by user for from a program. > Execute sample • The languages of stored procedures support most features of programming languages. • Users can pass input parameters to stored procedures • For example, the number of values to insert in table NumOddEven.
Advantages of Stored Procedures • They are more expressive than SQL. • It is easy to use SQL in them • SQL queries are parts of the programming language. • They run faster than using SQL in standard programming languages such as PHP, Java, …. • No need to submit multiple SQL statements to the RDBMS.
Disadvantages of Stored Procedures • They do not support features such as graphical user interface. • Each RDBMS has its own. • If we write a program over a RDBMS, we cannot run it over another one. • They are harder to write, debug, and maintain than the programs written in Java or C++. • They do not generally have the nice object-oriented features of Java or C++.
When to use Stored Procedures • Generally Speaking: • They should contain small number of queries. • Keep them simple so they are easy to maintain. • They should be used for portions of the program whose performances matter a lot. • ID generation
Problems with Using SQL in PL • The programs are hard to debug and maintain • Programmers often have to run their programs to find the bugs. • If a programmer adds an attribute to a table, she has to manually find the change many lines in the source code.
Problems with Using SQL in PL • Programmers must deal with two different data models • Relational model create table Bars(name varchar(50), addrvarchar(100) …) • object-oriented model public class Bars { private String name; private String addr; …}
Object Relational Mapping (ORM) • ORM tools hide relational model from the programmers. (as much as possible) • Programmers deal with only object oriented data data model. • They write small number of SQL queries or no SQL query at all.
Object Relational Mapping (ORM) • Programmers write their programs in an object oriented language. • They let the ORM tool know which objects to store in DB. (as tables) • The ORM tool generates the SQL queries to create tables and retrieve/manipulate the data.
Object Relational Mapping (ORM) • Various tools for many programming languages. • Hibernate, Zend, … • We use Hibernate (www.hibernate.org) for Java in our example.
Example • We like to write a Java program that stores and retrieves information about bars, where each bar has name, address, and license. • We write a class in Java for Bar. public class Bars implements Serializable { private String name; private String addr; private String license; …} Interface for persistent classes
Example: Cont. • We add setter and getter methods for member variables of the class. public class Bars implements Serializable{ private String name; private String addr; private String license; // Setter and getter methods public void setName(string name){ this.name = name; } public String getName(){ return name; } … }
Example: Cont. • DBA creates table Bars(name, addr, license) in DB. (or programmer) create table Bars (name varchar(50), addrvarchar(100), license varchar(100) …) • DBA writes the relationship between the relation and the Java class in a configuration file • Bars.nameBars.name • Bars.addrBars.addr • Bars.licenseBars.license
Example: Cont. try{ SessionFactory factory = new Configur().configure().buildSessionFactory(); Session session = factory.openSession(); Bars JoeBar = new Bars(); JoeBar.setName(“Joe Bar”); JoeBar.setaddr(“12 Madison St. Corvallis,OR,97331”); JoeBar.setLicense(“license324”); session.save(JoeBar); … } catch(Exception ex){…} finally{ session.close(); } Hibernate reads the configuration file, generate the insert query, and inserts a new tuple into the DB.
More Info in the Configuration File • We can describe integrity constraints • Not NULL • Primary key and foreign key • We can describe relationships between objects (tables) • One to one, one to many
Advantages of ORMTools • Programmers write minimal amount of SQL queries. • DBA may do most of the work. • Programmers do not need to know about RDBMS • Each RDBMS makes some small syntactical changes to SQL • Our programs will work over various RDBMSs. • Test the program over MySQL, deploy it over Oracle
Disadvantages of ORMTools • They are slower than stored procedures • Outside DB • They might be slower than using SQL in programming in some cases • Too many function calls. • We cannot use customized and fast commands of an RDBMS. • They are generally harder to learn than JDBC and PHP • API + configuration files
When to Use What • There is no exact rule! • Generally speaking: • Small programs + performance: stored procedures • Small programs: SQL in programming language (JDBC, PHP) • Large programs: ORM
When to Use What • We usually use a mixed of these techniques. • Ex: Separate small parts of the program where performance is essential, and make them stored procedures and use ORM for other parts.
Database Implementation Schema Design Physical Storage Conceptual Design Schema Relational Model Entity Relationship(ER) Model Files and Indexes
E-R Model to Relational Schema Person name ssn address Address Name SSN 21 Kings St. John 222000111 234 2nd St. John 222000111 31 Kings St. Charles 111222333 2 Harrison St.Charles 111222333
Redundancy Address Name SSN • Person contains duplicate values for SSN and Name. • Update ‘John’ to ‘Richard’ in the first tuple • Same SSN, different names! low quality data • Update anomaly 21 Kings St. John 222000111 234 2nd St. John 222000111 31 Kings St. Charles 111222333 2 Harrison St.Charles 111222333
Incomplete Information Address Name SSN • Delete John’s addresses • Lose John’s SSN and name low quality data • Deletion anomaly • Insert Name and SSN for George • NULL value for Address low quality data • Insertion anomaly 21 Kings St. John 222000111 234 2nd St. John 222000111 31 Kings St. Charles 111222333 2 Harrison St.Charles 111222333
How to resolve these problems? Decompose the relation Person: SSN Name 222000111 John 111222333 Charles SSN Address 22200011121 Kings St. 222000111234 2nd St. 11122233331 Kings St. 1112223332 Harrison St.
Designing for High Quality Data: Normalization • Translate E-R model to relational schema S • Transform S to another relational schema S’ such that: • S contains all the information that is available in S. • S contains minimal amount of redundancy. • S does not have incomplete information problem.
Normal Forms ✔ ✔ • ✔ • ✔ • Each normal form removes some type of anomaly.
Normal Forms ✔ ✔ • ✔ • ✔ • Fourth normal form is also important!
Functional Dependency (FD) • A form of constraint in the database • Given attributes in relation R, the functional dependency means that all tuples in R that agree on attributes must also agree on .
Example FDs Address Name SSN SSN Name SSN Address 21 Kings St. John 222000111 234 2nd St. John 222000111 31 Kings St. Charles 111222333 2 Harrison St.Charles 111222333
Example FDs • Given Relation movies (title, year, actor, cost): title actor title year title, year, actor cost
Keys • The key of relation R is a set of attributes in R that • functionally determines all attributes in R. • none of its subsets is a key. Ex: movies (title, year,actor, cost, revenue) title, year, actor cost title, year, actor revenue title cost • Superkeyis a set of attributes that contains a key
How to find Keys and FDs? • We need to know all keys and FDs in relation R to normalize it. • Some come from E-R model and problem definition. • Some are implied by others Ex: movies (title, year, actor, cost, revenue) E-R model: title, year, actor cost E-R model: title, year, actor revenue Implied: title, year, actor cost, revenue
Closure of a set of FDs • Given a set of FDs U, we like to find all FDs logically implied by U. movies (title, year, actor, cost, revenue, blockbuster) U = {title, year, actor cost, revenue cost, revenue blockbuster} does title, year, actor blockbuster hold? • The set of all such FDs is called the closure of U, shown as U+.`
Armstrong’s Axioms • Reflexivity: generally, if , we have (trivial FD) • Augmentation: If , then • Transitivity: If and then
Computing the Closure of U • U+ = U. • Repeat • Apply reflexivity and augmentation to each FD in U+ and add the resulting FDs to U+. • Apply transitivity to each pairs of FDs in U+and add the resulting FDs to U+. • Until U+ does not change anymore.
Computing the Closure of U Ex: movies(title, year, actor, cost, revenue, blockbuster) U = {title, year, actor cost, revenue cost, revenue blockbuster} does title, year, actor blockbuster hold? • Apply transitivity
Other Rules • Decomposition: If , then , , … , • Union: If , ,…, then • There are more rules, check the textbook. We can prove them all using Armstrong’s axioms.
Closure of a set of Attributes • Given a relation R, a set of attributes and a set of FDs U, we like to find all attributes B such that . • We show the closure of set as • Used to find keys and superkeys in a relation.