700 likes | 969 Vues
The Relational Model. CMPT 354. The Relational Model. What is the Relational Model Relations Domain Constraints SQL Integrity Constraints Translating an ER diagram to the Relational Model and SQL Views. Relational Model. {x , y, z }. Relational Databases.
 
                
                E N D
The Relational Model CMPT 354
The Relational Model • What is the Relational Model • Relations • Domain Constraints • SQL • Integrity Constraints • Translating an ER diagram to the Relational Model and SQL • Views
Relational Model {x, y, z}
Relational Databases • A relational database consists of a collection of tables • Each table has a unique name • Each row represents a single entity • A table corresponds to the mathematical concept of a relation • Relations corresponds to tables • Tuples corresponds to rows
Relations • The key construct in the relational model is a relation • A DB is a collection of relations • Relations consist of a relation instance and a relation schema • A relation instance corresponds to an actual table with a set of rows • A relation schema consists of the column headings of the table
Relation Schema • A schema describes • The relation's name • The name of each column in the table, also known as a field, or attribute • The domain of each field • Each domain has a set of associated values like a type • For example, the Customer relation • Customer = (sin, fname, lname, age, income) • The schema should also specify the domain of each attribute
Customer Relation Relation schema: Customer = (sin, fname, lname, age, salary) Relation instance Each field (attribute) has a domain: char(11), char(20), char(20), integer, real An instance is a (small) subset of the Cartesian product of the domains
Domains • A domain of a field is similar to the type of a variable • The values that can appear in a field are restricted to the field's domain • A relation instance is a subset of the Cartesian product of the list of domains • If a relation schema has domains D1 to Dn • Where the domains are associated with fields f1 tofn • A relation must be a subset of D1 D2 …  Dn-1 Dn • In practice a relation instance is usually some small subset of the Cartesian product
Relation Instance • A relation instance is a set of tuples, or records • Each tuple has the same number of fields as the schema • No two rows (tuples) are identical • As each relation is defined to be a set of unique tuples or rows • A set is a collection of distinct values • In practice DBMS allow tables to have duplicate rows in some circumstances • The order of the rows is not important
More Terminology • The degree (or arity) of a relation is the number of fields (or columns) • The degree of the Customer relation is 5 • The cardinality of a relation instance is the number of tuples in it • The cardinality of the Customer relation instance is 4 • A relational database is a collection of relations with distinct relation names • A relational database schema is the collection of schemas for all relations in the database
Creating Tables in SQL CREATE TABLE ...
SQL DDL • SQL stands for Structured Query Language • SQL is divided into two parts • Data Manipulation Language (DML) which allows users to create, modify and query data • Data Definition Language (DDL) which is used to define external and conceptual schemas • The DDL supports the creation, deletion and modification of tables • Including the specification of domain constraints and other constraints
Creating Tables • To create a table use the CREATE TABLEstatement • Specify the table name, field names and domains • For example, to create the Customer table: CREATE TABLE Customer ( sin CHAR(11), fname CHAR(20), lname CHAR(20), age INTEGER, income REAL)
Inserting Records • To insert a record into an existing table use the INSERT statement • The list of column names is optional • If omitted the values must be in the same order as the columns INSERT INTO Customer(sin, fname, lname, age, income) VALUES ('111', 'Sam', 'Spade', 23, 65234)
Deleting Records • To delete a record use the DELETE statement • The WHERE clause specifies the record(s) to be deleted • Be careful, the following SQL query deletes all the records in a table DELETE FROM Customer WHERE sin = '111' DELETE FROM Customer
Modifying Records • Use the UPDATE statement to modify a record, or records, in a table • Note that the WHERE statement is evaluated before the SET statement • An UPDATE statement may affect more than one record UPDATE Customer SET age = 37 WHERE sin = '111'
Deleting Tables • To delete a table use the DROP TABLEstatement • This not only deletes all of the records but also deletes the table schema • http://xkcd.com/327/ DROP TABLE Customer
Modifying Tables • Columns can be added or removed to tables using the ALTER TABLE statement • ADD to add a column and • DROP to remove a column ALTER TABLE Customer ADD height INTEGER ALTER TABLE Customer DROP height
Integrity Constraints • An integrity constraint restricts the data that can be stored in a DB • To prevent invalid data being added to the DB • e.g. two people with the same SIN or • someone with a negative age • When a DB schema is defined, the associated integrity constraints should also be specified • A DBMS checks every update to ensure that it does not violate any integrity constraints
Types of Integrity Constraints • Domain Constraints • Specified when tables are created by selecting the type of the data • e.g. age INTEGER • Key Constraints • Identifies primary keys and other candidate keys • Foreign Key Constraints • References primary keys of other tables • General constraints
Key Constraints • A key constraint states that a minimal subset of the fields in a relation is unique • i.e. a candidate key • SQL allows two sorts of key constraints • The UNIQUEstatement identifies candidate keys • Records may not have duplicate values for the set of attributes specified in the UNIQUEstatement • A PRIMARY KEYidentifies the primary key • Records may not have duplicate primary keys and • May not have null values for primary key attributes
Primary and Candidate Keys • A primary key identifies the unique set of attributes that will be used to identify a record in a table • In some cases there are obvious primary keys in the problem domain (e.g. SIN) • In other cases, a primary key may be generated by the system • Candidate keys are identified to record the fact that a set of attributes should be unique • Preventing duplicate data being entered into the table for this set of attributes
Identifying Key Constraints • Assume that a patientcan be uniquely identified by either SIN or MSP number • SIN is chosen as the primary key CREATE TABLE Patient ( sin CHAR(11), msp CHAR(15), fName CHAR(20), lName CHAR(20), age INTEGER, CONSTRAINT unique_mspUNIQUE (msp) PRIMARY KEY (sin) )
Takes Foreign Keys • Consider the abbreviated ERD shown below • Takes is a many-to-many relationship, so a database table must be created for it (more on this later) • It makes no sense for the Takes table to be allowed to contain the student ID of students who do not exist • The student ID in Takes should be identified as a foreign key, that references Student Student Course
Takes Foreign Keys • A foreign key constraint references the primary key of another relation • The value of the foreign key attribute(s) must match a primary key in the referenced table referencing relation referenced relation Student Course
Foreign Keys • The attributes of the foreign key and the referenced primary key must be consistent • The same number of attributes, with • Compatible attribute types, but • The attribute names may be different • A foreign key references the entire primary key • Where the primary key is compound the foreign key must also be compound • And not multiple single attribute foreign keys
owns Specifying Foreign Keys an account must be owned by a single customer Customer Account CREATE TABLE Account( accnum INTEGER, type CHAR(5), balance REAL, custSIN CHAR(11), PRIMARY KEY (accNum), CONSTRAINT fk_cust FOREIGN KEY (custSIN) REFERENCES Customer )
General Constraints • A DB may require constraints other than primary keys, foreign keys and domain constraints • Limiting domain values to subsets of the domain • e.g. limit age to positive values less than 150 • Or other constraints involving multiple attributes • SQL supports two kinds of general constraint • Table constraints associated with a single table • Assertions which may involve several tables and are checked when any of these tables are modified • These will be covered later in the course
Enforcing Integrity Constraints • Whenever a table with a constraint is modified the constraint must be checked • A modification can be a deletion, a change (update) or an insertion of a record • If a transaction violates a constraint what happens? • Primary key constraints • A primary key constraint can be violated in two ways • A record can be changed or inserted so that it duplicates the primary key of another record in the table or • A record can be changed or inserted so that (one of) the primary key attribute(s) is null • In either case the transaction is rejected
age owns accnum type income sin fname lname balance Enforcing Foreign Keys • Consider these schema (the domains have been omitted for brevity) • Customer = (sin, fname, lname, age, income) • Account = (accnum, balance, type, sin) • sin in Account is a foreign key referencing Customer Customer Account
Foreign Keys – Insertions in the Referencing Table Inserting {409, 0, CHQ, 555} into Accountviolates the foreign key on sinas there is no sin of 555 in Customer The insertion is rejected; before it is processed a Customerwith a sinof 555 must be inserted into the Customer table
Foreign Keys – Updates to the Referencing Table Changing this record’s sinto 555 also violates the foreign key, again leading to the transaction being rejected
Foreign Keys – Deletions in the Referenced Table Deleting this record will violate the foreign key, because a record with that sin exists in the Account table
Foreign Keys – Updates to the Referenced Table Updating this record so that the sin = 666 will violate the foreign key, because a record with the original sin exists in the Account table
Foreign Key Violations • A deletion or update transaction in the referenced table may violate a foreign key • Different responses can be specified in SQL • Reject the transaction (the default) – NO ACTION • Delete or update the referencing record – CASCADE • Set the referencing record's foreign key attribute(s) to null (only on deletion) – SET NULL • Set the referencing record's foreign key attribute(s) to a default value (only on deletion) – SET DEFAULT • The default value must be specified in the foreign key
Logical Database Design • The ER model is an initial high level design which can be translated into a relational schema • And therefore into SQL • Each entity and relationship set can be represented by a unique relation • Although some relationship sets do not need to be represented by separate relations • SQL constraints should be included in DB tables to represent constraints identified in the ERD
Translating Entity Sets • A table that represents a (strong) entity set should have the following characteristics • One column for each attribute • Each row represents a unique entity • The domain of each attribute should be known and specified in the table • The primary key should be specified in the table • Other constraints that have been identified outside the ER model should also be created where possible
age sin income Customer lname fname Entity Sets CREATE TABLE Customer ( sin CHAR(11), fname CHAR(20), lname CHAR(20), age INTEGER, income REAL, PRIMARY KEY (sin) )
Relationship Sets • The attributes of a relationship set are: • The primary keys of the participating entity sets, and • Any descriptive attributes of the relationship set • The mapping cardinalities of the entities involved in a relationship determine • The primary key of the relationship set and • Whether or not the relationship set needs to be represented as a separate table • Foreign keys should be created for the attributes derived from the participating entity sets
Relationship Sets With No Cardinality Constraints • A separate table is required to represent a relationship set with no cardinality constraints • i.e. relationships that are many to many • The primary key of the relationship set is the union of the attributes derived from its entity sets • A compound key made up of the primary keys of the participating entity sets • Attributes derived from its entity sets should be declared as foreign keys
start budget sin salary address bname name works_in Branch Employee Relationship Set to Table CREATE TABLE WorksIn( sin CHAR(11), bname CHAR(30), start DATETIME, FOREIGN KEY (sin) REFERENCES Employee, FOREIGN KEY (bname) REFERENCES Branch, PRIMARY KEY (sin, bname) )
salary sin fname lname Employee manager subordinate manages Relationship Set to Table … CREATE TABLE Manages ( manSIN CHAR(11), subSIN CHAR(11), FOREIGN KEY (manSIN) REFERENCES Employee, FOREIGN KEY (subSIN) REFERENCES Employee, PRIMARY KEY (manSIN, subSIN) )
Relationship Sets With Cardinality Constraints • Determine attributes for the table as for a relationship set without cardinality constraints • To determine the primary key of a binary relationship set • If the relationship is one to one then the primary key of either entity set can be its primary key • But it must be only one of the two primary keys • If the relationship is many to one, or one to many the primary key of the “many” entity set is its primary key • That is, the entity set whose entities can only appear once in the relationship set
Non Binary Relationship Set Primary Keys • If the relationship set has no cardinality constraints the primary key is a compound key • Union of the primary keys of the participating entity sets • If the relationship set has at least one cardinality constraint • The primary key is taken from one of the “many” entity sets in the relationship • That is, one of the entity sets whose entities can be involved in at most one relationship
Should a Relationship Set be Represented as a Table? • A binary relationship does not require a table if at least one its entities has a key constraint • Add the relationship set's attributes to the table for the entity set that provided the primary key • If the entity’s participation in the relationship is partial, some rows may not have data for these attributes • Such rows will contain nullfor these fields, which wastes space (and has other undesirable properties) • The attributes from the other entity sets involved in the relationship are specified as foreign keys
Many-to-One Relationship CREATE TABLE Employee ( sin CHAR(11), name CHAR(40), salary REAL, bname CHAR(30), start DATETIME, FOREIGN KEY (bname) REFERENCES Branch, PRIMARY KEY (sin) ) start budget sin salary address bname name works_in Branch Employee
Participation Constraints • Where possible participation constraints should be included in a table specification • By declaring the attributes on which there is a foreign key as NOT NULL • This approach only works when a relationship set is not represented in a separate table • i.e. when the relationship is represented as a foreign key in a table that represents an entity set • Specifying an attribute as NOT NULLforces every record of that entity set to have a value for the attribute • Some participation constraints must be modeled using assertions or triggers
sin accnum birthdate balance income Account Customer owns lname type fname Participation Constraint CREATE TABLE Owns ( sin CHAR(11), accnum INTEGER, FOREIGN KEY (sin) REFERENCES Customer, FOREIGN KEY (accnum) REFERENCES Account, PRIMARY KEY (sin, accnum) ) Although sinand accNumcannot be null the participation constraint is notrepresented Why not?
start budget sin salary address bname name works_in Branch Employee Participation Constraint … CREATE TABLE Employee ( sin CHAR(11), nameCHAR(40), salary REAL, bname CHAR(30) NOT NULL, start DATETIME, FOREIGN KEY (bname) REFERENCES Branch, PRIMARY KEY (sin) ) Making bnameNOT NULLensures that every Employee must work in a branch