150 likes | 162 Vues
Explore the widely used relational model and the SQL query language, including the creation of relations in SQL and enforcing referential integrity with foreign keys.
 
                
                E N D
The Relational Model Chapter 3
Why Study the Relational Model? • Most widely used model. • Vendors: IBM, Informix, Microsoft, Oracle, Sybase, etc. • “Legacy systems” in older models • E.G., IBM’s IMS • Recent competitor: object-oriented model • ObjectStore, Versant, Ontos • A synthesis emerging: object-relational model • Informix Universal Server, UniSQL, O2, Oracle, DB2
The SQL Query Language • Developed by IBM (system R) in the 1970s • Need for a standard since it is used by many vendors • Standards: • SQL-86 • SQL-89 (minor revision) • SQL-92 (major revision, current standard) • SQL-99 (major extensions)
Summary Relational Data Model I • A relation database consists of a set of relations (tables) • Each relation consist of a set of attribute/domain pairs • Attributes have to be single-valued. Additionally, “null” values are supported • The attributes of a relation are unordered • Attribute values have to be atomic (cannot be tuples or relations) • Relations store sets of tuples (queries return bags of tuples!!); each tuple of a relation stores a value for each attribute of the relation; all tuples stored in a relation must be different • Tuples in a relation are unordered
Summary Relational Data Model II • Each relation has one primary key that consists of a set of attributes that uniquely identifies the tuples in a relation). • Relationships between object in the real world are represented in the relational data model by “exporting primary keys of the objects that participate in the relationship”. The exported keys are called foreign keys. • Multi-values attributes can be presented either by using a separate relation or by representing the object through multiple tuples (one for each value of the multi-valued attribute). • Optional attributes (attributes that not necessarily have a value) can be represented using null values; furthermore, they can also be represented by using a separate relation that stores the relation’s primary key attributes and the optional attribute.
Creating Relations in SQL CREATE TABLE Students (sid: CHAR(20), name: CHAR(20), login: CHAR(10), age: INTEGER, gpa: REAL) • Creates the Students relation. Observe that the type (domain) of each field is specified, and enforced by the DBMS whenever tuples are added or modified. • As another example, the Enrolled table holds information about courses that students take. CREATE TABLE Enrolled (sid: CHAR(20), cid: CHAR(20), grade: CHAR(2))
Graphical Short Notations forRelational Schemas • R(A,B,C), S(D,E) meaning: {A,B} is a primary key for R; D is a primary key for S • S(D,E) meaning: X is a foreign key in T that references attribute D of relation T: T[X]  S[D] T(X,Y,Z) Remark: The graphical short notation only specifies relation names, attributes, primary keys, and foreign keys but omits other schema information (such as attribute domains, uniqueness constraints, …)
Primary Key Constraints • A set of fields is a candidate keyfor a relation if : 1. No two distinct tuples can have same values in all key fields, and 2. This is not true for any subset of the candidate key. • Part 2 false? A superkey. • If there’s >1 key for a relation, one of the keys is chosen (by DBA) to be the primary key. • E.g., sid is a candidate key for Students. (What about name?) The set {sid, gpa} is a superkey. • Remark: Sometimes people say ‘key’ when the refer to candidate keys.
Primary and Candidate Keys in SQL • Possibly many candidate keys(specified using UNIQUE), one of which is chosen as the primary key, if necessary. CREATE TABLE Enrolled (sid CHAR(20) cid CHAR(20), grade CHAR(2), PRIMARY KEY (sid,cid) ) • “For a given student and course, there is a single grade.” vs. “Students can take only one course, and receive a single grade for that course; further, no two students in a course receive the same grade.” • Used carelessly, an IC can prevent the storage of database instances that arise in practice! CREATE TABLE Enrolled (sid CHAR(20) cid CHAR(20), grade CHAR(2), PRIMARY KEY (sid), UNIQUE (cid, grade) )
Foreign Keys, Referential Integrity • Foreign key : Set of fields in one relation that is used to `refer’ to a tuple in another relation. (Must correspond to primary key of the second relation.) Like a `logical pointer’. • E.g. sid is a foreign key referring to Students: • Enrolled(sid: string, cid: string, grade: string) • If all foreign key constraints are enforced, referential integrity is achieved, i.e., no dangling references. • Can you name a data model w/o referential integrity? • Links in HTML!
Foreign Keys in SQL • Only students listed in the Students relation should be allowed to enroll for courses. CREATE TABLE Enrolled (sid CHAR(20), cid CHAR(20), grade CHAR(2), PRIMARY KEY (sid,cid), FOREIGN KEY (sid) REFERENCESStudents ) Enrolled Students
Enforcing Referential Integrity • Consider Students and Enrolled; sid in Enrolled is a foreign key that references Students. • What should be done if an Enrolled tuple with a non-existent student id is inserted? (Reject it!) • What should be done if a Students tuple is deleted? • Also delete all Enrolled tuples that refer to it. • Disallow deletion of a Students tuple that is referred to. • Set sid in Enrolled tuples that refer to it to a default sid. • (In SQL, also: Set sid in Enrolled tuples that refer to it to a special value null, denoting `unknown’ or `inapplicable’.) • Similar if primary key of Students tuple is updated.
Referential Integrity in SQL/92 CREATE TABLE Enrolled (sid CHAR(20), cid CHAR(20), grade CHAR(2), PRIMARY KEY (sid,cid), FOREIGN KEY (sid) REFERENCESStudents ON DELETE CASCADE ON UPDATE SET DEFAULT ) • SQL/92 supports all 4 options on deletes and updates. • Default is NO ACTION (delete/update is rejected) • CASCADE (also delete all tuples that refer to deleted tuple) • SET NULL / SET DEFAULT(sets foreign key value of referencing tuple)
Graphical Short Notations forRelational Schemas • R(A,B,C), S(D,E) meaning: {A,B} is a primary key for R; D is a primary key for S • S(D,E) meaning: X is a foreign key in T that references attribute D of relation T: T[X]  S[D] T(X,Y,Z) Remark: The graphical short notation only specifies relation names, attributes, primary keys, and foreign keys but omits other schema information (such as attribute domains, uniqueness constraints, …)
Relational Integrity Rules • Entity Integrity: Null values are disallowed for primary key attributes of a relation. • Referential Integrity: the value of a foreign key must be either null or there must be a tuple with the foreign key value for the exported reference attribute(s) in the referenced relation. • Stability of the Primary Key: Updates of primary key attributes should be disallowed. Remark: The referenced attribute of a foreign key is usually the primary key of the relation; however, SQL-92 allows foreign keys to reference any attribute(s) of a relation, even attributes that are not superkeys.