Supertype and Subtype Entities in Database Design
290 likes | 323 Vues
Learn about the nature and solutions to problems related to subtype entities, specialization variations, and implementing subtype identifiers in database design, with special topics and performance considerations.
Supertype and Subtype Entities in Database Design
E N D
Presentation Transcript
Super-Type & Sub-Type Entities—Topics • Problems needing subtype entities • Nature of the solution • Variations—Specialization and Completeness • Subtype Identifiers • Implementing • Special Topics • Using Super- and Sub-Types • Performance Considerations
Supertype & Subtype Entities • Some entities have records that come in various ‘flavors’. • StudentsDoctoral, Masters, Undergraduate • ProductsSerial-numbered, perishable, animals, etc. • EmployeesSalaried, hourly, managerial, part time • Pet Store Products Food, animals, accessories • These entity sets have two types of attributes • Attributes common to every occurrence • Attributes required by one or more subtypes but not used by all occurrences of the entity
Why is This a Problem? • Variations on an entity create a space problem • If we put all possible attributes for all possible variations (subtypes) in one entity we will waste unused fields in most records • Sport attribute for students who are not athletes
Supertype & Subtype Entities (cont.) • Subtypes also create relationship problems • Some relationships will only be with a subtype of the entity, not with all types • In a pet store a veterinarian will inspect the animals inventory items but probably not the turtle food
Supertype & Subtype Entities • It is common to split upentities with variationsinto a supertype andsome subtypes • Supertype containsattributes common toall occurrences • Subtypes contain attributes needed by the subtype ERD Notation Visio Equivalent
An Example • Cash is a PaymentTypebut needs no special attributes • Partial Specialization (coming up) • Payment ID is PK of all entities • Payment ID is also FK insubtype entities • In SQL Server besure to set parent this way when implementing relationships
Need for Subtypes • Subtypes are used when an identifiable subset of occurrences have a need for fields not needed by all occurrences • Many occurrences will have empty attribute values • An occurrence’s membership in the identifiable subset must be observable • It is known whether a student is registered as an athlete • But there is no obvious distinction to distinguish ‘local’ students from ‘transient’ students
First Variation on Super-/Subtypes • Completeness Constraint • Must every supertype occurrence have at least one occurrence in one of the subtypes? • Total specialization means thata subtype occurrence must exits • Indicated with a double lineto the connecting circle • Partial specialization means thata subtype need not exist • Indicated with a single line tothe connecting circle
Total Specialization Completeness Constraint • Total specialization means that every record in the supertype must have a matching record in one or more subtypes • Relatively rare (in my experience) but possible • Model in Visio using a thicker descending line (use Format Line) • (Visio doesn’t do double lines) • Increase thickness by two levels • Watch for SQL Server modeling later
Partial Specialization Completeness Constraint • Some records in supertypes may have no matching subtype records • Their subtype groups do not needspecial attributes • But membership in a groupmay still be important andtracked • It is possible for a suptertype to haveonly one subtype group
Your Turn • Model the products in a home improvement store as a supertype/subtype relationship • Identify categories and any specialized attributes needed
Second Variation on Super-/Subtypes • You must also determine whether a supertype occurrence can be found in more than one subtype • A disjoint relationship meansthat a supertype occurrence can only be found in one subtype • An overlap relationship means that a supertype occurrencecan be found in multiple subtypes(E.g., some universities have ajoint J.D./MBA program) “d” “o”
Disjoint Relationships • A registered vehicle canonly be of one type
Subtype Identifiers • The supertype entity must indicate which (if any) subtypes are used • Disjoint subtypes can use one attribute with a code to indicate the type of subtype • Value of the attribute (‘Cash’, ‘Check’, ‘CC’) identifies the subtype • Remember that some subtype identifiers (‘Cash’ here) may have no subtype entities • Sometimes this value may be blank (not part of any group)
Subtype Identifiers (cont.) • Overlapping subtypes must use a collection of yes/no attributes, one for each possible subtype • Setting attribute to true/yes in a record indicates that a matching subtype record exists • Leaving all to false/no indicates no matching subtype (partial specialization)
Subtype Identifiers (cont.) • A subset of subtypes may be disjoint while others are overlap
Subtypes of Subtypes • It is possible to have subtypes of subtypes • Model products in a pet store where some are inanimate, some are food, some are live and of the live animals some are tracked individually… • Cute puppies with wet noses • Cats • … and others are not • Goldfish • Mice • … and some are sold as food • Cute little mice as food for slithering scaly snakes
Some Caveats • ST/ST determined at the group level. Individual records may not have values for all fields • More than one subtype may have the same field in it • Field goes in subtype entities if not every subtype group needs it • Consider eliminating subtypes if they have only one or two attributes • Roll their attributes back into the suptertype and accept wasted space • Consider if a large proportion of the population • Consider if frequently accessed
Implementing Super-/Subtypes • There is a Mandatory-1:Optional-1 relationship between entities in a super and subtype relationship • Mandatory at supertype end • Optional at each subtype end • Each subtype occurrence (record) has identifier attribute values that exactly match a record in the supertype (but not vice-versa) • All entities have the same primary key/ identifier attributes • PK in the subtype is also the FK from supertype • Special case of a weak entity
Implementing in SQL Server—Relationships PK is also FK
Implementing in SQL Server—Diagrams • Arrange in org-chart hierarchy • Gives visual cue that this is a ST/ST relationship • You will need to wrestle with the relationship lines a little • Note Key symbols at both ends of the lines • Indicates 1:1 Cardinality
Subtypes of an Unimplemented Supertype • Many, many data models will have records that could be subtypes of a supertype that is not implemented • For UCF a “Person” entity could have subtypes • Student − Donor • Faculty − Contractor • Tend to not implement this Person supertype unless the entities are regularly queried together • Occasional queries can be supported with a UNION query
Subtypes and Object Oriented Design • Super- and Sub-type design exactly corresponds to the philosophy of inheritance in object oriented design • If programming using an OO approach you will almost always implement objects with inheritance to match super- and sub-type design • You can also implement inheritance for the unimplemented supertype discussed in the previous slide, even if not implemented in the DB design
Using Super-/Sub-type Tables • Application logic and SQL for super-and sub-type tables becomes more complex • Inserts must test the subtype identifier to determine where to add records • Always to the supertype • Decide which (if any) subtype(s) • Similar for Updates
Using Super-/Sub-type Tables (cont.) • Retrieval also complex • You cannot simply join the supertype with all subtypes since no records will be returned if a subtype has no match • Why won’t the following work? SELECT Payment.*, Check_Payment.*, CC_Payment.* FROM Payment, Check_Payment, CC_Payment WHERE Payment.PaymentID = Check_Payment.PaymentID AND Payment.PaymentID = CC_Payment.PaymentID AND Payment.PaymentID = 1472
Using Super-/Sub-type Tables (cont.) • Two query approaches • Use conditional logic • Use Left/Right Outer Joins SELECT Payment.*, Check_Payment.*, CC_Payment.* FROM Payment LEFT JOIN Check_Payment ON Payment.PaymentID = Check_Payment.PaymentID LEFT JOIN CC_Payment ON Payment.PaymentID = CC_Payment.PaymentID WHERE Payment.PaymentID = 1472
Performance Considerations • Because of the performance considerations and complexity of Super- and Sub-types you will regularly consider eliminating subtypes • Roll up their attributes into the super-type and accept the wasted columns • Arguments for retaining subtypes • Several unique attributes, especially large (text) ones • Relatively few records in the subtype (compared to overall number of records) • Relatively few transactions use the subtype • Look at vertical partitioning later in the course