1 / 35

Databases and Database Design

Databases and Database Design. William A. Yasnoff, MD, PhD Oregon Health Division. Databases & Database Design. Introduction Relational Databases Database Design: Data Models. Introduction. Database = large collection of information Stored on hard disk Retrieval

onofre
Télécharger la présentation

Databases and Database Design

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Databases and Database Design William A. Yasnoff, MD, PhD Oregon Health Division

  2. Databases & Database Design • Introduction • Relational Databases • Database Design: Data Models

  3. Introduction • Database = large collection of information • Stored on hard disk • Retrieval • 100,000 items X 10 ms = 17 minutes • Need to organize data to improve retrieval speed • How do you organize your paper files?

  4. Introduction (continued) • Data Organization • sorting data: phonebook in alpha v. random order • “tabs” at key points: dictionary with tabs for each letter • direct pointers to information = index • table of key element value and address of other data • e.g. name & mailbox number --> letters in mailbox

  5. Relational Database Model

  6. Relational Database Model • Developed by EF Codd, CJ Date (70s) • Table = Entity = Relation • Table row = tuple = instance • Table column = attribute • Table linkage by values • Entity-Relationship Model

  7. Attributes • Key • uniquely identify row/tuple • may be one or more attributes • Non-key • other properties of instance • dependent on key

  8. Retrieval in RDMS • SQL • Select ... • From ... • optional: can be computed • Where ... • Query by example • Fill in the blanks • WONDER

  9. Advantages of RDMS • Very strong theoretical basis • storage • retrieval • Easy to implement • Conflicts and anomalies can be avoided • Intuitive appeal • Easy retrieval

  10. Relational Model: Summary • popular method of organizing data • strong theoretical properties facilitate retrieval • relation = table • attribute = column • tuple = row • key = {attributes} that uniquely identify each row

  11. Relational Model: Summary 2 • Allows complex data relationships with multiple tables: • {patient id, patient demographics} • {patient id, patient visit date, blood pressure} • Structured Query Language (SQL) retrieval • Select patient _name where blood_lead_level > 10

  12. Database Design: Data Models

  13. Data Models • Definition: A representation of the data and and data relationships of an activity • data: case report of serious E. Coli illness • data: E. Coli serotype O157 • relationship: case report “contains” serotype (semantic relationship) • Database Design = Development of a quality data model

  14. Correctness Conceptual (concepts represented properly) real world representation Syntactic (relationships represented properly) real world language Completeness (wholeness) Conceptual (all concepts represented) Syntactic (all relationships represented) Data Model Quality

  15. Entity • “Any distinguishable object that is to be represented in a database” [C.J. Date] • Properties • within scope of model • single concept • a set of distinguishable “instances” • satisfies normalization rules • Also called a relation

  16. Entity Guidelines • Relationship with at least one other entity • Unique, descriptive name • Class or set of things (not just one) • Single meaning (no homonyms) • No synonyms (two entities describing same class)

  17. Attribute • “a type or characteristic of an entity” (e.g. “gender” is an attribute of the entity “patient”) • Domain = the set of values from which an attribute may be selected (e.g. the domain of the attribute “gender” is [male, female]) • An entity typically has many attributes

  18. Attribute Characteristics • Key • value uniquely identifies entity • e.g. “Lab test ID” is key attribute of “specimen” • Non-Key • value does not uniquely identify entity • e.g. “author” does not uniquely identify “publication”

  19. Attribute Characteristics • Atomic = individual data value (one and only one fact) • Description = complete and clear definition • e.g. “professional privilege date” = The date on which a health care professional is granted privilege to practice in a particular health care facility, establishing the provider’s eligibility for patient care assignments and liability coverage

  20. Attribute Guidelines • Unique name • no plurals, possessives, articles, conjunctions, verbs, or prepositions • Clear, complete, unambiguous description • Atomic (no positional information) • Domain with 2 or more values • Originates in only one entity

  21. Primary Key • Attribute of an entity whose values uniquely determine its occurrences • {birth certificate number, person name, birth date, mother name, physician name} • {facility, patient name, physician name, date, temperature, pulse, blood pressure}

  22. Primary Key Characteristics • Stable: does not change over time • Minimal: fewest attributes necessary • Factless: no hidden information • Definitive: value always exists • Accessible: available when data created • Unique: absolutely no duplicates

  23. Relationship • Semantic: “contains”, “is part of”, “belongs to” • One-to-one • serotype of an organism • immunization status of a child • One-to-many • antibiotic resistances of an organism • vaccines administered to a child

  24. Relationship Guidelines • No circular references (e.g. “health plans -> markets -> products -> health plans” should be “health plans -> health plan markets -> market products <- health plan products <- health plans”) • Single relationship between two entities • No recursive relationships

  25. Single Relationship of Entities • Problem: employee <--> job assignment <--> job [double circular] • Solution: person --> job assignment <-- job [“person” includes “status” as employee or contractor]

  26. Recursive Relationships • Problem: supervisor <--> employee [may be circular] • Solution: person {person ID } --> employee-supervisor relationship {employee ID, supervisor ID} • multiple supervisory roles for each person

  27. Normalization • Formalization of common sense rules of information organization • An attribute is functionally dependent on X only if each of its values is determined by the value of X (X may be composite) • Example: DOB is functionally dependent on Driver’s License number

  28. Key Normalization Concept • Functional dependence of each entity must be • based on entire primary key • NOT based on any other attributes

  29. Benefits of Normalization • Aids in database design, integration • Ensures precise capture of business logic • Minimizes redundancy • Minimizes need for null values • Prevents • information loss • unintentional results

  30. Summary of Normalization • One Fact in One Place

  31. Common DB Design Errors • Multiple instances in same row of table, e.g. first_value, second_value, third_value [Problem: what to do with 4th value?] • Same data item repeated in multiple places, e.g. address appears in two different tables [Problem: how to keep two values synchronized?]

  32. Correctness Conceptual (concepts represented properly) real world representation Syntactic (relationships represented properly) real world language Completeness (wholeness) Conceptual (all concepts represented) Syntactic (all relationships represented) Data Model Quality

  33. Database Design Pearls • Accommodate all data needed • Correct relationships between data items • No duplicate representation • Anticipated retrievals use indexes • Meet confidentiality requirements

  34. References - 1 • Reingruber MC & Gregory WW: The Data Modeling Handbook (New York: John Wiley & Sons, 1994) • Montgomery SL: Object-Oriented Information Engineering (Boston: AP Professional/Harcourt Brace, 1994)

  35. References - 2 • Codd EF: The Relational Model for Database Management (Reading, MA: Addison-Wesley, 1990) • Date CJ: An Introduction to Database Systems, 5th ed. (Reading, MA: Addison-Wesley, 1990) • Duncan KA: Health Information and Health Reform (San Francisco: Jossey-Bass, 1994)

More Related