1 / 23

CS405G: Introduction to Database Systems

CS405G: Introduction to Database Systems. Final Review. Database Design. E-R model. E-R model Entities Attributes Relationships. Database Design. 8/21/2014. 8/21/2014. 5. 5. From E-R Diagram to Relations. Relations Schemas Converting E-R diagram to relations Keys Super keys

garron
Télécharger la présentation

CS405G: Introduction to Database Systems

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. CS405G: Introduction to Database Systems Final Review

  2. Database Design Jinze Liu @ University of Kentucky

  3. E-R model • E-R model • Entities • Attributes • Relationships

  4. Jinze Liu @ University of Kentucky

  5. Database Design 8/21/2014 8/21/2014 5 5

  6. From E-R Diagram to Relations • Relations • Schemas • Converting E-R diagram to relations • Keys • Super keys • Candidate keys • Primary keys • Relational integrity constraints

  7. Key Constraints • Superkey: • (Uniqueness constraints) A set of attributes where no two distinct tuples can have the same values • Every relation has at least one superkey: • The set of all attributes. • Key: A minimal superkey • Uniqueness constraint (superkey) • Minimum Constraint • No attribute can be removed and still satisfy the uniqueness constraints. Jinze Liu @ University of Kentucky

  8. Relational Integrity Constraints Constraints are conditions that must hold on all valid relation instances. There are four main types of constraints: Domain constraints The value of a attribute must come from its domain Key constraints Entity integrity constraints Referential integrity constraints 8/21/2014 8/21/2014 8 8

  9. Database Normalization • Functional Dependency • Functional Closure • Keys • Redefined • Based on functional dependency • DB Norm Form • 1st, 2nd, 3rd, BCNF

  10. Database Query Luke Huan Univ. of Kansas

  11. Relational Algebra and SQL • Relational algebra • SQL query • SFW • Group by …, Having • Subqueries • Relationship between R.A. and SQL

  12. RelOp RelOp Relational algebra A language for querying relational databases based on operators: • Core set of operators: • Selection, projection, cross product, union, difference, and renaming • Additional, derived operators: • Join, natural join, intersection, etc. • Compose operators to make complex queries Jinze Liu @ University of Kentucky

  13. Summary of core operators σpR πLR RXS RS R- S ρS(A1, A2, …)R • Selection: • Projection: • Cross product: • Union: • Difference: • Renaming: • Does not really add “processing” power Jinze Liu @ University of Kentucky

  14. Summary of derived operators RpS RS RS • Join: • Natural join: • Intersection: Jinze Liu @ University of Kentucky

  15. Selection: σpR Projection: πLR Cross product: RXS Join: RpS Natural join: RS Union: RUS Difference: R-S Intersection: R∩S Monotone Monotone Monotone Monotone Monotone Monotone Monotone w.r.t. R; non-monotone w.r.t S Monotone Classification of relational operators Jinze Liu @ University of Kentucky

  16. Update Operations on Relations • Update operations • INSERT a tuple. • DELETE a tuple. • MODIFY a tuple. • Constraints should not be violated in updates Jinze Liu @ University of Kentucky

  17. Basic queries: SFW statement • SELECT A1, A2, …, AnFROM R1, R2, …, RmWHERE condition; • Also called an SPJ (select-project-join) query • (almost) Equivalent to relational algebra query π A1, A2, …, An (σ condition (R1XR2 X … X Rm)) Luke Huan Univ. of Kansas

  18. Semantics of SFW • SELECT E1, E2, …, EnFROM R1, R2, …, RmWHERE condition; • For each t1 in R1: For each t2 in R2: … … For each tm in Rm: If condition is true over t1, t2, …, tm: Compute and output E1, E2, …, En as a row • t1, t2, …, tm are often called tuple variables • Not 100% correct, we will see Luke Huan Univ. of Kansas

  19. Operational semantics of GROUPBY SELECT … FROM … WHERE … GROUP BY …; • Compute FROM • Compute WHERE • Compute GROUPBY: group rows according to the values of GROUPBY columns • Compute SELECT for each group • For aggregation functions with DISTINCT inputs, first eliminate duplicates within the group • Number of groups = number of rows in the final output Jinze Liu @ University of Kentucky

  20. Database Design Jinze Liu @ University of Kentucky

  21. physical data organization • Storage hierarchy (DC vs. Pluto) !count I/O’s • Disk geometry: three components of access cost; random vs. sequential I/O • Data layout • Record layout (handling variable-length fields, NULL’s) • Block layout (NSM, PAX) ! inter-/intra-record locality • Access paths • Primary versus secondary indexes • Tree-based indexes: ISAM, B+-tree !Again, reintroduce redundancy to improve performance ! Fundamental trade-off: query versus update cost

  22. Performance Issues on Indexes • Indexes • ISAM • B+ Tree • Metrics • Storage • IO-costs • Operations • Single value query & range query • Insertion and deletion

  23. Query Processing Implementation • Typical Query Processings • Selection • Join • Set operations. • Typical approaches • Sequential scans in unsorted database • Sorted database • What are the tradeoffs.

More Related