1 / 31

Major Topics

Major Topics. Databases Normalization Data warehouses Data mining. Objectives for Data Storage Design. The data must be available when the user wants to use it The data must have integrity It must be accurate and consistent

Télécharger la présentation

Major Topics

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Major Topics • Databases • Normalization • Data warehouses • Data mining Chapter 17 Designing Databases

  2. Objectives for Data Storage Design • The data must be available when the user wants to use it • The data must have integrity • It must be accurate and consistent • Efficient storage of data as well as efficient updating and retrieval • The information retrieval be purposeful Chapter 17 Designing Databases

  3. Computer Based Data Storage • Approach: • Using individual files • Each file unique to a particular application • Using database • used in many different applications by many uses Chapter 17 Designing Databases

  4. Objectives of Effective Databases • Ensuring that data can be shared among users for a variety of applications • Maintaining data that are both accurate and consistent • Ensuring all data required for current and future applications will be readily available • Allowing the database to evolve and the needs of the users to grow • Allowing users to construct their personal view of the data without concern for the way the data are physically stored Chapter 17 Designing Databases

  5. Metadata • Metadata is the information that describes data in the file or database • Used to help users understand the form and structure of the data Chapter 17 Designing Databases

  6. Entity-Relationship Concepts • Entities are objects or events for which data is collected and stored • Relationships are associations between entities Chapter 17 Designing Databases

  7. Entities • A distinct collection of data for one person, place, thing, or event • Entities become files of database tables Chapter 17 Designing Databases

  8. Associative Entity • Associative Entity - links two entities • An associative entity can only exist between two entities • Associative entities become database tables Chapter 17 Designing Databases

  9. Associative Entity Connections • Each entity end has a “one” connection • The associative entity has a “many” connection on each side Chapter 17 Designing Databases

  10. Key fields for Associative Entity • The primary key for each “one” end is a foreign key on the associative entity • Both foreign keys concatenated together become the primary key Chapter 17 Designing Databases

  11. Attributive Entity (pg. 39) • Attributive Entity - describes attributes, especially repeating elements • Attributive entities becomes database tables Chapter 17 Designing Databases

  12. Example Chapter 17 Designing Databases

  13. Types of Keys • Primary key, unique for the record • Secondary key, a key which may not be unique • Concatenated key, a combination of two or more data items for the key • Foreign key, a data item in one record that is the key of another record Chapter 17 Designing Databases

  14. Types of Files • Master file • Have large records • Contain all pertinent information about an entity • Transaction file • Are short records • Contain information used to update master files • Table file: contains data to calculate more data • Work file: for quick record access • Report file: various reports generated for printing Chapter 17 Designing Databases

  15. Databases Organization • A database is intended to be shared by many users • Three structures for storing database files: • Hierarchical database structures • Network database structures • Relational database structures Chapter 17 Designing Databases

  16. Normalization • Normalization is the transformation of complex user views and data to a set of smaller, stable, and easily maintainable data structures • Normalization creates data that are stored only once on a file • The exception is key fields • This eliminates redundant data storage • It provides ideal data storage for database systems Chapter 17 Designing Databases

  17. Three Steps of Data Normalization • Remove all repeating groups and identify the primary key • Ensure that all nonkey attributes are fully dependent on the primary key • Remove any transitive dependencies, attributes which are dependent on other nonkey attributes Chapter 17 Designing Databases

  18. Normalization Chapter 17 Designing Databases

  19. First Normal Form (1NF) • Remove any repeating groups • All repeating groups are moved into a new table • Foreign keys are used to link the tables • When a relation contains no repeating groups, it is in the first normal form • Keys must be included to link the relations, tables Chapter 17 Designing Databases

  20. Example: To 1NF Chapter 17 Designing Databases

  21. Second Normal Form (2NF) • Remove any partial dependencies • A partial dependency is when the data are only dependent on a part of a key field • A relation is created for the data that are only dependent on part of the key and another for data that are dependent on both parts Chapter 17 Designing Databases

  22. Example: To 2NF Chapter 17 Designing Databases

  23. Third Normal Form (3NF) • Remove any transitive dependencies • A transitive dependency is when a relation contains data that are not part of the entity • The problem with transitive dependencies is updating the data • A single data item may be present on many records Chapter 17 Designing Databases

  24. Example: To 3NF Chapter 17 Designing Databases

  25. Example: ERD Chapter 17 Designing Databases

  26. Data Warehouses • Data warehouses are used to organize information for quick and effective queries Chapter 17 Designing Databases

  27. Data Warehouses and Database • In the data warehouse, data are organized around major subjects • Data in the warehouse are stored as summarized rather than detailed raw data • Data in the data warehouse cover a much longer time frame than in a traditional transaction-oriented database • Data warehouses are organized for fast queries • Data warehouses are usually optimized for answering complex queries, known as OLAP Chapter 17 Designing Databases

  28. Data Warehouses and Database • Data warehouses allow for easy access via data-mining software called software • Data warehouses include multiple databases that have been processed so that data are uniformly defined, containing what is referred to as “clean” data • Data warehouses usually contain data from outside sources Chapter 17 Designing Databases

  29. Data Mining • Statistical analysis • Decision trees • Neural networks • Fuzzy logic • Clustering Chapter 17 Designing Databases

  30. Data Mining Patterns • Data mining patterns that decision makers try to identify include • Associations, patterns that occur together • Sequences, patterns of actions that take place over a period of time • Clustering, patterns that develop among groups of people • Trends, the patterns that are noticed over a period of time Chapter 17 Designing Databases

  31. Web Based Databases and XML • Web-based databases are used for sharing data • Extensible markup language (XML) is used to define data used primarily for business data exchange over the Web • An XML document contains only data and the nature of the data Chapter 17 Designing Databases

More Related