
CHAPTER 4

CHAPTER 4. Data and Knowledge Management. Chapter Outline. 4.1 Managing Data 4.2 The Database Approach 4.3 Database Management Systems 4.4 Data Warehousing 4.5 Data Governance 4.6 Knowledge Management. Learning Objectives.

luz




Presentation Transcript


  1. CHAPTER 4 Data and Knowledge Management

  2. Chapter Outline • 4.1 Managing Data • 4.2 The Database Approach • 4.3 Database Management Systems • 4.4 Data Warehousing • 4.5 Data Governance • 4.6 Knowledge Management

  3. Learning Objectives • Recognize the importance of data, the issues involved in managing data, and the data life cycle. • Describe the sources of data and explain how data are collected. • Explain the advantages of the database approach. • Explain the operation of data warehousing and its role in decision support. • Explain data governance and how it helps to produce high-quality data. • Define knowledge, and describe different types of knowledge.

  4. Examples of Data Sources Credit card swipes RFID tags Digital video surveillance Radiology scans E-mails Blogs

  5. Chapter Opening Case Push Model Products

  6. Chapter Opening Case Pull Model Orders

  7. 4.1 Managing Data • Difficulties in Managing Data • Amount of data increases exponentially. • Data are scattered and collected by many individuals using various methods and devices. • Data come from many sources. • Data security, quality and integrity are critical.

  8. Difficulties in Managing Data • An ever-increasing amount of data needs to be considered in making organizational decisions. The Data Deluge http://www.applimation.com/

  9. Data Life Cycle (Figure 4.1) • Businesses run on data that have been processed or transformed into information and knowledge. • Figure 4.1 illustrates the processing of data into information and ultimately knowledge.

  10. Data, Information, Knowledge, Wisdom • Putting data, information, knowledge, and wisdom into perspective.

  11. What is the meaning of Data, Information, Knowledge, and Wisdom? • At your tables, take a few minutes and try to define these terms.

  12. What is the meaning of Data, Information, Knowledge, and Wisdom? • Data Item • An elementary description of things, events, activities and transactions that are recorded, classified and stored but are not organized to convey any specific meaning. • Information • Data organized so that they have meaning and value to the recipient. • Knowledge • Data and/or information organized and processed to convey understanding, experience, accumulated learning and expertise as they apply to a current problem or activity. • Wisdom • The quality or state of being wise; knowledge of what is true or right coupled with just judgment as to action.

  13. 4.2 The Database Approach • A database management system (DBMS) provides all users with access to all the data. • DBMSs minimize the following data management problems: • Data redundancy: • The same data are stored in many places. • Data isolation: • Applications cannot access data associated with other applications. • Data inconsistency: • Various copies of the data do not agree.

  14. Database Approach (continued) • DBMSs maximize the following: • Data security: • Keeping the organization’s data safe from theft, modification, and/or destruction. • Data integrity: • Data must meet constraints (e.g., student grade point averages cannot be negative). • Data independence: • Applications and data are independent of one another. Because they are not linked to each other, application logic can be changed without modifying the database, and the database can be changed without modifying the applications.
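
The data integrity example above (grade point averages cannot be negative) can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the table name `student`, its columns, and the helper `try_insert_gpa` are hypothetical and do not come from the slides.

```python
import sqlite3

def try_insert_gpa(gpa):
    """Return True if the DBMS accepts the row, False if the constraint rejects it."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table: the CHECK clause states the integrity rule once,
    # so every application that writes to the table is held to it.
    conn.execute(
        "CREATE TABLE student ("
        " student_id INTEGER PRIMARY KEY,"
        " gpa REAL CHECK (gpa >= 0)"
        ")"
    )
    try:
        conn.execute("INSERT INTO student (student_id, gpa) VALUES (1, ?)", (gpa,))
        return True
    except sqlite3.IntegrityError:
        # Raised by sqlite3 when a CHECK constraint fails.
        return False
    finally:
        conn.close()
```

Calling `try_insert_gpa(3.5)` succeeds, while `try_insert_gpa(-1.0)` is rejected by the database itself rather than by application code.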

  15. Database Management Systems

  16. Data Hierarchy (some DBMS Terminology) • A bit • a binary digit, that is, a “0” or a “1”. • A byte • a group of eight bits that represents a single character (e.g., a letter, number or symbol). • A field • a group of logically related characters (e.g., a word, small group of words, or identification number). • A record • a group of logically related fields (e.g., student in a university database). • A file • a group of logically related records. • A database • a group of logically related files.
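
The hierarchy above can be walked from the bottom up in a few lines of Python. This is a rough sketch: the sample student record and the names `char_to_bits`, `student_file`, and `database` are invented for illustration, and the one-byte-per-character view assumes plain ASCII text.

```python
def char_to_bits(ch):
    """Return the eight bits (one byte) that encode a single ASCII character."""
    return format(ord(ch), "08b")

# Field: a group of logically related characters (hypothetical values).
student_id = "1001"
name = "Ann"

# Record: a group of logically related fields.
record = {"student_id": student_id, "name": name}

# File: a group of logically related records.
student_file = [record]

# Database: a group of logically related files.
database = {"students": student_file}
```

For example, `char_to_bits("A")` gives `"01000001"`, the byte behind the first character of the `name` field.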

  17. Hierarchy of Data for a Computer-Based File

  18. Data Hierarchy (continued) Bit (binary digit) Byte (eight bits)

  19. See Digital Data Representation Handout • Review Digital Data Representation Handout

  20. Data Hierarchy (continued) • Example of Field and Record

  21. Data Hierarchy (continued) Example of a Database Form.

  22. Designing the Database • Data Model • A diagram that represents the entities in the database and their relationships. • Data Model Components • Entity • An entity is a person, place, thing, or event about which information is maintained. • A record is a database instance of an entity. • Attribute • A particular characteristic or quality of a particular entity. • Primary Key • A field that uniquely identifies a record. • Non-key Attributes • A property or characteristic of an entity that is not part of the key

  23. Entity Example
      Entity: MOVIE
      Attributes: Movie Number | Name | Rating | Rental Rate
      Instances:
        12345345 | Die Hard | PG13 | $3
        23456781 | Wings | PG | $2
        65656565 | Black Beauty | G | $2
      Entity: CUSTOMER
      Attributes: Cust Number | Name | Address | Status Code
      Instances:
        123-345 | Tom Jones | 12 Oak St | OK
        789-789 | Mary Sullivan | 456 Hill Ave | Pend
        567-342 | Bob Waters | 7676 Scutter Rd | OK
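
As a hedged illustration of entities, attributes, and primary keys, the MOVIE instances from this slide can be loaded into an in-memory SQLite table. The column names, the integer type for the rental rate, and the helpers `lookup` and `duplicate_key_rejected` are assumptions made for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Each column is an attribute; movie_number is the primary key.
conn.execute(
    "CREATE TABLE movie ("
    " movie_number INTEGER PRIMARY KEY,"
    " name TEXT,"
    " rating TEXT,"
    " rental_rate INTEGER"  # whole dollars, an assumption for this sketch
    ")"
)

# Each row is one instance (record) of the MOVIE entity, taken from the slide.
movies = [
    (12345345, "Die Hard", "PG13", 3),
    (23456781, "Wings", "PG", 2),
    (65656565, "Black Beauty", "G", 2),
]
conn.executemany("INSERT INTO movie VALUES (?, ?, ?, ?)", movies)

def lookup(movie_number):
    """Use the primary key to find exactly one record, or None."""
    row = conn.execute(
        "SELECT name FROM movie WHERE movie_number = ?", (movie_number,)
    ).fetchone()
    return row[0] if row else None

def duplicate_key_rejected():
    """Show that a second record with an existing key value is refused."""
    try:
        conn.execute("INSERT INTO movie VALUES (12345345, 'Other', 'G', 1)")
        return False
    except sqlite3.IntegrityError:
        return True
```

Because the primary key uniquely identifies a record, `lookup(23456781)` can only ever match one movie, and inserting a second row with key 12345345 fails.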

  24. Entity Attribute Try it … • Copy #-The sequence number of the item available for rent. Used to differentiate multiple copies of a Movie. • Customer # (Fk2)-Unique identifier of an individual authorized to rent a Movie. • Late Status-A status code identifying if the rental item has not been returned by the Return Date. • Length-The running time in minutes of the item available for rent. • Movie #-Unique identifier of the item available for rent. • Movie Rental-An instance of a Movie being rented by a customer. • Movie Type-The genre or classification associated with the items available for rent. • Movie-An item that is available to rent, a motion picture or television production. • MPAA Rating-Motion Picture Association of America evaluation. Valid values are: G, PG, PG-13, R, and NC-17. • Rent Date-The date a Movie is rented by a Customer. • Return Date-The date a rented Movie is to be returned to the store for restocking. • Title-The name of the item available for rent.

  25. Entity-Relationship Modeling • Database designers plan the database design in a process called entity-relationship (ER) modeling. • ER diagrams consist of entities, attributes and relationships. • Other concepts • Entity classes • Groups of entities of a certain type. • Instance • The representation of a particular entity. • Identifiers • Attributes that are unique to that entity instance.

  26. Sample Information Model (Relational - IDEF 1X) (SET TYPE)

  27. Entity-Relationship Diagram Model

  28. 4.3 Database Management Systems Key Definitions • Database management system (DBMS) • A set of programs that provide users with tools to add, delete, access, and analyze data stored in one location. • Relational database model • A popular type of DBMS that is based on the concept of two-dimensional tables. • Structured Query Language (SQL) • SQL is a standard interactive and programming language for querying and modifying data and managing databases. • The core of SQL is formed by a command language that allows the retrieval, insertion, updating, and deletion of data, and performing management and administrative functions. • Query by Example (QBE) • allows users to fill out a grid or template to construct a filter or description of the data one wants.
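
The four core SQL operations named above (insertion, retrieval, updating, deletion) can be tried out with Python's sqlite3 module. This is a minimal sketch: the `customer` table layout is an assumption, and the sample rows are borrowed from the CUSTOMER entity on slide 23.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    " cust_number TEXT PRIMARY KEY,"
    " name TEXT,"
    " status TEXT"
    ")"
)

# Insertion: add two records.
conn.execute("INSERT INTO customer VALUES ('123-345', 'Tom Jones', 'OK')")
conn.execute("INSERT INTO customer VALUES ('789-789', 'Mary Sullivan', 'Pend')")

# Retrieval: query the names of customers whose status is pending.
pending = conn.execute(
    "SELECT name FROM customer WHERE status = 'Pend'"
).fetchall()

# Updating: change one customer's status.
conn.execute("UPDATE customer SET status = 'OK' WHERE cust_number = '789-789'")

# Deletion: remove one customer.
conn.execute("DELETE FROM customer WHERE cust_number = '123-345'")

remaining = conn.execute("SELECT cust_number, status FROM customer").fetchall()
```

After these statements, `pending` holds the one customer who was pending at query time, and `remaining` holds the single surviving record with its updated status.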

  29. Example of a Relational Database Table

  30. Normalization • A set of rules for analyzing the attributes of an information model • Eliminate model redundancy • Ensure model consistency • Verify structural correctness • Maximize stability • However, normalization cannot validate a model's accuracy in reflecting the business meaning of the information

  31. Normal Forms • Sequential steps for achieving an optimized and logically desirable information model • Provides a common foundation from which an efficient physical database design can be created • There are six degrees of normal form - the first three are usually sufficient for most modeling applications • First normal form • Second normal form • Third normal form • Boyce/Codd normal form • Fourth normal form • Fifth normal form

  32. First Normal Form - (1NF) • Every key and non-key attribute of an entity must be single valued • No entity instance can have multiple values for a given attribute • i.e., The No Repeat Rule • A violating entity is corrected by removing repeating or multivalued attributes to another, dependent (child) entity

  33. First Normal Form - Example
      Before (violates 1NF): RESTAURANT
        REST NAME | ADDRESS | PHONE # | EMPLOYEE NAME
        BURGER KING | 123 NORTH ST | 123-2345 | JOHN, SUE, LISA
        TACO HOUSE | 345 126TH PLACE | 765-8907 | MARY, BILL
        FISH COMPANY | 77 SUNSET AVE | 395-5682 | ED, SAM, JOSE, RICK
      After (1NF): RESTAURANT employs EMPLOYEE
        RESTAURANT: REST NAME | ADDRESS | PHONE #
        EMPLOYEE: REST NAME | EMPLOYEE NAME | POSITION
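
One way to picture the 1NF correction is as a small Python transformation over the slide's RESTAURANT rows. The function name `to_first_normal_form` and the dictionary keys are hypothetical; the POSITION attribute is left out because the slide gives no values for it.

```python
# Unnormalized rows from the slide: EMPLOYEE NAME holds several values.
unnormalized = [
    {"rest_name": "BURGER KING", "address": "123 NORTH ST",
     "phone": "123-2345", "employee_name": "JOHN, SUE, LISA"},
    {"rest_name": "TACO HOUSE", "address": "345 126TH PLACE",
     "phone": "765-8907", "employee_name": "MARY, BILL"},
    {"rest_name": "FISH COMPANY", "address": "77 SUNSET AVE",
     "phone": "395-5682", "employee_name": "ED, SAM, JOSE, RICK"},
]

def to_first_normal_form(rows):
    """Move the repeating EMPLOYEE NAME values into a dependent (child) entity."""
    restaurants = []
    employees = []
    for row in rows:
        # Parent entity keeps only single-valued attributes.
        restaurants.append({
            "rest_name": row["rest_name"],
            "address": row["address"],
            "phone": row["phone"],
        })
        # Child entity gets one row per employee, keyed back to the restaurant.
        for name in row["employee_name"].split(","):
            employees.append({
                "rest_name": row["rest_name"],
                "employee_name": name.strip(),
            })
    return restaurants, employees

restaurants, employees = to_first_normal_form(unnormalized)
```

The three restaurant rows stay as they are minus the repeating attribute, and the nine employee names become nine single-valued EMPLOYEE rows.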

  34. Second Normal Form - (2NF) • An entity is in second normal form if it is in first normal form and each non-key attribute depends on the entire primary key • No non-key attribute instance can be determined by knowing just part of an entity instance's key • A violating entity is corrected by moving any attributes that depend on only a subset of the primary key to a parent entity

  35. Second Normal Form - Example
      Before (violates 2NF): RESTAURANT ORDER
        REST NAME | SUPPLIER NAME | ORDER ITEM | SUPPLIER PHONE #
        BURGER KING | SAM'S PRODUCE | BEEF | 123-2345
        TACO HOUSE | SALSA INC. | PEPPERS | 765-8907
        FISH COMPANY | SAM'S PRODUCE | SNAPPER | 123-2345
      After (2NF): SUPPLIER fills RESTAURANT ORDER
        SUPPLIER: SUPPLIER NAME | PHONE #
        RESTAURANT ORDER: REST NAME | SUPPLIER NAME (FK1) | ORDER ITEM
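
The 2NF correction in this example can be sketched as follows, using the slide's RESTAURANT ORDER rows. The function name `to_second_normal_form` is hypothetical, and the sketch assumes that the supplier phone number depends only on the supplier name, which is one part of the order's key.

```python
# Rows from the slide: (rest name, supplier name, order item, supplier phone #).
orders = [
    ("BURGER KING", "SAM'S PRODUCE", "BEEF", "123-2345"),
    ("TACO HOUSE", "SALSA INC.", "PEPPERS", "765-8907"),
    ("FISH COMPANY", "SAM'S PRODUCE", "SNAPPER", "123-2345"),
]

def to_second_normal_form(rows):
    """Move the attribute that depends on part of the key to a parent entity."""
    suppliers = {}          # SUPPLIER: supplier name -> phone #
    restaurant_orders = []  # RESTAURANT ORDER: supplier name remains as a foreign key
    for rest_name, supplier_name, order_item, supplier_phone in rows:
        suppliers[supplier_name] = supplier_phone
        restaurant_orders.append((rest_name, supplier_name, order_item))
    return suppliers, restaurant_orders

suppliers, restaurant_orders = to_second_normal_form(orders)
```

The phone number for SAM'S PRODUCE, stored twice in the original rows, now appears once in `suppliers`.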

  36. Third Normal Form - (3NF) • An entity is in third normal form if it is in second normal form and each non-key attribute depends only on the entire primary key and on nothing other than the key • No non-key attribute instance can be determined by knowing the value of another non-key attribute for the same instance • A violating entity is corrected by moving to a parent entity any attributes exhibiting transitive dependencies (non-key attributes that depend not only on the whole key but also on other non-key attributes)

  37. Third Normal Form - Example
      Before (violates 3NF): RESTAURANT RESERVATION
        REST NAME | RES # | CUST NAME | CUST PH # | TIME | # IN PARTY
        BURGER KING | 12 | F. JONES | 123-2345 | 11:00 AM | 4
        TACO HOUSE | 234 | R. SMITH | 765-8907 | 2:30 PM | 4
        FISH COMPANY | 88 | F. JONES | 123-2345 | 8:15 PM | 6
      After (3NF): CUSTOMER makes RESTAURANT RESERVATION
        CUSTOMER: CUSTOMER NAME | PHONE #
        RESTAURANT RESERVATION: REST NAME | RESERVATION # | CUSTOMER NAME (FK1) | TIME | # IN PARTY
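
A possible SQLite rendering of the 3NF result shows the practical payoff: once the phone number lives only in CUSTOMER, a single update is seen by every reservation. The column names, the composite key on (rest_name, reservation_no), and the replacement phone number 555-0000 are assumptions made for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    customer_name TEXT PRIMARY KEY,
    phone TEXT
);
CREATE TABLE restaurant_reservation (
    rest_name TEXT,
    reservation_no INTEGER,
    customer_name TEXT REFERENCES customer(customer_name),
    time TEXT,
    party_size INTEGER,
    PRIMARY KEY (rest_name, reservation_no)  -- assumed key
);
INSERT INTO customer VALUES ('F. JONES', '123-2345');
INSERT INTO customer VALUES ('R. SMITH', '765-8907');
INSERT INTO restaurant_reservation VALUES ('BURGER KING', 12, 'F. JONES', '11:00 AM', 4);
INSERT INTO restaurant_reservation VALUES ('TACO HOUSE', 234, 'R. SMITH', '2:30 PM', 4);
INSERT INTO restaurant_reservation VALUES ('FISH COMPANY', 88, 'F. JONES', '8:15 PM', 6);
""")

# One update in the parent entity (hypothetical new number).
conn.execute("UPDATE customer SET phone = '555-0000' WHERE customer_name = 'F. JONES'")

# A join rebuilds the original wide view, now with consistent phone numbers.
rows = conn.execute(
    "SELECT r.rest_name, c.phone"
    " FROM restaurant_reservation r"
    " JOIN customer c ON c.customer_name = r.customer_name"
    " ORDER BY r.reservation_no"
).fetchall()
```

Both of F. JONES's reservations pick up the new number through the join, whereas the unnormalized table would have needed two separate edits and could have ended up inconsistent.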

  38. Example #2 Non-Normalized Relation

  39. Normalizing the Database (part A)

  40. Normalizing the Database (part B)

  41. Summary: Normalization Produces Order

  42. Database that Catches Plagiarists P116 A Turnitin originality report http://www.turnitin.com

  43. 4.4 Data Warehousing • Data warehouse • A repository of historical data organized by subject to support decision makers in an organization. • Organized by business dimension or subject. • Data warehouses are multidimensional. • A Data Cube with three dimensions: • customer, • product, and • time.
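
The three-dimensional data cube mentioned above can be approximated with a plain Python dictionary keyed by (customer, product, time). All names and sales figures below are invented for illustration, and `roll_up` is a hypothetical helper, not a function from any OLAP product.

```python
from collections import defaultdict

# Hypothetical facts: (customer, product, quarter) -> units sold.
cube = {
    ("Acme", "Nuts", "Q1"): 50,
    ("Acme", "Bolts", "Q1"): 40,
    ("Acme", "Nuts", "Q2"): 60,
    ("Zenith", "Nuts", "Q1"): 30,
}

DIMENSIONS = ("customer", "product", "time")

def roll_up(cube, keep):
    """Sum every cell of the cube, keeping only the named dimension."""
    axis = DIMENSIONS.index(keep)
    totals = defaultdict(int)
    for key, units in cube.items():
        totals[key[axis]] += units
    return dict(totals)
```

Here `roll_up(cube, "product")` summarizes sales by product across all customers and quarters, and `roll_up(cube, "time")` does the same by quarter, which is the kind of subject-oriented view a data warehouse is organized to support.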

  44. Data Warehousing (continued) • Data warehouses are historical. • Historical data in data warehouses can be used for identifying trends, forecasting, and making comparisons over time. • Data warehouses use Online Analytical Processing (OLAP). • OLAP involves the analysis of accumulated data by end users (usually in a data warehouse). • In contrast, Online Transaction Processing (OLTP) typically involves a database, where data from business transactions are processed online and as soon as they occur.

  45. Data Warehouse Framework & Views • Process of building and using a data warehouse.

  46. Relational Databases • First slide of five showing the relationship between relational databases and a multidimensional data structure (or data cube).

  47. Multidimensional Database View

  48. Equivalence Between Relational and Multidimensional Databases

  49. Equivalence Between Relational and Multidimensional Databases

  50. Equivalence Between Relational and Multidimensional Databases
