400 likes | 562 Vues
Resource Management Data. Fall 2004. Data Management. MANAGE. COLLECT. STORE. USE. Target Marketing. Data Mining. Processing (Transaction versus Web). Inhouse vs Outsourcing. Data Warehouses. Databases. Data Sources. Represent. Data and their Sources. Types of data public
E N D
Resource ManagementData Fall 2004
Data Management MANAGE COLLECT STORE USE Target Marketing Data Mining Processing (Transaction versus Web) Inhouse vs Outsourcing Data Warehouses Databases Data Sources Represent
Data and their Sources • Types of data • public • private • Sources • internal • external • Implications?
e-Commerce architecture SERVER ENVIRONMENT E-Commerce environment 2nd Layer: Application Layer E-Commerce application (e-shop etc) Request Info Receive Info Client Web Server 3rd Layer: Data Layer 1st Layer: Presentation Layer Database
Databases Database --- A non-redundant collection of logically related records or files. It enables a common pool of data records to serve many processing applications. Database management software --- mechanism for storing and organizing data for sophisticated queries and manipulation of data Relational database – most popular, data organized into tables (Microsoft SQL Server, Oracle)
Database Approach • Data occupies the central position; referenced as needed. • Data sharing: data is not the property of one person; one representation for each piece of data; avoids (minimizes) redundancy. • User Views: allows a user to have a view of the database that is different from the view used by others; user isolated from changes in the data/programs. • Query Language: An English-like language designed for end-users to query the database. • Database Administrator: Specialist who manages the database.
Database System for Bank Client Database Application program User CUSTOMER DATA Cust name SSN Address Savings Account # Loan Account # Investment Account # SAVINGSSYSTEM Database Management System SAVINGS DATA Savings Account # Account balance LOAN SYSTEM LOAN DATA Loan Account # Account balance INVESTMENT DATA Investment Account # Account balance INVESTMENT SYSTEM Source: Dorit Nevo
Database Development • Objective • develop a database that accurately represents the real world. (i.e. “model” the real world.) • Database: a model of an organization. • Any results that the database gives you must be truein the real world. • Any relevantresults about the real world must be obtainable from the database.
Redundant DataConsider the following table that stores data about auto parts and suppliers. This seemingly harmless table contains many potential problems. Suppose you want to add another part?107 Tail Pipe GlassCo 1638 S. Front Seattle WA Disk space is wasted by duplicating data about the supplier. Every time a new part is entered for a particular supplier, all of the supplier data is repeated. Imagine the problems if several suppliers supply hundreds of auto parts each.
Modification AnomalyWhat is GlassCo moves to Olympia? How many rows have to change in order to ensure that the new address is recorded. Again, imagine the issues surrounding modifications of hundreds of rows of data for one supplier. When changes are made, they must be made to all copies of the data. Think about the confusion that results from changing only a subset of the duplicate data.
Deletion AnomalySuppose you no longer carried part number 102 and decided to delete that row from the table?
Now, looking at the remaining data below, what is the address of A1 Auto? A deletion anomaly means that we lose more information than we want. We lose facts about more than one subject with one deletion.
Insertion AnomalyNext, you want to add a new supplier, CarParts, but you have not yet ordered parts from that supplier. What do you add? The situation is called an insertion anomaly. Negatively stated, we cannot add a fact about one subject until we have additional data about another subject.
Data Management MANAGE STORE USE COLLECT Target Marketing Data Mining Transaction Processing Inhouse vs Outsourcing Data Sources Data Warehouses Represent Databases Relational Model Entity Relationship Model
Database DesignRepresentation Entity-Relationship Model ENTITY:Person, place, thing, event about which data must be kept • ATTRIBUTE: Description of a particular ENTITY • KEY FIELD:Field used to retrieve, update, sort RECORD Source: @2002 Prentice Hall
KEY FIELD • Field in each record • Uniquely Identifies THIS Record • For RETRIEVAL UPDATING SORTING Source: @2002 Prentice Hall
STUDENT Mother ONE-TO-ONE: CLASS ONE-TO-MANY: STUDENT A STUDENT B STUDENT C CLASS 1 CLASS 2 MANY-TO-MANY: STUDENT A STUDENT B STUDENT C TYPES OF RELATIONSHIPS
Example: Consulting Company Database You have been asked to create a database for a small consulting company. The company wants to keep track of which employees are assigned to which project and what dates they start and stop working on them. An employee can work on more than one project at a time (as many students know). You also need to keep track of which client sponsors which project(s). Each project usually requires a set of skills so you need to know what skills an employee has and when he or she obtained them. Employees are encouraged to find clients and receive extra compensation for doing so.
Entity-Relationship Model # Emp finds 1 Client-Id Employee N N N start-date Client has end-date date 1 acquired assignedto M sponsors Skill M N Project# Skill- M name N requires Project
Typical Queries • What date was the project called “Metro” completed? • What is the name of the client who sponsors the project called “Pontiac?” • What skills are required for the project called “Virtual Courtyard”? Note: Minimal redundancy in database design
Relational Data Model • One basic construct: the relation. • Relations represent both entities and relationships. • Data Manipulation Language: English-like. • Dominant database structure. • DB2 by IBM • ACCESS by Microsoft • Oracle
Translate E-R Model into Relational Model • Each entity represented by an (entity) relation • N:M relationship represented by a separate (relationship) relation • Key is concatenation (joining together) of entity keys. • Relationship attributes are non keys. • 1:N relationship represented by foreign key, i.e. key of entity on “1” side appears as non key in relation for the entity on the “N” side.
Example: Student-Course Design a database to keep track of what courses a student takes and the grade he or she receives. Entities: Student: [SSN, name, address] Course: [Course-Id, description] Relationships: Student takes Course: [grade] N : M
Student Relation Course Relation
What is Data Quality? 1. Data is accurate— e.g. customer’s name spelled correctly; address correct. 2. Data is stored according to data type— e.g. as character, integer. 3. Data has integrity— backup and recovery procedures. 4. Data is not redundant. 5. Data follows business rules— e.g. loan balance may never be negative.
What is Data Quality (cont’d) 6. Data corresponds to established domains— e.g. employee age 16-65 7. Data is timely— e.g. monthly, weekly, daily, real-time. 8. Data satisfies needs of the business— e.g. marketing (customers, demographics), accounts payable (vendors, products). 9. Data is complete— e.g. all line items for an invoice captured.
What Managers Should Know About Data Modeling • Database operators represent ways in which data can be manipulated to assist in managerial decision-making • Without some sense of the possibilities of queries and reports, managers will have a misconception of what they can expect from a database. • Data modeling is a technique used expertly by professionals • Nevertheless, general managers need to understand the general design issues involved in order to appreciate the effort involved and value of excellent data modeling.
Privacy Issues • Is there information in my files that should not be there? • Is information being used for the purpose it was originally intended? • Is information being shared appropriately (both inside and outside the firm?) • Is information being combined in appropriate ways? • Are decisions that require human judgment being made appropriately? • Are appropriate procedures in place for preventing and correcting errors? Source: Cash, J.I., McFarlan, F.W., McKenney, J.L., and Applegate, L.M., Corporate Information Systems Management: Text and Cases, Homewood. II.
What should managers know about Database Management? • Management of database is an important issue • Although once a technical issue, the management of databases has become increasingly important throughout all types of organizations. • Organizations store and use large quantities of data • Sheer volume of data alone means that proper management is essential. • Data are a valuable resource that must be managed • Value is assured by capturing, validating, and protecting the data.
What should managers know about Database Management? • The wrong approach to managing data adds complexity in the management of organizations. • The management of data should be part of the solution, not part of the problem. • You have a right to influence the management of data you need. • The management of databases is not an activity that should occur in isolation. Those who rely on the data captured and stored in an organization have a need and, in fact, an obligation to be involved in the decisions that affect their use of the data.
Appendix • File organization • Components of database management system • SQL (Structured Query Language)
FILE ORGANIZATION • BIT:Binary Digit (0,1; Y,N; On,Off) • BYTE:Combination of BITS which represent a CHARACTER • FIELD:Collection of BYTES which represent a DATUM or Fact • RECORD:Collection of FIELDS which reflect a TRANSACTION Source: @2002 Prentice Hall
FILE ORGANIZATION • FILE:A Collection of similar RECORDS • DATABASE:An Organization’s Electronic Library of FILES organized to serve business applications Source: @2002 Prentice Hall
COMPONENTS OF DBMS • DATA DEFINITION LANGUAGE • Defines data elements in database • DATA MANIPULATION LANGUAGE • Manipulates data for applications • DATA DICTIONARY • Formal definitions of all variables in database, controls variety of database contents, data elements Source: @2002 Prentice Hall DBMS
STRUCTURED QUERY LANGUAGE (SQL) • DE FACTO STANDARD • DATA MANIPULATION LANGUAGE FOR RELATIONAL DATABASES Source: @2002 Prentice Hall DBMS
ELEMENTS OF SQL • SELECT: List of columns from tables desired • FROM: Identifies tables from which columns will be selected • WHERE: Includes conditions for selecting specific rows, conditions for joining multiple tables Source: @2002 Prentice Hall DBMS