600 likes | 811 Vues
Databases. Chapter 5. Objectives. Understanding the strategic advantage of using Databases? How does it support operations? How does it support management? What are the “best practices” today?. Concepts. Major types of databases. Data warehouses and data mining. Logical data elements.
E N D
Databases Chapter 5
Objectives • Understanding the strategic advantage of using Databases? • How does it support operations? • How does it support management? • What are the “best practices” today?
Concepts • Major types of databases. • Data warehouses and data mining. • Logical data elements. • Fundamental database structures. • Database development.
Databases are Important and Exciting • Q: Why are databases so important? • A: Today’s business enterprises cannot survive or succeed without quality data about their internal operations and external environment. • Q: Why are database so exciting? • A: This question makes no sense; its like asking why are Brad Pitt and Angelina Jolie so hot?
Data Resource Management Definition: • A managerial activity that applies information systems technologies to the task of managing an organization’s data resources to meet the information needs of their business stakeholders. • Deep down inside, you all want to be data resource managers; you really do; trust me!
Foundation Data Concepts • Character – single alphabetic, numeric or other symbol • Field – group of related characters • Entity – person, place, object or event • Attribute – characteristic of an entity
Foundation Data Concepts • Record – collection of attributes that describe an entity • File – group of related records • Database – integrated collection of logically related data elements
Entities and Relationships • Entities • Real world things with attributes • Abstract things that can be recorded • A Bill, A Person, A Building, A Product, A Fishmonger, A Recipe • Relationships • How are the entities related • How are entities connected logically
Relationships • ParenthasChild • Customer makes An Order • Student takes A Class • Fishmonger works at A Dock • Recipe includes An Ingredient • Brad Pitt dates Angelina Jolie
Types of Databases • Operational – store detailed data needed to support the business processes and operations of a company • Distributed – databases that are replicated and distributed in whole or in part to network servers at a variety of sites • External – contain a wealth of information available from commercial online services and from many sources on the World Wide Web • Hypermedia – consist of hyperlinked pages of multimedia
Data Warehouse Definition: • Large database that stores data that have been extracted from the various operational, external, and other databases of an organization
Data Mart Definition: • Databases that hold subsets of data from a data warehouse that focus on specific aspects of a company, such as a department or a business process
Data Mining Definition: • Analyzing the data in a data warehouse to reveal hidden patterns and trends in historical business activity
Data Mining Uses • Perform “market-basket analysis” to identify new product bundles. • Find root causes to quality or manufacturing problems. • Prevent customer attrition and acquire new customers. • Cross-sell to existing customers. • Profile customers with more accuracy.
Traditional File Processing Definition: • Data are organized, stored, and processed in independent files of data records
Flat file BUS100,ACCT 1,SMITH J.,M 10-11 BUS100,ACCT 1,JONES A., T 11-12 BUS200,ACCT 2,SMITH J., T 1-2 BUS200,ACCT 2,ANDERSON B.,T 3-4 CSI120,PROG 1,WILLIAMS C.,W 4-6
Problems with “flat files” • Concurrent access update problems • Serialization • Searching for a record – sequential • Inserting a record • Update/change size of record • Deleting a record
Problems of File Processing • Data Redundancy – duplicate data requires an update to be made to all files storing that data • Lack of Data Integration – data stored in separate files require special programs for output making ad hoc reporting difficult • Data Dependence – programs must include information about how the data is stored so a change in storage format requires a change in programs
Database Management Approach Definition: • Consolidates data records into one database that can be accessed by many different application programs. • Software interface between users and databases • Data definition is stored once, separately from application programs
Database Management Software (DBMS) Definition: • Software that controls the creation, maintenance, and use of databases
Database Interrogation Definition: • Capability of a DBMS to report information from the database in response to end users’ requests • Query Language – allows easy, immediate access to ad hoc data requests • Report Generator - allows quick, easy specification of a report format for information users have requested
Database Structures • Hierarchical – relationships between records form a hierarchy or treelike structure • Network – data can be accessed by one of several paths because any data element or record can be related to any number of other data elements
Relational Database Structure Definition: • All data elements within the database are viewed as being stored in the form of simple tables
Modeling Relationshipsusing Tables 1:1 1:Many (1:N) Many : Many (N:M)
Instructors Compensation One to One PK
Instructors Requires FOREIGN KEY Sections One to Many
Many-to-many Instructor/CoursesNo-No’s: Multiple entry fields
Instructors Courses Section – linking Table FK FK
Searching a file • UNSORTED file: • Sequential search • On average, must read N/2 records • Very inefficient
Search Keys • Primary keys – unique (SSN) or SID ! • ORDERS the Table and is always indexed • Must be unique • Indexed allows fast search • Secondary keys – together make the record unique (NAME, BIRTHDAY)
Search a sorted file(using direct access) • Can use Binary Search for a record • Skip to “Middle” of file • Compare search key with record • If smaller – record in first half of file • If larger – record in second half of file • Continue until record found or no more records to search • Efficient – about Log base 2(N) probes
Indexing • Adding additional navigation information • Help locate records quickly • Primary Index: • Contains KEY field and BLOCK number • Only KEY of first record in BLOCK • Do binary search on (smaller) Index file
Secondary index • Need way to also quickly find records by an attribute that the file is not stored by. (non-primary index) • EX: file stored by customer number, need to reference by name as well.
Database approach:advantages • Centralized repository • Program/db independence • Can form new data relationships • Reduce redundancy • Data integrity – transaction rollback • Data recovery – log files