1 / 61

Data Warehousing & Business Intelligence Fundamentals

Data Warehousing & Business Intelligence Fundamentals. Agenda. Evolution of DSS Business Intelligence. Evolution of Decision Support Systems. Overview. Most organizations began information processing on a small scale, automating one application at a time.

tracey
Télécharger la présentation

Data Warehousing & Business Intelligence Fundamentals

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Data Warehousing & Business Intelligence Fundamentals

  2. Agenda • Evolution of DSS • Business Intelligence

  3. Evolution of Decision Support Systems

  4. Overview • Most organizations began information processing on a small scale, automating one application at a time. • Systems tended to grow independently to support defined functional areas. • Each functional area tended to plan and develop systems in isolation from other areas.

  5. Evolution of Information Processing • File Transaction Processing • Data Based Management Systems (DBMS) • 4GLs & MIS • Extract Processing • Decision Support Systems (DSS) • Data Warehouses • OLAP • Data Mining

  6. File Transaction Processing • The traditional approach to file processing encouraged each functional area to develop and maintain specialized applications. • Individual applications ran on unique master files housed on magnetic tapes.

  7. Problems With Traditional File Processing: • Data redundancy • Lack of data integrity • Program-data dependency • Lack of flexibility • Poor security • Lack of data sharing and availability

  8. Traditional File Processing Data redundancy and inconsistency across files: Registration Application Library Application Financial aids Application Credit Records Application File A Student ID Name Address ZIP Code Phone Number File B Student ID First Name Last Name Address & ZIP Phone Number File C Student ID First & Last Name Address ZIP Code Phone Number File D Social Security Name Address ZIP Code Phone Number

  9. Evolution of Information Processing • File Transaction Processing • Data Based Management Systems (DBMS) • 4GLs & MIS • Extract Processing • Decision Support Systems (DSS) • Data Warehouses • OLAP • Data Mining

  10. Data Base Management Systema single source for all processing Registration IS application Library IS application Financial aids application Credit Records IS application Common Data Dictionary Data Definition Language DBMS- database management system Data Manipulation Language INTEGRATED STUDENTS DATABASE Students: Name Address Credit Records: Number of credits Course number Books: Title Author

  11. Evolution of Information Processing • File Transaction Processing • Data Based Management Systems (DBMS) • 4GLs & MIS • Extract Processing • Decision Support Systems (DSS) • Data Warehouses • OLAP • Data Mining

  12. PCs, 4GLs, MIS • PCs and Fourth-Generation Languages brought access to data and users at the level of the end-user. • MIS – focus on providing pre-specified reports to drive management decisions

  13. Evolution of Information Processing • File Transaction Processing • Data Based Management Systems (DBMS) • 4GLs & MIS • Extract Processing • Decision Support Systems (DSS) • Data Warehouses • OLAP • Data Mining

  14. Extract Programs Program extracting data from files and databases based on criteria. Reasons for Extract Programs • Accessibility • move data out of online processing systems • Performance • perform analytical functions separate from online processing functions • Control • shift in control of the data • the end-user ends up “owing” it

  15. Problems with Natural Evolution of Data Extraction • Credibility of data • Inconsistent conclusions as a result of extracting data at different times, using different criteria, using data from different unsynchronized data sources • Low Productivity • Locating data, multiple storage technologies, multiple communication technologies, different data definitions • Inability to transform data into information • Difficult to integrate extracted data, lack of historical data

  16. Evolution of Information Processing • File Transaction Processing • Data Based Management Systems (DBMS) • 4GLs & MIS • Extract Processing • Decision Support Systems (DSS) • Data Warehouses • OLAP • Data Mining

  17. Decision Support Systems • Computer system at the management level of an organization that combines data, sophisticated analytical models, and user-friendly software to support semi-structured and unstructured decision making. • DSS often tend to be stand-alone systems, developed by end-user groups not under central IS control

  18. Components of DSS • DSS Data base • A collection of current and historical, internal & external data • DSS Model base • A collection of analytical (math, statistical) models that can easily be made accessible to the DSS user. • DSS software system • User interface and development environment that permits easy interaction between the users of the system and the DSS data & model base.

  19. Which are our lowest/highest margin customers ? Who are my customers and what products are they buying? What is the most effective distribution channel? What product promo-tions have the biggest impact on revenue? Which customers are most likely to go to the competition ? What impact will new products/services have on revenue and margins? A producer wants to know….

  20. Data, Data everywhereyet ... • I can’t find the data I need • data is scattered over the network • many versions, subtle differences • I can’t get the data I need • need an expert to get the data • I can’t understand the data I found • available data poorly documented • I can’t use the data I found • results are unexpected • data needs to be transformed from one form to other

  21. Problem: Heterogeneous Information Sources “Heterogeneities are everywhere” Personal Databases World Wide Web Scientific Databases Digital Libraries • Different interfaces • Different data representations • Duplicate and inconsistent information

  22. Problem: Data Management in Large Enterprises • Vertical fragmentation of informational systems (vertical stove pipes) • Result of application (user)-driven development of operational systems Sales Planning Suppliers Num. Control Stock Mngmt Debt Mngmt Inventory ... ... ... Sales Administration Finance Manufacturing ...

  23. What are the users saying... • Data should be integrated across the enterprise • Summary data has a real value to the organization • Historical data holds the key to understanding data over time • What-if capabilities are required

  24. Dilemma: Most of the business analysts’ time is not spent in true data analysis • These logistic factors can negatively impact and slow down efficiency and effectiveness of business analysis: • Growing Volume of Data • Data stored in many different systems and formats • The criticality of quick decision making • Introduction to new products and market dynamics • Change in organizational strategies

  25. Evolution of Information Processing • File Transaction Processing • Data Based Management Systems (DBMS) • 4GLs & MIS • Extract Processing • Decision Support Systems (DSS) • Data Warehouses • OLAP • Data Mining

  26. Information Data What is Data Warehousing? A process of transforming data into information and making it available to users in a timely enough manner to make a difference [Forrester Research]

  27. Data Warehousing -- It is a process • Technique for assembling and managing data from various sources for the purpose of answering business questions. Thus making decisions that were not previous possible • A decision support database maintained separately from the organization’s operational database

  28. What is a Data Warehouse?A practitioners view “A data warehouse is simply a single, complete, and consistent store of data obtained from a variety of sources and made available to end users in a way they can understand and use it in a business context.” Barry Devlin, IBM Consultant

  29. What is a Data Warehouse? • Defined in many different ways, but not rigorously. • A decision support database that is maintained separately from the organization’s operational database • Support information processing by providing a solid platform of consolidated, historical data for analysis. • “A data warehouse is asubject-oriented, integrated, time-variant, and nonvolatilecollection of data in support of management’s decision-making process.”—W. H. Inmon • Multidimensional database with reporting and query tools, that stores current and historical data extracted from various operational systems and consolidated for management reporting and analysis. • Addresses the problem of integrating key operational data from around the company in a form that is consistent, reliable, and easily available for reporting.

  30. RelationalDatabases ExtractionCleansing Optimized Loader ERP Systems Data Warehouse Engine AnalyzeQuery Purchased Data LegacyData Metadata Repository Data Warehouse Architecture

  31. Warehousing data outside the operational systems • The primary concept of data warehousing is that the data stored for business analysis can most effectively be accessed by separating it from the data in the operational systems. • Fundamental differences between operational and informational (DW) environment: • Nature of the data • Development cycle • Supporting technology • User community • Processing characteristics

  32. Evolution of Information Processing • File Transaction Processing • Data Based Management Systems (DBMS) • 4GLs & MIS • Extract Processing • Decision Support Systems (DSS) • Data Warehouses • OLAP • Data Mining

  33. What Is OLAP? • Online Analytical Processing - coined by EF Codd in 1994 • “A group of technologies and applications that collect, manage, process, and present multidimensional data for analysis and management purposes.” • “A category of database software which provides an interface such that users can transform or limit raw data according to user-defined or pre-defined functions, and quickly and interactively examine the results in various dimensions of the data.”

  34. OLTP database applications are developed to meet the day-to-day and operational data retrieval needs of end-users Provide read-write capability Data Warehouses along with OLAP tools are being developed to meet information exploration and historical trend analysis management needs Provides read-only capability OLTP vs. OLAP

  35. Evolution of Information Processing • File Transaction Processing • Data Based Management Systems (DBMS) • 4GLs & MIS • Extract Processing • Decision Support Systems (DSS) • Data Warehouses • OLAP • Data Mining

  36. Data Mining • The exploration and analysis, by automatic or semiautomatic means, of large quantities of data in order to discover valid, meaningful patterns and rules to assist with business decisions.

  37. Data Mining works with Warehouse Data • Data Warehousing provides the Enterprise with a memory • Data Mining provides the Enterprise with intelligence

  38. We want to know ... • Given a database of 100,000 names, which persons are the least likely to default on their credit cards? • Which types of transactions are likely to be fraudulent given the demographics and transactional history of a particular customer? • If I emphasize ease-of-use of the product as opposed to its technical capabilities, what will be the net effect on my revenues? • Which of my customers are likely to be the most loyal? Data Mining helps extract such information

  39. Data Mining Application Areas Industry Application Finance Credit Card Analysis Insurance Claims, Fraud Analysis Telecommunication Call record analysis Transport Logistics management Consumer goods promotion analysis Data Service providers Value added data Utilities Power usage analysis

  40. Data Mining in Use • The US Government uses Data Mining to track fraud • A supermarket becomes an information broker • Basketball teams use it to track game strategy • Cross selling • Holding on to good customers • Weeding out bad customers

  41. What makes data mining possible? • Advances in the following areas are making data mining deployable: • data warehousing • better and more data (i.e., operational, behavioral, and demographic) • the emergence of easily deployed data mining tools and • the advent of new data mining techniques. • -- Gartner Group

  42. Business Intelligence

  43. Objectives • Identify potential solutions to business problems using the organization’s data. • Describe alternatives for distributing business intelligence to the organization. • Describe methods to measure the effectiveness of a business solution.

  44. Business Case Study Sterling Airlines is a small, commercial airline. For several years, quarterly reports have shown gross revenues unchanged, while operating costs have continued to rise.

  45. Business Case Study In recent years, the airline industry has experienced increased costs in areas such as • operations • personnel • greater security measures • equipment maintenance and upgrades.

  46. Business Case Study Sterling Airlines developed questions surrounding its business processes.

  47. Questions about Business Processes 1. What are the time trends in flight delay frequency by hub, location, and day of the week? 2. How much does it cost to offer a compensation package for flight delays? 3. How will costs be affected if the compensation package is changed? 4. What is the impact of delays on repeat business?

  48. Building a Data Warehouse To answer these questions,Sterling Airlines must turnits business data intobusiness intelligence.

  49. Knowledge as Intellectual Capital The data warehouse exists to increase the value of the corporate body of knowledge: • Human capital — potentially valuable staff knowledge relevant to the business • Structural capital — underlying assets for capture, storage, and exchange of information • Relationship capital — with clients, channels, and so on.

  50. Guidelines for Building a Data Warehouse 1. Plan from the top down. 2. Implement from the bottom up. 3. Build your warehouse one subject area at a time. 4. Define, load, and test warehouse data sources and data stores as you go.

More Related