Download
chapter 33 n.
Skip this Video
Loading SlideShow in 5 Seconds..
Chapter 33 PowerPoint Presentation

Chapter 33

281 Views Download Presentation
Download Presentation

Chapter 33

- - - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript

  1. Chapter 33 OLAP Transparencies © Pearson Education Limited 1995, 2005

  2. Chapter 33 - Objectives • The purpose of Online Analytical Processing (OLAP). • The relationship between OLAP and data warehousing. • The key features of OLAP applications. • The potential benefits associated with successful OLAP applications. © Pearson Education Limited 1995, 2005

  3. Chapter 33 - Objectives • How to represent multi-dimensional data. • The rules for OLAP tools. • The main categories of OLAP tools. • OLAP extensions to the SQL standard. • How Oracle supports OLAP. © Pearson Education Limited 1995, 2005

  4. Business Intelligence Technologies • Accompanying the growth in data warehousing is an ever-increasing demand by users for more powerful access tools that provide advanced analytical capabilities. • There are two main types of access tools available to meet this demand, namely Online Analytical Processing (OLAP) and data mining. © Pearson Education Limited 1995, 2005

  5. Business Intelligence Technologies • OLAP and Data Mining differ in what they offer the user and because of this they are complementary technologies. • An environment that includes a data warehouse (or more commonly one or more data marts) together with tools such as OLAP and /or data mining are collectively referred to as Business Intelligence (BI) technologies. © Pearson Education Limited 1995, 2005

  6. Online Analytical Processing (OLAP) • The dynamic synthesis, analysis, and consolidation of large volumes of multi-dimensional data, Codd (1993). • Describes a technology that uses a multi-dimensional view of aggregate data to provide quick access to strategic information for the purposes of advanced analysis. © Pearson Education Limited 1995, 2005

  7. Online Analytical Processing (OLAP) • Enables users to gain a deeper understanding and knowledge about various aspects of their corporate data through fast, consistent, interactive access to a wide variety of possible views of the data. • Allows users to view corporate data in such a way that it is a better model of the true dimensionality of the enterprise. © Pearson Education Limited 1995, 2005

  8. Online Analytical Processing (OLAP) • Can easily answer ‘who?’ and ‘what?’ questions, however, ability to answer ‘what if?’ and ‘why?’ type questions distinguishes OLAP from general-purpose query tools. • Types of analysis ranges from basic navigation and browsing (slicing and dicing) to calculations, to more complex analyses such as time series and complex modeling. © Pearson Education Limited 1995, 2005

  9. OLAP Benchmarks • OLAP Council published an analytical processing benchmark referred to as the APB-1 (OLAP Council, 1998). • Aim is to measure a server’s overall OLAP performance rather than the performance of individual tasks. © Pearson Education Limited 1995, 2005

  10. OLAP Benchmarks • APB-1 assesses the most common business operations including: • bulk loading of data from internal or external data sources • incremental loading of data from operational systems; • aggregation of input level data along hierarchies; © Pearson Education Limited 1995, 2005

  11. OLAP Benchmarks • APB-1 assesses the most common business operations including (continued): • calculation of new data based on business models; • time series analysis; • queries with a high degree of complexity; • drill-down through hierarchies; • ad hoc queries; • multiple online sessions. © Pearson Education Limited 1995, 2005

  12. OLAP Benchmarks • OLAP applications are judged on their ability to provide just-in-time (JIT) information, a core requirement of supporting effective decision-making. • This requirement is more than measuring processing performance but includes its abilities to model complex business relationships and to respond to changing business requirements. © Pearson Education Limited 1995, 2005

  13. OLAP Benchmarks • APB-1 uses a standard benchmark metric called AQM (Analytical Queries per Minute). • AQM represents the number of analytical queries processed per minute including data loading and computation time. Thus, the AQM incorporates data loading performance, calculation performance, and query performance into a singe metric. © Pearson Education Limited 1995, 2005

  14. OLAP Benchmarks • Publication of APB-1 benchmark results must include both the database schema and all code required for executing the benchmark. • An essential requirement of all OLAP applications is the ability to provide users with JIT information, which is necessary to make effective decisions about an organization's strategic directions. © Pearson Education Limited 1995, 2005

  15. OLAP Applications • JIT information is computed data that usually reflects complex relationships and is often calculated on the fly. Also as data relationships may not be known in advance, the data model must be flexible. © Pearson Education Limited 1995, 2005

  16. Examples of OLAP applications in various functional areas © Pearson Education Limited 1995, 2005

  17. OLAP Applications • Although OLAP applications are found in widely divergent functional areas, they all have the following key features: • multi-dimensional views of data • support for complex calculations • time intelligence © Pearson Education Limited 1995, 2005

  18. OLAP Applications - multi-dimensional views of data • Core requirement of building a ‘realistic’ business model. • Provides basis for analytical processing through flexible access to corporate data. • The underlying database design that provides the multi-dimensional view of data should treat all dimensions equally. © Pearson Education Limited 1995, 2005

  19. OLAP Applications - support for complex calculations • Must provide a range of powerful computational methods such as that required by sales forecasting, which uses trend algorithms such as moving averages and percentage growth. • Mechanisms for implementing computational methods should be clear and non-procedural. © Pearson Education Limited 1995, 2005

  20. OLAP Applications – time intelligence • Key feature of almost any analytical application as performance is almost always judged over time. • Time hierarchy is not always used in the same manner as other hierarchies. • Concepts such as year-to-date and period-over-period comparisons should be easily defined. © Pearson Education Limited 1995, 2005

  21. OLAP Benefits • Increased productivity of end-users. • Reduced backlog of applications development for IT staff. • Retention of organizational control over the integrity of corporate data. • Reduced query drag and network traffic on OLTP systems or on the data warehouse. • Improved potential revenue and profitability. © Pearson Education Limited 1995, 2005

  22. Representation of Multi-dimensional Data • Example of two-dimensional query. • What is the total revenue generated by property sales in each city, in each quarter of 2004?’ • Choice of representation is based on types of queries end-user may ask. • Compare representation - three-field relational table versus two-dimensional matrix. © Pearson Education Limited 1995, 2005

  23. Multi-dimensional Data as Three-field table versus Two-dimensional Matrix © Pearson Education Limited 1995, 2005

  24. Representation of Multi-dimensional Data • Example of three-dimensional query. • ‘What is the total revenue generated by property sales for each type of property (Flat or House) in each city, in each quarter of 2004?’ • Compare representation - four-field relational table versus three-dimensional cube. © Pearson Education Limited 1995, 2005

  25. Multi-dimensional Data as Four-field Table versus Three-dimensional Cube © Pearson Education Limited 1995, 2005

  26. Representation of Multi-dimensional Data • Cube represents data as cells in an array. • Relational table only represents multi-dimensional data in two dimensions. © Pearson Education Limited 1995, 2005

  27. Representation of Multi-dimensional Data • Use multi-dimensional structures to store data and relationships between data. • Multi-dimensional structures are best visualized as cubes of data, and cubes within cubes of data. Each side of a cube is a dimension. • A cube can be expanded to include other dimensions. © Pearson Education Limited 1995, 2005

  28. Representation of Multi-dimensional Data • A cube supports matrix arithmetic. • Multi-dimensional query response time depends on how many cells have to be added ‘on the fly’. • As number of dimensions increases, number of the cube’s cells increases exponentially. © Pearson Education Limited 1995, 2005

  29. Representation of Multi-dimensional Data • However, majority of multi-dimensional queries use summarized, high-level data. • Solution is to pre-aggregate (consolidate) all logical subtotals and totals along all dimensions. © Pearson Education Limited 1995, 2005

  30. Representation of Multi-dimensional Data • Pre-aggregation is valuable, as typical dimensions are hierarchical in nature. • (e.g. Time dimension hierarchy - years, quarters, months, weeks, and days) • Predefined hierarchy allows logical pre-aggregation and, conversely, allows for a logical ‘drill-down’. © Pearson Education Limited 1995, 2005

  31. Representation of Multi-dimensional Data • Supports common analytical operations • Consolidation • Drill-down • Slicing and dicing © Pearson Education Limited 1995, 2005

  32. Representation of Multi-dimensional Data • Consolidation - aggregation of data such as simple ‘roll-ups’ or complex expressions involving inter-related data. • Drill-Down - is the reverse of consolidation and involves displaying the detailed data that comprises the consolidated data. © Pearson Education Limited 1995, 2005

  33. Representation of Multi-dimensional Data • Slicing and Dicing - (also called pivoting) refers to the ability to look at the data from different viewpoints. © Pearson Education Limited 1995, 2005

  34. Representation of Multi-dimensional Data • Can store data in a compressed form by dynamically selecting physical storage organizations and compression techniques that maximize space utilization. • Dense data (that is, data that exists for a high percentage of cells) can be stored separately from sparse data (that is, a significant percentage of cells are empty). © Pearson Education Limited 1995, 2005

  35. Representation of Multi-dimensional Data • Ability to omit empty or repetitive cells can greatly reduce the size of the cube and the amount of processing. • Allows analysis of exceptionally large amounts of data. © Pearson Education Limited 1995, 2005

  36. Representation of Multi-dimensional Data • In summary, pre-aggregation, dimensional hierarchy, and sparse data management can significantly reduce the size of the cube and the need to calculate values ‘on-the-fly’. • Removes need for multi-table joins and provides quick and direct access to arrays of data, thus significantly speeding up execution of multi-dimensional queries. © Pearson Education Limited 1995, 2005

  37. OLAP Tools • There are many varieties of OLAP tools available in the marketplace. • This choice has resulted in some confusion with much debate regarding what OLAP actually means to a potential buyer and in particular what are the available architectures for OLAP tools. © Pearson Education Limited 1995, 2005

  38. Codd’s Rules for OLAP Systems • In 1993, E.F. Codd formulated twelve rules as the basis for selecting OLAP tools. © Pearson Education Limited 1995, 2005

  39. Codd’s Rules for OLAP Systems • Multi-dimensional conceptual view • Transparency • Accessibility • Consistent reporting performance • Client-server architecture • Generic dimensionality © Pearson Education Limited 1995, 2005

  40. Codd’s rules for OLAP • Dynamic sparse matrix handling • Multi-user support • Unrestricted cross-dimensional operations • Intuitive data manipulation • Flexible reporting • Unlimited dimensions and aggregation levels © Pearson Education Limited 1995, 2005

  41. Codd’s Rules for OLAP Systems • There are proposals to re-defined or extended the rules. For example to also include • Comprehensive database management tools • Ability to drill down to detail (source record) level • Incremental database refresh • SQL interface to the existing enterprise environment © Pearson Education Limited 1995, 2005

  42. Categories of OLAP Tools • OLAP tools are categorized according to the architecture used to store and process multi-dimensional data. • There are four main categories: • Multi-dimensional OLAP (MOLAP) • Relational OLAP (ROLAP) • Hybrid OLAP (HOLAP) • Desktop OLAP (DOLAP) © Pearson Education Limited 1995, 2005

  43. Multi-dimensional OLAP (MOLAP) • Use specialized data structures and multi-dimensional Database Management Systems (MDDBMSs) to organize, navigate, and analyze data. • Data is typically aggregated and stored according to predicted usage to enhance query performance. © Pearson Education Limited 1995, 2005

  44. Multi-dimensional OLAP (MOLAP) • Use array technology and efficient storage techniques that minimize the disk space requirements through sparse data management. • Provides excellent performance when data is used as designed, and the focus is on data for a specific decision-support application. © Pearson Education Limited 1995, 2005

  45. Multi-dimensional OLAP (MOLAP) • Traditionally, require a tight coupling with the application layer and presentation layer. • Recent trends segregate the OLAP from the data structures through the use of published application programming interfaces (APIs). © Pearson Education Limited 1995, 2005

  46. Typical Architecture for MOLAP Tools © Pearson Education Limited 1995, 2005

  47. MOLAP Tools - Development Issues • Underlying data structures are limited in their ability to support multiple subject areas and to provide access to detailed data. • Navigation and analysis of data is limited because the data is designed according to previously determined requirements. © Pearson Education Limited 1995, 2005

  48. MOLAP Tools - Development Issues • MOLAP products require a different set of skills and tools to build and maintain the database, thus increasing the cost and complexity of support. © Pearson Education Limited 1995, 2005

  49. Relational OLAP (ROLAP) • Fastest-growing style of OLAP technology due to requirements to analyze ever-increasing amounts of data and the realization that users cannot store all the data they require in MOLAP databases. © Pearson Education Limited 1995, 2005

  50. Relational OLAP (ROLAP) • Supports RDBMS products using a metadata layer - avoids need to create a static multi-dimensional data structure - facilitates the creation of multiple multi-dimensional views of the two-dimensional relation. © Pearson Education Limited 1995, 2005