CISB594 – Business Intelligence Business Analytics and Data Visualization Part II
Reference • Materials used in this presentation are extracted mainly from the following texts, unless stated otherwise.
Objectives
At the end of this lecture, you should be able to:
• Explain OLAP’s role in BI
• Distinguish between OLAP and OLTP
• Explain the concept of multidimensionality and how it can improve decision making
• Explain the concept of a cube in multidimensionality
• Explain Geographic Information Systems (GIS) and their role in BI
Introduction to OLAP • Where is OLAP in BI?
Introduction to OLAP
What is OLAP in BI?
• Online Analytical Processing (OLAP) is an industry-accepted reporting technology that provides high-performance analysis and easy reporting on large volumes of data
• The goal of OLAP, also known as multidimensional data analysis, is to provide fast and flexible data summarization, analysis, and reporting capabilities, with the ability to view trends over time
Introduction to OLAP
OLAP applications have the following features:
• Enable users to explore different relationships in data by looking beyond traditional two-dimensional row-and-column analysis
• Offer high-performance access to large amounts of data, giving users the power to retrieve answers to multidimensional business questions quickly and easily
• Provide slice-and-dice views of multiple relationships in large quantities of data, that is, separating the data along whichever dimensions you want to view (e.g. product sales vs. year vs. location)
Relational Database vs. Dimensional Database
• A relational database is a collection of relations, or tables
• Purpose: a relational database is designed more for updating data, while a dimensional database is meant more for retrieving data (BI)
Example: using the relational model, you can examine the Sales table to find that 5 bolts have been purchased. You can then check the Order table to find that the purchase was made by Customer Id AAA002, and then check the Customer table to find that Customer Id AAA002 is Samantha Jones. Imagine this retrieval being done on terabytes of data!
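The three lookups in the example above can be sketched as a single join. This is only an illustration using Python's built-in sqlite3 module: the table layouts, the column names, and the order number 1 are assumptions made for the sketch (the table is called `orders` because ORDER is a reserved word in SQL); only the values 5 bolts, AAA002, and Samantha Jones come from the slide.

```python
import sqlite3

# Hypothetical three-table layout mirroring the slide's example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE orders   (order_id INTEGER PRIMARY KEY, customer_id TEXT);
    CREATE TABLE sales    (order_id INTEGER, product TEXT, quantity INTEGER);

    INSERT INTO customer VALUES ('AAA002', 'Samantha Jones');
    INSERT INTO orders   VALUES (1, 'AAA002');
    INSERT INTO sales    VALUES (1, 'bolt', 5);
""")

# One query follows the same path as the manual lookups:
# Sales -> Order -> Customer.
row = conn.execute("""
    SELECT c.name, s.product, s.quantity
    FROM sales s
    JOIN orders   o ON o.order_id = s.order_id
    JOIN customer c ON c.customer_id = o.customer_id
    WHERE s.product = 'bolt'
""").fetchone()

print(row)
```

Each extra table in the chain adds another join, which is part of why this style of retrieval becomes expensive on very large data sets.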
Relational Database vs. Dimensional Database
• A dimensional database model consists of one fact table and multiple smaller dimension tables (Kimball)
• A common schema used when designing data models for a dimensional database is the star schema (dimensional data model)
• The star schema (sometimes referred to as the dimensional model) is the simplest data warehouse schema, consisting of a single "fact table" with one segment for each "dimension"
http://hubpages.com/hub/Star_schema
Relational Database vs. Dimensional Database
• All data that represent order transactions are placed in one table, the fact table
• The fact table refers to several tables called dimension tables, which contain the data used to retrieve sales information
• Together these form a simple data warehouse structure
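A minimal sketch of such a structure, again using Python's built-in sqlite3 module. The table names (`fact_sales`, `dim_product`, `dim_branch`, `dim_date`), their columns, and all of the rows are invented for illustration; a real warehouse would have more dimensions and more attributes per dimension.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables: small, descriptive lookup tables.
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, product_name TEXT);
    CREATE TABLE dim_branch  (branch_id  INTEGER PRIMARY KEY, branch_name  TEXT);
    CREATE TABLE dim_date    (date_id    INTEGER PRIMARY KEY, year INTEGER);

    -- Fact table: one row per order line, with a key into each dimension.
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        branch_id  INTEGER REFERENCES dim_branch(branch_id),
        date_id    INTEGER REFERENCES dim_date(date_id),
        quantity   INTEGER
    );

    INSERT INTO dim_product VALUES (1, 'bolt'), (2, 'nut');
    INSERT INTO dim_branch  VALUES (1, 'Central'), (2, 'North');
    INSERT INTO dim_date    VALUES (1, 2004), (2, 2005);
    INSERT INTO fact_sales  VALUES (1, 1, 2, 5), (1, 1, 2, 3), (2, 1, 2, 7), (1, 2, 1, 4);
""")

# Every analytical query has the same shape: start at the fact table
# and join outward to whichever dimensions the question needs.
totals = conn.execute("""
    SELECT p.product_name, b.branch_name, d.year, SUM(f.quantity)
    FROM fact_sales f
    JOIN dim_product p ON p.product_id = f.product_id
    JOIN dim_branch  b ON b.branch_id  = f.branch_id
    JOIN dim_date    d ON d.date_id    = f.date_id
    GROUP BY p.product_name, b.branch_name, d.year
    ORDER BY p.product_name, b.branch_name, d.year
""").fetchall()

for line in totals:
    print(line)
```

Because every dimension is exactly one join away from the fact table, queries stay shallow no matter which combination of dimensions is requested.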
Relational Database vs. Dimensional Database http://hubpages.com/hub/Star_schema
Multidimensionality
• Perhaps the best starting point for approaching the multidimensional model is to look at the types of queries for which it is best suited.
• Examples of common queries in BI:
  • "What is the total amount of receipts recorded last year per state and per product category?"
  • "What is the relationship between the trend of PC manufacturers' shares and quarter gains over the last five years?"
  • "Which orders maximize receipts?"
  • "Which one of two new treatments will result in a decrease in the average period of admission?"
  • "What is the relationship between profit gained by the shipments consisting of less than 10 items and the profit gained by the shipments of more than 10 items?"
http://searchdatamanagement.techtarget.com/
Multidimensionality
Another example of a query: a manager wants to know the sales of a product, in units or dollars, in a certain geographic area, by a specific salesperson, during a specific month. Such a question can be answered quickly if the data is organized in a multidimensional database, or if the query tools or related software products are designed for multidimensionality. Users can then navigate through the many dimensions and levels of data via tables or graphs and make quick interpretations, such as uncovering significant deviations or important trends.
Multidimensionality
• Multidimensionality is the ability to organize, present, and analyze data by several dimensions, such as sales by region, by product, by salesperson, and by time (four dimensions)
• It is set up during the design of a business analytics application using OLAP technology (to enable retrieval of specific data from the data warehouse for further analysis)
• An example of OLAP technology is a cube database (sometimes the OLAP application points to the data in the data warehouse for analysis, and sometimes it can retrieve the data and store it in its own cube database)
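The idea of analyzing one measure by several dimensions can be sketched in a few lines of plain Python. The records, the salesperson names, and the `summarize` helper are all hypothetical; the point is only that the same data can be totalled along any combination of the four dimensions named above.

```python
from collections import defaultdict

# Made-up sales records; each one carries four dimensions and one measure.
sales = [
    {"region": "Central", "product": "bolt", "salesperson": "Ali", "month": "2005-01", "amount": 120.0},
    {"region": "Central", "product": "nut",  "salesperson": "Ali", "month": "2005-01", "amount": 80.0},
    {"region": "Central", "product": "bolt", "salesperson": "Mei", "month": "2005-02", "amount": 50.0},
    {"region": "North",   "product": "bolt", "salesperson": "Raj", "month": "2005-01", "amount": 200.0},
]

def summarize(records, dimensions, measure="amount"):
    """Total the measure for every combination of the chosen dimensions."""
    totals = defaultdict(float)
    for record in records:
        key = tuple(record[d] for d in dimensions)
        totals[key] += record[measure]
    return dict(totals)

by_region = summarize(sales, ["region"])
by_region_and_product = summarize(sales, ["region", "product"])
by_person_and_month = summarize(sales, ["salesperson", "month"])

print(by_region)
print(by_region_and_product)
print(by_person_and_month)
```

An OLAP tool does the same kind of regrouping, but over precomputed or indexed structures so that it remains fast on large volumes of data.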
Understanding cube
• Conceptually, a multidimensional database uses the idea of a data cube to represent the dimensions of data available to a user.
• For example, "sales" could be viewed in the dimensions of product model, geography, time, or some additional dimension.
• The design of a cube database is planned and implemented by the database developer based on users' requirements
http://searchoracle.techtarget.com/definition/multidimensional-database
Cube Database – allowing for slice and dice
Example query: How many bolts were sold in the year 2005 by the Central branch?
• Slice the year 2005
• Dice down to the location
• Dice down to the product
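The example query can be walked through on a toy cube. In this sketch the cube is a plain Python dictionary keyed by (year, branch, product); the unit counts and the `restrict` helper are invented for illustration and do not come from the slide.

```python
# Order of the coordinates in each cube cell.
DIMENSIONS = ("year", "branch", "product")

# Made-up cube: each cell holds the units sold for one combination.
cube = {
    (2004, "Central", "bolt"): 30,
    (2005, "Central", "bolt"): 42,
    (2005, "Central", "nut"): 17,
    (2005, "North", "bolt"): 25,
}

def restrict(cube, **fixed):
    """Keep only the cells whose coordinates match every fixed dimension value."""
    positions = {DIMENSIONS.index(name): value for name, value in fixed.items()}
    return {
        cell: units
        for cell, units in cube.items()
        if all(cell[i] == value for i, value in positions.items())
    }

year_2005 = restrict(cube, year=2005)            # slice the year 2005
central = restrict(year_2005, branch="Central")  # dice down to the location
bolts = restrict(central, product="bolt")        # dice down to the product

answer = sum(bolts.values())
print(answer)
```

With these made-up figures the three steps narrow four cells down to one, and the answer is the value stored in that cell.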
Advanced Business Analytics
• While OLAP concentrates on reporting and queries, a more sophisticated way of analyzing data and information is needed
• Users today want to perform statistical and mathematical analysis such as hypothesis testing, multiple regression, prediction, and customer scoring models
• Such investigation cannot be done with basic OLAP and requires special tools, including data mining and predictive analysis; hence, advanced business analytics
Advanced Business Analytics
• A major step in managerial decision making is forecasting or estimating the results of different alternative courses of action
• Two methods that can be used for advanced business analytics are:
  • Data mining
  • Predictive analysis
Advanced Business Analytics
• Data mining: tools that automatically extract hidden, predictive information from databases by searching for patterns in large transaction databases. OLAP can only answer questions you already know to ask, whereas data mining answers questions you don’t necessarily know you should ask (to be discussed further in the next chapter)
• Predictive analysis: the use of tools that help determine the probable future outcome of an event or the likelihood of a situation occurring. These tools also identify relationships and patterns
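As a very small illustration of predictive analysis, the sketch below fits a straight-line trend to past sales by ordinary least squares and extends it one period ahead. This is a deliberately simple stand-in for the multiple regression and scoring models mentioned earlier, not a description of any particular tool; the monthly figures and the `fit_line` helper are made up.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical unit sales for four consecutive months (month index 0..3).
months = [0, 1, 2, 3]
units = [100, 110, 120, 130]

intercept, slope = fit_line(months, units)

# Extend the fitted trend to the next month (index 4).
forecast = intercept + slope * 4
print(forecast)
```

Real predictive tools use many explanatory variables and report how uncertain the estimate is; a single trend line only shows the basic idea of learning a relationship from history and projecting it forward.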
Geographic Information Systems (GIS)
• Geographic information system (GIS): a computer-based system for capturing, storing, modeling, retrieving, checking, analyzing, and displaying geographically referenced data by using digitized maps
Geographic Information Systems (GIS)
• As GIS tools become increasingly sophisticated and affordable, they help more companies and governments understand:
  • Precisely where their trucks, workers, and resources are located
  • Where they need to go to service a customer
  • The best way to get from here to there
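One small piece of this can be sketched in code: given the positions of several trucks and a customer, find the closest truck. The sketch uses the haversine formula for great-circle distance with a mean Earth radius of 6371 km; the truck names and all coordinates are hypothetical. Straight-line distance is only a rough proxy, since a real GIS would compute routes over the road network.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# Hypothetical positions as (latitude, longitude) in degrees.
trucks = {
    "truck_a": (3.1, 101.7),
    "truck_b": (5.4, 100.3),
}
customer = (3.0, 101.7)

# Distance from every truck to the customer, then pick the smallest.
distances = {
    name: haversine_km(lat, lon, customer[0], customer[1])
    for name, (lat, lon) in trucks.items()
}
nearest = min(distances, key=distances.get)

print(nearest, round(distances[nearest], 1))
```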
Geographic Information Systems (GIS)
• GIS and decision making
• GIS applications are used to improve decision making in the public and private sectors, including:
  • Dispatch of emergency vehicles
  • Transit management
  • Facility site selection
  • Drought risk management
  • Wildlife management
• Local governments use GIS applications for mapping and other decision-making applications
Geographic Information Systems (GIS) Examples of GIS Applications
Geographic Information Systems (GIS)
• GIS combined with GPS
• Global positioning systems (GPS): wireless devices that use satellites to enable users to detect, with reasonable precision, the position on earth of the items (e.g., cars or people) to which the devices are attached
• Example of usage: New York City pioneered CompStat, which uses GIS to map criminal activity and GPS for police deployment by date, time, and location. This resulted in a 70% reduction in the crime rate over the past decade