1 / 18

Data Warehousing

Data Warehousing. What is it? Why use it? Who cares? Is it summer yet?. Don’t Know. Don’t Care Don’t Know No. Definitions. “A data warehouse is a subject-oriented, time-variant, and non-volatile collection of data in support of management’s decision-making process.”

italia
Télécharger la présentation

Data Warehousing

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Data Warehousing • What is it? • Why use it? • Who cares? • Is it summer yet? • Don’t Know. • Don’t Care • Don’t Know • No.

  2. Definitions • “A data warehouse is a subject-oriented, time-variant, and non-volatile collection of data in support of management’s decision-making process.” • “Data warehousing is the process whereby organizations extract value from their informational assets through the use of special stores called data warehouses.”

  3. Definitions II • Operational Data: • the data that exists in the databases that store the day-to-day activities of the enterprise. • Virtual Data Warehouse: • Users are given the right to query data in the operational databases; no data is duplicated.

  4. Definitions III • Operational Data Store: • When the operational data is mirrored or replicated on a separate computer. • Data Warehouse: • When historical data is maintained well beyond the normal amount stored in the operational databases. • Data Mart: • A subset of a data warehouse dedicated to only a small set of operational data.

  5. Definitions IV • What it is not: • Archive • “Data Museum”

  6. Basic Questions to Answer • Operational Data Store: • “Do we have enough stock to offer some promotion?” • Data Mart: • “For what items should we increase our stock of next month, based on historical projections?” • Data Warehouse: • “What type of customer is most likely to respond to what promotional offers and when?”

  7. Warehouse Components • Data Migration Tools • Metadata Repositories • Data Warehouse Management System • Tools for retrieving and analyzing the data • Tools for managing the data warehouse

  8. Warehouse Tools • On-Line Analytical Processing (OLAP) • Tools which provide statistical-type queries: • Find the top salesperson for the month of May. • Find the top 3 best-selling products for the last quarter. • In SQL, many may be done with grouping functions; others require a little more processing (such as the second query above).

  9. OLAP In Depth • OLAP as opposed to OLTP. • (the “T” is for transaction) • Uses multi-dimensional data analysis techniques: • Data is viewed as an n-dimensional cube. • The analysis is slice, dice, roll up or drill down the cube. • Graphical interface is usually provided as well.

  10. OLAP Example • I have a transactional database for a suit case wholesaler. • The tables are • Customer(CID, Cname, etc), which is a retail store • SuitCase(SC#, manufacturer, price, etc), which is a suitcase that the customers may purchase. • SalesInfo(CID, SC#, NumSold, DateOfSale), which is information about purchases of suitcases by the retail stores.

  11. OLAP Example II • This data can easily be viewed as a three-dimensional cube: • One dimension for the customer • One dimension for the suitcase • One dimension for time • The data in each element of the cube will be the number of suitcases sold to that customer in that timeframe.

  12. This might be sales of Samsonite Suitcases to Sears in March SuitCase Customer Time OLAP Example III

  13. OLAP Rollup • A “Rollup” is basically a sum on a particular dimension. • For example, • “Give me the total sales for each month” • “Give me the total number of suitcases sold to each customer” • This analysis gives the decision support specialist an overview of the sales data.

  14. OLAP Drill Down • Suppose the DS specialist notices a marked decrease in sales from January to February; he/she might want to know why. • Drill Down amounts to taking a rolled up dimension and expanding it. • For example, • “I have the total sales for each month; now give me the total sales for each month for each customer”

  15. OLAP Slice and Dice • “Slice” and “Dice” refers to carving up the “Cube”. • A “Slice” is a cut along the “Z” or depth axis. • A “Dice” is a cut along one of the other axes. • Example: • “Give me a slice for Sears and then dice it for January”

  16. Warehouse Tools II • Data Mining • Tools which extract important relationships not otherwise obvious in the data: • Mechanical Pencils and Pencil lead are often purchased together. • Pencils and Pens are stolen more often than other items. • People between the ages of 50 and 65 are more likely to buy a Buick than any other age group.

  17. Warehouse Tools III • Data Mining (continued) • Uses of this information: • Product placement to maximize sales. • Product placement to minimize theft. • Target group marketing. • Techniques to use:

  18. Data Mining Techniques • Frequency Counts • Bayesian Analysis • Association Rules • ISA Hierarchies • Regression and Classification • “Support” and “Confidence” • Decision Trees

More Related