1 / 52

DW: A Tactical View

DW: A Tactical View. Developed by: Dr Eddie Ip Modified by: Dr Arif Ansari. Outline. Brief review Evolution of DW Architectural model of DW Putting a DW in place. Review: DW. Snapshot: DW takes a snap-shot of operation & stores it away. Jan 01. update. update. Dec 00. Nov 00.

deweyb
Télécharger la présentation

DW: A Tactical View

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. DW: A Tactical View Developed by: Dr Eddie Ip Modified by: Dr Arif Ansari

  2. Outline • Brief review • Evolution of DW • Architectural model of DW • Putting a DW in place

  3. Review: DW • Snapshot: DW takes a snap-shot of operation & stores it away Jan 01 update update Dec 00 Nov 00 Current 1/18/01 snapshot Update

  4. Review: DW • Read-only • For analytical processing • For company-wide usage (Rising Tide strategy)

  5. Review: Operational goals of DW • Provide business users with access to data • Provide one version of the truth • Record the past accurately • Slice and dice efficiently thro’ data • Provide a unified view of customers

  6. Evolution of DW as a decision support tool • Long history of analytic system • 1970 DSS = decision support systems • 1980 EIS = executive information systems • Reporting system for senior managers

  7. EIS example • Frito-Lay

  8. Pre-DW DSS • Lack DB component • Extract programs subset files (predetermined criteria) • Problems: reliable joins, synchronization, credibility, consistency • Process controlled by IT, not business users • Silos (one OLTP, one DSS)

  9. Modern DSS Example • Stock Screen • Used by security analysts to inform & support business decision

  10. Modern DSS Example • A simple web-based mutual fund screen • Charles Schwab http://www.schwab.com/SchwabNOW/navigation/mainFrameSet/0,4528,527,00.html

  11. Java-based DSS • Java-based DSS (Comshare) • http://decisionweb.comshare.com/deciweb/tr/trmain2.asp?App=corporate%20review

  12. DW as a DSS • OLTP bring data in • DSS draw information out Data Information to support decision DW

  13. Basic Architectural model • An architecture is a set of rules or structures providing a framework for the overall design of a system. • Poe, Building a DW for Decision Support

  14. Architectural Model • Data flow through the DW pipe of 6 layers • Transaction layer • Propagation layer • Database layer • Middleware layer • Application layer • Presentation layer

  15. OD1 Six-layer DW Architecture AP1 OD2 AP2 OD3 AP3 OD4 AP4 OD5 Transaction Propagation Database Middleware Application Presentation

  16. Transaction layer • Validate = is data up-to-date and accurate? ID invalid, missing, duplicate values • even name may have different format • First/Last, Last/Middle/First, …. • Define default for missing value • Map source to target = which field in operation data to which field in DW

  17. Transaction layer • Data • Flat file (text or ASCII file) • RDBMS file • Proprietary file (.dat)

  18. Transaction layer(Tips to handle data) • RDBMS files > SQL / extraction tools with GUI (often available in RDBMS) • One Big flat file • Large volume (e.g., click stream, call transaction) • Read into scalable RDBMS • Scripting or C/C++ program • Small volume • Statistical tools (SPSS, SAS, JMP, Splus) • Specialized tools such as spreadsheet, OLAP tools • Desktop RDBMS such as MS Access

  19. Propagation layer • Most common option is asynchronous propagation (versus synchronous: changes occurring in transaction system synchronously transported to DW) • Transfer and load • Time for loading depends on a factors such as number of indexes

  20. Database layer • DW is not just the database layer! • Integrated & transformed BEFORE • Analysis & presentation AFTER • Parallel processing capability often required • Symmetric Multi-processor(SMP) • Massively parallel processing (MPP) • Enhancement http://houseandhome.msn.com/pickaplace/nf_Overview.aspx

  21. Middleware layer • Software that make access to database layer and application layer possible (= the pipes) • LAN, WAN • Client server model : logical and physical separation of data from the application • Thin client model : Internet/Intranet

  22. Application layer • Decision support applications • Presentational systems (pre-defined reports) • Interrogative systems (SQL) • Simulation systems (What-if analysis) • Functional systems (e.g. segmentation analysis) • Automatic systems (e.g. credit approval expert system) • Constrained by organizational design • System geographical oriented? Product oriented?

  23. Presentation layer • Interface • Command-line (e.g. unix) • Menu-driven (little interaction) • Query language (e.g. SQL) • Graphical (windows, pull-down menu, buttons, interactive) • Groupware (Lotus Notes, Window based products)

  24. Presentation layer • Multmedia (sound ,video, animation) • Hypertext (html, xml)

  25. Presentation layer • Security • Methods to enforce security • Views = seeing selected rows or columns • Access control = DBA Grant/Revoke user’s privilege’s access to data files • Admin control • Accesses based on roles (group of users) • Encryption • Firewall • Audit = track trail

  26. Putting a DW in place • Plan and launch • Collect business requirements • Implementation • Maintenance, support, and evolution

  27. Selecting technology Architecture Evolution Data staging Data Modeling Plan Collect Business Requirements Analytical Applications

  28. Plan and launch • Sponsorship • ROI and risk • Expenses: hardware & network upgrade (25-60 %), software (10-25%), services (10-35%), internal staff (5-15%) • 3-6 m ~ 1-3 m • Figures from Humphrey, Hawkins, and Dy (1999) • Cooperation across departments (enterprise DW)

  29. Plan>Implement • Business requirement collection • A DW aims at solving business, not technical problems. • Architecture & design • Infrastructure = technologies, platforms & components necessary to support the chosen architecture • Data • Analytical applications

  30. Plan>Implement • Gather business requirements • Interview key end users • Executives • Managers • Analysts • Front-line workers

  31. Implementation • Design • What to include • How are data formatted and stored • Physical and logical models • Meta-data management • Other implementation issues (e.g., data model) discussed later in course

  32. Other Broad Issues • How do I measure results? (If you’re not measuring it, you’re not managing it) • Build or buy? • Data mart vs Enterprise DW • Ownership • What can be outsourced?

  33. Types of DW • Plain-vanilla/ generic DW • Data mart = specialized DW built for a community of users (e.g., finance) • Enterprise DW = DW built for entire company • Operational data store (ODS) = data store to see instant snapshot, e.g, customer call center • (reference: Westerman, p.10)

  34. Types of DW • Federated DW • Hybrid of Top down/Bottom up approach • Top down= built Enterprise DW • Bottom Up= Each unit/company built its own DW or data mart • FDW: set of common rules (captured as meta data) for each quasi independent DW http://www.dmreview.com/master.cfm?NavID=198&EdID=1953

  35. Some Trends • An encompassing enterprise DW is a myth • Federated DW, data mart • Quick ROI (small victories) often useful to sustain DW • Big, fat, expensive apps that required long development time are replaced by smaller, more flexible, less expensive turnkey apps • Mid-size & smaller companies are beginning to build DW (Declining h/w cost + Microsoft factor)

  36. The Modern Data Warehouse • A data warehouse is a copy of transaction data specifically structured for querying, analysis and reporting • Note that the data warehouse contains a copy of the transactions. These are not updated or changed later by the transaction system. • Also note that this data is specially structured, and may have been transformed when it was placed in the warehouse

  37. Data Warehouse Roles and Structures The DW has the following primary functions: • It is a direct reflection of the business rules of the enterprise. • It is the collection point for strategic information. • It is the historical store of strategic information. • It is the source of information later delivered to data marts. • It is the source of stable data regardless of how the business processes may change.

  38. Position of the Data Warehouse Within the Organization

  39. Data Marts • A data mart is a smaller, more focused data warehouse. It reflects the business rules of a specific business unit. • The data mart does not need to cleanse its data because that was done when it went into the warehouse. • It is a set of tables for direct access by users. • These tables are designed for aggregation. • It typically is not a source for traditional statistical analysis.

  40. Decision Support Information Data Mart Data Mart Data Mart Data Delivery Decision Support Information Decision Support Information Position of the Data Mart Within the Organization

  41. What Can a Data Warehouse Do? Some of the benefits of a DW are: • Immediate information delivery • Data integration from across and even outside the organization • Future vision from historical trends • Tools for looking at data in new ways • Freedom from IS department resource limitations (you don’t need programmers to use a data warehouse)

  42. Examples of Common DW Applications

  43. What Does All This Mean? • On a daily basis, organizations turn to their data warehouses to answer a limitless variety of questions. • Nothing is free, however, and these benefits do come with a cost. • The value of a data warehouse is a result of the new and changed business processes it enables. • There are limitations, though. A DW cannot correct problems with the data, although it may help to clearly identify them.

  44. Comparison of Typical DW Costs and Benefits

  45. The Cost of Warehousing Data • Expenditures can be categorized as one-time initial costs or as recurring, ongoing costs. • The initial costs can further be identified as for hardware or software. • Expenditures can also be categorized as capital costs (associated with acquisition of the warehouse) or as operational costs (associated with running and maintaining the warehouse)

  46. Expenditures Associated with Building a DW

  47. Cost Are Highly Variable • A company that spends less money for their data warehouse is often happier with it. • The main justification for the development expense is that a DW reduces the cost of accessing the information owned by the organization. • Since information has to be retrieved just once (when it is placed in the warehouse), DW users see a lower cost on each report generated.

  48. Source System A Source System B Source System C Source System D Typical Multidatabase Report and Screen Generation Data download and transformation contribute to retrieval costs for every report or screen generated

  49. Source System A Organizational Data Warehouse Source System B Source System C Source System D Typical DW Report and Screen Generation Data upload and transformation costs occur just once. Retrieval costs are lower.

  50. Summary • At a tactical level, DW can be viewed as a DSS • 6-layer architecture model of DW • Collecting business requirements is an important first step in building a DW • General DW proliferation : without the analytics, you’re at a competitive disadvantage

More Related