1 / 25

Data Warehouse Components

Data Warehouse Components. Overview of the Components. Source Data Component Production data Internal data Archive data External data Data staging component Extraction Transformation Cleaning standardization Loading Data storage component Information delivery component

ailis
Télécharger la présentation

Data Warehouse Components

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Data Warehouse Components

  2. Overview of the Components • Source Data Component • Production data • Internal data • Archive data • External data • Data staging component • Extraction • Transformation • Cleaning • standardization • Loading • Data storage component • Information delivery component • Metadata component • Management and control component

  3. Architectural Framework

  4. Data Acquisition You are the data analyst on the project team building a DW for an insurance company. List the possible data sources from which you will bring data into DW • Production data: data from various operational systems • External data: for finding trends and comparisons against other organizations. • Internal data: private confidential data important to an organization • Archived data: for getting some historical information

  5. Architectural Framework

  6. Data Staging • Performs ETL • Extraction • Select data sources, determine filters • Automatic replicate • Create intermediary files • Transformation • Clean, merge, de-duplicate data • Covert data types • Calculate derived data • Resolve synonyms and homonyms • Loading • Initial loading • Incremental loading

  7. Why is a separate data staging area required? • Data is across various operational databases • It should be subject-oriented data • Data staging is mandatory

  8. Architectural Framework

  9. Characteristics of data storage area • Separate repository • Data content • Read only • Integrated • High volumes • Grouped by business subjects • Metadata driven • Data from DW is aggregated in MDDBs

  10. Architectural Framework

  11. Information delivery component • Depends on the user • Novice user: prefabricated reports, preset queries • Casual user: once in a while information • business analyst: complex analysis • Power users: picks up interesting data

  12. Information delivery component

  13. Architectural Framework

  14. Metadata component • Data about data in the datawarehouse • Metadata can be of 3 types • Operational metadata: contains information about operational data sources • Extraction and transformation metadata: Details pertaining to extraction frequencies, extraction methods, business rules for data extraction • End-user metadata: navigational map of DW

  15. Why is metadata especially important in a data warehouse? • It acts as the glue that connects all parts of the data warehouse. • It provides information about the contents and structures to the developers. • It opens the door to the end-users and makes the contents recognizable in their own terms.

  16. Management and Control • Sits on top of all components • Coordinates the services and activities within the DW • Controls the data transformation and transfer in DW storage

  17. Summing up • Data warehouse building blocks or components are: source data, data staging, data storage, information delivery, metadata, and management and control. • In a data warehouse, metadata is especially significant because it acts as the glue holding all the components together and serves as a roadmap for the end-users.

  18. Doubts????????????????

  19. Trends in DW

  20. Case study 1 • As a senior analyst on DW project of a large retail chain, you are responsible for improving data visualization of the output results. Make a list of recommendations

  21. Parallel processing • Performance of DW may be improved using parallel processing with appropriate hardware and software options. • Parallel processing options • Symmetric multiprocessing • Massively parallel processing • clusters

  22. DW with ERP packages

  23. Web Enabled configuration

More Related