1 / 9

DATA WAREHOUSING

A Data Warehouse can be defined as a centralized, consistent data store or Decision Support System (OLAP), for the end business users for analysis, prediction and decision making in their business operations. Data from various enterprise-wide application/transactional source systems (OLTP), are extracted, cleansed, integrated, transformed and loaded in the Data Warehouse.

Télécharger la présentation

DATA WAREHOUSING

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. DATA WAREHOUSE

  2. What is a Data Warehouse • A Data Warehouse can be defined as a centralized, consistent data store or Decision Support System (OLAP), for the end business users for analysis, prediction and decision making in their business operations. • Data from various enterprise-wide application/transactional source systems (OLTP), are extracted, cleansed, integrated, transformed and loaded in the Data Warehouse.

  3. Benefits of data warehouse • A data warehouse helps to integrate data and store them historically so that we can analyze different aspects of business including, performance analysis, trend, prediction etc. over a given time frame and use the result of our analysis to improve the efficiency of business processes. • Data warehouses are built to facilitate reporting on different key business processes of an organization, known as KPI. • Data warehouse can be further used for trend prediction, forecasts, pattern recognition etc.

  4. Properties of a Data Warehouse • Subject Oriented - Data Warehouse is designed based on the major subjects areas of the business, rather than the major application areas of the enterprise. The ultimate necessity is to store decision-support data rather than application-oriented data. • Integrated - The data in Data Warehouse comes from different enterprise-wide application source systems. The source data is often inconsistent in nature. During the Integration process, source data must be made consistent in Data Warehouse so that the data becomes homogeneous and uniform. • Time-Variant - The source data in the Data Warehouse is only accurate and valid at some point in time or over some time interval. • Non-Volatile - Data in Data Warehouse is not updated or replaced in real time but is refreshed from operational systems on a regular basis. New data is always added incrementally integrating it with the previous data set.

  5. Schematic View of a data warehouse

  6. ETL process ETL (Extract, Transform and Load) is a process in data warehousing responsible for pulling data out of the source systems and placing it into a data warehouse. ETL involves the following tasks: • extracting the data from source systems (SAP, ERP, other operational systems), data from different source systems is converted into one consolidated data warehouse format which is ready for transformation processing. continued….

  7. Transforming the data may involve the following tasks: • Applying business rules • Cleaning(e.g., mapping NULL to 0 or "Male" to "M" and "Female" to "F" etc.), • Filtering(e.g., selecting only certain columns to load) • Splittinga column into multiple columns and vice versa • Joiningtogether data from multiple sources • Transposingrows and columns • Applying any kind of simple or complex data validation loading the data into a data warehouse or data repository other reporting applications

  8. ETL layer is mostly responsible for 2 types of treatments on the data: • Data Integration - process of combining heterogeneous data sources in to a single queriable schema so as to get an unified view of these data. • Qualitative Treatment - so that the validity and quality of the data can be checked (and if required corrected) before loading to the data warehouse.

  9. THANK YOU

More Related