1 / 13

Data Warehousing

Data Warehousing. Alex Ostrovsky CS157B Spring 2007. Introduction. Data warehouse is a main repository of corporate data Multiple databases are employed per specific purpose

quinta
Télécharger la présentation

Data Warehousing

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Data Warehousing Alex Ostrovsky CS157B Spring 2007

  2. Introduction • Data warehouse is a main repository of corporate data • Multiple databases are employed per specific purpose • Contains raw events and unprocessed data, although separate tables might exist for processed information displaying meaningful data

  3. What is it used for? • Data analysis • Data mining • Complex queries with multiple table join • Forecasting • Historical reporting • OLAP (Online Analytical Processing)

  4. High level view

  5. Key Concepts and Features • Data is not required to be heavily normalized • Transaction Processing is done mostly offline, thus processing time is not very critical. Although, this might depend on amount of data, normalization, query complexity, and application specifications.

  6. Key Concepts and Features (cont.) • Unlike regular OLTP real-time databases data is subject-oriented • Non-volatile, i.e. data is essentially stored forever without being pruned or deleted. • Heavily integrated: contains data from majority of organization’s applications • Time-variant: most of the data has some time reference for the purpose of producing the reports

  7. Types of data warehousing DBs • Offline operational database: similar to regular data replication. Used to minimize the impact of queries on a running primary operational system • Offline data warehouse: heavily integrated, reporting-oriented warehouse databases which are updated with data from operational databases on regular time intervals

  8. Types of data warehousing DBs (cont) • Real-time data warehouse: database data is updated instantaneously as soon as transaction happens • Integrated data warehouse: database is integrated with primary operational system for immediate decision making and reporting.

  9. Benefits of Data Warehousing • No need to stress operational database with complex queries • Separation of processing and business logic • Very flexible, multiple distinct relations can be defined from a set of data • Can be customer or object specific • Persistent – once result is computed from the raw events, it doesn’t need to be recomputed again, giving faster response time on subsequent queries.

  10. Dangers of Data Warehousing • Heavy processing requires physically separate database machines for warehousing and OLTP • Must be optimized for novice users, complex queries might take a very long time • Much more complex multidimensional design compared to regular relational databases • Errors in computational logic can cause serious financial losses and computational recalculations. • Data representation • Relatively difficult to perform data migration

  11. Database Design • Data warehousing databases mostly utilize complex multidimensional design • Relationships must be meaningful and represent clear patterns and trends of unprocessed data. More data and relationships you have more dimensions database will have. • Information is viewed along one common dimensional position. Can be thought of as intersection of a few planes.

  12. OLAP Market

  13. References • http://en.wikipedia.org/wiki/Data_warehouse • http://en.wikipedia.org/wiki/OLAP • http://dmoz.org/Computers/Software/Databases/Data_Warehousing/ • http://dmoz.org/Computers/Software/Databases/Data_Warehousing/Articles/ • http://en.wikipedia.org/wiki/Multidimensional_database • http://www.olapreport.com/market.htm

More Related