
Understanding Data Replication for Real-Time Warehousing: Key Insights and Use Cases

Data replication is the process of copying a portion of a database from one environment to another while ensuring that changes made to the original are synced with the copies. This technology is crucial for data warehousing and business intelligence, supporting real-time queries and operational synchronization. Use cases include enabling high availability, facilitating data migration without downtime, and enhancing master data management. Key considerations involve performance optimization, support for heterogeneity and topology, and efficient management and monitoring.




Presentation Transcript


  1. Replication for real-time warehousing • Philip Howard, Research Director, Bloor Research

  2. Agenda • What is data replication? • When would you use it? • What are its requirements?

  3. What is data replication? “The process of copying a portion of a database from one environment to another and keeping the subsequent copies of the data in sync with the original source. Changes made to the original source are propagated to the copies of the data in other environments.”
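The definition on this slide can be illustrated with a small sketch. It is not taken from the presentation: `Source`, `Replica`, `wanted` and `sync` are hypothetical names, and an in-memory dictionary stands in for a real database. The source records each change in a log, and the replica copies only the portion of the data it is interested in, then applies whatever has changed since its last sync.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional, Tuple

# A change is (operation, key, value); value is None for deletes.
Change = Tuple[str, str, Optional[Any]]


@dataclass
class Source:
    """Hypothetical in-memory 'database' that logs every change it accepts."""
    rows: Dict[str, Any] = field(default_factory=dict)
    log: List[Change] = field(default_factory=list)

    def upsert(self, key: str, value: Any) -> None:
        self.rows[key] = value
        self.log.append(("upsert", key, value))

    def delete(self, key: str) -> None:
        if key in self.rows:
            del self.rows[key]
            self.log.append(("delete", key, None))


@dataclass
class Replica:
    """Copy of a portion of the source, selected by a key predicate."""
    wanted: Callable[[str], bool]
    rows: Dict[str, Any] = field(default_factory=dict)
    position: int = 0  # index of the next unread entry in the source log

    def sync(self, source: Source) -> int:
        """Apply changes logged since the last sync; return how many were applied."""
        applied = 0
        for op, key, value in source.log[self.position:]:
            if self.wanted(key):
                if op == "upsert":
                    self.rows[key] = value
                else:
                    self.rows.pop(key, None)
                applied += 1
        self.position = len(source.log)
        return applied
```

A replica created with `Replica(wanted=lambda k: k.startswith("price:"))` would pick up only pricing rows on each `sync`, which loosely mirrors the "portion of a database" wording in the definition.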

  4. When would you use data replication? • Data warehousing and BI • Loading real-time data for operational BI • Supporting real-time query/reporting • Integrating CEP with operational data • Operational synchronisation • E.g. Lookers v Bookers • E.g. synchronising (POS and) central pricing data • High/continuous availability • Data migration (zero downtime) • Master data management • To update/broadcast from a hub • High/continuous availability • …

  5. Enabling data replication • Performance • Native interfaces • Support for parallelism • Compression • Change data capture • Impact minimalism • Heterogeneity • Topology support • Synchronisation • Graphical development and management/monitoring • In operational/HA environments: transactional integrity

  6. Performance 1: native interfaces • High-level interfaces (ODBC/JDBC) are not fast enough

  7. Performance 2: parallelism

  8. Performance 3: compression • One size does not fit all
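One possible reading of "one size does not fit all" is that compression should be decided per payload rather than applied blindly. The sketch below is an assumption about what that could look like, not something stated on the slide: `encode` and `decode` are hypothetical helpers that use Python's standard `zlib` module and fall back to sending the raw bytes when compression would not make the payload smaller.

```python
import zlib
from typing import Tuple


def encode(payload: bytes, level: int = 6) -> Tuple[str, bytes]:
    """Compress a change payload only when doing so actually shrinks it."""
    compressed = zlib.compress(payload, level)
    if len(compressed) < len(payload):
        return "zlib", compressed
    # Tiny or already-dense payloads can grow under compression; send as-is.
    return "raw", payload


def decode(kind: str, data: bytes) -> bytes:
    """Reverse encode() on the receiving side."""
    return zlib.decompress(data) if kind == "zlib" else data
```

Repetitive data such as a batch of similar pricing rows would typically take the `"zlib"` branch, while a very short message would be sent raw.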

  9. Performance 4: CDC

  10. Performance 5: impact minimalism

  11. Heterogeneity

  12. Topology support • 1 to 1 • 1 to Many • Many to 1 • M to M • 1 to 1 to 1 • etc
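The topologies listed on this slide can all be described as a directed graph of replication links. The following sketch is illustrative only: the node names (`hub`, `store_a`, `oltp` and so on) and the `downstream` function are hypothetical. It shows how one data structure could cover each shape and answer the question "which copies does a change at this node eventually reach?"

```python
from collections import deque
from typing import Dict, List


def downstream(topology: Dict[str, List[str]], origin: str) -> List[str]:
    """Return every node a change at `origin` reaches, in breadth-first order."""
    seen = {origin}
    order: List[str] = []
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        for target in topology.get(node, []):
            if target not in seen:  # guards against loops in M-to-M setups
                seen.add(target)
                order.append(target)
                queue.append(target)
    return order


# Hypothetical examples of the shapes named on the slide.
one_to_many = {"hub": ["store_a", "store_b"]}
many_to_one = {"store_a": ["warehouse"], "store_b": ["warehouse"]}
many_to_many = {"site_a": ["site_b"], "site_b": ["site_a"]}
cascade = {"oltp": ["staging"], "staging": ["warehouse"]}  # 1 to 1 to 1
```

Calling `downstream(cascade, "oltp")` walks the chain from the operational system through staging to the warehouse, and the `seen` set keeps bidirectional links from looping forever.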

  13. Synchronisation

  14. Development & Monitoring

  15. Conclusion • Replication serves sundry purposes • Fastest growing adoption for BI • Key requirement is performance but multiple others • Complementary (not competitive) to both data integration and data virtualisation
