1 / 83

Nicholas Collins Clinical Analytics and Informatics 24 September 2013

A Warehouse and Reporting Architecture for Healthcare using Oracle Technologies [ CON3926]. Nicholas Collins Clinical Analytics and Informatics 24 September 2013. Topics. About MD Anderson (and me!) The Future of Cancer Treatment and Research Oracle at MD Anderson

stefan
Télécharger la présentation

Nicholas Collins Clinical Analytics and Informatics 24 September 2013

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. A Warehouse and Reporting Architecture for Healthcare using Oracle Technologies [CON3926] Nicholas Collins Clinical Analytics and Informatics 24 September 2013

  2. Topics About MD Anderson (and me!) The Future of Cancer Treatment and Research Oracle at MD Anderson Our Warehouse and Reporting Architecture Implementation Conclusions 2

  3. 1 About MD Anderson 3

  4. About MD Anderson • Non-profit cancer hospital and research institution, founded in 1941 as part of The University of Texas System • Named after Monroe Dunaway Anderson (a banker and cotton trader, not an MD) • “Making Cancer History” – our mission is to eradicate cancer • Consistently ranked as the #1 hospital for cancer care 4

  5. About MD Anderson 5

  6. About MD Anderson • Over 19,000 employees, majority in the Houston area • Occupying over 20 buildings in the Texas Medical Center • The Texas Medical Center has over 50 member institutions, together over 100,000 employees 6

  7. About Me • Began working at MD Anderson while an undergraduate at Rice University (across the street) • Currently in the Clinical Analytics and Informatics (CAI) Department, but before that was in HR Information Management (HRIM), working with PeopleSoft and other HR apps • While in HRIM, built a custom HR data warehouse and reporting system using a combination of Microsoft and Oracle technologies • After hours, a professional stage actor/director in Houston 7

  8. 2 The Future of Cancer Treatment and Research 8

  9. MD Anderson Moon Shots Program • “The Time is Now. Together we will end cancer.” • Target six cancers: Breast/Ovarian, Leukemia (AML/MDS & CLL), Melanoma, Lung, Prostate • Clear focus on the concept that the answer to curing cancer lies in both clinical and genomic data 9

  10. MD Anderson Moon Shots Program http://www.cancermoonshots.org 10

  11. It’s in the Data! 11

  12. MD Anderson Moon Shots Platforms • Massive Data Analytics – An infrastructure for complex analytics and clinical decision support using integrated patient information, including clinical and research data • Big Data – An Information Technology infrastructure/environment that enables centralization, integration and secured access of patient and research data and analytical results 12

  13. It’s in the Genes! 13

  14. MD Anderson Moon Shots Platforms • Clinical Genomics – Clinical gene sequencing infrastructure, including centralized bio-specimen repository and processing • Omics– Bioinformatics – A high-throughput infrastructure for generation and standardization of large-scale “omic” data, including genomics, proteomics and immune profiling • Adaptive Learning in Genomic Medicine – A framework for bringing clinical medicine and genomic research together to enable rapid learning to improve patient management using Clinical Genomics, Omics-Bioinformatics and Massive Data Analytics platforms within the Big Data environment 14

  15. Genomics in the News 15

  16. 3 Oracle at MD Anderson 16

  17. Oracle Health Sciences Products at MD Anderson • Oracle Healthcare Data Warehouse Foundation (HDWF) • Oracle Healthcare Analytics Data Integration (OHADI) • Oracle TRC (Translational Research Center) Cohort Explorer • Oracle TRC Omics Data Bank (ODB) 17

  18. Oracle Technology at MD Anderson • Oracle Database 11gR2 • Oracle Exadata (x3) • Oracle Business Intelligence (OBIEE) • Oracle GoldenGate* *Oracle GoldenGate was used to demonstrate replication capabilities in a significant POC, but has not been purchased or put into production. Informatica is commonly used at MD Anderson for data integration; ODI is not currently in use at the institution. 18

  19. Oracle Healthcare Data Warehouse Foundation (HDWF) HDM HDI 19

  20. Oracle Healthcare Analytics Data Integration (OHADI) OHADI HDM HDI • Integration code that maps from the interface tables (HDI) to the warehouse tables (HDM) • Available as either Informatica or ODI mappings 20

  21. Oracle Cohort Explorer Cohort Explorer CDM 21

  22. Oracle Cohort Explorer 22

  23. Oracle Omics Data Bank (ODB) 23

  24. 24

  25. Review of Oracle Health Sciences Products Cohort Explorer HDM CDM HDI OHADI ODB 25

  26. 4 Our Warehouse and Reporting Architecture 26

  27. The MD Anderson FIRE Program • FIRE - Federated Institutional Reporting Environment • A program level initiative, with many projects and products involved, to provide a unified BI/Reporting solution for all of MD Anderson • Managed by the Clinical Analytics and Informatics (CAI) Department, part of Oracle SDP (Strategic Development Parnter) Program 27

  28. FIRE Program Team Structure (Early Proposal) 28

  29. FIRE Program Team Structure (Current) 29

  30. First FIRE Release – Pharmacy Dashboard • Custom-built OBIEE Dashboard for orders data, pulling data from HDM, with HDI populated from the GE Centricity source • Dimensional model with orders as the core fact • Included a smaller pre-release of HR data for staff details and testing the FIRE Architecture • Purchased HDWF and OHADI in May 2012, aiming for a fall go-live for our first FIRE release 30

  31. Pharmacy Dashboard (OBIEE) 31

  32. Unifying the Solution • Oracle Health Sciences GBU provides products that are a part of an overall solution, the rest is organization specific • We needed a way to effectively get data into HDWF and deliver it to custom Data Marts • Having a pre-built warehouse model helps with speed of delivery, but not necessarily source system mapping and integration 32

  33. Architectural Concept • Bring all the data processing together on a single Oracle instance for performance benefits of local movement and transformation • Abstract across all commonalities and patterns to the largest extent possible, avoiding needless one-off solutions, use code generation and automation • Ideal candidate for a later “forklift” to Exadata 33

  34. The FIRE Architecture 34

  35. Source Systems at MD Anderson • Currently no centralized EHR solution in place, a best-of-breed model with many disparate source systems • Data currently brought together for patient-care clinical use in a single UI by a SOA-based custom .NET app called ClinicStation • In July 2013, the institution announced its intention to migrate to Epic’s EHR solution 35

  36. The FIRE Architecture HDI/HDM SR SI UI UD 36

  37. SR (Staging Replica) Layer • Stores replicated data from source systems in the consolidated warehouse environment, provides buffer from sources and their technology, also allows custom indexing and partitioning if needed • Replication can be accomplished in a variety of ways, using a “bag of dirty tricks” to get the data into relational form in Oracle • GoldenGate proposed as standard tool to replicate in from relational sources, transparent gateway and Informatica/ODI are other options • Repliaction done at table level for consistency and ease of change 37

  38. SI (Staging Interface) Layer • Pull data directly from the corresponding SR schema (i.e. the SI_CENTRICITY schema pulls data from the SR_CENTRICITY schema) • Contains views that match the target HDI tables, one-to-one, same column names and data types • Accomplishes selection of the appropriate data, and any necessary pre-transformation • In complex cases, can have materialized pre-processing data in tables or materialized views, in practice this meets the 80/20 rule 38

  39. HDI Layer • Oracle-defined HDWF interface tables • A “landing zone” schema, can be used as a Persistent Staging Area (PSA) • Conceptually, where you place unrefined and unvalidated data to be processed by OHADI before it goes into the HDM (warehouse tables) • Tables are designed to be insert-only (source-change dated) 39

  40. HDM Layer • Oracle-defined HDWF warehouse tables • HDM stands for Healtcare Data Model • Keys and Referential Integrity (RI) in place for this layer, but RI is disabled by default • Conceptually, where all your data is persistently stored, though reloads from HDI are possible • Can be configured for effective-dating or only current state 40

  41. UI (User Interface) Layer • Similar in concept to the SI Layer, but pulling data from HDM for use in the UD layer • Contains views that match the target UD tables, one-to-one, same column names and data types • Accomplishes selection of the appropriate data, and any necessary pre-transformation (like SI) • In complex cases, can have materialized pre-processing data in tables or materialized views, in practice this meets the 80/20 rule (like SI) 41

  42. UD (User Data Mart) Layer • Pull data directly from the corresponding UI schema (i.e. the UD_RX schema pulls data from the views in the UI_RX schema) • Contains the user-layer target tables that are used in the dimensional (star schema) models • Can be used directly by OBIEE or schemas can be replicated to a separate user database for isolated/off-loaded dashboard processing • Data-wise, it’s the end result of the warehouse pipeline 42

  43. Data Movement 43

  44. Data Movement (Planned) • There was a desire from our integration team to use Informaticafor ETL because of experience base on the team, not much PL/SQL or ODI knowledge • Architecture proposed use of abstracted code generation via Informatica APIs, jointly used with the push-down optimization option for all non-OHADI internal data movement (i.e. SI to HDI, UI to UD) 44

  45. Data Movement (Actual) • Our integration team initially indicated that code generation with Informatica (or other tools) could not be done on account of complexity, and that the push-down optimization option was too expensive • To demostrate the feasibility, I programmed a PL/SQL-based version of the code generation as proposed in the FIRE Architecture documentation, we used this code in the first release 45

  46. Data Movement Code Generation 46

  47. PL/SQL Procedures for Code Generation • Procedure iv_tv_ip_gen(name_of_sv_view) for SI layer, generates objects for change detection and movement from SR to HDI • Procedure iv_uv_dv_ip_gen(name_of_sv_view) for UI layer, generates objects for change detection and movement from HDM to UD • All that is needed for generation is the SV view, which conforms to the HDI-based structure, data in certain standard HDI columns determine action • A benefit of the generated views is the ability to see what will happen during the next run, without actually running anything 47

  48. Results/Next Steps • Had approximately three months to implement, process was difficult, but in the end everything worked and we went to production with the first FIRE release in November 2012 • OHADI was slower than expected but got the job done, Informatica version used, but might be faster with ODI? • Integration team wanted a second chance to get code generation going for Informatica, and wanted more Informatica and less SQL and PL/SQL • Committee voted to try Informatica alternatives for the next release 48

  49. Second FIRE Release – Moon Shot Analytics and Pharmacy Dashboard • In this release, there were four total projects under the FIRE Program: • (1) Second Pharmacy Release, (2) Cohort Explorer and ODB (Moon Shot Analytics), (3) Exadata Implementation for Cohort Explorer and ODB, and (4) OBIEE Infrastructure Upgrade • Beginning to use Omics Data Bank, data volumes required Exadata • License restrictions and no budget yet to put HDWF on Exadata 49

  50. Second Release Cohort Explorer CDM ODB 50

More Related