
Jyotishman Pathak, PhD Assistant Professor of Biomedical Informatics

Strategic Health IT Advanced Research Projects (SHARP) Area 4: Secondary Use of EHR Data Project 3: High-Throughput Phenotyping. Jyotishman Pathak, PhD Assistant Professor of Biomedical Informatics. June 11, 2012. Project 3: Collaborators & Acknowledgments.


Presentation Transcript


  1. Strategic Health IT Advanced Research Projects (SHARP) Area 4: Secondary Use of EHR Data Project 3: High-Throughput Phenotyping Jyotishman Pathak, PhD Assistant Professor of Biomedical Informatics June 11, 2012

  2. Project 3: Collaborators & Acknowledgments • CDISC (Clinical Data Interchange Standards Consortium): Rebecca Kush, Landen Bain • Centerphase Solutions: Gary Lubin, Jeff Tarlowe • Group Health Seattle: David Carrell • Harvard University/MIT: Guergana Savova, Peter Szolovits • Intermountain Healthcare/University of Utah: Susan Welch, Herman Post, Darin Wilcox, Peter Haug • Mayo Clinic: Cory Endle, Rick Kiefer, Sahana Murthy, Gopu Shrestha, Dingcheng Li, Gyorgy Simon, Matt Durski, Craig Stancl, Kevin Peterson, Cui Tao, Lacey Hart, Erin Martin, Kent Bailey, Scott Tabor

  3. Phenotyping is still a bottleneck… [Image from Wikipedia]

  4. EHR systems: United States 2002—2011 [Millwood et al. 2012]

  5. Electronic health record (EHR)-driven phenotyping • EHRs are becoming increasingly prevalent within the U.S. healthcare system • Meaningful Use is one of the major drivers • Overarching goal • To develop high-throughput automated techniques and algorithms that operate on normalized EHR data to identify cohorts of potentially eligible subjects on the basis of disease, symptoms, or related findings

  6. http://gwas.org

  7. EHR-driven Phenotyping Algorithms - I • Typical components • Billing and diagnosis codes • Procedure codes • Labs • Medications • Phenotype-specific covariates (e.g., Demographics, Vitals, Smoking Status, CASI scores) • Pathology • Imaging? • Organized into inclusion and exclusion criteria

  8. EHR-driven Phenotyping Algorithms - II [Diagram labels: Rules; Evaluation; Phenotype Algorithm; Visualization; Data; Transform; Transform; Mappings; NLP, SQL] [eMERGE Network]

  9. Example: Hypothyroidism Algorithm [Flowchart boxes] • No thyroid-altering medications (e.g., Phenytoin, Lithium) • 2+ non-acute visits in 3 yrs • ICD-9s for Hypothyroidism • Abnormal TSH/FT4 • Antibodies for TTG or TPO (anti-thyroglobulin, anti-thyroperoxidase) • No ICD-9s for Hypothyroidism • No Abnormal TSH/FT4 • Thyroid replace. meds • No thyroid replace. meds • No Antibodies for TTG/TPO • No secondary causes (e.g., pregnancy, ablation) • No hx of myasthenia gravis • Outcomes: Case 1, Case 2, Control [Denny et al., 2012]
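
As a rough illustration of how inclusion and exclusion criteria like these might be encoded, the Python sketch below classifies a patient record as a case, a control, or excluded. The field names are hypothetical, Case 1 and Case 2 are collapsed into a single "case" outcome, and the way the boxes are combined is an assumption made for illustration; the flowchart in Denny et al. remains the authoritative definition.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    # All field names are hypothetical; they mirror the boxes on the slide.
    has_hypothyroidism_icd9: bool
    abnormal_tsh_or_ft4: bool
    on_thyroid_replacement: bool
    has_thyroid_antibodies: bool
    on_thyroid_altering_meds: bool
    has_secondary_cause: bool
    non_acute_visits_3yr: int
    hx_myasthenia_gravis: bool = False


def classify(p: Patient) -> str:
    """Return 'case', 'control', or 'excluded' (assumed combination logic)."""
    # Exclusion criteria, applied to every patient in this sketch.
    if p.on_thyroid_altering_meds or p.has_secondary_cause:
        return "excluded"
    if p.non_acute_visits_3yr < 2:
        return "excluded"

    # Any positive evidence of hypothyroidism in codes, labs, or antibodies.
    evidence = (
        p.has_hypothyroidism_icd9
        or p.abnormal_tsh_or_ft4
        or p.has_thyroid_antibodies
    )
    if evidence and p.on_thyroid_replacement:
        return "case"

    # Controls show no evidence at all and take no replacement medication.
    if not evidence and not p.on_thyroid_replacement and not p.hx_myasthenia_gravis:
        return "control"
    return "excluded"
```

Keeping each criterion as a named boolean makes the inclusion and exclusion logic easy to compare against a chart review, which is where most of the iteration happens.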

  10. Hypothyroidism Algorithm: Validation [Denny et al., 2012]

  11. [eMERGE Network]

  12. Genotype-Phenotype Association Results [Forest plot: published vs. observed odds ratios by disease and gene/region marker; odds ratio axis 0.5, 1.0, 2.0, 5.0] • Atrial fibrillation: rs2200733 (Chr. 4q25), rs10033464 (Chr. 4q25) • Crohn's disease: rs11805303 (IL23R), rs17234657 (Chr. 5), rs1000113 (Chr. 5), rs17221417 (NOD2), rs2542151 (PTPN22) • Multiple sclerosis: rs3135388 (DRB1*1501), rs2104286 (IL2RA), rs6897932 (IL7RA) • Rheumatoid arthritis: rs6457617 (Chr. 6), rs6679677 (RSBN1), rs2476601 (PTPN22) • Type 2 diabetes: rs4506565 (TCF7L2), rs12255372 (TCF7L2), rs12243326 (TCF7L2), rs10811661 (CDKN2B), rs8050136 (FTO), rs5219 (KCNJ11), rs5215 (KCNJ11), rs4402960 (IGF2BP2) [Ritchie et al. 2010]

  13. Key lessons learned from eMERGE • Algorithm design and transportability • Non-trivial; requires significant expert involvement • Highly iterative process • Time-consuming manual chart reviews • Representation of “phenotype logic” for transportability is critical • Standardized data access and representation • Importance of unified vocabularies, data elements, and value sets • Questionable reliability of ICD & CPT codes (e.g., billing the wrong code since it is easier to find) • Natural Language Processing (NLP) is critical

  14. Algorithm Development Process - Modified [Diagram labels: Rules; Semi-Automatic Execution; Evaluation; Phenotype Algorithm; Visualization; Data; Transform; Transform; Mappings; NLP, SQL] [eMERGE Network]

  15. Algorithm Development Process - Modified • Standardized and structured representation of phenotype definition criteria • Use the NQF Quality Data Model (QDM) • Conversion of structured phenotype criteria into executable queries • Use JBoss® Drools (DRLs) • Standardized representation of clinical data • Create new and re-use existing clinical element models (CEMs) [Diagram labels: Rules; Semi-Automatic Execution; Evaluation; Phenotype Algorithm; Visualization; Data; Transform; Transform; Mappings; NLP, SQL] [Welch et al. 2012] [Thompson et al., submitted 2012] [Li et al., submitted 2012]

  16. The SHARPn “phenotyping funnel” Intermountain EHR Mayo Clinic EHR [Welch et al. 2012] [Thompson et al., submitted 2012] [Li et al., submitted 2012]

  17. Clinical Element Models: Higher-Order Structured Representations [Stan Huff, IHC]

  18. Pre- and Post-Coordination [Stan Huff, IHC]

  19. CEMs available for patient demographics, medications, lab measurements, procedures etc. [Stan Huff, IHC]

  20. SHARPn data normalization flow - I CEM MySQL database with normalized patient information [Welch et al. 2012]

  21. SHARPn data normalization flow - II CEM MySQL database with normalized patient information

  22. Algorithm Development Process - Modified • Standardized and structured representation of phenotype definition criteria • Use the NQF Quality Data Model (QDM) • Standardized representation of clinical data • Create new and re-use existing clinical element models (CEMs) [Diagram labels: Rules; Semi-Automatic Execution; Evaluation; Phenotype Algorithm; Visualization; Data; Transform; Transform; Mappings; NLP, SQL] [Welch et al. 2012] [Thompson et al., submitted 2012] [Li et al., submitted 2012]

  23. Our task: human readable → machine computable [Thompson et al., submitted 2012]

  24. NQF Quality Data Model (QDM) • Standard of the National Quality Forum (NQF) • A structure and grammar to represent quality measures in a standardized format • Groups of codes in a code set (ICD-9, etc.) • "Diagnosis, Active: steroid induced diabetes" using "steroid induced diabetes Value Set GROUPING (2.16.840.1.113883.3.464.0001.113)" • Supports temporality & sequences • AND: "Procedure, Performed: eye exam" > 1 year(s) starts before or during "Measurement end date" • Implemented as a set of XML schemas • Links to standardized terminologies (ICD-9, ICD-10, SNOMED-CT, CPT-4, LOINC, RxNorm, etc.)
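
To make the two QDM ideas above more concrete (a data element bound to a value set, and a temporal relation to an anchor date), here is a minimal Python sketch. It is not the QDM XML schema or any NQF tooling: the class names, helper functions, and the placeholder code `PLACEHOLDER-1` are assumptions for illustration, and only the value set OID comes from the slide.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ValueSet:
    oid: str            # identifier of the value set grouping
    codes: frozenset    # member codes drawn from a terminology


@dataclass(frozen=True)
class Event:
    category: str       # e.g. "Diagnosis, Active" or "Procedure, Performed"
    code: str
    start: date


def matches(event: Event, category: str, value_set: ValueSet) -> bool:
    """True if the event has the given category and its code is in the value set."""
    return event.category == category and event.code in value_set.codes


def starts_before_or_during(event: Event, anchor: date) -> bool:
    """True if the event starts on or before the anchor date."""
    return event.start <= anchor


def days_before(event: Event, anchor: date) -> int:
    """Days from the event start to the anchor (negative if the event is later)."""
    return (anchor - event.start).days


# The OID is the one quoted on the slide; the member code is a placeholder.
steroid_dm = ValueSet("2.16.840.1.113883.3.464.0001.113", frozenset({"PLACEHOLDER-1"}))
dx = Event("Diagnosis, Active", "PLACEHOLDER-1", date(2011, 6, 1))
measurement_end = date(2011, 12, 31)

print(matches(dx, "Diagnosis, Active", steroid_dm))
print(starts_before_or_during(dx, measurement_end), days_before(dx, measurement_end))
```

Returning the elapsed days rather than a yes/no answer leaves the comparison operator and threshold (for example, the "1 year(s)" in the eye exam criterion) to the caller.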

  25. 116 Meaningful Use Phase I Quality Measures

  26. Example: Diabetes & Lipid Mgmt. - I Human readable HTML

  27. Example: Diabetes & Lipid Mgmt. - II Computable XML

  28. NQF Measure Authoring Tool (MAT)

  29. Algorithm Development Process - Modified • Standardized and structured representation of phenotype definition criteria • Use the NQF Quality Data Model (QDM) • Conversion of structured phenotype criteria into executable queries • Use JBoss® Drools (DRLs) • Standardized representation of clinical data • Create new and re-use existing clinical element models (CEMs) [Diagram labels: Rules; Semi-Automatic Execution; Evaluation; Phenotype Algorithm; Visualization; Data; Transform; Transform; Mappings; NLP, SQL] [Welch et al. 2012] [Thompson et al., submitted 2012] [Li et al., submitted 2012]

  30. JBoss® open-source Drools business rules management system (BRMS) • Represents knowledge with declarative production rules • Origins in artificial intelligence expert systems • Simple "when <pattern> then <action>" rules specified in text files • Separation of data and logic into separate components • Forward chaining inference model (Rete algorithm) • Domain specific languages (DSL)

  31. Example Drools rule: rule "Glucose <= 40, Insulin On" when $msg : GlucoseMsg(glucoseFinding <= 40, currentInsulinDrip > 0) then glucoseProtocolResult.setInstruction(GlucoseInstructions.GLUCOSE_LESS_THAN_40_INSULIN_ON_MSG); end [Slide annotations: {Rule Name}; {Java Class}; {binding}; {Class Getter Method}; {Class Setter Method}; Parameter {Java Class}]
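
The when/then shape of the rule above can be mimicked in a few lines of Python, which may help readers who have not used Drools. This is only a sketch of the production-rule idea: it is a naive single-pass matcher, not the Drools engine, and it does no chaining and no Rete-style indexing. The class and function names (`Rule`, `Result`, `fire_all`) are hypothetical; only the rule name, the two conditions, and the instruction constant come from the slide.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class GlucoseMsg:
    # Fact class; attribute names follow the slide's GlucoseMsg fields.
    glucose_finding: float
    current_insulin_drip: float


@dataclass
class Result:
    instructions: List[str] = field(default_factory=list)


@dataclass
class Rule:
    name: str
    when: Callable[[GlucoseMsg], bool]           # pattern (condition) part
    then: Callable[[GlucoseMsg, Result], None]   # action part


RULES = [
    Rule(
        name="Glucose <= 40, Insulin On",
        when=lambda m: m.glucose_finding <= 40 and m.current_insulin_drip > 0,
        then=lambda m, r: r.instructions.append("GLUCOSE_LESS_THAN_40_INSULIN_ON_MSG"),
    ),
]


def fire_all(facts, rules=RULES) -> Result:
    """Test every rule against every fact once and run the actions that match."""
    result = Result()
    for fact in facts:
        for rule in rules:
            if rule.when(fact):
                rule.then(fact, result)
    return result


print(fire_all([GlucoseMsg(35, 2.0)]).instructions)
```

Rules live in a plain list that is separate from the facts, which is the same separation of data and logic that the previous slide attributes to Drools.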

  32. Automatic translation from NQF QDM criteria to Drools [Diagram labels: Measure Authoring Toolkit; Drools Engine; From non-executable to executable; Measures; XML-based structured representation; Drools scripts; Converting measures to Drools scripts; Data Types; XML-based structured representation; Fact Models; Mapping data types and value sets; Value Sets saved in XLS files] [Li et al., submitted 2012]

  33. Automatic translation from NQF QDM criteria to Drools [Li et al., submitted 2012]

  34. The “executable” Drools flow

  35. Phenotype library and workbench - I http://phenotypeportal.org • Converts QDM to Drools • Rule execution by querying the CEM database • Generate summary reports

  36. Phenotype library and workbench - II http://phenotypeportal.org

  37. Phenotype library and workbench - III http://phenotypeportal.org

  38. Phenotype library and workbench - IV

  39. Additional on-going research efforts - I • Machine learning and association rule mining • Manual creation of algorithms takes time • Let computers do the “hard work” • Validate against expert-developed ones [Carroll et al. 2011]

  40. Additional on-going research efforts - I • Origins from sales data • Items (columns): co-morbid conditions • Transactions (rows): patients • Itemsets: sets of co-morbid conditions • Goal: find all itemsets (sets of conditions) that frequently co-occur in patients • One of those conditions should be DM • Support: # of transactions the itemset I appeared in • Support({TB, DLM, ND}) = 3 • Frequent: an itemset I is frequent if support(I) > minsup [Figure legend: X: infrequent] [Simon et al. 2012]
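
The support and frequent-itemset definitions above can be sketched directly in Python. This is a brute-force enumeration for illustration, not the mining method used by Simon et al. and not an Apriori implementation with pruning; the function names and the toy patient data are hypothetical. The strict inequality support(I) > minsup follows the slide.

```python
from itertools import combinations


def support(itemset, transactions):
    """Number of transactions (patients) that contain every item in the itemset."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= t)


def frequent_itemsets(transactions, minsup, must_contain=None):
    """Enumerate all itemsets with support > minsup, optionally requiring one item.

    Brute force over every combination, so only suitable for tiny examples.
    """
    all_items = sorted(set().union(*transactions))
    found = {}
    for k in range(1, len(all_items) + 1):
        for combo in combinations(all_items, k):
            if must_contain is not None and must_contain not in combo:
                continue
            s = support(combo, transactions)
            if s > minsup:
                found[frozenset(combo)] = s
    return found


# Toy data: each set holds one hypothetical patient's co-morbid conditions.
patients = [
    {"DM", "HTN", "OBESITY"},
    {"DM", "HTN"},
    {"DM", "OBESITY"},
    {"HTN"},
]

for itemset, count in frequent_itemsets(patients, minsup=1, must_contain="DM").items():
    print(sorted(itemset), count)
```

Passing `must_contain="DM"` reflects the requirement that one of the co-occurring conditions be diabetes mellitus.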

  41. Additional on-going research efforts - II

  42. Additional on-going research efforts - II TRALI/TACO sniffer

  43. Active Surveillance for TRALI and TACO Of the 88 TRALI cases correctly identified by the CART algorithm, only 11 (12.5%) of these were reported to the blood bank by the clinical service. Of the 45 TACO cases correctly identified by the CART algorithm, only 5 (11.1%) were reported to the blood bank by the clinical service.

  44. Additional on-going research efforts - III • Phenome-wide association scan (PheWAS) • Do a “reverse GWAS” using EHR data • Facilitate hypothesis generation [Pathak et al. submitted 2012]

  45. Publications to date (conservative)

  46. Mayo projects and collaborations • Ongoing • Transfusion related acute lung injury (Kor) • Drug induced liver injury (Talwalkar) • Drug induced thrombocytopenia and neutropenia (Al-Kali) • Active surveillance for celiac disease (Murray) • Warfarin dose response & heart valve replacements (Pereira) • Phenotype definition standardization (HCPR/Quality) • Getting started/planning • Pharmacogenomics of systolic heart failure (Bielinski/Pereira) • Pharmacogenomics of SSRI (Mrazek/Weinshilboum) • Lumbar image reporting with epidemiology (Kallmes) • Active clinical trial alerting (CTMS/Cancer Center)

  47. HTP related presentations • June 11th, 2012 • Using EHRs for clinical research (Vitaly Herasevich) • Association rule mining and T2D risk prediction (Gyorgy Simon) • Scenario-based requirements engineering for developing EHR add-ons to support CER in patient care settings (Junfeng Gao) • June 12th, 2012 • Exploring patient data in context clinical research studies: Research Data Explorer (Adam Wilcox et al.) • Utilizing previous result sets as criteria for new queries with FURTHeR (Dustin Schultz et al.) • Semantic search engine for clinical trials (Yugyung Lee) • Knowledge-driven workbench for predictive modeling (Peter Haug et al.) • Clinical analytics driven care coordination for 30-day readmission – Demonstration from 360 Fresh.com (Ramesh Sairamesh)
