Using Electronic Medical Records Systems for Clinical Research: Benefits and Challenges

  1. Using Electronic Medical Records Systems for Clinical Research: Benefits and Challenges Prakash M. Nadkarni

  2. Introduction • Opportunities • Availability of clinical, financial and administrative data in electronic form • Challenges • Using EMR Software for research operations • Using EMR Data for research? Suitability of care-oriented data to clinical research needs. • EMRs queried directly to answer research questions

  3. EMR/Clinical Research Information System (CRIS) Differences: Research Subjects • Subjects are not necessarily “patients”. • Personal Health Information may be optional. • Not all screened subjects are enrolled. • Simultaneous or sequential enrollment • Eligibility Criteria

  4. EMR/CRIS Differences: The Study Calendar • Events/Visits and Study Calendar: Specific evaluations or interventions are done at specific time points ("events") relative to the start of the study. • Not all patients are enrolled at the same time.
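
The study-calendar idea on this slide can be sketched in a few lines of Python. This is only an illustration: the event names and day offsets are hypothetical, and the sketch assumes that offsets are counted from each subject's own enrollment date, which is one way to handle subjects who enroll at different times.

```python
from datetime import date, timedelta

# Hypothetical protocol: event name -> day offset from the subject's enrollment date.
STUDY_CALENDAR = {"Screening": 0, "Week 4 visit": 28, "Week 12 visit": 84}

def schedule_for(enrollment: date) -> dict:
    """Return the target calendar date of each protocol event for one subject."""
    return {event: enrollment + timedelta(days=offset)
            for event, offset in STUDY_CALENDAR.items()}

# Two subjects enrolled on different dates share the same relative calendar
# but get different absolute visit dates.
early = schedule_for(date(2024, 1, 10))
late = schedule_for(date(2024, 3, 1))
print(early["Week 4 visit"])  # 2024-02-07
print(late["Week 4 visit"])   # 2024-03-29
```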

  5. EMR/CRIS Differences: Electronic Data Capture (EDC) • CRIS EDC is Far More Structured and Fine-grained – textual comments are only a last resort. • CRISs may need to Support Real-Time Self-reporting of Subject Data • CRIS EDC may not always be Real-Time. • Quality Control considerations dictate many workflow steps.

  6. EMR/CRIS Differences: Trans-Institutional Scope • For trans-institutional scope, Web technology is virtually mandated. • Site restriction in Multi-Site studies – end-users and investigators access only their own site’s patients. • Trans-National Issues: Software Localization/ Globalization – same software, different language/layout.

  7. EMR/CRIS Differences: User Roles • CRISs support differential access to studies • Most users of a CRIS are unaware of the other studies in the same database. • Some users have read-only access to the data; some only view reports. • Only certain users may be allowed to enter data in particular forms, or even view certain "blinded" data. • Data analysts typically do not need to access PHI. However, in multi-institutional studies, they are not typically site-restricted (see later)
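
A minimal sketch of how site restriction and role-based hiding of PHI might combine. The role names ("coordinator", "analyst"), the record fields, and the `view` function are all hypothetical; a real CRIS would enforce these rules in its database and application layers.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Record:
    subject_id: str
    site: str
    name: str      # treated as PHI in this sketch
    hba1c: float

@dataclass(frozen=True)
class User:
    site: str
    role: str      # hypothetical roles: "coordinator" or "analyst"

def view(user, records):
    """Return the rows and fields this user is allowed to see."""
    visible = []
    for r in records:
        if user.role == "analyst":
            # Analysts see every site, but never PHI.
            visible.append({"subject_id": r.subject_id, "site": r.site, "hba1c": r.hba1c})
        elif r.site == user.site:
            # Site staff see full records, but only for their own site.
            visible.append({"subject_id": r.subject_id, "site": r.site,
                            "name": r.name, "hba1c": r.hba1c})
    return visible

records = [Record("S1", "SiteA", "Alice Example", 7.9),
           Record("S2", "SiteB", "Bob Example", 6.4)]
coordinator_rows = view(User("SiteA", "coordinator"), records)
analyst_rows = view(User("SiteA", "analyst"), records)
```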

  8. EMR/CRIS Differences: Summary • EMRs are intended to primarily support patient care, not research. CRISs are specifically designed for research protocols. • EMRs may inter-operate with CRISs. • Sub-systems: Laboratory, Pharmacy, Scheduling • EMR *may* be used with structured EDC for intra-institutional studies if the only alternative is paper, or if data-entry would otherwise be duplicated. • Claims by any EMR vendor that their systems are CRIS-capable should be viewed skeptically.

  9. EMR Data for Research • The Nature of EMR Data • Significant dependence on narrative text, which is often the gold standard for clinical findings. • Using administrative/billing data as a surrogate for clinical data • Miscoding, variations in coding

  10. Using EMR Data for Research • Primarily hypothesis suggestion/generation rather than confirmation • Sample size may be too small to achieve statistical significance • Most data mining tests only show association, which does not prove causation. • Selection of patients matching complex criteria: sample size projections for a planned study (a strength of I2B2 – no IRB approval needed because only anonymized data is returned).
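
The sample-size projection use case can be sketched as a count-only query: the caller learns how many patients match, never who they are. The eligibility criteria, field names, and data below are invented for illustration and do not reflect how i2b2 itself is implemented.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Patient:
    age: int
    diagnoses: frozenset
    last_hba1c: Optional[float]

def is_eligible(p: Patient) -> bool:
    """Hypothetical criteria: age 40-75, type 2 diabetes, last HbA1c >= 7.5."""
    return (40 <= p.age <= 75
            and "type 2 diabetes" in p.diagnoses
            and p.last_hba1c is not None
            and p.last_hba1c >= 7.5)

def cohort_count(patients) -> int:
    """Return only an aggregate count, not any patient-level rows."""
    return sum(1 for p in patients if is_eligible(p))

patients = [
    Patient(55, frozenset({"type 2 diabetes"}), 8.2),
    Patient(30, frozenset({"type 2 diabetes"}), 9.0),               # too young
    Patient(60, frozenset({"hypertension"}), None),                 # wrong diagnosis
    Patient(70, frozenset({"type 2 diabetes", "hypertension"}), 7.0),  # HbA1c too low
    Patient(48, frozenset({"type 2 diabetes"}), 7.5),
]
print(cohort_count(patients))  # 2
```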

  11. Medical Natural Language Processing 101 • NLP is concerned with extraction of meaningful information from human language input. • Ultimate goal is to transform unstructured text into a structured form. • Most NLP applications are targeted toward specific goals – e.g., identification of medications, adverse drug events. • NLP is not 100% accurate

  12. Medical NLP 101 : Symbolic/ Rule-based approaches • Linguistic / symbolic NLP approaches employ hand-crafted grammar rules to parse text into units of speech (symbols), which are then processed further. • Still used successfully for limited problems. • This approach does not always scale • Labor-intensive, ambiguous parses, poor results with telegraphic text.

  13. Medical NLP 101: Statistical NLP • Relies on large bodies of text annotated with the correct answers by humans. • Utilizes probabilistic methods for prediction • The larger and more representative the training data, the better the results will be. • Approaches include Support Vector Machines (SVMs), Hidden Markov Models (HMMs), and Conditional Random Fields (CRFs).

  14. Medical NLP 101: Sub-problems • NLP software typically works as a pipeline of modules: modules for low-level tasks precede those for high-level tasks • Low-level tasks • Segmentation: sentence and word boundary detection, problem-specific boundary detection • Part-of-speech tagging • Morphological decomposition of compound words • Aggregation: identification of phrases
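
As a rough sketch of the pipeline idea, the two functions below handle the lowest-level tasks, sentence segmentation followed by word tokenization, using only regular expressions. The rules are deliberately naive and the sample note is invented; real clinical text, with its abbreviations and telegraphic style, needs far more care.

```python
import re

def split_sentences(text):
    """Naive rule: break after . ! or ? when followed by whitespace and a capital letter."""
    parts = re.split(r"(?<=[.!?])\s+(?=[A-Z])", text)
    return [p.strip() for p in parts if p.strip()]

def tokenize(sentence):
    """Split into numbers (optionally decimal), words, and single punctuation marks."""
    return re.findall(r"\d+(?:\.\d+)?|\w+|[^\w\s]", sentence)

def pipeline(text):
    """Low-level stage first (sentences), then the next stage (tokens) on its output."""
    return [tokenize(s) for s in split_sentences(text)]

note = "Patient denies chest pain. BP 120/80 today. Started metformin 500 mg."
for tokens in pipeline(note):
    print(tokens)
```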

  15. Medical NLP 101 : Sub-problems (2) • High-level tasks • Spelling and grammatical error correction • Named Entity Recognition – including medical concept recognition • Word /abbreviation disambiguation • Negation and uncertainty identification • Relationship extraction • Temporal inferencing
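
Negation identification is one high-level task that small hand-written rules can approximate. The sketch below is a heavily simplified trigger-phrase approach; the trigger list, the scope terminators, and the five-word window are assumptions chosen for illustration, not a validated rule set.

```python
import re

# Hypothetical trigger phrases and scope terminators.
NEGATION_TRIGGERS = ("no", "denies", "without", "negative for")
SCOPE_END = re.compile(r"\b(?:but|however)\b|[.;]")
MAX_WORDS_BETWEEN = 5

def is_negated(sentence, concept):
    """True if a negation trigger precedes the concept within a short, unbroken scope."""
    text = sentence.lower()
    position = text.find(concept.lower())
    if position < 0:
        return False
    before = text[:position]
    for trigger in NEGATION_TRIGGERS:
        for match in re.finditer(r"\b" + re.escape(trigger) + r"\b", before):
            between = before[match.end():]
            # The trigger applies only if nothing ends its scope before the concept.
            if not SCOPE_END.search(between) and len(between.split()) <= MAX_WORDS_BETWEEN:
                return True
    return False

print(is_negated("Patient denies chest pain or dyspnea.", "dyspnea"))  # True
print(is_negated("No fever, but reports cough.", "cough"))             # False
```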

  16. Medical NLP: Practical Issues • Change of Workflow and Introduction of Structure can eliminate a difficult problem. • Code Reuse to avoid reinventing wheels. • General vs. Specific Solutions • Tools Need Commoditization

  17. Querying EMR Data: Technological Considerations • A database cannot be simultaneously designed for rapid query as well as efficient interactive, multi-user updates. • EMR database designs are transaction-oriented. • EMRs are optimized for "Patient/Entity Centric", not "Attribute-Centric" queries
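
The difference between the two query shapes can be illustrated with two in-memory indexes over the same invented observations. The parameter names and values are hypothetical; the point is only that each access pattern is served well by a differently organized structure.

```python
from collections import defaultdict

# Hypothetical observations: (patient id, parameter, value).
rows = [("p1", "hba1c", 8.1), ("p1", "sbp", 142.0),
        ("p2", "hba1c", 6.2), ("p3", "hba1c", 9.0)]

by_patient = defaultdict(dict)    # organized for "everything about one patient"
by_attribute = defaultdict(list)  # organized for "one parameter across all patients"
for patient_id, attribute, value in rows:
    by_patient[patient_id][attribute] = value
    by_attribute[attribute].append((patient_id, value))

# Patient-centric query: the chart view an EMR is optimized for.
chart = dict(by_patient["p1"])

# Attribute-centric query: the research question "who has HbA1c above 8.0?"
high = sorted(pid for pid, value in by_attribute["hba1c"] if value > 8.0)
print(chart, high)
```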

  18. Data Warehousing 101 • Principle: Operating on a separate read-only copy of the data on separate hardware yields better query performance. • Structural tweaks include adding extra and pre-computation of aggregate values. • Special types of indexes (bitmap indexes) yield improved query performance. • "Star schemas" characterize most warehouse designs. • Farmers vs. Explorers (Inmon) • "Virtual" integration ("federation")
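
A star schema can be sketched with Python's built-in `sqlite3` module: one central fact table whose rows point to small dimension tables, queried with a grouped aggregate. The table names, column names, and data are hypothetical, and SQLite stands in for whatever engine a real warehouse would use.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    -- Dimension tables describe the "who" and "what".
    CREATE TABLE dim_patient (patient_key INTEGER PRIMARY KEY, sex TEXT);
    CREATE TABLE dim_test    (test_key INTEGER PRIMARY KEY, test_name TEXT);
    -- The fact table holds the measurements and points to each dimension.
    CREATE TABLE fact_lab_result (
        patient_key INTEGER REFERENCES dim_patient(patient_key),
        test_key    INTEGER REFERENCES dim_test(test_key),
        value       REAL
    );
    INSERT INTO dim_patient VALUES (1, 'F'), (2, 'M'), (3, 'F');
    INSERT INTO dim_test VALUES (1, 'hba1c');
    INSERT INTO fact_lab_result VALUES (1, 1, 8.0), (2, 1, 6.0), (3, 1, 7.0);
""")

# Typical warehouse query: join the fact table to its dimensions and aggregate.
summary = con.execute("""
    SELECT p.sex, t.test_name, COUNT(*), AVG(f.value)
    FROM fact_lab_result AS f
    JOIN dim_patient AS p ON p.patient_key = f.patient_key
    JOIN dim_test AS t ON t.test_key = f.test_key
    GROUP BY p.sex, t.test_name
    ORDER BY p.sex, t.test_name
""").fetchall()
print(summary)
```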

  19. Data Warehousing: Practical Considerations • After a warehouse is deployed, the need for custom reports may increase rather than decrease. • The critical requirement for effective ad hoc query is a comprehensive understanding of the data. This is generally a full-time effort.

  20. Special Considerations: Querying of Clinical Data • Both EMRs and large-scale CRISs typically store clinical data in Entity-Attribute-Value (EAV) form • 100,000s of clinical parameters exist across all medical domains. • The vast majority of parameters will be inapplicable for a particular subject/patient. • EAV is a triple: Entity=Patient+point in time, Attribute=Parameter, Value=value of that parameter. • EPIC Flowsheet data uses EAV.
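
The EAV triple described on this slide can be shown with a handful of invented rows. Each row stores only a parameter that was actually recorded, so inapplicable parameters take no space; analysis usually needs the rows pivoted back into one record per entity. The parameter names and the `pivot` helper are hypothetical.

```python
from collections import defaultdict

# Each row: entity (patient + point in time), attribute (parameter), value.
eav_rows = [
    ("p1", "2024-01-10", "systolic_bp", 142),
    ("p1", "2024-01-10", "heart_rate", 88),
    ("p2", "2024-01-11", "systolic_bp", 118),
]

def pivot(rows):
    """Turn EAV rows into one dict of parameters per (patient, time) entity."""
    wide = defaultdict(dict)
    for patient, when, attribute, value in rows:
        wide[(patient, when)][attribute] = value
    return dict(wide)

wide = pivot(eav_rows)
print(wide[("p1", "2024-01-10")])  # {'systolic_bp': 142, 'heart_rate': 88}
```

Note that the record for `p2` simply has no `heart_rate` key, which is how sparse data shows up after pivoting.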

  21. Standardization • The mere presence of structure does not solve all problems • Synonyms in narrative text are unavoidable; they must be reduced to the same concept. Controlled medical vocabularies (UMLS) help. • UMLS is not a panacea • Institutions will therefore evolve their internal controlled vocabularies.
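
Reducing synonyms to a single concept amounts to a lookup from normalized surface text to a concept identifier. The sketch below uses made-up local identifiers (`LOCAL:0001` and so on) rather than real UMLS concept codes, and a tiny hand-written synonym table.

```python
# Hypothetical internal vocabulary: surface form -> local concept identifier.
SYNONYM_TO_CONCEPT = {
    "heart attack": "LOCAL:0001",
    "myocardial infarction": "LOCAL:0001",
    "mi": "LOCAL:0001",
    "high blood pressure": "LOCAL:0002",
    "hypertension": "LOCAL:0002",
}

def normalize(term):
    """Map a term to its concept identifier, ignoring case and extra whitespace."""
    key = " ".join(term.lower().split())
    return SYNONYM_TO_CONCEPT.get(key)  # None when the term is not in the vocabulary

print(normalize("Heart  Attack") == normalize("myocardial infarction"))  # True
print(normalize("migraine"))  # None
```

Short forms such as "MI" are ambiguous in real notes, which is one reason a lookup table alone is not sufficient.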

  22. Standardization Considerations • Standardizing your definitions • 2nd Law of Thermodynamics • Poor definition quality becomes a problem if pooled-data (or meta-) analysis is intended. • Features of certain systems predispose to disorder. (Learn As You Go, separate definitions databases.) • Even the best system is not immune – path of least resistance. • Consistent definition is difficult to achieve after the fact – Deming.

  23. EMR use as the basis for research hypotheses • Conflicting evidence regarding EMR benefit still appears. • A *well designed* EMR may benefit. • Electronic Alerting Systems themselves may not improve care, unless EMRs also reduce workload through automatic actions. • Review vendor-supplied templates carefully.

  24. Conclusions: Future EMR Evolution • EMRs fully supporting CRIS capability are unlikely to evolve. • No software should attempt to do everything • Differences in storage-engine capabilities • Jack-of-all-trades approach (doing everything in a mediocre manner) is not viable. • Difficult (or impossible) to devise a logically consistent user-interface metaphor that applies to diverse unrelated features. • Example of Microsoft Office.

  25. Inter-operation (1) • Co-existing and Inter-operating best-of-breed packages offer the best usability and feature-set • CRISs, Genomic / Proteomic Data Management Packages • There may be minimal data duplication, e.g., EMRs may pull in very limited summary information on critical genetic data for selected patients, so that it is immediately visible.

  26. Inter-operation (2) • CRIS/EMR • Bulk import of laboratory parameters, to avoid duplicate data entry • Automatic grading of laboratory-based adverse events (oncology studies) – Richesson et al. • Use for scheduling research subject visits • Pharmacy subsystem for drug dispensation • EMR for primary EDC in intra-institutional studies if the only alternative is paper, or if data-entry would otherwise be duplicated. • EMR/Specialized EMR • Picture-archiving systems
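
Automatic grading of a laboratory-based adverse event can be sketched as a threshold lookup on an imported lab value. The cutoffs below, expressed as multiples of the upper limit of normal (ULN), are illustrative placeholders only; a real study would take them from the grading criteria named in its protocol.

```python
import bisect

# Illustrative cutoffs in multiples of ULN (assumed, not taken from any standard):
# ratio <= 1.0 -> grade 0, <= 3.0 -> grade 1, <= 5.0 -> grade 2,
# <= 20.0 -> grade 3, above 20.0 -> grade 4.
GRADE_CUTOFFS = [1.0, 3.0, 5.0, 20.0]

def grade_lab_value(value, upper_limit_normal):
    """Return an adverse-event grade (0-4) for a lab value that rises with toxicity."""
    ratio = value / upper_limit_normal
    # bisect_left counts the cutoffs strictly below the ratio, which is the grade.
    return bisect.bisect_left(GRADE_CUTOFFS, ratio)

# Hypothetical imported results for a test whose ULN is 40 units.
for value in (40, 100, 150, 1000):
    print(value, grade_lab_value(value, 40))
```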

  27. Inter-operation (3) • Application Programming Interfaces (APIs) • All large packages – CRISs, EMRs, ‘Omics – require APIs to make inter-operation efficient • APIs are vendor-specific. Inter-operation standards (e.g., the HL7 Virtual medical record) have not received much traction. • Currently, many vendors set unreasonable financial and other barriers to use of their APIs (e.g., official certification, withholding of documentation). • EMRs lag in the software industry’s trend toward open-source.

  28. Questions?

  29. Further reading • CRIS • Richesson and Andrews, Clinical Research Informatics, 2012 (Springer) • NLP • Jurafsky and Martin: Natural Language Processing • Manning and Schuetze: Foundations of Statistical Natural Language Processing • Nadkarni, Ohno-Machado and Chapman: Natural Language Processing: An Introduction. Journal of the American Medical Informatics Association 2011. • Data Warehousing • Larry Greenfield. The Data Warehousing Information Center. www.dwinfocenter.org/ • Kimball, Reeves, Ross and Thornthwaite. The Data Warehouse Lifecycle Toolkit : Expert Methods for Designing, Developing, and Deploying Data Warehouses. Wiley, 1998.
