A Formal Representation for Numerical Data Presented in Published Clinical Trial Reports

Presentation Transcript


  1. A Formal Representation for Numerical Data Presented in Published Clinical Trial Reports • Maurine Tong BS, William Hsu PhD, Ricky K Taira PhD • Medical Imaging Informatics Group, University of California, Los Angeles

  2. Problem: Querying Free Text CTRs [Diagram labels: Clinical Trial Reports (CTRs); Informatics Applications; Patient Recruitment; Query Processor; Internal/External Validity Testing; Disease Modeling; Representation]

  3. Why Focus on Numerical Info • Predictive disease modeling • Ex: Bayesian Belief Networks • Key to identifying trial quality • Hypothesis testing context and measures • Key to synthesizing evidence • What is the context for reported probabilities? • P(effect | cause, context) [Diagram labels: Patient Recruitment; Internal Validity; Disease Modeling]

  4. Background and Prior Work • Ontologies for Experiments and Clinical Trials • Ontology of Clinical Research (OCRe) Sim et al. • Ontology of Scientific Experiments (EXPO) Soldatova et al. • Standardizing and sharing clinical trial data • BRIDG, CDISC, SNOMED CT • Representing individual sections of a clinical trial report • Eligibility criteria: EliXR, Weng et al. • Scientific claims: Blake et al. These systems primarily help to improve patient recruitment. Our focus is on modeling numerical information for quality assessment and disease modeling

  5. Problem: Fragmentation

  6. Methods: Requirements Analysis • What are the queries to be supported by the representation? Study Quality Disease Modeling

  7. Methods: Requirements Analysis (Study Quality) • Study quality queries • What is the p-value (population parameter associated with hypothesis)? • What is the statistical test used to calculate the p-value? • What is the power of the sample size tested? • … Consulted textbooks and experts • James Sayre, PhD • Biostatistician
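The study quality queries above could be answered from a small record per reported test. The following is a minimal sketch only: the `HypothesisTest` class, its field names, and the example values are hypothetical and are not part of the presented representation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HypothesisTest:
    """One reported statistical test (all field names are illustrative)."""
    hypothesis: str
    test_name: Optional[str] = None   # statistical test used for the p-value
    p_value: Optional[float] = None
    power: Optional[float] = None


def answer_quality_queries(test: HypothesisTest) -> dict:
    """Answer the three study-quality queries, flagging unreported values."""
    def reported(value):
        return value if value is not None else "not reported"

    return {
        "p-value": reported(test.p_value),
        "statistical test": reported(test.test_name),
        "power": reported(test.power),
    }


# Hypothetical values, not taken from any trial report.
example = HypothesisTest(
    hypothesis="PFS differs between arms",
    test_name="log-rank test",
    p_value=0.03,
)
answers = answer_quality_queries(example)
```

Keeping unreported values explicit ("not reported") rather than dropping them makes gaps in a report visible to the person asking the query.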

  8. Methods: Requirements Analysis (Disease Modeling) • Disease modeling queries • What are the prior probabilities? • Can we estimate posterior probabilities from p-values or other reported information? • … Consulted experts, textbooks and literature • Thomas Belin, PhD • Biostatistician
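For orientation, the relationship between prior and posterior probabilities can be sketched with Bayes' rule. This is an illustration under stated assumptions, not the authors' method: the function name and the numbers are hypothetical, and a p-value by itself is not one of the likelihoods the function needs, so this does not answer the open question on the slide.

```python
def posterior(prior: float,
              p_evidence_given_h: float,
              p_evidence_given_not_h: float) -> float:
    """Bayes' rule: P(H | E) from P(H), P(E | H) and P(E | not H)."""
    evidence = (p_evidence_given_h * prior
                + p_evidence_given_not_h * (1.0 - prior))
    if evidence == 0:
        raise ValueError("evidence has zero probability under both hypotheses")
    return p_evidence_given_h * prior / evidence


# Hypothetical inputs: prior 0.2, P(E | H) = 0.6, P(E | not H) = 0.1.
result = posterior(0.2, 0.6, 0.1)
```

With these inputs the evidence term is 0.12 + 0.08 = 0.20, so the posterior is 0.12 / 0.20 = 0.6.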

  9. Methods: Initial Design • Conceptual model of representation • Domain: Metastatic Melanoma Flaherty KT. et al. N Engl J Med. 2010 Aug 26;363(9):809-19

  10. [Figure: layout of the representation. Column labels: Pop. Stats; Sample Pop.; Intervention; Baseline Measurements; …]

  11. [Figure: same layout, with component A labeled: Process Model]

  12. [Figure: same layout, with component B labeled: Global Variable List]

  13. [Figure: same layout, with component C labeled: Variable Characterization]

  14. [Figure: same layout, with component D labeled: Statistical Hypothesis Testing]

  15. Results: Implementation

  16. Example 1: Capturing context • Demonstration of how the representation captures context for the observations of an intervention group. • Query • Domain: Lung Cancer • In Johnson et al., what is the context (e.g., intervention, population characteristics, measurement methodology) associated with progression free survival (PFS) in the high dose group (HDG)? Johnson DH. et al. J Clin Oncol. 2004 Jun 1;22(11):2184-91.

  17. Steps to Capture Context • 1. Find the node in the process model • 2. Find the corresponding column • 3. Find the variable of interest • 4. Backtrack through the process model to obtain context for the observations, and get the data associated with each backtracked node • 5. Construct a logical representation of the context • 6. Repeat steps 4-5 until the start node is reached
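The backtracking part of these steps can be sketched as a walk over predecessor links. This is a minimal sketch under assumptions: the `Node` class and `collect_context` function are hypothetical, and the links between the example nodes are invented for illustration (only the node numbers and labels echo the slides).

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A process-model node (hypothetical structure)."""
    node_id: int
    label: str
    data: dict = field(default_factory=dict)
    predecessors: list = field(default_factory=list)  # ids of upstream nodes


def collect_context(nodes: dict, start_id: int) -> list:
    """Walk backwards from start_id, gathering data from every upstream node."""
    context = []
    seen = set()
    stack = list(nodes[start_id].predecessors)
    while stack:
        node_id = stack.pop()
        if node_id in seen:
            continue  # a node reachable by two paths is reported once
        seen.add(node_id)
        node = nodes[node_id]
        context.append((node.node_id, node.label, node.data))
        stack.extend(node.predecessors)
    return context


# Assumed linkage: eligibility -> intervention -> outcome.
nodes = {
    847: Node(847, "Stage 3 Recurrent NSCLC"),
    474: Node(474, "Bevacizumab", {"dose": "15 mg/kg"}, [847]),
    740: Node(740, "Outcome node", {}, [474]),
}
context = collect_context(nodes, 740)
context_ids = [node_id for node_id, _, _ in context]
```

The walk stops on its own when it reaches nodes with no predecessors, which corresponds to "repeat until the start node".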

  18. Step 1: Find the node in process model This node represents the progression free survival time point for high dose group.

  19. Step 2: Find corresponding column This column represents the numerical data and data elements associated with this node

  20. Step 3: Find variable of interest

  21. Step 4: Backtrack & Obtain Data Obtain context by looking at linked nodes in process model

  22. Step 5: Construct logical context • Cell name: Bevacizumab • Cell Location #: 474 • Drug: Bevacizumab • Dose: 15 mg/kg • How was it administered: Vehicle: Intravenous infusion; Duration: Over 90 minutes; Cycle: 3 weeks; Maximum dose: 18 doses • Exception: Well tolerated • Resulting Action: New duration; Duration: 30-60 minutes • Data modeling is straightforward from the semantics of process model links and nodes
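One way to hold this cell's data is a nested record. The sketch below is an assumption-laden illustration: the key names and the `infusion_duration` helper are hypothetical, and it reads the "Exception" entry as "if the infusion is well tolerated, the new duration applies", which the slide implies but does not state outright.

```python
# Values are copied from the slide; the key names are illustrative.
bevacizumab_cell = {
    "cell_name": "Bevacizumab",
    "cell_location": 474,
    "drug": "Bevacizumab",
    "dose": "15 mg/kg",
    "administration": {
        "vehicle": "Intravenous infusion",
        "duration": "Over 90 minutes",
        "cycle": "3 weeks",
        "maximum_dose": "18 doses",
    },
    "exception": {
        "condition": "Well tolerated",
        "resulting_action": "New duration",
        "duration": "30-60 minutes",
    },
}


def infusion_duration(cell: dict, well_tolerated: bool) -> str:
    """Return the duration that applies, given the exception condition."""
    if well_tolerated:
        return cell["exception"]["duration"]
    return cell["administration"]["duration"]
```

For example, `infusion_duration(bevacizumab_cell, True)` returns the shortened duration, while `False` returns the default one.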

  23. Step 6: Repeat steps 4-5 until start • Continue backtracking through process model • Aggregate associated data • Repeat until first node • Context for Adverse Event (Node #740): • Name of n847

  24. Example 1: Capturing context • Demonstration of how the representation captures context for the observations of an intervention group. • Query • What is the context (e.g., intervention, population characteristics, measurement methodology) associated with progression free survival (PFS) in the high dose group?

  25. Example 1: Capturing context • Data: • AssociatedContext: Context for Adverse Event (Node #740): • 1) INTERVENTION: • Bevacizumab (Node #474) • 2) POPULATION CHARACTERISTICS: • High Dose Bev (Arm #3) • Eligibility Criteria: • Stage 3 Recurrent NSCLC (Node #847) • No Prior Chemotherapy (Node #628) • Other criteria (Node #748) • Baseline characteristics of the patient (Node #222) • 3) METHODS: Progression Free Survival

  26. Example 2: Comparisons • Comparison of outcomes in the intervention vs. control arms • Query • Compare PFS for intervention and control arm • Context from two nodes can be placed on the same chart

  27. Example 3: Analyses • How was the p-value calculated? • Visualization includes: • Data • Test Statistics • P-value • Statement

  28. Pilot Evaluation • Can representation answer user queries from requirements analysis? • Preliminary evaluation questions • Characteristics of the trial • Quality of the trial • Significance of the science

  29. Evaluation: Objectives • Objective 1 • Utility of the representation to accurately identify numerical data to support key contributions made by a clinical trial report • Objective 2 • Intuitiveness of the representation through reproducibility of the visualization by different users

  30. Evaluation: Study Design • Study design • 2-arm study • Status quo group using paper copy • Intervention group using proposed representation • Participants (n=6) • Graduate students in biology, biostatistics, informatics, or engineering • Statistical methods • Student’s paired t-test • Gold standard • Established by graduate student supervised by domain expert • 4 clinical trial papers in NSCLC • J Clin Oncol. 2004 Jun 1;22(11):2184-91. • J Clin Oncol. 2008 May 20;26(15):2442-9. • Lancet Oncol. 2012 Jan;13(1):33-42. • J Clin Oncol. 2011 Nov 1;29(31):4113-20.
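As a reminder of what a paired t-test computes, the test statistic can be sketched with the standard library. This is an illustration only: the scores are hypothetical, the function name is invented, and the slide does not state how observations were paired across the two arms.

```python
import math
import statistics


def paired_t_statistic(a: list, b: list) -> tuple:
    """Return (t, degrees of freedom) for paired samples a and b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("all paired differences are identical; t is undefined")
    t = mean_diff / (sd_diff / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Hypothetical paired scores, not data from the evaluation.
with_representation = [5, 6, 7, 8]
with_paper_copy = [4, 4, 6, 6]
t_value, dof = paired_t_statistic(with_representation, with_paper_copy)
```

Converting the statistic to a p-value requires the t distribution with `dof` degrees of freedom, which is outside the standard library and is left out of this sketch.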

  31. Evaluation: Questions • What is the purpose of this trial? • What is the sample size for each experimental arm? • How was the primary outcome assessed? • How many patients experienced positive outcomes in this trial? • How was the data analyzed?

  32. Evaluation: Results • Users of the representation were able to accurately identify numerical data that support key contributions, as compared with the status quo • User visualizations were reproducible • 68.1% ± 6.45% of the gold standard was reproduced by users

  33. Discussion • Our work supports queries related to study quality and disease modeling • We developed a representation to associate appropriate context with numerical data within clinical trial reports • The pilot evaluation shows that the utility of the representation is promising • To extend this work: • Instantiate using automatic methods and capture numerical data using NLP methods • Develop an interface to support frequently asked queries for specific clinical trial reports • Test in a journal club setting

  34. Conclusion • We are establishing a systematic way of extracting information from clinical trial reports in a machine-understandable way • The overarching objective is to have a computer reason on this representation to facilitate clinical decision making

  35. Acknowledgements • James Sayre, PhD, Biostatistician • Domain experts • Research participants • NLM Training Grant • NLM R01-LM009961

  36. Thank You
