Advancements in Risk Prediction Tools Using QResearch: Insights from Prof. Julia Hippisley-Cox

Using QResearch for development & validation of risk prediction tools Prof Julia Hippisley-Cox, University of Nottingham, 5th Sept 2013

My roles & interests • Professor Clinical Epidemiology & GP University Nottingham • NHS General Practitioner • Member Confidentiality Advisory Committee (s251) • Director QResearch & QSurveillance (EMIS/Notts) • Director ClinRisk Ltd (medical software) • Member EMIS National User Group • UoN also license holder THIN, CPRD, HES, ONS datasets

Acknowledgements • Co-authors Drs Carol Coupland, Peter Brindle, John Robson • QResearch database • University of Nottingham • EMIS & contributing practices & user group • ClinRisk Ltd (software) • Oxford University (independent validation, Prof Altman’s team)

Outline • QResearch database +linked data • General approach to risk prediction • QRISK2 • QIntervention • Any questions

QResearch Database • One of the world's largest and richest research databases • Over 700 general practices across the UK, 14 million patients • Joint NFP venture between EMIS (largest GP supplier > 55% practices) and University of Nottingham • Patient level pseudonymised database for research • Available for peer reviewed academic research where outputs made publicly available • Data from 1989 to present day.

Information on QResearch – GP derived data • Demographic data – age, sex, ethnicity, SHA, deprivation • Diagnoses • Clinical values – blood pressure, body mass index • Laboratory tests – FBC, U&E, LFTs etc • Prescribed medication – drug, dose, duration, frequency, route • Referrals • Consultations

QResearch Data Linkage Project • QResearch database already linked to • deprivation data in 2002 • cause of death data in 2007 • Very useful for research • better definition & capture of outcomes • Health inequality analysis • Improved performance of QRISK2 and similar scores • Developed new open source technique for data linkage using pseudonymised data

www.openpseudonymiser.org • Scrambles NHS number BEFORE extraction from clinical system • Takes NHS number + project specific encrypted ‘salt code’ • One way hashing algorithm (SHA2-256) • Cannot be reverse engineered • Applied twice in two separate locations before data leaves source • Apply identical software to external dataset • Allows two pseudonymised datasets to be linked • Open source – free for all to use
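The salted one-way hashing idea on this slide can be illustrated with a minimal sketch. It is not the OpenPseudonymiser implementation: the function name `pseudonymise`, the salt value, the plain concatenation of number and salt, and the example records are all assumptions made for illustration, and the "applied twice in two separate locations" step is left out.

```python
import hashlib

def pseudonymise(nhs_number: str, salt: str) -> str:
    """Return a hex SHA-256 digest of an NHS number combined with a project salt."""
    # Strip spaces and punctuation so differently formatted copies hash identically.
    cleaned = "".join(ch for ch in nhs_number if ch.isdigit())
    # Assumption: simple concatenation; the real tool's input format may differ.
    return hashlib.sha256((cleaned + salt).encode("utf-8")).hexdigest()

# Hypothetical salt; the slide describes it as project specific and encrypted.
PROJECT_SALT = "example-project-salt"

# Two made-up datasets keyed by the pseudonym rather than the NHS number.
gp_records = {pseudonymise("943 476 5919", PROJECT_SALT): {"sbp": 142}}
death_records = {pseudonymise("9434765919", PROJECT_SALT): {"cause": "I21"}}

# Linkage is a join on the pseudonym; neither side sees the original number.
linked = {
    key: {**gp_records[key], **death_records[key]}
    for key in gp_records.keys() & death_records.keys()
}
```

Because both datasets are hashed with the same software and the same salt, matching digests identify the same patient, while a different project salt would produce unrelated digests.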

QResearch Database + data linked in 2013

QPrediction Scores: a new family of risk prediction tools • Individual assessment • Who is most at risk of preventable disease? • What is level of that risk and how does it compare? • Who is likely to benefit from interventions? • What is the balance of risks and benefits for my patient? • Enable informed consent and shared decisions • Population level • Risk stratification • Identification of rank ordered list of patients for recall or reassurance • GP systems integration • Allow updates of the tool over time, audit of impact on services and outcomes

Clinical Research Cycle

Criteria for choosing clinical outcomes • Major cause morbidity & mortality • Represents real clinical need • Related intervention which can be targeted • Related to national priorities (ideally) • Necessary data in clinical record • Help inform decisions at the point of care • Can be implemented into everyday clinical practice

Change in research question • Leads to • Novel application of existing methods • Development of new methods • Better utilisation different data sources • Leads to • Lively academic debate! • Changes in policy and guidance • New utilities to implement research findings • (hopefully) Better patient care

Published & validated scores

Primary prevention CVD (slide from NICE website) • Offer information about: • absolute risk of vascular disease • absolute benefits/harms of an intervention • Information should: • present individualised risk/benefit scenarios • present absolute risk of events numerically • use appropriate diagrams and text

Challenge: to develop a new CVD risk score for use in UK • New cardiovascular disease risk score • Calibrated to UK population • Use routinely collected GP data • Include additional known risk factors (eg family history, deprivation) • Better calibration and discrimination than US derived Framingham score

Why a new CVD risk score? • Framingham has many strengths but some limitations: • Small cohort (5,000 patients) from one American town • Almost entirely white • Developed during peak incidence CVD in US • Doesn’t include certain risk factors (body mass index, family history, blood pressure treatment, deprivation) • Over predicts CVD risk by up to 50% in European populations • Underestimates risk in patients from deprived areas

Derivation of QRISK2 Score • Derivation cohort • 355 practices; 1,591,209 patients; 96,709 events • Traditional risk factors • Additional risk factors: • ethnic group • type 2 diabetes, treated hypertension, rheumatoid arthritis, renal disease, atrial fibrillation • Interactions with age • Source: J Hippisley-Cox, C Coupland, et al. Predicting cardiovascular risk in England and Wales: prospective derivation and validation of QRISK2. BMJ 2008; 336: 1475-1482

Model Derivation • Separate models in males and females • Cox regression analysis • Fractional polynomials to model non-linear risk relationships • Multiple imputation of missing values
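As an illustration of the fractional polynomial bullet, the sketch below builds the transformed terms for one continuous predictor using the conventional power set, where power 0 stands for the natural log and a repeated power adds a log-multiplied term. The function names are hypothetical, and the slides do not say which powers were selected for QRISK2.

```python
import math

# Conventional candidate powers for fractional polynomials; 0 denotes ln(x).
FP_POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)

def fp_term(x: float, power: float) -> float:
    """Single fractional polynomial term for a strictly positive predictor."""
    if x <= 0:
        raise ValueError("fractional polynomials require x > 0")
    return math.log(x) if power == 0 else x ** power

def fp2_terms(x: float, p1: float, p2: float) -> tuple:
    """Two-term (second-degree) fractional polynomial transform of x."""
    first = fp_term(x, p1)
    # A repeated power uses x^p * ln(x) as the second term.
    second = first * math.log(x) if p1 == p2 else fp_term(x, p2)
    return first, second

# Example: transform a scaled predictor with powers (0.5, 2).
example = fp2_terms(4.0, 0.5, 2)
```

The transformed terms would then enter the Cox model as ordinary covariates, which is how a non-linear relationship between a predictor such as age and risk can be represented.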

Validation • Separate sample of 176 QResearch practices; 750,232 patients; 43,396 events • Validation statistics (for survival data) • D statistic [1] (discrimination) • R squared (% variation explained) • Predicted vs. observed CVD events • Clinical impact in terms of reclassification of patients into high/low risk • [1] Royston and Sauerbrei. A new measure of prognostic separation in survival data. Stat Med 2004; 23: 723-748.
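The two survival statistics listed above are related. Assuming the R squared reported is the D-based measure described by Royston and Sauerbrei for proportional hazards models, it can be obtained from D as sketched below. The function name and the D value are illustrative only and are not results from the study.

```python
import math

def r_squared_from_d(d_statistic: float) -> float:
    """Explained variation implied by a D statistic under a Cox model."""
    kappa_sq = 8.0 / math.pi          # scaling constant kappa^2 used to define D
    sigma_sq = math.pi ** 2 / 6.0     # error variance term for proportional hazards
    scaled = d_statistic ** 2 / kappa_sq
    return scaled / (sigma_sq + scaled)

# Illustrative value only: a D of 1.8 corresponds to roughly 44% of variation explained.
illustrative_r2 = r_squared_from_d(1.8)
```

Higher values of D indicate greater separation between patients with high and low predicted risk, so R squared rises with D.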

Calculation of risk scores • Risk scores calculated in validation dataset • Risk score calculation: • Used coefficients for risk factors obtained from Cox model using multiple imputed data • Combined these with patient characteristics in validation data to give prognostic index • Combined with baseline survival function estimated at 10 years to give estimated risk of CVD at 10 years for each person
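The calculation described above follows the standard Cox model form, risk = 1 - S0(10)^exp(PI), where PI is the prognostic index and S0(10) is the baseline survival at 10 years. A minimal sketch follows; the coefficients, predictor names and baseline survival are invented for illustration and are not the QRISK2 values, and predictors are assumed to be centred so that a prognostic index of 0 corresponds to the baseline patient.

```python
import math

def ten_year_risk(coefficients: dict, patient: dict, baseline_survival_10y: float) -> float:
    """Estimated 10-year risk from Cox coefficients and a baseline survival."""
    # Prognostic index: sum of coefficient times the patient's (centred) value.
    prognostic_index = sum(beta * patient[name] for name, beta in coefficients.items())
    # Cox model: S(t | x) = S0(t) ** exp(PI), so risk = 1 - S(t | x).
    return 1.0 - baseline_survival_10y ** math.exp(prognostic_index)

# Hypothetical coefficients and baseline survival, not taken from QRISK2.
COEFFICIENTS = {"age_centred": 0.05, "smoker": 0.5}
BASELINE_SURVIVAL_10Y = 0.98

reference_risk = ten_year_risk(COEFFICIENTS, {"age_centred": 0, "smoker": 0}, BASELINE_SURVIVAL_10Y)
smoker_risk = ten_year_risk(COEFFICIENTS, {"age_centred": 10, "smoker": 1}, BASELINE_SURVIVAL_10Y)
```

In this made-up example the reference patient has a 2% risk, and the older smoker, with a prognostic index of 1, has a risk of about 5.3%.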

Validation statistics Hippisley-Cox J et al. BMJ 2008;336:1475-1482

External validation using THIN database • Additional validation carried out using the THIN database • Based on practices in UK using Vision system • One validation carried out by QRISK authors • Hippisley-Cox J et al. The performance of the QRISK cardiovascular risk prediction algorithm in an independent UK sample of patients from general practice: a validation study. Heart 2007:hrt.2007.134890. • An independent validation carried out by a separate group • Collins GS, Altman DG. An independent and external validation of QRISK2 cardiovascular disease risk score: a prospective open cohort study. BMJ 2010;340:c2442

External validation using THIN database Collins GS, Altman DG. An independent and external validation of QRISK2 cardiovascular disease risk score: a prospective open cohort study. BMJ 2010;340:c2442

QRISK2 web calculator: www.qrisk.org

QRISK2 web calculator

Annual updates to QRISK2 • Reasoning: • Changes in population characteristics – • e.g. incidence of cardiovascular disease is falling; obesity is rising; smoking rates are falling • Improvements in data quality - recording of predictors and clinical outcomes becomes more complete over time (e.g. ethnic group now 50%). • Inclusion of new risk factors • Changes in requirements for how the risk prediction scores can be used - e.g. changes in age ranges.

QRISK2 in national guidelines

QRISK2 in clinical settings

Risks and Benefits of Statins • Two recent papers: • Unintended effects of statins (Hippisley-Cox & Coupland, BMJ, 2010) • Individualising Risks & Benefits of Statins (Hippisley-Cox & Coupland, Heart, 2010) • Conclusions: • New tools to quantify likely benefit from statins • New tools to identify patients who might get rare adverse effects, e.g. myopathy, for closer monitoring

QIntervention: www.qintervention.org

Thank you for listening • Questions & Discussion
