Start with the basics: pilot evaluation of eHealth services

Jeremy Wyatt • Visiting professor, KIK, AMC Amsterdam • Associate director of R&D, NICE, London • jw@nice.nhs.uk

Overview • What is eHealth ? • What is evaluation and why do we evaluate ? • What to measure in pilot eHealth studies ? • Case study: NHS Clinical Enquiry Service • Summary & conclusions

What is eHealth ? “Using the internet and other electronic media to disseminate or provide access to health and lifestyle information or services” Cf. Telemedicine – implies a health professional at one or both ends

Kinds of eHealth • Information access, dissemination • Services: • Risk assessment – stroke risk scores • Support – online cancer communities • Triage – NHS Clinical Enquiry Service • Clinical advice - eMed • Supplies – online pharmacies, testing devices • Virtual data management – NHS HealthSpace

What is evaluation ? • Describing or measuring something • Usually with a purpose – making a decision, answering a question • Implies a set of criteria or judgements to be made (eg. option appraisal) - but may just be data collection and analysis

Evaluation as an information-generating process • 1. Formulate question • 2. Design study • 3. Collect data, analyse results • 4. Make your decision

Evaluation principles • Aim to generate relevant information to support decisions throughout the project • Stakeholders ask questions, evaluators formalise them • Methods depend on the question & the reliability of answer needed (not on the technology): • Qualitative methods describe perceptions, barriers, needs, why things (do not) work, teams, relationships... • Quantitative methods measure how much, how often, eg. data quality, system use, change in clinical actions • Challenge: titrating evaluation methods to the resources available & the reliability of answer required

Stages of evaluation

What to measure in a pilot study ?

Development project risks • Getting key assumptions wrong – eg. about need for system • Wasting development resources • Wasting evaluation resources – eg. doing large scale trials on unsafe, infeasible or unacceptable systems

Pilot studies Aims: • to manage project risks • review and improve a prototype system or service Related concepts: • usability testing, formative testing, “reality check” Possible outcomes: • Information helps us improve the prototype, design summative evaluation studies • Decide to radically change prototype and re-pilot • Decide to cancel development and any larger scale tests

Risks of eHealth systems / services • Harm: • to intended users • to unintended / inappropriate users • Lack of feasibility: • wastes health professional time, money • beneficiaries unclear, too small a group • ineffective • Unacceptable to target users: • users dislike the idea • users dislike the reality (unreliable, slow, clunky, not private…) • Cyber-divide in target user group

What to measure in eHealth pilots ? • Safety: • For intended users • Fail safe with inappropriate users • Feasibility: • Realistic health care resources are needed • Clear who is likely to benefit • Is usable, promises to be effective • Acceptability to target users: • Positive attitudes to the idea of the service • Positive comments, behaviour after using system • No major cyber-divide (age, gender, education…)

Measuring safety Measures: • Accuracy of advice against a gold standard • Kinds, frequency, severity of errors made • Data loss, distortion • Threats to privacy Methods: • Site studies in a safe environment (eg. GP surgery) • Exclude those who could be harmed, or ensure adequate follow up (eg. face to face encounter)

Measuring feasibility Measures: • Likely health care resources needed: staff training needs, time, test results etc. • Who is the population likely to benefit: absolute numbers, cyber divide ? • Usability, promise of effectiveness

Checking usability Who to study: 5-10 typical target users (Nielsen, www.useit.com) Setting: lab / classroom Measures: • Can the users understand what it is for ? • Can the users navigate around the system ? • Can they use the system to help them complete well / poorly specified tasks ? • What is their success rate & what errors do they make ? • What comments do they have about it ?
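The success-rate measure above can be tallied with very little code. The following is a minimal sketch only: the function name, the data layout (one list of pass/fail outcomes per user) and the three example users are all hypothetical and are not data from any study described here.

```python
def success_rates(results):
    """Return (per-task success rates, overall success rate).

    results maps a user id to a list of booleans, one per task,
    where True means the user completed that task.
    """
    if not results:
        raise ValueError("no usability results supplied")
    n_tasks = len(next(iter(results.values())))
    if any(len(outcomes) != n_tasks for outcomes in results.values()):
        raise ValueError("every user must attempt the same tasks")
    n_users = len(results)
    per_task = [
        sum(outcomes[t] for outcomes in results.values()) / n_users
        for t in range(n_tasks)
    ]
    overall = sum(sum(outcomes) for outcomes in results.values()) / (n_users * n_tasks)
    return per_task, overall


# Hypothetical observations: 3 users, 4 tasks each (illustration only).
observed = {
    "user1": [True, True, False, True],
    "user2": [True, False, False, True],
    "user3": [True, True, True, True],
}
per_task, overall = success_rates(observed)
print(per_task, overall)
```

A low rate on a single task points at a specific part of the interface to fix, which is more useful at the pilot stage than the overall figure alone.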

HealthSpace patient-managed data NHSDirect Healthspace, www.nhsdirect.nhs.uk • Secure, web-based record for patients • Calendar with appt reminders via SMS, email… • Patients can grant access to GP if they want • Health news feeds • Portal to their own official EHR (Dec 2004) • Will allow data import from chronic disease monitoring devices, etc.

HealthSpace usability test • Phase 1: NHSDirect staff play with system • Phase 2: 18 patients with modest PC experience try to complete 10 well specified tasks • Phase 3: same patients try to complete 10 similar but less well specified tasks • Phase 4: limited remote testing by patients and their friends

Measuring acceptability • Users: those with no special training or experience • Setting: minimise experimenter bias • Measures: • Opinions eg. Users complete questionnaire after reading short description or a demonstration / video showing tool in use, then again after using it themselves • Concerns / fears about using it (focus groups) • Likely actions: would they use it in practice, recommend it to others ?

Validated measurement instruments • On-screen or paper questionnaires with closed and open-ended responses • Introduction, question wording & order, response wording & format will all influence the answers! • Need to pilot the instrument, check its reliability (repeatability) & validity (usefulness) • Use published instruments of reasonable reliability & validity where possible, preserving the original wording • Example: TeleMedicine Preparedness Questionnaire, developed to assess preparedness of elderly Americans for virtual home visits by nurses using teleconferencing hardware. Demiris G et al. A questionnaire for the assessment of patients’ impressions of the risks and benefits of home telecare. J Telemedicine & Telecare 2000; 6: 278-84

Kinds of evaluation study • Evaluation studies: qualitative studies, quantitative studies • Quantitative studies: measurement studies, demonstration studies • Measurement studies: reliability studies, validity studies • Demonstration studies: descriptive studies, correlational studies, comparative studies

Case study: NHS Clinical Enquiry Service

NHS Clinical Enquiry Service • NHSDirect: nurse triage by phone, used 8M times pa. • CES: pilot web chat alternative for people who are deaf, speech impaired, shy…

Study methods Pilot studies carried out Nov ’02-March ’03 (total consults so far c. 150): • Simulated problems (scenarios): 68 consults with NHSDirect staff, 16 consults with members of Patient Reference Group; 5 consults with deaf people; others with staff, visitors, GPs etc. • Real problems: 25 patients (79% PC literate, 61% typing ok) attending GP for new problem (correct disposition 30% GP 12hrs, 57% GP routine, 13% self care) in inner-city Coventry practice

Data capture methods • Validated Minnesota TM Preparedness Questionnaire used pre / post on PRG & Coventry participants • Exit survey, researcher-administered comment form, transcript analysis • 6 focus groups to capture patient comments • Form for GPs • Nurse reflective diaries + focus groups

CES results – from patient perspective

What happens during the consult

Patient satisfaction • Mean summary score (1-10) allocated by patients: 8.1 (SD 1.0) • 81% preferred CES to NHSD, 100% said they would recommend it to friends or relatives • > 90% of the patients answered 12 of 18 other questions positively, 80-89% for 4 and 60-80% for 2 • TMPQ scores (n=23): • Mean baseline TMPQ score 44/60, no correlation with age • Score rose by 10% after CES use to 48/60 (p=0.002); no difference in final score with age
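The size of the TMPQ change can be expressed in two ways, and the sketch below shows both from the rounded means quoted above (44/60 and 48/60). The function names are illustrative. Because the published means are rounded, the reported 10% may well have been computed from unrounded scores.

```python
def relative_rise(before, after):
    """Change as a fraction of the baseline score."""
    return (after - before) / before


def rise_as_share_of_scale(before, after, max_score):
    """Change as a fraction of the full questionnaire scale."""
    return (after - before) / max_score


# Rounded means quoted on the slide.
rel = relative_rise(44, 48)
scale = rise_as_share_of_scale(44, 48, 60)
print(f"Relative to baseline: {rel:.1%}")
print(f"As share of 60-point scale: {scale:.1%}")
```

Stating which denominator was used avoids ambiguity when a percentage change is reported for a bounded score.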

CES results - from NHSDO perspective

Cost and staff time Staff time: • Length of consults: 31 minutes (17-49), longer in young and old. No variation between 5 nurses • Nurse costs: assume nurse can do 10 consults per 8 hr shift and nurse cost per shift = £100 (£22k / 220 shifts pa.). • Nurse cost per consult approximately £10 Nurse training needs: • Entry requirements: nurse triage + say 12 weeks NHSDirect experience • Add say 12 weeks using web chat tools + internet slang + training in / experience of communicating with the profoundly deaf • Awaiting HP report
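The nurse cost estimate above is simple arithmetic, reproduced here as a small sketch so the assumptions can be varied. The inputs are the ones stated on the slide (£22k per year, 220 shifts per year, 10 consults per 8 hour shift, mean consult length 31 minutes); the function names are illustrative.

```python
def nurse_cost_per_consult(annual_cost_gbp, shifts_per_year, consults_per_shift):
    """Nurse cost per consult, from annual cost and assumed workload."""
    cost_per_shift = annual_cost_gbp / shifts_per_year
    return cost_per_shift / consults_per_shift


def consult_time_share(consults_per_shift, mean_consult_min, shift_hours):
    """Fraction of a shift spent in consults at the assumed workload."""
    return consults_per_shift * mean_consult_min / (shift_hours * 60)


cost = nurse_cost_per_consult(22_000, 220, 10)
share = consult_time_share(10, 31, 8)
print(f"Nurse cost per consult: £{cost:.2f}")
print(f"Share of shift spent in consults: {share:.0%}")
```

At 31 minutes per consult, 10 consults occupy 310 of the 480 minutes in a shift, so the 10-per-shift assumption leaves time for breaks and administration.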

Strategic issues for NHSDO Fit: CES seems a logical replacement for: • The NHSD textphone service for the deaf used c. 26 times per week, cf. 1000 expected (source: Craig Murray) • Clients requesting personal advice from the Online Enquiry Service (20% - c. 120 per week) Safety & risk exposure • Nurses erred on side of safety, directing 10/21 patients (48%) to more intensive dispositions. • They directed 2 patients to less urgent dispositions (GP 2 weeks instead of 36 hrs, GP 36 hrs instead of 12 hrs) with no likely clinical consequences [details awaited]. • Overall, 76% of patients were directed to a disposition within 1 CAS category of GP’s advised disposition; remainder false positives.
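The safety figures above compare each nurse disposition with the GP's advised disposition. A minimal sketch of that comparison follows, assuming dispositions are coded as ordinal urgency levels (higher number = more urgent). The coding, the function name and the five example pairs are hypothetical and are not the study data.

```python
def triage_agreement(nurse_levels, gp_levels):
    """Summarise paired urgency levels (higher number = more urgent).

    Returns the proportion of patients the nurse sent to a more urgent
    disposition than the GP advised (over-triage), a less urgent one
    (under-triage), the same one, and one within one category.
    """
    if not nurse_levels or len(nurse_levels) != len(gp_levels):
        raise ValueError("need two non-empty lists of equal length")
    n = len(nurse_levels)
    diffs = [nurse - gp for nurse, gp in zip(nurse_levels, gp_levels)]
    return {
        "over_triage": sum(d > 0 for d in diffs) / n,
        "under_triage": sum(d < 0 for d in diffs) / n,
        "exact": sum(d == 0 for d in diffs) / n,
        "within_one": sum(abs(d) <= 1 for d in diffs) / n,
    }


# Hypothetical paired dispositions for five patients (illustration only).
nurse = [3, 2, 2, 1, 3]
gp = [2, 2, 1, 2, 1]
summary = triage_agreement(nurse, gp)
print(summary)

# The slide's over-triage figure follows the same calculation: 10 of 21.
reported_over_triage_pct = round(10 / 21 * 100)
print(reported_over_triage_pct)
```

Under-triage is the safety-critical number, so it is worth reporting separately from overall agreement rather than folding it into a single accuracy figure.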

Results - from GP perspective Data: • Stated that patient was better prepared for 50% of consultations • Thought that CES consult saved the GP time in 40% of their consultations Comments face to face & on form very positive, eg: • “Patient volunteered pertinent information immediately… finds the service useful” • “not quicker but had got chat out of her system” • “came straight to the point” • “Patient seems more confident than usual”

CES - summary

Target patient group • Fair insight into own health status, English ability • Reasonable PC / internet skills (claimed by 79% of Coventry patients) • Target patient groups: • deaf / hard of hearing (esp. with mother tongue English); • speech problems (eg. stroke, cerebral palsy); • shy or social problems • Tested on wide range of ages (19-81 year olds, mean 48) with no differences in pre / post TMPQ results • Target clinical problems: • not acute / emergency (advise NHSD or A&E) • may be better for embarrassing problems

Some issues to be resolved • Nursing concerns, recruitment & training etc.: to be discussed • Strategy for integration with OES, NHSD textphone, failed NHSD calls… • Communications: how to reach target groups – esp. deaf (see Ali Harding’s report) • New version of software: see draft requirements document from Michiel Veen

Summary: benefits of CES: Compared to NHSD textphone service, the CES is: • More portable – access from any internet-connected PC • Easier to use – no specific textphone training required, can read and enter multi-line statements, review past conversation, can take transcript to GP / A&E dept. • Inclusive – brings hard-to-reach groups (eg. deaf, young or retired male surfers) into easy range of the NHSD service • Sustainable, safe, well liked, feasible to integrate into selected NHSD call centres

… So it’s simple, then ?

Fallacies of evaluation

Asymmetry in evaluation • Study produces positive, expected result: congratulations all round; prompt publication; conclude that results can be widely applied; no need to repeat study; sweep biases, confounders under the carpet • Study produces negative, unexpected result: secrecy; delayed / postponed publication; conclude that results can never be applied; repeat study in another setting; careful search for biases, confounders to explain “anomalous” results

What should we avoid ? • Excess focus on information systems, cf. problems: • “Idolatry of technology” (Gremy, IMIA WG, Helsinki 1997) • Keeping up with leading edge technology while users / industry develop & apply solutions • Excess focus on evaluation methods, cf. questions • “To a methodologist with a sample size calculator, every question is a null hypothesis*” • Inventing new evaluation methods, measurement instruments • Isolationism: • Building a wall of jargon around our work • Publishing in HI journals / conferences, cf. health care J • Paddling our canoes around a silting-up backwater, while healthcare / industry jetfoils speed by… (* to a qualitative researcher, every question is an invitation to participate)

“What should we think about a doctor who uses the wrong treatment? Most people would agree that such behaviour was unethical & unacceptable. What, then, should we think about researchers who use the wrong techniques (wilfully or in ignorance), use the right techniques wrongly, misinterpret their results, report them selectively, cite the literature selectively, and draw unjustified conclusions? We should be appalled. Yet numerous studies of the medical literature have shown that all of the above phenomena are common. This is surely a scandal. Huge sums of money are spent annually on research seriously flawed through inappropriate designs, unrepresentative small samples, incorrect analysis and faulty interpretation. Errors are so varied that a whole book on the topic is not comprehensive… We need less research, better research & research done for the right reasons.” Source: D G Altman. The scandal of poor medical research. BMJ 1994; 308: 283-4

Evaluation ethics and governance • Write a protocol: background, aims, methods, instruments to use • Draft a consent form • Get them peer reviewed • Submit to ethics committee • Ensure staff are trained, can give study adequate time • Take care with data protection • Publish, inform participants of results

Conclusions • Piloting is a key step in the development of any system or service • For eHealth, safety, feasibility and acceptability are fundamental • Careful evaluation at the pilot stage can detect serious problems • Such evaluation is not hard and can stop you wasting millions later • However, further summative evaluation will also be needed
