
Calculating Sample Sizes for Research

This lecture explores the importance of sample size in research, emphasizing the need for appropriate sample sizes to confirm or refute hypotheses. It also discusses the biases and ethical considerations that can arise due to inadequate sampling.

morganl

Presentation Transcript


  1. The following lecture has been approved for University Undergraduate Students. This lecture may contain information, ideas, concepts and discursive anecdotes that may be thought provoking and challenging. It is not intended for the content or delivery to cause offence. Any issues raised in the lecture may require the viewer to engage in further thought, insight, reflection or critical evaluation.

  2. Calculating Sample Sizes for Research. Dr. Craig Jackson, Senior Lecturer in Health Psychology, Faculty of Health. www.hcc.uce.ac.uk/craig_jackson

  3. Keep it simple “Some people hate the very name of statistics but.....their power of dealing with complicated phenomena is extraordinary. They are the only tools by which an opening can be cut through the formidable thicket of difficulties that bars the path of those who pursue the science of man.” Sir Francis Galton, 1889

  4. How Many Make a Sample?

  5. How Many Make a Sample? “8 out of 10 owners who expressed a preference, said their cats preferred it.” How confident can we be about such statistics? 8 out of 10? 80 out of 100? 800 out of 1000? 80,000 out of 100,000?
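One way to put a number on "how confident can we be" is a confidence interval for the observed proportion. The sketch below is an illustration added here, not part of the lecture: it assumes the Wilson score interval at 95% confidence and uses only the Python standard library. The function name `wilson_interval` is chosen for this example.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score confidence interval for a proportion (assumed method)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# The four sample sizes from the slide, all with the same 80% proportion.
for successes, n in [(8, 10), (80, 100), (800, 1000), (80_000, 100_000)]:
    low, high = wilson_interval(successes, n)
    print(f"{successes} out of {n}: 95% CI {low:.3f} to {high:.3f}")
```

The observed proportion is 80% every time, but the interval around it narrows sharply as the sample grows, which is the point of the slide.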

  6. Multiple measurement of a small sample: four counts of 25, 22, 24 and 21 cell clusters. Total = 92 cell clusters. Mean = 23 cell clusters. SD = 1.8 cell clusters. [Figure: counts plotted on an axis running from 20 to 26.]

  7. It all depends on the size of your needle

  8. Small samples spoil research

     N       Sample 1        Sample 2        Sample 3
             Age    IQ       Age    IQ       Age    IQ
     1       20     100      18     100      18     100
     2       20     100      20     110      20     110
     3       20     100      22     119      22     119
     4       20     100      24     101      24     101
     5       20     100      26     105      26     105
     6       20     100      21     113      21     113
     7       20     100      19     120      19     120
     8       20     100      25     119      25     119
     9       20     100      20     114      20     114
     10      20     100      21     101      45     156
     Total   200    1000     216    1102     240    1157
     Mean    20     100      21.6   110.2    24     115.7
     SD      0      0        ± 4.2  ± 19.2   ± 8.5  ± 30.2
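The effect of the single unusual participant (age 45, IQ 156) in the third data set can be checked directly. This is a minimal sketch using the slide's data and the sample standard deviation (n − 1 denominator), which is an assumption; the values it prints may not match the SDs printed on the slide, which may have been computed or rounded differently, but the pattern is the same.

```python
from statistics import mean, stdev

# Second data set on the slide (ten participants, no outlier).
ages_typical = [18, 20, 22, 24, 26, 21, 19, 25, 20, 21]
iq_typical = [100, 110, 119, 101, 105, 113, 120, 119, 114, 101]

# Third data set: participant 10 replaced by one outlier (age 45, IQ 156).
ages_outlier = ages_typical[:-1] + [45]
iq_outlier = iq_typical[:-1] + [156]


def describe(label, values):
    """Print the mean and sample SD (n - 1 denominator) of a list of values."""
    print(f"{label}: mean = {mean(values):.1f}, SD = {stdev(values):.1f}")


describe("Age, no outlier", ages_typical)
describe("Age, one outlier", ages_outlier)
describe("IQ, no outlier", iq_typical)
describe("IQ, one outlier", iq_outlier)
```

With only ten participants, one person shifts both the mean and the spread noticeably; in a larger sample the same person would carry far less weight.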

  9. Background on Surveys
     • Large-scale
     • Quantitative
     • Can be descriptive (“2% of women think they are beautiful”)
     • Can be inferential (“Significantly more single women think they’re beautiful than married women do”)
     • Done with a sample of patients, respondents, consumers, or professionals
     • Differences between any groups assessed with hypothesis testing
     • Important that the sample size is large enough to detect any such difference if it truly exists

  10. Importance of Sample Size
     • “Forgotten” in many studies
     • Little consideration given
     • Appropriate sample size needed to confirm / refute hypotheses
     • Small samples can detect nothing but the grossest difference
     • Real differences are reported as “non-significant”: Type 2 error
     • Too large a sample: unnecessary waste of (clinical) resources
     • Ethical considerations: waste of patient time, inconvenience, discomfort
     • Essential to make an assessment of optimal sample size before starting an investigation

  11. Qualitative studies need to sample wisely too…
     Asian GPs’ attitudes to ANP
     • Objective: to determine attitudes to ANP among Asian doctors in East Birmingham PCT
     • Method: send invitation to 55 Asian GPs (approx. 47% of East Birmingham PCT)
     • Intends to interview (30 mins) the first 20 GPs who respond
     • Sample would be 36% of Asian GPs, and only 17% of GPs in the PCT
     • Severely biased research (and ethically dodgy too)

  12. Have Some Consideration – “The Good” #1
     Pulmonary Valve Replacement on Biventricular Function following Tetralogy of Fallot
     Q. How many participants will be recruited? How many of these participants will be in a control group?
     A. “Power analyses have been undertaken based on previous data provided by Hazekamp et al. (2001). A sample size of 18 in each group will have 95% power to detect a difference in right-ventricular end-diastolic volume of 78ml (the difference between preoperative mean of 292ml and the postoperative mean of 214ml) assuming the common standard deviation is 62ml and using a two-group t-test with a 5% two-sided significance level.”
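The figure of 18 per group quoted in this answer can be roughly reproduced by hand. The sketch below is not the investigators' calculation: it assumes the usual normal-approximation formula for a two-group comparison of means, plus a commonly used small-sample adjustment (adding z²/4) to bring it closer to a t-test. The function name and the `small_sample_correction` flag are made up for this example.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(difference, sd, alpha=0.05, power=0.80,
                small_sample_correction=True):
    """Approximate participants per group for a two-sided, two-group test."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)   # two-sided significance level
    z_beta = z(power)            # power = 1 - beta
    d = difference / sd          # standardized difference
    n = 2 * (z_alpha + z_beta) ** 2 / d ** 2
    if small_sample_correction:
        # Assumed adjustment approximating the t-test at small n.
        n += z_alpha ** 2 / 4
    return ceil(n)


# Values quoted on the slide: 78 ml difference, SD 62 ml, 95% power, 5% level.
print(n_per_group(78, 62, alpha=0.05, power=0.95))
```

Without the adjustment the same inputs give 17, so the quoted 18 is consistent with an exact t-test calculation done in dedicated software.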

  13. Have Some Consideration – “The Bad” #2 Survey of knowledge and Attitudes regarding ADHD in Adults among Specialist Adult Psychiatrists It is a cross sectional questionnaire survey to assess the current knowledge and attitudes regarding ADHD in Adults amongst ALL General and Specialist Adult Consultants, Specialist Registrars and Staff-grade / Associate Specialist Doctors in Birmingham and Solihull Q. How many participants will be recruited? How many of these participants will be in a control group? A. “100.”

  14. Have Some Consideration – “The Ugly” #3 The Sepsis Study This is a cross sectional study which will be conducted using a postal questionnaire with a follow-up reminder letter to non-responders. The sample will be taken from patients who have been admitted to the ITU department for severe sepsis or septic shock between Feb 1st 2004 and Aug 1st 2004. Patients will be over the age of 18 and will have spent at least one day on ITU. The questionnaire will be a standard health related quality of life questionnaire. Patients will be contacted by letter a maximum of two times. The patients’ personal details will be stored on a database kept in hospital to maintain patient confidentiality. Names will not be published in the written report. The database should highlight any patients who are deceased and obviously questionnaires will not be sent to the addresses.

  15. Have Some Consideration – “The Ugly” #3 The Sepsis Study Q. How many participants will be recruited? How many of these participants will be in a control group? A. “Between 30 and 60.”

  16. Hypothesis testing: all about 2 types of errors
     H1: Men perform better than women
     H0: Men perform no better than women
     Imagine: actual data really shows no difference between sexes

                  Decide to accept H0                            Decide to reject H0
     H0 true      Correct decision                               Type 1 error (false positive), probability α
     H0 false     Type 2 error (false negative), probability β   Correct decision

  17. Errors in hypothesis testing
     Type 1 errors: “False positive”. Occurs if the null hypothesis is rejected when it should be accepted, e.g. a “significant result” obtained when the null hypothesis is in fact true. Probability of making a Type 1 error denoted as “α”.
     Type 2 errors: “False negative”. Occurs if the null hypothesis is accepted when it should be rejected, e.g. a non-significant result obtained when the null hypothesis is in fact not true. Probability of making a Type 2 error denoted as “β”.
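Both error rates can be seen in a small simulation. The sketch below is an illustration under stated assumptions, not part of the lecture: scores are drawn from normal distributions with mean 100 and SD 10, groups have 10 people each, the true difference in the second scenario is 5 points, and a two-sided z-test with known SD stands in for the t-test to keep the code short. All names are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, mean


def rejects_null(group_a, group_b, sd, alpha=0.05):
    """Two-sided z-test for a difference in means, assuming a known common SD."""
    se = sd * sqrt(1 / len(group_a) + 1 / len(group_b))
    z = (mean(group_a) - mean(group_b)) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value < alpha


def rejection_rate(true_difference, n, sd=10, trials=2000, seed=1):
    """Fraction of simulated studies that reject the null hypothesis."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        a = [rng.gauss(100 + true_difference, sd) for _ in range(n)]
        b = [rng.gauss(100, sd) for _ in range(n)]
        rejections += rejects_null(a, b, sd)
    return rejections / trials


# Null hypothesis true: every rejection is a Type 1 error.
type_1_rate = rejection_rate(true_difference=0, n=10)
# Null hypothesis false: every failure to reject is a Type 2 error.
type_2_rate = 1 - rejection_rate(true_difference=5, n=10)

print(f"Type 1 error rate: {type_1_rate:.2f}")
print(f"Type 2 error rate: {type_2_rate:.2f}")
```

The Type 1 rate stays close to the chosen α whatever the sample size, while the Type 2 rate is high here because 10 per group is too few to detect a 5-point difference reliably.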

  18. Factors affecting Sample Size
     Primary outcome measure? N = ?
     Dependent upon 4 inter-related factors:
     1. Power
     2. Clinically important difference
     3. Natural variability
     4. Significance level
     Possible to calculate each one if the other three are known
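The idea that any one factor follows from the others can be sketched by turning the sample size relationship around. The functions below are an illustration, assuming a two-group comparison of means and the normal approximation; here the difference and the variability are combined into a single standardized difference (difference divided by SD). The names `power_for` and `detectable_difference` are invented for this example.

```python
from math import sqrt
from statistics import NormalDist

_norm = NormalDist()


def power_for(n_per_group, standardized_difference, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison of means."""
    z_alpha = _norm.inv_cdf(1 - alpha / 2)
    shift = standardized_difference * sqrt(n_per_group / 2)
    # Probability of landing in either rejection region.
    return _norm.cdf(shift - z_alpha) + _norm.cdf(-shift - z_alpha)


def detectable_difference(n_per_group, alpha=0.05, power=0.80):
    """Smallest standardized difference detectable with the given power."""
    z_alpha = _norm.inv_cdf(1 - alpha / 2)
    z_beta = _norm.inv_cdf(power)
    return (z_alpha + z_beta) * sqrt(2 / n_per_group)


# Hypothetical study with 50 participants per group.
d_min = detectable_difference(50, alpha=0.05, power=0.80)
print(f"Detectable standardized difference: {d_min:.2f}")
print(f"Power at that difference: {power_for(50, d_min):.2f}")
```

Fixing the sample size, significance level and power gives the detectable difference; fixing the sample size, significance level and difference gives the power.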

  19. 1. Power
     Probability that a study of a given size would detect a real difference as statistically significant
     Usually between 80% and 90% (.80, .85, .90)
     Higher power = higher chance of detecting a genuine significant difference and lower chance of making a type 2 error
     With high power, can be reasonably sure any non-significant result is genuine, i.e. ok to accept the null hypothesis

  20. 2. Minimal Important Size of difference to be detected
     • If the difference between treatments is large, small samples can produce significant results
     • If the difference between treatments is small, larger samples are needed
     • Important to know if any differences are expected to be small
     • Determine the min. difference between treatments considered clinically relevant
     • Given a large enough sample, any difference can be made statistically significant
     • Experience & judgement needed in deciding the minimal treatment effect that is of any value, to justify the effort, time and finance involved

  21. 2. Minimum Important Difference to be detected (MID)
     Bronchodilator & Chronic Bronchitis Example
     New bronchodilator causes a real increase in tidal volume in patients (10ml average)
     Standard deviation (natural variation) in tidal volume in this clinical population is more than 10ml
     Given a huge sample, a significant tidal volume increase in users could be proved (although the increase is no bigger than natural variation)
     Expensive & pointless: with such a small (but stat. significant) increase, the drug is of little clinical use

  22. 3. Standard Deviation & Variability
     The larger the SD of the 2 groups, relative to the clinically important difference (CID), the larger the sample needed
     The smaller the SD, the smaller the sample required
     Ratio of MID to SD is the “standardized difference”, used in calculating sample sizes
     Estimated SD: an estimate of the SD may not be available
     1. Pilot study
     2. Begin trial and estimate SD from initial patients
     3. Use SD found in previous trials
     4. Use SD found in similar patients / circumstances in other literature

  23. 4. Significance Level
     • The significance level (α) has an important bearing on the sample size required
     • There is a relationship between the significance level (α) and the chance of making a type 2 error (β)
     • A smaller significance level (e.g. P=0.01 rather than P=0.05) requires a larger sample size to avoid a type 2 error
     • As the nominated significance level gets smaller, so does the chance of a type 1 error; for a fixed sample size the chance of a type 2 error gets larger
     • A significance level of P=0.05 implies a type 1 error will occur in 1 in every 20 trials where the null hypothesis is true
     • 5 out of 100 such studies will make type 1 errors, purely by chance. Acceptable
     • Prob. of type 2 error should be approx. 4 times the sig. level chosen, e.g. α = 5% then power = 80%; α = 1% then power = 95%
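The cost of a stricter significance level can be illustrated with a quick calculation. This is a rough sketch, assuming a two-group comparison of means, the normal approximation, 80% power, and a hypothetical standardized difference of 0.5; none of these values come from the lecture.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(standardized_difference, alpha, power):
    """Approximate participants per group (normal approximation)."""
    z = NormalDist().inv_cdf
    numerator = 2 * (z(1 - alpha / 2) + z(power)) ** 2
    return ceil(numerator / standardized_difference ** 2)


# Same power and difference, two different significance levels.
for alpha in (0.05, 0.01):
    n = n_per_group(0.5, alpha, power=0.80)
    print(f"alpha = {alpha}: about {n} per group")
```

Moving from 0.05 to 0.01 raises the requirement from about 63 to about 94 per group under these assumptions, roughly half as many participants again.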

  24. Calculating Sample Size
     Sample size calculations are available for all study designs, trials, and data types, e.g. categorical data, continuous data, means, proportions, multiple groups, paired samples, unpaired samples, equal / unequal sized groups
     Calculations are complex but easily done with a PC and the web
     A statistician is helpful (if s/he can communicate clearly!)
     Two approaches for us non-statisticians: 1. Altman’s Nomogram 2. Internet

  25. Altman’s Nomogram [Figure: nomogram with a standardized difference axis (0.0 to 1.2), a power axis (0.05 to 0.995) and a central N scale (8 to 10000).] Standardized difference = Min. important difference / Standard deviation

  26. Example Calculation – Effects of Pesticide Study
     IQ survey, concerning workers exposed to pesticides
     What we already know: mean IQ score is 100 points; SD is ± 10 points, e.g. normal IQ = 90-110
     What we need to do:
     a) Decide on CID. A difference of 11 IQ points seems clinically important to me
     b) Calculate Standardized Difference = Min. Important Difference / Standard Deviation = 11 / 10 = 1.1
     c) Use Altman’s Nomogram to observe N
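Step (b), and an approximate stand-in for reading the nomogram in step (c), can be sketched in a few lines. The code assumes two equal groups, a two-sided 5% significance level and the normal approximation, so the totals are only indicative and may differ slightly from values read off the nomogram. The function name `total_n` is made up for this example.

```python
from math import ceil
from statistics import NormalDist


def total_n(min_important_difference, sd, power, alpha=0.05):
    """Approximate total N (two equal groups) for a comparison of means."""
    z = NormalDist().inv_cdf
    d = min_important_difference / sd          # standardized difference
    per_group = 2 * (z(1 - alpha / 2) + z(power)) ** 2 / d ** 2
    return 2 * ceil(per_group)


# Slide values: CID of 11 IQ points, SD of 10 points, so d = 1.1.
for power in (0.80, 0.85, 0.90, 0.95):
    print(f"power {power:.2f}: total N about {total_n(11, 10, power)}")
```

Because the standardized difference is large (1.1), a few dozen participants in total are enough even at 95% power under these assumptions.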

  27. Altman’s Nomogram – Effects of Pesticide Study [Figure: nomogram with standardized difference = Min. important difference / Standard deviation = 11 / 10 = 1.1 marked, and power values 0.80, 0.85, 0.90 and 0.95 highlighted.]

  28. 2. Electronic Calculation of Sample Size
     Not covered in most stats packages, e.g. SPSS, Statistica
     Many sites available
     Real time calculation
     Hyperstat by David M Lane: www.davidmlane.com
     Other additional software, e.g. Xlstat.com

  29. Summary of Sample Size & Power
     Correct sample size helps avoid type I & type II errors
     A correct study has a balance of four factors:
     Power (no less than .80): Bigger = Better study
     Min. clinical difference (effective difference): Bigger = Better study
     Standard deviation (variability): Smaller = Better study
     Significance level (0.05): Smaller = Better study
     Looking for big differences is much easier than looking for smaller differences
     (Two of these factors are marked “fixed” on the slide.)
