Design of Statistical Investigations

Design of Statistical Investigations. 14 Case Control Studies. Stephen Senn. Case-Control Study Definition.

Presentation Transcript


  1. Design of Statistical Investigations 14 Case Control Studies Stephen Senn SJS SDI_14

  2. Case-Control Study: Definition “The observational epidemiologic study of persons with the disease (or other outcome variable) of interest and a suitable control (comparison, reference) group of persons without the disease. The relationship of the attribute to the disease is examined by comparing the diseased and nondiseased groups with regard to how frequently the attribute is present or, if quantitative, the levels of the attribute in each group. In short, the past history of exposure to a suspected risk factor is compared between “cases” and “controls”, persons who resemble the cases in such respects as age and sex but do not have the disease or condition of interest.” Last, J.M. A Dictionary of Epidemiology

  3. Schematic Representation of Cohort Study. Each point represents a member of the cohort of 10,000 persons.

  4. 200 cases and 200 controls are sampled from diseased and healthy persons respectively.

  5. The number of cases and controls is a foregone conclusion. Exposure becomes the random variable and is studied as a function of status. Note that the axes have been exchanged to reflect this.

  6. Smoking and Lung-Cancer Obs_7 • Famous study of Hill and Doll • Sampled 1357 cases of lung cancer from four hospitals in the United Kingdom • Sampled 1357 hospital-based controls • Compared the two groups as regards smoking history

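  7. Doll and Hill Data Obs_7 [Table not captured in the transcript. The counts used in the S-Plus analysis on slide 17 are: cases, 1350 smokers and 7 non-smokers; controls, 1296 smokers and 61 non-smokers.]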

  8. In General

  9. A Model for Case-Control Studies • Number exposed • Number unexposed • Probability case if exposed • Probability case if unexposed • Probability recorded if case • Probability recorded if control

  10. Expectations etc.

  11. Notes • Thus the odds ratio can be estimated even though nE, nU, λ and θ (the probabilities that a case and a control, respectively, are recorded) are unknown. • However, although the assumption that λ and θ are equal is not needed, an assumption that they do not vary with exposure is needed.
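The cancellation behind these notes can be sketched as follows. The symbols \pi_E and \pi_U for the probability of being a case if exposed or unexposed are assumptions made here, since the transcript does not preserve the slide's own notation; \lambda and \theta denote the probabilities that a case and a control are recorded, taken not to depend on exposure.

```latex
\begin{align*}
E[\text{exposed cases}]    &= n_E\,\pi_E\,\lambda,       & E[\text{unexposed cases}]    &= n_U\,\pi_U\,\lambda,\\
E[\text{exposed controls}] &= n_E\,(1-\pi_E)\,\theta,    & E[\text{unexposed controls}] &= n_U\,(1-\pi_U)\,\theta.
\end{align*}
% Cross-product ratio of the expected counts: n_E, n_U, \lambda and \theta all cancel.
\[
\frac{\bigl(n_E\,\pi_E\,\lambda\bigr)\,\bigl(n_U\,(1-\pi_U)\,\theta\bigr)}
     {\bigl(n_U\,\pi_U\,\lambda\bigr)\,\bigl(n_E\,(1-\pi_E)\,\theta\bigr)}
  = \frac{\pi_E/(1-\pi_E)}{\pi_U/(1-\pi_U)}.
\]
```

If \lambda or \theta differed between exposed and unexposed persons, they would no longer cancel, which is why the second note is needed.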

  12. Sources for Controls (Rothman) • Population: using a population register • Neighbourhood: for example, one or two controls from the neighbourhood of each case; not suitable for environmental exposure • Random digit dialing • Hospitals or clinics

  13. Cohort and Case Control Studies (Rothman p 91) • Cohort: complete population; can calculate incidence rates; usually expensive; convenient for studying many diseases; can be prospective or retrospective • Case control: sampled population; can calculate ratios only; usually less expensive; convenient for studying many exposures; can be prospective or retrospective

  14. The “Delta” Method

  15. Variance of a Logit
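The formulas from slides 14 and 15 are not preserved in this transcript. A standard statement of the delta method and its application to a logit is sketched below. The notation is assumed here rather than taken from the slides: X is a binomial count out of n with probability \pi, and f is a smooth function.

```latex
% Delta method: first-order Taylor expansion of f about the mean \mu of X.
\[
\operatorname{Var}\bigl[f(X)\bigr] \approx \bigl[f'(\mu)\bigr]^{2}\,\operatorname{Var}(X).
\]
% Logit of a binomial count: f(x) = \log x - \log(n - x), so f'(x) = 1/x + 1/(n-x).
\begin{align*}
f'(n\pi) &= \frac{1}{n\pi} + \frac{1}{n(1-\pi)} = \frac{1}{n\pi(1-\pi)},\\
\operatorname{Var}\!\left[\log\frac{X}{n-X}\right]
  &\approx \left[\frac{1}{n\pi(1-\pi)}\right]^{2} n\pi(1-\pi)
   = \frac{1}{n\pi(1-\pi)}
   = \frac{1}{n\pi} + \frac{1}{n(1-\pi)}.
\end{align*}
% Replacing n\pi and n(1-\pi) by the observed counts gives the estimate 1/x + 1/(n-x).
```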

  16. Variance of the log-odds ratio. The log-odds ratio is the difference between two logits. Since these are independent, the variance of their difference is the sum of their variances. Thus, in terms of our previous table, the variance is the sum of the reciprocals of the four cell frequencies (the quantity computed as var on slide 17). Note the implications of the variance formula: the variance cannot be reduced below the reciprocal of the entry in any given cell by increasing the frequencies of the other cells.
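The remark on slide 16 about the limit on the variance can be illustrated numerically. The sketch below is written in R (a close relative of the S-Plus used in the slides) and uses a hypothetical study, invented purely for illustration: 100 cases of whom half are exposed, with k controls per case of whom 30% are exposed. The function name `log.or.var` and all counts other than the Doll and Hill check are assumptions, not part of the original deck.

```r
# Variance of the log-odds ratio: sum of the reciprocals of the four cell counts.
# Arguments: exposed cases, unexposed cases, exposed controls, unexposed controls.
log.or.var <- function(e1, u1, e0, u0) 1/e1 + 1/u1 + 1/e0 + 1/u0

# Check against the Doll and Hill counts from slide 17 (var = 0.1607629).
dh.var <- log.or.var(1350, 7, 1296, 61)

# Hypothetical study: 100 cases (50 exposed), k controls per case (30% exposed).
cases.exposed   <- 50
cases.unexposed <- 50
k <- c(1, 2, 3, 4, 10, 100)
controls.exposed   <- 30 * k
controls.unexposed <- 70 * k

v <- log.or.var(cases.exposed, cases.unexposed, controls.exposed, controls.unexposed)

# Lower limit set by the cases alone, however many controls are recruited.
floor.v <- 1/cases.exposed + 1/cases.unexposed

data.frame(k, variance = v, SE = sqrt(v), floor = floor.v)
```

In this made-up example the variance keeps falling as k grows but never drops below the contribution of the two case cells, so most of the achievable gain comes from the first few controls per case.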

  17. S-Plus Analysis Obs_7

         # Doll and Hill
         options(contrasts=c("contr.treatment", "contr.poly")) # set contrast options
         # To analyse the famous case-control study
         Outcome<-factor(c("case","case","control","control"))
         Exposure<-factor(rep(c("smoker","non-smoker"),2))
         Freq<-c(1350,7,1296,61)
         Doll.Hill<-data.frame(Outcome, Exposure, Freq)
         Doll.Hill
         OR<-Freq[1]*Freq[4]/(Freq[2]*Freq[3]) # cross-product (odds) ratio
         l.OR<-log(OR)
         var<-(1/Freq[1]+1/Freq[2]+1/Freq[3]+1/Freq[4]) # variance of the log-odds ratio
         SE<-sqrt(var)
         t<-l.OR/SE
         LCL<-exp(l.OR-1.96*SE) # approximate 95% confidence limits for OR
         UCL<-exp(l.OR+1.96*SE)
         results.1<-data.frame(l.OR,var,SE,t,LCL,OR,UCL)
         results.1

  18.
         # Fit results using a log-linear model
         fit1<-glm(Freq~Exposure*Outcome,family=poisson)
         summary(fit1,cor=F)
         # Prepare data to perform logistic regression
         # Note: Y holds the smokers among the cases (row 1) and the controls (row 2),
         # so the two levels of Exposure2 label the case row and the control row.
         Y<-c(Freq[1],Freq[3])
         N<-c(Freq[1]+Freq[2],Freq[3]+Freq[4])
         Exposure2<-factor(c("Smoker","Non-smoker"))
         P<-Y/N
         DollHill.2<-data.frame(Y,N,P,Exposure2)
         DollHill.2
         # Logistic regression
         fit2<-glm(P~Exposure2,family=binomial,weights=N)
         summary(fit2,cor=F)

  19.
         > Doll.Hill
           Outcome   Exposure Freq
         1    case     smoker 1350
         2    case non-smoker    7
         3 control     smoker 1296
         4 control non-smoker   61

         > results.1
               l.OR       var        SE        t      LCL       OR      UCL
         1 2.205786 0.1607629 0.4009525 5.501364 4.136784 9.077381 19.91857

         Call: glm(formula = Freq ~ Exposure * Outcome, family = poisson)
         Coefficients:
                              Value Std. Error   t value
              (Intercept)  1.945910  0.3779645  5.148394
                 Exposure  5.261950  0.3789431 13.885857
                  Outcome  2.164964  0.3990621  5.425129
         Exposure:Outcome -2.205786  0.4009525 -5.501364

  20.
         > DollHill.2
              Y    N         P  Exposure2
         1 1350 1357 0.9948416     Smoker
         2 1296 1357 0.9550479 Non-smoker

         Call: glm(formula = P ~ Exposure2, family = binomial, weights = N)
         Coefficients:
                        Value Std. Error   t value
         (Intercept) 3.056164  0.1310154 23.326746
           Exposure2 2.205786  0.4009483  5.501422

         (Dispersion Parameter for Binomial family taken to be 1 )

  21. Questions • Why did Hill and Doll choose a case-control study rather than a cohort study? • We now believe that the choice of controls used in the Hill and Doll study led to an underestimate of the odds ratio for lung cancer and smoking. Why? • Consider the recent controversy over breast implants and connective tissue disease. What difficulty does press coverage cause for any case-control study in this field? • Why do epidemiologists rarely use more than three controls per case?
