Epidemiologic Methods Fall 2012 First 5 Lectures

First 5 Lectures • Overview of Study Design (today) • Disease Occurrence I and II • Disease Association I and II

Study Design: Outline of Topics for Today • Individual vs group as unit of observation • Underlying study base or cohort • Cohort design • Cross-sectional design • Case-control design • 3 ways to sample controls • Primary & secondary study bases • Case-crossover/case series design • Incident versus prevalent disease • Retrospective versus prospective – what matters

Human Subjects Studies, by unit of observation • Group (e.g., geographic area): Ecological studies • Individual, Experimental: Randomized; Non-randomized • Individual, Observational, Within-Subject: Case-crossover; Case only; Self-controlled case series • Individual, Observational, Between-Subject: Cohort; Cross-sectional; Case-control

Ecological study: water fluoride & dental caries • 191 municipalities in Denmark in 2004 • Unit of analysis: the municipality • Outcome: mean DMFS (decayed, missing, and filled surfaces) of 15-year-olds • Exposure: mean fluoride concentration in the water supply (ppm) • Ekstrand et al. Factors associated with inter-municipality differences in dental caries experience among Danish adolescents. Community Dent Oral Epidemiol 2010

Ecological association • Figure: fluoride concentration in the water supply versus mean DMFS among 15-year-olds in the different municipalities in 2004, with a trend line.

Ecological fallacy • Mean income and traffic injuries • Results using community as the unit of analysis: communities with higher income have more traffic injuries. • What about individuals with higher income?

Results using individual as the unit of analysis: Individuals with higher income have fewer traffic injuries.

Ecological Fallacy
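The income and traffic-injury contrast can be reproduced with a small worked example. The sketch below uses made-up counts (the three communities, their income mix, and the injury numbers are all hypothetical, chosen only to make the pattern visible): injury risk rises across communities as the share of high-income residents rises, yet high-income individuals have the lower risk both within each community and overall.

```python
# Hypothetical counts: community -> {income group: (persons, persons injured)}
communities = {
    "A": {"low": (800, 80), "high": (200, 10)},
    "B": {"low": (500, 80), "high": (500, 40)},
    "C": {"low": (200, 60), "high": (800, 96)},
}

def community_summary(groups):
    """Return (share of high-income residents, injury risk) for one community."""
    persons = sum(n for n, _ in groups.values())
    injured = sum(d for _, d in groups.values())
    return groups["high"][0] / persons, injured / persons

def individual_risk(income):
    """Injury risk among all individuals of one income group, pooled over communities."""
    persons = sum(c[income][0] for c in communities.values())
    injured = sum(c[income][1] for c in communities.values())
    return injured / persons

# Community as the unit of analysis
ecological = {name: community_summary(g) for name, g in communities.items()}
# Individual as the unit of analysis
risk_low = individual_risk("low")
risk_high = individual_risk("high")

for name, (share_high, risk) in ecological.items():
    print(f"Community {name}: {share_high:.0%} high income, injury risk {risk:.3f}")
print(f"Low-income individuals:  injury risk {risk_low:.3f}")
print(f"High-income individuals: injury risk {risk_high:.3f}")
```

Inferring the individual-level relationship from the community-level one would give the wrong sign here, which is the ecological fallacy.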


Cohort is the basis of design with the individual as the unit of measurement • With the individual as the unit of observation, all study designs are best thought of as sampling the disease experience of an underlying study base or cohort

Concept of the Study Base The study base is the population that experiences the disease outcomes you will observe in your study

Three Keys to Study Design Using Observation of Individuals • Identify the population that is the Study Base • Determine how the experience of the Study Base population will be/was sampled • Consider the timing of measurements of exposure and outcome, relative to each other

Cohort study design (figure) • D = disease (event) occurrence; CE = competing event; arrow = losses to follow-up • Minimum loss to follow-up (1%)

Cohort study • Every cohort has competing events, unless the outcome is all-cause mortality. • The schematic assumes a one-time event, but repeated events can also be studied. • In a cohort study, we can measure: (1) the rate of event occurrence, i.e., how many events occur in a given amount of person-time; (2) the risk of event occurrence, i.e., the probability of an event occurring over a set time. • For now, we assume that we can accurately identify when an event occurs.
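The distinction between rate and risk is easiest to see with numbers. The following is a minimal sketch with hypothetical follow-up records, assuming a closed cohort of 10 people followed for up to 5 years with no losses or competing events, so that risk can be taken as a simple proportion.

```python
# Hypothetical closed cohort followed for up to 5 years, with no losses:
# each record is (years under observation, event occurred?)
records = [(5.0, False)] * 6 + [(1.0, True), (2.0, True), (3.0, True), (4.0, True)]

events = sum(1 for _, had_event in records if had_event)
person_years = sum(years for years, _ in records)

rate = events / person_years   # events per person-year of follow-up
risk = events / len(records)   # probability of the event over the 5-year period

print(f"{events} events in {person_years:.0f} person-years")
print(f"Rate: {rate:.2f} per person-year")
print(f"5-year risk: {risk:.2f}")
```

The rate has units of events per unit of person-time, while the risk is a unitless probability that only has meaning together with its time period.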

What is a cohort? • From the Latin cohors, the basic unit of a legion in the Roman army. • A cohort is defined and described by its characteristics at baseline, i.e., the beginning of follow-up. Examples: • Research cohort, recruited and followed for a particular study, e.g., Framingham study • Administrative cohort, e.g., Kaiser members • Geographically defined cohort, e.g., residents of SF

Cohort Study Design • Mimics individual’s progress through life and accompanying disease risk • Gold standard because exposure/risk factor is observed before the outcome occurs • Randomized trial is a cohort design with exposure assigned rather than observed

Framingham Cohort Study • Abbott RD, Donahue RP, Kannel WB, Wilson PW. The impact of diabetes on survival following myocardial infarction in men vs women. The Framingham Study. JAMA, 1989 • The impact of diabetes on recurrent myocardial infarction (MI) and fatal coronary heart disease was examined in survivors of an initial MI using 34-year follow-up data in the Framingham Study. Among nondiabetic patients, the risk of fatal coronary heart disease was significantly lower in women compared with men (relative risk, 0.6). In the presence of diabetes, however, the risk of recurrent MI in women was twice the risk in men. In addition, the effect of diabetes doubled the risk of recurrent MI in women (relative risk, 2.1) but had an insignificant effect in men.

Threat to Validity in a Cohort Study: Losses to follow-up • What bias do those lost introduce? • Missed disease diagnoses? • Those with a risk factor more or less likely to leave?

Subjects lost during follow-up • If losses are random, only power is affected • If disease incidence is the research question, losses that are related to the outcome will bias results • If the association of a risk factor with disease is the focus, losses bias results only if they are related to both the outcome and the risk factor
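These three statements can be checked with expected counts. The sketch below starts from a hypothetical cohort (1,000 exposed with risk 0.20, 1,000 unexposed with risk 0.10) and applies made-up retention probabilities under each loss pattern. The odds ratio is used as the association measure because it is exactly unchanged when losses depend on the outcome alone; the risk ratio shifts somewhat in that scenario, so the third bullet holds exactly only for the odds ratio.

```python
# Hypothetical full cohort, keyed by (exposed, diseased); true risks are 0.20 and 0.10
full_cohort = {(1, 1): 200, (1, 0): 800, (0, 1): 100, (0, 0): 900}

def retained(table, keep_prob):
    """Expected counts left after losses; keep_prob(exposed, diseased) -> probability."""
    return {cell: n * keep_prob(*cell) for cell, n in table.items()}

def incidence(table):
    """Proportion of retained subjects who developed disease."""
    return (table[(1, 1)] + table[(0, 1)]) / sum(table.values())

def odds_ratio(table):
    """Exposure-disease odds ratio in the retained subjects."""
    return (table[(1, 1)] * table[(0, 0)]) / (table[(1, 0)] * table[(0, 1)])

# Made-up retention patterns, one per bullet above (plus a no-loss reference)
scenarios = {
    "no losses": lambda e, d: 1.0,
    "random losses": lambda e, d: 0.8,
    "losses related to outcome only": lambda e, d: 0.5 if d else 0.9,
    "losses related to outcome and exposure": lambda e, d: 0.5 if (e and d) else 0.9,
}

results = {}
for label, keep_prob in scenarios.items():
    table = retained(full_cohort, keep_prob)
    results[label] = (incidence(table), odds_ratio(table))
    print(f"{label}: incidence {results[label][0]:.3f}, odds ratio {results[label][1]:.2f}")
```

Random losses leave both measures unchanged, outcome-related losses distort the incidence but not the odds ratio, and losses tied to both exposure and outcome distort the odds ratio as well.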

Two Cohort Studies of HCV/HIV Coinfection and Risk of AIDS • Swiss HIV Cohort (Greub, Lancet, 2000): 3111 patients, ‘96-’99; at least two visits; median follow-up 28 mos; HCV+ had more rapid disease progression, Adj RH = 1.7 (95% CI = 1.3 - 2.3); no loss to follow-up info • Johns Hopkins Cohort (Sulkowski, JAMA, 2002): 1955 patients, ‘95-’01; at least two visits; median follow-up 25 mos; HCV not associated with disease progression, Adj RH = 1.0 (95% CI = 0.9 - 1.2); no loss to follow-up info

Cross-Sectional Study Design (figure)

Cross-Sectional Design: Measures Prevalence • Measures prevalence of disease at one point in time. Two types: • Point prevalence: Do you currently have a backache? • Period prevalence: Have you had a backache in the past 6 months? • Don’t confuse the prevalence period with the length of time needed to conduct the study; e.g., both examples above could occur in a study conducted over 4 months.

Cross-Sectional Design: Weaknesses • Cannot determine whether putative cause preceded the disease outcome • Cannot distinguish factors associated with disease (incidence) from factors associated with survival with disease (prevalence)

Example of Cross-sectional Study • Objective: Determine the prevalence of neck and shoulder pain in a well-functioning cohort, identify factors associated with this pain, …and evaluate the impact of this pain on physical functioning. • Methods: 3,075 men and women in Health ABC study. 50% white/50% black. 70-79 y.o. Well-functioning. • Baseline visit: Participants were asked if they had had neck or shoulder pain lasting at least 1 month during the previous year, and to rate severity of any pain. Vogt et al. Neck and shoulder pain in 70- to 79-year-old men and women: findings from the Health, Aging and Body Composition Study. Spine 2003.

(continued) • Assessment of functioning was carried out using a battery of lower-extremity performance tests • Results: 11.9% reported neck pain and 18.9% reported shoulder pain. … More severe neck/shoulder pain was associated with lower physical performance measures.

Case-Control Design • Case-control studies are an efficient way to sample an underlying cohort • Study Base in case-control design = the population (or underlying cohort) that gave rise to the cases (Szklo and Nieto call it the “reference population”). • Case-control design is best understood by considering how the experience of a cohort is sampled • Study base is the key concept that links case-control design and cohort design

Case-Control Key Concept #1 • Think of the selection of cases and controls as occurring from an underlying cohort (or study base)

Primary & Secondary Study Bases • Case-control studies can be thought of as evolving from one of two study bases (i.e. underlying cohorts): • Primary study base vs Secondary study base • A primary study base is one where the underlying cohort is readily defined, either because it is: • previously assembled for research (e.g., Framingham) • administratively defined (e.g., Kaiser members in 2004) • or geographically defined (e.g., SF residents in 2008) • A secondary study base is more elusive. It starts with the cases and then works backwards.

Cohort study design (figure) • D = disease occurrence; CE = competing event; arrow = losses to follow-up • Given that all the cases are diagnosed, how would you sample controls from this cohort for a case-control study?

3 Ways to Sample Controls within a Cohort (Primary Study Base) • At time each case is diagnosed = incidence density sampling • A random sample of the cohort at baseline = case-cohort design • From persons without disease at the end of follow-up = prevalent controls design (aka “cumulative” or “epidemic” design)

Incidence Density Sampling within a Cohort Study • Study Base = Cohort • In this example, each time a case is diagnosed, controls are randomly sampled from those still in follow-up without the diagnosis. It follows that a subject who is a control at one time may become a case later.

Example of case-control study with incidence density sampling (of a primary study base) • Abstract: We investigated associations between metabolic syndrome, its components, and breast cancer risk in a nested case–control study on postmenopausal women of the ORDET cohort (N = 10,000). After a median follow-up of 13.5 years, 163 women developed breast cancer…Four matched controls per case were selected by incidence density sampling, and rate ratios were estimated by conditional logistic regression. Metabolic syndrome (i.e. presence of three or more metabolic syndrome components) was significantly associated with breast cancer risk (rate ratio 1.58 [95% confidence interval 1.07–2.33])… • Agnoli et al. Metabolic syndrome and postmenopausal breast cancer in the ORDET cohort: A nested case–control study. Nutr Metab Cardiovasc Dis. 2009 • Efficient design! 815 subjects (163 cases + 652 controls) represent the experience of 10,000

Counterintuitive Idea #1 • Individuals can be controls at one point in time and cases at a later time • Incidence density sampling: Sampling “person-time” not individuals. Each time a case occurs, the remaining cohort is sampled for controls from those in follow-up at that time. • Compare with a full cohort: Individuals contribute person-time up until becoming a case or the end of their follow-up. This case-control design samples from the person-time experience of the cohort.
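The sampling rule described here can be sketched in a few lines. This is a minimal illustration, not the procedure used in any of the cited studies: the six-person cohort, the function names, and the choice of two controls per case are all hypothetical, and matching factors are ignored. Each time a case is diagnosed, controls are drawn at random from the subjects still under follow-up and undiagnosed at that moment.

```python
import random

# Hypothetical closed cohort: id -> (time follow-up ends, ended with diagnosis?)
cohort = {
    1: (2.0, True),
    2: (5.0, False),
    3: (4.0, True),
    4: (3.0, False),  # left follow-up at t = 3 without the diagnosis
    5: (5.0, False),
    6: (4.5, True),
}

def risk_set(cohort, t_case):
    """Subjects still in follow-up and undiagnosed at the moment a case occurs."""
    return [pid for pid, (t_end, _) in cohort.items() if t_end > t_case]

def incidence_density_sample(cohort, controls_per_case, rng):
    """For each case, in time order, draw controls from the risk set at diagnosis."""
    cases = sorted((t_end, pid) for pid, (t_end, is_case) in cohort.items() if is_case)
    matched_sets = []
    for t_case, case_id in cases:
        eligible = risk_set(cohort, t_case)
        k = min(controls_per_case, len(eligible))
        matched_sets.append((case_id, t_case, sorted(rng.sample(eligible, k))))
    return matched_sets

rng = random.Random(0)  # fixed seed only so the sketch is repeatable
matched_sets = incidence_density_sample(cohort, controls_per_case=2, rng=rng)
for case_id, t_case, controls in matched_sets:
    print(f"case {case_id} at t={t_case}: controls {controls}")
```

Subjects 3 and 6 are eligible as controls for the case diagnosed at t = 2 even though both are diagnosed later, which is the counterintuitive point: the design samples person-time, not people.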

Another example of incidence density sampling • Objective: Associations between serum C-reactive protein (CRP) and colon and rectal cancer • Methods: Nested case-control study within the European Prospective Investigation into Cancer and Nutrition, over 520,000 participants recruited 1992–2000. • Cases: Incident cancer cases (N=1,096) identified through record linkage with regional cancer registries. Closure dates ranged from December 1999 to June 2003 • Controls: We used an incidence density sampling protocol for control selection (N=1,096), such that controls could include subjects who later became cases, while each control subject could also be sampled more than once. Matching characteristics were study center at the time of enrollment, sex, age at blood collection (6-month to 2-year intervals),… • Results: Evidence that elevated CRP concentrations are related to a higher risk of colon cancer but not rectal cancer. Aleksandrova et al. Circulating C-Reactive protein concentrations and risks of colon and rectal cancer. AJE 2010

Case-cohort design: sample baseline of cohort (figure)

Case-cohort design: Case-control study using sampling of the baseline of the underlying cohort Abstract We conducted a case cohort investigation of colorectal cancer among nondiabetic subjects enrolled in the Women’s Health Initiative Observational Study, a prospective cohort of 93,676 postmenopausal women. Fasting baseline serum specimens from all incident colorectal cancer cases (n = 438) and a random subcohort (n = 816) of Women’s Health Initiative Observational Study subjects were tested for insulin, glucose, total IGF-I, free IGF-I, IGF binding protein-3, and estradiol. Gunter et al. Insulin, Insulin-like Growth Factor-I, Endogenous Estradiol, and Risk of Colorectal Cancer in Postmenopausal Women. Cancer Res 2008;68(1):329–37

Individual can be a control and a case in case-cohort design • In the subcohort, some will go on to become cases. How can we include them in the control group? • Goal is to compare the prevalence of the exposure of interest between the cases and the cohort. Not comparing cases and “non-cases.” • By comparing exposure between the cases and the underlying cohort, we can calculate an unbiased estimate of the risk ratio. • More on this in the Disease Association lectures.
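The claim that comparing cases with the cohort yields the risk ratio follows from simple arithmetic: the risk ratio equals the exposure odds among cases divided by the exposure odds in the cohort at baseline, and a random subcohort estimates the latter. The sketch below checks this with hypothetical counts and the expected composition of a 10% subcohort; the numbers and variable names are assumptions for illustration only.

```python
# Hypothetical full cohort: persons at baseline and cases arising during follow-up
cohort = {"exposed": 2000, "unexposed": 8000}
cases = {"exposed": 120, "unexposed": 240}  # risks of 0.06 and 0.03

true_risk_ratio = (cases["exposed"] / cohort["exposed"]) / (
    cases["unexposed"] / cohort["unexposed"]
)

# Expected composition of a 10% random subcohort drawn at baseline,
# sampled without regard to who later becomes a case
sampling_fraction = 0.10
subcohort = {group: n * sampling_fraction for group, n in cohort.items()}

# Exposure odds among cases divided by exposure odds in the subcohort
exposure_odds_cases = cases["exposed"] / cases["unexposed"]
exposure_odds_subcohort = subcohort["exposed"] / subcohort["unexposed"]
case_cohort_estimate = exposure_odds_cases / exposure_odds_subcohort

print(f"True risk ratio:      {true_risk_ratio:.2f}")
print(f"Case-cohort estimate: {case_cohort_estimate:.2f}")
```

Because the subcohort mirrors the baseline exposure distribution of the whole cohort, future cases inside it are not a problem; they are part of what is being represented.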

Another case-cohort example • Objective: Determine whether serum C-reactive protein levels, a sensitive indicator of inflammation, are associated with the risk of cardiovascular mortality among older women. • Methods: We conducted a case-cohort study within the Study of Osteoporotic Fractures, a population-based study involving 9,704 women aged ≥ 65 years from four U.S. centers. • Cases: Random sample of 92 women from the 1,125 women in the cohort who had died during the first 6 years of follow-up. • Controls: We randomly selected 400 women from the entire cohort. • Exposure: Baseline serum C-reactive protein levels • Results: During 6 years of follow-up, women with C-reactive protein levels in the highest quartile (> 3.0 mg/L) had an 8.0-fold (95% confidence interval [CI]: 2.2 to 29) greater risk of cardiovascular mortality than those in the lowest quartile (≤ 1.0 mg/L). Tice et al. The relation of C-reactive protein levels to total and cardiovascular mortality in older U.S. women. Am J Med 2003.

Case-control design using incident cases + prevalent controls (figure)

Case-Control => Case-Referent • We often think of controls in the “case-control” design as "never cases." With that framework, the prevalent control design looks best. However, as we can see, the prevalent control design is biased. So, case-control is sort of a misnomer. It is best thought of as a "case vs reference" design.
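One way to read "biased" here is relative to the risk ratio: when controls are the people still disease-free at the end of follow-up, the case-control comparison yields the odds ratio, which drifts away from the risk ratio as the disease becomes more common. The sketch below is an illustration under that reading, using hypothetical cumulative risks with the true risk ratio fixed at 2.

```python
def odds_ratio_from_risks(risk_exposed, risk_unexposed):
    """Odds ratio obtained when cases are compared with end-of-follow-up non-cases."""
    odds_exposed = risk_exposed / (1 - risk_exposed)
    odds_unexposed = risk_unexposed / (1 - risk_unexposed)
    return odds_exposed / odds_unexposed

true_risk_ratio = 2.0
comparison = {}
# Hypothetical cumulative risks in the unexposed: rare, moderate, common disease
for risk_unexposed in (0.005, 0.10, 0.30):
    risk_exposed = true_risk_ratio * risk_unexposed
    comparison[risk_unexposed] = odds_ratio_from_risks(risk_exposed, risk_unexposed)
    print(
        f"risk in unexposed {risk_unexposed:.3f}: "
        f"risk ratio {true_risk_ratio:.2f}, odds ratio {comparison[risk_unexposed]:.2f}"
    )
```

With a rare outcome the two measures nearly coincide, which is why this design is acceptable only under a rare-disease assumption.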

Case-Control Key Concept #2 • Closed (fixed) vs open (dynamic) cohorts • Any well-enumerated, administratively or geographically defined population can be thought of as a dynamic cohort that continues to recruit new subjects over time

Incidence Density Sampling in a Dynamic Cohort (e.g., a primary study base: San Francisco County) • Figure: incidence-density sampling in a specified population with new subjects (new residents) entering over time; D = disease occurrence

Example of case-control study with incidence density sampling in a primary (dynamic) study base • The association between long-term use of NSAIDs and non-Hodgkin lymphoma (NHL) was examined …in an ongoing population-based case-control study in the SF Bay Area. Cases were identified using Surveillance, Epidemiology, and End Results (SEER) registry data. …Controls were a random sample of persons identified by random digit dialing…resided in the same six counties as the cases at the time of diagnosis… …After adjustment for age and sex, there was no consistent association between long-term use and NHL for all NSAIDs combined, aspirin, nonselective NSAIDs, and COX-2 inhibitors. • Flick, et al. Use of Nonsteroidal Antiinflammatory Drugs and Non-Hodgkin Lymphoma: A Population-based Case-Control Study. Am J Epidemiol 2006; Sept 1, 164:497-504.

Dynamic study base: Cases • Incident NHL between October 2001 and May 2004 in the 6 SF Bay Area counties. • Eligible cases were those who were SF Bay Area residents at the time of diagnosis, aged 21–85 years at the time of diagnosis, and able to complete an interview in English. NHL cases were identified using rapid case ascertainment and SEER registry data. • A total of 1,000 cases

Dynamic study base: controls • Control participants were frequency-matched to cases by age, sex, and county of residence. Control participants were a random sample of persons identified by random digit dialing or from Centers for Medicare and Medicaid Services files. Eligible controls had no history of NHL, resided in the same six counties as the cases at the time of diagnosis, were aged 21–85 years, and were able to complete an interview in English. • Participants completed in-person interviews designed to measure potential NHL risk factors. Questions were asked regarding use during the past 20 years of aspirin, prescription and over-the-counter nonselective NSAIDs, and cyclooxygenase-2 (COX-2) inhibitors. • A total of 1,060 controls • Efficiency prize! 1,000 cases + 1,060 controls represent experience of Bay Area population (~4 million)

Definition of a Primary Study Base • Primary Study Base = population that gives rise to cases that can be defined before cases appear • Typically, based on participation in a research study; administrative definition; or geographical area • Our previous examples of an “underlying cohort” for a case-control study were primary study bases

Secondary Study Base • Secondary Study Base = population that gave rise to cases, defined as those persons who would have been identified as cases if they had developed disease during the period of study • Start with cases and then attempt to identify hypothetical cohort that gave rise to them • Difficult concept but crucial to case-control design when primary study base not possible

Case Control Selection from a Secondary Study Base • Source of cases is often one or more hospitals or other medical facilities • For controls, problem is identifying who would come to the facility if diagnosed with the disease • Careful consideration has to be given to factors causing someone to show up at that institution with that diagnosis

Secondary Study Base • Example: Incident glioma cases seen at UCSF • The secondary study base is: persons who, if they had developed incident glioma, would have been seen at UCSF. • Can we identify this secondary study base in practice? Referrals to UCSF come from many areas and for different reasons • Common approaches fall short • One possible control group might be UCSF patients with a different neurologic disease • Patients from a similar tertiary referral clinic are another possible control group • Residents of the neighborhood of the case are another possibility
