OBSERVATIONAL STUDIES-2 BY Dr. Bashir Ssuna MBCh.B, MSc Makerere Epidemiology and Statistical Center email@example.com
Cross-sectional Study A study that examines the relationship between one or more outcomes and other variables of interest (risk factors) as they exist in a defined population at a particular point in time.
Structure of a cross-sectional study • Draw a sample from the population • For each individual, measure all predictors/exposures and outcomes on a single occasion • Determine the presence or absence of exposure and the presence or absence of disease for each individual. • The choice of variables to label as predictors or outcomes depends on the cause-effect hypotheses of the investigator and not the study design • There is no follow-up • A snapshot of the population at a certain point in time
Hypothetical example A team of investigators at Mbarara University have the following research questions: • “What is the prevalence of Chlamydia infection among women attending the ANC at MNRH?” • “Is Chlamydia infection associated with use of contraceptives?”
Planned design • Investigators will select a sample of 560 women attending the ANC clinic for the first time between October 2012 and April 2013 • Take a history of contraceptive use and measure socio-demographic variables • Cervical swabs will be taken from each study participant and sent to the laboratory for culture • Culture results will be obtained within two weeks of sampling What is this study design?
Cross-sectional studies…ctd • Cross-sectional studies are largely descriptive epidemiologic studies • Used to generate hypotheses • Also called prevalence studies
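Presentation of data from a cross-sectional study in a two-by-two table Rows give exposure status and columns give disease status: a = exposed with disease, b = exposed without disease, c = unexposed with disease, d = unexposed without disease. Prevalence of disease among the exposed = a/(a+b) Prevalence of exposure among people with disease = a/(a+c)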
Measures of association in cross-sectional studies • Odds ratio (ratio of odds) • Odds ratio = ad/bc • Odds ratio (OR) ranges between 0 and infinity. If: • OR=1, there is no association • OR<1, exposure is associated with lower odds of disease (a possible protective effect) • OR>1, exposure is associated with higher odds of disease
Measures of association in cross-sectional studies..ctd • Prevalence difference Prevalence of disease among the exposed(pe)=a/(a+b) Prevalence of disease among the non-exposed(pu)=c/(c+d) • Difference in prevalence =pe-pu
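The measures on the two slides above can be sketched in a few lines of Python. This is a minimal illustration only: the class name `TwoByTwo` and the cell counts are hypothetical and do not come from any study. It assumes the layout implied by the formulas, with a and b as the exposed row and c and d as the unexposed row.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Cell counts, assuming rows = exposure and columns = disease.

    a: exposed, diseased       b: exposed, not diseased
    c: unexposed, diseased     d: unexposed, not diseased
    """
    a: int
    b: int
    c: int
    d: int

    def prevalence_exposed(self) -> float:
        # pe = a / (a + b)
        return self.a / (self.a + self.b)

    def prevalence_unexposed(self) -> float:
        # pu = c / (c + d)
        return self.c / (self.c + self.d)

    def prevalence_difference(self) -> float:
        # pe - pu
        return self.prevalence_exposed() - self.prevalence_unexposed()

    def odds_ratio(self) -> float:
        # OR = ad / bc (undefined if b or c is zero)
        return (self.a * self.d) / (self.b * self.c)


# Hypothetical counts, for illustration only.
table = TwoByTwo(a=30, b=70, c=10, d=90)
print(f"pe = {table.prevalence_exposed():.2f}")
print(f"pu = {table.prevalence_unexposed():.2f}")
print(f"prevalence difference = {table.prevalence_difference():.2f}")
print(f"odds ratio = {table.odds_ratio():.2f}")
```

With these made-up counts the odds ratio is above 1, which under the interpretation given earlier would mean the exposure is associated with disease.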
Determinants of infant growth in Eastern Uganda: a community-based cross-sectional study. Engebretsen et al., BMC Public Health 2008, 8:418 Background: Child under-nutrition is a leading factor underlying child mortality and morbidity in Sub-Saharan Africa. Several studies from Uganda have reported impaired growth, but there have been few if any community-based infant anthropometric studies from Eastern Uganda. Aims: • To describe current infant growth patterns using WHO Child Growth Standards • To determine the extent to which these patterns are associated with infant feeding practices, equity dimensions, morbidity and use of primary health care for the infants Methods: Study site, sampling and participants: The study was conducted from September to November 2003 in Mbale District, Eastern Uganda, in one urban area (Mbale municipality) and the surrounding rural county of Bungokho. Administrative information was retrieved from the Uganda Bureau of Statistics in Entebbe, which gave us parish sizes and the number of villages within each parish (http://www.ubos.org). The populations were sampled on the basis of probability proportional to size: the appropriate number of villages was randomly selected in each parish according to parish size and seven households were randomly selected in each village. The sample was not stratified on urban/rural status. The study site, design, questionnaire details and definitions are fully described elsewhere.
Methods (continued) Engebretsen et al., 2008 The population is semi-urban and comprises mainly subsistence farmers. We contacted 793 randomly-selected caretaker-infant (0–11 months) pairs; 30 were non-respondents, and 36 were excluded because the caretaker was not the mother of the infant and the data were incomplete. Four were excluded because of inconsistent anthropometric values. The exclusion criteria we used were weight-for-length z-scores (WLZ) more than +2 and length-for-age z-scores (LAZ) less than -3. This was in line with a conservative interpretation of the criteria used by WHO to avoid "unhealthy weights for length/height, observations falling above +3 SD (standard deviations) and less than -3 SD". We excluded no infants with WLZ less than -3 SD from our analysis because the data collectors described them as having very bad health status and the measurements seemed plausible. This left 723 mother-infant pairs to be included in the analysis. Data collection, measurements and handling: We used a structured questionnaire that included 24-hour dietary recall and dietary recall since birth on 35 food and liquid items. It also included questions on breastfeeding in general, pre-lacteal feeding, initiation of breastfeeding, socio-demographic characteristics, water and sanitation, education of mothers and fathers, having brothers and/or sisters, type of work, marital status, immunisation status, primary health care usage for the infants and recent sickness.
Methods (continued) Engebretsen et al., 2008 Weight and recumbent length were taken according to WHO standardized techniques. Undressed infants were weighed to the nearest 0.1 kg using 25 kilogram (kg) portable Salter spring scales and recumbent length was measured to the nearest 0.1 cm. Validation of instruments and measurements and random auditing were done on a daily basis. Data were entered using EpiData 3.0 and analysed using SPSS 15.0.1 and STATA 9.2. Anthropometric indices were generated using WHO Anthro 2005 software. Results: The prevalences of wasting and stunting were 4.2% and 16.7%, respectively. Diarrhoea during the previous 14 days was associated with wasting in the crude analysis, but no factors were significantly associated with wasting in the adjusted analysis. The adjusted analysis for stunting showed associations with age and gender. Stunting was more prevalent among boys than girls, 58.7% versus 41.3%. Having brothers and/or sisters was a protective factor against stunting (OR 0.4, 95% CI 0.2–0.8), but replacement or mixed feeding was not (OR 2.7, 95% CI 1.0–7.1). Lowest household wealth was the most prominent factor associated with stunting with a more than three-fold increase in odds ratio (OR 3.5, 95% CI 1.6–7.8). This pattern was also seen when the mean LAZ was investigated across household wealth categories: the adjusted mean difference between the top and the bottom wealth categories was 0.58 z-scores, p < 0.001. Those who had received pre-lacteal feeds had lower adjusted mean WLZ than those who had not: difference 0.20 z-scores, p = 0.023.
Strengths of cross-sectional studies • They are fast and inexpensive • No waiting for the outcome to occur • There is no loss to follow-up • Cross-sectional design is the only one that provides the prevalence of the disease or risk factor • Often used as a first step in a cohort or experimental study • Provides baseline information of the study group and may reveal cross-sectional associations
Weaknesses • Cannot measure incidence • Difficult to establish causal relationships. The investigator cannot establish the temporal sequence between the exposure and the outcome. • Impractical for the study of rare diseases if data are to be collected from individuals in the general population • E.g. a cross-sectional study of stomach cancer in a general population of 45-49 year old men may need about 10,000 participants to find just one case.
Weaknesses …ctd • Inefficient for studying uncommon exposures • Tend not to detect rapidly fatal instances of the outcome or individuals who recover quickly • Cross-sectional studies conducted in occupational settings may be studying “healthy workers,” who are unrepresentative due to preferential loss or death of individuals in a given outcome group • An observed association between exposure and disease may be an association with survival after developing disease, and not the risk of developing disease
Serial surveys • Also called longitudinal surveys • E.g. Demographic and Health Surveys • Are a series of cross-sectional surveys of a single population at several points in time • Used to draw inferences about changing patterns of disease or exposures over time • Also useful when the investigator wishes to characterize the changes in the population over time but is concerned that in a cohort design the initial examination will produce a learning effect • i.e. that the initial examination will influence responses to follow-up examinations
CASE-CONTROL STUDIES • Also called case-referent studies • Definition: • Case-control studies are observational studies in which persons with a disease or health outcome (the cases) are compared with individuals without the disease or health outcome (the controls) with regard to various attributes (e.g. exposures, risk factors). • In case-control studies, the investigator begins with the outcome status of study participants (case or control) and then ascertains their exposure status.
Presentation of data from a case-control study in a two-by-two table
“TYPES” OF CASE-CONTROL STUDIES • Retrospective: Case-control studies are generally retrospective. • At the beginning of the study, investigators identify a group of subjects with the disease (cases) and another group (controls) without the disease. • Cases and controls are compared in terms of past or existing exposures or attributes which are thought to be relevant to the development of the disease under study, i.e. look for differences in attributes that may explain why the cases got the disease and the controls did not.
“TYPES” OF CASE-CONTROL STUDIES • Prospective: Case-control study in which cases with new onset are used. • Matched: Case-control study in which individual cases are “matched” to one (or more) controls and matched sets are created. • Nested: Case-control study done within a prospective cohort • Case-cohort: Case-control study in which controls are selected from among the population at risk at the beginning of the follow-up time period. Some of the controls may later become cases
Case-Control Studies Case control studies may be hospital-based or community-based. • Hospital-based: • Case-control studies in which the cases are ascertained based on admission to one or more hospitals. • Controls can be selected from among other individuals hospitalized at the same hospital(s) for conditions unrelated to the disease of interest or from the community. • Unless all or virtually all cases of the outcome are hospitalized and all hospitals in an area are included as enrollment sites, the cases may not be representative of all cases in the population.
Case-Control Studies • Population-based: • Case-control studies in which all diagnosed cases (or a representative sample of cases) arising in a given geographic area/population are included, regardless of whether they are hospitalized, where they obtain medical care, etc. (e.g. case-control studies conducted through tumor registries, birth defects monitoring programs, population-based surveillance programs, etc.)
When is it advantageous to use the Case-Control design? • To study rare diseases • To study diseases with long latency • When exposure data are difficult or expensive to obtain • To investigate outbreaks • To assess the effectiveness of vaccines and screening programs as used in the “real world” • To study new diseases/diseases of uncertain etiology, when there are many hypotheses to be tested
Examples of Case Control Studies • DES and Adenocarcinoma of the Vagina • Menstrual Toxic Shock Syndrome TSS continues to occur in association with menstruation and tampon use and in other circumstances. • Food borne Illness in Sierra Leone • Cell phones and Brain Tumors • Alcohol use and premature mortality
DES and Adenocarcinoma of the Vagina • DES (diethylstilbestrol) is a man-made (synthetic) form of estrogen, a female hormone. Doctors prescribed it from 1938 until 1971 to help some pregnant women who had had miscarriages or premature deliveries in the past. • Health Risks and Related Concerns for DES Daughters: More than 30 years of research have confirmed that health risks are associated with DES exposure. • DES Daughters are at an increased risk for: Clear cell adenocarcinoma (CCA), a rare kind of vaginal and cervical cancer. Increased risk for clear cell cancer appears to be highest for DES Daughters in their teens and early 20s. However, cases have been reported for DES Daughters in their 30s and 40s (Hatch, 1998). • Reproductive tract structural differences (for example, T-shaped uterus), pregnancy complications, such as ectopic (tubal) pregnancy and pre-term delivery, and infertility.
CASE-CONTROL STUDIES – RARE OUTCOMES • Adenocarcinoma of the vagina and in utero exposure to diethylstilbestrol (DES) • Cancer of the vagina is very rare, particularly in women < 25 years of age • Between 1966 and 1969, seven cases in young women (15-22 years of age) were seen in one hospital; an eighth case was seen at a nearby hospital • No similarities among cases regarding douches, tampons, or other vaginal irritants. Only 1 of 7 had initiated sexual activity, and none had used birth control.
CASE-CONTROL STUDIES - RARE OUTCOMES (CONTINUED) • Cases: 8 cases with histologically confirmed cancer of the vagina • Controls: 4 controls per case, selecting females born within five days of the case and in the same hospital/on the same service • No differences between cases and controls in maternal age, maternal smoking, breastfeeding, intrauterine x-ray exposure, or many other exposures. • Significant difference in oestrogen use during pregnancy among cases compared to controls Herbst et al, 1971
CASE-CONTROL STUDIES – OUTBREAK INVESTIGATIONS Food borne Illness • May 20, 1986: 18 children and 9 adults in the town of Kenema, Sierra Leone, presented at clinics with sudden onset of weakness, dizziness, vomiting, and diarrhea beginning within 30 minutes of eating. Those severely affected frothed at the mouth, became short of breath and lost consciousness. Seven of 27 died within a few hours. • May 21, 1986: 13 people in the village of Lalehun (30 miles away) developed the same symptoms - 5 died. • June 1, 1986: 9 more people in Kenema became ill - 2 died.
Case-Control Studies - Outbreak Investigations (continued) • Case Definition: An afebrile person admitted to the hospital between May 20 and June 15 with loss of consciousness and at least one of the following signs: Excess salivation, frothing at the mouth, excess sweating, or muscle twitching. • Control Selection: Selected alphabetically from households of the patients. Had to be > 6 months of age, in good health, and available for interview.
Case-Control Studies - Outbreak Investigations (continued) • Information Collected • Signs and symptoms present • Foods and beverages consumed during the 4 hours prior to onset (cases) or on the day the patient became ill (controls) - 53 commonly used foods and beverages.
CASE-CONTROL STUDIES – OUTBREAK INVESTIGATIONS (CONTINUED) Odds Ratio= (14*19)/(3*7) =12.7 Bread consumption was strongly associated with disease The odds of bread consumption among cases are 12.7 times the odds of bread consumption among controls.
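The arithmetic on this slide can be checked with a short Python sketch. The slide gives only the cross-products 14*19 and 3*7, so which off-diagonal cell holds 3 and which holds 7 is an assumption here; neither the odds ratio nor the interval below depends on that choice. The 95% confidence interval uses Woolf's log-based approximation, which is an addition for illustration and is not stated in the original slides; the function name `odds_ratio_with_ci` is hypothetical.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Return (OR, lower, upper) for a 2x2 table with cells a, b, c, d.

    OR = ad / bc. The interval is Woolf's approximation:
    exp(ln(OR) +/- z * sqrt(1/a + 1/b + 1/c + 1/d)).
    All four cells must be non-zero.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Cells taken from the cross-products on the slide; the placement of
# 3 and 7 in the off-diagonal is assumed.
bread_or, bread_lo, bread_hi = odds_ratio_with_ci(a=14, b=3, c=7, d=19)
print(f"OR = {bread_or:.1f}")  # OR = 12.7
print(f"approximate 95% CI: {bread_lo:.1f} to {bread_hi:.1f}")
```

With cell counts this small the interval is wide, so the point estimate should be read as strong but imprecise evidence of an association.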
Three principles to consider in case control design • The study base principle • The deconfounding principle • The comparable accuracy principle
The “study base” The most difficult part of case-control design is defining the study base • Definitions of “study base” concept • The set of persons in which diseased subjects become cases • The members of the underlying cohort or source population during the time period when cases are identified • Goal is to sample controls from the study base in which the cases arose • “simplest” (in theory not in practice) way to do this is to randomly sample controls from the study base • controls should be sampled from the study base at risk at the same calendar time or age
Dealing with confounding • The study base principle guides the selection of who can be entered into the study • When the exposure of interest is associated with other possible risk factors, the investigator needs to deal with potential confounding at the design stage. • Confounding by a factor is eliminated by eliminating variability in that factor. For example, if gender is a possible confounder, selecting only men or only women completely eliminates the variability of gender. • This reduction in variability is the rationale for the choice of controls from the same neighborhood or family as the case
Dealing with confounding • Known confounders that can be measured are usually controlled in an analysis. Thus one should collect data on potential confounding variables to be able to rule out alternative explanations for the apparent association or no association between exposure and outcome. • The degree of confounding from an unmeasured confounder depends on the strength of associations between it and the study exposure and/or disease risk. This may be reduced by matching cases to controls e.g. using sibling controls to minimize the effect of genetic differences
The comparable accuracy principle • Comparable accuracy principle • The accuracy of the measurement of the exposure of interest in the cases should be the same as that in the controls • Example: in a study of the effect of smoking on lung cancer it would not be appropriate to measure smoking with urine cotinine levels in the cases and with questionnaires in the controls • Example: in a study of a fatal disease, it is inappropriate to measure an exposure by questioning the relatives of deceased cases but questioning the actual controls
Case Selection in Case-Control Studies • Source population should be definable • Case definition should provide accurate classification of those with and without disease • Selection of cases must be independent of their exposure status (i.e. exposed and unexposed cases should have the same probability of selection) • If a combination of exposure and disease under study increases the risk of hospital admission, this will lead to a higher exposure rate among the hospital cases than the hospital controls • Incident cases are typically preferred over prevalent cases
Exclusion criteria in case control studies • Exclusion criteria that apply equally to cases and controls are valid because such criteria merely narrow the scope of the study base. • Exclusion criteria that apply unequally to cases and controls violate the study base principle.
Types of controls in case control studies • Population controls • Selection from a roster • Selection without a roster • Hospital or disease registry controls • Controls from a medical practice • Friend controls • Relative controls • Proxy respondents and deceased controls
Population controls • If the probability of identification of a case depends on some other variable (such as access to care) then there is the potential for selection bias unless control selection depends upon the same variable. • Example: If access to care is strongly related to the case condition, sometimes the use of hospitalized controls may be more appropriate than the use of population controls. • Example: In a study of occupational exposures using population controls, investigators might exclude hard to reach rural hospitals (for logistic reasons). This may lead to an overrepresentation of urban occupations among the cases. • Alternatives: use of hospitalized controls or stratification by geographic region.
Population controls • Poor sampling of the population experience • There is incomplete case ascertainment • A true random sample cannot be obtained because of nonresponders or inadequate sampling frames. • In these situations, hospitalized or other types of controls may be appropriate • Inconvenience • Sampling population controls can be expensive and time consuming
Population controls when a roster exists • When a population roster is available, the selection of population controls is simplest. • Types of rosters: • Census lists (available in some states and other countries) • Birth certificates • Electoral rolls • If cases are subsequently identified who could not have been originally found on the roster (sampling frame), they should be excluded. • Example: one wouldn’t include non-citizen cases if the sampling frame were electoral records • Some possible approaches when no roster is available: • Random digit dialing • Neighborhood controls
Random digit dialing (RDD) • Random digit selection eliminates problems from using a directory that may have missing or unlisted numbers • Standard approach to RDD for a national study • Random sample is drawn from working sets of telephone exchanges (area code + prefix + two numbers) • The 9th and 10th digits are completed randomly and the number is called • If the number is a residential number, a predetermined number of calls are made to that same exchange; if the number is not a residential number, the exchange is discarded • Potential sources of bias arising from RDD • Incomplete telephone coverage by SES level • Multiple numbers in the same household
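The digit-completion step described above can be sketched as follows. This is a simplified illustration, not a full RDD protocol: the 8-digit blocks are made up, the function name `rdd_numbers` is hypothetical, and the screening step (keeping or discarding an exchange depending on whether the first number reached is residential) is left out because it requires actually placing calls.

```python
import random


def rdd_numbers(working_blocks, per_block, rng=None):
    """Complete each 8-digit block with two random final digits.

    working_blocks: strings of 8 digits (area code + prefix + two digits).
    per_block: how many 10-digit numbers to generate for each block.
    Duplicates within a block are possible and are not removed here.
    """
    rng = rng or random.Random()
    numbers = []
    for block in working_blocks:
        for _ in range(per_block):
            # The 9th and 10th digits are drawn uniformly from 00 to 99.
            suffix = f"{rng.randrange(100):02d}"
            numbers.append(block + suffix)
    return numbers


# Hypothetical blocks, for illustration only.
blocks = ["41455512", "31255587"]
sample = rdd_numbers(blocks, per_block=3, rng=random.Random(0))
for number in sample:
    print(number)
```

Passing a seeded `random.Random` makes the draw reproducible, which helps when the sampling procedure has to be documented or audited.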
Neighborhood controls • Controls are selected based on residential location (i.e. near the case) rather than telephone number • Can be useful when telephone coverage is low • Creating the roster if it doesn’t exist can be very expensive • Neighborhood controls are usually selected non-randomly within a defined geographic stratum. • To ensure that selection is not somehow related to the exposure, some algorithm should be in place. For example, in a survey of cases in a village one could always select the first house to the north of the case house. • Avoid letting the interviewer select controls on the basis of willingness to cooperate. • The neighborhood control should have been a resident at the time the case was diagnosed
Advantages of neighborhood controls • Control selection does not require the prior existence of a roster • Does not require use of telephone • Unmeasured confounding factors related to geography and SES may be balanced between cases and controls • Attractive alternative for studies with a primary base in which there is no roster or for studies where cases are obtained from hospital lists
Hospital controls for case control studies • Hospitalized controls are usually a non-random sample of any study base. The use of hospitalized controls is reasonable only under certain conditions: • Example: Identical catchment populations • Control subjects would have been admitted to the same hospital if they had had the same disease that caused the cases to be admitted. • Impact: Determinants of hospitalizations and the choice of hospital must be considered carefully in studies with hospitalized controls
Advantages of hospital controls • Comparable quality of information • Tend to have same type and quality of medical record information as the cases • Example: A study of birth-related risk factors for testicular cancer in men treated at military or tertiary care hospitals was conducted. • The controls were age-matched men who had other (non-testicular) cancers believed to be unrelated to the exposures of interest • The mothers of the cases and the controls provided the information about birth-related events • What do you think about the controls? • Convenience • Especially true if biologic samples such as blood are needed