## Principles of Epidemiology

Dona Schneider, PhD, MPH, FACE

**Epidemiology Defined**
• Epi + demos + logos = “that which befalls man”
• The study of the distribution and determinants of disease frequency in human populations (MacMahon and Pugh, 1970)

**Epidemiology Defined**
• The study of the distribution and determinants of health-related states or events in specified populations and the application of this study to the control of health problems (John Last, 1988)

**Uses of Epidemiology**
• Identifying the causes of disease
  • Legionnaires’ disease
• Completing the clinical picture of disease
  • Tuskegee experiment
• Determining effectiveness of therapeutic and preventive measures
  • Mammograms, clinical trials
• Identifying new syndromes
  • Varieties of hepatitis

**Uses of Epidemiology**
• Monitoring the health of a community, region, or nation
  • Surveillance, accident reports
• Identifying risks in terms of probability statements
  • DES daughters
• Studying trends over time to make predictions for the future
  • Smoking and lung cancer
• Estimating health services needs

**Life Table of Deaths in London**
[Table not reproduced. Source: Graunt’s Observations, 1662]

**Graunt’s Observations**
• Excess of male births
• High infant mortality
• Seasonal variation in mortality

**Yearly Mortality Bill for 1632: Top 10 Causes of Death**
[Bar chart of number of deaths by cause: Chrisomes & Infants; Consumption; Fever; Collick, Stone, Strangury; Flox & Small Pox; Bloody Flux, Scowring & Flux; Dropsie & Swelling; Convulsion; Childbed; Liver Grown]

**Leading Causes of Death in US: 1990**
[Bar chart of death rates per 100,000 by cause: Heart disease; Cancer; Stroke; Unintentional injury; Lung diseases; Pneumonia and influenza; Diabetes; Suicide; Liver disease; HIV/AIDS]

**Endemic vs. Epidemic**
[Figure: number of cases of a disease plotted against time, with the endemic level and an epidemic peak labeled]

[Figure: time axis marked 1900, 1940, 1960, 1980, 2000]

**Statistics**
• Statistics: A branch of applied mathematics which utilizes procedures for condensing, describing, analyzing and interpreting sets of information
• Biostatistics: A subset of statistics used to handle health-relevant information

**Statistics (cont.)**
• Descriptive statistics: Methods of producing quantitative summaries of information
  • Measures of central tendency
  • Measures of dispersion
• Inferential statistics: Methods of making generalizations about a larger group based on information about a subset (sample) of that group

**Populations and Samples**
• Before we can determine what statistical test to use, we need to know if our information represents a population or a sample
• A sample is a subset which should be representative of a population

**Samples**
• A sample should be representative if selected randomly (i.e., each data point should have the same chance for selection as every other point)
• In some cases, the sample may be stratified but then randomized within the strata

**Example**
We want a sample that will reflect a population’s gender and age:
• Stratify the data by gender
• Within each stratum, further stratify by age
• Select randomly within each gender/age stratum so that the number selected will be proportional to that of the population

**Populations and Samples**
• The notation tells you whether you are looking at statistics on a population or a sample
  • Greek letters stand for population parameters (unknown but fixed)
  • Roman letters stand for sample statistics (known but random)

**Classification of Data**
Qualitative or Quantitative
• Qualitative: non-numeric or categorical
  • Examples: gender, race/ethnicity
• Quantitative: numeric
  • Examples: age, temperature, blood pressure

**Classification of Data**
Discrete or Continuous
• Discrete: having a fixed number of values
  • Examples: marital status, blood type, number of children
• Continuous: having an infinite number of values
  • Examples: height, weight, temperature

**Hint**
• Qualitative (categorical) data are discrete
• Quantitative (numerical) data may be
  • discrete
  • continuous

**Qualitative Data: Nominal**
• Data which fall into mutually exclusive categories (discrete) for which there is no natural order
• Examples:
  • Race/ethnicity
  • Gender
  • Marital status
  • ICD-10 codes
  • Dichotomous data such as HIV+ or HIV-; yes or no

**Qualitative Data: Ordinal**
• Data which fall into mutually exclusive categories (discrete data) which have a rank or graded order
• Examples:
  • Grades
  • Socioeconomic status
  • Stage of disease
  • Low, medium, high

**Quantitative Data: Interval**
• Data which are measured by standard units
• The scale measures not only that one data point is different from another, but by how much
• Examples:
  • Number of days since onset of illness (discrete)
  • Temperature in Fahrenheit or Celsius (continuous)

**Quantitative Data: Ratio**
• Data which are measured in standard units where a true zero represents total absence of that unit
• Examples:
  • Number of children (discrete)
  • Temperature in Kelvin (continuous)

**Review of Descriptive Biostatistics**
• Mean
• Median
• Mode and range
• Variance and standard deviation
• Frequency distributions
• Histograms

**Mean**
• Most commonly used measure of central tendency
• Arithmetic average
• Formula: $\bar{x} = \sum x / n$
• Sensitive to outliers

**Example: Number of accidents per week**
8, 5, 3, 2, 7, 1, 2, 4, 6, 2
$\bar{x} = (8+5+3+2+7+1+2+4+6+2) / 10 = 40 / 10 = 4$

**Median**
• The value which divides a ranked set into two equal parts
• Order the data
• If n is even, take the mean of the two middle observations
• If n is odd, the median is the middle observation

**Example: Median**
Given an even number of observations (n = 10):
1, 2, 2, 2, 3, 4, 5, 6, 7, 8
Median = (3+4) / 2 = 3.5

Given an odd number of observations (n = 11):
1, 2, 2, 2, 3, 4, 5, 6, 7, 8, 10
The median is observation number (n+1)/2 = (11+1)/2 = 6, so Median = 4

**Mode**
• The number which occurs the most frequently in a set
• Example: 1, 2, 2, 2, 3, 4, 5, 6, 7, 8
• Mode = 2

**Range**
• The difference between the largest and smallest values in a distribution
• Example: 1, 2, 2, 2, 3, 4, 5, 6, 7, 8
• Range = 8 - 1 = 7

**Variance and Standard Deviation**
• Measures of dispersion (or scatter) of the values about the mean
• If the numbers are near the mean, the variance is small
• If the numbers are far from the mean, the variance is large

**Variance**
$V = \sum (x - \bar{x})^2 / (n - 1)$
$V = [(8-4)^2 + (5-4)^2 + (3-4)^2 + (2-4)^2 + (7-4)^2 + (1-4)^2 + (2-4)^2 + (4-4)^2 + (6-4)^2 + (2-4)^2] / (10 - 1) = 52 / 9 \approx 5.778$

**Standard Deviation**
$SD = \sqrt{V} = \sqrt{5.778} \approx 2.404$

**Symmetric and Skewed Distributions**
[Figure: a symmetrical and a skewed distribution, with the positions of the mean, median, and mode marked on each]

**Frequency Diagrams of Symmetric and Skewed Distributions**
[Figure: frequency diagrams of a skewed and a symmetric distribution]

**Frequency Diagram for 12 Psychiatric Patients**
[Figure: frequency plotted against score]

**Histogram**
[Figure: frequency plotted against number of accidents per week]

**Frequency Polygon**
[Figure: frequency plotted against number of accidents per week]

**Frequency Polygon and Histogram**
[Figure: frequency polygon overlaid on the histogram of number of accidents per week, with paired regions labeled A, B, C, and D]
Note: each pair of like-labeled regions has equal area, so the area under the histogram equals the area under the polygon

**Descriptive Statistics**
• Used as a first step to look at health-related outcomes
• Examine numbers of cases to identify an increase (epidemic)
• Examine patterns of cases to see who gets sick (demographic variables) and where and when they get sick (space/time variables)
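
The descriptive measures reviewed in these slides can be checked against the accidents-per-week data with a few lines of Python. This is a minimal sketch using only the standard-library `statistics` module; the variable names are illustrative and not part of the original deck. `statistics.variance` and `statistics.stdev` use the sample (n - 1) denominator, which matches the variance formula on the slides.

```python
import statistics

# Number of accidents per week, as given in the slides.
accidents = [8, 5, 3, 2, 7, 1, 2, 4, 6, 2]

mean = statistics.mean(accidents)              # arithmetic average
median = statistics.median(accidents)          # mean of the two middle values when n is even
mode = statistics.mode(accidents)              # most frequent value
value_range = max(accidents) - min(accidents)  # largest minus smallest
variance = statistics.variance(accidents)      # sample variance, divides by n - 1
sd = statistics.stdev(accidents)               # square root of the sample variance

print(f"mean={mean}, median={median}, mode={mode}, range={value_range}")
print(f"variance={variance:.3f}, sd={sd:.3f}")
```

Appending a large outlier to `accidents` and rerunning shows the mean shifting much more than the median, which is the "sensitive to outliers" point made on the Mean slide.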
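
The stratified sampling example from the Samples slides (stratify by gender, then by age, then select randomly within each stratum in proportion to the population) can be sketched as follows. Everything here is an assumption made for illustration: the function name `stratified_sample`, the toy population, the two age bands, and the 10% sampling fraction do not come from the slides.

```python
import random
from collections import Counter, defaultdict


def stratified_sample(population, stratum_of, fraction, seed=0):
    """Draw a proportional stratified random sample (illustrative helper)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    strata = defaultdict(list)
    for person in population:
        strata[stratum_of(person)].append(person)
    sample = []
    for members in strata.values():
        # Take the same fraction from every stratum so the sample
        # keeps the population's gender/age proportions.
        k = round(len(members) * fraction)
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical population of (gender, age band) records.
population = (
    [("F", "<40")] * 30 + [("F", "40+")] * 30
    + [("M", "<40")] * 20 + [("M", "40+")] * 20
)

sample = stratified_sample(population, stratum_of=lambda p: p, fraction=0.1)
counts = Counter(sample)
print(counts)
```

With this toy population the sample keeps the 60/40 gender split and the even age split. A real study would also need a rule for strata too small to yield a whole number of selections.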