Comprehensive Guide to Data Analysis Using Descriptive Statistics
This resource provides an overview of data analysis through descriptive statistics, focusing on measures of central tendency (mean, median, mode) and variability (range, variance, standard deviation). It illustrates various statistical tools and online resources, such as interactive statistics tools and frequency distributions, emphasizing when to use graphs and how to interpret different data distributions, including normal and skewed distributions. Examples, visualization techniques, and practical applications enhance understanding for students and professionals in research and evaluation contexts.
Presentation Transcript
Data Analysis Using Descriptive Statistics ED 690 Minjuan Wang
A Few Online Tools • Seeing Statistics • Interactive Statistics Tools • Against all odds • Inside statistics • http://www.learner.org/resources/series65.html#jump1 • Video on Demand
Today We’ll Watch • http://learner.org/resources/series65.html • Picturing Distributions • Describing Distributions • Normal Distributions
How to know what to use? • Statistical Procedures Applied are determined by: • Research/Evaluation Questions • Research/Evaluation Design • Types of Measurements • Nominal, Ordinal, Interval or Ratio
Analyzing Quantitative Data • Measures of Central Tendency • Measures of Variability • Measure of relative standing • Measures of Relationship • Refer to your Statistical Family Tree
Measures of Central Tendency • Convenient way to describe a set of numbers with a single number • Three common types: • Mean • Median • Mode
Measure of Variability • Variability: • reflects how the scores differ from one another • a measure of difference from the mean • Central Tendency without any measure of variability? • can be misleading
Measure of Variability • Range • Variance • Standard Deviation • the most common and useful measure of variability • the average distance of each score from the mean • Candy bar example • Frequency distribution • Distribution Curve • Normal distribution • Skewed distribution • Negatively skewed; positively skewed
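The three measures listed above can be computed with Python's standard library. This is a minimal sketch: the scores are hypothetical (they are not the candy bar data), and it uses the population formulas (pvariance, pstdev); for a sample, statistics.variance and statistics.stdev divide by n - 1 instead.

```python
from statistics import pstdev, pvariance

# Hypothetical scores chosen so the arithmetic is easy to follow.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

score_range = max(scores) - min(scores)  # highest score minus lowest score
variance = pvariance(scores)             # mean squared distance from the mean
sd = pstdev(scores)                      # square root of the variance

print(score_range, variance, sd)
```

Strictly speaking, the SD is the square root of the average squared distance from the mean, so it is usually a little larger than the plain average distance.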
When to Use Graphs • To illustrate relative amounts • To specify the subject • To answer specific questions
Bar Graphs • Quantitative and Rank-Order Data • Show achievement of objectives • Frequency histogram
Frequency Distribution & Histogram • Frequency Distribution • "a set of scores arranged in order of magnitude along the x-axis and the frequency of each score is represented along the y-axis" • Frequency Histogram • similar to bar graphs but has no spaces between the bars
Example of company • 25 employees • 1 Owner at 450K • 1 VP at 150K • 2 Directors at 100K • 1 Manager at 57K • 3 Department heads at 50K • 4 Section Chiefs at 37K • 1 Maintenance at 30K • 12 Shift workers at 20K
Company data average • Mean = 57 K • Median = 30 K • Mode = 20 K • What is the average wage?
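The three averages can be checked directly from the headcounts on the previous slide. A minimal sketch using only the standard library; the variable names are illustrative.

```python
from statistics import mean, median, mode

# Salaries in thousands of dollars, expanded from the slide's headcounts.
salaries = (
    [450] * 1    # owner
    + [150] * 1  # VP
    + [100] * 2  # directors
    + [57] * 1   # manager
    + [50] * 3   # department heads
    + [37] * 4   # section chiefs
    + [30] * 1   # maintenance
    + [20] * 12  # shift workers
)

company_mean = mean(salaries)      # total payroll divided by 25 employees
company_median = median(salaries)  # the 13th salary when sorted
company_mode = mode(salaries)      # the most frequent salary

print(len(salaries), company_mean, company_median, company_mode)
```

The owner's salary pulls the mean far above what most employees earn, which is why the three "averages" disagree.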
Draw Frequency Distribution • Group data into intervals (5 to 10) • Define the size of the interval widths based on understandable units • Range/intervals • Make sure the intervals do not overlap • Work in teams to draw a distribution of the Salkind book data or the salary data • Handout: project data & results on screen • The result sheet
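As a rough illustration of these steps, the salary data can be grouped into non-overlapping intervals and printed as a text histogram. The 50K interval width is an assumption chosen to give 10 intervals; another understandable width would work as well.

```python
from collections import Counter

# Salary data from the company example, in thousands of dollars.
salaries = [450, 150] + [100] * 2 + [57] + [50] * 3 + [37] * 4 + [30] + [20] * 12

width = 50  # assumed interval width (in $K)

# Map each salary to the lower bound of its interval, e.g. 57 -> 50.
counts = Counter((s // width) * width for s in salaries)

# Build one row per interval, including empty ones, so the gaps stay visible.
lowest, highest = min(counts), max(counts)
table = [
    (lo, lo + width - 1, counts.get(lo, 0))
    for lo in range(lowest, highest + width, width)
]

for lo, hi, n in table:
    print(f"{lo:>3}-{hi:<3}K | {'#' * n} ({n})")
```

Writing the intervals as 0-49, 50-99, and so on keeps them from overlapping, so each salary lands in exactly one interval.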
Frequency distribution--Normal Curve (Figure 12.2, p. 445) • Many statistics assume the normal, bell-shaped curve distribution for scores. • 50% of scores fall above the mean; 50% fall below • Normal curve for population (height, weight, IQ scores) • Mean = median = mode • From the mean to +1 SD: 34.13% of the scores • From the mean to -1 SD: 34.13% of the scores • Mean +/- 3 SD = more than 99% of the scores
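The percentages on this slide can be reproduced from the standard normal curve. A minimal sketch using statistics.NormalDist from the Python standard library; share_within is a helper name introduced here for illustration.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal curve: mean 0, SD 1

def share_within(k):
    """Proportion of scores that fall within k SDs of the mean."""
    return z.cdf(k) - z.cdf(-k)

# Area between the mean and one SD above it.
one_side_1sd = z.cdf(1) - z.cdf(0)
print(f"mean to +1 SD: {one_side_1sd:.2%}")

for k in (1, 2, 3):
    print(f"within +/-{k} SD: {share_within(k):.2%}")
```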
Skewed Distribution • Non-symmetrical distribution • Mean, median, mode not the same • Negatively skewed (Figure 12.3, p. 447) • extreme scores at the lower end • Mean < median <mode • most did well, a few poorly • Positively skewed • at the higher end • Mean >median >mode • Most did poorly, a few well • Colorado Mountain: Ski to the right->skew to the right • The further apart the mean and median, the more the distribution is skewed.
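One quick check on the direction of skew is to compare the mean with the median, as the last bullet suggests. The sketch below applies that rule of thumb to the salary data from the company example and to a set of hypothetical test scores; skew_direction is an illustrative helper, not a formal skewness statistic.

```python
from statistics import mean, median

def skew_direction(values):
    """Rule of thumb: the mean is pulled toward the extreme scores."""
    m, md = mean(values), median(values)
    if m > md:
        return "positive"
    if m < md:
        return "negative"
    return "no skew indicated"

# Company salaries in $K: a few very high values at the upper end.
salaries = [450, 150] + [100] * 2 + [57] + [50] * 3 + [37] * 4 + [30] + [20] * 12

# Hypothetical test scores: most did well, one did poorly.
test_scores = [95, 92, 90, 90, 88, 85, 40]

print(skew_direction(salaries), skew_direction(test_scores))
```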
Describing-Variability • Standard Deviation [or dispersion] (average distance from the mean) • 1 sd includes about 34% of scores on each side of the mean • 2 sd includes about 47.7% on each side of the mean • 3 sd includes about 49.9% on each side of the mean • SD chart by Kathleen Barlo • EET article on SD • URL: http://coe.sdsu.edu/eet/Articles/standarddev/index.htm
Using SD in Prescribing Cereal • As a practicing nutritionist, Dr. Green frequently comes across patient questions like "Which cereals fit within my diet in terms of calories and fat grams?" • Dr. Green uses descriptive statistics to give advice. • Launch Cereal data from Data->Load data->sample data (StatCrunch) • Draw frequency histogram • Fruit loop calories: SD=+2 • Give it to someone who is trying to lose 10 lbs?
StatCrunch Demo http://focus.sdsu.edu/statcrunch4.0/
Mini-Data Activity • Use the Culture Data posted on the weekly page: • http://edweb.sdsu.edu/courses/ed690new/week8new.htm • Run the basic descriptive analysis • Instructions: see “Guide for Analysis” worksheet on file Culture_Minida_SCrunchSee
Predicting Height: Normal Distribution
Section         Mean Height   SD
Monday          67.9”         3.56”
Tuesday*        68.0”         3.6”
Both Sections   68.0”         3.5”
From this sample of 30 adults, the average height is 68.0 inches, or 5 feet 8 inches, and 99% of all adults fall between the heights of ??? and ???
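One way to fill in the blanks is to treat the "Both Sections" mean and SD as if they described a normal population and look up the central 99% of that curve. This is a sketch under that assumption; with only 30 adults in the sample, the real range could differ.

```python
from statistics import NormalDist

# "Both Sections" values from the slide, in inches.
mean_height, sd_height = 68.0, 3.5

# Assumption: adult heights follow a normal curve with this mean and SD.
heights = NormalDist(mean_height, sd_height)

# The central 99% leaves 0.5% in each tail.
low = heights.inv_cdf(0.005)
high = heights.inv_cdf(0.995)

# Convert the mean to feet and inches.
feet, inches = divmod(mean_height, 12)

print(f"mean: {feet:.0f} ft {inches:.0f} in")
print(f"99% of heights between {low:.1f} and {high:.1f} inches")
```

The simpler classroom rule of mean +/- 3 SD gives a slightly wider range, because it covers more than 99% of the curve.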
*Describing-Relative standing • Percentile • z score • based on sd • Score of 0 is mean. • Score of 1 is 1 sd above mean (about the 84th percentile) • T score • 10z + 50 • Quartile • Divided into 4 groups • Stanine: Divided into 9 groups • Questions 1-5 on Page 451
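The link between z scores, percentiles, and T scores can be sketched in a few lines. The percentile conversion assumes the scores are normally distributed; percentile_from_z and t_score are illustrative names.

```python
from statistics import NormalDist

def percentile_from_z(z):
    """Percentage of scores at or below z, assuming a normal curve."""
    return NormalDist().cdf(z) * 100

def t_score(z):
    """T score as defined on the slide: 10z + 50."""
    return 10 * z + 50

for z in (-1, 0, 1, 2):
    print(f"z={z:+d}  percentile={percentile_from_z(z):.1f}  T={t_score(z)}")
```

On a normal curve, a z score of 0 sits at the 50th percentile and a z score of 1 at roughly the 84th, since 50% of scores lie below the mean and about 34% lie between the mean and +1 SD.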
Measure of relative standing • Z-score • One type of standard score • Compares scores from different tests • Convert scores to z scores, average them->final index of average performance • Z = (Raw Score (X) - Mean) / SD • Z score of mean = 0 • Percentiles • The percentage of scores that fall at or below a given score • Outliers • Example • GRE
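A small worked example shows how z scores put results from different tests on the same scale. The test names, means, and SDs below are hypothetical, chosen only to keep the arithmetic simple.

```python
def z_score(raw, mean, sd):
    """Distance of a raw score from the mean, in SD units."""
    return (raw - mean) / sd

# Hypothetical results for one student on two tests with different scales.
z_test_a = z_score(raw=85, mean=75, sd=10)  # one SD above the mean
z_test_b = z_score(raw=40, mean=50, sd=5)   # two SDs below the mean

# Averaging the z scores gives a single index of overall performance.
average_z = (z_test_a + z_test_b) / 2

print(z_test_a, z_test_b, average_z)
```

The raw scores of 85 and 40 cannot be compared directly, but the z scores show the student did above average on the first test and well below average on the second.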
Describing-Relationships • How variables are related--need at least 2 variables • Spearman rho • coefficient correlates data that are ranked • Pearson r • correlates data that are interval or ratio • How does foot size correlate to GRE scores? • Coefficients range from -1 to +1 • More in “Correlational Research”
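Both coefficients can be computed by hand to see how they differ. The sketch below uses the textbook definitions: Pearson r on the raw values, and Spearman rho as Pearson r applied to the ranks. The function names and the data are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r: co-variation of x and y scaled by their spreads."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def ranks(values):
    """Rank from 1 (smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman_rho(xs, ys):
    """Spearman rho: Pearson r computed on the ranks."""
    return pearson_r(ranks(xs), ranks(ys))

# Hypothetical data: y always rises with x, but not in a straight line.
xs = [1, 2, 3, 4]
ys = [1, 4, 9, 16]
print(round(pearson_r(xs, ys), 3), round(spearman_rho(xs, ys), 3))
```

Because rho looks only at rank order, it reaches +1 whenever one variable always increases with the other, while r reaches +1 only for a straight-line relationship.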
Optional: Z-Distribution (Histogram) & Hypothesis Testing • One type of frequency histogram • Z-distribution (normal) lies at the heart of inferential statistics
Optional--Do Copper Bracelets reduce arthritic pain? • Take all patients’ scores (number of pain complaints) in the experimental and control groups and convert them to a single z score • If the z score of the treatment group falls within -2 to +2 SD, conclusion? • Nay! Since 95.44% of all z scores should fall within this range by chance anyway • If the z is >2.00 or <-2.00, conclusion? • Yea! P (probability of chance)=4.56%
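The decision rule on this slide can be written as a short check against the standard normal curve. A sketch only: the +/-2 cutoff comes from the slide, the z values passed in are hypothetical, and the function names are illustrative.

```python
from statistics import NormalDist

def chance_probability(z):
    """Two-tailed probability of a z score at least this extreme by chance."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

def exceeds_cutoff(z, cutoff=2.0):
    """True when z falls outside the -cutoff to +cutoff range."""
    return abs(z) > cutoff

for z in (1.2, -2.4):  # hypothetical z scores for a treatment group
    verdict = "Yea" if exceeds_cutoff(z) else "Nay"
    print(f"z={z:+.1f}  P(chance)={chance_probability(z):.2%}  {verdict}")
```

At exactly z = 2 the two-tailed probability comes out near 4.55%, in line with the slide's 4.56%; the small gap is rounding.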