Descriptive Statistics

Descriptive Statistics. Renan Levine. Frequency Table. One can easily display all of the responses to survey questions in a frequency table.

Presentation Transcript


  1. Descriptive Statistics Renan Levine

  2. Frequency Table • One can easily display all of the responses to a survey question in a frequency table. • Question: How much do you agree or disagree with the following statement: “Without a union, teachers would be vulnerable to school politics or administrators who abuse their power.” • Strongly agree: 56.5% • Somewhat agree: 31.2% • Somewhat disagree: 8.2% • Strongly disagree: 4.1% • Source: Public Agenda Foundation Poll: Stand By Me--What Teachers Think About Unions, Merit Pay, and Other Professional Matters. March 17-April 31, 2003. Mail survey conducted by Robison and Muenster with a sample provided by Market Data Retrieval (MDR). Available at the Roper Center Data Archive.
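
A frequency table like the one on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `frequency_table` and the ten answers in `answers` are made up for the example and are not the poll's data.

```python
from collections import Counter


def frequency_table(responses):
    """Return (category, count, percent) rows, most frequent category first."""
    counts = Counter(responses)
    total = sum(counts.values())
    return [(cat, n, round(100 * n / total, 1)) for cat, n in counts.most_common()]


# Hypothetical raw answers, NOT the actual Public Agenda poll responses.
answers = (["Strongly agree"] * 5
           + ["Somewhat agree"] * 3
           + ["Somewhat disagree"] * 2)

for category, count, percent in frequency_table(answers):
    print(f"{category:<18} {count:>3} {percent:>5}%")
```

Each row pairs a category with its count and its share of all responses, which is the same information the slide reports as percentages.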

  3. Describe the distribution of the responses! • How much do you agree or disagree with the following statement: “Without a union, teachers would be vulnerable to school politics or administrators who abuse their power.” • Strongly agree: 56.5% • Somewhat agree: 31.2% • Somewhat disagree: 8.2% • Strongly disagree: 4.1% • Let me suggest: More than half of all teachers strongly agree that without a union, teachers would be vulnerable to school politics or administrators. A further _____ somewhat agree that they would be vulnerable without a union. Only ____ disagree. From this we would conclude that teachers are {badly split}/{tend to agree/___} on their vulnerability without unions.

  4. Professional sounding descriptions • Univariate descriptive statistics exist to succinctly give people a mental picture of the distribution of the observations. • Primary focus on what is the “typical” observation. • Average (mean) response • Middle (median) response • Most common response (mode) • Secondary: how typical is “typical” (or, are many observations different from “typical”).

  5. Typical? Measures of central tendency • Mode = Most frequent observation. • Just look at which category has the most observations. • Median = Observation in the middle. • Order the observations by category in ascending or descending order. • Look at which category has the “middle” observation, so that half of all observations are higher and half are lower. • Mean = Average.
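
As a rough illustration of the three measures, Python's standard `statistics` module computes each one directly. The seven values in `scores` are hypothetical and chosen only so that the three measures differ.

```python
import statistics

# Hypothetical observations (for illustration only).
scores = [1, 1, 1, 2, 2, 3, 5]

mode = statistics.mode(scores)      # most frequent value
median = statistics.median(scores)  # middle value once the data are sorted
mean = statistics.mean(scores)      # sum of the values divided by their number

print(f"mode={mode}, median={median}, mean={mean:.2f}")
# prints: mode=1, median=2, mean=2.14
```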

  6. Calculating an Average • Order the observations in ascending or descending order. • Value 1 * Number of Observations with Value 1 = X • Value 2 * Number of Observations with Value 2 = Y • Value 3 * Number of Observations with Value 3 = Z • Average = (X+Y+Z) ÷ the total number of observations. • Mean is just a technical name for an average.
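
The recipe on this slide amounts to a frequency-weighted average. Here is a minimal sketch, assuming the data arrive as a value-to-count mapping; the name `mean_from_frequencies` and the counts in `freq` are invented for the example.

```python
def mean_from_frequencies(freq):
    """Mean of a variable given as {value: number of observations}."""
    total = sum(freq.values())
    # Value 1 * its count + Value 2 * its count + ... (X + Y + Z on the slide)
    weighted_sum = sum(value * count for value, count in freq.items())
    return weighted_sum / total


# Hypothetical frequency table: 2 ones, 5 twos, 3 threes.
freq = {1: 2, 2: 5, 3: 3}

# (1*2 + 2*5 + 3*3) / 10 = 21 / 10
print(mean_from_frequencies(freq))  # prints: 2.1
```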

  7. What is typical? • Choosing the right descriptive statistic depends primarily on the level of measurement of the variable. • To ascertain what is “typical” one must first assess what level of measurement is used.

  8. Levels of Measurement: Nominal • Nominal – categories are unordered • Only differentiates categories. • Categories are presented in an arbitrary order. • Usually includes dichotomous variables. • Examples: • Provinces (QC, ON, NB, BC…) • Occupation (Teacher, Manager, Retail clerk…) • Which party did you vote for in the last election? (Liberals, NDP, Greens…) • Do you approve of the performance of the Prime Minister? (Yes, No, I don’t know)

  9. Nominal? Use mode. • Do you approve of the job performance of the current Mayor of San Diego? • Yes – 37% • No – 48% • I don’t know – 15% • Nominal variables are unordered, so one cannot order the categories to find the median or the mean. • The only “typical” measure one can rely on is the mode, the most frequent observation. • In the example above, mode = “No” with 48%. • Reporting the mode is more concise than saying, “just under half of all survey respondents disapprove of the Mayor of San Diego, with 15% saying they do not know…”

  10. Mode • Every variable has a mode or modal category. • It can be identified simply by looking at the frequency of each category. • If two categories are tied for the honor of having the most observations, then the variable is said to be “bimodal.”
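
One way to find the modal category, including ties, is `statistics.multimode` (available in Python 3.8 and later), which returns every category tied for the highest count. The wrapper `modal_categories` and both data sets are assumptions made for this sketch; the first treats the mayoral approval percentages as if there were exactly 100 respondents.

```python
from statistics import multimode


def modal_categories(observations):
    """Return all categories tied for the most observations."""
    return multimode(observations)


# Hypothetical: the approval percentages treated as counts out of 100 respondents.
approval = ["Yes"] * 37 + ["No"] * 48 + ["I don't know"] * 15
print(modal_categories(approval))  # prints: ['No']

# Hypothetical tie: two categories share the top count, so the variable is bimodal.
tied = ["A", "A", "B", "B", "C"]
print(modal_categories(tied))      # prints: ['A', 'B']
```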

  11. Example I: Find the mode? USA Today/Gallup Poll # 2009-02: January – Economy / Obama: What should be the primary goal for the United States in Afghanistan? • 1 Building a stable democratic government in Afghanistan: 300 (30.6%) • 2 Weakening terrorists’ ability to stage attacks against the USA…: 557 (56.8%) • 3 Both equally…: 123 (12.6%)

  12. Example II: Find the mode? Worldviews 2002: American and European Public Opinion on Foreign Policy (Chicago Council on Foreign Relations): There has been some discussion about whether the US should use its troops to invade Iraq and overthrow the government of Saddam Hussein. Which of the following positions is closest to yours? • 1 The US should not invade Iraq…: 108 (15.3%) • 2 The US should only invade Iraq with UN approval & the support of its allies…: 452 (64.1%) • 3 The US should invade Iraq even if they have to go it alone…: 145 (20.6%)

  13. Levels of Measurement: Ordinal • Ordinal – ordered, but no set distance between categories/values. • Examples: • Any question that presents a statement and asks respondents to indicate: Strongly agree, agree, neither agree nor disagree, disagree, or strongly disagree, like: • We have gone too far in pushing equal rights in this country (Canadian Election Survey 2004, MBS_A1) • People who don’t vote have no right to criticize the government (Canadian Election Survey 2004, MBS_E1)

  14. Ordinal? Find the median (usually) • The median is the value of the middle observation in an ordered distribution. • If there is an even number of observations, take the average of the middle two observations. • The mean is also often reported, especially if the ordinal variable has many categories and there are no values that are unusually high compared to the other observations.

  15. Median Example: College Shootings • There were nine shootings at US universities and colleges between January 2010 and December 2013.* • The numbers of people killed in these shootings were (in chronological order): 3, 3, 2, 7, 0, 0, 3, 1, 0 • First, order the observations in ascending order: 0, 0, 0, 1, 2, 3, 3, 3, 7 • There are nine observations, so the median is the fifth one in order (red box on the slide), which is 2.
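
The steps on this slide can be checked with a short sketch. It uses the nine values listed above; the variable names are chosen for the example.

```python
import statistics

# Number of people killed in each of the nine shootings (from the slide).
killed = [3, 3, 2, 7, 0, 0, 3, 1, 0]

ordered = sorted(killed)            # [0, 0, 0, 1, 2, 3, 3, 3, 7]
middle_index = len(ordered) // 2    # index 4 is the fifth of nine observations
median_by_hand = ordered[middle_index]

print(median_by_hand)               # prints: 2
print(statistics.median(killed))    # prints: 2 (the library agrees)
```

Picking the single middle position works here because the number of observations is odd; with an even number, `statistics.median` averages the two middle values.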

  16. Median Example: Shootings • The numbers of people wounded at those shootings were (in chronological order): 3, 0, 0, 3, 4, 2, 0, 0, 3 • What was the median number of people wounded in those nine shooting events? • 0 • 2 • 3 • 4

  17. Median? Confidence in Unions • 2004 Canadian Election Study, MBS_D5 • Please indicate how much confidence you have in the following institutions? Unions. • What is the median? • Are most Canadians confident in Unions? Note: Unweighted responses are not reflective of the population.

  18. Median? Unions Example • There are 1632 non-missing observations, so the median lies between the 816th and 817th observations. • Look at the frequency column. • There are only 86 observations in the first row, plus 445 in the second row = 531. So the 816th and 817th observations must both be among the 735 observations in the 3rd row (observations 532 through 1266). • Conclude that the median is 3 = Not very much.
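
The counting argument above can be sketched as a small function that walks down the frequency column until it reaches the middle position. The name `median_category` is made up for this example, and the count of 366 for the fourth row is not stated on the slide; it is derived here as 1632 minus the 1266 observations in the first three rows.

```python
def median_category(counts):
    """Median of a variable given as (value, count) rows in ascending order."""
    total = sum(n for _, n in counts)
    lower = (total + 1) // 2   # 1-based position of the lower middle observation
    upper = total // 2 + 1     # equals `lower` when the total is odd

    def value_at(position):
        running = 0
        for value, n in counts:
            running += n
            if position <= running:
                return value

    # If the two middle observations sit in different categories, this
    # averages their codes, which may not be meaningful for ordinal data.
    return (value_at(lower) + value_at(upper)) / 2


# Rows 1-3 are from the slide; the row-4 count is derived (1632 - 1266 = 366).
unions = [(1, 86), (2, 445), (3, 735), (4, 366)]
print(median_category(unions))  # prints: 3.0
```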

  19. Median? Unions Example in SPSS • Remember, the median observation is where half of all observations are above, and half of all observations are below. • Look at the column on the far right, “Cum[ulative] Percent.” Find the first row that surpasses 50%. • The second row is 32.54%, so the median must be higher than the second value. • The third row is 77.57%, so the median, the observation that puts the distribution over 50%, must be here, since 50% is greater than 32.54% and less than 77.57%. • Any statistical package will also report the median for you below this table. In this case the median is ‘3’ = Not very much. Note: Unweighted responses
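
A cumulative percent column of this kind can be reproduced with a running total. This is a sketch rather than what SPSS does internally; `cumulative_percent` is an invented name, and the fourth count (366) is derived from the total of 1632, not read from the slide.

```python
from itertools import accumulate


def cumulative_percent(counts):
    """Cumulative percentage after each row of a frequency column."""
    total = sum(counts)
    return [round(100 * running / total, 2) for running in accumulate(counts)]


# Frequencies for categories 1-4; the last count is derived (1632 - 1266).
counts = [86, 445, 735, 366]
cum = cumulative_percent(counts)
print(cum)  # prints: [5.27, 32.54, 77.57, 100.0]

# The median sits in the first row whose cumulative percent reaches 50.
# (An exact 50.0 would mean the median falls between two rows.)
median_row = next(i for i, pct in enumerate(cum, start=1) if pct >= 50)
print(median_row)  # prints: 3
```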

  20. Median? Follow news about economic stimulus • USA Today/Gallup Poll # 2009-02: January – Economy / Obama: • How closely have you been following the news about new economic stimulus proposals announced by President Obama and considered by Congress this past week? • What is the median? • Did most Americans say they were paying close attention to one of President Obama’s first policy initiatives? Note: Unweighted responses are not reflective of the population.

  21. Levels of Measurement: Interval/Ratio • Interval/ratio – ordered, with standardized distances between categories/values. • Sometimes called “continuous” variables (along with some ordinal variables with plentiful categories). • Examples: • Temperature (F or C) • Income • Gross Domestic Product (GDP)

  22. Continuous? Look at the mean (usually) • For interval/ratio data, the mean should be reported. • Survey data is rarely interval/ratio, but also look at the mean when the data is ordinal with many categories.

  23. Calculating an Average • Order the observations in ascending or descending order. • Value 1 * Number of Observations with Value 1 = X • Value 2 * Number of Observations with Value 2 = Y • Value 3 * Number of Observations with Value 3 = Z • Average = (X+Y+Z) ÷ the total number of observations. • Mean is just a technical name for an average.

  24. Ex: Population living on $2 a day • Mean = 42.6 • Source: Quality of Government (QoG) v6, April 2011

  25. Feelings towards Hillary Clinton Mean = 5.9 Source: American National Election Study, 2008, v093040 [recoded into 11 categories, and weighted]

  26. Example: Feelings towards Hillary Clinton (2008) Mean = 5.9 Source: American National Election Study, 2008, v093040 [recoded into 11 categories, and weighted]

  27. Check the median too. • The mean is more sensitive to extreme values. • When there are one or more observations that are very different than most of the other observations, the mean will be very different than the median. • You may need to use your judgement as to whether to report the mean or the median. • Best to also check the median. • If there are no extreme outliers, the median and mean will be similar.
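
A small, made-up example shows how differently the two statistics react to one extreme value. The numbers in `incomes` are hypothetical (say, incomes in thousands of dollars) and are not drawn from any data set in this presentation.

```python
import statistics

# Hypothetical incomes in thousands of dollars.
incomes = [30, 32, 35, 38, 40]
print(statistics.mean(incomes), statistics.median(incomes))
# prints: 35 35

# Add a single extreme observation.
with_outlier = incomes + [1000]
print(round(statistics.mean(with_outlier), 2), statistics.median(with_outlier))
# prints: 195.83 36.5
```

One extreme value moves the mean from 35 to about 196, while the median only shifts from 35 to 36.5.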

  28. Trimmed Mean • With continuous (interval/ratio) variables, some scholars will report the “10% trimmed mean.” • To solve the problem of extreme outliers making the mean atypical of the observations, the trimmed mean calculates the average of all the observations except the highest and lowest 10 percent of the observations. • In a perfectly symmetrical distribution, the mean is the same as the median and the trimmed mean.
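
Here is a minimal sketch of a trimmed mean, assuming the simplest convention: drop the lowest and highest 10% of observations (rounding the number dropped down to a whole observation) and average the rest. Statistical packages may handle fractional counts differently, so treat `trimmed_mean` as an illustration, not as any package's exact method. The data are hypothetical.

```python
def trimmed_mean(values, proportion=0.10):
    """Average after dropping `proportion` of observations from each end."""
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be in [0, 0.5)")
    ordered = sorted(values)
    k = int(len(ordered) * proportion)   # observations to drop from each end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


# Hypothetical data with one extreme outlier.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]

print(sum(data) / len(data))  # plain mean, prints: 14.5
print(trimmed_mean(data))     # drops 1 and 100, prints: 5.5
```

For these ten values the trimmed mean equals the median (5.5), while the plain mean is pulled up to 14.5 by the single outlier.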

  29. Ex: Not much difference between Mean & Median Mean = 5.9 Median = 6 Source: American National Election Study, 2008, v093040 [recoded into 11 categories, and weighted]

  30. Example: Real GDP – Large Differences • Mean is sensitive to a few very wealthy countries. • Median = $5,194.48 • Trimmed Mean = $7,549 • Mean = $9,089.82 • Source: Gleditsch, K. S. 2002 via Quality of Government (QoG) v6, April 2011
