Top Ten #1

Albert_Lan

Presentation Transcript


  1. Top Ten #1 • Descriptive Statistics • NOTE! This PowerPoint file is not an introduction, but rather a checklist of topics to review

  2. Location: central tendency • Population mean = µ = Σx/N = (5+1+6)/3 = 12/3 = 4 • Algebra: Σx = N*µ = 3*4 = 12 • Do NOT use the mean if N is small and the data contain extreme values • Ex: Do NOT use the mean if 3 houses sold this week and one was a mansion

  3. Location • Median = middle value • Ex: 5, 1, 6 • Step 1: Sort data: 1, 5, 6 • Step 2: Middle value = 5 • OK even if there are extreme values • Home sales: 100K, 200K, 900K, so mean = 400K, but median = 200K

  4. Location • Mode: most frequent value • Ex: female, male, female • Mode = female • Ex: 1,1,2,3,5,8: mode = 1
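
The three location measures above can be checked with Python's standard `statistics` module. This is a minimal sketch that reuses the slide examples; the variable names are illustrative and not part of the original deck.

```python
import statistics

# Data from the slides: mean and median of 5, 1, 6
values = [5, 1, 6]
mean_value = statistics.mean(values)      # (5 + 1 + 6) / 3
median_value = statistics.median(values)  # middle value after sorting: 1, 5, 6

# Mode works for categories as well as numbers
mode_category = statistics.mode(["female", "male", "female"])
mode_number = statistics.mode([1, 1, 2, 3, 5, 8])

# Home sales in thousands: one extreme value pulls the mean, not the median
home_sales = [100, 200, 900]
sales_mean = statistics.mean(home_sales)
sales_median = statistics.median(home_sales)

print(mean_value, median_value, mode_category, mode_number)
print(sales_mean, sales_median)
```

The home-sales lines illustrate why the median is preferred when a small sample contains an extreme value.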

  5. Relationship • Case 1: if symmetric (ex: bell, normal), then mean = median = mode • Case 2: if positively skewed (tail to the right), then typically mode < median < mean • Case 3: if negatively skewed (tail to the left), then typically mean < median < mode

  6. Dispersion • How spread out the data are • How much uncertainty • Range = Max - Min ≥ 0 • But range is affected by unusual values • Ex: Santa Monica reaches 105 degrees once a century, but the range would still be 105 - min

  7. Standard Deviation • Better than range because all data are used • Population SD = square root of variance = sigma = σ • SD ≥ 0 (SD = 0 only if all values are equal)

  8. Empirical Rule • Applies to mound or bell-shaped curves • Ex: normal distribution • 68% of data within ± one SD of mean • 95% of data within ± two SD of mean • 99.7% of data within ± three SD of mean
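
For an exactly normal distribution, the empirical-rule percentages can be reproduced with `statistics.NormalDist` from the standard library (Python 3.8+). The helper name `share_within` is an assumption made for this sketch.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1
standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Fraction of a normal population within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(share_within(k), 4))
```

The rounded figures on the slide (68%, 95%, 99.7%) are approximations of these values; real data that are only roughly bell-shaped will match less closely.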

  9. Sample Variance

  10. Standard deviation = Square root of variance

  11. Sample Standard Deviation

  12. Standard Deviation • Total variation = 34 • Sample variance = 34/4 = 8.5 • Sample standard deviation = square root of 8.5 = 2.9
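
The transcript does not preserve the data behind "total variation = 34", so the sketch below uses a hypothetical five-value sample chosen only because its squared deviations also sum to 34. It assumes the usual sample formulas, with total variation = Σ(x - x̄)² and sample variance = total variation / (n - 1).

```python
import math
import statistics

# Hypothetical sample (NOT from the slides); deviations from the mean are -4, -1, 0, 1, 4
sample = [1, 4, 5, 6, 9]

n = len(sample)
sample_mean = sum(sample) / n
total_variation = sum((x - sample_mean) ** 2 for x in sample)  # sum of squared deviations
sample_variance = total_variation / (n - 1)                    # divide by n - 1, not n
sample_sd = math.sqrt(sample_variance)

# Cross-check against the standard library
library_variance = statistics.variance(sample)
library_sd = statistics.stdev(sample)

print(total_variation, sample_variance, round(sample_sd, 1))
```

Dividing by n - 1 = 4 rather than n = 5 is what distinguishes the sample variance from the population variance.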

  13. Graphical Tools • Line chart: trend over time • Scatter diagram: relationship between two variables • Bar chart: frequency for each category • Histogram: frequency for each class of measured data (graph of a frequency distribution) • Box plot: graphical display based on quartiles, which divide data into 4 parts

  14. Top Ten #2 • Hypothesis Testing

  15. Ho: Null Hypothesis • Population mean=µ • Population proportion=π • Never include sample statistic in hypothesis

  16. HA: Alternative Hypothesis • ONE-TAIL ALTERNATIVE • Right tail: µ > number (smog check), π > fraction (% defectives) • Left tail: µ < number (weight in box of crackers), π < fraction (unpopular President's % approval low)

  17. Two-tail Alternative • Population mean not equal to number (too hot or too cold) • Population proportion not equal to fraction(% alcohol too weak or too strong)

  18. Reject null hypothesis if • Absolute value of test statistic > critical value • Reject Ho if |Z Value| > critical Z • Reject Ho if | t Value| > critical t • Reject Ho if p-value < significance level (note that direction of inequality is reversed) • Reject Ho if very large difference between sample statistic and population parameter in Ho

  19. Example: Smog Check • Ho: µ = 80 • HA: µ > 80 • If test statistic =2.2 and critical value = 1.96, reject Ho, and conclude that the population mean is likely > 80 • If test statistic = 1.6 and critical value = 1.96, do not reject Ho, and reserve judgment about Ho
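
The decision rule in the smog-check example can be written as a small function. This is a sketch for the right-tail case only; the function name and return strings are illustrative, and the critical value 1.96 is taken from the slide as given.

```python
def right_tail_decision(test_statistic, critical_value):
    """Right-tail test: reject Ho only if the test statistic exceeds the critical value."""
    if test_statistic > critical_value:
        return "reject Ho"
    return "do not reject Ho"

# The two cases from the smog-check slide
first_case = right_tail_decision(2.2, 1.96)
second_case = right_tail_decision(1.6, 1.96)
print(first_case)
print(second_case)
```

For a two-tail alternative, the comparison would use the absolute value of the test statistic instead.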

  20. Type I vs Type II error • Alpha=α = P(type I error) = Significance level = probability you reject true null hypothesis • Ex: Ho: Defendant innocent • α = P(jury convicts innocent person) • Beta= β = P(type II error) = probability you do not reject a null hypothesis, given Ho false • β =P(jury acquits guilty person)

  21. Type I vs Type II Error

  22. Top Ten #3 • Confidence Intervals: Mean and Proportion

  23. Confidence Interval: Mean • Use normal distribution (Z table) if: population standard deviation (σ) known and either (1) or (2): • (1) Normal population • (2) Sample size > 30

  24. Confidence Interval: Mean • If normal table, then µ = (Σx/n) ± Z(σ/√n)

  25. Normal table • Tail = .5(1 – confidence level) • NOTE! Different statistics texts have different normal tables • This review uses the tail of the bell curve • Ex: 95% confidence: tail = .5(1-.95)= .025 • Z.025 = 1.96
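
Instead of a printed table, the tail area and critical Z value can be computed with `statistics.NormalDist` (Python 3.8+). The function name `z_critical` is an assumption for this sketch.

```python
from statistics import NormalDist

def z_critical(confidence_level):
    """Z value that leaves tail = .5(1 - confidence level) in the upper tail."""
    tail = 0.5 * (1 - confidence_level)
    return NormalDist().inv_cdf(1 - tail)

z_95 = z_critical(0.95)
print(round(z_95, 2))
```

Tables that report the area from the mean or the cumulative area give the same Z value; only the lookup convention differs.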

  26. Example • n = 49, Σx = 490, σ = 2, 95% confidence • µ = (490/49) ± 1.96(2/7) = 10 ± .56 • 9.44 < µ < 10.56
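
The arithmetic in this example can be reproduced in a few lines. A minimal sketch, assuming σ is known so the Z value applies; the function name `z_interval` is illustrative.

```python
import math

def z_interval(sum_x, n, sigma, z):
    """Confidence interval for the mean when sigma is known: mean +/- z * sigma / sqrt(n)."""
    sample_mean = sum_x / n
    margin = z * sigma / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Values from the slide: n = 49, sum of x = 490, sigma = 2, Z = 1.96
low, high = z_interval(sum_x=490, n=49, sigma=2, z=1.96)
print(round(low, 2), round(high, 2))
```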

  27. Conf. Interval: Mean t distribution • Use if normal population but population standard deviation (σ) not known • If you are given the sample standard deviation (s), use t table, assuming normal population • If one population, n-1 degrees of freedom

  28. t distribution • µ = (Σx/n) ± t(n-1)(s/√n), where t(n-1) is the t table value with n-1 degrees of freedom
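
The slides give no numeric example for the t interval, so the sketch below uses hypothetical inputs (n = 25, Σx = 250, s = 2). The critical value 2.064 is the standard t table entry for 24 degrees of freedom with .025 in each tail; it is passed in by hand because the standard library has no t distribution.

```python
import math

def t_interval(sum_x, n, s, t_value):
    """Confidence interval for the mean when sigma is unknown: mean +/- t * s / sqrt(n)."""
    sample_mean = sum_x / n
    margin = t_value * s / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Hypothetical sample summary (NOT from the slides); t_value looked up with n - 1 = 24 df
t_low, t_high = t_interval(sum_x=250, n=25, s=2, t_value=2.064)
print(round(t_low, 2), round(t_high, 2))
```

With the same inputs the Z value 1.96 would give a slightly narrower interval, which is the price of estimating σ from the sample.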

  29. Conf. Interval: Proportion • Use if success or failure (ex: defective or ok) • Normal approximation to binomial ok if (n)(π) > 5 and (n)(1-π) > 5, where n = sample size and π = population proportion • NOTE! NEVER use the t table if proportion!!

  30. Confidence Interval: Proportion • π = p ± Z√(p(1-p)/n) • Ex: 8 defectives out of 100, so p = .08 and n = 100, 95% confidence • π = .08 ± 1.96√(.08*.92/100) = .08 ± .05
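
The defectives example can be checked as follows. This is a sketch: the function name is illustrative, and because π is unknown in practice, the normal-approximation check here uses the sample proportion p in its place, which is an assumption not stated on the slide.

```python
import math

def proportion_interval(successes, n, z):
    """Confidence interval for a proportion: p +/- z * sqrt(p(1 - p) / n)."""
    p = successes / n
    # Rough normal-approximation check, using p as a stand-in for the unknown pi
    if n * p <= 5 or n * (1 - p) <= 5:
        raise ValueError("normal approximation to the binomial is doubtful")
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin

# Values from the slide: 8 defectives out of 100, Z = 1.96
p_low, p_high = proportion_interval(successes=8, n=100, z=1.96)
print(round(p_low, 3), round(p_high, 3))
```

The slide rounds the margin of error to .05; carried to three decimals it is about .053.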

  31. Interpretation • If 95% confidence, then 95% of all confidence intervals will include the true population parameter • NOTE! Never use the term “probability” when estimating a parameter!! (ex: Do NOT say “Probability that population mean is between 23 and 32 is .95” because a parameter is not a random variable)

  32. Point vs Interval Estimate • Point estimate: statistic (single number) • Ex: sample mean, sample proportion • Each sample gives a different point estimate • Interval estimate: range of values • Ex: Population mean = sample mean ± error • Parameter = statistic ± error

  33. Width of Interval • Ex: sample mean = 23, error = 3 • Point estimate = 23 • Interval estimate = 23 ± 3, or (20, 26) • Width of interval = 26 - 20 = 6 • Wide interval: point estimate unreliable

  34. Wide interval if • (1) small sample size (n) • (2) large standard deviation (σ) • (3) high confidence level (ex: 99% confidence interval wider than 95% confidence interval) • If you want a narrow interval, you need a large sample size, a small standard deviation, or a low confidence level.
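
The three effects on width can be seen by varying one input at a time. A minimal sketch, assuming the known-σ interval for a mean, whose width is 2·Z·σ/√n; the baseline numbers reuse the earlier example (n = 49, σ = 2), and 2.576 is the standard Z value for 99% confidence.

```python
import math

def interval_width(z, sigma, n):
    """Full width of a z-based confidence interval for the mean."""
    return 2 * z * sigma / math.sqrt(n)

baseline = interval_width(z=1.96, sigma=2, n=49)
larger_sample = interval_width(z=1.96, sigma=2, n=196)       # 4x the sample size
larger_sd = interval_width(z=1.96, sigma=4, n=49)            # 2x the standard deviation
higher_confidence = interval_width(z=2.576, sigma=2, n=49)   # 99% instead of 95%

print(round(baseline, 2), round(larger_sample, 2),
      round(larger_sd, 2), round(higher_confidence, 2))
```

Because n enters through its square root, quadrupling the sample size only halves the width.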

  35. Top Ten #4: Linear Regression • Regression equation: y=bo+b1x • y=dependent variable=predicted value • x= independent variable • bo=y-intercept =predicted value of y if x=0 • b1=slope=regression coefficient =change in y per unit change in x

  36. Slope vs correlation • Positive slope (b1>0): positive correlation between x and y (y incr if x incr) • Negative slope (b1<0): negative correlation (y decr if x incr) • Zero slope (b1=0): no correlation(predicted value for y is mean of y), no linear relationship between x and y

  37. Simple linear regression • Simple: one independent variable, one dependent variable • Linear: graph of regression equation is straight line

  38. Coefficient of determination • R² = % of total variation in y that can be explained by variation in x • Measure of how closely the linear regression line fits the points in a scatter diagram • R² = 1: max possible value: perfect linear relationship between y and x (straight line) • R² = 0: min value: no linear relationship

  39. Example • Y = salary (female manager, in thousands of dollars) • X = number of children • n = number of observations

  40. Given data

  41. Totals

  42. Slope = -6.500 • Method of Least Squares formulas not on 301 exam • b1 = -6.500 given

  43. Interpret slope • If one female manager has 1 more child than another, her predicted salary is $6,500 lower

  44. Intercept • bo = ȳ - b1x̄, where ȳ and x̄ are the sample means of y and x

  45. Intercept • bo = 44.33 - (-6.5)(2.33) = 59.5

  46. Interpret intercept • If number of children is zero, expected salary is $59,500
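
The intercept and the resulting predictions can be reproduced from the numbers on the slides. This sketch assumes 44.33 and 2.33 are the sample means of salary (in thousands of dollars) and number of children, as the intercept formula implies; the raw data table did not survive in the transcript, and the function name is illustrative.

```python
# Values taken from the slides (salary in thousands of dollars)
mean_salary = 44.33    # assumed to be the sample mean of y
mean_children = 2.33   # assumed to be the sample mean of x
slope = -6.5           # b1, given on the slide

# bo = mean of y minus b1 times mean of x
intercept = mean_salary - slope * mean_children

def predicted_salary(children):
    """Regression equation y = bo + b1 * x."""
    return intercept + slope * children

print(round(intercept, 3))
print(round(predicted_salary(0), 3), round(predicted_salary(1), 3))
```

The unrounded intercept is 59.475, which the slide reports as 59.5; each additional child lowers the predicted salary by 6.5 thousand dollars.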
