
Please start your Daily Portfolio

chaeli

Presentation Transcript


  1. Please start your Daily Portfolio

  2. Introduction to Statistics for the Social Sciences: SBS200, COMM200, GEOG200, PA200, POL200, or SOC200. Lecture Section 001, Summer Session II, 2013. 9:00 - 11:20am, Monday - Friday. Room 312 Social Sciences (Monday – Thursday); Room 480 Marshall Building (Fridays). Welcome http://www.youtube.com/watch?v=oSQJP40PcGI

  3. Please click in. The Study Guide is online. My last name starts with a letter somewhere between: A. A – D; B. E – L; C. M – R; D. S – Z. Please double check: all cell phones and other electronic devices are turned off and stowed away.

  4. Homework due Thursday (July 18th). On class website: please print and complete homework worksheet #7. Topics: calculating z-scores, raw scores and probabilities; types of probabilities; calculating confidence intervals.

  5. Schedule of readings Before Friday (July 19th) Please read chapters 3, 4, 5, & 6 in Ha & Ha Please read Chapters 10, 11, 12 and 14 in Plous Chapter 10: The Representativeness Heuristic Chapter 11: The Availability Heuristic Chapter 12: Probability and Risk Chapter 14: The Perception of Randomness

  6. Use this as your study guide. By the end of lecture today (7/17/13): • Empirical, classical and subjective approaches • Probability of an event • Complement of an event; Union of two events • Intersection of two events; Mutually exclusive events • Collectively exhaustive events • Conditional probability • Law of Large Numbers • Central Limit Theorem • Three propositions: 1) True mean 2) Standard Error of Mean 3) Normal Shape • Calculating Confidence Intervals

  7. Union versus Intersection. Union of two events means Event A or Event B will happen: P(A ∪ B). Intersection of two events means Event A and Event B will happen: P(A ∩ B). Also called a “joint probability”.

  8. The union of two events: all outcomes in the sample space S that are contained either in event A or in event B or both (denoted A ∪ B or “A or B”). ∪ may be read as “or” since one or the other or both events may occur.

  9. The union of two events: all outcomes contained either in event A or in event B or both (denoted A ∪ B or “A or B”). What is the probability of drawing a red card or a queen? What is Q ∪ R? It is the possibility of drawing either a queen (4 ways) or a red card (26 ways) or both (2 ways).

  10. Probability of picking a Queen: P(Q) = 4/52 (4 queens in a deck). Probability of picking a Red: P(R) = 26/52 (26 red cards in a deck). Probability of picking both R and Q: P(Q ∩ R) = 2/52 (2 red queens in a deck). When you add P(A) and P(B) together, you count P(A and B) twice. So, you have to subtract P(A ∩ B) to avoid overstating the probability. P(Q ∪ R) = P(Q) + P(R) – P(Q ∩ R) = 4/52 + 26/52 – 2/52 = 28/52 = .5385 or 53.85%
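The queen-or-red calculation can be checked by counting cards directly. The following is a minimal Python sketch, not part of the original slides; the deck representation and the helper name `prob` are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

# Build a standard 52-card deck as (rank, suit) pairs (illustrative representation).
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))

# Events as sets of cards.
queens = {card for card in deck if card[0] == "Q"}
reds = {card for card in deck if card[1] in ("hearts", "diamonds")}

def prob(event):
    """Classical probability: favorable outcomes over total outcomes."""
    return Fraction(len(event), len(deck))

# Union by direct counting, and by the general law of addition.
p_union_counted = prob(queens | reds)
p_union_formula = prob(queens) + prob(reds) - prob(queens & reds)

print(p_union_counted, p_union_formula, round(float(p_union_counted), 4))
```

Both routes give 28/52 (printed in lowest terms as 7/13), which shows why the two red queens must be subtracted once.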

  11. Union versus Intersection. Union of two events means Event A or Event B will happen: P(A ∪ B). Intersection of two events means Event A and Event B will happen: P(A ∩ B). Also called a “joint probability”.

  12. The intersection of two events: all outcomes contained in both event A and event B (denoted A ∩ B or “A and B”). What is the probability of drawing a red queen? What is Q ∩ R? It is the possibility of drawing both a queen and a red card (2 ways).

  13. If two events are mutually exclusive (or disjoint), their intersection is a null set (and we can use the “Special Law of Addition”): P(A ∩ B) = 0. Intersection of two events means Event A and Event B will happen. Example of mutually exclusive events: if A = Poodles and B = Labradors, then Poodles and Labs are mutually exclusive (assuming purebred).

  14. If two events are mutually exclusive (or disjoint), their intersection is a null set, P(A ∩ B) = 0, and we can use the “Special Law of Addition”: P(A ∪ B) = P(A) + P(B). Intersection of two events means Event A and Event B will happen. Example (dog pound): A = Poodles (let’s say 10% of dogs are poodles), B = Labradors (let’s say 15% of dogs are labs). What’s the probability of picking a poodle or a lab at random from the pound? P(poodle or lab) = P(poodle) + P(lab) = (.10) + (.15) = (.25). Poodles and Labs: mutually exclusive (assuming purebred).

  15. Conditional Probabilities: probability that A has occurred given that B has occurred. Denoted P(A | B); the vertical line “ | ” is read as “given.” P(A | B) = P(A ∩ B) / P(B). The sample space is restricted to B, an event that has occurred. A ∩ B is the part of B that is also in A. The ratio of the relative size of A ∩ B to B is P(A | B).

  16. Conditional Probabilities: probability that A has occurred given that B has occurred. Of the population aged 16 – 21 and not in college: P(U) = .1350, P(ND) = .2905, P(U ∩ ND) = .0532. What is the conditional probability that a member of this population is unemployed, given that the person has no diploma? P(A | B) = P(A ∩ B) / P(B), so P(U | ND) = .0532 / .2905 = .1831 or 18.31%

  17. Conditional Probabilities: probability that A has occurred given that B has occurred. Of the population aged 16 – 21 and not in college: P(U) = .1350, P(ND) = .2905, P(U ∩ ND) = .0532. What is the conditional probability that a member of this population is unemployed, given that the person has no diploma? P(A | B) = P(A ∩ B) / P(B), so P(U | ND) = .0532 / .2905 = .1831 or 18.31%
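The conditional probability formula is a single division, which can be written out as a small Python sketch. The function name `conditional_probability` is an illustrative choice, not something from the slides; the inputs are the values given for the unemployed / no-diploma example.

```python
def conditional_probability(p_a_and_b, p_b):
    """Return P(A | B) = P(A and B) / P(B); B must have nonzero probability."""
    if p_b <= 0:
        raise ValueError("P(B) must be greater than zero")
    return p_a_and_b / p_b

# Values from the example: P(U and ND) = .0532, P(ND) = .2905.
p_u_given_nd = conditional_probability(0.0532, 0.2905)
print(round(p_u_given_nd, 4))  # 0.1831
```

Note that P(U | ND) = .1831 is larger than the unconditional P(U) = .1350, so restricting the sample space to people without a diploma raises the probability of unemployment in this example.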

  18. Mean = 100, Standard deviation = 5. If we go up one standard deviation: z score = +1.0 and raw score = 105. If we go down one standard deviation: z score = -1.0 and raw score = 95. If we go up two standard deviations: z score = +2.0 and raw score = 110. If we go down two standard deviations: z score = -2.0 and raw score = 90. If we go up three standard deviations: z score = +3.0 and raw score = 115. If we go down three standard deviations: z score = -3.0 and raw score = 85. [Figure: normal curve with raw-score axis 85, 90, 95, 100, 105, 110, 115] z score: a score that indicates how many standard deviations an observation is above or below the mean of the distribution. z score = (raw score - mean) / standard deviation
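The conversion between raw scores and z scores runs in both directions. A minimal Python sketch follows, using the mean of 100 and standard deviation of 5 from this slide; the function names `z_score` and `raw_score` are illustrative.

```python
def z_score(raw, mean, sd):
    """How many standard deviations the raw score lies above (+) or below (-) the mean."""
    return (raw - mean) / sd

def raw_score(z, mean, sd):
    """Inverse conversion: x = mean + z * sd."""
    return mean + z * sd

mean, sd = 100, 5
for z in (-3, -2, -1, 0, 1, 2, 3):
    x = raw_score(z, mean, sd)
    # Converting back should recover the z score we started with.
    print(z, x, z_score(x, mean, sd))
```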

  19. Normal distribution: raw scores versus z-scores. [Figure: the same normal curve shown with a raw-score axis and with a z-score axis running -3, -2, -1, 0, +1, +2, +3; z = -1 and z = 1 are marked.] In a z-score distribution: mean = 0, standard deviation = 1. In a normal distribution: mean = µ, standard deviation = σ.

  20. Hint: Always draw a picture! Homework worksheet

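  21. Homework Worksheet: Problem 1. The area within 1 sd on either side of the mean (28 to 32, mean = 30) is .68.

  22. Homework Worksheet: Problem 2. The area within 2 sd on either side of the mean (26 to 34) is .95.

  23. Homework Worksheet: Problem 3. The area within 3 sd on either side of the mean (24 to 36) is .997.

  24. Homework Worksheet: Problem 4. Area = .50. [Figure: normal curve with axis 24, 26, 28, 30, 32, 34, 36]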

  25. Homework Worksheet: Problem 5. z = (33 - 30) / 2 = 1.5. Go to table: z = 1.5 gives area = .4332. [Figure: normal curve with axis 24, 26, 28, 30, 32, 34, 36]

  26. Homework Worksheet: Problem 6. z = (33 - 30) / 2 = 1.5. Go to table: z = 1.5 gives area = .4332. .5000 + .4332 = .9332. [Figure: normal curve with axis 24, 26, 28, 30, 32, 34, 36]

  27. Score of 33: z = (33 - 30) / 2 = 1.5; go to table, area = .4332; area above 33 = .5000 - .4332 = .0668. Score of 29: z = (29 - 30) / 2 = -.5; go to table, area = .1915; area above 29 = .5000 + .1915 = .6915. Scores of 25 and 31: z = (25 - 30) / 2 = -2.5, area = .4938; z = (31 - 30) / 2 = .5, area = .1915; area between 25 and 31 = .4938 + .1915 = .6853. Score of 27: z = (27 - 30) / 2 = -1.5; go to table, area = .4332; area below 27 = .5000 - .4332 = .0668.
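The table lookups on the worksheet can be cross-checked in Python with `statistics.NormalDist` from the standard library (Python 3.8+). This is a sketch rather than the course's method: the helper `table_area` mimics a z-table that reports the area between the mean and z, and the mean of 30 and standard deviation of 2 are read off the worksheet curves.

```python
from statistics import NormalDist

# Worksheet distribution: mean 30, standard deviation 2.
scores = NormalDist(mu=30, sigma=2)

def table_area(z):
    """Area between the mean and z, as a mean-to-z table reports it."""
    return abs(NormalDist().cdf(z) - 0.5)

# Table values used on the slides.
print(round(table_area(1.5), 4), round(table_area(0.5), 4), round(table_area(2.5), 4))

# Areas computed directly from the cumulative distribution function.
area_above_33 = 1 - scores.cdf(33)
area_above_29 = 1 - scores.cdf(29)
area_25_to_31 = scores.cdf(31) - scores.cdf(25)
area_below_27 = scores.cdf(27)
print(round(area_above_33, 4), round(area_above_29, 4),
      round(area_25_to_31, 4), round(area_below_27, 4))
```

Using the cumulative distribution function directly avoids the add-or-subtract-.5000 step, but drawing the picture first is still the easiest way to decide which area is being asked for.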

  28. Problem 11: .5000 + .4938 = .9938 Problem 12: .5000 - .3413 = .1587 Problem 13: 30 Problem 14: 28 and 32

  29. 77th percentile: .7700 - .5000 = .2700. Go to table: nearest z for area .2700 is z = .74. x = mean + z σ = 30 + (.74)(2) = 31.48. [Figure: normal curve with axis 24, 26, 28, 30, 34, 36; the unknown score is 31.48]

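  30. 13th percentile: .5000 - .1300 = .3700. Go to table: nearest z for area .3700 is 1.13; the score is below the mean, so z = -1.13. x = mean + z σ = 30 + (-1.13)(2) = 27.74. [Figure: normal curve with axis 24, 26, 30, 32, 34, 36; the unknown score is 27.74]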
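Going from a percentile to a raw score reverses the table lookup: find z for the given area, then apply x = mean + z σ. A hedged Python sketch follows; `percentile_to_raw` is an illustrative name, and it uses the exact inverse of the normal curve (`NormalDist.inv_cdf`) instead of the nearest table entry, so results can differ from the slide values in the second decimal place.

```python
from statistics import NormalDist

def percentile_to_raw(percentile, mean, sd):
    """Raw score at the given percentile (0-100, exclusive) of a normal distribution."""
    z = NormalDist().inv_cdf(percentile / 100)  # exact z, not the nearest table z
    return mean + z * sd

x77 = percentile_to_raw(77, 30, 2)  # slide value using z = .74: 31.48
x13 = percentile_to_raw(13, 30, 2)  # slide value using z = -1.13: 27.74
print(x77, x13)
```

Percentiles below 50 automatically produce a negative z here, which is the step that has to be remembered by hand when using a table that lists only positive z values.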

  31. Problem 17: 68% or .68 or 68.26% or .6826. Problem 18: 95% or .95 or 95.44% or .9544. Problem 19: 99.70% or .9970. Problem 20: 27.34% or .2734. Problem 21: 40.13% or .4013. Problem 22: 69.15% or .6915. Problem 23: 18.41% or .1841. Problem 24: 28.81% or .2881. Problem 25: 96.93% or .9693. Problem 26: .89% or .0089. Problem 27: 95.99% or .9599. Problem 28: 4.01% or .0401. Problem 29: 293.2; x = mean + z σ = 200 + (2.33)(40) = 293.2. Problem 30: 182.4; x = mean + z σ = 200 + (-.44)(40) = 182.4. Problem 31: 190. Problem 32: 217.6.

  32. Please use the following distribution with a mean of 200 and a standard deviation of 40. Find the area under the curve between scores of 200 and 230. Start by filling in the desired information on curve 20 (to the right). (Note: this one will require you to calculate a z-score for a raw score of 230 and use the z-table.) z = (230 - 200) / 40 = .75. Go to table: z = .75 gives area = .2734. [Figure: normal curve with axis 80, 120, 160, 200, 240, 280, 320]

  33. Normal distribution has a mean of 50 and standard deviation of 4. Determine the value below which 95% of observations will occur. Note: sounds like a percentile rank problem. .9500 - .5000 = .4500. Go to table: nearest z for area .4500 is z = 1.65 (1.64 okay too). x = mean + z σ = 50 + (1.65)(4) = 56.60. [Figure: normal curve with axis 38, 42, 46, 50, 54, 58, 62; the unknown score is 56.60]

  34. Normal distribution has a mean of $2,100 and s.d. of $250. What is the operating cost for the lowest 3% of airplanes? Note: sounds like a percentile rank problem = find the score for the 3rd percentile. .5000 - .0300 = .4700. Go to table: nearest z for area .4700 is z = -1.88 (below the mean). x = mean + z σ = 2100 + (-1.88)(250) = 1,630.

  35. Normal distribution has a mean of 195 and standard deviation of 8.5. Determine the value for the top 1% of hours listened. .5000 - .0100 = .4900. Go to table: nearest z for area .4900 is z = 2.33. x = mean + z σ = 195 + (2.33)(8.5) = 214.805, or about 214.8.

  36. Try this one: please find the two raw scores that border exactly the middle 95% of the curve. Mean of 30 and standard deviation of 2. The middle .9500 is .4750 on each side of the mean. Go to table: nearest z for area .4750 is z = 1.96; mean + z σ = 30 + (1.96)(2) = 33.92. On the lower side z = -1.96; mean + z σ = 30 + (-1.96)(2) = 26.08. [Figure: normal curve with axis 24, 28, 30, 32, 36; the unknown scores are 26.08 and 33.92]

  37. Remember confidence intervals? 95% Confidence Interval: we can be 95% confident that our population mean falls between these two scores. 99% Confidence Interval: we can be 99% confident that our population mean falls between these two scores. z-scores allow us to find the raw scores for the middle 95% of the distribution.

  38. Standard Error of the Mean (SEM). Remember confidence intervals? Revisit Confidence Intervals. Confidence Intervals (based on z): we use a range of values to estimate a value, such as a population mean, with a known degree of certainty. • The interval refers to possible values of the population mean. • We can be reasonably confident that the population mean falls in this range (90%, 95%, or 99% confident). • In the long run, a series of intervals like the one we figured out will describe the population mean about 95% of the time. Greater confidence implies loss of precision (95% confidence is most often used). We can actually generate a CI for any confidence level we want; these are just the most common.

  39. Mean = 50, Standard deviation = 10. Find the scores for the middle 95%: x = mean ± (z)(standard deviation). Please note: we will be using this same logic for “confidence intervals”. 1) Go to the z table and find the z score for area .4750: z = 1.96. 2) x = mean + (z)(standard deviation) = 50 + (-1.96)(10) = 30.4. 3) x = mean + (z)(standard deviation) = 50 + (1.96)(10) = 69.6. Scores 30.4 - 69.6 capture the middle 95% of the curve.

  40. Confidence intervals. Mean = 50, Standard deviation = 10, n = 100. standard error of the mean = σ / √n = 10 / √100 = 1, so s.e.m. = 1. For “confidence intervals”: same logic and same z-score as finding the scores for the middle 95%, but we’ll replace standard deviation with the standard error of the mean. x = mean ± (z)(s.e.m.): x = 50 + (1.96)(1) = 51.96; x = 50 + (-1.96)(1) = 48.04. The 95% Confidence Interval is captured by the scores 48.04 – 51.96.
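The difference between the middle 95% of individual scores and a 95% confidence interval around the mean is only which spread is plugged in: the standard deviation or the standard error of the mean. A minimal Python sketch under that reading follows; the function names are illustrative, and the exact z (about 1.95996) is used in place of the table value 1.96.

```python
import math
from statistics import NormalDist

def z_for_middle(coverage):
    """z that borders the middle `coverage` proportion of a normal curve."""
    return NormalDist().inv_cdf(0.5 + coverage / 2)

def middle_scores(mean, sd, coverage=0.95):
    """Raw scores bordering the middle `coverage` of individual scores."""
    z = z_for_middle(coverage)
    return mean - z * sd, mean + z * sd

def confidence_interval(mean, sd, n, coverage=0.95):
    """Same logic, but the spread is the standard error of the mean, sd / sqrt(n)."""
    sem = sd / math.sqrt(n)
    return middle_scores(mean, sem, coverage)

print([round(x, 2) for x in middle_scores(50, 10)])             # [30.4, 69.6]
print([round(x, 2) for x in confidence_interval(50, 10, 100)])  # [48.04, 51.96]
```

Passing `coverage=0.99` widens both intervals, which is the loss of precision that comes with greater confidence.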

  41. Confidence intervals. standard error of the mean = σ / √n. Mean = 50, Standard error mean = 10. Hint: always draw a picture! Tell me the scores that border exactly the middle 95% of the curve. We know this: raw score = mean ± (z score)(standard deviation). Construct a 95 percent confidence interval around the mean. Similar, but uses the standard error of the mean: raw score = mean ± (z score)(standard error of the mean).

  42. Law of large numbers: as the number of measurements increases, the data become more stable and a better approximation of the true (theoretical) probability. As the number of observations (n) increases, or the number of times the experiment is performed increases, the estimate will become more accurate.

  43. Law of large numbers: as the number of measurements increases, the data become more stable and a better approximation of the true signal (e.g. mean). As the number of observations (n) increases, or the number of times the experiment is performed increases, the signal will become clearer (static cancels out). With only a few people, any little error is noticed (becomes exaggerated when we look at the whole group). With many people, any little error is corrected (becomes minimized when we look at the whole group). http://www.youtube.com/watch?v=ne6tB2KiZuk
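The law of large numbers can be watched in a small simulation. The sketch below is an illustration and not part of the lecture: it assumes a fair coin (true probability of heads = .5), a hypothetical helper `proportion_heads`, and a fixed random seed so that runs are repeatable.

```python
import random

def proportion_heads(n_flips, seed=0):
    """Simulate n_flips fair coin flips and return the observed proportion of heads."""
    rng = random.Random(seed)  # fixed seed: an assumption made for repeatability
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips

# With few flips the estimate can sit far from .5; with many flips it settles close to .5.
for n in (10, 100, 1_000, 10_000, 100_000):
    p = proportion_heads(n)
    print(n, p, abs(p - 0.5))
```

Small runs can land well away from .5 by chance, which matches the point that with few observations any little error is exaggerated.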

  44. Central Limit Theorem

  45. Sampling distributions of sample means versus frequency distributions of individual scores. Distribution of raw scores: an empirical probability distribution of the values from a sample of raw scores from a population. • Frequency distributions of individual scores • derived empirically • we are plotting raw data • this is a single sample. Take a single score x from the population; repeat over and over. [Figure: frequency distribution of individual scores, one X per score, with Eugene's and Melvin's scores labeled]

  46. Sampling distribution: a theoretical probability distribution of the possible values of some sample statistic that would occur if we were to draw an infinite number of same-sized samples from a population (important note: “fixed n”). • Sampling distributions of sample means • theoretical distribution • we are plotting means of samples. Take a sample from the population and get its mean; repeat over and over.

  47. Sampling distribution: a theoretical probability distribution of the possible values of some sample statistic that would occur if we were to draw an infinite number of same-sized samples from a population (important note: “fixed n”). • Sampling distributions of sample means • theoretical distribution • we are plotting means of samples. Take a sample from the population and get its mean; repeat over and over. The result is the distribution of means of samples.

  48. Sampling distribution: a theoretical probability distribution of the possible values of some sample statistic that would occur if we were to draw an infinite number of same-sized samples from a population. • Frequency distributions of individual scores • derived empirically • we are plotting raw data • this is a single sample. [Figure: frequency distribution of individual scores, with Eugene's and Melvin's scores labeled] • Sampling distributions of sample means • theoretical distribution • we are plotting means of samples. [Figure: distribution of sample means, with the 2nd sample and the 23rd sample labeled]
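The "take sample, get mean, repeat over and over" procedure can be approximated by simulation. The sketch below is an illustration under stated assumptions, not the lecture's own example: the population is the six faces of a fair die, the sample size is fixed at n = 25, and 5,000 samples stand in for the theoretical "infinite number" of samples.

```python
import random
import statistics

population = [1, 2, 3, 4, 5, 6]  # hypothetical population: faces of a fair die
rng = random.Random(0)           # fixed seed, an assumption for repeatability

def sample_mean(n):
    """Draw one sample of fixed size n (with replacement) and return its mean."""
    return statistics.mean(rng.choice(population) for _ in range(n))

n = 25
means = [sample_mean(n) for _ in range(5000)]  # repeat over and over

# Proposition 1: the mean of the sample means is close to the population mean.
print(statistics.mean(means), statistics.mean(population))

# Proposition 2: the spread of the sample means is close to sigma / sqrt(n).
expected_sem = statistics.pstdev(population) / n ** 0.5
observed_sem = statistics.stdev(means)
print(observed_sem, expected_sem)
```

Plotting a histogram of `means` would show the third proposition as well: a roughly normal shape, even though the individual die rolls are uniformly distributed.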
