Probability (Part 2)

Presentation Transcript

1. Chapter 5 • Probability (Part 2) • Contingency Tables • Tree Diagrams • Bayes’s Theorem (optional) • Counting Rules (optional)

2. Contingency Tables • What is a Contingency Table? • A contingency table is a cross-tabulation of frequencies into rows and columns. • Each intersection of a row and a column is a cell. • A contingency table is like a frequency distribution for two variables. [Figure: table layout with Variable 1 across the columns (Col 1 to Col 3) and Variable 2 down the rows (Row 1 to Row 4)]

3. Contingency Tables • Example: Salary Gains and MBA Tuition • Consider the following cross-tabulation table for n = 67 top-tier MBA programs:

4. Contingency Tables • Example: Salary Gains and MBA Tuition • Are large salary gains more likely to accrue to graduates of high-tuition MBA programs? • The frequencies indicate that MBA graduates of high-tuition schools do tend to have large salary gains. • Also, most of the top-tier schools charge high tuition. • More precise interpretations of these data can be made using the concepts of probability.

5. Contingency Tables • Marginal Probabilities • The marginal probability of a single event is found by dividing a row or column total by the total sample size. • For example, find the marginal probability of a medium salary gain, P(S2). • P(S2) = 33/67 = .4925 • Conclude that about 49% of salary gains at the top-tier schools were between \$50,000 and \$100,000 (medium gain).

6. Contingency Tables • Marginal Probabilities • Find the marginal probability of low tuition, P(T1). • P(T1) = 16/67 = .2388 • There is a 24% chance that a top-tier school’s MBA tuition is under \$40,000.
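
The two marginal probabilities above can be checked with a short sketch. The cross-tabulation itself did not survive in this transcript, so the cell counts below are an assumption: they are reconstructed from the frequencies and totals quoted on the slides (rows are tuition levels T1 to T3, columns are salary gains S1 to S3), and the "medium" label for T2 is inferred.

```python
# Cell counts reconstructed from figures quoted on the slides (assumption).
# Rows: tuition T1 (low), T2 (medium), T3 (high).
# Columns: salary gain S1 (small), S2 (medium), S3 (large).
table = [
    [5, 10, 1],
    [7, 11, 1],
    [5, 12, 15],
]

n = sum(sum(row) for row in table)              # total sample size
row_totals = [sum(row) for row in table]        # tuition totals
col_totals = [sum(col) for col in zip(*table)]  # salary-gain totals

# A marginal probability is a row or column total divided by n.
p_s2 = col_totals[1] / n  # medium salary gain
p_t1 = row_totals[0] / n  # low tuition

print(f"n = {n}")
print(f"P(S2) = {col_totals[1]}/{n} = {p_s2:.4f}")
print(f"P(T1) = {row_totals[0]}/{n} = {p_t1:.4f}")
```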

7. Contingency Tables • Joint Probabilities • A joint probability represents the intersection of two events in a cross-tabulation table. • Consider the joint event that the school has low tuition and large salary gains (denoted as P(T1 ∩ S3)).

8. Contingency Tables • Joint Probabilities • So, using the cross-tabulation table, P(T1 ∩ S3) = 1/67 = .0149 • There is less than a 2% chance that a top-tier school has both low tuition and large salary gains.

9. Contingency Tables • Conditional Probabilities • Found by restricting ourselves to a single row or column (the condition). • For example, knowing that a school’s MBA tuition is high (T3), we would restrict ourselves to the third row of the table.

10. Contingency Tables • Conditional Probabilities • Find the probability that the salary gains are small (S1) given that the MBA tuition is high (T3). • P(S1 | T3) = 5/32 = .1563 • What does this mean?
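
As a rough illustration, conditioning on high tuition amounts to reading only the T3 row. The counts below are an assumption, reconstructed from the figures quoted on the slides because the original table is missing from this transcript. The sketch also confirms that the row-restricted answer matches the definition P(S1 | T3) = P(S1 and T3) / P(T3).

```python
# Cell counts reconstructed from figures quoted on the slides (assumption).
table = {
    "T1": {"S1": 5, "S2": 10, "S3": 1},
    "T2": {"S1": 7, "S2": 11, "S3": 1},
    "T3": {"S1": 5, "S2": 12, "S3": 15},
}

# Conditioning on T3 means looking only at the T3 row.
t3_row = table["T3"]
t3_total = sum(t3_row.values())
p_s1_given_t3 = t3_row["S1"] / t3_total

# Same answer from the definition: joint probability over marginal probability.
n = sum(sum(row.values()) for row in table.values())
p_joint = t3_row["S1"] / n
p_t3 = t3_total / n

print(f"P(S1 | T3) = {t3_row['S1']}/{t3_total} = {p_s1_given_t3:.5f}")
print(f"P(S1 and T3) / P(T3) = {p_joint / p_t3:.5f}")
```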

11. Contingency Tables • Independence • To check for independent events in a contingency table, compare the conditional to the marginal probabilities. • For example, if large salary gains (S3) were independent of low tuition (T1), then P(S3 | T1) = P(S3). • What do you conclude about events S3 and T1?
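
The comparison asked for above can be sketched as follows, again assuming the cell counts reconstructed from the figures quoted on the slides. With these counts the conditional probability (1/16 = .0625) is well below the marginal probability (17/67, about .2537), which suggests that S3 and T1 are not independent in this sample.

```python
# Cell counts reconstructed from figures quoted on the slides (assumption).
# Rows: T1, T2, T3 (tuition); columns: S1, S2, S3 (salary gain).
table = [
    [5, 10, 1],
    [7, 11, 1],
    [5, 12, 15],
]

n = sum(sum(row) for row in table)

# Marginal probability of a large salary gain: S3 column total over n.
p_s3 = sum(row[2] for row in table) / n

# Conditional probability of a large gain given low tuition: T1 row only.
p_s3_given_t1 = table[0][2] / sum(table[0])

print(f"P(S3)      = {p_s3:.4f}")
print(f"P(S3 | T1) = {p_s3_given_t1:.4f}")
print(f"difference = {p_s3 - p_s3_given_t1:.4f}")
```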

12. Contingency Tables • Relative Frequencies • Calculate the relative frequencies below for each cell of the cross-tabulation table to facilitate probability calculations. • Symbolic notation for relative frequencies:

13. Contingency Tables • Relative Frequencies • Here are the resulting probabilities (relative frequencies). For example: • P(T1 and S1) = 5/67 • P(T2 and S2) = 11/67 • P(T3 and S3) = 15/67 • P(S1) = 17/67 • P(T2) = 19/67

14. Contingency Tables • Relative Frequencies • The nine joint probabilities sum to 1.0000 since these are all the possible intersections. • Summing across a row or down a column gives the marginal probability for the respective row or column.
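
A minimal sketch of these two properties, using exact fractions so that no rounding is involved. The cell counts are an assumption, reconstructed from the figures quoted on the slides.

```python
from fractions import Fraction

# Cell counts reconstructed from figures quoted on the slides (assumption).
table = [
    [5, 10, 1],
    [7, 11, 1],
    [5, 12, 15],
]

n = sum(sum(row) for row in table)

# Relative frequency of each cell = cell count / n.
rel = [[Fraction(count, n) for count in row] for row in table]

# The nine joint probabilities cover every possible intersection.
total = sum(sum(row) for row in rel)

# Row sums and column sums are the marginal probabilities.
row_marginals = [sum(row) for row in rel]
col_marginals = [sum(col) for col in zip(*rel)]

print("sum of joint probabilities:", total)
print("tuition marginals:", [str(p) for p in row_marginals])
print("salary-gain marginals:", [str(p) for p in col_marginals])
```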

15. Contingency Tables • Example: Payment Method and Purchase Quantity • A small grocery store would like to know if the number of items purchased by a customer is independent of the type of payment method the customer chooses to use. • Why would this information be useful to the store manager? • The manager collected a random sample of 368 customer transactions.

16. Contingency Tables • Example: Payment Method and Purchase Quantity • Here is the contingency table of frequencies:

17. Contingency Tables • Example: Payment Method and Purchase Quantity • Calculate the marginal probability that a customer will use cash to make the payment. • Let C be the event cash. P(C) = 126/368 = .3424 • Now, is this probability the same if we condition on number of items purchased?

18. Contingency Tables • Example: Payment Method and Purchase Quantity • P(C | 1-5) = 30/88 = .3409 • P(C | 6-10) = 46/135 = .3407 • P(C | 10-20) = 31/89 = .3483 • P(C | 20+) = 19/56 = .3393 • P(C) = .3424, so what do you conclude about independence? • Based on this, the manager might decide to offer a cash-only lane that is not restricted by the number of items purchased.
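
The comparison of conditional and marginal probabilities can be sketched as below, using only the cash counts and group totals that appear in the fractions on this slide. Every conditional probability lies within about .006 of P(C), which is consistent with payment method being roughly independent of purchase quantity in this sample.

```python
# Cash transactions and all transactions per purchase-size group,
# taken from the fractions quoted on the slide.
cash = {"1-5": 30, "6-10": 46, "10-20": 31, "20+": 19}
total = {"1-5": 88, "6-10": 135, "10-20": 89, "20+": 56}

# Marginal probability of paying cash.
p_cash = sum(cash.values()) / sum(total.values())

# Conditional probability of paying cash within each group.
conditionals = {group: cash[group] / total[group] for group in cash}

# Largest gap between any conditional probability and the marginal.
max_gap = max(abs(p - p_cash) for p in conditionals.values())

print(f"P(C) = {p_cash:.4f}")
for group, p in conditionals.items():
    print(f"P(C | {group}) = {p:.4f}")
print(f"largest gap = {max_gap:.4f}")
```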

19. Contingency Tables • How Do We Get a Contingency Table? • Contingency tables require careful organization and are created from raw data. • Consider the data of salary gain and tuition for n = 67 top-tier MBA schools.

20. Contingency Tables • How Do We Get a Contingency Table? • The data should be coded so that the values can be placed into the contingency table. • Once coded, tabulate the frequency in each cell of the contingency table using MINITAB’s Stat | Tables | Cross Tabulation.
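
The code-then-tabulate step can also be sketched outside MINITAB. The records below are hypothetical values invented for illustration, not the MBA data set. The cut-offs for low tuition (under \$40,000) and medium gain (\$50,000 to \$100,000) follow the slides; the \$50,000 cut-off for high tuition is an assumption.

```python
from collections import Counter

# Hypothetical raw records (tuition, salary gain) in dollars; illustration only.
raw = [
    (35_000, 42_000),
    (38_000, 61_000),
    (45_000, 75_000),
    (52_000, 120_000),
    (58_000, 95_000),
    (61_000, 130_000),
]

def code_tuition(tuition):
    # Low is under $40,000 per the slides; the $50,000 cut is an assumption.
    if tuition < 40_000:
        return "T1"
    if tuition < 50_000:
        return "T2"
    return "T3"

def code_gain(gain):
    # Medium is $50,000 to $100,000 per the slides.
    if gain < 50_000:
        return "S1"
    if gain <= 100_000:
        return "S2"
    return "S3"

# Tabulate the frequency of each (tuition code, gain code) cell.
cells = Counter((code_tuition(t), code_gain(g)) for t, g in raw)

for t in ("T1", "T2", "T3"):
    print(t, [cells[(t, s)] for s in ("S1", "S2", "S3")])
```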

21. Tree Diagrams • What is a Tree? • A tree diagram or decision tree helps you visualize all possible outcomes. • Start with a contingency table. • For example, this table gives expense ratios by fund type for 21 bond funds and 23 stock funds.

22. Tree Diagrams • What is a Tree? • To label the tree, first calculate conditional probabilities by dividing each cell frequency by its column total. • For example, P(L | B) = 11/21 = .5238 • Here is the table of conditional probabilities

23. Tree Diagrams • What is a Tree? • To calculate joint probabilities, use P(A ∩ B) = P(A | B)P(B) = P(B | A)P(A) • The joint probability of each terminal event on the tree can be obtained by multiplying the probabilities along its branch. • For example, P(B ∩ L) = P(L | B)P(B) = (.5238)(.4773) = .2500 • The tree diagram shows all events along with their marginal, conditional and joint probabilities.
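
Multiplying along one branch can be sketched as below. Only the bond-fund branch is computed, because the other cell counts are missing from this transcript. B is taken to mean a bond fund and L a low expense ratio, which is an assumption about the lost table.

```python
# Counts quoted on the slides: 21 bond funds, 23 stock funds,
# and 11 bond funds in category L (assumed to mean low expense ratio).
n_bond, n_stock = 21, 23
low_bond = 11

p_b = n_bond / (n_bond + n_stock)  # marginal P(B) = 21/44
p_l_given_b = low_bond / n_bond    # conditional P(L | B) = 11/21

# Joint probability of the terminal event: multiply along the branch.
p_b_and_l = p_l_given_b * p_b

print(f"P(B)       = {p_b:.4f}")
print(f"P(L | B)   = {p_l_given_b:.4f}")
print(f"P(B and L) = {p_b_and_l:.4f}")
```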

24. Tree Diagrams • Tree Diagram for Fund Type and Expense Ratios

25. Bayes’s Theorem • Thomas Bayes (1702-1761) provided a method (called Bayes’s Theorem) of revising probabilities to reflect new information. • The prior (marginal) probability of an event B is revised after event A has been considered to yield a posterior (conditional) probability. • Bayes’s formula is: $P(B \mid A) = \dfrac{P(A \mid B)P(B)}{P(A)}$

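26. Bayes’s Theorem • Bayes’s formula begins as: $P(B \mid A) = \dfrac{P(A \mid B)P(B)}{P(A)}$ • In some situations P(A) is not given. Therefore, the most useful and common form of Bayes’s Theorem is: $P(B \mid A) = \dfrac{P(A \mid B)P(B)}{P(A \mid B)P(B) + P(A \mid B')P(B')}$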

27. Bayes’s Theorem • How Bayes’s Theorem Works • Consider an over-the-counter pregnancy testing kit and its “track record” of determining pregnancies. • If a woman is actually pregnant, what is the test’s “track record”? The test is positive 96% of the time and negative 4% of the time (false negative). • If a woman is not pregnant, what is the test’s “track record”? The test is positive 1% of the time (false positive) and negative 99% of the time.

28. Bayes’s Theorem • How Bayes’s Theorem Works • Suppose that 60% of the women who purchase the kit are actually pregnant. • Intuitively, if 1,000 women use this test, the results should look like this.

29. Bayes’s Theorem • How Bayes’s Theorem Works • Of the 580 women who test positive, 576 will actually be pregnant. • So, the desired probability is: P(Pregnant | Positive Test) = 576/580 = .9931
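
The frequency reasoning can be reproduced with a short sketch. It uses the figures from the slides: 1,000 women, 60% of them pregnant, a positive result for 96% of pregnant women and for 1% of women who are not pregnant. The variable names are illustrative.

```python
women = 1000
p_pregnant = 0.60
p_pos_given_pregnant = 0.96      # true positive rate
p_pos_given_not_pregnant = 0.01  # false positive rate

# Expected head counts; round() turns the float products into whole women.
pregnant = round(women * p_pregnant)
not_pregnant = women - pregnant
true_positives = round(pregnant * p_pos_given_pregnant)
false_positives = round(not_pregnant * p_pos_given_not_pregnant)
positives = true_positives + false_positives

# Share of positive tests that belong to pregnant women.
p_pregnant_given_pos = true_positives / positives

print(f"positives: {positives} ({true_positives} true, {false_positives} false)")
print(f"P(Pregnant | Positive Test) = {p_pregnant_given_pos:.4f}")
```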

30. Bayes’s Theorem • How Bayes’s Theorem Works • Now use Bayes’s Theorem to formally derive the result P(Pregnant | Positive) = .9931: • First define: A = positive test, B = pregnant • And the complement of each event is: A' = negative test, B' = not pregnant • From the contingency table, we know that: P(A | B) = .96, P(A | B') = .01, P(B) = .60, P(A' | B) = .04, P(A' | B') = .99, P(B') = .40

31. Bayes’s Theorem • How Bayes’s Theorem Works • $P(B \mid A) = \dfrac{P(A \mid B)P(B)}{P(A \mid B)P(B) + P(A \mid B')P(B')} = \dfrac{(.96)(.60)}{(.96)(.60) + (.01)(.40)} = \dfrac{.576}{.576 + .004} = \dfrac{.576}{.580} = .9931$ • So, there is a 99.31% chance that a woman is pregnant, given that the test is positive.
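
The same calculation can be written as a small function. This is a sketch of the two-event form of Bayes’s Theorem; the function name `bayes_posterior` and its parameter names are illustrative, not from the slides.

```python
def bayes_posterior(p_a_given_b, p_b, p_a_given_not_b):
    """Return P(B | A) for a dichotomous event B (B or its complement B')."""
    numerator = p_a_given_b * p_b
    # P(A) expanded over B and B', with P(B') = 1 - P(B).
    denominator = numerator + p_a_given_not_b * (1 - p_b)
    return numerator / denominator

# A = positive test, B = pregnant, with the values from the slides.
p_b_given_a = bayes_posterior(p_a_given_b=0.96, p_b=0.60, p_a_given_not_b=0.01)
print(f"P(B | A) = {p_b_given_a:.4f}")
```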

32. Bayes’s Theorem • How Bayes’s Theorem Works • Bayes’s Theorem shows us how to revise our prior probability of pregnancy to get the posterior probability after the results of the pregnancy test are known. • Bayes’s Theorem is useful when a direct calculation of a conditional probability is not permitted due to lack of information.

33. Bayes’s Theorem • How Bayes’s Theorem Works • A tree diagram helps visualize the situation.

34. Bayes’s Theorem • How Bayes’s Theorem Works • The 2 branches showing a positive test (A) comprise a reduced sample space B ∩ A and B' ∩ A, so add their probabilities to obtain the denominator of the fraction whose numerator is P(B ∩ A).

35. Bayes’s Theorem • General Form of Bayes’s Theorem • A generalization of Bayes’s Theorem allows event B to be polytomous (B1, B2, … Bn) rather than dichotomous (B and B').

36. Bayes’s Theorem • Example: Hospital Trauma Centers • Based on historical data, the percent of cases at 3 hospital trauma centers and the probability of a case resulting in a malpractice suit are as follows: hospital 1 handles 50% of cases with suit probability .001, hospital 2 handles 30% with .005, and hospital 3 handles 20% with .008. • Let event A = a malpractice suit is filed, Bi = patient was treated at trauma center i

37. Bayes’s Theorem • Example: Hospital Trauma Centers • Applying the general form of Bayes’s Theorem, find P(B1 | A).

38. Bayes’s Theorem • Example: Hospital Trauma Centers • Conclude that, given that a malpractice suit was filed, the probability that the patient was treated at hospital 1 is .1389 or 13.89%. • All the posterior probabilities for each hospital can be calculated and then compared:
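
A sketch of the general form for the three hospitals follows. The priors (50%, 30%, 20% of cases) and suit probabilities (.001, .005, .008) are the ones used in the slides’ frequency calculation. Each posterior is the joint probability P(A | Bi)P(Bi) divided by the sum of all three joint probabilities.

```python
# Share of cases at each trauma center (prior probabilities).
priors = {"B1": 0.5, "B2": 0.3, "B3": 0.2}
# Probability that a case results in a malpractice suit at each center.
suit_rates = {"B1": 0.001, "B2": 0.005, "B3": 0.008}

# Joint probabilities P(A and Bi) = P(A | Bi) * P(Bi).
joint = {b: suit_rates[b] * priors[b] for b in priors}

# P(A) is the sum over the mutually exclusive, exhaustive events Bi.
p_a = sum(joint.values())

# Posterior probabilities P(Bi | A).
posteriors = {b: joint[b] / p_a for b in joint}

for b, p in posteriors.items():
    print(f"P({b} | A) = {p:.4f}")
```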

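39. Bayes’s Theorem • Example: Hospital Trauma Centers • Intuitively, imagine there were 10,000 patients and calculate the frequencies: • Hospital 1: 10,000 x .5 = 5,000 patients; 5,000 x .001 = 5 suits; 5,000 - 5 = 4,995 without a suit • Hospital 2: 10,000 x .3 = 3,000 patients; 3,000 x .005 = 15 suits; 3,000 - 15 = 2,985 without a suit • Hospital 3: 10,000 x .2 = 2,000 patients; 2,000 x .008 = 16 suits; 2,000 - 16 = 1,984 without a suit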

40. Bayes’s Theorem • Example: Hospital Trauma Centers • Now, use these frequencies to find the probabilities needed for Bayes’s Theorem. • For example,

41. Bayes’s Theorem • Example: Hospital Trauma Centers • Consider the following visual description of the problem:

42. Bayes’s Theorem • Example: Hospital Trauma Centers • The initial sample space consists of 3 mutually exclusive and collectively exhaustive events (hospitals B1, B2, B3).

43. Bayes’s Theorem • Example: Hospital Trauma Centers • As indicated by their relative areas, B1 is 50% of the sample space, B2 is 30% and B3 is 20%.

44. Bayes’s Theorem • Example: Hospital Trauma Centers • But, given that a malpractice case has been filed (event A), then the relevant sample space is reduced to the yellow area of event A. • The revised probabilities P(B1 | A), P(B2 | A) and P(B3 | A) are the relative areas within event A.

45. Counting Rules • Fundamental Rule of Counting • If event A can occur in n1 ways and event B can occur in n2 ways, then events A and B can occur in n1 x n2 ways. • In general, m events can occur in n1 x n2 x … x nm ways.

46. Counting Rules • Example: Stock-Keeping Labels • How many unique stock-keeping unit (SKU) labels can a hardware store create by using 2 letters (ranging from AA to ZZ) followed by four digits (0 through 9)? • For example: • AF1078: hex-head 6 cm bolts – box of 12 • RT4855: Lime-A-Way cleaner – 16 ounce • LL3319: Rust-Oleum primer – gray 15 ounce

47. Counting Rules • Example: Stock-Keeping Labels • View the problem as filling six empty boxes: • There are 26 ways to fill each of the 1st and 2nd boxes and 10 ways to fill each of the 3rd through 6th. • Therefore, there are 26 x 26 x 10 x 10 x 10 x 10 = 6,760,000 unique inventory labels.
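
The six-box calculation is a direct application of the fundamental rule of counting and can be sketched as a product of the number of choices per box. The list name `choices_per_box` is illustrative.

```python
import math

# Two letter boxes (26 choices each) followed by four digit boxes (10 each).
choices_per_box = [26, 26, 10, 10, 10, 10]

# Fundamental rule of counting: multiply the number of ways for each box.
sku_labels = math.prod(choices_per_box)

print(f"{sku_labels:,} unique labels")
```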

48. Counting Rules • Example: Shirt Inventory • The L.L. Bean men’s cotton chambray shirt comes in 6 colors (blue, stone, rust, green, plum, indigo), 5 sizes (S, M, L, XL, XXL) and two styles (short and long sleeves). • The stock might therefore include 6 x 5 x 2 = 60 possible shirts. • However, the number of each type of shirt to be stocked depends on prior demand.
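
For a count this small, the combinations can also be enumerated rather than just multiplied. The sketch below lists every color, size and style combination and confirms that the enumeration agrees with 6 x 5 x 2.

```python
from itertools import product

colors = ["blue", "stone", "rust", "green", "plum", "indigo"]
sizes = ["S", "M", "L", "XL", "XXL"]
styles = ["short sleeve", "long sleeve"]

# Every possible (color, size, style) combination.
shirts = list(product(colors, sizes, styles))

print(len(shirts), "possible shirts")
print("first:", shirts[0])
print("last:", shirts[-1])
```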

49. Counting Rules • Factorials • The number of ways that n items can be arranged in a particular order is n factorial. • n factorial is the product of all integers from 1 to n: n! = n(n–1)(n–2)...1 • Factorials are useful for counting the possible arrangements of any n items. • There are n ways to choose the first, n–1 ways to choose the second, and so on.
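
A brief sketch comparing the factorial formula with a brute-force count of arrangements. The four items are hypothetical placeholders.

```python
import math
from itertools import permutations

items = ["A", "B", "C", "D"]  # hypothetical items
n = len(items)

# Count by formula: n! = n(n-1)(n-2)...1
by_formula = math.factorial(n)

# Count by listing every ordering of the items.
by_enumeration = sum(1 for _ in permutations(items))

print(f"{n}! = {by_formula}; enumerated arrangements = {by_enumeration}")
```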