
Probability Theory: Great Expectations

Great Theoretical Ideas In Computer Science. Probability Theory: Great Expectations. Lecture 19. CS 15-251. Two Edged Sword. Reasoning in terms of weighted averages is the source of many probability pitfalls, but it is also the source of very powerful mathematical tools.

swood

Presentation Transcript


  1. Great Theoretical Ideas In Computer Science. Probability Theory: Great Expectations. Lecture 19. CS 15-251.

  2. Two Edged Sword • Reasoning in terms of weighted averages is the source of many probability pitfalls, but it is also the source of very powerful mathematical tools.

  3. Finite Probability Distribution • A (finite) probability distribution D is a finite set S of elements, where each element $x \in S$ has a positive real weight, proportion, or probability $p(x)$. The weights must satisfy: $\sum_{x \in S} p(x) = 1$

  4. Finite Probability Distribution • A (finite) probability distribution D is a finite set S of elements, where each element $x \in S$ has a positive real weight, proportion, or probability $p(x)$. • S is often called the sample space.

  5. Finite Probability Distribution • A (finite) probability distribution D is a finite set S of elements, where each element $x \in S$ has a positive real weight, proportion, or probability $p(x)$. • Any set $E \subseteq S$ is called an event. The probability of event E is defined to be $\Pr_D[E] = \sum_{x \in E} p(x)$

  6. Uniform Distribution • A (finite) probability distribution D is a finite set S of elements, where each element $x \in S$ has a positive real probability $p(x)$. • If each element has equal probability, the distribution is said to be uniform.

  7. Functions From Distributions To Distributions • Let D be a probability distribution on a set S. Let f: S -> T be a function. • f(D) denotes a probability distribution on the set T satisfying: • $\forall y \in T,\ p(y) = \Pr_D[\{x \in S \mid f(x) = y\}]$

  8. 2 Fair Flips • D: ¼ --- 00, ¼ --- 01, ¼ --- 10, ¼ --- 11 • f: {00,01,10,11} -> {0,1,2} counts the number of 1's • X = f(D): 0 --- ¼, 1 --- ½, 2 --- ¼
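The construction of f(D) from D can be sketched in a few lines of Python. This is only an illustration: the helper name `pushforward` and the dictionary representation of a distribution are assumptions made here, not part of the lecture.

```python
from collections import defaultdict
from fractions import Fraction


def pushforward(dist, f):
    """Return f(D) as a dict: each y gets the total weight of {x | f(x) = y}."""
    image = defaultdict(Fraction)  # Fraction() is 0, so weights accumulate exactly
    for x, p in dist.items():
        image[f(x)] += p
    return dict(image)


def count_ones(bits):
    """The function f from the slide: number of 1's in a bit string."""
    return bits.count("1")


# D: the uniform distribution on the four outcomes of 2 fair flips.
D = {s: Fraction(1, 4) for s in ("00", "01", "10", "11")}

X = pushforward(D, count_ones)
for value in sorted(X):
    print(value, X[value])
```

Exact fractions are used instead of floats so the weights can be compared with the slide's values without rounding concerns.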

  9. Random Variables • Let D be a probability distribution on a set S. Let f: S -> T be a function. • f(D) denotes a probability distribution on the set T satisfying: • $\forall y \in T,\ p(y) = \Pr_D[\{x \in S \mid f(x) = y\}]$ • If $T \subseteq$ Reals, then we say that f(D) is a random variable resulting from the action of f on the underlying distribution D.

  10. Fair Coin Flips • If $T \subseteq$ Reals, then we say that f(D) is a random variable resulting from the action of f on D. • Let D be the uniform distribution on n-bit strings. Let f: $\{0,1\}^n$ -> Naturals be a function returning the number of 1's. • f(D) is a distribution on {0,1,2,…,n} where the probability of x is $\binom{n}{x} / 2^n$
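As a sanity check on the formula (n choose x) / 2^n, the distribution can be built by brute-force enumeration for a small n. The function name `ones_distribution` and the choice n = 5 are assumptions for this sketch.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import comb


def ones_distribution(n):
    """Enumerate all n-bit strings uniformly and tally the number of 1's."""
    counts = Counter(sum(bits) for bits in product((0, 1), repeat=n))
    total = 2 ** n
    return {x: Fraction(c, total) for x, c in counts.items()}


n = 5  # small enough to enumerate all 2^n strings
enumerated = ones_distribution(n)
formula = {x: Fraction(comb(n, x), 2 ** n) for x in range(n + 1)}

print(enumerated == formula)
```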

  11. 2 Fair Flips • D: ¼ --- 00, ¼ --- 01, ¼ --- 10, ¼ --- 11 • f: {00,01,10,11} -> {0,1,2} counts the number of 1's • X = f(D): 0 --- ¼, 1 --- ½, 2 --- ¼

  12. “Let X be a random variable measuring the height of a randomly selected person in the room.” • This is shorthand for: • Let D be the uniform distribution on people in the room. Let f be the function taking a person to his/her height. • X = f(D)

  13. We require that random variables be distributions on real numbers so that we can combine and summarize random variables in mathematically natural ways.

  14. Let X = f(D) and Y = g(D). Define Z = X + Y to be a new random variable h(D), where h(x) = f(x) + g(x). When two random variables are based on the same distribution, we can sum them to obtain a new random variable.

  15. Example: Let D be the uniform distribution of people in the USA. Let f return the length in inches of a person’s left arm. Let g return the length in inches of a person’s right arm. Let X = f(D) and Y= g(D). Z = X + Y is a random variable measuring the combined arm lengths of a random person in the USA.

  16. Example: Let D be the uniform distribution of refrigerators in Pittsburgh kitchens. Let f return the number of apples in the fridge. Let g return the number of oranges in the fridge. Let X = f(D) and Y= g(D). Z = X + Y is a random variable measuring the total apples and oranges in a random Pittsburgh fridge.

  17. More generally, for any two random variables X and Y on the same distribution, we can create a new random variable Z = h(X,Y) for any function h from pairs of Reals to the Reals. Mostly, we will be adding random variables.

  18. EXPECTATION of X = f(D) on the sample space S • The expectation of a random variable is defined to be the average of its values, each value weighted by its probability of occurring. • $E[X] = \sum_{a \in f(S)} a \Pr[X = a]$

  19. 2 Fair Flips • D uniform on S = {00,01,10,11}: ¼ --- 00, ¼ --- 01, ¼ --- 10, ¼ --- 11 • X = f(D): 0 --- ¼, 1 --- ½, 2 --- ¼ • $E[X] = \sum_{a \in f(S)} a \Pr[X = a] = 0 \cdot \tfrac{1}{4} + 1 \cdot \tfrac{1}{2} + 2 \cdot \tfrac{1}{4} = 1$

  20. EXPECTATION of X = f(D) • The expectation of a random variable is defined by a weighted average as follows: each $d \in S$ contributes f(d) with weight p(d). • $E[X] = \sum_{d \in S} f(d)\, p(d)$

  21. 2 Fair Flips • D on S: ¼ --- 00, ¼ --- 01, ¼ --- 10, ¼ --- 11 • X = f(D): 0 --- ¼, 1 --- ½, 2 --- ¼ • $E[X] = \sum_{d \in S} f(d)\, p(d) = f(00) \cdot \tfrac{1}{4} + f(01) \cdot \tfrac{1}{4} + f(10) \cdot \tfrac{1}{4} + f(11) \cdot \tfrac{1}{4} = 1$

  22. EXPECTATION of X = f(D) with sample space S • $E[X] = \sum_{d \in S} f(d)\, p(d) = \sum_{a \in f(S)} a \Pr[X = a]$
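The two ways of writing the expectation can be compared directly on the 2-fair-flips example. The function names below are hypothetical, chosen only for this sketch; the second function groups sample points by value before summing.

```python
from fractions import Fraction


def expectation_over_samples(dist, f):
    """Sum over d in S of f(d) * p(d)."""
    return sum(f(d) * p for d, p in dist.items())


def expectation_over_values(dist, f):
    """Sum over a in f(S) of a * Pr[X = a]."""
    value_prob = {}
    for d, p in dist.items():
        # Collect the probability of each value a = f(d).
        value_prob[f(d)] = value_prob.get(f(d), 0) + p
    return sum(a * pa for a, pa in value_prob.items())


# D: uniform on the outcomes of 2 fair flips; f counts the 1's.
D = {s: Fraction(1, 4) for s in ("00", "01", "10", "11")}


def count_ones(bits):
    return bits.count("1")


by_samples = expectation_over_samples(D, count_ones)
by_values = expectation_over_values(D, count_ones)
print(by_samples, by_values)
```

Both sums rearrange the same terms, so they agree on any finite distribution, not just this one.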

  23. Example: X is a random variable defined by counting the number of heads when n fair, independent coins are flipped. E[X] = ?

  24. Example: X is a random variable defined by counting the number of heads when n fair, independent coins are flipped. $E[X] = \sum_{a \in f(S)} a \Pr[X = a] = \sum_{a=0}^{n} a \binom{n}{a} / 2^n =$

  25. Example: X is a random variable defined by counting the number of heads when n fair, independent coins are flipped. $E[X] = \sum_{a \in f(S)} a \Pr[X = a] = \sum_{a=0}^{n} a \binom{n}{a} / 2^n = (\tfrac{1}{2})^n \, [n\, 2^{n-1}] = n/2$
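The slide uses the identity that the weighted sum of binomial coefficients equals n 2^(n-1) without showing why. One standard way to see it, offered here as a supporting step rather than as the lecture's own derivation, uses the absorption identity a (n choose a) = n (n-1 choose a-1):

```latex
\begin{align*}
\sum_{a=0}^{n} a \binom{n}{a}
  &= \sum_{a=1}^{n} n \binom{n-1}{a-1}
     && \text{since } a\binom{n}{a} = n\binom{n-1}{a-1} \\
  &= n \sum_{b=0}^{n-1} \binom{n-1}{b}
     && \text{substituting } b = a - 1 \\
  &= n \, 2^{n-1}
     && \text{the row sum of binomial coefficients is } 2^{n-1}.
\end{align*}
```

Dividing by 2^n then gives E[X] = n/2.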

  26. Don’t Always Expect The Expected • Let D be the uniform distribution on {0,1,9,10}. • E[D] = 5 • The probability that you ever see a sample close to 5 is zero.

  27. Example: X is a random variable defined by counting the number of heads when n independent coins, each with bias p (probability p of heads), are flipped. $E[X] = \sum_{a \in f(S)} a \Pr[X = a] = \sum_{a=0}^{n} a \binom{n}{a} p^a (1-p)^{n-a} =$

  28. Example: X is a random variable defined by counting the number of heads when n independent coins, each with bias p (probability p of heads), are flipped. $E[X] = \sum_{a \in f(S)} a \Pr[X = a] = \sum_{a=0}^{n} a \binom{n}{a} p^a (1-p)^{n-a} =$ Ugh! There has to be a better way!

  29. IMPORTANT: E[X+Y] = E[X] + E[Y]

  30. IMPORTANT: E[X+Y] = E[X] + E[Y]

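  31. E[X+Y] = E[X] + E[Y] • Proof: $E[X] = \sum_{d \in S} f(d)\, p(d)$ • $E[Y] = \sum_{d \in S} g(d)\, p(d)$ • $E[X+Y] = \sum_{d \in S} (f(d) + g(d))\, p(d) = \sum_{d \in S} f(d)\, p(d) + \sum_{d \in S} g(d)\, p(d) = E[X] + E[Y]$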

  32. By induction . . . $E[X_1 + X_2 + \dots + X_n] = E[X_1] + E[X_2] + \dots + E[X_n]$

  33. The expectation of the sum is the sum of the expectations.

  34. We will now explain a powerful way to compute expectations. This is called the indicator variable method.

  35. Example: X is a random variable defined by counting the number of heads when n fair, independent coins are flipped. $E[X] = \sum_{a \in f(S)} a \Pr[X = a]$ • The method of indicator variables will solve this problem with almost no calculation.

  36. Example: X is a random variable defined by counting the number of heads when n fair, independent coins are flipped. E[X] = ? • Define n indicator variables: $X_k = 1$ if the kth coin is heads, and $X_k = 0$ if the kth coin is tails. • $X_k$ indicates whether the kth flip is heads. By design, the sum of the indicator variables is X.

  37. Example: X is a random variable defined by counting the number of heads when n fair, independent coins are flipped. E[X] = ? • Define n indicator variables: $X_k = 1$ if the kth coin is heads, and $X_k = 0$ if the kth coin is tails. • $\sum_k X_k = X$

  38. Example: X is a random variable defined by counting the number of heads when n fair, independent coins are flipped. E[X] = ? • Define n indicator variables: $X_k = 1$ if the kth coin is heads, and $X_k = 0$ if the kth coin is tails. • $E[\sum_k X_k] = E[X]$

  39. $E[X] = E[\sum_k X_k] = E[X_1] + \dots + E[X_n]$ • The expectation of the sum is the sum of the expectations.

  40. $E[X] = E[\sum_k X_k] = E[X_1] + \dots + E[X_n]$ • Each individual $E[X_k]$ is trivial to calculate: $E[X_k] = \tfrac{1}{2} \cdot 0 + \tfrac{1}{2} \cdot 1 = \tfrac{1}{2}$

  41. $E[X] = E[\sum_k X_k] = E[X_1] + \dots + E[X_n] = \tfrac{1}{2} + \tfrac{1}{2} + \dots + \tfrac{1}{2} = n/2$
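The indicator-variable argument can be checked by brute force for a small number of coins. In this sketch the value n = 4, the tuple encoding of outcomes (1 for heads), and the helper `expectation` are all assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product


def expectation(dist, f):
    """Sum over d in S of f(d) * p(d)."""
    return sum(f(d) * p for d, p in dist.items())


n = 4  # number of fair coins; small enough to list all 2^n outcomes
D = {bits: Fraction(1, 2 ** n) for bits in product((0, 1), repeat=n)}

# X_k is the kth coordinate of the outcome: 1 for heads, 0 for tails.
indicator_means = [expectation(D, lambda bits, k=k: bits[k]) for k in range(n)]

# X counts the heads directly.
direct = expectation(D, lambda bits: sum(bits))

print(indicator_means, sum(indicator_means), direct)
```

Each indicator has expectation ½, and their sum matches the expectation computed directly from the full distribution.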

  42. Example: X is a random variable defined by counting the number of heads when n fair, independent coins are flipped. E[X] = ? • Define n indicator variables: $X_k = 1$ if the kth coin is heads, and $X_k = 0$ if the kth coin is tails. • $E[X] = E[\sum_k X_k] = n/2$

  43. Example: X is a random variable defined by counting the number of heads when n independent coins, each with bias p (probability p of heads), are flipped. E[X] = ? • Define n indicator variables: $X_k = 1$ if the kth coin is heads, and $X_k = 0$ if the kth coin is tails. • $E[X] = E[\sum_k X_k] = {?}$

  44. $E[X] = E[\sum_k X_k] = E[X_1] + \dots + E[X_n]$ • Each individual $E[X_k]$ is trivial to calculate: $E[X_k] = (1-p) \cdot 0 + p \cdot 1 = p$

  45. Example: X is a random variable defined by counting the number of heads when n independent coins, each with bias p (probability p of heads), are flipped. E[X] = ? • Define n indicator variables: $X_k = 1$ if the kth coin is heads, and $X_k = 0$ if the kth coin is tails. • $E[X] = E[\sum_k X_k] = pn$
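The result E[X] = pn can be compared against the direct binomial sum using exact arithmetic. The function name `binomial_mean_by_sum` and the sample values n = 10, p = 1/3 are assumptions for this sketch.

```python
from fractions import Fraction
from math import comb


def binomial_mean_by_sum(n, p):
    """Evaluate the sum over a of a * C(n, a) * p^a * (1 - p)^(n - a) term by term."""
    return sum(a * comb(n, a) * p ** a * (1 - p) ** (n - a) for a in range(n + 1))


n, p = 10, Fraction(1, 3)  # illustrative values only
by_sum = binomial_mean_by_sum(n, p)
by_indicators = n * p  # the indicator-variable answer

print(by_sum, by_indicators)
```

The term-by-term sum costs n + 1 multiplications of growing fractions, while the indicator argument gives the same number with a single product.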

  46. Exercise • Go back to the painful-looking sum $\sum_{k=0}^{n} k \binom{n}{k} p^k (1-p)^{n-k}$ and solve it using the fact that E[X] = pn.

  47. The method of indicator variables is even more powerful than we are letting on! The indicator variables do not have to be independent for the method to work. Additivity of expectations does not require independence.
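A small example makes the point about dependence concrete. The fair six-sided die and the two indicator functions below are assumptions chosen for this sketch, not an example from the lecture: the two indicators are dependent, yet their expectations still add.

```python
from fractions import Fraction


def expectation(dist, f):
    """Sum over d in S of f(d) * p(d)."""
    return sum(f(d) * p for d, p in dist.items())


# D: uniform distribution on the faces of a fair die (hypothetical example).
D = {roll: Fraction(1, 6) for roll in range(1, 7)}


def is_even(roll):
    return 1 if roll % 2 == 0 else 0


def at_least_four(roll):
    return 1 if roll >= 4 else 0


e_x = expectation(D, is_even)
e_y = expectation(D, at_least_four)
e_sum = expectation(D, lambda r: is_even(r) + at_least_four(r))

# For independent indicators E[XY] would equal E[X] * E[Y]; here it does not.
e_product = expectation(D, lambda r: is_even(r) * at_least_four(r))

print(e_sum, e_x + e_y, e_product, e_x * e_y)
```

The product check shows the indicators are dependent (1/3 versus 1/4), while the sum check shows additivity is unaffected.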

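  48. E[X+Y] = E[X] + E[Y] • Proof: $E[X] = \sum_{d \in S} f(d)\, p(d)$ • $E[Y] = \sum_{d \in S} g(d)\, p(d)$ • $E[X+Y] = \sum_{d \in S} (f(d) + g(d))\, p(d) = \sum_{d \in S} f(d)\, p(d) + \sum_{d \in S} g(d)\, p(d) = E[X] + E[Y]$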

  49. The indicator variables do not have to be independent for the method to work. • Application: When computing the expected running time of a randomized algorithm, we can compute the expectation of each piece of the program and sum the results.

  50. Dependent indicator variables arise in the birthday example. Suppose we have k people each with a uniformly chosen birthday from 1 to 365. X=number of pairs of people with the same birthday. E[X] = ?
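One way to approach this question with the indicator method, sketched here as an assumption about where the example is heading: use one indicator per pair of people. Each pair shares a birthday with probability 1/365, and there are (k choose 2) pairs, so additivity suggests E[X] = (k choose 2) / 365. The function names and the reduced calendar used for the brute-force check are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product
from math import comb


def expected_pairs_formula(k, days=365):
    """One indicator per pair, each with expectation 1/days, summed over C(k, 2) pairs."""
    return Fraction(comb(k, 2), days)


def expected_pairs_bruteforce(k, days):
    """Average the number of matching pairs over all days^k birthday assignments."""
    total = Fraction(0)
    outcomes = 0
    for birthdays in product(range(days), repeat=k):
        outcomes += 1
        total += sum(
            1 for i, j in combinations(range(k), 2) if birthdays[i] == birthdays[j]
        )
    return total / outcomes


# Brute force is only feasible on a tiny calendar: 3 people, 5 possible days.
small_check = expected_pairs_bruteforce(3, 5)
print(small_check, expected_pairs_formula(3, 5))
print(expected_pairs_formula(23))
```

The pair indicators are not mutually independent in general, which is why this example illustrates additivity without independence.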
