
Chapter 5: Probability Analysis of Randomized Algorithms



  1. Chapter 5: Probability Analysis of Randomized Algorithms Size is rarely the only property of the input that affects run time. Worst-case analysis is the most common way to eliminate the other properties. Alternative: compute <T(n)>, the expected run time after randomizing the input. Example: the Randomized Hire Assistant procedure:
     Randomize the interview schedule
     best ← 0 (initialization)
     for i ← 1 to n do
       interview candidate i
       if candidate i is better than candidate best then best ← i; hire candidate i
     Find how <T(n)> depends on the interviewing and hiring times.
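
A minimal Python sketch of the procedure above; the numeric candidate ranks, the cost constants, and the use of random.shuffle to stand in for "randomize the interview schedule" are illustrative assumptions, not part of the text.

```python
import random

def randomized_hire_assistant(ranks, interview_cost=1, hiring_cost=10):
    """Interview candidates in random order; hire whenever the current
    candidate is better (higher rank) than everyone seen so far.
    Returns total cost = n*c_i + (number of hires)*c_h."""
    order = list(range(len(ranks)))
    random.shuffle(order)          # randomize interview schedule
    best_rank = float("-inf")      # "best <- 0": no one hired yet
    hires = 0
    for i in order:
        # interview candidate i (always pay the interview cost)
        if ranks[i] > best_rank:   # candidate i is better than best
            best_rank = ranks[i]
            hires += 1             # hire candidate i (pay the hiring cost)
    return len(ranks) * interview_cost + hires * hiring_cost

# One trial; averaging over many trials approaches n*c_i + (ln n + O(1))*c_h
print(randomized_hire_assistant(list(range(10))))
```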

  2. Appendix C.2: Background on probability theory Discrete probabilities are defined in terms of a sample space S. Usually S is a collection of elementary events that are outcomes of independent experiments, such as flipping coins, throwing dice, drawing cards, etc. |S| = size of S (also called the "cardinality" of S). Example: S = set of outcomes from flipping 2 coins = {HH, HT, TH, TT}. Events are subsets of S. (S itself is called the "certain" event.) The event of getting 1 head and 1 tail = {HT, TH}. The empty subset, ∅, is called the "null event".
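
A small sketch of these definitions for the two-coin sample space; representing the distribution as a dictionary of exact Fractions is just one convenient choice, not from the text.

```python
from fractions import Fraction

# Sample space for flipping 2 fair coins; each elementary event has probability 1/4
S = {"HH", "HT", "TH", "TT"}
Pr = {s: Fraction(1, 4) for s in S}

# Events are subsets of S
one_head_one_tail = {"HT", "TH"}   # the event of getting 1 head and 1 tail
null_event = set()                  # the empty subset, the "null event"

print(len(S))                                   # |S| = 4
print(sum(Pr[s] for s in one_head_one_tail))    # 1/2
```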

  3. Background on set theory (Appendix B.1, text p. 1158) If x ∈ A implies x ∈ B, then A is a subset of B (written A ⊆ B). If, in addition, A ≠ B, then A is a proper subset of B (written A ⊂ B). A ∩ B = intersection of sets A and B, defined by {x: x ∈ A and x ∈ B}. A ∪ B = union of sets A and B, defined by {x: x ∈ A or x ∈ B}. A − B = difference of sets A and B, defined by {x: x ∈ A and x ∉ B}. For laws obeyed by sets see text pp. 1159-1162.
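
Python's built-in set type mirrors these definitions directly; the particular sets A and B below are arbitrary examples.

```python
A = {1, 2, 3}
B = {1, 2, 3, 4}

print(A <= B)   # True: A is a subset of B
print(A < B)    # True: A is a proper subset of B (A != B)
print(A & B)    # intersection: {1, 2, 3}
print(A | B)    # union: {1, 2, 3, 4}
print(B - A)    # difference B - A: {4}
```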

  4. Definition of a probability distribution on sample space S (1) Pr{A} ≥ 0 for any event A (2) Pr{S} = 1 (3) for any 2 mutually exclusive events (i.e. A ∩ B = ∅), Pr{A ∪ B} = Pr{A} + Pr{B}. By definition, elementary events are always mutually exclusive. From the definition of probability distributions, it follows that Pr{∅} = 0. If A ⊆ B then Pr{A} ≤ Pr{B}. Pr{S − A} = 1 − Pr{A}, where S − A is the complement of A. For any two events (not necessarily mutually exclusive), Pr{A ∪ B} = Pr{A} + Pr{B} − Pr{A ∩ B} ≤ Pr{A} + Pr{B}.

  5. Discrete probability distributions Defined over a finite or countably infinite sample space. Let s denote the elementary events of S; then for any event A, Pr{A} = Σs∈A Pr{s}. If S is finite and every s ∈ S has probability 1/|S|, then we have a uniform probability distribution on S. We call the experiment that gives rise to the elementary events "picking an element at random from S". Example: in S = {HH, HT, TH, TT}, Pr{HH} = Pr{HT} = Pr{TH} = Pr{TT} = ¼. The probability of at least one head is Pr{HH, HT, TH} = Pr{HH} + Pr{HT} + Pr{TH} = ¾ = 1 − Pr{TT} = 1 − ¼ = ¾.
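
The "at least one head" calculation can be checked by summing over the uniform distribution; this brute-force enumeration is an illustrative sketch, not from the text.

```python
from fractions import Fraction
from itertools import product

S = ["".join(p) for p in product("HT", repeat=2)]   # ['HH', 'HT', 'TH', 'TT']
uniform = Fraction(1, len(S))                        # each elementary event has probability 1/|S|

at_least_one_head = [s for s in S if "H" in s]
print(sum(uniform for _ in at_least_one_head))       # 3/4
print(1 - uniform)                                   # 1 - Pr{TT} = 3/4, same answer
```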

  6. A "fair" coin flipped n times defines a uniform probability distribution on S Elementary events are strings HHTHT… For each of the n positions in the string we have 2 choices of letter, so |S| = 2^n. For n = 2, S = {HH, HT, TH, TT}. What is the probability of the event A = {exactly k heads and n−k tails}? Pr{A} = C(n,k)/2^n, where C(n,k) = n!/(k!(n−k)!) = the number of ways to choose k items out of a total of n. The C(n,k) are binomial coefficients (see text pp. 1185-1186). The distribution on S is normalized: the probabilities of the elementary events sum to 1.
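
A sketch that checks Pr{exactly k heads in n flips} = C(n,k)/2^n both by formula and by exhaustive enumeration; the particular values n = 5, k = 2 are arbitrary examples.

```python
from fractions import Fraction
from itertools import product
from math import comb

def pr_exactly_k_heads(n, k):
    return Fraction(comb(n, k), 2**n)         # C(n, k) / 2^n

def pr_by_enumeration(n, k):
    outcomes = list(product("HT", repeat=n))  # all 2^n equally likely strings
    hits = sum(1 for s in outcomes if s.count("H") == k)
    return Fraction(hits, len(outcomes))

print(pr_exactly_k_heads(5, 2))               # 5/16
print(pr_by_enumeration(5, 2))                # 5/16 (agrees)
```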

  7. Conditional probability Given some prior partial knowledge of the outcomes, we want the probability of an outcome conditioned on that prior knowledge. Suppose that someone flips 2 coins and tells us that at least one shows heads. What is the probability that both coins show heads? S = {HH, HT, TH, TT}. Our prior knowledge eliminates the elementary event TT. The remaining 3 possibilities are equally likely, so Pr{HH} conditioned on at least 1 head showing = 1/3. The probability of A conditioned on B, Pr{A|B}, is meaningful only if Pr{B} ≠ 0. Given that B occurs, the probability that A also occurs is related to the set of outcomes in which both A and B occur, so Pr{A|B} is proportional to Pr{A ∩ B}. If we normalize Pr{A|B} by dividing by Pr{B} (which ≠ 0), then Pr{B|B} = Pr{B ∩ B}/Pr{B} = Pr{B}/Pr{B} = 1.

  8. Apply Pr{A|B} = Pr{A ∩ B}/Pr{B} to "the probability of 2 heads showing given that at least one head is showing". A ∩ B is the event with 2 heads showing and at least one head showing; in S = {HH, HT, TH, TT}, HH is the only such outcome, so Pr{A ∩ B} = ¼. B is the event with at least one head showing, so Pr{B} = ¾. Therefore Pr{A|B} = (1/4)/(3/4) = 1/3. Two events are independent if Pr{A ∩ B} = Pr{A}Pr{B}. If events A and B are independent and Pr{B} ≠ 0, then Pr{A|B} = Pr{A}.
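
The same computation as a sketch: define A and B as sets of outcomes and apply Pr{A|B} = Pr{A ∩ B}/Pr{B}; the uniform-probability helper pr is an illustrative convenience.

```python
from fractions import Fraction

S = ["HH", "HT", "TH", "TT"]

def pr(event):                           # uniform distribution: |event| / |S|
    return Fraction(len(event), len(S))

A = {s for s in S if s == "HH"}          # both coins show heads
B = {s for s in S if "H" in s}           # at least one head showing

pr_A_given_B = pr(A & B) / pr(B)
print(pr_A_given_B)                      # 1/3
print(pr(A & B) == pr(A) * pr(B))        # False: A and B are not independent
```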

  9. Discrete random variables: mappings of finite or countably infinite sets of events onto the real numbers For a random variable X and a real number x, the event X = x is the collection of all elementary events {s ∈ S: X(s) = x}, so Pr{X = x} = Σ{s∈S: X(s)=x} Pr{s}. f(x) = Pr{X = x} is the "probability density function" of the random variable X. By the axioms of probability, f(x) ≥ 0 and Σx f(x) = 1. In the physical sciences one is more likely to encounter "continuous" random variables defined on uncountably infinite sample spaces. In the continuous case, the probability density function f(x) is defined such that f(x)dx is the probability that the random variable X has a value between x and x + dx.

  10. Example of a discrete random variable: A pair of 6-sided dice is rolled. The random variable X is the max of the 2 numbers showing. What is Pr{X = 3}? Use exhaustive enumeration. The elementary events with 3 as the max of the 2 numbers showing are (1,3), (2,3), (3,3), (3,2), and (3,1). The probability of each elementary event is (1/6)(1/6) = 1/36, therefore Pr{X = 3} = 5/36.
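
The exhaustive enumeration can be written out directly (illustrative sketch):

```python
from fractions import Fraction
from itertools import product

# X = max of the two numbers showing on a pair of fair 6-sided dice
outcomes = list(product(range(1, 7), repeat=2))      # 36 equally likely pairs
favorable = sum(1 for a, b in outcomes if max(a, b) == 3)
print(Fraction(favorable, len(outcomes)))            # 5/36
```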

  11. Expected value of a random variable Discrete: E[X] = Σx x·Pr{X = x}. Continuous: E[X] = ∫ x f(x) dx, where f(x) is the probability density of X. If g(X) defines a new random variable, then E[g(X)] = Σx g(x)·Pr{X = x}. Let g(x) = ax and h(y) = by; then E[aX + bY] = aE[X] + bE[Y] by the properties of sums and integrals. This is called the "linearity of expectation values".
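
A sketch of E[X] and of linearity of expectation, reusing the two-dice sample space from the previous slide; the variables X (max of the dice), Y (sum of the dice), and the constants a, b are arbitrary examples.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))
p = Fraction(1, len(outcomes))                # uniform probability of each elementary event

def expectation(g):
    """E[g] for a function g of the elementary event (a pair of dice)."""
    return sum(p * g(o) for o in outcomes)

E_X = expectation(lambda o: max(o))           # X = max of the two dice
E_Y = expectation(lambda o: o[0] + o[1])      # Y = sum of the two dice
a, b = 2, 3
lhs = expectation(lambda o: a * max(o) + b * (o[0] + o[1]))
print(lhs == a * E_X + b * E_Y)               # True: E[aX + bY] = aE[X] + bE[Y]
```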

  12. For any two random variables X and Y, the joint probability density function is Pr{X=x and Y=y}. For any fixed value of y, Pr{Y=y} = Σx Pr{X=x and Y=y}, called the "marginal" probability. Pr{X=x | Y=y} = Pr{X=x and Y=y}/Pr{Y=y} is called the "conditional" probability. X and Y are independent if for all x and y, Pr{X = x and Y = y} = Pr{X = x}Pr{Y = y}. If X and Y are independent, then E[XY] = E[X]E[Y] (proof on p. 1198 of the text).
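
A sketch with two independent dice, X = first die and Y = second die, checking the marginal formula and E[XY] = E[X]E[Y]; the choice of y = 3 is arbitrary.

```python
from fractions import Fraction
from itertools import product

# Joint distribution of two independent fair dice
joint = {(x, y): Fraction(1, 36) for x, y in product(range(1, 7), repeat=2)}

# Marginal: Pr{Y=y} = sum over x of Pr{X=x and Y=y}
pr_Y_3 = sum(p for (x, y), p in joint.items() if y == 3)
print(pr_Y_3)                                 # 1/6

E_X = sum(p * x for (x, y), p in joint.items())
E_Y = sum(p * y for (x, y), p in joint.items())
E_XY = sum(p * x * y for (x, y), p in joint.items())
print(E_XY == E_X * E_Y)                      # True, since X and Y are independent
```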

  13. Indicator Random Variables For any event A, the indicator random variable I{A} = 1 if A occurs and = 0 if A does not occur. Zero and one are the only values an indicator random variable can have. Lemma 5.1: Let A be an event in sample space S and let XA = I{A}. Then E[XA] = Pr{A}. Proof: For any random variable X, E[X] = Σx x·Pr{X = x}. For indicator random variables, x = 0 or 1 only. The complement of event A in sample space S is S − A, so E[XA] = (1)Pr{A} + (0)Pr{S − A} = Pr{A}.
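
Lemma 5.1 can be checked on any concrete event by computing E[XA] directly; the event "at least one head" below is an arbitrary example.

```python
from fractions import Fraction

S = ["HH", "HT", "TH", "TT"]
pr_s = Fraction(1, 4)                         # uniform probability of each elementary event

A = {"HH", "HT", "TH"}                        # event: at least one head

def indicator(s):                             # X_A = I{A}
    return 1 if s in A else 0

E_XA = sum(pr_s * indicator(s) for s in S)
print(E_XA)                                   # 3/4
print(E_XA == Fraction(len(A), len(S)))       # True: E[X_A] = Pr{A}, as Lemma 5.1 states
```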

  14. Expectation of repeated trials by indicator random variables Example: What is the expected number of heads in n coin flips? Let Xi be the indicator random variable for the event that the ith flip is heads. Let X be the random variable for the total number of heads in n coin flips: X = Σi=1 to n Xi (only flips that come up heads contribute to the sum). By the linearity of expectation values, E[X] = E[Σi=1 to n Xi] = Σi=1 to n E[Xi] = Σi=1 to n (1/2) = n/2. Note: to use I{A} we must know Pr{A}.
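
A sketch that estimates E[X] by simulation and compares it with the analytic value n/2; the trial count and n = 10 are arbitrary choices.

```python
import random

def expected_heads_simulated(n, trials=100_000):
    total = 0
    for _ in range(trials):
        # X = sum of the indicator variables X_i, one per flip
        total += sum(1 for _ in range(n) if random.random() < 0.5)
    return total / trials

n = 10
print(expected_heads_simulated(n))   # close to n/2 = 5.0
```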

  15. Analysis of the cost of the Randomized Hire Assistant Let ci = interview cost, ch = hiring cost, and X = the random variable for the number of times a new assistant is hired, so <T(n)> = n·ci + E[X]·ch. Let Ak = the event that the kth candidate interviewed is hired and Xk = I{Ak}; then X = Σk=1 to n Xk and E[X] = Σk=1 to n E[Xk]. Since the interview schedule is randomized, the kth candidate has probability 1/k of being better than all previous candidates, so E[X] = Σk=1 to n (1/k) = ln(n) + O(1) (harmonic sum, text p. 1147). Therefore <T(n)> = n·ci + ch·ln(n) + O(ch).
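
A sketch comparing the analytic E[X] = Σ(1/k) with a simulation of the randomized hiring loop; the value n = 100 and the number of simulation runs are placeholders.

```python
import random

def hires(n):
    """Simulate one randomized interview schedule and count hires."""
    order = random.sample(range(n), n)        # random permutation of candidate ranks
    best, count = -1, 0
    for rank in order:
        if rank > best:                       # candidate better than all previous ones
            best, count = rank, count + 1
    return count

n = 100
H_n = sum(1 / k for k in range(1, n + 1))     # harmonic sum = expected number of hires
avg = sum(hires(n) for _ in range(20_000)) / 20_000
print(H_n, avg)                               # both near ln(n) + O(1), about 5.19 for n = 100
```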

  16. CptS 450 Spring 2014 [All problems are from Cormen et al., 3rd Edition] Homework Assignment 5: due 3/12/14 1. ex C.2-3 p. 1195 2. ex C.3-1 p. 1200 3. ex C.3-2 p. 1200 4. ex 5.2-3 p. 122
