CSE 321 Discrete Structures, Winter 2008. Lecture 19: Probability Theory
Announcements • Readings • Probability Theory: 6.1, 6.2 (5.1, 5.2) • Bayes' Theorem: 6.3 (new material!) • Expectation: 6.4 (5.3) • Advanced Counting Techniques, Ch. 7: not covered
Discrete Probability
Experiment: a procedure that yields an outcome
Sample space: the set of all possible outcomes
Event: a subset of the sample space
If S is a sample space of equally likely outcomes and E is an event, the probability of E is p(E) = |E|/|S|
Example: Dice. Roll two dice: the sample space is the 36 equally likely ordered pairs of faces. An event is a subset, e.g. E = "the sum is 7", which contains 6 outcomes, so p(E) = 6/36 = 1/6.
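A minimal sketch of this counting in Python (the sum-is-7 event is borrowed from the independence examples later in the lecture):

from itertools import product

# Sample space: the 36 equally likely outcomes of rolling two dice.
sample_space = list(product(range(1, 7), repeat=2))

# Event: the dice sum to 7.
event = [d for d in sample_space if d[0] + d[1] == 7]

# For equally likely outcomes, p(E) = |E| / |S|.
print(len(event) / len(sample_space))   # 6/36 ≈ 0.1667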
Example: Poker. Probability of 4 of a kind?
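The standard count: pick the rank that appears four times (13 ways) and the fifth card (48 ways), out of C(52, 5) hands. A quick check:

from math import comb

hands = comb(52, 5)               # 2,598,960 possible five-card hands
four_of_a_kind = 13 * 48          # choose the rank, then the fifth card
print(four_of_a_kind / hands)     # 624 / 2,598,960 ≈ 0.00024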
Combinations of Events
E^C is the complement of E
P(E^C) = 1 − P(E)
P(E1 ∪ E2) = P(E1) + P(E2) − P(E1 ∩ E2)
Probability Concepts • Probability Distribution • Conditional Probability • Independence • Bernoulli Trials / Binomial Distribution • Random Variable
Discrete Probability Theory • Set S • Probability distribution p : S → [0,1] • For s ∈ S, 0 ≤ p(s) ≤ 1 • Σ_{s∈S} p(s) = 1 • Event E, E ⊆ S • p(E) = Σ_{s∈E} p(s)
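A small sketch of these definitions in Python (the biased-die numbers are made up for illustration):

# A probability distribution on a finite set: p(s) in [0,1], summing to 1.
p = {1: 0.25, 2: 0.25, 3: 0.125, 4: 0.125, 5: 0.125, 6: 0.125}
assert abs(sum(p.values()) - 1.0) < 1e-9

def prob(event):
    # p(E) = Σ_{s in E} p(s)
    return sum(p[s] for s in event)

print(prob({2, 4, 6}))   # probability of an even roll: 0.5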
Conditional Probability
Let E and F be events with p(F) > 0. The conditional probability of E given F, written p(E | F), is defined as p(E | F) = p(E ∩ F) / p(F)
Examples Flip a coin 5 times, W is the event of three or more heads
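By symmetry p(W) = 1/2: the 32 equally likely sequences split evenly between three-or-more heads and three-or-more tails. A brute-force check (the conditioning on "first flip is heads" is an added illustration, not from the slide):

from itertools import product

# All 32 equally likely sequences of 5 coin flips.
flips = list(product("HT", repeat=5))
W = [f for f in flips if f.count("H") >= 3]    # three or more heads
print(len(W) / len(flips))                     # 16/32 = 0.5

# Condition on F = "first flip is heads": p(W | F) = p(W ∩ F) / p(F).
F = [f for f in flips if f[0] == "H"]
W_and_F = [f for f in W if f[0] == "H"]
print(len(W_and_F) / len(F))                   # 11/16 = 0.6875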
Independence
The events E and F are independent if and only if p(E ∩ F) = p(E)p(F)
Equivalently, when p(F) > 0, E and F are independent if and only if p(E | F) = p(E)
Are these independent? • Flip a coin three times. E: the first coin is a head; F: the second coin is a head • Roll two dice. E: the sum of the two dice is 5; F: the first die is a 1 • Roll two dice. E: the sum of the two dice is 7; F: the first die is a 1 (both dice cases are checked in the sketch below) • Deal two five-card poker hands. E: hand one has four of a kind; F: hand two has four of a kind. Here p(E)p(F) ≈ 0.0000000576 while p(E ∩ F) ≈ 0.0000000740, so E and F are not independent
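The dice cases can be checked exactly by enumeration (exact rationals avoid floating-point equality pitfalls):

from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two dice.
dice = list(product(range(1, 7), repeat=2))

def p(pred):
    # Exact probability of the event described by predicate `pred`.
    return Fraction(sum(1 for d in dice if pred(d)), len(dice))

for target in (5, 7):
    pE = p(lambda d: d[0] + d[1] == target)                   # sum is `target`
    pF = p(lambda d: d[0] == 1)                               # first die is 1
    pEF = p(lambda d: d[0] + d[1] == target and d[0] == 1)
    print(target, pEF == pE * pF)   # 5: False (dependent), 7: True (independent)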
Bernoulli Trials and Binomial Distribution • Bernoulli trial: success with probability p, failure with probability q = 1 − p • The probability of exactly k successes in n independent Bernoulli trials is C(n, k) p^k q^(n−k)
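A direct transcription of the formula:

from math import comb

def binomial_pmf(n, k, p):
    # C(n, k) p^k q^(n-k): exactly k successes in n independent trials.
    return comb(n, k) * p**k * (1 - p)**(n - k)

# e.g. exactly 3 heads in 5 fair flips: C(5,3) / 2^5 = 10/32
print(binomial_pmf(5, 3, 0.5))   # 0.3125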
Random Variables A random variable is a function from a sample space to the real numbers
Bayes’ Theorem
Suppose that E and F are events from a sample space S such that p(E) > 0 and p(F) > 0. Then
p(F | E) = p(E | F) p(F) / ( p(E | F) p(F) + p(E | F^C) p(F^C) )
False Positives, False Negatives Let D be the event that a person has the disease Let Y be the event that a person tests positive for the disease
Testing for disease
The disease is very rare: p(D) = 1/100,000
The test is accurate:
False negative rate: P(Y^C | D) = 1%
False positive rate: P(Y | D^C) = 0.5%
Suppose you get a positive result; what do you conclude?
P(D | Y): the answer is about 0.002. Even after a positive test, the chance of having the disease is only about 0.2%, because the disease is so rare that false positives far outnumber true positives.
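A sketch of the computation with the numbers above, via Bayes' theorem:

p_D = 1 / 100_000                 # prior: p(D)
p_Y_given_D = 0.99                # sensitivity; 1% false negative rate
p_Y_given_notD = 0.005            # 0.5% false positive rate

# p(D | Y) = p(Y | D) p(D) / [ p(Y | D) p(D) + p(Y | D^C) p(D^C) ]
p_Y = p_Y_given_D * p_D + p_Y_given_notD * (1 - p_D)
print(p_Y_given_D * p_D / p_Y)    # ≈ 0.00198, i.e. about 0.002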
Spam Filtering From: Zambia Nation Farmers Union [znfukabwe@mail.zamtel.zm] Subject: Letter of assistance for school installation To: Richard Anderson Dear Richard, I hope you are fine, Iam through talking to local headmen about the possible assistance of school installation. the idea is and will be welcome. I trust that you will do your best as i await for more from you. Once again Thanking you very much Sebastian Mazuba.
Bayesian Spam filters • Classification domain • Cost of a false negative • Cost of a false positive • Criteria for spam • v1agra, ONE HUNDRED MILLION USD • Basic question: given an email message, what is the probability that it is spam, based on the spam criteria?
Email message with the phrase “Account Review” • 250 of 20,000 messages known to be spam contain the phrase • 5 of 10,000 messages known not to be spam contain the phrase • Assuming 50% of messages are spam, what is the probability that a message containing “Account Review” is spam?
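So p(phrase | spam) = 250/20,000 = 0.0125 and p(phrase | not spam) = 5/10,000 = 0.0005; with the 50% prior, Bayes' theorem gives roughly 0.96. A quick check:

p_phrase_given_spam = 250 / 20_000    # 0.0125
p_phrase_given_ham = 5 / 10_000       # 0.0005
p_spam = 0.5                          # assumed prior from the slide

numerator = p_phrase_given_spam * p_spam
denominator = numerator + p_phrase_given_ham * (1 - p_spam)
print(numerator / denominator)        # ≈ 0.962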
Expectation
The expected value of a random variable X(s) on a sample space S is E(X) = Σ_{s∈S} p(s) X(s)
Flip a coin until the first head. Expected number of flips?
Probability space: the outcomes are H, TH, TTH, TTTH, …, where the outcome with i − 1 tails followed by a head has probability (1/2)^i
Computing the expectation: E = Σ_{i≥1} i (1/2)^i = 2
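Two quick checks, one by simulation and one by summing the (truncated) series:

import random

def flips_until_head():
    # Flip a fair coin until the first head; return the number of flips.
    count = 1
    while random.random() < 0.5:   # tails with probability 1/2
        count += 1
    return count

print(sum(flips_until_head() for _ in range(100_000)) / 100_000)  # ≈ 2.0
print(sum(i * 0.5**i for i in range(1, 200)))                     # series value: 2.0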
Linearity of Expectation • E(X1 + X2) = E(X1) + E(X2) • E(aX) = aE(X)
Hashing
H : M → [0..n−1]
If k elements have been hashed to random locations, what is the expected number of elements in bucket j?
What is the expected number of collisions when hashing k elements to random locations?
Hashing analysis
Sample space: [0..n−1] × [0..n−1] × · · · × [0..n−1], one coordinate per hashed element
Random variables:
Xj = number of elements hashed to bucket j
C = total number of collisions
Bij = 1 if element i is hashed to bucket j, 0 otherwise
Cab = 1 if elements a and b are hashed to the same bucket, 0 otherwise
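Writing Xj = Σ_i Bij and C = Σ_{a<b} Cab, linearity of expectation gives E(Xj) = k/n and E(C) = C(k,2)/n, since each indicator has expectation 1/n. A simulation sketch with hypothetical parameters n = 10, k = 20:

import random
from math import comb

n, k, trials = 10, 20, 20_000
bucket0_total = 0
collision_total = 0
for _ in range(trials):
    buckets = [random.randrange(n) for _ in range(k)]    # random hash values
    bucket0_total += buckets.count(0)                    # elements in bucket 0
    collision_total += sum(comb(buckets.count(j), 2) for j in range(n))

print(bucket0_total / trials, k / n)                     # both ≈ 2.0
print(collision_total / trials, comb(k, 2) / n)          # both ≈ 19.0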
Counting inversions
Let p1, p2, …, pn be a permutation of 1…n
The pair (pi, pj) is an inversion if i < j and pi > pj
Examples: 4, 2, 5, 1, 3 (6 inversions); 1, 6, 4, 3, 2, 5 (7 inversions); 7, 6, 5, 4, 3, 2, 1 (21 inversions, every pair)
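Each of the C(n,2) pairs is inverted in exactly one of a permutation and its reversal, so a random permutation has C(n,2)/2 expected inversions (a standard observation, implicit in these slides). A direct counter checking the examples:

from itertools import combinations

def inversions(p):
    # Count pairs i < j with p[i] > p[j].
    return sum(1 for i, j in combinations(range(len(p)), 2) if p[i] > p[j])

print(inversions([4, 2, 5, 1, 3]))        # 6
print(inversions([1, 6, 4, 3, 2, 5]))     # 7
print(inversions([7, 6, 5, 4, 3, 2, 1]))  # 21 = C(7,2)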
Insertion sort (on the example array 4 2 5 1 3):
for i := 1 to n−1 {
    j := i;
    while (j > 0 and A[j−1] > A[j]) {
        swap(A[j−1], A[j]);   // each adjacent swap removes exactly one inversion
        j := j − 1;
    }
}
Left-to-right maxima
max_so_far := A[0];
for i := 1 to n−1
    if (A[i] > max_so_far)
        max_so_far := A[i];
Example: 5, 2, 9, 14, 11, 18, 7, 16, 1, 20, 3, 19, 10, 15, 4, 6, 17, 12, 8
What is the expected number of left-to-right maxima in a random permutation?
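The standard answer via linearity of expectation (not written on the slide): A[i] is a left-to-right maximum exactly when it is the largest of the first i + 1 elements, which happens with probability 1/(i + 1) in a random permutation, so the expectation is the harmonic number H_n = 1 + 1/2 + ⋯ + 1/n ≈ ln n. A simulation sketch:

import random
from math import log

def lr_maxima(a):
    # Count positions where a new maximum appears, scanning left to right.
    count, best = 0, float("-inf")
    for x in a:
        if x > best:
            count, best = count + 1, x
    return count

n, trials = 100, 20_000
avg = sum(lr_maxima(random.sample(range(n), n)) for _ in range(trials)) / trials
H_n = sum(1 / i for i in range(1, n + 1))
print(avg, H_n, log(n))    # avg ≈ H_100 ≈ 5.19; ln(100) ≈ 4.61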