Random Thoughts 2012 (COMP 066)

Random Thoughts 2012 (COMP 066) • Jan-Michael Frahm • Jared Heinly

Blood Testing • 1% of people have the disease • Need to test 100 samples of blood • What is the probability that all samples are healthy? • What if we pool the blood into 2 sets of 50 each and test the pools first? • What is the expected number of tests that need to be run? • Can we do better?
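The questions on this slide can be worked through numerically. The sketch below is not from the slides; it assumes that samples are independent, that a pooled test is positive exactly when at least one sample in the pool is sick, and that every sample in a positive pool is then retested individually. The function names and the requirement that the group size divides the number of samples are illustrative assumptions.

```python
def prob_all_healthy(n_samples, prevalence):
    """Probability that none of n_samples independent samples is sick."""
    return (1.0 - prevalence) ** n_samples


def expected_tests_pooled(n_samples, group_size, prevalence):
    """Expected number of tests with one round of pooling.

    Assumption: group_size divides n_samples evenly. Each pool costs one
    test; a positive pool costs group_size additional individual tests.
    """
    n_groups = n_samples // group_size
    p_pool_positive = 1.0 - (1.0 - prevalence) ** group_size
    return n_groups + n_groups * group_size * p_pool_positive


if __name__ == "__main__":
    print(prob_all_healthy(100, 0.01))           # about 0.366
    print(expected_tests_pooled(100, 50, 0.01))  # about 41.5 tests
    for size in (25, 10, 5):                     # "Can we do better?"
        print(size, expected_tests_pooled(100, size, 0.01))
```

Under these assumptions, two pools of 50 already cut the expected work from 100 tests to roughly 41.5, and smaller pools reduce it further.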

Better Blood Testing • Split into 4 groups of 25

Better Blood Testing • Split into sets of 50. If a set of 50 contains a sick person’s blood, split it into two sets of 25 and test each one. Test the individual samples of a set of 25 only if that set contains a sick person’s blood.
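The two-stage scheme can be compared with simple pooling by computing its expected number of tests. This is a minimal sketch under assumed conditions: 100 independent samples, 1% prevalence, a pooled test that is positive exactly when a sick sample is present, and both sets of 25 always tested when their set of 50 is positive. The function name is hypothetical.

```python
def expected_tests_two_stage(prevalence=0.01):
    """Expected tests for 100 samples: pools of 50, then 25, then individuals."""
    q = 1.0 - prevalence
    p50_positive = 1.0 - q ** 50   # a pool of 50 contains a sick sample
    p25_positive = 1.0 - q ** 25   # a pool of 25 contains a sick sample

    # Cost for one set of 50:
    #   1 test for the pool of 50,
    #   2 tests (one per set of 25) if the pool of 50 is positive,
    #   25 individual tests for each set of 25 that is positive.
    # A positive set of 25 implies its set of 50 is positive, so the
    # individual-test term only needs p25_positive.
    per_set_of_50 = 1 + 2 * p50_positive + 2 * 25 * p25_positive
    return 2 * per_set_of_50


if __name__ == "__main__":
    print(expected_tests_two_stage())  # about 25.8 tests
```

With these assumptions the two-stage scheme needs about 25.8 tests on average, slightly fewer than the roughly 26.2 expected for four independent pools of 25.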

Example of Connectivity • There are 5 billion adults in the world. • Assume the average person has 1600 acquaintances and a typical politician like President Obama has 5000 acquaintances • What are the chances that an average person is linked by chance to the president through two connections? • What are the chances that the president is linked to an average person through two connections?
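One way to estimate these chances is sketched below. It rests on strong assumptions that the slide does not state: "two connections" is read as one intermediary (person, mutual acquaintance, president), and each person's acquaintances are treated as independent uniform random draws from the whole adult population. The function name is illustrative.

```python
def prob_two_step_link(n_acq_start, n_acq_target, population):
    """Chance that at least one acquaintance of the starting person is
    also an acquaintance of the target person (uniform random model)."""
    # Probability that one given acquaintance knows the target.
    p_knows_target = n_acq_target / population
    # Probability that none of the starting person's acquaintances does.
    p_no_link = (1.0 - p_knows_target) ** n_acq_start
    return 1.0 - p_no_link


if __name__ == "__main__":
    adults = 5_000_000_000
    # Average person reaching the president through one intermediary.
    print(prob_two_step_link(1600, 5000, adults))  # about 0.0016
    # President reaching an average person through one intermediary.
    print(prob_two_step_link(5000, 1600, adults))  # about 0.0016
```

Both directions give roughly 0.16% in this model, since each is close to 1600 × 5000 / 5 billion.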

Homework Assignment

How Certain Are the Measurements? • How sure are we about the value we measured? • What does our certainty depend on? • number of trials • variance in the trials • Margin of error (MOE) measures the potential error in an experiment. It depends on • sample size • variability, measured as standard error • how sure you want to be • MOE determines the confidence interval • mean ± a multiple of the standard error

How many do you have to sample? • That depends! • How confident do you want to be? (confidence level) • Confidence level • The percentage of experiments that give a correct result if the experiment is repeated many times • Commonly used confidence levels are 95% or 99% • The confidence level determines how many standard errors you add

How many do you have to sample? • Number z* of standard deviations for typical confidence levels
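The z* values for a given confidence level can be computed rather than looked up. The sketch below assumes a two-sided interval under a normal distribution and uses `statistics.NormalDist` from the Python standard library; the helper name `z_star` is an illustrative choice.

```python
from statistics import NormalDist


def z_star(confidence_level):
    """Number of standard errors for a two-sided normal confidence interval."""
    # Half of the excluded probability lies in each tail, so the upper
    # cut-off sits at the (0.5 + level / 2) quantile of the standard normal.
    return NormalDist().inv_cdf(0.5 + confidence_level / 2.0)


if __name__ == "__main__":
    for level in (0.80, 0.90, 0.95, 0.98, 0.99):
        print(f"{level:.0%}: z* = {z_star(level):.2f}")
```

For the two levels named on the previous slide this gives z* of about 1.96 for 95% and about 2.58 for 99%.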

How many do you have to sample? • Standard deviation • N is the number of measurements (samples) • m is the measurement per individual • Standard error for a sample of size n • Margin of error for a given confidence level
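The three quantities named on this slide can be computed as follows. The slide's own formulas are not reproduced in this text, so this sketch uses the standard textbook definitions as an assumption: the population standard deviation (dividing by N), standard error σ/√n, and margin of error z* · σ/√n. The function name and the example data are made up for illustration.

```python
import math
from statistics import NormalDist, mean, pstdev


def margin_of_error(measurements, confidence_level=0.95):
    """Return (mean, standard deviation, standard error, margin of error)."""
    n = len(measurements)
    avg = mean(measurements)
    # Assumption: population form of the standard deviation (divide by N).
    sigma = pstdev(measurements)
    std_error = sigma / math.sqrt(n)
    z = NormalDist().inv_cdf(0.5 + confidence_level / 2.0)
    return avg, sigma, std_error, z * std_error


if __name__ == "__main__":
    data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical measurements
    avg, sigma, se, moe = margin_of_error(data)
    print(f"mean={avg}, sigma={sigma}, SE={se:.3f}, MOE={moe:.3f}")
    print(f"95% confidence interval: {avg - moe:.2f} to {avg + moe:.2f}")
```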

How many do you have to sample? • Then the number of samples can be determined by • House prices vary more across a city than within a neighborhood, hence you need more samples to determine the average price in the city than in the neighborhood • Larger sample for city than for neighborhood • If the standard deviation σ is unknown • Run a pilot study and use its σ • MOE does not measure any bias (systematic disturbance)
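The required sample size follows from solving MOE = z* · σ/√n for n, which gives n = (z* · σ / MOE)². The slide's own formula is not reproduced in this text, so this standard rearrangement is an assumption, as are the function name and the house-price numbers, which are invented purely to illustrate the city versus neighborhood point.

```python
import math
from statistics import NormalDist


def required_sample_size(sigma, target_moe, confidence_level=0.95):
    """Smallest n with z* * sigma / sqrt(n) <= target_moe."""
    z = NormalDist().inv_cdf(0.5 + confidence_level / 2.0)
    return math.ceil((z * sigma / target_moe) ** 2)


if __name__ == "__main__":
    # Hypothetical standard deviations of house prices (in dollars).
    neighborhood_sigma = 20_000
    city_sigma = 80_000
    target = 5_000  # desired margin of error in dollars
    print("neighborhood:", required_sample_size(neighborhood_sigma, target))
    print("city:", required_sample_size(city_sigma, target))
```

Because n grows with σ², a population with four times the standard deviation needs about sixteen times as many samples for the same margin of error.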

What is a random sample? • Random subset of your test population • Randomly chosen voters, buyers, … • The sample represents the population • e.g., represents the student population of UNC • Which of these subsets are random? • All students in this class ✗ • Drawn by lottery from Connect Carolina ✓ • Students on the lawn ✗ • Random students asked at noon in Lenoir Hall ✗
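Drawing by lottery means every member of the population has the same chance of being picked. A minimal sketch of such a draw is shown below; the roster is a made-up list of student IDs standing in for a real enrollment list, and the function name and seed parameter are illustrative.

```python
import random


def lottery_sample(roster, k, seed=None):
    """Draw k distinct entries from the roster, each equally likely."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(roster, k)


if __name__ == "__main__":
    # Hypothetical roster: one ID per enrolled student.
    roster = [f"student_{i:05d}" for i in range(29_000)]
    print(lottery_sample(roster, 5, seed=2012))
```

The other subsets on the slide fail this property because membership depends on where or when students happen to be, not on a uniform draw from the full roster.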
