This document explores innovative methods for blood testing, aimed at enhancing disease detection efficiency. It discusses the statistical probability of testing healthy samples from a population where only 1% have a disease. Through pooling blood samples into sets of varying sizes, the analysis reveals how to identify sick individuals while minimizing the number of tests needed. Additionally, concepts of confidence levels, margin of error, and random sampling in experimental design are examined to ensure accurate measurement and reliable results.
Random Thoughts 2012 (COMP 066) • Jan-Michael Frahm • Jared Heinly
Blood Testing • 1% of people have the disease • Need to test 100 samples of blood • What is the probability that all samples are healthy? • What if we pool the blood into 2 sets of 50 each and then test? • What is the expected number of samples that need to be tested? • Can we do better?
Better Blood Testing • Split into 4 groups of 25
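The single-stage pooling questions above can be checked numerically. The sketch below is not from the slides; it assumes that samples are independent, that each is sick with probability 1%, that a pooled test is positive exactly when at least one sample in the pool is sick, and that every sample of a positive pool is then tested individually. The function names are illustrative.

```python
def prob_all_healthy(n, p_sick=0.01):
    """Probability that n independent samples are all healthy."""
    return (1 - p_sick) ** n


def expected_tests_one_stage(total, pool_size, p_sick=0.01):
    """Expected number of tests when `total` samples are split into equal pools.

    Each pool costs one pooled test; a positive pool additionally costs
    one individual test per sample in it. Assumes pool_size divides total.
    """
    n_pools = total // pool_size
    p_pool_positive = 1 - prob_all_healthy(pool_size, p_sick)
    return n_pools * (1 + pool_size * p_pool_positive)


print(round(prob_all_healthy(100), 3))              # 0.366
print(round(expected_tests_one_stage(100, 50), 1))  # 41.5
print(round(expected_tests_one_stage(100, 25), 1))  # 26.2
```

Under these assumptions, both pooling schemes need far fewer than 100 tests on average, and four pools of 25 beat two pools of 50.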
Better Blood Testing • Split into sets of 50. If a set of 50 contains a sick person's blood, split it into two sets of 25 and test each set. Fully test a set of 25 only if it contains a sick person's blood.
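The two-stage scheme can be evaluated the same way. This is a minimal sketch under the same assumptions (independent samples, 1% prevalence, a pooled test that is positive exactly when the pool contains a sick sample). It also assumes that both sub-pools of 25 are always tested once their pool of 50 is positive, although the second sub-pool test could sometimes be skipped.

```python
def expected_tests_two_stage(p_sick=0.01):
    """Expected tests for 100 samples: pools of 50, then sub-pools of 25.

    Per pool of 50:
      1 pooled test,
      2 sub-pool tests if the pool of 50 is positive,
      25 individual tests for each positive sub-pool of 25.
    """
    p50_positive = 1 - (1 - p_sick) ** 50  # pool of 50 contains a sick sample
    p25_positive = 1 - (1 - p_sick) ** 25  # sub-pool of 25 contains a sick sample
    per_pool = 1 + 2 * p50_positive + 2 * 25 * p25_positive
    return 2 * per_pool


print(round(expected_tests_two_stage(), 1))  # 25.8
```

With these numbers the gain over four flat pools of 25 is small: about 25.8 expected tests compared with about 26.2.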
Example of Connectivity • There are 5 billion adults in the world. • Assume the average person has 1600 acquaintances and a typical politician like President Obama has 5000 acquaintances. • What are the chances that an average person is linked, purely by chance, to the president by two connections? • What are the chances that the president is connected to an average person by two connections?
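One way to put numbers on the connectivity questions is sketched below. It rests on assumptions the slide does not state: acquaintances are spread uniformly at random over the 5 billion adults, and "two connections" means a chain of two links through one shared acquaintance. The function name is illustrative.

```python
def prob_shared_acquaintance(own, other, population):
    """Chance that at least one of `own` acquaintances also knows the other party.

    Each acquaintance is treated as an independent random draw that lands
    among the other party's `other` acquaintances with probability
    other / population.
    """
    return 1 - (1 - other / population) ** own


ADULTS = 5_000_000_000

# Average person (1600 acquaintances) reaching the president (5000 acquaintances)
print(f"{prob_shared_acquaintance(1600, 5000, ADULTS):.4f}")  # 0.0016
# President reaching an average person
print(f"{prob_shared_acquaintance(5000, 1600, ADULTS):.4f}")  # 0.0016
```

Under this model the two directions give essentially the same answer, roughly 1600 × 5000 / 5 billion.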
How Certain Are the Measurements? • How sure are we about the value we measured? • What does our certainty depend on? • number of trials • variance in the trials • Margin of error (MOE) measures the potential error in an experiment. It depends on: • sample size • variability, expressed as the standard error • how sure you want to be • MOE determines the confidence interval • mean ± MOE, where MOE is a multiple of the standard error
How many do you have to sample? • That depends! • How confident do you want to be (confidence level)? • Confidence level: the percentage of experiments whose confidence interval contains the true value if you repeat the experiment many times • Commonly used confidence levels are 95% and 99% • The confidence level determines how many standard errors you add
How many do you have to sample? • Number z* of standard deviations for typical confidence levels
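The multiplier z* for a given confidence level can be computed rather than looked up. The sketch below assumes a two-sided interval under a normal distribution and uses Python's standard-library `statistics.NormalDist`. The helper name `z_star` is illustrative.

```python
from statistics import NormalDist


def z_star(confidence_level):
    """Number of standard errors covering the central `confidence_level` mass."""
    # A two-sided interval leaves (1 - level) / 2 in each tail.
    return NormalDist().inv_cdf(0.5 + confidence_level / 2)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z* = {z_star(level):.3f}")
# 90%: z* = 1.645
# 95%: z* = 1.960
# 99%: z* = 2.576
```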
How many do you have to sample? • Standard deviation σ of the measurements • N is the number of measurements (samples) • m is the measurement per individual • Standard error for a sample of size n: σ/√n • Margin of error for a confidence level: MOE = z* · σ/√n
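The quantities on this slide chain together as follows: standard deviation, then standard error, then margin of error. The sketch below is an illustration, not the course's own code. It assumes the population form of the standard deviation (dividing by N) and a normal-based z*, and the measurement values are invented.

```python
from math import sqrt
from statistics import NormalDist, mean, pstdev


def margin_of_error(measurements, confidence_level=0.95):
    """Margin of error z* * sigma / sqrt(n) for a list of measurements."""
    sigma = pstdev(measurements)                     # standard deviation (divides by N)
    standard_error = sigma / sqrt(len(measurements))
    z = NormalDist().inv_cdf(0.5 + confidence_level / 2)
    return z * standard_error


# Hypothetical measurements, for illustration only.
data = [4.8, 5.1, 5.0, 4.9, 5.2, 5.0, 4.7, 5.3]
m = mean(data)
moe = margin_of_error(data)
print(f"confidence interval: {m:.2f} ± {moe:.2f}")
```

Raising the confidence level from 95% to 99% widens the interval, because z* grows while the standard error stays the same.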
How many do you have to sample? • The number of samples can then be determined by n = (z* · σ / MOE)² • House prices vary more across a city than within a neighborhood, so you need more samples to determine the average price in the city than in the neighborhood • Larger sample for city than for neighborhood • If the standard deviation σ is unknown • Run a pilot study and use its σ • MOE does not measure any bias (systematic disturbance)
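Solving the margin-of-error formula for n gives the required sample size, n = (z* · σ / MOE)², rounded up. The sketch below applies it to the house-price example. The standard deviations and the target margin of error are hypothetical values chosen only to show the effect of a larger σ.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(sigma, moe, confidence_level=0.95):
    """Smallest n with z* * sigma / sqrt(n) <= moe."""
    z = NormalDist().inv_cdf(0.5 + confidence_level / 2)
    return ceil((z * sigma / moe) ** 2)


# Hypothetical spreads: prices vary less within a neighborhood than across a city.
print(required_sample_size(sigma=20_000, moe=5_000))  # 62   (neighborhood)
print(required_sample_size(sigma=80_000, moe=5_000))  # 984  (city)
```

Quadrupling σ multiplies the required sample size by about sixteen, which is why the city needs a much larger sample than the neighborhood.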
What is a random sample? • Random subset of your test population • Randomly chosen voters, buyers, … • The sample represents the population • e.g. represents the student population of UNC • Which of these subsets are random? • All students in this class ✗ • Drawn by lottery from Connect Carolina ✓ • Students on the lawn ✗ • Random students asked at noon in Lenoir Hall ✗
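A lottery draw like the one marked ✓ can be sketched with the standard library. The roster below is a made-up list of IDs standing in for a full enrollment list, and `draw_lottery_sample` is an illustrative name. The point is that every student on the roster has the same chance of being picked, which is not true of a class, a lawn, or a dining hall at noon.

```python
import random


def draw_lottery_sample(roster, sample_size, seed=None):
    """Draw `sample_size` distinct entries uniformly at random from `roster`."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(roster, sample_size)


# Hypothetical roster of student IDs, for illustration only.
roster = [f"student_{i:04d}" for i in range(1000)]
sample = draw_lottery_sample(roster, 20, seed=2012)
print(len(sample), len(set(sample)))  # 20 20
```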