Dr. Ken Rice, leader in CHARGE consortium, tackles statistical challenges in large genetic studies with humor and insight. Learn about the intricacies and importance of statistical methods in cardiovascular disease research.
Tales (and heads) of statistics in large genetic studies
Ken Rice
Associate Professor
Analysis Committee Chair, CHARGE consortium
http://faculty.washington.edu/kenrice
Q. What do you do?
Like most faculty, my time is split;
• Teaching courses
• Advising students (Training Grant)
• Developing new statistical methods
• … and Cardiovascular disease research
Q. What do you do in GWAS?
Basically, it’s embarrassingly simple…
[Diagram: Y, G] Is p < 5×10⁻⁸?
Q. What does 5×10⁻⁸ mean?
5×10⁻⁸ is 0.00000005; a 1-in-20-million chance, or 5 millionths of 1 percent.
Which of these are more/less likely?
• You are struck by lightning, this year: 1 in a million
• Your 1 ticket wins WA’s Lottery Jackpot: 1 in 7 million
• You (born today) live to 110 years old: 1 in 250 million
Q. What does that mean, in familiar terms?
• Someone is tossing coins; who? Dudley (nice, ineffectual) or Snidely (causes deaths!)
Q. Dudley D-R or Snidely W?
Suppose we see;
• 2 heads in a row; p = 1/4
• 3 heads in a row; p = 1/8
Neither of these would be very suspicious.
Q. Dudley or Snidely, in a GWAS?
How many heads in a row gives p < 5×10⁻⁸? For n heads in a row from a fair coin, p = (1/2)^n.
• In GWAS, seeing ‘only’ 24 heads in a row isn’t enough to make us suspicious (!); it takes 25.
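The coin-tossing arithmetic is easy to check directly. A minimal sketch, assuming a fair coin and the 5×10⁻⁸ threshold from the slides; the function names are made up for illustration:

```python
GWAS_THRESHOLD = 5e-8  # the genome-wide significance threshold from the slides


def p_all_heads(n, prob_heads=0.5):
    """Chance of seeing n heads in n tosses of a coin with the given bias."""
    return prob_heads ** n


def heads_needed(threshold=GWAS_THRESHOLD):
    """Smallest run of heads, from a fair coin, whose p-value is below threshold."""
    n = 1
    while p_all_heads(n) >= threshold:
        n += 1
    return n


if __name__ == "__main__":
    print(p_all_heads(24))  # still above 5e-8
    print(p_all_heads(25))  # below 5e-8
    print(heads_needed())
```

With these definitions, 24 heads in a row stays above the threshold and 25 drops below it.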
Q. Dudley or Snidely? (harder)
Suppose that, unknown to us and the coin-tosser, the coin is a little biased. Heads comes up more often than usual, so we’d get suspicious too soon.
Q. Dudley or Snidely? (harder)
How much does it matter? If the coin actually has a 55% chance of heads;
• 3 heads in a row; p = 0.55³ = 16.6%
• but we’d think; p = 0.5³ = 12.5%
We’d be 1.33 times too suspicious; about the same as an extra 4/10 of a head, from a fair coin.
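The size of this effect can be worked out in a few lines. This is only a sketch of the arithmetic on this slide, assuming the 55% coin; the helper names are hypothetical:

```python
import math


def suspicion_ratio(n, true_prob=0.55, assumed_prob=0.5):
    """How many times more likely n heads really is than we assume."""
    return (true_prob / assumed_prob) ** n


def extra_heads(n, true_prob=0.55, assumed_prob=0.5):
    """The same ratio, expressed as extra fair-coin heads (log base 2)."""
    return math.log2(suspicion_ratio(n, true_prob, assumed_prob))


if __name__ == "__main__":
    print(0.55 ** 3)           # true chance of 3 heads, about 16.6%
    print(0.5 ** 3)            # what we'd assume, 12.5%
    print(suspicion_ratio(3))  # about 1.33
    print(extra_heads(3))      # about 0.4 of a head
```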
Q. How does this affect GWAS?
• We’d think 29.9 heads, not 25 (!)
• We’d think 26.7 heads, not 23 (!!!)
• We thought 3+0.4 heads, not 3
Q. How does this affect GWAS?
Inflation exactly like this happens in GWAS;
• If many tests are only slightly ‘wrong’, there will be many spurious signals
• E.g. some variants are more common in Scots…
• We can fix it, by ‘angling down’ the line so it behaves correctly at p=0.5 (i.e. at 1 head)
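One common way to do this kind of ‘angling down’ is a median-based rescaling, often called genomic control. The slides don’t name the method, so treat the following as a sketch under that assumption: p-values are turned into 1-degree-of-freedom chi-square statistics, every statistic is divided by one inflation factor, and that factor is chosen so the median p-value lands back at 0.5. The p-values below are made up purely for illustration.

```python
import math
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()


def p_to_chi2(p):
    """Two-sided p-value -> equivalent 1-df chi-square statistic."""
    z = _STD_NORMAL.inv_cdf(p / 2.0)
    return z * z


def chi2_to_p(chi2):
    """1-df chi-square statistic -> two-sided p-value."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def angle_down(p_values):
    """Rescale all test statistics so the median p-value becomes 0.5."""
    chi2 = [p_to_chi2(p) for p in p_values]
    expected_median = p_to_chi2(0.5)  # about 0.455 when nothing is inflated
    inflation = median(chi2) / expected_median
    return inflation, [chi2_to_p(c / inflation) for c in chi2]


# Made-up, slightly 'too suspicious' p-values (their median is 0.3, not 0.5).
made_up = [0.9, 0.6, 0.4, 0.3, 0.2, 0.1, 0.01]
inflation, corrected = angle_down(made_up)

if __name__ == "__main__":
    print(inflation)          # greater than 1 means inflated
    print(median(corrected))  # back at 0.5 after the fix
```

Dividing by a single factor is what makes this a straight-line fix: it tilts the whole line, which suits the case where many tests are each slightly wrong in the same way.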
Q. Is that the only problem? (no)
Back to our cartoon, and a fair coin;
Computers work out p… actually, they (even the cool stylish ones) just work out the approximate value of p.
Q. What’s the right answer?
For 5/5 heads; p = (1/2)⁵ = 0.031
Q. What’s the approximate answer?
Exact; p = 0.031
Approximate; Area = 0.033 (i.e. 4.9 heads)
Q. What happens in GWAS?
For 25/25 heads; p = 3×10⁻⁸
Area = ???
Q. What happens in GWAS?
At 25/25 heads; p = 3×10⁻⁸
Area = 1.3×10⁻¹², i.e. 39.5 heads (!)
• No problem, at 5 H’s
• Claiming 25 H’s worth of suspicion when we should claim 18 (!!!)
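The ‘number of heads’ scale used on these slides is just the p-value on a log scale: a p-value of p is worth −log₂(p) heads in a row from a fair coin. A small sketch of that conversion follows (the function name is made up). It only converts the numbers quoted on the slides and does not reproduce the approximation that generated the areas.

```python
import math


def heads_equivalent(p):
    """Number of fair-coin heads in a row that gives this p-value."""
    return -math.log2(p)


if __name__ == "__main__":
    print(heads_equivalent(0.5))      # 1 head
    print(heads_equivalent(0.033))    # the approximate area at 5/5 heads
    print(heads_equivalent(1.3e-12))  # the approximate area at 25/25 heads
    print(heads_equivalent(5e-8))     # the GWAS threshold, between 24 and 25
```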
Q. How does this affect GWAS?
Inflation exactly like this happens in GWAS;
• The data are fine, but the approximate calculations are too approximate
• The ‘angling down’ fix doesn’t work, here
• In GWAS we can’t do perfect calculations, but we are now using better approximations
• More accurate results & better science
Q. Are you going to stop now?
In summary;
• “Omics” data are a huge statistical challenge… even to do familiar stuff
• We want people who;
  • Are smart
  • Are inquisitive about statistics
  • Care about doing good science