Tales (and heads) of statistics in large genetic studies. Ken Rice, Associate Professor; Analysis Committee Chair, CHARGE consortium. http://faculty.washington.edu/kenrice
Q. What do you do? Like most faculty, my time is split; • Teaching courses • Advising students (Training Grant) • Developing new statistical methods • … and Cardiovascular disease research
Q. What do you do in GWAS? Basically, it’s embarrassingly simple… Test the association between an outcome Y and a genotype G: is p < 5x10^-8?
Q. What does 5x10^-8 mean? 5x10^-8 is 0.00000005: a 1-in-20-million chance, or 5 millionths of 1 percent. Which of these are more or less likely than that? • You are struck by lightning this year • Your 1 ticket wins WA’s Lottery Jackpot • You (born today) live to 110 years old. Odds given: 1 in a million; 1 in 7 million; 1 in 250 million
Q. What does it mean, in familiar terms? • Someone is tossing coins; who? • Dudley Do-Right: nice, ineffectual • Snidely Whiplash: causes deaths!
Q. Dudley D-R or Snidely W? Suppose we see; • 2 heads in a row; p = 1/4 • 3 heads in a row; p = 1/8. Neither of these would be very suspicious
Q. Dudley or Snidely, in a GWAS? How many heads in a row gives p < 5x10^-8? For n heads in a row from a fair coin, p = (1/2)^n • In GWAS, seeing ‘only’ 24 heads in a row isn’t enough to make us suspicious (!)
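The coin-toss arithmetic can be checked directly. The following is a minimal sketch, assuming a fair coin and the strict threshold p < 5x10^-8; the function names are illustrative and not from the slides.

```python
GWAS_THRESHOLD = 5e-8  # the genome-wide significance level used in the slides


def p_all_heads(n, p_heads=0.5):
    """Probability of n heads in a row, given the chance of heads per toss."""
    return p_heads ** n


def min_heads_for(threshold, p_heads=0.5):
    """Smallest run of heads whose probability falls strictly below threshold."""
    n = 0
    while p_all_heads(n, p_heads) >= threshold:
        n += 1
    return n


print(p_all_heads(24))                # roughly 6.0e-08, still above the threshold
print(p_all_heads(25))                # roughly 3.0e-08, below the threshold
print(min_heads_for(GWAS_THRESHOLD))  # 25
```

So 24 heads in a row just misses the threshold, and 25 are needed.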
Q. Dudley or Snidely? (harder) Suppose that, unknown to us and the coin-tosser, the coin is a little biased. Heads comes up more often than usual, so we’d become suspicious too soon
Q. Dudley or Snidely? (harder) How much does it matter? If the coin actually has a 55% chance of heads; • 3 heads in a row; 0.55^3 = 16.6% • but we’d think; 0.5^3 = 12.5%. We’d be 1.33 times too suspicious, about the same as an extra 4/10 of a head from a fair coin
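The "1.33 times" and "4/10 of a head" figures follow from the two probabilities above. A short sketch of the arithmetic, where converting a probability ratio into "heads" by taking log base 2 is my reading of the slide's units:

```python
import math

true_p = 0.55 ** 3     # chance of 3 heads from the slightly biased coin
assumed_p = 0.5 ** 3   # chance we would assume, believing the coin is fair

ratio = true_p / assumed_p      # how many times too suspicious we are
extra_heads = math.log2(ratio)  # the same excess, in fair-coin heads

print(round(true_p, 4))       # 0.1664
print(round(ratio, 2))        # 1.33
print(round(extra_heads, 2))  # 0.41
```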
Q. How does this affect GWAS? • We’d think 29.9 heads, not 25 (!) • We’d think 26.7 heads, not 23 (!!!) • We thought 3 + 0.4 heads, not 3
Q. How does this affect GWAS? Inflation exactly like this happens in GWAS; • If many tests are only slightly ‘wrong’, there will be many spurious signals • E.g. some variants are more common in Scots… • We can fix it, by ‘angling down’ the line so it behaves correctly at p=0.5 (i.e. at 1 head)
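The slides do not name the ‘angling down’ fix. The sketch below assumes it is the adjustment usually called genomic control: every 1-degree-of-freedom chi-squared test statistic is divided by an inflation factor, chosen so that the median statistic lands where it should, at p = 0.5. The function names and the example statistics are hypothetical.

```python
import math
import statistics

# Median of a chi-squared distribution with 1 degree of freedom,
# i.e. the statistic whose p-value is exactly 0.5.
CHI2_1DF_MEDIAN = 0.4549364


def chi2_1df_pvalue(stat):
    """Upper-tail p-value for a 1-df chi-squared statistic."""
    return math.erfc(math.sqrt(stat / 2.0))


def genomic_control(stats):
    """Return (inflation factor, corrected p-values) for a list of statistics."""
    lam = statistics.median(stats) / CHI2_1DF_MEDIAN
    lam = max(lam, 1.0)  # common convention: never inflate the statistics
    return lam, [chi2_1df_pvalue(s / lam) for s in stats]


# Hypothetical statistics whose median is twice what it should be.
lam, pvals = genomic_control([0.2, 0.9098728, 30.0])
print(round(lam, 3))       # 2.0
print(round(pvals[1], 3))  # 0.5, the median test now sits at p = 0.5
```

Dividing by a single factor rescales every test in the same way, which is why this kind of fix suits the case where many tests are each slightly ‘wrong’.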
Q. Is that the only problem? (no) Back to our cartoon, and a fair coin. Computers work out p… actually, they* just work out the approximate value of p. (*…even the cool stylish ones)
Q. What’s the right answer? For 5/5 heads; p = (1/2)^5 = 0.031
Q. What’s the approximate answer? Exact p = 0.031; approximate p (area under the curve) = 0.033 (i.e. 4.9 heads)
Q. What happens in GWAS? For 25/25 heads; p = 3x10^-8; Area = ???
Q. What happens in GWAS? At 25/25 heads; p = 3x10^-8; Area = 1.3x10^-12, i.e. 39.5 heads (!) • No problem at 5 H’s • Claiming 25 H’s worth of suspicion when we should claim 18 (!!!)
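The "heads" quoted alongside each p-value are the number of fair-coin heads in a row that would give the same probability, i.e. log base 2 of 1/p. A small sketch using the values quoted on the slides (the helper name is illustrative):

```python
import math


def heads_equivalent(p):
    """Number of fair-coin heads in a row with the same probability as p."""
    return math.log2(1.0 / p)


examples = [
    ("exact p, 5/5 heads", 0.5 ** 5),
    ("approximate area, 5/5 heads", 0.033),
    ("exact p, 25/25 heads", 0.5 ** 25),
    ("approximate area, 25/25 heads", 1.3e-12),
]
for label, p in examples:
    print(f"{label}: {heads_equivalent(p):.1f} heads")
# prints 5.0, 4.9, 25.0 and 39.5 heads respectively
```

The approximation is off by a tenth of a head at 5/5, but by more than 14 heads at 25/25: the error grows in exactly the far tail that GWAS thresholds live in.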
Q. How does this affect GWAS? Inflation exactly like this happens in GWAS; • The data are fine, but the approximate calculations are too approximate • The ‘angling down’ fix doesn’t work here • In GWAS we can’t do perfect calculations, but we are now using better approximations • More accurate results & better science
Q. Are you going to stop now? In summary; • “Omics” data are a huge statistical challenge… even to do familiar stuff • We want people who; • Are smart • Are inquisitive about statistics • Care about doing good science