Bonferroni correction+

Bonferroni correction+. Adapted from a presentation by A. V. Rubanovich (Рубанович А.В.). 1000 people guessed the sequence of 10 cards: red or black? 12 people guessed at least 9 of the 10 cards, and two of them guessed all 10. Experiments to find people with paranormal powers: Joseph Rhine (1950).

Presentation Transcript


  1. Bonferroni correction+. Adapted from a presentation by A. V. Rubanovich (Рубанович А.В.).

  2. Experiments to find people with paranormal powers: Joseph Rhine (1950). 1000 people guessed the sequence of 10 cards: red or black? 12 people guessed at least 9 of the 10 cards, and two of them guessed all 10. In further experiments, none of these “psychics” confirmed their paranormal abilities.
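
The card experiment can be checked against pure chance. The sketch below assumes each card is an independent fair red/black guess (an assumption, not stated on the slide) and computes how many of 1000 people would be expected to guess at least 9, or all 10, cards by luck alone. The variable names are illustrative.

```r
n_people <- 1000
n_cards  <- 10
p_guess  <- 0.5   # assumed: each guess is an independent fair coin flip

# P(at least 9 correct) = P(X > 8) for X ~ Binomial(10, 0.5)
p_at_least_9 <- pbinom(8, n_cards, p_guess, lower.tail = FALSE)
# P(all 10 correct)
p_all_10 <- dbinom(10, n_cards, p_guess)

expected_at_least_9 <- n_people * p_at_least_9   # about 10.7 people
expected_all_10     <- n_people * p_all_10       # about 1 person

c(expected_at_least_9 = expected_at_least_9, expected_all_10 = expected_all_10)
```

Under this assumption, about 11 of 1000 people reach 9 or more correct guesses by chance, so the 12 observed "psychics" are roughly what chance predicts.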

  3. Problems of «multiple comparisons»
     • Genome-wide association: gene expression studies with DNA chips, 500 000 SNPs.
     • At a significance level of 0.01 we can expect up to 5000 false associations.
     • Meta-studies: joining and comparing results obtained by different authors.
     Multiple testing is dangerous: there is a large probability of finding a false association!
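
The figure of 5000 false associations is simply the number of tests times the significance level. A minimal sketch, assuming that no SNP is truly associated so that every p-value is uniform on [0, 1]; the seed and variable names are arbitrary choices for illustration:

```r
n_snps <- 500000
alpha  <- 0.01

# Expected number of false associations if no SNP is truly associated
expected_false <- n_snps * alpha   # 5000

# Simulated check: under the null hypothesis p-values are uniform on [0, 1]
set.seed(42)                       # arbitrary seed, only for reproducibility
p_null <- runif(n_snps)
observed_false <- sum(p_null < alpha)

c(expected = expected_false, simulated = observed_false)
```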

  4. Appearance of false associations: how does it happen? Let us generate two identically distributed samples of 100 persons each, with 20-locus genotypes. Without association the odds ratio should be OR = 1. [Table: gene, sample 1 (cases), sample 2 (controls), OR, p; values not preserved in the transcript.] Significant! All 3 loci are “associated” with a disease!
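
The experiment on this slide can be imitated with a short simulation. This is only a sketch under stated assumptions: the carrier frequency of 0.3, the seed, and the use of Fisher's exact test are illustrative choices, not taken from the slides.

```r
set.seed(1)                 # arbitrary seed, only for reproducibility
n_per_group  <- 100
n_loci       <- 20
carrier_freq <- 0.3         # assumed; identical in "cases" and "controls"

group <- factor(rep(c("case", "control"), each = n_per_group))

# One p-value per locus; both groups are drawn from the same distribution,
# so every "significant" locus is a false association
p_values <- sapply(seq_len(n_loci), function(locus) {
  carrier <- factor(rbinom(2 * n_per_group, 1, carrier_freq), levels = c(0, 1))
  fisher.test(table(group, carrier))$p.value
})

sum(p_values < 0.05)        # number of loci "associated" purely by chance
```

With 20 tests at the 0.05 level, about one false association is expected on average, even though the true OR is 1 at every locus.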

  5. How to avoid false associations? When applying m independent statistical tests with significance level α, the probability of at least one false association should satisfy $1-(1-\alpha)^m < 0.05$. Carlo Bonferroni (1935): when applying m independent statistical tests, only results with $p < 0.05/m$ are significant.
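
The formula above can be evaluated directly to see how quickly the chance of at least one false association grows with m, and what the Bonferroni threshold does to it. A minimal sketch; the values of m are arbitrary examples:

```r
alpha <- 0.05
m     <- c(1, 2, 10, 100)   # example numbers of independent tests

# Probability of at least one false association among m independent tests
fwer_uncorrected <- 1 - (1 - alpha)^m

# Bonferroni: each test is carried out at level alpha / m instead
alpha_bonferroni <- alpha / m
fwer_bonferroni  <- 1 - (1 - alpha_bonferroni)^m

round(data.frame(m, fwer_uncorrected, alpha_bonferroni, fwer_bonferroni), 4)
```

Without correction, 10 tests already give about a 40% chance of at least one false association; with the per-test level α/m the overall probability stays below 0.05 for every m.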

  6. Bonferroni correction kills the significance of certain results. Example: 1 carrier against 8, with samples of equal size. Two mutations are associated with the disease, but after Bonferroni adjustment significance requires p < 0.05/2 = 0.025.

  7. Example to compute OR

     # Mutation 1: carried by 8 of 100 cases and 1 of 100 controls
     case_mut1 = matrix(1, 8, 1)
     case_non_mut1 = matrix(0, 92, 1)
     control_mut1 = matrix(1, 1, 1)
     control_non_mut1 = matrix(0, 99, 1)
     data = rbind(case_mut1, case_non_mut1, control_mut1, control_non_mut1)
     # Outcome: 1 = case (first 100 rows), 0 = control (last 100 rows)
     res = rbind(matrix(1, 100, 1), matrix(0, 100, 1))
     mylogit <- glm(as.formula(res ~ data), family = binomial(link = "logit"), na.action = na.pass)
     exp(mylogit$coefficients[2])                       # odds ratio
     summary(mylogit)[["coefficients"]][, "Pr(>|z|)"]   # p-values

     # Mutation 2: carried by 15 of 100 cases and 5 of 100 controls
     case_mut1 = matrix(1, 15, 1)
     case_non_mut1 = matrix(0, 85, 1)
     control_mut1 = matrix(1, 5, 1)
     control_non_mut1 = matrix(0, 95, 1)
     data = rbind(case_mut1, case_non_mut1, control_mut1, control_non_mut1)
     res = rbind(matrix(1, 100, 1), matrix(0, 100, 1))
     mylogit <- glm(as.formula(res ~ data), family = binomial(link = "logit"), na.action = na.pass)
     exp(mylogit$coefficients[2])                       # odds ratio
     summary(mylogit)[["coefficients"]][, "Pr(>|z|)"]   # p-values
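
As a cross-check on the logistic regression, the odds ratio of a 2×2 table can be computed by hand from the same counts. The helper `odds_ratio` below is a hypothetical name introduced only for this sketch; for a single binary predictor the result should agree with the exponentiated regression coefficient up to numerical tolerance.

```r
# Odds ratio from carrier counts: (odds of carrying among cases) /
# (odds of carrying among controls)
odds_ratio <- function(case_mut, case_total, control_mut, control_total) {
  case_odds    <- case_mut / (case_total - case_mut)
  control_odds <- control_mut / (control_total - control_mut)
  case_odds / control_odds
}

or1 <- odds_ratio(8, 100, 1, 100)    # (8/92) / (1/99)
or2 <- odds_ratio(15, 100, 5, 100)   # (15/85) / (5/95)

round(c(or1 = or1, or2 = or2), 2)
```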

  8. “Assessment of individual sensitivity to ionizing radiation and DNA repair efficiency in a healthy population”, F. Marcon, C. Andreoli, et al. Mut. Res., 541 (2003). [Table of genotypes; values not preserved in the transcript.] Not significant! According to Bonferroni, the threshold should be: (not preserved in the transcript).

  9. “High-Throughput Detection of GST Polymorphic Alleles in a Pediatric Cancer Population”, P. Barnette, R. Scholl, et al. Cancer Epidemiology, Biomarkers & Prevention, Vol. 13, 304–313, 2004. 13 genotypes and 8 diseases compared with a control. Reported results: OR = 6.4, P = 0.007 and OR = 2.3, P = 0.018. “Homozygosity in GST prevents cancer!” Not significant! The Bonferroni correction requires: (threshold not preserved in the transcript).

  10. The Bonferroni correction leads to a very high probability of missing a true association! The Bonferroni method creates more problems than it solves (Thomas Perneger, 1998): “Bonferroni adjustments are, at best, unnecessary and, at worst, deleterious to sound statistical inference…”

  11. Errors in statistical testing
      Null hypothesis: usually that there are no differences between two samples.
      Type I error: probability of rejecting a true null hypothesis = probability of finding differences where there are none = probability of a false discovery.
      Type II error: probability of accepting a false null hypothesis = probability of not finding existing differences = probability of missing a true discovery.
      Test power = 1 - Type II error = probability of correctly rejecting a false null hypothesis = probability of making a discovery.
      Traditionally a biologist tries to avoid Type I error, i.e. to guarantee the avoidance of false discoveries, and does not take care of the possibility of missing a discovery (Type II error).

  12. Dependence of Type II error on the number of tests when using the Bonferroni correction. Probability of missing a gene with OR = 2.7 with sample sizes of 100 (cases) and 100 (controls). [Plot: Type II error against number of tests.] In a single test the probability of missing the discovery is 0.2. With 5 comparisons we miss 50% of discoveries. For m = 100 the probability of error is 0.88: with 100 comparisons, to guarantee the avoidance of 1 false discovery, we miss 88% of true discoveries!
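
The shape of this curve can be sketched with `power.prop.test` from base R's stats package. The slide does not give the carrier frequency, so the control frequency of 0.2 below is a hypothetical value, and a two-sample proportion test stands in for whatever test the author used. The resulting numbers therefore differ from 0.2, 0.5 and 0.88; only the trend is illustrated.

```r
control_freq <- 0.2    # hypothetical carrier frequency in controls (not from the slides)
odds_ratio   <- 2.7    # effect size quoted on the slide

# Carrier frequency in cases implied by the odds ratio
case_odds <- odds_ratio * control_freq / (1 - control_freq)
case_freq <- case_odds / (1 + case_odds)

n_tests <- c(1, 5, 20, 100)

# Type II error when each test is run at the Bonferroni level 0.05 / m
type2_error <- sapply(n_tests, function(m) {
  power <- power.prop.test(n = 100, p1 = control_freq, p2 = case_freq,
                           sig.level = 0.05 / m)$power
  1 - power
})

data.frame(n_tests, type2_error)
```

Whatever frequency is assumed, the Type II error rises as the number of tests grows, because the per-test threshold 0.05/m becomes harder to reach.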

  13. A new algorithm for testing statistical hypotheses: FDR control (False Discovery Rate control; Benjamini, Hochberg, 1995). >105 papers in … The traditional principle, probability of a false discovery < significance level (Type I error < 0.05), is replaced by: average fraction of false discoveries < chosen significance level.

  14. Algorithm of FDR control (Benjamini, Hochberg, 1995)
      • Order the tests by p-value: $p_1 < p_2 < \dots < p_m$.
      • For FDR control at level α (e.g. 0.05), we find $j^* = \max\{\, j : p_j \le \frac{j}{m}\alpha \,\}$.
      • Differences are assumed to be significant for j = 1, …, j*.
      • For j > j* differences are assumed not to be significant.
      Here α is the required significance level, j is the order number of the gene, $p_j$ is the p-value for the j-th test (gene), and m is the total number of tests (genes).
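
The steps above can be sketched in a few lines of R. The function name `bh_significant` and the ten p-values are hypothetical, chosen only for illustration; the result is compared with base R's `p.adjust(..., method = "BH")`, which implements the same procedure in terms of adjusted p-values.

```r
# Benjamini-Hochberg step-up procedure: returns TRUE for significant tests
bh_significant <- function(p, alpha = 0.05) {
  m <- length(p)
  ord <- order(p)                          # indices of tests in ascending p-value order
  thresholds <- seq_len(m) * alpha / m     # j * alpha / m for j = 1, ..., m
  passing <- which(p[ord] <= thresholds)
  j_star <- if (length(passing) > 0) max(passing) else 0
  significant <- logical(m)
  if (j_star > 0) significant[ord[seq_len(j_star)]] <- TRUE
  significant
}

# Hypothetical p-values for 10 tests (not taken from the slides)
p <- c(0.001, 0.008, 0.010, 0.015, 0.024, 0.040, 0.200, 0.350, 0.600, 0.900)

sig <- bh_significant(p)
sum(p < 0.05)               # significant without correction
sum(p < 0.05 / length(p))   # significant after Bonferroni correction
sum(sig)                    # significant after FDR control

# Cross-check against the built-in adjustment
all(sig == (p.adjust(p, method = "BH") <= 0.05))
```

For these made-up values, 6 tests pass without correction, 1 passes the Bonferroni threshold, and 5 pass FDR control, because the 6th p-value (0.040) exceeds its threshold 6 × 0.05 / 10 = 0.030.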

  15. Example: multiple comparisons on 10 tests. Order the tests in ascending order of p-value. The FDR threshold in the first cell equals the Bonferroni threshold, in the second it is two times larger, in the third three times larger, and so on. For the 6th test the p-value is larger than the FDR threshold. That’s it! [Table columns: significant p-values without correction; significant p-values after FDR control.] The Bonferroni correction leaves only the first value.

  16. Example: expression of 3051 genes in leukemia. Golub T.R. Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. // Science. 1999, v.286. [Histogram: number of genes at each level of the t-statistic; x-axis: t-statistic for the comparison of gene expression in healthy and ill patients.] t-test: 1045 genes with p < 0.05. Bonferroni correction: 98 genes with p’ < 0.000016. FDR: 681 genes with FDR < 0.05.
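
The Bonferroni threshold quoted on this slide follows from the number of genes, and the three gene counts can be put on a common scale as fractions of all genes tested. A small sketch using only the numbers from the slide; the variable names are illustrative:

```r
n_genes <- 3051
alpha   <- 0.05

# Per-gene threshold after Bonferroni correction
bonferroni_threshold <- alpha / n_genes   # about 1.6e-05

# Genes declared significant by each approach (counts from the slide)
counts <- c(uncorrected = 1045, bonferroni = 98, fdr = 681)

signif(bonferroni_threshold, 2)
round(counts / n_genes, 3)                # fraction of all genes tested
```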
