These slides describe tighter requirements for breakpoints (b-points) obtained from assembly. Stricter requirements on AGE alignments remove about 8% of b-points, and stricter requirements on the contigs used by both AGE and CROSSMATCH strongly affect the number of resulting breakpoints. With the stringent set, agreement between AGE and CROSSMATCH is much better. The slides also argue that SAV validation has inherent biases and appears to overestimate FDR, and they propose using GenomeStrip calls with assembly support, possibly restricted to calls supported by both AGE and CROSSMATCH. Only b-points on which AGE and CROSSMATCH are consistent should be used for genotyping.
Stringent breakpoints
Alexej & Ken, August 17, 2011

Stringent requirements for b-points
• More stringent requirements for AGE alignment (removes ~8% of b-points)
• Increased stringency on contigs used by both AGE and CROSSMATCH (strong effect on the number of resulting breakpoints)

Phase1 deletion breakpoint assembly
First line: initial b-points
Second line: stringent b-points
Much better agreement between AGE and CROSSMATCH

First line: initial b-points
Second line: stringent b-points
All calls from 5 merged calls
Inconsistent FDR estimation: (2108 - 519)/2108 = 75% < 86%

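The arithmetic on this slide can be checked with a short sketch. The slide does not label the two counts, so reading 2108 as the total number of calls and 519 as the number with confirming evidence is an assumption, as is the helper name `implied_fdr`.

```python
def implied_fdr(total_calls: int, confirmed_calls: int) -> float:
    """Fraction of calls left unconfirmed, treated as an upper bound on FDR.

    Assumption: every unconfirmed call is counted as a false discovery.
    """
    if total_calls <= 0:
        raise ValueError("total_calls must be positive")
    if not 0 <= confirmed_calls <= total_calls:
        raise ValueError("confirmed_calls must lie between 0 and total_calls")
    return (total_calls - confirmed_calls) / total_calls


# Counts taken from the slide: (2108 - 519) / 2108
fdr = implied_fdr(2108, 519)
print(f"{fdr:.0%}")  # prints 75%
```

Under that reading, the bound of about 75% sits below the 86% figure the slide compares it with, which is the inconsistency being pointed out.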
Calls with SAV p-val < 0.5 vs. p-val > 0.5
Require 50% reciprocal overlap
Second line: p-val > 0.5
• Calls with p-val < 0.5 and p-val > 0.5 are not dramatically different
• SAV validation seems to systematically overestimate FDR
• Possible reasons:
  • Input call properties (sample under-assignment)
  • Bias against smaller regions
  • Bias in repetitive regions
  • Reference sample bias when using aCGH probes

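The 50% reciprocal overlap requirement can be illustrated with a minimal sketch. The function name, the half-open coordinate convention, and the example intervals are assumptions made for illustration; the slides do not say how the overlap was computed.

```python
def reciprocal_overlap(a_start, a_end, b_start, b_end, min_fraction=0.5):
    """Return True if the overlap covers at least min_fraction of BOTH intervals.

    Assumption: coordinates are half-open [start, end) on the same chromosome.
    """
    len_a = a_end - a_start
    len_b = b_end - b_start
    if len_a <= 0 or len_b <= 0:
        raise ValueError("intervals must have positive length")
    # Length of the shared region; zero when the intervals are disjoint.
    overlap = max(0, min(a_end, b_end) - max(a_start, b_start))
    return overlap >= min_fraction * len_a and overlap >= min_fraction * len_b


# Two 100 bp calls sharing 50 bp: each is covered exactly 50%.
print(reciprocal_overlap(100, 200, 150, 250))  # True
# A 100 bp call fully inside a 300 bp call: only 1/3 of the larger call is covered.
print(reciprocal_overlap(100, 200, 100, 400))  # False
```

The second example shows why the test is reciprocal: a small call nested in a much larger one passes a one-sided overlap check but fails here.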
Proposal
• Use GenomeStrip + assembly support (all or only with both AGE and CROSSMATCH support) + SAV validated (possibly)
• SAV validation has inherent biases
• Assembly validation is orthogonal to SAV validation
• There is evidence that SAV overestimates FDR
• GenomeStrip + assembly support by both AGE and CROSSMATCH would have 29,281 calls with a likely overestimated SAV FDR of 11%
• Use only b-points that are consistent between AGE and CROSSMATCH for genotyping
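Selecting b-points that are consistent between AGE and CROSSMATCH could look roughly like the sketch below. The slides do not define "consistent", so matching both breakpoint coordinates within a tolerance is an assumption, and the function name, the dictionary layout, the `tolerance` parameter, and the example calls are all hypothetical.

```python
def consistent_breakpoints(age_calls, crossmatch_calls, tolerance=0):
    """Keep calls whose breakpoints agree between the two aligners.

    age_calls, crossmatch_calls: dict mapping call ID -> (left, right) breakpoint.
    tolerance: maximum allowed difference in bp at each end (hypothetical knob).
    """
    consistent = {}
    for call_id, (age_left, age_right) in age_calls.items():
        if call_id not in crossmatch_calls:
            continue  # no CROSSMATCH breakpoint for this call
        cm_left, cm_right = crossmatch_calls[call_id]
        left_ok = abs(age_left - cm_left) <= tolerance
        right_ok = abs(age_right - cm_right) <= tolerance
        if left_ok and right_ok:
            consistent[call_id] = (age_left, age_right)
    return consistent


# Made-up example calls, not data from the slides.
age = {"del1": (1000, 2000), "del2": (5000, 5600), "del3": (9000, 9400)}
crossmatch = {"del1": (1000, 2000), "del2": (5003, 5600)}

print(sorted(consistent_breakpoints(age, crossmatch)))               # ['del1']
print(sorted(consistent_breakpoints(age, crossmatch, tolerance=5)))  # ['del1', 'del2']
```

With `tolerance=0` only exact agreement is kept, which is the strictest reading of "consistent"; a small nonzero tolerance would admit calls whose breakpoints differ by a few bases.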