This presentation by Brian C. Searle at the 2013 Scaffold Users Meeting discusses improvements to peptide probability modeling in Scaffold 4. Key topics include probability estimation using LFDR, target/decoy classification of multiple scores, and improved delta mass error modeling. The presentation describes the use of Naïve Bayes classifiers for robust protein identification, as well as support for several new search engines through the mzIdentML standard. These changes aim to improve the accuracy of protein-level false discovery rates (FDR) and to streamline data analysis for researchers.
Improving Peptide Probability Modeling in Scaffold 4 Brian C. Searle brian.searle@proteomesoftware.com Scaffold Users Meeting, 2013 Creative Commons Attribution
Scaffold 4 Improvements • Probability Estimation using LFDR • Target/Decoy Classification of multiple scores • Delta Mass Error Modeling Improvements • Requires Target/Decoy analysis (1:1 … 1:10)
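The target/decoy requirement above can be illustrated with the standard decoy-counting FDR estimate. This is a minimal sketch, not Scaffold's implementation: the function name, the `decoys_per_target` parameter, and the sample scores are hypothetical, and it assumes that "1:1 … 1:10" refers to the number of decoy sequences searched per target sequence.

```python
def estimate_fdr(scores, is_decoy, threshold, decoys_per_target=1.0):
    """Estimate the FDR among target hits scoring at or above `threshold`.

    Assumption: each decoy hit stands in for 1/decoys_per_target incorrect
    target hits, so a 1:10 target:decoy search divides the decoy count by 10.
    """
    targets = sum(1 for s, d in zip(scores, is_decoy) if s >= threshold and not d)
    decoys = sum(1 for s, d in zip(scores, is_decoy) if s >= threshold and d)
    if targets == 0:
        return 0.0  # nothing accepted, so nothing to be wrong about
    return min(1.0, (decoys / decoys_per_target) / targets)


# Hypothetical search results: higher score means a better match.
scores = [5.0, 4.0, 3.0, 2.0, 1.0, 0.5]
is_decoy = [False, False, True, False, True, True]

fdr_one_to_one = estimate_fdr(scores, is_decoy, threshold=2.0)
fdr_one_to_ten = estimate_fdr(scores, is_decoy, threshold=2.0, decoys_per_target=10.0)
print(fdr_one_to_one, fdr_one_to_ten)
```

With a larger decoy database each decoy hit counts for less, so the same hit list yields a lower and less noisy estimate.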
[Figure; labels: “Incorrect”, “Correct”]
[Figure; axes: Number of Identified Proteins, Protein-Level False Discovery Rate]
[Figure; scores shown: XCorr, DeltaCN, % Ions Identified, …]
Naïve Bayes Classifier • Trained to each data set • Simple (can calculate with a formula, no magic!) • Robust to over-fitting
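The "can calculate with a formula" point can be sketched with a Gaussian Naïve Bayes model over several search-engine scores. This is an illustration under stated assumptions, not Scaffold's code: the Gaussian score model, the helper names, the 0.5 prior, and the training values (XCorr and DeltaCN pairs) are all hypothetical, and the slides do not say how the labeled training examples are obtained.

```python
import math


def fit_gaussians(rows):
    """Return a (mean, variance) pair per feature for a list of feature vectors."""
    n = len(rows)
    params = []
    for column in zip(*rows):
        mean = sum(column) / n
        var = sum((x - mean) ** 2 for x in column) / n
        params.append((mean, max(var, 1e-9)))  # floor avoids division by zero
    return params


def log_likelihood(features, params):
    """Sum of per-feature Gaussian log densities (the 'naive' independence step)."""
    total = 0.0
    for x, (mean, var) in zip(features, params):
        total += -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)
    return total


def probability_correct(features, correct_params, incorrect_params, prior_correct=0.5):
    """Bayes' rule: P(correct | scores), normalized in log space for stability."""
    log_c = math.log(prior_correct) + log_likelihood(features, correct_params)
    log_i = math.log(1.0 - prior_correct) + log_likelihood(features, incorrect_params)
    m = max(log_c, log_i)
    p_c = math.exp(log_c - m)
    p_i = math.exp(log_i - m)
    return p_c / (p_c + p_i)


# Hypothetical training examples: [XCorr, DeltaCN] per identification.
correct = [[3.5, 0.40], [4.0, 0.35], [3.0, 0.45]]
incorrect = [[1.5, 0.05], [2.0, 0.10], [1.0, 0.08]]
correct_params = fit_gaussians(correct)
incorrect_params = fit_gaussians(incorrect)

strong_hit = probability_correct([3.6, 0.40], correct_params, incorrect_params)
weak_hit = probability_correct([1.4, 0.07], correct_params, incorrect_params)
print(strong_hit, weak_hit)
```

Because each score contributes one independent term, adding another score (for example % Ions Identified) only means appending a column to the feature vectors.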
[Figure; axes: Number of Identified Proteins, Protein-Level False Discovery Rate]
[Figure; labels: Probability the ID is Correct, Probability the ID is Wrong]
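The correct/wrong split can be sketched with a binned local FDR (LFDR) estimate, where the probability that an ID is wrong is estimated separately in each score range and the probability that it is correct is one minus that value. This is a rough illustration only: the histogram binning, the function name, and the sample scores are assumptions, and the slides do not describe how Scaffold 4 smooths or fits these estimates.

```python
def local_fdr_by_bin(target_scores, decoy_scores, bin_edges, decoys_per_target=1.0):
    """Estimate P(ID is wrong) within each half-open score bin [low, high).

    Returns one value per bin, or None where the bin contains no target hits.
    """
    lfdr = []
    for low, high in zip(bin_edges, bin_edges[1:]):
        targets = sum(1 for s in target_scores if low <= s < high)
        decoys = sum(1 for s in decoy_scores if low <= s < high)
        if targets == 0:
            lfdr.append(None)
        else:
            lfdr.append(min(1.0, (decoys / decoys_per_target) / targets))
    return lfdr


# Hypothetical scores from a 1:1 target/decoy search.
target_scores = [0.5, 0.6, 1.2, 1.5, 1.7, 2.2, 2.5, 2.8, 2.9]
decoy_scores = [0.4, 0.7, 1.1]

lfdr = local_fdr_by_bin(target_scores, decoy_scores, bin_edges=[0.0, 1.0, 2.0, 3.0])
p_correct = [None if v is None else 1.0 - v for v in lfdr]
print(lfdr, p_correct)
```

Unlike a cumulative FDR, which describes everything above a threshold, the local value describes only the identifications in that score range, which is what a per-ID probability needs.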
[Figure; axes: Number of Identified Proteins, Protein-Level FDR; annotation: 1% Peptide FDR]
1% Peptide FDR > 10% Protein FDR?!?
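A back-of-the-envelope calculation shows how a 1% peptide FDR can turn into a protein FDR above 10%. The sketch assumes that correct peptides pile up on a limited set of true proteins while incorrect peptides scatter across the database, each landing on a different protein. All numbers and names here are hypothetical and are not taken from the presentation.

```python
def protein_fdr(accepted_peptides, peptide_fdr, true_proteins,
                wrong_peptides_per_false_protein=1.0):
    """Rough protein-level FDR implied by a peptide-level FDR.

    Assumption: every group of `wrong_peptides_per_false_protein` incorrect
    peptides creates one false protein identification.
    """
    wrong_peptides = accepted_peptides * peptide_fdr
    false_proteins = wrong_peptides / wrong_peptides_per_false_protein
    return false_proteins / (true_proteins + false_proteins)


# Hypothetical experiment: 20,000 accepted peptides at 1% FDR, 1,500 true proteins.
rate = protein_fdr(accepted_peptides=20_000, peptide_fdr=0.01, true_proteins=1_500)
print(f"{rate:.1%}")
```

Here 200 incorrect peptides become 200 false proteins alongside 1,500 true ones, which is 200 / 1,700, or roughly 12% at the protein level.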
New Search Engines? • Difficult to add new search engines with PeptideProphet (new seeds) • Easy to add with Naïve Bayes / LFDR • mzIdentML interchange (HUPO standard)
New Search Engines in Scaffold 4 • Peaks • Byonic • MyriMatch (Tabb Lab) • SQID (Wysocki Lab) • MS-GF+ (Pevzner Lab) • MS-Amanda (Mechtler Lab, PD) • ... Any engine with decoys & mzIdentML!
Scaffold 4 Improvements • New Naïve Bayes / LFDR Probabilities • Probability Estimation using LFDR • Target/Decoy Classification • Delta Mass Error Modeling • “Next generation” search engine interpretation • New mzIdentML File Loading • Several newly supported search engines • Any search engine with decoys