Glycoprotein Microheterogeneity via N-Glycopeptide Identification Kevin Brown Chandler, Petr Pompach, Radoslav Goldman, Nathan Edwards Georgetown University Medical Center
The challenge • Identify glycopeptides in large-scale tandem mass-spectrometry datasets • Many glycopeptide enriched fractions • Many tandem mass-spectra / fraction • Good, but not great, instrumentation • QStar Elite – CID, good MS1/MS2 resolution • Strive for hypothesis-generating analysis • Site-specific glycopeptide characterization • Glycoform occupancy in differentiated samples
Observations • Oxonium ions (204, 366) help distinguish glycopeptides from peptides… • …but do little to identify the glycopeptide • Few peptide b/y-ions to identify peptides… • …but intact peptide fragments are common • If the peptide can be guessed, then… • …the glycan's mass can be determined
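The oxonium-ion observation can be illustrated with a minimal sketch. It assumes a spectrum is a list of (m/z, intensity) pairs and uses the standard monoisotopic m/z values for the HexNAc (204) and Hex-HexNAc (366) oxonium ions. The function name, the 0.05 Da tolerance, and the 1% relative-intensity cutoff are illustrative assumptions, not settings from the presentation.

```python
# Monoisotopic m/z of the two singly charged oxonium ions named on the slide.
OXONIUM_MZ = {"HexNAc": 204.0867, "HexHexNAc": 366.1395}


def find_oxonium_ions(peaks, tol=0.05, min_rel_intensity=0.01):
    """Return the oxonium ions present in an MS/MS peak list.

    peaks: list of (mz, intensity) tuples.
    tol and min_rel_intensity are hypothetical defaults for illustration.
    """
    if not peaks:
        return {}
    base_peak = max(intensity for _, intensity in peaks)
    found = {}
    for name, target in OXONIUM_MZ.items():
        matches = [
            (mz, intensity)
            for mz, intensity in peaks
            if abs(mz - target) <= tol
            and intensity >= min_rel_intensity * base_peak
        ]
        if matches:
            # Keep the most intense peak within tolerance.
            found[name] = max(matches, key=lambda p: p[1])
    return found
```

A spectrum for which this returns a non-empty result is a glycopeptide candidate. As the slide notes, that alone says little about which glycopeptide it is.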
Glycopeptide Search Strategy • Glycan-Peptide to Spectrum Matches • Multi-Peptide, Multi-Glycan Mass (Single Peptide), Single Glycan Mass, Single Glycan (Topology)
Compromises • Single protein / Simple protein mixture • Few peptides to distinguish • Single N-glycan per peptide • Subtraction from precursor • Digest may not resolve site • Need peptide/glycan fragments to distinguish • Isobaric peptide-glycan pairs are not resolved • Need peptide/glycan fragments to distinguish
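The "subtraction from precursor" step can be sketched as follows, under the single-N-glycan-per-peptide compromise stated above. The function names are hypothetical. The sketch assumes a protonated precursor of known charge and a monoisotopic peptide mass supplied by the caller.

```python
PROTON = 1.007276   # mass of a proton, Da
WATER = 18.010565   # monoisotopic mass of H2O, Da


def neutral_mass(mz, charge):
    """Neutral monoisotopic mass of a protonated ion [M + zH]z+."""
    return (mz - PROTON) * charge


def glycan_residue_mass(precursor_mz, charge, peptide_mass):
    """Mass left over after subtracting the guessed peptide from the precursor.

    With one glycan per peptide, this equals the sum of the monosaccharide
    residue masses. Add WATER to get the mass of the free (released) glycan.
    """
    return neutral_mass(precursor_mz, charge) - peptide_mass
```

The result is only as good as the peptide guess: an isobaric peptide-glycan pair yields the same precursor mass, which is why fragment evidence is still needed.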
Glycan Databases • Link putative glycan masses to N-linked glycan structures (and organism, etc. ): • Human N-linked GlycomeDB • Cartoonist structure enumeration • CFG Mammalian Array (v5.0) • In-house database (Oxford notation) • Database(s) provide "biased" search space: • Coverage vs. "Reasonableness" • Trade off: Time, Specificity, Biology
Haptoglobin standard • Haptoglobin (HPT_HUMAN) glycopeptides: NLFLNHSE*NATAK, VVLHPNYSQVDIGLIK, MVSHHNLTTGATLINE • N-glycosylation motif: N-X-S/T • * marks the site of GluC cleavage • Pompach et al. Journal of Proteome Research 11.3 (2012): 1728–1740.
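Locating the N-glycosylation motif in the peptides above can be sketched with a regular expression. The sketch uses the usual definition of the sequon (N, then any residue except proline, then S or T). The proline exclusion is standard but is not stated on the slide, and the function name is hypothetical.

```python
import re

# Lookahead so that overlapping sequons are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(peptide):
    """Return 0-based positions of the Asn in each N-X-S/T motif."""
    return [m.start() for m in SEQUON.finditer(peptide)]


# The three haptoglobin peptides from the slide, cleavage marker removed.
for pep in ("NLFLNHSENATAK", "VVLHPNYSQVDIGLIK", "MVSHHNLTTGATLINE"):
    print(pep, find_sequons(pep))
```

The first peptide carries two motifs (NHS and NAT), separated by the GluC cleavage site, which is why the digest matters for resolving the site.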
Haptoglobin standard • 11 HILIC fractions enriched for glycopeptides • 11 x LC-MS/MS acquisitions (≥ 15k spectra) • 2887/3288 MS/MS spectra have oxonium ion(s) • 317 have "intact-peptide" fragment ions • 263 spectra matched to peptide-glycan pairs • 52% matched single-glycan • 8% matched multi-peptide • 27 distinct (mass) glycans on 11 peptides • Glycans identified on all 4 haptoglobin sites
Algorithms & Infrastructure • Glycan databases indexed by composition, mass, N-linked, and motif/type • Formats: IUPAC, Linear Code, GlycoCT_condensed • Implemented: GlycomeDB, Cartoonist, CFG Array • Monosaccharide decomposition of glycan mass • Böcker et al. Efficient mass decomposition (2005) • χ² goodness-of-fit test for precursor cluster • Theoretical isotope cluster from composition • ICScore based on χ²-test p-value
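Monosaccharide decomposition of a glycan mass can be illustrated with a brute-force enumeration. This is not the efficient algorithm of Böcker et al. cited above, only a bounded search that shows the idea. The residue set, the per-residue upper bounds, and the 0.02 Da tolerance are assumptions chosen for the example.

```python
from itertools import product

# Monoisotopic residue masses (Da) of common N-glycan monosaccharides.
RESIDUE_MASS = {
    "HexNAc": 203.07937,
    "Hex": 162.05282,
    "dHex": 146.05791,
    "NeuAc": 291.09542,
}
# Hypothetical upper bounds that keep the enumeration small.
MAX_COUNT = {"HexNAc": 8, "Hex": 12, "dHex": 4, "NeuAc": 5}


def decompose(mass, tol=0.02):
    """Return all compositions whose residue masses sum to mass +/- tol."""
    names = list(RESIDUE_MASS)
    hits = []
    for counts in product(*(range(MAX_COUNT[n] + 1) for n in names)):
        total = sum(c * RESIDUE_MASS[n] for c, n in zip(counts, names))
        if abs(total - mass) <= tol:
            hits.append(dict(zip(names, counts)))
    return hits
```

Each returned composition could then be used to build a theoretical isotope cluster for the goodness-of-fit test, or to look up candidate structures in a glycan database.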
False Discovery Rate (FDR) • How confident can we be in these mass-matches? FDR: 3.9% [ ~ 10 / 263 spectra ] • Estimate the number of errors by searching with non-N-linked motif (decoy) peptides too. • Count spectra matched to decoy peptide-glycan pairs. • Rescale decoy counts to balance the number of motif and non-motif peptides.
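The decoy-based estimate described above can be written as a short calculation. This is a sketch of the general idea rather than the exact GPS computation. The function and parameter names are hypothetical, and the numbers in the usage note are made up.

```python
def estimate_fdr(target_hits, decoy_hits, n_target_peptides, n_decoy_peptides):
    """Estimate FDR from decoy matches, rescaled by search-space size.

    target_hits: spectra matched to motif (target) peptide-glycan pairs.
    decoy_hits: spectra matched to non-motif (decoy) peptide-glycan pairs.
    n_target_peptides, n_decoy_peptides: sizes of the two peptide sets.
    """
    if n_decoy_peptides <= 0:
        raise ValueError("need at least one decoy peptide")
    if target_hits == 0:
        return 0.0
    # Rescale so decoys and targets have an equal chance of a random match.
    scale = n_target_peptides / n_decoy_peptides
    expected_false = decoy_hits * scale
    return expected_false / target_hits
```

For example, 30 decoy matches from a decoy set three times the size of the target set correspond to about 10 expected false matches, so 200 target matches would give an estimated FDR of 5%.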
Tuning the filters… • Adjusting thresholds and parameters to • Increase specificity (lower FDR, fewer spectra), or • Increase sensitivity (more spectra, higher FDR)
Tuning the filters… • Oxonium ions: • Number & intensity • Match tolerance • "Intact-peptide" fragments: • Number & intensity • Match tolerance • Glycan composition: • ICScore • Constrain search space • Match tolerance • Glycan database: • Constrain search space • Match tolerance • Precursor ion: • Non-monoisotopic selection • Sodium adducts • Charge state • Peptide search space: • Semi-specific peptides • Non-specific peptides • Peptide MW range • Variable modifications
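The precursor-ion options (charge state, non-monoisotopic selection, sodium adducts) each widen the set of neutral masses a single observed m/z could represent. A minimal sketch of that enumeration follows. The function name and default ranges are assumptions, and the constants are standard physical values.

```python
PROTON = 1.007276          # Da
SODIUM_ION = 22.989221     # Na+ (sodium atom minus one electron), Da
ISOTOPE_SPACING = 1.003355  # 13C - 12C mass difference, Da


def candidate_neutral_masses(mz, charges=(2, 3, 4), max_isotope=2, max_sodium=1):
    """Enumerate monoisotopic neutral masses consistent with one precursor m/z.

    Each candidate is (mass, charge, isotope_offset, n_sodium), assuming the
    ion is [M + (z-k)H + kNa]z+ and that the instrument may have selected
    the i-th isotope peak instead of the monoisotopic one.
    """
    candidates = []
    for z in charges:
        for k in range(min(max_sodium, z) + 1):
            for i in range(max_isotope + 1):
                mass = (
                    mz * z
                    - (z - k) * PROTON
                    - k * SODIUM_ION
                    - i * ISOTOPE_SPACING
                )
                candidates.append((mass, z, i, k))
    return candidates
```

Every extra option multiplies the number of candidate masses, which is the sensitivity-versus-specificity trade-off these filters control.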
GlycoPeptideSearch (GPS) 1.3 • Freely available implementation • Windows, Linux • Reads open-format spectra (mzXML, MGF) • Pre-indexed Glycan databases • Human & Mammalian GlycomeDB • Mammalian CFG Array (v5.0) • User-Named (Oxford notation) • In silico digest and N-linked motif identification • Automatic target/decoy analysis for FDR • http://edwardslab.bmcb.georgetown.edu/GPS
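An in silico digest followed by motif-based target/decoy partitioning could look like the sketch below. It is not the GPS implementation. The cleavage rule (cut after any residue in `cleave_after`, with no proline exception), the function names, and the toy sequence in the usage note are all illustrative assumptions.

```python
import re

SEQUON = re.compile(r"N[^P][ST]")  # N-linked motif, X is not proline


def digest(protein, cleave_after="KR", missed_cleavages=0):
    """Cut after each residue in cleave_after; allow missed cleavages."""
    cuts = (
        [0]
        + [i + 1 for i, aa in enumerate(protein[:-1]) if aa in cleave_after]
        + [len(protein)]
    )
    peptides = []
    for start in range(len(cuts) - 1):
        last = min(start + 2 + missed_cleavages, len(cuts))
        for end in range(start + 1, last):
            peptides.append(protein[cuts[start]:cuts[end]])
    return peptides


def split_targets_decoys(peptides):
    """Motif-containing peptides are targets; the rest serve as decoys."""
    targets = [p for p in peptides if SEQUON.search(p)]
    decoys = [p for p in peptides if not SEQUON.search(p)]
    return targets, decoys
```

For a toy sequence such as "AAKNLTRGGEK", `digest` yields AAK, NLTR and GGEK, of which only NLTR is a target. Passing `cleave_after="KRE"` would approximate a combined trypsin and GluC digest.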
Where to from here? • Demonstrate utility on new instrument platforms, proteins, samples • Develop a scoring model for fragments • Re-implement Cartoonist demerits • Exploit relationships between • MS2 spectra, MSn spectra • Explore application to • O-glycopeptides, N-glycans, O-glycans
Acknowledgements • Edwards Lab (Georgetown) • Kevin Brown Chandler [NSF] (Poster 32) • Goldman Lab (Georgetown) • Radoslav Goldman (Poster 6) • Petr Pompach • Miloslav Sanda (Poster 23) • Marshal Bern (Xerox PARC) • Cartoonist, Peptoonist • Rene Ranzinger (CCRC) • GlycomeDB