Glycoprotein Microheterogeneity via N-Glycopeptide Identification

Kevin Brown Chandler, Petr Pompach, Radoslav Goldman, Nathan Edwards (Georgetown University Medical Center)

The challenge • Identify glycopeptides in large-scale tandem mass-spectrometry datasets • Many glycopeptide-enriched fractions • Many tandem mass spectra per fraction • Good, but not great, instrumentation • QStar Elite: CID, good MS1/MS2 resolution • Strive for hypothesis-generating analysis • Site-specific glycopeptide characterization • Glycoform occupancy in differentiated samples

Observations • Oxonium ions (204, 366) help distinguish glycopeptides from peptides… • …but do little to identify the glycopeptide • Few peptide b/y-ions to identify peptides… • …but intact peptide fragments are common • If the peptide can be guessed, then… • …the glycan's mass can be determined
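The two observations above (oxonium ions flag a glycopeptide spectrum; a guessed peptide fixes the glycan mass) can be sketched as follows. This is a minimal illustration, not the GPS implementation: the function names, the tolerance, and the intensity cutoff are assumptions, and the exact oxonium m/z values are the standard monoisotopic values for the nominal 204 and 366 ions named on the slide.

```python
PROTON = 1.007276  # proton mass in Da

# Singly charged oxonium ions (nominal m/z 204 and 366 on the slide).
OXONIUM_MZ = {
    "HexNAc": 204.0867,
    "Hex-HexNAc": 366.1395,
}


def find_oxonium_ions(peaks, tol=0.05, min_rel_intensity=0.01):
    """Return the names of oxonium ions found in `peaks`.

    `peaks` is a list of (m/z, intensity) pairs. `tol` (Da) and
    `min_rel_intensity` (fraction of the base peak) are illustrative
    defaults, not values taken from the presentation.
    """
    if not peaks:
        return []
    base = max(intensity for _, intensity in peaks)
    found = []
    for name, target in OXONIUM_MZ.items():
        for mz, intensity in peaks:
            if abs(mz - target) <= tol and intensity >= min_rel_intensity * base:
                found.append(name)
                break
    return found


def neutral_mass(mz, charge):
    """Neutral monoisotopic mass of a protonated ion."""
    return (mz - PROTON) * charge


def glycan_mass_by_subtraction(precursor_mz, charge, peptide_mass):
    """Mass of the attached glycan, assuming exactly one glycan and no
    other modifications: precursor neutral mass minus peptide mass.
    The result is the sum of monosaccharide residue masses."""
    return neutral_mass(precursor_mz, charge) - peptide_mass
```

The subtraction only works under the single-glycan assumption; a wrong peptide guess yields a wrong, but still plausible-looking, glycan mass, which is why the later decoy analysis matters.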

Glycopeptide Search Strategy • Glycan-Peptide to Spectrum Matches: • Multi-Peptide • Multi-Glycan Mass (Single Peptide) • Single Glycan Mass • Single Glycan (Topology)

Compromises • Single protein / simple protein mixture: few peptides to distinguish • Single N-glycan per peptide: subtraction from precursor • Digest may not resolve site: need peptide/glycan fragments to distinguish • Isobaric peptide-glycan pairs are not resolved: need peptide/glycan fragments to distinguish

Glycan Databases • Link putative glycan masses to N-linked glycan structures (and organism, etc.): • Human N-linked GlycomeDB • Cartoonist structure enumeration • CFG Mammalian Array (v5.0) • In-house database (Oxford notation) • Database(s) provide "biased" search space: • Coverage vs. "Reasonableness" • Trade-off: Time, Specificity, Biology
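Linking a putative glycan mass to database entries amounts to a tolerance lookup in a mass-sorted index. The sketch below assumes a tiny in-memory database; the class name, the ppm default, and the four entries are illustrative only and do not reproduce any of the databases listed above. Masses are monoisotopic residue masses.

```python
from bisect import bisect_left, bisect_right

# Monoisotopic residue masses (Da) of common N-glycan monosaccharides.
RESIDUE_MASS = {
    "HexNAc": 203.07937,
    "Hex": 162.05282,
    "Fuc": 146.05791,
    "NeuAc": 291.09542,
}


def composition_mass(composition):
    """Sum of residue masses for a {monosaccharide: count} composition."""
    return sum(RESIDUE_MASS[name] * count for name, count in composition.items())


class GlycanMassIndex:
    """Mass-sorted index over (name, composition) entries (hypothetical)."""

    def __init__(self, entries):
        rows = sorted((composition_mass(comp), name) for name, comp in entries)
        self.masses = [mass for mass, _ in rows]
        self.names = [name for _, name in rows]

    def lookup(self, mass, ppm=20.0):
        """Return names of all entries within `ppm` of `mass`."""
        delta = mass * ppm / 1e6
        lo = bisect_left(self.masses, mass - delta)
        hi = bisect_right(self.masses, mass + delta)
        return self.names[lo:hi]


# Illustrative entries labeled in Oxford-style notation.
EXAMPLE_DB = GlycanMassIndex([
    ("M5", {"HexNAc": 2, "Hex": 5}),
    ("A2G2", {"HexNAc": 4, "Hex": 5}),
    ("A2G2S1", {"HexNAc": 4, "Hex": 5, "NeuAc": 1}),
    ("FA2G2S2", {"HexNAc": 4, "Hex": 5, "Fuc": 1, "NeuAc": 2}),
])
```

A larger database raises coverage but also the chance of a spurious mass match, which is the coverage versus "reasonableness" trade-off named on the slide.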

Haptoglobin standard • Haptoglobin (HPT_HUMAN) glycopeptides: NLFLNHSE*NATAK, VVLHPNYSQVDIGLIK, MVSHHNLTTGATLINE • Legend: N-glycosylation motif (N-X-S/T); * marks the site of GluC cleavage • Pompach et al. Journal of Proteome Research 11.3 (2012): 1728–1740.
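Locating the N-glycosylation motif in the three haptoglobin peptides above can be sketched with a regular expression. The function name is hypothetical, and excluding proline at the X position is the usual sequon convention rather than something stated on the slide.

```python
import re

# N-X-S/T sequon with X != P. The lookahead keeps matches zero-width after
# the N, so overlapping or adjacent sequons are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(peptide):
    """Return 0-based positions of asparagines in an N-X-S/T motif."""
    return [match.start() for match in SEQUON.finditer(peptide)]


# Peptides from the slide, with the GluC cleavage marker (*) removed.
HAPTOGLOBIN_PEPTIDES = [
    "NLFLNHSENATAK",
    "VVLHPNYSQVDIGLIK",
    "MVSHHNLTTGATLINE",
]

for peptide in HAPTOGLOBIN_PEPTIDES:
    print(peptide, find_sequons(peptide))
```

The first peptide carries two sequons, which is why GluC cleavage between them is needed to resolve the two sites; together the three peptides account for four sites.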

Haptoglobin standard • 11 HILIC fractions enriched for glycopeptides • 11 x LC-MS/MS acquisitions (≥ 15k spectra) • 2887/3288 MS/MS spectra have oxonium ion(s) • 317 have "intact-peptide" fragment ions • 263 spectra matched to peptide-glycan pairs • 52% matched single-glycan • 8% matched multi-peptide • 27 distinct (mass) glycans on 11 peptides • Glycans identified on all 4 haptoglobin sites

Algorithms & Infrastructure • Glycan databases indexed by composition, mass, N-linked, and motif/type • Formats: IUPAC, Linear Code, GlycoCT_condensed • Implemented: GlycomeDB, Cartoonist, CFG Array • Monosaccharide decomposition of glycan mass • Böcker et al. Efficient mass decomposition (2005) • χ² goodness-of-fit test for precursor cluster • Theoretical isotope cluster from composition • ICScore based on χ²-test p-value
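Monosaccharide decomposition asks which compositions sum to an observed glycan mass. The sketch below uses a plain bounded enumeration for clarity; it is not the efficient algorithm of Böcker et al. cited above, and the residue alphabet, count limits, and tolerance are assumptions. No N-linked core constraint is applied.

```python
from itertools import product

# Monoisotopic residue masses (Da).
RESIDUE_MASS = {
    "HexNAc": 203.07937,
    "Hex": 162.05282,
    "Fuc": 146.05791,
    "NeuAc": 291.09542,
}

# Illustrative upper bounds on each monosaccharide count.
DEFAULT_MAX_COUNTS = {"HexNAc": 8, "Hex": 12, "Fuc": 3, "NeuAc": 4}


def decompose(mass, tol=0.02, max_counts=None):
    """Return every composition whose residue-mass sum is within `tol` Da
    of `mass`, by brute-force enumeration over bounded counts."""
    limits = max_counts or DEFAULT_MAX_COUNTS
    names = list(RESIDUE_MASS)
    results = []
    for counts in product(*(range(limits[name] + 1) for name in names)):
        total = sum(RESIDUE_MASS[name] * c for name, c in zip(names, counts))
        if abs(total - mass) <= tol:
            results.append(dict(zip(names, counts)))
    return results
```

With these bounds the search visits 2,340 candidate compositions per mass, which is fine for a sketch; a dedicated mass-decomposition algorithm becomes worthwhile with larger alphabets or many query masses.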

False Discovery Rate (FDR) • How confident can we be in these mass-matches? • FDR: 3.9% [ ~ 10 / 263 spectra ] • Estimate the number of errors by also searching with non-N-linked-motif (decoy) peptides. • Count spectra matched to decoy peptide-glycan pairs. • Rescale decoy counts to balance the number of motif and non-motif peptides.
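The decoy-based estimate described above can be written as a one-line rescaling. This is a sketch under stated assumptions: the function and parameter names are hypothetical, the scaling by the ratio of motif to non-motif peptide counts is one straightforward reading of "rescale decoy counts", and the numbers in the example are made up, not taken from the haptoglobin analysis.

```python
def estimate_fdr(target_hits, decoy_hits, n_motif_peptides, n_decoy_peptides):
    """Estimate FDR from decoy (non-motif) peptide matches.

    Decoy hits are rescaled by n_motif_peptides / n_decoy_peptides so that
    the expected number of chance matches is comparable between the two
    search spaces, then divided by the number of target (motif) hits.
    """
    if n_decoy_peptides <= 0:
        raise ValueError("need at least one decoy peptide")
    if target_hits == 0:
        return 0.0
    expected_false = decoy_hits * n_motif_peptides / n_decoy_peptides
    return expected_false / target_hits


# Hypothetical example: 200 target matches, 30 decoy matches,
# 10 motif peptides and 60 non-motif peptides in the search space.
example_fdr = estimate_fdr(200, 30, 10, 60)
print(f"estimated FDR: {example_fdr:.1%}")
```

Rescaling matters because a digest usually yields far more non-motif than motif peptides, so raw decoy counts would overstate the error rate.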

Tuning the filters… • Adjusting thresholds and parameters to • Increase specificity (lower FDR, fewer spectra), or • Increase sensitivity (more spectra, higher FDR)

Tuning the filters… • Oxonium ions: number & intensity; match tolerance • "Intact-peptide" fragments: number & intensity; match tolerance • Glycan composition: ICScore; constrain search space; match tolerance • Glycan database: constrain search space; match tolerance • Precursor ion: non-monoisotopic selection; sodium adducts; charge state • Peptide search space: semi-specific peptides; non-specific peptides; peptide MW range; variable modifications

GlycoPeptideSearch (GPS) 1.3 • Freely available implementation • Windows, Linux • Reads open-format spectra (mzXML, MGF) • Pre-indexed Glycan databases • Human & Mammalian GlycomeDB • Mammalian CFG Array (v5.0) • User-Named (Oxford notation) • In silico digest and N-linked motif identification • Automatic target/decoy analysis for FDR • http://edwardslab.bmcb.georgetown.edu/GPS

Where to from here? • Demonstrate utility on new instrument platforms, proteins, samples • Develop a scoring model for fragments • Re-implement Cartoonist demerits • Exploit relationships between • MS2 spectra, MSn spectra • Explore application to • O-glycopeptides, N-glycans, O-glycans

Acknowledgements • Edwards Lab (Georgetown) • Kevin Brown Chandler [NSF] (Poster 32) • Goldman Lab (Georgetown) • Radoslav Goldman (Poster 6) • Petr Pompach • Miloslav Sanda (Poster 23) • Marshall Bern (Xerox PARC) • Cartoonist, Peptoonist • Rene Ranzinger (CCRC) • GlycomeDB
