
Top-down characterization of proteins in bacteria with unsequenced genomes

Nathan Edwards, Georgetown University Medical Center


Presentation Transcript


  1. Top-down characterization of proteins in bacteria with unsequenced genomes Nathan Edwards Georgetown University Medical Center

  2. Microorganism Identification • Homeland-security/defense applications • Long history of fingerprinting approaches • Clinical applications in strain identification: • Selection of treatment and/or antibiotics • New applications in microbiome analysis: • Bacterial colonies in gut, .... • Chronic wound infections • Compete with genomic approaches? • PCR, Next-gen sequencing • Primary sales-pitch is speed.

  3. Microorganism Identifications • Match spectra with proteome (or genome) sequence for (species) identity • Provides robust match with respect to instrumentation and sample prep • Many bacteria will never be sequenced or "finished"... • Pathogen simulants, for example • ...but many have – about 2500 to date.

  4. Microorganism Identifications • Match spectra with proteome (or genome) sequence for (species) identity • Provides robust match with respect to instrumentation and sample prep • Many bacteria will never be sequenced or "finished"... • Pathogen simulants, for example • ...but many have – about 2500 to date. • Can we use the available sequence to identify proteins from unknown, unsequenced bacteria? • Yes, for some proteins in some organisms!

  5. Intact protein LC-MS/MS • Crude cell lysate • Capillary HPLC, C8 column • LTQ-Orbitrap XL • Precursor scan: resolution 30,000 @ 400 m/z • Data-dependent precursor selection: 5 most abundant ions, 10 second dynamic exclusion, charge state +3 or greater • CAD product ion scan: resolution 15,000 @ 400 m/z
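
As a rough illustration of the data-dependent selection rules on this slide (5 most abundant ions, charge state +3 or greater, 10 second dynamic exclusion), the Python sketch below picks precursors from one survey scan. The function name, the `(mz, intensity, charge)` peak layout, and the `mz_tol` exclusion window are assumptions made for illustration; the instrument's actual acquisition logic is not described in the talk.

```python
def select_precursors(peaks, now, excluded, top_n=5, min_charge=3,
                      exclusion_s=10.0, mz_tol=0.01):
    """Pick up to top_n precursors from one survey scan.

    peaks    -- list of (mz, intensity, charge) tuples (hypothetical layout)
    now      -- scan time in seconds
    excluded -- dict mapping m/z -> time it was last selected (updated in place)
    mz_tol   -- assumed width of the exclusion window in m/z units
    """
    # Forget exclusions that are older than the dynamic exclusion time.
    for mz in [m for m, t in excluded.items() if now - t >= exclusion_s]:
        del excluded[mz]

    chosen = []
    # Visit peaks from most to least abundant.
    for mz, intensity, charge in sorted(peaks, key=lambda p: p[1], reverse=True):
        if charge < min_charge:
            continue  # charge state below +3 is skipped
        if any(abs(mz - ex) <= mz_tol for ex in excluded):
            continue  # selected too recently
        chosen.append(mz)
        excluded[mz] = now
        if len(chosen) == top_n:
            break
    return chosen
```

Calling the function on consecutive scans with the same `excluded` dictionary is what makes the exclusion "dynamic": an ion selected at one time point is ignored until the exclusion time has elapsed.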

  6. CID Protein Fragmentation Spectrum from Y. rohdei

  7. Enterobacteriaceae Protein Sequences • Exhaustive set of all Enterobacteriaceae family protein sequences from • Swiss-Prot, TrEMBL, RefSeq, GenBank, and [CMR] • ...plus Glimmer3 predictions on RefSeq Enterobacteriaceae genomes • Primary and alternative translation start-sites • Filter for intact mass in range 1 kDa – 20 kDa • 253,626 distinct protein sequences, 256 species • Derived from "Rapid Microorganism Identification Database" (RMIDb.org) infrastructure.
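
The intact-mass filter on this slide can be sketched as follows. This is a minimal Python illustration, not the RMIDb code: it assumes monoisotopic residue masses (the talk does not say whether monoisotopic or average masses were used), and the function names and the choice to drop sequences containing non-standard residue codes are hypothetical.

```python
# Monoisotopic residue masses in Da (standard values for the 20 amino acids).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565  # one water per intact chain


def monoisotopic_mass(seq):
    """Mass of an unmodified protein chain; raises KeyError on unknown residues."""
    return sum(MONO[aa] for aa in seq) + WATER


def filter_by_mass(sequences, low=1000.0, high=20000.0):
    """Keep entries of {name: sequence} whose intact mass lies in [low, high] Da."""
    kept = {}
    for name, seq in sequences.items():
        try:
            mass = monoisotopic_mass(seq)
        except KeyError:
            continue  # assumption: skip sequences with ambiguous codes (X, B, Z, ...)
        if low <= mass <= high:
            kept[name] = seq
    return kept
```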

  8. ProSightPC 2.0 • Product ion scan decharging • Enabled by high-resolution fragment ion measurements • THRASH algorithm implementation • Absolute mass search mode • 15 ppm fragment ion match tolerance • 250 Da precursor ion match tolerance • "Single-click" analysis of entire LC-MS/MS datafile.
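
To make the two tolerances on this slide concrete, the sketch below shows how a 15 ppm fragment tolerance and a 250 Da precursor tolerance might be checked. It is an illustration of the arithmetic only and makes no claim about how ProSightPC implements these tests internally; the function names are hypothetical.

```python
def within_ppm(observed, theoretical, tol_ppm=15.0):
    """True if the relative mass error is within tol_ppm parts per million."""
    return abs(observed - theoretical) / theoretical * 1e6 <= tol_ppm


def within_da(observed, theoretical, tol_da=250.0):
    """True if the absolute mass error is within tol_da daltons."""
    return abs(observed - theoretical) <= tol_da


def candidate_proteins(precursor_mass, database, tol_da=250.0):
    """Names of database entries ({name: mass}) inside the precursor window."""
    return [name for name, mass in database.items()
            if within_da(precursor_mass, mass, tol_da)]
```

A wide precursor window combined with a tight fragment tolerance lets a protein from an unsequenced organism match a close homolog whose intact mass differs, while the high-resolution fragment matches still have to be precise.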

  9. Other tools • Explored using standard search engines: • Decharge and format as charge +1 spectrum • X!Tandem scoring plugin (ProSight, delta M) • OMSSA, Mascot, etc… • MS-Tools: • MS-Deconv, MS-TopDown, MS-Align, MS-Align+, MS-Align-E!

  10. CID Protein Fragmentation Spectrum from Y. rohdei • Match to Y. pestis 50S Ribosomal Protein L32

  11. Exact match sequence…

  12. Phylogeny: Protein vs DNA • [Figure panels: Protein Sequence; 16S-rRNA Sequence]

  13. What about mixtures?

  14. Shared Small Ribosomal Proteins

  15. Shared Small Ribosomal Proteins

  16. Identified E. herbicola proteins • 30S Ribosomal Protein S19 • m/z 686.39, z 15+, E-value 1.96e-16, Δ 0.007 • Six proteins identified with |Δ| < 0.02
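
For readers less used to intact-protein spectra, the precursor m/z and charge state quoted on this slide convert to a neutral mass as sketched below. The sketch assumes the charge carriers are protons and that the quoted m/z is the value to convert directly; the function name is hypothetical.

```python
PROTON = 1.007276466812  # proton mass in Da


def neutral_mass(mz, charge):
    """Neutral mass of an ion observed at m/z with `charge` added protons."""
    return charge * (mz - PROTON)


# 30S Ribosomal Protein S19 precursor from the slide: m/z 686.39, z 15+
print(round(neutral_mass(686.39, 15), 2))  # 10280.74
```

The result falls inside the 1 kDa – 20 kDa intact mass range used to build the sequence database.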

  17. Identified E. herbicola proteins • DNA-binding protein HU-alpha • m/z 732.71, z 13+, E-value 7.5e-26, Δ -14.128 • Eight proteins identified with "large" |Δ|

  18. Identified E. herbicola proteins • DNA-binding protein HU-alpha • m/z 732.71, z 13+, E-value 1.91e-58 • Use "Sequence Gazer" to find mass shift • ΔM mode can "tolerate" one shift for free!

  19. ProSightPC: ΔM mode • [Diagram labels: Experimental Precursor; b- and y-ions; ΔM; Protein Sequence] • Also: PIITA (Tsai et al. 2009)

  20. ProSightPC: ΔM mode • Match a single "blind" mass-shift for free! • [Diagram labels: Experimental Precursor; b- and y-ions; b'- and y'-ions; ΔM; Protein Sequence] • Also: PIITA (Tsai et al. 2009)

  21. ProSightPC: ΔM mode • Match a single "blind" mass-shift for free! • [Diagram labels: Experimental Precursor; b-, b'-, y- and y'-ions; ΔM; Protein Sequence] • Also: PIITA (Tsai et al. 2009)
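
One way to read the ΔM idea on these slides is sketched below in Python. It assumes that ΔM is the difference between the experimental precursor mass and the mass of the database sequence, and that the b'- and y'-ions are the ordinary b- and y-fragments shifted by ΔM. This is an interpretation for illustration, not ProSightPC's implementation; the function names and the neutral fragment-mass convention are assumptions.

```python
# Monoisotopic residue masses in Da (standard values for the 20 amino acids).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565


def fragment_masses(seq):
    """Neutral b_1..b_(n-1) and y_1..y_(n-1) masses (assumed convention:
    b = residue sum of the prefix, y = residue sum of the suffix + water)."""
    prefix, total = [], 0.0
    for aa in seq[:-1]:
        total += MONO[aa]
        prefix.append(total)
    full = sum(MONO[aa] for aa in seq)
    y = [full - p + WATER for p in reversed(prefix)]
    return prefix, y


def count_matches(observed, theoretical, tol_ppm=15.0):
    """Number of observed masses within tol_ppm of any theoretical mass."""
    return sum(
        any(abs(obs - t) / t * 1e6 <= tol_ppm for t in theoretical)
        for obs in observed
    )


def delta_m_score(seq, observed, precursor_mass, tol_ppm=15.0):
    """Return (delta_m, matches without shift, matches allowing one shift)."""
    b, y = fragment_masses(seq)
    delta = precursor_mass - (sum(MONO[aa] for aa in seq) + WATER)
    plain = b + y
    shifted = [m + delta for m in plain]  # the b'- and y'-ions
    return (delta,
            count_matches(observed, plain, tol_ppm),
            count_matches(observed, plain + shifted, tol_ppm))
```

With a single unknown mass shift somewhere in the chain, fragments that do not contain the shifted site match the unshifted series and fragments that do contain it match the shifted series, so allowing both series recovers the fragments that an absolute mass search would miss.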

  22. Identified E. herbicola proteins • DNA-binding protein HU-alpha • m/z 732.71, z 13+, E-value 7.5e-26, Δ -14.128 • Extract N- and C-terminus sequence supported by at least 3 b- or y-ions

  23. E. herbicola protein sequences

  24. E. herbicola sequences found in other species

  25. Phylogenetic placement of E. herbicola • [Figure panels: Cladogram; Phylogram] • phylogeny.fr – "One-Click"

  26. Genome annotation errors • UniProt: E. coli Cell division protein ZapB • 22 (371) E. coli strains MQFRRGMTMSLEVFEKLEAKVQQAIDTITL… 3 (204) 17 (166) 0 (2)

  27. Genome annotation errors • UniProt: E. coli Cell division protein ZapB • 22 (371) E. coli strains • Need ±1500 Da precursor tolerance… MQFRRGMTMSLEVFEKLEAKVQQAIDTITL… 3 (204) 17 (166) 0 (2)

  28. Conclusions • Protein identification for unsequenced organisms. • Identification and localization for sequence mutations and post-translational modifications. • Extraction of confidently established sequence suitable for phylogenetic analysis. • Genome annotation correction. • New paradigm for phylogenetic analysis?

  29. Acknowledgements • Dr. Catherine Fenselau • Avantika Dhabaria, Joe Cannon*, Colin Wynne* • University of Maryland Biochemistry • Dr. Yan Wang • University of Maryland Proteomics Core • Dr. Art Delcher • University of Maryland CBCB • Funding: NIH/NCI
