MS Identification

MS Identification. Dr. Juan Antonio VIZCAINO, PRIDE Group coordinator. PRIDE team, Proteomics Services Group, PANDA group, European Bioinformatics Institute, Hinxton, Cambridge, United Kingdom. Overview …. Search engines: peptide identification Protein inference


Presentation Transcript


  1. MS Identification Dr. Juan Antonio VIZCAINO PRIDE Group coordinator PRIDE team, Proteomics Services Group PANDA group European Bioinformatics Institute Hinxton, Cambridge United Kingdom

  2. Overview … • Search engines: peptide identification • Protein inference • De novo and spectral searches • Choosing the right protein sequence DB • You need to learn many things…

  3. It should not be a black box… From: Lilley et al., Proteomics, 2011

  4. MS proteomics: Shot-gun/bottom-up approaches [Workflow figure; labels: protocol, proteins, peptides, MS analysis, fragmentation, MS/MS analysis, sequence database]

  5. PMF IDENTIFICATION

  6. Peptide Mass Fingerprinting (MS) • MS analysis: Peptide Mass Fingerprinting (PMF) • Each peak in the spectrum represents a peptide (or a mixture of peptides) • Provides information about mass and charge • Not widely used at present, except in gel-based approaches (where the molecular weight (MW) of the protein is known)
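
As a rough illustration of the PMF idea on slide 6, the sketch below digests a protein sequence in silico and matches observed peptide masses to the theoretical ones. The cleavage rule (after K or R, not before P, no missed cleavages), the 0.2 Da tolerance, and all function names are assumptions made for this example and are not taken from the slides.

```python
# Monoisotopic residue masses in Da (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # mass of H2O added to the residue sum of an intact peptide


def tryptic_digest(protein):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        next_aa = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def pmf_matches(observed_masses, protein, tol=0.2):
    """Pair each observed mass with every theoretical peptide within tol Da."""
    theoretical = {p: peptide_mass(p) for p in tryptic_digest(protein)}
    return [(m, p) for m in observed_masses
            for p, t in theoretical.items() if abs(m - t) <= tol]
```

A real PMF search would also consider missed cleavages, modifications and charge states, and would score how many of a protein's peptides are matched.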

  7. Peptide Mass Fingerprinting (MS) in the web • Aldente (Phenyx): http://www.expasy.org/tools/aldente/ • ASCQ_ME: https://www.genopole-lille.fr/logiciel/ascq_me/ • Bupid: http://zlab.bu.edu/Amemee/ • Mascot: http://www.matrixscience.com/search_form_select.html • MassSearch: http://www.cbrg.ethz.ch/services/MassSearch • MS-Fit (Protein Prospector): http://prospector.ucsf.edu/prospector/mshome.htm • PepMAPPER: http://www.nwsr.manchester.ac.uk/mapper/ • Profound (Prowl): http://prowl.rockefeller.edu/prowl-cgi/profound.exe • XProteo: http://xproteo.com:2698/

  8. MS/MS IDENTIFICATION

  9. MS/MS • MS analysis: Peptide Mass Fingerprinting (PMF) • Fragmentation followed by MS/MS analysis: peptide sequence information (on top of mass and charge)

  10. Three types of MS/MS identification • Protein database based comparison: compare the experimental spectrum with a theoretical spectrum derived from a database sequence • Sequential comparison (de novo approaches): compare a de novo sequence derived from the experimental spectrum with a database sequence • Spectral comparison: compare the experimental spectrum with an experimental spectrum from a spectral library. Modified from: Eidhammer, Flikka, Martens, Mikalsen – Wiley 2007

  11. MS proteomics: peptide IDs and protein IDs MS/MS spectra proteins

  12. MS proteomics: peptide IDs and protein IDs MS/MS spectra proteins

  13. MS proteomics: peptide IDs and protein IDs UniProt IPI RefSeq sequence database peptides Search engine TDMDNQIVVSDYAQMDR LFDQAFGLPR AKPLMELIER DESTNVDMSLAQR DIVVQETMEDIDK NGMFFSTYDR GTAGNALMDGASQL MS/MS spectra proteins

  14. SEARCH ENGINES

  15. Search engines UniProt IPI RefSeq sequence database Proteins TDMDNQIVVSDYAQMDR LFDQAFGLPR AKPLMELIER DESTNVDMSLAQR DIVVQETMEDIDK NGMFFSTYDR GTAGNALMDGASQL VDMSLAQR DIVVQETMEDIDK … Peptides Spectra Sequence database matching Experimental Spectra Theoretical Spectra

  16. Search engines Experimental Spectra Theoretical Spectra • How good is the correlation? • Scores are generated by search engines • Usually the best match is kept
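
To make the comparison of experimental and theoretical spectra concrete, here is a minimal sketch that generates singly charged b- and y-ion m/z values for candidate peptides, counts how many fall on an observed peak, and keeps the best candidate. The shared-peak count is a deliberately naive stand-in for a real search engine score; the 0.5 Da tolerance and the function names are assumptions for this example.

```python
# Monoisotopic residue masses in Da (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.007276


def fragment_ions(peptide):
    """Return (b_ions, y_ions) as lists of singly charged m/z values."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b_ions, y_ions = [], []
    for i in range(1, len(peptide)):
        b_ions.append(sum(masses[:i]) + PROTON)          # N-terminal fragment
        y_ions.append(sum(masses[i:]) + WATER + PROTON)  # C-terminal fragment
    return b_ions, y_ions


def count_matches(peaks, peptide, tol=0.5):
    """Number of theoretical b/y ions with an observed peak within tol."""
    b_ions, y_ions = fragment_ions(peptide)
    return sum(1 for t in b_ions + y_ions
               if any(abs(t - mz) <= tol for mz in peaks))


def best_match(peaks, candidates, tol=0.5):
    """Keep only the top-scoring candidate, as slide 16 describes."""
    return max(candidates, key=lambda pep: count_matches(peaks, pep, tol))
```

Real engines replace `count_matches` with their own score (cross-correlation, MOWSE, hyperscore and so on) and restrict candidates to peptides whose precursor mass fits the spectrum.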

  17. Search engines Taken from Nesvizhskii, J Proteomics, 2010

  18. Search engines Taken from Nesvizhskii, J Proteomics, 2010

  19. The most popular algorithms • MASCOT (Matrix Science) • http://www.matrixscience.com • SEQUEST (Scripps, Thermo Fisher Scientific) • http://fields.scripps.edu/sequest • X!Tandem (The Global Proteome Machine Organization) • http://www.thegpm.org/TANDEM • OMSSA (NCBI) • http://pubchem.ncbi.nlm.nih.gov/omssa/

  20. Overall concept of scores and cut-offs Incorrect identifications Threshold score Correct identifications False negatives False positives Adapted from: www.proteomesoftware.com – Wiki pages

  21. Playing with probabilistic cut-off scores higher stringency identifications false positives
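
The trade-off on slides 20 and 21 can be sketched with a toy example: given scored identifications whose correctness is known, count false positives and false negatives at a chosen threshold. The scores below are made up for illustration, and in a real experiment the true labels are of course not available.

```python
def confusion_at_threshold(scored_ids, threshold):
    """scored_ids: list of (score, is_correct) pairs.

    Identifications with score >= threshold are accepted.
    """
    accepted = [ok for score, ok in scored_ids if score >= threshold]
    rejected = [ok for score, ok in scored_ids if score < threshold]
    return {
        "true_positives": sum(accepted),
        "false_positives": len(accepted) - sum(accepted),
        "false_negatives": sum(rejected),
        "true_negatives": len(rejected) - sum(rejected),
    }


# Hypothetical scores: incorrect IDs tend to score low, correct ones high,
# but the two distributions overlap.
toy_ids = [(10, False), (20, False), (25, True), (30, False),
           (35, True), (50, True), (60, True)]

lenient = confusion_at_threshold(toy_ids, 22)
strict = confusion_at_threshold(toy_ids, 32)
```

With these toy numbers, raising the threshold from 22 to 32 removes the one false positive at the cost of one false negative, which is the effect of higher stringency that the slide depicts.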

  22. SEQUEST • Very well established search engine • Can be used for MS/MS (PFF) identifications • Based on a cross-correlation score (includes experimental peak height) • Published core algorithm (patented, licensed to Thermo Fisher Scientific) • Provides preliminary score (Sp), rank, cross-correlation score (XCorr), and score difference between the top two ranks (deltaCn, ΔCn) • Thresholding is up to the user, and is commonly done per charge state • Many extensions exist to perform a more automatic validation of results • [Formulas for XCorr and deltaCn: slide images not preserved]

  23. Search engines: Sequest • XCorr is high if the direct comparison between the experimental and theoretical spectra is significantly greater than the background • deltaCn measures how good the XCorr is relative to the next best match
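
The formulas for XCorr and deltaCn did not survive in this transcript. The forms below are the ones commonly quoted in the SEQUEST literature and are given here as an assumption about what the slide showed: $x$ is the processed experimental spectrum, $y$ the theoretical spectrum, $R_\tau$ their cross-correlation at offset $\tau$, and the ±75 offset window is the commonly cited choice for the background.

```latex
R_\tau = \sum_i x_i \, y_{i+\tau},
\qquad
\mathrm{XCorr} = R_0 - \frac{1}{151} \sum_{\tau=-75}^{+75} R_\tau

\Delta C_n = \frac{\mathrm{XCorr}_1 - \mathrm{XCorr}_2}{\mathrm{XCorr}_1}
```

Here $\mathrm{XCorr}_1$ and $\mathrm{XCorr}_2$ are the scores of the best and second-best candidates for the same spectrum, so a $\Delta C_n$ close to 1 means the top hit stands well clear of the runner-up.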

  24. Search engines: Mascot • Very well established search engine • Can do MS (PMF) and MS/MS (PFF) identifications • Based on the MOWSE score • Unpublished core algorithm (trade secret) • Predicts an a priori threshold score that identifications need to pass • From version 2.2, Mascot allows integrated decoy searches • Provides rank, score, threshold and expectation value per identification • Customizable confidence level for the threshold score
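
Slide 24 mentions integrated decoy searches without saying how the results are used. One common use, assumed here rather than taken from the slides, is to estimate the false discovery rate at a score threshold as the ratio of decoy hits to target hits above that threshold. The function name and the example scores are hypothetical.

```python
def estimate_fdr(target_scores, decoy_scores, threshold):
    """Simple target-decoy FDR estimate: decoy hits / target hits at threshold."""
    n_target = sum(1 for s in target_scores if s >= threshold)
    n_decoy = sum(1 for s in decoy_scores if s >= threshold)
    if n_target == 0:
        return 0.0  # nothing accepted, so nothing to be wrong about
    return n_decoy / n_target


# Hypothetical best-match scores from a target and a decoy search.
targets = [50, 40, 30, 20]
decoys = [25, 10]
fdr_at_22 = estimate_fdr(targets, decoys, 22)  # 1 decoy / 3 targets
```

Other estimators exist (for example for concatenated target-decoy databases), so this should be read as one simple variant and not as Mascot's own calculation.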

  25. Search engines: Mascot www.matrixscience.com

  26. Search engines: X!Tandem • Open source search engine • Can be used for MS/MS experiments • Based on a hyperscore that only takes into account b and y ions • Published core algorithm and it is freely available • Fast and able to handle PTMs in an iterative fashion • Used as an auxiliary search engine • by-Score = sum of intensities of peaks matching b-type or y-type ions • [Formula for HyperScore: slide image not preserved]
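
The HyperScore formula is missing from this transcript. The sketch below uses the form commonly given in descriptions of X!Tandem, in which the by-Score is multiplied by the factorials of the numbers of matched b and y ions. Treat it as an assumption about the slide's content; the function name and inputs are illustrative only.

```python
import math


def hyperscore(matched_b_intensities, matched_y_intensities):
    """by-Score times N_b! times N_y!, for peaks matching b- and y-type ions."""
    by_score = sum(matched_b_intensities) + sum(matched_y_intensities)
    n_b = len(matched_b_intensities)
    n_y = len(matched_y_intensities)
    return by_score * math.factorial(n_b) * math.factorial(n_y)
```

For example, two matched b ions with intensities 1.0 and 2.0 and one matched y ion with intensity 3.0 give a by-Score of 6.0 and a hyperscore of 6.0 × 2! × 1! = 12.0. The factorial terms reward candidates that explain many fragment ions, not just a few intense peaks.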

  27. Search engines: OMSSA • Open source search engine • Can be used for MS/MS experiments • Relies on a Poisson distribution • Published core algorithm and it is freely available • Provides an expectancy score, similar to the BLAST E-value • Very good performance in comparison with the others • Used as an auxiliary search engine
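
The idea behind a Poisson-based expectation value can be sketched as follows: if random matches between a spectrum and an unrelated peptide follow a Poisson distribution with mean `mu`, the chance of seeing at least `k` matches is the Poisson tail, and multiplying by the number of candidate peptides gives an E-value in the spirit of BLAST. This is a simplified illustration with hypothetical parameter names, not OMSSA's actual model, which derives the mean from properties of the spectrum and the search settings.

```python
import math


def poisson_tail(k, mu):
    """P(X >= k) for X ~ Poisson(mu)."""
    cdf_below_k = sum(math.exp(-mu) * mu ** i / math.factorial(i)
                      for i in range(k))
    return 1.0 - cdf_below_k


def e_value(k_matches, mu, n_candidates):
    """Expected number of candidates reaching k_matches by chance alone."""
    return n_candidates * poisson_tail(k_matches, mu)
```

A lower E-value means a match is less likely to be random, so identifications are typically accepted below a chosen cut-off.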

  28. MS proteomics: peptide IDs and protein IDs UniProt IPI RefSeq sequence database peptides Search engine TDMDNQIVVSDYAQMDR LFDQAFGLPR AKPLMELIER DESTNVDMSLAQR DIVVQETMEDIDK NGMFFSTYDR GTAGNALMDGASQL MS/MS spectra So far, we have actually identified peptides, not proteins proteins

  29. MS proteomics: peptide IDs and protein IDs peptides proteins IPI00302927 IPI00025512 IPI00002478 IPI00185600 IPI00014537 IPI00298497 IPI00329236 IPI00002232 TDMDNQIVVSDYAQMDRTW LFDQAFGLPR AKPLMELIER DESTNVDMSLAQR DIVVQETMEDIDK NGMFFSTYDR GTAGNALMDGASQL Protein Inference is complex!!

  30. PROTEIN INFERENCE

  31. Intermezzo: Protein inference The minimal and maximal explanatory sets peptide a b c d proteins prot X x x prot Y x prot Z x x x { Minimal set Occam The Truth peptide a b c d proteins prot X x x prot Y x prot Z x x x { Maximal set anti-Occam
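
The minimal and maximal explanatory sets can be illustrated with a small sketch. The peptide-to-protein assignments below are hypothetical, because the column positions of the slide's table are lost in this transcript. The minimal set is approximated with a greedy set cover, which is a heuristic and does not always find the true minimum.

```python
def minimal_protein_set(protein_to_peptides):
    """Greedy set cover: repeatedly pick the protein explaining most uncovered peptides."""
    uncovered = set().union(*protein_to_peptides.values())
    chosen = []
    while uncovered:
        # Sorting first makes tie-breaking deterministic (alphabetical).
        best = max(sorted(protein_to_peptides),
                   key=lambda p: len(protein_to_peptides[p] & uncovered))
        chosen.append(best)
        uncovered -= protein_to_peptides[best]
    return chosen


def maximal_protein_set(protein_to_peptides):
    """Every protein matched by at least one identified peptide."""
    return sorted(p for p, peps in protein_to_peptides.items() if peps)


# Hypothetical mapping (an assumption, not the slide's actual table).
example = {
    "prot X": {"a", "b"},
    "prot Y": {"b"},
    "prot Z": {"b", "c", "d"},
}
minimal = minimal_protein_set(example)   # ['prot Z', 'prot X']
maximal = maximal_protein_set(example)   # ['prot X', 'prot Y', 'prot Z']
```

In this toy case prot Y is supported only by a shared peptide, so the Occam-style minimal set drops it while the maximal set keeps it; as the slide suggests, the truth may lie between the two.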

  32. Intermezzo: Protein inference Slide from J. Cottrell, Matrix Science

  33. Protein inference A B C D

  34. Protein inference A B C D

  35. Protein inference A B C D

  36. Protein inference A B C D

  37. Protein inference A B C D

  38. Protein inference A B C D

  39. Protein inference A B C D

  40. Protein inference A B C D

  41. Protein inference A B C D Unambiguous peptide
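
An unambiguous peptide, as highlighted on slide 41, is one that maps to a single protein. A minimal sketch for finding such peptides, using a hypothetical mapping since the slide's figure is not preserved:

```python
def unambiguous_peptides(peptide_to_proteins):
    """Peptides that map to exactly one protein."""
    return sorted(pep for pep, prots in peptide_to_proteins.items()
                  if len(set(prots)) == 1)


# Hypothetical peptide-to-protein assignments for proteins A to D.
mapping = {
    "pep1": ["A"],
    "pep2": ["A", "B"],
    "pep3": ["B", "C", "D"],
    "pep4": ["D"],
}
unique = unambiguous_peptides(mapping)  # ['pep1', 'pep4']
```

Here proteins A and D each have direct evidence, while B and C are supported only by shared peptides.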

  42. OTHER APPROACHES TO PERFORM MS/MS IDENTIFICATION

  43. Three types of MS/MS identification • Protein database based comparison: compare the experimental spectrum with a theoretical spectrum derived from a database sequence • Sequential comparison (de novo approaches): compare a de novo sequence derived from the experimental spectrum with a database sequence • Spectral comparison: compare the experimental spectrum with an experimental spectrum from a spectral library. Modified from: Eidhammer, Flikka, Martens, Mikalsen – Wiley 2007

  44. De novo approaches Example of a manual de novo of an MS/MS spectrum No more database necessary to extract a sequence! Algorithms Lutefisk Sherenga PEAKS PepNovo … References Dancik 1999, Taylor 2000 Fernandez-de-Cossio 2000 Ma 2003, Zhang 2004 Frank 2005, Grossmann 2005 …

  45. Three types of MS/MS identification • Protein database based comparison: compare the experimental spectrum with a theoretical spectrum derived from a database sequence • Sequential comparison (de novo approaches): compare a de novo sequence derived from the experimental spectrum with a database sequence • Spectral comparison: compare the experimental spectrum with an experimental spectrum from a spectral library. Modified from: Eidhammer, Flikka, Martens, Mikalsen – Wiley 2007

  46. Spectral searching • Concept: to compare experimental spectra with other experimental spectra • There are many spectral libraries publicly available (for instance, from NIST) • Custom ‘search engines’ have been developed: • SpectraST (TPP) • X!Hunter (GPM) • It has been claimed that these searches are more sensitive than sequence database approaches
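
Comparing one experimental spectrum with another can be sketched as a normalized dot product (cosine similarity) over binned peaks. This is a generic similarity measure chosen for illustration; it is not claimed to be the scoring used by SpectraST or X!Hunter, and the 1.0 m/z bin width is an assumption.

```python
import math


def bin_spectrum(peaks, bin_width=1.0):
    """Sum intensities of (mz, intensity) peaks into fixed-width m/z bins."""
    binned = {}
    for mz, intensity in peaks:
        key = int(mz / bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def cosine_similarity(peaks_a, peaks_b, bin_width=1.0):
    """Normalized dot product of two binned spectra, between 0 and 1."""
    a = bin_spectrum(peaks_a, bin_width)
    b = bin_spectrum(peaks_b, bin_width)
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0
```

A query spectrum would be compared against each library spectrum with a compatible precursor mass, and the library entry with the highest similarity kept as the candidate identification.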

  47. Spectral searching (2) http://peptide.nist.gov/

  48. COMBINING DIFFERENT SEARCH APPROACHES

  49. Multi-stage peptide identification strategy Goal: “Squeeze” your good quality experimental spectra Taken from Nesvizhskii, J Proteomics, 2010

  50. PROTEIN SEQUENCE DATABASES
