Use of SEQUEST search results with ProteoRed.org MIAPE Extractor. Sp-HPP. HPP. La Cristalera, Miraflores de la Sierra, 10-11 December 2012.
Index
• A working workflow to extract MIAPE information from Proteome Discoverer 1.3 search results using the ProteoRed MIAPE Toolkit. Óscar Gallardo, Joan Villanueva, Montserrat Carrascal, Joaquín Abián
• Data-dependent acquisition using an inclusion list (IL). Joan Villanueva, Óscar Gallardo, Joaquín Abián, Montserrat Carrascal
Mascot Workflow. [Workflow diagram. Stages: Mass Spectra, Identification, Output file, MIAPE Extractor, MIAPE Generation. Elements: RAW, MGF, Mascot, mzIdentML, MIAPE Generator Tool, MIAPE MS, MIAPE MSI.] Ó. Gallardo
Proteome Discoverer Workflow. [Workflow diagram. Stages: Mass Spectra, Identification, Output file, MIAPE Extractor. Elements: RAW, MSF, MGF, mzIdentML, Proteome Discoverer.] Ó. Gallardo
Proteome Discoverer Workflow. [Diagram: RAW to MGF. Label: (GPL) LP-CSIC/UAB 2011-2012.] Ó. Gallardo
Proteome Discoverer Workflow. [Diagram: RAW to MGF. Elements: Proteome Discoverer, Discoverer Daemon.] Ó. Gallardo
Proteome Discoverer Workflow. [Workflow diagram. Stages: Mass Spectra, Identification, Output file, MIAPE Extractor. Elements: RAW, MSF, MGF, mzIdentML, Proteome Discoverer, Discoverer Daemon.] Ó. Gallardo
Proteome Discoverer Workflow. [Diagram: MSF to mzIdentML with ProCon 0.9.152. Label: A. Medina, August 2012.] Ó. Gallardo
Proteome Discoverer Workflow: ERROR with ProCon 0.9.162
[Diagram: MSF + .Prot.XML converted to mzIdentML with ProCon 0.9.162.]
Console output:
    ...........................................................67% finished
    .....................TaxID for organismName unknown: Leptospira interrogans serogroup Icterohaemorrhagiae serovar Lai
    .TaxID for organismName unknown: Sphaerochaeta globosa
    ...TaxID for organismName unknown: Leptospira borgpetersenii serovar
    .....
    MyProgressBar for getSpectrumIdentificationListAndProteinDetectionListAndPeptideEvidences for SEQ finished
    SequenceCollection written
    CV term for unknown modification Deamidated / +0.984 Da (N, Q) not found.
    CV term for unknown modification Acetyl / +42.011 Da (Any NTerminus) not found.
    Exception in thread "AWT-EventQueue-0" java.lang.NullPointerException
        at de.mpc.Prot2MzIdent.AParamHandler.createThresholdParameterList(AParamHandler.java:526)
        at de.mpc.Prot2MzIdent.PD12ToMzIdentML.getProteinDetectionProtocol(PD12ToMzIdentML.java:851)
• ProCon 0.9.162 could not correctly interpret the controlled vocabulary (CV) that Proteome Discoverer uses to identify post-translational modifications (PTMs).
• ProCon 0.9.162 also had problems with its internal array references.
Ó. Gallardo
Proteome Discoverer Workflow: ERROR with ProCon 0.9.163
[Diagram: MSF + .Prot.XML converted to mzIdentML with ProCon 0.9.163, which replaced 0.9.162.]
Console output:
    ...........................................................67% finished
    .....................TaxID for organismName unknown: Leptospira interrogans serogroup Icterohaemorrhagiae serovar Lai
    .TaxID for organismName unknown: Sphaerochaeta globosa
    ...TaxID for organismName unknown: Leptospira borgpetersenii serovar
    .....
    MyProgressBar for getSpectrumIdentificationListAndProteinDetectionListAndPeptideEvidences for SEQ finished
    SequenceCollection written
    CV term for unknown modification Deamidated / +0.984 Da (N, Q) not found.
    CV term for unknown modification Acetyl / +42.011 Da (Any NTerminus) not found.
    Exception in thread "AWT-EventQueue-0" java.lang.NullPointerException
        at de.mpc.Prot2MzIdent.AParamHandler.createThresholdParameterList(AParamHandler.java:526)
        at de.mpc.Prot2MzIdent.PD12ToMzIdentML.getProteinDetectionProtocol(PD12ToMzIdentML.java:851)
• ProCon 0.9.163 could not identify post-translational modifications (PTMs) correctly and marked all of them as "unknown modification" in the resulting mzIdentML file.
• ProCon 0.9.163 still had problems with its internal array references.
Ó. Gallardo
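The "CV term for unknown modification ... not found" messages concern modification labels of the form "Name / +delta Da (sites)". As an illustration only (this is not how ProCon works internally), the sketch below parses such a label and looks the name up in a small hand-made table of Unimod accessions. The table, the regular expression, the function name `lookup_modification` and the 0.01 Da tolerance are all assumptions made for this example.

```python
import re

# Tiny illustrative table: name -> (Unimod accession, monoisotopic delta in Da).
# Only the two modifications named in the log are listed here.
UNIMOD = {
    "Acetyl": ("UNIMOD:1", 42.010565),
    "Deamidated": ("UNIMOD:7", 0.984016),
}

# Matches labels such as "Deamidated / +0.984 Da (N, Q)" (format taken from the log).
LABEL = re.compile(
    r"^(?P<name>\S+) / (?P<delta>[+-]\d+\.\d+) Da \((?P<sites>[^)]*)\)$"
)


def lookup_modification(label, tolerance=0.01):
    """Return the accession for a modification label, or None if it is unknown."""
    match = LABEL.match(label)
    if match is None:
        return None
    entry = UNIMOD.get(match.group("name"))
    if entry is None:
        return None
    accession, mass = entry
    # Reject a name match whose reported mass shift disagrees with the table.
    if abs(float(match.group("delta")) - mass) > tolerance:
        return None
    return accession
```

Checking the mass shift as well as the name guards against two tools using the same name for different modifications.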
Proteome Discoverer Workflow. [Diagram: MSF + .Prot.XML converted to mzIdentML with ProCon 0.9.164, which replaced 0.9.163.] Ó. Gallardo
Proteome Discoverer Workflow. [Workflow diagram. Stages: Mass Spectra, Identification, Output file, MIAPE Extractor, MIAPE Generation. Elements: RAW, MSF, .Prot.XML, MGF, mzIdentML, Proteome Discoverer, Discoverer Daemon, MIAPE Generator Tool.] Ó. Gallardo
Proteome Discoverer Workflow
[Workflow diagram. Stages: Mass Spectra, Identification, Output file, MIAPE Extractor, MIAPE Generation. Elements: RAW, MSF, .Prot.XML, MGF, mzIdentML, Proteome Discoverer, Discoverer Daemon, MIAPE Generator Tool.]
Console output:
    ...........................................................67% finished
    .....................TaxID for organismName unknown: Leptospira interrogans serogroup Icterohaemorrhagiae serovar Lai
    .TaxID for organismName unknown: Sphaerochaeta globosa
    ...TaxID for organismName unknown: Leptospira borgpetersenii serovar
    .....
    MyProgressBar for getSpectrumIdentificationListAndProteinDetectionListAndPeptideEvidences for SEQ finished
    SequenceCollection written
    CV term for unknown modification Deamidated / +0.984 Da (N, Q) not found.
    CV term for unknown modification Acetyl / +42.011 Da (Any NTerminus) not found.
• Spectra IDs did not match between the MGF file and the mzIdentML file.
• [Diagram: IDmzid and IDmgf linked through PepMS, Charge and RT.]
Ó. Gallardo
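The slide links the two ID sets through precursor mass, charge and retention time. A minimal sketch of that idea follows, assuming the MGF spectra carry the standard TITLE, PEPMASS, CHARGE and RTINSECONDS fields and that the identifications have already been read from the mzIdentML file into plain dictionaries. The dictionary keys `id`, `mz`, `charge` and `rt`, the function names and the rounding precision are hypothetical choices for this example, not part of MIAPE Extractor.

```python
def parse_mgf(text):
    """Collect the header fields of each BEGIN IONS / END IONS block."""
    spectra = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line == "BEGIN IONS":
            current = {}
        elif line == "END IONS":
            if current is not None:
                spectra.append(current)
            current = None
        elif current is not None and "=" in line:
            # Header lines look like KEY=value; peak lines have no "=".
            key, value = line.split("=", 1)
            current[key.upper()] = value
    return spectra


def spectrum_key(pepmass, charge, rt_seconds, mz_decimals=3, rt_decimals=0):
    """Build a matching key; values close to a rounding boundary may be missed."""
    return (
        round(float(pepmass), mz_decimals),
        int(charge),
        round(float(rt_seconds), rt_decimals),
    )


def mgf_key(spectrum):
    pepmass = spectrum["PEPMASS"].split()[0]  # PEPMASS may carry an intensity
    charge = spectrum["CHARGE"].rstrip("+")   # CHARGE is written as e.g. "2+"
    return spectrum_key(pepmass, charge, spectrum["RTINSECONDS"])


def map_ids(mgf_spectra, identifications):
    """Map each identification ID to the TITLE of the matching MGF spectrum."""
    index = {mgf_key(s): s["TITLE"] for s in mgf_spectra}
    return {
        ident["id"]: index.get(
            spectrum_key(ident["mz"], ident["charge"], ident["rt"])
        )
        for ident in identifications
    }
```

Identifications without a matching spectrum map to `None`, which makes unmatched IDs easy to list and inspect.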
Proteome Discoverer Workflow. [Workflow diagram. Stages: Mass Spectra, Identification, Output file, MIAPE Extractor, MIAPE Generation. Elements: RAW, MSF, .Prot.XML, MGF, mzIdentML, Proteome Discoverer, Discoverer Daemon, IDs linked through PepMS, Charge and RT, MIAPE Generator Tool, MIAPE MS, MIAPE MSI.] Ó. Gallardo
Proteome Discoverer Workflow. [Workflow diagram. Stages: Mass Spectra, Identification, Output file, MIAPE Extractor, MIAPE Generation. Elements: RAW, MSF, .Prot.XML, MGF, mzIdentML, Proteome Discoverer, Discoverer Daemon, MIAPE Generator Tool, MIAPE MS, MIAPE MSI.] Ó. Gallardo
Work in Progress
• Uploading of MSF + mzIdentML files through MIAPE Extractor is not yet automated.
• Although we can generate MIAPE data from SEQUEST search results, the MIAPE Toolkit does not work well with these data at the analysis stage: we cannot retrieve the identified proteins, there are problems with the SEQUEST score fields, …
• We are working on a script to automate data extraction with MIAPE Extractor: MIAPE Extractor Automator v.2.
• Development of MIAPE Extractor and the MIAPE Generator Tool continues, with improvements in each version.
• Export of Prot.XML files from the MSF files, and the subsequent conversion of MSF + Prot.XML files to mzIdentML files, is not automated.
• ProCon still has some errors, is very slow with large files, and is memory-hungry.
• The ProCon developers are working on a new version that does not need Prot.XML files, which should make the conversion much faster and easier.
Ó. Gallardo
Index
• A working workflow to extract MIAPE information from Proteome Discoverer 1.3 search results using the ProteoRed MIAPE Toolkit. Óscar Gallardo, Joan Villanueva, Montserrat Carrascal, Joaquín Abián
• Data-dependent acquisition using an inclusion list (IL). Joan Villanueva, Óscar Gallardo, Joaquín Abián, Montserrat Carrascal
Data-dependent acquisition with inclusion list
RATIONALE FOR USING DDP WITH AN INCLUSION LIST (IL):
a.- Most target proteins assigned to the groups of the shotgun project were not detected using shotgun approaches.
b.- The few detected peptides were not optimal for MRM analysis (not proteotypic, containing Met/Cys, or with missed cleavages).
c.- Preliminary tests at LP-CSIC/UAB using targeted approaches, which require a limited list of peptides (the list of target m/z values must be restricted to 20-30), failed to detect the target proteins.
• DDP with an inclusion list increases the probability of detecting low-abundance proteins/peptides without the constraints of targeted approaches.
16 PROTEINS SELECTED FOR THE INCLUSION LIST
- 6 proteins assigned to the LP-CSIC/UAB laboratory
- 10 proteins assigned to MRM labs and not detected by shotgun
J. Villanueva
Procedure: Data Dependent with IL
To obtain the inclusion list:
1.- All tryptic peptides of 7-25 AA.
2.- m/z values assuming z=2 and z=3 for all peptides.
3.- Filter duplicate m/z values (software requirement).
Number of m/z values in the inclusion list: 556 (number of peptides: 282)
Samples: CCD18 and MCF7
Aliquot: 250 µg protein
OffGel (12 fractions)
FASP digestion
LC-MS/MS (DDP, IL, Targeted)
Proteome Discoverer
J. Villanueva
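The three steps for building the inclusion list can be sketched as follows. With two charge states, 282 peptides give at most 564 m/z values, which is consistent with the 556 reported after removing duplicates. The slide does not give the digestion rules, any fixed modifications or the precision used to detect duplicates, so the sketch assumes fully tryptic peptides with no missed cleavages, no cleavage before proline, unmodified residues and m/z values rounded to two decimals. All function names are hypothetical.

```python
import re

# Standard monoisotopic residue masses (Da), unmodified amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def tryptic_peptides(sequence, min_len=7, max_len=25):
    """Step 1: cleave after K/R (assumed: not before P), keep 7-25 residues."""
    pieces = re.split(r"(?<=[KR])(?!P)", sequence)
    return [p for p in pieces if min_len <= len(p) <= max_len]


def mz(peptide, z):
    """Step 2: m/z of the peptide carrying z protons."""
    mass = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
    return (mass + z * PROTON) / z


def inclusion_list(proteins, charges=(2, 3), decimals=2):
    """Step 3: collect m/z values and drop duplicates after rounding."""
    seen = set()
    entries = []
    for sequence in proteins:
        for peptide in tryptic_peptides(sequence):
            for z in charges:
                value = round(mz(peptide, z), decimals)
                if value not in seen:
                    seen.add(value)
                    entries.append((value, z, peptide))
    return sorted(entries)
```

Isobaric peptides, for example those differing only by Leu/Ile, yield the same m/z and collapse to a single entry, which is one way duplicates arise in such a list.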
DATA DEPENDENT WITH INCLUSION LIST: LTQ-ORBITRAP. Sample VH: MCF-7. [Figure: MS traces for Offgel Fr6 and Offgel Fr7.] J. Villanueva
RESULTS: Inclusion list and targeted
DATA PROCESSING FOR IL DATA:
1.- MGF generation with PD v1.3
2.- Database search: Proteome Discoverer and Mascot
3.- FDR 5%
RESULT: Data dependent with IL: 282 listed peptides undetected (same as in the targeted experiments)
• Low amount of target proteins
• Proteins not expressed in these cells
J. Villanueva
Chromosome 16 protein description: Data Dependent Analysis
DATA PROCESSING:
1.- MGF generation with PD v1.3
2.- Database search: Proteome Discoverer (and Mascot)
3.- Search results and filtering (1% FDR): MIAPE Extractor (Data Inspector Module) and Proteome Discoverer.
Work in progress:
MIAPE Extractor: the data could be uploaded and the FDR process could be completed.
Data Inspector Module: detected errors still to be solved: unable to extract protein information from SEQUEST data.
J. Villanueva
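The slides do not say how the FDR was estimated. One common approach, assumed here purely for illustration, is a target-decoy search in which the FDR at a given score cutoff is estimated as decoy hits divided by target hits. The sketch below keeps the largest set of top-scoring matches whose estimated FDR stays within the limit. The input format, a list of `(score, is_decoy)` pairs, and the function name are hypothetical, and this is not a description of the MIAPE Extractor or Proteome Discoverer implementations.

```python
def filter_by_fdr(psms, max_fdr=0.01):
    """Return target matches above the lowest cutoff with decoys/targets <= max_fdr.

    psms: iterable of (score, is_decoy) pairs, where a higher score is better.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = 0
    decoys = 0
    keep = 0  # number of top-ranked matches to keep
    for position, (_score, is_decoy) in enumerate(ranked, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Remember the deepest rank at which the estimated FDR is acceptable.
        if targets and decoys / targets <= max_fdr:
            keep = position
    return [p for p in ranked[:keep] if not p[1]]
```

Score ties are not treated specially here, and a search engine whose score is better when lower would need the sort order reversed.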
Work in progress...
Number of proteins that passed the 1% FDR filter:
1.- Significant differences between search algorithms.
An in-depth review of the data is needed.
J. Villanueva
Use of SEQUEST search results with ProteoRed.org MIAPE Extractor • Sp-HPP • HPP. Thank you for your attention. Any questions? La Cristalera, Miraflores de la Sierra, 10-11 December 2012