Methodology for MALD-TOF data analysis

Methodology for MALD-TOF data analysis Once the peptide mass data of individual protein is obtained from MALDI-TOF, the next step is for identification of protein through database search. The database matching for query sequence depends on lot of parameters, which need to optimized to come up with the best fit • Related Los: Protein database search tools • > Prior Viewing – IDD:26 Spot picking, IDD: 28 In solution digestion, IDD:29 Matrix preparation for MALDIanalysis, IDD: 31 MALDI-TOF data analysis, • >Future Viewing –IDD:43 MALDI Molecular weight application, IDD:44 MALDI PTM application. • Course Name: MALD-TOF data analysis • Level(UG/PG): PG • Author(s):Dinesh Raghu, Vinayak Pachapur • Mentor: Dr. Sanjeeva Srivastava *The contents in this ppt are licensed under Creative Commons Attribution-NonCommercial-ShareAlike 2.5 India license

Definitions and Keywords 1 1. Peptide Mass Fingerprinting: Probable protein identification method, which compares peptide mass values of protein analyte to a database of known proteins to arrive at its probable identity in the form of the “best fit”. 2. Spectrum from MALDI analysis: The peptide fragments generated after proteolytic digestion are analyzed by MALDI-TOF and the data generated is represented in the form of spectrum. The spectrum data can be used for online sequence databases search. 3. Online search: Several open source databases are available online, the protein data is updated on regular basic which allow analysis of the MS spectrum generated. 4. Open shareware for PMF: Database search algorithms used for comparing experimental peptide masses against theoretically calculated peptide masses derived by applying “cleavage rules” to large primary sequence protein databases. The open shareware consists of the following fields which need to entered by the user during the search: • Name and Email: Used for identification of search entry and also for e-mailing results page in case of loss of connection without requiring re-entry of data. • Search Title: Used to identify and label search entry and typically includes the name of the protein whose information is required. • Databases: The primary sequence protein databases, including NCBInr and SwissProt against whom the query is run. A contaminants database is also recommended to eliminate contaminants such as keratin, trypsin and BSA. 2 3 4 5

Definitions and Keywords 1 • Taxonomy: the search query to be limited to a particular species or a group of species. • Enzyme: used during sample preparation of analyte before its mass spectrometric analysis. the popular one is trypsin but if any other enzyme is used it’s site specificity is expected to be equal to or better than that of trypsin. • Missed Cleavage Allowed: Occurrence of partial digests during trypsinolysis of analyte protein at one or two Arginine and Lysine sites is a common phenomenon and needs to be accounted for during search against calculated peptide masses. • Modifications: During sample prep for Mass Spec Analysis of proteins, some changes in the mass of specific residues might occur, such as oxidation of methionine, carboxymethyl and cysteine etc. To account for these mass changes, the algorithm allows two types of modifications to be pre-selected- Fixed and Variable. • Fixed Modifications: Modifications that need to be applied collectively across the database to account for change in mass of specific residue/s. Most common fixed modification is the selection of the mass of carboxymethyl over cysteine replacing its mass as 161 Da. • Variable Modifications: These are mass changes suspected to occur during sample handling and accounted for by increasing the number of primary sequences compared against experimental masses. Most common variable modification is the oxidation of methionine residue in the analyte protein. • Protein Mass: Mass of intact protein in the form of a contiguous stretch including all matched peptides. If mass is unknown, this parameter can be left empty and the mass will remain unrestricted. 2 3 4 5

Definitions and Keywords 1 • Peptide Tolerance: This is a parameter associated with accuracy and resolution of the mass spectrometer and is used to account for shifts in isotope spacings. • Mass Values: To specify the type of charge of the analyte being examined by Peptide Mass Fingerprinting, i.e. MH+ , M-H- or if the masses correspond to neutral values like Mr . • Monoisotopic Mass Vs Average Mass Value: Depending upon the mass accuracy of a spectrometer, the experimental masses calculated for identification of analyte by Peptide mass fingerprinting is either chosen to be monoisotopic mass or the average mass of its isotopic elements. The selection of monoisotopic mass rests upon the ability of the instrument to resolve isotopes, and accurately determine peak mass. Average mass is the sum of abundance-weighted masses of all isotopes while the monoisotopic mass is the sum of masses of the most abundant isotope of each element. If the instrument has insufficient mass resolution capabilities combined with poor signal to noise ratio, the peptide mass of experimental values must be selected as being average to provide better identification. 5. Best fit – Score histogram: The “best fit” is defined as the primary identification of the analyte protein made by the database search algorithm representing either the exact protein being analyzed or the protein with the closest primary sequence homology, unusually with equivalent function in a related species. The score histogram depicts the distribution of protein scores for all the hits obtained by the query. 2 3 4 5

Learning objectives 1 • After interacting with this learning object, the learner will be able to: • Prepare in handling the database search tools • Operate on setting up the parameters • Analyze the result output from the database • Assess the troubleshooting steps involved in the experiments. 2 3 4 5

Master Layout 1 • Data input (Slide:6-7) 2 Parameters settings (Slide:8-19) Data output (Slide:20-21) 3 Data analysis (Slide:22-37) 4 5 Display the steps along with images next to them

Description of the action/ interactivity Audio Narration (if any)‏ Step 1: T1: Data input 1 2 3 Take the user throught the slides of IDD: 31 MALDI-TOF data analysis, animate the spectrum data getting converted to be saved into excel sheet. Show the excel sheet ready for the database search. For the data analysis, user can have the data in excel sheet for cut, copy and paste the values in the search window. Or user can upload the saved excel sheet during the database search. 4 5

Description of the action/ interactivity Audio Narration (if any)‏ Step 2: T2: Parameter settings 1 2 3 4 5

Description of the action/ interactivity Audio Narration (if any)‏ Step 2: T2: Parameter settings 1 2 3 Let user open a browsing window. Instruct user to type: www.matrixscience.com and click on enter. Display the matrix science window like in previous slide, please re-draw the image. Let user have control over the icons for selections. Instruct user to click on “Peptide mass fingerprint”. The matrix science is the online database search engine. To start up with the data analysis for peptide masses select the peptide mass fingerprint. 4 5

Description of the action/ interactivity Audio Narration (if any)‏ Step 2: T2: Parameter settings 1 2 3 4 5

Description of the action/ interactivity Audio Narration (if any)‏ Step 2: T2: Parameter settings 1 2 3 Display the “Peptide mass fingerprint” like in previous slide. Let user have control over, giving the details, making the selection on the scroll down options, making a selection over check box, buttons and with browse, start search and reset button. Redraw the figure from previous slide. User must feed in the details and parameters selection for best fit, depending on the background of sample type and source. 4 5

Description of the action/ interactivity Audio Narration (if any)‏ Step 2: T2: Parameter settings 1 2 3 Instruct user to provide the details, like user name, Email and Search title like shown above. Let user have full control to type in the details. The user details are needed, incase during the search if there is any network problem, the data can be mailed to the user. 4 5

Description of the action/ interactivity Audio Narration (if any)‏ Step 2: T2: Parameter settings 1 2 3 Instruct user to select from the scroll options for Database (s), Allow up to missed cleavages and Enzyme. Animate and provide the options (like shown in slide) in scroll down for user to select. When user selects the parameters, audio narration must start. Databases: The primary sequence protein databases, including NCBInr and SwissProt against which the query will run. Enzyme: used during sample preparation of analyte before its mass spectrometric analysis. Missed Cleavage Allowed: Occurrence of partial digests during trypsinolysis of analyte protein at one or two Arginine and Lysine sites 4 5

Description of the action/ interactivity Audio Narration (if any)‏ Step 2: T2: Parameter settings 1 2 3 Instruct user to select from the scroll options for Taxonomy. Animate and provide the options in scroll down for user to select. When user selects the parameters, audio narration must start. Taxonomy: the search query to be limited to a particular species or a group of species for which the sample belongs. 4 5

Description of the action/ interactivity Audio Narration (if any)‏ Step 2: T2: Parameter settings 1 2 3 Fixed Modifications: Modifications that need to be applied collectively across the database to account for change in mass of specific residue/s. Variable Modifications: These are mass changes suspected to occur during sample handling and accounted for by increasing the number of primary sequences compared against experimental masses. Instruct user to select from the scroll options for Fixed and Variable modifications. And arrow buttons to add the selected options from the scroll down menu (like shown in figure). Animate and provide the options in scroll down for user to select. When user selects the parameters, audio narration must start. 4 5

Description of the action/ interactivity Audio Narration (if any)‏ Step 2: T2: Parameter settings 1 2 3 Instruct user to type a value for Protein mass, select peptide tolerance of “+“ or “–” Da/mmu/%/ppm. Now once this selection is done let user select botton for Mass values like shown in figure. Later let user select for button for Monoisotopic. (like shown in figure). Animate and provide the options in scroll down for user to select. When user selects the parameters, audio narration must start. The parameters can be changed depending upon user need. Protein Mass: Mass of intact protein in the form of a contiguous stretch including all matched peptides. Mass Values: To specify the type of charge of the analyte being examined. Monoisotopic Mass Vs Average Mass Value: Depending upon the mass accuracy of a spectrometer, the experimental masses calculated for identification of analyte by Peptide mass fingerprinting is either chosen to be monoisotopic mass or the average mass of its isotopic elements. 4 5

Description of the action/ interactivity Audio Narration (if any)‏ Step 2: T2: Parameter settings 1 2 3 Instruct user to upload the peptide mass data file by clicking on “browse” button and opening the saved file. If not user can copy and paste the peptide mass data in the Query window. Animate scroll options for Report top Hits and a check button for Decoy. animate like shown in figure and provide the options in scroll down for user to select. When user selects the parameters, audio narration must start. User need to feed in the data or can browse and upload the file. For the report to be displayed, user can select the hits from the scroll down menu. Decoy helps to search with same parameter across database. 4 5

Description of the action/ interactivity Audio Narration (if any)‏ Step 2: T2: Parameter settings 1 2 3 After user upload the peptide mass data/ ASCII/MGF files. Highlight “Start search” button for user to click. In case if user likes to change the parameters, highlight “Reset Form” button. Animate like shown in figure. When user selects the parameters, audio narration must start. Once all the parameters are set and mass data is uploaded, user can click the start search button. Depending on the set parameters the search in the database begins. 4 5

Description of the action/ interactivity Audio Narration (if any)‏ Step 3: • T3:Data output 1 Section:1 2 Section:2 3 Section:3 4 5

Description of the action/ interactivity Audio Narration (if any)‏ Step 3: • T3:Data output 1 2 3 In Section 1: the parameters set by user during the search is shown. Section 2: Mascot Score Histogram. The number of protein hits with score is plotted along the graph. Section 3: Summary report. Matched proteins from the database, with the details of important parameters are displayed either in concise format, protein format and even user can export the data. Display the figure from the previous slide. Highlight the mascot search window with 3 sections like shown in the figure. Section 1: Parameters details. Section 2: Mascot Score Histogram. Section 3: Summary report. Instruct user to go through each section with audio narration. If any of the parameters are not set properly, user can go back and select the required parameters and do the search again. 4 5

Description of the action/ interactivity Audio Narration (if any)‏ Step 4: • T4:Data analysis 1 2 3 If in case the search parameters set range is too long or too small, the software gives out error message. Depending on the error message user need to change the setting and do the search again. Like example show above. Display the figure from the previous slide. Section 1: Parameters details. Animate few parameters with red highlight as error and instruct user to carry out the search again by changing the parameters. 4 5

Description of the action/ interactivity Audio Narration (if any)‏ Step 4: • T4:Data analysis 1 2 3 In Section 2: Mascot Score Histogram. The number of protein hits and their score is displayed along the graph. Display Section 2: Mascot Score Histogram. It’s a display user cannot make any changes in this section. 4 5

Description of the action/ interactivity Audio Narration (if any)‏ Step 4: • T4:Data analysis 1 2 3 In Section 3: Concise protein summary report. Matched proteins from the database, with name, mass, score and other details of important parameters as a concise details is given. For individual protein information, the details can be obtained by click on the blue link for each protein. Display the Section 3: Summary report. Instruct user to select “Format As” Concise protein with p-value: <0.5 and “Max. number of hits”: 1-10. animate and provide the options in scroll down menu. When user selects concise protein and clicks on Format as, display like in above fig, with 5 protein lists, their name, mass, score, expect value, matches. Redraw the above figure. Instruct user to click on protein name (Link highlighted in Blue) for more information. 4 5

Description of the action/ interactivity Audio Narration (if any)‏ Step 4: • T4:Data analysis 1 2 3 In protein view, matching of the query peptide to the protein sequence in the database is shown. What is the sequence, at what region matching occurs, what is the expected and calculated values of the query peptide is show. Display the protein view in other window, like show above. Display the protein sequence, highlight few letters within the sequence in bold red as matched peptide and display the information of matched peptide with start and end, observed, Mass range(expected), mass range(calculated) Delta value, Miss and the sequence. 4 5

Description of the action/ interactivity Audio Narration (if any)‏ Step 4: • T4:Data analysis 1 2 3 In protein summary report, index display the very concise details followed by the details like in protein view. matching of the query peptide to the protein sequence in the database is shown. What is the sequence, at what region matching occurs, what is the expected and calculated values of the query peptide is show. Display the Section 3: Summary report. Instruct user to select “Format As” Protein Summary with p-value: <0.5 and “Max. number of hits”: 1-10. animate and provide the options in scroll down menu. When user selects protein summary and clicks on Format as, display like in above figure, with 5 protein details in the index with Accession, mass, score and description. In “Results list” display like above with Accession, mass, score, Expect, Matches and details like in protein view from previous slide. 4 5

Description of the action/ interactivity Audio Narration (if any)‏ Step 4: • T4:Data analysis 1 2 3 Display the Section 3: Summary report. Instruct user to select “Format As” Export search result with p-value: <0.5 and “Max. number of hits”: 1-10. animate and provide the options in scroll down menu. When user selects export search result and clicks on Format as, display like in above figure for button selection option for Export search results, search information, Protein hit information and peptide information with all the option for user selection must be displayed. In export search results, user have all the options for the data information to be stored wrt parameters user selects. The result can be exported and saved, the data can be later taken for pathway analysis and literature study. In case user has done MS/Ms of a particular protein spot the search parameters are all the same with inclusion of few more parameters for the best fit. 4 5

Description of the action/ interactivity Audio Narration (if any)‏ Step 4: • T4:Data analysis 1 2 3 4 5 Re-draw the image with the options. Added parameters for MS/MS

Description of the action/ interactivity Audio Narration (if any)‏ Step 4: • T4:Data analysis 1 2 The MS/MS data analysis shareware has some extra inputs such as Quantitation, MS/MS tolerance, peptide charge, instrument etc. in addition to the fields for PMF and rest other parameters are similar to that of Peptide mass fingerprint. Let user to open a browsing window. Instruct user to type: www.matrixscience.com and click on enter. Display the matrix science window like slide:8, please re-draw the image. Let user have control over the icons for selections. Instruct user to click on “MS/MS ion search”. Display the image from the previous slide. Each of the fields must be filled in as shown with some requiring selection using user control. 3 4 5

Description of the action/ interactivity Audio Narration (if any)‏ Step 4: • T4:Data analysis 1 2 3 Provide user the options for Quantitation like shown above. Let user have the option to go through the option for selection and makes a choice of his own. The quantitation are the different types of process carried out before the MS analysis. Quantitation of the extracted protein, peptide, mixture of both, extracted from the different instruments before the MS analysis. For example: iTRAQ, Tandem mass Tags for fixed mass/charge values, ICAT, ICPL for precursors within a single data set, SILAC for ion fragments peaks, XICs for precursors in multiple data set etc. For more information on iTRAQ, ICAT, ICPL and SILAC follow the there respective IDD. 4 5

Description of the action/ interactivity Audio Narration (if any)‏ Step 4: • T4:Data analysis 1 2 3 Provide user the options for Instrument like shown above. Let user have the option to go through the option for selection and makes a choice of his own. The instrument option shows the list of instrument used to acquire the data. 4 5

Description of the action/ interactivity Audio Narration (if any)‏ Step 4: • T4:Data analysis 1 2 3 All the parameters added must be relevant to the sample to get the best hit from the database search. Let user provide the required information for the parameters and select the “Start Search” button to do the search. 4 5

Step 4: • T4:Data analysis Description of the action/ interactivity Audio Narration (if any)‏ Mascot Search Results User: Abhinav Email: abhinav@gmail.com Search title: Sample protein Database: NCBInr Taxonomy: Mammalia Time stamp: 2 June 2010 at 17:45:35 GMT Protein hits: 1 2 Mascot Score Histogram >5% Random match 3 <5% Random match The Tandem MS protein analysis is used to obtain protein identities from each of the sequenced peptides. The results page begins with a list of probable protein identities and their respective sources. The score histogram provides details similar to the PMF analysis, with the probability distribution being displayed graphically. The green shaded region is indicative of a match that has greater than 5% chance of being random while the red peak indicates that the chances of a random match is less than 5%. Display the mascot search result window like in figure. This must be zoomed into to clearly depict the report as shown. The red box must appear at the region indicated along with the blue arrow. 4 5

Step 4: Description of the action/ interactivity Audio Narration (if any)‏ • T4:Data analysis Peptide summary report 1 1. gi|31753114 Mass: 30840 Score: 225 Matches: 8(3) Sequences: 3(2) Unknown (protein for IMAGE:5194336) [Homo sapiens] Check to include this hit in error tolerant search Query Observed Mr(expt) Mr(calc) ppm Miss Score Expect Rank Unique Peptide 4492.2200 982.4254 982.4913 -67.02 0 66 0.00036 1 U K.FGEAVWFK.A 5492.2305 982.4464 982.4913 -45.65 0 (40) 0.14 1 U K.FGEAVWFK.A 6492.2348 982.4551 982.4913 -36.79 0 (32) 0.82 1 U K.FGEAVWFK.A 39960.4446 1918.8746 1918.9797 -54.78 0 118 2.8e-09 1 U R.WAMLGALGCVFPELLAR.N + Oxidation (M) 40960.4587 1918.9029 1918.9797 -40.03 0 (48) 0.023 1 U R.WAMLGALGCVFPELLAR.N + Oxidation (M) 44670.6395 2008.8966 2009.0155 -59.19 0 42 0.12 1 U R.LAMFSMFGFFVQAIVTGK.G + Oxidation (M) 451005.4635 2008.9124 2009.0155 -51.29 0 (35) 0.56 1 U R.LAMFSMFGFFVQAIVTGK.G + Oxidation (M) 47676.2986 2025.8741 2025.0104 427 0 (22) 12 1 U R.LAMFSMFGFFVQAIVTGK.G + 2 Oxidation (M) 2 Protein information Peptide information 3 2. gi|47522906 Mass: 60550 Score: 33 Matches: 3(0) Sequences: 2(0) zona pellucida sperm-binding protein 4 [Sus scrofa] Check to include this hit in error tolerant search Query Observed Mr(expt) Mr(calc) ppm Miss Score Expect Rank Unique Peptide 21649.2406 1296.4666 1296.5768 -85.00 0 31 1.1 1 U K.GPGSSMGVEASYR.G 22649.2485 1296.4823 1296.5768 -72.88 0 (21) 10 1 U K.GPGSSMGVEASYR.G 69 1237.2689 3708.7849 3710.1076 -356.51 1 3 6.4e+02 5 U K.YSRPPVDSHALWVAGLLGSLIIGALLVSYLVFRK.W First show the computer with the screen displaying the search results. This must be zoomed into to clearly depict the report as shown. The green highlight boxes must then appear with their labels. User must be allowed to click on these highlighted regions. Clicking on ‘protein information’ must redirect user to steps 4 (a) & (b) while ‘peptide information’ must redirect user to steps 4(c) & (d). The summary report lists all the protein matches obtained from the database search with their respective molecular weight, protein score, source organism and details regarding each of its fragmented peptides. Further information about any of the protein sequences can be obtained by clicking on the corresponding protein link. Data regarding each of the peptide fragmentation patterns can also be obtained by clicking on the peptide link indicated by the query number. 4 5

Description of the action/ interactivity Audio Narration (if any)‏ The protein score is a sum of the highest ion scores for each sequence, with duplicate matches being excluded. A score above 67 is considered significant. Part 3, Step 4 (a) 1 Protein information – data analysis & interpretation Mascot search results Protein view Predicted mass of the protein. Match to: gi|31753114 Score: 225 Unknown (protein for IMAGE:5194336) [Homo sapiens] Found in search of C:\Users\harini\Desktop\MS\3C.LC-MS-MS data analysis Raw data file- mgf files\Data file1.mgf Nominal mass (Mr): 30840; Calculated pI value: 6.00 NCBI BLAST search of gi|31753114 against nr Unformatted sequence string for pasting into other applications Taxonomy: Homo sapiens Links to retrieve other entries containing this sequence from NCBI Entrez: gi|111494016 from Homo sapiens Fixed modifications: Carbamidomethyl (C) Variable modifications: Oxidation (M) Cleavage by Trypsin: cuts C-term side of KR unless next residue is P Sequence Coverage: 14% Matched peptides shown in Bold Red 1 HHHSPTLREH GRRTRTSLLE AMATTAMALS PSSFAGKAVK DLPSSALFGE 51 ARVTMRKTAA KAKPVSSGSP WYGSDRVLYL GPLSGDPPSY LTGEFPGDYG 101 WDTAGLSADP ETFAKNRELE VIHCRWAMLG ALGCVFPELL ARNGVKFGEA 151 VWFKAGSQIF SEGGLDYLGN PSLVHAQSIL AIWACQVVLM GAVEGYRVAG 201 GPLGEIVDPL YPGGSFDPLG LADDPEAFAE LKVKEIKNGR LAMFSMFGFF 251 VQAIVTGKGP LENLADHLSD PVNNNAWAFA TNFVPGK Predicted isoelectric point of the protein. 2 All peptides are displayed with matching peptides indicated in red. Indicates the % of matching peptides. 3 4 Show all the text output. Next show the green highlighted boxes one at a time with the corresponding dialogue box appearing for each of the highlighted regions. The results on the next slide must also be displayed along with this page. The protein view obtained on selecting a particular protein link, is very similar to the protein view observed in PMF. It provides details regarding the protein score, molecular weight, isoelectric point, the sequence coverage of the protein etc. Protein scores above 67 are considered significant and greater the percentage sequence coverage, more are the number of matching peptides for that particular protein. All sequences are displayed with the matching sequences being indicated in red. 5

Description of the action/ interactivity Audio Narration (if any)‏ Part 3, Step 4 (b) 1 Protein information – data analysis & interpretation Indicates score of each ion fragment. Used for calculation of the protein score. Mascot search results Protein view 2 Indicates beginning & end of each peptide. Observed molecular weight. Experimental molecular weight. Calculated molecular weight. Sequence of peptide fragment. Start - End Observed Mr(expt) Mr(calc) ppm Miss Sequence 126 - 142 960.4446 1918.8746 1918.9797 -55 0 R.WAMLGALGCVFPELLAR.N Oxidation (M) (Ions score 118) 126 - 142 960.4587 1918.9029 1918.9797 -40 0 R.WAMLGALGCVFPELLAR.N Oxidation (M) (Ions score 48) 147 - 154 492.2200 982.4254 982.4913 -67 0 K.FGEAVWFK.A (Ions score 66) 147 - 154 492.2305 982.4464 982.4913 -46 0 K.FGEAVWFK.A (Ions score 40) 147 - 154 492.2348 982.4551 982.4913 -37 0 K.FGEAVWFK.A (Ions score 32) 241 - 258 670.6395 2008.8966 2009.0155 -59 0 R.LAMFSMFGFFVQAIVTGK.G Oxidation (M) (Ions score 42) 241 - 258 1005.4635 2008.9124 2009.0155 -51 0 R.LAMFSMFGFFVQAIVTGK.G Oxidation (M) (Ions score 35) 241 - 258 676.2986 2025.8741 2025.0104 427 0 R.LAMFSMFGFFVQAIVTGK.G 2 Oxidation (M) (Ions score 22) 3 4 Information about each of the matched peptides is also displayed. The start and end amino acid positions, calculated and experimental molecular weights, number of missed tryptic cleavages, sequence of each peptide fragment and their corresponding ion scores are shown. The highest ion scores are used for computing the final protein score. Show all the text output. Next show the green highlighted boxes one at a time with the corresponding dialogue box appearing for each of the highlighted regions. 5

Description of the action/ interactivity Audio Narration (if any)‏ Part 3, Step 4 (d) Peptide sequence whose fragmentation pattern is shown. 1 Peptide information – data analysis and interpretation Mascot search results Peptide view Range values for the x-axis that can be modified by the user to zoom in or zoom out of the graphical representation. MS/MS Fragmentation of FGEAVWFK Found in gi|31753114, Unknown (protein for IMAGE:5194336) [Homo sapiens] Match to Query 4: 982.425408 from(492.219980,2+) intensity(9920.0000) Title: Sum of 11 scans in range 1333 (rt=1686.21, f=2, i=174) to 1373 (rt=1732.47, f=2, i=184) [\\Qtof\Qtof 17\JAN2004.PRO\Data\6p013-sanjeeva-10.raw] Data file C:\Users\harini\Desktop\MS\3C.LC-MS-MS data analysis Raw data file- mgf files\Data file1.mgf 2 3 4 Each peptide in Tandem MS/MS undergoes a second round of fragmentation when it passes through the second mass analyzer before it reaches the detector. This provides significantly larger amount of information regarding each peptide fragment. This can be viewed by clicking on the peptide links provided in the summary report. The fragmentation pattern is displayed graphically, which can be zoomed into as per the requirement by adjusting the x-axis plot values. Show all the text output. Next show the green highlighted boxes one at a time with the corresponding dialogue box appearing for each of the highlighted regions. 5

Description of the action/ interactivity Audio Narration (if any)‏ Part 3, Step 4 (e) 1 Peptide information – data analysis & interpretation Mass of the peptide fragment displayed. Mascot search results Peptide view Amino acid sequence obtained through computation using y-ion and b-ion values. Monoisotopic mass of neutral peptide Mr(calc): 982.4913 Fixed modifications: Carbamidomethyl (C) (apply to specified residues or termini only) Ions Score: 66 Expect: 0.00036 Matches : 23/78 fragment ions using 16 most intense peaks (help) b-ions: Ions formed with charge retained on N-terminal. y-ions: Ions formed with positive charge retained on C-terminal. b1 (148.0757) – b2 (205.0972) = 57.0214  G 2 y7 (836.4301) – y6 (779.4087))= 57.0214  G 3 b6 (690.3246) – b7 (837.3930) = 147.0684  F y2 (294.1812) - y1 (147.1128) =147.0684  F 4 At low collision energy, each peptide fragment is cleaved at the amide bond which can result in the formation of two types of ions – the y ion & b ion. In y-ions, the positive charge is retained on the C-terminus of the peptide ion while in b-ions, charge is retained on the N-terminal. These ion masses can be used to compute the amino acid sequence by calculating the mass difference between consecutive ions. Each mass difference value corresponds to a particular amino acid, which can be obtained from a standard information table. The y-ion series & the b-ion series run opposite to each other as indicated in the example above. Show all the text output. Next show the green highlighted boxes one at a time with the corresponding dialogue box appearing for each of the highlighted regions. 5

Button 01 Button 02 Button 03 Slide 8-19 Slide 6- 7 Slide 20 -21 Slide 22-37 Slide 1-5 Introduction Tab 01 Tab 02 Tab 03 Tab 04 Tab 05 Tab 06 Name of the section/stage Animation area In slide-13: for given a sequence data, let user performs the search within the databases and compares the result. In slide-17: provide a mass/charge list for user to identify and remove the contaminant peaks? Interactivity area Instructions/ Working area Credits

Questionnaire: APPENDIX 1 1. Which one of these is common across all Mass Spec based proteomics experiments carried out? A) Liquid Chromatography B) Proteolysis C) 2-D Gel Electrophoresis D) Isoelectric Focusing Answer: B) Proteolysis 2. Peptide Mass Fingerprinting or PMF is defined as? A) Finding the best fit for peptides identified by fragmentation. B) Finding the best fir for protein by sequencing in a Triple Quadrupole Analyzer. C) Finding fingerprints of proteins on 2-DE Gels. D) Finding the best fit for masses of peptides identified by MALDI-TOF. Answer: D) Finding the best fit for masses of peptides identified by MALDI-TOF.

Questionnaire: APPENDIX 1 3. Which one of these mass values represents a protein/peptide ion? M-H- M-H+ MH+ MH- Answer: C) MH+ 4. The average mass of which of the following amino acids corresponds to 87.0782? A) Serine B) Glycine C) Alanine D) Glutamine Answer: A) Serine

Questionnaire: APPENDIX 1 Question 5 The wavelength of laser used for ionization? a) 337nm b) 437nm c) 537nm d) 637nm Answer: a) 337nm

APPENDIX 2 Links for further reading http://www.matrixscience.com/search_form_select.html 1. Henzel.W.J., Watanabe.C., Stults.J.T. (2003). Protein Identification: The Origins of Peptide Mass fingerprinting. J Am Soc Mass Spectrom, 14(9), pp:931-42. 2. Nesvizhskii , A.I., Vitek, O., Aebersold, R. (2007). Analysis and validation of proteomic data generated by tandem mass spectrometry. Nat.Methods., 49 (1), pp.787-97. 3. Deutsch, E.W., Lam, H., Abersold, R. (2008) Data analysis and bioinformatics tools for tandem mass spectrometry in proteomics. Physiol Genomics. 33 (1), pp:18-25. 4. Yates, JR., 2008. Mass Spectrometry and the Age of Proteome. J.Mass.Spec., 33(1), pp.1-19. Books: • Proteomics: A cold spring harbor laboratory course manual by Andrew J L and Joshua L, 2009.

APPENDIX 3 Summary A powerful search engine is required for the matching the query sequence to the protein sequences present in the database for protein identity. The best set parameters helps user to define the range for the search and get the required match, to get best protein hit. User need to have little bit knowledge of the sample background, to set the respective parameters. For the best match at least 5peptide masses from the query sequence must match with database sequence for a confident identification.

Methodology for MALD-TOF data analysis