This presentation outlines the development of Spectre, an innovative software tool designed for the analysis of mass spectrometry data in proteomics. Initiated by Dr. Bas van Breukelen from Utrecht University, Spectre aims to enhance the efficiency and effectiveness of protein identification and analysis. It incorporates advanced bioinformatics techniques, allowing users to load, visualize, manipulate, and analyze mass spectrometry data interactively. The project highlights the need for unified tools in the field and the progress made toward delivering a robust platform for researchers.
Software Project MassAnalyst Roeland Luitwieler Marnix Kammer April 24, 2006
Overview • Introduction • System requirements • Our solution: Spectre • Progress so far • Conclusion
Introduction • Project initiator • Scientific background • The need for software tools
Project initiator • Dr. ir. Bas van Breukelen • Department of Biomolecular Mass Spectrometry • Utrecht University! • WENT building • Expert in: • Bioinformatics • Proteomics
Scientific background: Proteomics • Our body consists of cells • Cell functionality and structure are provided by proteins • Proteomics • Main research areas: • Identification of proteins • Interaction of proteins • Comparison of protein levels
Protein identification • How to identify proteins? • Identity defined by their structure • Protein structure • Protein: sequence of peptides • Peptide: sequence of amino acids • 20 common types • Consist of different atoms – have different masses • Too small to see… but not to weigh • Mass Spectrometry!
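To make "too small to see… but not to weigh" concrete, here is a minimal sketch of how a peptide's mass follows from its amino acids. The table holds standard monoisotopic residue masses for only a handful of the 20 common types, and the names `RESIDUE_MASS`, `WATER_MASS` and `peptide_mass` are illustrative assumptions, not part of the project.

```python
# Monoisotopic residue masses in daltons for a few of the 20 common amino acids.
# The selection is arbitrary; a real table would list all 20.
RESIDUE_MASS = {
    "G": 57.02146,
    "A": 71.03711,
    "S": 87.03203,
    "P": 97.05276,
    "V": 99.06841,
    "L": 113.08406,
    "K": 128.09496,
}
WATER_MASS = 18.01056  # one water accounts for the two free ends of the chain


def peptide_mass(sequence: str) -> float:
    """Return the monoisotopic mass of an uncharged peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER_MASS


if __name__ == "__main__":
    for seq in ("GA", "AG", "GAVL"):
        print(seq, round(peptide_mass(seq), 5))
```

Note that "GA" and "AG" weigh the same: total mass alone does not reveal the order of the amino acids, which is why fragment masses are needed as well.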
Mass Spectrometry (MS) • Technique using a mass spectrometer • Input: sample of peptides • Proteins have been split chemically • Provides, among other things, more accuracy and efficiency • Most head / tail subsequences are present • Output: mass spectrum • Frequencies of particles of certain masses • Full peptide sequence can be derived
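The claim that the full sequence can be derived when most head subsequences are present can be sketched as follows. This is an idealized illustration: it ignores charge and terminal groups, starts the ladder at mass 0 for the empty head, and uses only a partial residue table. The function name `read_ladder` and the tolerance value are assumptions made for the example.

```python
# Monoisotopic residue masses in daltons (partial table, for illustration only).
RESIDUE_MASS = {
    "G": 57.02146,
    "A": 71.03711,
    "S": 87.03203,
    "P": 97.05276,
    "V": 99.06841,
    "L": 113.08406,
    "K": 128.09496,
}


def read_ladder(head_masses, tolerance=0.01):
    """Turn the masses of successive head fragments into a residue string.

    Each gap between neighbouring fragment masses should equal the mass of
    one amino acid. Gaps that match nothing in the table are reported as '?'.
    """
    ordered = sorted(head_masses)
    residues = []
    for lighter, heavier in zip(ordered, ordered[1:]):
        gap = heavier - lighter
        match = "?"
        for amino_acid, mass in RESIDUE_MASS.items():
            if abs(gap - mass) <= tolerance:
                match = amino_acid
                break
        residues.append(match)
    return "".join(residues)


if __name__ == "__main__":
    ladder = [0.0, 57.02146, 128.05857, 215.0906, 312.14336]
    print(read_ladder(ladder))
```

If a fragment is missing from the ladder, the corresponding gap spans two residues and shows up as '?', which is why it matters that most subsequences are present.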
Mass Spectrometry (MS) • How does it work? • Ionize particles • Now particles have an electrical charge • Accelerate them in an electric field • Deflect them in a magnetic field • Deflection depends on mass (F = m a) • Measure how far they have been deflected
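The dependence of deflection on mass can be worked out for an idealized magnetic-sector instrument. The symbols are assumptions introduced here and do not appear on the slides: charge q, accelerating voltage U, speed v, magnetic field strength B, and radius r of the circular path.

```latex
\begin{align}
  qU &= \tfrac{1}{2} m v^{2}
     && \text{(energy gained in the electric field)} \\
  qvB &= m \frac{v^{2}}{r}
     && \text{($F = ma$ with centripetal acceleration $a = v^{2}/r$)} \\
  r &= \frac{mv}{qB} = \frac{1}{B}\sqrt{\frac{2mU}{q}}
     && \text{(heavier particles bend less)} \\
  \frac{m}{q} &= \frac{B^{2} r^{2}}{2U}
     && \text{(the measured quantity is mass per charge)}
\end{align}
```

Under these assumptions, measuring how far a particle has been deflected gives r, and with B and U known this yields its mass-to-charge ratio.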
Mass Spectrometry (MS) • Improvements for better analysis (1) • Use chromatography • Spreads input over time: more details • Output: a sequence of MS spectra
Mass Spectrometry (MS) • Improvements for better analysis (2) • Use “recursive” mass spectrometry • Called MS/MS (or MS2 or tandem MS) • Take part of the sample that produces a peak • Usually concerns one certain peptide • Output: MS spectra with related MS/MS spectra
Mass Spectrometry (MS) • Improvements for better analysis (3) • Use bioinformatics • All output is translated to mzXML • A database is searched on MS/MS spectra • Input: raw MS data • Output: pepXML: peptide information • Tools are used to e.g. display the data • Lots of redundant / boring work is taken care of!
Bioinformatics: what can be done? • Remember the Proteomics research areas: • Identification of proteins • Interaction of proteins • Comparison of protein levels • Most research: vary one aspect at a time • Requires interactive display of data • Zooming, “stacking”, cross sections, etc. • But not just display of data • Filtering, “warping”, peak detection, etc.
Bioinformatics: existing tools • Tools exist, but… • Lots of different tools to do different things • Functionality not always as desired • They also lack functionality • Not easily extendable • Example: Pep3D • Nice visualization, but • Only one sample at a time, only a single view • Solution: develop new software
System requirements • Load raw spectrometry data • Visualize the data • Manipulate and analyze the data interactively • Export data • Extendibility • Use in open community • Open source
Loading data • mzXML: raw spectrometry data • MS spectra • Embedded MS/MS spectra • pepXML: database of matches with peptides
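As a rough sketch of what loading MS spectra with embedded MS/MS spectra could involve: in mzXML, peak lists are stored as base64-encoded, big-endian floats in (m/z, intensity) pairs, and MS/MS scans are nested inside their parent scan. The document built below is a heavily simplified stand-in (no namespace, almost no attributes), and `encode_peaks`, `decode_peaks` and `read_scans` are hypothetical helpers, not Spectre's actual parser.

```python
import base64
import struct
import xml.etree.ElementTree as ET


def encode_peaks(pairs):
    """Pack (m/z, intensity) pairs as big-endian 32-bit floats, base64-encoded."""
    flat = [value for pair in pairs for value in pair]
    packed = struct.pack(f">{len(flat)}f", *flat)
    return base64.b64encode(packed).decode("ascii")


def decode_peaks(text):
    """Inverse of encode_peaks: return a list of (m/z, intensity) tuples."""
    raw = base64.b64decode(text.strip())
    values = struct.unpack(f">{len(raw) // 4}f", raw)
    return list(zip(values[0::2], values[1::2]))


def read_scans(xml_text):
    """Return MS scans, each with the MS/MS scans embedded inside it."""
    root = ET.fromstring(xml_text)
    scans = []
    for scan in root.findall("scan"):  # top-level scans: MS spectra
        entry = {
            "num": int(scan.get("num")),
            "peaks": decode_peaks(scan.findtext("peaks")),
            "msms": [],
        }
        for child in scan.findall("scan"):  # nested scans: MS/MS spectra
            entry["msms"].append({
                "num": int(child.get("num")),
                "peaks": decode_peaks(child.findtext("peaks")),
            })
        scans.append(entry)
    return scans


# Simplified stand-in for an mzXML file, built so the example is self-contained.
SAMPLE = (
    "<msRun>"
    '<scan num="1" msLevel="1">'
    f'<peaks precision="32">{encode_peaks([(100.5, 10.0), (200.25, 20.0)])}</peaks>'
    '<scan num="2" msLevel="2">'
    f'<peaks precision="32">{encode_peaks([(50.5, 5.0)])}</peaks>'
    "</scan>"
    "</scan>"
    "</msRun>"
)

if __name__ == "__main__":
    for ms in read_scans(SAMPLE):
        print(ms["num"], ms["peaks"], [child["num"] for child in ms["msms"]])
```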
Visualizing the data • List of loaded samples • MS spectrum • Cross sections of the MS spectrum • MS/MS spectra • Peptide information
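One way to read "cross sections of the MS spectrum": with chromatography, a sample is a sequence of spectra over time, so it can be cut either at a fixed time (giving one spectrum) or at a fixed mass (giving intensity over time). The sketch below assumes a run is a list of `(time, spectrum)` entries with each spectrum a list of `(mz, intensity)` pairs; this layout and both function names are assumptions for illustration.

```python
def cross_section_at_time(run, time):
    """Return the spectrum recorded closest to the requested time."""
    nearest = min(run, key=lambda entry: abs(entry[0] - time))
    return nearest[1]


def cross_section_at_mz(run, mz, window=0.5):
    """Return (time, summed intensity) for peaks within +/- window of mz."""
    trace = []
    for time, spectrum in run:
        total = sum(i for m, i in spectrum if abs(m - mz) <= window)
        trace.append((time, total))
    return trace


if __name__ == "__main__":
    run = [
        (0.0, [(100.0, 1.0), (200.0, 5.0)]),
        (1.0, [(100.2, 3.0), (200.0, 2.0)]),
        (2.0, [(300.0, 4.0)]),
    ]
    print(cross_section_at_time(run, 0.9))
    print(cross_section_at_mz(run, 100.0))
```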
Manipulating and analyzing the data • Stacking: toggle samples on/off • Warping • Zooming • Peak detection • More analysis, like ratio calculation
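Zooming, peak detection and ratio calculation can be illustrated with a deliberately simple sketch. The slides do not say which peak detection method or which ratio Spectre uses, so the local-maximum rule and the intensity ratio of a peak pair below are stand-in assumptions, as are all function names.

```python
def zoom(spectrum, mz_low, mz_high):
    """Keep only the points whose m/z lies inside the zoom window."""
    return [(mz, intensity) for mz, intensity in spectrum if mz_low <= mz <= mz_high]


def detect_peaks(spectrum, min_intensity=0.0):
    """Return points that are local intensity maxima.

    spectrum is a list of (m/z, intensity) pairs sorted by m/z. The first and
    last points are never reported because they lack a neighbour on one side.
    """
    peaks = []
    for index in range(1, len(spectrum) - 1):
        left = spectrum[index - 1][1]
        here = spectrum[index][1]
        right = spectrum[index + 1][1]
        if here > left and here >= right and here >= min_intensity:
            peaks.append(spectrum[index])
    return peaks


def peak_ratio(peak_a, peak_b):
    """Intensity ratio of two peaks, e.g. the two members of a peak pair."""
    return peak_a[1] / peak_b[1]


if __name__ == "__main__":
    spectrum = [(100, 1), (101, 5), (102, 2), (103, 1), (104, 8), (105, 3)]
    peaks = detect_peaks(spectrum)
    print(peaks)
    print(zoom(spectrum, 101, 103))
    print(peak_ratio(peaks[1], peaks[0]))
```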
Export data • Lists of peak pairs • Modified pepXML (i.e. with ratios) • Images of spectra • Modified samples
The structure of Spectre • Graph: MS spectra, cross sections, MS/MS spectra • Workspace: a collection of samples and settings • Sample: internal data structure for one sample • GUI: the user interface • Processor: the main link between parts of the program
The structure of Spectre • [Class diagram: GUI, Processor, Workspace, Graph, Sample; multiplicities shown in the diagram: 1, 1, 4, *, *]
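The division of roles between these parts might look roughly like the sketch below. It is a guess based only on the one-line descriptions on the slide: every attribute and method name is hypothetical, the GUI is left out, and the `visible` flag is an assumed way of supporting the toggling of samples.

```python
from dataclasses import dataclass, field


@dataclass
class Sample:
    """Internal data structure for one sample (contents are placeholders)."""
    name: str
    spectra: list = field(default_factory=list)
    visible: bool = True


@dataclass
class Workspace:
    """A collection of samples and settings."""
    samples: list = field(default_factory=list)
    settings: dict = field(default_factory=dict)

    def visible_samples(self):
        return [sample for sample in self.samples if sample.visible]


class Graph:
    """One view, e.g. the MS spectrum, a cross section or an MS/MS spectrum."""

    def __init__(self, title):
        self.title = title
        self.drawn = []  # names of the samples shown in the last redraw

    def draw(self, samples):
        self.drawn = [sample.name for sample in samples]


class Processor:
    """The main link: a GUI would call this, and it updates data and views."""

    def __init__(self, workspace, graphs):
        self.workspace = workspace
        self.graphs = graphs

    def load_sample(self, sample):
        self.workspace.samples.append(sample)
        self.refresh()

    def toggle(self, name):
        for sample in self.workspace.samples:
            if sample.name == name:
                sample.visible = not sample.visible
        self.refresh()

    def refresh(self):
        for graph in self.graphs:
            graph.draw(self.workspace.visible_samples())


if __name__ == "__main__":
    processor = Processor(Workspace(), [Graph("MS spectrum"), Graph("MS/MS spectrum")])
    processor.load_sample(Sample("control"))
    processor.load_sample(Sample("treated"))
    processor.toggle("control")
    print([graph.drawn for graph in processor.graphs])
```

Routing every change through a single Processor keeps the views and the data from depending on each other directly, which fits the stated goal of a modular, extendable program.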
Systematic approach to the problem • Phased development • Three versions • Lots of diagrams • Application of courses MSO, PM • HCI team and data layer team • Later on: data visualization team • Extreme Programming
Progress so far • First version will be due in week 18 • Functionality: • Loading raw data • Visualization and user interface • Basic interaction with zooming etc. • Complete internal data structures • Export of images • Missing link between mzXML and pepXML!
Further planning • Version 2 – week 23 • Warping • Peak detection / analysis • Export of calculated data • Version 3 – week 27 • Ratio calculation • Modification of samples
After completion of the project • Web site • Open source • Further maintenance • Extendable
Conclusion • Spectre: a modular and extendable program • A combination of many different requirements • Phased addition of features • Any questions?
The data structure • [Class diagram: Sample, MzTable, MzNode, SampleParser, SampleWriter, MzParser, PepParser; multiplicities shown in the diagram: 1, *; some elements are elided (…)]