GenMAPP. A Software Tool for Analyzing Genome-Scale Data in the Context of Biological Pathways and the Gene Ontology. J. David Gladstone Institute of Cardiovascular Disease UCSF. Overview. Intro to GenMAPP - GenMAPP analysis example Advanced features.



  1. GenMAPP A Software Tool for Analyzing Genome-Scale Data in the Context of Biological Pathways and the Gene Ontology J. David Gladstone Institute of Cardiovascular Disease UCSF

  2. Overview • Intro to GenMAPP - GenMAPP analysis example • Advanced features

  3. Analyzing Large-Scale Data in the Context of Biological Pathways • Which genes are expressed in my dataset? • What biological processes are important in my data model? • New insight into underlying biology

  4. Analyzing Large-Scale Data in the Context of Biological Pathways • View data in the context of known biology • Rather than showing which individual genes are changed, pathway analysis emphasizes the processes that are changed • Biologists are familiar with pathways, so they are a natural way of sharing data

  5. Cardiomyopathy: Downregulated genes

  6. Cardiomyopathy: Downregulated genes

  7. Fatty Acid Degradation Pathway

  8. Cardiomyopathy Data on Fatty Acid Degradation Pathway

  9. GenMAPP: Gene Map Annotator and Pathway Profiler (www.GenMAPP.org) • Visualize gene expression and other genomic data on biological pathways and other groupings of genes • Global analysis identifies significantly changed processes and functional groups

  10. GenMAPP • Developed in the Conklin lab at Gladstone as an internal tool for dealing with microarray data • Approximately 12,000 registered users to date • 100% Free!! • Used in 150-200 publications • Open source, code available at SourceForge.net • Current version for Windows only (coded in VB)

  11. Time Course Data on Cell Cycle Pathway

  12. SNPs with Predicted Effects http://alto.compbio.ucsf.edu/LS-SNP/

  13. SNPs that Predispose to Myocardial Infarction • 547 acute MI cases; 505 controls • 58 SNPs in 35 genes => SNPs in 5 different genes showed statistical association with MI • Study spans 19 pathways => 4 of 5 hits are on a single pathway • Tobin et al., European Heart Journal 2004

  14. SNPs and Myocardial Infarction Tobin et al, European Heart Journal 2004

  15. SNP Data in GenMAPP • Visualization: distribution of SNPs per gene • Prioritization: mapping SNP annotations onto pathways • Analysis: interpreting SNP data in the context of biological pathways • Future directions: high-resolution visualization of individual SNPs with the ability to overlay data

  16. MAPPFinder • Originally developed as a separate application by Scott Doniger* • Inputs: Gene Ontology terms, experimental data, GenMAPP pathways • MAPPFinder: global comparison of changes in the dataset to changes expected by chance • Output: pathways and GO terms with significant changes • * Doniger et al. Genome Biology 4(1):R7
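
The "changes expected by chance" comparison can be illustrated with a hypergeometric-style z score. This is a minimal sketch, not MAPPFinder's own code: the function name, the variable names (N genes measured, R genes meeting the criterion, n genes in the pathway or GO term, r of those meeting the criterion), and the example counts are all assumptions made for illustration.

```python
import math

def enrichment_z(N, R, n, r):
    """z score for seeing r changed genes in a term of n genes,
    when R of the N measured genes are changed overall.
    Illustrative formula; an assumption, not taken from the slides."""
    expected = n * R / N
    # Variance of a hypergeometric draw (sampling without replacement).
    variance = n * (R / N) * (1 - R / N) * (1 - (n - 1) / (N - 1))
    if variance == 0:
        return 0.0
    return (r - expected) / math.sqrt(variance)

# Hypothetical counts: 1000 genes measured, 100 changed,
# a term with 50 genes of which 15 are changed.
z_example = enrichment_z(N=1000, R=100, n=50, r=15)
print(round(z_example, 2))
```

A large positive value suggests the term contains more changed genes than chance alone would predict; a value near zero suggests no enrichment.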

  17. MAPPFinder Browser

  18. MAPPFinder Browser

  19. GenMAPP Relationship Schema • User Dataset (GEX): Gene ID 1415904_at, System Affymetrix, Criterion Blue • Pathway (MAPP): Gene Name Lpl, System EntrezGene, Gene ID 16956
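
The schema above relates a dataset row keyed by one ID system to a pathway gene keyed by another. A minimal sketch of that lookup follows, assuming a cross-reference table that links the two IDs from the slide; the `XREF` dictionary, the field names, and the `color_mapp` helper are hypothetical and do not reflect GenMAPP's internal database layout.

```python
# Hypothetical cross-reference table standing in for the gene database:
# (source system, source ID) -> (target system, target ID)
XREF = {("Affymetrix", "1415904_at"): ("EntrezGene", "16956")}

# One dataset row and one pathway gene, using the values on the slide.
dataset = [{"gene_id": "1415904_at", "system": "Affymetrix", "criterion": "Blue"}]
mapp = [{"gene_name": "Lpl", "system": "EntrezGene", "gene_id": "16956"}]

def color_mapp(mapp_genes, dataset_rows, xref):
    """Return {gene name: criterion color} for every gene on the pathway."""
    colors = {}
    for row in dataset_rows:
        target = xref.get((row["system"], row["gene_id"]))
        if target is not None:
            colors[target] = row["criterion"]
    return {
        gene["gene_name"]: colors.get((gene["system"], gene["gene_id"]), "no data")
        for gene in mapp_genes
    }

result = color_mapp(mapp, dataset, XREF)
print(result)
```

Keeping the cross-reference separate from both the dataset and the pathway means either side can change its ID system without the other being rewritten.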

  20. GenMAPP Supported Species • Fruit fly, Human, Mouse, Rat, Worm, Yeast, Zebrafish, Chicken, Dog, Cow • By request: Chimp, Frog, Fugu (F. rubripes), Honey bee, Mosquito, Pufferfish (T. nigroviridis)

  21. GenMAPP Supported Gene IDs • Species-specific: MGI, RGD, SGD, WormBase, ZFIN, HUGO, FlyBase • Annotations: InterPro, EMBL, OMIM, Pfam, Gene Ontology • Gene IDs: Affymetrix, Entrez Gene, RefSeq (protein only), Unigene, UniProt, Ensembl, PDB

  22. Available MAPP Archives • Contributed MAPPs: hand-curated pathways created at GenMAPP.org or submitted by GenMAPP users; >70 MAPPs for human, mouse and rat • Inferred MAPPs: inferred from human contributed MAPPs, using homology information from Homologene and Ensembl • Tissue-Specific MAPPs (human and mouse only): based on the analysis of two microarray datasets generated by the Genomics Institute of the Novartis Research Foundation • GO Sample MAPPs: a partial collection of GO terms formatted as GenMAPP MAPP files, each containing between 100 and 300 genes; GO MAPPs are formatted as lists of genes and do not contain any graphics other than the gene object and the label • SGD metabolic MAPPs (yeast only): derived from the yeast pathways at SGD • KEGG converted MAPPs: converted from the Pathway Resource at the Kyoto Encyclopedia of Genes and Genomes • Download all MAPPs through the Downloader in GenMAPP

  23. http://www.genmapp.org/featured_mapps.html

  24. Input Data • Data in spreadsheet summary format • NO raw data • Data should include metrics that you want to use as cutoffs: • avg signal, ratio, fold, signal quality, p-value, cluster ID, other statistics • Include ALL genes measured in experiment, DO NOT pre-filter • Choose optimal primary gene ID • Custom annotation can be useful (Database includes standard annotation) • Example: Group Comparison Experiment • Fold changes between groups • p-value associated with fold • Average signal per group
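
As a rough illustration of the group comparison example, the sketch below computes the per-group average signal and a log2 fold change for every measured probe, without filtering any out. The probe names and signal values are made up, the `summarize` helper is hypothetical, and the p-value column is omitted because it would come from whatever statistical test the analyst chooses.

```python
import math
from statistics import mean

def summarize(gene_id, control, treated):
    """Summary metrics for one probe: group averages and log2 fold change."""
    avg_control = mean(control)
    avg_treated = mean(treated)
    return {
        "gene_id": gene_id,
        "avg_control": avg_control,
        "avg_treated": avg_treated,
        "log2_fold": math.log2(avg_treated / avg_control),
    }

# Made-up signals: (control replicates, treated replicates) per probe.
signals = {
    "probe_A": ([100.0, 100.0], [400.0, 400.0]),
    "probe_B": ([50.0, 50.0], [50.0, 50.0]),
}

# Every probe is kept, including unchanged ones, so that a later global
# analysis can compare changed genes against everything that was measured.
rows = [summarize(gene_id, c, t) for gene_id, (c, t) in signals.items()]
for row in rows:
    print(row)
```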

  25. GenMAPP Workflow • Input: pre-processed, formatted data (with statistics, metrics) • Import data and set color criteria (Expression Dataset Manager) • Create/edit/convert pathways (Drafting Board, MAPPBuilder, Converter) • Display data on pathways (Drafting Board) • Gene Ontology analysis (MAPPFinder) • Export pathways to the web (MAPPSets)

  26. Example: Analysis of Complex Time-Course Data Challenges: • How to represent your data in an intuitive manner • How to analyze patterns rather than specific comparisons. Approach: • Set up hypotheses to test • Attach global statistics (e.g. ANOVA) and pattern recognition • Efficiently import data into GenMAPP • Visualize cluster and time-point data (GenMAPP 2.1-NEW) • Global analysis of pathway/ontologies (MAPPFinder) • Export results to the web/for publication

  27. Set Up Hypotheses to Test Build a MAPP to Test a Hypothesis • Use literature and previous knowledge about the model you are studying to build a list of candidate genes or a pathway. Step 1: • Collect a list of gene IDs • Import them using the MAPPBuilder function • Organize them into a biological pathway along with predictions of expected changes. Salomonis N, et al. Genome Biol. 2005 6:R12–R12.16

  28. Import List of Genes in MAPPBuilder

  29. Gene Layout on the Drafting Board

  30. Example: Analysis of Complex Time-Course Data Challenges: • How to represent your data in an intuitive manner • How to analyze patterns rather than specific comparisons. Approach: • Set up hypotheses to test • Attach global statistics (e.g. ANOVA) and pattern recognition • Efficiently import data into GenMAPP • Visualize cluster and time-point data (GenMAPP 2.1-NEW) • Global analysis of pathway/ontologies (MAPPFinder) • Export results to the web/for publication

  31. Dataset: Mouse Uterine Pregnancy Time-Course Experiment Design: • Analyzed 7 time-points (3-8 replicates): • Non-Pregnant mice • 14.5, 16.5 and 17.5 days post fertilization • 18.5 days (term pregnancy) • 6 hours and 24 hours postpartum • Hybridized to mouse 11k Affymetrix arrays Analysis: • Normalized and Adjusted expression (gcrma R) • Performed a global f-test (multtest R) • Hierarchical and partitioned clustering (hopach R) Salomonis N, et al. Genome Biol. 2005 6:R12–R12.16

  32. Hierarchical Ordered Partitioning and Collapsing Hybrid (HOPACH) Clustering • Use the global F-test to filter the probeset list down to 3500 entries. • Cluster fold changes for each replicate compared to the non-pregnant baseline mean. • Take the top-level cluster (left) and re-associate it with the expression data.
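
The second step, fold changes for each replicate relative to the non-pregnant baseline mean, can be sketched as below. It assumes the expression values are already on a log2 scale, so the fold change is a simple difference; the values and the function name are invented for illustration, and the clustering itself (done with hopach in R in the original analysis) is not reproduced here.

```python
from statistics import mean

def log2_fold_vs_baseline(baseline, replicates):
    """Log2 fold change of each replicate against the mean of the baseline group.
    Assumes all inputs are log2 expression values."""
    base = mean(baseline)
    return [value - base for value in replicates]

# Made-up log2 expression values for one probeset.
baseline_nonpregnant = [8.0, 8.5, 7.5]   # baseline mean is 8.0
day_14_5_replicates = [9.0, 10.0]

folds = log2_fold_vs_baseline(baseline_nonpregnant, day_14_5_replicates)
print(folds)
```

Computing one value per replicate, rather than one per time point, keeps replicate-to-replicate variation visible to the clustering step.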

  33. Example: Analysis of Complex Time-Course Data Challenges: • How to represent your data in an intuitive manner • How to analyze patterns rather than specific comparisons. Approach: • Set up hypotheses to test • Attach global statistics (e.g. ANOVA) and pattern recognition • Efficiently import data into GenMAPP • Visualize cluster and time-point data (GenMAPP 2.1-NEW) • Global analysis of pathway/ontologies (MAPPFinder) • Export results to the web/for publication

  34. GenMAPP Input Import File Design: • Include all probe data (not just filtered) • Include the following columns of data • Multtest p-values • HOPACH clusters • Average group expression values • Fold changes (all relevant pairwise comparisons) • Gene Database system code Salomonis N, et al. Genome Biol. 2005 6:R12–R12.16
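
A minimal sketch of writing such an import file as tab-delimited text follows. The column names, the single data row, and the use of "X" as the system code for Affymetrix IDs are assumptions for illustration; the exact header names and codes expected by the import step should be checked against the GenMAPP documentation.

```python
import csv
import io

# Hypothetical column layout covering the items listed above.
COLUMNS = [
    "ID", "SystemCode", "p_value", "hopach_cluster",
    "avg_nonpregnant", "avg_day14_5", "log2_fold_day14_5_vs_np",
]

def write_import_file(rows, handle):
    """Write one header line plus one tab-delimited line per probe."""
    writer = csv.DictWriter(handle, fieldnames=COLUMNS,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)

# Placeholder values, not real measurements.
rows = [{
    "ID": "1415904_at", "SystemCode": "X", "p_value": 0.01,
    "hopach_cluster": 3, "avg_nonpregnant": 8.0, "avg_day14_5": 9.5,
    "log2_fold_day14_5_vs_np": 1.5,
}]

buffer = io.StringIO()          # swap for open("import.txt", "w", newline="")
write_import_file(rows, buffer)
text = buffer.getvalue()
print(text)
```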

  35. GenMAPP Input

  36. GenMAPP Expression Dataset Manager Import Text File into GenMAPP • Tell GenMAPP which columns have non-numeric data. Establishing Rules for Coloring Gene Boxes: • Design criteria that capture any patterns you want to see. • Here we want: • Fold change gradients for up- and down-regulated genes in time-point comparisons (Color Sets) • Different colors assigned to each HOPACH cluster Salomonis N, et al. Genome Biol. 2005 6:R12–R12.16
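
The idea of a color set, an ordered list of criteria that each map to a color, can be sketched as follows. The thresholds, colors, field names, and the first-match-wins ordering are assumptions chosen for illustration and are not GenMAPP's criterion syntax.

```python
# Each entry: (label, test on a data row, color). All values are hypothetical.
COLOR_SET = [
    ("up at least 2-fold", lambda row: row["log2_fold"] >= 1 and row["p_value"] < 0.05, "red"),
    ("down at least 2-fold", lambda row: row["log2_fold"] <= -1 and row["p_value"] < 0.05, "blue"),
]

def assign_color(row, color_set, default="grey"):
    """Return the color of the first criterion the row satisfies."""
    for _label, test, color in color_set:
        if test(row):
            return color
    return default

up_gene = {"log2_fold": 1.5, "p_value": 0.01}
down_gene = {"log2_fold": -2.0, "p_value": 0.001}
unchanged_gene = {"log2_fold": 0.2, "p_value": 0.6}

print(assign_color(up_gene, COLOR_SET))
print(assign_color(down_gene, COLOR_SET))
print(assign_color(unchanged_gene, COLOR_SET))
```

A second color set keyed on the cluster column instead of fold change would follow the same pattern, with one criterion per HOPACH cluster.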

  37. GenMAPP Expression Dataset Manager

  38. GenMAPP Expression Dataset Manager

  39. Example: Analysis of Complex Time-Course Data Challenges: • How to represent your data in an intuitive manner • How to analyze patterns rather than specific comparisons. Approach: • Set up hypotheses to test • Attach global statistics (e.g. ANOVA) and pattern recognition • Efficiently import data into GenMAPP • Visualize cluster and time-point data (GenMAPP 2.1-NEW) • Global analysis of pathway/ontologies (MAPPFinder) • Export results to the web/for publication

  40. Viewing Time-Course Data on MAPPs Method 1) • View criteria one at a time on pathways of interest.

  41. Single Color Set View

  42. Single Color Set View

  43. Viewing Time-Course Data on MAPPs Method 1) • View criteria one at a time on pathways of interest. Method 2) • View clusters directly on the pathway.

  44. Single Color Set View

  45. Single Color Set View

  46. Viewing Time-Course Data on MAPPs Method 1) • View criteria one at a time on pathways of interest. Method 2) • View clusters directly on the pathway. Method 3) • View all criteria of interest simultaneously.

  47. Single Color Set View

  48. Multiple Color Set View

  49. Example: Analysis of Complex Time-Course Data Challenges: • How to represent your data in an intuitive manner • How to analyze patterns rather than specific comparisons. Approach: • Set up hypotheses to test • Attach global statistics (e.g. ANOVA) and pattern recognition • Efficiently import data into GenMAPP • Visualize cluster and time-point data (GenMAPP 2.1-NEW) • Global analysis of pathway/ontologies (MAPPFinder) • Export results to the web/for publication

  50. Advanced Features • Customizing a Gene Database / Creating a Gene Database for a non-supported species => Implement GenMAPP for a novel model species • Create your own pathway MAPPs => Implement GenMAPP for a novel model species => Author novel pathways based on your discoveries • High-throughput export of browsable html pathway archive => For interactive web-display of data on pathway archive International Gene Trap Consortium
