
ONCOMINE: A Bioinformatics Infrastructure for Cancer Genomics

Dan Rhodes, Chinnaiyan Laboratory, Bioinformatics Program, Cancer Biology Training Program, Medical Scientist Training Program, University of Michigan Medical School.

Presentation Transcript


  1. ONCOMINE: A Bioinformatics Infrastructure for Cancer Genomics Dan Rhodes Chinnaiyan Laboratory Bioinformatics Program Cancer Biology Training Program Medical Scientist Training Program University of Michigan Medical School

  2. Outline • Background • DNA Microarrays and the Cancer Transcriptome • ONCOMINE • Data collection, normalization & storage • Statistical Analysis • Visualization of Data and Analysis • ONCOMINE Data Integration • Therapeutic Targets / Biomarkers • Metabolic and Signaling Pathways • Known protein-protein Interactions • ONCOMINE tutorial

  4-13. The Cancer Transcriptome

  14. The Cancer Transcriptome • 180+ studies profiling human cancer • Each profiling 5 to 100+ samples • We estimate > 10,000 microarrays • 10k chips measuring 20k genes each = 200+ million data points

  15. Outline • Background • DNA Microarrays and the Cancer Transcriptome • ONCOMINE • Data collection, normalization & storage • Statistical Analysis • Visualization of Data and Analysis • ONCOMINE Data Integration • Therapeutic Targets / Biomarkers • Metabolic and Signaling Pathways • Known protein-protein Interactions • ONCOMINE tutorial

  16. Oncomine • oncology + data-mining = oncomine • 105 independent datasets (90 analyzed) • 7,292 cancer microarrays • 79 million gene expression measurements • 382 distinct cancer signatures • > 5 million tests of differential expression • > 5 million tests of gene set enrichment • > 5 billion pairwise correlations

  17. Oncomine • Database – relational, Oracle 9.2 • Statistical computing – R, Perl, Java • Front end – JavaServer Pages (JSP) • Server – Apache/Tomcat • Graphics – Scalable Vector Graphics (SVG)

  18. Data Collection • Monthly PubMed searches (cancer + microarray + transcriptome + tumor + gene expression profiling) • Gene expression repositories • Gene Expression Omnibus (GEO) (http://www.ncbi.nlm.nih.gov/geo/) • ArrayExpress (http://www.ebi.ac.uk/arrayexpress/) • Stanford Microarray Database (http://genome-www5.stanford.edu/) • Whitehead Cancer Genomics (http://www.broad.mit.edu/cancer/)

  19. Data Normalization • Global normalization – same scaling factors applied to all microarray features – mean and variance normalization • Affymetrix - Quantile normalization • Spotted cDNA - Loess normalization • normalize an M vs. A plot
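
The two normalization ideas named on slide 19 can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the Oncomine pipeline itself: the function names, the toy values, and the tie-breaking rule are assumptions, and loess normalization of M vs. A plots is not covered here.

```python
from statistics import mean, pstdev

def mean_variance_normalize(values):
    """Global normalization: shift and scale one array to mean 0, variance 1."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]

def quantile_normalize(arrays):
    """Quantile normalization across arrays of equal length.

    Each value is replaced by the mean of the values that share its rank
    across all arrays. Ties are broken by position (a simplification).
    """
    n = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    # Reference distribution: mean of the k-th smallest value over all arrays.
    reference = [mean(s[k] for s in sorted_arrays) for k in range(n)]
    normalized = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])  # indices by ascending value
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized

# Toy example (hypothetical values): three arrays measuring four features each.
chips = [[5.0, 2.0, 3.0, 4.0],
         [4.0, 1.0, 4.5, 2.0],
         [3.0, 4.0, 6.0, 8.0]]
qn = quantile_normalize(chips)
z = mean_variance_normalize([1.0, 2.0, 3.0])
```

After quantile normalization every array has the same sorted values, so arrays differ only in which feature holds which value.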

  20. Data Storage • Generic data structures to accommodate a variety of data • Samples • Microarray Features / Genes • Normalized Data • Statistical Tests • Gene Sets
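
As a rough picture of what "generic data structures" for the five entities on slide 20 might look like, here is a sketch using Python dataclasses. Every class, field name, and value below is hypothetical; the actual Oracle schema is shown only as images in the slides that follow and is not reproduced in this transcript.

```python
from dataclasses import dataclass, field

@dataclass
class Sample:
    sample_id: int
    study: str
    # Free-form annotations so that studies with different clinical
    # variables can share one structure (hypothetical design choice).
    attributes: dict = field(default_factory=dict)

@dataclass
class Feature:
    feature_id: int
    gene_symbol: str

@dataclass
class Measurement:
    sample_id: int
    feature_id: int
    value: float  # normalized expression value

@dataclass
class StatisticalTest:
    feature_id: int
    comparison: str  # e.g. "cancer vs. normal"
    statistic: float
    p_value: float

@dataclass
class GeneSet:
    name: str
    source: str
    gene_symbols: set = field(default_factory=set)

# Hypothetical usage with made-up identifiers.
s1 = Sample(1, "study_A", {"tissue": "prostate", "type": "cancer"})
s2 = Sample(2, "study_A")
g = Feature(10, "GENE_X")
m = Measurement(s1.sample_id, g.feature_id, 1.25)
```

Keeping sample annotations as key-value pairs is one way to accommodate studies that record different variables without changing the structure for each new dataset.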

  21-22. Samples

  23. Microarray Features / Genes

  24. Normalized Data

  25. Gene Sets

  26-27. Statistical Tests

  28. Outline • Background • DNA Microarrays and the Cancer Transcriptome • ONCOMINE • Data collection, normalization & schema • Statistical Analysis • Visualization of Data and Analysis • ONCOMINE Data Integration • Therapeutic Targets / Biomarkers • Metabolic and Signaling Pathways • Known protein-protein Interactions • ONCOMINE tutorial

  29. Differential Expression Analysis • Two-sided t-test for each gene • False discovery rate correction for multiple hypothesis testing
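
The per-gene test and the multiple-testing correction from slide 29 can be sketched as follows. The slide does not say which t-test variant or which false discovery rate procedure was used, so the pooled-variance Student statistic and the Benjamini-Hochberg adjustment below are assumptions, as are the function names and toy numbers. Converting the statistic to a two-sided p-value needs the t distribution (for example `scipy.stats.ttest_ind`), which is left out to keep the sketch dependency-free.

```python
from math import sqrt
from statistics import mean, variance

def t_statistic(group_a, group_b):
    """Student's two-sample t statistic with pooled variance."""
    na, nb = len(group_a), len(group_b)
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled * (1 / na + 1 / nb))

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q

# Toy example: one gene measured in two groups, then four hypothetical p-values.
t = t_statistic([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
q_values = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
```

With one test per gene and thousands of genes per study, some correction of this kind is what keeps the expected fraction of false positives among the reported genes under control.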

  30. R, Oracle, RODBC

  31. Outline • Background • DNA Microarrays and the Cancer Transcriptome • ONCOMINE • Data collection, normalization & storage • Statistical Analysis • Visualization of Data and Analysis • ONCOMINE Data Integration • Therapeutic Targets / Biomarkers • Metabolic and Signaling Pathways • Known protein-protein Interactions • ONCOMINE tutorial

  32. Oncomine Tutorial part I • Gene Differential Expression • Gene Co-Expression • Study Differential Expression • WWW.ONCOMINE.ORG • Email: SHORTCOURSE • Password: MCBI

  33. Outline • Background • DNA Microarrays and the Cancer Transcriptome • ONCOMINE • Data collection, normalization & storage • Statistical Analysis • Visualization of Data and Analysis • ONCOMINE Data Integration • Therapeutic Targets / Biomarkers • Metabolic and Signaling Pathways • Known protein-protein Interactions • ONCOMINE tutorial

  34. Therapeutic Targets / Biomarkers • Gene Ontology Consortium • Biological Process (apoptosis, cell cycle) • Cellular Component (cytoplasmic membrane, extracellular) • Molecular Function (kinase, phosphatase, protease, etc.) • Known Therapeutic Targets • NCI Clinical Trials Database • Therapeutic Target Database

  35. Therapeutic Target Database • 338 proteins with a literature-documented inhibitor, antagonist, blocker, etc. • http://xin.cz3.nus.edu.sg/group/cjttd/ttd.asp

  36. Known Drug Targets Expressed in Bladder Cancer

  37. Secreted proteins highly expressed in Ovarian Cancer

  38. Outline • Background • DNA Microarrays and the Cancer Transcriptome • ONCOMINE • Data collection, normalization & storage • Statistical Analysis • Visualization of Data and Analysis • ONCOMINE Data Integration • Therapeutic Targets / Biomarkers • Metabolic and Signaling Pathways • Known protein-protein Interactions • ONCOMINE tutorial

  39. Metabolic & Signaling Pathways • KEGG • Kyoto Encyclopedia of Genes & Genomes • 87 metabolic pathways, 1700 gene assignments • Biocarta • Signaling pathways reviewed and entered by ‘expert’ biologists • 215 signaling pathways, 3700 gene assignments

  40. Pathway enrichment analysis • Identify pathways and functional groups of genes deregulated in particular cancer types • Enrichment analysis using Kolmogorov-Smirnov scanning (Lamb et al.)

  41. Kolmogorov-Smirnov Scanning (Lamb et al.) • Ranked list of 20 genes with gene-set members (*) at ranks 2, 4, 6, 7 and 18 • Compare all ranks (1, 2, 3, 4, …, 19, 20) vs. gene-set ranks (2, 4, 6, 7, 18)
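
The toy example on slide 41 (gene-set members at ranks 2, 4, 6, 7 and 18 of a 20-gene ranked list) can be worked through with a plain two-sample Kolmogorov-Smirnov statistic. This is a generic sketch; the scanning procedure of Lamb et al. may differ in its details, and the function name and return format are assumptions.

```python
def ks_enrichment(n_genes, hit_ranks):
    """Largest gap between the cumulative fraction of gene-set members and
    of non-members while walking down a ranked list (ranks are 1-based).

    Returns (signed deviation, rank at which it occurs).
    """
    hits = set(hit_ranks)
    n_hits = len(hits)
    n_miss = n_genes - n_hits
    seen_hits = seen_miss = 0
    best_dev, best_rank = 0.0, 0
    for rank in range(1, n_genes + 1):
        if rank in hits:
            seen_hits += 1
        else:
            seen_miss += 1
        dev = seen_hits / n_hits - seen_miss / n_miss
        if abs(dev) > abs(best_dev):
            best_dev, best_rank = dev, rank
    return best_dev, best_rank

# Slide example: 20 ranked genes, gene-set members at ranks 2, 4, 6, 7, 18.
score, peak_rank = ks_enrichment(20, [2, 4, 6, 7, 18])
```

For the slide's example the deviation peaks at about 0.6 at rank 7: four of the five gene-set members have been seen by then, against only three of the fifteen other genes, which is what enrichment near the top of the list looks like.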

  42. Pathway Enrichment Liver vs. other Normal tissues

  43. Pathway Enrichment cont.

  44. Pathway enrichment analysis • A search for the Biocarta pathways most enriched in a medulloblastoma signature (C2) uncovered involvement of the Ras/Rho pathway

  45. Pathway enrichment analysis cont. • A direct link to the Biocarta pathway provides the details (medulloblastoma genes are marked with red boxes)

  46. Outline • Background • DNA Microarrays and the Cancer Transcriptome • ONCOMINE • Data collection, normalization & storage • Statistical Analysis • Visualization of Data and Analysis • ONCOMINE Data Integration • Therapeutic Targets / Biomarkers • Metabolic and Signaling Pathways • Known protein-protein Interactions • ONCOMINE tutorial

  47. Known Protein-Protein Interactions • HPRD • Human Protein Reference Database • Manually curated • 20,000+ papers, 15,000+ distinct interactions • PKDB • Protein Kinase Database • Natural language processing • 60,000+ abstracts suggest an interaction, 16,000 distinct interactions • Error prone • Co-RIF • LocusLink Reference into Function • 12,000+ co-RIFs

  48. Human Interactome Map (www.himap.org)

  49. INTERACT
