
Analysis of High-throughput Gene Expression Profiling


Presentation Transcript


  1. Analysis of High-throughput Gene Expression Profiling

  2. Why Measure Gene Expression? 1. It shows which genes are induced or repressed in response to a developmental phase or an environmental change. 2. Sets of genes whose expression rises and falls under the same conditions are likely to have related functions. 3. Features such as a common regulatory motif can be detected within co-expressed genes. 4. A pattern of gene expression may be used as an indicator of abnormal cellular regulation. • A useful tool for cancer diagnosis

  3. Why Measure Gene Expression on a Large Scale? Traditional vs. High-throughput Approaches

  4. Techniques Used to Detect Gene Expression Level • Microarray (single or dual channel) • SAGE • EST/cDNA library • Northern Blots • Subtractive hybridisation • Differential hybridisation • Representational difference analysis (RDA) • DNA/RNA Fingerprinting (RAP-PCR) • Differential Display (DD-PCR) • aCGH: array CGH (DNA level) High-throughput

  5. Basic Information of Microarray, SAGE and cDNA Library

  6. (DNA) Microarray 1. Developed around 1987. 2. Employs methods previously exploited in the immunoassay context: specific binding and marking techniques. 3. Two types of probes: Format I: probe cDNA (500~5,000 bases long) is immobilized on a solid surface such as glass; widely considered to have been developed at Stanford University; traditionally called DNA microarrays. Format II: an array of oligonucleotide probes (20~80-mer oligos) is synthesized either in situ (on-chip) or by conventional synthesis followed by on-chip immobilization; developed at Affymetrix, Inc. Many companies are manufacturing oligonucleotide-based chips using alternative in situ synthesis or deposition technologies. Historically called DNA chips.

  7. Microarray • Single Channel: sub-type classification • Dual Channel: differential expression gene screening • Tissue microarray • Protein microarray • ……

  8. Array CGH • Detects DNA copy-number variation using a microarray approach • A hot topic in recent research, especially in cancer research

  9. Microarray Analysis: Which genes are up-regulated, down-regulated, co-regulated, not regulated? • Gene discovery • Pattern discovery • Inferences about biological processes • Classification of biological processes

  10. SAGE • Experimental technique designed to obtain a quantitative measure of gene expression. • ~10-20 base “tags” are produced (immediately adjacent to the 3’ end of the 3’-most NlaIII restriction site). • The SAGE technique does not measure the expression level of a gene directly; it quantifies a "tag" that represents the transcription product of a gene.

  11. SAGE: Tags are isolated and concatemerized. Relative expression levels can be compared between cells in different states.

  12. SAGEmap (http://cgap.nci.nih.gov)

  13. SAGE: comparing two relational libraries

  14. EST library (UniGene)

  15. Gene expression information from the UniGene library

  16. An Example of In-house EST Library Analysis

  17. The Algorithms and Challenges of High-throughput Gene Expression Analysis

  18. Seeing is believing? No, the errors need to be corrected first.

  19. SAGE: • A typical experiment requires ~30,000 gene expression comparisons when a normal cell and a diseased cell are compared. • The results depend on the size and reliability of the SAGE libraries. • Statistical measures are used to filter out candidate genes and reduce the dimensionality of the data, but tuning these measures until a good set is found is tedious and time-consuming.

  20. SAGE • TPM (tags per million): a simple normalization method, TPM = Count × 1,000,000 / TotalCount • Bayesian approach: http://cancerres.aacrjournals.org/cgi/content/full/59/21/5403
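The TPM formula on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `tags_per_million` and the tag names and counts in `library` are hypothetical and do not come from the presentation.

```python
def tags_per_million(counts):
    """Scale raw SAGE tag counts to tags per million (TPM).

    counts: dict mapping a tag to its raw count in one library.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("library contains no tags")
    # TPM = Count * 1,000,000 / TotalCount
    return {tag: count * 1_000_000 / total for tag, count in counts.items()}


# Hypothetical library with 100 tags in total.
library = {"TAG_A": 30, "TAG_B": 10, "TAG_C": 60}
tpm = tags_per_million(library)
```

Because every library is scaled to the same total, TPM values from libraries of different sizes become directly comparable.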

  21. Microarray: Sources of errors • systematic • random [Figure axes: log signal intensity, log RNA abundance]

  22. Sources of Errors (Cont.) • Printing and/or tip problems • Labeling and dye effects (differing amounts of RNA labeled between the 2 channels) • Differences in the power of the two lasers (or other scanner problems) • Difference in DNA concentration on arrays (plate effects) • Spatial biases in ratios across the surface of the microarray due to uneven hybridization • cDNA array cannot distinguish alternatively spliced forms

  23. Errors that cannot be corrected by statistics • Competitive hybridization of different targets on the chip • Failure to distinguish different splicing forms • Misinterpretation of time course data when there are not sufficient points • Misinterpretation of relative intensity

  24. Does a clustered time course really mean co-expression? Picture taken from http://genomics.stanford.edu/yeast/additional_figures_link.html Yes, a known system (such as the cell cycle) can be studied this way, but what about unknown systems?

  25. Normalization by iterative linear regression: 1. fit a line (y = mx + b) to the data set; 2. set aside outliers (residuals > 2 × s.e.); 3. repeat until r² changes by < 0.001; 4. then apply the slope and intercept to the original dataset. D. Finkelstein et al. http://www.camda.duke.edu/CAMDA00/abstracts.asp
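The procedure on this slide (fit a line, set aside outliers, repeat until r² stabilizes, apply the slope and intercept) might look roughly like the sketch below. Several details are assumptions rather than the published method: "s.e." is taken to be the residual standard error, "apply" is taken to mean rescaling y as (y − b) / m, and all function names and the example data are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = m*x + b; returns (m, b, r2)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    b = mean_y - m * mean_x
    r2 = (sxy * sxy) / (sxx * syy) if syy > 0 else 1.0
    return m, b, r2


def iterative_regression(xs, ys, tol=0.001, k=2.0, max_iter=50):
    """Refit after setting aside outliers until r2 changes by less than tol."""
    keep = list(range(len(xs)))
    m, b, r2 = fit_line(xs, ys)
    for _ in range(max_iter):
        resid = [ys[i] - (m * xs[i] + b) for i in keep]
        # Assumption: "s.e." is the residual standard error (n - 2 d.o.f.).
        se = math.sqrt(sum(r * r for r in resid) / (len(keep) - 2))
        if se == 0:
            break
        new_keep = [i for i, r in zip(keep, resid) if abs(r) <= k * se]
        if len(new_keep) < 3 or len(new_keep) == len(keep):
            break  # too few points left, or no outliers were found
        keep = new_keep
        m_new, b_new, r2_new = fit_line([xs[i] for i in keep],
                                        [ys[i] for i in keep])
        converged = abs(r2_new - r2) < tol
        m, b, r2 = m_new, b_new, r2_new
        if converged:
            break
    return m, b, r2


def normalize(ys, m, b):
    """Assumption: applying slope and intercept means y -> (y - b) / m."""
    return [(y - b) / m for y in ys]


# Hypothetical data: y is roughly 2x + 1, with one gross outlier at x = 5.
xs = [float(i) for i in range(10)]
ys = [2.0 * x + 1.0 + 0.1 * (-1) ** i for i, x in enumerate(xs)]
ys[5] += 20.0
m, b, r2 = iterative_regression(xs, ys)
normalized = normalize(ys, m, b)
```

Note that the outlier is only excluded from the fit; the final slope and intercept are still applied to every point of the original data set, as the slide describes.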

  26. Normalization (Curvilinear) G Tseng et al., NAR 2001

  27. After Normalization … • Differentially expressed (DE) gene screening: t-test, t-statistics, SVM • Clustering: hierarchical, SOM, k-means • Network (pathway) analysis: BioCarta, KEGG and GO databases; Bayesian network learning; topology • …
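As an illustration of t-statistic screening for differentially expressed genes, here is a minimal sketch using only the Python standard library. The slide does not say which t-test variant is meant; Welch's unequal-variance statistic is used here as an assumption, and the gene names and expression values are made up.

```python
import math
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t-statistic for two groups of expression values."""
    na, nb = len(group_a), len(group_b)
    se = math.sqrt(variance(group_a) / na + variance(group_b) / nb)
    return (mean(group_a) - mean(group_b)) / se


# Hypothetical normalized expression values: (condition A, condition B).
expression = {
    "GENE_1": ([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]),
    "GENE_2": ([2.0, 2.5, 3.0], [2.1, 2.4, 3.1]),
}
t_stats = {gene: welch_t(a, b) for gene, (a, b) in expression.items()}

# Rank genes by the magnitude of the statistic, largest first.
ranked = sorted(t_stats, key=lambda gene: abs(t_stats[gene]), reverse=True)
```

In practice each statistic would be converted to a P value and corrected for multiple testing, since thousands of genes are tested at once.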

  28. Bioinformatics challenges 1. data management 2. utilizing data from multiple experiments 3. utilizing data from multiple groups * with different technologies * with only processed data available

  29. Bioinformatics: Integrated Analysis of Gene Expression Profiling

  30. Large-scale meta-analysis of cancer microarray data identifies common transcriptional profiles of neoplastic transformation and progression. Daniel R. Rhodes et al., PNAS 2004, 101, 9309-9314. T-test. Q values (estimated false discovery rates) were calculated as Q = (P × n) / i, where P is the P value, n is the total number of genes, and i is the sorted rank of the P value.
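The Q-value calculation can be sketched as follows, assuming the formula Q = (P × n) / i that the definitions of P, n and i on this slide point to. The function name and the example P values are hypothetical. Standard false discovery rate procedures usually also cap Q at 1 and enforce monotonicity; neither step is stated on the slide, so both are left out here.

```python
def q_values(p_values):
    """Estimate a false discovery rate for each P value as Q = P * n / i.

    n is the total number of genes and i is the 1-based rank of the
    P value after sorting in ascending order.
    """
    n = len(p_values)
    order = sorted(range(n), key=lambda k: p_values[k])
    q = [0.0] * n
    for rank, k in enumerate(order, start=1):
        q[k] = p_values[k] * n / rank
    return q


# Hypothetical P values for four genes, in their original order.
p = [0.01, 0.04, 0.03, 0.2]
q = q_values(p)
```

Results are returned in the original gene order, so `q[k]` always corresponds to `p[k]`.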

  31. Cont. Meta-Profiling. The purpose of meta-profiling is to address the hypothesis that a selected set of differential expression signatures shares a significant intersection of genes (a meta-signature), thus inferring a biological relatedness.
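The idea of a meta-signature, genes shared across several differential expression signatures, can be illustrated with a simple count. This sketch only finds the shared genes; it does not assess whether the intersection is statistically significant, which the meta-profiling hypothesis requires. The function name, the `min_count` parameter and the gene names are hypothetical.

```python
from collections import Counter


def meta_signature(signatures, min_count):
    """Return the genes present in at least min_count signatures."""
    counts = Counter(gene for sig in signatures for gene in set(sig))
    return {gene for gene, c in counts.items() if c >= min_count}


# Three hypothetical differential expression signatures.
signatures = [
    {"GENE_A", "GENE_B", "GENE_C"},
    {"GENE_A", "GENE_C", "GENE_D"},
    {"GENE_A", "GENE_E"},
]
shared = meta_signature(signatures, min_count=2)
```

A significance test would then ask how often an intersection of this size arises by chance, for example by comparing against randomly drawn gene sets.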

  32. 67 genes were identified by meta-analysis

  33. Integrated Cancer Gene Expression Map

  34. 7 genes were discovered by the system

  35. THANX!!
