
Lecture 4 Microarray & Analysis

Lecture 4 Microarray & Analysis. Alizadeh et al. Nature 403 (2000) 503-511. Microarray revolutionized biology and medicine research. One gene at a time before, now tens of thousands simultaneously - PROTEOMICS Gene expression Gene disease relation Gene-gene interaction


Presentation Transcript


  1. Lecture 4: Microarray & Analysis. Alizadeh et al., Nature 403 (2000) 503-511

  2. Microarrays revolutionized research in biology and medicine • Before, one gene at a time; now, tens of thousands simultaneously - PROTEOMICS • Gene expression • Gene-disease relation • Gene-gene interaction • Finding co-regulated genes • Understanding gene regulatory networks • Many, many more

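  3. Basic idea of Microarray • Construction principle • Complementary base sequences that characterize genes, called probes, are arrayed on a microchip • Application principle • A liquid sample containing gene sequences is applied to the microchip • Using the principle of complementary base hybridization, the required information is extracted from how the sample interacts with the gene sequences on the microchip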

  4. Basic idea of Microarray • Construction • Place an array of probes on a microchip • A probe is (for example) an oligonucleotide ~25 bases long that characterizes a gene or genome • Each probe is present in many, many copies (clones) • Chip is about 2 cm by 2 cm • Application principle • Put the (liquid) sample containing genes on the microarray, allow probe and gene sequences to hybridize, and wash away the rest • Analyze the hybridization pattern
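The hybridization step on this slide can be illustrated with a toy model. The sketch below is idealized: it assumes a probe binds only when the sample fragment contains its exact reverse complement, which ignores mismatches and binding chemistry. The probe sequence, the fragments, and the function names are made up for illustration.

```python
# Watson-Crick base pairing: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the strand that pairs with `seq`, read in the opposite direction."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def hybridizes(probe: str, target: str) -> bool:
    """Idealized test: the target contains the probe's exact reverse complement."""
    return reverse_complement(probe) in target


# Hypothetical 25-base probe and two made-up sample fragments.
probe = "ATGCGTACCTGAAGTCCATGGCATA"
matching_fragment = "GG" + reverse_complement(probe) + "TT"
unrelated_fragment = "A" * 30

print(hybridizes(probe, matching_fragment))   # True
print(hybridizes(probe, unrelated_fragment))  # False
```

On a real chip the readout is a fluorescence intensity per probe rather than a yes/no answer; the boolean here only stands in for "signal" versus "washed away".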

  5. cDNA microarray schema (how a cDNA chip is manufactured)

  6. Microarray analysis • Operation principle: samples are tagged with fluorescent material to show the pattern of sample-probe interaction (hybridization) • A microarray may have 60K probes

  7. Microarray Processing sequence From: Shin-Mu Tseng tsengsm@mail.ncku.edu.tw

  8. Gene Expression Data: mRNA samples. Gene expression data on p genes (rows) for n samples (columns):

       Gene   sample1  sample2  sample3  sample4  sample5  …
       1        0.46     0.30     0.80     1.51     0.90   ...
       2       -0.10     0.49     0.24     0.06     0.46   ...
       3        0.15     0.74     0.04     0.10     0.20   ...
       4       -0.45    -1.03    -0.79    -0.56    -0.32   ...
       5       -0.06     1.06     1.35     1.09    -1.09   ...

     Entry (i, j) is the gene expression level of gene i in mRNA sample j: Log(Red intensity / Green intensity) for two-color cDNA arrays, or Log(Avg. PM - Avg. MM) for oligonucleotide arrays with perfect-match (PM) and mismatch (MM) probes.
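As a rough illustration of how one entry of such a table arises on a two-color array, the sketch below turns red and green intensities into log ratios. The slide does not state the base of the logarithm; base 2 is assumed here, and the intensity values are invented.

```python
import math


def log_ratio(red: float, green: float) -> float:
    """log2(red / green): positive when the red-labelled sample is more abundant."""
    return math.log2(red / green)


# Hypothetical background-corrected intensities: rows are genes, columns are samples.
red = [[1400.0, 1200.0], [900.0, 1500.0]]
green = [[1000.0, 1000.0], [1000.0, 1000.0]]

# Build the genes x samples expression matrix entry by entry.
expression = [
    [log_ratio(r, g) for r, g in zip(red_row, green_row)]
    for red_row, green_row in zip(red, green)
]

for gene, row in enumerate(expression, start=1):
    print(gene, [round(value, 2) for value in row])
```

With base 2, a value of 1 means a two-fold excess of red over green and -1 a two-fold deficit, which is why the table mixes positive and negative entries.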

  9. Some possible applications • Sample from a specific organ to show which genes are expressed • Compare samples from healthy and sick hosts to find gene-disease connections • Use probes representing sets of human pathogens for disease detection

  10. Amount of data from a single microarray is huge • With just two colors, an array with N probes yields 2N values • Cannot analyze pixel by pixel • Analyze by pattern: cluster analysis

  11. Major Data Mining Techniques • Link Analysis • Associations Discovery • Sequential Pattern Discovery • Similar Time Series Discovery • Predictive Modeling • Classification • Clustering

  12. Cluster Analysis: grouping similarly expressed genes, cell samples, or both • Strengthens signal when averages are taken within clusters of genes (Eisen) • Useful (essential?) when seeking new subclasses of cells, tumours, etc. • Leads to readily interpreted figures

  13. Some clustering methods and software • Partitioning: K-Means, K-Medoids, PAM, CLARA, … • Hierarchical: Cluster, HAC, BIRCH, CURE, ROCK • Density-based: CAST, DBSCAN, OPTICS, CLIQUE, … • Grid-based: STING, CLIQUE, WaveCluster, … • Model-based: SOM (self-organizing map), COBWEB, CLASSIT, AutoClass, … • Two-way clustering • Block clustering

  14. A review paper assessing various methods • "Algorithmic Approaches to Clustering Gene Expression Data", Ron Shamir, School of Computer Science, Tel-Aviv University • http://citeseer.nj.nec.com/shamir01algorithmic.html • Conclusion: hierarchical clustering exceptional

  15. Partitioning

  16. Density-based clustering

  17. Hierarchical (used most often): agglomerative or divisive

  18. Hierarchical Clustering: grouping similarly expressed genes. Gene Expression Profile Analysis (rows: samples A, B, C, …; columns: genes 1 to 1000)

       gene:   1     2     3     4    ..   ..   1000
               0.4   0.9   0     0.5  ..   ..   0.8
               0.2   0.8   0.3   0.2  ..   ..   0.7
               0.6   0.2   0     0.7  ..   ..   0.3
               …     …     …     …    …    …    …

      From: Shin-Mu Tseng tsengsm@mail.ncku.edu.tw

  19. After Clustering. Gene Expression Profile Analysis (same samples; gene columns reordered so that similar profiles are adjacent)

       gene:   ..   3     1     4    ..   2     1000
               ..   0     0.4   0.5  ..   0.9   0.8
               ..   0.3   0.2   0.2  ..   0.8   0.7
               ..   0     0.6   0.7  ..   0.2   0.3
               …    …     …     …    …    …     …

      From: Shin-Mu Tseng tsengsm@mail.ncku.edu.tw

  20. Figure: clustered data compared with the same data randomized by row, by column, and by both (axis: time). Eisen et al. Proc. Natl. Acad. Sci. USA 95 (1998)

  21. Types of Similarity Measurements • distance measurements • correlation coefficients • association coefficients • probabilistic similarity coefficients

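  22. Correlation Coefficients • The most popular correlation coefficient is the Pearson correlation coefficient (1892) • Correlation between X = {X1, X2, …, Xn} and Y = {Y1, Y2, …, Yn}: $s_{XY} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2 \, \sum_{i=1}^{n} (Y_i - \bar{Y})^2}}$ • where $s_{XY}$ is the similarity between X and Y, and $\bar{X}$, $\bar{Y}$ are the means of X and Y. From: Shin-Mu Tseng tsengsm@mail.ncku.edu.tw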
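A minimal sketch of the Pearson correlation between two expression profiles, written directly from the standard definition. The two profiles are rows 1 and 4 of the expression table on slide 8; the function name and the choice to reject constant profiles are assumptions made for this example.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of centered cross-products and centered squares.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / math.sqrt(var_x * var_y)


# Genes 1 and 4 from the expression table on slide 8 (five samples each).
gene1 = [0.46, 0.30, 0.80, 1.51, 0.90]
gene4 = [-0.45, -1.03, -0.79, -0.56, -0.32]

print(round(pearson(gene1, gene4), 3))
```

The result always lies between -1 and 1, and it is unchanged when a profile is shifted or rescaled, so two genes with the same shape of response score as similar even if their absolute levels differ.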

  23. Now can use similarity for tree construction • Normalize similarity so that $s_{XX} = 1$ • Then have an n×n similarity matrix S whose diagonal elements are 1 • Define a distance matrix by (for example) D = 1 - S; diagonal elements of D are 0 • Now use the distance matrix to build a tree (using some tree-building software; recall lecture on Phylogeny)
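The construction D = 1 - S can be sketched as follows, assuming the similarity is the Pearson correlation (which already satisfies s_XX = 1). The five profiles are the rows of the slide-8 table; the function names are illustrative, not taken from any particular package.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length, non-constant profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def distance_matrix(profiles):
    """D = 1 - S, with the diagonal set to exactly 0 (s_XX = 1)."""
    n = len(profiles)
    return [
        [0.0 if i == j else 1.0 - pearson(profiles[i], profiles[j]) for j in range(n)]
        for i in range(n)
    ]


# The five gene profiles from the expression table on slide 8.
profiles = [
    [0.46, 0.30, 0.80, 1.51, 0.90],
    [-0.10, 0.49, 0.24, 0.06, 0.46],
    [0.15, 0.74, 0.04, 0.10, 0.20],
    [-0.45, -1.03, -0.79, -0.56, -0.32],
    [-0.06, 1.06, 1.35, 1.09, -1.09],
]

D = distance_matrix(profiles)
for row in D:
    print([round(d, 2) for d in row])
```

With this choice distances range from 0 (perfectly correlated) to 2 (perfectly anti-correlated). If anti-correlated genes should count as close, 1 - |S| is a common alternative; which one fits depends on the biological question.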

  24. A dendrogram (tree) for clustered genes • Let p = number of genes. 1. Calculate within-class correlation. 2. Perform hierarchical clustering, which will produce (2p-1) clusters of genes. 3. Average within clusters of genes. 4. Perform testing on averages of clusters of genes as if they were single genes. • E.g. p = 5: genes 1 2 3 4 5; Cluster 6 = (1,2), Cluster 7 = (1,2,3), Cluster 8 = (4,5), Cluster 9 = (1,2,3,4,5)
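The count of 2p-1 clusters (p single genes plus p-1 merges) can be seen in a small agglomerative sketch. Average linkage is assumed here; the slide does not say which linkage rule was used. The distance matrix is hypothetical, chosen so that the merges reproduce the p = 5 example, and `cluster_mean` illustrates step 3 (averaging within a cluster).

```python
from itertools import combinations


def average_linkage(dist):
    """Agglomerative clustering on a symmetric distance matrix.

    Returns a dict: cluster id -> tuple of 1-based gene numbers.
    Ids 1..p are single genes; ids p+1..2p-1 are merges, in merge order.
    """
    p = len(dist)
    clusters = {i + 1: (i + 1,) for i in range(p)}
    active = set(clusters)
    next_id = p + 1

    def linkage(a, b):
        # Mean distance over all gene pairs spanning the two clusters.
        pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
        return sum(dist[i - 1][j - 1] for i, j in pairs) / len(pairs)

    while len(active) > 1:
        a, b = min(combinations(sorted(active), 2), key=lambda ab: linkage(*ab))
        clusters[next_id] = tuple(sorted(clusters[a] + clusters[b]))
        active -= {a, b}
        active.add(next_id)
        next_id += 1
    return clusters


def cluster_mean(profiles, members):
    """Average expression profile over the genes (1-based) in one cluster."""
    rows = [profiles[g - 1] for g in members]
    return [sum(column) / len(rows) for column in zip(*rows)]


# Hypothetical distances between 5 genes (not derived from real data).
dist = [
    [0.0, 0.1, 0.2, 0.9, 0.9],
    [0.1, 0.0, 0.2, 0.9, 0.9],
    [0.2, 0.2, 0.0, 0.9, 0.9],
    [0.9, 0.9, 0.9, 0.0, 0.3],
    [0.9, 0.9, 0.9, 0.3, 0.0],
]

clusters = average_linkage(dist)
for cluster_id in sorted(clusters):
    print(cluster_id, clusters[cluster_id])
```

The cluster averages from `cluster_mean` are what step 4 would then test as if each cluster were a single gene.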

  25. A real case • Nature, Feb 2000 • Paper by Alizadeh, A. et al.: "Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling"

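  26. Validation Techniques: Hubert's Γ Statistic • X = [X(i, j)] and Y = [Y(i, j)] are two n×n matrices • X(i, j): similarity of gene i and gene j • Y(i, j) = 1 if genes i and j are in the same cluster, 0 otherwise • Hubert's Γ statistic represents the point serial correlation between X and Y; in its standard normalized form, $\Gamma = \frac{1}{M} \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \frac{(X(i,j) - \bar{X})(Y(i,j) - \bar{Y})}{\sigma_X \sigma_Y}$ • where M = n (n - 1) / 2, and $\bar{X}$, $\bar{Y}$, $\sigma_X$, $\sigma_Y$ are the means and standard deviations of the entries of X and Y over the M pairs i < j • A higher value of Γ represents better clustering quality. From: Shin-Mu Tseng tsengsm@mail.ncku.edu.tw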
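A sketch of the normalized Hubert's Γ, assuming Y(i, j) is the 0/1 same-cluster indicator and that means and standard deviations are taken over the M pairs with i < j. The similarity matrix and the two labelings are invented for illustration.

```python
import math
from itertools import combinations


def hubert_gamma(similarity, labels):
    """Normalized Hubert's Gamma between a similarity matrix and a clustering.

    similarity: n x n symmetric matrix X; labels[i] is the cluster of gene i.
    """
    n = len(labels)
    pairs = list(combinations(range(n), 2))  # the M = n(n-1)/2 pairs with i < j
    m = len(pairs)
    x = [similarity[i][j] for i, j in pairs]
    y = [1.0 if labels[i] == labels[j] else 0.0 for i, j in pairs]
    mean_x, mean_y = sum(x) / m, sum(y) / m
    sd_x = math.sqrt(sum((v - mean_x) ** 2 for v in x) / m)
    sd_y = math.sqrt(sum((v - mean_y) ** 2 for v in y) / m)
    if sd_x == 0 or sd_y == 0:
        raise ValueError("Gamma is undefined when X or Y is constant over all pairs")
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return cross / (m * sd_x * sd_y)


# Hypothetical similarities: genes 0,1 resemble each other, as do genes 2,3.
similarity = [
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.8],
    [0.1, 0.1, 0.8, 1.0],
]

good = [0, 0, 1, 1]  # clustering that matches the similarity structure
bad = [0, 1, 0, 1]   # clustering that cuts across it

print(round(hubert_gamma(similarity, good), 3))
print(round(hubert_gamma(similarity, bad), 3))
```

The first labeling scores close to 1 and the second scores negative, matching the slide's reading that a higher Γ indicates a better clustering.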

  27. Discovering sub-groups

  28. Gene Expression is time-dependent Time Course Data

  29. Sample of time course of clustered genes (figure panels; horizontal axis: time)

  30. Limitations • Cluster analyses: • Usually outside the normal framework of statistical inference • Less appropriate when only a few genes are likely to change • Needs lots of experiments • Single gene tests: • May be too noisy in general to show much • May not reveal coordinated effects of positively correlated genes. • Hard to relate to pathways

  31. Some useful links • Affymetrix: www.affymetrix.com • Michael Eisen Lab at LBL (hierarchical clustering software "Cluster" and "Tree View" (Windows)): rana.lbl.gov/ • Stanford MicroArray Database ("Xcluster" (Linux)): genome-www4.stanford.edu/MicroArray/SMD/ • Review of Currently Available Microarray Software: www.the-scientist.com/yr2001/apr/profile1_010430.html • Microarray DB: www.biologie.ens.fr/en/genetiqu/puces/bddeng.html

  32. Some papers • Eisen, M. B. et al. (1998). "Cluster analysis and display of genome-wide expression patterns." Proc Natl Acad Sci U S A 95(25): 14863-8. • Wen, X. et al. (1998). "Large-scale temporal gene expression mapping of central nervous system development." Proc Natl Acad Sci U S A 95(1): 334-9. • Alon, U. et al. (1999). "Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays." Proc Natl Acad Sci U S A 96: 6745-6750. • Spellman, P. T. et al. (1998). "Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization." Mol Biol Cell 9(12): 3273-97.
