Gene Expression Data Analysis Lab Session

Gene Expression Data Analysis Lab Session. CAD course, Jian Li, 01.28.2011. Gene expression signatures are loosely defined here as sets of genes that are functionally associated with each other in some way.

Presentation Transcript


  1. Gene Expression Data Analysis Lab Session. CAD course, Jian Li, 01.28.2011

  2. Gene expression signatures
     • Loosely defined here as a set of genes that are functionally associated with each other in some way.
     • When expression profiling is used to define the genes, a gene expression signature consists of two things:
       • A set of genes going “up” (relative to something).
       • A set of genes going “down” (relative to something).

  3. Gene expression profiling of IGF-I-stimulated MCF-7 cells

  4. Five oncogenic pathway signatures in human cancers: MYC, Ras, E2F3, b-cat, Src

  5. (1) One combined signature (3,4) compare 5 signatures (2)

  6. Course webpage

  7. Excel functions/features you will need for the computational exercise

  8. TTEST
     TTEST(array1, array2, tails, type)
     • array1 is the first data set.
     • array2 is the second data set.
     • tails specifies the number of distribution tails (use “2”, a two-tailed test).
     • type is the kind of t-test to perform (use “2”, the two-sample equal-variance test).
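
For readers who want to cross-check a spreadsheet result in R (the language used later in these slides), the following is a minimal sketch. It assumes that TTEST with tails = 2 and type = 2 corresponds to a two-sided, two-sample t-test with equal variances, which in R is `t.test(..., var.equal = TRUE)`. The expression values are made up for illustration and are not course data.

```r
# Toy expression values for one gene (made-up numbers, not course data).
control <- c(5.1, 4.9, 5.3, 5.0)
treated <- c(6.2, 6.0, 6.5, 6.1)

# tails = 2 -> alternative = "two.sided"
# type  = 2 -> var.equal = TRUE (pooled-variance, two-sample test)
res <- t.test(control, treated, alternative = "two.sided", var.equal = TRUE)
p_value <- res$p.value
print(p_value)
```

Leaving `var.equal` at its default (`FALSE`) would run Welch's unequal-variance test instead, so the p-value would generally not match the spreadsheet.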

  9. AVERAGE
     AVERAGE(number1, number2, ...)
     • number1, number2, ... are 1 to 30 numeric arguments for which you want the average.
     • The arguments must be numbers, or names, arrays, or references that contain numbers.

  10. Data > Filter > AutoFilter
      • Arrows appear to the right of the column labels.
      • Filtered items appear in blue.
      • Complex criteria: rows that contain values within a specific range (e.g. p < 0.01).
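
The same "compute a p-value per gene, then keep rows with p < 0.01" workflow can be sketched in R. This is only an illustration under stated assumptions: the matrix `expr`, the gene names, and the sample layout (three control columns followed by three treated columns) are all hypothetical.

```r
# Toy matrix: 3 genes (rows) x 6 samples (3 control, 3 treated); made-up values.
expr <- rbind(
  geneA = c(5.0, 5.2, 4.9, 7.1, 7.3, 6.9),
  geneB = c(3.1, 2.8, 3.3, 3.0, 3.2, 2.9),
  geneC = c(8.0, 8.1, 7.9, 6.0, 6.2, 5.9)
)
is_treated <- c(FALSE, FALSE, FALSE, TRUE, TRUE, TRUE)

# One equal-variance, two-sided t-test per row (gene).
p_values <- apply(expr, 1, function(v) {
  t.test(v[!is_treated], v[is_treated], var.equal = TRUE)$p.value
})

# The AutoFilter step: keep only the genes with p < 0.01.
significant <- names(p_values)[p_values < 0.01]
print(significant)
```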

  11. MATCH
      MATCH(lookup_value, lookup_array, match_type)
      • lookup_value: the value you are looking for.
      • lookup_array: the range of cells to search.
      • match_type: should be 0 (exact match) for our purposes.

  12. (Don’t forget the $: it keeps the lookup_array range fixed when the formula is copied to other cells.)

  13. COUNT
      COUNT(range)
      • range: the cells to count.
      • Only numbers in the range are counted. Empty cells, logical values, text, and error values in the array or reference are ignored.
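
The MATCH and COUNT steps above can be mirrored in R with `match()`, which returns the position of the first exact match and `NA` where Excel would show an error value. This is a sketch only; the gene symbols and both vectors are hypothetical stand-ins for the platform gene list and a signature.

```r
# Hypothetical gene symbols, used only for illustration.
platform_genes  <- c("MYC", "TP53", "ESR1", "CCND1", "GAPDH")
signature_genes <- c("CCND1", "MYC", "BRCA1")

# Position of each signature gene in the platform list (NA if absent),
# analogous to MATCH(lookup_value, lookup_array, 0).
pos <- match(signature_genes, platform_genes)
print(pos)

# Counting the non-missing positions plays the role of COUNT over the
# column of MATCH results, where non-numeric error cells are ignored.
n_found <- sum(!is.na(pos))
print(n_found)
```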

  14. Compare two signatures
      • Sig A: 1152
      • Sig B: 119
      • Genes on both platforms: 11079
      • Genes shared by both gene signatures: 44
      • Test: one-sided Fisher's exact test

  15. R function for one-sided Fisher's exact test: dhyper
      • Example:
        • 100 balls
        • 10 of the balls are red
        • I grab 20 balls
        • Five of my 20 balls are red
        • Was the number of red balls I selected a significant number?
      > m <- 10   # number of red balls
      > n <- 90   # number of other balls (total pop - m)
      > k <- 20   # number of balls selected
      > x <- 0:k  # vector of successes
      > 1 - sum(dhyper(x, m, n, k)[1:5])
      [1] 0.02546455

  16. R function for one-sided Fisher's exact test: dhyper
      • Sig A: 1162
      • Sig B: 119
      • Genes on both platforms: 11079
      • Genes shared by both gene signatures: 44
      > m <- 119          # number of Sig B genes
      > n <- 11079 - 119  # number of other genes
      > k <- 1162         # number of Sig A genes
      > x <- 0:k          # vector of successes
      > 1 - sum(dhyper(x, m, n, k)[1:44])
      [1] 1.265654e-14
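
As a cross-check on the `dhyper` calculations above, the same upper-tail probability can be obtained from `phyper(..., lower.tail = FALSE)` and, for the ball example, from `fisher.test(..., alternative = "greater")`. The sketch below is not part of the original slides; the helper name `enrich_p` is hypothetical. One possible reason to prefer `phyper` is that subtracting a sum very close to 1 from 1 loses precision when the p-value is tiny.

```r
# One-sided (enrichment) p-value: P(X >= overlap) under the hypergeometric model.
# m = marked items (e.g. Sig B genes), n = unmarked items, k = items drawn.
enrich_p <- function(overlap, m, n, k) {
  phyper(overlap - 1, m, n, k, lower.tail = FALSE)
}

# Ball example from the slides: 10 red, 90 other, 20 drawn, 5 red drawn.
p_tail <- enrich_p(5, m = 10, n = 90, k = 20)
p_sum  <- 1 - sum(dhyper(0:4, 10, 90, 20))   # the slides' formulation

# Same test through fisher.test on the 2 x 2 table
# (rows: drawn / not drawn; columns: red / other).
tab <- matrix(c(5, 5, 15, 75), nrow = 2)
p_fisher <- fisher.test(tab, alternative = "greater")$p.value

print(c(p_tail, p_sum, p_fisher))

# Signature example, using the slide-16 numbers.
p_sig <- enrich_p(44, m = 119, n = 11079 - 119, k = 1162)
print(p_sig)
```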

  17. GSEA (rank-based) enrichment analysis
      All the genes in the dataset are used here.
      Subramanian, Aravind et al. (2005) Proc. Natl. Acad. Sci. USA 102, 15545-15550
      • Start from the top of the ranked list.
      • Add points to the “random walk” for each gene you find in S.
      • Remove points from the “random walk” for each gene not in S.
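
The random walk described above can be sketched in R as follows. This is a simplified, unweighted variant written for illustration, not the published GSEA implementation: every gene in S adds the same amount and every gene not in S subtracts the same amount, chosen so that the walk ends at zero. The function names and the toy gene IDs are hypothetical.

```r
# ranked_genes: gene IDs ordered from the top to the bottom of the ranked list.
# gene_set:     the set S being tested.
running_sum <- function(ranked_genes, gene_set) {
  hit <- ranked_genes %in% gene_set
  n_hit <- sum(hit)
  n_miss <- length(ranked_genes) - n_hit
  stopifnot(n_hit > 0, n_miss > 0)
  # Equal-sized steps within each class, so the walk ends at zero.
  cumsum(ifelse(hit, 1 / n_hit, -1 / n_miss))
}

enrichment_score <- function(ranked_genes, gene_set) {
  walk <- running_sum(ranked_genes, gene_set)
  walk[which.max(abs(walk))]   # signed maximum deviation from zero
}

# Toy ranked list (hypothetical IDs g1..g10) with S concentrated at the top.
ranked <- paste0("g", 1:10)
S <- c("g1", "g2", "g4")
walk <- running_sum(ranked, S)
es <- enrichment_score(ranked, S)
print(round(walk, 3))
print(es)
```

A set whose genes cluster near the top of the list pushes the walk up early, giving a large positive score; a set scattered evenly through the list keeps the walk close to zero.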

  18. GSEA (rank-based) enrichment analysis: assign nominal P value
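
One common way to attach a nominal P value to an enrichment score is a permutation test. The sketch below is an assumption-laden illustration rather than the published procedure: it shuffles the gene order to build the null distribution (a simpler scheme than permuting sample labels), uses an unweighted score, and all names (`es_unweighted`, `nominal_p`, the gene IDs) are hypothetical.

```r
# Unweighted enrichment score from a plain up/down random walk.
es_unweighted <- function(ranked_genes, gene_set) {
  hit <- ranked_genes %in% gene_set
  walk <- cumsum(ifelse(hit, 1 / sum(hit), -1 / sum(!hit)))
  max(walk)   # positive-side score: enrichment at the top of the list
}

nominal_p <- function(ranked_genes, gene_set, n_perm = 1000) {
  observed <- es_unweighted(ranked_genes, gene_set)
  # Null distribution: shuffle the gene order, recompute the score.
  null_es <- replicate(n_perm, es_unweighted(sample(ranked_genes), gene_set))
  # Add 1 to numerator and denominator so the estimate is never exactly zero.
  (sum(null_es >= observed) + 1) / (n_perm + 1)
}

set.seed(1)   # for a reproducible illustration
ranked <- paste0("g", 1:200)          # hypothetical gene IDs
S <- paste0("g", c(1:8, 15, 30))      # set concentrated near the top
p <- nominal_p(ranked, S)
print(p)
```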

  19. step 1 step 2 status/result

  20. GSEA (rank-based) enrichment analysis
      Figure callouts: (1) (3) (2)
      All the genes in the dataset are used here.
      Subramanian, Aravind et al. (2005) Proc. Natl. Acad. Sci. USA 102, 15545-15550
      • Start from the top of the ranked list.
      • Add points to the “random walk” for each gene you find in S.
      • Remove points from the “random walk” for each gene not in S.

  21. Rank-based enrichment analysis
      • Rank-based approaches use all of the genes from one of the datasets to determine enrichment (they do not make a “cut”).
      • Figure labels: locations of genes from set Y; rank-ordered genes from dataset X.
