Glue Grant H1 Analysis Tutorial

Presentation Transcript


  1. Glue Grant H1 Analysis Tutorial Weihong Xu 11/12/2008 Boston, MA

  2. Outline
     • Introduction to array design and library files
     • Image quantification (DAT->CEL)
     • CEL reduction (CEL->exprCEL, remove SNP)
     • Low level analysis (CEL->Expression Index)
     • Practice session #1
     • Expression Console
     • High level analysis (Expression Index -> Gene List)
     • Practice session #2
     If time permits:
     • Visualization
     • Glue Grant Exon Array Tools (beta-testing)
     • Practice session #3

  3. Introduction to array design
     • Significant change over Affymetrix exon array ST1.0
         • More focused on known transcripts
         • Higher coverage
         • More comprehensive probe selection method
     • More contents:
         • exon probes 3.2M (0.32M targets)
         • junction probes 1M (0.25M targets)
         • coding SNP 1M (85K targets)
         • Untranslated Regions (UTR) 0.5M (50K targets)
         • tiling un-annotated units 0.5M (50K targets)
         • …
     • http://gluegrant1.stanford.edu/wiki/

  4. Some definitions (TC, EC, PSR, Juc, …)

  5. Potential Analysis Questions
     • Gene expression
     • Alternative splicing
     • Transcript isoform deconvolution
     • Allele-specific expression
     • Antisense expression
     • …

  6. Introduction to Library files
     • Support multiple tools:
         • Quality control
         • Low level analysis and expression analysis using APT and Expression Console
         • High level analysis using dChip
         • Glue Grant Exon Analysis Tools
         • Visualization using cisGenomeBrowser or UCSC Genome Browser
     • Library and annotation database
         • http://gluegrant1.stanford.edu/phpMyAdmin/
         • username: ??? password: ???
         • hglue: all tables are read-only
         • GlueArraySandBox: for users to generate personalized library files and annotations

  7. Major types of library files
     • CLF - mapping of probe IDs to x/y in the CEL file
     • PGF - groups probes (by probe ID) into probe sets
     • PS - a list of probe IDs
     • MPS - a list of meta probe set IDs with a corresponding list of probe set IDs
     • BGP - a list of probe IDs to be used in background correction
     • QCC - a table of probe IDs for quality control and their corresponding type
     • KIL - a list of probe IDs to be ignored in DABG (probes with GC < 3)
     • http://www.affymetrix.com/support/developer/powertools/changelog/FILE-FORMATS.html
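To make the MPS description concrete, here is a minimal Python sketch that reads a meta probe set file into a dictionary. It assumes a tab-delimited layout with optional "#" header lines and columns named probeset_id and probeset_list (space-separated probe set IDs); these column names and the example IDs are assumptions for illustration, so check them against the actual hGlue library files before relying on them.

```python
import csv
import io


def read_mps(handle):
    """Map each meta probe set ID to its list of probe set IDs.

    Assumes tab-delimited text, '#' comment lines, and columns named
    'probeset_id' and 'probeset_list' (an assumption, not confirmed
    by the tutorial).
    """
    # Skip blank lines and header/comment lines before parsing.
    rows = (line for line in handle if line.strip() and not line.startswith("#"))
    groups = {}
    for row in csv.DictReader(rows, delimiter="\t"):
        groups[row["probeset_id"]] = row["probeset_list"].split()
    return groups


# Hypothetical two-record example; the IDs are made up.
example = io.StringIO(
    "#%chip_type=example\n"
    "probeset_id\ttranscript_cluster_id\tprobeset_list\tprobe_count\n"
    "TC0001\tTC0001\tPSR01 PSR02 PSR03\t12\n"
    "TC0002\tTC0002\tPSR04\t4\n"
)
mps = read_mps(example)
print(mps["TC0001"])
```

A dictionary of this shape is enough to check which probe sets a given transcript cluster summary is built from.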

  8. Image quantification (DAT->CEL)
     • Function: convert pixel image to probe intensity file
         • Gridding
         • Quantification
     • Software:
         • GeneChip Operating Software (GCOS)
         • Affymetrix GeneChip Command Console (AGCC)
         • http://www.affymetrix.com/products_services/software/specific/command_console_software.affx

  9. CEL file reduction (CEL->exprCEL)
     • Function: remove SNP probes to address the IRB concern
     • Script:
         • Mac/Unix:
           modCEL.unix.pl --xymap=mapping_file \
             --CEL=path/*.CEL --OUTDIR=path --Prefix=expr
         • PC:
           modCEL.pc.pl --xymap=mapping_file \
             --CEL=filename.CEL --OUTDIR=path --Prefix=expr
     • Parameters:
         • xymap: mapping file, hGlue1_0.r3.CEL2exprCEL.xymay
         • Prefix: a string that will be added to the CEL file name

  10. Low Level Analysis (CEL->Expression Index)
      • APT/Expression Console and QC
      • Quality control
      • Extracting specific features
      • Background correction/Normalization/Summarization
      • Practice session (~30 minutes to 1 hr)

  11. APT/Expression Console
      • APT: Affymetrix Power Tools
          • Supports both 3' expression arrays and exon arrays
          • Supports both expression and genotype analysis
      • apt-probeset-summarize -- S(N(B)): summarization of normalized, background-corrected signal
      • apt-cel-extract -- extract features
      • apt-dump-pgf -- extract probe/probeset information
      • apt-summary-vis -- generate visualization track files
      • apt-midas -- alternative splicing
      • Memory management
      • http://www.affymetrix.com/partners_programs/programs/developer/tools/powertools.affx#1_1

  12. Overview of Quality Control
      • Function: ensure the quality and reproducibility of array results
      • What to assess?
          • Probe level
              • Per array: signal distribution of different probe types
              • Across arrays: overall signal distribution, PM-mean, BG-mean
          • Probe set level (PSR, TC)
              • Per array: Pos_vs_Neg_AUC, presence call
              • Across arrays: correlation plot (median correlation to other arrays in the same batch)
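The across-array check named above (median correlation to other arrays in the same batch) can be sketched in a few lines of Python. This is only an illustration of the idea, not the GlueQC implementation: it assumes Pearson correlation on probe set signals, and the array names and values are made up.

```python
from math import sqrt
from statistics import mean, median


def pearson(x, y):
    """Pearson correlation of two equal-length lists of signals."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def median_correlation(arrays):
    """For each array, the median of its correlations to all other arrays.

    arrays: dict mapping array name -> list of probe set signals,
    with probe sets in the same order for every array.
    """
    result = {}
    for name, values in arrays.items():
        others = [pearson(values, v) for other, v in arrays.items() if other != name]
        result[name] = median(others)
    return result


# Made-up batch: three similar arrays and one that disagrees with them.
batch = {
    "a1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "a2": [1.1, 2.0, 3.2, 3.9, 5.1],
    "a3": [0.9, 2.1, 2.9, 4.2, 4.8],
    "bad": [5.0, 1.0, 4.0, 2.0, 3.0],
}
scores = median_correlation(batch)
suspect = min(scores, key=scores.get)  # array with the lowest median correlation
print(suspect)
```

An array whose median correlation sits well below the rest of its batch is a candidate outlier, to be confirmed against the other QC measures rather than excluded automatically.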

  13. Quality Control Tool: GlueQC.R
      • Requires R and APT
      • Syntax: Rscript GlueQC.R celpath outpath libpath
      • Libraries:
          • hGlue1_0.r3.clf
          • hGlue1_0.r3.pgf
          • hGlue1_0.r3.PSR.ps
          • hGlue1_0.r3.TC.mps
          • hGlue1_0.r3.KIL
          • hGlue1_0.r3.qc.clfpgf

  14. Density distribution plot
      • Overall intensity range
      • Separation between different probe types

  15. All array density plot
      • Check the similarity of intensity distribution across arrays

  16. QC summary plot
      • Check for outliers in each plot
      • Flags should be treated only as caution signs, especially when the sample size is small

  17. QC summary table

  18. Extract features
      • Function: extract a subset of probe signals from CEL files
      • Tool: apt-cel-extract
      • Syntax:
        apt-cel-extract -o out.txt [-c chip.clf -p chip.pgf] [-d chip.cdf] [--probeset-ids=norm-exon.txt] *.cel
      • Parameters:
          • If using probeset-ids, CLF and PGF have to be supplied

  19. Examples: lowlevelanalysis/extractfeatures.bat
      • Extract all raw probe signal
        >apt-cel-extract -o raw_probe_signal.txt --cel-files CELlist.txt
      • Extract quantile normalized and GC-background corrected probe signal
        >apt-cel-extract -c hGlue1_0.r3.clf -p hGlue1_0.r3.pgf -b hGlue1_0.r3.antigenomic.bgp -a quant-norm,pm-gcbg -o bgc_probe_signal.txt --cel-files CELlist.txt
      • Extract probe signal of a specific content: "main->junction"
        >apt-dump-pgf -c hGlue1_0.r3.clf -p hGlue1_0.r3.pgf --probeset-type main --probeset-type junction -o juc.pgf
        >apt-cel-extract -c hGlue1_0.r3.clf -p hGlue1_0.r3.pgf --probe-ids juc.pgf -o juc_raw_probe_signal.txt --cel-files CELlist.txt

  20. Background correction, normalization and summarization
      • Goal: transform probe signal into a biologically meaningful expression measure
      • Background correction -- remove non-target signal
      • Normalization -- remove non-biological variance
      • Summarization -- summarize probe signal into probe set signal
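As an illustration of the normalization step, the following Python sketch implements plain quantile normalization: every array is forced onto the same reference distribution, taken here as the mean of the sorted intensities across arrays. It is a simplified teaching version under stated assumptions (ties are broken by position, and no sketch subsampling is applied), not a reimplementation of APT's quant-norm.

```python
def quantile_normalize(arrays):
    """Quantile-normalize a list of equal-length intensity lists.

    Returns new lists in which each array has the same set of values:
    the mean of the k-th smallest intensity across all arrays.
    """
    n = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    # Reference distribution: average the k-th smallest value over arrays.
    reference = [
        sum(s[k] for s in sorted_arrays) / len(arrays) for k in range(n)
    ]
    normalized = []
    for a in arrays:
        # Probe indices ordered from lowest to highest intensity.
        order = sorted(range(n), key=lambda i: a[i])
        out = [0.0] * n
        for rank, i in enumerate(order):
            out[i] = reference[rank]
        normalized.append(out)
    return normalized


# Two made-up arrays with three probes each.
raw = [[5, 2, 3], [4, 1, 6]]
norm = quantile_normalize(raw)
print(norm)
```

After normalization the two arrays contain identical values, only arranged according to each array's own probe ranking, which is what removes array-wide intensity differences.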

  21. apt-probeset-summarize
      • Syntax
        apt-probeset-summarize -a rma-sketch [-a dabg] -c chip.clf -p chip.pgf -b chip.bgp -o outpath -m chip.mps [--kill-list chip.kil] *.CEL
      • Parameters
          • -a: analysis method
              • Chipstream format: a comma-separated list of transformations, with specific parameters passed as key-value pairs, e.g.
                rma-bg,quant-norm.sketch=-1.usepm=true.bioc=true,pm-only,med-polish
              • Predefined methods: rma-sketch, dabg, rma, plier, etc.
          • --kill-list: needed when the analysis involves gc-bg
          • Windows: use '--cel-files filename' instead of *.CEL
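Since the Windows command line does not expand *.CEL, the file passed to --cel-files has to be prepared beforehand. The Python sketch below writes such a list, assuming the usual APT layout of a header line "cel_files" followed by one CEL path per line; the helper names and the placeholder files are hypothetical.

```python
import os
import tempfile


def collect_cel_files(directory):
    """Return sorted paths of files ending in .cel (case-insensitive)."""
    return sorted(
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if name.lower().endswith(".cel")
    )


def write_cel_list(cel_paths, list_path):
    """Write a list file: header 'cel_files', then one path per line."""
    with open(list_path, "w") as out:
        out.write("cel_files\n")
        for path in cel_paths:
            out.write(path + "\n")


# Demonstration on a throwaway directory with empty placeholder files.
with tempfile.TemporaryDirectory() as workdir:
    for name in ("a1.CEL", "a2.cel", "notes.txt"):
        open(os.path.join(workdir, name), "w").close()
    list_path = os.path.join(workdir, "CELlist.txt")
    write_cel_list(collect_cel_files(workdir), list_path)
    with open(list_path) as handle:
        listed = handle.read().splitlines()

print(listed[0])
```

The resulting file plays the role of CELlist.txt in the example commands of this tutorial.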

  22. apt-probeset-summarize (2)
      • Background correction
          • gc-bg
          • rma-bg
          • mas5-bg
          • pm-gcbg
          • pm-mm
      • Normalization
          • quant-norm
          • med-norm
      • Summarization
          • plier/iter-plier
          • Median polish (RMA)
          • DABG
          • Median
          • No Li-Wong yet
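To show what the median polish summarization does, here is a small Python sketch of Tukey's median polish applied to one probe set: rows are probes, columns are arrays, and values are log2 intensities. It is a textbook version written for illustration (the iteration limit and tolerance are arbitrary choices), not APT's med-polish code; the per-array expression estimate is taken as the overall effect plus the column effect.

```python
from statistics import median


def median_polish(matrix, max_iter=10, tol=1e-6):
    """Fit value = overall + row effect + column effect + residual.

    matrix: list of rows (probes), each a list of log2 intensities
    (one per array). Returns (overall, row_eff, col_eff, residuals).
    """
    rows, cols = len(matrix), len(matrix[0])
    resid = [list(r) for r in matrix]
    overall = 0.0
    row_eff = [0.0] * rows
    col_eff = [0.0] * cols
    for _ in range(max_iter):
        change = 0.0
        # Sweep the median out of every row (probe effect).
        for i in range(rows):
            m = median(resid[i])
            row_eff[i] += m
            resid[i] = [v - m for v in resid[i]]
            change += abs(m)
        # Move the median of the column effects into the overall term.
        m = median(col_eff)
        overall += m
        col_eff = [c - m for c in col_eff]
        # Sweep the median out of every column (array effect).
        for j in range(cols):
            m = median(resid[i][j] for i in range(rows))
            col_eff[j] += m
            for i in range(rows):
                resid[i][j] -= m
            change += abs(m)
        # Move the median of the row effects into the overall term.
        m = median(row_eff)
        overall += m
        row_eff = [r - m for r in row_eff]
        if change < tol:
            break
    return overall, row_eff, col_eff, resid


# Made-up probe set: 3 probes x 3 arrays, purely additive effects.
probe_set = [[8, 10, 9], [9, 11, 10], [7, 9, 8]]
overall, row_eff, col_eff, resid = median_polish(probe_set)
expression = [overall + c for c in col_eff]  # one summary value per array
print(expression)
```

Because medians are used instead of means, a single misbehaving probe changes the per-array summary far less than it would in a simple average.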

  23. Examples: LowLevelAnalysis/bns.bat
      • PSR rma-sketch and dabg analysis
        apt-probeset-summarize -a rma-sketch -a dabg -c hGlue1_0.r3.clf -p hGlue1_0.r3.pgf -b hGlue1_0.r3.antigenomic.bgp --qc-probesets hGlue1_0.r3.qcc -s hGlue1_0.r3.PSR.ps -o BNS/PSR --cel-files CELlist.txt --kill-list hGlue1_0.r3.kil
      • TC (transcript cluster) meta probe set, rma-sketch or chipstream
        apt-probeset-summarize -a rma-sketch -a quant-norm.sketch=50000,pm-gcbg,iter-plier -c hGlue1_0.r3.clf -p hGlue1_0.r3.pgf -b hGlue1_0.r3.antigenomic.bgp --qc-probesets hGlue1_0.r3.qcc -m hGlue1_0.r3.TC.mps -o BNS/TC --cel-files CELlist.txt --kill-list hGlue1_0.r3.kil
      • Compute U133Plus2 probe sets
        apt-probeset-summarize -a rma-sketch -c hGlue1_0.r3.clf -p hGlue1_0.r3.pgf -b hGlue1_0.r3.antigenomic.bgp --qc-probesets hGlue1_0.r3.qcc -m hGlue1_0.r3.U133plus2.mps -o BNS/u133plus2 --cel-files CELlist.txt
      • Compute Human Exon ST1.0 transcript clusters
        apt-probeset-summarize -a rma-sketch -c hGlue1_0.r3.clf -p hGlue1_0.r3.pgf -b hGlue1_0.r3.antigenomic.bgp --qc-probesets hGlue1_0.r3.qcc -m hGlue1_0.r3.HuEX_TC.mps -o BNS/huex --cel-files CELlist.txt

  24. apt-probeset-summarize output
      • [method].summary.txt: expression index matrix
      • [method].report.txt: quality control measures
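For downstream work outside dChip, the expression index matrix can be loaded with a few lines of Python. The sketch below assumes that [method].summary.txt is tab-delimited, starts with "#" header lines, and has a column header row whose first field names the probe set ID column followed by one column per CEL file; the example content is made up, so verify the layout on a real output file.

```python
import csv
import io


def read_summary(handle):
    """Return (array_names, {probeset_id: [signal per array]}).

    Assumes tab-delimited text with leading '#' header lines and one
    column per array after the ID column (an assumption to verify).
    """
    rows = (line for line in handle if line.strip() and not line.startswith("#"))
    reader = csv.reader(rows, delimiter="\t")
    header = next(reader)
    array_names = header[1:]
    table = {row[0]: [float(v) for v in row[1:]] for row in reader}
    return array_names, table


# Made-up example standing in for rma-sketch.summary.txt.
example = io.StringIO(
    "#%affymetrix-algorithm-name=rma-sketch\n"
    "probeset_id\tsample1.CEL\tsample2.CEL\n"
    "PSR01\t7.25\t7.5\n"
    "PSR02\t3.0\t2.75\n"
)
array_names, table = read_summary(example)
print(array_names, table["PSR01"])
```

The same header-skipping step is what "remove extra header" amounts to when preparing these files for other tools.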

  25. Expression Console
      • Improvements over last tutorial
          • More summary options: EC, TC, JUC, EX, TX
          • Define probes into core, extended (multi probes)
          • Convert to U133plus2, HuEx format
      • Walk through an example
          • Summary
          • QC metrics
          • Link with annotation
      • Refer to doc/EC_Tutorial.doc (recycled from last tutorial)

  26. Practice session #1
      • CEL reduction (SNPremover)
      • GlueQC
          • GlueQC on data/07-20-08/CELlist_test.txt (15 arrays)
      • Low level analysis
          • Feature extraction
              • Extract raw probe intensity of 15 arrays
              • Extract quantile normalized and GC-background corrected probe intensity of "main->junction" from 15 arrays
          • B.N.S
              • rma-sketch summary of PSR for 15 arrays
              • rma-sketch summary of TC for 15 arrays (use mps file from lib/GenBase)

  27. High level analysis (Expression Index -> Gene List)
      • Array annotation and annotation files
      • Import APT results to dChip for high level analysis
      • A practice session

  28. Array annotation (r3)
      • Updates over r2 version
          • Corrected a bug caused by a MySQL end-of-line problem
          • Added annotation for Transcript, Junction and other contents
          • Added annotation files for dChip and GenBase
          • Added BED files and REFFLAT files for Genome Browser
      • Refer to lib/readme.doc for details
      • Customization: http://gluegrant1.stanford.edu/phpMyAdmin/

  29. hGlue1_0.r3.TC_annot.csv

  30. hGlue1_0.r3.PSR_annot.csv

  31. hGlue1_0.r3.Junction_annot.csv

  32. dChip
      • Improvements over last tutorial
          • Added Gene Ontology, KEGG pathway and chromosome band analysis
      • Walk through an example
          • Remove extra header and extra tail
          • Import external data into dChip
          • Differential expression analysis
          • Clustering/Enrichment
          • Chromosome/Genome enrichment

  33. Practice session #2
      • dChip

  34. Visualization - cisGenomeBrowser
      • Light version of UCSC Genome Browser (Hui Jiang)
          • CEL image
          • Genome region
      • http://biogibbs.stanford.edu/~jiangh/browser/index.html

  35. cisGenomeBrowser - CEL Image

  36. cisGenomeBrowser - Genomic Region

  37. cisGenomeBrowser
      • Annotation track
          • hGlue1_0.r3.TC.refflat
          • hGlue1_0.r3.TX.refflat
          • Hg18.genefile (refseq track only)
      • Signal track (visualization/genCisGenomeBrowserTrack.bat)
          • Probe raw signal barfile
            >genbar.pl --coord=hGlue1_0.r3.Probe.BED --signal=raw_probe_signal.txt --outdir=Probe_barfile
          • PSR barfile
            >genbar.pl --coord=hGlue1_0.r3.PSR.BED --signal=PSR/rma-sketch.summary.txt --outdir=PSR_barfile
          • Gene barfile
            >genbar.pl --coord=hGlue1_0.r3.TC.BED --signal=TC/rma-sketch.summary.txt --outdir=TC_barfile
      • Demo

  38. Other Browsers
      • UCSC Genome Browser (visualization/genUCSCBrowsreTrack.bat)
          • apt-summary-vis -g hGlue1_0.r3.PSR.BED PSR/rma-sketch.summary.txt --wiggle-col-index 1 -o CEL1.PSR.wig
          • The BED file needs to be tweaked so that PSRs do not overlap before the track will work in the UCSC browser
      • Affymetrix Genome Browser
          • apt-summary-vis -g hGlue1_0.r3.PSR.BED PSR/rma-sketch.summary.txt -o PSR.egr
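The tutorial does not say how the BED file was made non-overlapping. One possible approach, sketched here in Python under that caveat, is to sort the intervals per chromosome and keep an interval only if it starts at or after the end of the last one kept (BED coordinates are half-open, so touching intervals do not overlap). The function name and the example records are hypothetical, and dropping intervals is only one option; trimming them would be another.

```python
def drop_overlaps(intervals):
    """Return a non-overlapping subset of BED-like intervals.

    intervals: list of (chrom, start, end, name) with half-open
    coordinates. Within each chromosome, an interval is kept only if
    it does not overlap the previously kept one.
    """
    kept = []
    last_end = {}  # chromosome -> end of the last interval kept
    for chrom, start, end, name in sorted(intervals, key=lambda r: (r[0], r[1], r[2])):
        if start >= last_end.get(chrom, 0):
            kept.append((chrom, start, end, name))
            last_end[chrom] = end
    return kept


# Made-up PSR records: B overlaps A, C starts exactly where A ends.
records = [
    ("chr1", 150, 250, "B"),
    ("chr1", 100, 200, "A"),
    ("chr2", 50, 80, "D"),
    ("chr1", 200, 300, "C"),
]
clean = drop_overlaps(records)
print([r[3] for r in clean])
```

Any PSR dropped this way will be missing from the wiggle track, so it is worth recording which IDs were removed.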

  39. Glue Grant Exon Array tool
      • Highlights
          • Specially tailored for exon arrays
          • Command line with R interface
          • Probe sequence specific background model (MAT)
          • Summarization: probe-selection (GenBase), Li-Wong model (dChip) and median-polish (RMA)
          • Integrated alternative splicing analysis (MADS)
      • Run analysis (GlueGrantExonArrayTool/runEAT.bat)
          • ../../bin/GlueGrantExonArrayTool/eat.win32.exe EXPR_param.conf -l ../../data/07-20-08/CELlist.txt
          • ../../bin/GlueGrantExonArrayTool/eat.win32.exe MADS_param.conf -l ../../data/07-20-08/CELlist.txt

  40. Param.conf
      • Specify analysis parameters
          • Analysis type
          • Library files
          • Background correction method
          • Normalization method
          • Summarization method
          • MADS parameters
      • Example: /GlueGrantExonArrayTool/Expr_param.conf

  41. Practice session #3
      • cisGenomeBrowser
          • Generate bar files for PSR and TC of 15 arrays in CELlist_test.txt from practice session #1
          • Search for genes of interest
      • Glue Grant Analysis Tool
          • Repeat steps in runEAT.bat

  42. Thank you
