U54 Transcriptomics Analysis by GIS-PET (Data Issues Discussion)

This article discusses data issues in GIS-PET transcriptomics analysis, including library construction and quality control. It also covers PET extraction and mapping for genome analysis.

barrychase

Presentation Transcript


  1. U54 Transcriptomics Analysis by GIS-PET (Data Issues Discussion)
     Atif Shahab, Xiaoan Ruan, and Yijun Ruan (Genome Institute of Singapore)
     March 13-14, 2008, Barcelona, Spain

  2. GIS-PET (Gene-Identification-Signature Paired-End-diTag): library construction
     flcDNA library construction steps, starting from PolyA+ mRNA:
     • RT to make 1st-strand flcDNA (-), +5me-dCTP
     • Biotinylation at end
     • Hydrolytic digest of bound RNA
     • Release 1st-strand flcDNA (-) (flsscDNA)
     • Synthesize 2nd-strand flcDNA with N/B/M adaptor sequence (NNNN)
     • Digest with GsuI to remove poly(A) tail (fldscDNA)
     • Add 3’ adaptor M/B
     • Digest with NotI
     • Clone flcDNA into pGIS vector
     Result: pGIS-flcDNA library (>10E+7 CFU)
     [Diagram labels: Biotin, GsuI, N/B/M, M/B/S, S, AA, TT, (A)16-(T)16]

  3. Prostate long PolyA+ flcDNA library QC
     [Gel image; size markers at 4 kb and 1 kb]
     • flcDNA library colony-PCR QC:
       • 22 colonies showed PCR products of variable size
       • Inserts averaged ~2 kb
     • DNA sequencing QC:
       • Randomly picked colonies (96-well) were sequenced by capillary sequencing
       • Sequences of the 5’ and 3’ ends were obtained and mapped against the UCSC human browser
       • Of 67 clones with good-quality sequence, 62 aligned to the boundaries of the matched gene models in the browser (hg18, March 2006)
       • A 92% full-length cDNA ratio was obtained for the colonies examined

  4. Library construction, cont.: single-PET and diPET libraries
     Starting from the pGIS-flcDNA library (>10E+7 CFU) of fldscDNA clones:
     • Digest with MmeI to create PET (20 bp at each end)
     • Re-ligation and transformation, giving the pGIS single-PET library (>10E+8 CFU)
     • Restriction cut (S, B) to release PET (18 bp and 16 bp tags)
     • diPETting to create diPET
     Spacer sequence: GTCGGATCCGAC
     [Diagram labels: AA, TT, N/B/M, M/B/S, B, M, 5’, 3’]

  5. diPET sequencing structure
     • diPET: two PETs, each with an 18 bp and a 16 bp tag, joined by the spacer sequence GTCGGATCCGAC
     • GSFLX-454: diPET flanked by 454-adaptor-A and 454-adaptor-B
     • Solexa: diPET flanked by Solexa-A and Solexa-B
     [Diagram labels: Gene A, Gene B, 5’, 3’]

  6. Library Summary
     • No. of diPETs: 619060
     • No. of PETs: 955581
     • No. of unique PETs: 676570
     • No. of non-mappable: 314600
     • No. of mappable: 361970
     • No. of unique mappings: 288230
     • No. of mappings (2-10): 62747
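The counts on the Library Summary slide can be cross-checked with a few lines of arithmetic. The sketch below only restates the slide's numbers; reading the remainder as PETs that map to more than 10 locations is an assumption, since the slide does not report that class.

```python
# Counts copied from the Library Summary slide.
unique_pets = 676_570
non_mappable = 314_600
mappable = 361_970
unique_mappings = 288_230
mappings_2_to_10 = 62_747

# Mappable and non-mappable PETs should partition the unique PETs.
partition_ok = (mappable + non_mappable == unique_pets)

# Mappable PETs not covered by the two reported classes.
# Assumption: these map to more than 10 locations (not stated on the slide).
unreported = mappable - unique_mappings - mappings_2_to_10


def pct(part, whole):
    """Percentage of `whole` made up by `part`, rounded to one decimal."""
    return round(100.0 * part / whole, 1)


print("partition consistent:", partition_ok)
print("mappable share of unique PETs (%):", pct(mappable, unique_pets))
print("uniquely mapped share of mappable PETs (%):", pct(unique_mappings, mappable))
print("mappable PETs in neither reported class:", unreported)
```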

  7. Sequence Data
     • diPET structure: (16bp3’)(18bp5’)[12bp linker](18bp5’)(16bp3’)
     • FASTA output:
     >DHP001_454S_FullAnalysisOFF_000061_1518_1923 length=80 uaccno=E6NCSJP01DZLED
     CACTATGTACAAAACGGTCCGCGCGGCGCAGTCGTCGGATCCGACGGGGAGCGGGCGGCGGCGTAGCACAGCTGGCTGAG
     >DHP001_454S_FullAnalysisOFF_000093_1619_2463 length=80 uaccno=E6NCSJP01D8G0X
     GGTTTGCTAATTGCTGACTCCAGAGTTGTATCCGTCGGATCCGACGGCAGGTTCTCTTACATCATTCCCTGTCTTAAACG
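Given the diPET layout above, one plausible way to recover the two PETs from a read is to split on the 12 bp linker. This is a minimal sketch, not the authors' pipeline: the function name `split_dipet` is hypothetical, and it assumes an exact, single linker match rather than tolerating sequencing errors. It does not assume fixed tag lengths.

```python
LINKER = "GTCGGATCCGAC"  # 12 bp spacer sequence shown on the slides


def split_dipet(read, linker=LINKER):
    """Split a diPET read into (left_pet, right_pet).

    Returns None when the linker is missing or occurs more than once,
    so ambiguous reads can be set aside instead of being guessed at.
    """
    if read.count(linker) != 1:
        return None
    left, right = read.split(linker)
    return left, right


# First read from the FASTA output on the slide.
read = ("CACTATGTACAAAACGGTCCGCGCGGCGCAGTCGTCGGATCCGAC"
        "GGGGAGCGGGCGGCGGCGTAGCACAGCTGGCTGAG")
pets = split_dipet(read)
if pets is not None:
    left_pet, right_pet = pets
    print(len(left_pet), left_pet)
    print(len(right_pet), right_pet)
```

A production extractor would likely also need to handle mismatches in the linker and to check each PET length against the expected 16 bp + 18 bp layout.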

  8. PET extraction and mapping
     • PET structure: (16bp3’)(18bp5’)
     • Remove redundancy
     • Extract 5’ and 3’ ends
     • Map the ends to the genome
     Example records (mapping lines follow the sequence they belong to):
     >DHP001fr-U_12893_COUNT:2
     AGAAACCAAGTTCCCCGGCATGCAGAGATCAGAG
     >DHP001fr-U_12894_COUNT:1
     AGAAACCAAGTTCCCCGGCATGCAGAGATCAGG
     1,19(19) 20,33(14) - chr14:60185955-60182507 3449
     >DHP001fr-U_12880_COUNT:2
     AGAAACAGCCAAAGGGGAACAACAGCTGAAGCAC
     1,18(18) 18,35(18) - chr15:70286174-70278424 7751
     1,16(16) 18,31(14) + chr6:5918738-5919858 1121
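The mapping lines in the example records can be parsed with a short regular expression. The field meanings used here are inferred from the three example lines only and are assumptions: two tag alignments written as `start,end(length)` within the PET, a strand, a genomic interval, and a final number that equals the interval length. The names `PetMapping` and `parse_mapping` are hypothetical.

```python
import re
from typing import NamedTuple, Tuple


class PetMapping(NamedTuple):
    tag1: Tuple[int, int, int]  # (start, end, length) within the PET (assumed meaning)
    tag2: Tuple[int, int, int]
    strand: str
    chrom: str
    pos_a: int                  # first genomic coordinate as written
    pos_b: int                  # second genomic coordinate as written
    span: int                   # last column of the line


_MAPPING_RE = re.compile(
    r"(\d+),(\d+)\((\d+)\)\s+"      # first tag:  start,end(length)
    r"(\d+),(\d+)\((\d+)\)\s+"      # second tag: start,end(length)
    r"([+-])\s+"                    # strand
    r"(\w+):(\d+)-(\d+)\s+"         # chrom:posA-posB
    r"(\d+)"                        # span
)


def parse_mapping(line):
    """Parse one mapping line; raise ValueError if it does not fit the pattern."""
    m = _MAPPING_RE.fullmatch(line.strip())
    if m is None:
        raise ValueError(f"unrecognised mapping line: {line!r}")
    a1, a2, a3, b1, b2, b3 = (int(m.group(i)) for i in range(1, 7))
    return PetMapping(
        tag1=(a1, a2, a3),
        tag2=(b1, b2, b3),
        strand=m.group(7),
        chrom=m.group(8),
        pos_a=int(m.group(9)),
        pos_b=int(m.group(10)),
        span=int(m.group(11)),
    )


hit = parse_mapping("1,19(19) 20,33(14) - chr14:60185955-60182507 3449")
# In the slide's examples the span equals the inclusive interval length.
span_consistent = abs(hit.pos_a - hit.pos_b) + 1 == hit.span
print(hit.chrom, hit.strand, hit.span, span_consistent)
```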

  9. Data Publishing
     • T2G for GIS
     • UCSC-based system
     • ENCODE/UCSC
     • BED file format
     • ENCODE U54?
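Since BED is named as the publishing format, here is a minimal sketch of turning one PET mapping into a six-column BED line. It assumes the mapping coordinates are 1-based and inclusive, which the slides do not state; BED itself uses a 0-based start and an exclusive end. The helper name `to_bed6` and the choice of score are illustrative only.

```python
def to_bed6(chrom, pos_a, pos_b, strand, name, score=0):
    """Format a mapping as a tab-separated BED6 line.

    Assumption: pos_a and pos_b are 1-based inclusive coordinates given in
    either order (minus-strand mappings on the slides list the larger
    coordinate first).
    """
    start = min(pos_a, pos_b) - 1   # BED start is 0-based
    end = max(pos_a, pos_b)         # BED end is exclusive
    return "\t".join(str(f) for f in (chrom, start, end, name, score, strand))


# Mapping of PET DHP001fr-U_12894 from the example records.
bed_line = to_bed6("chr14", 60185955, 60182507, "-", "DHP001fr-U_12894")
print(bed_line)
```

Under these assumptions the BED interval length (`end - start`) matches the span reported on the mapping line.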

  10. Cell-Free Approach for GIS-PET Library Construction, cont.
      • fldscDNA circularization
      • EcoP15I cut, giving PET
      • Add sequence adaptor
      • High-throughput sequencing
      [Diagram labels: E, dscDNA]

  11. Cell-Free Approach for GIS-PET Library Construction
      Starting from PolyA+ mRNA:
      • RT to make 1st-strand flcDNA (-), +5me-dCTP
      • Biotinylation at end
      • Hydrolytic digest of bound RNA
      • Release 1st-strand flcDNA (-)
      • Synthesize 2nd-strand flcDNA with adaptor sequence (NNNN)
      • Digest with GsuI to remove poly(A) tail (fldscDNA)
      [Diagram labels: Biotin, GsuI, E, (A)-(T)16]