
Trans-splicing in Trypanosoma brucei: results from genome-wide experiments

Trans-splicing in Trypanosoma brucei: results from genome-wide experiments. Shai Carmi, Bar-Ilan University, Department of Physics and the Faculty of Life Sciences. February 2010. mRNA processing in T. brucei. Almost all genes have no promoters.


Presentation Transcript


  1. Trans-splicing in Trypanosoma brucei: results from genome-wide experiments. Shai Carmi, Bar-Ilan University, Department of Physics and the Faculty of Life Sciences. February 2010.

  2. mRNA processing in T. brucei • Almost all genes have no promoters. • Gene expression is regulated by controlling splicing (?), mRNA stability, and translation. [Figure: a polycistronic transcript (Gene1 to Gene4) is processed by SL trans-splicing and polyadenylation into mature transcripts with AAAA tails, which are then translated. Itai Dov Tkacz.]

  3. Splicing overview. SL: Spliced Leader RNA. See also: Liang et al., Euk. Cell (2003).

  4. cis-splicing machinery and consensus. [Figure labels: mammalian snRNPs; 10-12 nts; 3’ splice-site.] Yeast conserved branch site: TACTAAC.

  5. Splicing regulation. SR proteins create ’bridges’ to stabilize the spliceosome. • In trypanosomes: • U2AF65 and U2AF35 exist and do not interact. • U2AF65 interacts with SF1. • Interacting SR proteins were identified. • hnRNP proteins exist. [Figure labels: hnRNP, splicing enhancer, splicing silencer.]

  6. Open questions • 3’ splice site recognition and selection. • Spatial organization of splicing factors: protein-protein and protein-RNA interactions. • Splicing efficiency and gene expression regulation. • Not in this talk: the detailed molecular mechanism of trans-splicing and spliceosome assembly, the structure of the 5’ splice site, SL-RNA biogenesis, and coupling to polyadenylation.

  7. Past studies of splicing regulation • Clayton et al., Mol. Biochem. Parasit. (2005): calculated the statistical properties of the splice sites based on a few hundred ESTs. • Clayton et al., Mol. Cell. Biol. (1994); Ullu et al., Mol. Cell. Biol. (1998); Cross et al., Mol. Cell. Biol. (2005): used reporter gene systems with the splice sites of model genes (tubulin, actin, procyclin) to study the effect of splice site composition on splicing efficiency. • Limited applicability. [Figure: reporter construct with promoter, intron, 3’ splice-site (AG), 5’UTR, and reporter gene; the splice-site region is taken from an endogenous gene and mutated.]

  8. Major known facts • Polyadenylation is coupled to downstream trans-splicing. • A hierarchy of trans-splicing and polyA signals exists. • Specific sequences in the 5’UTR (exon) are required for splicing. • The optimal PPT should be 25 nts long, U dominated but interspersed with Cs, and have no two consecutive purines. • The optimal PPT-AG spacer should be 20-25 nts long, have U at position -3, and never AC at [-3,-4]. [Figure: reporter gene, 3’UTR, polyA-site, intergenic region, 3’ splice-site, 5’UTR, reporter gene.]
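
The PPT and spacer rules on this slide can be written as a small checker. This is only a sketch under stated assumptions: the function names are illustrative, "U dominated" is read as more than half U, the AG is taken to occupy positions -2 and -1 (so -3 is the last spacer nucleotide), and "AC at [-3,-4]" is read as 5'-AC-3' immediately upstream of the AG. None of these readings are confirmed by the talk.

```python
PURINES = frozenset("AG")


def _rna(seq):
    """Normalize to upper-case RNA so DNA input (T) is accepted as well."""
    return seq.upper().replace("T", "U")


def ppt_violations(ppt, optimal_length=25):
    """Return the slide's PPT rules that the given tract breaks (empty list = optimal)."""
    ppt = _rna(ppt)
    problems = []
    if len(ppt) != optimal_length:
        problems.append("length")
    # Assumption: "U dominated" means strictly more than half of the tract is U.
    if 2 * ppt.count("U") <= len(ppt):
        problems.append("not U dominated")
    if "C" not in ppt:
        problems.append("no interspersed C")
    if any(a in PURINES and b in PURINES for a, b in zip(ppt, ppt[1:])):
        problems.append("consecutive purines")
    return problems


def spacer_violations(spacer):
    """Return the slide's PPT-AG spacer rules that the given spacer breaks.

    Assumption: the spacer ends right before the AG, so its last nucleotide
    is position -3 and the one before it is position -4.
    """
    spacer = _rna(spacer)
    problems = []
    if not 20 <= len(spacer) <= 25:
        problems.append("length")
    if spacer[-1:] != "U":
        problems.append("no U at -3")
    if spacer[-2:] == "AC":
        problems.append("AC at [-4,-3]")
    return problems
```

Returning a list of broken rules, rather than a single yes/no, makes it easy to tabulate which rule fails most often across a set of splice sites.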

  9. Research strategy: outline • Sequence all messenger RNAs to map transcript boundaries. • Silence splicing factors and measure the effect on each transcript. • Examine the splice site regions of regulated genes to infer possible roles for splicing factors and mechanisms of splicing regulation.

  10. Methods: deep sequencing (Illumina guide).

  11. Deep sequencing of T. brucei mRNA • Experiment performed at Ullu and Tschudi’s lab, Yale University. • Library preparation, in order: total RNA; Terminator exonuclease treatment or poly(A)+ RNA selection; first-strand cDNA synthesis with random hexamer primers, or with random hexamer or oligo(dT) primers; second-strand cDNA synthesis with an SL primer or with RNaseH-derived RNA primers; cDNA fragmentation and size selection; addition of adapters and amplification; Illumina sequencing. • 15 million useful reads.

  12. Ullu’s lab results • 532 transcripts with a misannotated start codon. • 805 annotated genes not producing a transcript. • 442 genes with an alternative transcript in their UTRs. • 1,114 new transcripts, conserved coding and non-coding. • Trans-splicing and polyadenylation of snoRNA clusters. • The experimental method can be slightly modified to discover pol-II transcription initiation sites. These sites were found at strand-switch regions, in proximity to tRNA genes, and within transcription units. • Digital gene expression.

  13. Examples of reannotated features • Legend: blue, number of reads from the SL-enriched library; red, number of reads from the polyA-enriched library. Blue lines at the bottom denote SL reads; red lines at the bottom denote polyA reads. • A correctly annotated gene cluster (Chr VIII). • A novel transcript (Chr X). • A misannotated start codon (Chr VII). • An ORF which is part of a larger transcript (Chr XI). • A short transcript at the 3’ end of a gene (Chr VII). • Examples were experimentally verified for all cases.

  14. Statistics of UTR lengths. 5’ median: 388. 3’ median: 91. The UTR length distribution is approximately log-normal.
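
If UTR lengths are approximately log-normal, their logarithms are approximately normal, so the two parameters can be estimated from the mean and standard deviation of the log-lengths, and the median of the fitted distribution is exp(mu). A minimal sketch of that estimate follows; the function names are illustrative and the talk does not say how the fit was actually done.

```python
import math
import statistics


def fit_lognormal(lengths):
    """Estimate (mu, sigma) of a log-normal from positive lengths via log-moments."""
    logs = [math.log(n) for n in lengths if n > 0]
    if len(logs) < 2:
        raise ValueError("need at least two positive lengths")
    mu = statistics.fmean(logs)      # mean of log-lengths
    sigma = statistics.stdev(logs)   # sample standard deviation of log-lengths
    return mu, sigma


def lognormal_median(mu):
    """Median of a log-normal distribution with location parameter mu."""
    return math.exp(mu)
```

Comparing `lognormal_median(mu)` with the empirical median (`statistics.median(lengths)`) is a quick sanity check on how well the log-normal description holds.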

  15. Splice-site composition • Non-AG splice-sites are due to sequencing errors and strain differences. • No G is allowed at the -3 position. • PPT: maximum at about -25; the distance from the AG varies, which is unique to trypanosomes. • No signal is observed in the exon.

  16. Splice-site composition: pyrimidine content • Sites closer to the PPT are stronger. • The PPT is distributed along tens of nucleotides. • Purines are favored in the exon. [Figure labels: PPT, AG, exon.]
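
A pyrimidine-content profile like the one described here can be computed column by column over aligned upstream windows. The sketch below assumes each window is an equal-length sequence ending immediately before the AG; the function name and input layout are illustrative, not taken from the talk.

```python
def pyrimidine_profile(upstream_windows):
    """Fraction of pyrimidines (C, T or U) at each aligned position.

    upstream_windows: equal-length sequences, each ending right before the
    AG of a splice site, so the last column is position -3.
    """
    if not upstream_windows:
        raise ValueError("no sequences given")
    width = len(upstream_windows[0])
    if any(len(w) != width for w in upstream_windows):
        raise ValueError("all windows must have the same length")
    profile = []
    for col in range(width):
        pyrimidines = sum(w[col].upper() in "CTU" for w in upstream_windows)
        profile.append(pyrimidines / len(upstream_windows))
    return profile
```

With windows of, say, 100 nts, the position of the profile's maximum can be compared against the "about -25" reported on the previous slide.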

  17. Splice-site composition. AC is not preferred at positions [-3,-4] of the 3’ splice-site: splice-sites with AC are less abundant.

  18. Splicing heterogeneity. Uncertainty of splice-site usage. • Not alternative splicing in the regular sense: it leads to the same protein. • 6967 genes: one major site. • 978 genes: two major sites. • 21 genes: three major sites. [Figure: average distance (nts, log-scale) of all weak splice sites from the strongest splice site, shown against uncertainty.]
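
The slide does not define "uncertainty". One common choice, assumed here purely for illustration, is the Shannon entropy of the read counts over a gene's splice sites: it is 0 bits when a single site takes all reads and grows as usage spreads over more sites.

```python
import math


def usage_uncertainty(read_counts):
    """Shannon entropy (in bits) of splice-site usage for one gene.

    read_counts: number of SL reads supporting each splice site of the gene.
    Assumption: entropy stands in for the talk's undefined 'uncertainty'.
    """
    total = sum(read_counts)
    if total <= 0:
        raise ValueError("gene has no reads")
    probabilities = [c / total for c in read_counts if c > 0]
    return sum(p * math.log2(1 / p) for p in probabilities)
```

For example, a gene with reads split evenly between two sites has an uncertainty of 1 bit, while a gene using one site exclusively has 0 bits.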

  19. Splicing heterogeneity illustrated • Each row corresponds to one gene. • Each site is denoted with a bar. • Sites are centered around the strongest site. • Bar color is according to relative usage. • Downstream sites are more popular. • Some sites are found in frame. [Figure: relative usage of trans-splice sites versus nt position relative to the START codon (ATG).]

  20. Predicting splicing heterogeneity • What determines if a gene will be differentially spliced? • Look at 100 nts upstream and downstream of the strongest site. • Rank all potential splice sites: TAG-3, AAG, CAG-2, GAG-1. • Heterogeneity rank of a gene = (sum of ranks of all other AG dinucleotides) / (rank of the strongest site). • The average heterogeneity rank is about 10 for high-uncertainty genes, but only about 7 for low-uncertainty genes (P = 10^-20). • Signatures do not look meaningful, but analyses show that longer 5’UTRs, shorter PPTs, and longer PPT-AG distances also contribute significantly to heterogeneity.
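
The heterogeneity rank defined on this slide can be sketched as follows. The rank table is an assumption: the slide lists "TAG-3, AAG, CAG-2, GAG-1", which is read here as TAG = 3, AAG = CAG = 2, GAG = 1, and it is exposed as a parameter so another reading can be substituted. The function and parameter names are illustrative.

```python
# Assumed reading of the slide's ranking, keyed by the nucleotide at -3.
DEFAULT_RANKS = {"T": 3, "A": 2, "C": 2, "G": 1}


def heterogeneity_rank(seq, strongest, window=100, ranks=DEFAULT_RANKS):
    """Sum of ranks of all other AGs in the window, divided by the strongest site's rank.

    seq: genomic DNA (A, C, G, T only); strongest: 0-based index of the A in
    the strongest site's AG; window: nts searched on each side of that site.
    """
    seq = seq.upper()
    if strongest < 1 or seq[strongest:strongest + 2] != "AG":
        raise ValueError("strongest must point at an AG preceded by a nucleotide")

    def rank_at(i):
        # Rank is determined by the nucleotide immediately upstream of the AG.
        return ranks[seq[i - 1]]

    first = max(1, strongest - window)
    last = min(len(seq) - 2, strongest + window)
    others = sum(
        rank_at(i)
        for i in range(first, last + 1)
        if i != strongest and seq[i:i + 2] == "AG"
    )
    return others / rank_at(strongest)
```

A gene whose strongest site is a TAG with few competing AGs nearby gets a low value; many nearby AGs, or a weak strongest site, push the value up.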

  21. What is heterogeneity good for? • Unclear at the moment. Such heterogeneity is not found in other organisms. • In cis-splicing, exon boundaries must be conserved to maintain an intact coding sequence. In trans-splicing, such evolutionary pressure does not exist. • However, trans-splicing heterogeneity was not observed in C. elegans. • It may reflect another level of complexity in gene expression regulation, as the degree of heterogeneity varies significantly throughout the genome.

  22. Explaining abundance • Transcripts with A-rich exons are more abundant. • Splice-site ambiguity is anti-correlated with abundance. • Other correlations: genes with a longer PPT and a shorter 5’UTR are more abundant.

  23. A possible model for splicing factor organization? • U2AF65 does not bind U2AF35, so the AG can be far from the PPT. • A variable distance between the AG and the PPT allows regulation of splicing efficiency by differential binding. [Figure: model of the intergenic region with branch point (BP), PPT, AG, and 5’UTR, bound by TSR1, TSR1IP, U2AF35, U2AF65, SF1, and PTB1, with a competitor splice-site (AG); labels: 0-80, 10-30, AC-rich, 25, optimal: 25.]

  24. Silencing methods: RNAi. Inducible by tetracycline. The gene is silenced after 3 days. Constructs: stem-loop construct and T7-opposing construct. Wang et al., JBC (2000).

  25. Silencing methods: microarrays • Microarrays are chips on which thousands of DNA oligos are printed in an array. Each oligo represents a fragment of one gene. • Expression profiles of entire genomes are obtained in a single experiment. (Wikipedia)

  26. Genome-wide observations (red: up, green: down) • Hundreds of genes are upregulated, an unprecedented phenomenon. • U2AF65 and SF1 physically interact and thus have similar patterns. • Vazquez et al., Mol. Biochem. Parasitol. 164, 137 (2009).

  27. Genome-wide correlations • Potential protein-protein interactions should be biochemically verified. • Interactions may be indirect.
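
The talk does not state which correlation measure was used. As an illustration only, the sketch below assumes Pearson correlation between the per-gene log fold-change profiles of two silencing experiments; the factor names used as dictionary keys are placeholders supplied by the caller.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(profiles):
    """Pairwise correlations between silencing experiments.

    profiles: dict mapping a silenced factor's name to its list of per-gene
    log fold changes, with genes in the same order in every list.
    """
    names = sorted(profiles)
    return {(a, b): pearson(profiles[a], profiles[b]) for a in names for b in names}
```

A high correlation between two factors' profiles is only a hint of interaction, which is why the slide asks for biochemical verification.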

  28. Processes affected by splicing defects • Upregulated: mostly ribosomal and translation-related proteins, peptidases, and chaperones. 10 candidates were verified experimentally by RT-PCR. • Downregulated: mostly metabolic enzymes and transporters.

  29. Downregulated genes • The sequence at the splice site of the genes most impacted by silencing may indicate the role of the splicing factor. • Look at PPT length and distance to the 3’ splice-site. • Most results are negative (reasons discussed later). • Genes with a shorter PPT require SF1. • Genes with a longer PPT-AG distance require PTB1. • Reported P-values: 0.001, 0.004.
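
The talk does not name the statistical test behind these P-values. One simple way to ask whether downregulated genes have shorter PPTs than the rest, shown here only as an assumed stand-in, is a one-sided permutation test on the difference of mean PPT lengths. The function name, the default number of permutations, and the seed are illustrative.

```python
import random
import statistics


def permutation_p_value(down, rest, n_perm=10000, seed=0):
    """One-sided permutation test: are values in `down` smaller than in `rest`?

    down: PPT lengths of genes downregulated after silencing a factor.
    rest: PPT lengths of all other genes.
    Returns the fraction of label shufflings whose mean difference
    (rest minus down) is at least as large as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    observed = statistics.fmean(rest) - statistics.fmean(down)
    pooled = list(down) + list(rest)
    k = len(down)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        stat = statistics.fmean(pooled[k:]) - statistics.fmean(pooled[:k])
        if stat >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)
```

The same function applies to PPT-AG distances by swapping the two arguments, since the hypothesis there is that downregulated genes have longer distances.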

  30. Sequence motifs • Using the DRIM tool from Yael Mandel-Gutfreund’s lab. • It is hard to assess the significance of the motifs. • Surprisingly, no pyrimidine-rich motifs were identified. • Other tools are not suited for RNA motifs or are intended for the human genome, and thus perform poorly. • Should check which elements are conserved. [Figure label: hnRNPF/H binding sites.]

  31. Mechanisms of regulation • RNA-level regulation can be mediated via two mechanisms: • 1. mRNA stability. • The 3’UTR carries a specific sequence that causes stabilization or destabilization under given experimental conditions (silencing). • Demonstrated experimentally for a few upregulated genes. • Binding can be directly to the silenced splicing factor (U2AF65, SF1, …). Splicing factors have been shown to bind mature mRNA in human cells (Carmo-Fonseca et al., 2006). • Alternatively, binding can be to some other factor which is affected by the silencing (secondary effect). • Binding can induce both up- and down-regulation of different genes, depending on the context (e.g., competing with stabilizing/destabilizing proteins). • Regulation might not be due to binding but to secondary structure. • 2. Splicing defects. • The absence of a splicing factor might cause downregulation of genes for which it is required for splicing. • Such genes may have certain properties such as a weak splice site, long PPT-AG distance, short PPT, competition with other AGs, etc.

  32. Discussion (problems) • Computational approaches are limited by low reproducibility of the microarrays, noisy fold changes, and the very small number of genes affected by more than one factor. • Genes with splicing defects are masked by many more genes which are regulated by mRNA stability. It is unclear at the moment if there is a significant number of genes regulated by splicing. • mRNA stability can be mediated by more than one factor (primary and secondary effects). • Thus, a clean set of genes which undergo the same regulation is hard to obtain.

  33. Discussion (future plans) • Computational: • Deep-sequencing of Leishmania at Ullu’s lab may provide information about conserved regulatory elements. • Secondary structure of the 3’UTR will be explored. • Experimental: • Reporter gene system with the intergenic region of a model gene. • CLIP-seq (in vivo cross-linking and immunoprecipitation followed by deep-sequencing) should yield RNA binding sites. • Examine splicing defects (accumulation of SL-RNA or Y-structure) of individual genes or genome-wide (co-silencing of the exosome).

  34. Thank you for your attention!
