Alternative Splicing and Disease: an overview

Presentation Transcript

  1. Alternative Splicing and Disease: an overview. Shoba Ranganathan. Professor and Chair, Bioinformatics, Dept. of Chemistry and Biomolecular Sciences & ARC CoE in Bioinformatics, Macquarie University, Sydney, Australia. Adjunct Professor, Dept. of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore. Visiting scientist @ Institute for Infocomm Research (I2R), Singapore

  2. Outline of the talk • Background • Determining gene architecture • Graph theory in AS • Whole genome analysis results • AS and disease

  3. Unexpectedly low number of genes in the human genome • How can the genome of Drosophila contain fewer genes than the undoubtedly simpler organism C. elegans? • This raises the possibility of expanded diversity leading to biological complexity • C. elegans: 19,000 genes • Drosophila: 14,000 genes • Human: ~22,000-25,000 genes

  4. Sources of Biological complexity With a limited number of genes: Enhanced regulation of genes and pathways Post-translational modifications Alternative splicing

  5. A Genomic View

  6. Spliceosomal splicing

  7. Maniatis & Tasic, 2002

  8. [Figure: alternative mRNA sequences (variants a and b) giving rise to protein diversity]

  9. Alternative splicing • Splicing is a regulated process that removes the non-coding sequence from transcripts to produce mRNA (Bernot, 2004). • Alternative splicing contradicts the classical "one gene, one protein" view of molecular biology.

  10. Why AS? • Protein diversity (Neverov et al., 2005). • Form of spatial and temporal regulation (Lopez, 1995) • Errors in splicing lead to diseases (Orengo & Cooper, 2007) • Drug discovery (Levanon & Sorek, 2003)

  11. Usual way of studying AS • One gene at a time – tedious for genomes • Collect intron-exon structures for all isoforms • Try to analyze them … again one isoform at a time and then gene by gene. • Unsuitable for genes with large numbers of transcripts.

  12. Usual way of studying AS

  13. Why use bioinformatics? • Most research into alternative splicing is limited to a few genes (reductionist approach) • Bioinformatics overcomes this by facilitating a systems biology approach: • Information can be obtained for all genes in a genome • This can be done for many genomes allowing for comparative genomics

  14. Where is the splicing? • Information on the intron-exon (non-coding/coding) arrangement of a gene is essential. • Aligning mRNA/EST sequences to their corresponding genomic sequences gives the arrangement of exons in a gene. (MGAlign, Ranganathan et al., 2003; MGAlignIt, Lee et al., 2003)
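To illustrate how an mRNA-to-genome alignment yields gene architecture, here is a minimal sketch. It assumes the aligner has already reported each matching block as a pair of 1-based, inclusive genomic coordinates. The function name `gene_structure` and the coordinates are hypothetical and are not taken from MGAlign or MGAlignIt.

```python
from typing import List, Tuple

Block = Tuple[int, int]  # (start, end) on the genomic sequence, 1-based inclusive


def gene_structure(blocks: List[Block]) -> Tuple[List[Block], List[Block]]:
    """Return (exons, introns) from aligned mRNA blocks.

    Exons are the aligned blocks sorted by genomic position;
    introns are the genomic gaps between consecutive exons.
    """
    exons = sorted(blocks)
    introns = []
    for (_, prev_end), (next_start, _) in zip(exons, exons[1:]):
        if next_start <= prev_end:
            raise ValueError("aligned blocks overlap")
        if next_start - prev_end > 1:  # skip directly adjacent blocks
            introns.append((prev_end + 1, next_start - 1))
    return exons, introns


# Hypothetical alignment blocks, deliberately given out of order.
exons, introns = gene_structure([(101, 250), (1001, 1009), (400, 520)])
print(exons)    # exon coordinates in genomic order
print(introns)  # the gaps between them
```

The short third block (9 bp) mirrors the kind of short exon mentioned later in the talk; finding such blocks is the aligner's job, which this sketch does not attempt.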

  15. Outline of the talk • Background • Determining gene architecture • Graph theory in AS • Whole genome analysis results • AS and disease

  16. MGAlignIt (Lee et al., 2003) • Fast, highly accurate heuristic approach • Capitalizes on the fact that the mRNA sequence constitutes a very small percentage of the genomic sequence

  17. MGAlign’s “biological” alignment strategy

  18. MGAlignIt web service

  19. Benchmarking • Dataset: human Chr 22 from the Sanger Centre (Collins et al., 2003) • 936 annotated mRNAs (5176 exons) • 48 Mbp long human Chr 22 genomic sequence

  20. Some successes • Short internal exons (exon 2: 9 bp & exon 9: 21 bp) • Short terminal exons (exon 1: 15 bp)

  21. MGAlign performance • More savings in computer time with longer gDNA sequences • Based on 41 randomly chosen genomic fragments • [Figure: comparison of sim4, Spidey and MGAlign]

  22. Outline of the talk • Background • Determining gene architecture • Graph theory in AS • Whole genome analysis results • AS and disease

  23. Problem: Königsberg bridges (1700s) • The residents of Königsberg, Prussia (now Kaliningrad, Russia), wondered if it was possible to take a walking tour of the town that crossed each of the seven bridges over the Pregel river exactly once. • Leonhard Euler, 1736 (father of graph theory) • [Figure labels: node; path or edge]
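Euler's answer rests on a degree count: a connected graph can be walked using every edge exactly once only if it has zero or two nodes of odd degree. The sketch below applies that criterion to the seven bridges. The land-mass labels A to D (A being the central island) and the function names are assumptions made for illustration.

```python
from collections import Counter

# Seven bridges between four land masses (A = central island,
# B and C = river banks, D = eastern island).
bridges = [
    ("A", "B"), ("A", "B"),
    ("A", "C"), ("A", "C"),
    ("A", "D"), ("B", "D"), ("C", "D"),
]


def odd_degree_nodes(edges):
    """Nodes touched by an odd number of edges."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(node for node, d in degree.items() if d % 2 == 1)


def has_euler_walk(edges):
    """Euler's criterion; assumes the graph is connected."""
    return len(odd_degree_nodes(edges)) in (0, 2)


print(odd_degree_nodes(bridges))
print(has_euler_walk(bridges))
```

All four land masses have odd degree, so no such walking tour exists.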

  24. Graph theory for AS • First used for AS by Heber et al. (2002). • Each independent segment represented as a node, connected by arrows. • “Node” here is not necessarily based on introns and exons: simply a common contiguous segment of the gene. • Human ADSL (adenylosuccinate lyase) gene

  25. Our splicing graph approach • A biologist’s viewpoint: each exon should be a node and each intron, an edge (connection). • Automatic generation of AS clusters from gene structure. • Identifying each distinct reference exon and its associated variants. • Simple rules for classifying alternative splicing events and a visualization system for studying all variants from a single gene. • Single-line diagram: the experimentalist's way of analysing alternative splicing

  26. Making the splicing graph

  27. Usual classification of AS events (Leipzig et al., 2004)

  28. Representing splice variants of the same gene as a splicing graph

  29. Normal representation of transcripts • Human hyaluronidase HYAL1 gene: ENSG00000114378 (an early version)

  30. Splicing graph representation of the same gene • Events labelled in the graph: intron retention, alternative termination site, exon skipping • Transcripts are shown as exon numbers: 5+2+3+9; 6+3+9; 1+7+3+4; 1+8+3+4; 1+2+4; 1+3+4.

  31. Single-line Splice Diagram • Patterns using the above exon numbers are shown as: 5+2+3+9; 6+3+9; 1+7+3+4; 1+8+3+4; 1+2+4; 1+3+4. • A digraph or DAG (Directed Acyclic Graph) • Graphs for which every unilateral orientation is traceable • The experimentalist’s way of analysing alternative splicing (for a gene of interest with all transcripts), useful for validating splice junctions • Intron retention is clearly visible
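The exon-number notation maps directly onto a directed graph with exons as nodes and observed junctions as edges. Below is a minimal sketch that builds the graph from the six HYAL1 transcript strings on the slide and checks that it is acyclic. The helper name `splicing_graph` is hypothetical; this is an illustration of the idea, not the authors' software.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

transcripts = ["5+2+3+9", "6+3+9", "1+7+3+4", "1+8+3+4", "1+2+4", "1+3+4"]


def splicing_graph(transcripts):
    """Return the set of directed edges (exon -> next exon) over all transcripts."""
    edges = set()
    for transcript in transcripts:
        exons = [int(e) for e in transcript.split("+")]
        edges.update(zip(exons, exons[1:]))
    return edges


edges = splicing_graph(transcripts)
nodes = {exon for edge in edges for exon in edge}

# Exons with no incoming edge start a transcript; those with no outgoing edge end one.
first_exons = sorted(nodes - {b for _, b in edges})
last_exons = sorted(nodes - {a for a, _ in edges})

# TopologicalSorter expects node -> predecessors and raises CycleError on a cycle,
# so obtaining an order confirms the graph is a DAG.
predecessors = {n: set() for n in nodes}
for a, b in edges:
    predecessors[b].add(a)
order = list(TopologicalSorter(predecessors).static_order())

print(sorted(edges))
print(first_exons, last_exons)
```

Six transcripts collapse into nine nodes and twelve distinct junctions, which is what makes the single-line diagram compact.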

  32. Our extended classification Automatic rule-based classification
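The talk's own rule set is not reproduced in this transcript. As a hedged illustration of what a rule-based classifier can look like, the sketch below implements one simple rule: exon b is flagged as skipped when junctions a→b and b→c exist and some transcript also joins a directly to c. The rule, the function name `skipped_exons` and its output format are assumptions, applied here to the HYAL1 exon-number transcripts quoted on slide 30.

```python
from collections import defaultdict

transcripts = ["5+2+3+9", "6+3+9", "1+7+3+4", "1+8+3+4", "1+2+4", "1+3+4"]

# Collect every observed exon-to-exon junction.
edges = set()
for transcript in transcripts:
    exons = [int(e) for e in transcript.split("+")]
    edges.update(zip(exons, exons[1:]))

successors = defaultdict(set)
for a, b in edges:
    successors[a].add(b)


def skipped_exons(edges, successors):
    """Return sorted (a, b, c) triples where exon b is bypassed by a direct a -> c junction."""
    hits = set()
    for a, b in edges:
        for c in successors.get(b, ()):
            if (a, c) in edges:
                hits.add((a, b, c))
    return sorted(hits)


for a, b, c in skipped_exons(edges, successors):
    print(f"exon {b} is skipped between exons {a} and {c}")
```

A fuller classifier would need genomic coordinates as well, for example to tell intron retention apart from alternative splice sites, which exon numbers alone cannot express.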

  33. Our extended classification

  34. Where to make your splicing graphs

  35. Outline of the talk • Background • Determining gene architecture • Graph theory in AS • Whole genome analysis results • AS and disease

  36. AS Databases (Of men and mice) • Existing databases do not provide sufficient information for multi-gene comparison to understand the phenomenon of AS.

  37. Genome-wide AS analysis: “I” said the fly…

  38. Homology • Similarity between biological sequences due to shared ancestry • Orthology • Homologous sequences are orthologous if separated by a speciation event • The divergent copies of a single gene in the resulting species are orthologous genes. • At least 25-30% similarity at the protein level

  39. Gene Ontology • Provides a controlled vocabulary to describe gene and gene product attributes in organisms. • Three organizing principles • Cellular component • A component of a cell, e.g. nucleus • Biological process • Series of events accomplished by one or more ordered assemblies, e.g. signal transduction • Molecular function • Describes activities, e.g. catalytic activity

  40. AS genes in the bovine genome • Part of the bovine annotation project • 16560 human genes, 15986 mouse genes, 4567 bovine genes • Data extracted from ASTD and Ensembl (Hubbard et al., 2002) • Orthologous genes found using BioMart from Ensembl • Gene Ontology using Blast2GO (Conesa et al., 2005) • 2458 (out of 4567) Ensembl AS genes have GO annotations • 1716 AS genes can be further annotated

  41. Percentage of AS genes and orthologous spliced genes in bovine, human and mouse

      Genome    Total genes    Genes with multiple transcripts    % of AS
      Bovine    21755          4567                               21%
      Human     24573          16715                              68%
      Mouse     28931          16491                              57%

      Orthologous gene set
      Bovine    21755          3504                               16%
      Human     24573          3835                               16%
      Mouse     28931          3774                               13%

      • Orthologous genes were analysed in order to reduce bias in the data.
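The percentages in the table follow from dividing each gene count by the genome's total gene count. A short sketch, using only the figures from the slide, reproduces them; the dictionary layout and the helper `pct` are illustrative choices.

```python
# genome: (total genes, genes with multiple transcripts, orthologous AS genes)
genomes = {
    "Bovine": (21755, 4567, 3504),
    "Human": (24573, 16715, 3835),
    "Mouse": (28931, 16491, 3774),
}


def pct(part, total):
    """Percentage rounded to the nearest whole number, as on the slide."""
    return round(100 * part / total)


as_pct = {name: pct(multi, total) for name, (total, multi, _) in genomes.items()}
ortho_pct = {name: pct(ortho, total) for name, (total, _, ortho) in genomes.items()}

for name in genomes:
    print(f"{name}: {as_pct[name]}% AS genes, {ortho_pct[name]}% in the orthologous set")
```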

  42. Gene-level AS analysis of the orthologous subset • $\text{Percentage of genes} = \dfrac{\text{No. of genes having one event}}{\text{Sum of the genes for all the events}} \times 100$ • The percentage of bovine genes showing AS events is lower than in human.

  43. AS event analysis of the orthologous subset • $\%\ \text{events} = \dfrac{\text{No. of times an event occurs}}{\text{Sum of occurrences of all events}} \times 100$ • % of AS events in bovine is similar to human • This implies that more splice variants are obtained from fewer bovine genes.
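The gene-level and event-level percentages from slides 42 and 43 can be computed side by side. The sketch below uses a small, entirely hypothetical annotation (gene names and event lists are invented for illustration, not data from the talk). It reads "genes having one event" as the number of genes showing a given event type, which is an interpretation of the slide's wording.

```python
from collections import Counter

# Hypothetical annotation: gene -> list of AS events observed in its transcripts.
annotations = {
    "geneA": ["exon_skipping", "exon_skipping", "intron_retention"],
    "geneB": ["exon_skipping"],
    "geneC": ["intron_retention", "alt_termination"],
}

# Gene level: count each gene once per event type it shows.
genes_per_event = Counter()
for events in annotations.values():
    genes_per_event.update(set(events))

# Event level: count every occurrence.
occurrences = Counter()
for events in annotations.values():
    occurrences.update(events)

gene_total = sum(genes_per_event.values())
event_total = sum(occurrences.values())

gene_pct = {e: 100 * n / gene_total for e, n in genes_per_event.items()}
event_pct = {e: 100 * n / event_total for e, n in occurrences.items()}

for event in sorted(gene_pct):
    print(f"{event}: {gene_pct[event]:.1f}% of genes, {event_pct[event]:.1f}% of events")
```

In this toy data exon skipping accounts for a larger share of events than of genes because one gene shows it twice, the same kind of gap the slides describe between the bovine gene-level and event-level figures.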

  44. Gene Ontology analysis • Gene Ontology using Blast2GO (Conesa et al., 2005) • 2458 (out of 4567) AS genes have GO annotations in Ensembl • 1716 AS genes can be further annotated

  45. Outline of the talk • Background • Determining gene architecture • Graph theory in AS • Whole genome analysis results • AS and disease

  46. Implications for disease • Diagnostics from early recognition of splice variants associated with disease, based on nucleotide detection • Treatment options using siRNA • Aberrant splicing in survival of motor neuron 1 gene (SMN1) in spinal muscular atrophy (Cartegni and Krainer 2002) • Suppressing anti-apoptotic AS variant of Bcl-x pre-mRNA in prostate and breast cancer cells (Mercatante et al. 2001) • Correcting CFTR mis-splicing (Friedman et al. 1999)

  47. Many diseases are caused by AS, e.g. myotonic dystrophy

  48. Why study farm animals? • They provide valuable insights into gene function and into genetic and environmental influences on animal production and human diseases (Roberts et al., 2009). • Because of their size and relatively long intervals between generations, domestic species are widely used to unravel the mechanisms involved in programming the development of an embryo and fetus, resulting in adult onset of diseases (King et al., 2007; Padmanabhan et al., 2007). • Mapping human disease genes to bovine orthologous genes is an excellent means of carrying out analytical work and verifying the suitability of the cow as a model organism.

  49. Mapping human disease genes to bovine genome • 94 human disease genes were extracted from NCBI Genes and Disease database to analyse which of these genes were alternatively spliced in human and bovine genomes. • AS analysis was conducted on 66 spliced genes. • 17 orthologous spliced genes were observed in bovine.