Online Quiz • Take the online quiz now • Time until 14:00
A Handful of miRNA Fingerprints Hamid Hamzeiy, Duygu Saçar, and Jens Allmer Molecular Biology and Genetics, Izmir Institute of Technology MBG404 – Lecture Slides – Week 10
Interest in miRNAs • 2008 - 2010 • Master's thesis (Mehmet Volkan Çakır): Systematic Computational Analysis of Potential RNA Interference Regulation in Toxoplasma gondii • 2010 - 2011: High school project competition, Özel Çakabey Anadolu Lisesi, İzmir: • Computation (by Means of Bioinformatics) of Regulatory miRNA Networks That Potentially Play a Role in Cancer Development • Ongoing research: Hamid Hamzeiy, Müşerref Duygu Saçar
What is Cancer http://en.wikipedia.org/wiki/Cancer
Cancer and miRNAs • Although small, miRNAs have large effects because they regulate other genes • They have been definitively linked to cancer • Their up- or downregulation affects cancer development • Therefore, we became interested in studying miRNA regulatory networks Meltzer 2005: Small RNAs with big impacts, Nature 435:745-746 He et al. 2005: A microRNA polycistron as a potential human oncogene, Nature 435:828-833 Lu et al. 2005: MicroRNA expression profiles classify human cancers, Nature 435:834-838 O’Donnell et al. 2005: c-Myc-regulated microRNAs modulate E2F1 expression, Nature 435:839-843
Brief Introduction to miRNAs Advanced Information on The Nobel Prize in Physiology or Medicine 2006, http://nobelprize.org
miRNA Expression • MicroRNAs (miRNAs) are ~22-nucleotide endogenous RNAs that often repress the expression of complementary messenger RNAs. • miRNAs derive from characteristic hairpins in primary transcripts through two sequential RNase III-mediated cleavages; Drosha cleaves near the base of the stem to liberate a ~60-nucleotide pre-miRNA hairpin, then Dicer cleaves near the loop to generate a miRNA:miRNA* duplex. Bartel (2004) “MicroRNAs: genomics, biogenesis, mechanism, and function”. Cell, 116:281–297. Lee Y, et al. (2003) “The nuclear RNase III Drosha initiates microRNA processing”. Nature, 425:415–419. Tomari, Zamore (2005) “Perspective: machines for RNAi”. Genes Dev, 19:517–529. Bushati and Cohen (2007) “microRNA Functions”. Annu Rev Cell Dev Biol, 23:175–205.
Our Interests • Given a gene of interest • Can we find genes that are regulated via new miRNAs from this gene? • Given multiple genes of interest that act as sources for new, putative miRNAs • Can we find their targets? • Can we find shared targets? • Can we find feedback regulations? • Can we establish a regulatory network? • Unsolved so far • Given genes of interest, can we find miRNAs that regulate these genes?
Problem Definition • Input • One or more source genes • Aim • Finding intronic miRNAs • Finding their 3’UTR targets • Output • Putative miRNA-mRNA regulations • Extension • Repeating the process to establish a network
miRNA Genesis • Many programs claim to predict miRNAs • None of these was useful for de novo prediction of miRNAs • We resorted to folding nucleotide sequences and looking for hairpin structures manually • RNAstructure • RNAfold • RNAshapes Allmer, Jens in MikroRNA ve Sinir Sistem: mikroRNA Analizi ve Saklanmasında Hesaplamaya Dayalı Yaklaşımlar
miRNA Targeting • Many a program claims prediction of miRNA targets • None of these were useful for finding targets for unknown miRNAs • We resorted to using BLAST for targeting Allmer, Jens in MikroRNA ve Sinir Sistem: mikroRNA Analizi ve Saklanmasında Hesaplamaya Dayalı Yaklaşımlar
Proof of Concept • Selected examples from miRBase + 500nt context are folded with RNAstructure • Hairpins are visually inspected • Many hairpins could be re-established in this manner • Selected examples from TarBase are matched to human 3’UTRs • The interactions listed in TarBase are recovered, alongside other predicted interactions Allmer, Jens in MikroRNA ve Sinir Sistem: mikroRNA Analizi ve Saklanmasında Hesaplamaya Dayalı Yaklaşımlar
Selecting Initial Genes • Must have an implication in cancer • Should be well annotated • Selection • ATM (breast, lung, and other cancers) • P65 (breast, lung cancer) • TP53 (variety of cancers, among them breast cancer) • BRCA2 (breast, ovarian, skin cancer)
Processing of Initial Genes • Only introns were folded • 113 introns (folded in small chunks rather than as complete introns) • About 200,000 nt overall • All folds were visually scanned for hairpins
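Folding introns in small chunks can be sketched as a sliding window over each intron. This is only an illustration: the slides do not state the chunk size, so the window length and overlap below are assumed values, and the function name is hypothetical.

```python
def chunk_sequence(seq, size=500, overlap=100):
    """Yield (start, chunk) pairs covering seq with overlapping windows.

    size and overlap are hypothetical values, not taken from the slides.
    The overlap keeps a hairpin that straddles a chunk boundary intact in
    at least one chunk, as long as the hairpin is shorter than the overlap.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    start = 0
    while True:
        yield start, seq[start:start + size]
        if start + size >= len(seq):
            break
        start += step


intron = "ACGT" * 300  # 1200 nt placeholder sequence, not a real intron
chunks = list(chunk_sequence(intron))
```

Each chunk would then be passed to the folding program separately, and the start offset maps a hairpin found in a chunk back to its position in the intron.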
Selected Putative Hairpin Structures • 86 hairpin-like structures were selected • Examples: Drosha cutting has been performed
miRNA Targeting • BLAST all 86 hairpins (172 mature miRNAs) • Against human 3’UTRs • Results can be filtered by • Clustering of targets • Proximity to stop signal • Multiplicity of miRNAs targeting the same 3’UTR • [Figure: gene schematic labeled intron, exon, 3’UTR]
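The filtering measures listed above can be sketched on a list of hits. The hit tuples here are a hypothetical, simplified format (not BLAST's own output), and the sketch assumes that each database sequence starts at the first nucleotide of the 3’UTR, directly after the stop codon.

```python
from collections import defaultdict

# Hypothetical hits: (miRNA id, 3'UTR id, 1-based start of the match in the UTR).
# The identifiers and positions are made up for illustration.
hits = [
    ("mir_a", "UTR1", 12),
    ("mir_a", "UTR1", 40),
    ("mir_b", "UTR1", 310),
    ("mir_a", "UTR2", 95),
]


def summarize_hits(hits):
    """Return per-UTR measures: sites per miRNA, distinct miRNAs, closest site."""
    sites = defaultdict(lambda: defaultdict(list))
    for mirna, utr, start in hits:
        sites[utr][mirna].append(start)
    summary = {}
    for utr, by_mirna in sites.items():
        all_starts = [s for starts in by_mirna.values() for s in starts]
        summary[utr] = {
            # Number of target sites each miRNA has in this UTR.
            "sites_per_mirna": {m: len(s) for m, s in by_mirna.items()},
            # Number of distinct miRNAs targeting this UTR.
            "mirna_multiplicity": len(by_mirna),
            # Assuming the sequence starts right after the stop codon, the
            # smallest start position is the site closest to the stop signal.
            "closest_to_stop": min(all_starts),
        }
    return summary


summary = summarize_hits(hits)
```

A filter could then keep only those 3’UTRs whose measures exceed chosen cutoffs; the slides do not give such cutoffs, so none are suggested here.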
BLAST Results • Each miRNA may target multiple other genes • >>1000 (unfiltered) • Each miRNA may have multiple target sites per 3’UTR • Avg: 2.8 ± 0.2 • This leads to a large amount of data
Putative miRNAs of ATM • [Figure: ATM and genes, such as TBRG4, that have a target site in their 3’UTR for a miRNA from ATM] • 25 of the 39 putative miRNAs originating from introns of the ATM gene have targets in 3’UTRs that are more abundant than average and may therefore represent true miRNA-3’UTR interactions.
Extension of Gene Selection • Initial miRNA prediction and targeting • Led to many putative targets • A small subjective selection was made • Based on target clustering, regulation multiplicity, and chance • Further selections • YME1L1 (metalloprotease) • TBRG4 (protein-protein interaction, kinase) • ZNF785 (zinc finger protein) • ZFYVE20 (zinc finger protein)
Partial Success • Achievements • New hairpins can be identified manually • In conjunction with targeting information, some filtering can be achieved • Problems • Processing is a heavy workload • Slow • Still too many results, suggesting many false positives • Solutions?
Fingerprints http://www.psypost.org/2010/06/left-hand-motivated-right-1376
Learning Hairpins • miRBase Examples (Positive) • Human miRNAs (1426 hairpins available, May 2011) • Inconsistent • Multiple accessions pointing to one sequence • Some mapped to Ensembl • Some contain extended information • Beginning and end of hairpin not consistent • Some hairpins contain multiple loops and even sub-hairpins • About 358 were finally used as examples • Negative Examples • Random sequences • Closely mirroring examples from miRBase • Number of Examples: 1424
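One possible way to generate random negatives that closely mirror the positive examples is to shuffle each positive sequence, which preserves its length and nucleotide composition. This is an assumed interpretation, not necessarily how the negative set in the slides was built; the sequence below is a made-up placeholder.

```python
import random


def shuffled_negative(seq, rng):
    """Return a random permutation of seq (same length and base counts)."""
    bases = list(seq)
    rng.shuffle(bases)
    return "".join(bases)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
positive = "GGCUAGCUAGGAUCCGAUAGCUAGCC"  # placeholder, not a miRBase entry
negative = shuffled_negative(positive, rng)
```

A shuffled sequence still has to be folded before its fingerprint attributes can be compared with those of the real hairpins.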
Some Possible Values for Hairpin Fingerprint • Number of mismatches (shape and longest stretch) • RNAhybrid mfe intervals (left, middle, and right) • Length of flanking ends (3’ and 5’) • Length of lower stem matches • Nucleotide entropy (sn, dn, and tn distinct counts)
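A few of these values can be computed directly from a sequence and its dot-bracket fold. The sketch below assumes a simple hairpin with a single terminal loop; the feature names and the reading of "mismatches" as unpaired positions inside the stem arms are assumptions. The RNAhybrid mfe intervals are left out because they need the external program.

```python
def longest_run(text, char):
    """Length of the longest run of char in text."""
    best = run = 0
    for c in text:
        run = run + 1 if c == char else 0
        best = max(best, run)
    return best


def hairpin_features(seq, structure):
    """Compute simple descriptors of a single-loop hairpin (hypothetical names)."""
    if len(seq) != len(structure):
        raise ValueError("sequence and structure must have equal length")
    first_open, last_open = structure.find("("), structure.rfind("(")
    first_close, last_close = structure.find(")"), structure.rfind(")")
    if first_open == -1 or first_close == -1 or last_open > first_close:
        raise ValueError("expected a simple hairpin with one terminal loop")
    arm5 = structure[first_open:last_open + 1]    # 5' arm of the stem
    arm3 = structure[first_close:last_close + 1]  # 3' arm of the stem
    lower_stem = len(arm5) - len(arm5.lstrip("("))  # pairs before first bulge

    def distinct(k):
        # Number of distinct k-mers in the sequence.
        return len({seq[i:i + k] for i in range(len(seq) - k + 1)})

    return {
        "flank5": first_open,
        "flank3": len(structure) - 1 - last_close,
        "pairs": structure.count("("),
        "loop_length": first_close - last_open - 1,
        "mismatches": arm5.count(".") + arm3.count("."),
        "longest_mismatch_stretch": max(longest_run(arm5, "."),
                                        longest_run(arm3, ".")),
        "lower_stem_matches": lower_stem,
        "distinct_mono_di_tri": (distinct(1), distinct(2), distinct(3)),
    }


# Made-up example hairpin: 6 base pairs, one bulge, a 4 nt loop.
features = hairpin_features("AUGGCUACGGAAACGAGCCA", "..((((.((....)))))).")
```

Hairpins with multiple loops or sub-hairpins, which the miRBase examples sometimes contain, would need a proper dot-bracket parser rather than this index-based shortcut.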
Hairpin Fingerprint (Selected Attributes) http://orange.biolab.si/
Accuracy Calculations http://orange.biolab.si/
Distribution of Selected Attributes • [Figure: two panels, Stem Length and MFE of Stem, each comparing miRNA and random hairpins]
C4.5 Decision Tree • Just 5 levels • Can be implemented with 6 if-else statements http://orange.biolab.si/
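To show how a learned tree turns into plain if-else statements, here is a toy sketch. It is not the tree from the slide: the two attributes are borrowed from the selected-attributes plot, but the thresholds are placeholders invented for illustration, and the real tree has more levels.

```python
def classify_hairpin(stem_length, stem_mfe):
    """Toy decision tree; the thresholds are placeholders, NOT learned values."""
    if stem_length < 20:   # hypothetical cutoff on stem length (nt)
        return "random"
    if stem_mfe > -15.0:   # hypothetical cutoff on stem mfe (kcal/mol)
        return "random"
    return "miRNA"


label = classify_hairpin(stem_length=30, stem_mfe=-30.0)
```

Because applying the rules is just a handful of comparisons, every candidate hairpin can be screened quickly once its attributes have been computed.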
Conclusion • Rule learning is somewhat involved • Problems with examples • Rule application • Fast • Easy • Good accuracy • Low number of false positives • All positives must be true so more filtering is necessary
Outlook • Drosha Cleavage • Dicer Cleavage • Fingerprints • Section from Shabalina and Koonin “Origins and evolution of eukaryotic RNA interference”. Trends in Ecology and Evolution 23 (10)
Outlook • RISC incorporation • Mature miRNA • Fingerprints • Section from Shabalina and Koonin “Origins and evolution of eukaryotic RNA interference”. Trends in Ecology and Evolution 23 (10)
Outlook • Example Measures • Clustering of targets • Proximity to stop signal • Multiplicity of miRNAs per target • Target accessibility • ... • Targeting Fingerprint
Acknowledgements http://bioinformatics.iyte.edu.tr
Hairpin Structure Created by Duygu Saçar
Select miRNAs from miRBase • Select at least 10 miRNAs • Fold the selected miRNAs with their genomic context (+/- 50nt) • Collect the results in dot-bracket notation • Translate the dot-bracket notation to nucleotides: • ( = A, ) = T, . = C • Make an MSA with the collected results • Penalize gaps strongly
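The translation step can be sketched in a few lines, using the mapping given on the slide. The function name is hypothetical.

```python
# Mapping from the slide: ( = A, ) = T, . = C
DOT_BRACKET_TO_NT = str.maketrans({"(": "A", ")": "T", ".": "C"})


def structure_to_nucleotides(structure):
    """Encode a dot-bracket string as a pseudo-nucleotide sequence."""
    unexpected = set(structure) - set("().")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return structure.translate(DOT_BRACKET_TO_NT)


encoded = structure_to_nucleotides("..(((...)))..")
# encoded == "CCAAACCCTTTCC"
```

The encoded strings can then be given to any nucleotide MSA tool, which aligns structures rather than sequences.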
Use your Findings • Select a random gene • Extract the introns • Fold the first 500 nucleotides of those introns • Create a BLAST database from the folds • You need to translate the dot-bracket notation to nucleotide characters as before • BLAST your consensus sequences (penalize gaps strongly) • Did you find anything that looked like a miRNA?
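Preparing the input for the BLAST database can be sketched as follows: truncate each intron to its first 500 nucleotides before folding, then write the translated folds as FASTA records. The folding itself is done by an external program and is not shown; the fold strings and record names below are made up, and the helper names are hypothetical.

```python
import io

# Same mapping as on the previous slide: ( = A, ) = T, . = C
TRANSLATION = str.maketrans({"(": "A", ")": "T", ".": "C"})


def first_n(seq, n=500):
    """Return the first n nucleotides of an intron (all of it if shorter)."""
    return seq[:n]


def write_fasta(folds, handle, width=60):
    """Write dot-bracket folds as pseudo-nucleotide FASTA records."""
    for name, structure in folds.items():
        encoded = structure.translate(TRANSLATION)
        handle.write(f">{name}\n")
        for i in range(0, len(encoded), width):
            handle.write(encoded[i:i + width] + "\n")


# Hypothetical folds; in practice these come from the folding program's output.
folds = {"intron1_fold": "((((....))))", "intron2_fold": "..((...)).."}
buffer = io.StringIO()
write_fasta(folds, buffer)
fasta_text = buffer.getvalue()
```

Writing to a real file instead of the in-memory buffer gives a FASTA file that a nucleotide BLAST database can be built from.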
Galaxy 1 • For the remainder of today, we will try to add one of the MSA tools your group is using to Galaxy (https://wiki.galaxyproject.org/FrontPage) • Here is how to add a tool • https://wiki.galaxyproject.org/Admin/Tools/AddToolTutorial • More about the XML structure for the tool descriptor: • https://wiki.galaxyproject.org/Admin/Tools/ToolConfigSyntax • The final version of the term project should also provide a tool descriptor for all command-line-based tools