
Identifying and classifying functional small RNAs from pine

Identifying and classifying functional small RNAs from pine. Ryan Morin BC Genome Sciences Centre (presenting research conducted in the lab of Dr. Peter Unrau). Small RNAs in the land plants. MicroRNA Biogenesis. MicroRNA (miRNA) transcripts can be mono- or polycistronic


Presentation Transcript


  1. Identifying and classifying functional small RNAs from pine • Ryan Morin, BC Genome Sciences Centre (presenting research conducted in the lab of Dr. Peter Unrau)

  2. Small RNAs in the land plants

  3. MicroRNA Biogenesis • MicroRNA (miRNA) transcripts can be mono- or polycistronic • Each miRNA derives from its own hairpin foldback region of the primary miRNA transcript • A number of processing steps are required to form the mature miRNA sequence • The final mature miRNA is bound to a complex of proteins known as RISC

  4. MicroRNAs are post-transcriptional gene regulators • miRNAs are short (18-24 nt) RNAs that are derived from larger precursor transcripts • Mature miRNAs hybridize to near-complementary sites within their target messenger RNAs • Messenger RNAs hybridized to microRNAs are enzymatically degraded, preventing translation

  5. siRNAs can be post- or pre-transcriptional gene regulators • siRNA precursors are also processed by Dicer (plants have multiple Dicer paralogs) • Many active mature siRNAs are produced from the primary duplex • Mature siRNAs can elicit post-transcriptional regulation by a microRNA-like mechanism • siRNAs can also direct methylation of histones or DNA, altering chromatin state (so-called heterochromatin siRNAs)

  6. Purification, adaptor ligation and amplification of small RNAs • Extract total RNA (pine, rice) • Extract small RNA (smRNA) sequences in the desired size range • Ligate 3’ DNA adaptor • Ligate 5’ RNA adaptor • Gel purify (remove un-ligated sequences) • RT-PCR and PCR amplification • Construct elements labelled in the figure: biotin, library identification tag (8mer), random 10mer, forward PCR primer sequence, reverse PCR primer complementary sequence, and 454 sequencing and amplification primer sequences

  7. Rice and pine sequences produced by 454 SBS technology • 142,493 reads from pine, 11,436 from rice • Small RNA sequences were extracted by identifying the flanking adaptor sequences • The random tag sequence was used to distinguish PCR-amplified copies of one construct from independent occurrences of the same small RNA species • There were 58,466 unique sequences from pine and 8,615 from rice • The most common sequence was observed 1,168 times in pine and 89 times in rice
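The adaptor trimming and tag-based de-duplication described on this slide could be sketched roughly as follows. The adaptor sequences, the read layout (5' adaptor, small RNA, 3' adaptor, 10 nt random tag) and the function names are all hypothetical; the slides do not give the actual construct sequences.

```python
from collections import Counter

# Hypothetical adaptors and layout (NOT from the slides):
# read = FIVE_PRIME + small RNA + THREE_PRIME + 10 nt random tag
FIVE_PRIME = "ACGTTGCA"
THREE_PRIME = "TGGAATTC"
TAG_LENGTH = 10


def parse_read(read):
    """Return (small_rna, random_tag), or None if the layout is not found."""
    if not read.startswith(FIVE_PRIME):
        return None
    end = read.find(THREE_PRIME, len(FIVE_PRIME))
    if end == -1:
        return None
    insert = read[len(FIVE_PRIME):end]
    tag_start = end + len(THREE_PRIME)
    tag = read[tag_start:tag_start + TAG_LENGTH]
    if not insert or len(tag) != TAG_LENGTH:
        return None
    return insert, tag


def count_independent_occurrences(reads):
    """Count each small RNA once per distinct random tag.

    Reads sharing both insert and tag are treated as PCR copies of one
    construct; the same insert with a different tag counts as a separate
    occurrence of that small RNA.
    """
    seen = set()
    for read in reads:
        parsed = parse_read(read)
        if parsed is not None:
            seen.add(parsed)
    return Counter(insert for insert, _tag in seen)
```

A real pipeline would also need to tolerate sequencing errors in the adaptors and tags, which this sketch ignores.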

  8. Positional annotation of small RNAs and identification of conserved pine small RNAs • Alignment of pine small RNA sequences to the rice genome reveals perfectly conserved pine small RNAs • Pine sequences aligning to rRNA or tRNA genes in the rice genome are likely degradation products of pine rRNA/tRNA • Sequences aligning within annotated regions were annotated accordingly as rRNA, tRNA, miRNA, repeat-derived siRNA, etc.
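Positional annotation of this kind amounts to an interval lookup. A minimal sketch, assuming each alignment hit and each annotated feature is given as start/end coordinates on the same chromosome (the coordinates, labels and function name here are illustrative only):

```python
def annotate_hit(hit_start, hit_end, features):
    """Return labels of all features that fully contain the aligned hit.

    features: list of (start, end, label) tuples on the same chromosome,
    using inclusive coordinates (an assumption of this sketch).
    """
    labels = [
        label
        for start, end, label in features
        if start <= hit_start and hit_end <= end
    ]
    return labels or ["unannotated"]


# Illustrative, made-up feature table
example_features = [(100, 220, "rRNA"), (500, 572, "tRNA")]
print(annotate_hit(130, 151, example_features))
print(annotate_hit(300, 321, example_features))
```

For a genome-scale feature table, a sorted index or interval tree would replace the linear scan.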

  9. Positionally-annotated small RNAs show distinct trends in sequence length

  10. Pre-miRNA structure: the ‘gold standard’ for miRNA annotation • To classify a sequence as a miRNA, one usually has to be convinced it forms a suitable pre-miRNA structure • Knowing the sequence of the mature miRNA helps to separate random hairpins from real pre-miRNAs • Rules enforced by the Bartel lab (MIRcheck): 1) No more than 4 unpaired nt in total (and no more than 2 consecutive unpaired nt) within the miRNA/miRNA* region 2) The length of the hairpin must be at least 60 nt 3) No more than one asymmetrically unpaired nucleotide 4) The pairing must extend 4 nt beyond the mature miRNA
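As a rough illustration, rules 1 and 2 can be checked on a dot-bracket secondary structure string such as an RNA folding program might produce. This is a partial sketch, not MIRcheck itself: rules 3 and 4 are omitted, rule 1 is read as "at most 4 unpaired nt in total and at most 2 in a row", and the function name and coordinate convention are assumptions.

```python
import re


def passes_basic_hairpin_rules(structure, mirna_start, mirna_end,
                               max_unpaired=4, max_run=2, min_hairpin=60):
    """Check rules 1 and 2 only, on a dot-bracket structure string.

    mirna_start and mirna_end are 0-based, end-exclusive coordinates of
    the mature miRNA within the hairpin (an assumption of this sketch).
    """
    region = structure[mirna_start:mirna_end]
    unpaired_total = region.count(".")
    # Longest stretch of consecutive unpaired positions in the miRNA region
    longest_run = max((len(run) for run in re.findall(r"\.+", region)),
                      default=0)
    return (len(structure) >= min_hairpin
            and unpaired_total <= max_unpaired
            and longest_run <= max_run)
```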

  11. Finding clusters of related sequences • Align each pair of sequences in the database, including all known miRNAs (mature miRNA sequences from every plant species in miRBase) • Link sequences in the database if they align over >18 nt with <4 mismatches • Treating these relationships as a graph, extract all the connected components
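The link-then-cluster idea can be sketched as below. The slides do not say which alignment method was used, so this sketch substitutes a simple ungapped comparison at every offset; the thresholds follow the slide (more than 18 aligned nt, fewer than 4 mismatches), and connected components are extracted with a union-find structure. Function names are hypothetical.

```python
from itertools import combinations


def linked(a, b, min_overlap=19, max_mismatches=3):
    """True if some ungapped overlap of >= min_overlap nt has few mismatches."""
    # offset is the position in `a` at which b[0] is placed
    for offset in range(-(len(b) - min_overlap), len(a) - min_overlap + 1):
        start_a, start_b = max(offset, 0), max(-offset, 0)
        overlap = min(len(a) - start_a, len(b) - start_b)
        if overlap < min_overlap:
            continue
        mismatches = sum(x != y for x, y in zip(a[start_a:], b[start_b:]))
        if mismatches <= max_mismatches:
            return True
    return False


def connected_components(sequences):
    """Group sequences into clusters: the connected components of the link graph."""
    parent = list(range(len(sequences)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(sequences)), 2):
        if linked(sequences[i], sequences[j]):
            parent[find(i)] = find(j)

    clusters = {}
    for i, seq in enumerate(sequences):
        clusters.setdefault(find(i), []).append(seq)
    return list(clusters.values())
```

The all-pairs comparison is quadratic in the number of sequences, so a data set of tens of thousands of unique reads would likely need an indexed search instead.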

  12. Putting clusters to use, part 1: guilt by association • Known miRNA families generally partition into the same cluster (along with un-annotated sequences) • Low-quality (i.e. large/diverse) clusters are ignored • Sequences can be readily annotated as conserved miRNAs if they share a cluster with a known miRNA

  13. Leveraging EST/cDNA sequences (miRNA annotation method 1) • Pine small RNAs were also aligned to the publicly available pine EST sequences • miRNAs align to their ‘parent’ transcripts in a distinctive way • We searched for un-annotated small RNA sequences with miRNA-like alignments to pine ESTs • Folding and validation of these ESTs revealed 19 putative novel pine miRNA families (figure labels: miRNA* sequences, miRNA sequences)

  14. Putting clusters to use, part 2: miRNA-like clusters (method 2) • Clusters of known miRNA families had many similarities which might allow their separation from non-miRNA clusters: high information content, mean sequence length, low variance in sequence length, few indels and relatively few genomic loci • A support vector machine (SVM) classifier was trained on all known microRNA clusters • This classifier demonstrated high specificity (0.96) and modest sensitivity (0.65) • 54 clusters were predicted to be miRNA clusters (all pine-specific) and may represent novel pine microRNA families • 5 of these clusters are confirmed miRNAs based on EST folding
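Some of the cluster features named on this slide can be computed along the following lines. This is a sketch under assumptions: the cluster is taken to be already aligned into equal-length strings with "-" for gaps, information content is the standard 2 minus Shannon entropy (in bits) per column with no small-sample correction, and the SVM training itself is not shown. The last function simply restates the usual definitions of sensitivity and specificity.

```python
import math
from collections import Counter
from statistics import mean, pvariance


def column_information(column):
    """Information content in bits of one alignment column (max 2 for A/C/G/T)."""
    counts = Counter(base for base in column if base != "-")
    total = sum(counts.values())
    if total == 0:
        return 0.0
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return 2.0 - entropy


def cluster_features(aligned):
    """Feature dict for one cluster of equal-length, gap-padded sequences."""
    lengths = [len(seq.replace("-", "")) for seq in aligned]
    return {
        "mean_information": mean(column_information(col) for col in zip(*aligned)),
        "mean_length": mean(lengths),
        "length_variance": pvariance(lengths),
    }


def sensitivity_specificity(tp, fp, tn, fn):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)
```

Feature vectors of this form could then be passed to any SVM implementation; which features and kernel the original classifier used is not stated beyond the list on the slide.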

  15. Putting clusters to use, part 3: miRNA/miRNA* pairs (method 3) • Maturation of a miRNA often leads to two small RNA molecules: the miRNA and its semi-complementary partner (miRNA*) • Many of the clusters with small RNAs that align in forward/reverse pairs contained known microRNAs • Some of these clusters contained sequences that aligned to ESTs, which could be folded and checked for good pre-miRNA structures • This method revealed an additional 24 clusters of novel miRNA sequences
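One way to flag candidate miRNA/miRNA* pairs is to compare one sequence with the reverse complement of another and allow a few mismatches. The sketch below assumes ungapped comparison, ignores G:U wobble pairing, and uses illustrative thresholds (17 nt overlap, at most 4 mismatches) that are not taken from the slides.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def normalise(seq):
    """Upper-case and write U as T so RNA and DNA spellings compare equal."""
    return seq.upper().replace("U", "T")


def reverse_complement(seq):
    return normalise(seq).translate(COMPLEMENT)[::-1]


def fewest_mismatches(a, b, min_overlap):
    """Fewest mismatches over any ungapped overlap >= min_overlap, else None."""
    best = None
    for offset in range(-(len(b) - min_overlap), len(a) - min_overlap + 1):
        start_a, start_b = max(offset, 0), max(-offset, 0)
        if min(len(a) - start_a, len(b) - start_b) < min_overlap:
            continue
        mismatches = sum(x != y for x, y in zip(a[start_a:], b[start_b:]))
        if best is None or mismatches < best:
            best = mismatches
    return best


def looks_like_star_pair(mirna, candidate, min_overlap=17, max_mismatches=4):
    """True if candidate is nearly the reverse complement of mirna."""
    found = fewest_mismatches(normalise(mirna), reverse_complement(candidate),
                              min_overlap)
    return found is not None and found <= max_mismatches
```

Allowing the overlap to be shorter than the full sequence accommodates the offset between the two strands of the duplex.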

  16. Mean information content for clusters annotated as microRNAs

  17. Summary • Some of the 24 nt heterochromatin siRNAs (including repeat-derived siRNAs) are conserved between gymnosperms and angiosperms • Pine small RNA sequences are rich in conserved microRNAs • Various methods, including sequence clustering, RNA folding and machine learning, reveal many putative novel pine miRNA families

  18. Acknowledgements Thanks: Peter Unrau (SFU, MBB) Cenk Sahinalp & his lab (SFU, CS) Gozde Cozen (SFU, CS) Alex Ebhardt (SFU, MBB) Elena Dolgosheina (SFU, MBB) Matt Hickenbotham (WashU) Vincent Magrini (WashU) The VanBUG Dev team Other VanBUG sponsors: IBM, Genome BC

  19. Alignment of 3 ESTs from a novel pine miRNA family
