
RNA Folding


Presentation Transcript


  1. RNA Folding

  2. RNA Folding Algorithms • Intuitively: given a sequence, find the structure with the maximal number of base pairs • For nested structures, there are four possibilities for S(i,...,j): • i,j are paired with each other, adding one pair to S(i+1,...,j-1) • i is unpaired, added to S(i+1,...,j) • j is unpaired, added to S(i,...,j-1) • i,j are paired but not to each other: combine S(i,...,k) and S(k+1,...,j) for some k between i and j

  3. RNA Folding by DP • Fill in a matrix of S(i,...,j) for all 0 ≤ i ≤ j ≤ seq_length; the answer is S(0,...,seq_length)
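A minimal sketch of this dynamic program, assuming a Nussinov-style base-pair maximization. The pairing rule (Watson-Crick plus G-U wobble), the `min_loop` parameter, and the function names are illustrative assumptions and do not come from the slides.

```python
# Assumed pairing rule: Watson-Crick pairs plus G-U wobble pairs.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(seq, min_loop=0):
    """Maximal number of nested base pairs in seq.

    min_loop is a hypothetical extra constraint: i and j may pair only if
    j - i > min_loop. The default 0 matches the plain recurrence.
    """
    n = len(seq)
    if n == 0:
        return 0
    # S[i][j] = best score for the subsequence seq[i..j], inclusive.
    S = [[0] * n for _ in range(n)]
    for span in range(1, n):              # fill short spans before long ones
        for i in range(n - span):
            j = i + span
            best = max(S[i + 1][j],       # i is unpaired
                       S[i][j - 1])       # j is unpaired
            if (seq[i], seq[j]) in PAIRS and j - i > min_loop:
                inner = S[i + 1][j - 1] if i + 1 <= j - 1 else 0
                best = max(best, inner + 1)   # i and j pair with each other
            for k in range(i + 1, j):     # i and j pair, but not to each other
                best = max(best, S[i][k] + S[k + 1][j])
            S[i][j] = best
    return S[0][n - 1]


print(max_base_pairs("GGGAAACCC", min_loop=3))
```

The cubic loop over k is what makes this recurrence O(n^3) in time; recovering the structure itself would need an additional traceback through S, which is omitted here.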

  4. RNA Folding Assumptions • RNA folding algorithms typically detect only nested structures and do not recognize pseudoknots • Some folding algorithms identify pseudoknots, but they are typically inefficient or limited (e.g., they cannot use stacking-dependent pairing models) • Current algorithms get about 50-70% of the base pairs correct, on average

  5. MicroRNA Identification

  6. MicroRNAs: Introduction • miRNAs are genomically encoded small RNAs • Processed into single-stranded 21-23-mers • Incorporated into an RNP complex (miRISC) • miRISC binds to 3’UTRs, causing repression of translation and modest mRNA degradation • [Figure: miRISC with Ago1] • Bartel, Cell 116, 2004

  7. MicroRNA Transcription • miRNA genes can be in intergenic and intronic regions • miRNA genes can be clustered and co-expressed • Estimates: 60% singletons, 25% introns, 15% clusters

  8. MicroRNA Examples

  9. MicroRNA Gene Conservation • Some miRNAs are highly conserved (e.g. let-7) • Conservation must preserve a dsRNA hairpin from which the miRNA is processed by Dicer

  10. MicroRNA Gene Identification • MicroRNA Cloning: • Map cloned ~22nt small RNAs to the genome • Predict pre-miRNA secondary structures using mFold • Score pre-miRNAs based on known miRNA precursors • Computational Identification: • Identify conserved genomic segments • Predict pre-miRNA secondary structures using mFold • Score pre-miRNAs based on known miRNA precursors

  11. MicroRNA Gene Identification • More complex methods use additional features: MirScan, MirSeeker, …

  12. miRBase • ~4500 miRNAs in 41 eukaryotes • Examples: 474 human, 78 fly • Eight viruses express microRNAs

  13. miRBase

  14. MicroRNAs: Open Questions • Promoter • Transcriptional start site • Transcriptional termination • Transcriptional complex • Regulation of miRNA expression

  15. MicroRNA Targets: Mechanism & Identification

  16. Are All RNAs Regulated by miRNAs?

  17. The Target Prediction Problem • Existing algorithms focus on the quality of the sequence match between miRNA and mRNA target, and introduce various filters, e.g. evolutionary conservation • Target sites show imperfect sequence complementarity: • Strong match in 5’ region (‘seed’) • Varying complementarity on 3’ end • Computational target predictions: • Sensitive to exact pairing rules • ~100 targets per miRNA within fly transcriptome • ~25% of transcriptome under miRNA regulation • [Figure: miRNA (3’ to 5’) paired with mRNA (5’ to 3’), seed positions numbered from the miRNA 5’ end; Brennecke et al. 05]

  18. miRanda • Target prediction: sequence-based rules • miRNA-target complementarity (strong in 5’, weaker in 3’) • Refinement with binding free energy scores • Use conservation to increase the signal-to-noise ratio

  19. PicTAR: Combinatorial Targets • Perfect nucleus vs. imperfect nucleus • Filter: binding energy over 33% of the binding energy of the mature miRNA to a perfectly complementary site • [Figure: miRNA paired with mRNA]

  20. PicTAR: Combinatorial Targets • Anchor

  21. PicTAR: Combinatorial Targets

  22. PicTAR: Combinatorial Targets • [Figure: hidden Markov model generating the mRNA; hidden states are the background b and miRNAs 1…m, with prior (transition) probabilities p0, p1, …, pm and emission probabilities over binding-site sequences] • Independence of binding sites (no overlapping) • Transition does not depend on current state (memoryless) • Competition between background and miRNA
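The generative model on this slide can be sketched as a forward-style sum over every way of segmenting the mRNA into background bases and non-overlapping binding sites. This is a simplified illustration, not PicTAR's actual implementation. The function name, the `priors` and `sites` dictionaries, the exact-match emission for each miRNA state, and the uniform background are all assumptions made for the example.

```python
def sequence_likelihood(seq, sites, priors, background=None):
    """Probability of seq under a memoryless background/miRNA mixture.

    sites:  hypothetical map from miRNA name to the single site sequence its
            state emits (exact-match emission is an assumption).
    priors: transition probabilities, key "bg" for background plus one key
            per miRNA; they do not depend on the previous state.
    """
    if background is None:
        background = {b: 0.25 for b in "ACGU"}   # assumed uniform background
    n = len(seq)
    # F[i] = probability of generating the first i bases of seq.
    F = [0.0] * (n + 1)
    F[0] = 1.0
    for i in range(1, n + 1):
        # Background state emits one base.
        total = priors["bg"] * background[seq[i - 1]] * F[i - 1]
        # A miRNA state emits a whole site, so sites cannot overlap.
        for name, site in sites.items():
            length = len(site)
            if i >= length and seq[i - length:i] == site:
                total += priors[name] * F[i - length]
        F[i] = total
    return F[n]


sites = {"miR-a": "ACUGUAC", "miR-b": "GGCAUUAC"}   # illustrative sites
priors = {"bg": 0.8, "miR-a": 0.1, "miR-b": 0.1}
print(sequence_likelihood("GGACUGUACAA", sites, priors))
```

Because background and miRNA states compete for the same bases, a site contributes to the likelihood only to the extent that its prior outweighs explaining those bases as background.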

  23. Accessibility: The Missing Component • What about target accessibility? • [Figure label: miRISC]

  24. Experimental Method • Drosophila tissue culture cells (S2) • No miRNA overexpression: • establish miRNA expression profile • use endogenous miRNA (50-500 copies per cell): bantam, miR-2 family, miR-184 • Dual luciferase reporter assay: • Renilla experiment, firefly as internal control • mild overexpression of target sequence (<10-fold) • no target degradation (20 h transfection) • sensitive, quantitative, linear assay • UTR engineering: • mutate target site sequence • mutate sequence surrounding the target site to alter mRNA secondary structure • [Figure: reporter constructs with 3’UTR, Renilla and firefly]

  25. The Role of Secondary Structure • N: ~200 bp fragment, native structure • C: ~200 bp fragment, closed structure • [Figure: normalized luciferase ratio (0.0 to 0.4) for rpr (miR-2) constructs 3’UTR, N, C, C3, C3+, C5, C5+; the target site 5’ end is marked; inset shows the target:miRNA duplex]

  26. Target Accessibility Matters • [Figure: normalized luciferase ratio (0.0 to 0.4) for constructs 3’UTR, N, C, C3, C3+, C5, C5+ in three panels: rpr (miR-2), hid (bantam), grim (miR-2); each panel shows its target:miRNA duplex with the target site 5’ end marked]

  27. Accessibility as Important as Sequence • [Figure: normalized luciferase ratio for rpr (miR-2) constructs 3’UTR, N, C, C3, C3+, C5, C5+ alongside target site mutations M2, M3, M6, I5, D5, D5+3; seed positions are numbered 1-8 from the target site 5’ end; inset shows the target:miRNA duplex]

  28. Thermodynamic miRNA::RNA Model

  29. Thermodynamic miRNA::RNA Model • ∆G5 = -15.1 • ∆G3 = -10.2 • ∆G = -25.3 • [Figure: miRNA duplex on the UTR]

  30. Thermodynamic miRNA::RNA Model • Folding area = target + 70 bp • ∆G0 = -28.3 • ∆G1 = -19.5 • ∆Gopen = ∆G0 - ∆G1 • [Figure: mRNA with CDS, UTR and Poly(A)]
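As a worked illustration of these quantities, the sketch below plugs in the numbers shown on the two model slides. Two things are assumptions: that the values on both slides describe the same target site, and that ddG is defined as the duplex energy minus the opening energy, since the slides use ddG without spelling out its definition. The function names are hypothetical.

```python
def delta_g_open(dg0, dg1):
    """Opening energy as written on the slide: dG_open = dG0 - dG1."""
    return dg0 - dg1


def delta_delta_g(dg_duplex, dg_open):
    """Assumed definition (not given on the slides): ddG = dG_duplex - dG_open."""
    return dg_duplex - dg_open


dg_open = delta_g_open(-28.3, -19.5)     # values from the folding-area slide
ddg = delta_delta_g(-25.3, dg_open)      # duplex energy from the previous slide
print(round(dg_open, 1), round(ddg, 1))
```

Under these assumptions ∆Gopen comes out at about -8.8 and ddG at about -16.5, so the cost of opening the target site offsets roughly a third of the duplex energy.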

  31. ddG Predicts Measured Repression • 22 constructs altering accessibility of target sites in rpr, hid, grim • [Figure: normalized luciferase ratio for rpr, hid and grim constructs plotted against ddG (r=0.7, p<4x10^-4) and against DGduplex (r=0.36, p<0.11)] • Exploring flank size: ddG with a flank of 17 bp upstream and 13 bp downstream gives r=0.77, p<3x10^-5 • [Figure: correlation r as a function of upstream (bp) and downstream (bp) flank size]

  32. Native Target Analysis • 12 miR-184 targets with weaker 3’ pairing, tested in different backgrounds to alter secondary structure • Non-redundant set of 190 experimentally tested miRNA:mRNA target pairs in Drosophila • [Figure: panels "miR-184 targets" and "190 validated targets"; measured repression differential vs. ddG differential, r=0.87]

  33. Genome-Wide Target Analysis • miRNA target seeds favor highly accessible regions of the genome • [Figure: overrepresentation vs. random as a function of accessibility (DGopen), one panel for fly and one for human]

  34. Assignment • Download the set of human microRNAs • Download the set of human UTRs • Download the mFold software • For each microRNA, identify the set of targets on each UTR, defined by a perfect match to the microRNA seed, bases 2-8 • Partition the targets of each microRNA into conserved and non-conserved targets (define a conservation cutoff) • Compare the RNA-accessibility of conserved and non-conserved targets for each microRNA • For each putative target, extract the 100 bases that surround it • Use mFold to compute the free energy of these 100 bases • Create a dot-plot with points being microRNAs, and axes being the median (plot #1) or mean (plot #2) free energy of all conserved (x-axis) or non-conserved (y-axis) targets of the microRNA
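A possible starting point for the sequence-handling steps of the assignment, written as a hedged sketch. It assumes sequences use the RNA alphabet (A, C, G, U), reads "the 100 bases that surround it" as a 100-base window centred on the site, and leaves the mFold call and the conservation cutoff out entirely. All function names are hypothetical.

```python
from statistics import mean, median

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_site(mirna):
    """UTR sequence that pairs perfectly with miRNA bases 2-8 (1-based)."""
    seed = mirna[1:8]
    return "".join(COMPLEMENT[base] for base in reversed(seed))


def find_seed_matches(mirna, utr):
    """Start positions (0-based) of every perfect seed match in the UTR."""
    site = seed_site(mirna)
    hits = []
    start = utr.find(site)
    while start != -1:
        hits.append(start)
        start = utr.find(site, start + 1)
    return hits


def surrounding_window(utr, start, site_len=7, window=100):
    """Up to `window` bases roughly centred on the site, clipped at UTR ends."""
    flank = (window - site_len) // 2
    lo = max(0, start - flank)
    hi = min(len(utr), lo + window)
    lo = max(0, hi - window)
    return utr[lo:hi]


def summarise(energies):
    """(median, mean) free energy for one miRNA's targets, for the two plots."""
    return median(energies), mean(energies)


let7a = "UGAGGUAGUAGGUUGUAUAGUU"
utr = "AAAA" + seed_site(let7a) + "GGGG" + seed_site(let7a)
print(seed_site(let7a), find_seed_matches(let7a, utr))
```

Each window returned by `surrounding_window` would then be folded with mFold, and `summarise` applied separately to the conserved and non-conserved targets of every microRNA to obtain the x and y coordinates of the dot-plot.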
