Explore general RNA editing, mRNA editing, ADAR-mediated editing, and the identification process of ADAR edit sites with examples and experimental validation. Discover the impact on protein function, editing distribution in different tissues, and the stability effects on dsRNA.
Systematic identification of abundant A-to-I editing sites in the human transcriptome Levanon et al.
General RNA editing • End modifications (5’ cap and 3’ poly-A) • Splicing/Alternative splicing • Cutting • Pre-tRNA & pre-rRNA cut to yield 2 or more functional molecules • Common in all life
General RNA editing • Chemical modification • Chemical groups added to nucleotides • Nucleotides themselves modified • Common in tRNA & rRNA • Also common in all life • Happens (rarely?) in mRNA
mRNA editing • Modifications in coding region can alter sequence of expressed protein • Ex: apolipoprotein B • C → U edit changes codon CAA → UAA, causing an in-frame stop • Resulting proteins of different lengths (4563 aa and 2153 aa) are expressed in different tissues (liver and intestine)
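The codon change on this slide can be traced in a few lines. This is only an illustrative sketch: the helper `edit_c_to_u` and the two-entry codon table are hypothetical names made up for the example, not part of any library.

```python
# Minimal codon table covering only the two codons in the apolipoprotein B example.
CODON_TABLE = {"CAA": "Gln", "UAA": "Stop"}


def edit_c_to_u(codon: str, position: int) -> str:
    """Return the codon with the C at `position` (0-based) replaced by U."""
    if codon[position] != "C":
        raise ValueError("no C at the requested position")
    return codon[:position] + "U" + codon[position + 1:]


unedited = "CAA"
edited = edit_c_to_u(unedited, 0)
print(f"{unedited} ({CODON_TABLE[unedited]}) -> {edited} ({CODON_TABLE[edited]})")
```

A single-base change in the first codon position is enough to turn a glutamine codon into a stop codon, which is why the edited transcript yields the shorter protein.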
mRNA editing • Ex: GluR-B gene undergoes 2 edits • Glutamine → Arginine • Arginine → Glycine • Modifications are at protein’s active site (an ion channel) • Alters protein function
ADAR mediated mRNA editing • Paper focuses on ADAR mediated editing • Adenosine deaminases that act on RNA • ADAR is the enzyme which catalyzes editing • Converts an A to I by removing amine group • I is chemically similar to G
ADAR mediated mRNA editing • Substrate of ADAR is dsRNA • Requires 2 nearby inverted repeats to form stem structure
ADAR mediated mRNA editing • ADAR necessary for normal development • Knockout mice without ADAR1 die before birth • Mice without ADAR2 survive to birth but die soon after • ADAR deficient invertebrates show behavioral defects
Identification of ADAR edit sites • Naive method • Align transcribed sequence to genome • Call A-G mismatches as edit sites • Problems • High error rate in sequencing transcripts (~3%) • SNPs get classified as edit sites
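The naive method can be sketched as follows, assuming the transcript has already been aligned to the genome on the same strand with no gaps (a strong simplification). The function name `call_a_to_g_mismatches` and the toy sequences are hypothetical.

```python
def call_a_to_g_mismatches(genome: str, transcript: str) -> list[int]:
    """Return 0-based positions where the genome has A and the transcript has G.

    Assumes the two strings are an ungapped, same-strand alignment of equal length.
    """
    if len(genome) != len(transcript):
        raise ValueError("sequences must be aligned to the same length")
    return [
        i
        for i, (g, t) in enumerate(zip(genome, transcript))
        if g == "A" and t == "G"
    ]


genome = "ACGTAAGTCA"
transcript = "ACGTGAGTCG"
sites = call_a_to_g_mismatches(genome, transcript)
print(sites)
```

Nothing in this caller distinguishes a real edit from a sequencing error or a SNP, which is exactly the weakness the slide lists.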
Identification of ADAR edit sites • Solution presented in paper • Use knowledge of ADAR mechanism to filter out errors and SNPs • Start with human EST/cDNA sequence database (GenBank) • Align spliced sequence to genome to get genomic locus • Align expressed sequence (exons) to genomic locus
Identification of ADAR edit sites • Keep reverse complement alignments • >32 bp in length • >85% identity
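The two thresholds on this slide can be illustrated with a small filter. This is a sketch under stated assumptions: it compares ungapped, equal-length segments, whereas a real pipeline would use a gapped alignment tool. All function names and the toy sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def percent_identity(a: str, b: str) -> float:
    """Percentage of matching positions between two equal-length, non-empty strings."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def is_candidate_dsrna(exon_segment: str, locus_segment: str,
                       min_len: int = 32, min_identity: float = 85.0) -> bool:
    """True if the exon's reverse complement matches the locus segment
    over more than `min_len` bp at more than `min_identity` percent identity."""
    rc = reverse_complement(exon_segment)
    return len(rc) > min_len and percent_identity(rc, locus_segment) > min_identity


exon = "AACGTTGCAG" * 4                      # 40 bp toy exon segment
perfect_partner = reverse_complement(exon)   # 100% identity in reverse orientation
print(is_candidate_dsrna(exon, perfect_partner))
```

A segment that passes both checks could pair with the exon to form the double-stranded stem that ADAR requires as a substrate.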
Identification of ADAR edit sites • Results in 429,000 putative dsRNA regions in 14,512 genes • Filter out candidates derived from low quality transcribed sequences • Filter out all known SNPs • Mismatches in remaining dsRNA regions called as ADAR edit sites
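The filtering steps above might look roughly like this. The `Mismatch` record, its fields, and the toy data are hypothetical and only meant to show the order of the filters; every mismatch is assumed to lie inside a putative dsRNA region already.

```python
from typing import NamedTuple


class Mismatch(NamedTuple):
    chrom: str
    pos: int               # genomic coordinate of the mismatch
    genome_base: str
    transcript_base: str
    low_quality: bool      # True if it comes from a low-quality transcribed sequence


def call_edit_sites(mismatches, known_snps):
    """Drop mismatches from low-quality sequences and at known SNP positions."""
    return [
        m for m in mismatches
        if not m.low_quality and (m.chrom, m.pos) not in known_snps
    ]


def fraction_a_to_g(sites):
    """Fraction of called sites where the genome has A and the transcript has G."""
    if not sites:
        return 0.0
    return sum(s.genome_base == "A" and s.transcript_base == "G" for s in sites) / len(sites)


# Toy data, invented for illustration only.
mismatches = [
    Mismatch("chr1", 100, "A", "G", False),
    Mismatch("chr1", 150, "A", "G", True),    # removed: low-quality sequence
    Mismatch("chr1", 200, "A", "G", False),   # removed: known SNP
    Mismatch("chr2", 300, "C", "T", False),
    Mismatch("chr2", 350, "A", "G", False),
]
known_snps = {("chr1", 200)}

edit_sites = call_edit_sites(mismatches, known_snps)
print(len(edit_sites), fraction_a_to_g(edit_sites))
```

The caller keeps every mismatch type rather than only A-to-G, so the share of A-to-G among the survivors serves as a sanity check on the filters.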
Identification of ADAR edit sites • Get 12,723 putative edit sites in 1,673 genes • Over 80% of mismatches are A-G
Sensitivity/Specificity • Parameters set to minimize false positive rate • Several well known editing examples not picked up • May be lots more sites than found in this paper
Experimental validation • Pick 30 novel predicted edit sites and test • Sequence transcripts from 5 tissues separately and pooled • Detected editing at 26/30 sites
Characterization of editing sites • 92% of edits in Alu repeats • 12% of all editing events in positions 27 & 28 of Alu repeats • 1.3% of edits in L1 repeats • Most of the time (83%) only one expressed sequence shows editing • Editing is not deterministic
mRNA editing in the brain • Previous work suggested most pre-mRNA editing in brain is in non-coding regions • In this study • 12% in 5’ UTR • 54% in 3’ UTR • 33% in introns
dsRNA stability • Edits can stabilize or destabilize dsRNA • Destabilize (78%) • A-U → I-U • Stabilize (19%) • A-C → I-C • Neutral (3%) • A-A → I-A • A-G → I-G
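The classification on this slide depends only on the base opposite the edited adenosine, so it can be written as a lookup. The table and function names below are hypothetical; the categories are the ones given on the slide.

```python
# Effect of an A-to-I edit on the stem, keyed by the base paired opposite the A.
EFFECT_BY_PARTNER = {
    "U": "destabilize",  # A-U -> I-U
    "C": "stabilize",    # A-C -> I-C
    "A": "neutral",      # A-A -> I-A
    "G": "neutral",      # A-G -> I-G
}


def stability_effect(partner_base: str) -> str:
    """Return the slide's stability category for an edit opposite `partner_base`."""
    return EFFECT_BY_PARTNER[partner_base.upper()]


for partner in "UCAG":
    print(f"A-{partner} -> I-{partner}: {stability_effect(partner)}")
```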
dsRNA stability • Mechanism seems to prefer stabilization over destabilization • 22% (19 + 3) of events targeted a mismatched base pairing • Frequency of mismatched base pairs at nearby sites was only 10%
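The comparison behind this slide is simple arithmetic, shown here with the percentages from the slides. The variable names are made up for the example.

```python
stabilizing_pct = 19            # A-C -> I-C edits
neutral_pct = 3                 # A-A and A-G edits
background_mismatch_pct = 10    # mismatched base pairs at nearby sites

# Edits that hit an already mismatched base pair.
edited_at_mismatch_pct = stabilizing_pct + neutral_pct
fold_enrichment = edited_at_mismatch_pct / background_mismatch_pct

print(f"{edited_at_mismatch_pct}% vs {background_mismatch_pct}% "
      f"({fold_enrichment:.1f}-fold)")
```

Edited adenosines sit at mismatched positions about twice as often as nearby sites do, which is the basis for the claim that the mechanism favors stabilization.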
dsRNA stability • Why does ADAR want to make stable stem-loops in pre-mRNA? • May have something to do with regulation of RNAi • Might not want to limit search to expressed sequences but look at whole transcripts