
Advanced Modeling of 5' UTR Elements: Insights from New Gene Data

This presentation describes the development and evaluation of models for 5' untranslated regions (UTRs) using expanded data from DBTSS. With a training and testing set of about 6,400 genes, the data set is large enough to move to directly estimated length distributions for several UTR states (Ep, Epa, Ea, and Enc). The hexamer distribution of the initial coding exon depends more on whether the UTR is spliced than on whether the gene is associated with a CpG island, which suggests dividing the Einit state into spliced and unspliced states for more precise modeling. Future work includes testing the split model and adding conservation sequence models and a CpG island model.



Presentation Transcript


  1. UTR Modeling, Part III. Sam Gross, Randy Brown

  2. Old 5' UTR Model [State diagram. States shown: Inter, Prom, Utr5, Einit, Esngl. Labels: ATG, Coding Exon.]

  3. New 5' UTR Model [State diagram. States shown: Inter, Prom, Ep, Epa, Ea, Enc, Inc, Einit, Esngl. Labels: ATG, Coding Exon.]

  4. New Data From DBTSS 3.0
     • Extended RefSeq 5' UTR information by mapping full-length cDNAs from DBTSS using est2genome. The old training/testing set had 1,500 genes; the new set has 6,400 genes.
     • The data set is now large enough to move to directly estimated length distributions for the Ep, Epa, Ea, and Enc states.
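A directly estimated length distribution can be sketched as a smoothed histogram of observed exon lengths. This is only an illustration: the slides do not say how the distributions were built or smoothed, so the function name, the additive pseudocount, the `max_len` cutoff, and the toy lengths are all assumptions.

```python
from collections import Counter


def empirical_length_distribution(lengths, max_len, pseudocount=1.0):
    """Estimate P(length = d) for d in 1..max_len from observed lengths.

    Additive smoothing (an assumption, not taken from the slides) keeps
    unobserved lengths from getting probability zero.
    """
    # Count only lengths that fall inside the modeled range.
    counts = Counter(d for d in lengths if 1 <= d <= max_len)
    total = sum(counts.values()) + pseudocount * max_len
    return {d: (counts[d] + pseudocount) / total for d in range(1, max_len + 1)}


# Toy lengths standing in for, e.g., observed Ep exon lengths (hypothetical data).
toy_lengths = [12, 15, 15, 20, 20, 20, 31]
dist = empirical_length_distribution(toy_lengths, max_len=40)
```

With only a handful of observations per length, most of the probability mass comes from the pseudocount, which is one reason a larger training set matters for this approach.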

  5. Separate UTR “coding” models for Ep/Epa and Ea/Enc
     • Tried UTR “coding” models of 3rd, 4th, and 5th order, with and without division by isochore. The best model was 4th order with isochore division.
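As a rough illustration of what a 4th-order model with isochore division could look like, the sketch below trains one k-th order Markov model per GC-content bin and scores a sequence by log-likelihood. The function names, the pseudocount, the uniform fallback for unseen contexts, and the GC boundaries in `isochore_bin` are hypothetical; the slides do not give the actual parameters.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"


def train_markov(seqs, order=4, pseudocount=1.0):
    """Estimate P(base | previous `order` bases) with additive smoothing."""
    counts = defaultdict(lambda: dict.fromkeys(ALPHABET, 0.0))
    for s in seqs:
        for i in range(order, len(s)):
            ctx, base = s[i - order:i], s[i]
            # Skip windows containing ambiguous bases such as N.
            if base in ALPHABET and all(c in ALPHABET for c in ctx):
                counts[ctx][base] += 1
    model = {}
    for ctx, row in counts.items():
        total = sum(row.values()) + pseudocount * len(ALPHABET)
        model[ctx] = {b: (row[b] + pseudocount) / total for b in ALPHABET}
    return model


def log_likelihood(seq, model, order=4):
    """Sum log P(base | context); unseen contexts fall back to uniform."""
    ll = 0.0
    for i in range(order, len(seq)):
        probs = model.get(seq[i - order:i])
        p = probs.get(seq[i], 0.25) if probs else 0.25
        ll += math.log(p)
    return ll


def isochore_bin(gc_fraction, boundaries=(0.43, 0.51, 0.57)):
    """Map a GC fraction to a bin index; the boundaries are hypothetical."""
    return sum(gc_fraction >= b for b in boundaries)


def train_by_isochore(seqs_with_gc, order=4):
    """Train one model per isochore bin from (sequence, regional GC) pairs."""
    grouped = defaultdict(list)
    for seq, gc in seqs_with_gc:
        grouped[isochore_bin(gc)].append(seq)
    return {b: train_markov(group, order) for b, group in grouped.items()}
```

A 4th-order model has 4^4 = 256 contexts per isochore bin, so splitting by isochore multiplies the number of parameters to estimate; this is a plausible reason the larger data set helps, although the slides do not say so.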

  6. Dual coding model (CpG-related/not CpG-related) for Ep, Epa, and Einit states was not very effective. Still working on modeling CpG islands.
     • The initial coding exon hexamer distribution depends more on whether the UTR is spliced or unspliced than on whether the gene is associated with a CpG island.
     • This suggests splitting the Einit state into two states, each with different coding parameters.

  7. EinitS/EinitU 5' UTR Model [State diagram. States shown: Inter, Prom, Ep, Epa, Ea, Enc, Inc, EinitS, EinitU, Esngl. Label: ATG.]

  8. Future Directions
     • Test performance of the EinitS/EinitU model
     • Try directly estimated length distributions for 5' UTR states
     • Conservation sequence models for 5' UTR states
     • CpG island model
