
Practical applications of HMMs: ChromHMM

Practical applications of HMMs: ChromHMM. Sushmita Roy, Nov 5th. Chromatin organization and gene expression (http://www.youtube.com/watch?v=eYrQ0EhVCYA). ChIP-seq to measure histone data. Adapted from Dewey lecture and Peter Park Nature Genetics Review.



Presentation Transcript


  1. Practical applications of HMMs: ChromHMM Sushmita Roy Nov 5th

  2. Chromatin organization and gene expression http://www.youtube.com/watch?v=eYrQ0EhVCYA

  3. ChIP-seq to measure histone data Adapted from Dewey lecture and Peter Park Nature Genetics Review

  4. ChIP-seq data for multiple marks • Chromatin state: a specific combination of mark values • Important because it can be used to segment the genome into biologically meaningful units

  5. Problem definition • Given: a collection of genome-wide measurements of chromatin marks • Do: segment the genome into N chromatin states

  6. An HMM for segmenting genomes using chromatin marks • HMM state: chromatin state • Emission: multiple chromatin marks • Need a multivariate HMM

  7. Binarizing the chromatin data • Each mark m at position t is represented by a binary variable v_{t,m}: 1 if the mark is present, 0 if the mark is absent [Figure: observed marks aligned to the genomic sequence at positions t, t+1, t+2, t+3]
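
The binarization step could be sketched as follows. This is a minimal illustration, not the rule used in the lecture or in ChromHMM itself: the slide does not say how presence is decided, so the fixed `threshold`, the function name `binarize_marks`, and the toy read counts are all assumptions.

```python
import numpy as np

def binarize_marks(counts, threshold):
    """Turn a (T, M) array of per-position signal for M marks into 0/1 values.

    v[t, m] = 1 means mark m is called present at position t.
    The fixed threshold is an illustrative assumption.
    """
    counts = np.asarray(counts)
    return (counts >= threshold).astype(np.uint8)

# Hypothetical read counts: 4 positions (rows) x 3 marks (columns).
counts = np.array([[0, 12, 3],
                   [7, 1, 9],
                   [2, 2, 15],
                   [10, 0, 0]])
v = binarize_marks(counts, threshold=5)
print(v)
```

Each row of `v` is then the observed binary vector for one position along the genome.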

  8. ChromHMM with 3 states [Figure: state diagram with a Begin node and states 1, 2, 3]

  9. ChromHMM notation • p_{k,m} denotes the probability of mark m being ON in state k • The emission probability of the M marks in a state is a product of M Bernoulli probabilities, one per mark • b_{i,j} denotes the probability of transitioning from state i to state j • a_k: initial probability of state k
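
As a small worked example of the product-of-Bernoullis emission, the sketch below computes, for every state k, the product over marks of p[k, m] when the mark is on and 1 - p[k, m] when it is off. The function name `emission_probs` and the toy parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def emission_probs(v, p):
    """Probability of one binary mark vector under each state.

    v: (M,) array of 0/1 mark values at one position.
    p: (K, M) array, p[k, m] = probability that mark m is ON in state k.
    Returns a (K,) array: prod_m p[k, m]**v[m] * (1 - p[k, m])**(1 - v[m]).
    """
    v = np.asarray(v)
    p = np.asarray(p, dtype=float)
    # Pick p where the mark is on and (1 - p) where it is off, then multiply.
    return np.prod(np.where(v == 1, p, 1.0 - p), axis=1)

# Hypothetical parameters: 2 states, 2 marks.
p = np.array([[0.9, 0.1],
              [0.2, 0.8]])
e = emission_probs([1, 0], p)
print(e)  # state 0: 0.9 * 0.9, state 1: 0.2 * 0.2
```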

  10. Learning the ChromHMM • Need to figure out the number of states • Learn HMMs for K=2 to 80 states, with a penalty factor that penalizes the number of parameters • State transitions: start with the fully connected HMM, and set transition parameters to zero if they fall below 10^-10 • Final model had 51 states
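
The transition-pruning rule on this slide might look like the sketch below. The slide only says that parameters below 10^-10 are set to zero; renormalizing each row afterwards so that it still sums to 1 is an assumption added here, and `prune_transitions` is a hypothetical name.

```python
import numpy as np

def prune_transitions(b, eps=1e-10):
    """Zero out transition probabilities below eps.

    b: (K, K) row-stochastic matrix, b[i, j] = P(next state j | current state i).
    Rows are renormalized afterwards (an assumption, not stated on the slide).
    With K <= 80 the largest entry of a row is at least 1/K, so no row can
    become all zeros at this threshold.
    """
    b = np.array(b, dtype=float)  # copy so the caller's matrix is untouched
    b[b < eps] = 0.0
    return b / b.sum(axis=1, keepdims=True)

# Hypothetical 2-state transition matrix with one negligible entry.
b = np.array([[1.0 - 1e-12, 1e-12],
              [0.3, 0.7]])
pruned = prune_transitions(b)
print(pruned)
```

Entries that reach zero stay at zero in later EM updates, which is how a fully connected starting model can end up sparse.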

  11. Learned emission parameters [Figure: emission parameters for each state; callout: emission parameters for state 5; axis: States]

  12. Example output around the CAPZA2 gene from ChromHMM [Figure panels: input chromatin marks; inferred state sequences]

  13. Posterior probability distributions of all 51 states around the CAPZA2 gene [Figure panels: max posterior state; posterior probability values of each state]
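
Per-position posterior probabilities and the max posterior state can be obtained with the standard forward-backward algorithm. The sketch below is a generic scaled forward-backward for an HMM with independent Bernoulli emissions, not ChromHMM's own code; the function name `posterior_states` and the toy two-state parameters are assumptions.

```python
import numpy as np

def posterior_states(V, a, b, p):
    """Posterior P(state k at position t | all observations).

    V: (T, M) binary marks; a: (K,) initial probabilities;
    b: (K, K) transition matrix; p: (K, M) Bernoulli emission parameters.
    Returns (gamma, best): gamma is (T, K), best[t] is the max posterior state.
    """
    V = np.asarray(V)
    a, b, p = (np.asarray(x, dtype=float) for x in (a, b, p))
    T, K = V.shape[0], a.shape[0]
    # E[t, k] = product over marks of p or (1 - p), depending on V[t, m].
    E = np.prod(np.where(V[:, None, :] == 1, p[None], 1.0 - p[None]), axis=2)

    alpha = np.zeros((T, K))
    beta = np.ones((T, K))
    scale = np.zeros(T)
    # Forward pass, rescaling each step to avoid numerical underflow.
    alpha[0] = a * E[0]
    scale[0] = alpha[0].sum()
    alpha[0] /= scale[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ b) * E[t]
        scale[t] = alpha[t].sum()
        alpha[t] /= scale[t]
    # Backward pass, reusing the forward scaling factors.
    for t in range(T - 2, -1, -1):
        beta[t] = (b @ (E[t + 1] * beta[t + 1])) / scale[t + 1]

    gamma = alpha * beta
    gamma /= gamma.sum(axis=1, keepdims=True)
    return gamma, gamma.argmax(axis=1)

# Hypothetical 2-state, 2-mark model and a short observation sequence.
V = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])
a = np.array([0.5, 0.5])
b = np.array([[0.9, 0.1], [0.1, 0.9]])
p = np.array([[0.9, 0.1], [0.1, 0.9]])
gamma, best = posterior_states(V, a, b, p)
print(best)
```

Each row of `gamma` corresponds to one position in the posterior panel, and `best` corresponds to the max posterior state track.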

  14. Summary • HMMs are powerful models for capturing sequential data • Very popular in computational biology: gene annotation; representation of a profile (protein domain finding); genome segmentation
