ChIP-chip Data
Histones are critical proteins that form the scaffold for DNA, organizing it into nucleosomes and regulating access. With their various modifications, such as acetylation and methylation, histones can influence gene expression by compacting or loosening DNA. Transcription factors, both general and specific, mediate transcription initiation and recruit RNA polymerase II to target genes. Techniques such as ChIP-chip, which combines chromatin immunoprecipitation with tiling microarrays, are essential for assessing protein-DNA interactions and understanding these regulatory mechanisms.
Presentation Transcript
DNA-binding proteins • Constitutive proteins (mostly histones) • Organize DNA • Regulate access to DNA • Have many modifications • Acetylation, methylation, … • Sporadic proteins (Transcription Factors) • Mediate docking of transcription apparatus • Modify histones • Methylate DNA
Histones • Histones are an ancient family of proteins that serve as the scaffold for DNA • Four types of histones assemble in pairs to form a nucleosome • DNA is wrapped nearly twice around each nucleosome
Histones and Modifications • Histone tails can be modified • DNA contacts histones on their tails • Histones can stay loose or assemble tightly; tight assembly compacts the DNA
Transcription Factors • General – help to set up transcription of many genes • Specific – draw in general factors or RNA Pol II to specific genes [Figure: TATA Binding Protein]
DNA Methylation • Adding a methyl group to cytosine • Cytosine methylation is passed on to daughter cells
Tiling Array • One probe every n base pairs over some length of chromosome • Interrupted by repeat regions • Promoter array: each (known) promoter tiled [Figure: An Affymetrix tiling design]
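The tiling layout described above can be sketched in a few lines. This is only an illustration: the function name `tile_probes`, the 25 bp probe length, and the repeat interval are assumptions chosen for the example, not details of any actual array design.

```python
def tile_probes(start, end, step, probe_len=25, repeats=()):
    """Return probe start positions spaced `step` bp apart in [start, end).

    `repeats` holds half-open (rep_start, rep_end) intervals; any probe that
    overlaps one of them is skipped, which is why a tiling is "interrupted".
    All parameter values here are hypothetical.
    """
    positions = []
    for pos in range(start, end - probe_len + 1, step):
        probe_end = pos + probe_len
        overlaps = any(pos < r_end and probe_end > r_start
                       for r_start, r_end in repeats)
        if not overlaps:
            positions.append(pos)
    return positions


# One probe every 50 bp over a 400 bp region with a masked repeat at 90-160.
probes = tile_probes(0, 400, step=50, probe_len=25, repeats=[(90, 160)])
print(probes)  # [0, 50, 200, 250, 300, 350]
```

The gap between positions 50 and 200 is the kind of interruption that later smoothing steps have to cope with, since neighbouring probes are no longer evenly spaced.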
What the data look like [Figure: histone acetylation on 15 samples over one promoter (raw)]
Methods and Issues • Normalization • Different enrichment ratios • Different probe thermodynamics • Dye and probe bias • Estimation • Categorical or continuous? • Individual values are noisy: • For TF binding: where is the peak?
Normalization • Basic idea: compensate for technical variables • Technique differences should affect different probes differently • Try to estimate what part of the signal can be attributed to technical factors • Easiest variable to access: sequence
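As a rough illustration of sequence-based normalization, the sketch below groups probes by GC count and subtracts each group's median log intensity. This is a deliberately simple stand-in, not the model used by any particular tool; the function names and the toy probe data are assumptions made for the example.

```python
from collections import defaultdict
from statistics import median


def gc_count(seq):
    """Number of G or C bases in a probe sequence."""
    return sum(base in "GC" for base in seq.upper())


def normalize_by_gc(probes):
    """probes: list of (sequence, log_intensity) pairs.

    Estimates the part of the signal explained by sequence as the median
    log intensity of all probes with the same GC count, then subtracts it.
    """
    by_gc = defaultdict(list)
    for seq, value in probes:
        by_gc[gc_count(seq)].append(value)
    baseline = {gc: median(vals) for gc, vals in by_gc.items()}
    return [value - baseline[gc_count(seq)] for seq, value in probes]


# Toy data (hypothetical): GC-rich probes read brighter for technical reasons.
probes = [("ATATAT", 5.0), ("ATTTAA", 5.4),
          ("GCGCGC", 8.0), ("GGCCGC", 8.6), ("CGCGGG", 9.0)]
adjusted = normalize_by_gc(probes)
print([round(a, 2) for a in adjusted])
```

After adjustment the AT-rich and GC-rich probes sit on a common scale, so remaining differences are more plausibly biological. A real analysis would need many probes per group for the medians to be stable.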
MAT • One-color Affymetrix array • Needs separate array for comparison • Normalizes probe thermodynamics & enrichment ratio • Estimation by (robust) moving average
Estimation • Try to build an intelligent moving average • Not all neighbors will be similar • A typical TF binds to about 8 bp • Pol II may spread wider • Typical fragment is 100-200 bp • Cannot resolve < 200 bp [Figure: Pol II binding on a 100 bp grid]
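One way to make a moving average more robust is to define the window by genomic distance rather than probe count and to use a trimmed mean inside it. The sketch below assumes a 100 bp probe grid, a 200 bp half-width, and 20% trimming; these values and the function names are illustrative choices, not recommendations from the slides.

```python
def trimmed_mean(values, trim=0.2):
    """Mean after dropping the lowest and highest `trim` fraction of values."""
    ordered = sorted(values)
    k = int(len(ordered) * trim)
    kept = ordered[k:len(ordered) - k] if len(ordered) > 2 * k else ordered
    return sum(kept) / len(kept)


def window_scores(positions, values, half_width=200, trim=0.2):
    """Score each probe by the trimmed mean of all probes within
    `half_width` bp of it (window defined in base pairs, not probe count)."""
    scores = []
    for center in positions:
        neighbours = [v for p, v in zip(positions, values)
                      if abs(p - center) <= half_width]
        scores.append(trimmed_mean(neighbours, trim))
    return scores


# Hypothetical 100 bp grid: one isolated spike at 200 bp, and a broad
# enriched region from 500 bp onward.
positions = list(range(0, 1000, 100))
values = [0.1, 0.0, 5.0, 0.2, 0.1, 2.0, 2.2, 1.9, 2.1, 2.0]
scores = window_scores(positions, values)
print([round(s, 2) for s in scores])
```

In the interior of the region, the single-probe spike is trimmed away while the broad signal survives, which matches the point that fragments of 100-200 bp should light up several neighbouring probes. Near the edges the window holds too few probes for trimming to take effect, so edge scores are less reliable.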
TileMap • Ignores normalization • ‘Shrinkage’ estimator of variance • Improves individual scores • Smooths noise by moving average
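The idea of a shrinkage estimator of variance followed by smoothing can be sketched as below. This is not TileMap's actual procedure: the fixed `weight` is a hypothetical stand-in for a shrinkage factor that would normally be estimated from the data, and the function names and toy replicates are assumptions for illustration.

```python
from statistics import mean, variance


def shrunken_t(replicates, weight=0.5):
    """replicates: one list of log ratios per probe (at least 2 values each).

    Each probe's sample variance is pulled toward the average variance of
    all probes before forming a t-like statistic. `weight` is a fixed,
    hypothetical shrinkage factor. Assumes the pooled variance is nonzero.
    """
    probe_vars = [variance(r) for r in replicates]
    pooled = mean(probe_vars)
    stats = []
    for r, v in zip(replicates, probe_vars):
        shrunk = (1 - weight) * v + weight * pooled
        stats.append(mean(r) / (shrunk / len(r)) ** 0.5)
    return stats


def moving_average(values, half_window=1):
    """Plain moving average over neighbouring probes, truncated at the ends."""
    out = []
    for i in range(len(values)):
        lo = max(0, i - half_window)
        hi = min(len(values), i + half_window + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


# Toy data: the second probe has zero sample variance, so an ordinary
# t-statistic would divide by zero; the shrunken version stays finite.
replicates = [[0.1, -0.1, 0.0], [1.0, 1.0, 1.0],
              [1.2, 0.8, 1.0], [0.0, 0.2, -0.2]]
stats = shrunken_t(replicates)
smoothed = moving_average(stats)
print([round(s, 2) for s in smoothed])
```

With only a few replicates per probe, individual variance estimates are unstable, so borrowing strength across probes improves the per-probe scores before the moving average smooths the remaining noise.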