This presentation introduces hiAnnotator, an R package for annotating genomic ranges, built around the idea of "Know Your Surroundings". The idea comes from retrovirus integration patterns: given genomic coordinates from some species, the goal is to get a sense of the genomic environment and any preferences it reveals. Presented by Nirav Malani of the Rick Bushman Lab, Department of Microbiology, University of Pennsylvania, the slides show how to build RangedData objects and run three annotation types (in/out of a feature, nearest feature, and feature counts), with optional parameters to customize each call.
LOST in the genome… find where you at, fool!
Nirav Malani
Rick Bushman Lab
Department of Microbiology
University of Pennsylvania
Basic Idea: “Know Your Surroundings”
• Where is the concept coming from?
  • Retrovirus integration pattern
• What are you trying to deduce?
  • Sense of genomic environment and/or preferences
• What kind of data are you analyzing?
  • Genomic coordinates from some species
hiAnnotator
• R package to annotate genomic ranges
• Fundamentals
  • Take two RangedData objects (query & subject)
  • Call a specific annotation type function
  • Define customization parameters (optional)
  • That’s it!
• Depends On: IRanges, doBy
Prepare the Objects
> head(sites)
> makeRangedData(sites, soloStart=TRUE)

Prepare the Objects
> head(genes)
> makeRangedData(genes)
Usage: makeRangedData(x, positionsOnly=FALSE, soloStart=FALSE, chromCol=NULL, strandCol=NULL, startCol=NULL, stopCol=NULL)
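To make the preparation step concrete, here is a minimal base-R sketch of the idea, not hiAnnotator's implementation. It assumes that `soloStart=TRUE` signals that each site has only a single coordinate, so the range is one base wide. The column names (`chr`, `position`, `strand`) and the helper `as_ranges` are hypothetical.

```r
# Hypothetical integration sites: one position per site, no stop column.
sites <- data.frame(
  chr      = c("chr1", "chr1", "chr2"),
  position = c(1500L, 98000L, 4200L),
  strand   = c("+", "-", "+"),
  stringsAsFactors = FALSE
)

# Assumed helper (not part of hiAnnotator): build start/end ranges.
# When no stop column is given, each range is one base wide (end == start).
as_ranges <- function(x, start_col = "position", stop_col = NULL) {
  out <- data.frame(chr = x$chr, start = x[[start_col]],
                    stringsAsFactors = FALSE)
  out$end <- if (is.null(stop_col)) out$start else x[[stop_col]]
  out$strand <- x$strand
  out
}

site_ranges <- as_ranges(sites)
```

Features such as genes already carry both a start and a stop, so they would pass through the same helper with `stop_col` set.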
Annotation Types: In/Out
Usage: getSitesInFeature(sites.rl, genes.rl, "InGene")

Annotation Types: In/Out
Usage: getSitesInFeature(sites.rl, genes.rl, "InGene", asBool=TRUE)
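The in/out annotation can be pictured with a small base-R sketch. It does not use hiAnnotator. It assumes that the default call reports which feature a site falls in and that `asBool` switches the result to TRUE/FALSE. All data and column names are made up.

```r
# Hypothetical data: three sites and two genes.
sites <- data.frame(chr = c("chr1", "chr1", "chr2"),
                    pos = c(1500L, 98000L, 4200L),
                    stringsAsFactors = FALSE)
genes <- data.frame(chr   = c("chr1", "chr2"),
                    start = c(1000L, 9000L),
                    end   = c(2000L, 12000L),
                    name  = c("geneA", "geneB"),
                    stringsAsFactors = FALSE)

# For each site, collect the names of genes on the same chromosome
# whose [start, end] interval contains the site position.
genes_at_site <- lapply(seq_len(nrow(sites)), function(i) {
  hit <- genes$chr == sites$chr[i] &
    genes$start <= sites$pos[i] & sites$pos[i] <= genes$end
  genes$name[hit]
})

# Boolean form: is the site inside any gene?
in_gene <- lengths(genes_at_site) > 0

# Name form: comma-separated gene names, NA when the site hits nothing.
in_gene_name <- vapply(genes_at_site, function(g) {
  if (length(g) == 0) NA_character_ else paste(g, collapse = ",")
}, character(1))
```

Only the first site lies inside a gene here; the other two fall in intergenic space on their chromosomes.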
Annotation Types: Nearest
Usage: getNearestFeature(sites.rl, genes.rl, "NearestGene")
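A nearest-feature lookup can be sketched in base R as follows. This is an illustration under simplifying assumptions, not the package's method: distance is unsigned and measured to the closer gene edge (0 inside a gene), whereas the package may define distance differently, for example relative to one gene boundary or signed by strand. The helper `nearest_gene` and all data are hypothetical.

```r
# Hypothetical data: three sites and three genes.
sites <- data.frame(chr = c("chr1", "chr1", "chr2"),
                    pos = c(1500L, 98000L, 4200L),
                    stringsAsFactors = FALSE)
genes <- data.frame(chr   = c("chr1", "chr1", "chr2"),
                    start = c(1000L, 50000L, 9000L),
                    end   = c(2000L, 60000L, 12000L),
                    name  = c("geneA", "geneC", "geneB"),
                    stringsAsFactors = FALSE)

# Assumed helper: closest gene on the same chromosome and its distance.
nearest_gene <- function(chr, pos, genes) {
  cand <- genes[genes$chr == chr, ]
  if (nrow(cand) == 0) {
    return(data.frame(name = NA_character_, dist = NA_integer_,
                      stringsAsFactors = FALSE))
  }
  # Distance is 0 inside the gene, otherwise the gap to the closer edge.
  d <- pmax(cand$start - pos, pos - cand$end, 0L)
  best <- which.min(d)
  data.frame(name = cand$name[best], dist = d[best],
             stringsAsFactors = FALSE)
}

hits <- lapply(seq_len(nrow(sites)), function(i) {
  nearest_gene(sites$chr[i], sites$pos[i], genes)
})
nearest <- do.call(rbind, hits)
```

Ties are broken by `which.min`, which keeps the first candidate; a real annotation tool would need an explicit rule for ties.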
Annotation Types: Feature Counts
Usage: getFeatureCounts(sites.rl, genes.rl, "NumOfGene", chromSizes=seqlengths(Hsapiens))
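One way to read the feature-count annotation is as a count of genes overlapping a window around each site, with chromosome sizes used to keep windows inside the chromosome. The base-R sketch below follows that reading as an assumption. The window half-width, the chromosome lengths, and the helper `count_genes` are all hypothetical.

```r
# Hypothetical data: three sites, three genes, made-up chromosome lengths.
sites <- data.frame(chr = c("chr1", "chr1", "chr2"),
                    pos = c(1500L, 98000L, 4200L),
                    stringsAsFactors = FALSE)
genes <- data.frame(chr   = c("chr1", "chr1", "chr2"),
                    start = c(1000L, 50000L, 9000L),
                    end   = c(2000L, 60000L, 12000L),
                    stringsAsFactors = FALSE)
chrom_sizes <- c(chr1 = 100000L, chr2 = 50000L)

# Assumed helper: number of genes overlapping a window around each site.
count_genes <- function(sites, genes, chrom_sizes, half_width) {
  vapply(seq_len(nrow(sites)), function(i) {
    # Window centred on the site, clipped to the chromosome bounds.
    w_start <- max(1L, sites$pos[i] - half_width)
    w_end   <- min(chrom_sizes[[sites$chr[i]]], sites$pos[i] + half_width)
    # A gene is counted when its interval overlaps the window.
    sum(genes$chr == sites$chr[i] &
          genes$start <= w_end & genes$end >= w_start)
  }, integer(1))
}

counts_10k <- count_genes(sites, genes, chrom_sizes, 10000L)
counts_50k <- count_genes(sites, genes, chrom_sizes, 50000L)
```

Running the count at more than one window size gives a rough picture of gene density at different scales around each site.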
In the Works
• Parallel backend support for all the functions
• Function for GC% annotation