Discussion on Exact Match Algorithms: Hashing vs Binary Trees in Bioinformatics
This discussion covers algorithms for exact matching in bioinformatics, focusing on hashing and binary trees. It compares their costs on large genomic data sets, O(mN) versus O(m log N), where N is the target sequence length and m is the total of the query lengths, and considers what that difference means when both the target and the queries run to billions of base pairs. It also lists a draft of to-do items for ongoing projects, including disease modeling and genomic pattern analysis.
Presentation Transcript
101 6,8-Mar Discussion of algorithms for exact matches
Hashing, Binary trees
a, c, g, t = 00, 01, 10, 11
log2(N) = 34
N = target sequence length, e.g. 3E9
m = total of query lengths, E9 people * 30E9 bp (5X)
O(mN) vs O(m log N) = E29 vs E21
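
The two lookup strategies named on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used in the course: the function names, the fixed word length k, and the toy target string are all hypothetical, and a sorted list searched with binary search stands in for a balanced binary tree (both give about log2(N) comparisons per lookup). Each base is packed into 2 bits using the a, c, g, t = 00, 01, 10, 11 code from the slide.

```python
from bisect import bisect_left

# 2-bit code from the slide: a, c, g, t = 00, 01, 10, 11
CODE = {"a": 0b00, "c": 0b01, "g": 0b10, "t": 0b11}


def encode(seq):
    """Pack a DNA string into one integer, 2 bits per base."""
    value = 0
    for base in seq.lower():
        value = (value << 2) | CODE[base]
    return value


def kmer_codes(target, k):
    """Yield (code, position) for every length-k word in target."""
    mask = (1 << (2 * k)) - 1  # keeps only the last k bases
    value = 0
    for i, base in enumerate(target.lower()):
        value = ((value << 2) | CODE[base]) & mask
        if i >= k - 1:
            yield value, i - k + 1


def build_hash_index(target, k):
    """Hashing: map each k-mer code to the list of positions where it occurs."""
    index = {}
    for code, pos in kmer_codes(target, k):
        index.setdefault(code, []).append(pos)
    return index


def build_sorted_index(target, k):
    """Stand-in for a binary tree: (code, position) pairs in sorted order."""
    return sorted(kmer_codes(target, k))


def find_hashed(index, query):
    """Expected constant-time lookup of a length-k query."""
    return index.get(encode(query), [])


def find_sorted(sorted_index, query):
    """Binary-search lookup, about log2(N) comparisons per query."""
    code = encode(query)
    i = bisect_left(sorted_index, (code, -1))
    hits = []
    while i < len(sorted_index) and sorted_index[i][0] == code:
        hits.append(sorted_index[i][1])
        i += 1
    return hits


target = "acgtacgtgg"  # toy target; a real N would be on the order of 3E9
k = 4
hash_index = build_hash_index(target, k)
sorted_index = build_sorted_index(target, k)
print(find_hashed(hash_index, "acgt"))
print(find_sorted(sorted_index, "acgt"))
```

Both lookups assume the query has exactly k bases. Either index is built once over the target, so each query word then costs far less than a fresh scan of all N positions, which is the gap the O(mN) vs O(m log N) line on the slide describes.
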
101 6-Mar To do list draft
1. Mike: Bioweather HapMap
2. Chris: Code HapMap-OMIM + PG processing
3. Resmi: Prob of disease lifetime NHS access
4. Kay: Regulatory elements conserved
5. Hettman: Modeling C sequestration Cyanob. SALP.model geographical – pump nutrients up.
6. Cynthia: Disease host coevol
7. Tiffany: Flu 1918 smRNA 1800 sequences TIGR. map
8. Deniz: Z^n(mod4) Clustering Metagene. Universal sequence library.
9. Xiaodi: image patterns UI find order
10. Katie: talk Broad. Anticip. GenePattern. Algorithms for string matching
11. Zach: environ allerg corr SNPs 500K Entrez compare with random admixture