10 likes | 135 Vues
This project focuses on advancing proximity search methods for sequences, particularly in genomic analysis. It introduces almost metric distance functions and optimizes distance-based index structures for effective nearest neighbor searches. Significant results include improved indexing techniques for sequence proximity search using probabilistic methods, contributing to reliable, rapid, and generalizable solutions in data-intensive genomic research. The findings have been presented in leading conferences, showing potential applications in areas requiring efficient proximity searches in large datasets.
E N D
Grant Number: IIS-0209145 Institution of PI: Case Western Reserve UniversityPIs: Z. Meral Ozsoyoglu, Cenk Sahinalp, Piotr IndykTitle: Improving Proximity Search for Sequences Significant Results: Almost metric distance functions, and using distance based index structures for almost metrics (ICDE 2003) Improved distance based index structures for sequence proximity search using probabilistic techniques, optimal nearest neighbor search for distance based indexing (SSDBM 2004) Developing sequence proximity search methods which are particularly effective for genomic sequences, with improved generality, reliability and speed. Graphic: Nearest Neighbor Search for Distance Based Indexing Using Index structures for genomic sequences proximity search Generalizing distance based index structures to work on almost metrics Improving tree structured and array structured distance based indexing by choosing better reference points guided by probabilistic techniques Improved proximity (near neighbor, nearest neighbor, k-nn) search for sequences