Improving Biometric Record Management with Indexing and Binning Techniques

Binning and Indexing Biometric Records Sharat S. Chikkerur CUBS, University at Buffalo ssc5@eng.buffalo.edu

Problem Description • Biometrics are being deployed for immigration and national ID applications • US-VISIT program • Voter ID and national ID programs [3] • Database sizes can run into the millions of records • Largest study by NIST considers only 620,000 records [4] • Apart from accuracy, speed and efficiency also become important at this scale • Only biometric identification (1:N matching) can prevent duplicate enrollments

Problem Description (cont.) • Biometric templates have no natural order by which the records can be sorted • Biometric templates are inherently high-dimensional • Semantic features are not stored in the template

Identification Problem • Let FAR and FRR be the false acceptance rate and false reject rate for 1:1 matching • For 1:N matching with independent comparisons, the probability of at least one false accept is $FAR_N = 1 - (1 - FAR)^N$ • The expected number of false accepts per search is approximately $N \cdot FAR$ • Even if FAR = 0.0001%, false accepts = 1 in 10 for N = 100,000 (lower bound) • No single biometric is capable of meeting this security requirement individually
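
The 1-in-10 figure can be checked numerically. The sketch below assumes the N comparisons are independent, so that the chance of at least one false accept is 1 - (1 - FAR)^N; the function name is illustrative and not taken from the slides.

```python
def prob_any_false_accept(far: float, n: int) -> float:
    """Chance of at least one false accept in a 1:N search (independence assumed)."""
    return 1.0 - (1.0 - far) ** n


far = 0.0001 / 100   # 0.0001% expressed as a fraction
n = 100_000          # gallery size quoted on the slide

expected_false_accepts = n * far          # linear approximation N * FAR
exact = prob_any_false_accept(far, n)     # 1 - (1 - FAR)^N

print(f"N * FAR         = {expected_false_accepts:.4f}")
print(f"1 - (1 - FAR)^N = {exact:.4f}")
```

The linear approximation gives about 0.1, that is, roughly one false accept in every ten searches; the exact probability is slightly lower.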

Uses of Indexing and Binning • Ways to reduce identification errors: • Reduce N • Reduce FAR (limited by technology) • We can reduce N by pruning the records • Let $P_{SYS}$ be the penetration rate, the fraction of the database searched per query • For 1:N matching, $FAR_N = 1 - (1 - FAR)^{N \cdot P_{SYS}}$ • The expected number of false accepts is approximately $N \cdot P_{SYS} \cdot FAR$ • State-of-the-art fingerprint systems have $P_{SYS} = 0.5$ [6]
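
To see how pruning helps, the following sketch scales the expected number of false accepts by the penetration rate. It assumes the approximation N x P_SYS x FAR; the function name and the sample penetration rates other than 0.5 are hypothetical.

```python
def expected_false_accepts(far: float, n: int, p_sys: float = 1.0) -> float:
    """Approximate false accepts per search when a fraction p_sys of n records is compared."""
    return n * p_sys * far


far = 1e-6     # 0.0001% as a fraction
n = 100_000

# p_sys = 1.0 is an exhaustive search; 0.5 is the value quoted for fingerprints.
for p_sys in (1.0, 0.5, 0.1):
    print(f"P_SYS = {p_sys:.1f}: {expected_false_accepts(far, n, p_sys):.3f}")
```

Halving the penetration rate halves the expected false accepts, which is why binning and indexing improve accuracy as well as speed.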

Indexing and Binning (cont.) • Will allow us to screen immigrants at airports against a ‘watch list’ • Will make biometric systems more user-friendly by eliminating the need to remember PINs and IDs • Will improve accuracy ($FAR_N$) and performance

Binning Biometric Data Vector Quantization Approach

Vector Quantization (cont.) • In general, a biometric template may be represented as a vector • The objective is to classify the vectors into N distinct classes (codebook vectors) • The codebook vectors divide the feature space into N distinct Voronoi regions • Properties of the regions:
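
A Voronoi region is the set of vectors that lie closer to one codebook vector than to any other, so assigning a template to a bin is a nearest-codeword lookup. The sketch below assumes Euclidean distance and uses a made-up 2D codebook purely for illustration.

```python
import math


def nearest_codeword(x, codebook):
    """Index of the codebook vector closest to x (Euclidean distance assumed)."""
    return min(range(len(codebook)), key=lambda i: math.dist(x, codebook[i]))


# Hypothetical 2D codebook; real templates would have many more dimensions.
codebook = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]

for template in [(1.0, 1.0), (9.0, 2.0), (2.0, 8.0)]:
    print(template, "-> bin", nearest_codeword(template, codebook))
```

At identification time only the records stored in the matching bin need to be compared against the query.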

Vector Quantization-Voronoi Regions

Hand Geometry- Template Model

Experimental Evaluation • 25x10 hand geometry features used • Each print represented by a 21D vector • Data divided equally between training and testing sets • Data is normalized using • VQ is implemented using k-means clustering • The codebook vectors are used on the test set
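
The evaluation pipeline can be sketched as follows. This is not the original experiment: the data are synthetic 3D vectors standing in for the 21D hand geometry vectors, and the normalization is assumed to be per-feature z-scoring because the slide's formula did not survive. All function names are illustrative.

```python
import math
import random


def fit_zscore(rows):
    """Per-feature mean and standard deviation (assumed normalization)."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    stds = [math.sqrt(sum((v - m) ** 2 for v in col) / n) or 1.0
            for col, m in zip(zip(*rows), means)]
    return means, stds


def apply_zscore(rows, means, stds):
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def nearest(x, centers):
    return min(range(len(centers)), key=lambda i: math.dist(x, centers[i]))


def kmeans(rows, k, iters=100):
    """Plain Lloyd iterations; returns the k codebook vectors."""
    step = len(rows) // k
    # Deterministic start from k evenly spaced rows (an arbitrary choice).
    centers = [list(rows[i * step]) for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for row in rows:
            groups[nearest(row, centers)].append(row)
        updated = [[sum(col) / len(g) for col in zip(*g)] if g else centers[i]
                   for i, g in enumerate(groups)]
        if updated == centers:
            break
        centers = updated
    return centers


rng = random.Random(0)


def blob(center, scale, count):
    """Synthetic samples scattered around a center, one scale per feature."""
    return [[c + rng.uniform(-1, 1) * s for c, s in zip(center, scale)]
            for _ in range(count)]


scale = (1.0, 50.0, 0.1)                        # features on very different scales
hand_a = blob((10.0, 500.0, 1.0), scale, 20)    # hypothetical hand A
hand_b = blob((20.0, 1500.0, 3.0), scale, 20)   # hypothetical hand B
train = hand_a[:10] + hand_b[:10]               # equal train/test split
test = hand_a[10:] + hand_b[10:]

means, stds = fit_zscore(train)
codebook = kmeans(apply_zscore(train, means, stds), k=2)
test_bins = [nearest(x, codebook) for x in apply_zscore(test, means, stds)]
print(test_bins)
```

The normalization statistics and the codebook are both estimated on the training half only and then applied unchanged to the test half, mirroring the split described on the slide.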

Normalization • Observations • Data normalization leads to spreading of data • Without normalization, clusters converge to a single center • Equivalent to measuring Mahalanobis distance [5] • Different instances of the same hand were misclassified
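
The Mahalanobis remark can be made explicit if the normalization is taken to be per-feature scaling by the standard deviation. That is an assumption here, since the slide's normalization formula did not survive. With $\mu_j$ and $\sigma_j$ the mean and standard deviation of feature $j$:

```latex
\tilde{x}_j = \frac{x_j - \mu_j}{\sigma_j}
\quad\Rightarrow\quad
\lVert \tilde{x} - \tilde{y} \rVert^2
= \sum_{j=1}^{d} \frac{(x_j - y_j)^2}{\sigma_j^2}
= (x - y)^{\top} \Sigma^{-1} (x - y),
\qquad
\Sigma = \operatorname{diag}(\sigma_1^2, \dots, \sigma_d^2)
```

The equality holds only for a diagonal covariance matrix; correlated features would require the full covariance in the Mahalanobis distance.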

Preliminary Results

Indexing Biometric Data Spatial Access Methods Approach

Introduction to Spatial databases • Relational databases organize and store scalar data • Have planar organization • Contain scalar data (excluding LOBs, binary) • Data can be ordered linearly • Structured Query Language used to retrieve records • Spatial databases • Contain multidimensional or vectorial data • Relative positions may be explicit or inferred • Linear proximity does not imply spatial proximity • Multidimensional data is used in computer vision, medical imaging, and BIOMETRICS • Original applications • Point sets • CAD • VLSI drawings • Cartography, astronomy

Spatial databases (cont.) • Difference from pattern classification: QUERIES • Spatial searches • Neighborhood searches • PAM/SAM • Point Access Methods • Used on point databases • Points may be multidimensional (vectors) • Points have no spatial extent, so intersection is undefined • Each point is specified uniquely by its d coordinates • Spatial Access Methods • Used on lines, polygons, solids • Objects have spatial extent, so intersection of objects is well defined • A point may be occupied by more than one object
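
The query types a point access method must support can be illustrated with a brute-force baseline. The sketch below scans every record, which is exactly the cost an index is meant to avoid; the point set and function names are made up for illustration.

```python
import math

# Hypothetical 2D point database keyed by record id.
points = {"a": (1.0, 1.0), "b": (2.0, 5.0), "c": (6.0, 2.0), "d": (7.0, 7.0)}


def point_query(q):
    """Exact-match query: ids of records located exactly at q."""
    return [k for k, p in points.items() if p == q]


def region_query(lo, hi):
    """Range query: ids of records inside the axis-aligned box [lo, hi]."""
    return [k for k, p in points.items()
            if all(l <= v <= h for v, l, h in zip(p, lo, hi))]


def nearest_neighbor(q):
    """Neighborhood query: id of the record closest to q (Euclidean)."""
    return min(points, key=lambda k: math.dist(points[k], q))


print(point_query((2.0, 5.0)))
print(region_query((0.0, 0.0), (6.0, 5.0)))
print(nearest_neighbor((6.5, 6.0)))
```

For biometric identification the neighborhood query is the relevant one, since a probe template rarely matches a stored template exactly.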

Problems with vectorial/spatial data • No standard algebra defined on spatial data • Union and intersection are not defined exactly • Data operations are highly application specific • Operators are not closed • Queries • Need support for spatial queries: point and region queries • No standard spatial query language • No natural ordering • An ordering that preserves spatial proximity does not exist • There is no mapping from multidimensional space to 1D such that two points that are close together in the higher-dimensional space are also close in the linear order [1] • Is it possible to do this via PCA/KLT? • Cannot extend single-key structures like the B-Tree
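
The lack of a proximity-preserving ordering can be seen with a Z-order (Morton) curve, used here only as one illustrative way of linearizing 2D points; the slides do not name a specific mapping. The sketch interleaves the bits of x and y to produce a 1D key.

```python
def morton2d(x: int, y: int, bits: int = 16) -> int:
    """Interleave the bits of x (even positions) and y (odd positions)."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


# Two pairs of grid points, each pair one step apart along the x axis.
near_pair = ((0, 0), (1, 0))
far_pair = ((7, 0), (8, 0))

for p, q in (near_pair, far_pair):
    gap = abs(morton2d(*p) - morton2d(*q))
    print(f"{p} and {q}: spatial distance 1, key gap {gap}")
```

Both pairs are spatial neighbors, yet the second pair ends up far apart in the 1D key order, so a single-key structure sorted on that key would not keep them together.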

Requirements of a spatial database • Dynamic updates • The structure should be consistent as data is inserted and deleted • Changes should be tracked • Independence of input data and insertion sequence • Should handle skewed data • Structure should be independent of insertion sequence(Compare tree) • Scalable • Efficiency • Time Efficiency • Efficient design will approach the performance of B-Trees • Space Efficiency • Indexing overhead should be small

Types of structures • K-d Trees • Binary tree in d-dimensional space • (d-1)-dimensional hyperplanes separate the subspaces • The splitting directions alternate among the d possibilities • Insertion and search are straightforward • Deletion is cumbersome • Structure is sensitive to insertion order
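
A minimal k-d tree sketch, assuming points are equal-length tuples and that the splitting axis simply cycles with depth. It covers insertion and exact-match search only, and shows the sensitivity to insertion order by comparing tree heights; class and function names are illustrative.

```python
class Node:
    def __init__(self, point):
        self.point = point
        self.left = None
        self.right = None


def insert(root, point, depth=0):
    """Insert a point, splitting on axis (depth mod d) at each level."""
    if root is None:
        return Node(point)
    axis = depth % len(point)
    if point[axis] < root.point[axis]:
        root.left = insert(root.left, point, depth + 1)
    else:
        root.right = insert(root.right, point, depth + 1)
    return root


def contains(root, point, depth=0):
    """Exact-match search following the same axis rule as insert."""
    if root is None:
        return False
    if root.point == point:
        return True
    axis = depth % len(point)
    child = root.left if point[axis] < root.point[axis] else root.right
    return contains(child, point, depth + 1)


def height(root):
    return 0 if root is None else 1 + max(height(root.left), height(root.right))


def build(points):
    root = None
    for p in points:
        root = insert(root, p)
    return root


sorted_order = [(i, i) for i in range(1, 8)]
median_first = [(4, 4), (2, 2), (6, 6), (1, 1), (3, 3), (5, 5), (7, 7)]

print("height, sorted insertion:      ", height(build(sorted_order)))
print("height, median-first insertion:", height(build(median_first)))
```

The same seven points yield a degenerate chain when inserted in sorted order and a balanced tree when the medians are inserted first, which is the insertion-order sensitivity noted above.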

References • [1] Gaede and Gunther, “Multidimensional Access Methods”, ACM Computing Surveys, Vol. 30, No. 2, 1998 • [2] www.geocities.com/mohamedqasem/vectorquantization/vq.html • [3] Bolle et al., Guide to Biometrics, Springer Verlag, 2003 • [4] NIST report to the United States Congress, “Summary of NIST Standards for Biometric Accuracy, Tamper Resistance and Interoperability”, http://www.itl.nist.gov/iad/894.03/NISTAPP_Nov02.pdf • [5] http://www.galactic.com/Algorithms/discrim_mahaldist.htm • [6] Dr. Wayman’s report, NIST

Thank You ssc5@cedar.buffalo.edu
