
Template Security in Biometric Systems


Presentation Transcript


  1. Yagiz Sutcu. Template Security in Biometric Systems

  2. Outline • Introduction • Biometrics and biometric systems • Template security and user privacy • Proposed solutions • Feature Transformation for ECC-based Biometric Template Protection • Syndrome framework and its requirements for binary data • Binarization of minutiae-based fingerprint templates • Protecting Biometric Templates with Secure Sketch • Secure sketch framework, issues and limitations • Quantization-based secure sketch • Randomization • Multi-factor setup • Conclusions, discussions and future directions

  3. Biometrics • Biometrics is the science and technology of measuring and statistically analyzing biological data. Adopted from: S. Prabhakar, S. Pankanti, and A. K. Jain, “Biometric Recognition: Security and Privacy Concerns”, IEEE SECURITY & PRIVACY, 2003. • Universality: do all people have it? • Distinctiveness: can people be distinguished based on the identifier? • Permanence: how permanent is the identifier? • Collectability: how well can the identifier be captured and quantified? • Performance: speed and accuracy • Acceptability: willingness of people to use it • Circumvention: how easily can the system be fooled?

  4. Biometrics

  5. Biometric Systems S. Prabhakar, S. Pankanti, and A. K. Jain, “Biometric Recognition: Security and Privacy Concerns”, IEEE SECURITY & PRIVACY, 2003.

  6. Issues, Challenges and Objectives • Biometric authentication is attractive • Closely tied to the user's identity • Cannot be forgotten • Not easy to forge • Has been used successfully for a long time • However… • Cannot be reproduced exactly (intra-user variability, noise) • Once compromised, cannot be revoked • Entropy may not be sufficient • Objectives • Authentication/verification without storing the original biometric • Robustness to noisy measurements • Best possible tradeoff between • Security (how many bits must an attacker guess?) • Accuracy/performance (what is the false-reject rate?)

  7. Proposed Solutions – Transformation-Based Approaches • Employ one-way transformation • E.g., quantization, thresholding, random projections • Properties • Non-invertible or hard-to-invert • Similarity preserving • Cancelable • Technique depends highly on the biometric considered • Security not easy to analyze • Ratha’01&’07, Savvides’04, Ang’05, Teoh’06, etc.
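As a concrete illustration of the random-projection family mentioned above, here is a minimal Python sketch (the dimensions, the seeded generator standing in for a user-specific key, and the zero-thresholding are illustrative assumptions, not the specifics of any cited scheme): the user-specific projection matrix acts as the cancelable secret, and similar inputs map to nearby bit strings.

```python
import numpy as np

def transform_template(features, R):
    """Project the real-valued feature vector and binarize at zero."""
    return (R @ features > 0).astype(np.uint8)

rng = np.random.default_rng(seed=42)      # seed stands in for a user-specific key
R = rng.standard_normal((64, 128))        # 128-D features -> 64 protected bits
x_enroll = rng.standard_normal(128)       # toy enrollment features
x_probe = x_enroll + 0.1 * rng.standard_normal(128)  # noisy re-reading

b_enroll = transform_template(x_enroll, R)
b_probe = transform_template(x_probe, R)
print("bit disagreements:", int(np.sum(b_enroll != b_probe)))  # small for a genuine probe
```

If the protected template leaks, discarding R and issuing a fresh matrix yields an unrelated bit string from the same biometric, which is the cancelability property listed above.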

  8. An Example: Cancelable Biometrics [Figure: a repeatable distortion is applied to the signal; the distorted enrollment and distorted probe match each other, but neither matches the undistorted original] • Applied at the sensor level • Signal level • Feature level • Security • If compromised, issue a new distortion • Reusability • Different distortions for different databases • Performance • What about false-accept and false-reject rates? • Requires alignment

  9. Proposed Solutions – Helper Data-Based Approaches • Generate user-specific helper data • E.g., syndrome, secure sketch • Helper data and generation method are public • General ECC-based framework that is applicable to many biometrics • Techniques may vary to optimize performance for different modalities • Security analysis based on information theory is possible • Davida’98, Juels’99&’02, Dodis’04, Martinian’05, Draper’07, etc.

  10. An example: “Fuzzy Commitment” • A random key K is encoded with an error-correcting code into a codeword C(K), and the offset δ = X ⊕ C(K) is computed from the template X. • If the noisy biometric (X’) is close enough to the template (X), the decoder successfully corrects the error and recovers K. • The only information stored is δ and hash(K).
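The scheme can be sketched in a few lines of Python. A 5x repetition code stands in for a real error-correcting code (practical systems would use BCH or LDPC codes), and the key length and use of SHA-256 are illustrative assumptions:

```python
import hashlib
import numpy as np

R = 5  # repetition factor: majority voting corrects up to 2 flips per key bit

def commit(x_bits, key_bits):
    codeword = np.repeat(key_bits, R)          # C(K), same length as X
    delta = x_bits ^ codeword                  # offset = X xor C(K)
    h = hashlib.sha256(key_bits.tobytes()).hexdigest()
    return delta, h                            # only (delta, hash(K)) is stored

def verify(x_probe_bits, delta, h):
    noisy_codeword = x_probe_bits ^ delta      # = C(K) xor (X xor X')
    # majority-vote decoding of the repetition code
    key_est = (noisy_codeword.reshape(-1, R).sum(axis=1) > R // 2).astype(np.uint8)
    return hashlib.sha256(key_est.tobytes()).hexdigest() == h

rng = np.random.default_rng(0)
key = rng.integers(0, 2, 16, dtype=np.uint8)              # random 16-bit key
x = rng.integers(0, 2, 16 * R, dtype=np.uint8)            # enrollment bits X
x_noisy = x ^ (rng.random(16 * R) < 0.05).astype(np.uint8)  # ~5% bit flips
delta, h = commit(x, key)
print(verify(x_noisy, delta, h))  # True if the errors stay within the code's capacity
```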

  11. An Example for Helper Data-based Biometric Template Protection: Syndrome Coding

  12. Syndrome Coding Framework [Diagram: the original enrollment biometric X is syndrome-encoded; the syndrome S and a hash of X are stored. At authentication, the syndrome decoder combines S with the noisy biometric probe Y (X passed through a “fingerprint channel”), and access is granted only if the hash of the estimate matches the stored hash.] • S cannot be decompressed by itself and is therefore secure • In combination with a noisy second reading Y, the original X can be recovered using a Slepian-Wolf decoder • Compare the hash of the estimate with the stored hash to permit access
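In its simplest binary form, syndrome encoding is just a matrix multiplication over GF(2). The toy sketch below (a tiny random parity-check matrix H and SHA-256 for the hash are illustrative stand-ins; the framework on the slides would use a strong Slepian-Wolf/LDPC code) shows what gets stored:

```python
import hashlib
import numpy as np

rng = np.random.default_rng(1)
n, m = 16, 8                                     # 16 feature bits -> 8 syndrome bits
H = rng.integers(0, 2, (m, n), dtype=np.uint8)   # public parity-check matrix
x = rng.integers(0, 2, n, dtype=np.uint8)        # enrollment feature bits X

S = (H.astype(int) @ x) % 2                      # stored syndrome S
hx = hashlib.sha256(x.tobytes()).hexdigest()     # stored hash(X)

def authenticate(x_estimate):
    """Grant access only if the decoder's estimate hashes to the stored value."""
    return hashlib.sha256(x_estimate.tobytes()).hexdigest() == hx
```

Since S has only m = 8 bits, it is consistent with 2^(16-8) = 256 different feature vectors, which is exactly the “missing bits” security argument made two slides later.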

  13. System Implementation • Enrollment: fingerprint → alignment and minutiae extraction → extract binary feature vectors → syndrome encoding → syndrome database • Verification: probe fingerprint → alignment and minutiae extraction → extract binary feature vectors → syndrome decoding → access granted (yes) / access denied (no)

  14. Overview of Syndrome Encoding/Decoding • Security = number of “missing” bits = original bits – syndrome bits • Translates into the number of guesses needed to identify the original biometric w.h.p. • Robustness = false-rejection rate • Robustness to variations in biometric readings is achieved by the syndrome decoding process (syndrome + noisy biometric => original biometric) • Fewer syndrome bits = greater security, but less robustness
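A worked example with illustrative numbers (not from the slides): a 150-bit feature vector compressed to a 100-bit syndrome leaves 150 − 100 = 50 “missing” bits, so an attacker holding only the syndrome faces on the order of 2^50 candidate biometrics, while a legitimate user's noisy reading supplies enough side information for the decoder to resolve those 50 bits.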

  15. Example: Distributed Coding with Joint Decoding [Figure: a binary source X, a correlated source Y, and the syndrome bits computed from X]

  16. Example: Syndrome Decoding [Figure: the syndrome bits of X alongside the side information Y; the bits of X are initially unknown] • Use the side information Y and the syndrome bits to reconstruct X

  17. Example: Syndrome Decoding [Figure: decoding in progress] • Flipping the 1st bit from 0 to 1 satisfies the 1st syndrome bit but violates the 2nd

  18. Example: Syndrome Decoding [Figure: decoding in progress] • Flipping the 2nd bit from 1 to 0 satisfies syndrome bits 1 and 2 but violates the 3rd

  19. Example: Syndrome Decoding [Figure: decoding complete] • Flipping the 3rd bit from 0 to 1 satisfies all syndrome bits and recovers X
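A brute-force decoder in the spirit of slides 16-19 can be written directly: starting from the side information Y, try progressively larger sets of bit flips until every syndrome check is satisfied. This continues the encoding sketch above (reusing H, S, x, and authenticate); a real system would use belief-propagation LDPC decoding rather than exhaustive search.

```python
from itertools import combinations
import numpy as np

def syndrome_decode(H, S, y, max_flips=3):
    """Find the vector closest to y (fewest flips) satisfying H @ x = S (mod 2)."""
    n = H.shape[1]
    for k in range(max_flips + 1):                     # try 0 flips, then 1, ...
        for positions in combinations(range(n), k):
            x_try = y.copy()
            x_try[list(positions)] ^= 1
            if np.array_equal((H.astype(int) @ x_try) % 2, S):
                return x_try
    return None                                        # reject: probe too noisy

y = x.copy()
y[[2, 7]] ^= 1                                         # probe differs in two bits
x_hat = syndrome_decode(H, S, y)
print(x_hat is not None and authenticate(x_hat))       # True when decoding recovers X;
                                                       # a hash mismatch means rejection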

  20. Security of Syndrome Approach [Figure: the enrollment biometric X is mapped through F, a set of linear functions specified by the code C, to the secure biometric S = F(X), the syndrome vector; F⁻¹(F(X)) is the list of biometrics satisfying the linear constraints] • The secure biometric S is the evaluation of linear functions of X (the syndrome vector) • Knowing S only narrows X down to the list F⁻¹(F(X)) of candidates consistent with the constraints

  21. Previous Approach: Binary Grid Representation [Figure: a minutiae map quantized onto a binary grid, giving a long, sparse bit string] • Problems • Representation is sparse and difficult to model • Statistics of the binary string are not well suited for existing codes • Poor performance • Solution: pre-processing! • Pre-process fingerprint data to produce a binary string that is statistically compatible with existing codes • Syndrome coding on the resulting binary string

  22. Desired Properties of Feature Transformation [Figure: example bit strings from two users and from two readings of the same user] • 1. Zeros and ones equally likely • 2. Individual bits independent • 3. Bit strings of different users independent • 4. Two readings of the same user related by a binary symmetric channel with crossover probability p (BSC-p)

  23. Extracting Robust Bits from Biometrics [Figure: N random cuboids overlaid on the minutiae map; the minutiae count in each cuboid is compared against that cuboid's median threshold to produce a bit vector] • Each cuboid contributes a 0 or 1 bit to the feature vector, depending on whether it contains fewer or more minutiae than the median
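A minimal version of this transformation (the geometry, cuboid sizes, and median values are assumptions for illustration) counts the minutiae falling inside each axis-aligned cuboid in (x, y, θ) space and thresholds at the stored median:

```python
import numpy as np

def cuboid_bits(minutiae, cuboids, medians):
    """minutiae: (M, 3) array of (x, y, theta); cuboids: list of (lo, hi) corners."""
    bits = []
    for (lo, hi), med in zip(cuboids, medians):
        count = np.sum(np.all((minutiae >= lo) & (minutiae < hi), axis=1))
        bits.append(1 if count > med else 0)   # more than median -> 1, else 0
    return np.array(bits, dtype=np.uint8)

# Toy usage: 30 random minutiae in a 400x400 image, angles in [0, 360)
rng = np.random.default_rng(7)
minutiae = rng.uniform([0, 0, 0], [400, 400, 360], size=(30, 3))
lo = rng.uniform([0, 0, 0], [200, 200, 180], size=(150, 3))
cuboids = [(l, l + np.array([200.0, 200.0, 180.0])) for l in lo]
medians = np.full(150, 3)   # per-cuboid medians would come from training data
print(cuboid_bits(minutiae, cuboids, medians)[:20])
```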

  24. Elimination of Overlapping Cuboids • Large overlap → similar bits → easy for an attacker to guess [Figure: 400 random cuboids reduced to the best 150 after eliminating overlaps]

  25. User-Specific Cuboids • Different users have different sets of cuboids • Requirements: for each user, choose the cuboids which • Are most reliable • Result in an equal number of zeros and ones • Have the smallest possible pairwise correlation • What is a “reliable” cuboid? • One whose minutiae count is far from the median • Its 0 or 1 bit is then likely to remain unchanged over repeated noisy measurements of the fingerprint

  26. “Reliable” Cuboids [Figure: two cuboids measured twice, with median = 4. A cuboid whose counts fall near the median (e.g., 3 and 5 minutiae in the two measurements) flips its bit between 0 and 1 and is UNRELIABLE; one whose counts are far from the median (e.g., 8 and 9 minutiae) keeps its bit and is RELIABLE] • Each user has a different set of reliable cuboids!

  27. User-Specific Reliable Cuboids • Start from an unordered list of candidate cuboids • Split them into a 0-list and a 1-list according to the bit each produces • Sort each list by reliability • Use N fair coin flips to choose cuboids from the top of each list (as sketched below)
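The selection step can be written out as follows (taking reliability as the distance of the count from the median, and one coin flip per selected bit, as the assumed details):

```python
import numpy as np

def select_user_cuboids(counts, medians, n_select, rng):
    """counts/medians: per-cuboid statistics for one user; returns chosen indices."""
    reliability = np.abs(counts - medians)
    bits = counts > medians
    one_list = sorted(np.flatnonzero(bits), key=lambda i: -reliability[i])
    zero_list = sorted(np.flatnonzero(~bits), key=lambda i: -reliability[i])
    chosen = []
    for flip in rng.integers(0, 2, n_select):          # N fair coin flips
        lst = one_list if (flip == 1 and one_list) or not zero_list else zero_list
        chosen.append(lst.pop(0))                      # take the most reliable left
    return chosen

rng = np.random.default_rng(11)
counts = rng.integers(0, 10, 400)                      # toy per-cuboid minutiae counts
medians = np.full(400, 4)
print(select_user_cuboids(counts, medians, 150, rng)[:10])
```

Drawing from the top of each sorted list keeps the most reliable cuboids, while the coin flips balance zeros and ones, which is what produces the concentration around 75 ones shown on the next slide.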

  28. Distribution of Zeros and Ones • Proprietary database of 1035 users, 15 pre-aligned samples per user [Figure: histograms of the number of 1's in the transformed feature vectors, with a desired value of 75 ones, for 150 common cuboids and for 150 user-specific cuboids]

  29. Intra-User and Inter-User Separation [Figure: distributions of the normalized Hamming distance (NHD). For 150 common cuboids: intra-user variation vs. inter-user/attacker variation. For 150 user-specific cuboids: intra-user variation, inter-user variation, and the case where an attacker steals the user's cuboids]
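For reference, the NHD used in these plots is simply the fraction of disagreeing bits; a minimal helper:

```python
import numpy as np

def nhd(a, b):
    """Normalized Hamming distance between two equal-length binary vectors."""
    return float(np.mean(a != b))
```

Genuine comparisons (two readings of the same user) should cluster near 0, while impostor comparisons should cluster near 0.5.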

  30. Equal Error Rates [Figure: inter-user (or attacker) NHD plotted against intra-user NHD for common and user-specific cuboids; marked equal error rates include 0.027 for common cuboids and ≈ 0 for user-specific cuboids in the inter-user scenario] • User-specific cuboids provide a lower equal error rate even if the attacker knows everybody's cuboids

  31. Syndrome Coding Results

  32. Conclusions/Discussions • Random cuboids enable robust bit extraction with the desired properties • User-specific (reliable) features require more computation and storage, but give better separation between intra-user and inter-user feature vectors and provide higher security than common feature vectors • However, open issues remain: • A fast method for eliminating correlated bit pairs from user-specific cuboids • Extending the feature transformation to use the ridge data provided along with the minutiae map • Studying the effect of alignment inconsistencies on overall performance

  33. Protecting Biometric Templates with Secure Sketch: Theory and Practice

  34. Secure Sketch [Diagram: at enrollment, the encoder produces a sketch P from the biometric (plus randomness); at verification, the decoder uses P and a noisy reading to reproduce the data and generate a password/key] • The sketch should not reveal too much information about the original biometric

  35. Secure Sketch • Entropy loss: L = H∞(X) − H̃∞(X | P), where H∞(X) is the min-entropy of X and H̃∞(X | P) is the average min-entropy of X given the sketch P

  36. Secure Sketch • Security of a secure sketch is defined in terms of its entropy loss, L • Suppose the original biometric data X has min-entropy H∞(X) • The strength of the key we can extract from X given the sketch P is at least H∞(X) − L • L is easily bounded by the size of the sketch P • L is an upper bound on the information leakage for all distributions of X
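A worked example with illustrative numbers: if H∞(X) = 80 bits and the sketch P is 30 bits long, then L ≤ 30, so a key of at least 80 − 30 = 50 bits can be extracted, whatever the particular distribution of X.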

  37. However… • Known secure sketch schemes have limitations • They only deal with discrete data • But real-world biometric features are mostly continuous • One general solution • Quantize/discretize the data and apply the known schemes in the quantized domain

  38. Quantization-based Secure Sketch

  39. A Simple Example • X is the original data, 0 < X < 1, and under noise X can be shifted by at most 0.1
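One standard way to realize this example (an offset-to-codebook construction; the slides' exact encoding may differ) stores how far X sits from the nearest point of a codebook whose step is twice the noise bound:

```python
STEP = 0.2   # codebook step = 2 x the 0.1 noise bound

def make_sketch(x):
    """P = offset from x to its nearest codeword k * STEP (P is made public)."""
    return round(x / STEP) * STEP - x

def recover(y, p):
    """Shift the noisy reading by P, snap to the codebook, shift back."""
    return round((y + p) / STEP) * STEP - p

x = 0.4337
p = make_sketch(x)
print(recover(x + 0.09, p))   # recovers 0.4337 (up to float error): |noise| < STEP/2
```

The sketch reveals X's position inside its quantization cell, so the attacker's remaining uncertainty is which of the roughly five cells X occupies; this is exactly the entropy-loss question the next slides analyze.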

  40. Problems Remain... • For different quantization methods • Min-entropy of quantized data could be different • Entropy loss could also be different • How to define the security? • Using entropy loss in the quantized domain? • Could be misleading

  41. Different Quantizers (all logs base 2) • Using a scalar quantizer: • Case 1: step size 0.1, entropy loss = log 3 • Case 2: step size 0.01, entropy loss = log 11 • Which one is better? It depends on the distribution of X • If X is uniform: • Case 1: H(X) = log 100, H(X) − L = log(100/3) ≈ 5.06 • Case 2: H(X) = log 1000, H(X) − L = log(1000/11) ≈ 6.51 • Case 2 yields a stronger key • However, there exists a distribution of X such that, in both cases, H(X) is the same and L is the actual information leakage, and Case 1 yields a stronger key

  42. How to Compare? • Or, how do we define the “optimal” quantization for all distributions of X? • It is difficult, and might be impossible • Instead, we propose to look at relative entropy loss, in addition to entropy loss • Essentially, we ask the question differently: • Given a family of quantizers (say, scalar quantizers with different quantization steps), for any one of them (say, Q), how many more bits could we extract from X if another quantizer Q' were used? • How do we bound the number of additional bits that can be extracted by any Q' (compared with Q)? • If we can bound it by B, then the “best” quantizer in the family cannot be better than Q by more than B bits

  43. Main Results • For any well-formed quantizer family, we can always bound the relative entropy loss • Well-formed: no quantizer in the family loses too much information (say, by having too large a quantization step) • The safest way to quantize the data is to use a quantization step equal to the error range • Safest: the relative entropy loss is the smallest • This result is consistent with intuition and useful for guiding practical designs

  44. However… • Known secure sketch schemes only deal with discrete data, while real-world biometric features are mostly continuous • One general solution: quantize/discretize the data, apply the known schemes in the quantized domain, and measure security using the entropy loss in the quantized domain • For different quantization methods, the min-entropy/entropy could differ • How should security be defined? Using min-entropy alone could be misleading • Can performance be improved? • How about cancelability/reusability? • Better feature selection?

  45. Randomized Quantization-based Secure Sketch [Diagram: at enrollment, the biometric template X is randomized to X_R and quantized to Q(X_R), and the encoder produces the sketch P_X; at verification, the noisy biometric Y is randomized to Y_R and quantized to Q(Y_R), and the decoder uses P_X to recover the enrolled value]
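A rough end-to-end sketch of this pipeline (the matrix shape, step size, and the use of a hash check in place of a full encoder/decoder pair are illustrative assumptions):

```python
import hashlib
import numpy as np

rng = np.random.default_rng(3)
R = rng.uniform(-1, 1, (20, 20))     # user-specific random matrix (cancelable)
STEP = 0.2                           # per-dimension quantization step

def enroll(x):
    xr = R @ x                                         # randomization: X_R
    q = np.round(xr / STEP).astype(np.int64)           # quantization: Q(X_R)
    p = q * STEP - xr                                  # sketch P_X (stored)
    return p, hashlib.sha256(q.tobytes()).hexdigest()  # store (P_X, hash of Q(X_R))

def verify(y, p, q_hash):
    yr = R @ y                                         # randomization: Y_R
    q = np.round((yr + p) / STEP).astype(np.int64)     # snaps back if noise is small
    return hashlib.sha256(q.tobytes()).hexdigest() == q_hash

x = rng.uniform(0, 1, 20)                              # toy enrollment features
p, h = enroll(x)
print(verify(x + rng.uniform(-0.002, 0.002, 20), p, h))  # True for small noise
```

Compromise of the stored pair is handled by discarding R and re-enrolling with a fresh matrix, giving the cancelability/reusability noted in the conclusions.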

  46. Results • ORL Face database: 40 different individuals, 10 samples per individual • 7 samples for training and 3 for testing • PCA features (eigenfaces) considered • Range-based similarity measure • User-specific random (uniform) matrices

  47. Results [Figure: performance with quantization alone vs. with randomization added]

  48. Results [Figure: performance for feature dimensions n = 20 and n = 50]

  49. Conclusions/Discussions • Randomization • Improves performance • Provides cancelability/reusability • Acts as feature selection • Similar security with a better average sketch-size estimate • However, open questions remain: • How do we measure biometric information? • Entropy estimation • Matching algorithm • How do we define/find the “optimal” quantization? • Given the input distribution • Given practical constraints (size of sketch and/or templates) • Different quantization strategies
