Chapter 9 Creating and Maintaining Database

Presented by Zhiming Liu. Instructor: Dr. Bebis

Outline • Introduction • Enrollment Policies • The Zoo • Biometric Sample Quality Control • Training • Enrollment Is System Training

Introduction • Biometric enrollment asks an individual to give up private information. • Enrollment is a process directed by an enrollment policy, which needs to be acceptable to the public. • Positive enrollment: under enrollment policy EM, select trusted individuals and store machine representations of these m enrolled members in a verification database M.

Introduction • Negative enrollment: for criminal identification systems, under enrollment policy EN, determine the undesirable individuals and store machine representations of the n selected individuals in the screening database N. • Because of error and fraud, there are fake and duplicate identities in legacy databases.

Introduction - A fake identity is either a created or a stolen identity: 1. Created identity: some subject d enrolls in M as d’k using documents for a nonexistent identity, either fake documents or a fake ID. 2. Stolen identity: some subject d’k is falsely enrolled as an existing subject dk, whose identity is thereby stolen. - A duplicate identity: one subject A is enrolled under two identities, IA and its duplicate IB.

Enrollment policies • Positive enrollment: the registration of M trusted subjects dm in database M. The enrollment could be based on some already enrolled population W. • Negative enrollment: the registration of N questionable subjects dn by storing machine descriptions of these subjects in database N, which contains much more specific and detailed descriptions.

Enrollment policies • Social issues - How to make biometric authentication work without creating additional security loopholes, and without damaging civil liberties? - Who will administer and maintain databases of authorized subjects? - How will the data integrity of these databases be protected?

The zoo • Animal names label categories of subjects, depending on how easy a subject is to authenticate. - Sheep: The group of subjects that dominates the population. They are easy to authenticate because their real-world biometric is very distinctive and stable. - Goats: The group of subjects that are particularly difficult to authenticate because of a poor real-world biometric that is not distinctive, perhaps due to physical damage to body parts or due to large spurious variability in the biometric measurements over time. This is the portion of the population that generates the majority of False Rejects.

The zoo - Lambs: These are the enrolled subjects who are easy to imitate. Lambs are the cause of most False Accepts because they are imitated by wolves. - Wolves: These are subjects that are particularly good at imitating, impersonating, or forging a particular biometric. - Chameleons: These are the subjects who are both easy to imitate and good at imitating others. They are a source of passive False Accepts when enrolled and of active False Accepts when being authenticated.

Biometric sample quality control • Many random False Rejects/Accepts occur because of adverse signal acquisition situations. - two solutions

Biometric sample quality control - For example, apply image enhancement, or suggest that subjects present the biometric in a different, “better” way. - Failure to Enroll (FTE): stricter input quality control leads to higher FTE rates; accepting low-quality samples leads to lower FTE rates. - Relationship with ROC: a lower FTE rate leads to higher FAR and FRR.
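The trade-off between quality control and FTE can be illustrated with a minimal Python sketch. The quality scores, the threshold values, and the function name `failure_to_enroll_rate` are hypothetical and chosen only for illustration; a real system would obtain quality scores from its own sample-quality estimator.

```python
def failure_to_enroll_rate(quality_scores, min_quality):
    """Fraction of enrollment samples rejected by the quality gate."""
    if not quality_scores:
        raise ValueError("need at least one quality score")
    rejected = sum(1 for q in quality_scores if q < min_quality)
    return rejected / len(quality_scores)


# Hypothetical quality scores in [0, 1]; higher means a cleaner sample.
scores = [0.15, 0.40, 0.55, 0.62, 0.70, 0.81, 0.88, 0.93]

# A lenient gate enrolls almost everyone (low FTE), at the cost of
# storing low-quality templates that later raise FAR and FRR.
lenient = failure_to_enroll_rate(scores, min_quality=0.3)

# A strict gate rejects more subjects at enrollment (high FTE).
strict = failure_to_enroll_rate(scores, min_quality=0.65)

print(f"lenient FTE = {lenient:.3f}, strict FTE = {strict:.3f}")
```

Raising the threshold can only increase the FTE rate, which is the direction of the trade-off stated on the slide.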

Training • Why does a biometric system need to be trained? - Compute match score s(B’, B). - The goal is to make the average difference between these match scores and mismatch scores as high as possible. • There are two aspects to training - Enrollment policies and authentication protocols

Training 1. Enrollment of subjects: During enrollment one or more samples B of a subject’s biometric β are acquired and biometric samples or templates derived from the samples B are stored in some database M. 2. Protocols: A biometric authentication system itself needs to be trained, by refining and enhancing the signal or image to match the user population characteristics and incrementally improving the match engine.
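The training goal of separating match scores from mismatch scores can be sketched in Python as follows. This is only an illustration under assumed data: the score lists are made up, and `score_separation` is a hypothetical helper, not part of any biometric library.

```python
from statistics import mean


def score_separation(match_scores, mismatch_scores):
    """Difference between the average match (genuine) score and the
    average mismatch (impostor) score. Training aims to increase it."""
    return mean(match_scores) - mean(mismatch_scores)


# Hypothetical scores s(B', B): same-subject pairs versus
# different-subject pairs, before and after a round of tuning.
before = score_separation([0.7, 0.6, 0.8], [0.4, 0.5, 0.3])
after = score_separation([0.9, 0.8, 0.7], [0.2, 0.3, 0.1])

print(f"separation before = {before:.2f}, after = {after:.2f}")
```

A larger separation generally makes it easier to choose a decision threshold with few False Accepts and few False Rejects, although the spread of each score distribution matters as well.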

Enrollment is system training • Build database M by selecting subjects d from the world population W and assigning an identifier ID to each subject.

Enrollment is system training • Three possibilities: 1. Correctly “linked”, ID = k. 2. Subject dk is in reality a subject dj, with j < k, i.e., dk is a “duplicate” of subject dj. As a result, IDj and IDk are duplicates, representing the same individual. 3. Subject dk is in reality a subject dj, with j > k, i.e., dk is faking unenrolled subject dj. As a result, IDk corresponds to a “fake” identity.

Enrollment is system training • We have non-zero probabilities - PD is the probability that some subject d ∈ M is also enrolled under a different ID number - PF is the probability that subject d ∈ M is a fake identity • Database integrity - Integrity: how well the database reflects the truth data of the seed documents (birth certificates, proofs of citizenship, and passports) used for enrollment

Enrollment is system training • The database integrity with respect to duplicates is determined by PD, the probability of duplicates - PDEA (Double Enroll Attack) refers to the probability that an already enrolled subject dj wishes to re-enroll in the database as a different identity dk. - FNMRE is the probability that a match between two samples of the same biometric is not detected, i.e., is missed. - The number of duplicates in M is PD * m, with m the number of entities in M

Enrollment is system training • The enrollment integrity is further determined by PF, the probability of a fake enrollment as dk - FMRE is the probability that a match between two different biometric samples is falsely declared during enrollment - PIA is the probability of an impersonation attack - The number of fake identities in M equals PF * m
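The expected numbers of duplicates and fakes can be worked through in a short Python sketch. The slides do not preserve how PD is composed from PDEA and FNMRE; the product used below is an assumption (a duplicate arises when a double-enroll attempt is made and the duplicate check misses it), and all numeric values are hypothetical.

```python
def expected_duplicates(p_dea, fnmr_e, m):
    """Expected number of duplicate identities, PD * m.

    ASSUMPTION: PD is taken as PDEA * FNMRE, i.e. a re-enrollment
    attempt occurs and the enrollment-time match is missed.
    """
    p_d = p_dea * fnmr_e
    return p_d * m


def expected_fakes(p_f, m):
    """Expected number of fake identities, PF * m (PF given directly)."""
    return p_f * m


# Hypothetical values: 1% of enrollees attempt a double enrollment,
# the duplicate check misses 5% of true matches, and 0.02% of
# identities are fake, in a database of one million entries.
dups = expected_duplicates(p_dea=0.01, fnmr_e=0.05, m=1_000_000)
fakes = expected_fakes(p_f=0.0002, m=1_000_000)

print(f"expected duplicates: {dups:.0f}, expected fakes: {fakes:.0f}")
```

Even small per-subject probabilities produce hundreds of compromised records at this scale, which is why the enrollment-time error rates matter for database integrity.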

Enrollment is system training • Probabilistic enrollment - Build an access control list of subjects di, i = 1,…,m, of some database M. - Associate each di with the corresponding biometric βi. - Compute a likelihood that expresses how well a subject’s biometric βi matches his or her template Bi. - This probability can only be computed if there exists some machine representation of the real-world biometrics βi; let these representations be another set of templates and write

Enrollment is system training where, for simplicity, we assume that the match score is the likelihood that di is the true subject, given Bi • Modeling the world - Prob (di | Bi) can be approximated by match score si only under very unrealistic circumstances. - More realistic approximations will have to involve the modeling of other subjects dk enrolled in M; more generally, compute Prob (di | O), the likelihood of subject di given the biometric data O collected at enrollment time

Enrollment is system training - Prob (O) is the prior probability that this particular observation will occur (which cannot be computed exactly) - Assume Prob (di) = Pd is constant - Evaluating Prob (O|di) is a matter of fitting model di to the data O and determining how well this can be done. - Evaluating the rest of this expression, Prob (O|dk) for k = j+1,…, m, is impossible, because these subjects are not available upon enrollment of dj
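For reference, the terms discussed above fit together through Bayes' rule. The block below is a sketch rather than a reproduction of the original slide equation; the expansion of Prob (O) as a sum over the m enrolled subjects assumes that the observation O comes from one of them.

```latex
\[
\mathrm{Prob}(d_i \mid O)
  = \frac{\mathrm{Prob}(O \mid d_i)\,\mathrm{Prob}(d_i)}{\mathrm{Prob}(O)},
\qquad
\mathrm{Prob}(O) = \sum_{k=1}^{m} \mathrm{Prob}(O \mid d_k)\,\mathrm{Prob}(d_k).
\]
% With a constant prior Prob(d_k) = P_d for all k, the prior cancels:
\[
\mathrm{Prob}(d_i \mid O)
  = \frac{\mathrm{Prob}(O \mid d_i)}{\sum_{k=1}^{m} \mathrm{Prob}(O \mid d_k)}.
\]
```

The difficulty noted on the slide is visible in the denominator: it requires models for subjects who have not yet been enrolled.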

Enrollment is system training • Modeling the rest of the world — cohorts - the most difficult issue in training a biometric authentication system is the modeling of data from unknown people. - voice verification methods not only use a model describing the speaker’s biometric machine representation, but also a model describing all other speakers. - two techniques to approximate the denominator of (9.7)

Enrollment is system training 1. World modeling - Reduce the set M to one fictitious model subject D, trained on a pool of data from many different speakers who represent the “world” W of possible speakers. - A weighting factor is applied so that the denominator reflects the whole population D + di.

Enrollment is system training 2. Cohort modeling - Approximate the set M by a subset Mi of subjects that resemble subject di. For each subject di, a set of approximate forgeries is computed and stored. We denote this set by Di; it is called the set of cohorts of speaker i. - The weighting factor for di equals ci, the number of cohorts for di.
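Both approximations of the denominator can be sketched in Python. The exact form of the normalization is not preserved in the slides, so the formulas below are assumptions: equal priors, a hypothetical `weight` parameter for the world model, and a cohort sum (equal to ci times the cohort average). The function names and likelihood values are illustrative only.

```python
def world_normalized(lik_subject, lik_world, weight=1.0):
    """World modeling: one pooled model D stands in for everyone else.

    ASSUMPTION: the denominator is lik_subject + weight * lik_world.
    """
    denom = lik_subject + weight * lik_world
    if denom <= 0:
        raise ValueError("likelihoods must not all be zero")
    return lik_subject / denom


def cohort_normalized(lik_subject, cohort_liks):
    """Cohort modeling: only the subjects most similar to d_i compete.

    Summing over the c_i cohort likelihoods equals c_i times their
    average, which matches a weighting factor of c_i.
    """
    denom = lik_subject + sum(cohort_liks)
    if denom <= 0:
        raise ValueError("likelihoods must not all be zero")
    return lik_subject / denom


# Hypothetical likelihoods Prob(O | model) for one observation O.
p_world = world_normalized(lik_subject=0.8, lik_world=0.2)
p_cohort = cohort_normalized(lik_subject=0.6, cohort_liks=[0.1, 0.2, 0.1])

print(f"world-normalized: {p_world:.2f}, cohort-normalized: {p_cohort:.2f}")
```

The world model needs only one extra model for all subjects, while the cohort approach stores a separate competitor set per subject and so tends to be more sensitive to close impostors.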

Enrollment is system training • Updating the probabilities - Denote Prob (di | O) by Pi. - During operation of the authentication system, data from subjects is collected and the likelihood Pi can be updated. - Upon authentication of subject di, a new biometric sample is acquired, denoted here as O′. - Compute Prob (di | O, O′).

Enrollment is system training - What needs to be evaluated is the denominator Prob (O′). - Set the prior Prob (di) = Pi.
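The update step can be sketched in Python as a recursive Bayesian update in which the previous posterior Pi serves as the new prior. This is a simplified two-hypothesis illustration: it assumes the new sample is conditionally independent of earlier data given the identity, and it lumps all other subjects into a single alternative with likelihood `lik_other`. The function and parameter names are hypothetical.

```python
def update_posterior(prior, lik_subject, lik_other):
    """One Bayesian update of P_i after a newly acquired sample.

    prior       -- current P_i, used as Prob(d_i)
    lik_subject -- likelihood of the new sample under subject d_i
    lik_other   -- likelihood of the new sample under "someone else"
    """
    # Denominator: total probability of the new observation.
    evidence = lik_subject * prior + lik_other * (1.0 - prior)
    if evidence <= 0:
        raise ValueError("observation has zero probability under both models")
    return lik_subject * prior / evidence


# Hypothetical run: start undecided, then see two well-matching samples.
p_i = 0.5
p_after_one = update_posterior(p_i, lik_subject=0.9, lik_other=0.1)
p_after_two = update_posterior(p_after_one, lik_subject=0.9, lik_other=0.1)

print(f"after one sample: {p_after_one:.4f}, after two: {p_after_two:.4f}")
```

Each consistent sample increases confidence in the link between the subject and the stored identity, while a poorly matching sample would lower it.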
