What is necessary (and unnecessary) for analyses of offender databases Jason R. Gilder August 16, 2008 Forensic Bioinformatics (www.bioforensics.com) gilder@bioforensics.com
Offender databases • Originally designed for convicted offenders • CODIS: Combined DNA Index System • Expanded • Unsolved crime samples • Arrestees • Elimination profiles
CODIS • COmbined DNA Index System • National: NDIS • State: SDIS - fewer restrictions • Local: LDIS - fewest restrictions • Convicted Offender Profiles in NDIS: 6,031,000 • Forensic Profiles in NDIS: 225,400 • More than 71,800 cold hits
Why analyze a database? • Questions remain regarding the weight of a DNA database match • Random Match Probability (RMP) • Database Match Probability (DMP) • Balding & Donnelly LR • Other • Composition of database may affect chance of a coincidental match • Presence of relatives
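To make the difference between the first two statistics concrete, here is a minimal sketch. It assumes the DMP is computed the way the 1996 NRC II report recommends, as the RMP multiplied by the number of profiles searched. The RMP and database size are hypothetical illustration values, the function names are invented for this sketch, and the Balding & Donnelly likelihood ratio is not modeled.

```python
import math


def database_match_probability(rmp: float, n_profiles: int) -> float:
    """N x RMP approximation (NRC II style), capped at 1."""
    return min(1.0, rmp * n_profiles)


def prob_at_least_one_coincidental_match(rmp: float, n_profiles: int) -> float:
    """Chance of one or more coincidental hits, assuming every profile in the
    database is unrelated to the source and independent of the others."""
    # 1 - (1 - rmp)^N, computed stably for very small rmp
    return -math.expm1(n_profiles * math.log1p(-rmp))


rmp = 1e-9             # hypothetical random match probability
n_profiles = 1_000_000  # hypothetical database size

dmp = database_match_probability(rmp, n_profiles)
exact = prob_at_least_one_coincidental_match(rmp, n_profiles)
print(f"RMP = {rmp:.1e}, DMP = {dmp:.1e}, P(at least one hit) = {exact:.6e}")
```

Both functions rely on the independence assumption, which is exactly what the presence of relatives in a database would undermine.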
Structure of a DNA database • Collection of records • Structured Query Language (SQL) format
Examples of possible issues with the use of DNA databases • Michigan v. Gary Leiterman • Evidence: blood found on victim’s hand • Cold hit to a 4-year-old boy • R v. Sean Hoey • Evidence: explosive device • Cold hit to a 14-year-old boy • Jaidyn Leskie inquest (Australia) • Evidence: clothing from deceased • Cold hit to a rape victim
How a database can be analyzed • Perform all pairwise profile comparisons • the “Arizona Search” • P1 with P2, P1 with P3, P1 with P4, …, P1 with Pn • P2 with P3, P2 with P4, P2 with P5, …, P2 with Pn • Analyze profile similarity • Count number of matching loci and alleles • Perform kinship analyses
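The all-pairs search can be sketched in a few lines. This is an illustration only, not the software used in any of the analyses mentioned here. It assumes each profile is a tuple of per-locus genotypes and each genotype is a pair of alleles. The toy profiles use 3 loci rather than 13, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def compare(p, q):
    """Return (matching_loci, shared_alleles) for two profiles."""
    matching_loci = 0
    shared_alleles = 0
    for g1, g2 in zip(p, q):
        # Multiset intersection, so order within a genotype does not matter
        # and a homozygote (14, 14) shares only one allele with (14, 15).
        overlap = sum((Counter(g1) & Counter(g2)).values())
        shared_alleles += overlap
        if overlap == 2:
            matching_loci += 1
    return matching_loci, shared_alleles


def pairwise_histogram(profiles):
    """Count pairs by number of fully matching loci: P1 with P2, ..., Pn-1 with Pn."""
    hist = Counter()
    for p, q in combinations(profiles, 2):
        loci, _ = compare(p, q)
        hist[loci] += 1
    return hist


# Toy data (made up for illustration)
P1 = ((15, 16), (17, 18), (21, 24))
P2 = ((16, 15), (17, 18), (22, 24))
P3 = ((14, 14), (16, 17), (20, 23))
toy_profiles = [P1, P2, P3]

print(compare(P1, P2))                   # (2, 5)
print(pairwise_histogram(toy_profiles))  # one pair at 2 loci, two pairs at 0
```

Kinship analysis is not covered by this sketch; it would need allele frequencies and a likelihood model on top of the raw counts.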
Arizona Match Data • 65,493 Profiles • 122 pairs matched at 9 of 13 loci • 20 pairs matched at 10 of 13 • 1 pair matched at 11 of 13 • 1 pair matched at 12 of 13
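For scale, it helps to know how many comparisons sit behind these counts. A short sketch, assuming an all-against-all search in which each pair of the 65,493 profiles is compared exactly once:

```python
import math

n_profiles = 65_493
n_pairs = math.comb(n_profiles, 2)  # n * (n - 1) / 2
print(n_pairs)  # 2144633778

# Pair counts as reported above, keyed by number of matching loci (out of 13)
observed = {9: 122, 10: 20, 11: 1, 12: 1}
for loci, count in observed.items():
    print(f"{loci} of 13 loci: {count} pairs, {count / n_pairs:.2e} of all pairs")
```

The fractions describe what was observed and nothing more. Whether they are higher or lower than expected depends on allele frequencies and on how many duplicates and relatives the database contains.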
Review of Victoria State Database • Krane/Paoletti analysis: >11,000 profiles, each compared to all others across 9 loci
Shared alleles    Observed occurrences
14                401
15                27
16                1
17                16
18                0
[Chart label: “Aussie Bump”]
Issues with the release or analysis of a DNA database • Privacy concerns • Names, social security numbers, DNA profiles, addresses, etc. • Issues with analysis • Duplicate profiles, multiple databases, presence of relatives, processing time, CODIS requirements • Legal issues • California Proposition 69
Issue 1: Privacy concerns • Database contains private information that should not be released • Answer: provide anonymous profiles only • Accomplished through one command • SELECT D3, vWA, FGA, …, D7 FROM CODIS_DB
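The one-command extraction can be tried on a throwaway table. This is a sketch only: the table layout, the NAME and SSN columns, and the stored values are hypothetical, only four of the loci are included, and the real CODIS schema is not shown in the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # scratch database, nothing touches disk

# Hypothetical layout: identifying columns next to per-locus genotype columns
conn.execute(
    "CREATE TABLE CODIS_DB (NAME TEXT, SSN TEXT, D3 TEXT, vWA TEXT, FGA TEXT, D7 TEXT)"
)
conn.executemany(
    "INSERT INTO CODIS_DB VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("Alice Example", "000-00-0001", "15,16", "17,18", "21,24", "8,10"),
        ("Bob Example", "000-00-0002", "14,17", "16,16", "22,23", "10,11"),
    ],
)

# Select only the genotype columns; names and SSNs never leave the database
anonymous = conn.execute("SELECT D3, vWA, FGA, D7 FROM CODIS_DB").fetchall()
for row in anonymous:
    print(row)
```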
Issue 2: Duplicate profiles • Many databases contain at least 10-15% duplicate profiles • Answer: ignore duplicates in analysis • A fairly thorough database analysis can take place with duplicates removed • Also identify potential mistyping rate • The lab may be able to cull out duplicates from the same individual with additional information (e.g. SSN)
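Ignoring exact duplicates is simple once each profile is put in a canonical form. A minimal sketch with made-up profiles and hypothetical function names:

```python
def normalize(profile):
    """Sort the alleles within each locus so (16, 15) and (15, 16) compare equal."""
    return tuple(tuple(sorted(genotype)) for genotype in profile)


def remove_duplicates(profiles):
    """Keep the first copy of each distinct profile, preserving input order."""
    seen = set()
    unique = []
    for profile in profiles:
        key = normalize(profile)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


# Toy data: the second entry repeats the first with alleles in a different order
toy_profiles = [
    ((15, 16), (17, 18), (21, 24)),
    ((16, 15), (18, 17), (24, 21)),
    ((14, 14), (16, 17), (20, 23)),
    ((15, 17), (16, 18), (22, 24)),
]

unique = remove_duplicates(toy_profiles)
duplicate_fraction = 1 - len(unique) / len(toy_profiles)
print(len(unique), duplicate_fraction)  # 3 0.25
```

This removes exact duplicates only. Near-duplicates that differ at a single allele, a possible sign of mistyping, would need a pairwise comparison and are not handled here.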
Issue 2b: Multiple databases • California DOJ keeps information in two databases that can be cross-referenced to remove duplicates • Login DB – contains unique “CII” ID and accession numbers of all samples for that individual • SDIS – contains accession number and profile • Answer: JOIN the data with one command • Only select the first accession number profile • SELECT D3, vWA, FGA, …, D7 FROM SDIS JOIN LOGIN_DB ON LOGIN_DB.ACCESSION1 = SDIS.ACCESSION
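The join can be checked on toy tables. In this sketch only the table names and the ACCESSION1 and ACCESSION columns come from the slide. The CII and ACCESSION2 columns, the accession values, and the four-locus profiles are hypothetical stand-ins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE LOGIN_DB (CII TEXT, ACCESSION1 TEXT, ACCESSION2 TEXT)")
conn.execute(
    "CREATE TABLE SDIS (ACCESSION TEXT, D3 TEXT, vWA TEXT, FGA TEXT, D7 TEXT)"
)

# Individual CII-A was sampled twice; CII-B once
conn.executemany(
    "INSERT INTO LOGIN_DB VALUES (?, ?, ?)",
    [("CII-A", "ACC-001", "ACC-002"), ("CII-B", "ACC-003", None)],
)
conn.executemany(
    "INSERT INTO SDIS VALUES (?, ?, ?, ?, ?)",
    [
        ("ACC-001", "15,16", "17,18", "21,24", "8,10"),
        ("ACC-002", "15,16", "17,18", "21,24", "8,10"),  # same person, second sample
        ("ACC-003", "14,17", "16,16", "22,23", "10,11"),
    ],
)

# Keep only the profile filed under each individual's first accession number
first_profiles = conn.execute(
    "SELECT D3, vWA, FGA, D7 FROM SDIS "
    "JOIN LOGIN_DB ON LOGIN_DB.ACCESSION1 = SDIS.ACCESSION"
).fetchall()

total_rows = conn.execute("SELECT COUNT(*) FROM SDIS").fetchone()[0]
print(total_rows, len(first_profiles))  # 3 2
```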
Issue 3: Presence of relatives • It is difficult to identify the presence of relatives by hand by simply looking at the CODIS records • “There are a significant, but unknown number, of such related individuals in California’s offender database.” – Kenneth Konzack • Answer: Exactly!
Issue 4: Processing time • Performing an internal search of the database will take too long (a week or more) and will not allow for CODIS searches during that time • Answer: perform an analysis on a separate computer or computers • Pairwise database search is “embarrassingly parallel”
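“Embarrassingly parallel” means the comparisons can be split into chunks that never need to talk to each other, with the partial counts simply added at the end. The sketch below shows that structure with hypothetical function names and randomly generated toy profiles. It uses threads only to stay portable. A real run would hand the same chunks to separate processes or separate machines to get an actual speedup.

```python
import random
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def matching_loci(p, q):
    """Number of loci at which both alleles agree."""
    return sum(1 for g1, g2 in zip(p, q) if sorted(g1) == sorted(g2))


def histogram_for_rows(profiles, rows):
    """Compare each profile i in `rows` with every later profile j > i."""
    hist = Counter()
    for i in rows:
        for j in range(i + 1, len(profiles)):
            hist[matching_loci(profiles[i], profiles[j])] += 1
    return hist


def split_rows(n, n_chunks):
    """Deal row indices round-robin so chunks carry similar amounts of work."""
    return [range(k, n, n_chunks) for k in range(n_chunks)]


def parallel_histogram(profiles, n_chunks=4):
    """Run each chunk independently, then add up the partial histograms."""
    total = Counter()
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        partials = pool.map(
            lambda rows: histogram_for_rows(profiles, rows),
            split_rows(len(profiles), n_chunks),
        )
        for part in partials:
            total.update(part)  # Counter.update adds counts
    return total


# Toy data: 50 random 3-locus profiles (made up, fixed seed for repeatability)
rng = random.Random(0)
toy_profiles = [
    tuple((rng.randint(10, 13), rng.randint(10, 13)) for _ in range(3))
    for _ in range(50)
]

serial = histogram_for_rows(toy_profiles, range(len(toy_profiles)))
parallel = parallel_histogram(toy_profiles)
print(parallel == serial)
```

Because each chunk only reads the profile list and writes its own counter, the work can run on a copy of the data while the production system stays available for CODIS searches.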
Issue 5: Legal issues • Legal statutes (e.g., California Proposition 69) prohibit release of database to citizens • Answer: 38 state statutes (including CA) allow for an outside review of their database for statistical analysis • Many require the removal of identifying information