Cold hits (and familial searches) Dan Krane Wright State University, Dayton, OH 45435 Forensic Bioinformatics (www.bioforensics.com)
Suspect identification is the key difference • Probable Cause Case: suspect is first identified by non-DNA evidence; DNA evidence is used to corroborate traditional police investigation • Cold Hit Case: suspect is first identified by a search of a DNA database; traditional police work is no longer the focus
Murder Scene – June 4, 1999 (United States v. Jenkins) • Dennis Dolinger was found in his basement • 25 stab wounds to head • Blood in basement, main floor, stairs to second floor, second floor, outside front of house • Wallet missing
Steven Watson • Street musician • Long criminal record • Drug offenses • Property offenses • Assaultive offenses
Evidence against Mr. Watson • Caught using Mr. Dolinger’s credit cards next day in Alexandria, Virginia • Had Mr. Dolinger’s wallet and personal effects in apartment • Observed with blood on his person on the day of the murder near the crime scene
DNA Second Perpetrator Theory • FBI develops single source DNA profile from crime scene evidence • FBI excludes Mr. Watson as contributor of DNA • USAO develops second perpetrator theory of the crime
DNA evidence begins to drive investigation • FBI searches Combined DNA Index System (CODIS): No hit • Prosecutors ask VA to search VA database: Hit = Raymond Jenkins • Mr. Jenkins is arrested and charged with murder • Jenkins is jailed pending trial in January 2000
Starting over • Charges against Mr. Watson are dropped • Mr. Watson is released from jail • Mr. Watson disappears • DC Public Defender Service gets case.
Random match probability (RMP) describes the chance that a randomly chosen, unrelated individual would have a particular DNA profile. Is it appropriate here?
Surveying the three (or four) proposed statistics for cold hits • NRC I : 1992 National Research Council Report • NRC II: 1996 National Research Council Report • Bayesian (aka Balding and Donnelly): Widespread in UK and Western Europe • DAB: 2000 DNA Advisory Board to FBI
The Problem: Ascertainment bias • The first three approaches differ in how they take ascertainment bias into account. • Ascertainment bias is the statistical effect of the suspect having first been identified by a search of a database. • How must the RMP be modified?
NRC I & NRC II • Position: Both say ascertainment bias makes the link between suspect and crime scene DNA weaker—less probative. • Rationale: As the size of the database searched increases, so does the chance that you will find a match to the crime scene profile by chance.
NRC I & NRC II • Example: If you are looking for someone named “Sheldon Krimsky,” the chance of finding a match greatly increases if you search US census data versus a local phone book. • And, how impressed you are at finding another “Sheldon Krimsky” decreases as the database size increases
NRC I & NRC II (cont.) • NRC I Solution: Set aside previous DNA testing and test the sample at additional loci. If the new testing produces a match, it will be free of ascertainment bias. • New, untainted loci • NRC II Solution: Multiply the number of profiles in the database searched by the rarity of the profile in the general population and you get a “database match probability” (DMP) or the probability of finding a match in a database of size N • (N x RMP = DMP)
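The NRC II adjustment is simple arithmetic, sketched below. The RMP and database size are made-up illustrative values, not figures from the Jenkins case, and the function names are hypothetical. N x RMP is an upper bound on 1 - (1 - RMP)^N, the chance of at least one coincidental match among N unrelated profiles, and the two are nearly equal when N x RMP is small.

```python
import math


def database_match_probability(rmp: float, n_profiles: int) -> float:
    """NRC II style DMP: N x RMP, capped at 1 so it stays a probability."""
    return min(1.0, n_profiles * rmp)


def prob_at_least_one_match(rmp: float, n_profiles: int) -> float:
    """1 - (1 - RMP)^N, computed stably for very small RMP values."""
    return -math.expm1(n_profiles * math.log1p(-rmp))


# Hypothetical inputs, chosen only for illustration.
rmp = 1e-9          # assumed random match probability
n = 100_000         # assumed number of profiles searched

dmp = database_match_probability(rmp, n)
exact = prob_at_least_one_match(rmp, n)
print(f"DMP (N x RMP):            {dmp:.6e}")
print(f"1 - (1 - RMP)^N:          {exact:.6e}")
```

With these assumed inputs the search weakens the reported statistic by a factor of N relative to the RMP alone.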
Bayesian (Donnelly/Balding) • Position: Says the ascertainment bias makes the link between suspect and crime scene DNA stronger—more probative. • Rationale: Each additional profile searched in a database eliminates another (likely) suspect, making the posterior odds of the suspect being the actual source higher than if the match were random.
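One simplified way to see the Bayesian argument is sketched below. It assumes a uniform prior over a population of P possible sources, a database holding N of them, exactly one match, no relatives and no testing errors. These are assumptions of this sketch, not details from the slides. Under them, everyone else in the database is excluded, so only the P - N untested people remain as alternative sources, and the posterior probability that the matching person is the source is 1 / (1 + (P - N) x RMP).

```python
def posterior_source_probability(rmp: float, population: int,
                                 database_size: int = 1) -> float:
    """Posterior that the single matching person is the source.

    Assumes a uniform prior over `population` people, of whom
    `database_size` were tested and all but one excluded.
    database_size=1 corresponds to testing one suspect only.
    """
    untested = population - database_size
    return 1.0 / (1.0 + untested * rmp)


# Hypothetical values for illustration only.
population = 1_000_001
rmp = 1e-6

single_suspect = posterior_source_probability(rmp, population, database_size=1)
after_search = posterior_source_probability(rmp, population, database_size=500_001)
print(f"one suspect tested:        {single_suspect:.4f}")
print(f"half the population tested: {after_search:.4f}")
```

The posterior rises as the database grows because each non-matching profile removes one alternative source, which is the sense in which this approach treats a cold hit as more probative.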
DNA Advisory Board Position Paper [Statistical and Population Genetics Issues Affecting the Evaluation of the Frequency of Occurrence of DNA Profiles Calculated From Pertinent Population Database(s). Forensic Science Communications, July 2000, Vol. 2, No. 3] • www.fbi.gov/hq/lab/fsc/backissu/july2000/dnastat.htm • “Two questions arise when a match is derived from a database search: (1) What is the rarity of the DNA profile? And (2) What is the probability of finding such a DNA profile in the database searched?” • “Here we address the latter question, which is especially important when a profile found in a database search matches the DNA profile of an evidence sample.”
DNA Advisory Board Position Paper [www.fbi.gov/hq/lab/fsc/backissu/july2000/dnastat.htm] • “When the DNA profile from a crime scene sample matches a single profile in a felon DNA database, the NRC II Report (1996) recommended the evaluation of question number 2 be based on the size of the database.” • “we continue to endorse the recommendation of the NRC II Report for the evaluation of DNA evidence from a database search.”
Dr. Bruce Budowle (affidavit dated 9/3/04; filed in US v. Jenkins) • “The FBI calculates and reports the RMP in all cases, including cold-hit cases. . . . This is because the rarity of the DNA profile within the population is typically relevant.” • “the FBI and some other facilities will additionally provide the DMP when one of the parties indicates an interest in knowing the chance of finding a particular profile after searching a specific number of profiles in a convicted felon database.”
Dr. Chakraborty’s Testimony • To present the whole picture, both RMP and DMP should be given to jurors, per DAB. • Presentation of RMP alone is not generally acceptable. • NRC I is overly conservative.
Dr. Bieber’s Testimony • Because of advances in technology to 13-locus testing, there is no longer a basis for ascertainment bias. • Reporting RMP alone is safe. • NRC I and II are overly conservative.
The Ruling • “Today there is a controversy about the method that should be used and it’s not a manufactured controversy, it’s not an insignificant one because there are esteemed members of the community on, I would say, both sides but on all the several sides of this issue.” • “While the Court believes the DAB view of NRC II is appropriate, it would be disingenuous to say there is no controversy here.”
Core issue: related individuals • RMP only gives the probability that a randomly chosen, unrelated person in a population has a given profile • Related people exist in databases (and the world in general) • Arizona DPS found 144 pairs of individuals matching at 9 or more loci in a database of 65,493 individuals
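Part of the reason partial matches accumulate is that the number of distinct pairs in a database grows roughly with the square of its size. The sketch below does that combinatorial count for the Arizona database size quoted above and backs out the per-pair rate implied by the 144 observed pairs. The function name is hypothetical, and the implied rate is only a descriptive ratio, not a published match probability.

```python
from math import comb


def n_pairs(database_size: int) -> int:
    """Number of distinct pairs of profiles in a database."""
    return comb(database_size, 2)


arizona_pairs = n_pairs(65_493)          # pairwise comparisons available
implied_rate = 144 / arizona_pairs       # observed 9+ locus pairs per comparison

print(f"pairwise comparisons: {arizona_pairs:,}")
print(f"about 1 matching pair per {1 / implied_rate:,.0f} comparisons")
```

An all-against-all comparison of this kind asks a different question from the RMP, which concerns one specified profile against one random unrelated person.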
Simulation studies • Use FBI published Caucasian genotypes • Generate database of randomized (unrelated) individuals • Create and add pairs of related individuals • Siblings • Parent-child • Half-siblings • Cousins
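A minimal sketch of this kind of simulation follows. It does not use the FBI Caucasian genotype data; instead it assumes 13 loci with 8 equally frequent alleles each, purely as a placeholder, and all function names are hypothetical. Siblings are generated by drawing two random parents per locus and letting each child inherit one allele from each parent.

```python
import random

# Placeholder allele frequencies (assumption): 13 loci, 8 equifrequent alleles.
LOCI = [(list(range(8)), [1 / 8] * 8) for _ in range(13)]


def random_genotype(rng, alleles, weights):
    """Draw two alleles independently from the population frequencies."""
    return tuple(sorted(rng.choices(alleles, weights=weights, k=2)))


def unrelated_profile(rng, loci=LOCI):
    return [random_genotype(rng, a, w) for a, w in loci]


def child_genotype(rng, mother, father):
    """A child inherits one randomly chosen allele from each parent."""
    return tuple(sorted((rng.choice(mother), rng.choice(father))))


def sibling_pair(rng, loci=LOCI):
    sib1, sib2 = [], []
    for alleles, weights in loci:
        mother = random_genotype(rng, alleles, weights)
        father = random_genotype(rng, alleles, weights)
        sib1.append(child_genotype(rng, mother, father))
        sib2.append(child_genotype(rng, mother, father))
    return sib1, sib2


def parent_child_pair(rng, loci=LOCI):
    parent, child = [], []
    for alleles, weights in loci:
        p = random_genotype(rng, alleles, weights)
        other = random_genotype(rng, alleles, weights)
        parent.append(p)
        child.append(child_genotype(rng, p, other))
    return parent, child


def matching_loci(a, b):
    """Number of loci at which two profiles have identical genotypes."""
    return sum(g1 == g2 for g1, g2 in zip(a, b))


rng = random.Random(1)
trials = 500
sib_mean = sum(matching_loci(*sibling_pair(rng)) for _ in range(trials)) / trials
unrel_mean = sum(
    matching_loci(unrelated_profile(rng), unrelated_profile(rng))
    for _ in range(trials)
) / trials
print(f"mean matching loci, siblings:  {sib_mean:.2f}")
print(f"mean matching loci, unrelated: {unrel_mean:.2f}")
```

Half-siblings and cousins could be added the same way by sharing one parent or one pair of grandparents.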
Unrelated individuals • A database of 65k has 109 pairs of individuals matching at 9+ loci
Effect of different degrees of related individuals (9+ locus matches) • Databases contain 10% related individuals • Results are averaged over 5 replicates of the database
Familial search • Database search yields a close but imperfect DNA match • Can suggest a relative is the true perpetrator • Great Britain performs them routinely • Reluctance to perform them in US since 1992 NRC report • CODIS software cannot (and will not) perform effective searches
Three approaches to familial searches • Search for rare alleles (inefficient) • Count matching alleles (arbitrary) • Likelihood ratios with kinship analyses
Example • 2003 North Carolina performed post-conviction DNA testing on evidence from a 1984 rape and murder • Exonerated Darryl Hunt, who had served 18 years of a life sentence • Database search yielded best match to Anthony Brown with 16/26 alleles • Brother Willard Brown tested and found to be a perfect match
Thresholds for similarity • Virginia: “be very, very close” • California: “appear useful” • Florida: match at least 21 out of 26 alleles • North Carolina: 16 out of 26 is enough
Is 16/26 close enough? • How many pairs of individuals match at 16+ alleles with unrelated databases of size… • 1,000: 562 pairs of individuals • 5,000: 13,872 pairs of individuals • 10,000: 52,982 pairs of individuals
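The allele-counting approach can be sketched as follows. The slides do not define how shared alleles are counted, so this sketch assumes a per-locus multiset intersection (0, 1 or 2 shared alleles per locus, 26 at most over 13 loci). The random profiles use placeholder equifrequent alleles, so the printed count is illustrative and will not reproduce the figures above.

```python
import random
from collections import Counter
from itertools import combinations


def shared_alleles(profile_a, profile_b):
    """Total alleles shared across loci (multiset intersection per locus)."""
    total = 0
    for ga, gb in zip(profile_a, profile_b):
        total += sum((Counter(ga) & Counter(gb)).values())
    return total


def random_profile(rng, n_loci=13, n_alleles=8):
    """Placeholder profile: equifrequent alleles, not real population data."""
    return [tuple(sorted(rng.choices(range(n_alleles), k=2)))
            for _ in range(n_loci)]


def count_close_pairs(profiles, threshold=16):
    """Count pairs of profiles sharing at least `threshold` alleles."""
    return sum(1 for a, b in combinations(profiles, 2)
               if shared_alleles(a, b) >= threshold)


rng = random.Random(0)
profiles = [random_profile(rng) for _ in range(200)]
print("pairs sharing 16+ alleles:", count_close_pairs(profiles))
```

Because every pair is compared, the running time grows with the square of the database size, the same growth that drives the pair counts on this slide.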
Is the true DNA match a sibling or a random individual? • Given a closely matching profile, who is more likely to match, a sibling or a randomly chosen, unrelated individual? • Use a likelihood ratio
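A likelihood ratio of this kind can be sketched with standard identity-by-descent (IBD) coefficients: full siblings share 0, 1 or 2 alleles IBD with probabilities 1/4, 1/2 and 1/4, while a parent and child always share exactly 1. The allele frequencies and function names below are hypothetical, loci are assumed independent, and this is a textbook formulation rather than the exact calculation used in the studies cited in these slides.

```python
SIBLING = (0.25, 0.5, 0.25)      # P(0, 1, 2 alleles IBD) for full siblings
PARENT_CHILD = (0.0, 1.0, 0.0)   # parent and child always share one allele IBD


def genotype_freq(g, freqs):
    """Hardy-Weinberg genotype frequency for an unrelated person."""
    a, b = g
    return freqs[a] ** 2 if a == b else 2 * freqs[a] * freqs[b]


def prob_given_one_ibd(g_rel, g_ref, freqs):
    """P(relative's genotype | reference genotype, exactly one allele IBD)."""
    target = tuple(sorted(g_rel))
    total = 0.0
    for shared in g_ref:                  # each reference allele shared w.p. 1/2
        for other, p in freqs.items():    # the other allele is a random draw
            if tuple(sorted((shared, other))) == target:
                total += 0.5 * p
    return total


def locus_lr(g_evidence, g_suspect, freqs, ibd=SIBLING):
    """LR: evidence from a relative of the suspect vs. an unrelated person."""
    k0, k1, k2 = ibd
    unrelated = genotype_freq(g_evidence, freqs)
    same = 1.0 if tuple(sorted(g_evidence)) == tuple(sorted(g_suspect)) else 0.0
    related = (k0 * unrelated
               + k1 * prob_given_one_ibd(g_evidence, g_suspect, freqs)
               + k2 * same)
    return related / unrelated


def profile_lr(evidence, suspect, locus_freqs, ibd=SIBLING):
    """Multiply per-locus LRs, assuming independent loci."""
    lr = 1.0
    for g_e, g_s, freqs in zip(evidence, suspect, locus_freqs):
        lr *= locus_lr(g_e, g_s, freqs, ibd)
    return lr


# Hypothetical single-locus allele frequencies, for illustration only.
freqs = {"a": 0.1, "b": 0.2, "c": 0.7}
print(locus_lr(("a", "b"), ("a", "b"), freqs))   # identical heterozygotes
print(locus_lr(("b", "b"), ("a", "a"), freqs))   # no alleles in common
```

A ratio above 1 favours the sibling explanation and a ratio below 1 favours an unrelated source; a locus with no shared alleles still leaves a sibling possible, but it rules out a parent or child when mutation is ignored, as it is here.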
Probabilities of siblings matching at 0, 1 or 2 alleles • Weir and NRC I only present probabilities that siblings match perfectly. HF = 1 for homozygous loci and 2 for heterozygous loci
Probabilities of parent/child matching at 0, 1 or 2 alleles • Weir and NRC I only present probabilities that parent/child match perfectly.
Other familial relationships • Cousins: • Grandparent-grandchild; aunt/uncle-nephew/niece; half-siblings: • HF = 1 for homozygous loci and 2 for heterozygous loci
Bieber’s Monte Carlo simulations • 50% of the time, a sibling has the best match in a database of 50,000 • 80% of the time, a sibling is in the top 10 matches • Investigating the relatives of people in the top 10 could increase cold hit rate from 10% to 14% • 30,000 cold hits in the U.S. to date could have been 33,000 (Bieber, Brenner and Lazer. 2006. Finding criminals through DNA of their relatives. Science. 312:1315-1316.)
Familial search experiment • Randomly pick sibling pair or unrelated pair from a synthetic database • Choose one profile to be evidence and one profile to be initial suspect • Test hypothesis: • H0: A sibling is the source of the evidence • HA: An unrelated person is the source of the evidence (Paoletti, D., Doom, T., Raymer, M. and Krane, D. 2006. Assessing the implications for close relatives in the event of similar but non-matching DNA profiles. Jurimetrics, 46:161-175.)
Type I and II errors with a population size of 1,000,000 and non-cognate allele frequencies
Is the true DNA match a relative or a random individual? • What is the likelihood that the source of the evidence sample was a relative of an initial suspect? • What is the size of the alternative suspect pool? • What is an acceptable rate of false positives?
Limiting the size of the alternative suspect pool • Pre-screening with Y-STRs (only useful for male offenders, $40 per sample, almost $12M for VA to get an additional couple hundred investigative leads per year) • Driven by casework rather than database administrators (workshops aimed at educating attorneys and investigators)
Dr. Fred Bieber (leading proponent of searches) “We’ve been doing familial searches for years. The difference between investigating identical twins and other siblings is just a matter of degree.”
Dr. Fred Bieber (leading proponent of searches) • Familial searches create “a new category of people . . . under lifetime genetic surveillance.” • “Its composition would reflect existing demographic disparities in the criminal justice system.” • “Familial searches potentially amplify these existing disparities.” (Bieber, Brenner and Lazer. 2006. Finding criminals through DNA of their relatives. Science. 312:1315-1316.)
6th Annual Conference Dayton, OH August 17-19, 2006 See web page for details Dan Krane Forensic Bioinformatics www.bioforensics.com