Inferring the Number of Contributors to Mixed DNA Profiles

David Paoletti

STR Samples • Each parent contributes one allele • Seeing one or two alleles at a locus implies at least one contributor • Three or four alleles means at least 2 contributors • Etc.
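The allele-counting rule above can be sketched in a few lines of Python. This is only an illustration: the function name `min_contributors` and the sample allele calls are made up for the example, and the rule gives a lower bound, not the true number of contributors.

```python
from math import ceil

def min_contributors(alleles_per_locus):
    """Lower bound on contributors: one person carries at most two alleles per locus."""
    most = max(len(set(alleles)) for alleles in alleles_per_locus.values())
    return max(1, ceil(most / 2))

# Hypothetical three-locus sample; the allele calls are illustrative only.
sample = {
    "D3S1358": {15, 16, 17},
    "vWA": {14, 16},
    "FGA": {20, 22, 23, 24},
}
print(min_contributors(sample))  # 2, because the busiest locus shows four alleles
```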

Crime Scene Sample • What is the actual number of contributors? • Counting alleles may be misleading • More than 3% of 3-contributor mixtures appear to be from 2 individuals • With 4 contributors, 75% or more of the mixtures can appear to come from fewer individuals

Bayesian Bottleneck • We would like to be able to say something like “There is a 90% chance that this sample contains 3 individuals, not 2” • However, we do not know the priors, i.e. the actual percentage of crime scene samples that have a single contributor, two contributors, etc. • Criminals have no incentive to help us establish them

Create All Possible Mixtures • Using a computer, consider every possible mixture that could have produced the crime scene sample • Do this for 2 contributors • Compute the probability for each potential mixture (using allele frequencies) • Sum up the probabilities for all mixtures • We refer to this as the Probabilistic Mixture Model, or PMM
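The enumeration described above can be sketched as follows. This is a minimal sketch, not the PMM tool itself: it assumes alleles are drawn independently in proportion to their population frequencies and that every allele present is detected. The names `locus_probability` and `profile_probability` and the frequency values are hypothetical.

```python
from collections import Counter
from itertools import combinations_with_replacement
from math import factorial, prod

def locus_probability(observed, freqs, n_contributors):
    """P(exactly the observed allele set | n contributors) at one locus."""
    observed = sorted(observed)
    total = 2 * n_contributors            # two alleles per contributor
    extra = total - len(observed)         # slots left over for duplicates
    if extra < 0:
        return 0.0                        # too many distinct alleles for n people
    prob = 0.0
    # Each observed allele appears once; try every way to fill the duplicate slots.
    for dupes in combinations_with_replacement(observed, extra):
        counts = Counter(observed)
        counts.update(dupes)
        orderings = factorial(total)      # distinct orderings of this multiset
        for c in counts.values():
            orderings //= factorial(c)
        prob += orderings * prod(freqs[a] ** c for a, c in counts.items())
    return prob

def profile_probability(profile, freq_table, n_contributors):
    """Multiply the per-locus probabilities (assumes independent loci)."""
    return prod(locus_probability(obs, freq_table[locus], n_contributors)
                for locus, obs in profile.items())

# Hypothetical single-locus frequency table and observed allele set.
freq_table = {"locusA": {7: 0.1, 8: 0.5, 9: 0.4}}
profile = {"locusA": {7, 8, 9}}
print(profile_probability(profile, freq_table, 2))
print(profile_probability(profile, freq_table, 3))
```

Summing the result over every possible observed allele set gives 1 for a fixed number of contributors, which is a quick sanity check on the enumeration.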

Example • Call the available alleles at a locus 7, 8, 9, 10, 11, 12 • Assume 3 contributors (6 alleles in total), with 4 unique alleles and 2 duplicates • Assume we’ve chosen the unique alleles to be 7, 8, 9, 10 • Assume that as duplicates we’ve chosen 8, 8 • The probability of seeing this is: p7 × p8 × p9 × p10 × p8 × p8 × permutations = 0.00258 × 0.54381 × 0.11856 × 0.03866 × 0.54381 × 0.54381 × 120 = 0.0002282 = 0.02282%, where 120 = 6!/3! is the number of distinct orderings of the six alleles • Not very likely, but u = 4 (four unique alleles) can occur in many different ways: (7,8,9,10,7,7), (7,8,9,10,7,8), …, (7,8,9,10,8,8), …, (9,10,11,12,12,12)
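The arithmetic in this example can be checked with a short script. The frequencies are the ones quoted on the slide; the variable names are chosen for illustration.

```python
from collections import Counter
from math import factorial, prod

# Allele frequencies quoted in the example.
freqs = {7: 0.00258, 8: 0.54381, 9: 0.11856, 10: 0.03866}
alleles = [7, 8, 9, 10, 8, 8]   # four unique alleles plus the duplicates 8, 8

# Distinct orderings of the six alleles: 6! divided by the factorial of each repeat count.
counts = Counter(alleles)
permutations = factorial(len(alleles))
for c in counts.values():
    permutations //= factorial(c)

probability = permutations * prod(freqs[a] for a in alleles)
print(permutations)             # 120
print(f"{probability:.7f}")     # 0.0002282
```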

Comparing Probabilities • Repeat for a different number of contributors • Compare any two as a likelihood ratio; for example: LR = P(profile | 2 contributors) / P(profile | 3 contributors)

What does LR Mean? • Suppose from the previous example that the LR was 25 • This means that the observed profile is 25 times more likely if the true number of contributors is 2 than if it is 3
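An LR is not the same thing as the probability that there are 2 contributors. The sketch below shows why the unknown priors matter: the same LR of 25 gives very different posterior probabilities under different priors. The prior values are hypothetical, not measured, and the function name `posterior_probability` is made up for the example.

```python
def posterior_probability(lr, prior_two, prior_three):
    """P(2 contributors | profile) when only 2 and 3 contributors are considered."""
    odds = lr * prior_two / prior_three   # posterior odds = LR x prior odds
    return odds / (1 + odds)

lr = 25.0
# Hypothetical priors for "2 contributors"; the real values are unknown.
for prior_two in (0.5, 0.1, 0.01):
    p = posterior_probability(lr, prior_two, 1 - prior_two)
    print(f"prior {prior_two:.2f} -> posterior {p:.3f}")
```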

Verifying this Approach • Create 2-person mixtures • Create 3-person mixtures that appear (by allele counting) to be a mixture of two individuals
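One way to build such test mixtures is by simulation. The sketch below is an assumption-laden stand-in: it draws alleles at random from a hypothetical frequency table, whereas the talk may have combined real profiles from the FBI dataset. It reports how often a mixture appears, by allele counting, to have fewer contributors than it really has.

```python
import random
from math import ceil

def simulate_mixture(freq_table, n_contributors, rng):
    """Draw two alleles per person per locus; return the allele set seen at each locus."""
    mixture = {}
    for locus, freqs in freq_table.items():
        alleles = list(freqs)
        weights = list(freqs.values())
        drawn = rng.choices(alleles, weights=weights, k=2 * n_contributors)
        mixture[locus] = set(drawn)
    return mixture

def apparent_contributors(mixture):
    """Minimum number of contributors implied by allele counting."""
    return max(1, ceil(max(len(a) for a in mixture.values()) / 2))

def masked_fraction(freq_table, n_contributors, trials=10_000, seed=0):
    """Fraction of simulated mixtures that look like fewer contributors than they have."""
    rng = random.Random(seed)
    masked = sum(
        apparent_contributors(simulate_mixture(freq_table, n_contributors, rng)) < n_contributors
        for _ in range(trials)
    )
    return masked / trials

# Hypothetical two-locus frequency table, for illustration only.
hypothetical_freqs = {
    "locusA": {7: 0.05, 8: 0.45, 9: 0.30, 10: 0.15, 11: 0.05},
    "locusB": {12: 0.25, 13: 0.25, 14: 0.20, 15: 0.20, 16: 0.10},
}
print(masked_fraction(hypothetical_freqs, 3))
```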

Actual 2-person Mixtures

3-person Mixtures that appear to have only 2 Contributors

Adjusting the Threshold • On the previous charts, the decision was based on comparing the likelihood ratio (LR) to a threshold of 1.0 • Suppose you want to be sure that you’re making the correct decision, and decide that the LR must be higher
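A thresholded decision might look like the sketch below. How LR values that fall short of the threshold are handled is an assumption here: this version treats the threshold symmetrically and reports "inconclusive" in between, which may not be what the PMM tool does. The function name `decide` is hypothetical.

```python
def decide(lr, threshold=1.0):
    """Classify a 2-vs-3 contributor LR, where LR = P(profile | 2) / P(profile | 3)."""
    if lr >= threshold:
        return "2 contributors"
    if lr <= 1.0 / threshold:             # assumed symmetric rule for the other hypothesis
        return "3 contributors"
    return "inconclusive"                 # assumed outcome when neither side is strong enough

for lr in (25.0, 0.5, 0.005):
    print(lr, decide(lr, threshold=1.0), "|", decide(lr, threshold=100.0))
```

With a threshold of 1.0 every sample gets a call; raising the threshold trades some of those calls for fewer wrong ones.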

Effect of Changing LR Threshold

PMM Demonstration • Three profiles from the publicly available FBI Dataset • http://www.fbi.gov/hq/lab/fsc/backissu/july1999/dnaloci.txt • Sample ID numbers 2000, 2017, B0670

Conclusions • The PMM seldom predicts more contributors than the sample contains • The PMM is much better than simple allele counting • Using cognate frequencies produces better results

Future Work • Compare cognate to non-cognate prediction ability • Modify the approach for cases where one contributor’s sample is known • Combine with other approaches (that use peak height or area) for a consensus decision

Tool and Contact Info • www.personal.psu.edu/drp15/tools/pmm/ • Email: drp15@psu.edu

References • David R. Paoletti, Travis E. Doom, Michael L. Raymer, and Dan E. Krane, Inferring the Number of Contributors to Mixed DNA Profiles, IEEE-ACM Transactions on Computational Biology and Bioinformatics (in preparation) • David R. Paoletti, Travis E. Doom, Michael L. Raymer, and Dan E. Krane, Assessing the Implications for Close Relatives in the Event of Similar but Nonmatching DNA Profiles, Jurimetrics, 46(2), Winter 2006, pg. 161–175. • David R. Paoletti, Travis E. Doom, Carissa M. Krane, Michael L. Raymer, and Dan E. Krane, Empirical Analysis of the STR Profiles Resulting from Conceptual Mixtures, Journal of Forensic Sciences, 50(6), November 2005, pg. 1361–1366. • Bruce Budowle and Tamyra R. Moretti, Genotype Profiles for Six Population Groups at the 13 CODIS Short Tandem Repeat Core Loci and Other PCR-Based Loci, http://www.fbi.gov/hq/lab/fsc/backissu/july1999/dnaloci.txt
