50 likes | 170 Vues
This simulation example demonstrates the use of probabilistic matching for record linkage in statistical analysis. By imputing correct variable values using estimated match probabilities, distributions of variable values can be computed across candidate records. The data tables show standard linkage estimates, multiple imputation results, and prior-informed imputation estimates for Time to Infection in informative matching scenarios.
E N D
Record linkage results Simulation example
Following probabilistic matching: For statistical analysis we require that correct variable value is carried into file of interest (FOI) Among candidate record matches we have (estimated) probabilities of a correct match and for each record the associated variable value This allows us to compute across all candidate records a distribution of the values of the variable This probability distn, for each record in the FOI (where match is equivocal) is then treated as a ‘prior’ and this is sampled from to ‘impute’ a value
Table 4: Standard probabilistic record linkage estimates for Time to infection with informative matching and different threshold choices. Mean standard errors in brackets. Number of simulated datasets = 100.
Table 5: Standard multiple imputation estimates for Time to infection with informative matching. Mean standard errors in brackets. Number of simulated datasets = 100.
Table 6: Prior informed multiple imputation estimates for Time to infection with informative matching. Mean standard errors in brackets. Number of simulated datasets = 100.