2009 Face-to-Face RPUG Meeting
San Diego, California
June 17, 2009

Link Plus Version 3 Update

David Gu (dgu@cdc.gov), CDC/NPCR Contractor
Kathleen Thoburn (kthoburn@cdc.gov), CDC/NPCR Contractor
Joe Rogers (jrogers@cdc.gov), CDC

Division of Cancer Prevention and Control
National Center for Chronic Disease Prevention and Health Promotion
Coordinating Center for Health Promotion
Centers for Disease Control and Prevention
Atlanta, Georgia

Link Plus Software
• Stand-alone probabilistic record linkage program
• Combines ease of use and statistical sophistication
• Detects duplicates within a data file, or links two data files together
• Supports fixed-width files, delimited files, and North American Association of Central Cancer Registries (NAACCR) files
• Provides powerful support for manual review of uncertain matches
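
For readers new to probabilistic record linkage, the following is a minimal sketch of the textbook Fellegi-Sunter scoring idea, in which each compared field adds a log-likelihood-ratio weight to the pair's score. The slides do not describe how Link Plus computes its scores, so the function names, field names, and m/u probabilities below are all illustrative assumptions.

```python
import math


def field_weight(agrees, m, u):
    """Log2 likelihood-ratio weight for one compared field.

    m = P(field agrees | the two records are a true match)
    u = P(field agrees | the two records are not a match)
    """
    if agrees:
        return math.log2(m / u)              # positive when m > u
    return math.log2((1 - m) / (1 - u))      # negative when m > u


def linkage_score(rec_a, rec_b, params):
    """Sum the field weights over every field listed in params."""
    return sum(
        field_weight(rec_a[field] == rec_b[field], m, u)
        for field, (m, u) in params.items()
    )


# Hypothetical m/u values, chosen only for illustration.
PARAMS = {
    "last_name": (0.95, 0.01),
    "birth_date": (0.97, 0.003),
    "sex": (0.99, 0.5),
}

if __name__ == "__main__":
    a = {"last_name": "SMITH", "birth_date": "1950-02-01", "sex": "F"}
    b = {"last_name": "SMYTH", "birth_date": "1950-02-01", "sex": "F"}
    print(linkage_score(a, a, PARAMS), linkage_score(a, b, PARAMS))
```

Pairs whose score falls between a clear-match and a clear-non-match threshold are the "uncertain matches" that would go to manual review.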

Link Plus V3 Enhancements
• Removes the size limitation on File 2 (File 1 is still limited to about 4.5-4.8 million records)
• Users can choose whether to write all potential matches, or only the matches with the highest score, to the linkage report
• Accepts various date formats for date comparison
• Accepts quoted field values from delimited files
• "Confirmation-like" method for address variables: agreement contributes positively to the linkage score, but disagreement carries zero weight
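
The "confirmation-like" behaviour for address variables can be sketched as a one-sided weight: agreement adds evidence, while disagreement (or a missing value) adds nothing instead of subtracting. This is a hedged illustration, not the Link Plus implementation; the function names, the normalization step, and the m/u parameters are assumptions.

```python
import math


def normalize(value):
    """Upper-case and collapse whitespace so trivial differences are ignored."""
    return " ".join(value.upper().split())


def confirmation_weight(value_a, value_b, m, u):
    """One-sided weight: positive on agreement, zero otherwise.

    m and u are the usual agreement probabilities among true matches
    and non-matches (hypothetical values supplied by the caller).
    """
    if not value_a or not value_b:
        return 0.0                      # missing data carries no evidence
    if normalize(value_a) == normalize(value_b):
        return math.log2(m / u)         # agreement raises the score
    return 0.0                          # disagreement is not penalized


if __name__ == "__main__":
    print(confirmation_weight("12 Main St", "12 main st ", 0.8, 0.01))
    print(confirmation_weight("12 Main St", "99 Oak Ave", 0.8, 0.01))
```

A design like this presumably reflects that people move, so two records for the same person can legitimately carry different addresses.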

Link Plus V3 Enhancements
• Provides an SSN-like matching method for generic IDs
• Incorporates phonetic codes into name matching methods
• New name matching method that is more robust against outlier or misspelled names (yields a more robust linkage score, which should eventually allow the cutoff value to be determined automatically in production mode)
• Name matching methods for multiple names
• Users can provide their own name frequency files for use by name matching methods
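
To show what a phonetic code contributes to name matching, here is a small sketch of American Soundex, one widely used phonetic code. The slides do not say which phonetic code Link Plus uses, so the choice of Soundex and the helper names `soundex` and `names_sound_alike` are assumptions for illustration.

```python
# Map each consonant to its Soundex digit; vowels, H, W and Y have no digit.
_CODES = {}
for _letters, _digit in (
    ("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
    ("L", "4"), ("MN", "5"), ("R", "6"),
):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(name):
    """Return the 4-character American Soundex code, or "" for an empty name."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue                    # H and W do not separate equal codes
        code = _CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code                     # a vowel resets prev to ""
    return (result + "000")[:4]         # pad or truncate to 4 characters


def names_sound_alike(name_a, name_b):
    """True when both names are non-empty and share a Soundex code."""
    code = soundex(name_a)
    return code != "" and code == soundex(name_b)


if __name__ == "__main__":
    for n in ("Smith", "Smyth", "Robert", "Rupert"):
        print(n, soundex(n))
```

A linkage program could treat a phonetic agreement as partial agreement when the exact spellings differ, which is one way misspelled names can still contribute to the score.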

Link Plus V3 Enhancements

Manual Review
• Can automatically assign non-match status to the current view based on previous non-match results
• Option that allows users to assign match status by score without overwriting existing match statuses

Export
• Users can export the results of manual review to a NAACCR format file
• Users can save export settings and layouts
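
The "assign by score without overwriting" option can be pictured as follows. This is only a sketch under stated assumptions: the record structure, the `status` and `score` keys, the status labels, and the single cutoff are hypothetical and are not taken from Link Plus.

```python
def assign_status_by_score(pairs, cutoff, overwrite=False):
    """Set each pair's status from its score.

    With overwrite=False, pairs that already carry a status (for example,
    one set by a reviewer) are left untouched.
    """
    for pair in pairs:
        if pair.get("status") is not None and not overwrite:
            continue                    # keep the existing decision
        pair["status"] = "match" if pair["score"] >= cutoff else "non-match"
    return pairs


if __name__ == "__main__":
    pairs = [
        {"score": 18.2, "status": None},
        {"score": 4.1, "status": None},
        {"score": 6.0, "status": "match"},   # reviewed earlier by hand
    ]
    for p in assign_status_by_score(pairs, cutoff=10.0):
        print(p)
```

Keeping `overwrite` off by default means a bulk score-based pass cannot silently undo manual review decisions.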

Link Plus Future Development
• Allow CRS Plus users to select additional variables for manual review and export
• Develop an API so that Link Plus can be called from other software
• Develop additional features to enable use in production mode, including pre-analysis to select the most effective cutoff
• Write papers (including research on record linkage methods)

CDC–NPCR Link Plus Contacts
Kathleen K. Thoburn, CDC/NPCR Contractor
E-mail: kthoburn@cdc.gov
David Gu, CDC/NPCR Contractor
E-mail: dgu@cdc.gov
Tom Rawson, CDC Computer Programmer

Thank you

The findings and conclusions in this presentation are those of the author(s) and do not necessarily represent the views of the Centers for Disease Control and Prevention.

David Gu
770-488-3178
dgu@cdc.gov