
Data Linkage Project

Data Linkage Project. Florida’s Newborn Screening Program. Gary Sammet, Bureau of Vital Statistics. Outline: Data Linkage Approach; Start with Probabilistic Linking; Data Linkage Automated Process Flow; Data Processing Design: Linking Variables, Weights, Bonuses, Use of Jaro-Winkler.


Presentation Transcript


  1. Data Linkage Project • Florida’s Newborn Screening Program • Gary Sammet, Bureau of Vital Statistics

  2. Outline • Data Linkage Approach • Start with Probabilistic Linking • Data Linkage Automated Process Flow • Data Processing Design: Linking Variables, Weights, Bonuses, Use of Jaro-Winkler • Data Processing Sample Results

  3. Data Linkage Approach • VS & LAB work closely together • System can accommodate needs • Reduce duplication of efforts • Reconciliation • All births have a screening record • All screening records have a birth • Most cost effective with best results
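The two reconciliation goals on this slide (every birth has a screening record, every screening record has a birth) can be read as two set differences over the linked pairs. A minimal sketch, assuming hypothetical identifier values and a hypothetical `linked_pairs` list; the function name and data layout are illustrative, not taken from the presentation:

```python
def reconcile(birth_ids, screening_ids, linked_pairs):
    """Return (births with no screening record, screenings with no birth)."""
    linked_births = {birth for birth, _ in linked_pairs}
    linked_screens = {screen for _, screen in linked_pairs}
    births_missing_screen = sorted(set(birth_ids) - linked_births)
    screens_missing_birth = sorted(set(screening_ids) - linked_screens)
    return births_missing_screen, screens_missing_birth


# Illustrative identifiers only, not real data.
births = ["SFN-001", "SFN-002", "SFN-003"]
screens = ["SPEC-A", "SPEC-B", "SPEC-C"]
links = [("SFN-001", "SPEC-A"), ("SFN-002", "SPEC-B")]

unscreened, unmatched = reconcile(births, screens, links)
print(unscreened)  # ['SFN-003']
print(unmatched)   # ['SPEC-C']
```

Both lists being empty would correspond to full reconciliation in both directions.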

  4. Start With Probabilistic Linking • Identify linking variables - assign initial weight based on understanding & experience w/data • Run initial linking - sort by weight & display linkage flags to see data patterns/anomalies • Adjust weights as needed w/o changing code • Define deterministic rules to ensure consistent linking in automated process
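The steps on this slide can be sketched as a scoring function whose weights live in a data table, so they can be adjusted without touching the scoring code. This is a minimal sketch under stated assumptions: the field names and weight values below are hypothetical (the actual values on the "Linking Variables & Weights" slides are not reproduced in this transcript), and agreement is simplified to exact equality.

```python
# Hypothetical weights, kept as data so they can be tuned without code changes.
WEIGHTS = {
    "dob": 1.0,
    "sex": 0.25,
    "facility": 0.5,
    "zipcode": 0.5,
    "mother_first": 0.5,
}


def score_pair(lab, vs, weights=WEIGHTS):
    """Sum the weights of the fields that agree; return (score, linkage flags)."""
    flags = {}
    for field in weights:
        a, b = lab.get(field), vs.get(field)
        flags[field] = a is not None and a == b
    score = sum(w for field, w in weights.items() if flags[field])
    return score, flags


def rank_candidates(lab, vs_records, weights=WEIGHTS):
    """Score every VS candidate for one LAB record, highest weight first."""
    scored = []
    for vs in vs_records:
        score, flags = score_pair(lab, vs, weights)
        scored.append({"score": score, "flags": flags, "vs": vs})
    return sorted(scored, key=lambda row: row["score"], reverse=True)


# Illustrative records only, not real data.
lab = {"dob": "2010-12-05", "sex": "F", "facility": "A",
       "zipcode": "32801", "mother_first": "ANA"}
vs1 = {"id": 1, "dob": "2010-12-05", "sex": "F", "facility": "A",
       "zipcode": "32801", "mother_first": "ANNA"}
vs2 = {"id": 2, "dob": "2010-12-05", "sex": "M", "facility": "B",
       "zipcode": "33101", "mother_first": "ANA"}

ranked = rank_candidates(lab, [vs2, vs1])
for row in ranked:
    print(row["vs"]["id"], row["score"], row["flags"])
```

Printing the flags next to the sorted scores is one way to eyeball the data patterns and anomalies the slide mentions before fixing deterministic rules.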

  5. Data Linkage Automated Process

  6. Linking Variables & Weights

  7. Linking Variables & Weights

  8. Weight Bonuses • DOB, Time of Birth, Sex, Facility + Zipcode (MFirst or MSSN): BONUS = .50 • DOB, Time of Birth, Sex, Facility-JW + Zipcode (MFirst or MSSN): BONUS = .40 • DOB, Time of Birth, Sex, Facility + Zipcode: BONUS = .20 • DOB, Time of Birth, Sex, Facility-JW + Zipcode: BONUS = .15
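One way to read the four bonus tiers is as a lookup that returns the highest applicable bonus. The sketch below rests on assumptions the slide does not state: that "Facility-JW" means the facility names agree only under Jaro-Winkler rather than exactly, that "(MFirst or MSSN)" means agreement on mother's first name or mother's SSN, and that only one bonus is applied per pair. The flag names are hypothetical.

```python
def bonus(flags):
    """Return the highest applicable bonus for one candidate pair.

    `flags` maps hypothetical agreement names to booleans.
    """
    core = (flags["dob"] and flags["time_of_birth"]
            and flags["sex"] and flags["zipcode"])
    if not core:
        return 0.0
    # Assumed reading of "(MFirst or MSSN)": either mother field agrees.
    mother = flags["mother_first"] or flags["mother_ssn"]
    if flags["facility_exact"]:
        return 0.50 if mother else 0.20
    if flags["facility_jw"]:
        return 0.40 if mother else 0.15
    return 0.0


# Illustrative flag set: everything agrees except mother's SSN,
# and the facility agrees only under Jaro-Winkler.
example = {
    "dob": True, "time_of_birth": True, "sex": True, "zipcode": True,
    "mother_first": True, "mother_ssn": False,
    "facility_exact": False, "facility_jw": True,
}
print(bonus(example))  # 0.4
```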

  9. Variables By % Linked

  10. Variables By % Linked

  11. Linking With Jaro-Winkler • With Exact Facility + Zipcode Match: 41% (Facility & Zipcode must match) • With Jaro-Winkler Facility + Zipcode Match: additional 36.84% • Total Match = 77.84% vs. just 41%
      Examples:
      LAB FACILITY NAME               VS FACILITY NAME
      FLORIDA HOSP ORLANDO – LAB      FLORIDA HOSP ORLANDO
      SHANDS AT THE UNIV OF FLA       SHANDS AT UF
      BROWARD MED CTR                 BROWARD MEDICAL CENTER
      SHANDS AT JACKSONVILLE          SHANDS JACKSONVILLE
      HOLLYWOOD BIRTH CENTER, INC     HOLLYWOOD BIRTH CENTER
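For readers unfamiliar with the metric, the following is a minimal pure-Python sketch of the standard Jaro-Winkler similarity, applied to the facility-name examples above. It is not the presenters' implementation; the prefix scale of 0.1 and the 4-character prefix cap are the commonly used defaults, and any cut-off for calling two names a match would be a separate choice that the slide does not give.

```python
def jaro(s1, s2):
    """Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters match only if they are equal and within this distance.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order / 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1, s2, prefix_scale=0.1):
    """Jaro similarity boosted for a shared prefix of up to 4 characters."""
    j = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return j + prefix * prefix_scale * (1 - j)


# Facility-name pairs from the slide (LAB name, VS name).
pairs = [
    ("FLORIDA HOSP ORLANDO – LAB", "FLORIDA HOSP ORLANDO"),
    ("SHANDS AT THE UNIV OF FLA", "SHANDS AT UF"),
    ("BROWARD MED CTR", "BROWARD MEDICAL CENTER"),
    ("SHANDS AT JACKSONVILLE", "SHANDS JACKSONVILLE"),
    ("HOLLYWOOD BIRTH CENTER, INC", "HOLLYWOOD BIRTH CENTER"),
]
for lab_name, vs_name in pairs:
    print(f"{jaro_winkler(lab_name, vs_name):.3f}  {lab_name!r} vs {vs_name!r}")
```

Because the Winkler adjustment rewards a shared prefix, names that differ only by a trailing suffix or an abbreviation later in the string tend to score high, which fits the kind of variation shown in the examples.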

  12. Linking Mother Address & City • Only 16% match on exact mother address & city • Additional 56% match on mother address & city, using Jaro-Winkler • Total Match: 72% vs. just 16%
      Examples:
      LAB Mother Address      VS Mother Address       LAB City      VS City
      2323 SAMSON ROAD        2323 SAMSON RD          ORLANDO       ORLANDO
      5105 NE 75TH AVE        5105 NE 75 AVENUE       MIAMI         MIAMI
      1001 MAIN ST APT A      1001 MAIN ST APT A      KEY WEST      KEY WEST
      532 HORNET CT           532 HORNET COURT        PENSACOLA     PENSACOLA
      101 MAGIC CIR           101 MAGIC CIRCLE        TAMPA         TAMPA

  13. Data Processing Results • LAB Data with DOB 12/1 – 12/31/2010, Unduplicated on OrigSpecID: 9,211 rows • VS Data with DOB 11/1 – 12/31/2010, Unduplicated on State File Number: 37,741 rows • 99% Unduplicated & Linked Records with weighted score > 2.5
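The processing described on this slide (unduplicate on a key, then count records whose weighted score exceeds 2.5) could be sketched as follows. The 2.5 threshold and the `OrigSpecID` key come from the slide; the function names, the keep-first deduplication rule, and the `best_scores` mapping are assumptions made for illustration.

```python
LINK_THRESHOLD = 2.5  # from the slide: linked records have weighted score > 2.5


def dedupe(records, key):
    """Keep the first record seen for each key value (assumed rule)."""
    seen = {}
    for rec in records:
        seen.setdefault(rec[key], rec)
    return list(seen.values())


def linked_rate(lab_records, best_scores, threshold=LINK_THRESHOLD):
    """Fraction of unduplicated LAB records whose best score exceeds threshold.

    `best_scores` is a hypothetical mapping from OrigSpecID to the highest
    weighted score found among that record's VS candidates.
    """
    unique = dedupe(lab_records, "OrigSpecID")
    if not unique:
        return 0.0
    linked = [r for r in unique
              if best_scores.get(r["OrigSpecID"], 0.0) > threshold]
    return len(linked) / len(unique)


# Illustrative records only, not real data.
lab_rows = [{"OrigSpecID": "A"}, {"OrigSpecID": "A"}, {"OrigSpecID": "B"},
            {"OrigSpecID": "C"}, {"OrigSpecID": "D"}]
scores = {"A": 3.1, "B": 2.5, "C": 4.0, "D": 2.75}

print(len(dedupe(lab_rows, "OrigSpecID")))  # 4
print(linked_rate(lab_rows, scores))        # 0.75
```

Record "B" is not counted as linked because the comparison is strictly greater than the threshold, matching the "> 2.5" wording on the slide.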

  14. Overall Linkage Results • 98–99% using back-end approach • Still not good enough • Follow Rhode Island front-end approach

  15. Advantages of Front-end Linkage • Provide real-time linkage at hospital with VS Birth Date & NBS demographic data • Reduces data entry by hospital staff • Provide daily report of unlinked/missing records • Provide LAB w/checklist of incoming blood specimens • Reduce follow-up by state staff to hospitals • Allow end-users (hospitals, MDs) ability to view electronic patient reports/results in real-time

  16. Acknowledgements • Ken Jones, Bureau Chief/Deputy State Registrar, Bureau of Vital Statistics • Sharon Dover, Operations Manager, Bureau of Vital Statistics • Paula Stewart, Database Analyst, Health Statistics & Assessment
