This document discusses the quality of contact data within the RIPE database, highlighting issues with domain and person objects frequently used by ccTLDs. It examines the effectiveness of previous data clean-ups and the ongoing need for consistency fixes and deletions of unnecessary information. Emphasis is placed on the reliability of email addresses as primary contact points, noting that a substantial percentage of inetnum objects lack reachable contacts. Recommendations include mandating email addresses and improving checks for reachability upon updates.
Contact Data in the RIPE Database Shane Kerr RIPE NCC <shane@ripe.net>
Background & Goal • Certain kinds of data have caused problems • Domain objects (heavy use by ccTLDs) • Person objects (heavy use by ccTLDs, etc.) • Cleanups have been made in the past • Consistency fixes • Deletions of unnecessary data, one-time and ongoing • Small numbers of “inconsistencies” are not a problem • Goal: perform some measurement of data quality
Contact Data • Contacts are: • Referenced by resources recorded in the Database • Administrative or technical • Contacts have: • Name • Postal address • Phone number • E-mail address
Focus on e-mail • Name impossible to check • Postal address/phone number difficult to check • E-mail possible • Sadly optional for person objects
Checking the addresses • Unique e-mail addresses extracted (about 500,000 in all) • Syntax check to remove garbage and bad TLDs • Unique domains extracted (about 280,000 in all) • DNS checked • Algorithm from RFC 2821 • MX lookups, with fallback to A lookups • SMTP checked • VRFY unreliable • Use RSET, MAIL, RCPT for each e-mail address • Minimise connections (only 140,000 unique IP addresses)
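The steps on this slide can be sketched roughly as follows. This is a minimal illustration, not the code used for the measurement: the regular expression, the truncated `KNOWN_TLDS` set, and the function names `is_plausible`, `group_by_domain`, and `probe` are all assumptions made for the example. The DNS step (MX lookup with fallback to A) is left out because it needs a resolver library. `probe` expects an object with the `rset`, `mail`, and `rcpt` methods of Python's `smtplib.SMTP`, already connected and past HELO, so one connection can test many addresses.

```python
import re
from collections import defaultdict

# Deliberately loose pattern (an assumption, not the talk's actual check):
# one "@", no whitespace, dotted domain ending in an alphabetic TLD.
EMAIL_RE = re.compile(r"^[^@\s]+@([A-Za-z0-9-]+\.)+[A-Za-z]{2,}$")

# Hypothetical, truncated TLD list; a real run would load the full list.
KNOWN_TLDS = {"net", "org", "com", "nl", "de", "uk"}


def is_plausible(address):
    """Syntax check: reject garbage and addresses with an unknown TLD."""
    if not EMAIL_RE.match(address):
        return False
    tld = address.rsplit(".", 1)[1].lower()
    return tld in KNOWN_TLDS


def group_by_domain(addresses):
    """Deduplicate plausible addresses and group them by lower-cased domain."""
    by_domain = defaultdict(set)
    for address in addresses:
        address = address.strip()
        if is_plausible(address):
            local, domain = address.rsplit("@", 1)
            domain = domain.lower()
            by_domain[domain].add(local + "@" + domain)
    return by_domain


def probe(conn, sender, recipients):
    """Test each recipient in one SMTP session using RSET, MAIL, RCPT.

    No message is sent: the session never reaches DATA.
    Returns {recipient: bool}, True when RCPT got a 2xx reply.
    """
    results = {}
    for recipient in recipients:
        conn.rset()                      # clear state from the previous test
        code, _ = conn.mail(sender)      # MAIL FROM:<sender>
        if not 200 <= code < 300:
            results[recipient] = False
            continue
        code, _ = conn.rcpt(recipient)   # RCPT TO:<recipient>
        results[recipient] = 200 <= code < 300
    return results
```

A rejected RCPT is a fairly strong sign that the address cannot be reached, while an accepted one only shows that the server did not refuse it at this stage.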
Interpreting the Results • 20% of e-mail addresses can never be reached • The other 80% may still fail • Depends on mail software and configuration • Impossible to check further without delivering mail • Even delivered mail may never be read
inetnum results • [Charts: inetnum objects and IP addresses, each split into reachable, non-reachable, and no e-mail] • A significant percentage of inetnum objects have no valid e-mail address. • A much smaller percentage of actual IP addresses has no valid e-mail address, but still a significant amount. • Most of these are because the “e-mail:” attribute is optional in the person object.
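The gap between the per-object and per-address figures comes from weighting: a large inetnum counts once as an object but many times as addresses. A small sketch of the two ways of counting follows. The `SAMPLE` ranges and their statuses are invented for illustration and are not the talk's data, and `range_size` and `shares` are hypothetical helper names.

```python
import ipaddress

# Invented sample, NOT the measured data:
# (first address, last address, contact status)
SAMPLE = [
    ("192.0.2.0", "192.0.2.255", "reachable"),
    ("198.51.100.0", "198.51.100.7", "no e-mail"),
    ("203.0.113.0", "203.0.113.15", "non-reachable"),
    ("10.0.0.0", "10.0.3.255", "reachable"),
]


def range_size(first, last):
    """Number of addresses in an inclusive first-last range."""
    return int(ipaddress.ip_address(last)) - int(ipaddress.ip_address(first)) + 1


def shares(inetnums):
    """Return (share by object count, share by address count) per status."""
    by_object = {}
    by_address = {}
    for first, last, status in inetnums:
        by_object[status] = by_object.get(status, 0) + 1
        by_address[status] = by_address.get(status, 0) + range_size(first, last)
    total_objects = sum(by_object.values())
    total_addresses = sum(by_address.values())
    object_share = {s: n / total_objects for s, n in by_object.items()}
    address_share = {s: n / total_addresses for s, n in by_address.items()}
    return object_share, address_share


object_share, address_share = shares(SAMPLE)
for status in object_share:
    print(f"{status}: {object_share[status]:.1%} of objects, "
          f"{address_share[status]:.1%} of addresses")
```

In this made-up sample the objects without e-mail are small ranges, so their share of addresses is far lower than their share of objects, which is the same pattern the slide describes.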
Conclusions & Questions • Many networks have no reachable contacts • “e-mail:” being optional is a significant reason • Is this a problem? If so, how big a problem? • Possible actions: • Make “e-mail:” mandatory • Check e-mail reachability on person creation/update • Put a “remark:” on networks with unreachable contacts • Return parent networks if contacts are unreachable