This guide addresses key considerations when dealing with remote data acquisition. It outlines where your data is sourced from—be it email, web, XML, or FTP—and the formats it may be in, including plain text, HTML, XML, and CSV. The document emphasizes the importance of understanding the specific patterns of the data needed and provides essential coding practices for reliable data retrieval. Important notes on Google News Alerts and handling different formats like XML feeds and eBay data are included, ensuring a comprehensive approach to managing remote data resources effectively.
Dealing with Remote Data
• 3 basic points
  • Where is the data coming from?
  • What format is the data in?
  • What is the pattern for the exact data you want?
Where is the data coming from?
• Email
• Web
• XML
• FTP
• NNTP
• Other
What format is the data in?
• Plain text
• HTML
• XML
• CSV
• Other text formats
• We will NOT be covering non-textual data
What goes with what?
• Email
  • Plain text (most common)
  • HTML
• Web
  • Plain text
  • HTML
  • XML
  • CSV
What goes with what?
• XML feeds (RSS, etc.)
  • XML
• FTP
  • Any
• NNTP
  • Plain text
  • HTML
Google News Alerts
• Email based on specified subjects
• Title of email ALWAYS contains the search subject
• Body in plain text
• Has standard footer
• 2 different formats for data
Google News Alerts • Show email examples
Google News Alerts
• CFPOP to retrieve
  <cfx_pop action="GETALL" name="qPop" startrow="1" maxrows="2" server="mail.cfregex.com" username="XXX" password="XXX">
  <CFDUMP var="#qPop#">
• Always test your data before trying to work with it
Google News Alerts
• <CFIF Not IsDefined('qPop') OR Not qPop.Recordcount>
    <CFABORT>
  </CFIF>
• Always stop operations when there is no data to work with
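The retrieval and guard steps from the two slides above can be put together in one short sketch. It assumes the built-in CFPOP tag named on the slide rather than the cfx_pop tag shown in the slide's code; the server and credentials are the slide's placeholders, and the "No messages" text is illustrative.

```cfml
<!--- Retrieve up to two messages; server and credentials are placeholders --->
<cfpop action="getAll" name="qPop" startrow="1" maxrows="2"
       server="mail.cfregex.com" username="XXX" password="XXX">

<!--- Stop when the query is missing or empty --->
<cfif NOT IsDefined("qPop") OR NOT qPop.RecordCount>
    <cfoutput>No messages to process.</cfoutput>
    <cfabort>
</cfif>

<!--- Inspect the raw data before writing any parsing code --->
<cfdump var="#qPop#">
```

Putting the guard before the dump means every later step can rely on qPop holding at least one row.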
Google News Alerts
• Get the subject
  <CFSET Subject=ReReplace(qPop.Subject, '^Google News Alert - ', '')>
• Difference between find and replace
  • A find only returns match positions, which then have to be parsed to extract the text; a replace simply swaps the part you don't want for an empty string
Google News Alerts • Show code in studio
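The difference between find and replace noted above can be seen side by side in a minimal sketch. The sample subject line and the variable names (RawSubject, ViaReplace, ViaFind, Match) are made up for illustration; only the REReplace call comes from the slide.

```cfml
<!--- Sample subject line; the search term "ColdFusion" is hypothetical --->
<cfset RawSubject = "Google News Alert - ColdFusion">

<!--- Replace: remove the fixed prefix and keep whatever is left --->
<cfset ViaReplace = REReplace(RawSubject, "^Google News Alert - ", "")>

<!--- Find: request subexpression positions, then cut the text out with Mid() --->
<cfset Match = REFind("^Google News Alert - (.*)$", RawSubject, 1, true)>
<cfset ViaFind = "">
<cfif Match.pos[1] GT 0 AND ArrayLen(Match.len) GTE 2 AND Match.len[2] GT 0>
    <cfset ViaFind = Mid(RawSubject, Match.pos[2], Match.len[2])>
</cfif>

<cfoutput>#ViaReplace# / #ViaFind#</cfoutput>
```

Both approaches are meant to end with the same search term; the find version needs the extra position checks and the Mid() call, which is the parsing step the slide refers to.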
Full as a Goog
• XML feed
• Retrieved using CFHTTP
• Dates are in a strange format
Full as a Goog • Show code
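Fetching an XML feed with CFHTTP might look roughly like the sketch below. The feed URL is a placeholder, and the RSS 2.0 layout (rss/channel/item with title and pubDate children) is an assumption, since the slides do not show the actual feed structure. The date is printed as raw text so that its format can be inspected before any parsing is attempted.

```cfml
<!--- Hypothetical feed address; substitute the real feed URL --->
<cfset FeedUrl = "http://www.example.com/feed.xml">

<cfhttp url="#FeedUrl#" method="get" timeout="30">

<!--- Stop when the request failed or the body is not well-formed XML --->
<cfif Left(cfhttp.StatusCode, 3) NEQ "200" OR NOT IsXML(cfhttp.FileContent)>
    <cfabort>
</cfif>

<cfset Feed = XmlParse(cfhttp.FileContent)>
<!--- Assumes an RSS 2.0 layout: rss/channel/item --->
<cfset Items = XmlSearch(Feed, "/rss/channel/item")>

<cfoutput>
<cfloop from="1" to="#ArrayLen(Items)#" index="i">
    <cfif StructKeyExists(Items[i], "title") AND StructKeyExists(Items[i], "pubDate")>
        #HTMLEditFormat(Items[i].title.XmlText)#
        (#HTMLEditFormat(Items[i].pubDate.XmlText)#)<br>
    </cfif>
</cfloop>
</cfoutput>
```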
eBay Agent
• eBay changes ALL THE TIME
• Different results depending on whether a cookie exists
• Results are in HTML
• Uses CFHTTP to get the data
• HTML has to be parsed
Ebay Agent • Show code
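The general shape of retrieving an HTML page with CFHTTP and parsing it with a regular expression is sketched below. Everything specific here is hypothetical: the URL, the cookie name and value, and the link pattern are stand-ins, not eBay's real addresses or markup. Sending a fixed cookie is one way to ask for the same page variant on each run.

```cfml
<!--- Hypothetical search URL; the real one must be taken from the live site --->
<cfset SearchUrl = "http://www.example.com/search?query=coldfusion">

<cfhttp url="#SearchUrl#" method="get" timeout="30">
    <!--- Hypothetical cookie, sent on every request --->
    <cfhttpparam type="cookie" name="examplePref" value="1">
</cfhttp>

<cfset Html = cfhttp.FileContent>
<cfif NOT Len(Html)>
    <cfabort>
</cfif>

<!--- Illustrative pattern: capture the href and the text of every link --->
<cfset Pattern = '<a[^>]+href="([^"]+)"[^>]*>([^<]+)</a>'>
<cfset Start = 1>
<cfset Links = ArrayNew(1)>

<cfloop condition="Start LTE Len(Html)">
    <cfset Match = REFindNoCase(Pattern, Html, Start, true)>
    <cfif Match.pos[1] EQ 0>
        <cfbreak>
    </cfif>
    <cfset Link = StructNew()>
    <cfset Link.href = Mid(Html, Match.pos[2], Match.len[2])>
    <cfset Link.text = Mid(Html, Match.pos[3], Match.len[3])>
    <cfset ArrayAppend(Links, Link)>
    <!--- Continue searching after the end of this match --->
    <cfset Start = Match.pos[1] + Match.len[1]>
</cfloop>

<cfdump var="#Links#">
```

Because the markup changes often, keeping the pattern in a single variable makes it the one place to update when the page layout shifts.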
More Information
• Mdinowit@houseoffusion.com
• House of Fusion
  • http://www.houseoffusion.com
  • CF-Talk, CF-RegEx, other HoF mailing lists
• Fusion Authority
  • http://www.fusionauthority.com