
The Invisible Web



Presentation Transcript


  1. The Invisible Web • Gary Price, MLIS, George Washington University • Chris Sherman, Associate Editor, Search Engine Watch

  2. How Search Engines Work [Diagram labels: The Web; Crawler (URL1, URL2, URL3, URL4); Indexer; Search Engine Database; Your Browser. Sample query "Eggs?" with ranked results: Eggs - 90%, Eggo - 81%, Ego - 40%, Huh? - 10%. Sample document: "All About Eggs" by S. I. Am.]
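The crawl, index, and query flow named in the slide above can be sketched in a few lines of Python. This is a minimal illustration only, not how any particular search engine works: the page texts in `PAGES` are invented, the function names are hypothetical, and ranking by raw word count is an assumption (the slide shows percentage scores but does not say how they are computed).

```python
from collections import defaultdict

# Hypothetical stand-in for "The Web": URL -> page text (invented content).
PAGES = {
    "URL1": "all about eggs and more eggs",
    "URL2": "eggo waffles for breakfast",
    "URL3": "the ego and the id",
    "URL4": "eggs benedict recipe",
}


def crawl(pages):
    """Crawler: yield (url, text) pairs, as if each URL had been fetched."""
    for url, text in pages.items():
        yield url, text


def build_index(documents):
    """Indexer: map each word to the URLs that contain it, with a count."""
    index = defaultdict(dict)
    for url, text in documents:
        for word in text.lower().split():
            index[word][url] = index[word].get(url, 0) + 1
    return index


def search(index, query):
    """Query: return matching URLs, highest word count first (assumed ranking)."""
    hits = index.get(query.lower(), {})
    return sorted(hits, key=lambda url: (-hits[url], url))


index = build_index(crawl(PAGES))
print(search(index, "eggs"))  # ['URL1', 'URL4']
```

Anything the crawler never fetches never reaches the index, so it cannot appear in any result list. That gap is the subject of the slides that follow.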

  3. What is the Invisible Web? • “Stuff” that search engine crawlers (spiders) cannot, or will not, add to their databases • 2 to 50 times larger than the visible Web • Resources often of much higher quality than the visible Web

  4. What is the Invisible Web? • Certain file formats (PDF, Flash, Office files, streaming media) • Why? They aren’t HTML text • Most real-time data (stock quotes, weather, airline flight info) • Why? Ephemeral & storage intensive

  5. What is the Invisible Web? • Dynamically generated pages (CGI, JavaScript, ASP, or most pages with “?” in the URL) • Why? Spider traps • Web-accessible databases • Why? Spiders can’t type
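The crawler limits listed in slides 4 and 5 amount to a filter applied to each URL before it is fetched. The sketch below illustrates that idea under stated assumptions: the extension sets and the `skip_reason` function are hypothetical (the slides name formats, not file extensions), the `example.com` URLs are placeholders, and actual crawlers differ in what they skip.

```python
import posixpath
from urllib.parse import urlparse

# Assumed extension lists for illustration only.
NON_HTML_EXTENSIONS = {".pdf", ".swf", ".doc", ".xls", ".ppt"}
DYNAMIC_EXTENSIONS = {".cgi", ".asp"}


def skip_reason(url):
    """Return why a cautious crawler might skip this URL, or None to crawl it."""
    parsed = urlparse(url)
    if parsed.query:
        # A "?" in the URL signals a generated page: a possible spider trap.
        return "dynamic page (query string)"
    extension = posixpath.splitext(parsed.path.lower())[1]
    if extension in DYNAMIC_EXTENSIONS:
        return "dynamic page (script)"
    if extension in NON_HTML_EXTENSIONS:
        return "non-HTML file format"
    return None


for url in (
    "http://example.com/index.html",
    "http://example.com/search?q=eggs",
    "http://example.com/report.pdf",
    "http://example.com/database.asp",
):
    print(url, "->", skip_reason(url) or "crawl")
```

A database behind a search form fails a different test: there is no URL to filter at all until someone types a query, which is the point of "spiders can't type".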

  6. Invisible Web Gateways • Intelliseek • http://www.invisibleweb.com • http://beta.profusion.com • Complete Planet • http://www.completeplanet.com/ • Librarians’ Index to the Internet • http://www.lii.org

  7. The Invisible Web & The Librarian: The Need For Knowledge! • Awareness that the IW Exists. Maybe the IW Holds the Content Your Users Can’t Find! What is the cost in both wasted time/effort and total frustration? • Let Others Know About the IW • Awareness of The Synonyms • Invisible Web • Deep Web • Hidden Web • Let the Content be Your Calling Card. Focus Less on the Amount of IW Data

  8. The Invisible Web & The Librarian: Why is the IW Useful to the Librarian and the End User? • Quality of Content (Authority) • Deep Content on Subject Area (Comprehensiveness) • Focused Databases (Limited Scope): Smaller Universe of Documents to Search (Maximize Precision/Recall)

  9. The Invisible Web & The Librarian: Why is the IW Useful to the Librarian & the End User? • Material Unavailable Elsewhere on the Web (Uniqueness) • Many Options to Limit, Sort, Interact with the Data (Maximize Precision) • Timeliness vs. Time Lag of General Search Tools (Currency)
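Slides 8 and 9 appeal to precision and recall. Precision is the share of retrieved documents that are relevant; recall is the share of relevant documents that were retrieved. The sketch below works through both measures with invented document IDs. The two result lists are hypothetical and only illustrate how a smaller, focused database could score higher on both; they are not measurements of any real tool.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a result list against the relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical data: four documents actually answer the user's question.
relevant = {"d1", "d2", "d3", "d4"}

# A general search tool returns many results, few of them relevant.
general_results = ["d1", "x1", "x2", "x3", "x4", "x5", "x6", "d2"]

# A focused database returns fewer results, most of them relevant.
focused_results = ["d1", "d2", "d3", "x1"]

print(precision_recall(general_results, relevant))  # (0.25, 0.5)
print(precision_recall(focused_results, relevant))  # (0.75, 0.75)
```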

  10. The Invisible Web & The Librarian: The IW, The Librarian, The Future • What Happens If/When the General Search Tools Crawl IW Material? Good News? Bad News? • General Search Tools May NOT Offer Many Interactive/Limiting Tools • May Not be Updated/Refreshed (time lag) as Frequently • Timeliness: making current info available is one of the things the Net does well.

  11. The Invisible Web & The Librarian: The IW, The Librarian, The Future • The Search Engine Business: Will IW Material be a Priority? • Just One Dialog or SilverPlatter Database? NO, in Terms of Content! • Yes, Common Interface, Syntax. Perhaps XML will Assist

  12. The Invisible Web & The Librarian: Challenges • It’s Not The Magic Bullet. It’s a Tool • We Still Need Traditional Online Databases • Learning Curve, Sorry! • Database Selection, When To Use the IW? • Numerous Interfaces, Syntax • A Non-Stop Flow of New Material

  13. The Invisible Web & The Librarian: Things To Do! • Build Your Own Collections: Internet Resource Collection Development • Mine Entire Sites; Often the IW Material Gets Little or No Notice in Reviews • Create Links When Possible DIRECT to the Interface • “Save the Time of the Web Researcher” • Keep Current

  14. The Invisible Web & The Librarian: Types of IW Content in Librarian Terms • Bibliographic: OPACs, Subject Bibs • Non-Bibliographic: Full-Text, Numeric, Graphic, Directory, Real-Time

  15. Future Trends • Killer apps will lead the way • Research Index (CiteSeer) • Search engines will work harder to “find” Invisible Web content • Inktomi (Index Connect, Ultraseek) • WhizBang (“wrappers”) • No matter what, there will always be a problem!

  16. Coming Soon • Available: July 2001 • CyberAge Books • ISBN 0-910965-51-X • http://www.invisible-web.net

  17. Invisible Web: Computer Science • McAfee World Virus Map • http://www.mcafee.com • ResearchIndex • http://www.researchindex.com

  18. Invisible Web: Company Research • European High-Tech Industry Database • http://www.tornado-insider.com/radar/ • Kompass • http://www.kompass.com

  19. Invisible Web: Intellectual Property • Delphion Intellectual Property Network • http://www.delphion.com/ • ESP@CENET (European Patent Office) Patent Database • http://ep.espacenet.com/

  20. Invisible Web: Dictionaries & Languages • EuroDicAutom • http://eurodic.ip.lu • Verbix • http://www.verbix.com/index.html

  21. Invisible Web: Art & Artists • ADAM (Art, Design, Architecture & Media Information Gateway) • http://adam.ac.uk/ • Artcyclopedia • http://www.artcyclopedia.com/

  22. Invisible Web: Real-Time Information • Flight Tracker • http://www.trip.com/ft/home/0,2096,1-1,00.shtml • J-Track 3-D Satellite Locator • http://liftoff.msfc.nasa.gov/realtime/JTrack/Spacecraft.html

  23. Invisible Web: Maps and Driving Directions • MapBlast • http://www.mapblast.com • Streetmap.co.uk • http://www.streetmap.co.uk/

  24. Invisible Web: Government Info • Parline Database • http://www.ipu.org • United Nations Daily Press Briefings • http://www.un.org/News/

  25. Invisible Web: Health & Medicine • Economics of Tobacco Control Database • http://www1.worldbank.org/tobacco/database.asp • International Digest of Health Legislation • http://www.who.int

  26. Invisible Web: News & Current Events • Cold North Wind Newspaper Archive Project • http://www.coldnorthwind.com • Financial Times Global Archive • http://www.globalarchive.ft.com

  27. Invisible Web: Science • Great Barrier Reef Online Image Catalogue • http://www.gbrmpa.gov.au/corp_site/info_services/library/index.html • Nuclear Explosions Database • http://www.ausseis.gov.au/databases

  28. Invisible Web: Transportation • Equasis (Merchant Ships) • http://www.equasis.org/ • World Aircraft Accident Summary (WAAS) Fatal Airline Accident Subset • http://www.waasinfo.net/
