Things You Just Have to Know About Search Engines

Ran Hock, Online Strategies. May 14, 2002. InfoToday 2002.


Presentation Transcript


  1. Things You Just Have to Know About Search Engines Ran Hock Online Strategies May 14, 2002 InfoToday 2002

  2. Things You Just Have to Know About Search Engines • 1 - No Search Engine Covers Everything • 2 - Different Engines "Miss" and Find Different Things • 3 - Large Numbers Aren’t Necessarily Bad Searches • 4 - All Search Engines Have Techniques That Allow You to Improve Results

  3. Things You Just Have to Know About Search Engines • 5 - Metasearch engines are not "search engines" • 6 - Google is great, but not the only one you should use. • 7 - Some Things Change, Some Don't

  4. 1 - No Search Engine Covers Everything • There are pages no engine covers: Invisible pages • Un-linked pages, database pages, password-protected sites, “deep” pages, etc. • Different engines “miss” and find different things (Point #2)

  5. 2 - Different Engines Find and Miss Different Things • Each engine may find something others missed. • Even “2nd tier” engines find things missed by the top 3 • Consider the results of the following search on: “erris head” sailing

  6. 2 - Different Engines Find and Miss Different Things

  7. 2 - Different Engines Find and Miss Different Things • Of the 20 different records retrieved by all the engines, Google found (only) 14 (70%) • Google missed 6 (30%) • If you had searched Google, then just one more engine, your retrieval would have increased by 15% • Even HotBot found 2 the other three engines missed.
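
The arithmetic on this slide can be sketched with set operations. The results table from the previous slide did not survive in this transcript, so the record IDs and the engine names "EngineB" and "EngineC" below are invented for illustration. Only the totals (20 distinct records, 14 found by Google, 2 found only by HotBot) come from the slide.

```python
# Hypothetical per-engine result sets; record IDs are made up so that
# the totals match the slide (20 distinct records, Google finds 14).
results = {
    "Google": set(range(1, 15)),
    "EngineB": {3, 4, 15, 16, 17},
    "EngineC": {1, 2, 15, 18},
    "HotBot": {5, 19, 20},
}

def coverage(found, universe):
    """Share of all distinct records that one engine retrieved."""
    return len(found) / len(universe)

def unique_to(name, all_results):
    """Records that only the named engine retrieved."""
    others = set().union(*(r for n, r in all_results.items() if n != name))
    return all_results[name] - others

# Every distinct record retrieved by at least one engine.
union = set().union(*results.values())

# Records a Google searcher would gain by also searching EngineB.
gained = results["EngineB"] - results["Google"]

print(f"Google coverage: {coverage(results['Google'], union):.0%}")
print(f"Only in HotBot: {sorted(unique_to('HotBot', results))}")
print(f"Gained by adding EngineB: {len(gained)} of {len(union)} records")
```

With real data, the same three calculations show how much each additional engine adds to a search.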

  8. 2 - Different Engines Find and Miss Different Things - Why ? • Indexing "policies" • What words and other items get indexed • How those things are "parsed" • Crawling differences • Starting points • Depth / Breadth of crawling etc. • Spam policies • Ranking

  9. 3 - Large Numbers Aren’t Necessarily Bad Searches • Most common complaint • You’re not “obligated” to look at every result • All use some form of relevance ranking • Relevance ranking does, to some degree at least, the same things we do to find the best items • What relevance ranking uses:

  10. 3 - Large Numbers Aren’t Necessarily Bad Searches Relevance ranking uses some combination of: • Popularity • Frequency of terms • Weighting by field (e.g., Title counts more than Summary) • Proximity of terms • Weighting by size of the type • Weighting according to the order in which the searcher entered terms • Etc.
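
As a rough illustration of how such factors might be combined, here is a toy scorer that uses three of them: term frequency, weighting by field, and proximity. The weights, the `score` function, and the two sample documents are assumptions made for this sketch. No engine's actual formula is implied.

```python
def score(doc, terms, title_weight=3.0, body_weight=1.0, proximity_bonus=2.0):
    """Toy relevance score; all weights are arbitrary illustration values."""
    title = doc["title"].lower().split()
    body = doc["body"].lower().split()
    terms = [t.lower() for t in terms]

    total = 0.0
    # Frequency of terms, weighted by field: a title hit counts more.
    for term in terms:
        total += title_weight * title.count(term)
        total += body_weight * body.count(term)

    # Proximity: one bonus if the terms appear side by side, in order.
    n = len(terms)
    if n > 1:
        for i in range(len(body) - n + 1):
            if body[i:i + n] == terms:
                total += proximity_bonus
                break
    return total

# Hypothetical documents for the query: erris head
doc_a = {"title": "Sailing guide",
         "body": "notes on sailing near erris head in mayo"}
doc_b = {"title": "Mayo walks",
         "body": "the head of the bay is calm and erris is remote"}

query = ["erris", "head"]
ranked = sorted([doc_a, doc_b], key=lambda d: score(d, query), reverse=True)
print([d["title"] for d in ranked])
```

Both documents contain both terms once, so only the proximity bonus separates them. Ranking of this kind is what makes a large result set usable.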

  11. 3 - Large Numbers Aren’t Necessarily Bad Searches Most search engines automatically “enhance” your search • Automatic phrase identification • Word variants (and/or truncation) • Case sensitivity • Analysis of documents in the database (links, term association, associative networks, cluster analysis, co-occurrence, etc.) • Etc.

  12. Automatic Re-Write - AllTheWeb

  13. 4 - All Search Engines Provide Options for You to Enhance Your Search • Field Searching • title • URL • date • language • etc. • Boolean (yes, “Boolean,” which is neither difficult nor bad)
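
To show why Boolean is "neither difficult nor bad", here is a minimal sketch in which AND, OR, and NOT are plain set operations over a word index. The three sample documents and the `build_index` helper are hypothetical. Real engines differ in syntax and in how they parse text.

```python
from collections import defaultdict

def build_index(docs):
    """Map each lowercase word to the set of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

# Hypothetical mini-collection.
docs = {
    1: "sailing near erris head",
    2: "erris head lighthouse walk",
    3: "sailing holidays in mayo",
}
index = build_index(docs)

both = index["sailing"] & index["erris"]           # sailing AND erris
either = index["sailing"] | index["erris"]         # sailing OR erris
erris_only = index["erris"] - index["sailing"]     # erris NOT sailing

print(both, either, erris_only)
```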

  14. 4 - All Search Engines Provide Options for You to Enhance Your Search How do you know about these options? • Use the Advanced Search page • Read the documentation • ________________

  15. 4 - All Search Engines Provide Options for You to Enhance Your Search • Use the Advanced Search page

  16. 5 - Metasearch engines are not “search engines” • Consider the following example of a search done in individual engines, then in metasearch engines

  17. Search done for “geologic resources” worcester

  18. 5 - Metasearch engines are not “search engines” • Most don’t search all of the largest engines • Most don’t give you more than 10 or 20 records from each engine • Most don’t convey your full query syntax to the target engines • Most give “paid sites” first • “Client-side” metasearch programs, e.g., Copernic and Bulls-Eye, do NOT have the above problems. • Even online metasearch engines have occasional socially redeeming features (Vivisimo’s clustering).
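
The "10 or 20 records from each engine" limit can be made concrete with a small sketch. The `metasearch` function, the engine names, and the result lists are all hypothetical. The point is only that merging truncated lists returns far less than the engines hold between them.

```python
def metasearch(engine_results, per_engine_limit=10):
    """Merge the top results of each engine, dropping duplicates."""
    merged, seen = [], set()
    for urls in engine_results.values():
        # Only the first few records from each engine are ever looked at.
        for url in urls[:per_engine_limit]:
            if url not in seen:
                seen.add(url)
                merged.append(url)
    return merged

# Hypothetical result lists: engine A has 30 hits, engine B has 25,
# and they share one record ("a0").
engine_results = {
    "A": [f"a{i}" for i in range(30)],
    "B": ["a0"] + [f"b{i}" for i in range(24)],
}

merged = metasearch(engine_results, per_engine_limit=10)
everything = set().union(*engine_results.values())

print(f"Metasearch returned {len(merged)} of {len(everything)} records")
```

Searching each engine directly would reach all of the records. The merged list reaches only the top slice of each.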

  19. 6 - Google is Great, But Not the Only One You Should Use • Points 1 and 2 - No search engine finds everything and different engines find different things

  20. 6 - Google is Great, But Not the Only One You Should Use Great Because of: • Size • Popularity-based ranking • Unique content • newsgroups • PDFs and other file types • largest image collection • Dandy little features like addresses, definitions, etc. • Pretty good search options

  21. 6 - Google is Great, But Not the Only One You Should Use But Doesn’t Have: • Everything • The truncation and NEAR that AltaVista has • As much news coverage as AllTheWeb • Content as current as AllTheWeb’s (maybe) • Etc.

  22. 7 - Search Engines Change • In some ways a lot, in other ways very little

  23. 7 - Search Engines Change Areas of little change • For most engines: How they do basic things such as phrases, Boolean, truncation, field searching etc.

  24. 7 - Search Engines Change Areas of frequent/considerable change • Some come, some go. Gone: Go/InfoSeek et al. Arrived: WiseNut, Teoma • How things are arranged on the home page (esp. AltaVista) • Partners (which directory they use, featured partners and tools, etc.) • Added content, esp. content types (PDFs, newsgroups, etc. in Google)

  25. In Summary • 1 - No Search Engine Covers Everything • 2 - Different Engines "Miss" and Find Different Things • 3 - Large Numbers Aren’t Necessarily Bad Searches • 4 - All Search Engines Have Techniques That Allow You to Improve Results • 5 - Metasearch engines are not "search engines" • 6 - Google is great, but not the only one you should use. • 7 - Some Things Change, Some Don't

  26. Ran Hock Online Strategies 1-800-871-4033 www.onstrat.com ran@onstrat.com