Computer Searches • Concepts covered • What is a search engine and how does it work? • General search tips • The Big Six search engines • Other search tools • Many of these lecture notes are based on Search Engines for the World Wide Web by Alfred and Emily
Looking for Information? • Start with the Internet • World Wide Web • Newsgroups • Archived mailing lists • There are potential problems
Internet Search Engines • What is a search engine? • How do search engines work? • Search engines may employ spiders
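A minimal sketch of the kind of keyword index a spider might feed, assuming a tiny in-memory collection instead of real crawling. The page URLs, the page text, and the names `tokenize` and `build_index` are all hypothetical and chosen only for illustration; real search engines are far more elaborate.

```python
import re
from collections import defaultdict

# Hypothetical pages a spider might have fetched (URL -> page text).
PAGES = {
    "example.org/cats": "Cats are small domestic animals",
    "example.org/dogs": "Dogs are loyal domestic animals",
    "example.org/cars": "Cars are fast machines",
}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index

index = build_index(PAGES)
print(sorted(index["domestic"]))
```

Looking up a keyword is then a single dictionary access, which is why a search can answer quickly without re-reading every page.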
Search Engines (Continued) • Search engines may search human-created databases
Making Your Web Site More Noticeable • Add relevant keywords (Spiders) • Search engine submission (“suggesting your site” to Humans)
Keywords: The Secret To Effective Searches • Use keywords that are as unique as possible • Run the search using a number of variations • Search only titles • Determine if the search engine is case sensitive • When searching for proper names, capitalize the first letters • Check your spelling • Re-run previous results
Types Of Searches • Plain English • AND • OR • NOT • Near Searches
Plain English Searches (Natural Language Searches) • Easy to formulate the query but may result in too many hits
Plain English Searches (Continued) • Supported by almost all of the Big Six • AskJeeves (www.ask.com)
AND Searches • Tell the search engine that results must include all of the keywords • Precede each keyword with a plus sign "+" or separate the keywords with "AND" • Some search engines use AND as the default, others do not
OR Searches • Provides broader search results • Tells the search engine to include web pages that contain at least one keyword from a list of two or more
NOT Searches • Precede the excluded keyword with a minus sign "-" or "NOT"
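The AND, OR, and NOT searches can be sketched as simple set tests on the words of each page. This is an illustrative sketch only: the pages, the `search` function, and its `must`, `any_of`, and `exclude` parameters are hypothetical, and keywords are assumed to be given in lowercase.

```python
import re

# Hypothetical mini-collection (URL -> page text).
PAGES = {
    "a.example": "jaguar car dealer",
    "b.example": "jaguar cat habitat",
    "c.example": "leopard cat habitat",
}

def words(text):
    """Return the set of lowercase words in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def search(pages, must=(), any_of=(), exclude=()):
    """Return URLs that contain every `must` word, at least one `any_of`
    word (when any are given), and none of the `exclude` words."""
    hits = []
    for url, text in pages.items():
        page_words = words(text)
        if not all(k in page_words for k in must):
            continue  # AND: a required keyword is missing
        if any_of and not any(k in page_words for k in any_of):
            continue  # OR: none of the alternatives is present
        if any(k in page_words for k in exclude):
            continue  # NOT: an excluded keyword is present
        hits.append(url)
    return sorted(hits)

print(search(PAGES, must=["jaguar", "cat"]))            # like +jaguar +cat
print(search(PAGES, any_of=["jaguar", "leopard"]))      # like jaguar OR leopard
print(search(PAGES, must=["jaguar"], exclude=["car"]))  # like +jaguar -car
```

The AND query narrows the results to one page, the OR query returns all three, and the NOT query drops the car dealer page.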
NEAR Searches • Tell the search engine to show web pages where the keywords appear near each other in the document (within 10 words)
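A rough sketch of a NEAR test, assuming "near" means that the two keywords occur at most `max_distance` word positions apart. The function name `near`, the sample sentence, and the exact distance rule are assumptions; individual search engines define proximity in their own ways.

```python
import re

def near(text, first, second, max_distance=10):
    """Return True if `first` and `second` occur within `max_distance` words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    pos_first = [i for i, t in enumerate(tokens) if t == first.lower()]
    pos_second = [i for i, t in enumerate(tokens) if t == second.lower()]
    # Compare every position of one keyword with every position of the other.
    return any(abs(i - j) <= max_distance for i in pos_first for j in pos_second)

doc = "The search engine sends a spider to crawl the web and index pages"
print(near(doc, "search", "spider"))                 # 4 words apart
print(near(doc, "search", "pages", max_distance=5))  # 11 words apart
```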
Using Wildcards "*" • Used to look for variations on particular words • Some search engines allow the wildcard to be placed at the beginning, middle or end of a keyword • Rules of thumb on the use of wildcards • Use them to find spelling variations • Use a minimum of three characters before the wildcard (this will vary depending upon the particular search engine)
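One way to picture wildcard matching is to translate the "*" into a regular expression and test it against a word list. This is a sketch under assumptions: the vocabulary and the names `wildcard_to_regex` and `expand` are hypothetical, and "*" is taken to stand for any run of letters, including none.

```python
import re

def wildcard_to_regex(pattern):
    """Turn a keyword containing '*' into a case-insensitive regex."""
    # Escape the literal pieces, then let each '*' match any run of letters.
    parts = [re.escape(piece) for piece in pattern.split("*")]
    return re.compile("[a-z]*".join(parts), re.IGNORECASE)

def expand(pattern, vocabulary):
    """Return the vocabulary words that the wildcard pattern matches in full."""
    regex = wildcard_to_regex(pattern)
    return [word for word in vocabulary if regex.fullmatch(word)]

VOCABULARY = ["color", "colour", "colony", "theater", "theatre", "thermal"]

print(expand("colo*r", VOCABULARY))  # wildcard in the middle
print(expand("theat*", VOCABULARY))  # wildcard at the end
```

Both calls pick up the two spellings of the same word, which is the spelling-variation use the rule of thumb describes.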
Stopwords • Ignored by search engines because they are too common or are reserved for some special purpose • Common words • Reserved words • The search engine can be forced to include the stopwords • Use quotes • Use a plus sign
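A small sketch of how a query might be cleaned of stopwords, while still honouring quotes and plus signs. The stopword list and the function `parse_query` are hypothetical; each real search engine keeps its own list and its own rules.

```python
import re

# Small hypothetical stopword list; real engines use their own lists.
STOPWORDS = {"a", "an", "and", "in", "of", "or", "the", "to"}

def parse_query(query):
    """Return the search terms kept from `query`.

    Quoted phrases are kept whole, '+word' is kept even if it is a
    stopword, and any other stopword is dropped.
    """
    terms = []
    # Each match is either a quoted phrase or a run of non-space characters.
    for phrase, word in re.findall(r'"([^"]+)"|(\S+)', query):
        if phrase:
            terms.append(phrase.lower())
        elif word.startswith("+"):
            terms.append(word[1:].lower())
        elif word.lower() not in STOPWORDS:
            terms.append(word.lower())
    return terms

print(parse_query('the history of the web'))
print(parse_query('"to be or not to be" +the play'))
```

In the second query the quoted phrase survives intact and "+the" is kept, even though every word in them would otherwise be dropped as a stopword.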
The Big Six • AltaVista (www.altavista.com) • Google (www.google.com) • HotBot (www.hotbot.com) • Lycos (www.lycos.com) • Northern Light (www.northernlight.com) • Yahoo (www.yahoo.com)
Comparing The Big Six • Self-reported sizes • But size isn't everything! • Note: based upon figures from January 2001 and an estimate of 2 billion web pages in existence from www.searchenginewatch.com
AltaVista • Types of searches • Logical OR • Date • Field • Geographic • Wildcards • Language • Case sensitive • Proximity • Weighted • Babel Fish • Obscure facts and figures • Dead links
AltaVista (Continued) • Ranking of search results • Appearance in the title • Appearance near the beginning of the document • Links to related content
Google • Types of searches • Logical OR • Language • Domain • Type of file • Date • Not case sensitive • No wildcards • Specifies stopwords • Big! • Caches web pages • "I'm Feeling Lucky" feature
Google (Continued) • Ranking of search results • By the number of links
HotBot • Types of searches • Logical OR • Case sensitive • Wildcard searches • Language • Date • Domain • Geographic region • Link searches • Type of file • Must contain, should contain, should not contain • Graphical control of searches
HotBot (Continued) • Ranking of search results • Having the keyword(s) in the title • Number of occurrences of the keyword
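The two ranking factors listed for HotBot, keyword in the title and number of occurrences, can be combined into a toy score. This is only a sketch: the pages, the `TITLE_BONUS` weight, and the functions `score` and `rank` are assumptions made for illustration, not HotBot's actual formula.

```python
import re

# Hypothetical pages: URL -> (title, body text).
PAGES = {
    "a.example": ("Python tutorial", "Learn python step by step with python examples"),
    "b.example": ("Snake care", "A python is a large snake"),
    "c.example": ("Cooking", "Recipes for pasta"),
}

TITLE_BONUS = 5  # assumed weight, not a documented HotBot value

def tokens(text):
    """Return the lowercase words of the text, in order."""
    return re.findall(r"[a-z0-9]+", text.lower())

def score(title, body, keyword):
    """Count keyword occurrences in the body, plus a bonus for a title hit."""
    keyword = keyword.lower()
    occurrences = tokens(body).count(keyword)
    bonus = TITLE_BONUS if keyword in tokens(title) else 0
    return occurrences + bonus

def rank(pages, keyword):
    """Return matching URLs, highest score first (ties broken by URL)."""
    scored = [(score(title, body, keyword), url)
              for url, (title, body) in pages.items()]
    ordered = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [url for points, url in ordered if points > 0]

print(rank(PAGES, "python"))
```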
Lycos • Types of searches • Logical AND • Multimedia searches • Must include, should include, exclude • Link searches • No stopwords • Not case sensitive • No searches by date • No searches by wildcard • Kid's search site • www.lycoszone.com
Lycos (Continued) • Ranking of search results • "Popularity" of site • Occurrences of keyword
Northern Light • Types of searches • Logical AND • Special case sensitive search • Wildcard • Singular and plural • Stopwords • WWW and a special database • Free search alerts • Customized search folders
Northern Light (Continued) • Ranking of search results • By the number of links • Keyword frequency • Date of the document • Keyword appearing in title
Yahoo • Searches • Logical OR • Date added • Wildcards • Not case sensitive • Searches Yahoo directories and Google database • Extensive classification
Yahoo (Continued) • Ranking of search results • Results in the Yahoo directory come before Google results • Ranking in the Yahoo directory is determined by: • The number of keywords matched • Exact word matches • Location of the word in the web page
Summary of The Big Six and What They Do Best (from Search Engines for the World Wide Web by Alfred and Emily) • AltaVista • Obscure facts and figures • Babel Fish • Google • Big! • Often produces relevant search results • Caches web pages • HotBot • Multimedia • Ease of use
Summary of The Big Six and What They Do Best (Continued) • Lycos • Multimedia • Kid's zone • Northern Light • Search on the web and special databases • Yahoo • The most extensive web directory
Metasearch engines • Search on multiple search engines automatically • Examples • www.metacrawler.com • www.dogpile.com • www.profusion.com • www.search.com • www.mamma.com • Drawbacks • Searches occur in the simplest form • Timeouts • Number of results returned
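A sketch of what a metasearch engine does, including two of the drawbacks listed: timeouts and a cap on the number of results. The three "engines" here are hypothetical stand-in functions that return fixed lists; a real metasearch tool would send the query to actual search engines over the network.

```python
# Hypothetical stand-ins for real search engines.
def engine_a(query):
    return ["x.example", "y.example", "z.example"]

def engine_b(query):
    return ["y.example", "w.example"]

def engine_slow(query):
    raise TimeoutError("engine did not answer in time")

def metasearch(query, engines, max_results=4):
    """Query each engine, skip any that time out, and merge without duplicates."""
    merged = []
    for engine in engines:
        try:
            results = engine(query)
        except TimeoutError:
            continue  # a slow engine contributes nothing to the merged list
        for url in results:
            if url not in merged:
                merged.append(url)
    # Only the first `max_results` merged hits are returned.
    return merged[:max_results]

print(metasearch("cats", [engine_a, engine_slow, engine_b]))
print(metasearch("cats", [engine_a, engine_b], max_results=2))
```

The same plain query string goes to every engine, which mirrors the drawback that metasearches run in the simplest form.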
Other (Task-Specific) Search Tools • Products • Amazon: www.amazon.com • CDNOW: www.cdnow.com • Consumer World: www.consumerworld.org • CNET Shareware.com: www.shareware.com • ZDNet: www.zdnet.com • Health • CDC: www.cdc.gov
Other (Task-Specific) Search Tools (Continued) • Food • CuisineNet Menus Online: www.cuisinenet.com • Epicurious Food: www.epicurious.com • Martha Stewart: www.marthastewart.com • Miscellaneous • Expedia: www.expedia.com • Internet Movie Database: www.imdb.com • Monster: www.monster.ca • Workopolis: www.workopolis.com
Summary • What is a search engine? • How do search engines gather information for their databases? • Types of searches • By keyword • Logical • Plain English • Wildcards • Stopwords and searches • Browsing topic directories • What are the Big Six search engines? • Metasearch engines • Task-specific search tools