How Search Engines Work General Search Strategies


    Slide 1:How Search Engines Work General Search Strategies

    Dr. Dania Bilal IS 587 SIS Fall 2007

    Slide 2:Fun Quiz

    Take the search engine quiz located at http://websearch.about.com/library/quizzes/search_engine_quiz/blsearchenginequiz.htm
    Record the number of incorrect answers
    Share the results of the quiz with a classmate

    Slide 3:How Do Search Engines Work?

    They collect information from selected web sites
    They employ special software robots, called spiders, to crawl web pages
    Spiders build lists of the words found on web sites; while a spider is building its lists, it is said to be web crawling
    Spiders store the lists in the engine's database
    The engine's indexing software builds an index of words
    Information is matched against the query input and retrieved (processing algorithm)
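
    The steps on this slide (word lists, an index, and query matching) can be illustrated with a minimal sketch. The page texts, URLs, and function names below are hypothetical, and real engines use far more elaborate processing; this only shows the general shape of the pipeline.

```python
import re
from collections import defaultdict

# Hypothetical pages standing in for what a spider has already fetched.
PAGES = {
    "http://example.com/a": "Spiders crawl web pages",
    "http://example.com/b": "An index of words speeds up retrieval",
    "http://example.com/c": "Spiders build lists of words",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


INDEX = build_index(PAGES)
print(sorted(search(INDEX, "spiders words")))
```

    Only the third page contains both query words, so it is the only match for "spiders words".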

    Slide 4:How Do Spiders and Crawlers Work?

    They begin with popular, heavily used web servers
    Starting from a popular site, they collect the words on its pages and follow every link found within the site
    In this way, spiders travel across pages and the most widely used portions of the Web
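
    Following every link from a popular starting site amounts to a graph traversal. The sketch below uses a breadth-first traversal over a hypothetical in-memory link graph instead of real HTTP requests; the page names and the max_pages limit are assumptions made for illustration.

```python
from collections import deque

# Hypothetical link graph: each page maps to the pages it links to.
LINKS = {
    "popular.example/home": ["popular.example/news", "popular.example/about"],
    "popular.example/news": ["other.example/story", "popular.example/home"],
    "popular.example/about": [],
    "other.example/story": ["popular.example/home"],
}


def crawl(seed, links, max_pages=100):
    """Visit pages breadth-first from the seed, following each link once."""
    visited = []
    seen = {seed}
    queue = deque([seed])
    while queue and len(visited) < max_pages:
        page = queue.popleft()
        visited.append(page)
        for target in links.get(page, []):
            if target not in seen:  # never queue the same page twice
                seen.add(target)
                queue.append(target)
    return visited


ORDER = crawl("popular.example/home", LINKS)
print(ORDER)
```

    The seen set is what keeps the spider from looping forever when pages link back to each other.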

    Slide 5:How Do Spiders and Crawlers Work?

    A dedicated server of URLs is built by a search engine company (e.g., Google) so that spiders can collect information quickly
    More than one spider is used to crawl web pages at a time
    Google uses 3-4 spiders and collects over 100 pages per second
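
    Running several spiders at once can be sketched with a thread pool. This is not how any particular engine is implemented: the URL list stands in for what a URL server would hand out, and fetch() is a hypothetical placeholder that returns fake text instead of downloading anything.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical URL list, standing in for what a URL server would hand out.
URLS = [f"http://example.com/page{i}" for i in range(8)]


def fetch(url):
    """Placeholder for a real download: returns fake page text for the URL."""
    return f"content of {url}"


def crawl_concurrently(urls, spiders=3):
    """Fetch the URLs with a fixed number of worker threads ("spiders")."""
    with ThreadPoolExecutor(max_workers=spiders) as pool:
        # map() yields results in the same order as the input URLs.
        return dict(zip(urls, pool.map(fetch, urls)))


FETCHED = crawl_concurrently(URLS)
print(len(FETCHED))
```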

    Slide 6:How Do Spiders and Crawlers Work?

    When no dedicated URL server is used, the search engine company relies on an ISP for the domain names (translated into addresses) to use for crawling the web
    This leads to:
        Delay in gathering information
        Delay in updating information
        Lack of control over URL addresses

    Slide 7:Google Spider and How it Works

    A spider looks at the HTML, XML, or other coding used to build a web page and collects information from the meta tags
    It indexes words within the actual text of a page
    It records where the words were found (URL, title, headings, etc.)
    It disregards initial articles (e.g., a, an, the)
    It disregards pages that should not be crawled or indexed
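
    A rough sketch of this behavior, using Python's standard html.parser module, is shown below. The class name, the location labels, and the sample page are hypothetical. For simplicity it drops every occurrence of "a", "an", and "the", which is an assumption about how articles are handled.

```python
import re
from html.parser import HTMLParser

ARTICLES = {"a", "an", "the"}
HEADINGS = ("h1", "h2", "h3")


class PageIndexer(HTMLParser):
    """Collect (word, location) pairs from the title, headings, meta tags and body."""

    def __init__(self):
        super().__init__()
        self.location = "body"
        self.entries = []

    def _add(self, text, location):
        for word in re.findall(r"[a-z]+", text.lower()):
            if word not in ARTICLES:
                self.entries.append((word, location))

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("name") in ("keywords", "description"):
            self._add(attrs.get("content") or "", "meta")
        elif tag == "title":
            self.location = "title"
        elif tag in HEADINGS:
            self.location = "heading"

    def handle_endtag(self, tag):
        if tag == "title" or tag in HEADINGS:
            self.location = "body"

    def handle_data(self, data):
        self._add(data, self.location)


def index_page(html):
    parser = PageIndexer()
    parser.feed(html)
    parser.close()
    return parser.entries


HTML = """<html><head><title>The Spider Guide</title>
<meta name="keywords" content="spider, crawler"></head>
<body><h1>An Overview</h1><p>A spider reads the page.</p></body></html>"""

ENTRIES = index_page(HTML)
print(ENTRIES)
```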

    Slide 8:Google Spider and How it Works

    It uses the Robots Exclusion Protocol to disregard pages
    The protocol is implemented in the meta-tag section at the beginning of a web page
    It tells a spider to leave the page alone: neither index the words on the page nor try to follow its links
    Source: Franklin, C. How Internet Search Engines Work. http://computer.howstuffworks.com/search-engine.htm
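
    As an illustration, a spider could read the robots meta tag before deciding what to do with a page. The sketch below assumes the common noindex, nofollow, and none directives; the class and function names are hypothetical, and a real crawler would also consult the site's robots.txt file.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Read the robots meta tag and record what the spider is allowed to do."""

    def __init__(self):
        super().__init__()
        self.may_index = True
        self.may_follow = True

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            directives = {d.strip().lower() for d in content.split(",")}
            if "noindex" in directives or "none" in directives:
                self.may_index = False
            if "nofollow" in directives or "none" in directives:
                self.may_follow = False


def robots_permissions(html):
    """Return (may_index, may_follow) for the given page source."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.may_index, parser.may_follow


PAGE = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
print(robots_permissions(PAGE))
```

    A page without a robots meta tag leaves both permissions at their default of True.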

    Slide 9:How Do Search Engines Store Indexed Words?

    The process varies among engines
    Words are stored with the number of times they appear on a page (posting)
    A weight is assigned to each word
    Words appearing near the top of a page may have more weight than those appearing in subheadings, in links, in meta tags, in the title, etc.
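
    One way to picture a stored entry is a word paired with its occurrence count and an accumulated weight. The weight values below are invented for illustration, since each engine uses its own formula; the location labels and function name are likewise hypothetical.

```python
from collections import Counter

# Hypothetical weights per location; real engines use their own formulas.
LOCATION_WEIGHTS = {"top": 3.0, "subheading": 2.0, "link": 1.5, "body": 1.0}


def build_postings(entries):
    """Turn (word, location) pairs into {word: (count, total_weight)}."""
    counts = Counter()
    weights = Counter()
    for word, location in entries:
        counts[word] += 1
        weights[word] += LOCATION_WEIGHTS.get(location, 1.0)
    return {word: (counts[word], weights[word]) for word in counts}


ENTRIES = [
    ("spider", "top"),
    ("spider", "body"),
    ("index", "subheading"),
    ("spider", "link"),
]
POSTINGS = build_postings(ENTRIES)
print(POSTINGS)
```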

    Slide 10:How Do Search Engines Store Indexed Words?

    Information is encoded to save space
    Information is indexed
    An index of words is built by the automatic indexer (indexing software)
    A hash table is created with an assigned weight or value for each word indexed
    Hashing distributes popular entries (e.g., words starting with the letter M) evenly alongside less popular ones (e.g., the letter X) for quick retrieval
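
    The difference between grouping words alphabetically and hashing them can be sketched as follows. The word list is made up, and CRC32 is used only because it is a deterministic hash available in the standard library; it is an assumption, not the function any particular engine uses.

```python
import zlib
from collections import defaultdict

# Hypothetical vocabulary: many words under M, only one under X.
WORDS = ["map", "mail", "media", "menu", "meta", "mirror", "mobile", "music", "xml"]


def by_first_letter(words):
    """Alphabetical grouping: popular letters end up with long lists."""
    buckets = defaultdict(list)
    for word in words:
        buckets[word[0]].append(word)
    return buckets


def bucket_of(word, num_buckets):
    """Reduce a deterministic hash of the word to a bucket number."""
    return zlib.crc32(word.encode("utf-8")) % num_buckets


def by_hash(words, num_buckets=4):
    """Hashed grouping: the bucket depends on the whole word, not its first letter."""
    buckets = defaultdict(list)
    for word in words:
        buckets[bucket_of(word, num_buckets)].append(word)
    return buckets


def lookup(buckets, word, num_buckets=4):
    """Only the one bucket the word hashes to needs to be scanned."""
    return word in buckets.get(bucket_of(word, num_buckets), [])


LETTERS = by_first_letter(WORDS)
HASHED = by_hash(WORDS)
print({letter: len(group) for letter, group in LETTERS.items()})
print({bucket: len(group) for bucket, group in sorted(HASHED.items())})
```

    Comparing the two printed size tables shows how the entries are spread in each scheme.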

    Slide 11:Using General Directories

    Yahoo and its family
        Browsing directory
            Directory database
            Small and human-selected and indexed
        Searching using keywords
            Search database
            Larger and non-selective database
            Spider and machine indexing

    Slide 12:Yahoo

    Yahoo.com
    Works like a search engine rather than a directory
    Searches the web
    Exercise: search under my name and see how Yahoo processes the query while you're inputting information
    The directory is found under "More" or at http://search.yahoo.com/dir

    Slide 13:Yahoo Search Engine

    Search
        Web
        Images
        Videos
        Local information
        Shopping
        More

    Slide 14:Yahoo Advanced Search

    Advanced Search feature
    Shown on screen after you perform a search, or by going directly to http://search.yahoo.com/web/advanced?ei=UTF-8&p=dr+dania+bilal&fr=yfp-t-471
    Lots of search features to explore

    Slide 15:Yahoo Advanced Search Features

    Boolean
    Phrase
    Currency
    Domain
    File format
    Country
    Language
    Other

    Slide 16:Yahoo Advanced Search Features

    Exercise
    Perform a search on a topic of your choice
    Use Boolean equivalents
        All the words = AND
        The exact phrase = phrase; proximity search
        Any of these words = OR
        None of these words = NOT
    Choose part of page to search
    Choose language other than English
    Report results in class
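
    The Boolean equivalents map naturally onto set operations over an index. The sketch below uses a small hypothetical index of document IDs; the function names are made up, and phrase and proximity searching are left out because they need word positions.

```python
# Hypothetical inverted index: word -> set of document IDs containing it.
INDEX = {
    "search": {1, 2, 3},
    "engine": {1, 3},
    "directory": {2, 4},
}
ALL_DOCS = {1, 2, 3, 4}


def all_of(index, words):
    """'All the words' = AND: documents containing every word."""
    sets = [index.get(word, set()) for word in words]
    return set.intersection(*sets) if sets else set()


def any_of(index, words):
    """'Any of these words' = OR: documents containing at least one word."""
    return set().union(*(index.get(word, set()) for word in words))


def none_of(index, words, universe):
    """'None of these words' = NOT: documents containing none of the words."""
    return universe - any_of(index, words)


print(all_of(INDEX, ["search", "engine"]))
print(any_of(INDEX, ["engine", "directory"]))
print(none_of(INDEX, ["directory"], ALL_DOCS))
```

    Combining them works the same way: all_of(INDEX, ["search"]) & none_of(INDEX, ["engine"], ALL_DOCS) corresponds to "search NOT engine".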

    Slide 17:Yahoo Search Services

    For searching specific content areas
    Search Services
        Web Search: Find anything from across the Web
        Answers: Ask questions and get answers from real people
        Audio Search: Find over 50 million audio files from across the Web
        Creative Commons Search: Find Creative Commons content that you can share or re-use in your own works
        Directory Search: Search or browse Yahoo!'s categorized guide to the Web
        Image Search: Find over 1.6 billion photos and illustrations from all over the Web
        Job Search: Search for jobs, post your resume and more on Yahoo! HotJobs
        Local: Find everything in your area from dry cleaners to day spas
        Maps: Find maps and driving directions for anywhere you want to go
        Mobile Search: Find whatever, wherever you are
        My Web (Beta): The newest way to save, share and organize any page you want on the Web
        News Search: Search for news stories and related photos, videos and audio clips

    Slide 18:Yahoo Next

    http://next.yahoo.com/
    Cutting-edge technology at Yahoo
    Blogs, Web 2.0, use of alltheweb, Yahoo Maps, podcasts, audio, and all other features that are in beta testing

    Slide 19:Yahoo Preferences

    Customize Yahoo to fit your needs
    Go to Preferences from the Web search page
    Edit preferences based on your needs
    Edited preferences are saved in the browser on your desktop

    Slide 20: General Search Strategies in Search Engines

    Slide 21:Strategies

    Boolean
    Boolean equivalents
    Proximity and phrase searching
    Searching within a field
    Search limits

    Slide 22:Yahoo Search Strategies

    Explore Yahoo's help page
    Read the Search Tips
    Read the search limit parameters such as intitle:, url:, inurl:
    Read how to use Boolean equivalents and other search parameters

    Slide 23:General Search Engines Besides Yahoo Search

    Slide 24:Engines and Information Need

    Several general search engines are available on the Web
    Select the engine(s) that best fit your need
    Visit the Web Search Guide for the latest information: http://websearch.about.com/od/generalsearchengines/General_AllPurpose_Search_Engines.htm

    Slide 25:Hands-on Activity

    Browse the list of general search engines in Web Search Guide
    Explore 4 of the engines listed: Wisenut, Snap.com, Lycos, Exalead
    Search under my name in each engine
    Compare the results by viewing the first two pages retrieved
        How many overlaps were found among the four engines?
        How many unique results were found in each engine?
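
    Once the result URLs from each engine are written down, the overlap and unique counts can be worked out with set operations. The URLs below are placeholders rather than real results, and the function names are hypothetical; substitute the URLs you actually retrieve.

```python
from collections import Counter

# Placeholder result URLs; replace with the first two pages from each engine.
RESULTS = {
    "Wisenut": {"u1", "u2", "u3"},
    "Snap.com": {"u2", "u3", "u4"},
    "Lycos": {"u3", "u5"},
    "Exalead": {"u1", "u3", "u6"},
}


def overlap_all(results):
    """URLs returned by every engine."""
    return set.intersection(*results.values())


def unique_per_engine(results):
    """For each engine, the URLs that no other engine returned."""
    counts = Counter(url for urls in results.values() for url in urls)
    return {
        engine: {url for url in urls if counts[url] == 1}
        for engine, urls in results.items()
    }


print(overlap_all(RESULTS))
print(unique_per_engine(RESULTS))
```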

    Slide 26:Specialized Search Engines

    Web Search Guide has a listing of specialized search engines
    The web companion to the textbook, chapter 3, describes a variety of specialized engines
    Explore chapter 3 and familiarize yourself with the engines described

    Slide 27:Hands-on Activity

    Find the answer or relevant information for these two queries using an appropriate, specialized search engine:
        Do squirrels hibernate?
        Find me a list of foreign-owned companies based in the U.S., organized by state.
