
For the Google-Dependent: The Other Search Engines

Michael Hunter, Reference Librarian, Hobart and William Smith Colleges. For Rochester Regional Library Council Member Libraries’ Staff. Sponsored by the Rochester Regional Library Council.


Presentation Transcript


  1. For the Google-Dependent: The Other Search Engines. Michael Hunter, Reference Librarian, Hobart and William Smith Colleges. For Rochester Regional Library Council Member Libraries’ Staff. Sponsored by the Rochester Regional Library Council. Supported by Regional Bibliographic Databases and Resources Sharing (RBDB) funds granted by the New York State Library, 2008.

  2. For Today . . . • Landscape of Search in 2008 • Update on Established Services • New Services • Creating Custom Search Engines

  3. Why do I need more than Google? • The Google effect -- • the single most powerful force in today’s Internet • a private profit-driven company • owns more information on individuals’ search behavior, companies and organizations than any other entity

  4. Why do I need more than Google? • Great potential for misuse/abuse of this information for financial gain • Societies seldom leave basic services (utilities, medical and traffic regulation) totally to the “free market” • Is web search now a “basic service” ???

  5. Search dominance --- • Potential skewing for commercial, political or social purposes • Database composition • Ranking • Privacy • No single search engine can crawl the whole web • Limits search features, results display, consumer and shopping information • http://google-watch.org

  6. Web Search in 2008: Who’s crawling the Web? • Google • Yahoo • Live Search (MSN) • Ask owns Teoma • Gigablast • Exalead

  7. US Engines by Search Share, September 2007

  8. Size Estimates 7/9/08: Google AND Yahoo!, text filetypes in millions

  9. Size Estimates 7/9/08: Google AND Yahoo!, text filetypes in millions

  10. User Satisfaction: ForeSee Results and U. Michigan, 8/14/07

  11. Convincing others ... • twingine.com: searches Google and Yahoo, with results in separate frames • jux2.com: a metasearch engine for Google, Yahoo and Live that gives the rank of each result in each service
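
The side-by-side rank comparison described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how jux2 itself works: the engine names come from the slide, while the result lists and the function name `rank_table` are invented for the example.

```python
def rank_table(results_by_engine):
    """Map each URL to its 1-based rank in every engine that returned it."""
    table = {}
    for engine, urls in results_by_engine.items():
        for rank, url in enumerate(urls, start=1):
            table.setdefault(url, {})[engine] = rank
    return table


# Hypothetical result lists; a real engine would return many more hits.
sample = {
    "Google": ["a.example", "b.example", "c.example"],
    "Yahoo": ["b.example", "a.example", "d.example"],
    "Live": ["b.example", "e.example", "a.example"],
}

table = rank_table(sample)

# List the pages found by the most engines first.
for url in sorted(table, key=lambda u: (-len(table[u]), u)):
    print(url, table[url])
```

Pages that only one engine returns fall to the bottom of the listing, which makes the limited overlap between services easy to see.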

  12. The Latest at Established Services

  13. Yahoo Open Strategy • Y!OS – major internal and external redesign to unify all Yahoo’s services • Owns Flickr, del.icio.us, Upcoming • “We are building social into everything we do” • Offers more control over what is shared • Easier to set up small social networks • Will open some search technology to developers and users (http://developer.yahoo.com/search/boss/)

  14. Yahoo and the Semantic Web • Will begin to include certain metadata embedded in web pages as search and ranking elements • Dublin Core, Creative Commons, RDF, GeoRSS • Microformats: hCard, hCalendar, hReview, hAtom • Will support the OpenSearch specification, allowing crawler access to deep web resources (!!!)

  15. OpenSearch: The Invisible Web made Visible • Helps search engines and invisible web databases communicate through a common set of formats to perform search requests. • Created by Amazon.com and available through Creative Commons • Potentially one of the most significant developments for web search in the last ten years • http://www.opensearch.org
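
One of the common formats OpenSearch defines is a URL template in which placeholders such as `{searchTerms}` are replaced by the client, and a trailing `?` marks a placeholder as optional. The sketch below fills such a template. The host `catalog.example.org` and the helper `fill_template` are made up for illustration, and only the plain `{name}` and `{name?}` forms are handled.

```python
import re
from urllib.parse import quote


def fill_template(template, **params):
    """Fill an OpenSearch-style URL template.

    {name} is required; {name?} is optional and becomes an empty
    string when no value is supplied.
    """
    def substitute(match):
        name = match.group(1)
        optional = match.group(2) == "?"
        if name in params:
            return quote(str(params[name]), safe="")
        if optional:
            return ""
        raise KeyError(f"missing required parameter: {name}")

    return re.sub(r"\{(\w+)(\??)\}", substitute, template)


# Hypothetical template of the kind a database might publish.
template = (
    "http://catalog.example.org/search"
    "?q={searchTerms}&page={startPage?}&format=rss"
)

url = fill_template(template, searchTerms="palladium uses")
print(url)
```

Because the database publishes the template itself, a search engine or browser needs no site-specific code to send it a query.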

  16. Search.yahoo.com • Search subscription content: Consumer Reports, Factiva, Forrester Research, Wall St. Jn. (30 days), LexisNexis, FT.com, TheStreet.com • Yahoo! Answers (answers.yahoo.com) • Online community connecting people with questions to people wanting to answer them • 90 million users sharing knowledge worldwide • Feedback and answer reviews encouraged • Limit by Creative Commons (advanced search)

  17. Yahoo’s Search Assist • Ajax-based service that “suggests” terms and shortcuts as you type • Activate by clicking blue arrow below the search box before searching • Also offers “Explore Concepts”, searches “Shortcuts” highly associated with the search terms

  18. Yahoo Pipes - pipes.yahoo.com • Users can combine, filter and display any RSS content • Finished “pipes” can be shared and embedded in other web pages • e.g. a pipe for RSS feeds from educational blogs, filtering for technology, physics or any other keywords • Version available for the iPhone (iphone.pipes.yahoo.com)
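
Pipes is a visual tool, but the keyword-filter example above amounts to a short program. The sketch below is a rough equivalent using only the Python standard library; the inline feed, the `blog.example` links and the function name `filter_items` are invented for illustration.

```python
import xml.etree.ElementTree as ET

# A tiny made-up RSS 2.0 feed standing in for an educational blog.
FEED = """<?xml version="1.0"?>
<rss version="2.0"><channel>
  <title>Example education blog</title>
  <item><title>Physics labs on a budget</title><link>http://blog.example/1</link></item>
  <item><title>Field trip photos</title><link>http://blog.example/2</link></item>
  <item><title>Classroom technology roundup</title><link>http://blog.example/3</link></item>
</channel></rss>"""


def filter_items(rss_text, keywords):
    """Return (title, link) pairs whose title contains any keyword."""
    wanted = [k.lower() for k in keywords]
    root = ET.fromstring(rss_text)
    matches = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        if any(k in title.lower() for k in wanted):
            matches.append((title, item.findtext("link", default="")))
    return matches


matches = filter_items(FEED, ["technology", "physics"])
for title, link in matches:
    print(title, link)
```

Combining several feeds would mean running the same filter over each one and merging the lists.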

  19. Mashup • Web application combining data from more than one source into a single tool • Used to • Navigate and visualize large and/or dynamic datasets • Combine data with dimensions of time, distance and location • Juxtaposing data from different sources can reveal new relationships

  20. MSN’s Live.com • Database increasing • Simpler Interface (4/08) • “Rich Answers” blended results • Image search enhancements: filter:face, filter:portrait, filter:bw • NLP question processing improved • Live Search Books and Search Academic ended 5/08

  21. New & Notable at Ask • The Butler is gone! Teoma is in his place! • Smart Search • Web Answers • Zoom • Superior Mapping Tools

  22. Gigablast • Maintains unique database • Offers advanced search features • “Freshness dating limit” estimates the date that a particular page was first published or most recently edited or modified • Custom Topic Search of Gigablast – up to 500 domains (www.gigablast.com/cts.html)

  23. Exalead - www.exalead.com • Launched October 2004, based in France • Maintains its own database • Smaller than most US services (8 billion) • Offers “Narrowing Options” • Advanced features: • Phonetic spelling with “soundslike” • Approximate spelling with “spellslike” • Limits: Site (URL), Filetype (8), Adult content, Language (57!!!)
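
The slides do not say how Exalead's “soundslike” operator matches words. As a stand-in, the sketch below uses American Soundex, a classic phonetic code under which similar-sounding names collapse to the same four-character key. It shows the general idea of phonetic matching only and is not Exalead's algorithm.

```python
# Soundex digit for each coded consonant; vowels, h, w and y carry no digit.
CODES = {
    letter: digit
    for digit, letters in {
        "1": "bfpv", "2": "cgjkqsxz", "3": "dt",
        "4": "l", "5": "mn", "6": "r",
    }.items()
    for letter in letters
}


def soundex(word):
    """Return the four-character American Soundex key for a word."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    digits = []
    prev = CODES.get(first, "")
    for c in letters[1:]:
        if c in "hw":
            continue  # h and w do not separate repeated codes
        code = CODES.get(c, "")
        if code and code != prev:
            digits.append(code)
        prev = code  # a vowel resets prev, so a repeat after it counts
    return (first.upper() + "".join(digits) + "000")[:4]


for name in ["Robert", "Rupert", "Tymczak"]:
    print(name, soundex(name))
```

A phonetic search would index each word under its key and retrieve every word sharing the key of the query term.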

  24. New Services

  25. Wikiasari: Quick rummaging search • “User ranked results” • Open source SE by Jimmy Wales and Amazon • Initial results ordered with algorithms a la Google • Users reorder results, which will be used in ranking of future similar searches

  26. Wikiasari: Quick rummaging search • Strength is in general search topics • Deep, complex or unusual searches will not benefit as much • Intended to rival Google and Yahoo • Edits allowed on all search results • Recently launched • http://search.wikia.com
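
The idea of letting user reordering feed back into ranking can be illustrated with a toy scoring rule. The sketch below is an assumption for illustration only: the scores, the vote counts, the `vote_weight` parameter and the function `rerank` are all hypothetical, and the slides do not describe the formula the service actually uses.

```python
def rerank(algorithmic, votes, vote_weight=10):
    """Order URLs by algorithmic score plus a bonus for net user votes."""
    def combined(url):
        return algorithmic[url] + vote_weight * votes.get(url, 0)

    return sorted(algorithmic, key=combined, reverse=True)


# Hypothetical algorithmic scores for one query (higher is better).
scores = {"a.example": 90, "b.example": 80, "c.example": 50}

# Net user moves recorded from earlier, similar searches:
# positive means users moved the page up, negative means down.
votes = {"c.example": 5, "a.example": -2}

before = rerank(scores, {})
after = rerank(scores, votes)
print(before)
print(after)
```

With many users voting on a popular query the adjustment is well supported, which fits the point that general topics benefit more than unusual searches.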

  27. Kosmix www.kosmix.com • Google interface • Offers overview of results by document type: Basic Facts, Reviews & Opinions, Media, People & Community, Shopping, News • Extensive clustering by subject • Blended results with thumbnails of images, video and audio clips, presentations and reports • Human-created “topic pages” for subjects of current interest

  28. icerocket www.icerocket.com • Searches Blogs, Web, MySpace, News, Image • Link to cached version from Internet Archive’s Wayback Machine • Limited advanced search features • MAY be a Google interface

  29. ChaCha www.chacha.com • Free mobile search service • Requires a (free) account • Text your questions and a human “guide” sends back an answer, limited to 160 characters • Supported by 98% of mobile providers

  30. Clusty www.clusty.com • Metaengine – Source engines for the Web include Live, Ask, Gigablast, Wisenut, Open Directory • Searches Web, News, Images, Wikipedia, Blogs, Jobs • Most extensive clustering capability of any meta (Vivisimo) • Custom “Tabs” run saved searches on engines you select

  31. Searchcrystal.com: Visual Metasearch • Options include List View, Spiral View and Cluster Display • Results common to more engines appear in the center • Color=source engine • Shape=number of engines retrieving the page • Size=rank position • Web, Blogs, Images, Tagging sites and more…

  32. Images – Cluster view

  33. Images – Spiral view

  34. Custom Search Engines

  35. Google’s Custom Search www.google.com/coop/cse/ • A tool from the Google Coop initiative • Keywords chosen determine content and weighting of results (limit of 100 characters) • Search • Entire web • Your selected sites only • Entire web with selected sites emphasized • Within Coop, a CSE can be created and maintained collaboratively • Stored or Linked versions available

  36. Adding sites to a Stored CSE • Manually • Using Google Marker – bookmarking tool available for Firefox and IE7 • RSS feeds may be included • Add • Full domains: www.moma.org • Subdomains: www.moma.org/*research* • A single page: www.moma.org/modernteachers/
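
The three kinds of site entry above behave like URL patterns of different breadth. The sketch below imitates that with shell-style wildcards from Python's `fnmatch` module. It is an approximation for illustration, not Google's matching logic: writing the full domain as `www.moma.org/*` is an assumption, and the helper names and the test URLs are made up.

```python
from fnmatch import fnmatchcase


def normalize(url):
    """Drop the scheme so URLs compare against scheme-less patterns."""
    return url.split("://", 1)[-1]


def matches(url, pattern):
    """True if the URL fits the pattern; '*' stands for any run of characters."""
    return fnmatchcase(normalize(url), pattern)


patterns = {
    "full domain": "www.moma.org/*",               # assumed wildcard form
    "subset of a site": "www.moma.org/*research*",
    "single page": "www.moma.org/modernteachers/",
}

# Hypothetical URLs to check against each pattern.
urls = [
    "http://www.moma.org/visit",
    "http://www.moma.org/research/library",
    "http://www.moma.org/modernteachers/",
]

for label, pattern in patterns.items():
    hits = [u for u in urls if matches(u, pattern)]
    print(label, hits)
```

A pattern with no wildcard matches exactly one address, which is why the single-page entry is the narrowest of the three.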

  37. Linked CSE • Sites can be added “in bulk” • Select among your sites for individual queries through specification files • Requires user to host and maintain their own XML specification files • Migration from stored to linked versions possible • More difficult to add single sites • Use Google’s Search APIs to integrate other Google services into a CSE

  38. Other CSEs • Gigablast – Custom Topic Search www.gigablast.com/cts.html • Live Search Macros search.live.com/macros • Rollyo – searches Yahoo www.rollyo.com • Swicki www.swicki.eurekster.com

  39. Semantic Search Systems • Understand the user’s query • Understand Web text • Bring these together for query results that are contextually relevant • Algorithms that match the meanings and not just the words • Natural Language Processing • Concept Mapping

  40. Semantic Search Systems • Expensive and time-consuming for general web search; more feasible in subject-specific contexts • Example query: “What is palladium used for?” • Link-based crawler results: London’s Palladium Theatre • Including the concept map “used for”: sites about the element palladium • Hakia, Powerset, Cognition Search

  41. Post Search: What do we do AFTER a search? • The search engine size wars are over • WANTED: Services that help manage, share and update • Web search results • Tagged sites • WITH scalability, confidentiality and “collaborability” across all platforms, devices and file formats

  42. Thank You! Michael Hunter, Reference Librarian, Hobart and William Smith Colleges, Geneva, NY 14456, (315) 781-3552, hunter@hws.edu
