
University of Palestine

University of Palestine. Topics in CIS, ITBS 3202. Ms. Eman Alajrami. 2nd Semester 2009-2010. Reference Textbooks: [1] “Modern Information Retrieval” by Ricardo Baeza-Yates, Berthier Ribeiro-Neto


Presentation Transcript


  1. University of Palestine • Topics in CIS, ITBS 3202 • Ms. Eman Alajrami • 2nd Semester 2009-2010

  2. Reference Textbooks • [1] “Modern Information Retrieval” by Ricardo Baeza-Yates, Berthier Ribeiro-Neto • [2] “Introduction to Information Retrieval” by Christopher D. Manning, Prabhakar Raghavan, Hinrich Schütze • Lecture slides by Eman Alajrami

  3. Chapter 1 – Part 1: Introduction to Information Retrieval

  4. What is this course about? • Processing • Indexing • Retrieving • … textual data • Fits in four lines, but much more complex and interesting than that
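
The three steps named on this slide (processing, indexing, retrieving) can be sketched in a few lines. This is a minimal illustration, not the course's implementation: the function names `tokenize`, `build_index`, and `retrieve`, the sample documents, and the choice of an inverted index with "all terms must match" retrieval are assumptions made for the example.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Processing step: lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(docs):
    """Indexing step: map each term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index

def retrieve(index, query):
    """Retrieving step: return ids of documents that contain every query term."""
    terms = tokenize(query)
    if not terms:
        return set()
    # Copy the first posting set so the index itself is never modified.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

# Hypothetical sample collection, used only for illustration.
docs = {
    1: "College tennis teams in the NCAA tournament",
    2: "Music retrieval on the Web",
    3: "Tennis rackets and tennis balls",
}
index = build_index(docs)
print(sorted(retrieve(index, "tennis teams")))
```

Building the index once and answering each query from it avoids rescanning every document per query, which is the usual reason indexing is treated as a separate step.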

  5. Need for IR • With the growth of the WWW, more than 8 billion documents are indexed by Yahoo and Google • Various needs for information: • Search for documents on a given topic • Search for specific information • Search for an answer to a question • Search for information in a different language • Search for images • Search for music • Search for a (candidate) friend

  6. Some definitions of Information Retrieval (IR) • Salton (1989): “Information-retrieval systems process files of records and requests for information, and identify and retrieve from the files certain records in response to the information requests. The retrieval of particular records depends on the similarity between the records and the queries, which in turn is measured by comparing the values of certain attributes attached to records and information requests.”
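
Salton's definition says similarity is measured by comparing attribute values of records and requests, but it does not fix a particular measure. The sketch below assumes the attributes are index terms and uses the Jaccard coefficient as one possible similarity; the `jaccard` helper, the record ids, and the term sets are hypothetical.

```python
def jaccard(a, b):
    """Similarity of two attribute sets: shared attributes / all attributes."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(a | b)

# Hypothetical records, each described by a set of index terms.
records = {
    "r1": {"tennis", "college", "ncaa"},
    "r2": {"music", "retrieval"},
}
# Hypothetical information request, described the same way.
request = {"college", "tennis", "teams"}

# Retrieve the record whose attributes are most similar to the request.
best = max(records, key=lambda rid: jaccard(records[rid], request))
print(best, jaccard(records[best], request))
```

Other measures, such as cosine similarity over weighted term vectors, fit the same definition; Jaccard is used here only because it is short.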

  7. IR systems on the Web • Search for Web pages http://www.google.com • Search for images http://www.picsearch.com • Search for image content http://wang.ist.psu.edu/IMAGE/ • Search for answers to questions http://www.askjeeves.com • Music retrieval http://www.fxpal.com/people/foote/musicr/

  8. Information Retrieval • Concerned with the: • Representation of • Storage of • Organization of, and • Access to • Information items.

  9. Motivation • Focus is on the user information need • User information need: • Find all docs containing information on college tennis teams which: (1) are maintained by a USA university and (2) participate in the NCAA tournament. • Emphasis is on the retrieval of information (not data)

  10. Motivation • Data retrieval • Task: which docs contain a set of keywords? • Well-defined semantics • A single erroneous object implies failure! • Information retrieval • Task: find information about a subject or topic • Semantics are frequently loose • Small errors are tolerated • IR system: • Interprets the contents of information items • Generates a ranking that reflects relevance • The notion of relevance is most important
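
The contrast on this slide can be made concrete with a small sketch. It assumes whitespace tokenization and a deliberately crude relevance score (how often query terms occur in a document); the function names and sample documents are hypothetical, and real IR systems use far richer ranking functions.

```python
from collections import Counter

def tokens(text):
    """Split text into lowercase whitespace-separated terms."""
    return text.lower().split()

def data_retrieval(docs, keywords):
    """Exact match: keep only documents that contain every keyword."""
    wanted = set(keywords)
    return [doc_id for doc_id, text in docs.items()
            if wanted <= set(tokens(text))]

def information_retrieval(docs, query):
    """Ranked match: order documents by how often query terms occur in them."""
    query_terms = set(tokens(query))
    scores = {}
    for doc_id, text in docs.items():
        counts = Counter(tokens(text))
        score = sum(counts[term] for term in query_terms)
        if score > 0:
            scores[doc_id] = score
    # Highest score first; ties are broken by document id.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical sample collection, used only for illustration.
docs = {
    "a": "ncaa college tennis teams",
    "b": "college tennis news",
    "c": "web music retrieval",
}
print(data_retrieval(docs, ["college", "tennis", "ncaa"]))
print(information_retrieval(docs, "college tennis ncaa"))
```

Document "b" lacks one keyword, so the exact-match function drops it, while the ranked function still returns it below "a". This is the sense in which IR tolerates small errors and relies on a ranking instead of a yes/no answer.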

  11. Brief History of IR • IR as a CS field (80s & early 90s): • classification and categorization • systems and languages • user interfaces and visualization • Still, the area was seen as being of narrow interest

  12. Recent History of IR • The advent of the Web changed this perception • universal repository of knowledge • free (low cost) universal access • many problems: IR seen as key to finding the solutions! • Increased capability for sharing personal collections of text and other media
