Cross Language IR

Workshop on Challenges in Information Retrieval and Language Modeling, Amherst, Massachusetts, September 11-12, 2002. Cross Language IR. Philip Resnik, Salim Roukos.


Presentation Transcript


  1. Workshop on Challenges in Information Retrieval and Language Modeling, Amherst, Massachusetts, September 11-12, 2002. Cross Language IR. Philip Resnik, Salim Roukos.

  2. Global Internet User Population [chart: 2000 vs. 2005; labels: English, Chinese; source: Global Reach]
     If cross-language IR is “solved”, where is it???

  3. Opportunities
     • World Wide Web
     • Research literature
     • Intranet applications
     • Necessities in a post-9/11 world
     • High volume intelligence analysis
     • Replacing current Boolean engines (or worse!)
     • Dealing with the on-paper legacy

  4. Challenge: Role of the User
     • Query formulation for multilingual doc sets
     • Key idea: user needed in the query translation loop
     • Extracting examples from aligned parallel text
     • Document selection
     • Key idea: full MT isn’t good enough
     • Presenting phrases and entities (not “crummy MT”)
     • Query reformulation
     • Key idea: user’s understanding of the collection
     • Largely unexplored: different objective fn for MT

  5. Challenge: Relating MT and IR
     • It is typical to think of MT and IR as two different processes
     • Weighting developed with monolingual mindset
     • Steps toward factoring in translation ambiguity
     • Toward integrated models
     • Beyond bags of words (or bags of n-grams)
     • Translingual search process (> 2 languages)
     • Use of context introduced by the search process
     • Document-level analysis, use of document context
     • Collection-level analysis
