
A Study of Learning a Merge Model for Multilingual Information Retrieval


Presentation Transcript


  1. A Study of Learning a Merge Model for Multilingual Information Retrieval Presenter: Cheng-Hui Chen Author: Ming-Feng Tsai, Yu-Ting Wang, Hsin-Hsi Chen SIGIR 2008

  2. Outline • Motivation • Objectives • Methodology • Experiments • Conclusions • Comments

  3. Motivation • In multilingual information retrieval (MLIR), the merged result list usually includes more irrelevant documents than in monolingual retrieval. • Traditional merging methods for MLIR assume that relevant documents are homogeneously distributed over the monolingual result lists.

  4. Objectives • Merge result lists from collections with varying translation and retrieval quality into a single, unified result list. • Propose a merging method that does not assume relevant documents are homogeneously distributed over monolingual result lists. • Enhance merging quality with a learned merge model.

  5. Methodology • Traditional MLIR merging strategies (a sketch of these strategies follows below) • Raw-score • Round-robin • Normalized-by-top1 • Normalized-by-topk • The proposed learning-based merging method • FRank
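
The traditional strategies above differ only in how scores from the monolingual runs are made comparable before pooling. The following is a minimal sketch of raw-score, round-robin, and normalized-by-top-k merging; the function names and the exact normalization (dividing by the mean of the run's top-k scores) are illustrative assumptions, not the paper's implementation.

```python
from typing import Dict, List, Tuple

# Each monolingual run is a ranked list of (doc_id, score) pairs,
# one run per target-language collection.
RunList = List[Tuple[str, float]]

def merge_raw_score(runs: Dict[str, RunList], k: int = 1000) -> RunList:
    """Raw-score merging: pool all documents and sort by the original scores."""
    pooled = [pair for run in runs.values() for pair in run]
    return sorted(pooled, key=lambda p: p[1], reverse=True)[:k]

def merge_round_robin(runs: Dict[str, RunList], k: int = 1000) -> RunList:
    """Round-robin merging: interleave the runs rank by rank."""
    merged = []
    depth = max((len(run) for run in runs.values()), default=0)
    for rank in range(depth):
        for run in runs.values():
            if rank < len(run):
                merged.append(run[rank])
    return merged[:k]

def merge_normalized_by_topk(runs: Dict[str, RunList], topk: int = 1,
                             k: int = 1000) -> RunList:
    """Normalized-by-top1 / normalized-by-topk merging: scale each run's scores
    by the mean of its top-k scores before pooling (topk=1 gives top1)."""
    pooled = []
    for run in runs.values():
        if not run:
            continue
        norm = sum(score for _, score in run[:topk]) / min(topk, len(run)) or 1.0
        pooled.extend((doc, score / norm) for doc, score in run)
    return sorted(pooled, key=lambda p: p[1], reverse=True)[:k]
```

Note that round-robin merging implicitly relies on relevant documents being spread evenly across the runs, which is exactly the homogeneity assumption the paper's learned merge model avoids.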

  6. MLIR Merge Process • Feature set • Query-level features • Document-level features • Translation-level features • The construction of a merge model • FRank ranking algorithm • BM25

  7. Feature Set • Query-level features • Terms within a query are manually classified into several pre-defined categories: • Location/country names (Loc) • Organization names (Org) • Event names (EN) • Technical terms (TT) • Document-level features • Document length (Dlength) and title length (Tlength).

  8. Feature Set [Slide figure: example query terms tagged with categories, e.g., Loc 斗六, EN Order, Park, 食べる, with English→Chinese translation] • Translation-level features (see the sketch below) • The size of the bilingual dictionary used for each language pair (i.e., DictSize). • The average number of translation equivalents per query term (i.e., AvgTAD). • Example: if a query has two query terms, each with three translation equivalents, the AvgTAD of the query is (3 + 3)/2 = 3.
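
For concreteness, here is a minimal sketch of how the translation-level features could be computed; the function names and data layout are assumptions for illustration, not code from the paper.

```python
from typing import Dict, List

def dict_size(bilingual_dict: Dict[str, List[str]]) -> int:
    """DictSize: the number of entries in the bilingual dictionary
    used for a given language pair."""
    return len(bilingual_dict)

def avg_tad(query_translations: Dict[str, List[str]]) -> float:
    """AvgTAD: the average number of translation equivalents per query term.
    `query_translations` maps each query term to its translation equivalents."""
    if not query_translations:
        return 0.0
    return sum(len(t) for t in query_translations.values()) / len(query_translations)

# The slide's example: two query terms, each with three translation equivalents,
# gives AvgTAD = (3 + 3) / 2 = 3.
example = {"term_a": ["x1", "x2", "x3"], "term_b": ["y1", "y2", "y3"]}
assert avg_tad(example) == 3.0
```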

  9. The Construction of the Merge Model • Following FRank's generalized additive model, the merge model can be represented as H(x) = Σ_{t=1}^{T} α_t · m_t(x), where • m_t(x) is a weak learner • α_t is its learned weight • T is the number of selected weak learners • The merge model is combined with a retrieval model (BM25) by a linear combination (see the sketch below).
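
Below is a minimal sketch of the scoring step, assuming the weak learners and their weights have already been produced by FRank training. Whether λ weights the BM25 score or the merge model, and its default value, are assumptions here rather than the paper's settings.

```python
from typing import Callable, List, Sequence, Tuple

# A weak learner m_t maps a feature vector x to a real-valued score;
# alpha_t is its learned weight in the generalized additive model.
WeakLearner = Callable[[Sequence[float]], float]

def merge_model_score(x: Sequence[float],
                      learners: List[Tuple[float, WeakLearner]]) -> float:
    """Additive merge model: H(x) = sum over t of alpha_t * m_t(x)."""
    return sum(alpha * m(x) for alpha, m in learners)

def final_score(x: Sequence[float], bm25_score: float,
                learners: List[Tuple[float, WeakLearner]],
                lam: float = 0.5) -> float:
    """Linear combination of the merge model score with the BM25 retrieval
    score, controlled by the combination coefficient lambda (cf. slide 12)."""
    return lam * bm25_score + (1.0 - lam) * merge_model_score(x, learners)
```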

  10. Experiments • Data set • The Details of Experimental Collections • The Percentage of Retrieved Documents

  11. Experiments Mean Average Precision (MAP)

  12. Experiments The experimental results of our method using different combination coefficients λ.

  13. Experiments Feature Analysis

  14. Conclusions • The proposed merge model can significantly improve merging quality. • Analysis of the merge model indicates that the key factors are the number of translatable terms and compound words.

  15. Conclusions • Future work • Use other learning-based ranking algorithms, such as RankSVM and RankNet. • Extract more representative features, such as linguistic features, to construct the merge model. • Discover more relations within query terms, such as query term association and substitution.

  16. Comments • Advantage • Improves merging quality. • Drawback • Application • Multilingual information retrieval.
