This poster, prepared for the GSLT Internal Conference 2004, describes a planned evaluation of vector space models using vocabulary (ORD) questions from "Högskoleprovet". The approach is to calculate vectors for each question word and its answer alternatives, using Singular Value Decomposition (SVD) for dimensionality reduction, and to select the alternative closest to the question word. Because many alternatives are phrases, the poster focuses on how phrases can be represented in the model, proposing changes at the tokenization stage. No results are reported yet. A cooperation with SICS is planned, with the main goal of using the ORD200 test set, which appears to be much harder than TOEFL, for evaluation.
Evaluating Vector Space Models using "Högskoleprovet"
Poster for: GSLT Internal Conference 2004
Leif Grönqvist, 23 October
GSLT & MSI@VxU
Experiment Setup
• Training data
  • Newspaper texts
  • Size: 10 Mtokens?
• Test data (HP200)
  • ORD (the vocabulary section) from Högskoleprovet, 5 years (200 questions), example:
    ansats (attempt, approach)
    • A. sammanfattning (summary)
    • B. syfte (purpose)
    • C. fortsättning (continuation)
    • D. försök (attempt)
    • E. granskning (review)
Evaluating Vector Space... GSLT Internal Conference
Basic Approach
• Calculate a vector space model using the training data (I will use SVD for dimensionality reduction)
• For each question:
  • Calculate vectors for the question word and the alternatives
  • Select the alternative whose vector is closest to the question word vector
• Can be used to evaluate different vector space models!
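The selection step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the poster does not say which closeness measure is used, so cosine similarity is assumed here, and the vectors are made-up toy values standing in for rows of an SVD-reduced model. The function names `cosine`, `choose_alternative`, and `accuracy` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def choose_alternative(question_vec, alternative_vecs):
    """Return the label of the alternative whose vector is closest to the question vector."""
    return max(alternative_vecs,
               key=lambda label: cosine(question_vec, alternative_vecs[label]))

def accuracy(items):
    """Fraction of (question_vec, alternative_vecs, correct_label) items answered correctly."""
    hits = sum(1 for q, alts, correct in items if choose_alternative(q, alts) == correct)
    return hits / len(items)

# Toy vectors (assumption: in the real setup these would come from the reduced model).
toy_question = [0.9, 0.1, 0.3]
toy_alternatives = {
    "A": [0.1, 0.8, 0.2],
    "B": [0.2, 0.7, 0.1],
    "C": [0.0, 0.1, 0.9],
    "D": [0.8, 0.2, 0.4],
    "E": [0.3, 0.9, 0.0],
}
best = choose_alternative(toy_question, toy_alternatives)
print(best)
```

Running `accuracy` over all test questions for two different models would give one comparable score per model, which is the sense in which the test can be used for evaluation.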
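But: The tests contain phrases!
• psykoprofylax (psychoprophylaxis)
  • A. återspegling av känslolivet i t.ex. gester och kroppshållning (reflection of emotional life in, e.g., gestures and posture)
  • B. förmåga att med tankekraft sätta föremål i rörelse (ability to set objects in motion by the power of thought)
  • C. förmåga att se in i framtiden (ability to see into the future)
  • D. metod för att förhindra oro och smärta vid t.ex. förlossning (method for preventing anxiety and pain during, e.g., childbirth)
  • E. mätning av mentala prestationer, förmågor och personlighetsdrag (measurement of mental performance, abilities, and personality traits)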
Main problem
• Try to build a vector space that handles phrases
• Ordinary LSI:
  • The corresponding vector for a word A is the “meaning” of A
  • meaning(A B C) = meaning(A) + meaning(B) + meaning(C)
• But how could we then know that:
  • reda av → göra soppa eller sås tjockare (“reda av” means to make a soup or sauce thicker)
  • (that meaning of reda is very rare)
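The additive composition used in ordinary LSI can be illustrated as follows. This is a minimal sketch with made-up two-dimensional vectors; skipping out-of-vocabulary tokens is an assumption made here, not something the poster specifies, and `phrase_vector` is a hypothetical name.

```python
def phrase_vector(tokens, word_vectors):
    """Sum the vectors of the known tokens; return None if no token is known.

    Assumption: tokens missing from word_vectors are simply skipped.
    """
    known = [word_vectors[t] for t in tokens if t in word_vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(vec[i] for vec in known) for i in range(dim)]

# Toy vectors (made-up values, for illustration only).
toy_vectors = {
    "göra": [1.0, 0.0],
    "soppa": [0.0, 2.0],
    "tjockare": [1.0, 1.0],
}
phrase = ["göra", "soppa", "eller", "sås", "tjockare"]
print(phrase_vector(phrase, toy_vectors))
```

Because each word contributes the same vector in every context, a sum like this cannot reflect the rare sense that reda takes on in “reda av”, which is the problem the slide points to.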
Improvements(?)
Have to be made during tokenization:
• Improvement 1: add tuples up to length n
  (“president Bill Clinton” → president, Bill, Clinton, president_Bill, Bill_Clinton, president_Bill_Clinton)
• Dependency improvement:
  • Run the MALT parser
  • Create and include tuples according to the dependencies
    (“president Bill Clinton” → president, Bill, Clinton, Bill_Clinton, president_Bill_Clinton)
• Ultimate improvement: combine them?
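Improvement 1 amounts to emitting every contiguous token sequence up to length n as an extra token. A minimal sketch follows, assuming tokens are joined with an underscore as in the example above; the function name `add_tuples` is hypothetical. The dependency-based variant is not sketched, since it depends on parser output the poster does not show.

```python
def add_tuples(tokens, max_len):
    """Return the tokens plus all contiguous tuples of length 2..max_len, joined by '_'."""
    out = []
    for length in range(1, max_len + 1):
        # Slide a window of the current length over the token sequence.
        for start in range(len(tokens) - length + 1):
            out.append("_".join(tokens[start:start + length]))
    return out

print(add_tuples(["president", "Bill", "Clinton"], 3))
# ['president', 'Bill', 'Clinton', 'president_Bill', 'Bill_Clinton', 'president_Bill_Clinton']
```

Note that the number of emitted tokens grows roughly n-fold with the maximum tuple length, which also enlarges the vocabulary the vector space has to cover.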
No results yet, but more problems
• HP200 contains 214 distinct words
  • 10 are not present in the training data (åsiktsmässig, porslinsmålning, humus, igångsättning, fröställning, …)
  • 30 have a frequency between 1 and 4
• If we don’t know the question word, we have to guess → result = baseline
• If we don’t know some of the alternatives, should we ever guess an unknown alternative?
• Maybe compound analysis is needed…
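The cases listed above can be made explicit with a small coverage check. This is a sketch only: the vocabulary is a toy set, an alternative is treated as unknown when none of its tokens occur in the vocabulary (an assumption), and the names `coverage_status` and `GUESS_BASELINE` are hypothetical. It does not settle whether an unknown alternative should ever be chosen.

```python
# With five alternatives (A-E), uniform guessing is right one time in five.
GUESS_BASELINE = 1 / 5

def coverage_status(question_word, alternatives, vocabulary):
    """Classify a question as 'guess', 'partial', or 'full'.

    'guess':   the question word is unknown, so only guessing is possible.
    'partial': the question word is known but at least one alternative is unknown.
    'full':    the question word and every alternative are known.
    Assumption: an alternative is unknown if none of its tokens is in the vocabulary.
    """
    if question_word not in vocabulary:
        return "guess"
    unknown = [label for label, tokens in alternatives.items()
               if not any(t in vocabulary for t in tokens)]
    return "partial" if unknown else "full"

# Toy vocabulary (made up for illustration; not the real training vocabulary).
toy_vocabulary = {"ansats", "sammanfattning", "syfte", "fortsättning", "försök"}
toy_alternatives = {
    "A": ["sammanfattning"],
    "B": ["syfte"],
    "C": ["fortsättning"],
    "D": ["försök"],
    "E": ["granskning"],
}
print(coverage_status("ansats", toy_alternatives, toy_vocabulary))
print(coverage_status("humus", toy_alternatives, toy_vocabulary))
```

Counting how many questions fall into each class would show how much of the final score is determined by guessing rather than by the model.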
Cooperation with SICS
• Magnus Sahlgren will try the same experiment using RI (Random Indexing) and other training data
• We will then try the tuned systems with the same training data
• Compared to TOEFL, ORD200 seems to be much harder
  • Many phrases
  • More difficult and old-fashioned words
  • Easier to find good training data for English
• Main goal: to use ORD200 for evaluation