This study by Dongbin Yan presents a method for measuring relational similarity between word pairs using Multi-Task Lasso. The approach answers SAT analogy questions by extracting snippets and patterns from web search results, then compressing and denoising the resulting relational features. Each SAT question is treated as several related tasks that share a common set of features, and the features of each word pair are collected in a feature matrix. The method reaches an accuracy rate of 50.3%, which the author reports as higher than the other methods compared. Lasso is written in vector and matrix form and used to reduce each feature matrix to a feature vector before relational similarity is computed.
Department of Computer Science and Technology, East China Normal University • Dongbin Yan • Relational Similarity Measurement Between Word-pairs using Multi-Task Lasso
Problem we need to solve • SAT analogy question • Sample
Our Method • Snippet extraction • Pattern extraction • Compression and denoising by multi-task Lasso • Computing relational similarity
Snippet Extraction • Query a web search engine with the keywords ‘lion cat’ and retrieve the result snippets • Split each snippet on predefined separators
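The splitting step can be sketched as follows. This is a minimal illustration, not the author's code: the slides say only "predefined separator", so the separator set `SEPARATORS` and the helper name `split_snippet` are assumptions.

```python
import re

# Hypothetical separator set: the slides do not list the actual separators.
SEPARATORS = r"[.;!?|]+"

def split_snippet(snippet: str) -> list[str]:
    """Split one search-result snippet into trimmed, non-empty fragments."""
    parts = re.split(SEPARATORS, snippet)
    return [p.strip() for p in parts if p.strip()]

# Example snippet text is made up for illustration.
fragments = split_snippet(
    "The lion is a big cat ... Unlike a cat, a lion lives in prides. Lions roar!"
)
```

Each fragment is then searched independently for the word pair, so a match never spans two sentences.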
Pattern Extraction • Match ‘lion***cat’ and ‘cat***lion’ in each snippet, where *** stands for the middle words between the pair • Create the feature matrix • Each match is a row of the feature matrix • 64 conjunctions are the columns of the feature matrix • F[m,n] = 1 if the middle words of the m-th match contain the n-th conjunction; otherwise F[m,n] = 0
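A sketch of how the binary feature matrix F could be built from the fragments. The slides do not list the 64 conjunctions, so `CONJUNCTIONS` below is a short hypothetical stand-in, and the function names are illustrative only.

```python
import re

# Hypothetical stand-in for the 64 conjunctions used in the slides.
CONJUNCTIONS = ["and", "or", "but", "like", "than"]

def middle_words(fragment: str, a: str, b: str) -> list[list[str]]:
    """Return the words found between 'a ... b' and 'b ... a' in a fragment."""
    matches = []
    for first, second in ((a, b), (b, a)):
        pattern = rf"\b{re.escape(first)}\b(.*?)\b{re.escape(second)}\b"
        for m in re.finditer(pattern, fragment, flags=re.IGNORECASE):
            matches.append(re.findall(r"[a-z']+", m.group(1).lower()))
    return matches

def feature_matrix(fragments: list[str], a: str, b: str) -> list[list[int]]:
    """One row per match; F[m][n] = 1 if match m contains conjunction n."""
    rows = []
    for fragment in fragments:
        for words in middle_words(fragment, a, b):
            rows.append([1 if c in words else 0 for c in CONJUNCTIONS])
    return rows

F = feature_matrix(["a lion is bigger than a cat", "cat and lion"], "lion", "cat")
```

The number of rows varies with the number of matches found for a word pair, which is why a later step is needed to reduce each matrix to a fixed-length vector.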
Compression and denoising by Multi-Task Lasso • Represent the feature matrix by Lasso • (1) Vector expression • (2) Matrix expression • (3) Lasso expression • Treat one SAT question as six related tasks that share a common set of features, such as the sparsity-controlling parameter • Use MALSAR to compress and denoise the features from matrix to vector
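The three expressions named above are not preserved in this transcript. For orientation, the textbook forms of a linear model and the Lasso are given below, followed by one common multi-task variant with a shared sparsity pattern. All symbols (y, X, β, ε, λ, W) are assumptions, and the slides do not say which MALSAR formulation was used, so this is a sketch of the general technique rather than the author's exact objective.

```latex
% (1) Vector expression: one observation i
y_i = \mathbf{x}_i^{\top} \boldsymbol{\beta} + \varepsilon_i

% (2) Matrix expression: all observations stacked
\mathbf{y} = X \boldsymbol{\beta} + \boldsymbol{\varepsilon}

% (3) Lasso expression: least squares with an L1 penalty
\hat{\boldsymbol{\beta}} = \arg\min_{\boldsymbol{\beta}}
  \tfrac{1}{2} \lVert \mathbf{y} - X \boldsymbol{\beta} \rVert_2^2
  + \lambda \lVert \boldsymbol{\beta} \rVert_1

% Multi-task variant (assumed): six tasks t = 1..6, W = [w_1, ..., w_6],
% rows of W index features, so the L2,1 norm zeroes a feature for all tasks at once
\hat{W} = \arg\min_{W} \sum_{t=1}^{6}
  \lVert \mathbf{y}_t - X_t \mathbf{w}_t \rVert_2^2
  + \lambda \lVert W \rVert_{2,1},
\qquad
\lVert W \rVert_{2,1} = \sum_{j} \lVert W_{j,:} \rVert_2
```

A single λ shared by all six tasks is one way to read "sharing a common set of features such as the sparsity-controlling parameter".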
Computing Relational Similarity • Relational similarity can be represented by the cosine of the angle between the corresponding feature vectors • We can compute the relational similarity as follows: sim(a, b) = cos θ = (a · b) / (‖a‖ ‖b‖)
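A minimal sketch of the cosine computation and of picking an answer with it. The slides do not describe the selection rule explicitly, so choosing the candidate with the highest cosine is an assumption, as are the function names and the toy vectors.

```python
import math

def cosine(u: list[float], v: list[float]) -> float:
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)

def best_choice(stem: list[float], choices: list[list[float]]) -> int:
    """Index of the choice whose vector is most similar to the stem's (assumed rule)."""
    sims = [cosine(stem, c) for c in choices]
    return max(range(len(choices)), key=sims.__getitem__)

# Toy vectors for illustration only.
answer = best_choice([1.0, 2.0, 0.0], [[0.0, 0.0, 1.0], [2.0, 4.0, 0.0], [1.0, 0.0, 0.0]])
```

Because cosine ignores vector length, a word pair with many matches and one with few matches can still be compared directly.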
Accuracy Rate • We get an accuracy rate of 50.3%, which is higher than the other methods compared