Survey Analysis: An Attempt to Develop an Intuition of Semantic Relatedness
Outline • Motivation • Survey framework • Analysis
Motivation • Semantic relatedness is a broad, subjective concept • Given a pair of words: • Are they related? • If so, to what extent? • What kind of relationship holds between them? • The answer varies from person to person, depending on background, culture, work domain, etc. • Example: Apple - Computer
Existing Datasets • Rubenstein and Goodenough (1965) – 65 English noun pairs (RG-65) • Miller and Charles (1991) – subset of RG-65, 30 English noun pairs (MC-30) • Finkelstein et al. (2002) – 353 word pairs (Fin1-153 and Fin2-200) • Yang and Powers (2006) – 130 verb pairs (YP-130)
Problems with current datasets • Limited to a single part of speech • Focus on semantic similarity instead of relatedness • Datasets are usually very small; they are constructed manually, which is labor intensive • Only general terms are included; domain-specific terms are missing • They provide no insight into the type of semantic relatedness (SR)
Survey Framework • Created using the 30 word pairs from the Miller and Charles (1991) dataset • Participants were asked to rate relatedness on a scale of 0 to 4, with 0 meaning not related at all and 4 meaning highly related • They were also asked to specify the kind of relationship • They were told that two words may be related in a variety of ways: synonymy, antonymy, frequent association, is-a, part-of, domain related, etc.
Survey Framework • Conducted among students of IIT Bombay (particularly those with a computer science & linguistics background) • 55 students participated in the survey • Built using Java Servlets and the Tomcat container
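To make the survey output concrete, here is a minimal sketch of how responses of this kind could be summarized per word pair. It is not the survey's actual code (which the slides say was written with Java Servlets); the response tuples, the ratings, the relationship labels, and the `summarize` helper are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical responses: (word1, word2, rating on the 0-4 scale, relationship type).
# The ratings and labels below are invented for illustration, not survey data.
responses = [
    ("car", "automobile", 4, "synonymy"),
    ("car", "automobile", 4, "synonymy"),
    ("car", "automobile", 3, "synonymy"),
    ("coast", "shore", 4, "synonymy"),
    ("coast", "shore", 3, "synonymy"),
    ("noon", "string", 0, "none"),
    ("noon", "string", 1, "none"),
]


def summarize(responses):
    """Return the mean rating and most frequent relationship type per word pair."""
    ratings = defaultdict(list)
    kinds = defaultdict(Counter)
    for word1, word2, rating, kind in responses:
        if not 0 <= rating <= 4:
            raise ValueError(f"rating {rating} is outside the 0-4 scale")
        pair = (word1, word2)
        ratings[pair].append(rating)
        kinds[pair][kind] += 1
    return {
        pair: {
            "mean_rating": mean(values),
            "top_relation": kinds[pair].most_common(1)[0][0],
        }
        for pair, values in ratings.items()
    }


summary = summarize(responses)
for pair, stats in summary.items():
    print(pair, stats)
```

Keeping the relationship type next to each rating is what lets such a summary report the kind of relatedness as well as its strength.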
Correlation Coefficient • Correlation between the new survey ratings for the MC-30 pairs and the original Miller and Charles ratings = 0.91, which is quite strong
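The slides do not say which correlation coefficient was used. As a rough sketch, assuming Pearson's coefficient computed over per-pair mean ratings, the comparison could look like the following. The `pearson` helper and both rating lists are hypothetical; the numbers are made up and are not the survey's data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-pair mean ratings, listed in the same pair order.
original_ratings = [3.9, 3.7, 0.1, 1.2, 2.8]  # stand-in for the original MC scores
new_ratings = [3.8, 3.5, 0.4, 1.0, 3.0]       # stand-in for the new survey means

r = pearson(original_ratings, new_ratings)
print(f"Pearson r = {r:.2f}")
```

If rank agreement matters more than linear agreement, a rank-based coefficient such as Spearman's would be the alternative; the slides give no indication either way.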