
Extracting Semantic Networks From Text Via Relational Clustering


Presentation Transcript


  1. Extracting Semantic Networks From Text Via Relational Clustering. Stanley Kok, Dept. of Computer Science and Engineering, University of Washington, Seattle, USA. Joint work with Pedro Domingos.

  2. Goal: Reading & Understanding Text. A Step Towards the Goal. [Pipeline diagram: Webpages / Text → TextRunner [Banko et al., IJCAI'07] → (Object, Relation, Object) triples → Semantic Network Extractor → Semantic Network (e.g., country invade country, people emigrate to country, people embrace religion) → Knowledge Base (e.g., man(x) => human(x), child(x,y) => parent(y,x), …) → Autonomous Agent]

  3. Snippet of Extracted Semantic Network. [Diagram: a cluster of countries {America, US, USA, Australia, Britain, UK, China, Spain, Iraq, Germany, …} is linked to a cluster of military forces {army, troops, force, forces, navy} by the relation cluster {pulled, withdrew, to_withdraw, to_remove}, and to a cluster of trade terms {export, import, imports, importation} by the relation cluster {banned, had_banned, prohibited, restricted}; a cluster of organizations {EU, European_Union, UN, United_Nations} is linked to {part, role} by the relation cluster {played, has_played, will_play, …}]

  4. Motivation • Supervised approaches: require manual annotation of training data; not scalable to the Web (e.g., semantic parsing [Wong & Mooney, ACL'07]) • Unsupervised approaches: extract noisy & sparse ground facts, with no high-level knowledge that generalizes them (e.g., TextRunner [Banko et al., IJCAI'07]) • SNE: unsupervised, domain-independent, scales to the Web; Text → simple semantic network; abundance of Web text → KB

  5. Overview • Motivation • Background • Semantic Network Extractor • Experiments • Future Work

  6. Overview • Motivation • Background • Semantic Network Extractor • Experiments • Future Work

  7. Markov Logic • A logical KB is a set of hard constraints on the set of possible worlds • Let's make them soft constraints: when a world violates a formula, it becomes less probable, not impossible • Give each formula a weight (higher weight → stronger constraint)
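To make this concrete, here is a toy sketch in Python (the atoms, formula, and weight are invented for illustration, not taken from the talk): the probability of a possible world is proportional to the exponentiated sum of the weights of the formulas that world satisfies.

```python
import itertools
import math

# Toy Markov logic example: two ground atoms and one soft formula.
# Atoms: smokes, cancer. Soft formula: smokes => cancer, weight 1.5.
# P(world) is proportional to exp(sum of weights of satisfied formulas).

WEIGHT = 1.5  # higher weight -> stronger constraint

def satisfies_implication(smokes: bool, cancer: bool) -> bool:
    """Truth value of the formula smokes => cancer in a world."""
    return (not smokes) or cancer

worlds = list(itertools.product([False, True], repeat=2))
scores = {(s, c): math.exp(WEIGHT * satisfies_implication(s, c))
          for s, c in worlds}
z = sum(scores.values())  # partition function (normalizer)

for (s, c), score in scores.items():
    # The violating world (smokes=True, cancer=False) is less
    # probable than the others, but not impossible.
    print(f"smokes={s!s:5} cancer={c!s:5} P={score / z:.3f}")
```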

  8. TextRunner [Banko et al., IJCAI'07] • Extracts (object, relation, object) triples from webpages in a single pass • Identifies nouns with a noun-phrase chunker • Heuristically identifies the string between two nouns as a relation • Classifies each triple as true or false using a Naïve Bayes classifier
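A toy sketch of that heuristic (not TextRunner's actual implementation, which uses a trained noun-phrase chunker and a Naïve Bayes truth classifier; here capitalized tokens stand in for noun phrases):

```python
import re

# Toy sketch of the extraction heuristic described above (NOT
# TextRunner itself): treat capitalized tokens as noun phrases and
# the text between two of them as a candidate relation string.
NP = re.compile(r"[A-Z][a-zA-Z_]*")

def extract_triples(sentence: str):
    """Yield (object, relation, object) candidates from one sentence."""
    spans = [(m.start(), m.end(), m.group()) for m in NP.finditer(sentence)]
    for (s1, e1, np1), (s2, _, np2) in zip(spans, spans[1:]):
        relation = sentence[e1:s2].strip(" ,.")
        if relation:  # skip adjacent noun phrases
            yield (np1, relation, np2)

print(list(extract_triples("Columbia was delivered to Kennedy")))
# [('Columbia', 'was delivered to', 'Kennedy')]
```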

  9. Overview • Motivation • Background • Semantic Network Extractor • Experiments • Future Work

  10. Semantic Network Extractor • Input: tuples r(x,y) • Output: simple semantic network • Clusters objects and relations simultaneously • The number of clusters need not be specified in advance • Clusters relations by the objects they relate, and vice versa

  11. Notation • Cluster: a set of symbols of one type, e.g., a relation cluster {r1, r2} or object clusters {x1, x2} and {y1, y2} • Clustering: a partition of each type's symbols into clusters. [Diagram: relation symbols r1 through r7 linking object symbols x1 through x6 to y1 through y5, with the symbols grouped into clusters]

  12. Notation • Atom: a ground fact r(x,y), e.g., r1(x1,y1) • Cluster combination: a triple (relation cluster, x cluster, y cluster); every atom falls into exactly one cluster combination. [Same diagram as the previous slide, with the clusters combined]
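To ground the notation, a minimal Python sketch (names and example clusters are illustrative, not the authors' code):

```python
# Illustrative data structures for the notation above. A clustering
# maps each symbol to its cluster id; an atom r(x, y) falls into the
# cluster combination formed by the clusters of its three symbols.

relation_cluster = {"orbits": 0, "revolves_around": 0, "delivered_to": 1}
x_cluster = {"Apollo_10": 0, "Odyssey": 0, "space_shuttle": 1}
y_cluster = {"Earth": 0, "Moon": 0, "Kennedy_Space_Center": 1}

def cluster_combination(r: str, x: str, y: str) -> tuple:
    """The (relation-cluster, x-cluster, y-cluster) triple of an atom."""
    return (relation_cluster[r], x_cluster[x], y_cluster[y])

# Both atoms below land in the same cluster combination (0, 0, 0),
# so the model predicts their truth values jointly.
print(cluster_combination("orbits", "Apollo_10", "Earth"))
print(cluster_combination("revolves_around", "Odyssey", "Moon"))
```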

  13. SNE Model • Four rules: • Each symbol belongs to exactly one cluster • Exponential prior on the number of cluster combinations • Most symbols tend to be in different clusters • The atom prediction rule (next slide)

  14. SNE Model • Atom prediction rule: the truth value of an atom is determined by the cluster combination it belongs to • The weight of each grounding of the rule is the log-odds of an atom in its cluster combination being true, computed from the numbers of true and false atoms in that combination together with smoothing parameters
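A one-function sketch of that weight computation, reading the smoothing parameters as pseudo-counts of true and false atoms (my assumption; the talk does not spell out the exact form):

```python
import math

def atom_rule_weight(n_true: int, n_false: int,
                     alpha: float, beta: float) -> float:
    """Log-odds that an atom in a cluster combination is true,
    smoothed by pseudo-counts alpha (true) and beta (false)."""
    return math.log((n_true + alpha) / (n_false + beta))

# e.g., 40 true and 10 false atoms in a cluster combination:
print(atom_rule_weight(40, 10, alpha=1.0, beta=1.0))  # ~ log(41/11)
```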

  15. Learning SNE Model • Learning consists of finding • the weights of the atom prediction rules, and • the cluster assignment Γ = (Γr, Γx, Γy), i.e., the assignment of relation and object symbols to clusters • that together maximize the log-posterior probability log P(Γ | R), where R is the vector of truth assignments to all observed ground atoms r(x,y); the likelihood comes from the atom prediction rules and the prior from the first three rules

  16. Log Posterior • log P(Γ | R) = Σ_{c ∈ C} [ t_c · log p_c + f_c · log(1 − p_c) ] − λ · (#cluster combinations) + μ · (#pairs of symbols in different clusters) + constant • where C is the set of cluster combinations, t_c and f_c are the numbers of true and false atoms in combination c, p_c is the probability that an atom in c is true, and λ, μ are the weights of the prior rules • Computing this exactly is intractable!

  17. Number of Cluster Combinations. [Diagram: example triples such as delivered_to(space_shuttle, Kennedy_Space_Center), delivered_to(Columbia, Kennedy_Space_Center), orbits(Columbia, Earth), think_about(astronomers, planet)]

  18. Number of Cluster Combinations. [Diagram: the same example shown twice, extended with Voyager, illustrating how grouping symbols into clusters changes the number of cluster combinations]

  19. Log-Posterior (Approximation) • Assume that atoms in cluster combinations containing only false atoms all belong to a single 'default' cluster combination • The first sum then runs only over the set of cluster combinations with ≥ 1 true r(x,y) atom, and the default combination contributes (#false atoms in combinations with only false atoms) · log Pr(atom = false) • Si denotes the set of symbols of type i
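Putting slides 16 and 19 together, a Python sketch of the approximate objective (variable names are mine, the smoothing form is assumed as on slide 14, and this is illustrative rather than the released SNE code):

```python
import math
from collections import Counter

def approx_log_posterior(true_atoms, r_cl, x_cl, y_cl,
                         alpha, beta, p_false, lam, mu, pairs_diff):
    """Approximate log-posterior of a clustering (illustrative sketch).
    Cluster combinations with >= 1 true atom are scored via smoothed
    per-combination probabilities; all remaining (all-false) atoms
    are pooled into one 'default' combination scored with p_false."""
    size_r = Counter(r_cl.values())   # cluster id -> #symbols in it
    size_x = Counter(x_cl.values())
    size_y = Counter(y_cl.values())
    # Number of true atoms in each non-default cluster combination.
    t = Counter((r_cl[r], x_cl[x], y_cl[y]) for (r, x, y) in true_atoms)
    score, atoms_covered = 0.0, 0
    for (cr, cx, cy), t_c in t.items():
        n_c = size_r[cr] * size_x[cx] * size_y[cy]  # atoms in combination
        f_c = n_c - t_c
        p_c = (t_c + alpha) / (n_c + alpha + beta)  # smoothed Pr(true)
        score += t_c * math.log(p_c) + f_c * math.log(1.0 - p_c)
        atoms_covered += n_c
    # Default combination: every atom not covered above is false.
    n_total = len(r_cl) * len(x_cl) * len(y_cl)
    score += (n_total - atoms_covered) * math.log(p_false)
    # Priors: exponential penalty per cluster combination with a true
    # atom, reward per pair of symbols in different clusters.
    return score - lam * len(t) + mu * pairs_diff
```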

  20. Search Algorithm • Approximation: hard assignment of symbols to clusters • Searches over cluster assignments, evaluating each by its log-posterior • Agglomerative clustering: start with each r, x, and y symbol in its own cluster, then merge pairs of clusters in a bottom-up manner (a simplified sketch follows)
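A deliberately simplified sketch of this search over one symbol type (the real system clusters relations and objects together and restricts the candidate merges for scalability; score_fn stands in for the approximate log-posterior above):

```python
def agglomerative_search(symbols, score_fn):
    """Greedy agglomerative clustering over one symbol type
    (illustrative sketch). Starts with singleton clusters and keeps
    applying the merge that most improves score_fn(clustering)."""
    clustering = [{s} for s in symbols]  # one cluster per symbol
    best = score_fn(clustering)
    improved = True
    while improved:
        improved = False
        best_pair = None
        for i in range(len(clustering)):
            for j in range(i + 1, len(clustering)):
                merged = ([c for k, c in enumerate(clustering)
                           if k not in (i, j)]
                          + [clustering[i] | clustering[j]])
                s = score_fn(merged)
                if s > best:
                    best, best_pair, improved = s, (i, j), True
        if best_pair:
            i, j = best_pair
            clustering = ([c for k, c in enumerate(clustering)
                           if k not in (i, j)]
                          + [clustering[i] | clustering[j]])
    return clustering
```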

  21. Search Algorithm. [Diagram: object symbols Apollo 10, Odyssey, space_shuttle, Earth, Moon and relation symbols orbits, revolves_around, each initially in its own cluster]

  22. Search Algorithm. [Diagram: candidate merges, e.g., the object clusters {Apollo 10} and {Odyssey} and the relation clusters {orbits} and {revolves_around}]

  23. Search Algorithm. [Diagram: the resulting clustering after the merges]

  24. Overview • Motivation • Background • Semantic Network Extractor • Experiments • Future Work

  25. Dataset • 2.1 million triples extracted in a Web crawl by TextRunner [Banko et al., IJCAI'07] • e.g., named_after(Jupiter, Roman_god), upheld(Court, ruling), etc. • 15,872 r symbols, 700,781 x symbols, 665,378 y symbols • Only consider symbols appearing ≥ 25 times: 10,214 r symbols, 8,942 x symbols, 7,995 y symbols • 2,065,045 triples contain at least one such symbol (a sketch of this filtering step follows)
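The frequency cutoff described above can be sketched as follows (illustrative; min_count=25 matches the slide):

```python
from collections import Counter

def filter_triples(triples, min_count=25):
    """Keep triples containing at least one symbol that appears
    >= min_count times in its position (illustrative sketch of the
    preprocessing described above)."""
    r_counts, x_counts, y_counts = Counter(), Counter(), Counter()
    for r, x, y in triples:
        r_counts[r] += 1
        x_counts[x] += 1
        y_counts[y] += 1
    return [(r, x, y) for (r, x, y) in triples
            if r_counts[r] >= min_count or x_counts[x] >= min_count
            or y_counts[y] >= min_count]
```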

  26. Comparison Systems • Multiple Relational Clustering (MRC) [Kok & Domingos, ICML'07]: similar to SNE; finds multiple clusterings; exponential prior on #clusters; no 'symbols tend to be in different clusters' rule • Information-Theoretic Co-clustering (ITC) [Dhillon et al., KDD'03]: clusters data in a 2D matrix along both dimensions, maximizing mutual information between row and column clusters; extended here to 3D and to use a BIC prior on #cluster combinations • Infinite Relational Model (IRM) [Kemp et al., AAAI'06]: generative model, Beta → p → Bernoulli → atoms; changed here to use a CRP prior on #cluster combinations • The search algorithms of ITC and IRM were changed to SNE's agglomerative clustering

  27. Evaluation • Pairwise precision, recall, & F1 against a manually created gold standard (see the sketch after this slide) • 2,688 r symbols, 2,568 x symbols, 3,058 y symbols assigned to non-unit clusters • 874 r non-unit clusters, 511 x non-unit clusters, 700 y non-unit clusters • Remaining symbols assigned to unit clusters • Correct semantic statements: cluster combinations with ≥ 5 true ground r(x,y) atoms
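Pairwise scoring treats each pair of symbols placed in the same cluster as one prediction; a minimal sketch, with a tiny made-up example:

```python
from itertools import combinations

def pairwise_prf(pred_clusters, gold_clusters):
    """Pairwise precision, recall, and F1 of a predicted clustering
    against a gold clustering (clusters are iterables of symbols)."""
    def same_cluster_pairs(clusters):
        return {frozenset(p) for c in clusters
                for p in combinations(sorted(c), 2)}
    pred = same_cluster_pairs(pred_clusters)
    gold = same_cluster_pairs(gold_clusters)
    tp = len(pred & gold)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

print(pairwise_prf([{"US", "USA", "UK"}], [{"US", "USA"}, {"UK"}]))
# (0.333..., 1.0, 0.5)
```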

  28. Parameter Settings • Closed-world assumption: triples not in the DB are assumed false • SNE parameters: prior-rule weights λ = μ = 100; p_false = 0.9999; smoothing parameters α = 2.81 × 10^-9 and β = 10^-3 − α, so that α/(α+β) equals the fraction of true triples in the dataset • Tried various parameter values for MRC, ITC, and IRM, and chose the best ones

  29. SNE vs. MRC. [Chart comparing SNE and MRC]

  30. SNE vs. IRM vs. ITC. [Chart comparing SNE, IRM, and ITC]

  31. SNE vs. ITC vs. IRM. [Chart: values 0.778, 0.874, 0.835; differences of more than 2x and 3x]

  32. SNE vs. ITC vs. IRM. [Chart: running times in hours]

  33. SNE Full Joint Model vs. Separate Clustering. [Chart comparing the full joint model with clustering each symbol type separately]

  34. SNE and WordNet • Compare SNE's object clusters with WordNet • 5,000 object symbols overlap with WordNet • Convert each node (synset) in the WordNet taxonomy to contain its children's concepts too • Match each SNE cluster to the WordNet cluster with the best F1 score (see the sketch below) • The lower the matched cluster sits in the WordNet taxonomy, the more precise the concept
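Matching each SNE cluster to its best WordNet node might look like this (illustrative; WordNet nodes are represented simply as sets of symbols, and the precision/recall directions are one reasonable choice, not necessarily the paper's):

```python
def best_f1_match(sne_cluster, wordnet_clusters):
    """Return the WordNet cluster (a set of symbols) whose overlap
    with the given SNE cluster has the highest F1 score."""
    def f1(sne, wn):
        tp = len(sne & wn)
        if tp == 0:
            return 0.0
        precision, recall = tp / len(wn), tp / len(sne)
        return 2 * precision * recall / (precision + recall)
    return max(wordnet_clusters, key=lambda wn: f1(sne_cluster, wn))

# e.g., {"US", "Britain", "Germany"} should match a 'country' node
# rather than a broader 'location' node higher in the taxonomy.
```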

  35. Levels of Matched WordNet Clusters. [Chart: level of the matched WordNet cluster plotted against SNE cluster size]

  36. Snippet of Extracted Semantic Network. [Diagram: the relation cluster {emigrated_to, relocated_to, escaped_to, moved_back_to} links the object cluster {brother, father, mother, parents, couple, family, friends} to the country cluster {America, US, Australia, Austria, Britain, India, Germany, …}; {converted_to, embraced, to_embrace} links it to the religion cluster {Islam, Judaism, Catholicism, Christianity, Protestantism}; {conducted_in, carried_out_in} links {research, studies, study, results} to the country cluster]

  37. Overview • Motivation • Background • Semantic Network Extractor • Experiments • Future Work

  38. Future Work • Integrate tuple extraction into SNE • Learn richer semantic networks • Learn logical theories • Etc.

  39. Conclusion • SNE: an unsupervised, domain-independent approach, Text → simple semantic network • Takes us a step closer to the "grand agenda" of Text → KB • Based on Markov logic • Techniques to scale SNE up to the Web • Comparisons with other approaches show the promise of SNE
