How to perform tree surgery
Anna Rafferty, Marie-Catherine de Marneffe

Tsurgeon, by Roger Levy
• What? Performs operations on grammatical trees
• How? Based on Tregex syntax
• Where? JavaNLP: trees.tregex.tsurgeon

How? Tregex
[Figure: parse tree of "The firm stopped using crocidolite in its cigarette filters"]
• A utility for identifying patterns in trees (like regular expressions for strings)
• Patterns combine node descriptions and relationships between nodes
    NP < /^NN/        (an NP that immediately dominates a node whose label starts with NN)
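
To make the relation in NP < /^NN/ concrete, here is a minimal sketch in plain Java. It does not use Tregex: the Node class, match and example are hypothetical names introduced for illustration, and the tree is a shortened version of the slide's sentence. Each matching NP is collected once, even if several daughters match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class ChildMatchSketch {
    /** Minimal labeled tree node (hypothetical; not the JavaNLP Tree class). */
    static final class Node {
        final String label;
        final List<Node> children = new ArrayList<>();

        Node(String label, Node... kids) {
            this.label = label;
            for (Node k : kids) children.add(k);
        }

        @Override public String toString() {
            if (children.isEmpty()) return label;
            StringBuilder sb = new StringBuilder("(" + label);
            for (Node c : children) sb.append(' ').append(c);
            return sb.append(')').toString();
        }
    }

    /** Collects, in pre-order, each node labeled parentLabel that has a daughter matching childRegex. */
    static List<Node> match(Node root, String parentLabel, Pattern childRegex) {
        List<Node> hits = new ArrayList<>();
        collect(root, parentLabel, childRegex, hits);
        return hits;
    }

    private static void collect(Node n, String parentLabel, Pattern childRegex, List<Node> hits) {
        if (n.label.equals(parentLabel)) {
            for (Node c : n.children) {
                // find() mirrors the unanchored regex search; "^" anchors the start of the label
                if (childRegex.matcher(c.label).find()) { hits.add(n); break; }
            }
        }
        for (Node c : n.children) collect(c, parentLabel, childRegex, hits);
    }

    /** Shortened version of the slide's sentence: "The firm stopped using crocidolite". */
    static Node example() {
        return new Node("S",
            new Node("NP", new Node("DT", new Node("The")), new Node("NN", new Node("firm"))),
            new Node("VP", new Node("VBD", new Node("stopped")),
                new Node("VP", new Node("VBG", new Node("using")),
                    new Node("NP", new Node("NN", new Node("crocidolite"))))));
    }

    public static void main(String[] args) {
        for (Node np : match(example(), "NP", Pattern.compile("^NN"))) System.out.println(np);
    }
}
```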

Tsurgeon syntax
• Define a pattern to be matched on the trees
    VBZ=vbz $ NP
• Define one or several operation(s)
    relabel vbz VBZ_TRANSITIVE
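
The pattern-then-operation idea can be sketched without the library. The code below hard-codes the sister relation ($) and the relabel operation; Node, hasSister, relabel and the example sentence are assumptions made for illustration, not Tsurgeon's implementation.

```java
import java.util.ArrayList;
import java.util.List;

class RelabelSketch {
    /** Minimal labeled tree node with a parent link (hypothetical; not the JavaNLP Tree class). */
    static final class Node {
        String label;
        Node parent;
        final List<Node> children = new ArrayList<>();

        Node(String label, Node... kids) {
            this.label = label;
            for (Node k : kids) { k.parent = this; children.add(k); }
        }

        @Override public String toString() {
            if (children.isEmpty()) return label;
            StringBuilder sb = new StringBuilder("(" + label);
            for (Node c : children) sb.append(' ').append(c);
            return sb.append(')').toString();
        }
    }

    /** The "$" relation: n has a sister (same parent, different node) labeled sisterLabel. */
    static boolean hasSister(Node n, String sisterLabel) {
        if (n.parent == null) return false;
        for (Node s : n.parent.children) {
            if (s != n && s.label.equals(sisterLabel)) return true;
        }
        return false;
    }

    /** Relabels every node matching "oldLabel $ sisterLabel"; returns how many were relabeled. */
    static int relabel(Node n, String oldLabel, String sisterLabel, String newLabel) {
        int count = 0;
        if (n.label.equals(oldLabel) && hasSister(n, sisterLabel)) {
            n.label = newLabel;
            count++;
        }
        for (Node c : n.children) count += relabel(c, oldLabel, sisterLabel, newLabel);
        return count;
    }

    /** Illustrative sentence, not from the slides. */
    static Node example() {
        return new Node("S",
            new Node("NP", new Node("DT", new Node("The")), new Node("NN", new Node("cat"))),
            new Node("VP", new Node("VBZ", new Node("eats")),
                new Node("NP", new Node("NN", new Node("fish")))));
    }

    public static void main(String[] args) {
        Node tree = example();
        relabel(tree, "VBZ", "NP", "VBZ_TRANSITIVE");
        System.out.println(tree);
    }
}
```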

Delete
Before:
    (ROOT (SBARQ (SQ (NP (NNS Cats)) (VP (VBP do) (VP (WHNP what) (VB eat)))) (PUNCT ?)))
Pattern and operation:
    PUNCT=punct > SBARQ
    delete punct
• Syntax: delete <name1>…<nameN>
• Deletes the named node and everything below it

Excise
Before:
    (ROOT (SBARQ (SQ (NP (NNS Cats)) (VP (VBP do) (VP (WHNP what) (VB eat))))))
Pattern and operation:
    SBARQ=sbarq > ROOT
    excise sbarq sbarq
After:
    (ROOT (SQ (NP (NNS Cats)) (VP (VBP do) (VP (WHNP what) (VB eat)))))
• Syntax: excise <name1> <name2>
• name1 is name2 or dominates name2
• All children of name2 go into the parent of name1, where name1 was
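
A rough sketch of what excise does to the tree, in plain Java rather than through Tsurgeon. Node, find and excise are hypothetical names, and the sketch assumes the caller guarantees that the first node is, or dominates, the second.

```java
import java.util.ArrayList;
import java.util.List;

class ExciseSketch {
    /** Minimal labeled tree node with a parent link (hypothetical; not the JavaNLP Tree class). */
    static final class Node {
        final String label;
        Node parent;
        final List<Node> children = new ArrayList<>();

        Node(String label, Node... kids) {
            this.label = label;
            for (Node k : kids) add(children.size(), k);
        }

        void add(int index, Node k) { k.parent = this; children.add(index, k); }

        @Override public String toString() {
            if (children.isEmpty()) return label;
            StringBuilder sb = new StringBuilder("(" + label);
            for (Node c : children) sb.append(' ').append(c);
            return sb.append(')').toString();
        }
    }

    /** First node with the given label, in pre-order, or null. */
    static Node find(Node n, String label) {
        if (n.label.equals(label)) return n;
        for (Node c : n.children) {
            Node hit = find(c, label);
            if (hit != null) return hit;
        }
        return null;
    }

    /** Removes top; the children of bottom take top's place under top's parent. */
    static void excise(Node top, Node bottom) {
        Node p = top.parent;
        if (p == null) throw new IllegalArgumentException("cannot excise the root");
        int i = p.children.indexOf(top);
        p.children.remove(i);
        top.parent = null;
        List<Node> kids = new ArrayList<>(bottom.children);
        bottom.children.clear();
        for (Node k : kids) p.add(i++, k);   // keep the original left-to-right order
    }

    /** The slide's "Cats do what eat" tree. */
    static Node example() {
        return new Node("ROOT", new Node("SBARQ", new Node("SQ",
            new Node("NP", new Node("NNS", new Node("Cats"))),
            new Node("VP", new Node("VBP", new Node("do")),
                new Node("VP", new Node("WHNP", new Node("what")),
                    new Node("VB", new Node("eat")))))));
    }

    public static void main(String[] args) {
        Node tree = example();
        Node sbarq = find(tree, "SBARQ");
        excise(sbarq, sbarq);                // mirrors "excise sbarq sbarq"
        System.out.println(tree);
    }
}
```

Passing two different nodes, for example SBARQ and SQ, would remove both layers at once and attach the NP and VP directly under ROOT.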

Prune
• Syntax: prune <name1>…<nameN>
• Different from delete: if the parent has no children left after the pruning, the parent is pruned too
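
The difference between delete and prune shows up when the removed node is an only child. The following is a minimal sketch under that reading of the two slides; Node, find, delete and prune are hypothetical helpers, not the library's code.

```java
import java.util.ArrayList;
import java.util.List;

class DeletePruneSketch {
    /** Minimal labeled tree node with a parent link (hypothetical; not the JavaNLP Tree class). */
    static final class Node {
        final String label;
        Node parent;
        final List<Node> children = new ArrayList<>();

        Node(String label, Node... kids) {
            this.label = label;
            for (Node k : kids) { k.parent = this; children.add(k); }
        }

        @Override public String toString() {
            if (children.isEmpty()) return label;
            StringBuilder sb = new StringBuilder("(" + label);
            for (Node c : children) sb.append(' ').append(c);
            return sb.append(')').toString();
        }
    }

    /** First node with the given label, in pre-order, or null. */
    static Node find(Node n, String label) {
        if (n.label.equals(label)) return n;
        for (Node c : n.children) {
            Node hit = find(c, label);
            if (hit != null) return hit;
        }
        return null;
    }

    /** Removes n and everything below it; the parent stays even if it is left childless. */
    static void delete(Node n) {
        Node p = n.parent;
        if (p != null) {
            p.children.remove(n);
            n.parent = null;
        }
    }

    /** Like delete, but also removes each ancestor that is left without children. */
    static void prune(Node n) {
        Node p = n.parent;
        delete(n);
        if (p != null && p.children.isEmpty()) prune(p);
    }

    /** The slide's tree, with the question mark attached under SBARQ. */
    static Node example() {
        return new Node("ROOT", new Node("SBARQ",
            new Node("SQ",
                new Node("NP", new Node("NNS", new Node("Cats"))),
                new Node("VP", new Node("VBP", new Node("do")),
                    new Node("VP", new Node("WHNP", new Node("what")),
                        new Node("VB", new Node("eat"))))),
            new Node("PUNCT", new Node("?"))));
    }

    public static void main(String[] args) {
        Node a = example();
        delete(find(a, "NNS"));   // NP stays behind, now childless
        System.out.println(a);
        Node b = example();
        prune(find(b, "NNS"));    // NP is removed too; SQ still has the VP, so it stays
        System.out.println(b);
    }
}
```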
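
Insert
Before:
    (ROOT (SQ (NP (NNS Cats)) (VP (VBP do) (VP (WHNP what) (VB eat)))))
Pattern and operation:
    SQ=sq > ROOT !<- /PUNCT/
    insert (PUNCT .) >-1 sq        (insert <tree> <position>)
After:
    (ROOT (SQ (NP (NNS Cats)) (VP (VBP do) (VP (WHNP what) (VB eat))) (PUNCT .)))
• Caveat: rules are applied cyclically, so the pattern must stop matching once the operation has run (here, !<- /PUNCT/ excludes an SQ whose last daughter is already a PUNCT)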

Position for ‘insert’ and ‘move’
• Syntax:
    insert <name> <position>
    insert <tree> <position>
    <position> := <relation> <name>
• <relation> says where the inserted or moved node ends up relative to the named node:
    $+     the left sister of the named node
    $-     the right sister of the named node
    >i     the i-th daughter of the named node
    >-i    the i-th daughter, counting from the right, of the named node
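
The four relations can be read as index arithmetic on the daughters list. The sketch below is one possible reading in plain Java; Node, find, insert and addFinalPunct are hypothetical names, and the loop in addFinalPunct only imitates repeated rule application, which is why it carries the same guard as the slide's pattern.

```java
import java.util.ArrayList;
import java.util.List;

class InsertSketch {
    /** Minimal labeled tree node with a parent link (hypothetical; not the JavaNLP Tree class). */
    static final class Node {
        final String label;
        Node parent;
        final List<Node> children = new ArrayList<>();

        Node(String label, Node... kids) {
            this.label = label;
            for (Node k : kids) add(children.size(), k);
        }

        void add(int index, Node k) { k.parent = this; children.add(index, k); }

        @Override public String toString() {
            if (children.isEmpty()) return label;
            StringBuilder sb = new StringBuilder("(" + label);
            for (Node c : children) sb.append(' ').append(c);
            return sb.append(')').toString();
        }
    }

    /** First node with the given label, in pre-order, or null. */
    static Node find(Node n, String label) {
        if (n.label.equals(label)) return n;
        for (Node c : n.children) {
            Node hit = find(c, label);
            if (hit != null) return hit;
        }
        return null;
    }

    /** Places newNode so that it stands in the given relation to target. */
    static void insert(Node newNode, String relation, Node target) {
        if (relation.equals("$+")) {                 // newNode becomes the left sister
            Node p = target.parent;
            p.add(p.children.indexOf(target), newNode);
        } else if (relation.equals("$-")) {          // newNode becomes the right sister
            Node p = target.parent;
            p.add(p.children.indexOf(target) + 1, newNode);
        } else if (relation.startsWith(">")) {       // newNode becomes a daughter
            int i = Integer.parseInt(relation.substring(1));
            if (i == 0) throw new IllegalArgumentException("daughter index must not be 0");
            // >i counts from the left (1-based); >-i counts from the right
            int index = i > 0 ? i - 1 : target.children.size() + i + 1;
            target.add(index, newNode);
        } else {
            throw new IllegalArgumentException("unsupported relation: " + relation);
        }
    }

    /** Repeats "insert (PUNCT .) >-1 sq" while the guard "!<- /PUNCT/" holds. */
    static void addFinalPunct(Node sq) {
        while (sq.children.isEmpty()
                || !sq.children.get(sq.children.size() - 1).label.contains("PUNCT")) {
            insert(new Node("PUNCT", new Node(".")), ">-1", sq);
        }
    }

    static Node example() {
        return new Node("ROOT", new Node("SQ",
            new Node("NP", new Node("NNS", new Node("Cats"))),
            new Node("VP", new Node("VBP", new Node("do")),
                new Node("VP", new Node("WHNP", new Node("what")),
                    new Node("VB", new Node("eat"))))));
    }

    public static void main(String[] args) {
        Node tree = example();
        addFinalPunct(find(tree, "SQ"));
        System.out.println(tree);
    }
}
```

Without the guard, the loop in addFinalPunct would keep appending PUNCT nodes forever, which is the situation the caveat on cyclic application warns about.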

Move
Before:
    (ROOT (SQ (NP (NNS Cats)) (VP (VBP do) (VP (WHNP what) (VB eat))) (PUNCT .)))
Pattern and operation:
    VP < (/^WH/=wh $++ /^VB/=vb)
    move vb $+ wh        (move <name> <position>)
After:
    (ROOT (SQ (NP (NNS Cats)) (VP (VBP do) (VP (VB eat) (WHNP what))) (PUNCT .)))
• Syntax: move <name> <position>
• Moves the named node into the specified position
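
Move can be thought of as detaching a node and re-inserting it at a position. A minimal sketch, assuming only the two sister relations ($+ and $-) are needed; Node, find and move are hypothetical names rather than Tsurgeon's own code.

```java
import java.util.ArrayList;
import java.util.List;

class MoveSketch {
    /** Minimal labeled tree node with a parent link (hypothetical; not the JavaNLP Tree class). */
    static final class Node {
        final String label;
        Node parent;
        final List<Node> children = new ArrayList<>();

        Node(String label, Node... kids) {
            this.label = label;
            for (Node k : kids) add(children.size(), k);
        }

        void add(int index, Node k) { k.parent = this; children.add(index, k); }

        @Override public String toString() {
            if (children.isEmpty()) return label;
            StringBuilder sb = new StringBuilder("(" + label);
            for (Node c : children) sb.append(' ').append(c);
            return sb.append(')').toString();
        }
    }

    /** First node with the given label, in pre-order, or null. */
    static Node find(Node n, String label) {
        if (n.label.equals(label)) return n;
        for (Node c : n.children) {
            Node hit = find(c, label);
            if (hit != null) return hit;
        }
        return null;
    }

    /** Detaches n and re-attaches it as the left ($+) or right ($-) sister of target. */
    static void move(Node n, String relation, Node target) {
        if (!relation.equals("$+") && !relation.equals("$-")) {
            throw new IllegalArgumentException("unsupported relation: " + relation);
        }
        n.parent.children.remove(n);
        n.parent = null;
        Node p = target.parent;
        int at = p.children.indexOf(target);   // computed after detaching n
        p.add(relation.equals("$+") ? at : at + 1, n);
    }

    static Node example() {
        return new Node("ROOT", new Node("SQ",
            new Node("NP", new Node("NNS", new Node("Cats"))),
            new Node("VP", new Node("VBP", new Node("do")),
                new Node("VP", new Node("WHNP", new Node("what")),
                    new Node("VB", new Node("eat")))),
            new Node("PUNCT", new Node("."))));
    }

    public static void main(String[] args) {
        Node tree = example();
        move(find(tree, "VB"), "$+", find(tree, "WHNP"));   // mirrors "move vb $+ wh"
        System.out.println(tree);
    }
}
```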

Adjoin
Before:
    (ROOT (SQ (NP (NNS Cats)) (VP (VBP do) (VP (VB eat) (WHNP what))) (PUNCT .)))
Pattern and operation:
    VP=vp > SQ !> (__ << usually)
    adjoin (VP (ADVP (ADV usually)) VP@) vp
After:
    (ROOT (SQ (NP (NNS Cats)) (VP (ADVP (ADV usually)) (VP (VBP do) (VP (VB eat) (WHNP what)))) (PUNCT .)))

Adjoin syntax
• Syntax: adjoin <auxiliary_tree> <name>
• Adjoins the specified auxiliary tree into the named node
• The daughters of the target node become the daughters of the foot of the auxiliary tree
    adjoin (VP (ADVP (ADV usually)) VP@) vp        (VP@ is the foot)
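
Adjunction as described above amounts to two steps: hand the target's daughters to the foot, then put the auxiliary tree where the target was. The sketch below follows that description in plain Java; Node, find and adjoin are hypothetical names, and the foot is passed explicitly instead of being marked with @.

```java
import java.util.ArrayList;
import java.util.List;

class AdjoinSketch {
    /** Minimal labeled tree node with a parent link (hypothetical; not the JavaNLP Tree class). */
    static final class Node {
        final String label;
        Node parent;
        final List<Node> children = new ArrayList<>();

        Node(String label, Node... kids) {
            this.label = label;
            for (Node k : kids) add(children.size(), k);
        }

        void add(int index, Node k) { k.parent = this; children.add(index, k); }

        @Override public String toString() {
            if (children.isEmpty()) return label;
            StringBuilder sb = new StringBuilder("(" + label);
            for (Node c : children) sb.append(' ').append(c);
            return sb.append(')').toString();
        }
    }

    /** First node with the given label, in pre-order, or null. */
    static Node find(Node n, String label) {
        if (n.label.equals(label)) return n;
        for (Node c : n.children) {
            Node hit = find(c, label);
            if (hit != null) return hit;
        }
        return null;
    }

    /** Adjoins the auxiliary tree (auxRoot, with the given foot) into target. */
    static void adjoin(Node auxRoot, Node foot, Node target) {
        // 1. The daughters of the target become the daughters of the foot.
        List<Node> kids = new ArrayList<>(target.children);
        target.children.clear();
        for (Node k : kids) foot.add(foot.children.size(), k);
        // 2. The auxiliary tree takes the target's place under the target's parent.
        Node p = target.parent;
        int i = p.children.indexOf(target);
        p.children.remove(i);
        target.parent = null;
        p.add(i, auxRoot);
    }

    static Node example() {
        return new Node("ROOT", new Node("SQ",
            new Node("NP", new Node("NNS", new Node("Cats"))),
            new Node("VP", new Node("VBP", new Node("do")),
                new Node("VP", new Node("VB", new Node("eat")),
                    new Node("WHNP", new Node("what")))),
            new Node("PUNCT", new Node("."))));
    }

    public static void main(String[] args) {
        Node tree = example();
        Node foot = new Node("VP");                       // plays the role of VP@
        Node aux = new Node("VP",
            new Node("ADVP", new Node("ADV", new Node("usually"))), foot);
        adjoin(aux, foot, find(tree, "VP"));              // the VP directly under SQ
        System.out.println(tree);
    }
}
```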

On the command line
    java Tsurgeon -treeFile <aFile> [<operationFile>]*
• aFile: a file containing the trees to be transformed
• operationFile: a pattern (Tregex expression), then an empty line, then the operation(s), one per line
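
How to use the Tsurgeon class
    import java.util.*;
    import edu.stanford.nlp.trees.Tree;
    import edu.stanford.nlp.trees.tregex.TregexPattern;
    import edu.stanford.nlp.trees.tregex.tsurgeon.*;

    // lTrees: a Collection<Tree> holding the trees to transform, built elsewhere
    TregexPattern matchPattern = TregexPattern.compile("SQ=sq < (/^WH/ $++ VP)");
    List<TsurgeonPattern> ps = new ArrayList<TsurgeonPattern>();
    TsurgeonPattern p = Tsurgeon.parseOperation("relabel sq S");
    ps.add(p);
    Collection<Tree> result = Tsurgeon.processPatternOnTrees(matchPattern, Tsurgeon.collectOperations(ps), lTrees);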

To become a specialist
• See Roger’s README!
• Practice tree surgery!