Welcome back to Advanced Computational Linguistics! In today's session, we'll discuss Homework Task 3 focusing on treebanks, exploring the 180,487 verb phrases in the Wall Street Journal corpus. We will analyze verb frames present in the EVCA project and compare them with the PTB corpus. Additionally, we'll review key readings over Spring Break, including Chomsky’s "Derivation by Phase." Don't forget to set up your software environment with Tcl/Tk and SWI-Prolog for class exercises, and be prepared to dive into practical parsing tasks.
LING 581: Advanced Computational Linguistics Lecture Notes March 23rd
Administrivia • Welcome back! • Task 3 is due
Homework Task Treebank • There are 180487 VPs in the Wall Street Journal section • Q: what kinds of verb frames are attested? EVCA Project • Pick verbs that exist in EVCA (evca93.index) and also in the PTB • Produce a report that compares EVCA with what is present in the corpus
Minimalist Syntax From stochastic parsing to the (near) latest in syntax • Paper: • Derivation by Phase (DbP) (Chomsky) • (manuscript 1999, published 2001) • 4 files on USB drive • Reading Homework over Spring Break • dpb.pdf (the unlocked published version) • you may find the following notes very useful • JU_DbP_1.pdf (DbP with inline notes from Juan Uriagereka) • JU_DbP_2.pdf (part 2) • Yoon_DbP.pdf (notes from James Yoon, UIUC)
Software • Code • GUI: (Tcl/Tk side) treeserver.tcl • GUI: (Prolog side) prolog_client.pl • Grammar: (GUI version) grammar8.pl • What you need • Tcl/Tk wish interpreter • Prolog SWI Prolog • Command line (some) shell
Processes • Process #1 run treeserver.tcl under wish interpreter • example: bash-3.2$ wish treeserver.tcl Connection from 127.0.0.1 channel closed • Process #2 run SWI-Prolog • example: bash-3.2$ swipl Welcome to SWI-Prolog (Multi-threaded, 64 bits, Version 5.8.3) ?- [prolog_client, grammar8]. % library(error) compiled into error 0.00 sec, 17,480 bytes % library(lists) compiled into lists 0.01 sec, 39,256 bytes % library(shlib) compiled into shlib 0.01 sec, 59,552 bytes % library(unix) compiled into unix 0.01 sec, 65,968 bytes % prolog_client compiled 0.01 sec, 134,368 bytes % grammar8 compiled 0.00 sec, 46,816 bytes true. chatter: uses sockets to communicate with Prolog side
Example 1 • John likes Mary • Run in SWI Prolog window ?- parse([john,like,mary]). Probe [v*] agrees with goal [nmary] Probe [t] agrees with goal [n john] [c[c][t[njohn][t[t][v[njohn][v[v*][V[V like][nmary]]]]]]] true ; false. Tree display zoom slider Click to see derivation steps Agree(P,G) computations displayed terminal parse output: same as last step of GUI output
Current Limitations • John likes Mary • Run in SWI Prolog window ?- parse([john,like,mary]). Probe [v*] agrees with goal [nmary] Probe [t] agrees with goal [n john] [c[c][t[njohn][t[t][v[njohn][v[v*][V[V like][nmary]]]]]]] true ; false. Implementation limitations: • No Spell-Out implemented (just narrow syntax) T + v* f(phi,3-sg-f)+ V(like) = likes • No head movement/affix-hopping implemented (part of Spell-Out) input: awarded -> [ed,award] • No undo of displacement implemented (displacement is computed from base position) input:[be,ed,award,several,prizes] Several prizes are awarded Note: not inflected Spell-Out: implementation of simple inflectional morphology would be straightforward Parsing from final landing sites would require an additional computational mechanism
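The remark that simple inflectional morphology would be straightforward can be illustrated with a minimal sketch. The predicate name inflect/3 is hypothetical and is not taken from grammar8.pl. The sketch assumes present tense only and assumes that a valued φ-feature of the form Person-Number-Gender (as in f(phi,3-sg-f)) is what selects the suffix.

```prolog
% inflect(+Stem, +Phi, -Form)
% Hypothetical helper, not part of grammar8.pl.
% Assumption: present tense only; Phi is a valued Person-Number-Gender term.
% Third person singular adds the suffix -s; everything else is left bare.
inflect(Stem, 3-sg-_, Form) :-
    !,
    atom_concat(Stem, s, Form).
inflect(Stem, _, Stem).

% Example queries:
% ?- inflect(like, 3-sg-m, Form).   % Form = likes
% ?- inflect(like, 1-pl-m, Form).   % Form = like
```

Irregular forms (be, have) would need a lexical lookup in front of these two clauses, which is left out here.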
Interacting with the Grammar • parse(List). • Prolog List = input sentence • Opens a socket connection to display server • Calls the definite clause grammar (DCG) rule for a clause: c(Parse,List,[]). • Closes socket connection • resetDisplay. • Opens a socket connection to display server • Clears the display history for a derivation • Closes socket connection • Components of parse(List) can be called separately • open_connection. • close_connection. • X(Parse,List,[]). • X = grammar rule name • useful when other DCG rules need to be called directly • Implementation (prolog_client.pl) parse(List) :- open_connection, c(Parse,List,[]), format('~p~n',[Parse]), close_connection. use after each derivation
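Calling a DCG rule directly as X(Parse,List,[]) works because SWI-Prolog translates each DCG rule into an ordinary predicate with two extra list arguments. The toy grammar below is a hypothetical stand-in (toy_vp, toy_np, toy_verb and toy_noun are not names from grammar8.pl) and only shows the calling convention, without the display server.

```prolog
% Toy grammar for illustration only; not grammar8.pl.
toy_vp(v(Verb, NP)) --> [Verb], { toy_verb(Verb) }, toy_np(NP).
toy_np(n(Noun))     --> [Noun], { toy_noun(Noun) }.

toy_verb(like).
toy_noun(john).
toy_noun(mary).

% Direct call, with the input list and the empty remainder as extra arguments:
% ?- toy_vp(Parse, [like, mary], []).
%
% Equivalent call through the standard phrase/2 interface:
% ?- phrase(toy_vp(Parse), [like, mary]).
%
% Both bind Parse to v(like, n(mary)).
```

Using phrase/2 is the more portable option, since the exact argument positions added by DCG translation are an implementation detail.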
Example 1 • There are likely to be awarded several prizes • (note input lexeme order) ?- parse([be,likely,there,be,ed,award,several,prizes]). Probe [a!caseed] agrees with goal [n!case several prizes] Probe [tdef] agrees with goal [n!case several prizes] Probe [t!phi] agrees with goal [n!phi there] Probe [t] agrees with goal [n several prizes] [c[c][t[nthere][t[t][v[vbe][a[alikely][t[nthere][t[tdef][v[vbe][a[aed][V[Vaward][n several prizes]]]]]]]]]]] true ; false.
Example 2 • Several prizes are likely to be awarded • (note input lexeme order) ?- resetDisplay. true. ?- parse([be,likely,be,ed,award,several,prizes]). Probe [a!caseed] agrees with goal [n!case several prizes] Probe [tdef] agrees with goal [n!case several prizes] Probe [t] agrees with goal [n several prizes] [c[c][t[n several prizes][t[t][v[vbe][a[alikely][t[n several prizes][t[tdef][v[vbe][a[aed][V[Vaward][n several prizes]]]]]]]]]]] true ; false.
Example 3 • we expect there to be awarded several prizes (note input lexeme order) ?- parse([we,expect,there,be,ed,award,several,prizes]). Probe [a!caseed] agrees with goal [n!case several prizes] Probe [tdef] agrees with goal [n!case several prizes] Probe [v*!phi] agrees with goal [n!phi there] Probe [v*] agrees with goal [n several prizes] Probe [t] agrees with goal [n we] [c[c][t[nwe][t[t][v[nwe][v[v*][V[V expect][t[nthere][t[tdef][v[vbe][a[aed][V[Vaward][n several prizes]]]]]]]]]]]] true ; false.
Example 4 • we expect several prizes to be awarded (note input lexeme order) ?- parse([we,expect,be,ed,award,several,prizes]). Probe [a!caseed] agrees with goal [n!case several prizes] Probe [tdef] agrees with goal [n!case several prizes] Probe [v*] agrees with goal [n several prizes] Probe [t] agrees with goal [n we] [c[c][t[nwe][t[t][v[nwe][v[v*][V[V expect][t[n several prizes][t[tdef][v[vbe][a[aed][V[Vaward][n several prizes]]]]]]]]]]]] true ; false.
Basic Computation • start with lexical array of syntactic objects: {α,..,ω} • Merge “an indispensable operation of a recursive system” • (external) • two syntactic objects (SOs): α, β • create merged SO: {α, β} • label({α, β})=label(α) or label(β) • (internal), implements Displacement • SOs: α and γ, γ properly contained in α • create SO: {α, γ} • label({α, γ})=label(α) • Agree • active probe SO: α (active = uninterpretable features), goal SO: β • match and delete uninterpretable features of probe and goal • Convergent derivation: uninterpretable features must be eliminated copy v*: φ, acc N: φ, Case
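The two flavors of Merge listed above can be sketched in a few clauses. This is an illustration under stated assumptions, not the representation used in grammar8.pl: syntactic objects are written as so(Label, Daughters), sets are approximated by two-element lists, and all predicate names are hypothetical.

```prolog
:- use_module(library(lists)).

% Assumed representation: so(Label, Daughters); a lexical item has no daughters.
label(so(Label, _), Label).

% external_merge(+A, +B, +Projects, -SO)
% Projects is a or b and says whose label the new object takes.
external_merge(A, B, a, so(L, [A, B])) :- label(A, L).
external_merge(A, B, b, so(L, [A, B])) :- label(B, L).

% contains(+SO, +Sub): Sub is properly contained in SO.
contains(so(_, Daughters), Sub) :-
    member(D, Daughters),
    (   Sub = D
    ;   contains(D, Sub)
    ).

% internal_merge(+A, +G, -SO)
% G must be properly contained in A; the label of A projects.
% A second copy of G is placed at the edge (displacement).
internal_merge(A, G, so(L, [G, A])) :-
    once(contains(A, G)),
    label(A, L).

% Example queries:
% ?- external_merge(so('V',[]), so(n,[]), a, SO).
% ?- N = so(n,[]), VP = so('V',[so('V',[]), N]), internal_merge(VP, N, SO).
```

Internal merge fails when the second argument is not already inside the first, which is what separates it from external merge in this sketch.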
Basic Implementation (1) Definite clause grammar (DCG) (simplified) V([V V N]) --> V(V), n(N). V([V Verb]) --> [Verb]. features (V) features (N) • phonetic matrix: f(pmatrix,like) f(pmatrix,john) • (takes an) argument: f(arg,+) f(arg,+) • uninterpretable Case:f(case,_) variable % (big V) verb classes bV(n('V',[],[V,N])) --> bV0(V), n0(N), {theta(V),theta(N)} report 'theta merge V & N'. bV0(n('V',[f(pmatrix,Verb),f(arg,+)],[])) --> [Verb], {transitive(Verb)}. bV0(n('V',[f(pmatrix,Verb),f(arg,+)],[])) --> [Verb], {unaccusative(Verb)}. bV0(n('V',[f(pmatrix,Verb)],[])) --> [Verb], {unergative(Verb)}. transitive(like). transitive(expect). unergative(run). unaccusative(arrive). n0(n(n,[f(pmatrix,BNoun)|Fs],[])) --> [BNoun], {bareNoun(BNoun,Fs)}. bareNoun(john,[f(phi,3-sg-m),f(case,_),f(arg,+)]).
Basic Implementation (2) Definite clause grammar (DCG) (simplified) v([v N v]) --> n(N), v(v). v([vv V]) --> v(v), V(V). v([v*]) --> []. • features (v*) • uninterpretableφ-features: f(phi,_-_-_) • can value accusative Case: f(case,acc) f(phi,_-_-_) f(case,acc) f(phi,3-sg-f) f(case,_) triggers Agree % v*/vP v(n(v,[],[N,V])) --> n0(N), {theta(N)}, v1(V) report 'theta merge N & v'. v1(n(v,[],[V,BV])) --> v0(V), bV(BV), {goals(BV,Goals),agree(V,Goals)} report 'merge v & V'. v0(n('v*',[f(phi,_-_-_),f(case,acc)],[])) --> [].
Basic Implementation (2) • Agree(v*,N) • Operation: unification (Robinson, 1965) (match and instantiate unvalued features) • Probe (v*): f(phi,_-_-_) f(case,acc) • Goal (N): f(phi,3-sg-f) f(case,_) • Uninterpretable/unvalued features (represented by variables) are eliminated
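Agree as unification can be tried out in isolation with a small sketch. The predicate agree_features/2 is hypothetical: the agree/2 called in the grammar rules takes a node and a list of goals, and its definition is not shown in these notes. Here probe and goal are assumed to be plain lists of f(Name,Value) terms.

```prolog
:- use_module(library(lists)).

% agree_features(+ProbeFeatures, +GoalFeatures)
% Hypothetical helper, not the agree/2 of grammar8.pl.
% Each probe feature is unified with the goal feature of the same name.
% Unvalued features are Prolog variables, so unification values them.
% A clash between two valued features makes the whole call fail.
agree_features([], _).
agree_features([f(Name, ProbeValue)|Rest], GoalFeatures) :-
    (   memberchk(f(Name, GoalValue), GoalFeatures)
    ->  ProbeValue = GoalValue
    ;   true            % goal lacks this feature: nothing to match
    ),
    agree_features(Rest, GoalFeatures).

% Example query (v* probing an object such as Mary):
% ?- Probe = [f(phi, Phi), f(case, acc)],
%    Goal  = [f(phi, 3-sg-f), f(case, Case)],
%    agree_features(Probe, Goal).
% This binds Phi to 3-sg-f and Case to acc.
```

A single unification step values the probe's φ-features and the goal's Case at the same time, so no separate deletion step is needed in this sketch.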
Basic Implementation (3) Definite clause grammar (DCG) T([T N v]) --> n(N), v(v). T([T N v]) --> v(v), {N a goal} T([T]) --> []. • features (φ-complete T) • uninterpretableφ-features: f(phi,_-_-_) • can value nominative Case: f(case,nom) • EPP defective T φ-incomplete e.g. infinitivals ╳ triggers Agree t(n(t,[],[N,T])) --> n0(N), {nonarg(N)}, t1(T,_) report 'merge expl & T'. % EPP: (1) merge t(n(t,[],[Goal,T])) --> t1(T,Goal) report 'move to spec-T'. % EPP: (2) move with maximize matching t1(n(t,[],[T,V]),G) --> t0(T), v(V), {goals(V,Goals),agree(T,Goals), Goals = [G|_]} report 'merge T & v'. t0(n(t,[f(phi,_-_-_),f(case,nom)],[])) --> [].
Putting it all together (5) (EPP requirement)
Putting it all together (6) Operation: Spell-Out (not currently implemented) “John likes Mary” only one copy of John is pronounced bundle T + v* f(phi,3-sg-f)+ V (like) = likes
A Worked Example Consider the derivation of • several prizes are likely to be awarded (= 4(b)(ii)) awarded = award + -ed(adjectival participle) • -ed • ϕ-incomplete: uninterpretable Number and Gender only • uninterpretable Case • morphologically unrealized in English (cf. Icelandic) Agree(a,N) -ed: φ, Case N: φ, Case ?- parse([be,likely,be,ed,award,several,prizes]). Probe [a!caseed] agrees with goal [n!case several prizes] Probe [tdef] agrees with goal [n!case several prizes] Probe [t] agrees with goal [n several prizes] [c[c][t[n several prizes][t[t][v[vbe][a[alikely][t[n several prizes][t[tdef][v[vbe][a[aed][V[Vaward][n several prizes]]]]]]]]]]]
A Worked Example Notation: !case means feature is unvalued Agree(a,N) -ed: φ, Case N: φ, Case
A Worked Example Agree(Tdef,N) Tdef: φ N: φ, Case
A Worked Example (several prizes raises to subject position of embedded infinitival)
A Worked Example Matrix T Agree(T,N) T: φ, Nominative N: φ, Case Case for –ed also valued because of earlier unification step Agree(a,N) -ed: φ, Case N: φ, Case Unification presents a possible advantage: computation is more local
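The locality point can be made concrete with two unification steps. The predicate shared_case_demo/2 is hypothetical and reduces each item to its Case value; it assumes, as in the notes, that an unvalued feature is a Prolog variable.

```prolog
% shared_case_demo(-EdCase, -NCase)
% Hypothetical illustration, not code from grammar8.pl.
% Step 1 models Agree(a,N): both Case features are unvalued, so
% unification simply makes them the same variable.
% Step 2 models Agree(T,N): matrix T values the Case of N as nom.
% Because of step 1, the Case of -ed is valued by the same binding.
shared_case_demo(EdCase, NCase) :-
    EdCase = NCase,     % Agree(a, N)
    NCase = nom.        % Agree(T, N)

% Example query:
% ?- shared_case_demo(EdCase, NCase).
% Both EdCase and NCase come out as nom.
```

Matrix T never has to probe -ed itself in this sketch; the earlier aliasing carries the value down.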
Probing with multiple goals compare with… (Chomsky 2001)
A Worked Example Spell-Out Several prizes are likely to be awarded
Another Example ?- parse([be,likely,there,be,ed,award,several,prizes]). Probe [a!caseed] agrees with goal [n!case several prizes] Probe [tdef] agrees with goal [n!case several prizes] Probe [t!phi] agrees with goal [n!phi there] Probe [t] agrees with goal [n several prizes] [c[c][t[nthere][t[t][v[vbe][a[alikely][t[nthere][t[tdef][v[vbe][a[aed][V[Vaward][n several prizes]]]]]]]]]]] Consider also the derivation of • There are likely to be awarded several prizes (= 4(b)(i)) pleonastic there: φ-incomplete (Person only) Agree(a,N) -ed: φ, Case N: φ, Case Agree(T,N) T: φ N: φ
Another Example Agree(a,N) -ed: φ, Case N: φ, Case
Examples • Examples (ECM): • (i) we expect there to be awarded several prizes • (ii) we expect several prizes to be awarded (Chomsky 2001)
Example: we expect several prizes to be awarded Prior Unification: locality advantage Case
Example: we expect several prizes to be awarded Spell-Out We expect several prizes to be awarded