This document covers Definite Clause Grammars (DCGs), with a focus on using difference lists for efficient list manipulation in parsing. It reviews how difference lists reduce append from time linear in the length of the first list to constant time, and how DCGs can be applied to natural language processing, parser construction, and expression evaluation. Through several examples, it demonstrates syntax rules, queries, and the representation of sentences as difference lists in Prolog, including the automatic translation from DCGs to Prolog clauses.
Definite Clause Grammars t.k.prasad@wright.edu http://www.knoesis.org/tkprasad/ L10DCG

Review: Difference Lists
• Represent list L as a difference of two lists L1 and L2
• E.g., consider L = [a,b,c] and various L1-L2 combinations given below.
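
The L1-L2 combinations for L = [a,b,c] can be illustrated with a small sketch. The helper name dl_denotes/2 is hypothetical (it is not part of the slides or of any library); it recovers the ordinary list that a closed difference list stands for. The most general form, [a,b,c|T]-T with T unbound, is the one used for constant-time append, but this helper is only meant to be called with a fully instantiated back list.

```prolog
:- use_module(library(lists)).   % append/3

% dl_denotes(+Front-Back, -List): hypothetical helper.
% Front-Back denotes List when Front is List followed by Back.
% Assumes Back is a proper (closed) list.
dl_denotes(Front-Back, List) :-
    append(List, Back, Front).
```

Each of the queries `?- dl_denotes([a,b,c]-[], L).`, `?- dl_denotes([a,b,c,d]-[d], L).` and `?- dl_denotes([a,b,c,d,e]-[d,e], L).` binds L to [a,b,c].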

Review: Append using Difference Lists
    append(X-Y, Y-Z, X-Z).
• Ordinary append complexity = O(length of first list)
• Difference list append complexity = O(1)
[Figure: the list X-Z shown as the segment X-Y followed by the segment Y-Z]
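
A minimal sketch of the constant-time append follows. The predicate is named append_dl/3 here (an assumed name) to avoid clashing with the library predicate append/3. The whole operation is a single unification: the open tail of the first list is bound to the front of the second.

```prolog
% append_dl(?A, ?B, ?C): difference-list append in one unification step.
% Assumption: the name append_dl/3 is chosen for this example only.
append_dl(X-Y, Y-Z, X-Z).

% Usage:
% ?- append_dl([a,b|T1]-T1, [c,d|T2]-T2, L-[]).
% L = [a,b,c,d]   (T1 and T2 are bound as a side effect)
```

Closing the result with `L-[]` binds the final open tail to the empty list, which turns the difference list back into an ordinary list.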

DCGs
• Mechanize attribute grammar formalism (a generalization of CFG)
• Executable specification
• Use difference lists for efficiency
• Translation from DCGs to Prolog clauses is automatic

Sample Applications of DCGs
• Coding recursive descent backtracking parser
• Encoding and checking context-sensitive constraints
• Simple NLP
• In general, enabling syntax directed translation
  • E.g., VHDL Parser-Pretty Printer

DCG Example: Syntax
    sentence --> noun_phrase, verb_phrase.
    noun_phrase --> determiner, noun.
    verb_phrase --> verb, noun_phrase.
    determiner --> [a].
    determiner --> [the].
    determiner --> [many].
    noun --> [president].
    noun --> [cat].
    noun --> [cats].
    verb --> [has].
    verb --> [have].

DCG to Ordinary Prolog Syntax
    sentence(S,R) :- noun_phrase(S,T), verb_phrase(T,R).
    noun_phrase(S,T) :- determiner(S,N), noun(N,T).
    verb_phrase(T,R) :- verb(T,N), noun_phrase(N,R).
    determiner([a|R],R).
    determiner([the|R],R).
    determiner([many|R],R).
    noun([president|R],R).
    noun([cat|R],R).
    noun([cats|R],R).
    verb([has|R],R).
    verb([have|R],R).

Queries
    ?- sentence([the, president, has, a, cat], []).
    ?- sentence([the, cats, have, a, president], []).
    ?- sentence([a, cats, has, the, cat, president], [president]).
    ?- sentence([a, cats, has, the, cat, President], [President]).
• Each non-terminal takes two lists as arguments.
• In difference list representation, they together stand for a single list.
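
The two-list convention can also be hidden behind phrase/2 and phrase/3, which most Prolog systems with DCG support provide. The sketch below repeats the grammar so that it loads on its own; the helper rest_after_sentence/2 is a hypothetical name introduced only for this example.

```prolog
sentence --> noun_phrase, verb_phrase.
noun_phrase --> determiner, noun.
verb_phrase --> verb, noun_phrase.
determiner --> [a].
determiner --> [the].
determiner --> [many].
noun --> [president].
noun --> [cat].
noun --> [cats].
verb --> [has].
verb --> [have].

% rest_after_sentence(+Words, -Rest): hypothetical helper.
% Rest is what remains of Words after a sentence has been consumed,
% i.e. Words-Rest is the difference list covering the sentence.
rest_after_sentence(Words, Rest) :-
    phrase(sentence, Words, Rest).
```

`?- phrase(sentence, [the, president, has, a, cat]).` succeeds, and `?- rest_after_sentence([a, cats, has, the, cat, president], R).` binds R to [president]. Note that this grammar has no number agreement yet, so "a cats" is accepted.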

DCG Example: Number Agreement
    sentence --> noun_phrase(N), verb_phrase(N).
    noun_phrase(N) --> determiner(N), noun(N).
    verb_phrase(N) --> verb(N), noun_phrase(_).
    determiner(sgular) --> [a].
    determiner(_) --> [the].
    determiner(plural) --> [many].
    noun(sgular) --> [president].
    noun(sgular) --> [cat].
    noun(plural) --> [cats].
    verb(sgular) --> [has].
    verb(plural) --> [have].
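
To see the agreement constraint at work, the grammar can be loaded and queried through phrase/2. This is a sketch that simply repeats the rules above; the only addition is the set of queries in the comments.

```prolog
sentence --> noun_phrase(N), verb_phrase(N).
noun_phrase(N) --> determiner(N), noun(N).
verb_phrase(N) --> verb(N), noun_phrase(_).
determiner(sgular) --> [a].
determiner(_) --> [the].
determiner(plural) --> [many].
noun(sgular) --> [president].
noun(sgular) --> [cat].
noun(plural) --> [cats].
verb(sgular) --> [has].
verb(plural) --> [have].

% ?- phrase(sentence, [the, cats, have, a, president]).   % succeeds
% ?- phrase(sentence, [the, cats, has, a, president]).    % fails: plural subject, singular verb
% ?- phrase(sentence, [a, cats, has, the, cat]).          % fails: singular determiner, plural noun
% ?- phrase(sentence, [the, president, has, many, cat]).  % fails inside the object noun phrase
```

The shared variable N enforces agreement between determiner, noun, and verb of the subject. The object noun phrase is called with an anonymous variable, so it only has to agree internally.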

Extension: AST plus Number agreement
    sentence(s(NP,VP)) --> noun_phrase(N, NP), verb_phrase(N, VP).
    noun_phrase(N, np(D,NT)) --> determiner(N, D), noun(N, NT).
    verb_phrase(N, vp(V,NP)) --> verb(N, V), noun_phrase(_, NP).
    determiner(sgular, dt(a)) --> [a].
    determiner(_, dt(the)) --> [the].
    determiner(plural, dt(many)) --> [many].
    noun(sgular, n(president)) --> [president].
    noun(sgular, n(cat)) --> [cat].
    noun(plural, n(cats)) --> [cats].
    verb(sgular, v(has)) --> [has].
    verb(plural, v(have)) --> [have].

Queries
    ?- sentence(T,[the, president, has, a, cat], []).
    T = s(np(dt(the), n(president)), vp(v(has), np(dt(a), n(cat)))) ;
    ?- sentence(T,[the, cats, have, a, president|X], X).
    ?- sentence(T,[a, cats, has, the, cat, preside], [preside]).
• Each non-terminal takes two lists as arguments for input sentences, and additional arguments for the static semantics (e.g., number, AST, etc).
• Number disagreement causes the last query to fail.
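
The extra arguments survive translation unchanged, and the two list arguments are appended at the end. The following is a hand-written sketch of what the translated clauses could look like; the variable names S0, S1, S are assumptions, and the clauses a particular Prolog system generates may differ in detail (for example in how terminals are matched).

```prolog
% Hand translation of the AST-plus-agreement grammar into plain clauses.
% The last two arguments of every predicate form the difference list.
sentence(s(NP,VP), S0, S) :-
    noun_phrase(N, NP, S0, S1),
    verb_phrase(N, VP, S1, S).
noun_phrase(N, np(D,NT), S0, S) :-
    determiner(N, D, S0, S1),
    noun(N, NT, S1, S).
verb_phrase(N, vp(V,NP), S0, S) :-
    verb(N, V, S0, S1),
    noun_phrase(_, NP, S1, S).

determiner(sgular, dt(a), [a|S], S).
determiner(_, dt(the), [the|S], S).
determiner(plural, dt(many), [many|S], S).
noun(sgular, n(president), [president|S], S).
noun(sgular, n(cat), [cat|S], S).
noun(plural, n(cats), [cats|S], S).
verb(sgular, v(has), [has|S], S).
verb(plural, v(have), [have|S], S).
```

With these clauses loaded, the first query on the slide yields the same tree, T = s(np(dt(the), n(president)), vp(v(has), np(dt(a), n(cat)))).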

Prefix Expression DCG
    expr --> [if], expr, [then], expr, [else], expr.
    expr --> ['+'], expr, expr.
    expr --> ['*'], expr, expr.
    expr --> [m].
    expr --> [n].
    expr --> [a].
    expr --> [b].

Queries
    ?- expr(['*', m, n], []).
    ?- expr([m, '*', n], []).
    ?- expr(['*', m, '+', 'a', n, n], [n]).
    ?- expr([if, a, then, m, else, n], []).
    ?- expr([if, a, then, a, else, '*', m, n], []).

Prefix Expression DCG: Type Checking Version
    tExpr(T) --> [if], tExpr(bool), [then], tExpr(T), [else], tExpr(T).
    tExpr(T) --> ['+'], tExpr(T), tExpr(T).
    tExpr(T) --> ['*'], tExpr(T), tExpr(T).
    tExpr(int) --> [m].
    tExpr(int) --> [n].
    tExpr(bool) --> [a].
    tExpr(bool) --> [b].
• Assume that + and * are overloaded for int and bool.

Queries
    ?- tExpr(T,['*', m, n], []).
    ?- tExpr(T,[m, '*', n], []).
    ?- tExpr(T,['*', m, '+', 'a', n, n], [n]).
    ?- tExpr(T,[if, a, then, m, else, n], []).
    T = int ;
    ?- tExpr(T,[if, a, then, b, else, '*', m, n], []).
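
The type argument acts as a synthesized attribute: unification both checks that operand types agree and infers the type of the whole expression. A runnable sketch of the type checking grammar with a few queries is given below; the grammar is repeated from the slide so that the block loads on its own.

```prolog
tExpr(T) --> [if], tExpr(bool), [then], tExpr(T), [else], tExpr(T).
tExpr(T) --> ['+'], tExpr(T), tExpr(T).
tExpr(T) --> ['*'], tExpr(T), tExpr(T).
tExpr(int) --> [m].
tExpr(int) --> [n].
tExpr(bool) --> [a].
tExpr(bool) --> [b].

% ?- phrase(tExpr(T), ['*', m, n]).                  % T = int
% ?- phrase(tExpr(T), ['+', a, b]).                  % T = bool
% ?- phrase(tExpr(T), ['+', m, a]).                  % fails: int and bool operands
% ?- phrase(tExpr(T), [if, a, then, b, else, '*', m, n]).
%                                                    % fails: branches have different types
```

The failing queries are syntactically valid prefix expressions. They are rejected only because the shared variable T cannot be both int and bool.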

Prefix Expression DCG: Type Checking and Evaluation Version
    evalExpr(V) --> etExpr(V,_).
    etExpr(V,T) --> [if], etExpr(B,bool), [then], etExpr(V1,T), [else], etExpr(V2,T),
                    {B==true -> V = V1 ; V = V2}.
    etExpr(V,bool) --> ['+'], etExpr(V1,bool), etExpr(V2,bool), {or(V1,V2,V)}.
    etExpr(V,int) --> ['+'], etExpr(V1,int), etExpr(V2,int), {V is V1 + V2}.

(cont'd)
    etExpr(V,bool) --> ['*'], etExpr(V1,bool), etExpr(V2,bool), {and(V1,V2,V)}.
    etExpr(V,int) --> ['*'], etExpr(V1,int), etExpr(V2,int), {V is V1 * V2}.
    etExpr(V,int) --> [m], {value(m,V)}.
    etExpr(V,int) --> [n], {value(n,V)}.
    etExpr(V,bool) --> [a], {value(a,V)}.
    etExpr(V,bool) --> [b], {value(b,V)}.

(cont'd)
    value(m,10).
    value(n,5).
    value(a,true).
    value(b,false).
    and(true,true,true).
    and(true,false,false).
    and(false,true,false).
    and(false,false,false).
    or(true,true,true).
    or(true,false,true).
    or(false,true,true).
    or(false,false,false).
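
Put together, the evaluator can be loaded as one file. The sketch below assumes that the integer multiplication rule has result type int, by analogy with the integer '+' rule and with the AST version of the grammar.

```prolog
evalExpr(V) --> etExpr(V, _).

etExpr(V, T) --> [if], etExpr(B, bool), [then], etExpr(V1, T),
                 [else], etExpr(V2, T),
                 { B == true -> V = V1 ; V = V2 }.
etExpr(V, bool) --> ['+'], etExpr(V1, bool), etExpr(V2, bool), { or(V1, V2, V) }.
etExpr(V, int)  --> ['+'], etExpr(V1, int), etExpr(V2, int), { V is V1 + V2 }.
etExpr(V, bool) --> ['*'], etExpr(V1, bool), etExpr(V2, bool), { and(V1, V2, V) }.
% Assumption: int * int yields int.
etExpr(V, int)  --> ['*'], etExpr(V1, int), etExpr(V2, int), { V is V1 * V2 }.
etExpr(V, int)  --> [m], { value(m, V) }.
etExpr(V, int)  --> [n], { value(n, V) }.
etExpr(V, bool) --> [a], { value(a, V) }.
etExpr(V, bool) --> [b], { value(b, V) }.

% Fixed environment for the four identifiers.
value(m, 10).
value(n, 5).
value(a, true).
value(b, false).

and(true, true, true).
and(true, false, false).
and(false, true, false).
and(false, false, false).

or(true, true, true).
or(true, false, true).
or(false, true, true).
or(false, false, false).

% ?- phrase(evalExpr(V), [if, a, then, '*', m, n, else, n]).
% V = 50
```

The goals in braces are ordinary Prolog calls that run during parsing and do not consume input. Note that both branches of the conditional are parsed and evaluated before one value is selected.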

Prefix Expression DCG: AST Version
    treeExpr(V) --> trExpr(V,_).
    trExpr(cond(B,V1,V2),T) --> [if], trExpr(B,bool), [then], trExpr(V1,T), [else], trExpr(V2,T).
    trExpr(or(V1,V2),bool) --> ['+'], trExpr(V1,bool), trExpr(V2,bool).
    trExpr(plus(V1,V2),int) --> ['+'], trExpr(V1,int), trExpr(V2,int).

(cont'd)
    trExpr(and(V1,V2),bool) --> ['*'], trExpr(V1,bool), trExpr(V2,bool).
    trExpr(mul(V1,V2),int) --> ['*'], trExpr(V1,int), trExpr(V2,int).
    trExpr(m,int) --> [m].
    trExpr(n,int) --> [n].
    trExpr(a,bool) --> [a].
    trExpr(b,bool) --> [b].
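
As a quick check of the AST version, the grammar can be loaded and asked for the tree of a typed expression. The sketch repeats the rules so that it is self-contained; only the queries in the comments are new.

```prolog
treeExpr(V) --> trExpr(V, _).

trExpr(cond(B,V1,V2), T) --> [if], trExpr(B, bool), [then], trExpr(V1, T),
                             [else], trExpr(V2, T).
trExpr(or(V1,V2), bool)  --> ['+'], trExpr(V1, bool), trExpr(V2, bool).
trExpr(plus(V1,V2), int) --> ['+'], trExpr(V1, int), trExpr(V2, int).
trExpr(and(V1,V2), bool) --> ['*'], trExpr(V1, bool), trExpr(V2, bool).
trExpr(mul(V1,V2), int)  --> ['*'], trExpr(V1, int), trExpr(V2, int).
trExpr(m, int)  --> [m].
trExpr(n, int)  --> [n].
trExpr(a, bool) --> [a].
trExpr(b, bool) --> [b].

% ?- phrase(treeExpr(T), ['+', m, '*', n, n]).
% T = plus(m, mul(n, n))
% ?- phrase(treeExpr(T), [if, a, then, m, else, n]).
% T = cond(a, m, n)
```

The overloaded tokens '+' and '*' are resolved to distinct tree constructors (or/plus, and/mul) by the operand types.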

Other Compiler Operations
• From parse tree and type information, one can:
  • compute (stack) storage requirements for variables and for expression evaluation
  • generate assembly code (with coercion instructions if necessary)
  • transform/simplify expression
• Ref: http://www.cs.wright.edu/~tkprasad/papers/Attribute-Grammars.pdf

Variation on Expression Grammars
Unsuitable Grammar
    E -> E + E | E * E | x | y
Inefficient Backtracking Parser Exists
    E -> T + E | T
    T -> F * T | F
    F -> (E) | x | y
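
The second grammar can be written directly as a DCG, as sketched below with the assumed non-terminal names e, t and f and with tokens given as quoted atoms. The first grammar is not usable this way: it is ambiguous, and its left-recursive rules would make a top-down DCG parser loop.

```prolog
% E -> T + E | T
e --> t, ['+'], e.
e --> t.
% T -> F * T | F
t --> f, ['*'], t.
t --> f.
% F -> (E) | x | y
f --> ['('], e, [')'].
f --> [x].
f --> [y].

% ?- phrase(e, [x, '+', y, '*', x]).                  % succeeds
% ?- phrase(e, ['(', x, '+', y, ')', '*', y]).        % succeeds
% ?- phrase(e, [x, '+']).                             % fails
```

The inefficiency comes from the shared prefixes: when `e --> t, ['+'], e` fails to find a '+', the parser backtracks and parses the same t again in `e --> t`, and likewise for f inside t.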

Attribute Grammars
• Formalism for specifying semantics based on context-free grammars (BNF)
• Static semantics (context-sensitive aspects)
• Type checking and type inference
• Compatibility between procedure definition and call
• Dynamic semantics
• Associate attributes with terminals and non-terminals
• Associate attribute computation rules with productions
L167AG

Attributes A(X)
• Synthesized S(X)
• Inherited I(X)
• Attribute computation rules (Semantic functions)
    X0 -> X1 X2 … Xn
    S(X0) = f( I(X0), A(X1), A(X2), …, A(Xn) )
    I(Xj) = Gj( I(X0), A(X1), A(X2), …, A(Xj-1) )   for all j in 1..n
    P( A(X0), A(X1), A(X2), …, A(Xn) )

Information Flow
[Figure: information flow in a parse tree, with labels: inherited, computed, available, synthesized]

• Synthesized Attributes
  Pass information up the parse tree
• Inherited Attributes
  Pass information down the parse tree or from left siblings to the right siblings
• Attribute values assumed to be available from the context.
• Attribute values computed using the semantic rules provided.
The constraints on the attribute evaluation rules permit top-down left-to-right (one-pass) traversal of the parse tree to compute the meaning.

An Extended Example
• Distinct identifiers in a straight-line program.
BNF
    <exp> ::= <var> | <exp> + <exp>
    <stm> ::= <var> := <exp> | <stm> ; <stm>
Attributes
    <var>   id
    <exp>   ids
    <stm>   ids, num
• Semantics specified in terms of sets (of identifiers).

<exp> ::= <var>
    <exp>.ids = { <var>.id }
<exp> ::= <exp1> + <exp2>
    <exp>.ids = <exp1>.ids U <exp2>.ids
<stm> ::= <var> := <exp>
    <stm>.ids = { <var>.id } U <exp>.ids
    <stm>.num = | <stm>.ids |
<stm> ::= <stm1> ; <stm2>
    <stm>.ids = <stm1>.ids U <stm2>.ids
    <stm>.num = | <stm>.ids |
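
The set-based rules can be sketched in Prolog over abstract syntax terms rather than over token lists. Everything in this block is an assumption made for illustration: the term shapes var/1, +/2, assign/2 and seq/2, the predicate names, and the choice to represent sets as sorted duplicate-free lists built with append/3 and sort/2.

```prolog
:- use_module(library(lists)).   % append/3

% exp_ids(+Exp, -Ids): Ids is the set of identifiers occurring in Exp.
exp_ids(var(X), [X]).
exp_ids(E1 + E2, Ids) :-
    exp_ids(E1, Ids1),
    exp_ids(E2, Ids2),
    set_union(Ids1, Ids2, Ids).

% stm_ids(+Stm, -Ids): Ids is the set of identifiers occurring in Stm.
stm_ids(assign(X, E), Ids) :-
    exp_ids(E, EIds),
    set_union([X], EIds, Ids).
stm_ids(seq(S1, S2), Ids) :-
    stm_ids(S1, Ids1),
    stm_ids(S2, Ids2),
    set_union(Ids1, Ids2, Ids).

% stm_num(+Stm, -Num): Num is the number of distinct identifiers in Stm.
stm_num(Stm, Num) :-
    stm_ids(Stm, Ids),
    length(Ids, Num).

% set_union(+A, +B, -U): union as a sorted list; sort/2 drops duplicates.
set_union(A, B, U) :-
    append(A, B, AB),
    sort(AB, U).

% ?- stm_num(seq(assign(x, var(y) + var(x)), assign(z, var(y))), N).
% N = 3
```

Both attributes are synthesized here: each clause computes the value for a node only from the values of its children, mirroring the rules on the slide.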

Alternative Semantics using lists
• Attributes
    envi : list of vars in preceding context
    envo : list of vars for following context
    dnum : number of new variables
<exp> ::= <var>
    <exp>.envo = if member(<var>.id, <exp>.envi)
                 then <exp>.envi
                 else cons(<var>.id, <exp>.envi)

Attribute Computation Rules
<exp> ::= <exp1> + <exp2>
[Figure: parse tree for <exp1> + <exp2>, each node annotated with envi, envo, dnum]
    <exp1>.envi = <exp>.envi
    <exp2>.envi = <exp1>.envo
    <exp>.envo = <exp2>.envo
    <exp>.dnum = length(<exp>.envo)
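
The envi/envo rules thread an environment through the tree from left to right, in the same way a DCG threads its two list arguments. A minimal sketch over abstract syntax terms follows; the term shapes var/1 and +/2 and the predicate names exp_env/3 and exp_dnum/3 are assumptions made for this example, and dnum is computed exactly as the rule above states it.

```prolog
:- use_module(library(lists)).   % memberchk/2

% exp_env(+Exp, +Envi, -Envo): Envi is the inherited attribute,
% Envo the synthesized one.
exp_env(var(X), Envi, Envo) :-
    (   memberchk(X, Envi)
    ->  Envo = Envi            % already known: environment unchanged
    ;   Envo = [X|Envi]        % new variable: cons it onto the front
    ).
exp_env(E1 + E2, Envi, Envo) :-
    exp_env(E1, Envi, Env1),   % <exp1>.envi = <exp>.envi
    exp_env(E2, Env1, Envo).   % <exp2>.envi = <exp1>.envo, <exp>.envo = <exp2>.envo

% exp_dnum(+Exp, +Envi, -Dnum): follows <exp>.dnum = length(<exp>.envo).
exp_dnum(Exp, Envi, Dnum) :-
    exp_env(Exp, Envi, Envo),
    length(Envo, Dnum).

% ?- exp_env(var(x) + var(y) + var(x), [], Env).
% Env = [y, x]
```

Because each envi depends only on the parent's envi and on attributes of left siblings, one top-down left-to-right pass is enough to compute every attribute.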