This text provides an in-depth exploration of context-free languages (CFLs) and grammars (CFGs). We begin by discussing finite automata, which accept regular languages but fall short of representing certain non-regular languages such as palindromes and specific patterns. We then define context-free grammars and establish their superiority in terms of expressiveness compared to finite automata. Examples illustrate CFGs, including various derivations and the construction of parse trees. Furthermore, we discuss grammar ambiguity and Chomsky Normal Form, emphasizing the necessity for structured rules in language generation.
Introduction
• Finite automata accept all regular languages and only regular languages
• Even very simple languages are non-regular (Σ = {a, b}):
  - {a^n b^n : n = 0, 1, 2, …}
  - {w : w is a palindrome}
• We are going to define a new class of languages, called context-free languages, that contains all regular languages and many more (including the two above)
Context-Free Grammar (preliminaries)
• A context-free grammar is a kind of program: a finite set of rewriting rules that generates strings
• Languages that are generated by context-free grammars are called context-free languages
• Context-free grammars are more expressive than finite automata: if a language L is accepted by a finite automaton, then L can be generated by a context-free grammar, while the converse fails (for example, {a^n b^n : n = 0, 1, 2, …} is context-free but not regular)
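The claim that every language accepted by a finite automaton has a context-free grammar can be illustrated with the standard construction: each transition from state q to state p on symbol a becomes a rule q → a p, and each accepting state q gets a rule q → ε. The sketch below assumes a deterministic automaton given as a Python dictionary; the function name `dfa_to_grammar` and the example automaton (even number of a's) are hypothetical and not taken from the slides.

```python
def dfa_to_grammar(transitions, accepting):
    """Build grammar rules from a DFA.

    transitions: dict mapping (state, symbol) -> state
    accepting:   iterable of accepting states
    Returns a dict mapping each variable (one per state) to a list of
    rule bodies; a body is a list of symbols, and [] stands for epsilon.
    The start variable of the grammar is the start state of the DFA.
    """
    rules = {}
    for (state, symbol), target in transitions.items():
        # transition state --symbol--> target becomes: state -> symbol target
        rules.setdefault(state, []).append([symbol, target])
        rules.setdefault(target, [])
    for state in accepting:
        # an accepting state may stop generating: state -> epsilon
        rules.setdefault(state, []).append([])
    return rules


# Hypothetical example DFA over {a, b}: accepts words with an even number of a's.
# States: E (even so far, start and accepting), O (odd so far).
EVEN_A = {
    ("E", "a"): "O",
    ("E", "b"): "E",
    ("O", "a"): "E",
    ("O", "b"): "O",
}

grammar = dfa_to_grammar(EVEN_A, accepting=["E"])
for variable, bodies in grammar.items():
    for body in bodies:
        print(variable, "->", " ".join(body) if body else "ε")
```

Every rule produced this way has at most one variable, and it sits at the right end of the body, which is why such grammars cannot count matching pairs the way S → aSb does.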
My First Context-Free Grammar
S → bA
A → aA
A → b
• Σ = {a, b}
• Elements of Σ are called terminals
• S and A are called variables
Context-Free Grammar (CFG)
• Definition. A context-free grammar (CFG) is a 4-tuple (V, Σ, R, S), where:
  - Σ is an alphabet (its characters are called terminals)
  - V is a finite set (its elements are called variables, or nonterminals)
  - R is a finite subset of V × (Σ ∪ V)*
  - S, the start variable, is one of the variables in V
  - V ∩ Σ = ∅
• If (A, α) ∈ R, we write A → α
• A → α is called a rule
Derivations
• Definition. u yields v in one step, written u ⇒ v, if for some x, z, α in (V ∪ Σ)* and some A in V the following three conditions hold:
  - u = xAz
  - v = xαz
  - A → α is in R
• Definition. u derives v, written u ⇒* v, if u = v or there is a chain of one-step yields of the form:
  - u ⇒ u1 ⇒ u2 ⇒ … ⇒ v
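The one-step yield relation can be sketched in a few lines of Python. The sketch assumes that every symbol is a single character and that rules are stored in a dictionary from a variable to its right-hand sides; the names `one_step` and `FIRST_GRAMMAR` are illustrative only. The rules used are those of the first grammar (S → bA, A → aA, A → b).

```python
def one_step(u, rules):
    """Return the set of all strings v such that u => v in one step.

    rules maps each variable (a single character) to a list of
    right-hand sides; the empty string stands for epsilon.
    """
    results = set()
    for i, symbol in enumerate(u):
        # symbols without an entry in rules are terminals and are skipped
        for rhs in rules.get(symbol, []):
            # u = x A z  becomes  v = x alpha z
            results.add(u[:i] + rhs + u[i + 1:])
    return results


# Rules of the first grammar: S -> bA, A -> aA, A -> b
FIRST_GRAMMAR = {"S": ["bA"], "A": ["aA", "b"]}

print(one_step("S", FIRST_GRAMMAR))   # {'bA'}
print(sorted(one_step("bA", FIRST_GRAMMAR)))  # ['baA', 'bb']
```

Applying `one_step` repeatedly, starting from the start variable, follows exactly the chains u ⇒ u1 ⇒ u2 ⇒ … ⇒ v of the second definition.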
Example (2)
• Σ = {a, b}
• V = {S}
• R = { S → aSb, S → ε }
• This grammar generates {a^n b^n : n = 0, 1, 2, …}
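As a worked illustration of the rules S → aSb and S → ε (added here, not part of the original slides), the word aabb is obtained by applying the first rule twice and then the second rule once:

```latex
S \Rightarrow aSb \Rightarrow aaSbb \Rightarrow aa\varepsilon bb = aabb
```

Each application of S → aSb adds one a on the left and one b on the right, which is why the numbers of a's and b's always match.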
Context-Free Languages
• Definition. Given a context-free grammar G = (V, Σ, R, S), the language generated by (or derived from) G is the set:
  L(G) = {w ∈ Σ* : S ⇒* w}
• Definition. A language L is context-free if there is a context-free grammar G = (V, Σ, R, S) such that L = L(G)
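To make the set L(G) concrete, the following sketch enumerates the words of L(G) up to a given length by exploring leftmost derivations breadth-first. It is a bounded search under stated assumptions, not a general decision procedure: symbols are single characters, and sentential forms longer than `max_len + extra` are discarded, which is enough for the grammar S → aSb, S → ε used here but may miss words for other grammars. The function name and the `extra` parameter are hypothetical.

```python
from collections import deque


def language_up_to(rules, start, max_len, extra=2):
    """Collect the words of L(G) with at most max_len symbols.

    rules maps each variable to a list of right-hand sides ("" is epsilon).
    Only leftmost derivations are explored; forms with more than max_len
    terminals, or longer than max_len + extra, are pruned (an assumption
    that keeps the search finite).
    """
    seen = {start}
    queue = deque([start])
    words = set()
    while queue:
        form = queue.popleft()
        # position of the leftmost variable, if any
        idx = next((i for i, s in enumerate(form) if s in rules), None)
        if idx is None:
            words.add(form)  # only terminals left: form is a word of L(G)
            continue
        for rhs in rules[form[idx]]:
            new = form[:idx] + rhs + form[idx + 1:]
            terminals = sum(1 for s in new if s not in rules)
            if terminals > max_len or len(new) > max_len + extra:
                continue
            if new not in seen:
                seen.add(new)
                queue.append(new)
    return words


# Grammar with rules S -> aSb and S -> epsilon
ANBN = {"S": ["aSb", ""]}
print(sorted(language_up_to(ANBN, "S", 6), key=len))
# ['', 'ab', 'aabb', 'aaabbb']
```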
Example (3)
• Σ = {a, b}
• V = {S}
• R = { S → aS, S → Sb, S → ε }
• This grammar generates {a^m b^n : m, n = 0, 1, 2, …}
Example (4)
• Σ = {a, b}
• V = {S}
• R = { S → aSa, S → bSb, S → ε }
• This grammar generates the palindromes of even length over Σ
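The rules S → aSa, S → bSb, S → ε can be read directly as a recursive membership test: a word is derivable exactly when it is empty, or when its first and last symbols agree and the middle part is derivable. The sketch below is an illustration of that reading; the function name `derives` is hypothetical. Since these three rules only produce even-length words, odd-length palindromes such as aba are rejected; covering them would need the additional rules S → a and S → b.

```python
def derives(w):
    """Return True if S =>* w using S -> aSa | bSb | epsilon."""
    if w == "":
        return True  # S -> epsilon
    if len(w) >= 2 and w[0] == w[-1] and w[0] in "ab":
        # S -> aSa or S -> bSb: strip the matching outer symbols
        return derives(w[1:-1])
    return False


for word in ["", "abba", "baab", "aba", "abab"]:
    print(repr(word), derives(word))
```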
Parse Tree
[Figure: example parse tree with node labels S, a, S, S, a, b, S, ε]
• A parse tree of a derivation u ⇒ u1 ⇒ u2 ⇒ … ⇒ v is a tree in which:
  - Each internal node is labeled with a variable
  - If a rule A → A1A2…An occurs in the derivation, then A is the parent of nodes labeled A1, A2, …, An
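A parse tree can be represented with a small node type, and the derived word read off from its leaves, left to right. The sketch below builds by hand the tree for aabb under the rules S → aSb and S → ε. The class name `Node`, the function `frontier`, and the convention of labeling the child of an ε-rule with "ε" are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str                                   # a variable, a terminal, or "ε"
    children: list = field(default_factory=list)


def frontier(node):
    """Concatenate the leaf labels from left to right (ε contributes nothing)."""
    if not node.children:
        return "" if node.label == "ε" else node.label
    return "".join(frontier(child) for child in node.children)


# Parse tree for aabb: S -> aSb applied twice, then S -> ε
tree = Node("S", [
    Node("a"),
    Node("S", [
        Node("a"),
        Node("S", [Node("ε")]),
        Node("b"),
    ]),
    Node("b"),
])

print(frontier(tree))  # aabb
```

Every internal node here is labeled S, and the children of each node spell out the right-hand side of the rule applied at that node.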
Leftmost, Rightmost Derivations
• Definition. A leftmost derivation of a sentential form is one in which, at every step, a rule is applied to the leftmost nonterminal
• Definition. A rightmost derivation of a sentential form is one in which, at every step, a rule is applied to the rightmost nonterminal
Ambiguous Grammar
• Definition. A grammar G is ambiguous if there is a word w ∈ L(G) having at least two different leftmost derivations
S → A
S → B
S → AB
A → aA
B → bB
A → ε
B → ε
• Notice that the word a has at least two leftmost derivations: S ⇒ A ⇒ aA ⇒ a and S ⇒ AB ⇒ aAB ⇒ aB ⇒ a
• Some ambiguous grammars G can be disambiguated: find an unambiguous grammar G’ such that L(G) = L(G’)
• Some context-free languages cannot be disambiguated: every grammar that generates them is ambiguous
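Ambiguity of the grammar S → A, S → B, S → AB, A → aA, A → ε, B → bB, B → ε can be checked mechanically for short words by counting leftmost derivations. The sketch below assumes single-character symbols and relies on pruning that terminates for this grammar; it may recurse forever on grammars with rules that loop without producing terminals. The function name `count_leftmost` is hypothetical.

```python
def count_leftmost(form, target, rules):
    """Count the leftmost derivations form =>* target.

    rules maps each variable to a list of right-hand sides ("" is epsilon).
    """
    # position of the leftmost variable, if any
    idx = next((i for i, s in enumerate(form) if s in rules), None)
    if idx is None:
        return 1 if form == target else 0
    # the terminals to the left of the leftmost variable can no longer change
    if not target.startswith(form[:idx]):
        return 0
    # too many terminals already: this branch cannot reach target
    if sum(1 for s in form if s not in rules) > len(target):
        return 0
    return sum(
        count_leftmost(form[:idx] + rhs + form[idx + 1:], target, rules)
        for rhs in rules[form[idx]]
    )


AMBIGUOUS = {
    "S": ["A", "B", "AB"],
    "A": ["aA", ""],
    "B": ["bB", ""],
}

print(count_leftmost("S", "a", AMBIGUOUS))   # 2
print(count_leftmost("S", "ab", AMBIGUOUS))  # 1
```

A count of 2 for the word a confirms the two leftmost derivations, one starting with S → A and one starting with S → AB.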
Chomsky Normal Form
• Definition: A grammar is in Chomsky Normal Form if every rule is of the form:
  - A → BC (A, B, C variables; B and C are not the start variable)
  - A → a (a is a terminal)
  - S → ε (S is the start variable)
• Theorem: Any CFG G can be converted into a grammar G’ in Chomsky Normal Form such that L(G) = L(G’)
• Conversion (here <…> stands for an arbitrary string of variables and terminals, and <Ci|ci> for either a variable Ci or a terminal ci):
  - Add a new rule S0 → S (S0 is the new start variable)
  - Remove rules of the form A → ε (A not the start variable), and for every rule B → <…>A<…> add a new rule B → <…><…>
  - Remove rules of the form A → B, and for every rule B → <…> add a new rule A → <…>
  - Remove rules A → <C1|c1> … <Cn|cn> with n > 2 and add rules: A → <C1|c1> A1, A1 → <C2|c2> A2, …, An-2 → <Cn-1|cn-1> <Cn|cn>
  - Replace any rule A → cAi with A → UAi, U → c (U is a new variable)
See example 2.10
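The step that breaks up rules with more than two symbols on the right-hand side can be sketched as follows. A rule A → X1 X2 … Xn with n > 2 becomes a chain of n − 1 rules with two symbols each, using n − 2 new variables. The function name `binarize` and the naming scheme for the new variables (A_1, A_2, …) are hypothetical; the sketch assumes these names do not already occur in the grammar.

```python
def binarize(head, body):
    """Split head -> X1 X2 ... Xn into rules with at most two body symbols.

    body is a list of symbols (variables or terminals).
    Returns a list of (head, body) pairs; rules with n <= 2 are unchanged.
    """
    rules = []
    current = head
    # every symbol except the last two gets paired with a fresh variable
    for k, symbol in enumerate(body[:-2], start=1):
        fresh = f"{head}_{k}"          # hypothetical name for the new variable
        rules.append((current, [symbol, fresh]))
        current = fresh
    # the last rule of the chain keeps the final two symbols
    rules.append((current, body[-2:]))
    return rules


for rule_head, rule_body in binarize("A", ["B", "c", "D", "e"]):
    print(rule_head, "->", " ".join(rule_body))
# A -> B A_1
# A_1 -> c A_2
# A_2 -> D e
```

Terminals such as c and e still appear next to other symbols after this step; they are replaced by new variables (U → c) in the final step of the conversion.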