330 likes | 439 Vues
This lecture covers the fundamentals of Finite State Automata (FSA) and their equivalence to regular expressions. It includes examples such as sheeptalk and detailed step-by-step construction methods for FSAs from regular expressions. We'll explore Microsoft Word wildcards, demonstrate how to define FSAs in Prolog, and include queries for different input strings. The session emphasizes the finite nature of states in an FSA and illustrates their limitations in expressiveness. Join us to deepen your understanding of computational linguistics!
E N D
LING 388: Language and Computers Sandiway Fong 9/27 Lecture 10
Adminstrivia • Reminder • Homework 4 due Wednesday
Today’s Topic • Finite State Automata (FSA) • equivalent to the regular expressions we’ve been studying
Regular Expressions: Example .... from lecture 8 • example (sheeptalk) • baa! • baaa! • baaaa! • … • regular expression • baaa*! • baa+!
Regular Expressions: Example .... from lecture 8 • example (sheeptalk) • baa! • baaa! • baaaa! • … • regular expression • baaa*! • baa+! a s b w a x a y > ! z
Regular Expressions: Example • step-by-step • regular expression • baaa*! s > Start state: s
Regular Expressions: Example • step-by-step • regular expression • baaa*! • b • from s, • see ‘b’, • move tow s b w >
Regular Expressions: Example • step-by-step • regular expression • baaa*! • ba • From w, • see an ‘a’, • move tox s b w a x >
Regular Expressions: Example • step-by-step • regular expression • baaa*! • baa • From x, • see an ‘a’, • move toy s b w a x a y >
Regular Expressions: Example • step-by-step • regular expression • baaa*! • baaa* • baa • baaa • baaaa • baaaaa... • from y, • see an ‘a’, • move to ? s b w a x a y > a y’ but machine must have a finite number of states! a y” a ...
Regular Expressions: Example • step-by-step • regular expression • baaa*! • baaa* • baa • Baaa • baaaa • baaaaa... • from y, • see an ‘a’, • “loop” aka return to state y a s b w a x a y >
Regular Expressions: Example • step-by-step • regular expression • baaa*! • baaa*! • from y, • see an ‘!’, • move to final state z (indicated in red) a s b w a x a y > ! z Note: machine cannot finish (i.e. reach the end of the input string) in states s, x or y
Finite State Automata (FSA) • construction • the step-by-step FSA construction method we just used • works for any regular expression • conclusion • anything we can encode with a regular expression, we can build a FSA for it • an important step in showing that FSA and REs are equivalent
a b c d x y e ... ... z etc. a etc. one loop for each character y Microsoft Word Wildcards • basic wildcards • ? and * • ? any single character • e.g. p?t put, pit, pat, pet • * zero or more characters
a x y a a e i o x y u Microsoft Word Wildcards • basic wildcards • @ • one or more of the preceding character • e.g. a@ • [ ] • range of characters • e.g. [aeiou]
see anything but ‘<‘ see anything but ‘>‘ x x y y < > Microsoft Word Wildcards • basic wildcards • < > • < • beginning of a word • can think of there being a special symbol/invisible character marking the beginning of each word • > • end of a word • suppose there is an invisible character marking the end of each word
see anything but ‘>‘ x y > see anything but ‘m‘ x m y z > Microsoft Word Wildcards • basic wildcards • < > • > • end of a word • Note • the see-anything-but loop is implicit • m> • “word that ends in m” • example: • mom is...
Finite State Automata (FSA) • more formally • (Q,s,f,Σ,) • set of states (Q): {s,w,x,y,z} 5 states must be a finite set • start state (s): s • end state(s) (f): z • alphabet (Σ): {a, b, !} • transition function : signature: character × state → state • (b,s)=w • (a,w)=x • (a,x)=y • (a,y)=y • (!,y)=z a s b w a x a y > ! z
Finite State Automata (FSA) • in Prolog • define one predicate for each state • taking one argument (the input list L) • consume input character (take the head of the list) • call next state with the tail of the list • rule • fsa(L) :- s(L). i.e. call start state s
Finite State Automata (FSA) • state s: (start state) • s([b|L]) :-w(L). match input string beginning with b and call statewwith remainder of input • state w: • w([a|L]) :- x(L). • state x: • x([a|L]) :- y(L). • state y: • y([a|L]) :- y(L). • y([!|L]) :- z(L). • state z: (end state) • z([]). a s b w a x a y > ! z
Finite State Automata (FSA) • query • ?-s([b,a,a,a,!]). a s b w a x a y > [a,a,a,!] [a,a,!] [a,!] [b,a,a,a,!] ! [!] Database s([b|L]) :-w(L). w([a|L]) :- x(L). x([a|L]) :- y(L). y([a|L]) :- y(L). y([!|L]) :- z(L). z([]). z []
Finite State Automata (FSA) • In which state does query • ?-s([b,a,b,a,!]). fail? a s b w a x a y > [a,b,a,!] [b,a,b,a,!] [b,a,!] ! Database s([b|L]) :-w(L). w([a|L]) :- x(L). x([a|L]) :- y(L). y([a|L]) :- y(L). y([!|L]) :- z(L). z([]). z
FSA • Finite State Automata (FSA) have a limited amount of expressive power • Let’s look at a modification to FSA and its effect on its power
b a b b abb String Transitions • so far... • all machines have had just a single character label on the arc • so if we allow strings to label arcs • do they endow the FSA with any more power? • Answer: No • because we can always convert a machine with string-transitions into one without
Finite State Automata (FSA) • equivalent a a s b w a x a y > s baa y > ! ! z machine with 5 states z
Finite State Automata (FSA) • equivalent a a s b w a x a y > s baa y > ! ! Database s([b|L]) :-w(L). w([a|L]) :-x(L). x([a|L]) :- y(L). y([a|L]) :- y(L). y(['!'|L]) :- z(L). z([]). Database s([b,a,a|L]) :- y(L). y([a|L]) :- y(L). y(['!'|L]) :- z(L). z([]). z z
b ε x y Empty Transitions • so far... • how about allowing the empty character? • i.e. go from x to y without seeing a input character • does this endow the FSA with any more power? • Answer: No • because we can always convert a machine with empty transitions into one without
a b b Empty Transitions • example • (ab)|b a b > > ε
a b a b > ε Empty Transitions • example • (ab)|(empty string) = final state
s x a a b y a b b 1 2 3 ε NDFSA • Basic FSA • deterministic • it’s clear which state we’re always in, or • deterministic = no choice point • NDFSA • ND = non-deterministic • i.e. we could be in more than one state • non-deterministic choice point • example: • initially, either in state 1 or 2 > >
a b 1 2 3 a NDFSA • more generally • non-determinism can be had not just withε-transitionsbut with any symbol • example: • given a, we can proceed to either state 2 or 3 >
a b a b 1 2 3 2,3 2,3 2 1 3 a NDFSA • NDFSA • are they more powerful than FSA? • similar question asked earlier forε-transitions • Answer: No • We can always convert a NDFSA into a FSA • example • (set of states) > >
> > > > a a {1} {1} {1} {1} {2,3} {2,3} b {3} a b a b > 1 2 3 2,3 2,3 2 1 3 a b {2,3} {3} a > NDFSA • example • (set of states) • construct new machine with states = set of possible states of the old machine • Essential trick: • i.e. simulate the old (non-deterministic) machine with the new machine