Introduction to FSM Toolkit


Presentation Transcript


  1. Introduction to FSM Toolkit. Examples: Part I. NLP Course 07

  2. Example 1
     • Acceptor for “sheeptalk”: /baa+!/

     Text representation (sheep.txt):

         0 1 b
         1 2 a
         2 3 a
         3 3 a
         3 4 !
         4

     Symbols file (S.syms):

         eps 0
         a   1
         b   2
         !   3
         w   4
         o   5
         u   6
         f   7

     - Symbols w, o, u and f are needed for the second example.
     - The eps symbol stands for possible future epsilon transitions.

  3. Example 1
     • fsmcompile -i S.syms sheep.txt > sheep.fsa
     • fsmdraw -i S.syms sheep.fsa | dot -Tps > sheep.ps
     • Image format: PostScript. For JPEG output, write:
       fsmdraw -i S.syms sheep.fsa | dot -Tjpg > sheep.jpg
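The two input files from Example 1 can be created directly from the shell. The following is a minimal sketch, assuming the FSM toolkit and Graphviz may or may not be installed; the guard around fsmcompile and the message it prints are additions for illustration, not part of the slides.

```shell
# Arc lines: source-state destination-state symbol.
# A line holding only a state number marks that state as final.
cat > sheep.txt <<'EOF'
0 1 b
1 2 a
2 3 a
3 3 a
3 4 !
4
EOF

# Symbol table shared by the sheep and dog acceptors.
cat > S.syms <<'EOF'
eps 0
a 1
b 2
! 3
w 4
o 5
u 6
f 7
EOF

# Compile and draw only when the tools are on the PATH (assumption: they may be absent).
if command -v fsmcompile >/dev/null 2>&1; then
  fsmcompile -i S.syms sheep.txt > sheep.fsa
  if command -v dot >/dev/null 2>&1; then
    fsmdraw -i S.syms sheep.fsa | dot -Tps > sheep.ps
  fi
else
  echo "fsmcompile not found; wrote sheep.txt and S.syms only"
fi
```

The self-loop on state 3 (the line `3 3 a`) is what implements the `+` in /baa+!/.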

  4. Write an acceptor for “dogtalk”: /wouf!/

  5. Example 2
     • Acceptor for “dogtalk”: /wouf!/

     Text representation (dog.txt):

         0 1 w
         1 2 o
         2 3 u
         3 4 f
         4 5 !
         5

     Symbols file (S.syms): same as in Example 1 (sheep and dog share the same symbols file).

         eps 0
         a   1
         b   2
         !   3
         w   4
         o   5
         u   6
         f   7

  6. Example 2
     • fsmcompile -i S.syms dog.txt > dog.fsa
     • fsmdraw -i S.syms dog.fsa | dot -Tps > dog.ps

  7. Given the two acceptors (FSAs) for “sheeptalk” and “dogtalk”, use the appropriate toolkit command to generate an acceptor that accepts either a “sheeptalk” OR a “dogtalk” string.

  8. Example 3
     • fsmunion sheep.fsa dog.fsa > shORdg.fsa
     • fsmdraw -iS.syms < shORdg.fsa | dot -Tps > shORdg.ps
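The language accepted by the union can be checked without the toolkit, since both acceptors correspond to regular expressions given in the slides. The sketch below uses grep -E as a stand-in for the compiled acceptor; the helper name accepts_union is hypothetical, and the check relies on every symbol here being a single character.

```shell
# Regular-expression view of the union: a string is accepted
# if it matches the sheeptalk pattern or the dogtalk pattern.
accepts_union() {
  printf '%s\n' "$1" | grep -Eq '^(baa+!|wouf!)$'
}

for s in 'baaa!' 'wouf!' 'baa!wouf!'; do
  if accepts_union "$s"; then
    echo "$s: accepted"
  else
    echo "$s: rejected"
  fi
done
```

The third string is rejected because the union accepts one utterance from either animal, not both in sequence.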

  9. Given the two acceptors (FSAs) for “sheeptalk” and “dogtalk”, use the appropriate toolkit command to generate an acceptor that accepts a “sheeptalk” AND a “dogtalk” string, with the constraint that the sheep talks first!

  10. Example 4
      • fsmconcat sheep.fsa dog.fsa > shANDdg.fsa
      • fsmdraw -iS.syms < shANDdg.fsa | dot -Tps > shANDdg.ps

  11. But the Society of Animals is always fair! This time, let the dog speak first!
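One way to let the dog speak first is to swap the argument order of fsmconcat, since concatenation is order-sensitive. This is a sketch under assumptions: the output name dgANDsh.fsa and the helper dog_first are hypothetical, and the toolkit call is skipped when the tools or the compiled acceptors are missing.

```shell
out=dgANDsh.fsa   # hypothetical output name, mirroring shANDdg.fsa

# Same command as Example 4, with the operands swapped.
if command -v fsmconcat >/dev/null 2>&1 && [ -f dog.fsa ] && [ -f sheep.fsa ]; then
  fsmconcat dog.fsa sheep.fsa > "$out"
else
  echo "skipped fsmconcat (toolkit or input acceptors missing)"
fi

# Regular-expression view of the expected language: dogtalk, then sheeptalk.
dog_first() {
  printf '%s\n' "$1" | grep -Eq '^wouf!baa+!$'
}

dog_first 'wouf!baa!' && echo "wouf!baa!: accepted"
dog_first 'baa!wouf!' || echo "baa!wouf!: rejected"
```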

  12. Example 5 • Generate the following weighted FSM:

  13. Example 5

      Text representation (A.txt):

          0 1 red    0.3
          1 3 blue   0.7
          0 2 green  0.4
          2 3 yellow 0.8
          3 0.3
          4 0.4

      Symbols file (S2.syms):

          eps    0
          red    1
          blue   2
          green  3
          yellow 4

      As before: fsmcompile, fsmdraw.

  14. Which is the path with the lowest cost?

  15. Example 5
      • fsmbestpath A.fsa > B.fsa
      • fsmdraw -iS2.syms < B.fsa | dot -Tps > B.ps
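The answer can be checked by hand or with a few lines of awk. The sketch below is not the toolkit's algorithm; it is a simple relaxation over the arcs, assuming that costs add along a path and that the final-state cost is added at the end. The file name A.sample.txt and the function best_path are hypothetical.

```shell
cat > A.sample.txt <<'EOF'
0 1 red 0.3
1 3 blue 0.7
0 2 green 0.4
2 3 yellow 0.8
3 0.3
4 0.4
EOF

best_path() {
  awk '
    # Arc line: source, destination, label, cost. The first arc names the start state.
    NF == 4 { n++; src[n] = $1; dst[n] = $2; lab[n] = $3; w[n] = $4
              if (n == 1) start = $1
              next }
    NF == 2 { fin[$1] = $2 }   # final state with an exit cost
    NF == 1 { fin[$1] = 0 }    # final state without a cost
    END {
      dist[start] = 0; path[start] = ""
      # Relax every arc n times; enough for a graph with at most n arcs.
      for (round = 1; round <= n; round++)
        for (i = 1; i <= n; i++)
          if ((src[i] in dist) && (!(dst[i] in dist) || dist[src[i]] + w[i] < dist[dst[i]])) {
            dist[dst[i]] = dist[src[i]] + w[i]
            path[dst[i]] = path[src[i]] " " lab[i]
          }
      # Pick the cheapest reachable final state, including its exit cost.
      best = -1
      for (s in fin)
        if ((s in dist) && (best < 0 || dist[s] + fin[s] < best)) {
          best = dist[s] + fin[s]; labels = path[s]
        }
      printf "%.1f%s\n", best, labels
    }' "$1"
}

best_path A.sample.txt
```

Under these assumptions the path red, blue costs 0.3 + 0.7 + 0.3 = 1.3, while green, yellow costs 0.4 + 0.8 + 0.3 = 1.5, so the sketch prints `1.3 red blue`. State 4 has no incoming arc in the listing, so it does not affect the result.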

  16. Integrating the power of Perl with the FSM Toolkit

  17. Perl & FSM Toolkit
      • Problem definition: the input is a file containing a single sentence of lowercase words: “hi nlp world”
      • Goal: transform these words into upper case using FSMs: “HI NLP WORLD”

  18. Perl & FSM Toolkit
      • A Perl script (composition.pl) that:
        • Extracts the lowercase words from the input file
        • Generates the corresponding transducer
        • Generates a second transducer that transforms each word to its upper case form
        • Composes the two transducers
        • Projects the output of the resulting transducer
        • Extracts the output of the above transducer by reading the appropriate file, and prints the upper case sentence to the screen

  19. Script composition.pl:

          #!/usr/bin/perl
          # Read the single input sentence and split it into words.
          open (IN, $ARGV[0]) || die "error";
          $rdln = <IN>;
          @in_wrds = split(' ', $rdln);   # split(' ') also skips leading whitespace
          close(IN);

          # write the files for the transducers
          open (OUT_T11, ">T11") || die "error";
          open (OUT_T12, ">T12") || die "error";
          @low_up_words = @in_wrds;
          $c = 0;
          foreach $tmp (@in_wrds) {
              print OUT_T11 ($c,"\t",$c+1,"\t",$tmp,"\t",$tmp,"\n");
              print OUT_T12 ($c,"\t",$c+1,"\t",$tmp,"\t",uc($tmp),"\n");
              push (@low_up_words, uc($tmp));   # gather lower and upper case words
              $c++;
          }
          print OUT_T11 ($c,"\n");   # the last state is final
          print OUT_T12 ($c,"\n");
          close(OUT_T11);
          close(OUT_T12);

  20. Script composition.pl (continued):

          # write symbols file
          $i = 1;
          open (OUT_S12, ">S12") || die "error";
          foreach $tmp (@low_up_words) {
              print OUT_S12 ($tmp,"\t",$i,"\n");
              $i++;
          }
          close(OUT_S12);

          # call the FSM Library
          system ("fsmcompile -iS12 -oS12 -t < T11 > T11.fst");
          system ("fsmdraw -iS12 -oS12 < T11.fst | dot -Tps > T11.ps");
          system ("fsmcompile -iS12 -oS12 -t < T12 > T12.fst");
          system ("fsmdraw -iS12 -oS12 < T12.fst | dot -Tps > T12.ps");
          system ("fsmcompose T11.fst T12.fst > T12comp.fst");
          system ("fsmdraw -iS12 -oS12 < T12comp.fst | dot -Tps > T12comp.ps");

  21. Script composition.pl (continued):

          system ("fsmproject -2 T12comp.fst > final_out.fsa");
          system ("fsmdraw -iS12 < final_out.fsa | dot -Tps > final_out.ps");
          system ("fsmprint -iS12 < final_out.fsa > final_out");

          # Finally, read the resulting file and extract the field of interest
          open (IN2, "final_out") || die "can not open the input file...\n";
          while ($rdln2 = <IN2>) {
              @out_wrds = split(' ', $rdln2);
              # arc lines have at least three fields; the final-state line does not
              push (@up_wrds, $out_wrds[2]) if @out_wrds >= 3;
          }
          close(IN2);

          # print the upper case content of the initial input file
          print (join(" ",@up_wrds),"\n");

  22. Perl & FSM Toolkit

      First fst (T11.fst):

          0 1 hi    hi
          1 2 nlp   nlp
          2 3 world world
          3

      Second fst (T12.fst):

          0 1 hi    HI
          1 2 nlp   NLP
          2 3 world WORLD
          3

      Symbols file (S12):

          hi    1
          nlp   2
          world 3
          HI    4
          NLP   5
          WORLD 6

  23. Perl & FSM Toolkit
      • Compose T11.fst and T12.fst:

          system ("fsmcompose T11.fst T12.fst > T12comp.fst");
          system ("fsmdraw -iS12 -oS12 < T12comp.fst | dot -Tps > T12comp.ps");
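To see what composition produces for these two transducers, the arc matching can be imitated with awk on the text representations. This is a deliberately naive sketch, not the fsmcompose algorithm: it pairs every arc of the first transducer with every arc of the second whose input label equals the first arc's output label, and it ignores weights, epsilon labels, final states and reachability. The file names T11.sample and T12.sample and the function compose_arcs are hypothetical.

```shell
cat > T11.sample <<'EOF'
0 1 hi hi
1 2 nlp nlp
2 3 world world
3
EOF

cat > T12.sample <<'EOF'
0 1 hi HI
1 2 nlp NLP
2 3 world WORLD
3
EOF

compose_arcs() {
  awk '
    # First file: remember every arc (source, destination, input, output).
    NR == FNR { if (NF == 4) { n++; s[n] = $1; d[n] = $2; i[n] = $3; o[n] = $4 }
                next }
    # Second file: join on "output of first arc == input of second arc".
    # States of the result are pairs written as "p,q".
    NF == 4 { for (k = 1; k <= n; k++)
                if (o[k] == $3)
                  print s[k] "," $1, d[k] "," $2, i[k], $4 }
  ' "$1" "$2"
}

compose_arcs T11.sample T12.sample
```

Each printed line is an arc of the composed machine that reads a lowercase word and writes its upper case form, which is why projecting onto the output side yields the upper case sentence.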

  24. Perl & FSM Toolkit

      Project the output of the resulting transducer:

          system ("fsmproject -2 T12comp.fst > final_out.fsa");

      Draw final_out.fsa:

          system ("fsmdraw -iS12 < final_out.fsa | dot -Tps > final_out.ps");

      Print a textual description of the above fsa:

          system ("fsmprint -iS12 < final_out.fsa > final_out");

      Read this textual description using Perl:

          open (IN2, "final_out") || die "can not open the input file...\n";
          $rdln2 = <IN2>;
          . . .

  25. Perl & FSM Toolkit

      Textual description of final_out.fsa:

          0 1 HI
          1 2 NLP
          2 3 WORLD
          3
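The final extraction step only needs the third field of each arc line, so it can also be sketched as a shell one-liner. The sample file name final_out.sample and the function upper_sentence are hypothetical; the file content is copied from the textual description above.

```shell
cat > final_out.sample <<'EOF'
0 1 HI
1 2 NLP
2 3 WORLD
3
EOF

# Keep the label of every arc line (three or more fields) and
# join the labels with single spaces; the final-state line is skipped.
upper_sentence() {
  awk 'NF >= 3 { print $3 }' "$1" | paste -sd ' ' -
}

upper_sentence final_out.sample
```

For the sample above this prints `HI NLP WORLD`.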

  26. Simple extra exercises

  27. Extras 1 • Generate the following acceptor, determinize and minimize it
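A possible command sequence for this exercise is sketched below. The acceptor from the slide image is not reproduced here, so extra1.fsa and the other file names are placeholders, and the helper det_min is hypothetical. The sketch assumes the toolkit's fsmdeterminize and fsmminimize commands accept a file argument in the same way as fsmbestpath does in Example 5.

```shell
det_min() {   # $1: compiled input acceptor, $2: minimized output acceptor
  if command -v fsmdeterminize >/dev/null 2>&1 &&
     command -v fsmminimize >/dev/null 2>&1 && [ -f "$1" ]; then
    # Determinize first, then minimize the deterministic result.
    fsmdeterminize "$1" > "$1.det" && fsmminimize "$1.det" > "$2"
  else
    echo "skipped: toolkit or $1 not available"
  fi
}

det_min extra1.fsa extra1.min.fsa
```

The result can then be drawn with fsmdraw and dot as in the earlier examples.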

  28. Extras 2 • Generate the following transducers and find their composition