In this lecture, we discuss key concepts in Perl programming, including file I/O operations, array and hash references, and iterative processes. We recap important functions used to manipulate arrays and strings, and we delve into the details of Homework 2, which is due next Thursday. Additionally, we introduce useful Perl modules for handling Unicode and demonstrate how to set up cpanm for module installation. Students will also complete a task involving detecting repeated words in a text file.
LING/C SC/PSYC 438/538 Lecture 6 Sandiway Fong
Today’s Topic • Homework 2 out today • due next Thursday by midnight • submit one file only (preferably PDF) • Subject line: Your Name 538/438 Homework 2 • Perl Recap • file I/O, references, arrays • Perl • iteration • Perl Modules • Homework 2
Perl Recap • File I/O open($txtfile, $ARGV[0]) or die "$ARGV[0] not found!\n"; while ($line = <$txtfile>) { print "$line"; } • References: • $aref = [ ITEMS ] $aref is a reference to an array • @{$aref} the array • ${$aref}[0] first element of the array • $href = { ITEMS } $href is a reference to a hash • %{$href} the hash • ${$href}{key} the value associated with key in the hash
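The recap above can be tried out with a short script. This is a minimal sketch: the sample data and the helper name read_lines are illustrative assumptions, not from the lecture, and it uses the three-argument form of open with a lexical filehandle, which is one common way to write the same file read.

```perl
use strict;
use warnings;

# read_lines is a hypothetical helper: it reads every line of a file
# into an array and returns a reference to that array.
sub read_lines {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "$path not found!\n";
    my @lines = <$fh>;
    close($fh);
    return \@lines;
}

# Array reference: [ ITEMS ]
my $aref = [ 'apple', 'banana', 'cherry' ];
print "first: ${$aref}[0]\n";                 # same element as $aref->[0]
print "count: ", scalar(@{$aref}), "\n";      # @{$aref} is the array itself

# Hash reference: { ITEMS }
my $href = { apple => 3, banana => 5 };
print "apple: ${$href}{apple}\n";             # same value as $href->{apple}
print "keys: ", join(' ', sort keys %{$href}), "\n";

# Only touch the file system when a filename is given on the command line.
if (@ARGV) {
    my $lines = read_lines($ARGV[0]);
    print "read ", scalar(@{$lines}), " lines\n";
}
```

The arrow forms $aref->[0] and $href->{key} are shorthand for the ${$aref}[0] and ${$href}{key} forms shown on the slide.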
Perl Arrays • Functions on arrays: • From the front: shift array unshift array, list • From the back: pop array push array, list generalized function
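A small sketch of the four functions in action, using made-up data. The slide does not name the "generalized function"; this example assumes it means splice, which can remove and insert elements at any position.

```perl
use strict;
use warnings;

my @queue = qw(b c d);

unshift @queue, 'a';           # add to the front:  a b c d
push    @queue, 'e', 'f';      # add to the back:   a b c d e f

my $first = shift @queue;      # remove from the front, returns 'a'
my $last  = pop   @queue;      # remove from the back,  returns 'f'

# splice(ARRAY, OFFSET, LENGTH, LIST): remove LENGTH elements starting
# at OFFSET and put LIST in their place (assumed "generalized function").
my @removed = splice(@queue, 1, 2, 'X');

print "first=$first last=$last\n";
print "removed=@removed\n";
print "queue=@queue\n";
```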
Perl Strings and Arrays • Strings and arrays chomp vs. chop split a string into words
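The difference between chomp and chop is easiest to see side by side. The sketch below uses an illustrative sentence and splits it on whitespace, which is one simple (assumed) definition of "word".

```perl
use strict;
use warnings;

my $line = "the cat sat on the mat\n";

my $chomped = $line;
chomp($chomped);               # removes the trailing newline only, if there is one

my $chopped = $chomped;
chop($chopped);                # removes the last character, whatever it is

# split ' ' splits on runs of whitespace and skips leading whitespace
my @words = split(' ', $chomped);

print "chomped: [$chomped]\n";
print "chopped: [$chopped]\n";
print "words:   ", scalar(@words), "\n";
```

Calling chomp a second time on $chomped would change nothing, whereas each extra chop removes another character.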
Iteration • We’ve already seen for/foreach: foreach $var (@array) { .. do something with $var .. } • for-loop: @words = qw(this is a short sentence); for (my $i=0; $i<=$#words; $i++) { .. do something with $i and @words .. }
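A runnable sketch of both loop styles. The loop bodies are illustrative assumptions. Since $#words is the index of the last element, a loop that visits every element tests $i <= $#words, while a loop that looks at $words[$i+1] must stop one short.

```perl
use strict;
use warnings;

my @words = qw(this is a short sentence);

# foreach: visit each element directly
my @lengths;
foreach my $word (@words) {
    push @lengths, length($word);
}

# C-style for: visit each element through its index
my @indexed;
for (my $i = 0; $i <= $#words; $i++) {
    push @indexed, "$i:$words[$i]";
}

# Pairing neighbours reads $words[$i+1], so the test here is $i < $#words
my @pairs;
for (my $i = 0; $i < $#words; $i++) {
    push @pairs, "$words[$i] $words[$i+1]";
}

print "@lengths\n";
print "@indexed\n";
print scalar(@pairs), " adjacent pairs\n";
```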
Sorting revisited • From two lectures ago… • Sort (cmp - strings, <=> - numeric): • Perl version 5.16 onwards has fc (fold case) function • Unicode compatible • Unicode::CaseFold module has a function fc • Standard function is called lc (lowercase) • Works for ASCII • Also functions uc/ucfirst
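The comparison operators and fc can be combined as in the sketch below. It assumes Perl 5.16 or later, where fc is enabled with use feature 'fc'; on older versions the Unicode::CaseFold module provides a function of the same name. The word list is made up for illustration.

```perl
use strict;
use warnings;
use feature 'fc';    # built-in fc needs Perl 5.16 or later

my @words = qw(banana Apple cherry apple Banana);
my @nums  = (10, 9, 100, 1);

# cmp compares as strings: in ASCII order, uppercase sorts before lowercase
my @by_string = sort { $a cmp $b } @words;

# fold case before comparing; fall back to cmp so ties have a fixed order
my @by_folded = sort { fc($a) cmp fc($b) or $a cmp $b } @words;

# <=> compares as numbers
my @by_number = sort { $a <=> $b } @nums;

print "@by_string\n";
print "@by_folded\n";
print "@by_number\n";
```

Sorting @nums with cmp instead would give 1 10 100 9, because the values would be compared character by character.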
Experiment • Unicode encoding (utf-8)
Module not downloaded bash-3.2$ perl test.perl Can't locate Unicode/CaseFold.pm in @INC (@INC contains: /Library/Perl/5.12/darwin-thread-multi-2level /Library/Perl/5.12 /Network/Library/Perl/5.12/darwin-thread-multi-2level /Network/Library/Perl/5.12 /Library/Perl/Updates/5.12.4 /System/Library/Perl/5.12/darwin-thread-multi-2level /System/Library/Perl/5.12 /System/Library/Perl/Extras/5.12/darwin-thread-multi-2level /System/Library/Perl/Extras/5.12 .) at test.perl line 1. BEGIN failed--compilation aborted at test.perl line 1.
cpanm • Example (command line): • cpanm Unicode::CaseFold • installing cpanminus (cpanm) • cpanminus is a script to get, unpack, build and install modules from CPAN and does nothing else. • assume command line program curl (cURL) to get cpanm from the internet • curl is a command line tool for getting or sending files using URL syntax
cpanm bash-3.2$ which cpanm bash-3.2$ curl -L http://cpanmin.us | perl - --sudo App::cpanminus % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 303 0 303 0 0 56 0 --:--:-- 0:00:05 --:--:-- 1553 100 262k 100 262k 0 0 22916 0 0:00:11 0:00:11 --:--:-- 64374 --> Working on App::cpanminus Fetching http://www.cpan.org/authors/id/M/MI/MIYAGAWA/App-cpanminus-1.7000.tar.gz ... OK Configuring App-cpanminus-1.7000 ... OK Building and testing App-cpanminus-1.7000 ... Password: OK Successfully installed App-cpanminus-1.7000 1 distribution installed bash-3.2$ which cpanm /usr/local/bin/cpanm bash-3.2$
Cpanm Unicode::CaseFold bash-3.2$ cpanm Unicode::CaseFold ! ! Can't write to /Library/Perl/5.12 and /usr/local/bin: Installing modules to /Users/sandiway/perl5 ! To turn off this warning, you have to do one of the following: ! - run me as a root or with --sudo option (to install to /Library/Perl/5.12 and /usr/local/bin) ! - Configure local::lib your existing local::lib in this shell to set PERL_MM_OPT etc. ! - Install local::lib by running the following commands ! ! cpanm --local-lib=~/perl5 local::lib && eval $(perl -I ~/perl5/lib/perl5/ -Mlocal::lib) ! --> Working on Unicode::CaseFold Fetching http://www.cpan.org/authors/id/A/AR/ARODLAND/Unicode-CaseFold-0.03.tar.gz ... OK Configuring Unicode-CaseFold-0.03 ... OK Building and testing Unicode-CaseFold-0.03 ... OK Successfully installed Unicode-CaseFold-0.03 1 distribution installed No root access
Cpanm Unicode::CaseFold bash-3.2$ sudo cpanm Unicode::CaseFold --> Working on Unicode::CaseFold Fetching http://www.cpan.org/authors/id/A/AR/ARODLAND/Unicode-CaseFold-0.03.tar.gz ... OK Configuring Unicode-CaseFold-0.03 ... OK Building and testing Unicode-CaseFold-0.03 ... OK Successfully installed Unicode-CaseFold-0.03 1 distribution installed With root access
Appendix: command line tools • For the Mac (10.8), command line tools are not installed even if you install Xcode
Appendix: command line tools • Xcode: Preferences
Appendix: command line tools • make
Homework 2 Question 1 (10pts) • Create a file data.txt (example): • I saw the cat on on the mat. • The the the cat sat on the mat. • Write a Perl program that detects repeated words (many spell check/grammar programs can do this) • Your program should read in data.txt and print a message stating the line number, the repeated word and its position if one exists. Example output: • perl repeated.perl data.txt • Line 1: word 5, “on” repeated 2 times • Line 2: word 1, “the” repeated 3 times • Submit your program and examples of its output Note: case
Homework 2 • Question 2: (10 pts) • describe how a repeated word program could stop flagging legitimate examples of repeated words in a sentence • Examples: • I wish that that question had an answer • Because he had had too many beers already, he skipped the Friday office happy hour
Homework 2 • Microsoft Word: