
Introducing Perl



Presentation Transcript


  1. Introducing Perl • Practical Extraction and Report Language • First developed by Larry Wall, a linguist working as a systems administrator for NASA in the late 1980s, as a way to make report processing easier. • Since then, it has moved into a large number of roles: automating system administration; acting as glue between different computer systems; and, arguably, serving as the language of choice for CGI programming on the Web • Free: source code, documentation, use • Portable: more than 50 operating system platforms

  2. Introducing Perl • Perl is often called the Swiss Army chainsaw of languages: versatile, powerful and adaptable, like the Swiss Army knife • Perl is an interpreted language optimised for • scanning arbitrary text files • extracting information from those text files • printing reports based on that information • Perl is intended to be practical: easy to use, efficient and complete rather than tiny, elegant and minimal • Perl's slogan is "There is more than one way to do it" and its philosophy is to "make easy things easy while making difficult things possible"

  3. Introducing Perl A little history: • December 1987, release 1.0 • Current major release 5.0 in October 1994 • July 1998 release 5.005 • March 1999 release 5.005_03 • March 2000 release 5.6 • Latest version ? Portable to: • Unix: AIX, BSD, HP-UX, IRIX, Linux, Solaris … • MS Windows • MacOS • Others: AmigaOS, OS/2, VMS ...

  4. Introducing Perl Programs, Scripts, Compilers, and Interpreters • Perl programs are also called "scripts". There is no difference: both are just source code • The Perl "compiler" is also called the "interpreter". • The source code is compiled into bytecode, which is executed in main memory (rather high-level bytecode: "sort a list" is one operation)

  5. Introducing Perl Perl Internet References Resources • Home page of Perl: http://www.perl.com • Perl user groups: http://www.perl.org • CPAN (Comprehensive Perl Archive Network) http://cpan.org Getting Perl • Unix: www.cpan.org/ports • Windows: http://www.activestate.com/ • MacOS: http://www.macperl.com/

  6. Writing and Running Perl Programs Writing Perl scripts: • Code is plain text; any text editor will do • UNIX: Emacs • Windows: Notepad Running Perl • from the command line: perl myprog.pl • with an option: perl -w myprog.pl • in UNIX, make the first line of your program #!/usr/bin/perl and add execute permission using chmod

  7. Documentation and online help • Extensive documentation comes with the standard distribution UNIX: man perl Windows: Programs->ActivePerl->Online Documentation • Online http://www.perl.com/pub/v/documentation • FAQs http://www.perl.com/pub/v/faqs • The Perl Journal http://www.tpj.com/

  8. Hello World! In C #include <stdio.h> int main(void) { printf("Hello World\n"); return 0; } In Java public class Hello { public static void main (String [] args) { System.out.println("Hello World!"); } } In Perl print "Hello World!\n"; More than one way to do it: print ("Hello World!\n"); print "Hello World!", "\n"; print "Hello", " ", "World!", "\n";

  9. Some Simple Scripts/One-liners Example 1: Print lines containing the string 'Shazzam!' • #!/usr/bin/perl • while (<STDIN>) { • print if /Shazzam!/ • }; • # <STDIN> is a bit of Perl magic that delivers the next line of input each time round the loop Example 2: The same thing the hard way • #!/usr/bin/perl • while ($line = <STDIN>) { • print $line if $line =~ /Shazzam!/ • };

  10. Some Simple Scripts/One-liners Example 3: A script with arguments • #!/usr/bin/perl • $word = shift; • while (<>) {print if /$word/}; • If we put the script in a file called match, we can invoke the script as • match Shazzam! • match Shazzam! file1 file2 • The shift operator returns the first argument from the command line and moves the others up one position. • Called with one argument, match reads standard input and prints those lines that contain the word given as the argument. • Called with two or more arguments, the first argument is the word to be searched for, and the second and subsequent arguments are the names of files that will be searched in sequence for the target word.

  11. Some Simple Scripts/One-liners Example 4: Error Messages • #!/usr/bin/perl • die "Need word to search for\n" if @ARGV == 0; • $word = shift; • while (<>) {print if /$word/}; • @ARGV is a special array which holds the command line parameters. A program is executed as a result of a system command, which consists of the executable program file, followed by a command tail, e.g.: • C:> program param1 param2 ... paramn • Unlike argv in C, @ARGV does not include the program name, which is held in $0. So $0 = "program", $ARGV[0] = "param1", $ARGV[1] = "param2" ... $ARGV[n-1] = "paramn".
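A minimal sketch of how @ARGV and shift behave. The argument values are hypothetical and are assigned directly to @ARGV so that the example runs without a real command line; in Perl the script name lives in $0 and $ARGV[0] is the first parameter.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Simulate the command line "match Shazzam! file1 file2" (hypothetical values).
@ARGV = ('Shazzam!', 'file1', 'file2');

my $count_before = @ARGV;       # array in scalar context: number of arguments
my $first        = $ARGV[0];    # first parameter, not the program name
my $word         = shift @ARGV; # removes and returns the first argument
my $count_after  = @ARGV;       # one fewer after the shift

print "script name: $0\n";
print "word: $word\n";
print "remaining arguments: @ARGV\n";
```

After the shift, only the file names remain in @ARGV, which is exactly what the empty diamond operator <> expects to read from.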

  12. Some Simple Scripts/One-liners Example 5: Reverse order of lines in a file • #!/usr/bin/perl • open IN, $ARGV[0] or die "Cannot open $ARGV[0]\n"; • @file = <IN>; • for ($I = @file-1; $I >= 0; $I--) • { • print $file[$I]; • } • Do the same in C, Java ?

  13. Variables and Datatypes • $scalar • @array • %hash • The type of a variable is marked by the type prefix ($ @ %), which is always used. • $x = $y + 3 • Variable names can be arbitrarily long and consist of the characters a - z, A - Z, the underscore _ (a name must begin with one of these), and the digits 0 - 9. Names are case sensitive: uppercase and lowercase are different. • $No_of_Students, @StudentList, %StudentRecord_01 • Special control variables have "punctuation" in their names, e.g., the $^O variable, which holds the name of the current operating system.

  14. Variables and Data types (Scalars) • A $scalar holds a single data item. The data can be a string, a number, or a boolean, depending on context. • When scalars are understood as numbers (that is, used as numbers), they are double precision floating point numbers. • Scalars are given a value by the assignment operator = . For example • $a = "The University of Nottingham, UK"; • $b = 129867445; • $c = 3.14159; • $d = 03776; # octal • $e = 0x3fff; # hex

  15. Variables and Data types (Aggregates) • There are two aggregate datatypes, @array and %hash. Both can hold an unlimited number of scalars (as long as there is memory); there is no explicit declaration, allocation, deallocation, or any other explicit memory management • @array: • Arrays are ordered and they are indexed by a number (a scalar in numeric context) • %hash: • Hashes are unordered and they are indexed by a string (a scalar in string context)

  16. Variables and Data types (Arrays) • An @array is an aggregate for storing scalars, indexed by a number (a scalar) inside square brackets []. • The indexing is zero-based. • Negative indices can also be used; they count from the end of the array. • @a = ("one", "two", "three"); • print "$a[1] $a[0] $a[-1]\n"; • will result in • two one three

  17. Variables and Data types (Arrays) • If you enclose the array in double quotes, the scalars of the array are printed separated by spaces. • @a = ("one", "two", "three"); • print "@a\n"; will give one two three • Note that while an array has a type prefix of @, an element of the array is a scalar and therefore has a prefix of $: • @a = ("one", "two", "three"); • $a[3] = "four"; • print "@a $a[0]\n"; will give one two three four one

  18. Quoting: Basic • Inside double quotes, variables and \-constructs (like \n) are expanded; inside single quotes they are not. • In other words, single quotes mean that their contents should be taken literally, while double quotes mean that their contents should be interpreted. print "This string\nshows up on two lines."; will show (on two lines): This string shows up on two lines. print 'This string \n shows up on only one.'; will show: This string \n shows up on only one. @a = ("one", "two", "three"); print "@a\n", '@a', "\n"; will show one two three @a

  19. Quoting: Basic • Inside double quotes, the following are the most common \-constructs • \n The logical newline • \t The tabulator • \$ The dollar • \@ The at sign • \xHH Character encoded in hexadecimal • \010 Character encoded in octal • \" The double quote • \\ The backslash itself

  20. Quoting: The qw operator • There is a special quoting construct for quoting "words", or strings consisting only of alphanumeric characters. • @a = ("one", "two", "three"); • can be written as • @a = qw(one two three); • The qw stands for "quote words" • Note also that all the quoting constructs are operators.

  21. Variables and Data types (Arrays, $#) • The notation $#arrayname returns the index of the last element of the array. • @a = qw(one two three); • print "The last index of \@a is $#a.\n"; • will show • The last index of @a is 2.

  22. Variables and Data types (undef) • What is in the aggregate elements that have not been assigned a value? The undef value (the value of all uninitialized scalars, not just of uninitialized aggregate elements). • This can be tested using defined and explicitly assigned by using the undef() function, and its accidental use should always be caught by using the -w switch. • @a = qw(one two three); • if (defined $a[1]) {print "Oh, yes\n"}; • if (defined $a[9]) {print "Impossible.\n"}; • Using undef(), a scalar can be "returned" to an uninitialized state. • @a = qw(one two three); • undef $a[1]; • if (defined $a[1]) {print "Impossible.\n"}
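The defined/undef behaviour described above can be checked with a small sketch. The variable names are illustrative only; note that testing an element beyond the end of the array does not extend the array.

```perl
use strict;
use warnings;

my @a = qw(one two three);

my $has_second = defined $a[1] ? 1 : 0;   # element 1 holds "two"
my $has_tenth  = defined $a[9] ? 1 : 0;   # element 9 was never assigned

undef $a[1];                              # return element 1 to the undefined state
my $after_undef = defined $a[1] ? 1 : 0;
my $length      = @a;                     # undef empties the slot, it does not remove it

print "$has_second $has_tenth $after_undef $length\n";   # prints "1 0 0 3"
```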

  23. Important: -w • The following script will print "The tenth element is ." and you would waste your time wondering what went wrong • @a = qw(one two three); • print "The tenth element is $a[9].\n"; • But, by adding the -w switch you would have seen this: • Use of uninitialized value …. • Using -w is strongly encouraged. • The -w switch catches not only the use of uninitialized values but also other mistakes and problems, such as using a variable only once (usually indicative of a typo)

  24. Scalar Context: Strings or Numbers • Numbers in Perl can be manipulated with the usual mathematical operations: addition, multiplication, division and subtraction. (Multiplication and division are indicated in Perl with the * and / symbols, by the way.) $a = 5; $b = $a + 10; # $b is now equal to 15. $c = $b * 10; # $c is now equal to 150. $a = $a - 1; # $a is now 4. • You can also use special operators like ++, --, +=, -=, /= and *=. These manipulate a scalar's value without needing two elements in an equation. • $a = 5; • $a++; # $a is now 6; we added 1 to it. • $a += 10; # Now it's 16; we added 10. • $a /= 2; # And divided it by 2, so it's 8.

  25. Scalar Context: Strings or Numbers • Strings in Perl don't have quite as much flexibility. About the only basic operator that you can use on strings is concatenation. The concatenation operator is the period . Concatenation and addition are two different things: • $a = "8"; # Note the quotes. $a is a string. $b = $a + "1"; # "1" is a string too. $c = $a . "1"; # But $b and $c have different values! • Remember that Perl converts strings to numbers transparently whenever it's needed, so to get the value of $b, the Perl interpreter converted the two strings "8" and "1" to numbers, then added them. The value of $b is the number 9. However, $c used concatenation, so its value is the string "81". • Just remember, the plus sign adds numbers and the period puts strings together.

  26. Context: Scalar vs. List • A very pervasive concept is context. Certain constructs and functions behave differently depending on the context in which they are used. For example, the context of the left side of the assignment operator (=) forces the right side to comply: • @x = qw(abc de f); • @a = @x; • $a = @x; • print "@a: $a \n"; • will print abc de f: 3 • In scalar context the value of an array is the size of the array ($#array plus one)
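A short sketch of the same idea under use strict, with hypothetical variable names. The only thing that changes between the assignments is the left-hand side, and with it the context imposed on @x.

```perl
use strict;
use warnings;

my @x = qw(abc de f);

my @copy       = @x;    # list context: copies all elements
my $count      = @x;    # scalar context: number of elements
my ($first)    = @x;    # list assignment: takes the first element only
my $last_index = $#x;   # index of the last element, i.e. $count - 1

print "@copy: $count, first=$first, last index=$last_index\n";
# prints "abc de f: 3, first=abc, last index=2"
```

The parentheses around $first are what turn the third assignment into a list assignment; without them $first would receive the count.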

  27. Array versus List • The difference between an array and a list is that an array has a name and the @ type prefix, while a list is a parenthesis-enclosed comma-separated entity. In @x = qw(abs de f); • a list is assigned to an array Separate Name Space • The name spaces of scalars, arrays, and hashes are completely separate because the type prefix explicitly tells which one we are talking about • $x = "Tyreytio"; • @x = qw(asd df f);

  28. Hashes (I) • A hash is an unordered aggregate which holds scalars, the values of the hash, indexed by strings (scalars), the keys of the hash. The index is enclosed in curly brackets • %a = qw( Nottingham 0115 • Sheffield 0114 • Leeds 0113); • print "$a{Leeds}\n"; • will output • 0113

  29. Hashes (II) • The keys and the values of a hash can be returned by the keys and values functions. • %a = qw(Nottingham 0115 • Sheffield 0114 • Leeds 0113); • @k = keys %a; • @v = values %a; • print "@k\n@v\n"; • will possibly (the order is pseudo-random) output • Nottingham Sheffield Leeds • 0115 0114 0113

  30. Hashes (III) • The existence of a key-value pair in a hash can be verified using the exists function. • %a = qw( Nottingham 0115 • Sheffield 0114 • Leeds 0113); • $b = exists $a{Leeds} ? 1:0; • $c = exists $a{Birmingham} ? 1:0; • print "$b $c \n"; • will print 1 0 • The exists function cares only about the existence of the key: the value is irrelevant.

  31. Hashes (IV) • The key-value pairs of a hash can be returned iteratively (in a loop) by the each function • %a = qw(Nottingham 0115 • Sheffield 0114 • Leeds 0113); • while (($k, $v) = each %a) { • print "$k $v\n"; • } • will possibly print (again the order is pseudo-random) • Nottingham 0115 • Sheffield 0114 • Leeds 0113
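Because keys, values and each return entries in a pseudo-random order, a common way to get repeatable output is to sort the keys first. The sketch below assumes the same dialling-code data and uses hypothetical variable names.

```perl
use strict;
use warnings;

my %code = (Nottingham => '0115', Sheffield => '0114', Leeds => '0113');

# Sorting the keys gives a stable order regardless of the hash's internal layout.
my @lines;
for my $city (sort keys %code) {
    push @lines, "$city $code{$city}";
}

my $has_leeds      = exists $code{Leeds}      ? 1 : 0;
my $has_birmingham = exists $code{Birmingham} ? 1 : 0;

print "$_\n" for @lines;                 # Leeds, Nottingham, Sheffield
print "$has_leeds $has_birmingham\n";    # prints "1 0"
```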

  32. Hashes (V) The => operator • The => operator is a variant of the , (comma) operator which as a side effect forces its left operand to be a bare word, effectively a string constant with implicit single quotes around it. This is a convenient notation which is most often used when specifying the key-value pairs for a hash • %b = ('English', 'one'); • %b = (English => 'one'); • these are equivalent.

  33. Hashes (VI) • Hash elements or groups of hash elements (slices) can be deleted using the delete function • %a = (English => "one", French => "un", • German => "ein", Finnish => "yksi", • Japanese => "ichi", Chinese => "yi"); • delete $a{German}; • delete @a{'French', 'Finnish'}; • print values %a, "\n"; • will print the values one, ichi and yi in some order

  34. Slices of Aggregates • In addition to accessing the aggregates either as a whole or per element, it is possible to access them by groups of elements. These groups are called slices. The syntax is the @ prefix, the variable name, and the indices, or in other words, @array[numbers] or @hash{strings}. The slice is the list of scalars at the specified indices. • %a = (English => "one", French => "un", • German => "ein", Finnish => "yksi", • Japanese => "ichi", Chinese => "yi"); • @s = @a{"German", "English"}; • print "@s\n"; • should result in: ein one • The order of the returned list is well defined because the order of the indices is well defined.
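A small sketch of array and hash slices, including a slice passed to delete. The data and variable names are illustrative. Note the @ prefix on every slice: it is the prefix, not the bracket type, that says "several elements".

```perl
use strict;
use warnings;

my @n = qw(zero one two three four);
my @some = @n[1, 3];      # array slice: elements at indices 1 and 3
my @tail = @n[-2, -1];    # negative indices count from the end

my %a = (English => 'one', French => 'un', German => 'ein');
my @s = @a{'German', 'English'};   # hash slice, in the order the keys are given

delete @a{'French', 'German'};     # delete a slice: both pairs are removed
my @left = sort keys %a;

print "@some | @tail | @s | @left\n";
# prints "one three | three four | ein one | English"
```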

  35. Operators • A fairly standard set of mathematical, logical, and relational operators exists; the only somewhat exceptional one is the power operator **, for example 2**3 is 8. • Scalars can be either numbers or strings depending on the context they are used in, and this is exactly what operators do: they force a context on their operands. For example, while + forces numeric context on its operands and sums them, the . (dot) operator forces string context on its operands and concatenates them. • ($b, $c) = (2, 3); • $a = $b + $c; • $d = $b.$c; • print "$a $d \n"; • will print: 5 23

  36. Operators: Specialties • Perl has these • separate sets of comparison operators for string and numeric context • generalized comparison operators cmp and <=> • low-precedence and, or and not (&&, || and ! are high precedence) • string concatenator . (dot) and string/list repeater x • left-quoting pseudo-comma =>, range generator .. • quoting operators • file input operator < > • pattern matching, substitution, and binding operators m, s, =~, and !~

  37. Operators: Boolean • The && and || operators are short-circuiting as in C: they stop evaluating their operands as soon as the first decisive value is met (the first false for &&, the first true for ||) • There are also variants of lower precedence: and, or, xor and not
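Short-circuiting can be observed by counting how often the right-hand operand is actually evaluated. In this sketch, touched() is a hypothetical helper that increments a counter each time it runs; the last line shows the common default-value idiom that relies on || returning the deciding operand.

```perl
use strict;
use warnings;

my $calls = 0;
sub touched { $calls++; return 1 }   # hypothetical helper: records that it ran

my ($false, $true) = (0, 1);

my $r1 = $false && touched();   # left side is false: touched() is skipped
my $r2 = $true  || touched();   # left side is true: touched() is skipped
my $r3 = $true  && touched();   # left side is true: touched() must run

# || returns the deciding operand itself, which gives a default-value idiom.
my %opt  = ();
my $name = $opt{name} || 'anonymous';

print "calls=$calls r1=$r1 r2=$r2 r3=$r3 name=$name\n";
# prints "calls=1 r1=0 r2=1 r3=1 name=anonymous"
```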

  38. Operators: Precedence • Precedence rules are much like in C, C++, or Java • Some confusion may stem from the fact that when calling functions (either built-in or user defined), the parentheses are not required. In other words, these are equivalent: • $n = length ($header); • $n = length $header; • This would be easy enough to comprehend, but things get interesting when the functions have more than one argument, and especially interesting when the number of arguments varies. • When in doubt, parenthesize

  39. Operators: Assignments, Valued and Modifiable • Assignments have values. • $a += ($b=$c); • This copies the value of $c to $b and then adds that to $a. • Assignments are modifiable. • ($c=$d)+=@e; • This copies the value of $d to $c and then adds the number of elements in @e to $c. • Or, in other words, the left side of an assignment can be used in further assignment (this property is often called “lvalue”, left-value).

  40. Operators: Assignments, List Context • Assignment also works in list context. • ($a, $b) = ($b, $a); • This swaps the values of $a and $b
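The three assignment forms above (an assignment used as a value, an assignment used as an lvalue, and a list assignment) can be traced in a short sketch. The variable names and starting values are hypothetical.

```perl
use strict;
use warnings;

my ($x, $y, $z) = (10, 0, 5);
$x += ($y = $z);        # $y becomes 5, then $x becomes 10 + 5 = 15

my @e = qw(p q r);
my ($c, $d) = (0, 7);
($c = $d) += @e;        # $c becomes 7, then 7 + 3 (the size of @e) = 10

($x, $y) = ($y, $x);    # list assignment swaps the two values

print "x=$x y=$y c=$c d=$d\n";   # prints "x=5 y=15 c=10 d=7"
```

The swap works because the whole right-hand list is evaluated before anything on the left is assigned.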

  41. Operators: Comparing • For comparing scalars with each other and with literal values all the usual comparison operators exist. The catch is that there are two sets of comparison operators: one for comparing in string context (eq, ne, lt, gt, le, ge), one for comparing in numeric context (==, !=, <, >, <=, >=). The cmp and <=> operators return -1, 0, or 1 depending on whether the comparands fulfil the less-than, equal-to, or greater-than relation
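The difference between the two operator sets shows up as soon as strings that look like numbers are compared. A minimal sketch, with illustrative values:

```perl
use strict;
use warnings;

my $num_eq = (10 == 10.0)      ? 1 : 0;   # numerically equal
my $str_eq = ('10' eq '10.0')  ? 1 : 0;   # different strings

my $num_cmp = 9 <=> 10;        # -1: 9 is less than 10
my $str_cmp = '9' cmp '10';    #  1: "9" sorts after "1..." as a string

my @nums      = (10, 9, 100, 1);
my @by_number = sort { $a <=> $b } @nums;
my @by_string = sort { $a cmp $b } @nums;

print "$num_eq $str_eq $num_cmp $str_cmp\n";   # prints "1 0 -1 1"
print "@by_number\n";                          # prints "1 9 10 100"
print "@by_string\n";                          # prints "1 10 100 9"
```

Sorting is where the choice matters most in practice: sort uses string comparison unless a numeric comparison block is supplied.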

  42. Operators: Concatenate and Repeat • Scalars can be concatenated (as strings) using the . (dot) operator • $a = "con"; • $b = "nate"; • $c = $a . "cate" . $b; • will result in $c being "concatenate". • Scalars and lists can be repeated using the x operator. The repetition count comes after the operator, and a list to be repeated must be enclosed in parentheses • $a = "yes"; @b = ($a, "No"); $c = $a x 3; @c = (@b) x 2; • print "@c, $c!\n"; • will print: yes No yes No, yesyesyes!

  43. Operators: Range Generator .. • The .. operator can be used to generate ranges of values (as lists) between two scalar endpoints. This works both for numbers and strings. The rules for "incrementing strings" to generate ranges are the same as with the ++ operator. • @a = 0 .. 4; • @b = "aa" .. "zz"; • print "@a @b[0 .. 4] @b[-4 .. -1]\n"; • will result in 0 1 2 3 4 aa ab ac ad ae zw zx zy zz
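A compact sketch of the repetition and range operators together, with illustrative variable names. The detail worth noticing is the pair of parentheses around @b: without them, x would treat its left operand as a scalar rather than a list.

```perl
use strict;
use warnings;

my @b = ('yes', 'No');
my $c = 'yes' x 3;              # string repetition
my @c = (@b) x 2;               # list repetition needs parentheses on the left

my @digits = (0 .. 4);          # numeric range
my @pairs  = ('aa' .. 'zz');    # string range: 26 * 26 = 676 strings

print "@c, $c!\n";
# prints "yes No yes No, yesyesyes!"
print "@digits @pairs[0 .. 4] @pairs[-4 .. -1]\n";
# prints "0 1 2 3 4 aa ab ac ad ae zw zx zy zz"
```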

  44. Operators: Quoting • Single quotes do not interpolate (do not expand) variables; double quotes do. • Double quotes also have several special constructs triggered by the backslash (\), for example \n for a logical newline and \U for uppercasing the value of the following variable. • if ($need eq 'urgent') { print "\U$task\n" } • else { print "$task\n" }

  45. Operators: Here-Documents (I) • A quoting mechanism called here-documents (inherited from UNIX shell scripts) enables one to easily include multiline blocks of text. • The syntax is <<terminator (the terminator can be any "word": alphanumerics and underscores), followed by the lines of text, and terminated by the terminator string at the beginning of a line and alone on the line. • $message = <<HERE_IS_THE_MESSAGE; • One • Two • Three • HERE_IS_THE_MESSAGE • This is equivalent to $message = "One\nTwo\nThree\n";.

  46. Operators: Here-Documents (II) • The here-document mechanism can be used either in a single-quoted way (variables not interpolated) or in a double-quoted way (variables interpolated) • If there are no quotes around the terminator after the <<, the double-quoted case is implicitly assumed, so a literal @ must be escaped. Explicit single quoting can be achieved by enclosing the terminator inside single quotes. • $x = <<EOF; • no \@expansions • EOF • $y = <<'EOF'; • no @expansions • EOF • After this, $x and $y are equivalently "no \@expansions\n"

  47. Operators: Here-Documents (III) • If there are scalars or arrays in the here-block that need to be interpolated, double quotes around the terminator make this explicit. • @x = qw(a bc def); • $x = <<"EOH"; • This • @x • is it. • EOH • will have $x equal to "This\na bc def\nis it.\n".
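The two quoting styles of here-documents can be compared side by side. In this sketch the variable names are hypothetical, and the same terminator word is reused for both blocks, which is allowed because each block ends at its own terminator line.

```perl
use strict;
use warnings;

my @x = qw(a bc def);

# Double-quoted terminator: @x is interpolated.
my $interpolated = <<"EOH";
This
@x
is it.
EOH

# Single-quoted terminator: the text is taken literally.
my $literal = <<'EOH';
no @x here
EOH

print $interpolated;   # three lines; the middle one is "a bc def"
print $literal;        # one line: "no @x here"
```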

  48. Operators: Input Operator < > • After opening a file one gets a filehandle. The "next item" can be read from the handle using the diamond operator < >. open(X, "FileName") or die "$0: failed to open FileName: $!\n"; • $line = <X>; • will read the first line of the file called "FileName" into $line. • $0 is a special variable that contains the name of the script. $! is a special variable that contains the error message caused by the latest failed system call, such as trying to open a file. In numeric context, $! is the numeric error code. • For files, the "next item" is the next line. For directories, which are opened with opendir, the next filename is returned by readdir rather than by < >.

  49. Control Structures • A very rich selection of flow control structures is available. • Control structures control blocks, that is, statements enclosed in { }. • if ($a >= $b + $c) { • $a = 0; • } else { • $a++; • }

  50. Control Structures: if, unless, elsif, else • The if control structure works as might be expected: the following block is executed if the condition is true. • if ($x <= 100) {…..} • The unless is the logical opposite of if: the block following it is chosen if the condition is false • unless ($x > 100) {……} • if ($x < 10) {……} • elsif ($x < 20) {…..} • elsif ($x < 30) {….} • else {….}
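A runnable sketch of the if/elsif/else chain and of unless. The subroutine names band and in_range and the band labels are hypothetical, chosen only to make the branches visible.

```perl
use strict;
use warnings;

# Classify a number using an if / elsif / else chain.
sub band {
    my ($x) = @_;
    if    ($x < 10) { return 'under 10' }
    elsif ($x < 20) { return '10 to 19' }
    elsif ($x < 30) { return '20 to 29' }
    else            { return '30 or more' }
}

# unless runs its block only when the condition is false.
sub in_range {
    my ($x) = @_;
    my $ok = 1;
    unless ($x <= 100) { $ok = 0 }
    return $ok;
}

print band(5), ", ", band(15), ", ", band(25), ", ", band(99), "\n";
print in_range(100), " ", in_range(101), "\n";   # prints "1 0"
```

Only the first branch whose condition is true is executed, so the order of the elsif tests matters.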
