MCB3895-004 Lecture #4, Sept 4/14: Perl input/output, while loops and strings
OBJECTIVE: use Perl to open a file and count the number of lines that it contains
The absolute basics
• For simplicity, name all of your scripts "<something>.pl", and execute them from the terminal using: perl <something>.pl
• Start every Perl script with this line: #!/usr/bin/perl
    • Tells the computer that this is a Perl script and how to run it
• Every statement in a Perl script ends with a ";"
    • A statement can therefore run over multiple lines
• Include "use warnings;" at the top of all scripts
    • Warns you about likely mistakes in your code

Comments
• In a Perl script, everything that follows the "#" character on a line is a comment
• Comments are not executed, e.g.:
    print "something"; # the print statement to the left is executed, the text following the # is not
• It is CRUCIAL to comment your code liberally
    • Otherwise you typically have no idea what you did the next time you look at it

print
• The simplest command is "print", used to output data
    • e.g., print "something"; # outputs the text "something" to the screen

Script #1: Hello world
    #!/usr/bin/perl
    use warnings;

    # print command
    print "Hello world!";

Scalar Variables
• A variable contains information held by the computer for later use
• The simplest Perl variable is a "scalar"
    • Think of it as a word or number in language
• Scalar variables always start with a "$" character
• You can call them whatever you want, but names must not contain spaces or start with a number or special character
    • e.g., "$x", "$var", "$this_is_a_long_variable_1"
• Use "=" to assign a value to a scalar variable
    • e.g., $num = 3;

Script #2: Numerical scalars
    #!/usr/bin/perl
    use warnings;

    # assign the variable
    $var = 3;

    # print $var
    print $var;

Scalar variables - strings
• Scalars can also hold strings of text
• Also assigned using "=", but the text must be in quotation marks
    • e.g., $var = "ACTGACGTACGAT";

Script #3: Text string scalar
    #!/usr/bin/perl
    use warnings;

    # assign the variable
    $var = "ACTG";

    # print $var
    print $var;

Scalar variables - interpolation
• In the previous example, the double quotation marks tell Perl that this is an "interpolated" string
    • i.e., special characters and other variables can be used inside the string
• e.g., $var1 = "ACTG$var2"; # includes the contents of $var2 in the new $var1 string
• OR: $var1 = "ACTG" . $var2; # same thing - "." means "concatenate"
• e.g., $var = "ACTG\n"; # includes the "\n" new line character
• e.g., $var = "ACTG\tACTG"; # includes the "\t" tab character
• print can use "," instead of "."
    • e.g., print $var, "\n";

Scalar variables - non-interpolated strings
• Single quotes do the opposite of double quotes: special characters and variables are taken literally
    • e.g., print '$var\n'; # literally prints $var\n

Script #4: More text strings
    #!/usr/bin/perl
    use warnings;

    # assign the variables
    $var1 = "ACTG\n";
    $var2 = "AC\t$var1";
    $var3 = 'AC\t$var1';

    # print $vars
    print $var1, $var2, $var3;

Hints about variable names
• Names should be descriptive
    • e.g., "$x" is a poor name for a string containing DNA sequence
• Names should not be too long
    • e.g., "$dna_sequence_variable" is also poor
• Good examples: "$dna" or "$dna_seq"
• Avoid variable names that are the same as a Perl function
    • e.g., "$print" is poor

Good code is readable code!
• Use blank lines to separate logical blocks
• Use comments to explain what code blocks are for
• Use spaces to separate and align linked parts of the code

Math
• Perl can do all of the expected mathematical operations
    $sum      = 3 + 2;
    $minus    = 3 - 2;
    $times    = 3 * 2;
    $divide   = 3 / 2;
    $modulo   = 3 % 2;
    $exponent = 3 ** 2;
• Order of operations is as expected; use round brackets as appropriate

Script #5: Basic math
    #!/usr/bin/perl
    use warnings;

    # assign the variables
    $sum      = 3 + 2;
    $minus    = 3 - 2;
    $times    = 3 * 2;
    $divide   = 3 / 2;
    $modulo   = 3 % 2;
    $exponent = 3 ** 2;

    # print the answers
    print "3 + 2 = $sum\n";
    print "3 - 2 = $minus\n";
    print "3 * 2 = $times\n";
    print "3 / 2 = $divide\n";
    print "3 % 2 = $modulo\n";
    print "3 ** 2 = $exponent\n";

More handy math
    abs(-1);        # returns: 1
    sqrt(9);        # returns: 3
    log(10);        # returns: 2.302585... (log is the natural logarithm, so log(10) is not 1)
    int(1.6);       # returns: 1
    rand(10);       # returns: random number from 0 up to (but not including) 10
    int(rand(10));  # returns: random integer from 0 to 9

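The functions above can be tried out in a short script. This is a minimal sketch in the style of the earlier scripts; the base-10 line uses the change-of-base rule log(x) / log(10), which is an addition for illustration and not part of the original slide.

```perl
#!/usr/bin/perl
use warnings;

# built-in math functions, printed with "," as in the print examples
print "abs(-1)       = ", abs(-1), "\n";
print "sqrt(9)       = ", sqrt(9), "\n";
print "log(10)       = ", log(10), "\n";              # natural log, about 2.30
print "log10 of 1000 = ", log(1000) / log(10), "\n";  # base 10 via change of base
print "int(1.6)      = ", int(1.6), "\n";             # drops the fractional part
print "int(rand(10)) = ", int(rand(10)), "\n";        # changes from run to run
```

Running it a few times shows that only the last line changes between runs.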
Incrementing a value
• Counting things is one of the most common Perl applications
• Two methods:
    $num = $num + 1;
    $num++;
• The first method is more general and can be used with all other mathematical operators
• The second one also works as $num--
• In both cases, the previous value is overwritten

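A minimal sketch of both methods side by side; the starting value and the multiplication step are arbitrary choices for illustration.

```perl
#!/usr/bin/perl
use warnings;

$num = 0;           # start counting at zero
$num = $num + 1;    # general form: $num is now 1
$num++;             # shortcut:     $num is now 2
$num = $num * 5;    # general form works with any operator: $num is now 10
$num--;             # decrement shortcut: $num is now 9

print "num = $num\n";   # prints "num = 9"
```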
Code blocks
• Code is often broken into smaller chunks
    • Logically more coherent
    • Used in loops and if/else-type conditional logic
• Perl code blocks are enclosed by curly brackets { }
• By convention, everything in the block is indented (e.g., using a tab)
    {
        # code here
    }
• Can have nested blocks
    {
        # code here
        {
            # more code here
        }
    }

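Blocks on their own do not change the order in which code runs; statements still execute from top to bottom. A small sketch, with a hypothetical $steps counter added only to make the order visible:

```perl
#!/usr/bin/perl
use warnings;

$steps = 0;
{
    $steps++;
    print "outer block, step $steps\n";
    {
        $steps++;
        print "nested block, step $steps\n";
    }
    $steps++;
    print "outer block again, step $steps\n";
}
```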
use strict
• Like use warnings, designed to keep you out of trouble so that you always know what your variables are doing
• Requires that all variables be declared with "my" the first time that they are used
    • e.g., my $x = 3;
• A variable declared inside a code block is only defined within that code block
• Declare each variable only once within a block
    • Keeps you from reusing things unintentionally

Script #6: strict and code blocks
    #!/usr/bin/perl
    # use warnings;
    # use strict;

    my $x = 3;
    {
        print "$x $y\n";
        my $y = 2;
        print "$x $y\n";
        {
            my $z = 1;
            print "$x $y $z\n";
        }
        print "$x $y $z\n";
    }
    # this script is broken under warnings and strict:
    # $y is used before it is declared, and $z is used outside of its block

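For comparison, here is one possible way to rearrange Script #6 so that it runs with warnings and strict switched on. This is a sketch, not the lecture's own solution: each variable is declared before its first use and is only printed inside the block where it exists.

```perl
#!/usr/bin/perl
use warnings;
use strict;

my $x = 3;
{
    my $y = 2;                 # declared before its first use
    print "$x $y\n";
    {
        my $z = 1;
        print "$x $y $z\n";    # $x, $y and $z are all visible here
    }
    print "$x $y\n";           # $z no longer exists here
}
print "$x\n";                  # only $x exists here
```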
Opening an input file
• In Perl, files are opened using the "open" command
    • e.g., open(INFILE, "$ENV{HOME}/myfile") or die "Cannot find $ENV{HOME}/myfile";
    • Perl does not expand the shell shortcut "~", so the home directory is written as $ENV{HOME}
• By convention, Perl file handles (i.e., what Perl will call the file in your script) are written in BLOCK CAPS
• The "or die" code provides a reality check: if the file cannot be opened, the script will exit with an error indicating that it could not open the expected file

The file operator
• Now that you have opened a file, you can use the file operator "<>" to read it one line at a time
• e.g., assign the first line of a file to a string
    $string = <INFILE>;
• Why is only one line assigned?

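The one-line-at-a-time behaviour can be seen by reading twice from the same file handle. To keep this sketch self-contained it reads from a string instead of a file on disk: passing a reference (\$text) to open with the "<" (read) mode makes Perl treat the string as a file. The contents of $text are hypothetical.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# stand-in for a real file: three lines held in a string
my $text = "first\nsecond\nthird\n";
open(INFILE, "<", \$text) or die "Cannot open in-memory file: $!";

my $string = <INFILE>;   # assigning to a scalar reads exactly one line
my $next   = <INFILE>;   # the next read continues where the last one stopped

print "1st read: $string";
print "2nd read: $next";

close(INFILE);
```

Each read keeps the trailing "\n", which is why the print statements do not add one.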
The while loop
• Logically, a while loop states: "do something while the specified condition remains true"
• e.g.,
    while ($line = <INFILE>) {
        # do something
    }
• literally: "while the $line variable can be defined by something, i.e., so long as there are lines in the INFILE that can be assigned to $line"
• What happens to $line is specified by what is in the code block and happens for each line of INFILE

Script #7: reading an input file using a while loop • Make an input file "~/myfile" that contains the following text: 1 2 3 4 5 6
Script #7: reading an input file using a while loop
    #!/usr/bin/perl
    use warnings;
    use strict;

    # open the input file ($ENV{HOME} is your home directory; Perl does not expand "~")
    open(INFILE, "$ENV{HOME}/myfile") or die "Cannot open $ENV{HOME}/myfile";

    # loop through the file line by line
    while (my $line = <INFILE>) {
        print $line;
    }

Script #8: Count all of the lines in ~/myfile • You don't need me for this ;)
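
If you want to check your own answer, one possible sketch combines the while loop with the $count++ idiom from earlier. It reads from a six-line string in place of the real file (an assumption made so the example runs anywhere); swapping the open line for the one in Script #7 would count the lines of your own file.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# stand-in for the input file: six lines held in a string
my $text = "1\n2\n3\n4\n5\n6\n";
open(INFILE, "<", \$text) or die "Cannot open in-memory file: $!";

# add one to the counter for every line that can be read
my $count = 0;
while (my $line = <INFILE>) {
    $count++;
}
close(INFILE);

print "Number of lines: $count\n";   # prints "Number of lines: 6"
```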