1 / 31

Perl: subroutines (for sorting)

Perl: subroutines (for sorting). Good Programming Strategies for Subroutines. #!/usr/bin/perl # example why globals are bad $one = &lt;STDIN&gt;; $two = &lt;STDIN&gt;; $max = &amp;larger; print &quot;$max<br>&quot;; sub larger { if ($one &gt; $two) { $one; } else { $two; }

hua
Télécharger la présentation

Perl: subroutines (for sorting)

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Perl: subroutines (for sorting)

  2. Good Programming Strategies for Subroutines #!/usr/bin/perl # example why globals are bad $one = <STDIN>; $two = <STDIN>; $max = &larger; print "$max\n"; sub larger { if ($one > $two) { $one; } else { $two; } In this example, it is tedious to use the same code for 2 different variables, such as $fred and $barney

  3. #!/usr/bin/perl # example why globals are bad $one = <STDIN>; $two = <STDIN>; $max = &larger; print "$max\n"; $fred = <STDIN>; #hack to make "larger" work with 2 different variables $barney = <STDIN>; $keep_one = $one; # redundant stuff $keep_two = $two; $one = $fred; $two = $barney; $max = &larger; print "$max\n"; $one = $keep_one; $two = $keep_two; sub larger { if ($one > $two) { $one; } else { $two; } #there HAS to be a better way

  4. Arguments Passed to Subroutines • To pass parameters to a subroutine, simply list them in parentheses after the subroutine call $n = &larger (10, 15); $X = &larger ($one, $two);

  5. Arguments Passed to Subroutines New automatic variable (@_) • parameters passed to a subroutine are automatically stored in this array – as many as are required • recall array syntax: $_[0] is first element of @_ array • do NOT confuse this with $_ -- the default variable • example: $i is scalar, and distinct from @i and $i[0] – both arrays • scope – the variables in @_ are local to the subroutine • this means that when the subroutine is called, the values of the variables used in calling the subroutine are copied into the automatic argument array – so changes are only local to the subroutine • This also means that if a subroutine calls a subroutine, the local scope of @_ is maintained – there would be two separate and distinct versions of @_ • this is why recursion works • this is taken care of by perl – we don't have to worry about it – we just have to understand that it occurs

  6. Arguments Passed to Subroutines We can re-write our subroutine with parameter passing as: #!/usr/bin/perl $x = 5; $y = 18; $max = &larger ($x, $y); #call the subroutine print "$max"; ########################### sub larger { if($_[0] > $_[1]) { $_[0]; #local copy of X } else { $_[1]; #local copy of Y } However, this is hard to read, write, check, and debug Another problem – what happens if an extra parameter is included – you don't know from the context of the name of the subroutine "larger" that it only accepts 2 parameters. $x = larger(10, 15, 30); #30 will be ignored

  7. Arguments Passed to Subroutines • all variables in perl are global by default • you can specify local variables with "my" sub larger { my ($a, $b); #my $a; my $b; ($a, $b) = @_; #list assignment # $a = $_[0]; # $b = $_[1]; if ($a > $b) { $a; } else {$b;} #note, you may omit ;'s for single lines of code within a #block }

  8. Argument Passing to Sub's • The variables $a and $b are private ("scoped") to only this code block – the subroutine • They will not affect any other $a/$b in the program • Changes to $a/$b in other parts of the program will NOT affect $a/$b in this subroutine • The subroutine is modular and reusable, it can be placed into virtually any perl program and operate predictably.

  9. More condensed without comments sub larger { my ($a, $b); ($a, $b) = @_; if ($a > $b) { $a; } else {$b;} } # even simpler sub larger { my ($a, $b) = @_; #subroutines often have this line if ($a > $b) { $a; } else {$b;} }

  10. Variable-length Parameter Lists • many traditional programming languages require subroutine parameter lists to be strictly typed • predefined number and type of parameters • this may be enforced in perl as well sub larger { if(@_ != 2) { # scalar context print "Warning: 2 arguments expected\n" } : : }

  11. Another way: Variable-length Parameter Lists #!/usr/bin/perl $max = &larger (3, 5, 10, 4, 5); sub larger{ my ($largest) = shift @_; # shift element off of the LHS of the array foreach (@_) { # default variable used ($_) if($_ > $ largest) { $largest = $_; } } $largest; }

  12. Pragma: use strict; • A pragma is used to convey information to the perl compiler • use strict; • tells perl's compiler that it should enforce some good programming rules for this code block • essentially forces you to declare all variables with "my" • -w (warnings pragma) – already talked about • same as: use warnings; • -w applies to all code (modules, subroutines, etc.) #!/usr/bin/perl $i =~ m/tag/; #warnings: Name "main::i" used only once: possible typo at test.pl line2 # Use of uninitialized value in pattern match (m//) at line 2.

  13. use strict; #!/usr/bin/perl use strict; $i++; $i =~ m/tag/; # Global symbol "$i" requires explicit package name at ./strict.pl line 3. # Global symbol "$i" requires explicit package name at ./strict.pl line 4. # Execution of ./strict.pl aborted due to compilation errors. perldoc warnings perldoc perllexwarn perldoc strict More useful than appears: example

  14. #!/usr/bin/perl # maxBad.pl # Example program to show use of subroutines and strict use strict; my @numbers = (1, 10, 11, 3, 9, 8, 5, 3); my $max; $max = &max(@numbers); print "max = $max\n"; if($number[1] == $max) { print "The largest number is at position 1 in array\n"; } sub max { my $large = shift @_; foreach my $i (@_) { if($i > $large) { $large = $i; } $large; } Global symbol "@number" requires explicit package name at ./maxBad.pl line 12. Execution of ./maxBad.pl aborted due to compilation errors. Missing right curly or square bracket at ./maxBad.pl line 24, at end of line syntax error at ./maxBad.pl line 24, at EOF

  15. Comment on Pragmas use warnings; use strict; Ideally these would be used from the beginning of every program that is longer than a few lines. It can be quite a challenge to take a large and complicated program and clean up all the warnings/errors Adding these checks after the program is written and debugged defeats the benefit of reducing development time by finding "mistakes".

  16. return Operator • return EXPRESSION • immediately returns a value from a subroutine my @seqs = qw/TTT GTC CTG ATG GTA CGA/; my $index = &which_codon_is("ATG", @seqs); sub which_codon_is { my ($this_codon, @list) = @_; foreach (0..$#list) { # indices of list if($this_codon eq $list[$_] ) { # missing ) on web return $_; } } return(-1); # -1 if not found } # return is optional here – could just put -1

  17. #!/usr/bin/perl # translate # Take input from STDIN, # convert DNA to AA's # # Modified -- to have function call another function $line = <>; while( $line = <>) { chomp($line); #take care of new lines $sequence = $sequence.$line; } # amino acid hash # key values pairs %aminos = ( "TTT", "F", "TTC", "F", "TTA", "L", "TTG", "L", "CTT", "L", "CTC", "L", "CTA", "L", "CTG", "L", "ATT", "I", "ATC", "I", "ATA", "I", "ATG", "M", "GTT", "V", "GTC", "V", "GTA", "V", "GTG", "V", "TCT", "S", "TCC", "S", "TCA", "S", "TCG", "S", "CCT", "P", "CCC", "P", "CCA", "P", "CCG", "P", "ACT", "T", "ACC", "T", "ACA", "T", "ACG", "T", "GCT", "A", "GCC", "A", "GCA", "A", "GCG", "A", "TAT", "Y", "TAC", "Y", "TAA", ".", "TAG", ".", "CAT", "H", "CAC", "H", "CAA", "Q", "CAG", "Q", "AAT", "N", "AAC", "N", "AAA", "K", "AAG", "K", "GAT", "D", "GAC", "D", "GAA", "E", "GAG", "E", "TGT", "C", "TGC", "C", "TGA", ".", "TGG", "W", "CGT", "R", "CGC", "R", "CGA", "R", "CGG", "R", "AGT", "S", "AGC", "S", "AGA", "R", "AGG", "R", "GGT", "G", "GGC", "G", "GGA", "G", "GGG", "G", ); $peptide = &tlate($sequence); print "$peptide\n"; ############# Subroutines sub tlate { $seq = $_[0]; $peptide=""; $length = length($seq); while($length>=3) { ($codon,$seq) = getCodon($seq); if($aminos{$codon}) { $peptide = $peptide.$aminos{$codon}; } else { $peptide = $peptide."."; #just put in a stop for XXX, ATX, etc } $length = length($seq); #print "$length\r"; } print "\n"; return($peptide) } sub getCodon { $sequence = $_[0]; $codon = substr($sequence,0,3); # look up codon $sequence = substr($sequence,3,$length-3); # remove that codon #print "codon=$codon\n"; return($codon,$sequence); }

  18. Output ./subsubRoutine.pl Main program s1 a = 4 b = 6 s2 a = 6 b = 10 return1 = 60 val = product =

  19. Sort Subroutine A sort subroutine might (INCORRECTLY?) be expected to take 2 values, compare those values, and return them in sorted order: # POOR example sub sort_sub { my $a = shift; my $b = shift; # my ($a, $b) = @_; #same thing if ($a > $b) { return ($b, $a)} else {return ($a, $b)} }

  20. Sort Subroutine Problems • this subroutine may be called hundreds/thousands of times • inefficient because • allocate variables • assign values • return 2 values • alternative way • both $a and $b have been allocated in the "calling" code block • return one coded value • -1 if $a < $b #Don't swap • 1 if $a > $b # swap • 0 if order doesn't matter • Note – sorting is a special case that is performed so often that it has been highly optimized (that's why it gets all of this "special" attention)

  21. numeric sort subroutine sub by_num { # expect $a and $b if($a < $b) {-1} elsif ($a > $b) {1} else {0} } # ;'s are optional if one line in code block

  22. "sort" with custom subroutine The custom sorting subroutine (without &) may be specified for the "sort" operator in perl. It assumes the -1, 1, 0 cases #!/usr/bin/perl use strict; my @nums = qw/5 2 9 12 37 2 18/; @results = sort by_num @nums; sub by_num { if($a < $b) {-1} elsif ($a > $b) {1} else {0} } Note, in "by_num", we need don't need to declare $a (even with strict) If we do, this will not work correctly. This is an example of "pass by reference" The values of the array are passed to the subroutine by reference – the actual values are not passed back and forth. That is why this works – and why changing $a changes the array.

  23. sort Syntax (pg 217) sort SUBNAME LIST sort BLOCK list sort LIST #Note, this is a special case where the subroutine is called WITHOUT the &

  24. Example #!/usr/bin/perl # sort.pl # @array = (9, 8, 7, 1, 2, 3, 9, 69, 5,); @sorted = sort by_num @array; print "sorted = @sorted\n"; sub by_num { print "$a $b\n"; if($a < $b) {-1} elsif ($a > $b) {1} else {0} } # example -- sort-brok.pl

  25. Shortcut to Sorting • sorting like this is so common, a special operator exists to replace: if($a < $b) {-1} elsif ($a > $b) {1} else {0} Therefore, we can also replace the whole subroutine: @results = sort by_num @nums; @results = sort { $a <=> $b} @nums; Descending order: @results = sort { $b <=> $a} @nums; cmp can be used for string comparisons – but this is how the sort operator already sorts.

  26. Example Profile #!/usr/bin/perl # sort-prof.pl (for profile)_ # # Note how if I change the value of $a, it changes in the original array use strict; my @nums = qw/5 2 9 12 37 2 18/; my @results = sort by_num @nums; print "@nums\n"; print "@results\n"; sub by_num { print "$a $b\n"; if($a == 5) { $a = 55} if($a < $b) {-1} elsif ($a > $b) {1} else {0} }

  27. Example Profile 5 2 9 12 37 2 2 9 9 55 55 12 2 18 18 37 2 2 2 9 9 18 18 12 18 55 55 37 55 2 9 12 37 2 18 2 2 9 12 18 37 55

  28. Sorting a Hash #!/usr/bin/perl use strict; my %score = ("tim" => 195, "tracy" => 205, "indy" => 30); # sort keys of hash, based on values of hash my @winners = sort by_score keys %score; foreach (@winners) { print "$_ $score{$_}\n"; } sub by_score { $score{$b} <=> $score{$a} } #note descending order #tracy 205 #tim 195 #indy 30

  29. #!/usr/bin/perl $i = "4-18-69"; # match 1 digit, dashe, 2 digits, dash, and 2 digits if($i =~ m/\d-\d\d-\d\d/) { print "$i\n"; print "$&\n"; } #But what happens if day is 1 digit #match both cases $i = "4-1-69"; if($i =~ m/\d-(\d\d|\d)-\d\d/) { print "$i\n"; print "$&\n"; } #Match 1 or more digits #But, this also matches 100 for day $i = "4-1-69"; if($i =~ m/\d-\d+-\d\d/) { print "$i\n"; print "$&\n"; } #What about 2 digit years VS 4 digit years # Carefull, this will match 19 before 69 $i = "4-1-1969"; if($i =~ m/\d-\d+-\d\d|\d\d\d\d/) { print "$i\n"; print "$&\n"; } # $i = "4-1-1969"; if($i =~ m/\d-\d+-(\d\d\d\d|\d\d)/) { print "$i\n"; print "$&\n"; } #What about 2 digit months # But this will match 13 $i = "12-1-1969"; if($i =~ m/\d+-\d+-\d\d\d\d|\d\d/) { print "$i\n"; print "$&\n"; } #How about this #Doesn work, becuase it matches 1 $i = "13-1-1969"; if($i =~ m/1|2|3|4|5|6|7|8|9|10|11|12-\d+-\d\d\d\d|\d\d/) { print "$i\n"; print "$&\n"; } # Added ^ and $ # Added parenthises for grouping # Will NOT match because of 14 $i = "14-1-1969"; if($i =~ m/^(1|2|3|4|5|6|7|8|9|10|11|12)-\d+-(\d\d\d\d|\d\d)$/) { print "$i\n"; print "$&\n"; } # Now it will match $i = "12-1-1969"; if($i =~ m/^(1|2|3|4|5|6|7|8|9|10|11|12)-\d+-(\d\d\d\d|\d\d)$/) { print "$i\n"; print "$&\n"; } # Note limit on days to 1 or 2 digits # also, storing results into vars $i = "9-28-1969"; if($i =~ m/^(1|2|3|4|5|6|7|8|9|10|11|12)-(\d{1,2})-(\d\d\d\d|\d\d)$/) { print "$i\n"; print "$&\n"; $month = $1; $day = $2; $year = $3; print "$month $day $year\n"; } Regular Expression Example

  30. End

More Related