Topics


  1. Topics

     • Quiz 1
     • Match operators
     • Substitution
     • Transliteration
     • String functions: length, reverse
     • Array functions: scalar, reverse, sort; push, pop, shift, unshift
     • Loops: while, foreach, for
     • Split and join
     • Input/Output
     • Examples
     • Reading FASTA file
     • Programming Assignment #1
     • printf

     BINF634 FALL13 LECTURE 2

  2. Match Operator (from snippets2.pl)

     $dna = "ATGCATTT";
     if ($dna =~ /ATT/) {
         print "$dna contains ATT\n";
     }
     else {
         print "$dna doesn't contain ATT\n";
     }

     Output of code snippet:
     ATGCATTT contains ATT

     # matching a pattern
     $dna = "ATGAAATTT";
     $pattern = "GGG";
     if ($dna =~ /$pattern/) {
         print "$dna contains $pattern\n";
     }
     else {
         print "$dna doesn't contain $pattern\n";
     }
     print "\n";

     Output:
     ATGAAATTT doesn't contain GGG

  3. Substitution (from snippets2.pl)

     print "substitution example:\n";
     $dna = "ATGCATTT";
     print "Old DNA: $dna\n";
     $dna =~ s/TGC/gggagc/;
     print "New DNA: $dna\n\n";

     Output:
     substitution example:
     Old DNA: ATGCATTT
     New DNA: AgggagcATTT

     print "single substitution:\n";
     $dna = "ATGCATTT";
     print "Old DNA: $dna\n";
     $dna =~ s/T/t/;
     print "New DNA: $dna\n\n";

     Output:
     single substitution:
     Old DNA: ATGCATTT
     New DNA: AtGCATTT

     print "global substitution:\n";
     $dna = "ATGCATTT";
     print "Old DNA: $dna\n";
     $dna =~ s/T/t/g;
     print "New DNA: $dna\n\n";

     Output:
     global substitution:
     Old DNA: ATGCATTT
     New DNA: AtGCAttt

  4. Substitution (from snippets2.pl)

     print "removing white space\n";
     $dna = "ATG CATTT CGCATAG";
     print "Old DNA: $dna\n";
     $dna =~ s/\s//g;
     print "New DNA: $dna\n\n";

     Output:
     removing white space
     Old DNA: ATG CATTT CGCATAG
     New DNA: ATGCATTTCGCATAG

     print "substitution ignoring case\n";
     $dna = "ATGCAttT";
     print "Old DNA: $dna\n";
     $dna =~ s/T/U/gi;
     print "New DNA: $dna\n\n";

     Output:
     substitution ignoring case
     Old DNA: ATGCAttT
     New DNA: AUGCAUUU

  5. Computing complementary DNA (with bug)

     #!/usr/bin/perl -w
     # File: complement1
     # Calculating the complement of a strand of DNA (with bug)

     # The DNA
     $strand1 = 'ACGGGAGGACGGGAAAATTACTACGGCATTAGC';
     print "strand1: $strand1 \n";

     # Copy strand1 into strand2
     $strand2 = $strand1;

     # Replace all bases by their complements: A->T, T->A, G->C, C->G
     $strand2 =~ s/A/T/g;
     $strand2 =~ s/T/A/g;
     $strand2 =~ s/G/C/g;
     $strand2 =~ s/C/G/g;
     print "strand2: $strand2 \n";
     exit;

     % complement1
     strand1: ACGGGAGGACGGGAAAATTACTACGGCATTAGC
     strand2: AGGGGAGGAGGGGAAAAAAAGAAGGGGAAAAGG

     Can you find the bug?

  6. Transliteration Operator (from snippets2.pl)

     print "transliteration operator\n";
     $dna = "ATGCAttT";
     print "Old DNA: $dna\n";
     $dna =~ tr/T/U/;
     print "New DNA: $dna\n\n";

     Output:
     transliteration operator
     Old DNA: ATGCAttT
     New DNA: AUGCAttU

     print "tr on multiple characters\n";
     $dna = "ATGCAttT";
     print "Old DNA: $dna\n";
     $dna =~ tr/Tt/Uu/;
     print "New DNA: $dna\n\n";

     Output:
     tr on multiple characters
     Old DNA: ATGCAttT
     New DNA: AUGCAuuU

  7. DNA Complement (from snippets2.pl)

     print "DNA complement strand\n";
     $dna = "ATGCAttT";
     $complement = $dna;
     $complement =~ tr/AaTtGgCc/TtAaCcGg/;
     print "$dna\n";
     print "$complement\n\n";

     Output:
     DNA complement strand
     ATGCAttT
     TACGTaaA

  8. Computing complementary DNA (without bug)

     #!/usr/bin/perl -w
     # File: complement2
     # Calculating the complement of a strand of DNA

     # The DNA
     $strand1 = 'ACGGGAGGACGGGAAAATTACTACGGCATTAGC';
     print "strand1: $strand1\n";

     # Copy strand1 into strand2
     $strand2 = $strand1;

     # Replace all bases by their complements: A->T, T->A, G->C, C->G
     # tr replaces each char in first part with char in second part
     $strand2 =~ tr/ATGC/TACG/;
     print "strand2: $strand2 \n";
     exit;

     % complement2
     strand1: ACGGGAGGACGGGAAAATTACTACGGCATTAGC
     strand2: TGCCCTCCTGCCCTTTTAATGATGCCGTAATCG

     How have we eliminated the bug?
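
     One way to see the difference between complement1 and complement2: each s///g rescans the whole string, so bases that were just converted get converted back, while tr maps every character exactly once in a single pass. The sketch below puts the two approaches side by side on a short strand. The subroutine names complement_buggy and complement_tr are made up for this illustration and do not appear in the course files.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: chained s///g, as in complement1.
# The T's created by s/A/T/g are turned back into A's by s/T/A/g,
# and the C's created by s/G/C/g are turned back into G's by s/C/G/g.
sub complement_buggy {
    my ($strand) = @_;
    $strand =~ s/A/T/g;
    $strand =~ s/T/A/g;
    $strand =~ s/G/C/g;
    $strand =~ s/C/G/g;
    return $strand;
}

# Hypothetical helper: tr as in complement2.
# Each character is looked up once, so nothing is converted twice.
sub complement_tr {
    my ($strand) = @_;
    $strand =~ tr/ATGC/TACG/;
    return $strand;
}

my $dna = 'ACGT';
print "buggy: ", complement_buggy($dna), "\n";   # prints AGGA
print "tr:    ", complement_tr($dna), "\n";      # prints TGCA
```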

  9. Length Function (from snippets2.pl)

     print "length function\n";
     $dna = "ATGCAttT";
     $size = length($dna);
     print "DNA $dna has length $size\n\n";

     Output:
     length function
     DNA ATGCAttT has length 8

  10. Reverse function (from snippets2.pl)

     print "reverse function\n";
     $dna = "ATGCAttT";
     $reverse_dna = reverse($dna);
     print "DNA: $dna\n";
     print "Reverse DNA: $reverse_dna\n\n";

     Output:
     reverse function
     DNA: ATGCAttT
     Reverse DNA: TttACGTA

     print "reverse complement\n";
     $dna = "ATGCAttT";
     $rev_comp = reverse($dna);
     $rev_comp =~ tr/AaTtGgCc/TtAaCcGg/;
     print "$dna\n";
     print "$rev_comp\n\n";

     Output:
     reverse complement
     ATGCAttT
     AaaTGCAT
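
     The reverse-complement steps above can be wrapped in a subroutine so they are written only once. This is a minimal sketch; the name reverse_complement is an assumption made for illustration, not something defined in snippets2.pl. Note that reverse only reverses the characters of a string in scalar context (here, assignment to a scalar); in list context it reverses the order of a list instead.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the reverse complement of a DNA string,
# preserving upper/lower case as in the slide example.
sub reverse_complement {
    my ($dna) = @_;
    my $rev_comp = reverse $dna;            # scalar context: reverses the characters
    $rev_comp =~ tr/AaTtGgCc/TtAaCcGg/;     # complement each base in one pass
    return $rev_comp;
}

print reverse_complement("ATGCAttT"), "\n";   # prints AaaTGCAT
```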

  11. Array Functions: scalar, reverse, sort (from snippets2.pl)

     print "array of gene names\n";
     @genes = ("HOXB1", "ALPK1", "TP53");
     $size = scalar @genes;
     print "A list of $size genes: @genes\n";
     @genes = reverse @genes;
     print "Reversed list of $size genes: @genes\n";
     @genes = sort @genes;
     print "Sorted list of $size genes: @genes\n\n";

     Output:
     array of gene names
     A list of 3 genes: HOXB1 ALPK1 TP53
     Reversed list of 3 genes: TP53 ALPK1 HOXB1
     Sorted list of 3 genes: ALPK1 HOXB1 TP53

  12. Adding items to the end of an array (from snippets2.pl)

     print "Appending to an array\n";
     @genes = ("HOXB1", "ALPK1", "TP53");
     push @genes, "ZZZ3";
     $size = scalar @genes;
     print "There are now $size genes: @genes\n";
     push @genes, ("EGF", "EFGR");
     $size = scalar @genes;
     print "There are now $size genes: @genes\n\n";

     Output:
     Appending to an array
     There are now 4 genes: HOXB1 ALPK1 TP53 ZZZ3
     There are now 6 genes: HOXB1 ALPK1 TP53 ZZZ3 EGF EFGR

  13. Removing items from the end of an array (from snippets2.pl)

     print "Removing items from end of array\n";
     @genes = ("HOXB1", "ALPK1", "TP53", "EGF");
     $size = scalar @genes;
     print "A list of $size genes: @genes\n";
     pop @genes;
     $size = scalar @genes;
     print "There are now $size genes: @genes\n";
     $gene = pop @genes;
     $size = scalar @genes;
     print "There are now $size genes: @genes\n";
     print "The gene removed was $gene\n\n";

     Output:
     Removing items from end of array
     A list of 4 genes: HOXB1 ALPK1 TP53 EGF
     There are now 3 genes: HOXB1 ALPK1 TP53
     There are now 2 genes: HOXB1 ALPK1
     The gene removed was TP53

  14. Removing items from the front of an array (from snippets2.pl)

     print "Removing items from front of array\n";
     @genes = ("HOXB1", "ALPK1", "TP53", "EGF");
     $size = scalar @genes;
     print "A list of $size genes: @genes\n";
     shift @genes;
     $size = scalar @genes;
     print "There are now $size genes: @genes\n";
     $gene = shift @genes;
     $size = scalar @genes;
     print "There are now $size genes: @genes\n";
     print "The gene removed was $gene\n\n";

     Output:
     Removing items from front of array
     A list of 4 genes: HOXB1 ALPK1 TP53 EGF
     There are now 3 genes: ALPK1 TP53 EGF
     There are now 2 genes: TP53 EGF
     The gene removed was ALPK1

  15. while loops for list processing (from snippets2.pl)

     @genes = ("HOXB1", "ALPK1", "TP53");
     while (scalar @genes > 0) {
         $gene = shift @genes;
         print "Processing gene $gene\n";
         # put processing code here
     }

     Output:
     Processing gene HOXB1
     Processing gene ALPK1
     Processing gene TP53

     @genes = ("HOXB1", "ALPK1", "TP53");
     while (@genes) {
         $gene = shift @genes;
         print "Processing gene $gene\n";
         # put processing code here
     }
     $size = scalar @genes;
     print "There are now $size genes in the list: @genes\n";

     Output:
     Processing gene HOXB1
     Processing gene ALPK1
     Processing gene TP53
     There are now 0 genes in the list:

  16. foreach loops for list processing (from snippets2.pl)

     print "for loop to process all items from a list\n";
     @genes = ("HOXB1", "ALPK1", "TP53");
     foreach $gene (@genes) {
         print "Processing gene $gene\n";
         # put processing code here
     }
     $size = scalar @genes;
     print "There are still $size genes in the list: @genes\n";

     Output:
     for loop to process all items from a list
     Processing gene HOXB1
     Processing gene ALPK1
     Processing gene TP53
     There are still 3 genes in the list: HOXB1 ALPK1 TP53

  17. for loops for list processing (from snippets2.pl)

     print "another for loop to process a list\n";
     @genes = ("HOXB1", "ALPK1", "TP53");
     $size = scalar @genes;
     for (my $i = 0; $i < $size; $i++) {
         $gene = $genes[$i];
         print "Processing gene $gene\n";
         # put processing code here
     }
     $size = scalar @genes;
     print "There are still $size genes in the list: @genes\n";

     Output:
     another for loop to process a list
     Processing gene HOXB1
     Processing gene ALPK1
     Processing gene TP53
     There are still 3 genes in the list: HOXB1 ALPK1 TP53

  18. join: converting arrays to strings (from snippets2.pl)

     print "converting array to string\n";
     @genes = ("HOXB1", "ALPK1", "TP53");
     $string = join(" ", @genes);
     print "String of genes: $string\n";
     $size = length $string;
     print "String has length: $size\n";

     Output:
     converting array to string
     String of genes: HOXB1 ALPK1 TP53
     String has length: 16

     print "join with empty separator\n";
     @genes = ("HOXB1", "ALPK1", "TP53");
     $string = join("", @genes);
     print "String of genes: $string\n";
     $size = length $string;
     print "String has length: $size\n";

     Output:
     join with empty separator
     String of genes: HOXB1ALPK1TP53
     String has length: 14

  19. join with newline separator (from snippets2.pl)

     print "join with newline separator\n";
     @genes = ("HOXB1", "ALPK1", "TP53");
     $string = join "\n", @genes;
     print "String of genes: $string\n";
     $size = length $string;
     print "String has length: $size\n\n";

     Output:
     join with newline separator
     String of genes: HOXB1
     ALPK1
     TP53
     String has length: 16

  20. split: converting strings to arrays (from snippets2.pl)

     print "converting string to array\n";
     $dna = "ATGCATTT";
     @bases = split "", $dna;
     print "dna = $dna\n";
     $size = scalar @bases;
     print "The list of $size bases: @bases\n\n";

     Output:
     converting string to array
     dna = ATGCATTT
     The list of 8 bases: A T G C A T T T

  21. split: using separators (from snippets2.pl)

     print "split on white space\n";
     $string = "HOXB1 ALPK1 TP53";
     @genes = split " ", $string;
     print "$string\n@genes\n\n";

     Output:
     split on white space
     HOXB1 ALPK1 TP53
     HOXB1 ALPK1 TP53

     print "split on 'P'\n";
     $string = "HOXB1 ALPK1 TP53";
     @genes = split "P", $string;
     print "$string\n";
     foreach $gene (@genes) {
         print "|$gene|\n";
     }

     Output:
     split on 'P'
     HOXB1 ALPK1 TP53
     |HOXB1 AL|
     |K1 T|
     |53|
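
     The separator given to split can also be a regular expression. The sketch below goes slightly beyond the slide: it contrasts the special single-space separator " " (which skips leading whitespace) with the pattern /\s+/ (which keeps an empty leading field), and splits a tab-delimited record. The input strings and field values are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up input with leading, repeated, and trailing whitespace.
my $line = "  HOXB1   ALPK1\tTP53 ";

# split " " is a special case: leading whitespace is skipped and
# any run of whitespace counts as one separator.
my @by_space = split " ", $line;

# split /\s+/ treats the leading whitespace as a separator too,
# so the first field is an empty string.
my @by_regex = split /\s+/, $line;

# Splitting a tab-delimited record (values are invented for the example).
my $record = "TP53\tchr17\t12345";
my @fields = split /\t/, $record;

print scalar(@by_space), " fields from split \" \"\n";
print scalar(@by_regex), " fields from split /\\s+/\n";
print "gene=$fields[0] chromosome=$fields[1] position=$fields[2]\n";
```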

  22. The @ARGV Array

     Array @ARGV is the list of command line arguments for the program:

     % myprogram.pl hello 73 abcdef

     has the effect of:

     @ARGV = ("hello", 73, "abcdef");
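
     As a small illustration of reading @ARGV, the sketch below reports how many arguments a program received. The subroutine name describe_args is hypothetical and exists only for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: summarize a list of command line arguments.
sub describe_args {
    my @args = @_;
    my $n = scalar @args;              # number of arguments
    return "no arguments" if $n == 0;
    return "$n argument(s): " . join(", ", @args);
}

# @ARGV holds whatever followed the program name on the command line.
print describe_args(@ARGV), "\n";
```

     Run as "% myprogram.pl hello 73 abcdef", this would report three arguments; run with nothing after the program name, it reports none.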

  23. Opening a File

     Perl has two simple, built-in ways to open files:
     • the shell way for convenience
     • the C way for precision

     The shell way:

     % myprogram file1 file2 file3
     % myprogram < inputfile
     % myprogram > outputfile
     % myprogram >> outputfile
     % myprogram | otherprogram
     % otherprogram | myprogram

  24. Input from keyboard

     #!/usr/bin/perl
     use strict;
     use warnings;
     # File: readname.pl
     # Read in name and age
     print "Enter your name: ";
     my $name = <>;    # "<>" reads one line from "standard input"
     chomp $name;      # chomp deletes any newlines from end of string
     print "Enter your age: ";
     my $age = <>;
     chomp $age;
     print "Hello, $name! ";
     print "On your next birthday, you will be ", $age+1, ".\n";
     exit;

     % readname.pl
     Enter your name: Joe Smith
     Enter your age: 42
     Hello, Joe Smith! On your next birthday, you will be 43.
     %

  25. Input from file

     % cat infile
     Joe Smith
     42
     % readname.pl < infile
     Enter your name: Enter your age: Hello, Joe Smith! On your next birthday, you will be 43.
     %

  26. Input from file (prompts commented out)

     #!/usr/bin/perl
     use strict;
     use warnings;
     # File: readname.pl
     # Read in name and age
     # print "Enter your name: ";
     my $name = <>;    # "<>" reads one line from "standard input"
     chomp $name;      # chomp deletes any newlines from end of string
     # print "Enter your age: ";
     my $age = <>;
     chomp $age;
     print "Hello, $name! ";
     print "On your next birthday, you will be ", $age+1, ".\n";
     exit;

     % readname.pl < infile
     Hello, Joe Smith! On your next birthday, you will be 43.
     %

  27. Redirecting Standard Input and Output

     INPUT REDIRECTION: "<" makes the program read standard input from a file:

     % readname.pl < infile
     Hello, Joe Smith! On your next birthday, you will be 43.

     OUTPUT REDIRECTION: ">" sends standard output to a file:

     % readname.pl < infile > outfile
     % cat outfile
     Hello, Joe Smith! On your next birthday, you will be 43.

  28. open Function

     • The "open" function takes two arguments:
         • a filehandle, and
         • a single string comprising both what to open and how to open it.
     • "open" returns true when it works; when it fails, it returns a false value and sets the special variable $! to reflect the system error.
     • If the filehandle was previously opened, it will be implicitly closed first.

     open INFO,    "< datafile" or die "can't open datafile: $!";
     open RESULTS, "> runstats" or die "can't open runstats: $!";
     open LOG,     ">> logfile " or die "can't open logfile: $!";

     Note: whitespace before or after the file name is ignored.
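
     The slides use the two-argument form of open with bareword filehandles. Perl also accepts a three-argument form (filehandle, mode, file name) with a lexical filehandle, which keeps the mode separate from the name. The sketch below is an alternative illustration rather than the form used in the lecture; the subroutine name read_lines is hypothetical, and the temporary file is created with the core File::Temp module only so that the example is self-contained.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: read every line of a file into a list,
# using three-argument open and a lexical filehandle.
sub read_lines {
    my ($path) = @_;
    open(my $in, '<', $path) or die "can't open $path: $!";
    my @lines = <$in>;     # list context: reads the whole file
    close $in;
    chomp @lines;          # remove the trailing newline from each line
    return @lines;
}

# Create a small temporary file to read back (contents are made up).
my ($out, $tmpname) = tempfile(UNLINK => 1);
print $out "ATGC\nGGCC\n";
close $out;

my @lines = read_lines($tmpname);
print scalar(@lines), " lines read\n";   # prints "2 lines read"
```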

  29. Example 5-3: Searching for motifs

     #!/usr/bin/perl
     # Example 5-3 Searching for motifs

     # Ask the user for the filename of the file containing
     # the protein sequence data, and collect it from the keyboard
     print "Please type the filename of the protein sequence data: ";

     # STDIN is "standard input" file handle
     # STDIN is automatically opened
     $proteinfilename = <STDIN>;    # get the next line from keyboard

     # Remove the newline from the protein filename
     chomp $proteinfilename;

     # open the file, or exit
     unless ( open(PROTEINFILE, $proteinfilename) ) {
         print "Cannot open file \"$proteinfilename\"\n\n";
         exit;
     }

     # Read the protein sequence data from the file, and store it
     # into the array variable @protein
     @protein = <PROTEINFILE>;    # note: reads in the whole file!

     # Close the file - we've read all the data into @protein now.
     close PROTEINFILE;

  30. Example 5-3: Searching for motifs (continued)

     # Put the protein sequence data into a single string, as it's easier
     # to search for a motif in a string than in an array of
     # lines (what if the motif occurs over a line break?)
     $protein = join( '', @protein);

     # Remove whitespace
     $protein =~ s/\s//g;

     # In a loop, ask the user for a motif, search for the motif,
     # and report if it was found.
     # Exit if no motif is entered.
     do {
         print "Enter a motif to search for: ";
         $motif = <STDIN>;

         # Remove the newline at the end of $motif
         chomp $motif;

         # Look for the motif
         if ( $protein =~ /$motif/ ) {
             print "I found it!\n\n";
         }
         else {
             print "I couldn't find it.\n\n";
         }
     # exit on an empty user input
     } until ( $motif =~ /^\s*$/ );

     # exit the program
     exit;

  31. Example 5-4: Determining frequency of nucleotides

     #!/usr/bin/perl
     # Example 5-4 Determining frequency of nucleotides

     # Get the name of the file with the DNA sequence data
     print "Please type the filename of the DNA sequence data: ";
     $dna_filename = <STDIN>;

     # Remove the newline from the DNA filename
     chomp $dna_filename;

     # open the file, or exit
     unless ( open(DNAFILE, $dna_filename) ) {
         print "Cannot open file $dna_filename\n\n";
         exit;
     }

     # Read the DNA sequence data from the file, and store it
     # into the array variable @DNA
     @DNA = <DNAFILE>;

     # Close the file
     close DNAFILE;

  32. Example 5-4: Determining frequency of nucleotides (continued)

     # From the lines of the DNA file,
     # put the DNA sequence data into a single string.
     $DNA = join( '', @DNA);

     # Remove whitespace
     $DNA =~ s/\s//g;

     # Now explode the DNA into an array where each letter of the
     # original string is now an element in the array.
     # This will make it easy to look at each position.
     # Notice that we're reusing the variable @DNA for this purpose.
     @DNA = split( '', $DNA );

     # Initialize the counts.
     $count_of_A = 0;
     $count_of_C = 0;
     $count_of_G = 0;
     $count_of_T = 0;
     $errors = 0;

     # In a loop, look at each base in turn, determine which of the
     # four types of nucleotides it is, and increment the appropriate count.
     foreach $base (@DNA) {
         if ( $base eq 'A' ) {
             $count_of_A++;
         } elsif ( $base eq 'C' ) {
             $count_of_C++;
         } elsif ( $base eq 'G' ) {
             $count_of_G++;
         } elsif ( $base eq 'T' ) {
             $count_of_T++;
         } else {
             print "!!!!!!!! Error - I don't recognize this base: $base\n";
             ++$errors;
         }
     }

     # print the results
     print "A = $count_of_A\n";
     print "C = $count_of_C\n";
     print "G = $count_of_G\n";
     print "T = $count_of_T\n";
     print "errors = $errors\n";

     # exit the program
     exit;

  33. Example 5-6: Determining frequency of nucleotides, take 2

     #!/usr/bin/perl -w
     # Example 5-6 Determining frequency of nucleotides, take 2

     # Get the DNA sequence data
     print "Please type the filename of the DNA sequence data: ";
     $dna_filename = <STDIN>;
     chomp $dna_filename;

     # Does the file exist?
     unless ( -e $dna_filename) {
         print "File \"$dna_filename\" doesn\'t seem to exist!!\n";
         exit;
     }

     # Can we open the file?
     unless ( open(DNAFILE, $dna_filename) ) {
         print "Cannot open file \"$dna_filename\"\n\n";
         exit;
     }
     @DNA = <DNAFILE>;
     close DNAFILE;
     $DNA = join( '', @DNA);

     # Remove whitespace
     $DNA =~ s/\s//g;

     # Initialize the counts.
     $count_of_A = 0;
     $count_of_C = 0;
     $count_of_G = 0;
     $count_of_T = 0;
     $errors = 0;

     # In a loop, look at each base in turn,
     # and increment the appropriate count.
     for ( $position = 0 ; $position < length($DNA); $position++ ) {
         $base = substr($DNA, $position, 1);
         if ( $base eq 'A' ) {
             ++$count_of_A;
         } elsif ( $base eq 'C' ) {
             ++$count_of_C;
         } elsif ( $base eq 'G' ) {
             ++$count_of_G;
         } elsif ( $base eq 'T' ) {
             ++$count_of_T;
         } else {
             print "!!!!!!!! Error - I don't recognize this base: $base\n";
             ++$errors;
         }
     }

     # print the results
     print "A = $count_of_A\n";
     print "C = $count_of_C\n";
     print "G = $count_of_G\n";
     print "T = $count_of_T\n";
     print "errors = $errors\n";

     # exit the program
     exit;

     What is the difference between Examples 5-4 and 5-6? Why might this difference be important?

  34. Determining frequency of nucleotides, take 3

     #!/usr/bin/perl -w
     # Determining frequency of nucleotides, take 3

     # Get the DNA sequence data
     print "Please type the filename of the DNA sequence data: ";
     $dna_filename = <STDIN>;
     chomp $dna_filename;

     # Does the file exist?
     unless ( -e $dna_filename) {
         print "File \"$dna_filename\" doesn\'t seem to exist!!\n";
         exit;
     }

     # Can we open the file?
     unless ( open(DNAFILE, $dna_filename) ) {
         print "Cannot open file \"$dna_filename\"\n\n";
         exit;
     }
     @DNA = <DNAFILE>;
     close DNAFILE;
     $DNA = join( '', @DNA);

     # Remove whitespace
     $DNA =~ s/\s//g;

     # Initialize the counts.
     $a = 0; $c = 0; $g = 0; $t = 0; $e = 0;

     # Use a regular expression "trick", and five while loops,
     # to find the counts of the four bases plus errors
     while($DNA =~ /a/ig){$a++}
     while($DNA =~ /c/ig){$c++}
     while($DNA =~ /g/ig){$g++}
     while($DNA =~ /t/ig){$t++}
     while($DNA =~ /[^acgt]/ig){$e++}
     print "A=$a C=$c G=$g T=$t errors=$e\n";

     # Also write the results to a file called "countbase"
     $outputfile = "countbase";
     unless (open(COUNTBASE,">$outputfile")){
         print "Cannot open file \"$outputfile\" to write to!!\n\n";
         exit;
     }
     print COUNTBASE "A=$a C=$c G=$g T=$t errors=$e\n";
     close(COUNTBASE);

     # exit the program
     exit;

     What do the i and g modifiers do in the while statements above?
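
     For reference when thinking about the question above: the i modifier makes the match case-insensitive, and the g modifier, when the match is used in scalar context (as a while condition is), makes each evaluation resume where the previous match ended, so the loop body runs once per occurrence. The sketch below shows that loop next to an equivalent list-context idiom. The subroutine names are hypothetical, and \Q...\E is added only so the base is treated as literal text.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: count occurrences with a while loop, as on the slide.
# Each successful /g match in scalar context advances past the previous one.
sub count_with_while {
    my ($dna, $base) = @_;
    my $n = 0;
    $n++ while $dna =~ /\Q$base\E/gi;
    return $n;
}

# Hypothetical helper: the same count using list context.
# In list context /g returns all matches at once; assigning that list
# to () and then to a scalar yields the number of matches.
sub count_with_list {
    my ($dna, $base) = @_;
    my $n = () = $dna =~ /\Q$base\E/gi;
    return $n;
}

my $dna = "ATGCAttT";
print "t (while): ", count_with_while($dna, "t"), "\n";   # prints 4
print "t (list):  ", count_with_list($dna, "t"), "\n";    # prints 4
```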

  35. Determining frequency of nucleotides, take 4

     #!/usr/bin/perl -w
     # Determining frequency of nucleotides, take 4

     # Get the DNA sequence data
     if (scalar @ARGV < 1) {
         print "Usage: countbase DNAfile\n";
         exit;
     }
     $dna_filename = $ARGV[0];

     # Can we open the file?
     open DNAFILE, $dna_filename
         or die "Can't open file $dna_filename";
     @DNA = <DNAFILE>;
     close DNAFILE;
     $DNA = join( '', @DNA);

     # Remove whitespace
     $DNA =~ s/\s//g;

     # Initialize the counts.
     $a = 0; $c = 0; $g = 0; $t = 0; $o = 0;
     $a = ($DNA =~ tr/Aa//);
     $c = ($DNA =~ tr/Cc//);
     $g = ($DNA =~ tr/Gg//);
     $t = ($DNA =~ tr/Tt//);
     $o = (length $DNA) - ($a+$c+$g+$t);
     print "A=$a C=$c G=$g T=$t others=$o\n";

     # Also write the results to a file called "countbase"
     $outputfile = "countbase";
     open (COUNTBASE, ">$outputfile")
         or die "Can't open output file $outputfile";
     print COUNTBASE "A=$a C=$c G=$g T=$t others=$o\n";
     close(COUNTBASE);

     # exit the program
     exit;

     What is going on here?
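
     A hint for the question above: tr returns the number of characters it matched, and when the replacement list is empty (and no d modifier is given) the characters are replaced by themselves, so the string is left unchanged. The sketch below isolates that behavior; the subroutine name count_bases and the test strand are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: count A, C, G, T (either case) and everything else.
# tr/Aa// leaves $dna unchanged and returns how many characters matched.
sub count_bases {
    my ($dna) = @_;
    my $a = ($dna =~ tr/Aa//);
    my $c = ($dna =~ tr/Cc//);
    my $g = ($dna =~ tr/Gg//);
    my $t = ($dna =~ tr/Tt//);
    my $others = length($dna) - ($a + $c + $g + $t);
    return ($a, $c, $g, $t, $others);
}

my ($a, $c, $g, $t, $o) = count_bases("ATGCAttTNN");
print "A=$a C=$c G=$g T=$t others=$o\n";   # prints A=2 C=1 G=1 T=4 others=2
```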

  36. Reading a FASTA file (fasta.pl, part 1)

     #!/usr/bin/perl
     use strict;
     use warnings;
     # File: fasta.pl
     # Author: Jeff Solka
     # Date: 01 Aug 2012
     #
     # Purpose: Read sequences from a FASTA format file

     # the argument list should contain the file name
     die "usage: fasta.pl filename\n" if scalar @ARGV < 1;

     # get the filename from the argument list
     my ($filename) = @ARGV;

     # Open the file given as the first argument on the command line
     open(INFILE, $filename) or die "Can't open $filename\n";

     # variable declarations:
     my @header = ();      # array of headers
     my @sequence = ();    # array of sequences
     my $count = 0;        # number of sequences

     What do the array declarations do?

  37. Reading a FASTA file (fasta.pl, part 2)

     # read FASTA file
     my $n = -1;    # index of current sequence
     while (my $line = <INFILE>) {
         chomp $line;                  # remove trailing \n from line
         if ($line =~ /^>/) {          # line starts with a ">"
             $n++;                     # this starts a new header
             $header[$n] = $line;      # save header line
             $sequence[$n] = "";       # start a new (empty) sequence
         }
         else {
             next if not @header;      # ignore data before first header
             $sequence[$n] .= $line;   # append to end of current sequence
         }
     }
     $count = $n+1;    # set count to the number of sequences
     close INFILE;

     # remove white space from all sequences
     for (my $i = 0; $i < $count; $i++) {
         $sequence[$i] =~ s/\s//g;
     }

     ########## Sequence processing starts here:
     ##### REST OF PROGRAM
     exit;

  38. Reading a FASTA file (fasta.pl, complete program)

     #!/usr/bin/perl
     use strict;
     use warnings;
     # File: fasta.pl
     # Author: Jeff Solka
     # Date: 01 Aug 2012
     #
     # Purpose: Read sequences from a FASTA file

     # the argument list contains the file name
     die "usage: fasta.pl filename\n" if scalar @ARGV < 1;

     # get the filename from the argument list
     my ($filename) = @ARGV;

     # Open the file given on the command line
     open(INFILE, $filename) or die "Can't open $filename\n";

     # variable declarations:
     my @header = ();      # array of headers
     my @sequence = ();    # array of sequences
     my $count = 0;        # number of sequences

     # read FASTA file
     my $n = -1;    # index of current sequence
     while (my $line = <INFILE>) {
         chomp $line;
         if ($line =~ /^>/) {
             $n++;                     # this starts a new header
             $header[$n] = $line;      # save header line
             $sequence[$n] = "";       # start new sequence
         }
         else {
             next if not @header;
             $sequence[$n] .= $line;   # add to seq
         }
     }
     $count = $n+1;    # number of sequences
     close INFILE;

     # remove white space from all sequences
     for (my $i = 0; $i < $count; $i++) {
         $sequence[$i] =~ s/\s//g;
     }

     ########## Sequence processing starts here:
     # process the sequences
     for (my $i = 0; $i < $count; $i++) {
         print "$header[$i]\n";
         print "$sequence[$i]\n";
     }
     exit;
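
     To experiment with the FASTA-reading loop without creating a file, the same logic can be placed in a subroutine and fed from an in-memory filehandle. The sketch below is a variant written for illustration, not the author's fasta.pl: the subroutine name read_fasta, the two-array return convention (array references, which go slightly beyond this lecture), and the sample records are all assumptions.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: read FASTA records from an open filehandle.
# Returns references to two parallel arrays: headers and sequences.
sub read_fasta {
    my ($fh) = @_;
    my @header   = ();
    my @sequence = ();
    my $n = -1;                           # index of current sequence
    while (my $line = <$fh>) {
        chomp $line;
        if ($line =~ /^>/) {              # header line starts a new record
            $n++;
            $header[$n]   = $line;
            $sequence[$n] = "";
        }
        else {
            next if not @header;          # ignore data before first header
            $line =~ s/\s//g;             # drop any white space
            $sequence[$n] .= $line;
        }
    }
    return (\@header, \@sequence);
}

# Made-up FASTA text, opened as an in-memory file.
my $text = ">seq1 demo\nATGC\nGG CC\n>seq2\nTTAA\n";
open(my $fh, '<', \$text) or die "can't open in-memory file: $!";
my ($headers, $sequences) = read_fasta($fh);
close $fh;

for (my $i = 0; $i < scalar @$headers; $i++) {
    print "$headers->[$i] length=", length($sequences->[$i]), "\n";
}
```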

  39. printf - print with formatting

     When using printf, you can specify a format for each variable printed.
     "%3d" means print a decimal number using at least 3 characters (padding with blanks on the left if necessary).

     $p = 9; $years = 2;
     printf "I have written %3d programs in %2d years.\n", $p, $years;
     $p = 125; $years = 10;
     printf "I have written %3d programs in %2d years.\n", $p, $years;

     Output:
     I have written   9 programs in  2 years.
     I have written 125 programs in 10 years.

  40. printf - print with formatting

     When using printf, you can specify a format for each variable printed.
     "%6.2f" means print a floating point number using at least 6 characters, with 2 places after the decimal point.

     $p = 10 / 7;
     printf "I have written %f programs.\n", $p;
     printf "I have written %6.2f programs.\n", $p;
     printf "I have written %0.2f programs.\n", $p;

     Output:
     I have written 1.428571 programs.
     I have written   1.43 programs.
     I have written 1.43 programs.
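
     The same format specifiers work with sprintf, which returns the formatted string instead of printing it. The sketch below uses that to line up a small table of base counts and percentages. The subroutine name format_row and the counts are invented for the example; "%-2s" left-justifies a string in 2 characters and "%%" prints a literal percent sign.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: format one table row as
# base name, count (width 3), and percentage (width 6, 2 decimals).
sub format_row {
    my ($base, $count, $total) = @_;
    return sprintf("%-2s %3d %6.2f%%", $base, $count, 100 * $count / $total);
}

# Made-up counts for illustration.
my @bases  = ("A", "C", "G", "T");
my @counts = (3, 1, 1, 3);
my $total  = 8;

for (my $i = 0; $i < scalar @bases; $i++) {
    print format_row($bases[$i], $counts[$i], $total), "\n";
}
```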

  41. Programming Assignment #1

     • See Programming Assignment #1 on Course Webpage
     • Read and understand assignment
     • See sample input and output files
     • Due September 23, 2013 at 7:00 pm
     • HW #2:
         • Read Chapter 6
         • Read Appendix B, pages 315-334, 340-343
         • Submit exercises 6.1 and 6.2
