Perl Programming: Developing Key Tools for Bioinformatics

An Informative Look Behind the Importance of Programming Skills and a Brief Tutorial on Getting Started With Perl and BioPerl. Andrew C. Rieser, 4-1-04.


Presentation Transcript


  1. Perl Programming: Developing Key Tools for Bioinformatics. An Informative Look Behind the Importance of Programming Skills and a Brief Tutorial on Getting Started With Perl and BioPerl. Andrew C. Rieser, 4-1-04

  2. Why Computer Skills Are Important: “Computers are powerful devices for understanding any system that can be described in a mathematical way” (Gibas, 2001). • At one time, computer skills weren’t important for biologists • Mass quantities of resources are now on the Internet • Numerous tools exist to manipulate and discover genomic information • Programming skills can be very important

  3. Problem? • The rapid growth of GenBank and other online databases • The different formats in which these sequences are stored

  4. Who Recognizes This Problem? • Cynthia Gibas: an assistant professor of biology at Virginia Tech • James Tisdall: a consultant for Biocomputing Associates of Kimberton, PA, and one of the first people to use Perl in bioinformatics. He is also the developer of DNA WorkBench, a parallel-processing bioinformatics Perl program used worldwide.

  5. WHY PROGRAM? WHY NOT USE THE READILY AVAILABLE TOOLS? http://biowb.sdsc.edu/register.cgi

  6. The majority of biological researchers are not programmers. • Biologists often feel that basic computing skills are all they need for their everyday research tasks. • In many cases, Tisdall observes, “you can accomplish quite a bit using existing tools.” • “What happens when you want to do something a preexisting tool doesn’t do? What happens when you can’t find a tool to accomplish a particular task, and you can’t find someone to write it for you?” (Tisdall, 2001).

  7. BENEFITS OF PROGRAMMING • Skill is in strong demand • Makes you more marketable as a biologist • Saves time • Easy skill to pick up • “The only chance biologists have of keeping up with the job of analyzing data is by developing libraries of reusable software tools.” (Cynthia Gibas, 2001)

  8. PERL & BIOPERL • Common languages relevant to bioinformatics: C/C++, Python, and Perl • WHY PERL? Perl has become a very popular bioinformatics programming language because of its suitability for rapid prototyping, the ability to quickly write a working program. • Perl is also excellent for string manipulation, and bioinformatics deals mainly with large strings of genetic sequences and base pairs.
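
To make the string-manipulation point concrete, here is a minimal sketch in plain Perl. The sequence and the subroutine names count_bases and transcribe are made up for illustration and do not come from the presentation or from BioPerl.

```perl
use strict;
use warnings;

# Hypothetical example sequence, not taken from any real database entry.
my $dna = 'ATGCGTACGTTAGC';

# Count how often each base occurs, using a hash keyed by the base letter.
sub count_bases {
    my ($seq) = @_;
    my %count = (A => 0, C => 0, G => 0, T => 0);
    # Split the sequence into single characters and tally the known bases.
    $count{$_}++ for grep { exists $count{$_} } split //, uc $seq;
    return %count;
}

# Transcribe DNA to RNA by replacing every T with U.
sub transcribe {
    my ($seq) = @_;
    (my $rna = uc $seq) =~ tr/T/U/;
    return $rna;
}

my %count = count_bases($dna);
print "$_: $count{$_}\n" for sort keys %count;
print "RNA: ", transcribe($dna), "\n";
```

Both operations take a line or two each, which is the kind of quick prototyping the slide describes.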

  9. How to Use Perl and BioPerl • Perl is mainly used on Unix/Linux-based operating systems; however, both Perl and BioPerl can be installed on any Windows-based operating system. • I will show you how to install Perl and BioPerl on Win2k and XP systems, because that is what I installed them on, and I found that there weren’t any good installation directions for Windows-based systems.

  10. First Download Perl • Visit http://www.activestate.com/Products/ActivePerl/?_x=1 • Download ActivePerl from ActiveState • Run the setup and follow the directions • Verify that Perl was installed

  11. This will set up Perl on your computer. • It should create a folder on your hard drive (typically C:\Perl, unless you change it). This folder contains all the modules and libraries needed to run Perl on your system. • Running Perl on Windows operating systems requires the MS-DOS prompt, so get used to the command line, because this is what you will use to run all your Perl scripts.

  12. TEST SCRIPT • All of your Perl scripts can easily be written in a plain text editor such as Notepad, then saved with the .pl extension. ** Make sure that “Save As Type” is set to “All Files”.

  13. Let’s develop a simple “Hello World” test program… • In Notepad, simply type: print "Hello World"; • Then save it as test.pl • Now let’s test whether the script works: 1.) Open the MS-DOS prompt, type cd\ and hit ENTER • 2.) Type perl C:\windows\desktop\test.pl (or wherever you saved your test.pl) and hit ENTER • 3.) “Hello World” should print on the screen! It’s as easy as that! • Now it will get a little more complicated… Next we will install BioPerl. This is what caused me the most trouble! However, once I figured it out, it was fairly simple.

  14. BioPerl • Newest release: http://www.bioperl.org/Core/Latest/ • Download BioPerl and unpack it using WinZip (or another extraction tool) into the Perl lib directory (typically C:\Perl\lib) • Download Nmake (http://download.microsoft.com/download/vc15/Patch/1.52/W95/EN-US/Nmake15.exe) and save it in the Perl lib directory • Install BioPerl and its modules • For more info: http://www.bioperl.org/Core/windows-bioperl.html

  15. Detailed Instructions • 1. Open the MS-DOS prompt, type cd C:\Perl\Lib and hit ENTER • 2. Type perl Makefile.PL (you will have to specify the directory) and hit ENTER • 3. Type nmake and hit ENTER • 4. Type nmake test and hit ENTER • 5. Finally, type nmake install and hit ENTER • It’s that easy… To install other modules not included in the BioPerl package, follow the directions below using PPM.

  16. To use PPM • Go to the DOS command line and type "PPM" (without the quotes). You will be at the PPM command prompt. (You need to be connected to the Internet at this point.) • At the PPM prompt, enter "install YOUR::MODNAME". You will be asked whether you want to continue. PPM will then connect to ActiveState and check whether the module you requested is available in a pre-compiled form. If it is, PPM will install the module and you are done! PPM is especially nice because it will also install other required modules for you. • If you get an error message that the module was not found, then it is not available from ActiveState and you will have to find it elsewhere. • NOW BIOPERL IS SUCCESSFULLY INSTALLED!!!
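
One way to check whether the installation worked is to ask Perl whether it can load the modules. The sketch below is only an illustration: module_available is a hypothetical helper name, and the two modules it checks, Bio::Seq and Bio::SeqIO, are picked as representative BioPerl modules rather than taken from the presentation.

```perl
use strict;
use warnings;

# Return 1 if the named module can be loaded from Perl's library path, else 0.
sub module_available {
    my ($module) = @_;
    # Turn Bio::Seq into Bio/Seq.pm, the path form that require expects.
    (my $file = "$module.pm") =~ s{::}{/}g;
    return eval { require $file; 1 } ? 1 : 0;
}

# Report on a couple of BioPerl modules (chosen here only as examples).
for my $module (qw(Bio::Seq Bio::SeqIO)) {
    printf "%-12s %s\n", $module,
        module_available($module) ? 'installed' : 'NOT found';
}
```

Save it with a .pl extension and run it from the command line like the test script earlier. A "NOT found" line suggests the module is not on Perl's library path.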

  17. Perl (BioPerl) Examples: • Easy to learn • Rapid Prototyping FASTA Format:

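  18. Fasta.pl: Only 9 Lines of Code (the file location is hardcoded; the get_data, extract_data, and print_sequence subroutines are not shown on the slide)
        use strict;
        use warnings;
        # Storage for the raw file lines and the extracted sequence
        my @file_data = ();
        my $dna = '';
        # Hardcoded location of the FASTA file
        @file_data = get_data("C:\\Perl\\bin\\BioInfo\\sample.dna.txt");
        $dna = extract_data(@file_data);
        # Print the sequence in lines of 25 characters
        print_sequence($dna, 25);
        exit;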
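
The slide does not show the three subroutines that Fasta.pl calls. The sketch below reuses the slide's names (get_data, extract_data, print_sequence), but the bodies are assumptions about what such helpers might do, not the author's code. So that it runs anywhere, the demo writes a made-up FASTA record to a temporary file instead of reading the hardcoded Windows path.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Read a file and return its lines as a list.
sub get_data {
    my ($filename) = @_;
    open my $fh, '<', $filename or die "Cannot open $filename: $!";
    my @lines = <$fh>;
    close $fh;
    return @lines;
}

# Join the sequence lines, skipping the FASTA header (>), comments (#),
# and blank lines, then strip all whitespace.
sub extract_data {
    my (@lines) = @_;
    my $sequence = '';
    for my $line (@lines) {
        next if $line =~ /^\s*$/ or $line =~ /^\s*#/ or $line =~ /^>/;
        $sequence .= $line;
    }
    $sequence =~ s/\s//g;
    return $sequence;
}

# Print the sequence in lines of $length characters.
sub print_sequence {
    my ($sequence, $length) = @_;
    for (my $pos = 0; $pos < length($sequence); $pos += $length) {
        print substr($sequence, $pos, $length), "\n";
    }
}

# Demo input: a made-up FASTA record written to a temporary file.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print {$tmp_fh} ">sample hypothetical sequence\n",
                "ATGCGTACGTTAGCATGCGTACGTTAGCATGC\n",
                "GGCCTTAAGGCC\n";
close $tmp_fh;

my @file_data = get_data($tmp_name);
my $dna = extract_data(@file_data);
print_sequence($dna, 25);
```

Keeping file reading, sequence extraction, and printing in separate subroutines is what lets the main script stay as short as the one on the slide.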

  19. Run Perl Script: Revcomp.pl (http://jje.uchicago.edu/revcomp.pl) • Produces the reverse complementary strand of the FASTA sequence we just made!
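
The contents of Revcomp.pl are not shown in the transcript. As a rough illustration of the idea, a reverse complement can be computed in plain Perl as sketched below; the subroutine name revcomp and the test sequence are assumptions, and this is not the script at the URL above.

```perl
use strict;
use warnings;

# Return the reverse complement of a DNA string.
sub revcomp {
    my ($dna) = @_;
    # Assigning to a scalar makes reverse flip the string, not a list.
    my $rc = reverse $dna;
    # Swap each base for its complement, preserving upper/lower case.
    $rc =~ tr/ACGTacgt/TGCAtgca/;
    return $rc;
}

print revcomp('ATGCGTAC'), "\n";   # prints GTACGCAT
```

BioPerl's sequence objects offer the same operation through a revcom method, so once BioPerl is installed the hand-written version is mostly useful as practice.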

  20. Read and Practice • To develop computer programming skills and the ability to write your own scripts, you must first learn how to program. To do this, I recommend reading …

  21. References: • Ezzell, C. (2000). Hooking up Biologists. Scientific American 283 (5), 22. • Gibas, C., and Jambeck, P. (2001). Developing Bioinformatics Computer Skills. Sebastopol: O’Reilly. • Roos, D. (2001). Bioinformatics: Trying to Swim in a Sea of Data. Science 291 (5507), 1260–1261. • Stewart, B. (2001, December 7). An Interview with Lincoln Stein. Retrieved April 14, 2002 from the World Wide Web: http://www.oreillynet.com/pub/a/network/2001/12/07/stein.html • Tisdall, J. (2001, October 15). Why Biologists Want to Program Computers. Retrieved April 12, 2002 from the World Wide Web: http://www.oreilly.com/news/perlbio_1001.html • Tisdall, J. (2001). Beginning Perl for Bioinformatics. Sebastopol: O’Reilly.