Advanced Perl For Bioinformatics Part 1 2/23/06 1-4pm Module structure Module path Module export Object oriented programming Part 2 2/24/06 1-4pm Bioperl modules Sequence access Sequence manipulation Parsing BLAST records

Advanced Perl For Bioinformatics

Part 1 2/23/06 1-4pm

• Module structure
• Module path
• Module export
• Object oriented programming

Part 2 2/24/06 1-4pm

• Bioperl modules
• Sequence access
• Sequence manipulation
• Parsing BLAST records

Module and main program

Hello1.pm

package Hello1;

sub greet {
    return "Hello, World!";
}

1;

test1.pl

#!/usr/bin/perl

use Hello1;

print Hello1::greet();

Why use a module?

• Reusable by different programs.

• Keeps your code well organized.

Module structure

package Hello1;                  # declare a package; the file must be saved as Hello1.pm

# Contents of the package: functions and variables.
sub greet {
    return "Hello, World!\n";
}

1;                               # return a true value at the end

Path to module

• Default path to look for modules: @INC. To print it:

perl -e 'print join("\n", @INC), "\n"'

• If your module is placed under one of the paths in @INC, you refer to it by its path relative to that directory. E.g. if @INC contains /usr/my/lib, and

(1) your Mod.pm is /usr/my/lib/Mod.pm, you load it with "use Mod;" (no .pm extension).

(2) your Mod.pm is /usr/my/lib/Mymod/Seq/Mod.pm, you load it with "use Mymod::Seq::Mod;".

• If your module is not placed under any directory in @INC, e.g. /some/dir/Mod.pm, then:

use lib "/some/dir";             # adds the path to the beginning of @INC

use Mod;
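The lookup rule can be tried out in a single script. The sketch below is only an illustration: it writes a throwaway Mymod/Seq/Mod.pm into a temporary directory (standing in for /some/dir), and it uses "unshift @INC" plus "require", the run-time counterparts of "use lib" and "use", so that everything fits in one file. The where() function is made up for this demonstration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Path qw(make_path);

# Build a throwaway library directory (stands in for /some/dir).
my $libdir = tempdir(CLEANUP => 1);
make_path("$libdir/Mymod/Seq");

# Write a tiny module to $libdir/Mymod/Seq/Mod.pm.
open(my $fh, '>', "$libdir/Mymod/Seq/Mod.pm") or die "Cannot write module: $!";
print $fh "package Mymod::Seq::Mod;\n";
print $fh "sub where { return __FILE__; }\n";   # hypothetical helper
print $fh "1;\n";
close($fh);

# Run-time counterpart of 'use lib': put the directory first in @INC.
unshift @INC, $libdir;

# 'Mymod::Seq::Mod' is searched for as Mymod/Seq/Mod.pm under each @INC entry.
require Mymod::Seq::Mod;

print "Loaded from: ", Mymod::Seq::Mod::where(), "\n";
print "Recorded in %INC as: $INC{'Mymod/Seq/Mod.pm'}\n";
```

In a normal program the module already exists on disk, so the two-line "use lib" / "use Mod" form is all that is needed.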

Variable scope in module

• my $var --- accessible only inside the module
• our $var --- accessible from outside, as $Module::var
• $var --- without "use strict", same as "our $var"
• use strict; --- forces every variable to be declared with 'my' or 'our' (or written with its full package name).

Hello2.pm

package Hello2;
use strict;
our $var1 = 1;
my $var2 = 3;
my $str = "Hello World!\n";
sub greet {
    return $str;
}
1;

test2.pl

#!/usr/bin/perl
use Hello2;
print "var1= $Hello2::var1\n";   # prints 1
print "var2= $Hello2::var2\n";   # prints an empty value: $var2 is a 'my' variable, not visible here
print Hello2::greet();           # works: greet() itself can still see $str
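The same behavior can be checked in one file. This is a minimal sketch with a made-up package name (Scope::Demo) and a made-up accessor (hidden_value); it is not part of the class material.

```perl
#!/usr/bin/perl
use strict;
use warnings;
no warnings 'once';   # $Scope::Demo::hidden is mentioned only once, on purpose

{
    package Scope::Demo;            # hypothetical package, for illustration only

    our $visible = 1;               # package variable: stored in the symbol table
    my  $hidden  = 3;               # lexical variable: exists only inside this block

    # A sub defined in the same scope can still read the lexical.
    sub hidden_value { return $hidden; }
}

# Back in package main.
print "visible = $Scope::Demo::visible\n";
print "hidden through the package name is ",
      (defined $Scope::Demo::hidden ? "defined" : "undefined"), "\n";
print "hidden through the accessor = ", Scope::Demo::hidden_value(), "\n";
```

Keeping data in 'my' variables and exposing it through functions is a common way to stop other programs from changing a module's internal state by accident.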

Export

Export functions and variables, so that they can be accessed without a package qualifier.

Hello3.pm

package Hello3;
use strict;
require Exporter;
our @ISA = ("Exporter");
our @EXPORT_OK = qw(greet);
our $var1 = 1;
my $var2 = 3;
my $str = "Hello World!\n";
sub greet {
    return $str;
}
1;

test3.pl

#!/usr/bin/perl
use Hello3 qw(greet);
print "var1= $Hello3::var1\n";
print "var2= $Hello3::var2\n";
print greet();

Hello3.pm

package Hello3;
use strict;
use Exporter;                    # need functionality in Exporter.pm to do exporting
our @ISA = ("Exporter");         # this module inherits functions from the Exporter module, rather than creating its own
our @EXPORT_OK = qw(greet);      # export this subroutine upon request by another program
our $var1 = 1;
my $var2 = 3;
my $str = "Hello World!\n";
sub greet {
    return $str;
}

1;

test3.pl

#!/usr/bin/perl
use Hello3 qw(greet);            # request "greet"
print "var1= $Hello3::var1\n";
print "var2= $Hello3::var2\n";
print greet();

Hello4.pm

package Hello4;
use strict;
use Exporter;
our @ISA = ("Exporter");
our @EXPORT_OK = qw(greet);
our @EXPORT = qw(greet2);        # export this automatically
our $var1 = 1;
my $var2 = 3;
my $str = "Hello World!";
sub greet {
    return $str;
}

sub greet2 {
    return "Hi.\n";
}
1;

test4.pl

#!/usr/bin/perl
use Hello4 qw(greet);            # request "greet"
use Hello4;                      # automatically imports whatever is in @EXPORT
print "var1= $Hello4::var1\n";
print "var2= $Hello4::var2\n";

print greet();
print greet2();
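The difference between @EXPORT and @EXPORT_OK can also be seen in a single file. The sketch below assumes a made-up module name (Greeter) and a made-up unexported function (secret). Because there is no separate .pm file, it calls import() by hand inside BEGIN blocks, which is roughly what "use" does after loading a module.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The module part. BEGIN makes sure @EXPORT and @EXPORT_OK are already set
# when import() runs below; with a real .pm file, 'use' handles this ordering.
BEGIN {
    package Greeter;                 # hypothetical module name
    require Exporter;
    our @ISA       = ('Exporter');
    our @EXPORT    = qw(greet2);     # exported by default
    our @EXPORT_OK = qw(greet);      # exported only on request
    sub greet  { return "Hello World!\n"; }
    sub greet2 { return "Hi.\n"; }
    sub secret { return "not exported\n"; }
}

# Like 'use Greeter qw(greet);' : request one name explicitly.
BEGIN { Greeter->import(qw(greet)); }
# Like 'use Greeter;' : take whatever is in @EXPORT.
BEGIN { Greeter->import(); }

print greet();
print greet2();
print Greeter::secret();             # still reachable with the package qualifier
print "secret() imported into main: ", (main->can('secret') ? "yes" : "no"), "\n";
```

Listing a function in @EXPORT_OK rather than @EXPORT leaves the choice to the calling program, which avoids name clashes when several modules define a function with the same name.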

Exercise 1

• Create a module with functions that calculate the area and the boundary (perimeter) of a rectangle. The width and length are supplied in your main program and passed into your module. Practice using both @EXPORT and @EXPORT_OK.

Object Oriented Programming

• A package (or module) is a class.

• A reference to a hash, once blessed into the class, becomes an object of this class.

• The object contains member variables, which are stored in the hash.

• The object also has member functions (methods): the subs defined in the package.
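The four points above can be sketched in a few lines. The class name (Gene), its hash keys, and its methods are made up for this illustration; the only fixed ingredients are the package, the hash reference, and bless.

```perl
#!/usr/bin/perl
use strict;
use warnings;

{
    package Gene;                    # hypothetical class, for illustration only

    # Constructor: build a hash, bless it into the class, return the reference.
    sub new {
        my ($class, %args) = @_;
        my $self = {
            name => $args{name},     # member variables live in the hash
            bp   => $args{bp} || 0,
        };
        return bless $self, $class;
    }

    # Member functions receive the object as their first argument.
    sub name { my $self = shift; return $self->{name}; }

    sub describe {
        my $self = shift;
        return "$self->{name} ($self->{bp} bp)";
    }
}

my $gene = Gene->new(name => 'geneX', bp => 1200);

print "Class:  ", ref($gene), "\n";  # the class the hash was blessed into
print "Name:   ", $gene->name, "\n";
print "Object: ", $gene->describe, "\n";
```

Calling $gene->describe is the same as calling Gene::describe($gene): the arrow looks up the sub in the object's class and passes the object as the first argument.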

Hello5.pm

package Hello5;                  # package name must match the file name Hello5.pm
use strict;

sub new {
    my $class = shift;
    my $ref = {};
    bless($ref, $class);
    return $ref;
}

sub greet {
    my ($ref, $str) = @_;
    return $str;
}

sub greet2 {
    return "Hi\n";
}
1;

test5.pl

#!/usr/local/bin/perl
use Hello5;
my $h = Hello5->new;             # same as: new Hello5

print $h->greet("Good morning\n");
print $h->greet2;

Rectangle.pm

package Rectangle;
sub new {
    my ($class, $width, $length) = @_;
    my $hashref = { W => $width, L => $length };
    bless($hashref, $class);
    return $hashref;
}

sub getArea {
    my $self = shift;
    return $self->{W} * $self->{L};
}

sub getBoundary {
    my $self = shift;
    return 2 * ($self->{W} + $self->{L});
}

1;

recttest.pl

#!/usr/bin/perl
use Rectangle;
my $w = 3;
my $l = 4;

my $rect = Rectangle->new($w, $l);
my $area = $rect->getArea();
print "Area = $area\n";

my $boundary = $rect->getBoundary();   # avoid $b, which sort() uses
print "Boundary=$boundary\n";

Exercise 2

• Create a class called “Cube”. It should have methods to calculate volume based on the cube’s width, length and height.

More Practice with Classes

• Sequence.pm: clean, wrap, reverse complement, shuffle, GC content, translate

• Main program: seq.pl
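As a starting point, here is a rough sketch of what such a class could look like. It is not the Sequence.pm used in class: the method names, the "seq" hash key, and the choice to cover only clean, reverse complement, and GC content are assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

{
    package Sequence;    # hypothetical sketch, not the Sequence.pm from class

    sub new {
        my ($class, $seq) = @_;
        my $self = { seq => $seq };
        return bless $self, $class;
    }

    # Clean: strip whitespace and digits, convert to upper case.
    sub clean {
        my $self = shift;
        (my $s = $self->{seq}) =~ s/[\s\d]//g;
        $self->{seq} = uc $s;
        return $self;                # return the object so calls can be chained
    }

    # Reverse complement of a DNA sequence.
    sub revcom {
        my $self = shift;
        my $rc = scalar reverse $self->{seq};
        $rc =~ tr/ACGTacgt/TGCAtgca/;
        return $rc;
    }

    # GC content as a fraction between 0 and 1.
    sub gc_content {
        my $self = shift;
        my $len = length $self->{seq};
        return 0 unless $len;
        my $gc = ($self->{seq} =~ tr/GCgc//);   # tr with an empty replacement only counts
        return $gc / $len;
    }
}

my $seq = Sequence->new("atg gcc 12 tta\n")->clean;
print "Clean:  $seq->{seq}\n";
print "Revcom: ", $seq->revcom, "\n";
printf "GC:     %.2f\n", $seq->gc_content;
```

The remaining methods (wrap, shuffle, translate) follow the same pattern: take $self, read $self->{seq}, and return a new string.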

Bioperl

• A collection of Perl modules for bioinformatics.

• Facilitates sequence retrieval, sequence manipulation, and parsing the results of programs like blast and clustalw.

• http://bioperl.org for download and documentation.

• Each .pm file includes documentation on how to use that module.

• Usually installed under: /usr/local/lib/perl5/site_perl/5.8.0/Bio

Some Bioperl modules

• Bio::Perl, Bio::DB -- access sequence databases. E.g. seqret.pl

• Bio::Seq -- a sequence and its annotation. E.g. seqio.pl

• Bio::SeqIO -- read sequences from a file, and write them to a file. E.g. seqio.pl

• Bio::Tools::SeqStats -- molecular weight, etc. E.g. seqmw.pl

• Bio::SearchIO -- parse blast results.

Accessing Remote Databases

use Bio::Perl;
my $seqobj = get_sequence('swiss', "ROA1_HUMAN");
write_sequence(">roa1.fasta", 'fasta', $seqobj);

Databases can be: swiss, genbank, genpept, refseq, etc.

Bio::Seq

• Contains a sequence and its annotation
• Methods: display_id, desc, seq, revcom, translate, etc.

The revcom and translate methods create new Bio::Seq objects.

One way to create a Bio::Seq object:

use Bio::Seq;
my $seq = Bio::Seq->new(-seq              => 'actgtggcgtcaact',
                        -desc             => 'Sample Bio::Seq object',
                        -display_id       => 'something',
                        -accession_number => 'accnum',
                        -alphabet         => 'dna' );

Another way: read the sequence from a file via a Bio::SeqIO object.

Parsing blast results

• Module: Bio::SearchIO

use Bio::SearchIO;
my $in = Bio::SearchIO->new(-format => 'blast', -file => 'report.bls');
while ( my $result = $in->next_result ) {
    while ( my $hit = $result->next_hit ) {
        while ( my $hsp = $hit->next_hsp ) {
            if ( $hsp->length('total') > 100 ) {
                if ( $hsp->percent_identity >= 75 ) {
                    print "Hit= ", $hit->name,
                          ",Length=", $hsp->length('total'),
                          ",Percent_id=", $hsp->percent_identity, "\n";
                }
            }
        }
    }
}

Example: blastparse.pl