
Page 1VI, October 2006

Practical Extraction and Report Language

« Perl is a language of getting your job done »

Larry Wall

« There is more than one way to do it »

Outline:

Filehandles & File Tests
Subroutines (functions)
References
    passing arguments to functions
    nested data structures
Packages/Modules
Namespace/@INC

Perl Filehandles

A filehandle is the name in a Perl program for an I/O connection between your Perl process and the outside world.

(Good practice: use all uppercase letters in the name of your filehandle.)

Perl special filehandles

Three connections always exist and are already open when your program starts: STDIN, STDOUT, and STDERR. These names are filehandles, the identifiers Perl uses to read from and write to files and other I/O channels.

STDIN reads from standard input, which is usually the keyboard in a normal Perl script (or the input sent by a browser in a CGI script; cgi-lib.pl reads from it automatically).

STDOUT (standard output) and STDERR (standard error) write to the console by default (in a CGI script, STDOUT goes to the browser, while STDERR usually ends up in the web server's error log).

We have been using the STDOUT filehandle without knowing it in every print() statement of the previous Perl presentations: print() uses STDOUT as the default if no other filehandle is specified.

Perl Filehandles

How do you get a value from the keyboard into a Perl program?

The simplest way is to use the line-input operator: <STDIN>

Each time <STDIN> is used where a scalar value is expected, Perl reads the next complete line of text from the keyboard, up to and including the first newline (or up to the input record separator, if you have changed it).

#!/usr/bin/perl
print "Please enter your Lastname: ";
my $lastname = <STDIN>;    # or <> when there are no invocation arguments
chomp $lastname;
print "Please enter your Firstname: ";
my $firstname = <STDIN>;   # or <> when there are no invocation arguments
chomp $firstname;
print "Hello $firstname $lastname,\nI hope you like Perl programming !\n";
exit;

Please enter your Lastname: Ioannidis
Please enter your Firstname: Vassilios
Hello Vassilios Ioannidis,
I hope you like Perl programming !

Perl Filehandles

vioannid$ cat listparticipants.csv
"Barkow","Simon","ETHZ","8057","Mr."
"Basle","Arnaud","University of Basel","4056","Dr. (Mr.)"
"Blevins","Todd","FMI","4058","Mr."
"Bodenhausen","Natacha","University of Lausanne","1015","Mrs."
"Botta","Francesca","University of Fribourg","6601","Mrs."
"Kerschgens","Jan","EPFL","1015","Mr."
"Keusch","Jeremy","FMI","4058","Dr. (Mr.)"
"Kutter","Claudia","FMI","4058","Mrs."
"Livingstone","Magdalena","ETHZ","8057","Mrs."
"Meury","Marcel","University of Basel","4056","Mr."
"Moore","James","University of Basel","4056","Dr. (Mr.)"
"Muller","Joachim","University of Bern","3012","Dr. (Mr.)"
"Mungpakdee","Sutada","other","5008","Mrs."
"Nipitwattanaphon","Mingkwan","University of Lausanne","CH - 1015","Mrs."
"Padavattan","sivaraman","University of Basel","4056","Dr. (Mr.)"
"Paul","Ralf","University of Basel","4056","Dr. (Mr.)"
"Tobler","Kurt","University of Zurich","8057","Dr. (Mr.)"
"Vanoaica","Liviu","EPFL","1066","Mr."
"Vellore Palanivelu","Dinesh","University of Basel","4056","Dr. (Mr.)"
"von Castelmur","Eleonore","University of Basel","4056","Mrs."
"Wassmann","Paul","University of Basel","4056","Mr."
"Yadetie","Fekadu","other","N-5008","Dr. (Mr.)"
vioannid$

The invocation argument: @ARGV

Perl Filehandles

#!/usr/bin/perl
use strict;
use warnings;
open (FILE, "listparticipants.csv") or die "Error. Could not open the file !\n";
while (<FILE>) {
    if (m/^\"(.*)\",\"(.*)\",\"(.*)\",\"(.*)\",\"(.*)\"/) {
        print "Hello $5 $2 $1 from $4, $3 !\n";
    }
    else {}
}
exit;

vioannid$ ./argv.pl
Hello Mr. Simon Barkow from 8057, ETHZ !
Hello Dr. (Mr.) Arnaud Basle from 4056, University of Basel !
Hello Mr. Todd Blevins from 4058, FMI !
. . .
Hello Dr. (Mr.) Fekadu Yadetie from N-5008, other !
vioannid$

The invocation argument: @ARGV

Perl Filehandles

#!/usr/bin/perl
use strict;
use warnings;
print @ARGV;
my $nb_arg = @ARGV;
my $argument = $ARGV[0];
print "\n$nb_arg\n";
print "The invocation argument is: $argument\n";
exit;

vioannid$ ./argv3.pl listparticipants
listparticipants
1
The invocation argument is: listparticipants
vioannid$

Technically, the diamond operator <> does not look at the invocation arguments themselves: it works from the @ARGV array. This special array is preset by Perl, when the program starts, to the list of invocation arguments, and it can be handled just like any other array!
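Because @ARGV is an ordinary array, the usual array operations apply to it. The following is only a minimal sketch; the helper name summarize_args is hypothetical and does not come from the slides.

```perl
use strict;
use warnings;

# Hypothetical helper: describe a list of invocation arguments.
sub summarize_args {
    my @args  = @_;
    my $count = scalar @args;    # an array in scalar context gives its size
    return "no arguments" if $count == 0;
    return "$count argument(s): " . join(", ", @args);
}

# @ARGV holds whatever followed the script name on the command line.
print summarize_args(@ARGV), "\n";

# Outside a subroutine, shift without an argument removes the first
# element of @ARGV.
my $first = shift;
print "First argument: $first\n" if defined $first;
```

Run without arguments, the sketch should report that it received none; run with a file name, it should echo that name back.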

The invocation argument: @ARGV

Perl Filehandles

#!/usr/bin/perl
use strict;
use warnings;
my $filename = $ARGV[0];
open (FILE, "$filename") or die "Error. Could not open the file $filename !\n";
while (<FILE>) {
    if (m/^\"(.*)\",\"(.*)\",\"(.*)\",\"(.*)\",\"(.*)\"/) {
        print "Hello $5 $2 $1 from $4, $3 !\n";
    }
    else {}
}
exit;

vioannid$ ./argv.pl listparticipants.csv
Hello Mr. Simon Barkow from 8057, ETHZ !
Hello Dr. (Mr.) Arnaud Basle from 4056, University of Basel !
Hello Mr. Todd Blevins from 4058, FMI !
. . .
Hello Dr. (Mr.) Fekadu Yadetie from N-5008, other !
vioannid$


Perl Filehandles

You can open a file for input or output using the open() function.

open(INFILE, "input.txt") or die "Can't open input.txt: $!";

open(OUTFILE, ">output.txt") or die "Can't open output.txt: $!";

open(LOGFILE, ">>logfile") or die "Can't open logfile: $!";

You can use your own naming instead of "INFILE", "OUTFILE" or "LOGFILE".

When you're done with your filehandles, you should close() them (though Perl will clean up after you

if you forget…):

close INFILE;

print() can also take an optional first argument specifying which filehandle

to print to:

print STDERR "This is your final warning\n";

print OUTFILE $record;

print LOGFILE $logmessage;

Use whatever name you like, except STDIN, STDOUT, STDERR and ARGV, which Perl reserves for its own use!
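The open/print/close cycle described above can be sketched end to end. This sketch makes two assumptions that are not in the slides: it stores the filehandles in lexical variables ($out, $log, $in) with the three-argument form of open(), a common alternative to uppercase bareword names, and it works on a scratch file created with File::Temp so that nothing real is overwritten.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch file, removed automatically when the program ends.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close $tmp_fh;

# ">" opens for writing and truncates the file.
open(my $out, '>', $path) or die "Can't open $path for writing: $!";
print $out "first record\n";
print $out "second record\n";
close $out;

# ">>" opens for appending.
open(my $log, '>>', $path) or die "Can't open $path for appending: $!";
print $log "appended line\n";
close $log;

# "<" opens for reading.
open(my $in, '<', $path) or die "Can't open $path for reading: $!";
my @lines = <$in>;
close $in;

print STDERR "Read ", scalar(@lines), " lines\n";
print @lines;
```

Note that there is no comma between the filehandle and the text in print $out "..."; that is how print() tells a filehandle from an ordinary argument.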


Perl Filehandles

File test   Meaning

-r File is readable by effective user/group.

-w File is writable by effective user/group.

-x File is executable by effective user/group.

-o File is owned by effective user.

-R File is readable by real user/group.

-W File is writable by real user/group.

-X File is executable by real user/group.

-O File is owned by real user.

-e File exists.

-z File has zero size.

-s File has nonzero size (returns size).

-f File is a plain file.

-d File is a directory.

-l File is a symbolic link.

-p File is a named pipe (FIFO).

-S File is a socket.

-b File is a block special file.

-c File is a character special file.

-t Filehandle is opened to a tty.

-u File has setuid bit set.

-g File has setgid bit set.

-k File has sticky bit set.

-T File is a text file.

-B File is a binary file (opposite of -T).

-M Age of file (at startup) in days since modification.

-A Age of file (at startup) in days since last access.

-C Age of file (at startup) in days since inode change.

. . .

if (-e $filename) {

#do something

}

. . .
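Several file tests are often combined before a file is opened. The sketch below is illustrative only: the subroutine name describe is hypothetical, and the file and directory are temporary ones created with File::Temp.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile tempdir);

# Temporary directory containing one 5-byte file.
my $dir = tempdir(CLEANUP => 1);
my ($fh, $file) = tempfile(DIR => $dir);
print $fh "12345";
close $fh;

# Hypothetical helper: summarize what a path is, using file tests.
sub describe {
    my ($path) = @_;
    return "$path does not exist"   unless -e $path;
    return "$path is a directory"   if -d $path;
    return "$path is an empty file" if -z $path;
    my $size = -s $path;            # -s returns the size in bytes
    return "$path is a plain file of $size bytes" if -f $path;
    return "$path is something else";
}

print describe($file), "\n";
print describe($dir), "\n";
print describe("$dir/missing"), "\n";
```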


Perl Filehandles

You can read from an open filehandle using the "<>" operator.

In scalar context it reads a single line (or a single record) from the filehandle, and in list context it

reads the whole file in, assigning each line to an element of the list:

my $line = <INFILE>;

my @lines = <INFILE>;

Reading in the whole file at one time is called slurping. It can be useful but it may be a memory hog.

Most text file processing can be done a line at a time with Perl's looping constructs.

The "<>" operator is most often seen in a while loop:

while (<INFILE>) { # assigns each line in turn to $_

print "Just read in this line: $_";

}

close INFILE;
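The difference between scalar and list context can be tried without any file on disk. This sketch assumes an in-memory filehandle (opened on a reference to a string) and a lexical filehandle $fh, neither of which appears in the slides.

```perl
use strict;
use warnings;

my $text = "line one\nline two\nline three\n";

# Opening a reference to a string gives an in-memory filehandle.
open(my $fh, '<', \$text) or die "Can't open in-memory file: $!";

my $first = <$fh>;    # scalar context: reads one line
my @rest  = <$fh>;    # list context: reads all remaining lines
close $fh;

chomp $first;
chomp @rest;

print "First: $first\n";
print "Remaining: ", scalar(@rest), " lines\n";
```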

Perl Filehandles

Example input: a file of protein sequences in FASTA format, in which every entry starts with ">":

>3BHS1_RAT
PGWSCLVTGAGGFVGQRIIRMLVQEKELQEVRALDKVFRPETKEEFSKLQTKAKVTMLEG
DILDAQYLRRACQGISVVIHTAAVIDVSHVLPRQTILDVNLKGTQNILEACVEASVPAFI
YCSTVDVAGPNSYKKIILNGHEEEHHESTWSDAYPYSKRMAEKAVLAANGSILKNGGTLH
TCALRPMYIYGERSPFLSVMILAALKNKGILNVTGKFSIANPVYVGNVAWAHILAARGLR
DPKKSQNVQGQFYYISDDTPHQSYDDLNCTLSKEWGLRLDSSWSLPLPLLYWLAFLLETV
SFLLRPFYNYRPPFNCHLVTLSNSKFTFSYKKAQRDLGYVPLVSWEEAKQKTSEWIGTLV
EQHRETLDTKSQ
>3BHS2_RAT
PGWSCLVTGAGGFVGQRIIRMLVQEKELQEVRALDKVFRPETKEEFSKLQTKAKVTMLEG
DILDAQYLRRACQGISVVIHTASVMDFSRVLPRQTILDVNLKGTQNLLEAGIHASVPAFI
YCSTVDVAGPNSYKKTILNGREEEHHESTWSNPYPYSKKMAEKAVLAANGSILKNGGTLH
TCALRPMYIYGERGQFLSRIIIMALKNKGVLNVTGKFSIVNPVYVGNVAWAHILAARGLR
DPKKSQNIQGQFYYISDDTPHQSYDDLNCTLSKEWGLRLDSSWSLPLPLLYWLAFLLETV
SFLLRPFYNYRPPFNCHLVTLSNSKFTFSYKKAQRDLGYEPLVSWEEAKQKTSEWIGTLV
EQHRETLDTKSQ
>3BHS4_RAT
PGWSCLVTGAGGFLGQRIVQLLVQEKDLKEVRVLDKVFRPETREEFFNLGTSIKVTVLEG
DILDTQCLRRACQGISVVIHTAALIDVTGVNPRQTILDVNLKGTQNLLEACVQASVPAFI
. . .

Perl Filehandles

Example input: the same entries in SwissProt format, in which every entry ends with "//":

ID 3BHS1_RAT STANDARD; PRT; 372 AA.
AC P22071;
. . . . .
SQ SEQUENCE 372 AA; 41906 MW; F989617C1AF18949 CRC64;
PGWSCLVTGA GGFVGQRIIR MLVQEKELQE VRALDKVFRP ETKEEFSKLQ TKAKVTMLEG
DILDAQYLRR ACQGISVVIH TAAVIDVSHV LPRQTILDVN LKGTQNILEA CVEASVPAFI
YCSTVDVAGP NSYKKIILNG HEEEHHESTW SDAYPYSKRM AEKAVLAANG SILKNGGTLH
TCALRPMYIY GERSPFLSVM ILAALKNKGI LNVTGKFSIA NPVYVGNVAW AHILAARGLR
DPKKSQNVQG QFYYISDDTP HQSYDDLNCT LSKEWGLRLD SSWSLPLPLL YWLAFLLETV
SFLLRPFYNY RPPFNCHLVT LSNSKFTFSY KKAQRDLGYV PLVSWEEAKQ KTSEWIGTLV
EQHRETLDTK SQ
//
ID 3BHS2_RAT STANDARD; PRT; 372 AA.
AC P22072;
. . . . .
SQ SEQUENCE 372 AA; 42145 MW; EDAB175F3F33334B CRC64;
PGWSCLVTGA GGFVGQRIIR MLVQEKELQE VRALDKVFRP ETKEEFSKLQ TKAKVTMLEG
DILDAQYLRR ACQGISVVIH TASVMDFSRV LPRQTILDVN LKGTQNLLEA GIHASVPAFI
YCSTVDVAGP NSYKKTILNG REEEHHESTW SNPYPYSKKM AEKAVLAANG SILKNGGTLH
TCALRPMYIY GERGQFLSRI IIMALKNKGV LNVTGKFSIV NPVYVGNVAW AHILAARGLR
DPKKSQNIQG QFYYISDDTP HQSYDDLNCT LSKEWGLRLD SSWSLPLPLL YWLAFLLETV
SFLLRPFYNY RPPFNCHLVT LSNSKFTFSY KKAQRDLGYE PLVSWEEAKQ KTSEWIGTLV
EQHRETLDTK SQ
//

Perl Filehandles

You can change the input record separator $/ (by default "\n") to something else:

$/= "\/\/\n"; for a file containing SwissProt entries or
$/=">"; for a fasta file

$/=">";
while (<INFILE>) {
    # assigns each record (no longer each line) in turn to $_
    print "Entry: $_";
}

vioannid$ ./test.pl wgetz-1
Entry: >Entry: 3BHS1_RAT
PGWSCLVTGAGGFVGQRIIRMLVQEKELQEVRALDKVFRPETKEEFSKLQTKAKVTMLEG
DILDAQYLRRACQGISVVIHTAAVIDVSHVLPRQTILDVNLKGTQNILEACVEASVPAFI
YCSTVDVAGPNSYKKIILNGHEEEHHESTWSDAYPYSKRMAEKAVLAANGSILKNGGTLH
TCALRPMYIYGERSPFLSVMILAALKNKGILNVTGKFSIANPVYVGNVAWAHILAARGLR
DPKKSQNVQGQFYYISDDTPHQSYDDLNCTLSKEWGLRLDSSWSLPLPLLYWLAFLLETV
SFLLRPFYNYRPPFNCHLVTLSNSKFTFSYKKAQRDLGYVPLVSWEEAKQKTSEWIGTLV
EQHRETLDTKSQ
>Entry: 3BHS2_RAT
PGWSCLVTGAGGFVGQRIIRMLVQEKELQEVRALDKVFRPETKEEFSKLQTKAKVTMLEG
DILDAQYLRRACQGISVVIHTASVMDFSRVLPRQTILDVNLKGTQNLLEAGIHASVPAFI
YCSTVDVAGPNSYKKTILNGREEEHHESTWSNPYPYSKKMAEKAVLAANGSILKNGGTLH
TCALRPMYIYGERGQFLSRIIIMALKNKGVLNVTGKFSIVNPVYVGNVAWAHILAARGLR
DPKKSQNIQGQFYYISDDTPHQSYDDLNCTLSKEWGLRLDSSWSLPLPLLYWLAFLLETV
SFLLRPFYNYRPPFNCHLVTLSNSKFTFSYKKAQRDLGYEPLVSWEEAKQKTSEWIGTLV
EQHRETLDTKSQ
. . . . .
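Reading a FASTA file record by record usually continues by splitting each record into its header and its sequence. The sketch below is one possible way to do that, under stated assumptions: the two entries are made up and shortened, the input is an in-memory string instead of a real file, and the hash name %sequence_for is hypothetical. $/ is changed with local inside a block so that the default separator is restored afterwards.

```perl
use strict;
use warnings;

# Two made-up, shortened entries standing in for a real FASTA file.
my $fasta = ">seq1\nACGT\nTTGA\n>seq2\nGGCC\n";
open(my $fh, '<', \$fasta) or die "Can't open in-memory file: $!";

my %sequence_for;
{
    local $/ = ">";    # read up to each ">" instead of each newline
    while (my $record = <$fh>) {
        chomp $record;             # chomp removes the current $/, here ">"
        next if $record eq '';     # nothing precedes the first ">"
        my ($header, @lines) = split /\n/, $record;
        $sequence_for{$header} = join '', @lines;
    }
}
close $fh;

print "$_: $sequence_for{$_}\n" for sort keys %sequence_for;
```

Skipping the empty first record avoids the stray "Entry: >" line visible at the top of the output shown on the slide.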



Perl Subroutines

#!/usr/local/bin/perl

use strict;

use warnings;

my @names1 = ("Pedro", "Claire", "Yemima", "Fabien" ,"Francisco");

foreach (@names1 ) {

my $size = length($_);

print '*'x($size+2),"\n";

print "*$_*\n";

print '*'x($size+2),"\n";

}

exit ;

*******
*Pedro*
*******
********
*Claire*
********
********
*Yemima*
********
********
*Fabien*
********
***********
*Francisco*
***********

Perl Subroutines

#!/usr/local/bin/perl

use strict;

use warnings;

my @names1 = ("Pedro", "Claire", "Yemima", "Fabien" ,"Francisco");

foreach (@names1 ) {

my $size = length($_);

print '*'x($size+2),"\n";

print "*$_*\n";

print '*'x($size+2),"\n";

}

my @names2 = ("Sandra Yukie", "Simona", "Christophe", "Dominique", "Michaela");

foreach (@names2 ) {

my $size = length($_);

print '*'x($size+2),"\n";

print "*$_*\n";

print '*'x($size+2),"\n";

}

my @names3 = ("Lionel", "Gabriele", "Michael", "Charlotte", "Subhash", "Adam");

foreach (@names3 ) {

my $size = length($_);

print '*'x($size+2),"\n";

print "*$_*\n";

print '*'x($size+2),"\n";

}

exit ;

*******
*Pedro*
*******
********
*Claire*
********
********
*Yemima*
********
********
*Fabien*
********
***********
*Francisco*
***********

Perl Subroutines

Functions in Perl are called subroutines

A subroutine is a named, reusable, and accessible chunk of code that was written to

accomplish a specific goal. Therefore, functions are useful to avoid typing redundant

code over and over.

Functions help in the clarity of scripts.

Don't reinvent the wheel !!!

There are already many available functions in Perl:

http://search.cpan.org/~nwclark/perl-5.8.6/pod/perlfunc.pod

Perl Subroutines (procedure, function)

#defining subroutine

sub myfunc {

my $param = shift(@_);

. . .

return $result;

}

# calling a function

$calcul = myfunc($value);

#defining subroutine

sub myproc {

my $param = shift(@_);

. . .

return;

}

# calling procedure

myproc($value);

Some Perl commands tell the Perl interpreter to do something. A statement starting with a "verb" is generally

purely imperative. We often call these "verbs" procedures: a frequently seen command is the print command.

Some verbs are for asking questions, and are useful in conditional statements. Other verbs translate their input

parameters into return values, just as a recipe tells you how to turn raw ingredients into something (hopefully)

edible. We tend to call these verbs functions.

Page 20VI, October 2006

Perl Subroutines

#!/usr/local/bin/perl

use strict;

use warnings;

my @names1 = ("Pedro", "Claire", "Yemima", "Fabien" ,"Francisco");

my @names2 = ("Sandra Yukie", "Simona", "Christophe", "Dominique", "Michaela");

my @names3 = ("Lionel", "Gabriele", "Michael", "Charlotte", "Subhash", "Adam");

my @names4 = ("Sebastian", "Tu", "Sergey", "Olusegun", "Joel", "Uta", "Viviane");

my @names5 = ("Stanislav", "Kyrill", "Petr", "Sebastien", "Haleh");

&pretty_print(@names1);

&pretty_print(@names2);

&pretty_print(@names3);

&pretty_print(@names4);

&pretty_print(@names5);

exit ;

sub pretty_print {

foreach (@_) {

my $size = length($_);

print '*'x($size+2),"\n";

print "*$_*\n";

print '*'x($size+2),"\n";

}

}

*******
*Pedro*
*******
********
*Claire*
********
********
*Yemima*
********
********
*Fabien*
********
***********
*Francisco*
***********

Perl functions

#!/usr/bin/perl

use strict;

use warnings;

#call the helloworld function

#& is optional with parentheses

helloworld();

#tell the program to exit

exit;

sub helloworld{

print "hello World !\n";

}

vioannid$ ./sub_hello.pl
hello World !
vioannid$

In the same way that “$” stands for scalars, “@” for arrays and “%” for hashes, “&” stands for subroutines (it is optional when parentheses are used).

Note, however, that in practice helloworld(); is preferred to &helloworld;

Perl functions

#!/usr/bin/perl

use strict;

use warnings;

# pass 2 arguments to the plus function

# receive the output in $sum

my $sum = plus(12,34);

print "$sum\n";

exit;

sub plus{

my($x, $y)=@_;

return $x+$y;

}

vioannid$ ./sub_plus.pl
46
vioannid$

To accomplish its goal, a subroutine sometimes needs input, called input parameters. The list of those parameters is received in the default array @_.

Perl functions

#!/usr/bin/perl

use strict;

use warnings;

my @vals=(1, 4, 5, 8);

my $sum = &plus(@vals);

print "sum=$sum\n";

sub plus{

my @values = @_;

my $add = 0;

foreach(@values){

$add += $_;

}

return $add;

}

vioannid$ ./sub_plus_array.pl
sum=18
vioannid$

Perl functions

Checking/Detecting parameters

Checking the number of parameters

A common task is to check the number of parameters (and maybe return an error). @_ is a normal array, therefore:

my $nbParam = scalar @_;

Detecting a parameter type with ref($x)

(is it a scalar? an array reference? a filehandle?)

ref($x) returns a string naming the type of thing $x refers to: the empty string if $x is not a reference (a plain scalar), “ARRAY” for an array reference, “HASH” for a hash reference, “GLOB” for a reference to a filehandle glob, etc.

Detecting the parameter types allows a single subroutine to handle many different situations.
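Both checks can be combined in one subroutine. The following is a sketch only; the subroutine name total and its behavior are hypothetical, chosen to show scalar @_ and ref() working together.

```perl
use strict;
use warnings;

# Hypothetical subroutine: sums plain numbers and the contents of
# array references, and rejects every other kind of reference.
sub total {
    die "total() needs at least one argument\n" if scalar(@_) == 0;
    my $sum = 0;
    foreach my $arg (@_) {
        my $type = ref($arg);
        if ($type eq '') {             # not a reference: a plain scalar
            $sum += $arg;
        }
        elsif ($type eq 'ARRAY') {     # array reference
            $sum += $_ for @{$arg};
        }
        else {
            die "total() cannot handle a $type reference\n";
        }
    }
    return $sum;
}

print total(1, 2, 3), "\n";
print total([4, 5], 6), "\n";
```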


Perl Scope of Variables

Scope refers to the visibility of variables. In other words, which parts of your program

can see or use them.

Lexical Scope

Lexical scope is the best choice for the majority of variables programmers use regularly. A lexically scoped variable exists only within its enclosing block.

#!/usr/bin/perl

use strict;

use warnings;

my @list = (

"Simon","Arnaud",

"Todd","Natacha" );

foreach my $name (@list) {

print "Hello $name !\n";

}

print "Hello $name !\n";

exit ;

vioannid$ ./scope.pl
Global symbol "$name" requires explicit package name at ./scope.pl line 14.
Execution of ./scope.pl aborted due to compilation errors.
vioannid$

The widest scope available to a lexical variable is the entire file in which it is declared. No entry is created in the symbol table for lexically scoped variables, and these variables are purged once they are out of scope. This use of Perl's garbage collection can save resources and prevent variable collisions. A lexical variable is declared with Perl's my() function. Under "use strict", every variable has to be declared.
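A short sketch of block scoping, using made-up variable names: a my variable declared inside a block hides an outer variable of the same name and disappears when the block ends.

```perl
use strict;
use warnings;

my $name = "outer";
{
    my $name = "inner";    # a new variable, visible only in this block
    print "Inside the block:  $name\n";
}
print "Outside the block: $name\n";

my @seen;
foreach my $item ("a", "b") {
    my $label = "item-$item";    # created anew on each iteration
    push @seen, $label;
}
# $item and $label no longer exist here; only @seen survives the loop.
print "Collected: @seen\n";
```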


Perl Scope of Variables

#!/usr/bin/perl

use strict;

use warnings;

my $a = 5;

my $b = 10;

print '$a before: '."$a\n";

print '$b before: '."$b\n\n";

double($a,$b);

print '$a after: '."$a\n";

print '$b after: '."$b\n";

exit ;

sub double {

$a = shift; #$a = $_[0];

$b = shift; #$b = $_[1];

$a=$a*2;

$b=$b*2;

print 'double, $a: '."$a\n";

print 'double, $b: '."$b\n\n";

}

vioannid$ ./ref.pl
$a before: 5
$b before: 10

double, $a: 10
double, $b: 20

$a after: 10
$b after: 20
vioannid$

Perl Scope of Variables

#!/usr/bin/perl

use strict;

use warnings;

my $a = 5;

my $b = 10;

print '$a before: '."$a\n";

print '$b before: '."$b\n\n";

double($a,$b);

print '$a after: '."$a\n";

print '$b after: '."$b\n";

exit ;

sub double {

my $a = shift; #my $a = $_[0];

my $b = shift; #my $b = $_[1];

$a=$a*2;

$b=$b*2;

print 'double, $a: '."$a\n";

print 'double, $b: '."$b\n\n";

}

vioannid$ ./ref.pl
$a before: 5
$b before: 10

double, $a: 10
double, $b: 20

$a after: 5
$b after: 10
vioannid$

Perl functions

#!/usr/bin/perl

use strict;

use warnings;

my @list1 = ("Pamela","Monica","Sophie");

my @list2 = ("Natacha","Francesca","Magdalena");

print "@list1\n";print "@list2\n";

list(@list1,@list2);

exit ;

sub list {

my(@firstArray, @secondArray) = @_ ;

print("The first array is @firstArray.\n");

print("The second array is @secondArray.\n");

}

vioannid$ ./ref3.pl
Pamela Monica Sophie
Natacha Francesca Magdalena
The first array is Pamela Monica Sophie Natacha Francesca Magdalena.
The second array is .
vioannid$


Perl References

References

If you want to pass more than one array or hash into a function (or return them from it) and keep each of them intact, you have to pass them explicitly by reference.

In Perl, a subroutine receives only one kind of argument: a flat list of scalars. To pass an array or a hash as a single unit, you pass a reference to it, because a reference to anything is itself a scalar. Think of a reference as a Macintosh alias or a Windows shortcut.


name            value                   "address"
@list_names     john magdalena luc      0x180b324
\@list_names    0x180b324               0x180b318
@list_copy      john magdalena luc      0x180b524

(an array with 10000 elements … ?)
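The difference between a reference and a copy can be checked directly. In this sketch the extra name "anna" is made up for illustration; the array names follow the diagram above.

```perl
use strict;
use warnings;

my @list_names = ("john", "magdalena", "luc");
my $ref        = \@list_names;    # one scalar holding the array's "address"
my @list_copy  = @list_names;     # an independent copy of every element

push @{$ref}, "anna";             # changes @list_names through the reference
$list_copy[0] = "JOHN";           # changes only the copy

print "Original: @list_names\n";
print "Copy:     @list_copy\n";
print "Reference prints as: $ref\n";    # something like ARRAY(0x...)
```

Copying duplicates every element, which is why a reference is the cheaper choice for an array of 10000 elements.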

Perl References

#!/usr/bin/perl
use strict; use warnings;
my @list1 = ("Pamela","Monica","Sophie");
my @list2 = ("Natacha","Francesca","Magdalena");
print "@list1\n";
print "@list2\n";
list(\@list1,\@list2);
print "@list1\n";
print "@list2\n";
exit ;
sub list {
    my($ref1, $ref2) = @_;
    # uc() puts reverse() in scalar context: the elements are joined into
    # one string whose characters are reversed, then upper-cased
    @{$ref1} = uc reverse @{$ref1};
    print("First array modified: @{$ref1}.\n");
    print("Second array unchanged: @{$ref2}.\n");
}

vioannid$ ./ref4.pl
Pamela Monica Sophie
Natacha Francesca Magdalena
First array modified: EIHPOSACINOMALEMAP.
Second array unchanged: Natacha Francesca Magdalena.
EIHPOSACINOMALEMAP
Natacha Francesca Magdalena
vioannid$

Perl References

References

The following table summarizes how variables are referenced and dereferenced. Note that for lists and hashes, you reference and dereference the list or hash as a whole, not its individual elements.

Variable: $scalar
    Instantiating it:                  $scalar = "steve";
    Instantiating a reference to it:   $ref = \"steve";
    Referencing it:                    $ref = \$scalar
    Dereferencing it:                  $$ref  or  ${$ref}
    Accessing an element:              NA

Variable: @list
    Instantiating it:                  @list = ("steve", "fred");
    Instantiating a reference to it:   $ref = ["steve", "fred"];
    Referencing it:                    $ref = \@list
    Dereferencing it:                  @$ref  or  @{$ref}
    Accessing an element:              ${$ref}[1]  or  $ref->[1]

Variable: %hash
    Instantiating it:                  %hash = ("name" => "steve", "job" => "DJ");
    Instantiating a reference to it:   $ref = {"name" => "steve", "job" => "DJ"};
    Referencing it:                    $ref = \%hash
    Dereferencing it:                  %$ref  or  %{$ref}
    Accessing an element:              ${$ref}{"name"}  or  $ref->{"job"}

Perl References

References

#!/usr/bin/perl

use strict;

use warnings;

my %hash = (

"name" => "steve",

"job" => "DJ"

);

my $ref = \%hash;

print ${$ref}{"name"};

print "\n";

print $ref->{'job'};

print "\n";

my %hash_copy = %$ref;

print %hash_copy;

print "\n";

exit;

steve

DJ

namestevejobDJ

Perl References

References

#!/usr/bin/perl

use strict;

use warnings;

my @list = (

"bruce", "michael"

);

my $ref2 = \@list;

print ${$ref2}[0];

print "\n";

print $ref2->[1];

print "\n";

my @list_copy = @$ref2;

print @list_copy;

print "\n";

exit;

bruce

michael

brucemichael


Perl Nested Data Structures

References are commonly used in Nested Data Structures:

ARRAY REF -> ( Gene1, ARRAY REF -> ( AKT_signaling, Erk, Integrin_signaling ) )

ARRAY REF -> ( Gene2, ARRAY REF -> ( LCK_signaling, BRCA-1_pathway, TCF-1_pathway ) )

Perl Nested Data Structures

References are commonly used in Nested Data Structures:

#!/usr/bin/perl

use strict; use warnings;

my @gene1 = qw(AKT_signaling Erk Integrin_signaling);

my @gene1_name = ('gene1', \@gene1);

my @gene2 =qw(LCK_signaling BRCA-1_pathway TCF-1_pathway );

my @gene2_name = ('gene2', \@gene2);

my @all_gene_names = (\@gene1_name,\@gene2_name,);

print @all_gene_names ; print "\n";

my @array_gene_name_refs = @{$all_gene_names[0]};

print @array_gene_name_refs ; print "\n";

my $value_array_gene_name_refs1 = ${$all_gene_names[0]}[0];

print $value_array_gene_name_refs1 ; print "\n";

my $value_array2_field3 = ${${$all_gene_names[1]}[1]}[2];

print $value_array2_field3; print "\n";

exit;

ARRAY(0x180b318)ARRAY(0x180d888)
gene1ARRAY(0x180b324)
gene1
TCF-1_pathway

Perl Nested Data Structures

References are commonly used in Nested Data Structures:

#!/usr/bin/perl

use strict; use warnings;

my @gene1 = qw(AKT_signaling Erk Integrin_signaling);

my @gene1_name = ('gene1', \@gene1);

my @gene2 =qw(LCK_signaling BRCA-1_pathway TCF-1_pathway );

my @gene2_name = ('gene2', \@gene2);

my @all_gene_names = (\@gene1_name,\@gene2_name,);

print @all_gene_names ; print "\n";

my @array_gene_name_refs = @{$all_gene_names[0]};

print @array_gene_name_refs ; print "\n";

my $value_array_gene_name_refs1 = $all_gene_names[0]->[0];

print $value_array_gene_name_refs1 ; print "\n";

my $value_array2_field3 = $all_gene_names[1]->[1]->[2];

print $value_array2_field3; print "\n";

exit;

ARRAY(0x180b318)ARRAY(0x180d888)
gene1ARRAY(0x180b324)
gene1
TCF-1_pathway

Perl Nested Data Structures

References are commonly used in Nested Data Structures:

Arrays of arrays

Hashes of arrays

Arrays of Hashes

Hashes of hashes… And more!

When the nested data structures become too complex, it may be worth considering Object Oriented Perl programming ……
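As an illustration of one of these combinations, the gene and pathway names from the earlier example can be stored in a hash of arrays. This is a sketch only: the hash name %pathways_for is hypothetical, and the value "extra_pathway" is made up to show how an element is added.

```perl
use strict;
use warnings;

# A hash of arrays: gene name => reference to a list of pathways.
my %pathways_for = (
    gene1 => [qw(AKT_signaling Erk Integrin_signaling)],
    gene2 => [qw(LCK_signaling BRCA-1_pathway TCF-1_pathway)],
);

# Made-up value, added only to show how to extend an inner array.
push @{ $pathways_for{gene1} }, "extra_pathway";

foreach my $gene (sort keys %pathways_for) {
    my @pathways = @{ $pathways_for{$gene} };
    print "$gene (", scalar(@pathways), "): @pathways\n";
}

# The arrow between adjacent subscripts is optional.
print "Third pathway of gene2: $pathways_for{gene2}[2]\n";
```

Compared with the array of arrays used earlier, the hash lets each list be looked up by gene name instead of by position.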

script1.pl

#!/usr/local/bin/perl
use strict;
use warnings;
my $v1 = complex_operation ($param1);
. . .
my $v2 = complex_operation ($param2);
exit ;
sub complex_operation {
- - - - -
}

script2.pl

#!/usr/local/bin/perl
use strict;
use warnings;
my $v1 = complex_operation ($param1);
. . .
my $v2 = complex_operation ($param2);
exit ;
sub complex_operation {
- - - - -
}


Perl Package/Module

script1.pl

#!/usr/local/bin/perl
use strict;
use warnings;
use Mymod;
my $v1 = Mymod::complex_operation ($param1);
. . .
my $v2 = Mymod::complex_operation ($param2);
. . .

script2.pl

#!/usr/local/bin/perl
use strict;
use warnings;
use Mymod;
my $v1 = Mymod::complex_operation ($param1);
. . .
my $v2 = Mymod::complex_operation ($param2);
. . .

Module Mymod

sub complex_operation {
- - - - -
}


Perl Package/Module

• Structure of module

Mymod.pm

package Mymod;

sub f { … }

sub g { … }

@array = ( … );

1;

• Calling the functions in a script

use Mymod;

Mymod::f($param);

$a = Mymod::g();

A module is a package defined in a file with the same name as the package, plus the .pm extension (package Mymod lives in Mymod.pm).
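The same structure can be tried in a single file before moving the package into Mymod.pm. This is a sketch under that assumption; the bodies of f and g are made up, since the slide leaves them elided.

```perl
use strict;
use warnings;

# In a real project this block would be the file Mymod.pm, would end
# with "1;", and would be loaded with "use Mymod;".
{
    package Mymod;

    our @array = (1, 2, 3);

    sub f { my ($x) = @_; return $x * 2; }    # made-up body
    sub g { return scalar @array; }           # made-up body
}

# Back in package main: call the functions by their qualified names.
print Mymod::f(21), "\n";
print Mymod::g(), "\n";
print "@Mymod::array\n";
```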


Perl namespace

A namespace stores names (or identifiers), including names of variables, subroutines,

filehandles, and formats. Each namespace has its own symbol table, which is basically a

hash with a key for each identifier. Variables in different namespaces can even have the

same name, but they are completely distinct from one another.

The default namespace for programs is main.

Each package starts with a package declaration, which takes one argument, the name of the package. Within the scope of a package declaration, all regular identifiers are created within that package (except for my variables, which are lexical).
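Two packages can hold a variable with the same name without clashing. In the sketch below the package name Lab and the variable $greeting are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;

our $greeting = "hello from main";       # this is $main::greeting

{
    package Lab;                         # hypothetical package name
    our $greeting = "hello from Lab";    # $Lab::greeting, a different variable
    sub whoami { return __PACKAGE__; }
}

print "$main::greeting\n";
print "$Lab::greeting\n";
print Lab::whoami(), "\n";

# The symbol table of package Lab is available as the hash %Lab::
print "Lab defines 'whoami'\n" if exists $Lab::{whoami};
```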


Perl @INC / Export

Perl locates modules by searching the @INC array (defined when Perl is built).

When you refer to MyModule in your program, Perl searches in the directories listed in

@INC for the module file MyModule.pm, and uses the first one it finds.
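The search path can be inspected and extended from within a script. In this sketch the directory /home/user/perl/lib is a hypothetical location for your own modules; the lib pragma adds it to the front of @INC at compile time.

```perl
use strict;
use warnings;
use lib '/home/user/perl/lib';    # hypothetical directory for private modules

print "Perl will search these directories, in order:\n";
print "  $_\n" for @INC;

# %INC records which files have already been loaded, and from where.
print "strict.pm was loaded from $INC{'strict.pm'}\n";
```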


Perl @INC / Export

To include a module in your program:

require Module;

use Module;

The difference between use and require is that use loads the module at compile time and imports the names the module exports. This means that functions like func1 or func2 can be used as predeclared list operators throughout the file.

func1($a,$b);

The require call loads the module at run time and imports nothing, so you must explicitly qualify its routines with the package name.

Module::func1($a,$b);
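The two styles can be compared with modules that ship with Perl. The choice of List::Util and File::Basename is only an illustration; they stand in for the generic Module on the slide.

```perl
use strict;
use warnings;

# Loaded at compile time; sum is imported into the current package.
use List::Util qw(sum);

# Loaded at run time; nothing is imported, so calls are fully qualified.
require File::Basename;

my $total = sum(1, 2, 3);
my $name  = File::Basename::basename("/tmp/data/listparticipants.csv");

print "$total\n";
print "$name\n";
```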

Perl @INC / Export

Package (Animals.pm)

package Animals;
# two essential lines in a package
require Exporter;
our @ISA = ('Exporter'); # inherits from Exporter
# export by default
our @EXPORT = qw($cat %canis carnivore);
# export on demand
our @EXPORT_OK = qw($tiger);
# variables & functions declaration
. . .
1;

User's script

use Animals; # import all @EXPORT symbols
use Animals qw($cat $tiger); # import only $cat and $tiger
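The export mechanism can be tried in a single file. This is a sketch under several assumptions: the package is defined inline inside a BEGIN block instead of in Animals.pm, the %INC entry is set by hand so that use does not look for a file on disk, and the value of $tiger and the body of carnivore are made up.

```perl
use strict;
use warnings;

# Stand-in for the file Animals.pm, compiled before the "use" below.
BEGIN {
    package Animals;
    require Exporter;
    our @ISA       = ('Exporter');
    our @EXPORT    = qw(carnivore);      # exported by default
    our @EXPORT_OK = qw($tiger);         # exported only on request
    our $tiger     = "Panthera tigris";  # made-up value
    sub carnivore { return "eats meat" } # made-up body
    $INC{'Animals.pm'} = __FILE__;       # mark the module as already loaded
}

# An explicit list imports exactly the names requested.
use Animals qw(carnivore $tiger);

print carnivore(), "\n";
print "$tiger\n";
```

With a real Animals.pm somewhere in @INC, the BEGIN block and the %INC line would not be needed.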
