
Log in to Amazon BioLinux
- For Mac users:
    ssh ubuntu@public_dns_address
- For Windows users, use PuTTY:
    Hostname: public_dns_address
    Username: ubuntu

Then create a working directory and download the sample file:
    mkdir bioperl
    cd bioperl
    wget http://biobase.ist.unomaha.edu/~ithapa/myfile.gbk


BioPerl

Ishwor Thapa (02/17/2012)


How Perl saved the human genome project

http://www.bioperl.org/wiki/How_Perl_saved_human_genome

- DATE: Early February, 1996
- LOCATION: Cambridge, England, in the conference room of the largest DNA sequencing center in Europe.
- OCCASION: A high-level meeting between the computer scientists of this center and the largest DNA sequencing center in the United States.
- THE PROBLEM: Although the two centers use almost identical laboratory techniques, almost identical databases, and almost identical data analysis tools, they still can't interchange data or meaningfully compare results.
- THE SOLUTION: Perl.


Installing BioPerl
- BioLinux comes with BioPerl
- For other machines (Linux, Mac, Windows), see http://www.bioperl.org/wiki/Main_Page


Programming in Perl

    print "Hello World!\n";

    # Perl declares loop variables with "my", not "int"
    for (my $i = 0; $i < 10; $i++) {
        print "$i\n";
    }


BioPerl
- Two main classes in BioPerl:
    Bio::SeqIO
    Bio::Seq


Using Bio::SeqIO
- 3 main methods:
    new
    next_seq
    write_seq


Genbank to Fasta converter

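    use Bio::SeqIO;

    # Read GenBank records from myfile.gbk
    my $in = Bio::SeqIO->new(-file => "myfile.gbk", -format => 'Genbank');
    # Write FASTA records to myfile.fasta (the leading ">" opens the file for writing)
    my $out = Bio::SeqIO->new(-file => ">myfile.fasta", -format => 'Fasta');

    while ( my $seq = $in->next_seq() ) {
        $out->write_seq($seq);
    }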


Bio::Seq
- Main methods:
    new
    seq
    subseq
    display_id
    desc
    revcom


Using Bio::Seq

    use Bio::SeqIO;

    my $in = Bio::SeqIO->new(-file => "myfile.gbk", -format => 'Genbank');

    while ( my $seq = $in->next_seq() ) {
        print $seq->display_id, "\n";
        print $seq->desc, "\n";
        #print $seq->seq;
        #print $seq->subseq(10,20);
        #print $seq->revcom->seq;
    }
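To make the subseq and revcom calls more concrete, here is a minimal plain-Perl sketch of what they compute, written without BioPerl so it runs anywhere. The helper names subseq_1based and reverse_complement are hypothetical and are not part of BioPerl; the sketch assumes a plain A/C/G/T string and only illustrates the 1-based, inclusive coordinates that subseq uses.

```perl
use strict;
use warnings;

# Hypothetical helper (not a BioPerl function): 1-based, inclusive
# substring, following the coordinate convention of subseq(start, end).
sub subseq_1based {
    my ($seq, $start, $end) = @_;
    return substr($seq, $start - 1, $end - $start + 1);
}

# Hypothetical helper (not a BioPerl function): reverse complement of a
# DNA string. Only plain A/C/G/T are handled, no ambiguity codes.
sub reverse_complement {
    my ($seq) = @_;
    my $rc = reverse $seq;          # scalar context: reverse the string
    $rc =~ tr/ACGTacgt/TGCAtgca/;   # swap each base for its complement
    return $rc;
}

my $dna = "ATGGCCTTA";
print subseq_1based($dna, 2, 4), "\n";    # bases 2 through 4: TGG
print reverse_complement($dna), "\n";     # TAAGGCCAT
```

With a real Bio::Seq object the equivalent calls are $seq->subseq(2,4) and $seq->revcom->seq, as in the commented-out lines of the slide.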


SeqFeatures


    use Bio::SeqIO;

    my $seq_io = Bio::SeqIO->new(-file => "myfile.gbk", -format => 'Genbank');

    while (my $seq = $seq_io->next_seq()) {
        my @features = $seq->get_SeqFeatures();
        foreach my $feat (@features) {
            # For each CDS feature, print its translation as a FASTA record
            if ($feat->primary_tag eq "CDS") {
                my @pid = $feat->get_tag_values('protein_id');
                my @translation = $feat->get_tag_values('translation');
                for (my $index = 0; $index < scalar @pid; $index++) {
                    print ">$pid[$index]" . "\n";
                    print $translation[$index] . "\n";
                }
            }
        }
    }
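The loop above prints each translation on a single line. If wrapped FASTA lines are preferred, one possible sketch is shown below. The helper fasta_record is hypothetical and not a BioPerl function, and the default width of 60 characters is an assumption, not something the slides specify.

```perl
use strict;
use warnings;

# Hypothetical helper (not a BioPerl function): build one FASTA record,
# wrapping the sequence at $width characters per line.
# The default width of 60 is an assumed convention.
sub fasta_record {
    my ($id, $sequence, $width) = @_;
    $width ||= 60;
    my $record = ">$id\n";
    for (my $pos = 0; $pos < length($sequence); $pos += $width) {
        $record .= substr($sequence, $pos, $width) . "\n";
    }
    return $record;
}

# Example with a made-up identifier and a 90-residue dummy sequence:
# the output is a header line, then lines of 60 and 30 characters.
print fasta_record("example_protein_1", "MKT" x 30);
```

Inside the CDS loop, the two print statements could then be replaced with print fasta_record($pid[$index], $translation[$index]).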