PORTING HMMER AND INTERPROSCAN TO THE GRID
Daniel Alberto Burbano Sefair ([email protected])
Michael Angel Pérez Cabarcas ([email protected])
University of The Andes
Information Technology Division
Colombia
November 2008
Topics
• Introduction
• HMMER
• InterProScan
• What do we have?
• What do we want with your help?
• Questions
INTRODUCTION
• Our users, from the Biology department, want an easy way to use HMMER and InterProScan that also saves processing time.
– A graphical user interface instead of a command line interface.
– There are few users, but each submits many jobs (1000 - 3000).
– Jobs may include input files larger than 10 MB.
– Processing time should be reduced by using other computers.
– Depending on the job, the run time can range from 1 h to 12 h.
– Some InterProScan jobs fail and must be submitted again.
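The resubmission of failed jobs mentioned above could be automated with a small retry helper. This is a minimal sketch only: `run_with_retry` is a hypothetical name and is not part of HMMER, InterProScan, or BGW, and it assumes that a failed job exits with a non-zero status.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not part of any tool named here):
# rerun a command until it succeeds or the attempt limit is reached.
# Usage: run_with_retry MAX_ATTEMPTS COMMAND [ARGS...]
run_with_retry() {
    max_attempts=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        # Run the command exactly as given; stop at the first success.
        if "$@"; then
            return 0
        fi
        echo "attempt $attempt of $max_attempts failed: $*" >&2
        attempt=$((attempt + 1))
    done
    # Every attempt failed.
    return 1
}
```

For example, `run_with_retry 3 iprscan -cli -i input.seq -o test.out -format raw` would try the same InterProScan job up to three times before giving up.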
1. What is HMMER?
- “HMMER is a sequence analysis tool using profile Hidden Markov Models”.
- It is a set of 9 command line applications:
hmmpfam, hmmsearch, hmmalign, hmmbuild, hmmconvert, hmmcalibrate, hmmemit, hmmindex, hmmfetch.
The definition above is taken from: ftp://selab.janelia.org/pub/software/hmmer/CURRENT/Userguide.pdf
Home page: http://hmmer.janelia.org/
HMMER
Profile Hidden Markov Models
2. How can I use HMMER by command, PBS, and JDL?
HMMER is a command line application. This is an example:
hmmsearch file.hmm MySequence.fasta >> output
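For the PBS case, the same command could be wrapped in a job script. The sketch below prints such a script; `make_hmmsearch_pbs` is a hypothetical helper, and the job name and the 12 h walltime are illustrative assumptions (the walltime is taken from the upper run time mentioned in the introduction). It also assumes that `hmmsearch` is on the PATH of the compute node.

```shell
#!/bin/sh
# Hypothetical helper: print a PBS job script that runs hmmsearch on one
# profile file and one sequence file. Job name and walltime are assumptions.
# Usage: make_hmmsearch_pbs PROFILE.hmm SEQUENCES.fasta OUTPUT
make_hmmsearch_pbs() {
    hmm_file=$1
    seq_file=$2
    out_file=$3
    # Unquoted here-document: the three file names are expanded now, while
    # \$PBS_O_WORKDIR is left for PBS to expand when the job runs.
    cat <<EOF
#!/bin/sh
#PBS -N hmmsearch_job
#PBS -l walltime=12:00:00
#PBS -j oe
cd "\$PBS_O_WORKDIR"
hmmsearch "$hmm_file" "$seq_file" > "$out_file"
EOF
}
```

A possible use is `make_hmmsearch_pbs file.hmm MySequence.fasta output > job.pbs`, followed by `qsub job.pbs` on the cluster.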
1. What is InterProScan?
The following definition is taken from the European Bioinformatics Institute: http://www.ebi.ac.uk/2can/tutorials/function/InterProScan.html
“InterProScan is a tool that combines different protein recognition methods into one resource. It scans a given protein sequence against the protein signatures of the InterPro member databases (PROSITE, PRINTS, Pfam, ProDom, SMART, TIGRFAMs).”
Home Page: http://www.ebi.ac.uk/Tools/InterProScan/
InterProScan
2. How does InterProScan work?
1. The user submits a protein sequence.
2. Protein sequence applications are launched and search against specific databases.
3. Each application returns a list of hits.
4. The results are combined.
5. The information is returned to the user.
[Figure: InterProScan workflow schema, with numbered steps 1 to 4]
Information and schema are taken from: http://www.ebi.ac.uk/2can/tutorials/images/scan_schema.gif
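The launch, collect, and combine steps can be illustrated with a toy shell sketch. Everything in it is an assumption made for illustration: `scan_with` is a stand-in that prints one fake hit instead of calling a real signature-scanning application, the three method names are just a subset of the member databases, and running them in the background is a design choice of this sketch, not a statement about how InterProScan is implemented.

```shell
#!/bin/sh
# Stand-in for one signature-scanning application (hypothetical). A real
# setup would call the actual tool here; this only prints one fake hit.
scan_with() {
    method=$1
    seq_file=$2
    printf '%s\t%s\n' "$method" "hit-for-$(basename "$seq_file")"
}

# Launch one scan per method, wait for all of them, then combine the hits.
scan_all() {
    seq_file=$1
    work_dir=$(mktemp -d)
    for method in PROSITE PRINTS Pfam; do
        # Each application writes its own list of hits.
        scan_with "$method" "$seq_file" > "$work_dir/$method.hits" &
    done
    wait
    # Combine the per-method results into a single report on stdout.
    sort "$work_dir"/*.hits
    rm -rf "$work_dir"
}
```

Calling `scan_all input.seq` prints one line per method, which corresponds to the combined result returned to the user.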
3. How can I use InterProScan by command, PBS, and JDL?
InterProScan is a command line application. This is an example:
iprscan -cli -i input.seq -o test.out -format raw -goterms -iprlookup
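For the JDL case, a job description for the same command might look like the output of the sketch below. `make_iprscan_jdl` is a hypothetical helper, the `std.out` and `std.err` file names are arbitrary, and the sketch assumes that `iprscan` is already installed and on the PATH of the grid worker node. The attributes used (Executable, Arguments, StdOutput, StdError, InputSandbox, OutputSandbox) are standard JDL attributes.

```shell
#!/bin/sh
# Hypothetical helper: print a JDL description for one InterProScan job.
# Assumes iprscan is available on the worker node (not verified here).
# Usage: make_iprscan_jdl INPUT.seq OUTPUT
make_iprscan_jdl() {
    seq_file=$1
    out_file=$2
    # The input file travels in the InputSandbox; the result file and the
    # captured standard streams come back in the OutputSandbox.
    cat <<EOF
Executable = "iprscan";
Arguments = "-cli -i $seq_file -o $out_file -format raw -goterms -iprlookup";
StdOutput = "std.out";
StdError = "std.err";
InputSandbox = {"$seq_file"};
OutputSandbox = {"std.out", "std.err", "$out_file"};
EOF
}
```

A possible use is `make_iprscan_jdl input.seq test.out > job.jdl`; on a gLite installation the file would then typically be submitted with `glite-wms-job-submit -a job.jdl`.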
What do we have?
• A Bioinformatic Grid Wrapper (BGW) for HMMER and InterProScan, which provides a command line interface (CLI).
What do we want with your help?
Architecture
Thanks
Questions?
• “Profile hidden Markov models (profile HMMs) can be used to do sensitive database searching using statistical descriptions of a sequence family's consensus. HMMER is a freely distributable implementation of profile HMM software for protein sequence analysis.”