Presentation by Prlic at BOSC2012 "BioJava Update"
How to use BioJava to calculate one billion protein structure alignments at
the RCSB PDB website
Andreas Prlić
My Two Hats
RCSB PDB
BioJava
www.pdb.org
Overview
[Figure: number of released entries per year]
Some of the things you can do at the RCSB PDB site
• Advanced queries
• Custom reports
• Visualization
• Education section
• Comparisons across PDB, based on sequence and 3D structure similarities
Jmol
LigandExplorer
Custom report
Systematic Structural Alignment
Objective: Find novel relationships
Example: Green Fluorescent Protein
• Nidogen-1: similar 11-stranded beta-barrel and internal helices
• 3 Å RMSD, only 9% sequence identity
• Nidogen-1: component of basement membrane, no chromophore
• GFP and NID-1 may share common ancestor
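The two figures quoted for this example, RMSD and sequence identity, can be illustrated with a minimal sketch. It assumes the two structures are already superposed and the equivalent residue pairs are known. The class `AlignmentStats` and its methods are hypothetical helpers written for illustration, not BioJava API, and the identity definition used here (identical residues over aligned, non-gap columns) is only one of several in common use.

```java
/** Illustrative helpers only; names are hypothetical and not part of BioJava. */
final class AlignmentStats {

    /** RMSD over equivalent atom pairs, assuming both sets are already superposed. */
    static double rmsd(double[][] a, double[][] b) {
        if (a.length != b.length || a.length == 0) {
            throw new IllegalArgumentException("need two non-empty coordinate sets of equal length");
        }
        double sumSq = 0.0;
        for (int i = 0; i < a.length; i++) {
            for (int k = 0; k < 3; k++) {
                double d = a[i][k] - b[i][k];
                sumSq += d * d;
            }
        }
        return Math.sqrt(sumSq / a.length);
    }

    /** Percent identical residues over aligned (non-gap) columns of two gapped rows. */
    static double percentIdentity(String row1, String row2) {
        if (row1.length() != row2.length()) {
            throw new IllegalArgumentException("alignment rows must have equal length");
        }
        int aligned = 0;
        int identical = 0;
        for (int i = 0; i < row1.length(); i++) {
            char c1 = row1.charAt(i);
            char c2 = row2.charAt(i);
            if (c1 == '-' || c2 == '-') {
                continue; // gap column: not counted as aligned
            }
            aligned++;
            if (c1 == c2) {
                identical++;
            }
        }
        return aligned == 0 ? 0.0 : 100.0 * identical / aligned;
    }

    public static void main(String[] args) {
        // Toy coordinates in Å: every atom pair is exactly 3 Å apart.
        double[][] a = {{0, 0, 0}, {3.8, 0, 0}};
        double[][] b = {{0, 0, 3}, {3.8, 0, 3}};
        System.out.println(rmsd(a, b));                          // 3.0
        System.out.println(percentIdentity("MKV-LA", "MRVQ-A")); // 75.0
    }
}
```

A low identity combined with a low RMSD, as in the GFP and Nidogen-1 case, is the pattern that sequence comparison alone would miss.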
Open Science Grid
Based on the FATCAT (rigid) algorithm: Yuzhen Ye & Adam Godzik. Flexible structure alignment by chaining aligned fragment pairs allowing twists. Bioinformatics 19 (suppl. 2): ii246-ii255, 2003.
Systematic comparisons of representative chains from 40% sequence identity clusters
22000 sequence clusters
33000 representative domains
PDB
Custom Job Management
Java Clients can run anywhere
Open Science Grid
Sends out instructions to clients
Writes results to disk
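The pattern on this slide (a server that hands out instructions, clients that can run anywhere, results collected centrally) can be sketched as follows. This is not the RCSB PDB implementation. `PairJobServer`, its methods, and the placeholder score are hypothetical, client machines are simulated with threads, and results are kept in memory rather than written to disk.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Hypothetical sketch of a pull-based job server for all-vs-all pair jobs. */
final class PairJobServer {
    private final ConcurrentLinkedQueue<int[]> pending = new ConcurrentLinkedQueue<>();
    private final List<String> results = new ArrayList<>();

    /** Queues every unordered pair (i, j) with i < j among n representatives. */
    PairJobServer(int n) {
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                pending.add(new int[] {i, j});
            }
        }
    }

    /** Hands out up to batchSize pairs; an empty list means no work is left. */
    List<int[]> nextBatch(int batchSize) {
        List<int[]> batch = new ArrayList<>();
        int[] pair;
        while (batch.size() < batchSize && (pair = pending.poll()) != null) {
            batch.add(pair);
        }
        return batch;
    }

    /** Collects result lines; a real server would append them to a file. */
    synchronized void submit(List<String> lines) {
        results.addAll(lines);
    }

    synchronized int resultCount() {
        return results.size();
    }

    /** Runs simulated clients (threads) until the queue is drained. */
    static int runAll(int n, int clients, int batchSize) {
        PairJobServer server = new PairJobServer(n);
        List<Thread> threads = new ArrayList<>();
        for (int c = 0; c < clients; c++) {
            Thread t = new Thread(() -> {
                List<int[]> batch;
                while (!(batch = server.nextBatch(batchSize)).isEmpty()) {
                    List<String> out = new ArrayList<>();
                    for (int[] p : batch) {
                        // Placeholder for the actual structure alignment of the pair.
                        out.add(p[0] + "\t" + p[1] + "\tplaceholder-score");
                    }
                    server.submit(out);
                }
            });
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for clients", e);
            }
        }
        return server.resultCount();
    }

    public static void main(String[] args) {
        System.out.println(runAll(10, 4, 7)); // 45 pairs for 10 representatives
    }
}
```

Letting clients pull batches, rather than having the server push work, means slow or vanished grid nodes simply stop asking for more. A production system would also need to re-issue batches that were handed out but never returned, which this sketch omits.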
Initial calculation of frozen snapshot of PDB
~170k CPU hours on OSG
Incremental weekly updates (~1-2 million alignments)
<1000 CPU hours
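As a rough back-of-the-envelope check, the CPU budgets above can be converted into an average cost per alignment. The sketch assumes the initial ~170k CPU hours correspond to roughly one billion alignments, which the slides suggest but do not state explicitly. The weekly figures are treated as upper bounds because the slide gives "<1000 CPU hours". `CpuBudget` is a hypothetical helper name.

```java
import java.util.Locale;

/** Hypothetical helper for converting a CPU budget into cost per alignment. */
final class CpuBudget {

    /** Average CPU seconds spent per alignment. */
    static double secondsPerAlignment(double cpuHours, double alignments) {
        if (alignments <= 0) {
            throw new IllegalArgumentException("alignments must be positive");
        }
        return cpuHours * 3600.0 / alignments;
    }

    public static void main(String[] args) {
        // Initial run: ~170k CPU hours, assumed to cover ~1 billion alignments.
        System.out.println(String.format(Locale.ROOT, "%.3f", secondsPerAlignment(170_000, 1e9))); // 0.612
        // Weekly update: under 1000 CPU hours for 1 to 2 million alignments (upper bounds).
        System.out.println(String.format(Locale.ROOT, "%.3f", secondsPerAlignment(1_000, 1e6)));   // 3.600
        System.out.println(String.format(Locale.ROOT, "%.3f", secondsPerAlignment(1_000, 2e6)));   // 1.800
    }
}
```

Under these assumptions the initial run averages well under one CPU second per alignment, while the weekly numbers are only ceilings and say little about the actual per-alignment cost.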
Code www.biojava.org
1 billion alignments available freely at
www.rcsb.org
BioJava
• Major rewrite - BioJava 3
BioJava 1 BioJava 3
core data model
symbols/alphabets, counts, distributions
Genome/sequencing
Multiple sequence alignment
Structure alignment
Modfinder
AA Properties
Protein Disorder
Hmmer3 WS
NCBI WS
Parsers: Genbank/Embl/Blast
Acknowledgments
• Spencer Bliven
• Peter Rose
• Phil Bourne
• all contributors
• A. Yates, J. Jacobsen, P. Troshin, M. Chapman, J. Gao, C.H. Koh, S. Foisy, R. Holland, G. Rimsa, M. Heuer, H. Brandstaetter-Mueller, S. Willis
RCSB PDB BioJava
Funding: RCSB PDB, Google Summer of Code, Open Science Grid