EMBO workshop, 26 Sept. 2008: Tutorials for PDBj Search Tools


http://www.pdbj.org/

http://www.protein.osaka-u.ac.jp/rcsfp/pi/

Haruki Nakamura

Institute for Protein Research, Osaka University

Tutorials for PDBj Search Tools

EMBO workshop, 26 Sept. 2008

Protein Data Bank Japan

http://www.pdbj.org/

Based at the Institute for Protein Research, Osaka Univ., since 2001, with support from the Institute for Bioinformatics Research and Development, Japan Science and Technology Agency (BIRD-JST).

Structure Data curation and editing

Structure Data browsing and downloading

PDBj members at IPR, Osaka Univ.


Processed data numbers at PDBj

[Figure: yearly registration number (vertical axis, 0 to 8000) by year (1972 to 2007), comparing the yearly wwPDB processed number with the yearly PDBj processed number.]

We process 25-30% of the data deposited worldwide, mainly from the Asia and Oceania regions.

Total: 52,535 as of August 19, 2008


Get Entry Data

Access http://www.pdbj.org/

Enter a PDBID (e.g. 1gof) in the box and press "GO".

The summary for that PDBID is displayed.

Graphic viewer: jV version 3.6

Access http://www.pdbj.org/jV/

Information for each Entry: Structural Details

Name of the molecule(s)

Molecular weight(s)

Keywords

etc.

Details of the structure

Information for each Entry: Experimental Details

Experimental method

(X-ray, NMR, EM, Neutron)

Parameters for the crystal

Crystallization conditions

etc.

Details of the experiment

Information for each Entry: Functional Details

Gene ontology information

Ligand binding

Functional site

etc.

Details of the function

Information for each Entry: Sequence Neighbor

Result of BLAST search

Sequence Navigator is used.

PDBID list of homologs

Information for each Entry: Download/Display

Conventional PDB format

Conventional PDB header

mmCIF

PDBML

PDBML without coordinates

PDBML for only coordinates

Structure factor

Download or display of the archival data

Information for each Entry: Link

RCSB-PDB, MSD-EBI

CATH, SCOP, FSSP: folds

UniProt: Sequences

KEGG: Pathways

EzCatDB: enzymes

etc.

Link to other databases

Advanced Search

From "Advanced Search" on the top page

Author names & Journal

Experimental method

Ligand name

Residues

Resolution

Species

etc.

Search by many conditions

XQuery/XPath Search

From the “xPSSS (xml-based Protein Structure Search Service)” page

XML based search

XQuad: help by XQuery advisor

Search by XQuery/XPath

Search for all entries with helix of length equal to 10 residues:
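The query shown on the original slide is not preserved in this text. As a rough illustration of the same search logic, the following Python sketch filters helix records by length in a small PDBML-like fragment. The sample fragment, its namespace URI, and the function name are assumptions made for this example. The element names struct_conf and pdbx_PDB_helix_length follow PDBx dictionary naming, but they should be checked against the actual PDBML schema before use. The sketch runs locally and does not call xPSSS.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI, used only for this self-contained sample.
NS = {"PDBx": "http://example.org/pdbx-sample"}

# Hand-written sample data (not taken from a real PDB entry).
SAMPLE = """\
<PDBx:datablock xmlns:PDBx="http://example.org/pdbx-sample" datablockName="SAMPLE">
  <PDBx:struct_confCategory>
    <PDBx:struct_conf id="HELX_P1">
      <PDBx:pdbx_PDB_helix_length>10</PDBx:pdbx_PDB_helix_length>
    </PDBx:struct_conf>
    <PDBx:struct_conf id="HELX_P2">
      <PDBx:pdbx_PDB_helix_length>7</PDBx:pdbx_PDB_helix_length>
    </PDBx:struct_conf>
    <PDBx:struct_conf id="HELX_P3">
      <PDBx:pdbx_PDB_helix_length>10</PDBx:pdbx_PDB_helix_length>
    </PDBx:struct_conf>
  </PDBx:struct_confCategory>
</PDBx:datablock>
"""


def helices_with_length(xml_text, length):
    """Return the ids of struct_conf records whose helix length equals `length`."""
    root = ET.fromstring(xml_text)
    hits = []
    for conf in root.iter("{%s}struct_conf" % NS["PDBx"]):
        text = conf.findtext("PDBx:pdbx_PDB_helix_length", namespaces=NS)
        if text is not None and int(text) == length:
            hits.append(conf.get("id"))
    return hits


print(helices_with_length(SAMPLE, 10))
```

A database-side XQuery would apply the same condition across all entries and return PDBIDs rather than helix ids.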

XQuad: XQuery Advisor


List of hit PDBIDs is displayed.

Search for Similar Sequences: Sequence Navigator

Enter a PDBID or an amino-acid sequence and press "Find All Homologs".

Structural alignment: GASH

Enter a PDBID or a PDB-format file name and press "Superimpose".

The optimal structural alignment is displayed.

Search for Similar Structures: Structure Navigator

Enter a PDBID and press "Start Structure Navigator".

List of hit PDBIDs is displayed.

Additional Information in our XML database: PDBMLplus (as of August 19, 2008)

Total number in PDBMLplus: 52,535
GO Information (Biological Process, Molecular Function, Cellular Component): 20,186
Extracted from literature by annotators: 20,040
Information on binding site residues from HETATM: 29,006
Function Information from UniProt (ACT_SITE, BINDING, DNA_BIND, NP_BIND, ZN_FING, TRANSMEM): 28,243
Function Information from CATRES/extCATRES -EBI- and CSA -EBI-: 3,174 and 18,668
Primary Citation Information: 48,968

Addition of Data in PDBMLplus

<exptl>
  <method>SYNCHROTRON RADIATION</method>
  <crystal id="1">
    <grow auth_validate="N" update_id="6">
      <method auth_validate="N" update_id="6">Microdialysis</method>
      <temp auth_validate="N" unit="&#x2103;" update_id="6">4</temp>
      <pH auth_validate="N" update_id="6">4</pH>
    </grow>
    <grow_comp id="1" auth_validate="N" update_id="6">
      <sol_id auth_validate="N" update_id="6">1</sol_id>
      <name auth_validate="N" type="common name" update_id="6">protein</name>
      <conc auth_validate="N" unit="mg/ml" update_id="6">13</conc>
    </grow_comp>
    <grow_comp id="2" auth_validate="N" update_id="6">
      <sol_id auth_validate="N" update_id="6">2</sol_id>
      <name auth_validate="N" type="common name" update_id="6">ammonium sulphate</name>
      <conc auth_validate="N" unit="%sat" update_id="6">70</conc>
    </grow_comp>
    :
    :
  </crystal>
</exptl>
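To show how the added crystallization data could be read programmatically, here is a minimal Python sketch that parses a copy of the fragment above with the standard library. The function name and the shape of the returned dictionary are assumptions made for this example. The elided rows are left out so that the sample parses, and the temperature unit is written as the character reference for ℃, which is an assumption about how the original file encodes it.

```python
import xml.etree.ElementTree as ET

# Copy of the slide fragment; elided rows (":") omitted so the sample is well-formed.
FRAGMENT = """\
<exptl>
  <method>SYNCHROTRON RADIATION</method>
  <crystal id="1">
    <grow auth_validate="N" update_id="6">
      <method auth_validate="N" update_id="6">Microdialysis</method>
      <temp auth_validate="N" unit="&#x2103;" update_id="6">4</temp>
      <pH auth_validate="N" update_id="6">4</pH>
    </grow>
    <grow_comp id="1" auth_validate="N" update_id="6">
      <sol_id auth_validate="N" update_id="6">1</sol_id>
      <name auth_validate="N" type="common name" update_id="6">protein</name>
      <conc auth_validate="N" unit="mg/ml" update_id="6">13</conc>
    </grow_comp>
    <grow_comp id="2" auth_validate="N" update_id="6">
      <sol_id auth_validate="N" update_id="6">2</sol_id>
      <name auth_validate="N" type="common name" update_id="6">ammonium sulphate</name>
      <conc auth_validate="N" unit="%sat" update_id="6">70</conc>
    </grow_comp>
  </crystal>
</exptl>
"""


def crystallization_summary(xml_text):
    """Collect growth method, temperature, pH, and solution components."""
    root = ET.fromstring(xml_text)
    crystal = root.find("crystal")
    grow = crystal.find("grow")
    temp = grow.find("temp")
    summary = {
        "method": grow.findtext("method"),
        "temperature": (float(temp.text), temp.get("unit")),
        "pH": float(grow.findtext("pH")),
        "components": [],
    }
    for comp in crystal.findall("grow_comp"):
        conc = comp.find("conc")
        summary["components"].append(
            (comp.findtext("name"), float(conc.text), conc.get("unit"))
        )
    return summary


summary = crystallization_summary(FRAGMENT)
print(summary["method"], summary["pH"], summary["components"])
```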

Advanced usage of jV version 3 with xPSSS

Example for 12as with the functional site information

Command:

show xps3

xPSSS (xml-based Protein Structure Search Service)

[Figure: xPSSS system diagram. Labeled components: Archive (RCSB-PDB/MSD-EBI/PDBj); download (FTP); downloader; Loader; PDBML; Add Information; Filtering & Reconstructing; PDBMLplus; PDBMLplusF; Native XML-DB; XSLT processor; Web server; FTP server; Internet; Browser; xPSSS.
Function/Source Information: DDBJ, Swiss-Prot/UniProt, PIR, GenBank, KEGG, GDB, ProTherm, EzCatDB, EBI/CSA/CATRES.
Get/Input Tools: CATRES Data; Annotation Data; manual input from literature; Primary Citation DB with PDF files.]

Primary Citation Database

For internal use within PDBj only

Web input tool

18,814 PDF files have been collected.

Development of other Databases and Services

Protein Molecular Surface Database, eF-site (Kinoshita & Nakamura)

Protein Dynamics Database, ProMode (Wako & Endo)

BioMagResBank: NMR experimental data (Akutsu, Harano & Nakatani)

Search for Similar Surface, eF-seek (Kinoshita & Nakamura)

Electron Microscopy Navigator, EM-Navi (Suzuki)

Encyclopedia of Protein Structures, eProtS (Kinjo, Kudo & Ito)

Protein Globe

(Kinjo, A. R. & Standley, D. M.)

Standley, D. M. et al., Brief. Bioinfo. (2008) 9, 276-285.

Protein Globe

By Akira Kinjo & Daron Standley

http://www.pdbj.org/Globe/


All-α

All-β

α/β

eF-site/eF-surf/eF-seek

(Kinoshita, K. & Nakamura, H.)

Kinoshita, K. et al., Nucl. Acids Res. (2007) 35, W398-W402.
Kinoshita, K. & Nakamura, H., Protein Sci. (2005) 14, 711-718.
Kinoshita, K. & Nakamura, H., Bioinformatics (2004) 20, 1329-1330.

eF-site database: http://ef-site.hgc.jp

Almost all PDB entries are calculated.
Individual subunits are calculated.
Each model of an NMR structure is calculated.

Molecular surface and electrostatic potential

Connolly surface (molecular surface)

Dielectric constant: 80.0 (solvent)

Dielectric constant: 2.0 (protein interior)

Charges: AMBER partial charges

Grid size: 1.0 Å

Ionic strength: 0.1 M

[Figure labels: probe sphere, solvent accessible surface, protein core, re-entrant surface]
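To give a feel for what the ionic strength setting means in a Poisson-Boltzmann calculation, the sketch below computes the Debye screening length for a 1:1 electrolyte from the slide's values (ionic strength 0.1 M, solvent dielectric constant 80.0). The temperature of 298.15 K is an assumption, since the slide does not state one, and the function name is hypothetical. This is a textbook formula, not a description of how eF-site itself is implemented.

```python
import math

# Physical constants in SI units.
EPSILON_0 = 8.8541878128e-12   # vacuum permittivity, F/m
K_B = 1.380649e-23             # Boltzmann constant, J/K
E_CHARGE = 1.602176634e-19     # elementary charge, C
N_A = 6.02214076e23            # Avogadro constant, 1/mol


def debye_length_angstrom(ionic_strength_molar, relative_permittivity, temperature_kelvin):
    """Debye screening length in angstrom for a 1:1 electrolyte."""
    ionic_strength_si = ionic_strength_molar * 1000.0  # mol/L -> mol/m^3
    numerator = EPSILON_0 * relative_permittivity * K_B * temperature_kelvin
    denominator = 2.0 * N_A * E_CHARGE ** 2 * ionic_strength_si
    return math.sqrt(numerator / denominator) * 1e10  # m -> angstrom


# Slide values: ionic strength 0.1 M, solvent dielectric constant 80.0.
# Temperature is an assumed value (298.15 K).
length = debye_length_angstrom(0.1, 80.0, 298.15)
print(f"Debye length: {length:.2f} angstrom")
```

Under these assumptions the screening length comes out a little under 10 Å, so charges much farther apart than that are strongly screened by the solvent ions.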

What can we see from molecular surface?

[example] Myb proto-oncogene protein

[Figure panels: DNA-bound, DNA-unbound, DNA-bound]

eF-site ID

PID_ModelID-ChainID (PID is the PDB ID)
– example: 1a1t_3-A

ModelID is omitted when the entry has no ModelID
– example: 1tup-C

Alphabetic Chain IDs for multiple chains
– example: 1tup-ABCEF

Link to individual surfaces
– For each eF-site ID
• http://ef-site.hgc.jp/eF-site/servlet/Summary?entry_id=1tup-EF
– For each PDB-ID
• http://ef-site.hgc.jp/eF-site/servlet/Search?pdb=1tup
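The ID convention and link patterns above can be sketched in Python as follows. The regular expression is an assumption inferred only from the three examples on the slide (a four-character PDB ID, an optional numeric ModelID after an underscore, then one or more chain characters after a hyphen), and the function names are hypothetical. The two URL patterns are copied from the slide, and the code only builds strings without contacting the server.

```python
import re

# Pattern inferred from the slide examples; an assumption, not an official grammar.
EFSITE_ID = re.compile(
    r"^(?P<pdb>[0-9A-Za-z]{4})(?:_(?P<model>\d+))?-(?P<chains>[A-Za-z0-9]+)$"
)

BASE = "http://ef-site.hgc.jp/eF-site/servlet"


def parse_efsite_id(efsite_id):
    """Split an eF-site ID into (PDB ID, ModelID or None, chain IDs)."""
    match = EFSITE_ID.match(efsite_id)
    if match is None:
        raise ValueError(f"not a recognizable eF-site ID: {efsite_id!r}")
    model = match.group("model")
    return match.group("pdb"), (int(model) if model else None), match.group("chains")


def summary_url(efsite_id):
    """Link to the summary page for one eF-site ID."""
    parse_efsite_id(efsite_id)  # raises ValueError on malformed input
    return f"{BASE}/Summary?entry_id={efsite_id}"


def search_url(pdb_id):
    """Link to the list of surfaces for one PDB ID."""
    return f"{BASE}/Search?pdb={pdb_id}"


for example in ("1a1t_3-A", "1tup-C", "1tup-ABCEF"):
    print(example, parse_efsite_id(example))
print(summary_url("1tup-EF"))
print(search_url("1tup"))
```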

Summary Page for each Entry

Surface Browsing with jV

Download of each data file

Link to other DBs

Structure Page for surface and structure browsing

Structure based function prediction

Functional site database: local structures of the functional sites of proteins

Similarity search

Goal
– To predict the molecular function of proteins from their 3D structures

Approach
– To search for similar structures in the functional site database (local structure)

Structural information
– Molecular surface generated by Connolly’s algorithm
– Electrostatic potential obtained by solving the Poisson-Boltzmann equation numerically (color scale: -0.1 to +0.1 V)

Protein of unknown function

Prediction of Ligand Binding Sites: eF-seek

http://ef-site.hgc.jp/eF-seek

Prediction of functional sites by similarity search against eF-site

Search for representative ligand binding sites

For an uploaded PDB-format file, putative functional sites are predicted and the predicted complex structures are returned.

ProMode

(Wako, H. & Endo, S.)

Wako et al., Bioinformatics (2004) 20, 2035-2043.

Database of Normal Mode Analysis of Proteins

http://promode.socs.waseda.ac.jp/

Command window of jV.

A protein structure vibrating in a given normal mode can be observed (animation displayed by jV and Chime viewer).

Dynamic domains (blue and red regions) are defined for each normal mode.


Time average properties obtained by the normal mode analysis are shown.

Fluctuation of atoms

Fluctuation of dihedral angles

Correlations between fluctuations of atoms
Red: atom pairs with a strong positive correlation
Blue: atom pairs with a strong negative correlation

Example of a tetramer

Fluctuation of atoms

Internal (green) and external (blue) motions of each subunit are shown for oligomer data

New!

EM Navigator

(Suzuki, H.)

What’s EM Navigator?

EM Navigator is
• a web site for browsing 3D electron microscopy (3D-EM) data
• based on data from EM Data Bank (EMDB) and Protein Data Bank (PDB)
• for non-specialists, beginners, and experts in 3D-EM or structural / molecular biology.

URL: http://emnavi.protein.osaka-u.ac.jp/

Top page with “Movie Slots” and text-search box

Enjoy viewing 3D structures

Interactive structure viewer (jV / Jmol) on Detail page (Data: PDB-ID-1GRU, requires Java Runtime Environment)

Interactive movie player on Movie page (Data: EMDB-ID-1508, requires Adobe Flash Player)

Enter into details

Detail page for EMDB data (ID: 1261)

Table page to view details of multiple data

Detail page for PDB data (ID: 2J37)

Search, view, and enjoy 3D-EM data !

SeSAW: Sequence-derived Structure Alignment Weights for identifying functional sites

• A way of comparing sequence and structure similarities between proteins

• Structural similarities measured using ASH structural alignment program

• Sequence similarities measured using position specific scoring matrices (PSSMs) from psiBLAST

(Standley, D. M.)

Standley, D. M. et al., PROTEINS (2008) 72, 1333-1351.

$$S_{Target} = \sum_{m=1}^{N_{align}} \left[ S_{ASH} + \exp\left(-(d_m/d_0)^2\right) \left\{ w_{Blosum} S_{Blosum} + w_{PSSM} S_{PSSM} \right\} \right]$$

S_ASH: score for structural similarity
S_Blosum: Blosum62 score
S_PSSM: score from PSSM values
d_m: distance between the Cα atom pairs in the aligned structures
d_0: 4 Å
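The distance weighting in the score can be illustrated with a short Python sketch. It covers only the part that the slide states clearly: each aligned position contributes its sequence terms scaled by exp(-(d_m/d_0)^2) with d_0 = 4 Å. The weights w_blosum and w_pssm, the sample values, and the function names are hypothetical, and the structural term S_ASH is not computed here.

```python
import math

D0 = 4.0  # angstrom, as given on the slide


def distance_weight(d_m, d0=D0):
    """Gaussian-like weight that decays with the C-alpha distance d_m."""
    return math.exp(-((d_m / d0) ** 2))


def weighted_sequence_term(aligned, w_blosum, w_pssm, d0=D0):
    """Sum distance-weighted sequence scores over aligned positions.

    aligned: iterable of (d_m, s_blosum, s_pssm) tuples, one per aligned pair.
    """
    return sum(
        distance_weight(d_m, d0) * (w_blosum * s_blosum + w_pssm * s_pssm)
        for d_m, s_blosum, s_pssm in aligned
    )


# Hypothetical aligned positions: (distance in angstrom, Blosum62 score, PSSM score).
positions = [(0.5, 4.0, 6.0), (2.0, 1.0, 3.0), (8.0, 5.0, 7.0)]
print(weighted_sequence_term(positions, w_blosum=1.0, w_pssm=1.0))
```

Closely superimposed residue pairs keep nearly their full sequence score, while a pair 8 Å apart contributes very little even if its sequence scores are high.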

Identification of Protein Families/Superfamilies

[Figure: ROC curves for Pfam family and SCOP/CATH superfamily identification.]

Confidence Measures

[Figure: confidence for Pfam family and SCOP/CATH superfamily identification; axes: S_Target and TP/(TP+FP).]
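The confidence measure TP/(TP+FP) as a function of score can be sketched as follows. The hit list is invented for illustration, the threshold rule (keep hits with score greater than or equal to the threshold) is an assumption, and the function name is hypothetical. This is not the procedure or data of the cited paper.

```python
def confidence_curve(hits, thresholds):
    """Compute TP/(TP+FP) among hits scoring at or above each threshold.

    hits: list of (score, is_true_positive) pairs.
    Returns a list of (threshold, confidence); confidence is None when no
    hit passes the threshold.
    """
    curve = []
    for threshold in thresholds:
        selected = [is_tp for score, is_tp in hits if score >= threshold]
        tp = sum(selected)
        fp = len(selected) - tp
        confidence = tp / (tp + fp) if selected else None
        curve.append((threshold, confidence))
    return curve


# Hypothetical hits: (score, same family as the query?).
hits = [(80, True), (60, True), (57, True), (40, False), (35, True), (20, False)]
for threshold, confidence in confidence_curve(hits, [20, 40, 60]):
    print(threshold, confidence)
```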

Example: Hypothetical protein TTHA1568 from Thermus thermophilus (2czl)

Co-crystallized with tartaric acid

2czl

2czlA Results

[Figure labels: SG Targets, His Binding, Glu Binding, Lys/His Binding]

Good match to 2czl: 1ii5, a glutamate-binding protein

2czlA Family: DUF191
1ii5A Family: SBP_bac_3
S_Target: 57

2czl and 1ii5 have a common binding site

[Figure panels: Tartaric acid, Glutamate]

G82

eProtS: Encyclopedia of Protein Structures

(Wiki-eProtS)

(Kudo, T. & Kinjo, A. R.)


http://eprots.pdbj.org/

Introduction and request for writing articles of Encyclopedia of Protein Structures (Wiki-eProtS)

Protein Data Bank (PDB): 52,535 entries

• select proteins
• annotate for the general audience (in English and Japanese)

eProtS: 322 entries

(as of Aug 20, 2008)

What’s Encyclopedia of Protein Structures (eProtS) ?

To communicate the accomplishments of structural biology to the general public and return them to society...

Example (α-amylase)

Protein name

Species

Biological context

Structure description

Links to PDBj (xPSSS, jV)

References: original paper, links to other databases, author, and translator

Members

Head: Haruki Nakamura

Group for PDB Database Curation: Atsushi Nakagawa, Takanori Matsuura, Reiko Igarashi, Yumiko Kengaku, Kanna Matsuura, Mayumi Inoue, Chen Minyu

Group for Development of New Tools and Services: Daron M. Standley, Akira R. Kinjo, Hirofumi Suzuki, Reiko Yamashita, Takahiro Kudou, Yukiko Shimizu

Group for NMR Database (BMRB-PDBj): Toshimichi Fujiwara, Hideo Akutsu, Eiichi Nakatani, Yoko Harano

Other Collaborators: Kengo Kinoshita (IMS, Univ. Tokyo), Hiroyuki Toh (MIB, Kyushu Univ.), Hiroshi Wako (Waseda Univ.), Nobutoshi Ito (Tokyo Med. Dent. Univ.)

Secretary: Chisa Kamada
