11/09/05 D Dobbs ISU - BCB 444/544X: Protein Structure Databases - cont. 1
11/9/05
Protein Structure Databases (continued)
Prediction & Modeling
Bioinformatics Seminars
Nov 10 Thurs 3:40 Com S Seminar in 223 Atanasoff
Computational Epidemiology
Armin R. Mikler, Univ. North Texas
http://www.cs.iastate.edu/~colloq/#t3
Nov 10 Thurs 4:10 EEOB Seminar in 210 Bessey
Diversity and Evolution of Plant Immunity Genes: Insights from Molecular Population Genetics
Peter Tiffin, Univ. of Minnesota
http://www.cbs.umn.edu/tiffin/index.html
Bioinformatics Seminars
CORRECTION:
Next week - Baker Center/BCB Seminars: (seminar abstracts available at above link)
Nov 14 Mon 1:10 PM Doug Brutlag, Stanford
Discovering transcription factor binding sites
Nov 15 Tues 1:10 PM Ilya Vakser, Univ Kansas
Modeling protein-protein interactions
Both seminars will be in Howe Hall Auditorium
Protein Structure & Function: Analysis & Prediction
Mon: Protein structure: basics; classification, databases, visualization
Wed: Protein structure databases - cont.
Thurs Lab: Protein structure databases; protein structure analysis & prediction
Fri: Protein structure prediction; protein-nucleic acid interactions
Reading Assignment (for Mon-Fri)
Mount Bioinformatics
• Chp 10 Protein classification & structure prediction
http://www.bioinformaticsonline.org/ch/ch10/index.html
• pp. 409-491
• Check Errata: http://www.bioinformaticsonline.org/help/errata2.html
Additional reading assignments for BCB 544:
• Gene Prediction: Burge & Karlin 1997 JMB 268:78
Prediction of complete gene structures in human genomic DNA
• Structure Prediction: Schueler-Furman…Baker, Science 310:638
Progress in modeling of protein structures and interactions
Review last lecture:
Protein Structure: Basics
Protein Structure & Function
• Amino acid characteristics
• Structural classes & motifs
• Protein functions & functional families (not much - more on this later)
• Classification
• Databases
• Visualization
Amino Acids
Each of the 20 different amino acids has a different "R-group" (side chain) attached to Cα
Peptide bond is rigid and planar
Hydrophobic Amino Acids
Charged Amino Acids
Polar Amino Acids
Certain side-chain configurations are energetically favored (rotamers)
Ramachandran plot: "Allowable" psi & phi angles
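The phi and psi angles plotted in a Ramachandran plot are backbone dihedral (torsion) angles: phi is defined by the atoms C(i-1), N(i), Cα(i), C(i), and psi by N(i), Cα(i), C(i), N(i+1). A minimal sketch of how one such angle could be computed from four atom positions is given here; the function names and the test coordinates are illustrative assumptions, not taken from the slides.

```python
import math

def sub(a, b):
    """Component-wise difference a - b of two 3-D points."""
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) about the p1-p2 bond.

    Uses the atan2 formulation: cis gives 0, trans gives 180.
    """
    b1 = sub(p1, p0)
    b2 = sub(p2, p1)
    b3 = sub(p3, p2)
    n1 = cross(b1, b2)              # normal to the plane p0, p1, p2
    n2 = cross(b2, b3)              # normal to the plane p1, p2, p3
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Made-up coordinates chosen so the answers are easy to check by eye
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1)))   # cis arrangement
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))   # quarter turn
```

For phi, the four points would be the coordinates of C(i-1), N(i), Cα(i), C(i) read from a structure file; for psi, N(i), Cα(i), C(i), N(i+1).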
Glycine is smallest amino acid
R group = H atom
• Glycine residues increase backbone flexibility because they have no bulky R group (only an H atom)
Proline is cyclic
• Proline residues reduce flexibility of polypeptide chain
• Proline cis-trans isomerization is often a rate-limiting step in protein folding
• Recent work suggests it may also regulate ligand binding in native proteins - Andreotti
Cysteines can form disulfide bonds
• Disulfide bonds (covalent) stabilize 3-D structures
• In eukaryotes, disulfide bonds are found mostly in secreted proteins or extracellular domains
Globular proteins have a compact hydrophobic core
Packing of hydrophobic side chains into interior is main driving force for folding
Problem? Polypeptide backbone is highly polar (hydrophilic) due to polar -NH and C=O in each peptide unit; these polar groups must be neutralized
Solution? Form regular secondary structures, e.g., α-helix, β-sheet, stabilized by H-bonds
Exterior surface of globular proteins is generally hydrophilic
Hydrophobic core formed by packed secondary structural elements provides compact, stable core
"Functional groups" of protein are attached to this framework; exterior has more flexible regions (loops) and polar/charged residues
Hydrophobic "patches" on protein surface are often involved in protein-protein interactions
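One common way to look for candidate buried (hydrophobic) versus exposed (hydrophilic) stretches in a sequence is a sliding-window hydropathy profile. The sketch below uses the Kyte-Doolittle hydropathy scale, which is an assumption brought in for illustration and is not mentioned in the slides; the window size and example sequence are also arbitrary choices.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic)
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_profile(seq, window=5):
    """Average hydropathy over each window of residues along the sequence.

    Raises KeyError for letters outside the 20 standard amino acids.
    """
    scores = [KD[aa] for aa in seq.upper()]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]

# Hypothetical sequence: a hydrophobic stretch flanked by charged residues
profile = hydropathy_profile("KKDDEILVLIAVLDDKK", window=5)
most_hydrophobic = max(range(len(profile)), key=profile.__getitem__)
print("most hydrophobic window starts at residue", most_hydrophobic + 1)
```

High-scoring windows only suggest hydrophobic segments; whether they are actually buried in the core depends on the 3-D structure.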
Protein Secondary Structures
α-Helices
β-Sheets
Loops
Coils
α-helix: stabilized by H-bonds between every ~4th residue in backbone
C = black, O = red, N = blue
Certain amino acids are "preferred" & others are rare in helices
• Ala, Glu, Leu, Met = good helix formers
• Pro, Gly, Tyr, Ser = very poor
• Amino acid composition & distribution varies, depending on location of helix in 3-D structure
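As a toy illustration of these preferences, the sketch below counts the good and poor helix formers named above in a sequence segment and combines them into a crude score. This is not a real secondary structure predictor (real methods use full per-residue propensity tables or machine learning); the function names and the scoring rule are assumptions made for this example.

```python
GOOD_HELIX_FORMERS = set("AELM")   # Ala, Glu, Leu, Met
POOR_HELIX_FORMERS = set("PGYS")   # Pro, Gly, Tyr, Ser

def helix_former_counts(seq):
    """Return (good, poor, other) residue counts for a one-letter sequence."""
    seq = seq.upper()
    good = sum(aa in GOOD_HELIX_FORMERS for aa in seq)
    poor = sum(aa in POOR_HELIX_FORMERS for aa in seq)
    return good, poor, len(seq) - good - poor

def crude_helix_score(seq):
    """(good - poor) / length: +1 if all good formers, -1 if all poor formers."""
    good, poor, _ = helix_former_counts(seq)
    return (good - poor) / len(seq)

# Hypothetical segments
for segment in ("AEELLKKMAEEL", "GPSGYPGSPG"):
    print(segment, round(crude_helix_score(segment), 2))
```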
β-sheets - also stabilized by H-bonds between backbone atoms
Anti-parallel Parallel
Loops
• Connect helices and sheets
• Vary in length and 3-D configurations
• Are located on surface of structure
• Are more "tolerant" of mutations
• Are more flexible and can adopt multiple conformations
• Tend to have charged and polar amino acids
• Are frequently components of active sites
• Some fall into distinct structural families (e.g., hairpin loops, reverse turns)
Coils
• Regions of 2° structure that are not helices, sheets, or recognizable turns
• Intrinsically disordered regions appear to play important functional roles
Globular proteins are built from recurring structural patterns
Motifs or supersecondary structures = combinations of 2° structural elements
Domains = combinations of motifs
• Independently folding unit (foldon)
• Functional unit
Simple motifs combine to form domains
6 main classes of protein structure
1) α Domains
• Bundles of helices connected by loops
2) β Domains
• Mainly antiparallel sheets, usually with 2 sheets forming sandwich
3) α/β Domains
• Mainly parallel sheets with intervening helices, also mixed sheets
4) α+β Domains
• Mainly segregated helices and sheets
5) Multidomain (α and β)
• Containing domains from more than one class
6) Membrane & cell-surface proteins
α-domain structures: 4-helix bundles
β-sheets: up-and-down sheets & barrels
α/β-domains: leucine-rich motifs can form horseshoes
New today:
Protein Structure
• Databases
• Classification
• Visualization
Protein Structure Prediction
• Secondary structure
• Tertiary structure
Protein sequence databases
• UniProt (SwissProt, PIR, EBI) http://www.pir.uniprot.org
• NCBI Protein http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Protein
More on these later: protein function prediction
Protein sequence & structure: analysis
• Diamond STING Millennium - many useful structure analysis tools, including Protein Dossier http://trantor.bioc.columbia.edu/SMS/
• SwissProt (UniProt) protein knowledgebase
http://us.expasy.org/sprot
• InterPro sequence analysis tools
http://www.ebi.ac.uk/interpro
Protein structure databases
• PDB Protein Data Bank http://www.rcsb.org/pdb/ (RCSB) - THE protein structure database
• MMDB Molecular Modeling Database http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Structure
(NCBI Entrez) - has "added" value
• MSD Molecular Structure Database http://www.ebi.ac.uk/msd
Especially good for interactions, binding sites
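Structures from these databases are often handled as legacy PDB-format text files, in which each atom is an ATOM (or HETATM) record with fields at fixed column positions. The sketch below shows one way such a record could be read by slicing columns; the helper names and the sample coordinates are made up for illustration, and in practice an established parser would normally be preferable to hand-written slicing.

```python
def format_atom_line(serial, name, res_name, chain, res_seq, x, y, z):
    """Build a minimal fixed-column ATOM record (illustration only).

    `name` should already be padded to 4 characters, e.g. " N  " or " CA ".
    """
    return (f"ATOM  {serial:>5} {name:<4} {res_name:>3} {chain:1}"
            f"{res_seq:>4}    {x:8.3f}{y:8.3f}{z:8.3f}")

def parse_atom_line(line):
    """Extract selected fields from an ATOM/HETATM record, else return None."""
    if not line.startswith(("ATOM", "HETATM")):
        return None
    return {
        "serial": int(line[6:11]),        # columns 7-11
        "name": line[12:16].strip(),      # columns 13-16
        "res_name": line[17:20].strip(),  # columns 18-20
        "chain": line[21],                # column 22
        "res_seq": int(line[22:26]),      # columns 23-26
        "x": float(line[30:38]),          # columns 31-38
        "y": float(line[38:46]),          # columns 39-46
        "z": float(line[46:54]),          # columns 47-54
    }

# Made-up atom, not taken from any real entry
line = format_atom_line(1, " CA ", "MET", "A", 1, 27.340, 24.430, 2.614)
print(line)
print(parse_atom_line(line))
```

Slicing by column rather than splitting on whitespace matters because adjacent fields in this format can run together without a separating space.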
Protein structure classification
• SCOP = Structural Classification of Proteins
Levels reflect both evolutionary and structural relationships
http://scop.mrc-lmb.cam.ac.uk/scop
• CATH = Classification by Class, Architecture, Topology & Homology
http://cathwww.biochem.ucl.ac.uk/latest/
• DALI/FSSP (recently moved to EBI & reorganized)
  • fully automated structure alignments
  • DALI server http://www.ebi.ac.uk/dali/index.html
  • DALI Database (fold classification) http://ekhidna.biocenter.helsinki.fi/dali/start
Protein structure visualization
• Molecular Visualization Freeware: http://www.umass.edu/microbio/rasmol
• MolviZ.Org http://www.umass.edu/microbio/chime
• Protein Explorer http://www.umass.edu/microbio/chime/pe/protexpl/frntdoor.htm
• RASMOL (& many descendants: Protein Explorer, PyMol, MolMol, etc.) http://www.umass.edu/microbio/rasmol/index2.htm
• CHIME http://www.umass.edu/microbio/chime/getchime.htm
• Cn3D http://www.biosino.org/mirror/www.ncbi.nlm.nih.gov/Structure/cn3d/
• Deep View = Swiss-PDB Viewer http://www.expasy.org/spdbv
PDB (RCSB) http://www.rcsb.org/pdb
RCSB PDB - Beta site http://pdbbeta.rcsb.org/pdb/Welcome.do
RCSB PDB - New Tutorial http://core1.rcsb.org/tutorial
NCBI Structure http://www.ncbi.nlm.nih.gov/Structure
MMDB http://www.ncbi.nlm.nih.gov/Structure/MMDB/mmdb.shtml
Cn3D http://www.ncbi.nlm.nih.gov/Structure/CN3D/cn3d.shtml
MMDB: Molecular Modeling Database
Derived from PDB structure records
Value added to PDB records, including:
• Integration with other ENTREZ databases & tools
• Conversion to parseable ASN.1 data description language
• Correction of numbering discrepancies in structure vs sequence
• Validation
• Addition of explicit chemical graph information
Structure neighbors determined by Vector Alignment Search Tool (VAST)
Searching MMDB
1CET
MMDB Structure Summary
Cn3D viewer
VAST neighbors
BLAST neighbors
Cn3D: Displaying 2° Structures
Chloroquine
Cn3D: Displaying 3° Structures
Chloroquine
Cn3D: Structural Alignments
Chloroquine
NADH
Protein Explorer (RasMol/Chime)
Protein Explorer
SCOP - Structure Classification
CATH - Structure Classification
Structural Genomics
~ 30,000 "traditional" genes in human genome (not counting: ???)
~ 3,000 proteins in a typical cell
> 2 million sequences in UniProt
> 33,000 protein structures in the PDB
Experimental determination of protein structure lags far behind sequence determination!
Goal: Determine structures of "all" protein folds in nature, using combination of experimental structure determination methods (X-ray crystallography, NMR, mass spectrometry) & structure prediction
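To put the gap between sequence and structure determination into one number, the figures quoted above (2005 values, both given as lower bounds) can be turned into a rough ratio. The function name below is just a label chosen for this sketch.

```python
def sequences_per_structure(n_sequences, n_structures):
    """Rough ratio of known sequences to solved structures."""
    return n_sequences / n_structures

# Figures from the slide: > 2 million UniProt sequences, > 33,000 PDB structures
ratio = sequences_per_structure(2_000_000, 33_000)
print(f"roughly {ratio:.0f} known sequences per solved structure")
```

The PDB also contains many redundant entries (the same protein solved several times), so the number of distinct proteins with a known structure would be smaller than the raw count suggests.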
Structural Genomics Projects
TargetDB: database of structural genomics targets http://targetdb.pdb.org
Protein Structure Prediction?
Protein Folding
"Major unsolved problem in molecular biology"
In cells:
• spontaneous
• assisted by enzymes
• assisted by chaperones
In vitro: many proteins fold spontaneously & many do not!
Steps in Protein Folding
1- "Collapse" - driving force is burial of hydrophobic aa's (fast - msecs)
2- Molten globule - helices & sheets form, but "loose" (slow - secs)
3- "Final" native folded state - compaction, some 2° structures rearranged
Native state? - assumed to be lowest free energy - may be an ensemble of structures
Protein Dynamics
• Protein in native state is NOT static
• Function of many proteins depends on conformational changes, sometimes large, sometimes small
• Globular proteins are inherently "unstable" (NOT evolved for maximum stability)
• Energy difference between native and denatured state is very small (5-15 kcal/mol) (this is equivalent to 1 or 2 H-bonds!)
• Folding involves changes in both entropy & enthalpy
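To get a feel for what a 5-15 kcal/mol energy difference means, the sketch below estimates the fraction of molecules that would be unfolded at equilibrium. It assumes a simple two-state model (native vs. denatured only) and room temperature, neither of which is stated in the slides; the function name is also an assumption.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

def fraction_unfolded(delta_g_unfold, temp_k=298.15):
    """Two-state estimate of the equilibrium fraction of unfolded molecules.

    delta_g_unfold = G(denatured) - G(native), in kcal/mol.
    """
    k_fold = math.exp(delta_g_unfold / (R_KCAL * temp_k))  # [native]/[denatured]
    return 1.0 / (1.0 + k_fold)

for dg in (5.0, 10.0, 15.0):
    print(f"dG = {dg:4.1f} kcal/mol -> fraction unfolded = {fraction_unfolded(dg):.2e}")
```

Because the population ratio depends exponentially on the energy difference, losing only a few kcal/mol of stabilization (for example through a mutation) can shift the equilibrium noticeably toward the denatured state.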
Protein Structure Prediction
• Structure is largely determined by sequence
BUT:
• Similar sequences can assume different structures
• Dissimilar sequences can assume similar structures
• Many proteins are multi-functional
• Protein folding: determination of folding pathways and prediction of tertiary structure are still largely unsolved problems