Protein Structure Prediction

Protein Structure Prediction. Why do we want to know protein structure? Classification Functional Prediction


Why do we want to know protein structure?
- Classification
- Functional prediction


What is protein structure?
- Primary: chains of amino acids
- Secondary: interaction between groups of amino acids
- Tertiary: the organization in three dimensions of all the atoms in a polypeptide
- Quaternary: the conformation assumed by a multimeric protein


Proteins are chains of amino acids joined by peptide bonds

The N-Cα-C sequence is repeated throughout the protein, forming the backbone

The bonds on each side of the Cα atom are free to rotate within spatial constraints; the angles of these bonds determine the conformation of the protein backbone

The R side chains also play an important structural role

Polypeptide chain

The structure of two amino acids

Primary Structure
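The backbone rotation angles described above are torsion (dihedral) angles, each defined by four consecutive backbone atoms: phi by C(i-1), N, Cα, C and psi by N, Cα, C, N(i+1). The following is a minimal sketch of how such an angle could be computed from atomic coordinates. The function name `dihedral` and the tuple-based coordinates are illustrative choices, not part of the original slides.

```python
import math

def dihedral(p0, p1, p2, p3):
    """Signed torsion angle in degrees about the p1-p2 bond.

    Each point is an (x, y, z) tuple. The sign is positive when, looking
    along p1 -> p2, the front bond must turn clockwise to eclipse the
    rear bond.
    """
    def sub(a, b):
        return tuple(x - y for x, y in zip(a, b))

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    b0 = sub(p1, p0)  # first bond
    b1 = sub(p2, p1)  # central bond (rotation axis)
    b2 = sub(p3, p2)  # last bond

    n1 = cross(b0, b1)  # normal of the plane through p0, p1, p2
    n2 = cross(b1, b2)  # normal of the plane through p1, p2, p3

    b1_length = math.sqrt(dot(b1, b1))
    x = dot(n1, n2)
    y = dot(b1, cross(n1, n2)) / b1_length
    return math.degrees(math.atan2(y, x))
```

Calling `dihedral` with the coordinates of C(i-1), N, Cα, C of a residue would give its phi angle; N, Cα, C, N(i+1) would give psi.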


Interactions that occur between the C=O and N-H groups on amino acids

Much of the protein core comprises α helices and β sheets, folded into a three-dimensional configuration:

- regular patterns of H bonds are formed between neighboring amino acids
- the amino acids have similar angles
- the formation of these structures neutralizes the polar groups on each amino acid
- the secondary structures are tightly packed in a hydrophobic environment
- each R side group has a limited volume to occupy and a limited number of interactions with other R side groups

α-helix, β-sheet

Secondary Structure


α-helix

β-sheet

Secondary Structure


Other secondary structure elements (no standardized classification)

- loop

- random coil

- others (e.g. 3₁₀ helix, β-hairpin, paperclip)

Super-secondary structure

- In addition to secondary structure elements that apply to all proteins (e.g. α-helix, β-sheet), there are some simple structural motifs in some proteins

- These super-secondary structures (e.g. transmembrane domains, coiled coils, helix-turn-helix, signal peptides) can give important hints about protein function

Secondary Structure


Structural classification of proteins (SCOP; the numbered classes below follow the CATH scheme)

Class 1: mainly alpha

Class 2: mainly beta

Class 3: alpha/beta

Class 4: few secondary structures

Classification


Alternative SCOP

- Class α: only α helices
- Class β: antiparallel β sheets
- Class α/β: mainly β sheets with intervening α helices
- Class α+β: mainly segregated α helices with antiparallel β sheets
- Membrane structure: hydrophobic helices with membrane bilayers
- Multidomain: contain more than one class

More Classification


Q: If we have all the Psi and Phi angles in a protein, do we then have enough information to describe the 3-D structure?

Tertiary structure

A: No, because the detailed packing of the amino acid side chains is not revealed from this information. However, the Psi and Phi angles do determine the entire secondary structure of a protein

Protein Structure Review
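To illustrate how Psi and Phi angles map onto secondary structure, here is a rough sketch that assigns each residue to a helix-like or sheet-like region of the Ramachandran plot. The rectangular boundaries are simplified assumptions chosen for illustration (typical α-helix values lie near phi = -57, psi = -47 and typical β-strand values near phi = -120, psi = +130); they are not taken from the slides and real assignment tools use more careful criteria.

```python
# Rough, illustrative Ramachandran regions (assumed boundaries, in degrees).
HELIX_REGION = {"phi": (-100, -30), "psi": (-80, -10)}
SHEET_REGION = {"phi": (-180, -45), "psi": (90, 180)}

def in_region(phi, psi, region):
    """True if the (phi, psi) pair falls inside a rectangular region."""
    phi_low, phi_high = region["phi"]
    psi_low, psi_high = region["psi"]
    return phi_low <= phi <= phi_high and psi_low <= psi <= psi_high

def classify_residue(phi, psi):
    """Return 'helix', 'sheet' or 'other' for one residue."""
    if in_region(phi, psi, HELIX_REGION):
        return "helix"
    if in_region(phi, psi, SHEET_REGION):
        return "sheet"
    return "other"

def secondary_structure_string(angles):
    """Encode a list of (phi, psi) pairs as H (helix), E (sheet) or -."""
    codes = {"helix": "H", "sheet": "E", "other": "-"}
    return "".join(codes[classify_residue(phi, psi)] for phi, psi in angles)
```

Note that this only labels the backbone conformation residue by residue; as the answer above says, it tells us nothing about how the side chains pack in three dimensions.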


Secondary-Structure Prediction Programs
- PSI-pred
- JPRED consensus prediction (includes many of the methods given below)
- DSC
- PREDATOR
- PHD
- ZPRED
- nnPredict
- BMERC PSA
- SSP


The tertiary structure describes the organization in three dimensions of all the atoms in the polypeptide

The tertiary structure is determined by a combination of different types of bonding (covalent bonds, ionic bonds, hydrogen bonding, hydrophobic interactions, van der Waals forces) between the side chains

Many of these bonds are very weak and easy to break, but hundreds or thousands working together give the protein structure great stability

If a protein consists of only one polypeptide chain, this level then describes the complete structure

Tertiary Structure


Proteins can be divided into two general classes based on their tertiary structure:

- Fibrous proteins have an elongated structure, with the polypeptide chains arranged in long strands. This class of proteins serves as a major structural component of cells. Examples: silk, keratin, collagen

- Globular proteins have more compact, often irregular structures. This class of proteins includes most enzymes and most proteins involved in gene expression and regulation

Tertiary Structure


The quaternary structure defines the conformation assumed by a multimeric protein. The individual polypeptide chains that make up a multimeric protein are often referred to as protein subunits. Subunits are joined by ionic bonds, hydrogen bonds and hydrophobic interactions

Example: Haemoglobin (4 subunits)

Quaternary Structures


Common displays are (among others) cartoon, spacefill, and backbone

cartoon spacefill backbone

Structure Displays


Classic Approach to Determining Structure?

- Determine biochemical and cellular role of protein
- Purify protein
- Clone cDNA encoding protein
- Obtain protein by expression
- Experimentally determine 3D structure
- Infer function, mechanism of action


Structural Genomics Approach?

- Genomic DNA sequences
- Predict protein-coding genes
- Obtain protein by expression, then experimentally determine 3D structure
- Obtain protein in silico, then predict 3D structure
- Determine biochemical and cellular role of protein
- Homology searches (PSI-BLAST)


3-D macromolecular structures stored in databases

The most important database: the Protein Data Bank (PDB)

The PDB is maintained by the Research Collaboratory for Structural Bioinformatics (RCSB) and can be accessed at three different sites (plus a number of mirror sites outside the USA):

- http://rcsb.rutgers.edu/pdb (Rutgers University)
- http://www.rcsb.org/pdb/ (San Diego Supercomputer Center)
- http://tcsb.nist.gov/pdb/ (National Institute for Standards and Technology)

It is the very first “bioinformatics” database ever built

Sources of Protein Structure Information?


Researchers have been working for decades to develop procedures for predicting protein structure that are less time-consuming and not hindered by size and solubility constraints.

As protein sequences are encoded in DNA, it should in principle be possible to translate a gene sequence into an amino acid sequence, and to predict the three-dimensional structure of the resulting chain from this amino acid sequence

Computational Modeling

Structural Prediction
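The first half of that idea, translating a gene sequence into an amino acid sequence, is straightforward and can be sketched as follows. This is a minimal illustration that assumes the standard genetic code, a coding-strand DNA sequence already in the correct reading frame, and no introns; the function name `translate` is a choice made here, not something from the slides.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}

def translate(dna):
    """Translate coding DNA into one-letter amino acids up to the first stop.

    Codons containing unexpected characters are reported as 'X'.
    Trailing bases that do not form a full codon are ignored.
    """
    dna = dna.upper()
    protein = []
    for start in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE.get(dna[start:start + 3], "X")
        if residue == "*":  # stop codon ends the chain
            break
        protein.append(residue)
    return "".join(protein)
```

The second half, going from the amino acid sequence to a three-dimensional structure, is the hard part that the rest of this section is about.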


How to predict the protein structure?

Ab initio prediction of protein structure from sequence: not yet.

Problem: the information contained in protein structures lies essentially in the conformational torsion angles. Even if we assume that every amino acid residue has only three such torsion angles, and that each of these can only assume one of three "ideal" values (e.g., 60, 180 and -60 degrees), this still leaves us with 3^3 = 27 possible conformations per residue.

For a typical 200-amino acid protein, this would give 27^200 (roughly 1.87 x 10^286) possible conformations!

If we were able to evaluate 10^9 conformations per second, this would still keep us busy for about 4 x 10^259 times the current age of the universe

There are optimized ab initio prediction algorithms available, as well as fold recognition algorithms that use threading (comparing the sequence against known fold structures from databases), but the results are still very poor

Q: Can’t we just generate all these conformations, calculate their energy and see which conformation has the lowest energy?

Computational Modeling
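The size of this search space can be checked with a few lines of arithmetic. The sketch below reproduces the estimate under the slide's own assumptions (three torsion angles per residue, three values each, 200 residues, 10^9 evaluations per second). The age of the universe (about 1.38 x 10^10 years) and the number of seconds per year are values assumed here, since the slides do not state which figures were used.

```python
STATES_PER_ANGLE = 3             # "ideal" values per torsion angle
ANGLES_PER_RESIDUE = 3           # torsion angles per residue
RESIDUES = 200                   # length of a typical protein
EVALUATIONS_PER_SECOND = 10**9   # assumed evaluation rate from the slide

# Assumed constants, not given in the slides.
AGE_OF_UNIVERSE_YEARS = 1.38e10
SECONDS_PER_YEAR = 3.156e7

def exhaustive_search_estimate():
    """Return (conformations per residue, total conformations,
    search time as a multiple of the age of the universe)."""
    per_residue = STATES_PER_ANGLE ** ANGLES_PER_RESIDUE
    total = per_residue ** RESIDUES  # exact integer
    seconds_needed = float(total) / EVALUATIONS_PER_SECOND
    universe_seconds = AGE_OF_UNIVERSE_YEARS * SECONDS_PER_YEAR
    return per_residue, total, seconds_needed / universe_seconds

if __name__ == "__main__":
    per_residue, total, universe_ages = exhaustive_search_estimate()
    print(f"conformations per residue: {per_residue}")
    print(f"total conformations: {float(total):.2e}")
    print(f"search time in universe ages: {universe_ages:.1e}")
```

Under these assumptions the result is on the order of 10^259 universe ages, which is why exhaustive enumeration is not an option.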


Homology (comparative) modeling attempts to predict structure on the strength of a protein’s sequence similarity to another protein of known structure

Basic idea: a significant alignment of the query sequence with a target sequence from PDB is evidence that the query sequence has a similar 3-D structure (current threshold ~ 40% sequence identity). Multiple sequence alignment and pattern analysis can then be used to predict the structure of the protein

Homology Modeling
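As a small illustration of the identity threshold, the sketch below computes percent sequence identity from two already-aligned sequences and compares it with the ~40% figure quoted above. It assumes that gaps are written as "-" and that identity is counted over columns where both sequences have a residue; other definitions of the denominator exist, so treat this as one possible convention. The function names are illustrative.

```python
IDENTITY_THRESHOLD = 40.0  # percent, the approximate threshold quoted above

def percent_identity(aligned_query, aligned_template):
    """Percent identity over ungapped columns of a pairwise alignment."""
    if len(aligned_query) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for q, t in zip(aligned_query.upper(), aligned_template.upper()):
        if q == "-" or t == "-":
            continue  # skip columns containing a gap
        compared += 1
        if q == t:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0

def is_modeling_candidate(aligned_query, aligned_template):
    """True if the pair reaches the identity threshold."""
    return percent_identity(aligned_query, aligned_template) >= IDENTITY_THRESHOLD
```

In practice the alignment itself would come from a search tool, and its statistical significance matters as well as the raw identity.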


Computational modeling: summary

- Partial or full sequences predicted through gene finding
- Similarity search against proteins in PDB
- Alignment can be used to position the amino acids of the query sequence in the same approximate 3-D structure
- Find structures that have a significant level of structural similarity (but not necessarily significant sequence similarity)
- If member of a family with a predicted structural fold, multiple alignment can be used for structural modeling
- Inferred structural information (e.g. presence of small amino acid motifs; spacing and arrangement of amino acids; certain typical amino acid combinations associated with certain types of secondary structure) can provide clues as to the presence of active sites and regions of secondary structure
- Structural analyses in the lab (X-ray crystallography, NMR)

How do we do this?


3D Comparative Modeling

- Profile Methods: match sequences to folds by describing each fold in terms of the environment of each residue in the structure

- Threading Methods: match sequences to structure by considering pairwise interactions for each residue, rather than averaging them into an environmental class

- HMM Methods: the equivalent state corresponds to one structurally aligned position in a structural fold, including gaps
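The profile idea can be made concrete with a toy example. In the sketch below each fold is reduced to a string of per-position environment classes ("B" for buried, "E" for exposed), and a sequence is scored by how well each residue suits the environment at its position. The two environment classes, the scores of +1 and -1, the hydrophobic residue set and the fold names are all hypothetical values invented for illustration; real profile methods use richer environment descriptions and allow gaps in the alignment.

```python
# Hypothetical residue grouping, used only for this toy example.
HYDROPHOBIC = set("AVLIMFWC")

def environment_score(residue, environment):
    """Toy score: hydrophobic residues fit buried sites, polar ones exposed sites."""
    is_hydrophobic = residue in HYDROPHOBIC
    if environment == "B":  # buried position
        return 1 if is_hydrophobic else -1
    if environment == "E":  # exposed position
        return -1 if is_hydrophobic else 1
    raise ValueError(f"unknown environment class: {environment!r}")

def profile_score(sequence, fold_environments):
    """Sum of per-position scores for an ungapped sequence-to-fold match."""
    if len(sequence) != len(fold_environments):
        raise ValueError("sequence and fold must have the same length")
    return sum(environment_score(residue, environment)
               for residue, environment in zip(sequence.upper(), fold_environments))

def best_fold(sequence, folds):
    """Return the name of the fold with the highest profile score.

    `folds` maps a fold name to its environment string.
    """
    return max(folds, key=lambda name: profile_score(sequence, folds[name]))
```

For example, with `folds = {"fold_a": "BBEE", "fold_b": "EEBB"}`, the sequence `"LVKD"` scores higher against `fold_a`, because its two hydrophobic residues line up with the buried positions. A threading method would add terms that depend on which pairs of residues end up in contact, rather than scoring each position independently.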


Structural HMM