X-ray Diffraction

Stephen J Everse, Fall 2004

In order for an object to be visible under magnification, the wavelength (λ) of the light must be, roughly speaking, no larger than the object.

Obtaining images of macromolecules

Visible light (400-700 nm) cannot produce an image of protein molecules, in which bonded atoms are about 1.5Å apart (0.15 nm). Electromagnetic radiation of this wavelength falls into the X-ray range.

S. Doublié ‘00

X-rays

X-rays are just another form of electromagnetic radiation:

[Figure: the electromagnetic spectrum, running from X-rays (~1 Å = 0.1 nm) through ultraviolet, visible light (~400 nm), infrared and microwave to radio. From X-rays toward radio, energy and frequency go from high to low, wavelength from short to long, and resolving power (the ability to see detail) from high (atomic resolution) to low.]

{Electrons (~500 keV, as in an electron microscope) are not a form of electromagnetic radiation, but they still have wave-like character (de Broglie wavelength ~0.01 Å). Unlike photons (EM radiation), electrons are charged, so they fry the specimen faster.}

M. Rould ‘02

X-ray Crystallography

• A method for studying the three-dimensional, atomic structure of molecules. In this course we will concentrate on applications for biological macromolecules.

1. A protein crystal is placed in the x-ray beam.
2. The x-rays are diffracted by the electron clouds around the atoms.
3. The atomic structure can be deduced from the data.

Why can’t we visualize molecules directly?

A single molecule is a very weak scatterer of X-rays. Most of the X-rays will pass through the molecule without being diffracted. The diffracted rays are too weak to be detected.

Solution: Analyze diffraction from crystals instead of single molecules. A crystal is made of a three-dimensional repeat of ordered molecules (10^14) whose signals reinforce each other. The resulting diffracted rays are strong enough to be detected.

Unlike visible light, X-rays cannot be focused by lenses. The refractive index of X-rays in all materials is very close to 1.0.

Solution: Use a computer to simulate an image-reconstructing lens. In short, the computer plays the part of the objective lens, computing the image of the object, then displaying it on a screen.

Sylvie Doublié © 2000

The nature of crystals

Under certain circumstances, macromolecules (protein, DNA, RNA) can form crystals. The resulting crystal is a three-dimensional array of ordered molecules held together by noncovalent interactions.

Evidence that solution and crystal structure are similar:

1- NMR and X-ray crystallography have been used to determine the structure of the same molecule. The two methods produce similar models.
2- Many macromolecules are still functional in the crystalline state.

Most protein crystals contain 50-70% solvent.

S. Doublie ‘02

What is a Crystal?

An object formed by stacking a basic unit, the unit cell, in all three dimensions.

M. Rould ‘02

The Ideal Crystal: the ordered disposition of molecules such that there exists a regular repetition of a pattern in 3-D space, where this repetition extends over a distance equal to or greater than thousands of molecular dimensions.

The Real Crystal: a crystal with less than perfect periodicity; imperfections are often caused by impurities and the effects of non-zero temperatures.

The Protein Crystal: the crystal contains a high proportion of solvent, meaning that some molecules present are not in the crystalline state, but in the liquid state, creating disorder.

Supersaturation

A solution is supersaturated when it contains more of a dissolved substance than can normally be dissolved. This is a thermodynamically unstable state, achieved most often in protein crystallography by vapor diffusion or slow evaporation techniques.

Zone 1 - Metastable zone. The solution may not nucleate for a long time, but this zone will sustain growth. It is frequently necessary to add a seed crystal.

Zone 2 - Nucleation zone. Protein crystals nucleate and grow.

Zone 3 - Precipitation zone. Proteins do not nucleate but precipitate out of solution.

Diagram from the website for The University of Reading, Course FS460: Investigating Protein Structure and Function

Nucleation

The phenomenon whereby a “nucleus”, such as a dust particle, a tiny seed crystal, or, commonly in protein crystallography, a small protein aggregate, starts a crystallization process.

Common difficulties:

1. If supersaturation is too high, too many nuclei form, hence an overabundance of tiny crystals.

2. In supersaturated solutions that don’t experience spontaneous nucleation, crystal growth often only occurs in the presence of added nuclei or “seeds”.

Cessation of growth

Caused by the development of growth defects or the approach of the solution to equilibrium.

Mother liquor

The solution in which the crystal exists. This is often not the same as the original crystallization screening solution, but is instead the solution that exists after some degree of vapor diffusion, equilibration through dialysis, or evaporation.

Factors that affect crystallization

1) Purity of proteins

2) Protein concentration

3) Starting conditions (make-up of the protein solution)

4) Precipitating agent (precipitant)

5) Temperature

6) pH

7) Additives: Detergents, reducing agents, substrates, co-factors, etc.

Hanging/Sitting Drop Vapor Diffusion

Most popular method among protein crystallographers.

1. Crystal screen buffer is the well solution (0.5 - 1 mL).
2. The drop (on a siliconized glass cover slip) is 1/2 protein solution, 1/2 crystal screen buffer (0.5-4 µL). So, the concentration of precipitant in the drop is 1/2 the concentration in the well.
3. The cover slip is inverted over the top of the well and sealed with vacuum grease (airtight).
4. The precipitant concentration in the drop will equilibrate with the precipitant concentration in the well via vapor diffusion.
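The end point of the equilibration in step 4 can be estimated with a little arithmetic. The sketch below is only an idealization: it assumes that only water leaves the drop, that the well is large enough for its concentration to stay constant, and that no protein or precipitant is lost. The function name and the example numbers are illustrative and not taken from the lecture.

```python
def equilibrate_drop(protein_mg_ml, well_precipitant, drop_ul, protein_fraction=0.5):
    """Idealized vapor-diffusion end point for a hanging or sitting drop.

    Assumptions (not from the lecture): only water leaves the drop, the well
    concentration stays constant, and no protein or precipitant is lost.
    """
    # Concentrations right after mixing protein solution with well buffer.
    start_protein = protein_mg_ml * protein_fraction
    start_precipitant = well_precipitant * (1.0 - protein_fraction)
    # Water leaves until the drop precipitant matches the well, so the drop
    # volume shrinks by the ratio start_precipitant / well_precipitant.
    shrink = start_precipitant / well_precipitant
    final_volume = drop_ul * shrink
    final_protein = start_protein / shrink
    return start_precipitant, final_volume, final_protein


# 10 mg/mL protein stock, 2.0 M precipitant in the well, 2 uL drop mixed 1:1.
start, volume, protein = equilibrate_drop(10.0, 2.0, 2.0)
print(f"precipitant at start: {start} M, final drop: {volume} uL, final protein: {protein} mg/mL")
```

Under these assumptions a 1:1 drop loses half its volume, so the protein ends up at roughly twice its starting concentration in the drop while the precipitant rises to the well concentration.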

Interpreting the Results of the Crystallization Experiment

The Hampton Crystal Gallery: http://www.hamptonresearch.com/stuff/gallery.html

Experimental Set-Up

[Figure: X-ray source -> monochromator or mirrors -> X-ray beam -> crystal (mounted on the goniometer, in the cryostream) -> beam stop -> detector. Example sources: European Synchrotron Radiation Facility, Grenoble, France; Rigaku rotating copper anode (in-house source).]

S. Cates ‘02

How are X-rays produced?

X-rays in the useful range for crystallography (around 1 Å) can be produced by bombarding a metal target (most commonly copper or molybdenum) with electrons produced by a heated filament and accelerated by an electric field. A high energy electron collides with and displaces an electron from a low lying orbital in a target metal atom. Then an electron from a higher orbital drops into the resulting vacancy, emitting its excess energy as an X-ray photon.

S. Doublié © 2000

X-ray Generators - The Rotating Anode

[Photo: Rigaku rotating copper anode (in-house source)]

X-rays are generated by bombarding a rotating copper anode with electrons. This creates X-ray radiation consisting of two wavelengths characteristic of copper sources, 1.54 Å (Kα radiation) and 1.39 Å (Kβ radiation). Crystallographers usually use Kα radiation (the intensity is greater).
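To connect these wavelengths to photon energies, one can use E = hc/λ. The short sketch below is not part of the lecture; the constant hc ≈ 12.398 keV·Å is a standard value and the function name is an illustrative choice.

```python
HC_KEV_ANGSTROM = 12.39842  # Planck constant times speed of light, in keV * Angstrom


def photon_energy_kev(wavelength_angstrom):
    """Photon energy in keV for a wavelength given in Angstrom (E = hc / lambda)."""
    return HC_KEV_ANGSTROM / wavelength_angstrom


for name, wavelength in [("Cu K-alpha", 1.54), ("Cu K-beta", 1.39)]:
    print(f"{name}: {wavelength} A -> {photon_energy_kev(wavelength):.2f} keV")
```

The shorter Kβ wavelength corresponds to the more energetic photon, consistent with the inverse relationship between wavelength and energy.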

X-ray Generators - The Synchrotron

[Photo: European Synchrotron Radiation Facility, Grenoble, France]

Electrons (or positrons) are released from a particle accelerator into a storage ring. The trajectory of the particles is determined by their energy and the local magnetic field. Magnets of various types are used to manipulate the particle trajectory. When the particle beam is “bent” by the magnets, the electrons (or positrons) are accelerated toward the center of the ring. Charged particles moving under the influence of an accelerating field emit electromagnetic radiation, and when they are moving at relativistic speeds, the radiation emitted includes high energy x-ray radiation.

The oscillation equipment rotates the crystal about an axis (φ) perpendicular to the x-ray beam (and normal to the goniometer). The diffraction pattern from a crystal is a 3-D pattern, and the crystal must be rotated in order to observe all the diffraction spots.

Check out Bernhard Rupp’s Crystallography 101 website: http://www-structure.llnl.gov/Xray/101index.html

Detectors

1- Photographic film
Not much used anymore because of the availability of far more sensitive detectors. Superior resolution due to its fine grain, but limited dynamic range.

2- Image plates
Image plates are coated with a layer of inorganic storage phosphor. X-ray photons excite electrons in the material to higher energy levels. Part of the energy is emitted as fluorescence, but an appreciable amount of energy is retained in the material. The stored energy is released upon illumination with a red laser. Blue light is emitted and measured with a photomultiplier. The light emitted is proportional to the number of photons. Ten times more sensitive than film, with a dynamic range of 1:10^4-10^5.

S. Doublié © 2000

Diffraction

A characteristic of wave phenomena: whenever a wavefront encounters an obstruction that alters the amplitude or phase of a part of the wavefront, diffraction will occur. The components of the wavefront, both the unaffected and the altered, will interfere with one another, causing an observable energy-density distribution referred to as the diffraction pattern.

Interactions between X-rays and atoms

X-rays are scattered almost exclusively by the electrons in the atoms, not by the nuclei.

The incident electromagnetic wave exerts a force on the electrons. This causes the electrons to oscillate with the same frequency as the incident radiation. The oscillating electrons act as radiation scatterers and emit radiation at the same frequency as the incident radiation.

S. Doublié © 2000

When an incident x-ray beam hits a scatterer, scattered x-rays are emitted in all directions. Most of the scattered wavefronts are out of phase and interfere destructively. Some sets of wavefronts are in phase and interfere constructively.

A crystal is composed of many repeating unit cells in 3-dimensions, and therefore, acts like a 3-dimensional diffraction grating. The constructive interference from a diffracting crystal is observed as a pattern of points on the detector. The relative positions of these points are related mathematically to the crystal’s unit cell dimensions.

[Figure: destructive interference and constructive interference of wavefronts.]

[Figure: diffraction gratings and their diffraction patterns.]

Notice: when the diffraction grating spacing gets smaller, the pattern spacing gets larger (inverse relationship).

Bragg’s Law

$2d \sin\theta = n\lambda$

where λ = wavelength of incident x-rays, θ = angle of incidence, d = lattice spacing, n = integer

Spots are observed when the following conditions are met:
1. The angle of incidence = angle of scattering.
2. The path difference between rays scattered from adjacent lattice planes, 2d sin θ, is equal to an integer number of wavelengths.
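As a small worked example of Bragg’s law, the sketch below solves for θ given a lattice spacing d. The function name and the default wavelength (1.54 Å, the copper Kα value) are illustrative choices, not something prescribed by the lecture.

```python
import math


def bragg_angle_deg(d_spacing, wavelength=1.54, order=1):
    """Solve 2 d sin(theta) = n lambda for theta, in degrees.

    Returns None when n * lambda / (2 d) > 1, i.e. when no angle can satisfy
    the law and the reflection cannot be observed at this wavelength.
    """
    sin_theta = order * wavelength / (2.0 * d_spacing)
    if sin_theta > 1.0:
        return None
    return math.degrees(math.asin(sin_theta))


for d in (10.0, 3.0, 1.54, 0.5):
    theta = bragg_angle_deg(d)
    if theta is None:
        print(f"d = {d} A: not observable at 1.54 A")
    else:
        print(f"d = {d} A: theta = {theta:.2f} degrees")
```

Smaller spacings scatter to larger angles, which is the same inverse relationship seen with the diffraction gratings, and spacings below λ/2 cannot be observed at all.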

The Ewald Sphere

A tool to visualize the conditions under which Bragg’s law is satisfied and therefore a reflection (diffraction spot) will be observable.

This occurs when the surface of a sphere centered about the crystal with radius = 1/λ intersects with a point on the reciprocal lattice.

http://perch.cimr.cam.ac.uk/Course/Adv_diff2/Diffraction2.html#Ewald

Movie downloaded from An Interactive Course on Symmetry and Analysis of Crystal Structure by Diffraction, by Gervais Chapuis and Wes Hardaker

Unit Cell

A crystal’s unit cell dimensions are defined by six numbers: the lengths of the 3 axes, a, b, and c, and the three interaxial angles, α, β, and γ.

The convention for designating the reciprocal lattice defines its axes as a*, b*, and c*, and its interaxial angles as α*, β* and γ*.
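The six cell parameters are enough to compute the cell volume and the lengths of the reciprocal axes. The sketch below uses the standard general (triclinic) formulas; the function names and the example cell are illustrative and not from the lecture.

```python
import math


def unit_cell_volume(a, b, c, alpha, beta, gamma):
    """Volume of a unit cell from axis lengths (Angstrom) and interaxial angles (degrees)."""
    ca, cb, cg = (math.cos(math.radians(angle)) for angle in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1.0 - ca**2 - cb**2 - cg**2 + 2.0 * ca * cb * cg)


def reciprocal_axis_lengths(a, b, c, alpha, beta, gamma):
    """Lengths a*, b*, c* of the reciprocal axes, in 1/Angstrom."""
    volume = unit_cell_volume(a, b, c, alpha, beta, gamma)
    sa, sb, sg = (math.sin(math.radians(angle)) for angle in (alpha, beta, gamma))
    return b * c * sa / volume, a * c * sb / volume, a * b * sg / volume


# Hypothetical orthorhombic cell: all angles 90 degrees, so a* = 1/a and so on.
cell = (50.0, 60.0, 70.0, 90.0, 90.0, 90.0)
print(unit_cell_volume(*cell))
print(reciprocal_axis_lengths(*cell))
```

For the orthogonal example the reciprocal axis lengths reduce to 1/a, 1/b and 1/c, which makes the inverse relationship between real and reciprocal lattices easy to see.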

Asymmetric unit

Recall that the unit cell of a crystal is the smallest 3-D geometric figure that can be stacked without rotation to form the lattice. The asymmetric unit is the smallest part of a crystal structure from which the complete structure can be built using space group symmetry. The asymmetric unit may consist of only a part of a molecule, or it can contain more than one molecule, if the molecules are not related by symmetry.

Symmetry

"An object has a particular symmetry if the object looks exactly the same after applying the corresponding symmetry operation."

Types of Symmetry Operations:

• Translational

• n-fold rotation (example: 4-fold rotation)

• Mirror operation (mirror plane)

• Inversion operation (inversion center)

• Combination symmetries:

  • Screw axis (translation + rotation)

  • Glide plane (translation + mirror)

  • Roto-inversion axis

Note that mirror and inversion operations change the hand, i.e., if an object possesses this symmetry, either both enantiomers must be present, or the object must be achiral.

M. Rould ‘02
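One way to check whether an operation changes the hand is to write it as a 3x3 matrix and look at the sign of its determinant: +1 preserves handedness, -1 inverts it. The matrices and helper names below are an illustrative sketch, not material from the lecture.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


# Point-group parts of three operations from the list above.
OPERATIONS = {
    "4-fold rotation about z": [[0, -1, 0], [1, 0, 0], [0, 0, 1]],
    "mirror plane (xy)": [[1, 0, 0], [0, 1, 0], [0, 0, -1]],
    "inversion center": [[-1, 0, 0], [0, -1, 0], [0, 0, -1]],
}


def changes_hand(matrix):
    """True when the operation turns an object into its mirror image."""
    return det3(matrix) < 0


for name, matrix in OPERATIONS.items():
    verdict = "changes the hand" if changes_hand(matrix) else "keeps the hand"
    print(f"{name}: {verdict}")
```

Pure rotations (and screw axes, whose rotational part is a pure rotation) keep the hand, which is why they remain available to crystals of chiral molecules.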

Can natural proteins have mirror or inversion symmetry?

No - proteins are chiral -- only L-amino acids are present.

How about nucleic acids (DNA, RNA)?

No, (deoxy)ribose is chiral -- only the D- stereoisomers are present.

Of the 230 Crystallographic Space Groups, only 65 are possible for crystals containing enantiomorphic specimens such as most biological macromolecules.

Symmetry

M. Rould ‘02

L-Alanine D-Alanine

X-Ray Scattering from a Crystal

A typical image of x-rays scattered by a crystal (dark spots are the scattered x-rays):

When x-rays scatter from a crystal we see discrete spots: Reflections

X-Ray Diffraction Pattern

Why?

M. Rould ‘02

X-Ray Diffraction from a Crystal

• Electromagnetic radiation is wave-like:

[Figure: the electric field of an x-ray oscillating between positive (+) and negative (-) values along the direction of motion of the x-ray photon.]

• Waves can add constructively or destructively:

[Figure: two electric-field waves and their sum.]

M. Rould ‘02

Structure Factor - F(hkl)

Each reflection in the diffraction pattern is the result of diffractive contributions from all the atoms in the unit cell.

$F(hkl) = f_1 e^{i\phi_1} + f_2 e^{i\phi_2} + f_3 e^{i\phi_3} + \dots + f_N e^{i\phi_N} + f_{1'} e^{i\phi_{1'}} + f_{2'} e^{i\phi_{2'}} + f_{3'} e^{i\phi_{3'}} + \dots$

or, $F(hkl) = \sum_j f_j e^{i\phi_j}$

The term $f_j$ describing the diffractive contributions of each atom is called the atomic scattering factor of atom j.

The scattering factor essentially describes the amplitude for the scattering contributed by a particular species of atom.

Structure Factor - F(hkl) cont’d

F(hkl), as a complex number, can be expressed in terms of its real and imaginary components:

$F(hkl) = A(hkl) + i\,B(hkl)$, where $A = \sum_j f_j \cos\phi_j = f_{resultant} \cos\phi_{resultant}$ and $B = \sum_j f_j \sin\phi_j = f_{resultant} \sin\phi_{resultant}$,

$f_j$ are the atomic scattering factors and $\phi_j$ are the phase angles of the waves scattered from individual atoms.

This is just an alternate, mathematically equivalent representation for the structure factor that sometimes proves useful.
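The sum over atoms can be tried out numerically. The sketch below takes the phase of each atom’s contribution as φ_j = 2π(hx_j + ky_j + lz_j) for fractional coordinates (x_j, y_j, z_j), and treats each f_j as a constant, which is a simplification (real scattering factors fall off with scattering angle). The two-atom cell and the function name are hypothetical.

```python
import cmath
import math


def structure_factor(hkl, atoms):
    """F(hkl) = sum_j f_j exp(i phi_j), with phi_j = 2 pi (h x_j + k y_j + l z_j).

    atoms is a list of (f_j, x, y, z) with fractional coordinates; f_j is
    treated as a constant here, which is an assumption of this sketch.
    """
    h, k, l = hkl
    return sum(f * cmath.exp(2j * math.pi * (h * x + k * y + l * z)) for f, x, y, z in atoms)


# Hypothetical cell: two identical atoms at the origin and at the body center.
atoms = [(6.0, 0.0, 0.0, 0.0), (6.0, 0.5, 0.5, 0.5)]

for hkl in [(1, 0, 0), (1, 1, 0), (1, 1, 1), (2, 0, 0)]:
    F = structure_factor(hkl, atoms)
    # A and B are the real and imaginary components; abs(F) is the amplitude.
    print(hkl, f"A = {F.real:.3f}", f"B = {F.imag:.3f}", f"|F| = {abs(F):.3f}")
```

Reflections for which the two contributions are out of phase cancel, while those in phase add to twice the single-atom amplitude, a small-scale version of destructive and constructive interference.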

Fourier Methods in Diffraction Theory

For each point in a diffraction pattern, there is a corresponding spatial frequency. Therefore, the distribution of a far-field diffraction pattern is the Fourier transform of the aperture function. (Aperture: an opening, often adjustable, that controls the amount of light reaching the lens on a camera or other optical instrument.)

In our case, the aperture function is the regularly periodic (due to the repetition of the unit cell in the lattice) electron density distribution within our crystals. The electron density is the inverse Fourier transform of the diffraction pattern expressed as follows:

$\rho(x, y, z) = \frac{1}{V_{unit\ cell}} \sum_h \sum_k \sum_l F(hkl)\, e^{-2\pi i(hx+ky+lz)}$,

where $V_{unit\ cell}$ = volume of one unit cell and F(hkl) is called the structure factor for a particular set of Miller indices h, k and l. We can do a summation here, instead of integrating, because we know we will only have reflections at integer values for h, k and l.

Electron Density

Electron density distribution:

$\rho(x, y, z) = \frac{1}{V_{unit\ cell}} \sum_h \sum_k \sum_l F(hkl)\, e^{-2\pi i(hx+ky+lz)}$

For convenience, let us substitute $\phi = 2\pi(hx+ky+lz)$ in the future.

The amplitude of the structure factor F(hkl) for any given reflection is proportional to the square root of the intensity of the diffracted beam, or: $|F(hkl)|^2 \propto I$

Therefore, we can deduce |F(hkl)|, the magnitude of F, directly from our data, but not its phase.

We have all the information we need, except the phase. Why worry about the phase?

On the top are photographs of Jerome Karle (left) and Herb Hauptman (right), who won the Nobel Prize for their work on solving the phase problem for small molecule crystals. We can treat the photographs as density maps and calculate their Fourier transforms, to get amplitudes and phases. If we combine the phases from the picture of Hauptman with the amplitudes from the picture of Karle, we get the picture on the bottom left. The bottom right picture combines the phases of Karle with the amplitudes of Hauptman.
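A one-dimensional toy version of this experiment can be run without any images. The sketch below is an illustration under simple assumptions (an 8-point "density" and a plain discrete Fourier sum with the sign convention used above), not the procedure used to make the pictures. It builds two densities that differ only by a shift: their amplitudes are identical, so only the phases say where the density sits, and combining the amplitudes of one with the phases of the other reproduces the one that supplied the phases.

```python
import cmath


def structure_factors(density):
    """Forward Fourier sum: F(k) = sum_x rho(x) exp(+2 pi i k x / n)."""
    n = len(density)
    return [sum(rho * cmath.exp(2j * cmath.pi * k * x / n) for x, rho in enumerate(density))
            for k in range(n)]


def density_from(factors):
    """Inverse Fourier sum: rho(x) = (1/n) sum_k F(k) exp(-2 pi i k x / n)."""
    n = len(factors)
    return [sum(F * cmath.exp(-2j * cmath.pi * k * x / n) for k, F in enumerate(factors)).real / n
            for x in range(n)]


density_a = [0.0, 4.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
density_b = density_a[-3:] + density_a[:-3]  # the same "molecule", shifted by 3 grid points

factors_a = structure_factors(density_a)
factors_b = structure_factors(density_b)

# Keep the amplitudes of A but take the phases from B.
hybrid_factors = [cmath.rect(abs(fa), cmath.phase(fb)) for fa, fb in zip(factors_a, factors_b)]
hybrid = density_from(hybrid_factors)

print("amplitudes A:", [round(abs(f), 3) for f in factors_a])
print("amplitudes B:", [round(abs(f), 3) for f in factors_b])
print("hybrid      :", [round(abs(v), 3) for v in hybrid])
```

The hybrid matches density B, not A, even though every amplitude came from A.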

The pursuit of phases

Although $|F_{hkl}|$ can be derived from the recorded intensities $I_{hkl}$, the phase angle $\alpha_{hkl}$ cannot be derived straightforwardly from the diffraction pattern. Several methods have been developed to solve this problem.

Multiple isomorphous replacement (MIR): free of model bias, but noisy due to lack of isomorphism

Multiwavelength anomalous diffraction (MAD): most reliable source of phases; isomorphism is nearly perfect

Molecular replacement (MR): widely used; errors due to model bias are variable and difficult to detect and correct

MIR

Basic principle:
• Add heavy atom compound (Hg, Pt, Au, etc.) to the crystal.
• Collect diffraction data from this derivatized crystal.
• Hopefully the heavy metal will bind to just a few sites.
• It is relatively easy to determine the positions of these few really big atoms.

-> Knowing the positions of the heavy atoms, we can calculate their effect on the intensity and phase of each reflection.

Caveat: MIR only works if the heavy atom does not change the conformation of the protein or the crystal lattice in any way. The only differences allowed are the presence of the heavy atom in the crystal and the resulting change in intensity and phase of the scattered X-rays.

S. Doublié ‘02

Protein phase angles

A- Single isomorphous replacement

$F_{PH} = F_P + F_H$

$F_P$, protein structure factor
$F_H$, heavy atom structure factor
$F_{PH}$, structure factor for derivatized crystal

[Figure: Harker construction in the complex plane (real and imaginary axes), showing $F_P$, $-F_H$ and $F_{PH}$ and the points O, G and H.]

Harker construction for phase determination by the method of single isomorphous replacement: the vectors OH and OG represent two possibilities for $F_P$.
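The two-fold ambiguity can be reproduced numerically. Since F_PH = F_P + F_H, the law of cosines gives |F_PH|² = |F_P|² + |F_H|² + 2|F_P||F_H| cos(α_P - α_H), which fixes the phase difference only up to its sign. The sketch below uses made-up amplitudes and a hypothetical function name; it assumes F_H (amplitude and phase) is already known from the heavy atom positions.

```python
import cmath
import math


def sir_phase_choices(fp_amplitude, fph_amplitude, f_h):
    """Return the two candidate phases (radians) for F_P.

    fp_amplitude and fph_amplitude are measured amplitudes; f_h is the
    complex heavy atom structure factor, assumed known.
    """
    fh_amplitude, fh_phase = abs(f_h), cmath.phase(f_h)
    cos_diff = (fph_amplitude**2 - fp_amplitude**2 - fh_amplitude**2) / (
        2.0 * fp_amplitude * fh_amplitude)
    cos_diff = max(-1.0, min(1.0, cos_diff))  # guard against rounding or measurement error
    diff = math.acos(cos_diff)
    return fh_phase + diff, fh_phase - diff


# Made-up "true" values used only to generate consistent amplitudes.
true_fp = cmath.rect(10.0, math.radians(60.0))
f_h = cmath.rect(3.0, math.radians(20.0))
f_ph = true_fp + f_h

choices = sir_phase_choices(abs(true_fp), abs(f_ph), f_h)
print([round(math.degrees(c), 3) for c in choices])
```

Both candidates reproduce the measured |F_PH| equally well, so a single derivative cannot tell them apart; one of them is the true phase used to generate the data.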

S. Doublié ‘02

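Harker construction for MIR

[Figure: Harker construction in the complex plane (real and imaginary axes), showing $F_P$, $-F_H$, $F_{PH}$, $-F_{H2}$ and $F_{PH2}$ and the points O, G and H.]

The addition of another derivative breaks the phase ambiguity: $F_P$ is given unequivocally by the vector OH.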

S. Doublié ‘02

MAD

MAD depends on the presence of sufficiently strong anomalously scattering atoms in the protein. Anomalous scattering occurs if the electrons in an atom cannot be regarded as free electrons. An anomalous scatterer absorbs X-rays of a specific wavelength. As a result of this absorption, Friedel’s law does not hold, i.e., the reflections hkl and -h-k-l are not equal in intensity. This inequality of symmetry related reflections is called anomalous scattering.

Metalloprotein (Fe) structures have been solved with MAD. Proteins that do not naturally contain anomalous scatterers can be expressed in E. coli in a defined medium with selenomethionine. The selenium atoms serve as anomalous scattering heavy atoms.

Caveat: MAD requires a tunable wavelength: data collection can only be done at synchrotron radiation facilities (Brookhaven, Stanford, APS etc.).

S. Doublié ‘02

Molecular replacement

Prerequisite: The protein of interest should have a structural homologue in the PDB in order to use the related protein as phasing model.

Molecular replacement entails calculating initial phases by placing the model of a known protein in the unit cell of the unknown protein.

Caveat: Errors due to model bias are variable and difficult to detect and correct

S. Doublié ‘02

R factor

Measure of the crystallographic residual, indicates the correctness of a model:

$R = \frac{\sum_{hkl} \big|\, |F_{obs}| - |F_{calc}| \,\big|}{\sum_{hkl} |F_{obs}|}$

Variations that can prove confusing to the novice:

Rmerge: measurement of the quality of a merged data set

Rsym: measurement of the variation between symmetry-related reflections

Rfree: R factor for a test set of unique reflections that have been omitted from the refinement process (unbiased)

Rfree

R factor for a test set of unique reflections that have been omitted from the refinement process (unbiased):

$R_{free} = \frac{\sum_{hkl \in T} \big|\, |F_{obs}| - |F_{calc}| \,\big|}{\sum_{hkl \in T} |F_{obs}|}$

where $hkl \in T$ designates all reflections belonging to a test set T of randomly selected, unique reflections. The size of the test set is commonly 10% of the data set.
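The R factor and R-free are the same sum taken over different reflections. The sketch below shows the arithmetic on a handful of made-up amplitudes; the function names, the data and the choice of test reflection are illustrative only, and a real test set would be chosen at random from a much larger data set.

```python
def r_factor(f_obs, f_calc):
    """R = sum | |Fobs| - |Fcalc| | / sum |Fobs| over the reflections given."""
    numerator = sum(abs(abs(obs) - abs(calc)) for obs, calc in zip(f_obs, f_calc))
    denominator = sum(abs(obs) for obs in f_obs)
    return numerator / denominator


def split_r_factors(f_obs, f_calc, test_indices):
    """Return (R for the working set, R-free for the test set T)."""
    test = set(test_indices)
    work_pairs = [(o, c) for i, (o, c) in enumerate(zip(f_obs, f_calc)) if i not in test]
    test_pairs = [(o, c) for i, (o, c) in enumerate(zip(f_obs, f_calc)) if i in test]
    r_work = r_factor(*zip(*work_pairs))
    r_free = r_factor(*zip(*test_pairs))
    return r_work, r_free


# Made-up amplitudes for four reflections; reflection 3 is held out as the test set.
f_obs = [100.0, 50.0, 30.0, 20.0]
f_calc = [90.0, 55.0, 30.0, 25.0]

print("R overall:", r_factor(f_obs, f_calc))
print("R work, R free:", split_r_factors(f_obs, f_calc, test_indices=[3]))
```

Because the test reflections never enter refinement, R-free cannot be lowered simply by over-fitting the model to the working data.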

Rmerge

Measurement of the quality of a merged data set:

$R_{merge} = \frac{\sum_{hkl} \sum_{j=1}^{N} \big|\, |F_{hkl}| - |F_{hkl}(j)| \,\big|}{\sum_{hkl} N \times |F_{hkl}|}$

where $|F_{hkl}|$ is the final value of the structure factor amplitude for that reflection, and N = total no. of data sets (or images) merged.

Rsym

Measurement of the variation between symmetry-related reflections:

$R_{sym} = \frac{\sum_{hkl} \sum_{i} \big|\, |F^{(i)}_{hkl}| - |\overline{F}_{hkl}| \,\big|}{\sum_{hkl} \sum_{i} |F^{(i)}_{hkl}|}$

for i observations of each symmetry-related reflection, where $|\overline{F}_{hkl}|$ is the average value for the structure factor amplitude of the i observations of a given reflection.

Refinement Target

Refinement searches for a global minimum for a target energy function similar to the one illustrated below:

$E_{total} = w_{xray} E_{xray} + E_{conformation} + E_{nonbonded}$

where
$w_{xray}$ = weight for the x-ray energy term
$E_{xray}$ = x-ray energy term
$E_{conformation}$ = conformational energy terms (bonds, angles)
$E_{nonbonded}$ = nonbonded energy terms (van der Waals, electrostatic)

Rigid Body Refinement

Reduces the conformational freedom within the model to improve the ratio of observables to parameters in the early stages of refinement. The entire model can be treated as a rigid body, or it can be regarded as linked, rigid groups.

For each group of atoms specified by the user as a rigid body, only the 3 rotational and 3 translational degrees of freedom are refined.

[Figure: a model divided into two rigid groups, RIGIDBODY 1 and RIGIDBODY 2.]

Positional Refinement

The atomic position parameters x, y and z are refined for each atom.

Difficulties in protein crystallography:
• large number of parameters to fit
• macromolecular crystals diffract weakly, producing a poor ratio of observations to parameters.

The geometrical constraints introduced by the conformational energy terms greatly reduce the number of parameters to be refined. Least-squares optimization or conjugate gradient minimization techniques are commonly used for finding the best fit of the model to the data.

B-factor (temperature factor) refinement

B-factors are indicators of atomic mobility. High values correspond to low electron density, indicating a dynamic or disordered region, or a possible error in position.

The B-factor appears in an exponential term applied to the scattering factor. It relates the thermal motion of the scattering atom to the resulting decrease in scattering intensity:

$f \, e^{-B \sin^2\theta / \lambda^2}$

The x-ray energy term in the target energy function is revised so that $F_{calc}$ is replaced by $F_{calc} \, e^{-s^2 B/4}$ (with $s = 2\sin\theta/\lambda$).
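To get a feel for the size of this exponential term, it helps to rewrite it in terms of resolution. Using Bragg’s law with n = 1, sin θ / λ = 1/(2d), so the factor becomes exp(-B / (4d²)). The sketch below evaluates it for a few made-up B values; the function name and the numbers are illustrative.

```python
import math


def b_factor_attenuation(b_factor, d_spacing):
    """exp(-B sin^2(theta) / lambda^2), written with sin(theta)/lambda = 1/(2 d).

    b_factor is in square Angstrom and d_spacing (the resolution of the
    reflection) in Angstrom.
    """
    return math.exp(-b_factor / (4.0 * d_spacing**2))


for b in (10.0, 20.0, 60.0):
    for d in (4.0, 2.0):
        print(f"B = {b:5.1f} A^2, d = {d} A: scattering factor scaled by {b_factor_attenuation(b, d):.3f}")
```

A mobile atom (high B) contributes little to the high-resolution reflections, which is why disordered regions show weak electron density.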

Occupancy Refinement

The occupancy factor is used to describe disorder in the model. An atom with a partial occupancy factor can be thought of as an atom that does not occupy that position 100% of the time (e.g., ions, water, cofactors). Some refinement programs do not require that the occupancy factor be ≤ 1, so it is up to the crystallographer to remember that 1 is the upper limit on the occupancy factor for a given atom in a given position.

Simulated Annealing

Simulated annealing is an MD-refinement technique that involves control of the temperature, which is mathematically related to the kinetic energy (KE) of the MD simulation by:

$T_{current} = \frac{2\,KE}{3 n k_b}$, for n = number of atoms (3n degrees of freedom), $k_b$ = Boltzmann constant

Gradient descent minimization and least-squares optimization methods are prone to get “stuck” in regions of local minima when applied to the vast problem of solving the structure of a biological macromolecule. In these cases, it is often necessary to overcome an energy barrier between the local minimum and the global minimum. Therefore, to reach the global minimum, an algorithm must be applied that can go energetically “uphill”.
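The temperature relation is easy to check numerically. The sketch below reads n as the number of atoms, so that 3n is the number of degrees of freedom (an assumption about the notation), and uses kcal/mol energy units, which is a choice made here for illustration. The function names are hypothetical.

```python
K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K); unit choice is an assumption


def temperature_from_kinetic_energy(kinetic_energy, n_atoms):
    """T = 2 KE / (3 n k_b), with n taken as the number of atoms."""
    return 2.0 * kinetic_energy / (3.0 * n_atoms * K_B)


def kinetic_energy_at(temperature, n_atoms):
    """Inverse relation: KE = (3/2) n k_b T."""
    return 1.5 * n_atoms * K_B * temperature


n_atoms = 1000
for target in (300.0, 3000.0):
    ke = kinetic_energy_at(target, n_atoms)
    print(f"T = {target} K  <->  KE = {ke:.1f} kcal/mol for {n_atoms} atoms")
```

Raising the temperature therefore means giving the atoms more kinetic energy, which is what lets the simulation climb over energy barriers before it is cooled again.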

Model Building

Starting model:

Molecular replacement model: the initial model is the search model that has been positioned in the unit cell by the rotation and translation functions.

MAD/isomorphous model: electron density is calculated using the heavy atom phases, then the model has to be built into the electron density.

Maps

Electron density distribution:

$\rho(x, y, z) = \frac{1}{V_{unit\ cell}} \sum_h \sum_k \sum_l F(hkl)\, e^{-i\phi}$

The first map is an approximation to the true electron density derived from the observed structure factor amplitudes (Fobs) and the estimated phases from the model (MR, MAD, or MIR phases).

(Remember our illustration that the correctness of the model image depends more on having the correct phase information than on having the correct amplitudes.)

Maps cont’d

Both tryptophans are from the same 1.7 Å crystal structure, but the map in Figure 1 is the first map calculated using the initial MR phases and the map in Figure 2 is the final map calculated using the refined phases.

[Figures 1 and 2: electron density maps for the two tryptophans.]

Resolution limits

6.0 - 4.5 Å   Placement of secondary structures
3.0 Å         Chain tracing
2.5 Å         Side chain orientation
1.8 Å         Alternate side chain orientations
1.2 Å         Hydrogen atoms

Map types

2 FO - FC Maps
FO = observed structure factors, FC = calculated structure factors

Subtracting Fc from 2 Fo exaggerates the areas where Fo differs from Fc. In the case where Fo is greater than Fc, the net structure factor amplitude is intensified and in the case where Fo is less than Fc, the net structure factor amplitude is decreased.

FO - FC Maps (Difference Maps)

Produces “positive” or “negative” peaks in areas where Fo differs from Fc. This map is usually contoured at a high level (3 or 4 σ), so all the crystallographer views are the large difference peaks (not likely to be just noise).
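The coefficients for both map types can be written down per reflection. The sketch below assumes, as in the maps described earlier, that the observed amplitude is combined with the phase of the calculated structure factor; the function name and the numbers are made up for illustration.

```python
import cmath


def map_coefficients(f_obs_amplitude, f_calc):
    """Return the (2Fo - Fc, Fo - Fc) coefficients for one reflection.

    f_obs_amplitude is the observed amplitude |Fo|; f_calc is the complex
    calculated structure factor, whose phase is used for both coefficients.
    """
    phase = cmath.phase(f_calc)
    fc_amplitude = abs(f_calc)
    two_fo_fc = cmath.rect(2.0 * f_obs_amplitude - fc_amplitude, phase)
    fo_fc = cmath.rect(f_obs_amplitude - fc_amplitude, phase)
    return two_fo_fc, fo_fc


f_calc = cmath.rect(100.0, 0.5)  # made-up calculated structure factor

for f_obs in (120.0, 100.0, 80.0):
    two_fo_fc, fo_fc = map_coefficients(f_obs, f_calc)
    signed_difference = f_obs - abs(f_calc)
    print(f"|Fo| = {f_obs}: |2Fo-Fc| = {abs(two_fo_fc):.1f}, Fo-Fc = {signed_difference:+.1f}")
```

Where Fo exceeds Fc the 2Fo - Fc amplitude is boosted and the difference coefficient is positive; where Fo is smaller the opposite happens, which is the origin of the positive and negative peaks in the difference map.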

Atomic Model Deposition - The Protein Data Bank

You’ve solved your 1.2 Å crystal structure with an R-factor of 15.4% and an R-free of 16.2%. It’s time to share your hard-won scientific knowledge with the rest of the world. When you publish your paper, most journals will request that you provide your PDB accession number, indicating you have deposited your coordinates for the betterment of mankind. So, you type the following URL into your browser:

http://www.rcsb.org/pdb/

and wind up here:

Welcome to the PDB, the single worldwide repository for the processing and distribution of 3-D biological macromolecular structure data.

http://www.rcsb.org/pdb/

PDB Validation Suite

PROCHECK, NUCHECK, SFCHECK

PROCHECK Assesses the geometry of the residues in a given protein structure, as compared with stereochemical parameters derived from well-refined, high-resolution structures.

Unusual regions highlighted by PROCHECK are not necessarily errors, but may be unusual features for which there is a reasonable explanation (e.g., distortions due to ligand-binding in the protein's active site). Nevertheless, they are regions that should be checked carefully.

The only input required for PROCHECK is the PDB file holding the coordinates of the structure of interest.

Practical Considerations - generalizations (that means, of course, that there are always exceptions)

R-factor: Good upper limit for ~2 Å data: 20 - 23%
R-free: within 10% of R (closer for hi res)

Resolution: Good: ≤ 2 Å
For sidechain conformations: < 3 Å
