Interactive fitting of high resolution models into low resolution maps. Last developments in the UROX software // http://mem.ibs.fr/UROX Xavier Siebert and Jorge Navaza Methods in Electron Microscopy Institut de Biologie Structurale CNRS, Grenoble, France Leiden, May 16, 2008


Interactive fitting of high resolution models
into low resolution maps.

Last developments in the UROX software // http://mem.ibs.fr/UROX

Xavier Siebert and Jorge Navaza

Methods in Electron Microscopy
Institut de Biologie Structurale

CNRS, Grenoble, France

Leiden, May 16, 2008


Fitting high resolution models . . .

◮ XRay crystallography
    ◮ need crystals
    ◮ hard for large complexes
◮ NMR
    ◮ hard > 50 kDa
    ◮ many peaks
    ◮ broader peaks


. . . into Electron Microscopy maps

Rotavirus, 25 Å (Jean Lepault)


. . . or Small Angle Scattering envelopes


Methods for Fitting

◮ “by hand”, with graphics software
    ◮ subjective (Wang et al. 1992 ; Stewart et al. 1993)
◮ with a fitting algorithm
    ◮ real space : 3SOM, ADP_EM, CHIMERA, DOCKEM, EMFIT, FOLDHUNTER, MOLREP, SITUS
    ◮ reciprocal space : COAN, URO, UROX
◮ “force feedback 3D devices” : SENSITUS


Fitting in real space or reciprocal space ?

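◮ minimize mismatch / maximize correlation (Q = 1 − CC²)
◮ . . . in real space : between densities

\[
Q = \frac{\int |\rho^{em}(\mathbf{r}) - \lambda\,\rho^{mod}(\mathbf{r})|^2 \, d^3r}{\int |\rho^{em}(\mathbf{r})|^2 \, d^3r}
\]

◮ . . . in reciprocal space : between Fourier coefficients

\[
Q = \frac{\int |F^{em}(\mathbf{s}) - \lambda\,F^{mod}(\mathbf{s})|^2 \, d^3s}{\int |F^{em}(\mathbf{s})|^2 \, d^3s}
\]

◮ equivalent formulations (Parseval)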
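
The equivalence of the two formulations can be checked numerically. The sketch below is only an illustration, not UROX code: it assumes a random toy density on an 8×8×8 grid, uses NumPy's discrete FFT as a stand-in for the continuous transform, and the names `misfit_q`, `rho_em` and `rho_mod` are hypothetical. It also checks that, at the scale factor λ that minimizes Q, the misfit reduces to 1 − CC² (with CC taken without mean subtraction).

```python
import numpy as np

def misfit_q(target, model, lam):
    """Normalized quadratic misfit between target and lam * model."""
    return np.sum(np.abs(target - lam * model) ** 2) / np.sum(np.abs(target) ** 2)

# Toy densities (assumption): a random "map" and an imperfect "model".
rng = np.random.default_rng(0)
rho_em = rng.normal(size=(8, 8, 8))
rho_mod = 0.7 * rho_em + 0.3 * rng.normal(size=(8, 8, 8))

# Discrete Fourier coefficients of both densities.
F_em = np.fft.fftn(rho_em)
F_mod = np.fft.fftn(rho_mod)

# Parseval: the same Q in real and reciprocal space, for any scale factor.
lam = 1.3
q_real = misfit_q(rho_em, rho_mod, lam)
q_recip = misfit_q(F_em, F_mod, lam)

# Scale factor that minimizes Q, and the uncentered correlation coefficient.
overlap = np.sum(rho_em * rho_mod)
lam_best = overlap / np.sum(rho_mod ** 2)
cc = overlap / np.sqrt(np.sum(rho_em ** 2) * np.sum(rho_mod ** 2))
q_min = misfit_q(rho_em, rho_mod, lam_best)

print(f"Q real = {q_real:.6f}, Q reciprocal = {q_recip:.6f}")
print(f"Q_min  = {q_min:.6f}, 1 - CC^2     = {1 - cc ** 2:.6f}")
```

The normalization by the map's own power makes the FFT scaling convention cancel, which is why no explicit factor of the grid size appears.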


Reciprocal-space fitting with UROX


Application of UROX : Rotavirus (J. Virol, March 2008)
Fitting in the whole reconstruction, using symmetry


Application of UROX : Rotavirus
Channel shrinks (right), inhibiting transcription


Reciprocal-space formulation with symmetry

◮ Goal : maximize correlation map - models:

\[
CC = \frac{\int F^{em}(\mathbf{s})\,\overline{F^{mod}(\mathbf{s})}\, d^3s}{\sqrt{\int |F^{em}(\mathbf{s})|^2 \, d^3s}\;\sqrt{\int |F^{mod}(\mathbf{s})|^2 \, d^3s}} \tag{1}
\]

◮ where F^mod are functions of the positional variables of the independent molecules :

\[
F^{mod}(\mathbf{s}) = \sum_{m \in M} \sum_{g \in G} f_m(\mathbf{s} M_g R_m)\, \exp[2\pi i\, \mathbf{s}(M_g X_m + T_g)] \tag{2}
\]

◮ m = one of the M independent molecules, located at the position X_m in the orientation R_m with respect to a reference position
◮ g = symmetry operator represented by the translation T_g and the rotation M_g.
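
Equation (2) can be sketched in a few lines. This is an illustration under stated assumptions, not the Fortran core of UROX: atoms are treated as unit point scatterers, s is a row vector, and the helper names (`molecular_transform`, `rotation_z`, `model_transform`) and the C4 toy symmetry are hypothetical. The cross-check at the end places every symmetry copy explicitly in real space and transforms it directly, which should give the same coefficients.

```python
import numpy as np

def molecular_transform(s, atoms):
    """f_m(s): transform of unit point atoms (toy scatterers, an assumption)."""
    return np.exp(2j * np.pi * (s @ atoms.T)).sum(axis=-1)

def rotation_z(angle_deg):
    """Rotation matrix about the z axis."""
    a = np.radians(angle_deg)
    return np.array([[np.cos(a), -np.sin(a), 0.0],
                     [np.sin(a),  np.cos(a), 0.0],
                     [0.0,        0.0,       1.0]])

def model_transform(s, molecules, sym_ops):
    """Equation (2): sum over molecules m and symmetry operators g."""
    total = np.zeros(len(s), dtype=complex)
    for atoms, R, X in molecules:
        for M, T in sym_ops:
            phase = np.exp(2j * np.pi * (s @ (M @ X + T)))
            total += molecular_transform(s @ M @ R, atoms) * phase
    return total

rng = np.random.default_rng(0)
s = rng.uniform(-0.05, 0.05, size=(200, 3))       # reciprocal vectors, 1/Å
atoms = rng.uniform(-10.0, 10.0, size=(20, 3))    # reference coordinates, Å
molecules = [(atoms, rotation_z(30.0), np.array([15.0, 0.0, 5.0]))]
sym_ops = [(rotation_z(k * 90.0), np.zeros(3)) for k in range(4)]  # toy C4 group

F_mod = model_transform(s, molecules, sym_ops)

# Cross-check: build each symmetry copy explicitly, then transform directly.
F_direct = np.zeros(len(s), dtype=complex)
for atoms_m, R, X in molecules:
    for M, T in sym_ops:
        placed = (M @ (R @ atoms_m.T + X[:, None]) + T[:, None]).T
        F_direct += molecular_transform(s, placed)

print(np.allclose(F_mod, F_direct))
```

The point of the formulation is visible in `model_transform`: the molecular transform f_m is evaluated at rotated reciprocal vectors, and position enters only through a phase factor, so the molecule itself never has to be rebuilt.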


UROX design

◮ core calculations : Fortran77 code (adapted from URO)
◮ graphical libraries : VTK (Visualization Toolkit)
◮ Python wrapper
    ◮ Tkinter : graphical user interface
    ◮ F2PY : import Fortran from Python


VTK (Visualization Toolkit) www.vtk.org

◮ powerful libraries for medical and scientific applications


Why reciprocal space ?

Pros:

1. speed allows real-time calculations

2. choose fitting resolution without extra computations

3. use whole EM reconstruction (no masking)

4. symmetry (if any) incorporated in the formulation

Cons:
◮ hard to impose real-space restraints


1. Speed of UROX
Interactive graphics . . .

◮ fast calculation of correlation coefficient (CC):
    ◮ 10⁻⁷ s / symmetry operation / Fourier coefficient

                       symmetry   resol   # coeff      time
    GroEl              14         30 Å    ≈ 5000       3.5 ms
    DLP (Rotavirus)    60         20 Å    ≈ 650,000    4 s

◮ interactive graphics
    ◮ CC computed in real-time as a model is moved in the map
◮ speed up
    ◮ recover the asymmetric unit in reciprocal space
    ◮ decimate Fourier coefficients (only 6N unknowns !)


1. Speed of UROX
Speedup Graphics (for oldish graphics cards like mine . . . )

◮ VTK decimation (wireframe mode)
◮ VTK decimation (surface mode)
◮ VTK BoxWidget
    ◮ Analyse local parts of the map (and speed up)


2. Change fitting resolution
Correlation profiles at 1/20 Å⁻¹, 1/40 Å⁻¹ and 1/60 Å⁻¹

[Figures: three plots of CC versus Z [Angstroms], at fitting resolutions of 20 Å, 40 Å and 60 Å]


2. Change fitting resolution
Strategy

◮ to avoid local extrema :
    1. fit at low resolution first
    2. then refine at high resolution
◮ two modes :
    ◮ interactive with least-squares optimization
    ◮ exhaustive 3D or 6D searches
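
The low-then-high resolution strategy can be sketched as a two-stage translation scan. Everything here is a toy under stated assumptions: point atoms, a noise-free "map" that is simply the model shifted by 7 Å along z, a search restricted to z, and hypothetical helper names (`transform`, `cc_profile`, `z_shifts`). Changing the fitting resolution is just a cutoff on |s| applied to coefficients that are already computed, and a trial translation only multiplies the model coefficients by a phase factor.

```python
import numpy as np

def transform(s, atoms):
    """Fourier transform of unit point atoms (toy model, an assumption)."""
    return np.exp(2j * np.pi * (s @ atoms.T)).sum(axis=-1)

def cc_profile(F_em, F_mod, s, shifts, d_min):
    """CC for each trial shift, keeping only coefficients with |s| <= 1/d_min."""
    keep = np.linalg.norm(s, axis=1) <= 1.0 / d_min + 1e-12
    F_em, F_mod, s = F_em[keep], F_mod[keep], s[keep]
    norm = np.sqrt(np.sum(np.abs(F_em) ** 2) * np.sum(np.abs(F_mod) ** 2))
    # A translation x multiplies each model coefficient by exp(2*pi*i s.x).
    phases = np.exp(2j * np.pi * (shifts @ s.T))
    return np.real(np.conj(phases) @ (F_em * np.conj(F_mod))) / norm

def z_shifts(z_values):
    """Trial translations along z only."""
    zeros = np.zeros_like(z_values)
    return np.column_stack([zeros, zeros, z_values])

# Reciprocal lattice sampled every 1/120 Å^-1; corners beyond 1/20 Å^-1 are cut off later.
h = np.arange(-6, 7) / 120.0
s = np.array(np.meshgrid(h, h, h, indexing="ij")).reshape(3, -1).T

rng = np.random.default_rng(0)
atoms = rng.uniform(-10.0, 10.0, size=(30, 3))
true_shift = np.array([0.0, 0.0, 7.0])
F_mod = transform(s, atoms)                 # model at the reference position
F_em = transform(s, atoms + true_shift)     # noise-free toy "map"

# Stage 1: coarse scan at 60 Å resolution (broad, smooth profile).
coarse_z = np.arange(-40.0, 41.0, 4.0)
coarse_cc = cc_profile(F_em, F_mod, s, z_shifts(coarse_z), d_min=60.0)
z_start = coarse_z[np.argmax(coarse_cc)]

# Stage 2: fine scan at 20 Å resolution around the coarse peak.
fine_z = np.arange(z_start - 4.0, z_start + 4.5, 0.5)
fine_cc = cc_profile(F_em, F_mod, s, z_shifts(fine_z), d_min=20.0)
z_best = fine_z[np.argmax(fine_cc)]

print(f"coarse peak at z = {z_start} Å, refined peak at z = {z_best} Å")
```

With real data the high-resolution profile has more secondary maxima than the low-resolution one, which is the reason for starting coarse; in this noise-free toy both stages simply home in on the imposed shift.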


3. and 4. Use whole EM map and symmetry
Illustrated by a benchmark comparison of several fitting programs

◮ test case : GroEl (cryo-stain, Dubochet, JSB 2002)
    ◮ D7 symmetry


3. and 4. - Benchmark : GroEl
Warning : subject to my mishandling of other people’s software


3. and 4. - Benchmark : GroEl
Analysis of Benchmark for other programs

◮ most programs struggle because of "extra" density
◮ alternative : mask around putative solution (but bias . . . )
    ◮ in that case most programs find the solution


3. and 4. - Benchmark : GroEl
Analysis of Benchmark for UROX (exhaustive search mode)

◮ without symmetry (C1) : difficult (requires tweaking)
◮ with symmetry (D7) : easy (10 min)
◮ conclusions :
    ◮ symmetry matters, no mask necessary
    ◮ could use interactive mode instead of exhaustive search


Other features of UROX

◮ refine electron microscope magnification (5% error)


Latest developments (UROX 2.0)

◮ (re-writing of the Python classes . . . )
◮ flexible fitting : normal modes
◮ fit map in map
◮ applications to tomography


UROX 2.0 - flexible fitting
Normal modes (with K. Suhre and Y-H. Sanejouand)

◮ low frequency motion of proteins
◮ harmonic approximation

\[
r_j(t) = r_j^0 + \sum_k A_{jk}\,\alpha_k \cos(\omega_k t + \phi_k) \tag{3}
\]

◮ use with care (will always give a better answer !)
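
Equation (3) translates directly into array code. The sketch below is an assumption-laden toy, not the UROX 2.0 implementation: the mode vectors `A` are invented rather than computed from an elastic network or force field, and the function name `displaced_coordinates` is hypothetical. For fitting, one would presumably freeze the time dependence and treat each mode's amplitude as a parameter to optimize; that reading is an assumption too.

```python
import numpy as np

def displaced_coordinates(r0, A, alpha, omega, phi, t):
    """Equation (3): r_j(t) = r0_j + sum_k A_jk * alpha_k * cos(omega_k t + phi_k).

    r0    : (n_atoms, 3) reference coordinates
    A     : (n_atoms, 3, n_modes) displacement of atom j in mode k
    alpha : (n_modes,) amplitudes
    omega : (n_modes,) angular frequencies
    phi   : (n_modes,) phases
    """
    weights = alpha * np.cos(omega * t + phi)   # one scalar per mode
    return r0 + A @ weights

# Toy example (all values are assumptions): 4 atoms on a line, 2 modes.
r0 = np.array([[0.0, 0.0, 0.0],
               [1.5, 0.0, 0.0],
               [3.0, 0.0, 0.0],
               [4.5, 0.0, 0.0]])
A = np.zeros((4, 3, 2))
A[:, 1, 0] = [1.0, -1.0, -1.0, 1.0]    # mode 0: bending along y
A[:, 0, 1] = [-1.0, -0.5, 0.5, 1.0]    # mode 1: stretching along x
omega = np.array([1.0, 2.5])
phi = np.zeros(2)

rest = displaced_coordinates(r0, A, np.zeros(2), omega, phi, t=0.0)
bent = displaced_coordinates(r0, A, np.array([2.0, 0.0]), omega, phi, t=0.0)
print(bent)
```

Every added mode amplitude is an extra free parameter, so the correlation can only go up as modes are added, which is one way to read the "use with care" warning.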


UROX 2.0 - fit map in map


UROX 2.0 - Tomography and missing wedge
Presentation of the problem


UROX 2.0 - Tomography and missing wedge
Visualize the Fourier transform (and select reflections)


UROX 2.0 - Tomography and missing wedge

◮ Detect missing wedge
    ◮ remove it from fitting (don’t align missing wedges !)
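
Removing the wedge from the fit amounts to a selection of reflections. The sketch below assumes a geometry that the slides do not state: a single-axis tilt series with the tilt axis along y, the beam along z at zero tilt, and tilt angles covering ±60°. How UROX detects the wedge is not described here, so the wedge is taken as known; `sampled_by_tilt_series` and `correlation` are hypothetical names.

```python
import numpy as np

def sampled_by_tilt_series(s, max_tilt_deg):
    """True for reciprocal vectors covered by the tilt series.

    Assumed geometry: tilt axis along y, beam along z at zero tilt.
    Each projection contributes a central plane with s_z / s_x = -tan(tilt),
    so the sampled region is |s_z| <= |s_x| * tan(max_tilt).
    """
    limit = np.tan(np.radians(max_tilt_deg))
    return np.abs(s[:, 2]) <= limit * np.abs(s[:, 0]) + 1e-12

def correlation(F_em, F_mod, keep):
    """CC restricted to the selected reflections."""
    a, b = F_em[keep], F_mod[keep]
    num = np.real(np.sum(a * np.conj(b)))
    return num / np.sqrt(np.sum(np.abs(a) ** 2) * np.sum(np.abs(b) ** 2))

rng = np.random.default_rng(0)
s = rng.uniform(-0.05, 0.05, size=(5000, 3))
F_mod = rng.normal(size=5000) + 1j * rng.normal(size=5000)

keep = sampled_by_tilt_series(s, max_tilt_deg=60.0)
F_em = np.where(keep, F_mod, 0.0)   # toy tomogram: no signal inside the wedge

cc_all = correlation(F_em, F_mod, np.ones(len(s), dtype=bool))
cc_selected = correlation(F_em, F_mod, keep)
print(f"CC with wedge = {cc_all:.3f}, CC without wedge = {cc_selected:.3f}")
```

In this toy the model is perfect, yet the correlation over all reflections stays well below 1 because the empty wedge is counted as disagreement; restricting the sum to sampled reflections removes that penalty.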


Thank you

. The Organizers . . .

. Jorge Navaza (IBS, Grenoble)

. Jean Lepault and Sonia Libersou (LVMS, Gif-sur-Yvette)

. Karsten Suhre (Neuherberg, Germany)

. Yves-Henri Sanejouand (ENS, Lyon)

. Leandro F. Estrozi (EMBL, Grenoble)

. Stefano Trapani (CBS, Montpellier)

. James Conway (Pittsburgh, USA)

. Irina Gutsche and Ambroise Desfosses (EMBL, Grenoble)

+ http://mem.ibs.fr/UROX


Error Estimates
My map has a resolution of x Å. What is the error on the fit ?

◮ in UROX:
    1. R-factor ↔ quality of the map

\[
R = \frac{\sum_h \big|\, |F^{em}_h| - |F^{mod}_h| \,\big|}{\sum_h |F^{em}_h|}
\]

    2. Q = quadratic misfit
◮ rule of thumb : error ≈ 10% of the resolution (Rossmann, Acta Cryst. 2001)
◮ empirically : VP6 of the rotavirus (25 Å map)
    ◮ fit with a trimer
    ◮ fit with 3 monomers
    ◮ RMSD (trimer, 3 monomers) ≈ 3 Å
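
The two indicators can be compared on a small hand-checkable example. This is a sketch with made-up coefficients and hypothetical function names (`r_factor`, `quadratic_misfit`); Q is written in the normalized form used earlier in the talk. The first coefficient has the right amplitude but the wrong phase, so it contributes nothing to R and a lot to Q.

```python
import numpy as np

def r_factor(F_em, F_mod):
    """R = sum_h | |F_em| - |F_mod| | / sum_h |F_em| (amplitudes only)."""
    amp_em, amp_mod = np.abs(F_em), np.abs(F_mod)
    return np.sum(np.abs(amp_em - amp_mod)) / np.sum(amp_em)

def quadratic_misfit(F_em, F_mod, lam=1.0):
    """Q = sum_h |F_em - lam * F_mod|^2 / sum_h |F_em|^2 (amplitudes and phases)."""
    return np.sum(np.abs(F_em - lam * F_mod) ** 2) / np.sum(np.abs(F_em) ** 2)

# Made-up coefficients: amplitudes (5, 1) for the map and (5, 2) for the model.
F_em = np.array([3.0 + 4.0j, 1.0 + 0.0j])
F_mod = np.array([5.0 + 0.0j, 2.0 + 0.0j])

R = r_factor(F_em, F_mod)          # (|5 - 5| + |1 - 2|) / (5 + 1) = 1/6
Q = quadratic_misfit(F_em, F_mod)  # (|-2 + 4i|^2 + |-1|^2) / (25 + 1) = 21/26
print(f"R = {R:.4f}, Q = {Q:.4f}")
```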


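Error Estimate by Least Squares
Borel p. 204

◮ let us suppose that the errors are distributed as a Gaussian:

\[
P(\epsilon) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{\epsilon^2}{2\sigma^2}\right) \tag{4}
\]

◮ if σ is the same for all N reflections :

\[
P(\{F^{mod}_H, F^{em}_H, \sigma\}) = \left(\frac{1}{\sigma\sqrt{2\pi}}\right)^N \exp\left(-\sum_H \frac{|F^{em}_H - F^{mod}_H|^2}{2\sigma^2}\right) \tag{5}
\]

\[
\sigma \approx \sqrt{\frac{Q_{min}}{N - M}} \tag{6}
\]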
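
Equation (6) can be tried on a toy least-squares problem. The sketch makes two assumptions that the slide does not spell out: Q_min is taken as the unnormalized residual sum of squares at the minimum, and M as the number of fitted parameters. A linear model stands in for the dependence of F^mod on the positional variables, and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs, n_par, sigma_true = 2000, 3, 0.5

# Toy linear model y = A p + noise, standing in for F_em = F_mod(p) + error.
A = rng.normal(size=(n_obs, n_par))
p_true = np.array([1.0, -2.0, 0.5])
y = A @ p_true + rng.normal(scale=sigma_true, size=n_obs)

# Least-squares fit and residual sum of squares at the minimum.
p_fit, *_ = np.linalg.lstsq(A, y, rcond=None)
q_min = np.sum((y - A @ p_fit) ** 2)

# Equation (6): N observations, M fitted parameters (assumed meaning of M).
sigma_est = np.sqrt(q_min / (n_obs - n_par))
print(f"sigma true = {sigma_true}, estimated = {sigma_est:.3f}")
```

Dividing by N − M rather than N compensates for the fit having absorbed part of the noise into the M adjusted parameters.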