Absolute Structure Determination and Light-Atom Structures Simon Parsons The University of Edinburgh


  • Absolute Structure Determination

    and Light-Atom Structures

    Simon Parsons

    The University of Edinburgh

  • What This is About

    • X-ray crystallography is so popular because it directly produces images of molecules. No other technique does this.

    • One problem with the technique is that characterisation of the absolute configuration of a molecule is difficult.

    • This information is often vitally important though.

  • Absolute Structure & The Flack

    Parameter

    $$I_{\mathrm{model}}(\mathbf{h}) = (1 - x)\,|F_{\mathrm{single}}(\mathbf{h})|^2 + x\,|F_{\mathrm{single}}(\bar{\mathbf{h}})|^2$$
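The Flack model treats the crystal as an inversion twin, so each model intensity is a weighted mix of a reflection and its Friedel opposite. A minimal Python sketch of that mixing follows; the function name and the numerical Friedel pair are hypothetical and chosen only for illustration.

```python
def flack_intensity(f2_h: float, f2_hbar: float, x: float) -> float:
    """Model intensity for reflection h of an inversion twin.

    f2_h    : |F_single(h)|^2 for the single-domain model
    f2_hbar : |F_single(-h)|^2, the Friedel opposite
    x       : Flack parameter (fraction of the inverted domain)
    """
    return (1.0 - x) * f2_h + x * f2_hbar


# Hypothetical Friedel pair, not taken from any data set in the talk.
f2_h, f2_hbar = 100.0, 96.0
for x in (0.0, 0.5, 1.0):
    # Print the model intensity of h and of its Friedel opposite.
    print(x, flack_intensity(f2_h, f2_hbar, x), flack_intensity(f2_hbar, f2_h, x))
```

At x = 0.5 the two members of the pair have equal model intensity, which is why a racemic twin carries no Friedel difference.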

  • Precision of x

    Suppose I refine a structure and get x(u) = 0.00(8).

    Notice that x may be negative.

    This is unphysical but may happen statistically.

    [Figure: distribution of x about 0.00 with standard uncertainty u = 0.08, so that 3u = 0.24. The standard uncertainty u is also written σ, esd, sd or su.]

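  • [Figure: axis from x = 0 to x = 1 showing the physical range of x and the statistical range of x.]

  • How wide should the distributions be?

    [Figure: distributions centred on x = 0 and x = 1.]

  • [Figure: axis from x = 0 to x = 1 with regions labelled ‘Model right’, ‘Model wrong’ and ‘Don’t know’.]

    Flack & Bernardinelli J. Appl. Cryst. (2000), 33, 1143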

  • [Figure: axis from x = 0 to x = 1 divided into intervals of 3u, 6u and 3u.]

    3u + 6u + 3u = 12u = 1

    u = 1/12 = 0.08 ≈ 0.1
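Values such as 0.00(8) use the crystallographic convention that the digits in parentheses are the standard uncertainty in the last quoted decimal places. A small Python sketch of reading that notation and checking it against the u < 0.1 guideline is given below; the helper name `parse_su` is hypothetical.

```python
import re


def parse_su(text: str) -> tuple[float, float]:
    """Split 'value(su)' notation into (value, su).

    The digits in parentheses apply to the last decimal places of the value,
    so '-0.04(27)' means x = -0.04 with u = 0.27.
    """
    m = re.fullmatch(r"\s*([+-]?\d+)(?:\.(\d+))?\((\d+)\)\s*", text)
    if m is None:
        raise ValueError(f"not in value(su) form: {text!r}")
    whole, frac, su_digits = m.groups()
    decimals = len(frac) if frac else 0
    value = float(f"{whole}.{frac}") if frac else float(whole)
    su = int(su_digits) * 10.0 ** (-decimals)
    return value, su


for s in ("0.00(8)", "-0.04(27)"):
    value, u = parse_su(s)
    # The last column reports whether u meets the u < 0.1 guideline.
    print(s, value, u, u < 0.1)
```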

  • Refinement of x in SHELXL

    TWIN -1 0 0 0 -1 0 0 0 -1

    BASF 0.1

    N value esd shift/esd parameter

    1 4.86497 0.02065 0.000 OSF

    2 -0.03716 0.26951 0.000 BASF 1

    3 0.00881 0.00260 0.000 EXTI

  • The Problem

    • The requirement that u is less than 0.1 is actually quite difficult to attain for light-atom structures (C, H, N, O compounds).

    • A precise determination of x requires large values of f″ relative to f0 + f′.

    • But even with Cu radiation, f″(O) etc. are small.

    • Alanine C3H7NO2: x = -0.04(27) [100 K data]

  • The Problem

    • If Friedel’s law held exactly, absolute structure determination by X-ray crystallography would be impossible.

    • But anomalous scattering introduces deviations which carry absolute structure information.

    • Magnitudes depend on

    – Elements present

    – Wavelength of X-rays used.

    f″ values    Mo       Cu

    C            0.002    0.009

    N            0.003    0.018

    O            0.006    0.032

    S            0.124    0.558

  • Friedif (Friedifstat)

    Flack & Shmueli. Acta Cryst. (2007) A63, 257

    Friedif = 10⁴ χ

  • Friedif

    Flack & Bernardinelli. Acta Cryst. (2008) A64, 484

  • Test Data Sets

    • Good crystals – 23 data sets

    • Mostly 100 K data collection. High redundancy.

    • Complete or nearly complete Friedel data

    • Cu-Kα radiation. CCD instruments

    • Multiscan absorption correction.

    • Merged in SORTAV (defaults)

    • Refine against F2 with all data. SHELX weights.

    • Spherical scattering factors – Dittrich et al. Acta Cryst (2006) A62, 217

    • Refine H positions and Uiso freely.

    • Data collected by Trixie Wagner (Novartis), Olly Presly (Agilent) or me

    Parsons, Flack & Wagner Acta Cryst (2013) B69, 249

  • Test Data Sets

    Formula Space Group Redundancy R1(F>4u(F))

    C3H7NO2 P212121 25 2.19

    C25H31NO5 P212121 15 2.48

    C19H26N6O* P212121 14 4.25

    C16H18N2 P32 13 2.83

    C27H48 P21 18 2.88

    *Some disorder

  • Friedif and refined x

    Code Formula Friedif x (Normal Ref’t)

    L-Alanine C3H7NO2 34 -0.04(27)

    GKO02 C25H31NO5 32 0.01(15)

    R-CYCLO C19H26N6O 21 -0.02(27)

    TWA16A C16H18N2 13 0.00(69)

    Cholestane C27H48 9 -0.01(77)

  • Expected distribution of 23

    values of x/u

  • Refinement of x

    Code x (Normal Ref’t)

    L-Alanine -0.04(27)

    GKO02 0.01(15)

    R-CYCLO -0.02(27)

    TWA16A 0.00(69)

    Cholestane -0.01(77)

    Distribution of x/u compared

    to a unit Gaussian for 23

    Structures

    Chi2 = 0.03
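If the standard uncertainties were realistic, the ratios x/u for correctly assigned structures would scatter like a unit Gaussian and their mean square would be close to 1. The sketch below computes that mean square for the five structures tabulated above, assuming (this is an assumption, not stated on the slide) that Chi2 means the mean of (x/u)² with a true value of x = 0. The slide's figure of 0.03 refers to all 23 structures, so the number for this subset will differ.

```python
# x and u for the five structures listed on the slide (conventional refinement).
refined = {
    "L-Alanine": (-0.04, 0.27),
    "GKO02": (0.01, 0.15),
    "R-CYCLO": (-0.02, 0.27),
    "TWA16A": (0.00, 0.69),
    "Cholestane": (-0.01, 0.77),
}


def mean_square_ratio(pairs, expected=0.0):
    """Mean of ((x - expected) / u)^2; about 1 if the u values are realistic."""
    ratios = [(x - expected) / u for x, u in pairs]
    return sum(r * r for r in ratios) / len(ratios)


chi2_subset = mean_square_ratio(refined.values())
print(f"mean (x/u)^2 for the five listed structures: {chi2_subset:.4f}")
```

A value far below 1 indicates that the refined x values lie much closer to zero than their quoted uncertainties suggest, i.e. the uncertainties are overestimated.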

  • Precise Absolute Structure

    Determination

    • Is there a way to get lower (more

    realistic) standard uncertainties?

    • Post refinement methods

    – Probability the model hand is correct

    – Estimate of x (2 methods)

    • Obtaining x during refinement

    – Restraints

  • Right or Wrong?

    Bayesian Methods

    (PLATON/BIJVOET)

    y = Hooft parameter = x calculated by Bayesian methods

    Hooft, Straver & Spek. J. Appl. Cryst. (2008), 41, 96.

  • Bayesian Methods

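    Calculate structure factors $F^2_{\mathrm{single}} = F_c^2$ with x = 0.

    $$z_{\mathbf{h}} = \frac{F^2_{\mathrm{single}}(\mathbf{h}) - F_o^2(\mathbf{h})}{u(F_o^2(\mathbf{h}))}$$

    $$p(z_{\mathbf{h}}) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{z_{\mathbf{h}}^2}{2}\right)$$

    $$p(\text{observations} \mid x = 0) = \prod_{\mathbf{h}} p(z_{\mathbf{h}})$$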

  • Bayesian Methods

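    Calculate structure factors with x = 0.

    h k l      Fc2      Fo2      Sigma(Fo2)

    -6 2 1     68.55    65.70    0.71

     6 2 1     67.61    64.50    0.71

    $$z = \frac{(67.61 - 68.55) - (64.50 - 65.70)}{\sqrt{0.71^2 + 0.71^2}}$$

    Assumes measurement errors are distributed with a Gaussian pdf.

    $$p(z_{\mathbf{h}}) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{z_{\mathbf{h}}^2}{2}\right)$$

    $$z_{\mathbf{h}} = \frac{F^2_{\mathrm{single}}(\mathbf{h}) - F_o^2(\mathbf{h})}{u(F_o^2(\mathbf{h}))}$$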
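The worked example for the (6 2 1) / (-6 2 1) pair can be checked numerically. The sketch below compares the calculated and observed Bijvoet differences and converts the result to a Gaussian probability density. Combining the two standard uncertainties in quadrature is an assumption about the slide's denominator, and the function names are hypothetical.

```python
import math


def z_pair(fc2_h, fc2_hbar, fo2_h, fo2_hbar, u_h, u_hbar):
    """Standardised misfit between calculated and observed Bijvoet differences."""
    d_calc = fc2_h - fc2_hbar
    d_obs = fo2_h - fo2_hbar
    # Assumption: the two uncertainties are independent and add in quadrature.
    return (d_calc - d_obs) / math.hypot(u_h, u_hbar)


def gaussian_pdf(z):
    """Unit Gaussian probability density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


# Numbers from the slide: h = (6 2 1), Friedel opposite (-6 2 1).
z = z_pair(67.61, 68.55, 64.50, 65.70, 0.71, 0.71)
print(f"z = {z:.3f}, p(z) = {gaussian_pdf(z):.3f}")
```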


  • Bayesian Methods

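    $$q_{\mathbf{h}} = \frac{F^2_{\mathrm{single}}(\bar{\mathbf{h}}) - F_o^2(\mathbf{h})}{u(F_o^2(\mathbf{h}))}$$

    $$p(q_{\mathbf{h}}) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{q_{\mathbf{h}}^2}{2}\right)$$

    $$p(\text{observations} \mid x = 1) = \prod_{\mathbf{h}} p(q_{\mathbf{h}})$$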

  • Bayesian Methods: Right or Wrong Structure?

    $$p(x = 0 \mid \text{obs}) = \frac{p(\text{obs} \mid x = 0)\,p(x = 0)}{p(\text{obs} \mid x = 0)\,p(x = 0) + p(\text{obs} \mid x = 1)\,p(x = 1)}$$
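Because the likelihoods are products over thousands of reflections, they underflow if multiplied directly, so in practice the comparison is done with log-likelihoods. The following Python sketch evaluates the two-hypothesis probability that the model hand is correct from lists of z and q values. Equal priors for x = 0 and x = 1 are an assumption, and the function names are hypothetical rather than those of PLATON.

```python
import math


def log_gauss(z):
    """Log of the unit Gaussian probability density."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def p2_true(z_values, q_values, prior_x0=0.5):
    """p(x = 0 | obs) when only x = 0 and x = 1 are allowed.

    z_values: standardised misfits for the model as refined (x = 0)
    q_values: standardised misfits for the inverted model (x = 1)
    """
    log_l0 = sum(log_gauss(z) for z in z_values) + math.log(prior_x0)
    log_l1 = sum(log_gauss(q) for q in q_values) + math.log(1.0 - prior_x0)
    diff = log_l1 - log_l0
    if diff > 700.0:  # exp() would overflow; the probability is effectively 0
        return 0.0
    # p0 / (p0 + p1) rewritten as 1 / (1 + p1/p0)
    return 1.0 / (1.0 + math.exp(diff))
```

With identical misfits for both hands the result is 0.5; a consistent preference for one hand over many reflections drives it rapidly towards 0 or 1.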

  • Bayesian Methods

    Code Friedif x (Normal Ref’t) P2(true)

    L-Alanine 34 -0.04(27) 1.000

    GKO02 32 0.01(15) 1.000

    R-CYCLO 21 -0.02(27) 1.000

    TWA16A 13 0.00(69) 1.000

    Cholestane 9 -0.01(77) 1.000

  • Calculation of x(u) by Bayesian Methods

    • This analysis can be extended to obtain a value of x (aka y).

    • Instead of calculating probabilities at y = 0 and 1, calculate for the range 0-1, and build up a distribution.

    • Equations expressed in terms of γ (gamma) = 1-2y

    Hooft, Straver & Spek. J. Appl. Cryst. (2008), 41, 96.
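Building up a distribution over y can be sketched as a grid evaluation of the posterior. The code below is a simplified illustration, not the PLATON implementation: it assumes Gaussian errors, a flat prior, and a model in which each observed Bijvoet difference equals γ = 1 - 2y times the calculated one. The default grid is deliberately wider than 0 to 1 so that values outside the physical range are not truncated; pass `y_min=0.0, y_max=1.0` to follow the range quoted on the slide. All names and the synthetic data are hypothetical.

```python
import math


def hooft_y_grid(d_calc, d_obs, u_obs, y_min=-1.0, y_max=2.0, n=3001):
    """Posterior mean and standard uncertainty of y on a uniform grid."""
    ys = [y_min + (y_max - y_min) * i / (n - 1) for i in range(n)]
    log_post = []
    for y in ys:
        gamma = 1.0 - 2.0 * y
        chi2 = sum(((do - gamma * dc) / u) ** 2
                   for dc, do, u in zip(d_calc, d_obs, u_obs))
        log_post.append(-0.5 * chi2)
    top = max(log_post)  # subtract the maximum to avoid underflow
    weights = [math.exp(lp - top) for lp in log_post]
    norm = sum(weights)
    mean = sum(y * w for y, w in zip(ys, weights)) / norm
    var = sum((y - mean) ** 2 * w for y, w in zip(ys, weights)) / norm
    return mean, math.sqrt(var)


# Synthetic differences constructed so that gamma = 0.8, i.e. y = 0.1.
d_calc = [1.0, -2.0, 1.5, -0.5]
d_obs = [0.8 * d for d in d_calc]
y, u_y = hooft_y_grid(d_calc, d_obs, [1.0] * 4)
print(f"y = {y:.3f}, u = {u_y:.3f}")
```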

  • Hooft Parameter

    Code Friedif x (Normal Ref’t) y

    L-Alanine 34 -0.04(27) 0.01(4)

    GKO02 32 0.01(15) 0.03(3)

    R-CYCLO 21 -0.02(27) -0.02(4)

    TWA16A 13 0.00(69) 0.02(7)

    Cholestane 9 -0.01(77) -0.04(9)

    Chi2 for 23 structures = 0.83 (~1)

  • When Errors are Non-

    Gaussian (‘Poor Data’)

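    $$z_{\mathbf{h}} = \frac{F^2_{\mathrm{single}}(\mathbf{h}) - F_o^2(\mathbf{h})}{u(F_o^2(\mathbf{h}))}$$

    $$p(z_{\mathbf{h}}) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{z_{\mathbf{h}}^2}{2}\right)$$

    $$p(\text{observations} \mid x = 0) = \prod_{\mathbf{h}} p(z_{\mathbf{h}})$$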

    Hooft, Straver & Spek. J. Appl. Cryst. (2010), 43, 665

  • When Errors are Non-

    Gaussian (‘Poor Data’)

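    $$z_{\mathbf{h}} = \frac{F^2_{\mathrm{single}}(\mathbf{h}) - F_o^2(\mathbf{h})}{u(F_o^2(\mathbf{h}))}$$

    $$p(\text{observations} \mid x = 0) = \prod_{\mathbf{h}} p(z_{\mathbf{h}})$$

    Student-t, ν = 5:

    $$p(z, \nu) = \frac{\Gamma\left(\frac{\nu + 1}{2}\right)}{\sqrt{\nu\pi}\,\Gamma\left(\frac{\nu}{2}\right)} \left(1 + \frac{z^2}{\nu}\right)^{-\frac{\nu + 1}{2}}$$

    Hooft, Straver & Spek. J. Appl. Cryst. (2010), 43, 665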
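The practical effect of swapping the Gaussian for a Student-t distribution is that large misfits are penalised far less, so a few outliers no longer dominate the likelihood. The Python sketch below compares the two log densities using only the standard library; ν = 5 is taken from the slide, and the function names are hypothetical.

```python
import math


def gaussian_logpdf(z):
    """Log density of the unit Gaussian."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def student_t_logpdf(z, nu=5.0):
    """Log density of the Student-t distribution with nu degrees of freedom."""
    return (math.lgamma((nu + 1.0) / 2.0)
            - math.lgamma(nu / 2.0)
            - 0.5 * math.log(nu * math.pi)
            - (nu + 1.0) / 2.0 * math.log1p(z * z / nu))


for z in (0.0, 1.0, 3.0, 5.0):
    # Log densities side by side; the gap widens as |z| grows.
    print(z, round(gaussian_logpdf(z), 3), round(student_t_logpdf(z), 3))
```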

  • Example

    Reibenspies & Bhuvanesh Acta Cryst. (2013), B69, 288

  • ‘Quotient’ Methods

    • Systematic errors like absorption may drown out anomalous differences.

    • Measure Friedel opposites in such a way that absorption errors are the same for both.

    • Stoe - measure I(h) at 2θ, ω, χ and φ, and I(-h) at -2θ, -ω, χ and φ

    • The quotient I(h)/I(-h) is free from absorption and extinction errors. Also scale-free.

    Le Page, Gabe & Gainsford. J. Appl. Cryst. (1990), 23, 406

  • Quotients

    Parsons, Flack & Wagner Acta Cryst (2013) B69, 249

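    $$\frac{I(\mathbf{h}) - I(\bar{\mathbf{h}})}{I(\mathbf{h}) + I(\bar{\mathbf{h}})} = \frac{F_o^2(\mathbf{h}) - F_o^2(\bar{\mathbf{h}})}{F_o^2(\mathbf{h}) + F_o^2(\bar{\mathbf{h}})} = (1 - 2x)\,\frac{|F_{\mathrm{single}}(\mathbf{h})|^2 - |F_{\mathrm{single}}(\bar{\mathbf{h}})|^2}{|F_{\mathrm{single}}(\mathbf{h})|^2 + |F_{\mathrm{single}}(\bar{\mathbf{h}})|^2}$$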

  • Quotients

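    $$\frac{F_o^2(\mathbf{h}) - F_o^2(\bar{\mathbf{h}})}{F_o^2(\mathbf{h}) + F_o^2(\bar{\mathbf{h}})} = (1 - 2x)\,\frac{|F_{\mathrm{single}}(\mathbf{h})|^2 - |F_{\mathrm{single}}(\bar{\mathbf{h}})|^2}{|F_{\mathrm{single}}(\mathbf{h})|^2 + |F_{\mathrm{single}}(\bar{\mathbf{h}})|^2}$$

    Left-hand side: this can be calculated from your data set.

    Right-hand quotient: this can be calculated (Fc2 for a model refined with x = 0).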

  • $$\frac{F_o^2(\mathbf{h}) - F_o^2(\bar{\mathbf{h}})}{F_o^2(\mathbf{h}) + F_o^2(\bar{\mathbf{h}})} = (1 - 2x)\,\frac{|F_{\mathrm{single}}(\mathbf{h})|^2 - |F_{\mathrm{single}}(\bar{\mathbf{h}})|^2}{|F_{\mathrm{single}}(\mathbf{h})|^2 + |F_{\mathrm{single}}(\bar{\mathbf{h}})|^2}$$

    $$Q_o(\mathbf{h}) = (1 - 2x)\,Q_{\mathrm{single}}(\mathbf{h})$$

    y = mx

  • Fit graph to y = mx

    $$Q_o(\mathbf{h}) = (1 - 2x)\,Q_{\mathrm{single}}(\mathbf{h})$$

    Equate gradient m to (1-2x)

    Solve for Flack:

    Gradient = 0.893(50) = 1 – 2x

    Flack = 0.05(3)

    Implemented in XPREP and SHELXL-2013.

    No TWIN/BASF instructions!
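The straight-line fit can be sketched as a weighted least-squares regression through the origin, followed by the conversion x = (1 - m)/2 and u(x) = u(m)/2. This is an illustrative sketch rather than the XPREP or SHELXL code: the function names are hypothetical, and u(m) is taken from the formal weights without any goodness-of-fit scaling.

```python
import math


def fit_through_origin(q_single, q_obs, u_obs):
    """Weighted least-squares gradient m of q_obs = m * q_single, with its su."""
    sxx = sum((qs / u) ** 2 for qs, u in zip(q_single, u_obs))
    sxy = sum(qs * qo / u ** 2 for qs, qo, u in zip(q_single, q_obs, u_obs))
    return sxy / sxx, math.sqrt(1.0 / sxx)


def flack_from_gradient(m, u_m):
    """Convert the gradient m = 1 - 2x into x and its standard uncertainty."""
    return (1.0 - m) / 2.0, u_m / 2.0


# Gradient quoted on the slide: 0.893(50).
x, u_x = flack_from_gradient(0.893, 0.050)
print(f"x = {x:.4f}, u = {u_x:.4f}")
```

Rounded to two decimal places, this agrees with the value of 0.05(3) quoted on the slide.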

  • Code Friedif x (Normal Ref’t) x(QUOT)

    L-Alanine 34 -0.04(27) 0.01(4)

    GKO02 32 0.01(15) 0.02(3)

    R-CYCLO 21 -0.02(27) 0.00(4)

    TWA16A 13 0.00(69) 0.18(8)

    Cholestane 9 -0.01(77) -0.01(13)

  • Differences

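    $$F_o^2(\mathbf{h}) - F_o^2(\bar{\mathbf{h}}) = (1 - 2x)\left(|F_{\mathrm{single}}(\mathbf{h})|^2 - |F_{\mathrm{single}}(\bar{\mathbf{h}})|^2\right)$$

    $$D_o(\mathbf{h}) = (1 - 2x)\,D_{\mathrm{single}}(\mathbf{h})$$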

    Parsons, Flack & Wagner Acta Cryst (2013) B69, 249

  • The Post-Refinement Problem

    A potential problem with post-refinement methods is

    that any correlations involving x are lost.

    But…

    These quantities can also be applied as restraints.

    Need to code F2(h) etc. in terms of x’s, U’s, occs and

    so on.

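    $$\frac{F_o^2(\mathbf{h}) - F_o^2(\bar{\mathbf{h}})}{F_o^2(\mathbf{h}) + F_o^2(\bar{\mathbf{h}})} = (1 - 2x)\,\frac{|F_{\mathrm{single}}(\mathbf{h})|^2 - |F_{\mathrm{single}}(\bar{\mathbf{h}})|^2}{|F_{\mathrm{single}}(\mathbf{h})|^2 + |F_{\mathrm{single}}(\bar{\mathbf{h}})|^2}$$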

  • Facilities in Topas Academic 5

    • A symbolic equation can be written for the quotient.

    • Like a function or subroutine in Fortran

    fn FPC(h, k, l)

    {

    return

    2.31000*Exp( -20.84390*s2(h,k,l)) +

    1.02000*Exp( -10.20750*s2(h,k,l)) +

    1.58860*Exp( -0.56870*s2(h,k,l)) +

    0.86500*Exp( -51.65120*s2(h,k,l)) +

    ( 0.23370);

    }

    prm !FPPC 0.00910
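The Topas function above has the form of a four-Gaussian expansion plus a constant, evaluated as a function of s2. A Python equivalent is sketched below with the coefficients copied from the slide. It assumes that s2 stands for (sin θ/λ)² in Å⁻², and the reading of FPC and FPPC as the real and imaginary scattering-factor terms for carbon is inferred from the names, not stated in the source.

```python
import math

# Coefficients copied from the Topas function FPC on the slide.
A = (2.31000, 1.02000, 1.58860, 0.86500)
B = (20.84390, 10.20750, 0.56870, 51.65120)
C = 0.23370
FPPC = 0.00910  # fixed parameter from the slide ('!' marks it as not refined)


def fpc(s2: float) -> float:
    """Sum of four Gaussians in s2 plus a constant, as in the Topas fn FPC.

    s2 is assumed to be (sin(theta)/lambda)^2 in reciprocal square angstroms.
    """
    return sum(a * math.exp(-b * s2) for a, b in zip(A, B)) + C


for s2 in (0.0, 0.05, 0.25):
    print(s2, round(fpc(s2), 4))
```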

  • fn X1O1(h, k, l) = tpi*(h*xO1 + k*yO1 + l*zO1);

    ...

    fn QUOT(h, k, l) = (1-2*ENANTIO)*( (

    U1O1(h, k ,l)*(FPO(h,k,l)*Cos(X1O1(h, k, l)) - FPPO*Sin(X1O1(h, k, l)))

    +

    U2O1(h, k ,l)*(FPO(h,k,l)*Cos(X2O1(h, k, l)) - FPPO*Sin(X2O1(h, k, l)))

    +

    ...

    restraint = ( 0.01642 - QUOT( 5, 2, 3 ))/ 0.00978;

    restraint = ( 0.02984 - QUOT( 4, 3, 2 ))/ 0.00968;

    restraint = ( -0.02283 - QUOT( 3, 1, 9 ))/ 0.00924;

    restraint = ( 0.04175 - QUOT( 1, 2, 8 ))/ 0.02252;

    Refine with intensity data merged in the centrosymmetric Laue

    group and Q (or D) applied as restraints.
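Each restraint line has the form (observed quotient - calculated quotient) / standard uncertainty, so that every Bijvoet pair contributes one standardised residual to the refinement. The Python sketch below mirrors that bookkeeping with the four restraints shown on the slide. The callable `quot` stands in for the symbolic QUOT function and is hypothetical, as is the placeholder model used to exercise it.

```python
# (h, k, l), observed quotient, standard uncertainty: values from the slide.
RESTRAINTS = [
    ((5, 2, 3), 0.01642, 0.00978),
    ((4, 3, 2), 0.02984, 0.00968),
    ((3, 1, 9), -0.02283, 0.00924),
    ((1, 2, 8), 0.04175, 0.02252),
]


def restraint_residuals(restraints, quot):
    """Standardised residuals (Q_obs - Q_calc) / u for each Bijvoet pair."""
    return [(q_obs - quot(h, k, l)) / u for (h, k, l), q_obs, u in restraints]


def null_model(h, k, l):
    """Placeholder for QUOT: a model with no Friedel differences at all."""
    return 0.0


residuals = restraint_residuals(RESTRAINTS, null_model)
# Sum of squared residuals: the quantity these restraints add to the minimisation.
print(sum(r * r for r in residuals))
```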

  • Code          Friedif    x(QUOT) Post Refine    x(QUOT) Refine

    L-Alanine     34         0.01(4)                0.01(3)

    GKO02         32         0.02(3)                0.03(3)

    R-CYCLO       21         0.00(4)                -0.02(4)

    TWA16A        13         0.18(8)                0.14(8)

    Cholestane    9          -0.01(13)              0.00(11)

  • Summary

    Code          x(TWIN)      y(HOOFT)    x(QUOT) Post Refine    x(QUOT) Refine

    L-Alanine     -0.04(27)    0.01(4)     0.01(4)                0.01(3)

    GKO02         0.01(15)     0.03(3)     0.02(3)                0.03(3)

    R-CYCLO       -0.02(27)    -0.02(4)    0.00(4)                -0.02(4)

    TWA16A        0.00(69)     0.02(7)     0.18(8)                0.14(8)

    Cholestane    -0.01(77)    -0.04(9)    -0.01(13)              0.00(11)

  • Validation – Alanine (34)

  • Cholestane (9)

  • Cholestane (9)

  • TWA16A

    Outlier omission: 0.18(8) → 0.08(8)

  • Outlier Detection

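    • Remove Bijvoet pairs if $D_o(\mathbf{h}) > 2D_{c,\max}$

    • For quotient calculations remove data where $F_o^2(\mathbf{h})/u(F_o^2(\mathbf{h}))$ and $F_c^2(\mathbf{h})/u(F_o^2(\mathbf{h}))$ are < 3.

    $$\frac{F_o^2(\mathbf{h}) - F_o^2(\bar{\mathbf{h}})}{F_o^2(\mathbf{h}) + F_o^2(\bar{\mathbf{h}})} = (1 - 2x)\,\frac{|F_{\mathrm{single}}(\mathbf{h})|^2 - |F_{\mathrm{single}}(\bar{\mathbf{h}})|^2}{|F_{\mathrm{single}}(\mathbf{h})|^2 + |F_{\mathrm{single}}(\bar{\mathbf{h}})|^2}$$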

  • Normal Probability Plots

  • Validation

    Mandelic acid

  • Why might this work?

  • Conclusions

    • Several methods are available for precise absolute structure determination of light-atom structures.

    • Post-refinement calculations are OK, but there is still work to do on the effects of completeness.

    • Validation is important.

    • Still work to do on automatic detection of outliers.

    • Transformation into sensitive and insensitive components is important. Still work to do on the reasons the methods work, or why least squares doesn’t.

  • Acknowledgements

    • Howard Flack (Geneva)

    • Trixie Wagner (Novartis)

    • Alan Coelho (Topas)

    • Richard Cooper & David Watkin (Oxford)

    • George Sheldrick (Göttingen)