

J. Appl. Cryst. (1988). 21, 485-489

Information in Powder Pattern Indexing

BY DANIEL TAUPIN

Laboratoire de Physique des Solides, associé au CNRS, Bâtiment 510, 91405 Orsay, France

(Received 8 January 1988; accepted 6 May 1988)


Abstract

A criterion to estimate the reliability of powder pattern indexings, based upon information theory principles, is proposed. It takes account of the individual discrepancies between computed and observed line positions, of the unit-cell volume and of the systematic extinction rules due to the lattice type. The resulting figure is in fact the information quantity conveyed by the set of indices assigned to the given experimental lines.

I. Introduction

Half a dozen programs at least are now available to assign indices to X-ray powder patterns of unknown lattice cell: Werner (1964), Taupin (1968, 1973), Visser (1969), Louër & Louër (1972), Kohlbeck & Hörl (1976), Werner, Eriksson & Westdahl (1985) etc. These programs and others have been discussed by Shirley (1978) who gives some other references. Usually, these programs do not give only one solution but several possibilities, sometimes a great many, as does ours (Taupin, 1973) when loose constraints and allowed discrepancies are introduced by inexperienced users. This raises the unavoidable question: how can we distinguish between bad and good solutions?

An interesting attempt to solve this problem has been made by de Wolff (1968) who proposed a simple criterion. His 'figure of merit' for an indexed pattern is defined as

$$M_{20} = \frac{Q_{20}}{2\bar{\varepsilon}N_{20}} \tag{1}$$

where $Q_{20}$ is the value of $1/d_{hkl}^2$ of the 20th indexed line, $N_{20}$ is the number of different calculated lines such that $Q_i < Q_{20}$ and $\bar{\varepsilon}$ is the average deviation between calculated and observed line positions. The obvious underlying idea of de Wolff is to compute the ratio of the $Q$ space 'covered' (we denote $Q = 1/d^2$ and $q = 1/d$) by experimental lines to the coverage by the numerous possible lines in the same range. We used this criterion to evaluate the quality of the numerous solutions yielded by our program for nearly 20 years, with a good proportion of satisfactory ratings.
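As a minimal sketch of equation (1), the figure of merit can be computed from three lists of $Q$ values. The function name, the argument names and the toy data are hypothetical and are not taken from de Wolff's or the author's programs; the sketch also assumes that $Q_n$ is the calculated value of the $n$th indexed line and that the average deviation is the mean absolute difference in $Q$.

```python
def de_wolff_merit(q_obs, q_matched, q_all_calc, n=20):
    """Return M_n = Q_n / (2 * mean_dev * N_n); all inputs are Q = 1/d^2 values.

    q_obs       observed Q of the indexed lines, in increasing order
    q_matched   calculated Q assigned to each observed line (same order)
    q_all_calc  every distinct calculated Q of the trial lattice
    """
    if len(q_obs) < n or len(q_matched) < n:
        raise ValueError("fewer than n indexed lines are available")
    q_n = q_matched[n - 1]                            # Q of the n-th indexed line
    mean_dev = sum(abs(o - c) for o, c in zip(q_obs[:n], q_matched[:n])) / n
    n_calc = sum(1 for q in q_all_calc if q < q_n)    # calculated lines below Q_n
    if mean_dev == 0.0 or n_calc == 0:
        raise ValueError("merit undefined for zero deviation or no calculated line")
    return q_n / (2.0 * mean_dev * n_calc)


# Toy data invented for illustration, evaluated with n = 2 instead of 20.
m2 = de_wolff_merit([0.0101, 0.0399], [0.0100, 0.0400],
                    [0.0100, 0.0200, 0.0300, 0.0400], n=2)
print(round(m2, 1))
```

The explicit `n` argument makes the arbitrariness of the cutoff visible: the same pattern gives a different merit for every choice of `n`.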

Although this criterion proves to be empirically very useful, it deserves some theoretical and practical criticism.

Its main drawback comes from that arbitrary number of 20 indexed lines. Of course, one could choose to consider 15 or 30 or 100 lines when available, but it is rather disappointing either to discard lines above that limit when evaluating the reliability of the output lattice, or to inhibit the comparison because the number of lines available is too low. On the other hand, the density of computed lines within a given interval of $q = 1/d$ increases as $q^2$ while the density of observed lines remains rather steady due to decreasing average intensities when reaching high labels. Thus, $M_{20}$ is often half of $M_{10}$ and $M_{50}$ is usually insignificant and low.

We are therefore looking for another criterion which: (1) does not strongly depend upon an arbitrary number such as 20; (2) steadily increases when new indexed lines are found at higher angles; (3) does not discard a part of the result of the computation.

Information theory readily gives a straightforward solution to this problem, which we describe below.

II. Recalling the basis of information theory

By definition, the information $\mathcal{H}(A)$ associated with an event $A$ is the cologarithm of the probability $\mathcal{P}(A)$ of that event, before it happened:

$$\mathcal{H}(A) = -\log\mathcal{P}(A). \tag{2}$$

Depending on the kind of logarithm used, the information is expressed in bits (case of log2), in nats (case of the Napierian logarithm), or in other units. We shall use the most expressive way which consists in using bits: an event which had a prior probability of 50% yields one bit of information.

Since, by definition, a probability is a real number contained within [0, 1], then

$$\mathcal{H}(A) \ge 0. \tag{3}$$

If two events $A$ and $B$ have independent probabilities, the probability of the event $A \cap B$ (that is both $A$ and $B$ happening) is the product of their probabilities, hence:

$$\mathcal{H}(A \cap B) \equiv \mathcal{H}(A, B) = \mathcal{H}(A) + \mathcal{H}(B); \quad A, B \text{ are independent events.} \tag{4}$$

0021-8898/88/050485-05$03.00 © 1988 International Union of Crystallography


If the events were not independent then the probability of $B$ should be replaced by the probability of $B$ knowing that $A$ happened; this latter probability might be less or greater than the a priori probability of $B$ but, in any case, it is also within the interval [0, 1], so that

$$\mathcal{H}(A, B) \ge \mathcal{H}(A), \qquad \mathcal{H}(A, B) \ge \mathcal{H}(B). \tag{5}$$
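The definitions of this section can be illustrated with a few lines of Python. This is only a sketch with invented probabilities; the function name `information_bits` is hypothetical.

```python
import math


def information_bits(p):
    """Information, in bits, of an event of prior probability p (0 < p <= 1)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    return -math.log2(p)


p_a, p_b = 0.5, 0.25                     # illustrative prior probabilities
h_a = information_bits(p_a)              # a 50% event yields one bit
h_b = information_bits(p_b)
h_ab = information_bits(p_a * p_b)       # independent events: informations add

# Dependent case: P(B | A) may differ from P(B) but never exceeds 1,
# so the joint information cannot fall below the information of A alone.
p_b_given_a = 0.9
h_ab_dependent = information_bits(p_a * p_b_given_a)
```

With these numbers the joint information of the independent events equals the sum of the individual informations, and the dependent case still satisfies the first inequality of (5).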

III. Information theory in powder pattern indexing

By definition, we consider the merit $M$ of a given lattice $L(a, b, c, \alpha, \beta, \gamma, \text{system}, \text{type})$ as being equal to the information conveyed by the event $S(\{q_i, \varepsilon_i, i \in 1{:}N\})$, namely the fact that each experimental line $q_i$ is indexed with a deviation $\varepsilon_i$,

$$M(L) = -\log_2\mathcal{P}[S(\{q_i, \varepsilon_i, i \in 1{:}N\})]. \tag{6}$$

Obviously one could object that, once the lattice $L$ is defined by its parameters, the probability of the event $S$ is fully determined, namely it is 0 if the indexing fails, and 1 if it succeeds. In fact we do not intend to consider here the accurate value of these parameters, but only their order of magnitude: as an example a lattice with parameters ranging from 100 to 200 Å has much more chance to index a typical inorganic diagram than a lattice of parameters ranging from 2 to 5 Å. Therefore, success in indexing a given pattern with a large cell gives much less information than indexing it - even with only fair agreement - with a cell of small parameters.

On the other hand, the probability that a given experimental line is indexed by a given lattice is nearly independent of whether other experimental lines have already been indexed or not. In practice, when we make such an assumption, we neglect two situations: (1) closely neighbouring experimental lines, where the fact that one is indexed reduces the chances of indexing its neighbour (because one calculated line may not index two or more experimental lines); (2) lines which 'obviously' are multiples of each other, where the fact that one is indexed increases the chances of its multiple.

We show below that the first situation can be approximately handled by some adjustment in the computation of the merit. The second one can practically be neglected, since experimental errors forbid one from stating with certainty that a line is actually the multiple of another.

Thus, the merit M of a lattice L collated with a set of experimental lines is the sum of the merits of indexing each of these lines:

$$M = \sum_{i=1}^{N} M_i \tag{7}$$

$$M_i = -\log_2\mathcal{P}[S(q_i, \varepsilon_i)] \tag{8}$$

where $N$ is the number of experimental lines, and where $\mathcal{P}[S(q_i, \varepsilon_i)]$ is the probability of the event $S(q_i, \varepsilon_i)$ that the experimental line at position $q_i$ is indexed with a discrepancy $\varepsilon_i$ by the lattice $L$.

IV. Establishing the expression for the merit of a trial lattice

Let us first consider the simple question: what is the probability that a line at position $q_i$ is indexed by a lattice $L$ with a maximum discrepancy of $\varepsilon_i$? This is equivalent to finding the probability that at least one computed line of the lattice $L$ falls within the interval $[q_i - \varepsilon_i, q_i + \varepsilon_i]$. To calculate this probability, we have to consider the a priori density $n(q_i)$ of computed lines generated by the lattice $L$ in the neighbourhood of $q_i$. For ordinary lattices - i.e. lattices whose three cell parameters are of the same order of magnitude - $n(q)$ is simply given by

$$n(q) = 4\pi q^2 V/\mu \tag{9}$$

where $V$ is the volume of the unit cell of the lattice $L$ and $\mu$ is a multiplicity factor which takes into account the fact that $hkl$ lines are identical to $\bar{h}\bar{k}\bar{l}$ lines in powder diagrams, that orthorhombic systems only consider the magnitude of $h$, $k$ and $l$, that monoclinic systems only consider one minus sign, etc. The factor $\mu$ also considers lattice types with extinction rules (S, F, I lattices). Its precise values will be discussed later.

The probability that, given an a priori density $n(q)$, at least one computed line falls within an interval of width $2\varepsilon_i$ is a classical probability problem which is easily solved by calculating the complementary probability that no line falls within. The result is

$$\mathcal{P}[S(q_i, \varepsilon_i)] = 1 - \exp[-2\varepsilon_i n(q_i)] \tag{10}$$

where one should resist the temptation of approximating it by $2\varepsilon_i n(q_i)$ since this quantity frequently exceeds one in the domain where the average distance between computed lines is similar to the discrepancy (or expected discrepancy) $\varepsilon_i$. Of course, such regions do not contribute significantly to the 'merit' of the lattice, but the criterion should not, however, give them a negative contribution.
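Equations (9) and (10) can be sketched as follows, to show why the linear approximation is unsafe. The function names and the numerical values (a line at $q = 0.5$ Å⁻¹, an allowed deviation of 0.001 Å⁻¹, $\mu = 8$) are illustrative assumptions, not data from the paper.

```python
import math


def line_density(q, volume, mu):
    """A priori number of computed lines per unit of q, n(q) = 4*pi*q^2*V/mu."""
    return 4.0 * math.pi * q * q * volume / mu


def hit_probability(q, eps, volume, mu):
    """Probability that at least one computed line falls in [q - eps, q + eps]."""
    expected = 2.0 * eps * line_density(q, volume, mu)   # mean number of lines
    return -math.expm1(-expected)                        # equals 1 - exp(-expected)


q, eps, mu = 0.5, 0.001, 8                # illustrative values
for volume in (1000.0, 5000.0):           # cell volumes in cubic angstroms
    linear = 2.0 * eps * line_density(q, volume, mu)
    exact = hit_probability(q, eps, volume, mu)
    print(f"V={volume:6.0f}  2*eps*n={linear:.3f}  P={exact:.3f}")
```

For the larger cell the linear term exceeds one, which would give a negative information, whereas the exact expression stays below one. `math.expm1` is used only to keep precision when the expected number of lines is very small.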

Putting together the previous results, one readily gets the expression of the information conveyed by the success of indexing the set $\{q_i, \varepsilon_i\}$ with the lattice $L$:

$$M = -\frac{1}{\log 2}\sum_{i=1}^{N}\log\left[1 - \exp\left(-2\varepsilon_i\,4\pi q_i^2 V/\mu\right)\right]. \tag{11}$$

All the parameters of this formula are obvious but one, namely $\varepsilon_i$. In fact, it raises the question whether one should use the a priori allowed deviations used in indexing routines or the a posteriori deviations observed between computed and experimental line positions. The answer is not at all trivial, but it can be deduced from the following considerations: (1) It would be strange that the merit did not depend upon the actual final deviations. (2) Moreover, it would be strange that the same results had different merits depending on parameters $\varepsilon_i$ which remain arbitrary, once the complete indexing has succeeded. (3) Conversely, using a posteriori data to compute a priori probabilities is also unusual, at least. If this happens - and, in fact, it does - only global and vague parameters (such as the volume or the $\chi^2$) may be used, and obviously not the precise positions of the calculated lines. (4) Moreover, considering actual deviations might lead to infinite merit if some computed lines happened to fall exactly at the same position as an experimental one.

In fact, all these difficulties come from the fact that the allowed deviations (or the expected deviations) are a priori stated on an arbitrary scale, which is subject to adjustment according to the average observed deviations. This means that one has first to compute the sum of the squared deviations (the $\chi^2$), namely

$$\chi^2 = \sum_{i=1}^{N}(q_i - r_i)^2/\varepsilon_i^2, \tag{12}$$

whose expectation is equal to $N - \nu$, where $\nu$ is the number of parameters of the crystal family ($\nu = 1$ for cubic, $\nu = 2$ for hexagonal and tetragonal, $\nu = 3$ for orthorhombic, $\nu = 4$ for monoclinic and $\nu = 6$ for triclinic families), and where $r_i$ is the computed value of $1/d_{hkl}$ which has been associated with the experimental line $q_i$ by the indexing process. Next, the value of the resulting reduced $\chi^2$ is used to 'scale' the stated $\varepsilon_i$ (see Taupin, 1988, § VIII.2), which are replaced in the formula by

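$$\delta_i = \chi_r\varepsilon_i \tag{13}$$

where

$$\chi_r = (\chi_r^2)^{1/2} = \left[\frac{1}{N - \nu}\sum_{i=1}^{N}\frac{(q_i - r_i)^2}{\varepsilon_i^2}\right]^{1/2}, \tag{14}$$

so that

$$M = -\frac{1}{\log 2}\sum_{i=1}^{N}\log\left[1 - \exp\left(-8\pi q_i^2\varepsilon_i\chi_r V/\mu\right)\right]. \tag{15}$$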
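Equations (12) to (15) can be sketched in Python as follows. The function names, the dictionary of free parameters and the synthetic line list are assumptions made for illustration; the returned infinity for a vanishing probability is a choice of this sketch, not of the paper.

```python
import math

# Number of free cell parameters per crystal family, as listed in the text.
FREE_PARAMETERS = {"cubic": 1, "hexagonal": 2, "tetragonal": 2,
                   "orthorhombic": 3, "monoclinic": 4, "triclinic": 6}


def reduced_chi(q_obs, q_calc, eps, family):
    """Equation (14): square root of chi^2 divided by (N - nu)."""
    nu = FREE_PARAMETERS[family]
    n = len(q_obs)
    if n <= nu:
        raise ValueError("need more lines than free cell parameters")
    chi2 = sum(((q - r) / e) ** 2 for q, r, e in zip(q_obs, q_calc, eps))
    return math.sqrt(chi2 / (n - nu))


def information_merit(q_obs, q_calc, eps, volume, mu, family):
    """Equation (15): merit in bits, with the stated eps rescaled by chi_r."""
    chi_r = reduced_chi(q_obs, q_calc, eps, family)
    merit = 0.0
    for q, e in zip(q_obs, eps):
        x = 8.0 * math.pi * q * q * e * chi_r * volume / mu
        p = -math.expm1(-x)              # 1 - exp(-x)
        if p == 0.0:                     # every deviation is exactly zero
            return math.inf
        merit -= math.log2(p)
    return merit


# Synthetic example: five lines, each missed by half the stated deviation.
q_obs = [0.20, 0.30, 0.40, 0.50, 0.60]
q_calc = [q - 0.001 for q in q_obs]
eps = [0.002] * len(q_obs)
m_small = information_merit(q_obs, q_calc, eps, 500.0, 8, "orthorhombic")
m_large = information_merit(q_obs, q_calc, eps, 5000.0, 8, "orthorhombic")
```

With identical deviations, the tenfold larger cell obtains the lower merit, in line with the remark that indexing with a large cell conveys less information.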

An important feature of the above formula must be pointed out: for a given set of experimental lines and stated $\varepsilon_i$, the value of $M$ is a function of the product $\chi_r V$. Therefore, when trying to index the given set of experimental lines by a certain lattice cell, the value of $M$ is immediately deduced from $\chi_r$ and conversely - provided that the function $M(\chi_r V)$ has been tabulated. Thus, some given (or computed) minimum merit $M_{\min}$ will result in a maximum value of the product $\chi_r V$. If the volume is known (this is the case when one tries to index with a tentative cell) then a maximum value of $\chi_r$, namely $\chi_{r\max}$, can be established, which fulfils the condition

$$M_{\min} = M(\chi_{r\max}V) \tag{16}$$

so that trials which would lead to poor merits can be rejected as soon as the sum inside the expression for $\chi_r^2$ exceeds the corresponding threshold, namely when

$$\sum_{i=1}^{k<N}(q_i - r_i)^2/\varepsilon_i^2 > (N - \nu)\chi_{r\max}^2. \tag{17}$$

This may save computer time.

Moreover, $\chi_r$ can reasonably be enclosed within a restricted interval: if the maximum allowed deviations $\{\varepsilon_i\}$ are neither over- nor underestimated, then $\chi_r$ will approximately range in the interval [0.25, 0.5]. Indeed, the indexing program is likely to have overestimated these allowed deviations by a factor of 2 or 3, but not by a factor of more than 10. Thus, a reasonable maximum value $V_{\max}$ of the cell volume can be deduced from a given minimum merit by solving

$$M_{\min} = M(0.05\,V_{\max}). \tag{18}$$

This has a twofold advantage: (1) it gives a sensible maximum volume of the cell, depending on the symmetry, on the extinction rules sought and - obviously - on the required merit $M_{\min}$; (2) once a possible solution has been found, the minimum merit for further trials is significantly increased, so that the upper limit $V_{\max}$ of the unit-cell volume is often drastically cut, thus dramatically reducing the computer time consumed.
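The use of equations (16) to (18) can be sketched as follows. Since the merit decreases monotonically with the product $\chi_r V$, a simple bisection stands in here for the tabulation mentioned in the text; the function names, the bracket limits and the sample lines are assumptions of this sketch, and the factor 0.05 is the one appearing in equation (18).

```python
import math


def merit_of_product(q_obs, eps, mu, chi_v):
    """Equation (15) written as a function of the product chi_v = chi_r * V."""
    merit = 0.0
    for q, e in zip(q_obs, eps):
        p = -math.expm1(-8.0 * math.pi * q * q * e * chi_v / mu)
        merit -= math.log2(p)
    return merit


def max_product(q_obs, eps, mu, m_min, lo=1e-6, hi=1e9, steps=200):
    """Largest chi_r * V whose merit is still at least m_min (bisection)."""
    if merit_of_product(q_obs, eps, mu, lo) < m_min:
        raise ValueError("m_min cannot be reached inside the bracket")
    if merit_of_product(q_obs, eps, mu, hi) >= m_min:
        return hi
    for _ in range(steps):
        mid = math.sqrt(lo * hi)         # bisect on a logarithmic scale
        if merit_of_product(q_obs, eps, mu, mid) >= m_min:
            lo = mid
        else:
            hi = mid
    return lo


def exceeds_threshold(q_obs, q_calc, eps, nu, chi_r_max):
    """Equation (17): True as soon as the running sum passes the limit."""
    limit = (len(q_obs) - nu) * chi_r_max ** 2
    running = 0.0
    for q, r, e in zip(q_obs, q_calc, eps):
        running += ((q - r) / e) ** 2
        if running > limit:
            return True                  # the trial can be abandoned here
    return False


q_obs = [0.20, 0.30, 0.40, 0.50, 0.60]   # illustrative lines
eps = [0.002] * len(q_obs)
product = max_product(q_obs, eps, 8, m_min=20.0)
v_max = product / 0.05                   # equation (18)
```

For a tentative cell of known volume `V`, the same `product` gives the rejection threshold as `chi_r_max = product / V`, which is then passed to `exceeds_threshold`.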

This raises another question, namely which minimum merit makes another cell eligible, once a satisfactory solution has already been found.

V. Merits and relative probabilities

As defined above, the merit can be understood as the cologarithm of the probability that the given set of experimental lines has been indexed by chance. Thus, if two cells $C_1$ and $C_2$ have respective merits $M_1$ and $M_2$ ($M_1 > M_2$), it can be stated that the probability of $C_2$ being indexed by chance is $2^{M_1 - M_2}$ times higher than that of $C_1$. In other words $C_1$ is $2^{M_1 - M_2}$ times more likely than $C_2$.

Therefore and in practice, once a satisfactory cell has been found with a merit M, any further cell with merit less than M - 10 can be rejected with less than 1/1000 chance of error.

Setting the minimum merit $M_{\min}$ to the value of the best found one, minus 10, and taking account of the conclusions of the previous section to reduce $V_{\max}$ actually results in cutting computer time spent in useless trials by a factor which may reach 10 and sometimes 100.
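The arithmetic behind the 10-bit margin is short enough to be written out. The function names below are hypothetical; only the margin of 10 bits comes from the text.

```python
def likelihood_ratio(m1, m2):
    """How many times more likely the cell of merit m1 is than the cell of merit m2."""
    return 2.0 ** (m1 - m2)


def merit_floor(best_merit, margin_bits=10.0):
    """Minimum merit required of further trials once a cell of best_merit is known."""
    return best_merit - margin_bits


ratio = likelihood_ratio(30.0, 20.0)     # a 10-bit gap
floor = merit_floor(42.5)
```

A gap of 10 bits corresponds to a ratio of $2^{10} = 1024$, which is where the figure of less than 1/1000 chance of error comes from.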

VI. The case of experimental lines close to each other

All the above treatment assumes that the probability of an experimental line being indexed is independent of the fact that the others already have been indexed. This assumption becomes false when some given experimental lines are close to each other so that the domains where the calculated line must fall overlap, i.e. when a unique calculated line could correspond to two or more distinct experimental lines. This is the case when there exists an $i$ such that

$$\varepsilon_i + \varepsilon_{i+1} > q_{i+1} - q_i \tag{19}$$

(we assume that experimental lines are ordered by increasing values of Bragg angles, i.e. by increasing $q_i$). If one were to keep strictly to mathematical rigour, this would lead to complicated conditional probability calculus in order to estimate the conditional probability that the line $i + 1$ be indexed, knowing that the $i$th is already indexed, depending on the distance between these experimental lines. We do not think that the need for precision in merit evaluation is such that it is worthwhile to perform this sophisticated calculus.

In this situation, the sensible simple solution consists of replacing the term $2\varepsilon_i$ in (11) by the sum of maximum possible deviations on each side, namely

$$M = -\frac{1}{\log 2}\sum_{i=1}^{N}\log\left\{1 - \exp\left[-(\Delta q_{i-} + \Delta q_{i+})\,4\pi q_i^2 V/\mu\right]\right\} \tag{20}$$

where

$$\Delta q_{i-} = \min[\chi_r\varepsilon_i,\ 0.5(q_i - q_{i-1})], \qquad \Delta q_{i+} = \min[\chi_r\varepsilon_i,\ 0.5(q_{i+1} - q_i)]. \tag{21}$$

This latter formula exhibits a drawback compared with (15); it is no longer a function of $\chi_r V$ only. However, it must be pointed out that (15) is always a slightly pessimistic estimation of (20). Therefore, we suggest that (15) is used for prediction purposes and for the establishment of maximum values of $\chi_r$ and $V$, and that (20) is used only for final merit estimations.
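A sketch of equations (20) and (21) follows. The treatment of the first and last lines, which have only one neighbour, is an assumption of this sketch (the missing side is left unclipped), as are the function names and the sample values; the lines are assumed to be sorted by increasing $q$.

```python
import math


def half_widths(q_obs, eps, chi_r):
    """Equation (21): left and right search half-widths of every line."""
    n = len(q_obs)
    widths = []
    for i in range(n):
        left = right = chi_r * eps[i]
        if i > 0:                        # clip at the midpoint to the previous line
            left = min(left, 0.5 * (q_obs[i] - q_obs[i - 1]))
        if i < n - 1:                    # clip at the midpoint to the next line
            right = min(right, 0.5 * (q_obs[i + 1] - q_obs[i]))
        widths.append((left, right))
    return widths


def merit_close_lines(q_obs, eps, chi_r, volume, mu):
    """Equation (20): merit in bits with neighbour-limited search intervals."""
    merit = 0.0
    for q, (left, right) in zip(q_obs, half_widths(q_obs, eps, chi_r)):
        x = (left + right) * 4.0 * math.pi * q * q * volume / mu
        if x <= 0.0:                     # coincident lines leave no interval
            return math.inf
        merit -= math.log2(-math.expm1(-x))
    return merit


q_obs = [0.300, 0.301, 0.500]            # the first two lines nearly overlap
eps = [0.002] * 3
widths = half_widths(q_obs, eps, chi_r=0.5)
clipped = merit_close_lines(q_obs, eps, 0.5, 800.0, 8)
```

Because the clipped intervals are never wider than $2\chi_r\varepsilon_i$, the value obtained this way is never lower than the one given by (15), which is the sense in which (15) is the pessimistic estimate.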

As an example, we give in Fig. 1 a comparison of the variation of de Wolff's merit and of the information merit for an arbitrary lattice. In contradiction with common affirmations, it shows that higher indexed lines convey information which cannot be neglected, although less important than the information conveyed by the first ones.

VII. The case of non-indexed lines

Most of the powder indexing programs allow (or optionally allow) a number of lines to be discarded, if they cannot be indexed within the given specifications. If these partially failing lattices are considered, then it is necessary to have a way of comparing them with other trial lattices which actually indexed more or fewer experimental lines.

This problem is easily solved by considering non-indexed lines as being indexed with an infinite discrepancy. Equation (20) clearly shows that this merely results in a zero contribution to the sum. Thus, even a badly indexed line is better rated than a non-indexed one, and this is rather satisfactory.
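The zero contribution of a non-indexed line can be checked numerically. The sketch below uses the single-line term of equation (11) with hypothetical names and values; Python's `math.inf` plays the role of the infinite discrepancy.

```python
import math


def line_merit(q, eps, volume, mu):
    """Contribution of one line, in bits, for a half-width eps around q."""
    x = 2.0 * eps * 4.0 * math.pi * q * q * volume / mu
    return -math.log2(-math.expm1(-x))   # -log2(1 - exp(-x))


well_indexed = line_merit(0.4, 0.0005, 800.0, 8)
badly_indexed = line_merit(0.4, 0.01, 800.0, 8)
not_indexed = line_merit(0.4, math.inf, 800.0, 8)   # infinite discrepancy
```

The three values decrease in that order, and the last one is exactly zero because the probability of finding a computed line in an unbounded interval is one.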

VIII. The multiplicity factor $\mu$

The precise values of $\mu$ have been extensively discussed by de Wolff (1961) but we do not think such accuracy is needed for the present purpose. If we keep to a first approximation, $\mu$ can be considered as the product of two contributions:

$$\mu = \mu_t\mu_s \tag{22}$$

where $\mu_t$ is related to the lattice type:

P lattice: $\mu_t = 1$
F lattice: $\mu_t = 4$
I and S lattices (A, B, C): $\mu_t = 2$

and $\mu_s$ is related to the crystal family:

cubic: $\mu_s = 32$
tetragonal: $\mu_s = 16$
hexagonal: $\mu_s = 24$
orthorhombic: $\mu_s = 8$
monoclinic: $\mu_s = 4$
triclinic: $\mu_s = 2$.
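The two tables translate directly into a lookup. The dictionary and function names are hypothetical; the numerical values are those listed above.

```python
# Lattice-type contribution mu_t (S lattices are the A, B and C centrings).
LATTICE_TYPE_FACTOR = {"P": 1, "F": 4, "I": 2, "A": 2, "B": 2, "C": 2}

# Crystal-family contribution mu_s.
CRYSTAL_FAMILY_FACTOR = {"cubic": 32, "tetragonal": 16, "hexagonal": 24,
                         "orthorhombic": 8, "monoclinic": 4, "triclinic": 2}


def multiplicity_factor(lattice_type, family):
    """Equation (22): mu = mu_t * mu_s, to a first approximation."""
    return LATTICE_TYPE_FACTOR[lattice_type] * CRYSTAL_FAMILY_FACTOR[family]


mu_fcc = multiplicity_factor("F", "cubic")
mu_triclinic = multiplicity_factor("P", "triclinic")
```

A larger $\mu$ lowers the a priori line density of equation (9), so a high-symmetry or centred lattice earns more merit than a primitive triclinic cell of the same volume for the same indexed lines.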

There might be objections that the number of calculated lines $n(q)$ could be directly counted during the indexing process, and that this would have eliminated the need for the above discussion. In fact, this is irrelevant since the quantity needed in merit computation is the a priori density of lines, for a given volume $V$, and not the actual a posteriori line count.

Fig. 1. Comparison between de Wolff's merit (●) and information merit (○) as a function of the number of indexed lines considered, for an arbitrary orthorhombic lattice. Information merit is much smoother than de Wolff's, mainly because the $\chi_r$ involved in its estimation is always computed using the overall deviance. [Plot: merit $M$ on the vertical axis against the number of lines $N$ on the horizontal axis.]

IX. Expressing M as a function of $1/d^2$

The above formulae can also be expressed as functions of $Q = 1/d^2$, which is of common use in some indexing programs - at least in ours (Taupin, 1973) - owing to its linear dependence upon the unknown parameters of the cell. The $\delta_i$, $\varepsilon_i$ and $\Delta q$ can be substituted by differentiation, namely

$$\Delta q_{i\pm} = \frac{\Delta Q_{i\pm}}{2\sqrt{Q_i}}; \qquad \varepsilon_i = \frac{E_i}{2\sqrt{Q_i}}; \qquad \delta_i = \frac{D_i}{2\sqrt{Q_i}} \tag{23}$$

so that

$$M = -\frac{1}{\log 2}\sum_{i=1}^{N}\log\left[1 - \exp\left(-2E_i\,2\pi\sqrt{Q_i}\,\chi_r V/\mu\right)\right] \tag{24}$$

where $E_i$ is the stated maximum deviation of $Q_i$. Note that $\chi_r$ is dimensionless and takes the same value when computed with the deviations of $q_i$ and when computed with the deviations of $Q_i$, provided that these deviations are sufficiently small to allow first-order differential estimations. When account is taken of experimental line vicinity, (20) becomes

$$M = -\frac{1}{\log 2}\sum_{i=1}^{N}\log\left\{1 - \exp\left[-(\Delta Q_{i-} + \Delta Q_{i+})\,2\pi\sqrt{Q_i}\,V/\mu\right]\right\} \tag{25}$$

where

$$\Delta Q_{i-} = \min[\chi_r E_i,\ 0.5(Q_i - Q_{i-1})], \qquad \Delta Q_{i+} = \min[\chi_r E_i,\ 0.5(Q_{i+1} - Q_i)]. \tag{26}$$
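As a consistency check, the merit written with $q = 1/d$ and the merit written with $Q = 1/d^2$ can be compared numerically. The sketch below covers the forms without neighbour clipping; the function names and the sample values are hypothetical, and the deviations on $Q$ are obtained from those on $q$ by the first-order relation $E_i = 2 q_i \varepsilon_i$.

```python
import math


def merit_from_q(q_obs, eps, chi_r, volume, mu):
    """Merit in bits from q = 1/d, with exponent 8*pi*q^2*eps*chi_r*V/mu."""
    return sum(
        -math.log2(-math.expm1(-8.0 * math.pi * q * q * e * chi_r * volume / mu))
        for q, e in zip(q_obs, eps))


def merit_from_Q(Q_obs, E, chi_r, volume, mu):
    """Merit in bits from Q = 1/d^2, with exponent 2*E*2*pi*sqrt(Q)*chi_r*V/mu."""
    return sum(
        -math.log2(-math.expm1(
            -2.0 * Ei * 2.0 * math.pi * math.sqrt(Q) * chi_r * volume / mu))
        for Q, Ei in zip(Q_obs, E))


q_obs = [0.20, 0.35, 0.50]                             # illustrative lines
eps = [0.002, 0.002, 0.003]
Q_obs = [q * q for q in q_obs]                         # Q = q^2
E = [2.0 * q * e for q, e in zip(q_obs, eps)]          # first-order dQ = 2 q dq
m_q = merit_from_q(q_obs, eps, 0.4, 900.0, 8)
m_Q = merit_from_Q(Q_obs, E, 0.4, 900.0, 8)
```

The two results agree to rounding error, since the substitution leaves each exponent unchanged.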

X. Application

This criterion is now used in a new version of our indexing program of 1973, which is currently undergoing final checks. Since it is nearly exhaustive, this program usually gives a few, and often many, possible cells which fit more or less the given experimental line positions. A set of 11 artificial powder patterns (i.e. synthetic data corresponding to a known set of cell parameters and spoiled by an artificial noise) was submitted to this new version of the indexing program. All the retrieved cells which exhibited the best merit were the same as the original ones, within the experimental errors.

References

KOHLBECK, F. & HÖRL, E. M. (1976). J. Appl. Cryst. 9, 28-33.
LOUËR, D. & LOUËR, M. (1972). J. Appl. Cryst. 5, 271-275.
SHIRLEY, R. (1978). Indexing Powder Diagrams. In Proc. of 1978 Summer School. Delft Univ. Press and Oosthoeks.
TAUPIN, D. (1968). J. Appl. Cryst. 1, 178-181.
TAUPIN, D. (1973). J. Appl. Cryst. 6, 380-385.
TAUPIN, D. (1988). Probabilities, Data Reduction and Error Analysis in the Physical Sciences. Les Ulis, France: Les Editions de Physique.
VISSER, J. W. (1969). J. Appl. Cryst. 2, 89-95.
WERNER, P. E. (1964). Z. Kristallogr. 120, 375-387.
WERNER, P. E., ERIKSSON, L. & WESTDAHL, M. (1985). J. Appl. Cryst. 18, 367-370.
WOLFF, P. M. DE (1961). Acta Cryst. 14, 579-582.
WOLFF, P. M. DE (1968). J. Appl. Cryst. 1, 108-113.