Mathematical Geology, Vol. 36, No. 8, November 2004 (© 2004)
Estimating Variogram Uncertainty¹
B. P. Marchant² and R. M. Lark²
The variogram is central to any geostatistical survey, but the precision of a variogram estimated from sample data by the method of moments is unknown. It is important to be able to quantify variogram uncertainty to ensure that the variogram estimate is sufficiently accurate for kriging. In previous studies theoretical expressions have been derived to approximate uncertainty in both estimates of the experimental variogram and fitted variogram models. These expressions rely upon various statistical assumptions about the data and are largely untested. They express variogram uncertainty as functions of the sampling positions and the underlying variogram. Thus the expressions can be used to design efficient sampling schemes for estimating a particular variogram. Extensive simulation tests show that for a Gaussian variable with a known variogram, the expression for the uncertainty of the experimental variogram estimate is accurate. In practice however, the variogram of the variable is unknown and the fitted variogram model must be used instead. For sampling schemes of 100 points or more this has only a small effect on the accuracy of the uncertainty estimate. The theoretical expressions for the uncertainty of fitted variogram models generally overestimate the precision of fitted parameters. The uncertainty of the fitted parameters can be determined more accurately by simulating multiple experimental variograms and fitting variogram models to these. The tests emphasize the importance of distinguishing between the variogram of the field being surveyed and the variogram of the random process which generated the field. These variograms are not necessarily identical. Most studies of variogram uncertainty describe the uncertainty associated with the variogram of the random process. Generally however, it is the variogram of the field being surveyed which is of interest. For intensive sampling schemes, estimates of the field variogram are significantly more precise than estimates of the random process variogram. It is important, when designing efficient sampling schemes or fitting variogram models, that the appropriate expression for variogram uncertainty is applied.
KEY WORDS: ergodic, nonergodic, error, simulation tests.
INTRODUCTION
The variogram characterizes the structure of spatial correlation of a variable and is
central to any geostatistical survey. It expresses the variance of the difference be-
tween two observations of the variable as a function of the lag vector that separates
them. A variogram estimate, expressed as a mathematical function, is required to
¹Received 12 November 2003; accepted 18 May 2004.
²Silsoe Research Institute, Wrest Park, Silsoe, Bedford, MK45 4HS, United Kingdom; e-mail:
0882-8121/04/1100-0867/1 © 2004 International Association for Mathematical Geology
krige or simulate a spatially correlated variable (Webster and Oliver, 2001). How-
ever, both of these techniques assume that the variogram of the variable is known,
whereas in reality the variogram must be estimated from the available data. Therefore there is some unavoidable uncertainty associated with the variogram estimate.
In this paper, we discuss methods of quantifying this uncertainty for variograms
estimated by the method of moments.
Variogram uncertainty has been considered previously in a number of dif-
ferent contexts. Webster and Oliver (1992) measured the uncertainty of vari-
ograms estimated from different sampling schemes to determine whether the sam-
pling schemes were adequate for variogram estimation. Muller and Zimmerman
(1999) and Bogaert and Russo (1999) have suggested techniques for designing sample schemes where the sample points are positioned to minimize the value of a theoretical expression of variogram uncertainty. The same theoretical expressions are used to fit variogram models in a way that accounts for
the difference in accuracy of the experimental semivariance at each lag distance
(Cressie, 1985). Some measure of variogram uncertainty is also important when
considering the reliability of simulated or kriged estimates derived from the esti-
mated variogram (Brooker, 1986; Todini, 2001; Todini, Pellegrini, and Mazzetti,
2001).
We draw attention to three possible problems with previous approaches for estimating variogram uncertainty. The reliability of the theoretical expressions of variogram uncertainty used by Muller and Zimmerman (1999) and Bogaert and Russo (1999) has not been tested comprehensively. Yet the expressions are only approximate and rely upon certain statistical assumptions. Furthermore, generally it is the error in estimating the variogram of the field being surveyed which is of interest. However, the theoretical expressions used by Muller and Zimmerman (1999) and Bogaert and Russo (1999) quantify the expected error in the experimental variogram as an approximation to the variogram of the random process which generated the field. Finally, the theoretical expressions to determine the uncertainty depend on the variogram of the random process. When applying these expressions, the variogram of the random process must be approximated by a model fitted to the experimental variogram. Thus this approach to the estimation of variogram uncertainty is circular.
Here, through experiments on simulated data sets, we assess the impact of
each of these concerns. We follow Brus and de Gruijter (1994) in referring to
the variogram averaged over all realizations of the underlying random process as
the ergodic variogram, and the exhaustive variogram of the single realization or
field being sampled as the nonergodic variogram. Journel and Huijbregts (1978)
refer to these as the theoretical and local variograms respectively. First, we test the accuracy of the theoretical expressions for the uncertainty of the method of moments variogram as an estimate to a known ergodic variogram. Second, we consider the uncertainty associated with an experimental variogram estimate
to a nonergodic variogram when (for the purpose of assessing uncertainty) the ergodic variogram is known. In addition we compare the magnitude of the errors when using the experimental variogram as an estimate of the ergodic and nonergodic variogram. Third, we test the accuracy of the theoretical expressions for the uncertainty of the method of moments variogram estimates to an unknown ergodic variogram. In this case the uncertainty expressions are calculated using a model fitted to the experimental variogram, rather than the correct ergodic variogram.
We denote the experimental variogram estimate by $\hat{\gamma}(h)$, the ergodic variogram by $\gamma(h)$ and the nonergodic variogram by $\gamma_{NE}(h)$. We assume that the variograms are isotropic and therefore functions of the scalar separation distance $h$. We now present the three problems being addressed in more detail and describe previous studies of them.
Uncertainty of Estimates to a Known Ergodic Variogram
Previous studies of variogram uncertainty have mostly concentrated on es-
timates of the ergodic variogram. Cressie (1985), Ortiz and Deutsch (2002), and
Pardo-Iguzquiza and Dowd (2001a) suggested similar expressions for the covari-
ance matrix of experimental variogram estimates to the ergodic variogram. These
expressions are functions of both the sampling scheme and the ergodic variogram.
The elements of the main diagonal of the covariance matrix represent the variance
of the experimental variogram estimates at each separating distance. For convenience, we refer to the standard error at each separating distance as the ergodic error. The ergodic error is the result of two different types of fluctuation. We are most concerned with the sampling error, that is the expected difference between the variogram estimate $\hat{\gamma}(h)$ and the nonergodic variogram of the realization being sampled, $\gamma_{NE}(h)$. However, the ergodic error also includes the effect of fluctuations between the ergodic variogram $\gamma(h)$ and the nonergodic variogram $\gamma_{NE}(h)$. Pardo-Iguzquiza and Dowd (2001a) also consider how the uncertainty of the experimental variogram may be incorporated into the uncertainty of fitted variogram parameters. This leads to an expression for the covariance matrix of variogram parameters fitted by generalized least squares (GLS).
Few previous tests of the reliability of expressions of variogram uncertainty
have been carried out. Pardo-Iguzquiza and Dowd (2001a) applied their expression
to a particular case study and confirmed qualitatively that variogram uncertainty
varied with lag distance in the manner they expected. Ortiz and Deutsch (2002)
applied two different methods of simulation to test their expressions of variogram
uncertainty. One method simulated multiple values of a random variable at sets
of two pairs of locations. The observed covariances between the variogram es-
timates from each pair were in good agreement with their expressions. In the
second test they simulated multiple realizations of the random variable at a set
of sampling points and calculated the experimental variogram for each realization. This was referred to as the global simulation method. The observed variances of semivariances from the global method were generally less than those predicted by their expressions. Difficulties in simulating realizations that honoured the variogram function, particularly for long lag distances, were blamed
for these discrepancies. McBratney and Webster (1986) used a similar method
of simulation to establish confidence intervals on experimental variogram
estimates.
Expected Error in Estimates to the Nonergodic Variogram for a Known Ergodic Variogram
Munoz-Pardo (1987) derived expressions for the uncertainty of estimates
to the nonergodic variogram. He separated the two components of fluctuation
within the ergodic error to approximate the expected error in approximating the
nonergodic variogram $\gamma_{NE}(h)$ by the experimental variogram $\hat{\gamma}(h)$. We refer to this quantity, which may be thought of as the sampling error, as the nonergodic error. It is this quantity that is of interest when optimizing sample schemes for variogram estimation. Muller and Zimmerman (1999) and Bogaert and Russo (1999) designed optimal sample schemes for variogram estimation by minimizing the ergodic error. Therefore we investigated both the reliability of Munoz-Pardo's (1987) expressions and the difference between the ergodic error and the nonergodic error.
Prior to our investigations, Munoz-Pardo's (1987) expressions had not been validated comprehensively. Munoz-Pardo (1987) used his expressions to calculate the expected sampling error of variogram estimates for different sampling schemes, ergodic variograms, and field sizes. He found that the ratio of variogram range, that is the distance over which the variable is spatially correlated, to field size was a critical factor in determining the nonergodic error. Other authors have attempted to establish confidence bands on nonergodic variograms using simulated data. Webster and Oliver (1992) carried out extensive simulation tests in order to estimate the nonergodic error when applying different sampling schemes. Motivated by the findings of Munoz-Pardo (1987), they examined data sets with different ratios of variogram range to field size. They also varied the basic structure of the ergodic variogram used to simulate the data. They found that between 150 and 225 sampling points are required to estimate the variogram accurately. In each of their simulation tests they sampled the same region several times by translating the sampling grid across the region. Although they ensured that the same point was not sampled by two different versions of the grid, they might have underestimated the expected error of variogram estimates because of correlation
between the samples. We discuss this correlation and the effect it has on the error estimates later in this paper. Webster and Oliver (1992) briefly compared their observed confidence intervals with those of Munoz-Pardo (1987), and saw some similarities.
Uncertainty of Estimates to an Unknown Ergodic Variogram
All of the expressions of variogram uncertainty described above are functions
of the ergodic variogram. However, in any real survey the ergodic variogram would
not be known; it would be approximated by the estimated variogram. The effect
of this approximation has neither been accounted for in the theoretical studies nor estimated from simulated data.
THEORY
Estimating the Variogram
In geostatistics we regard the value of a variable at a location $x$ as a realization of a random function $Z(x)$. This random function is assumed to be intrinsically stationary. This is a weak form of second-order stationarity and is met if two conditions hold. The first is that the expected value of the random function, $E[Z(x)]$, is constant for all $x$. Secondly, the variance of the differences between the value of the variable at two different locations depends only on the lag vector separating the two locations and not on the absolute locations. In general, this variance may be a function of both the direction and length of the lag vector. In this study isotropic variograms only are considered. These are purely a function of the length of the vector which we denote $h$. Thus the relationship between values from different locations is described by the variogram
$$\gamma(h) = \frac{1}{2} E\left[ (Z(x) - Z(x + h))^2 \right]. \tag{1}$$
The variogram is estimated from variable values observed at sampled points, $x_s$, $s = 1, \ldots, n$. The method of moments estimator is half the average of the squared differences between observations separated by distance $h$. Pairs of observations are divided amongst different bins based upon their separating distance. If the observations are on a regular sampling grid, then bins consisting of pairs with exactly the same separating distance may be chosen. Otherwise a small tolerance must be placed on the separating distances associated with each bin. The experimental
variogram $\hat{\gamma}(h_j)$, $j = 1, \ldots, k$, is then estimated by
$$\hat{\gamma}(h_j) = \frac{1}{2 n(h_j)} \sum_{i=1}^{n(h_j)} \left[ z_1^i(h_j) - z_2^i(h_j) \right]^2, \tag{2}$$
where $n(h_j)$ is the number of pairs in the bin centred on separating distance $h_j$, and $z_1^i(h_j)$, $z_2^i(h_j)$ are the $i$th pair of observed values in this bin.
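As an illustration of the method of moments estimator in Equation (2), the following is a minimal Python sketch. It assumes observations on a one-dimensional transect; the function name `experimental_variogram`, the tolerance argument `tol`, and the toy data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def experimental_variogram(coords, values, lags, tol=0.0):
    """Method of moments semivariances for points on a 1-D transect.

    Returns {lag: (semivariance, number_of_pairs)} for every lag bin
    that contains at least one pair of observations.
    """
    sums = defaultdict(float)   # running sum of squared differences per bin
    counts = defaultdict(int)   # number of pairs n(h_j) per bin
    n = len(coords)
    for a in range(n):
        for b in range(a + 1, n):
            dist = abs(coords[a] - coords[b])
            for h in lags:
                if abs(dist - h) <= tol:   # pair falls in the bin centred on h
                    sums[h] += (values[a] - values[b]) ** 2
                    counts[h] += 1
                    break
    return {h: (sums[h] / (2 * counts[h]), counts[h])
            for h in lags if counts[h] > 0}


# Toy data (hypothetical): five points at unit spacing.
coords = [0, 1, 2, 3, 4]
values = [1.0, 3.0, 2.0, 5.0, 4.0]
gamma_hat = experimental_variogram(coords, values, lags=[1, 2])
print(gamma_hat)
```

With a zero tolerance, as on a regular grid, each bin contains only pairs with exactly the stated separating distance; a positive `tol` corresponds to the tolerance needed for irregular sampling.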
Kriging and simulation require that the variogram is expressed as a mathemat-
ical function or model. This function must obey several mathematical constraints
to describe random variation and to avoid negative variances. This is achieved
typically by fitting a suitable function to the experimental variogram. The mathematical constraints, and the most commonly used authorized functions which
obey them, are described by Webster and Oliver (2001). In practice, the model
type may be chosen by visual inspection of the experimental variogram or, after
fitting the model, by more formal criteria such as the Akaike Information Criterion
(McBratney and Webster, 1986). Webster and Oliver (2001) recommend that the
model type should be chosen by a procedure which combines visual and statistical
assessment.
Each function has a few parameters that are selected to fit the function to the experimental variogram. Different methods are used to estimate these parameters. Some practitioners do so by eye, but most prefer more objective methods. Cressie (1985) describes three mathematical techniques for fitting the parameter values. The simplest is the least squares method. If $\theta$ is the vector of $p$ variogram parameters, $\gamma(h; \theta)$ the corresponding parameterised variogram function, and $k$ the number of experimental variogram bins, then the method of least squares chooses $\theta$ that minimizes
$$\sum_{i=1}^{k} \left( \hat{\gamma}(h_i) - \gamma(h_i; \theta) \right)^2. \tag{3}$$
However, the reliability of each experimental semivariance $\hat{\gamma}(h_i)$ varies according to the number of point pairs used to describe it and the actual value of $\gamma(h_i)$. Therefore it is better to use weighted least squares and minimize
$$\sum_{i=1}^{k} w_i \left( \hat{\gamma}(h_i) - \gamma(h_i; \theta) \right)^2, \tag{4}$$
where $w_i$ is a weighting function. The weighting function may be set proportional to $n(h_i)$ or, in order to account for the inverse relation between the reliability of
an estimate of variance and the variance itself,
$$w_i = \frac{n(h_i)}{\gamma(h_i)^2}, \tag{5}$$
may be specified.
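The weighted least squares criterion of Equations (4) and (5) can be sketched in Python as follows. This is only an illustration under stated assumptions: the unknown $\gamma(h_i)$ in the weight is replaced by the current model value $\gamma(h_i; \theta)$, the model is the exponential form that appears later as Equation (25), a coarse grid search stands in for a proper numerical optimizer, and the lags, pair counts and "observed" semivariances are synthetic. All function names are hypothetical.

```python
import math


def exponential_model(h, c0, c1, r):
    """Exponential variogram model (form of Equation (25)); gamma(0) = 0."""
    if h == 0:
        return 0.0
    return c0 + c1 * (1.0 - math.exp(-h / r))


def wls_objective(lags, gamma_hat, n_pairs, params):
    """Weighted sum of squares of Equation (4) with weights as in Equation (5).

    Assumption: gamma(h_i) in the weight is approximated by the current
    model value gamma(h_i; theta).
    """
    total = 0.0
    for h, g, n in zip(lags, gamma_hat, n_pairs):
        model = exponential_model(h, *params)
        total += (n / model ** 2) * (g - model) ** 2
    return total


def grid_search(lags, gamma_hat, n_pairs, c1_grid, r_grid, c0=0.0):
    """Return the (c0, c1, r) candidate on the grid with the smallest objective."""
    candidates = [(c0, c1, r) for c1 in c1_grid for r in r_grid]
    return min(candidates,
               key=lambda p: wls_objective(lags, gamma_hat, n_pairs, p))


# Synthetic, noise-free example (hypothetical numbers).
lags = [5, 10, 15, 20, 25, 30]
n_pairs = [380, 360, 340, 320, 300, 280]
gamma_hat = [exponential_model(h, 0.0, 1.0, 16.0) for h in lags]
best = grid_search(lags, gamma_hat, n_pairs,
                   c1_grid=[0.5, 0.75, 1.0, 1.25, 1.5],
                   r_grid=[8.0, 12.0, 16.0, 20.0, 24.0])
print(best)
```

Because the synthetic semivariances are generated without noise from the model itself, the grid point that matches the generating parameters gives an objective of exactly zero; with real data the minimum would be positive and a finer search or a gradient-based optimizer would normally be used.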
The most rigorous of the three techniques described by Cressie (1985) is generalized least squares (GLS). The GLS technique accounts for the accuracy of each bin estimate of the experimental variogram, and the correlation between each estimate. The chosen parameter values minimize
$$\left( \hat{\gamma}(h) - \gamma(h; \theta) \right)^T \Sigma^{-1}(h; \theta) \left( \hat{\gamma}(h) - \gamma(h; \theta) \right). \tag{6}$$
Here, $h$ is the length $k$ vector of lag bin centres and $\Sigma(h; \theta)$ is the $k \times k$ covariance matrix of $\hat{\gamma}(h)$. This matrix will be discussed in more detail later in this section. The direct method of minimizing Equation (6) has been shown to be inconsistent (Muller and Zimmerman, 1999). To account for this the following iterative scheme is used
$$\theta_{m+1} = \arg\min_{\theta} \left( \hat{\gamma}(h) - \gamma(h; \theta) \right)^T \Sigma^{-1}(h; \theta_m) \left( \hat{\gamma}(h) - \gamma(h; \theta) \right), \qquad \hat{\theta} = \lim_{m \to \infty} \theta_m. \tag{7}$$
Here, $\theta_m$ is the estimate of $\theta$ after $m - 1$ iterations of Equation (7). This iterative scheme requires $\theta_1$, an initial estimate of the parameter values. This initial estimate may be chosen by weighted least squares [Eqs. (4)–(5)]. The procedure then converges to the GLS parameter estimate in an asymptotically efficient and consistent manner.
Assessing Variogram Uncertainty
Several authors (Cressie, 1985; Ortiz and Deutsch, 2002; Pardo-Iguzquiza and Dowd, 2001a) have derived similar expressions for the uncertainty of the experimental variogram. In each case they express this uncertainty in terms of $\Sigma$, the covariance matrix of the experimental variogram. The $pq$th element of this matrix is
$$[\Sigma]_{pq} = \mathrm{Cov}\left[ \hat{\gamma}(h_p), \hat{\gamma}(h_q) \right], \tag{8}$$
and the diagonal elements are the variances of semivariances. The expected value of $\hat{\gamma}(h)$ for each lag distance $h$ is equal to $\gamma(h)$. Therefore we refer to the standard
deviations of the semivariance at each lag bin, that is the square root of each element on the main diagonal, as the ergodic error and to $\Sigma$ as the ergodic covariance matrix.
From the definition of covariance
$$[\Sigma]_{pq} = E\left[ \hat{\gamma}(h_p) \hat{\gamma}(h_q) \right] - \gamma(h_p)\gamma(h_q) \tag{9}$$
$$= \frac{1}{4 n(h_p) n(h_q)} \sum_{i=1}^{n(h_p)} \sum_{j=1}^{n(h_q)} E\left[ \left\{ z_1^i(h_p) - z_2^i(h_p) \right\}^2 \left\{ z_1^j(h_q) - z_2^j(h_q) \right\}^2 \right] - \gamma(h_p)\gamma(h_q). \tag{10}$$
Munoz-Pardo (1987) showed that if $Z(x)$ is multivariate Gaussian with an isotropic ergodic variogram $\gamma(h)$, then
$$E\left[ \left\{ z_1^i(h_p) - z_2^i(h_p) \right\}^2 \left\{ z_1^j(h_q) - z_2^j(h_q) \right\}^2 \right] = 2 C_{ij}(h_p, h_q) + 4 \gamma(h_p)\gamma(h_q). \tag{11}$$
The function $C_{ij}(h_p, h_q)$ describes the covariance between $[z_1^i(h_p) - z_2^i(h_p)]$ and $[z_1^j(h_q) - z_2^j(h_q)]$ and may be written
$$C_{ij}(h_p, h_q) = \left\{ \gamma\left(|x_1^i - x_1^j|\right) + \gamma\left(|x_2^i - x_2^j|\right) - \gamma\left(|x_1^i - x_2^j|\right) - \gamma\left(|x_2^i - x_1^j|\right) \right\}^2, \tag{12}$$
where $x_1^i$, $x_2^i$, $x_1^j$, and $x_2^j$ are the sample points at which the values $z_1^i(h_p)$, $z_2^i(h_p)$, $z_1^j(h_q)$, and $z_2^j(h_q)$ are measured, and $|\cdot|$ denotes the distance between the sample points. Therefore the $pq$th element of the ergodic covariance matrix is written
$$[\Sigma]_{pq} = \frac{1}{2 n(h_p) n(h_q)} \sum_{i=1}^{n(h_p)} \sum_{j=1}^{n(h_q)} C_{ij}(h_p, h_q). \tag{13}$$
Pardo-Iguzquiza and Dowd (2001b) provide Fortran code to calculate this expression. To calculate Equation (12), the program requires the ergodic variogram function as an input. This is best approximated from the fitted variogram model.
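The calculation in Equations (12) and (13) can be sketched in a few lines of Python. This is not the Fortran program of Pardo-Iguzquiza and Dowd (2001b); it is a minimal illustration that assumes sample points on a one-dimensional transect, exact lag bins, and an exponential ergodic variogram with the hypothetical defaults shown. All names are illustrative.

```python
import math


def gamma_exp(h, c0=0.0, c1=1.0, r=16.0):
    """Assumed ergodic variogram: exponential model, gamma(0) = 0."""
    h = abs(h)
    if h == 0:
        return 0.0
    return c0 + c1 * (1.0 - math.exp(-h / r))


def pairs_at_lag(coords, h, tol=1e-9):
    """All pairs of sample points whose separation equals h (within tol)."""
    return [(a, b)
            for i, a in enumerate(coords)
            for b in coords[i + 1:]
            if abs(abs(a - b) - h) <= tol]


def ergodic_covariance(coords, lags, gamma):
    """Covariance matrix of the experimental variogram, Equation (13)."""
    bins = [pairs_at_lag(coords, h) for h in lags]
    k = len(lags)
    sigma = [[0.0] * k for _ in range(k)]
    for p in range(k):
        for q in range(k):
            total = 0.0
            for x1i, x2i in bins[p]:
                for x1j, x2j in bins[q]:
                    # Bracketed term of Equation (12), then squared.
                    c = (gamma(x1i - x1j) + gamma(x2i - x2j)
                         - gamma(x1i - x2j) - gamma(x2i - x1j))
                    total += c * c
            sigma[p][q] = total / (2.0 * len(bins[p]) * len(bins[q]))
    return sigma


# Hypothetical transect of 10 points at interval 5.
coords = [5.0 * s for s in range(10)]
lags = [5.0, 10.0, 15.0]
sigma = ergodic_covariance(coords, lags, gamma_exp)
ergodic_error = [math.sqrt(sigma[p][p]) for p in range(len(lags))]
print(ergodic_error)
```

For a single pair of points the sketch reduces to a variance of $2\gamma(h)^2$, which is the variance of half the squared difference of two Gaussian values and gives a quick check on the signs in Equation (12).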
If the distribution of semivariances is multivariate Gaussian, it is completely defined by the ergodic variogram $\gamma(h)$ and the covariance matrix $\Sigma$. Furthermore, standard statistical theory (Garthwaite, Jolliffe, and Jones, 1995) states that the quantity
$$\left( \hat{\gamma}(h) - \gamma(h) \right)^T \Sigma^{-1}(h) \left( \hat{\gamma}(h) - \gamma(h) \right), \tag{14}$$
has a chi squared distribution with $k$ degrees of freedom. Confidence sets for $\hat{\gamma}(h)$, with confidence $(1 - \alpha)$, where $\alpha$ is the significance level, are given by
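Similarly,
$$E\left[ \gamma_{NE}(h_p) \gamma_{NE}(h_q) \right] \approx \frac{1}{2 N(h_p) N(h_q)} \sum_{i=1}^{N(h_p)} \sum_{j=1}^{N(h_q)} C_{ij} + \gamma(h_p)\gamma(h_q), \tag{21}$$
and
$$E\left[ \hat{\gamma}(h_p) \hat{\gamma}(h_q) \right] \approx \frac{1}{2 n(h_p) n(h_q)} \sum_{i=1}^{n(h_p)} \sum_{j=1}^{n(h_q)} C_{ij} + \gamma(h_p)\gamma(h_q). \tag{22}$$
Therefore, substituting Equations (20), (21), and (22) into Equation (17) gives
$$[\Sigma_{NE}]_{pq} \approx \frac{1}{2 n(h_p) n(h_q)} \sum_{r=1}^{n(h_p)} \sum_{s=1}^{n(h_q)} C_{rs}(h_p, h_q) + \frac{1}{2 N(h_p) N(h_q)} \sum_{r=1}^{N(h_p)} \sum_{s=1}^{N(h_q)} C_{rs}(h_p, h_q) - \frac{1}{2 n(h_p) N(h_q)} \sum_{r=1}^{n(h_p)} \sum_{s=1}^{N(h_q)} C_{rs}(h_p, h_q) - \frac{1}{2 N(h_p) n(h_q)} \sum_{r=1}^{N(h_p)} \sum_{s=1}^{n(h_q)} C_{rs}(h_p, h_q). \tag{23}$$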
This expression may be calculated numerically in a similar manner to Equation (13). It is more computationally expensive however since the covariances between $N(N - 1)/2$ pairs of points must be considered.
The most common method of estimating the uncertainty of fitted variogram parameter estimates is by calculating the inverse of the information matrix (Pardo-Iguzquiza and Dowd, 2001a). The $p \times p$ information matrix, $M$, that corresponds to parameter vector $\theta$ (of length $p$) fitted by GLS is
$$M = J^T \Sigma^{-1} J. \tag{24}$$
Here, $J$ is the $k \times p$ Jacobian matrix in which the $ij$th element is $[J]_{ij} = \partial \gamma(h_i; \theta) / \partial \theta_j$, evaluated at the GLS estimate of $\theta$. A result from nonlinear inversion theory (Menke, 1984) says that $M^{-1}$ is a leading order Taylor series approximation to the covariance matrix of the parameter estimates. Since this is a leading order approximation it is only accurate for estimates of $\theta$ that are themselves accurate. The
approximation assumes that the parameter estimates have a multivariate Gaussian distribution. In this case, and under the assumption that the variogram estimation technique is unbiased, the distribution of the vector of parameter estimates, $\hat{\theta}$, is completely defined. The mean value is given by the parameter vector of the simulated variable, and the covariance matrix by $M^{-1}$.
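As a worked illustration of the Jacobian in Equation (24), consider the exponential model that is used in the simulation experiments [Eq. (25)], with parameter vector $\theta = (c_0, c_1, r)$. The derivatives below follow directly from that model for $h_i > 0$; the ordering of the parameters within $\theta$ is an assumption made here for illustration only.

```latex
\begin{align*}
\gamma(h_i; \theta) &= c_0 + c_1 \left( 1 - \exp(-h_i / r) \right), \\
[J]_{i1} = \frac{\partial \gamma(h_i; \theta)}{\partial c_0} &= 1, \\
[J]_{i2} = \frac{\partial \gamma(h_i; \theta)}{\partial c_1} &= 1 - \exp(-h_i / r), \\
[J]_{i3} = \frac{\partial \gamma(h_i; \theta)}{\partial r} &= -\frac{c_1 h_i}{r^2} \exp(-h_i / r).
\end{align*}
```

Each row of $J$ therefore depends on one lag bin centre $h_i$, and the approximate parameter covariance matrix $M^{-1} = (J^T \Sigma^{-1} J)^{-1}$ follows once $\Sigma$ has been evaluated.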
Ortiz and Deutsch (2002) assessed variogram parameter uncertainty by a more arbitrary criterion. They examined the experimental variogram covariance matrix, $\Sigma$, and fitted variograms to what they judged to be "extreme" realizations of the experimental variogram. These fitted variograms were themselves assumed to have "extreme" parameter values.
SIMULATION EXPERIMENTS
Simulated Fields and Sampling Schemes
The characteristics of the simulated fields and sampling schemes matched
those used by Webster and Oliver (1992). Fields were generated with one of two
ergodic variogram models. The first was the exponential variogram model
$$\gamma(h) = c_0 + c_1 \left( 1 - \exp(-h/r) \right) \quad \text{for } h > 0, \tag{25}$$
$$\gamma(0) = 0, \tag{26}$$
where $c_0$ is the nugget variance, $c_1$ the sill of the spatially structured variance, and $r$ the distance parameter of the model. The chosen parameter values were $c_0 = 0$, $c_1 = 1$, and $r = 16$. The other was the spherical variogram model
$$\gamma(h) = c_0 + c_1 \left\{ \frac{3h}{2a} - \frac{1}{2} \left( \frac{h}{a} \right)^3 \right\} \quad \text{for } 0 < h \le a, \tag{27}$$
$$\gamma(h) = c_0 + c_1 \quad \text{for } h > a, \tag{28}$$
$$\gamma(0) = 0, \tag{29}$$
where $c_0$ and $c_1$ have the same meaning as in Equation (25) and $a$ is the distance parameter. The parameter values were $c_0 = 1/3$, $c_1 = 2/3$, and $a = 50$. The distance parameter, $a$, for the spherical model is the range of the spatial dependence, whereas the exponential model has effective range $3r$. Thus both of the models applied had approximately the same effective range.
Each field was generated using unconditioned sequential Gaussian simulation (Deutsch and Journel, 1998) and consisted of either 120 × 120 = 14,400 or 256 × 256 = 65,536 values on a square grid at unit interval. Henceforth we refer to the four sets of simulated fields as Sets 1–4. Sets 1 and 2 are the sets of large fields with exponential and spherical variograms respectively. Sets 3 and 4 are the sets of small fields with exponential and spherical variograms respectively.

Table 1. The Number of Sample Points in Each Scheme and the Corresponding Distance Between These Points

Sample points    25    49    100    144    225    400
Interval         20    15     10      8      7      5
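The two ergodic variogram models of Equations (25)–(29) can be written as short Python functions, which also makes the statement about effective range easy to check. The defaults below are the parameter values quoted in the text; the function names are hypothetical.

```python
import math


def exponential_variogram(h, c0=0.0, c1=1.0, r=16.0):
    """Exponential model, Equations (25)-(26)."""
    if h == 0:
        return 0.0
    return c0 + c1 * (1.0 - math.exp(-h / r))


def spherical_variogram(h, c0=1.0 / 3.0, c1=2.0 / 3.0, a=50.0):
    """Spherical model, Equations (27)-(29)."""
    if h == 0:
        return 0.0
    if h > a:
        return c0 + c1          # sill reached beyond the range a
    return c0 + c1 * (1.5 * h / a - 0.5 * (h / a) ** 3)


# At the effective range 3r = 48 the exponential model has reached
# 1 - exp(-3) of its sill, i.e. roughly 95 percent.
fraction_of_sill = exponential_variogram(3 * 16.0) / 1.0
print(round(fraction_of_sill, 3))

# The spherical model reaches its sill exactly at h = a = 50.
print(spherical_variogram(50.0))
```

The exponential model only approaches its sill asymptotically, which is why an effective range of $3r = 48$ is compared with the spherical range $a = 50$.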
The smaller fields were simulated 1000 times, and the larger ones 100 times. The smaller fields, where the effective range was almost half of the length of the field, were sampled using regular square grids with the sample sizes and sampling intervals listed in Table 1. For the larger fields, the effective range of the simulated
variogram was less than a fifth of the length of the field. If the whole field had
been sampled using a square grid it would have provided little information about
the structured part of the variogram, unless the grid was very dense. Therefore the
field was sampled along transects. The combinations of sample sizes and sample
intervals were the same as those listed in Table 1, for example, a sample size of
25 points was split into 5 transects, with each point separated by distance 5.
Each of the four sets of fields was sampled with six different sampling
schemes. The exact position of the sampling grids or transects was chosen at
random, but the same positions were used for each realization within the same
field set. All of the theoretical expressions for variogram uncertainty described
previously are functions of both the ergodic variogram and the sampling scheme
used. Therefore, in the three simulation tests described below, each combination
of sampling scheme and field type was tested independently. In each case, the
theoretical expressions of variogram uncertainty were calculated. Then each of
the realizations was sampled, and from these data an experimental variogram was
estimated, and a variogram model fitted by GLS. The fitted variogram model was of the same type as that of the simulated variable. The errors in the variogram estimates were then compared with the expected values. Also, a further simulated approximation of the covariance matrix of fitted parameter values was made by simulating Gaussian realizations of the experimental variogram $\hat{\gamma}(h)$ directly, using the estimated experimental variogram covariance matrix $\Sigma$. A model was fitted to each realization by GLS and the covariance matrix of these simulated parameter estimates was calculated.
Experiment 1
The first experiment considered the covariance matrices calculated from
Equations (13) and (24) which describe the uncertainty of method of moments
Table 2. The Constraints Placed on the Fitted Parameters for Each Data Set

Data set    c0 min    c0 max    c1 min    c1 max    a or r min    a or r max
Set 1       0.0       0.7       0.1       1.6       7.0           30.0
Set 2       0.0       0.7       0.1       1.6       21.0          80.0
Set 3       0.0       0.7       0.1       1.6       7.0           30.0
Set 4       0.0       0.7       0.1       1.6       21.0          80.0
estimates to the ergodic variogram. These theoretical uncertainty estimates were
calculated for each combination of test set and sampling scheme. The ergodic
variogram values required by Equations (13) and (24) were taken from the variogram used to simulate the relevant data set.
The covariance matrices of the experimental variograms and fitted parameters
for the sets of simulated data were then derived. The sampling scheme being tested
was applied to each realization of the random variable. Experimental variograms
Figure 1. Comparison between expected ergodic errors and those observed for the Set 1 data set. The
continuous lines show the expected ergodic errors for the marked sample size. The ergodic errors from
400 sample points are denoted by , from 100 sample points by +, and from 25 sample points by .
were calculated for each realization. The tolerance on the lag bins was set at zero. A variogram model of the same type as the simulated variable was fitted to the experimental variogram by a single iteration of the GLS procedure [Eq. (7)]. Limits were placed on the possible fitted values for each data set in order to prevent negative variances and ensure that the range of spatial correlation was not greater than half the length of the field. Variogram estimates for lags greater than half the length of a region are known to be unreliable (Webster and Oliver, 2001). The limits are listed in Table 2. For Sets 1 and 3 the fitting procedure was then repeated with the minimum value of $c_0$ equal to $-0.3$. Such a model would not be fitted in reality since the variance is negative for small lag distances. It is included here so that the effect of the $c_0 \ge 0$ constraint on the uncertainty estimates may be separated from other sources of error.
In Figures 1–4 the expected ergodic error is compared with that observed from the simulated data sets. There is good agreement for all data sets. The expected
Figure 2. Comparison between expected ergodic errors and those observed for the Set 2 data set. The
continuous lines show the expected ergodic errors for the marked sample size. The simulated standard
errors from 400 sample points are denoted by , from 100 sample points by +, and from 25 sample
points by .
Figure 3. Comparison between expected ergodic errors and those observed for the Set 3 data set. The
continuous lines show the expected ergodic errors for the marked sample size. The ergodic errors from
400 sample points are denoted by , from 100 sample points by +, and from 25 sample points by .
variances of fitted parameter estimates ($c_0$, $c_1$, $a$ or $r$) are compared with the simulated variances of these estimates in Tables 3–6. Here, the minimum permissible value of $c_0$ for Set 1 and Set 3 is $-0.3$. For sample schemes of fewer than 100 points, the simulated values are less than the expected values. This is due to the theoretical
Table 3. Comparison of Theoretical Variances of Fitted Variogram Parameters, With Variances Observed From Multiple Simulated Fields and Multiple Simulated Experimental Variograms, for the Set 1 Data Set

        Theoretical                       Simulated field                   Simulated variogram
Size    c0        c1        r             c0        c1        r             c0        c1        r
25      2.34e+01  1.99e+01  6.27e+03      1.39e-01  4.36e-01  8.56e+02      1.34e-01  5.67e-01  1.37e+03
49      1.67e+00  1.31e+00  8.46e+02      1.06e-01  2.72e-01  7.81e+02      1.11e-01  2.99e-01  6.75e+02
100     1.09e-01  1.33e-01  1.44e+02      5.45e-02  1.92e-01  4.65e+02      7.60e-02  1.45e-01  4.83e+02
144     3.47e-02  7.87e-02  8.45e+01      2.83e-02  1.78e-01  3.06e+02      4.44e-02  9.11e-02  2.22e+02
225     1.39e-02  5.59e-02  5.41e+01      1.87e-02  9.95e-02  1.55e+02      2.23e-02  9.42e-02  1.72e+02
400     2.96e-03  4.23e-02  2.69e+01      5.10e-03  4.91e-02  3.52e+01      5.82e-03  3.70e-02  3.66e+02
Figure 6. A histogram showing a distribution of the fitted values of $c_1$, for the Set 4 data set sampled with a 400 point square grid. The continuous line shows the distribution predicted by Equation (13). The expected value of $c_1 = 2/3$.
Experiment 3
In Experiments 1 and 2, the ergodic variogram of the simulated variable is used
to calculate the expected variogram errors. In practice, this would be unknown.
Instead it would have to be approximated by the model fitted to the experimental
variogram. The third experiment investigates the effect that this has on the accuracy
of the confidence limits.
For each variogram estimated in Experiment 1, the value of
$$\left( \hat{\gamma}(h) - \gamma(h) \right)^T \Sigma^{-1} \left( \hat{\gamma}(h) - \gamma(h) \right), \tag{30}$$
was calculated. Here, $\gamma(h)$ is the ergodic variogram of the simulated variable calculated at the vector of lag distances $h$, $\hat{\gamma}(h)$ is the vector of estimated experimental variogram values and $\Sigma$ is the covariance matrix of the experimental variogram estimates, calculated from Equation (13), using $\gamma(h)$. The covariance matrix is then recalculated, using the variogram fitted to $\hat{\gamma}(h)$. Then Equation (30) is
Figure 7. A histogram showing a distribution of the fitted values of $a$, for the Set 4 data set sampled with a 400 point square grid. The continuous line shows the distribution predicted by Equation (13). The expected value of $a = 50$.
recalculated using this new matrix. For each test set and sampling scheme
combination, the distributions of the two sets of values of Equation (30) should
form chi squared distributions of orderk, wherekis the number of experimental
lag bins, and the confidence limits may be calculated from Equation (15).
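The check built on Equation (30) can be sketched numerically. The following is a minimal illustration and not the authors' code: the lag distances, the exponential model values and the covariance matrix are all hypothetical stand-ins (the matrix is an arbitrary symmetric positive-definite one, not the result of Equation (13)), and the helper names `chi2_cdf`, `quadratic_form` and `coverage` are introduced only for this example. The chi-squared distribution function is computed from a standard series so that the sketch needs only NumPy; `scipy.stats.chi2.cdf` would be the usual choice where SciPy is available.

```python
import math
import numpy as np


def chi2_cdf(x, k):
    """Chi-squared CDF with k degrees of freedom, from the series for the
    regularized lower incomplete gamma function P(k/2, x/2)."""
    if x <= 0.0:
        return 0.0
    a, y = 0.5 * k, 0.5 * x
    term = total = 1.0
    n = 0
    while term > 1e-15 * total:
        n += 1
        term *= y / (a + n)
        total += term
    return total * math.exp(a * math.log(y) - y - math.lgamma(a + 1.0))


def quadratic_form(gamma_hat, gamma_ref, cov):
    """Value of the quadratic form in Equation (30) for one variogram."""
    d = np.asarray(gamma_hat, dtype=float) - np.asarray(gamma_ref, dtype=float)
    # Solve cov @ x = d instead of forming the inverse explicitly.
    return float(d @ np.linalg.solve(cov, d))


def coverage(stats, k, levels):
    """Percentage of statistics lying inside each chi-squared limit."""
    p = np.array([chi2_cdf(s, k) for s in stats])
    return {lvl: 100.0 * float(np.mean(p <= lvl / 100.0)) for lvl in levels}


# Hypothetical set-up: four lag bins and an exponential variogram.
rng = np.random.default_rng(0)
lags = np.array([5.0, 10.0, 20.0, 40.0])
gamma_ref = 1.0 - np.exp(-lags / 15.0)
k = lags.size

# Arbitrary symmetric positive-definite matrix standing in for Equation (13).
A = rng.normal(size=(k, k))
cov = 0.01 * (A @ A.T + k * np.eye(k))

# Gaussian draws play the role of the experimental variograms.
estimates = rng.multivariate_normal(gamma_ref, cov, size=2000)
stats = [quadratic_form(g, gamma_ref, cov) for g in estimates]
result = coverage(stats, k, levels=(99, 95, 90, 50, 10))
print(result)
```

Because the draws here are exactly Gaussian with the stated covariance matrix, the observed percentages should sit close to the nominal levels. Replacing `cov` by a matrix derived from a fitted model, as in Experiment 3, would show how far the coverage drifts.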
In Tables 7-10, the observed percentage of ergodic experimental variogram estimates lying within
each theoretical confidence limit is given. The theoretical
confidence limits appear to be reasonable for covariance matrices calculated with
fitted variogram estimates and for covariance matrices calculated with the actual
ergodic variogram. In general, the confidence limits resulting from the actual
ergodic variogram are slightly more accurate. This is particularly noticeable for
sample schemes with fewer than 100 points.
DISCUSSION
Extensive simulation tests have shown that, for an isotropic Gaussian random
variable with known ergodic variogram, the covariance matrix of experimental
Figure 8. Comparison between expected nonergodic errors and those observed for the Set 1 data set.
The continuous lines show the expected nonergodic errors [calculated from Eq. (23)] for the marked
sample size. The nonergodic errors from 400 sample points are denoted by, from 100 sample points
by +, and from 25 sample points by .
Table 7. Percentage of Estimates to the Ergodic Variogram Lying Within the Theoretical Confidence
Limits for the Set 1 Data Set
Sample size Variogram 99 98 95 90 80 70 50 30 10
25 Fitted 92.0 91.0 85.0 77.0 68.0 58.0 45.0 27.0 9.0
25 Ergodic 99.0 98.0 94.0 87.0 79.0 72.0 56.0 36.0 12.0
49 Fitted 94.0 92.0 92.0 87.0 77.0 67.0 49.0 31.0 13.0
49 Ergodic 97.0 96.0 93.0 92.0 84.0 74.0 61.0 39.0 17.0
100 Fitted 97.0 95.0 93.0 86.0 73.0 65.0 52.0 30.0 10.0
100 Ergodic 95.0 95.0 93.0 90.0 83.0 80.0 57.0 31.0 10.0
144 Fitted 98.0 97.0 94.0 86.0 81.0 68.0 52.0 33.0 12.0
144 Ergodic 98.0 98.0 95.0 93.0 88.0 82.0 62.0 44.0 15.0
225 Fitted 97.0 96.0 93.0 87.0 78.0 64.0 56.0 37.0 11.0
225 Ergodic 98.0 96.0 94.0 91.0 75.0 67.0 54.0 32.0 12.0
400 Fitted 100.0 96.0 93.0 89.0 74.0 64.0 44.0 28.0 11.0
400 Ergodic 97.0 94.0 89.0 86.0 78.0 73.0 58.0 36.0 16.0
Note. Theoretical confidence limits calculated with the fitted and ergodic variograms are treated
separately.
Figure 9. Comparison between expected nonergodic errors and those observed for the Set 2 data set.
The continuous lines show the expected nonergodic errors [calculated from Eq. (23)] for the marked
sample size. The nonergodic errors from 400 sample points are denoted by, from 100 sample points
by +, and from 25 sample points by .
Table 8. Percentage of Estimates to the Ergodic Variogram Lying Within the Theoretical Confidence
Limits for the Set 2 Data Set
Sample size Variogram 99 98 95 90 80 70 50 30 10
25 Fitted 93.0 91.0 88.0 85.0 72.0 58.0 44.0 27.0 7.0
25 Ergodic 97.0 97.0 89.0 87.0 78.0 73.0 56.0 38.0 8.0
49 Fitted 97.0 93.0 91.0 83.0 72.0 63.0 51.0 27.0 11.0
49 Ergodic 99.0 98.0 95.0 91.0 83.0 73.0 53.0 36.0 14.0
100 Fitted 95.0 93.0 87.0 79.0 74.0 64.0 50.0 29.0 18.0
100 Ergodic 98.0 96.0 92.0 89.0 83.0 73.0 58.0 37.0 16.0
144 Fitted 97.0 96.0 92.0 88.0 74.0 69.0 52.0 33.0 10.0
144 Ergodic 97.0 97.0 95.0 93.0 87.0 80.0 62.0 46.0 13.0
225 Fitted 95.0 91.0 89.0 82.0 74.0 62.0 45.0 28.0 12.0
225 Ergodic 97.0 96.0 92.0 87.0 76.0 69.0 60.0 35.0 11.0
400 Fitted 97.0 96.0 92.0 84.0 78.0 61.0 47.0 31.0 13.0
400 Ergodic 98.0 96.0 94.0 85.0 79.0 73.0 58.0 39.0 18.0
Note. Theoretical confidence limits calculated with the fitted and ergodic variograms are treated
separately.
Figure 10. Comparison between expected nonergodic errors and those observed for the Set 3 data
set. The continuous lines show the expected nonergodic errors [calculated from Eq. (23)] for the
marked sample size. The nonergodic errors from 400 sample points are denoted by , from 100
sample points by +, and from 25 sample points by .
Table 9. Percentage of Estimates to the Ergodic Variogram Lying Within the Theoretical Confidence
Limits for the Set 3 Data Set
Sample size Variogram 99 98 95 90 80 70 50 30 10
25 Fitted 93.4 91.3 87.0 82.2 72.4 66.0 50.5 32.6 11.5
25 Ergodic 97.0 95.2 93.0 90.6 83.9 78.3 60.4 38.1 11.9
49 Fitted 93.3 91.5 88.8 81.9 74.1 65.0 49.7 30.6 10.3
49 Ergodic 96.5 95.0 91.9 88.4 82.8 75.5 59.4 40.5 14.1
100 Fitted 95.1 92.8 89.4 83.9 75.1 66.3 47.9 30.4 11.1
100 Ergodic 96.8 95.1 92.5 89.4 80.9 73.3 58.4 39.2 14.8
144 Fitted 95.2 92.0 88.7 84.4 76.6 66.4 46.8 29.9 10.8
144 Ergodic 96.5 95.0 92.4 89.4 81.5 74.6 60.3 41.0 15.4
225 Fitted 95.5 93.7 90.1 85.9 75.9 67.2 49.3 31.5 10.3
225 Ergodic 96.0 93.9 90.4 85.4 78.7 71.6 60.8 42.3 17.1
400 Fitted 96.8 96.2 91.4 86.6 80.1 72.0 47.8 28.5 11.8
400 Ergodic 98.0 96.0 94.0 85.0 79.0 73.0 58.0 39.0 18.0
Note. Theoretical confidence limits calculated with the fitted and ergodic variograms are treated
separately.
Figure 11. Comparison between expected nonergodic errors and those observed for the Set 4 data
set. The continuous lines show the expected nonergodic error [calculated from Eq. (23)] for the
marked sample size. The nonergodic errors from 400 sample points are denoted by , from 100
sample points by +, and from 25 sample points by .
Table 10. Percentage of Estimates to the Ergodic Variogram Lying Within the Theoretical Confidence
Limits for the Set 4 Data Set
Sample size Variogram 99 98 95 90 80 70 50 30 10
25 Fitted 92.3 90.2 87.0 80.7 72.4 63.9 47.4 28.5 11.6
25 Ergodic 94.8 93.3 90.3 86.9 81.1 74.3 57.5 36.9 11.1
49 Fitted 92.3 90.3 86.1 80.7 71.2 62.0 46.5 31.1 11.0
49 Ergodic 95.2 93.5 90.5 86.6 81.5 75.2 60.1 40.8 14.0
100 Fitted 90.2 88.3 82.8 78.2 67.6 61.3 43.0 26.5 10.0
100 Ergodic 95.5 94.7 90.4 85.7 78.5 71.9 56.7 38.7 12.7
144 Fitted 90.0 87.6 83.1 76.1 66.6 59.5 43.3 26.6 8.8
144 Ergodic 95.9 94.4 91.5 88.1 82.1 74.2 59.4 38.9 15.1
225 Fitted 89.1 87.7 83.1 77.6 68.1 60.8 43.8 28.7 11.3
225 Ergodic 94.6 93.0 89.6 85.6 79.1 73.6 60.0 39.8 16.8
400 Fitted 91.3 88.9 83.4 78.7 69.4 58.5 42.9 24.9 8.7
400 Ergodic 96.8 94.5 90.7 86.0 76.9 71.9 55.7 40.9 18.2
Note. Theoretical confidence limits calculated with the fitted and ergodic variograms are treated
separately.
Figure 12. Comparison of the expected ergodic (continuous line) and nonergodic (dotted line)
errors for the Set 1 data set.
variogram estimates is approximated accurately by Equation (13). Pardo-Iguzquiza
and Dowd (2001a) approximated the covariance matrix of the parameters of fitted
variogram models by calculating the inverse of the information matrix [Eq. (24)].
This method is seen to overestimate the precision of the parameter estimates.
This might be due to a number of factors. First, this expression for parameter
uncertainty is based on a leading-order Taylor series expansion centered on the
actual ergodic variogram. Thus it is accurate only when the uncertainty is small. In
Tables 3-6, the uncertainty estimates are seen to improve as the sample size, and
therefore the precision of estimates, increases. Second, this method assumes that
the distribution of fitted variogram parameters is Gaussian. The distributions of
parameters fitted to the simulated data sets were seen to deviate from Gaussian. Third,
the method assumes that the parameters may take any value. In the simulation
tests it was necessary to place constraints on the parameter values for practical
reasons, as would be the case in a real survey. Finally, for some realizations an
inappropriate choice of variogram model might have caused larger deviations from
the expected parameter values.
Figure 13. Comparison of the expected ergodic (continuous line) and nonergodic (dotted line)
errors for the Set 2 data set.
As an alternative to calculating the inverse of the information matrix, the
uncertainty of fitted parameter values may be assessed by simulating multiple
experimental variograms using Equation (13) and then fitting variogram models to
these. This process is computationally more expensive, but for a small number of
lag bins it is practical. The results in Tables 3-6 show that this simulation method is
more accurate than using the information matrix. Also, there is no need to assume
a particular distribution of parameter estimates, and constraints on the parameter
values can be accounted for.
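A rough sketch of this simulation approach is given below, under several loudly stated assumptions. The nugget-plus-exponential model, the parameter values, the lag bins and the covariance matrix are all hypothetical; in particular `cov` is an invented matrix standing in for the output of Equation (13). The fit is an ordinary least-squares search over a grid of trial ranges with a crude non-negativity clip, which is a simplification and not the GLS fitting procedure used in the paper. The names `exp_variogram`, `fit_on_grid` and `simulate_parameter_spread` are introduced only for this illustration.

```python
import numpy as np


def exp_variogram(h, c0, c1, a):
    """Nugget-plus-exponential variogram model (an illustrative choice)."""
    return c0 + c1 * (1.0 - np.exp(-h / a))


def fit_on_grid(h, gamma_hat, a_grid):
    """Least-squares fit: for each trial range a the model is linear in
    (c0, c1), so solve that linear problem and keep the best range."""
    best = None
    for a in a_grid:
        X = np.column_stack([np.ones_like(h), 1.0 - np.exp(-h / a)])
        coef, *_ = np.linalg.lstsq(X, gamma_hat, rcond=None)
        coef = np.clip(coef, 0.0, None)  # crude non-negativity constraint
        sse = float(np.sum((X @ coef - gamma_hat) ** 2))
        if best is None or sse < best[0]:
            best = (sse, coef[0], coef[1], a)
    return best[1:]


def simulate_parameter_spread(h, params, cov, a_grid, n_sim, rng):
    """Draw experimental variograms from N(model, cov) and refit each."""
    mean = exp_variogram(h, *params)
    draws = rng.multivariate_normal(mean, cov, size=n_sim)
    return np.array([fit_on_grid(h, g, a_grid) for g in draws])


rng = np.random.default_rng(1)
h = np.linspace(10.0, 100.0, 10)            # hypothetical lag bin centres
params = (1.0 / 3.0, 2.0 / 3.0, 50.0)       # illustrative c0, c1, a

# Hypothetical covariance matrix standing in for Equation (13): standard
# deviations of 10% of the semivariance, correlation decaying between bins.
sd = 0.1 * exp_variogram(h, *params)
corr = np.exp(-np.abs(h[:, None] - h[None, :]) / 30.0)
cov = corr * np.outer(sd, sd)

a_grid = np.linspace(10.0, 150.0, 57)       # the range is constrained to this grid
fits = simulate_parameter_spread(h, params, cov, a_grid, n_sim=200, rng=rng)
print("mean (c0, c1, a):", fits.mean(axis=0))
print("std  (c0, c1, a):", fits.std(axis=0, ddof=1))
```

The spread of the rows of `fits` gives an empirical distribution for each parameter without assuming it is Gaussian, and the bounds of `a_grid` together with the clip show one simple way in which parameter constraints enter the simulated distribution.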
Simulation tests have also shown that the expected nonergodic errors are
approximated accurately by Munoz-Pardo's (1987) expression [Eq. (23)]. These
nonergodic errors are due purely to sampling. Estimates of the ergodic variogram
have a component of uncertainty due to the fluctuations of the random variable, in
addition to this sampling error. When the large fields studied in Set 1 and Set 2 were
sampled with a 25 point scheme, the ergodic and nonergodic errors were almost
identical. When more sampling points were used, the nonergodic error became
less than the ergodic error. The difference between the nonergodic and ergodic
Figure 14. Comparison of the expected ergodic (continuous line) and nonergodic (dotted line)
errors for the Set 3 data set.
errors was more pronounced for the smaller fields considered in Set 3 and Set 4,
particularly over large lag distances.
These results reflect that for the small fields, the sample grids covered the
entire field effectively. Thus most of the variation of the variable within the region,
particularly over large lag distances, was accounted for. Therefore the nonergodic
error was much smaller than the ergodic error, which also had to account for
fluctuations of the random variable over other realizations. For the larger fields,
the sample points were more sparse. Therefore there were parts of the field that were
unsampled and the variation in these was not accounted for. Thus the nonergodic
error was more similar to the ergodic error since both estimators have to account
for behavior and fluctuations of the variable in unsampled regions. In the case of
the ergodic estimator this "unsampled region" consisted of all other realizations
of the variable.

These simulation tests were computationally very expensive. Each realization
was sampled at every point to calculate a definitive nonergodic variogram.
Webster and Oliver's (1992) study of nonergodic variogram uncertainty used a
Figure 15. Comparison of the expected ergodic (continuous line) and nonergodic (dotted line)
errors for the Set 4 data set.
method requiring far fewer computations. However, we feel that our method was
worthwhile since it was more accurate. To illustrate this, Figure 16 shows the
expected nonergodic error for the Set 4 data set calculated using Webster and Oliver's
method. The values seen here agree with those found in 1992. They are, however,
significantly less than both the nonergodic errors observed in our simulations and
the expected nonergodic errors from Munoz-Pardo's (1987) expression [Eq. (23)].
Webster and Oliver (1992) estimated the nonergodic error for a particular
sampling grid design and separation distance $h$ by the standard deviation of $\hat{\gamma}(h)$
values derived from translations of the grid over a single realization. Since they
came from the same realization, these estimates of $\gamma_{NE}(h)$ were not independent.
The covariance between $\hat{\gamma}(h; g_p)$, the semivariance estimated from translated grid
$g_p$, and $\hat{\gamma}(h; g_q)$, the semivariance estimated from translated grid $g_q$, is given by

$$\mathrm{Cov}\left(\hat{\gamma}(h; g_p), \hat{\gamma}(h; g_q)\right) = \frac{1}{2n^{2}(h)} \sum_{i=1}^{n(h)} \sum_{j=1}^{n(h)} C_{ij}(h; g_p, g_q), \tag{31}$$
Figure 16. The expected nonergodic error calculated using Webster and Oliver's (1992) method. The
are for 400 sample points, + are for 100 sample points, and the are for 25 sample points. The
lines show the expected nonergodic error using Munoz-Pardo's (1987) method [Eq. (23)].
where $C_{ij}(h; g_p, g_q)$ describes the covariance between $[z_1^i(h) - z_2^i(h)]$ sampled
from grid $g_p$ and $[z_1^j(h) - z_2^j(h)]$ sampled from grid $g_q$. Values of
$C_{ij}(h; g_p, g_q)$ may be calculated from Equation (12). If the translated grids are well
separated, or $h$ is small, then the covariances between semivariances estimated from
different grids are small and Webster and Oliver's method gives a good approximation
of the nonergodic error. However, there are only a small number of translations
of large sample grids over small regions which do not have a point in common.
The positions of sample points within some of these grids are close enough to cause
significant correlation between the estimated semivariances. This leads to the
nonergodic error being underestimated, as illustrated in Figure 16. Our method did
not contain such a bias since the nonergodic error estimates came from different
realizations of the random process and were therefore uncorrelated.

The difference between the nonergodic and ergodic errors can have implications
for both the design of efficient sampling schemes and variogram model
fitting. The ergodic covariance matrix has been used previously to optimize sample
[...]
Equation (13). Then Equation (23) could be calculated from this model, and used
for one final iteration of the fitting procedure [Eq. (7)]. More theoretical work is
required to ensure that such an approach is consistent.

A major disadvantage of calculating the nonergodic covariance matrix is the
extra computational work required. The method described in this paper requires all
the covariances between pairs of pairs within a very concentrated sample scheme
to be calculated. For each entry of the covariance matrix it is only the average of the
covariance between pairs from each bin that is needed. Therefore the computational
load may be reduced by subsampling of these pairs.
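The subsampling idea can be illustrated with a short sketch. It makes assumptions that go beyond this section: the covariance between two squared increments is taken from the standard identity for a Gaussian variable, used here as a stand-in for the paper's own expression, and the exponential variogram, the 10 by 10 grid and the function names `exp_variogram`, `pair_cov` and `bin_covariance` are hypothetical. The exhaustive average over all pairs of pairs is compared with the average over a random subsample of them.

```python
import numpy as np


def exp_variogram(d, sill=1.0, a=50.0):
    """Exponential variogram without nugget (illustrative choice)."""
    return sill * (1.0 - np.exp(-d / a))


def pair_cov(p, q, gamma):
    """Covariance between the squared increments of pairs p and q, assuming
    a Gaussian variable. p and q are (2, 2) arrays of end-point coordinates."""
    def dist(u, v):
        return np.linalg.norm(u - v)

    c = (gamma(dist(p[0], q[1])) + gamma(dist(p[1], q[0]))
         - gamma(dist(p[0], q[0])) - gamma(dist(p[1], q[1])))
    return 2.0 * c ** 2


def bin_covariance(pairs_p, pairs_q, gamma, n_sub=None, rng=None):
    """Covariance between two semivariance estimates, each the mean of half
    squared increments over its pairs. If n_sub is given, the double sum is
    replaced by the mean over n_sub randomly chosen (i, j) combinations."""
    n_p, n_q = len(pairs_p), len(pairs_q)
    if n_sub is None:
        idx = [(i, j) for i in range(n_p) for j in range(n_q)]
    else:
        idx = zip(rng.integers(n_p, size=n_sub), rng.integers(n_q, size=n_sub))
    vals = [pair_cov(pairs_p[i], pairs_q[j], gamma) for i, j in idx]
    return 0.25 * float(np.mean(vals))


rng = np.random.default_rng(2)
xs, ys = np.meshgrid(np.arange(0.0, 100.0, 10.0), np.arange(0.0, 100.0, 10.0))
pts = np.column_stack([xs.ravel(), ys.ravel()])

# All pairs separated by one grid step in the x direction (90 pairs).
pairs = np.array([(p, p + [10.0, 0.0]) for p in pts if p[0] + 10.0 < 100.0])

full = bin_covariance(pairs, pairs, exp_variogram)              # 8100 terms
approx = bin_covariance(pairs, pairs, exp_variogram, n_sub=4000, rng=rng)
print("exhaustive:", full, "subsampled:", approx)
```

Uniform random subsampling is the simplest choice. The few terms in which a pair is compared with itself or with a near neighbour dominate the average, so a scheme that samples those terms preferentially would probably reduce the noise further, though that is not explored here.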
In Experiment 3, the effect upon the experimental variogram confidence limits
of using the fitted variogram rather than the correct ergodic variogram was seen
to be small for sample schemes of 100 or more points. It therefore appears that the
circular approach in calculating variogram uncertainty is valid.
CONCLUSIONS
This study has demonstrated that for a known ergodic variogram, it is
possible to accurately determine the expected difference between the experimental
semivariances calculated from a particular sampling scheme and the
corresponding ergodic and nonergodic variogram values. Ergodic errors may be estimated
by Pardo-Iguzquiza and Dowd's (2001a) method [Eq. (13)] and nonergodic errors
by Munoz-Pardo's (1987) expression [Eq. (23)]. The ergodic error is significantly
less demanding to compute than the nonergodic error. For large fields the
difference between the two error expressions is negligible. However, for small regions,
say with length around twice the range of spatial correlation of the variable, the
nonergodic error is significantly less than the ergodic error.
Previously, Muller and Zimmerman (1999) and Bogaert and Russo (1999)
have used the ergodic error expressions to compute optimal sampling schemes for
variogram estimation. If the aim of these schemes is to approximate the variogram
of the single region being sampled with maximum precision, then a nonergodic
expression of variogram uncertainty would be more appropriate. On smaller fields
use of the ergodic expression leads to more intensive sampling than is required.
Our results have also suggested that the GLS variogram fitting procedure may be
improved (in the sense that the fitted variogram better matches the nonergodic
variogram) if the nonergodic error is incorporated into the final iteration of the
procedure.
It should be noted that if these expressions are used to determine variogram
uncertainty in a real survey, there will be additional uncertainty because the ergodic
variogram is unknown. Further simulation tests suggested that the additional
uncertainty from using the estimated variogram rather than the true ergodic variogram
is small for sample schemes of 100 or more points.
ACKNOWLEDGMENTS
This work was supported by the Biotechnology and Biological Sciences Research
Council of the U.K. through Grant 204/D1 5335 and by the Home-Grown
Cereals Authority of the U.K. through Grant 2453.
REFERENCES
Bogaert, P., and Russo, D., 1999, Optimal spatial sampling design for the estimation of the variogram
based on a least squares approach: Water Resour. Res., v. 35, no. 4, p. 1275-1289.
Brooker, P. I., 1986, A parametric study of robustness of Kriging variance as a function of range and
relative nugget effect for a spherical semivariogram: Math. Geology, v. 18, no. 5, p. 477-488.
Brus, D. J., and de Gruijter, J. J., 1994, Estimation of nonergodic variograms and their sampling variance
by design-based sampling strategies: Math. Geology, v. 26, no. 4, p. 437-453.
Cressie, N., 1985, Fitting variogram models by weighted least squares: Math. Geology, v. 17, no. 5,
p. 563-586.
Deutsch, C. V., and Journel, A. G., 1998, GSLIB: Geostatistical software library and user's guide, 2nd
ed.: Oxford University Press, New York, 369 p.
Gathwaite, P. H., Joliffe, I. T., and Jones, B., 1995, Statistical inference: Prentice Hall, London, 290 p.
Journel, A. G., and Huijbregts, C. J., 1978, Mining geostatistics: Academic Press, London, 600 p.
McBratney, A. B., and Webster, R., 1986, Choosing functions for semi-variograms of soil properties
and fitting them to sampling estimates: J. Soil Sci., v. 37, no. 4, p. 617-639.
Menke, W., 1984, Geophysical data analysis: Discrete inversion theory: Academic Press, San Diego,
CA, 285 p.
Muller, W. G., and Zimmerman, D. L., 1999, Optimal designs for variogram estimation: Environmetrics,
v. 10, no. 1, p. 23-37.
Munoz-Pardo, J. F., 1987, Approche Geostatistique de la variabilite spatiale des Milieux Geophysiques:
These Docteur-Ingenieur, Universite de Grenoble et l'Institut National Polytechnique de Grenoble,
254 p.
Ortiz, C. J., and Deutsch, C. V., 2002, Calculation of uncertainty in the variogram: Math. Geology,
v. 34, no. 2, p. 169-183.
Pardo-Iguzquiza, E., and Dowd, P. A., 2001a, Variance-covariance matrix of the experimental
variogram: Assessing variogram uncertainty: Math. Geology, v. 33, no. 4, p. 397-419.
Pardo-Iguzquiza, E., and Dowd, P. A., 2001b, VARIOG2D: A computer program for estimating the
semi-variogram and its uncertainty: Comput. Geosciences, v. 27, no. 5, p. 549-561.
Todini, E., 2001, Influence of parameter estimation uncertainty in Kriging: Part 1: Theoretical
development: Hydrol. Earth Sci. Syst., v. 5, no. 2, p. 215-223.
Todini, E., Pellegrini, F., and Mazzetti, C., 2001, Influence of parameter estimation uncertainty
in Kriging: Part 2: Test and case study applications: Hydrol. Earth Sci. Syst., v. 5, no. 2,
p. 225-232.
Webster, R., and Oliver, M. A., 1992, Sample adequately to estimate variograms of soil properties:
J. Soil Sci., v. 43, no. 1, p. 177-192.
Webster, R., and Oliver, M. A., 2001, Geostatistics for environmental scientists: John Wiley & Sons,
Chichester, 271 p.