19
The Practical Use of Semiparametric Models in Field Trials Maria DURBAN , Christine A. HACKETT , James W. MCNICOL , Adrian C. NEWTON , William T. B. THOMAS , and Iain D. CURRIE This articleexaminesthepracticaluseof semiparametricmodelsin theanalysisof eld trials—that is, models with parameterized treatment effects and additive terms derived by a data-drivenapproach using a locally weighted running line smoother (loess). We discuss graphical methods to identify spatial structure in the data and model selection procedures to choose the degree of smoothing. Once the spatial part of the model has been chosen, hypotheses about the treatment effects may be tested. Semiparametric models are used to analyze two barley eld trials exhibiting spatial trends. The rst has a single experimental treatment and a row-column design. The second has a split-plot design, and we use a semiparametric model which accounts for the randomization at the different strata of this design. We compare the semiparametric analyses with classical analyses of variance and with alternative spatial models. We nd that semiparametric models give a good insight into spatial variation in the eld and can improve the precision of parameter estimates. Key Words: Bootstrap F test; Loess; Model selection;Smoothing; Spatial analysis;Split- plot. 1. INTRODUCTION There is a large statistical methodology on experimental designs and analyses to ac- count for spatial trends in experiments such as agricultural or forestry trials. Sophisticated designs, for example alpha designs and row-column designs, are frequently used for exper- iments with a single treatment, such as plant variety trials. Fewer experimental designs are appropriate for assessing the effects of several factors simultaneously. Edmondson (1993) considered systematic designs for experiments (including factorial experiments) where low degree polynomial trends are expected across rows and columns. Williams and John (1996) Maria Durban is Assistant Professor, Department of Statistics and Econometrics, Universidad Carlos III de Madrid, Edicio Torres Quevedo, 28911 Leganes (Madrid), Spain. Christine A. Hackett is a Statistician, and James W. McNicol is a Statistician, Biomathematics and Statistics Scotland, based at the Scottish Crop Research Institute, Invergowrie, Dundee, DD2 5DA, Scotland (E-mail: [email protected]). Adrian C. Newton is a Plant Pathol- ogist, and William T. B. Thomas is a Barley Geneticist, Scottish Crop Research Institute. Iain Currie is a Senior Lecturer in Statistics at the Department of Actuarial Mathematics and Statistics, Heriot-Watt University, Edin- burgh, EH14 4AS, Scotland. Maria Durban carried out this work while based with Biomathematics and Statistics Scotland at the Scottish Crop Research Institute. Christine Hackett is the corresponding author. c ® 2003 American Statistical Association and the International Biometric Society Journal of Agricultural, Biological, and Environmental Statistics, Volume 8, Number 1, Pages 48–66 DOI: 10.1198/1085711031265 48

The practical use of semiparametric models in field trials

Embed Size (px)

Citation preview

Page 1: The practical use of semiparametric models in field trials

The Practical Use of Semiparametric Modelsin Field Trials

Maria DURBAN , Christine A. HACKETT , James W. MCNICOL,

Adrian C. NEWTON, William T. B. THOMAS , and Iain D. CURRIE

This articleexaminesthe practicaluse of semiparametricmodels in the analysisof �eldtrials—that is, models with parameterized treatment effects and additive terms derived bya data-drivenapproach using a locally weighted running line smoother (loess). We discussgraphical methods to identify spatial structure in the data and model selection proceduresto choose the degree of smoothing. Once the spatial part of the model has been chosen,hypotheses about the treatment effects may be tested. Semiparametric models are used toanalyze two barley �eld trials exhibiting spatial trends. The �rst has a single experimentaltreatment and a row-column design. The second has a split-plot design, and we use asemiparametric model which accounts for the randomization at the different strata of thisdesign. We compare the semiparametric analyses with classical analyses of variance andwith alternative spatial models. We �nd that semiparametric models give a good insightinto spatial variation in the �eld and can improve the precision of parameter estimates.

Key Words: BootstrapF test; Loess; Model selection;Smoothing; Spatial analysis;Split-plot.

1. INTRODUCTION

There is a large statistical methodology on experimental designs and analyses to ac-count for spatial trends in experiments such as agricultural or forestry trials. Sophisticateddesigns, for example alpha designs and row-column designs, are frequently used for exper-iments with a single treatment, such as plant variety trials. Fewer experimental designs areappropriate for assessing the effects of several factors simultaneously. Edmondson (1993)considered systematic designs for experiments (including factorial experiments) where lowdegree polynomial trends are expected across rows and columns. Williams and John (1996)

Maria Durban is Assistant Professor,Department of Statistics and Econometrics, Universidad Carlos III de Madrid,Edi�cio Torres Quevedo, 28911 Leganes (Madrid), Spain. Christine A. Hackett is a Statistician, and James W.McNicol is a Statistician, Biomathematics and Statistics Scotland, based at the Scottish Crop Research Institute,Invergowrie, Dundee, DD2 5DA, Scotland (E-mail: [email protected]). Adrian C. Newton is a Plant Pathol-ogist, and William T. B. Thomas is a Barley Geneticist, Scottish Crop Research Institute. Iain Currie is a SeniorLecturer in Statistics at the Department of Actuarial Mathematics and Statistics, Heriot-Watt University, Edin-burgh, EH14 4AS, Scotland. Maria Durban carried out this work while based with Biomathematics and StatisticsScotland at the Scottish Crop Research Institute. Christine Hackett is the corresponding author.

c® 2003 American Statistical Association and the International Biometric SocietyJournal of Agricultural, Biological, and Environmental Statistics, Volume 8, Number 1, Pages 48–66DOI: 10.1198/1085711031265

48

Page 2: The practical use of semiparametric models in field trials

PRACTICAL USE OF SEMIPARAMETRIC MODELS IN FIELD TRIALS 49

proposed a row-column design for factorial experiments, but stated that it is dif�cult toinclude more than two factors.

Unexpected environmental trends can also develop in the course of �eld experiments,such as local water-logging, gradients of diseases, or patchy growth. It is useful to have amethod of analysis to detect environmental trends in both single- and multi-factor experi-ments, and compare treatments satisfactorily when such trends are present. Spatial modelsfor the analysis of such experiments date back to Papadakis (1937), who adjusted for lo-cal trend by analysis of covariance using the treatment-corrected yields of neighbouringplots. Wilkinson, Eckert, Hancock, and Mayo (1983) extended this approach with their“smooth trend + error” decomposition. Cullis and Gleeson (1991) modeled the trend ina two-dimensional �eld trial as a separable process with autoregressive integrated movingaverage (ARIMA) models for the row and column trends. This approach was extended byGilmour, Cullis, and Verbyla (1997) and Verbyla, Cullis, Kenward, and Welham (1999).

Additive models (Hastie and Tibshirani 1986, 1987, 1990; Buja, Hastie, and Tibshirani1989) use a data-driven approach to represent the relationship between a response andexplanatory variables as a sum of smooth terms without assuming a speci�c functionalform. A model with smooth terms and parameterized treatment effects is referred to assemiparametric (Green, Jennison, and Seheult 1985). Green et al. (1985) and Hastie andTibshirani (1987) used semiparametric models to represent a smooth trend in a single rowof plots of barley. Hackett, Reglinski, and Newton (1995) demonstrated the potential ofsemiparametric models to represent two-dimensional trends in a barley trial with rowsand beds. One important feature of these models is that they enable spatial trends to bevisualized easily, either as a sum of two one-dimensional trends or as a two-dimensionalsmooth surface.

The theoretical developmentof semiparametric models (e.g., Green et al. 1985; Speck-man 1988; Hastie and Tibshirani 1990) has concentrated on semiparametric models with asingle smooth term. Here we present the theory for a semiparametricmodel with two smoothterms. We discuss smoothing parameter selection, use graphical and analytical methods todetermine the form of the smooth part of the model, and show how treatment effects andstandard errors may be estimated.

Semiparametric models are then applied to analyze two barley �eld trials from theScottish Crop Research Institute (SCRI), Dundee, UK. The �rst trial compares a largenumber of varieties, using a row-column design. The second trial has a split-plot design,and an approach to spatial analysis and hypothesis testing for such a design is proposed.The semiparametric analyses are compared to conventional analyses of variance, and tothe spatial models of Gilmour et al. (1997). The trial data are available as �les TrialA andTrialB from ftp://ftp.bioss.sari.ac.uk/pub/maria.

2. ADDITIVE AND SEMIPARAMETRIC MODELS

Additive models (Hastie and Tibshirani 1986, 1990) are a generalization of linearregression models. Let ·(:) be the expected value of the response, Y = (Y1; : : : ; Yn),

Page 3: The practical use of semiparametric models in field trials

50 M. DURBAN ET AL.

corresponding to explanatory variables X = (X1; : : : ; Xq). In a linear model ·(X) =

¬ +P

j Xj�j; in an additive model this becomes ·(X) = ¬ +P

j fj(Xj), where fj isan unspeci�ed smooth function of Xj called a smoother. Several smoothers were describedby Hastie and Tibshirani (1990): here we use the locally weighted running line smoother(loess) (Cleveland 1979), which is suitable for smoothing in one or two dimensions and isimplemented in S-Plus (MathSoft Inc.). Loess is a linear smoother and the estimate of thetrend may be written as ^

f = SY , where S is an n £ n matrix called the smoother matrix.The loess smoother calculates a local, weighted average over a neighborhood. The

number of points in the neighborhood is known as the span of the smoother. The span isrelated to degrees of freedom of the smoother: the larger the span, the fewer the degreesof freedom and the smoother the �tted function. By analogy with the hat matrix in linearregression, we take the degrees of freedom of a smoother to be the trace of the smoothermatrix S, although other choices are possible (Hastie and Tibshirani 1990, p. 52).

When analyzing data from �eld trials, semiparametric models are required. Thesecombine smooth terms and parameterized treatment effects, as in

Y = Z� + f1 + f2 + ¢ ¢ ¢ + fq + °; (2.1)

where Z is the n £ p design matrix for treatment effects, with 1, the column vector of 1’s,included in the column space of Z, � is the p £ 1 vector of treatment parameters and f1,f2; : : : ; fq represent smooth terms. The ° are independent with E(°) = 0 and var(°) = ¼2.Model (2.1) can be �tted by the back�tting algorithm (Hastie and Tibshirani 1990, p. 90),with explicit solutions for ^

� and ^f1 if q = 1 (Hastie and Tibshirani 1990, p. 118). We have

^� = (ZT (I ¡ S1)Z)¡1ZT (I ¡ S1)Y = A1Y (2.2)^f1 = S1(Y ¡ Z

^�) = S1(I ¡ ZA1)Y;

where A1 = (ZT (I¡S1)Z)¡1ZT (I¡S1) and S1 is the smoothermatrix to �t f1. [To ensurethat the necessary inversematrices exist,we will assume thatS1 and all subsequentsmoothermatrices have been centred (Hastie and Tibshirani 1990, p. 115).] In a �eld experiment thesituation q = 2 is of particular interest, with the two smooth terms corresponding to thetwo perpendicular axes of the trial (referred to here as rows and beds). Durban, Hackett,and Currie (1999) showed that the equation for ^

� is similar to Equation (2.2) for any q, andin particular for q = 2 we have

^� = (ZT (I ¡ M2)Z)¡1ZT (I ¡ M2)Y = A2Y; (2.3)

where A2 = (ZT (I¡ M2)Z)¡1ZT (I¡ M2) and M2 is the hat matrix for model (2.1) withsmooth terms f1 and f2, and no treatment term, that is, Z� = ·1. The exact calculationof A2 involves �tting the smooth terms of the model with the n columns of the identitymatrix as responses, that is, �tting n models. This is expensive computationally,but Durbanet al. (1999) showed that a good approximation in (2.3) may be achieved by assuming thatMT

2 Z = M2Z, thereby reducing the computation to �tting the smooth terms of the modelswith the p columns of Z as the responses, that is, �tting p models. The variance of ^

� maybe calculated from (2.3) as var( ^

�) = ¼2A2AT2 . The expected value of the residual sum of

Page 4: The practical use of semiparametric models in field trials

PRACTICAL USE OF SEMIPARAMETRIC MODELS IN FIELD TRIALS 51

squaresP

(Yi ¡ ^Yi)

2 is ¼2tr[(I¡H)(I¡H)T ] where H is the hat matrix for (2.1) satisfying^Y = HY = Z

^� +

^f1 +

^f2. The calculationof the degrees of freedom, tr[(I ¡ H)(I ¡ H)T ]

is also computationally expensive and we have used the simpler

s2 =

P(Yi ¡ ^

Yi)2

n ¡ p ¡ tr(M2)(2.4)

as the estimate of ¼2 where p is the rank of Z (Buja et al. 1989).

3. EXPERIMENTAL LAYOUT

3.1 TRIAL A

Trial A was a spring barley variety trial with 272 entries, comprising260 test entries and12 controls. It was grown at SCRI in 1998. The trial was sown as a row and column designof two replicates, with each replicate being 8 rows deep by 34 beds wide. The replicateswere adjacent to each other so that the whole trial occupied 16 rows (north-south) and 34beds (east-west). There was a notable slope across this trial, with row 1 lower than row 16.Plot yields were converted to tonnes per hectare for analysis.

3.2 TRIAL B

Trial B was an experiment to examine the effects of growing mixtures of winter barleycultivars on the yield and level of Rhynchosporiumsecalis infection. The trial was grown atSCRI in 1995–1996. The trial area was 56 beds wide (east-west) and 10 rows long (north-south), and was laid out as a split-plot design with four blocks (each 10 rows by 14 beds).Each block consisted of two main plots (10 rows by 7 beds), a randomly selected one ofwhich was treated with fungicide. The main plots were split into 70 subplots, and 70 barleycultivars or cultivar mixtures were allocated at random to these. Plot yields are analyzedhere. Further details of the experiment are given in Newton, Ellis, Hackett, and Guy (1997).

4. MODEL SELECTION

4.1 GRAPHICAL METHODS

An initial check for the presence of a trend is to calculate the residuals from a modelwith all treatment effects and interactions, but no block effects or other spatial terms, andplot them against position in the �eld, as a function of row number or bed number. In thesetrials, we have numbered rows from south to north and beds from west to east. Figure 1 plotsresiduals against row number and bed number for Trial A. The graph of residuals againstbed number shows that the large negative residuals occur at each side of the trial. The graph

Page 5: The practical use of semiparametric models in field trials

52 M. DURBAN ET AL.

��

��

���

��

�����

��

��

��

�����

���

��

��

��

��

��

��

���

���

��

��

���

��

��

��

��

���

��

��

��

���

��

��

��

����

��

��

��

��

��

���

��

��� �

���

��

��

��

���

��

�����

��

��

���

��

��

��

��

����

���

��

��

��

��

��

��

��

��

���

��

��

��

��

��

���

��

��

��

��

��

����

��

������

��

���

���

��

��

��� �

��

��

��

���

���

��

��

���

��

��

��

��

Row Number

Re

sidu

als

5 10 15

-0.5

0.0

0.5

��

� �

��

��

� � � � �

��

� �

��

� � ��

��

��

� �

��

��

��

��

� ��

� ��

��

� �

�� �

� �

� �

��

��

� � �

��

��

� �

� � �

� �

��

��

��

� �

��

��

��

��

��

� � �

� �

� � ���

�� �

� �

��

� �

��

��

��

�� �

��

��

�� �

��

� �

��

��

��

� �

��

��

��

��

� �

� �

��

� �

��

��

� �

��

��

��

��

�� �

��

� �

��

��

��

� �

�� �

� �

��

� ��

� �

�� �

��

��

��

� � ���

� �

� �

��

� ��

��

��

� �

��

��

��

� �

��

��

Bed NumberR

esid

uals

0 10 20 30

-0.5

0.0

0.5

Figure 1. Plot of residuals from Trial A against row and bed number with scatterplot smoother.

of residuals against row number shows a decrease in the residuals for rows 1–5. Figure 2shows similar graphs for Trial B. The graph of residuals against bed number shows a strongtrend in this case: most of the largest residuals are associated with beds 1–13, and most ofthe largest negative residuals with beds 14–24. The graph of residuals against row numberfor trial B shows little trend.

Coplots (Cleveland 1993) are a useful graphical tool for detecting whether the trendacross the beds varies with the row number, that is, whether a two-dimensional trend ispresent. A coplot, as in Figures 3 and 4, shows a series of graphs of the partial residuals(after adjustments for the treatment effects) against, for example, bed number, for values ofrow number in a certain interval. Successive graphs show different intervals, as indicatedbya panel at the top of the coplot. If the relationship between the partial residual and the bednumber differs between graphs, then a two-dimensional smooth surface may be necessary.

��

��

������

���

��������

��

���

��

��

�������

���

��

���

��

��

���

���

��

���

��

���

��

��

��

��

��

�����

��

��

��

���

�����

��

��

�����������

��

��

���

���

��

��

����

��

��

����

��

�����

������

��

���

��

��

��

�����

���

���

��

��

��

��

��

��

��

��

����

��

�����

��

��

���

���

��

���

��

�����

��

��

��

��

��

��

��

��

��

������

��

����

��

����

��

���

��

��

�����

��

��

���

���

���

��

��

���

���

��

���

��

�����

��

���

��

���

���

��

����

��

��

��

��

��

��

��

����

Row Number

Res

idua

ls

2 4 6 8 10

-1.0

-0.5

0.0

0.5

1.0

��

��

���

� ��

���

������ ��

��

�� �

��

��

� ��

�� ��

� ��

��

���

��

���

�� �

��

���

��

���

��

��

��

��

��

���� �

��

��

��

� ��

���� �

��

��

����

����� ��

��

��

���

���

��

� �

����

��

��

����

� �

�����

���� ��

��

���

��

��

��

�����

���

�� �

��

��

��

� �

��

��

��

� �

�� �

��

���

��

��

��

� ��

���

� �

���

��

����

��

��

� �

��

��

��

��

��

��

�� �

���

��

����

��

����

��

���

��

��

����

� �

��

���

�� �

���

��

��

� ��

���

��

�� �

��

����

��

� ��

��

���

���

��

����

� �

��

��

��

��

��

��

����

Bed Number

Res

idua

ls

0 10 20 30 40 50

-1.0

-0.5

0.0

0.5

1.0

Figure 2. Plot of residuals from Trial B against row and bed number with scatterplot smoother.

Page 6: The practical use of semiparametric models in field trials

PRACTICAL USE OF SEMIPARAMETRIC MODELS IN FIELD TRIALS 53

���

��

���

�����

��

�������

0 10 20 30

��

���

�����

���

�����

��

��

��

� �

��

����

����

��

��

���

���

0 10 20 30

���

���

��

��

���

���

�����

-0.5

0.0

0.5

��

��

��

����

��

���

�������

����

��

-0.5

0.0

0.5

��

��

��

����

����

��

��

��

���

���� �

���

�����

���

��

���

��

���

���

��

�����������

����

���

���

��

��

����

����

��

��

��

��

����

��

��

��

���

��

��

��

���

������

���

��

���

���

���

��

��

��

��

���

����

��

��

����

��

����

��

-0.5

0.0

0.5

���

������

�������

��

��

-0.5

0.0

0.5

��

���

���

���

����

��

��

���

0 10 20 30

��

��

��

��

���

�����

���

����

���

��

��

��

��

��

��

����

����

0 10 20 30

5 10 15

Bed Number

Res

idua

ls

Given : Row Number

Figure 3. Conditional plots of partial residuals from Trial A versus bed number for different row positions.Successive panels (read from left to right and bottom to top) correspond to the set of row positions in the top panel.

In Figure 3 (Trial A) all graphs show an increase up to bed 15, but differ for the highernumbered beds, depending on the row range. The coplot for Trial B (Figure 4) shows amarked decrease then increase across the beds for the graphs corresponding to the highnumbered rows, but less so for the low numbered rows. A two-dimensional smooth surfacewill probably be required to describe the trend in Trial B, but there is less indication of atwo-dimensional trend for Trial A.

4.2 ANALYTICAL METHODS

4.2.1 Model Selection Criteria

If the graph of residuals against a covariate suggests that a smooth function of thecovariate be included in the model, it is necessary to select a span for the smoother by

Page 7: The practical use of semiparametric models in field trials

54 M. DURBAN ET AL.

��

�����

������

���

�����

����������

��

���

��

�������

���

���

�������

���

������

���

����

���

����

�������������

��

��

0 10 20 30 40 50

���

�������

���

������

���

����

���

����

�������������

��

���

�����

��

�����

��

����

��������

���

���

��

��������

���

��

�����

��

�����

��

����

��������

���

���

��

��������

���

��

���

�����

��

��

����

��

��

���

�������

�����

����

����

��

����

0 10 20 30 40 50

���

�����

��

��

����

��

��

���

�������

�����

����

����

��

����

���

�������������

�����

������

���

���������

������

���

�������������

��

���

�������

��

����������

���

�����

������

-1.0

-0.5

0.0

0.5

1.0

���

�������������

�����

������

���

���������

������

���

�������������

��

���

�������

��

����������

���

�����

������

���

��

�����

�����

��������

��

���

��

���

��������

��

����

��

-1.0

-0.5

0.0

0.5

1.0

���

��

�����

�����

��������

��

���

��

���

��������

��

����

��

��

�����

��

����

��

��

������

���

���

�����

������

��

�������

0 10 20 30 40 50

��

�����

��

����

��

��

������

���

���

�����

������

��

�������

���

���

������

��������

����

���

����

���

�����

���

������

���

���

������

��������

����

���

����

���

�����

���

������

����

�����

��

�����

��

��

���

���

��

������

������

������

��

0 10 20 30 40 50

2 4 6 8 10

Bed Number

Res

idua

lsGiven : Row Number

Figure 4. Conditional plots of partial residuals from Trial B versus bed number for different row positions.

minimizing a suitable criterion. Four such criteria are: cross-validation (CV) (Stone 1974),generalized cross-validation (GCV) (Craven and Wahba 1979), the Akaike informationcriterion (AIC) (Akaike 1973) and, recently, a modi�ed Akaike criterion (AICC) (Hurvich,Simonoff, and Tsai 1998).

Cross-validation omits each point in turn and, for two smoothers f1 and f2 with spans¶1 and ¶2, minimizes the cross-validation sum of squares:

CV(¶1; ¶2) =1n

nXi = 1

�Yi ¡ zT

i^�(¡i) ¡ ^

f(¡i)1;¶1

(x1;i) ¡ ^f

(¡i)2;¶2

(x2;i)´2

; (4.1)

where zTi

^�(¡i), ^

f(¡i)1;¶1

(x1;i) and ^f

(¡i)2;¶2

(x2;i) are the �tted treatment and smooth terms esti-mated without the ith observation. We can write (4.1) more succinctly as

CV =1n

nXi = 1

ÃYi ¡ ^

Yi

1 ¡ Hii

!2

; (4.2)

where the Hii are the diagonalentries of the hat matrix for (2.1) usingall the data.By manip-ulating the estimatingequations (Hastie and Tibshirani 1990, p. 109) for the semiparametricmodel, we can write ^

f1 +^f2 = M2(Y ¡ Z

^�), and so by (2.3) we have

HY = ZA2Y + M2(Y ¡ ZA2Y ) (4.3)

which gives H = M2 + (I ¡ M2)ZA2. The cross-validation criterion thus requires theevaluation of M2 and is computationallydemanding. The three other criteria considered in

Page 8: The practical use of semiparametric models in field trials

PRACTICAL USE OF SEMIPARAMETRIC MODELS IN FIELD TRIALS 55

this paper use functions of the trace of H and are much simpler to compute. Generalizedcross-validation replaces each term Hii in (4.2) with its average value, tr(H)=n:

GCV =¼̂2

[1 ¡ tr(H)=n]2; (4.4)

where

¼̂2 =1n

nXi = 1

(Yi ¡ ^Yi)

2: (4.5)

In the Akaike information criterion, the span is chosen to minimize AIC = log ¼̂2 +

2tr(H)=n. Hurvich et al. (1998) proposed a modi�ed version of the AIC designed to avoidover�tting in nonparametric regression:

AICC = log ¼̂2 + 1 +2(tr(H) + 1)

n ¡ tr(H) ¡ 2: (4.6)

These four criteria will be compared for our dataset.

4.2.2 Approximate and Bootstrap F Tests

If a decrease in the span is associated with an apparently small change in the criteria,then a natural question is whether the change is statistically signi�cant. Cleveland andDevlin (1988) and Hastie and Tibshirani (1987, 1990) discussed approximate F tests in thiscontext. Let H1 and H2 be the hat matrices of two semiparametric models with spans g1

and g2, g2 < g1. We seek a test of the null hypothesis: the span is g1, against the alternativehypothesis: the span is g2.

Let Y T RiY = Y T (I ¡ Hi)(I ¡ Hi)T Y; i = 1; 2, be the residual sums of squares for

the two models. Typical statistical software bases a test of the null hypothesis against thealternative hypothesis on the F -like statistic

F1 =(Y T R1Y ¡ Y T R2Y )=(tr(H2) ¡ tr(H1))

Y T R2Y=(n ¡ tr(H2) ¡ 1)(4.7)

and refers this statistic to the F distributionwith tr(H2)¡ tr(H1) and n¡tr(H2)¡1 degreesof freedom. However, from the expected value of the residual sum of squares, the residualdegrees of freedom for each model should be de�ned as tr[(I ¡ Hi)(I ¡ Hi)

T ]; i = 1; 2.Cleveland and Devlin (1988) showed that

F2 =(Y T R1Y ¡ Y T R2Y )=¸1

Y T R2Y=¯1(4.8)

with an approximate F distribution with ¸21=¸2 and ¯2

1=¯2 degrees of freedom is a moreappropriate test, where ¸1 = tr(R1 ¡ R2), ¸2 = tr[(R1 ¡ R2)2], ¯1 = tr(R2) and ¯2 =

tr(R22). Experience suggests that this adjustment often changes the numerator degrees of

freedom of the F distribution signi�cantly. However, manipulation of the n £ n matricesHi to compute the appropriate degrees of freedom is computationally demanding.

Page 9: The practical use of semiparametric models in field trials

56 M. DURBAN ET AL.

Table 1. Comparison of Different Model Selection Criteria for a Range of Models for Trial A

Row span Bed span df AICc AIC GCV CV

1 1 3.0 0.128 ¡1.928 0.216 0.2161 30/34 3.3 0.094 ¡1.996 0.211 0.205

14/16 20/34 4.6 0.001 ¡2.071 0.185 0.18512/16 20/34 5.0 ¡0.018 ¡2.098 0.181 0.18110/16 15/34 6.7 ¡0.071 ¡2.169 0.170 0.17110/16 10/34 8.7 ¡0.098 ¡2.221 0.159 0.15910/16 9/34 9.7 ¡0.110 ¡2.246 0.162 0.1658/16 10/34 9.3 ¡0.100 ¡2.232 0.160 0.1586/16 10/34 10.5 ¡0.092 ¡2.238 0.159 0.157

10/16 5/34 16.8 ¡0.077 ¡2.303 0.168 0.1548/16 5/34 17.4 ¡40.080 ¡2.315 0.168 0.1536/16 5/34 18.6 ¡0.071 ¡2.322 0.168 0.1524/16 5/34 10.8 ¡0.042 ¡2.323 0.176 0.153

Best 2-d 125/544 12.4 ¡0.036 ¡2.256 0.164 0.175NOTE: df = degrees of freedom

In this article, we use the F1 statistic of (4.7), but bootstrap the residuals under thenull hypothesis to obtain its distribution. This involves less manipulation of large matricesthan the use of (4.8), and so is more ef�cient in large trials. The bootstrap approach alsoaddresses the problem of non-nested hypotheses. If g2 < g1, then a model with span g1

is smoother than one with span g2 and it is reasonable to take the model with span g1 asthe null model. However, the null model is not generally a submodel of the alternative andthe usual results on the independence of the sums of squares in (4.7) and (4.8) will nothold. The bootstrap estimates the percentage points in (4.7) directly regardless of whetherthe numerator is independent of the denominator. We also use the bootstrapped F test tocompare the sum of two trends f1(bed) + f2(row) with a two-dimensional smooth surfacef (bed; row).

5. ANALYSIS OF THE TRIAL RESULTS

5.1 TRIAL A

5.1.1 Semiparametric Modeling

The spatial trend in the yields from Trial A may be modeled as either a two-dimensionaltrend E(Yrb) = · + ½j + lo(r; b) where Yrb is the yield of the plot in row r and bed b,which is plantedwith cultivarj; j = 1; : : : ; 272, and lo(r; b) represents the two-dimensionalloess smoother across beds and rows, or as a sum of two one-dimensional trends E(Yrb) =

· + ½j + lo(r) + lo(b).AIC, AICC , CV, and GCV were calculated for both models for a range of spans of the

smoother. Some of these are compared in Table 1. The minimum of the AICC was achievedby �tting the model with two one-dimensional trends with a span of 10/16 for rows and9/34 for beds. The best-�tting model with a two-dimensional trend had a larger AICC , asdid models with a single one-dimensional trend.

Page 10: The practical use of semiparametric models in field trials

PRACTICAL USE OF SEMIPARAMETRIC MODELS IN FIELD TRIALS 57

24

68

1012

1416

Row Number

5

10

15

20

25

30

Bed Number

-0.6

-0.4

-0.2

00.

20.

40.

60.

8A

dditi

veS

urfa

ce

Figure 5. Additive trend for Trial A.

The model chosen by minimizing the AICC criterion had larger row and bed spansthan the models chosen by the other criteria; the spans selected by minimizing the originalAIC were smallest (4/16 for rows, 5/34 for beds). A bootstrap F test with 500 replicationswas used to test whether the smaller span provides a signi�cant improvement in �t. The teststatistic F1 of Equation (4.7) comparing these spans had a value of 2.84, compared with a95% point of 2.58 for the bootstrap distribution of F1; the corresponding 99% point was3.18. Hence there is no strong evidence for a signi�cant improvement in �t associated withthe decrease in span. These percentage points are larger than given by the simple F test in(4.7) with degrees of freedom tr(H2) ¡ tr(H1) º 11 and n ¡ tr(H2) ¡ 1 º 251, whichwould have led to the conclusion that the smaller spans gave an improved �t. The �ttedsurface obtained by combining the one-dimensional trends for trial A is shown in Figure 5.The higher yields in the low-numbered rows can be interpreted as an effect of the slope ofthe trial: nutrients from the higher part of the trial (high-numbered rows) may be washeddown by rainfall and accumulate at the bottom of the slope, giving depleted yields at thetop and higher yields at the bottom.

Table 2. Comparison of Models to Investigate the Signicance of the Smooth Terms and CultivarEffects (C) in Trial A. The variance ratios in each case are calculated with the full model (a)as the denominator.

ChangeModel Res SS Res df Test in df vr F Pr.

(a) lo(row,10/16) + lo(bed,9/34) + C 20.44 262.3 ¡ ¡ ¡ ¡(b) lo(row,10/16) + C 31.87 269.3 (a) v (b) 7.0 21.0 < .001(c) lo(bed,9/34) + C 28.31 265.0 (a) v (c) 2.7 37.6 < .001(d) lo(row,10/16) + lo(bed,9/34) 76.94 533.3 (a) v (d) 271 2.7 < .01

NOTE: df = degrees of freedom, v.r. = variance ratio, F Pr. = F probability

Page 11: The practical use of semiparametric models in field trials

58 M. DURBAN ET AL.

The signi�cance of the treatments effects may be investigated by dropping treatmentterms to build up an analysis of variance table (Table 2). Due to the nonparametric natureof the smooth terms, standard results invoking orthogonality no longer hold and again weuse bootstrapping to obtain the distribution of the test statistic F1. The test statistic valueof 2.7 is greater than the bootstrap 99% point of 1.76, indicating signi�cant differences(p < 0:01) among the cultivars. A very large number of bootstrap samples would berequired to establish whether this test statistic is signi�cant with (p < 0:001). The averagestandard error of difference for comparisons of two cultivars is 0.280. Plots of residualsfrom the best semiparametric model against position in the trial, and coplots of residualsshowed that all spatial trends had been removed.Examinationof the estimatesof the cultivarmeans showed that these changed very little as the span of the smoother changed.

5.2 COMPARISON WITH ANALYSIS OF VARIANCE

An analysis of variance regarding the trial design as a randomizedblock design showedsigni�cant differences among the cultivars (F271;271 = 2:16, p < 0:001). The standard errorof differences of cultivar means is 0.372, larger than that for the semiparametric model(0.280). However, plots of the residuals from this analysis against their position in the�eld indicate the presence of spatial trends, and so this analysis is inadequate. A betteranalysis takes into account the row-column design of the experiment and �ts rows andcolumns as random effects, and block and cultivar as �xed effects using residual maximumlikelihood (REML). Both random effects are signi�cant, and plots of residuals against rowor bed number show no remaining spatial trend. The change in deviance when the cultivarterm is dropped (Welham and Thompson 1997) is 343.3 on 271 degrees of freedom, witha signi�cance of 0.002, and the average standard error of differences of cultivar means is0.291.The rank correlationsof the cultivars from the three analyses are high: 0.968 betweenthe semiparametric model and REML, and 0.899 and 0.889 between the randomized blockanalysis and the semiparametric model and REML respectively. However, there are someimportantchangesin ranking,for example,cultivarnine is ranked�fthby thesemiparametricmodel, 10th by REML and in 28th place by the randomized block analysis.

5.3 TRIAL B

5.3.1 Semiparametric Modeling

The semiparametric analysis of Trial B is complicated by the split-plot design. Thespatial models discussed earlier assume a single error term. It is possible to ignore thesplit-plot design, �t a semiparametric model to the yields from the subplots, and test thesigni�cance of the treatments against the resulting residual mean square. However, themainplot treatment and subplot treatment are randomized at different strata, so it is dif�cultto justify testing the mainplot treatment against a subplot residual mean square.

Page 12: The practical use of semiparametric models in field trials

PRACTICAL USE OF SEMIPARAMETRIC MODELS IN FIELD TRIALS 59

24

68

10

Row Number10

20

30

40

50

Bed Number

2-D

Sur

face

Figure 6. Two-dimensional trend for Trial B.

A semiparametric model with a two-dimensional trend for the yields of the subplots is

Yrb = · + ¬i + ½j + (¬½ )ij + lo(r; b) + °rb; (5.1)

where Yrb is the yield of the subplot in row r and bed b, which is planted with subplottreatment (cultivar)j; j = 1; : : : ; 70 and receivesmainplot treatment (fungicide) i; i = 1; 2.· is the overall mean, ¬i is the effect of the ith level of fungicide, ½j is the effect of the jthcultivar, (¬½ )ij is the interaction, lo(r; b) is the two-dimensional loess smoother across bedsand rows and °rb are the subplot errors. We propose that this model, and the correspondingmodel with two one-dimensional smooth terms, are suitable for investigating the spatialtrend. Once the best smooth term has been selected, the signi�cance of the interaction termand the cultivarmain effects may be tested by dropping terms from Equation(5.1). However,as in a conventional split-plot analysis, the fungicide treatment should be tested against aresidual mean square calculated from the mainplotsand we will use a semiparametric modelfor the mainplots, as described later, to do this.

For Trial B, the lowest AICC was achieved by a two-dimensional smooth surface withspan 0.082 (= 46/560). A bootstrap F test with 500 replications con�rmed that the two-dimensional surface was a better �t than the best model with two one-dimensional smoothterms, with the test statistic F1 equal to 10.32, greater than the 99% point of 1.66. The �ttedsmooth surface is shown in Figure 6.

The signi�cances of the smooth terms and the treatment effects are shown in Table 3.There was no signi�cant cultivar£ fungicideinteraction,but highly signi�cant cultivarmaineffects (p < 0:001). The average standard error of differences for cultivar comparisons was0.107.The smooth term was alsosigni�cant (p < 0:001). The smoothterm and the treatmenteffects are not orthogonal, but the order of dropping terms from the model made negligibledifferences to their signi�cance.

If the fungicide term is dropped from the split-plot model the F value is highly sig-

Page 13: The practical use of semiparametric models in field trials

60 M. DURBAN ET AL.

Table 3. Comparison of Models to Investigate the Signicance of the Smooth Term and Effects ofCultivar (C) and Fungicide (F) in Trial B. The variance ratios in each case are calculated withthe full model (a) as the denominator.

ChangeModel Res SS Res df Test in df vr F Pr.

Subplot model:(a) lo(bed,row,0.082) + F + C + F.C 16.31 379.6 ¡ ¡ ¡ ¡(b) + F + C + F.C 51.09 420.0 (a) v (b) 40.4 20.0 < .001(c) lo(bed,row,0.082) + F + C 18.93 448.6 (a) v (c) 69 0.9 n.s.(d) lo(bed,row,0.082) + F 52.89 517.6 (c) v (d) 69 11.5 < .001

Mainplot model:(a) lo(mainplot, span= 0.625) + F 1.85 3.1 ¡ ¡ ¡ ¡(b) + F 18.35 6 (a) v (b) 2.9 9.3 < .001(c) lo(mainplot, span= 0.625) 31.38 4.1 (a) v (c) 1 48.9 < .01

NOTE: df = degrees of freedom, v.r. = variance ratio, F Pr. = F probability

ni�cant (p < 0:001). However comparison with the subplot residual mean square is notsuitable, due to the fungicide being applied to the mainplots.The mean yield was calculatedfor each of the eight mainplots. These form a single row, and so spatial trends at this levelwere represented by a one-dimensional smooth term. The span was selected (by AICC )to be 0.625 (=5/8). The signi�cances of the fungicide and the smooth term are tested inthe lower part of Table 3. Due to the small number of mainplots bootstrapping to test thesigni�cance of the fungicide effect was inappropriate, and the more exact F test (4.8) wasused. The usual F test (4.7) had 1 and 3.05 df, so that the signi�cance of the F test was0.005, while the more exact F test of (4.8) had very similar degrees of freedom of 0.98 and2.95, giving a signi�cance of 0.006. The fungicide standard error of difference was 0.069.

5.3.2 Comparison with Analysis of Variance

A split-plot analysis of variance indicated signi�cant effects of fungicide (F1;3 = 40:3,p = 0:008) and cultivar (F69;414 = 7:2, p < 0:001). The cultivar £ fungicide interactionwas not signi�cant. The standard errors of differences are 0.086 for fungicide and 0.141for cultivar; these are larger than those from the semiparametric analysis, indicating theimproved precision obtained by adjusting for the spatial trend.

A REML analysis was also used to model the data, with random effects for blocks,beds, rows and a block by row interaction. The standard error of differences for cultivar forthis model is 0.108, very close to that of the semiparametric model. The change in deviancewhen the cultivar £ fungicide interaction term is dropped is 88.1 (69 df), with a signi�canceof approximately 0.06.

5.4 OTHER SPATIAL MODELS

In Sections 5.1.2 and 5.2.2 we compared the semiparametric spatial analysis with aloess smoother with conventional analysis of variance. In this section we look at another

Page 14: The practical use of semiparametric models in field trials

PRACTICAL USE OF SEMIPARAMETRIC MODELS IN FIELD TRIALS 61

Table 4. Summary of Analyses for Three Mixed Models Fitted to Trial A

Random Error Number of REMLModel terms variance model variance parameters log-likelihood

1 spl(bed)+ spl(row)+ units I 3 104.9552 spl(bed)+ spl(row)+ units AR1£ AR1 6 122.2993 units AR1£ AR1 4 121.207

parametric approach to modeling the spatial variation in variety trials. Gilmour et al. (1997)proposed a general mixed model for the analysis of variety trials. Their model containsterms to model �xed effects, and long range and short range correlations and can be writtenas Y = X½ + Zu + ¹ + ², where Y is the vector of yields, ½ is the vector of �xed effectswith design matrix X, u is the vector of random effects with design matrix Z, ¹ is a spatiallydependent random error vector, and ² is a vector of independent plot errors; see Gilmour etal. (1997) for further details. Large scale trends, for example fertility, are modeled by theterm Zu; polynomials and splines are two possibilities. Local correlation is modeled alongrows and columns by the term ¹, and Gilmour et al. (1997) suggested a class of separableprocesses for this. The models are all �tted within the mixed model framework so modelscan be compared via the residual log-likelihood.We have two concerns with this approach:�rst, a trend, such as fertility, induces a short range correlation and the two effects Zu and¹ could be confounded; second, the class of separable processes, while wide, does imposea particular structure on the responses.

5.4.1 Analysis of Trial A

We �tted the Gilmour et al. (1997) model to the data in trial A using the SAMMsoftware (Butler, Gilmour, and Cullis 1999). We considered three models: model 1 has twosplines (for beds and rows) and independent errors, model 2 has two splines and an AR1 £AR1 + units covariance structure, and model 3 omits the splines but has an AR1 £ AR1 +

units covariance structure. From the log-likelihoods in Table 4, the preferred model wouldbe model 3, with an AR1 £ AR1 + units covariance structure. However, we suspected thatthe differences in the log-likelihood are explained by the fact that in some cases the shortterm correlation and smooth trend cannot be identi�ed and/or separated appropriately. Toinvestigate this further, we simulated data where the true model had trends for rows andbeds plus independent errors, �tted the same three models and obtained similar results asthose in Table 4. Figure 7 shows the sample variogram of residuals from trial A after �ttingmodel 3 (left), model 2 (center), and model 1 (right). Although the left panel suggests thatthe sample variogram is not inconsistentwith an AR1 £ AR1 + unitsprocess, the right panelis also consistent with the variogram generated from a process with independent errors.

5.4.2 Analysis of Trial B

The standard analysis of variance model for a split-plot experiment is a simple exampleof a mixed model so the Gilmour et al. (1997) model is able to maintain the error strata

Page 15: The practical use of semiparametric models in field trials

62 M. DURBAN ET AL.

5

10

15

20

2 5

3 0

2

4

6

8

10

12

14

0.0 5

0.10

0.15

BED

ROW

v

5

10

1 5

20

25

30

2

4

6

8

10

12

14

0.0 2

0.04

0.0 6

0.08

0.10

0.12

BED

ROW

v

5

10

1 5

20

25

30

2

4

6

8

10

12

1 4

0.0 2

0.0 4

0.06

0.08

BED

ROW

v

Figure 7. Left: variogram after �tting AR1 £ AR1 model; center: variogram after �tting AR1 £ AR1 + splinesmodel; right: variogram after �tting one-dimensional + one-dimensional loess model.

of the split-plot while modeling the spatial variation. We �tted three models with differentcovariance structure at the main and subplot level (see Table 5). Model 1 is the standardanalysis of variance model for a split-plot experiment, model 2 assumes that the subplotstructure is explainedby an AR1 £ AR1 + unitsprocess, and model 3 ignores the covariancestructure at the main plot level. The preferred model would be model 2, with an AR1 £AR1 + units covariance structure. However, the residual plot in the left hand panel ofFigure 8 from the AR1 £ AR1 model suggests that some nonadditive trend remains in theresiduals. Again, data were simulated using the two-dimensional trend (as estimated by thesemiparametric model) plus a random error term and the variogram from �tting an AR1 £AR1 + units was very similar to that from Trial B.

A referee pointed out that the analysis of variance table (see Table 6) for model 2suggests that the cultivar £ fungicide interaction may be signi�cant (F69;414 = 1:3; p º0:06). The signi�cance test appears to contradict the �nding in Table 3 where F = 0.88 andthere is no suggestion that the interaction may be signi�cant. We believe that this is due tomodel 2 being inappropriate for the subplot variance (as we have shown above). A plot ofthe cultivar means with and without fungicide estimated from the two-dimensional loessmodel (5.1) shows little or no suggestion of a signi�cant cultivar £ fungicide interaction.

6. DISCUSSION

Unexpected trends in �eld trials, if overlooked,can affect estimates of treatment effectsor reduce the precision of estimation, particularly in experiments with many treatments and

Table 5. Variance Modeling for Three Mixed Models Fitted to Trial B

Term Model 1 Model 2 Model 3

Block 0.0288 0.0000 ¡Block.Fungicide 0.0138 0.0125 ¡Units ¡ 0.0243 0.0244Residual 0.0791 0.1255 0.1444Row Correlation ¡ 0.884 0.8990Bed Correlation ¡ 0.921 0.924REML log-likelihood 215.7 343.7 342.9

Page 16: The practical use of semiparametric models in field trials

PRACTICAL USE OF SEMIPARAMETRIC MODELS IN FIELD TRIALS 63

-1.0

-0.5

0.0

0.5

1

0 10 20 30 40 50

2 3

0 10 20 30 40 50

4

5 6 7

-1.0

-0.5

0.0

0.5

8

-1.0

-0.5

0.0

0.5

9 10

0 10 20 30 40 50

BED

resi

dual

-0.6

-0.4

-0.2

0.0

0.2

0.4

0.6

1

0 10 20 30 40 50

2 3

0 10 20 30 40 50

4

5 6 7

-0.6

-0.4

-0.2

0.0

0.2

0.4

0.6

8-0.6

-0.4

-0.2

0.0

0.2

0.4

0.6

9 10

0 10 20 30 40 50

BED

resi

dual

Figure 8. Top: residuals after �tting AR1 £ AR1 model; bottom: residuals after �tting two-dimensional loessmodel.

Page 17: The practical use of semiparametric models in field trials

64 M. DURBAN ET AL.

Table 6. Approximate F Tests for Treatment Effects for Three Mixed Models Fitted to Trial B

Term Model 1 Model 2 Model 3

Fungicide (F) 40.271 35.950 97.114Cultivar (C) 7.200 15.979 15.987F.C 0.933 1.300 1.285

low replication, such as early stage variety trials. We have presented graphical methods fordetecting spatial trends and different criteria to select the span of smooth terms. Based onthese and further trials (Durban 1998), we �nd that the modi�ed AICC criterion of Hurvichet al. (1998) chose larger spans than three other criteria, but that the smaller spans selectedby these criteria did not give a signi�cantly better �t when the models were compared by abootstrapped F test. These conclusions support those of Hurvich et al. (1998).

The calculation of treatment effects and standard errors for a semiparametric modelwith a single smooth term is well established (Hastie and Tibshirani 1990, p. 118), but herewe present the correspondingresults for two smooth terms, as may be required for �eld trialswith rectangular layouts. It should be noted that some statistical software separates smoothterms into linear and nonlinear parts, and ignores the nonlinear part in the calculation oftreatment standard errors. This can result in rather optimistic standard errors (Durban et al.1999).

For Trial A, the best-�tting semiparametric model was found to be the sum of twoone-dimensional smooth terms. We have also analyzed the trial by the ARIMA approachof Cullis and Gleeson (1991), and by the mixed model approach of Verbyla et al. (1999).In the latter article, cubic smoothing splines are reformulated as mixed models and �ttedusing residual maximum likelihood.This powerful technique can include correlated errorson neighboring plots. From the analysis of Trial A we conclude that it may not be easy todecide the true nature of the underlying process, since more than one process may lead todata with common features, for example, similar variograms. Further reasearch is neededto �nd tools to separate smooth trend from short-term correlation. However, our experienceleads us to believethat trials such as Trial B, requiringa two-dimensionalsmooth surface, aremore common in practice. At present no mixed model formulation is available for the loesssmoothers used here. There is scope for more research on these, and on two-dimensionalsplines such as the thin plate smoothing spline (Wahba 1990; Green and Silverman 1994)or the bivariate tensor-product spline (Koo and Lee 1994). A practical advantage of oursemiparametric models is the visualization of the trend which can be related to physicalfeatures in the �eld.

For Trial B, we proposed an intuitive analysis of a split-plot design in the presence of aspatial trend. The model is proposed as an extension of the split-plot analysis of covariancediscussed by Federer and Meredith (1992). Alternative approaches to a split-plotexperimentcould be to seek a parametric equation to reproduce the trend or to use the approach ofVerbyla et al. (1999) if the trends are one-dimensional.

Page 18: The practical use of semiparametric models in field trials

PRACTICAL USE OF SEMIPARAMETRIC MODELS IN FIELD TRIALS 65

ACKNOWLEDGMENTSThe work of M. Durban, C. A. Hackett, A.C. Newton, W.T.B Thomas, and J.W. McNicol is supported by

the Scottish Executive Rural Affairs Department. We thank Brian Cullis for the use of the Spatial Analysis andMixed Models software, and for useful discussions.

[Received January 2000. Revised December 2001.]

REFERENCES

Akaike, H. (1973), “Information Theory and an Extension of the Maximum Likelihood Principle,” in SecondInternational Symposium on Information Theory, eds. B. N. Petrov and F. Csaki, Budapest: AkademiaKiado, pp. 267–281.

Buja, A., Hastie, T., and Tibshirani, R. (1989), “Linear Smoothers and Additive Models” (with discussion), TheAnnals of Statistics, 17, 453–555.

Butler, D. G., Gilmour, A. R., and Cullis, B. R. (1999), “SAMM: An S-PLUS Module for Mixed Models usingREML,” in International S-PLUS User Conference, New Orleans, USA.

Cleveland, W. S. (1979), “Robust Locally-Weighted Regression and Smoothing Scatterplots,” Journal of theAmerican Statistical Association, 74, 829–836.

(1993), Visualizing Data, New Jersey: Hobart Press.

Cleveland, W. S., and Devlin, S. J. (1988), “Locally-Weighted Regression: An Approach to Regression Analysisby Local Fitting,” Journal of the American Statistical Association, 83, 596–610.

Craven, P., and Wahba, G. (1979), “Smoothing Noisy Data with Spline Functions,” Numerische Mathematik, 31,377–403.

Cullis, B. R., andGleeson,A. C. (1991),“SpatialAnalysisofFieldExperiments - an Extension toTwoDimensions,”Biometrics, 47, 1449–1460.

Durban, M. L. (1998), “Modelling Spatial Trends and Local Competition Effects Using Semiparametric AdditiveModels,” unpublished Ph.D. thesis, Heriot-Watt University, Dept. of Actuarial Mathematics and Statistics.

Durban, M., Hackett, C. A., and Currie, I. D. (1999), “Approximate Standard Errors in SemiParametric Models,”Biometrics, 55, 699–703.

Edmondson, R. N. (1993), “Systematic Row-and-Column Designs Balanced for Low-Order Polynomial Interac-tions Between Rows and Columns,” Journal of the Royal Statistical Society, Series B, 55, 707–723.

Federer, W. T., and Meredith, M. P. (1992), “Covariance Analysis for Split-Plot and Split-Block Designs,” TheAmerican Statistician, 46, 155–162.

Gilmour, A. R., Cullis, B. R., and Verbyla, A. P. (1997), “Accounting for Natural and Extraneous Variation inthe Analysis of Field Experiments,” Journal of Agricultural, Biological and Environmental Statistics, 2,269–293.

Green, P., Jennison, C., and Seheult, A. (1985), “Analysis of Field Experiments by Least Squares Smoothing,”Journal of the Royal Statistical Society, Series B, 47, 299–315.

Green, P. J., and Silverman,B. W. (1994),NonparametricRegression andGeneralized LinearModels:A RoughnessPenalty Approach, London: Chapman & Hall.

Hackett, C. A., Reglinski, T., and Newton, A. C. (1995), “Use of Additive Models to Represent Trends in a BarleyField Trial,” Annals of Applied Biology, 127, 391–403.

Hastie, T. J., and Tibshirani, R. J. (1986), “Generalized Additive Models,” Statistical Science, 1, 297–318.

(1987), “Generalized Additive Models: Some Applications,” Journal of the American Statistical Associ-ation, 82, 371–386.

Page 19: The practical use of semiparametric models in field trials

66 M. DURBAN ET AL.

(1990), Generalized Additive Models, London: Chapman & Hall.

Hurvich, C. M., Simonoff, J. S., and Tsai, C.-L. (1998), “Smoothing Parameter Selection in NonparametricRegression Using An Improved Akaike Information Criterion,” Journal of the Royal Statistical Society,Series B, 60, 271–293.

Koo, J. Y., andLee, Y. (1994),“BivariateB-Splines in Generalized Linear Models,”Journalof StatisticalSimulationand Computation, 50, 119–129.

Newton, A. C., Ellis, R. P., Hackett, C. A., and Guy, D. C. (1997), “The Effect of Component Number onRhynchosporium secalis Infection and Yield in Mixtures of Winter Barley Cultivars,” Plant Pathology, 45,930–938.

Papadakis, J. S. (1937),“Methodestatistiquepourdes experiences sur champ,” Bulletinde l’Institut d’Ameliorationdes Plantes a Salonique, 23.

Speckman, P. (1988),“Kernel Smoothing in Partial Linear Models,” Journal of the Royal Statistical Society, SeriesB, 50, 413–436.

Stone, M. (1974), “Cross-Validatory Choice and Assessment of Statistical Predictions” (with discussion), Journalof the Royal Statistical Society, Series B, 36, 111–147.

Verbyla, A. P., Cullis, B. R., Kenward, M. G. and Welham, S. J. (1999), “The Analysis of Designed Experimentsand Longitudinal Data by using Smoothing Splines” (with discussion), Applied Statistics, 48, 269–311.

Wahba, G. (1990), Spline Models for Observational Data, Philadelpia: SIAM.

Welham, S. J., and Thompson,R. (1997),“LikelihoodRatio Tests forFixed ModelTerms Using ResidualMaximumLikelihood,” Journal of the Royal Statistical Society, Series B, 59, 701–714.

Williams, E. R., and John, J. A. (1996), “Row-Column Factorial Designs for Use in Agricultural Field Trials,”Applied Statistics, 45, 39–46.

Wilkinson, G. N., Eckert, S. R., Hancock, T. W., and Mayo, O. (1983), “Nearest Neighbour (NN) Analysis ofField Experiments” (with discussion), Journal of the Royal Statistical Society, Series B, 45, 151–211.