CHANGE DETECTION
IN AUTOCORRELATED PROCESSES
Jiangbin Yang
A t hesis submit ted in conformity wi th the requirements for the degree of Doctor of Philosophy
Graduate Department of Mechanical and Industrial Engineering University of Toronto
@Copyright by Jiangbin Yang 1999
National Library Bibliothèque nationale du Canada
Acquisitions and Acquisitions et Bibliographie Services services bibliographiques 395 Wellington Street 395. nre Wellington ûttawaON K 1 A W OnawaON K1AON4 Canade canada
The author bas granted a non- exclusive licence allowing the National Library of Canada to reproduce, loan, distribute or seil copies of this thesis in rnicrofom, paper or electronic formats.
The author retains ownership of the copyright in this thesis. Neither the thesis nor substantial extracts fiom it may be printed or othenvise reproduced without the author's permission.
L'auteur a accordé une licence non exclusive permettant à la Bibliothèque nationale du Canada de reproduire, prêter, distribuer ou vendre des copies de cette thèse sous la forme de microfiche/nlm, de reproduction sur papier ou sur fomt électronique.
L'auteur conserve la propriété du droit d'auteur qui protège cette thèse. Ni la thèse ni des extraits substantiels de celle-ci ne doivent être imprimés ou autrement reproduits sans son autorisation.
Change Detection in Autocorrelated Processes
Jiangbin Yang, Ph.D.
Department of Mechanical and Industrial Engineering
University of Toronto, 1999
Abstract
The problem of change detection is about quick detection ~f a change in a dynamic system or
process at a low rate of false alarm by sequentially observing the system or process. It has important
applications in quality control, signal processing and other areas. This thesis studies the problem in
the context of autocorrelated processes.
First, we systematically investigate the properties of process residuals (one-step ahead forecast
errors) for change detection. We show t hat process resid uals are statistically sufficient for the pro blem
of change detection, and change detection can be done by using process residuals. We show that
process residuals are mutually uncorrelated with zero means when there is no change to the process,
that is, when the process is in-control. We develop a general procedure for specifically deriving
the forms of process residuals. Using the procedure, we derive the forms of residuals of general
autoregressive integrated moving average (ARIMA) processes and state space models, and obtain
some specific properties of the residuals under several situations. Under the Gaussian assumption,
for an ARIMA process or a steady-state state space mode1 subject to a change in process mean
level, the residuals are i.i.d. with zero means before the occurrence of change. After the occurrence
of change, the residuals are still mutudy independent with the same variance as before, but with
tirne-varying and generally
process mean level, we find
nonzero means. For an autocorrelated process subject to a change in
that the properties of residuals are independent of the feedback control
applied to the process.
We then concentrate on detection of a change iii proceçs mean level in an autocorrelated process
by using the process residuals. Cumulative surn (CUSUM), exponentially weighted moving average
(EWMA) and Shewhart control chart procedures are applied to the residuals. For computation of the
average run lengths ( ARLs) of the cont rol chart procedures applied to the residuals whose means are
time-varying after change, we derive an explicit formula for Shewhart, establish integral equations for
CUSUM and EWMA, and develop efficient numerical procedures for solving the integral equations.
Under the ARL criterion, we numerically study the performance of the control chart procedures
applied to the residuals of au tocorrelated processes under several situations.
We study the likelihood ratio (LR) testing procedure applied to the process residuals. We extend
the classical LR testing procedure by replacing its constant threshold value with a time-varying
threshold sequence. We propose a combined CUSUM and Shewhart control chart procedure to
approximate the extended LR testing procedure. We develop numericai procedures based on integral
equations for cornputation of the ARLs of these change detection procedures, and numerically study
t heir performance.
We study the problern of optimal sequential testing on process mean levels. The problern is a
special situation of change detection, an extension of the Wald's sequential probability ratio testing
(SPRT) problern, and has wide application in signal processing and other areas. We formulate the
problern as an optimal stopping problem, derive the optimal stopping rule, obtain sorne important
properties of the optimal stopping boundaries, develop a numerical procedure for computation of the
optimal stop ping boundaries, and present a numerical example.
To rny father, Jin Yang
who is so wam-hearted and diligent
and
to whom 1 feel deeply indebted
Acknowledgment s
1 thank rny supervisor, Prof. Viliam Makis, who has offered me valuable guidance and assistance
during the course of my Ph.D. study.
1 thank Mr. Youhua Chen for his friendly suggestion made to me in Spring, 1994, which inspired
me to pursue my Ph.D. in the engineering field.
I thank the members of my departmental oral examination cornmittee, Prof. Viliam Makis
(Supervisor and Chair), Prof. Andrew K. S. Jardine, Prof. Su bbarayan Pasupat hy (Department of
Electrical and Computer Engineering), Prof. Morton J. M. Posner, and Prof. Jim G. C. Templeton,
for their valuable comments and feedback on the manuscript of my thesis.
1 thank the external appraiser, Prof. Hoang Pham of the Department of Industrial Engineering,
Rutgers University, br his valuable appraisal on my thesis.
1 acknowledge the Connaught Scholarships offered to me in 1994,1995 and 1996 by the University
of Toronto. I acknowledge the Ontario Graduate Scholarships offered to me in 1996, 1997 and
1998. 1 acknowledge the continuous financial support since January, 1995 from the Condition Based
Maintenance (CBM) Consortium a t the University of Toronto led by the principal investigators Prof.
Andrew K. S. Jardine and Prof. Viliam Makis. I acknowledge the financial support from Prof. Viliam
Makis' NSERC fund.
I thank Ms. Jayne Beardsmore, an efficient administrator of the CBM Consortium, who has been
very helpful and kind. 1 thank Mr. Oscar del Rio, a knowledgeable Computing Adrninistrator of our
department, who h a s offered me much prompt and effective help. 1 thank Ms. Brenda Fung and the
graduate office who have efficiently helped me organize the final stage of my Ph.D. study.
I thank my wife, Hongmei Cao, who has accompanied me for many days and nights of hard
work, who has understood and appreciated me, who has shared many things with me, and who has
supported and encouraged me.
Contents
Abstract
Acknowledgments
1 Introduction
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.1 The Problem
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.2 Two Production Processes
. . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.2.1 A Machining Process
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.22 A Cutting Process
. . . . . . . . . . . . . . . . . . . . 1.3 Problem Formulation of Change Detection
2 Residuals for Mode1 Transformation 10
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.1 Introduction 11
. . . . . . . . . . . . . . . . . . 2.2 Sufficiency of Residuals for Change Detection 13
. . . . . . . . . . . . . . . . . . . . . . . 2.3 Second-Order Properties of Residuals 15
. . . . . . . . . . . . . . . . . . . . 2.4 Procedure for Deriving Forms of Residuals 17
. . . . . . . . . . . . . . . . . . . 2.5 Properties of Residuals of ARIMA Processes 18
. . . . . . . . . . . . . . . . . . 2.5.1 Processes Subject to Additive Changes 18
2.5.2 Processes Subject to Nonadditive Changes . . . . . . . . . . . . . . . .
2.6 Properties of Residuals of State Space Models . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . 2.6.1 Residuals of a General Model
. . . . . . . . . . . . . . . . . . . . 2.6.2 Residuals of the Machining Process
2.6.3 Residuais of the Steady-State Machining Process . . . . . . . . . . . .
2.7 Independence of Residuals of Feedback Control . . . . . . . . . . . . . . . . .
2.8 Summary and the Mode1 Transformation . . . . . . . . . . . . . . . . . . . . .
3 Performance of Control Charts Applied to Residuals
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.1 Introduction
3.2 Integral Equations for Computation of the ARLs . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . 3.2.1 Shewhart Control Chart Procedure
. . . . . . . . . . . . . . . . . . . . . 3.2.2 CUSUM Control Chart Procedure
3.3.3 EWMA Control Chart Procedure . . . . . . . . . . . . . . . . . . . . .
3.3 Numerical Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.3.1 Results on the Machining Process Subject to a Step Shift . . . . . . . .
3.3.2 Results on the Machining Process Subject to a Drift Rate Change . . .
3.3.3 Results on the Cutting Process . . . . . . . . . . . . . . . . . . . . . .
3.3.4 Summary of the Performance of Control Charts . . . . . . . . . . . . .
4 Sequential Testing on Process Mean Levels 56
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
4.2 Formulation as an Optimal Stopping Problem . . . . . . . . . . . . . . . 60
. . . . . . . . . . . . . . . . . . . . . 4.3 Derivation of the Optimal Stopping Rule 63
. . . . . . . . . . . . . . . . . 4.4 Properties of the Optimal S topping Boundaries 71
. . . . . . . . . . . . . . . . 4.5 Computation of the Optimal Stopping Boundaries 76
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.6 NumericalResults 78
5 Likelihood Ratio Test ing Procedures
Introduction . . . * . . . . . . . . . . . . . . . . . . . . . . . O
The Structure of LR Testing Procedures . . . . . . . . . . . .
Extended LR Testing Procedures . . . . . . . . . . . . . . . .
A Combined CUSUM and Shewhart Control Chart Procedure
Performance Analysis . . . . . . . . . . . . . . . . . . . . . . .
5.5.1 The LR Testing Procedure Applied to AR(1) Processes
. . . . . 5.5.2 The Extended LR Procedure Applied to I MA(1. 1) Processes
. . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.5.3 The CCS Procedure
. . . . . . . . . . . . . . . . . . . . . . Summary of the Performance Analyses
6 Conclusion and Future Research
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.1 Contributions 101
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.2 Future Research 103
Bibliography
vii
List of Tables
3.1 Optimum ARLs and parameters of the CUSUM, EWMA and Shewhart control
chart procedures applied to the residuals of the machining process with w =
0.553 subject to an additive step shift of size A in its state equation when their
in-control ARLs are 100. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
3.2 Optimum ARLs and parameters of the CUSUM, EWMA and Shewhart control
chart procedures applied to the residuals of the machining process with w =
0.553 subject to a drift rate change of c' in its state equation when their in-
control ARLs are 300. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
3.3 Optimum ARLs and parameters of the CUSUM, EWMA and Shewhart control
chart procedures applied to the residuals of the AR(2) cutting process subject
to an observational additive step shift of size A when their in-control ARLs are
100. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . * 53
4.1 The optimal stopping boundaries when b ( i ) = 0.9' for i 2 0. . . . . . . . . . . 78
5.1 ARLs of the LR testing procedure and the CUSUM and EWMA control chart
procedures applied to the residuals of an AR(1) process with autoregressive
coefficient al = 0.45 subject to an observational additive step shift of size A
when their in-control ARLs axe 100. ARLs are optimized at A = 1 in the upper
panel, and at A = 3 in the lower panel. . . . . . . . . . . . . . . . . . . . . . 90
5.2 ARLs of the extended LR (XLR) testing procedure in comparison with the
classical LR testing procedure and the CUSUM and EWMA control chart pro-
cedures applied to the residuals of an IMA(1,l) process with rnoving average
coefficient bl = 0.1 subject to an observational additive step shift of size A
when their in-control ARLs are 100. ARLs are optimized at A = 1 in the
. . . . . . . . . . . . . . . . . . upper panel, and at A = 3 in the lower panel. 94
5.3 ARLs of a CCS procedure in comparison with the LR testing procedure and the
CUSUM control chart procedure applied to the residuals of an AR(1) process
wi th autoregressive coefficient al = 0.45 subject to a n observational additive
. . . . . . . . . . . . step shift of size A when their in-control ARLs are 100. 97
List of Figures
3.1 Optimum ARLs versus sizes of change (upper panel) and the ARL curves
with the chart parameters optimizing the ARLs a t A = 2 (lower paael) of
the CUSUM, EWMA and Shewhart control chart procedures applied to the
residuals of the machining process with w = 0.553 subject to an additive step
shift of size A in its state equation, when the fixed in-control ARL is 100. . . 48
3.2 Optimum ARLs versus sizes of change (upper panel) and the ARL curves
with the chart parameters optimizing the ARLs at d = 2 (lower panel) of the
CUSUM, EWMA and Shewhart control chart procedures applied to the resid-
uals of the machining process with w = 0.553 subject to a drift rate change of
d in its state equation, when the fixed in-control ARL is 300. . . . . . . . . . 51
3.3 Optimum ARLs versus sizes of change (upper panel) and the ARL curves
with the chart parameters optimizing the ARLs at A = 2 (lower panel) of
the CUSUM, EWMA and Shewhart control chart procedures applied to the
residuals of the AR(2) cut ting process subject to an observational additive step
shift of size A, when the fixed in-control ARL is 100. . . . . . . . . . . . . . . 54
4.1 A typical illustration of g,(A) and the related functions and quantities. . . . . 70
4.2 The optimal stopping boundaxies of the example. . . . . . . . . . . . . . . . . 78
5.1 ARLs of the LR testing procedure and the CUSUM and EWMA control chart
procedures applied to the residuals of an AR(1) process with autoregressive
coefficient al = 0.45 subject to an observational additive step shift of size A.
For the fixed in-control ARL, the chosen chart parameters optimize the ARL
at A = 1 in the upper panel and at A = 3 in the lower panel. . . . . . . . . . 91
5.2 ARLs of the extended LR (XLR) testing procedure in comparison with the
classical LR testing procedure and the CUSUM and EWMA control chart pro-
cedures applied to the residuals of an I MA(1,l) process with moving average
coefficient bl = 0.1 subject to an observational additive step shift of size A.
For t h e fixed in-control ARL, the chosen chart parameters optimize the ARL
at A = 1 in the upper panel and at A = 3 in the lower panel. . . . . . . . . . 95
5.3 ARLs of a CCS detection procedure in comparison with the classicai LR testing
procedure and the CUSUM and EWMA control chart procedures applied to the
residuals of an AR(1) process with autoregressive coefficient al = 0.45 subject
to an observational additive step shift of size A. . . . . . . . . . . . . . . . . 98
Chapter 1
Introduction
1.1 The Problem
The problem of change detection is about quick detection of a change in a dynamic system
or process at a low rate of false alarm by sequentially observing the system or process. The
instant of change is unknown, and randomness is usudly involved in the system. The problem
arises naturally in the areas of quality control, signal processing, automatic control and othen.
Specific examples of applications, such as statistical process control, navigation system moni-
toring, seisrnic data processing, signal segmentation, and vibration monitoring of mechanical
systems, can be found in Basseville and Nikiforov (1993).
Autocorrelation may exist in the process or the sequential observations of the system.
In quality control, more than 70% of the studied processes subject to change detection are
autocorrelated, according to Alwan and Roberts (1995). Because of more complexity, more
effort is involved in designing change detection procedures for autocorrelated processes than
for independent and identically distributed (i.i.d.) processes, and neglect of the presence of
autocorrelation may seriously reduce the effectiveness of the designed detection procedures.
The impact of autocorrelation on the performance of control charts directly applied to the
output of an autocorrelated process has been reported in the literature, for example, by Alwan
(1992). Therefore, care must be taken to deal with autocorrelation properly.
In production and other areas, an autocorrelated process may be controlled in order to
compensate for expected deviations of the process from its taxget value. This will reduce the
systematic variability of the process and improve the process capability in quality control.
As a result , bot h the quality and productivity levels will be increased. A controlled process
should be rnonitored in order to detect changes to the process. A special-cause change can
disturb the process and invalidate the process mode1 that has been used to develop the control
policy. If it is not detected, the control policy will become ineffective. The complementary
roles of process control and process monitoring have been discussed by MacGregor (1988)
and Box and Krarner (1992). Integration of process control and process monitoring has been
proposed by Tucker, Faltin and Vander Wiel (1993) and Montgomery, Keats, Runger and
Messina (1994). Usually, for a controlled autocorrelated process, autocorrelation still exists
in the closed-loop process output.
Since Wald (1947) founded sequential analysis and Page (1954) proposed the CUSUM
procedure, substantial research has been done for detection of step shifts in process mean
levels in i.i.d. processes and their continuous analogue - drift rate changes in Wiener processes.
Works including Shiryaev (1961), Shiryaev (1963), Shiryaev (1978), Lorden (1971), Lai (1995),
and Siegmund and Venkatraman (1995) have made contributions to the related problems. The
optimality of the likelihood ratio CUSUM procedure under a well-defined rninimax criterion
for detection of a step shift in process mean level in an i.i.d. process was first proven by
Moustakides (1986) by using optimal stopping theory, and later by Ritov (1990) by using
saddle point decision theory. Similar optimality results have also been obtained by Beibel
(1996) and Yakir (1997). The situation where an i.i.d. process is subject to a step shift in
process mean level has been referred to as the i.i.d. case in the literature, because under this
situation, the process is i.i.d. both before and after change. Moustakides (1998) extended
the optimality condition, that is, the process is i.i.d. both before and after change, of the
likelihood ratio CUSUM procedure to a more general condition, where only the iikelihood
ratio sequence is required to be i.i.d. both before and after change. However, as stated by
Moustakides, the generalized condition is very restrictive, and the extension for autocorrelated
processes is very limited.
Since 19709, change detection in autocorrelated processes has received growing attention.
Kligene and Telknis (1983) surveyed and analyzed the works published before 1980. Bas-
seville (1988) surveyed and described the then state-of-theart statisticd methods for detect-
ing changes in signals and systems. Basseville and Nikiforov (1993) presented the existing
met hodology comprehensively, and included the relevant references. Lai (l995), wit h discus-
sions, surveyed a large variety of sequential change detection procedures widely scattered in
quality control, fault detection and signal processing literature. Since change detection in au-
tocorrelated processes is more complicated than in the i.i.d. case, there are more chalienging
problems. And less attention has been drawn to change detection in controlled processes.
This t hesis studies some problems on change detect ion in autocorrelated processes. Specif-
ically, this thesis systematically investigates the properties of process residuals and by using
residuals, achieve a mode1 transformation of change detection in autocorrelated processes ei-
ther controlled or not, establish a set of procedures for accurately and precisely computing
the ARLs of control charts and studying their performance, study a situation of change detec-
tion in autocorrelated processes - optimal sequential testing on process mean levels, analyze
and extend a widely used change detection procedure - the LR testing procedure, propose
a combined CUSUM and Shewhart scheme for change detection in autocorrelated processes,
and study the performance of these change detection procedures.
In the following sections, we will introduce two production processes that have motivated
this work, and fomulate the problem of change detection in a general framework.
1.2 Two Production Processes
1.2.1 A Machining Process
An application that rnotivated this work was a machining process introduced in Crowder
(1992), which is described as follows. A cornputer-controlled bar lathe forms cylindrical metal
pieces from a metal bar in a turning operation by cutting away excess metal. The process
is subject to tool Wear and stochastic shocks resulting from small differences in the ban,
environmental condition changes, etc.. The outside diameters of metal cylinders are measured
by an automatic gauging system w hich contri butes moderate measurement errors.
The process can be represented by a controlled random walk model with linear drift and
measurement errors as follows. For t = 1,2, *,
where Ot is the outside diameter value of t he part at time t, is its measurement, cl - N(O, 0:)
is the measurement error, ut - N ( 0 , 0:) is the stochastic shock, c is the linear drift rate, which
may well approximate the effect of tool Wear during a time period, and ut-1 is the amount
of feedback control deterrnined by {K, i t - 1). We assume that O. - N(&, q i ) is the
initial setting, {a,} and {ut) are two independent sequences of i.i.d. random variables. Vander
Wiel (1996 b) studied such a machining process with target outside diameter value of 15010
units (1 unit = 0.0001 inch), and obtained the following maximum likelihood estimates of the
process parameters:
The machining process modeled by ( 1.1) is representative of nonstationaxy production
processes subject to tool Wear and rneasurement errors. Its optimal control has been studied
by several authors. Crowder (1992) studied optimal control of the process when no drift was
present, Jensen and Vardernan (1993) developed an optimal control policy for the process with
additional adjustment errors, and Vander Wiel(1996 b) found an optimal control policy under
the assurnption that the lathe can automatically compensate for the deterministic tool W e a r
wit h negligible cost. The process mode1 belong to the class of state-space models, which have
been extensively examined in the process control literature such as Astrom and Wittenmark
(1997).
1.2.2 A Cutting Process
A second application that further stimulated this work was a metal cutting process studied by
Allen and Huang (1994). During rough turning operations of a tool, the vertical cutting force
experienced by the tool was rneasured by two strain gauges mounted on the top and bottom
faces of t h e tool. The force measurements were digitized by a analogue-to-digital converter,
and input to a computer numerical controller. The controller then manipulated the feed rate
in order to maintain the cutting force at a desired level.
The process was modeled by an autoregressive process of order 2 (AR(2) process) under
feedback control as follows.
where & represents the deviation of the vertical cutting force from its target level at time t ,
ut-1 is the control input (feed rate) that is deterrnined by {k;., i 5 t - 11, and et is the noise
input. It's assumed that { e t ) is a sequence of i.i.d. normal random variables with mean O
6
and variance u:. Allen and Huang (1994) obtained the following least square estimates of the
process paramet ers:
al = 0.45, a2 = 0.2,
b l = O . l , bz=0 .04 ,
and derived a so-called general minimum variance control policy.
The mode1 ( 1.3) is a special case of ARIMA processes under feedback control. With their
powerful capability of modeling, ARIMA processes have been widely applied in the areas of
automatic control, quality control and others, which can be seen in the literature such as Box,
Jenkins and Reinsel (1994).
In the turning operations, the cutting conditions such as depth of cutting and cutting
speed are usually changeable. These changes may cause the cutting force to largely deviate
from its target. Special-cause changes occur to the machining process in 5 1.2.1 as well, which
rnay result from abnormal status of the hydraulic system or from changes in the cutting tool
conditions. Two approaches can be employed to deal with the changes, which are adaptive
control and change detection. By employing change detection, we can detect and eliminate
the root cause of change if possible. Or, we can detect the change, and then adjust the control
policy when it is necessary. Thus, we can leave the system more stable.
1.3 Problem Formulation of Change Detection
The change detection problern can be formulated in a general statistical inference framework.
Let {x) be a stochastic process having a parametric model !P that is subject to a change at
an unknown t ime point r. The probability distribution functions of {x) belong to {Fq(,), r E
O), where R = {1,2, , oo). r is called the change point, and we want to have it detected.
The parameter value of \Ii may be a vector, and rnay be a function of time t. V(r) denotes
how the process model parameter changes. If r = 00, then no change occurs to the process,
and ik = &. If r < oo, then the change occurs to the process,
.s>, for t < r,
$i for t 3 r.
That is, the process model parameter jumps to q!q from at t = r , and then remains tl>l.
When iU = the process is said to be in-control, and when = di7 the process is said to
We assume that we start observing the process at time t = 1. Let yt be the observation
of Y f , and = (yi , , yt) be the observations obtained up to t 2 1. Let ( 0 - , y-,, y,) denote
the initial condition of the process prior to t = 1, assumed to be given. Let Y' = (- , y,-1, y,).
Based on y: (or equivalently y'), detection of the change point r is statistical inference on
r , and the detection or inference problem is the following sequential testing problem. At time
t = 1,3, O , test the hypotheses
based on the observations y: (or equivalently y'). We call these two hypotheses the change
point hypotheses. Obse r~ t ion is continued and the test is camied out until a decision in favor
8
of Hl is made for the fint time. When Hi is decided at stage t , occurrence of change is
alarmed, and the observation and test are stopped.
For t = 1,2, *, let Ft be the minimum O-dgebra induced by {K, i = 1,2, *#., t). Then,
the change detection procedure or equivalently, the sequential testing procedure is a stopping
nile (or a stoppiiig time) N 2 1 with respect to {Ft, t = 1,2, . 8 . ) .
Chapter 2
Residuals for Mode1 Transformation
2.1 Introduction
When autocorrelation became prominent in the 19709, concern was raised about the perfor-
mance of control charts directly applied to autocorrelated processes. As observed and studied
by Johnson and Bagshaw (1974), Bagshaw and Johnson (1975), Vasilopoulos and Stamboulis
(1978), Alwan (1992) and Wardell, Moskowitz and Plante (1992), control charts directly a p
plied to autocorrelated processes are seriously misleading. They either result in too many
false alarms or c m fail to alarm when a change really occurs.
To circumvent autocorrelation, residuals have been used for change detection in autocor-
related processes, since residuals are mutually independent under certain circumstances. An
early work on monitoring residuals of autocorrelated processes was Berthouex, Hunter and
Pallesen (1978). Since it was formally proposed by Alwan and Roberts (1988), applying con-
trol charts to residuals for change detection in autocorrelated processes has been a popular
practice in the quality control area. In some special cases, some properties of residuals have
appeared or been studied in the literature. Harris and Ross (1991) studied the properties of
residuals of an integrated moving average process of order (1,l) (IMA(1,l) process) and an
autoregressive process of order 1 (AR(1) process) subject to step shifts in the process mean
levels. Wardell, Moskowitz and Plante (1992) studied how residuals of an autoregessive mov-
ing average process of order (1,l) respond to a step shift in process mean level by assuming
the process is deterministic. Wardell, Moskowitz and Plante (1994) derived the forms of resid-
uals of an autoregressive moving average process of order (p, q) process subject to a change
in process mean level by using Ztransform. Basseville and Nikiforov (1993) studied some
properties of residuals, and utilized the convenience of residuals in designing change detection
procedures when the residuals were mu tudy independent.
Although residuals have been used for change detection in practice and some properties of
residuals have been around in the literature, the usage of residuals has been ad hoc, properties
of residuals have not been systematically investigated and especidy, the sufficiency of resid-
uals for change detection has been overlooked. In this chapter, we are going to systematically
investigate the properties of residuals for change detection, show the sufficiency of residuals for
change detection, develop a general procedure for derivation of forms of residuals of autocor-
related processes, and show the independence of feedback control of residuals. The sufficiency
of residuals will provide us a theoretical background for the use of residuals, knowledge of the
properties of residuals will enable us to design better change detection procedures and monitor
residuals more effectively, and the independence of residuals of feedback control will turn out
to be a convenient bridge for the combination of process control and change detection.
For the stochastic process {K) defined in 3 1.3, let
for t = 1,2, - -. We cal1 g(y t - l ) the forecast function. Obviously, g(yt- ' ) is independent of
the unknown value of r. The process residual at t = 1,2, - is defined as
{el, ez, - -) forms a stochastic process. We will also use et denote its realization. That is,
Let ei = (el, - - , et) . For convenience, we also let et denote e:.
2.2 Sufficiency of Residuals for Change Detection
We first study the sufficiency of process residuals for change detection. We have noted that
the initial condition is always given. For t = 1,2,--,
define a one-to-one transformation G from the observations y: to the residuals e:, that is,
e l = G(y:). e: is a statistic. For any given ei, there is one and only one yf = G-'(e:). That
is, the conditional distribution of y: given e: is
1, if yf = G-'(e:),
0, otherwise,
which is independent of the unknown parameter r. Therefore, by the definition of sufficiency,
at any time t 2 1, e: = (el , ---, et) is a sufficient statistic on {Fe(,), r E O), or equivaleotly
a sufficient statistic on the unknown parameter r. That is, al1 the information about the
unknown parameter r contained in y: is also fully preserved in e:. Hence, the residuals are
sufficient for the change detection problem.
One can design change detection procedures based on the observations directly. However,
due to the one-to-one transformation G-', any change detection procedure based on the
observations con be transformed to a change detection procedure based on the residuals, and
vice versa. And a detection procedure based on the residuds will have exactly the same
detection propert ies as its counterpart based on the observations.
Because residuals are suficient, and equivalent to observations for a change detection prob-
13
lem, when residuals have nicer properties than observations, we can concentrate on residuals
in designing and searching for effective change detection procedures. As we will see, residuals
can have much nicer properties, such as mutual independence, than observations, which will
provide us great convenience in designing and analyzing change detection procedures.
We note that the sufficiency of residuals always holds for the change detection problem,
no matter what the process mode1 is, if a change occurs, or how the change occurs. More on
the sufficiency of residuals for change detection will be discussed in 5 2.3 and 5 2.5.2.
2.3 Second-Order Properties of Residuals
We now study the properties of residuals further. Because of the definition of residuals ( 2. l),
&=,(et lyt-')
= &=,(Y, - g(yt-l) 1 y'-')
= Er=,(Y,Iyt-') - g(yt-')
= o. Then, by using total probability,
for any t 3 1. Therefore, when there is no change, the residuals are mutually uncorrelated,
and their means are zero. For a Gaussian process, the mutual uncorrelatedness implies that
the residuals are mutually independent when there is no change.
One may now be interested in looking at the distributions of residuds. We consider this in
the case of continuous variables. Let f*(,) be the probability density function corresponding
to FQ(,). That is, Kt has a joint density function f*,,)(yf). The Jacobian of G,
Therefore, the joint density function of e: = G(yf) is
f;<r>(e:) = f*(r,(G-l(4))7
and
That is, the likelihood of the residuals e: and the likelihood of the corresponding observations
y: are equal. This from another side confirms the sufficiency of residuals for change detection.
If the residuals are mutually independent, then
f d y : IyO)
= I L I f ~ ( r ) (yi Igi- l )
= II:=, f$crl(ei>v
and therefore, a LR testing procedure based on residuals will have much simpler form than
its counterpart based on observations.
2.4 Procedure for Deriving Forms of
In order to analyze the properties of the residuals { e t ) in detail,
they behave when a change @ ( r ) occurs to the process {K), we
Residuals
and especially to leam how
need to derive the forms of
{et). Two things must be kept in mind when we are deriving the forms. The f is t thing is
t hat the forecast function g(y t - ' ) used to compute the residuals is always defined under the
assumption that (x) is in control, that is, r = m, because we don't know the occurrence of
the change until it's detected, and before the detection, we make forecasts by assuming that
there is no change. The second thing is that {K) is subject to the change ck(r).
The forms of residuals may have to be specifically derived for a particular type of process
{ Y I ) . However, t here is a general procedure to follow, which is the following.
(i) Derive the forecast function g ( y t - ' ) , that is, the one step ahead forecast by assuming
r = 00.
(ii) Find the conditional distribution of the process output for given y'-' by taking into
account the effect of Q (r).
(iii) Find the relationship between the conditional expectation of Y, in (ii) and g(yt- ' ) in (i).
(iv) Subtract g(y t - ' ) from k; having the conditional distribution in (ii) by wing the rela-
t ionship of (iii). Then, the forms of residuals et = x - g(yt-' ) are ready.
2.5 Properties of Residuals of ARIMA Processes
Now, we can derive the forms of residuals of an ARIMA process.
Before going on, we need to distinguish between two kinds of changes in an autocorrelated
process, namely, additive changes and nonadditive changes. An additive change occurs in
the process mean level, and has an additive effect on the mean values of process observations.
Nonadditive changes are mode1 parameter changes, including changes in spectral characteristic
or system dynamics.
2.5.1 Processes Subject to Additive Changes
An ARIMA process of order (p, 1, q) ( A RZMA(p, 1, q) process) under feedback control subject
to a n additive change can be represented as
where by using time series notations as those in Box, Jenkins and Reinsel (1994), B is the
backward shift operator applied to the time index t, a ( B ) = 1 + a i B + = a + a,BP is the
stationary autoregressive operator, P(B) = 1 + blB + *O- - + 4 Bq is the invertible moving
average operator. 1 is a nomegative integer. The process is stationary when 1 = 0, and
nonstationary when I 2 1. { e t ) is the innovation sequence that is a sequence of white noise
input with mean zero and variance O:. u+i is the net effect of the feedback control that is
determined by y'-', and ut-1 z O represents that there is no control applied. r is the unknown
change point. A(-) is a detenninistic function, A(i) O for i < O and A(i) may be nonzero
for i 2 0. ( 2.2) can also be equivalently represented as
where
We cal1 A ( t -r) an observational additive change, and d(t -r) an innovational additive change.
In r e d world, these two types of additive changes may corne from different backgrounds.
However , t hey are mat hemat ically equivalent .
For the AR1 M A ( p , 1, q) process subject to the additive change A(t - r), we can derive the
foms of residuals by following the generd procedure in 5 2.4.
( 9
(ii) Given
(iii) The relationship between g( and E(G 1 y'-' ) is obvious.
(iv)
for t < r, et = (2.3)
- 1.1 + et, a(B)
for t > r.
Two resdts can be observed for the ARIMA process subject to the additive change. The
first is that the residuals { e t ) are the inverse filtering of the observations {k;) by removing
the term ut,
et = 4 B ) P - N B )
- ut-1)-
The second is that the residuals form an i.i.d. process subject to an additive change,
where 6(t - r) is the inverse filtering of A(t - r ) and 6(i) = O for i < O. Therefore, detection
of an additive change in an ARIMA process is equivalent to detection of an additive change
in an i.i.d. process.
2.5.2 Processes Subject to Nonadditive Changes
For an A R M A process, a mode1 parameter change from (a, ,û, 1, oc) to (a', P', l', oz) at change
point r , with the process mean value unchanged, is a nonadditive change to the ARIMA
process. Under such a change, we have the following development for the residuals.
(il
a ( B ) ( l - B)' a ( B ) ( l - B)' g(yt-l) = [l -
P ( W Iyt + P(B)
U t - 1
(ii) Given yt-',
a ( ~ ) ( i - ~ ) l I N + @ ( B I utml + et, for t < r, Y t =
where = et for t < r, and {ci, t 2 r ) is a sequence of white noise input with mean zero
and standard deviation O:.
(iii) The relationship between g(yt-l) and E ( x 1 yt-l) is obvious as well.
(iv)
Therefore, before the change, the residuals are an i.i.d. process. After the change, the residuals
either exhibit autocorrelation or have a changed variance. The possible autocorrelation arnong
residuals will not provide us the convenience for constructing change detection procedures as
the mutual independence does in the situation of additive changes. However, being autocor-
related or not does not afKect the sufficiency, and equivalence to observations, of residuals,
because the sufficiency holds for change detection problems in general.
Detecting an additive change in a normal i.i.d. process was defined as a basic problem in
Basseville and Nikiforov (1993). They established the sufficiency of likelihood ratios for change
detection. For change detection in AMMA processes subject to innovational additive changes,
they recognized the sufficiency of residuals because the likelihood ratios were functions of the
residuals. However, they did not recognize the prevalent sufficiency of residuals for change
detection in general. For detection of a nonadditive change in an ARIMA process, they
concluded wrongly that residuals were no longer sufficient for change detection just because
they could no longer be mutually independent.
One probably could be puzzled by the sufficiency of residuals for detection of an additive
change in a nonstationary ARIMA process, for example, an observational additive step change
in an I M A ( 1 , I ) process. Because of the differencing operation (1 - B)', 1 2 1 in ( 2.3) for
nonstationary A R M A processes, the residual mean will go to zero exponentially fast after
an obser~tional additive step change has occurred to the ARIMA process. This means that
the effect of the observational step change on the residuals will vanish. So, how could the
residuals be sufficient? The puzzle will not exist if one can realize that for a nonstationary
ARIMA process, the process observations wander around anywhere because of the integrating
operation (1 - B)-' in ( 2.2). Therefore, the effect of the observational step change on the
process observations themselves will be buried into the wandering, that is, will vanish as
well. The puzzle cas also be resolved by the fact that any change detection procedure based
on observations (or feedback control actions) c a n be transformed to a procedure based on
residuds with the sarne detection properties preserved, and vice versa.
2.6 Properties of Residuals of State Space Models
2.6.1 Residuals of a General Mode1
Consider a general state space model subject to an additive change A(t-r) in its measurement
equation, which is described as follows. For t = 1,2, --,
= HtOt + at + A(t - r),
Qt = @te,-i + ut-, + ut,
where at time t, et is the state of the dynamic system, Ot is the transition matrix, Ht is the
state observation transformation matrix, I; is the measurement of the transformed et, et is the
measurement error, ut is the noise input, ut-' is the amount of feedback control determined
by { e t ) and {ut) are independent normal white noise processes, et -* N(0, C,) and
ut .- N(0, C , ) . r 2 1 is the change point, A(*) is a deterministic function and A(i) = O for
i < O. O. - N(&, b) is initialized. Although it is not necessary to explicitly specify their
Following the general procedure in 3 2.4 and using Kalman filtering, we can derive the
forms of residuals of the model ( 2.4).
(9
where
and
(ii)
Y J y t - l - N( Ht [ @ t ~ ( O t - l ]y t - ' ) + ut-l] + A(t - r ) ,
H ~ [ O , C O V ( B ~ - ~ I Y ~ - ~ ) @ : + %]H: + cc).
(iii)
By mathematical induction, for t 2 r ,
where n::, r 1.
(iv) For t < r ,
For t 2 r ,
where C& 0.
By an approach similar to that in 5 2.3, we can prove that {et , t = 1,2,-} are mutudy
uncorrelated. Therefore, {et, t = 1,2, - - a) are m u t u d y independent because of the Gaussian
property.
Similady, we can derive the forms of residuals of a state space model subject to an additive
change in its state equation. We will illustrate this with the machining process in the next
section.
2.6.2 Residuals of the Machining Process
We have noted that the model of the machining process is a special case of state space models.
We now consider the machining process subject to an additive change A(t - r) in the state
equation, which is represented as follows. For t = 1 ,2 , - - O ,
yt = @t + et,
[Ot - A(t - r ) ] = [et-~ - A(t - 1 - r ) ] + ut-1 + c + ut,
or equivalent Iy,
Yt = Ot + et, (2.6)
et = et-1 + ut-i + c + ut + [A(t - P) - A(t - 1 - 7 - ) ] ,
where r 2 1 is a change point, A(*) is a deterministic function and A(i) i O for i < 0.
The change A(t - r ) represents an additive change to at in ( 2.5). Looking at ( S.6), we see
the differencing of A(t - r ) with respect to t, A(t - r ) -A(t - 1 - r ) , can also represent additive
changes in the drift rate c and in the mean level of the stochastic shocks {vt). Therefore, we
use A(t - r ) to represent the total effect of additive changes to &, which may result from
changes in tool conditions, met al bar quality and environmental conditions and others.
The function A(-) can represent various types of additive changes.
A(;) ii A for i > O
represents a step shift.
A(i) = (i + 1)c'for i 2 O
represent a drift rate change from c to c + d per part or an external ramping shift.
A(0) = L
A(i ) = O for i # O
represents a impulse shift.
We have mentioned that O. - N(&, qi) is initialized. For t = 1,2, -, let
By Kalman filtering,
Let
and 2
wt = qi-i + 02 q:-i + 0; + a,"
Then, following the general procedure in 5 2.4 and using Kalman fdtering, we can derive the
forms of residuals of the process represented by ( 2.6).
( ii)
(iii)
For t < r ,
E(@tlvt) = Er=,(O,lyt).
By mathematical induction, for t 2 r,
( iv) For t < r ,
For t 2 r ,
where n;=, 1.
Because of the one-to-one transformation G from y: to e:, by an approach sirnilar to that
in 5 2.3, we can prove that {et , t = l ,2 , -) is a sequence of mutudy uncorrelated random
variables. By the Gaussian property, (e t , t = 1,2, *) is a sequence of mutudy independent
normal random variables.
Therefore, for the machining process, its residuds form a sequence of mutually independent
normal random variables with wiances independence of the change. The mean values of
residuds are constant zero before the change point r and start to change after r .
Since sf is independent of the change A(t - r), the residuals { e t ) depend on A(t - r ) only
t hrough t heir mean values
t-r i
We now examine how E(et) responds to some specific types of additive changes. For a step
shift of size A, A(i) = A for i 2 0,
for t < r,
( An:Z(l - ~ i - ~ ) , for t 3 r.
For a drift rate change, A(i) = ( i + l ) d for i 2 0,
for t < r,
- W ) for t 2 r.
2.6.3 Residuals of the Steady-State Machining Process
One can see from ( 2.9) and ( 2.8) that the filtering rate wt and the residual variance s: depend
on t only through t$ defined by ( 2.7), which in turn is fully determined by q& the variance
of the initial process setting eO. For any initial value q& i4:) converges to
which is called the steady state variance. When q: is stabilized at q2, we Say that the process
is in steady state.
It is interesting to investigate the effect of qg on the convergence of ($1. It is because of
( 2.7) that ($1 converges very fast to q2 for any value of q& Numerical results have shown
the convergence of {q:) for the machining process with the parameters given by ( 1.2) in two
extreme cases: (i) qi = O, that is, the initial setting of the process rnean is error-free and (ii)
q; is large, say 102. In both cases, (q:} stabilizes at q2 after two iterations. Furthemore, if
q,2 = g2, then q: = q2 for al1 t. Therefore, we assume that the process is in steady state by
assuming g,2 = q2.
When the process is in steady state,
and the filtering rate
where rsw = oE/o: is the signal-tenoise ratio. It is easy to see that w(*) is a increasing
function of rsN. That is, the higher the signal-to-noise ratio, the larger the filtering rate.
From ( 2.10), for t 2 r ,
t-r
E(eJ = C(1 - w ) ' [ ~ ( t - i - r ) - A(t - i - 1 - r ) ] . i=O
When the change is a step shift of size A,
That is, the residual mean jumps to h at r and then decreases exponentially back to zero.
When the change is a drift rate change from c to (c + d) per part,
That is, the residual mean will converge exponentidy to a non-zero level of d/w.
For the machining process with parameters in ( 1.2),
q2 = 2.i'8g2, s2 = 5.608~ and w = 0.5528.
In sumrnary, for the machining process in steady state, its residuals form a sequence of
mutually independent normal random variables with a constant variance s2. Let
for k < 0,
~ , " = ~ ( i - w)'[A(k - i) - A(k - i - l)], for k 2 O.
Then the residuals form an i.i.d. normal process subject to an additive change 6(t - r).
2.7 Independence of Residuals of Feedback Control
An objective of qudity control is to reduce process variability and improve process capability.
For autocorrelated processes, by applying automatic process control, we can reduce systematic
variation. By applying change detection, we can possibly detect and remove special causes in
order to improve the processes. As we have mentioned in 5 1.1, automatic process control and
change detection (process monitoring) are complementary rather than cornpetitive tools, and
their combination for better quality control has been advocated in the literature.
Examining the results and developments so far, we see that the forms of residuals of the
ARIMA processes and state space models subject to additive changes are independent of the
applied feedback control. This may at first appear surprising. However, the reason behind
this is simple. That is, ut-1 is fully determined by Y'-', and the conditioning operation in
computation of the residuals cancels out the effect of ut-1 on et. Actually, the form of residuals
of a process is independent of any deterministic component of the process, for example, the
deterministic drift rate c in the machining process.
However, we must note that the forms of residuals of a process may depend on the applied
feedback control when the process is subject to a nonadditive change. For example, when the
autoregressive or moving average operator of an ARIMA process changes, feedback control
will impose some effect on the means of residuals.
The independence of residuals of feedback control under situations of additive changes
serves as a bridge for the combination of automatic process control and change detection in
quality control. Under different optimization criteria and control situations, different feedback
control policies can be obtained. When adjustment cost is minor, simple minimum variance
control can be applied. When adjustment is disruptive or time consuming, adjustment cost
has to be taken into account. Under some situations, such as that described in Vander Wiel
(1996 b), tool Wear can be automaticdy corrected by the lathe with negligible cost while in
others it cannot. However, no matter what specific form of feedback control is used, residuals
behave the same in response to additive changes. Therefore, an independent change detection
procedure based on residuals can be developed and used regardless of the specific form of
feedback control applied.
One may wonder why the feedback control cannot rernove the eflect of additive changes
in the residuals. The reason is that the effect of al1 feedback control is incorporated into the
forecast and the forecast g(yt-') is done under the assumption that the process is in-control.
Consider an example. Suppose a step shift of size A occurs at r and an additional amount of
(-A) of control is applied to the process at (r - 1). Then the effect of the step shift on process
output can be compensated because 8, = 0,4 + (u,-l - A) + C+ ur +A = El,-* + u,-1 +c+ v,.
Hence, Y, lyr-l - N(E(O,- l l y r - ' ) + u,-l + c, s:). However, at ( r - l), @, and Y , is forecast as
E ( @ , - ~ lyr-') + (u,-1 -A) +c. Therefore, e, = Y , - [E(8 , -1 lyr-') + (u,4 -A) +cl - N ( A , s:).
That is, t h e feedback can not remove the effect of the change in the residuals. This from
another side confirms that the properties of residuals of a process subject to an additive
change are independent of the applied feedback control.
2.8 Surnmary and the Mode1 Tkansformation
Residuals are always sufficient and equivalent to process observations for change detection.
Any change detection procedure based on observations can be transformed to a change de-
tection procedure based on residuals with the same detection properties such as optimality
preserved, and vice versa. When there is no change, the residuais axe mutually uncorrelated
with zero rneans, and furthermore mutudly independent if the process is Gaussian. For an
ARIMA process or a steady-state space model subject to an additive change, the residuals
are mutuaily independent. Actually, the residual process is an i.i.d. process subject to an
additive change that is the inverse filtering of the original additive change.
Because of the properties of residuals, detection of an additive change in an ARIMA
process or a steady-state state space model, possibly under feedback control, is transformed
to detection of an additive change in an i.i.d. process with detection properties preserved.
This model transformation will offer us much convenience for designing and analyzing change
detection procedures based on residuals.
In the rest of this work, we will focus on the residual process
which is an i.i.d. process with zero rneans and variance 02 subject to an additive change
6(t - P), where r = 1,2, , oo is the change point- E(e t ) = d ( t - r ) . b(i) = O for i < 0,
6 ( i ) for i 2 O is the mean of residual afler change. The formulation of the chônge detection
problem is now as foliows. At t = 1,2, - O , test the hypotheses
based on ei. The correspondhg stopping tirne N is now with respect to {Ft, t = l,2, - O),
where Ft is the minimum O-algebra induced by {ei, i = 1,2, , t ).
Chapter 3
Performance of Control Charts
Applied to Residuals
Introduction
From now on, we will study change detection procedures. In this chapter, we study the
CUSUM, EWMA and Shewhart control chart procedures applied to the residud process
{ e l , t = 1,2, ---) defined in 5 2.8.
Although applying control charts to residuals has been a popular practice for change
detection, especially, in the quality control area, a systematic investigation on the performance
of the control chart procedures applied to residuals of autocorrelated processes has not been
available in the literature. The lack of research in this area was due to two reasons.
Reason one, lack of a systematic investigation on the properties of residuals in the
previous literature.
r Reason two, the tirne-varying feature of the mean of residual after change, which made
the computation of performance measures not so easy as in the i.i.d. case.
By using a discrete-state Markov chah approach, Vander Wiel (1996 a) studied the perfor-
mance of two-sided CUSUM, EWMA and Shewhart control charts applied to the residuals
in a special case, that is, for an IllfA(1,1) process subject to an observational additive step
shift.
In this chapter, for a general pattern of the mean of residual after change, that is, a
geoeral 49, we will derive an explicit formula for computation of the ARLs of the Shewhart
control chart procedure, establish integral equations for the CUSUM and EWMA control
chart procedures, and develop numerical procedures of Gaussian quadrature for solving the
integral equations. Because of the timevarying feature of the mean of residual after change,
the integral equations and the numerical procedures based on them have not been done by
other authors. Under the ARL criterion, we will then study the performance of the control
chart procedures applied to the residuals under three typical situations, where the mean of
residual after change is decreasing to zero, decreasing at fist and then stabilized at a nonzero
level, and gradually increasing and stabilized at a nonzero level. By the ARL criterion, we
mean that for a change detection procedure N, for a fixed value of its in-control ARL that is
the smaller its out-of-control ARL that is
the faster the procedure N for detection of the change 4).
For computation of the ARLs, both the integral equation approach and the Markov chain
approach can be used. We are using the integral equation approach here, because it is more
efficient and can offer us more accurate and precise solution than the Markov chah approach.
Before we examine the advantage of the integal equation approach, we look at the difierence
between the two approaches, which exists in the following aspects.
0 First, in the case of a continuous state variable, by using the integral equation approach,
we establish an integral equation of ARL function on the base of the continuous state
variable directly. By using the Markov chain approach, however, one has to discretize
the continuous state variable and then establish transition probability matrices on the
base of the discret izat ion intervals.
0 Second, when solving the ARL values, by using the Markov chah approach, one a p
proximates the ARL values at each discretization interval by a constant (zer~degree
polynomial). That is, a continuous ARL function is approximated by a discontinuous
37
step function. By using the integral equation approach, however, we fit the integrand
(the continuous ARL function) on each quadrature interval by a polynomial of higher
degree when solving the i n t e g d equation by using quadrature methods. That is, the
continuous ARL function is fitted by a smooth spline curve.
a Third, by using Gaussian quadrature to solve an integral equation, we can choose the
location of the quadrature abscissas in an optimal way to achieve the closest fitness
of t h e ARL function by a spline curve. However, there is no such an optimal way for
discretization in the Markov chain approach.
For an N-point quadrature rule, basically we solve an N-dimensional system of linear
equations in the integral equation approach. In an N-discretization interval Markov chain a p
proach, one needs to carry out N-dimensional rnatrix operations. Therefore, when the number
of quadrature points is about the same as the number of discretization intervals, the computa-
tional complexities of the two approaches are about the same, the integral equation approach,
however, can offer much more accurate and precise solutions than the Markov chain approach.
Usually, only a small number of quadrature points are needed to achieve desired computa-
tional accuracy and precision. However, the same degree of accuracy and precision may not be
achieved by a large number of discretization intervals in the Morkov chain approach. Merely
increasing the number of discretization intervals cannot increase the accuracy and precision
of final solutions, because in the Markov chain approach, (i) the continuous ARL function
is always approximated by a discontinuous step function, which is a vital shortcoming; (ii)
increasing the number of discretization i n t e r d s will increase the computational complexity
and intermediate computational errors dramatically. More discussions on quadrature methods
and solution of integral equations can be found in Press, Teukolsky, Vetterling and Flannery
38
(1992).
When both the integral equation approach and the Markov chah appoach can be used,
the former is superior to the later in terms of computational efficiency, accuracy and precision.
However, in some cases (for example, when the state miab le is more than two dimensional),
it may be hard to solve the integral equations efficiently, while the discrete-state Markov chain
approximation can be still applied. The less efficiency and accuracy makes the Markov chain
approach more flexible and versatile. Lucas and Crosier (1982) computed the ARLs of CUSUM
charts in the i.i.d. case by using the two approaches. They discussed and their numerical
results demonstrated the relative advantages and disadvantages of the two approaches.
3.2 Integral Equations for Computation of the ARLs
Under the ARL cri terion, for a change detect ion procedure N, since we are only interested in
Er,, (N) and Er=* (N), we can simply assume that
and concentrate on computation of KZI(N) . By substituting constant O for 6(t - 1), t =
1,2, - -, Er,,(N) is just a special case of (N).
Two other assurnptions made in this chapter are
a et has a normal distribution;
6(t - 1) converges as t increases, and the limiting value is denoted by &O).
Let fN( I1 ,O1) be the normal density function with mean p and variance 02, and O(-) be the
standard normal distri bution function.
3.2.1 Shewhart Control Chart Procedure
The Shewhart control chart procedure with control limit C applied to {el, ez, 0 ) is a stopping
time,
Ns = inf(t 3 1 : (etl 2 Co).
For i 1 O, define ARLs(i) the ARL of Ns applied to {e i+l , ei+l, 0 ) . Then,
B y condit ioning,
which leads to the following formula.
3.2.2 CUSUM Control Chart Procedure
Let
&(t) = max{O, (et - k) + S H ( ~ - l)],
&(t ) = max{O, (-et - k) + &(t - 1)).
&(O) = O and &(O) = O are initidized.' Then, stopping times
and
N; = inf{t 2 1 : SL(t) 2 h )
are the higher-side and lower-side CUSUM control chart procedures applied to (el, el , - O ) ,
respectively. k > O is the CUSUM reference value, and h > O is the CUSUM threshold value.
We note that
{SH(t), t = 1>2, - = }
and
(S&(t),t = i , 2 , - a * }
form two non-homogeneous cont inous-state Markov chains.
For i 2 O, we define ARL$(s, i ) and ARL& i) the ARLs of the upper-side and lower-
side CUSUM control chart procedure applied to {e;+l, e;+?, 9 ) given that SH( i ) = s and
SL(i) = s, respectively. Then, for &(O) = O and &(O) = 0,
By condit ioning and the Markov property,
Using 4.) instead of 6(9 in ( 3.1), we obtain the equation for ARL$(s, i).
Since d( t - 1) + @O) when t + 00, from ( 3.1),
and
which is a Fredholm equation of the second kind that can be solved numerically by using
standard procedures such as Gaussian quadrature.
For an N-point Gaussian quadrat use appiied to [O, hl, let qi, q2, - - , q~ be the abscissas
and wl , w2, - * O , WN the corresponding weights. Then, adapting the idea of the interpolatory
formula observed by Nystrom, which is described in Press, Teukolsky, Vetterling and Flannery
(1992), we can accurately approximate ( 3.1) by
Usually, 6 ( t ) converges exponentially fast. For a sufficiently large T, Say 100, we can
initialize ARLZ(S, T) to be A R L ~ ( S , oo) and then compute A R L ~ ( S , i) recursively from i =
T - 1 to i = O by using ( 3.3). For the initialization and recursions, only the function values
at s = 0, qi , qz, , q~ need to be evaluated.
In summary, to compute ARLg(0, O), we carry out the following numerical procedure.
a Solve the integral equation ( 3.2) for ARLE(., a).
a For a sufficiently large integer T 2 O, initialize ARLg(0, T) to be A R L ~ ( . , w).
O For i = T - 1, T - 2, , O , recursively obtain ARL& i) from A R L & ~ + 1) by using
the interpolatory formula ( 3.3).
a Obtain Er,i ( N g ) = A R L ~ ( o , O), the final value of our interest.
3.2.3 EWMA Control Chart Procedure
For a chosen coefficient X E (0,1], let
is initidized, and
is assumed. For C > 0, the stopping time
is the EWMA control chart procedure applied to (el, e2, 0 ) . X is the EWMA coefficient, and
C is the EWMA control limit.
is a non-homogeneous continous-state Markov chain.
For i 3 O, define ARLE(r, i) the ARL of applied to {ei+,, ei+*, * ) given that Zi = z.
Then, for Zo = 0,
Er=1 ( N E ) = ARLE(O, O).
By conditioning and the Markov property,
ARLE (r, i)
Then, a similar numerical procedure as in § 3.2.2 can be developed to obtain
3.3 Numerical Analysis
In this section, we will numerically investigate the performance of the CUSUM, EWMA and
Shewhart control chart procedures applied to the residuals. O bviously, for different types
of & ( O ) , the mean of residual after change, a change detection procedure may show different
performance. Therefore, performance of the control chart procedures should be investigated
for various types of &). To compare the performance of different control chart procedures
under the ARL criterion, their in-control ARLs will be fixed at the same value.
Both the CUSUM and EWMA control chart procedures have two design pararneters.
CUSUM has its reference value k and threshold value h. EWMA has its coefficient X and
control limit C. When their in-control ARLs are fixed, the degree of freedom of choice of their
design parameters will be reduced from 2 to 1. That is, once one of the two parameters is
chosen, the other will be determined.
For a fixed in-control ARL value, at a particular size of change, we minirnize the ARLs
of the CUSUM and EWMA control chart procedures. This can be done by optimizing one
of their design parameters. We choose to optimize the reference value k for CUSUM and the
coefficient X for EWMA.
In the rest of this section, we will present numerical results in three typicd cases of if(-).
In each case, we will first present data in a table, where the CUSUM reference values k and
EWMA coefficients A optimal for diflerent sizes of change are given, together with the ARL
values for each set of optirnal design pararneters. The optimum ARLs of CUSUM and EWMA
at each considered size of change are supencripted with * in the tables. We will then plot the
optimum ARLs versus sizes of change, and the ARL curves for chosen design parameters. In
this section, we will assume that
0 the standard deviation of residuai is a = 1.
3.3.1 Results on the Machining Process Subject to a Step Shift
Consider the steady-state machining process subject to an additive step shift of size A in its
state equation, that is, A(i) = A for i 2 O. Then the mean of residual ofter change is
for i 2 0, where w = 0.553.
Table 3.1: Optimum ARLs and parameters of the CUSUM, EWMA and Shewhart control
chart procedures applied to the residuds of the machining process with w = 0.553 subject to
an additive step shift of size A in its state equation when their in-control ARLs are 100.
r
L
From Table 3.1 and Figure 3.1, we have the foilowing observations on the performance of
the CUSUM, EWMA and Shewhart control chart procedures when the meôn of residud after
change is ( 3.4).
1. Roughly, EWMA performs the best in this case.
46
2
53.9'
35.0"
35.4
40.1
69.6
A O
100
100
100
100
100
CUSUM
EWMA
2.5
33.5'
14.6'
14.9
18.3
50.1
k = 0.9
X = 0.35
X = 0.45
X=0.70 -
Shewhart
1.5
73.5'
62.0'
62.2
66.1
84.9
3
17.2"
4.9'
5.0
6.4
30.7
0.5
96.0'
96.6'
96.6
97.0
98.8
1
87.8'
84.5'
84.6
86.4
94.3
3.5
7.4'
2.0
1.9'
2.2
15.8
4
3.0'
1.3
1.2.
1.2
6.9
4.5
1.5'
1.1
1.1
O *
2.9
5
1.1*
1.0
1.0
1.0"
1.5 b.
2. Only when the ~ i z e of change is very small, CUSUM performs better than EWMA.
3. Shewhart performs the worst for any considered size of change.
A: Size of change
I I I 1 1
I I 1 I 1
CUSUM: k d . 9 '.
\
80
3 5 60- E 3
-E C, 40 0"
20
O
- O 1 2 3 4 5 6 7
k Size of change
. . ' . ' .X ".
t ' S . 'Q. - . . . - -. X: CUSUM - - . ". . . . - O. . .
;t '. . +: EWMA -
- O: Shewhart - +, . ..,
- - +. , ' x * .. =o..
I 1 ;r :*...m,,,,,a-,-,-a ---- c& -*--
Figure 3.1: Optimum ARLs versus sizes of change (upper panel) and the ARL curves with
the chat parameters optimizing the ARLs at A = 2 (lower panel) of the CUSUM, EWMA
O 1 7
2 - -
3 -
4 5 6 7
and Shewhart control chart procedures applied to the residuals of the machining process with
w = 0.553 subject to an additive step shift of size A in its state equation, when the fixed
in-control ARL is 100.
3.3.2 Results on the Machining Process Subject to a Drift Rate
Change
Consider the steady-state machining process subject to a drift rate change of c' in its state
equation. Then the mean of residual after change is
for i 2 0, where w = 0.553.
Table 3.3: Optimum ARLs and parameters of the CUSUM, EWMA and Shewhart control
chart procedures applied to the residuals of the machining process with w = 0.553 subject to
a drift rate change of d in its state equation when their in-control ARLs are 300.
From Table 3.2 and Figure 3.2, Ive have the follorving observations on the performance of
the CUSUM, EWMA and Shewhart control chart procedures when the mean of residud after
49
change is ( 3.5).
1. Optimized CUS UM control chart procedures produce the smallest optimum ARL values.
2. The CUSUM control chart procedure optimizing the ARL at a considered size of change
produces a lower ARL curve than the EWMA optimizing the ARL at the same size of
change, and also a lower curve than Shewhart.
3. Shewhôrt performs almost as well as CUSUM for large size of changes.
4. Shewhart performs better than EWMA for medium and large size of changes.
5. EWMA does not perform as well as CUSUM.
6. EWMA performs better than Shewhart for small size of changes.
O: Shewhart
+: EWMA
CUSUM: k=1,4
EWMA: L0.57
Shewhart
6: Arnount of drift rate change
Figure 3.2: Optimum ARLs versus sizes of change (upper panel) and the ARL curves with
the chart parameters optimizing the ARLs at c' = 2 (lower panel) of the CUSUM, EWMA
and Shewhart control chart procedures applied to the residuals of the machining process with
w = 0.553 subject to a drift rate change of d in its state equation, when the fixed in-control
ARL is 300.
3.3.3 Results on the Cutting Process
Consider the AR(2) cutting process with parameters in ( 1.4) subject to an observational
additive step shift of size A, that is, A(i) E A for i 2 0. Then the mean of residual after
change is
a(i) = (1 - 0.458 - O . ~ B ~ ) A ( ~ )
A, for i=O,
0.55A for i = 1,
0.35A for i > 1.
From Table 3.3 and Figure 3.3, we have the following observations on the performance of
the CUSUM, EWMA and Shewhart control chart procedures when the mean of residual after
change is ( 3.6).
1. CUSUM and EWMA perform better than Shewhart, especially for srnail and medium
size of changes.
2. CUSUM performs better than EWMA for smaller size of changes.
3. EWMA performs better than CUSUM for larger size of changes.
chart procedures applied to the residuals of the AR(2) cutting process subject to an observa-
- .. A O 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5
k = O . 1 100 39.5' 20.9 13.0 8.9 6.5 4.9 3.8 3.1 2.6 2.2
CUSUM
u
tional additive step shift of size A when their in-control ARLs are 100.
I
-
-
b
EWMA -
-
-
S hewhart 40.9 23.3 12.0 5.7
Table 3.3: Optimum ARLs and parameters of the CUSUM, EWMA and Shewhart control
'O.
x: CUSUM
+: EWMA
O: Shewhart
. *%,, ' 2 .' ,.
1 ' . * . . - 4. : 2 , , ,a,, , , ,a,, , , ,a,, , , -a,, . , , m -, , - , a.. , , . - - - - -
3 4 5 6 7 A: Size of change
- CUSUM: k=OS
- - . EWMA: k0.2
. - . - Shewhatt
1 2 3 4 5 A: Size of change
Figure 3.3: Optimum ARLs versus sizes of change (upper panel) and the ARL curves with
the chart parameters optirnizing the ARLs at A = 2 (lower panel) of the CUSUM, EWMA
and Shewhart control chart procedures applied to the residuds of the AR(2) cutting process
subject to an observational additive step shift of size A, when the fixed in-control ARL is 100.
3.3.4 Summary of the Performance of Control Charts
Under the ARL Criterion, we have studied the performance of the CUSUM, EWMA and
Shewhart control chart procedures under three different situations, where the mean of residud
after change represented by 6(i) for i 2 0 decreases to O rapidly ( 3.4), decreases at first and
then stabilizes at a nonzero level ( 3.6), and gradudly increases to a constant level ( 3.5).
From the analyses, we have the following results.
1. When 6(i) for i 2 O decreases to O rapidly, EWMA roughly performs the best among
the three control chart procedures.
2. Otherwise, CUSUM roughly performs the best.
3. Generally, Shewhart performs well for detection of very lôrge size of changes.
Chapter 4
Sequent ial Testing on Process Mean
Levels
4.1 Introduction
Consider a situation for the residud process { e t , t = 1,2,. -}. For a fixed time point ro 2 1,
either the additive change occurs at ro, or no change occurs a t d l . That is, the change point
r E {ro, oo). We want to discriminate r = r o from r = oci by sequentiaily observing el until a
decision based on the observations can be made. Therefore, we sequentially test Ho: r = m
versus Hi : r = ro. Before ro, et behaves the same under both Ho and H l , and starting at
ro, et behaves differently under Ho and Hi. Specifically, starting at ro, E(et) is time-varying
according to 6(9 under Hi, and E(et) is constant zero under Ho. Therefore, starting at ro,
we actually sequentially test
Ho: E(ei) = 0, for i = 1,2,-, versus
: E ( e i ) ( i - ) , fori=1,2,-,
based on the observations et. We cal1 these two hypotheses the mean level hypotheses. Since
{a, m) c {1,2, - - -, oc), the mean level hypot hesis testing is a special situation of the general
change detection problem formulated in 5 2.8 and originally in 5 1.3.
Anot her situation t hat leads to formulation of the mean level hypot hesis testing is when it
is suspected that an additive change has taken place before observation is started. If we want
to clarify this suspicious situation, we will have to test it. If the suspicion is true, then E(e t )
is tirne-varying according to 6(*) starting at the beginning of the observation. Therefore, we
will need to test the mean level hypotheses.
The problem of sequential testing on process mean levels also stands on its own. It has
already been a problem widely accepted and studied in the area of signal detection. Basically,
in the circumstances of signal detection, we use the mean level hypothesis test ing formulation
to test
57
absence or presence
of a known signal in a noise process. The problem has been studied by many authors. Liptser
and Shiryaev (1978) in their 517.6 studied sequential testing of two hypotheses that observa-
tions follow a noise process or the observations follow the noise process containing a known
signal additively. Cochlar and Vrana (1978) studied sequential testing of two hypotheses that
observations have a joint probability density function, Say, f (zîlO), or the observations have
a different joint probability density function, Say, f ( x y l l ) . They also particularly studied se-
quential testing on presence or absence of a known signal in a Gaussian noise process. Liu and
Blostein (1992) studied sequential testing of two hypotheses that independent observations
have a joint probability density function ni=, foi(~i), or they have a joint probability density
function nf=, f l i ( x i ) . They also discussed the applications of this sequential testing problem
in communications. Recently, Kailath and Poor (1998) has comprehensively reviewed the
related topics of t his sequent ial test ing problem.
Other applications of the mean level hypothesis testing problem may arise in system iden-
tification. Consider that the system status or process mean level is unknown to the observer,
but can only be in two possible states, Say, normal or abnomal. Regard the normal state as
an in-control state. Then, the means of residuals are al1 zero if the system or process is in the
normal state, and are time-varying following a pattern if in the abnormal state. Therefore,
the unknown system status or process mean level forms a set of mean level hypotheses. By
sequentially testing the mean level hypotheses, we can have the unknown state identified.
We note that the classical Wald's SPRT problem, where b( i ) for i 2 O is a constant, is a
special case of the mean level hypothesis testing problem. SPRT has been applied in different
areas. Since the time-varying feature is more prevalent than the time-invariant feature, the
meaa level hypothesis testing is more applicable.
Our objective here is to derive an optimal sequential test for the mean level hypotheses.
When d(i - 1) is a constant for i = 1,2, - - -, the optimal sequentid test is the SPRT proposed
by Wald (1945), and its optimality was proved by Wald and Wolfowitz (1948), and later by
Lehmann (1959), Chow and Robbins (1963) and others. For 6(9 taking a general time-varying
pattern, only a few works on the optimal sequential tests exist in the literature. Cochlar and
Vrana (1978) derived a Bayes optimum sequential test minimizing the expected sample size
over the a posteriori probability distribution of the two hypotheses for given type4 and type-
II errors. Liu and Blostein (1992) derived the same Bayes optimum sequential test, and aiso
obtained some properties of the optimal stopping boundaries. Liptser and Shiryaev (1978)
derived an optimal sequential test under a different criterion (see the (17.115) in their Theorem
17.8).
In this chapter, we will denve the optimal sequential test that minimizes the expected
sample size when the process is in the out-of-control state for given upper bounds of type4
and type-II errors. We will first formulate the optimal sequential testing problem as an optimal
stopping problem, then derive the optimal stopping rule, study some important properties of
the optimal stopping boundaries, and develop an algorithm for computation of the optimal
stopping boundaries, and present an example.
In this chapter, we assume that
O et has a normal distribution with standard deviation a = 1.
4.2 Formulation as an Optimal Stopping Problem
For n = 0,1,2, * * O , let 3, be the minimum 0-dgebra induced by {et, t = 1,2, ff , n), where
specially, {et , t = 1, .-- -, 0) = 0, 6 = (0, R). Let e l be a
S = (r, d) consists of a stopping time r 2 O with respect
nul1 vector. A sequentid test
to (Fn,n = 0,1,2,***) and a
terminal decision rule d that is a function
O, to accept Ho, d(e;) =
( 1, to accept H ~ ,
given r = n and e;. r = O represents no sarnpling a t dl . A fixed sample size test is a special
case of sequential tests, for which r r m, the fixed sample size. The type4 and type41 errors
of a sequential test S are
a = P(d = ilHo is true) and
p = P(d = OIHl is true),
respectively. When it is necessary, we use T(S), a(S) and P(S) denote the stopping time,
type-1 and type-II errors associated with a sequential test S, respectively.
Our objective here is to find an optimal sequential test minimizing
subject to the constraints a(S) 5 a0 and P(S) 5 Bo, where O < cro, Po < i are specified values.
That is, we are interested in finding an optimal sequential test, satisfying the prescribed upper
bounds of decision errors, which will stop sampling most quickly on average when Hl is true.
For convenience, let El ( 0 ) denote E(-1 Hi).
This objective leads us to consider the following auxiliary problem: for g, cl > O, hding
the optimal sequentid test S* attaining
inf S ~ { a f l sequential tests)
{QQ(S) + CIP(S) + E i ( r ( W -
q-, and cl cm be interpreted a s the loss incurred by fdsely rejecting Ho and HI, respectively.
The sufficiency of the auxiliary problem for Our objective ( 4.1) results from the fact that,
for any other sequentiai test S', if a ( S r ) 5 a(S') and @(Y) < - ,û(S'), then we must have
( ( ) ) ( ( S ) ) . Otherwise, it contradicts ( 4.2). Therefore, if we have a(S*) = cro
and P(S') = Bo by properly choosing q, and CI, then S' will be the optimal sequentid test
we want. Sirnilar auxiliary problems were first used by Wald and Wolfowitz (1948) to prove
the optimality of Wald's SPRT, and subsequently used and analyzed by some other authors
in optimality proofs.
For convenience, for t = 1,2, - - O , let
Let f o , t ( * ) and f 1 J - ) denote the density functions of e t under Ho and H l , respectively. Then
and
Let
and
which are the likelihood functions. For convenience, denote Lo(ey) = 1 and Li (e:) = 1.
For any stopping time 7, the optimal terminal decision d e d* rninimizing %a + cl@, and
therefore minimizing Qa + cl/3 + El ( T ) , is when r = n,
0 (to accept HO), if coLo(eî) 2 s L ~ ( e ; ) , 8(eC) =
This is because of the following inequali ty :
Therefore, to find the optimal sequential test of ( 4.2), we will find an optimal stopping time
T' minimizing
The optimal stopping time r', together with the optimal terminal decision rule d' ( 4.3), will
render an optimal sequential test S* = (T*, d.) of our objective ( 4.1).
In the next section, we derive the form of the optimal stopping time T*.
4.3 Derivation of the Optimal Stopping Rule
Define the stopping loss at stage n = 0,1,-• as
Denote A, the Iikelihood ratio at stage n, which is
Let z, and A, be the realizations of Zn and A,, respectively. By the definition of Zn and ( 4 4 ,
the optimal stopping time T' is the optimal stopping time with respect to { (Z,&),n =
0,1, - 1 , which satisfies
El (Zr.) = inf El (2,).
For K = 0,1, and n = 0,1, , K , by backward recursion, define
with
pF c m be interpreted as the minimum expected loss at stage n for a finite stopping time
r 5 K, which means that we must stop sarnpling and make a decision before or at stage h'.
By the definition, for any n 5 K, ,OF 2 O, and by backward induction, BE+' 5 ,OF. Therefore,
for n = O, 1, - -,
exists. Since in
Zn = n + min{~An,ci) ,
min{qAn, cl) is bounded between O and cl for al1 n = 0 , 1 , - - O , and {n) is an increasing
sequence, by Theorem 2 of Chow and Robbins (1963), the optimal stopping time r* with
respect to {(Zn, F,,)) exists and
For A 2 0, let
h(X) = min{~X, cl).
Then, for n = 0,1, -,
2, = n + h(Xn),
and
Hence, by the definit ion, for given n, K and the deteministic sequence { p i ) , ,8: is a function
of A, only, and we may denote it by @(A,).
For n = 0 ,1 , - -0 ,K + 1, define
By the definitions, for n = O, 1, # # * , K ,
where
fN(o,i)(-) is the standard normal density function. For n = 0,1, O , because lim~,, exists,
gn(An) = $ i w f f ( ~ n ) (4.8)
exists. Since
0 5 $(An) 5 h(Xn) I CI,
by the Lebesgue theorem of dominated convergence,
gn (An) = min{h(Xn), Gn+i (An) + 11,
where
By the definitions and ( 45 ) , the optimal stopping time is
The functions gn(X) and Gn(X) will provide us convenience for studying the structure of
t'. We first derive some important properties of them, which Xe the following proposition.
Proposition 1 For any n = 0, 1 , - -O,
( i ) both g,(A) and & ( A ) are concave in A 2 O;
(ii) g,(O) = O , gn(X) = cl for suficiently k r g e A, and gn(A) is increasing in A 2 0;
(ii;) Gn(0) = O , Gn(w) = cl, and &(A) is increasing in X 2 O.
Proof of Proposition 1:
(i) By the definitions, for any integer K 2 0, g g ( ~ ) = h(X) is obviously concave in A 2 O. For
any n = 0,1 , - , K , if c(A) is concave, then by the definition of concavity and ( 4 4 , it can
be verified that G ~ ( A ) is concave in h 2 O. Hence, by ( 4.6) and backward induction, $(A)
65
and GF(X) are concave in X 2 O. Then, for al1 n - 0,1, -, by the definition of concavity and
( 4.8), g,(X) is concave in X _> O, so is therefore G,(X) by ( 4.10).
(ii) gn(0) = h(0) = O is obvious.
If cl 1, then by ( 4.9), for A 2 ci/% and n = 0,1,--,
Suppose ci > 1. By ( 4.9), for al1 n = 0,1, and A > cl/@,
Let [cl] denote the integer part of cl. For any n = 0,1, -, consider m = n + [ci] + 1 and
6 E (O, 1).
9 m ( W 2 1,
for X 2 cl /co. Hence, by ( 4-10), there exists v,-1 2 cl/%, such that for X 2 v,-l,
and by ( 4.9),
€ gmmi(X) 2 min{cl,2 - -1. 2"-n
Once again, by ( 4.10), there exists v,-2 2 c1/cO, such that for X 2 ~ ~ - 2 ,
and by ( 4.9),
Thus, by backward recursion, there exists v,, 2 cl/co, such that for X > v,,
hl+l = min{cl, [cl] + 2 - C z)
That is, for A 2 v,, g,(X) = cl.
For any n = 0,1, : ., since gn(0) = O and gn(A) - cl for A 2 un, by its concavity, gn (A)
must be increasing in A > O. Therefore, (ii) is proved.
( i i i ) follows ( i i ) by ( 4.10). Proposition 1 is proved.
Now, we can establish the structure of the optimal stopping time T*.
Theorem 1 The optimal stopping time has the f o m
{A,, n = 0,1, 0 ) and {B,, n = 0,1, ':) are Iwo detenninistic sequences. I f cl > 1,
then for n = O , 1, -,
and A, = c l / ~ if and only if B,, = cl/co. If O < ci < 1 , then for any n = O , l , - g ,
An = Bn r c 1 / g *
Proof of Theorem 1:
From ( 4.11) and ( 4.9), for n = 0,1, O:'-, we need to compare h(A) with Gn+I(X) + 1 for X 2 0.
Suppose that cl > 1. By Proposition 1,
From the proof of Proposition 1 (ii), t here exists v, 2 cl/%, such t hat for X 2 un,
Gn+l(X) + 1 > h(A) = Cl.
Therefore, if there exists An E (O, cl /QI, such that
Gn+l(An) + 1 =-CO& = h(An), (4.12)
then, by the properties of G,+l(X) in Proposition 1 (i) and (iii), firstly A, > l / ~ , secondly
such an A, is unique, and thirdly there uniquely exists B, E cl/^, oo) such that
Gn+l(Bn) + 1 = c l = h(Bn)*
If A, = cl /Q, then B, = cl/%, and vice versa. Furthemore, for A E (A,, Bn),
and for XÉ(A,, B,),
When 44n = B n , (A, , B n ) = 0.
If t here does not exist such an A, E (0, cl /QI tbat ( 4.12) holds, that is, for dl X E (0, cl/co],
then by the increasing property of Gn+1 (A), for al1 A 2 0,
In this case, (cl/co, Q/Q) = 0, we let A, = Bn = cl/% for convenience.
Suppose O < cl 5 1 instead. By Proposition 1 (iii), for all A 2 O,
Hence, the optimal stopping tirne is T* O in this case. We then let A, = Bn = cl/% for
convenience as well.
Therefore, under ail situations, for n = 0,1, - - =,
and the optimal stopping time is
Theorem 1 is proved.
Illustrated in Figure 4.1 are the relationships among A,, Bn, the functions h(*) , gn(*) and
G,(-), and the decision parameters co and cl, as well as the properties of the functions shown
in Proposition 1.
kVe cal1 {A,, B,} the optimal stopping boundaries at stage n = 0,I , - - * , and (A, , Bn) the
sampling continuation interval of the optimal sequential test. If (A,, B,) = 0, sampling must
be stopped before or at stage n. In the case of cl 5 1, that is, the loss incurred by falsely
rejecting Hl is not greater than the cost of taking a sarnple, r* O, t hat is, it is optimal not
to observe at all. When sampling is stopped at stage n, the optimal terminal decision d e by
1 (to accept Hl), if A, 5 A,, d ' ( e ; ) =
( O (to accept Ho), if X, 2 Bn.
In case of A, = A, = Bn7 decisions to accept Hl or Ho c m be randomized in any way. For
simplicity, we let d*(er ) = 1 in this case. In summary, the optimal sequentid test is
I stop sampling and accept Hi at stage n, if A, 5 A,,
S' = continue sômpling at stage n, if An E (A,, &),
stop sampling and accept Ho at stage n, if A, 2 B,.
Figure 4.1: A typical illustration of g,(X) and the related functions and quantities.
4.4 Properties of the Optimal Stopping Boundaries
The optimal stopping boundaries {An, Bn, n = 0,1, --) are determined by 9, CI and (p i , i =
1,2, - -1. Generally, A,, and Bn Vary as n changes. In t his section, we study the properties
of the optimal stopping boundaries, especially, how {A,) and {B.) vary when {pt) exhibits
certain patterns.
If for to 2 1, { p t , t = to, to+l, - - * ) is a constant sequence, then {g,,(-), n = to-1, to, 9 ) is a
constant sequence of functions. Therefore, {A,, n = to - 1, toT - - e) and {B,,n = to-l,to,-0.)
are two constant sequences. When to = 1, this is the special situation of Wald's SPRT.
If {pt, t = 1,2, - --) is periodically varying in t, that is, there exists a positive integer I
such t hat
for any t = 1,2, -, then both {An, n = 0,1, - -) and {B,, n = 0,1, -) are periodically
varying in n with the same period 1, that is,
and
&+I = B n
for any n = 0,1, ---. This is because for any n 5 by the definitions,
and t hen
for any n = 0,1, ---.
The following proposition shows that when {p i ) converges to 0, the optimal sequential
test is truncated. That is, there exists M, such that A, = B, = ci /Q for a l l n 2 M.
Proposition 2 Assume
The optimal sequential test is truncated. That is, there exàsts an integer M 1 0, such that
Proof of Proposition 2:
I t will be sufficient to show that under the condition of the proposition, there exists an integer
hl, such that for any K 2 n 3 M + 1 and X 2 0,
and
IG:(A) - h ( ~ ) l c 1.
For any e E (0,1), thereexists 6, > O , such that
For any positive integer K , by the definition, for X 2 0,
Since lim+, pi = O, there exists m,, such that for any K 2 m,, 1x1 < 4, aod X E (0, a), eo
is bounded. Therefore, for A' 2 m,, 1x1 < cf,, and A E (O, %),
€ Ih(Aexp{-pK(x + y))) - h(A)l < co- = e.
Co
For K 2 me and 1x1 < de,
and
pK € exp{-pK(x + -)) 3 1 - -. 2 C l
and t herefore,
The non-equality of ( 4.17) is because of that h ( - ) is an increasing function and h(X) = cl for
X 3 cl/%. Since h(0) = O, from ( 4.16) and ( 4.17), for any K _> m,, 1x1 < 6,, and A 2 0,
Thus, for any K 2 m, and A 2 0,
Hence, by ( 4.6),
If A? - 1 2 m,, then by a similar development,
Then by backward induction, for any K 2 n 2 M + 1 and X 2 0, ( 4.13) and ( 4.14) hold,
and by ( 4.6),
g:;<w = w* Therefore, by ( 4.8), for any n 3 M and X 2 0,
and by ( 4-11),
This means that the optimal sequential test is truncated. That is, sampling must be stopped
no later than M. Proposition 2 is proved.
We cal1 such an M in Proposition 2 a tnincation point. ( 4.15) indicates a way for h d i n g
a tnincation point when Lmi+, = 0. ( 4.15) also indicates that limi,, pi = O is oot a
necessuy condition for the optimal sequential test to be tmncated. In fact, if Ki+,lpil
is sufficiently small but not zero, then the optimal sequential test can be still truncated.
Proposition 2 substantiaily strengthens the result of Liu and Blostein (1992), where it was
shown that the optimal stopping boundaries were truncated when pt m O for t beyond a
certain stage M.
4.5 Computation of the Optimal Stopping Boundaries
Generally, to compute An and B,, we need to compute gn(*) and Gn+i(*). To compute g&)
and G,+l ( O ) , we can first compute ~ ( 0 ) and G,K++, (-) for a sufficjently large K > n. By the
convergence properties, gn ( 9 ) and G,+l ( 0 ) can be well approximated by g;L(-) and Gs, (O).
Once g,(-) has been computed, g,-1 (O), gn-P(=), * * * can be computed backward recursively. In
t his way, -4. and B, can be found for any n = 0,1, *fip.
Major effort is required in the computation of Gn(*) functions, that is, the integration
in ( 4.7). The integration can be done by using Gaussian quadrature eventually. However,
caution must be taken to avoid computational singularity in the integal kernels, as well a s in
the integral intervals. By using variable transformation, the kernel singularity can be removed,
and an expression of CF(=) suitable for computation can be obtained as follows.
where log(-) is the natural logarithm function. Variable transformation was first made to
utilize the properties of gf(-) in ( 4.18). Then, the variable was transforrned back to avoid
the nearly kernel singularities in ( 4.19).
An efficient cornputer algorithm for computation of the optimal stopping boundaries by
using t h e procedure described in the beginning of this section and the expression of G ~ ( A )
of ( 4.20) has been developed. In the computer aigorithm, we have used some programming
techniques to avoid forward recursive calling of G:(*) in ( 4.20). which can cause exponentially
increasing memory consumption and slow down computation. The computer algorithm has
been successfu~ly implemented in Turbo C++. In the following section, we will present an
example of the computed optimal stopping boundaries.
4.6 Numerical Result s
Consider
pt = d(t - 1) = 0.9~-', for t 2 1.
Using decision parameters Q = 10 and cl = 10, we have computed the optimal stopping
boundaries that are presented in Table 4.1 and are plotted in Figure 4.2.
Table 4.1: The optimal stopping boundaries when 6( i ) = 0.9' for i 2 0.
Stop & Accept Ho
Continuation Region . . . . . . . - . . - . . . . . . . . . . . . . . . . . . . . . . . . . . . - C
pping Boundary A,,
Stop & Accept H,
Figure 4.2: The optimal stopping boundaries of the example.
A distinct feature of the optimal stopping boundaries of the example is shown in Figure 4.2.
When = 6(t - 1) decreases to O for t 2 1, the continuation region, where the sampling
of the sequential test is continued, is bounded. That is, the optimal stopping boundaries are
truncated, and the test must be stopped before or at the truncation point, which is 12 in this
example.
Chapter 5
Likelihood Ratio Testing Procedures
5.1 Introduction
For change detection in autocorrelated processes, there have been two sets of tools. The first
set is the control chart procedures. We have studied the performance of the CUSUM, EWMA
and Shewhart control chart procedures for change detection in autocorrelated processes. Now,
we examine their alternatives, the second set of tools of LR testing procedures.
A classical LR testing procedure signals occurrence of a change when the maximum of
the likelihood ratio over unknown parameten, such as the change point, exceeds a constant
threshold value. Basseville and Nikiforov (1993) described a unified likelihood ratio framework
for design and performance analysis of change detection procedures, and studied the LR testing
procedure in details. Because of the fine intuition behind them and their ability to deal
with cornplex situations, LR testing procedures have been widely used for change detection
in autocorrelated processes. When the magnitude of change is unknown, the LR testing
procedure is a generalized likelihood ratio testing procedure proposed by Willsky and Jones
(1976) and studied recently by Lai (1995) and Siegmund and Venkatraman (1995). When the
change point is the only unknown parameter, the LR testing procedure is a likelihood ratio
CUSUM procedure by using residuals, and it is optimal under a minimax criterion in the i.i.d.
case. However, the optimality rnay not hold in general, and the performance of the LR testing
procedure in comparison with the control chart procedures is unknown.
In this chapter, we will study the structure of the LR testing procedure, extend the LR
testing procedure, propose a combined CUSUM and Shewhart control chart procedure, derive
integral equations for computation of their ARLs in some cases, and study their performance.
5.2 The Structure of LR Testing Procedures
Consider the change detection problem on the residual process {et , t = 1,2, - * 0 ) . Assume
et has a continuous distribution, and let f,(*) be the probability density function of et if
E ( e t ) = p. Because of the mutual independence, the joint density function of e: is
under Ho, and
under Hi. The likelihood ratio is
and the LR testing procedure is a stopping time
t
NLR = inf{t 2 1 : max C log f ~ ( i - r ) (ei) 1 9 % i=r fo (ei) 1 h l ,
where h is a constant threshold value.
As we have discussed in the last part of 5 2.3, the joint densities of ei ( 5.1) under Ho and
( 5.2) under Hi are equal to the joint densities of qt under Ho and Hl, respectively. Therefore,
the likelihood ratio ( 5.3) based on the residuals e: is equal to the likelihood ratio based on
the observations y:, and the LR testing procedure ( 5.4) based on the residuals has the same
detection properties as that based on y:. However, because of the mutual independence, the
LR testing procedure based on residuals has a simpler expression than its counterpart based
on observations for an autocorrelated process.
The stopping time NLR can be rewritten in the fouowing form.
w here
NiR = inf{n 2 1 : xi=+=-' log f ~ ( i - r ) (ei 10 (ei) 2 h)
= infin 2 1 : log f ' ( i ) ( C r + i )
/ ~ f e r + i ) 2 h ) ,
for r = 1,2, - - O . NiR is the stopping rule of a one-side SPRT on the mean level hypotheses
presented in 5 4.1, and NLR, i = 2,3, , is the same stopping nile appbed to (e,, e,+~, -1 .
( 5.5) suggests that in the LR testing procedure, we actually activate a common stopping rule
at each possible change point r = 1,2, -, and whenever one of the activated tests signals that
E(e i ) has started to change, the whole LR testing procedure NtR is stopped, and Hl : r 5 t
in the change point hypotheses is accepted.
Lorden (1971) pointed out the relationship ( 5.5) between the likelihood ratio CUSUM
procedure and the one-side SPRT. By using it, he obtained the asymptotic optimality of the
likelihood ratio CUSUM procedure in the i.i.d. case. The relationship ( 5.5) has also been
used for design and analysis of change detection procedures by Basseville and Nikiforov (1993)
and Lai (1995).
5.3 Extended LR Testing Procedures
In the i.i.d. case, that is, when 6( i ) is a constant for i 2 O, SPRT is optimal for testing the
change profile hypotheses, and the likelihood ratio CUSUM procedure ( 5.4) is optimal for
change detection - testing the change point hypotheses. In both SPRT and the c1assica.l LR
testing procedure ( 5.4), their threshold values are a constant. When 6(i ) is generally tirne-
varying for i 2 0, we have shown in Chapter 4 that the optimal sequentid test on the mean
level hypotheses is a likelihood ratio test with time-varying thresholds. These let us propose
the following extended LR testing procedure by replacing the constant threshold in ( 5.4) with
time-vârying thresholds.
t
NELR = infit 2 1 : max [x log f ~ ( i - r ) (ei) i<r<t i=r
f d e i ) - ht -r ] 2 O),
where {hi, i = 0,1, - - -} is a time-varying sequence.
The classical LR testing procedure NLR can be rewritten ôs
t
NLR = inf { t 2 1 : max [ç log f~(i-r) (ei) 1 9 9 i=r
-hl 2 O}. f o ( 4
For a &), let { N E L ~ ) denote the class of al1 extended LR testing procedures. Clearly,
Therefore, the best change detection procedure from the class {NELR) may improve the
performance of the classicd LR testing procedure NLR.
To find the optimal threshold sequence (hi, i = 0,1, - - 9 ) for the extended LR testing
procedure ( 5.6) needs some effort. However, we can let
where {Ai, i = 1,2, - - *) is the lower threshold sequence of the optimal sequential likelihood
ratio test T* on the mean level hypotheses studied in Chapter 4, which takes at least one
observation.
By doing the value assignment ( 5.7), we actuaily make
where NkLR is the one-side optimal sequential test r' but applied to e,, e,+l, - 0 0 , and fiEtR
is the extended LR testing procedure with the threshold sequence {- log Ai+i, i = 0,1, - *),
t
&LR = inf ( t 2 1 : max [C log fqi-r) (ei) + 1% &,+il > O).
l ~ r ~ t i=, fo(ei)
We expect that the extended LR testing procedure NELR of ( 5.8) performs better than the
classical LR testing procedure NLR of ( 5.5).
In Chapter 4, we have studied the structure of ru and obtained some properties of {Ai, i =
0,1, - -). When 6(i) converges to zero for i = 0,1, - -*, for example, when {et) are the residuals
of an IMA(1,l) process subject to an observational additive step shift, the optimal sequential
test r' on the mean level hypotheses is truncated and takes at most M observations. Therefore,
the extended LR testing procedure NELR is
which turns out to be a window-limited procedure.
When {et} are the residuals of an AR(p) process subject to an observational additive step
shift, 6(i) varies at the first p time points i = O, 1, --- , p- 1, and is a constant for i = p, p+l, -- -. In this case, we have shown thôt Ai is a constant for i = p,p + 1, - - O - Therefore, in NI, the
threshold hi is a constant for i = p - 1, p, p + 1, --O, and varies only at the first p - 1 time
points i = 0,1,--,p -2.
85
5.4 A Combined CUSUM and Shewhart Control Chart
Procedure
To approximate the best change detection procedure from the class { N E ~ R ) , we propose the
following combined CUSUM and Shewhart (CCS) control chart procedure.
where
is the usual higher-side CUSUM statistic, k > O is the CUSUM reference value, h > O is the
CUSUM threshold value, C > O is the Shewhart control limit. Note that we assume
Otherwise, the CUSUM part will be dominated by the Shewhart part.
CCS procedures first appeared in Westgard, Groth, Aronsson and de Verdier (1977), and
were studied by Lucas (1 982) for detection of step shifts of unknown size in the i.i.d. case. She-
whart chart was capable of detecting changes of large size, CUSUM was superior in detecting
changes of small size, and their combination combined the strength of both.
Our intention here is to approximate the time-varying t hreshold sequence of the extended
LR testing procedure by adding the Shewhart control limit to the threshold of CUSUM.
In cornparison with the LR or extended LR testing procedures, the structure of the CCS
procedure is relatively simple. The CCS procedure is non-parametric, that is, its composition
(not performance) is independent of process mode1 parameters- For a general if(-), we can
establish integral equations to solve the ARLs of the CCS procedure.
86
5.5 Performance Analysis
In this section, under the ARL criterion, we will investigate the performance of three change
detection procedures: the LR testing procedure applied to the residuals of an AR(1) process
for detection of an observational additive step shift, the extended LR testing procedure applied
to the residuals of an IMA(1 , I ) process for detection of an observational additive step shift,
and the CCS procedure. For these procedures, because of the Markov chain nature of their
testing statistics, we can manage to establish integral equations for their ARLs, and solve
them accurately and precisely. The involved Markov chains will be non-homogeneous.
Singulority, primarily the discontinuity of the integral kernels, will be present in the inte-
gral equations of the three procedures. This imposes much difficulty in solving the integral
equations, care and effort are needed to handle the singularity in order to design valid and
efficient computer algori t hms. Gaussian quadrature for solving standard Fredholm equations
of the second kind cannot be applied due to the singularity. A technique we will use to
overcome the singularity is to construct quadrature on a unifom grid by using cubic spline
approximation of the unknown function. Once the singuiarity is handled, similar recursive
computational procedures can be developed to solve the ARL values as we did for CUSUM
and EWMA in Chapter 3.
In the following subsections, we will describe the three change detection procedures, derive
the integral equations, and present numerical results. Al1 the computation has been success-
fully implemented in Turbo Cf+. As in 5 3.2, for performance andysis of a change detection
procedure N, we will be only interested in E,,(N) and E r Z I ( N ) , and we can assume that
We will assume that
0 et has a normal distribution with standard deviation u = 1.
5.5.1 The LR Testing Procedure Applied to AR(1) Processes
For an AR(1) process with the autoregressive parameter al subject to an observational additive
step shift, the mean of residud after change is
6(i) = { i = O, (5.10)
( 1 - a ) i > 1 ,
and the threshold in ( 5.7) is a constant for all i 2 O. Hence, the extended LR testing procedure
NELR in ( 5 .8 ) reduces to the usual LR testing procedure with a constant threshold.
Let A. > O be the size of an observational additive step shift we intend to detect, po = A.
and pl = (1 - a l ) A o . For t 2 1, let
is a non-homogeneous Markov chain. Let h > O be a constant threshold value. Let So = 0,
NLR = NELR = inf(t 2 1 : St 2 h ) .
For i 2 0, define AR@, i) the ARL of the detection procedure hR applied to {eN17 e;+l, --)
given that Si = S. Obviously, s < h. Then, for So = 0,
Er,i (NLR) = ARL(0, O ) .
For s < h, we c m derive an integral equation suitable for c
ment ation as follows.
ARL(s, i)
ation and computer imple-
w here f N ( p , 0 2 ) ( - ) is the N (pl 02) density function, and is the indicator function. Clearly, the
integral kernel is discontinuous, and therefore, singular. Besides, there is another singularity,
the infinite integral range. To solve this integral equation, special techniques and major effort
more than those for solving the integral equations in 5 3.2 are required.
procedures applied to the residuals of an AR(1) process with autoregressive coefficient al =
F
0.45 subject to an observational additive step shift of size A when their in-control ARLs are
100. ARLs are optimized a t A = 1 in the upper panel, and at A = 3 in the lower panel.
Table 5.1: ARLs of the LR testing procedure and the CUSUM and EWMA control chart
For the fixed in-control ARL of 100 and al = 0.45, Table 5.1 presents the ARLs, Er=i(N),
of the LR procedure in cornparison to those of CUSUM and EWMA, and Figure 5.1 shows
the ARL curves.
Figure 5.1 shows that when the mean of residuai after change is ( 5.10), the LR testing
procedure performs roughly the sarne as the CUSUM control chart procedure, and both of
them perform better than the EWMA control chart procedure.
-I
A O
100
100
100
100
100
100
LR
CUSUM
EWMA
LR
CUSUM
EWMA
(h=2.72)
k = 0.28
X=O.OS
(h=2.92)
k = 1.0
À=0.63
0.5
29.04
28.47
35.04
40.89
40.93
62.80
1
12.66
12.62
13.68
17.45
18.12
26.53
1.5
7.07
7.31
7.47
7.98
8.67
11.03
2
4.46
4.90
4.77
4.05
4.50
4.83
2.5
3.01
3.59
3.34
2.35
2.57
2.45
3
2.13
2.78
2.55
1.58
1.68
1.59
3.5
1.59
2.24
2.15
1.23
1.27
1.26
4
1.28
1.84
1.94
1.08
1.09
1.11
4.5
1.12
1.53
1.79
1.02
1.03
1.04 -
-- . - . - LR: (h=2.9234) 60- a CUSUM: k=1 .O (h=1.532)
- V
A - - - - EWMA: k0.63 (C=2.W)
2 40- -
20 - -
O I I 1 - 1 D I b
1 I 1 I I
- . - . - . - LR: (h=2.7217)
CUSUM: b0.28 (h=4.156) - - - - - EWMA: M . 0 8 (C=2.067)
-
-
- O I 1 1 1 1 - . - - , . .
O 0.5 1 1.5 2 2.5 3 3.5 4 4.5 A: Size of change
I
Figure 5.1: ARLs of the LR testing procedure and the CUSUM and EWMA control chart
O 0.5 1 1.5 2 2.5 3 3.5 4 4.5 A: Size of change
procedures applied to the residuals of an AR(1) process with autoregressive coefficient al =
0.45 subject to an observational additive step shift of size A. For the fixed in-control ARL,
the chosen chart parameters optimize the ARL at A = 1 in the upper panel and at A = 3 in
the lower panel.
5.5.2 The Extended LR Procedure Applied to IMA(1,l) Processes
For an I MA(1,l) process subject to an observahional additive step shift, the rnean of residual
after change is decaying exponentially fast:
for i 2 O. We wiil study the situations where IPII is small so that the optimal sequential test
r' on the mean level hypotheses is truncated at two observations. Thmefore, the window size
of N E L R in ( 5.9) is M = 2 and integral equations for computation of ARLs c m be established
and solved.
Let A. > O be the size of an observational additive step shift we intend to detect, po = A.
and pl = Piho. Let ho and hl > O be two threshold values.
is a non-homogeneous Markov chain.
For i > O, define ARL(e, i) the ARL of the extended LR procedure fiELR applied to
{ei+i, ei+2, - - a ) given t hat e; = e. For eo = 0,
We can derive the following integral equation.
Po PO Cl1 = lP(~o(ei+i - j-)) 1 hot or po(e - -) + pi(eà+i - 5 2 hi) 2
for e c 2 + 7. The same kinds of singularity as in ( 5.11) are present in ( 5.13).
For the fixed in-control ARL of 100 and bl = 0.1, Table 5.2 presents the ARLs of the
extended LR detection procedure in cornparison with the classical LR detection procedure
and the CUSUM and EWMA control chart procedures and Figure 5.2 shows the ARL curves.
Figure 5.2 shows that when the mean of residual &ter change is ( 5.12), the extended LR
procedure improves the performance of both CUSUM and the classical LR detection procedure
slightly. For detection of this rapidly decreasing change, the EWMA control chart procedure
offers t h e best performance.
ho = 2.4 XLR
(hl = 1.77)
EWMA 1 X = 0.55
5 ho = 4.2
XLR
CUSUM 1 k = 2.0
Table 5.2: ARLs of the extended LR (XLR) testing procedure in cornparison with the classicd
LR testing procedure and the CUSUM and EWMA control chart procedures applied to the
residuals of an I M A ( 1 , l ) process with moving average coefficient bl = 0.1 subject to an
observational additive step shift of size A when their in-control ARLs are 100. ARLs are
optimized at A = 1 in the upper panel, and at A = 3 in the lower panel.
. . . . . . . . . . XLR: b2.4 (h=l.7742)
80 - .-.-.-. LR: (h=1.9294) - CUSUM: k=2.0 (h4.327) MMA: M.55 (C=2.544) -
-
20 - \ - \
O 0.5 1 1.5 2 2.5 3 3.5 4 A: Size of change
f 00 - ....... . . . XLR: hw.2 (h=2.2928)
80 - . - . - . - . LR: (h=2.7634)
CUSUM: k=2.0 (h=0.329) A 60- a EWMA: M.495 (C=2.533)
- Y
2 40- -
20 -
O - - - - O 0.5 1 1.5 2 2.5 3 3.5 4
A: Size of change
Figure 5.2: ARLs of the extended LR (XLR) testing procedure in cornparison with the classical
LR testing procedure and the CUSUM and EWMA control chart procedures applied to the
residuals of an I M A(1 , l ) process with moving average coefficient bl = 0.1 subject to an
observational additive step shift of size A. For the fixed in-control ARL, the chosen chart
parameters optimize the ARL at A = 1 in the upper panel and at A = 3 in the lower panel.
5.5.3 The CCS Procedure
For a general &),
is a non-homogeneous Maskov c h a h
For i > O, define ARL(s, i) the ARL of the CCS procedure applied to {ei+l, e;+2, -) given
that S H ( i ) = S. For SH(0) = O,
For O < s < h, we can derive the following integral equations for computation.
= lP(lei+l( 3 Cor s+(e i+ l - k ) 2 h )
+ARL(O, i + 1)[9(-b(i) + min{C, k - s)) - a(-&) - C)]
+ARL(O, i + l)[@(-d(i) + k - s) - 9(-b( i ) - C)]
Therefore,
and
for the purpose of initialization. Singularity of discontinuous integral kernels also appears in
the above integrd equations.
CCS
CUSUM
Table 5.3: ARLs of a CCS procedure in cornparison with the LR testing procedure and the
CUSUM control chart procedure applied to the residuds of an AR(I) process with autore-
gressive coefficient a, = 0.15 subject to an observational additive step shift of size A when
their in-control ARLs are 100.
A recursive computational procedure sirnilar to that for CUSUM chart in 5 3.2.2 has been
developed. As an example, we compute the ARL values of the CCS procedure and compare its
performance with the CUSUM and the LR testing procedme when the mean of residual after .
change is ( 5.10). For the fixed in-control ARL of 100 and al = 0.45, Table 5.3 presents the
computed ARL values, E,,&V), and Figure 5.3 shows the ARL curves. The CCS procedure
performs roughly the same as CUSUM and the LR procedure, and it performs a little better
97
than CUSUM for detection of larger size of changes.
. . . . . . . . . . CCS: Cd.8, b0.28 ( h d . 170)
CUSUM: k=0.28 (h4.156)
. - . - . W . LR: (hs2.7217)
- d
- - - . - . - b - . - . - . - . - . d C .
O t I I 1 I I
O 0.5 1 4 f .5 2 2-5 3 3.5 A: size of change
Figure 5.3: ARLs of a CCS detection procedure in cornparison with the classical LR testing
procedure and the CUSUM and EWMA control chart procedures applied to the residuals of
an AR(1) process wi th autoregressive coefficient ai = 0.45 subject to an observational additive
step shift of size 4.
5.6 Summary of the Performance Analyses
Combining the performance analyses in Chapter 3 and this chapter, we have the following re-
sults on the performance of the control chart procedures, the classical and extended LR testing
procedures, and the CCS procedure for detection of an additive change in ôn autocorrelated
process .
1. The classical and extended LR testing procedures and the CCS procedure roughly per-
form as well as CUSUM. The LR testing procedure shows slightly better performance
than CUSUM for detection of a large size of observational additive step shift in an
AR(1) process. The extended LR testing procedure can improve the performance of the
classical one.
2. When the mean of residual after change decreases to O rapidly, EWMA roughly performs
the best among al1 the considered change detection procedures.
3. Otherwise, CUSUM roughly performs the best.
4. Generally, Shewhart performs well for detection of very large size of changes.
Chapter 6
Conclusion and Future Research
6.1 Contribut ions
This thesis has studied several issues on change detection in autocorrelated processes. The
contributions and major findings of this thesis are as follows.
We have systematically investigated the properties of process residuals for change detec-
tion. We have shown that process residuals are sufficient, and equivalent to process observa-
tions, for the problem of change detection. Therefore, change detection c m be done by using
process residuals. We have shown that process residuals are mutually uncorrelated with zero
means when the process is in-control. We have developed a general procedure for specifically
deriving the forms of process residuals. Using the procedure, we have derived the forms of
residuals of general A RIMA processes and state space models, and studied the properties
of the residuals under situations of additive changes, nonadditive changes, steady state, and
nonsteady state. Under the Gaussian assumption, for an ARMA process or a steady-state
state space model subject to an additive change, the residuals are i.i.d. with zero means before
the occurrence of change, and are still mutually independent with the same variance as before
but with time-varying means after the occurrence of change. For an ARIMA process or a
steady-state state model subject to an additive change, we have found that the properties of
residuals are independent of the feedback control applied to the process. The independence
of feedback control of process residuals makes the combination of process control and process
monitoring (change detection) easier, because change detection procedures based on residuals
can be designed independently of the feedback control policies.
Because of the properties of proce3s residuals, detection of an additive change in an ARIMA
process or a steady-state state space model is transformed to detection of an additive change in
the residual process with detection properties preserved. We have studied the performance of
the CUSUM, EWMA and Shewhart control chart procedures applied to the residuals under the
ARL criterion. For computation of the ARLs of the control chart procedures, we have derived
an explicit formula for Shewhart, established integral equations for CUSUM and EWMA, and
developed numerical procedures for solving the integral equations. We have investigated three
typical situations to study the performance of the control chart procedures applied to the
residuals.
We have also studied the LR testing procedure applied to residuals for detection of an
additive change in an autocorrelated process. We have extended the classical LR testing
procedure by replacing its constant threshold with a time-varying threshold sequence. We have
proposed a combined CUSUM and Shewhart procedure for approximation of the extended LR
testing procedure. We have developed numerical procedures based on integral equations for
computation of the ARLs of these detection procedures, and nurnerically investigated their
performance in sorne cases.
We have found that the LR testing procedures perform roughly as well as the CUSUM
control chart procedure. The extended LR testing procedure can improve the performance of
the classical one. When the mean of residual after change decreases rapidly to zero, the EWMA
control chart procedure roughly performs the best among a.U considered change detection
procedures. Otherwise, CUSUM roughly performs the best. Shewhart performs well only for
detection of very large size of changes.
We have studied the problem of optimal sequential testing on process mean levels. The.
problem is a special situation of change detection, is a generalization of Wald's SPRT problem,
and has wide applications in signal processing and other areas. We have formulated the
problem as an optimal stopping problem, derîved the optimal stopping d e , obtained some
important properties of the optimal stopping boundaries, which are two sequence of time-
varying threshold values, developed an algorithm for computation of the optima stopping
boundaries, and presented a numerical example. When the rnean of residual after change
decreases to zero (or a sufficiently small value), the optimal stopping boundaries are truncated.
That is, to be cost-efficient, the sampling and the test must be stopped before or at the
truncation point. When the mean of residual after change is stabilized at a constant level, the
optimal stopping boundaries are also stabilized. When the mean of residual after change is
periodically varying, the optimal stopping boundaries are also periodically varying with the
same period.
Future Research
Future research in the area of change detection in autocorrelated processes c m be done in the
following direct ions.
a Detect ion of nonaddit ive changes.
Change detection in multivariate processes.
a Combination of detection and estimation of the change point in the change detection
problem.
Bibliography
[l] Allen, R., and Huang, K. (1994), "Self-tuning Control of Cutting Force for Rough Tuming
Operat ionsn and the references, Proceedings of the Institution of Mechanical Engineers,
Part B: Journal of Engineering Manufacture, Vo1.208, pp.157-165.
[2] Alwan, L. C. (1992), "Effects of Autocorrelation on Control Chart Performance", Corn-
munications in Statistics - Theory and Methods, 21(4), pp.1025-1049.
[3] Alwan, L. C., and Roberts, H. V. (1988), "TirneSeries Modeling for Statistical Process
Control", Journal of Business and Economic Statistics, Vo1.6, No. 1, pp.87-95.
[4] Alwan, L. C., and Roberts, H. V. (1995), "The Problem of Misplaced Control Limitsn,
Applied Statistics, 44, No.3, pp.269-278.
[5] Astrom, K. J ., and Wittenmark, B. (1 997), Cornputer-Controlled Systems: Theory and
Design, 3rd Ed., Upper Saddle River, New Jersey: Prentice Hall.
[6] Bagshaw, M., and Johnson, R. A. (1975), "The Effect of Serial Correlation on the Per-
formance of CUSUM Tests IIn, Technometrics, Vol. 17, No.1, pp.73-80.
[7] Basseville, M. (1988), &Detecting Changes in Signals and Systems - A Survey", Auto-
matica, Vo1.24, No.3, pp.309-326.
[B] Bassevitle, M., and Nikiforov, 1. V. (1993), Detection of Abrupt Changes: Theory and
Application, Englewood ClifXs, New Jersey: Prentice Hd.
(91 Beibel, M. (1996), "A Note on Ritov's Bayes Approach to the Minimax Property of the
CUSUM Procedure", The Annals of Statistics, Vo1.24, No.4, pp.1804-1812.
(101 Bert houex, P. M., Hunter, W. G., and Pallesen, L. (l978), UMoni toring Sewage Treatment
Plants: Some Quality Control Aspects", Journal of Quality Technology, Vol.iO, No.4,
pp. 139-149.
[ll] Box, G . E. P., Jenkins, G. M., and Reinsel, G. C. (1994), Tirne Sen'es Anolysis: Fore-
casting and Contml, 3rd Ed., Englewood Cliffs, New Jersey: Prentice Hall.
[12] Box, G., and Kramer, T. (1992), uStatistical Process Monitoring and Feedback Adjust-
ment - A Discussionn, Technornetrics, Vo1.34, No.3, pp.251-267.
[13] Chow, Y. S., and Robbins, H. (1963), "On Optimal Stopping Rules", 2. Wahrschein-
fichkeitstheorie, Vo1.2, No.1, pp.33-49.
[14] Cochlar, J., and Vrana, 1. (1978), "On the Optimal Sequential Test of Two Hypotheses
for Statistically Dependent Observationn, Kybernetika, Vol. 14, No. 1, pp.57-69.
[15] Crowder, S. V. (l992), UAn SPC Mode1 for Short Production Runs: Minimizing Expected
Cost", Technornetrics, Vo1.34, No.1, pp.64-73.
[16] Harris, T. J., and Ross, W. H. (1991), 'Statistical Process Control Procedures for Cor-
related Observationsn, The Canadian Journal of Chernical Engineering, Vo1.69, Feb.,
pp.4&57.
[17] Jensen, K. L., and Vardeman, S. B. (1993), "Optimal Adjustment in the Presence of De-
terministic Process Drift and Randorn Adjustment Error", Technometrics, Vo1.35, No.4,
pp.376-389.
[18] Johnson, R. A., and Bagshaw, M. (1974), "The Effect of Serial Correlation on the Per-
formance of CUSUM Tests", Technornetrics, Vo1.16, pp.103-112.
[19] Kailath, T., and Poor, H. V. (1998), "Detection of Stochastic Processes", IEEE Trans-
action on Information Theory, h1.44, No.6, pp.2230-2259.
[20] Kligene, N., and Telksnis, L. (1983), "Methods of Detecting Instants of Change of Ran-
dom Process Properties", Automation and Remote Control, Vo1.44, pp.1241-1283.
r2 11 Lai, T. L. (1995), "Sequential Changepoint Detection in Quality Control and Dynamical
Systemsn with Discussions, Journal of the Royal Statistical Society, Series B, 57, No.4,
pp.613-658.
[2%] Lehmann, E. L. (1959), Testing Statistical Hypotheses, John Wiley and Sons.
[23] Liptser, R. S., and Shiryaev, A. N. (1978), Statistics of Randorn Processes II: Applications,
New York, Heidelberg and Berlin: Springer-Verlag.
[24] Liu, Y ., and Blostein, S. D. (1992), "Optimality of Sequential Probability Ratio Test for
Nonstationary Observations", IEEE Transactions on In/ormation Theory, Vo1.38, No.1,
[25] Lorden, G. (1971), "Procedures for Reacting to a Change in Distribution", The Annals
of Mathematical Statistics, Vo1.42, No.6, pp.1897-1908.
[26] Lucas, J. M. (1982), "Combined Shewhart-CUSUM Quality Control Schemes" , Journal
of Quality Technology, Vol. 14, No.2, pp.51-59.
[27] Lucas, J. M., and Crosier, R. B. (1982), "Fast Initiai Response for CUSUM Quality Con-
trol Schemes: Cive Your CUSUM a Head Start", Technomettics, Vo1.24, No.3, pp.190-
205.
[28] MacGregor, J. F. (1988), "On-Line Statistical Process Control", Chernical Engineering
Progress, Oct., pp.21-31.
[29] Montgomery, D. C., Keats, J. B., Runger, G. C., and Messina, W., S. (1994), "Inte-
grating Statistical Process Control and Engineering Process Control" , Journal of Quality
Technology, Vo1.26, No.2, pp.79-87.
[a01 Moustakides, G. V. (1986), "Optimal Stopping Times for Detecting Changes in Distri-
butions", The Annals of Statistics, Vo1.14, No.4, pp. 1379-1387.
[31] Moustakides, G. V. (1998), "Quickest Detection of Abrupt Changes for a Class of Random
Processes", IEEE T~*unsacf ions on In formu tion Theory, Vo1.44, No.5, pp. 1965-1968.
[32] Page, E. S. (WM), "Continuous Inspection Scheme" , Biometrika, 41, pp.100-115.
[33] Press, W. H., Teukolsky, S. A., Vetterling, W. T., and Flannery, B. P. (1992), Numen'cul
Recipes in C: The Ar t of Scientific Computing, 2nd edition, Cambridge University Press.
[34] Ritov, Y. (1990), "Decision Theoretic Optimality of the CUSUM Proceduren, The Annals
of Statistics, Vo1.18, N0.3, pp.14641469.
[35] Shiryaev, A. N. (1961), "The Problem of the Most Rapid Detection of a Disturbance in
a Stationary Process", Soviet Mathematics, Vo1.2, No.3, pp.795-799.
107
[36] Shiryaev, A. N. (1963), "On Optimal Methods in Quickest Detection Problems", Theory
of Probability and its Applications, Vo1.8, No. 1, pp.22-46.
[37] Shiryaev, A. N. (1978), Optimal Stopping Rules, New York, Heidelberg and Berlin:
Springer-Verlag.
[38] Siegmund, D., and Venkatraman, E. S. (1995), "Using t h e Generalized Likelihood Ratio
Statistic for Sequential Detection of a Change-Pointn, The Annals of Statistics, Vo1.23,
No. 1, pp.255271.
[39] Tucker, W. T., Faltin, F. W., and Vander Wiel, S. A. (1993), "Algorithmic Statistical
Process Control: An Elaboration", Technometn'cs, Vo1.35, No.4, pp.363-375.
[40] Vander Wiel, S. A. (1996 a), "Monitoring Processes That Wander Using Integrated Mov-
ing Average Modelsn, Technornehics, Vo1.38, No.2, pp.139-151.
[41] Vander Wiel, S. A. (1996 b), "Optimal Discrete Adjustments for Short Production Rusn,
Probability in the Engineering and Infornational Sciences, 10, pp.119-151.
[42] Vasilopoulos, A. V., and Stamboulis, A. P. (1978), "Modification of Control Chart Limits
in the Presence of Data Correlation", Journal of Quality Technology, Vol.10, No.1, pp.20-
30.
[43] Wald, A. (1945), 'Sequential Tests of S tatistical Hypotheses", Annais of Mathematical
Statistics, Vo1.16, pp.117-186.
[44] Wald, A. (1947), Sequential Analysis, New York: John Willey and Sons.
[45] Wald, A., and Wolfowitz, J. (1948), "Optimal Character of the Sequential Probability
Ratio Testn, Annals of Mathematical Statistics, V01.19, pp.326-339.
108
[46] Wardell, D. G., Moskowitz, H., and Plante, R. D. (1992), "Control Charts in Presenceof
Data Correlation", Management Science, Vo1.38, No.8, Aug., pp. 1084-1 105.
[47] Wardell, D. G., Moskowitz, H., and Plante, R. D. (1994), "Run-Length Distribution of
Special-Cause Control Charts for Correlated Processes", Technometrics, Vo1.36, No.1,
pp.3-17.
[48] Westgard, G. O., Groth, T., Aronsson, T., and de Verdier C.-H. (1977), "Combined
S hewhart-CUSUM Control Chart for Improved Quality Control in Clinical Chemistry" ,
Clinical Chemistry, Vo1.23, No.10, pp.1881-1887.
[49] Willsky, A. S., and Jones, H. L. (1976), "A Generalized Likelihood Ratio Approach to the
Detection and Estimation of Jumps in Linear Systemsn, IEEE Transaction on Automatic
Control, Vo1.21, No.1, pp.108-112.
[50] Yakir, B. (1997), "A Note on Optimal Detection of a Change in Distributionsn, The
Annals of StatWics, Vo1.25, No.5, pp.2117-2126.