Predictability of Demographic Variables in the Short Run Author(s): Joop de Beer Source: European Journal of Population / Revue Européenne de Démographie, Vol. 4, No. 4 (Dec., 1988), pp. 283-296 Published by: Springer Stable URL: http://www.jstor.org/stable/20164488 . Accessed: 28/06/2014 10:33 Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at . http://www.jstor.org/page/info/about/policies/terms.jsp . JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship. For more information about JSTOR, please contact [email protected]. . Springer is collaborating with JSTOR to digitize, preserve and extend access to European Journal of Population / Revue Européenne de Démographie. http://www.jstor.org This content downloaded from 193.105.245.57 on Sat, 28 Jun 2014 10:33:52 AM All use subject to JSTOR Terms and Conditions


European Journal of Population 4 (1988) 283-296 North-Holland


PREDICTABILITY OF DEMOGRAPHIC VARIABLES IN THE SHORT RUN

Joop DE BEER *

Netherlands Central Bureau of Statistics, Voorburg, The Netherlands

Received January 1987, final version received January 1989

Abstract. In assessing the performance of population forecasts, it is useful to have a standard with which the forecast errors can be compared. Univariate time series models may provide such a standard. Mean forecast errors of time series models indicate to what extent the movement of a variable could have been predicted from its own past. These errors show the degree of predictability that is attainable, at least in a given period. In this paper three time series methods (exponential smoothing, the Box-Jenkins method, and structural time series models) are applied to Dutch data on births, deaths, marriages, immigrants, and emigrants. The variability of prediction errors between different periods is examined. The possibility that univariate predictions can be improved by using quarterly or monthly data instead of annual data is tested.

Résumé. Prévisibilité à court terme des variables démographiques

Lorsque l'on évalue les performances de perspectives démographiques, il est utile de disposer d'une méthode type à laquelle les erreurs de prévision peuvent être comparées. Cette méthode type peut être trouvée parmi les modèles qui projettent une seule variable démographique en fonction de son évolution passée. Les erreurs moyennes de prévision de ces modèles indiquent dans quelle mesure l'évolution passée d'une variable permet de prédire son évolution à venir. Ces erreurs montrent donc le degré de prévisibilité que l'on peut atteindre au cours d'une période donnée. Dans cet article, trois méthodes (lissage exponentiel, méthode de Box-Jenkins et modèles structurels) sont appliquées à des données hollandaises portant sur les nombres de naissances, de décès, de mariages, d'immigrants. La variabilité des erreurs de prévision entre les diverses périodes est examinée. La possibilité d'améliorer ces prévisions, en utilisant des données trimestrielles ou mensuelles est testée.

* The views expressed in this paper are those of the author and do not necessarily reflect the policies of the Netherlands Central Bureau of Statistics.

Author's address: Netherlands Central Bureau of Statistics, Dept. for Population Statistics, P.O. Box 959, 2270 AZ Voorburg, The Netherlands.

0168-6577/88/$3.50 © 1988, Elsevier Science Publishers B.V. (North-Holland)


1. Introduction

In the search for possible improvements in the methods used for population forecasting, it is important to examine the forecasting performance of these methods in the past. An assessment of this performance requires a standard with which the forecast errors in a particular period can be compared. Univariate time series models may provide such a standard. The average size of prediction errors of such models gives an indication of the degree of predictability of a particular variable in a given period. The official population forecast should aim at improving on this degree of predictability since this forecast can take account of additional information from all kinds of sources.

In order to assess the confidence level of a new forecast, one may examine errors of official population forecasts published in the past (see e.g. Keyfitz (1981), Stoto (1983)). One problem is, however, that successive forecasts are not fully comparable owing to changes in the methods used. Furthermore, official forecasts are available for a limited number of starting years only, which makes it difficult to eliminate random effects. As no explicit quantitative model is used in determining the future values of the parameters, it is not possible to make forecasts for intermediate years with hindsight. Univariate time series models, however, do provide this possibility.

This paper examines how accurately the movement of five demographic variables can be forecast up to three years ahead by means of time series models. The series examined refer to numbers of births, deaths, marriages, immigrants, and emigrants in the Netherlands. Three time series methods are used: the exponential smoothing method, the Box-Jenkins approach and the structural time series model. The models will be discussed briefly in section 2.

Section 3 gives an overview of average forecast errors in recent years based on series of annual numbers. In order to examine to what extent the size of the errors is sensitive to the choice of the forecast period, the average errors for an earlier period are discussed in section 4. Section 5 examines whether the use of monthly or quarterly data yields more accurate forecasts than does the use of annual data. Section 6 shows to what extent forecasts can be improved if observations in the first half of the first forecast year are already available. Finally, section 7 examines the possibility that combining the forecasts of different time series models leads to a decrease of average errors.


2. Univariate time series models

Univariate time series models provide forecasts of a variable on the basis of the movement of that variable in the past, without it being necessary to specify assumptions on future developments of other variables. Consequently these models are suitable for determining forecasts for past years with hindsight, as they use only information that was available at the time the forecast would have been made.

Exponential smoothing (ES) is one of the most widely applied forecasting methods. The method can be applied relatively simply, and its forecasting performance compares well with that of more sophisticated methods. Essentially the method adjusts the previous forecast in proportion to the forecast error. Gardner (1985) gives an overview of different specifications of the model. In this paper the forecasting power of three variants is examined: simple smoothing (assuming a constant trend level), the Holt-Winters method (assuming a linear trend) and the model for a damped trend.

One disadvantage of exponential smoothing is that the method is not based on a general theoretical model. The approach is primarily a pragmatic one. As a consequence it is not possible to decide, on purely statistical grounds, which model should be chosen for a given time series. Moreover, the method does not provide a statistical basis for assessing the degree of uncertainty of the forecasts.

In the Box-Jenkins approach, much attention is paid to the specification of an appropriate model for a given time series (Box and Jenkins (1970)). The model is chosen from the class of autoregressive integrated moving average (ARIMA) models. To a large extent, the application of ARIMA models has a 'black box' nature. Generally the model structure is not easy to interpret, especially when models with seasonal effects are concerned. As a consequence, model selection is not based on considerations regarding plausible forecast patterns.

An alternative to the Box-Jenkins approach is to take into account the form of the forecast function right from the start. Structural time series (ST) models are explicitly formulated in terms of trend and seasonal functions (Harvey (1984)). The unobservable components can be estimated by using the Kalman filter. As yet few research results are available on the relation between the specification of structural time series models and their predictive power. Gersch and Kitagawa (1983) and Kitagawa and Gersch (1984) apply some models to five monthly series, but it is not clear whether their results have a more general validity. In this paper the forecasting performance of the so-called basic model is examined. Harvey and Todd (1983) claim that this simple model is suitable for projecting the movement of various economic series. The pattern of the forecasts of the basic model resembles that of the Holt-Winters method. An important difference is, however, that the structural model is explicitly based on a statistical model, the parameters of which can be estimated using the maximum likelihood principle.
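To make the role of the Kalman filter concrete, the following Python sketch filters the simplest structural model, a local level model (a random-walk level observed with noise). It is a deliberately reduced stand-in for the basic model discussed above, which also contains slope and seasonal components. The function name, the fixed variances `var_obs` and `var_level`, and the large starting variance are assumptions; in practice the variances would be estimated by maximum likelihood.

```python
def local_level_filter(series, var_obs, var_level):
    """Kalman filter for y_t = level_t + noise, level_t = level_{t-1} + shock.

    Returns the filtered level after the last observation and its variance.
    The filtered level is the forecast for every future horizon.
    """
    level = float(series[0])
    variance = 1e7                                  # assumed near-diffuse start
    for y in series:
        innovation = y - level                      # one-step-ahead error
        innovation_var = variance + var_obs
        gain = variance / innovation_var            # Kalman gain, between 0 and 1
        level = level + gain * innovation           # update in proportion to the error
        variance = variance * (1.0 - gain) + var_level  # predict next period
    return level, variance
```

The update line has the same form as simple exponential smoothing, which illustrates why the forecast patterns of structural models and exponential smoothing resemble each other. With `var_obs = 0` the gain equals 1 and the filter reproduces the naive forecast.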

3. Forecast errors in recent years

In order to assess how accurately the numbers of births, deaths, marriages, immigrants and emigrants are predictable in the short run, forecasts are made for recent years using data from previous years only.

Usually absolute numbers of births, deaths, etc. are not forecast directly. The common practice is to project age-specific rates. However, a problem in applying time series models to the separate age-specific rates, apart from the large number of models that would have to be estimated, is that the projections taken together may not show a very plausible pattern. This problem can be avoided by using models capable of projecting these rates simultaneously (see e.g. Willekens and Baydar (1984), De Beer (1985)), but these models can be applied to annual data only. Another possibility is to project not the separate age-specific rates, but indicators such as the total fertility rate and life expectancy at birth. In the short run, the movement of such indicators will hardly deviate from that of total numbers of births and deaths. In this paper projections of total numbers are presented, as attention is paid to short-run forecasts only.

The mean absolute percentage error is used as the measure of forecast accuracy. This measure is easy to interpret. Moreover, the results for different series are comparable. Apart from this, the conclusions would not have been different if a measure based on squared errors had been used. For comparison, the mean errors of so-called naive predictions are given. The naive prediction is equal to the last observation prior to the forecast period.
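The accuracy measure and the naive benchmark can be written down directly. The sketch below is illustrative only: the function names are hypothetical and the numbers in the usage lines are made up, not taken from the Dutch series.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    relative_errors = [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(relative_errors) / len(relative_errors)


def naive_forecast(history, horizon):
    """Naive prediction: repeat the last observation for every horizon."""
    return [history[-1]] * horizon


# Made-up example: a series that ends at 180, followed by two new observations.
history = [170.0, 175.0, 180.0]
observed = [200.0, 150.0]
error = mape(observed, naive_forecast(history, 2))  # (10% + 20%) / 2
```

Because each error is divided by the observed value, the measure is free of scale, which is what makes results for series as different in size as births and immigrants comparable.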


Table 1
Mean absolute percentage error of forecasts of population variables based on annual data, the Netherlands, 1979-1984.

             Forecast   Exponential smoothing          ARIMA    Structural   Naive
             horizon    Constant   Linear    Damped    model    model
             in years   level      trend     trend

Births       1           2.2        3.8       2.9       3.2      2.7          2.1
             2           3.5        6.8       4.9       5.2      5.2          3.4
             3           2.9        5.8       4.0       3.8      4.5          3.1
Deaths       1           1.3        1.1       1.1       0.9      0.8          1.3
             2           1.9        1.6       1.7       0.7      0.7          1.9
             3           2.9        2.6       2.8       1.1      1.0          2.9
Marriages    1           4.6        4.4       4.5       3.4      4.3          4.5
             2           4.2        6.5       5.6       4.6      6.5          4.1
             3           6.4        8.3       6.9       4.5      8.5          6.6
Immigrants   1          21.0       27.7      25.9      13.7     27.1         13.7
             2          30.1       38.3      38.3      27.1     38.7         27.1
             3          35.3       51.7      51.7      36.7     50.8         36.7
Emigrants    1           5.1        5.1       5.1       3.6      6.1          4.8
             2           7.4        7.4       7.4       4.1      9.9          7.6
             3           5.7        5.7       5.7       5.0      7.1          5.9

In table 1 the mean forecast errors are given for the years 1979-1984. The forecasts are based on annual numbers from 1951 up to and including 1978, 1979, ..., 1983 respectively. Forecasts are calculated for 1, 2 and 3 years ahead.
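The evaluation scheme described here (re-estimating on data up to each successive base year and scoring forecasts 1 to 3 years ahead) can be sketched as follows. The function and argument names are hypothetical, and any forecasting function with the signature `forecaster(history, horizon)` can be plugged in. Skipping target years beyond the last observed year is an assumption about how the averages were formed, and the series in the usage lines is synthetic.

```python
def rolling_origin_mape(values, start_year, origins, max_horizon, last_year, forecaster):
    """Mean absolute percentage error per horizon over several base years.

    values      annual observations, values[0] belonging to start_year
    origins     last year of data used for each forecast (e.g. 1978..1983)
    last_year   last year for which an observation is available
    """
    errors = {h: [] for h in range(1, max_horizon + 1)}
    for origin in origins:
        history = values[: origin - start_year + 1]   # data up to and including origin
        forecasts = forecaster(history, max_horizon)
        for h in range(1, max_horizon + 1):
            target = origin + h
            if target > last_year:                    # no observation to compare with
                break
            actual = values[target - start_year]
            errors[h].append(100.0 * abs(actual - forecasts[h - 1]) / abs(actual))
    return {h: sum(e) / len(e) for h, e in errors.items() if e}


def naive_forecaster(history, horizon):
    """Repeat the last observation (the naive benchmark)."""
    return [history[-1]] * horizon


# Synthetic, steadily rising series for 1951-1984 (not the Dutch data).
synthetic = [100.0 + i for i in range(34)]
result = rolling_origin_mape(synthetic, 1951, range(1978, 1984), 3, 1984, naive_forecaster)
```

Under this scheme the one-year-ahead average rests on six forecasts, the two-year-ahead average on five and the three-year-ahead average on four.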

Since the exponential smoothing method does not include an objective identification procedure, three variants of the method are applied to all series.

From table 1 it appears that the series can be divided into three categories: births and deaths with an average forecast error of 2% to 3%; marriages and emigrants with an error of 4% to 5%; and immigrants with an error of well over 10%. In general, the size of the errors increases with the length of the forecast interval. There are however some exceptions caused by the fact that observations may fluctuate during the forecast period instead of increasing or decreasing monotonically.

None of the models examined can forecast the number of births more accurately than the naive forecasts during the period 1979-1984. The models tend to project too strong a trend after some years of increase or decrease whereas in fact the number of births fluctuates around a virtually constant level. The number of deaths increased monotonically from 1980 onwards. Both the ARIMA and ST models project this trend rather accurately: the forecast error is only about 0.5%. The number of marriages in the Netherlands has decreased strongly with interruptions in some years. The ARIMA model picks up the trend rather quickly after each interruption. Both the ES and ST models tend to project too strong a trend after some years of uninterrupted decline, causing relatively large errors after interruptions. The number of immigrants shows large fluctuations from year to year. The autocorrelation pattern indicates that the series behaves as a 'random walk'. Hence the ARIMA forecasts are equal to the naive forecasts. Emigration fluctuates around a practically constant level. The ARIMA projections depend heavily on the mean of the series, which results in reasonably accurate forecasts.

In short, the results indicate that none of the methods examined yields the best forecasts for each series and for each forecast horizon. In order to assess which method performs best on average, the mean errors for the various series might be calculated. Such a measure would, however, be strongly affected by the results for the series with the largest fluctuations, since that series would usually yield the largest differences between methods. This problem can be avoided by relating the mean errors of the separate methods for each forecast horizon to the mean errors of the naive forecasts. If the average of these quotients, which we shall call prediction coefficients, is smaller than 1, the forecasts are on average better than the naive ones. For the ARIMA model the prediction coefficient is 0.94 at a forecast horizon of 1 year, 0.91 at 2 years and 0.83 at 3 years. For the other methods the coefficients are greater than 1.
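As a check on the definition, the prediction coefficients of the ARIMA model can be recomputed from the mean errors in table 1. The sketch below hard-codes the ARIMA and naive columns of that table; the variable and function names are illustrative, not the author's.

```python
# Mean absolute percentage errors from table 1 for horizons of 1, 2 and 3 years.
ARIMA = {
    "births": [3.2, 5.2, 3.8],
    "deaths": [0.9, 0.7, 1.1],
    "marriages": [3.4, 4.6, 4.5],
    "immigrants": [13.7, 27.1, 36.7],
    "emigrants": [3.6, 4.1, 5.0],
}
NAIVE = {
    "births": [2.1, 3.4, 3.1],
    "deaths": [1.3, 1.9, 2.9],
    "marriages": [4.5, 4.1, 6.6],
    "immigrants": [13.7, 27.1, 36.7],
    "emigrants": [4.8, 7.6, 5.9],
}


def prediction_coefficients(method, naive):
    """Average, per horizon, of the ratio of method error to naive error."""
    n_horizons = len(next(iter(method.values())))
    coefficients = []
    for h in range(n_horizons):
        ratios = [method[series][h] / naive[series][h] for series in method]
        coefficients.append(sum(ratios) / len(ratios))
    return coefficients


coefficients = prediction_coefficients(ARIMA, NAIVE)
print([round(c, 2) for c in coefficients])  # [0.94, 0.91, 0.83]
```

Averaging the ratios rather than the raw errors gives each of the five series the same weight, so the large immigration errors no longer dominate the comparison.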

Another criterion for evaluating the forecasting performance of a method is the percentage of forecasts that are better than the corresponding naive forecasts. Both the ARIMA projections and the ES projections with damped trend are closer than the naive forecasts to the observations in about half of the cases. For the other methods the percentage is about 40%. If births and immigrants, which cannot be forecast accurately, are left aside, 70% of the ARIMA projections are better than the naive ones.


4. Forecast errors in the period 1970-1975

As the average size of forecast errors depends on the choice of the forecast period, it is important to check whether the conclusions obtained for recent years apply to other periods also. In the first half of the 1970s, the movement of most of the series examined was different from that in the 1980s. In the earlier period births declined sharply, deaths fluctuated, marriages reached a turning point and immigration rose to a record height. In table 2 the average errors are given for the period 1970-1975.

Between 1970 and 1975 the annual number of births declined by almost 30%. The forecast errors are on average larger than in 1979-1984. The ARIMA projections appear to react rather slowly to the declining trend, whereas the ES projections with linear trend as well as the ST projections overestimate the decrease after some years. In contrast to the monotonic increase in recent years, the number of deaths fluctuated in the period 1970-1975. Hence the forecast errors are somewhat larger. In 1970 the rise in the number of marriages from 1960 turned into a sharp decline. Both the ES and ST methods react rather slowly to this change. The ARIMA projections are equal to the naive ones. As in recent years, immigration fluctuated strongly in the years 1970-1975, but the errors are somewhat smaller.

Table 2
Mean absolute percentage error of forecasts of population variables based on annual data, the Netherlands, 1970-1975.

             Forecast   Exponential smoothing          ARIMA    Structural   Naive
             horizon    Constant   Linear    Damped    model    model
             in years   level      trend     trend

Births       1           5.1        3.6       3.1       5.3      3.9          5.7
             2          10.5        8.3       6.6       9.9      9.6         11.1
             3          15.7       15.5      10.7      15.1     15.7         16.4
Deaths       1           2.2        2.4       2.2       1.9      2.1          2.2
             2           3.0        4.1       3.4       2.2      2.8          2.9
             3           1.9        5.3       3.8       2.2      3.6          1.9
Marriages    1           5.5        5.8       6.0       5.1      5.7          5.1
             2           8.3        9.1       8.3       8.5      7.5          8.5
             3          11.9       15.3      13.1      12.1     12.5         12.1
Immigrants   1          14.8       11.6      11.6      12.9     12.2         12.9
             2          14.1       12.9      12.9      17.2     11.8         17.2
             3          10.7       12.0      12.0      10.8     10.4         10.8
Emigrants    1           4.5        4.5       4.6       4.1      4.5          4.5
             2           6.5        6.5       6.8       4.9      6.5          6.5
             3           6.4        6.4       6.8       4.5      6.4          6.4

On average the errors of the ARIMA model are smallest during the first half of the 1970s, as they are for the recent years. The prediction coefficient, which relates the average errors to those of the naive forecasts, is equal to 0.93 for the ARIMA model; for the ES variant with constant level and the ST model the coefficient is equal to 0.99. About 35% of the ARIMA projections are better than the naive forecasts, compared with 45% of the ES projections with damped trend. It should however be noted that the ARIMA projections and the ES projections for marriages and immigration are equal to the naive forecasts. For the other series, 60% of the ARIMA forecasts are better than the naive forecasts.

In summary, it can be concluded that births, deaths and marriages are slightly more predictable for recent years than for the first half of the 1970s, whereas the opposite is true for immigration. Apart from births, the errors of the one-year-ahead forecasts have about the same order of magnitude in both periods.

5. Monthly, quarterly or yearly data?

In projecting annual total numbers, there is, of course, no need to restrict the analysis to annual data. In a number of cases monthly or quarterly data may yield more accurate forecasts than do annual data. For example, when a rising trend turns into a decrease during the last months of a calendar year, the year total may still show an increase compared with the previous years. In such a case an increase would be forecast for the next year on the basis of annual data, whereas monthly or quarterly data might have led to the forecast of a decrease. On the other hand, when monthly or quarterly data are used, transient changes may be wrongly projected into the future as if the long-run direction of the trend had changed.

Because many demographic monthly or quarterly time series show seasonality, the seasonal pattern has to be taken into account when specifying a model in order to estimate the trend correctly. All three methods discussed offer this possibility. As for the ES method, the variant based on a constant level fits monthly and quarterly observations better than either of the alternative specifications. Hence the errors of this variant only are given. Table 3 gives the average errors of forecasts based on models estimated for monthly and quarterly observations respectively from 1970 up to and including 1978, 1979, ..., 1983.

Table 3
Mean absolute percentage error of forecasts of population variables based on monthly and quarterly data, the Netherlands, 1979-1984.

             Forecast   Exponential smoothing ARIMA model           Structural model
             horizon
             in years   Monthly   Quarterly   Monthly   Quarterly   Monthly   Quarterly

Births       1           1.8       1.9         2.1       2.2         3.2       2.5
             2           3.5       3.5         4.7       5.3         5.0       5.0
             3           3.6       4.2         6.3       7.0         5.2       5.3
Deaths       1           2.1       0.6         1.6       1.3         2.0       0.9
             2           3.1       1.2         2.3       1.5         3.6       1.4
             3           4.0       2.1         3.1       2.0         6.5       1.8
Marriages    1           4.0       5.0         3.8       3.9         4.3       5.4
             2           3.6       4.2         3.9       3.9         6.2      10.5
             3           5.8       5.9         3.7       3.7         9.2      11.4
Immigrants   1          11.5      11.0        11.1      11.7        16.5      11.2
             2          20.3      23.9        25.2      27.1        40.9      26.1
             3          30.7      31.0        38.9      35.1        68.6      40.1
Emigrants    1           4.4       4.2         4.4       4.4         5.2       6.0
             2           8.0       8.6         5.3       5.2        11.5      14.1
             3           6.0       5.9         4.8       4.8        [...]     13.1

Forecasts of live births based on monthly or quarterly data are better on average than projections using annual data, at least at a forecast horizon of one year. However, only the ES projections are better than the naive forecasts. Quarterly data yield better forecasts of deaths than do monthly data, the reason being that irregular fluctuations on a monthly basis are relatively large compared with the magnitude of the trend. Only the ES forecasts based on quarterly data are better than forecasts using annual data. Marriages are best projected by the ARIMA model, with little difference between forecasts based on monthly, quarterly and yearly data. For all models, using monthly or quarterly data clearly improves forecasts of immigration, in contrast with emigration.

In summary, monthly and quarterly data yield some improvement of forecasts one year ahead compared with annual data. There is, however, no improvement at a forecast horizon of two or three years. In general, there are only small differences between results for monthly and quarterly data.

6. Forecasts half-way through the year

The starting point of the national population forecast by the Netherlands Central Bureau of Statistics is the distribution of the population according to sex, age and marital status on January 1st of the first forecast year. This information becomes available in the second half of that year. At that moment observations of births, deaths, etc. in the first months of the same year are already available. In order to assess to what extent this information can reduce forecast errors, the average errors of forecasts starting from July 1st are given in table 4. The naive forecasts are equal to the observations in the second half of the preceding year. An alternative to the use of this naive forecast is to use the seasonally adjusted numbers in the first half of the year as a forecast for the second half. The Census X-11 method is used for calculating the seasonally adjusted numbers. The two naive forecasts are labelled Naive-1 and Naive-2 respectively.

Table 4
Mean absolute percentage error of forecasts of population variables starting from July 1st in the first forecast year, based on monthly and quarterly data, the Netherlands, 1979-1984.

             Forecast   Exponential smoothing ARIMA model           Structural model      Naive-1   Naive-2
             horizon
             in years   Monthly   Quarterly   Monthly   Quarterly   Monthly   Quarterly

Births       1/2         1.2       1.2         1.3       1.1         3.6       1.2         1.3       1.2
             1 1/2       2.9       2.4         3.6       3.9         3.3       3.6         3.3       2.7
             2 1/2       3.0       2.7         4.6       5.3         3.1       5.4         3.1       2.6
Deaths       1/2         1.1       0.6         0.6       0.7         1.2       0.5         0.5       0.6
             1 1/2       2.6       1.7         2.1       1.5         1.5       1.4         1.7       1.9
             2 1/2       3.5       2.8         3.0       2.1         2.8       2.1         2.8       3.1
Marriages    1/2         1.2       1.1         1.4       1.3         1.4       1.8         1.7       1.9
             1 1/2       5.2       5.3         3.6       3.6         7.3      10.3         4.4       6.1
             2 1/2       5.8       5.8         4.5       3.3         7.1      12.4         5.9       6.8
Immigrants   1/2         5.3       6.8         4.9       8.1         4.8       5.5         9.7       4.3
             1 1/2      17.3      19.9        17.3      25.6        15.6      19.4        24.8      18.6
             2 1/2      28.7      33.4        26.5      24.9        44.6      34.6        30.1      33.4
Emigrants    1/2         2.2       2.0         2.4       2.6         1.5       1.2         2.9       1.3
             1 1/2       6.8       6.2         5.2       5.2         7.0       6.5         7.3       6.4
             2 1/2       7.2       7.7         5.3       5.2        11.0      15.7         6.7       8.1
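The two naive benchmarks for the second half of the year can be illustrated with a small Python sketch. Note the assumptions: the paper uses the Census X-11 method for seasonal adjustment, whereas the sketch substitutes a much cruder adjustment based on the average first-half share of the annual total in earlier years. The function names and the example figures are hypothetical.

```python
def naive_1(previous_year_monthly):
    """Naive-1: observed total for July-December of the preceding year."""
    return sum(previous_year_monthly[6:])


def first_half_share(past_years_monthly):
    """Average share of the annual total that falls in January-June."""
    shares = [sum(year[:6]) / sum(year) for year in past_years_monthly]
    return sum(shares) / len(shares)


def naive_2(first_half_total, past_years_monthly):
    """Naive-2 (crude stand-in for X-11): seasonally adjusted first-half total.

    The seasonal factor of the first half is its average share divided by 0.5;
    dividing by it removes an assumed stable half-year seasonal pattern.
    """
    seasonal_factor = first_half_share(past_years_monthly) / 0.5
    return first_half_total / seasonal_factor


# Made-up monthly counts: 60 in the first half of the year, 40 in the second.
past_year = [10, 10, 10, 10, 10, 10, 5, 5, 10, 10, 5, 5]
forecast_1 = naive_1(past_year)                       # second half of last year
forecast_2 = naive_2(66, [past_year, past_year])      # adjusted first half of this year
```

Naive-1 looks back a full half year further than Naive-2, so it reacts later when the level of a series shifts quickly.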

Reducing the forecast interval by half a year clearly leads to a decrease in the average forecast errors of births, though by less than 50%. However, this does not imply that every forecast will improve. If the movement of a variable during the first half of the first forecast year deviates from the trend in the preceding year, the forecast starting from July 1st may be inferior to the forecast starting from January 1st. Forecasts of deaths show some improvements in the first forecast year but hardly so in subsequent years. The errors in forecasts of marriages decrease by about two-thirds in the first year. For immigration there is a considerable difference between the two naive forecasts. Owing to the relatively rapid changes in immigration, the lag of half a year of Naive-1 compared with Naive-2 forecasts leads to a substantial increase in errors. The errors in forecasts of emigration also decrease clearly in the first forecast year but hardly in successive years.

On average the errors of forecasts starting from July 1st are significantly smaller than those of forecasts starting from January 1st, at least in the first forecast year. In the second and third forecast year, there is not much improvement, if any at all.

7. Combination of forecasts

The results indicate that no single method performs better than all other methods for all series in all years. Hence it would seem useful to examine whether a combination of the separate forecasts could improve the predictive power of time series models. Makridakis et al. (1982) show that simply calculating the arithmetic average of the forecasts performs better than a weighted average with the weights being determined by the variances and covariances of the residuals of the various models.
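The simple combination referred to here is the unweighted mean of the separate forecasts at each horizon. A minimal sketch, with a hypothetical function name and made-up forecast values:

```python
def combine_forecasts(forecasts_by_method):
    """Arithmetic average of several forecasts, horizon by horizon."""
    methods = list(forecasts_by_method.values())
    horizon = len(methods[0])
    return [sum(m[h] for m in methods) / len(methods) for h in range(horizon)]


# Made-up forecasts for three horizons from four methods.
example = {
    "ES": [100.0, 102.0, 104.0],
    "ARIMA": [98.0, 97.0, 96.0],
    "ST": [101.0, 103.0, 106.0],
    "Naive-2": [99.0, 99.0, 99.0],
}
combined = combine_forecasts(example)
```

When the separate methods err in opposite directions, as the ES and ARIMA values do in this made-up example, the errors partly cancel in the average, which is the intuition behind the reduced risk of large errors.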

Table 5 shows that, on average, the errors of the combination of the ES, ARIMA and ST models and the Naive-2 method are smaller than the errors of the separate methods, at least at a forecast horizon of half a year. This is true for both monthly and quarterly data. There is, however, no clear improvement at forecast horizons of 1 1/2 or 2 1/2 years. Besides reducing the size of average errors, a combination of forecasts may also reduce the risk of large forecast errors. At a forecast horizon of half a year, the largest errors of the combined forecasts are indeed smaller than the largest errors of the best separate method for each series.

As the improvement in forecasting performance is not that great, similar results are likely to be obtained by combining only two methods. In table 5 the average errors of the combination of ARIMA and Naive-2 forecasts are given. The results differ only slightly from those obtained using the combination of four methods.

Table 5
Mean absolute percentage error of combinations of forecasts, based on monthly and quarterly data, the Netherlands, 1979-1984.

             Forecast   Exponential smoothing + ARIMA model     ARIMA model + Naive-2
             horizon    + Structural model + Naive-2
             in years   Monthly   Quarterly                     Monthly   Quarterly

Births       1/2         1.5       1.0                           1.2       1.1
             1 1/2       2.8       3.0                           3.0       3.1
             2 1/2       3.3       4.0                           3.5       2.9
Deaths       1/2         0.2       0.5                           0.5       0.6
             1 1/2       1.8       1.6                           2.0       1.7
             2 1/2       3.0       2.5                           3.0       2.6
Marriages    1/2         1.1       1.0                           1.1       1.1
             1 1/2       5.4       5.8                           4.8       4.8
             2 1/2       5.3       6.3                           4.7       4.8
Immigrants   1/2         4.6       5.8                           4.6       5.7
             1 1/2      15.0      20.8                          17.6      22.1
             2 1/2      27.5      34.1                          29.6      34.2
Emigrants    1/2         1.9       1.7                           1.9       1.9
             1 1/2       6.1       5.5                           5.8       5.2
             2 1/2       7.8       9.1                           6.5       6.5


8. Conclusion

Average forecast errors of univariate time series models can give an

indication of the extent to which the future movements of the numbers

of births, deaths, marriages, immigrants, and emigrants is predictable. The errors can be used both as standard for judging forecasts made in the past and as an indication of the degree of uncertainty inherent in

new forecasts.

Based on annual numbers, deaths can be projected rather accurately 1 to 3 years ahead (average error about 1% in 1979-1984), followed by births (2% to 3%), marriages and emigrants (3% to 5%) and finally immigrants (well over 10%). Apart from immigration, the errors do not increase strongly if the forecast interval increases from 1 to 3 years. On average, ARIMA models yield more accurate forecasts than do either the exponential smoothing method or the structural time series model, but the differences are not great. In order to assess to what extent the results depend on the choice of the period, forecasts were also calculated for the first half of the 1970s. Apart from births, the size of the errors does not differ very much between then and recent years. For that period too, ARIMA models provide the best forecasts on average.

When monthly or quarterly data are used instead of annual numbers, the forecasting performance of time series models improves somewhat at a horizon of 1 year. Average errors of forecasts 2 or 3 years ahead, however, increase slightly. Generally, it does not make much difference whether monthly or quarterly data are used. If a population forecast is made in the course of the first forecast year, it is possible to make use of observations in the first part of that year. This will in general reduce the errors for the first year, although occasionally errors may increase as a result of transient changes. For the second and third year in the forecast period, an additional half year of observations does not lead to a reduction of errors in most cases. The combination of forecasts of various methods results in a slight decrease of average errors.

The average forecast errors presented in this paper can be used as a standard for judging the performance of official population forecasts. The average errors indicate to what extent the movement of demographic variables is predictable, without taking into account any explanatory theory or expert knowledge. Official forecasts are to be regarded as satisfactory only to the extent that they perform better than this.
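As a rough sketch of how such a standard might be applied, the average percentage error of an official forecast can be compared with the average error of a time series benchmark for the same variable and horizon. The function name and all figures below are hypothetical and serve only to illustrate the comparison; they are not results from this paper.

```python
def beats_benchmark(official_errors, benchmark_error):
    """Compare the mean absolute percentage error of official forecasts
    with a time series benchmark error (both in percent).

    Returns a pair: (True if the official forecasts did better, their mean error).
    """
    mean_error = sum(abs(e) for e in official_errors) / len(official_errors)
    return mean_error < benchmark_error, mean_error


# Hypothetical percentage errors of official forecasts for one variable.
official_errors = [1.8, -2.1, 2.4]
# Hypothetical benchmark: average error of a univariate time series model.
benchmark_error = 2.5

better, mean_error = beats_benchmark(official_errors, benchmark_error)
print(f"official: {mean_error:.1f}%, benchmark: {benchmark_error:.1f}%, better: {better}")
```

A fair comparison of this kind requires that both sets of errors refer to the same variable, the same forecast horizon and the same period.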


References

Box, G.E.P. and G.M. Jenkins, 1970, Time series analysis: Forecasting and control (Holden-Day, San Francisco, CA).

De Beer, J.A.A., 1985, A time series model for cohort data, Journal of the American Statistical Association 80, 625-630.

Gardner, E.S., 1985, Exponential smoothing: The state of the art, Journal of Forecasting 4, 1-28.

Gersch, W. and G. Kitagawa, 1983, The prediction of time series with trends and seasonalities, Journal of Business and Economic Statistics 1, 253-264.

Harvey, A.C., 1984, A unified view of statistical forecasting procedures, Journal of Forecasting 3, 245-275.

Harvey, A.C. and P.H.J. Todd, 1983, Forecasting economic time series with structural and Box-Jenkins models: A case study, Journal of Business and Economic Statistics 1, 299-307.

Keyfitz, N., 1981, The limits of population forecasting, Population and Development Review 7, 579-593.

Kitagawa, G. and W. Gersch, 1984, A smoothness priors-state space modeling of time series with trend and seasonality, Journal of the American Statistical Association 79, 378-389.

Makridakis, S., A. Andersen, R. Carbone, R. Fildes, M. Hibon, R. Lewandowski, J. Newton, E. Parzen and R. Winkler, 1982, The accuracy of extrapolation (time series) methods: Results of a forecasting competition, Journal of Forecasting 1, 111-153.

Stoto, M.A., 1983, The accuracy of population projections, Journal of the American Statistical Association 78, 13-20.

Willekens, F. and N. Baydar, 1984, Age-period-cohort models for forecasting fertility (Netherlands Interuniversity Demographic Institute, Voorburg).
