English Life Tables No. 17 Methodology

English Life Tables No. 17 Methodology

Jakub Bijak1,3,4, Erengul Dodd2,3,4, Jonathan J. Forster ⇤2,3,4, and Peter W. F. Smith1,3,4

1Social Sciences2Mathematical Sciences

3Southampton Statistical Sciences Research Institute (S3RI)4ESRC Centre for Population Change

University of SouthamptonSouthampton, SO17 1BJ

August, 2015

⇤Corresponding author. Address: Southampton Statistical Sciences Research Institute, University ofSouthampton, Highfield, Southampton, SO17 1BJ, UK. Tel: +44 (0)23 80595130. Email: [email protected].

English Life Tables No. 17 Methodology

The decennial life tables, known as the English Life Tables, are an o�cial publication whichhas been produced after every decennial census since 1841 (with the exception of 1941 whenno census was carried out). The life tables are designed to provide a snapshot of the mortalityexperience in England and Wales, by age and sex in the three-year period around the censusyear.

These English Life Tables (ELT17) were prepared by Jakub Bijak, Erengul Dodd, JonathanJ. Forster and Peter W.F. Smith of the University of Southampton. We produced these atthe invitation of the O�ce for National Statistics. The tables in the accompanying paper aregraduated (smoothed) estimates of central mortality rates, by age and sex, based on mortalitydata for England and Wales over the years 2010-12. The tables are also available in spreadsheetformat on the ONS website.

In the accompanying paper we describe the development of the statistical methodology used toproduce ELT17. Crude mortality rates are smoothed, or graduated, using a combination of ageneralised additive model and low-dimensional parametric models. The approach to graduationacknowledges uncertainty, particularly in the highest age groups, by model-averaging, using asimplified version of a full Bayesian analysis.

Smoothing mortality data; producing the English Life Table,

2010-12

Jakub Bijak

Erengul Dodd

Jonathan J. Forster

⇤

Peter W. F. Smith

University of Southampton, UK

Abstract

In this paper we describe the statistical methodology used to produce the English Life Ta-ble, ELT17, covering the period 2010-12. Crude mortality rates are smoothed, or graduated,using a combination of a generalised additive model and low-dimensional parametric mod-els. The approach to graduation acknowledges uncertainty, particularly in the highest agegroups, by model-averaging, using a simplified version of a full Bayesian analysis.

Keywords: Bayesian analysis; Generalised additive model; Graduation; Model-averaging

1 Introduction and Review

The decennial life tables, known as the English Life Tables, are an o�cial publication whichhas been produced after every decennial census since 1841 (with the exception of 1941 whenno census was carried out). The life tables are designed to provide a snapshot of the mortalityexperience in England and Wales, by age and sex in the three-year period around the censusyear. In this paper, we describe the statistical methodology underlying the production of ELT17,the 17th version of the English Life Tables. Suppose that y

x

is the number of deaths in Englandand Wales (either male or female) observed aged x at last birthday (equivalently, age at deathin [x, x+1)) in a given time period T . The age x is restricted to be an integer. Let E

x+t

denotethe total exposure of (male or female) individuals exact age x + t, where 0 t < 1 within T .Then the classic central exposed to risk for integer age x is given by

E

C

x

=Z

1

0

E

x+t

dt.

Where T is a calendar year, it is common to approximate E

C

x

by the mid-year exposure, that isthe number of individuals aged in [x, x+ 1) at the mid-point of T .

The observed (or empirical) central mortality rate m̃

x

is given by

m̃

x

=y

x

E

C

x

.

⇤Corresponding author. Address: Southampton Statistical Sciences Research Institute, University ofSouthampton, Highfield, Southampton, SO17 1BJ, UK. Tel: +44 (0)23 80595130. Email: [email protected].

1

This is a statistic which can be thought of as a raw estimator of the underlying central mortalityrate

m

x

=E[Y

x

]

E

C

x

. (1)

The life tables are based on graduated (smoothed) estimates of mx

, or related functions. Grad-uation is carried out because the crude estimates m̃

x

typically fluctuate due to the naturalvariability in the mortality process within the population at risk. This can lead to undesirablefeatures, such as mortality estimates failing to follow a monotonically increasing pattern overmiddle and old ages. Graduation is particularly important at the oldest ages, where exposurenumbers are small and data are sparse.

Over the history of production of the decennial life tables, di↵erent methods have been usedfor graduation, as statistical methodology has become more sophisticated and computationmore straightforward. Typically, infant mortality has been estimated separately (a feature wemaintain here). Most recently, for ELT13 through to ELT16, a spline-based method has beenused to graduate over most of the range of x (age), with various di↵erent methods used toextrapolate into the highest ages, where data are sparse, or non-existent. For example, forELT14 and ELT16, a variable knot cubic spline was fitted, with knot positions determinedaccording to some optimality criterion. For ELT15, a smoothing spline approach was adopted.For extrapolating to the highest ages, the methods used have often required arguably ad hocassumptions about mortality rates at particular ages; for example, ELT16 assumes m

120

= 2(see ONS (2009)). One of the main aims in developing the graduation for ELT17 was to avoidhaving to make such assumptions. A detailed review of methods used in graduating mortalityrates at old ages in previous ELTs can be found in Gallop (2002).

Smoothing has become part of the standard statistical toolbox, and the facility to include smoothfunctions of covariates into a regression model is included in standard software. In particular,the generalised additive model framework allows smooth functions of one or more explanatoryvariables to be included in a regression model. A smooth term in a regression model is typicallyincluded as a cubic spline with a large number of knots. Each knot requires a model parameterand there can be as many as one per observed value of the covariate. Smoothness is then enforcedby maximising a penalised likelihood where the penalty is a function of the roughness of theresulting spline regression function, typically integrated squared second derivative. The degreeof penalisation controls the smoothness of the resulting fit, and this is usually determined bygeneralised cross-validation (Wood, 2015).

Given the widespread adoption of this approach for smooth function estimation, we view it as anatural approach for the graduation of the life table. In developing a model, our focus is on howto manage the transition between the age range where a smoothing spline works well, and theoldest ages, where the sparsity of the data results in the smoothing spline fit being less reliable,leading us to consider other model-based methods.

In Section 2, we present the raw mortality data for England and Wales for 2010-2012, andillustrate the smoothing spline graduation. In Section 3, we introduce our methodology formortality estimation at older ages, and how a smooth transition from the smoothing spline toan old-age model is attained. In Section 4, we present our data analysis and the final life tables,together with our conclusions, are in Section 5.

2 Exploratory Data Analysis

The crude central mortality rates (on a logarithmic scale) are presented in Figure 1. Severalfeatures are immediately apparent. As would be expected, male mortality rates are higher

2

than female mortality rates throughout the age range, but with some convergence at olderages. Mortality decreases throughout the first few years of life, with a particularly steep dropbetween ages 0 and 1. From about age 10 onwards mortality increases, with a particularly steepincrease in late teenage years, particularly pronounced for males, and attributable to a higherrate of death from accidents (sometimes referred to as the ‘accident hump’). Compared withthe variability across ages, the variability between crude mortality rates across the three yearsof observation is much less. Although we expect mortality rates to drift downwards over time,this drift is su�ciently small as to only be weakly visible across a three year time span. For thepurposes of this paper, we choose to ignore this e↵ect. The final feature of note, of particularinterest here, is the fact that crude mortality rates exhibit much greater variability in the highest(over 100) age groups. This presents issues for estimation and extrapolation which will be thesubject of this paper.

0 20 40 60 80 100

Age

Mor

talit

y ra

te

10−4

10−3

10−2

10−1

1

Males − 2010Males − 2011Males − 2012Females − 2010Females − 2011Females − 2012

Figure 1: Crude central mortality rates for England and Wales (2010-2012)

Following previous practice, the crude mortality rate is used to estimate mortality at age 0.Therefore, we focus on estimating mortality rates m

x

for ages x > 0. As discussed in Section 1,we consider a cubic smoothing spline, fitted as a generalised additive model (GAM) in standard

3

software (we used the mgcv package in R; Wood (2015)) to be a straightforward and naturalapproach for graduation. The full model is described in Section 3. Figure 2 presents theestimated mortality function m̂

x

obtained by fitting a generalised additive model incorporatinga cubic smoothing spline for age x, penalised by integrated squared second derivative.

0 20 40 60 80 100

Age

Mor

talit

y ra

te

10−4

10−3

10−2

10−1

1

Males − 2010Males − 2011Males − 2012Males − GAMFemales − 2010Females − 2011Females − 2012Females − GAM

Figure 2: Crude central mortality rates for England and Wales (2010-2012), together with ratesestimated by a penalised cubic smoothing spline, fitted as a generalised additive model (GAM)

Over the majority of the age range, the estimated m̂

x

function looks to be a reasonably smoothfunction which, at the same time, adheres acceptably well to the crude mortality rates – itpasses the ‘eyeball’ test. At the very highest ages, there are reasons to be less satisfied. Firstly,the sparsity of the data (low exposures, and correspondingly low numbers of observed deaths)mean that the uncertainty about m

x

is much greater at high ages. The smoothing spline fitdoes account for this, but any spline fit is largely determined locally, by observed mortality ratesat neighbouring ages all of which are prone to high variability at the highest ages. Second, itis noticeable that the estimated female mortality exceeds male mortality beyond age 107, andwhile this possibility cannot be ruled out, it is likely that this is simply just due to the lackof available data to accurately estimate mortality at these ages. A further issue with using

4

these fits for life table estimation is that such estimation requires us to extrapolate the estimateof the mortality rate function to ages beyond the extremes of the observed data. While anyextrapolation should be undertaken with caution, this is particularly true of extrapolating asmoothing spline estimated on sparse data. In the next section, we develop a methodologywhich provides a more robust fit at the highest ages, and hence a more credible extrapolation.

3 Methodology

3.1 The basic smoothing model

Statistical graduation proceeds by estimating the central mortality rate m

x

, defined in (1),under some model which imposes a degree of smoothness on the m

x

series as a function of x.For example, a Poisson regression or smoothing model is formulated as

Y

x

⇠ Poisson⇣E

C

x

m

x

⌘

where the exposures E

C

x

are included in the model through o↵set terms, and the rates m

x

modelled (typically using a logarithmic link function) by a parametric formula, or as a smoothterm in a generalised additive model. For a large non-uniform population, such as England andWales, a Poisson model, with its implicit assumption that the variance is equal to the mean, istoo inflexible for modelling, and rarely fits observed data well. Therefore, we prefer a negativebinomial model, of the form

Y

x

⇠ NB⇣E

C

x

m

x

, ↵

⌘

where E[Yx

] = E

C

x

m

x

and ↵ is a dispersion parameter such that V ar[Yx

] = E

C

x

m

x

+(EC

x

m

x

)2/↵.Hence, for the observed death counts {y

x

, x > 0} we have the log-likelihood

`(m,↵) =nX

x=1

↵ log✓

↵

E

C

x

m

x

+ ↵

◆+

nX

x=1

y

x

log

E

C

x

m

x

E

C

x

m

x

+ ↵

!

+nX

x=1

log�(↵+y

x

)�n log�(↵) (2)

where n is the largest age for which E

C

x

> 0. Then, in a generalised additive (smoothing spline)model, the mortality rates m

x

are modelled as

logmx

= s(x;�)

where s(x;�) is a linear (in �) function representing regression on a spline basis at a fixed setof knots. The spline coe�cients � are estimated by maximising the penalised log-likelihood

p(�,↵) = `({s(x;�);x = 1, . . . , n},↵)� �

Z|s00(x;�)|2dx

as a function of � where �, the parameter controlling the ‘roughness penalty’, is chosen toprovide the optimal (by cross-validation) predictive fit of the model.

Then, the graduated estimates m̂x

are obtained as

m̂

x

= exp s(x; �̂)

and it is these estimates which are displayed as the lines in Figure 2.

5

3.2 Models for older ages and extrapolation

To obtain a more robust fit at older ages, and to extrapolate the mortality function m

x

beyondthe range of the observed data, we propose to use a parametric model. The simplest and best-known parametric model for human mortality at old ages is the log-linear Gompertz model

logmx

= �

0

+ �

1

x, x � x

0

(3)

where x

0

is a suitable threshold; it is clear from examination of Figure 1 that (3) cannot rea-sonably apply to all x > 0. Therefore our model combines (2) and (3) as

logmx

=

(s(x;�) x < x

0

�

0

+ �

1

x x � x

0

(4)

An alternative to (3) is a Makeham model, of the form

m

x

= �

0

+ exp (�0

+ �

1

x) , x � x

0

.

This, or (3) alone, might be extended by allowing higher order polynomials either inside theexponential function (extra � parameters) or outside (extra � parameters), and incorporatedinto a model like (4). However, we are already proposing a smoothing spline for modellingage-specific variation in mortality across the majority of the age range. A model at the highestages should be able to be robustly fitted where data are sparse, and we prefer such a model tobe as simple as possible. Hence, our preference is for (4) with x

0

chosen so that the fit of themodel across the age range where it is applied is comparable, in terms of predictive accuracy,to the fit of the generalised additive model.

A competing model to (3) which has been applied in life table construction for obtaining smoothestimates at oldest ages, and extrapolation, is a logistic model of the form

m

x

=�

2

exp (�0

+ �

1

x)

1 + exp (�0

+ �

1

x), x � x

0

(5)

With this model, originally proposed by Beard (1963), mortality rates flatten o↵, converging to alimiting rate �

2

as x tends to infinity. A special case of this model, with �

2

= 1, (Thatcher et al.,1998) is used in graduating the human mortality data base (Wilmoth et al., 2007). FollowingWilmoth et al. (2007) we set �

2

= 1 and therefore our logistic model combines (2) and (5) as

m

x

=

8><

>:

exp s(x;�) x < x

0

exp (�0

+ �

1

x)

1 + exp (�0

+ �

1

x)x � x

0

(6)

Hence we have two possible models, (4) and (6) both of which require the choice of a thresholdage x

0

to determine the age range over which the parametric component will be fitted andapplied. In practice, we have no fundamental reason to prefer one model over the other, or toapply a particular value of x

0

. Rather, we would prefer to base our decision on the observed data.Furthermore, given the sparsity of the data at the highest ages, there is likely to be considerableuncertainty about this choice. Ideally, we would like our final graduation to acknowledge thisuncertainty, where appropriate.

3.3 Bayes and partial-Bayes model averaging

Arguably the most natural approach to incorporate model uncertainty into estimates is aBayesian approach. In general, suppose we use k = 1, . . . ,K to index possible models for

6

observed data y, with each model k specifying a probability density p

k

(y|✓k

) for the observeddata y. Here ✓

k

denotes the unknown parameters of model k, which in the present context arethe parameters � of the gam, the parameters (�

0

, �1

) of the old-age mortality function and thenegative binomial dispersion parameter ↵. Then, the Bayesian approach under model uncer-tainty updates a prior probability distribution over the models to a posterior distribution, in thelight of observed data. The posterior distribution accounts for how well the various models fitthe data, and is used explicitly in weighting the models in estimates of any quantity of interest,such as a mortality rate here, which has a constant interpretation across models. Precisely,a prior probability distribution p(m), representing our prior beliefs concerning which model ismost appropriate, is updated as

p(k|y) / p(y|k)p(k) (7)

where p(y|k) is the marginal likelihood

p(y|k) =Z

p

k

(y|✓k

)pk

(✓k

)d✓k

. (8)

It is typical to assume prior neutrality between models, so models are e↵ectively compared usingp(y|k) (or log p(y|k)).

Under model uncertainty, the posterior probability distribution for mortality rates incorporatesmodel uncertainty, as

p(mx

|y) =X

k

p(k|y)pk

(mx

|y) (9)

which is a mixture of the posterior probability distributions for mx

, obtained from the individualmodels in the usual way, weighted by their posterior probabilities (7).

Graduated estimates of mx

can then be obtained using

E[mx

|y] =X

k

p(k|y)Ek

(mx

|y) (10)

a weighted average of the posterior expectations, Ek

(mx

|y), under the various models, with theweights given by the posterior model probabilities. This approach is often described as ‘model-averaging’; see Hoeting et al. (1999) for details. The final graduations fully integrate both modeluncertainty and uncertainty about the parameters of the constituent models.

Evaluation of model-averaged graduations using (10) can be computationally demanding, typi-cally requiring Markov chain Monte Carlo methods or alternative Bayesian computational tools.We propose a relatively simple graduation methodology which captures the important aspectsof (10), that is the incorporation of model uncertainty, while at the same time being easy to im-plement within standard statistical software such as R. This involves, as the first step, replacing(10) by

m̂

x

=X

k

p(k|y)m̂(k)

x

(11)

where m̂

(k)

x

are the (penalised) maximum likelihood estimates (m.l.e.s) of mx

under model k.In the absence of strong prior information about m

x

, and particularly for age-groups x withlarge exposures EC

x

, this is a relatively mild approximation, as we are simply replacing posteriormeans by m.l.e.s which are likely to be close.

The second stage of our approach represents a greater departure from the conventional Bayesianapproach and deals with the computation of model probabilities p(k|y). In the conventionalBayes approach these are computed via (7) and (8). Our approach is based on the concept ofthe ‘partial Bayes factor’ (see O’Hagan and Forster, 2004, Section 7.32). The partial Bayes factorhas been proposed as an alternative Bayesian approach for calculating the marginal likelihood (8)

7

in examples where the requirement is to compute Bayesian model probabilities in the presence ofa vague or di↵use prior distribution p

k

(✓k

) for the model parameters of one or more models. Insuch cases, an alternative is to split the data y into two subsets, which we shall call y

t

(training)and y

v

(validation). Then, the training data y

t

is considered as having been observed a priori,so (7) and (8) are modified to

p(k|y) / p(yv

|k, yt

)p(k) (12)

and

p(yv

|k, yt

) =Z

p

k

(yv

|✓k

)pk

(✓k

|yt

)d✓k

. (13)

Here, we assume that, as in our models, Yx

are independent given ✓

k

, so that p

k

(yv

|✓k

, y

t

) =p

k

(yv

|✓k

) respectively. Note that the posterior distributions for the model parameters ✓k

arisingfrom this modification are una↵ected, as p

k

(✓k

|y) = p

k

(✓k

|yt

, y

v

) / p

k

(yv

|✓k

, y

t

)pk

(✓k

|yt

), sosequentially updating, using y

t

first followed by y

v

makes no di↵erence. However, the posteriormodel probabilities are not preserved, because we now use the prior model probabilities p(k)rather than p(k|y

t

) in (12), even after observing y

t

. This can be considered as deferring anyconsideration of model uncertainty until after the training sample y

t

has been observed, andonly at that point specifying prior probabilities (typically discrete uniform) to models; hence,only the validation data y

v

contribute directly to updating the prior model probabilities.

Our final adjustment, again designed for ease of computation is to replace pk

(✓k

|yt

), the densityfor ✓

k

after observing the training sample, in (13), by a point mass at ✓̂

0k

, the (penalised)maximum likelihood estimate for ✓

k

based on the training data y

t

only. Hence, we use

p(yv

|k, yt

) = p

k

(yv

|✓̂0k

)

in place of (13) leading top(k|y) / p

k

(yv

|✓̂0k

)p(k) (14)

in place of (12). So models are evaluated on the basis of how well they predict the validationdata, based on parameters estimated using the training data. Together with (11), and assuminga discrete uniform prior distribution p(k) over models, this leads to model-averaged estimatesas

m̂

x

=

X

k

p

k

(yv

|✓̂0k

)m̂(k)

x

X

k

p

k

(yv

|✓̂0k

)(15)

This averts the need for integration, at the cost of ignoring uncertainty about the model pa-rameters in the computation of the (partial) marginal likelihood (but only in this aspect). It isclear that (15) provides a graduation which is able to account for model uncertainty, while atthe same time only requiring standard (penalised) maximum likelihood estimation using boththe full and training data, together with computation of the validation data likelihood.

3.4 Practical partial-Bayes graduation

Graduation using (15) requires us to estimate the parameters of each model k using both thetraining data y

t

and the full data y. Here, a model k comprises a choice of x0

2 {1, xmax

}together with a choice of either Gompertz (4) or logistic (6) model. Here x

max

represents theoldest age at which the transition from semiparametric (smooth gam) to parametric model isallowed. In practice, we set x

max

to be n� 4 where n is the oldest age at which E

C

x

> 0 (106 formales, 109 for females) to ensure that the parametric model is always estimated using at least4 data points.

8

Estimation of the parameters of the gam-component of (4) or (6), that is exp s(x;�) for x < x

0

is carried out using the mgcv package in R, with a negative binomial likelihood, includingestimation of the negative binomial dispersion parameter ↵. Estimation of the parameters ofthe log-linear model in (4) for age-groups x � x

0

is carried out using the glm.nb function of theMASS library in R. Estimation of the parameters of the logistic model in (6) is carried out bymaximising the log-likelihood for age-groups x � x

0

directly. Finally, (15) requires computationof the likelihood based on the validation data. This is simply a case of evaluating m̂

0x

using (4)or (6) and then evaluating (2) for data y

v

.

It only remains to specify how data y (age-specific death counts for 2010, 2011, 2012) are splitinto training, y

t

, and validation, yv

, data sets. We decided to keep each year (2010, 2011, 2012)intact and include all observed deaths for a particular year in either y

t

or yv

. Hence models areevaluated based on the prediction for a given year or years, based on parameter estimates fora complementary set of years. Furthermore, for reasons of symmetry, we chose to include 2010and 2012 data in one set, and 2011 data in the other. Another reason for this was to protectagainst downward drift in the mortality rates, so that the two data sets might be expected tohave similar mean mortality rates. The final choice, to make {2010, 2012} the training data and{2011} the validation data is slightly arbitrary, but motivated by having as accurate as possibleestimates ✓̂

0k

to help justify replacing p

k

(✓k

|yt

) by a point mass in (13). Overall graduation isrelatively insensitive to how the data are split into training and validation data, as this splitonly influences the weights p

k

(yv

|✓̂0k

) in (15) and not the graduations for the individual models,

m̂

(k)

x

.

4 Application of the Methods to Modelling England and Wales

Mortality Data (2010-12)

We first examine the analysis of models (4) and (6) separately. For model (4) with a log-linearmortality function at old age, the (conditional) probabilities for di↵erent threshold ages x

0

aredisplayed in Figure 3. The probability that the transition from an unstructured (gam) model toa parsimonious parametric form should happen at an age less than 85 is negligible. Thereafterthe parametric model becomes more competitive. For males, the probability of a thresholdbetween 96 and 105 is 0.80. For females, there is probability of around 0.1 that the threshold isat age x

0

= 91 but the bulk of the probability (0.81) is on a threshold between 101 and 108.

In Figure 4, we present the model-averaged graduation for ages 60 and older, based on model(4). This graduation incorporates uncertainty about the value of the threshold x

0

by computingthe graduated estimates using (11), where the model index k represents di↵erent values of x

0

.Although models for males and females are fitted separately, the extrapolated (log-)mortalityrates are very close. In fact the extrapolation shows the female rates overtaking the male rates,but only by a very small amount relative to the overall uncertainty. Note that, although allindividual models are log-linear, their average, which is calculated on the mortality scale via(11) is not log-linear, explaining the slight curvature in the extrapolated mortality function.

For model (6) with a logistic mortality function at old age, the (conditional) probabilities fordi↵erent threshold values x

0

are displayed in Figure 5. Now, thresholds below age 85 aremore probable, although the probability that the transition from an unstructured (gam) modelto a parsimonious parametric form should happen at an age less than 70 remains negligible.For females, there is a relatively narrow range of thresholds supported by the data, with theprobability of x

0

being between 85 and 91 calculated as 0.84. For males, on the other hand,there remains considerable uncertainty about the threshold with almost all values of x

0

� 75having non-negligible probability. The highest probability region for x

0

is between 90 and 99,

9

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●

●

●

●●●●●●

●●

●

●

●

●

●

●

●●

●

●

60 70 80 90 100 110

0.00

0.02

0.04

0.06

0.08

0.10

0.12

Threshold age

Prob

abilit

y

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●

●

●

●

●

●●

●

●

●

●

●

●

●

●

●

●

●

MalesFemales

Figure 3: Threshold probabilities (y-axis) plotted against threshold ages (x0

; x-axis) for thelog-linear old-age model (4). For each potential threshold age x

0

, these represent the probabilitythat x

0

is the optimal threshold at which the generalised additive model, for mortality rates forEngland and Wales (2010-2012), is replaced by the log-linear old-age model (4).

10

60 80 100 120 140

Age

Mor

talit

y ra

te

10−3

10−2

10−1

1

Males − 2010Males − 2011Males − 2012Males − Smooth ratesFemales − 2010Females − 2011Females − 2012Females − Smooth rates

Figure 4: Crude central mortality rates for England and Wales (2010-2012) for ages 60+, to-gether with smooth rates estimated by a model-average of old-age log-linear models for di↵erentthreshold ages x

0

11

with a total probability of 0.43.

●●●●●●●●●●●●●●●●●●●●●

●

●●

●

●

●

●●

●

●

●

●●●●●●●●●

●●●●●●●●

●

60 70 80 90 100 110

0.00

0.05

0.10

0.15

0.20

0.25

Threshold age

Prob

abilit

y

●●●●●●●●●●●●●●●

●●

●

●●●●●●●●●●●●

●

●

●

●

●

●

●

●

●●

●

●

●●●

●

●

●

●

MalesFemales

Figure 5: Threshold probabilities (y-axis) plotted against threshold ages (x0

; x-axis) for thelogistic old-age model (6). For each potential threshold age x

0

, these represent the probabilitythat x

0

is the optimal threshold at which the generalised additive model, for mortality rates forEngland and Wales (2010-2012), is replaced by the logistic old-age model (6).

In Figure 6, we present the model-averaged graduation for ages 60 and older, based on model(6). As in Figure 4, this graduation incorporates uncertainty about the value of the threshold x

0

by computing the graduated estimates using (11), where the model index k represents di↵erentvalues of x

0

. Again models for males and females are fitted separately, and here there is a slightlymore pronounced deviation between the extrapolated (log-)mortality rates. The extrapolationshows female rates converging to the limiting value of one faster than the male rates, resultingin a cross-over. Again, the estimated di↵erences are small, and occur at ages beyond which verylittle data are available, and where uncertainty is greatest. It should not be taken as evidencethat female mortality exceeds male mortality at the highest ages.

Finally, we combine the analysis of models (4) and (6) into a single analysis, computing modelprobabilities which quantify uncertainty both about threshold ages and about the relative merits

12

60 80 100 120 140

Age

Mor

talit

y ra

te

10−3

10−2

10−1

1


Figure 6: Crude central mortality rates for England andWales (2010-2012) for ages 60+, togetherwith smooth rates estimated by a model-average of old-age logistic models for di↵erent thresholdages x

0

13

of the log-linear and logistic models for mortality in the highest age groups. The total probabilityof log-linear models (4) across all threshold ages x

0

is computed (using (14)) to be 0.087 forfemales, and 0.292 for males. Hence the data are generally more supportive of the logisticmodel (6) with its implied limiting mortality rate then the log-linear model of ever-increasingmortality. This is particularly the case for females. For males, we remain more equivocal aboutthe relative merits of (4) and (6). When we estimate mortality rates using the model-averagedcombination of all the models, we obtain the estimates displayed in Figure 7. Again, males andfemales have been fitted separately, but as in Figure 4 we see convergence of estimated mortalityrates up until around age 120. Thereafter, the full model-averaged mortality rate estimates formales and females diverge, due to the greater weight on the log-linear model for males. Again,we caution against over-interpreting an extrapolation this far beyond the range of observeddata. Generally, we believe that Figure 7 represents a graduation which provides an acceptablecompromise between models (4) and (6) and between models with di↵erent threshold ages x

0

.

60 80 100 120 140

Age

Mor

talit

y ra

te

10−3

10−2

10−1

1


Figure 7: Crude central mortality rates for England andWales (2010-2012) for ages 60+, togetherwith smooth rates estimated by a model-average of old-age logistic and log-linear models fordi↵erent threshold ages x

0

14

5 Life Table and Conclusions

The resulting life table for England and Wales, based on the partial-Bayes model-averagedgraduation is provided in Appendix A for females and Appendix B for males. The life tableconsists of three columns, the estimated mortality rates, m

x

1 provided by the graduation (15)partially displayed in Figure 7, and two functions directly derived from them. The first of these,q

x

represents the conditional probability of death before exact age x+ 1 given survival to exactage x. The final column, `

x

, represents the expected number of survivors to exact age x froma birth population of size `

0

= 100, 000. The relationship between ` and q is straightforward toderive as

`

x

= `

0

x�1Y

z=0

(1� q

z

), x = 1, 2, . . .

There are various standard ways for obtaining death probabilities q from mortality rates m,depending on which assumptions one is prepared to make. The simplest is to assume thatthe force of mortality µ(x) (hazard function for the lifetime random variable) is approximatelypiecewise constant, taking constant values across each whole year of age [x, x + 1). In practicemortality rates vary su�ciently smoothly with age that this is a reasonable approximation, andone which we strongly prefer to the alternative of assuming a piecewise linear survival functionin each [x, x+ 1), which typically implies a non-monotonic hazard at high ages.

It is straightforward to express (1) in terms of µ, as

m

x

=

R1

0

E

x+t

µ(x+ t) dt

E

C

x

. (16)

Under the assumption of a constant force of mortality in [x, x+1), we write µ(x+t) = µ(x+1/2)for all t 2 [0, 1), x, following the convention of denoting a constant force by the force at themid-point of the age-year concerned. Then (16) simplifies to

m

x

=µ(x+ 1/2)

R1

0

E

x+t

dt

E

C

x

= µ(x+ 1/2)

Computation of q

x

using graduated m

x

is now straightforward, as the standard relationshipbetween (constant) hazard and survival function gives

q

x

= 1� exp[�µ(x+ 1/2)]

= 1� exp(�m

x

)

As is conventional, life table columns are presented for ages for which ` � 0.5. We have madeone small adjustment to the graduated rates provided by (15). The graduated estimates give anegligible positive di↵erence between female and male rates over a very narrow age range (112 to115). Hence, the final entry of the life table for males (at age 112) would show a lower mortality(in the fourth decimal place) than the corresponding female value. We have chosen to reportthe same value (based on the female graduation) at age 112 for males and females.

In conclusion, we believe that we have developed a method of graduation which takes advantageof the ease with which a wide range of smooth and parametric models can routinely be fitted,while at the same time acknowledging that in regions of sparse data there remains considerableuncertainty about the model which should be used to estimate and extrapolate the mortalityrates. This uncertainty should, where possible, be incorporated into estimation and we have

1We suppress theˆbut all quantities discussed here are estimates.

15

provided a computationally straightforward approach for achieving this. We acknowledge thatthe graduated rates m

x

are provided here without any quantification of associated uncertainty.This simply reflects the reality of standard life table presentation which does not incorporatesuch an analysis. Extending the methodology described here to provide uncertainty intervalswould, nevertheless, be a valuable extension of the current approach.

References

Beard, R. E. (1963) A theory of mortality based on actuarial, biological, and medical consid-erations. In Proceedings of the International Population Conference, vol. 1, 611–625. NewYork.

Gallop, A. (2002) Mortality at advanced ages in the united kingdom.https://www.soa.org/library/monographs/life/living-to-100/2002/mono-2002-m-li-02-1-gallop.pdf. Society of Actuaries, Living to 100 and Beyond: Survival at Advanced AgesMonograph.

Hoeting, J. A., Madigan, D., Raftery, A. E. and Volinsky, C. T. (1999) Bayesian model averaging:a tutorial. Statistical science, 382–401.

O’Hagan, A. and Forster, J. J. (2004) Bayesian Inference, Kendall’s Advanced Theory of Statis-

tics, vol. 2B. London: Arnold.

ONS (2009) English life tables no. 16 methodology, 2000-02.http://www.ons.gov.uk/ons/rel/lifetables/decennial-life-tables/no-16–2000-2002-/rpd-meth-elt16.pdf.

Thatcher, A., V., K. and J.W., V. (1998) The Force of Mortality at Ages 80 to 120. Odense,Denmark: Odense University Press.

Wilmoth, J. R., Andreev, K., Jdanov, D., Glei, D. A., Boe, C., Bubenheim, M., Philipov, D.,Shkolnikov, V. and Vachon, P. (2007) Methods protocol for the human mortality database.University of California, Berkeley, and Max Planck Institute for Demographic Research, Ro-

stock. URL: http://mortality. org [version 31/05/2007].

Wood, S. (2015) Package ‘mgcv’. http://cran.r-project.org/web/packages/mgcv/mgcv.pdf.

16

Appendix A: Graduated Life Table (Females)

x `

x

q

x

m

x

x `

x

q

x

m

x

0 100,000 0.003811 0.003824 58 94,967 0.004448 0.0044581 99,619 0.000238 0.000238 59 94,544 0.004883 0.0048952 99,595 0.000176 0.000176 60 94,082 0.005328 0.0053423 99,578 0.000133 0.000133 61 93,581 0.005753 0.0057704 99,564 0.000107 0.000107 62 93,043 0.006177 0.0061975 99,554 0.000090 0.000090 63 92,468 0.006673 0.0066966 99,545 0.000079 0.000079 64 91,851 0.007293 0.0073207 99,537 0.000073 0.000073 65 91,181 0.008016 0.0080488 99,530 0.000070 0.000070 66 90,450 0.008815 0.0088549 99,523 0.000069 0.000069 67 89,653 0.009717 0.009764

10 99,516 0.000071 0.000071 68 88,782 0.010771 0.01083011 99,509 0.000076 0.000076 69 87,825 0.011983 0.01205512 99,501 0.000084 0.000084 70 86,773 0.013309 0.01339813 99,493 0.000094 0.000094 71 85,618 0.014727 0.01483714 99,483 0.000108 0.000108 72 84,357 0.016263 0.01639715 99,473 0.000124 0.000124 73 82,985 0.017962 0.01812516 99,460 0.000143 0.000143 74 81,495 0.019930 0.02013117 99,446 0.000162 0.000162 75 79,871 0.022284 0.02253618 99,430 0.000178 0.000178 76 78,091 0.025070 0.02539019 99,412 0.000191 0.000191 77 76,133 0.028284 0.02869220 99,393 0.000200 0.000200 78 73,980 0.031934 0.03245521 99,374 0.000208 0.000208 79 71,617 0.036048 0.03671422 99,353 0.000216 0.000216 80 69,036 0.040697 0.04154823 99,332 0.000226 0.000226 81 66,226 0.046066 0.04716024 99,309 0.000239 0.000239 82 63,175 0.052351 0.05377125 99,285 0.000254 0.000254 83 59,868 0.059503 0.06134726 99,260 0.000271 0.000271 84 56,306 0.067347 0.06972227 99,233 0.000289 0.000289 85 52,514 0.075851 0.07888228 99,205 0.000309 0.000309 86 48,530 0.085142 0.08898629 99,174 0.000332 0.000332 87 44,398 0.095606 0.10049030 99,141 0.000359 0.000359 88 40,154 0.107597 0.11383731 99,106 0.000390 0.000390 89 35,833 0.121037 0.12901232 99,067 0.000425 0.000425 90 31,496 0.135087 0.14512633 99,025 0.000465 0.000465 91 27,241 0.149465 0.16188934 98,979 0.000508 0.000508 92 23,170 0.164837 0.18012935 98,928 0.000555 0.000555 93 19,351 0.181559 0.20035436 98,873 0.000606 0.000607 94 15,837 0.199664 0.22272437 98,814 0.000662 0.000662 95 12,675 0.218731 0.24683638 98,748 0.000721 0.000721 96 9,903 0.238365 0.27228839 98,677 0.000786 0.000786 97 7,542 0.258480 0.29905340 98,599 0.000860 0.000860 98 5,593 0.279035 0.32716541 98,515 0.000946 0.000947 99 4,032 0.300002 0.35667842 98,421 0.001042 0.001042 100 2,823 0.321337 0.38763043 98,319 0.001142 0.001142 101 1,916 0.342797 0.41976344 98,207 0.001245 0.001246 102 1,259 0.363750 0.45216445 98,084 0.001354 0.001355 103 801 0.384083 0.48464346 97,952 0.001471 0.001472 104 493 0.403464 0.51661647 97,807 0.001602 0.001603 105 294 0.422024 0.54822348 97,651 0.001753 0.001754 106 170 0.439311 0.57858949 97,480 0.001929 0.001931 107 95 0.456121 0.60902950 97,292 0.002137 0.002139 108 52 0.472456 0.63952351 97,084 0.002366 0.002369 109 27 0.487656 0.66876052 96,854 0.002603 0.002606 110 14 0.502089 0.69733453 96,602 0.002843 0.002847 111 7 0.515573 0.72478954 96,327 0.003096 0.003101 112 3 0.528125 0.75104055 96,029 0.003373 0.003379 113 2 0.539776 0.77604256 95,705 0.003686 0.003693 114 1 0.550574 0.79978557 95,352 0.004046 0.004054

17

Appendix B: Graduated Life Table (Males)

x `

x

q

x

m

x

x `

x

q

x

m

x

0 100,000 0.004746 0.004766 58 92,261 0.006687 0.0067091 99,525 0.000306 0.000306 59 91,644 0.007299 0.0073262 99,495 0.000207 0.000207 60 90,975 0.008016 0.0080493 99,474 0.000147 0.000147 61 90,246 0.008773 0.0088124 99,460 0.000115 0.000115 62 89,454 0.009492 0.0095385 99,448 0.000099 0.000099 63 88,605 0.010259 0.0103126 99,438 0.000091 0.000091 64 87,696 0.011227 0.0112917 99,429 0.000088 0.000088 65 86,712 0.012404 0.0124828 99,421 0.000088 0.000088 66 85,636 0.013669 0.0137639 99,412 0.000089 0.000089 67 84,465 0.015008 0.015122

10 99,403 0.000091 0.000091 68 83,198 0.016595 0.01673411 99,394 0.000094 0.000094 69 81,817 0.018538 0.01871212 99,385 0.000100 0.000100 70 80,300 0.020691 0.02090813 99,375 0.000112 0.000112 71 78,639 0.022860 0.02312514 99,363 0.000134 0.000134 72 76,841 0.025081 0.02540015 99,350 0.000172 0.000172 73 74,914 0.027479 0.02786416 99,333 0.000233 0.000233 74 72,856 0.030193 0.03065817 99,310 0.000311 0.000311 75 70,656 0.033362 0.03393118 99,279 0.000391 0.000391 76 68,299 0.037052 0.03775519 99,240 0.000456 0.000457 77 65,768 0.041194 0.04206720 99,195 0.000496 0.000496 78 63,059 0.045786 0.04686821 99,146 0.000516 0.000516 79 60,172 0.051112 0.05246422 99,095 0.000529 0.000529 80 57,096 0.057401 0.05911423 99,042 0.000537 0.000537 81 53,819 0.064550 0.06672824 98,989 0.000543 0.000543 82 50,345 0.072363 0.07511525 98,935 0.000550 0.000550 83 46,702 0.080889 0.08434826 98,881 0.000566 0.000566 84 42,924 0.090289 0.09462927 98,825 0.000594 0.000594 85 39,048 0.100675 0.10611028 98,766 0.000634 0.000634 86 35,117 0.112111 0.11890929 98,704 0.000682 0.000682 87 31,180 0.124695 0.13318330 98,636 0.000728 0.000728 88 27,292 0.138559 0.14914931 98,565 0.000763 0.000763 89 23,511 0.153460 0.16659832 98,489 0.000795 0.000796 90 19,903 0.168140 0.18409133 98,411 0.000838 0.000838 91 16,556 0.181821 0.20067534 98,329 0.000895 0.000895 92 13,546 0.196459 0.21872735 98,241 0.000969 0.000969 93 10,885 0.214618 0.24158536 98,146 0.001059 0.001060 94 8,549 0.235551 0.26860037 98,042 0.001163 0.001163 95 6,535 0.257214 0.29734738 97,928 0.001273 0.001274 96 4,854 0.278420 0.32631339 97,803 0.001378 0.001379 97 3,503 0.299618 0.35612940 97,668 0.001473 0.001474 98 2,453 0.320549 0.38647041 97,524 0.001575 0.001576 99 1,667 0.341539 0.41785042 97,371 0.001704 0.001706 100 1,098 0.361872 0.44921743 97,205 0.001856 0.001858 101 700 0.381310 0.48015044 97,024 0.002015 0.002017 102 433 0.396649 0.50525745 96,829 0.002168 0.002170 103 261 0.411011 0.52934846 96,619 0.002311 0.002314 104 154 0.426389 0.55580347 96,396 0.002452 0.002455 105 88 0.440169 0.58012048 96,159 0.002619 0.002623 106 49 0.451286 0.60017849 95,907 0.002835 0.002839 107 27 0.465095 0.62566650 95,635 0.003105 0.003110 108 15 0.478488 0.65102351 95,339 0.003429 0.003435 109 8 0.491440 0.67617252 95,012 0.003791 0.003798 110 4 0.503943 0.70106553 94,651 0.004169 0.004178 111 2 0.516003 0.72567754 94,257 0.004576 0.004587 112 1 0.528125 0.75104055 93,825 0.005048 0.00506156 93,352 0.005583 0.00559957 92,831 0.006134 0.006153

18

Documents

English Life Tables No. 17 Methodology