Final Report 1 May 2014

Review of the NLTF Revenue Forecasting Model

Prepared for

Ministry of Transport

Disclaimer

Although every effort has been made to ensure the accuracy of the material and the integrity

of the analysis presented herein, Covec Ltd accepts no liability for any actions taken on the

basis of its contents.

Authorship

Aaron Schiff and John Small

[email protected] | (09) 916 2012

© Covec Ltd, 2014. All rights reserved.

Contents

Executive Summary

Summary of forecasting requirements

Context

Issues with the existing model

Summary of our methodology

PED volume forecasting

Light RUC volume forecasting

Heavy RUC volume forecasting

Discussion of forecasting issues

Suggested improvements to the Excel model

1 Background and scope

2 Needs assessment

2.1 Purpose of the forecasts

2.2 Consequences of forecast errors

2.3 The forecasting process

2.4 Forecast outputs and characteristics

2.5 Scenario analysis

3 Context

3.1 New Zealand transport trends

3.2 Other New Zealand trends

3.3 International transport trends

4 Issues with the existing NLTF model

4.1 Forecast accuracy and reliability

4.2 Design and implementation

5 Literature review

5.1 Private transport activity

5.2 Commercial transport activity

6 Data review

6.1 Transport activity data

6.2 Potential explanatory variables

7 Petrol excise duty forecasting

7.1 Data

7.2 Modelling strategy

7.3 Pure time series models [PED model 1]

7.4 Regression models [PED models 2a-2e]

7.5 Hybrid models [PED models 3a & 3b]

7.6 Additional PED volume models

7.7 PED volume model evaluation and comparison

7.8 PED volume confidence intervals and sensitivity testing

7.9 Recommendations for PED modelling

8 Road user charges forecasting

8.1 Data

8.2 Modelling strategy

8.3 Light RUC models

8.4 Heavy RUC models

9 Discussion

9.1 Commentary on various forecasting issues

9.2 Suggested improvements to the spreadsheet model

10 References

Executive Summary

This report reviews forecasts of National Land Transport Fund (NLTF) revenue. We

focus on the main components of NLTF revenues: petrol excise duty (PED) and heavy

and light road user charges (RUCs). We consider options for modelling and forecasting

the volumes of PED and RUCs (litres and km respectively) to which duties and charges

are applied. We also suggest ways that the design and implementation of the current

Excel forecasting model could be improved. This work was conducted in close

consultation with members of the NLTF revenue forecasting group.

Summary of forecasting requirements

The primary requirement is for forecasts of annual NLTF revenues with a high degree of

accuracy over the next three years. Forecasts over a ten year period are also required,

but greater uncertainty beyond the three year horizon is acceptable. In addition:

• The model and forecasts need to be updated in a timely fashion every quarter when new data becomes available, with a minimum amount of manual work;

• A set of scenarios reflecting alternative forecast assumptions and confidence intervals (eg low, medium, high) is required, but the accuracy of the medium scenario is of greatest importance;

• The model should be capable of easily producing forecasts under various assumptions about key factors affecting NLTF revenues; and

• The forecasts should be readily explainable and understandable, and the reasons for any significant departure from current trends should be apparent.

Context

Recent events have made forecasting future transport activity challenging. Figure 1

shows the correlation between total annual vehicle kilometres travelled (VKT) and total

real GDP in New Zealand over time. Between 2001 and 2006 there was a strong positive

correlation, but in 2006 this correlation was disrupted. Growth in GDP and VKT

resumed in 2007, and both declined during the 2008/09 recession (also indicating a

positive correlation), but since 2010, real GDP growth has resumed while total VKT has

essentially remained constant.

At the same time, there have been significant changes within the transport fleet (Figure

2). On a per-capita basis, VKT of light petrol and medium diesel vehicles has declined

over time, with the decline in light petrol VKT per capita apparent since 2005. In

contrast, VKT per capita of light diesel vehicles has increased significantly since 2001,

with most of this increase occurring between 2001 and 2008. VKT per capita of heavy

diesel vehicles has fluctuated, but at the end of 2012 was at a similar level as in 2001.
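
As an illustration of how a per-capita index such as those in Figure 2 can be constructed, the following minimal Python sketch rebases VKT per capita so that a chosen base period equals 100. The function name and the input figures are hypothetical and are not taken from Ministry of Transport or Statistics New Zealand data.

```python
def per_capita_index(vkt, population, base_position=0):
    """Return VKT per capita rebased so that the base period equals 100."""
    per_capita = [v / p for v, p in zip(vkt, population)]
    base = per_capita[base_position]
    return [100.0 * value / base for value in per_capita]


# Hypothetical annual VKT (million km) and population (millions); illustrative only.
vkt = [30000.0, 31000.0, 30500.0]
population = [4.0, 4.1, 4.2]

index = per_capita_index(vkt, population)
for period, value in enumerate(index):
    print(period, round(value, 1))
```

An index above 100 means VKT per capita is higher than in the base period, and an index below 100 means it is lower, regardless of what has happened to total VKT.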

Similar trends have been observed in other developed countries, and a key question is

whether the recent stagnation of some types of road transport activity is a temporary

effect of the global financial crisis and recession, or a permanent shift reflecting changes

in factors such as demographics, travel preferences, and urban design.

Figure 1 Total annual VKT and annual real GDP in New Zealand.

Source: Ministry of Transport and Statistics New Zealand

Figure 2 Annual VKT per capita indexes (2001Q4 = 100) for New Zealand.

Source: Covec analysis of Ministry of Transport and Statistics New Zealand data

[Figure 2: y-axis Annual VKT per capita (index), 80 to 140; x-axis Year Ended Quarter, 2001-4 to 2012-4; series: Heavy diesel, Medium diesel, Light diesel, Light petrol, Total]

Issues with the existing model

There are concerns about the accuracy and reliability of the PED and RUC forecasts

produced by the model. While an initial review (Deloitte, 2012) of the model’s accuracy

found no evidence of structural breaks and that the models were performing well, in

practice the model has over-predicted PED and heavy RUC volumes by up to 5%, and

under-estimated light RUC by up to 2%. The PED forecasts are of particular concern,

with the model predicting strong growth in PED volume over the next three years, in

contrast to the general decline in PED volumes since 2007/08 (Figure 3). These forecasts

imply that within three years, annual PED volumes will exceed the highest level

observed over the past 13 years.

Figure 3 PED volume forecasts produced by the existing model.

[Figure 3: y-axis PED litres (millions), 2,500 to 3,500; x-axis years 2000-01 to 2022-23; series: Actual, Forecast]

Source: Ministry of Transport and Covec analysis.

In addition:

• The model uses complex error-correction models (ECMs) that depend on a relatively large number of explanatory variables and generate forecasts that have been difficult to interpret and explain, particularly in the short term.

• A relatively complex process of seasonal adjustment is used for all variables, even in the absence of clear seasonal patterns (eg for PED volumes).

• Ad hoc changes to the ECM coefficients are allowed for, without any robust basis for making these changes.

• The workflows for updating the model each quarter when new data arrives and for specifying forecasting scenarios are complex and require a number of manual steps that are time consuming and may be error-prone. To ensure that the model has been updated correctly, several MoT staff members must update it independently and compare the results.

Summary of our methodology

We applied the following methodology to PED and heavy and light RUC volumes:

• Preliminary analysis of quarterly volumes to understand long-term trends and short-term fluctuations, including testing for predictable seasonal patterns.

• Construction of a dataset of potential explanatory variables for PED and RUC volumes, based on results from a literature review and suggestions from the NLTF revenue forecasting group and subgroup.

• Testing a variety of econometric models for PED and RUC volume, including:

  o Simple pure time series models based only on past values of PED and RUC, and deterministic trends and seasonal dummy variables. The time series models were selected from a relatively large class of such models using a process of statistical model selection that chooses the model most likely to have generated the observed data.

  o Regression models, including various ways of modelling short-run dynamics (autoregressive error models, lagged dependent variables, and simple ECMs).

  o Hybrid models, including:

    - Modelling PED volume as the product of light petrol VKT and fuel efficiency

    - Modelling heavy and light RUC volume as functions of economic activity in transport-intensive sectors

• Testing a set of additional models following presentation of our initial results to the NLTF revenue forecasting group.

• Evaluation and testing of all models, including:

  o Within-sample goodness of fit and residual diagnostic testing

  o Truncated-sample forecasting performance

  o The plausibility of out-of-sample 10-year ahead forecasts

• Choosing a short-list of models on the basis of the above evaluation in consultation with a subgroup of the NLTF revenue forecasting group.

• Developing confidence intervals for a baseline forecast generated from the short-listed models and performing sensitivity testing on these models.

• Making recommendations for modelling PED, light RUC, and heavy RUC volumes on the basis of all of the above analysis.

PED volume forecasting

PED volume volatility and seasonality

A significant challenge in PED volume forecasting is the high volatility of the quarterly

data (Figure 4). We found no evidence of this volatility being due to a predictable

seasonal pattern. Instead the volatility appears to be driven by the random timing of

large fuel import shipments into New Zealand. To partially overcome this problem, we

recommend that forecasting be done using the 4-quarter moving average of PED

volume. The volatility of volumes (and revenues) should be reflected in the range of

forecasts produced for PED.
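
The smoothing step recommended above can be sketched as follows. This is a minimal illustration that assumes a trailing (rather than centred) 4-quarter window, which the text does not specify; the function name and the quarterly volumes are hypothetical.

```python
def four_quarter_moving_average(volumes):
    """Trailing 4-quarter mean; the first three quarters have no value."""
    smoothed = []
    for i in range(len(volumes)):
        if i < 3:
            smoothed.append(None)  # not enough history yet
        else:
            smoothed.append(sum(volumes[i - 3:i + 1]) / 4.0)
    return smoothed


# Hypothetical quarterly PED volumes (million litres); illustrative only.
quarterly = [760.0, 720.0, 800.0, 740.0, 780.0, 700.0]
smoothed = four_quarter_moving_average(quarterly)
print(smoothed)
```

Averaging over four quarters spreads the effect of a single large shipment across a full year, which is why the smoothed series is much less volatile than the raw quarterly data.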

Figure 4 Quarterly PED volume and polynomial trend.

[Figure 4: y-axis PED Volume (million litres), 0 to 1,000; x-axis Quarter, 1994-1 to 2013-3]

Source: Covec analysis of Ministry of Transport data.

PED volume analysis

Three general categories of models of PED volume were tested and compared:

• Pure time series models.

• Simple regression models relating PED volume to various explanatory variables including real petrol prices, seasonally adjusted real GDP, and the seasonally adjusted unemployment rate.

• ‘Hybrid’ models where PED volume is calculated from a combination of a light petrol VKT model and a fuel efficiency assumption. Two versions were tested, using total light petrol VKT and per-capita light petrol VKT. The latter incorporates total population directly into the PED volume forecast.

Table 1 summarises the eight models for PED volume that were estimated and tested,

including some additional models requested by the NLTF revenue forecasting group.

Table 1 Summary of variables included in the selected PED models.

Model | Type | Real GDP | Real petrol price | Unempl rate | Fuel eff | Total pop | Trend | Young pop propn | Urban pop propn | AKL pop propn
1 | Time series
2a | Regression (AR errors)
2b | Regression (ADL)
2c | Regression (ECM)
2d | Regression (AR errors)
2e | Regression (ADL)
3a | Hybrid (total VKT)
3b | Hybrid (per-cap VKT)

Table 2 summarises the goodness of fit of these models and their performance when

estimated with a truncated sample of the data up to the second quarter of 2011 and

using the model to produce forecasts of PED volume compared with actuals up to the

third quarter of 2013. The regression models explain around 80% of the variation in the

4-quarter moving average of PED volume, although only 10 – 15% of the variation in

actual quarterly PED volume. The hybrid models have a lower goodness of fit but

perform significantly better than the other models on the truncated sample forecasting

test as indicated by the RMSE and the average percentage errors.
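
The two truncated-sample measures referred to above can be computed as in the following sketch. It assumes errors are measured as forecast minus actual, so that a positive average error indicates over-forecasting; the function names and the data are hypothetical and are not the values behind Table 2.

```python
import math


def rmse(forecast, actual):
    """Root mean squared error of the forecasts, in the units of the data."""
    squared = [(f - a) ** 2 for f, a in zip(forecast, actual)]
    return math.sqrt(sum(squared) / len(squared))


def mean_pct_error(forecast, actual):
    """Average percentage error; positive values mean over-forecasting."""
    errors = [(f - a) / a for f, a in zip(forecast, actual)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical hold-out quarters (million litres); illustrative only.
actual = [700.0, 750.0, 720.0, 760.0]
forecast = [730.0, 770.0, 760.0, 750.0]

print(round(rmse(forecast, actual), 1))
print(round(mean_pct_error(forecast, actual), 2))
```

RMSE penalises large misses more heavily, while the average percentage error shows whether a model is biased in one direction, which is the pattern highlighted for the non-hybrid models.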

Table 2 Summary of goodness of fit and truncated-sample forecast RMSE of the PED volume models.

Model | 1 | 2a | 2b | 2c | 2d | 2e | 3a | 3b
Full sample goodness of fit
R² vs PED volume | 0.12 | 0.13 | 0.10 | 0.15 | 0.12 | 0.12 | 0.05 | 0.05
R² vs PED volume (MA) | 0.78 | 0.81 | 0.80 | 0.81 | 0.81 | 0.81 | 0.57 | 0.56
Truncated sample forecasting performance
RMSE (million litres) | 45.2 | 48.3 | 57.1 | 49.0 | 49.6 | 49.4 | 32.5 | 32.5
Average quarterly error (%) | 4.4 | 4.9 | 6.3 | 5.0 | 5.2 | 5.1 | 0.5 | 0.4
2012 error (%) | 3.5 | 3.5 | 4.1 | 3.5 | 3.8 | 3.4 | -0.3 | -0.3
2013 error (%) | 4.2 | 4.9 | 6.8 | 5.1 | 5.2 | 5.5 | 0.3 | 0.1

Figure 5 illustrates the truncated sample forecasting performance of the PED volume

models, where the tendency of all models except the hybrid models to over-forecast

PED volumes can be clearly seen.

Figure 5 Comparison of the truncated-sample forecasting performance of the PED models.

Source: Covec analysis.

Figure 6 compares the annual PED volume forecasts produced by these models under a

baseline scenario. In our view, the forecasts produced by the two hybrid models and

regression model (b) are the most plausible.

The PED models were also reviewed in terms of the practicality of using them to

generate forecasts and their relative advantages and disadvantages (Table 3). Taking all

of the above into consideration, we recommend the use of the per-capita hybrid model

(model 3b) to forecast PED volumes.

[Figure 5: Truncated sample PED volume forecasts. y-axis PED volume (million litres), 620 to 840; x-axis Quarter, 2011-3 to 2013-3; series: Actual, Time series, Regression (a) to (e), Hybrid (a), Hybrid (b)]

Figure 6 Annual PED volume forecast comparison.


Table 3 Advantages and disadvantages of the PED volume models.

Model(s) | Advantages | Disadvantages
Time-series [1] | Very simple implementation; No forecast drivers required; Model can evolve over time | Cannot test alternative scenarios; Provides no explanation for trends
Regression [2a & 2d] (AR errors) | Simple implementation; Clear link between explanatory variables and forecasts | Unsophisticated short-run dynamics; Extrapolates past relationship with GDP and unemployment
Regression [2b & 2e] (lagged dependent variable) | Includes demographic variables | Very sensitive to demographic assumptions; No Stats NZ forecast of urban population for model 2b; Population data is only observed in Census years
Regression [2c] (ECM) | Sophisticated short-run dynamics | Difficult to interpret and explain trends
Hybrid [3a & 3b] | Uses potentially more reliable VKT data (compared to PED volumes); Allows analysis of changing fuel efficiency | Basis for forecasting efficiency is not clear; Model 3a extrapolates past relationship with GDP and unemployment; Model 3b includes an unexplained deterministic trend

[Figure 6: Annual PED volumes. y-axis PED volume (million litres), 2,400 to 3,600; x-axis Year ended June, 1996 to 2023; series: Actual, Current model, Time series, Regression (a) to (e), Hybrid (a)]

Recommended PED volume model

The recommended PED volume model (model 3b) is a hybrid model that forecasts PED

volume as a function of per-capita light petrol VKT, total population, and fuel efficiency.

Per-capita light petrol VKT is modelled as a function of real per-capita GDP, the real

petrol price, and a negative time trend. Fuel efficiency is also modelled as a function of a

positive time trend, although this could be easily replaced by a more sophisticated

efficiency model if such a model were to be developed.
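
The structure of model 3b can be written as a simple identity, sketched below. The sketch assumes fuel efficiency is expressed in kilometres per litre, so that PED volume equals total light petrol VKT divided by efficiency (equivalently, VKT multiplied by litres per kilometre). The function name and all input values are hypothetical and are not estimates from the model.

```python
def ped_volume_million_litres(vkt_per_capita_km, population, km_per_litre):
    """PED volume implied by per-capita VKT, population and fuel efficiency."""
    total_vkt_km = vkt_per_capita_km * population
    litres = total_vkt_km / km_per_litre
    return litres / 1e6


# Hypothetical inputs; illustrative only.
baseline = ped_volume_million_litres(
    vkt_per_capita_km=6000.0, population=4_500_000.0, km_per_litre=9.0
)

# Same travel and population, but a 1% improvement in fuel efficiency.
more_efficient = ped_volume_million_litres(
    vkt_per_capita_km=6000.0, population=4_500_000.0, km_per_litre=9.0 * 1.01
)

print(baseline, round(more_efficient, 1))
```

Written this way, each driver can be forecast separately: population growth raises volume, while a falling per-capita VKT trend and rising efficiency reduce it.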

Figure 7 and Table 4 show an indicative PED volume forecast produced by model 3b. In

the short term, PED volumes are forecast to increase slightly as economic activity

increases, unemployment falls, and real petrol prices remain constant. In the longer

term, the downwards trend in light petrol VKT per capita, higher real petrol prices, and

increasing fuel efficiency are forecast to lead to a decline in PED volumes, with total

volume in 2023 at a similar level as it was in 2013 (see Figure 8).
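
The growth rates implied by these forecasts can be checked directly from the base-case levels reported in Table 4. The sketch below uses the 2013, 2014, 2015 and 2023 base values from that table; the function and variable names are illustrative. Because the published levels are rounded, growth rates recomputed this way can differ from the table in the last decimal place for some years.

```python
def pct_change(new, old):
    """Percentage change from old to new."""
    return 100.0 * (new / old - 1.0)


# Base-case PED volumes (million litres, year ended June) from Table 4.
base = {2013: 3013.0, 2014: 2989.0, 2015: 3037.0, 2023: 3004.0}

change_2014 = pct_change(base[2014], base[2013])
change_2015 = pct_change(base[2015], base[2014])
change_2013_to_2023 = pct_change(base[2023], base[2013])

print(round(change_2014, 1), round(change_2015, 1), round(change_2013_to_2023, 1))
```

The ten-year change of about -0.3% is what underlies the statement that the 2023 volume is at a similar level to 2013.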

Figure 7 Indicative forecast and 67% confidence interval from the recommended PED volume model.


[Figure 7: y-axis PED volume (m litres), 2,600 to 3,400; x-axis Year ended June, 1996 to 2023; series: Actual, Forecast]

Table 4 Indicative forecasts and confidence intervals produced by PED model 3b.

YE June | PED volume (million litres): 90% lower | 67% lower | Base | 67% upper | 90% upper | PED volume (annual % change): 90% lower | 67% lower | Base | 67% upper | 90% upper
2013 | | | 3,013 | | | | | | |
2014 | 2,884 | 2,924 | 2,989 | 3,058 | 3,104 | -4.3 | -2.9 | -0.8 | 1.5 | 3.0
2015 | 2,894 | 2,949 | 3,037 | 3,130 | 3,192 | 0.3 | 0.8 | 1.6 | 2.4 | 2.9
2016 | 2,913 | 2,968 | 3,057 | 3,150 | 3,213 | 0.7 | 0.7 | 0.6 | 0.6 | 0.6
2017 | 2,916 | 2,971 | 3,060 | 3,153 | 3,216 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1
2018 | 2,915 | 2,970 | 3,059 | 3,152 | 3,214 | 0.0 | 0.0 | 0.0 | -0.1 | -0.1
2019 | 2,908 | 2,963 | 3,051 | 3,144 | 3,206 | -0.2 | -0.2 | -0.2 | -0.3 | -0.3
2020 | 2,896 | 2,951 | 3,038 | 3,130 | 3,192 | -0.4 | -0.4 | -0.4 | -0.4 | -0.4
2021 | 2,885 | 2,939 | 3,026 | 3,117 | 3,178 | -0.4 | -0.4 | -0.4 | -0.4 | -0.4
2022 | 2,874 | 2,928 | 3,015 | 3,105 | 3,166 | -0.4 | -0.4 | -0.4 | -0.4 | -0.4
2023 | 2,865 | 2,918 | 3,004 | 3,095 | 3,155 | -0.3 | -0.3 | -0.3 | -0.3 | -0.3

Figure 8 Approximate decomposition of the forecasts produced by PED model 3b.


[Figure 8: y-axis Average contribution to annual forecast change, -0.2% to 0.8%; x-axis periods 2013-4 to 2014-3 through 2022-4 to 2023-3; series: Real petrol price, Real GDP per capita, VKT time trend, Efficiency, Population, Dynamic & interaction]

Light RUC volume forecasting

Unlike PED volumes, light RUC volumes (net km) were found to have a predictable

seasonal pattern. Rather than performing seasonal adjustment on the data, it is more

straightforward and transparent to include seasonal factors (eg quarterly dummy

variables) in the regression models.
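
A minimal sketch of this approach is given below: a regression on an intercept, a linear trend and three quarterly dummy variables, estimated by ordinary least squares using only the Python standard library. The data are synthetic and noise-free, the helper names are hypothetical, and the specification is far simpler than the light RUC models evaluated in this report.

```python
def design_row(t, quarter):
    """Intercept, linear trend, and dummies for quarters 2-4 (quarter 1 is the base)."""
    return [1.0, float(t)] + [1.0 if quarter == q else 0.0 for q in (2, 3, 4)]


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic series: level 100, trend of 2 per quarter, fixed seasonal effects.
seasonal_effect = {1: 0.0, 2: 5.0, 3: -3.0, 4: 1.0}
quarters = [(t % 4) + 1 for t in range(12)]
y = [100.0 + 2.0 * t + seasonal_effect[q] for t, q in enumerate(quarters)]
rows = [design_row(t, q) for t, q in enumerate(quarters)]

beta = ols(rows, y)  # intercept, trend, Q2, Q3, Q4 effects
print([round(b, 6) for b in beta])
```

Each dummy coefficient is the average difference between that quarter and the base quarter, so the seasonal pattern is estimated jointly with the other coefficients and remains visible in the model output rather than being removed beforehand.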

We evaluated a large number of light RUC models, and compared the results from time

series, regression, and hybrid models. In this case the hybrid models were based on

GDP levels in various goods-producing sectors, with sub-models to forecast GDP in

these sectors. Table 5 summarises the models that were estimated for light RUC km,

including some additional models requested by the NLTF forecasting group.

Table 5 Summary of explanatory variables in the light RUC models.

Model | Type | Real GDP | Real diesel price | Real light RUC price | TPW sector GDP | Const. sector GDP | Trend | Goods imports
1 | Time series
2a | Regression
2b | Regression
3 | Hybrid

Table 6 summarises the performance of these four models. The regression and hybrid

models have better goodness of fit, but the time series model performs significantly

better on the truncated sample forecasting test (Figure 9).

Table 6 Summary of goodness of fit and truncated-sample forecast RMSE of the light RUC models.

Model | 1 | 2a | 2b | 3
R² vs light RUC km | 0.57 | 0.93 | 0.94 | 0.92
RMSE (million km) | 38.9 | 86.6 | 85.4 | 89.3
Average quarterly error (%) | 0.4% | 1.4% | 2.2% | 1.6%
2012 error (%) | 0.6% | 0.9% | 1.8% | 1.8%
2013 error (%) | 0.3% | 3.7% | 4.1% | 3.6%

The models predict similar growth in the short term but the regression and hybrid

models predict relatively strong growth in the longer term, at a faster rate than the

current model (Figure 10). In our view the short-term forecasts are plausible but the

long-term forecasts may need to be moderated.

Figure 9 Comparison of truncated sample forecasts produced by the light RUC models.


Figure 10 Annual light RUC forecast comparison.


[Figure 9: Truncated sample light RUC volume forecasts. y-axis Light RUC net km (millions), 1,700 to 2,400; x-axis Quarter, 2011-3 to 2013-3; series: Actual, Time series, Regression (a), Regression (b), Hybrid]

[Figure 10: Annual light RUC volumes. y-axis Light RUC volume (million km), 5,000 to 12,000; x-axis Year ended June, 2001 to 2023; series include Regression (a), Regression (b), Hybrid]

Recommended light RUC volume model

On the basis of the above analysis, in consultation with a subgroup of the NLTF revenue

forecasting group, light RUC model 2b was selected for further analysis and we

recommend this model for light RUC forecasting. This is a simple regression model that

forecasts light RUC volumes as a function of total real GDP, the real diesel price, the real

light RUC price, real imports of goods, and a positive time trend. Figure 11 and Table 7

show an indicative forecast produced by this model.

Under this scenario and in this model, growth is driven by higher real GDP, higher real

imports of goods, and lower real diesel prices, offset by higher real light RUC prices in

the short term (Figure 12). In the longer term the real GDP effect is important while

there is also a time trend that drives up light RUC km in all periods. Imports do not

feature in the long term in this forecast, however this is due to an assumption about

future imports growth and alternative assumptions will lead to a different forecast.
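
The confidence bands reported alongside this forecast can be illustrated with a simple normal approximation. The sketch below assumes that forecast errors are normally distributed, that the 67% band in Table 7 is roughly one standard error either side of the base forecast, and that the 90% band uses the corresponding normal quantile. These are assumptions made for illustration, and the procedure actually used to construct the intervals may differ. The 2015 base value (8,680) and half-width (260, ie 8,940 less 8,680) are read from Table 7; the function name is hypothetical.

```python
from statistics import NormalDist


def symmetric_interval(base, std_error, coverage):
    """Two-sided interval around base with the given coverage under normality."""
    z = NormalDist().inv_cdf(0.5 + coverage / 2.0)
    return base - z * std_error, base + z * std_error


base_2015 = 8680.0   # Table 7 base forecast, million km
std_error = 260.0    # assumed: half-width of the 67% band in Table 7

lower_90, upper_90 = symmetric_interval(base_2015, std_error, 0.90)
print(round(lower_90), round(upper_90))
```

Under these assumptions the 90% band for 2015 comes out at about 8,252 to 9,108 million km, which is consistent with the corresponding row of Table 7.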

Figure 11 Indicative forecast and 67% confidence interval for the recommended light RUC model.


[Figure 11: y-axis Light RUC volume (m km), 0 to 14,000; x-axis Year ended June, 2001 to 2023; series: Actual, Forecast]

Table 7 Indicative forecasts and confidence intervals for light RUC model 2b.

YE June | Light RUC volume (million km): 90% lower | 67% lower | Base | 67% upper | 90% upper | Light RUC volume (annual % change): 90% lower | 67% lower | Base | 67% upper | 90% upper
2013 | | | 8,150 | | | | | | |
2014 | 8,084 | 8,210 | 8,405 | 8,600 | 8,725 | -0.8% | 0.7% | 3.1% | 5.5% | 7.1%
2015 | 8,252 | 8,420 | 8,680 | 8,940 | 9,108 | 2.1% | 2.6% | 3.3% | 4.0% | 4.4%
2016 | 8,543 | 8,711 | 8,971 | 9,231 | 9,398 | 3.5% | 3.5% | 3.3% | 3.3% | 3.2%
2017 | 8,966 | 9,134 | 9,394 | 9,654 | 9,822 | 5.0% | 4.9% | 4.7% | 4.6% | 4.5%
2018 | 9,342 | 9,510 | 9,770 | 10,030 | 10,198 | 4.2% | 4.1% | 4.0% | 3.9% | 3.8%
2019 | 9,674 | 9,841 | 10,101 | 10,361 | 10,529 | 3.5% | 3.5% | 3.4% | 3.3% | 3.3%
2020 | 10,001 | 10,169 | 10,429 | 10,689 | 10,857 | 3.4% | 3.3% | 3.2% | 3.2% | 3.1%
2021 | 10,340 | 10,508 | 10,768 | 11,028 | 11,196 | 3.4% | 3.3% | 3.2% | 3.2% | 3.1%
2022 | 10,684 | 10,852 | 11,112 | 11,372 | 11,540 | 3.3% | 3.3% | 3.2% | 3.1% | 3.1%
2023 | 11,034 | 11,201 | 11,461 | 11,721 | 11,889 | 3.3% | 3.2% | 3.1% | 3.1% | 3.0%

Figure 12 Approximate decomposition of forecasts produced by light RUC model 2b.


[Figure 12: y-axis Average contribution to annual forecast change, -2.0% to 1.0%; x-axis periods 2013-4 to 2014-3 through 2022-4 to 2023-3; series: Real GDP, Real diesel price, Real light RUC price, Time trend, Real imports, Dynamic correction]

Heavy RUC volume forecasting

Heavy RUC volumes (net km) were also found to have a predictable seasonal pattern,

and quarterly dummy variables were tested in the models. We again evaluated a large

number of heavy RUC volume models, falling into the same three classes as for light

RUC: time series, simple regression, and hybrid models involving sectoral GDP and the

proportions of heavy vehicles with 2-4 and 7+ axles. Table 8 summarises the heavy RUC

models that were evaluated including additional models requested by the NLTF

revenue forecasting group.

Table 8 Summary of variables in the heavy RUC models.

Model | Type | Real GDP | Real heavy RUC price | Forest GDP | TPW GDP | Real export of goods | Real import of goods | Trend | 2-4 axles propn | 7+ axles propn
1 | Time series
2a | Regression
2b | Regression
3a | Hybrid
3b | Hybrid

Table 9 summarises the forecasting performance of the heavy RUC models. The two

regression models and one of the hybrid models have the highest goodness of fit, while

the time series model performs best on the truncated sample forecasting test (see also

Figure 13).

Table 9 Summary of goodness of fit and truncated-sample forecast RMSE of the heavy RUC models.

Model | 1 | 2a | 2b | 3a | 3b
R² vs heavy RUC km | 0.82 | 0.90 | 0.93 | 0.91 | 0.86
RMSE (million km) | 31.3 | 47.4 | 41.2 | 50.3 | 39.3
Average quarterly error (%) | -0.4 | -3.5 | -2.7 | -3.8 | -2.0
2012 error (%) | 0.6 | -1.7 | -1.1 | -1.3 | -1.2
2013 error (%) | -0.9 | -4.1 | -3.2 | -5.0 | -1.9

Figure 14 compares indicative forecasts produced by the heavy RUC models. On the

basis of recent trends in heavy RUC volumes, in our view it is unclear which of these

models produces more plausible forecasts. However, all models produce lower heavy

RUC volume forecasts than the existing NLTF forecasting model.

Figure 13 Comparison of truncated sample forecasting performance of the heavy RUC models.


Figure 14 Annual heavy RUC forecast comparison.


[Figure 13: Truncated sample heavy RUC volume forecasts. y-axis Heavy RUC net km (millions), 720 to 940; x-axis Quarter, 2011-3 to 2013-3; series: Actual, Time series, Regression (a), Regression (b), Hybrid (a), Hybrid (b)]

[Figure 14: Annual heavy RUC volumes. y-axis Heavy RUC volume (million km), 2,500 to 5,000; x-axis Year ended June, 2001 to 2023; series include Regression (a), Regression (b), Hybrid (a), Hybrid (b)]

Recommended heavy RUC volume model

On the basis of our analysis, we recommend heavy RUC model 2b for forecasting. This

model forecasts heavy RUC volumes based on real exports and imports of goods, and

the real heavy RUC price.

Figure 15 and Table 10 show an indicative forecast produced by this model. The model

forecasts relatively weak growth in heavy RUC volumes in the first three years, and

steady growth thereafter.

Figure 15 Indicative forecast and 67% confidence interval from the recommended heavy RUC model.

The relatively weak growth in the short term is largely caused by the initial dynamic

correction of the model to the estimated trend, given that actual heavy RUC volumes in

2013 were relatively high (Figure 16). In the longer term, growth is largely driven by

growth in exports, although this depends on the particular long-term assumption of

export growth (and relatively low import growth).
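
The decomposition in Figure 16 attributes each year's forecast change to individual drivers, with the remainder labelled 'Dynamic & interaction'. A minimal sketch of this kind of accounting is given below. It assumes a constant-elasticity (log-linear) relationship in which each driver contributes approximately its elasticity multiplied by its own growth rate. The elasticities and growth rates are hypothetical, and the functional form is an assumption rather than the estimated specification of model 2b.

```python
def decompose_growth(elasticities, driver_growth, total_growth):
    """Split total forecast growth (%) into driver contributions and a residual."""
    contributions = {
        name: elasticities[name] * driver_growth[name] for name in elasticities
    }
    # Whatever the drivers do not explain is attributed to dynamics and interactions.
    residual = total_growth - sum(contributions.values())
    contributions["dynamic_and_interaction"] = residual
    return contributions


# Hypothetical elasticities and annual growth rates (%); illustrative only.
elasticities = {"real_exports": 0.3, "real_imports": 0.2, "real_ruc_price": -0.1}
driver_growth = {"real_exports": 2.0, "real_imports": 1.0, "real_ruc_price": 1.0}

parts = decompose_growth(elasticities, driver_growth, total_growth=0.6)
for name, value in parts.items():
    print(name, round(value, 2))
```

By construction the contributions sum to the total forecast change, so a large residual in the early years signals that the forecast is being driven by the model's adjustment back to trend rather than by the explanatory variables.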

[Figure 15: y-axis Heavy RUC volume (m km), 0 to 4,500; x-axis Year ended June, 2001 to 2023; series: Actual, Forecast]

Table 10 Indicative forecasts and confidence intervals produced by heavy RUC model 2b.

YE June | Heavy RUC volume (million km): 90% lower | 67% lower | Base | 67% upper | 90% upper | Heavy RUC volume (annual % change): 90% lower | 67% lower | Base | 67% upper | 90% upper
2013 | | | 3,552 | | | | | | |
2014 | 3,491 | 3,536 | 3,606 | 3,678 | 3,726 | -1.7% | -0.5% | 1.5% | 3.5% | 4.9%
2015 | 3,441 | 3,500 | 3,593 | 3,688 | 3,751 | -1.4% | -1.0% | -0.4% | 0.3% | 0.7%
2016 | 3,471 | 3,530 | 3,623 | 3,720 | 3,783 | 0.9% | 0.9% | 0.9% | 0.9% | 0.9%
2017 | 3,520 | 3,580 | 3,675 | 3,772 | 3,837 | 1.4% | 1.4% | 1.4% | 1.4% | 1.4%
2018 | 3,546 | 3,606 | 3,702 | 3,800 | 3,865 | 0.7% | 0.7% | 0.7% | 0.7% | 0.7%
2019 | 3,566 | 3,626 | 3,723 | 3,821 | 3,886 | 0.6% | 0.6% | 0.6% | 0.6% | 0.6%
2020 | 3,586 | 3,647 | 3,744 | 3,844 | 3,909 | 0.6% | 0.6% | 0.6% | 0.6% | 0.6%
2021 | 3,608 | 3,670 | 3,767 | 3,867 | 3,933 | 0.6% | 0.6% | 0.6% | 0.6% | 0.6%
2022 | 3,630 | 3,692 | 3,790 | 3,890 | 3,957 | 0.6% | 0.6% | 0.6% | 0.6% | 0.6%
2023 | 3,652 | 3,714 | 3,813 | 3,914 | 3,981 | 0.6% | 0.6% | 0.6% | 0.6% | 0.6%

Figure 16 Approximate decomposition of forecast changes in heavy RUC model 2b.


[Figure 16: y-axis Average contribution to annual forecast change, -1.4% to 0.6%; x-axis periods 2013-4 to 2014-3 through 2022-4 to 2023-3; series: Real exports of goods, Real imports of goods, Real heavy RUC price, Dynamic & interaction]

Discussion of forecasting issues

During our review, the NLTF forecasting group and subgroup raised a number of

general questions about the forecasting approach and models:

Plausibility and risks of the forecasts: Forecasts produced by econometric

models necessarily assume that the relationships embodied in the models

continue to hold in the future. In our view this is not problematic as long as the

models have been thoroughly tested and continue to be reviewed regularly. It is

also not clear that an alternative (non-econometric) approach based on ad hoc

models or simple extrapolation would produce more accurate forecasts, and

such an approach may be criticised because of its arbitrary nature. Overall, in

our view the models recommended in this report produce plausible forecasts of

PED and RUC volumes. However there is always some risk that the

relationships embodied in these models fundamentally change. This risk can be

mitigated by reviewing the models on a regular basis.

Speed of modelled changes: The requirement that the forecasts can be updated

each quarter led us to estimate quarterly models for PED and RUC volumes.

This implies that changes in the explanatory variables in the models affect PED

and RUC volumes in the current quarter (and in future). In our view this is

reasonable given that the explanatory variables tend to be correlated over time

and that many of the models incorporate dynamic variables (eg lags of the

dependent variable) that imply that the dependent variable takes time to adjust

to shocks.
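
The gradual adjustment described above can be illustrated with a simple model containing one lag of the dependent variable. The sketch below simulates the response to a permanent one-unit increase in an explanatory variable; the coefficients are hypothetical and the specification is much simpler than the models recommended in this report.

```python
def adjustment_path(rho, beta, x_path, y_start):
    """Simulate y[t] = rho * y[t-1] + beta * x[t] for a given path of x."""
    path = []
    previous = y_start
    for x in x_path:
        current = rho * previous + beta * x
        path.append(current)
        previous = current
    return path


# Hypothetical coefficients; illustrative only.
rho, beta = 0.5, 1.0

# x rises permanently from 0 to 1 at the start of the simulation.
path = adjustment_path(rho, beta, [1.0] * 20, y_start=0.0)
long_run_effect = beta / (1.0 - rho)

print([round(v, 3) for v in path[:4]], long_run_effect)
```

The change in the explanatory variable has an immediate effect equal to beta in the current quarter, and the remaining effect accumulates over later quarters until the long-run level beta / (1 - rho) is approached.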

Scope for the use of multiple models: We have recommended a single model

for each volume forecast. A possible alternative approach involves running

multiple models in parallel and either using these multiple models to produce a

range of forecasts, or combining their forecasts into a single ‘meta-forecast’. We

have some concerns with the use of multiple models:

o Given that a single forecast of PED and RUC volumes is ultimately

required, there is a risk that the process for choosing a single forecast

from multiple models will become arbitrary, which will reduce accuracy

and transparency.

o The use of multiple models will make it more difficult to explain how

the forecasts have been derived, as it will be necessary to explain the

forecast produced by each model as well as the process used for

combining them.

o It is not clear that an approach based on multiple models will perform

better than the use of a single model that is regularly re-estimated and

tested. Re-testing and re-estimation effectively uses multiple models

over time, but since only one model is in use at any given point in time,

the issues associated with combining multiple models are avoided.

For these reasons we prefer the use of a single model for each volume forecast.

However, if the NLTF forecasting group wishes to use multiple models to

generate forecasts, in section 9.1.3 we discuss how this can be done in a

reasonably robust way.

Potential for remediation of the existing spreadsheet model: In our view it is

technically possible to remediate the existing Excel model by replacing the

econometric models in it with new models and making some other changes to

the design and structure. However our advice is that it is likely to be no more

costly (and possibly less costly) to build a new model for PED and RUC

volumes. This is because the complexity of the existing model’s structure means

that modifications would need to be made very carefully and tested thoroughly

to ensure that there are no unwanted side-effects, which will increase costs.

Recommendations for future review of the econometric models: We

recommend that the coefficients of the models for PED and RUC volumes be re-

estimated using the latest available data on an annual basis. This will allow the

coefficients of the model to be updated as new information becomes available.

We expect this annual update would be a straightforward task that the Ministry

could undertake internally or could contract out at relatively low cost. We also

recommend that the econometric models be fully re-tested and their structure

changed if necessary every three years.

Suggested improvements to the Excel model

Following our econometric analysis and review of the existing Excel spreadsheet model,

we suggest the following improvements could be made:

Replace ECMs with simpler regression models

The ECMs in the existing model are relatively complex, particularly the short-run

components of the models. This means that a relatively large number of inputs are

required to generate forecasts, and it can be difficult to explain the short-run predictions

generated by the models. Our econometric analysis found that relatively simple models

can perform well in forecasting PED and RUC volumes, including modelling short-run

dynamics through the use of lagged dependent variables or autoregressive error terms,

which are easier to implement and interpret than ECMs. In our view, for PED and RUC

volumes, any additional benefits of using ECMs to more accurately capture short-run

dynamics are outweighed by the practical disadvantages of this approach. This is

particularly true given that highly accurate quarterly forecasts are not required.
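As an illustration of the kind of simple dynamic specification referred to above, the following Python sketch fits a regression with a single lagged dependent variable, y(t) = a + b × y(t-1), and iterates it forward. The data series, function names and coefficient values are hypothetical and chosen only for illustration; this is not one of the models estimated in this report.

```python
def fit_lagged_model(y):
    """Fit y[t] = a + b * y[t-1] by ordinary least squares (closed form)."""
    lagged, current = y[:-1], y[1:]
    n = len(lagged)
    mean_lag = sum(lagged) / n
    mean_cur = sum(current) / n
    b = (sum((l - mean_lag) * (c - mean_cur) for l, c in zip(lagged, current))
         / sum((l - mean_lag) ** 2 for l in lagged))
    a = mean_cur - b * mean_lag
    return a, b


def forecast(a, b, last_value, steps):
    """Iterate the fitted equation forward from the last observed value."""
    path = []
    for _ in range(steps):
        last_value = a + b * last_value
        path.append(last_value)
    return path


# Hypothetical quarterly volume series (not actual PED or RUC data),
# generated so that it adjusts gradually towards a long-run level of 100.
y = [90.0]
for _ in range(11):
    y.append(20.0 + 0.8 * y[-1])

a, b = fit_lagged_model(y)
long_run = a / (1 - b)          # level the forecasts converge to when |b| < 1
path = forecast(a, b, y[-1], 8)

print(f"a = {a:.3f}, b = {b:.3f}, long-run level = {long_run:.1f}")
print([round(v, 2) for v in path])
```

The coefficient on the lag controls how quickly the series adjusts: a value close to one implies slow adjustment to shocks, while a value close to zero implies that shocks die out almost immediately.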

Improve scenario analysis

The ability to analyse scenarios in the model could be improved by clearly separating

actual data inputs from scenarios, and simplifying the way that forecasting scenarios are

specified in the Excel model. As a general principle, anything that needs to be updated

by the spreadsheet user to produce a new forecast should be easily accessible and in a

centralised location rather than dispersed throughout the spreadsheet tabs. A single

‘information’ tab could contain and summarise all of the relevant inputs to the model

when a new forecast is generated.

Improve outputs of the model

In section 9.2.3 below we suggest a number of simple outputs that can be generated

automatically from the model each time a new forecast is required. These include tables

and charts of the forecast levels and growth rates (and their confidence intervals), as

well as an approximate breakdown of the drivers of the forecasts. It is possible to build

the Excel model in such a way that these outputs update automatically each time a new

forecast is generated.

Remove parameter shocks

The ability to analyse parameter shocks in the model introduces considerable

complexity while potentially undermining the credibility of the econometric models as

there is no simple, non-arbitrary way to make such adjustments. In our view it would be

better for the coefficients of the econometric models to be re-estimated on a regular basis

outside the spreadsheet model, including diagnostic testing.

Remove coefficient re-estimation but re-test econometric models regularly

The coefficients of the econometric models need to be updated regularly, particularly

given the recent disruption to past transport correlations and the open question of

whether these are temporary or permanent changes. However in our view this should

be done outside the Excel model so that a proper set of diagnostic tests can be

performed, and the structure of the models can be updated if necessary.

Remove seasonal adjustment for PED and use quarterly dummies instead of

seasonal adjustment for other variables

There is no predictable seasonal pattern in PED volumes, with the quarterly volatility

largely driven by random factors that essentially cannot be forecasted. Therefore in our

view it is preferable for some simple form of smoothing (eg the 4-quarter moving

average) to be applied to PED volumes for use in the analysis. Other variables such as

RUC volumes do have predictable seasonal patterns. Our recommendation is to include

quarterly dummy variables in the models where necessary to capture seasonal effects,

rather than seasonally adjusting the variables prior to analysis.
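A minimal sketch of the quarterly-dummy approach is given below, assuming a regression with an intercept, a linear trend and dummy variables for quarters 2 to 4 (quarter 1 is the base). The data are hypothetical and noise-free, and the small least-squares routine is written out only to keep the example self-contained; in practice a statistical package would be used.

```python
def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(k):
            if r != c:
                factor = A[r][c] / A[c][c]
                A[r] = [v - factor * w for v, w in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


def design_row(t):
    """Intercept, trend, and dummies for quarters 2, 3 and 4."""
    q = t % 4  # 0 = quarter 1, 1 = quarter 2, and so on
    return [1.0, float(t), float(q == 1), float(q == 2), float(q == 3)]


# Hypothetical quarterly volumes: trend of 0.5 per quarter plus fixed
# seasonal offsets (illustrative values, not actual RUC data).
seasonal = [0.0, 3.0, -2.0, 1.0]
y = [100.0 + 0.5 * t + seasonal[t % 4] for t in range(16)]
X = [design_row(t) for t in range(16)]

coef = ols(X, y)
forecast_next = sum(c * x for c, x in zip(coef, design_row(16)))

print([round(c, 3) for c in coef])
print(round(forecast_next, 3))
```

Because the seasonal effects are estimated inside the regression, the forecast for any future quarter automatically includes the appropriate seasonal offset, and no separate seasonal adjustment step is needed.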

Include forecast uncertainties (confidence intervals)

As well as generating forecasts under different input assumptions, it would be helpful if

the model could reflect the uncertainty associated with the econometric models through

the calculation of confidence intervals for the forecasts. The implementation of this will

be greatly simplified by using simple regression models to generate the forecasts,

rather than ECMs.
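To indicate why simple regressions make this straightforward, the sketch below computes an approximate forecast interval for a regression on a single explanatory variable (here a time trend). It uses the textbook formula for the standard error of a prediction and, as a simplifying assumption, a normal quantile in place of the exact t quantile. The data and function name are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def prediction_interval(x, y, x0, level=0.95):
    """Approximate interval for a new observation at x0 from y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    a = mean_y - b * mean_x
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    s = sqrt(sum(e * e for e in residuals) / (n - 2))  # residual std error
    # Standard error of prediction grows as x0 moves away from the sample mean
    se = s * sqrt(1 + 1 / n + (x0 - mean_x) ** 2 / sxx)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # normal approximation to t
    point = a + b * x0
    return point - z * se, point, point + z * se


# Hypothetical quarterly series: linear trend plus alternating noise.
x = list(range(12))
y = [100 + 2 * t + (1 if t % 2 == 0 else -1) for t in x]

near = prediction_interval(x, y, 12)   # one quarter ahead
far = prediction_interval(x, y, 20)    # nine quarters ahead

print("1 quarter ahead:", [round(v, 2) for v in near])
print("9 quarters ahead:", [round(v, 2) for v in far])
```

The interval widens as the forecast horizon lengthens, which is the behaviour users of the forecasts would expect to see in the output tables and charts.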

Simplify the updating process

The model could be built in such a way that additional observations can be added to a

data table and this flows through the model automatically, including updating the date

ranges applied to output tables and charts. This would reduce the manual work

required to produce updates and eliminate errors that may be created during updating.

Simplify models for other components of NLTF revenues

While not part of our review, we noted that the existing model includes relatively

complex models for the other minor components of NLTF revenues (eg CNG and LPG

excise, driver licensing, etc). In our view it would be preferable to greatly simplify these

models, for example to use simple time-series models.

1 Background and scope

We have been asked by the Ministry of Transport to review the model it uses to

generate forecasts of National Land Transport Fund (NLTF) revenues. The current

model was developed during 2010 and 2011 (Deloitte, 2011a & 2011b) and was most

recently reviewed in 2012 (Deloitte, 2012).

The Ministry has commissioned our review out of concerns that the current model does

not fully meet its needs, and most importantly there are concerns about the accuracy

and reliability of the forecasts it produces.

Due to the constrained timeframe for this review, we focus on the main components of

NLTF revenues: petrol excise duty (PED), and road user charges (RUCs). For modelling

purposes, RUC revenues are split into two categories: light RUC, applying to vehicles

up to six tonnes, and heavy RUC for heavier vehicles. Light RUC applies mostly to

private cars and vans, and some buses and small trucks, while heavy RUC mostly

corresponds to large transport trucks and buses.

Together, PED and RUC revenues comprise around 91 percent of current NLTF gross

revenue. The key task in forecasting PED and RUC revenues is forecasting the volumes

(litres and kilometres, respectively) to which the duties and charges will be applied.

Accordingly, our analysis focuses on the volume forecasts for PED and RUC. Other

revenue sources that we do not review include fuel excise duty on LPG and CNG, motor

vehicle re-licensing and registrations, and charges for motor vehicle change of

ownership and administration activities.

The current NLTF forecasting model uses a class of econometric models known as error-

correction models (ECMs) to generate quarterly PED and RUC forecasts. These models

forecast long-term trends in volumes as functions of a relatively small number of key

drivers. The models also incorporate more complex auxiliary models of short-term

variation around these long-term trends. Additional methodologies are used to handle

seasonal variation in quarterly data, and to permit some types of sensitivity testing.

The Ministry has asked us to review these forecasting models and consider whether

other models may be able to produce better forecasts of NLTF revenues. We have also

been asked to review the design and implementation of the current model, including the

ease with which the model can be used and updated.

This report summarises our findings and is organised as follows. Section 2 reviews the

Ministry’s needs and requirements with regard to NLTF revenue forecasting. Section 3

describes recent trends in New Zealand transport activity, and section 4 gives an

overview of the most significant issues with the current model. Section 5 briefly reviews

recent literature on transport activity modelling and forecasting, and section 6 reviews

the data that is available for NLTF revenue forecasting. Sections 7 and 8 review

forecasting of PED and RUC revenues respectively, and section 9 concludes with a

discussion of some forecasting issues and suggests improvements in the design and

implementation of the model.

2 Needs assessment

Through discussions with MoT officials and members of the NLTF revenue forecasting

group, we have assessed the requirements of the forecasting model in terms of the

forecasts that it produces and the ways that the model can be used.

2.1 Purpose of the forecasts

Forecasts of NLTF revenue are required for a variety of purposes. The forecasts are

provided to Treasury for inclusion in Crown financial budgets and plans, which cover a

four-year timeframe and are updated each March and October, and for inclusion

in the government’s annual Budget publication. The government also uses the forecasts

to inform its government policy statement (GPS) on transport, which is updated every

three years, and to understand the future revenues available for transport investments

and initiatives.

NZTA uses the forecasts for its planning, and in particular uses the forecasts of NLTF

revenues over the next three years to plan and sequence transport projects.

2.2 Consequences of forecast errors

NZTA is directly affected by errors in forecasting NLTF revenues, particularly if actual

revenue is lower than forecast. NZTA develops a project plan on the basis of short-term

(1-3 year ahead) revenue forecasts. Once a plan is committed and projects are

commenced, there is limited scope to delay or reorganise projects in response to a

revenue shortfall. Short-term revenue forecasts that are higher than actual revenues

therefore cause NZTA to borrow to meet its commitments, with corresponding

financing costs.

Forecasting errors also make it difficult for the government to plan transport policies

and investments. Again, the greatest difficulty arises if actual revenue is lower than

forecast, meaning that policies and investments may not be able to be implemented in

the time expected.

2.3 The forecasting process

There is a requirement to update the forecasts in a timely fashion each quarter when

new data becomes available, although revenue forecasts are only required and

published for June years. The model update process is undertaken by MoT, with

forecasts being reviewed by the NLTF revenue forecasting group, consisting of officials

from MoT, NZTA, and Treasury.

It is desirable that the amount of work required each quarter to update the forecasts is

minimised. This includes minimising the work necessary to enter new data into the

model and produce forecasts under different scenarios.

To ensure that the model has been updated correctly, several MoT staff members update

it independently, and the results are compared. Sense checks are also done by

comparing the forecasts to previous periods, and trying to identify the causes of any

significant changes.

2.4 Forecast outputs and characteristics

Forecasts of all components of NLTF revenue are required but the majority of revenues

arise from PED and RUC. It is necessary for the model to handle some technical

complexities including payments to licensing agents, and refunds.

Forecasts of NLTF revenues for each component and in total are required on an annual

basis (June years) over at least a ten year period. The ability to generate at least a simple

set of scenarios (eg low, medium, high) is required, to reflect the uncertainty associated

with the forecasts. However, while scenarios are generated, the baseline (medium)

scenario is generally adopted by users of the forecasts, eg by Treasury in its financial

reports. Thus while representation of uncertainty is useful, accuracy of the baseline

forecast is very important.

There is also a requirement that the forecasts be readily explainable to users of the

forecasts. This includes changes in trends and reversions to past trends. This means that

it is desirable for the forecasting model and process to be relatively simple and

transparent, although accuracy is still of primary importance.

2.5 Scenario analysis

Ideally, the model should have the ability to generate forecasts under a range of

scenarios, in order to test the effect of various shocks and trends on the forecasts. MoT

staff told us that the types of scenario analysis that would be useful include the effects of

changes in (among other things):

• Vehicle fuel efficiency and the mix of vehicles in the fleet, including the shift from light petrol vehicles to light diesel;

• The propensity of drivers in different age groups or in different regions to own vehicles and to drive or use other forms of transport;

• The efficiency of freight supply chains, eg truck sizes and operational improvements that reduce empty running;

• Urban density and land uses on people’s need to drive or use other forms of transport; and

• Transport fuel prices, taxes and user charges.

In general, analysing the effect of any one of these changes requires that a suitable

variable be included in the model in some form, and that a suitably robust relationship

is established between that variable and revenues. Thus while the ability to test

scenarios on all of the above may be desirable, it may not be practical within the context

of a forecasting model.

For example, forecasting the effects of changes in fuel efficiency will require, at a

minimum:

• An average fuel efficiency parameter in the model;

• Sensible estimates of the size of this parameter;

• An understanding of how this parameter affects NLTF revenues; and

• Guidance as to how this parameter may change over time.

Each additional type of analysis therefore increases the complexity of the model, and

requires sufficient data in order to estimate the necessary relationships. While this

would allow a greater range of scenarios to be tested, it is not clear whether this will

improve the accuracy of the forecasts. In general, forecasting favours simple models that

fit the data well, while policy and scenario analysis tends towards more complex

models that permit richer analysis but may be less suitable for forecasting, particularly

in the short term.
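The fuel-efficiency example above can be made concrete with a deliberately simple sketch. It assumes, purely for illustration, that petrol volume equals distance travelled multiplied by an average fuel consumption rate, and that this rate improves by a fixed percentage each year. The function names, parameter values and the relationship itself are assumptions, not part of the existing model or of the models recommended in this report.

```python
def petrol_litres(vkt_km, litres_per_100km):
    """Petrol volume implied by distance travelled and average consumption."""
    return vkt_km * litres_per_100km / 100


def efficiency_scenario(vkt_km, base_litres_per_100km, annual_improvement, years):
    """Petrol volumes by year if average consumption falls at a constant rate
    while distance travelled is held fixed."""
    volumes = []
    for year in range(years + 1):
        rate = base_litres_per_100km * (1 - annual_improvement) ** year
        volumes.append(petrol_litres(vkt_km, rate))
    return volumes


# Hypothetical inputs: 30 billion km per year, 10 litres per 100 km,
# and a 1% per year improvement in average fuel efficiency.
scenario = efficiency_scenario(30e9, 10.0, 0.01, 3)

for year, litres in enumerate(scenario):
    print(f"Year {year}: {litres / 1e6:,.0f} million litres")
```

Even this minimal version requires each item in the list above: a parameter, a starting value, a link to volumes, and an assumed path over time. Each of these would need to be supported by data before it could be relied on in a forecast.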

We return to this issue in sections 7 and 8 when we analyse options for re-designing the

current forecasting model.

3 Context

In this section we briefly analyse recent trends in transport activity in New Zealand and

internationally, and discuss the implications for NLTF revenue forecasting.

3.1 New Zealand transport trends

As a preliminary step, we have analysed overall trends in transport activity in New

Zealand, using vehicle-kilometres travelled (VKT) as the measure of activity. The

quarterly VKT data exhibits some seasonal fluctuations; to smooth this out we have

calculated rolling annual totals as the basis for analysis.
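The rolling annual total is simply the sum of each quarter and the three preceding quarters. A minimal sketch, using hypothetical quarterly values rather than the actual VKT data, is:

```python
def rolling_annual_totals(quarterly):
    """Sum each quarter with the three quarters before it.

    The first three quarters have no complete year behind them, so the
    result is three observations shorter than the input.
    """
    return [sum(quarterly[i - 3:i + 1]) for i in range(3, len(quarterly))]


# Hypothetical quarterly VKT (billion km) with a repeating seasonal pattern.
vkt = [10.2, 9.6, 9.4, 10.0, 10.4, 9.8, 9.6, 10.2]
annual = rolling_annual_totals(vkt)

print([round(v, 1) for v in annual])
```

Because every rolling total contains exactly one observation from each quarter of the year, the regular seasonal pattern cancels out and only the underlying movement remains.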

3.1.1 Relationship between transport activity and economic activity

Figure 17 illustrates the overall context for NLTF forecasting by showing the

relationship between total annual VKT (for all types of vehicles) and total annual real

GDP. The available data spans the time period from 2001 to 2013.

Between 2001 and 2005, there was a very strong positive correlation between total VKT

and total real GDP. After 2005, this correlation breaks down. During 2006, real GDP

increased but total VKT declined slightly. In 2007 the positive correlation between VKT

and GDP resumed, and the global financial crisis during 2008 was associated with a

decline in both real GDP and VKT (also indicating a positive correlation). However from

2010 the economy has recovered and real GDP growth has resumed, while VKT has

been volatile but essentially has not increased during the past three years.

Figure 17 Total annual VKT and annual real GDP in New Zealand.


This suggests that the apparently strong historical positive correlation between GDP

and transport activity may no longer be reliable. At least, forecasting models estimated

on the basis of this historic relationship are likely to forecast a reversion to that historic

trend, while the most recent data indicates that the relationship between VKT and GDP

is no longer so simple, and/or is being over-ridden by other factors that may require

further investigation. These could include factors such as:

• Changes in the unemployment rate

• Changes in fuel prices or other transport-related prices

• Changes in population demographics, eg the age distribution or the rate of urbanisation

• Changes in the propensity to use public transport, for example caused by improvements in the quality of public transport services.

Our subsequent analysis considers all of these variables (and others) as potential drivers

of transport activity and as potential explanations for the deviation from the historic

correlation between transport activity and GDP.

Further insights are provided by breaking down VKT into broad classes by vehicle type.

Figure 18 shows the correlation between annual VKT of light (under 3,500 kg) petrol-

powered vehicles and annual real GDP in New Zealand. In this case, features similar to those in

Figure 17 are apparent, but the decline in VKT in recent years is even stronger. Between

the third quarter of 2005 and the first quarter of 2013, annual real GDP increased by

12.3%, while annual VKT of light petrol vehicles decreased by 3.4%.

Figure 18 Annual VKT for light petrol vehicles and annual real GDP in New Zealand.


For diesel-powered vehicles, different correlations between VKT and GDP can be

observed for different weight classes. There has been a strong positive correlation

between VKT of light (< 3,500 kg) diesel vehicles and real GDP, and this correlation

appears to be largely undisturbed by the global financial crisis and corresponding

recession (Figure 19).

Figure 19 Annual VKT for light diesel vehicles and annual real GDP in New Zealand.

Source: Ministry of Transport and Statistics New Zealand.

Among medium (3,500 kg – 6,000 kg) diesel vehicles, the positive correlation between

VKT and real GDP observed until late 2009 has essentially been reversed in subsequent

years, with VKT of this type of vehicle declining sharply while real GDP has increased

(Figure 20). For heavy vehicles (> 6,000 kg), the positive correlation between real GDP

and VKT has essentially remained over time, although there is some suggestion of a

weakening of this correlation in the most recent data (Figure 21).

This analysis is summarised in Figure 22, showing the correlation between the VKT

measures and real GDP, calculated on a rolling basis over two years. All measures of

VKT are essentially perfectly correlated with real GDP up to the end of 2005, and the

diesel VKT measures remain so until late 2008. The volatility during the global financial

crisis and recession is apparent, but it is also apparent that, with the exceptions of heavy

and light diesel VKT, correlations with GDP have not returned to their former levels.
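The rolling correlation described above can be sketched as follows, assuming the standard Pearson correlation coefficient calculated over a moving window of eight quarterly observations. The two series are hypothetical and are constructed so that the relationship reverses part-way through the sample.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def rolling_correlation(x, y, window=8):
    """Correlation over each window of consecutive observations."""
    return [pearson(x[i - window + 1:i + 1], y[i - window + 1:i + 1])
            for i in range(window - 1, len(x))]


# Hypothetical data: GDP rises steadily; VKT rises for eight quarters
# and then declines (illustrative values only).
gdp = [100.0 + i for i in range(12)]
vkt = [40.0 + 0.5 * i for i in range(8)] + [43.0, 42.5, 42.0, 41.5]

corr = rolling_correlation(gdp, vkt)
print([round(c, 2) for c in corr])
```

In this constructed example the correlation starts at one and falls below zero once the window is dominated by quarters in which VKT declines while GDP continues to rise.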

Figure 20 Annual VKT for medium diesel vehicles and annual real GDP in New Zealand.


Figure 21 Annual VKT for heavy diesel vehicles and annual real GDP in New Zealand.


Figure 22 Rolling (8-quarter) correlation between annual VKT and annual real GDP.

Source: Covec analysis of Ministry of Transport and Statistics New Zealand data.

3.1.2 Per-capita transport activity

Similar trends are also observed if transport activity is measured on a per-capita basis.

Figure 23 shows total annual VKT per capita for all types of vehicle in New Zealand,

and it is apparent that per-capita transport activity started to decline around 2005.

Figure 23 Total annual VKT per capita in New Zealand.


[Figure 23 axes: annual VKT per capita (km), 8,400 to 9,800, by year ended quarter, 2001-4 to 2012-4.]

Consistent with the above analysis, different patterns of VKT per capita are observed for

different vehicle types (Figure 24). Most notable is the increasing use of light diesel

vehicles, while use of light petrol and medium diesel vehicles has fallen.

Figure 24 Annual VKT per capita indexes (2001Q4 = 100) for New Zealand.

Source: Covec analysis of Ministry of Transport and Statistics New Zealand data.

3.1.3 Household travel behaviour

The New Zealand Household Travel Survey, conducted by the Ministry of Transport,

gives some insight into travel behaviour. The frequency of data releases from the survey

(every three years) makes it unsuitable for direct use in generating NLTF forecasts, but it

is a useful source of information about overall transport trends.

Figure 25 shows the annual average distance driven per capita by drivers in different

age groups in small vehicles. One apparently clear trend is the declining volume of per-

capita travel by people in the 25-34 age group, while growth in per-capita travel among

other age groups has been relatively modest or static. The total volume of travel across

all age groups reported in the travel survey grew strongly from 18.3 billion km in 1997/98 to 29.1

billion km in 2003-06 but has since remained essentially constant.

[Figure 24 axes: annual VKT per capita (index), 80 to 140, by year ended quarter, 2001-4 to 2012-4. Series: heavy diesel, medium diesel, light diesel, light petrol, total.]

Figure 25 Distance driven per capita in cars, vans, utes, and SUVs, by age group.

Source: Household Travel Survey and Statistics New Zealand.

3.1.4 NLTF revenues and volumes

The above changes in transport activity have translated into changes in the size and

composition of NLTF revenues over time. Light RUC volumes have grown relatively

strongly over time, from 5.66 billion km in 2000/01 to 8.15 billion km in 2012/13, an

average annual growth rate of 3.1% (Figure 26). Heavy RUC volumes have increased at

a slower rate, from 2.70 billion km in 2000/01 to 3.55 billion km in 2012/13, an average

annual growth rate of 2.3%.
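The average annual growth rates quoted above are consistent with a compound growth calculation over the 12 annual steps between 2000/01 and 2012/13. The short sketch below reproduces them from the volumes given in the text; the function name is ours and the compound formula is assumed to be the method used.

```python
def average_annual_growth(start, end, years):
    """Compound average annual growth rate between two levels."""
    return (end / start) ** (1 / years) - 1


# Volumes from the text (billion km); 2000/01 to 2012/13 is 12 annual steps.
light = average_annual_growth(5.66, 8.15, 12)
heavy = average_annual_growth(2.70, 3.55, 12)

print(f"Light RUC: {light:.1%}")
print(f"Heavy RUC: {heavy:.1%}")
```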

In contrast with RUC km, PED volumes (Figure 27) do not exhibit a general upwards

trend, and display considerable volatility over time. We return to the issue of PED

volatility in section 7; for now we note that PED volumes over the past 13 years have

displayed a general pattern of increase until around 2007/08, followed by a general

decline. The total PED volume in 2012/13 was almost identical to the level in 2000/01.

[Figure 25 axes: km per capita per annum, 0 to 16,000, by age group (15–24, 25–34, 35–44, 45–54, 55–64, 65–74, 75+). Series: survey periods 1997/98, 2003-06, 2004-07, 2005-08, 2006-09, 2007-10, 2008-11, 2009-12.]

Figure 26 Annual net RUC km.

Source: Ministry of Transport.

Figure 27 Annual PED volumes (litres).

Source: Calculated from Ministry of Transport data.

[Figure 26 axes: net RUC km (millions), 0 to 9,000, by year, 2000-01 to 2012-13. Series: light RUC, heavy RUC.]

[Figure 27 axes: PED litres (millions), 2,600 to 3,300, by year, 2000-01 to 2012-13.]

3.2 Other New Zealand trends

In addition to the above, other recent trends in New Zealand may have an impact on

transport activity and NLTF revenues, including changing demographics, changes in

the structure of the New Zealand economy, and changes in the use of public transport in

cities. Our analysis in subsequent sections includes all of these variables, as well as

prices and others that may have an effect on transport activity.

3.2.1 Demographics

The key demographic changes occurring in New Zealand are changes in the age

distribution of the population and increasing urbanisation.

Figure 28 shows the distribution of age in the New Zealand population across broad

generational groups. Overall the population is ageing, with a significant increase in the

proportion of the population aged between 55 and 74, and a smaller increase in the

proportion of the population aged over 75. The proportions of the population in the 15–34

and 35–54 age groups in recent years have slightly declined or remained constant.

Figure 28 Population age distribution in New Zealand.

Source: Statistics New Zealand

Figure 29 shows the proportion of the total New Zealand population living in urban

areas, as defined by Statistics New Zealand. This proportion has generally increased

over time, although the urbanisation rate was relatively constant for much of the 2000s.

Also notable is the significant increase in urbanisation in 2013, although it is not clear

whether all of this increase occurred in one year (as the data suggests) or whether this is

a feature of the 2013 Census that has not been applied to the urban population estimates

in the years since the 2006 Census. This means that any modelling based on the urban

population data should be undertaken with caution.

[Figure 28 axes: percentage scale, 0% to 35%, by year, 1991 to 2013. Series: under 15, 15–34, 35–54, 55–74, 75+.]

Figure 29 Proportion of the New Zealand population living in urban areas.

Source: Statistics New Zealand

3.2.2 Economic structure

The nature of New Zealand’s economy has gradually changed over time (Figure 30).

Figure 30 Broad breakdown of New Zealand’s real GDP.

Source: Calculated from Statistics New Zealand data.

[Figure 29 axes: percentage scale, 84.2% to 85.4%, by year, 1996 to 2013.]

[Figure 30 axes: proportion of real GDP, 0% to 70%, by period, 1988-1 to 2013-1. Series: primary; manufacturing, construction & wholesale; retail & services.]

The primary sector has generally declined in relative importance, while the tertiary

(retail and services) sector has increased. Secondary industries have also generally

declined over time. One notable feature is that after around 2009 the growth in the share

of activity attributable to retail and services has been curtailed, while the decline in

secondary industries has stopped.

3.2.3 Public transport patronage

Use of public transport in New Zealand has steadily increased over time (Figure 31)

from 22 trips per capita in 2001 to just under 30 trips per capita in 2013, while total

patronage has increased from 86 million boardings in 2001 to 133 million boardings in

2013, an average annual growth rate of 3.7%. Bus is the predominant public transport

mode, with metropolitan rail networks only available in Auckland and Wellington, and

ferry services only available in Auckland, Wellington, and Christchurch.

Figure 31 Annual public transport patronage per capita.


3.2.4 Transport prices

Figure 32 shows annual average real price indexes for selected transport-related goods

and services in New Zealand. Retail petrol and diesel prices have generally increased

over time, although considerable volatility has been observed in recent years. Real

prices for the purchase of vehicles and for passenger transport services have fallen

steadily over time, while real domestic air transport prices have increased but at a

slower rate than fuel prices.

[Figure 31 axes: annual trips per capita, 0 to 35, by year ended June, 2001 to 2013. Series: bus, rail, ferry, all.]

Figure 32 Real transport price indexes.

Source: Calculated from Statistics New Zealand and Ministry of Transport data.

3.2.5 Private vehicle ownership

The rate of vehicle ownership appears to have changed in recent years (Figure 33).

Figure 33 Registered cars per capita in New Zealand.

Source: Calculated from Statistics New Zealand data.

[Figure 32 axes: index, 0 to 2,000, by calendar year, 1991 to 2013. Series: petrol, diesel, purchase of vehicles, passenger transport services, domestic air transport.]

[Figure 33 axes: cars per capita, 0.40 to 0.56, by year, 1991 to 2013.]

After increasing relatively strongly between 1991 and 2007, the number of registered

cars per capita in New Zealand has generally declined, although there was a small

increase between 2012 and 2013. While we do not model vehicle ownership directly in

our analysis, this change in propensity to own private vehicles will show up in petrol

volumes and road user charges, and is therefore implicit in our modelling.

3.3 International transport trends

While we have not been able to undertake a comprehensive analysis of transport

activity in other countries, trends similar to those in New Zealand have been observed

elsewhere. Figure 34 shows an index of annual vehicle-miles travelled (VMT) in the United States and United

Kingdom. Aggregate transport activity grew strongly for most of the 1980s and 1990s,

but experienced weaker growth in the early 2000s, declined during the financial crisis

and recession in the mid-2000s, and has subsequently remained constant.

Figure 34 Index of annual VMT in the US and UK.

Source: US Department of Transportation & UK Department for Transport.

Internationally, there has been considerable interest in whether the reduction in

transport activity observed since the mid-2000s, in contrast to three decades of steady

growth prior, is a temporary or permanent change. There appear to be two schools of

thought regarding this question. One school suggests that this is a temporary shock,

caused mainly by the global financial crisis, higher unemployment, and volatile fuel

prices. For example, the International Transport Forum (2012) stated:

The 2008 financial crisis triggered a severe, sudden and synchronised drop in demand

leading to strong reductions in global output, trade and transport volumes.


The ITF generally considers these shocks to be temporary, and concludes:

Transport flows are expected to grow strongly … driven by higher GDP and larger

populations. In the OECD, passenger transport volumes in 2050 are expected to be 10% to

50% higher than in 2010. Freight transport is expected to grow by 50% to 130%.

The other school of thought, typified by authors such as Litman (2013), is that lower or declining transport volumes are the “new normal”, due to permanent rather than temporary changes in demographics, urbanisation, and the structure of the economy.

Litman argues:

Aging population, rising fuel prices, increasing urbanization, improving travel options,

increasing health and environmental concerns, and changing consumer preferences are

reducing demand for automobile travel and increasing demand for alternatives. Automobile

travel will not disappear, but at the margin (compared with current travel patterns) many

people would prefer to drive less and rely more on walking, cycling, public transport and

telework, provided they are convenient, comfortable and affordable.

This controversy makes forecasting future transport volumes challenging. At the very

least, uncertainty over future transport projects is greater than it has been in the past, as

previously steady growth has failed to occur for much of the past decade. We return to

this issue in the New Zealand context in our analysis of private transport activity in

section 7 below.


4 Issues with the existing NLTF model

The primary concern with the existing model relates to the accuracy and reliability of PED and RUC revenue forecasts. Other issues arise from the way that the model has been designed and implemented. This section is based on our review of the model provided to us by the Ministry of Transport (updated to October 2013), and discussions with Ministry staff.

4.1 Forecast accuracy and reliability

The structure of the existing model is explained in detail by Deloitte (2011a). Time-series

econometric techniques were used to estimate error-correction models (ECMs) for PED

and RUC volumes (litres and km respectively), as a function of a number of explanatory

variables. These models are used to generate forecasts of PED and RUC volumes, from

which NLTF revenues are calculated by applying appropriate dollar values.

ECMs contain two components – a long-run trend and a short-run ‘error correction’

component that models the way the dependent variable deviates from the long-run

trend in the short term. Over time, the long-run trend dominates, but the short-run

component is also important for short-term forecasting accuracy.
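To make the two components concrete, the following is a minimal sketch of a single-equation ECM forecast step. The functional form, the coefficient values and the names (`long_run_level`, `ecm_step`, `forecast_path`, `alpha`, `beta`, `gamma`, `lam`) are illustrative assumptions, not the specification documented in Deloitte (2011a).

```python
def long_run_level(x, alpha, beta):
    """Long-run (equilibrium) value of the dependent variable given driver x."""
    return alpha + beta * x


def ecm_step(y_prev, x_prev, x_now, alpha, beta, gamma, lam):
    """One-period ECM forecast (hypothetical functional form).

    dy = gamma * dx - lam * (y_prev - long_run_level(x_prev))
    gamma: short-run response to a change in the driver
    lam:   share of last period's gap from the long-run level that is
           closed each period (0 < lam < 1)
    """
    gap = y_prev - long_run_level(x_prev, alpha, beta)
    return y_prev + gamma * (x_now - x_prev) - lam * gap


def forecast_path(y0, x_path, alpha, beta, gamma, lam):
    """Iterate the ECM over a path of driver values x_path[0], x_path[1], ..."""
    y = [y0]
    for t in range(1, len(x_path)):
        y.append(ecm_step(y[-1], x_path[t - 1], x_path[t], alpha, beta, gamma, lam))
    return y


# Illustrative numbers only: the driver is held constant and y starts above
# its long-run level (10), so the error-correction term pulls it back.
path = forecast_path(y0=12.0, x_path=[5.0] * 9, alpha=0.0, beta=2.0, gamma=0.5, lam=0.5)
print([round(v, 3) for v in path])
```

With a constant driver, the gap to the long-run level halves each period in this example, which is the sense in which the long-run component comes to dominate the forecast.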

The models were estimated using quarterly data. This is because although only annual

forecasts are required, the models need to be updated and new forecasts generated

every quarter. Some of the quarterly data series, particularly PED volumes, exhibit

significant volatility. Some of this may be seasonal, but some simply arises from random

factors affecting the timing of fuel imports (see section 6.1.1 below). This creates

significant technical challenges for the short-run models, and makes it difficult to

generate highly accurate short-run forecasts.

The PED, heavy RUC, and light RUC volume forecasts generated by the model

provided to us are shown in Figure 35 to Figure 37. PED volume is forecast to grow

strongly in the first four years, and more slowly thereafter. Questions have been raised

about the reliability of this forecast, given that PED volumes have generally been falling

over the past five years. In contrast, the model predicts that by 2015-16, PED volume

will exceed the peak that it reached in 2007-08. More generally, the Ministry has

expressed concern that the only negative long-run driver of PED volume in the model is

real petrol prices, and given that real petrol prices are expected to be constant in the

short term, the model predicts strong growth in PED volume in the coming years as a

result of increased economic activity.

The RUC volume forecasts are more consistent with recent trends, although the growth rate of heavy RUC is forecast to accelerate over time. In contrast to PED, heavy RUC volumes appear to have recovered from the recession, with increases in the three years since 2010-11. The growth rate of light RUC has also been relatively stable over time, and the forecast is generally in line with this. RUC volumes are also significantly more stable than PED volumes, which makes forecasting easier.


Figure 35 PED volume forecasts produced by the existing model.


Figure 36 Heavy RUC volume forecasts produced by the existing model.


[Figure 35 chart: PED litres (millions), actual and forecast, 2000-01 to 2022-23.]

[Figure 36 chart: RUC km (millions), actual and forecast, 2000-01 to 2022-23.]

Figure 37 Light RUC volume forecasts produced by the existing model.


In an initial review of the model’s forecasting accuracy, Deloitte (2012) assessed the

model’s forecasts for the 2010-11 and 2011-12 years. Figure 38 summarises Deloitte’s

analysis of the model’s forecasting performance in those years for heavy and light RUC,

and PED. The total forecast error was decomposed into model error (due to the

imperfect fit of the model) and input error (due to incorrect assumptions about forecast

drivers). In 2010-11 the input error dominated and model errors were relatively small,

but in 2011-12 the model errors became more significant. This is to be expected with

forecasting over longer time horizons, but also raises the question of whether the model

continues to be valid.

To address this question, Deloitte conducted some simple “structural break” tests by

examining regression residuals and testing the significance of dummy variables at

different points. No evidence of a structural break was found in the PED model. In

respect of RUC, Deloitte found:

• Some evidence of structural breaks for heavy RUC, but negligible overall improvement from incorporating these in the model.
• No strong evidence of structural breaks for light RUC.

More recently, the Ministry reviewed actual NLTF revenue compared to the forecasts

and found that the forecasting model had over-predicted PED and heavy RUC by up to

5%, and under-estimated light RUC revenue by up to 2%.

[Figure 37 chart: RUC km (millions), actual and forecast, 2000-01 to 2022-23.]

Figure 38 Forecasting errors of the NLTF revenue model in 2010-11 and 2011-12.

Source: Adapted from Deloitte (2012), table 2.3.
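The split of total forecast error into model error and input error described above can be illustrated with a short sketch. It assumes the decomposition is made by re-running the model with the actual (outturn) values of the drivers, which is one common approach and not necessarily the method used by Deloitte (2012). The function name and the volumes are hypothetical.

```python
def decompose_forecast_error(forecast_assumed_inputs, forecast_actual_inputs, actual):
    """Split total percentage forecast error into input and model components.

    forecast_assumed_inputs: forecast made with the assumed driver values
    forecast_actual_inputs:  forecast re-run with the observed driver values
    actual:                  observed outturn
    All errors are expressed as a percentage of the actual outturn, and
    total = model + input by construction.
    """
    total = 100.0 * (forecast_assumed_inputs - actual) / actual
    model = 100.0 * (forecast_actual_inputs - actual) / actual
    input_error = 100.0 * (forecast_assumed_inputs - forecast_actual_inputs) / actual
    return {"total": total, "model": model, "input": input_error}


# Hypothetical volumes (millions of litres), not figures from the report.
errors = decompose_forecast_error(
    forecast_assumed_inputs=3150.0,
    forecast_actual_inputs=3030.0,
    actual=3000.0,
)
print(errors)
```

In this made-up case the input error (4%) dominates the model error (1%), the same qualitative pattern that the text reports for 2010-11.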

4.2 Design and implementation

We have briefly reviewed the Excel implementation of the existing NLTF forecasting model. Overall, the model is complex. This partly reflects the complexity of the forecasting task, but the design seems unnecessarily complex in some respects. In this section we comment on the model’s structure and its ease of use; in section 9.2 below we suggest ways that the model could be improved, taking into account our modelling of PED and RUC volumes.

The model is structured around a series of sub-models for the various components of the NLTF. Each sub-model takes various data and assumptions, and generates a quarterly forecast of one component of NLTF revenue. These quarterly forecasts are then aggregated to June years for presentation.
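The aggregation from quarters to June years can be sketched as follows. This is an illustration only: it assumes calendar quarters (so that quarters 3 and 4 belong to the June year ending in the following calendar year), and the function names and data are hypothetical rather than taken from the Excel model.

```python
from collections import defaultdict


def june_year_label(year, quarter):
    """Return the June-year label (e.g. '2012-13') for a calendar quarter.

    Quarters 3 and 4 (July to December) fall in the June year that ends in
    the following calendar year; quarters 1 and 2 fall in the June year
    that ends in the same calendar year.
    """
    if quarter not in (1, 2, 3, 4):
        raise ValueError("quarter must be 1, 2, 3 or 4")
    end_year = year + 1 if quarter >= 3 else year
    return f"{end_year - 1}-{end_year % 100:02d}"


def aggregate_to_june_years(quarterly):
    """Sum {(year, quarter): value} into {june_year_label: total}.

    Only June years with all four quarters present are returned, so that a
    partial year is never reported as if it were a full-year total.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for (year, quarter), value in quarterly.items():
        label = june_year_label(year, quarter)
        totals[label] += value
        counts[label] += 1
    return {label: totals[label] for label in totals if counts[label] == 4}


# Hypothetical quarterly volumes; the single 2013 Q3 value is dropped because
# the 2013-14 June year is incomplete.
quarterly = {(2012, 3): 700.0, (2012, 4): 760.0, (2013, 1): 740.0,
             (2013, 2): 720.0, (2013, 3): 710.0}
print(aggregate_to_june_years(quarterly))
```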

Separation between actual data and forecast assumptions

The model does not maintain a clear separation between actual (observed) data for the

various drivers of the NLTF revenue forecasts and future assumptions about these

variables. For example, Figure 39 shows the table where actual and future forecast

economic variables are entered. It is not clear from this table where the actual data ends

and the forecasts begin. This may make it difficult to update the actual data, and may

lead to errors when specifying forecast scenarios.

[Figure 38 chart: forecast error (axis -2% to 6%) for Light RUC, Heavy RUC and PED in 2010-11 and 2011-12, split into model error and input error.]

Figure 39 Table for entering actual and forecast economic variables.

Source: NLTF forecasting model

Re-estimation of model coefficients

The model allows for re-estimation of the coefficients of the underlying econometric models when new data becomes available, via Excel’s built-in linear regression functions. The ability to re-estimate the coefficients is useful, but additional econometric diagnostic tests should be performed at the same time, to ensure that the models remain valid. Such testing is not possible within Excel, so allowing the coefficients to be updated creates a danger that the models will become invalid over time without this being noticed. In our view it would be better for re-estimation of the models to be undertaken as a separate process, every one or two years, including a full set of statistical diagnostic tests (see section 9.1.5 below).

The steps required to update the coefficients in the model are also relatively complex

and may be error-prone (Figure 40). In part this is because the existing model is unable

to detect the presence of new actual data that has been entered, and consequently a

number of manual steps are required to update the data (see also below).


Figure 40 Instructions for re-estimating coefficients in the existing model.


ECM structure is complex

The heart of the forecasting model is a number of ECMs, which allow for relatively

sophisticated time-series dynamics. However, the short-run components of the models

appear to be complex, and include a number of variables that are not included in the

long-run components of the models. This may make it difficult to explain the short-run

forecasts that the model produces.

For example, the long-run model for PED contains three variables: real GDP, real

household consumption, and the real petrol price. The short-run model for PED

contains real GDP as well as five other variables that do not appear in the long-run

model: real investment, the stock of passenger vehicles, the working age population, the

retirement age population, and the difference between retail mortgage rates and the 90-

day interest rate (the ‘interest wedge’). While the long-run model for PED is relatively

easy to understand, the short-run model depends on six different forecasts, and may

produce dynamics that are not easy to explain. Given the focus on short-term forecasts,

this is of concern.

While we have not undertaken a full econometric review of the ECMs in the existing

model, a brief analysis suggests problems, including the use of a number of statistically

insignificant variables. For example, in the long-run PED model, both real GDP and

household consumption are insignificant; this is likely due to the very high correlation

between these variables and it may not be useful to include both in the model.
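The effect of highly correlated regressors can be illustrated with the variance inflation factor (VIF). For a model with exactly two regressors, the VIF equals 1 / (1 - r^2), where r is the correlation between them. The sketch below uses made-up index series for GDP and household consumption, not data from the NLTF model, and the function names are our own.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_regressors(x1, x2):
    """Variance inflation factor when a model has exactly two regressors."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Hypothetical series that trend together, as GDP and consumption tend to.
gdp = [100.0, 102.0, 105.0, 107.0, 110.0, 114.0]
consumption = [60.0, 61.5, 63.0, 64.0, 66.5, 68.5]

print(round(pearson_r(gdp, consumption), 4))
print(round(vif_two_regressors(gdp, consumption), 1))
```

A large VIF means the coefficient variances are inflated by that factor relative to uncorrelated regressors, which is consistent with both variables appearing individually insignificant even when they are jointly informative.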

A complex seasonal adjustment model is used

All of the NLTF volumes (and other variables) in the model appear to have a process of

seasonal adjustment applied to them, regardless of whether reliable seasonal effects are

present. This is particularly problematic for PED volume, which does not appear to have

a regular seasonal pattern (see section 7.1.1 below). The result is seasonal adjustment

factors for PED volume that are very unstable (Figure 41).


Figure 41 Seasonal adjustment factors for PED volume in the existing model.

Source: Covec analysis of the NLTF forecasting model.
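A simple way to check whether a stable seasonal pattern exists is to compute each quarter's ratio to its own year's average and see how much that ratio moves between years. The sketch below is a deliberately simplified diagnostic under that assumption; it is not the seasonal adjustment procedure used in the existing model, and the function names and data are hypothetical.

```python
def seasonal_factors_by_year(quarterly):
    """quarterly: {year: [q1, q2, q3, q4]} -> {year: [f1, f2, f3, f4]}.

    Each factor is the quarter's value divided by that year's quarterly
    average, so the four factors for a year average to 1.
    """
    factors = {}
    for year, values in quarterly.items():
        if len(values) != 4:
            raise ValueError(f"year {year} does not have four quarters")
        mean = sum(values) / 4.0
        factors[year] = [v / mean for v in values]
    return factors


def factor_range_by_quarter(factors):
    """Max minus min of each quarter's factor across years.

    A large range suggests there is no stable seasonal pattern to adjust for.
    """
    ranges = []
    for q in range(4):
        column = [f[q] for f in factors.values()]
        ranges.append(max(column) - min(column))
    return ranges


# Hypothetical data: the same seasonal shape every year versus a shape that
# reverses between years.
stable = {2010: [90.0, 100.0, 110.0, 100.0], 2011: [99.0, 110.0, 121.0, 110.0]}
unstable = {2010: [90.0, 100.0, 110.0, 100.0], 2011: [110.0, 100.0, 90.0, 100.0]}

print(factor_range_by_quarter(seasonal_factors_by_year(stable)))
print(factor_range_by_quarter(seasonal_factors_by_year(unstable)))
```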

Ad hoc parameter shocks are allowed for

The model allows for manual adjustment to all of the regression coefficients in the

models, via either a level or percentage adjustment (Figure 42). This adds complexity to

the model and increases the possibility of errors, for limited benefit in our view.

Adjusting the parameters of the models defeats the purpose of using econometric

models for forecasting, and the basis for making any adjustments to the parameters is

unclear. Such adjustments reduce the transparency and predictability of the forecasts

produced by the model, and also require a relatively complex infrastructure to allow

these shocks to flow through to the forecast calculations.

Figure 42 Parameter shock control panel in the existing model.


[Figure 41 chart: seasonal adjustment factor (axis 0.80 to 1.20) for Q1, Q2, Q3 and Q4, 1994 to 2012.]

Workflow for specifying the forecast scenario is complex

Ideally, the model would allow the user to quickly and easily specify a forecast scenario

encompassing all of the exogenous variables in the model. Instead, the forecasts of

exogenous variables are mixed with actual data for these variables (see above), and a

separate set of shocks to the variables is allowed for (Figure 43). This structure makes it

difficult to understand exactly what the forecast scenario is, and the process for testing different scenarios appears to be complex and error-prone, particularly given that a large number of variables need to be specified.

Figure 43 Specification of variable shocks in the existing model.


Model updating process is complex

Finally, the process for updating the model each quarter when new data becomes

available appears to be complex and time-consuming. This is partly because the model

does not automatically detect the presence of new data and manual changes are

required to move the forecasting period forwards (Figure 44).

Figure 44 Instructions for updating the existing model with new actual data.



5 Literature review

In this section we briefly review the recent literature relevant to forecasting transport

activity in New Zealand and elsewhere. Recent work for NZTA has undertaken similar

reviews (Simic & Bartels, 2013, and Stephenson & Zheng, 2013), so to avoid duplication

of effort our review is limited to a summary of the key results that are directly relevant

for NLTF revenue forecasting.

We divide the review into literature relevant to modelling and forecasting private transport activity and literature relevant to commercial transport activity.

5.1 Private transport activity

5.1.1 New Zealand literature

Aside from the existing NLTF forecasting model, the most recent empirical analysis of

private transport activity in New Zealand appears to be Stephenson & Zheng (2013). For

NZTA, they develop a national long-term land transport demand model (NLTDM), for

the purpose of analysing the effects of “mega-trends” such as population growth,

demographic changes, and income and economic growth, on transport demand over a

30-year period.

The requirement to model changes in a large number of potential factors led Stephenson

& Zheng to choose a “hybrid” model, incorporating a combination of econometric

models for individual components of transport demand that are combined together in a

structural model of aggregate transport demand. This design makes the model suitable

for testing the effects of the “mega-trends” on transport activity, but may reduce overall

forecasting accuracy as the model contains a large number of parameters, each of which

is only imprecisely estimated.

While diagnostic tests of the individual econometric models are reported by Stephenson

& Zheng, the overall fit of the combined model and its ability to explain observed trends

in transport demand is not reported. For this reason it is difficult to be certain whether

the NLTDM can produce accurate forecasts of NLTF revenues, particularly over the

short term. However, the model can be used to investigate the effects of a large number

of “mega-trends” on transport activity. By design, the model appears to be more useful

for policy analysis and long-term forecasting, rather than short-term forecasting of

NLTF revenues.

In other recent work for NZTA, Simic & Bartels (2013) reviewed methodologies for

analysing the relationship between income growth and passenger vehicle travel. They

noted that the data available in New Zealand was amenable to econometric modelling

but did not support discrete choice modelling (ie modelling transport decisions at the

individual level). They recommended, but did not develop, a panel data model, using

regional VKT data as the dependent variable and income and employment levels across

regions as explanatory variables, consistent with their focus on the role of income as a

driver of transport demand. Simic & Bartels noted significant variance in the regional

VKT data and suggested that econometric analysis may prove difficult.


Milne et al (2011) use data from the New Zealand Household Travel Survey (NZHTS) to

study private travel patterns and trends, and assess the use of this data for building a

predictive model of household trips. Their predictive model sought to explain

household trips by mode and area as a function of population demographics, household

types, and car ownership rates. The model is calibrated using simple averages and ratios

calculated from the NZHTS and other data sources, and the goodness of fit of the model

to the actual data is not reported. Furthermore the relatively low frequency of the

NZHTS reporting makes it difficult to use for NLTF revenue forecasting. However in

principle this research suggests that changes in household characteristics may be

important for forecasting changes in private transport activity.

Conder (2009) is another recent example of private transport activity modelling in New

Zealand. Conder’s analysis focuses on car ownership and use, and develops a model of

car ownership as a function of GDP per capita, car purchase prices, and a time trend.

Conder recommended improving this model to estimate car ownership as a function of

household characteristics, and combine that with a model that estimates car use also as a

function of household characteristics, in order to develop an aggregate model of private

transport demand. While Conder noted that sufficiently detailed data was unlikely to be

available to fully support this approach, some aspects of Conder’s recommendations

feature in the model developed by Stephenson & Zheng (2013).

The most comprehensive analysis of public transport demand in New Zealand appears

to be Wang (2011). Wang used econometric models to analyse public transport

patronage in Auckland, Wellington and Christchurch as a function of service levels,

fares, income per capita, car ownership, and fuel prices. While the results vary

somewhat in terms of statistical significance, Wang showed that these factors can have a

significant effect on public transport demand, and Simic & Bartels (2013) argued that

this suggests including public transport variables in models of private car transport.

The Australian Bureau of Infrastructure, Transport and Regional Economics (BITRE)

undertook an econometric exercise to model long-term VKT trends in 25 different

countries including New Zealand (BITRE, 2013a). Econometric models were used to

explain trends in VKT per capita, which is multiplied by population to get predictions of

aggregate annual VKT. The estimated model for New Zealand contains a quadratic time

trend, the unemployment rate, and dummy variables for the global financial crisis and

the fourth quarter of 1988 (Figure 45).
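The structure of this type of model can be sketched as a per-capita equation whose output is scaled by population. The coefficient values below are invented for illustration and are not the BITRE (2013a) estimates; the 1988 dummy is omitted for brevity, and the function and key names are our own.

```python
def vkt_per_capita(t, unemployment_rate, gfc, coef):
    """VKT per capita from a quadratic time trend, the unemployment rate
    and a global financial crisis dummy (gfc is 0 or 1)."""
    return (coef["const"]
            + coef["t"] * t
            + coef["t2"] * t * t
            + coef["unemp"] * unemployment_rate
            + coef["gfc"] * gfc)


def total_vkt(per_capita, population):
    """Aggregate VKT is per-capita VKT multiplied by population."""
    return per_capita * population


# Hypothetical coefficients and inputs, chosen only to show the arithmetic.
coef = {"const": 8000.0, "t": 100.0, "t2": -2.0, "unemp": -50.0, "gfc": -200.0}
per_capita = vkt_per_capita(t=10, unemployment_rate=6.0, gfc=1, coef=coef)
aggregate = total_vkt(per_capita, population=4_400_000)
print(per_capita, aggregate)
```

Because the aggregate is a product, slow growth in VKT per capita can still translate into relatively strong growth in total VKT when population is rising.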

The BITRE model fits the historical VKT data for New Zealand very well, explaining

99.9% of the variation in VKT per capita over time, and with all variables statistically

significant at the 5% level except the 1988 dummy variable. The model predicts

relatively slow future growth in VKT per capita, but relatively strong growth in total

VKT due to expected increases in population. However, time-series diagnostic tests are

not reported, therefore it is unknown whether the apparently strong regression

relationship found by the BITRE is statistically robust.


Figure 45 Actual and modelled New Zealand VKT per capita (BITRE, 2013a).

5.1.2 International literature

The international literature on transport modelling is large and includes many different

types of models that have been used for various purposes. In this section we review

some of the recent international literature that we consider to be relevant for forecasting

private transport activity in New Zealand.

Ecola & Wachs (2012) investigate the relationship between land transport activity

(measured by VMT) and GDP growth in the United States. They observe that the

historically close relationship between GDP and VMT appears to have broken down,

starting in the early 2000s (Figure 46). Ecola & Wachs argue that the direction of causality between VMT and GDP growth is uncertain. They do not identify the reasons for the decoupling of GDP and VMT observed in the United States, but they do note that policy could be designed to achieve such decoupling if desired (eg for environmental reasons), although the effectiveness of such policies is uncertain.

Pickrell et al (2012) develop and test econometric models to forecast VMT in the United

States by vehicle type and road type. To explain private transport, the variables tested

by Pickrell et al, as reported by Simic & Bartels (2013), include:

• Economic activity measures: Total GDP, disposable income, and median household income
• Demographic characteristics: The total population, the proportion of population aged 20-65, average persons per household, the proportion of population in urban areas, and the proportion of families with children
• Cost of driving: The gasoline price per gallon, average fuel economy, and average fuel cost per mile
• Vehicle prices: New and used vehicle prices, vehicle parts prices, and new vehicle real sales
• Road supply: Total road-miles, and road-miles per vehicle
• Employment: Total employment, the labour force participation rate, and the number of employed persons per household
• Public transit service: Vehicle-miles of bus and rail services, and the number of cities with rail transit services.

Figure 46 Relationship between total VMT (left scale) and GDP (right scale) in the United States (Ecola

& Wachs, 2012).

Souche (2010) estimates a model of travel demand within urban areas using a cross-

section of data from 100 different cities around the world. As such, this model is not

very useful for forecasting, but it is useful for identifying the factors that may determine

travel demand in urban areas. Demand functions for private vehicle and public

transport travel are estimated separately, and statistically significant relationships are

found between these variables and the cost of private car travel and urban density.

Higher cost for car travel or higher urban density is found to decrease demand for car

travel and increase demand for public transport.

[Figure 46 image, reproduced from Ecola & Wachs (2012), Figure 1: Total Auto and Truck VMT (trillions, left axis) and GDP (trillions of $2005, right axis), 1936-2011.]

A similar model for Sydney was developed by Corpuz et al (2007). Explanatory variables tested in the model included access to public transport, employment and housing density, the mix of land use, dwelling types, income levels, and population

demographics. A process of model selection and validation was used to determine that

household VKT was best explained by the number of vehicles per household, the

distance to a major centre, the mix of land use, the proportion of local employment,

housing density, and the distance to public transport infrastructure. This suggests that

local factors can have a significant effect on VKT at the micro level, and to the extent

that these factors are changing (and can be measured) at the national level, they may be

useful for aggregate VKT forecasting.

The BITRE recently undertook analysis of public transport demand in each of

Australia’s state capital cities between 1976 and 2011 (BITRE, 2013b). Total travel per

capita in each city was modelled as a function of a time trend, petrol prices, the

unemployment rate, and dummy variables capturing the global financial crisis and city-

specific events such as the Sydney Olympics in 2000. Given this total, the proportion of

travel undertaken on public transport was modelled as a function of a “disposable

income constraint” variable derived from socio-economic factors such as housing and

food costs and petrol prices, as well as real public transport fares. In general, the BITRE

public transport models fit the historical data well, although as with the other BITRE

study, time-series diagnostic tests are not reported.

5.2 Commercial transport activity

Several previous authors have investigated determinants of heavy traffic on roads, but

there is considerable variation in the methods used.

BITRE (2010) focus on estimating the size of the “freight task” on a route-specific basis.¹

The routes are origin-destination pairs that connect 56 different locations. This work is

helpful for understanding the likely use of particular roads, but is more geographically

disaggregated than required for NLTF forecasting in New Zealand. Moreover, because

the outputs sought by BITRE are so complex, their forecasting strategies are

correspondingly more constrained. Indeed, much of the work reported by BITRE involves adjusting and interpolating data, in order to construct a 35-year time series of origin-destination matrices (see Appendix B of the BITRE report). Their forecasting methodology is embedded in the process they used to create the data set.

For these reasons, the BITRE work has quite low practical relevance for our purposes.

However we do note that BITRE use a subset of GDP (specifically, non-farm GDP) as an

explanatory variable.

Simic and Bartels (2013)² investigate both data and modelling issues of relevance to the

forecasting of RUC revenues. Their report offers some useful insights into the likely

drivers of heavy road transport in New Zealand. We found the following observations

particularly useful:

1 BITRE (2010) Road freight estimates and forecasts in Australia: interstate, capital cities and rest of state. Research Report 121.

2 Simic, A., and R Bartels (2013) Drivers of demand for transport. NZ Transport Agency research report

534.

• The heavy vehicle VKT data from NZTA appear anomalous in showing a dip in 2010, which is a year later than other VKT data;
• A break in the relationship between tonne kms and real GDP appears to have occurred around 2004;
• Truck load assumptions were revised in 2008;
• Substitution appears to have occurred from rail to road during the period 2002 to 2008;
• Price changes could be relevant, both for diesel (real price increase from 2004) and for RUC charges from 2007; and
• They recommend not using data from before 1992 due to quality concerns.

Most of these observations are reflected in our modelling work below. We generally avoid using VKT and instead model RUC kilometres directly. Given the potential for structural

breaks and the concerns about data quality in earlier years, our modelling has used a

data set beginning in 2000. We also explore price effects and find these to be helpful in

the regression models.

Stephenson and Zheng (2013)³ report on a project that developed “a tool for considering how transport might evolve over time”. Their work has a long-term focus, with a 30-year time horizon. It is also aimed at modelling the value of freight rather than freight volume.

The “value” focus of this work is not particularly useful for our work because we are

mainly concerned with the volume of transport movements. We note that the authors

suggest that their model could “with some adjustments” be used for forecasting NLTF

revenues. However a forecast of the value of freight will have its own errors, and there

will be a second error component introduced in converting value forecasts to volume

forecasts. For this reason we expect that directly forecasting RUC volumes will be

preferable for forecast accuracy, ease of understanding, and ease of updating.

The Stephenson & Zheng work can be described as a hybrid model because it models

historic changes in freight intensity by sector, assumes those freight intensities will

persist into the future, and then uses sector-level GDP forecasts to generate freight

forecasts. For NLTF forecasting, we are interested in the volume of freight rather than its

value. Nevertheless, we note that the use of industry-level GDP variables is consistent with BITRE (2010) and also with other work such as Shen et al (2009).⁴

3 Stephenson, J and L Zheng (2013) National long-term land transport demand model. NZ Transport

Agency research report 520.

4 Shen, S., T. Fowkes, T. Whiteing and D. Johnson, (2009) Econometric modelling and forecasting of

freight transport demand in Great Britain. Institute for Transport Studies, University of Leeds.


6 Data review

We have reviewed the available data for modelling PED and RUC volumes. These

consist of data on PED and RUC volumes themselves, as well as data that could be used

to forecast these volumes over time. The following summarises the data we have used in

our analysis and comments on its characteristics.

6.1 Transport activity data

The measures of transport activity used in our analysis are the volumes to which PED

and RUCs are applied. We have also considered vehicle-kilometres travelled (VKT) data

as an alternative for PED modelling, given the volatility in the PED volume data.

6.1.1 PED volumes

PED volumes are not observed directly, but can be inferred from the amount of PED

revenue received and the PED rate applicable at the time. This calculation is performed

on a quarterly basis within the current NLTF forecasting model, and we have used those

volumes as the basis of our analysis of PED. The available data runs from the first

quarter of 1994 to the third quarter of 2013 (Figure 47). In general, quarterly PED

volumes increased from 1994 to early 2008 and have subsequently declined.
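As a minimal sketch of the inference described above, volume can be backed out by dividing revenue by the duty rate. The function name and the figures used here are illustrative assumptions, not values taken from the NLTF model.

```python
def ped_volume_litres(revenue_dollars: float, rate_cents_per_litre: float) -> float:
    """Infer the taxed petrol volume from duty revenue and the duty rate."""
    if rate_cents_per_litre <= 0:
        raise ValueError("rate must be positive")
    # Convert the rate from cents to dollars per litre before dividing.
    return revenue_dollars / (rate_cents_per_litre / 100.0)


# Hypothetical quarter: $375 million of revenue at 50 cents per litre.
volume = ped_volume_litres(375e6, 50.0)
print(f"{volume / 1e6:.0f} million litres")
```

Because the volume is derived from revenue receipts, any timing irregularity in the receipts passes straight through to the inferred volume.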

Figure 47 Quarterly PED volume and polynomial trend.


The other notable feature from Figure 47 is the considerable volatility of quarterly PED

volumes. Volume has varied by more than 200 million litres from one quarter to the

next. Volatility is also apparent on a year-on-year basis, with year-on-year quarterly

growth rates exhibiting considerable variance (Figure 48). The reason for this volatility


appears to be the timing of imports. These do not occur on a fixed schedule, and each

shipment is relatively large, so a small change in timing can cause quarterly volumes to

vary substantially. Figure 49 shows the value of PED revenue received from local

sources versus imports since 2006, and while local revenues are relatively stable,

quarterly import revenues are volatile. We return to this issue in section 7.1 below.
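The year-on-year growth rate of a quarterly series compares each quarter with the same quarter one year earlier. A short sketch follows; the input values are invented for illustration.

```python
def yoy_growth(series: list[float]) -> list[float]:
    """Growth of each quarter relative to the same quarter a year earlier."""
    return [series[i] / series[i - 4] - 1.0 for i in range(4, len(series))]


# Hypothetical quarterly volumes (million litres).
volumes = [100.0, 100.0, 100.0, 100.0, 110.0, 90.0]
print([f"{g:+.0%}" for g in yoy_growth(volumes)])
```

The first four quarters have no prior-year comparison, so the growth series is four observations shorter than the input.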

Figure 48 Year-on-year growth rate of quarterly PED volume.


Figure 49 Local versus imported PED revenues.


[Figure 48 axes: year-on-year growth rate by quarter, 1994-1 to 2013-3. Figure 49 axes: million dollars by quarter, 2006-1 to 2013-3; series: Local, Imported.]

6.1.2 RUC volumes

Unlike PED, RUC volumes are observed directly. Figure 50 shows the quarterly heavy

RUC (> six tonnes) volumes used in our analysis. The available data runs from the first

quarter of 2000 to the third quarter of 2013. In contrast with PED volume, the series is

relatively stable.

Figure 51 shows the year-on-year growth rate of heavy RUC volume. Growth averaged

4.4% per annum between 2001 and 2005, but from 2006 to 2010 heavy RUC volumes

were essentially constant. Subsequently growth resumed, averaging 2.3% per annum

after 2010, but there are some signs that growth slowed again at the end of 2013.

As with heavy RUC, light RUC (one to six tonnes) volumes are relatively stable over

time (Figure 52). Also apparent are some spikes in volume, corresponding to

temporarily higher purchasing in advance of RUC rate increases, with an offsetting drop

in the following quarter. This suggests that RUC prices may be important for

determining the timing of RUC volumes.

Light RUC volumes have grown steadily over time (Figure 53), although the growth

rate has slowed; annual growth averaged 5.8% from 2001 to 2007, and averaged 1.3%

from 2008 onwards. In 2013 there are some indications that the growth rate has

increased slightly.


Figure 50 Heavy RUC volume (net km) and polynomial trend.


Figure 51 Heavy RUC volume (net km) year-on-year growth rate.


[Figure 50 axes: net RUC volume (million km) by quarter, 2000-1 to 2013-3. Figure 51 axes: year-on-year growth rate by quarter, 2001-1 to 2013-3.]

Figure 52 Light RUC volume (net km) and polynomial trend.


Figure 53 Light RUC volume (net km) year-on-year growth rate.


[Figure 52 axes: net RUC volume (million km) by quarter, 2000-1 to 2013-3. Figure 53 axes: year-on-year growth rate by quarter, 2001-1 to 2013-3.]

6.1.3 VKT

Given the volatility of PED volumes, in section 7.5 below we investigate the use of light

petrol VKT data to model PED volume indirectly. Figure 54 shows quarterly VKT for

light petrol vehicles, calculated from Ministry of Transport odometer data.5 In

comparison with PED volumes, a shorter quarterly VKT time-series is available (from

2001 onwards), and the VKT data exhibits a clear seasonal pattern. Overall, the trend in

light petrol VKT is similar to that of PED volumes, ie increasing until around 2006,

fluctuating over the next two years, and then generally declining from 2008.

Figure 54 Quarterly VKT for light petrol vehicles.


6.2 Potential explanatory variables

A wide range of potential explanatory variables was considered for our analysis. Table

11 overleaf summarises the characteristics of the data we obtained. An important

consideration for each variable is whether a forecast is available from some reliable

source. If not, the variable is not precluded from use in our analysis, but it means that a

forecast of that variable would have to be developed as part of the forecasting process.

Additional variables were subsequently tested in the models in response to feedback

from the NLTF forecasting group. These are described separately below.

5 There is some petrol consumption by medium and heavy vehicles, but this is insignificant in

comparison to light vehicles.

[Figure 54 axes: VKT (million km) by quarter, 2001-1 to 2013-1; series: Actual, 4-quarter moving average.]

Table 11 Summary of explanatory variables for PED and RUC volume forecasting.

| Variable | Source | Period | Frequency | Forecast available? | Comments |
|---|---|---|---|---|---|
| Prices | | | | | |
| Real retail petrol price | MoT or MBIE | 1991 - | Quarterly | No | Current model forecasts on the basis of oil prices |
| Real retail diesel price | MoT or MBIE | 1991 - | Quarterly | No | Current model forecasts on the basis of oil prices |
| Real RUCs | MoT | 1991 - | Quarterly | n/a | RUC rate set by govt |
| Transport price indexes | Stats NZ (CPI) | Various | Quarterly | No | |
| Economic activity | | | | | |
| Real GDP (total) | Stats NZ or Treasury | 1988 - | Quarterly | Yes (Treasury) | Actual & seasonally adjusted are available |
| Real GDP (by industry) | Stats NZ | 1988 - | Quarterly | No | Actual & seasonally adjusted are available |
| Real household consumption | Stats NZ or Treasury | 1988 - | Quarterly | Yes (Treasury) | Actual & seasonally adjusted are available |
| Unemployment rate | Stats NZ or Treasury | 1990 - | Quarterly | Yes (Treasury) | Actual & seasonally adjusted are available |
| International trade | | | | | |
| Import volume | Stats NZ | 1988 - | Quarterly | No | |
| Export volume | Stats NZ | 1988 - | Quarterly | No | |
| Tourism | | | | | |
| International visitor stay days | Stats NZ | 1979 - | Quarterly | Yes (MBIE) | |
| Outbound trips by NZers | Stats NZ | 1979 - | Quarterly | Yes (MBIE) | |
| Demographics | | | | | |
| Population age distribution | Stats NZ | 1991 - | Annual | Yes (Stats NZ) | Actual data only observed in Census years |
| Urban population | Stats NZ | 1996 - | Annual | No | Actual data only observed in Census years |
| Public transport | | | | | |
| Total patronage | MoT | 2000 - | Annual | No | |
| Patronage by mode | MoT | 2000 - | Annual | No | Bus / rail / ferry categories |
| Road supply | | | | | |
| State highway lane-km | NZTA | 2001 - | Annual | No | Some data quality issues |


7 Petrol excise duty forecasting

Our analysis concentrates on producing a PED volume forecast; translation of this into a

revenue forecast is relatively straightforward.

7.1 Data

The high volatility of PED volumes poses a considerable challenge for forecasting. It is

first necessary to determine whether the volatility has structure that can be modelled, or

is essentially random and cannot be forecast reliably.

As discussed above, our understanding is that a major cause of PED volume volatility is

the timing of fuel import shipments into New Zealand. This source of volatility is likely

to be difficult to capture in a model and may have to be reflected in the uncertainty (ie

width of confidence intervals) associated with the PED volume forecasts.

However, it is possible that at least some of the quarterly volatility of PED volumes is

caused by seasonal factors that could be modelled. Accordingly, we have tested for the

presence of predictable seasonal effects.

7.1.1 PED volume volatility and seasonality

A simple test of whether the PED volume volatility is caused by seasonal effects is

provided by re-arranging the data by quarter (Figure 55). If there are seasonal effects

then we expect to see the data for different quarters varying around different levels or

different trends. There are no obvious such differences in the PED volume data,

suggesting that strong seasonal effects are not present.

A more formal test of seasonal effects is provided by estimating a seasonal adjustment

model on the PED volumes data. A simple seasonal model involves regressing PED

volumes on a time trend, quarterly dummy variables, and autoregressive lags.6 The

results of such a regression are shown in Table 12. None of the quarterly dummy
variables is statistically significant, and an F-test of their joint significance also fails to reject the null hypothesis of no seasonal effects.
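The structure of such a seasonal regression can be sketched as follows. This is a simplified illustration rather than the estimated model: it omits the autoregressive lags and the F-test, assumes the sample starts in a first quarter, uses Q1 as the base category, and relies on a small hand-written solver so that no external library is needed. The data are synthetic.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def seasonal_ols(y):
    """OLS of y on a constant, a time trend and Q2-Q4 dummies (Q1 is the base)."""
    rows = []
    for t in range(len(y)):
        q = t % 4  # 0 = Q1, assuming the sample starts in a first quarter
        rows.append([1.0, float(t), float(q == 1), float(q == 2), float(q == 3)])
    k = 5
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic series with a trend and a Q2 effect only.
y = [2.0 + 0.5 * t + (0.3 if t % 4 == 1 else 0.0) for t in range(24)]
const, trend, q2, q3, q4 = seasonal_ols(y)
print(round(q2, 3), round(q3, 3), round(q4, 3))
```

If seasonal effects were present, the dummy coefficients would differ clearly from zero; in the PED data they do not.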

More sophisticated tests of seasonality are possible but given the failure of these basic

tests and our understanding of the cause of PED volume volatility, in our view this

analysis suggests that there are not strong seasonal effects in the PED volumes data that

can be modelled in a reliable way. Rather, some of the volatility in the PED volumes

may be able to be captured in an autoregressive model, but some volatility may have to

be simply reflected in the width of the confidence intervals associated with PED volume

forecasts, unless the underlying data can be improved.

6 This approach is the basis of the “X-11” seasonal adjustment model developed by the US Census

Bureau and used by Statistics New Zealand, although the X-11 model allows for other factors such as

trading days and holidays.


Figure 55 PED volumes by quarter.


Table 12 Estimated simple seasonal adjustment model for PED volumes.

| Variable | Coef | Std Err | p-value |
|---|---|---|---|
| Time trend | 0.0014 | 0.0005 | 0.0030 |
| Q2 | 0.0188 | 0.0413 | 0.6490 |
| Q3 | -0.0144 | 0.0346 | 0.6770 |
| Q4 | -0.0333 | 0.0449 | 0.4580 |
| Constant | 20.3921 | 0.0289 | 0.0000 |
| AR(1) | -0.2634 | 0.1075 | 0.0140 |
| AR(4) | 0.2925 | 0.1192 | 0.0140 |
| F-test for quarterly dummies | 2.15 | | 0.5414 |

7.1.2 Smoothing

Given that it appears that much of the quarterly volatility in PED volumes cannot be

modelled, in our view it is appropriate to apply some sort of smoothing to the data, to

try to isolate longer-term trends that reflect actual changes in petrol usage.

Many techniques are available for smoothing time-series data. Our preference is for a

relatively simple and transparent approach that does a reasonable job of removing

excess volatility in the data but can be easily understood by users of the forecasts.

For these reasons we recommend the use of simple moving averages. The effects of

applying 4-, 8-, and 12-quarter moving averages to the PED volumes data are shown in

Figure 56. The application of these averages results in the loss of a small amount of data

at the beginning of the sample, but a relatively large sample remains available.

[Figure 55 panels: Q1, Q2, Q3, Q4, each showing PED volume (million litres) over 1994 to 2012, with the median and quartiles marked.]

Comparing the three alternative moving averages, all appear to result in similar long-

term trends and similar reductions in volatility. There is a risk associated with over-

smoothing the data that useful information in PED volumes will be “thrown away” by

the smoothing process. For this reason, our recommendation is to apply the shortest

moving average filter (4 quarters) to the data, prior to analysis.

Unless stated otherwise, all subsequent analysis of PED volumes in this section was

carried out using the 4-quarter moving average of the quarterly PED volume data.
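A simple moving average of this kind can be sketched as below. The sketch assumes a trailing window (each value averages the current and the previous three quarters), which is consistent with observations being lost at the start of the sample; the input values are illustrative.

```python
def moving_average(series, window=4):
    """Trailing simple moving average; the first window - 1 points are dropped."""
    return [
        sum(series[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(series))
    ]


# Hypothetical quarterly values.
print(moving_average([1.0, 2.0, 3.0, 4.0, 5.0, 6.0]))
```

With a 4-quarter window, three observations are lost; 8- and 12-quarter windows lose seven and eleven respectively.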

Figure 56 Application of moving-average smoothers to quarterly PED volumes.


7.1.3 Relationship to VKT

An alternative to modelling and forecasting PED volumes directly is to model the VKT

of petrol vehicles, and then translate this into petrol volumes via an efficiency factor.

Figure 54 (above) showed quarterly VKT for light petrol vehicles. Figure 57 shows the

relationship between quarterly PED volumes and quarterly light petrol VKT, where a 4-

quarter moving average has been applied to both series to reduce volatility and seasonal

effects. There is a clear positive relationship, but the fit is not perfect. This is likely due

to differences in timing of the VKT data and PED volumes, and potential measurement

errors in both data series.

Improvements in fuel efficiency over time will also lead to a non-linear relationship
between VKT and PED volume. However, as we discuss in section 7.5.2, the efficiency
series (calculated by dividing VKT by PED volume) is highly volatile, reflecting the
volatility in the PED data rather than actual changes in fuel efficiency.

[Figure 56 axes: PED volume (million litres) by quarter, 1994-1 to 2013-3; series: Actual, 4-quarter, 8-quarter and 12-quarter moving averages.]

Figure 57 Smoothed quarterly PED volume versus smoothed quarterly light petrol VKT.

[Figure 57: scatter plot of PED volume (million litres) against light petrol VKT (million km); R² = 0.541.]

7.1.4 Potential explanatory variables

The following were considered as potential explanatory variables for modelling and forecasting PED volumes:7

• Real petrol prices

• Real diesel prices (as a substitute for petrol)

• The ratio of the real petrol price to the real diesel price

• Real “transport price” index (as an overall measure of the price of transport)

• Real vehicle price index

• Real passenger transport price index (as a substitute for private vehicle use)

• Real domestic air transport price index (as a substitute for private vehicle use for long-distance travel)

• Seasonally adjusted real GDP

• Seasonally adjusted real household consumption

• The seasonally adjusted unemployment rate

• Total international visitor days spent in New Zealand

• Total short-term international outbound trips by New Zealand residents

• The proportion of total population aged between 15 and 34

• The proportion of total population living in urban areas

• Total public transport boardings

• Total lane-km of state highways

7 Some additional variables were tested in response to feedback from the NLTF forecasting group and subgroup; see section 7.6 below.

Most of these time series are available on a quarterly basis. Where only annual data was

available, we constructed quarterly data series using linear interpolation.

Time series data often exhibit increasing volatility when the level increases, and as a

result it is common practice to transform time-series data by taking the natural

logarithm prior to analysis. This also has the advantage that estimated coefficients in

regression models can be interpreted as elasticities. Unless otherwise noted, we have

used the natural logarithms of each data series in the subsequent analysis.
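The two data preparation steps described above, linear interpolation of annual series to quarterly frequency and the log transformation, can be sketched as follows. The placement of each annual observation at a single quarter, with three interpolated quarters between successive observations, is an assumption made for illustration.

```python
import math


def annual_to_quarterly(annual):
    """Linearly interpolate between successive annual observations."""
    out = []
    for a, b in zip(annual, annual[1:]):
        out.extend(a + (b - a) * k / 4 for k in range(4))
    out.append(annual[-1])
    return out


# Hypothetical annual series.
quarterly = annual_to_quarterly([100.0, 104.0, 112.0])
log_quarterly = [math.log(v) for v in quarterly]
print(quarterly)
```

Interpolated series are smoother than genuinely quarterly data, which is worth bearing in mind when interpreting their statistical significance in quarterly regressions.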

Table 13 on the following page shows the correlations between all pairs of variables in

the PED modelling dataset. PED volume is statistically significantly correlated with

most of the potential explanatory variables in the dataset. In addition, there are high

correlations between some potential explanatory variables, such as the real petrol price

and real diesel price (correlation 0.98) and between real GDP and household

consumption (correlation 1.00).

This may make it difficult to isolate statistically significant effects if highly correlated

explanatory variables are included in the same model. However, in a forecasting model,

isolating the effects of a number of explanatory variables is of secondary importance,

compared to finding a model that fits the data well and generates reliable predictions.8

For this reason, we will generally favour simple models with a good fit over more

complex models.

One caveat is that if the drivers of transport activity change over time, models that fit

historical data well may not produce accurate forecasts. The best way to mitigate this

risk is to regularly re-estimate the models when new data becomes available and test

whether the model remains valid. Regular re-estimation and testing of the models is

particularly important during times such as the present where it is not clear whether

changes to transport activity are temporary shocks or new permanent trends.

8 In contrast, a model intended for policy analysis would require robust estimation of parameters

relevant to policy.


Table 13 Correlations between variables in the PED modelling dataset. Correlations that are statistically significant at the 5% level are highlighted in green.

| | PED volume (MA) | Real petrol price | Real diesel price | Real transport price | Real vehicle price | Real pax. trans. price | Real dom. air trans. price | Real GDP (SA) | Real hhold consump. (SA) | Unemp. rate (SA) | Intl. visitor days | Outbound trips | Young pop propn. | Urban pop propn. | Total PT boardings |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Real petrol price | 0.56 | | | | | | | | | | | | | | |
| Real diesel price | 0.58 | 0.98 | | | | | | | | | | | | | |
| Real transport price | -0.56 | 0.17 | 0.26 | | | | | | | | | | | | |
| Real vehicle price | -0.77 | -0.83 | -0.84 | 0.24 | | | | | | | | | | | |
| Real pax. trans. price | -0.68 | -0.84 | -0.79 | 0.31 | 0.90 | | | | | | | | | | |
| Real dom. air trans. price | 0.59 | 0.57 | 0.64 | 0.12 | -0.71 | -0.51 | | | | | | | | | |
| Real GDP (SA) | 0.78 | 0.84 | 0.84 | -0.24 | -0.98 | -0.91 | 0.73 | | | | | | | | |
| Real hhold consump. (SA) | 0.77 | 0.86 | 0.85 | -0.23 | -0.98 | -0.92 | 0.71 | 1.00 | | | | | | | |
| Unemp. rate (SA) | -0.69 | -0.27 | -0.35 | 0.38 | 0.47 | 0.29 | -0.63 | -0.46 | -0.44 | | | | | | |
| Intl. visitor days | 0.50 | 0.42 | 0.46 | -0.17 | -0.58 | -0.48 | 0.49 | 0.61 | 0.60 | -0.40 | | | | | |
| Outbound trips | 0.57 | 0.69 | 0.69 | -0.09 | -0.80 | -0.70 | 0.56 | 0.78 | 0.79 | -0.37 | 0.16 | | | | |
| Young pop propn. | -0.79 | -0.65 | -0.70 | 0.25 | 0.93 | 0.77 | -0.79 | -0.92 | -0.90 | 0.53 | -0.62 | -0.68 | | | |
| Urban pop propn. | 0.75 | 0.65 | 0.63 | -0.42 | -0.85 | -0.71 | 0.67 | 0.89 | 0.86 | -0.55 | 0.54 | 0.54 | -0.93 | | |
| Total PT boardings | 0.23 | 0.79 | 0.67 | -0.17 | -0.89 | -0.83 | -0.10 | 0.95 | 0.94 | 0.35 | 0.21 | 0.60 | -0.81 | 0.56 | |
| State highway lane km | 0.07 | 0.84 | 0.74 | 0.13 | -0.79 | -0.84 | -0.06 | 0.88 | 0.89 | 0.47 | 0.10 | 0.60 | -0.60 | 0.31 | 0.89 |

Source: Covec analysis


7.2 Modelling strategy

Given the uncertainties associated with PED volumes and the difficulties with using the

current forecasting model to produce accurate and reliable forecasts of PED revenues,

we tested and compared a relatively wide range of PED volume models. In particular,

we evaluated the use of three general classes of model for PED volume (as above, in all

cases the natural logarithm of the 4-quarter moving average of PED volume was used as

the dependent variable):

1. Pure time series models, which forecast future PED volumes purely as a

function of past values and deterministic time trends.

2. Simple regression models, relating PED volumes to other explanatory variables,

with various methods for modelling short-run dynamics (see below).

3. A ‘hybrid’ approach involving forecasting light petrol VKT and fuel efficiency,

and then translating these forecasts into a PED volume forecast.

Each of these three approaches has advantages and disadvantages, as will be discussed

below. More complex approaches, for example modelling VKT associated with drivers

in different age groups and/or in different geographic regions were considered but were

determined not to be feasible given the data available.9

In general, our modelling strategy under each approach involved the following:

1. Analysing the time-series properties of the relevant data series.

2. Selecting variables for inclusion in the model on the basis of statistical tests.

3. Testing additional models in response to feedback from the NLTF revenue

forecasting group.

4. Evaluating the suitability of all models for generating NLTF forecasts.

5. Selecting a short-list of models in consultation with a subgroup of the NLTF

revenue forecasting group, and developing confidence intervals and performing

sensitivity tests on the short-listed models.

6. Making a final recommendation for PED forecasting on the basis of all of the

above analysis.

9 For example, the Household Travel Survey is reported as three-year moving averages.


7.2.1 Time-series considerations

Standard regression analysis assumes that all variables have constant mean and

variance over time, ie they are stationary.10 It can be shown that regressing one non-

stationary variable on another is very likely to result in a finding of a statistically

significant relationship when none exists – this problem is known as spurious

regression. If variables are found to be non-stationary, there are two ways to proceed.

First, the variables can be transformed (eg by differencing) to make them stationary.

This is a simple approach, but risks losing information about relationships among the

un-transformed variables. Alternatively, regressions can be performed using the non-

stationary variables, but a “cointegration” test needs to be performed to ensure that the

variables are genuinely related to each other and the regression is not spurious. In a

simple regression model, the cointegration test involves testing whether the residuals of

the estimated model are stationary. If the residuals are stationary, the estimated

relationship is unlikely to be spurious.

Our approach involved conducting Augmented Dickey Fuller (ADF) tests on the

relevant variables to determine whether they are stationary. The results of these tests

(Table 14) indicated that most variables are non-stationary and integrated of order one

(ie first-difference stationary). Given this, we conducted ADF tests on regression

residuals to rule out spurious regressions.

Table 14 Augmented Dickey Fuller test p-values.

| Variable | Level | First difference | Integration order |
|---|---|---|---|
| PED volume (MA) | 0.24 | 0.00 | 1 |
| Real petrol price | 0.89 | 0.00 | 1 |
| Real diesel price | 0.71 | 0.00 | 1 |
| Real transport price | 0.15 | 0.00 | 1 |
| Real vehicle price | 0.45 | 0.00 | 1 |
| Real passenger transport price | 0.75 | 0.00 | 1 |
| Real domestic air transport price | 0.14 | 0.01 | 1 |
| Real GDP (SA) | 0.34 | 0.02 | 1 |
| Real household consumption (SA) | 0.78 | 0.04 | 1 |
| Unemployment rate (SA) | 0.54 | 0.04 | 1 |
| International visitor days | 0.29 | 0.04 | 1 |
| Outbound trips | 0.46 | 0.05 | 1 |
| Young pop proportion | 0.05 | 0.69 | 0 |
| Urban pop proportion | 0.02 | 0.17 | 0 |
| Total public transport boardings | 0.38 | 0.00 | 1 |
| State highway lane km | 0.63 | 0.04 | 1 |


10 Or at least that the variables have constant variance around a linear trend, so that they can be made

stationary by subtracting the trend (‘de-trending’).


The regression residuals were also tested for any remaining serial correlation, and

where this was found to be present we attempted to correct for it in various ways

including adding lags of the dependent variable, estimating a simple error-correction

model, or modelling an autoregressive process for the residuals (see below).

7.2.2 Model selection

For the simple regression and hybrid models, a process of general-to-specific selection

was employed to arrive at a parsimonious set of explanatory variables. This involves

estimating a model with all possible explanatory variables, and then successively

eliminating statistically insignificant variables.

For the pure time series models we used a process of ‘Bayesian’ model selection to

determine the optimal parameters of the models (see section 7.3 below).

7.2.3 Forecast evaluation

Our evaluation of the models placed considerable weight on their ability to generate

accurate and plausible forecasts. As well as goodness of fit measured by the R-squared

statistic,11 we tested the forecasting performance of the models in two ways:

1. By generating 10-year ahead out-of-sample forecasts, using a set of plausible

assumptions for the relevant explanatory variables.

2. By re-estimating the models with a truncated sample, and using these models to

estimate forecasts that can then be compared with recent actual PED volumes.

To allow comparison of the accuracy of forecasts from the truncated samples

across different models, we calculated the root mean squared error (RMSE) of

the forecasts versus the actual quarterly PED volumes.
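The truncated-sample comparison reduces to two summary measures: the RMSE of the forecasts against actual volumes, and the average percentage by which forecasts exceed actuals. A minimal sketch, using invented numbers rather than the report's data:

```python
import math


def rmse(forecast, actual):
    """Root mean squared error of forecasts against actual values."""
    return math.sqrt(sum((f - a) ** 2 for f, a in zip(forecast, actual)) / len(actual))


def mean_pct_error(forecast, actual):
    """Average proportional over- or under-forecast (positive means too high)."""
    return sum(f / a - 1.0 for f, a in zip(forecast, actual)) / len(actual)


# Hypothetical forecasts and actuals (million litres).
forecast = [780.0, 775.0, 770.0]
actual = [750.0, 760.0, 730.0]
print(round(rmse(forecast, actual), 1), f"{mean_pct_error(forecast, actual):.1%}")
```

RMSE penalises large misses more heavily than small ones, while the mean percentage error shows whether a model is biased in one direction.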

The quarterly PED volumes data runs from the first quarter of 1994 to the third quarter

of 2013. We estimated the truncated sample models using data up to and including the

second quarter of 2011. This left nine quarters of actual data against which forecasts

from the truncated models could be compared. This truncation was chosen so as to have
two complete June years (2012 and 2013) of actual data against which the forecasts from
the truncated sample model could be compared. This also reflects the NLTF revenue

forecasting group’s preference for short term accuracy.

7.3 Pure time series models [PED model 1]

The pure time series approach involves estimating PED volume purely as a function of

its own history and deterministic time trends. We employed a process of ‘Bayesian’

model selection, which involves specifying a class of simple time-series models,

estimating all possible models within this class, and using the Bayesian Information

Criterion (BIC) to select the best model within this class. Such an approach was

pioneered by Phillips & Ploberger (1994) and essentially involves selecting the model
within the chosen class that has the highest probability of generating the actual data that
has been observed.

11 We calculated R-squared as the squared correlation between the fitted values of each model and the
actual quarterly PED volume data. This ensures a consistent basis for calculating and comparing R-squared
values across different models.

A simple class of time-series models that has proved to be useful for modelling a wide

variety of variables is the ‘autoregressive + trend’ model:

$$y_t = \sum_{i=0}^{A} \beta_i t^i + e_t$$

$$e_t = \sum_{j=1}^{B} \rho_j e_{t-j} + u_t$$

where 𝑦𝑡 is the PED volume in quarter 𝑡 (or the first difference of PED volume), 𝐴 and 𝐵

are parameters to be selected that determine the specific form of the model, 𝑢𝑡 is a

random error, and the 𝛽𝑖 and 𝜌𝑗 are parameters to be estimated.12

For example, if we set 𝐴 = 1 and 𝐵 = 2, the model becomes a second-order

autoregressive model with a linear time trend:

$$y_t = \beta_0 + \beta_1 t + e_t$$

$$e_t = \rho_1 e_{t-1} + \rho_2 e_{t-2} + u_t$$

The key issue in this process is selecting the parameters 𝐴 and 𝐵 that determine the class

of the time-series models. We specified maximum values for 𝐴 and 𝐵 of two and four

respectively, and then estimated all possible combinations of models in this class. Each

model was also estimated using the level and the difference of PED volume; this

combines a stationarity test with model selection. In total, 96 models within this class

were estimated for PED volume. The BIC statistic was calculated for each, and the

optimal model selected accordingly.
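The report does not spell out how the 96 candidate models arise. One enumeration that reproduces this count, offered here as an assumption, combines three trend orders (0, 1 or 2), every subset of the four autoregressive lags (16 subsets, including none), and the level or first-difference form of the series. The BIC function below uses the standard Gaussian form based on the residual sum of squares; all names are illustrative.

```python
import math
from itertools import combinations


def candidate_models(max_trend=2, max_lag=4):
    """Enumerate (form, trend order, lag subset) combinations."""
    lags = range(1, max_lag + 1)
    lag_sets = [c for r in range(max_lag + 1) for c in combinations(lags, r)]
    return [
        (form, trend, lag_set)
        for form in ("level", "difference")
        for trend in range(max_trend + 1)
        for lag_set in lag_sets
    ]


def bic(rss, n_obs, n_params):
    """Bayesian Information Criterion for a Gaussian model (lower is better)."""
    return n_obs * math.log(rss / n_obs) + n_params * math.log(n_obs)


models = candidate_models()
print(len(models))  # 2 forms x 3 trend orders x 16 lag subsets
```

Each candidate would be estimated and the one with the lowest BIC retained; the penalty term means an extra parameter is kept only if it reduces the residual sum of squares sufficiently.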

The resulting model used the levels of PED volume and included a quadratic trend and

first-order autoregressive process for the errors:

$$y_t = \beta_0 + \beta_1 t + \beta_2 t^2 + e_t$$

$$e_t = \rho_1 e_{t-1} + u_t$$

Table 15 shows the estimated coefficients of the selected model for PED volume. All

variables are highly statistically significant and the model explains 78% of the quarterly

variation in the moving average of PED volume, although only 12% of the variation in

the original un-smoothed PED volume data. Autocorrelation tests on the residuals of

this model give weak evidence of autocorrelation at four lags, but in our view the model

is sufficiently robust.

12 Applications of such models for forecasting include Phillips (1995) and Schiff & Phillips (2000).


Table 15 Estimated coefficients of the selected time-series model for quarterly PED volume.

| Variable | Coef. | Std Err. | p-value |
|---|---|---|---|
| t | 0.0057 | 0.0010 | 0.00 |
| t² | -0.0001 | 0.0000 | 0.00 |
| Constant | 20.3228 | 0.0180 | 0.00 |
| AR(1) | 0.4802 | 0.0927 | 0.00 |
| R-squared (vs PED volume MA) | 0.78 | | |
| R-squared (vs PED volume actual) | 0.12 | | |
| Residual autocorrelation p-values | 0.44, 0.15, 0.27, 0.09 | | |

Figure 58 shows the fitted values and 10-year ahead forecasts of quarterly PED volumes

generated by this model, versus the actual quarterly values, the 4-quarter moving

average of actuals, and the forecasts generated by the current NLTF forecasting model.

The selected time-series model picks up the overall trend and fluctuations in the moving

average of PED volumes, and forecasts a continuation of the downwards trend in

volumes, from around 750 million litres per quarter at the end of 2013 to around 630 million
litres per quarter at the end of 2023. This is due to the inclusion of the quadratic trend
(t²) variable in the model, and while this trend best fits the data, it is not clear that such a

trend would continue beyond the short term. As discussed above, the forecasts do not

attempt to predict any seasonal pattern in PED volumes, as there does not appear to be

any robust pattern. In contrast, the forecasts produced by the existing forecast model

exhibit an upwards trend and a strong seasonal pattern.
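Forecasts from a trend model with first-order autoregressive errors combine the extrapolated trend with a decaying share of the last observed residual. The sketch below illustrates that recursion with hypothetical parameters and time index; it does not use the rounded coefficients in Table 15, which are too coarse to reproduce the reported forecasts. Because the model is estimated in logarithms, forecasts of the level would be obtained by exponentiating.

```python
def forecast_path(beta, rho, last_t, last_resid, horizon):
    """h-step forecasts: polynomial trend plus rho**h times the last residual."""
    path = []
    for h in range(1, horizon + 1):
        t = last_t + h
        trend = sum(b * t ** i for i, b in enumerate(beta))
        path.append(trend + (rho ** h) * last_resid)
    return path


# Hypothetical parameters: constant, linear and quadratic trend terms, AR(1) coefficient.
path = forecast_path(beta=[20.0, 0.006, -0.0001], rho=0.5, last_t=78, last_resid=0.02, horizon=4)
print([round(v, 4) for v in path])
```

With an autoregressive coefficient below one in absolute value, the residual term fades quickly, so forecasts more than a few quarters ahead are driven almost entirely by the trend terms.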

Figure 58 Fitted values and out-of-sample forecasts of quarterly PED volumes generated by the

selected time-series model.


[Figure 58 axes: PED volume (million litres) by quarter, 1994-1 to 2023-1; series: Actual, Actual (MA), Fitted, Forecast, Current model forecast.]

Figure 59 shows the results of estimating the selected time-series model with the sample

truncated to the second quarter of 2011 and generating nine quarters of forecasts to the

third quarter of 2013. In all cases, the model generates forecasts that exceed the moving

average of quarterly PED volumes, and exceed the actual PED volumes in all but two

quarters. Under this test, the time-series model would produce PED volume forecasts

that are “too high” by an average of 4.4% per quarter. The RMSE of these forecasts is

45.2 million litres.
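The two accuracy measures quoted above can be computed along the lines of the following minimal sketch. It assumes that the percentage error is measured relative to the actual volume in each quarter and that the RMSE is taken over the same forecast quarters; the report does not spell out either convention. The function name and the sample numbers are hypothetical and are not the report's data.

```python
import math


def forecast_errors(actual, forecast):
    """Return (mean percentage error, RMSE) for paired actuals and forecasts.

    A positive mean percentage error means the forecasts are, on average,
    higher than the actual values.
    """
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and of equal length")
    pct_errors = [(f - a) / a for a, f in zip(actual, forecast)]
    sq_errors = [(f - a) ** 2 for a, f in zip(actual, forecast)]
    mean_pct_error = sum(pct_errors) / len(pct_errors)
    rmse = math.sqrt(sum(sq_errors) / len(sq_errors))
    return mean_pct_error, rmse


# Hypothetical quarterly volumes in million litres (illustration only).
actual = [700.0, 720.0, 710.0]
forecast = [735.0, 738.0, 745.5]
mpe, rmse = forecast_errors(actual, forecast)
print(f"mean percentage error: {mpe:.1%}, RMSE: {rmse:.1f} million litres")
```

Because the RMSE is expressed in million litres while the mean percentage error is unit-free, the two measures are best read together when comparing models.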

Figure 59 Comparison of actual PED volumes and truncated-sample forecasts produced by the selected

time series model.


Figure 60 shows the performance of the selected time-series model on an annual (June

year) basis. In general the model explains the historic trend in annual PED volumes, but

the 10-year forecasts imply a relatively strong decline in volumes over time, from

around 3 billion litres in 2014 to 2.6 billion litres in 2023, an average annual change of

-1.6%. The truncated sample forecasts are too high by 3.5% and 4.2% in 2012 and 2013

respectively.
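As a rough check on the average annual change quoted above, the compound rate between two endpoints can be computed as sketched below. The sketch assumes geometric (compound) averaging over the nine years from 2014 to 2023 and uses the rounded endpoint values from the text, so it can only reproduce the quoted figure approximately. The function name is hypothetical.

```python
def average_annual_growth(start, end, years):
    """Compound average annual growth rate between two levels."""
    if start <= 0 or end <= 0 or years <= 0:
        raise ValueError("start, end and years must be positive")
    return (end / start) ** (1.0 / years) - 1.0


# Rounded endpoints from the text: about 3.0 billion litres in 2014
# and about 2.6 billion litres in 2023 (nine annual steps).
rate = average_annual_growth(3.0, 2.6, 9)
print(f"average annual change: {rate:.1%}")
```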

As described above, the time-series model was selected by using the BIC statistic to

choose the best model among a class of models. In principle this approach could be

repeated every quarter when new PED volume data becomes available; this allows the

form of the model to update or ‘evolve’ to fit the data over time. Thus while the time

series model might predict a downwards trend in PED volume at this point in time, the

trend may change in future if future PED volumes start to increase.
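The idea of re-running the selection each quarter can be illustrated with a simplified stand-in. The sketch below compares polynomial time trends of different degree by BIC, assuming Gaussian errors and ordinary least squares; it omits the autoregressive terms that the selected model contains, so it is not the model class used in this report. The function names and the synthetic series are assumptions for illustration only.

```python
import numpy as np


def bic_for_trend(y, degree):
    """BIC of an OLS polynomial-trend fit, assuming Gaussian errors."""
    n = len(y)
    t = np.arange(n) / n  # rescaled time index, for numerical stability
    X = np.vander(t, degree + 1, increasing=True)  # columns 1, t, t^2, ...
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ coef) ** 2))
    k = degree + 1  # number of estimated trend coefficients
    return n * np.log(rss / n) + k * np.log(n)


def select_trend_degree(y, max_degree=3):
    """Return the trend degree with the lowest BIC, plus all the scores."""
    scores = {d: bic_for_trend(y, d) for d in range(max_degree + 1)}
    return min(scores, key=scores.get), scores


# Synthetic series (not PED data): a quadratic trend plus a small
# deterministic alternating disturbance.
n = 80
t = np.arange(n)
y = 20.3 + 0.006 * t - 0.0001 * t**2 + 0.01 * (-1.0) ** t

best_degree, scores = select_trend_degree(y)
print("selected trend degree:", best_degree)
```

Calling `select_trend_degree` again after appending each new observation mimics the quarterly update described above: the selected form can change as the data change.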

[Figure 59 plot: vertical axis: PED volume (million litres), 640 to 820; horizontal axis: quarter, 2011-3 to 2013-3. Series: Actual, Actual (MA), Forecast.]

Figure 60 Annual (June year) fitted values and forecasts generated by the selected time-series model.


7.4 Regression models [PED models 2a-2e]

We tested regressions of the 4-quarter moving average of PED volume on the various

explanatory variables listed in section 7.1.4 above. Diagnostic tests including residual

stationarity, the Durbin-Watson statistic, and Breusch-Godfrey tests were employed to

check for problems with the residuals of the models. In general, serial correlation was

found to be present, and we tested three alternative approaches for dealing with this:

1. Modelling the residual serial correlation directly through the use of an

autoregressive model for the regression residuals.

2. Including lags of the dependent variable as explanatory variables in the model.

3. Testing for cointegration and estimating a simple ‘error correction’ model.

These approaches represent different ways of modelling the short-run dynamics of PED

volume. To minimise the spurious regression problem, we performed residual

stationarity tests – these were passed by all of the selected models.

The following subsections describe our application of each of the three approaches

above to modelling PED volume. In each case the dependent variable was the natural

logarithm of the 4-quarter moving average of PED volume.

Following presentation of the PED models to the NLTF revenue forecasting group, a set

of additional models for PED was tested. These are described in section 7.6 below.

[Figure 60 plot: Annual PED volumes. Vertical axis: PED volume (million litres), 0 to 4,000; horizontal axis: year ended June, 1996 to 2023. Series: Actual, Fitted, Truncated sample forecast, Forecast, Current model forecast.]

7.4.1 Autoregressive error model [PED model 2a]

A relatively simple way of modelling short-run dynamics is to specify an

autocorrelation model for the regression residuals, for example:

$$y_t = \alpha + \beta X_t + e_t$$

$$e_t = \sum_{i=1}^{L} \rho_i e_{t-i} + u_t$$

where 𝑦𝑡 is the natural logarithm of the 4-quarter moving average of PED volume, 𝑋𝑡 is

a vector of explanatory variables for PED volume, and 𝑢𝑡 is a random error.

The autocorrelation model for the regression residuals 𝑒𝑡 captures the dynamics of the

response of PED volume to external shocks (ie the persistence of random shocks

arriving via 𝑢𝑡). This is therefore a relatively naïve dynamic model, as it is assumed that

changes in the explanatory variables are immediately reflected in the dependent

variable. The number of autoregressive lags (𝐿) can be selected by various methods; we

used residual correlation tests to determine the appropriate number of lags to eliminate

all significant evidence of serial correlation in the regression residuals.
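To illustrate how a shock persists under an autoregressive error specification, the sketch below iterates the residual equation forward with all future random errors set to zero. The AR(1) and AR(4) values are those reported in Table 16; the function name, the size of the initial shock, and the eight-quarter horizon are assumptions made for illustration.

```python
def ar_error_path(rho, initial_errors, horizon):
    """Iterate e_t = sum_i rho[i] * e_{t-i}, with future shocks u_t set to zero.

    rho maps lag -> coefficient; initial_errors lists the most recent
    residuals, oldest first, and must cover the longest lag.
    """
    max_lag = max(rho)
    if len(initial_errors) < max_lag:
        raise ValueError("initial_errors must cover the longest lag")
    history = list(initial_errors)
    path = []
    for _ in range(horizon):
        e_next = sum(coef * history[-lag] for lag, coef in rho.items())
        history.append(e_next)
        path.append(e_next)
    return path


# AR(1) and AR(4) coefficients as reported in Table 16.
rho = {1: 0.3627, 4: -0.2139}
# Hypothetical one-off shock of 0.01 (about 1% in log terms) in the last quarter.
path = ar_error_path(rho, [0.0, 0.0, 0.0, 0.01], horizon=8)
for quarter, e in enumerate(path, start=1):
    print(f"quarter +{quarter}: {e:+.5f}")
```

In this illustration the shock decays quickly and changes sign after four quarters because of the negative AR(4) term, while the explanatory variables play no part in the adjustment.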

The process of model selection arrived at the model shown in Table 16. All variables are

highly statistically significant with the exception of the real petrol price; however, in our

view it was appropriate to retain this variable in the model given the strong theoretical

link between the petrol price and the quantity of petrol consumed.13 The GDP and

unemployment variables have the expected signs. Household consumption was tested

as an alternative to GDP, but given the high correlation between household

consumption and GDP, this did not materially change the goodness of fit of the model.

This model explains 81% of the variation in the 4-quarter moving average of PED

volume, and explains 13% of the variation in the un-smoothed PED volume.

Table 16 Estimated coefficients of the selected regression model for quarterly PED volume with an

autoregressive error specification.


Variable                  Coef.     Std Err.   p-value
Real petrol price        -0.0448    0.0287     0.12
Real GDP (SA)             0.2342    0.0397     0.00
Unemployment rate (SA)   -0.0744    0.0154     0.00
Constant                 18.3791    0.3036     0.00
AR(1)                     0.3627    0.1034     0.00
AR(4)                    -0.2139    0.1041     0.04




13 The estimated price elasticity of -0.04 also seems broadly acceptable.

Out-of-sample forecasts using this model were generated using Treasury forecasts for

real GDP and the unemployment rate, provided to us by the Ministry of Transport. The

Treasury forecasts extend to the second quarter of 2018. Beyond that we have

assumed that real GDP grows at a long-run average rate of 2% per annum (Figure 61)

and the unemployment rate trends to a long-run rate of 4.5% (Figure 62).
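The extension of the explanatory variables beyond the Treasury horizon can be sketched as follows. The quarterly compounding of the 2% annual GDP growth rate and the straight-line path of the unemployment rate towards 4.5% are assumptions made for illustration; the report does not describe the exact interpolation used. The starting values and the 22-quarter horizon (third quarter of 2018 to the end of 2023) are also hypothetical.

```python
def extend_level(last_level, annual_growth, quarters):
    """Extend a level series at a constant annual growth rate, compounded quarterly."""
    quarterly_factor = (1.0 + annual_growth) ** 0.25
    levels = []
    level = last_level
    for _ in range(quarters):
        level *= quarterly_factor
        levels.append(level)
    return levels


def trend_to_target(last_value, target, quarters):
    """Move a rate in equal quarterly steps from its last value to a long-run target."""
    step = (target - last_value) / quarters
    return [last_value + step * (i + 1) for i in range(quarters)]


# Hypothetical starting points at the end of the Treasury forecast.
gdp_path = extend_level(100.0, 0.02, 22)          # GDP index, 2% per annum
unemp_path = trend_to_target(0.050, 0.045, 22)    # unemployment rate to 4.5%
print(round(gdp_path[3], 2), round(unemp_path[-1], 4))
```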

Figure 61 Real GDP growth rates.

Source: Treasury and Covec.

Figure 62 Seasonally adjusted unemployment rate.

Source: Treasury and Covec.

[Figure 61 plot: vertical axis: year-on-year growth rate, -3% to 7%; horizontal axis: quarter, 1995-1 to 2023-1. Series: Actual, Treasury forecast, Extended forecast.]

[Figure 62 plot: vertical axis: seasonally adjusted unemployment rate, 0% to 10%; horizontal axis: quarter, 1994-1 to 2023-1. Series: Actual, Treasury forecast, Extended forecast.]

Forecasts of the real petrol price were created by combining a nominal petrol price

forecast, calculated on the basis of Treasury’s oil price forecast, with Treasury’s CPI

forecast. Again we extended the CPI forecast beyond Treasury’s forecasting horizon,

assuming a long-run inflation rate of 2.2% per annum.
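Deflating a nominal price by the CPI can be sketched as below. The sketch assumes the real price is expressed in the prices of a chosen base period, so that the nominal price is scaled by the ratio of the base-period CPI to the current CPI. The function name, the base value, and the sample figures are hypothetical and are not taken from the Treasury or Ministry of Transport data.

```python
def real_price(nominal_prices, cpi, base_cpi):
    """Convert nominal prices to real prices in base-period dollars."""
    if len(nominal_prices) != len(cpi):
        raise ValueError("nominal_prices and cpi must have the same length")
    return [p * base_cpi / c for p, c in zip(nominal_prices, cpi)]


# Hypothetical example: nominal price in cents per litre and a CPI index
# that rises by 2.2% between the two periods.
nominal = [200.0, 210.0]
cpi = [1000.0, 1022.0]
real = real_price(nominal, cpi, base_cpi=1000.0)
print([round(p, 1) for p in real])
```

In this example the nominal price rises by 5% but the real price rises by less, because part of the increase is matched by general inflation.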

Figure 63 Nominal and real petrol price forecasts.

Source: Ministry of Transport, Treasury and Covec.

Figure 64 shows the quarterly fitted values and forecasts of PED volume generated by

this model, using the GDP, unemployment, and petrol price forecasts shown above. In

contrast with the pure time series model estimated in the previous section, this model

predicts an increase in quarterly PED volume over time, from around 750 million litres

per quarter at the end of 2013 to around 835 million litres in 2023. This increase is driven

by GDP growth and falling unemployment, but growth in PED volumes is somewhat

offset by higher real petrol prices in the longer term.

Figure 65 shows the results of estimating this regression model with the sample

truncated to the second quarter of 2011 and generating nine quarters of forecasts to the

third quarter of 2013. In all cases, the model generates forecasts that exceed the moving

average of quarterly PED volumes, and exceed the actual PED volumes in all but one

quarter. This model would produce PED volume forecasts that are too high by an

average of 4.9% per quarter. The RMSE of these forecasts is 48.3 million litres.

[Figure 63 plot: axes: CPI and petrol price (cents / litre); horizontal axis: quarter, 1994-1 to 2023-1. Series: Nominal - Actual, Nominal - Forecast, Real - Actual, Real - Forecast, CPI - Actual, CPI - Treasury forecast, CPI - Extended forecast.]


Figure 64 Fitted values and out-of-sample forecasts of quarterly PED volumes generated by the

selected regression model with an autoregressive error specification.

[Figure 64 plot: vertical axis: PED volume (million litres), 0 to 1,200; horizontal axis: quarter, 1994-1 to 2023-1.]

Figure 65 Comparison of actual PED volumes and truncated-sample forecasts produced by the selected

regression model with an autoregressive error specification.

[Figure 65 plot: vertical axis: PED volume (million litres), 640 to 820; horizontal axis: quarter, 2011-3 to 2013-3.]

Figure 66 shows the June-year fitted values and forecasts implied by this model. The

model generally follows the trend in annual PED volume, but forecasts that the recent

decline will be reversed and that PED volumes will resume growth, reaching around 3.3

billion litres in 2023, for an average annual growth rate of 1.0%, although growth is

slower in later years due to higher petrol prices. The truncated-sample model over-

estimates PED volumes by 3.5% in 2012 and 4.9% in 2013.

Figure 66 Annual (June year) fitted values and forecasts generated by the selected regression model

with an autoregressive error specification.


7.4.2 Lagged dependent variable model [PED model 2b]

The second model specification that we tested involved including lags of the dependent

variable to model short-run dynamics:

$$y_t = \alpha + \beta X_t + \sum_{i=1}^{L} \rho_i y_{t-i} + u_t$$

In some sense this model is more sophisticated than the model estimated in section

7.4.1, as it allows for both shocks to the explanatory variables and the random error to

affect the dependent variable gradually over time.

During the estimation process we also tested whether lags of the explanatory variables

should be included in the model; such a model is known as an autoregressive

distributed-lag model. However, we found lags of the explanatory variables to be generally

unnecessary, as they did not improve the goodness of fit of the models. The number of lags

of the dependent variable was chosen by significance testing and in order to eliminate

any evidence of serial correlation in the regression residuals.14

[Figure 66 plot: Annual PED volumes. Vertical axis: PED volume (million litres), 0 to 4,000; horizontal axis: year ended June, 1996 to 2023.]

As in the previous subsection, a process of general-to-specific model selection was

employed to choose the explanatory variables in this model. The inclusion of lags of the

dependent variable to handle serial correlation resulted in a somewhat different set of

variables being selected (Table 17). All of the estimated coefficients are statistically

significant at least at the 10% level, and have the expected signs. This model explains

80% of the variation in the 4-quarter moving average of PED volumes and 10% of the

variation in actual PED volumes.

Table 17 Estimated coefficients of the selected regression model for quarterly PED volume with lagged

dependent variables.



Variable                       Coef.      Std Err.   p-value
Real GDP (SA)                  0.2581     0.0742     0.00
Young population proportion   -0.3546     0.2210     0.10
Urban population proportion  -10.1229     4.9084     0.04
PED volume (-1)                0.2632     0.1232     0.04
PED volume (-2)                0.2544     0.1135     0.03
PED volume (-4)               -0.2828     0.1100     0.01
Constant                      11.3144     2.9723     0.00




It is important to note that the estimated coefficient on the urban proportion of the

population is very large, indicating that any forecasts generated by this model will be

very sensitive to assumptions about urban population. Since the actual urban

population is only measured in Census years, there is considerable uncertainty about

this variable, both in the data used to estimate the model and for forecasting. Thus while

a negative relationship between the proportion of population living in urban areas and

PED volumes is plausible, this model may be difficult to use for forecasting because

small changes to assumptions about future urban population will produce large changes

in the PED volume forecasts and it is difficult to be precise about future values of the

urban population proportion.
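The sensitivity described above can be illustrated with a rough calculation using the coefficients in Table 17. The sketch assumes that the urban proportion enters the model as a fraction in levels (for example 0.85 rather than its logarithm), which the report does not state, and that the dependent variable is in natural logarithms. The long-run figure assumes the change is permanent and uses the standard multiplier for a model with lagged dependent variables, one divided by one minus the sum of the lag coefficients. The function name is hypothetical.

```python
import math

# Coefficients as reported in Table 17.
BETA_URBAN = -10.1229
LAG_COEFS = [0.2632, 0.2544, -0.2828]  # PED volume lags 1, 2 and 4


def pct_effect(delta_x, beta, lag_coefs=None):
    """Proportional change in volume implied by a permanent change delta_x.

    Without lag_coefs this is the first-quarter effect; with lag_coefs it is
    the long-run effect, beta / (1 - sum of lag coefficients).
    """
    multiplier = 1.0 if lag_coefs is None else 1.0 / (1.0 - sum(lag_coefs))
    return math.exp(beta * delta_x * multiplier) - 1.0


# Assumed scenario: the urban proportion is 0.1 percentage points higher.
impact = pct_effect(0.001, BETA_URBAN)
long_run = pct_effect(0.001, BETA_URBAN, LAG_COEFS)
print(f"first-quarter effect: {impact:.2%}, long-run effect: {long_run:.2%}")
```

Under these assumptions a difference of only 0.1 percentage points in the assumed urban proportion moves the forecast by roughly 1%, which is large relative to the precision with which the urban proportion can be measured or projected.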

In any case, we have generated forecasts using this model for comparison with the other

models. In addition to the real petrol price, real GDP, and unemployment rate forecasts

presented above, we have developed forecasts of the proportion of the population living

in urban areas (Figure 67) and the proportion of the population aged between 15 and 34

(Figure 68).

14 Estimation of models with lagged dependent variables requires special attention to testing for serial

correlation in the regression residuals. We performed Durbin-Watson and Breusch-Godfrey tests to

ensure that the models were adequately specified.


Figure 67 Proportion of the population living in urban areas.

Source: Statistics New Zealand and Covec analysis.

Figure 68 Proportion of the population aged between 15 and 34.

Source: Statistics New Zealand.

[Figure 67 plot: vertical axis: urban population proportion, 83.8% to 85.8%; horizontal axis: as at June, 1996 to 2023. Series: Census, Stats NZ estimate, Forecast.]

[Figure 68 plot: vertical axis: proportion of population aged 15-34, 24% to 31%; horizontal axis: as at June, 1996 to 2023. Series: Census, Stats NZ estimate, Forecast.]

Statistics New Zealand provides data on the urban population in Census years and

estimates for intermediate years. We have not been able to find forecasts of the urban

population so we have constructed our own scenario, as shown in Figure 67. Given the

recent growth in the urban proportion, this forecast is based on applying the growth

profile observed between 1996 and 2006 to the period from 2014 to 2023. The sensitivity

of the forecasting model to the urban proportion means that highly accurate forecasts of

urban population will be required to generate accurate PED volume forecasts with this

model, which may not be practical.

Statistics New Zealand does produce population forecasts by age group, and their

forecast for the proportion of the population aged between 15 and 34 is shown in Figure

68. In general this proportion is expected to decline over time.

Figure 69 shows the fitted values and forecasts generated by the regression model in

Table 17, given the forecasts of the explanatory variables above. This model predicts a

gradual increase in PED volume over time, from around 750 million litres at the end of

2013 to around 780 million litres at the end of 2023. In this model, the increase driven by

GDP growth, lower unemployment, and a reduced proportion of the population aged

between 15 and 34 is partly offset by higher real petrol prices and a higher urban population.


Figure 69 Fitted values and out-of-sample forecasts of quarterly PED volumes generated by the

selected regression model with a lagged dependent variable.


Estimating this model using the truncated sample generates forecasts that are generally

too high relative to actual PED volumes (Figure 70). On average, this model produces

forecasts that exceed PED volumes by 6.3% per quarter for the period from 2011Q3 to

2013Q3. The RMSE of the model over this period is 57.1 million litres. The model

predicts relatively slow growth of annual PED volume (Figure 71), to 3.1 billion litres in

2023, at an average annual growth rate of 0.3%. The truncated sample model

over-estimates volumes by 4.1% in 2012 and 6.8% in 2013.

[Figure 69 plot: vertical axis: PED volume (million litres), 0 to 1,200; horizontal axis: quarter, 1994-1 to 2023-1.]


Figure 70 Comparison of actual PED volumes and truncated-sample forecasts produced by the selected

regression model with a lagged dependent variable.

[Figure 70 plot: vertical axis: PED volume (million litres), 620 to 840; horizontal axis: quarter, 2011-3 to 2013-3.]

Figure 71 Annual (June year) fitted values and forecasts generated by the selected regression model

with a lagged dependent variable.

[Figure 71 plot: Annual PED volumes. Vertical axis: PED volume (million litres), 0 to 4,000; horizontal axis: year ended June, 1996 to 2023.]

7.4.3 Error-correction model [PED model 2c]

The final regression model that we tested was an ECM. Such models are more complex

than the models estimated above, but can capture more complicated short-run

dynamics. There are various ways that ECMs can be estimated and used for

forecasting.15 One is to estimate the long-run relationship, and then use the residuals

from that regression to build a model of short-run dynamics. Alternatively, it is possible

to show that an ECM can be estimated using a single equation of the form:

$$\Delta y_t = \alpha + \rho y_{t-1} + \beta X_{t-1} + \gamma \Delta X_t + u_t$$

where 𝑦𝑡 is the natural logarithm of the 4-quarter moving average of PED volume in quarter t and 𝑋𝑡 is a matrix of variables that are related to

PED volume in the long run.

We tested ECMs using this latter approach as it is more straightforward to estimate and

use for generating forecasts, although the single-equation model does not produce

separate estimates of short- and long-run effects. Using the general-to-specific model

selection process, we arrived at an ECM for PED volume shown in Table 18. For all the

explanatory variables, at least one component (the lag or the difference) is statistically

significant at the 5% level. This model explains 81% of the variation in the 4-quarter

moving average of PED volumes and 15% of the variation in actual quarterly PED

volume. There is weak evidence of fourth-order autocorrelation in this model.
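The way a single-equation ECM generates forecasts can be sketched by iterating the equation forward with the random error set to zero. In the sketch below a single explanatory variable stands in for the matrix of regressors. The values of ρ, β and γ borrow the lagged PED volume and real GDP coefficients reported in Table 18; the constant, the starting level, and the regressor path are hypothetical, so the numbers produced are not forecasts of PED volume.

```python
def ecm_forecast(y_last, x_path, alpha, rho, beta, gamma):
    """Iterate dy_t = alpha + rho*y_{t-1} + beta*x_{t-1} + gamma*dx_t with u_t = 0.

    x_path[0] is the regressor in the last observed quarter; one forecast is
    produced for each later element of x_path.
    """
    y = y_last
    forecasts = []
    for t in range(1, len(x_path)):
        dy = (alpha + rho * y + beta * x_path[t - 1]
              + gamma * (x_path[t] - x_path[t - 1]))
        y += dy
        forecasts.append(y)
    return forecasts


# rho, beta and gamma from Table 18 (PED volume (-1), real GDP (-1), real GDP (Diff)).
rho, beta, gamma = -0.6414, 0.1836, 0.3889
alpha = 2.0              # hypothetical constant
x_path = [10.0] * 41     # hypothetical regressor, held constant for 40 quarters
path = ecm_forecast(5.0, x_path, alpha, rho, beta, gamma)

# With the regressor held constant the recursion settles where dy = 0.
steady_state = -(alpha + beta * 10.0) / rho
print(round(path[0], 4), round(path[-1], 4), round(steady_state, 4))
```

Each quarter the gap between the forecast and its resting level shrinks by the factor 1 + ρ, so a value of ρ closer to -1 implies faster adjustment.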

Table 18 Estimated coefficients of the selected error correction model for quarterly PED volume. Note

the dependent variable is the quarterly change in the natural logarithm of the 4-quarter moving

average of PED volume.


Variable                         Coef.     Std Err.   p-value
PED volume (-1)                 -0.6414    0.1138     0.00
Real petrol price (-1)          -0.0576    0.0271     0.04
Real petrol price (Diff)         0.0148    0.0447     0.74
Real GDP (SA) (-1)               0.1836    0.0456     0.00
Real GDP (SA) (Diff)             0.3889    0.2814     0.17
Unemployment rate (SA) (-1)     -0.0437    0.0141     0.00
Unemployment rate (SA) (Diff)   -0.0683    0.0443     0.13
Constant                        11.5819    2.1067     0.00




Figure 72 shows the fitted values and 10-year ahead forecasts generated by this ECM,

using the same assumptions about petrol prices, GDP, and unemployment as above.

This model predicts a relatively strong increase in PED volume over time, from around

750 million litres at the end of 2013 to around 840 million litres at the end of 2023. On

average, the forecasts produced by this ECM are similar to those produced by the

current model (also using an ECM), although for the reasons discussed above we have

not attempted to forecast a quarterly seasonal pattern for PED volume.

15 See, for example, section 20.7 of Davidson & MacKinnon (1993).


Figure 73 shows the comparison of the forecasts produced by this ECM estimated using

the truncated sample and the actual PED volumes. As with the other models, the ECM

generally over-estimates PED volume during this period, by an average of 5.0% per

quarter. The RMSE of the forecasts for this period is 49 million litres.


Figure 72 Fitted values and out-of-sample forecasts of quarterly PED volumes generated by the

selected ECM.


Figure 73 Comparison of actual PED volumes and truncated-sample forecasts produced by the ECM.


[Figure 72 plot: vertical axis: PED volume (million litres), 0 to 1,200; horizontal axis: quarter, 1994-1 to 2023-1.]

[Figure 73 plot: vertical axis: PED volume (million litres), 640 to 820; horizontal axis: quarter, 2011-3 to 2013-3.]

Figure 74 shows the June-year forecasts generated by the ECM. This model produces a

similar growth profile over time as the existing NLTF forecasting model. Our ECM

forecasts an increase in PED volume to 3.35 billion litres in 2023, an average annual

growth rate of 1.1%. The truncated model over-estimates PED volume by 3.5% in 2012

and 5.1% in 2013.

Figure 74 Annual (June year) fitted values and forecasts generated by the ECM.


7.5 Hybrid models [PED models 3a & 3b]

As an alternative to the simple time-series and regression models estimated above, we

have developed ‘hybrid’ models of PED volume, based on a combination of a VKT

forecast for light petrol vehicles and a fuel efficiency forecast. In particular, the PED

volume in a quarter can be calculated as VKT in that quarter divided by fuel efficiency

(measured as kilometres per litre).16
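The identity behind the hybrid approach can be written out in a few lines. The sketch below uses the rounded magnitudes quoted elsewhere in this section (around 7.3 billion km of light petrol VKT and around 750 million litres of PED volume per quarter in 2013) purely to show the arithmetic; the function names and the 1% efficiency scenario are assumptions for illustration.

```python
def ped_volume(vkt_km, efficiency_km_per_litre):
    """PED volume in litres implied by VKT and fuel efficiency."""
    if efficiency_km_per_litre <= 0:
        raise ValueError("efficiency must be positive")
    return vkt_km / efficiency_km_per_litre


def implied_efficiency(vkt_km, ped_litres):
    """Fuel efficiency (km per litre) implied by observed VKT and PED volume."""
    if ped_litres <= 0:
        raise ValueError("PED volume must be positive")
    return vkt_km / ped_litres


# Rounded quarterly magnitudes quoted in the text for 2013.
efficiency = implied_efficiency(7.3e9, 750e6)

# Hypothetical scenario: efficiency improves by 1% with VKT unchanged.
volume_after = ped_volume(7.3e9, efficiency * 1.01)
print(round(efficiency, 2), round(volume_after / 1e6, 1))
```

Because efficiency is calculated as the ratio of VKT to PED volume, any volatility in the PED volume data passes directly into the implied efficiency series.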

To do this, we developed separate models of light petrol VKT and fuel efficiency, and

then combined these to generate a model of PED volume and forecasts. This model also

has the advantage that the effect of future changes in fuel efficiency on PED volumes

can be analysed. We tested two versions of the light petrol VKT model – one for total

VKT and one for per-capita VKT that was multiplied by a population forecast to obtain

total VKT.

The accuracy of the forecasts from these models depends in part on the accuracy of the

light petrol VKT data, calculated from odometer readings. Based on our analysis, it

appears that the VKT data does not suffer from the same unexplained volatility as the

PED volume data; however, the actual accuracy of the VKT data is unknown.

Furthermore, as will become apparent below, the volatility in the PED volume data gets

translated into volatility in the calculated efficiency in the hybrid models. The

problem of PED volume volatility is therefore not eliminated in these models, and the accuracy of

the forecasts depends in part on the accuracy of the efficiency forecasts.

16 In this model we use only VKT of light petrol vehicles as the determinant of PED volume. Light

petrol vehicles account for around 99.95% of total petrol VKT in New Zealand.

[Figure 74 plot: Annual PED volumes. Vertical axis: PED volume (million litres), 0 to 4,000; horizontal axis: year ended June, 1996 to 2023.]

7.5.1 Total light petrol VKT model [PED model 3a]

The available data on quarterly light petrol VKT was shown in Figure 54. This displays a

long-run trend as well as a clear seasonal pattern around this trend. It is important to

note that a shorter time-series for VKT is available compared to the time-series for PED

volume. The quarterly VKT data available at the time of our analysis runs from the first

quarter of 2001 to the first quarter of 2013. In contrast, PED volume data is available to

the third quarter of 2013. The lack of more recent data is a disadvantage of using light

petrol VKT for forecasting, rather than forecasting PED volume directly.

Using the same process of general-to-specific model selection and diagnostic testing on

the regression residuals, we arrived at a regression model for light petrol VKT involving

the real petrol price, real GDP, the unemployment rate, and quarterly dummy variables

to capture the seasonal pattern. These variables were found to be cointegrated, and one

lag of light petrol VKT was included to correct for remaining serial correlation. The

estimated coefficients of this model are shown in Table 19, and all variables are

significant at the 10% level, with the exception of the second-quarter dummy.17 The

model explains 96% of the quarterly variation in light petrol VKT.

Table 19 Estimated coefficients of the regression model for quarterly light petrol VKT.



Variable                          Coef.     Std Err.   p-value
Real GDP (SA)                     0.1002    0.0350     0.01
Q2                               -0.0025    0.0019     0.21
Q3                                0.0091    0.0025     0.00
Q4                                0.0247    0.0026     0.00
Light petrol VKT (-1)             0.7266    0.0878     0.00
Constant                          1.6581    0.5448     0.00
R-squared (of light petrol VKT)   0.96


Figure 75 shows the fitted values and forecasts produced by this model. Falling

unemployment and rising GDP lead to relatively strong growth in the first five years of

the forecast period, from around 7.3 billion km per quarter in 2013 to around 7.9 billion

km per quarter in 2018. Beyond this, rising real petrol prices offset the GDP and

unemployment effects, and light petrol VKT essentially remains constant.

17 Although insignificant, the Q2 dummy was retained in the model on the basis of a joint F-test of the

significance of all quarterly dummies. The same approach was taken for all other models with

quarterly dummies.


Figure 75 Fitted values and out-of-sample forecasts for quarterly light petrol VKT.


The corresponding June-year fitted values and forecasts of light petrol VKT are shown

in Figure 76. The model fits the historical data well, and the recent downwards trend of

VKT is expected to be reversed by GDP growth and lower unemployment in the

medium term. The model predicts an average annual growth rate of light petrol VKT of

0.9% between 2013 and 2023, although this is essentially divided into two periods with

annual growth of 1.5% between 2013 and 2018 and 0.2% between 2018 and 2023.

Given that this model predicts a relatively rapid resumption of light petrol VKT growth

(at least until around 2018), in light of the discussion in section 3 about transport trends,

it is worth considering the plausibility of this model in more detail. As noted above, the

model explains 96% of the variation in quarterly light petrol VKT, including the period

after 2005 when VKT has generally declined. This suggests that the decline in VKT is

adequately explained by the variables in the model (the real petrol price, real GDP, and

the unemployment rate).

[Figure 75 plot: Quarterly light petrol VKT. Vertical axis: VKT (million km), 6,400 to 8,200; horizontal axis: quarter, 2001-1 to 2023-3. Series: Actual, Fitted, Forecast.]

Figure 76 Annual fitted values and forecasts of light petrol VKT.


Figure 77 analyses this in more detail. For each year we have plotted total light petrol

VKT and its annual percentage change, as well as the annual percentage change of the

three explanatory variables in the model. This allows us to see how each of the

explanatory variables contributes to the changes in VKT over time:

Up to 2005, light petrol VKT grew relatively strongly, driven by increasing GDP

and falling unemployment, while real petrol prices increased slowly.

In 2006, light petrol VKT fell due to sharply higher real petrol prices, static

unemployment, and a slight slowdown in GDP growth.

In 2007, GDP growth and the change in unemployment were similar to 2006, but

real petrol prices fell slightly, leading to a small increase in VKT.

In 2008, GDP growth and the change in unemployment were again similar to

2006 and 2007, but real petrol prices increased significantly, leading to a small

decrease in VKT.

In 2009, GDP fell and unemployment increased dramatically. Although real

petrol prices fell slightly, the reduction in economic activity caused a large

reduction in VKT.

In 2010, unemployment continued to increase, but GDP increased slightly and

real petrol prices continued to fall, leading to a slight increase in VKT.

In 2011, unemployment improved slightly but GDP fell slightly and real petrol

prices increased significantly, leading to another fall in VKT.

In 2012, unemployment slightly worsened again and real petrol prices increased,

while GDP grew, leading to a decrease in VKT.

[Figure 76 plot: Annual Light Petrol VKT. Vertical axis: VKT (million km), 27,000 to 32,500; horizontal axis: year ended June, 2002 to 2023.]

Through this analysis, it is possible to see that the growth in VKT up to 2005 and the

subsequent general decline can be adequately explained by changes in GDP,

unemployment, and petrol prices. While this does not rule out the possibility that some

other structural shift has occurred (eg due to increasing urbanisation and urban

density), a plausible explanation for the changes in VKT between 2001 and 2012 is also

provided by changes in economic activity and petrol prices.

Overall, the VKT analysis does not provide clear evidence that the relationship between

light petrol VKT and economic drivers has ‘broken’ in recent years. If the downwards

trend in VKT continues while economic activity improves and real petrol prices do not

rise, this may provide evidence of a new relationship, but at this time we have not been

able to find such a relationship.

Figure 77 Detailed analysis of changes in light petrol VKT.


7.5.2 Per-capita light petrol VKT model [PED model 3b]

The per-capita version of the light petrol VKT model was estimated in the same manner

as the total VKT model but using total VKT divided by total population as the

dependent variable. The estimated per capita model is shown in Table 20. In this model,

the unemployment rate was not significant but a deterministic time trend was included.

[Chart for Figure 77. Series: Light petrol VKT, Real GDP, Unemployment, Real petrol price. Axes: annual VKT (million km), 28,000 to 31,000; annual change, -20% to 40%; x-axis: year ended June, 2002 to 2012.]

Table 20 Estimated coefficients of the regression model for quarterly per-capita light petrol VKT.



Variable                            Coefficient   Std. error   P-value
Real GDP per capita (SA)                 0.2287       0.0686      0.00
Time trend                              -0.0013       0.0004      0.00
Q2                                      -0.0035       0.0020      0.08
Q3                                       0.0061       0.0028      0.04
Q4                                       0.0216       0.0028      0.00
Light petrol VKT per capita (-1)         0.5633       0.1034      0.00
Constant                                 1.4670       0.3624      0.00

R-squared (of light petrol VKT per capita)   0.99


Figure 78 shows the fitted values and forecasts predicted by this model. Per-capita light

petrol VKT has been declining steadily for some time. In the short term, this decline is

forecast to level off, due to higher real GDP per capita and lower real petrol prices.

However in the long term the decline resumes, due to the negative time trend in the

model and rising real petrol prices.

Figure 78 Fitted values and out-of-sample forecasts of per capita light petrol VKT.

[Chart for Figure 78: Quarterly light petrol VKT per capita. Y-axis: VKT per capita (km), 1,000 to 2,000; x-axis: quarter, 2001-1 to 2023-1.]

Figure 79 shows the total light petrol VKT values produced by the per-capita model after multiplying by total population.18 In the short term, total light petrol VKT is forecast to increase, driven largely by increases in population. In the long term, total light petrol VKT flattens, as the population increases at essentially the same rate that per-capita VKT declines.

18 Out-of-sample forecasts are generated using Statistics New Zealand’s medium population projection.

Figure 79 Total light petrol VKT derived from the per-capita VKT model.


Figure 80 and Figure 81 respectively compare the annual per-capita and total light

petrol VKT forecasts produced by the two VKT models. Overall the per-capita model

produces lower forecasts of both per-capita and total light petrol VKT. In our view, the

light petrol VKT forecasts produced by the per-capita model are more plausible than the

forecasts produced by the total VKT model, given recent trends in light petrol VKT.

[Chart for Figure 79: Quarterly light petrol VKT. Y-axis: VKT (million km), 6,600 to 7,800; x-axis: quarter, 2001-1 to 2023-1.]

Figure 80 Comparison of light petrol VKT per capita forecasts produced by the total VKT and per-capita VKT models.


Figure 81 Comparison of total light petrol VKT forecasts produced by the total VKT and per-capita

VKT models.


[Chart for Figure 80: Hybrid model light petrol VKT per capita forecasts. Series: Actual, Total VKT model, Per capita VKT model. Y-axis: annual average VKT per capita (km), 5,600 to 7,600; x-axis: year ended June, 2002 to 2023.]

[Chart for Figure 81: Hybrid model light petrol total VKT forecasts. Series: Actual, Total VKT model, Per capita VKT model. Y-axis: annual total VKT (million km), 27,000 to 32,500; x-axis: year ended June, 2002 to 2023.]

7.5.3 Fuel efficiency [PED models 3a & 3b]

The second component of the hybrid models is a model of average fuel efficiency of the

light petrol vehicle fleet (measured as km / litre). The same efficiency model was used

for both the total VKT and per-capita VKT models, as there is no reason to expect

efficiency to differ if VKT is calculated in total or on a per-capita basis.

Figure 82 shows the quarterly fuel efficiency implied by simply dividing quarterly VKT

of the light petrol fleet by the PED volume in the corresponding quarter. The resulting

data series exhibits large fluctuations between around 8 km / litre and 12 km / litre,

reflecting the volatility in the PED volume series. It is implausible that these short term

fluctuations reflect actual large changes in fuel efficiency, therefore we have applied a 4-

quarter moving average to the efficiency time series, and used the moving average in

our subsequent analysis.
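The smoothing step described above can be sketched as follows. This is a minimal illustration, not the report's own calculation: the input figures are hypothetical, the function names are ours, and we assume a trailing (rather than centred) 4-quarter window, which the text does not specify.

```python
def implied_efficiency(vkt_million_km, ped_million_litres):
    """Quarterly km per litre implied by dividing VKT by PED volume."""
    return [v / p for v, p in zip(vkt_million_km, ped_million_litres)]


def trailing_moving_average(values, window=4):
    """Trailing moving average; entries without a full window are None."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window:i + 1]) / window)
    return out


# Hypothetical quarterly figures for illustration only (not the report's data).
vkt = [7200.0, 7300.0, 7100.0, 7400.0, 7250.0]   # million km
ped = [800.0, 730.0, 710.0, 740.0, 725.0]        # million litres

efficiency = implied_efficiency(vkt, ped)
smoothed = trailing_moving_average(efficiency)
```

In this toy series one low raw reading (9.0 km / litre in the first quarter) is averaged with three readings of 10.0, which is the kind of damping of shipment-timing noise the moving average is intended to provide.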

The 4-quarter moving average for efficiency generally fluctuates within a range from

around 9.5 to 9.8 km / litre during the period for which data is available. There is no

strong trend apparent in Figure 82, but there is some evidence of a slight increase in

efficiency between 2008 and 2013.

There are a number of ways that efficiency could be modelled and forecasted. For

example, changes in the engine size composition of the vehicle fleet could be analysed

and used as the basis for future predictions of efficiency. However, this would involve

undertaking detailed analysis of changes in the fleet on a quarterly basis, and

potentially making a large number of assumptions about the drivers of efficiency.

Figure 82 Implied fuel efficiency of the light petrol fleet.


[Chart for Figure 82: Light Petrol Fuel Efficiency. Series: Actual, 4-quarter moving average. Y-axis: efficiency (km / litre), 7 to 13; x-axis: quarter, 2001-1 to 2013-1.]

Alternatively, a regression model for efficiency could be specified. Using the 4-quarter

moving average for efficiency as the dependent variable, we tested various regression

models using explanatory variables including real petrol prices, real vehicle prices, and

economic variables. However, we were unable to find robust relationships between

efficiency and other variables. This suggests that the primary drivers of changes in

efficiency (if any) may be improvements in engine technology, and changes in driver

preferences and/or behaviour that are difficult to model directly.

As a result of the failed regression analysis for efficiency, we sought to build a simple

time-series model, for the purpose of PED volume forecasting. Given the apparent

slightly increasing trend in efficiency after 2008, we tested models for the 4-quarter

moving average of efficiency using the following explanatory variables:

A logarithmic time trend19

A dummy variable taking the value one from 2008 onwards and zero in other

time periods

An interaction between the logarithmic time trend and the above dummy

We also tested the inclusion of autoregressive error terms. Using significance tests and

residual diagnostics, we arrived at the estimated regression model for efficiency shown

in Table 21. This model has a statistically significant increasing trend after 2008, and explains 28% of the variation in the 4-quarter moving average of efficiency.
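The three candidate explanatory variables listed above can be constructed as in the sketch below. It is illustrative only: the report does not state how the time index is defined, so we assume t = 1 in the first quarter of 2001 (where the efficiency series starts), and the function name is hypothetical. Estimation itself (including the lagged or autoregressive terms) is not shown.

```python
import math


def build_regressors(quarters, break_year=2008):
    """Return one (log_trend, dummy, dummy * log_trend) row per quarter.

    quarters: chronological list of (year, quarter) tuples.
    Assumes the time index starts at t = 1 for the first quarter supplied.
    """
    rows = []
    for t, (year, _quarter) in enumerate(quarters, start=1):
        log_trend = math.log(t)                      # logarithmic time trend
        dummy = 1.0 if year >= break_year else 0.0   # one from 2008 onwards
        rows.append((log_trend, dummy, dummy * log_trend))
    return rows


quarters = [(y, q) for y in range(2001, 2014) for q in range(1, 5)]
X = build_regressors(quarters)
```

With this construction the interaction term is zero before 2008 and equals the log trend afterwards, so its coefficient measures how the slope of the trend changes after the break.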

Table 21 Estimated time-series regression model for light petrol efficiency.


Variable                   Coefficient   Std. error   P-value
Log time trend                 -0.2175       0.0884      0.02
Dummy variable                  0.6169       0.3193      0.06
Dummy * log time trend         -2.0001       1.1396      0.09
Constant                        0.2382       0.1424      0.10
Efficiency MA (-1)              7.9250       1.4457      0.00

R-squared (vs moving average)   0.28


The fitted values and out-of-sample forecasts produced by this model are shown in

Figure 83. The model predicts a slight increase in efficiency over time, from around 9.80

km / litre in 2013 to around 10.14 km / litre in 2023.

19 A logarithmic rather than linear trend was used to prevent efficiency from becoming implausibly

high over time.


Figure 83 Fitted values and out-of-sample forecasts for light petrol fuel efficiency.


7.5.4 PED volumes (total VKT model) [PED model 3a]

The total light petrol VKT and efficiency models were combined to produce fitted

values and forecasts of PED volumes (Figure 84). This hybrid model explains 57% of the

variation in the 4-quarter moving average of PED volume but only 5% of the variation

in actual quarterly PED volume.

This hybrid model predicts a small and diminishing increase in PED volume over time,

reflecting the interaction between increasing VKT and increasing efficiency. Quarterly

PED volumes are forecast to increase from around 750 million litres per quarter in 2013

to around 780 million litres per quarter by 2023, with most of the increase occurring by

2018. The quarterly forecasts show a slight seasonal pattern, reflecting the seasonal

pattern in VKT.

Figure 85 shows the results when both the VKT and efficiency models are truncated as

before and the truncated models are used to produce forecasts of PED volumes from the

third quarter of 2011 to the third quarter of 2013. In contrast with the other models, the

hybrid model does not consistently under- or over-estimate PED volume, on average

under-estimating PED volume by around 0.5% per quarter. The RMSE for these

forecasts is 32.5 million litres.
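The two accuracy measures quoted for the truncated-sample test can be computed as in the following sketch. The data are hypothetical, and the sign convention (positive means over-forecast) is our assumption rather than something the report states.

```python
import math


def rmse(actual, forecast):
    """Root mean squared error, in the units of the series."""
    squared = [(f - a) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared) / len(squared))


def mean_pct_error(actual, forecast):
    """Average quarterly percentage error; positive means over-forecast."""
    pct = [(f - a) / a for a, f in zip(actual, forecast)]
    return 100.0 * sum(pct) / len(pct)


# Hypothetical quarterly PED volumes in million litres (illustration only).
actual = [750.0, 800.0, 700.0, 760.0]
forecast = [760.0, 780.0, 720.0, 750.0]

error_rmse = rmse(actual, forecast)
error_bias = mean_pct_error(actual, forecast)
```

In this toy example the over- and under-forecasts largely cancel, so the average percentage error is close to zero while the RMSE is not. The same pattern is possible for a model that shows no consistent bias but still misses individual quarters.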

[Chart for Figure 83: Quarterly light petrol fuel efficiency. Series: Actual, Fitted, Forecast, Actual (moving average). Y-axis: km / litre, 6 to 13; x-axis: quarter, 2001-1 to 2023-3.]

Figure 84 Fitted values and out-of-sample forecasts of quarterly PED volumes generated by the hybrid

model of total light petrol VKT and efficiency.


Figure 85 Comparison of actual PED volumes and truncated-sample forecasts produced by the hybrid

model of total light petrol VKT and efficiency.


[Chart for Figure 84. Y-axis: PED volume (million litres), 0 to 1,200; x-axis: quarter, 1994-1 to 2023-1.]

[Chart for Figure 85. Y-axis: PED volume (million litres), 640 to 820; x-axis: quarter, 2011-3 to 2013-3.]

Figure 86 shows the June-year fitted values and forecasts generated by the hybrid

model. Annual PED volume is forecasted to increase slightly from 3.0 billion litres in

2013 to 3.15 billion litres in 2023, an average annual growth rate over ten years of 0.4%.

The forecasts imply annual growth of 1.0% between 2013 and 2018, and then -0.1%

between 2018 and 2023. The truncated-sample forecasts generated by this model are relatively accurate, under-estimating PED volume by 0.3% in 2012 and over-estimating it by 0.3% in 2013.

Within the model it is possible to decompose the quarterly percentage change in PED

volume into VKT and efficiency effects. The quarterly percentage change in PED

volume is approximately equal to the quarterly percentage change in total VKT minus

the quarterly percentage change in efficiency. This decomposition is shown in Figure 87,

where we have shown the 4-quarter moving average of the quarterly growth rates in

order to smooth out the seasonal variation in quarterly VKT.
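The approximation behind this decomposition follows from PED volume being VKT divided by efficiency. A small numerical check, using hypothetical values chosen only to illustrate the arithmetic, is sketched below.

```python
def pct_change(new, old):
    """Percentage change from old to new."""
    return 100.0 * (new / old - 1.0)


# Hypothetical values: VKT rises 0.5% and efficiency rises 0.1% in a quarter.
vkt_prev, vkt_now = 7300.0, 7336.5   # million km
eff_prev, eff_now = 9.80, 9.8098     # km per litre

ped_prev = vkt_prev / eff_prev       # implied PED volume, million litres
ped_now = vkt_now / eff_now

exact = pct_change(ped_now, ped_prev)
approx = pct_change(vkt_now, vkt_prev) - pct_change(eff_now, eff_prev)
```

Here the exact change in PED volume is just under 0.4%, and the difference of the two component changes is 0.4%. The gap between the two grows with the size of the changes, which is why the decomposition is described as approximate.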

Figure 86 Annual (June year) fitted values and forecasts generated by the total VKT hybrid model.


[Chart for Figure 86: Annual PED volumes. Series: Actual, Fitted. Y-axis: PED volume (million litres), 0 to 4,000; x-axis: year ended June, 1996 to 2023.]

Figure 87 Approximate decomposition of quarterly PED forecasts in the total VKT hybrid model.


7.5.5 PED volumes (per-capita VKT model) [PED model 3b]

The per-capita light petrol VKT and efficiency models were combined to produce fitted

values and forecasts of PED volumes (Figure 88). This hybrid model explains 56% of the

variation in the 4-quarter moving average of PED volume but only 5% of the variation

in actual quarterly PED volume.

This hybrid model predicts generally flat PED volume over time. Quarterly PED

volumes are forecast to decline slightly from around 750 million litres per quarter in

2013 to around 745 million litres per quarter by 2023.

The performance of this model on the truncated sample forecasting test (Figure 89) is

similar to the total VKT hybrid model. In contrast with the other models, the hybrid

model does not consistently under- or over-estimate PED volume, on average under-

estimating PED volume by around 0.4% per quarter. The RMSE for these forecasts is

32.5 million litres.

Figure 90 shows the June-year fitted values and forecasts generated by the hybrid

model. Annual PED volume is forecasted to essentially remain unchanged between 2013

and 2023. The truncated-sample forecasts generated by this model are relatively accurate, under-estimating PED volume by 0.3% in 2012 and over-estimating it by 0.1% in 2013.

[Chart for Figure 87. Series: Total VKT, Efficiency, Total PED volume. Y-axis: quarterly change (4-quarter moving average), -0.3% to 0.7%; x-axis: quarter, 2013-3 to 2023-3.]

Figure 88 Fitted values and out-of-sample forecasts of quarterly PED volumes generated by the hybrid

model of per capita light petrol VKT and efficiency.


Figure 89 Comparison of actual PED volumes and truncated-sample forecasts produced by the hybrid

model of per capita light petrol VKT and efficiency.


[Chart for Figure 88. Y-axis: PED volume (million litres), 0 to 1,200; x-axis: quarter, 1994-1 to 2023-1.]

[Chart for Figure 89. Y-axis: PED volume (million litres), 640 to 820; x-axis: quarter, 2011-3 to 2013-3.]

Figure 90 Annual fitted values and forecasts generated by the per capita VKT hybrid model.


In this model, the quarterly percentage change in PED volume is approximately equal to

the percentage change in per-capita VKT plus the percentage change in population,

minus the percentage change in efficiency. Figure 91 shows this decomposition (on a 4-

quarter moving average basis) for the forecasts produced by the hybrid per-capita VKT

model for PED volume.
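The reasoning behind this three-way decomposition can be written out as follows. The symbols are ours rather than the report's: $v_t$ is per-capita VKT, $N_t$ is population and $E_t$ is efficiency in km per litre. The final line relies on the usual approximation that a log difference is close to a percentage change when the change is small.

```latex
\begin{align*}
\mathrm{PED}_t &= \frac{v_t \, N_t}{E_t} \\
\Delta \ln \mathrm{PED}_t &= \Delta \ln v_t + \Delta \ln N_t - \Delta \ln E_t \\
\frac{\Delta \mathrm{PED}_t}{\mathrm{PED}_{t-1}} &\approx
  \frac{\Delta v_t}{v_{t-1}} + \frac{\Delta N_t}{N_{t-1}} - \frac{\Delta E_t}{E_{t-1}}
\end{align*}
```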

[Chart for Figure 90: Annual PED volumes. Series: Actual, Fitted. Y-axis: PED volume (million litres), 0 to 4,000; x-axis: year ended June, 1996 to 2023.]

Figure 91 Approximate decomposition of quarterly PED forecasts in the per-capita VKT hybrid model.


7.6 Additional PED volume models

Following presentation of the above models to the NLTF revenue forecasting group, we

were requested to test some additional models for PED volume, including some new

variables and excluding some existing variables:

Excluding real petrol prices in the PED regression and hybrid models;

Including total NZ population in the PED regression and total VKT hybrid

model;

Including the proportion of population living in Auckland as an alternative to

the urban population proportion in the PED regression models and testing this

variable in the hybrid models.

7.6.1 Excluding real petrol prices [PED model 2d]

The real petrol price variable was highly statistically significant in the VKT components

of the hybrid models, but was only weakly significant in one of the PED volume models.

We tested the effects of excluding this variable from the models and found:

Excluding the real petrol price from the total and per capita VKT models in the

hybrid models results in serious econometric problems including significant

residual autocorrelation, indicating a relevant variable has been omitted.

Excluding the real petrol price from the PED volume model with autoregressive errors did not cause the model to fail econometric diagnostic tests.

[Chart for Figure 91. Series: Per-capita VKT, Efficiency, Population, Total PED Volume. Y-axis: quarterly change (4-quarter moving average), -0.4% to 0.4%; x-axis: quarter, 2013-3 to 2023-3.]

We therefore developed an additional PED volume regression model, shown in Table 22.

This model explains 81% of the variation in the 4-quarter moving average of PED

volume and 12% of the variation in actual quarterly PED volumes.

Table 22 Estimated coefficients of the selected regression model for quarterly PED volume with an

autoregressive error specification, excluding real petrol price.


Real GDP (SA) 0.1788 0.0212 0.00


Constant 18.7373 0.2271 0.00

AR(1) 0.4131 0.0970 0.00

AR(4) -0.2048 0.1045 0.05




The quarterly fitted values and forecasts produced by this model are shown in Figure

92. The forecasts are similar to the other PED regression models.


regression model excluding real petrol price.


[Chart for Figure 92. Y-axis: PED volume (million litres), 0 to 1,200; x-axis: quarter, 1994-1 to 2023-1.]

Significantly, omitting the real petrol price has not greatly changed the forecast of

increasing PED volumes over time. This suggests that lower real petrol prices in the

short term are not the primary driver of increasing PED volume forecasts produced by

the other regression models. Rather, it is increasing real GDP and lower unemployment that are driving the increase in those models. This is consistent with the low estimates of petrol price elasticity in these models.

The truncated sample forecasts produced by this model are shown in Figure 93. As with

the other PED regression models, this model tends to over-forecast PED volumes during

this period, by an average of 5.2% per quarter. The RMSE for these forecasts is 49.6

million litres.

Figure 93 Comparison of actual PED volumes and truncated-sample forecasts produced by the PED

regression model excluding real petrol price.


Figure 94 shows the annual fitted values and forecasts generated by this model. The

truncated model over-estimates PED volumes by 3.8% and 5.2% in 2012 and 2013

respectively. The annual volumes are forecasted to grow at an average annual rate of

1.0%, reaching 3.3 billion litres by 2023.
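The relationship between an average annual growth rate and an end-of-period level can be checked with a compound-growth calculation. The sketch below assumes a 2013 starting level of about 3.0 billion litres, which is the rounded figure quoted earlier for the hybrid model and may not be the exact base used for this model; the function names are ours.

```python
def project(start, annual_rate, years):
    """Level reached after compounding annual_rate for the given years."""
    return start * (1.0 + annual_rate) ** years


def average_annual_growth(start, end, years):
    """Compound average annual growth rate between two levels."""
    return (end / start) ** (1.0 / years) - 1.0


# Assumed rounded inputs, in billion litres.
level_2023 = project(3.0, 0.010, 10)
implied_rate = average_annual_growth(3.0, 3.3, 10)
```

Under these rounded inputs, ten years of 1.0% growth takes 3.0 billion litres to roughly 3.3 billion, and the rate implied by moving from 3.0 to 3.3 over ten years is a little under 1.0% a year.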

[Chart for Figure 93. Y-axis: PED volume (million litres), 640 to 820; x-axis: quarter, 2011-3 to 2013-3.]

Figure 94 Annual fitted values and forecasts generated by the PED model excluding petrol price.


7.6.2 Including total NZ population

The total New Zealand population (estimated on a quarterly basis; obtained from Statistics New Zealand) was tested as an explanatory variable in the PED regression and total VKT models. This variable was found to be statistically insignificant in every case, while real GDP remained significant in most cases. This suggests, although it does not prove, that economic activity as measured by real GDP explains the trend in PED volumes or light petrol VKT over time better than population does. This may be because total real GDP reflects both economic activity per capita and total population.

7.6.3 Including proportion of population living in Auckland [PED model 2e]

The regression models tested in section 7.4.2 above included the proportion of urban

population as an explanatory variable. A key issue with this variable is the lack of

suitable forecasts. As an alternative, we tested the proportion of total population living

in Auckland, for which forecasts can be calculated from Statistics New Zealand’s

regional population forecasts. The quarterly values of this variable are shown in Figure

95, where the forecasts have been calculated from Statistics New Zealand’s medium

population projection.

[Chart for Figure 94: Annual PED volumes. Series: Actual, Fitted. Y-axis: PED volume (million litres), 0 to 4,000; x-axis: year ended June, 1996 to 2023.]

Figure 95 Proportion of total population living in Auckland.

Source: Statistics New Zealand.

The Auckland population proportion was found to be statistically insignificant in the

hybrid VKT models, but was significant in the PED regression model with lagged

dependent variables, together with real GDP and the real petrol price, while the

unemployment rate became insignificant.

The estimated coefficients of this model are shown in Table 23. The proportion of

population living in Auckland is highly statistically significant, and the estimated

coefficient of around -2 indicates that a 1% increase in the proportion of population

living in Auckland results in a 2% decrease in PED volume, everything else equal.

Table 23 Estimated coefficients of the PED volume regression model including the proportion of

population living in Auckland, with lagged dependent variables.


Real GDP (SA) 0.7970 0.1722 0.00


Auckland pop proportion -2.0813 0.5500 0.00

Constant 2.5875 2.1361 0.23

PED volume (-1) 0.2477 0.1135 0.03

PED volume (-2) 0.3127 0.1117 0.01

PED volume (-4) -0.1906 0.0964 0.05




[Chart for Figure 95. Series: Estimated, Forecast. Y-axis: 29% to 38%; x-axis: quarter, 1996-2 to 2026-2.]


The quarterly fitted values and forecasts produced by this model are shown in Figure 96. PED volumes are forecast to increase in the short term due to higher economic

activity and lower real petrol prices; this is not greatly offset by the higher forecast

proportion of population living in Auckland.


regression model including Auckland population proportion.


The truncated sample forecasting performance of the model is shown in Figure 97. As

with other PED regression models, this model tends to over-forecast PED volumes

during this period, by an average of 5.1% per quarter. The RMSE for this period is 49.4

million litres.

Figure 98 shows the annual forecasts generated by this model; these are very similar to

the forecasts produced by the current NLTF forecasting model. The truncated model

over-estimated PED volumes by 3.4% in 2012 and 5.5% in 2013. PED volumes are

forecast to increase at an average annual rate of 1.1%, reaching 3.4 billion litres in 2023.

[Chart for Figure 96. Y-axis: PED volume (million litres), 0 to 1,200; x-axis: quarter, 1994-1 to 2023-1.]

Figure 97 Comparison of actual PED volumes and truncated-sample forecasts produced by the PED

regression model including Auckland population proportion


Figure 98 Annual fitted values and forecasts generated by the PED model including Auckland

population proportion.


[Chart for Figure 97. Y-axis: PED volume (million litres), 640 to 820; x-axis: quarter, 2011-3 to 2013-3.]

[Chart for Figure 98: Annual PED volumes. Series: Actual, Fitted. Y-axis: PED volume (million litres), 0 to 4,000; x-axis: year ended June, 1996 to 2023.]

7.7 PED volume model evaluation and comparison

The previous section presented eight alternative models of PED volumes and generated

indicative forecasts from each. Table 24 summarises the variables that are included in

these models and will affect the forecasts of PED volume.

Table 24 Summary of variables included in the selected PED models.

Columns: Model, Type, Real GDP, Real petrol price, Unempl rate, Fuel eff, Total pop, Trend, Young pop propn, Urban pop propn, AKL pop propn

1 Time series

2a Regression (AR errors)

2b Regression (ADL)

2c Regression (ECM)

2d Regression (AR errors)

2e Regression (ADL)

3a Hybrid (total VKT)

3b Hybrid (per-cap VKT)

The different models have relative advantages and disadvantages in terms of their use

and interpretation (Table 25).

Table 25 Advantages and disadvantages of the PED volume models.

Time-series [1]
  Advantages:
    Very simple implementation
    No forecast drivers required
    Model can evolve over time
  Disadvantages:
    Cannot test alternative scenarios
    Provides no explanation for trends

Regression [2a & 2d] (AR errors)
  Disadvantages:
    Unsophisticated short-run dynamics

Regression [2b & 2e] (lagged dependent variable)
  Advantages:
    Includes demographic variables
  Disadvantages:
    Very sensitive to demographic assumptions
    No Stats NZ forecast of urban population for model 2b
    Population data is only observed in Census years

Regression [2c] (ECM)
  Advantages:
    Sophisticated short-run dynamics
  Disadvantages:
    Difficult to interpret and explain trends

Hybrid [3a & 3b]
  Advantages:
    Uses potentially more reliable VKT data (compared to PED volumes)
    Allows analysis of changing fuel efficiency
  Disadvantages:
    Basis for forecasting efficiency is not clear
    Model 3a extrapolates past relationship with GDP and unemployment
    Model 3b includes an unexplained deterministic trend


The models also differ in terms of their goodness of fit and forecasting performance.

Table 26 compares the goodness of fit of the models and the RMSE of the truncated-

sample forecasts. Overall, the regression models have the highest within-sample

goodness of fit, explaining around 80% of the variation in the 4-quarter moving average

of PED volume and between 10% and 15% of the variation in actual PED volume.

However the truncated-sample forecasting performance of the hybrid models is

considerably better than the regression models (Figure 99). While the regression models

tend to over-forecast PED volumes, the hybrid models are relatively accurate on

average.

Figure 100 compares the annual PED volume forecasts produced by these eight models.

All models except the time series model predict a general increase of PED volume over

time, although forecast growth is generally higher in earlier years than later. Several of

the regression models predict a similar (but slightly lower) profile of growth compared

to the current forecasting model; this is unsurprising as these models contain similar

variables to the current model.

The hybrid models and regression model 2b (that includes demographic variables)

produce similar growth profiles, and this is somewhat lower than the other regression

models. In the hybrid models this arises primarily from the forecasted improvement in

efficiency and the downward trend in per-capita VKT in the per-capita model. In

regression model 2b the forecasted increase in the urban proportion of population

works to offset growth driven by higher GDP and lower unemployment.

The time series model is the only model that predicts a continuous decline in PED

volume. This model essentially extrapolates the recent trend in PED volume and in our

view it is likely to be more accurate for short-term forecasting than long-term

forecasting. With that in mind, it is notable that the time series model produces similar

forecasts for 2014 and 2015 as the hybrid models and regression model 2b.

Table 26 Summary of goodness of fit and truncated-sample forecast RMSE of the PED volume models.

Model                           1      2a     2b     2c     2d     2e     3a     3b

R2 vs PED volume                0.12   0.13   0.10   0.15   0.12   0.12   0.05   0.05
R2 vs PED volume (MA)           0.78   0.81   0.80   0.81   0.81   0.81   0.57   0.56

RMSE (million litres)           45.2   48.3   57.1   49.0   49.6   49.4   32.5   32.5
Average quarterly error (%)     4.4    4.9    6.3    5.0    5.2    5.1    0.5    0.4
2012 error (%)                  3.5    3.5    4.1    3.5    3.8    3.4    -0.3   -0.3
2013 error (%)                  4.2    4.9    6.8    5.1    5.2    5.5    0.3    0.1


Figure 99 Comparison of the truncated-sample forecasting performance of the PED models.


Figure 100 Annual PED volume forecast comparison.


[Chart for Figure 99. Series: Actual, Time series, Regression (a), Regression (b), Regression (c), Regression (d), Regression (e), Hybrid (a), Hybrid (b). Y-axis: PED volume (million litres), 620 to 840; x-axis: quarter, 2011-3 to 2013-3.]

[Chart for Figure 100: Annual PED volumes. Series recovered from the legend: Regression (a), Regression (b), Regression (c), Regression (d), Regression (e), Hybrid (a). Y-axis: PED volume (million litres), 2,400 to 3,600; x-axis: year ended June, 1996 to 2023.]

Taking all of the above into consideration, in consultation with a subgroup of the NLTF

forecasting group it was decided to select models 2a, 3a, and 3b for further analysis.

7.8 PED volume confidence intervals and sensitivity testing

Confidence intervals were developed for the indicative forecasts produced by PED

models 2a, 3a, and 3b, and sensitivity tests were performed on these models. 67% and

90% confidence intervals were chosen, reflecting a balance between confidence intervals

that are relatively wide, versus a high confidence level.
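To give a sense of how the two confidence levels relate, the sketch below computes the two-sided multipliers they would imply if forecast errors were normally distributed. That distributional assumption is ours; the report does not state here how its intervals were constructed, and the function name is hypothetical.

```python
from statistics import NormalDist


def two_sided_multiplier(level):
    """Number of standard errors either side of the forecast for a
    two-sided interval at the given confidence level (normal errors)."""
    return NormalDist().inv_cdf(0.5 + level / 2.0)


z67 = two_sided_multiplier(0.67)
z90 = two_sided_multiplier(0.90)
width_ratio = z90 / z67
```

Under this assumption a 67% interval spans roughly one standard error either side of the forecast, and a 90% interval is about 1.7 times as wide.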

PED model 2a forecasts the 4-quarter moving average of PED volume, and the hybrid

models 3a and 3b use the 4-quarter moving average of efficiency. Therefore, strictly

speaking the quarterly forecasts and confidence intervals produced by these models are

for the moving average of PED volume rather than actual quarterly PED volume. On an

annual basis, much of the excess volatility in quarterly PED volume cancels out, but

some remains as the timing of fuel shipments can still vary from year to year.

Therefore, we have also estimated wider confidence intervals based on the relationship

between actual PED volumes and the moving average, that indicate the range within

which PED volume is likely to fall in any given year. Over a period of several years, the

confidence interval for the moving average indicates the likely overall accuracy during

that period, but it is more difficult to accurately forecast PED volume in any given year,

ie the confidence intervals for any given year are wider.
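The smoothing effect of the 4-quarter moving average can be illustrated with a minimal sketch. The quarterly volumes below are made-up numbers, not data from this report, and the function name is hypothetical.

```python
def trailing_moving_average(values, window=4):
    """Return the trailing moving average; the first window-1 entries are None."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            chunk = values[i + 1 - window:i + 1]
            out.append(sum(chunk) / window)
    return out

# Hypothetical quarterly PED volumes (million litres), illustrative only.
quarterly = [760, 700, 790, 730, 770, 690, 800, 740]
ma4 = trailing_moving_average(quarterly)

# The spread of the smoothed series is much narrower than that of the raw series.
raw_range = max(quarterly) - min(quarterly)
smoothed = [v for v in ma4 if v is not None]
ma_range = max(smoothed) - min(smoothed)
print(raw_range, ma_range)
```

Because the smoothed series varies far less than the raw quarterly series, an interval fitted to the moving average will be narrower than one that must also cover quarter-to-quarter timing effects.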

Sensitivity tests were performed by subjecting the models to a one standard deviation increase in

each of the explanatory variables in turn, relative to the levels used to generate the

baseline forecasts presented above. For price variables, the level of the variable was

increased by one standard deviation. For GDP, the growth rate was increased (in all

years) by one standard deviation. These sensitivity tests are not directly comparable

across variables, but are consistent in the sense that one standard deviation was used as

the basis for the test.
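The two kinds of shock described above (a shift in the level of a price, and a shift in the GDP growth rate in every year) can be sketched as follows. The baseline paths are hypothetical; the shock sizes of 30c/L and 2 percentage points are taken from the legend of Figure 101, and the function names are illustrative only.

```python
def shock_level(path, shock):
    """Shift every period of a level variable (e.g. a price) by a fixed amount."""
    return [x + shock for x in path]

def shock_growth(start, growth_rates, shock):
    """Rebuild a level path after adding `shock` to the growth rate in every year."""
    levels = []
    level = start
    for g in growth_rates:
        level *= 1.0 + g + shock
        levels.append(level)
    return levels

# Hypothetical baseline assumptions, for illustration only.
baseline_price = [210.0, 212.0, 214.0]        # cents per litre
baseline_growth = [0.025, 0.025, 0.025]       # annual real GDP growth
gdp_start = 100.0                             # index level

# Shock sizes taken from the legend of Figure 101.
price_shocked = shock_level(baseline_price, 30.0)
gdp_base = shock_growth(gdp_start, baseline_growth, 0.0)
gdp_shocked = shock_growth(gdp_start, baseline_growth, 0.02)

# A growth-rate shock compounds, so the gap to baseline widens every year.
gaps = [s / b - 1.0 for s, b in zip(gdp_shocked, gdp_base)]
print(price_shocked, [round(g, 4) for g in gaps])
```

This is why the GDP test has a small effect in any single year but a large cumulative effect over a ten-year horizon, whereas a price-level shock does not accumulate in the same way.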

7.8.1 PED model 2a

An indicative forecast and 67% and 90% confidence intervals produced by the

regression PED model 2a are shown in Table 27. The 67% and 90% confidence intervals

for the moving average PED volume forecast produced by this model are approximately

+/- 2.4% and 4.0% of the baseline forecast respectively (Figure 101). The forecasts

produced by this model are relatively sensitive to the GDP growth rate over time,

although the effect in any given year of 2% higher real GDP growth is relatively small.

The forecasts are relatively insensitive to petrol prices, and somewhat sensitive to the

unemployment rate.
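As a rough check of the quoted interval widths, the bounds can be expressed as percentages of the base forecast. The sketch below uses the 2023 row of Table 27; the dictionary keys and function name are illustrative only.

```python
# 2023 row of Table 27 (million litres).
row_2023 = {"lower90": 3206, "lower67": 3257, "base": 3337,
            "upper67": 3419, "upper90": 3474}

def relative_bounds(row, level):
    """Return (lower, upper) bounds as percentages of the base forecast."""
    base = row["base"]
    lower = 100.0 * (row["lower" + level] / base - 1.0)
    upper = 100.0 * (row["upper" + level] / base - 1.0)
    return lower, upper

lo67, hi67 = relative_bounds(row_2023, "67")
lo90, hi90 = relative_bounds(row_2023, "90")
print(round(lo67, 1), round(hi67, 1), round(lo90, 1), round(hi90, 1))
```

The bounds are slightly asymmetric around the base forecast, which is why the widths in the text are stated as approximate.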

In order to better understand how forecasts are generated, we have developed an

approximate breakdown of the contribution of each variable in the model (Figure 102).20

In the short term all variables make a contribution to the forecasts, while in the long

term the forecasts are predominantly driven by changes in real GDP.

20 See section 9.2.3 for a discussion of these charts.

Table 27 Indicative forecasts and confidence intervals produced by PED model 2a.

          PED volume (million litres)                            Annual change (%)
YE June   90% lower  67% lower  Base       67% upper  90% upper  90% lower  67% lower  Base       67% upper  90% upper
2013                            3,013
2014      3,007      3,040      3,091      3,144      3,178      -0.2       0.9        2.6        4.3        5.5
2015      3,066      3,115      3,191      3,270      3,321      2.0        2.5        3.2        4.0        4.5
2016      3,081      3,129      3,207      3,286      3,338      0.5        0.5        0.5        0.5        0.5
2017      3,110      3,160      3,238      3,317      3,370      1.0        1.0        1.0        1.0        1.0
2018      3,143      3,192      3,271      3,352      3,405      1.0        1.0        1.0        1.0        1.0
2019      3,166      3,216      3,295      3,377      3,430      0.7        0.7        0.7        0.7        0.7
2020      3,176      3,226      3,306      3,388      3,441      0.3        0.3        0.3        0.3        0.3
2021      3,186      3,236      3,316      3,398      3,452      0.3        0.3        0.3        0.3        0.3
2022      3,196      3,246      3,326      3,408      3,462      0.3        0.3        0.3        0.3        0.3
2023      3,206      3,257      3,337      3,419      3,474      0.3        0.3        0.3        0.3        0.3

Figure 101 Confidence intervals and sensitivity test results for PED model 2a.

[Chart: deviation relative to baseline forecast (-10% to 10%) by year, 2014 to 2023. Series: 90% confidence interval (MA), 90% confidence interval (actual), 67% confidence interval (MA), 67% confidence interval (actual), 30c/L higher petrol price, 2% higher real GDP growth rate, 1.2% higher unemp rate.]

Figure 102 Approximate annual average contributions to forecasts in PED model 2a.


7.8.2 PED model 3a

Table 28, Figure 103 and Figure 104 show the forecasts, confidence intervals, and

sensitivity tests for the hybrid PED model 3a. The 67% and 90% confidence intervals for

the moving average PED volume forecast produced by this model are approximately

+/- 3.1% and 5.2% of the baseline forecast respectively. This model is relatively sensitive

to the GDP growth rate over time, relatively insensitive to the unemployment rate, and

somewhat sensitive to the petrol price and fuel efficiency assumptions.

Table 28 Indicative forecasts and confidence intervals produced by PED model 3a.

          PED volume (million litres)                            Annual change (%)
YE June   90% lower  67% lower  Base       67% upper  90% upper  90% lower  67% lower  Base       67% upper  90% upper
2013                            3,013
2014      2,879      2,921      2,989      3,059      3,107      -4.4       -3.1       -0.8       1.5        3.1
2015      2,904      2,962      3,056      3,155      3,221      0.9        1.4        2.3        3.1        3.7
2016      2,951      3,010      3,107      3,207      3,275      1.6        1.6        1.6        1.7        1.7
2017      2,981      3,041      3,139      3,240      3,309      1.0        1.0        1.0        1.0        1.0
2018      3,001      3,062      3,160      3,262      3,331      0.7        0.7        0.7        0.7        0.7
2019      3,012      3,072      3,170      3,273      3,341      0.3        0.3        0.3        0.3        0.3
2020      3,009      3,070      3,167      3,269      3,338      -0.1       -0.1       -0.1       -0.1       -0.1
2021      3,004      3,064      3,161      3,263      3,331      -0.2       -0.2       -0.2       -0.2       -0.2
2022      2,999      3,059      3,155      3,257      3,325      -0.2       -0.2       -0.2       -0.2       -0.2
2023      2,995      3,055      3,152      3,253      3,321      -0.1       -0.1       -0.1       -0.1       -0.1

[Figure 102 chart: average contribution to annual forecast change (-0.50% to 2.00%) for four-quarter periods from 2013-4 to 2014-3 through 2022-4 to 2023-3. Series: Real petrol price, Real GDP, Unemp rate, Dynamic & interaction.]

Figure 103 Confidence intervals and sensitivity test results for PED model 3a.

[Chart: deviation relative to baseline forecast (-10% to 10%) by year, 2014 to 2023. Series: 90% confidence interval (MA), 90% confidence interval (actual), 67% confidence interval (MA), 67% confidence interval (actual), 30c/L higher petrol price, 2% higher real GDP growth rate, 1.2% higher unemp rate, 0.25 km/L higher efficiency.]

Figure 104 Approximate annual average contributions to forecasts in PED model 3a.

[Chart: average contribution to annual forecast change (-0.2% to 0.8%) for four-quarter periods from 2013-4 to 2014-3 through 2022-4 to 2023-3. Series: Real petrol price, Real GDP, Unemp rate, Efficiency, Dynamic & interaction.]

7.8.3 PED model 3b

Table 29, Figure 105 and Figure 106 show the forecasts, confidence intervals, and

sensitivity tests for hybrid PED model 3b. The 67% and 90% confidence intervals for the

moving average PED volume forecast produced by this model are approximately

+/- 3.0% and 4.9% of the baseline forecast respectively. The model is relatively sensitive

to the GDP and population growth rate assumptions over time, and is also somewhat

sensitive to petrol price and fuel efficiency assumptions.

Table 29 Indicative forecasts and confidence intervals produced by PED model 3b.

          PED volume (million litres)                            Annual change (%)
YE June   90% lower  67% lower  Base       67% upper  90% upper  90% lower  67% lower  Base       67% upper  90% upper
2013                            3,013
2014      2,884      2,924      2,989      3,058      3,104      -4.3       -2.9       -0.8       1.5        3.0
2015      2,894      2,949      3,037      3,130      3,192      0.3        0.8        1.6        2.4        2.9
2016      2,913      2,968      3,057      3,150      3,213      0.7        0.7        0.6        0.6        0.6
2017      2,916      2,971      3,060      3,153      3,216      0.1        0.1        0.1        0.1        0.1
2018      2,915      2,970      3,059      3,152      3,214      0.0        0.0        0.0        -0.1       -0.1
2019      2,908      2,963      3,051      3,144      3,206      -0.2       -0.2       -0.2       -0.3       -0.3
2020      2,896      2,951      3,038      3,130      3,192      -0.4       -0.4       -0.4       -0.4       -0.4
2021      2,885      2,939      3,026      3,117      3,178      -0.4       -0.4       -0.4       -0.4       -0.4
2022      2,874      2,928      3,015      3,105      3,166      -0.4       -0.4       -0.4       -0.4       -0.4
2023      2,865      2,918      3,004      3,095      3,155      -0.3       -0.3       -0.3       -0.3       -0.3

Figure 105 Confidence intervals and sensitivity test results for PED model 3b.

[Chart: deviation relative to baseline forecast (-10% to 10%) by year, 2014 to 2023. Series: 90% confidence interval (MA), 90% confidence interval (actual), 67% confidence interval (MA), 67% confidence interval (actual), 30c/L higher petrol price, 1.4% higher real GDP per capita growth rate, 0.3% higher population growth rate, 0.25 km/L higher efficiency.]

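Figure 106 Approximate annual average contributions to forecasts in PED model 3b.

[Chart: average contribution to annual forecast change (-0.2% to 0.8%) for four-quarter periods from 2013-4 to 2014-3 through 2022-4 to 2023-3. Legend not recovered.]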

7.9 Recommendations for PED modelling

The above analysis has not produced one model for PED volume that is clearly superior

to all others and so some degree of judgement is required to choose a model.

Of the three models (2a, 3a, and 3b) selected for detailed analysis, the regression model

2a has the best goodness of fit, but the hybrid models 3a and 3b perform significantly

better on the truncated-sample forecasting tests (Table 26). Model 2a also produces

somewhat narrower forecast confidence intervals than models 3a and 3b.

In terms of out-of-sample forecasts, in our view models 3a and 3b produce more

plausible forecasts of PED volume than model 2a (Figure 100 above). The PED volume

forecasts produced by models 3a and 3b are similar, but in our view the per-capita

model 3b generates forecasts of total and per-capita light petrol VKT that are more

plausible than model 3a (see Figure 80 and Figure 81). Model 3b also produces PED

volume forecasts that have slightly narrower confidence intervals than model 3a.

On balance, our recommendation is to adopt the per-capita hybrid model 3b for

forecasting PED volumes, due to its significantly higher short-term forecasting accuracy

than regression model 2a and because it produces PED volume forecasts that in our

view are more plausible than model 2a, while also producing VKT forecasts that are

more plausible than model 3a.

We have also been asked to comment on the use of the time series model to generate a

PED forecast scenario. The time series model has similar goodness of fit to the

regression models and performs slightly better than those models on the truncated

sample forecasting test, but does not match the performance of the hybrid models (see

Table 26). The time series model includes a strong downwards trend, which reflects the

recent trend in PED volumes but in our view is less likely to produce a plausible forecast

over ten years. The time series model may therefore be useful for generating a forecast

scenario over the short term (two to three years) but in our view is less likely to be

useful in the longer term. We discuss the use of multiple models to generate forecasts in

more detail in section 9.1.3 below.

8 Road user charges forecasting

In the case of road user charges (RUC), the underlying task is to forecast the number of

net kilometres purchased.

8.1 Data

Bearing in mind the diversity of the commercial transport sector, and the different

relationships between forms of commercial transport and GDP as discussed in section

3.1.1, we model light (1 – 6 tonnes) and heavy (over 6 tonnes) RUC kilometres

separately. For analysis we use the “net” RUC measures as these are what is required to

generate a NLTF revenue forecast.

8.1.1 RUC volumes

A first look at the data was provided in Figure 50 and Figure 52 above. These showed

that the light and heavy RUC series exhibit fairly stable trends over time, variation

around these trends and a couple of spikes in the light RUC series around 2002 and 2004

corresponding to pre-purchasing in advance of announced RUC rate increases.

Figure 107 and Figure 108 provide a simple graphic analysis of seasonality for light and

heavy RUC respectively, by grouping the quarterly observations by quarter of the year. There is

some evidence of a seasonal pattern, particularly for heavy RUC, although identification

of this pattern is complicated by the fact that both data series are generally trending

upwards over time. It is therefore more appropriate to test for seasonality using a

regression on a trend and quarterly dummy variables.

Figure 107 Graphical seasonality analysis for light RUC.

[Chart: light RUC net km (million km, axis 0 to 2,500) in four panels, Q1 to Q4, each covering 2000 to 2012. Legend: Median, Quartiles.]

Figure 108 Graphical seasonality analysis for heavy RUC.


Table 30 shows the results of a seasonality test for light RUC by regressing light RUC

km on a time trend and quarterly dummies.21 The second-quarter dummy is significant

at the 5% level and the null hypothesis that the quarterly dummies are jointly insignificant is rejected at the

5% level. This gives evidence of a predictable seasonal pattern in light RUC volumes

that should be taken into account in subsequent modelling.
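The significance of each term can be read from the ratio of its coefficient to its standard error. The sketch below recomputes these ratios from the values reported in Table 30. Two assumptions are made: the exponent of the Q3 standard error, which is illegible in the source, is taken to be 10^7, and a critical value of about 2.0 is used as a rough two-sided 5% threshold.

```python
# Coefficients and standard errors as reported in Table 30 (light RUC, km).
# The exponent of the Q3 standard error is illegible in the source; 10^7 is assumed.
table_30 = {
    "Trend": (1.20e7, 1.23e6),
    "Q2": (-1.28e8, 5.48e7),
    "Q3": (-8.27e7, 5.48e7),
    "Q4": (9.89e6, 5.58e7),
}

# Rough two-sided 5% critical value for a t-statistic; an approximation, since
# the exact value depends on the regression's degrees of freedom.
CRITICAL_5PCT = 2.0

t_ratios = {name: coef / se for name, (coef, se) in table_30.items()}
significant = {name: abs(t) > CRITICAL_5PCT for name, t in t_ratios.items()}

for name, t in t_ratios.items():
    print(f"{name}: t = {t:.2f}, significant at ~5%: {significant[name]}")
```

Under these assumptions only the trend and the second-quarter dummy exceed the threshold, which agrees with the reported p-values.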

Table 30 Seasonality test for light RUC.

Variable    Coefficient     Std. error      p-value
Trend       1.20 x 10^7     1.23 x 10^6     0.0000
Q2          -1.28 x 10^8    5.48 x 10^7     0.0240
Q3          -8.27 x 10^7    5.48 x 10^7     0.1380
Q4          9.89 x 10^6     5.58 x 10^7     0.8600
Constant    1.51 x 10^9     5.10 x 10^7     0.0000


Results of the same seasonality test for heavy RUC are shown in Table 31. The second

and third quarter dummies are significant at the 1% level and an F-test rejects the null

hypothesis that the quarterly dummies are jointly insignificant at the 1% level. This gives strong

evidence of a predictable seasonal pattern in heavy RUC volumes.

21 In this case, there was no evidence of significant serial correlation requiring autoregressive terms to

be added to the regression.

[Figure 108 chart: heavy RUC net km (million km, axis 0 to 1,200) in four panels, Q1 to Q4, each covering 2000 to 2012. Legend: Median, Quartiles.]

Table 31 Seasonality test for heavy RUC.

Variable    Coefficient     Std. error      p-value
Trend       3.84 x 10^6     6.56 x 10^5     0.0000
Q2          -4.84 x 10^7    1.42 x 10^7     0.0010
Q3          -4.50 x 10^7    1.29 x 10^7     0.0010
Q4          1.74 x 10^7     1.21 x 10^7     0.1510
AR(1)       0.5082          0.1081          0.0000
Constant    7.10 x 10^8     2.33 x 10^7     0.0000


8.1.2 Potential explanatory variables

Based on the literature review, we considered the following variables as potential

drivers of heavy and light RUC kilometres:22

• Sectoral real GDP for freight-intensive parts of the economy, notably
  o Agriculture
  o Forestry
  o Transport post & warehousing (TPW); and
  o Construction
• Real GDP in total;
• The unemployment rate;
• Real prices for diesel, petrol and the ratio (petrol/diesel);
• Real prices for RUC (light and heavy);
• Time trends; and
• Seasonal dummies.

One question of interest is whether RUC volumes have changed as the composition of

GDP has changed. Bearing in mind the relatively limited time period for which RUC

data is available, the correlation coefficients for light and heavy RUC and the

components of GDP are reported in Table 32.

22 Some additional variables were later tested in response to feedback from the NLTF forecasting

group; see sections 8.3.4 and 8.4.4 below.

Table 32 Correlations between RUC volumes and selected components of New Zealand’s real GDP.

                   Light RUC km  Heavy RUC km  Ag. GDP  Forestry GDP  TPW GDP  Const. GDP
Agriculture GDP    0.20          0.36
Forestry GDP       0.29          0.35          -0.13
TPW GDP            0.84          0.93          0.31     0.34
Construction GDP   0.71          0.75          -0.03    0.17          0.78
Total GDP          0.83          0.90          0.22     0.33          0.95     0.89

We note that there is considerable variation in correlations with RUC across different

sectors of the economy. It is also noteworthy that heavy RUC has stronger correlations

with all of the GDP variables than light RUC; this perhaps reinforces the view that some

of the growth in light RUC activity reflects substitution from petrol-powered vehicles

primarily for private use. The merit of decomposing GDP is also underlined by the wide

variation in correlation coefficients between pairs of sectoral GDP variables.

Table 33 shows the correlations between quarterly RUC volumes and other potential

explanatory variables. There is a high positive correlation between RUC volumes and

seasonally adjusted GDP. Correlations with RUC prices and fuel prices are generally

high and positive; this is a counterintuitive result which probably reflects the fact that

each individual correlation in the table does not control for any other effects.
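A minimal sketch of how a positive pairwise correlation with price can arise even when the own-price effect is negative is given below. All data are synthetic and the coefficients are arbitrary; the point is only that a shared upward trend can dominate a simple correlation.

```python
def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical data: activity (e.g. GDP) trends upwards, and the price happens to
# trend upwards too, with a small alternating wobble.
periods = range(20)
gdp = [100.0 + 2.0 * t for t in periods]
price = [50.0 + 1.0 * t + (3.0 if t % 2 else -3.0) for t in periods]

# Volume is built with a NEGATIVE own-price effect and a positive GDP effect.
volume = [10.0 * g - 4.0 * p for g, p in zip(gdp, price)]

r_price = pearson(volume, price)
print(round(r_price, 2))
```

The correlation between volume and price is strongly positive here even though volume was constructed to fall when price rises, which is the same pattern as the counterintuitive signs in Table 33. A regression that includes both variables separates the two effects.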

Table 33 Correlations between RUC volumes and other potential explanatory variables.

                        Light    Heavy    Light RUC   Heavy RUC   Real petrol  Real diesel  Real GDP
                        RUC km   RUC km   real price  real price  price        price        (SA)
Heavy RUC km            0.78
Light RUC real price    0.67     0.75
Heavy RUC real price    0.09     -0.01    0.40
Real petrol price       0.62     0.65     0.86        0.34
Real diesel price       0.55     0.60     0.73        0.26        0.96
Real GDP (SA)           0.81     0.84     0.93        0.15        0.85         0.74
Unemployment rate (SA)  0.08     -0.02    0.41        0.88        0.30         0.19         0.14

Table 34 shows the results of stationarity tests for the RUC regression variables. Most

variables are non-stationary and integrated of order one, with the exception of the

agriculture and forestry GDP series. In our regression analysis we have performed

residual stationarity tests to minimise the likelihood of estimating spurious regressions.
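The integration order can be read off the pair of p-values for each series. The sketch below applies a simple rule to a few rows of Table 34; the 5% threshold and the function name are assumptions, since the report does not state the exact decision rule used.

```python
# (level p-value, first-difference p-value) as reported in Table 34.
adf_p_values = {
    "Light RUC km": (0.34, 0.00),
    "Heavy RUC km": (0.14, 0.01),
    "Real agriculture GDP": (0.00, 0.00),
    "Real TPW GDP": (0.17, 0.05),
}

def integration_order(p_level, p_diff, alpha=0.05):
    """Classify a series from ADF p-values; alpha is an assumed threshold."""
    if p_level <= alpha:
        return 0          # unit root rejected in levels: stationary
    if p_diff <= alpha:
        return 1          # unit root rejected only after differencing
    return None           # would need further differencing or testing

orders = {name: integration_order(*p) for name, p in adf_p_values.items()}
print(orders)
```

With this rule the four example series are classified as in the table: order one for the RUC and TPW series and order zero for agriculture GDP.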

Table 34 Augmented Dickey Fuller test p-values.

Variable                 Level   First difference   Integration order
Light RUC km             0.34    0.00               1
Heavy RUC km             0.14    0.01               1
Light RUC real price     0.84    0.00               1
Heavy RUC real price     0.82    0.00               1
Real petrol price        0.86    0.00               1
Real diesel price        0.65    0.00               1
Real GDP (SA)            0.48    0.00               1
Real GDP (actual)        0.36    0.00               1
Real agriculture GDP     0.00    0.00               0
Real forestry GDP        0.00    0.00               0
Real TPW GDP             0.17    0.05               1
Real construction GDP    0.63    0.00               1
Unemployment rate (SA)   0.74    0.00               1

8.2 Modelling strategy

As with PED volumes, we tested models for each of heavy and light RUC in three

general categories:

1. Pure time series models, including only past values of net RUC km,

deterministic time trends, and quarterly dummy variables.

2. Simple regression models, relating net RUC km to other explanatory variables,

including testing various models for short-run dynamics.

3. A ‘hybrid’ approach involving forecasting real GDP in various goods-producing

sectors and then using these forecasts to generate forecasts of net RUC km, and

in the case of heavy RUC, using the proportions of heavy RUC vehicles with 2-4

and 7+ axles.

As with the PED data, the variables of primary interest (light and heavy RUC net km)

are both inferred to be non-stationary on the basis of Dickey Fuller tests. This opens up

two general modelling strategy options:

• Transform the variables to stationary forms (by taking first-differences), forecast those forms, and then undo the transformation to express the forecast in terms of the target variables; or

• Work directly with the variables of interest, develop forecasting models for them, and then check later whether the non-stationarity has been managed appropriately.
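The first of these options can be sketched in a few lines: difference the series, forecast the differences, and cumulate the forecasts back onto the last observed level. The data are hypothetical and the mean-difference forecast is only a placeholder for whatever model is fitted to the differenced series.

```python
def difference(levels):
    """First differences: d[t] = y[t] - y[t-1]."""
    return [b - a for a, b in zip(levels, levels[1:])]

def undo_difference(last_level, diff_forecasts):
    """Cumulate forecast differences from the last observed level."""
    levels = []
    level = last_level
    for d in diff_forecasts:
        level += d
        levels.append(level)
    return levels

# Hypothetical quarterly net km (millions), for illustration only.
history = [1900.0, 1915.0, 1925.0, 1945.0, 1960.0]
diffs = difference(history)

# Naive placeholder forecast of the differenced series: its historical mean.
mean_diff = sum(diffs) / len(diffs)
level_forecast = undo_difference(history[-1], [mean_diff] * 4)
print(diffs, level_forecast)
```

Because forecast errors in the differences accumulate when the transformation is undone, uncertainty about the level grows with the forecast horizon under this approach.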

We have used both of these approaches in our RUC modelling. The first approach

emerged endogenously from our specification process for the time-series models, where

the primary model choice criterion was to minimise the information remaining in model

residuals.

Given the forecast horizons required, and based on preliminary modelling work, we

considered that the second of these options was preferable for the regression models.

First differencing the RUC variables introduces considerable extra volatility that needs

to be modelled and accommodated in the forecasts.

Our preferred approach to the regression models allows us to focus directly on the

variables of interest while also guarding against spurious regression concerns. At the

same time, the fact that the first-difference strategy is used in our time-series models

allows that approach to remain in the overall set of models for the subsequent model

comparison process.

In contrast with PED, quarterly RUC volumes were found to be relatively predictable

and we found that simple models of short-run dynamics were sometimes useful but

sometimes unnecessary. Error-correction models were tested but these did not produce

models that fit significantly better or produced better short-run forecasts than other

models.

In particular we found that around 90% (or more) of the variation in RUC volume can

be explained by appropriate variables and there is little to be gained by adding the

complexity of an error-correction model (ECM). Thus the models reported below either contain no dynamic

variables or, in some cases, use a simple lag of the dependent variable. In all cases the

robustness of the models was checked by testing the residuals.

8.3 Light RUC models

The dependent variable in all of the light RUC models is quarterly light RUC km. Unlike

the models for PED and heavy RUC, the light RUC models were estimated using the

levels of the variables rather than the natural logarithms. This is because we found that

using logs for light RUC tended to produce forecasts that grew at an excessive

exponential rate. Using levels reduced this problem, but means that the coefficients of

the light RUC regressions cannot be interpreted as elasticities.

8.3.1 Pure time series models [Light RUC model 1]

The pure time series approach that we adopted for light RUC is the same process of

Bayesian model selection described in section 7.3 above. A total of 192 models were

estimated, corresponding to the set of models generated by all combinations of:

• A constant, linear, or quadratic time trend;
• An autoregressive error term with between one and four lags;
• Quarterly dummy variables; and
• Using the level or the difference of RUC km as the dependent variable.

As before, each model was estimated using the available data, and the model with the

lowest value of the BIC statistic was selected as the chosen model.
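The model grid can be enumerated as in the sketch below. One reading that reproduces the count of 192 is that the autoregressive error may contain any subset of lags one to four, including none; this interpretation is an assumption, as the report does not spell out the enumeration. The BIC helper uses the standard Gaussian form up to an additive constant, and all names are illustrative.

```python
import math
from itertools import combinations, product

trends = ["constant", "linear", "quadratic"]
# Assumption: any subset of lags 1-4 (including none) may enter the AR error.
ar_lag_sets = [c for r in range(5) for c in combinations((1, 2, 3, 4), r)]
seasonal = [False, True]          # quarterly dummies excluded / included
dependent = ["level", "difference"]

specs = list(product(trends, ar_lag_sets, seasonal, dependent))
print(len(specs))

def bic(rss, n_obs, n_params):
    """Gaussian BIC up to an additive constant: n*ln(RSS/n) + k*ln(n)."""
    return n_obs * math.log(rss / n_obs) + n_params * math.log(n_obs)

def select_best(fitted):
    """Pick the spec with the lowest BIC from {spec: (rss, n_obs, n_params)}."""
    return min(fitted, key=lambda s: bic(*fitted[s]))
```

The penalty term `k*ln(n)` is what favours parsimonious specifications: an extra parameter is only worthwhile if it reduces the residual sum of squares by enough to offset it.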

The selected model for light RUC volume used the first difference as the dependent

variable and included a constant and three autoregressive error terms (Table 35). The

model explains 57% of the variation in quarterly RUC volumes. The quarterly fitted

values and forecasts generated by this model are shown in Figure 109. In general the

model picks up the trend in light RUC volume over time but does not closely match the

short-run fluctuations. The forecasts generally follow the same trend as the existing

model but are slightly lower in earlier years.

Table 35 Estimated coefficients of the selected time series model for light RUC volumes. Dependent variable is the first difference of quarterly RUC volume.

Variable                     Coefficient       Std. error        p-value
Constant                     1.3200 x 10^7     1.2600 x 10^7     0.2940
AR(1)                        -0.7363           0.0898            0.0000
AR(2)                        -0.6007           0.1517            0.0000
AR(3)                        -0.3112           0.1552            0.0450
R-squared (vs RUC volume)    0.57


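Figure 109 Fitted values and out-of-sample forecasts of quarterly light RUC volumes generated by the selected time series model.

[Chart: quarterly light RUC volume, light RUC net km (millions, axis 0 to 3,000) by quarter, 2000-1 to 2023-2. Series: Actual, Fitted, Forecast, Current model forecast.]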


Figure 110 shows the forecasts produced by the selected model when it is estimated

using the sample truncated to the second quarter of 2011, versus the actual light RUC

volume until the third quarter of 2013. The model fits relatively well during this period,

matching the upwards trend in volume and not appearing to systematically under- or

over-estimate the volume. On average during this period the model over-estimates RUC

volume by 0.4% per quarter, and the RMSE for the period is 38.9 million km.
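The two accuracy measures quoted here can be computed as in the sketch below. The definition of the average percentage error (the mean of the quarterly percentage errors) is an assumption about how the report calculates it, and the data are hypothetical.

```python
import math

def mean_pct_error(actual, forecast):
    """Average of (forecast - actual) / actual, in percent; positive = over-estimate."""
    errors = [(f - a) / a for a, f in zip(actual, forecast)]
    return 100.0 * sum(errors) / len(errors)

def rmse(actual, forecast):
    """Root mean squared error, in the units of the series."""
    squared = [(f - a) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical quarterly light RUC net km (millions), for illustration only.
actual = [2000.0, 1950.0, 2050.0, 2000.0]
forecast = [2030.0, 1920.0, 2090.0, 2000.0]

print(round(mean_pct_error(actual, forecast), 2), round(rmse(actual, forecast), 1))
```

The two measures answer different questions: the mean percentage error shows bias (errors of opposite sign cancel), while the RMSE shows typical error size regardless of sign.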

Figure 110 Comparison of actual light RUC volumes and truncated-sample forecasts produced by the selected time series model.



Figure 111 shows the June-year fitted values and forecasts produced by this model. The

model generates essentially linear growth in light RUC volume over time, from 8.15

billion km in 2013 to 10.29 billion km in 2023, an average annual growth rate of 2.4%.

This growth is similar to the existing model, which predicts an average annual growth

rate of 2.5%. The truncated-sample model forecasts are too high by 0.6% in 2012 and

0.3% in 2013.
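The near-linear growth follows from the structure of the model: once the autoregressive error terms have died out, each quarter's forecast rises by the constant in Table 35. The sketch below, which assumes those terms are transitory, compares the annual increment implied by the constant with the increment implied by the 2013 and 2023 totals, and recomputes the average annual growth rate.

```python
# Constant from Table 35: average quarterly change in light RUC km.
quarterly_drift_km = 1.32e7

# If each quarter is higher than the previous one by the drift, each quarter is
# 4 * drift above the same quarter a year earlier, so a four-quarter total
# rises by 4 * (4 * drift) from one year to the next.
annual_increment_km = 4 * 4 * quarterly_drift_km

# Annual totals quoted in the text (km).
start_km, end_km, years = 8.15e9, 10.29e9, 10
implied_increment_km = (end_km - start_km) / years
cagr_pct = 100.0 * ((end_km / start_km) ** (1.0 / years) - 1.0)

print(annual_increment_km / 1e6, implied_increment_km / 1e6, round(cagr_pct, 1))
```

The two increments are close (roughly 211 and 214 million km per year), and because a fixed increment is added to a growing base, the percentage growth rate declines slowly over the forecast period.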

[Figure 110 chart: light RUC net km (millions, axis 1,850 to 2,100) by quarter, 2011-3 to 2013-3. Series: Actual, Forecast.]

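Figure 111 Annual (June year) fitted values and forecasts generated by the selected time-series model for light RUC volume.

[Chart: annual light RUC km, light RUC net km (millions, axis 0 to 12,000) by year ended June, 2001 to 2023. Legend not recovered.]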


8.3.2 Regression models [Light RUC model 2a]

As with PED volume, we tested the forecasting performance of simple regression

models of light RUC km. Again we used a process of general-to-specific selection and

diagnostic testing of regression residuals to select parsimonious models of quarterly

light net RUC km. Quarterly dummies were included to capture the potential seasonal

effects identified above.

Models of short-run dynamics (autoregressive errors, lagged dependent variables, and

error-correction models) were tested, but the dynamic variables did not greatly improve

the goodness of fit and were not statistically significant. However, given the observation

that RUC volumes tend to increase in advance of RUC rate increases, we tested leads of

RUC rates in the models and these were significant. Residual diagnostic tests were

performed to ensure the selected models were not spurious and there was no evidence

of residual autocorrelation.

Table 36 shows the estimated coefficients of the selected model for quarterly light RUC net km.23 All economic variables are significant at the 10% level except the real diesel price, and the second-quarter dummy and time trend are significant. The model explains 93% of the variation in quarterly light RUC km.

23 Unlike other regression models where the natural logarithm of the dependent variable was used, this model was estimated using the actual value of quarterly light RUC km. This is because using the natural logarithm of the dependent variable was found to produce forecasts that increased at an excessive exponential rate due to the inclusion of the time trend in the model, which was considered to be implausible.

Table 36 Estimated coefficients of the selected regression model for light RUC volumes. Dependent variable is the actual quarterly light RUC volume.

Variable                       Coefficient      Std. error      p-value
Real GDP (SA)                  6.67 x 10^4      1.11 x 10^4     0.00
Real diesel price              -8.14 x 10^5     5.13 x 10^5     0.12
Real light RUC price           -7.84 x 10^7     6.31 x 10^6     0.00
Real light RUC price (+1)      4.98 x 10^7      6.28 x 10^6     0.00
Q2                             -8.65 x 10^7     2.55 x 10^7     0.00
Q3                             -5.47 x 10^7     2.59 x 10^7     0.04
Q4                             7.18 x 10^7      2.63 x 10^7     0.01
Time trend                     1.21 x 10^7      3.67 x 10^6     0.00
Constant                       4.15 x 10^8      3.77 x 10^8     0.28
R-squared (vs light RUC km)    0.93
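The role of the one-quarter lead of the RUC price can be illustrated with the two price coefficients from Table 36. The sketch below holds all other variables fixed and considers a hypothetical one-unit rise in the real light RUC price; the price unit is whatever unit the model's price series uses, and the function name is illustrative.

```python
# Coefficients from Table 36 (km per unit of the real light RUC price).
BETA_PRICE_NOW = -7.84e7     # real light RUC price in the current quarter
BETA_PRICE_LEAD = 4.98e7     # real light RUC price one quarter ahead

def price_effect(price_now, price_next):
    """Contribution of current and next-quarter RUC prices to quarterly km."""
    return BETA_PRICE_NOW * price_now + BETA_PRICE_LEAD * price_next

# Hypothetical one-unit real price rise taking effect in quarter 1; the price
# unit follows whatever unit the model's price series uses.
base = price_effect(0.0, 0.0)
quarter_before = price_effect(0.0, 1.0) - base   # increase announced, not yet in force
after_increase = price_effect(1.0, 1.0) - base   # increase in force and expected to stay

print(quarter_before, after_increase)
```

In the quarter before the increase the lead term raises purchased kilometres, consistent with pre-purchasing, while once the higher price is in force the net effect of the two terms is negative.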


To generate forecasts from this model, forecasts of real GDP, the real diesel price, and

the real light RUC price are required. We used the same real GDP forecast as was used

for the PED volume models (see section 7.4 and Figure 61 above). Forecasts of the real

diesel price were created by combining a nominal petrol price forecast provided by the

Ministry of Transport with Treasury’s CPI forecast and an extended long-run forecast of

2.2% inflation per annum (Figure 112).
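The construction of a real price series from a nominal forecast and a CPI path can be sketched as follows. The input values are hypothetical, and compounding the 2.2% annual rate quarterly is an assumption; the report does not state how the extended CPI forecast was interpolated.

```python
def extend_cpi(last_cpi, quarters, annual_inflation=0.022):
    """Extend a quarterly CPI index at a constant annual inflation rate."""
    quarterly_factor = (1.0 + annual_inflation) ** 0.25
    path = []
    level = last_cpi
    for _ in range(quarters):
        level *= quarterly_factor
        path.append(level)
    return path

def deflate(nominal, cpi, base_cpi):
    """Express nominal prices in the dollars of the period where CPI = base_cpi."""
    return [p * base_cpi / c for p, c in zip(nominal, cpi)]

# Hypothetical inputs, for illustration only.
cpi_path = extend_cpi(last_cpi=1000.0, quarters=4)
nominal_price = [150.0, 151.0, 152.0, 153.0]   # cents per litre
real_price = deflate(nominal_price, cpi_path, base_cpi=1000.0)
print([round(c, 1) for c in cpi_path], [round(p, 1) for p in real_price])
```

A nominal price that rises more slowly than the CPI, as in this example, produces a falling real price.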

Figure 112 Nominal and real diesel price forecasts.

[Chart: diesel price (cents / litre, axis 0 to 250) and CPI (axis 0 to 1,600) by quarter, 2000-1 to 2023-1. Series: Nominal - Actual, Nominal - Forecast, Real - Actual, Real - Forecast, CPI - Actual, CPI - Treasury Forecast, CPI - Extended Forecast.]

Source: Ministry of Transport, Treasury, and Covec.

The light RUC price assumptions are shown in Figure 113. The nominal price is

assumed to increase annually over time, while the real light RUC price is generally

constant.

Figure 113 Nominal and real light RUC prices.


The fitted values and out-of-sample forecasts produced by this model with the above

assumptions are shown in Figure 114. The forecasts increase at an increasing rate, in

contrast to the forecasts produced by the existing model, which are approximately

linear. However, this model produces forecasts that are lower than the existing model in

early years, and higher than the existing model in later years.

Figure 115 compares the truncated-sample forecasts produced by this model with actual

light RUC km. In general the model over-predicts light RUC km, by an average of 1.4%

per quarter. The RMSE for this period is 86.7 million km.

The annual forecasts produced by this model are shown in Figure 116. While the

truncated-sample forecasts are relatively close to the actuals for 2012 and 2013 with

errors of 0.9% and 3.7% respectively, the out-of-sample forecasts increase at an

increasing rate, driven by the GDP, diesel price, and light RUC price assumptions.

Annual light RUC km is forecasted by this model to increase from 8.2 billion km in 2013

to 11.7 billion km in 2023, an average annual growth rate of 3.6%. In contrast, the

existing model predicts an average annual growth rate of 2.4%. However, this model

predicts relatively low growth rates in the first three years, of between 2.2% and 2.7%, in

contrast to 2.5% to 4.4% for the existing model.

[Figure 113 chart: light RUC price (cents / km, axis 0 to 80) and CPI (axis 0 to 1,600) by quarter, 2000-1 to 2023-1. Legend not recovered.]

Figure 114 Fitted values and out-of-sample forecasts of quarterly light RUC volumes generated by the selected simple regression model.

[Chart: light RUC net km (millions, axis 0 to 3,500) by quarter, 2000-1 to 2023-2. Legend not recovered.]

Figure 115 Comparison of actual light RUC volumes and truncated-sample forecasts produced by the selected simple regression model.

[Chart: light RUC net km (millions, axis 1,700 to 2,300) by quarter, 2011-3 to 2013-3. Series: Actual, Forecast.]

Figure 116 Annual (June year) fitted values and forecasts generated by the selected simple regression

model for light RUC volume.


8.3.3 Hybrid models [Light RUC model 3]

The RUC volume data does not suffer from the excess volatility problem that is

observed in the PED volume data. Therefore, we did not use VKT data to build

additional models for light RUC volumes. Instead, taking into account the evidence in

the literature review (section 5.2 above), we analysed the relationship between RUC

volumes and GDP of various goods-producing sectors in New Zealand. As discussed in

section 8.1.2 above, we selected the agriculture, forestry, transport, postal and

warehousing (TPW), and construction sectors for use in this analysis.

One issue with this approach is that forecasts of sectoral GDP are not available.

Therefore as an initial step we built models relating GDP in each of these sectors to total

GDP and deterministic time trends, using simple models described below. We then

tested the sectoral GDP variables in models for heavy and light RUC including other

explanatory variables such as real RUC and diesel prices, and used the forecasts of

sectoral GDP developed below to generate heavy and light RUC volume forecasts.

Sectoral GDP models

Figure 117 shows the fitted models and forecasts of real GDP in each of the four sectors.

The explanatory variables in the models are seasonally adjusted real GDP, quarterly

dummies, and an autoregressive error term with between one and four lags. The models

explain between 95% and 98% of the variation in quarterly real GDP in each sector.

[Chart: Annual light RUC km. Light RUC net km (millions) by year ended June, 2001 to 2023.]

Figure 117 Sectoral GDP forecasting models.

Agriculture Forestry

Transport, Post & Warehousing (TPW) Construction

Source: Statistics New Zealand, Treasury, and Covec.

Light RUC hybrid models

A regression model for light RUC was estimated using sectoral GDP variables, the real

light RUC price, real diesel prices, quarterly dummies, and time trends. Again a process

of general-to-specific model selection and residual diagnostic testing was used to select

a parsimonious model for the quarterly data. Lags of the sectoral GDP variables and

prices were tested, and again a one-quarter lead of the real light RUC price was found to

be significant.²⁴

The estimated coefficients of this model are shown in Table 37. All coefficients are

significant at the 10% level or better, and the model explains 92% of the variation in

quarterly light RUC net km. The agriculture and forestry GDP variables were excluded

as these were not significant and did not greatly improve the goodness of fit.

24 As with the earlier light RUC models, this model was estimated in levels rather than logs, as using

logs tended to produce forecasts of excessive growth in the long term.

[Figure 117 panels, one per sector: real GDP (1997/98 $m) by quarter, 2000-1 to 2022-3, showing actual, fitted and forecast values.]

Table 37 Estimated coefficients of the selected hybrid model for light RUC volumes.

Variable                      Coefficient     Std. error     p-value
Real light RUC price          -7.37 x 10^7    6.18 x 10^6    0.00
Real light RUC price (+1)     4.18 x 10^7     6.15 x 10^6    0.00
Real TPW GDP                  9.88 x 10^5     1.23 x 10^5    0.00
Real construction GDP         1.17 x 10^5     6.51 x 10^4    0.08
Time trend                    1.82 x 10^7     3.12 x 10^6    0.00
Light RUC volume (-1)         -0.12           0.06           0.07
Constant                      9.83 x 10^8     2.51 x 10^8    0.00




The quarterly fitted values and forecasts generated by this model are shown in Figure

118. The model explains the historical data relatively well, including the two spikes

observed in advance of RUC rate increases. The forecasts are initially lower than the

forecasts produced by the existing model, but are higher from 2018.


selected hybrid regression model


Figure 119 shows the truncated-sample forecasts produced by this model versus actual

light RUC km. In general the model over-estimates light RUC km during this period, by

an average of 1.6% per quarter. The RMSE for this period is 89.3 million km. On an

annual basis (Figure 120), the model forecasts an increase in light RUC km from 8.2

billion km in 2013 to 11.7 billion km in 2023, an average annual growth rate of 3.6%,


compared to 2.4% for the current model. The truncated-sample model over-estimates

light RUC km by 1.8% in 2012 and 3.6% in 2013.
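The average annual growth rates quoted above can be checked with a short calculation. The sketch below assumes the report uses a compound (geometric) growth rate between the 2013 and 2023 annual totals; the function name is illustrative and not taken from the report.

```python
def average_annual_growth(start, end, years):
    """Compound average annual growth rate between two levels."""
    return (end / start) ** (1.0 / years) - 1.0


# Hybrid model annual forecast: 8.2 billion km (2013) to 11.7 billion km (2023).
hybrid_growth = average_annual_growth(8.2, 11.7, 10)
print(f"Average annual growth: {hybrid_growth:.1%}")
```

Under this assumption the result rounds to the 3.6% reported in the text.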


selected hybrid regression model.


Figure 120 Annual (June year) fitted values and forecasts generated by the selected hybrid model for

light RUC volume.


[Charts: light RUC net km (millions) for the truncated-sample period 2011-3 to 2013-3, actual versus forecast; annual light RUC km (millions), years ended June 2001 to 2023.]

8.3.4 Additional light RUC models [Light RUC model 2b]


We were requested to test some additional models for light RUC volume including real

imports and exports of goods as additional explanatory variables (Figure 121).

Figure 121 Real exports and imports of goods.

Source: Treasury and Covec analysis.

[Figure 121 panels: real quarterly exports and real quarterly imports (goods only; seasonally adjusted), real value (1995/96 $m), 2000-1 to 2023-1, showing actual, Treasury forecast and extended forecast.]

We tested these two variables in the regression model 2a, in combination and

individually. We found that real exports of goods was not statistically significant either

on its own or in combination with imports, but real imports of goods was statistically

significant in both cases.

The estimated light RUC regression model including real imports of goods is shown in

Table 38. As before, this model was estimated in levels rather than logs, as the log model

produced exponentially increasing forecasts.

The quarterly fitted values and forecasts generated by this model are shown in Figure

122. As with the other light RUC models, this model generates forecasts that are initially

lower than the current model but exceed the current model’s forecasts in later periods.

The truncated sample forecasts produced by this model are shown in Figure 123. The

pattern is similar to the other light RUC models, with a tendency to over-forecast light

RUC volume, by an average of 2.2% per quarter. The RMSE for this period is 85.4

million km.
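The two truncated-sample statistics reported for each model (the RMSE and the average quarterly percentage error) can be computed as sketched below. The quarterly values are hypothetical placeholders, not the report's data, and the sign convention (positive means over-forecast) is an assumption.

```python
import math


def rmse(actual, forecast):
    """Root mean squared error of the forecasts."""
    squared = [(f - a) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared) / len(squared))


def mean_pct_error(actual, forecast):
    """Average of (forecast - actual) / actual; positive means over-forecasting."""
    errors = [(f - a) / a for a, f in zip(actual, forecast)]
    return sum(errors) / len(errors)


# Hypothetical quarterly light RUC net km (millions); NOT taken from the report.
actual = [2000.0, 2100.0, 1900.0, 2050.0]
forecast = [2040.0, 2150.0, 1950.0, 2060.0]

print(f"RMSE: {rmse(actual, forecast):.1f} million km")
print(f"Average quarterly error: {mean_pct_error(actual, forecast):.1%}")
```

A low RMSE together with an average error near zero indicates forecasts that are both accurate and unbiased over the truncated period; a large positive average error indicates systematic over-forecasting.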

The annual forecasts produced by the model are shown in Figure 124. The truncated

sample model over-forecasts light RUC volume by 1.8% in 2012 and 4.1% in 2013. The

forecast volumes grow at an essentially linear rate, and this leads to higher forecasts

than the current model after 2017.

Table 38 Estimated coefficients of the additional regression model for light RUC volumes including
imports. Dependent variable is the actual quarterly light RUC volume.

Variable                      Coefficient     Std. error     p-value
Real GDP (SA)                 3.84 x 10^4     1.86 x 10^4    0.05
Real light RUC price          4.71 x 10^7     6.29 x 10^6    0.00
Real light RUC price (+1)     -7.64 x 10^7    6.23 x 10^6    0.00
Real imports of goods         4.24 x 10^4     3.81 x 10^6    0.07
Q2                            -8.47 x 10^7    2.48 x 10^7    0.00
Q3                            -5.09 x 10^7    2.53 x 10^7    0.05
Q4                            7.24 x 10^7     2.56 x 10^7    0.01
Time trend                    1.45 x 10^7     2.27 x 10^6    0.00
Constant                      1.01 x 10^9     4.86 x 10^8    0.04





regression model including imports.



light RUC regression model including imports.


[Charts: quarterly light RUC net km (millions), 2000-1 to 2023-1; truncated-sample period 2011-3 to 2013-3, actual versus forecast.]

Figure 124 Annual (June year) fitted values and forecasts generated by the regression model for light

RUC volume including imports.


8.3.5 Light RUC model evaluation and comparison

Table 39 summarises the explanatory variables used in the four models for light RUC

presented above.

Table 39 Summary of explanatory variables in the light RUC models.

Model   Type          Real GDP   Real diesel price   Real light RUC price   TPW sector GDP   Construction sector GDP   Trend   Goods imports
1       Time series
2a      Regression
2b      Regression
3       Hybrid

As with the PED models, we compare the RUC models on the basis of:

Goodness of fit (R-squared) of quarterly net RUC km;

The RMSE of the truncated sample forecasts; and


The plausibility of the quarterly and annual forecasts produced by the models.

Table 40 compares the goodness of fit and truncated sample forecasting performance of

the four models for light RUC net km. The regression and hybrid models have

significantly better goodness of fit than the time series model, but perform worse

than the time series model on the truncated sample forecasting test.

Table 40 Summary of goodness of fit and truncated-sample forecast RMSE of the light RUC models.

Model                          1       2a      2b      3
R-squared vs light RUC km      0.57    0.93    0.94    0.92
RMSE (million km)              38.9    86.6    85.4    89.3
Average quarterly error (%)    0.4%    1.4%    2.2%    1.6%
2012 error (%)                 0.6%    0.9%    1.8%    1.8%
2013 error (%)                 0.3%    3.7%    4.1%    3.6%
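The trade-off in Table 40 between in-sample fit and truncated-sample accuracy can be made explicit by ranking the models on each criterion. The sketch below uses only the R-squared and RMSE values from Table 40; the dictionary layout and variable names are illustrative.

```python
# R-squared and truncated-sample RMSE (million km) from Table 40.
LIGHT_RUC_MODELS = {
    "1": {"type": "Time series", "r2": 0.57, "rmse": 38.9},
    "2a": {"type": "Regression", "r2": 0.93, "rmse": 86.6},
    "2b": {"type": "Regression", "r2": 0.94, "rmse": 85.4},
    "3": {"type": "Hybrid", "r2": 0.92, "rmse": 89.3},
}

# Best in-sample fit is the highest R-squared; best forecast is the lowest RMSE.
best_fit = max(LIGHT_RUC_MODELS, key=lambda m: LIGHT_RUC_MODELS[m]["r2"])
best_forecast = min(LIGHT_RUC_MODELS, key=lambda m: LIGHT_RUC_MODELS[m]["rmse"])

print(f"Best goodness of fit: model {best_fit}")
print(f"Lowest truncated-sample RMSE: model {best_forecast}")
```

The two criteria select different models, which is why the comparison also weighs the plausibility of the forecasts and the ability to run scenarios.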

As with the PED models, the light RUC models have relative advantages and

disadvantages. The time series model performs well on the truncated sample test and

produces out-of-sample forecasts that grow at a rate consistent with recent trends, while

being more conservative than the current model in the short term. This model is very

simple to implement; however, as a pure time-series model it lacks any explanatory

variables and therefore cannot be used to generate forecasts under alternative scenarios.

Figure 125 compares the truncated sample forecasting performance of the light RUC

models. All models with the exception of the time series model tend to over-forecast

light RUC km during this period.


Figure 125 Comparison of truncated sample forecasts produced by the light RUC models.


Figure 126 compares the out-of-sample forecasts generated by the light RUC models.

The regression and hybrid models are very similar in terms of quarterly and annual

forecasts. These models forecast a relatively low rate of growth of light RUC until 2016,

and then an increasing growth rate, with the forecasts exceeding those produced by the

current model from 2017. The time-series model produces an essentially linear growth

forecast that is below the forecasts produced by the current model in early years, but

above the current model in later years.

The hybrid model and simple regression model produce forecasts that are very similar.

Given this and given the additional complexity of the hybrid model (ie the need to

generate sectoral GDP forecasts), in our view the simple regression model is preferable

to the hybrid model. The simple regression model produces forecasts that grow at a rate

consistent with recent trends for the first three years, in contrast to the current model

which predicts a higher rate of growth. However the simple regression model predicts a

significant increase in growth of light RUC volumes after 2016. While this depends

somewhat on the light RUC price scenario that is used, there may be some question

about the plausibility of the long-term forecasts produced by this model.

[Chart: light RUC net km (millions) by quarter, 2011-3 to 2013-3, showing actual, time series, regression (a), regression (b) and hybrid.]

Figure 126 Annual light RUC forecast comparison.


8.3.6 Light RUC confidence intervals and sensitivity testing

In consultation with a subgroup of the NLTF forecasting group, light RUC model 2b

(regression including imports) was selected for the development of confidence intervals

and sensitivity testing.

Table 41 and Figure 127 show the indicative forecast and confidence intervals and

sensitivity tests from this model. The 67% and 90% confidence intervals average

approximately +/- 2.6% and +/- 4.2% of the base forecast respectively. The model is relatively sensitive to the

real GDP and real imports growth rate assumptions, relatively insensitive to the real

diesel price, and somewhat sensitive to the real light RUC price.

Figure 128 shows the approximate decomposition of the baseline forecasts in this model.

In the short term, higher real GDP, lower real diesel prices, and the time trend increase

light RUC km, while higher real light RUC prices offset these effects. In the longer term

the time trend and higher real GDP are the dominant factors.

[Chart: Annual light RUC volumes. Light RUC volume (million km) by year ended June, 2001 to 2023; series shown include regression (a), regression (b) and hybrid.]

Table 41 Indicative forecasts and confidence intervals for light RUC model 2b.

          Light RUC volume (million km)                          Light RUC volume (annual % change)
YE June   90% lower  67% lower  Base     67% upper  90% upper    90% lower  67% lower  Base    67% upper  90% upper
2013                            8,150
2014      8,084      8,210      8,405    8,600      8,725        -0.8%      0.7%       3.1%    5.5%       7.1%
2015      8,252      8,420      8,680    8,940      9,108        2.1%       2.6%       3.3%    4.0%       4.4%
2016      8,543      8,711      8,971    9,231      9,398        3.5%       3.5%       3.3%    3.3%       3.2%
2017      8,966      9,134      9,394    9,654      9,822        5.0%       4.9%       4.7%    4.6%       4.5%
2018      9,342      9,510      9,770    10,030     10,198       4.2%       4.1%       4.0%    3.9%       3.8%
2019      9,674      9,841      10,101   10,361     10,529       3.5%       3.5%       3.4%    3.3%       3.3%
2020      10,001     10,169     10,429   10,689     10,857       3.4%       3.3%       3.2%    3.2%       3.1%
2021      10,340     10,508     10,768   11,028     11,196       3.4%       3.3%       3.2%    3.2%       3.1%
2022      10,684     10,852     11,112   11,372     11,540       3.3%       3.3%       3.2%    3.1%       3.1%
2023      11,034     11,201     11,461   11,721     11,889       3.3%       3.2%       3.1%    3.1%       3.0%
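The average confidence interval widths quoted in the text can be approximately reproduced from the volume columns of Table 41. The sketch below assumes the average is taken over the forecast years 2014 to 2023 and that each half-width is expressed as a percentage of the base forecast; the report does not state the exact method.

```python
# Rows from Table 41: (year, 90% lower, 67% lower, base, 67% upper, 90% upper), million km.
TABLE_41 = [
    (2014, 8084, 8210, 8405, 8600, 8725),
    (2015, 8252, 8420, 8680, 8940, 9108),
    (2016, 8543, 8711, 8971, 9231, 9398),
    (2017, 8966, 9134, 9394, 9654, 9822),
    (2018, 9342, 9510, 9770, 10030, 10198),
    (2019, 9674, 9841, 10101, 10361, 10529),
    (2020, 10001, 10169, 10429, 10689, 10857),
    (2021, 10340, 10508, 10768, 11028, 11196),
    (2022, 10684, 10852, 11112, 11372, 11540),
    (2023, 11034, 11201, 11461, 11721, 11889),
]


def average_half_width_pct(rows, lower_idx, upper_idx, base_idx=3):
    """Average interval half-width as a percentage of the base forecast."""
    widths = [
        (row[upper_idx] - row[lower_idx]) / 2.0 / row[base_idx] * 100.0
        for row in rows
    ]
    return sum(widths) / len(widths)


avg_67 = average_half_width_pct(TABLE_41, 2, 4)
avg_90 = average_half_width_pct(TABLE_41, 1, 5)
print(f"67% interval: +/- {avg_67:.1f}%   90% interval: +/- {avg_90:.1f}%")
```

Under these assumptions the averages come out at roughly 2.6% and 4.2%. The absolute widths in Table 41 are nearly constant from 2015 onward, so the intervals narrow in percentage terms as the base forecast grows.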

Figure 127 Confidence intervals and sensitivity testing for light RUC model 2b.


[Chart: deviation relative to baseline forecast, 2014 to 2023, for the 90% confidence interval, 67% confidence interval, 2% higher real GDP growth, 28c/L higher real diesel price, 7c/km higher real light RUC price, and 9% higher real imports growth.]

Figure 128 Approximate decomposition of forecasts produced by light RUC model 2b.


8.3.7 Recommendations for light RUC modelling

The regression and hybrid models have better goodness of fit than the time series

model, but the time series model performs significantly better on the truncated sample

forecasting test (Table 40). In terms of forecasting performance, there is little to

differentiate the two regression models (2a and 2b) and the hybrid model (3). We

therefore recommend the regression models over the hybrid model due to their

simplicity, whereas the hybrid model requires auxiliary models of sectoral GDP.

In our view, the regression models 2a and 2b produce forecasts that are plausible in the

short term (up to three years ahead) but the strong long term growth predicted by these

models raises some questions. However this is in part due to the relatively low real light

RUC prices assumed in the long term, and an alternative assumption about this price

would generate lower long term growth. Therefore these models do not necessarily

predict high light RUC growth in all cases.

Overall we recommend the use of the regression model 2b. Its performance is similar to

model 2a but the inclusion of imports in the model leads to slightly better goodness of

fit and allows for the analysis of a broader range of forecasting scenarios.

[Chart: average contribution to annual forecast change, from 2013-4 to 2014-3 through 2022-4 to 2023-3, by component: real GDP, real diesel price, real light RUC price, time trend, real imports, and dynamic correction.]

8.4 Heavy RUC models

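Except for the pure time series model, which is estimated on the first difference of

quarterly heavy RUC volume (Table 42), the dependent variable in the heavy RUC models is

the natural logarithm of quarterly heavy RUC km.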

8.4.1 Pure time series model [Heavy RUC model 1]

The approach to time series modelling of heavy RUC is the same as that for light RUC

described in section 8.3.1 above. Table 42 shows the estimated coefficients of the selected

time series model for heavy RUC volume. This model also uses the first difference of

volume as the dependent variable, and includes quarterly dummy variables, a constant,

and two autoregressive error terms. The model explains 82% of the variation in heavy

RUC volumes.

Table 42 Estimated coefficients of the selected time series model for heavy RUC volumes. Dependent
variable is the first difference of quarterly RUC volume.

Variable                      Coefficient     Std. error     p-value
Q2                            -3.28 x 10^7    1.86 x 10^7    0.0780
Q3                            1.84 x 10^7     1.83 x 10^7    0.3160
Q4                            7.65 x 10^7     2.58 x 10^7    0.0030
Constant                      -1.10 x 10^7    1.13 x 10^7    0.3280
AR(1)                         -0.5402         0.1158         0.0000
AR(2)                         -0.3074         0.1786         0.0850
R-squared (vs RUC volume)     0.82
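To illustrate how a model of this form generates forecasts, the sketch below iterates the Table 42 coefficients forward. It assumes Q1 is the omitted base quarter and that the AR terms apply to the regression errors; the starting level and residuals are hypothetical, and the function is an illustration, not the report's implementation.

```python
# Coefficients from Table 42 (km per quarter); Q1 is assumed to be the base quarter.
SEASONAL = {1: 0.0, 2: -3.28e7, 3: 1.84e7, 4: 7.65e7}
CONSTANT = -1.10e7
PHI1, PHI2 = -0.5402, -0.3074


def forecast_levels(last_level, last_quarter, resid_t, resid_t_minus_1, horizon):
    """Forecast quarterly volume by accumulating forecast first differences."""
    levels = []
    level, quarter = last_level, last_quarter
    u1, u2 = resid_t, resid_t_minus_1  # two most recent regression errors
    for _ in range(horizon):
        quarter = quarter % 4 + 1
        u_next = PHI1 * u1 + PHI2 * u2  # expected AR(2) error for the next quarter
        level += CONSTANT + SEASONAL[quarter] + u_next
        levels.append(level)
        u1, u2 = u_next, u1
    return levels


# Hypothetical starting point: 900 million km in a fourth quarter, zero residuals.
path = forecast_levels(9.0e8, 4, 0.0, 0.0, 8)
print([round(x / 1e6, 1) for x in path])
```

With the residuals set to zero, the coefficients imply a net increase of about 18 million km in the quarterly level over each four-quarter cycle, which is why the out-of-sample forecasts follow a seasonal pattern around an essentially linear trend.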


Figure 129 shows the quarterly fitted values and out-of-sample forecasts of heavy RUC

volume produced by this model. In general the model predicts the trend and quarterly

fluctuations in volume relatively well. The model generates forecasts of heavy RUC

volume that increase somewhat more slowly than the existing model.

Figure 130 shows the comparison of the forecasts generated by this model estimated

using the truncated sample, versus actual values during the truncated period. In general

the model’s predictions are relatively accurate, although accuracy appears to break

down in the latter part of the period. On average the model under-estimates heavy RUC

volume by 0.4% per quarter, and the RMSE is 31.3 million km.

Figure 131 shows the annual fitted values and forecasts generated by the model. The

model predicts an increase of annual heavy RUC km from 3.55 billion km in 2013 to 4.26

billion km in 2023, an average annual growth rate of 1.8%. This is significantly lower

than the growth rate of the current model, which predicts 4.77 billion RUC km by 2023

and an average annual growth rate of 3.0%. The truncated-sample model over-estimates

heavy RUC km by 0.6% in 2012 and under-estimates by 0.9% in 2013.


Figure 129 Fitted values and out-of-sample forecasts of quarterly heavy RUC volumes generated by the

selected time series model.

Figure 130 Comparison of actual heavy RUC volumes and truncated-sample forecasts produced by the

selected time series model.



[Charts: Quarterly heavy RUC volume, heavy RUC net km (millions), 2000-1 to 2023-2; truncated-sample period 2011-3 to 2013-3, actual versus forecast.]

Figure 131 Annual (June year) fitted values and forecasts generated by the selected time-series model

for heavy RUC volume.


8.4.2 Regression models [Heavy RUC model 2a]

The process for regression modelling of heavy RUC was the same as for light RUC

described in section 8.3.2 above. The selected heavy RUC regression model is shown in

Table 43.²⁵ The real diesel price was not significant in this model (and was estimated

with the wrong sign), indicating that heavy RUC demand is insensitive to the diesel

price. The one-quarter lead of the real heavy RUC price was significant, and a first-order

autoregressive error term was required to deal with some remaining serial correlation.

The model explains 90% of the variation in quarterly heavy RUC net km.

Table 43 Estimated coefficients of the selected regression model for heavy RUC volumes. Dependent
variable is the natural logarithm of quarterly heavy RUC volume.

Variable                          Coefficient    Std. error    p-value
ln(Real GDP SA)                   0.8380         0.0943        0.00
ln(Real heavy RUC price)          -1.0006        0.1626        0.00
ln(Real heavy RUC price) (+1)     0.7861         0.2173        0.00
Q2                                -0.0600        0.0127        0.00
Q3                                -0.0647        0.0117        0.00
Q4                                0.0239         0.0140        0.09
Constant                          12.9638        1.1658        0.00
AR(1)                             0.3606         0.1646        0.03
R-squared (vs heavy RUC km)       0.90
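Because both the dependent variable and the GDP and price variables are in natural logarithms, the coefficients in Table 43 can be read as elasticities. The sketch below shows that reading; treating the sum of the current and one-quarter-lead price coefficients as the effect of a permanent price change is our simplifying assumption, not a result stated in the report.

```python
import math

# Log-log coefficients from Table 43.
ELASTICITY_GDP = 0.8380
ELASTICITY_PRICE = -1.0006
ELASTICITY_PRICE_LEAD = 0.7861


def pct_effect(elasticity, pct_change):
    """Exact proportional change in volume implied by a log-log coefficient."""
    return math.exp(elasticity * math.log(1.0 + pct_change)) - 1.0


# Effect of 1% higher real GDP on quarterly heavy RUC km.
gdp_effect = pct_effect(ELASTICITY_GDP, 0.01)

# Assumed net elasticity for a price change that persists across quarters.
net_price_elasticity = ELASTICITY_PRICE + ELASTICITY_PRICE_LEAD
price_effect = pct_effect(net_price_elasticity, 0.10)

print(f"1% higher real GDP: {gdp_effect:.2%} change in heavy RUC km")
print(f"10% permanently higher real RUC price: {price_effect:.2%} change")
```

The opposite signs on the current and lead price terms are consistent with purchases being brought forward ahead of a rate increase: volume rises in the quarter before the price goes up and falls once the higher price applies.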


25 Unlike light RUC, the natural logarithm of heavy RUC km was used as the dependent variable.

[Chart: Annual heavy RUC km. Heavy RUC net km (millions) by year ended June, 2001 to 2023.]

The same real GDP forecasts as above (Figure 61) were used to generate indicative

forecasts from this model. The assumptions for nominal and real heavy RUC prices are

shown in Figure 132, where the nominal RUC price was provided by the Ministry of

Transport and the CPI forecast is as before.

Figure 132 Nominal and real heavy RUC prices.


Figure 133 shows the quarterly fitted values and forecasts produced by this model. The

model explains the historical pattern relatively well, although the spike in early 2007 is

not fully predicted and actual heavy RUC km declines somewhat faster than the model

predicts during the 2009 recession. This model essentially generates a linear forecast of

heavy RUC km, at a slower rate of growth compared to the existing model.

Figure 134 compares the truncated-sample forecasts produced by this model with the

actual heavy RUC net km. In general the model matches the pattern in heavy RUC km

relatively well but under-predicts during this period, by an average of 4.1% per quarter.

The RMSE for this period is 47.4 million km.

The annual forecasts from this model are shown in Figure 135. The model fits the

historical data relatively well, although it predicts higher heavy RUC km during the

2009/10 recession than actually occurred. The truncated model under-predicts heavy

RUC km in 2012 and 2013 by 1.7% and 4.1% respectively. The out-of-sample forecasts

grow from 3.6 billion km in 2013 to 4.1 billion km in 2023, an average annual growth

rate of 1.4%. In contrast the existing model predicts an annual growth rate over this

period of 3.0%.

[Figure 132 chart: CPI and heavy RUC price (cents / km) by quarter, 2000-1 to 2023-1.]

[Charts: quarterly heavy RUC net km (millions), 2000-1 to 2023-2; truncated-sample period 2011-3 to 2013-3, actual versus forecast.]

Figure 135 Annual (June year) fitted values and forecasts generated by the selected simple regression

model for heavy RUC volume.


8.4.3 Hybrid models [Heavy RUC model 3a]

Hybrid models for heavy RUC were tested using the same approach and sectoral GDP

models as described for light RUC in section 8.3.3 above. To some extent this model will

reflect future changes in freight intensity within the economy, if the sectors included in

the model are expected to grow relatively quickly.

The estimated coefficients of the selected hybrid heavy RUC model are shown in Table

44. All variables are significant at the 5% level or better, except two of the quarterly

dummies which were retained in the model following an F-test of joint significance of

the quarterly dummies (p-value 0.03). The model explains 91% of the quarterly variation

in heavy RUC net km.

Table 44 Estimated coefficients of the selected hybrid model for heavy RUC volumes.

Variable                  Coefficient    Std. error    p-value
ln(Real forestry GDP)     0.1568         0.0684        0.0270
ln(Real TPW GDP)          0.8131         0.0594        0.0000
Q2                        -0.0102        0.0107        0.3460
Q3                        0.0201         0.0235        0.3970
Q4                        0.0388         0.0184        0.0400
Constant                  15.3426        0.5723        0.0000



[Chart: Annual heavy RUC km. Heavy RUC net km (millions) by year ended June, 2001 to 2023.]

Figure 136 shows the quarterly fitted values and forecasts of heavy RUC net km

produced by this model. The model generally fits well and produces forecasts that are

significantly lower and grow more slowly than the forecasts from the existing model.




Figure 137 shows the truncated sample forecasts produced by this model in comparison

with actual heavy RUC volumes. In general this model under-predicts the actual heavy

RUC km, by an average of 3.8% per quarter, with an RMSE of 50.3 million km.

The annual forecasts for this model (Figure 138) grow from 3.6 billion km in 2013 to 4.1

billion km in 2023, an average annual growth rate of 1.4%. In contrast, the existing

model forecasts an average annual growth rate of 3.0% over the same period. The

selected hybrid model predicts a small increase in heavy RUC km up to 2015, but

then a slight decline in 2016. This is caused by assumed heavy RUC rate increases, as

well as a forecast slow-down in the rate of GDP growth in the construction sector. The

truncated model under-predicts heavy RUC km by 1.3% in 2012 and 5.0% in 2013.

[Chart: heavy RUC net km (millions) by quarter, 2000-1 to 2023-2.]




Figure 138 Annual (June year) fitted values and forecasts generated by the selected hybrid model for

heavy RUC volume.


[Charts: heavy RUC net km (millions) for the truncated-sample period 2011-3 to 2013-3, actual versus forecast; annual heavy RUC km (millions), years ended June 2001 to 2023.]

8.4.4 Additional Heavy RUC models


We were requested to test some additional models for heavy RUC volume including real

imports and exports of goods as additional explanatory variables, the share of ‘goods

production’ sectors in GDP and to consider variables that could measure the ‘freight

efficiency’ of the heavy vehicle fleet.

We tested a definition of the ‘goods production’ sectors including the following:

Agriculture

Forestry & logging

Fishing, aquaculture & agriculture support services

Mining

Printing & manufacturing

Electricity, gas, water and waste services

Construction

Wholesale trade

We also tested an alternative definition including the retail sector with the above. In

general the share of these sectors in GDP has been declining from around 40% (45%

including retail) in 2000 to around 33% (40% including retail) in 2013. However we

found that either definition of this variable was not statistically significant in the heavy

RUC regression models.

We tested the inclusion of real exports and imports of goods individually and in combination in the

heavy RUC regression model. Both variables were statistically significant, but this

caused real GDP to become statistically insignificant.

To measure freight efficiency, we tested two alternative approaches, both calculated

from the heavy vehicle weigh-in-motion (WIM) data provided by NZTA. First we tested

the average heavy vehicle mass as an explanatory variable in the model. If heavy

vehicles are carrying heavier loads on average, this could lead to a reduction in heavy

RUC km, everything else equal. However this variable was not statistically significant in

the heavy RUC regression models. Our second approach was to include the proportions

of heavy vehicles with 2 – 4 axles and 7+ axles as explanatory variables. These were

statistically significant in the heavy RUC regression models.

Accordingly, we developed two additional heavy RUC models for comparison with the

earlier models:

Heavy RUC model 2b: A regression model including real exports and imports of

goods but excluding real GDP

Heavy RUC model 3b: A hybrid model incorporating the proportions of heavy

vehicles with 2 – 4 axles and 7+ axles, and simple trend models of these variables

for generating forecasts.


Regression model including exports & imports [Heavy RUC model 2b]

The estimated coefficients of this model are shown in Table 45. Both exports and

imports have a positive effect on heavy RUC km and are significant at the 5% level. The

elasticity with respect to imports (0.30) is greater than the elasticity with respect

to exports (0.18), suggesting that imports have a larger impact on heavy RUC km. The model

explains 93% of the variation in quarterly heavy RUC km.

Table 45 Estimated coefficients of the selected regression model for heavy RUC volumes including exports and imports. Dependent variable is the natural logarithm of quarterly heavy RUC volume.

Variable                     Coefficient   Std. error   P-value
ln(Real exports of goods)    0.1846        0.0884       0.04
ln(Real imports of goods)    0.3039        0.0563       0.00
Q2                           -0.0588       0.0099       0.00
Q3                           -0.0628       0.0102       0.00
Q4                           0.0252        0.0101       0.02
Constant                     17.2907       0.3821       0.00
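Because the dependent variable and the trade variables are in logarithms, the coefficients on exports and imports can be read as elasticities. The following is a minimal sketch of that interpretation using only the coefficients reported in Table 45. The 10% changes are hypothetical inputs chosen for illustration, the function name is ours, and we assume quarter 1 is the omitted base quarter since dummies are reported for Q2 to Q4.

```python
import math

# Coefficients as reported in Table 45 (log-log specification).
COEF_LN_EXPORTS = 0.1846
COEF_LN_IMPORTS = 0.3039
COEF_Q2 = -0.0588  # seasonal dummy, assumed to be relative to quarter 1


def pct_effect_of_driver_change(elasticity: float, pct_change: float) -> float:
    """Percentage change in heavy RUC km implied by a percentage change in
    one driver, holding everything else constant, in a log-log model."""
    return (math.exp(elasticity * math.log(1 + pct_change / 100)) - 1) * 100


# Hypothetical 10% increases, chosen only for illustration.
exports_effect = pct_effect_of_driver_change(COEF_LN_EXPORTS, 10.0)
imports_effect = pct_effect_of_driver_change(COEF_LN_IMPORTS, 10.0)

# A dummy coefficient b in a log model converts to a percentage via exp(b) - 1.
q2_effect = (math.exp(COEF_Q2) - 1) * 100

print(f"10% higher exports: {exports_effect:.1f}% change in heavy RUC km")
print(f"10% higher imports: {imports_effect:.1f}% change in heavy RUC km")
print(f"Quarter 2 relative to quarter 1: {q2_effect:.1f}%")
```

The sketch ignores any terms of the model that are not listed in the table, so it illustrates the reading of the coefficients rather than reproducing the forecasts.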



Forecasts were generated using trade forecasts produced by the Treasury. The quarterly

fitted values and forecasts produced by this model are shown in Figure 139. The

forecasts are significantly lower than those produced by the current model. The model

also performs relatively well on the truncated sample test (Figure 140).

The annual forecasts produced by this model are shown in Figure 141. The model

predicts heavy RUC km to grow at an average annual rate of 0.7%, reaching 3.8 billion

km in 2023.
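The average annual growth rate quoted above can be checked as a compound growth rate. The sketch below assumes the 2013 value (3,552 million km) and the 2023 baseline forecast (3,813 million km) reported for this model in Table 50; the function name is illustrative only.

```python
def average_annual_growth(start: float, end: float, years: int) -> float:
    """Compound average annual growth rate, in percent."""
    return ((end / start) ** (1 / years) - 1) * 100


# Values taken from Table 50 (million km, years ended June).
growth_2b = average_annual_growth(start=3552, end=3813, years=2023 - 2013)

print(f"Average annual growth, model 2b: {growth_2b:.1f}% per year")
```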


Figure 139 Quarterly fitted values and forecasts produced by the heavy RUC regression model

including exports and imports.


Figure 140 Truncated sample forecasting performance of the heavy RUC regression model including

exports and imports.


[Figure 139 chart. Vertical axis: heavy RUC net km (millions), 0 to 1,400. Horizontal axis: quarter, 2000-1 to 2023-1.]

[Figure 140 chart. Vertical axis: heavy RUC net km (millions), 740 to 940. Horizontal axis: quarter, 2011-3 to 2013-3. Series: Actual, Forecast.]


Figure 141 Annual fitted values and forecasts produced by the heavy RUC regression model including

exports and imports.


Hybrid model including heavy RUC axle proportions [Heavy RUC model 3b]

The proportions of heavy vehicles with 2 – 4 and 7+ axles in the NZTA WIM data are

shown in Figure 142. In general the proportion of heavy vehicles with 2 – 4 axles has

fluctuated around a relatively constant level, while the proportion of heavy vehicles

with 7+ axles has gradually increased over time.

For the purposes of analysis we have developed indicative forecasts of these

proportions, as shown in the figure. We have assumed that the proportion of heavy

vehicles with 2 – 4 axles remains constant over time, while the proportion of vehicles

with 7+ axles follows a logarithmic increasing trend over time. This implies that the

proportion of vehicles with 5 or 6 axles is declining over time.

[Figure 141 chart: Annual heavy RUC km. Vertical axis: heavy RUC net km (millions), 0 to 5,000. Horizontal axis: year ended June, 2001 to 2023. Series: Actual, Fitted.]


Figure 142 Proportion of heavy vehicles with 2 – 4 and 7+ axles.

Source: Covec analysis of NZTA data.

The estimated coefficients of this model are shown in Table 46. The proportions of

heavy vehicles with 2 – 4 and 7+ axles are statistically significant at the 10% and 5%

levels respectively. Both are estimated to negatively affect heavy RUC km, and the

model explains 86% of the variation in quarterly heavy RUC km.

Table 46 Estimated coefficients of the heavy RUC model including truck axle proportions.

Variable                     Coefficient   Std. error   P-value
ln(Real GDP)                 1.1515        0.2169       0.00
ln(2-4 axles proportion)     -0.6597       0.3493       0.07
ln(7+ axles proportion)      -0.5226       0.2496       0.04
Q2                           -0.0625       0.0130       0.00
Q3                           -0.0673       0.0135       0.00
Q4                           0.0215        0.0133       0.11
Constant                     7.5268        2.9926       0.02




The quarterly fitted values and forecasts produced by this model are shown in Figure 143. The forecasts grow in an essentially linear fashion, and are very similar to the

forecasts produced by the current model in earlier years, but are lower than the current

model in later years.

[Figure 142 chart: Proportion of heavy vehicles by number of axles. Vertical axis: 0% to 60%. Horizontal axis: 2001 to 2024. Series: 2-4 axles, 7+ axles.]


Figure 143 Quarterly fitted values and forecasts produced by the heavy RUC model including truck

axle proportions.


The truncated sample forecasting performance of this model is shown in Figure 144. The

model forecasts heavy RUC km relatively well over this period, under-forecasting by an

average of 2% per quarter. The RMSE for this period is 39.3 million km, which is lower

than that of all other models except the time series model.
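The two accuracy measures quoted for the truncated sample test can be computed as sketched below. The actual and forecast values are hypothetical placeholders, not the data behind Figure 144, and we assume the percentage error is defined as forecast minus actual, relative to actual, so that under-forecasting gives a negative value.

```python
import math


def rmse(actual, forecast):
    """Root mean squared error, in the units of the data."""
    squared_errors = [(f - a) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def mean_pct_error(actual, forecast):
    """Average of (forecast - actual) / actual, in percent."""
    pct_errors = [(f - a) / a for a, f in zip(actual, forecast)]
    return sum(pct_errors) / len(pct_errors) * 100


# Hypothetical quarterly heavy RUC km (millions), for illustration only.
actual = [800.0, 900.0, 850.0, 880.0]
forecast = [790.0, 870.0, 840.0, 860.0]

test_rmse = rmse(actual, forecast)
test_bias = mean_pct_error(actual, forecast)

print(f"RMSE: {test_rmse:.1f} million km")
print(f"Average quarterly error: {test_bias:.1f}%")
```

The RMSE penalises large misses more heavily than small ones, while the average percentage error shows the direction of any systematic bias, so the two measures are usefully read together.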

The annual forecasts produced by this model are shown in Figure 145. These grow at an

average annual rate of 2.3%, reaching 4.4 billion km by 2023. This growth rate is slower

than the existing model but relatively rapid in comparison to the other heavy RUC

models estimated above.

[Figure 143 chart. Vertical axis: heavy RUC net km (millions), 0 to 1,400. Horizontal axis: quarter, 2000-1 to 2023-1.]




Figure 144 Truncated sample forecasting performance of the heavy RUC model including truck axle

proportions.


Figure 145 Annual fitted values and forecasts produced by the heavy RUC model with truck axle

proportions.


[Figure 144 chart. Vertical axis: heavy RUC net km (millions), 720 to 940. Horizontal axis: quarter, 2011-3 to 2013-3. Series: Actual, Forecast.]

[Figure 145 chart: Annual heavy RUC km. Vertical axis: heavy RUC net km (millions), 0 to 5,000. Horizontal axis: year ended June, 2001 to 2023. Series: Actual, Fitted.]


8.4.5 Heavy RUC model evaluation and comparison

Table 47 summarises the variables used in the five heavy RUC models presented above.

Table 47 Summary of variables in the heavy RUC models.

Model  Type         Real GDP  Real heavy RUC price  Forest GDP  TPW GDP  Real exports of goods  Real imports of goods  Trend  2-4 axles propn  7+ axles propn
1      Time series
2a     Regression
2b     Regression
3a     Hybrid
3b     Hybrid

Table 48 summarises the goodness of fit and truncated sample forecasting performance

of the heavy RUC models. The regression models and the hybrid model with sectoral

GDP have the highest goodness of fit, with the regression model including exports and

imports performing best overall. On the truncated sample test, the time series model (1)

performs best, followed by the hybrid model including truck axle proportions (3b). The

regression model including exports and imports (2b) also performs better on this test

than the regression model including GDP (2a).

Table 48 Summary of goodness of fit and truncated-sample forecast RMSE of the heavy RUC models.

Model                          1       2a      2b      3a      3b
R2 vs heavy RUC km             0.82    0.90    0.93    0.91    0.86
RMSE (million km)              31.3    47.4    41.2    50.3    39.3
Average quarterly error (%)    -0.4    -3.5    -2.7    -3.8    -2.0
2012 error (%)                 0.6     -1.7    -1.1    -1.3    -1.2
2013 error (%)                 -0.9    -4.1    -3.2    -5.0    -1.9

Figure 146 compares the annual forecasts produced by the heavy RUC models. All

models forecast slower growth than the existing model. The regression models and the

hybrid model including sectoral GDP (3a) predict relatively weak growth in heavy RUC

km in the short term, with stronger growth after 2016. Regression model 2b (including

exports and imports) produces the lowest long term growth forecast, although this

depends on the particular assumption about long term exports and imports growth.

Figure 147 compares the truncated sample forecasting performance of the heavy RUC

models. The pattern is similar across the models, although the time series model does a

somewhat better job of following the path of heavy RUC km during this period.


Figure 146 Annual heavy RUC forecast comparison.


Figure 147 Comparison of truncated sample forecasting performance of the heavy RUC models.


[Figure 146 chart: Annual heavy RUC volumes. Vertical axis: heavy RUC volume (million km), 2,500 to 5,000. Horizontal axis: year ended June, 2001 to 2023. Series: Regression (a), Regression (b), Hybrid (a), Hybrid (b).]

[Figure 147 chart. Vertical axis: heavy RUC net km (millions), 720 to 940. Horizontal axis: quarter, 2011-3 to 2013-3. Series: Actual, Time series, Regression (a), Regression (b), Hybrid (a), Hybrid (b).]


8.4.6 Heavy RUC confidence intervals and sensitivity testing

On the basis of the above analysis, in consultation with a subgroup of the NLTF revenue

forecasting group, heavy RUC models 2a, 2b and 3b were selected for further analysis

and sensitivity testing.

Heavy RUC model 2a

The indicative forecasts, confidence intervals, and sensitivity test results for this model

are shown in Table 49 and Figure 148.

The 67% and 90% confidence interval widths are approximately +/- 3.5% and 5.8% of the

baseline forecast respectively. This model contains only two explanatory variables; the

forecasts are relatively sensitive to the real GDP growth assumption over time, and

relatively insensitive to the heavy RUC price assumption.
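The approximate interval widths quoted above can be checked against the 2023 row of Table 49 by expressing each bound as a percentage of the baseline forecast. This is a minimal sketch; the dictionary keys and function name are illustrative labels of ours.

```python
# 2023 row of Table 49 (heavy RUC km, millions).
row_2023 = {
    "lower90": 3852,
    "lower67": 3942,
    "base": 4084,
    "upper67": 4232,
    "upper90": 4330,
}


def pct_of_base(value: float, base: float) -> float:
    """Distance of an interval bound from the baseline, in percent."""
    return (value - base) / base * 100


widths = {
    name: pct_of_base(value, row_2023["base"])
    for name, value in row_2023.items()
    if name != "base"
}

for name, width in widths.items():
    print(f"{name}: {width:+.1f}% of baseline")
```

The upper bounds sit slightly further from the baseline than the lower bounds. This would be consistent with intervals that are symmetric in logarithms, although that is our inference rather than something stated in the table.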

In the model, the heavy RUC price largely acts to determine the specific quarterly

timing of heavy RUC purchases. This does not greatly affect the forecasts but improves

the fit of the model and increases the accuracy of the GDP effect.

Figure 149 shows the approximate decomposition of the baseline forecast from this

model. The forecasts are predominantly driven by real GDP, with higher real heavy

RUC prices having some offsetting effect in the short term.

Table 49 Indicative forecasts and confidence intervals produced by heavy RUC model 2a.

         Heavy RUC km (millions)                                Annual change
YE June  90% lower  67% lower  Base       67% upper  90% upper  90% lower  67% lower  Base       67% upper  90% upper
2013                           3,552
2014     3,477      3,531      3,617      3,706      3,765      -2.1%      -0.6%      1.8%       4.3%       6.0%
2015     3,392      3,471      3,596      3,725      3,812      -2.4%      -1.7%      -0.6%      0.5%       1.2%
2016     3,419      3,499      3,625      3,756      3,844      0.8%       0.8%       0.8%       0.8%       0.8%
2017     3,473      3,554      3,682      3,816      3,904      1.6%       1.6%       1.6%       1.6%       1.6%
2018     3,538      3,620      3,751      3,886      3,977      1.9%       1.9%       1.9%       1.9%       1.9%
2019     3,598      3,682      3,815      3,953      4,045      1.7%       1.7%       1.7%       1.7%       1.7%
2020     3,660      3,745      3,880      4,021      4,114      1.7%       1.7%       1.7%       1.7%       1.7%
2021     3,723      3,809      3,947      4,090      4,185      1.7%       1.7%       1.7%       1.7%       1.7%
2022     3,787      3,875      4,015      4,160      4,257      1.7%       1.7%       1.7%       1.7%       1.7%
2023     3,852      3,942      4,084      4,232      4,330      1.7%       1.7%       1.7%       1.7%       1.7%


Figure 148 Confidence intervals and sensitivity tests for heavy RUC model 2a.


Figure 149 Approximate decomposition of forecast changes in heavy RUC model 2a.


[Figure 148 chart. Vertical axis: relative to baseline forecast, -30% to 30%. Horizontal axis: 2014 to 2023. Series: 90% confidence interval, 67% confidence interval, 2% higher real GDP growth rate, 14c/km higher real heavy RUC price.]

[Figure 149 chart. Vertical axis: average contribution to annual forecast change, -2.0% to 1.0%. Horizontal axis: annual periods from 2013-4 to 2014-3 through 2022-4 to 2023-3. Components: Real GDP, Real heavy RUC price, Dynamic & interaction.]


Heavy RUC model 2b

Sensitivities and confidence intervals for this model are shown in Table 50 and Figure

150. The 67% and 90% confidence intervals are approximately +/- 2.6% and 4.3% of the

indicative forecast respectively. The forecasts produced by the model are relatively

sensitive to the imports growth assumption over time and somewhat less sensitive to

exports growth. The forecasts are relatively insensitive to the real heavy RUC price.



Table 50 Indicative forecasts and confidence intervals produced by heavy RUC model 2b.

         Heavy RUC km (millions)                                Annual change
YE June  90% lower  67% lower  Base       67% upper  90% upper  90% lower  67% lower  Base       67% upper  90% upper
2013                           3,552
2014     3,491      3,536      3,606      3,678      3,726      -1.7%      -0.5%      1.5%       3.5%       4.9%
2015     3,441      3,500      3,593      3,688      3,751      -1.4%      -1.0%      -0.4%      0.3%       0.7%
2016     3,471      3,530      3,623      3,720      3,783      0.9%       0.9%       0.9%       0.9%       0.9%
2017     3,520      3,580      3,675      3,772      3,837      1.4%       1.4%       1.4%       1.4%       1.4%
2018     3,546      3,606      3,702      3,800      3,865      0.7%       0.7%       0.7%       0.7%       0.7%
2019     3,566      3,626      3,723      3,821      3,886      0.6%       0.6%       0.6%       0.6%       0.6%
2020     3,586      3,647      3,744      3,844      3,909      0.6%       0.6%       0.6%       0.6%       0.6%
2021     3,608      3,670      3,767      3,867      3,933      0.6%       0.6%       0.6%       0.6%       0.6%
2022     3,630      3,692      3,790      3,890      3,957      0.6%       0.6%       0.6%       0.6%       0.6%
2023     3,652      3,714      3,813      3,914      3,981      0.6%       0.6%       0.6%       0.6%       0.6%

Figure 150 Confidence intervals and sensitivity tests for heavy RUC model 2b.


[Figure 150 chart. Vertical axis: relative to baseline forecast, -30% to 30%. Horizontal axis: 2014 to 2023. Series: 90% confidence interval, 67% confidence interval, 14c/km higher real heavy RUC price, 3.8% higher real exports growth rate, 9% higher real imports growth rate.]


Figure 151 shows that both exports and imports drive higher heavy RUC volumes in the

short term, offset by higher real heavy RUC prices and the dynamic correction. In the

long term, export growth is the primary driver, though this depends on the assumed

growth rates of exports and imports.



Heavy RUC model 3b

Confidence intervals and sensitivity tests for this model are shown in Table 51 and

Figure 152. The 67% and 90% confidence intervals are approximately +/- 3.2% and 5.3%

of the indicative forecast respectively. The forecasts are relatively sensitive to the real

GDP growth rate assumption over time and somewhat sensitive to the assumptions

about the proportion of trucks with 2-4 and 7+ axles. The model is relatively insensitive

to the real heavy RUC price; as with the other heavy RUC models this variable is largely

performing a timing function and enables the coefficients on the other variables to be

estimated more accurately.

[Figure 151 chart. Vertical axis: average contribution to annual forecast change, -1.4% to 0.6%. Horizontal axis: annual periods from 2013-4 to 2014-3 through 2022-4 to 2023-3. Components: Real exports of goods, Real imports of goods, Real heavy RUC price, Dynamic & interaction.]




Table 51 Indicative forecasts and confidence intervals produced by heavy RUC model 3b.

         Heavy RUC km (millions)                                Annual change
YE June  90% lower  67% lower  Base       67% upper  90% upper  90% lower  67% lower  Base       67% upper  90% upper
2013                           3,552
2014     3,524      3,580      3,668      3,760      3,821      -0.8%      0.8%       3.3%       5.8%       7.5%
2015     3,566      3,642      3,761      3,885      3,967      1.2%       1.7%       2.5%       3.3%       3.8%
2016     3,632      3,709      3,831      3,957      4,040      1.9%       1.9%       1.9%       1.9%       1.9%
2017     3,704      3,782      3,907      4,035      4,120      2.0%       2.0%       2.0%       2.0%       2.0%
2018     3,791      3,871      3,999      4,130      4,217      2.4%       2.4%       2.4%       2.4%       2.4%
2019     3,873      3,955      4,085      4,219      4,308      2.2%       2.2%       2.2%       2.2%       2.2%
2020     3,954      4,038      4,170      4,308      4,398      2.1%       2.1%       2.1%       2.1%       2.1%
2021     4,038      4,123      4,259      4,399      4,492      2.1%       2.1%       2.1%       2.1%       2.1%
2022     4,124      4,211      4,349      4,492      4,587      2.1%       2.1%       2.1%       2.1%       2.1%
2023     4,212      4,301      4,442      4,588      4,685      2.1%       2.1%       2.1%       2.1%       2.1%

Figure 152 Confidence intervals and sensitivity tests for heavy RUC model 3b.


Figure 153 shows that real GDP growth dominates the baseline forecast in this model,

although this depends on the scenarios chosen for the truck axle proportions and real

heavy RUC prices.

[Figure 152 chart. Vertical axis: relative to baseline forecast, -30% to 30%. Horizontal axis: 2014 to 2023. Series: 90% confidence interval, 67% confidence interval, 2% higher real GDP growth, 14c/km higher real heavy RUC price, 1.1% higher proportion of 2-4 axles, 3.8% higher proportion of 7+ axles.]




8.4.7 Recommendations for heavy RUC modelling

Of the three heavy RUC models selected for further analysis (2a, 2b, and 3b), the

regression models 2a and 2b have higher goodness of fit, but the hybrid model 3b

performs better on the truncated sample forecasting test (Table 48).

The models produce somewhat different out-of-sample forecasts (Figure 146), with

models 2a and 2b producing similar forecasts of relatively low growth for the first five

years but diverging thereafter due to different assumptions about real GDP, exports and

imports growth. Model 3b produces higher heavy RUC forecasts in all years. Given

recent trends in heavy RUC km, in our view it is difficult to say which of these forecasts

is more plausible.

Of these three models, regression model 2b has the narrowest forecast confidence

intervals and is more closely related to economic activity that requires the transport of

goods (ie exports and imports). In comparison, model 2a relies on a stable relationship

between heavy RUC km and total real GDP, which may not hold in the long term as the

structure of the economy changes.

The hybrid model 3b allows for changes in composition of the heavy transport fleet to

affect the forecasts, but this model is more difficult to use for forecasting because

forecasts of the proportions of heavy vehicles with 2 – 4 and 7+ axles are required. It is not

clear how these forecasts can be generated in a robust and transparent manner other

than through the use of simple time trends.

[Figure 153 chart. Vertical axis: average contribution to annual forecast change, -1.25% to 1.25%. Horizontal axis: annual periods from 2013-4 to 2014-3 through 2022-4 to 2023-3. Components: Real GDP, Real heavy RUC price, 2-4 axles propn, 7+ axles propn, Dynamic & interaction.]


On balance we recommend the use of the regression model 2b for forecasting

heavy RUC km. This model performs relatively well and is straightforward to

implement, while having the advantage of not relying on a stable link between total real

GDP and heavy RUC km.


9 Discussion

In this section we comment on issues and questions that the NLTF forecasting group

and subgroup have raised, and suggest some improvements to the implementation and

outputs of the existing model.

9.1 Commentary on various forecasting issues

During our review, the NLTF forecasting group and subgroup raised a number of

general questions about the forecasting approach and models, which we address in the

subsections below.

9.1.1 Plausibility and risks of the forecasts

Forecasts produced by econometric models necessarily assume that the relationships

embodied in the models continue to hold in the future. In our view this is not

problematic as long as the models have been thoroughly tested and continue to be

reviewed in future.

It is not clear that an alternative (non-econometric) approach based on ad hoc models or

simple extrapolation would produce more accurate forecasts, and such an approach

may be open to criticism due to its arbitrary nature. While an econometric approach

offers considerable freedom in modelling, it does help to pin down the specific form of

estimated relationships using data (ie the coefficient estimates), and so should provide a

reasonably objective basis for forecasting.

Our analysis showed that recent PED and RUC volume trends are well explained by a

small number of measures of economic activity (eg real GDP and unemployment) and

price variables (eg real fuel and RUC prices). This includes explaining recent declines or

slowdowns in growth that are observed in some of the PED, RUC and VKT data.

To the extent possible given available data we have tested for other explanations of

these trends, such as changes in urbanisation and heavy vehicle efficiency. While there

is some evidence that such variables could also explain recent trends in PED, RUC and

VKT, in our opinion this is not yet conclusive, and there are difficulties in using such

variables for forecasting (eg lack of reliable forecasts of some variables).

Overall, in our view the models recommended in this report produce plausible forecasts

of PED and RUC volumes. However there is always some risk that the relationships

embodied in these models have fundamentally changed. By their nature such changes

are almost impossible to anticipate in advance. Instead, fundamental changes will reveal

themselves gradually over time as the estimated models are no longer able to explain

the observed data.

The main way to guard against this risk is to ensure that the econometric models are

reviewed on a regular basis in future. We discuss this in section 9.1.5 below.


9.1.2 Speed of modelled changes

The requirement that the forecasts can be updated each quarter led us to estimate

quarterly models for PED and RUC volumes. While the forecasts are aggregated to June

years for presentation purposes, the quarterly nature of the models does imply that

changes in the forecast drivers are reflected to some extent in the forecasts in the same

quarter, ie relatively immediately.

We tested some alternative assumptions, eg that changes in forecast drivers took two,

three, or four quarters to flow through to the dependent variables. This is achieved by

including lags of the explanatory variables in the regression models. In most cases we

found such lags to be statistically insignificant and/or not to improve the fit or

forecasting performance of the models.
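Including lags in a regression amounts to adding earlier values of each driver as extra explanatory columns. The following is a minimal sketch of that construction, using a hypothetical quarterly price series and an illustrative helper name; it is not the code used for our estimation.

```python
def add_lags(series, max_lag):
    """Build regression rows [x_t, x_(t-1), ..., x_(t-max_lag)].

    The first max_lag observations are dropped because they do not
    have a complete set of lagged values.
    """
    return [
        [series[t - k] for k in range(max_lag + 1)]
        for t in range(max_lag, len(series))
    ]


# Hypothetical quarterly real petrol prices, for illustration only.
petrol_price = [2.00, 2.05, 2.10, 2.08, 2.15, 2.20]

rows = add_lags(petrol_price, max_lag=2)

for row in rows:
    print(row)
```

Each additional lag costs one observation at the start of the sample, which is one practical reason to retain lags only where they are statistically significant.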

A possible reason for this is that the explanatory variables in our models tend to be

highly correlated over time. For example, a high petrol price in one quarter is generally

associated with a high petrol price in the next quarter. This means that while in reality

changes in petrol prices may take time to affect transport activity, including lags of

petrol price in the model does not add much additional information to the regression.

Furthermore, many of our models incorporate autoregressive error terms or lagged

dependent variables. These types of models imply that the dependent variable takes

time to adjust to exogenous shocks. Where included, these dynamic variables

summarise the time profile of the response of the dependent variable.
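To illustrate how a lagged dependent variable spreads the response to a shock over time, the sketch below simulates the simple model y_t = rho * y_(t-1) + beta * x_t after a permanent one-unit rise in x. The values of beta and rho are hypothetical and are not estimates from our models; the intercept is omitted for clarity.

```python
def response_path(beta, rho, periods):
    """Response of y to a permanent one-unit rise in x, period by period,
    for the model y_t = rho * y_(t-1) + beta * x_t."""
    path = []
    y = 0.0
    for _ in range(periods):
        y = rho * y + beta  # x is one unit higher in every period
        path.append(y)
    return path


# Hypothetical coefficients, for illustration only.
beta, rho = 0.5, 0.4

path = response_path(beta, rho, periods=8)
long_run = beta / (1 - rho)  # limit of the path as periods grows

print("Quarterly response:", [round(v, 3) for v in path])
print("Long-run response:", round(long_run, 3))
```

The first-quarter response equals beta, and the response then converges towards beta / (1 - rho); a larger rho implies slower adjustment.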

Therefore, in our view it is reasonable to use quarterly models as the basis for

forecasting, particularly given that only annual forecasts are required and reported, and

the exact quarterly timing of PED and RUC volume changes does not need to be

predicted.

9.1.3 Scope for the use of multiple models

Our econometric analysis in sections 7 and 8 considered three broad categories of

models for PED and RUC: time series, simple regression, and ‘hybrid’ models.

Within the regression and hybrid categories for PED and RUC volumes, a number of

different models were tested including different variables and different model

structures. In many cases we found that the time series models produced the most

accurate short-term forecasts (as measured by the truncated sample forecasting test), but

these models did not always produce plausible long-term forecasts and by their nature

it is not possible to explain what is driving the forecasts. For this reason we recommend

the use of a single regression or hybrid model for each forecast, although we also

recommend that these models are regularly re-tested (see section 9.1.5 below).

A possible alternative approach involves running multiple models in parallel and either

using these multiple models to produce a range of forecasts, or combining their

forecasts into a single ‘meta-forecast’. Such an approach avoids the need to pick a single

‘best’ model and allows for the fact that different models may produce more accurate

forecasts at different points in time or over different forecast horizons.


However in our view the use of multiple models raises some concerns:

Given that a single forecast of PED and RUC volumes is ultimately required,

there needs to be a robust basis for either choosing a single forecast from the

range produced by multiple models, or for combining the forecasts of these

models into a single forecast. There is a risk that this part of the forecasting

process will become arbitrary, which may reduce the accuracy and transparency

of the forecasts.

The use of multiple models will make it more difficult to explain how the

forecasts have been derived. As well as explaining the forecasts produced by

each of the multiple models, it will be necessary to explain the reasons for

choosing a forecast within the range produced by these models, or to explain the

process used for combining forecasts.

It is not clear that an approach based on multiple models will perform better

than the use of a single model that is regularly re-estimated and tested. Regular

re-estimation of a single model will allow its coefficients to change over time to

reflect changes in underlying relationships. Re-testing the model will also verify

that the model remains valid, or will indicate that a new model needs to be

estimated. Re-testing and re-estimation effectively uses multiple models over

time, but since only one model is in use at any given point in time, the issues

associated with combining forecasts from multiple models are avoided.

For these reasons we prefer the use of a single econometric forecasting model for each

volume forecast. With a single model it is still possible to produce a range of forecasts to

reflect uncertainty, through the development of confidence intervals and producing

forecasts under different scenarios, as illustrated in sections 7.8, 8.3.6, and 8.4.6 above. It

is easier to explain the variations in forecasts produced by a single model (via scenarios)

than to explain differences in forecasts produced by different models that may contain

different sets of variables and different dynamics. Thus using a single model will keep

the forecasting process relatively simple while allowing different views to be explored

through testing different forecast scenarios.

Having said this, if the NLTF forecasting group wishes to proceed with a multiple-

models approach, some guidance is available from the econometrics literature with

regards to non-arbitrary ways of combining forecasts from different models. There is a

moderately large literature on combining forecasts, dating mainly from the 1970s and

80s, though the practice can be dated back to Laplace in 1818.[26] Much of this work

focused on comparing methods of forecast combination, the general aim of which is to

reduce (expected) forecast errors. There is an inherent difficulty with this work

however, namely that it is not possible to tell in advance which models are the most

accurate. If one knew this, then those models would naturally get more weight in a

combining process.

[26] A useful review paper is Robert T. Clemen (1989)


The literature has found that simple methods, such as the arithmetic mean, can work

very well, better in fact than more complex processes. However if one is willing to place

extra faith in the testing used to select forecasting models, then there is a more obvious

option. Baumeister and Kilian (2013) suggest that the mean squared error (MSE) of

truncated sample forecasts can be used as a basis for combining models.[27] Calculating

the MSE at different time periods and for different forecast horizons provides a basis for

estimating the relative accuracy of the models. The MSEs could then be used to derive

weights for combining forecasts from multiple models into a single forecast.
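One simple way to turn MSEs into weights is to make each model's weight proportional to the inverse of its MSE. The sketch below assumes this inverse-MSE scheme, which is a common choice in the forecast combination literature but is our illustration rather than a method prescribed above. The RMSEs are those reported in Table 48 for models 1 and 2b; the two quarterly forecasts are hypothetical.

```python
# Truncated-sample RMSEs from Table 48 (million km).
rmse_by_model = {"time_series": 31.3, "regression_2b": 41.2}

# Inverse-MSE weights: lower MSE gives a higher weight.
inverse_mse = {name: 1 / (value ** 2) for name, value in rmse_by_model.items()}
total = sum(inverse_mse.values())
weights = {name: value / total for name, value in inverse_mse.items()}

# Hypothetical forecasts for one quarter (million km), for illustration only.
forecasts = {"time_series": 900.0, "regression_2b": 880.0}

combined = sum(weights[name] * forecasts[name] for name in forecasts)

for name, weight in weights.items():
    print(f"{name}: weight {weight:.3f}")
print(f"Combined forecast: {combined:.1f} million km")
```

By construction the weights sum to one, so the combined forecast always lies between the individual forecasts.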

For example, our analysis indicated that in many cases the time series models produced

more accurate forecasts in the short term (up to two years ahead) than the regression or

hybrid models, as measured by the truncated sample test. If it was found that the

regression or hybrid models performed better on this test over longer time horizons, this

could provide a basis for combining the time series forecasts with a regression or hybrid

model. In particular the forecast would be weighted towards the time series model’s

forecast in the short term, with more weight being placed on the regression or hybrid

model forecast at longer forecast horizons.
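A horizon-dependent combination of this kind could be sketched as follows, again assuming inverse-MSE weights. All MSEs and forecasts below are hypothetical, chosen so that the time series model is more accurate at the first horizon and the regression model is more accurate at the third; they are not results from our testing.

```python
def inverse_mse_weight(mse_a: float, mse_b: float) -> float:
    """Weight on model A when weights are proportional to 1 / MSE."""
    return (1 / mse_a) / (1 / mse_a + 1 / mse_b)


def combine_by_horizon(ts_forecasts, reg_forecasts, ts_mses, reg_mses):
    """Combine two forecast paths using a separate weight at each horizon."""
    combined = []
    for ts_f, reg_f, ts_m, reg_m in zip(ts_forecasts, reg_forecasts, ts_mses, reg_mses):
        w_ts = inverse_mse_weight(ts_m, reg_m)
        combined.append(w_ts * ts_f + (1 - w_ts) * reg_f)
    return combined


# Hypothetical MSEs by forecast horizon (years 1, 2, 3 ahead).
ts_mses = [1000.0, 2000.0, 4000.0]
reg_mses = [2000.0, 2000.0, 1000.0]

# Hypothetical annual forecasts (million km).
ts_forecasts = [3600.0, 3650.0, 3700.0]
reg_forecasts = [3630.0, 3680.0, 3760.0]

ts_weights = [inverse_mse_weight(a, b) for a, b in zip(ts_mses, reg_mses)]
combined = combine_by_horizon(ts_forecasts, reg_forecasts, ts_mses, reg_mses)

print("Weight on time series model:", [round(w, 2) for w in ts_weights])
print("Combined forecasts:", [round(c, 1) for c in combined])
```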

More complex approaches are possible using a larger number of models, however this

exacerbates the concerns discussed above. Therefore if multiple models are used, our

recommendation is to adopt a relatively simple approach that combines forecasts from a

time series model with one other regression or hybrid model. This provides most of the

advantages of using multiple models without introducing excessive complexity.

9.1.4 Potential for remediation of the existing model

Our review identified a number of issues arising from the structure and implementation

of the existing spreadsheet model (section 4 above). In addition our econometric

analysis has shown that models that are considerably simpler than the ECMs used in the

existing model can do a good job of explaining recent trends in PED and RUC volumes

and can produce plausible forecasts. Therefore the potential improvements to the

existing model involve both changing the econometric models used to produce the

forecasts, and changing the structure and implementation of the spreadsheet model (see

also section 9.2 below).

This raises the question of whether it is sensible to remediate the existing spreadsheet

model, rather than to build a new model. In our view, remediation would likely involve

most or all of the following tasks:

Replacement of the key econometric models for PED and RUC volumes with new models

Removal of the seasonal adjustment process applied to all variables, and application of an alternative treatment of seasonality where appropriate (eg including quarterly dummy variables in the regression models)

Removal of the parameter adjustments feature of the model

Simplification of the process used to generate forecasts from the model, for example by consolidating all inputs and assumptions into one sheet, and improvement of the tables and charts produced by the model.

[27] Baumeister and Kilian (2013) find that such forecasts (for real oil prices) are more accurate than no-change forecasts, but they do not test against more sophisticated models or regression models that are regularly re-estimated and updated.

While it would be possible to make these changes to the existing model, in our view it is

likely to be no more costly (and possibly less costly) to build a new model for PED and

RUC volumes. This is because the complexity of the existing model’s structure means

that modifications need to be made carefully and tested thoroughly to ensure that there

are no unwanted side-effects.

If our recommendations of using relatively simple regression and/or time series models

are adopted, it will be possible to build a new spreadsheet model that is less complex

than the existing model, which will help to reduce the cost. In contrast, remediating the

existing model would involve fully understanding the workings of the model before

any changes are made, and testing the effects of such changes on all parts of the model.

9.1.5 Recommendations for future reviews of the econometric models

As discussed in section 9.1.1, the main way to mitigate the risk that econometric models

produce increasingly inaccurate forecasts over time is to re-estimate and re-test the

models regularly. This is particularly true in times like the present where there are

questions about whether past relationships between transport activity and other

variables continue to be valid.

Re-estimation involves estimating the coefficients of the model when new data has

become available, but does not involve changing the structure of the model or changing

the variables contained in it. Re-estimation is therefore relatively straightforward and

we recommend that this be undertaken on an annual basis. This would involve using

statistical software to estimate the coefficients of the models using the latest data, and

then replacing the coefficient values in the spreadsheet model with the updated values.

Provided that the form of the models does not change, updating the coefficients is

something that could be undertaken by the Ministry internally or by consultants at

relatively low cost (we estimate in the range of $20,000 to $30,000 per annum).
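
As a rough illustration of the annual re-estimation step described above, the following sketch refits a model of fixed form by ordinary least squares once a new observation is available. The one-regressor log-log form and all data values are hypothetical and chosen only for illustration; in practice the step would be carried out in statistical software using the actual model specifications.

```python
import math

def ols_simple(x, y):
    """Return (intercept, slope) from an OLS fit of y on a single regressor x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical data (not from the report): an explanatory variable and a volume series.
driver = [100.0, 102.0, 104.0, 107.0, 110.0, 112.0]
volume = [50.0, 50.6, 51.1, 52.0, 52.8, 53.3]

ln_x = [math.log(v) for v in driver]
ln_y = [math.log(v) for v in volume]

# Coefficients estimated on the data that were available a year ago.
a_old, b_old = ols_simple(ln_x[:-1], ln_y[:-1])
# Re-estimation: same model form, one additional observation.
a_new, b_new = ols_simple(ln_x, ln_y)

print(f"old: a={a_old:.4f}, b={b_old:.4f}")
print(f"new: a={a_new:.4f}, b={b_new:.4f}")
```

The updated values (here `a_new` and `b_new`) are what would be pasted into the spreadsheet model in place of the previous coefficients.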

Re-testing the models involves running econometric diagnostic tests and possibly

changing the structure of the models including adding or dropping variables, if there is

evidence of fundamental changes in forecasting relationships over time. This could be

undertaken as a two-stage process, as follows.

First, a set of econometric tests can be performed on the models to determine if they are

still adequately specified. In the time series context the most important tests are for

serial correlation and non-stationarity of the regression residuals. The presence of either

of these problems is evidence that the relationship embodied by the model is no longer

valid. Then if such problems are found, a broader search for a new model could be


undertaken. This is a larger exercise involving testing models of different forms and

with different variables, to see if a better model can be found.
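
To make the first stage more concrete, the sketch below computes the Durbin-Watson statistic, one common check for first-order serial correlation in regression residuals. This particular statistic is used here only as an illustrative assumption (a full re-test would use a broader suite of diagnostics), and the residual values are made up.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared first differences of the
    residuals divided by the sum of squared residuals.
    Values near 2 suggest no first-order serial correlation; values well
    below 2 suggest positive serial correlation."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

# Hypothetical residuals that drift slowly (positively autocorrelated).
persistent = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]
# Hypothetical residuals that flip sign every period (negatively autocorrelated).
alternating = [1.0, -1.0, 1.0, -1.0]

print(durbin_watson(persistent))
print(durbin_watson(alternating))
```

A statistic far from 2 in either direction would be a prompt to move to the second stage and search for a better-specified model.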

Re-testing involves time and cost, but given the current uncertainty

about whether the determinants of transport activity have fundamentally changed, we

recommend that it be undertaken every three years in the near future. It may be

possible to reduce this frequency once the current uncertainty about determinants of

transport activity is resolved.

9.2 Suggested improvements to the spreadsheet model

The following is a summary of our recommended improvements to the NLTF revenue

forecasting model as implemented in an Excel spreadsheet. Many of these

recommendations are aimed at simplifying the model and making it easier to use.

9.2.1 Replace ECMs with simpler regression models

The ECMs in the existing model are relatively complex, particularly the short-run

components of the models. This means that a relatively large number of inputs are

required to generate forecasts, and it can be difficult to explain the short-run predictions

generated by the models.

Our econometric analysis found that relatively simple models with a small number of

explanatory variables and simple dynamic variables such as autoregressive terms can

perform well in forecasting PED and RUC volumes. In our view, for PED and RUC

volumes, any additional benefits of using ECMs to more accurately capture short-run

dynamics are outweighed by the disadvantages of this approach. This is particularly the

case if the primary objective is accuracy of annual forecasts over the next two to three

years, rather than generating highly accurate quarterly forecasts.

9.2.2 Improve forecast generation and scenario analysis

The ability to analyse scenarios in the model could be improved by:

• Clearly separating actual data inputs from scenarios.

• Simplifying the specification of scenarios: For each variable these can simply be

input as a quarterly or annual time series, with generation of the scenario

performed outside the model.

As a general principle, anything that needs to be updated by the spreadsheet user to

produce a new forecast should be easily accessible and in a centralised location rather

than spread throughout the model. Ideally, the forecasting scenario would be specified

in a single information tab at the front of the Excel workbook that summarises and

graphs the variables relevant to the forecasting scenario. The user would input a

scenario into this tab, and therefore values in only one place need to be changed to

generate a forecast under any given scenario. This also makes it relatively easy to save a

‘snapshot’ of the scenario for future reference.


While not part of the spreadsheet model itself, it would also be helpful for the Ministry

to adopt procedures for recording modifications to the model. Ideally, the only

modifications that should be required by users are updating actual data over time and

inputting new forecasting scenarios. It would be good practice to use a ‘change log’ to

record such changes, briefly describe each change, and note who it was done by and

when the change was made.

9.2.3 Improve outputs of the models

The model should produce a set of simple graphs and tables to summarise the forecasts.

In Excel it is possible to set up these graphs so that the date range updates automatically

over time and no manual work is required to update the graphs when a new forecast is

generated each quarter.

The following tables and charts illustrate our recommended outputs from the model for

PED volumes. Similar outputs are possible for RUC volumes.

Table 52 presents forecast volumes and annual growth rates. A comparison to the past

three years is provided to illustrate recent trends. The low and high forecasts in this case

represent the 67% confidence bounds, but these could also be generated from alternative

scenarios for the explanatory variables.

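Table 52 Example PED volume annual forecast table.

| YE June | Volume, Low (m litres) | Volume, Medium (m litres) | Volume, High (m litres) | Annual change, Low (%) | Annual change, Medium (%) | Annual change, High (%) |
|---|---|---|---|---|---|---|
| 2011 | | 3,105 | | | 2.3 | |
| 2012 | | 3,033 | | | -2.3 | |
| 2013 | | 3,013 | | | -0.7 | |
| 2014 (f) | 2,924 | 2,989 | 3,058 | -2.9 | -0.8 | 1.5 |
| 2015 (f) | 2,949 | 3,037 | 3,130 | 0.8 | 1.6 | 2.4 |
| 2016 (f) | 2,968 | 3,057 | 3,150 | 0.7 | 0.6 | 0.6 |
| 2017 (f) | 2,971 | 3,060 | 3,153 | 0.1 | 0.1 | 0.1 |
| 2018 (f) | 2,970 | 3,059 | 3,152 | 0.0 | 0.0 | -0.1 |
| 2019 (f) | 2,963 | 3,051 | 3,144 | -0.2 | -0.2 | -0.3 |
| 2020 (f) | 2,951 | 3,038 | 3,130 | -0.4 | -0.4 | -0.4 |
| 2021 (f) | 2,939 | 3,026 | 3,117 | -0.4 | -0.4 | -0.4 |
| 2022 (f) | 2,928 | 3,015 | 3,105 | -0.4 | -0.4 | -0.4 |
| 2023 (f) | 2,918 | 3,004 | 3,095 | -0.3 | -0.3 | -0.3 |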
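
The annual change columns in Table 52 are simple year-on-year percentage changes in volume. As a minimal sketch, the calculation can be reproduced for the three actual years and the first medium forecast as follows. The function name is hypothetical, and because the published volumes are rounded, recomputing other rows from them may differ from the table in the last digit.

```python
def annual_change_pct(volumes):
    """Year-on-year percentage change, rounded to one decimal place."""
    return [round(100.0 * (cur / prev - 1.0), 1)
            for prev, cur in zip(volumes, volumes[1:])]

# YE June 2011-2013 actuals followed by the 2014 medium forecast (m litres),
# taken from Table 52.
volumes = [3105, 3033, 3013, 2989]

# Changes for 2012, 2013 and 2014 (medium) respectively.
print(annual_change_pct(volumes))  # [-2.3, -0.7, -0.8]
```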

The forecasts can also be plotted automatically. For example, Figure 154 shows annual

PED volumes and Figure 155 shows the annual growth rate.

Figure 154 Example PED volume annual forecast chart.

[Chart: PED volume (m litres) by year ended June, 1996 to 2023; series: Actual, Forecast.]

Figure 155 Example PED volume annual growth chart.

[Chart: PED volume annual change (%) by year ended June, 1996 to 2023; series: Actual, Forecast.]

In many cases, the annual (year ended June) forecasts will be generated when some

quarters of actual data for the first year of the forecast period are available. In this case it

may be helpful to show the year-ended totals for the available quarters in the first

forecast year, to indicate how actual values are tracking for this year. Figure 156 shows an

example of this assuming that the first two quarters of actual data for the year ended

June 2014 are available.

Figure 156 Example annual PED volume chart showing annual totals for available quarters of the first forecast year.

[Chart: annual PED volume (m litres) by year ended June, 2009 to 2023; series: Actual, Forecast.]

It is also possible to present an approximate decomposition of the forecasts into changes

in the explanatory variables in the model, modified by their coefficients. Most of the

models recommended in this report are log-log models. To illustrate how the forecasts

from these models can be decomposed, consider the following simple model where y is

a function of x and z:

$$\ln y = a + b_1 \ln x + b_2 \ln z$$

By totally differentiating, we obtain:

$$\frac{1}{y}\,dy = \frac{b_1}{x}\,dx + \frac{b_2}{z}\,dz$$

Which implies that:

$$\%\Delta y \approx b_1\,\%\Delta x + b_2\,\%\Delta z$$


The relationship is only approximate because the derivative assumes an infinitesimal

change in x and z whereas in reality the changes are larger and this generates interaction

effects that are not captured in the formula above. Furthermore, in the first forecast

period the change in y reflects a ‘jump’ from the actual value in the previous period to

the forecast value produced by the model, since the actual value in the previous period

will be different to the model’s prediction for that period.
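
The size of the approximation error can be checked numerically. The sketch below compares the first-order decomposition with the exact percentage change implied by a log-log model; the coefficient values and percentage changes are hypothetical and are not taken from any of the models in this report.

```python
# Hypothetical elasticities for a log-log model ln y = a + b1 ln x + b2 ln z.
b1, b2 = 0.5, -0.2

# Hypothetical percentage changes in the explanatory variables.
pct_x, pct_z = 3.0, 10.0

# First-order (approximate) decomposition: each variable's contribution
# is its coefficient times its percentage change.
contrib_x = b1 * pct_x
contrib_z = b2 * pct_z
approx_pct_y = contrib_x + contrib_z

# Exact change implied by the model: y scales by (1 + dx/x)^b1 * (1 + dz/z)^b2.
exact_pct_y = 100.0 * ((1 + pct_x / 100) ** b1 * (1 + pct_z / 100) ** b2 - 1)

print(f"contribution of x: {contrib_x:.2f}%, contribution of z: {contrib_z:.2f}%")
print(f"approximate change in y: {approx_pct_y:.3f}%")
print(f"exact change in y:       {exact_pct_y:.3f}%")
```

The gap between the two figures grows with the size of the changes in x and z, which is why the decomposition is best read as indicative.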

In the case of a linear model such as that used for light RUC, the decomposition is

different. To illustrate, consider the following simple linear time series model:

$$y_t = a + b_1 x_t + b_2 z_t$$

From one period to the next, the percentage change in y is given by:

$$\%\Delta y_t = \frac{y_t - y_{t-1}}{y_{t-1}} = \frac{b_1 (x_t - x_{t-1}) + b_2 (z_t - z_{t-1})}{y_{t-1}}$$

If we define the modified coefficients $d_1 = b_1 x_{t-1} / y_{t-1}$ and $d_2 = b_2 z_{t-1} / y_{t-1}$ then it is

straightforward to verify that:

$$\%\Delta y_t = d_1\,\%\Delta x_t + d_2\,\%\Delta z_t$$

This relationship is exact but in the linear model the effective coefficient on the

percentage change of each variable is not constant over time.
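
The exactness of the linear decomposition can be verified with a small numerical example. All coefficient and data values below are hypothetical.

```python
# Hypothetical linear model y_t = a + b1 * x_t + b2 * z_t.
a, b1, b2 = 10.0, 2.0, -1.0

# Hypothetical values of the explanatory variables in two consecutive periods.
x_prev, x_cur = 20.0, 22.0
z_prev, z_cur = 5.0, 6.0

y_prev = a + b1 * x_prev + b2 * z_prev
y_cur = a + b1 * x_cur + b2 * z_cur

# Direct percentage change in y.
pct_y = 100.0 * (y_cur - y_prev) / y_prev

# Modified coefficients, which depend on the previous period's levels.
d1 = b1 * x_prev / y_prev
d2 = b2 * z_prev / y_prev

pct_x = 100.0 * (x_cur - x_prev) / x_prev
pct_z = 100.0 * (z_cur - z_prev) / z_prev

# Decomposed percentage change: matches pct_y up to floating-point error.
decomposed = d1 * pct_x + d2 * pct_z

print(f"direct: {pct_y:.4f}%, decomposed: {decomposed:.4f}%")
```

Because `d1` and `d2` are recomputed from the previous period's levels, they need to be updated each period rather than treated as fixed elasticities.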

The models we have developed are estimated at a quarterly frequency, and quarterly

values are summed to produce annual totals. To estimate an approximate annual

decomposition of the forecasts, we have applied the above decomposition to the

quarterly models and then taken the annual average of the effect of each variable. These

annual averages are taken over 4-quarter periods during the forecast period, rather than

for June years. Otherwise, the first forecast year includes a mix of actual data and

forecast quarters, and this complicates the first year forecast decomposition.
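
A minimal sketch of the averaging step follows, assuming the quarterly contribution of one variable has already been calculated for each forecast quarter. The function name and the contribution values are hypothetical.

```python
def four_quarter_averages(quarterly):
    """Average a quarterly series over consecutive, non-overlapping
    4-quarter periods starting at the first forecast quarter.
    Any incomplete period at the end is dropped."""
    return [sum(quarterly[i:i + 4]) / 4
            for i in range(0, len(quarterly) - 3, 4)]

# Hypothetical quarterly contributions (percentage points) of one variable
# over the first eight quarters of the forecast period.
contributions = [0.10, 0.12, 0.08, 0.10, 0.06, 0.04, 0.05, 0.05]

print(four_quarter_averages(contributions))
```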

Figure 157 shows an example of this decomposition for the PED volume forecast from

PED model 3b. For each variable, the annual average contribution of that variable to the

quarterly forecasts in each year is presented. We have not shown the annual percentage

change in PED volume on this chart because the annual averaging means that the effects

of the individual variables do not add up to the annual change in PED volume.

Nevertheless, the chart shows the approximate relative importance of the variables in

the model and whether they are having a positive or negative effect on PED volume.

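Figure 157 Example decomposition of PED volume forecasts.

[Chart: contribution of each variable to the forecast annual % change, for 4-quarter periods from 2013-4 to 2014-3 through 2022-4 to 2023-3.]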

9.2.4 Remove parameter shocks functionality

As discussed in section 4, the ability to analyse parameter shocks in the model

introduces considerable complexity while potentially undermining the credibility of the

econometric models, and there is no simple, non-arbitrary way to make such

adjustments. In our view it would be better for the coefficients of the econometric

models to be re-estimated on a regular basis outside the spreadsheet model (see below).

9.2.5 Remove coefficient re-estimation

The coefficients of the econometric models do need to be updated regularly – preferably

every year or at least every two years. However, in our view this should be done outside

the Excel model so that a proper suite of diagnostic tests can be performed, and the

structure of the models can be updated if necessary.28 Ideally this would be a regular

part of the forecasting process, done at a point in the year when there is sufficient time

to re-analyse the models in detail.

9.2.6 Improve treatment of seasonal variation

As our analysis demonstrated, there is no predictable seasonal pattern in PED volumes,

with the quarterly volatility largely driven by random factors that essentially cannot be

forecast. Therefore, in our view it is preferable for some form of smoothing (eg the

4-quarter moving average) to be applied to PED volumes for use in the analysis.
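
As an illustration of the smoothing referred to above, the sketch below computes a trailing 4-quarter moving average. The trailing (rather than centred) window is an assumption made for this example, and the quarterly values are hypothetical.

```python
def moving_average_4q(series):
    """Trailing 4-quarter moving average.
    The first value corresponds to the fourth observation in the series."""
    return [sum(series[i - 3:i + 1]) / 4 for i in range(3, len(series))]

# Hypothetical quarterly PED volumes (m litres) with irregular quarterly noise.
quarterly = [760.0, 740.0, 775.0, 750.0, 765.0, 735.0]

print(moving_average_4q(quarterly))
```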

28 Our econometric analysis was conducted in Stata; however, any suitable statistical software could

be used, including EViews or R.


Other variables such as RUC volumes do have predictable seasonal patterns. Our

recommendation for handling this is to include quarterly dummy variables in the

models where necessary to capture seasonal effects, rather than seasonally adjusting the

variables prior to analysis. This treatment has essentially the same effect as seasonal

adjustment but is simpler to implement and is more transparent. Regular re-estimation

of the models will allow testing whether the seasonal pattern is changing, by testing

whether the coefficients on the seasonal dummies are changing, and/or by interacting these dummy variables

with deterministic time trends.
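
To show what the quarterly dummy variables look like in practice, the following sketch builds the indicator columns that would be added to a regression's explanatory variables. The function name and the choice of the first quarter as the base category are assumptions made for illustration.

```python
def quarter_dummies(quarters, base=1):
    """Return one row per observation with a 0/1 indicator for each
    quarter other than the base quarter (omitted to avoid perfect
    collinearity with the regression constant)."""
    others = [q for q in (1, 2, 3, 4) if q != base]
    return [[1 if q == other else 0 for other in others] for q in quarters]

# Quarter number of each observation in a hypothetical quarterly sample.
quarters = [1, 2, 3, 4, 1, 2]

# Columns are indicators for quarters 2, 3 and 4.
for q, row in zip(quarters, quarter_dummies(quarters)):
    print(q, row)
```

Each estimated dummy coefficient then measures the average difference in that quarter relative to the base quarter, holding the other variables constant.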

9.2.7 Include forecast uncertainties (confidence intervals)

As well as generating forecasts under different input assumptions, it would be helpful if

the model could reflect the uncertainty associated with the econometric models through

the calculation of confidence intervals for the forecasts. The implementation of this will

be greatly simplified by using simple regression models to generate the forecasts,

rather than ECMs.
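
A simplified sketch of how such bounds might be calculated for a model estimated in logs is given below. It assumes normally distributed forecast errors and treats the forecast standard error as known; the point forecast and standard error are hypothetical, and the sketch does not reproduce the method used elsewhere in this report.

```python
import math
from statistics import NormalDist

def log_forecast_interval(ln_forecast, std_error, coverage=0.67):
    """Return (low, central, high) forecasts in levels for a model
    estimated in logs, given a symmetric interval in log terms."""
    # Two-sided critical value, e.g. about 0.97 for 67% coverage.
    z = NormalDist().inv_cdf(0.5 + coverage / 2)
    low = math.exp(ln_forecast - z * std_error)
    central = math.exp(ln_forecast)
    high = math.exp(ln_forecast + z * std_error)
    return low, central, high

# Hypothetical annual forecast of 3,000 m litres with a log standard error of 0.02.
low, central, high = log_forecast_interval(math.log(3000.0), 0.02)
print(f"low={low:.0f}, central={central:.0f}, high={high:.0f}")
```

Because the interval is symmetric in logs, it is slightly asymmetric around the central forecast once converted back to levels.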

9.2.8 Simplify the updating process

Updating the model each quarter requires entering new actual data that has become

available. As noted in section 4, the Excel model does not detect the presence of new

data, and so updating requires a number of manual steps to flow this data through the

model. In fact it is possible to build the model in such a way (without macros) that

additional observations can be added to a data table and this flows through the model

automatically, including updating the date ranges applied to output tables and charts. If

the model were built in this way, it would reduce the manual work required to produce

updates and eliminate errors that may be created during updating.

9.2.9 Simplify models for other components of NLTF revenues

The existing model also includes relatively complex models for the other parts of NLTF

revenues that we have not analysed in this report (eg CNG and LPG excise, driver

licensing, etc). These comprise less than 10% of total NLTF revenues but introduce a

considerable amount of complexity into the Excel model. In our view it would probably

be acceptable to greatly simplify these models, for example by producing forecasts on the basis of

simple time-series models.

This would greatly simplify the model overall, while having a minimal effect on overall

accuracy of the total NLTF revenue forecasts. However, if highly accurate forecasts of

these revenue components are required for other reasons then such simplification may

not be desirable.


10 References

Baumeister, C. & L. Kilian (2013). Forecasting the real price of oil in a changing world: A

forecast combination approach. Bank of Canada Working Paper 2013-28.

BITRE (2013a) Traffic growth: Modelling a global phenomenon. BITRE Research Report

128.

BITRE (2013b) Public transport use in Australia’s capital cities: Modelling and

forecasting. BITRE Research Report 129.

Clemen, R. T. (1989). Combining Forecasts: A Review and Annotated Bibliography.

International Journal of Forecasting, 5: 559-83.

Conder, T (2009) Development and application of a New Zealand car ownership and

traffic forecasting model. NZ Transport Agency research report 394.

Corpuz, G, M McCabe & K Ryszawa (2007) The development of a Sydney VKT

regression model. Report presented to the 29th Australian Transport Research Forum.

Davidson, R. & J MacKinnon (1993). Estimation and inference in econometrics. Oxford

University Press.

Deloitte (2011a) Review and redesign of the National Land Transport Fund revenue

forecasting model. Deloitte Access Economics report for the Ministry of Transport, April

2011.

Deloitte (2011b) National Land Transport Forecasting model – User Guide. Deloitte

Access Economics report for the Ministry of Transport, May 2011.

Deloitte (2012) Review of NLTF forecasting model. Deloitte Access Economics report for

the Ministry of Transport 11 October 2012.

Ecola, L & M Wachs (2012) Exploring the relationship between travel demand and

economic growth. RAND Corporation report for the Federal Highway Administration.

International Transport Forum (2012) Transport outlook: Seamless transport for greener

growth.

Litman, T (2013) The future isn’t what it used to be: Changing trends and their

implications for transport planning. Victoria Transport Policy Institute, 5 November 2013.

Milne, A, S Rendall & S Abley (2011) National travel profiles part B: Trips, trends and

travel prediction. NZ Transport Agency research report 467.

Pickrell D, D Pace, R West & G Hagemann (2012) Developing a multi-level vehicle miles

of travel forecasting model. Paper presented at the 91st Annual Meeting of the

Transportation Research Board.


Phillips, P C B (1995) Bayesian model selection and prediction with empirical

applications. Journal of Econometrics, 69: 289-331.

Phillips, P C B & W Ploberger (1994) Posterior odds testing for a unit root with data-

based model selection. Econometric Theory, 10: 774-808.

Schiff, A & P C B Phillips (2000) Forecasting New Zealand’s real GDP. New Zealand

Economic Papers, 34 (2): 159-182.

Simic, A & R Bartels (2013) Drivers of demand for transport. NZ Transport Agency

research report 534.

Souche, S (2010) Measuring the structural determinants of urban travel demand.

Transport Policy, 17 (3): 127-34.

Stephenson, J & L Zheng (2013) National long-term land transport demand model. NZ

Transport Agency research report 520.

Wang, J (2011) Appraisal of factors influencing public transport patronage. NZ Transport

Agency research report 434.
