A Probability Sample Strategy for improving the quality of the Consumer Price Index Survey using the Information of the Business Register

Luigi Biggeri, Piero Demetrio Falorsi
National Statistical Institute of Italy (ISTAT)


Summary

The presentation proposes a new sampling strategy for the Italian CPI survey. It aims to overcome some problems of the current design, which is based on purposive sampling and may therefore produce biased estimates. A complex multi-stage random pps (probability proportional to size) sampling scheme is proposed, in which the inclusion (or selection) probabilities at the different stages are proportional to turnover.

Two relevant innovations proposed here concern the procedure for selecting elementary items and the estimation procedure. The latter is based on an observational strategy that makes it possible: (i) to calculate proxy values of the weights w, which are unknown at elementary item level; (ii) to define a consistent estimation method by which the national CPI estimate is obtained as a weighted sum of the estimates of the subpopulation indices.


Summary

1. The Current CPI construction: characteristics and issues

2. Analysis and new studies

3. A proposal for a probability sampling strategy

4. Sampling frame and design

5. Estimation method

6. Concluding remarks


The Laspeyres type index

$$ I^{12,y-1;\,m,y} = \sum_{j} \sum_{a} \sum_{c \in a} \sum_{v \in c} w_{j,c,v}^{12,y-1}\; r_{j,c,v}^{12,y-1;\,m,y}, \qquad \sum_{j} w_{j}^{12,y-1} = 1 $$

$$ r_{j,c,v}^{12,y-1;\,m,y} = \frac{P_{j,c,v}^{m,y}}{P_{j,c,v}^{12,y-1}} $$

Where: P is the price; y the year; m the month;

a = geographic area; c = local district; v = outlet; j = elementary item

1. The Current CPI construction: characteristics and issues (a)


The current purposive sample strategy of the CPI survey

• The prices of a fixed basket of 562 representative products (purposively chosen) are collected in two different ways:

– (a) centrally (roughly 60 products) by Istat staff, through specific sample procedures

– (b) locally (roughly 500 products) directly by the staff of the Municipal Statistical Offices involved in the survey.

• Local survey: three sampling stages:

– The first stage units (PSUs) are the chief towns of provinces (86 municipalities out of 103)

– The second stage units are the outlets, purposively chosen (in December of each year) in each PSU to be representative of consumer behaviour, as a kind of quota sampling (roughly 40,000)

– The most sold elementary items of the fixed basket of products (chosen in December of each year) are observed in each selected outlet (roughly 400,000)

1. The Current CPI construction: characteristics and issues (b)


• The elementary indexes are obtained at municipality level as unweighted geometric means

• The national index is calculated by successive territorial aggregation of the elementary indexes, using weights at the different levels based on population, national accounts data and the household expenditure survey

• A CPI for each sampled municipality is also calculated

1. The Current CPI construction: characteristics and issues (c)
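The elementary aggregation step can be illustrated with a minimal Python sketch. It assumes that the elementary index is the unweighted geometric mean of the price relatives of the quotes collected for one product in one municipality (a form commonly called the Jevons index); the function name and the price quotes are hypothetical and do not come from the survey.

```python
import math


def jevons_index(base_prices, current_prices):
    """Unweighted geometric mean of price relatives, scaled so that base = 100."""
    if len(base_prices) != len(current_prices) or not base_prices:
        raise ValueError("need two non-empty price lists of equal length")
    log_relatives = [
        math.log(p1 / p0) for p0, p1 in zip(base_prices, current_prices)
    ]
    return 100.0 * math.exp(sum(log_relatives) / len(log_relatives))


# Hypothetical quotes for one product in one municipality
# (base period = December of the previous year).
base = [1.00, 2.00, 4.00]
current = [1.10, 2.00, 4.40]
index = jevons_index(base, current)
```

Working with logarithms keeps the product of many relatives numerically stable and makes the equal treatment of every quote explicit.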


Some issues of the current survey

1. The current survey structure, based on a purposive sampling strategy, does not allow the accuracy of the estimates to be evaluated; attempts to evaluate the variance should be carried out.

2. Not all the chief towns of provinces are included in the survey, and the small municipalities are not included at all.

3. The selection criterion of the “most sold elementary item” of the product in each outlet could introduce a bias of unknown size

4. The lack of adequately detailed information on household consumer expenditure prevents the use of weights at the elementary aggregate level and at municipal and regional level

2. Analysis and new studies (a)


The need for experimental analysis

1. To get information on the size of the possible biases, analyses and computations must be carried out by implementing adequate experiments

2. To evaluate and improve the quality of the Italian CPIs, last year Istat established a Scientific Committee that is reviewing the different aspects of the index construction process. The Committee has stressed the need to study and verify the feasibility of constructing a probabilistic sample strategy.

2. Analysis and new studies (b)


1. The proposal is tailored to the survey of prices collected locally

2. A business register of local units (outlets), updated yearly, has recently become available

3. The turnover of each outlet can be estimated for each product and used to construct weights.

4. The proposed survey framework, based on a probability sample strategy, guarantees unbiased estimates and should deal with most of the issues mentioned

5. The sample design consists of a three-stage selection scheme (local districts, outlets and items) using probabilities proportional to turnover, which is used as a proxy for consumer expenditure.

6. The index estimation is based on an observational scheme that yields proxy measures of the weights. A generalised regression estimator is used. The indexes calculated for the different estimation domains (planned or not) are coherent with one another

3. A proposal for a probability sampling strategy


The parameter of interest is the national prices index

$$ I^{0,1}_{pop} = \sum_{c=1}^{N} \sum_{v=1}^{M_c} \sum_{d=1}^{D_{c,v}} \sum_{j=1}^{J_{d,c,v}} w_{dj,c,v}\; r_{dj,c,v} $$

c = local district, v = outlet, d = type of product, j = item

Price index of item (d,j,c,v):

$$ r_{dj,c,v} = \frac{P^{1}_{dj,c,v}}{P^{0}_{dj,c,v}} $$

Weight of item (d,j,c,v), in terms of sales in the base period:

$$ w_{dj,c,v} = \frac{F^{0}_{dj,c,v}}{\sum_{c=1}^{N} \sum_{v=1}^{M_c} \sum_{d=1}^{D_{c,v}} \sum_{j=1}^{J_{d,c,v}} F^{0}_{dj,c,v}} $$

4. Sampling frame and design (a)
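As a numerical illustration of the parameter of interest, the sketch below computes a national index as a turnover-weighted sum of item price relatives. It assumes that each item is identified by a key (d, j, c, v) and that its weight is its share of total base-period turnover; all names and figures are hypothetical.

```python
def turnover_weights(turnover):
    """Weight of each item = its share of total base-period turnover."""
    total = sum(turnover.values())
    return {key: value / total for key, value in turnover.items()}


def population_index(turnover, base_price, current_price):
    """Weighted sum of the price relatives P1 / P0 over all items."""
    weights = turnover_weights(turnover)
    return sum(
        weights[key] * current_price[key] / base_price[key] for key in weights
    )


# Hypothetical population of three items, keyed by (d, j, c, v).
turnover = {
    ("bread", "loaf", "c1", "v1"): 300.0,
    ("bread", "loaf", "c1", "v2"): 100.0,
    ("rice", "1kg", "c2", "v1"): 100.0,
}
base_price = {
    ("bread", "loaf", "c1", "v1"): 2.0,
    ("bread", "loaf", "c1", "v2"): 2.0,
    ("rice", "1kg", "c2", "v1"): 1.0,
}
current_price = {
    ("bread", "loaf", "c1", "v1"): 2.2,
    ("bread", "loaf", "c1", "v2"): 2.0,
    ("rice", "1kg", "c2", "v1"): 1.2,
}
index = population_index(turnover, base_price, current_price)
```

With weights 0.6, 0.2 and 0.2 and price relatives 1.1, 1.0 and 1.2, the index is 0.66 + 0.20 + 0.24 = 1.10.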


PSUs: Local districts (municipalities in Italy) are selected within each geographical area through balanced sampling, which aims to produce a sample whose direct estimates of the totals of some auxiliary variables equal the known totals (Deville and Tillé, 2004)

SSUs: The sampling design for the outlets links D distinct samples, one for each type of product (TP). Outlets are selected through a coordinated selection technique based on permanent random numbers (PRN), which aims at a high level of overlap among the samples selected for the different types of product; this reduces the total number of sampled outlets for a given number of observed items (Ohlsson, 1995)

FINAL UNITS: A probability sampling scheme for item selection, based on iterative hierarchical drawing of groups of products, is proposed. Such a scheme is feasible and solves the current problem of defining the fixed basket of products

4. Sampling frame and design (b)


Planned domains for the survey estimates:

The most detailed domain is the geographical area by Type of Product (TP), where a TP is an element of the four-digit COICOP classification

4. Sampling frame and design (b)


Sampling frame: Local Unit Archive

– Yearly updated

– The information contained in the archive (display surface area, number of employees, economic activity code, geographical zone) is used for stratification by size, outlet typology, etc.

– The NACE code makes it possible to establish which outlets sell the TP items to households

– A table linking NACE codes and types of products has been constructed

4. Sampling frame and design (c)


Table 1. Example of table linking types of products and NACE codes (1 = the NACE code is linked to the type of product, 0 = not linked)

NACE codes described on the slide: 52.11.1 = hypermarket; 52.11.2 = supermarket; 52.11.3 = grocery discount; 52.24.1 = retail market of bread; 52.27.4 = retail market of other groceries; 52.48.6 = retail market of objects for arts, religion, etc.

No.  Type of product            52.11.1  52.11.2  52.11.3  …  52.24.1  …  52.27.4  52.48.6  52.48.E  …  52.74.0
1    Rice                          1        1        1     …     0     …     1        0        0     …     0
2    Bread                         1        1        1     …     1     …     1        0        0     …     0
3    Pasta                         1        1        1     …     0     …     1        0        0     …     0
…    …                             …        …        …     …     …     …     …        …        …     …     …
207  Expenditures for religion     0        0        0     …     0     …     0        1        1     …     0

4. Sampling frame and design (d)


Sampling frame construction: definition of turnover

– Outlet turnover $\tilde F^{0}_{\cdot\cdot,c,v}$: taken from the business register, it is known exactly only for enterprises with a single local unit; otherwise it is imputed using different data sources

– Turnover for outlet and type of product $\tilde F^{0}_{d\cdot,c,v}$: estimated using different data sources (fiscal data, business register, National Accounts, Household Budget Survey)

(A dot in place of an index denotes summation over that index; a tilde marks an imputed or estimated value.)

Note that possible errors in imputation do not bias the sampling strategy; they can only increase the variance

4. Sampling frame and design (e)
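The remark that imputation errors affect variance but not bias can be checked on a toy population. The sketch below is only an illustration under a simplifying assumption: it uses Poisson sampling (independent inclusion of each unit), which is not the design proposed in the slides, because it allows every possible sample to be enumerated exactly. The Horvitz-Thompson estimator of a total is computed once with inclusion probabilities proportional to the true values and once with probabilities based on a noisy size measure; all figures are hypothetical.

```python
from fractions import Fraction
from itertools import product


def ht_moments(y, pi):
    """Exact mean and variance of the Horvitz-Thompson total under Poisson sampling."""
    mean = Fraction(0)
    second_moment = Fraction(0)
    for pattern in product([0, 1], repeat=len(y)):
        prob = Fraction(1)      # probability of this particular sample
        estimate = Fraction(0)  # HT estimate computed from this sample
        for included, yk, pk in zip(pattern, y, pi):
            prob *= pk if included else 1 - pk
            if included:
                estimate += yk / pk
        mean += prob * estimate
        second_moment += prob * estimate ** 2
    return mean, second_moment - mean ** 2


y = [Fraction(v) for v in (10, 20, 30, 40)]      # true values, total = 100
noisy = [Fraction(v) for v in (15, 15, 35, 35)]  # size measure with errors
n = 2                                            # expected sample size

pi_exact = [n * v / sum(y) for v in y]
pi_noisy = [n * v / sum(noisy) for v in noisy]

mean_exact, var_exact = ht_moments(y, pi_exact)
mean_noisy, var_noisy = ht_moments(y, pi_noisy)
```

Both expectations equal the true total of 100, while the variance is larger when the size measure contains errors. Exact rational arithmetic (`Fraction`) is used so that the equality holds without rounding.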


Local districts Selection

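– $n_{(a)}$ sample local districts are drawn from the $N_{(a)}$ local districts of area a by means of a balanced sampling design with inclusion probabilities proportional to the turnover:

$$ \pi_c = n_{(a)}\, \frac{\tilde F^{0}_{\cdot\cdot,c,\cdot}}{\tilde F^{0}_{\cdot\cdot,(a)}}, \qquad c \in a $$

– The balancing equations are

$$ \sum_{c=1}^{n_{(a)}} \frac{\mathbf{x}_c}{\pi_c} = \sum_{c=1}^{N_{(a)}} \mathbf{x}_c $$

being

$$ \mathbf{x}_c = \left( \tilde F^{0}_{1\cdot,c,\cdot}, \ldots, \tilde F^{0}_{d\cdot,c,\cdot}, \ldots, \tilde F^{0}_{D\cdot,c,\cdot}, \pi_c \right) $$

where $\tilde F^{0}_{d\cdot,c,\cdot}$ is the overall turnover of the c-th local district for the d-th type of product, calculated by summing up frame data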

4. Sampling frame and design (f)
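The first-stage inclusion probabilities can be sketched as follows. The code computes probabilities proportional to the turnover of each local district for a sample of n districts. One detail is an assumption that the slide does not mention: districts whose proportional share would reach 1 are treated as selected with certainty and the remaining sample size is shared among the others, a usual convention in pps designs. The balanced selection itself (Deville and Tillé, 2004) is not implemented here; district labels and turnover figures are hypothetical.

```python
def pps_inclusion_probabilities(size, n):
    """Inclusion probabilities proportional to size for a sample of n units.

    Units whose proportional share would reach or exceed 1 are included with
    certainty; the remaining sample size is shared among the other units.
    """
    if not 0 < n <= len(size) or min(size.values()) <= 0:
        raise ValueError("need 0 < n <= number of units and positive sizes")
    pi, remaining, slots = {}, dict(size), n
    while remaining:
        total = sum(remaining.values())
        certain = [u for u, s in remaining.items() if slots * s >= total]
        if not certain:
            break
        for unit in certain:
            pi[unit] = 1.0
            del remaining[unit]
        slots -= len(certain)
    total = sum(remaining.values())
    for unit, s in remaining.items():
        pi[unit] = slots * s / total
    return pi


# Hypothetical district turnover in one geographical area.
turnover = {"A": 400.0, "B": 300.0, "C": 200.0, "D": 100.0}
pi = pps_inclusion_probabilities(turnover, 2)

# A very large district is selected with certainty.
skewed = {"A": 700.0, "B": 100.0, "C": 100.0, "D": 100.0}
pi_skewed = pps_inclusion_probabilities(skewed, 2)
```

In both cases the probabilities sum to the sample size n, which is what a fixed-size design requires.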


Outlet Selection

– Separate samples are drawn, one for each type of product.

– Each sample is selected through a PRN coordination technique, which maximises the overlap of the outlets selected for the different types of products

– In the sample selected for the generic type of product d (d=1,…,D) the outlets are stratified by typology within the local district

– The outlet final inclusion probability is defined as proportional to the outlet size in terms of turnover for the d-th type of product in the area:

$$ \pi_{d,c,v} = \pi_c\, \pi_{d,v|c} = m_{d(a)}\, \frac{\tilde F^{0}_{d\cdot,c,v}}{\tilde F^{0}_{d\cdot,(a)}}, \qquad \pi_{d,v|c} = \frac{m_{d(a)}}{\pi_c}\, \frac{\tilde F^{0}_{d\cdot,c,v}}{\tilde F^{0}_{d\cdot,(a)}} $$

where $\pi_c$ is the inclusion probability of local district c and $m_{d(a)}$ is a constant for the d-th type of product in area a.

4. Sampling frame and design (g)
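The idea of coordinating the outlet samples of different types of product through permanent random numbers can be sketched in a few lines. The slides do not specify the exact PRN scheme, so the code below swaps in sequential Poisson sampling, an order-sampling scheme that is only approximately proportional to size: every outlet keeps one permanent random number, and for each type of product the outlets with the smallest ratio between that number and their turnover are selected. Because the same random numbers are reused for every type of product, the samples tend to overlap. Outlet labels, turnover values and sample sizes are hypothetical.

```python
import random


def sequential_poisson_sample(prn, size, m):
    """Return the m units with the smallest ratio PRN / size."""
    # Outlets with zero turnover for this type of product are not eligible.
    eligible = [unit for unit in size if size[unit] > 0]
    ranked = sorted(eligible, key=lambda unit: prn[unit] / size[unit])
    return set(ranked[:m])


rng = random.Random(2024)  # fixed seed so that the sketch is reproducible
outlets = ["v%02d" % i for i in range(1, 21)]

# One permanent random number per outlet, drawn once and then kept.
prn = {v: rng.random() for v in outlets}

# Hypothetical imputed turnover of each outlet for two types of product.
turnover = {
    "bread": {v: rng.uniform(10.0, 100.0) for v in outlets},
    "pasta": {v: rng.uniform(10.0, 100.0) for v in outlets},
}

samples = {d: sequential_poisson_sample(prn, turnover[d], 5) for d in turnover}
shared_outlets = samples["bread"] & samples["pasta"]
```

Outlets in `shared_outlets` are visited once but provide prices for both types of product, which is how coordination reduces the total number of outlets for a given number of observed items.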


Items Selection (1)

In order to perform the probability selection of the items, the main operational difficulty is constructing the list of all the items sold in the outlet that belong to the type of product for which the outlet has been included in the sample

– A way to solve this difficulty is to define:

• A hierarchical tree classification of elementary products for each type of product;

• A selection procedure for each level of this structure.

– The procedure should be translated into a specific algorithm, implemented on the laptop used by the interviewer for the data collection. This makes it possible to identify quickly a very small subset of homogeneous items to be used for the item selection in the outlet.

4. Sampling frame and design (h)
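A possible shape for such an algorithm is sketched below. It assumes that the tree for a type of product is stored as nested dictionaries, that each node carries a (hypothetical) turnover figure, and that one branch is drawn at each level with probability proportional to its turnover among its siblings. The tree structure, the labels and the figures are illustrative assumptions, not the classification actually used.

```python
import random


def leaf_probabilities(tree, prefix=(), prob=1.0):
    """Overall selection probability of every leaf (item) of the tree."""
    total = sum(turnover for turnover, _ in tree.values())
    result = {}
    for label, (turnover, children) in tree.items():
        p = prob * turnover / total
        if children:
            result.update(leaf_probabilities(children, prefix + (label,), p))
        else:
            result[prefix + (label,)] = p
    return result


def draw_item(tree, rng):
    """Walk down the tree, drawing one branch per level; return (path, probability)."""
    path, prob, level = (), 1.0, tree
    while level:
        labels = list(level)
        weights = [level[label][0] for label in labels]
        chosen = rng.choices(labels, weights=weights, k=1)[0]
        prob *= level[chosen][0] / sum(weights)
        path += (chosen,)
        level = level[chosen][1]  # None at a leaf ends the walk
    return path, prob


# Each node: label -> (turnover, children); children is None for an item.
sport_tree = {
    "Swimming": (60.0, {"Flippers": (20.0, None), "Swim suit": (40.0, None)}),
    "Skiing": (40.0, {"Ski boots": (30.0, None), "Ski": (10.0, None)}),
}

probabilities = leaf_probabilities(sport_tree)
path, prob = draw_item(sport_tree, random.Random(7))
```

When the turnover figures are consistent across levels, the product of the level-by-level probabilities equals the item's share of total turnover (here 0.2, 0.4, 0.3 and 0.1), so the interviewer never needs the full list of items in the outlet.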


[Figure: example of a hierarchical tree classification for TP = Equipment for Sport. The tree branches into sports (Swimming, Water Polo, Downhill skiing, Cross-country skiing, Body Building, Tennis, Football, Other Sports), with elementary products as leaves (e.g. flippers, masks, swim suit, water polo equipment, ski boots, ski, snow suit, tennis racket, sportswear, short trousers, undershirt, sneakers, gloves).]


Item Selection (2)

– At each level, the item selection procedure uses inclusion probabilities defined on the basis of information available in the sampled outlet or available a priori as auxiliary information.

– The optimal situation would occur if the probability used for each unit at a given level were proportional to its turnover, relative to the total turnover in the outlet of the set of units among which the selection is carried out at that level.

– Probability selection makes it possible to define unbiased estimators

– The efficiency of the estimates depends on the kind of selection probabilities used

4. Sampling frame and design (i)


Final inclusion probabilities

The sampling scheme is implemented giving the items an inclusion probability proportional to the ratio between the item turnover and the overall turnover, at the d-th TP and area a level:

$$ \pi_{dj,c,v} = j_{d(a)}\, \frac{F^{0}_{dj,c,v}}{F^{0}_{d\cdot,(a)}} $$

where $j_{d(a)}$ is a constant for the d-th TP in area a. This expression shows that the proposed sample design is approximately self-weighting

4. Sampling frame and design (l)
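The self-weighting property can be made explicit with a short derivation. It is a sketch under two assumptions: the weight of an item is its share of total base-period turnover, $w_{dj,c,v} = F^{0}_{dj,c,v} / F^{0}_{\cdot\cdot,\cdot,\cdot}$, and its inclusion probability has the form $\pi_{dj,c,v} = j_{d(a)}\, F^{0}_{dj,c,v} / F^{0}_{d\cdot,(a)}$, with $j_{d(a)}$ a constant for type of product d in area a and $F^{0}_{d\cdot,(a)}$ the turnover of that type of product in the area.

```latex
\frac{w_{dj,c,v}}{\pi_{dj,c,v}}
  = \frac{F^{0}_{dj,c,v} \,/\, F^{0}_{\cdot\cdot,\cdot,\cdot}}
         {j_{d(a)}\, F^{0}_{dj,c,v} \,/\, F^{0}_{d\cdot,(a)}}
  = \frac{F^{0}_{d\cdot,(a)}}{j_{d(a)}\, F^{0}_{\cdot\cdot,\cdot,\cdot}}
```

The item turnover cancels, so the ratio does not depend on j, c or v: every sampled item of the same type of product in the same area enters the estimate with the same factor. The property holds only approximately in practice because the selection relies on imputed turnover values.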


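In the estimation phase it is useful to express the weight with the following factorisation:

$$ w_{dj,c,v} = k_{\cdot\cdot,c,v}\; k_{d\cdot,c,v}\; k_{dj,c,v} = \frac{F^{0}_{\cdot\cdot,c,v}}{F^{0}_{\cdot\cdot,\cdot,\cdot}} \cdot \frac{F^{0}_{d\cdot,c,v}}{F^{0}_{\cdot\cdot,c,v}} \cdot \frac{F^{0}_{dj,c,v}}{F^{0}_{d\cdot,c,v}} $$

Therefore, a proxy observable value of this weight can be calculated as

$$ \tilde w_{dj,c,v} = \tilde k_{\cdot\cdot,c,v}\; \tilde k_{d\cdot,c,v}\; \tilde k_{dj,c,v} $$

where $\tilde k_{\cdot\cdot,c,v}$, $\tilde k_{d\cdot,c,v}$ and $\tilde k_{dj,c,v}$ are respectively the imputed values of $k_{\cdot\cdot,c,v}$, $k_{d\cdot,c,v}$ and $k_{dj,c,v}$

5. Estimation method (a)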
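The factorisation can be checked numerically. The sketch below assumes that turnover is available for every item, keyed by (d, j, c, v), and builds the weight as the product of three shares: the outlet's share of total turnover, the type of product's share of the outlet's turnover, and the item's share of the turnover of that type of product in the outlet. In the survey the three factors would be imputed or observed separately; the figures here are hypothetical.

```python
from collections import defaultdict


def factorised_weights(turnover):
    """Weight of each item as the product of three turnover shares."""
    total = sum(turnover.values())
    outlet_total = defaultdict(float)   # turnover of outlet (c, v)
    product_total = defaultdict(float)  # turnover of product d in outlet (c, v)
    for (d, j, c, v), f in turnover.items():
        outlet_total[(c, v)] += f
        product_total[(d, c, v)] += f

    weights = {}
    for (d, j, c, v), f in turnover.items():
        k_outlet = outlet_total[(c, v)] / total
        k_product = product_total[(d, c, v)] / outlet_total[(c, v)]
        k_item = f / product_total[(d, c, v)]
        weights[(d, j, c, v)] = k_outlet * k_product * k_item
    return weights


# Hypothetical base-period turnover, keyed by (d, j, c, v).
turnover = {
    ("bread", "loaf", "c1", "v1"): 30.0,
    ("bread", "roll", "c1", "v1"): 10.0,
    ("rice", "1kg", "c1", "v1"): 20.0,
    ("bread", "loaf", "c2", "v1"): 40.0,
}
weights = factorised_weights(turnover)
```

The product of the three shares collapses to the item's share of total turnover, so the weights sum to one; the advantage of the factorised form is that each factor can be approximated from a different source.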


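The general index estimate can be obtained by means of the generalised regression estimator proposed by Valliant (1999), based on the model

$$ r_{dj,c,v} = \mathbf{x}'_{dj,c,v}\, \boldsymbol{\beta} + \varepsilon_{dj,c,v} $$

The expression of the estimator is

$$ \tilde I^{0,1} = \hat I^{0,1}_{W} + \left( \mathbf{X} - \hat{\mathbf{X}}_{W} \right)' \hat{\boldsymbol{\beta}}, \qquad \hat I^{0,1}_{W} = \sum_{a=1}^{A} \sum_{d=1}^{D} \sum_{c=1}^{n_{(a)}} \sum_{v} \sum_{j} \frac{\tilde w_{dj,c,v}\; r_{dj,c,v}}{\pi_{dj,c,v}} $$

where the sums over v and j run over the sampled outlets and items.

In this way the sample estimates of the auxiliary totals equal the population totals, known or estimated from external sources (Household Budget Survey, National Accounts).

5. Estimation method (b)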
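The mechanics of a generalised regression (GREG) estimator can be illustrated in a deliberately simplified setting. The sketch below uses a single auxiliary variable and a regression through the origin, which is an assumption made for brevity and not the specification of Valliant (1999). Here `y` stands for the weighted price relative of each sampled item, `x` for an auxiliary value whose population total `x_total` is known, and `pi` for the inclusion probabilities; all names and figures are hypothetical.

```python
def greg_total(y, x, pi, x_total):
    """GREG estimate of the total of y with one auxiliary variable x.

    Horvitz-Thompson estimate of y, plus a regression adjustment for the gap
    between the known total of x and its Horvitz-Thompson estimate.
    """
    d = [1.0 / p for p in pi]  # design weights
    ht_y = sum(dk * yk for dk, yk in zip(d, y))
    ht_x = sum(dk * xk for dk, xk in zip(d, x))
    beta = (
        sum(dk * xk * yk for dk, xk, yk in zip(d, x, y))
        / sum(dk * xk * xk for dk, xk in zip(d, x))
    )
    return ht_y + (x_total - ht_x) * beta


# Hypothetical sample of three items.
y = [0.011, 0.021, 0.032]  # proxy weight times price relative
x = [0.010, 0.020, 0.030]  # auxiliary variable (here, the proxy weight)
pi = [0.10, 0.20, 0.25]    # inclusion probabilities
x_total = 0.40             # known population total of x

estimate = greg_total(y, x, pi, x_total)

# Calibration property: applied to x itself, the estimator returns x_total.
calibrated_x = greg_total(x, x, pi, x_total)
```

The last line shows the property stated on the slide: when the estimator is applied to the auxiliary variable, the sample estimate reproduces the known population total.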


The proposed strategy is consistent with current Italian practice: the sample of elementary items and outlets is updated each year to take into account the rapid changes in the universes of products and outlets. Because outlets and items are selected with permanent random number techniques, the samples can be updated yearly in a simple way while achieving a prefixed rotation rate (Ohlsson, 1995).

Meanwhile, the sample of Local Districts, once selected, remains unchanged for several years. This is justified by cost considerations, connected with the high cost of training the interviewers for the local districts, and by the fact that the structure of local districts changes very slowly over time.

6. Concluding remarks (a)
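How permanent random numbers can deliver a prefixed rotation rate is sketched below. This is one commonly described approach and an assumption here, since the slides do not detail the mechanism: each unit keeps its random number, the sample consists of the units with the smallest numbers, and each year all numbers are shifted by the same constant (modulo 1), so that part of the old sample leaves and new units enter. Equal selection probabilities and evenly spaced numbers are used only to keep the outcome easy to follow.

```python
def smallest_prn_sample(prn, n):
    """Select the n units with the smallest permanent random numbers."""
    return set(sorted(prn, key=prn.get)[:n])


def shift_prn(prn, shift):
    """Shift every permanent random number by the same amount, modulo 1."""
    return {unit: (p - shift) % 1.0 for unit, p in prn.items()}


# Evenly spaced numbers instead of random ones, for a predictable example.
prn = {"u%d" % i: (i + 0.5) / 10 for i in range(10)}

year_1 = smallest_prn_sample(prn, 4)
year_2 = smallest_prn_sample(shift_prn(prn, 0.2), 4)
rotation_rate = len(year_1 - year_2) / len(year_1)
```

In this example units u0 and u1 leave the sample and u4 and u5 enter, so half of the sample is rotated; a smaller shift would give a lower rotation rate.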


To verify the feasibility of the proposed probability sampling design, an experimental version of the frame has been implemented for testing various aspects of the sampling strategy. The outcome of the experiments has been encouraging. An experiment on the selection of local districts (which correspond to municipalities) and outlets for the Italian survey has been carried out.

Many other experiments have to be carried out to evaluate: (i) the feasibility and the cost-efficient implementation of the proposed probability sampling strategy; (ii) the quality improvements that can be obtained by adopting the proposed strategy only partially.

6. Concluding remarks (b)