
Dealing with Item Non-response in a Catering Survey

Pauli Ollila, Statistics Finland

Kaija Saarni, Finnish Game and Fisheries Research Institute

Asmo Honkanen, Finnish Game and Fisheries Research Institute


The Finnish Catering Survey

• Studying the use of fish, crawfish, red deer, elk and reindeer in the catering sector during the year 2005.

• Carried out by Finnish Game and Fisheries Research Institute together with the interview organisation of Statistics Finland

• Computer assisted telephone interviews at the beginning of 2006

• Population 14740, sample size 2263, stratification by “portion classes” (7).

• Respondents 1741, unit non-response 498, over-coverage 24


Information on amounts

• The questionnaire was divided into three sections: fish, crawfish and game (red deer, elk, reindeer).

• Among other questions, every section asked for amounts in kilograms, both as totals and broken down by category (type of product, species) and by origin (domestic/imported).

• The amounts in categories could also be given as percentages.


EXAMPLE: Question 8a

Memo (MUISTIO), 30.5.2007

What was the total amount of fish as raw material you used in 2005? ________________ kg

Furthermore, estimate the form in which the fish as raw material was delivered to you. (If you cannot estimate the distribution in kilograms, estimate the proportion of the total in percent.)

Form | kg/year | %
1. Fresh whole/gutted | |
2. Fresh fillet | |
3. Frozen whole/gutted | |
4. Frozen fillet | |
5. Other frozen products | |
6. Prepared | |
7. Canned | |
8. Salted or spiced | |
9. Smoked | |
10. In other form | |
 | | 100 %


The quality of response

• It was obvious that, for various reasons, some respondents could not provide full and exact information for these questions.

• For example, the amounts given in the classifying questions contradicted the overall questions. Further, the questions on domestic and foreign fish gave results that differed from the overall fish consumption question.

• A lot of editing work was carried out at the Finnish Game and Fisheries Research Institute in order to clean the data (e.g. functional deduction between questions) and to convert the percentage information into kilograms.


• Still, some contradictory and insufficient responses, which could not be resolved, were left for statistical processing.

• For example, regarding total kilograms and the sum of kilograms of categories we had:

Question | sums ok | no total / zero total | categ. missing | categ. sum more | categ. sum less | all
Fish | 1270 | 26 | 237 | 96 | 112 | 1741
Foreign fish | 1059 | 93 | 344 | 92 | 153 | 1741
Crawfish | 79 | 1646 | 14 | 2 | 0 | 1741
Red deer, elk, reindeer | 172 | 1560 | 8 | 1 | 0 | 1741

NOTE: Less than 10 % difference between total kilograms and sum of kilograms was allowed in the interview situation.
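The grouping of responses by how the category kilograms relate to the overall total can be sketched as follows. This is a minimal illustration, not the survey's actual editing code: the function name, the data layout and the use of the 10 % interview tolerance as the cut-off for "sums ok" are all assumptions.

```python
def classify_response(total_kg, category_kgs, tolerance=0.10):
    """Classify one response by how its category kilograms relate to its total.

    total_kg     -- reported overall total, or None if missing (hypothetical layout)
    category_kgs -- list of reported category kilograms, None = not reported
    tolerance    -- relative difference still counted as a match (assumed 10 %)
    """
    if total_kg is None or total_kg == 0:
        return "no total / zero total"
    reported = [kg for kg in category_kgs if kg is not None]
    if not reported:
        return "categories missing"
    category_sum = sum(reported)
    # A difference of less than `tolerance` times the total counts as consistent.
    if abs(category_sum - total_kg) < tolerance * total_kg:
        return "sums ok"
    if category_sum > total_kg:
        return "category sum more"
    return "category sum less"


print(classify_response(100, [60, 35]))      # within tolerance
print(classify_response(100, [None, None]))  # full missingness
print(classify_response(100, [50, None]))    # partial information
```

Counting the returned labels over all respondents would give one row of a table like the one on this slide.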


Item non-response

• The most usual case of item non-response: the category kilograms are totally missing although the overall total exists.

• The sum of the existing category kilograms may either exceed or fall below the overall total given in the response.

• In principle the latter alternative can be considered item non-response.

• However, it is not clear how many categories are affected by item non-response, or whether the existing category sums are simply partly erroneous.


How to correct?

• How to treat full missingness of the category sums?

• How to deal with category sums not matching the overall sum (mismatch sums)?

Alternatives for dealing with the problems

• Donor imputation

• Mean imputation

• Regression imputation

• Weight adjustments

The method used in the final statistical processing was chosen from these alternatives, which were considered in the following forms:


Corrections considered: donor imputation

Full missingness of the category sums

- Selecting a donor within a stratum (“portion category”) and applying its percentages to create the imputed values as proportions of the overall total.

- Nearest neighbour class criterion by “number of kitchen staff” and “number of days serving fish”.

Mismatch sums

- For the cases where the category sum is lower than the overall sum it is hard to apply imputation: there is no information on which category or categories should receive the imputed values, and the mismatch may still persist. For the opposite cases imputation is not applicable.

- In order to retain the distribution information on categories, the relations are proportioned up or down with the ratio

\[ r_i = \frac{y_{i,\text{overall}}}{\sum_{\text{category}} y_{i,\text{category}}} \]
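The two corrections on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the field names (`stratum`, `kitchen_staff`, `fish_days`), the data layout and the distance measure (sum of absolute differences) are hypothetical, since the slides name the nearest neighbour criteria but not how they were combined.

```python
def nearest_donor(recipient, candidates, keys=("kitchen_staff", "fish_days")):
    """Pick the closest donor in the recipient's stratum (assumed distance measure)."""
    same_stratum = [c for c in candidates if c["stratum"] == recipient["stratum"]]
    return min(same_stratum,
               key=lambda c: sum(abs(c[k] - recipient[k]) for k in keys))


def impute_from_donor(recipient_total_kg, donor_category_kgs):
    """Full missingness: apply the donor's category shares to the recipient's total."""
    donor_sum = sum(donor_category_kgs.values())
    return {cat: recipient_total_kg * kg / donor_sum
            for cat, kg in donor_category_kgs.items()}


def scale_to_total(total_kg, category_kgs):
    """Mismatch sums: scale existing categories by r_i = total / category sum."""
    r = total_kg / sum(category_kgs.values())
    return {cat: r * kg for cat, kg in category_kgs.items()}


donor = {"fresh": 30.0, "frozen": 70.0}
print(impute_from_donor(200.0, donor))
print(scale_to_total(100.0, {"fresh": 30.0, "frozen": 50.0}))
```

In both functions the category values sum to the overall total afterwards, which is the property the ratio is meant to guarantee.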


Corrections considered: group mean imputation

Full missingness of the category sums

- Using group means of percentages for every amount category. “Portion categories” and “number of days serving fish” were used as groups.

Mismatch sums (as in donor imputation)
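Group mean imputation of percentages can be sketched as below. The data layout (`group`, `shares`) is hypothetical, and the sketch assumes every complete response reports a percentage for every category, so that the group means again sum to 100 %.

```python
from collections import defaultdict


def group_mean_shares(complete_responses):
    """Mean percentage per category within each group.

    complete_responses -- list of dicts with 'group' and 'shares'
                          (category -> percent); hypothetical layout.
    """
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for resp in complete_responses:
        counts[resp["group"]] += 1
        for cat, pct in resp["shares"].items():
            sums[resp["group"]][cat] += pct
    return {group: {cat: s / counts[group] for cat, s in cats.items()}
            for group, cats in sums.items()}


def impute_group_mean(total_kg, group, mean_shares):
    """Turn the group's mean percentages into kilograms of the given total."""
    return {cat: total_kg * pct / 100.0
            for cat, pct in mean_shares[group].items()}


complete = [
    {"group": "A", "shares": {"fresh": 40.0, "frozen": 60.0}},
    {"group": "A", "shares": {"fresh": 60.0, "frozen": 40.0}},
]
means = group_mean_shares(complete)
print(impute_group_mean(200.0, "A", means))
```

Because every recipient in a group receives the same shares, this kind of imputation tends to reduce the variation between units.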

Corrections considered: regression imputation

Full missingness of the category sums

- Using modelling for the percentages in categories. Various auxiliary variables were tried, e.g. “number of kitchen staff” and “number of days serving fish”, separately for “portion categories” (only for those kitchens that had served fish). No better explanatory variables were available for all observations.

Mismatch sums (as in donor imputation)


Corrections considered: weight adjustments

Full missingness of the category sums & mismatch sums

- Correcting the category results by adjusting the weight, separately for the different questions including amounts, with the ratio

\[ \frac{\sum_{i \in s} w_i \, y_{i,\text{overall}}}{\sum_{i \in s} w_i \sum_{\text{category}} y_{i,\text{category}}} \]

i.e. the weighted overall total sum divided by the weighted sum of the category sums.

- Separate weights cause inconsistencies when statistics based on variables with no item non-response are computed with normal weighting in one place and adjusted weighting in another. Practical problems in tabulations and analysis may also occur.
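The weight adjustment ratio can be sketched as follows. The record layout is hypothetical, and the sketch computes one ratio over all respondents for a single question; whether the survey applied the ratio overall or within subgroups is not stated on the slides.

```python
def adjustment_ratio(responses):
    """Weighted overall total divided by the weighted sum of category sums.

    responses -- list of dicts with 'weight', 'total_kg' and 'category_kgs'
                 (list of kilograms, None = not reported); hypothetical layout.
    """
    weighted_total = sum(r["weight"] * r["total_kg"] for r in responses)
    weighted_categories = sum(
        r["weight"] * sum(kg for kg in r["category_kgs"] if kg is not None)
        for r in responses
    )
    return weighted_total / weighted_categories


def adjusted_weights(responses):
    """Question-specific weights: original weight times the common ratio."""
    ratio = adjustment_ratio(responses)
    return [r["weight"] * ratio for r in responses]


sample = [
    {"weight": 10.0, "total_kg": 100.0, "category_kgs": [60.0, 40.0]},
    {"weight": 5.0, "total_kg": 200.0, "category_kgs": [None, None]},
]
print(adjustment_ratio(sample))
print(adjusted_weights(sample))
```

With the adjusted weights, the weighted sum of the reported category kilograms equals the weighted overall total, which is why the slides later describe the method as a kind of calibration.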


Actions at that time

• Due to the lack of time at the estimation phase, the weight adjustments were chosen: a conservative and quick solution in which all the information on amounts was in line with each other (a kind of calibration).

• The purposes of the catering survey were purely descriptive, and studies were made only at the general level and for some simple classes (e.g. region).

• Complex cross-tabulations and analyses were not conducted.

WHAT DID THE SUBSEQUENT TESTS WITH THE CORRECTION ALTERNATIVES REVEAL?


Subsequent test experiences

• The inflating item non-response factor in the weight adjustments varied from 1.00689 to 1.47618.

• Practical choice: mean and regression imputation were conducted for all classes other than the biggest one, which received the value 100 % minus the sum of the other percentage estimates. This ensured that the sum of the other percentage estimates did not exceed 100 %.

• The regression imputation performed so poorly (e.g. negative percentage values) that it was not considered further.

• Only weight adjustment replicates the original distribution of the classification amounts.

• The standard deviations are affected by all methods.
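The practical choice of giving the biggest class the remainder can be sketched as below. The function and category names are hypothetical; the sketch only shows the arithmetic of closing the percentages to 100 %.

```python
def close_with_largest(imputed_pct, largest):
    """Give the largest class 100 % minus the sum of the other imputed percentages.

    imputed_pct -- category -> imputed percent for every class except `largest`
    largest     -- name of the biggest class (hypothetical naming)
    """
    result = dict(imputed_pct)
    result[largest] = 100.0 - sum(imputed_pct.values())
    return result


shares = close_with_largest({"salmon": 15.0, "tuna": 10.0}, "rainbow trout")
print(shares)
```

The sketch does not guard against the other estimates summing to more than 100 %, in which case the largest class would come out negative; negative values of that kind are the sort of problem reported for the regression approach.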


The inconsistency problem with weight adjustments (example: portion classes)

Each cell shows the estimated number of kitchens and, in parentheses, its percentage of the row total; totals are rounded to integers.

Weights | 1-49 | 50-99 | 100-199 | 200-499 | 500-999 | 1000- | all
Original | 4593 (31.16) | 3522 (23.89) | 2986 (20.26) | 2317 (15.72) | 923 (6.26) | 399 (2.71) | 14740
Fish | 4845 (32.22) | 3716 (24.71) | 2858 (19.00) | 2268 (15.08) | 926 (6.16) | 427 (2.84) | 15039
Imported fish | 5147 (33.15) | 3847 (24.77) | 2931 (18.87) | 2261 (14.56) | 920 (5.93) | 422 (2.72) | 15529
Species of foreign fish | 5089 (32.63) | 3820 (24.49) | 3020 (19.37) | 2288 (14.67) | 954 (6.12) | 425 (2.73) | 15596
Domestic fish | 5969 (33.04) | 4431 (24.52) | 3473 (19.22) | 2670 (14.78) | 1052 (5.82) | 473 (2.62) | 18069
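The size of the inconsistency can be read off the "all" column by dividing each adjusted total by the original one. The short sketch below does this arithmetic with the figures from the table; interpreting the "all" column as the estimated population size under each set of weights is an assumption, and the variable names are illustrative.

```python
# "all" column of the table: original weights versus question-specific weights.
ORIGINAL_ALL = 14740
ADJUSTED_ALL = {
    "Fish": 15039,
    "Imported fish": 15529,
    "Species of foreign fish": 15596,
    "Domestic fish": 18069,
}


def inflation_factors(original_all, adjusted_all):
    """Ratio of each adjusted total to the total under the original weights."""
    return {question: total / original_all
            for question, total in adjusted_all.items()}


factors = inflation_factors(ORIGINAL_ALL, ADJUSTED_ALL)
for question, factor in factors.items():
    print(f"{question}: {factor:.4f}")
```

A variable with no item non-response, such as the portion class, therefore gets a different estimated total depending on which question's weights are used.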


The distribution problem (example: species of fish, overall total 14036226)

Each method column shows kilograms followed by the percentage of the column total.

Species | no correction | weight adjustment | donor imputation | group mean imputation
Salmon | 1777326 (15.19) | 2131542 (15.19) | 2357228 (16.76) | 2086781 (14.87)
Rainbow trout | 3348190 (28.61) | 4015476 (28.61) | 3778266 (26.86) | 3905681 (27.83)
Baltic herring | 936373 (8.00) | 1122990 (8.00) | 1119551 (7.96) | 1144067 (8.15)
European whitefish | 289291 (2.47) | 346946 (2.47) | 338279 (2.40) | 335130 (2.39)
Pikeperch | 282990 (2.42) | 339389 (2.42) | 338290 (2.40) | 318147 (2.27)
Vendace | 184875 (1.58) | 221720 (1.58) | 214977 (1.53) | 217264 (1.55)
Perch | 143687 (1.23) | 172324 (1.23) | 164907 (1.17) | 161378 (1.15)
Herring | 208967 (1.79) | 250613 (1.79) | 242182 (1.72) | 245398 (1.74)
Cod and other whitefish | 2989394 (25.54) | 3585172 (25.54) | 3688276 (26.22) | 3820289 (27.22)
Tuna | 1224018 (10.46) | 1467962 (10.46) | 1445786 (10.28) | 1429165 (10.18)
Other | 318597 (2.72) | 382093 (2.72) | 348485 (2.48) | 372927 (2.66)
Total | 11703710 (100.0) | 14036226 (100.0) | 14036226 (100.0) | 14036226 (100.0)
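The reason the weight adjustment column keeps the uncorrected percentages can be checked with the table's own figures: every category is multiplied by the same factor, the overall total divided by the uncorrected category total. The sketch below uses three rows of the table; the names are illustrative only.

```python
# Kilograms without correction for three species, taken from the table.
NO_CORRECTION_KG = {"Salmon": 1777326, "Rainbow trout": 3348190, "Tuna": 1224018}
NO_CORRECTION_TOTAL = 11703710
OVERALL_TOTAL = 14036226


def scale_by_common_factor(category_kg, category_total, overall_total):
    """Multiply every category by overall_total / category_total."""
    factor = overall_total / category_total
    return factor, {name: kg * factor for name, kg in category_kg.items()}


factor, scaled = scale_by_common_factor(
    NO_CORRECTION_KG, NO_CORRECTION_TOTAL, OVERALL_TOTAL
)
print(f"common factor: {factor:.5f}")
for name, kg in scaled.items():
    share = 100.0 * kg / OVERALL_TOTAL
    print(f"{name}: {kg:.0f} kg, {share:.2f} %")
```

The scaled kilograms agree with the weight adjustment column to within a few kilograms, and the shares are unchanged because a common factor cancels out of every percentage.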


Weighted standard deviation changes (example: species of fish)

Species | respondents without correction | weight adjustments | donor imputation | regression imputation
Salmon | 1220 | 1336 | 1673 | 1293
Rainbow trout | 5324 | 5830 | 5346 | 5348
Baltic herring | 630 | 690 | 700 | 695
European whitefish | 278 | 305 | 305 | 288
Pikeperch | 399 | 437 | 424 | 409
Vendace | 203 | 222 | 214 | 207
Perch | 181 | 198 | 192 | 182
Herring | 229 | 251 | 241 | 236
Cod and other whitefish | 2103 | 2304 | 929 | 2305
Tuna | 867 | 950 | 528 | 889
Other | 524 | 574 | 73 | 528


Conclusions

• The inconsistency level of the weight adjustment method was not serious.

• Both donor and mean imputation had a slight, but not remarkable, effect on the distribution of amounts.

• It is clear that the weighted standard deviations were inflated by the weight adjustments, but donor imputation tended to produce standard deviations that varied more between amount categories. As expected, mean imputation had a diminishing effect on variation.

• Current recommendation: the Banff package for statistical editing and imputation (by Statistics Canada, built in the SAS environment).