

+ Models

APPET-588; No of Pages 7

Research report

Test–retest reliability and external validity of the previous day food questionnaire for 7–10-year-old school children

M.A.A. de Assis a,*, E. Kupek b, D. Guimarães a, M.C.M. Calvo b, D.F. de Andrade c, F. Bellisle d

a Post-Graduate Program in Nutrition, Centro de Ciências da Saúde, Universidade Federal de Santa Catarina, Campus Universitário – Trindade, Florianópolis 88040-900, Santa Catarina, Brazil

b Department of Public Health, Federal University of Santa Catarina, Florianópolis, SC, Brazil

c Department of Informatics and Statistics, Federal University of Santa Catarina, Florianópolis, SC, Brazil

d CRNH Ile-de-France, INRA, SMBH Université Paris13, Paris, France

Received 23 April 2007; received in revised form 15 February 2008; accepted 15 February 2008

Abstract

The aim of this study was to assess the reliability and external validity of the Previous Day Food Questionnaire (PDFQ), designed to obtain a report of the foods eaten on the previous day by schoolchildren. Participants were 7–10-year-old school children in the first four grades of a public school in Southern Brazil (N = 227). Test–retest reliability was evaluated by the kappa coefficient for two administrations of the PDFQ on the same day to the same children. External validity of the PDFQ was evaluated via sensitivity, specificity, and positive and negative predictive values (PPV and NPV, respectively), using trained observers' records of the food eaten on the previous day as the gold standard. The association between observed food intake and food intake reported on the PDFQ was evaluated by multivariate logistic regression, controlled for school grade, gender, time of eating, and the variation between the first and second PDFQ applications. For the reliability study, the analyses stratified by school eating occasion (three a day) indicated that the level of agreement was moderate or better for all food categories. The PDFQ's sensitivity ranged from 57.1% (vegetables) to 93.3% (rice), whereas its specificity ranged from 77.8% (bread/pasta) to 98% (meats). Both PPV and NPV were reasonably high. PDFQ responses were highly associated with observed food intake, with an effect magnitude several times larger than that of any other factor analyzed for all foods. The PDFQ also showed good test–retest reliability, suggesting that it may generate reliable and valid data for assessing food intake at the group (school) level.

© 2008 Elsevier Ltd. Published by Elsevier Ltd. All rights reserved.

Keywords: External validity; Reliability; Sensitivity; Specificity; Food questionnaire; Schoolchildren

www.elsevier.com/locate/appet

Available online at www.sciencedirect.com

Appetite xxx (2008) xxx–xxx

* Corresponding author. E-mail address: [email protected] (M.A.A. de Assis).

0195-6663/$ – see front matter © 2008 Elsevier Ltd. Published by Elsevier Ltd.

doi:10.1016/j.appet.2008.02.014

Introduction

The use of food questionnaires for children and adolescents in nutritional epidemiology has increased steadily in the last two decades (Livingstone, Robson, & Wallace, 2004; McPherson, Hoelscher, Alexander, Scanlon, & Serdula, 2000), but their reliability and external validity have rarely been investigated using the standard parameters of diagnostic performance that are well established in clinical epidemiology, i.e. sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV). Recently, some of these parameters have been introduced in the nutrition literature under different names, such as "omission rate" for false-negative results and "intrusion rate" for false-positive results (Baranowski et al., 2002; Baxter, Thompson, Davis, & Johnson, 1997), using trained observer report as a gold standard.

The variability of the gold standard itself across meals, types of food, and specific groups of consumers with varying food recall accuracy represents an additional difficulty for validation studies in this area. For example, some foods are rarely consumed for breakfast as opposed to lunch; also, children may recall food items less accurately than adults. These factors may affect the ability of a trained observer to correctly identify the food items consumed and therefore influence the external criterion, i.e. the very gold standard against which the food questionnaires are judged. In other words, the variability of gold standards can be seen as a reliability component influencing external validity. The impact of this component can be quantified by a test–retest methodology applied to varied situations and/or groups of consumers.

To the best of our knowledge, the complex relationship between reliability and validity discussed above has not yet been quantified for a food questionnaire. The broader aim of this work is to stimulate discussion on these important methodological issues by providing an example of quantifying the test–retest reliability of the Previous Day Food Questionnaire (PDFQ) and its impact on external validity. Our specific research question was: Is the PDFQ a valid tool reflecting food intake among first-to-fourth-grade Brazilian schoolchildren? Direct observation was used as a gold standard on three different school eating occasions.

Fig. 1. The previous day food questionnaire.

Methods

Development of the questionnaire

The PDFQ was designed to provide a brief, inexpensive, and easily administered instrument for assessing the eating patterns of elementary schoolchildren from Santa Catarina State (southern Brazil). The PDFQ was developed by an expert interdisciplinary team including nutritionists, classroom teachers, and pedagogic and communication specialists. They considered the cognitive skills that children aged 7–10 y need to respond to the questionnaire (Baranowski & Domel, 1994) and reviewed reliability and validity studies to determine the format, administration and observation protocols (Baranowski et al., 2002; Baxter, Thompson, & Davis, 2000; Baxter et al., 1997; Koehler et al., 2000; Smith et al., 2001). A pilot test of the first version was performed with 69 first-to-fourth-grade elementary pupils from a public school in order to assess their ability to complete the questionnaire and to adjust the administration and observation procedures, as well as the list and grouping of food items.

The second version of the PDFQ, used in the present study, consists of a single-day recall procedure to be administered in the classroom by a trained teacher. The PDFQ is structured in three pages. Five daily eating occasions are chronologically ordered and illustrated (breakfast and mid-morning snack on the first page, lunch and afternoon snack on the second page, dinner and selected food preferences on the third page). Each eating occasion is illustrated with 21 selected foods or food groups, including 13 individual food items (dried beans, rice, cheese, beef, poultry, pasta, crackers, bread, French fries, pizza, hamburger, eggs, yoghurt), four food groups (fruits, vegetables, sweets, fish/sea foods) and four beverages (milk, chocolate milk, soft drinks and fruit juices) (Fig. 1). The foods and food groups were chosen to take into account the food patterns of children in this age group, the foods offered on school menus, and the foods recommended in the guidelines for the Brazilian population (Ministério da Saúde, 2005).

Participants

For the present study, a convenience sample of pupils was selected from first-to-fourth-grade elementary classes in a public school in a city in the federal state of Santa Catarina. The pupils were recruited from all three third-grade and both fourth-grade classes in June 2005, and from both first-grade and both second-grade classes in October 2006. Each class had between 25 and 30 pupils.

This school had not participated in the development phase of the questionnaire and was the only full-time school in the region that offered three eating occasions (mid-morning snack, lunch and afternoon snack) and did not allow children to bring any food from home. Food items typically available on the school menus for morning and afternoon snacks included milk and milk products, cakes, crackers, bread with spreads (margarine, cheese, and jelly) and fruits. Lunch meals included one or two kinds of salad (mixtures of greens and vegetables), eggs, meats (beef, poultry) or fish/sea foods, rice, pasta, dried beans, soy protein, cassava flour, potatoes and a dessert (fruits or sweets). Children could freely select specific items at both snacks and lunch.

Parents completed a questionnaire with sociodemographic information allowing classification of socioeconomic status according to the Brazil Criterion for Economic Classification (ABEP, 2000), which takes into account both household assets and the education of the household head. Families were classified from class A (wealthiest) to E (poorest).

Children gave their oral consent and their parents signed a written consent for participation in this study. No financial reward was offered. Approval for the study was granted by the Ethics Committee of the Federal University of Santa Catarina, Brazil.

Observations

Quality assurance activities before data collection (March to April 2005) comprised training of field staff, development of a protocol for observing school meals, and assessment of the agreement among all observers in recording the types of food and beverages commonly served in school lunches. The training phase took place in the same school as data collection, with children from the first and second grades who were not participating in the main research. To assess agreement, the food choices of 30 children from a second-grade class were observed during school lunches over 6 days by an experienced observer and five dietetics students. No attempt was made to quantify food intake. Tests of agreement between the dietetics students' records and those of the experienced observer showed a mean agreement of 92% for food items eaten during school lunches.

During data collection, the procedures for quality control of observations were identical to those used during the training phase. Agreement was assessed on 9 days in 10 children, during lunch, for the third and fourth grades (June 2005) and for the first and second grades (October 2006). Comparison of the five dietetics students' records with those of the experienced observer revealed 93.2% agreement for food items eaten by children in the first and second grades, and 94.3% for food items eaten by children in the third and fourth grades.

In the main data collection, the observers completed a PDFQ to indicate what individual children were observed eating at the morning snack, lunch, and afternoon snack. Each observer observed five to six children. Each child wore an identification tag to facilitate observation. The observers were positioned near the tables where the pupils were eating, so the pupils knew when they were being observed. However, the pupils were not told that they would complete questionnaires on the next day. Practice observations were conducted with all nine classes in classrooms and school canteens from Monday to Friday during the 2 weeks before the main data collection, in order to allow the children to get used to the presence of the observers and thus decrease reactivity during data collection.

PDFQ administration

A single school teacher was trained to administer the PDFQ according to a set of written instructions. During the training phase (April 2005) the questionnaire was administered four times to two classes of first- and second-grade pupils. Each administration was observed by two dietetics students and was audio-recorded and transcribed. Two principal investigators reviewed the transcriptions and then rated the teacher's performance using the criteria from the quality control checklist for interviews, adapted from Shaffer et al. (2004). The checklist contained 25 items regarding all phases of PDFQ administration and four items on clarity, prompts, cues, and questions asked by the pupils. Response options for each of these items were coded as "adequate", "needs improvement" and "inadequate". In the main research, quality control of PDFQ administration included only the checklist application, performed by two researchers who provided immediate feedback to the teacher. It was conducted in nine classes across grades 1–4 and indicated that the teacher adequately followed the questionnaire instructions.

Administration of the PDFQ begins with a school teacher explaining its purpose and content using a poster-sized version of each of the four pages of the questionnaire. The teacher prompted pupils to think about the previous day with questions such as "Which day of the week was yesterday?", "Did you go to school yesterday?", "How did you go to school yesterday?", "Did you eat breakfast yesterday morning?", "Which foods or beverages illustrated on the page in front of you did you eat/drink for breakfast yesterday?" This set of questions was repeated for each meal illustrated in the questionnaire. The children were instructed to circle the items they had consumed for each meal on the previous day. Two researchers circulated through the classroom to ensure that responses were legible, without interfering with the children's completion of the questionnaire. The PDFQ was administered by a single teacher twice a day (before the mid-morning and afternoon snacks) in nine classes. Each questionnaire took approximately 40 min to complete.

Data analyses

Although the PDFQ included all five daily eating occasions, only food items eaten at school (mid-morning snack, lunch and afternoon snack) were analyzed for accuracy because only these could be validated by observation of food intake.

Foods were combined into 15 categories in an effort to reduce the number of statistical tests needed to analyze the PDFQ results: sweets; bread and pasta; milk and milk products (milk, chocolate milk, yoghurt, and cheese); rice; fruits; vegetables (greens and other vegetables); dried beans; soft drinks; fruit juice; eggs; French fries; meats (beef, poultry); pizza; burgers; and fish and sea foods.

Test–retest reliability was evaluated by the kappa statistic and associated 95% confidence intervals (Szklo & Nieto, 2004). Kappa coefficients were interpreted using the guidelines given by Landis and Koch (1977), i.e. 0.01–0.20 as slight, 0.21–0.40 as fair, 0.41–0.60 as moderate, 0.61–0.80 as substantial, and 0.81–1.00 as almost perfect agreement. The sources of disagreement were further analyzed by multivariate logistic regression, with agreement between the two PDFQ applications (yes/no) as the dependent variable and school grade, gender and time of eating (mid-morning, lunch and afternoon) as independent variables.
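The kappa statistic described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' SPSS procedure: the function name `cohen_kappa`, the example data, and the simple large-sample standard error used for the confidence interval are all assumptions introduced here (the source cites Szklo & Nieto, 2004, but does not state which variance formula was applied).

```python
import math

def cohen_kappa(test, retest, z=1.96):
    """Cohen's kappa for two binary (0/1) administrations, with an approximate CI."""
    if len(test) != len(retest) or not test:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(test)
    # Observed agreement: share of children giving the same answer twice.
    p_obs = sum(a == b for a, b in zip(test, retest)) / n
    # Chance agreement, from the marginal proportions of "yes" answers.
    p1, p2 = sum(test) / n, sum(retest) / n
    p_exp = p1 * p2 + (1 - p1) * (1 - p2)
    if p_exp == 1:
        # Happens when nobody (or everybody) reports the food on both occasions.
        raise ValueError("kappa is undefined when chance agreement is 1")
    kappa = (p_obs - p_exp) / (1 - p_exp)
    # Simple large-sample standard error (an assumption of this sketch).
    se = math.sqrt(p_obs * (1 - p_obs) / (n * (1 - p_exp) ** 2))
    return kappa, (kappa - z * se, kappa + z * se)

# Hypothetical data for one food at one eating occasion:
# 40 yes/yes, 10 yes/no, 5 no/yes and 45 no/no answers.
test = [1] * 40 + [1] * 10 + [0] * 5 + [0] * 45
retest = [1] * 40 + [0] * 10 + [1] * 5 + [0] * 45
kappa, (lo, hi) = cohen_kappa(test, retest)
print(f"kappa = {kappa:.2f} (95% CI {lo:.2f}, {hi:.2f})")
```

The undefined case raised in the function corresponds to foods that are almost never reported at a given eating occasion, which is one reason a kappa may not be computable for rarely consumed items.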

For validity testing, responses on the PDFQ (first application) were compared food by food, in each school eating occasion, with observed consumption for each child separately. True-positive, true-negative, false-positive, and false-negative values were calculated for each food category. External validity was evaluated via sensitivity, specificity, PPV, NPV, false-positive and false-negative rates, with 95% confidence intervals (Szklo & Nieto, 2004), using observers' responses as the gold standard for each food or food group included in the analysis.

Table 1

PDFQ test–retest reliability for reported food consumption by eating occasion (N = 227)

| Food category | Mid-morning meal: % consumers (test) | Mid-morning meal: % consumers (retest) | Mid-morning meal: kappa (95% CI) | Lunch: % consumers (test) | Lunch: % consumers (retest) | Lunch: kappa (95% CI) | Afternoon snack: % consumers (test) | Afternoon snack: % consumers (retest) | Afternoon snack: kappa (95% CI) |
|---|---|---|---|---|---|---|---|---|---|
| Sweets | 32.6 | 32.6 | 0.74 (0.65, 0.83)* | 1.3 | 1.3 | a | 59.0 | 55.1 | 0.60 (0.49, 0.70)* |
| Bread/pasta | 53.7 | 59.0 | 0.68 (0.58, 0.77)* | 37.4 | 41.0 | 0.69 (0.59, 0.78)* | 39.6 | 40.1 | 0.59 (0.48, 0.69)* |
| Milk/milk products | 74.4 | 73.6 | 0.70 (0.60, 0.81)* | 1.8 | 0.0 | a | 43.6 | 50.7 | 0.70 (0.61, 0.79)* |
| Rice | 0.4 | 1.3 | a | 75.8 | 74.0 | 0.77 (0.67, 0.86)* | 1.3 | 0.0 | a |
| Fruits | 2.2 | 6.6 | a | 17.6 | 16.7 | 0.54 (0.39, 0.68)* | 32.6 | 27.3 | 0.81 (0.73, 0.89)* |
| Vegetables | 0.0 | 0.0 | a | 32.6 | 27.8 | 0.55 (0.43, 0.67)* | 0.0 | 0.0 | a |
| Dried beans | 0.0 | 0.4 | a | 52.4 | 54.2 | 0.77 (0.69, 0.85)* | 0.4 | 0.0 | a |
| Meat, poultry | 0.9 | 0.4 | a | 70.0 | 69.2 | 0.60 (0.49, 0.72)* | 0.0 | 0.0 | a |

Abbreviations: PDFQ, previous day food questionnaire; CI, confidence interval. * P < 0.001 for all kappa values. a: No statistics were computed because of low reported consumption.

Sensitivity was defined as the proportion of observations of consuming a food item that were also reported in the PDFQ (true positives divided by the sum of true positives and false negatives); specificity was the proportion of observations of not consuming a food item that were also not reported in the PDFQ (true negatives divided by the sum of true negatives and false positives); PPV was the proportion of reports of consuming a food item in the PDFQ that were confirmed by observation (true positives divided by the sum of true positives and false positives); NPV was the proportion of reports of not consuming a food item in the PDFQ that were confirmed by observation (true negatives divided by the sum of true negatives and false negatives). The false-negative rate was the proportion of observed intakes of a food item that were not reported in the PDFQ (false negatives divided by the sum of false negatives and true positives). The false-positive rate was the proportion of occasions on which intake was not observed but the food item was nevertheless reported in the PDFQ (false positives divided by the sum of false positives and true negatives).
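The six validity measures defined above all follow from the four cells of a 2 × 2 table, as the sketch below shows. The function names, the use of a Wald (normal-approximation) confidence interval, and the cell counts are assumptions made for illustration: the paper does not report cell counts, and the counts used here were simply chosen so that they sum to 681 reports and reproduce the rice row of Table 2.

```python
import math

def proportion_ci(successes, total, z=1.96):
    """Proportion with a Wald (normal-approximation) confidence interval."""
    p = successes / total
    half = z * math.sqrt(p * (1 - p) / total)
    return p, max(0.0, p - half), min(1.0, p + half)

def validity_measures(tp, fn, fp, tn):
    """Validity of a report (PDFQ) against observation (the gold standard)."""
    return {
        "sensitivity": proportion_ci(tp, tp + fn),
        "specificity": proportion_ci(tn, tn + fp),
        "false_negative_rate": proportion_ci(fn, fn + tp),
        "false_positive_rate": proportion_ci(fp, fp + tn),
        "ppv": proportion_ci(tp, tp + fp),
        "npv": proportion_ci(tn, tn + fn),
    }

# Illustrative counts, NOT reported in the paper: chosen to total 681 reports
# and to be consistent with the percentages given for rice in Table 2.
measures = validity_measures(tp=154, fn=11, fp=22, tn=494)
for name, (p, lo, hi) in measures.items():
    print(f"{name}: {100 * p:.1f}% ({100 * lo:.1f}, {100 * hi:.1f})")
```

Note that the false-negative rate is the complement of sensitivity and the false-positive rate is the complement of specificity, so only four of the six measures carry independent information.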

Multivariate logistic regression was used to analyze the association between observed food intake and food intake reported on the PDFQ, and also to examine the sources of variation of the gold standard itself. Observed food intake (yes/no) entered the model as the dependent variable, with PDFQ report (first application), school grade (children in the first and second grades combined versus children in the third and fourth grades combined), gender, time of eating (mid-morning, lunch and afternoon), and variation between the first and second PDFQ applications as independent variables.
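In symbols, the model described above can be sketched as follows for report i of a given food. The notation is introduced here for illustration and is not taken from the source; in particular, coding time of eating as two indicators with the mid-morning meal as the reference level is an assumption, chosen because it matches the contrasts reported alongside the model.

```latex
\operatorname{logit}\Pr(\text{observed}_i = 1)
  = \beta_0
  + \beta_1\,\text{PDFQ}_i
  + \beta_2\,\text{grade34}_i
  + \beta_3\,\text{girl}_i
  + \beta_4\,\text{lunch}_i
  + \beta_5\,\text{afternoon}_i
  + \beta_6\,\text{discordant}_i,
\qquad
\mathrm{OR}_k = e^{\beta_k}
```

Here each covariate is a 0/1 indicator (for example, discordant = 1 when the retest answer differs from the test answer), and the odds ratio for a factor is the exponential of its coefficient, adjusted for the other factors in the model.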

Data were analyzed using the Statistical Package for Social Sciences (SPSS Version 10.0 for Windows, 1998, SPSS Inc., Chicago, IL, USA).

Results

Response rates and sample characteristics

Of the 265 children invited to participate, 263 provided both child and parental consent. Data from 34 children were excluded because they were absent when the reliability or validity tests were carried out. The final analytic sample was composed of 227 children (participation rate 85.7%), of whom 39 were first-grade (53.8% girls), 56 second-grade (57.1% boys), 72 third-grade (58.3% boys) and 60 fourth-grade (56.7% boys) pupils. The mean age was 8.9 y (S.D. = 1.0 y) for the overall sample, 8.1 y (S.D. = 0.7 y) for children in the first and second grades, and 9.4 y (S.D. = 0.8 y) for children in the third and fourth grades. The pupils were predominantly from families classified in the D and E economic groups (93.4%), indicating low socioeconomic status.


Reliability

Analyses of the test–retest reliability of the food categories consumed at each school eating occasion, for all children, are summarized in Table 1. Kappa coefficients could not be computed for seven items with no or very low consumption reported on the PDFQ test and retest (pizza, hamburger, soft drinks, fruit juices, fish and sea foods, eggs, and French fries). The analyses stratified by school eating occasion indicated that the level of agreement was moderate or higher for all food categories.
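The statement that agreement was moderate or higher can be checked directly against the kappa point estimates in Table 1. The sketch below transcribes those 13 values and labels each with the Landis and Koch bands quoted in the data analysis section; the function name and the choice to treat each band's upper bound as inclusive are assumptions of this illustration.

```python
from collections import Counter

# Kappa point estimates transcribed from Table 1 (cells marked "a" are omitted).
TABLE1_KAPPA = {
    ("sweets", "mid-morning"): 0.74,
    ("sweets", "afternoon"): 0.60,
    ("bread/pasta", "mid-morning"): 0.68,
    ("bread/pasta", "lunch"): 0.69,
    ("bread/pasta", "afternoon"): 0.59,
    ("milk/milk products", "mid-morning"): 0.70,
    ("milk/milk products", "afternoon"): 0.70,
    ("rice", "lunch"): 0.77,
    ("fruits", "lunch"): 0.54,
    ("fruits", "afternoon"): 0.81,
    ("vegetables", "lunch"): 0.55,
    ("dried beans", "lunch"): 0.77,
    ("meat/poultry", "lunch"): 0.60,
}

def landis_koch(kappa):
    """Label a kappa value with the Landis and Koch (1977) bands quoted in the text."""
    if kappa > 0.80:
        return "almost perfect"
    if kappa > 0.60:
        return "substantial"
    if kappa > 0.40:
        return "moderate"
    if kappa > 0.20:
        return "fair"
    if kappa > 0.0:
        return "slight"
    return "no band quoted in the text"

labels = {key: landis_koch(value) for key, value in TABLE1_KAPPA.items()}
counts = Counter(labels.values())
for (food, occasion), label in labels.items():
    print(f"{food} at {occasion}: {label}")
print(dict(counts))
```

With these values, no cell falls below the moderate band, which is consistent with the summary given in the text.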

Disagreements between PDFQ test and retest were predominantly associated with the frequency of food intake, which in turn was related to the time of day at which the meal was taken. For example, for foods typically consumed at a Brazilian lunch, such as rice, dried beans and meat, discrepancies were much more likely to occur at lunch time than at the mid-morning meal. Odds ratios (95% confidence intervals) for the lunch versus mid-morning meal effect were 11.04 (2.54, 47.91) for rice, 31.10 (4.16, 232.62) for dried beans and 46.00 (6.25, 338.49) for meats in multivariate logistic regression also adjusted for school grade and gender; other effects were not statistically significant (details not shown).

External validity

The PDFQ showed good overall performance across the combined school eating occasions, except for vegetables, where sensitivity was below 60% (Table 2). Specificity (the probability of correctly not reporting intake of a food) was generally higher than sensitivity (the probability of correctly reporting intake of a food). For example, the sensitivity of 64.6% for sweets means that the PDFQ correctly detected approximately 2 out of 3 occasions on which sweets were consumed; in comparison, fewer than 6 out of 10 occasions on which vegetables were consumed were correctly detected. The false-negative rates (omission rates) ranged from 6.7% (rice) to 42.9% (vegetables) and the false-positive rates (intrusion rates) from 2% (meat, poultry) to 22.2% (bread, pasta). Both PPV (the probability that a food reported on the PDFQ was really eaten) and NPV (the probability that a food not reported on the PDFQ was not eaten) were reasonably high (59–98%).


Table 2

Sensitivity, specificity, false-positive and false-negative rates, and predictive values of PDFQ by food item, in three combined school eating occasions, for all children (3 meals × 227 children = 681 reports). All values are % (95% CI).

| Food item | Sensitivity* | Specificity* | False-negative rate* | False-positive rate* | PPV* | NPV* |
|---|---|---|---|---|---|---|
| Sweets | 64.6 (58.9, 70.3) | 91.6 (89.0, 94.3) | 35.4 (29.7, 41.1) | 8.4 (5.7, 11.0) | 83.9 (78.9, 88.8) | 79.4 (75.7, 83.0) |
| Bread/pasta | 76.8 (71.7, 81.8) | 77.8 (73.8, 81.8) | 23.2 (18.2, 28.3) | 22.2 (18.2, 26.2) | 69.0 (63.8, 74.3) | 83.9 (80.2, 87.5) |
| Milk/milk products | 83.9 (79.2, 88.6) | 83.4 (79.9, 86.8) | 16.1 (11.4, 20.8) | 16.6 (13.2, 20.1) | 72.8 (67.5, 78.1) | 90.7 (87.9, 93.5) |
| Rice | 93.3 (89.5, 97.1) | 95.7 (94.0, 97.5) | 6.7 (2.9, 10.5) | 4.3 (2.5, 6.0) | 87.5 (82.6, 92.4) | 97.8 (96.5, 99.1) |
| Fruits | 85.0 (78.3, 91.8) | 95.1 (93.4, 96.9) | 15.0 (8.2, 21.7) | 4.9 (3.1, 6.6) | 76.5 (68.8, 84.1) | 97.2 (95.8, 98.5) |
| Vegetables | 57.1 (46.1, 68.2) | 95.0 (93.3, 96.8) | 42.9 (31.8, 53.9) | 5.0 (3.2, 6.7) | 59.5 (48.3, 70.6) | 94.6 (92.8, 96.4) |
| Dried beans | 86.2 (79.2, 93.1) | 93.4 (91.3, 95.4) | 13.8 (6.9, 20.8) | 6.6 (4.6, 8.7) | 67.5 (59.1, 75.9) | 97.7 (96.4, 98.9) |
| Meat/poultry | 84.4 (79.0, 89.7) | 98.0 (96.8, 99.2) | 15.6 (10.3, 21.0) | 2.0 (0.8, 3.2) | 93.8 (90.1, 97.5) | 94.6 (92.7, 96.6) |

Abbreviations: PDFQ, previous day food questionnaire; CI, confidence interval; PPV, positive predictive value; NPV, negative predictive value. * See the data analyses section for definitions.

Note: No statistics were computed for pizza, hamburger, soft drinks, fruit juice, fish/sea foods, eggs and French fries because of the low frequency of these food categories.

Table 3

Factors associated with observed food intake in multivariate logistic regression: odds ratios and related 95% confidence intervals

| Food item | PDFQ test: OR (95% CI) | P | PDFQ retest different versus equal to test: OR (95% CI) | P | Grades 3 and 4 versus 1 and 2: OR (95% CI) | P | Girls versus boys: OR (95% CI) | P | Afternoon meal versus mid-morning meal: OR (95% CI) | P |
|---|---|---|---|---|---|---|---|---|---|---|
| Sweets | 5.84 (3.56, 9.59) | <0.001 | 1.70 (0.88, 3.28) | NS | 2.11 (1.31, 3.40) | 0.002 | 0.95 (0.60, 1.50) | NS | 4.11 (2.58, 6.55) | <0.001 |
| Bread/pasta | 16.42 (10.67, 25.27) | <0.001 | 2.58 (1.54, 4.31) | <0.001 | 1.54 (1.03, 2.31) | 0.035 | 1.62 (1.08, 2.43) | 0.019 | 0.15 (0.09, 0.25) | <0.001 |
| Milk/milk products | 10.48 (6.22, 17.67) | <0.001 | 2.17 (1.10, 4.28) | 0.025 | 1.06 (0.65, 1.74) | NS | 1.22 (0.77, 1.92) | NS | 0.45 (0.29, 0.70) | <0.001 |
| Rice | 47.29 (18.62, 120.12) | <0.001 | 6.55 (1.44, 29.85) | 0.015 | 0.97 (0.43, 2.20) | NS | 0.95 (0.42, 2.15) | NS | NC | |
| Fruits | 73.04 (35.55, 150.08) | <0.001 | 1.51 (0.57, 4.02) | NS | 0.78 (0.37, 1.63) | NS | 1.31 (0.66, 2.59) | NS | NC | |
| Vegetables | 5.99 (3.10, 11.57) | <0.001 | 0.88 (0.40, 1.91) | NS | 0.61 (0.33, 1.14) | NS | 1.10 (0.60, 2.00) | NS | NC | |
| Dried beans | 18.21 (8.52, 38.90) | <0.001 | 2.16 (0.75, 6.22) | NS | 0.43 (0.21, 0.85) | 0.016 | 0.76 (0.39, 1.50) | NS | NC | |
| Meat/poultry | 33.70 (13.05, 87.05) | <0.001 | 1.40 (0.50, 3.90) | NS | 4.95 (2.03, 12.07) | <0.001 | 1.41 (0.59, 3.35) | NS | NC | |

Abbreviations: PDFQ, previous day food questionnaire; OR, odds ratio; CI, confidence interval; NS, not significant; NC, no consumption.


Responses from children on the PDFQ were highly associated with those from observed
food intake even after adjusting for other factors, with an effect magnitude several
times larger than that of any other factor analyzed for all foods, except for sweets,
where the effect of afternoon meal versus mid-morning meal was of similar magnitude
(Table 3). Children were significantly more likely to consume sweets in the afternoon,
in contrast to bread/pasta and milk/milk products. Gender differences were significant
only for bread/pasta, but this was a relatively small effect. Children in grades 3 and 4
were significantly less likely to report eating dried beans and more likely to report
eating sweets, bread/pasta, and meats, but only the last had a notable effect magnitude.
The difference between the first and second PDFQ administrations was significant only
for rice, bread/pasta, and milk/milk products, and even then the magnitude of this
effect was more than four times lower than the effect of the PDFQ itself (Table 3).
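The "more than four times lower" comparison can be checked by simple arithmetic on the odds ratios in Table 3. The sketch below is only an illustration under stated assumptions: it assumes that the first odds-ratio column of Table 3 is the PDFQ effect and the second is the effect of the second versus first PDFQ administration, and it covers only two of the three foods with a significant retest effect (milk/milk products and rice). The variable names are illustrative and not part of the study.

```python
# Odds ratios as read from Table 3.
# Assumption: the first OR column is the PDFQ effect and the second OR column
# is the retest effect (second vs. first PDFQ administration).
pdfq_or = {"milk/milk products": 10.48, "rice": 47.29}
retest_or = {"milk/milk products": 2.17, "rice": 6.55}

# How many times larger the PDFQ odds ratio is than the retest odds ratio.
or_ratio = {food: pdfq_or[food] / retest_or[food] for food in pdfq_or}

for food, ratio in or_ratio.items():
    print(f"{food}: PDFQ OR is {ratio:.1f} times the retest OR")
```

Under these assumptions both ratios exceed 4 (about 4.8 for milk/milk products and about 7.2 for rice), which agrees with the statement in the text.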

Discussion

This article evaluated the test–retest reliability and external validity of the PDFQ
using direct observation as a gold standard on three different eating occasions in a
school setting. The strengths of this study are the use of trained observers to validate
the PDFQ and the quality control implemented for the questionnaire administration.
Direct observation of eating behavior is considered a better criterion for external
validation than food frequency questionnaires and recalls because it does not rely on
the subject’s memory (Baranowski et al., 2002).

The PDFQ performed well in terms of both external validity and test–retest reliability.
Children were able to identify foods and beverages related to the different eating
occasions (mid-morning snack, lunch and afternoon snack) (Table 1). The foods whose
intake was neither recalled nor observed on the previous day were indeed the items
rarely present on the school menus, such as fish and seafood, fruit juices, eggs, soft
drinks, pizza, hamburgers, and French fries. This can partly be attributed to the policy
of limiting the offering of soft drinks and high-fat ready-to-eat snacks on school menus
in public schools in the state of Santa Catarina.

The pattern of sensitivity, specificity, and predictive values suggests that the PDFQ
identifies single foods more accurately than foods that were components of a mixed dish
or categories that included multiple foods. For example, rice had the highest
sensitivity and the greatest NPV among the eight food items, as well as a high
specificity and a high PPV. In contrast, although the PDFQ performed well in identifying
children who did not consume vegetables and sweets (specificity of 95% for vegetables
and 91.6% for sweets), its ability to identify those who consumed these foods was
moderate (sensitivity of 57.1% for vegetables and 64.6% for sweets). In other words,
the proportion of false negatives was higher than the proportion of false positives,
indicating that more children underestimated than overestimated their consumption of
specific items. This is consistent with other studies reporting that two
instrument-related factors apparently contribute to children’s difficulty in responding
accurately to individual items on food questionnaires: the use of categories of foods
rather than specific foods, and the inclusion of foods in mixed dishes (Hoelscher, Day,
Kelder, & Ward, 2003; Koehler et al., 2000; Smith et al., 2001). In studies that used
observation to validate 24-h recall interviews, the food items most often omitted or
least accurately recalled were ‘added foods’, such as condiments, butter/margarine and
salad dressing (Baxter et al., 1997; Weber et al., 2004), as well as sweets and desserts
(Weber et al., 2004).
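The relationship between the four validity measures discussed above can be illustrated with a small sketch that derives them from a 2x2 table in which direct observation is the gold standard. The counts below are hypothetical: they were chosen only so that sensitivity and specificity come out close to the values reported for vegetables, and they are not the study's data. The function and class names are likewise illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiagnosticMetrics:
    sensitivity: float  # reported among those observed eating the food
    specificity: float  # not reported among those observed not eating it
    ppv: float          # observed eating among those who reported it
    npv: float          # observed not eating among those who did not report it


def diagnostic_metrics(tp: int, fn: int, fp: int, tn: int) -> DiagnosticMetrics:
    """Compute validity measures from 2x2 counts (observation = gold standard)."""
    return DiagnosticMetrics(
        sensitivity=tp / (tp + fn),
        specificity=tn / (tn + fp),
        ppv=tp / (tp + fp),
        npv=tn / (tn + fn),
    )


# Hypothetical counts (NOT from the study), picked to mimic the vegetable item:
# 70 children observed eating the food, 100 observed not eating it.
vegetables = diagnostic_metrics(tp=40, fn=30, fp=5, tn=95)

print(f"sensitivity = {vegetables.sensitivity:.1%}")
print(f"specificity = {vegetables.specificity:.1%}")
print(f"PPV         = {vegetables.ppv:.1%}")
print(f"NPV         = {vegetables.npv:.1%}")
```

With these invented counts, 30 consumers fail to report the food while only 5 non-consumers report it, which is the pattern of more false negatives than false positives described in the text. The resulting PPV and NPV depend on the invented prevalence and should not be read as study results.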

Comparison of the test–retest reliability found in the present study with that of other
studies of dietary assessment methods is limited by differences in study design, age of
children, reference periods, validation standards and methods of data analysis.
Nevertheless, the PDFQ reliability was similar to or better than the reliability
reported in other studies of food questionnaires with middle-school students (Hoelscher
et al., 2003; Smith et al., 2001).
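For readers less familiar with the kappa coefficient used to quantify test–retest agreement, the following is a minimal sketch of Cohen's kappa for one food item answered yes/no at two administrations, together with the conventional Landis and Koch (1977) descriptive bands. The response vectors are hypothetical, the function names are illustrative, and whether the original analysis applied exactly this labeling is an assumption.

```python
def cohen_kappa(first, second):
    """Cohen's kappa for two equally long sequences of 0/1 responses."""
    if len(first) != len(second) or len(first) == 0:
        raise ValueError("both administrations need the same non-zero length")
    n = len(first)
    # Observed proportion of children giving the same answer twice.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Agreement expected by chance from the marginal "yes" proportions.
    p_first = sum(first) / n
    p_second = sum(second) / n
    expected = p_first * p_second + (1 - p_first) * (1 - p_second)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


def landis_koch_label(kappa):
    """Descriptive agreement band following Landis and Koch (1977)."""
    if kappa < 0:
        return "poor"
    bands = [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
             (0.80, "substantial"), (1.00, "almost perfect")]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical answers of ten children for one food item (1 = reported eaten).
first_admin = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
second_admin = [1, 1, 1, 1, 1, 0, 0, 0, 0, 1]

kappa = cohen_kappa(first_admin, second_admin)
print(f"kappa = {kappa:.2f} ({landis_koch_label(kappa)})")
```

In this invented example eight of ten answers agree, but because half of that agreement is expected by chance, kappa is about 0.58, which falls in the "moderate" band.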

In addition, we are not able to directly compare the PDFQ’s external validity with that
of other studies using direct observation of eating behavior in school settings, owing
to differences in the definitions of omission rates (false negatives) and intrusion
rates (false positives), and in the approaches used to compare observed information
with reported information (Baranowski et al., 2002; Baxter et al., 1997).

The analysis of factors associated with observed food intake showed some significant
effects of discrepancies between the first and second PDFQ administrations, school
grade, gender and time of eating, but these effects were several times smaller than that
of the PDFQ itself. Baxter et al. (1997) determined the impact of gender, ethnicity,
meal component, and time interval between eating and reporting on the accuracy of
fourth-graders’ self-reports of school lunch. They showed that the only significant
effect on omitted food rate and phantom food rate was the time interval: the longer the
interval, the higher the rates. This effect was also shown by Weber et al. (2004).

Some limitations of our study should be considered. First, the study included a
convenience sample of schoolchildren from the first to fourth grades of a public school,
which restricts the findings to schoolchildren with similar characteristics. Second,
direct observation of eating behavior covered two snacks and a lunch at school rather
than the entire day. Third, the accuracy of the children’s reporting may have been
influenced by the observation of school meals conducted the day before administration of
the PDFQ. Fourth, we do not know whether children grouped specific food items in the
same way as they were presented in the PDFQ. Finally, the PDFQ was not designed to
estimate nutrient intake, so questions about portion size were not included. On the
other hand, this simplifies recall by prompting only the relevant food items eaten on
the previous day, avoids the difficulties associated with children’s assessment of
portion size (Lytle et al., 1993), and keeps the questionnaire relatively brief and easy
to complete and administer. Issues regarding the importance of asking children about
amounts eaten have been discussed elsewhere (Baxter et al., 2000). Another advantage of
epidemiologic analysis based on foods (as opposed to nutrients) is its direct
comparability to dietary recommendations, which are typically given in terms of foods,
e.g. consumption of five fruits and vegetables a day (Ministério da Saúde, 2005).


The PDFQ needs further adaptation to cover the full 24-h period, including a snack after
dinner. Future research should evaluate the validity and reliability of the PDFQ in
other populations (e.g. older and younger individuals), in other settings (e.g. at
home), and for the more irregular eating patterns associated with weekends.

Acknowledgments

This study was supported by the Post-Graduate Program in Nutrition, UFSC, Santa
Catarina, Brazil. We gratefully thank Prof. Mauro Barros (University of Pernambuco), who
assisted with the design of the PDFQ. Thanks to the school staff, the graduate students
of UNIVALI who recorded the measures, and the children and their parents who
participated in the study.

References

ABEP (2000). Associação Brasileira de Empresas de Pesquisa. São Paulo. Available at http://www.abep.org. Accessed November 2006.

Baranowski, T., & Domel, S. B. (1994). A cognitive model of children’s reporting of food intake. American Journal of Clinical Nutrition, 59(Suppl.), 212S–217S.

Baranowski, T., Islam, N., Baranowski, J., Cullen, K. W., Myres, D., Marsh, T., et al. (2002). The food intake recording software system is valid among fourth-grade children. Journal of the American Dietetic Association, 102, 380–385.

Baxter, S. D., Thompson, W. O., & Davis, H. C. (2000). Prompting methods affect the accuracy of children’s school lunch recalls. Journal of the American Dietetic Association, 100, 911–918.

Baxter, S. D., Thompson, W. O., Davis, H. C., & Johnson, M. H. (1997). Impact of gender, ethnicity, meal component, and time interval between eating and reporting on accuracy of fourth-graders’ self-reports of school lunch. Journal of the American Dietetic Association, 97, 1293–1298.


Hoelscher, D. M., Day, R. S., Kelder, S. H., & Ward, J. L. (2003). Reproducibility and validity of the secondary level School-Based Nutrition Monitoring student questionnaire. Journal of the American Dietetic Association, 103, 186–194.

Koehler, K. M., Cunningham-Sabo, L., Lambert, L. C., McCalman, R., Skipper, B. J., & Davis, S. M. (2000). Assessing food selection in a health promotion program: Validation of a brief instrument for American Indian children in the Southwest United States. Journal of the American Dietetic Association, 100, 205–211.

Landis, J. R., & Koch, G. G. (1977). The measurement of observer agreement for categorical data. Biometrics, 33(1), 159–174.

Livingstone, M. B., Robson, P. J., & Wallace, J. M. (2004). Issues in dietary intake assessment of children and adolescents. The British Journal of Nutrition, 92, S213–S222.

Lytle, L. A., Nichaman, M. Z., Obarzanek, E., Glovsky, E., Montgomery, D., Nicklas, T., et al. (1993). Validation of 24-hour recalls assisted by food records in third-grade children. The CATCH Collaborative Group. Journal of the American Dietetic Association, 93, 1431–1436.

McPherson, R. S., Hoelscher, D. M., Alexander, M., Scanlon, K. S., & Serdula, M. K. (2000). Dietary assessment methods among school-aged children: Validity and reliability. Preventive Medicine, 31(Suppl.), S11–S33.

Ministério da Saúde (2005). Guia alimentar para a população brasileira. Brasília. Available at http://www.saude.gov.br. Accessed September 2006.

Shaffer, N. M., Baxter, S. D., Thompson, W. O., Baglio, M. L., Guinn, C. H., & Frye, F. H. (2004). Quality control for interviews to obtain dietary recalls from children for research studies. Journal of the American Dietetic Association, 104, 1577–1585.

Smith, K. W., Hoelscher, D. M., Lytle, L. A., Dwyer, J. T., Nicklas, T. A., Zive, M. M., et al. (2001). Reliability and validity of the Child and Adolescent Trial for Cardiovascular Health (CATCH) Food Checklist: A self-report instrument to measure fat and sodium intake by middle school students. Journal of the American Dietetic Association, 101, 635–647.

Szklo, M., & Nieto, F. J. (2004). Epidemiology: Beyond the basics. Sudbury: Jones and Bartlett Publishers, pp. 343–401.

Weber, J. L., Lytle, L., Gittelsohn, J., Cunningham-Sabo, L., Heller, K., Anliker, J. A., et al. (2004). Validity of self-reported dietary intake at school meals by American Indian children: The Pathways Study. Journal of the American Dietetic Association, 104, 746–752.
