

PAIRED PREFERENCE TESTS WITH REVERSED HIDDEN DEMAND CHARACTERISTICS

YIXUN XIA1,2, ALONDRA RIVERA-QUINTERO1, EDUARDO CALDERON1, FANG ZHONG2 and MICHAEL O’MAHONY1,3

1Department of Food Science and Technology, University of California, Davis, CA 95616
2Department of Food Science and Technology, Jiangnan University, Wuxi, Jiangsu, China

3Corresponding author. TEL: +1530-756-5493; FAX: +1530 756 7320; EMAIL: [email protected]

Accepted for Publication January 29, 2014

doi:10.1111/joss.12090

ABSTRACT

To gain insight into the proportion of consumers who respond to extraneous factors in the paired preference testing situation, rather than the sensory properties of products being assessed, “placebo” preference tests with putatively identical products are used. One use is as a tool to select consumers who ignore extraneous factors and are thus suitable for use in preference tests, where experimenters wish to have confidence that consumers are ignoring extraneous factors and responding only to the sensory attributes of the products being tested. Yet, selecting such consumers tends often to reduce the sample size to approximately 20–35% of its original, which is unacceptably low. The protocol used in this study employed unusual instructions and questions to reverse the hidden demand characteristics of the test, so that the sample size was only reduced to 80–90% of its original.

PRACTICAL APPLICATIONS

Paired preference tests are part of the battery of tests used for measuring food acceptance. Although the test is simple when used casually, there are problems that must be addressed when it is used for a formal assessment of acceptance. One such problem is the tendency of consumers to respond with preferences, even when the stimuli are putatively identical. Such a bias has the potential to produce an overestimation of preferences when products are being assessed by paired preference protocols. Accordingly, research into this phenomenon is worthwhile, so that more valid and reliable tests can be developed.

INTRODUCTION

The measurement of preference and acceptance of foods is important for product development and decisions regarding the launching of new products in the marketplace. One such test of preference is the paired preference test (Lawless and Heymann 2010). When the paired preference test is used informally, it is simple. Yet, when used more formally as a research tool or for an exercise that is concerned with measuring accurate market acceptance, there are problems that must be addressed. One such problem is a potential for consumers to report preferences that are based on factors that are quite separate from the properties of the “test pair” of products being assessed.

Ennis and Collins (1980) mailed two cigarettes (call them “A” and “B”) to a large number of consumers’ homes for comparison on a variety of attributes, such as better flavor, easier draw, better aftertaste, slower burning, etc. Finally, the consumers were asked for their preferences: 40% reported a preference for cigarette “A”, 20% reported “no preference” and 40% reported a preference for “B”. Yet, “A” and “B” had been taken from the same production run. They were essentially identical cigarettes. Therefore, the majority of preferences expressed for one or the other of the cigarettes would have been because of factors in the experimental situation other than the sensory characteristics of the cigarettes. This experiment was repeated with four different brands of cigarettes, with consumer sample sizes ranging from 412 to 488 (total 1,787). There was remarkable agreement among the brands.

Yet, the 40-20-40 frequency distribution for putatively identical stimuli does not seem to be general. For reasons yet unresolved, other authors (Marchisano et al. 2003; Alfaro-Rodriguez et al. 2005, 2007, 2008; Chapman and Lawless 2005; Kim et al. 2008; Angulo et al. 2009; Alvarez-Coureaux et al. 2010; Sung et al. 2011) found different frequencies in their paired preference tests. The numbers varied with the products being tested, the experimental conditions, the types of consumers tested as well as the types and numbers of response options allowed. Yet, the number of consumers who chose the “no preference” option was almost always considerably less than those who chose a preference. The numbers choosing the “no preference” option varied a great deal but most were in the 20–35% range. This means that if the majority of consumers respond with preferences for a putatively identical pair, some of the preferences reported for the “test pair” of products are likely to be due to extraneous factors in the experimental situation, rather than being due to the sensory properties of the actual products. The problem is that there is no obvious way of knowing which responses are due to the products and which are due to extraneous factors unrelated to the products. The potential for misinformation should not be ignored.

Journal of Sensory Studies ISSN 0887-8250

Journal of Sensory Studies 29 (2014) 149–158 © 2014 Wiley Periodicals, Inc.

One way of trying to sort out the effect of the tendency to respond to extraneous factors was to compare responses to the test pair of products with responses to a putatively identical pair. Because the function of the “identical” pair was reminiscent of the function of a placebo in a drug test, it was named the “placebo” pair. So, response frequencies elicited by the test pair were compared with those from the placebo pair. For this, Chi-square was used. The consumer was given both pairs to assess (Alfaro-Rodriguez et al. 2007, 2008; Kim et al. 2008; Alvarez-Coureaux et al. 2010; Sung et al. 2011) and the responses to the test pair became the observed frequencies in the Chi-square test, while those for the placebo pair became the expected frequencies. Yet, because the same consumers tasted both pairs, this broke the assumptions for the independent samples Chi-square test. However, the alternative related samples designs (McNemar 1947; Bowker 1948; Stuart 1955; Maxwell 1970) were structurally less powerful, ran into logical problems and showed no advantage over the simpler Chi-square approach (Sung et al. 2011).
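The comparison of test-pair frequencies against placebo-pair frequencies can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the counts are hypothetical and are not data from any of the cited studies, the function names are illustrative, and the placebo proportions are rescaled to the test-pair total to give the expected frequencies. With three response categories there are 2 degrees of freedom, for which the Chi-square upper-tail probability has the closed form exp(−x/2). Because the same consumers assess both pairs, the independence assumption is violated and the resulting P value should be read as approximate.

```python
import math


def chi_square_vs_placebo(test_counts, placebo_counts):
    """Chi-square with test-pair counts as observed frequencies and
    placebo-pair proportions (rescaled to the test-pair total) as expected."""
    if len(test_counts) != len(placebo_counts):
        raise ValueError("both pairs need the same response categories")
    n_test = sum(test_counts)
    n_placebo = sum(placebo_counts)
    # Expected frequency = placebo proportion times the test-pair sample size.
    expected = [n_test * c / n_placebo for c in placebo_counts]
    statistic = sum((o - e) ** 2 / e for o, e in zip(test_counts, expected))
    return statistic, expected


def p_value_two_df(statistic):
    """Upper-tail probability of Chi-square with exactly 2 degrees of freedom."""
    return math.exp(-statistic / 2.0)


# Hypothetical counts (prefer A, no preference, prefer B), 100 consumers each.
test_pair = [60, 15, 25]
placebo_pair = [40, 20, 40]

stat, expected = chi_square_vs_placebo(test_pair, placebo_pair)
p = p_value_two_df(stat)
print(f"Chi-square = {stat:.3f}, df = 2, P = {p:.5f}")
```

A significant result here only indicates that the test-pair frequencies differ from the placebo frequencies. It does not show which individual responses were elicited by the products.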

Using the placebo frequencies to test whether the observed frequencies indicate a significant preference situation does not resolve the problem of preferences elicited by extraneous factors. Chi-square merely examines whether the observed frequencies are significantly different from the situation where all the recorded preferences were elicited by extraneous factors. Thus, even when the observed response frequencies for the target pair are significantly different from those of the placebo pair, some of the observed target frequencies may still include responses to extraneous factors. There needs to be a way of eliminating the preferences elicited by extraneous factors, to separate them from preferences elicited by the input from the test products. There are several possibilities.

Alfaro-Rodriguez et al. (2007) and Sung et al. (2011) approached the problem by not only considering the data from all the consumers tested but also by examining the response frequencies to the target stimuli from the subset of consumers who had exhibited no preference in the placebo condition. The advantage here was that only the responses of consumers who had demonstrated a tendency not to report preferences elicited by extraneous factors were considered. In other words, the placebo pair acted as a form of “screening” tool by not selecting consumers who had demonstrated a tendency to report preferences elicited by extraneous factors. The disadvantage was that because of the small number of “no preference” responses for the placebo pair, the sample size of these “screened” consumers was reduced. Furthermore, even though a consumer might exhibit a preference induced by extraneous factors in the placebo condition, the sensory characteristics of the target stimuli might be sufficiently different to entice consumers to pay attention to the sensory properties of the target stimuli and ignore any extraneous factors. In other words, real preferences might be elicited when consumers were confronted with the test pair. Therefore, using data only from consumers who had chosen the “no preference” option for the placebo pair could result in the loss of useful data (Sung et al. 2011). Thus, such an approach can be viewed as cautious and perhaps too conservative.

Accordingly, it would be an advantage if the paired preference test could be manipulated so that it gave a higher number of “no preference” responses in the placebo condition, allowing more consumers to be selected or “screened” for assessing the test pair, so that useful data would not be lost. The problem is to change the predisposition of the consumers when they are confronted with a preference test. It is hypothesized that this predisposition is that the consumer thinks that he or she must respond with a preference. Another way of expressing this is that the consumer is motivated by the “hidden demand characteristics” of the experiment to respond with a preference. Hidden demand characteristics are imagined instructions that have not been given to the consumer, but which the consumer feels he or she must obey, in addition to the overt demand characteristics of the instructions. If these hidden demand characteristics could be reversed, then it is hypothesized that more consumers would pass the “screening” by the placebo condition selection tool. Data could then be collected from a larger proportion of consumers.

The goal of this experiment was to give paired preference tests to consumers, using placebo pairs as a screening tool, whereby those consumers who indicated preferences in the placebo condition could be eliminated before assessing the test pair. The goal was to investigate whether a change in the preference test protocol could increase the proportions of consumers who passed the screening. The change in the protocol to be investigated was designed to reverse the hidden demand characteristics inherent in paired preference tests. The experiment was repeated with modifications to the protocol, to allow an examination of the robustness of the procedure.

EXPERIMENT 1

Part A

The goal of this part of the experiment was to attempt to reverse the hidden demand characteristics of a preference test, with suitable instructions and questions for the consumers, so that in the placebo condition, the “no preference” option would be selected by a greater proportion of consumers.

MATERIALS AND METHODS

Consumers

Consumers of potato chips (n = 203, 94 M, 109 F, age range 18–50 years), students and staff of the University of California, Davis, were selected. They were intercepted in a dining room on the campus. They were tested in an accessible area, by the side of the dining room, set up for the experiment.

Stimuli

Stimuli were original flavored and sour cream and onion flavored potato chips (Pringles Manufacturing Co., Jackson, TN). They were presented in 4 oz (118 mL) Serco-Key plastic cups (S.E. Rykoff & Co., Los Angeles, CA), two to a cup.

Procedure

Consumers were tested individually. After establishing rapport and collecting demographic details, the experimenter instructed the consumers in the experimental procedure and only began the experiment when the consumers had thoroughly understood their task. The experimenter observed the consumer unobtrusively, to ensure that the consumer followed the instructions correctly and to be available for answering any questions and recording any of the consumers’ comments.

Consumers were presented with two pairs of cups, the test pair and the placebo pair. The chips in the test pair were different: “original” and “sour cream and onion.” The chips in the placebo pair were putatively identical: either two sets of original chips or two sets of sour cream and onion chips. Consumers were first presented with two cups containing the placebo pair. Consumers were told that they were going to perform a simple preference test. While pointing to the relevant cups, consumers were told that they might like one more than the other or vice versa, or they might like them just about the same. The experimenter then said that she knew that they would like them about the same because everybody did. Consumers then tasted and swallowed the placebo pair of chips and responded verbally.

Those consumers who said they had no preference were then presented with the test pair. This time the experimenter did not go through the full instructions. She merely said: “Same thing, try these two.” Then after the consumers had completed the second part of the experiment, the experiment was terminated.

Those consumers who responded that they had a preference with the placebo pair were not automatically discounted. The experimenter asked these consumers how great the difference was between the two. They usually said it was only very small. The experimenter then asked whether this meant that they liked them “just about the same.” If they replied that this was the case, the experimenter deemed them to have had no operational preference with the placebo pair and proceeded with the second part of the experiment. If they replied that they still had a preference, they still proceeded with the second part of the experiment, yet were judged not to have passed the “screening” procedure. After assessing the test pair, these two groups of consumers were asked which of these two chips they would buy if they went into a store. If they chose the chips they had reported as preferring, this did not necessarily confirm that the stated preferences during the test indicated “real life” behavior, in other words, that they were operational preferences. However, as far as self-report can be trusted, it is suggestive that the measured preferences were operational. Secondly, they were asked whether, if their preferred product was not available, they would buy the less preferred product. Answers in the affirmative were always given, lending support to the idea that the preferences obtained were between two products that were liked. Such questions make useful routine additions to preference tests.
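The screening decision described above can be summarized as a small decision rule. The following Python sketch is only an illustration: the function name, argument names and response labels are hypothetical, and the follow-up question is reduced to a single yes/no answer.

```python
def passed_screening(placebo_response, confirms_about_the_same=False):
    """Return True if a consumer passes the placebo "screening".

    placebo_response: "no preference" or "preference" (hypothetical labels)
        for the consumer's first verbal response to the placebo pair.
    confirms_about_the_same: for consumers who first reported a preference,
        whether they then agreed that they liked the two "just about the same".
    """
    if placebo_response == "no preference":
        return True
    if placebo_response == "preference":
        # A very small stated difference, confirmed as "just about the same",
        # is treated as no operational preference.
        return confirms_about_the_same
    raise ValueError(f"unknown response: {placebo_response!r}")


# Three hypothetical consumers.
print(passed_screening("no preference"))     # True
print(passed_screening("preference", True))  # True
print(passed_screening("preference", False)) # False
```

In this protocol, consumers who fail the rule still assess the test pair. The rule only determines the group in which their responses are reported.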

It is important to note that this experimental protocol requires the experimenter to interact with the consumer face-to-face. The experimenter must have the right personality and social skills to establish a friendly rapport and lead the consumer smoothly through the required response tasks. Accordingly, the experimenter should be chosen carefully.

The placebo pair was always presented before the test pair. For the test pair, the original chips were tasted first by half the consumers and second by the other half. For half the consumers, the placebo pair consisted of the original chips, while the other half were sour cream and onion. No water mouth-rinses were taken during the test. Session lengths ranged 2–5 min.

Part B

The goal of this part of the experiment was to perform the same paired preference tests as in part A, with the modification that the test pair was always presented before the placebo pair.

MATERIALS AND METHODS

Consumers

Consumers of potato chips (n = 200, 94 M, 106 F, age range 20–36 years) were selected. They were students and staff of the University of California, Davis, and were intercepted and tested in the same conditions as for part A.

Stimuli

Stimuli were the same as for part A.

Procedure

The details of the procedure and the environment were the same as for part A with the following modification. The test pair was always presented before the placebo pair. However, the instructions were exactly the same, despite the fact that the test pair was served first. Consumers did not appear disturbed by the experimenter initially stating that she thought the consumers would like the test pair of chips the same amount. Criteria for passing the “screening” test were exactly the same. Session lengths ranged 2–5 min.

Part C

The goal of this part of the experiment was to perform the same paired preference tests as in part A, with the modification that the initial instructions were shortened.

MATERIALS AND METHODS

Consumers

Consumers of potato chips (n = 200, 98 M, 102 F, age range 18–36 years) were selected. They were students and staff of the University of California, Davis, and were intercepted and tested in the same conditions as for part A.

Stimuli

Stimuli were the same as for part A.

Procedure

The details of the procedure and the environment were the same as for part A with the following modification. In part A, the initial part of the instructions to the consumers was that they were told that they might like one chip more than the other, or vice versa or they might like them just about the same. In this part of the experiment, these instructions were omitted. However, consumers were still told that the experimenter knew they would like them about the same because everybody did. Consumers then tasted the placebo pair of chips and proceeded as in part A. Session lengths ranged 2–5 min.

Parts D, E and F

Parts A, B and C were repeated as parts D, E and F, respectively, using apple juice stimuli instead of potato chips.

Consumers

For parts D, E and F, consumers of apple juice were selected. Part D used 202 consumers (93 M, 109 F, age range 18–36 years); part E used 200 consumers (100 M, 100 F, age range 18–30 years); part F used 206 consumers (84 M, 122 F, age range 18–32 years). As before, all consumers were students and staff of the University of California, Davis, who were intercepted and tested in the same conditions as for parts A, B and C.

Stimuli

Stimuli were apple juice (100% juice, Safeway Kitchens Apple Juice, Safeway Inc., Pleasanton, CA) and the same apple juice diluted with purified water to 80% strength. The water was purified by a Purelab Prima Reverse Osmosis System (ELGA LabWater, High Wycombe, Bucks, England) in series with a Milli-Q Advantage A10 system, using ion exchange and activated charcoal (Millipore Corp., Bedford, MA), yielding water with conductivity <10⁻⁶ mho/cm, with TOC ≤ 5 ppb.

The juice samples were served at constant room temperature (18–20 °C) in 2 oz (59 mL) black plastic cups (Fabri-Kal Portion Cups, Kalamazoo, MI).

Procedure

For parts D, E and F, the details of the procedures and the environment were the same as for parts A, B and C, respectively, except that apple juice stimuli were used instead of potato chips. As with the chips, the juices were swallowed and no water mouth-rinses were taken during the test. For all three parts, session lengths ranged 2–5 min.


RESULTS

The results from parts A, B and C using potato chips are summarized in Table 1 while the results for parts D, E and F using apple juice are summarized in Table 2. In both tables, the sample sizes are given for the consumers who passed the screening and those who did not (both in bold), as well as the size of the whole sample. It can be seen that the percentage of consumers who passed the “screening” test by not responding to the placebo pair with preferences, ranging from 80 to 90%, was robustly higher than the 20–35% usually found with paired preference tests that use a placebo pair (see Introduction section). This indicated that reversing the hidden demand characteristics in the preference test, by means of the altered instructions and questions used in the experiments, successfully increased the proportion of “screened” consumers who were available to assess the test pair of products.

Regarding the test pairs of products, the tables indicate the preference/no preference responses of the consumers who passed the screening and those who did not, as well as the whole sample of consumers. From Table 1, for those consumers who passed the screening, it can be seen that more consumers preferred the sour cream and onion chips than preferred the original chips (e.g. part A: 51.2 versus 36.4%). Unsurprisingly, this was echoed in the whole sample, yet with smaller preference ratios in parts A, B and C (1.2 versus 1.4; 3.5 versus 3.6; 3.1 versus 3.5, respectively). The reason for this was the discrepancies between those who passed the screening and those who did not. In part A, those who did and did not pass the screening indicated preferences in opposing directions. In parts B and C, the preferences were in the same direction but the preference ratios for those who failed the screening were noticeably reduced (2.8 versus 3.6; 1.8 versus 3.5, respectively). Although two-way Chi-square values for differences in preference/no preference frequencies between those who passed and failed the screening were not significant (P ≥ 0.09), the results are suggestive. If consumers who did not pass the screening had not been eliminated, the extent of preference for the sour cream and onion chips would have been underestimated.
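The preference ratios quoted above can be checked against the percentages in Table 1. The short Python sketch below assumes that a preference ratio is the percentage preferring the sour cream and onion chips divided by the percentage preferring the original chips. The text does not state this definition explicitly, but it reproduces the quoted values to one decimal place. The variable and function names are illustrative.

```python
# Percentages from Table 1: (preferring original, preferring sour cream and onion).
TABLE_1 = {
    "A": {"passed": (36.4, 51.2), "not passed": (53.7, 36.6), "whole": (39.9, 48.3)},
    "B": {"passed": (19.6, 70.4), "not passed": (23.8, 66.7), "whole": (20.0, 70.0)},
    "C": {"passed": (18.6, 65.9), "not passed": (33.3, 60.6), "whole": (21.0, 65.0)},
}


def preference_ratio(pct_original, pct_sour_cream):
    """Assumed definition: percentage preferring sour cream and onion chips
    divided by percentage preferring original chips."""
    return pct_sour_cream / pct_original


ratios = {
    part: {group: round(preference_ratio(*pcts), 1) for group, pcts in groups.items()}
    for part, groups in TABLE_1.items()
}

for part, groups in ratios.items():
    print(part, groups)
```

A ratio below 1, as for those who did not pass the screening in part A, corresponds to a preference in the opposite direction.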

TABLE 1. PERCENTAGES OF CONSUMER PREFERENCE AND NO PREFERENCE RESPONSES FOR THE VARIOUS PROTOCOLS IN PARTS A, B AND C OF EXPERIMENT 1, FOR PAIRED PREFERENCE TESTS BETWEEN ORIGINAL AND SOUR CREAM AND ONION FLAVORED POTATO CHIPS, FOR CONSUMERS WHO PASSED OR DID NOT PASS “SCREENING” AND FOR THE WHOLE SAMPLE OF CONSUMERS

Experimental protocol     Passed or not           Sample size   Preferring       No           Preferring sour cream
                          passed screening                      original chips   preference   and onion chips

Part A                    Passed screening        162 (80%)     36.4             12.3         51.2
Placebo first             Not passed screening    41 (20%)      53.7             9.8          36.6
                          Whole sample            203           39.9             11.8         48.3

Part B                    Passed screening        179 (90%)     19.6             10.1         70.4
Placebo second            Not passed screening    21 (10%)      23.8             9.5          66.7
                          Whole sample            200           20               10           70

Part C                    Passed screening        167 (84%)     18.6             15.6         65.9
Placebo first             Not passed screening    33 (16%)      33.3             6.1          60.6
Shortened instructions    Whole sample            200           21               14           65

TABLE 2. PERCENTAGES OF CONSUMER PREFERENCE AND NO PREFERENCE RESPONSES FOR THE VARIOUS PROTOCOLS IN PARTS D, E AND F OF EXPERIMENT 1, FOR PAIRED PREFERENCE TESTS BETWEEN UNDILUTED AND DILUTED APPLE JUICES, FOR CONSUMERS WHO PASSED OR DID NOT PASS “SCREENING” AND FOR THE WHOLE SAMPLE OF CONSUMERS

Experimental protocol     Passed or not           Sample size   Preferring diluted   No           Preferring undiluted
                          passed screening                      apple juice          preference   apple juice

Part D                    Passed screening        161 (80%)     18.6                 14.9         66.5
Placebo first             Not passed screening    41 (20%)      46.3                 4.9          48.8
                          Whole sample            202           24.3                 12.9         62.9

Part E                    Passed screening        162 (81%)     11.7                 13.0         75.3
Placebo second            Not passed screening    38 (19%)      26.3                 18.4         55.3
                          Whole sample            200           14.5                 14           71.5

Part F                    Passed screening        166 (81%)     13.3                 12.7         74.1
Placebo first             Not passed screening    40 (19%)      25                   0            75
Shortened instructions    Whole sample            206           20.4                 13.6         63.1


The effects become clearer in Table 2. For those consumers who passed the screening, more consumers preferred the undiluted apple juice. Again, this was echoed in the whole sample, with smaller preference ratios in parts D, E and F (2.6 versus 3.6; 4.9 versus 6.4; 3.1 versus 5.6, respectively). Again, the reason for this was the discrepancies between those who passed the screening and those who did not. In parts D, E and F, the preference ratios for those who failed the screening were noticeably reduced (1.1 versus 3.6; 2.1 versus 6.4; 3.0 versus 5.6, respectively). In this case, Chi-square values for the preference/no preference frequencies were significantly different (P ≤ 0.03). These results provide stronger evidence for the advantage of eliminating those who do not pass the screening, to ensure that the extent of preference for the preferred product is not underestimated.
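As an illustration of the two-way Chi-square comparison between consumers who passed and did not pass the screening, the following Python sketch uses part D. The counts are an assumption: they were reconstructed here by multiplying the Table 2 percentages by the group sample sizes and rounding to whole consumers, and are not the raw frequencies used in the original computations. The function name is illustrative. For a 2 × 3 table there are 2 degrees of freedom, so the upper-tail probability is exp(−x/2).

```python
import math


def chi_square_2x3(row_a, row_b):
    """Two-way Chi-square for a 2 x 3 frequency table (df = 2)."""
    col_totals = [a + b for a, b in zip(row_a, row_b)]
    grand_total = sum(col_totals)
    statistic = 0.0
    for row in (row_a, row_b):
        row_total = sum(row)
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / grand_total
            statistic += (observed - expected) ** 2 / expected
    # Closed-form upper-tail probability, valid only for 2 degrees of freedom.
    p_value = math.exp(-statistic / 2.0)
    return statistic, p_value


# Part D counts reconstructed from Table 2 percentages (an assumption):
# (prefer diluted, no preference, prefer undiluted)
passed = [30, 24, 107]      # n = 161
not_passed = [19, 2, 20]    # n = 41

stat, p = chi_square_2x3(passed, not_passed)
print(f"Chi-square = {stat:.2f}, df = 2, P = {p:.4f}")
```

Because one observed frequency is below 5, a P value obtained in this way should be taken as approximate.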

In parts A and D, the placebo was presented first while in parts B and E, it was presented second. For those consumers who passed the screening, the pattern of preference/no preference responses was significantly different between parts A and B (Chi-square, P = 0.001), but not between parts D and E (P = 0.16). It is possible that presenting the placebo first or second had no effect and the difference between parts A and B was merely due to the fact that the consumers and products were different, which could elicit different preference/no preference frequencies. Yet, it would be prudent to be cautious and to include counterbalancing of this order in future studies.

The difference between parts A and C and parts D and F is that parts A and D had longer instructions. Consumers were told that they might like one product more than the other, or vice versa or that they may like them just about the same. They were then told that the experimenter knew they would like them just about the same because everybody did. The shortened instructions left out the sentence saying that they may like one product more than the other. This possibility was not mentioned. Therefore, it could be hypothesized that the shorter version of the instructions gave an even stronger reversal of the hidden demand characteristics. Accordingly, it might be hypothesized that this reversal might have an effect on the test pair of products. If this were so, then the frequency of no preference responses would be higher for parts C and F. This was indeed so for parts C versus A (15.6 versus 12.3%), while their response frequencies were also significantly different (Chi-square, P = 0.001). Yet, for parts D versus F, the hypothesized higher no preference frequency for F did not occur (12.7 < 14.9%) while their response frequencies were not significantly different (P = 0.29). As above, it is possible that the variations in instructions have no effect and the difference between parts A and C was merely due to the fact that the consumers and products were different, which could elicit different preference/no preference frequencies. It would not be wise, however, to assume that the method is robust with regard to experimental instructions until further evidence is acquired.

As a statistical point, it should be noted that the Chi-square computations were performed on the frequencies of consumers who reported preferences/no preferences and not on the percentage values shown in the tables. For some of the comparisons between consumers who had and had not passed the screening test, the frequencies for the “no preference” option were below 5. This could alter some of the significance values, which must accordingly be taken as approximate. However, this does not alter the conclusions.

In conditions A–F, when asked whether, if confronted with the two products they had encountered in the test, they would choose to buy the product they had preferred in the test, all consumers answered in the affirmative. This suggested that their test preferences might have been operational. Secondly, when asked whether they would buy the second product if the preferred product were not available, all consumers answered in the affirmative. This lent credence to the idea that the preferences were between two products that were liked enough to buy.

EXPERIMENT 2

Experiment 1 demonstrated that a preference test with reversed hidden demand characteristics (RHDC protocol) produced a robustly high proportion of consumers who selected the “no preference” option when confronted with a placebo pair. The proportions ranged from 80 to 90%, which was considerably higher than reports in the literature for regular paired preference tests that utilized a placebo and test pair. Yet, these comparisons are all of an independent samples type. The goal of experiment 2 was to require consumers to perform both types of test, to enable a related samples comparison.

Consumers

Consumers of fruit juice (n = 200, 105 M, 95 F, age range 18–41 years), students and staff of the University of California, Davis, were selected. They were intercepted in a dining room on the campus. They were tested in an accessible area, by the side of the dining room, set up for the experiment.

Stimuli

Stimuli were cranberry juice cocktail (20% juice, Langer Juice Company Inc., City of Industry, CA) and pomegranate juice cocktail (20% juice, Langer Juice Company Inc.). The juice samples were served at room temperature (18–20 °C) in 2 oz (59 mL) black plastic cups (Fabri-Kal Portion Cups).


Procedure

Consumers were tested individually. After establishing rapport and collecting demographic details, the experimenter instructed the consumers in the experimental procedure and only began the experiment when the consumers had thoroughly understood their task. The experimenter observed the consumer unobtrusively, to ensure that the consumers followed the instructions correctly and to be available for answering any questions and recording any of the consumers’ comments.

Consumers were given two preference tests, each comparing the cranberry versus pomegranate juice cocktails. They differed in that one used the RHDC protocol as used in experiment 1, part A. The other used the protocol utilized by Alfaro-Rodriguez et al. (2007, 2008), Kim et al. (2008), Alvarez-Coureaux et al. (2010) and Sung et al. (2011). In this protocol, which will be called the regular protocol, the consumer was instructed to taste two juice samples and state whether they preferred one or the other (these were indicated by pointing) or whether they had no preference. These instructions were similar to those for the test with RHDC, except that the part where the experimenter said that she knew the consumer would not have a preference was omitted. In this protocol, consumers who exhibited a preference with the placebo pair were deemed immediately not to have passed the “screening” procedure.

The order of presentation of the two protocols was counterbalanced over judges. For the tests in each protocol, the placebo pair was always presented before the test pair. The order of presentation of stimuli within a pair was apparently random yet was counterbalanced over all the consumers. Session lengths ranged from 5 to 8 min.
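One way such a counterbalanced yet apparently random schedule could be generated is sketched below. This is an illustration under stated assumptions, not the procedure actually used: the function names, the condition labels and the seed are hypothetical, and the sketch simplifies by assigning each consumer a single serving order for the test pair in both protocols.

```python
import itertools
import random
from collections import Counter

# Hypothetical labels; the paper does not name its conditions this way.
PROTOCOL_ORDERS = [("regular", "RHDC"), ("RHDC", "regular")]
TEST_PAIR_ORDERS = [("cranberry", "pomegranate"), ("pomegranate", "cranberry")]


def build_schedule(n_consumers, seed=0):
    """Return one (protocol order, test-pair serving order) per consumer."""
    conditions = list(itertools.product(PROTOCOL_ORDERS, TEST_PAIR_ORDERS))
    repeats, remainder = divmod(n_consumers, len(conditions))
    if remainder:
        raise ValueError("n_consumers must be a multiple of the number of conditions")
    # Each condition appears equally often (counterbalanced) ...
    schedule = conditions * repeats
    # ... but in shuffled order, so the sequence looks random to consumers.
    random.Random(seed).shuffle(schedule)
    return schedule


def session_plan(protocol_order, test_pair_order):
    """Expand one consumer's condition into the sequence of pairs served."""
    plan = []
    for protocol in protocol_order:
        # The placebo pair always precedes the test pair within a protocol.
        plan.append((protocol, "placebo pair"))
        plan.append((protocol, "test pair", test_pair_order))
    return plan


schedule = build_schedule(200)
condition_counts = Counter(schedule)
first_session = session_plan(*schedule[0])
```

With 200 consumers and four conditions, each combination of protocol order and serving order is used 50 times.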

RESULTS

The results for experiment 2 are given in Table 3. The top three lines in the table indicate the results for the regular protocol; the bottom three lines indicate the results for the RHDC protocol. For each protocol, the numbers in bold represent the numbers of consumers who either passed the “screening” procedure or failed it. The unbold numbers represent the frequencies for the whole sample of the consumers.

It can be seen that for the RHDC protocol, the number of consumers who passed the “screening” procedure was twice that for the regular protocol (88 versus 44%). For both protocols, the pomegranate juice was preferred to the cranberry juice, but the extent of preference was not as great as for the chips and apple juices in experiment 1, as seen from lower preference ratios (1.2–1.3).

By inspection, the percentages of preference/no preference responses for those consumers who passed the screening would not appear to present a different picture for the regular and RHDC protocols. Because this is a related samples comparison, the use of the Chi-square test for this comparison was not legitimate. Had it been so, the difference would have been recorded as not significant (P = 0.24). Thus, it would seem that the two protocols did not elicit different preference/no preference responses. In the same way, the differences between the response frequencies for those who passed and did not pass the screening were not significant (Chi-square, regular protocol P = 0.19; RHDC P = 0.29). Using a d′ analysis, the smaller d′ value representing the strength of preference for the pomegranate juice with the RHDC protocol (0.15 versus 0.20) was not significant (P = 0.84). However, the lower variance of d′ (0.017 versus 0.037) for the RHDC protocol illustrated how its greater number of screened consumers increased its power.
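The Chi-square values quoted above can be approximately recomputed from the Table 3 frequencies with a minimal sketch using only the Python standard library. The function names are illustrative assumptions. The d′ comparison is sketched as a simple z-test that treats the two d′ estimates as independent; this is an assumption made for illustration, not necessarily the analysis the authors used, and it ignores the related samples design.

```python
import math


def chi_square_two_rows(row_a, row_b):
    """Pearson Chi-square statistic for a 2 x k table of frequencies."""
    col_totals = [a + b for a, b in zip(row_a, row_b)]
    n = sum(col_totals)
    stat = 0.0
    for row in (row_a, row_b):
        row_total = sum(row)
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / n
            stat += (observed - expected) ** 2 / expected
    return stat


def p_value_two_df(stat):
    """Upper-tail probability of Chi-square with 2 degrees of freedom."""
    return math.exp(-stat / 2)


# Table 3 test-pair frequencies: (pomegranate, no preference, cranberry)
regular_passed, regular_failed = [48, 2, 38], [54, 9, 49]
rhdc_passed, rhdc_failed = [89, 13, 74], [10, 4, 10]

p_regular = p_value_two_df(chi_square_two_rows(regular_passed, regular_failed))
p_rhdc = p_value_two_df(chi_square_two_rows(rhdc_passed, rhdc_failed))
# Not legitimate for related samples; shown only to mirror the text.
p_between = p_value_two_df(chi_square_two_rows(regular_passed, rhdc_passed))

# z-test on the difference between two d' values (independence assumed)
d_regular, var_regular = 0.20, 0.037
d_rhdc, var_rhdc = 0.15, 0.017
z = (d_regular - d_rhdc) / math.sqrt(var_regular + var_rhdc)
p_d_prime = math.erfc(abs(z) / math.sqrt(2))  # two-sided P-value

print(round(p_regular, 2), round(p_rhdc, 2), round(p_between, 3), round(p_d_prime, 2))
```

Under these assumptions, the within-protocol comparisons come out at about 0.19 and 0.29, matching the values in the text, while the between-protocol and d′ comparisons come out close to, though not exactly at, the reported 0.24 and 0.84, presumably because the inputs used here are rounded.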

DISCUSSION

Experiments 1 and 2 indicated that an unusual change in the instructions and questions had the effect of eliciting more “no preference” responses for placebo pairs. One way of describing the effect of these changes was that it reversed the hidden demand characteristics of the testing situation.

TABLE 3. NUMBERS OF CONSUMERS REPORTING PREFERENCES AND NO PREFERENCE RESPONSES, FOR THE REVERSED HIDDEN DEMAND CHARACTERISTICS PROTOCOL (RHDC PROTOCOL) AND A REGULAR PROTOCOL, FOR PAIRED PREFERENCE TESTS BETWEEN POMEGRANATE AND CRANBERRY JUICE, FOR CONSUMERS WHO PASSED OR DID NOT PASS “SCREENING” AND FOR THE WHOLE SAMPLE OF CONSUMERS

Experimental  Passed or not         Placebo pair     Test pair
protocol      passed screening      No preference    Preference:          No preference   Preference:
                                                     pomegranate juice                    cranberry juice

Regular       Passed screening      88 (44%)         48 (54.5%)           2 (2.3%)        38 (43.2%)
Regular       Not passed screening  112 (56%)        54 (48.2%)           9 (8.0%)        49 (43.8%)
Regular       Whole sample          200              102 (51%)            11 (5.5%)       87 (43.5%)

RHDC          Passed screening      176 (88%)        89 (50.6%)           13 (7.4%)       74 (42%)
RHDC          Not passed screening  24 (12%)         10 (41.7%)           4 (16.7%)       10 (41.7%)
RHDC          Whole sample          200              99 (49.5%)           17 (8.5%)       84 (42%)


Therefore, should experimenters wish to use only those consumers who had “passed” the placebo test “screening” for their measurement of preference for test pairs of products, this protocol would appear to be a feasible approach. Experiment 1 also demonstrated that small variations in the test protocol did not lower the percentage of consumers choosing “no preference” responses with the placebo pair.

It could be argued that the instructions and interaction with the experimenter in this protocol deliberately attempted to bias the consumers against reporting preferences with the placebo pair. However, it could just as reasonably be argued that because of the hidden demand characteristics of the test, consumers were already biased to report preferences for the placebo pair. The protocols used here merely reduced this bias and even reversed it. This reversed bias toward choosing a “no preference” option would have been deleterious if it had persisted for the test pair, but this was not the case. Under the same biased conditions, consumers reported that the stimuli in the test pair were not liked equally; they had a preference. Also, the proportion of “no preference” responses for the test pair was arguably small. Therefore, it can be hypothesized that the reversal of hidden demand characteristics in this protocol caused little or no bias with the measurement of preferences for the test pair.

Results for both experiments indicated that the preferred product was the same regardless of whether data were taken from those consumers who had passed the screening test or the whole sample of consumers, ignoring the screening test. However, the extent of preference for the preferred product was reduced if those who did not pass the screening were not eliminated. This can be seen from the preference ratios. Therefore, it would be erroneous to conclude that the screening was not necessary, should accurate data be needed. Data from an unscreened sample of consumers can potentially underestimate the extent of preference for the preferred product. For casual testing, this might not matter, but in the case where accurate data are required, screening would appear to be an important part of a paired preference testing protocol.
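The effect of screening on the preference ratios can be checked against the Table 3 frequencies with a short sketch. It assumes that the preference ratio is the number of consumers preferring the pomegranate juice divided by the number preferring the cranberry juice; this definition and the names used are assumptions for illustration, chosen because they reproduce the 1.2–1.3 range quoted for experiment 2.

```python
# Table 3 test-pair frequencies: (pomegranate, no preference, cranberry)
table3 = {
    ("regular", "passed screening"): (48, 2, 38),
    ("regular", "whole sample"): (102, 11, 87),
    ("RHDC", "passed screening"): (89, 13, 74),
    ("RHDC", "whole sample"): (99, 17, 84),
}


def preference_ratio(pomegranate, cranberry):
    """Assumed definition: preferred-product count over other-product count."""
    return pomegranate / cranberry


ratios = {
    key: round(preference_ratio(counts[0], counts[2]), 2)
    for key, counts in table3.items()
}

for (protocol, group), ratio in ratios.items():
    print(f"{protocol:8s} {group:17s} {ratio:.2f}")
```

On this definition, the screened consumers give ratios of about 1.26 (regular) and 1.20 (RHDC), whereas the unscreened whole samples give about 1.17 and 1.18, consistent with the statement that omitting the screening reduces the apparent extent of preference.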

In the present protocol, it is also possible to question the experimenter’s interaction with consumers who had initially responded with preferences for the placebo pair. She asked the consumer how great the difference was between the two stimuli and if they said it was very small, she asked whether this meant that they liked them “just about the same.” They usually replied in the affirmative and they were deemed to have passed the “screening.” To explain this, it is important to understand the difference between a “test preference” and an “operational preference.”

It should be noted that for the preference tests, the response option for the “no preference” condition was to say that the consumer liked the products “just about the same.” This term was used rather than “like them the same.” To explain the reason for this, it is necessary to consider the difference between “test” preferences and “operational” preferences. A “test” preference is usually obtained under artificial test conditions like the paired preference test, where there is a predisposition for the consumer to feel that she should report a preference. A preference based on choices made in real life conditions is defined as an “operational” preference. Usually, the goal of test preferences is to predict operational preferences. The problem with test preferences is that consumers, with their predisposition for feeling they must report preferences, will tend to seize on any slight difference in the test stimuli on which to base such a preference. Even if the stimuli are putatively identical, any slight difference in perception, either imagined or real, perhaps because of slight adaptation effects, can be used to generate a preference. Yet, this may not correspond to an operational preference, where such slight differences do not affect real life choice behavior. For a test preference, consumers are likely to base their decision on whether the stimuli taste the same or not, in much the same way as they do in a difference test. However, for predicting operational preferences, they should base their decision on whether the stimuli are the “same product” or not. In this latter case, rather than deciding whether the two stimuli taste the same, consumers should decide whether they taste close enough to be the same product, in other words: “just about the same.” Theoretically, in terms of signal detection/Thurstonian modeling, this can be described as preference tests requiring a longer τ-criterion than difference tests (Sung et al. 2011). Consequently, to get closer to measuring an operational preference in a test preference situation, consumers, when they expressed a preference with the placebo pair, were asked whether they liked the products “just about the same.” Further questioning about buying behavior provided further evidence.
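The screening decision described for the two protocols can be summarized as a small decision rule. The sketch below is an illustration only: the function name, argument names and response labels are hypothetical, and the rule encodes just what is stated in the text, namely that a placebo preference fails the regular protocol immediately, whereas in the RHDC protocol a consumer who calls the difference very small and agrees that the products are liked “just about the same” is deemed to have passed.

```python
def passes_screening(protocol, placebo_response,
                     difference_size=None, liked_just_about_same=None):
    """Return True if the consumer is deemed to have passed the placebo screening.

    protocol: "regular" or "RHDC" (hypothetical labels)
    placebo_response: "no preference" or "preference"
    difference_size: answer to the follow-up question, e.g. "very small"
    liked_just_about_same: True if the consumer agreed with that wording
    """
    if placebo_response == "no preference":
        return True
    if protocol == "RHDC":
        # Follow-up questioning separates a test preference based on a
        # trivial difference from a possible operational preference.
        return difference_size == "very small" and liked_just_about_same is True
    # Regular protocol: any placebo preference fails immediately.
    return False


examples = [
    passes_screening("regular", "no preference"),
    passes_screening("regular", "preference"),
    passes_screening("RHDC", "preference", "very small", True),
    passes_screening("RHDC", "preference", "large", False),
]
print(examples)
```

Writing the rule out in this way makes explicit that the RHDC protocol gives a consumer a second route to passing, which is one reason for the objection discussed above.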

Finally, just as it is worth describing Thurstonian modeling, as the vehicle used for reaching conclusions with difference testing, it is worth mentioning the model used for choosing how to reverse the hidden demand characteristics in this study. It was based on speculation regarding Kahneman’s (2011) theory of fast and slow thinking. Slow thinking is careful, deliberate, consciously aware activity with what could be called central processing in the brain. Fast thinking is more of an instant reaction, responding “without thinking it through” or “without paying sufficient attention to the task.” It is stereotypic. It switches into thought processes built up from experience. For example, consider the response to the problem that a shepherd has 27 sheep, all but nine die, how many are left? People will tend to subtract 9 from 27 (=18), although the answer is nine, because that is what they have nearly always done before when set similar problems. They reach a conclusion without paying sufficient attention to the task.

Fast thinking can loosely be seen as fulfilling, cognitively, the same needs as automatic motor behaviors like tying a necktie or riding a bicycle. These motor behaviors require slow thinking in central processing at first and then, after sufficient practice, are transferred to what could be called subroutines to work on “auto pilot.” This can free central processing from having to deal with routine repetitive tasks. In fact, such “auto pilot” behavior can be disrupted should it be transferred to slow thinking, such as when trying to explain to someone how to tie that necktie. Fast thinking could be envisioned as a set of cognitive habits of thought or, to stretch an analogy, cognitive subroutines. The important property here is their habitual or stereotypic nature. In the same way, it can be hypothesized that because preference questions are often asked in everyday life, a consumer develops a predisposition to automatically answer with a preference response. Accordingly, the consumer will respond in this predisposed manner to a preference test, even with a placebo pair. Motivationally, such behavior is described as responding to hidden demand characteristics. To highlight its attentional aspects, it can also be described as an example of fast thinking: responding without paying proper attention to the task.

The problem becomes one of how to remove consumers from their fast thinking mode and transfer the preference judgement back to slow thinking. It is hypothesized that with the more consciously aware consideration of central processing, the slow thinking consumer will pay proper attention to the sensory characteristics of the placebo pair, rather than acting on “auto pilot.” The problem becomes one of how difficult it might be to achieve this.

Neal et al.’s (2011) popcorn experiment suggests that it might be easy. The experiment provided an example of how a seemingly trivial change in an experimental protocol, such as changing to using one’s subdominant hand, can change a consumer’s popcorn eating behavior. It affected the routine eating behavior of eating without paying sufficient attention to the flavor, whereby stale popcorn was eaten as much as fresh popcorn. This could be described as eating in a fast thinking mode. Changing to eating with the subdominant hand elicited rejection of the stale popcorn. This could be described as changing to eating in a slow thinking mode, whereby greater attention was paid to the popcorn’s sensory properties. From this, it was hypothesized that even a small change in experimental conditions of a preference test, such as unusual instructions and questions, could be different enough to separate consumers from the usual testing conditions and cause a switch from fast thinking to slow thinking. Then, consumers might pay closer attention to the sensory attributes of the food in the placebo condition and report a lack of preference. Whether this model is correct or not, the experimental protocol deduced from it was successful.

It can be argued that describing the predisposition to respond to a preference test by giving a preference as a response bias, as a response to hidden demand characteristics or as fast thinking amounts to describing the same type of behavior; the terms merely highlight different aspects of that behavior. A consideration of response bias or hidden demand characteristics stresses the motivational aspects of this behavior, while fast thinking stresses the attentional aspects. Both of these suggest that many of the extraneous factors that the consumer responds to in the placebo condition are internal. Yet, there is a further point. Consider the explanation using response bias, namely responding to hidden demand characteristics. It can be argued that such an explanation is merely a renaming of the behavior and not a true explanation. Why do consumers respond to a placebo pair with a preference? Because of response bias. Define response bias in this situation. It is responding to the placebo pair with a preference. The fast thinking explanation does not suffer from this circularity and was therefore able to be used in a predictive capacity.

REFERENCES

ALFARO-RODRIGUEZ, H., O’MAHONY, M. and ANGULO, O. 2005. Paired preference tests: d′ values from Mexican consumers with various response options. J. Sensory Studies 20, 275–281.

ALFARO-RODRIGUEZ, H., ANGULO, O. and O’MAHONY, M. 2007. Be your own placebo: A double paired preference test approach for establishing expected frequencies. Food Qual. Prefer. 18, 353–361.

ALFARO-RODRIGUEZ, H., ANGULO, M. and O’MAHONY, M. 2008. Paired preference tests: “50:50” and “alternating” no preferences. J. Sensory Studies 23, 765–779.

ALVAREZ-COUREAUX, Y., AGUILAR, P., O’MAHONY, M. and ANGULO, O. 2010. Assessment of preference with controls for response bias operating in the test situation: A practical example using omega-3 enriched wholegrain breads with Ecuadorian consumers. J. Sensory Studies 25, 659–671.

ANGULO, O., OKAYAMA, K., NAKAMURA, T., YUEN, R. and O’MAHONY, M. 2009. Use of purchase preference options to increase “no preference” frequencies in placebo preference tests. J. Sensory Studies 24, 258–268.

BOWKER, A.H. 1948. A test for symmetry in contingency tables. J. Amer. Stats. Assoc. 43, 572–574.

CHAPMAN, K.W. and LAWLESS, H.T. 2005. Sources of error and the no-preference option in dairy product testing. J. Sensory Studies 20, 454–468.

ENNIS, D.M. and COLLINS, J. 1980. The distinction between discrimination and splitting in paired testing. Report #80–233, Philip Morris Research Center, Richmond, Virginia, pp. 50.

KAHNEMAN, D. 2011. Thinking, Fast and Slow, Farrar, Straus & Giroux, New York, NY.

KIM, H-S., LEE, H-S., O’MAHONY, M. and KIM, K-O. 2008. Paired preference tests using placebo pairs and different response options for chips, orange juices and cookies. J. Sensory Studies 23, 417–438.

LAWLESS, H.T. and HEYMANN, H. 2010. Sensory Evaluation of Food. Principles and Practices, Springer, New York, NY.

MARCHISANO, C., LIM, J., CHO, H.-S., SUH, D.-S., JEON, S.-Y., KIM, K.O. and O’MAHONY, M. 2003. Consumers report preference when they should not: A cross-cultural study. J. Sensory Studies 18, 487–516.

MAXWELL, A.E. 1970. Comparing the classification of subjects by two independent judges. Brit. J. Psychiat. 116, 651–655.

MCNEMAR, Q. 1947. Note on the sampling error of the difference between correlated proportions or percentages. Psychometrika 12, 153–157.

NEAL, D.T., WOOD, W., WU, M. and KURLANDER, D. 2011. The pull of the past: When do habits persist despite conflict with motives? Personality Soc. Psychol. Bull. 37, 1428–1437.

STUART, A. 1955. A test for homogeneity of the marginal distributions in a two-way classification. Biometrika 42, 412–416.

SUNG, Y.E., LEE, H.-S., O’MAHONY, M. and KIM, K.-O. 2011. Paired preference tests: Use of placebo stimuli with liking and buying preferences. J. Sensory Studies 26, 106–117.
