
THE INFLUENCE OF SP DESIGN ON THE INCENTIVE TO BIAS IN RESPONSES

Hui Lu, Tony Fowkes and Mark Wardman

Institute for Transport Studies, University of Leeds

1. INTRODUCTION

Stated Preference (SP) methods have been used extensively in transport research and elsewhere, both for demand forecasting and to value the importance attached to different product features and travel attributes. A great deal of scepticism surrounded the methods in their early years of application, to the extent that studies examined how well the results from SP methods validated against comparable Revealed Preference evidence (MVA/ITS/TSU, 1987; Louviere, 1988). SP is now very much an accepted method of travel behaviour analysis. However, there are instances where the findings of SP studies are a cause for concern, as we shall demonstrate. Alongside the broader acceptance and wider application of SP methods, some practitioners (Bates, 1998; Ampt et al., 2000; Wardman and Shires, 2001; Arentze et al., 2003) have argued for greater openness in discussing what they see as significant concerns surrounding SP. This paper, based on work undertaken for a PhD thesis, provides a contribution in that spirit, specifically addressing the issue of the strategic biasing of SP responses (Bonsall, 1986; Wardman, 1986).

This paper reviews and explores incentives for respondents to strategically bias their answers in order to influence policy makers. The paper examines two hypotheses concerning bias in SP responses, relating to task complexity and the presence of specific anti-bias warnings in the questionnaire. Section 2 of this paper summarises concerns surrounding the extent to which the responses to hypothetical questions reliably reflect individuals’ true preferences when there is an incentive to bias responses. The discussion is illustrated with examples from research in the transport field. Section 3 describes the design of the SP exercises and the data collection process whilst section 4 reports the SP results for commuters’ valuation of new trains in Greater Manchester.

2. SP BIASES

2.1 Incentive to bias: lessons from transport studies

There are several reasons why individuals’ responses to hypothetical questions might not reflect their true preferences (Bonsall, 1986; Bradley, 1988; Mitchell and Carson, 1989). Amongst these, the issues of realism, task complexity and familiarity have received a considerable amount of attention. On the other hand, and despite serious concerns in the early literature, the strategic biasing of responses tends to have been overlooked in recent times, particularly within the SP methodology.

It has long been recognised that some individuals will not reveal their true preferences when there is a benefit to be gained from not doing so. Samuelson (1954, p388) stated that “now it is in the selfish interest of each person to give false signals, to pretend to have less interest in a given collective consumption activity than he really has” whilst Bohm (1971) comments that “potential consumers of a proposed output of a public good have stated preferences which can only be expected to overestimate their true valuations. The simple reason is of course that the consequences as to their payments (eg a tax increase) have been left out of the process.” These statements illustrate the classic free-rider problem and the reverse incentive to overstate values where payment is not expected.

In the transport context, the earliest methods used to obtain monetary valuations of attributes or to forecast likely behaviour were based around direct willingness to pay and stated intention questions. They have an unenviable record. For example, stated intentions to use a new transport service invariably overpredict actual usage, often by a considerable amount, or imply somewhat larger demand elasticities than other methods (Gensch, 1980; Couture and Dooley, 1981; Chatterjee et al., 1983; Fowkes and Preston, 1991; Wardman and Shires, 2003). The reason is simply that respondents have an incentive to attempt to influence policy in their favour, since they are not committed to behave in accordance with their intentions, and the policy is readily apparent. Similarly, differences were found between respondents’ Willingness to Pay (WTP) for improvements and their Willingness to Accept (WTA) a (similar sized) deterioration (Fowkes, 1995).

By offering multiple trade-offs across a number of attributes, it is argued that SP methods can better mask the purpose of a study. Indeed, if all attribute variations are equally likely to occur then the method is compatible with respondents revealing their true preferences. SP methods were welcomed as an advance upon cruder stated intentions and willingness to pay approaches, and their numerous attractions largely won over an initially sceptical profession, such that their widespread use in transport in the 1980s soon came to dominate revealed preference (RP) applications (Wardman, 1998). In the environmental literature, concerns about bias in Contingent Valuation (CV) responses are well rehearsed and, as was the case in transport some years before, SP was regarded as an advance. Adamowicz et al. (1999, p.467) stated that “Strategic behaviour should be minimal in SP tasks since the choices are made from descriptions of attributes and it will not be clear which choice will over- or underrepresent a valuation”.

Wardman and Whelan (2001) reviewed 45 SP values of new or improved trains from a large number of disparate studies. Not only did the values generally seem implausibly large, but they were found to be three times higher where the purpose of the study would have been readily perceived as valuing new trains. In practice, fares were regulated by government and hence there was an incentive for respondents to signal high values to increase the chances that new trains are introduced, without financial consequences to themselves. Moreover, when the SP values were deflated by a third, they were very much more in line with the demand increases observed after the introduction of new stock. In each of four well defined and familiar choice contexts, Wardman and Shires (2001) estimated the RP values of the penalty of having to change trains to be lower than the SP values, with the former averaging 5.4 minutes and the latter averaging 11.5 minutes.

Charging for the use of road space is a contentious issue and it is difficult to mask the purpose of studies dealing with it. If strategic bias is present, SP studies covering tolling would be expected to detect a higher sensitivity to tolls than to other cost variations, and a higher sensitivity to tolls introduced on currently untolled roads than to toll variations on currently tolled roads or to tolls on newly built roads. These patterns are indeed evident in the literature (Wardman, 1998; 2001).

Turning to valuations of time spent standing on a train, SP based values raise concerns. Wardman (2003) covers 23 valuations from 8 studies. The mean value of standing time relative to seated time in the 20 instances where the purpose of the study would clearly have been seen as valuing overcrowding was 3.5. This falls to 2.7 across the three values from 2 studies where overcrowding was an element of a broader study looking at aspects of mode choice and interchange. SP mode specific constants favouring public transport modes over car, sometimes by appreciable amounts, are by no means uncommon.

Wardman and Bristow (2004) found large differences in valuations of aircraft movement between an SP exercise which aimed to conceal the purpose of the survey by placing aircraft movements alongside a wide range of quality of life variables and a more conventional SP exercise which offered trade-offs between local taxes and aircraft noise where the purpose of the study would have been obvious. The latter can be expected to attract more strategic bias and the higher values obtained by it are consistent with this.

We therefore contend that evidence from SP applications in transport suggests that where the objective of the exercise is obvious, especially where the issue is contentious, strategic bias is likely to occur.

2.2 Methods to amend incentive to bias: (i) Task Complexity

By adding complexity to the SP task, it may be hoped that respondents will exhibit less bias. This may partly be due to the extra effort required merely to complete the exercise without bias, but it is more likely to be due to respondents failing to see any single clear purpose to the exercise. In our example below, we will clearly be interested in valuing the difference between two types of rolling stock, even though ‘journey time’ and ‘frequency’ are included in the base experiment. When we additionally include ‘punctuality’ and ‘crowding’, the purpose of valuing the difference in stock type may seem less important to respondents and so lead to a reduced incentive to bias the responses in that regard.

2.3 Methods to amend incentive to bias: (ii) Cheap Talk

Some lessons can be learned from SP’s counterpart, the Contingent Valuation (CV) method used in environmental and political research. Bohm (1984, p140) stated that “to improve the performance of any method in which there are possible incentives for misrepresentation, measures should be taken a priori to reduce the inclination to give in to such incentives”. He suggests (a) that the survey not be anonymous; and (b) that respondents be informed of the importance of consumer valuations for decision making. Providing warnings against bias has come to be known in CV research as ‘Cheap Talk’ (Cummings and Taylor, 1999; Neil, 1995; Loomis et al., 1996; Brown, 2003; Murphy, 2003). Cheap talk draws on lessons from experimental economics and psychology concerning the design of valuation instruments. Cummings and Taylor (1999) stated that:

“Cheap talk refers to the costless transmission of signals and information (i.e., cheap talk does not directly affect the payoffs of players in a game). Many game theorists typically use the term ’cheap talk’ in referring to nonbinding communication of actions by two or more players in a game prior to their actual binding commitment. As an anonymous reviewer has pointed out, we use the term ‘cheap talk’ in a parallel way, referring to nonbinding communication of actions by two or more players in an experiment prior to their hypothetical commitment.”

Simply put, Cheap Talk (CT) is a warning message placed before the survey that tells respondents what has caused biases in previous experiments. Some of the above studies showed that a properly designed Cheap Talk insert can effectively reduce the bias in CV methods without causing other biases. We know of no studies that apply CT in SP experiments.

Most empirical studies have followed the CT script from Cummings and Taylor (1999). Their script describes the bias problem and explicitly asks respondents not to overstate their true willingness to pay (WTP). It is a ‘long-version’ of CT, describing the bias and reasons for bias. There is little evidence of the effectiveness of the less detailed ‘short-version’ of Cheap Talk (see Poe et al, 2002).

Most of the previous research was carried out by in-house or personal interview; only one experiment was carried out by telephone interview (Aadland and Caplan, 2003). There is no evidence on whether or not cheap talk can be used in mail-back surveys.

2.4 Research hypotheses

In this research, a series of experiments was carried out to test for the existence of incentives to bias and to explore ways of reducing them through SP design and practice. Firstly, as discussed in section 2.2, if the aim of the survey is masked by the inclusion of additional attributes, the incentive to bias might be reduced. This generates the first hypothesis:

Hypothesis 1 (H1): The task complexity of the SP designs affects the valuations implicit in responses.

In our experiment, two designs with different numbers of attributes were used to test whether adding more attributes changes respondents’ choices.

Secondly, as discussed in section 2.3, we tried including a ‘Cheap Talk’ warning against bias. Our second hypothesis is therefore:

Hypothesis 2 (H2): Adding Cheap Talk affects respondents’ choices, dampening the incentive for strategic bias.

In our experiment, a paragraph of CT was added into some of the SP experiments, following a small focus group and the pilot survey. The paragraph of CT was as follows:

“Previous surveys have sometimes found that people say they would be happy to pay extra for improved trains but when the fare is raised and the improved trains are provided, people say they would prefer the cheaper fare with the old trains. Bearing this in mind, as you read through the following choices, please imagine you will actually have to pay the fare stated.”

3. SP EXPERIMENT DESIGN AND DATA COLLECTION

3.1 Introduction to the SP experiment

The paper aims to identify the influence of different designs on the pattern of SP responses and to explore means of identifying and reducing strategic bias.

Quality of service and passengers’ priorities are important items which many operators seek to quantify, on the grounds that concentrating on the most important aspects may increase patronage and improve profitability. Prior to introducing new or refurbished rolling stock as part of their franchises, Train Operating Companies (TOCs) have often carried out SP surveys to investigate public preferences regarding the fleet, to test whether the improvement in the fleet and services is enough to justify a fare increase, and thus to evaluate the costs and benefits of the investment. The rolling stock value derived from SP results is used not only for welfare appraisal and pricing but also for rail demand forecasting. By using the fare elasticity, the value of rolling stock can be converted to a demand effect (ATOC, 2005).

In previous studies, SP was found to be the most reliable method to investigate the quality improvements. Reviews of the application of SP to rolling stock studies have found that monetary values for time and frequency from SP methods have been remarkably consistent among studies (MVA, 1993). However, the valuation of soft variables such as service quality (rolling stock, punctuality, crowding and so on) is rather less convincing.

Some empirical evidence shows that values from SP are much higher than those from RP (Revealed Preference) when the issue is the introduction of new or refurbished rolling stock. This can be explained by a bias whereby respondents overstate the value of characteristics that improve their well-being (Wardman, 1998), or by packaging effects (Jones, 1997). As a result, bias has been found when values obtained from SP exercises are applied in demand forecasting (Wardman and Whelan, 2001; ATOC, 2002; Wardman, 2003).

In the present research, the SP experiment was conducted in Greater Manchester, UK. Most of the trains running on the commuter routes in Greater Manchester (mapped in Figure 1) are Pacers (Class 142 and 144). Newer trains, Sprinters (Class 150) and Super Sprinters (Class 153 and 156), were introduced on some of the routes, so that most travellers will have had experience of both Pacers and the newer trains. The subject was commuters’ valuation of the difference between the two types of rolling stock shown in Figure 2. Sixteen stations were chosen for the main survey. The two criteria for choosing the stations were: firstly, that Pacers are currently running on the route; secondly, that the number of passengers boarding in the morning peak hour is large enough (>150).

3.2 SP experiments outline

The experiment outline is presented in Table 1. Simply put, there were four designs in the SP experiments, formed by combining two factors: Cheap Talk and Task Complexity.

Table 1: The four SP Experiments

| SP Design | Adding of Cheap Talk (CT) | Task Complexity (Masking the Aim) |
|---|---|---|
| SP1 | No | Simple (3 attributes) |
| SP2 | Yes | Simple (3 attributes) |
| SP3 | No | Complex (5 attributes) |
| SP4 | Yes | Complex (5 attributes) |

Figure 1: Railway map in Greater Manchester

Figure 2: Stock types – Super Sprinters versus Pacers

3.3 SP experiment design

After a review of studies on rolling stock in recent years, a set of SP experiment designs was used to investigate the effects of different designs on responses. Cost and journey time were obvious attributes to include.

Attributes such as cleanliness, ventilation, noise, crowding and smoothness of ride have been suggested to have a strong influence on the valuation of rolling stock. Attributes relevant to the rail service, such as service frequency, reliability and punctuality, were also taken into account in previous studies.

After much thought, three common attributes were chosen: fare (Cost), in-journey time (Time) and headway (Frequency). In the complex experiment, two more service attributes were included to investigate the effects of task complexity and to see if adding more variables would affect people’s perception of the task. Those two attributes were punctuality and crowding. In all cases, Option A referred to Super Sprinters and Option B to Pacers.

Attribute levels should be realistic and acceptable to respondents. In this research, most of the respondents were commuters in Greater Manchester. Journey time is around 20-50 minutes and cost is around £4-£6. The presentation and measurement of crowding and punctuality are difficult. After the review of previous research, crowding and punctuality were given as the combination of probability of occurrence and length of time. Table 2 gives an example of the choice in the SP design.

Table 2: Example of the choice in the SP experiment

| CHOICE 7 | Option A | Option B |
|---|---|---|
| Train Type | Super Sprinters | Pacers |
| Journey Time | 20 minutes | 30 minutes |
| Single Fare | £2.00 | £1.80 |
| Frequency | Every 20 minutes | Every 10 minutes |
| Punctuality | 2 out of 5 times delay for 10 minutes | Always on time |
| Crowding | Enough seats | 2 out of 5 times stand for whole journey |
| Preference | | |

A fractional factorial design was used for the SP experiments. For the simple design, there were 3 attributes for each alternative with 3 levels each, which generated 9 scenarios. The complex design generated 18 scenarios. To limit the task load, these 18 were split between two questionnaires of 9 each. Simulations were carried out using Excel and ALOGIT (Daly, 1992). Boundary ray maps were constructed, and simulations run, as part of tests to see whether the design could recover the target attribute valuations sufficiently accurately (Fowkes, 1991).
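The paper does not state which fraction of the full factorial was used, so the following is only a minimal sketch of one standard way to obtain 9 runs from 3 attributes at 3 levels each. The level coding (0, 1, 2) and the Latin-square generator are assumptions made for illustration, not the authors' design.

```python
from collections import Counter
from itertools import product


def fractional_factorial_3x3():
    """Return a 9-run fraction of the 27-run 3^3 full factorial.

    Levels are coded 0, 1, 2. The third column is generated from the
    first two as c = (a + b) mod 3 (a Latin-square construction).
    This generator is an illustrative assumption.
    """
    return [(a, b, (a + b) % 3) for a, b in product(range(3), repeat=2)]


def is_orthogonal(design):
    """Check that every pair of columns shows each level pair equally often."""
    n_cols = len(design[0])
    for i in range(n_cols):
        for j in range(i + 1, n_cols):
            counts = Counter((row[i], row[j]) for row in design)
            if len(counts) != 9 or len(set(counts.values())) != 1:
                return False
    return True


design = fractional_factorial_3x3()
for run, levels in enumerate(design, start=1):
    # Columns could stand for, e.g., fare, journey time and headway levels.
    print(run, levels)
print("orthogonal:", is_orthogonal(design))
```

Each coded level would then be mapped to an actual fare, journey time or headway value before the scenarios are presented to respondents.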

3.4 Post-questionnaire questions on respondents’ perception

Respondents make their decision based on their perception of the experimental subject. To probe the decision making process of respondents and to gain a better understanding of the effect of perceptions on their decision making, 3 post–questionnaire questions were added into the survey. More specifically, respondents’ opinions of task complexity, familiarity of the alternatives, and the perception of price increase following the introduction of the new trains were explored. Some interesting effects were found, but space prevents their inclusion in this paper.

3.5 Data collection and description

The questionnaire contained four parts. Part 1 gathered basic information on the journey. Part 2 was a conventional SP exercise presenting 9 choices to respondents; in some of the questionnaires, CT and Complexity were added to test the research hypotheses. Part 3 gathered socio-economic data on respondents, allowing us to adjust to external data. Part 4 explored respondents’ perceptions.

A pilot survey was conducted in August 2005, followed by the main survey from November to December 2005 in Greater Manchester. A paper-based questionnaire with a FREEPOST envelope was handed out to respondents at the 13 railway stations. A total of 2768 questionnaires were distributed. 1321 commuters mailed them back, of which 1222 were usable. The response rate was 47.7%.

4. RESULTS

4.1 Pooling the data

This section presents the data analysis based on 1222 respondents’ answers which gave 10998 preference observations. In the preference data analysis, an MNL model has been applied to find the coefficients for each attribute. We distinguish between short distance trips (S) and long distance (L). With ‘simple’ and ‘complex’ designs, each with and without ‘Cheap Talk’, we have 4 different SP designs (1 to 4), each with 2 options (A and B).

In pooling data across different SP designs, we need to allow for any differences in scale. In Logit models, estimated coefficients are ‘scaled’ according to the unexplained error. The larger the variance of the error term, the smaller the scale. For more detail see Louviere et al (2000). Consequently, for example, our more complex designs might be expected to yield a smaller scale due to a larger amount of residual variation. Failing to account for the difference in scale could attribute its effect to the coefficient estimates. A hierarchical logit model has been estimated which pools each of the binary choice experiments and, by specifying dummy nests as demonstrated in Figure 3, allows the logsum parameter to serve as a rescaling parameter. In this example, the parameters are all scaled to be in the units of the first SP exercise (by scaling factors S1 to S7). This is a standard procedure for pooling across data sets.
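The relationship between scale and error variance described above can be written compactly. This is a sketch under the usual logit assumption of i.i.d. Gumbel errors; the symbols \mu_d (scale of data set d), \beta (underlying taste parameters) and \theta_d (scale of data set d relative to the first data set) are introduced here for illustration and are not the paper's notation.

```latex
% Assumption: i.i.d. Gumbel errors with scale \mu_d in data set d.
\operatorname{Var}(\varepsilon_d) = \frac{\pi^2}{6\,\mu_d^{2}},
\qquad
\hat{\beta}_d = \mu_d\,\beta,
\qquad
\theta_d = \frac{\mu_d}{\mu_1}
\;\;\Longrightarrow\;\;
\frac{\operatorname{Var}(\varepsilon_d)}{\operatorname{Var}(\varepsilon_1)} = \frac{1}{\theta_d^{2}}
```

Under these assumptions, a relative scale factor below one corresponds to a data set with larger residual variance than the reference data set, and ratios of coefficients within a data set are unaffected because the scale cancels.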

Figure 3: Artificial tree structure to obtain scale factors

Table 3 reports results from MNL models, allowing different scale factors for the 8 data sets. t tests are against zero or unity, as appropriate. As shown in Equation 1, $ASC_i$ represents the constant term for data set $i$. Separate ASC coefficients are presented for our 8 cases (short v long for 4 SP designs) reflecting the value of the difference in rolling stock types. The $X_k$ represent the variables in the SP design such as cost and time etc. The $\beta_j$ are the incremental factors for the income effects on cost.

$$U_i = ASC_i + \sum_{k=1}^{n} \left(\alpha_k \cdot X_k\right) + \sum_{j=1}^{6} \left(\beta_j \cdot inc_j \cdot cost\right) \qquad \text{(Equation 1)}$$

Table 3: Results from an MNL model estimation

| Parameter | Model 1 coef | Model 1 t(0) | Model 2 coef | Model 2 t(0) |
|---|---|---|---|---|
| Income Effects | | | | |
| Base (Inc1 & Inc6) | 0 | | 0 | |
| Inc2(cost) (£10-20k) | 0.0019 | (2.18) | 0.0019 | (2.00) |
| Inc3(cost) (£21-35k) | 0.0050 | (5.96) | 0.0052 | (6.10) |
| Inc4(cost) (£36-50k) | 0.0076 | (5.33) | 0.0087 | (5.55) |
| Inc5(cost) (over £50k) | 0.0115 | (4.12) | 0.0126 | (5.82) |
| Estimation of Coefficients | | | | |
| Time | -0.0906 | (-5.98) | -0.1002 | (-11.50) |
| Cost | -0.0200 | (-6.63) | -0.0220 | (-15.84) |
| Frequency | -0.0662 | (-6.59) | -0.0741 | (-14.82) |
| Punctuality | -0.4787 | (-6.46) | -0.5270 | (-10.94) |
| Crowding | -0.1455 | (-6.07) | -0.1596 | (-10.49) |
| Estimation of ASC (new rolling stock) | | | | |
| New stock in S1 | 0.3291 | (4.80) | 0.3407 | (4.90) |
| New stock in S2 | 0.3877 | (4.80) | 0.4267 | (5.79) |
| New stock in S3 | 0.3986 | (3.63) | 0.4296 | (3.82) |
| New stock in S4 | 0.3138 | (2.56) | 0.3553 | (3.02) |
| New stock in L1 | 0.3910 | (4.49) | 0.4069 | (7.10) |
| New stock in L2 | 0.2471 | (3.47) | 0.3304 | (5.23) |
| New stock in L3 | 0.5695 | (4.90) | 0.6420 | (5.47) |
| New stock in L4 | 0.3093 | (3.46) | 0.3232 | (3.11) |

| Scale Factor | Model | coef | t(0) | t(1) |
|---|---|---|---|---|
| θSS1 (Open Nest) | Model 1 | 1 | | |
| θSS2 | Model 1 | 1.0852 | (6.99) | (0.55) |
| θSS3 | Model 1 | 0.7154 | (5.40) | (-2.15) |
| θSS4 | Model 1 | 0.7671 | (6.19) | (-1.88) |
| θLS1 | Model 1 | 1.0037 | (5.45) | (0.02) |
| θLS2 | Model 1 | 1.3353 | (5.56) | (1.40) |
| θLS3 | Model 1 | 0.6069 | (5.38) | (-3.48) |
| θLS4 | Model 1 | 0.5460 | (6.29) | (-5.23) |
| θLSS12 | Model 2 | 1 | | |
| θSS34 | Model 2 | 0.6857 | (10.79) | (4.95) |
| θLS34 | Model 2 | 0.5324 | (10.23) | (8.98) |

| Summary statistic | Model 1 | Model 2 |
|---|---|---|
| No. of observations | 10885 | 10885 |
| ρ² with respect to constants | 0.1198 | 0.1193 |
| Log likelihood of constants | -6550.8 | -6554.7 |

Chi-test statistic: 7.77. Degree of Freedom: 4. Chi-test Critical Value (5%): 9.49.

In Model 1, SS1 was chosen as the open nest, and the scale factors were obtained from Nested Logit models. Among the other 7 data sets, the scale factors for SS2, LS1 and LS2 were not significantly different from 1, which indicates that their error variances were not significantly different from that of the open nest. In Model 2, the data sets were combined into three groups: Simple for both short and long journeys (LSS12), Complex for short journeys (SS34) and Complex for long journeys (LS34). A likelihood ratio test shows that Model 2 is not significantly worse than Model 1. The following analysis is based on Model 2.
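The two checks in this paragraph can be retraced from the figures in Table 3. The sketch below assumes that each t-ratio against zero equals the coefficient divided by its standard error, and that the two log-likelihood values printed in Table 3 are the fitted values of Models 1 and 2; both are assumptions about how the table was produced. The critical value is taken from the table, not computed.

```python
def t_against_unity(coef, t_zero):
    """Recover the t-ratio against 1 from a coefficient and its t-ratio against 0."""
    std_err = coef / t_zero  # because t(0) = coef / std_err
    return (coef - 1.0) / std_err


# Model 1 scale factors from Table 3: name -> (coefficient, t(0)).
scale_factors = {
    "SS2": (1.0852, 6.99),
    "SS3": (0.7154, 5.40),
    "SS4": (0.7671, 6.19),
    "LS1": (1.0037, 5.45),
    "LS2": (1.3353, 5.56),
    "LS3": (0.6069, 5.38),
    "LS4": (0.5460, 6.29),
}

for name, (coef, t0) in scale_factors.items():
    print(f"{name}: t(1) = {t_against_unity(coef, t0):.2f}")

# Likelihood ratio test of Model 2 (restricted) against Model 1.
ll_model1 = -6550.8
ll_model2 = -6554.7
lr_stat = 2.0 * (ll_model1 - ll_model2)

CHI2_CRITICAL_5PCT = 9.49  # value quoted in Table 3
model2_not_rejected = lr_stat < CHI2_CRITICAL_5PCT
print(f"LR statistic = {lr_stat:.2f}, Model 2 not rejected: {model2_not_rejected}")
```

Computed from the rounded log-likelihoods, the statistic comes out at about 7.8, close to the 7.77 reported in Table 3; the small gap is presumably due to rounding of the published values.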

4.2 Effects of design variables (Hypothesis 1)

Income effects were entered into the model as multiplicative (interaction) terms on cost, to show how respondents’ income affects their decision making. The coefficients are statistically significant with reasonable values. As income increases, the weight on cost decreases, which indicates that respondents are less sensitive to cost when their incomes are higher.

The coefficients estimated in Model 2 all have the correct sign and significant t-values (absolute value >1.96, 95% level). The monetary value of each variable is derived as the ratio of that variable’s coefficient to the cost coefficient. Taking this ratio also cancels out the scale factor effects. Table 4 gives the monetary values for each income group.
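As a worked illustration of the ratio calculation, the sketch below uses the Model 2 coefficients from Table 3. It assumes that the cost coefficient for an income group is the base cost coefficient plus that group's income increment, which is what the interaction term in Equation 1 implies; the function and variable names are hypothetical.

```python
# Model 2 estimates from Table 3.
BASE_COST_COEF = -0.0220
INCOME_INCREMENT = {
    "Base": 0.0,
    "Inc2": 0.0019,
    "Inc3": 0.0052,
    "Inc4": 0.0087,
    "Inc5": 0.0126,
}
ATTRIBUTE_COEF = {
    "VoT": -0.1002,  # time
    "VoH": -0.0741,  # frequency (headway)
    "VoP": -0.5270,  # punctuality
    "VoC": -0.1596,  # crowding
}


def cost_coefficient(group):
    """Cost coefficient for an income group: base plus income increment."""
    return BASE_COST_COEF + INCOME_INCREMENT[group]


def monetary_value(coef, group):
    """Value of an attribute in money units: |coefficient / cost coefficient|.

    The absolute value is taken so that both disutilities (negative
    coefficients) and the new-stock constant (positive) are reported
    as positive amounts.
    """
    return abs(coef / cost_coefficient(group))


for group in INCOME_INCREMENT:
    values = {name: round(monetary_value(c, group), 2)
              for name, c in ATTRIBUTE_COEF.items()}
    print(group, round(cost_coefficient(group), 4), values)

# The same ratio applied to an alternative-specific constant (new stock, S1).
print("VoRS1, base group:", round(monetary_value(0.3407, "Base"), 2))
```

The scale factor cancels in these ratios because the numerator and the denominator are scaled by the same amount within a data set.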

As mentioned before, we investigate the effects of SP designs on the value of new rolling stock (VoR) implied by responses by estimating separate ASC terms in the model. From the results, different values of rolling stock were obtained for the 8 different designs (VoRS1-4, VoRL1-4). These are shown, disaggregated by income band, in Table 4, along with values of time (VoT), headway (VoH), punctuality (VoP), and crowding (VoC). The value of in-vehicle time is from 4.6 to 10.2 p/m for different income groups, which agrees with previous value of time studies (Wardman, 2001). People with higher income have a higher value of time.

Table 4: Monetary values of design variables derived from Model 2 (pence/minute).

| Income group | Cost Coef | VoT | VoH | VoP | VoC |
|---|---|---|---|---|---|
| Base Group | -0.0220 | 4.55 | 3.37 | 23.95 | 7.25 |
| Inc2 £10-20k | -0.0201 | 4.99 | 3.69 | 26.22 | 7.94 |
| Inc3 £21-35k | -0.0168 | 5.96 | 4.41 | 31.37 | 9.50 |
| Inc4 £36-50k | -0.0133 | 7.53 | 5.57 | 39.62 | 12.00 |
| Inc5 over £50k | -0.0094 | 10.66 | 7.88 | 56.06 | 16.98 |

| Income group | Cost Coef | VoRS1 | VoRS2 | VoRS3 | VoRS4 | VoRL1 | VoRL2 | VoRL3 | VoRL4 |
|---|---|---|---|---|---|---|---|---|---|
| Base Group | -0.0220 | 15.49 | 19.40 | 19.53 | 16.15 | 18.50 | 15.02 | 29.18 | 14.69 |
| Inc2 £10-20k | -0.0201 | 16.95 | 21.23 | 21.37 | 17.68 | 20.24 | 16.44 | 31.94 | 16.08 |
| Inc3 £21-35k | -0.0168 | 20.28 | 25.40 | 25.57 | 21.15 | 24.22 | 19.67 | 38.21 | 19.24 |
| Inc4 £36-50k | -0.0133 | 25.62 | 32.08 | 32.30 | 26.71 | 30.59 | 24.84 | 48.27 | 24.30 |
| Inc5 over £50k | -0.0094 | 36.24 | 45.39 | 45.70 | 37.80 | 43.29 | 35.15 | 68.30 | 34.38 |

Table 4 shows no clear picture regarding the effect of task complexity on user valuation of rolling stock. Only group LS3, the complex design without CT, shows a statistically significantly higher value of stock; compared with group LS1, this suggests that task complexity has raised the valuation, contrary to expectation. For the other groups the findings differ but are non-significant, yielding no clear overall picture.

Turning to the scale factors, Table 3 shows that the scale factors for the simple design are larger than those for the complex design. This indicates that the variance of the responses to the complex design was larger than that of the responses to the simple design, which agrees with expectation (Bradley and Daly, 1994). We are still investigating task complexity in our survey, but our initial finding is that it makes responses less accurate but does not clearly alter the valuations.
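The link between a smaller scale factor and a larger error variance can be illustrated numerically. The sketch below assumes the usual logit set-up in which the error term is Gumbel distributed, so that its variance is pi^2 / (6 * scale^2) and is therefore inversely proportional to the square of the scale. The function names are hypothetical, and the two scale factors are those reported later for Model 3 in Table 6 (relative to a base fixed at 1), used here purely as example inputs.

```python
import math


def gumbel_error_variance(scale):
    """Variance of a Gumbel error term with the given logit scale."""
    return math.pi ** 2 / (6.0 * scale ** 2)


def relative_error_variance(theta):
    """Error variance relative to the base design, whose scale is fixed at 1."""
    return gumbel_error_variance(theta) / gumbel_error_variance(1.0)


# Scale factors copied from Table 6 (Model 3); the base scale is 1.
scale_factors = {"thetaS34": 0.6921, "thetaL34": 0.5358}

relative_variance = {
    name: relative_error_variance(theta)
    for name, theta in scale_factors.items()
}

for name, ratio in relative_variance.items():
    print(f"{name}: error variance about {ratio:.2f} times the base")
```

Under this assumption a scale factor of roughly 0.7 corresponds to about twice the base error variance, and one of roughly 0.54 to about three and a half times.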

4.3 The effects of Cheap Talk (Hypothesis 2)

This section presents the effects of CT on the valuation of new rolling stock in our survey. Referring back to Table 3, the ASC (new stock) coefficients are usually lower for the even-numbered designs (with CT) than for the corresponding odd-numbered designs. This is not the case for S2 v S1, and the difference is only significant for L4 v L3. On balance, Table 3 suggests that adding CT dampens people's incentive to bias strategically.

From Table 4, we see the same effects in terms of the monetary value of rolling stock. This is illustrated in Figure 4. For example, for income group 2 (£10-20k), the largest group in the sample, the group with CT had a lower value of rolling stock than the one without CT in each adjacent odd-even pair, except for group 'SS' (simple design, shorter journey). This indicates that CT lowers respondents' valuation of new trains in the complex design and, for longer journey distances, in both experiments.

Figure 4: Effects of SP design on the valuation of rolling stock for the income group £10-20k

[Bar chart: "Effects of CT on the valuation of new stock, income group £10-20k". Horizontal axis: SP design (VoRSS, VoRSC, VoRLS, VoRLC); vertical axis: monetary value (0 to 35); series: No CT, With CT.]

Table 5: Definition of variables

Monetary Value Group    Description
VoRSS                   Simple design in shorter journey distance
VoRSC                   Complex design in shorter journey distance
VoRLS                   Simple design in longer journey distance
VoRLC                   Complex design in longer journey distance

Table 6 reports Model 3, in which the effect of CT on the ASC is incorporated as a dummy variable. Tests of CT on the separate data sets found that its effect varies across designs (complex / simple) and across groups of respondents (short journey and long journey). The effect of CT was then tested on the pooled data, as shown in Table 6. The preferred model, Model 3, includes different 'design' factors, as shown in Equation 2. Here, ct is the dummy variable for the addition of Cheap Talk information and γ_i is the coefficient of Cheap Talk; d represents the dummy variable for the design variables.

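\[
U_i = ASC_i + \gamma_i \cdot ct + \sum_{y=1}^{m} (d_y \cdot design_y) + \sum_{k=1}^{n} (\alpha_k \cdot X_k) + \sum_{j=1}^{6} (\beta_j \cdot inc_j \cdot cost) \qquad \text{(Equation 2)}
\]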

Table 6: Results from the joint model

Model 3 Coefficient Estimation                       Coef.      t(0)

Income incremental effects (base: Inc1)
  Inc2 (Cost) (£10-20k)                              0.0018     (1.98)
  Inc3 (Cost) (£21-35k)                              0.0052     (6.50)
  Inc4 (Cost) (£36-50k)                              0.0086     (4.96)
  Inc5 (Cost) (over £50k)                            0.0126     (5.14)

Design Variables
  CT (Adding of Cheap Talk)                          -0.0543    (-2.00)
  LJ (Long Journey Distance) * CD (Complex Design)   0.1003     (1.37)

Other Variables
  Rolling Stock (ASC)                                0.4071     (8.85)
  Time                                               -0.1000    (-22.21)
  Cost                                               -0.0219    (-19.79)
  Freq                                               -0.0739    (-16.41)
  Punctuality                                        -0.5266    (-16.00)
  Crowding                                           -0.1596    (-14.26)

Scale Factors
  θLS12                                              1
  θS34                                               0.6921     (22.18)
  θL34                                               0.5358     (13.56)

No. of Observations                                  10885
Log likelihood of constants                          -6557.64
ρ² with respect to constants                         0.1189

Table 6 shows that CT has a significant negative effect on the estimated constant term (with higher significance in other models not shown). This indicates that when Cheap Talk was added to the design, respondents placed a lower valuation on new rolling stock (the valuation for Income Group 2 falls by 2.7 pence per one-way trip). At the same time, journey distance and complex design show some interaction effect on the estimated constant term, although the effect is not significant. From Models 1 and 2, in Table 3, the scale factor for the design with CT does not show a significant difference from that for the design without CT in each pair. This indicates that adding CT does not cause more variance in the responses. In summary, we therefore find that CT appears to be countering strategic bias.
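The 2.7 pence figure quoted for Income Group 2 can be reproduced from Table 6. The sketch below assumes that each income group's cost coefficient is the base cost coefficient plus that group's incremental effect, and that a monetary value is minus the ratio of an attribute coefficient to the cost coefficient. The function names are hypothetical; the coefficients are copied from Table 6.

```python
# Coefficients copied from Table 6 (Model 3).
BASE_COST_COEF = -0.0219
INCOME_INCREMENT = {"Inc2": 0.0018, "Inc3": 0.0052, "Inc4": 0.0086, "Inc5": 0.0126}
CT_COEF = -0.0543          # Cheap Talk dummy
ASC_NEW_STOCK = 0.4071     # constant for new rolling stock


def group_cost_coef(group):
    """Cost coefficient for an income group (the base group has no increment)."""
    return BASE_COST_COEF + INCOME_INCREMENT.get(group, 0.0)


def monetary_value(coef, group):
    """Monetary value in pence: minus the coefficient over the cost coefficient."""
    return -coef / group_cost_coef(group)


ct_effect_inc2 = monetary_value(CT_COEF, "Inc2")
stock_value_inc2 = monetary_value(ASC_NEW_STOCK, "Inc2")

print(f"Inc2 cost coefficient: {group_cost_coef('Inc2'):.4f}")
print(f"Effect of CT on the valuation: {ct_effect_inc2:.2f} pence per trip")
print(f"Value of new stock without CT: {stock_value_inc2:.2f} pence per trip")
```

The negative sign on the CT effect corresponds to the reduction of about 2.7 pence per one-way trip reported in the text.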

5. CONCLUSION

This paper has reviewed the incentives for respondents to bias their answers, especially strategic bias, in SP experiments. By increasing task complexity and adding Cheap Talk warnings against bias, this research aimed to find ways of reducing respondents' incentive to bias strategically.

From the results, we accepted Hypothesis 1, that adding task complexity affects how respondents make choices. When task complexity increases, the scale factor was found to decrease, which indicates that the variance of the errors in responses becomes larger. This suggests that respondents should be less able to strategically bias the results. However, from our results, it appears that adding more attributes did not affect respondents' valuation of new rolling stock in a clear way.

Regarding CT, we found no scale factor effects, so we conclude that respondents were not helped or hindered in completing the experiment by the addition of CT. Moving to the size of the valuations, adding CT significantly lowers the valuation of new trains. This is our major finding, and work is proceeding on investigating this further.

In addition, models including the perception attributes have been estimated and already show some significant results, but space has prevented discussion of them here.

ACKNOWLEDGEMENTS

The authors are grateful to GMPTE and ITS, University of Leeds for funding support for the SP survey, and to Northern Rail Ltd for permission to survey at their stations.

REFERENCES

Aadland, D., Caplan, A.J. (2003). Willingness to pay for curbside recycling with detection and mitigation of hypothetical bias. American Journal of Agricultural Economics, 85(2): 492-502.

Adamowicz, W.L., Boxall P.C., Louviere, J.J. and Swait J. (1999) Stated Preference methods for valuing environmental amenities. Chapter 13 pp. 460-482 in Bateman, I.J., and Willis, K.G., eds, Valuing Environmental Preferences: Theory and Practice of the Contingent Valuation Methods in the US, EU and Developing Countries. Oxford University Press

Ampt, E., Swanson, J., and Pearmain, D. (2000), Stated Preference: Too much Deference? Steer Davies Gleave Ltd. Stated Preference Modelling Techniques. PTRC, pp 191-201

Arentze, T., Borgers, A., Timmermans, H., and DelMistro, R. (2003) Transport Stated Choice responses: effects of task complexity, presentation format and literacy. Transportation Research Part E 39, pp229-244

ATOC (2002, 2005). Public Transport Forecast Handbook (PDFH 4.0)

Bates, J. (1998) Reflections on Stated Preference: theory and practice, in J de D. Ortúzar, D.A. Hensher and S. Jara-Diaz, eds, Travel Behaviour Research: Updating the state of play, chapter 6, 89-103, Pergamon, UK.

Bohm, P. (1971) An approach to the problem of estimating demand for public goods. Swedish Journal of Economics, 73 (1): 51-66

Bohm, P. (1984) Revealing demand for an actual public good, Journal of Public Economics 24 pp. 135-151.

Bonsall, P.W. (1986) Transfer price data – Its definition, collection and use, in E. Ampt, W. Brog and A.J. Richardson (eds), Selected Proceedings of Second International Conference on New Survey Methods in Transport, pp. 63-76. VNU Science Press, Utrecht.

Bradley, M. (1988) Realism and adaptation in designing hypothetical travel choice concepts. Journal of Transport Economics and Policy 22 (1), pp.121-137

Bradley, M., Daly, A. (1994), Use of the logit scaling approach to test for rank-order and fatigue effects in stated preference data. Transportation 21, pp167-184

Brown, T. C., Ajzen, I., and Hrubes, D. (2003) Further tests of entreaties to avoid hypothetical bias in referendum contingent valuation. Journal of Environmental Economics and Management. 46(2): 353-361.

Chatterjee, A., Wegmann, F.J., and McAdams, M.A. (1983) Non-Commitment bias in public opinion on transit usage. Transportation 11 pp 347-360.

Couture, M.R., Dooley, T. (1981) Analysing traveller attitudes to resolve intended and actual use of a new transit service. Transportation Research Record 194, Washington.

Cummings, R. G., Taylor, L. O. (1999) Unbiased value estimates for environmental goods: A Cheap Talk design for the Contingent Valuation method. The American Economic Review. 89(3): 649 - 665.

Fowkes, A. S. (1991) Recent developments in Stated Preference techniques in transport research, PTRC conference, reprinted in Ortuzar J. de D. (2000), Stated Preference modelling techniques, pp37-52, PTRC, London.

Fowkes, A.S. (1995) The influence of modelling error on the shapes of estimated demand functions, PTRC Conference, published as Transportation Planning Methods, pp49-60, PTRC, London.

Fowkes, A.S., Preston, J. (1991) Novel approaches to forecasting the demand for new local rail services. Transportation Research A, 25 (4), pp. 209-218.

Gensch, D.H. (1980) Choice model calibrated on current behaviour predicts public response to new policies. Transportation Research A 14, pp.171-180.

Hague Consulting Group (2000) ALOGIT 4.0EC. The Hague.

Loomis, J.B., Thomas, B., Beatrice, L., and Peterson, G. (1996) Improving validity experiments of Contingent Valuation methods: Results of efforts to reduce the disparity of hypothetical and actual Willingness to Pay. Land Economics, 72(4): 450- 461.

Louviere, J. J. (1988) Analyzing Decision Making: Metric Conjoint Analysis, Newbury Park; London

Louviere, J J., Hensher, D.A., and Swait, J. (2000) Stated Choice Methods: Analysis and Application, Cambridge : Cambridge University Press, 2000

Mitchell, R.C., Carson, R.T. (1989) Using Surveys to Value Public Goods: The Contingent Valuation Method. Resources for the Future, Washington DC.

Murphy, J. J., Stevens, T. H., Allen, P. G., and Weatherhead, D. (2003) A meta-analysis of hypothetical bias in Stated Preference valuation. Working paper. Amherst, MA: Univ. of Massachusetts, Dept. of Resource Economics.

MVA Consultancy, ITS University of Leeds, TSU University of Oxford (1987) Value of travel time savings. Policy Journals, Newbury, Berks.

MVA Consultancy (1993) Passenger Priorities Research Stage 1. Prepared for London Underground Limited.

Neil, H. R. (1995) The context for substitutes in CVM Studies: some empirical observations. Journal of Environmental Economics and Management, 29(3), 393-397.

Poe, G. L., Clark, J. E., Rondeau, D., and Schulze. W. D. (2002) Provision point mechanisms and field validity tests of Contingent Valuation. Environmental and Resource Economics, 23: 105-131.

Samuelson, P. (1954) The pure theory of public expenditure. The Review of Economics and Statistics, 36(4) 387-389

Wardman, M. (1986) Route choice and the value of motorists’ travel time: theoretical and methodological issue, Working Paper 224, Institute for Transport Studies, University of Leeds.

Wardman, M. (1998) The value of travel time: A review of British evidence. Journal of Transport Economics and Policy, 32(3), 285-315.

Wardman, M. (2001) Public transport values of time. Working Paper 564, Institute for Transport Studies, University of Leeds.

Wardman, M. (2003) Reliability of values obtained from Stated Preference methods. Institute for Transport Studies, University of Leeds.

Wardman, M., Bristow A.L. (2004) Noise and air quality valuations: evidence from Stated Preference residential choice models, Transportation Research D, 9(1) 1-27.

Wardman, M., Shires, J. (2001) Comparison of within mode revealed preference and Stated Preference choice models. Paper presented at AET European Transport Conference.

Wardman, M., Shires, J. (2003) Review of fare elasticities in Great Britain. Working Paper 573. Institute for Transport Studies, University of Leeds.

Wardman, M., Whelan, G. (2001) Valuation of improved railway rolling stock: A review of the literature and new evidence. Transport Reviews 21 (4) 415-448.