39
* *

Does reputation hinder entry? Study of statistical ... › uploadfiles › 2018_ps6_pa1.pdf · Blablacar is an online marketplace for ride-sharing. It was established in 2006 inrance,F

  • Upload
    others

  • View
    5

  • Download
    0

Embed Size (px)

Citation preview

Does reputation hinder entry?

Study of statistical discrimination on a platform ∗Work in progress, please do not share without permission.

Xavier Lambin†& Emil Palikot‡

March 28, 2018

Abstract

Using a dataset collected on a popular ride-sharing platform we, �rstly, docu-

ment that minorities achieve lower economic outcomes in terms of revenue, sold

seats and popularity of listings. Secondly, we show that minority drivers who have

persisted through several �rst interactions despite the discrimination, and built

reputation, achieve results that are similar to those of non-minorities. Finally,

we use the model of career concern's developed by Holmström (1999) to study

dynamics of e�ort.

1 Introduction

In many digital marketplaces, incomplete information plays a crucial role. Interactions

between users are multifaceted, and contracts cannot be exhaustive. In the words of

Joe Gebbia co-founder of AirBnB1 a crucial element of the success of his business is de-

signing trust. What he means by this is that online platforms, typically, match people

who otherwise do not know each other, to engage in one-time activity with uncer-

tain outcomes. In order to address this incomplete information problem, many online

∗We are grateful to Yassine Lefouili, Steven Tadelis, Daniel Garrett, Nicolas Pistolesi, Marc Ivaldiand Bruno Jullien for their valuable comments at various stages of the paper.†Ph.D candidate at Toulouse School of Economics , [email protected]‡Ph.D candidate at Toulouse School of Economics, [email protected]. Emil Palikot greatfully

acknowledges the �nancial support from the ERC grant no. 3409031https://www.youtube.com/watch?v=16cM-RFid9U , last accessed on Dec 5, 2017

1

marketplaces allow users to create pro�les where aside of product/service speci�c infor-

mation users present personal information, i.e., name, age, photo, links to social media

websites, etc. This is meant to facilitate transactions by establishing trust. However,

it can also lead to discrimination.

Economic research has documented discrimination of minorities in house-rentals (Edel-

man and Luca (2014)), ride-sharing (Farajallah et al. (2016)) or taxis (Ge et al. (2016))

across several developed countries. An important common feature of online platforms

is the existence of a reputation system. After each interaction parties are encouraged

to leave a rating and possibly a written review. In principle, this information should

allow updating beliefs on the types of reviewed users, and potentially enable them to

escape discrimination.

In this paper, we ask the question whether the potential of overcoming discrimina-

tion through reputation system is realized. Firstly, we empirically investigate whether

minority drivers face discrimination on a popular ride-sharing platform; secondly, we

demonstrate that building reputation, indeed, allows users to �ght back discrimination.

The reputation e�ect is shown in the context of cross-sectional data, panel and using

coarsened matching. Finally, based on Holmström (1999) we develop a model of the

behavior of drivers and argue that it well predicts taken e�ort and resulting grade.2

Relation to literature: �rstly, this paper relates to the large literature on statisti-

cal discrimination. While the concept has been described since the 1970s (see Phelps

(1972), Arrow (1973) and later by Altonji and Pierret (2001)) with numerous interesting

applications to the labor market (see e.g. Coate and Loury (1993),Farber and Gibbons

(1996), Charles and Guryan (2011), Lang and Lehmann (2012) ), online markets and

their relative anonymity lend themselves well to such discrimination. As a result, they

have also constituted new observation terrains for researchers : Edelman and Luca

(2014), Edelman et al. (2017) and Laouenan and Rathelot (2017) study discrimination

on the short-term house rental platform Airbnb. Castillo et al. (2013), Goddard et al.

(2015) and Ge et al. (2016) show evidence of discrimination in transportation systems.

Close to our empirical analysis, Farajallah et al. (2016) shows minorities have a lower

success rate on a carpooling platform. Our paper also identi�es discrimination. How-

ever, it goes into more details by showing that this discrimination is at least in part

2Results presented in this version of the project should be seen as preliminary. We are currentlyworking on improvements to both theoretical and empirical aspects of this paper.

2

statistical, rather than taste-based (Schuessler and Becker (1958))3, and in this, it is

closest to Laouenan and Rathelot (2017). Recent economic (and computer science) liter-

ature has studied e�ectiveness and design of reputation systems, some notable projects

being: Nosko and Tadelis (2015), Cabral et al. (2010), Bar-Isaac and Tadelis (2008),

Liu and Skrzypacz (2014), Livingston (2005), Jolivet et al. (2016), Bolton et al. (2004),

Mayzlin et al. (2014), Jullien and Park (2014) and Zervas et al. (2015). However, these

papers focus on understanding how consumers may react to the information provided.

They aim at improving the accuracy of the reputation system, by for instance reducing

fraud or providing adequate information. In contrast, our paper shows an excess of

precision in reputation may prevent entry of new users. Spagnolo (2012) and Butler et

al.(2017) show in lab experiments that a reputation system, if not designed wisely may

hinder the entry of new participants. We provide a formal analysis and identify this

e�ect on an existing marketplace. Kovbasyuk and Spagnolo (2017) also show a repeated

game with limited records maximizes the number of trades. Randomly changing types

provide the mechanism behind the identi�ed e�ect. In our model, this e�ect is due to a

trade-o� between revealing information, and allowing the entry of new agents. Finally,

we consider a model of moral hazard, which is related to seminal works in the �eld

Baron and Myerson (1982),Holmström (1999), and La�ont and Tirole (1986), but also

more recent works of Roger and Vasconcelos (2014) and Garrett and Pavan (2012).

The rest of the paper is organized as follows: Section 2 describes the functioning of a

car-pooling platform we are focusing on, as well as our data gathering process. Section

3 documents discrimination, as well as shows the positive e�ect of reputation building

using a cross-section and a matching method. Sections 4 and 5 provide some insights

into the mechanics of reputation building by adapting seminal models of moral hazard

to our setting. Section 6 concludes.

2 Empirical context and data collection

Blablacar is an online marketplace for ride-sharing. It was established in 2006 in France,

and today operates in 22 countries, mostly in Europe, but also Mexico, India, and

Brazil. There are eight million active drivers and over 50 million passenger4, which

makes Blablacar the biggest ride-sharing platform in Europe. The basic idea behind

3Darity and Mason (1998) provide a more detailed review of economic theories of discrimination4https://techcrunch.com/2017/04/10/how-blablacar-faced-growing-pains-and-had-to-change-its-

focus/

3

Blablacar is to enable drivers to o�er seats in their cars that would otherwise end up

empty. Hence, in principle Blablacar should not be used by "professional" drivers, but

by people that just happen to drive on a particular route. This is re�ected in a pricing

mechanism; drivers receive price recommendations based solely on distance, 0.062 EUR

per km, from which they can deviate; however, the maximum price is set 0.082EUR

per km. The idea behind is that Blablacar should allow drivers to cover their costs, but

not run a pro�table business.

After a trip, both passengers and the driver are encouraged to leave a review that

consists of written comment and a number of stars from 1 to 5, which are later on

available publicly. The review system has a simultaneous reveal feature, which means

that a user will not be able to access received review unless he writes one herself.

After typing a search for a ride between a pair of cities, a potential passenger sees a

list of available drivers, with their photo, name, average grade, and price. To see more

details about the driver, including the history of reviews and identity of reviewers a

future passenger has to click on the listing. Examples of pro�le and listing pages are in

the Appendix.A

Data collection We have collected our dataset using a web-crawler on blablacar.fr

website from 1.07.2017 to 08.03.2018. Crawling through the site, the program gathered

all the information available to prospective riders. It includes information on the ride

itself such as the price posted by the driver, time and date, destination, origin, type of

car, whether pets or large luggage are allowed, etc. It also collected detailed data on

the driver itself. Her name, age, picture, a short biography, number of Facebook friends

and other variables are observed. Most importantly, we collect the rating of the driver

and the history of ratings for each driver. The data also covers information related to

the listing itself such as the number of views of a given listing, how many seats have

already been sold etc.

In addition to this, we have matched this data with several other datasets: origins

of names and gender are established through matching with French government index

complemented with other publicly available sources5. In order to increase precision,

we have also used a machine learning software6 to con�rm matching procedure based

5https://www.data.gouv.fr/fr/datasets/liste-de-prenoms/ , http://www.signi�cation-prenom.net/ ,http://madame.le�garo.fr/prenoms/origine/. The complete list of names and origins is available uponrequest.

6www.kairos.com

4

on facial recognition. Passengers might have a preference for cars of higher quality;

we use the value of a car as a proxy. The market value of cars is approximated by

the average price of the same type of cars posted on eBay in Germany. As cost may

be a signi�cant driver of prices, the fuel e�ciency of cars is calculated by matching

car names with a dataset of fuel consumption of cars in long range distances (French

environment and energy management agency � ADEME). Distances and expected time

by car or public transportation are calculated using google.maps. We also include the

suggested and maximum prices set by Blablacar � which follows a cost-based formula

re�ecting the spirit of a not-for-pro�t activity. Drivers may deviate from the suggested

price, but may not exceed the maximum price. Finally, information speci�c to the city

of destination/departure is included, such as population, median income, index of crime

(French government statistics INSEE). We have selected routes that either start or end

in Paris (or close vicinity). The other end of the trip is one of the of remaining 140

largest French cities located further than 20 km from Paris7.

In these preliminary empirical results, we de�ne minority agents as these whose name

has an Arabic or African origin or connotation; In this, we follow most of literature.

However, this constitutes an extension of the de�nition of minorities compared to prior

investigations of discrimination on this platform which restrict minorities to drivers

which an arabic-sounding name (see Farajallah et al (2016)). Descriptive statistics of

selected variables are shown in table 1.

The average price of a ride is 28 euros, and traveling distances are of 370 km in a

car worth 6000 euros. The average driver is 38 years old and has posted (successfully

or not) almost 60 rides in the past. Most of the drivers are men (70%) and around

23% of drivers are from a minority. In our dataset we have about 240.000 observations;

drivers have a unique ID, thus in our dataset, we can identify 48.000 drivers with av-

erage four listings. Unfortunately, we have a number of missing observations for some

variables; therefore typically we will have much fewer observations in estimated regres-

sions. Drivers' strategic decisions are, foremost, setting a price and the number of seats

to o�er. BlablaCar claims that its mission is to help drivers recoup some of the costs

related to driving the car, rather than run a professional transportation endeavor. This,

apart from being an attractive marketing slogan results also in BlablaCar suggesting

prices to drivers, and setting a maximum price. Furthermore, passengers are informed

7The exclusion of small rides is motivated by the fact the core business of Blablacar lies in long-rangetransportation. In shorter ranges around Paris, public transport makes ride-sharing an unattractiveoption and pricing strategies may strongly di�er from that observed in the core market of the platform

5

Table 1: Descriptive statistics

Statistic N Mean St. Dev. Min Max

ride price 234,769 27.72 13.54 6 68price delta 236,529 3.68 3.57 −6.54 18.42driver's age 238,690 37.22 12.60 18 68number of reviews 238,992 41.34 64.22 0 423talkative 241,410 2.21 0.48 1 3published o�ers (#) 238,986 59.29 95.61 2 661number of views of a listing 238,982 18.50 23.58 0 145empty seats (#) 241,410 2.41 0.88 0 5taken seats (#) 241,410 0.29 0.60 0 4average grade 222,453 0.92 0.05 0.74 1.00revenue 238,841 6.56 14.80 0 76seniority (# months) 238,819 43.10 26.71 1 113hours untill ride 204,688 131.96 154.54 1.19 763.42o�er posted since 238,989 5.06 7.45 0.00 52.92post's per month 241,410 1.85 3.21 0.01 29.84picture 241,410 0.87 0.34 0 1bio (#words) 238,878 14.90 16.70 0 79price of car 200,975 6.07 5.10 0.60 27.80consumption 209,660 4.99 0.74 3.65 7.48competition 241,410 28.74 32.82 1 651median revenue (city) 224,634 18,935.67 2,096.17 13,060.00 30,904.50duration public transport (# minutes) 235,022 3.69 2.22 0.14 15.24km 236,062 368.78 178.71 55.14 858.49time of travel (# minutes) 241,410 13.27 4.59 0 23notice (# minutes) 228,771 10.49 9.97 0.00 49.85gender (male) 238,079 0.73 0.44 0 1minority 127,144 0.23 0.42 0 1automatic acceptance 241,410 0.53 0.50 0 1

6

whether a price o�ered is a deviation (up or down) from the suggested price with a

color code (green is a rather low price, red is a rather high price). Thus, the decision

of drivers is really whether, and how much to deviate from the suggested price, we call

this variable �price delta�. Figure 1 shows its distribution in our data.

Figure 1: Distribution of Price Delta

Price delta is close to normally distributed with some skewness to the right, but

the vast majority of drivers set prices within a 5 EUR range around the mean (3.7

EUR). The suggested price is based on expected fuel consumption for an average car

and the distance to be covered. Therefore, deviations can be to some extent explained

by drivers having cars with higher/lower average fuel consumption. Other than cost

elements, factors such as competition from other drivers on the platform as well as other

transportation options can in�uence pricing. Finally, and most importantly from our

perspective is the impact of minority status, and reputation. A characteristic feature

of many online reputation systems is that most users leave very high reviews, and as a

result, there is little variation in data. This is the case also with BlaBlaCar. Drivers are

evaluated on a scale from 1 to 5, and there is a possibility of leaving a written comment.

Figure 2 shows the distribution of average reputation of drivers. As is observed in many

online platforms low ratings are extremely rare (Nosko and Tadelis (2015), Dellarocas

and Wood (2008) ) with the vast majority of rates being within the 4-5 range (i.e.,

between �very good" and �perfect").

7

Figure 2: Distribution of average rating per driver

We collect several measures of economic outcomes. Firstly, we have a proxy for

revenue, which in our case is the product of pric and the number of seats taken as long

as not all of them are sold. Secondly, we have a number of seats reserved. Finally,

a potential passenger has to click on a listing before making a reservation, thus in

our opinion, a number of views that an o�er has received is a good measure of its

attractiveness. Our dataset may miss some very successful rides that are no longer

displayed when data is collected (see discussion in appendix B). We assume this may

happen especially with rides posted by non-minority drivers. As a consequence, the

estimates exhibited in the present paper should be understood as lower bound estimates.

3 Can reputation solve the discrimination problem?

We are interested in investigating whether minority drivers, who may be discriminated

when they enter the platform, can escape this discrimination by building a reputation.

As mentioned before discrimination of minorities is a well-documented phenomenon,

and in raw data, we also see this; listings of minorities receive fewer views 18.9 vs. 16.9,

they have fewer seats taken 0.291 vs. 0.285, and make lower revenue 6.7 vs. 5.9 EUR.

Table 2 introduces various controls:

We can notice several patterns that are consistent across all our measure, older,

8

Table 2: Economic outputs, regressed over driver and ride characteristics

Dependent variable:

number of views revenue taken seats

(1) (2) (3)

driver's age −0.058∗∗∗ (0.007) −0.011∗∗ (0.005) −0.001∗∗∗ (0.0002)number of reviews 0.020∗∗∗ (0.002) 0.016∗∗∗ (0.001) 0.001∗∗∗ (0.00005)talkative 0.723∗∗∗ (0.170) 0.183 (0.121) 0.008 (0.005)minority −0.878∗∗∗ (0.214) −0.631∗∗∗ (0.153) −0.021∗∗∗ (0.006)seniority (# months) −0.024∗∗∗ (0.003) −0.004∗ (0.002) −0.0002∗∗ (0.0001)hours untill ride 0.0005 (0.003) 0.002 (0.002) 0.0002∗∗ (0.0001)posted since 2.592∗∗∗ (0.075) 0.808∗∗∗ (0.053) 0.036∗∗∗ (0.002)posts per month −0.761∗∗∗ (0.039) −0.175∗∗∗ (0.028) −0.009∗∗∗ (0.001)picture 1.796∗∗∗ (0.537) 0.633∗ (0.385) 0.021 (0.015)bio (# words) 0.029∗∗∗ (0.005) 0.008∗∗ (0.004) 0.0003∗∗ (0.0001)male −1.393∗∗∗ (0.182) −0.063 (0.130) 0.003 (0.005)car price −0.042∗∗∗ (0.016) −0.023∗∗ (0.012) −0.001∗∗ (0.0005)consumption 0.645∗∗∗ (0.112) 0.200∗∗ (0.081) 0.012∗∗∗ (0.003)competition 0.020∗∗∗ (0.003) 0.013∗∗∗ (0.002) 0.001∗∗∗ (0.0001)revenue median (city) 0.001∗∗∗ (0.0004) 0.001∗∗ (0.0003) 0.00002∗∗ (0.00001)duration public transport −1.332 (1.272) −5.725∗∗∗ (0.911) −0.112∗∗∗ (0.036)km 0.001 (0.007) 0.025∗∗∗ (0.005) 0.0002 (0.0002)hour 0.062∗∗∗ (0.019) 0.012 (0.013) 0.001∗∗ (0.001)night 0.955∗∗∗ (0.185) 0.468∗∗∗ (0.133) 0.018∗∗∗ (0.005)day 0.732∗∗ (0.298) −1.327∗∗∗ (0.213) −0.069∗∗∗ (0.009)notice −0.948∗∗∗ (0.073) −0.457∗∗∗ (0.052) −0.022∗∗∗ (0.002)automatic acceptance −2.397∗∗∗ (0.165) 1.824∗∗∗ (0.118) 0.087∗∗∗ (0.005)Constant −7.250 (5.629) 0.546 (4.040) −0.055 (0.162)Observations 63,070 62,945 63,599R2 0.250 0.087 0.080Adjusted R2 0.248 0.084 0.076Residual Std. Error 19.906 (df = 62827) 14.244 (df = 62702) 0.573 (df = 63356)F Statistic 86.746∗∗∗ (df = 242; 62827) 24.813∗∗∗ (df = 242; 62702) 22.611∗∗∗ (df = 242; 63356)

Note: Trip �xed e�ects not reported ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01

9

female and non-minority drivers receive better economic outcomes. Here we measure

reputation building as a number of reviews; it is highly statistically signi�cant in all

regressions. After we control for a number of reviews, seniority on the platform has

a negative coe�cient. Drivers with pro�les with more extended descriptions and with

picture receive higher outcomes. All in all, these results corroborate earlier �ndings

that minority users are discriminated in online marketplaces.

3.1 Reputation e�ect: OLS on subsamples by seniority

Now we investigate whether these patterns of discrimination di�er depending on the

level of reputation. We divide our sample into three subsamples; no reputation with

less than three reviews, with some reputation (between 4 and 49 reviews), and with

established reputation (more than 50 reviews)8. Initially, minorities are discriminated.

However, building reputation allows them to overcome it. We see this pattern for all our

measures, discrimination disappears entirely in regression with the number of views and

taken seats, and is diminished for revenue. Table 3 presents the results for the number

of views as a dependent variable; other regressions are in Appendix C. Being minority

has a signi�cant impact at the beginning of the career minority drivers receive 2.6 views

fewer (mean is 18.5). Crucially, this e�ect decreases relatively fast and can be entirely

overcome once reputation is built. Figure 3 illustrates this. It is worth noting that most

of the other driver-speci�c characteristics remains unchanged; the sign and economic

magnitude of driver's age are similar, having a picture and lengthy description of a

pro�le does not seem to matter too much. Finally, discrimination could also be gender

speci�c, and indeed we have shown that male drivers on average receive lower economic

outcomes. In this case, it seems that reputation does not seem to play a crucial role,

the e�ect is slightly diminished for experienced drivers. However, we cannot draw any

clear-cut conclusions.

8Choice of these thresholds is ad-hoc. However, the reputation e�ect remains for local changes

10

Table 3: Reputation e�ect, number of views as dependent

Number of reviews:

1:3 4:49 50+

driver's age −0.099∗∗∗ (0.018) −0.070∗∗∗ (0.008) −0.001 (0.013)talkative 1.305∗∗∗ (0.454) 0.479∗∗ (0.210) 1.063∗∗∗ (0.304)minority −2.585∗∗∗ (0.565) −1.190∗∗∗ (0.262) 0.032 (0.389)seniority (# months) −0.007 (0.010) −0.010∗∗ (0.004) −0.046∗∗∗ (0.007)hours untill ride −0.006 (0.007) 0.0002 (0.004) 0.0002 (0.006)posted since 2.244∗∗∗ (0.183) 2.776∗∗∗ (0.094) 2.463∗∗∗ (0.154)posts per month −0.269∗∗ (0.136) −0.511∗∗∗ (0.055) −0.755∗∗∗ (0.054)picture 2.065 (1.467) 2.616∗∗∗ (0.770) 1.393 (1.021)bio (#words) 0.010 (0.015) 0.023∗∗∗ (0.006) 0.025∗∗∗ (0.009)male −1.265∗∗ (0.495) −1.235∗∗∗ (0.221) −0.532 (0.390)car price −0.061 (0.042) −0.009 (0.020) −0.056∗∗ (0.028)consumption 0.416 (0.298) 0.714∗∗∗ (0.140) 0.882∗∗∗ (0.207)competition 0.014∗ (0.009) 0.023∗∗∗ (0.004) 0.022∗∗∗ (0.006)median revenue (city) 0.0003 (0.001) 0.002∗∗∗ (0.0005) 0.001∗∗ (0.001)duration public transport −2.376 (2.953) −0.617 (1.563) 3.335 (2.815)km 0.007 (0.016) 0.002 (0.009) −0.011 (0.013)hour 0.157∗∗∗ (0.049) −0.003 (0.024) 0.087∗∗ (0.034)night 0.141 (0.505) 0.865∗∗∗ (0.228) 1.265∗∗∗ (0.334)day 1.548∗∗ (0.785) 1.125∗∗∗ (0.381) −1.378∗∗∗ (0.528)notice −0.656∗∗∗ (0.179) −0.968∗∗∗ (0.091) −0.985∗∗∗ (0.152)automatic acceptance −1.525∗∗∗ (0.442) −2.211∗∗∗ (0.199) −4.186∗∗∗ (0.319)Constant 10.639 (12.653) −14.894∗∗ (7.033) −15.627 (11.805)Observations 8,684 42,314 17,871R2 0.259 0.265 0.266Adjusted R2 0.238 0.261 0.256Residual Std. Error 19.487 (df = 8444) 19.875 (df = 42073) 19.463 (df = 17629)F Statistic 12.337∗∗∗ (df = 239; 8444) 63.170∗∗∗ (df = 240; 42073) 26.521∗∗∗ (df = 241; 17629)

Note: Trip �xed e�ects not reported ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01

11

Figure 3: Coe�cient of minority status from regressions in Table 3

3.2 Panel data To be added

3.3 Coarsened Exact Matching

This project, likewise most in the literature, uses non-experimental data for evaluating

the impact of minority status. Hence, producing estimates of the impact of being a

minority su�ers from the selection on the non-observables bias. There is a growing,

mostly theoretical, literature on the use of matching techniques to address this issue.

Rosenbaum and Rubin (1983) and Heckman et al. (1997) demonstrate that this bias

can be greatly reduced by use of various matching techniques. Some of their properties

are discussed by Abadie and Imbens (2006) and Abadie and Imbens (2016). A similar

methodology has been applied in Sarsons (2017) 9.

The objective of matching exercise is to test the robustness of results from the standard

OLS of Section ??. We will �rstly estimate propensity scores for each of the observa-

tions and discard these with extreme values. Secondly, we will perform matching of

the minority and non-minority subsamples on driver-speci�c variables. We will execute

both exact matching and coarsened matching. Finally, we will regress model using the

matched sample, controlling for listing-speci�c characteristics.

9We use matching software developed by ?

12

The propensity score is a logistic regression with minority status being dependent vari-

ables and following controls: the price of a car, driver's age, number of posts per month,

picture dummy, length of biography, gender, fuel consumption of the car and whether

the driver is talkative. Minority drivers are more likely to be a young male and to enjoy

conversations. They have on average more expensive cars that consume more fuel; their

pro�les are also shorter. More detailed results are in the Appendix D. We delete 5%

smallest and 5% largest propensity scores, in this way we delete observations for which

we are unlikely to �nd a counterpart.

Exact matching is performed on all driver's characteristics for which we have esti-

mated logistic regression. In our sample, it means that we have 8636 minority drivers

matched with 19039 non-minority drivers. As entrants, we will label minority drivers

with less than ten reviews and as incumbent's (experienced users) these with more than

30. From table 4 we can see that even after the matching procedure, minority entrant

Table 4: Economic outcomes of entrants, exact matching

Dependent variable

Number of views Revenue Taken seats

minority -1.4211∗∗ (0.002 ) -0.8294 (0.007)∗∗ -3.00E-02∗∗ (0.005)hours untill ride -0.0234 ∗∗ (0.009) -0.0025 (0.686) -9.20E-05 (0.672)posted since 2.2056 ∗∗∗ (<2e-16) 0.6555 (9.00E-06)∗∗∗ 2.40E-02 ∗∗∗ (5.00E-06)competition 0.0461 ∗∗∗( 8e-10) 0.0029 (0.569) 1.10E-04 (0.559)day 0.7465 ( 0.124 ) 0.9003 (0.006)∗∗ 2.80E-02∗ (0.018)night 1.9899 ∗∗ ( 0.006 ) -1.1733 (0.018)∗ -4.90E-02∗∗ (0.006)notice -0.3299 (0.122 ) -0.3203 (0.027)∗ -1.20E-02∗ (0.023)

Matched Observations 9,555

Note: Trip �xed e�ects not reported ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01

drivers are facing discrimination. We repeat the same process for drivers with repu-

tation. Table 5 reports the results. Again, we observe the reputation e�ect; minority

status is insigni�cant for the number of views and seats taken. In case of revenue, there

is a positive relationship, which could suggest a selection e�ect.

Coarsened Matching: is a method to increase the number of matched observations.

We introduce bins in which we will match non-binary covariates: age of the driver, the

price of a car, number of posts per month, length of bio and fuel consumption of the car.

Choice of cuto�s in�uences the precision of matching procedure as well as the number

13

Table 5: Economic outcomes of incumbent drivers, exact matching

Dependent variable

Number of views Revenue Taken seats

minority -0.0200 (0.971 ) 0.9449 (0.035)∗ 0.02679 ( 0.142 )hours untill ride 0.0091 (0.408 ) 0.0180 (0.044)∗ 0.00110∗∗ (0.002)posted since 3.2570∗∗∗ (<2e-16) 1.5431 (2e-12 )∗∗∗ 0.06835 ∗∗∗ ( 3e-14)competition 0.0285 ∗∗( 0.002) 0.0237 ( 0.001)∗∗ 0.00061∗(0.041)day 1.2000∗ ( 0.041 ) 0.7038 ( 0.140 ) 0.04531 ∗ (0.020)night 0.8739 ( 0.357 ) -1.8608 (0.016)∗ -0.10491∗∗∗ (8e-04 )notice -1.1991 ∗∗∗ ( 5e-06 ) -0.9369 (1e-05)∗∗∗ -0.04648 ∗∗∗ ( 9e-08 )

Matched Observations 5,316

Note: Trip �xed e�ects not reported ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01

of matched observations; we match within a quantile for each of the variables. In this

way, we match 14146 minority drivers with 45959 nonminority ones, which is almost

a twofold increase. We present only coe�cient on minority (table 5). In the coarsely

Table 6: Economic outcomes entrants and incumbents, coarsened matching

Dependent variable

Number of views Revenue Taken seats

minority (entrant) -0.9869 ∗∗ (0.005) -0.52608∗ (0.029) -0.02070∗ ( 0.016 )minority (incumbent) 0.2507 ( 0.459 ) 0.2327 (0.390) 0.01684 ( 0.134 )

Matched Observations (both models) 60,105

∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01

matched sample, we also see a clear reputation e�ect. Minority entrants have lower

economic outcomes, however after they build reputation the e�ect goes entirely away.

These results depend on cut-o�s for labeling as entrants/ incumbents, as well as on the

selection of bins for coarsened matching; they are however robust to local changes.

We have provided evidence of reputation e�ect on all available economic outcomes

measures: number of views, taken seats and revenue. Minority drivers face discrimina-

tion at the beginning of their career, this e�ect, however, disappears as soon as they

develop a reputation. We have implemented a matching technique that made minority

drivers more comparable with non-minorities. Still, the reputation e�ect was visible.

14

4 Driver's Incentive Problems - very preliminary

We have seen that reviews are useful for a passenger in learning about the quality

driver. Moreover, we have provided evidence of updating about the expected quality.

Good reviews are valuable for drivers and informative for passengers. In this section,

we expand on this observation by adapting a moral hazard model. In a seminal paper

Holmström (1999) studies incentive problem of a manager in a moral hazard setting,

where the type of manager is known neither to the market nor to the manager herself.

A manager exerts e�ort that is costly, and it impacts the outcome. We believe that

the two-sided incomplete information set-up is a good representation of interactions on

Blablacar.

Let η be a measure of driver's quality; this can be seen as representing her driving skills

or as for how "likable" she is; η is incompletely known to both driver and potential

passenger; however, its minority status is observed. Passengers and drivers know the

underlying distribution of η; speci�cally, we will assume that η is distributed normally

with mean mm and precision (inverse of the variance) hm for minority drivers and mnm,

hnm for non-minorities. With each interaction, a passenger leaves a review to the driver

that is a measure of the quality of the service provided by the driver. In period t, this

quality is given by:

yt = η + at + εt, t = 1, 2...

, where at ∈ [0,∞] is driver's e�ort and εt is a stochastic noise term, that we assume

to be normally distributed with zero mean and precision hε. ε represents the error

on rating, either due to approximative perception by the riders, or imprecision of the

reputation system itself. Drivers are assumed to be risk neutral with preference given

by an atemporal separable utility function:

U(c, a) =∞∑t=1

βt−1 [ct − g(at)] (1)

Cost of exerting e�ort g(.), in general, is assumed to be increasing and convex; utility

function is publicly known. Consumption ct is assumed to be an increasing function of

15

expected quality. Solution to 1 is given by10:

γt ≡∞∑s=t

βs−tαs = g′(α∗t )

,where αt = hεht, ht = hm/nm + thε, and a

∗t (yt−1) represents the equilibrium decision rule.

As, γt is a declining sequence, and since the sum converges to zero, the e�ort will be a

declining sequence that asymptotically converges to zero. We can interpret it by noting

that in this set-up ability and e�ort are substitutes. As long as the ability is unknown

there are returns to supplying it, output today is increasing the perception of ability

and increase the expected wage in future. What is important, is that returns to the

e�ort are bigger the more there is uncertainty about ability. An entrant driver's reviews

are crucial in learning about her ability. Eventually, η is revealed almost completely,

when the driver becomes incumbent, and new observations will have little impact on

beliefs. For very experienced drivers, there are no returns to trying to in�uence output

and labour supply goes to zero.

We expect grades received by drivers to be relatively high at the beginning, when the

returns from providing e�ort are the highest, and to gradually decline, before stabilizing

at the level corresponding to η. Figure 4 documents this using our dataset.

Figure 4: Distributions of grades and di�erent levels of experience

We document changes in the distribution of grades. We restrict the sample to only

10see Holmström (1999) for details

16

these drivers for whom we have at least 25 reviews. Therefore these are reviews of the

same drivers, at di�erent stages of their career. First, we present distribution of grades

from the �rst to the �fth, then from 6th to the 10th and so forth. The vertical line is

the mean of the last distribution. Clearly, we observe the higher, on average, grades at

the beginning of the career and stabilization at the later stage. This is consistent with

intuition provided in Holmström (1999).

This allows us to calculate moments of distributions of types and errors11. Thus, ηm ∼N(4.47, 0.15) and ηnm ∼ N(4.58, 0.12), hε = 0.37. We assume that g(at) = µ

2(a2t ). Now

we can nonlinearly estimate the cost of e�ort and discount factor:

a∗t =1

µ

[∞∑s=t

βs−tαs

](2)

This estimation allows us to calculate the expected output.12

5 Reputation system: a barrier to or a mode of entry?

In this section, we propose an alternative model of moral hazard, in order to shed some

further light on the problem of e�ort taking by an entrant driver. Is is a static version

of a model where a principal (passenger in the context of a ride-sharing platform)

interacts with two agents (drivers): an incumbent who had used the service n times

before and developed some reputation and an entrant with no reputation yet. We

examine the impact of a reputation system on a principal's choice to interact with a

given agent instead of the other one. The reputation, built over past interactions helps

the principal �gure out whether she wants to interact with an experienced agent or hire

a new one with no reputation yet.13

Each of the past n interactions corresponds to a ride for which a principal needed an

agent. The agent's type has two dimensions a private one β and a public one ε. In the

context of ride-sharing, we will think of ε as belonging to a minority or not, and of β as

the quality/experience, the driver is actually providing. The principal does not observe

the exact type of the agent but has a prior on the distribution of types. For a given

11This results should be seen as an illustration, and they will change while we �netune the model12Linking the optimal e�ort with expected output and using it as a proxy for quality of the driver

is the next stage, and it is currently under development13Extension to a dynamic version of the game, in which agents internalize the impact of their actions

on reputations is our next step, and we are currently working on it

17

agent, the distribution depends on an observed attribute ε. The cumulative distribution

function of types in a given population ε is denoted Fε(.). Hence, when a principal is

matched with an agent, she observes ε, the �population" from which the agent is selected;

which itself is drawn randomly from CDF G(ε). The principal decides whom to hire;

she can thus choose to keep the incumbent (whose population we denote εold), or �re

her and hire the new agent (εnew). She will make a decision based on her beliefs about

current and new drivers' type and propensity to exert e�orts � and her own ability to

elicit e�orts based on the information at his disposal. The model is an extension of

La�ont and Tirole (1986), by addition of the incumbent. Alternatively, this game can

be seen as repeated interaction in which users are myopic. Despite them being short-

sighted, the platform has a degree of memory about previous interactions conveyed

by a �potentially imperfect� reputation system. Hence, any information revealed and

recorded by the principal, will be fully exploited at later stages of the game. The

principal designs pricing and e�ort scheme by which she aims at fully revealing direct

mechanisms. At each period j, the agent i is asked to report her type β̂i. The agent

is then paid a transfer si and provided with a recommendation on e�orts ei. Both the

transfer and e�orts will depend on the reported type β̂i, the population εi of the agent

and the beliefs of the principal, to be described later on. The principal enjoys a gross

bene�t of πi = βi + ei, meaning a high-type agent provides high bene�ts. The principal

also wants to encourage e�ort from an agent, which induces a private cost to the agent.

The principal observes πi, but not βi and ei, giving the agent the opportunity to pretend

she is a low type, to save on e�orts. The principal aims at maximizing:

βi + ei − si

The agent endures a cost of producing e�orts ψ(ei). Her outside opportunity is 0. Her

payo� if she accepts the o�er of the principal is therefore:

ui = si − ψ(ei)

A key addition to standard models is that the principal only has imperfect recall of

previous disclosures. Assuming agent i revealed herself to be of type βi over the last n

periods, the principal believes the type of the agent is :

β̃i,n+1 = (1− αn)βi + αnvε (3)

18

, where vε is randomly drawn from distribution Fε(.), which corresponds to the prior

distribution of types in population ε, absent further information. From the perspective

of principle and agents α ∈ [0, 1] is exogenous14, and represents how well the principal

remembers the information extracted in previous periods. α = 1 means she remembers

nothing and starts from the same prior each period. α = 0 means there is perfect

recall: once the type of the agent is disclosed, the information is fully exploited leaving

the agent with no rent. At the intermediary levels of α the principal has an idea of

the type previously disclosed, but still relies on her prior to some extent. Another

interpretation for this belief formation is that the principal distrusts the information

previously collected and still clings to her prior. We could describe this behavior as a

form of persistence in prejudice. In practice, α represents the strength of the reputation

system, which allows transmitting private information revealed in previous stages to

subsequent stages. The smaller α, the more precise the reputation system. Each period,

the principal forms her belief on agent types and sets a menu of payments. We focus

on revealing mechanisms, that ensure full participation. In more details, the timing

within each is:

• Step 1: principal establishes her prior on current driver (equation 3)

• Step 2: principal is matched with a new driver of population drawn from G(ε)

• Step 3: principal chooses whether to retain her current agent, or �re her and hire

the new one.

• Step 4: agent accepts/rejects participation, and reports her type β̂i

• Step 5: mechanism prescribes a menu of payments. Agent exerts e�orts e accord-

ingly

• Step 6: agent/principal agree on bad outcome if observed bene�ts do not equal

π(β̂i) = β̂i + e(β̂i))

5.1 Hiring and �ring decisions

Expected bene�t of a new driver In this subsection we review some features of

standard incentive compatible contracts, and study how these contracts will change

when agents come from di�erent populations. Whenever no confusion is possible, we

drop subscript i for the sake of concision.

The principal is matched with a driver of population ε. Pricing scheme ensures full

participation. We focus on direct mechanisms. Assuming the agent reports type β̂, she

14Introducing the platform that sets α is an extension, currently under work

19

then has to exert e�orts that will replicate the total bene�ts π(β̂) the principal expects

to observe (otherwise the agent incurs a large penalty). Therefore, she has to choose

e(β, β̂) = π(β̂)− β

Her payo� if innate type is β, and reports β̂ is:

u(β, β̂) = si(β̂)− ψ(π(β̂)− β) (4)

∂U

∂β(β, β̂) = ψ′(π(β̂)− β)

Let U(β) be the agent's payo� with truthful reporting. The envelope theorem yields

that:

U(β) = U(βε) +

∫ β

βε

ψ′(e(s))ds (5)

To minimize rents left to the agent, the principal will choose a menu such that U(βε) = 0

. Integration by parts we obtain :

E(U(β)) =

∫ β̄ε

βε

1− Fε(s)fε(s)

ψ′(e(s))fε(s)ds (6)

The expected payo� of the principal is :

Enew(Π) =

∫ β̄ε

βε

(π(s)− si(s))fε(s)ds (7)

=

∫ β̄ε

βε

(β + e− 1− Fε(s)

fε(s)ψ′(e(s))− ψ(e(s)

)fε(s)ds (8)

Maximizing over the e�ort recommendation yields:

ψ′(e(β)) = 1− 1− Fε(β)

fε(β)ψ′′(e(β)) (9)

This means high types are required to exert high e�orts, while low type e�orts are

distorted downwards. This is a classical result of the theory of incentives.

Assumption 1. For ease of exposition, we make the following assumptions:

20

A1 The di�erence in distribution of types between populations, modeled by changes in

ε, corresponds to a shift of densities from left to right, meaning Fε(β) = F0(β− ε)

A2 ψ(e) = e2

2

A3 Fε(.) follows a uniform distribution over [ε, 1 + ε]

Assumption A1 means that the higher ε, the higher the expected type of an agent.

This has an impact on e�ort recommendation:

ψ′(eε(β)) = 1− 1− F0(β − ε)f0(β − ε)

ψ′′(e(β) (10)

Lemma 1. Under assumptions 1, take an agent of a given type β, with no reputation.

As the agent's prior signal decreases (ε small):

1. the required e�ort increases

2. the surplus of the agent increases

3. the surplus of the principal derived from the agent increases

Proof. Impact of ε on e�orts: De�ne :

h(e, ε) =− ψ′(eε(β)) + 1− 1− Fε(β)

fε(β)ψ′′(e(β))

We use the implicit function theorem on h(e, ε):

∂e∗(ε)

∂ε= −

1 + 1−Fε(β)f2(ε,β))

∂f(ε,β))∂β

ψ′′(e) + 1−Fε(β)f(ε,β))

ψ′′′(e)

= −1 < 0

Impact of ε on agent surplus: Surplus of type β from population ε:

u(β, ε) =

∫ β

βε

ψ′(e(s, ε))f(s, ε)ds =

∫ β

ε

(s− ε)ds =(β − ε)2

2

⇒ ∂u(β, ε)

∂ε< 0

21

Impact of ε on principal surplus:

Π(β, ε) = β + e(β, ε)− 1− Fε(β)

fε(β)ψ′(e(β, ε)− ψ(e(β, ε))

= β + β − ε− (1− β + ε)(β − ε)− (β − ε)2

2

⇒∂Π(β, ε)

∂ε= ε− β < 0

Albeit intuitive, lemma 1 has signi�cant implications for the principal's choice of

an agent. It is worth observing that absent other information than the prior Fε(.),

A principal will choose an agent from the highest ε population whenever she has the

choice. Agents from a relatively low ε population may thus be stuck in a �reputation

trap", whereby they get excluded from interaction � despite them having a potentially

higher type than the expected type of members of a higher-signal population. This is

also in spite of the fact they will provide more e�ort if they are selected than their

counterpart from a higher population �see item (1) in lemma 1� and despite the fact,

the agent provides more surplus to the principal �see item (3) in lemma 1. This means

a reputation system is necessary to convey information from one period to the other,

to avoid the reputation trap phenomena. As the next section will show, the quality of

the reputation system will be vital to eliminating this issue and restoring e�ciency.

Expected bene�t of retaining the incumbent The bene�t of trading with the

incumbent lies in the reduced uncertainty due to their previous messages conveyed by

the reputation system. Assume the agent i is of true type βi. The population of the

incumbent is indexed by εold. The principal forms her beliefs according to equation (3).

This means the principal may doubt the report of the agent is accurate, and thinks her

type will be drawn from a distribution F̃α,n,ε(.) such that:

F̃α,n,ε(x) = F0

(x− (1− αn)βi

αn− ε)

(11)

We solve an optimal incentive compatible contract for an agent whose type is dis-

tributed according to 11. To do so we follow analogous steps as the case with a new

driver analyzed here-above.

22

We �nd that the expected surplus of the principal is:

Eold(Π) = En,βi[(1− αn)βi + αnvε + en,βi(s)−

1− Fα,n,ε(s)fα,n,ε(s)

ψ′(en,βi(s))− ψ(en,βi(s))

](12)

Maximizing with respect to en,βi(s) we �nd that en,βi(s) is de�ned by :

ψ′(en,βi(s)) = 1− 1− Fα,n,ε(s)fα,n,ε(s)

ψ′′(en,βi(s)) (13)

There are two important di�erences between retaining the incumbent (12) and hiring

an entrant (8). First, the principal has a more precise prior on the agents' expected

type (�rst two terms in 12). Second, the e�ort schedule is modi�ed due again to a

better appreciation of the experienced agents' type. This is re�ected both in the e�ort

recommendation, and the rent left to agents.

Principal choices with a reputation system When choosing to retain or �re a

driver the principal will compare her expected surplus from interaction with either of

them, exposed in lemma 2. For ease of exposition, this lemma uses assumptions 1.

Lemma 2. The expected bene�t from hiring a new agent is :

Enew(Π) =2

3+ εnew (14)

, expected bene�t of trading with the incumbent:

Eold(Π) = (1− αn)βi + αnεold +1

2+α2n

6(15)

Proof. See Appendix E.1

Generally the principal's hiring decisions can be illustrated with two examples.

Firstly, let us compare agents that have traded exactly the same number of times

n. In that case we �nd from equation 15 that the principal will hire agent 1 instead of

2, when

β1 ≥ β2 +αn

1− αn(ε2 − ε1) (16)

23

The relative importance of revealed (true) type βi compared to the public signal εi

decreases in α: from the point of view of comparing users that have been interacting

before, the principal will prefer having a very precise reputation system, so as to rely

on real type instead of public signals and reach the optimal decision rule.

Secondly, let us compare the cases when the principal needs to choose between

an experienced (�old") agent with n > 0 recorded interactions and a new one with

no reputation. The principal will retain the incumbent if and only if the expected

bene�t from continued interaction with her Eold(Π) is greater than the expected bene�t

from hiring a new agent Enew(Π). In this case, both the information revealed by the

incumbent and comparison of the expected types from the two populations are used

by the principal. For ease of exposition, we use assumptions 1 again. With these

assumptions, we obtain the following proposition 1:

Proposition 1. The old agent is retained if and only if

Eold(Π) > Enew(Π)

⇔ βold > β̄ ≡ εnew − αnεold1− αn

+1

6+αn

6(17)

Proof. Derives from a comparison of 14 and 15

There are several conclusions stemming from proposition 2; the �rst conclusion

is that if αn is close to 0 (precise reputation system), the retention policy becomes

e�cient, as the principal retains the manager if and only if the surplus she will obtain

from continuation exceeds the expected surplus stemming from interaction with a new

manager. If αn is close to 1 (i.e., the principal has a poor memory, or the reputation

system is weak), then condition 17 is met if and only if εnew > εold: the principal chooses

whoever has the highest public signal. In that case, agents with a low public signals are

trapped in a �reputation trap". This situation is more likely to occur at early stages

(small n) of the relationship between the principal and the agent (since ∂β̄∂αn

> 0). To

study welfare implications of such a reputation system, we introduce a principal with a

perfect recall that internalizes agent surplus (in other words, if she were a benevolent

social planner). Such a principal would be able to elicit optimal e�ort of all agents she

is matched with. In that case, the retention policy is simpler:

β̄SP =1

2+ εnew (18)

24

The social planner keeps the agent if and only if her observed type is greater than the

expected type of the new population. We can then write the following lemma.

Lemma 3. Assume the principal is matched with a new agent of a higher population

than the incumbent agent. There exist αlim such that if α < αlim, there is excess

retention of the agent compared to the policy of a social planner. If α > αlim, there is

excess exclusion of the agent.

Proof. There is excess retention if and only if

β̄ < β̄SP ⇔ w(αn) ≡ εnew − αnεold1− αn

+1

6+αn

6− 1

2− εnew < 0

w(0) < 0 , w(x) −−→x→1

+∞ and ∂w∂αn

(αn) > 0. The lemma results from the intermediate

value theorem.

Figure 5 summarizes the �ndings of this section. It shows the retention policy per

type β , as a function of reputation system memory αn, when the incumbent is εold = 0,

and the principal is matched with a new agent from population εnew > 0. We see

that if the principal distrusts past observations or has a bad memory (αn large), there

will be an excess exclusion of the incumbent agent. It may go as far as excluding all

incumbent agents, notwithstanding their type. On the contrary, if the memory is too

good, the principal may exert excess retention: even though the incumbent may be

of a relatively low type, the principal will keep her. This is because the information

revealed through the reputation system allows the principal to reduce necessary rent.

Closed form solutions for threshold values for α can be found in appendix E.2.

From a social welfare perspective, it may, therefore, be desirable that the reputation

system has an imperfect recall, to limit entrenchment of the incumbent (thanks to the

asymmetry of information being revealed during previous interactions). There should

be some recall though, to avoid that populations be con�ned in a reputation trap.

Taking only the static game into account, the principal always �nds it optimal to

have as good a memory as possible. A principal should, therefore, strive to have α as

low as possible. This result is unlikely to hold if she takes into account the following

two e�ects: (1) the e�ect of reputation on incentives to do long-lasting investment

in quality. (2) The e�ect of reputation on the entry of new agents � which bene�ts

will be reaped in future periods. The analysis of future market expansion e�ects on a

platforms' inclination to keep a precise history of previous transactions is currently in

progress.

25

Figure 5: Retention policy per type β , as a function of reputation system memory αn

6 Conclusion

While it has been documented on a number of online marketplaces that minority users

face discrimination, the role of reputation system in overcoming it has been less studied.

Our empirical analysis uses unique data on listings on a popular on-line carpooling

platform. We show that indeed, minority users face discrimination. Their listings

are less popular, they sell fewer seats and have lower revenue. However, this e�ect

is concentrated during �rst interactions on the platform. Minority drivers overcome

discrimination by building a reputation. A minority driver with several reviews receives

similar economic outcomes to a non-minority one. We show this result using cross-

sectional data, as well as with coarsened matching.

Passengers are willing to change their minds about minority drivers as soon as they

see reviews. This observation highlights the importance of a well-designed reputation

system in inducing entry and market expansion. Early stages of reputation building are

26

particularly important. We are currently working on applying seminal moral hazard

models of Holmström (1999) as well as of La�ont and Tirole (1988) to gain further

insights into the role played by online reviews.

As signaled before, this is an on-going research project and results presented in the

paper will be changed in future.

References

Abadie, A. and Imbens, G. W. (2006). Large sample properties of matching estimators

for average treatment e�ects. econometrica, 74(1):235�267.

Abadie, A. and Imbens, G. W. (2016). Matching on the estimated propensity score.

Econometrica, 84(2):781�807.

Altonji, J. G. and Pierret, C. R. (2001). Employer Learning and Statistical Discrimina-

tion EMPLOYER LEARNING AND STATISTICAL DISCRIMINATION*. Source:

The Quarterly Journal of Economics, 116(1):313�350.

Arrow, K. (1973). The Theory of Discrimination.

Bar-Isaac, H. and Tadelis, S. (2008). Seller Reputation. Foundations and Trends R© in

Microeconomics, 4(4):273�351.

Baron, D. P. and Myerson, R. B. (1982). Regulating a monopolist with unknown costs.

Econometrica: Journal of the Econometric Society, 50(4):911�930.

Bolton, G. E., Katok, E., and Ockenfels, A. (2004). How E�ective Are Electronic

Reputation Mechanisms? An Experimental Investigation. Management Science,

50(11):1587�1602.

Cabral, L. L., Hortacsu, A., and Hortaçsu, A. (2010). The dynamics of seller reputa-

tion:Evidence from eBay. The Journal of Industrial Economics, LVIII(1):54�78.

Castillo, M., Petrie, R., Torero, M., and Vesterlund, L. (2013). Gender di�erences

in bargaining outcomes: A �eld experiment on discrimination. Journal of Public

Economics, 99:35�48.

Charles, K. K. and Guryan, J. (2011). Studying Discrimination: Fundamental Chal-

lenges and Recent Progress. Annual Review of Economics, 3(1):479�511.

27

Coate, S. and Loury, G. C. (1993). Will A�rmative-Action Policies Eliminate Negative

Stereotypes ? The American Economic Review, 83(5):1220�1240.

Darity, W. A. and Mason, P. L. (1998). Evidence on Discrimination in Employment:

Codes of Color, Codes of Gender. Journal of Economic Perspectives, 12(2):63�90.

Dellarocas, C. and Wood, C. A. (2008). The Sound of Silence in Online Feedback:

Estimating Trading Risks in the Presence of Reporting Bias. Management Science,

54(3):460�476.

Edelman, B., Luca, M., and Svirsky, D. (2017). Racial Discrimination in the Sharing

Economy: Evidence from a Field Experiment. American Economic Journal: Applied

Economics, 9(2):1�22.

Edelman, B. G. and Luca, M. (2014). Digital Discrimination: The Case of Airbnb.com.

SSRN Electronic Journal.

Farajallah, M., Hammond, R. G., and PPnard, T. (2016). What Drives Pricing Behavior

in Peer-to-Peer Markets? Evidence from the Carsharing Platform BlaBlaCar. SSRN

Electronic Journal.

Farber, H. S. and Gibbons, R. (1996). LEARNING AND WAGE DYNAMICS. Quar-

terly Journal of Economics, 111(4):1007�1047.

Garrett, D. F. and Pavan, A. (2012). Managerial Turnover in a Changing World.

Journal of Political Economy, 120(5):879�925.

Ge, Y., Knittel, C., MacKenzie, D., and Zoepf, S. (2016). Racial and Gender Dis-

crimination in Transportation Network Companies. NBER Working Paper Series,

(22776):1�38.

Goddard, T., Kahn, K. B., and Adkins, A. (2015). Racial bias in driver yielding behavior

at crosswalks. Transportation Research Part F: Tra�c Psychology and Behaviour,

33:1�6.

Heckman, J. J., Ichimura, H., and Todd, P. E. (1997). Matching as an econometric

evaluation estimator: Evidence from evaluating a job training programme. The review

of economic studies, 64(4):605�654.

28

Holmström, B. (1999). Managerial incentive problems: A dynamic perspective. The

review of Economic studies, 66(1):169�182.

Iacus, S., King, G., Porro, G., et al. (2009). Cem: software for coarsened exact matching.

Journal of Statistical Software, 30(13):1�27.

Jolivet, G., Jullien, B., and Postel-Vinay, F. (2016). Reputation and prices on the e-

market: Evidence from a major French platform. International Journal of Industrial

Organization, 45:59�75.

Jullien, B. and Park, I. U. (2014). New, like new, or very good? Reputation and

credibility. Review of Economic Studies, 81(4):1543�1574.

La�ont, J.-J. and Tirole, J. (1986). Using Cost Observations to Regulate Firms. Journal

of Political Economy, 94(3):614�641.

La�ont, J.-J. and Tirole, J. (1988). The Dynamics of Incentive Contracts. Econometrica,

56(5):1153�1175.

Lang, K. and Lehmann, J.-Y. K. (2012). Racial Discrimination in the Labor Market:

Theory and Empirics. Journal of Economic Literature, 50(4):959�1006.

Laouenan, M. and Rathelot, R. (2017). Ethnic Discrimination on an Online Marketplace

of Vacation Rentals. working paper.

Liu, Q. and Skrzypacz, A. (2014). Limited records and reputation bubbles. Journal of

Economic Theory, 151(1):2�29.

Livingston, J. A. (2005). How Valuable Is a Good Reputation? A Sample Selection

Model of Internet Auctions. Review of Economics and Statistics, 87(September):453�

465.

Mayzlin, D., Dover, Y., and Chevalier, J. (2014). Promotional reviews: An empirical

investigation of online review manipulation.

Nosko, C. and Tadelis, S. (2015). The Limits of Reputation in Platform Markets: An

Empirical Analysis and Field Experiment. NBER Working Paper Series, page 20830.

Phelps, E. S. (1972). The Statistical theory of Racism and Sexism. American Economic

Review, 62(4):659�661.

29

Roger, G. and Vasconcelos, L. (2014). Platform Pricing Structure and Moral Hazard.

Journal of Economics and Management Strategy, 23(3):527�547.

Rosenbaum, P. R. and Rubin, D. B. (1983). The central role of the propensity score in

observational studies for causal e�ects. Biometrika, 70(1):41�55.

Sarsons, H. (2017). Interpreting signals in the labor market: Evidence from medical

referrals. Job Market Paper.

Schuessler, K. and Becker, G. S. (1958). The Economics of Discrimination. American

Sociological Review, 23(1):108.

Spagnolo, G. (2012). Reputation, competition, and entry in procurement. International

Journal of Industrial Organization, 30(3):291�296.

Zervas, G., Proserpio, D., and Byers, J. (2015). A First Look at Online Reputation on

Airbnb, Where Every Stay is Above Average. Where Every Stay is Above . . . , pages

1�22.

30

A Navigation on Blablacar.fr

First, users type in the origine, destination and date of the ride they are seeking. They

then see a list of rides meeting their request (�gure 6 ). They may then click on speci�c

postings to have more details about the ride (�gure 7). Finally they may either see the

pro�le of the driver (�gure 8) or proceed directly to payment. Blablacar service fees

are a function of the price posted by the driver. These fees are shown on �gure 9.

Figure 6: Listing o�ered on a given route

31

Figure 7: Details of a posting

32

Figure 8: A driver's pro�le

Figure 9: Evolution of service fees on blablacar over time

33

B Oversampling of minorities for short-notice rides

Due to our scraping method, it cannot be excluded that our sample provides a slightly

biased representation of listings. Indeed, the program takes snapshots of listings dis-

played on the website at a given point time. However, rides that are already full are

no longer displayed on the platform. This means our data collection may undersample

the particularly interesting rides that would sell out very fast, or those corresponding

to times when demand is much higher than supply. This wouldn't be an issue if both

minorities and non-minorities were a�ected the same way by this sampling bias. How-

ever, as we show in this paper the minority status does impact the attractiveness of

a given listing. Therefore, minorities who may be perceived as posting less interesting

rides remain longer on display and may therefore be over-represented in our sample.

Therefore, our minority gap estimates should be understood as lower bounds. Indeed,

minorities are compared to a pool constituted of non-minorities that are not so good

as to have sold out their seats extremely fast. Table 10 shows that minority drivers

represent a specially high share of rides that are posted on a short notice, a possible

sign that non-minority drivers have sold their seats faster. For trips posted with more

notice, we believe our sample is indeed representative of the actual participants on

blablacar. Indeed, most of the rides �either from minorities or not � still have more

than one empty seat, which means that most listings and indeed collected.

Figure 10: Share of minorities in sample as a function of number of days betweenposting and departure

34

This is true despite the fact minorities tend to allow for automatic con�rmation

more frequently than non-minorities (12% of drivers with automatic con�rmation are

minorities, while they represent only 8% of the drivers with manual con�rmation).

C Reputation e�ect

Table 7: Taken seats, regressed over driver and ride characteristics

Number of reviews:

(1:3) (4:49) (50+)

driver's age −0.001∗∗ (0.0004) −0.001∗∗∗ (0.0002) −0.0001 (0.0004)talkative 0.018 (0.011) 0.007 (0.006) 0.023∗∗ (0.010)minority −0.037∗∗∗ (0.014) −0.030∗∗∗ (0.007) −0.012 (0.013)seniority (# months) −0.0004 (0.0003) −0.00001 (0.0001) −0.001∗∗∗ (0.0002)hours untill ride −0.0003 (0.0002) 0.0002∗ (0.0001) 0.0002 (0.0002)posted since 0.017∗∗∗ (0.004) 0.037∗∗∗ (0.003) 0.046∗∗∗ (0.005)posts per month 0.002 (0.003) −0.002 (0.002) −0.011∗∗∗ (0.002)picture 0.031 (0.036) 0.020 (0.022) 0.003 (0.034)bio (# words) 0.0002 (0.0004) 0.00004 (0.0002) 0.0001 (0.0003)male −0.003 (0.012) 0.007 (0.006) 0.022∗ (0.013)car price 0.001 (0.001) −0.002∗∗∗ (0.001) −0.0001 (0.001)consumption 0.012∗ (0.007) 0.018∗∗∗ (0.004) 0.011∗ (0.007)competition 0.0004∗∗ (0.0002) 0.001∗∗∗ (0.0001) 0.001∗∗∗ (0.0002)median revenue (city) 0.00003 (0.00002) 0.00002∗ (0.00001) 0.00003 (0.00002)duration public transport −0.043 (0.072) −0.095∗∗ (0.044) 0.085 (0.094)km −0.0004 (0.0004) 0.0003 (0.0002) −0.0002 (0.0004)hour 0.001 (0.001) 0.001 (0.001) 0.002∗ (0.001)night 0.011 (0.012) 0.014∗∗ (0.006) 0.026∗∗ (0.011)day −0.061∗∗∗ (0.019) −0.061∗∗∗ (0.011) −0.111∗∗∗ (0.017)notice −0.006 (0.004) −0.022∗∗∗ (0.003) −0.028∗∗∗ (0.005)automatic acceptance 0.073∗∗∗ (0.011) 0.084∗∗∗ (0.006) 0.090∗∗∗ (0.011)Constant −0.031 (0.309) −0.135 (0.197) −0.456 (0.393)Observations 8,762 42,654 18,013R2 0.074 0.073 0.097Adjusted R2 0.048 0.067 0.085Residual Std. Error 0.477 (df = 8522) 0.559 (df = 42413) 0.648 (df = 17771)F Statistic 2.867∗∗∗ (df = 239; 8522) 13.818∗∗∗ (df = 240; 42413) 7.936∗∗∗ (df = 241; 17771)

Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01

35

Table 8: Revenue, regressed over driver and ride characteristics

Number of reviews:

(1:3) (4:49) (50+)

driver's age −0.025∗∗ (0.012) −0.012∗∗ (0.006) −0.006 (0.010)talkative 0.561∗ (0.301) 0.130 (0.151) 0.408∗ (0.230)minority −0.943∗∗ (0.375) −0.745∗∗∗ (0.189) −0.633∗∗ (0.294)seniority (# months) −0.011 (0.007) −0.004 (0.003) −0.017∗∗∗ (0.005)hours untill ride −0.007 (0.005) 0.003 (0.003) 0.003 (0.005)posted since 0.428∗∗∗ (0.121) 0.887∗∗∗ (0.067) 0.948∗∗∗ (0.116)posts per month 0.066 (0.090) −0.063 (0.040) −0.198∗∗∗ (0.041)picture 0.208 (0.971) 0.587 (0.554) −0.059 (0.776)bio (# words) 0.005 (0.010) −0.001 (0.004) 0.003 (0.007)male −0.173 (0.328) −0.002 (0.159) 0.189 (0.295)car price 0.010 (0.028) −0.044∗∗∗ (0.014) 0.002 (0.022)consumption 0.285 (0.197) 0.368∗∗∗ (0.101) −0.004 (0.156)competition 0.012∗∗ (0.006) 0.014∗∗∗ (0.003) 0.014∗∗∗ (0.004)median revenu (city) 0.001 (0.001) 0.001∗ (0.0003) −0.00001 (0.001)duration public transport −3.229∗ (1.955) −4.759∗∗∗ (1.126) −8.297∗∗∗ (2.291)km 0.007 (0.010) 0.024∗∗∗ (0.006) 0.031∗∗∗ (0.010)hour 0.008 (0.032) −0.002 (0.017) 0.011 (0.026)night 0.043 (0.334) 0.466∗∗∗ (0.164) 0.571∗∗ (0.253)day −1.455∗∗∗ (0.517) −1.329∗∗∗ (0.274) −1.549∗∗∗ (0.398)notice −0.146 (0.118) −0.509∗∗∗ (0.065) −0.555∗∗∗ (0.114)automatic acceptance 1.597∗∗∗ (0.293) 1.753∗∗∗ (0.144) 1.737∗∗∗ (0.241)Constant 1.144 (8.377) −3.628 (5.074) 17.921∗ (9.684)

Observations 8,688 42,235 17,787R2 0.080 0.086 0.120Adjusted R2 0.054 0.081 0.108Residual Std. Error 12.903 (df = 8448) 14.324 (df = 41994) 14.714 (df = 17545)F Statistic 3.062∗∗∗ (df = 239; 8448) 16.533∗∗∗ (df = 240; 41994) 9.917∗∗∗ (df = 241; 17545)

Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01

36

Table 9: Propensity score

Dependent variable:

minority

car price 0.020∗∗∗ (0.002)driver's age −0.036∗∗∗ (0.001)posts per month 0.066∗∗∗ (0.003)picture −17.857 (55.481)bio (# words) −0.014∗∗∗ (0.001)gender 1.065∗∗∗ (0.026)consumption 0.134∗∗∗ (0.013)talkative 0.347∗∗∗ (0.019)Constant 15.415 (55.481)

Observations 79,440Log Likelihood −36,268.060Akaike Inf. Crit. 72,554.120

Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01

D Matching

E Retention and exclusion decisions

E.1 Expected revenues from an agent

Expected revenues of a new agent: Equation 14 is derived by plugging assump-

tion 1 in equation 12:

Enew(Π) = En,Bi[(1− αn)βi + αnvε + en,βi(s)−

1− Fα,n,ε(s)fα,n,ε(s)

ψ′(en,βi(s))− ψ(en,βi(s))

]

Since the agent is new, we take n = 0 (no history):

= E0,βi

[vε + e0,βi(s)−

1− Fα,0,ε(s)fα,0,ε(s)

e0,βi(s)−(e0,βi(s))

2

2

]

37

E0,βi [vε] = 12

+ ε. Pro�t maximization yields 10, or e0,βi(s) = s− ε

=1

2+ ε+

∫ 1+ε

ε

(s− ε)[1− 1− (s− ε)

1− s− ε

2

]ds

=1

2+ ε+

1

6=

2

3+ ε

Expected revenues of incumbent agent: Equation 15 is derived by plugging

assumption 1 in equation 12:

Eold(Π) = En,Bi[(1− αn)βi + αnvεold + en,βi(s)−

1− Fα,n,εold(s)fα,n,εold(s)

ψ′(en,βi(s))− ψ(en,βi(s))

]= (1− αn)βi + αn

(1

2+ εold

)+ En,βi

[en,βi(s)−

1− Fα,n,εold(s)fα,n,ε(s)

en,βi(s)−(en,βi(s))

2

2

]

Note that Fα,n,εold(s) = x−(1−αn)βiαn

− εold and is de�ned over [(1 − αn)βi + αnεold, (1 −αn)βi + αn(1 + εold)]. fα,n,εold(s) = 1

αnover the same interval, 0 otherwise. Pro�t

maximization yields 10, or en,βi(s) = 1 − 1−Fα,n,εold (s)

fα,n,εold (s)= (1 − αn)(1 − βi) + s − εoldαn .

It follows:

Eold(Π) = (1− αn)βi + αn(

1

2+ εold

)+

∫ (1−αn)βi+αn(1+εold)

(1−αn)βi+αnεold

en,βi(s)

(1− (1− en,βi(s))−

en,βi(s)

2

)1

αnds

= (1− αn)βi + αn(

1

2+ εold

)+

1

2αn

∫ (1−αn)βi+αn(1+εold)

(1−αn)βi+αnεold

((1− αn)(1− βi) + s− εoldαn)2 ds

= (1− αn)βi + αn(

1

2+ εold

)+

1

6αn(1− (1− αn)3

)= (1− αn)βi + αn

(1

2+ εold

)+

1

6

(3− 3αn + α2n

)= (1− αn)βi + αnεold +

1

2+α2n

6

38

E.2 Threshold values for α

Optimal retention : The decision to retain/exclude current driver is optimal when

cuto�s values of β found in 17 and 18 are equal:

β̄ = β̄SP ⇔εnew − αoptεold

1− αopt+

1

6+αopt

6=

1

2+ εnew

⇔ αopt =3 + 6(εnew − εold)−

√(3 + 6(εnew − εold))2 − 8

2

From this it follows that if εnew > εold, there always exist αopt ∈ [0, 1] such that type

retention by the principal corresponds to the one of a benevolent social planner.

Full exclusion : The condition for full exclusion of a population is that β̄ = 1. De�ne

αexclusion the solution to this equation:

αexclusion = 3−√

4 + 6(εnew − εold)

Note that as soon as εnew > εold, there exist αexclusion ∈ [0, 1] such that if α > αexclusion,

all types from incumbent population will be dismissed.

39