The Pennsylvania State University, The Graduate School, College of the Liberal Arts. ESSAYS IN STRATEGIC EXPERIMENTATION AND BARGAINING. A Dissertation in Economics by Kaustav Das. © 2013 Kaustav Das. Submitted in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy, August 2013.

ESSAYS IN STRATEGIC EXPERIMENTATION AND BARGAINING


The Pennsylvania State University

The Graduate School

College of the Liberal Arts

ESSAYS IN STRATEGIC EXPERIMENTATION AND BARGAINING

A Dissertation in Economics

by Kaustav Das

© 2013 Kaustav Das

Submitted in Partial Fulfillment of the Requirements

for the Degree of

Doctor of Philosophy

August 2013


The dissertation of Kaustav Das was reviewed and approved* by the following:

Kalyan Chatterjee, Distinguished Professor of Economics and Management Science, Dissertation Adviser, Chair of Committee

Edward Green, Professor of Economics

Vijay Krishna, Distinguished Professor of Economics and Director of Graduate Studies

Susan H. Xu, Professor of Management Science and Supply Chain Management

*Signatures are on file in the Graduate School.


Abstract

This dissertation consists of four chapters.

Chapter 1 analyses a situation where competing agents trying to make the same discovery have alternative research avenues to pursue. Agents are uncertain about the quality of the available research methods and learn about a particular method in light of their search experiences. One can relate this to R&D activities in, for example, the pharmaceutical and electronics industries. This scenario is modeled as a two-armed bandit problem. We consider two alternative settings: one has two risky arms that are perfectly negatively correlated, and the other has one safe arm and one risky arm. I show that with a winner-takes-all structure and heterogeneity among agents with respect to their innate abilities, there is always an excessive amount of experimentation along one of the lines of research. This phenomenon is called duplication: there is too much specialisation along a line of research when efficiency would require more diversification.

Chapter 2 explores the scenario where competing agents trying to make the same discovery have alternative methods of research to choose from and may be privately informed about the quality of a method. The model is an extension of the second setting with symmetric firms, where each firm may experience private arrival of information along the good risky avenue. I show that there is a symmetric non-cooperative equilibrium in which there is an excessive amount of experimentation along the risky avenue if the prior is high enough and too little otherwise.

Chapter 3, co-authored with Kalyan Chatterjee, analyses a model of price formation in a market with a finite number of non-identical agents engaging in decentralised bilateral interactions. We focus mainly on equal numbers of buyers and sellers, though we discuss other cases. All characteristics of agents are assumed to be common knowledge. Buyers simultaneously make targeted offers, which sellers can accept or reject. Acceptance leads to a pair exiting and rejection leads to the next period. Offers can be public, private or “ex ante public” (as in directed search models, which are, however, mostly one-period in the preceding literature). As the discount factor goes to 1, the price in all transactions converges to the same value.

Chapter 4, also co-authored with Kalyan Chatterjee, studies a model of decentralised bilateral interactions in a small market where one of the sellers has private information about her value. There are two identical buyers and another seller, whose valuation is commonly known to lie between the two possible valuations of the informed seller. We consider two infinite-horizon games, with public and private simultaneous one-sided offers respectively, and simultaneous responses. We show that there is a stationary perfect Bayes’ equilibrium for both models such that prices in all transactions converge to the same value as the discount factor goes to 1.

Keywords: R&D competition, Two-armed Bandit, Duplication, Bilateral Bargaining, Outside options, Incomplete information, Coase Conjecture, Uniform Price

Contents

Dedication
Acknowledgments

1 Competition, Duplication and Learning in R&D
1.1 Environment 1
1.1.1 Beliefs
1.1.2 Social Planner’s problem: The Efficiency Benchmark
1.1.3 The non-cooperative game
1.2 Environment 2
1.2.1 Symmetric firms
1.2.2 Asymmetric firms
1.3 Conclusion

2 Competition and Learning in R&D: The Role of Private Information
2.1 Introduction
2.2 Environment
2.2.1 The planner’s problem: The full information optimal
2.3 The non-cooperative game
2.3.1 Equilibrium
2.4 Conclusion

3 Decentralised Bilateral Trading, Competition for Bargaining Partners and the law of one price
3.1 Introduction
3.1.1 Motivation for the problem studied
3.1.2 Main features of our model
3.1.3 Related literature
3.2 The basic framework
3.2.1 The model
3.2.2 Equilibrium in the basic model
3.2.3 Adding a seller
3.2.4 Heterogeneous buyers
3.2.5 Adding a buyer
3.2.6 Generalisation 1: n buyers and n sellers
3.2.7 Generalisation 2: n buyers and n-1 sellers
3.3 Conclusion

4 Decentralised Bilateral Trading in a Market with Incomplete Information
4.1 Introduction
4.2 The Model
4.2.1 Players and payoffs
4.2.2 The extensive form
4.3 Equilibrium
4.3.1 The Benchmark Case: Complete information
4.3.2 Equilibrium of the one-sided incomplete information game with two players
4.3.3 Equilibrium of the four-player game with incomplete information
4.4 Asymptotic characterization
4.5 A non-stationary equilibrium
4.6 Conclusion

Bibliography

Appendix
A.1 Solution for planner’s v(p)
A.2 Switching-derivative lemma
A.3 Auxiliary results
A.3.1 For the proof of proposition (3)
A.3.2 For the proof of proposition (8)
A.3.3 For the proof of lemma (8)
A.4 Strategy depending on both belief and the location of the opponent
A.5 Proof of Lemma 11
A.6 Proof of Lemma 13
A.7 Proof of Lemma 23
A.8 Proof of Proposition 15
A.9 Proof of Proposition 16
A.10 Proof of Proposition 17
A.11 Details of the equilibria defined in proposition (18)
A.11.1 Ph < 1 and 1 − Ph > qH
A.11.2 Ph < 1 and 1 − Ph < qH
A.11.3 Ph ≥ 1
A.12 Off-path behavior of the 2 player game with incomplete information
A.13 Off-path behavior of the 4 player game with incomplete information (public offers)
A.14 Off-path behavior with private offers

Dedication

To

My parents

Gopa Das and Pabitra Kumar Das,

for the way they have brought me up;

and

Gurudev Rabindranath Tagore,

The great poet and musician,

whose songs are a constant source of inspiration to me.

Acknowledgments

This dissertation would not have been possible without the wisdom, kindness and friendship of many people.

First and foremost, I am indebted to my adviser Prof. Kalyan Chatterjee. In the past five years that I have been with him, I have learnt immensely from him as an instructor in the classroom, as a mentor in the PhD program, and as an erudite scholar in economics. But for his continuous guidance and encouragement, I would never have ended up doing research. Chapters 3 and 4 of this dissertation are joint work with him, and in the course of working on these projects, I learnt the basic approach to thinking about a research question and then coming up with a solution. I will never forget how, in spite of being in an extremely constrained situation, he kept providing me with support during the past year when I was searching for a job. Prof. Chatterjee has set an example for me of how a supervisor should be. I am extremely fortunate to have had him as my adviser.

I am extremely grateful to Prof. Edward Green and Prof. Vijay Krishna for serving as field members on my dissertation committee. Their thoughtful and detailed comments have greatly improved this dissertation. Prof. Susan Xu was generous enough to serve as an outside committee member and her comments were also helpful. I must thank Prof. James Jordan for his constant help over these years. Help from Venky Venkateswaran and Alex Monge is gratefully acknowledged. A special note of thanks goes to Prof. Bhaskar Dutta for his comments on chapters 3 and 4, and to Prof. Sven Rady for his comments on chapters 1 and 2.

I am grateful to my friends at the Department of Economics at Penn State, with whom I have had numerous insightful discussions, both academic and non-academic. A special note of thanks goes to Ethem Akyol and Pathikrit Basu. Ethem and I have studied together on many occasions, and my association with him has greatly enhanced my technical rigor. Pathikrit was generous enough to go through drafts of my papers and give constructive feedback. Thanks to Bruno and Nail for being wonderful office mates. I would also like to thank all my friends in State College who made my stay an enjoyable and enriching experience.

I have been fortunate enough to be taught by many dedicated teachers at Presidency College and the Indian Statistical Institute. I would especially like to mention Prof. Amitava Bose and Prof. Ambar Ghosh in this context. Prof. Ghosh motivated me in my study of Economics in my early days at Presidency College. It is because of him that I decided to pursue higher studies in the subject. Prof. Amitava Bose is one of the finest people I have ever come across. He has provided me with constant encouragement and has taught me how to live life in an enjoyable manner amidst all the challenges of the graduate program.

This acknowledgement would remain incomplete without mentioning my parents and my elder brother, Gaurav. It is only because of my father that I pursued Economics at the undergraduate level. My family has always been with me over these challenging years and provided me with constant motivation. Finally, many thanks to Atisha for her encouragement through the highs and lows of the past year.

Chapter 1

Competition, Duplication and

Learning in R&D

Innovation constitutes an important part of the progress of a society. From the growth rate of an economy to the various aspects that affect the day-to-day life of individuals, R&D activities play an important role. Innovation is a costly and uncertain process. The uncertainty arises from the fact that the exact path along which R&D activities will bear success is unknown. Potential innovators therefore go through trial-and-error experimentation along the available research avenues. Since experimentation along an avenue involves a cost (explicit or implicit), it is always desirable that at any point of time, resources are optimally (from society’s point of view) spread among the available methods. Experimentation along a wrong avenue will delay the invention. If society discounts the future, then this delay imposes a cost.

This problem is prevalent in industries where success in R&D comes through a series of trials and errors across different methods of experimentation. Hence, apart from the choice of the scale of R&D activities, choosing among alternative research projects is also an important issue. Most of the existing literature on patent races has been concerned mainly with the overall level of firms’ investment in R&D activities (for example, Reinganum (1982), Loury (1979), Lee and Wilde (1980), and Dasgupta and Stiglitz (1980)). However, there have been very few attempts to analyse the efficient allocation of R&D activities between competing lines of enquiry. In the present chapter, to isolate this aspect, we fix the total amount of R&D resources and focus solely on the issue of allocating resources between competing avenues of research.

It is commonly observed that similar innovations are simultaneously tried out by competing firms, which might differ with respect to their abilities. This is of importance in many real-life instances. Consider the research activities to invent a drug for Alzheimer’s disease. This disease is estimated to cost America alone some 170 billion dollars a year. At the turn of the century, research to invent a drug for this disease seemed promising. The physical manifestations of the disease are plaques of a type of protein known as β-amyloid, and nerve-cell-engulfing tangles of a protein known as tau. However, the exact causation is not known. Hence, given a level of resources devoted to inventing a drug for the disease, it is essential to choose a proper allocation of those resources across the methods of experimentation.

When several firms are independently engaged in R&D activities, the way the firms are compensated in the market is approximately of the winner-takes-all form. This form of compensation is mimicked by the institution of patents. The winner-takes-all hypothesis can also be perceived as an idealisation of the fact that, even in the absence of patents, there are many real-world observations where the rent accruing to the first inventor is disproportionately higher than the rents accruing to later inventors. This is because the first inventor in many situations makes great inroads into the market and earns a huge share of the rent from the invention. For example, at present three big companies are engaged in research to invent a drug for Alzheimer’s disease: Pfizer, Eli Lilly and Baxter. Given the high perceived valuation of a possible drug, it is evident that whoever invents the drug first will make a disproportionately higher amount of money than the later inventors. To cite another example, Xerox Corporation was the first firm to invent a photocopier using the xerography technology. Although other competing companies later came up with photocopiers, Xerox reaped a rent that was disproportionately higher than that earned by the subsequent inventors.

The issue of choosing among competing research avenues was also observed in the development of an asthma medicine that tried to block the action of leukotrienes. Research was conducted along two broad approaches, namely an inhibition strategy and an antagonist strategy. After experimentation with both, the inhibition strategy was abandoned and success ultimately came along the antagonist strategy. Apart from these, one can also look for instances in the electronics industry. For example, in the race in the 1970s to invent marketable video players, RCA and Sony adopted different approaches. Sony’s approach bore success (and consequently earned a huge profit) and RCA lost a huge amount of money ([31], [58]).

In situations similar to the above, when the patent mechanism is such that the first to invent appropriates all the rent, a particular firm’s decision about which research avenue to pursue is affected not only by its belief but also by the choices of other firms. It is worth exploring the efficient allocation of firms across research avenues and the distortions that can take place in a non-cooperative interaction. A possible distortion would be all firms engaged in R&D experimenting with the same approach, whilst the socially optimal allocation would involve diversifying effort across different approaches. This phenomenon is called duplication. It involves a firm imitating its competitor in a situation where the social optimum would require the firm to adopt a different strategy. There are many real-life stylized facts that might be a manifestation of this phenomenon. For example, consider the Alzheimer’s drug research case. It was widely believed that the level of β-amyloid protein is the main culprit. Consequently, for the past two decades almost exclusive attention was given to developing drugs to remove amyloid plaques. However, not much success has been attained in this direction. The drugs presently in the market only delay the onset of the disease ([20]). As a consequence, the theory that β-amyloid protein is the culprit is waning and the conjecture that tau proteins are to blame is gaining ground. However, major R&D activities still involve the removal of amyloid plaques.

This chapter analyses highly stylized models to address this issue. The analysis uses a strategic bandit setting. Two environments are considered. The first environment has two firms (1 and 2) trying to make the same invention, for example to find the correct explanation for why Alzheimer’s disease occurs. There are two available research avenues, of which one and only one can lead to success (in the context of our Alzheimer’s disease example, either tau or amyloid is the correct explanation). The setup is similar to that of a buried treasure problem with two sites, S1 and S2. One and only one of the sites contains the treasure. Firms know that with probability p the treasure is at S1. Hence we are analysing a two-armed bandit model where both arms are risky and perfectly negatively correlated. Moreover, the agents operate on the same bandit. Conditional on the treasure being present at a particular site (the arm being good), the success of a firm searching there is governed by a Poisson process. The intensity of this Poisson process is common knowledge. For a particular site (arm), this intensity differs across firms. It is assumed that while firm 1 is better than firm 2 at S1, firm 2 has an edge over firm 1 at S2. The firm that discovers the treasure first appropriates all the rent, which is normalised to 1. This is equal to the social value of the invention. We consider a continuous-time framework where a firm, based on the prior, chooses an initial site at which to carry out research and a time point (or, equivalently, a posterior) at which it switches, conditional on no discovery until that time. The choice of sites by the firms is publicly observable. Conditional on there being no discovery up to a time point, firms update p using Bayes’ rule on the basis of their search experiences until then.

We first obtain the efficiency benchmark by solving the planner’s problem. At any instant, the planner can choose a site for each of the firms at which to carry out research. The objective of the planner is to maximise the expected discounted social surplus, taking into account the firms’ abilities and the likelihood of a site being the correct one. Efficiency involves allocating both firms to the same site (specialization) for extreme ranges of beliefs and allocating them to different sites (diversification) for an intermediate range of beliefs. In the absence of any heterogeneity between the firms, the range of beliefs over which the planner allocates firms to different sites shrinks to the point 1/2. Next, we fully characterise the non-cooperative equilibria. Attention is restricted to Markovian strategies only.¹ We show that when firms’ abilities differ, the efficient allocation can never be achieved. The non-cooperative interaction always involves duplication. If the extent of the heterogeneity between the firms (relative to the other parameters) is large enough, then we have a unique non-cooperative equilibrium in threshold-type Markovian strategies. This equilibrium outcome involves diversification over a range of beliefs. If the extent of heterogeneity is not large enough, then we have a multiplicity of equilibria involving duplication, with diversification at one point only. We show that the only situation in which the efficient allocation can be achieved in a non-cooperative interaction is when the firms are homogeneous.

In the second environment there are again two sites. One of them is referred to as the safe site. The safe site has the treasure for sure. Any firm searching there obtains success according to a Poisson process with intensity $\pi_0 > 0$. The risky site can be either good or bad. A bad risky site has no treasure. A good risky site has the treasure and if firm $i$ searches there, it obtains success according to a Poisson process with intensity $\pi_i$. We have $\pi_1 \geq \pi_2 > \pi_0 > 0$. First, we solve the planner’s problem to obtain the efficiency benchmark. When $\pi_1 = \pi_2$, efficiency requires allocating both firms to the risky site if the belief exceeds a threshold and allocating both to the safe site otherwise. This can also be obtained as a non-cooperative outcome in threshold-type Markovian strategies. With heterogeneous firms, efficiency requires diversification. This means there is a range of beliefs over which the superior firm is allocated to the risky site and the other one to the safe site. We show that there is a unique equilibrium in threshold-type Markovian strategies which echoes the phenomenon of duplication.

¹ In the present model, the state on which Markovian strategies are conditioned should include both the belief and the location of the opponent firm. However, in the body of the chapter we concentrate only on the effect of beliefs. In the appendix we show that this does not really matter.

By establishing the phenomenon of duplication in two different kinds of environments, we can see that duplication is not an artefact of a particular model. It is basically a manifestation of the heterogeneity between the firms and the competition between them. Put differently, the above analysis characterises the non-cooperative equilibria in threshold-type Markovian strategies for two different two-armed bandit models, with players differing with respect to their innate abilities and with payoff externalities.

Related Literature: This chapter contributes to a relatively less explored area of the broad literature on R&D races. It shows that in the presence of heterogeneity and competition among agents, there is always a distortion in the choice of research avenue in a non-cooperative interaction. Bhattacharya and Mookherjee ([7]) and Dasgupta and Maskin ([18]) are two of the early papers that explore this issue in a static framework. Chatterjee and Evans ([12]) analyse similar issues in a dynamic setting. The model of this chapter is closely related to [12], except that we consider site-specific knowledge and a continuous-time framework. However, here we can show that we always have duplication in the non-cooperative interaction. Some other papers that look into similar issues are Fershtman and Rubinstein ([22]) and Akcigit and Liu ([1]). [22] studies a two-stage model in which agents simultaneously rank a finite set of boxes. Exactly one of the boxes contains the prize. Players commit to opening the boxes according to their ranked order. Inefficiency arises due to the fact that the box which is most likely to have the prize is not opened first. Their model is basically static in nature.

This chapter also contributes to the strategic bandit literature. Some of the seminal papers that have studied the bandit problem in the context of economics are Bolton and Harris ([9]), Keller, Rady and Cripps ([37]), Keller and Rady ([38]), Klein and Rady ([40]), Klein ([39]) and Thomas ([61]). In all of these papers except [61] and [40], players have replicas of bandits. Free-riding is a common feature of all of the above models. This leads to an inefficient level of experimentation (too little). The present work differs from [37] and [38] in two ways. First, we have payoff externalities. Due to this, the phenomenon of free riding does not arise (in the first two environments). Secondly, agents differ with respect to their innate abilities. This gives us inefficiency in equilibrium, the nature of which is different from that in [37] and [38].

[61] analyses a set-up where each player has access to an exclusive risky arm, and both of them have access to a common safe arm. At any time the safe arm can be accessed by one player only. Hence there is congestion along an arm. The present chapter differs from this in that here each of the arms can be accessed by all the players. Further, we do not have congestion along any of the arms.

The model analysed in [40] has each player having a bandit with a safe arm and a risky arm. The risky arm of one player is perfectly negatively correlated with the risky arm of the other player. Environment 1 in the present chapter differs from this in the following way. We have two arms, both of which are risky and perfectly negatively correlated (there is no safe arm). Each arm can be accessed by all the players. [39] addresses a model where players have replicas of bandits with three arms. One of the arms is safe and the other two are risky. The risky arms are perfectly negatively correlated. Thus, unlike in the present work, there is no payoff externality between the players.

Players in the present chapter differ with respect to their innate abilities, a feature which is absent in [61], [40] and [39]. As far as we are aware, this is the first attempt in the bandit literature that explicitly works out models incorporating differences in the learning abilities of the players along an arm.² Of course, we only analyse settings where players operate on the same bandit.

The rest of the chapter is organised as follows. Sections 2 and 3 analyse the models in environments 1 and 2 respectively. Section 4 describes the analysis of a situation where there is private arrival of information. Finally, section 5 concludes the chapter.

² Klein and Rady ([40]) discuss this issue in their work. Also, Akcigit and Liu ([1]) have this feature in their model. However, their work is solely concerned with private arrival of information.


1.1 Environment 1

Two firms (1 and 2) are simultaneously searching for a prize which is worth 1 unit. The first inventor appropriates all the rent from it. There are two potential avenues along which the research can be conducted. We refer to these avenues as sites. Hence there are two sites (S1 and S2) and the treasure is located at one and only one of them. However, the correct site is unknown to both firms. It is only publicly known that with probability p (p ∈ (0, 1)), S1 is the site which contains the treasure.

Firms’ capability of conducting research at the outset is site specific. While firm 1 is better at searching at S1, firm 2 does relatively better at S2 (conditional on the treasure being present at the respective sites). Time is continuous and firms discount the future at a continuous-time discount rate r > 0.

Conditional on the treasure being present at a particular site, the success of a firm that is searching there is governed by a Poisson process. The intensity of this Poisson process directly reflects the level of basic research knowledge a firm possesses for conducting research at that particular site. Given that the treasure is located at S1, the Poisson intensity of the success of firm 1 is $\pi'$ and that of firm 2 is $\pi$. Similarly, given that the treasure is located at S2, the Poisson intensity of the success of firm 2 is $\pi'$ and that of firm 1 is $\pi$, where

$$\pi' > \pi > 0.$$

The abilities of the firms across sites are common knowledge.
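As a rough illustration of the success technology described above, the following sketch simulates the race when both firms search at S1 and the treasure is indeed there. The function name `firm1_win_share`, the numerical intensities and the sample size are illustrative assumptions and are not taken from the text. The sketch relies on the standard fact that the waiting time to the first arrival of a Poisson process with intensity λ is exponentially distributed with rate λ, so firm 1 should win the race with probability π′/(π + π′).

```python
import random

def firm1_win_share(pi_high, pi_low, n_draws=20000, seed=0):
    """Share of simulated races won by firm 1 when both firms search S1
    and the treasure is at S1 (firm 1 has intensity pi_high, firm 2 pi_low)."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_draws):
        t1 = rng.expovariate(pi_high)  # firm 1's first success time
        t2 = rng.expovariate(pi_low)   # firm 2's first success time
        if t1 < t2:
            wins += 1
    return wins / n_draws

# Hypothetical values satisfying pi_high > pi_low > 0.
pi_high, pi_low = 2.0, 1.0
print(firm1_win_share(pi_high, pi_low))   # simulated share of wins by firm 1
print(pi_high / (pi_high + pi_low))       # theoretical benchmark
```

The simulated share should be close to the benchmark for a large number of draws; discounting and the possibility that the treasure is at the other site are deliberately left out of this sketch.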

1.1.1 Beliefs

Each firm can observe the site where his opponent is going. If there is an invention then it

is immediately revealed. In the present model, this implies that the outcomes of research

by firms are publicly observable. Thus, given the players’ common prior p0, at each time

point t ≥ 0, players share a common posterior pt, which is derived using Bayes’ rule on


the basis of the observed outcomes till then. Over the time interval [t, t + ∆] (∆ > 0), if

both firms 1 and 2 carry out research at S1 without having any success, then the common

posterior at t + ∆ is given by

$$p_{t+\Delta} = \frac{p_t\,e^{-(\pi+\pi')\Delta}}{p_t\,e^{-(\pi+\pi')\Delta} + 1 - p_t}$$

The posterior is decreasing in ∆. The longer the firms conduct research at S1 without

finding the treasure, the less optimistic they become about the treasure being present at

S1 (simultaneously, they become more optimistic about S2 having the treasure). If the

firms conduct research at S1 for the time interval dt → 0 (such that the terms of order

o( dt) can be ignored), then the law of motion followed by the belief is given by

dpt = −(π + π′)pt(1− pt) dt

Similarly if the firms carry out research at S2, the law of motion of the belief is given

by dpt = (π + π′)pt(1− pt) dt. Given the parametric assumptions of the model, it is easy

to see that there is no change in beliefs when each site is exploited by one firm only and

there is no arrival. This can be explained as follows. Suppose firm 1 explores S1 and firm

2 explores S2 over the time interval [t, t+ ∆] and there is no arrival. In that case, because

of firm 1’s exploration, p gets updated downwards and because of firm 2’s exploration, p

gets updated upwards. Thus as the duration of the interval dt → 0, from the above we

can infer that the total change in p is given by:

$$dp_t = -\pi' p_t(1-p_t)\,dt + \pi' p_t(1-p_t)\,dt = 0$$

This explains why the beliefs are frozen if each firm exploits one site.
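The belief dynamics described above can be checked numerically. The following is a minimal sketch, not part of the original analysis: the function names and the parameter values are illustrative assumptions. It compares a finite-difference drift of the no-success posterior with the law of motion, and confirms that the belief is frozen when one firm searches at each site.

```python
import math


def posterior_both_at_s1(p, pi_lo, pi_hi, delta):
    """Posterior on S1 after both firms search S1 for `delta` without success."""
    decay = math.exp(-(pi_lo + pi_hi) * delta)
    return p * decay / (p * decay + 1.0 - p)


def posterior_diversified(p, pi_hi, delta):
    """Posterior on S1 when firm 1 searches S1 and firm 2 searches S2, no success."""
    fail_at_s1 = math.exp(-pi_hi * delta)  # firm 1 fails although the prize is at S1
    fail_at_s2 = math.exp(-pi_hi * delta)  # firm 2 fails although the prize is at S2
    return p * fail_at_s1 / (p * fail_at_s1 + (1.0 - p) * fail_at_s2)


# Illustrative parameter values (assumptions, not taken from the text).
P0, PI_LO, PI_HI = 0.6, 0.5, 1.5
DT = 1e-6

# Finite-difference drift versus the law of motion dp = -(pi + pi') p (1 - p) dt.
drift = (posterior_both_at_s1(P0, PI_LO, PI_HI, DT) - P0) / DT
law_of_motion = -(PI_LO + PI_HI) * P0 * (1.0 - P0)
print(drift, law_of_motion)

# With one firm at each site the belief does not move.
print(posterior_diversified(P0, PI_HI, 2.0))
```

Here `pi_lo` and `pi_hi` stand for π and π′ respectively.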


1.1.2 Social Planner’s problem: The Efficiency Benchmark

We solve for the utilitarian social planner’s optimal behavior in our present set-up. The

planner allocates each firm to a site based on the firm’s ability of conducting research along

that site, and the likelihood of that site containing the treasure. Let pt be the common

subjective probability at time t which the firms assign to S1 being the correct site. The

planner’s payoff from the invention is 1 unit.

The planner wants to maximise the expected discounted social value by choosing an

appropriate action profile at each instant. kt = (k1t, k2t) denotes the action profile chosen

by the planner at the instant t. kit (i = 1, 2) can take values in {0, 1} only. kit = 1 implies

that the planner allocates both the firms to Si. If k1t = k2t = 0, then the planner allocates firm 1 to S1 and firm 2 to S2 (see footnote 3). Hence we must have

k1t + k2t ≤ 1

kt(t ≥ 0) is such that it is measurable with respect to the information available at the time

point t.

Assumption 1 If the planner is indifferent between allocating firm 1 (2) to S1 and S2, then

it allocates 1 (2) to S1 (S2). Since in the current set-up beliefs can move in both directions,

this ensures a well-defined solution to the corresponding law of motion for posterior beliefs.

This is closely related to the admissibility assumption in ([40]) and ([39]).

The expected discounted payoff to the planner can then be expressed as:

$$E\left[\int_0^\infty e^{-rt}\Big[(1-k_{1t}-k_{2t})\pi' + k_{1t}\,p_t(\pi+\pi') + k_{2t}(1-p_t)(\pi+\pi')\Big]e^{X(t)}\,dt\right],$$

3 It is easy to observe that the social planner will never allocate firm 1 to S2 and firm 2 to S1.


where

$$X(t) = -\int_0^t \Big\{(1-k_{1\tau}-k_{2\tau})\pi' + k_{1\tau}\,p_\tau(\pi+\pi') + k_{2\tau}(1-p_\tau)(\pi+\pi')\Big\}\,d\tau$$

and the expectation is over the stochastic processes kt and pt. This shows that we can take

the belief to be our state variable. Thus we have a dynamic programming problem with

the current belief p (from now on we will do away with the time subscript) as the state

variable. Since the evolution of beliefs depends on k only, the planner’s problem reduces

to choosing the action profile k = (k1, k2), given the current belief p.

Let v(p) be the value function of the planner. By the principle of optimality it should

satisfy

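$$\begin{aligned} v(p) = \max_{k_1,k_2\in\{0,1\};\,k_1+k_2\le 1}\Big\{ & (1-k_1-k_2)\pi'\,dt + (\pi+\pi')\big(k_1p + k_2(1-p)\big)\,dt \\ & + (1-r\,dt)\big[1 - k_1p(\pi+\pi')\,dt - k_2(1-p)(\pi+\pi')\,dt - (1-k_1-k_2)\pi'\,dt\big]\,v(p+dp)\Big\} \end{aligned}$$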

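where (1 − r dt) is an approximation of the discount factor exp(−r dt). Substituting v(p + dp) = v(p) + v′(p) dp and dp = −(k1 − k2)p(1 − p)(π + π′) dt (the belief falls when both firms search at S1 and rises when both search at S2), we get

$$\begin{aligned} v(p) = \max_{k_1,k_2\in\{0,1\};\,k_1+k_2\le 1}\Big\{ & (1-k_1-k_2)\pi'\,dt + (\pi+\pi')\big(k_1p + k_2(1-p)\big)\,dt \\ & + (1-r\,dt)\big[1 - k_1p(\pi+\pi')\,dt - k_2(1-p)(\pi+\pi')\,dt - (1-k_1-k_2)\pi'\,dt\big] \\ & \times\big[v(p) - v'(p)(k_1-k_2)p(1-p)(\pi+\pi')\,dt\big]\Big\} \end{aligned}$$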

After simplifying and rearranging the above, we obtain the following Bellman equation

$$rv = \max_{k_1,k_2\in\{0,1\};\,k_1+k_2\le 1}\Big\{(1-k_1-k_2)\big[\pi'(1-v)\big] + k_1\big[(\pi+\pi')p\big(1-v-(1-p)v'\big)\big] + k_2\big[(\pi+\pi')(1-p)\big(1-v+pv'\big)\big]\Big\} \tag{1.1}$$

Proposition 1 There exists a solution to the planner's problem in which both the firms are allocated to S1 (S2) if the belief p is strictly greater (lower) than a threshold p∗1 (p∗2). If the belief is in the range [p∗2, p∗1], then firm 1 is allocated to S1 and firm 2 is allocated to S2. p∗1 and p∗2 satisfy

$$0 < p_2^* = \frac{\pi}{\pi+\pi'} < \frac{1}{2} < p_1^* = \frac{\pi'}{\pi+\pi'} < 1$$

Note that p∗1 = 1 − p∗2.

Proof.

We prove this through following two lemmas:

Lemma 1 If the planner's solution is assumed to be of the threshold type, i.e. if there exist threshold probabilities p∗2 and p∗1 such that 0 < p∗2 < p∗1 < 1 and both firms are allocated to site S1 (S2) for p ∈ (p∗1, 1] ([0, p∗2)) and firm 1 (2) to S1 (S2) for p ∈ [p∗2, p∗1], then

$$p_1^* = \frac{\pi'}{\pi+\pi'} = 1 - p_2^*.$$

Proof of Lemma. Suppose the planner’s solution is of the threshold type as described

by the above lemma. If p ∈ (p∗1, 1], from (1.1) we can infer that v(p) satisfies:

$$v' + \frac{r + (\pi+\pi')p}{p(1-p)(\pi+\pi')}\,v = \frac{1}{1-p}$$

This is a first order linear O.D.E. Solving it (see appendix (A.1) for a detailed analysis) we obtain:

$$v = \frac{\pi+\pi'}{r+\pi+\pi'}\,p + C_1(1-p)[\Lambda(p)]^{\frac{r}{\pi+\pi'}} \tag{1.2}$$

where C1 is an integration constant and Λ(p) = (1 − p)/p.

Similarly if p ∈ [0, p∗2), v(p) satisfies the following O.D.E:

$$v' - v\,\frac{r + (1-p)(\pi+\pi')}{p(1-p)(\pi+\pi')} = -\frac{1}{p}$$


Solving the above first order O.D.E as before, we have

$$v = \frac{\pi+\pi'}{r+\pi+\pi'}(1-p) + C_2\,p\,[\Gamma(p)]^{\frac{r}{\pi+\pi'}} \tag{1.3}$$

where C2 is an integration constant and Γ(p) = p/(1 − p). Finally, if p ∈ [p∗2, p∗1], then v satisfies

$$rv = (1-v)\pi' \;\Rightarrow\; v = \frac{\pi'}{r+\pi'}$$

Hence the value function is given by:

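$$v(p) = \begin{cases} \dfrac{\pi+\pi'}{r+\pi+\pi'}\,p + C_1(1-p)[\Lambda(p)]^{\frac{r}{\pi+\pi'}} & \text{if } p\in(p_1^*,1], \\[6pt] \dfrac{\pi+\pi'}{r+\pi+\pi'}\,(1-p) + C_2\,p\,[\Gamma(p)]^{\frac{r}{\pi+\pi'}} & \text{if } p\in[0,p_2^*), \\[6pt] \dfrac{\pi'}{r+\pi'} & \text{if } p\in[p_2^*,p_1^*]. \end{cases} \tag{1.4}$$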

If p∗1 and p∗2 are optimally chosen, then the smooth pasting and value matching condi-

tions should be satisfied at p∗1 and p∗2. Invoking them we derive C1,C2, p∗2 and p∗1 . At p∗1,

the value matching condition implies:

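$$\frac{\pi+\pi'}{r+\pi+\pi'}\,p_1^* + C_1(1-p_1^*)[\Lambda(p_1^*)]^{\frac{r}{\pi+\pi'}} = v(p_1^*) = \frac{\pi'}{r+\pi'} \;\Rightarrow\; C_1 = \frac{\dfrac{\pi'}{r+\pi'} - \dfrac{\pi+\pi'}{r+\pi+\pi'}\,p_1^*}{(1-p_1^*)[\Lambda(p_1^*)]^{\frac{r}{\pi+\pi'}}} \tag{1.5}$$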

and the smooth pasting condition implies,

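$$v'(p_1^{*+}) = 0 \;\Rightarrow\; \frac{\pi+\pi'}{r+\pi+\pi'} = C_1\left[[\Lambda(p_1^*)]^{\frac{r}{\pi+\pi'}} + \frac{r}{\pi+\pi'}\,[\Lambda(p_1^*)]^{\frac{r}{\pi+\pi'}}\,\frac{1}{p_1^*}\right]$$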

Substituting the value of C1 from (1.5) we obtain p∗1 = π′/(π + π′).


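Since π′ > π, p∗1 > 1/2. Similarly, we obtain C2 and p∗2 as

$$C_2 = \frac{\dfrac{\pi'}{r+\pi'} - \dfrac{\pi+\pi'}{r+\pi+\pi'}(1-p_2^*)}{p_2^*\,[\Gamma(p_2^*)]^{\frac{r}{\pi+\pi'}}}\,; \qquad p_2^* = \frac{\pi}{\pi+\pi'} \tag{1.6}$$

It is easy to see that p∗2 < 1/2 as π′ > π.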
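The value matching and smooth pasting steps of Lemma 1 can be illustrated numerically. The sketch below is an illustration under assumed parameter values; `planner_solution` and its argument names are hypothetical helpers, with `pi_lo` and `pi_hi` standing for π and π′. It builds the piecewise value function from (1.2) to (1.6) and evaluates the one-sided slopes at the two thresholds, which should vanish if smooth pasting holds.

```python
def planner_solution(pi_lo, pi_hi, r):
    """Thresholds, value function and threshold slopes implied by (1.2)-(1.6)."""
    a = pi_lo + pi_hi
    mu = r / a
    p1, p2 = pi_hi / a, pi_lo / a
    v_mid = pi_hi / (r + pi_hi)  # value while the firms are diversified

    def lam(p):
        return (1.0 - p) / p

    def gam(p):
        return p / (1.0 - p)

    # Value matching at p1 and p2 pins down the integration constants.
    c1 = (v_mid - a / (r + a) * p1) / ((1.0 - p1) * lam(p1) ** mu)
    c2 = (v_mid - a / (r + a) * (1.0 - p2)) / (p2 * gam(p2) ** mu)

    def v(p):
        if p > p1:
            return a / (r + a) * p + c1 * (1.0 - p) * lam(p) ** mu
        if p < p2:
            return a / (r + a) * (1.0 - p) + c2 * p * gam(p) ** mu
        return v_mid

    # One-sided derivatives at the thresholds (smooth pasting requires zero).
    slope_right_of_p1 = a / (r + a) - c1 * lam(p1) ** mu * (1.0 + mu / p1)
    slope_left_of_p2 = -a / (r + a) + c2 * gam(p2) ** mu * (1.0 + mu / (1.0 - p2))
    return p1, p2, v, slope_right_of_p1, slope_left_of_p2


# Illustrative parameter values (assumptions, not taken from the text).
p1, p2, v, s1, s2 = planner_solution(pi_lo=0.5, pi_hi=1.5, r=1.0)
print(p1, p2, s1, s2)
print(v(0.1), v(0.5), v(0.9))
```

Because the problem is symmetric around 1/2, the computed value function should satisfy v(p) = v(1 − p).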

Lemma 2 The v obtained in (1.4) with p∗1 = π′/(π + π′) = 1 − p∗2, and the corresponding policy k, satisfy (1.1).

Proof of Lemma. We need to show that k1 = 1 for p ∈ (p∗1, 1], k2 = 1 for p ∈ [0, p∗2) and

k1 = k2 = 0 for p ∈ [p∗2, p∗1] are optimal choices for the planner. Let,

$$B(p,v) = \pi'(1-v)\,; \quad B_1(p,v) = (\pi+\pi')\,p\,\big(1-v-(1-p)v'\big) \quad\text{and}\quad B_2(p,v) = (\pi+\pi')(1-p)\big(1-v+pv'\big)$$

Then (1.1) is equivalent to

$$rv = \max_{k_1,k_2\in\{0,1\};\,k_1+k_2\le 1}\big\{(1-k_1-k_2)B(p,v) + k_1B_1(p,v) + k_2B_2(p,v)\big\}$$

To show that v (and the corresponding k) satisfies the Bellman equation, we need to verify

that the following hold ,

B(p, v) = max{B(p, v), B1(p, v), B2(p, v)} for p ∈ [p∗2, p∗1]

B1(p, v) = max{B(p, v), B1(p, v), B2(p, v)} for p ∈ (p∗1, 1]

B2(p, v) = max{B(p, v), B1(p, v), B2(p, v)} for p ∈ [0, p∗2)

First, consider the interval [p∗2, p∗1]. According to (1.4), v = π′/(r + π′) in this region. Thus v′ = 0. This implies that

$$B(p,v) = \pi'(1-v)\,; \quad B_1(p,v) = (\pi+\pi')\,p\,(1-v) \quad\text{and}\quad B_2(p,v) = (\pi+\pi')(1-p)(1-v).$$


The conditions B(p, v) ≥ B1(p, v) and B(p, v) ≥ B2(p, v) hold simultaneously when p ≤ π′/(π + π′) and p ≥ π/(π + π′). Hence for p ∈ [p∗2, p∗1],

$$B(p,v) = \max\{B(p,v),\,B_1(p,v),\,B_2(p,v)\}$$

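Next, consider the region (p∗1, 1]. Here v′ is given by

$$v' = \frac{\pi+\pi'}{r+\pi+\pi'} - C_1[\Lambda(p)]^{\frac{r}{\pi+\pi'}}\left[1 + \frac{r}{\pi+\pi'}\cdot\frac{1}{p}\right]$$

$$\Rightarrow\; (1-p)v' = \frac{\pi+\pi'}{r+\pi+\pi'} - \left[p\,\frac{\pi+\pi'}{r+\pi+\pi'} + C_1(1-p)[\Lambda(p)]^{\frac{r}{\pi+\pi'}}\right] - C_1\,\frac{(1-p)}{p}\,[\Lambda(p)]^{\frac{r}{\pi+\pi'}}\,\frac{r}{\pi+\pi'}$$

$$\Rightarrow\; 1 - v - (1-p)v' = \frac{r}{\pi+\pi'+r} + \frac{(1-p)}{p}\,C_1[\Lambda(p)]^{\frac{r}{\pi+\pi'}}\,\frac{r}{\pi+\pi'}$$

Substituting the above in the expression of B1(p, v), we get B1(p, v) = rv.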

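Further, from the expression of v′(p) we obtain

$$pv' = \frac{\pi+\pi'}{r+\pi+\pi'}\,p - p\,C_1[\Lambda(p)]^{\frac{r}{\pi+\pi'}} - C_1\,\frac{r}{\pi+\pi'}\,[\Lambda(p)]^{\frac{r}{\pi+\pi'}} = v - C_1[\Lambda(p)]^{\frac{r}{\pi+\pi'}}\cdot\frac{r+\pi+\pi'}{\pi+\pi'}$$

$$\Rightarrow\; 1 - v + pv' = 1 - C_1[\Lambda(p)]^{\frac{r}{\pi+\pi'}}\cdot\frac{r+\pi+\pi'}{\pi+\pi'}$$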

Substituting this in the expression of B2(p, v), we get B2(p, v) = (π + π′) − (r + π + π′)v. Thus to have B1(p, v) ≥ B(p, v) and B1(p, v) ≥ B2(p, v), we require v ≥ π′/(r + π′) and v ≥ (π + π′)/(2r + π + π′) respectively. Since

$$\frac{\pi'}{r+\pi'} - \frac{\pi+\pi'}{2r+\pi+\pi'} = \frac{r(\pi'-\pi)}{(r+\pi')(2r+\pi+\pi')} > 0$$

and v > π′/(r + π′) for p ∈ (p∗1, 1], we have

B1(p, v) = max{B(p, v), B1(p, v), B2(p, v)}

Similarly we can show that for the region p ∈ [0, p∗2), B2(p, v) = max{B(p, v), B1(p, v), B2(p, v)}.

This shows that the value function and the corresponding policy k satisfies (1.1).

The proof of proposition (1) now follows directly from lemma (1) and (2).
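The verification argument of Lemma 2 can also be checked on a grid of beliefs. The following sketch is illustrative only: the function names, grid and parameter values are assumptions, and `pi_lo`, `pi_hi` stand for π and π′. It evaluates B, B1 and B2 at each grid point and tests that the term selected by the threshold policy is the largest of the three and equals rv.

```python
def planner_value_and_slope(pi_lo, pi_hi, r):
    """Return (p1, p2, v, dv) for the piecewise value function (1.4)."""
    a = pi_lo + pi_hi
    mu = r / a
    p1, p2 = pi_hi / a, pi_lo / a
    v_mid = pi_hi / (r + pi_hi)
    c1 = (v_mid - a / (r + a) * p1) / ((1 - p1) * ((1 - p1) / p1) ** mu)
    c2 = (v_mid - a / (r + a) * (1 - p2)) / (p2 * (p2 / (1 - p2)) ** mu)

    def v(p):
        if p > p1:
            return a / (r + a) * p + c1 * (1 - p) * ((1 - p) / p) ** mu
        if p < p2:
            return a / (r + a) * (1 - p) + c2 * p * (p / (1 - p)) ** mu
        return v_mid

    def dv(p):
        if p > p1:
            return a / (r + a) - c1 * ((1 - p) / p) ** mu * (1 + mu / p)
        if p < p2:
            return -a / (r + a) + c2 * (p / (1 - p)) ** mu * (1 + mu / (1 - p))
        return 0.0

    return p1, p2, v, dv


def bellman_terms(p, v, dv, pi_lo, pi_hi):
    """The three candidate terms B, B1, B2 on the right-hand side of (1.1)."""
    a = pi_lo + pi_hi
    b = pi_hi * (1 - v)
    b1 = a * p * (1 - v - (1 - p) * dv)
    b2 = a * (1 - p) * (1 - v + p * dv)
    return b, b1, b2


def policy_satisfies_bellman(pi_lo, pi_hi, r, grid_size=99, tol=1e-9):
    """Check on an interior grid that the threshold policy attains the maximum."""
    p1, p2, v, dv = planner_value_and_slope(pi_lo, pi_hi, r)
    for i in range(1, grid_size + 1):
        p = i / (grid_size + 1)
        b, b1, b2 = bellman_terms(p, v(p), dv(p), pi_lo, pi_hi)
        chosen = b1 if p > p1 else (b2 if p < p2 else b)
        if chosen < max(b, b1, b2) - tol or abs(r * v(p) - chosen) > tol:
            return False
    return True


# Illustrative parameter values (assumptions, not taken from the text).
print(policy_satisfies_bellman(0.5, 1.5, 1.0))
```

A grid check of this kind is not a proof; it only guards against algebra slips in the closed-form expressions.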

We conclude this subsection by making an observation. It follows that the length of the interval [p∗2, p∗1] is given by (π′ − π)/(π + π′). Hence the range of beliefs over which there is


diversification of research is increasing in the difference of abilities of firms at a particular site. If π = π′, then the range shrinks to the point 1/2. This implies that with homogeneous firms, diversification in research takes place at the belief 1/2 only.

1.1.3 The non-cooperative game

The extensive form

Player i = 1, 2 chooses actions ki,t ∈ {(1, 0), (0, 1)} such that ki,t is measurable with respect to the information available at time t. ki,t = (1, 0) ((0, 1)) indicates that firm i is going to S1 (S2). It is evident that as soon as there is a discovery by a particular firm the

game ends. We consider a winner takes all structure, so that the entire rent from discovery

accrues to the firm who discovers it first.

Throughout our analysis of the non-cooperative game, we will restrict our attention to Markovian strategies with the common belief p as the state variable. We define a (markovian) strategy of player i (i = 1, 2) to be the mapping ki : [0, 1] → {(1, 0), (0, 1)} (i.e. from states pt to kit). We allow only those ki(.) functions which satisfy the property that $k_i^{-1}[(1,0)]$ and $k_i^{-1}[(0,1)]$ are disjoint unions of a finite number of non-degenerate sub-intervals in [0, 1]. Also ki(0) = (0, 1) and ki(1) = (1, 0). This ensures that player i chooses the dominant action under subjective certainty. Given this we can also visualise the strategies of the players as follows. A firm, given the current belief, chooses a site and a posterior at which it is going to switch to the other site. Player 1's markov strategy is called a threshold type strategy if $k_1^{-1}(1,0) = [p_1, 1]$ or $(p_1, 1]$. Similarly player 2's markov strategy is a threshold type strategy if $k_2^{-1}(0,1) = [0, p_2]$ or $[0, p_2)$.

It should be noted that strictly speaking, the domain of a Markovian strategy of a

particular firm should not only depend on the belief p, but also on the location of its com-

petitor. Appendix (A.4) illustrates that the results obtained by restricting the strategies

of players as function of beliefs remain valid when strategy depends on both the belief and

the location of the competitor.


Assumption 2 We assume that k1 is right continuous and k2 is left continuous. This

guarantees the existence of a well-defined solution to the law of motion for posterior beliefs.

Equilibrium

We aim to characterise the markov-perfect equilibria of the non-cooperative game. A

markov-perfect equilibrium is a pair of strategies (k1, k2), such that the strategy of player

i maximises his expected discounted payoff, conditional on the strategy of player j and

vice-versa.

First we focus on equilibria in which diversification in research takes place (i.e. there exists a range of beliefs over which firms choose different sites) and the strategies of players are of the threshold type. In the present set-up threshold type markov strategies are said to be symmetric if

$$k_1 = \begin{cases} (1,0) & \text{if } p\in[\underline{p},1], \\ (0,1) & \text{if } p\in[0,\underline{p}), \end{cases} \tag{1.7}$$

and

$$k_2 = \begin{cases} (0,1) & \text{if } p\in[0,\bar{p}], \\ (1,0) & \text{if } p\in(\bar{p},1], \end{cases} \tag{1.8}$$

such that $\underline{p} < \bar{p}$ and $\underline{p} = 1 - \bar{p}$.

Let v1(p) and v2(p) be the value functions (equilibrium payoffs) of firms 1 and 2 respectively, from an equilibrium strategy profile (k1, k2). Write $k_i = (k_i^1, k_i^2)$, so that $k_i^j = 1$ when firm i searches at Sj. Then given k2, v1 and k1 should satisfy

$$\begin{aligned} v_1(p) = \max_{k_1\in\{(1,0),(0,1)\}}\Big\{ & \big(k_1^1k_2^2\,p\pi' + k_1^1k_2^1\,p\pi' + k_1^2k_2^2\,(1-p)\pi + k_1^2k_2^1\,(1-p)\pi\big)\,dt \\ & + (1-r\,dt)\big(1 - k_1^1k_2^2\,\pi'\,dt - k_1^1k_2^1\,p(\pi+\pi')\,dt - k_1^2k_2^2\,(1-p)(\pi+\pi')\,dt - k_1^2k_2^1\,\pi\,dt\big) \\ & \times\big(v_1(p) - (k_1^1k_2^1 - k_1^2k_2^2)\,p(1-p)(\pi+\pi')\,v_1'(p)\,dt\big)\Big\} \end{aligned} \tag{1.9}$$

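Similarly, given k1, v2 and k2 should satisfy

$$\begin{aligned} v_2(p) = \max_{k_2\in\{(1,0),(0,1)\}}\Big\{ & \big(k_2^2k_1^1\,(1-p)\pi' + k_1^1k_2^1\,p\pi + k_1^2k_2^2\,(1-p)\pi' + k_1^2k_2^1\,p\pi\big)\,dt \\ & + (1-r\,dt)\big(1 - k_1^1k_2^2\,\pi'\,dt - k_1^1k_2^1\,p(\pi+\pi')\,dt - k_1^2k_2^2\,(1-p)(\pi+\pi')\,dt - k_1^2k_2^1\,\pi\,dt\big) \\ & \times\big(v_2(p) - (k_1^1k_2^1 - k_1^2k_2^2)\,p(1-p)(\pi+\pi')\,v_2'(p)\,dt\big)\Big\} \end{aligned} \tag{1.10}$$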

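Expanding and rearranging (1.9) and (1.10) (after ignoring the terms of the order o(dt)) we get the following Bellman equations, which the equilibrium payoffs should satisfy

$$\begin{aligned} rv_1 = \max_{k_1\in\{(1,0),(0,1)\}}\Big\{ & k_1^1k_2^2\big[\pi'(p - v_1)\big] + k_1^1k_2^1\Big[(\pi+\pi')\,p\,\Big(\frac{\pi'}{\pi+\pi'} - v_1 - (1-p)v_1'\Big)\Big] \\ & + k_1^2k_2^2\Big[(\pi+\pi')(1-p)\Big(\frac{\pi}{\pi+\pi'} - v_1 + pv_1'\Big)\Big] + k_1^2k_2^1\big[\pi((1-p) - v_1)\big]\Big\} \end{aligned} \tag{1.11}$$

$$\begin{aligned} rv_2 = \max_{k_2\in\{(1,0),(0,1)\}}\Big\{ & k_2^2k_1^1\big[\pi'((1-p) - v_2)\big] + k_2^2k_1^2\Big[(\pi+\pi')(1-p)\Big(\frac{\pi'}{\pi+\pi'} - v_2 + pv_2'\Big)\Big] \\ & + k_2^1k_1^1\Big[(\pi+\pi')\,p\,\Big(\frac{\pi}{\pi+\pi'} - v_2 - (1-p)v_2'\Big)\Big] + k_2^1k_1^2\big[\pi(p - v_2)\big]\Big\} \end{aligned} \tag{1.12}$$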

If (k1, k2) is a threshold type markovian strategy profile and is symmetric in the way

described above, then players’ payoffs induced by this strategy profile are given by

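$$v_1(p) = \begin{cases} \dfrac{\pi'}{r+\pi+\pi'}\,p + C^1_{11}(1-p)[\Lambda(p)]^{\frac{r}{\pi+\pi'}} & \text{if } p\in(\bar{p},1], \\[6pt] \dfrac{\pi'}{r+\pi'}\,p & \text{if } p\in[\underline{p},\bar{p}], \\[6pt] \dfrac{\pi}{r+\pi+\pi'}\,(1-p) + C^1_{22}\,p\,[\Gamma(p)]^{\frac{r}{\pi+\pi'}} & \text{if } p\in[0,\underline{p}) \end{cases} \tag{1.13}$$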


and

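$$v_2(p) = \begin{cases} \dfrac{\pi}{r+\pi+\pi'}\,p + C^2_{11}(1-p)[\Lambda(p)]^{\frac{r}{\pi+\pi'}} & \text{if } p\in(\bar{p},1], \\[6pt] \dfrac{\pi'}{r+\pi'}\,(1-p) & \text{if } p\in[\underline{p},\bar{p}], \\[6pt] \dfrac{\pi'}{r+\pi+\pi'}\,(1-p) + C^2_{22}\,p\,[\Gamma(p)]^{\frac{r}{\pi+\pi'}} & \text{if } p\in[0,\underline{p}) \end{cases} \tag{1.14}$$

where $C^i_{11}$, $C^i_{22}$ (i = 1, 2) are integration constants.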

If the strategy profile (k1, k2) constitutes a (markovian) equilibrium, then given k2, (1.13) along with k1 should satisfy (1.11) and given k1, (1.14) along with k2 should satisfy (1.12).

The following proposition states that we cannot obtain the efficient outcome as an

equilibrium outcome of the non-cooperative game.

Proposition 2 There does not exist an efficient equilibrium.

Proof. The strategy profile (k∗1, k∗2) which implements the efficient outcome is the one which satisfies (1.7) and (1.8) with $\underline{p} = p_2^*$ and $\bar{p} = p_1^*$. If (k∗1, k∗2) constitutes an equilibrium then k∗1 should constitute a best response to k∗2 for all p ∈ [0, 1]. The payoffs induced by this strategy profile will be given by (1.13) and (1.14) with $\underline{p} = p_2^*$ and $\bar{p} = p_1^*$.

If k∗1 is a best response to k∗2, then given k∗2, v1 along with k∗1 should satisfy (1.11). Consider the region [p∗2, p∗1] first. Payoffs induced by the strategy profile (k∗1, k∗2) imply that for p ∈ [p∗2, p∗1], v1 = π′p/(r + π′). From (1.11), we know that for k∗1 to be a best response to k∗2, we require

$$\pi'(p - v_1) \ge (\pi+\pi')(1-p)\Big(\frac{\pi}{\pi+\pi'} - v_1 + pv_1'\Big) \;\Rightarrow\; p \ge \frac{\pi(r+\pi')}{r\pi' + \pi(r+\pi')}$$

However,

$$\frac{\pi(r+\pi')}{r\pi' + \pi(r+\pi')} - p_2^* = \frac{\pi(r+\pi')}{r\pi' + \pi(r+\pi')} - \frac{\pi}{\pi+\pi'} = \frac{\pi\,\pi'^2}{[r\pi' + \pi(r+\pi')](\pi+\pi')} > 0$$

Hence for $p \in \big[p_2^*, \frac{\pi(r+\pi')}{r\pi'+\pi(r+\pi')}\big)$, we have $\pi'(p - v_1) < (\pi+\pi')(1-p)\big(\frac{\pi}{\pi+\pi'} - v_1 + pv_1'\big)$. Thus k∗1 does not constitute a best response to k∗2. This shows that there does not exist an efficient equilibrium.
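The deviation incentive used in the proof of Proposition 2 can be made concrete with a small numerical sketch. The function name and parameter values below are illustrative assumptions, with `pi_lo` and `pi_hi` standing for π and π′. The code evaluates both sides of the best-response inequality for firm 1 under the efficient profile, first at the efficient threshold p∗2 and then at the cutoff derived above.

```python
def firm1_stay_vs_deviate(p, pi_lo, pi_hi, r):
    """Both sides of firm 1's best-response inequality on the diversification region."""
    a = pi_lo + pi_hi
    v1 = pi_hi * p / (r + pi_hi)  # firm 1's payoff while diversified, from (1.13)
    dv1 = pi_hi / (r + pi_hi)
    stay = pi_hi * (p - v1)                             # firm 1 at S1, firm 2 at S2
    deviate = a * (1 - p) * (pi_lo / a - v1 + p * dv1)  # both firms at S2
    return stay, deviate


# Illustrative parameter values (assumptions, not taken from the text).
PI_LO, PI_HI, R = 0.5, 1.5, 1.0
p2_eff = PI_LO / (PI_LO + PI_HI)
p_cut = PI_LO * (R + PI_HI) / (R * PI_HI + PI_LO * (R + PI_HI))

print(firm1_stay_vs_deviate(p2_eff, PI_LO, PI_HI, R))  # deviation is more attractive
print(firm1_stay_vs_deviate(p_cut, PI_LO, PI_HI, R))   # the two sides coincide
```

For these values the inequality fails at p∗2 and holds with equality at the cutoff, in line with the argument of the proof.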

Inefficient equilibrium with diversification (symmetric):

The efficient strategy profile involves diversification and is symmetric (in the manner described above). Since we have shown that there does not exist an efficient equilibrium, it is of natural interest to look for outcomes (with diversification in research) that can be

obtained in a symmetric Markovian equilibrium of the non-cooperative game. It turns out

that for certain parametric conditions we can obtain a unique equilibrium outcome (in

threshold type markovian strategies). The following proposition describes this.

Proposition 3 If r(π′ − π) − ππ′ > 0, then the unique Markovian equilibrium in threshold type strategies is symmetric and is constituted by the strategy profile $(k_1^N, k_2^N)$ such that it satisfies (1.7) and (1.8) with

$$\underline{p} = p_2^{*N} = \frac{\pi(r+\pi')}{r\pi' + \pi(r+\pi')} \quad\text{and}\quad \bar{p} = p_1^{*N} = \frac{r\pi'}{r\pi' + \pi(r+\pi')} = 1 - p_2^{*N}$$

Proof. We prove this proposition with the help of following lemmas:

Lemma 3 If firm 2 goes to S2 for $p \in [0, \bar{p}]$ and to S1 for $p \in (\bar{p}, 1]$ with $\frac12 \le \bar{p} < p_1^*$, then there exists a $p_2^{*N}$ satisfying $0 < p_2^{*N} < \frac12 \le \bar{p}$, such that for firm 1, going to S2 for $p \in [0, p_2^{*N})$ and to S1 for $p \in [p_2^{*N}, 1]$ constitutes a best response to firm 2's strategy.

Proof of Lemma. By hypothesis, we have k2 = (0, 1) for $p \in [0, \bar{p}]$ and k2 = (1, 0) for $p \in (\bar{p}, 1]$, such that $\bar{p} \ge \frac12$. We know that given this, for p = 0 it is optimal for firm 1 to choose k1 = (0, 1) (that is, to conduct research at S2). We now need to find the point where firm 1 will find it optimal to switch to S1, given k2. Hence we will be solving the optimal stopping problem of player 1 in the region $[0, \bar{p}]$.

Let $p_2^{*N}$ be the switching point for firm 1. First, we assume that $p_2^{*N} < \frac12$. Then this will induce a payoff function for firm 1 which satisfies (1.13) with $\underline{p} = p_2^{*N}$. Since v1 thus obtained is a continuous function, at the switching point we shall have

$$\pi'\{p_2^{*N} - v_1\} = (\pi+\pi')(1 - p_2^{*N})\Big\{\frac{\pi}{\pi+\pi'} - v_1 + p_2^{*N}v_1'\Big\}$$

Since, given k2, p can change in one direction only, $v_1' = \frac{\pi'}{r+\pi'}$. This implies

$$\pi'\,\frac{r}{r+\pi'}\,p_2^{*N} = (1 - p_2^{*N})\,\pi \;\Rightarrow\; p_2^{*N} = \frac{\pi(r+\pi')}{r\pi' + \pi(r+\pi')}$$

Since r(π′ − π) − ππ′ > 0, $p_2^{*N} < \frac12$. This is consistent with the assumption that $p_2^{*N} < \frac12$. This shows that k1 = (1, 0) is an optimal response to k2 for $p \in [p_2^{*N}, \bar{p}]$.

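Next, consider the region $(\bar{p}, 1]$. For k1 to constitute a best response to k2 we must have

$$(\pi+\pi')\,p\,\Big(\frac{\pi'}{\pi+\pi'} - v_1 - v_1'(1-p)\Big) \ge \pi\big((1-p) - v_1\big)$$

In this region, v′1 is given by

$$v_1' = \frac{\pi'}{r+\pi+\pi'} - C^1_{11}[\Lambda(p)]^{\frac{r}{\pi+\pi'}}\left[1 + \frac{r}{\pi+\pi'}\cdot\frac{1}{p}\right] \tag{1.15}$$

This implies

$$(\pi+\pi')\,p\,\Big(\frac{\pi'}{\pi+\pi'} - v_1 - v_1'(1-p)\Big) = rv_1$$

Thus we require

$$rv_1 \ge \pi\big((1-p) - v_1\big) \;\Rightarrow\; v_1 \ge \frac{\pi}{r+\pi}(1-p)$$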


From the value matching condition we know that $v_1(\bar{p}) = \frac{\pi'}{r+\pi'}\,\bar{p}$. Since $\frac12 \le \bar{p} < p_1^*$, from the switching derivative lemma (refer to appendix A.2) we know that v′1 > 0 for all $p \in [\bar{p}, 1]$. Hence we must have $v_1 > \frac{\pi}{r+\pi}(1-p)$ for all $p \in (\bar{p}, 1]$. This implies that k1 = (1, 0) is an optimal response to k2 for $p \in (\bar{p}, 1]$.

Thus we have shown that k1 constitutes a best response to k2 for all p ∈ [0, 1]. This

concludes the proof of the lemma.

Lemma 4 If firm 1 goes to S1 for $p \in [\underline{p}, 1]$ and to S2 for $p \in [0, \underline{p})$ with $p_2^* < \underline{p} \le \frac12$, then there exists a $p_1^{*N}$ satisfying $\frac12 < p_1^{*N} < 1$, such that for firm 2, going to S2 for $p \in [0, p_1^{*N}]$ and to S1 for $p \in (p_1^{*N}, 1]$ constitutes a best response to firm 1's strategy.

Proof of Lemma. We have k1 = (1, 0) for $p \in [\underline{p}, 1]$ and k1 = (0, 1) for $p \in [0, \underline{p})$ such that $\underline{p} \le \frac12$. Given this, at p = 1, firm 2 finds it optimal to choose k2 = (1, 0) (that is, to conduct research at S1). As before, we intend to find the point where firm 2 will switch to S2 (the optimal stopping problem of player 2 in the region $[\underline{p}, 1]$).

Let $p_1^{*N}$ be the switching point of firm 2. Assuming $p_1^{*N} > \frac12$, this will induce a payoff function for firm 2 which satisfies (1.14) with $\bar{p} = p_1^{*N}$. As in lemma (3), at $p = p_1^{*N}$ we shall have

$$\pi'\big((1 - p_1^{*N}) - v_2\big) = (\pi+\pi')\,p_1^{*N}\Big(\frac{\pi}{\pi+\pi'} - v_2 - v_2'(1 - p_1^{*N})\Big)$$

Since, given k1, p can change in one direction only, $v_2' = -\frac{\pi'}{r+\pi'}$. This implies

$$\pi'(1 - p_1^{*N})\,\frac{r}{r+\pi'} = p_1^{*N}\,\pi \;\Rightarrow\; p_1^{*N} = \frac{r\pi'}{r\pi' + \pi(r+\pi')}$$

Since r(π′ − π) − ππ′ > 0, $p_1^{*N} > \frac12$. This is consistent with our assumption that $p_1^{*N} > \frac12$. This shows that k2 = (0, 1) is an optimal response to k1 for $p \in [\underline{p}, p_1^{*N}]$. Similar to the proof of lemma (3) we can also show that k2 = (0, 1) is a best response to k1 for $p \in [0, \underline{p})$. Hence


we have demonstrated that k2 is a best response to k1 for all p ∈ [0, 1]. This concludes the

proof of the lemma.

Let $(k_1^N, k_2^N)$ be the strategy profile such that it satisfies (1.7) and (1.8) with $\underline{p} = p_2^{*N}$ and $\bar{p} = p_1^{*N}$. The payoff functions induced by this strategy profile satisfy (1.13) and (1.14) with $\underline{p} = p_2^{*N}$ and $\bar{p} = p_1^{*N}$.

Appendix (A.3.1) describes the value of the integration constants obtained by imposing

the value matching condition at p∗N1 and p∗N2 . Also it shows that v1 and v2 satisfy the

smooth pasting condition at p∗N2 and p∗N1 respectively, conditional on the other player’s

strategy.

The proof of the proposition now follows directly from lemma (3) and (4) and the fact

that p∗N2 and p∗N1 constitute the unique switching points for 1 and 2 respectively.

Proposition (3) characterises the non-cooperative equilibrium when r(π′ − π) − π′π > 0. It is to be observed that $p_2^{*N} > p_2^*$ and $p_1^{*N} < p_1^*$. The inefficiency of the non-cooperative equilibrium stems from the fact that in the intervals $[p_2^*, p_2^{*N})$ and $(p_1^{*N}, p_1^*]$, firms conduct research at the same site when efficiency requires them to conduct research at different sites. Hence there exist ranges of beliefs such that, if the state lies in one such range, there is too much specialisation along a line of research when efficiency requires diversification.
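The ordering of the efficient and the non-cooperative thresholds can be illustrated with a short computation. This is a sketch under assumed parameter values; the helper names and dictionary keys are hypothetical, and `pi_lo`, `pi_hi` stand for π and π′.

```python
def diversification_condition(pi_lo, pi_hi, r):
    """True when r(pi' - pi) - pi * pi' > 0, the condition of Proposition 3."""
    return r * (pi_hi - pi_lo) - pi_lo * pi_hi > 0


def thresholds(pi_lo, pi_hi, r):
    """Efficient thresholds (Proposition 1) and equilibrium thresholds (Proposition 3)."""
    a = pi_lo + pi_hi
    denom = r * pi_hi + pi_lo * (r + pi_hi)
    return {
        "p2_eff": pi_lo / a,
        "p1_eff": pi_hi / a,
        "p2_nc": pi_lo * (r + pi_hi) / denom,
        "p1_nc": r * pi_hi / denom,
    }


# Illustrative parameter values (assumptions, not taken from the text).
PI_LO, PI_HI, R = 0.5, 1.5, 1.0
print(diversification_condition(PI_LO, PI_HI, R))
t = thresholds(PI_LO, PI_HI, R)
print(t)

# Width of the belief range with diversification: planner versus equilibrium.
print(t["p1_eff"] - t["p2_eff"], t["p1_nc"] - t["p2_nc"])
```

For these values the equilibrium diversification range is strictly nested inside the efficient one, which is the source of the inefficiency described above.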

Given π and r and r > π, the condition r(π′ − π)− π′π > 0 puts a lower bound on the

value of π′. This condition can be intuitively explained as follows. Suppose there exists a

range of beliefs such that firm 1 conducts research at S1 and firm 2 at S2. Over the range of diversification, the payoffs to firms 1 and 2 are π′p/(r + π′) and π′(1 − p)/(r + π′) respectively. If

firm 1 unilaterally deviates and goes to S2, then the belief will be updated upwards and

if firm 2 unilaterally deviates and goes to S1, the belief will be updated downward. This

implies that if firm 1 unilaterally deviates and goes to S2 over the time interval dt, then


conditional on no arrival, the expected discounted future payoff will be higher for firm 1.

Similarly, if firm 2, deviates and goes to S1 over the dt time interval then conditional on

no discovery its expected discounted future payoff increases.

Thus staying at different sites (as described above) is incentive compatible from the firms' point of view if at each p, given the other firm's action, the instantaneous payoff to a firm from diversification is no less than that from specialisation. This is a necessary condition. Consider a p in such a range. Given that firm 2 is at S2, firm 1 knows that the expected discounted payoff from diversification is π′p/(r + π′). The instantaneous payoff is π′pr dt/(r + π′). The instantaneous payoff from specialisation is π(1 − p) dt. Thus it is optimal for firm 1 to go for diversification if

$$\frac{\pi'}{r+\pi'}\,p\,r\,dt \ge \pi(1-p)\,dt \tag{1.16}$$

Similarly, given firm 1 is at S1, firm 2 finds it optimal to go for diversification if

$$\frac{\pi'}{r+\pi'}\,(1-p)\,r\,dt \ge \pi p\,dt \tag{1.17}$$

Thus to have a range of beliefs over which diversification will take place in a non-

cooperative equilibrium, (1.16) and (1.17) should hold together, with strict inequality for

at least one p if the range does not consist of one point only . This implies

$$\frac{\pi'}{r+\pi'}\,r > \pi \;\Rightarrow\; r(\pi'-\pi) - \pi'\pi > 0$$

This explains the condition required for the existence of a symmetric equilibrium. Thus

to have an equilibrium with diversification in research, it is necessary that the extent of

site specific superiority is high enough. In this chapter this is reflected by the magnitude of the term (π′ − π). For a given value of r, the condition is more likely to hold when the value of (π′ − π) is higher. However, one can see that for low values of r (i.e. when agents become more patient) the condition is less likely to hold.

Inefficient equilibria with no diversification:

It is clear from the previous proposition that we cannot have symmetric equilibrium

with diversification in research when the condition r(π′ − π) − π′π > 0 fails to hold. In

these situations we can expect to obtain equilibrium where the equilibrium strategy profile $(k_1^{N'}, k_2^{N'})$ satisfies (1.7) and (1.8) with $\underline{p} = \bar{p} = p^*$. In other words we look for equilibrium

where switching points for the firms are the same. This implies that in these equilibria,

diversification of research takes place only at the point p∗.

To begin with, we focus on the case when firms are equally capable along sites, i.e.

$$\pi_1^1 = \pi_2^1 = \pi_1^2 = \pi_2^2 = \pi$$

Clearly, the condition r(π′−π)−π′π > 0 fails to hold. The following proposition describes

the equilibrium.

Proposition 4 If π′ = π, the unique equilibrium in threshold type strategies is constituted by the strategy profile $(k_1^{Ne}, k_2^{Ne})$ such that it satisfies (1.7) and (1.8) with $\underline{p} = \bar{p} = \frac12$.

Proof. Suppose the strategy profile (k′1, k′2) constitutes an equilibrium such that (k′1, k′2) satisfies (1.7) and (1.8) with $\underline{p} = \bar{p} = p^*$. This will induce the payoff functions v1(.) and v2(.) which satisfy (1.13) and (1.14) respectively with $\underline{p} = \bar{p} = p^*$.

Firm 1 finds it optimal to switch from S2 to S1 at p∗. Then, from (1.11) with π′ = π, at p = p∗ we should have

$$\pi\{p - v_1\} \ge 2\pi(1-p)\Big\{\frac12 - v_1 + pv_1'\Big\}$$

If firm 1 chooses S2 at p∗ then, conditional on no discovery, p will be updated upwards. Hence v′1 will be given by

$$v_1' = \frac{\pi}{r+2\pi} - C^1_{11}[\Lambda(p)]^{\frac{r}{2\pi}}\left[1 + \frac{r}{2\pi}\cdot\frac{1}{p}\right]$$


Choosing the integration constant by imposing the value matching condition on v1 at p∗, we have

$$2\pi(1-p)\Big\{\frac12 - v_1 + pv_1'\Big\} = \pi - (r+2\pi)v_1$$

Thus we require

$$\pi\{p^* - v_1\} \ge \pi - (r+2\pi)v_1 \;\Rightarrow\; \pi\,\frac{r}{r+\pi}\,p^* \ge \pi - (r+2\pi)\,\frac{\pi}{r+\pi}\,p^*$$

using v1(p∗) = πp∗/(r + π). This implies that p∗ ≥ 1/2.

Next, firm 2 finds it optimal to switch from S1 to S2 at p∗. From (1.12) with π′ = π, at p = p∗ we shall then have

$$\pi\{(1-p) - v_2\} \ge 2\pi p\Big\{\frac12 - v_2 - (1-p)v_2'\Big\}$$

If firm 2 goes to S1 at p∗ then, conditional on no discovery, p will be updated downwards. Hence v′2 will be given by

$$v_2' = -\frac{\pi}{r+2\pi} + C^2_{22}[\Gamma(p)]^{\frac{r}{2\pi}}\left[1 + \frac{r}{2\pi}\cdot\frac{1}{1-p}\right]$$

After substituting the value of the integration constant, we can posit that for optimality we require

$$\pi\{(1-p^*) - v_2\} \ge \pi - (r+2\pi)v_2$$

Putting v2(p∗) = π(1 − p∗)/(r + π), we then have p∗ ≤ 1/2.

This implies that if there is an equilibrium with the same switching point for both the firms, then the switching point should be p∗ = 1/2.

Let $(k_1^{Ne}, k_2^{Ne})$ be the strategy profile which satisfies (1.7) and (1.8) with $\underline{p} = \bar{p} = \frac12$. The payoffs induced by this profile will be given by (1.13) and (1.14) with $\underline{p} = \bar{p} = \frac12$. The integration constants are chosen by imposing the value matching condition on v1 and v2 at p = 1/2.


We now need to establish that $(k_1^{Ne}, k_2^{Ne})$ constitutes an equilibrium. All we need to show is that $k_1^{Ne}$ ($k_2^{Ne}$) constitutes a best response to $k_2^{Ne}$ ($k_1^{Ne}$) for $p \in (\frac12, 1]$ ($p \in [0, \frac12)$).

Consider the region $(\frac12, 1]$. If it is optimal for firm 1 to choose S1, then it must be true that

$$2\pi p\Big(\frac{\pi}{2\pi} - v_1 - (1-p)v_1'\Big) \ge \pi(1 - p - v_1) \;\Rightarrow\; v_1 \ge \frac{\pi}{r+\pi}(1-p)$$

It can be shown that v1 is strictly increasing for $p \in (\frac12, 1]$. Since $v_1(\frac12) = \frac{\pi}{r+\pi}\cdot\frac12$, we have $v_1 > \frac{\pi}{r+\pi}(1-p)$ for all $p \in (\frac12, 1]$. Hence $k_1^{Ne}$ is a best response to $k_2^{Ne}$ for $p \in (\frac12, 1]$. In a similar manner it can be shown that $k_2^{Ne}$ constitutes a best response to $k_1^{Ne}$ for $p \in [0, \frac12)$.

This concludes the proof.

The above analysis shows that in the absence of any difference in abilities of the firms,

we have a unique equilibrium in threshold type markovian strategies with diversification

at one point only. By recalling the analysis of the social planner’s problem we can posit

that when firms are equally capable at each of the sites, the outcome of this equilibrium

coincides with the efficient outcome.

We now turn our focus to the situation when firms’ abilities do differ and we cannot

have equilibrium that involves diversification in research activities over a range of beliefs.

The following proposition describes this.

Proposition 5 If π′ > π > 0 and the condition r(π′ − π) − π′π > 0 fails to hold, then we have a multiplicity of equilibria in threshold type strategies as described below:

$k_1^s = (0, 1)$ for p ∈ [0, p∗) and $k_1^s = (1, 0)$ for p ∈ [p∗, 1]

$k_2^s = (0, 1)$ for p ∈ [0, p∗] and $k_2^s = (1, 0)$ for p ∈ (p∗, 1]

where

$$p^* \in \big[\max\{\underline{p}_s, p_1^{*N}\},\; 1 - \max\{\underline{p}_s, p_1^{*N}\}\big]$$


and

$$\underline{p}_s = \frac{\pi(r+\pi')}{\pi(r+\pi') + \pi'(r+\pi)}$$

Proof. Since the condition r(π′ − π)− π′π > 0 fails to hold, we cannot have equilibrium

with diversification in research (i.e a range of beliefs over which firms choose different sites

to conduct their research.). Thus we seek to find equilibria where the switching points for

the firms are the same. Let p∗ be the common switching point. The payoffs induced will be given by (1.13) and (1.14) with $\underline{p} = \bar{p} = p^*$.

At p∗, firm 1 finds it optimal to switch to site S1 from S2. This implies that we must have

$$\pi'(p^* - v_1) \ge (\pi+\pi')(1-p^*)\Big\{\frac{\pi}{\pi+\pi'} - v_1 + p^*v_1'\Big\}$$

At p = p∗, given that firm 2 is at S2, if firm 1 goes to S2 then, conditional on there being no discovery, p will increase. Hence v′1 will be given as

$$v_1' = \frac{\pi'}{\pi+\pi'+r} - C^1_{11}[\Lambda(p)]^{\frac{r}{\pi+\pi'}}\left[1 + \frac{r}{\pi+\pi'}\cdot\frac{1}{p}\right]$$

Since integration constants are chosen by imposing the value matching condition on v1 at p∗,

$$(\pi+\pi')(1-p^*)\Big\{\frac{\pi}{\pi+\pi'} - v_1 + p^*v_1'\Big\} = \pi(1-p^*) + \pi'p^* - (r+\pi+\pi')v_1$$

Hence we require

$$\pi'(p^* - v_1) \ge \pi(1-p^*) + \pi'p^* - (r+\pi+\pi')v_1$$

Substituting v1(p∗) = π′p∗/(r + π′), we obtain

$$p^* \ge \frac{\pi(r+\pi')}{\pi(r+\pi') + \pi'(r+\pi)} = \underline{p}_s$$


Similarly we can show that if firm 2 finds it optimal to switch at p∗, then we must have

$$p^* \le \frac{\pi'(r+\pi)}{\pi(r+\pi') + \pi'(r+\pi)} = \bar{p}_s$$

Thus to have an equilibrium with the same switching points, it is necessary that the switching point p∗ lies in the interval $[\underline{p}_s, \bar{p}_s]$. Since r(π′ − π) − π′π ≤ 0, $p_1^{*N} \le \frac12$. In an equilibrium where the switching points are the same it is necessary that the switching point satisfies $p^* \ge p_1^{*N}$. Otherwise, from our previous analysis we know that if $p^* < p_1^{*N}$ and firm 1 switches at p∗, then firm 2 finds it optimal to switch at $p_1^{*N}$ and not p∗. Hence $p^* \in [\max\{\underline{p}_s, p_1^{*N}\}, 1 - \max\{\underline{p}_s, p_1^{*N}\}]$.

Finally, we need to establish that ks1 (ks2) constitutes a best response to ks2 (ks1) for

p ∈ (p∗, 1] ([0, p∗)). Consider the region p ∈ (p∗, 1]. From the above analysis we know that

given the conjectured behavior, we have

$$v_1(p^*) \ge \frac{\pi}{r+\pi}(1-p^*)$$

From the switching derivative lemma we can infer that $v_1$ will be strictly increasing for $p \in [p^*, 1]$. Hence $v_1 \ge \frac{\pi}{r+\pi}(1-p)$ for $p \in [p^*, 1]$. Along the lines of our previous analysis, we can show that this is what is required for $k^s_1$ to be a best response to $k^s_2$ for $p \in (p^*, 1]$.

Similarly we can show that ks2 constitutes a best response to ks1 for p ∈ [0, p∗).

It is to be noted that in the present case, the smooth pasting condition will not necessarily be satisfied by vi at p∗. This is because here we are, in some sense, getting a corner solution for the optimal stopping problems of firm 1 and firm 2. That is, given that ks2 = (0, 1) for p ∈ [0, p∗], firm 1 would ideally have liked to switch to S1 from S2 at p = p∗N2. However, it will not be able to do this since p∗ ≤ p∗N2. A similar argument holds for firm 2 as well.

This concludes the proof.

The previous proposition states that when r(π′ − π) − π′π < 0 and firms are heterogeneous, we have a multiplicity of equilibria where firms have a common switching point. Since p∗, the common switching point, always lies in the interval (p∗2, p∗1), each of these equilibria involves duplication. The analysis of the above model shows that whenever the firms differ in their innate abilities (i.e., their Poisson intensities of learning along an arm differ), the non-cooperative equilibrium is inefficient: for a certain range of beliefs, there is too much specialisation along a method of research when efficiency would require more diversification. Hence the phenomenon of duplication in the present set-up can be perceived as a manifestation of competition and heterogeneity among the firms.

Next, we analyse a different model in Environment 2 and show that duplication is only possible when firms differ in their abilities. The setting is similar to the ones analysed in ([37]) and ([38]). However, here players operate on the same bandit. Thus, apart from showing that the phenomenon of duplication generalizes to other models as well, we also show the nature of inefficiency in a bandit model with one safe arm and one risky arm in the presence of payoff externalities and differences in innate abilities across the players.

1.2 Environment 2

Two firms (1 and 2) are trying to find a treasure. The first one to find it appropriates all

the rent from it which we normalize to 1. There are two sites to look for the treasure. Up

to now, it is similar to the setting in environment 1. However, the characteristics of the

sites differ as follows.

It is known with certainty that one of the sites has the treasure. This site is referred to as the safe site (S). As before, we consider a continuous time framework. The success of any firm searching at S follows a Poisson process with intensity π0 > 0. The other site (R) is a risky one and can either be good or bad. A good risky site has the treasure, and the success of firm i (i = 1, 2) searching at R follows a Poisson process with intensity πi, such that πi > π0 for all i = 1, 2. A bad risky site has no treasure. At the onset, the firms know that the risky site is good with probability p. Each firm can observe the location where its opponent is carrying out research.

1.2.1 Symmetric firms

First we consider the situation when the firms are symmetric in their abilities. That is, for both firms (1 and 2), success at the good risky site follows a Poisson process with intensity π1, such that π1 > π0 > 0.

Planner’s problem: The Efficiency Benchmark

Consider the problem of a benevolent social planner who wants to maximise the expected discounted social value from the invention. The payoff to the planner from invention is 1. At each instant, based on p, he allocates each of the firms to a particular site to carry out research. The planner's action at instant t is kt ∈ {0, 1, 2}, the number of firms allocated to the risky site at that instant. kt (t ≥ 0) is measurable with respect to the information available at time t.

It is assumed that if the planner is indifferent between allocating a firm to the risky and the safe site, then he allocates it to the safe site. Thus the action profile of the planner is left continuous.

From now on we will do away with the time subscript. Let v(p) be the value function of the planner. Since actions are left continuous and beliefs can only move downward (to the left), left continuity of v(p) can always be assumed.

Then v(p) should satisfy

$$v(p) = \max_{k\in\{0,1,2\}}\Big\{(2-k)\pi_0\,dt + kp\pi_1\,dt + (1-r\,dt)\big(1-(2-k)\pi_0\,dt - kp\pi_1\,dt\big)\big(v(p) - v'(p)\,k\pi_1 p(1-p)\,dt\big)\Big\},$$

since $v(p+dp) = v(p) + v'(p)\,dp$ and, conditional on no discovery, $dp = -k\pi_1 p(1-p)\,dt$.

After simplifying, we have

$$rv = \max_{k\in\{0,1,2\}}\Big\{(2-k)\pi_0[1-v] + k\,\pi_1 p\,[1-v-v'(1-p)]\Big\} \tag{1.18}$$
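The law of motion of the belief can be made concrete with a short sketch. It assumes the standard no-discovery updating rule for this kind of exponential bandit, dp = −kπ1 p(1−p) dt, under which the odds ratio Λ(p) = (1−p)/p grows at the constant rate kπ1. The function names and the parameter values are hypothetical and serve only as an illustration.

```python
import math


def posterior(p0, k, pi1, t):
    """Belief after time t with k firms at the risky site and no discovery."""
    odds0 = (1.0 - p0) / p0                  # Lambda(p0)
    odds_t = odds0 * math.exp(k * pi1 * t)   # odds grow at rate k * pi1
    return 1.0 / (1.0 + odds_t)


def time_to_threshold(p0, p_star, k, pi1):
    """Time for the belief to fall from p0 to p_star (needs p0 > p_star, k > 0)."""
    odds0 = (1.0 - p0) / p0
    odds_star = (1.0 - p_star) / p_star
    return math.log(odds_star / odds0) / (k * pi1)


pi0, pi1 = 1.0, 2.0          # hypothetical intensities with pi1 > pi0
p_star = pi0 / pi1           # a threshold of the form pi0 / pi1
T = time_to_threshold(0.8, p_star, k=2, pi1=pi1)
print(T, posterior(0.8, 2, pi1, T))
```

With no firm at the risky site (k = 0) the belief does not move, which is why only experimentation at R drives p toward a switching threshold.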

Proposition 6 The planner allocates both firms to the risky site as long as p > p∗, where $p^* = \frac{\pi_0}{\pi_1}$. For p ≤ p∗, both firms are allocated to the safe site.

Proof. Since (1.18) is linear in k, we know that at the optimum, k will either be 2 or 0. When both firms are optimally allocated to the risky site, the value function satisfies

$$v = \frac{2\pi_1}{r+2\pi_1}\,p + C(1-p)[\Lambda(p)]^{\frac{r}{2\pi_1}},$$

where $\Lambda(p) = \frac{1-p}{p}$ and C is the integration constant. This is derived by solving the O.D.E. obtained by putting k = 2 in (1.18).

When both firms are optimally allocated to S, then $v = \frac{2\pi_0}{r+2\pi_0}$. Since v(p) satisfies the value matching and smooth pasting conditions at p = p∗, we get

$$C = \frac{\frac{2\pi_0}{r+2\pi_0} - \frac{2\pi_1}{r+2\pi_1}\,p^*}{(1-p^*)[\Lambda(p^*)]^{\frac{r}{2\pi_1}}} \quad\text{and}\quad p^* = \frac{\pi_0}{\pi_1}$$

This concludes the proof.
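The value matching and smooth pasting conditions in this proof can be checked numerically. The sketch below is an illustration only: it takes the risky-region solution to be v(p) = 2π1 p/(r + 2π1) + C(1−p)Λ(p)^{r/2π1}, pins down C by value matching at p∗ = π0/π1, and then evaluates the slope at p∗. The function name and parameter values are assumptions.

```python
def planner_solution(r, pi0, pi1):
    """Planner's symmetric-firm solution: threshold, safe value, constant, v and v'."""
    p_star = pi0 / pi1
    A = 2 * pi1 / (r + 2 * pi1)          # slope of the particular solution
    k = r / (2 * pi1)                    # exponent on the odds ratio
    v_safe = 2 * pi0 / (r + 2 * pi0)     # value when both firms are at S

    def odds(p):
        return (1 - p) / p

    # Integration constant from value matching at p_star.
    C = (v_safe - A * p_star) / ((1 - p_star) * odds(p_star) ** k)

    def v_risky(p):
        return A * p + C * (1 - p) * odds(p) ** k

    def dv_risky(p):
        return A - C * odds(p) ** k * (1 + k / p)

    return p_star, v_safe, C, v_risky, dv_risky


r, pi0, pi1 = 0.1, 1.0, 2.0              # hypothetical parameter values
p_star, v_safe, C, v_risky, dv_risky = planner_solution(r, pi0, pi1)
print(p_star, C, v_risky(p_star), dv_risky(p_star))
```

For these values the slope at p∗ is zero up to rounding error, so the threshold π0/π1 is consistent with smooth pasting, and the positive constant C makes the value function convex above p∗.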

The non-cooperative game

Player i chooses actions {kit ∈ {0, 1}}, such that kit is measurable with respect to the information available at time t. We restrict our attention to Markovian strategies, such that the strategy of player i is defined by the mapping $k_i : [0,1] \to \{0,1\}$. We allow only those $k_i$ functions which satisfy the property that $k_i^{-1}(1)$ and $k_i^{-1}(0)$ are disjoint unions of a finite number of non-degenerate sub-intervals in [0, 1], such that $k_i(0) = 0$ and $k_i(1) = 1$. This ensures that the game is well-defined in the continuous time framework.

Firms simultaneously update their belief that the risky site is good as long as at least one firm is carrying out research at the risky site and there is no discovery (at any of the sites). Both k1 and k2 are left continuous, which guarantees the existence of a well-defined law of motion of the posterior.

Let vi be the value function (equilibrium payoff) of firm i (i = 1, 2) in the non-cooperative game. If (k1, k2) is an equilibrium strategy profile then, given kj, ki (i, j = 1, 2; i ≠ j) and vi should satisfy (with symmetric firms, both intensities at the risky site equal π1)

$$v_i = \max_{k_i\in\{0,1\}}\Big\{(1-k_i)\pi_0\,dt + k_i\pi_1 p\,dt + (1-r\,dt)\big(1 - \pi_0(2-k_i-k_j)\,dt - p\pi_1(k_i+k_j)\,dt\big)\big(v_i - v_i'\,\pi_1 p(1-p)(k_i+k_j)\,dt\big)\Big\}$$

Simplifying the above, we obtain

$$rv_i = \max_{k_i\in\{0,1\}}\Big\{(1-k_i)\pi_0(1-v_i) + k_i\,\pi_1 p\,[1-v_i-v_i'(1-p)] - (1-k_j)\pi_0 v_i - k_j\pi_1 p\,(v_i + (1-p)v_i')\Big\} \tag{1.19}$$

Proposition 7 There exists an efficient equilibrium.

Proof. Consider the following strategy profile: Each firm uses R for p > p∗ and S for

p ≤ p∗ (Hence p∗ is the switching point). This is a symmetric strategy profile and the

outcome implied by this profile is the efficient outcome. We need to show that this profile

constitutes an equilibrium.

Suppose firm 2 follows the above strategy. We will determine the best response of firm

1. It is clear that for p = 1, firm 1 will choose R. Thus the optimal switching point of firm

1 is to be determined.

If firm 1 shifts to S from R at any p > p∗, then his payoff in the range (p∗, p] will satisfy

$$v_1 = \frac{\pi_0}{\pi_0+r}\left(1 - \frac{\pi_1}{\pi_1+\pi_0+r}\,p\right) + C(1-p)[\Lambda(p)]^{\frac{\pi_0+r}{\pi_1}}$$

This is derived from solving the O.D.E. obtained by putting k1 = 0 and k2 = 1 in (1.19). Since firm 2 switches to S from R at p∗, the value matching condition at p∗ implies

$$C = \frac{\frac{\pi_0}{r+2\pi_0} - \frac{\pi_0}{\pi_0+r}\left(1-\frac{\pi_1}{\pi_1+\pi_0+r}\,p^*\right)}{(1-p^*)[\Lambda(p^*)]^{\frac{r+\pi_0}{\pi_1}}}$$

We can check that C < 0. Hence v1 is concave for p ∈ (p∗, p]. Further, v′1(p∗) is zero. Thus if 1 switches to S from R at p, v′1(p) < 0. This implies that we have π0(1 − v1(p)) < π1p[1 − v1(p) − v′1(p)(1 − p)], which contradicts optimality. This is true for any p > p∗. This implies that firm 1 should shift to S from R at some p ≤ p∗.

Suppose firm 1 shifts at a point p′ < p∗. Then v1 for the range [p′, p∗] will satisfy

$$v_1 = \frac{\pi_1}{r+\pi_0+\pi_1}\,p + C(1-p)[\Lambda(p)]^{\frac{r+\pi_0}{\pi_1}}$$

Then v′1(p′) < 0 for any p′ < p∗. Since v1(.) will satisfy the value matching condition at p′, we know that $v_1(p') = \frac{\pi_0}{r+2\pi_0}$. Thus for p = p′ + ε, ε > 0 and ε → 0, we must have $v_1(p) < \frac{\pi_0}{r+2\pi_0}$. However, by switching to S at p∗ he can guarantee himself a payoff of $\frac{\pi_0}{r+2\pi_0}$ at all p < p∗. This contradicts optimality. Hence the unique optimal switching point for firm 1 is p∗. Similarly we can show this for firm 2.

This concludes the proof.

The setting of this model with symmetric firms is similar to that in ([37]), except that here we have payoff externality among players (i.e., they operate on the same bandit). Hence we see that competition brings in efficiency.

1.2.2 Asymmetric firms

Suppose the firms differ in their abilities in conducting research at the good risky site.

That is we have π1 > π2 > π0 > 0.


Planner’s problem

Let (k1, k2) be the planner's action profile, with ki ∈ {0, 1} for i = 1, 2. ki = 1 (0) implies that the planner has allocated the ith firm to the risky (safe) site. Let v(p) be the value function of the planner. Then it should satisfy

$$v(p) = \max_{k_1,k_2\in\{0,1\}}\Big\{(2-k_1-k_2)\pi_0\,dt + k_1 p\pi_1\,dt + k_2 p\pi_2\,dt + (1-r\,dt)\big(1-(2-k_1-k_2)\pi_0\,dt - k_1 p\pi_1\,dt - k_2 p\pi_2\,dt\big)\big(v(p) - v'(p)\,p(1-p)(k_1\pi_1+k_2\pi_2)\,dt\big)\Big\}$$

$$\Rightarrow\; rv = \max_{k_1,k_2\in\{0,1\}}\Big\{(2-k_1-k_2)\pi_0[1-v] + k_1\,p\pi_1[1-v-v'(1-p)] + k_2\,p\pi_2[1-v-v'(1-p)]\Big\} \tag{1.20}$$

This is because $v(p+dp) = v(p) + v'(p)\,dp$ and $dp = -(k_1\pi_1+k_2\pi_2)\,p(1-p)\,dt$.

Lemma 5 If there exists an interior solution (i.e there exists p∗i ∈ (0, 1) such that for

higher p firm i is allocated to R and for p less than or equal to p∗i , firm i is allocated to S)

then optimality requires diversification over a range of beliefs. That is, there exists a range

of beliefs over which the planner will allocate one firm to the risky site and the other to the

safe site.

Proof of Lemma. Suppose not. This implies that the planner’s optimality requires him

to switch both the firms from the risky to the safe site at the same p, say p′. At the

optimum the smooth pasting condition must hold which implies that v′(p′) = 0. From

(1.20), we know that optimality requires,

$$p'\pi_2[1-v(p')] = p'\pi_1[1-v(p')] = \pi_0[1-v(p')]$$

However, since π1 > π2, $p'\pi_2[1-v(p')] < p'\pi_1[1-v(p')]$. This is a contradiction.

This proves the lemma.

Lemma 6 Firm 2 is to be shifted at a higher p than firm 1.


Proof of Lemma. Suppose not. From lemma (5) we know that this implies firm 1 is

shifted to the safe site at a higher p than firm 2. Let this switching point be p∗1. From (1.20),

we know that at p∗1 we must have, π0[1−v(p∗1)] = p∗1π1[1−v(p∗1)−v′(p∗1)(1−p∗1)]. Since π2 <

π1, we have π0[1−v(p∗1)] = p∗1π1[1−v(p∗1)−v′(p∗1)(1−p∗1)] > p∗1π2[1−v(p∗1)−v′(p∗1)(1−p∗1)].

This is a contradiction to the claim that it is optimal to keep firm 2 at the risky site at

p = p∗1. This proves the lemma.

Proposition 8 There exists a solution to the planner's problem, where both the firms are allocated to the risky site for $p > p_2^*$, firm 2 is allocated to the safe site and firm 1 to the risky site for $p \in (p_1^*, p_2^*]$, and both firms are allocated to the safe site for $p \le p_1^*$, where $p_1^* = \frac{\pi_0}{\pi_1}$.

Proof. First, assume that there exists some $p_2^*$ with $\frac{\pi_0}{\pi_1} < p_2^* < 1$, such that it is optimal to shift firm 2 to the safe site at $p_2^*$. Over the range of beliefs where 2 is allocated to the safe site and 1 is allocated to the risky site, v(p) should satisfy

$$v = \frac{\pi_0}{r+\pi_0} + \frac{r\pi_1 p}{(r+\pi_0)(r+\pi_0+\pi_1)} + C_2(1-p)[\Lambda(p)]^{\frac{r+\pi_0}{\pi_1}} \equiv v_{SR}$$

This is derived by solving the O.D.E. obtained by putting k2 = 0 and k1 = 1 in (1.20). Suppose $p_1^*$ is the belief where 1 is shifted to the safe site. Since at $p_1^*$ both the firms are at S, optimality requires $v'(p_1^*) = 0$ (smooth pasting condition). According to Lemma 6, firm 2 is shifted from R to S at a higher p. Then from the value matching condition, we know that we should have $v_{SR}(p_1^*) = v(p_1^*)$. This gives us

$$C_2 = \frac{\frac{r\pi_0}{(r+\pi_0)(r+2\pi_0)} - \frac{r\pi_1 p_1^*}{(r+\pi_0)(r+\pi_0+\pi_1)}}{(1-p_1^*)[\Lambda(p_1^*)]^{\frac{r+\pi_0}{\pi_1}}}$$

Observe that $C_2 > 0$. Also, the smooth pasting condition at $p_1^*$ implies $v_{SR}'(p_1^*) = 0$. This gives us

$$\frac{r\pi_1}{(r+\pi_0)(r+\pi_0+\pi_1)} - C_2[\Lambda(p_1^*)]^{\frac{r+\pi_0}{\pi_1}}\left[1+\frac{r+\pi_0}{\pi_1 p_1^*}\right] = 0 \;\Rightarrow\; p_1^* = \frac{\pi_0}{\pi_1}$$
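A quick numerical check of this step, offered as a sketch under assumed parameter values: it builds v_SR with C2 taken from value matching against 2π0/(r + 2π0) at p1∗ = π0/π1, then evaluates the slope there. The helper name `planner_sr` and the numbers are hypothetical.

```python
def planner_sr(r, pi0, pi1):
    """v_SR (firm 1 at R, firm 2 at S) with C2 from value matching at p1* = pi0/pi1."""
    p1 = pi0 / pi1
    a = pi0 / (r + pi0)
    b = r * pi1 / ((r + pi0) * (r + pi0 + pi1))
    k = (r + pi0) / pi1
    v_both_safe = 2 * pi0 / (r + 2 * pi0)

    def odds(p):
        return (1 - p) / p

    C2 = (v_both_safe - a - b * p1) / ((1 - p1) * odds(p1) ** k)

    def v_sr(p):
        return a + b * p + C2 * (1 - p) * odds(p) ** k

    def dv_sr(p):
        return b - C2 * odds(p) ** k * (1 + k / p)

    return p1, C2, v_sr, dv_sr, v_both_safe


r, pi0, pi1 = 0.1, 1.0, 2.0              # hypothetical values with pi1 > pi0
p1, C2, v_sr, dv_sr, v_both_safe = planner_sr(r, pi0, pi1)
print(p1, C2, dv_sr(p1))
```

With these numbers C2 is positive and the slope of v_SR at π0/π1 vanishes up to rounding, in line with the smooth pasting argument in the proof.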

We now need to prove the existence of a $p_2^* \in (p_1^*, 1)$ such that at $p_2^*$ the planner finds it optimal to shift firm 2 from R to S. When both firms are allocated to R, v(p) satisfies

$$v = \frac{\pi_1+\pi_2}{r+\pi_1+\pi_2}\,p + C_1(1-p)[\Lambda(p)]^{\frac{r}{\pi_1+\pi_2}} \equiv v_R$$

Hence we need to prove the existence of a $p_2^* \in (p_1^*, 1)$ such that $v_R(p_2^*) = v_{SR}(p_2^*)$ and $v_R'(p_2^*) = v_{SR}'(p_2^*)$, the manifestations of the value matching and smooth pasting conditions respectively.

Consider any $p \in [p_1^*, 1]$. By $v_R^{s\prime}$ we denote the slope of $v_R$ if p is the point where firm 2 is shifted from R to S. Note that $v_{SR}'(\cdot)$ is evaluated on the basis of the fact that firm 1 is shifted from R to S at $p = p_1^*$.

$$v_R^{s\prime} = \frac{\pi_1+\pi_2}{r+\pi_1+\pi_2} - C_1^p[\Lambda(p)]^{\frac{r}{\pi_1+\pi_2}}\left[1+\frac{r}{(\pi_1+\pi_2)p}\right]$$

where $C_1^p = \dfrac{v_{SR} - \frac{\pi_1+\pi_2}{r+\pi_1+\pi_2}\,p}{(1-p)[\Lambda(p)]^{\frac{r}{\pi_1+\pi_2}}}$. This is obtained from the value matching condition at p.

Consider the expression $C_1^p[\Lambda(p)]^{\frac{r}{\pi_1+\pi_2}}$. The derivative of this expression with respect to p is strictly negative (refer to Appendix (A.3.2)).

Further, the term $\left[1+\frac{r}{(\pi_1+\pi_2)p}\right]$ is also decreasing in p. Hence $v_R^{s\prime}$ is strictly increasing in p.

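At p = 1,

$$v_R^{s\prime} = \frac{\pi_1+\pi_2}{r+\pi_1+\pi_2} > \frac{r\pi_1}{(r+\pi_0)(r+\pi_0+\pi_1)} = v_{SR}'$$

At $p = p_1^* = \frac{\pi_0}{\pi_1}$,

$$v_R^{s\prime}(p_1^*) = \frac{\pi_1+\pi_2}{r+\pi_1+\pi_2} - \left[\frac{v_{SR}(p_1^*) - \frac{\pi_1+\pi_2}{r+\pi_1+\pi_2}\,p_1^*}{1-p_1^*}\right]\left[1+\frac{r}{(\pi_1+\pi_2)p_1^*}\right]$$

$$= \frac{\pi_1+\pi_2}{r+\pi_1+\pi_2} - \left\{\left[\frac{\frac{2\pi_0}{r+2\pi_0} - \frac{\pi_1+\pi_2}{r+\pi_1+\pi_2}\,p}{1-p}\right]\left[1+\frac{r}{(\pi_1+\pi_2)p}\right]\right\}\Bigg|_{p=p_1^*}$$

since $v_{SR}(p_1^*) = \frac{2\pi_0}{r+2\pi_0}$.

It can be shown that

$$\frac{\pi_1+\pi_2}{r+\pi_1+\pi_2} - \left\{\left[\frac{\frac{2\pi_0}{r+2\pi_0} - \frac{\pi_1+\pi_2}{r+\pi_1+\pi_2}\,p}{1-p}\right]\left[1+\frac{r}{(\pi_1+\pi_2)p}\right]\right\} = 0 \quad\text{for}\quad p = \frac{2\pi_0}{\pi_1+\pi_2}.$$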

Since $v_R^{s\prime}(p)$ is strictly increasing in p and $p_1^* = \frac{\pi_0}{\pi_1} < \frac{2\pi_0}{\pi_1+\pi_2}$, $v_R^{s\prime}(p_1^*) < 0$. Earlier, we have established that the smooth pasting condition at $p_1^*$ implies $v_{SR}'(p_1^*) = 0$. Hence $v_R^{s\prime}(p_1^*) < v_{SR}'(p_1^*)$ and $v_R^{s\prime}(1) > v_{SR}'(1)$. Since both $v_{SR}'(\cdot)$ and $v_R^{s\prime}(\cdot)$ are strictly increasing and concave in p, there exists a unique $p_2^* \in (p_1^*, 1)$ such that $v_R^{s\prime}(p_2^*) = v_{SR}'(p_2^*)$.

This concludes the proof of the existence of $p_2^*$. Also it is established that v(p) is strictly convex for $p > p_1^*$.

Corollary 1 $p_2^* > \frac{\pi_0}{\pi_2}$, the threshold p where the planner would have shifted firm 2 from R to S had he been dealing with this firm only.

Proof. Suppose not. Then $p_2^* \le \frac{\pi_0}{\pi_2}$. At $p_2^*$, $v'(p_2^*) = v_{SR}'(p_2^*) > 0$. Since v is strictly convex for $p > \frac{\pi_0}{\pi_1}$, $v'\!\left(\frac{\pi_0}{\pi_2}\right) > 0$. Therefore at $p = \frac{\pi_0}{\pi_2}$, $\pi_0[1-v] > \pi_2 p[1-v-v'(1-p)]$. From (1.20), we can see that this contradicts the claim that $p_2^* \le \frac{\pi_0}{\pi_2}$. This proves the corollary.

The non-cooperative game

This is similar to the non-cooperative game with symmetric firms. Thus k1(.) and k2(.)

are the Markovian strategies of the players.

Let v1(p) and v2(p) be the payoff functions of firm 1 and 2 respectively in a Markovian

equilibrium. vi should then satisfy,

$$rv_i = \max_{k_i\in\{0,1\}}\Big\{(1-k_i)[\pi_0(1-v_i)] + k_i[\pi_i p(1-v_i-v_i'(1-p))] - [(1-k_j)\pi_0 v_i + k_j\pi_j p\,(v_i+v_i'(1-p))]\Big\} \tag{1.21}$$

This implies that given $k_j$, at any p optimality on firm i's part requires choosing $k_i(p) = 0\,(1)$ if $\pi_0(1-v_i) \ge (<)\; \pi_i p(1-v_i-v_i'(1-p))$.

We determine the non-cooperative equilibrium in following steps.

Lemma 7 Suppose firm 2 follows the strategy of going to R for $p > p_2^{*N}$ and to S for $p \le p_2^{*N}$, such that $\frac{\pi_0}{\pi_1} < p_2^{*N} < 1$. Then firm 1's best response is to go to R for $p > p_1^{*N}$ and to S for $p \le p_1^{*N}$, where $p_1^{*N} = \frac{\pi_0}{\pi_1}$.

Proof of Lemma. First, consider the range $p \le p_2^{*N}$. If k1 = 1 (k2 = 0 by hypothesis), then by putting i = 1 in (1.21) we know that v1 should solve

$$v_1' + \frac{r+\pi_0+\pi_1 p}{\pi_1 p(1-p)}\,v_1 = \frac{1}{1-p}$$

This is a first order O.D.E. Solving this we have

$$v_1 = \frac{\pi_1}{r+\pi_0+\pi_1}\,p + C(1-p)[\Lambda(p)]^{\frac{r+\pi_0}{\pi_1}} \equiv v_1^{RS}(p) \tag{1.22}$$

where C is an integration constant. If he chooses k1 = 0 then v1(p) should satisfy

$$v_1 = \frac{\pi_0}{r+2\pi_0} \tag{1.23}$$

Initially, we assume that firm 1 indeed behaves in the way claimed for $p \le p_2^{*N}$. Later, we will show that the value function thus obtained for the specified range satisfies the Bellman equation for this range.

$p_1^{*N}$ is the threshold above which firm 1 goes to R. Then value matching and smooth pasting at $p_1^{*N}$ imply $v_1^{RS}(p_1^{*N}) = \frac{\pi_0}{r+2\pi_0}$ and $v_1^{RS\prime}(p_1^{*N}) = 0$. From these, we obtain $p_1^{*N} = \frac{\pi_0}{\pi_1}$. Now we check whether v1 thus obtained satisfies the Bellman equation.

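At $p = p_1^{*N}$, $v_1'(p_1^{*N}) = 0$. Hence $\pi_1 p(1-v_1-v_1'(1-p)) = \pi_0(1-v_1)$. For $p \in [p_1^{*N}, p_2^{*N}]$, $v_1'$ satisfies

$$v_1' \equiv v_1^{RS\prime} = \frac{\pi_1}{r+\pi_0+\pi_1} - C[\Lambda(p)]^{\frac{r+\pi_0}{\pi_1}}\left[1+\frac{r+\pi_0}{\pi_1 p}\right]$$

$$\Rightarrow\; (1-p)v_1' = \frac{\pi_1}{r+\pi_0+\pi_1} - C(1-p)[\Lambda(p)]^{\frac{r+\pi_0}{\pi_1}}\,\frac{r+\pi_0}{\pi_1 p} - v_1$$

$$\Rightarrow\; \pi_1 p(1-v_1-v_1'(1-p)) = (r+\pi_0)v_1$$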
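The last implication is an algebraic identity that holds for any integration constant C, which can be confirmed numerically. The following sketch plugs the functional form of (1.22) and its derivative into both sides; the function name and the sample values of r, π0, π1, C and p are hypothetical.

```python
def rs_identity_sides(r, pi0, pi1, C, p):
    """Both sides of pi1*p*(1 - v - v'*(1-p)) = (r + pi0)*v for v of the form (1.22)."""
    a = pi1 / (r + pi0 + pi1)
    k = (r + pi0) / pi1
    odds = (1 - p) / p
    v = a * p + C * (1 - p) * odds ** k
    dv = a - C * odds ** k * (1 + k / p)
    lhs = pi1 * p * (1 - v - dv * (1 - p))
    rhs = (r + pi0) * v
    return lhs, rhs


# Arbitrary illustrative values; the identity does not depend on C or p.
for C, p in [(0.0, 0.3), (0.2, 0.6), (-0.1, 0.9)]:
    lhs, rhs = rs_identity_sides(0.1, 1.0, 2.0, C, p)
    print(C, p, lhs, rhs)
```

Since the two sides agree for every C, the comparison between staying at R and moving to S reduces to comparing (r + π0)v1 with π0(1 − v1).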

$v_1$ is strictly convex and increasing in the range $[p_1^{*N}, p_2^{*N}]$. At $p = p_1^{*N}$, $\pi_1 p(1-v_1-v_1'(1-p)) = (r+\pi_0)v_1 = \pi_0(1-v_1)$. Therefore for $p \in (p_1^{*N}, p_2^{*N}]$, $\pi_1 p(1-v_1-v_1'(1-p)) = (r+\pi_0)v_1 > \pi_0(1-v_1)$. From (1.21) we can conclude that it is optimal for firm 1 to choose k1 = 1.

Next, consider the range $p > p_2^{*N}$. As before, we conjecture that it is optimal for 1 to choose k1 = 1 and derive the value function. Then we show that the obtained value function indeed satisfies the Bellman equation. If 1 chooses k1 = 1 then v1 should satisfy

$$v_1 = \frac{\pi_1}{r+\pi_1+\pi_2}\,p + C(1-p)[\Lambda(p)]^{\frac{r}{\pi_1+\pi_2}}$$

Value matching at $p_2^{*N}$ implies $C = \dfrac{v_1(p_2^{*N}) - \frac{\pi_1}{r+\pi_1+\pi_2}\,p_2^{*N}}{(1-p_2^{*N})[\Lambda(p_2^{*N})]^{\frac{r}{\pi_1+\pi_2}}}$. Clearly C is positive, as $v_1(p_2^{*N}) > \frac{\pi_1}{r+\pi_1+\pi_0}\,p_2^{*N} > \frac{\pi_1}{r+\pi_1+\pi_2}\,p_2^{*N}$. Thus v1 is strictly increasing and convex in $(p_2^{*N}, 1]$. We will show that it satisfies the Bellman equation. For $p > p_2^{*N}$,

$$v_1' = \frac{\pi_1}{r+\pi_1+\pi_2} - C[\Lambda(p)]^{\frac{r}{\pi_1+\pi_2}}\left(1+\frac{r}{(\pi_1+\pi_2)p}\right)$$

$$\Rightarrow\; \pi_1 p(1-v_1-v_1'(1-p)) = \frac{\pi_1}{\pi_1+\pi_2}\,[\pi_2 p + r v_1]$$

At $p_2^{*N}$, $\pi_1 p(1-v_1-v_1'(1-p)) > \pi_0(1-v_1)$. Since v1 is strictly increasing in p, $\pi_1 p(1-v_1-v_1'(1-p)) > \pi_0(1-v_1)$ for $p \in (p_2^{*N}, 1]$. Hence it is optimal for firm 1 to choose k1 = 1 for $p \in (p_2^{*N}, 1]$.
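The corresponding identity for the region where both firms are at R can be checked in the same way. This sketch assumes the value function has the form v1 = π1 p/(r + π1 + π2) + C(1−p)Λ(p)^{r/(π1+π2)} and compares π1 p(1 − v1 − v1′(1−p)) with π1(π2 p + r v1)/(π1 + π2); the function name and the numbers are illustrative assumptions.

```python
def rr_identity_sides(r, pi1, pi2, C, p):
    """Both sides of pi1*p*(1 - v - v'*(1-p)) = pi1/(pi1+pi2) * (pi2*p + r*v)."""
    s = pi1 + pi2
    a = pi1 / (r + s)
    k = r / s
    odds = (1 - p) / p
    v = a * p + C * (1 - p) * odds ** k
    dv = a - C * odds ** k * (1 + k / p)
    lhs = pi1 * p * (1 - v - dv * (1 - p))
    rhs = pi1 / s * (pi2 * p + r * v)
    return lhs, rhs


# Hypothetical values with pi1 > pi2; the identity holds for any C and p in (0, 1).
for C, p in [(0.0, 0.4), (0.3, 0.7), (0.1, 0.95)]:
    lhs, rhs = rr_identity_sides(0.1, 2.0, 1.5, C, p)
    print(C, p, lhs, rhs)
```

Because the right-hand side is increasing in both p and v1, the optimality of k1 = 1 at the lower end of the region carries over to the whole interval once v1 is increasing.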

This concludes the proof.

Lemma 8 Suppose firm 1 plays the following strategy: go to R for $p > p_1^{*N} = \frac{\pi_0}{\pi_1}$ and go to S for $p \le p_1^{*N}$. Then there exists a $p_2^{*N} \in \left(p_1^{*N}, \frac{\pi_0}{\pi_2}\right)$ such that firm 2's best response is to go to R for $p > p_2^{*N}$ and go to S for $p \le p_2^{*N}$.

Proof of Lemma. Consider $p \le p_1^{*N}$. First, as before, we conjecture that it is optimal for firm 2 to be at S. Then $v_2 = \frac{\pi_0}{r+2\pi_0}$ for $p \le p_1^{*N}$. From (1.21) one can conclude that $\pi_0(1-v_2) > \pi_2 p[1-v_2-v_2'(1-p)]$ for $p \le p_1^{*N}$.

Now consider the optimal stopping problem of firm 2 in the range $[p_1^{*N}, 1]$, given firm 1's strategy.

First we show that firm 2 will switch from R to S at a $p > p_1^{*N}$.

Suppose firm 2 switches from R to S at $p_1^{*N}$. Then v2 for $p \ge p_1^{*N}$ satisfies

$$v_2 = \frac{\pi_2}{r+\pi_1+\pi_2}\,p + C(1-p)[\Lambda(p)]^{\frac{r}{\pi_1+\pi_2}} \equiv v_2^R$$

This is derived by solving the O.D.E. obtained by substituting k1 = k2 = 1 and i = 2 in (1.21).

We can show that $v_2^{R\prime}(p_1^{*N}) < 0$ (see Appendix (A.3.3)).

If firm 2 switches at some $p_2$ such that $p_2 > p_1^{*N}$, then v2 in the range $[p_1^{*N}, p_2]$ will satisfy

$$v_2' + \frac{r+\pi_0+\pi_1 p}{\pi_1 p(1-p)}\,v_2 = \frac{\pi_0}{\pi_1 p(1-p)}$$

Solving this O.D.E. we have

$$v_2 = \frac{\pi_0}{r+\pi_0}\left[1-\frac{\pi_1}{r+\pi_0+\pi_1}\,p\right] + C(1-p)[\Lambda(p)]^{\frac{r+\pi_0}{\pi_1}} \equiv v_2^{SR}$$

Value matching at $p_1^{*N}$ implies C < 0. Hence v2 is concave in the range $[p_1^{*N}, p_2]$ and it can be shown that $v_2^{SR\prime}(p_1^{*N}) = 0$. Hence this proves that it is optimal for firm 2 to switch at a point $p > p_1^{*N}$. Also, at the optimal switching point $p_2^{*N}$ (if it exists), the smooth pasting condition will be satisfied and we shall have $v_2^{R\prime}(p_2^{*N}) = v_2^{SR\prime}(p_2^{*N}) < 0$. Suppose by $v_2^{Rs\prime}$ we denote the derivative of $v_2^R(p)$ if the switching takes place at the belief p. It has been established that $v_2^{SR\prime}(p_1^{*N}) = 0 > v_2^{Rs\prime}(p_1^{*N})$. Further, it is easy to see that $v_2^{SR\prime}(1) = -\frac{\pi_0\pi_1}{(r+\pi_0)(r+\pi_0+\pi_1)} < \frac{\pi_2}{r+\pi_1+\pi_2} = v_2^{Rs\prime}(1)$. It can be shown that $v_2^{Rs\prime}(p)$ is strictly increasing. Since $v_2^{SR\prime}(\cdot)$ is strictly decreasing, there exists a unique $p_2^{*N} \in (p_1^{*N}, 1)$ such that $v_2^{SR\prime}(p_2^{*N}) = v_2^{Rs\prime}(p_2^{*N})$.

From (1.21), we know that at the optimum we shall have $\pi_2 p_2^{*N}\big(1-v_2(p_2^{*N})-v_2'(p_2^{*N})(1-p_2^{*N})\big) = \pi_0\big(1-v_2(p_2^{*N})\big)$. Since $[1-v_2(p_2^{*N})] < [1-v_2(p_2^{*N})-v_2'(p_2^{*N})(1-p_2^{*N})]$, we have $p_2^{*N} < \frac{\pi_0}{\pi_2}$.

Proposition 9 Firm 1 going to R (S) for $p > (\le)\, p_1^{*N}$ and firm 2 going to R (S) for $p > (\le)\, p_2^{*N}$ constitutes a Markovian equilibrium.

Proof. This follows directly from Lemmas 7 and 8.

The above proposition describes the unique equilibrium in threshold-type Markovian strategies. Since $p_2^{*N} < \frac{\pi_0}{\pi_2} < p_2^*$, there exists a range of beliefs $(p_2^{*N}, p_2^*)$ where efficiency requires firm 2 to shift to the safe site, but it does not. This shows that the non-cooperative equilibrium outcome involves the phenomenon of duplication.

Proposition (9) strengthens the notion, that in a R&D race model, where firms have

to choose between competing research projects and the first inventor appropriates all the

rent, we have distortion in the form of duplication.

1.3 Conclusion

We have demonstrated that when the firms’ abilities differ across research methods, then

efficiency requires diversification of research efforts over a range of beliefs. This has been

established in two different environments. In presence of heterogeneity among firms, we

do not achieve efficiency in a non-cooperative interaction. Only when the firms are equally

capable across research methods, is the non-cooperative outcome efficient. When the firms

differ in abilities, we always have duplication in a non-cooperative interaction. Depending

on the parameter values we can either have a unique equilibrium with diversification over

a range of beliefs or a multiplicity of equilibria with diversification at a point only.

Chapter 2

Competition and Learning in R&D: The Role of Private Information

2.1 Introduction

Innovation is an important aspect of the technological progress of an economy. The previous chapter explored the scenario where competing agents trying to make the same discovery had alternate methods of research to choose from. We saw that in the presence of heterogeneity between agents, non-cooperative interactions always lead to distortion. This distortion takes the form of too much duplication, i.e., both firms adopt the same kind of research method when efficiency dictates that one of them adopt a different method.

Throughout our analysis in the previous chapter, we found that inefficiency followed from competition and heterogeneity among agents. The analysis was restricted to settings where every outcome is perfectly observable by all agents and only one kind of news can arrive (i.e., the final discovery). However, in reality, the process of innovation might involve several interim arrivals of news before the final discovery. While for all practical purposes the final discovery can be supposed to be publicly observable due to patenting, there is no reason why intermediate arrivals of news should be supposed to be observed by everyone. In fact, it is commonly observed that firms conducting research to compete for a discovery often maintain secrecy about their interim outcomes, even though the path of research adopted by other firms is commonly known. Each firm may obtain some interim success which it may choose not to reveal. Revealing interim success gives an instantaneous payoff, but it makes the firm vulnerable in the sense that the competitor may take advantage of this interim result and make the final discovery sooner, thereby getting a disproportionately higher rent due to the winner-takes-all structure.

As a motivating example, consider the world of academic research. Often two researchers try to solve the same problem independently. Whoever solves the problem first gets a disproportionately higher payoff (a very good publication) than the subsequent researcher solving the problem. In this situation it is very likely that one of them may get an interim result earlier. This individual now has two options: either to reveal this interim discovery or to conceal it. Revealing might give an instantaneous payoff (say, a publication in a relatively low ranked journal). However, this also increases the probability of the competing researcher solving the final problem earlier. This shows that a researcher will not always have an incentive to reveal his interim success. In particular, in the absence of any interim payoff a researcher will never reveal any interim result. This chapter analyses this issue of private arrival of information in a setting where there is no payoff from revealing interim result(s).

The setting is a modified version of the model in the second environment of the previous chapter. We have two firms who are trying to find the same treasure. There are two alternate sites to search for the treasure. One of them will almost surely lead to success in finite time. We refer to this as the safe site. We consider a continuous time framework. Success for each firm searching there follows a Poisson process with intensity π0 > 0. The other site is risky and can either be good or bad. A bad risky site has no treasure and a good risky site has the treasure for sure. A firm searching there can experience two kinds of arrivals. There can be an arrival of information according to a Poisson process with intensity π1 > 0. This just informs the firm that the site is good. This information is only revealed to the firm that experiences this arrival. There can also be an arrival of the final discovery according to a Poisson process with intensity π2 > π0. Players start with the same prior p, which is the probability with which the risky site is good.

We first obtain the efficiency benchmark, or the full information optimal, of this model, i.e., the solution when both firms are controlled by a social planner who can observe all arrivals experienced by a firm. Hence, both firms and the planner share a common belief about the state of the risky site. The planner at each instant allocates a firm to a site. As soon as there is a final discovery, the search ends. If any firm experiences an informational arrival, then all uncertainty is resolved and both firms are thereafter allocated to the risky site (which is now known to be good). The solution is of threshold type. There exists a threshold belief p∗ such that, conditional on no observation, both firms are allocated to the risky site if p > p∗ and to the safe site otherwise.

Next, we turn to the non-cooperative game. We restrict ourselves to symmetric equilibria. This implies that on the equilibrium path, given the same information, actions will be identical across firms. Hence, if the players start with a common prior, then on the equilibrium path they will have a common posterior, even though the beliefs are private. We derive an equilibrium as follows. There exists a common threshold p∗N, such that if the private belief is greater than p∗N, then the firm searches at the risky site. Otherwise it goes to the safe site. If a firm experiences an informational arrival, then it keeps on searching at the risky site as long as the game continues. If initially a firm goes to the risky site and gets no arrival until the belief hits p∗N, then it shifts to the safe site. However, if it observes that its competitor has not shifted, then it reverts back to the risky site. This is because the action of the competitor gives the firm a signal that an informational arrival has been experienced at the risky site and thus that it is good.

Having described the full information optimal and a non-cooperative equilibrium, we try to analyse the nature of inefficiency. We observe that p∗N > p∗. However, this alone does not determine the nature of inefficiency, because in the full information case the beliefs are public, whereas in the non-cooperative case the beliefs are private. Moreover, the belief updating processes are different: in the non-cooperative game, the movement of beliefs is sluggish. Hence, to determine the nature of inefficiency, we take the following route.

For each initial prior, at which the planner would have allocated both firms to the

risky site, we try to calculate the duration for which the firms are kept at the risky site,

conditional on no observation. Then we compare this with the duration for which the firms

would be in the risky site in the non-cooperative game, given the same prior.

First of all, it is trivially true that if the prior is in the range (p∗, p∗N), then in the non-cooperative game the duration for which the firms go to the risky site (which is actually 0) is less than what a planner would have wanted. Then we determine a threshold belief p0∗ ∈ (p∗N, 1) such that if the initial prior is higher (lower) than this threshold, then the duration for which the firms are in the risky site in the non-cooperative game is higher (lower) than what a planner would have wanted. Hence, too much optimism results in excessive experimentation along the risky line.

Related Literature: This chapter contributes to two broad areas: IO literature and

the Strategic Bandit literature. To avoid redundancy with the previous chapter, I only

discuss the papers which have dealt with the issue of private arrival of information in the

context of strategic experimentation.

The paper which is closest to this work is the one by Akcigit and Liu [1]. They analyse a two-armed bandit model with one risky and one safe arm. The risky arm could potentially lead to a dead end. Inefficiency arises from the fact that there is wasteful dead-end replication and an early abandonment of the risky project. The present work incorporates the issue of private arrival of information in a different manner. The private information is in the form of good news about the risky site, unlike their work where private information is in the form of bad news. However, the present work shows that there can still be early abandonment of the risky project if, to start with, players are not too optimistic about the quality of the risky line. Further, in the present work we have learning even when there is no information asymmetry.

Heidhues, Rady and Strack [34] analyse a strategic experimentation model with private payoffs. They take a two-armed bandit model with a risky arm and a safe arm. Players observe each other’s behavior but not the realised payoffs. They communicate with each other via cheap talk. The present chapter differs from their work in the following ways. Firstly, we have private arrivals of information only. Secondly, players are rivals.

The rest of the chapter is organised as follows. Section 2.2 discusses the environment formally and the full information optimal solution. Section 2.3 discusses the non-cooperative game and the nature of inefficiency. Section 2.4 concludes the chapter.

2.2 Environment

Two firms are trying to find the same treasure. The first one to find it appropriates all the rent from it (which is the social value from the invention and is normalised to 1). There are two sites. One of the sites, referred to as the safe one, has the treasure for sure. A firm that searches there discovers it according to a Poisson process with intensity π0 > 0. The other site is risky and can be either bad or good. A bad risky site has no treasure in it, and a firm that searches there does not experience any arrival. A good risky site has the treasure for sure. A firm that searches there can experience two kinds of arrivals. There can be an arrival of information according to a Poisson process with intensity π1 > 0. This just informs the firm that the site is good, and it is revealed only to the firm that experiences the arrival. There can also be an arrival of the final discovery according to a Poisson process with intensity π2 > π0.

Players start with a common prior p, which is the probability that the risky site is good.

2.2.1 The planner’s problem: The full information optimal

Consider the optimal allocation in the case of full information, i.e., when all kinds of arrivals

at the risky site are publicly observable. Suppose both the firms are controlled by a

benevolent social planner, who can observe all the arrivals experienced by each of the

firms. The planner wants to maximise the expected discounted social value.

Let $k$ be the number of firms allocated to the risky site at an instant $t$. Since every arrival is observable to the planner, if there is no arrival during the interval $dt$, then

$$dp_t = -k(\pi_1 + \pi_2)\,p_t(1 - p_t)\,dt$$

As soon as the planner observes any arrival at the risky site, the uncertainty is resolved. If it is a final discovery, then the search ends; otherwise the planner knows for sure that the risky site is good and allocates both firms to that site from then on. As before, we assume that if the planner is indifferent between allocating a firm to the risky site and the safe site, then he allocates it to the safe site.

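Hence if $v(p)$ is the value function of the planner, then it should satisfy the following Bellman equation:

$$v(p) = \max_{k \in \{0,1,2\}} \Big\{ (2-k)\pi_0\,dt + kp\Big[\pi_2\,dt + \pi_1 \frac{2\pi_2}{r+2\pi_2}\,dt\Big] + (1 - r\,dt)\big(1 - (2-k)\pi_0\,dt - kp(\pi_1+\pi_2)\,dt\big)\big(v(p) - v'(p)\,kp(1-p)(\pi_1+\pi_2)\,dt\big) \Big\}$$

$$\Rightarrow\; rv = \max_{k \in \{0,1,2\}} \Big\{ (2-k)\big[\pi_0(1-v)\big] + kp\Big[\pi_2 + \pi_1\frac{2\pi_2}{r+2\pi_2} - (\pi_1+\pi_2)v - (\pi_1+\pi_2)(1-p)v'\Big] \Big\} \tag{2.1}$$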

Since the Bellman equation is linear in $k$, we can posit that at the optimum either $k = 0$ or $k = 2$. If $k = 0$, then $v = \frac{2\pi_0}{r+2\pi_0}$. If $k = 2$, then $v$ satisfies the following first-order O.D.E.:

$$v' + \frac{r + 2(\pi_1+\pi_2)p}{2(\pi_1+\pi_2)\,p(1-p)}\,v = \frac{2\pi_2\{r + 2(\pi_1+\pi_2)\}}{2(\pi_1+\pi_2)(r+2\pi_2)}\cdot\frac{1}{1-p}$$

This is derived from (2.1) by putting $k = 2$. The solution to this O.D.E. is

$$v = \frac{2\pi_2}{r+2\pi_2}\,p + C(1-p)[\Lambda(p)]^{\frac{r}{2(\pi_1+\pi_2)}} \tag{2.2}$$

where $C$ is the integration constant and $\Lambda(p) = \frac{1-p}{p}$.
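As a quick sanity check on (2.2), the following sketch evaluates the closed-form solution and confirms numerically, using a central finite difference for the derivative, that it satisfies the O.D.E. for k = 2 for an arbitrary integration constant. The parameter values and the function names `v_planner` and `ode_residual` are illustrative assumptions and are not part of the model.

```python
def v_planner(p, C, pi1, pi2, r):
    """Closed-form candidate (2.2): particular solution plus C times the homogeneous part."""
    lam = (1.0 - p) / p                      # Lambda(p) = (1 - p) / p
    mu = r / (2.0 * (pi1 + pi2))             # exponent on Lambda(p)
    return 2.0 * pi2 / (r + 2.0 * pi2) * p + C * (1.0 - p) * lam ** mu


def ode_residual(p, C, pi1, pi2, r, h=1e-6):
    """Left side minus right side of the k = 2 O.D.E., with v' from a central difference."""
    a = pi1 + pi2
    v = v_planner(p, C, pi1, pi2, r)
    dv = (v_planner(p + h, C, pi1, pi2, r) - v_planner(p - h, C, pi1, pi2, r)) / (2.0 * h)
    lhs = dv + (r + 2.0 * a * p) / (2.0 * a * p * (1.0 - p)) * v
    rhs = 2.0 * pi2 * (r + 2.0 * a) / ((r + 2.0 * pi2) * 2.0 * a) / (1.0 - p)
    return lhs - rhs


# Illustrative (assumed) parameter values.
pi1, pi2, r = 1.0, 2.0, 0.5
residuals = [
    ode_residual(p, C, pi1, pi2, r)
    for p in (0.2, 0.5, 0.8)
    for C in (-1.0, 0.0, 0.7)
]
print(max(abs(x) for x in residuals))        # should be close to zero
```

The residual vanishes for every value of C tried, which is what one expects if the second term of (2.2) solves the homogeneous equation.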

Let $p^*$ be the belief at which the planner shifts both firms from the risky site to the safe site. Since for $p = 1$ ($p = 0$) the planner allocates both firms to the risky (safe) site, $p^* \in (0, 1)$. Hence $v(p)$ should satisfy the value matching and smooth pasting conditions. From the value matching condition at $p^*$, we have

$$C = \frac{\frac{2\pi_0}{r+2\pi_0} - \frac{2\pi_2}{r+2\pi_2}\,p^*}{(1-p^*)[\Lambda(p^*)]^{\frac{r}{2(\pi_1+\pi_2)}}}$$

The smooth pasting condition at $p^*$ implies $v'(p^*+) = 0$. From (2.2) we have

$$v' = \frac{2\pi_2}{r+2\pi_2} - C[\Lambda(p)]^{\frac{r}{2(\pi_1+\pi_2)}}\Big[1 + \frac{r}{2(\pi_1+\pi_2)}\cdot\frac{1}{p}\Big]$$

Substituting the value of $C$ and imposing the smooth pasting condition at $p^*$, we obtain

$$\frac{2\pi_2}{r+2\pi_2} = \frac{\frac{2\pi_0}{r+2\pi_0} - \frac{2\pi_2}{r+2\pi_2}\,p^*}{1-p^*}\Big[1 + \frac{r}{2(\pi_1+\pi_2)}\cdot\frac{1}{p^*}\Big]$$

$$\Rightarrow\; p^* = \frac{\pi_0}{\pi_2 + \frac{2\pi_1(\pi_2-\pi_0)}{r+2\pi_2}} \tag{2.3}$$

By comparing this $p^*$ to the one obtained in the model without informational arrival (which is $\frac{\pi_0}{\pi_2}$ from the previous chapter), we can infer that experimentation along the risky line is carried out for a larger range of beliefs in the presence of informational arrival.
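To make the comparison concrete, the sketch below computes the planner's threshold from (2.3), checks that it solves the smooth pasting condition, and compares it with π0/π2. The numerical values of π0, π1, π2 and r, and the function names, are assumptions chosen only for illustration.

```python
def planner_threshold(pi0, pi1, pi2, r):
    """Planner's switching belief p* from equation (2.3)."""
    return pi0 / (pi2 + 2.0 * pi1 * (pi2 - pi0) / (r + 2.0 * pi2))


def smooth_pasting_gap(p, pi0, pi1, pi2, r):
    """Left side minus right side of the smooth pasting condition evaluated at belief p."""
    a = 2.0 * pi2 / (r + 2.0 * pi2)          # value when both firms search a known good site
    b = 2.0 * pi0 / (r + 2.0 * pi0)          # value when both firms search the safe site
    mu = r / (2.0 * (pi1 + pi2))
    return a - (b - a * p) / (1.0 - p) * (1.0 + mu / p)


# Illustrative (assumed) parameters satisfying pi2 > pi0 > 0.
pi0, pi1, pi2, r = 1.0, 1.0, 2.0, 0.5
p_star = planner_threshold(pi0, pi1, pi2, r)
print("p* =", p_star)
print("threshold without informational arrival =", pi0 / pi2)
print("smooth pasting gap at p* =", smooth_pasting_gap(p_star, pi0, pi1, pi2, r))
```

With π1 > 0 the denominator of (2.3) exceeds π2, so p∗ falls below π0/π2 and the risky site is used over a wider range of beliefs.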


2.3 The non-cooperative game

Let us now consider the non-cooperative game. We assume that a firm can observe the action of its opponent. Only the final discovery by any firm is publicly observable. Hence if a firm is searching at the risky site and experiences an informational arrival, the arrival is private to that firm. Also, any informational arrival to a firm resolves the uncertainty for that firm.

Suppose firms start out with the same prior $p_t$ at the instant $t$, and both conduct research at the risky site over a time interval $\Delta > 0$. Conditional on not observing anything until the instant $t$ and during the interval $[t, t+\Delta]$, the common posterior of the firms at $t + \Delta$ is given by

$$p_{t+\Delta} = \frac{p_t e^{-(\pi_1+2\pi_2)\Delta}}{p_t e^{-(\pi_1+2\pi_2)\Delta} + (1-p_t)}$$

This is because during the time interval $[t, t+\Delta]$, conditional on the risky site being good, the probability that a firm does not experience any informational arrival or have a final discovery is $e^{-(\pi_1+\pi_2)\Delta}$, and the probability that the opponent does not have any final discovery is $e^{-\pi_2\Delta}$. Hence the probability that the site is good and a firm does not observe anything is $p_t e^{-(\pi_1+2\pi_2)\Delta}$.

If $\Delta$ is small enough, then the firms’ common posterior when both conduct research at the risky site (starting with a common prior) satisfies the following law of motion:

$$dp_t = -(\pi_1+2\pi_2)\,p_t(1-p_t)\,dt$$
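The link between the discrete Bayes update and the law of motion can be illustrated numerically. The sketch below compares the closed-form posterior after an interval of length Δ with a simple Euler integration of the differential equation; the function names, the step count and the parameter values are assumptions made for illustration.

```python
import math


def posterior_no_news(p, pi1, pi2, delta):
    """Bayes update after both firms search the risky site for `delta` and a firm observes nothing."""
    decay = math.exp(-(pi1 + 2.0 * pi2) * delta)   # P(no observation | good site)
    return p * decay / (p * decay + (1.0 - p))


def euler_posterior(p, pi1, pi2, delta, steps=20000):
    """Integrate dp = -(pi1 + 2*pi2) * p * (1 - p) dt with a plain Euler scheme."""
    dt = delta / steps
    for _ in range(steps):
        p += -(pi1 + 2.0 * pi2) * p * (1.0 - p) * dt
    return p


# Illustrative (assumed) values.
p0, pi1, pi2, delta = 0.6, 1.0, 2.0, 0.5
print(posterior_no_news(p0, pi1, pi2, delta))
print(euler_posterior(p0, pi1, pi2, delta))
```

The two numbers agree up to discretisation error. The closed form also composes over time: updating over Δ1 and then Δ2 gives the same posterior as a single update over Δ1 + Δ2.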

2.3.1 Equilibrium

We look for a symmetric equilibrium in the following kind of strategies:

A firm, given the current belief (which is private), chooses a site to carry out research. If it chooses the risky site at the outset, then it also chooses a posterior at which it is going to switch to the safe site.

In a symmetric equilibrium, there exists a threshold p∗N ∈ (0, 1) such that if the prior

(which is common to both the firms) p0 > p∗N , then both the firms choose the risky site

to carry out research and choose p∗N as the posterior to switch to the safe site.

Since firms start out from the same prior, as long as there is no arrival, firms will have

the same posterior.

If at p = p∗N the firm observes the other firm to be still conducting research at the risky site, then it reverts to the risky site and follows the other firm from then on. Also, if the other firm reverts to the risky site at any p < p∗N, then it will follow suit. Shifting between sites is costless and takes dt amount of time, where dt > 0 and dt → 0. dt is small enough that terms of order o(dt²) can be ignored.

In the following proposition we show that for sufficiently patient firms, such a symmetric

equilibrium exists and is unique.

Proposition 10 There exists a unique symmetric equilibrium as described above for sufficiently patient firms (i.e., $r$ is low enough) with

$$p^*_N = \frac{\pi_0}{\pi_2 + \frac{\pi_1}{r+2\pi_2}(\pi_2-\pi_0)\frac{r}{r+\pi_0}}$$

Proof. We prove this proposition in the following steps:

Lemma 9 If there exists a symmetric equilibrium as conjectured above, then we must have

$$p^*_N \le \frac{\pi_0}{\pi_2 + \frac{\pi_1}{r+2\pi_2}(\pi_2-\pi_0)\frac{r}{r+\pi_0}} \equiv \bar{p}^*_N$$

Proof of Lemma. Suppose there exists a symmetric equilibrium as conjectured above and $p^*_N$ is the common belief at which firms switch from the risky site to the safe site, if the prior $p_0 > p^*_N$. Let the action of firm $i$ at time $t$ be denoted by $k_{it} \in \{0, 1\}$, where $k_{it} = 0$ $(1)$ implies that the firm is choosing the safe (risky) site. The strategy of each player in a symmetric equilibrium is as follows:

1. If $p_0 > p^*_N$, then $k_{i0} = 1$ and $k_{it} = 1$ as long as $p_{it} > p^*_N$. If $p_{it} \le p^*_N$, then $k_{it} = 0$.

2. If $p_{it} \le p^*_N$ $(t > 0)$ and $k_{jt} = 1$ $(j \ne i)$, then $k_{it} = k_{jt}$ from then on.

Let v1(p1) and v2(p2) be the equilibrium payoffs to firm 1 and 2 respectively, in a

symmetric equilibrium. Since firms are identical in all respects and they start with a

common prior, in a symmetric equilibrium firms will have a common posterior, conditional

on not observing anything. Thus on the equilibrium path, we shall have pi = pj .

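Then, given the strategy of firm 2, firm 1’s payoff $v_1(\cdot)$ should satisfy the following Bellman equation:

$$v_1 = \max_{k_1 \in \{0,1\}} \Big\{ (1-k_1)\pi_0\,dt + k_1 p\Big[\pi_2\,dt + \pi_1\,dt\,\frac{\pi_2}{r+2\pi_2}\Big] + (1 - r\,dt)\big(1 - (2-k_1-k_2)\pi_0\,dt - k_1(\pi_1+\pi_2)p\,dt - k_2\pi_2 p\,dt\big)\big(v_1 - v_1'\,p(1-p)[k_1(\pi_1+\pi_2) + k_2\pi_2]\,dt\big) \Big\}$$

$$\Rightarrow\; rv_1 = \max_{k_1\in\{0,1\}} \Big\{ (1-k_1)\big[\pi_0(1-v_1)\big] + k_1 p\Big[\frac{\pi_2(r+\pi_1+2\pi_2)}{r+2\pi_2} - (\pi_1+\pi_2)v_1 - v_1'(1-p)(\pi_1+\pi_2)\Big] - (1-k_2)\pi_0 v_1 - k_2\big[\pi_2 p\,v_1 + \pi_2 p(1-p)v_1'\big] \Big\} \tag{2.4}$$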

We define $B_s(p)$ and $B_r(p)$ as follows:

$$B_s(p) = \pi_0(1-v_1) \tag{2.5}$$

$$B_r(p) = p\Big[\frac{\pi_2(r+\pi_1+2\pi_2)}{r+2\pi_2} - (\pi_1+\pi_2)v_1 - v_1'(1-p)(\pi_1+\pi_2)\Big] \tag{2.6}$$

From (2.4) it is clear that if at a particular p it is optimal for firm 1 to go to the risky

(safe) site then we shall have Br(p) ≥ (≤)Bs(p).

According to the conjectured equilibrium given firm 2’s strategy, firm 1 finds it optimal


to shift to the safe site at p = p∗N . Hence at p = p∗N we must have

$$B_s(p) \ge B_r(p) \;\Rightarrow\; \pi_0(1-v_1) \ge p\Big[\frac{\pi_2(r+\pi_1+2\pi_2)}{r+2\pi_2} - (\pi_1+\pi_2)v_1 - v_1'(1-p)(\pi_1+\pi_2)\Big]$$

In equilibrium, both firms shift to the safe site at $p = p^*_N$. This implies that the left derivative of $v_1$ at $p^*_N$ is zero. Given firm 2’s strategy, if firm 1 goes to the risky site at $p = p^*_N$, then conditional on there being no arrival, the belief can change only in the leftward direction. This implies

$$\pi_0\big(1 - v_1(p^*_N)\big) \ge p^*_N\Big[\frac{\pi_2(r+\pi_1+2\pi_2)}{r+2\pi_2} - (\pi_1+\pi_2)v_1(p^*_N)\Big]$$

The value matching condition at $p^*_N$ implies $v_1(p^*_N) = \frac{\pi_0}{r+2\pi_0}$. Hence we shall have

$$\frac{\pi_0(r+\pi_0)}{r+2\pi_0} \ge p^*_N\,\frac{\pi_2(r+2\pi_2)(r+\pi_0) + r\pi_1(\pi_2-\pi_0)}{(r+2\pi_2)(r+2\pi_0)}$$

$$\Rightarrow\; p^*_N \le \frac{\pi_0}{\pi_2 + \frac{\pi_1}{r+2\pi_2}(\pi_2-\pi_0)\frac{r}{r+\pi_0}}$$

This concludes the proof.

As per the conjectured equilibrium, both firms go to the risky site for $p > p^*_N$. Starting from a common prior, if both firms go to the risky site for $p > p^*_N$, then $v_1$ in this region satisfies the following O.D.E.:

$$v_1' + \frac{r + (\pi_1+2\pi_2)p}{p(1-p)(\pi_1+2\pi_2)}\,v_1 = \frac{\pi_2(r+\pi_1+2\pi_2)}{r+2\pi_2}\cdot\frac{1}{(1-p)(\pi_1+2\pi_2)}$$

This O.D.E. is obtained from the Bellman equation in (2.4) by putting $k_1 = k_2 = 1$. Solving this we have

$$v_1 = \frac{\pi_2}{r+2\pi_2}\,p + C(1-p)[\Lambda(p)]^{\frac{r}{\pi_1+2\pi_2}} \tag{2.7}$$

where $C$ and $\Lambda(\cdot)$ are as defined before. Then $v_1'$ is given by

$$v_1' = \frac{\pi_2}{r+2\pi_2} - C[\Lambda(p)]^{\frac{r}{\pi_1+2\pi_2}}\Big[1 + \frac{r}{\pi_1+2\pi_2}\cdot\frac{1}{p}\Big] \tag{2.8}$$

Lemma 10 In a symmetric equilibrium it is necessary to have $p^*_N = \bar{p}^*_N$.

Proof of Lemma. Suppose $p^*_N < \bar{p}^*_N$. In equilibrium we should have $B_r(p) \ge B_s(p)$ for all $p > p^*_N$.

First of all we will show that it is never possible to have $v_1'(p^*_N+) < 0$.

From (2.6) we have

$$B_r(p) = p\Big[\frac{\pi_2(r+\pi_1+2\pi_2)}{r+2\pi_2} - (\pi_1+\pi_2)v_1 - v_1'(1-p)(\pi_1+\pi_2)\Big]$$

$$\Rightarrow\; B_r(p) = p\Big[\frac{\pi_2(r+\pi_1+2\pi_2)}{r+2\pi_2} - (\pi_1+2\pi_2)v_1(p) - (\pi_1+2\pi_2)v_1'(p)(1-p)\Big] + \pi_2 p\,v_1 + \pi_2 p(1-p)v_1'$$

Since for $p > p^*_N$, $v_1$ is given by (2.7), we have

$$B_r(p) = rv_1 + \pi_2 p\,v_1 + \pi_2 p(1-p)v_1' \tag{2.9}$$

Consider $p = p^*_N + \varepsilon$, such that $\varepsilon > 0$ and $\varepsilon \to 0$. Then $v_1 \approx \frac{\pi_0}{r+2\pi_0}$ (since $v_1$ is continuous). This implies

$$B_s(p) \approx \frac{r\pi_0}{r+2\pi_0} + \frac{\pi_0^2}{r+2\pi_0}$$

If $v_1'(p^*_N+) < 0$, then $B_r(p) < rv_1 + \pi_2 p\,v_1 \approx r\frac{\pi_0}{r+2\pi_0} + \pi_2 p\frac{\pi_0}{r+2\pi_0}$. Since $\bar{p}^*_N < \frac{\pi_0}{\pi_2}$, we have $r\frac{\pi_0}{r+2\pi_0} + \pi_2 p\frac{\pi_0}{r+2\pi_0} < B_s(p)$. This implies that $B_r(p) < B_s(p)$ and contradicts optimality. Hence we cannot have $v_1'(p^*_N+) < 0$ in equilibrium. This implies that $v_1'(p^*_N+) \ge 0$.

Since both firms shift from the risky site to the safe site at $p = p^*_N$, using the value matching condition at $p = p^*_N$ we have

$$C = \frac{\frac{\pi_0}{r+2\pi_0} - \frac{\pi_2}{r+2\pi_2}\,p^*_N}{(1-p^*_N)[\Lambda(p^*_N)]^{\frac{r}{\pi_1+2\pi_2}}}$$

From (2.8), we have $v_1'(p^*_N+) \ge (\le)\, 0$ according as $p^*_N \ge (\le)\, \frac{\pi_0}{\pi_2 + \frac{\pi_1}{r+2\pi_2}(\pi_2-\pi_0)}$.

Since $v_1'(p^*_N+) \ge 0$, we must have $p^*_N \in \Big[\frac{\pi_0}{\pi_2 + \frac{\pi_1}{r+2\pi_2}(\pi_2-\pi_0)},\, \bar{p}^*_N\Big)$.

As $p^*_N < \bar{p}^*_N$, we shall have $B_s(p^*_N) > B_r(p^*_N)$. This implies

$$\pi_0\big(1 - v_1(p^*_N)\big) > p^*_N\Big[\frac{\pi_2(r+\pi_1+2\pi_2)}{r+2\pi_2} - (\pi_1+\pi_2)v_1(p^*_N)\Big]$$

as $v_1'(p^*_N) = 0$. As $v_1(\cdot)$ is continuous, this strict inequality will still be satisfied for $p = p^*_N + \varepsilon$ $(\varepsilon > 0)$. For $p = p^*_N + \varepsilon$, $v_1' \ge 0$. Then from (2.6), we can infer that

$$B_s(p) = \pi_0\big(1 - v_1(p^*_N)\big) > p^*_N\Big[\frac{\pi_2(r+\pi_1+2\pi_2)}{r+2\pi_2} - (\pi_1+\pi_2)v_1(p^*_N)\Big] \ge p^*_N\Big[\frac{\pi_2(r+\pi_1+2\pi_2)}{r+2\pi_2} - (\pi_1+\pi_2)v_1(p^*_N) - v_1'(1-p)(\pi_1+\pi_2)\Big] = B_r(p)$$

This is not possible in equilibrium. Hence it is necessary to have $B_s(p^*_N) = B_r(p^*_N)$. This implies $p^*_N = \bar{p}^*_N$.

This concludes the proof.

From the above two lemmas we know that a necessary condition for a symmetric equilibrium is that the common switching belief $p^*_N$ equals $\bar{p}^*_N$. Hence, conditional on existence, the symmetric equilibrium is unique.

Now we need to prove the existence. We need to find the conditions which will guarantee

that for all p > p∗N , Br(p) ≥ Bs(p).

According to the proposed profile of strategies, both firms go to the risky site for $p > p^*_N$. Then $B_r(p)$ is given by (2.9). Using (2.5) we know that $B_r(p) \ge B_s(p)$ requires

$$v_1 \ge \frac{\pi_0}{r+\pi_2 p+\pi_0} - \frac{\pi_2\,p(1-p)\,v_1'(p)}{r+\pi_2 p+\pi_0} \tag{2.10}$$

From our above analysis we know that $v_1$ is always strictly increasing and convex in $p$ for $p \in (p^*_N, 1)$. This implies $v_1 \ge \frac{\pi_0}{r+2\pi_0}$ for $p \in (p^*_N, 1)$. Hence from (2.10) we can posit that a sufficient condition to ensure $B_r(p) \ge B_s(p)$ is to have

$$\frac{\pi_0}{r+\pi_2 p+\pi_0} - \frac{\pi_2\,p(1-p)\,v_1'(p)}{r+\pi_2 p+\pi_0} \le \frac{\pi_0}{r+2\pi_0}$$

Since $v_1' > 0$ for $p > p^*_N$, the above inequality is satisfied (strictly) for $p \ge \frac{\pi_0}{\pi_2}$. At $p = \frac{\pi_0}{\pi_2}$,

$$\frac{\pi_0}{r+\pi_2 p+\pi_0} - \frac{\pi_2\,p(1-p)\,v_1'}{r+\pi_2 p+\pi_0} < \frac{\pi_0}{r+2\pi_0}$$

Hence there exists a $p' < \frac{\pi_0}{\pi_2}$ such that for $p \in (p', \frac{\pi_0}{\pi_2}]$,

$$\frac{\pi_0}{r+\pi_2 p+\pi_0} - \frac{\pi_2\,p(1-p)\,v_1'}{r+\pi_2 p+\pi_0} \le \frac{\pi_0}{r+2\pi_0}$$

From the expression for $p^*_N$ we know that $p^*_N \to \frac{\pi_0}{\pi_2}$ as $r \to 0$. Hence we can find an $r^* > 0$ such that for all $r \in (0, r^*)$, $p^*_N > p'$. This implies that for $r \in (0, r^*)$, we have $B_r(p) \ge B_s(p)$ for all $p \in (p^*_N, 1)$.

This concludes the proof of the proposition.

Since $\frac{r}{r+\pi_0} < 1$, we can infer that $p^*_N > p^*$. Hence the non-cooperative equilibrium may involve distortion. However, a priori it cannot be determined whether there will be too much or too little experimentation along the risky line in the non-cooperative equilibrium described above. This is because in non-cooperative interaction, private arrival of information is not publicly observable. Thus if the common prior is greater than $p^*_N$, then conditional on no arrival, the private belief of the players diverges from the public belief (which is the one discussed in the full information optimal problem). Thus, to determine the nature of inefficiency, we need to know the duration of experimentation along the risky line, conditional on no arrival. Observe that in the case of no private information, there is a one-to-one correspondence between the duration of experimentation and the posterior.
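The ordering of the two thresholds, and the limit of the equilibrium threshold as firms become patient, can be checked with a short numerical sketch. The formulas are those of (2.3) and Proposition 10; the parameter values and function names are assumptions chosen for illustration.

```python
def planner_threshold(pi0, pi1, pi2, r):
    """Planner's switching belief p* from (2.3)."""
    return pi0 / (pi2 + 2.0 * pi1 * (pi2 - pi0) / (r + 2.0 * pi2))


def noncoop_threshold(pi0, pi1, pi2, r):
    """Equilibrium switching belief p*_N from Proposition 10."""
    return pi0 / (pi2 + pi1 / (r + 2.0 * pi2) * (pi2 - pi0) * r / (r + pi0))


# Illustrative (assumed) arrival rates with pi2 > pi0 > 0.
pi0, pi1, pi2 = 1.0, 1.0, 2.0
for r in (1.0, 0.5, 0.1, 0.001):
    p_star = planner_threshold(pi0, pi1, pi2, r)
    p_n = noncoop_threshold(pi0, pi1, pi2, r)
    # Columns: discount rate, planner threshold, equilibrium threshold, pi0 / pi2.
    print(r, p_star, p_n, pi0 / pi2)
```

For every discount rate tried, p∗ lies below p∗N, which in turn lies below π0/π2, and p∗N approaches π0/π2 as r becomes small.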

The following proposition establishes the nature of inefficiency.

Proposition 11 The non-cooperative equilibrium involves inefficiency. There exists a $p_0^* \in (p^*_N, 1)$ such that if the prior $p_0 > p_0^*$, then conditional on no arrival we have excessive experimentation, and for $p_0 < p_0^*$ we have too little experimentation. By excessive experimentation we mean that, starting from a given prior, the duration for which firms conduct research in the risky site is longer than a planner would have wanted.

Proof. Let $t^n_{p_0}$ be the duration of experimentation along the risky line by the firms in the non-cooperative equilibrium described above when they start from the prior $p_0$. From the non-cooperative equilibrium described above we know that if the firms start out from the prior $p_0$, then they carry on experimentation along the risky line until the posterior reaches $p^*_N$. From the dynamics of the posterior we know that

$$dp_t = -(\pi_1+2\pi_2)\,p_t(1-p_t)\,dt \;\Rightarrow\; dt = -\frac{1}{\pi_1+2\pi_2}\cdot\frac{1}{p_t(1-p_t)}\,dp_t$$

$$t^n_{p_0} = -\frac{1}{\pi_1+2\pi_2}\int_{p_0}^{p^*_N}\Big[\frac{1}{p_t} + \frac{1}{1-p_t}\Big]dp_t$$

$$\Rightarrow\; t^n_{p_0} = \frac{1}{\pi_1+2\pi_2}\big[\log[\Lambda(p^*_N)] - \log[\Lambda(p_0)]\big]$$

Let $t^p_{p_0}$ be the duration of experimentation along the risky line that a planner would have wanted if the firms start out from the prior $p_0$. Then from the equation of motion of $p_t$ in the planner’s problem we have

$$dp_t = -2(\pi_1+\pi_2)\,p_t(1-p_t)\,dt \;\Rightarrow\; dt = -\frac{1}{2(\pi_1+\pi_2)}\cdot\frac{1}{p_t(1-p_t)}\,dp_t$$

$$\Rightarrow\; t^p_{p_0} = \frac{1}{2\pi_1+2\pi_2}\big[\log[\Lambda(p^*)] - \log[\Lambda(p_0)]\big]$$

We have excessive experimentation when $t^n_{p_0} > t^p_{p_0}$. This is the case when

$$\frac{1}{\pi_1+2\pi_2}\big[\log[\Lambda(p^*_N)] - \log[\Lambda(p_0)]\big] > \frac{1}{2\pi_1+2\pi_2}\big[\log[\Lambda(p^*)] - \log[\Lambda(p_0)]\big]$$

$$\Rightarrow\; \pi_1\log[\Lambda(p_0)] < 2(\pi_1+\pi_2)\log[\Lambda(p^*_N)] - (\pi_1+2\pi_2)\log[\Lambda(p^*)]$$

Since $\Lambda(p)$ is decreasing in $p$, the above inequality states that there exists a $p_0^* \in (0, 1)$ such that if $p_0 > p_0^*$ then the above inequality is satisfied. Also, since $p^* < p^*_N$, we have

$$\pi_1\log[\Lambda(p_0^*)] = \pi_1\log[\Lambda(p^*_N)] - (\pi_1+2\pi_2)\big[\log[\Lambda(p^*)] - \log[\Lambda(p^*_N)]\big] < \pi_1\log[\Lambda(p^*_N)]$$

$$\Rightarrow\; p_0^* > p^*_N$$

This concludes the proof.
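The comparison in the proof can be illustrated numerically. The sketch below computes the two durations from the expressions derived above and solves the equality case of the final inequality for the threshold prior. The parameter values and function names are assumptions; truncating durations at zero for priors below the relevant switching belief is also an assumption made here so that the functions are defined for every prior.

```python
import math


def log_lambda(p):
    """log of Lambda(p) = (1 - p) / p."""
    return math.log((1.0 - p) / p)


def planner_threshold(pi0, pi1, pi2, r):
    """Planner's switching belief p* from (2.3)."""
    return pi0 / (pi2 + 2.0 * pi1 * (pi2 - pi0) / (r + 2.0 * pi2))


def noncoop_threshold(pi0, pi1, pi2, r):
    """Equilibrium switching belief p*_N from Proposition 10."""
    return pi0 / (pi2 + pi1 / (r + 2.0 * pi2) * (pi2 - pi0) * r / (r + pi0))


def duration_noncoop(p0, pi0, pi1, pi2, r):
    """Time spent at the risky site in equilibrium, conditional on no arrival."""
    gap = log_lambda(noncoop_threshold(pi0, pi1, pi2, r)) - log_lambda(p0)
    return max(0.0, gap) / (pi1 + 2.0 * pi2)   # zero if the prior is at or below p*_N


def duration_planner(p0, pi0, pi1, pi2, r):
    """Time the planner keeps both firms at the risky site, conditional on no arrival."""
    gap = log_lambda(planner_threshold(pi0, pi1, pi2, r)) - log_lambda(p0)
    return max(0.0, gap) / (2.0 * pi1 + 2.0 * pi2)


def prior_threshold(pi0, pi1, pi2, r):
    """Prior p0* at which the two durations coincide."""
    log_n = log_lambda(noncoop_threshold(pi0, pi1, pi2, r))
    log_s = log_lambda(planner_threshold(pi0, pi1, pi2, r))
    log_l0 = (2.0 * (pi1 + pi2) * log_n - (pi1 + 2.0 * pi2) * log_s) / pi1
    return 1.0 / (1.0 + math.exp(log_l0))      # invert Lambda(p) = (1 - p) / p


# Illustrative (assumed) parameters.
pi0, pi1, pi2, r = 1.0, 1.0, 2.0, 0.5
p0_star = prior_threshold(pi0, pi1, pi2, r)
print("p0* =", p0_star)
for p0 in (0.45, 0.6, 0.9):
    print(p0, duration_noncoop(p0, pi0, pi1, pi2, r), duration_planner(p0, pi0, pi1, pi2, r))
```

For priors above the computed threshold the equilibrium duration exceeds the planner's, and for priors below it the ordering is reversed, in line with the statement of the proposition.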

In the non-cooperative equilibrium, distortion arises from two sources. One is what we call the implicit free-riding effect. This comes from the fact that if a firm experiences a private arrival of information, then the benefit from that arrival is also reaped by the competing firm. This is possible here because of instantaneous costless switching. In fact, if information arrival to firms were public, then the non-cooperative equilibrium would always involve free-riding. This follows directly from [37]. Thus this implicit free-riding effect tends to reduce the duration of experimentation along the risky line.

The other kind of distortion arises from the fact that information arrival is private and the probability that the opponent firm has experienced an arrival of information is directly proportional to the belief that the risky site is good. Conditional on no observation, this makes the movement of the belief sluggish. This results in an increase in the duration of experimentation along the risky line. The effect of distortion from the second (first) source dominates if the prior to start with is higher (lower).

This intuitively explains the result obtained in the above proposition.

2.4 Conclusion

This chapter has analysed a tractable model to explore the situation when there can be

private arrival of information. We show that there can be a non-cooperative equilibrium

where depending on the prior we can have both too much and too little experimentation

along the risky line. This result has been obtained under the assumption that firms can switch between sites without incurring any cost (revocable switching). It will be interesting to see how the results change if a firm, after switching to the safe site, is unable to revert to the risky site. This idea of irrevocable switching, and payoffs from interim results, will be addressed in my future research.


Chapter 3

Decentralised Bilateral Trading,

Competition for Bargaining

Partners and the law of one price

3.1 Introduction

In this chapter, we study price formation in a market with small numbers of buyers and

sellers, where transactions are bilateral, between a single buyer and a single seller. For a

broad range of variants of a dynamic bargaining game with many sellers and buyers, in

which only one side of the market makes offers, we find that, as the discount factor goes to

1, the stationary equilibrium prices in different transactions converge to a single value. A

dynamic version of “directed search” is one of the extensive forms discussed here, though

more attention is given to offers targeted to specific individuals on the other side.


3.1.1 Motivation for the problem studied

Most modern markets consist of a small number of participants on each side. These participants buy from and sell to each other, write contracts with each other and sometimes

merge with each other. The transactions in these markets are often bilateral in nature,

consisting of an agreement between a buyer and a seller or a firm and a worker. These

bilateral trades occur without any centralised pricing mechanism, in a series of bargains in

which the “outside options” for a current bargaining pair are, in fact, endogenously given

for each by the presence of alternative partners on the other side of the market. However,

these potential alternative partners, by their presence, implicitly compete with each other

and one question that arises naturally is whether the “competitive” pressure of the outside

options leads to an approximately uniform price for non-differentiated goods. It is this

basic question, about endogenous outside options and a uniform price, that this chapter

seeks to study, in the context of a particular set of extensive forms. We focus on complete

information. 1

Examples

Whilst the models we study are going to be highly stylised representations of these examples, they at least have some features in common with them. A standard example used in these settings is the housing market, for a given location and a given type of home (to reduce the extent of differentiation). Sellers list their houses; buyers visit, inspect and then convey their offers to the sellers, one offer from each buyer. Sellers can accept or reject the

offers they have; possibly they then make counter-offers or often wait for the buyer to come

back again with new higher offers. Whether counter-offers are made or not distinguishes

different extensive forms or bargaining protocols. The offers are privately made to sellers,

who typically do not know what other sellers receive.

1 An incomplete information analysis has been done in the next chapter.

Another example is of a firm being acquired. Here the potential acquirer makes a public targeted offer for a particular firm, which the shareholders of the potential acquisition have to accept or reject (based on a recommendation by the management). A rejection could lead to the acquirer raising its offer. There could be competition on both sides, perhaps another potential buyer called in by management of the target as a “white knight” and other possible targets with the same attractive characteristics as the one in play. In this particular context, it makes sense to think of offers as being one-sided, from the potential buyers, and publicly announced.

Private targeted offers occur in negotiations for joint ventures. For example, the book [2]

describes the joint venture talks between industrial gas companies and chemical companies

in the 1980s, in which the players were Air Products, Air Liquide and British Oxygen on

one side and DuPont, Dow Chemical and Monsanto on the other. After some bargaining,

two joint ventures and an acquisition resulted.

A fourth context, this time from the economics literature, occurs in the “directed search” models common in labour economics ([29], [47]). The game here consists of firms announcing wage offers simultaneously and workers deciding which offer to accept. If firms are constrained in the number of slots, sometimes not all workers who seek the job can be hired. The game is often modelled as one-stage; there are no dynamics of competing offers over time for the same potential workers. Our model, however, has this additional feature (of competition over time).

3.1.2 Main features of our model.

Our model begins from a setting of two buyers with common valuation v, two sellers

with valuations M,H and complete information about these values. We assume that

v > H > M > 0. We then extend the model by adding buyers, sellers and both to the

basic model. There is a one-time entry of players, at the beginning of the game, and a

buyer-seller pair who trade leave the market.

Players discount with a common discount factor δ ∈ (0, 1). We consider equilibria for high values of δ and consider the limit of equilibria as δ → 1. We also consider extensive forms with public and private targeted offers and “ex ante” public offers (as in directed search), using the terminology of Gale. All extensive forms we consider have two main features: offers are one-sided and offers are simultaneous. Simultaneous offers seem to us to be the right way to capture the essence of competition. Targeting an offer to one

individual on the other side of the market enables us to endogenise matching between

buyers and sellers as a strategic decision. Once the offers have been made, one per proposer,

recipients simultaneously accept or reject. A rejection ensures that the game continues to

the following period, where payoffs are discounted by δ.

Our main results, starting with the basic model, can be simply described. There is a unique stationary equilibrium outcome under complete information, involving non-degenerate mixed strategies for all players. As δ → 1, the mixed strategies collapse to a single price and the price in all matches goes to H. In equilibrium, there could be one-period delay with positive probability, but the cost of delay, of course, goes to 0 as δ → 1. The price H might be thought of as a competitive equilibrium price in the complete information setting.

The complete information asymptotic results extend to the general case for n buyers and n sellers (where n < ∞).2

In the next section, we discuss the relevant literature and compare our results to some

of the existing work.

3.1.3 Related literature.

We now qualitatively describe the existing literature and compare our model with it. The first attempts to obtain microfoundations for markets using bilateral bargaining were the papers by Rubinstein and Wolinsky [53], [8], Gale [25] and [26]. These papers were all concerned with large anonymous markets, in which players who did not agree in a given period are randomly and exogenously rematched in succeeding periods with someone they had never met before. Rubinstein and Wolinsky [53] and Gale [26] consider bargaining frictions given by discounting and characterise the limiting price as the discount factor goes to 1. The limiting price depends on exogenously given probabilities of being matched in the following period.

2 We have not checked for uniqueness of the stationary equilibrium outcome. Though the equilibrium in the general case has to consider cases not present in the basic model, the uniqueness result should extend, though proving it formally would involve details of a large number of special cases.

Rubinstein and Wolinsky [54] (see also [46], Chapter 9.2, 9.3 for an exposition of their

models) consider B buyers and S sellers, with B > S and both finite. They have models

in which a proposal is made and, if it is rejected, participants are rematched using an

exogenous matching technology. They take a frictionless trading environment. Rubinstein

and Wolinsky, in this paper, show that there could be multiple equilibria in prices even

though all buyers and all sellers are homogeneous. In their model, because there are more

buyers than sellers, the competitive price is the buyers’ valuation. However, non-stationary

equilibria with prices different from the competitive equilibrium also exist. Some additional assumptions ensure that the competitive solution is unique.

Gale and Sabourian [28] and Sabourian [55] use notions of strategic complexity to select

the competitive equilibrium in games of the kind studied by Rubinstein and Wolinsky, by

refining away non-stationary equilibria using the complexity concept.

Hendon and Tranaes [35], also following [54], study a market with two heterogeneous buyers and one seller, with random matching after termination, and show that there is no stationary subgame perfect equilibrium.

Chatterjee and Dutta [10] attempt a project similar to this one, also with public and

private targeted offers and ex ante offers, but both sides of the market are allowed to make

offers. It turns out that this difference with the current chapter is crucial. The paper

[10] does not, in general, obtain an asymptotically single price as δ → 1; under public

targeted offers, there is a pure strategy equilibrium and all pure strategy equilibria involve

two different prices. In general, the mixed strategy equilibria in the other models remain


non-degenerate even as δ → 1, unlike this chapter, even though the expected player payoffs

converge (except for public targeted offers).

To summarise, this current chapter differs from the existing literature by considering

one or more of the following: (i) Small numbers and strategic matching. (ii) Extensive

forms with different assumptions about whether offers are public or targeted and private.

(iii) Simultaneous offers. Despite this variety and the number of differences with the papers

mentioned above, the results we get are surprisingly consistent with an asymptotic single

price. It is clear that the fact that we consider one-sided rather than alternating offers has

much to do with this, and this might be considered one of the takeaways from this chapter,

namely that the intuition for the single price result holds broadly provided alternating

offers don’t push prices apart when buyer-seller valuations are heterogeneous.

In the next section, we discuss the basic model with two buyers and two sellers under

complete information. In Section 3, we consider extensions of the basic model, analyzing

the effects of adding a buyer or a seller. We also show, in this section, how to extend

the description of the equilibrium constructed in Section 2 to a setting where there are n

buyers and n sellers, for general finite n.

3.2 The basic framework

3.2.1 The model

Players and payoffs

In the basic model we address, there are two buyers and two sellers. As mentioned in

Section 3.1.2, there are two buyers B1 and B2 with a common valuation of v for the good

(the maximum this buyer is willing to pay for a unit of the indivisible good). There are

two sellers. Each of the sellers owns one unit of the indivisible good. Sellers differ in their

valuations (we can also interpret these as their costs of producing to order). One of the

sellers, (SM ) has a value of M for one or more units of the good. The other seller, (SH)


similarly has a value of H where

v > H > M > 0

This inequality implies that either buyer has a positive benefit from trade with either

seller. Alternative assumptions can be easily accommodated but are not discussed in the

chapter. In the basic complete information framework all these valuations are commonly

known. Finally, all players are risk neutral. Players (buyers or sellers) have a common discount factor δ where δ ∈ (0, 1). Suppose a buyer agrees on a price $p_j$ with seller $S_j$ in period $t$. Then the buyer has an expected discounted payoff of $\delta^{t-1}(v - p_j)$ and $S_j$ has the payoff of $\delta^{t-1}(p_j - j)$, where $j = M, H$.
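The payoff definitions can be written out as a minimal sketch. The valuations, the agreed price and the discount factor below are assumed numbers, chosen only to satisfy v > H > M > 0, and the function names are hypothetical.

```python
def buyer_payoff(v, price, t, delta):
    """Discounted payoff of a buyer with valuation v who trades at `price` in period t."""
    return delta ** (t - 1) * (v - price)


def seller_payoff(price, valuation, t, delta):
    """Discounted payoff of a seller with valuation M or H who trades at `price` in period t."""
    return delta ** (t - 1) * (price - valuation)


# Assumed illustrative numbers with v > H > M > 0.
v, H, M, delta = 10.0, 6.0, 4.0, 0.9
price = 7.0
for t in (1, 2):
    print(t, buyer_payoff(v, price, t, delta), seller_payoff(price, M, t, delta))
```

An agreement in period 1 is undiscounted, while the same agreement one period later is worth a fraction δ of that amount to both sides.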

We shall discuss the informational assumptions along with the extensive forms in the

next subsection.

The extensive form

We consider an infinite horizon multi-player bargaining game with one-sided offers. The

extensive form of the game is described as follows.

At each time point t = 1, 2, ... offers are made simultaneously by the buyers. The offers

are targeted. This means an offer by a buyer consists of a seller’s name (that is SH or

SM ) and a price at which the buyer is willing to buy the object from the seller he has

chosen. Each buyer can make only one offer per period. Two settings could be considered;

one in which each seller observes all offers made (public targeted offers) and one (private

offers) in which each seller observes only the offers she gets. (Similarly for buyers after

the offers have been made.) We shall focus on the first and argue that here it makes no

difference in the analysis of stationary equilibrium. A seller can accept at most one offer

she receives. Acceptances or rejections are simultaneous. Once an offer is accepted, the

trade is concluded and the trading pair leave the game. Leaving the game is publicly

observable. The remaining players proceed to the next period in which buyers again make

price offers to the sellers. As is standard in these games, time elapses between rejections

and new offers.

The analogue of directed search, public offers that are not targeted to specific individuals, is discussed in the extensions section.

We will not formally write out strategies, since this is a standard multi-stage game with

observable actions [24] . The main difference between the two extensive forms discussed in

the previous subsection is that, in public targeted offers, a seller’s response (and subsequent

actions by all players) can condition on the history of offers made to the other seller, in

addition to those she receives herself. In private offers, the only public history in each

period is the set of players remaining in the game. Each player has private histories as

well. Our equilibrium notions here will be standard, subgame perfect equilibrium for the

public targeted offers case and public perfect equilibrium for the second (to avoid having to consider and specify a player's beliefs about past and present offers that are not in his or her private history).3

3.2.2 Equilibrium in the basic model

Stationary equilibria

We consider stationary equilibria, that is, equilibria in which buyers when making offers

condition only on the set of players remaining in the game and the sellers, when responding,

condition on the set of players remaining and the offers made by the buyers. Clearly in the

private targeted offers model, the response of a seller can condition only on her own offer.

(We emphasise that this is not a restriction on strategies, only on the equilibria considered.)

These are therefore public perfect equilibria in the private offers game and particular subgame perfect equilibria in the targeted offers extensive form. We shall demonstrate that the equilibrium outcome we find in this way is the unique stationary equilibrium outcome.

We shall proceed in this subsection by showing that a candidate strategy profile does, in fact, constitute an equilibrium. In the next subsection, we shall show that the stationary equilibrium payoff vector is unique up to the choice of the buyer who makes an offer to both sellers.

3 See [6] for an example of the effect of such beliefs in a multilateral bargaining context.

The conjectured equilibrium is as follows:

1. Consider a game in which only two players, buyer Bi and seller Sj remain in the

market and wj denotes the valuation/cost of Sj . Then it is clear that (i) Bi offers wj and

(ii) that Sj accepts any offer at least as high as wj and rejects otherwise.

2. Now consider the four-player game4. We consider the following strategies:

(a) One of the buyers, B1 say, makes offers to each seller with positive probability and

the other buyer B2 makes an offer only to SM . Let q be the probability with which B1

offers to SH . B1 offers H to SH . B1 randomises an offer to SM , using a distribution F1 (·)

with support [pl, H], where pl is to be defined later. The distribution F1(·) consists of

an absolutely continuous part from pl to H and a mass point at pl. B2 randomises by

offering M to SM (with probability q′) and randomising his offers in the range [pl, H] using

an absolutely continuous distribution function F2. The distributions Fi(·) are explicitly

calculated later.

(b) The sellers’ strategies in the four-player game are as follows. SH accepts the highest

offer greater than or equal to H and rejects if all offers are less than H. SM accepts the

highest offer with a payoff from accepting at least as large as the expected continuation

payoff from rejecting it (to be calculated later).

3. The expected payoff of a buyer Bi in equilibrium is v −H. The expected payoff of

SH is 0 and that of SM is positive and is considered below.

Lemma 11 Suppose there exists a $p_l$ such that

$$p_l - M = \delta(E(y) - M),$$

where $y$ (a random variable) represents the maximum price offer to $S_M$ under the proposed strategies. Then the strategies in 1, 2 above constitute an equilibrium with

(i)

$$F_1(s) = \frac{(v-H)(1-\delta(1-q)) - q(v-s)}{(1-q)[(v-s) - \delta(v-H)]} \tag{3.1}$$

(ii)

$$F_2(s) = \frac{(v-H)(1-\delta(1-q')) - q'(v-s)}{(1-q')[(v-s) - \delta(v-H)]} \tag{3.2}$$

(iii)

$$q = \frac{(v-H)(1-\delta)}{(v-M) - \delta(v-H)} \tag{3.3}$$

(iv)

$$q' = \frac{(v-H)(1-\delta)}{(v-p_l) - \delta(v-H)} \tag{3.4}$$

4 Note that, since we start with the same number of players on both sides of the market and since players can leave only in pairs, any possible subgames will also have the same number of buyers and sellers.

Proof.

Since the proof is long, we relegate it to appendix (A.5).
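The expressions in Lemma 11 can be evaluated directly. The sketch below is only an illustration: it treats $p_l$ as a given input (the text pins it down separately), the parameter values are hypothetical, and the function name is ours. It checks numerically that both distributions reach 1 at H, that F2 has no mass at $p_l$, and that F1 has a mass point there, as the conjectured strategies require.

```python
def lemma11_objects(v, H, M, delta, p_l):
    """Evaluate q, q', F1 and F2 from (3.1)-(3.4) for a given p_l.

    p_l is an input here; in the text it is determined by the condition
    p_l - M = delta * (E(y) - M).
    """
    if not (v > H > p_l > M > 0 and 0 < delta < 1):
        raise ValueError("need v > H > p_l > M > 0 and 0 < delta < 1")
    c = (v - H) * (1 - delta)
    q = c / ((v - M) - delta * (v - H))            # (3.3)
    q_prime = c / ((v - p_l) - delta * (v - H))    # (3.4)

    def F1(s):                                     # (3.1)
        num = (v - H) * (1 - delta * (1 - q)) - q * (v - s)
        return num / ((1 - q) * ((v - s) - delta * (v - H)))

    def F2(s):                                     # (3.2)
        num = (v - H) * (1 - delta * (1 - q_prime)) - q_prime * (v - s)
        return num / ((1 - q_prime) * ((v - s) - delta * (v - H)))

    return q, q_prime, F1, F2


# Hypothetical parameter values; p_l = 5 is a placeholder, not the equilibrium value.
v, H, M, delta, p_l = 10.0, 6.0, 2.0, 0.9, 5.0
q, q_prime, F1, F2 = lemma11_objects(v, H, M, delta, p_l)
checks = {
    "F1(H) = 1": abs(F1(H) - 1.0) < 1e-9,
    "F2(H) = 1": abs(F2(H) - 1.0) < 1e-9,
    "F2(p_l) = 0, so no mass at p_l": abs(F2(p_l)) < 1e-9,
    "F1(p_l) > 0, so a mass point at p_l": F1(p_l) > 0,
}
```

Every entry of `checks` should be true for any admissible parameters, since these properties follow algebraically from (3.1) to (3.4).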

Lemma 12 There exists a unique $p_l \in (M, H)$ such that

$$p_l - M = \delta(E(y) - M),$$

where $E(y)$ is as defined before.

Proof. For any $x \in (M, H)$ let $F_1^x(\cdot)$, $F_2^x(\cdot)$, $q^x$, $q'^x$ and $E^x(y)$ be the expressions obtained from $F_1(\cdot)$, $F_2(\cdot)$, $q$, $q'$ and $E(y)$ respectively by replacing $p_l$ by $x$. Thus all we need to show is that there exists a unique $x^* \in (M, H)$ such that

$$x^* - M = \delta(E^{x^*}(y) - M).$$

We have

$$E^x(y) = q^x\left[q'^x M + (1 - q'^x)E_2^x(p)\right] + (1 - q^x)\left[q'^x E_1^x(p) + (1 - q'^x)E(\text{highest offer})\right],$$

where $E_i^x(p)$ is derived from $F_i^x(\cdot)$ ($i = 1, 2$) and is the expected price offer by the buyer $B_i$ when his offers are in the range $[x, H]$.

The following lemma shows that as $x$ increases by 1 unit, $E^x(y)$ increases by less than 1 unit.

Lemma 13

$$\frac{\partial E^x(y)}{\partial x} < 1$$

Proof. See appendix A.6 for the proof of the lemma.

Now we define the function $G(\cdot)$ as

$$G(x) = x - [\delta E^x(y) + (1 - \delta)M].$$

Differentiating $G(\cdot)$ with respect to $x$ we get

$$G'(x) = 1 - \delta\,\frac{\partial E^x(y)}{\partial x}.$$

From Lemma 13 we have

$$G'(x) > 0.$$

From the equilibrium strategies we know that $M < E^x(y) < H$ for any $x \in (M, H)$. Since $\delta \in (0, 1)$ we have

$$\lim_{x \to M} G(x) < 0 \quad \text{and} \quad \lim_{x \to H} G(x) > 0.$$

Since $G(\cdot)$ is a continuous and monotonically increasing function, using the Intermediate Value Theorem we can say that there exists a unique $x^* \in (M, H)$ such that

$$G(x^*) = 0 \;\Rightarrow\; x^* = \delta E^{x^*}(y) + (1 - \delta)M.$$

This $x^*$ is our required $p_l$. Thus we have

$$G(p_l) = 0 \;\Rightarrow\; p_l = (1 - \delta)M + \delta E(y).$$

Proposition 12 There exists a unique pl ∈ (M,H) such that strategies described above

constitute a subgame perfect equilibrium and,

pl = (1− δ)M + δE(y)

Proof. The proof directly follows from lemma 11 and lemma 12.
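The fixed-point argument can be illustrated numerically. The sketch below is not the author's procedure: it assumes the two buyers randomise independently, computes the expected maximum offer to SM through the identity E[y] = H − ∫ P(y ≤ s) ds over [M, H] (our own rearrangement, evaluated with a midpoint rule), and then locates the zero of G by bisection, relying on the sign pattern established in Lemma 12. Function names and parameter values are hypothetical.

```python
def expected_max_offer(x, v, H, M, delta, steps=4000):
    """Approximate E^x(y), the expected highest offer received by S_M."""
    c = (v - H) * (1 - delta)
    q = c / ((v - M) - delta * (v - H))           # (3.3)
    q_prime = c / ((v - x) - delta * (v - H))     # (3.4) with p_l replaced by x

    def F1(s):                                    # (3.1)
        num = (v - H) * (1 - delta * (1 - q)) - q * (v - s)
        return num / ((1 - q) * ((v - s) - delta * (v - H)))

    def F2(s):                                    # (3.2) with p_l replaced by x
        num = (v - H) * (1 - delta * (1 - q_prime)) - q_prime * (v - s)
        return num / ((1 - q_prime) * ((v - s) - delta * (v - H)))

    def cdf_max(s):
        # P(y <= s) for s in [x, H], assuming independent randomisation:
        # B1 either targets S_H (prob q) or offers at most s to S_M;
        # B2 either offers M (prob q') or offers at most s.
        return (q + (1 - q) * F1(s)) * (q_prime + (1 - q_prime) * F2(s))

    # On [M, x) the only possible value is y = M, which has probability q * q'.
    h = (H - x) / steps
    integral = h * sum(cdf_max(x + (k + 0.5) * h) for k in range(steps))
    return H - q * q_prime * (x - M) - integral


def G(x, v, H, M, delta):
    """G(x) = x - [delta * E^x(y) + (1 - delta) * M]."""
    return x - (delta * expected_max_offer(x, v, H, M, delta) + (1 - delta) * M)


def solve_p_l(v, H, M, delta, tol=1e-10):
    """Bisection on (M, H), using G < 0 near M and G > 0 near H (Lemma 12)."""
    lo, hi = M, H
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if G(mid, v, H, M, delta) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical parameter values.
v, H, M, delta = 10.0, 6.0, 2.0, 0.9
p_l = solve_p_l(v, H, M, delta)
```

Bisection is used rather than a derivative-based method because the proof supplies exactly what bisection needs: continuity, monotonicity and opposite signs at the two ends of the interval.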

Uniqueness of the stationary equilibrium outcome

In this section we will show that the outcome derived above is the unique stationary equilibrium outcome in this game, so that the expected payoff to each of the buyers is v − H.5 By outcome we mean the vector of payoffs obtained by the buyers and sellers. We will adopt the methodology of Shaked and Sutton [57].

5 In fact there is another stationary equilibrium where B2 offers to both the sellers with positive probability and B1 to SM only. The qualitative nature will be the same and the buyer with valuation vi obtains a payoff of vi − H. This does not necessarily mean that the price is H. However, we shall show this is true asymptotically, as δ → 1.

Let M∗ and m∗ be the maximum and the minimum payoffs6 obtained by a buyer in

any stationary equilibrium of the complete information game. Also let ΛH and ΛM be the

maximal stationary equilibrium payoffs for sellers SH and SM respectively.

Lemma 14 In any stationary equilibrium, when all four players are present, both buyers

cannot make offers to both sellers with positive probability.

Proof. In a stationary equilibrium in which both buyers offer to both sellers, each buyer must randomise his offer when offering to either seller. Given the buyers' behaviour, each seller accepts an offer (or the maximum of the received offers) if and only if the payoff from acceptance is at least as large as the discounted continuation payoff from rejection. This implies that in a stationary equilibrium we need not worry about deviations by the sellers.

Let $\bar{s}^M_i$ be the upper bound of the support of offers to $S_M$ from the buyer $B_i$, $i = 1, 2$. Let $\bar{s}^H_i$ be the upper bound of the support of offers to $S_H$ from the buyer $B_i$, $i = 1, 2$.

If $\bar{s}^H_1 \neq \bar{s}^H_2$ then the buyer with the higher upper bound (say $B_1$) can profitably deviate by offering $\bar{s}^H_1 - \varepsilon$ to $S_H$, where $\varepsilon > 0$ and $\bar{s}^H_1 - \varepsilon > \bar{s}^H_2$. Thus

$$\bar{s}^H_1 = \bar{s}^H_2 = \bar{s}^H.$$

By similar reasoning we can say that

$$\bar{s}^M_1 = \bar{s}^M_2 = \bar{s}^M.$$

Next we argue that we must have $\bar{s}^H = \bar{s}^M$. Suppose not. Without loss of generality let $\bar{s}^H > \bar{s}^M$. In this case one of the buyers can profitably deviate by offering $p$ to $S_M$ such that $\bar{s}^H > p > \bar{s}^M$. Thus we have

$$\bar{s}^H = \bar{s}^M = \bar{s}.$$

Let $q_2$ be the probability with which $B_2$ offers to $S_H$. Let $F^M_2(\cdot)$ and $F^H_2(\cdot)$ be the conditional distributions of offers by $B_2$ given that he makes offers to $S_M$ and $S_H$ respectively. Take $s \in [\underline{s}^M_1, \bar{s}] \cap [\underline{s}^H_1, \bar{s}]$, where $\underline{s}^M_1$ and $\underline{s}^H_1$ denote the lower bounds of the supports of $B_1$'s offers to $S_M$ and $S_H$. $B_1$'s indifference relation tells us that

$$(v - s)[q_2 + (1 - q_2)F^M_2(s)] + (1 - q_2)(1 - F^M_2(s))\delta(v - H) = (v - s)[(1 - q_2) + q_2 F^H_2(s)] + q_2(1 - F^H_2(s))\delta(v - M).$$

Since $\delta(v - M) \neq \delta(v - H)$, we have $(1 - q_2)(1 - F^M_2(s)) \neq q_2(1 - F^H_2(s))$. Without loss of generality we take

$$(1 - q_2)(1 - F^M_2(s)) > q_2(1 - F^H_2(s)) \;\Rightarrow\; (1 - q_2)(1 - F^M_2(\bar{s})) > q_2(1 - F^H_2(\bar{s})).$$

The above inequality suggests that $B_2$ puts a mass point at the upper bound of one of the supports. If not, then both $(1 - q_2)(1 - F^M_2(\bar{s}))$ and $q_2(1 - F^H_2(\bar{s}))$ are 0 and the above inequality is not satisfied. This implies that $B_1$ can profitably deviate.

6 We assume (without needing to) that the supremal and infimal payoffs are actually achieved.

Lemma 15 In any stationary equilibrium, when all four players are present , both buyers

cannot offer to SH with positive probability.

Proof. Clearly both offering to SH only is not possible in equilibrium. Similarly one of the

buyers offering to SH only and the other one making offers to both the sellers with positive

probability is not possible. In that case the buyer who is offering to both can profitably

deviate by offering M to SM . Thus if both are offering to SH it must be the case that

both are making offers to both the sellers with positive probability. From lemma 14 we

know that this is not possible in a stationary equilibrium. This concludes the proof.

Lemma 16 ΛH = 0

Proof. Suppose not. That is let it be the case that in a particular stationary equilibrium

SH obtains a strictly positive payoff (ΛH > 0). From Lemma 14 and Lemma 15 we know

that a single buyer is making this offer to SH . Since ΛH > 0, this buyer is offering xH

(where xH ≥ H + ΛH) with positive probability and his payoff is less than or equal to

v − xH .

Suppose this buyer deviates and makes an offer of x′H such that,

x′H = H + εΛH

where 0 < δ < ε < 1.

This offer will always be accepted by SH , irrespective of what the other seller’s strategy

is. This is because if she rejects this offer then next period she can at most obtain a payoff

of ΛH which is worth δΛH now. However by accepting this offer she gets εΛH > δΛH .

Since,

xH − x′H ≥ H + ΛH −H − εΛH

= ΛH(1− ε) > 0,

this deviation is profitable for the buyer. Thus we must have ΛH = 0 . This also tells

us that in a stationary equilibrium SH never gets an offer greater than H with positive

probability.

Lemma 17 In a stationary equilibrium, SM cannot get an offer greater than H with pos-

itive probability.

Proof. Suppose SM gets an offer H + Δ, Δ > 0, with positive probability. From Lemma 16 we know that SH never gets an offer greater than H in equilibrium. Thus the buyer making the above offer to SM can profitably deviate by offering H + λΔ (0 < λ < 1) to SH. Thus in equilibrium SM cannot get an offer greater than H with positive probability.

Lemma 18

m∗ ≥ v −H for i = 1, 2

Proof. From Lemma 16 and Lemma 17 we can posit that none of the sellers gets any offer

greater than H with positive probability. Thus in a stationary equilibrium buyers’ offers

are always in the interval [M,H]. Hence m∗ is bounded below by v −H. Thus,

m∗ ≥ v −H

Lemma 19

M∗ ≤ v −H for i = 1, 2

Proof. Suppose there exists a stationary equilibrium such that Bi obtains a payoff of M∗

such that M∗ > v −H.

(i) Consider the situation when the buyers play pure strategies. It must be true that

the offer made by Bi is accepted. Let p∗ be the equilibrium price offer by Bi. Since,

M∗ = v − p∗ > v −H

we have,

p∗ < H

This implies that this offer is accepted by seller SM .

Thus either Bj (j ≠ i) is offering to SH or it is offering a price lower than p∗ to SM.

In both cases Bj can profitably deviate by offering a price p to SM such that p∗ < p < H .

Hence it is not possible for Bi to obtain a payoff of M∗ > v − H in a stationary

equilibrium when both the buyers play pure strategies.

(ii) Suppose at least one of the buyers plays a non-degenerate mixed strategy. It is

easy to note that Bi cannot obtain a payoff of M∗ > v−H if he offers to SH with positive

probability. Thus we only need to consider the situations when Bi is offering to SM only.

Suppose both B1 and B2 are offering to SM only. There does not exist a stationary equilibrium where one of the buyers plays a pure strategy. Thus both B1 and B2 play mixed strategies. It is trivial to check that in equilibrium the supports of their offers have to be the same. Let $[\underline{s}, \bar{s}]$ be the common support of their offers, where $\underline{s} \geq M$. Since $B_i$ obtains a payoff higher than $v - H$ we must have $\bar{s} < H$. Let $F_j(\cdot)$ be the distribution7 of offers by $B_j$, where $j = 1, 2$ and $j \neq i$. Thus for any $s \in [\underline{s}, \bar{s}]$ we have

$$(v - s)F_j(s) + (1 - F_j(s))\delta(v - H) = M^* \;\Rightarrow\; F_j(s) = \frac{M^* - \delta(v - H)}{(v - s) - \delta(v - H)}.$$

Since $F_j(s)$ is always positive, $B_j$ puts a mass point at $\underline{s}$. From Lemma 18 we know that $m^* \geq v - H$. Thus by applying similar reasoning we can show that $B_i$ also puts a mass point at $\underline{s}$.

We will show that $B_i$ can profitably deviate. Suppose $B_i$ shifts the mass from $\underline{s}$ to $\underline{s} + \varepsilon$, where $\varepsilon > 0$ is small enough. The change in payoff of $B_i$ is given by

$$\Delta_\varepsilon = F_j(\underline{s} + \varepsilon)(v - (\underline{s} + \varepsilon)) - \frac{F_j(\underline{s})}{2}(v - \underline{s}). \tag{3.5}$$

We will show that for small values of $\varepsilon$ the above change in payoff is positive. For $\varepsilon > 0$, from (3.5) we have

$$\Delta_\varepsilon = [F_j(\underline{s}) + \varepsilon F'_j(x)](v - (\underline{s} + \varepsilon)) - \frac{F_j(\underline{s})}{2}(v - \underline{s}),$$

where $x \in (\underline{s}, \underline{s} + \varepsilon)$. This implies

$$\Delta_\varepsilon = F_j(\underline{s})(v - \underline{s}) + \varepsilon F'_j(x)(v - \underline{s}) - \varepsilon F_j(\underline{s}) - \varepsilon^2 F'_j(x) - \frac{F_j(\underline{s})}{2}(v - \underline{s})$$

$$= F_j(\underline{s})\left(\frac{v - \underline{s}}{2} - \varepsilon\right) + \varepsilon F'_j(x)(v - \underline{s}) - \varepsilon^2 F'_j(x).$$

For $\varepsilon$ small enough we have $\varepsilon^2 F'_j(x) \approx 0$. Thus

$$\Delta_\varepsilon = F_j(\underline{s})\left(\frac{v - \underline{s}}{2} - \varepsilon\right) + \varepsilon F'_j(x)(v - \underline{s}) > 0.$$

This shows that $B_i$ has a profitable deviation.

7 We assume that $F_j(\cdot)$ is differentiable.

Next, consider the case when Bi offers to SM and Bj offers to SH . If Bi is playing a

pure strategy then his offer must be less than H. If Bi is playing a mixed strategy then the

upper bound of the support must be less than H. In both cases Bj can profitably deviate.

Lastly, consider the case when Bi is offering to SM and Bj is offering to both the

sellers. If Bi obtains a payoff of M∗ > v −H then the upper bound of the support of his

offers must be less than H. Since the other buyer is offering to SH , his payoff is bounded

above by v −H. This implies that Bj can profitably deviate.

Hence from the above arguments we can infer that,

M∗ ≤ v −H (3.6)

Proposition 13 The outcome implied by the asymmetric equilibrium of Proposition 12 is

the unique stationary equilibrium outcome of the basic game.

Proof. From Lemma 18 and Lemma 19 we have,

M∗ ≤ v −H ≤ m∗ (3.7)

By construction we have,

m∗ ≤M∗

This implies that,

M∗ = v −H = m∗

This concludes the proof.

We conclude the discussion of uniqueness by noting that, in proving the stationary equilibrium outcome to be unique, we have never used the fact that each seller, while responding, observes the other seller's offer. Thus the same analysis holds in the private offers model. Hence the outcome implied by the stationary equilibrium8 of the targeted offers model is the unique public perfect equilibrium outcome of the basic complete information game with private targeted offers.

Asymptotic characterisation

We now determine the limiting equilibrium outcome when the discount factor δ → 1.

From (3.3) we know that the probability with which the buyer B1 offers to SH is given by

$$q = \frac{(v - H)(1 - \delta)}{(v - M) - \delta(v - H)}. \tag{3.8}$$

From (3.8) it is clear that as δ → 1, q → 0.
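To spell out this limit (our own elaboration, using only the expression for q and the assumption v > H > M): the numerator vanishes while the denominator tends to a strictly positive constant,

$$\lim_{\delta \to 1}(v - H)(1 - \delta) = 0, \qquad \lim_{\delta \to 1}\left[(v - M) - \delta(v - H)\right] = H - M > 0, \qquad \text{so} \qquad \lim_{\delta \to 1} q = \frac{0}{H - M} = 0.$$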

From Section 3.2.2 recall the equation

$$G(x) = x - [\delta E^x(y) + (1 - \delta)M].$$

Since the fixed point $x^*$ is a function of $\delta$, we denote it by $x^*(\delta)$.

Lemma 20 There exists a $\delta^* \in (0, 1)$ such that for any $\delta \in (\delta^*, 1)$, the fixed point $x^*(\delta)$ is bounded above by $\delta H$.

Proof. We know that for any $\delta \in (0, 1)$, $\lim_{x \to H} G(x) > 0$. Since the function $G(x)$ is continuous and monotonically increasing in $x$, there exists a $\delta^* \in (0, 1)$ such that $G(\delta H) > 0$ for all $\delta \in (\delta^*, 1)$. Thus for any $\delta \in (\delta^*, 1)$, the fixed point $x^*(\delta)$ is bounded above by $\delta H$.

8 This is the same as the one described for the public targeted offers model.

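Lemma 21 As δ → 1, q′ → 0.

Proof. We have

$$q' = \frac{(v - H)(1 - \delta)}{(v - p_l) - \delta(v - H)} = \frac{1}{\dfrac{v}{v - H} + \dfrac{\delta H - p_l}{(1 - \delta)(v - H)}},$$

where $p_l = x^*(\delta)$.

From Lemma 20 we have $\delta H - p_l > 0$. Thus we have

$$q' \to 0 \text{ as } \delta \to 1.$$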

Proposition 14 As δ → 1, pl → H.

Proof. The offers from B2 to SM in the range $[p_l, H]$ follow the distribution function

$$F_2(s) = \frac{(v - H)[1 - \delta(1 - q')] - q'(v - s)}{(1 - q')[v - s - \delta(v - H)]} \;\Rightarrow\; 1 - F_2(s) = \frac{H - s}{(1 - q')[v - s - \delta(v - H)]}.$$

Note that

$$1 - F_2(H) = 0.$$

From Lemma 21 we know that as δ → 1, q′ → 0. Thus as δ → 1, for $s$ arbitrarily close to $H$ we have

$$1 - F_2(s) \approx \frac{H - s}{H - s} = 1.$$

Thus the support of the distribution $F_2$ collapses. This implies that as δ → 1, $p_l \to H$.

This shows that as agents become patient enough, the unique stationary equilibrium

outcome of the basic complete information game implies that in presence of all players

both the buyers almost surely offer H to seller SM . Hence although trading takes place

through decentralised bilateral interactions, asymptotically we get a uniform price for a

non-differentiated good.

Stationary equilibrium for ex ante public offers/directed search

We intend to find a stationary equilibrium of this (modified) extensive form. The qualitative

nature of the equilibrium, analogous to the one we have studied before, is as follows. One

of the buyers B1 randomises between posting a price of H and posting something less than

H. He randomises his prices if offering less than H. The other buyer B2’s posted price is

randomised along a support whose upper bound is H.

In order to describe the candidate equilibrium, we note that the two player game (one

buyer-one seller) is identical to that in the targeted offers model. We consider only the

four-player game. Consider the following strategies:

(a) One of the buyers, B1 say, puts a mass of q at H and a continuous distribution of

offers, (1−q)F1(.) from pl to H, where pl will be defined later. The conditional distribution

F1(.) consists of an absolutely continuous part from pl to H and a mass point at pl. B2,

on the other hand randomises his posts by putting a mass point at $p'_l$ and an absolutely continuous part $F_2(\cdot)$ from $p_l$ to $H$, with $p'_l < p_l$. The price $p'_l$ is defined as

$$p'_l = \frac{M + H}{2}. \tag{3.9}$$

The distributions $F_i(\cdot)$ will be explicitly calculated.

(b) The sellers’ strategies in the four-player game are as follows:

Suppose $p_1$ and $p_2$ are the posted prices such that $M \leq p_1 \leq p_2$. If $p_2 \geq H$ then SM accepts $p_1$ ($p_2$) if $p_1 \geq \frac{M + p_2}{2}$ ($p_1 < \frac{M + p_2}{2}$). If $p_2 < H$ then SM accepts $p_2$ only if the payoff from accepting it is at least as large as the continuation payoff from rejecting it. SH accepts $p_2$ provided $p_2 \geq H$.

2. The expected payoff of a buyer i in equilibrium is v−H. The expected payoff of SH

is zero and that of SM is positive.

Lemma 22 Suppose there exists $p_l \in (p'_l, H)$ such that

$$p_l - M = \delta(E(y) - M),$$

where $y$ (a random variable) represents the highest price offer ≤ H under the proposed strategies. Then the proposed strategies constitute an equilibrium with

(i)

$$F_1(s) = \frac{(v - H)(1 - \delta(1 - q)) - q(v - s)}{(1 - q)[(v - s) - \delta(v - H)]}$$

(ii)

$$F_2(s) = \frac{(v - H)(1 - \delta(1 - q')) - q'(v - s)}{(1 - q')[(v - s) - \delta(v - H)]}$$

(iii)

$$q = \frac{(v - H)(1 - \delta)}{(v - p'_l) - \delta(v - H)}$$

(iv)

$$q' = \frac{(v - H)(1 - \delta)}{(v - p_l) - \delta(v - H)}$$

Proof. The proof is identical to the proof of Lemma 11, if we replace M by $p'_l$.

In the next lemma we will show that for sufficiently high values of δ there exists a

unique pl in the open interval (p′l, H)

Lemma 23 There exists a δ∗ ∈ (0, 1) such that for all δ > δ∗, there exists a unique

pl ∈ (p′l, H) that satisfies,

pl = δE(y) + (1− δ)M

Proof. Refer to appendix A.7

Asymptotic characterisation for ex ante public offers/directed search

In the public offers model, as δ → 1, pl → H. Thus as agents become patient enough

we get a uniform price for the non-differentiated goods. Since the proof of this is almost

identical to the proof of Proposition 14 we omit it.

Note that the different versions of the extensive form give similar equilibria and the same asymptotic result, provided offers are one-sided.

3.2.3 Adding a seller

We now consider the effect of adding a seller to the basic complete information model.

(i) Suppose the three sellers have different valuations, i.e H, M and L with,

v > H > M > L

In this case the seller with valuation H will be irrelevant. This is because we have already described an equilibrium with 2 buyers and 2 sellers (sellers having different valuations); applied to the two lower-valuation sellers, with valuations M and L, it guarantees each buyer a payoff of v − M. Since SH will not accept any price lower than H, buyers will simply ignore SH. Hence in this case the unique stationary equilibrium outcome will be the same as in the 2 buyers, 2 sellers case.

(ii) If two of the sellers have valuations M and one of them has valuation H, where

M < H, then it is easy to see that each buyer offering M to each of the sellers SM

constitutes an equilibrium. In this case each of the buyers gets a payoff of v−M . Intuitively,

it seems that this gives the unique equilibrium payoff.

(iii) Lastly consider the case when two of the sellers have valuation H and one has

valuation M . In this situation the stationary equilibrium of the 2 buyer, 2 sellers case

will be applicable. We can assume that one of the H sellers is randomly chosen at the

beginning of the game.

3.2.4 Heterogeneous buyers

Suppose, in the basic model, buyers too are heterogeneous. That is, buyer Bi has a

valuation of vi where,

v1 > v2 > H > M

Analysis of the basic model holds good. 9

We conclude this subsection by providing an example to show that even if there is

potential of trade for both the sellers, such trades need not take place in the equilibrium

of our model. Suppose there are two buyers with valuation v1 and v2 and two sellers with

valuations H and M such that

M < v2 < H < v1

In equilibrium, both the buyers offer v2 to the seller with valuation M and the trade

takes place between the M -seller and the v1-buyer. (If, in equilibrium, the v2 buyer were

concluding the trade with positive probability, the v1 buyer would offer ε > 0 more and

have a profitable deviation.) Note that, in this case, any price between v2 and H would be

a competitive equilibrium in which the demand and supply would equate.
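The competitive-equilibrium claim in this example can be checked by counting demand and supply at a candidate price. The sketch below uses hypothetical valuations satisfying M < v2 < H < v1 and, as a simplifying assumption, ignores agents who are exactly indifferent at the price.

```python
def demand_and_supply(price, buyer_values, seller_values):
    """Unit demand and supply at `price`.

    A buyer demands one unit if his valuation strictly exceeds the price;
    a seller supplies one unit if the price strictly exceeds her valuation.
    Indifferent agents are ignored (an assumption of this sketch).
    """
    demand = sum(1 for value in buyer_values if value > price)
    supply = sum(1 for value in seller_values if value < price)
    return demand, supply


# Hypothetical valuations with M < v2 < H < v1.
M, v2, H, v1 = 2.0, 4.0, 6.0, 8.0
buyers, sellers = [v1, v2], [M, H]

# Prices strictly between v2 and H: one unit demanded, one unit supplied.
clearing = [demand_and_supply(p, buyers, sellers) for p in (4.5, 5.0, 5.9)]
# A price below v2 leaves excess demand; a price above H leaves excess supply.
below = demand_and_supply(3.0, buyers, sellers)
above = demand_and_supply(7.0, buyers, sellers)
```

In the bargaining equilibrium described above the trade occurs at v2, the lower end of this interval.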

3.2.5 Adding a buyer

This analysis has been done in the generalised section for homogeneous buyers.

9 The generalisations in ensuing sections are in terms of homogeneous buyers. Heterogeneity in buyer valuations can be accommodated in the section on n buyers and n sellers. We have not been able to incorporate heterogeneity in the case of more buyers than sellers.

3.2.6 Generalisation 1: n buyers and n sellers

Players and payoffs

There are n buyers (n > 2 and n finite) and n sellers. Each buyer’s maximum willingness

to pay for a unit of an indivisible good is v. Each of the sellers owns one unit. Sellers differ

in their valuations. We denote seller $S_j$'s valuation ($j = 1, \dots, n$) by $u_j$, where

$$v > u_n > u_{n-1} > \dots > u_2 > u_1.$$

The above inequality implies that any buyer has a positive benefit from trade with any

of the sellers. All players are risk neutral. Hence the expected payoffs obtained by the

players in any outcome of the game are identical to that in the basic model.

The extensive form

This is identical to the one in the basic complete information game. We first consider the

infinite horizon, public and targeted offers game where the buyers simultaneously make

offers and each seller either accepts or rejects an offer directed towards her. Matched pairs

leave the game and the remaining players continue the bargaining game with the same

protocol.

Equilibrium

We seek, as usual, to find a stationary equilibrium. Thus buyers’ offers at a particular

time point depend only on the set of players remaining and the sellers’ responses depend

on the set of players remaining and the offers made by the buyers. Since we start out with

equal numbers of buyers and sellers, any possible subgame will have that. Depending on

the parametric values we can have three types of equilibria. However, as δ becomes greater

than a threshold value, there is only one type of characterisation.

First, for notational convenience, we re-label $u_1 = L$ and $u_n = H$. From the basic complete information game, for each $i = 1, \dots, n - 1$, we calculate $p_i$ such that

$$p_i = (1 - \delta)u_i + \delta E(y_i), \tag{3.10}$$

where $E(y_i)$ is defined as the equilibrium expected maximum price offer which $S_i$ gets in the four-player game with $S_i$ and $S_n$ as the sellers and two buyers with valuation $v$.10

For each $i = 1, \dots, n - 1$ we define $\bar{q}_i$ as

$$\bar{q}_i = \frac{H - p_i}{(v - p_i) - \delta(v - H)} \tag{3.11}$$

and $q_H$ as

$$q_H = \frac{(v - H)(1 - \delta)}{(v - L) - \delta(v - H)}. \tag{3.12}$$

Let $P = \sum_{i=1}^{n-1} \bar{q}_i$. The following three propositions fully characterise the equilibrium behavior in the present game11. In all of them, sellers' strategies are as follows: (i) $S_n$ accepts any offer greater than or equal to $H$. (ii) Seller $S_i$ ($i = 1, \dots, n - 1$) accepts the highest offer with a payoff from accepting at least as large as the expected continuation payoff from rejecting it.

Proposition 15 If for δ ∈ (0, 1), $P < 1$ and $1 - P > q_H$, then the equilibrium is as follows:

(i) Buyer $B_1$ makes offers to $S_1$ only. $B_1$ puts a mass of $q'_1$ at $L$ and has a continuous distribution of offers $F_1(\cdot)$ with $[p_1, H]$ as the support. $B_n$ makes offers to $S_1$ with probability $q_1$. He randomises his offers to $S_1$ with a probability distribution $F^1_n(\cdot)$ with $[p_1, H]$ as the support. $F^1_n(\cdot)$ puts a mass point at $p_1$ and has an absolutely continuous part from $p_1$ to $H$. The distributions $F_1(\cdot)$, $F^1_n(\cdot)$, $q_1$ and $q'_1$ are given by:

$$F_1(s) = \frac{(v - H)[1 - \delta(1 - q'_1)] - q'_1(v - s)}{(1 - q'_1)[(v - s) - \delta(v - H)]} \tag{3.13}$$

$$F^1_n(s) = \frac{(v - H)[1 - \delta q_1] - (1 - q_1)(v - s)}{q_1[(v - s) - \delta(v - H)]} \tag{3.14}$$

$$q'_1 = \frac{(v - H)(1 - \delta)}{(v - p_1) - \delta(v - H)} \tag{3.15}$$

$$q_1 = \bar{q}_1 + (1 - P - q_H) \tag{3.16}$$

(ii) For $i = 2, \dots, n - 1$, $B_i$ makes offers to $S_i$ only. $B_i$'s offers to $S_i$ are randomised with a distribution $F_i(s)$. $F_i(\cdot)$ puts a mass point at $p_i$ and has an absolutely continuous part from $p_i$ to $H$. $B_n$ makes offers to $S_i$ ($i = 2, \dots, n - 1$) with probability $q_i = \bar{q}_i$. $B_n$'s offers to $S_i$ are randomised by an absolutely continuous probability distribution $F^i_n$ with $[p_i, H]$ as the support. For $i = 2, \dots, n - 1$, $F_i(\cdot)$ and $F^i_n(\cdot)$ are given by

$$F_i(s) = \frac{(v - H)(1 - \delta)}{(v - s) - \delta(v - H)} \tag{3.17}$$

$$F^i_n(s) = \frac{(v - H)[1 - \delta q_i] - (1 - q_i)(v - s)}{q_i[(v - s) - \delta(v - H)]} \tag{3.18}$$

(iii) $B_n$ offers to $S_n$ with probability $q_H$. He offers $H$ to $S_n$.

(iv) In equilibrium, all buyers obtain an expected payoff of $v - H$.

Proof. Refer to appendix (A.8).

10 Note that $p_i$ is given by the equilibrium of the appropriate four-player game, which has already been described earlier. It can essentially be treated as an exogenously given function of the parameters of the problem for the purposes of the n-player analysis.

11 Note that all quantities used in these propositions are defined with respect to the exogenously given parameters of the game.

Proposition 16 If for a δ ∈ (0, 1), $P < 1$ and $1 - P < q_H$, then the equilibrium is as follows:

(i) For $i = 1, 2, \dots, n - 1$, buyer $B_i$ makes offers to $S_i$ only. $B_i$'s offers to $S_i$ are random with a distribution $F_i(s)$. $F_i(\cdot)$ puts a mass point at $p_i$ and has an absolutely continuous part from $p_i$ to $H$. $B_n$ makes offers to $S_i$ ($i = 1, \dots, n - 1$) with probability $q_i = \bar{q}_i$. $B_n$'s offers to $S_i$ are random with an absolutely continuous probability distribution $F^i_n$ with $[p_i, H]$ as the support. For $i = 1, \dots, n - 1$, $F_i(\cdot)$ and $F^i_n(\cdot)$ are given by

$$F_i(s) = \frac{(v - H)(1 - \delta)}{(v - s) - \delta(v - H)} \tag{3.19}$$

$$F^i_n(s) = \frac{(v - H)[1 - \delta q_i] - (1 - q_i)(v - s)}{q_i[(v - s) - \delta(v - H)]} \tag{3.20}$$

(ii) $B_n$ offers to $S_n$ with probability $q_n = 1 - P$. He offers $H$ to $S_n$.

(iii) In equilibrium, all buyers obtain an expected payoff of $v - H$.

Proof. Refer to appendix (A.9).

Proposition 17 If $P \geq 1$, then the equilibrium is as follows:

For $i = 1, \dots, n - 1$, buyer $B_i$ makes offers to seller $S_i$ only. $B_i$'s offers to $S_i$ are randomised using a distribution function $F_i(\cdot)$, with $[p_i, \bar{p}]$ as the support. The distribution $F_i(\cdot)$ puts a mass point at $p_i$ and has an absolutely continuous part from $p_i$ to $\bar{p}$. Buyer $B_n$ offers to all sellers except $S_n$. $B_n$'s offers to $S_i$ ($i = 1, \dots, n - 1$) are randomised with a continuous probability distribution $F^i_n$. The support of offers is $[p_i, \bar{p}]$. The probability with which $B_n$ offers to $S_i$ ($i = 1, \dots, n - 1$) is $q_i$. If $P = 1$ then $\bar{p} = H$. If $P > 1$ then $\bar{p} < H$ and as δ → 1, $\bar{p} \to H$. In equilibrium, all buyers obtain an expected payoff of $v - \bar{p}$. The following relations formally define the equilibrium:

$$F_i(s) = \frac{(v - \bar{p}) - \delta(v - H)}{(v - s) - \delta(v - H)} \tag{3.21}$$

$$F^i_n(s) = \frac{(v - H)[1 - \delta q_i] - (1 - q_i)(v - s)}{q_i[(v - s) - \delta(v - H)]} \tag{3.22}$$

$$q_i = \frac{\bar{p} - p_i}{(v - p_i) - \delta(v - H)} \tag{3.23}$$

Further, if for $\delta = \delta^*$, $P > 1$, then for all $\delta > \delta^*$, $P > 1$ and $\bar{p} \to H$ as δ → 1.

Proof. Refer to appendix (A.10).
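Which of the three propositions applies depends only on P and qH, so the case distinction can be sketched as a small classifier. This is an illustration under stated assumptions: the values $p_i$ are taken as given inputs (the text treats them as determined by the corresponding four-player games), the function name and the numbers are hypothetical, and the boundary case 1 − P = qH, which the propositions do not cover, is reported as such.

```python
def classify_equilibrium(v, H, L, delta, p):
    """Return (label, P, q_H) for the n x n game.

    p is the list [p_1, ..., p_{n-1}] of lower bounds for sellers
    S_1, ..., S_{n-1}, taken here as given inputs.
    """
    if not (v > H > L and 0 < delta < 1):
        raise ValueError("need v > H > L and 0 < delta < 1")
    # (3.11): the probability defined for each seller S_i, i = 1, ..., n-1.
    q_bar = [(H - p_i) / ((v - p_i) - delta * (v - H)) for p_i in p]
    # (3.12)
    q_H = (v - H) * (1 - delta) / ((v - L) - delta * (v - H))
    P = sum(q_bar)
    if P >= 1:
        label = "Proposition 17"
    elif 1 - P > q_H:
        label = "Proposition 15"
    elif 1 - P < q_H:
        label = "Proposition 16"
    else:
        label = "boundary case 1 - P = q_H (not covered by the propositions)"
    return label, P, q_H


# Hypothetical inputs: v = 10, H = 6, L = 2, delta = 0.9.
two_by_two = classify_equilibrium(10.0, 6.0, 2.0, 0.9, [5.2727])
middle_case = classify_equilibrium(10.0, 6.0, 2.0, 0.9, [5.2727, 5.83])
crowded = classify_equilibrium(10.0, 6.0, 2.0, 0.9, [5.2727, 5.3])
```

With a single lower-valuation seller the classifier returns Proposition 15, in line with the remark in the text that the 2 × 2 case always satisfies P < 1 and 1 − P > qH.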

Proposition (17) tells us that as agents become patient enough, prices in all transactions

tend towards H.12 The following observation can be made about the asymptotic result. For

δ high enough, the prices tend towards the valuation of the highest seller, independently

of the distributions of the valuations of the other sellers. Hence even if the distribution of

the valuations of the sellers Si (i = 1, ..n − 1) is heavily skewed towards L, the uniform

asymptotic price will still be H.

Whilst the formal proofs of the above propositions are relegated to the appendix, we

provide a verbal description of the nature of the stationary equilibrium as follows.

It can be observed that in all of the above stationary equilibria, each buyer other than

Bn is assigned to one seller to make offers to: buyer Bi to seller Si. The remaining buyer (Bn)

offers to all (or all but one) of the sellers. This creates some competition among the buyers,

since each seller (except Sn) gets two offers with positive probability. The probability qH is

the probability with which Bn should offer to Sn in equilibrium if B1 puts a mass point at

u1(= L). The quantity qi is the probability with which Bn should offer to Si in equilibrium,

if Bi puts a mass point at pi and Bn offers to all the sellers. Further, in any stationary

equilibrium, a buyer who is assigned to a seller Sj has to put a mass point either at uj or

at pj . Hence, for a given δ, if Bn has to make offers to all the sellers then it is necessary

to have P < 1. Further if 1 − P > qH , then it is possible to have the buyer B1 put a

mass point at L; the equilibrium is then described by proposition (15). Otherwise the

equilibrium is described by proposition (16). On the other hand if P ≥ 1 it is not possible

to have Bn offering to all the sellers in equilibrium. In that case he offers to all but the

highest valued seller. The equilibrium is then described by proposition (17). In the 2 × 2

case, the conditions P < 1 and 1 − P > qH are satisfied for all values of δ ∈ (0, 1). This is

because in the 2 × 2 case $P = \frac{H-p_l}{(v-p_l)-\delta(v-H)}$, which is less than 1 for all values of δ ∈ (0, 1).

Further $1-P = \frac{(v-H)(1-\delta)}{(v-p_l)-\delta(v-H)} > q_H = \frac{(v-H)(1-\delta)}{(v-M)-\delta(v-H)}$ as pl > M. Hence the qualitative

nature of the equilibrium described in proposition (15) is identical to the one described in

the basic model. However for n > 2, the conditions satisfied by the 2 × 2 configuration

need not hold for all values of δ.

¹² We have seen earlier (in the 2 × 2 game analysis) that pi goes to H as δ → 1. In this proposition, we

show that p → H as δ → 1. Thus the supports of the randomised strategies also collapse as δ → 1.
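
The two inequalities claimed for the 2 × 2 case follow from a short piece of algebra. The block below only restates that step, using P = (H − pl)/[(v − pl) − δ(v − H)] and the fact that the denominator is positive because pl < H; it is a sketch of the implicit calculation rather than an additional result.

```latex
\begin{align*}
1 - P &= \frac{(v-p_l)-\delta(v-H)-(H-p_l)}{(v-p_l)-\delta(v-H)}
       = \frac{(v-H)(1-\delta)}{(v-p_l)-\delta(v-H)} > 0, \\
p_l > M &\;\Rightarrow\; (v-p_l)-\delta(v-H) < (v-M)-\delta(v-H)
       \;\Rightarrow\; 1 - P > \frac{(v-H)(1-\delta)}{(v-M)-\delta(v-H)} = q_H .
\end{align*}
```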

In proposition (17), the highest valued seller does not get any offer when all the players

are present. Hence the continuation game faced by a seller from rejection is always the same

irrespective of whether she gets one offer or two offers. A seller knows that by rejecting

all the offer(s) she will face a four-player game with Sn as the other seller and two buyers

with valuation v. Thus the seller Si,(i = 1, .., n− 1) knows the continuation game for sure

and this does not require her to observe the offers received by other sellers or the seller to

whom buyer Bn is making his offer. Since for high values of δ, P ≥ 1, we have the following

corollary:

Corollary 2 With private offers, Proposition (17) describes the equilibrium of the game

for high values of δ.

Heterogeneous buyers: Suppose the buyers are heterogeneous such that,

\[ v_N > v_{N-1} > \dots > v_2 > v > H > u_{N-1} > \dots > L \]

For each i = 1, .., n − 1, define

\[ p_i^h = (1-\delta)u_i + \delta E(y_i^h) \quad \text{and} \quad q_i^h = \frac{H-p_i^h}{(v_i-p_i^h)-\delta(v_i-H)}, \]

where $E(y_i^h)$ is defined as the equilibrium expected maximum price offer that Si gets in

the four-player game with Si and Sn as the sellers and two buyers with valuation vi and

vn. As before, let $P_h = \sum_{i=1,..,n-1} q_i^h$. Define $q_{Hh} = \frac{(v_1-H)(1-\delta)}{(v_1-L)-\delta(v_1-H)} \equiv q_H$, as v1 = v.

Proposition 18 With heterogeneous buyer valuations, analogues to propositions 15, 16

and 17 hold good for Ph < 1 and 1 − Ph > qH , Ph < 1 and 1 − Ph < qH and Ph ≥ 1

respectively. For Ph < 1 and 1 − Ph > qH the lowest-valued buyer with valuation v offers

to S1. The specifics, however, are slightly different (see appendix (A.11)). Also with private

offers, proposition (17) describes the equilibrium for high values of δ.

Remark 1 We omit the formal proof of the results for heterogeneous buyers since this

is very similar to those of the previous propositions. Here, we explain why in the case of

Ph < 1 and 1−Ph > qH the lowest-valued buyer with valuation v offers to S1, rather than

one of the others.¹³ In equilibrium, the buyer who is making offers to S1 puts a mass point

at the reservation value of that seller (i.e. at L). Since the buyer is indifferent between

offering L to S1 and making randomised offers in the range [p1, H], the probability (qH)

with which the buyer Bn makes offers to Sn must just make B1 indifferent among the offers

in the support of his randomised strategy.¹⁴ This gives qH as below.

\[ (v-L)q_H + (1-q_H)\delta(v-H) = v-H \;\Rightarrow\; q_H = \frac{(v-H)(1-\delta)}{(v-L)-\delta(v-H)} \]
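
The indifference condition that defines qH is easy to verify numerically. The following is a minimal sketch with hypothetical parameter values; the function names are illustrative and not part of the original analysis.

```python
def q_H(v, H, L, delta):
    """Probability with which B_n offers to S_n, from the indifference condition."""
    return (v - H) * (1 - delta) / ((v - L) - delta * (v - H))


def payoff_from_offering_L(q, v, H, L, delta):
    """B_1's payoff from offering L: accepted with probability q, otherwise continuation."""
    return (v - L) * q + (1 - q) * delta * (v - H)


# Hypothetical values satisfying v > H > L.
v, H, L, delta = 1.0, 0.6, 0.0, 0.9
q = q_H(v, H, L, delta)
print("q_H:", q)
print("payoff from offering L:", payoff_from_offering_L(q, v, H, L, delta))
print("equilibrium payoff v - H:", v - H)
```

The last two printed numbers coincide (up to floating-point error), which is exactly the statement that B1 is indifferent between offering L and randomising over the rest of the support.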

Buyer Bj (j ≠ 1; j ≠ n) makes randomised offers to the seller Sj with [pj , H] as the

support. First, it is easy to see that Bj cannot profitably deviate by making offers to Sk

(k ≠ j; k ≠ n) in the range [pk, H]. To ensure that the proposed strategies constitute an

equilibrium we need to show that this buyer with valuation vj (≠ v) has no incentive to

offer ui (or in the range (ui, pi)) to Si, i = 1, .., n − 1. First consider i = 2, .., n − 1. Since

offers are public,¹⁵ a seller with valuation ui will only accept an offer of ui (or something

in the range (ui,pi)) if the buyer Bn makes an offer to Sn. Hence, the payoff to the buyer

with valuation vj of making an offer of ui to Si is,

(vj − ui)qH + (1− qH)δ(vj −H)

¹³ This is a sufficient condition for the strategies described to be an equilibrium.

¹⁴ W.L.O.G. we assume that v1 = v.

¹⁵ Note that the equilibrium for private offers is described by a different proposition.

Define $q_{Hj}$ such that $(v_j-u_i)q_{Hj} + (1-q_{Hj})\delta(v_j-H) = v_j-H$. This implies $q_{Hj} = \frac{(v_j-H)(1-\delta)}{(v_j-u_i)-\delta(v_j-H)}$.

Since vj > v for all j ≠ 1 and ui > L for all i ≠ 1, we have

\[ q_{Hj} = \frac{(v_j-H)(1-\delta)}{(v_j-u_i)-\delta(v_j-H)} > \frac{(v-H)(1-\delta)}{(v-L)-\delta(v-H)} = q_H \]

Since (vj − ui) > δ(vj − H), (vj − ui)qH + (1 − qH)δ(vj − H) < (vj − H). The equilibrium

payoff to the buyer with valuation vj is (vj − H). This implies that the buyer has no

incentive to offer ui to seller Si. This also proves that for i = 1, the buyer Bj has no

incentive to offer L to S1. To see this note that $\frac{(v_j-H)(1-\delta)}{(v_j-L)-\delta(v_j-H)} > \frac{(v-H)(1-\delta)}{(v-L)-\delta(v-H)}$. Since B1

is also offering L to S1 with some positive probability the payoff to Bj by offering L to S1

is strictly less than (vj − L)qH + (1− qH)δ(vj −H) < vj −H. Hence Bj has no incentive

to offer anything in the range [ui, pi) to Si (i = 1, ..n− 1).
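
The no-deviation argument of Remark 1 can be illustrated on a small grid of valuations. This is a sketch only: the grid values are hypothetical, and the check covers the case in which the deviating offer of ui is accepted exactly when Bn offers to Sn (probability qH).

```python
def threshold_probability(v_j, u_i, H, delta):
    """q that makes a buyer with valuation v_j indifferent between offering u_i and getting v_j - H."""
    return (v_j - H) * (1 - delta) / ((v_j - u_i) - delta * (v_j - H))


def deviation_payoff(v_j, u_i, H, delta, q):
    """Payoff from offering u_i when the offer is accepted with probability q."""
    return (v_j - u_i) * q + (1 - q) * delta * (v_j - H)


# Hypothetical parameters: lowest buyer valuation v, highest and lowest seller valuations H and L.
v, H, L, delta = 1.0, 0.6, 0.0, 0.9
q_h = threshold_probability(v, L, H, delta)  # this is q_H

results = []
for v_j in (1.1, 1.5, 2.0):          # buyer valuations above v
    for u_i in (0.1, 0.3, 0.5):      # seller valuations in (L, H)
        q_hj = threshold_probability(v_j, u_i, H, delta)
        gain = deviation_payoff(v_j, u_i, H, delta, q_h) - (v_j - H)
        results.append((q_hj > q_h, gain < 0))

print(all(larger and unprofitable for larger, unprofitable in results))
```

On this grid every qHj exceeds qH and every deviation yields strictly less than the equilibrium payoff vj − H, in line with the inequality derived above.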

3.2.7 Generalisation 2: n buyers and n-1 sellers

Players and payoffs

We have n buyers and n − 1 sellers. The rest of the environment is as before with all

buyers having a common, known valuation v. Each of the sellers owns one unit of that

good. Sellers differ in their valuations. Seller Sj ( j = 1, .., n − 1) has a valuation of uj

such that,

v > un−1 > ... > u1

Thus any buyer has a positive benefit from trade with any seller. However, the number of

buyers is more than the number of sellers. Hence only n − 1 buyers can be served. The

payoffs of players are identical to those in the basic model.

The extensive form

The extensive form is the same as in the basic complete information game. Thus at each time

point we have simultaneous public targeted offers from the buyers only. Sellers respond by

either accepting or rejecting the offer(s). Matched pairs leave the game and the players

remaining move on to the next period. They continue the bargaining game according to

the same protocol.

Equilibrium

We will derive a stationary equilibrium of this extensive form. Thus buyers’ offers at any

time point depend only on the set of players remaining and the sellers’ responses depend

only on the set of players remaining, and the offers. Before we describe the equilibrium

of this game formally we will verbally discuss its nature. In equilibrium, if all the players

are present, buyer Bi (i = 1, ..., n − 1) makes offers to Si only. His offers are randomised

using a distribution function Fi(.), with [pi, p] (pi and p will be defined later)

as the support. Fi(.) puts a mass point at pi and has an absolutely continuous part from

pi to p. Buyer Bn makes offers to all the sellers with positive probability. Bn’s offers to

Si (i = 1, .., n − 1) are randomised using a probability distribution $F_n^i(.)$. The support of

offers is [pi, p].

For each i = 1, .., n − 1 we define pi as

\[ p_i = (1-\delta)u_i + \delta v \tag{3.24} \]

Let qi be the probability with which Bn offers to seller Si. The following proposition

now formally defines the equilibrium of the game.

Proposition 19 (i) The above conjectured strategies constitute a stationary equilibrium

of the present game with,

\[ F_i(s) = \frac{v-p}{v-s} \tag{3.25} \]

\[ F_n^i(s) = \frac{(v-p)-(1-q_i)(v-s)}{q_i(v-s)} \tag{3.26} \]

\[ q_i = \frac{p-p_i}{v-p_i} \tag{3.27} \]

\[ p = v - (n-2)\,\frac{\prod_{i=1,..,n-1}(v-p_i)}{\sum_{j=1,..,n-1}\left[\prod_{k=1,..,n-1;\,k\neq j}(v-p_k)\right]} \tag{3.28} \]

(ii) In equilibrium, each buyer obtains an expected payoff of (v − p).

Proof. First consider the buyer Bi (i = 1, .., n − 1). For s ∈ [pi, p] his indifference relation

is

\[ (v-s)\left[(1-q_i) + q_i F_n^i(s)\right] = v-p \]

Solving the above relation for $F_n^i(.)$ we get (3.26). Putting s = pi in Bi’s indifference

relation we obtain (3.27). It is easy to note that $F_n^i(p_i) = 0$ and $F_n^i(p) = 1$.

Next, consider the buyer Bn. The support of his offers to Si (i = 1, .., n − 1) is [pi, p].

For s ∈ [pi, p], Bn’s indifference relation is given by

\[ (v-s)F_i(s) = v-p \]

which gives us (3.25). Note that Fi(pi) > 0 and Fi(p) = 1. This confirms our conjecture

that Bi puts a mass point at pi.

To have consistency in the expressions obtained we must have,

\[ \sum_{i=1,..,n-1} q_i = 1 \;\Rightarrow\; \sum_{i=1,..,n-1} \frac{p-p_i}{v-p_i} = 1 \;\Rightarrow\; \sum_{i=1,..,n-1} \frac{v-p}{v-p_i} = n-2 \]

Rearranging the terms in the above relation we get (3.28).

Now we should check that the strategies constitute an equilibrium. First, observe that

on the equilibrium path if a seller Si rejects her offer(s) then next period she will face a

game with two buyers and one seller. This will give her a discounted payoff of δ(v − ui).

Hence her minimum acceptable price should be pi. From the analysis of the basic model one

can infer that on the equilibrium path, there is no profitable deviation for the players. The

way we have specified sellers’ strategies these always constitute best responses in any off-

path contingency. It is easy to check that buyers’ strategies also constitute best responses

in any off-path contingency. This concludes the proof.
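
The objects in Proposition 19 are straightforward to compute. The sketch below evaluates (3.24), (3.27) and (3.28) for one hypothetical set of seller valuations and checks that the offer probabilities of Bn sum to one. The function names and numbers are illustrative assumptions.

```python
from math import prod


def lower_prices(u, v, delta):
    """Minimum acceptable prices p_i = (1 - delta) * u_i + delta * v, as in (3.24)."""
    return [(1 - delta) * u_i + delta * v for u_i in u]


def upper_price(p_low, v):
    """Common upper end p of the supports, as in (3.28)."""
    gaps = [v - p_i for p_i in p_low]
    m = len(gaps)  # m = n - 1 sellers, so n - 2 = m - 1
    denominator = sum(
        prod(g for k, g in enumerate(gaps) if k != j) for j in range(m)
    )
    return v - (m - 1) * prod(gaps) / denominator


def offer_probabilities(p_low, p_bar, v):
    """q_i = (p - p_i) / (v - p_i), as in (3.27)."""
    return [(p_bar - p_i) / (v - p_i) for p_i in p_low]


def F_i(s, p_bar, v):
    """Offer distribution of B_i, as in (3.25)."""
    return (v - p_bar) / (v - s)


def F_n_i(s, q, p_bar, v):
    """Offer distribution of B_n towards S_i, as in (3.26)."""
    return ((v - p_bar) - (1 - q) * (v - s)) / (q * (v - s))


# Hypothetical example: n = 4 buyers, three sellers.
v, delta = 1.0, 0.9
u = [0.2, 0.4, 0.6]
p_low = lower_prices(u, v, delta)
p_bar = upper_price(p_low, v)
q = offer_probabilities(p_low, p_bar, v)
print("p_i:", p_low)
print("p:", p_bar)
print("q_i:", q, "sum:", sum(q))
```

For these particular numbers p lies above every pi, so each qi is a valid probability. The sketch does not verify that this holds for arbitrary parameter values.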

Remark 2 Note that irrespective of whether a seller gets one offer or two offers, the con-

tinuation game faced by her from rejection is the same. Hence the result of the proposition

(19) holds good for the case of private offers as well.

The Asymptotic Characterisation

We would like to analyze the equilibrium outcome discussed above as agents become patient

enough, i.e as δ → 1.

From (3.24) it is easy to observe that pi → v as δ → 1. Thus as δ → 1, (v − pi) → 0 for

i = 1, .., n − 1. This implies that the second term in (3.28) goes to zero as δ tends to one.

Hence,

p→ v as δ → 1

This implies that the distributions of the price offers by each buyer collapse to a single

value in the limit.

Thus as δ → 1, we tend to get a uniform price of v for the non-differentiated good.

This is equivalent to the Walrasian outcome of the present setup.
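
The rate at which p approaches v can be made explicit. The short derivation below only combines (3.24) with the consistency condition that the terms (v − p)/(v − pi) sum to n − 2, obtained in the proof of Proposition 19; it is offered as a sketch rather than as part of the original argument.

```latex
\begin{align*}
v - p_i &= (1-\delta)(v-u_i), \\
v - p &= \frac{n-2}{\sum_{i=1}^{n-1} \frac{1}{v-p_i}}
       = (1-\delta)\,\frac{n-2}{\sum_{i=1}^{n-1} \frac{1}{v-u_i}}
       \;\longrightarrow\; 0 \quad \text{as } \delta \to 1 .
\end{align*}
```

Read this way, the gap v − p is proportional to (1 − δ), with a constant that depends only on n and the sellers' valuations.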

3.3 Conclusion

This chapter has considered several different variants of a dynamic strategic matching and

bargaining game, with the common feature that only one side of the market makes offers.

Unlike other papers in the field, the offers are made simultaneously to capture competition.

We find that stationary equilibria give a single price asymptotically in all the transactions.

Previous work has shown that this conclusion is not true when buyers and sellers take

it in turns to make offers (a game of which the Rubinstein bargaining game is a special

case). Alternating offers with heterogeneity in valuations tends to drive valuations apart.

Other authors [17] have mentioned the difficulty of solving dynamic bargaining and

matching games with many players if there is heterogeneity of valuations on both sides,

though she was specifically concerned with alternating offers. This turns out mostly not to

be an issue for us, except in the one general case where sellers are on the short side, where

we have not been able to extend the basic analysis.

One interesting heterogeneity would be to consider settings in which the value of buyer

i for seller j’s good is vij , as in the housing market. In this setting it seems appropriate

to assume that sellers’ valuations do not depend upon the identity of the potential buyers.

This is kept for future research, though it seems feasible that techniques similar to the ones

used in this chapter would enable us to characterise equilibrium prices in such markets as

well.

Chapter 4

Decentralised Bilateral Trading in a Market with Incomplete Information

4.1 Introduction

This chapter attempts to study a small market in which one of the players has private

information about her valuation. As such, it is a first step in combining the literature

on incomplete information with that on market outcomes obtained through decentralised

bilateral bargaining.

We shall discuss the relevant literature in detail later on in the introduction. Here we

summarise the motivation for studying this problem.

One of the most important features in the study of bargaining is the role of outside

options in determining the bargaining solution. There have been several different ap-

proaches to this issue, starting with treating alternatives to the current bargaining game

as exogenously given and always available. Accounts of negotiation directed towards prac-

titioners and policy-oriented academics, like Raiffa’s masterly “The Art and Science of

Negotiation” ([50]), have emphasised the key role of the “Best Alternative to the Negotiated

Agreement” and mentioned the role of searching for such alternatives in preparing

for negotiations. Search for outside options has also been considered, as well as search for

bargaining partners, in a general coalition formation context.

Real world examples of such search for outside options abound. For example, firms

that receive (public) takeover bids seek to generate other (also public) offers in order to

improve their bargaining position. Takeovers are an instance also of public one-sided offers.

The housing market is another example; there is a given (at any time) supply of sellers and

buyers who are interested in a particular kind of house make (private) offers to the sellers

of the houses they are interested in, one at a time. (This is, for instance, the example used

in [36].)

Private targeted offers are prevalent in industry as well, for joint ventures and mergers.

For example, the book [2] is concerned with the joint venture negotiations in the 1980s,

in which Air Products, Air Liquide and British Oxygen were buyers and DuPont, Dow

Chemical and Monsanto were sellers (of a particular kind of membrane technology). The

final outcome of these negotiations was two joint ventures and one acquisition.

Proceeding more or less in parallel, there has been considerable work on bargaining with

incomplete information. The major success of this work has been the complete analysis of

the bargaining game in which the seller has private information about the minimum offer

she is willing to accept and the buyer, with only the common knowledge of the probability

distribution from which the seller’s reservation price is drawn, makes repeated offers which

the seller can accept or reject; each rejection takes the game to another period and time

is discounted at a common rate by both parties. With the roles of the seller and buyer re-

versed, this has also been part of the development of the foundations of dynamic monopoly

and the Coase conjecture. Other, more complicated models of bargaining have also been

formulated (including by one of us), with two-sided offers and two-sided incomplete infor-

mation, but these have not yielded the clean results of the game with one-sided offers and

one-sided incomplete information.

Whilst this need not necessarily be a reason for studying this particular game, it does

suggest that if we desire to embed bargaining in a more complex market setting with

private information, it is rational for us, the modellers, to minimise the extent of complexity

associated with the bargaining to focus on the changes introduced by adding endogenous

outside options, as we intend to do here.

Our model therefore takes the basic problem of a seller with private information and

an uninformed buyer and adds another buyer-seller pair; here the new seller’s valuation is

different from the informed seller’s and commonly known and the buyers’ valuations are

identical. Each seller has one good and each buyer wants at most one good. This is the

simplest extension of the basic model that gives rise to outside options for each player,

though unlike the literature on exogenous outside options, only one buyer can deviate from

the incomplete information bargaining to take his outside option with the other seller (if

this other seller accepts the offer).

In our model, buyers make offers simultaneously, each buyer choosing only one seller.¹

Sellers also respond simultaneously, accepting at most one offer. A buyer whose offer is

accepted by a seller leaves the market with the seller and the remaining players play the

one-sided offers game with or without asymmetric information. We consider both the cases

where buyers’ offers are public, so the continuation strategies can condition on both offers

in a given period, and private, when only the proposer and the recipient of an offer know

what it is and the only public information is the set of players remaining in the game.

Our analysis explores whether a Perfect Bayes Equilibrium similar to that found in the

two-player asymmetric information game continues to hold with alternative partners on

both sides of the market and with different conditions on observability of offers.

¹ Simultaneous offers extensive forms probably capture the essence of competition best.

The equilibrium we describe is a randomized behavioral strategy one (as in the two-

player game). As agents become patient enough, in equilibrium competition always takes

place for the seller whose valuation is commonly known. The equilibrium behavior of beliefs

is similar to the two-player asymmetric information game and the same across public and

private offers. However, the off-path behaviour sustaining this equilibrium is different and

has to take into account many more possible deviations. The path of beliefs also differs

once an out-of-equilibrium choice occurs. The case of private offers is quite interesting.

For example a buyer who offers to the informed seller might see his offer rejected but his

expectation that the other offer has been accepted is belied when he observes all players

remain in the market. He is then unsure of whether the other buyer has deviated and made

an offer to the informed seller, which the informed seller has rejected, or an offer to the

seller with commonly known valuation. The beliefs have to be constructed with some care

to make sure the play gets back to the equilibrium path. However, the beliefs used here

are not inherently implausible.

The interesting asymptotic characterisation obtained by taking the limit of the equilib-

rium prices, as the discount factor goes to 1, is that, despite the asymmetric information

and two heterogeneous sellers, the different distributions of prices collapse to a single price

that is consistent with an extended Coase conjecture.²

The intuition and the economics behind these results can be explained in the following

way. In the benchmark case when one of the sellers’ valuation is known to be H and

the other M , then in the Walrasian setting, there will be excess demand at any price

p ∈ (M,H). This is in essence what drives the prices to H. We model an explicit trading

protocol with simultaneous offers made by both buyers. As δ → 1, the offers converge to

H and the trade takes place immediately. For lower values of δ, the buyer can exploit the

fact that the seller will need to wait until she gets a new offer and hence buyers would be

able to capture some rents.

² The “Coase conjecture” relevant here is the bargaining version of the dynamic monopoly problem, namely that if an uninformed seller (who is the only player making offers) has a valuation strictly below the informed buyer’s lowest possible valuation, the unique sequential equilibrium as the seller is allowed to make offers frequently has a price that converges, as the frequency of offers becomes infinite, to the lowest buyer valuation. Here we show that even if one adds endogenous outside options for both players, a similar conclusion holds for an equilibrium that is common to both public and private offers; hence an extended Coase conjecture holds.

Next, let us move on to the private information case. From existing results we know the

solution for the sub-game where only the privately informed seller is left. This states that

as δ → 1, the price would converge to H and trade will take place immediately. Thus in the

limit the reservation price of the informed seller is H, regardless of her type. This explains

why we have an equilibrium in which there is immediate trade at a price of H. There could

be other equilibria in which the buyers essentially collude (though in a tacit manner).

In the two-player game, the Perfect Bayesian Equilibrium is unique in the “gap” case.

In our competitive setting, this is not true, at least for public offers. We include an example.

Related literature: The modern interest in this approach dates back to the seminal

work of Rubinstein and Wolinsky ([53], [54]), Binmore and Herrero ([8]) and Gale

([25], [26]). These papers, under complete information, mostly deal with random matching

in large anonymous markets, though Rubinstein and Wolinsky (1990) is an exception.

Chatterjee and Dutta ([10]) consider strategic matching in an infinite horizon model with

two buyers and two sellers and Rubinstein bargaining, with complete information. The

previous chapter analysed markets under complete information where the bargaining is

with one-sided offers.

There are several papers on searching for outside options, for example, Chikte and

Deshmukh ([16]), Muthoo ([44]), Lee ([41]), Chatterjee and Lee ([15]). Chatterjee and

Dutta ([11]) study a similar setting but with sequential offers by buyers. In the present work

we consider simultaneous offers, which is closer to the usual model of Bertrand competition.

We should emphasise that we consider an infinite horizon model, unlike one-stage Bertrand

competition.

A rare paper analysing outside options in asymmetric information bargaining is that

by Gantner ([30]), who considers such outside options in the Chatterjee-Samuelson ([14])

model. Our model differs from hers in the choice of the basic bargaining model and in the

explicit analysis of a small market with both public and private targeted offers.

Some of the main papers in one-sided asymmetric information bargaining are the well-known

ones of Sobel and Takahashi ([56]), Fudenberg, Levine and Tirole ([23]), Ausubel

and Deneckere ([4]). The dynamic monopoly papers mentioned before are the ones by

Gul and Sonnenschein ([32]) and Gul, Sonnenschein and Wilson ([33]). See also the review

paper of Ausubel, Cramton and Deneckere ([5]).

There are papers in very different contexts that have some of the features of this model.

For example, Swinkels [60] considers a discriminatory auction with multiple goods, private

values (and one seller) and shows convergence to a competitive equilibrium price for fixed

supply as the number of bidders and objects becomes large. We keep the numbers small, at

two on each side of the market. Horner and Vieille [36] consider a model with one informed

seller, two buyers with correlated values who are the only proposers and both public and

private offers. They show that, in their model unlike ours, public and private offers give

very different equilibria; in fact, public offers could lead to no trade.

Outline of rest of the chapter. The rest of the chapter is organised as follows.

Section 2 discusses the model in detail. The qualitative nature of the equilibrium and its

detailed derivation is given in section 3. The asymptotic characteristics of the equilibrium

are obtained in Section 4. Section 5 discusses the possibility of other equilibria. Finally,

section 6 concludes the chapter.

4.2 The Model

4.2.1 Players and payoffs

The setup we consider has two uninformed homogeneous buyers and two heterogeneous

sellers. Buyers (B1 and B2 ) have a common valuation of v for the good (the maximum

willingness to pay for a unit of the indivisible good). There are two sellers. Each of the

sellers owns one unit of the indivisible good. Sellers differ in their valuations. The first

seller (SM ) has a reservation value of M which is commonly known. The other seller (SI)

has a reservation value that is private information to her. SI ’s valuation is either L or H,

where,

v > H > M > L

We assume that L = 0, for purposes of reducing notation. It is commonly known by

all players that the probability that SI has a reservation value of L is π ∈ (0, 1). It is

worthwhile to mention that M ∈ [L,H] constitutes the only interesting case. If M < L (or

M > H) then one has no ambiguity about which seller has the lowest reservation value.

Although our model analyses the case of M ∈ (L,H), the same asymptotic result will be

true for M ∈ [L,H] (even though the analytical characteristics of the equilibrium for δ < 1

are different).

Players have a common discount factor δ ∈ (0, 1). If a buyer agrees on a price pj with

seller Sj at a time point t, then the buyer has an expected discounted payoff of $\delta^{t-1}(v-p_j)$.

The seller’s discounted payoff is $\delta^{t-1}(p_j-u_j)$, where uj is the valuation of seller Sj .

4.2.2 The extensive form

This is an infinite horizon, multi-player bargaining game with one sided offers and dis-

counting. The extensive form is as follows:

At each time point t = 1, 2, .., offers are made simultaneously by the buyers. The offers

are targeted. This means an offer by a buyer consists of a seller’s name (that is SI or

SM ) and a price at which the buyer is willing to buy the object from the seller he has

chosen. Each buyer can make only one offer per period. Two informational structures will

be considered: one in which each seller observes all offers made (public targeted offers)

and one (private targeted offers) in which each seller observes only the offers she gets.

(Similarly for the buyers after the offers have been made: in the private offers case each

buyer knows his own offer and can observe who leaves the market.) A seller can accept

at most one of the offers she receives. Acceptances or rejections are simultaneous. Once

an offer is accepted, the trade is concluded and the trading pair leave the game. Leaving

the game is publicly observable (irrespective of public or private offers). The remaining

players proceed to the next period in which buyers again make price offers to the sellers.

As is standard in these games, time elapses between rejections and new offers.
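
One period of this protocol can be sketched in code. The acceptance rule used here (each seller accepts her highest offer if it meets a fixed threshold) is a hypothetical stand-in for the equilibrium acceptance strategies, and the names Offer and play_period are illustrative, not taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    buyer: str
    seller: str   # the targeted seller, e.g. "SI" or "SM"
    price: float


def play_period(offers, thresholds):
    """Resolve one period; return (accepted offers, remaining buyers, remaining sellers)."""
    buyers = [o.buyer for o in offers]
    if len(set(buyers)) != len(buyers):
        raise ValueError("each buyer can make only one offer per period")

    accepted = []
    for seller, threshold in thresholds.items():
        received = [o for o in offers if o.seller == seller]
        if not received:
            continue
        # A seller accepts at most one offer; here, the best one if it is high enough.
        best = max(received, key=lambda o: o.price)
        if best.price >= threshold:
            accepted.append(best)

    matched_buyers = {o.buyer for o in accepted}
    matched_sellers = {o.seller for o in accepted}
    remaining_buyers = [b for b in buyers if b not in matched_buyers]
    remaining_sellers = [s for s in thresholds if s not in matched_sellers]
    return accepted, remaining_buyers, remaining_sellers


# Both buyers target SM; the thresholds are hypothetical numbers.
offers = [Offer("B1", "SM", 0.50), Offer("B2", "SM", 0.55)]
thresholds = {"SM": 0.50, "SI": 0.60}
accepted, buyers_left, sellers_left = play_period(offers, thresholds)
print(accepted, buyers_left, sellers_left)
```

In this example SM trades with B2 and the pair leaves, so B1 and SI move on to the two-player continuation game. Whether offers are public or private affects what players can condition on, which this sketch does not model.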

4.3 Equilibrium

We will look for Perfect Bayes Equilibrium [24] of the above described extensive form. This

requires sequential rationality at every stage of the game given beliefs and the beliefs being

compatible with Bayes’ rule whenever possible, on and off the equilibrium path. The

PBE obtained is stationary in the sense that the strategies depend on the history only

to the extent to which it is reflected in the updated value of π (the probability that SI ’s

valuation is L). Thus at each time point buyers’ offers depend only on the number of

players remaining and the value of π. The sellers’ responses depend on the number of

players remaining, the value of π and the offers made by the buyers.

4.3.1 The Benchmark Case: Complete information

Before we proceed to the analysis of the incomplete information framework we state the

results of the above extensive form with complete information, the formal analysis of which

has been done in the previous chapter.

Suppose the valuation of SI is commonly known to be H. In that case there exists a

stationary equilibrium (an equilibrium in which buyers’ offers depend only on the set of

players present and the sellers’ responses depend on the set of players present and the offers

made by the buyers) in which one of the buyers (say B1) makes offers to both the sellers

with positive probability and the other buyer (B2) makes offers to SM only. Suppose E(p)

represents the expected maximum price offer to SM in equilibrium. Assuming that there

exists a unique pl ∈ (M,H) such that³

\[ p_l - M = \delta(E(p) - M), \]

the equilibrium is as follows:

1. B1 offers H to SI with probability q. With the complementary probability he makes

offers to SM . While offering to SM , B1 randomises his offers using an absolutely continuous

distribution function F1(.) with [pl, H] as the support. F1 is such that F1(H) = 1 and

F1(pl) > 0. This implies that B1 puts a mass point at pl.

2. B2 offers M to SM with probability q′. With the complementary probability his

offers to SM are randomised using an absolutely continuous distribution function F2(.)

with [pl, H] as the support. F2(.) is such that F2(pl) = 0 and F2(H) = 1.

Let us recollect a few things from the previous chapter. There exists a unique pl and the

outcome implied by the above equilibrium play constitutes the unique stationary equilib-

rium outcome.

Also as δ → 1,

q → 0 , q′ → 0 and pl → H

This means that as market frictions go away, we tend to get a uniform price in differ-

ent buyer-seller matches. In this chapter, we show a similar asymptotic result even with

incomplete information, with somewhat different analysis.

4.3.2 Equilibrium of the one-sided incomplete information game with

two players

The equilibrium of the whole game contains the analyses of the different two-player games

as essential ingredients. If a buyer-seller pair leaves the market after an agreement and

the other pair remains, we have a continuation game that is of this kind. We therefore

first review the features of the two-player game with one-sided private information and

one-sided offers.

³ Given the nature of the equilibrium it is evident that M (pl) is the minimum acceptable price for SM when she gets one (two) offer(s).

The setting is as follows. There is a buyer with valuation v, which is common knowledge. The seller’s valuation can be either H or L, where v > H > L = 0, and π denotes the probability that the seller is of type L. In each period, the buyer makes an offer and the (informed) seller responds by accepting or rejecting it. If the offer is rejected, then π is updated using Bayes’ rule and the game moves on to the next period, when the buyer again makes an offer. This process continues until an agreement is reached. The equilibrium of this game (as described in, for example, [21]) is as follows.

For a given δ we can construct an increasing sequence of probabilities, $d(\delta) = \{0, d_1, \ldots, d_t, \ldots\}$, so that for any π ∈ (0, 1) there exists a t ≥ 0 such that $\pi \in [d_t, d_{t+1})$. Suppose that at a particular point in time the play of the game so far and Bayes’ rule imply that the updated belief is π, and let t ≥ 0 be such that $\pi \in [d_t, d_{t+1})$. The buyer then offers $p_t = \delta^t H$. The H-type seller rejects this offer with probability 1. The L-type seller rejects this offer with a probability that implies, through Bayes’ rule, that the updated value of the belief is $\pi_u = d_{t-1}$. The cutoff points $d_t$ are such that the buyer is indifferent between offering $\delta^t H$ and continuing the game for a maximum of t periods from now, and offering $\delta^{t-1} H$ and continuing the game for a maximum of t − 1 periods from now. Thus t here means that the game will last for at most t periods from now. The maximum number of periods for which the game can last is given by N(δ). It is shown in [21] that N(δ) is uniformly bounded by a finite number N∗ as δ → 1.

Since we are describing a PBE for the game it is important that we specify the off-path

behavior of the players. First, the off-path behavior should be such that it sustains the

equilibrium play in the sense of making deviations by the other player unprofitable and

second, if the other player has deviated, the behavior should be equilibrium play in the

continuation game, given beliefs. We relegate the discussion of these beliefs to appendix

(A.12).


Given a π, the expected payoff to the buyer vB(π) is calculated as follows:

For π ∈ [0, d1), the two-player game with one-sided asymmetric information involves

the same offer and response as the complete information game between a buyer of valuation

v and a seller of valuation H. Thus we have

$$v_B(\pi) = v - H \quad \text{for } \pi \in [0, d_1).$$

For $\pi \in [d_t, d_{t+1})$ with t ≥ 1, we have

$$v_B(\pi) = (v - \delta^t H)\,a(\pi) + (1 - a(\pi))\,\delta\,v_B(d_{t-1}) \tag{4.1}$$

where a(π) is the equilibrium acceptance probability of the offer $\delta^t H$.

These values will be crucial for the analysis of the four-player game.

4.3.3 Equilibrium of the four-player game with incomplete information.

We now consider the four-player game. The complete-information benchmark case suggests

that there will be competition among the buyers for the more attractive seller, in the sense

that that seller will receive two offers with positive probability in equilibrium, whilst the

other seller will obtain at most one. However, the difference arises here because of the

private information of one of the sellers. Even if one pair of players has left the market,

a seller with private information has some power arising from the private information. In

fact, for δ high enough, this residual power of the informed seller leads, in equilibrium, to

competition taking place for the other seller (whose value is common knowledge), even if π

is relatively high. The main result of this chapter is described in the following proposition.

Proposition 20 There exists a δ∗ ∈ (0, 1) such that if δ > δ∗ then for all π ∈ [0, 1) there exists a stationary equilibrium as follows (for both public and private offers):


(i) One of the buyers (say B1) will make offers to both SI and SM with positive proba-

bility. The other buyer B2 will make offers to SM only.

(ii) B2 while making offers to SM will put a mass point at p′l(π) and will have an

absolutely continuous distribution of offers from pl(π) to p(π) where p′l(π) (pl(π)) is the

minimum acceptable price to SM when she gets one(two) offer(s). For a given π, p(π) is

the upper bound of the price offer SM can get in the described equilibrium (p′l(π) < pl(π) <

p(π)). B1 while making offers to SM will have an absolutely continuous (conditional)

distribution of offers from pl(π) to p(π), putting a mass point at pl(π).

(iii) B1 while making offers to SI on the equilibrium path behaves exactly in the same

manner as in the two player game with one-sided asymmetric information.

(iv) SI ’s behavior is identical to that in the two-player game. SM accepts the largest

offer with a payoff at least as large as the expected continuation payoff from rejecting all

offers.

(v) Each buyer in equilibrium obtains a payoff of vB(π).

Remark 3 The mass points and the distribution of buyers’ offers will depend upon π

though we show that these distributions will collapse in the limit. Off the path, the analysis

is different from the two-player game because the buyers have more options to consider

when choosing actions. For the description of off-path behavior refer to Appendix(A.13)

and Appendix(A.14) for public and private offers respectively.

Remark 4 A “road map” of the proof: We construct the equilibrium by starting from the

benchmark complete information case and showing that the complete information strategies

essentially carry over to the game where π is in a range near 0. This includes, through


the competition lemma, showing the nature of the competition among the sellers. Once

π is outside this range, the mass points and support of the randomised strategies in the

candidate equilibrium will depend upon π and these are characterised for all values of π.

The equilibrium is then extended beyond the initial range (apart from the initial range, these

are functions of δ) for sufficiently high values of δ by recursion. Finally, checking that

the candidate equilibrium is immune to unilateral deviation at any stage involves specifying

out-of-equilibrium beliefs. This is done in the two appendices.

Proof. We prove this proposition in steps. (Not all of these steps are given here, in order
to reduce unwieldy notation; see also the appendices.) First we derive the equilibrium for

a given value of π by assuming that there exists a threshold δ∗, such that if δ exceeds

this threshold then for each value of π, a stationary equilibrium as described above exists.

Later on we will prove this existence result.

To formally construct the equilibrium for different values of π, we need the following

lemma which we label as the competition lemma, following the terminology of [11], though

they proved it for a different model.

Consider the following sequences for t ≥ 1:

$$p_t = v - \left[(v - \delta^t H)\,\alpha + (1 - \alpha)\,\delta\,(v - p_{t-1})\right] \tag{4.2}$$

$$p'_t = M + \delta\,(1 - \alpha)\,(p_{t-1} - M) \tag{4.3}$$

where α ∈ (0, 1) and $p_0 = H$.

Lemma 24 There exists a δ′ ∈ (0, 1) such that for δ > δ′ and for all t ∈ {1, ..., N(δ)}, we have

$$p_t > p'_t.$$


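Proof.

$$\begin{aligned}
p_t - p'_t &= v - \left[(v - \delta^t H)\alpha + (1-\alpha)\delta(v - p_{t-1})\right] - M - \delta(1-\alpha)(p_{t-1} - M) \\
&= (v - M)(1 - \delta + \delta\alpha) - \alpha(v - \delta^t H) \\
&= (1-\delta)(v - M) + \alpha(\delta v - \delta M - v + \delta^t H) \\
&= (1-\delta)(v - M) + \alpha(\delta^t H - \delta M - (1-\delta)v).
\end{aligned}$$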

If we show that the second term is always positive then we are done. Note that the coefficient of α, namely $\delta^t H - \delta M - (1-\delta)v$, is increasing in δ and is positive at δ = 1, where it equals H − M. Take t = N∗, where N∗ is the upper bound on the number of periods for which the two-player game with one-sided asymmetric information (as described earlier) can continue. For t = N∗, there exists a δ′ < 1 such that the term is positive whenever δ > δ′. Since $\delta^t H$ is decreasing in t, the term is then also positive for all lower values of t.

As N(δ) ≤ N∗ for any δ < 1, for all t ∈ {1, ..., N(δ)},

$$p_t > p'_t$$

whenever δ > δ′.
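As a numerical illustration of Lemma 24, the sketch below iterates the sequences (4.2) and (4.3) and compares the gap between them with the closed form obtained in the proof. The function names and the parameter values (v = 1, H = 0.6, M = 0.3, α = 0.5, δ = 0.99, five periods) are assumptions chosen only for illustration; they are not taken from the text.

```python
def price_sequences(v, H, M, alpha, delta, periods):
    """Iterate (4.2) and (4.3) starting from p_0 = H.

    Returns a list of (t, p_t, p_prime_t) for t = 1, ..., periods.
    """
    rows = []
    p_prev = H
    for t in range(1, periods + 1):
        p_t = v - ((v - delta**t * H) * alpha + (1 - alpha) * delta * (v - p_prev))
        p_prime_t = M + delta * (1 - alpha) * (p_prev - M)
        rows.append((t, p_t, p_prime_t))
        p_prev = p_t
    return rows


def closed_form_gap(v, H, M, alpha, delta, t):
    """Last line of the proof: (1-d)(v-M) + alpha*(d^t H - d M - (1-d) v)."""
    return (1 - delta) * (v - M) + alpha * (delta**t * H - delta * M - (1 - delta) * v)


# Hypothetical parameter values, for illustration only.
V, H_VAL, M_VAL, ALPHA, DELTA, PERIODS = 1.0, 0.6, 0.3, 0.5, 0.99, 5
ROWS = price_sequences(V, H_VAL, M_VAL, ALPHA, DELTA, PERIODS)

for t, p_t, p_prime_t in ROWS:
    gap = closed_form_gap(V, H_VAL, M_VAL, ALPHA, DELTA, t)
    print(t, round(p_t - p_prime_t, 6), round(gap, 6))
```

The closed form does not involve $p_{t-1}$, which is why the sign of the gap can be checked period by period without solving the recursion.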

For both public and private targeted offers, the equilibrium path is the same. However

the off-path behavior differs (to be specified later).

Fix a δ > δ∗. Suppose we are given a π ∈ (0, 1).⁴ There exists a t ≥ 0 (it is easy to see that t ≤ N∗) such that $\pi \in [d_t, d_{t+1})$. The sequence $d(\delta) = \{0, d_1, d_2, \ldots, d_t, \ldots\}$ is derived from, and is identical to, the corresponding sequence in the two-player game. Next, we evaluate vB(π) (from the two-player game). Define p(π) as

$$p(\pi) = v - v_B(\pi).$$

Define p′l(π) as

$$p'_l(\pi) = M + \delta\,(1 - a(\pi))\,\left[E_{d_{t-1}}(p) - M\right] \tag{4.4}$$

where $E_{d_{t-1}}(p)$ represents the expected price offer to SM in equilibrium when the probability that SI is of the low type is $d_{t-1}$. From (4.4) we can posit that, in equilibrium, p′l(π) is the minimum acceptable price for SM if she gets only one offer.

⁴ π = 0 is the complete information case with an H seller.

Lemma 25 For a given π > d1, the acceptance probability a(π) of an equilibrium offer is increasing in δ and has a limit, as δ → 1, which is less than 1.

Proof. The acceptance probability a(π) of an equilibrium offer is equal to πβ(π), where β(π) is the probability with which the L-type SI accepts an equilibrium offer. From the updating rule we know that β(π) is such that the following relation is satisfied:

$$\frac{\pi(1 - \beta(\pi))}{\pi(1 - \beta(\pi)) + (1 - \pi)} = d_{t-1}$$

From the above expression, we get

$$\beta(\pi) = \frac{\pi - d_{t-1}}{\pi(1 - d_{t-1})}$$

Therefore, β(π) is increasing in π and decreasing in $d_{t-1}$. From [21], the $d_t$ are decreasing in δ and have a limit. Hence β(π) (and also a(π)) is increasing in δ. Since the $d_t$ have a limit as δ goes to 1, so does β(π). Therefore, a(π) also has a limit, which is less than 1 for π ∈ (0, 1).
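The updating rule in this proof and the recursion (4.1) can be combined into a short computational sketch. The cutoff list below is purely hypothetical (in the model the cutoffs are pinned down by the buyer's indifference conditions, which are not reproduced here), and all function names are illustrative assumptions rather than notation from the text.

```python
from bisect import bisect_right


def low_type_accept_prob(pi, d_prev):
    """beta(pi): acceptance probability of the L-type seller."""
    return (pi - d_prev) / (pi * (1.0 - d_prev))


def posterior_after_rejection(pi, beta):
    """Bayes' rule: probability of the L type after an offer is rejected."""
    return pi * (1.0 - beta) / (pi * (1.0 - beta) + (1.0 - pi))


def buyer_value(pi, v, H, delta, cutoffs):
    """v_B(pi) via recursion (4.1); cutoffs = [0, d_1, d_2, ..., 1]."""
    t = bisect_right(cutoffs, pi) - 1          # pi lies in [d_t, d_{t+1})
    if t == 0:
        return v - H                           # complete-information payoff
    d_prev = cutoffs[t - 1]
    a = pi * low_type_accept_prob(pi, d_prev)  # a(pi) = pi * beta(pi)
    continuation = buyer_value(d_prev, v, H, delta, cutoffs)
    return (v - delta**t * H) * a + (1.0 - a) * delta * continuation


# Hypothetical cutoffs and parameters, for illustration only.
CUTOFFS = [0.0, 0.4, 0.7, 1.0]
PI, V, H_VAL, DELTA = 0.8, 1.0, 0.5, 0.9

BETA = low_type_accept_prob(PI, 0.4)           # pi = 0.8 is in [d_2, d_3), so d_{t-1} = d_1
print(posterior_after_rejection(PI, BETA))     # should reproduce d_1
print(buyer_value(PI, V, H_VAL, DELTA, CUTOFFS))
```

The first printed value checks that the stated β(π) does bring the posterior belief down to $d_{t-1}$ after a rejection; the second evaluates the buyer's payoff by following the belief down the cutoff sequence one step per period.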


For $\pi = d_{t-1}$, the maximum price offer to SM (according to the conjectured equilibrium) is $p(d_{t-1})$. This implies that $E_{d_{t-1}}(p) < p(d_{t-1})$ (this will be clear from the description below). Since a(π) ∈ (0, 1), from Lemma (24) we can infer that p(π) > p′l(π). Suppose there exists a pl(π) ∈ (p′l(π), p(π)) such that

$$p_l(\pi) = (1 - \delta)M + \delta E_\pi(p).$$

We can see that pl(π) represents the minimum acceptable price offer for SM in the event that she gets two offers. (Note that if SM rejects both offers, the game goes to the next period with π remaining the same.)

From the conjectured equilibrium behavior, we derive the following:⁵

1. B1 makes offers to SI with probability q(π), where

$$q(\pi) = \frac{v_B(\pi)(1 - \delta)}{(v - p'_l(\pi)) - \delta v_B(\pi)} \tag{4.5}$$

B1 offers $\delta^t H$ to SI. With probability (1 − q(π)) he makes offers to SM. The conditional distribution of offers to SM, given that B1 makes an offer to this seller when the relevant probability is π, is

$$F_1^\pi(s) = \frac{v_B(\pi)[1 - \delta(1 - q(\pi))] - q(\pi)(v - s)}{(1 - q(\pi))[v - s - \delta v_B(\pi)]} \tag{4.6}$$

We can check that $F_1^\pi(p_l(\pi)) > 0$ and $F_1^\pi(p(\pi)) = 1$. This confirms that B1 puts a mass point at pl(π).

2. B2 offers p′l(π) to SM with probability q′(π), where

$$q'(\pi) = \frac{v_B(\pi)(1 - \delta)}{(v - p_l(\pi)) - \delta v_B(\pi)} \tag{4.7}$$

With probability (1 − q′(π)) he makes offers to SM by randomizing his offers in the support [pl(π), p(π)]. The conditional distribution of offers is given by

$$F_2^\pi(s) = \frac{v_B(\pi)[1 - \delta(1 - q'(\pi))] - q'(\pi)(v - s)}{(1 - q'(\pi))[v - s - \delta v_B(\pi)]} \tag{4.8}$$

⁵ We obtain these by using the indifference relations of the players when they are using randomized behavioral strategies.

This completes the derivation. Appendix (A.13) and Appendix (A.14) (for public and private offers respectively) describe the off-path play and show that it sustains the equilibrium play in each of the cases.
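The boundary properties of the two offer distributions can be checked numerically from (4.5) to (4.8). The sketch below does so for one set of made-up numbers; the values of v, vB(π), δ, p′l(π) and pl(π) are assumptions for illustration only (in equilibrium the two floor prices are determined endogenously), and the function names are hypothetical.

```python
def mixing_probability(v_b, v, delta, floor_price):
    """(4.5) with floor_price = p'_l gives q; (4.7) with floor_price = p_l gives q'."""
    return v_b * (1 - delta) / ((v - floor_price) - delta * v_b)


def offer_cdf(s, q, v_b, v, delta):
    """Common functional form of (4.6) and (4.8)."""
    numerator = v_b * (1 - delta * (1 - q)) - q * (v - s)
    denominator = (1 - q) * (v - s - delta * v_b)
    return numerator / denominator


# Hypothetical values satisfying p'_l < p_l < p_bar = v - v_B.
V, V_B, DELTA = 1.0, 0.4, 0.95
P_L_PRIME, P_L = 0.35, 0.5
P_BAR = V - V_B

Q = mixing_probability(V_B, V, DELTA, P_L_PRIME)   # B1's probability of offering to S_I
Q_PRIME = mixing_probability(V_B, V, DELTA, P_L)   # B2's mass on p'_l

print("F1 at upper bound:", offer_cdf(P_BAR, Q, V_B, V, DELTA))
print("F2 at upper bound:", offer_cdf(P_BAR, Q_PRIME, V_B, V, DELTA))
print("F2 at p_l:", offer_cdf(P_L, Q_PRIME, V_B, V, DELTA))
print("F1 at p_l (mass point):", offer_cdf(P_L, Q, V_B, V, DELTA))
```

Because p′l(π) < pl(π) implies q(π) < q′(π), the numerator of (4.6) stays strictly positive at pl(π) while that of (4.8) vanishes there, which is the source of B1's mass point.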

Next, we show that there exists a δ∗ with δ′ < δ∗ < 1 such that for δ > δ∗ an equilibrium as described above exists for all values of π ∈ [0, 1). To do this we need the following lemmas:

Lemma 26 If π ∈ [0, d1), then the equilibrium of the game is identical to that of the

benchmark case.

Proof. From the equilibrium of the two-player game with one-sided asymmetric information, we know that for π ∈ [0, d1) the buyer always offers H to the seller and the seller accepts this with probability one. Hence this game is identical to the game between a buyer of valuation v and a seller of valuation H, with the buyer making the offers. Thus, in the four-player game, we will have an equilibrium identical to the one described in the benchmark case. We conclude the proof by assigning the following values:

p′l(π) = M and p(π) = H for π ∈ [0, d1).

Lemma 27⁶ If there exists a $\bar{\delta} \in (\delta', 1)$ such that for $\delta \geq \bar{\delta}$ and for all t ∈ {1, ..., N∗} an equilibrium exists for $\pi \in [0, d_t(\delta))$, then there exists a $\delta^*_t \geq \bar{\delta}$ such that, for all $\delta \in (\delta^*_t, 1)$, an equilibrium also exists for $\pi \in [d_t(\delta), d_{t+1}(\delta))$.

⁶ We use the following notation from the appendix: for any x ∈ (M,H), a superscript x, as in $E^x(p)$, denotes the expression obtained from F1(.), F2(.), q, q′ and E(p), respectively, by replacing pl by x.

Proof. We only need to show that there exists a $\delta^*_t \geq \bar{\delta}$ such that for all $\delta > \delta^*_t$ and for all $\pi \in [d_t(\delta), d_{t+1}(\delta))$, there exists a pl(π) ∈ (p′l(π), p(π)) with

$$p_l(\pi) = (1 - \delta)M + \delta E_\pi(p).$$

From now on we will write $d_t$ instead of $d_t(\delta)$. For each δ ∈ (δ′, 1) we can construct d(δ) and the equilibrium strategies as above (assuming existence). Construct the function G(x) as

$$G(x) = x - \left[\delta E^x_\pi(p) + (1 - \delta)M\right].$$

We can infer from Appendix (??) that the function G(.) is monotonically increasing in x. Since $E^x_\pi(p) < p(\pi)$,

$$\lim_{x \to p(\pi)} G(x) > 0.$$

Next, we have

$$G(p'_l(\pi)) = p'_l(\pi) - \left[\delta E^{p'_l(\pi)}_\pi(p) + (1 - \delta)M\right].$$

By definition, $E^{p'_l(\pi)}_\pi(p) > p'_l(\pi)$. So for δ = 1, G(p′l(π)) < 0. Since G(.) is a continuous function, there exists a $\delta^*_t \geq \bar{\delta}$ such that for all $\delta > \delta^*_t$, G(p′l(π)) < 0. By the Intermediate Value Theorem there is an x∗ ∈ (p′l(π), p(π)) such that G(x∗) = 0, and since G(.) is monotonically increasing this x∗ is unique. This x∗ is our required pl(π).

This concludes the proof.

From Lemma (26) we know that for any δ ∈ (0, 1) an equilibrium exists for π ∈ [0, d1).⁷ Using Lemma (27) we can obtain $\delta^*_t$ for all t ∈ {1, 2, ..., N∗}. Define δ∗ as:

$$\delta^* = \max_{t \in \{1, \ldots, N^*\}} \delta^*_t$$

We can do this because N∗ is finite. Lemmas (26) and (27) now guarantee that whenever δ > δ∗ an equilibrium as described above exists for all π ∈ [0, 1).

This concludes the proof of the proposition.

⁷ Note that d1 is independent of δ.

4.4 Asymptotic characterization

It has been argued earlier that as δ → 1, p′l(π) reaches a limit which is less than p(π). From (4.5) we then have

$$q(\pi) \to 0 \quad \text{as } \delta \to 1.$$

Then from (4.6) we have

$$1 - F_1^\pi(s) = \frac{p(\pi) - s}{(1 - q(\pi))[v - s - \delta v_B(\pi)]}.$$

We have shown that q(π) → 0 as δ → 1. Hence as δ → 1, for s arbitrarily close to p(π), we have

$$1 - F_1^\pi(s) \approx \frac{p(\pi) - s}{p(\pi) - s} = 1.$$

Hence the distribution collapses and pl(π) → p(π). From the expression for pl(π) we know that pl(π) → Eπ(p) as δ goes to 1. Thus we can conclude that Eπ(p) approaches p(π). From the two-player game with one-sided asymmetric information we know that as δ goes to 1, p(π) → H (since vB(π) goes to v − H) for any value of π. This leads us to conclude that as δ goes to 1, Eπ(p) → H for all values of π. This in turn justifies having $E_{d_{t-1}}(p) \approx E^x_\pi(p)$ for high values of δ (used in the proof of Lemma (27)).

From the proof of Lemma (27) we know that G(p(π)) > 0. Hence there will be a threshold of δ such that for all δ higher than that threshold we have G(δp(π)) > 0. Thus pl(π) is bounded above by δp(π). (4.7) implies that

$$q'(\pi) = \frac{1}{\dfrac{v}{v_B(\pi)} + \dfrac{\delta p(\pi) - p_l(\pi)}{(1 - \delta)\,v_B(\pi)}}.$$

Since pl(π) is bounded above by δp(π), q′(π) → 0 as δ goes to 1.
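The step from (4.7) to the expression for q′(π) used here is not spelled out. One way to fill it in, using only the definition p(π) = v − vB(π), is sketched below; this is a reconstruction of the algebra, not a derivation quoted from the text.

```latex
\begin{aligned}
(v - p_l(\pi)) - \delta v_B(\pi)
  &= (1-\delta)\,v + \delta\,\bigl(v - v_B(\pi)\bigr) - p_l(\pi) \\
  &= (1-\delta)\,v + \delta\,p(\pi) - p_l(\pi), \\[4pt]
q'(\pi)
  &= \frac{v_B(\pi)(1-\delta)}{(1-\delta)\,v + \delta\,p(\pi) - p_l(\pi)}
   = \frac{1}{\dfrac{v}{v_B(\pi)} + \dfrac{\delta\,p(\pi) - p_l(\pi)}{(1-\delta)\,v_B(\pi)}} .
\end{aligned}
```

The last equality divides the numerator and the denominator by (1 − δ)vB(π).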

Thus we conclude that as δ goes to 1, prices in all transactions go to H. We state this

(informally) as a result.

Main result: With either public or private offers there exists a stationary Perfect-

Bayes equilibrium, such that, as δ → 1, the prices in both transactions go to H. The

bargaining ends “almost” immediately and both sellers, the one with private information

and L type and the one whose valuation is common knowledge, obtain strictly positive

expected profits.

Comment: It should be mentioned that we would expect the same result to be true if, instead of a two-point distribution, the informed type’s reservation value s is continuously distributed in [L,H] according to some cdf G(s). Then with probability G(M) the reservation value of SI is less than M and with probability (1 − G(M)) it is higher than M. Since this will still be the “gap” case of the two-player bargaining game, the result of [33] will hold, so that the two-player offer goes to H as δ → 1. This should make an analogue of the competition lemma true. The belief updating for private and public offers off the equilibrium path will now not be in terms of the probability of a soft type but in terms of the (truncated) updated distribution of the informed seller’s reservation value.

4.5 A non-stationary equilibrium

We show that with public offers we can have a non-stationary equilibrium, so that the

equilibrium constructed in the previous sections is not unique. This is based on using the

stationary equilibrium as a punishment (the essence is similar to the pooling equilibrium

with positive profits in [45]). The strategies sustaining this are described below. The


strategies will constitute an equilibrium for sufficiently high δ, as is also the case for the

stationary equilibrium.

Suppose that for a given π, both buyers offer M to SM. SM accepts one of these offers, selecting each buyer with probability 1/2. If any buyer deviates, for example by offering to SI or making a higher offer to SM, then all players revert to the stationary equilibrium strategies described above. If SM gets the equilibrium offer of M from the buyers and rejects both of them, then the buyers make the same offers in the next period and the seller SM makes the same responses as in the current period.

Given the buyers adhere to their equilibrium strategies, the continuation payoff to SM

from rejecting all offers she gets is zero. So she has no incentive to deviate. Next, if one

of the buyers offers slightly higher than M to SM then it is optimal for her to reject both

the offers. This is because on rejection next period players will revert to the stationary

equilibrium play described above. Hence her continuation payoff is δ(Eπ(p) −M), which

is higher than the payoff from accepting.

Finally, each buyer obtains an equilibrium payoff of ½(v − M) + ½δvB(π). If a buyer deviates then, according to the strategies specified, SM should reject the higher offer if the payoff from accepting it is strictly less than the continuation payoff from rejecting (which is the one-period discounted value of the payoff from the stationary equilibrium). Hence if a buyer wants SM to accept an offer higher than M, then his offer p′ should satisfy

$$p' = \delta E_\pi(p) + (1 - \delta)M.$$

The payoff of the deviating buyer will then be δ(v − Eπ(p)) + (1 − δ)(v − M). As δ → 1,

$$\delta(v - E_\pi(p)) + (1 - \delta)(v - M) \approx \delta(v - p(\pi)) + (1 - \delta)(v - M) = \delta v_B(\pi) + (1 - \delta)(v - M).$$

For δ = 1 this expression is strictly less than ½(v − M) + ½δvB(π), as (v − M) > δvB(π). Hence for sufficiently high values of δ this will also be true. Also, if a buyer deviates and makes an offer in the range (M, p′), then it will be rejected by SM. The continuation payoff of the buyer will then be δvB(π) < ½(v − M) + ½δvB(π). Hence neither buyer has any incentive to deviate.
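The buyers' no-deviation conditions can be illustrated with a small numerical sketch. It relies on two simplifying assumptions that are not in the text: the approximation Eπ(p) ≈ p(π) is treated as exact, and vB(π) is held fixed at a hypothetical value rather than recomputed for each δ. The function names and the numbers (v = 1, M = 0.3, vB = 0.4) are likewise illustrative.

```python
def equilibrium_payoff(v, m, v_b, delta):
    """Each buyer trades with S_M with probability 1/2, else continues with S_I."""
    return 0.5 * (v - m) + 0.5 * delta * v_b


def accepted_deviation_payoff(v, m, v_b, delta):
    """Payoff from the smallest accepted higher offer, using E_pi(p) ~ p(pi) = v - v_B."""
    return delta * v_b + (1 - delta) * (v - m)


def rejected_deviation_payoff(v_b, delta):
    """Payoff when the deviating offer in (M, p') is rejected."""
    return delta * v_b


def delta_threshold(v, m, v_b):
    """Solves equilibrium_payoff = accepted_deviation_payoff for delta (v_B held fixed)."""
    return (v - m) / (2 * (v - m) - v_b)


# Hypothetical values with v - M > v_B, for illustration only.
V, M_VAL, V_B = 1.0, 0.3, 0.4

for delta in (0.6, 0.9):
    print(delta,
          equilibrium_payoff(V, M_VAL, V_B, delta),
          accepted_deviation_payoff(V, M_VAL, V_B, delta),
          rejected_deviation_payoff(V_B, delta))
print("threshold:", delta_threshold(V, M_VAL, V_B))
```

Under these assumptions the accepted deviation stops being profitable once δ exceeds (v − M)/(2(v − M) − vB), which is below 1 exactly when v − M > vB; this is consistent with the statement that the strategies form an equilibrium only for sufficiently high δ.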

We conclude this section by noting that this is not an equilibrium for private offers.

This is because we have different continuation play for buyer’s and seller’s deviations. For

public offers these deviations are part of the public history. However for private offers they

are not.

4.6 Conclusion

In the model we described above we have shown that with either public or private offers

there exists a stationary PBE, such that, as δ → 1, the prices in both transactions go to

H. The bargaining ends within the first two periods and both sellers, the one with private

information and L type and the one whose valuation is common knowledge, obtain strictly

positive expected profits. This equilibrium is reminiscent of the “Coase Conjecture” on the

rents from private information dominating the rents from having the sole right to make

offers, as the offers can be made more quickly. However, the setting is different, in that

there is an endogenous outside option for which buyers compete, and the model contains

a potential interaction between this competition and the private information bargaining.

This interaction comes through, at least in the equilibrium we study, mainly in the analysis

of out-of-equilibrium behavior. It is interesting that the equilibrium path behavior is almost,

though not quite, separable along these two dimensions.

It is also interesting that the equilibrium path in our model is essentially the same

with the two different observability structures of public offers and private offers. We were

somewhat hesitant to use the name PBE for the private offers case, since this is not a

multistage game with observable actions and private information, in the sense of Fudenberg

and Tirole, but the spirit of the analysis is very similar to theirs, so we have retained their

name.


One question that might arise is how robust our conclusion is to different bargaining
extensive forms. Clearly, simultaneous offers best represent competition, and one-sided
offers best represent the power to make offers. If we go to alternating offers, previous results

in the complete information setting indicate that we cannot expect the same results. This

is also true in the two-player setting, so the market element in the current model is not

the driver for this difference.

We have shown that there could be non-stationary equilibria in this model. However,

we have not been able to demonstrate an analogue to the uniqueness result for two-person

bargaining with one-sided offers and one-sided private information, even for stationary

equilibria.

In our future research we intend to address the issue of having two privately informed

sellers and to extend this model to more agents on both sides of the market.


Bibliography

[1] Akcigit, U., Liu, Q., 2013: “The Role of Information in Competitive Experimentation”, mimeo, Columbia University and University of Pennsylvania.

[2] Almqvist, Ebbe (2002), History of Industrial Gases, Springer, Berlin.

[3] d’Aspremont, C., Bhattacharya, S., Gerard-Varet, L., 2000 “Bargaining and Sharing Innovative Knowledge”, The Review of Economic Studies 67, 255−271.

[4] Ausubel, L.M., Deneckere, R.J. 1989 “A Direct Mechanism Characterization of Sequential Bargaining with One-Sided Incomplete Information”, Journal of Economic Theory 48, 18−46.

[5] Ausubel, L.M., Cramton, P., and Deneckere, R.J. 2002 “Bargaining with Incomplete Information”, Ch. 50 of R. Aumann and S. Hart (ed.) Handbook of Game Theory, Vol. 3, Elsevier.

[6] Baliga, S., Serrano, R. (1995), “Multilateral Bargaining with Imperfect Information”, Journal of Economic Theory 67, 578−589.

[7] Bhattacharya, S., Mookherjee, D., 1986 “Portfolio choice in research and development”, Rand Journal of Economics 17, 594−605.

[8] Binmore, K.G., Herrero, M.J. 1988. “Matching and Bargaining in Dynamic Markets”, Review of Economic Studies 55, 17−31.

[9] Bolton, P., Harris, C., 1999 “Strategic Experimentation”, Econometrica 67, 349−374.

[10] Chatterjee, K., Dutta, B. 1998. “Rubinstein Auctions: On Competition for Bargaining Partners”, Games and Economic Behavior 23, 119−145.

[11] Chatterjee, K., Dutta, B. 2006. “Markets with Bilateral Bargaining and Incomplete Information”, mimeo, Penn State and University of Warwick.

[12] Chatterjee, K., Evans, R., 2004: “Rivals’ Search for Buried Treasure: Competition and Duplication in R&D”, Rand Journal of Economics 35, 160−183.


[13] Chatterjee, K., Samuelson, L. 1988. “Bargaining Under Two-Sided Incomplete Information: The Unrestricted Offers Case”, Operations Research 36, 605−638.

[14] Chatterjee, K., Samuelson, L. 1987. “Infinite Horizon Bargaining Models with Alternating Offers and Two-Sided Incomplete Information”, Review of Economic Studies 54, 175−192.

[15] Chatterjee, K., Lee, C.C. 1998. “Bargaining and Search with Incomplete Information about Outside Options”, Games and Economic Behavior 22, 203−237.

[16] Chikte, S.D., Deshmukh, S.D. 1987. “The Role of External Search in Bilateral Bargaining”, Operations Research 35, 198−205.

[17] Corominas-Bosch, Margarida, 2004. “Bargaining in a network of buyers and sellers”, Journal of Economic Theory 115, 35−77.

[18] Dasgupta, P., Maskin, E., 1987: “The Simple Economics of Research Portfolios”, The Economic Journal 581−595.

[19] Dasgupta, P., Stiglitz, J., 1980 “Uncertainty, Industrial Structure and the Speed of R&D”, Bell Journal of Economics 11, 1−28.

[20] “No end to Dementia”, The Economist, June 2010.

[21] Deneckere, R., Liang, M.Y., 2006. “Bargaining with Interdependent Values”, Econometrica 74, 1309−1364.

[22] Fershtman, C., Rubinstein, A., 1997 “A Simple Model of Equilibrium in Search Procedures”, Journal of Economic Theory 72, 432−441.

[23] Fudenberg, D., Levine, D., and Tirole, J., 1985. “Infinite-Horizon Models of Bargaining with One-Sided Incomplete Information”, in A. Roth (ed.), Game-Theoretic Models of Bargaining, Cambridge University Press.

[24] Fudenberg, D., Tirole, J. 1990 Game Theory.

[25] Gale, D. (1986). “Bargaining and Competition Part I: Characterization”, Econometrica 54, 785−806.

[26] Gale, D. 1987. “Limit theorems for Markets with Sequential Bargaining”, Journal of Economic Theory 43, 20−54.

[27] Gale, D. 2000. Strategic Foundations of General Equilibrium: Dynamic Matching and Bargaining Games, Cambridge University Press.

[28] Gale, D., Sabourian, H. (2005). “Complexity and Competition”, Econometrica 73, 739−769.


[29] Galenianos, M., Kircher, P. 2009. “Directed Search with Multiple Job Applications”, Journal of Economic Theory 144, 445−471.

[30] Gantner, A. 2008. “Bargaining, Search and Outside Options”, Games and Economic Behavior 62, 417−435.

[31] Graham, M.B.W., 1986 “The Business of research”, New York: Cambridge University Press.

[32] Gul, F., Sonnenschein, H., “On Delay in Bargaining with One-Sided Uncertainty”, Econometrica 56, 601−611.

[33] Gul, F., Sonnenschein, H., and Wilson, R., 1986. “Foundations of Dynamic Monopoly and the Coase Conjecture”, Journal of Economic Theory 39, 155−190.

[34] Heidhues, P., Rady, S., Strack, P., 2012 “Strategic Experimentation with Private Payoffs”, mimeo, University of Bonn.

[35] Hendon, E., and Tranaes, T. (1991). “Sequential Bargaining in a Market with One Seller and Two Different Buyers”, Games and Economic Behavior 4, 453−466.

[36] Horner, J., Vieille, N. (2009). “Public vs. Private Offers in the Market for Lemons”, Econometrica 77, 29−69.

[37] Keller, G., Rady, S., Cripps, M., 2005: “Strategic Experimentation with Exponential Bandits”, Econometrica 73, 39−68.

[38] Keller, G., Rady, S., 2010: “Strategic Experimentation with Poisson Bandits”, Theoretical Economics 5, 275−311.

[39] Klein, N., 2011: “Strategic Learning in Teams”, mimeo, University of Bonn.

[40] Klein, N., Rady, S., 2011: “Negatively Correlated Bandits”, The Review of Economic Studies 78, 693−792.

[41] Lee, C.C. 1995. “Bargaining and Search with Recall: A Two-Period Model with Complete Information”, Operations Research 42, 1100−1109.

[42] Lee, T., Wilde, L., 1980: “Market Structure and Innovation: A Reformulation”, Quarterly Journal of Economics 94, 429−436.

[43] Loury, G.C., 1979 “Market Structure and Innovation”, Quarterly Journal of Economics 93, 395−410.

[44] Muthoo, A. 1995. “On the strategic Role of Outside Options in Bilateral Bargaining”, Operations Research 43, 292−297.


[45] Noldeke, G., Van Damme, E., 1990 “Signalling in a Dynamic Labor Market”, The Review of Economic Studies 57, 1−23.

[46] Osborne, M., and Rubinstein, A. (1990). Bargaining and Markets. San Diego: Academic Press.

[47] Peters, M. 2010. “Noncontractible Heterogeneity in Directed Search”, Econometrica 78, 1173−1200.

[48] Peters, M., Severinov, S., 2006. “Internet auctions with many traders”, Journal of Economic Theory 130, 220−245.

[49] Presman, E.L., 1990: “Poisson Version of the Two-Armed Bandit Problem with Discounting”, Theory of Probability and its Applications.

[50] Raiffa, H., 1985, The Art and Science of Negotiation, Harvard University Press.

[51] Reinganum, J. 1982 “A dynamic Game of R&D Patent Protection and Competitive Behavior”, Econometrica 50, 671−688.

[52] Rubinstein, A. 1982. “Perfect equilibrium in a bargaining model”, Econometrica 50, 97−109.

[53] Rubinstein, A., and Wolinsky, A. (1985). “Equilibrium in a Market with Sequential Bargaining”, Econometrica 53, 1133−1150.

[54] Rubinstein, A., and Wolinsky, A. (1990). “Decentralised Trading, Strategic Behavior and the Walrasian Outcome”, Review of Economic Studies 57.

[55] Sabourian, H. 2004. “Bargaining and markets: complexity and the competitive outcome”, Journal of Economic Theory 116, 189−228.

[56] Sobel, J., and Takahashi, I. 1983. “A Multi-Stage Model of Bargaining”, Review of Economic Studies 50, 411−426.

[57] Shaked, A., and Sutton, J. (1984), “Involuntary Unemployment as a Perfect Equilibrium in a Bargaining Model”, Econometrica 52, 1351−1364.

[58] Scherer, F.M., “International High-Technology Competition”, Cambridge, Mass.: Harvard University Press.

[59] Stokey, N.L., 2009: “The Economics of Inaction”, Princeton University Press.

[60] Swinkels, J.M. (1999), “Asymptotic Efficiency for Discriminatory Private Value Auctions”, The Review of Economic Studies 66, 509−528.

[61] Thomas, C., 2011: “Experimentation with Congestion”, mimeo, University College of London and University of Texas Austin.


Appendix

A.1 Solution for planner’s v(p)

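The O.D.E. is given by:

$$v'(p) + \frac{r + (\pi + \pi')p}{(\pi + \pi')\,p\,(1 - p)}\,v(p) = \frac{1}{1 - p}$$

The integrating factor µ(p) of the above differential equation is given by

$$\mu(p) = e^{\int \frac{r + (\pi + \pi')p}{(\pi + \pi')p(1 - p)}\,dp} = \frac{p^{\frac{r}{\pi + \pi'}}}{(1 - p)^{\frac{r}{\pi + \pi'} + 1}}$$

Multiplying both sides of the O.D.E. by µ(p) and integrating both sides, we get

$$v(p) = \frac{\displaystyle\int \frac{p^{\frac{r}{\pi + \pi'}}}{(1 - p)^{\frac{r}{\pi + \pi'} + 2}}\,dp + C_1}{\dfrac{p^{\frac{r}{\pi + \pi'}}}{(1 - p)^{\frac{r}{\pi + \pi'} + 1}}}$$

which gives (1.2)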
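The integrating factor can be checked numerically: by construction, the logarithmic derivative of µ(p) should equal the coefficient of v(p) in the O.D.E. The sketch below compares the two at a few interior beliefs using a central finite difference. The names `coefficient`, `mu` and `log_derivative`, the parameter `lam` (standing for π + π′) and the numerical values are assumptions made for illustration.

```python
def coefficient(p, r, lam):
    """Coefficient of v(p) in the O.D.E.; lam stands for pi + pi'."""
    return (r + lam * p) / (lam * p * (1 - p))


def mu(p, r, lam):
    """Integrating factor p^k / (1 - p)^(k + 1) with k = r / (pi + pi')."""
    k = r / lam
    return p**k / (1 - p) ** (k + 1)


def log_derivative(f, p, h=1e-6):
    """Central-difference approximation of f'(p) / f(p)."""
    return (f(p + h) - f(p - h)) / (2 * h * f(p))


# Hypothetical parameter values, for illustration only.
R, LAM = 0.1, 0.7

for p in (0.2, 0.5, 0.8):
    approx = log_derivative(lambda x: mu(x, R, LAM), p)
    print(p, approx, coefficient(p, R, LAM))
```

The agreement follows from the partial-fraction split of the coefficient into k/p + (k + 1)/(1 − p), whose integral is k ln p − (k + 1) ln(1 − p).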


A.2 Switching-derivative lemma

Lemma 28 When both firms are conducting their research at $S_1$ then $v_1'(.)$ is given by

$$v_1'(.) = \frac{\pi'}{r+\pi+\pi'} - C^1_{11}[\Lambda(p)]^{\frac{r}{\pi+\pi'}}\left(1 + \frac{r}{\pi+\pi'}\frac{1}{p}\right)$$

Similarly when both firms are conducting their research at $S_2$ then $v_2'(.)$ is given by

$$v_2'(.) = -\frac{\pi}{r+\pi+\pi'} + C^2_{22}[\Gamma(p)]^{\frac{r}{\pi+\pi'}}\left(1 + \frac{r}{\pi+\pi'}\frac{1}{1-p}\right)$$

Let $p^2_s$ ($p^1_s$) be the switching point of B (A). Then if $p^2_s < p^*_1$ ($p^1_s > p^*_2$), $v_1'(p^2_s) > 0$ ($v_2'(p^1_s) < 0$).

Proof. Since $C^1_{11}$ is chosen by imposing value matching at $p^2_s$, we have

$$v_1'(p^2_s) = \frac{\pi'}{r+\pi+\pi'} - \left\{\left[\frac{\pi'}{r+\pi'} - \frac{\pi'}{r+\pi+\pi'}\right]\left(\frac{p^2_s}{1-p^2_s} + \frac{r}{\pi+\pi'}\frac{1}{1-p^2_s}\right)\right\}$$

It is easy to see that $v_1'(p^2_s)$ is decreasing in $p^2_s$. Also we can show that $v_1'(p^*_1) = 0$ where $p^*_1 = \frac{\pi'}{\pi+\pi'}$. Hence if $p^2_s < p^*_1$ then $v_1'(p^2_s) > 0$. Similarly we can argue for $v_2'(p^1_s)$.

A.3 Auxiliary results

A.3.1 For the proof of proposition (3)

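Imposing the value matching condition to $v_1(.)$ at $p^{*N}_2$ and $p^{*N}_1$ we obtain:

$$\frac{\pi'}{r+\pi+\pi'}p^{*N}_1 + C^1_{11}(1-p^{*N}_1)[\Lambda(p^{*N}_1)]^{\frac{r}{\pi+\pi'}} = \frac{\pi'}{r+\pi'}p^{*N}_1$$

$$\Rightarrow C^1_{11} = \frac{p^{*N}_1\left[\frac{\pi'}{r+\pi'} - \frac{\pi'}{r+\pi+\pi'}\right]}{(1-p^{*N}_1)[\Lambda(p^{*N}_1)]^{\frac{r}{\pi+\pi'}}} \tag{A.1}$$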


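and

$$\frac{\pi}{r+\pi+\pi'}(1-p^{*N}_2) + C^1_{22}\,p^{*N}_2[\Gamma(p^{*N}_2)]^{\frac{r}{\pi+\pi'}} = \frac{\pi'}{r+\pi'}p^{*N}_2$$

$$\Rightarrow C^1_{22} = \frac{\frac{\pi'}{r+\pi'}p^{*N}_2 - \frac{\pi}{r+\pi+\pi'}(1-p^{*N}_2)}{p^{*N}_2[\Gamma(p^{*N}_2)]^{\frac{r}{\pi+\pi'}}} \tag{A.2}$$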

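Similarly by imposing value matching to $v_2(p)$ at $p^{*N}_1$ and $p^{*N}_2$ we obtain

$$C^2_{11} = \frac{\frac{\pi'}{r+\pi'}(1-p^{*N}_1) - \frac{\pi}{r+\pi+\pi'}p^{*N}_1}{(1-p^{*N}_1)[\Lambda(p^{*N}_1)]^{\frac{r}{\pi+\pi'}}} \tag{A.3}$$

$$C^2_{22} = \frac{(1-p^{*N}_2)\left[\frac{\pi'}{r+\pi'} - \frac{\pi}{r+\pi+\pi'}\right]}{p^{*N}_2[\Gamma(p^{*N}_2)]^{\frac{r}{\pi+\pi'}}} \tag{A.4}$$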

For both 1 and 2, the switching point is in the interior of the range of beliefs over which

the other player’s action is constant. This implies that v1(p) and v2(p) should satisfy

certain smooth pasting conditions.

Lemma 29 $v_1(.)$ and $v_2(.)$ are smooth at $p^{*N}_2$ and $p^{*N}_1$ respectively.

Proof of Lemma. $v_1(.)$ is smooth at $p^{*N}_2$ if

$$-\frac{\pi}{r+\pi+\pi'} + C^1_{22}[\Gamma(p^{*N}_2)]^{\frac{r}{\pi+\pi'}}\left[1 + \frac{r}{\pi+\pi'}\frac{1}{1-p^{*N}_2}\right] = \frac{\pi'}{r+\pi'}$$

Substituting the value of $C^1_{22}$ from (A.2) it can be shown that the above equality holds.

Similarly $v_2(.)$ is smooth at $p^{*N}_1$ if

$$\frac{\pi}{r+\pi+\pi'} - C^2_{11}[\Lambda(p^{*N}_1)]^{\frac{r}{\pi+\pi'}}\left[1 + \frac{r}{\pi+\pi'}\frac{1}{p^{*N}_1}\right] = -\frac{\pi'}{r+\pi'}$$

Substituting the value of $C^2_{11}$ from (A.3) it can be shown that the above equality holds.

This concludes the proof.


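A.3.2 For the proof of proposition (8)

The derivative of $C^p_1[\Lambda(p)]^{\frac{r}{\pi_1+\pi_2}}$ with respect to $p$ is given by

$$\frac{(1-p)v'_{SR}(p) + v_{SR}(p) - \frac{\pi_1+\pi_2}{r+\pi_1+\pi_2}}{(1-p)^2}.$$

Since

$$v'_{SR}(p) = \frac{r\pi_1}{(r+\pi_0)(r+\pi_0+\pi_1)} - C_2[\Lambda(p)]^{\frac{r+\pi_0}{\pi_1}}\left[1 + \frac{r+\pi_0}{\pi_1 p}\right]$$

$$\Rightarrow (1-p)v'_{SR}(p) = \frac{\pi_0}{r+\pi_0} + \frac{r\pi_1}{(r+\pi_0+\pi_1)(r+\pi_0)} - v_{SR}(p) - C_2[\Lambda(p)]^{\frac{r+\pi_0}{\pi_1}}\frac{(r+\pi_0)}{\pi_1}\frac{1}{p}$$

Hence

$$(1-p)v'_{SR}(p) + v_{SR}(p) - \frac{\pi_1+\pi_2}{r+\pi_1+\pi_2} = \frac{\pi_1+\pi_0}{r+\pi_0+\pi_1} - \frac{\pi_1+\pi_2}{r+\pi_2+\pi_1} - C_2[\Lambda(p)]^{\frac{r+\pi_0}{\pi_1}}\frac{(r+\pi_0)}{\pi_1}\frac{1}{p} < 0$$

This follows from the fact that $\left[\frac{\pi_1+\pi_0}{r+\pi_0+\pi_1} - \frac{\pi_1+\pi_2}{r+\pi_2+\pi_1}\right] < 0$ and $C_2 > 0$.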

A.3.3 For the proof of lemma (8)

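Since firm 2 switches at $p^{*N}_1$,

$$v^{R\prime}_2(p^{*N}_1) = \frac{\frac{\pi_2}{r+\pi_1+\pi_2}(1-p^{*N}_1) - \frac{\pi_0}{r+2\pi_0} + \frac{\pi_2}{r+\pi_1+\pi_2}p^{*N}_1 - \frac{\pi_0 r}{(r+2\pi_0)(\pi_1+\pi_2)p^{*N}_1} + \frac{\pi_2 r}{(\pi_1+\pi_2)(r+\pi_1+\pi_2)}}{(1-p^{*N}_1)}$$

This is obtained by imposing the value matching condition at $p^{*N}_1$. Substituting $p^{*N}_1 = \frac{\pi_0}{\pi_1}$, the value of the numerator is:

$$= \frac{\pi_2}{\pi_1+\pi_2} - \frac{\pi_0}{r+2\pi_0} - \frac{r\pi_1}{(r+2\pi_0)(\pi_1+\pi_2)}$$

$$< \frac{\pi_2}{\pi_1+\pi_2} - \frac{\pi_0}{r+2\pi_0} - \frac{r\pi_2}{(r+2\pi_0)(\pi_1+\pi_2)}$$

$$= \frac{2\pi_0\pi_2}{(\pi_1+\pi_2)(r+2\pi_0)} - \frac{\pi_0}{r+2\pi_0} = \frac{\pi_0(\pi_2-\pi_1)}{(\pi_1+\pi_2)(r+2\pi_0)} < 0$$

as $\pi_1 > \pi_2$.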
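The chain of (in)equalities above is pure algebra, so it can be checked with exact fractions. The sketch below evaluates the numerator at $p^{*N}_1 = \pi_0/\pi_1$, compares it with the simplified form, and confirms the bound. The parameter values and function names are hypothetical and chosen only so that $\pi_1 > \pi_2$ and $\pi_0/\pi_1 < 1$.

```python
from fractions import Fraction as F


def numerator_at_switch(r, pi0, pi1, pi2):
    """Numerator of the derivative of firm 2's value, evaluated at p = pi0/pi1."""
    p = pi0 / pi1
    s = r + pi1 + pi2
    return (pi2 / s * (1 - p) - pi0 / (r + 2 * pi0) + pi2 / s * p
            - pi0 * r / ((r + 2 * pi0) * (pi1 + pi2) * p)
            + pi2 * r / ((pi1 + pi2) * s))


def simplified_numerator(r, pi0, pi1, pi2):
    """First line of the chain: the numerator after substituting p = pi0/pi1."""
    return (pi2 / (pi1 + pi2) - pi0 / (r + 2 * pi0)
            - r * pi1 / ((r + 2 * pi0) * (pi1 + pi2)))


def upper_bound(r, pi0, pi1, pi2):
    """Last line of the chain: pi0 (pi2 - pi1) / ((pi1 + pi2)(r + 2 pi0))."""
    return pi0 * (pi2 - pi1) / ((pi1 + pi2) * (r + 2 * pi0))


# Hypothetical parameters with pi1 > pi2 and pi0 < pi1.
r, pi0, pi1, pi2 = F(1, 10), F(1, 5), F(1, 2), F(3, 10)
assert numerator_at_switch(r, pi0, pi1, pi2) == simplified_numerator(r, pi0, pi1, pi2)
assert simplified_numerator(r, pi0, pi1, pi2) < upper_bound(r, pi0, pi1, pi2) < 0
```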


A.4 Strategy depending on both belief and the location of the opponent

Case 1 :

r(π′ − π)− ππ′ > 0

Using lemma (3) one can show that given firm 2 is at S2, firm 1 will go to S1 for

p ∈ [p∗N2 , 1] and to S2 for p ∈ [0, p∗N2 ].

Further, we can show that given firm 2 is at $S_1$, firm 1 will go to $S_1$ for $p \in (p'_2, 1]$ and to $S_2$ for $p \in [0, p'_2]$, where $p'_2 = \frac{r\pi}{r\pi' + \pi(r+\pi')}$. $p'_2 < p^{*N}_2$.

Similarly, we can show the following:

Given that firm 1 is at S1, firm 2 will go to S1 for p ∈ (p∗N1 , 1] and to S2 for p ∈ [0, p∗N1 ].

Given that firm 1 is at $S_2$, firm 2 will go to $S_1$ for $p \in (p'_1, 1]$ and to $S_2$ for $p \in [0, p'_1]$, where $p'_1 = \frac{\pi'(r+\pi)}{r\pi' + \pi(r+\pi')}$. $p'_1 > p^{*N}_1$.

Define the following strategy for firm 1(s1) :

Choose S2 for p ∈ [0, p′2).

For p ∈ [p′2, p∗N2 ), choose S2 if 2 is at S2. If 2 is at S1, then choose S1.

Choose S1 for p ∈ [p∗N2 , 1].

Define the following strategy for firm 2(s2):

Choose S1 for p ∈ (p′1, 1].

For p ∈ (p∗N1 , p′1], choose S1 if 1 is at S1. If 1 is at S2, then choose S2.

Choose S2 for p ∈ [0, p∗N1 ].

Then (s1, s2) constitutes an equilibrium and the outcome is the same as the one obtained

with stationary Markovian strategies.
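As an illustration of Case 1, the sketch below computes the thresholds $p'_1$ and $p'_2$ from the formulas above and encodes firm 1's strategy as a function of the belief and of firm 2's location. The numerical values, the value used for $p^{*N}_2$, and the names `switching_thresholds` and `firm1_site` are hypothetical; in the text $p^{*N}_2$ is determined by the equilibrium analysis, not chosen freely.

```python
from fractions import Fraction as F


def switching_thresholds(r, pi, pi_prime):
    """Return (p1', p2') as defined in Case 1."""
    denom = r * pi_prime + pi * (r + pi_prime)
    p2_prime = r * pi / denom
    p1_prime = pi_prime * (r + pi) / denom
    return p1_prime, p2_prime


def firm1_site(p, firm2_site, p2_prime, p2_star_n):
    """Firm 1's strategy s1: depends on the belief p and on firm 2's location."""
    if p < p2_prime:
        return "S2"
    if p < p2_star_n:
        # Intermediate beliefs: follow the opponent.
        return "S2" if firm2_site == "S2" else "S1"
    return "S1"


# Hypothetical parameters satisfying the Case 1 condition r(pi' - pi) - pi*pi' > 0.
r, pi, pi_prime = F(2), F(3, 10), F(1, 2)
assert r * (pi_prime - pi) - pi * pi_prime > 0

p1_prime, p2_prime = switching_thresholds(r, pi, pi_prime)
p2_star_n = F(2, 5)  # hypothetical stand-in for p^{*N}_2, chosen above p2'

print(p1_prime, p2_prime)                              # the two thresholds
print(firm1_site(F(3, 8), "S1", p2_prime, p2_star_n))  # follows firm 2 to S1
print(firm1_site(F(3, 8), "S2", p2_prime, p2_star_n))  # follows firm 2 to S2
```

One property visible from the formulas is that the two thresholds sum to one, since their numerators add up to the common denominator.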


Case 2:

r(π′ − π)− ππ′ < 0

First, observe that $p'_1 > p^*_1$ and $p'_2 < p^*_2$. Thus for any $p^* \in [\max\{p_s, p^{*N}_1\}, 1 - \max\{p_s, p^{*N}_1\}]$, $p^* \in (p'_2, p'_1)$.

Define the following strategy for firm 1(s1) :

Choose S1 for p ∈ [p∗, 1].

For p ∈ [p′2, p∗), choose S2 if 2 is at S2. If 2 is at S1 then choose S1.

Choose S2 for p ∈ [0, p′2).

Define the following strategy for firm 2(s2) :

Choose S2 for p ∈ [0, p∗].

For p ∈ (p∗, p′1], choose S1 if 1 is at S1. If 1 is at S2, then choose S2.

Choose S1 for p ∈ (p′1, 1].

(s1, s2) constitutes an equilibrium and the outcome is the same as the one obtained with

stationary Markovian strategies.

A.5 Proof of Lemma 11

The proof proceeds as follows: We first show that F1(·), F2(·) as given are probability distributions and have the desired properties. Next we show that q, q′ are in (0,1). Assuming pl is between M and H, we then show that the strategies are an equilibrium. In lemma (12), we show that there is a unique pl implied by all these conditions and it is between M and H.

Since both buyers offer to SM , it is clear that in equilibrium the offers to SM from both


the buyers have to be randomised.

To begin with, we figure out the continuation payoff for SM from rejecting her offer(s).

Consider the case when rejecting an offer leads her to face a 2-player game next period.

This gives her a continuation payoff of zero.

When rejection leads SM to face a 4-player game next period, the continuation payoff

needs to be determined from the equilibrium strategies of the buyers. (Recall y is the

maximum price SM gets in equilibrium in the next period (a random variable this period)).

Thus if pl is the minimum acceptable price for SM in this situation, we must have,

pl −M = δ(E(y)−M)

⇒ pl = (1− δ)M + δ(E(y))

Given the buyers’ strategies, E(y) is given by,

E(y) = q[q′M + (1− q′)E2(p)] + (1− q)[q′E1(p) + (1− q′)E(highest offer)]

where E1(p) is the conditional expectation of B1’s offers given that he is offering to SM

and E2(p) is the conditional expectation of B2’s offers given that he is not offering M to

SM .

Since, as per our proposed strategies, competition takes place for SM only, it is easy to

note that E(y) > M . The fact δ ∈ (0, 1) implies that we must have pl > M .

Consider the region [pl, H] first, where both B2 and B1( if he does make one to SM )

make an offer. In equilibrium both buyers must be indifferent for all price offers in this

region.

According to the proposed strategies the support of B1’s offer to SM is [pl, H]. Also

we know that B1 in equilibrium can obtain a payoff of v −H by offering H to SH . Hence


for any s ∈ (pl, H] we should have the following indifference relation:

(v − s)[q′ + (1− q′)F2(s)] + (1− q′)(1− F2(s))[δ(v −H)] = v −H

which gives us,

$$F_2(s) = \frac{(v-H)(1-\delta(1-q')) - q'(v-s)}{(1-q')[(v-s)-\delta(v-H)]} \tag{A.5}$$

As stated earlier, pl is the minimum acceptable price for SM when, on rejection, she faces a 4-player game next period. This implies that on the equilibrium path pl is the minimum acceptable price for SM when she gets two offers. Thus B1’s offer of pl to SM is accepted only when B2 offers M to SM . Hence for s = pl, B1’s indifference relation is,

(v − pl)[q′] + (1− q′)[δ(v −H)] = v −H

which implies,

$$q' = \frac{[v-H](1-\delta)}{(v-p_l)-\delta(v-H)}$$

as per (3.4).

Since $H > p_l$, from (3.4) we have,

$$q' = \frac{[v-H](1-\delta)}{(v-p_l)-\delta(v-H)} < \frac{[v-H](1-\delta)}{[v-H](1-\delta)} = 1$$

This implies that $q' \in (0,1)$.

For $F_2(.)$ to be a distribution function as conjectured we must have $F_2(H) = 1$ and $F_2(p_l) = 0$. From (3.2) we have,

$$1 - F_2(s) = \frac{H-s}{(1-q')[(v-s)-\delta(v-H)]}$$


From (3.4) we can infer that,

$$1 - F_2(p_l) = \frac{H-p_l}{(1-q')[(v-p_l)-\delta(v-H)]} = \frac{(1-q')}{(1-q')} = 1$$

and $1 - F_2(H) = 0$. Thus we have $F_2(H) = 1$ and $F_2(p_l) = 0$. Hence $F_2$ has the conjectured properties.

Now consider the behavior of B2 in the selected region. Since B2 can obtain a payoff

of v −H by offering H to SM , for any s ∈ [pl, H] we should have,

(v − s)[q + (1− q)F1(s)] + (1− q)(1− F1(s))[δ(v −H)] = v −H

which gives us (3.1).

Next, consider other regions. According to the conjectured equilibrium strategies, B2 offers M to SM with probability q′ (i.e., he puts a mass point at M). Also B1 offers H to SH with probability q. At equilibrium B2 should be indifferent for all price offers he makes.

Therefore we should have,

(v −M)q + (1− q)δ(v −H) = v −H

which gives us (3.3). Since $H > M$, from (3.3) we have,

$$q = \frac{[v-H](1-\delta)}{(v-M)-\delta(v-H)} < \frac{[v-H](1-\delta)}{[v-H](1-\delta)} = 1$$

This implies that $q \in (0,1)$.

For $F_1$ to satisfy the conjectured properties we should have $F_1(p_l) > 0$ (since $B_1$ puts a mass point at $p_l$ while offering to $S_M$) and $F_1(H) = 1$. From (3.1) and (3.3) we have,

$$1 - F_1(p_l) = \frac{H-p_l}{(1-q)[(v-p_l)-\delta(v-H)]} = \frac{(1-q')}{(1-q)}$$

Since $p_l > M$, the denominator of $q'$ is smaller than that of $q$, so $q' > q$. Thus

$$\frac{(1-q')}{(1-q)} < 1 \tag{A.6}$$

From (A.6) we can infer that,

$$1 - F_1(p_l) < 1 \Rightarrow F_1(p_l) > 0$$

Also it is easy to note that $1 - F_1(H) = 0$. Hence $F_1(H) = 1$. Thus $F_1(.)$ satisfies the conjectured properties.
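The properties derived above can be confirmed on a numerical example. The sketch below uses the closed forms for $q$, $q'$ and the offer distributions, and checks that $F_2(p_l) = 0$, $F_2(H) = 1$, $F_1(p_l) > 0$, and that a buyer facing $F_2$ earns $v - H$ at any price in the support. All numbers are hypothetical; in particular $p_l$ is simply set to a value between $M$ and $H$, whereas in the text it is pinned down by the fixed-point condition of lemma (12). The function names are also assumptions.

```python
from fractions import Fraction as F


def mass_probability(v, H, delta, price):
    """Solve (v - price) * m + (1 - m) * delta * (v - H) = v - H for m."""
    return (v - H) * (1 - delta) / ((v - price) - delta * (v - H))


def offer_cdf(s, v, H, delta, mass):
    """Offer distribution that keeps the rival indifferent, given the rival's
    competitor plays its mass-point action with probability `mass`."""
    numerator = (v - H) * (1 - delta * (1 - mass)) - mass * (v - s)
    denominator = (1 - mass) * ((v - s) - delta * (v - H))
    return numerator / denominator


def buyer_payoff(s, v, H, delta, mass):
    """Expected payoff from offering s against a rival described by (mass, offer_cdf)."""
    cdf = offer_cdf(s, v, H, delta, mass)
    win = mass + (1 - mass) * cdf
    return (v - s) * win + (1 - mass) * (1 - cdf) * delta * (v - H)


# Hypothetical primitives with M < p_l < H < v.
v, H, M, delta = F(10), F(6), F(2), F(9, 10)
p_l = F(4)  # illustrative value only

q = mass_probability(v, H, delta, M)          # B1's probability of offering H to S_H
q_prime = mass_probability(v, H, delta, p_l)  # B2's mass point at M

assert 0 < q < q_prime < 1
assert offer_cdf(p_l, v, H, delta, q_prime) == 0   # F2(p_l) = 0
assert offer_cdf(H, v, H, delta, q_prime) == 1     # F2(H) = 1
assert offer_cdf(p_l, v, H, delta, q) > 0          # F1 has a mass point at p_l
assert offer_cdf(H, v, H, delta, q) == 1           # F1(H) = 1
assert buyer_payoff(F(5), v, H, delta, q_prime) == v - H  # indifference inside the support
```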

Lastly, to conclude the proof it needs to be verified that above specified strategies

constitute a subgame perfect equilibrium. We use the one deviation property to do this.

Consider the sellers first. Since we are considering public offers, a seller’s history constitutes the set of players, the offer she receives and the other seller’s received offer. On

the equilibrium path there are only two possible histories. One has all the players present

with both sellers getting equilibrium offers. The other one is when only two players are

present and an equilibrium offer is made. It is easy to observe that in the two-player game

no seller has a profitable one-shot deviation. This is because offers are one sided. Thus we

need to verify equilibrium for the 4-player game only. In the 4-player game irrespective of

SM ’s offer, it is always optimal for SH to accept any offer greater than or equal to H. If

she rejects then next period either she will face a 4-player game or a 2-player game.

In either case, given that other players adhere to their equilibrium strategies the maximum

payoff which SH can obtain is 0. Also SH has no incentive to accept any offer less than H,

(which gives her a negative payoff), as she can always guarantee a zero payoff by rejecting

the offer.

Next let us look at the possible one-shot deviations for SM on the equilibrium path.

Suppose in the event when she gets two offers, she rejects an offer greater than or equal


to pl. Her continuation payoff would then be pl −M . This is less than or equal to the

payoff obtained by accepting the offer. Thus on the candidate equilibrium path there is

no profitable one-shot deviation by SM . Finally, the way we have specified SM ’s strategy,

there exists no profitable one shot deviations for SM for any off-path history.

Now consider the buyers. After any history there can be only two possible situations.

Either all the players are present or only one pair remains. Given other players’ strategies

and the one-deviation property it is easy to note that buyers cannot profitably deviate.

This concludes the proof of the lemma.

A.6 Proof of Lemma 13

We prove this in the following steps:

(i) From the expression obtained for $q'$ we can say that $q'_x$ is increasing in $x$.

(ii) Next we show that as we raise $x$ by 1 unit, there is an increase in $E^x_2(p)$ by less than 1 unit.

Increasing $x$ by 1 unit means raising the lower bound of the support of $F^x_2(.)$ by 1 unit. Thus we need to show that

$$E^{x+1}_2(p) < E^x_2(p) + 1$$

Consider the shifted distribution $\hat{F}^x_2(.)$ with $[x+1, H+1]$ as the support such that,

$$\hat{F}^x_2(s) = F^x_2(s-1)$$

Let $\hat{E}^x_2(p)$ be the expectation obtained under $\hat{F}^x_2(s)$. Thus,

$$\hat{E}^x_2(p) = \int_{x+1}^{H+1} s\, d\hat{F}^x_2(s)$$

$$\Rightarrow \hat{E}^x_2(p) = \left[\int_{x+1}^{H+1} (s-1)\, d\hat{F}^x_2(s)\right] + 1 = \left[\int_{x+1}^{H+1} (s-1)\, dF^x_2(s-1)\right] + 1 = \left[\int_{x}^{H} s\, dF^x_2(s)\right] + 1 = E^x_2(p) + 1$$

$F^{x+1}_2(.)$ is obtained from $\hat{F}^x_2(s)$ by transferring the mass from the interval $(H, H+1]$ to $[x+1, H]$, i.e. transferring mass from higher values to lower values. Thus it is clear that,

$$E^{x+1}_2(p) < \hat{E}^x_2(p) = E^x_2(p) + 1$$

By similar reasoning we can say that,

$$E^{x+1}_1(p) < E^x_1(p) + 1$$

These imply that the increase in E(highest offer) following a unit increase in $x$ is less than 1.

Hence from the above arguments it follows that,

$$\frac{\partial E^x(y)}{\partial x} < 1$$

A.7 Proof of lemma 23

As before, define the function G(.) as,

G(x) = x− [δEx(y) + (1− δ)M ]


where $E^x(y)$ is obtained from $E(y)$ as before (i.e. by replacing $p_l$ by $x$). Using lemma 13 we can argue that $G'(x) > 0$, so that $G(x)$ is monotonically increasing in $x$ for $x \in (p'_l, H)$. Next, from the above prescribed strategies it is easy to see that for any $x \in (p'_l, H)$, we have $E^x(y) > p'_l$. Thus we can infer that there exists a $\delta^* \in (0,1)$ such that,

$$\lim_{x \to p'_l} G(x) = x - [\delta^* E^x(y) + (1-\delta^*)M] = 0$$

Thus for any $\delta > \delta^*$, we have $\lim_{x \to p'_l} G(x) < 0$. Also since for all $x \in (p'_l, H)$, $E^x(y) < H$, we have $\lim_{x \to H} G(x) > 0$. Hence by applying the Intermediate Value Theorem we can infer that there exists a unique $x^* \in (p'_l, H)$ such that $G(x^*) = 0$. This $x^*$ is our required $p_l$. Thus there is a unique $p_l \in (p'_l, H)$ such that for all $\delta > \delta^*$,

$$G(p_l) = 0 \Rightarrow p_l = \delta E(y) + (1-\delta)M$$
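The existence argument translates directly into a bisection search: $G$ is negative near the lower end of the interval, positive near $H$, and increasing in between, so its unique root can be bracketed. The sketch below illustrates this with a stand-in for $E^x(y)$. The true $E^x(y)$ comes from the equilibrium distributions in the main text; here it is replaced by the hypothetical function $(x+H)/2$, which merely shares the two properties the proof uses (slope below one, value below $H$). The lower bound used for $p'_l$ and all parameter values are likewise assumptions.

```python
def G(x, delta, M, expected_offer):
    """G(x) = x - [delta * E^x(y) + (1 - delta) * M]."""
    return x - (delta * expected_offer(x) + (1.0 - delta) * M)


def bisect_root(f, lo, hi, tol=1e-12):
    """Find the root of an increasing function f on [lo, hi] by bisection."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("f must change sign on [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (f(mid) < 0) == (f_lo < 0):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical primitives and a hypothetical stand-in for E^x(y).
H, M, delta = 6.0, 2.0, 0.9
p_l_lower = 3.0  # illustrative stand-in for p'_l


def stand_in_expected_offer(x):
    # NOT the thesis's E^x(y): a placeholder with slope 1/2 < 1 and value < H.
    return 0.5 * (x + H)


p_l = bisect_root(lambda x: G(x, delta, M, stand_in_expected_offer), p_l_lower, H)
print(p_l)  # fixed point p_l = delta * E(y) + (1 - delta) * M under the stand-in
```

With the true $E^x(y)$ substituted for the stand-in, the same routine would return the equilibrium $p_l$, provided $\delta$ exceeds the threshold $\delta^*$ so that the sign change exists.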

A.8 Proof of Proposition 15

Consider Buyer B1 first. He puts a mass point at L and his equilibrium payoff is v −H.

Since we are considering public offers, S1 will accept an offer of L only when Bn is offering

to Sn. This is because only in that contingency would the continuation payoff to S1 from

rejection be zero. Thus we must have,

(v − L)qH + (1− qH)δ(v −H) = v −H (A.7)

Solving for qH we get (3.12). Consider the region [p1, H], where both B1 and Bn make

offers. In equilibrium each buyer should be indifferent among all the points in the support.

Thus for s ∈ [p1, H], B1’s indifference relation is given by:

$$(v-s)[(1-\bar{q}_1) + \bar{q}_1 F^1_n(s)] + \bar{q}_1(1-F^1_n(s))\delta(v-H) = v-H$$

Solving for $F^1_n(.)$ from the above relation we get (3.14). Similarly for $s \in [p_1, H]$, $B_n$'s indifference relation from offering to $S_1$ is,

$$(v-s)[q'_1 + (1-q'_1)F_1(s)] + (1-q'_1)(1-F_1(s))\delta(v-H) = v-H$$

Solving for $F_1(.)$ we get (3.13). Putting $s = p_1$ in $B_n$'s indifference relation we get,

$$(v-p_1)q'_1 + (1-q'_1)\delta(v-H) = v-H$$

which gives us (3.15). Note that from (3.14) and (3.11) we have,

$$1 - F^1_n(p_1) = \frac{H-p_1}{\bar{q}_1[(v-p_1)-\delta(v-H)]} = \frac{q_1}{\bar{q}_1}$$

From (3.16) we know that $\bar{q}_1 > q_1$. Hence we have $1 - F^1_n(p_1) < 1$ which implies that $F^1_n(p_1) > 0$. This confirms our conjecture that $B_n$, while offering to $S_1$, puts a mass point at $p_1$. It is easy to check that $F^1_n(H) = 1$. Similarly from (3.13) and (3.15) we have,

$$1 - F_1(p_1) = \frac{H-p_1}{(1-q'_1)[(v-p_1)-\delta(v-H)]} = \frac{(1-q'_1)}{(1-q'_1)} = 1$$

which implies $F_1(p_1) = 0$. Again it is easy to observe that $F_1(H) = 1$.

Next, consider buyer Bi, i = 2, ..., n − 1. Consider the region [pi, H], where both Bi

and Bn make offers. In equilibrium both buyers should be indifferent between any offers

in the region. For s ∈ [pi, H], Bn’s indifference relation is given by,

(v − s)[Fi(s)] + [1− Fi(s)]δ(v −H) = v −H

Solving for Fi(.) from above, we get (3.17). We can easily infer that Fi(pi) > 0 and

Fi(H) = 1. This confirms the conjecture that Bi puts a mass point at pi. Similarly, Bi’s


indifference relation is given by:

$$(v-s)[(1-\bar{q}_i) + \bar{q}_i F^i_n(s)] + \bar{q}_i(1-F^i_n(s))\delta(v-H) = v-H$$

which gives us (3.18). Putting $s = p_i$ in $B_i$'s indifference relation we get $\bar{q}_i = \frac{H-p_i}{(v-p_i)-\delta(v-H)} = q_i$. Hence we have,

$$1 - F^i_n(p_i) = \frac{H-p_i}{\bar{q}_i[(v-p_i)-\delta(v-H)]} = \frac{q_i}{\bar{q}_i} = 1$$

Thus $F^i_n(p_i) = 0$ and $F^i_n(H) = 1$. Also note that,

$$\sum_{i=1,..,n-1} \bar{q}_i + q_H = q_1 + (1 - P - q_H) + \sum_{j=2,..,n-1} q_j + q_H = 1$$

Since $u_j > L$ for $j > 1$, from (A.7) we know that,

$$(v-u_j)q_H + (1-q_H)\delta(v-H) < v-H \quad \text{for } j = 2, ..., n-1$$

Hence Bi (i = 2, ..., N) does not have any incentive to offer uj to seller Sj . Further, Bi

cannot obtain a payoff higher than v − H by deviating unilaterally and making offers to

any other sellers. Lastly, the way we have specified sellers’ strategies it is easy to check

that none of the sellers has a unilateral profitable deviation on the equilibrium path. This

concludes the proof.

A.9 Proof of Proposition 16

This proof is identical in many respects to the proof of proposition (15). Consider the region

[pi, H],(i = 1, .., n−1). In this region both Bi and Bn make offers with positive probability.

By considering the indifference relations of Bi and Bn in this region, we can get (3.19) and

(3.20) in the same manner as we obtained (3.17) and (3.18) in the proof of the previous


proposition. Similarly, we can infer that $F_i(p_i) > 0$, $F_i(H) = 1$ and $F^i_n(p_i) = 0$, $F^i_n(H) = 1$.

Since qn = 1− P < qH , from (A.7) we know that,

(v − L)qH + (1− qH)δ(v −H) = v −H and

(v − uj)qH + (1− qH)δ(v −H) < v −H

for all j = 2, .., n− 1. Since qn < qH , for all j = 1, ..n− 1 we have,

(v − uj)qn + (1− qn)δ(v −H) < v −H

Hence $B_i$ ($i = 1, ..., n-1$) has no incentive to offer $u_i$ to seller $S_i$. Finally note that,

$$\sum_{i=1,..,n} q_i = \sum_{i=1,..,n-1} q_i + (1 - P) = 1$$

This concludes the proof.

A.10 Proof of Proposition 17

Consider the region $[p_i, \bar{p}]$ ($i = 1, ..., n-1$), where both the buyers $B_i$ and $B_n$ make offers. Hence the indifference relation of $B_n$ is given by,

$$(v-s)F_i(s) + (1-F_i(s))\delta(v-H) = v - \bar{p}$$

This gives us (3.21). One can easily figure out from (3.21) that $F_i(p_i) > 0$ and $F_i(\bar{p}) = 1$. This confirms our conjecture that $B_i$ ($i = 1, .., n-1$) puts a mass point at $p_i$. Buyer $B_i$'s indifference relation is given by,

$$(v-s)[(1-q_i) + q_i F^i_n(s)] + q_i(1-F^i_n(s))\delta(v-H) = v - \bar{p}$$

Solving for $F^i_n(.)$ we get (3.22). By substituting $s = p_i$ in $B_i$'s indifference relation we get (3.23). From (3.22) and (3.23) it is easy to see that $F^i_n(p_i) = 0$ and $F^i_n(H) = 1$. For consistency in the expressions obtained we must have,

$$\sum_{i=1,..,n-1} q_i = 1 \Rightarrow \sum_{i=1,..,n-1} \frac{\bar{p} - p_i}{(v-p_i)-\delta(v-H)} = 1 \tag{A.8}$$

From the hypothesis of the proposition we know that $P \geq 1$. If $P = 1$, from (A.8) we have $\bar{p} = H$. If $P > 1$, from (A.8) we can infer that $\bar{p} < H$.

From the analysis of the basic complete information game we know that for each $i = 1, ..., n-1$, $\frac{H-p_i}{(v-p_i)-\delta(v-H)} \to 1$ and $p_i \to H$ as $\delta \to 1$. Thus if $P > 1$ for a particular $\delta^* \in (0,1)$,¹ it will be so for all $\delta > \delta^*$. Thus, the equilibrium behavior will remain the same for all higher values of $\delta$. Hence we can characterise the equilibrium for values of $\delta$ close to one. Using (A.8) we have,

$$\sum_{i=1,..,n-1}\left(1 - \frac{\bar{p} - p_i}{(v-p_i)-\delta(v-H)}\right) = n-2 \Rightarrow \sum_{i=1,..,n-1} \frac{(v-\bar{p})-\delta(v-H)}{(v-p_i)-\delta(v-H)} = n-2$$

$$\Rightarrow \bar{p} = v - (n-2)\left[\frac{\prod_{i=1,..,n-1}[(v-p_i)-\delta(v-H)]}{\sum_{j=1,..,n-1}\left[\prod_{k=1,..,n-1;\,k\neq j}\{(v-p_k)-\delta(v-H)\}\right]}\right] - \delta(v-H) \tag{A.9}$$

From the basic model we know that for each $i = 1, .., n-1$, $p_i \to H$ as $\delta \to 1$. Hence $[(v-p_i)-\delta(v-H)] \to 0$ as $\delta \to 1$. From (A.9) we have,

$$\bar{p} = v - \left[\frac{n-2}{\sum_{j=1,..,n-1}\left[\frac{\prod_{k=1,..,n-1;\,k\neq j}\{(v-p_k)-\delta(v-H)\}}{\prod_{i=1,..,n-1}[(v-p_i)-\delta(v-H)]}\right]}\right] - \delta(v-H)$$

As $\delta \to 1$, $\left[\frac{n-2}{\sum_{j=1,..,n-1}\left[\frac{\prod_{k=1,..,n-1;\,k\neq j}\{(v-p_k)-\delta(v-H)\}}{\prod_{i=1,..,n-1}[(v-p_i)-\delta(v-H)]}\right]}\right] \to 0$. Hence as $\delta \to 1$, $\bar{p} \to H$. This concludes the proof.

¹ In fact, as $\delta$ increases we will eventually have $P > 1$.
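The closed form (A.9) for the upper bound of the support can be checked against the adding-up condition (A.8) with exact fractions. The sketch below computes the upper bound from a list of lower bounds $p_i$ and verifies that the implied probabilities $q_i$ sum to one. The values of $v$, $H$, $\delta$ and the $p_i$ are hypothetical (in the text the $p_i$ are equilibrium objects), and the function names are assumptions.

```python
from fractions import Fraction as F
from math import prod


def upper_bound_price(v, H, delta, lower_bounds):
    """Upper bound of the support from (A.9); lower_bounds holds p_1, ..., p_{n-1}."""
    m = len(lower_bounds)  # m = n - 1
    D = [(v - p) - delta * (v - H) for p in lower_bounds]
    numerator = prod(D)
    denominator = sum(prod(D[k] for k in range(m) if k != j) for j in range(m))
    return v - (m - 1) * numerator / denominator - delta * (v - H)


def offer_probabilities(v, H, delta, lower_bounds, p_bar):
    """q_i = (p_bar - p_i) / ((v - p_i) - delta (v - H)), as in (A.8)."""
    return [(p_bar - p) / ((v - p) - delta * (v - H)) for p in lower_bounds]


# Hypothetical primitives with n - 1 = 3 sellers receiving offers from B_n.
v, H, delta = F(10), F(6), F(9, 10)
lower_bounds = [F(5), F(11, 2), F(23, 4)]

p_bar = upper_bound_price(v, H, delta, lower_bounds)
q = offer_probabilities(v, H, delta, lower_bounds, p_bar)

assert sum(q) == 1                 # (A.8) holds exactly
assert max(lower_bounds) < p_bar   # the bound lies above every p_i
assert p_bar < H                   # and below H in this example
```

In this example the sum of $(H - p_i)/((v-p_i) - \delta(v-H))$ over $i$ exceeds one, which is the case in which the text concludes that the upper bound is strictly below $H$.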


A.11 Details of the equilibria defined in proposition (18)

We give here a more detailed description of the equilibrium for heterogeneous buyers for

the n× n model.

A.11.1 Ph < 1 and 1− Ph > qH

Buyer $B_i$ ($i = 1, .., n-1$) offers to seller $S_i$ only. $B_1$ while making offers to $S_1$ puts a mass of $q^{\prime h}_1$ at $L$. With probability $(1 - q^{\prime h}_1)$ he randomises his offers to $S_1$ using a continuous (conditional) probability distribution function $F^h_1$ with $[p^h_1, H]$ as the support. $B_n$ offers to $S_1$ with probability $\bar{q}^h_1$. His offers are randomised using a probability distribution function $F^1_{nh}$ with $[p^h_1, H]$ as the support. $F^1_{nh}$ puts a mass point at $p^h_1$. The distributions $F^h_1$, $F^1_{nh}$ and the probabilities $\bar{q}^h_1$ and $q^{\prime h}_1$ are given by:

$$F^h_1 = \frac{(v_n-H)[1-\delta(1-q^{\prime h}_1)] - q^{\prime h}_1(v_n-s)}{(1-q^{\prime h}_1)[(v_n-s)-\delta(v_n-H)]}$$

$$F^1_{nh} = \frac{(v-H)[1-\delta \bar{q}^h_1] - (1-\bar{q}^h_1)(v-s)}{\bar{q}^h_1[(v-s)-\delta(v-H)]}$$

$$q^{\prime h}_1 = \frac{(v_n-H)(1-\delta)}{(v_n-p^h_1)-\delta(v_n-H)}$$

$$\bar{q}^h_1 = q^h_1 + (1 - P_h - q_H)$$

For $i = 2, .., n-1$, $B_i$'s offers to $S_i$ are randomised with a distribution $F^h_i(s)$. $F^h_i(.)$ puts a mass point at $p^h_i$ and has an absolutely continuous part from $p^h_i$ to $H$. $B_n$ makes offers to $S_i$ ($i = 2, .., n-1$) with probability $\bar{q}^h_i = q^h_i$. $B_n$'s offers to $S_i$ are randomised using an absolutely continuous probability distribution $F^i_{nh}$ with $[p^h_i, H]$ as the support. For $i = 2, ..., n-1$, $F^h_i(.)$, $F^i_{nh}(.)$ are given by,

$$F^h_i = \frac{(v_n-H)(1-\delta)}{(v_n-s)-\delta(v_n-H)}$$

$$F^i_{nh} = \frac{(v_i-H)[1-\delta q^h_i] - (1-q^h_i)(v_i-s)}{q^h_i[(v_i-s)-\delta(v_i-H)]}$$

Bn offers to Sn with probability qH . He offers H to Sn.

A.11.2 Ph < 1 and 1− Ph < qH

Buyer $B_i$ ($i = 1, .., n-1$) offers to seller $S_i$ only. $B_i$'s offers to $S_i$ are random with a distribution $F^h_i(s)$. $F^h_i(.)$ puts a mass point at $p^h_i$ and has an absolutely continuous part from $p^h_i$ to $H$. $B_n$ makes offers to $S_i$ ($i = 1, .., n-1$) with probability $\bar{q}^h_i = q^h_i$. $B_n$'s offers to $S_i$ are random with an absolutely continuous probability distribution $F^i_{nh}$ with $[p^h_i, H]$ as the support. For $i = 1, .., n-1$, $F^h_i(.)$ and $F^i_{nh}(.)$ are given by

$$F^h_i = \frac{(v_n-H)(1-\delta)}{(v_n-s)-\delta(v_n-H)}$$

$$F^i_{nh} = \frac{(v_i-H)[1-\delta q^h_i] - (1-q^h_i)(v_i-s)}{q^h_i[(v_i-s)-\delta(v_i-H)]}$$

$B_n$ offers to $S_n$ with probability $q^h_n = 1 - P_h$. He offers $H$ to $S_n$.

A.11.3 Ph ≥ 1

Buyer $B_i$ makes offers to seller $S_i$ only. $B_i$'s offers to $S_i$ are randomised using a distribution function $F^h_i(.)$ with $[p^h_i, \bar{p}^h]$ as the support. The distribution $F^h_i(.)$ puts a mass point at $p^h_i$ and has an absolutely continuous part from $p^h_i$ to $\bar{p}^h$. Buyer $B_n$ makes offers to all sellers except $S_n$. $B_n$'s offers to $S_i$ ($i = 1, .., n-1$) are randomised with a continuous probability distribution $F^i_{nh}$. The support of offers is $[p^h_i, \bar{p}^h]$. The probability with which $B_n$ makes offers to $S_i$ is $q^h_i$. If $P_h = 1$ then $\bar{p}^h = H$. If $P_h > 1$ then $\bar{p}^h < H$ and as $\delta \to 1$, $\bar{p}^h \to H$. $F^h_i(.)$, $F^i_{nh}$ and $q^h_i$ are given by the following expressions:

$$F^h_i(s) = \frac{(v_n-\bar{p}^h)-\delta(v_n-H)}{(v_n-s)-\delta(v_n-H)}$$

$$F^i_{nh} = \frac{(v_i-H)[1-\delta q^h_i] - (1-q^h_i)(v_i-H)}{q_i[(v_i-s)-\delta(v_i-H)]}$$

$$q^h_i = \frac{\bar{p}^h - p^h_i}{(v_i-p^h_i)-\delta(v_i-H)}$$

A.12 Off-path behavior of the 2 player game with incomplete information

We recapitulate here the off-path beliefs that sustain the equilibrium we have discussed for the two-player game. Suppose, for a given $\delta$ and $\pi$, the equilibrium offer is $\delta^t H$ (i.e. $\pi \in [d_t, d_{t+1})$). We need to consider the following off-path contingencies.

(a) The buyer offers $p^o$ to the seller such that $p^o < \delta^t H$: If $p^o < \delta^{t+1}H$ then both the L-type and H-type seller reject this offer with probability 1. If $p^o \in [\delta^{t+1}H, \delta^t H)$ then the L-type seller rejects this with a probability which, through Bayes' rule, implies that the updated belief is $d_t$. Let this probability be $\beta''(p)$. Hence the acceptance probability of this offer is $a''(p) = \pi\beta''(p)$. The H-type seller always rejects this offer. Since $p^o \in [\delta^{t+1}H, \delta^t H)$, there exists a $k \in (0, 1]$ such that $p^o = k\delta^{t+1}H + (1-k)\delta^t H$. Next period (if the seller rejects now) the buyer offers $\delta^t H$ with probability $k$ and $\delta^{t-1}H$ with probability $(1-k)$. This is optimal from the point of view of the buyer because at $\pi = d_t$, the buyer is indifferent between offering $\delta^t H$ and $\delta^{t-1}H$. Also the expected continuation payoff to the L-type seller from rejection is equal to $\delta(k\delta^t H + (1-k)\delta^{t-1}H) = p^o$. Thus the L-type seller is indifferent between accepting and rejecting the offer of $p^o$.

The way the cutoffs $d_t$ are derived ensures that the buyer has no incentive to deviate and offer something less than $\delta^t H$.

(b) Next, consider the case when the buyer offers $p^o$ to the seller such that $p^o > \delta^t H$. If $p^o \in (\delta^t H, \delta^{t-1}H]$, the L-type seller rejects this offer with a probability that takes the updated belief to $d_{t-1}$. Since $p^o \in (\delta^t H, \delta^{t-1}H]$, there exists a $k \in (0, 1]$, such that $p^o = k\delta^{t-1}H + (1-k)\delta^t H$. If the seller rejects then next period the buyer offers $\delta^{t-2}H$ with probability $k$ and $\delta^{t-1}H$ with probability $1-k$. This is optimal from the buyer's point of view since at $\pi = d_{t-1}$, the buyer is indifferent between offering $\delta^{t-1}H$ and $\delta^{t-2}H$. Since the expected payoff to the L-type seller from rejection is $\delta(k\delta^{t-2}H + (1-k)\delta^{t-1}H) = p^o$, he is indifferent between accepting and rejecting an offer of $p^o$. As $p^o$ is strictly greater than $\delta^t H$ and the acceptance probability is the same as that of the equilibrium offer, the buyer has no incentive to deviate and offer $p^o$ to the seller where $p^o \in (\delta^t H, \delta^{t-1}H]$.

If $p^o \in (\delta^\tau, \delta^{\tau-1}]$ (for $\tau \leq t-1$) then the L-type seller rejects this with a probability which through Bayes' rule implies that the updated belief is $d_{\tau-1}$. If the seller rejects then next period the buyer randomises between offering $\delta^{\tau-1}H$ and $\delta^{\tau-2}H$ such that the expected continuation payoff to the L-type seller from rejection is $p^o$. It can be checked that the buyer has no incentive to deviate and offer $p^o$ where $p^o \in (\delta^\tau, \delta^{\tau-1}]$ ($\tau \leq t-1$).
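The indifference of the L-type seller in contingencies (a) and (b) rests on a simple identity: the buyer's next-period randomisation is a one-period-ahead copy of the convex combination that defines the off-path offer. The sketch below recovers the mixing weight $k$ for a given off-path offer and confirms that the discounted continuation payoff equals that offer. The values of $\delta$, $t$, $H$ and the off-path offers are hypothetical, as are the function names.

```python
from fractions import Fraction as F


def continuation_after_low_offer(p_o, delta, t, H):
    """Case (a): p_o in [delta^(t+1) H, delta^t H).

    Returns (k, discounted continuation payoff of the L-type seller)."""
    high, low = delta ** t * H, delta ** (t + 1) * H
    k = (high - p_o) / (high - low)  # p_o = k * low + (1 - k) * high
    payoff = delta * (k * delta ** t * H + (1 - k) * delta ** (t - 1) * H)
    return k, payoff


def continuation_after_high_offer(p_o, delta, t, H):
    """Case (b): p_o in (delta^t H, delta^(t-1) H].

    Returns (k, discounted continuation payoff of the L-type seller)."""
    low, high = delta ** t * H, delta ** (t - 1) * H
    k = (p_o - low) / (high - low)  # p_o = k * high + (1 - k) * low
    payoff = delta * (k * delta ** (t - 2) * H + (1 - k) * delta ** (t - 1) * H)
    return k, payoff


# Hypothetical values: the equilibrium offer is delta^3 * H = 4.374.
delta, t, H = F(9, 10), 3, F(6)

k_low, payoff_low = continuation_after_low_offer(F(41, 10), delta, t, H)
k_high, payoff_high = continuation_after_high_offer(F(9, 2), delta, t, H)

assert 0 < k_low <= 1 and payoff_low == F(41, 10)   # rejecting is worth exactly p_o
assert 0 < k_high <= 1 and payoff_high == F(9, 2)
```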

A.13 Off-path behavior of the 4 player game with incomplete information (public offers)

Suppose $B_2$ adheres to his equilibrium strategy. Then the off-path behavior of $B_1$ and that of the L-type $S_I$, when $B_1$ makes an offer greater than $\delta^t H$ to $S_I$, are the same as in the 2-player game with incomplete information. If $B_1$'s offer to $S_I$ is less than $\delta^t H$ then the off-path behavior of the L-type $S_I$ is as follows. If $B_2$'s offer to $S_M$ is in the range $[p_l(\pi), p(\pi)]$, then the L-type $S_I$ behaves in the same way as in the 2-player game. If $B_2$ offers $p'_l(\pi)$ to $S_M$ then the L-type $S_I$ accepts the offer with the equilibrium probability so that rejection takes the posterior to $d_{t-1}$. Next period, $B_1$ randomises between $d_{t-1}$ and $d_{t-2}$ so that the L-type $S_I$ is indifferent between accepting and rejecting the offer now. High values of $\delta$ ensure that $B_1$ has no incentive to deviate.

Next, suppose $B_2$ makes an unacceptable offer to $S_M$ (which is observable to $S_I$) and $B_1$ makes an equilibrium offer to $S_I$. The L-type $S_I$ rejects this offer with a probability that takes the updated belief to $d_{t-1}$. If $S_I$ rejects this equilibrium offer and next period both the buyers make offers to $S_M$, then two periods from now, the remaining buyer offers $\delta^{t-2}H$ (the buyer is indifferent between offering $\delta^{t-1}H$ and $\delta^{t-2}H$ at $\pi = d_{t-1}$) to $S_I$. Thus the expected continuation payoff to $S_I$ from rejection is $\delta(q(d_{t-1})\delta^{t-1}H + \delta(1-q(d_{t-1}))\delta^{t-2}H) = \delta^t H$. This implies that the L-type $S_I$ is indifferent between accepting and rejecting an offer of $\delta^t H$ if he observes $S_M$ to get an unacceptable offer.

Now consider the case when B2 deviates and makes an offer to SI . It is assumed that

if SI gets two offers then she disregards the lower offer.

Suppose $B_1$ makes an equilibrium offer to $S_I$ and $B_2$ deviates and offers something less than $\delta^t H$ to $S_I$. $S_I$'s probability of accepting the equilibrium offer (which is the higher offer in this case) remains the same. If $S_I$ rejects the higher offer (which in this case is the offer of $\delta^t H$ from $B_1$) and next period both the buyers make offers to $S_M$, then two periods from now, the remaining buyer offers $\delta^{t-2}H$ to $S_I$.

If $B_2$ deviates and offers $p^o \in (\delta^t H, \delta^{t-1}H]$ to $S_I$, then $S_I$ rejects this with a probability that takes the updated belief to $d_{t-2}$. If $S_I$ rejects this offer then next period if $B_1$ offers to $S_I$, he offers $\delta^{t-2}H$. If both $B_1$ and $B_2$ make offers to $S_M$ then two periods from now the remaining buyer randomises between offering $\delta^{t-2}H$ and $\delta^{t-3}H$ to $S_I$ (conditional on $S_I$ being present). Randomisations are done in a manner to ensure that the expected continuation payoff to $S_I$ from rejection is $p^o$. It is easy to check that this can always be done. Lastly, if $B_2$ deviates and offers to $S_I$ and $B_1$ offers to $S_M$ (according to his equilibrium strategy), then the off-path specifications are the same as in the 2-player game with incomplete information.

We will now show that $B_2$ has no incentive to deviate. Suppose he makes an unacceptable offer to $S_M$. His expected discounted payoff from deviation is given by,

$$D = q(\pi)[\delta\{a(\pi)(v-M) + (1-a(\pi))v_B(d_{t-1})\}] + (1-q(\pi))\delta v_B(\pi) \tag{A.10}$$

From (4.4) we know that,

$$p'_l(\pi) < M + \delta(1-a(\pi))[p(d_{t-1}) - M]$$

as $E_{d_{t-1}} < p(d_{t-1})$. Hence we have,

$$p'_l(\pi) < M + \delta(1-a(\pi))[(v-M) - (v-p(d_{t-1}))]$$

Rearranging the terms above we get,

$$(v - p'_l(\pi)) > \delta\{a(\pi)(v-M) + (1-a(\pi))v_B(d_{t-1})\} + (1-\delta)(v-M) \tag{A.11}$$

By comparing (A.10) and (A.11) we have,

$$q(\pi)(v - p'_l(\pi)) + (1-q(\pi))\delta v_B(\pi) > D$$

The L.H.S. of the above relation is $B_2$'s equilibrium payoff, as he puts a mass point at $p'_l(\pi)$. Hence he has no incentive to make an unacceptable offer to $S_M$.

Next, suppose $B_2$ deviates and makes an offer of $p^o$ to $S_I$ such that $p^o \in (\delta^t H, \delta^{t-1}H]$. $B_2$'s payoff from deviation is:

$$\Gamma_H = q(\pi)[(v-p^o)a'(\pi) + (1-a'(\pi))\delta v_B(d_{t-2})] + (1-q(\pi))[(v-p^o)a(\pi) + (1-a(\pi))\delta v_B(d_{t-1})]$$

where $a'(\pi)$ is the probability with which $B_2$'s offer is accepted by $S_I$ in the event when both $B_1$ and $B_2$ make offers to $S_I$ and $B_2$'s offer is in the range $(\delta^t H, \delta^{t-1}H]$. From our above specification it is clear that $a'(\pi) > a(\pi)$, where $a(\pi)$ is the acceptance probability of an equilibrium offer to $S_I$. This is also very intuitive. In the contingency when $B_1$ makes an equilibrium offer to $S_M$ and $B_2$'s out-of-equilibrium offer to $S_I$ is in the range $(\delta^t H, \delta^{t-1}H]$, the acceptance probability is equal to $a(\pi)$, the equilibrium acceptance probability. In this case if the L-type $S_I$ rejects an offer then next period he will get an offer with probability 1. However if both $B_1$ and $B_2$ make offers to $S_I$ and $B_2$'s offer is in the range $(\delta^t H, \delta^{t-1}H]$ then the L-type $S_I$ accepts this offer with a higher probability. This is because, on rejection, there is a positive probability that $S_I$ might not get an offer in the next period. This explains why $a'(\pi) > a(\pi)$.

Since $p^o > p'_l(\pi)$² and $p(d_{t-2}) > p'_l(\pi)$³, we have

$$v - p'_l(\pi) > (v-p^o)a'(\pi) + (1-a'(\pi))\delta v_B(d_{t-2}) \tag{A.12}$$

Also, since $p^o > \delta^t H$, we have

$$(v-p^o)a(\pi) + (1-a(\pi))\delta v_B(d_{t-1}) < v_B(\pi)$$

The expression $[(v-p^o)a(\pi) + (1-a(\pi))\delta v_B(d_{t-1}) - \delta v_B(\pi)]$ is strictly negative for $\delta = 1$. From continuity, we can say that for sufficiently high values of $\delta$, $(v-p^o)a(\pi) + (1-a(\pi))\delta v_B(d_{t-1}) < \delta v_B(\pi)$. This implies that,

$$(v - p'_l(\pi))q(\pi) + (1-q(\pi))\delta v_B(\pi) > \Gamma_H$$

The L.H.S. of the above inequality is the equilibrium payoff of $B_2$. Similarly if $B_2$ deviates and makes an offer to $S_I$ such that his offer $p^o$ is in the range $[\delta^{t+1}H, \delta^t H)$, the payoff from deviation is

$$\Gamma_L = q(\pi)[\delta\{a(\pi)(v-M) + (1-a(\pi))v_B(d_{t-1})\}] + (1-q(\pi))[(v-p^o)a''(\pi) + (1-a''(\pi))\delta v_B(d_t)]$$

From the 2-player game we know that $[(v-p^o)a''(\pi) + (1-a''(\pi))\delta v_B(d_t)] < v_B(\pi)$. Also from the previous analysis we can posit that $(v - p'_l(\pi)) > \delta\{a(\pi)(v-M) + (1-a(\pi))v_B(d_{t-1})\}$. Thus for sufficiently high values of $\delta$, $(v - p'_l(\pi))q(\pi) + (1-q(\pi))\delta v_B(\pi) > \Gamma_L$.

Hence $B_2$ has no incentive to deviate and make an offer to $S_I$.

² For sufficiently high values of $\delta$ this will always be the case.

³ Since $p(d_{t-2}) > p(\pi) > p'_l(\pi)$.

A.14 Off-path behavior with private offers

The off-path behavior described in the preceding appendix is not applicable to the case of

private offers. This is because it requires the offers made by both the buyers to be publicly

observable. The off-path behavior of the players in the case of private offers is described

as follows.

Specifically, we need to describe the behavior of the players in the following three contingencies.

(i) B2 makes an unacceptable offer to SM.

(ii) B2 makes an offer of $p^o$ to SI such that $p^o < \delta^t H$.

(iii) B2 makes an offer of $p^o$ to SI such that $p^o > \delta^t H$.

We denote the above three contingencies by E1, E2 and E3 respectively. We now

construct a particular belief system that sustains the equilibrium described in the text.

Suppose B1 attaches probabilities $\lambda$, $\lambda^2$ and $\lambda^3$ ($0 < \lambda < 1$) to E1, E2 and E3 respectively. Thus he thinks that B2 is going to stick to his equilibrium behavior with probability $[1 - (\lambda + \lambda^2 + \lambda^3)]$.

If E1 or E2 occurs and B1 makes an equilibrium offer to SI, then SI’s probability of
accepting the equilibrium offer remains the same and two periods from now (conditional on
the fact that the game continues until then), if B2 is the remaining buyer he offers $\delta^{t-2} H$
to SI. If E3 occurs and all players are observed to be present, then next period B2 offers
$p(d_{t-1})$ to SM. In any off-path contingency, if B1 is the last buyer remaining (two periods


from now) then he offers $\delta^{t-2} H$ to SI.

The L-type SI accepts an offer higher than $\delta^t H$ with probability 1 if she gets two offers.

If she gets only one offer then the probability of her acceptance of out-of-equilibrium offers

is the same as in the two-player game with incomplete information.

We will now argue that the off-path behavior constitutes a sequentially optimal response

by the players to the limiting beliefs as $\lambda \to 0$.

Suppose B1 makes an equilibrium offer to SI and it gets rejected. Although offers are

private, each player can observe the number of players remaining. Thus, next period, if

B1 finds that all four players are present he infers that this is due to an out-of-equilibrium

play by B2. Using Bayes’ rule he attaches the following probabilities to E1, E2 and E3

respectively.

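$$\frac{1}{1 + \lambda + \lambda^2} \text{ to } E_1, \qquad \frac{\lambda}{1 + \lambda + \lambda^2} \text{ to } E_2, \qquad \frac{\lambda^2}{1 + \lambda + \lambda^2} \text{ to } E_3.$$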

As $\lambda \to 0$, the probability attached to E1 goes to 1. Thus B1 believes that his equilibrium offer of $\delta^t H$ to SI was rejected and the updated belief is $d_{t-1}$. In the case of E1 or E2 the beliefs of B1 and B2 coincide. However, in the case of E3 they differ. Suppose E3 occurs and B1’s equilibrium offer to SI gets rejected. Then next period all four players will be present and, given the L-type SI’s behavior, the belief of B2 will be $\pi = 0$ and that of B1 will be $\pi = d_{t-1}$. In that contingency it is an optimal response of B2 to offer $p(d_{t-1})$ to SM, since he knows that B1 is playing his equilibrium strategy with the belief $d_{t-1}$.
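The three posterior probabilities used in this argument can be recovered in one step from the prior weights $\lambda$, $\lambda^2$ and $\lambda^3$. The sketch below assumes, as the stated fractions imply, that B1 conditions only on the event that some deviation by B2 has occurred, so each prior weight is divided by the sum of the three. The symbol $D$ for that event is introduced here purely for illustration and does not appear in the text.

```latex
% D := E_1 \cup E_2 \cup E_3 (illustrative notation, not from the text)
\begin{align*}
\Pr(E_1 \mid D) &= \frac{\lambda}{\lambda + \lambda^2 + \lambda^3}
                 = \frac{1}{1 + \lambda + \lambda^2}, \\
\Pr(E_2 \mid D) &= \frac{\lambda^2}{\lambda + \lambda^2 + \lambda^3}
                 = \frac{\lambda}{1 + \lambda + \lambda^2}, \\
\Pr(E_3 \mid D) &= \frac{\lambda^3}{\lambda + \lambda^2 + \lambda^3}
                 = \frac{\lambda^2}{1 + \lambda + \lambda^2}.
\end{align*}
% The three terms sum to one, and as \lambda \to 0:
\[
\Pr(E_1 \mid D) \to 1, \qquad \Pr(E_2 \mid D) \to 0, \qquad \Pr(E_3 \mid D) \to 0 .
\]
```

Choosing prior weights of successively higher order in $\lambda$ is what makes E1 dominate in the limit, which is why B1's limiting belief treats any observed deviation as an unacceptable offer to SM.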

Next we will argue that the L-type SI finds it optimal to accept an offer higher than
$\delta^t H$ with probability 1, if she gets two offers. This is because in the event when she gets
two offers she knows that rejection will lead the buyer B1 to play according to the belief


$d_{t-1}$ and, two periods from now, the remaining buyer will offer $\delta^{t-2} H$ to SI. Thus her continuation payoff from rejection is

$$\delta\{\delta^{t-1} H\,q(d_{t-1}) + \delta(1 - q(d_{t-1}))\,\delta^{t-2} H\} = \delta\{\delta^{t-1} H\} = \delta^t H.$$

Hence she finds it optimal to accept an offer higher than $\delta^t H$ with probability 1.
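The equality chain for the continuation payoff skips one intermediate step. A minimal expansion, using only the identity $\delta \cdot \delta^{t-2} H = \delta^{t-1} H$ and writing $q$ as shorthand for $q(d_{t-1})$, might read:

```latex
% q is shorthand for q(d_{t-1}); no other assumptions are added.
\begin{align*}
\delta\{\delta^{t-1} H\, q + \delta (1 - q)\, \delta^{t-2} H\}
  &= \delta\{\delta^{t-1} H\, q + (1 - q)\, \delta^{t-1} H\} \\
  &= \delta \cdot \delta^{t-1} H\, [\, q + (1 - q) \,] \\
  &= \delta^{t} H .
\end{align*}
```

The payoff is therefore independent of $q(d_{t-1})$: whether she receives an offer next period or only two periods from now, the discounted value of rejecting is $\delta^t H$.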

We need to check that B2 has no incentive to deviate and make an offer of $p^o$ to SI such that $p^o > \delta^t H$.

Suppose B2 deviates and makes an offer of $p^o$ to SI such that $p^o > \delta^t H$. With probability $q(\pi)$, SI will get two offers and B2’s offer will be accepted with probability $\pi$. With probability $(1 - q(\pi))$, SI will get only one offer. B2 then gets a payoff of

$$(v - p^o)\,q(\pi)\,\pi + (1 - q(\pi))\,[(v - p^o)\,a(\pi) + (1 - a(\pi))\,\delta v_B(d_{t-1})].$$

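As shown in the previous appendix, for high values of $\delta$ we have $(v - p^o)a(\pi) + (1 - a(\pi))\delta v_B(d_{t-1}) < \delta v_B(\pi)$. Also, for high values of $\delta$, $p^o > p'_l(\pi)$. Thus⁴,

$$\begin{aligned}
v_B(\pi) &= (v - p'_l(\pi))\,q(\pi) + (1 - q(\pi))\,\delta v_B(\pi) \\
&> (v - p^o)\,q(\pi)\,\pi + (1 - q(\pi))\,[(v - p^o)\,a(\pi) + (1 - a(\pi))\,\delta v_B(d_{t-1})].
\end{aligned}$$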

Hence B2 has no incentive to deviate and make an offer of po to SI .
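The final inequality can be checked term by term. The sketch below is not taken from the text: it additionally assumes $v \ge p^o$ (the buyer does not offer above his valuation) and uses $0 \le \pi \le 1$, which holds because $\pi$ is a probability.

```latex
% Assumptions added for illustration: v >= p^o and 0 <= \pi <= 1.
\begin{align*}
(v - p^o)\,\pi \;&\le\; v - p^o \;<\; v - p'_l(\pi)
  && \text{since } p^o > p'_l(\pi), \\
(v - p^o)\,a(\pi) + (1 - a(\pi))\,\delta v_B(d_{t-1}) \;&<\; \delta v_B(\pi)
  && \text{for high values of } \delta .
\end{align*}
% Weight the first line by q(\pi), the second by 1 - q(\pi), and add:
\begin{align*}
& (v - p^o)\,q(\pi)\,\pi
  + (1 - q(\pi))\,[(v - p^o)\,a(\pi) + (1 - a(\pi))\,\delta v_B(d_{t-1})] \\
& \qquad < (v - p'_l(\pi))\,q(\pi) + (1 - q(\pi))\,\delta v_B(\pi) \;=\; v_B(\pi).
\end{align*}
```

Both component inequalities are strict and the weights $q(\pi)$ and $1 - q(\pi)$ are nonnegative and sum to one, so the combined inequality is strict as well.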

Lastly, to show that B2 has no incentive to deviate and make an unacceptable offer to SM or offer $p^o$ to SI such that $p^o < \delta^t H$, we refer to the analysis in the previous appendix.

⁴ This is because B2 puts a mass point at $p'_l(\pi)$.


Vita

Kaustav Das

Education

• Ph.D. in Economics The Pennsylvania State University, 2013

• MS Quantitative Economics Indian Statistical Institute, Kolkata, 2008

• B.Sc. in Economics Presidency College, University of Calcutta, 2006

Research Interests

Strategic experimentation, inefficiency in R&D, bargaining theory, game theory.

Completed Manuscripts

• Competition, Duplication and Learning in R&D.

• Competition and Learning in R&D: The Role of Private Information.

• Decentralised Bilateral Trading, Competition for Bargaining Partners and the law of

one price.

• Decentralised Bilateral Trading in a Market with Incomplete Information (Revise and

Resubmit at the American Economic Journal: Microeconomics).

Awards

• Rosenberg Centennial Scholarship, Department of Economics, Penn State University,

Spring 2012

• Mrs. M.R Iyer Gold Medal, awarded by the Indian Statistical Institute for outstanding performance in the M.S. in Quantitative Economics program, 2008.