Dynamic inventory competition with stockout-based substitution

Tava Lennon Olsen∗

The University of Auckland

Rodney P. Parker†

The University of Chicago

November 24, 2010

Abstract

We examine when there is a commitment value to inventory in a dynamic game between two firms

under stockout-based substitution. In our model, the firms face independent direct demand but

some fraction of a firm’s lost sales will switch to the other firm. This problem has been previously

studied in both the single period and stationary infinite horizon (open loop) setting but not in

a Markov perfect (closed loop) setting. We numerically show that, as seen in the economics

literature, significantly different dynamics can occur in the closed loop setting and indeed there

may be a commitment value to inventory in this setting. However, we give conditions under

which the stationary infinite horizon equilibrium is also a Markov perfect equilibrium and provide

a discussion on when the (simpler) stationary equilibrium may be sufficient for study. We provide

a review of the types of equilibria typically found in operations management papers and provide

some roadmaps on avenues for future research.

∗Email: [email protected]

†Email: [email protected]

1 Introduction

It is well documented that consumers may switch brands upon experiencing a stockout (e.g., Schary

and Christopher, 1979). Given this behavior, one dimension of retailer competition is the availability

of stock. Of course, retail prices also strongly influence customers’ purchasing decisions but

once price competition has settled prices to a common level between retailers, customers next look

to whether the product is on the shelf. We seek to understand retailer competitive behavior in

the face of stock-out based substitution. More specifically, this paper studies the dynamics of such

competition. For example, is it ever rational for a retailer to overstock inventory in one period so

that they can credibly commit to a high inventory level in the following period?

We consider a simple duopoly competition where prices are fixed and consumers have a preferred

product (primary demand). Upon finding that product unavailable, they either will not buy or will

switch to the competing product. Such a scenario can either model two competing retailers stocking

the same product, or one retailer with two competing vendors (e.g., Coca Cola and Pepsi) who

manage their stock on the retailer’s shelves. This paper studies the dynamic game between the two

firms, focusing on the inter-period dynamics. It gives conditions under which the stationary infinite

horizon equilibrium is also a Markov perfect equilibrium but also provides examples of where this

is not the case. In this latter case, the dynamics can be quite complex but there may indeed be a

commitment value to inventory.

The commitment value of capacity has been studied in the economics literature. The general

idea is that the irreversible nature of capacity allows a firm to demonstrate commitment. Tirole

(1988) describes the period of commitment associated with an investment (such as purchasing

capacity) as “a period of time over which the cost of being freed from the commitment within

the period is sufficiently high that it does not pay to be freed.” The irreversibility of capacity

typically arises from the fact that it is lumpy, industry-specific, or otherwise difficult to discard on

a secondary market. The sinking of money into an investment of such a magnitude tends to show

a commitment, usually towards operating in a particular market. Spence (1977), Dixit (1980),

and others show how such a capacity commitment by an incumbent firm can be used to ward off

potential entrants into a market, ensuring monopoly operations for the incumbent. The entrants

are aware of the incumbent’s investment and its durability implies the commitment. Prices do not

have a similar deterrent effect because they can be easily changed (Tirole, 1988, p.368).

A review of the literature did not yield any previous articles on the commitment value of

inventory in the sense we intend, despite the lengthy literature on inventory control. Saloner

(1986) does describe a partial commitment of inventory in the sense that acquiring inventory (either

through production or purchasing) and incurring the holding cost does endow the firm with the

ability to sell up to that quantity. Our sense is somewhat different, in that a firm may order up to a

“high” level in one period with the hope that its inventory entering the following period is similarly

high. Such a hope is predicated on knowledge of the direct demand’s probability distribution and

of the rival’s demand distribution and stocking level. Thus, the beginning inventory level in the

following period is far from guaranteed, but if it happens to be high, then the commitment to this

inventory is beyond dispute: the concrete physical nature of the inventory means it is available for

usage. Of course, this inventory commitment is for a shorter number of periods compared with

capacity, but it has a similar flavor of requiring an observable investment which cannot be easily

reversed.

This paper contributes to two main areas of literature: inventory duopolies and dynamic operational

games. The earliest acknowledged inventory oligopoly is attributed to Kirman and Sobel

(1974) who elegantly formulate and analyze two competing firms. They were the first to recognize

that intertemporal dependence in inventory duopolies arises in two ways: via the physical

inventory being carried from period to period and via the stochastic demand. They formulate a

dynamic oligopoly where the firms set prices; there is no explicit flow of customers based upon

stockout substitution.

Although there are examples of stockout-based substitution in centralized models, such as Bassok,

Anupindi, and Akella (1999) and Honhon, Gaur, and Seshadri (2010), our focus is on competitive

models. The relationship between the retailers is dependent upon two elements: (a) how

the primary demand is divided between the firms; and (b) how the customers upon experiencing

a stockout will affect the other firm. Considering (a), Lippman and McCardle (1997) consider a

single source of stochastic market demand that is then split between the retailers according to four

exogenous rules. However, most papers allow separate random demand streams arriving at each

firm, as we do. Nagarajan and Rajagopalan (2009) consider separate but correlated streams of primary

demand, but it is more usual to consider independent primary streams and incorporate the

degree of substitutability by parameterizing the stockout routing paths, as we also do. Although

there is some literature dealing with demands influenced by the availability of stock (e.g., Wang

and Gerchak, 2001), we assume independence between the originating demand and the stocking

levels.

Focusing on aspects of (b), most papers assume some flow of customers from one retailer to

the other (for a duopoly) upon experiencing a stockout. The sum of a firm’s primary demand and

transfer demand to that firm is known as the firm’s effective demand. Parlar (1988), Lippman and

McCardle (1997), Avsar and Baykal-Gursoy (2002), Netessine and Rudi (2003), and Zeinalzadeh

et al. (2010) assume a fixed portion of customers search at the other firm while all others are lost;

we make a similar assumption. Netessine, Rudi, and Wang (2006) generalize this assumption by

considering four separate models, three of which include an allowance for backlogging. Their first

model, where excess effective demand is fully lost, is equivalent to our model. Further, it is entirely

possible that a stockout will negatively affect a customer’s likelihood of visiting that retailer in a

subsequent visit, which has the effect of changing the demand distribution in subsequent periods.

Under this scenario, Olsen and Parker (2008) find that base-stock levels can be the optimal or

equilibrium policy only under certain flow restrictions; this paper does not consider that effect.

The other strand of literature this paper contributes to is that of dynamic operational games,

where the players may compete multiple times. In the context of operational problems, it often

makes sense to have these models be state dependent. Such a dependence is frequently associated

with an assumption that all payoff-relevant information is embedded in these state variables and

anything else in the past is payoff-irrelevant; these are Markov Games. In this paper, the firms’

beginning inventory levels form the periodic state variables.

Our primary contribution is a careful treatment of the competitive dynamics for our duopoly.

While stock-out based substitution models similar to ours have been studied in the literature, to the

best of our knowledge, none have studied the Markov Equilibrium (ME). Indeed, the literature on

ME in operational games is limited and we review this further in Section 2. We also believe that the

treatment of equilibria type has been hampered in the operations literature by a lack of consistent

vocabulary, and we propose the term “equilibrium in stationary strategies” (ESS) for a common,

but inconsistently labeled, equilibrium used frequently in operations papers; this equilibrium type

is used in Avsar and Baykal-Gursoy (2002) and Netessine, Rudi, and Wang (2006) in models

of stock-out based substitution similar to ours. Here we give conditions for when this equilibrium is also

a Markov equilibrium using a novel proof technique. It is also shown numerically that this is not

always the case.

This paper is organized as follows. Section 2 reviews the appropriate equilibria types for our

duopoly, defines an ESS, and further details relevant literature for dynamic operational games.

Section 3 formally defines our stockout-based substitution duopoly, gives initial structural results

on the payoff and response functions, and shows existence and uniqueness of the ESS. Section 4

gives conditions under which the ESS is also a ME. Section 5 numerically studies settings where

the ESS is not a ME and shows that indeed there may be, in some cases, a commitment value to

inventory. Finally, Section 6 concludes the paper.

2 Equilibrium Types and Literature

This section discusses the types of equilibria and policies that are most relevant to our work. In

particular, we propose a new equilibrium term, an “Equilibrium in Stationary Strategies” (ESS)

as well as reviewing the existing concept of a “Markov Equilibrium” (ME). We define “up-to”

and “base-stock” policies in the context of inventory games and then relate these to the existing

literature on stockout-based substitution. We also provide a brief review of papers in the inventory

literature using the Markov Equilibrium concept.

We first describe two policy types relevant for inventory duopolies. We will refer to firms i

and j ≠ i, and it may be assumed that the definitions below and all results in what follows are for

i, j = 1, 2 and j ≠ i. Time is given by t = 1, 2, . . . but where no confusion is likely the time subscript

will be dropped.

Definition 1 A policy for firm i is a stationary up-to response policy if, in each period t, for some

level yi∗, whenever the initial inventory is below yi∗ firm i orders up to yi∗ regardless of the other

firm’s ordering policy for that period. If both firms follow the up-to response policy then we call this

(yi∗, yj∗) a stationary up-to policy.

The appeal of up-to policies is well known; they are easy to implement and intuitively satisfying.

They simply require a firm to know its current inventory level, its target stocking level, and to order

the difference. Notice that an up-to policy makes no policy prescription for what should be done if

inventory is found to be above (yi∗, yj∗), and hence papers proving the optimality or equilibrium

nature of an up-to policy will typically impose a requirement on initial inventory. A natural

extension to an up-to policy is the basestock policy. In the single-firm setting a basestock policy is

simply an up-to policy where the firm orders nothing should it find itself above the basestock level.

However, in the game setting additional policy description is needed when one firm is above the

up-to level and the other firm is not. Let (xit, xjt ) be the initial inventory levels at the beginning of

some period t.

Definition 2 A policy for firm i is a stationary basestock response policy if, in each period t, for

some level yi∗, firm i orders up to max(xit,min(Ri(xjt ), yi∗)) regardless of the other firm’s ordering

policy for that period, for some response function Ri(·) with Ri(xj) ≥ yi∗ for xj ≤ yj∗. If both firms

follow the stationary basestock response policy then we call this (yi∗, yj∗) a stationary basestock

policy.

Notice that this definition relies on the definition of a response function because should firm j be

at level xjt > yj∗ and hence ordering nothing, firm i may have a value yi ≠ yi∗ that it would prefer

to stock to in this case; this will be shown to indeed be the case in the stockout substitution game

considered here. The constraint in the definition that Ri(xj) ≥ yi∗ for xj ≤ yj∗ ensures that if both

firms are below (yi∗, yj∗) then both will order up-to (yi∗, yj∗). A special case of such a basestock

policy would be the more traditional definition where each firm always orders up-to the same level

regardless of the competitor’s inventory and orders nothing if above that desired up-to level (i.e.,

Ri(xj) = yi∗ ∀xj).
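Definition 2 can be read as a small decision rule. The following is a minimal sketch only: the function names and the linear response function are hypothetical, chosen so that the response is at least the base-stock level 12 whenever the rival starts at or below its own base-stock level of 10.

```python
def basestock_order_up_to(x_i, x_j, y_i_star, response_i):
    """Order-up-to level for firm i under Definition 2.

    x_i, x_j   : starting inventories of firm i and of its rival
    y_i_star   : firm i's stationary base-stock level
    response_i : callable giving R^i(x_j); hypothetical, supplied by the caller
    """
    target = min(response_i(x_j), y_i_star)
    # Stock cannot be discarded, so the firm never ends up below x_i.
    return max(x_i, target)


def example_response(x_j):
    # Hypothetical response with slope -0.5; equals 12 at x_j = 10.
    return 12.0 - 0.5 * (x_j - 10.0)


print(basestock_order_up_to(4.0, 6.0, 12.0, example_response))   # both firms low
print(basestock_order_up_to(4.0, 20.0, 12.0, example_response))  # rival overstocked
print(basestock_order_up_to(15.0, 6.0, 12.0, example_response))  # own overstock
```

In the second call the rival is above its level, so firm i stocks to the lower value given by its response function rather than to its usual base-stock level.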

In a single period, the appropriate form of equilibrium to study is typically a simple Nash

equilibrium (Nash, 1950, 1951). The earliest work on stockout-based substitution appears to be

the single-period model of Parlar (1988), hereafter P88, who shows that there exists a unique Nash

equilibrium and, by implication, that an up-to policy is Nash. Lippman and McCardle (1997),

hereafter LMC, similarly show existence of a Nash solution (via supermodularity) and so a joint

inventory stocking level (up-to policy) is Nash. Given the variety of methods of dividing the initial

inventory, LMC does not necessarily achieve a unique solution. For competitive substitution upon

stockouts in a single period, Mahajan and van Ryzin (2001) allow consumer utility maximization,

Netessine and Rudi (2003) compare the centralized and competitive stocking scenarios, and Li and

Ha (2008) incorporate reactive capacity, and all show that an up-to policy is the Nash equilibrium.

A natural extension to the single-period Nash equilibrium is the “open-loop” equilibrium, or

as it is more typically referred to in the operations management literature, the “infinite horizon

equilibrium” (see, e.g., Bernstein and Federgruen, 2004, for a careful treatment). We find neither of

these terms fully descriptive and so propose a new term, namely an “Equilibrium in Stationary

Strategies”, which is more specific than “open loop” but more accurate than “infinite horizon,”

because closed loop games can be played on the infinite horizon. Formally, it may be defined as

follows.

Definition 3 A policy is said to be an Equilibrium in Stationary Strategies (ESS) if, for a possibly

restricted set of initial states, the policy is a pure-strategy Nash equilibrium in the infinite horizon

game where the same (possibly state-dependent) strategy is played in each period.

If the initial set of states is restricted, then any proof of an ESS must include showing that the

future state under the equilibrium policy also lies in the same restricted set. In effect, it assumes

that both players are currently setting up their ERP systems, or that strategy decisions are made

on a much longer time scale than the operational payoffs. It is one of the most commonly studied

multi-period equilibrium types in the operations literature, with a long history (see Kirman and

Sobel, 1974; Heyman and Sobel, 1984), primarily due to its simplicity in formulation and efficacy

in analysis.

Avsar and Baykal-Gursoy (2002), hereafter ABG, extend P88’s model to an infinite-horizon

setting. The equilibrium in each setting is an up-to policy for both firms, albeit Parlar’s equilibrium

is a one-shot decision in a single period whereas Avsar and Baykal-Gursoy’s is a one-shot decision

at the start of an infinite-horizon under the assumption that neither firm will deviate from those

decisions in the future. In other words, ABG show that an up-to policy forms an ESS. Netessine,

Rudi, and Wang (2006), hereafter NRW, also show an up-to policy forms an ESS, but under more

general demand routing compared with ABG. Both ABG and NRW assume initial inventories to

be below the up-to levels in their key theorems.

A stronger equilibrium than an ESS is a Markov Equilibrium. This is a closed-loop equilibrium

and requires subgame perfection in every period.

Definition 4 (Fudenberg and Tirole, 1991, p. 501)

A Markov strategy depends on the past only through the current value of the state variables. A

Markov Equilibrium (ME) or Markov perfect equilibrium is a profile of Markov strategies that

yields a Nash equilibrium in every proper subgame.

A ME may be defined over a finite or infinite horizon, but unlike the ESS above, in the infinite

horizon, subgame perfection is still required. One of the simpler forms of ME found commonly in

the operations literature is a linear game. For example, Hall and Porteus (2000) study a linear

game where service competitors, or pure newsvendors, compete for market share that may diminish

as a result of poor service. In this approach, the firms’ payoffs are linear in the state variable (each

firm’s market share). A related technique is to assume separability between the firms’ decisions in

the probabilistic transition function and this is the approach taken in Albright and Winston (1979)

who consider firms competing via pricing and advertising, and Nagarajan and Rajagopalan (2009)

who consider newsvendor stocking conditions under which the competitive scenario reduces into a

single-product optimization for each firm.

Another approach to proving subgame perfection is to limit the model to two periods and there

are some papers which deal specifically with ME of competing retailers with stockout-based substitution

across two periods. Caro and Martínez-de-Albéniz (2010) examine the effect of competition

on quick response (see Table 2 in their paper for a crisp summary of the literature). Zeinalzadeh,

Alptekinoglu, and Arslan (2010) consider an inventory duopoly where the firms incur a fixed cost

for placing orders. They find that in two periods, under a number of conditions, equilibria do not

exist.

The inventory literature on non-linear non-separable ME across more than two periods appears

scarce. Lu and Lariviere (2009) numerically compute ME in order to examine a supplier’s

allocation of scarce inventory to competing retailers; they show a “turn and earn” approach induces

greater sales. Numerical computation is the typical approach for finding ME taken in the economics

literature and there is a large literature on efficient methods for such computation (e.g., Pakes and

McGuire, 1994). Parker and Kapuscinski (2010) demonstrate that the ME for a supplier-retailer

relationship subject to capacity constraints is a modified echelon base-stock equilibrium policy.

They prove the equilibrium policy structure through construction and induction, exploiting Pareto

improvement amongst non-unique equilibria in every period.

Section 4 shows that (under some significant conditions) if initial inventory is below the equilibrium

values then any ESS up-to policy is also an up-to ME policy. However, an up-to equilibrium

does not have any claim over off-equilibrium behavior and assumes that initial inventory is low.

A stronger equilibrium is a basestock equilibrium and we also establish conditions under which

the ESS up-to levels will form a stationary Markov base-stock equilibrium policy. However, this

is quite a strong equilibrium and will need strong conditions imposed to establish it. Section 5

will show, through numerics, that the ESS forming a ME is far from guaranteed and

will highlight numerous instances of non-ESS equilibria and illustrate some compelling equilibrium

behavior (commitment, cycling, sabotage), including that there is frequently a strategic value to

inventory.

3 Model Formulation and Initial Results

We consider two competing firms who face both primary and secondary customer demand. Primary

(or “direct”) customers arrive at a firm each period and will each buy one unit of the product if

it is found to be available. If the product is unavailable, then some proportion of the unsatisfied

customer demand will switch products and become secondary (or “transferred”) demand at the

opposite firm and the rest will be lost sales. The periodic reward consists of revenue minus costs.

The costs are (a) holding costs for physical inventory and (b) production costs for the amount

“processed” in each period; these will be described more precisely below. Also, all payoff relevant

information is assumed to be embedded in the inventory levels of the retailers. This assumption, or

something similar, is necessary in a Markov game formulation. We also assume that all information

on costs, revenues, and demand distributions is common knowledge.

The notation for retailer i in period t is:

ri = revenue per unit (retail price);

hi = holding cost per unit per period;

ci = production cost per unit (wholesale price);

yit = inventory order up-to level in period t (decision variable);

xit = inventory level at the beginning of period t (state variable);

Dit = direct demand to retailer i in period t (random variable);

γij = the proportion of retailer i’s unsatisfied direct customers who will transfer to retailer j; and

α = periodic discount factor.

For notational convenience the discount factor is assumed to be the same for all firms; firm-specific

discount factors could be incorporated but little additional insight would be gained. We define the

total demand in period t at retailer i, when retailer j stocks yjt , as

$$\Lambda^i_t(y^j_t) = D^i_t + \gamma_{ji}\,(D^j_t - y^j_t)^+,$$

where $x^+ = \max(x, 0)$. The argument of $\Lambda^i_t(\cdot)$ will sometimes be dropped when it is too notationally

cumbersome and the meaning is clear from the context. We assume the $\{D^i_t\}$ and $\{D^j_t\}$

form mutually independent i.i.d. sequences. This model of demand then corresponds to “Rule 4:

Independent Random Demands” of LMC; it is also the form assumed in both P88 and ABG and

is a special case of that in NRW. We focus on this form for simplicity and because it appears to

be a very natural form for demand, but as most results will be written in terms of Λit(·) (which

corresponds to Ri in LMC) extensions to other demand splitting models are likely possible.

We assume γij ∈ [0, 1).¹ Firms accrue revenues only when they satisfy the customers’ orders;

recall that no backlogging is allowed. We assume ri ≥ ci, which is natural. All revenue and cost

parameters are non-negative. We also assume 0 ≤ α ≤ 1.

The inventory transition function between periods (counting time periods forward) for retailer

i is:

$$x^i_{t+1} = \left(y^i_t - \Lambda^i_t(y^j_t)\right)^+,$$

which is the level of physical inventory remaining after satisfying direct and transfer demand. Write

$$X^i(y^i, y^j) = \left(y^i - \Lambda^i(y^j)\right)^+$$

as the random variable representing firm i’s inventory left after demand in a generic period when

order-up-to points of $(y^i, y^j)$ are chosen for firms i, j, and write $X(y^1, y^2) = (X^1(y^1, y^2), X^2(y^2, y^1))$.
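The effective demand and the inventory transition can be traced on simulated demand. The sketch below is illustrative only: the exponential demand with mean 10, the stocking levels, and the transfer fractions are all assumptions, not values from the paper.

```python
import random


def effective_demand(d_i, d_j, y_j, gamma_ji):
    """Lambda^i(y^j): own direct demand plus the share gamma_ji of the
    rival's unmet direct demand (D^j - y^j)^+ that transfers to firm i."""
    return d_i + gamma_ji * max(d_j - y_j, 0.0)


def leftover_inventory(y_i, y_j, d_i, d_j, gamma_ji):
    """X^i(y^i, y^j) = (y^i - Lambda^i(y^j))^+ for one demand realization."""
    return max(y_i - effective_demand(d_i, d_j, y_j, gamma_ji), 0.0)


def simulate_path(y_1, y_2, gamma_12, gamma_21, periods, seed=0):
    """Starting inventories when both firms follow an up-to policy from x = 0."""
    rng = random.Random(seed)
    x_1 = x_2 = 0.0
    path = []
    for _ in range(periods):
        # Assumption: exponential direct demand with mean 10 at each firm.
        d_1, d_2 = rng.expovariate(0.1), rng.expovariate(0.1)
        # Order up to the target; stock already above it is simply kept.
        s_1, s_2 = max(x_1, y_1), max(x_2, y_2)
        # Firm 1 receives the fraction gamma_21 of firm 2's unmet demand.
        x_1 = leftover_inventory(s_1, s_2, d_1, d_2, gamma_21)
        x_2 = leftover_inventory(s_2, s_1, d_2, d_1, gamma_12)
        path.append((x_1, x_2))
    return path


path = simulate_path(y_1=12.0, y_2=9.0, gamma_12=0.3, gamma_21=0.5, periods=5)
for x_1, x_2 in path:
    print(round(x_1, 2), round(x_2, 2))
```

Because both firms start below their up-to levels, every simulated starting inventory stays at or below those levels, which is the invariance that up-to equilibrium arguments rely on.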

Retailer i’s infinite-horizon expected payoff is

$$E\left[\sum_{t=1}^{\infty} \alpha^{t-1}\left(r_i \min(y^i_t, \Lambda^i_t) - c_i\,(y^i_t - x^i_t) - h_i\,(y^i_t - \Lambda^i_t)^+\right)\right].$$

¹ The assumption that γij < 1 will be used in Lemma 2 to show that the magnitude of the derivative of the

response function is strictly less than 1. An equivalent more technical condition on the support of the density could

also be used if γij = 1 (i.e., if all customers transfer upon a stockout).

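Using the standard translation that $x^i_t = (y^i_{t-1} - \Lambda^i_{t-1})^+$ (e.g., Veinott, 1965) and $r_i \min(y^i_t, \Lambda^i_t) = r_i y^i_t - r_i (y^i_t - \Lambda^i_t)^+$, we rewrite the infinite horizon expected payoff as

$$\pi^i_\infty = E\left[\sum_{t=1}^{\infty} \alpha^{t-1}\left((r_i - c_i)\,y^i_t - (r_i + h_i - \alpha c_i)\,(y^i_t - \Lambda^i_t)^+\right)\right] + c_i x^i_1.$$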
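The rewriting step can be made explicit. The derivation below is a sketch that assumes the discounted sums converge, so that terms can be re-indexed freely; writing firm indices as superscripts on y, x, and Λ and as subscripts on r, c, and h is a typesetting assumption.

```latex
\begin{align*}
\sum_{t=1}^{\infty} \alpha^{t-1} c_i\, x^i_t
  &= c_i x^i_1 + \sum_{t=2}^{\infty} \alpha^{t-1} c_i\,(y^i_{t-1} - \Lambda^i_{t-1})^+
   = c_i x^i_1 + \sum_{t=1}^{\infty} \alpha^{t-1}\, \alpha c_i\,(y^i_t - \Lambda^i_t)^+ , \\
r_i \min(y^i_t, \Lambda^i_t) - c_i y^i_t - h_i\,(y^i_t - \Lambda^i_t)^+ + \alpha c_i\,(y^i_t - \Lambda^i_t)^+
  &= (r_i - c_i)\, y^i_t - (r_i + h_i - \alpha c_i)\,(y^i_t - \Lambda^i_t)^+ .
\end{align*}
```

The first line shifts the purchase-cost credit for carried-over stock back one period, which is where the factor α on the unit cost comes from; the second line collects the per-period terms.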

We therefore define firm i’s one-period expected payoff as

$$\pi^i(y^i, y^j) = (r_i - c_i)\,y^i - E\left[(r_i + h_i - \alpha c_i)\,(y^i - \Lambda^i(y^j))^+\right] = (r_i - c_i)\,y^i - E\left[(r_i + h_i - \alpha c_i)\,\big(y^i - D^i - \gamma_{ji}(D^j - y^j)^+\big)^+\right].$$
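The one-period payoff has no closed form for general demand, but it is easy to estimate by simulation. The following is a minimal sketch under assumed inputs: exponential demand with mean 10 and the cost parameters shown are illustrative, not taken from the paper.

```python
import random


def one_period_payoff(y_i, y_j, r, c, h, alpha, gamma_ji, draws):
    """Sample average of (r - c) y_i - (r + h - alpha c)(y_i - Lambda^i(y_j))^+."""
    overage_cost = r + h - alpha * c
    total = 0.0
    for d_i, d_j in draws:
        lam = d_i + gamma_ji * max(d_j - y_j, 0.0)   # effective demand
        total += (r - c) * y_i - overage_cost * max(y_i - lam, 0.0)
    return total / len(draws)


rng = random.Random(1)
# Assumption: independent exponential direct demands with mean 10.
draws = [(rng.expovariate(0.1), rng.expovariate(0.1)) for _ in range(20000)]
cost = dict(r=10.0, c=6.0, h=1.0, alpha=0.95, gamma_ji=0.5, draws=draws)

for y_i in (5.0, 10.0, 15.0, 20.0):
    print(y_i, round(one_period_payoff(y_i, 10.0, **cost), 3))
```

Reusing the same demand draws for every stocking level (common random numbers) keeps comparisons across levels free of sampling noise.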

For the sake of establishing existence of the equilibrium, certain technical assumptions about the

strategy spaces need to be made. A minor assumption is that we treat the strategy spaces as drawn

from real space; that is, $(y^i_t, y^j_t) \in \mathbb{R}^2$. So we treat the inventory decisions as continuous variables, a

common assumption in the literature (e.g., Kirman and Sobel, 1974). Further assumptions require

restrictions upon the strategy space, commonly requiring compactness. A blunt approach to this

is to merely truncate the real number line (for a single decision variable) at a very high positive

level (for an upper bound on physical inventory) and at zero as a lower limit (because no backlogs

are allowed). This truncation of the strategy spaces is sometimes done merely by assumption or

through a justification derived from the demand distribution. A reasonable approach is simply to

select an extremely large quantity (“U”) to which inventory levels will never realistically be raised.

Thus, we assume $(y^i_t, y^j_t) \in [0, U] \times [0, U]$.

Remark 1 A lost sales cost has not been included in our profit function because lost revenue has

been accounted for explicitly (a similar assumption is made in NRW). However, one could also

include a lost sales cost for primary demand that represents a goodwill cost for unsatisfied direct

demand (as in P88 or ABG), or a lost sales (or goodwill) cost that applies equally across all

customers (as is done in LMC). In this latter case, one would set the lost sales cost in a given

period equal to $l_i\,(\Lambda^i(y^j) - y^i)^+$ for some $l_i \ge 0$. All following results in this paper go through with

this additional cost with the exception that, in order to ensure that firm i’s profit is decreasing in

firm j’s stocking level (Lemma 1(c) below), it must be required that

$$-(r_i + h_i - \alpha c_i + l_i)\,\gamma_{ji}\,P\big(D^j - (x - D^i)/\gamma_{ji} < z \le D^j\big) + l_i\,\gamma_{ji}\,P(D^j > z) \le 0.$$

For clarity of exposition we simply assume li = 0.

We begin with some properties of the payoff function.

Lemma 1 Payoff πi(yi, yj) is

(a) submodular in (yi, yj);

(b) concave in yi; and

(c) nonincreasing in yj.

Proof Taking partial derivatives of πi(x, z),

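$$\frac{\partial \pi^i(x, z)}{\partial x} = (r_i - c_i) - (r_i + h_i - \alpha c_i)\,P\big(x > D^i + \gamma_{ji}(D^j - z)^+\big) \tag{1}$$

$$\frac{\partial \pi^i(x, z)}{\partial z} = -(r_i + h_i - \alpha c_i)\,\gamma_{ji}\,P\big(D^j - (x - D^i)/\gamma_{ji} < z \le D^j\big) \tag{2}$$

$$\frac{\partial^2 \pi^i(x, z)}{\partial x^2} = -(r_i + h_i - \alpha c_i)\,f_{\Lambda^i(z)}(x) = -(r_i + h_i - \alpha c_i)\left[f_{D^i}(x)\,P(D^j \le z) + \int_z^{z + x/\gamma_{ji}} f_{D^i}\big(x - \gamma_{ji}(w - z)\big)\,f_{D^j}(w)\,dw\right] \tag{3}$$

$$\frac{\partial^2 \pi^i(x, z)}{\partial x\,\partial z} = -(r_i + h_i - \alpha c_i)\,\gamma_{ji}\int_z^{z + x/\gamma_{ji}} f_{D^i}\big(x - \gamma_{ji}(w - z)\big)\,f_{D^j}(w)\,dw. \tag{4}$$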

Equations (4), (3), and (2) imply that πi(yi, yj) is (a) submodular in (yi, yj), (b) concave in

yi, and (c) nonincreasing in yj respectively.
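The three properties of Lemma 1, and the first derivative in equation (1), can be checked numerically with finite differences. This is an illustrative sketch, not part of the proof: the exponential demand, the parameter values, and the evaluation point are assumptions.

```python
import random


def payoff(x, z, r, c, h, alpha, gamma_ji, draws):
    """Sample-average one-period payoff pi^i(x, z)."""
    k = r + h - alpha * c
    return sum((r - c) * x - k * max(x - d_i - gamma_ji * max(d_j - z, 0.0), 0.0)
               for d_i, d_j in draws) / len(draws)


def marginal_payoff(x, z, r, c, h, alpha, gamma_ji, draws):
    """Equation (1), with the probability replaced by a sample frequency."""
    k = r + h - alpha * c
    hits = sum(1 for d_i, d_j in draws if x > d_i + gamma_ji * max(d_j - z, 0.0))
    return (r - c) - k * hits / len(draws)


rng = random.Random(2)
# Assumption: independent exponential direct demands with mean 10.
draws = [(rng.expovariate(0.1), rng.expovariate(0.1)) for _ in range(20000)]
p = dict(r=10.0, c=6.0, h=1.0, alpha=0.95, gamma_ji=0.5, draws=draws)

x, z, step = 12.0, 8.0, 1.0
# (b) concavity in x: the second difference should be nonpositive.
second_diff = payoff(x + step, z, **p) - 2 * payoff(x, z, **p) + payoff(x - step, z, **p)
# (a) submodularity: the cross difference should be nonpositive.
cross_diff = (payoff(x + step, z + step, **p) - payoff(x + step, z, **p)
              - payoff(x, z + step, **p) + payoff(x, z, **p))
# (c) monotonicity: raising the rival's stock should not raise the payoff.
z_diff = payoff(x, z + step, **p) - payoff(x, z, **p)
# Equation (1) compared with a central finite difference in x.
fd = (payoff(x + 1e-3, z, **p) - payoff(x - 1e-3, z, **p)) / 2e-3

print(second_diff, cross_diff, z_diff)
print(fd, marginal_payoff(x, z, **p))
```

The sign checks hold draw by draw, so with common random numbers they are not affected by sampling error; the comparison with equation (1) is only approximate.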

For the one period game with payoff function πi(yi, yj), let Ri(yj) be the optimal response for

firm i to firm j setting an inventory level of yj (similarly for Rj(yi)). The following two properties

will be useful in what follows.

Lemma 2 For each response function Ri(·),

(a) $\partial R^i(z)/\partial z \le 0$, and

(b) $\left|\partial R^i(z)/\partial z\right| < 1$.

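Proof Define $K^i(x, z) = \partial \pi^i(x, z)/\partial x = 0$, where

$$\begin{aligned}
\frac{\partial \pi^i(x, z)}{\partial x} &= (r_i - c_i) - (r_i + h_i - \alpha c_i)\,P\big(x > D^i + \gamma_{ji}(D^j - z)^+\big) \\
&= (r_i - c_i) - (r_i + h_i - \alpha c_i)\left[P(x > D^i)\,P(z > D^j) + \int_z^{z + x/\gamma_{ji}} P\big(x > D^i + \gamma_{ji}(w - z)\big)\,f_{D^j}(w)\,dw\right].
\end{aligned}$$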

We find the derivative of the best response function via the implicit function theorem as follows:

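$$\frac{\partial R^i(z)}{\partial z} = \left. -\frac{\partial K^i/\partial z}{\partial K^i/\partial x}\,\right|_{(R^i(z),\,z)}$$

where

$$\frac{\partial K^i}{\partial x} = -(r_i + h_i - \alpha c_i)\left[f_{D^i}(x)\,P(z > D^j) + \int_z^{z + x/\gamma_{ji}} f_{D^i}\big(x - \gamma_{ji}(w - z)\big)\,f_{D^j}(w)\,dw\right], \text{ and}$$

$$\frac{\partial K^i}{\partial z} = -(r_i + h_i - \alpha c_i)\left[\int_z^{z + x/\gamma_{ji}} \gamma_{ji}\,f_{D^i}\big(x - \gamma_{ji}(w - z)\big)\,f_{D^j}(w)\,dw\right].$$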

The first thing to note is that the best response function’s derivative has a consistent sign (specifically,

negative), indicating monotonicity of the function as in (a). Clearly, $|\partial K^i/\partial z| < |\partial K^i/\partial x|$ because

$\gamma_{ji} < 1$ and $f_{D^i}(x)\,P(z > D^j) \ge 0$.² The result of this is that the magnitude of the slope of the

best response function will be less than 1, yielding (b).

Notice that the derivative of the response function is negative, indicating that the firm’s best

decision is to stock less as the competitor stocks more. This could also have been shown directly

from the submodularity of the payoff functions.
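Setting equation (1) to zero gives a newsvendor-style condition: the best response is the quantile of effective demand at level (r − c)/(r + h − αc). The sketch below computes it as an empirical quantile and prints it on a grid of rival stocking levels; the demand distribution and parameters are assumptions for illustration.

```python
import random


def best_response(y_j, r, c, h, alpha, gamma_ji, draws):
    """Empirical root of equation (1): the (r - c)/(r + h - alpha c)
    quantile of the effective demand Lambda^i(y_j)."""
    ratio = (r - c) / (r + h - alpha * c)
    lam = sorted(d_i + gamma_ji * max(d_j - y_j, 0.0) for d_i, d_j in draws)
    return lam[min(int(ratio * len(lam)), len(lam) - 1)]


rng = random.Random(3)
# Assumption: independent exponential direct demands with mean 10.
draws = [(rng.expovariate(0.1), rng.expovariate(0.1)) for _ in range(20000)]
p = dict(r=10.0, c=6.0, h=1.0, alpha=0.95, gamma_ji=0.5, draws=draws)

grid = [0.0, 5.0, 10.0, 20.0, 40.0]
responses = [best_response(z, **p) for z in grid]
for z, resp in zip(grid, responses):
    print(z, round(resp, 3))
```

On any sample the computed response is nonincreasing in the rival's stock and changes by at most γji per unit change in that stock, mirroring parts (a) and (b) of Lemma 2.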

Lemma 3 Inventory Xi(yi, Rj(yi)) is stochastically nondecreasing in yi and X(yi, yj) is stochas-

tically nondecreasing in (yi, yj).

Proof Note

$$X^i(y^i, R^j(y^i)) = \left(y^i - D^i - \gamma_{ji}\,\big(D^j - R^j(y^i)\big)^+\right)^+.$$

For any fixed $(D^i, D^j)$, a sufficient condition for this to be nondecreasing is $\gamma_{ji}\,\partial R^j(y^i)/\partial y^i \ge -1$. But

this is true by Lemma 2(b) and because $0 \le \gamma_{ji} < 1$. Thus, unconditioning on demand, we

have the desired result. Finally, Xi(·) is clearly stochastically nondecreasing in both yi and yj so

X(yi, yj) is stochastically nondecreasing in (yi, yj).

While the second part of Lemma 3 is intuitive (stocking more in this period should result in

having more inventory in the following period), the first part relies on the fact that

stocking more in this period has only a secondary effect on transfer demand but an immediate effect on

carry-over inventory.

² To accommodate γji = 1 one would simply need a sufficient condition on the support of demand to guarantee

that fDi(x)P(z > Dj) > 0.

Lemma 4 Payoff πi(Ri(yj), yj) is nonincreasing in yj.

Proof By the envelope theorem (e.g., Sydsæter, Strøm, and Berck, 2000)

$$\frac{\partial \pi^i(R^i(y^j), y^j)}{\partial y^j} = \left.\frac{\partial \pi^i(y^i, y^j)}{\partial y^j}\right|_{(R^i(y^j),\,y^j)}.$$

But πi(·) is nonincreasing in yj by Lemma 1(c), so ∂πi(Ri(yj), yj)/∂yj ≤ 0 as desired.

This lemma is quite intuitive and implies that even when firm i responds optimally to the other

firm’s inventory, it would still prefer the other firm to stock less. We are now ready to state the

main result of this section.

Theorem 1 There exists a unique ESS that is up-to (given sufficiently low initial inventories),

which is also basestock if α = 1.

Proof If the magnitude of the best response derivative is less than 1, then the existence of a unique

equilibrium of the one period pay-off function is guaranteed (Vives, 1999); this requirement is

met by Lemma 2(b). The fact that it is up-to is given by the concavity of πi(·). Further, if initial

inventory is below the ESS level then it will also be below it in all following periods, by definition of

the inventory transition function. Therefore, so long as initial inventories lie below the equilibrium

levels, then following the equilibrium up-to levels is indeed an ESS. We will show that a basestock

policy is an ESS following the argument used by Bernstein and Federgruen (2004). To this end,

we assume that if in any period t, xit > yi∗ then firm i will not order regardless of the other firm’s

inventory and if xit ≤ yi∗ then firm i will order up-to yi∗ (again regardless of firm j’s inventory).

Under this policy, if initial inventory is above yi∗ then there will be a finite number of periods until

it is below yi∗ and after this point it will lie below yi∗. Further, these periods are negligible in the

infinite horizon when α = 1 and the average reward under such a basestock policy will be the same

as under the up-to policy. It is therefore an ESS.

Theorem 1 establishes that the ESS inventory decisions for the retailers are order up-to and,

under average costs, are also basestock. The up-to portion of the argument reproduces that in NRW

for their Model I: Lost Sales, but for completeness is included here (their proof is also similar). The

result is also similar to that in ABG, but ABG state that they restrict attention to equilibria in

stationary basestock policies; they show that if firm j follows a stationary basestock policy then a

stationary basestock policy is a best response for firm i.

4 Markov Equilibrium

Throughout this section we assume α < 1. Define (yi∗, yj∗) as the unique ESS,

πi∗ = πi(yi∗, yj∗)

as the expected one period profit under the ESS, and

V i∗ = πi∗/(1− α)

as the infinite horizon discounted expected payoff under repeated playing of the ESS.

Let $V^i_t(x^i_t, x^j_t)$ be the profit-to-go function under basestock policy $(y^{i*}, y^{j*})$ when the terminal value for the game in period $T+1$ is equal to $V^i_{T+1}(x^i_{T+1}, x^j_{T+1}) \triangleq V^{i*}$. Define

$$\underline{y}^i = \lim_{y \to \infty} R^i(y),$$

which exists because $R^i(\cdot)$ is nonincreasing and bounded below by zero. We will show that, regardless of the other firm’s behavior, firm i will always stock at least $\underline{y}^i$. Note that the less firm j stocks the better for firm i from a one period payoff perspective.

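Definition 5 Let $e^i_{T+1}(x^i) = 0$ for all $x^i \ge 0$. Then recursively define

$$e^i_t(x) = \begin{cases} 0, & 0 \le x \le y^{i*}, \\ \max_{z \le x}\left\{\pi^i(z, \underline{y}^j) - \pi^{i*} + \alpha P(X^i(z,\infty) > y^{i*})\, e^i_{t+1}(x^i_{t+1})\right\}, & x > y^{i*}, \end{cases}$$

and $x^i_t = \arg\max_x e^i_t(x)$.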

Then eit(x) represents the potential extra reward available to firm i for an overstock of x or less if

firm j reacts in the most favorable fashion possible. Note that, by definition, eit(x) is nondecreasing

and eit(yi∗) = 0. We will require the following property.

Property 1 For $x^i \ge 0$ and $y^j \ge y^{j*}$,

$$\pi^i(x^i, y^j) + \alpha P(X^i(x^i, y^j) > y^{i*})\, e^i_t(x^i_t) \le \pi^{i*},$$

for $t = 0, 1, \ldots, T$.

Property ?? is the key assumption for an ESS to remain an equilibrium in all periods. It states that it is not worth firm i’s while to overstock in this period purely for next period’s gain, if the firm doesn’t get an immediate response from firm j. As this requirement is dependent on t, it is somewhat unattractive; the following property is stronger but time independent.

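Property 2 Define $\bar{\pi}^i = \max_x \pi^i(x, \underline{y}^j)$,

$$z = \arg\max_z \left\{\pi^i(z, \underline{y}^j) + \alpha P(X^i(z,\infty) > y^{i*})\, \frac{\bar{\pi}^i - \pi^{i*}}{1-\alpha}\right\},$$

and $p = P(X^i(z,\infty) > y^{i*})$. Assume, for $x^i \ge 0$ and $y^j \ge y^{j*}$,

$$\pi^i(x^i, y^j) + \alpha P(X^i(x^i, y^j) > y^{i*})\, \frac{\bar{\pi}^i - \pi^{i*}}{1-\alpha p} \le \pi^{i*}.$$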

Notice that we can see immediately from Property ?? that if the discount factor α is low, if there is a low probability that overstocked inventory remains high in the following period, or if the payoff function is not very responsive to the competitor’s inventory level, then this condition, which will be shown to imply that the ESS forms a ME, is more likely to hold. Such conditions are highly intuitive.

Lemma 5 Property ?? implies Property ??.

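Proof

$$e^i_{t+1}(x) \le \bar{\pi}^i - \pi^{i*} + \alpha\, e^i_{t+2}(x^i_{t+2}) \le \sum_{s=0}^{T-t-1} \alpha^s (\bar{\pi}^i - \pi^{i*}) \le \frac{\bar{\pi}^i - \pi^{i*}}{1-\alpha}.$$

So that, by the concavity of $\pi^i(\cdot)$ and the nondecreasing nature of cumulative probabilities, $x^i_t \le z$. Therefore,

$$e^i_t(x) \le \bar{\pi}^i - \pi^{i*} + \alpha p\, e^i_{t+1}(x^i_{t+1}) \le \sum_{s=0}^{T-t} (\alpha p)^s (\bar{\pi}^i - \pi^{i*}) \le \frac{\bar{\pi}^i - \pi^{i*}}{1-\alpha p}.$$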
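The recursion of Definition 5 and the first bound in the proof above can be illustrated on a small grid. This is a sketch under stated assumptions rather than the paper's computation: the arrays `pi_low` (standing in for firm i's one period profit when firm j stocks at its lower bound) and `stay_prob` (standing in for the probability of remaining above the ESS level) are made-up inputs, the inventory grid is truncated at a finite level, and the function name is hypothetical.

```python
from typing import List, Sequence


def extra_reward_recursion(
    pi_low: Sequence[float],     # hypothetical one period profit at z = 0..N
    stay_prob: Sequence[float],  # hypothetical P(inventory stays above y_star) at z
    pi_star: float,              # one period profit under the ESS
    y_star: int,                 # ESS up-to level, as a grid index
    alpha: float,                # discount factor, 0 <= alpha < 1
    periods: int,                # number of backward steps from T + 1
) -> List[List[float]]:
    """Return [e_{T+1}, e_T, ...], each a list indexed by x = 0..N."""
    n = len(pi_low)
    history = [[0.0] * n]  # terminal condition: e_{T+1}(x) = 0 for all x
    for _ in range(periods):
        best_next = max(history[-1])  # e_{t+1} evaluated at its maximizer
        e_t, running = [], float("-inf")
        for x in range(n):
            # running maximum over z <= x of the bracketed term in Definition 5
            running = max(running, pi_low[x] - pi_star + alpha * stay_prob[x] * best_next)
            e_t.append(0.0 if x <= y_star else running)
        history.append(e_t)
    return history


# Illustrative inputs only; none of these numbers come from the paper.
pi_low = [10.0 - 0.5 * (z - 5) ** 2 for z in range(11)]
stay_prob = [0.0 if z <= 4 else min(1.0, 0.1 * (z - 4)) for z in range(11)]
history = extra_reward_recursion(pi_low, stay_prob, pi_star=8.0, y_star=4, alpha=0.9, periods=50)
bound = (max(pi_low) - 8.0) / (1.0 - 0.9)  # (pi_bar - pi_star) / (1 - alpha)
print(max(max(e_t) for e_t in history), "<=", bound)
```

With these inputs every computed value stays below the time independent bound, as the geometric series argument predicts; the sharper bound with αp in the denominator would additionally require computing z and p as in Property 2.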

The following theorem shows that under Property ?? or ?? the ESS is a ME. The proof is

inductive. In each period it is shown that if firm j follows the ESS equilibrium then firm i has no

incentive to deviate. No assumptions are made on the behavior if inventory is above the ESS level

but a bound is given for the value in these regions, thus allowing us to prove subgame perfection,

i.e., that there is no incentive to deviate from the equilibrium in order to move off the equilibrium

path in the following period.

Theorem 2 Under Property ?? or Property ??, if initial inventory starts below (yi∗, yj∗) then ordering up to (yi∗, yj∗) in each period is a Markov Equilibrium. Further, regardless of firm j’s initial inventory, firm i will always order at least $\underline{y}^i$.

Proof Notice that as Property ?? implies Property ?? we need only consider the latter. We proceed by induction where at each step of the induction we will prove:

(a) $V^i_t(x^i, x^j) = V^{i*}$ for $x^i \le y^{i*}$ and $x^j \le y^{j*}$, and the up-to policy is Markov from period t onwards when initial inventories are below the up-to levels.

(b) $V^i_t(x^i, x^j) \le V^{i*} + e^i_t(x^i_t)$ for $x^i > y^{i*}$.

(c) Firm i will always order at least $\underline{y}^i$ in period t and $V^i_t(x^i, x^j)$ is constant in $x^i$ for $x^i < \underline{y}^i$.

The basis is trivially established in period T + 1; now assume the claims are true from period t + 1 onwards.

Part (a): Assume $0 \le x^i \le y^{i*}$ and $0 \le x^j \le y^{j*}$. We need to show that if firm j follows a $y^{j*}$ up-to policy then firm i’s optimal response is to order up to $y^{i*}$ and that $V^i_t(x^i, x^j) = V^{i*}$, which clearly yields (a) for time t. Now

$$\begin{aligned}
V^i_t(x^i, x^j) &= \max_{x \ge x^i}\left\{\pi^i(x, y^{j*}) + \alpha E\left[V^i_{t+1}(X(x, y^{j*}))\right]\right\} \\
&\le \max_{x \ge x^i}\left\{\pi^i(x, y^{j*}) + \alpha\left[P(X^i(x, y^{j*}) \le y^{i*})\, V^{i*} + P(X^i(x, y^{j*}) > y^{i*})\left(V^{i*} + e^i_{t+1}(x^i_{t+1})\right)\right]\right\} \\
&= \max_{x \ge x^i}\left\{\pi^i(x, y^{j*}) + \alpha\left(V^{i*} + P(X^i(x, y^{j*}) > y^{i*})\, e^i_{t+1}(x^i_{t+1})\right)\right\} \\
&\le \pi^{i*} + \alpha V^{i*} = V^{i*},
\end{aligned}$$

where the first inequality follows by the induction hypotheses (a) and (b) and the second inequality follows from Property ??. But, by definition, $\pi^i(y^{i*}, y^{j*}) + \alpha E[V^i_{t+1}(X(y^{i*}, y^{j*}))] = V^{i*}$, so that ordering up to $y^{i*}$ must be a best response for firm i to firm j ordering up to $y^{j*}$ (which is feasible by the assumption on the initial inventories).

Part (b): Suppose $x^i > y^{i*}$. We need to show that so long as firm j orders at least $\underline{y}^j$ then $V^i_t(x^i, x^j) \le V^{i*} + e^i_t(x^i)$. Note that we make no claim as to firm i’s response; extra conditions will be needed to show that firm i’s best response is to order nothing (see Theorem ??). Now, if firm j orders $y^j \ge \underline{y}^j$,

$$\begin{aligned}
V^i_t(x^i, x^j) &\le \max_{x \ge x^i}\left\{\pi^i(x, \underline{y}^j) + \alpha\left(V^{i*} + P(X^i(x,\infty) > y^{i*})\, e^i_{t+1}(x^i_{t+1})\right)\right\} \\
&\le \pi^{i*} + \alpha V^{i*} + \max_{x \ge x^i}\left\{e^i_t(x)\right\} \\
&\le V^{i*} + e^i_t(x^i_t).
\end{aligned}$$

The first inequality follows by the same reasoning as in Part (a), the second by the definition of $e^i_t(\cdot)$, and the third by the definition of $x^i_t$.

Part (c): Suppose that $x^i < \underline{y}^i$. We need to show that firm i orders at least $\underline{y}^i$ and that $V^i_t(\cdot)$ is constant over this range. But for $x < \underline{y}^i$, for any $y^j$, $X^i(x, y^j) < \underline{y}^i$ and $E[V^i_{t+1}(X(x, y^j))]$ is constant in x by the induction hypothesis. Therefore,

$$V^i_t(x^i, x^j) = \max_{x \ge x^i}\left\{\pi^i(x, y^j) + \alpha E\left[V^i_{t+1}(X(x, y^j))\right]\right\} = \max_{x \ge \underline{y}^i}\left\{\pi^i(x, y^j) + \alpha E\left[V^i_{t+1}(X(x, y^j))\right]\right\},$$

where the second equality follows since $\pi^i(x, y^j)$ is increasing in x for $x < \underline{y}^i$ (by concavity). Thus, firm i orders at least $\underline{y}^i$ and $V^i_t(x^i, x^j)$ is constant in $x^i$ for $x^i < \underline{y}^i$.

A stronger result than the ESS up-to levels being Markov is that they are actually basestock.

However, this requires a stronger condition as follows.

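Definition 6 Let $\varepsilon^i_{T+1}(x^i) = 0$ for all $x^i \ge 0$. Then recursively define

$$\varepsilon^i_t(x^i) = \begin{cases} 0, & 0 \le x^i \le y^{i*}, \\ \max_{z \le x^i}\left\{\pi^i(z, R^j(z)) - \pi^{i*} + \alpha E\left[\varepsilon^i_{t+1}(X^i(z, R^j(z)))\right]\right\}, & x^i > y^{i*}. \end{cases}$$

Then $\varepsilon^i_t(x^i)$ represents the potential extra reward available to firm i for an overstock of $x^i$ or less. Note that by definition $\varepsilon^i_t(x^i)$ is nondecreasing and $\varepsilon^i_t(y^{i*}) = 0$.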

Property 3 For $x^i \ge R^i(x^j)$, $x^j \ge R^j(x^i)$, $\pi^i(x^i, x^j) + \alpha E[\varepsilon^i_{t+1}(X^i(x^i, x^j))]$ is nonincreasing in $(x^i, x^j)$. Further, $\pi^i(x^i, R^j(x^i)) + \alpha E[\varepsilon^i_{t+1}(X^i(x^i, R^j(x^i)))]$ is nonincreasing in $x^i$.

In particular, Property ?? implies that

$$\pi^i(x^i, y^{j*}) + \alpha E\left[\varepsilon^i_{t+1}(X^i(x^i, y^{j*}))\right]$$

is nonincreasing in $x^i$ beyond $y^{i*}$. This is the key assumption for the ESS to remain an equilibrium:

it is not worth firm i’s while to overstock in this period purely for next period’s gain if it does

not get an immediate response from firm j. To keep basestock always a best response it has to be

assumed at all xj ≥ yj∗, as in the actual property given above, and further the extra gain needs to

be nonincreasing so that no improvement can be made by moving off a high inventory level to an

even higher level. Note that considerable effort was applied to prove the nonincreasing nature of

the functions in Property ?? under some exogenous conditions, but we found no natural conditions.

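Definition 7 Let $\nu^i_{T+1}(x^i, x^j) = 0$ for all $x^i, x^j \ge 0$. Then recursively define

$$\nu^i_t(x^i, x^j) = \begin{cases}
0, & 0 \le x^i \le y^{i*},\ 0 \le x^j \le y^{j*}, \\
\pi^i(R^i(x^j), x^j) - \pi^{i*} + \alpha E\left[\nu^i_{t+1}(X(R^i(x^j), x^j))\right], & x^i \le R^i(x^j) \le y^{i*},\ x^j > y^{j*}, \\
\pi^i(x^i, R^j(x^i)) + \alpha E\left[\varepsilon^i_{t+1}(X^i(x^i, R^j(x^i)))\right] + \alpha E\left[\nu^i_{t+1}(X(x^i, R^j(x^i)))\right] - \varepsilon^i_t(x^i) - \pi^{i*}, & x^i > y^{i*},\ x^j \le R^j(x^i), \\
\pi^i(x^i, x^j) + \alpha E\left[\varepsilon^i_{t+1}(X^i(x^i, x^j))\right] + \alpha E\left[\nu^i_{t+1}(X(x^i, x^j))\right] - \varepsilon^i_t(x^i) - \pi^{i*}, & x^i > R^i(x^j),\ x^j > R^j(x^i).
\end{cases}$$

We will show that $V^i_t(x^i, x^j) = V^{i*} + \varepsilon^i_t(x^i) + \nu^i_t(x^i, x^j)$.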

Lemma 6 The function νit(xi, xj) is constant in xi for xi ≤ min(Ri(xj), yi∗), is constant in xj for

xj ≤ min(Rj(xi), yj∗), and is nonincreasing in both xi and xj.

Proof Line 1 of $\nu^i_t(x^i, x^j)$: For $x^j < y^{j*}$, $R^i(y^j) \ge y^{i*}$, so the function is constant in $x^i$ on the desired range and trivially nonincreasing.

Line 2: Clearly constant in $x^i$, and the first term is decreasing in $x^j$ (which is above the desired range) by assumption. For the last term, $X^j(R^i(y^j), y^j) \le y^j$ and so $R^i(X^j(R^i(y^j), y^j)) \ge R^i(y^j)$; hence $\nu^i_{t+1}(x, X^j(R^i(y^j), y^j))$ is constant in x for $x \le \min(y^{i*}, R^i(y^j))$. But $X^i(R^i(y^j), y^j) \le R^i(y^j) \le y^{i*}$, and $X^j(R^i(y^j), y^j)$ is nondecreasing in $y^j$ (by assumption), so by the properties of $\nu^i_{t+1}(\cdot)$ this term must be decreasing in $x^j$.

Line 3: The sum of the first two terms is nonincreasing in $x^i$ by assumption. For the third term, $X^j(R^i(y^j), y^j) \le y^j$ and so $R^i(X^j(R^i(y^j), y^j)) \ge R^i(y^j)$; hence $\nu^i_{t+1}(x, X^j(R^i(y^j), y^j))$ is constant in x for $x \le \min(y^{i*}, R^i(y^j))$. But $X^i(R^i(y^j), y^j) \le R^i(y^j) \le y^{i*}$, and $X^j(R^i(y^j), y^j)$ is stochastically nondecreasing in $y^j$ (by Lemma ??), so by the properties of $\nu^i_{t+1}(\cdot)$ this term must be decreasing in $x^j$.

Line 4: The sum of the first and second terms is nonincreasing in both arguments by assumption (and $x^i$ and $x^j$ are above the range on which they need to be constant). The third and fourth terms follow by assumption on $\nu^i_t(\cdot)$, $X(\cdot)$, and $\varepsilon^i_t(\cdot)$.

We are now ready for our last theoretical result.

Theorem 3 Under Property ??, the ESS basestock policy with response functions $R^i(\cdot)$ is a Markov Equilibrium with $V^i_t(x^i, x^j) = V^{i*} + \varepsilon^i_t(x^i) + \nu^i_t(x^i, x^j)$.

Proof We proceed by induction. Suppose that the ESS policy is Markov from period t + 1 onwards and $V^i_{t+1}(x^i, x^j) = V^{i*} + \varepsilon^i_{t+1}(x^i) + \nu^i_{t+1}(x^i, x^j)$. Further, suppose in period t firm j follows the ESS basestock response policy. It suffices to show that firm i’s optimal response is the ESS basestock response and that $V^i_t(x^i, x^j) = V^{i*} + \varepsilon^i_t(x^i) + \nu^i_t(x^i, x^j)$.

Case 1: $0 \le x^i \le y^{i*}$, $0 \le x^j \le y^{j*}$.

$$V^i_t(x^i, x^j) = \max_{x \ge x^i}\left\{\pi^i(x, y^{j*}) + \alpha\left(\frac{\pi^{i*}}{1-\alpha} + E\left[\varepsilon^i_{t+1}(X^i(x, y^{j*}))\right] + E\left[\nu^i_{t+1}(X(x, y^{j*}))\right]\right)\right\}.$$

For $x < y^{i*}$, $\pi^i(x, y^{j*})$ is increasing in x. Further, $X^i(x, y^{j*}) \le x < y^{i*}$ and $X^j(x, y^{j*}) \le y^{j*}$, so the third and fourth terms are zero. Hence the equilibrium value is at least $y^{i*}$. Now, by assumption, $\pi^i(x, y^{j*}) + \alpha E[\varepsilon^i_{t+1}(X^i(x, y^{j*}))]$ is maximized at $x = y^{i*}$; further, $X(x, y^{j*})$ is stochastically nondecreasing in x and $\nu(\cdot)$ is a nonincreasing function, so $y^{i*}$ must be a best response. Further,

$$V^i_t(x^i, x^j) = \pi^i(y^{i*}, y^{j*}) + \frac{\alpha \pi^{i*}}{1-\alpha} = V^{i*}.$$

Case 2: $x^i \le R^i(x^j) \le y^{i*}$, $x^j > y^{j*}$.

$$V^i_t(x^i, x^j) = \max_{x \ge x^i}\left\{\pi^i(x, x^j) + \alpha\left(\frac{\pi^{i*}}{1-\alpha} + E\left[\varepsilon^i_{t+1}(X^i(x, x^j))\right] + E\left[\nu^i_{t+1}(X(x, x^j))\right]\right)\right\}.$$

For $x < R^i(x^j)$, $\pi^i(x, x^j)$ is increasing in x. Further, $X^i(x, x^j) \le x < R^i(x^j) \le y^{i*}$, so the third term is zero. For the fourth term, $X(x, x^j)$ is nondecreasing in x and $\nu(\cdot)$ is a nonincreasing function; hence the equilibrium value is at least $R^i(x^j)$. Now, by assumption, $\pi^i(x, x^j) + \alpha E[\varepsilon^i_{t+1}(X^i(x, x^j))]$ is decreasing for $x \ge R^i(x^j)$. As before, the last term is nonincreasing in x, so $R^i(x^j)$ must be a best response. Further,

$$\begin{aligned}
V^i_t(x^i, x^j) &= \pi^i(R^i(x^j), x^j) + \alpha\left(\frac{\pi^{i*}}{1-\alpha} + E\left[\varepsilon^i_{t+1}(X^i(R^i(x^j), x^j))\right] + E\left[\nu^i_{t+1}(X(R^i(x^j), x^j))\right]\right) \\
&= V^{i*} - \pi^{i*} + \pi^i(R^i(x^j), x^j) + \alpha E\left[\nu^i_{t+1}(X(R^i(x^j), x^j))\right] \\
&= V^{i*} + \nu^i_t(x^i, x^j).
\end{aligned}$$

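Case 3: $x^i > y^{i*}$, $x^j \le R^j(x^i)$.

$$V^i_t(x^i, x^j) = \max_{x \ge x^i}\left\{\pi^i(x, R^j(x^i)) + \alpha\left(\frac{\pi^{i*}}{1-\alpha} + E\left[\varepsilon^i_{t+1}(X^i(x, R^j(x^i)))\right] + E\left[\nu^i_{t+1}(X(x, R^j(x^i)))\right]\right)\right\}.$$

By assumption, $\pi^i(x, R^j(x^i)) + \alpha E[\varepsilon^i_{t+1}(X^i(x, R^j(x^i)))]$ is decreasing in x for $x > x^i > y^{i*}$. Further, $R^i(R^j(x^i)) < x^i$ by Lemma ??(b), so it is decreasing on $x > R^i(R^j(x^i))$. But $X(x, R^j(x^i))$ is stochastically nondecreasing in x and $\nu(\cdot)$ is a nonincreasing function, so $x^i$ must be a best response. Further,

$$\begin{aligned}
V^i_t(x^i, x^j) &= \pi^i(x^i, R^j(x^i)) + \alpha\left(\frac{\pi^{i*}}{1-\alpha} + E\left[\varepsilon^i_{t+1}(X^i(x^i, R^j(x^i)))\right] + E\left[\nu^i_{t+1}(X(x^i, R^j(x^i)))\right]\right) \\
&= V^{i*} - \pi^{i*} + \pi^i(x^i, R^j(x^i)) + \alpha E\left[\varepsilon^i_{t+1}(X^i(x^i, R^j(x^i)))\right] + \alpha E\left[\nu^i_{t+1}(X(x^i, R^j(x^i)))\right] \\
&= V^{i*} + \varepsilon^i_t(x^i) + \nu^i_t(x^i, x^j).
\end{aligned}$$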

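Case 4: $x^i > R^i(x^j)$, $x^j > R^j(x^i)$.

$$V^i_t(x^i, x^j) = \max_{x \ge x^i}\left\{\pi^i(x, x^j) + \alpha\left(\frac{\pi^{i*}}{1-\alpha} + E\left[\varepsilon^i_{t+1}(X^i(x, x^j))\right] + E\left[\nu^i_{t+1}(X(x, x^j))\right]\right)\right\}.$$

By assumption, $\pi^i(x, x^j) + \alpha E[\varepsilon^i_{t+1}(X^i(x, x^j))]$ is decreasing in x for $x > R^i(x^j)$, $X(x, x^j)$ is stochastically nondecreasing in x, and $\nu(\cdot)$ is a nonincreasing function, so $x^i$ must be a best response. Further,

$$\begin{aligned}
V^i_t(x^i, x^j) &= \pi^i(x^i, x^j) + \alpha\left(\frac{\pi^{i*}}{1-\alpha} + E\left[\varepsilon^i_{t+1}(X^i(x^i, x^j))\right] + E\left[\nu^i_{t+1}(X(x^i, x^j))\right]\right) \\
&= V^{i*} + \varepsilon^i_t(x^i) + \nu^i_t(x^i, x^j).
\end{aligned}$$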

Theorem ?? shows that a basestock policy is Markov. However, the condition required to prove

it (Property ??) is quite strong. This is because not only must we show that there is no incentive

to deviate from the ESS if inventories lie below the ESS (as was shown under Property ??) but

also that should inventory lie above the ESS then no further order should be placed.

The proofs of Theorems ?? and ?? are inductive in nature, which is natural when trying to

prove subgame perfection. While a finite horizon is assumed, a similar technique could also be used

in the infinite horizon, either directly, or by letting T → ∞ (e.g., Fudenberg and Levine, 1983).

The key to the inductive arguments is the definition of the $e^i_t(\cdot)$ and $\varepsilon^i_t(\cdot)$ functions, which are single variable functions in a firm’s own stocking level. Bounding the payoff functions in terms of these functions allows subgame perfection to be proven. It seems likely that a similar technique for proving that an ESS is also a ME could be used in settings other than those considered here, which is naturally left as a subject for future research. In the following section we numerically investigate what happens when the conditions of this section do not hold.

5 Numerical Results

This section describes the results of exercising the model numerically. While the analysis in the previous sections demonstrates that the Markov Equilibrium structure coincides with the Equilibrium in Stationary Strategies structure under some conditions, we have few analytical results when these conditions do not hold. Specifically, our motivation is to explore the possibility that a firm may choose to stock higher than its ESS level in order to gain in the following period, in the hope of entering that period at a high stocking level and thereby forcing its rival to stock at its best response (below the ESS level). The following examples do not satisfy Property 2 in Section 4. These are not the only examples we examined, but we consider them representative of the properties we wish to highlight.

The first example we wish to illustrate may be found in Table ??. Seeking a situation where a

firm will stock at a higher level to enter the following period at a similarly high level, we introduce a

demand distribution with a point mass of probability at zero demand with the remaining probability

distributed across strictly positive demands. The purpose of this zero demand is to allow the high-

stocking firm (with some probability) to continue to stock high in the following period. The numbers

in Table ?? reflect the equilibrium stocking levels for the following data: r1 = 21, r2 = 18, h1 =

1, h2 = 2, γ12 = γ21 = 1, c1 = 14, c2 = 10,Pr(D1 = 0) = Pr(D2 = 0) = 0.6,Pr(D1 = i) = Pr(D2 =

i) = 0.01, i = 1, ..., 40, α = 0.99. Notice that demand is assumed to be discrete and all unsatisfied

demand is assumed to transfer for numerical tractability (our state space is therefore the whole

numbers, suitably bounded). While these two assumptions are not identical to those in the previous

two sections we feel that the insights apply equally well.
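The demand distribution used in this example can be written down directly from the stated probabilities. The short sketch below builds it with exact fractions and reports its total mass and mean; the function name is an illustrative choice, and the mean is a derived quantity that the text does not itself report.

```python
from fractions import Fraction
from typing import Dict


def example_demand_pmf() -> Dict[int, Fraction]:
    """Point mass of 0.6 at zero demand, 0.01 on each of the demands 1..40."""
    pmf = {0: Fraction(6, 10)}
    for demand in range(1, 41):
        pmf[demand] = Fraction(1, 100)
    return pmf


pmf = example_demand_pmf()
total_mass = sum(pmf.values())                       # should be exactly 1
mean_demand = sum(d * p for d, p in pmf.items())     # 0.01 * (1 + ... + 40)
prob_positive = 1 - pmf[0]                           # chance of any demand at all
print(total_mass, float(mean_demand), float(prob_positive))
```

The large mass at zero is what gives a high-stocking firm a substantial chance of seeing no direct demand of its own in a period.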

We observe four items. Firstly, this is an illustration that the ESS, (29,20), is not the equilibrium in every period. This consolidates the message that the Equilibrium in Stationary Strategies solution is not always a Markov Equilibrium. Secondly, the recommended stocking levels are not consistent, despite this being a problem with stationary parameters. Thirdly, the solution to this Markov Game is a unique equilibrium in each period (that is not the ESS), although, as we will observe in Table ??, this is not always the case. And finally, there is a cycling behavior observed, where the equilibria are repeated in subsequent periods, in this case with a periodicity of three periods. This regular cycle continues as the horizon length grows. Notice these are equilibrium solutions of the up-to variables of the Markov model, which automatically implies this is an up-to equilibrium policy, albeit not a stationary or monotone one. It is also interesting to note that the equilibria in Table ?? always have one player on the best response curves generated by the ESS; for example, firm 1 stocking 28 is a best response to firm 2 stocking 21, but 21 is not a best response for firm 2 to firm 1 stocking 28, whereas 20 is.

| Period | Equilibrium |
|---|---|
| T + 1 | (29,20) |
| T | (28,21) |
| T − 1 | (28,20) |
| T − 2 | (29,20) |
| T − 3 | (28,21) |
| T − 4 | (28,20) |

Table 1: Example for unique equilibrium
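The three-period cycle described above can be checked mechanically on the six equilibria reported in Table 1. The helper below is a generic sketch (its name is an illustrative choice) that returns the smallest shift under which a finite sequence repeats itself; it says nothing about periods beyond those listed in the table.

```python
from typing import Optional, Sequence, Tuple


def smallest_period(sequence: Sequence[Tuple[int, int]]) -> Optional[int]:
    """Smallest p >= 1 with sequence[k] == sequence[k + p] wherever both exist."""
    n = len(sequence)
    for p in range(1, n):
        if all(sequence[k] == sequence[k + p] for k in range(n - p)):
            return p
    return None  # no repetition visible within the observed window


# Equilibria from Table 1, listed from period T + 1 back to period T - 4.
table_1 = [(29, 20), (28, 21), (28, 20), (29, 20), (28, 21), (28, 20)]
period = smallest_period(table_1)
print(period)
```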

There are numerous other patterns we observe across a variety of numerical scenarios. Table

?? contains the equilibrium stocking levels for a symmetrical example with parameters: r1 = r2 =

20, h1 = h2 = 5, γ12 = γ21 = 1, c1 = c2 = 10,Pr(D1 = 0) = Pr(D2 = 0) = 0.6,Pr(D1 = i) =

Pr(D2 = i) = 0.01, i = 1, ..., 40, α = 0.99. The ESS is shown in period T + 1: (15,15). The

columns in the table illustrate the equilibria in each period when firm 1 chooses the equilibrium

(in each period) with the highest and lowest stocking level. For example, moving back from the

end of the horizon, in Period T when “Choosing highest” the (17,14) equilibrium is chosen and

when “Choosing lowest” the (14,17) equilibrium is chosen. Notice there are many equilibria which

arise again and again. This is a pattern whereby the same equilibria will appear, although not in

a stationary setting. Another observation is that the equilibria in every period when “Choosing

highest” are the mirror image of those when “Choosing lowest,” although such symmetry does not

exist amongst the multiple equilibria within one of these columns. Although not shown in the table,

choosing the ESS (15, 15) is also a ME. Thus, this is an example where the ESS is a ME but it is

not unique.

As noted above, Table ?? illustrates the existence of multiple equilibria. The conditions in

Section ?? guarantee a unique ESS but even in Section ?? we have not shown uniqueness; there we

have shown that the ESS is a ME but we have not ruled out the existence of other ME. Numerically,

we have seen up to five equilibria in a given period, although there seems to be no theoretical bound.

The immediate significance of finding multiple equilibria is that it ostensibly destroys the potential

for finding an equilibrium policy. Unless there is a compelling reason to choose one equilibrium

above the others, the model cannot deliver a valid equilibrium policy over multiple periods since

the ME concept consists of a predictable sequence of actions from a policy, delivering a unique

cost and a predictable state in the following period. The literature gives some guidance as to

what may constitute a compelling reason for choosing one equilibrium over another, including

Pareto improvement where all players encounter a superior outcome for that choice (see Parker and

Kapuscinski, 2010, for an example). Another reason could be a focal point equilibrium where the

custom or norms of the problem context suggest one equilibrium has a greater likelihood of being

chosen, such as jointly choosing the time of a meeting where the equilibrium (12:00pm,12:00pm)

could be preferred over (11:54am,12:08pm). As shown in Table ??, in period T there are three

equilibria: (14,17), (17,14), and (15,15) and each firm prefers a different equilibrium. Without

further information or problem context, it is impossible to validly select one of these equilibria as

a focal point. However, despite the fact that (15,15) is not the preferred equilibrium of either firm,

the ESS levels (15,15) might arguably be an attractive candidate as a mutual choice, particularly

when it is consistent as it is here.

In this example, we again observe a cycling behavior. This is where an equilibrium will reappear

on a regular basis. For example, in Table ?? we see that (14,17) appears under “Choosing highest”

and (17,14) appears under “Choosing lowest” in period T − 1, T − 3, T − 5, T − 7, etc.

As noted, there are equilibria where a firm will stock higher than its ESS level even for symmetrical problem sets. This seems consistent with the above-conjectured behavior on the overstocking firm attempting to credibly commit to an overstock in the following period. For example, we find (14,17), (17,14), and (15,15) in a symmetrical example where the ESS is (15,15).³ Another observation is that whenever the target levels do not correspond to the ESS levels, one firm will stock above its ESS level while the other will stock below its ESS level. This property is observed both in symmetrical and asymmetrical examples. We found no examples where both firms overstock (which we had initially hypothesized might be observed) nor any examples where both firms understock (which is not surprising).

| Period | Choosing highest | Choosing lowest |
|---|---|---|
| T + 1 | (15,15) | (15,15) |
| T | (14,17) (15,15) (17,14) | (14,17) (15,15) (17,14) |
| T − 1 | (13,19) (14,17) (15,16) | (16,15) (17,14) (19,13) |
| T − 2 | (13,19) (14,17) (16,15) (20,13) (21,12) | (12,21) (13,20) (15,16) (17,14) (19,13) |
| T − 3 | (13,19) (14,17) | (17,14) (19,13) |
| T − 4 | (13,19) (18,14) | (14,18) (19,13) |
| T − 5 | (13,19) (14,17) | (17,14) (19,13) |
| T − 6 | (13,19) (19,13) | (13,19) (19,13) |
| T − 7 | (13,19) (14,17) | (17,14) (19,13) |
| T − 8 | (13,19) (17,14) (20,13) (21,12) | (12,21) (13,20) (14,17) (19,13) |
| T − 9 | (13,19) (14,17) | (17,14) (19,13) |
| T − 10 | (13,19) (18,14) | (14,18) (19,13) |
| T − 11 | (13,19) (14,17) | (17,14) (19,13) |
| T − 12 | (13,19) (16,15) (19,13) | (13,19) (15,16) (19,13) |
| T − 13 | (13,19) (14,17) | (17,14) (19,13) |
| T − 14 | (13,19) (17,14) (21,12) | (12,21) (14,17) (19,13) |
| T − 15 | (13,19) (14,17) | (17,14) (19,13) |
| T − 16 | (13,19) (18,14) | (14,18) (19,13) |
| T − 17 | (13,19) (14,17) | (17,14) (19,13) |
| T − 18 | (13,19) (16,15) (19,13) | (13,19) (15,16) (19,13) |

Table 2: Example for multiple equilibria
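Two of the observations made about Table 2, the mirror image relation between its columns and the absence of joint overstocking or joint understocking, can be verified directly on the reported equilibria. The sketch below transcribes only the rows for periods T + 1 through T − 8, and it assumes that each row's entries split evenly between the "Choosing highest" and "Choosing lowest" columns; the variable and function names are illustrative.

```python
from typing import Dict, List, Tuple

Pair = Tuple[int, int]
ESS: Pair = (15, 15)

# Key k stands for period T + 1 - k; each value is (choosing highest, choosing lowest).
table_2: Dict[int, Tuple[List[Pair], List[Pair]]] = {
    0: ([(15, 15)], [(15, 15)]),
    1: ([(14, 17), (15, 15), (17, 14)], [(14, 17), (15, 15), (17, 14)]),
    2: ([(13, 19), (14, 17), (15, 16)], [(16, 15), (17, 14), (19, 13)]),
    3: ([(13, 19), (14, 17), (16, 15), (20, 13), (21, 12)],
        [(12, 21), (13, 20), (15, 16), (17, 14), (19, 13)]),
    4: ([(13, 19), (14, 17)], [(17, 14), (19, 13)]),
    5: ([(13, 19), (18, 14)], [(14, 18), (19, 13)]),
    6: ([(13, 19), (14, 17)], [(17, 14), (19, 13)]),
    7: ([(13, 19), (19, 13)], [(13, 19), (19, 13)]),
    8: ([(13, 19), (14, 17)], [(17, 14), (19, 13)]),
    9: ([(13, 19), (17, 14), (20, 13), (21, 12)],
        [(12, 21), (13, 20), (14, 17), (19, 13)]),
}


def is_mirror(highest: List[Pair], lowest: List[Pair]) -> bool:
    """True when swapping the two firms maps one column onto the other."""
    return {(b, a) for a, b in highest} == set(lowest)


def joint_deviation(pair: Pair) -> bool:
    """True when both firms are strictly above, or both strictly below, the ESS."""
    a, b = pair
    return (a > ESS[0] and b > ESS[1]) or (a < ESS[0] and b < ESS[1])


mirror_ok = all(is_mirror(high, low) for high, low in table_2.values())
all_pairs = [p for high, low in table_2.values() for p in high + low]
any_joint = any(joint_deviation(p) for p in all_pairs)
print(mirror_ok, any_joint)
```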

An alternative perspective of this “overstock-understock” behavior is that the firm which is

understocking relative to its ESS is attempting to sabotage the firm which is overstocking. The

understocking firm knows they will be at a disadvantage in the following period if the overstocking

firm continues to have an inventory superiority in the following period, so they understock in the

current period knowing that some of their current-period unsatisfied customers will transfer to the

rival firm and draw down the rival’s inventory. Thus, they can indirectly attempt to reduce the high

stocking level of the other firm by deliberately channeling unmet demand towards them. Consider

an equilibrium in Table ?? of (21,12) in Period T − 2 (when equilibria of (15, 16) and (17,14) are

chosen in Periods T − 1 and T , respectively). With R1(12) = 17 which is lower than 21, we can see

that firm 1 is stocking higher than the ESS best response level. Knowing that R2(17) = 14, clearly

firm 1 is overstocking relative to firm 2’s best response, too. The understocking interpretation may

be seen as firm 2 stocking at 12 rather than its ESS best response of 14 (to firm 1 stocking 17) or

13 (R2(21) = 13). The effect of this understocking is that more of its customers will stockout and

transfer to the overstocking firm, thus reducing the chance that the overstocking firm will enter the

following period at a high stocking level.
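The comparison in this paragraph is simple arithmetic on the best response values quoted in the text, and it can be laid out as follows. Only the three quoted values of the ESS best response functions are used; the dictionary names are illustrative.

```python
# ESS best responses quoted in the text: R1(12) = 17, R2(17) = 14, R2(21) = 13.
best_response_firm_1 = {12: 17}
best_response_firm_2 = {17: 14, 21: 13}

equilibrium = (21, 12)  # the Period T - 2 equilibrium discussed above
stock_1, stock_2 = equilibrium

# Firm 1's stock relative to its best response to firm 2 stocking 12.
overstock_firm_1 = stock_1 - best_response_firm_1[stock_2]

# Firm 2's stock relative to its best responses to firm 1 stocking 17 and 21.
understock_vs_17 = best_response_firm_2[17] - stock_2
understock_vs_21 = best_response_firm_2[21] - stock_2

print(overstock_firm_1, understock_vs_17, understock_vs_21)
```

Firm 1 is four units above its quoted best response, while firm 2 is one or two units below its own, depending on which of firm 1's stocking levels is taken as the reference.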

One final property observed in Table ?? is what we label as the see-saw effect. This is where

the highest stocking level for firm 1 under “Choosing highest” appears above and below the ESS

levels in alternating periods. Similarly, firm 1’s lowest stocking level see-saws above and below firm

1’s ESS level under “Choosing lowest.” Firm 2’s stocking level similarly follows such an alternating

see-saw effect but will be above its ESS level when firm 1’s stocking level is below its respective

ESS level.

³ Clearly, in a symmetrical example which has unequal target stocking levels for one equilibrium, there will be a corresponding equilibrium with the target levels reversed.

6 Conclusions

In this paper we formulate and analyze an inventory duopoly where unsatisfied customers at one

firm may either transfer to the other firm or leave unsatisfied. This model has been previously

studied in the literature but not from the perspective of a Markov Equilibrium. We show that

significantly different dynamics can occur when subgame perfection is required across periods.

We feel that the extant literature on inventory games has been confused by the lack of a good term for the typical (open-loop or infinite-horizon) type of equilibrium studied, and we suggest the term Equilibrium in Stationary Strategies for the standard equilibrium. For our specific

example application we give conditions for when the ESS is also a ME (i.e., is also a closed-loop

equilibrium). The proof technique used to show subgame perfection relies on an inductive argument

showing that any deviation from the ESS in the current period does not provide sufficient reward in

the following period to warrant consideration. While relatively natural, we are not aware of other

papers using this technique and believe it may be a general technique to consider for proving that

an ESS is also a ME.

We show that behavior under a ME may be far from trivial, and future research should study

ways to allow policy insights from the equilibrium. Until that is achieved, we feel that the ESS

concept has significant value and should continue to be studied from an operations perspective.

After all, there are many environments where the macro decisions on operations (e.g., the up-to levels for an ERP system) are made on a longer timescale than the day-to-day fluctuations of, say, inventory. Also, the macro-level insights obtained from such games (e.g., the losses to a firm due to competition) will likely also frequently carry over to the closed-loop ME setting.

In summary, our primary contribution is a deeper understanding of both the dynamic stockout-based substitution game and also the general relationship between ESS and ME. Future research could focus on ESS results for stockout-based substitution models more general than those considered here and also on further methodological results on the analysis of ME.

References

[1] Albright, S.C. and Winston, W., “Markov Models of Advertising and Pricing Decisions,”

Operations Research, Vol. 27, No. 4 (1979) 668-681.

[2] Anderson, E., Fitzsimons, G.J., and Simester, D., “Measuring and Mitigating the Cost of

Stockouts in Mail Order Catalogs,” University of Chicago working paper (2005).

[3] Avsar, Z.M. and Baykal-Gursoy, M., “Inventory Control under Substitute Demand: A Stochas-

tic Game Application,” Naval Research Logistics, Vol. 49, (2002) 359-375.

[4] Bassok, Y., Anupindi, R., and Akella, R., “Single-Period Multiproduct Inventory Models with

Substitution,” Operations Research, Vol. 47, No. 4 (1999) 632-642.

[5] Bernstein, F. and Federgruen, A., “Dynamic Inventory and Pricing Models for Competing

Retailers,” Naval Research Logistics, Vol. 51 (2004) 258-274.

[6] Caro, F. and Martínez-de-Albéniz, V., “The Impact of Quick Response in Inventory-Based Competition,” Manufacturing & Service Operations Management, Vol. 12, No. 3 (2010) 409-429.

[7] Dixit, A., “The Role of Investment in Entry-Deterrence,” The Economic Journal, Vol. 90, No.

357 (1980) 95-106.

[8] Fudenberg, D. and Levine, D., “Subgame-Perfect Equilibria of Finite- and Infinite-Horizon

Games,” Journal of Economic Theory, Vol. 31, No. 2, (1983) 251-268.

[9] Fudenberg, D. and Tirole, J., Game Theory, MIT Press, Cambridge, MA (1991).

[10] Hall, J. and Porteus, E., “Customer Service Competition in Capacitated Systems,” Manufac-

turing & Service Operations Management, Vol. 2, No. 2 (2000) 144-165.

[11] Heyman, D. and Sobel, M., Stochastic Models in Operations Research, Vol. II, McGraw Hill,

New York (1984).

[12] Honhon, D., Gaur, V., and Seshadri, S., “Assortment Planning and Inventory Decisions Under

Stockout-Based Substitution,” Operations Research, Vol. 58, No. 5 (2010) 1364-1379.

[13] Kirman, A.P. and Sobel, M.J., “Dynamic Oligopoly with Inventories,” Econometrica, Vol. 42,

No. 2, (1974) 279-287.

[14] Li, Q., and Ha, A., “Reactive Capacity and Inventory Competition under Demand Substitu-

tion,” IIE Transactions, Vol. 80 (2008) 707-717.

[15] Lippman, S.A. and McCardle, K.F., “The Competitive Newsboy,” Operations Research, Vol.

45, No. 1, (1997) 54-65.

[16] Lu, L.X. and Lariviere, M.A., “Capacity Allocation over a Long Horizon: The Return on

Turn-and-Earn,” working paper, University of North Carolina (2009).

[17] Mahajan, S. and van Ryzin, G., “Inventory Competition under Dynamic Consumer Choice,”

Operations Research, Vol. 49, No. 5 (2001) 646-657.

[18] Nagarajan, M. and Rajagopalan, S., “A Multiperiod Model of Inventory Competition,” Operations Research, Vol. 57, No. 3 (2009) 785-790.

[19] Nash, J.F., “Equilibrium Points in n-Person Games,” Proceedings of the National Academy of

Sciences of the United States of America, Vol. 36, No. 1 (1950) 48-49.

[20] Nash, J.F., “Non-Cooperative Games,” Annals of Mathematics, Vol. 54, No. 2 (1951) 286-295.

[21] Netessine, S. and Rudi, N., “Centralized and Competitive Inventory Models with Demand

Substitution,” Operations Research, Vol. 51, No. 2, (2003) 329-335.

[22] Netessine, S., Rudi, N., and Wang, Y., “Inventory Competition and Incentives to Back-order,”

IIE Transactions, Vol. 38 (2006) 883-902.

[23] Olsen, T.L. and Parker, R.P., “Inventory Management Under Market Size Dynamics,” Man-

agement Science, Vol. 54, No. 10 (2008) 1805-1821.

[24] Pakes, A. and McGuire, P., “Computing Markov-Perfect Nash Equilibria: Numerical Implica-

tions of a Dynamic Differentiated Product Model,” RAND Journal of Economics, Vol. 25, No.

4 (1994) 555-589.

[25] Parker, R.P. and Kapuscinski, R., “Managing a Non-Cooperative Supply Chain with Limited

Capacity,” forthcoming in Operations Research (2010).

[26] Parlar, M., “Game Theoretic Analysis of the Substitutable Product Inventory Problem with

Random Demand,” Naval Research Logistics, Vol. 35, (1988) 397-409.

[27] Saloner, G., “The Role of Obsolescence and Inventory Costs in Providing Commitment,”

International Journal of Industrial Organization, Vol. 4 (1986) 333-345.

[28] Schary, P.B. and Christopher, M., “The Anatomy of a Stockout,” Journal of Retailing, Vol.

55, No. 2, (1979) 59-70.

[29] Sydsæter, K., Strøm, A., and Berck, P., Economists’ Mathematical Manual, Springer, Heidelberg, Germany (2000).

[30] Spence, A.M., “Entry, Capacity, Investment and Oligopolistic Pricing,” The Bell Journal of

Economics, Vol. 8, No. 2 (1977) 534-544.

[31] Tirole, J., The Theory of Industrial Organization, MIT Press, Cambridge, MA (1988).

[32] Vives, X., Oligopoly Pricing: Old Ideas and New Tools, The MIT Press, Cambridge, MA

(1999).

[33] Wang, Y. and Gerchak, Y., “Supply Chain Coordination when Demand is Shelf-Space Depen-

dent,” Manufacturing & Service Operations Management, Vol. 3, No. 1, (2001) 82-87.

[34] Zeinalzadeh, A., Alptekinoglu, A., and Arslan, G., “Inventory Competition under Fixed Order

Costs,” working paper, Southern Methodist University (2010).
