Upload
auckland
View
1
Download
0
Embed Size (px)
Citation preview
Dynamic Inventory Competition with Stockout-Based Substitution
Tava Lennon Olsen∗
The University of Auckland
Rodney P. Parker†
The University of Chicago
November 24, 2010
Abstract
We examine when there is a commitment value to inventory in a dynamic game between two firms
under stockout-based substitution. In our model, the firms face independent direct demand but
some fraction of a firm’s lost sales will switch to the other firm. This problem has been previously
studied in both the single period and stationary infinite horizon (open loop) setting but not in
a Markov perfect (closed loop) setting. We numerically show that, as seen in the economics
literature, significantly different dynamics can occur in the closed loop setting and indeed there
may be a commitment value to inventory in this setting. However, we give conditions under
which the stationary infinite horizon equilibrium is also a Markov perfect equilibrium and provide
a discussion on when the (simpler) stationary equilibrium may be sufficient for study. We provide
a review of the types of equilibria typically found in operations management papers and provide
some roadmaps on avenues for future research.
∗Email: [email protected]†Email: [email protected]
1 Introduction
It is well documented that consumers may switch brands upon experiencing a stockout (e.g., Schary
and Christopher, 1979). Given this behavior, one dimension of retailer competition is the avail-
ability of stock. Of course, retail prices also strongly influence customers purchasing decisions but
once price competition has settled prices to a common level between retailers, customers next look
to whether the product is on the shelf. We seek to understand retailer competitive behavior in
the face of stock-out based substitution. More specifically, this paper studies the dynamics of such
competition. For example, is it ever rational for a retailer to overstock inventory in one period so
that they can credibly commit to a high inventory level in the following period?
We consider a simple duopoly competition where prices are fixed and consumers have a preferred
product (primary demand). Upon finding that product unavailable, they either will not buy or will
switch to the competing product. Such a scenario can either model two competing retailers stocking
the same product, or one retailer with two competing vendors (e.g., Coca Cola and Pepsi) who
manage their stock on the retailer’s shelves. This paper studies the dynamic game between the two
firms, focusing on the inter-period dynamics. It gives conditions under which the stationary infinite
horizon equilibrium is also a Markov perfect equilibrium but also provides examples of where this
is not the case. In this latter case, the dynamics can be quite complex but there may indeed be a
commitment value to inventory.
The commitment value of capacity has been studied in the economics literature. The general
idea is that the irreversible nature of capacity allows a firm to demonstrate commitment. Tirole
(1988) describes the period of commitment associated with an investment (such as purchasing
capacity) as “a period of time over which the cost of being freed from the commitment within
the period is sufficiently high that it does not pay to be freed.” The irreversibility of capacity
typically arises from the fact that it is lumpy, industry-specific, or otherwise difficult to discard on
a secondary market. The sinking of money into an investment of such a magnitude tends to show
a commitment, usually towards operating in a particular market. Spence (1977), Dixit (1980),
and others show how such a capacity commitment by an incumbent firm can be used to ward off
potential entrants into a market, ensuring monopoly operations for the incumbent. The entrants
are aware of the incumbent’s investment and its durability implies the commitment. Prices do not
have a similar deterrent effect because they can be easily changed (Tirole, 1988, p.368).
1
A review of the literature did not yield any previous articles on the commitment value of
inventory in the sense we intend, despite the lengthy literature on inventory control. Saloner
(1986) does describe a partial commitment of inventory in the sense that acquiring inventory (either
through production or purchasing) and incurring the holding cost does endow the firm with the
ability to sell up to that quantity. Our sense is somewhat different, in that a firm may order up to a
“high” level in one period with the hope that its inventory entering the following period is similarly
high. Such a hope is predicated on knowledge of the direct demand’s probability distribution and
of the rival’s demand distribution and stocking level. Thus, the beginning inventory level in the
following period is far from guaranteed, but if it happens to be high, then the commitment to this
inventory is beyond dispute: the concrete physical nature of the inventory means it is available for
usage. Of course, this inventory commitment is for a shorter number of periods compared with
capacity, but it has a similar flavor of requiring an observable investment which cannot be easily
reversed.
This paper contributes to two main areas of literature: inventory duopolies and dynamic oper-
ational games. The earliest acknowledged inventory oligopoly is attributed to Kirman and Sobel
(1974) who elegantly formulate and analyze two competing firms. They were the first to recog-
nize that intertemporal dependence in inventory duopolies arises in two manners: via the physical
inventory being carried from period to period and via the stochastic demand. They formulate a
dynamic oligopoly where the firms set prices; there is no explicit flow of customers based upon
stockout substitution.
Although there are examples of stockout-based substitution in centralized models, such as Bas-
sok, Anupindi, and Akella (1999) and Honhon, Gaur, and Seshadri (2010), our focus is on com-
petitive models. The relationship between the retailers is dependent upon two elements: (a) how
the primary demand is divided between the firms; and (b) how the customers upon experiencing
a stockout will affect the other firm. Considering (a), Lippman and McCardle (1997) consider a
single source of stochastic market demand that is then split between the retailers according to four
exogenous rules. However, most papers allow separate random demand streams arriving at each
firm, as we do. Nagarajan and Rajagopalan (2009) consider separate but correlated streams of pri-
mary demand, but it is more usual to consider independent primary streams and incorporate the
degree of substitutability by parameterizing the stockout routing paths, as we also do. Although
2
there is some literature dealing with demands influenced by the availability of stock (e.g., Wang
and Gerchak, 2001), we assume independence between the originating demand and the stocking
levels.
Focusing on aspects of (b), most papers assume some flow of customers from one retailer to
the other (for a duopoly) upon experiencing a stockout. The sum of a firm’s primary demand and
transfer demand to that firm is known as the firm’s effective demand. Parlar (1988), Lippman and
McCardle (1997), Avsar and Baykal-Gursoy (2002), Netessine and Rudi (2003), and Zeinalzadeh
et al. (2010) assume a fixed portion of customers search at the other firm while all others are lost;
we make a similar assumption. Netessine, Rudi, and Wang (2006) generalize this assumption by
considering four separate models, three of which include an allowance for backlogging. Their first
model, where excess effective demand is fully lost, is equivalent to our model. Further, it is entirely
possible that a stockout will negatively affect a customer’s likelihood of visiting that retailer in a
subsequent visit, which has the effect of changing the demand distribution in subsequent periods.
Under this scenario, Olsen and Parker (2008) find that base-stock levels can be the optimal or
equilibrium policy only under certain flow restrictions; this paper does not consider that effect.
The other strand of literature this paper contributes to is that of dynamic operational games,
where the the players may compete multiple times. In the context of operational problems, it often
makes sense to have these models be state dependent. Such a dependence is frequently associated
with an assumption that all payoff-relevant information is embedded in these state variables and
anything else in the past is payoff-irrelevant; these are Markov Games. In this paper, the firms’
beginning inventory levels form the periodic state variables.
Our primary contribution is a careful treatment of the competitive dynamics for our duopoly.
While similar stock-out based substitution models to ours have been studied in the literature, to the
best of our knowledge, none have studied the Markov Equilibrium (ME). Indeed, the literature on
ME in operational games is limited and we review this further in Section 2. We also believe that the
treatment of equilibria type has been hampered in the operations literature by a lack of consistent
vocabulary, and we propose the term an equilibrium in stationary strategies (ESS) for a common,
but inconsistently labeled, equilibrium used frequently in operations papers; this equilibrium type
is used in Avsar and Baykal-Gursoy, 2002, and Netessine, Rudi, and Wang, 2006, in similar models
of stock-out based substitution to ours. Here we give conditions for when this equilibrium is also
3
a Markov equilibrium using a novel proof technique. It is also shown numerically that this is not
always the case.
This paper is organized as follows. Section 2 reviews the appropriate equilibria types for our
duopoly, defines an ESS, and further details relevant literature for dynamic operational games.
Section 3 formally defines our stockout-based substitution duopoly, gives initial structural results
on the payoff and response functions, and shows existence and uniqueness of the ESS. Section 4
gives conditions under which the ESS is also a ME. Section 5 numerically studies settings where
the ESS is not a ME and shows that indeed there may be, in some cases, a commitment value to
inventory. Finally, Section 6 concludes the paper.
2 Equilibrium Types and Literature
This section discusses the types of equilibria and policies that are most relevant to our work. In
particular, we propose a new equilibrium term, an “Equilibrium in Stationary Strategies” (ESS)
as well as reviewing the existing concept of a “Markov Equilibrium” (ME). We define “up-to”
and “base-stock” policies in the context of inventory games and then relate these to the existing
literature on stockout-based substitution. We also provide a brief review of papers in the inventory
literature using the Markov Equilibrium concept.
We first describe two policy types relevant for inventory duopolies. We will refer to firms i
and j = i and it may be assumed that the below definitions and all results in what follows are for
i, j = 1, 2 and j = i. Time is given by t = 1, 2 . . . but where no confusion is likely the time subscript
will be dropped.
Definition 1 A policy for firm i is a stationary up-to response policy if, in each period t, for some
level yi∗, whenever the initial inventory is below yi∗ firm i orders up to yi∗ regardless of the other
firm’s ordering policy for that period. If both firms follow the up-to response policy then we call this
(yi∗, yj∗) a stationary up-to policy.
The appeal of up-to policies is well known; they are easy to implement and intuitively satisfying.
They simply require a firm to know its current inventory level, its target stocking level, and to order
the difference. Notice that an up-to policy makes no policy prescription for what should be done if
inventory is found to be above (yi∗, yj∗), and hence papers proving the optimality or equilibrium
4
nature of an up-to policy will typically impose a requirement on initial inventory. A natural
extension to an up-to policy is the basestock policy. In the single-firm setting a basestock policy is
simply an up-to policy where the firm orders nothing should it find itself above the basestock level.
However, in the game setting additional policy description is needed when one firm is above the
up-to level and the other firm is not. Let (xit, xjt ) be the initial inventory levels at the beginning of
some period t.
Definition 2 A policy for firm i is a stationary basestock response policy if, in each period t, for
some level yi∗, firm i orders up to max(xit,min(Ri(xjt ), yi∗)) regardless of the other firm’s ordering
policy for that period, for some response function Ri(·) with Ri(xj) ≥ yi∗ for xj ≤ yj∗. If both firms
follow the stationary basestock response policy then we call this (yi∗, yj∗) a stationary basestock
policy.
Notice that this definition relies on the definition of a response function because should firm j be
at level xjt > yj∗ and hence ordering nothing, firm i may have a value yi = yi∗ that it would prefer
to stock to in this case; this will be shown to indeed be the case in the stockout substitution game
considered here. The constraint in the definition that Ri(xj) ≥ yi∗ for xj ≤ yj∗ ensures that if both
firms are below (yi∗, yj∗) then both will order up-to (yi∗, yj∗). A special case of such a basestock
policy would be the more traditional definition where each firm always orders up-to the same level
regardless of the competitor’s inventory and orders nothing if above that desired up-to level (i.e.,
Ri(xj) = yi∗ ∀xj).
In a single period, the appropriate form of equilibrium to study is typically a simple Nash
equilibrium (Nash, 1950, 1951). The earliest work on stockout-based substitution appears to be
the single-period model of Parlar (1988), hereafter P88, who shows that there exists a unique Nash
equilibrium and, by implication, that an up-to policy is Nash. Lippman and McCardle (1997),
hereafter LMC, similarly show existence of a Nash solution (via supermodularity) and so a joint
inventory stocking level (up-to policy) is Nash. Given the variety of methods of dividing the initial
inventory, LMC does not necessarily achieve a unique solution. For competitive substitution upon
stockouts in a single period, Mahajan and van Ryzin (2001) allows consumer utility maximization,
Netessine and Rudi (2003) compare the centralized and competitive stocking scenarios, and Li and
Ha (2008) incorporate reactive capacity, and all show that an up-to policy is the Nash equilibrium.
A natural extension to the single-period Nash equilibrium, is the “open-loop” equilibrium, or
5
as it is more typically referred to in the operations management literature, the “infinite horizon
equilibrium” (see, e.g., Bernstein and Federgruen, 2004, for a careful treatment). We find neither of
these terms as fully descriptive and so propose a new term, namely an “Equilibrium in Stationary
Strategies”, which is more specific than “open loop” but more accurate than “infinite horizon,”
because closed loop games can be played on the infinite horizon. Formally, it may be defined as
follows.
Definition 3 A policy is said to be an Equilibrium in Stationary Strategies (ESS) if, for a possibly
restricted set of initial states, the policy is a pure-strategy Nash equilibrium in the infinite horizon
game where the same (possibly state-dependent) strategy is played in each period.
If the initial set of states is restricted, then any proof of an ESS must include showing that the
future state under the equilibrium policy also lies in the same restricted set. In effect, it assumes
that both players are currently setting up their ERP systems, or that strategy decisions are made
on a much longer time scale than the operational payoffs. It is one of the most commonly studied
multi-period equilibria types in the operations literature having a long history (see Kirman and
Sobel, 1974; Heyman and Sobel, 1984), primarily due to its simplicity in formulation and efficacy
in analysis.
Avsar and Baykal-Gursoy (2002), hereafter ABG, extend P88’s model to an infinite-horizon
setting. The equilibrium in each setting is an up-to policy for both firms, albeit Parlar’s equilibrium
is a one-shot decision in a single period whereas Avsar and Baykal-Gursoy’s is a one-shot decision
at the start of an infinite-horizon under the assumption that neither firm will deviate from those
decisions in the future. In other words, ABG show that an up-to policy forms an ESS. Netessine,
Rudi, and Wang (2006), hereafter NRW, also show an up-to policy forms an ESS, but under more
general demand routing compared with ABG. Both ABG and NRW assume initial inventories to
be below the up-to levels in their key theorems.
A stronger equilibrium than an ESS is a Markov Equilibrium. This is a closed-loop equilibrium
and requires subgame perfection in every period.
Definition 4 (Fudenberg and Tirole, 1991, p. 501)
A Markov strategy depends on the past only through the current value of the state variables. A
Markov Equilibrium (ME) or Markov perfect equilibrium is a profile of Markov strategies that
6
yields a Nash equilibrium in every proper subgame.
A ME may be defined over a finite or infinite horizon, but unlike the ESS above, in the infinite
horizon, subgame perfection is still required. One of the simpler forms of ME found commonly in
the operations literature is a linear game. For example, Hall and Porteus (2000) study a linear
game where service competitors, or pure newsvendors, compete for market share that may diminish
as a result of poor service. In this approach, the firms’ payoffs are linear in the state variable (each
firm’s market share). A related technique is to assume separability between the firm’s decisions in
the probabilistic transition function and this is the approach taken in Albright and Winston (1979)
who consider firms competing via pricing and advertising, and Nagarajan and Rajagopalan (2009)
who consider newsvendor stocking conditions under which the competitive scenario reduces into a
single-product optimization for each firm.
Another approach to proving subgame perfection is to limit the model to two periods and there
are some papers which deal specifically with ME of competing retailers with stockout-based substi-
tution across two periods. Caro and Martınez-de-Albeniz (2010) examine the effect of competition
on quick response (see Table 2 in their paper for a crisp summary of the literature). Zeinalzadeh,
Alptekinoglu, and Arslan (2010) consider an inventory duopoly where the firms incur a fixed cost
for placing orders. They find that in two periods, under a number of conditions, equilibria do not
exist.
The inventory literature on non-linear non-separable ME across greater than two periods ap-
pears scarce. Lu and Lariviere (2009) numerically compute ME in order to examine a supplier’s
allocation of scarce inventory to competing retailers; they show a “turn and earn” approach induces
greater sales. Numerical computation is the typical approach for finding ME taken in the economics
literature and there is a large literature on efficient methods for such computation (e.g., Pakes and
McGuire, 1994). Parker and Kapuscinski (2010) demonstrate that the ME for a supplier-retailer
relationship subject to capacity constraints is a modified echelon base-stock equilibrium policy.
They prove the equilibrium policy structure through construction and induction, exploiting Pareto
improvement amongst non-unique equilibria in every period.
Section ?? shows that (under some significant conditions) if initial inventory is below the equi-
librium values then any ESS up-to policy is also a up-to ME policy. However, an up-to equilibrium
does not have any claim over off-equilibrium behavior and assumes that initial inventory is low.
7
A stronger equilibrium is a basestock equilibrium and we also establish conditions under which
the ESS up-to levels will form a stationary Markov base-stock equilibrium policy. However, this
is quite a strong equilibrium and will need strong conditions imposed to establish it. Section ??
will show, through numerics, that the ESS forming a ME is far from guaranteed to be the case and
will highlight numerous instances of non-ESS equilibria and illustrate some compelling equilibrium
behavior (commitment, cycling, sabotage), including that there is frequently a strategic value to
inventory.
3 Model Formulation and Initial Results
We consider two competing firms who face both primary and secondary customer demand. Primary
(or “direct”) customers arrive to a firm each period and will each buy one unit the of product if
it is found to be available. If the product is unavailable, then some proportion of the unsatisfied
customer demand will switch products and become secondary (or “transferred”) demand at the
opposite firm and the rest will be lost sales. The periodic reward consists of revenue minus costs.
The costs are (a) holding costs for physical inventory and (b) production costs for the amount
“processed” in each period; these will be described more precisely below. Also, all payoff relevant
information is assumed to be embedded in the inventory levels of the retailers. This assumption, or
something similar, is necessary in a Markov game formulation. We also assume that all information
on costs, revenues, and demand distributions is common knowledge.
The notation for retailer i in period t is:
ri = revenue per unit (retail price);
hi = holding cost per unit per period;
ci = production cost per unit (wholesale price);
yit = inventory order up-to level in period t (decision variable);
xit = inventory level at the beginning of period t (state variable);
Dit = direct demand to retailer i in period t (random variable);
γij = the proportion of retailer i’s direct customers who will transfer to retailer j; and
α = periodic discount factor.
8
For notational convenience the discount factor is assumed to be the same for all firms; firm-specific
discount factors could be incorporated but little additional insight would be gained. We define the
total demand in period t at retailer i, when retailer j stocks yjt , as
Λit(y
jt ) = Di
t + γji(Djt − yjt )
+,
where x+ = max(x, 0). The argument of Λit(·) will sometimes be dropped when it is too nota-
tionally cumbersome and the meaning is clear from the context. We assume the {Dit} and {Dj
t}
form mutually independent i.i.d. sequences. This model of demand then corresponds to “Rule 4:
Independent Random Demands” of LMC; it is also the form assumed in both P88 and ABG and
is a special case of that in NRW. We focus on this form for simplicity and because it appears to
be a very natural form for demand, but as most results will be written in terms of Λit(·) (which
corresponds to Ri in LMC) extensions to other demand splitting models are likely possible.
We assume γij ∈ [0, 1).1 Firms accrue revenues only when they satisfy the customers’ orders;
recall that no backlogging is allowed. We assume ri ≥ ci, which is natural. All revenue and cost
parameters are non-negative. We also assume 0 ≤ α ≤ 1.
The inventory transition function between periods (counting time periods forward) for retailer
i is:
xit+1 =(yit − Λi
t(yjt ))+
,
which is the level of physical inventory remaining after satisfying direct and transfer demand. Write
Xi(yi, yj) =(yi − Λi(yj)
)+as the random variable representing the firm i inventory left after demand in a generic period when
order-up-to points of (yi, yj) are chosen for firms i, j and X(y1, y2) = (X1(y1, y2), X2(y2, y1)).
Retailer i’s infinite-horizon expected payoff is
E
[ ∞∑t=1
αt−1(rimin(yit,Λ
it)− ci(yit − xit)− hi(yit − Λi
t)+)]
.
1The assumption that γij < 1 will be used in Lemma ?? to show that the magnitude of the derivative of the
response function is strictly less than 1. An equivalent more technical condition on the support of the density could
also be used if γij = 1 (i.e., if all customers transfer upon a stockout).
9
Using the standard translation that xit = (yit−1 − Λit−1)
+ (e.g., Veinott, 1965) and rimin(yit,Λit) =
riyit − ri(yit − Λit)+, we rewrite the infinite horizon expected payoff as
πi∞ = E
[ ∞∑t=1
αt−1((ri − ci)yit − (ri + hi − αci)(yit − Λi
t)+)]
+ cixi1.
We therefore define firm i’s one-period expected payoff as
πi(yi, yj) = (ri − ci)yi − E[(ri + hi − αci)(yi − Λi(yj))+
]= (ri − ci)yi − E
[(ri + hi − αci)(yi −Di − γji(Dj − yj)+)+
].
For the sake of establishing existence of the equilibrium, certain technical assumptions about the
strategy spaces need to be made. A minor assumption is that we treat the strategy spaces as drawn
from real space; that is (yit, yjt ) ∈ ℜ2. So we treat the inventory decisions as continuous variables, a
common assumption in the literature (e.g., Kirman and Sobel, 1974). Further assumptions require
restrictions upon the strategy space, commonly requiring compactness. A blunt approach to this
is to merely truncate the real number line (for a single decision variable) at a very high positive
level (for an upper bound on physical inventory) and at zero as a lower limit (because no backlogs
are allowed). This truncation of the strategy spaces is sometimes done merely by assumption or
through a justification derived from the demand distribution. A reasonable approach is simply to
select an extremely large quantity (“U”) to which inventory levels will never realistically be raised.
Thus, we assume (yit, yjt ) ∈ [0, U ]× [0, U ].
Remark 1 A lost sales cost has not been included in our profit function because lost revenue has
been accounted for explicitly (a similar assumption is made in NRW). However, one could also
include a lost sales cost for primary demand that represents a goodwill cost for unsatisfied direct
demand (as in P88 or ABG), or a lost sales (or goodwill) cost that applies equally across all
customers (as is done in LMC). In this latter case, one would set the lost sales cost in a given
period equal to li(Λi(yj)− yi
)+for some li ≥ 0. All following results in this paper go through with
this additional cost with the exception that, in order to ensure that firm i’s profit is decreasing in
firm j’s stocking level (Lemma ??(c) below), it must be required that
−(ri + hi − αci + li)γjiP(Dj − (x−Di)/γji < z ≤ Dj) + liγjiP(Dj > z) ≤ 0.
For clarity of exposition we simply assume li = 0.
10
We begin with some properties of the payoff function.
Lemma 1 Payoff πi(yi, yj) is
(a) submodular in (yi, yj);
(b) concave in yi; and
(c) nonincreasing in yj.
Proof Taking partial derivatives of πi(x, z),
∂πi(x, z)
∂x= (ri − ci)− (ri + hi − αci)P(x > Di + γji(Dj − z)+) (1)
∂πi(x, z)
∂z= −(ri + hi − αci)γjiP(Dj − (x−Di)/γji < z ≤ Dj) (2)
∂2πi(x, z)
∂x2= −(ri + hi − αci)fΛi(z)(x) (3)
= −(ri + hi − αci)
[fDi(x)P(Dj ≤ z) +
∫ z+x/γji
zfDi(x− γji(w − z))fDj (w)dw
]∂2πi(x, z)
∂x∂z= −(ri + hi − αci)γji
∫ z+x/γji
zfDi(x− γji(w − z))fDj (w)dw. (4)
Equations (??), (??), and (??) imply that πi(yi, yj) is (a) submodular in (yi, yj), (b) concave in
yi, and (c) nonincreasing in yj respectively.
For the one period game with payoff function πi(yi, yj), let Ri(yj) be the optimal response for
firm i to firm j setting an inventory level of yj (similarly for Rj(yi)). The following two properties
will be useful in what follows.
Lemma 2 For each response function Ri(·),
(a) ∂Ri(z)∂z ≤ 0, and
(b)∣∣∣∂Ri(z)
∂z
∣∣∣ < 1.
Proof Define Ki(x, z) = ∂πi(x,z)∂x = 0 where
∂πi(x, z)
∂x= (ri − ci)− (ri + hi − αci)P(x > Di + γji(Dj − z)+)
= (ri − ci)− (ri + hi − αci)[P(x > Di)P(z > Dj)
+
∫ z+x/γji
zP(x > Di + γji(w − z))fDj (w)dw
].
11
We find the derivative of the best response function via the implicit function theorem as follows:
∂Ri(z)
∂z=
−∂Ki
∂z∂Ki
∂x
∣∣∣∣∣(Ri(z),z)
where
∂Ki
∂x= −(ri + hi − αci)
[fDi(x)P(z > Dj) +
∫ z+x/γji
zfDi(x− γji(w − z))fDj (w)dw
], and
∂Ki
∂z= −(ri + hi − αci)
[∫ z+x/γji
zγjifDi(x− γji(w − z))fDj (w)dw
].
The first thing to note is that the best response function’s derivative has a consistent sign (specif-
ically, negative), indicating monotonicity of the function as in (a). Clearly,∣∣∣∂K∂z ∣∣∣ < ∣∣∣∂K∂x ∣∣∣ because
γji < 1 and fDi(x)P(z > Dj) ≥ 0.2 The result of this is that the magnitude of the slope of the
best response function will be less than 1, yielding (b).
Notice that the derivative of the response function is negative, indicating that the firm’s best
decision is to stock less as the competitor stocks more. This could also have been shown directly
from the submodularity of the payoff functions.
Lemma 3 Inventory Xi(yi, Rj(yi)) is stochastically nondecreasing in yi and X(yi, yj) is stochas-
tically nondecreasing in (yi, yj).
Proof Note
Xi(yi, Rj(yi)) =(yi −Di − γji(Dj −Rj(yi))+
)+.
For any fixed (Di, Dj), a sufficient condition for this to be nondecreasing is γji ∂Rj(yi)∂yi
≥ −1. But
this is true by Lemma ??(b) and because 0 ≤ γji < 1. Thus, unconditioning on demand, we
have the desired result. Finally, Xi(·) is clearly stochastically nondecreasing in both yi and yj so
X(yi, yj) is stochastically nondecreasing in (yi, yj).
While the second part of Lemma ?? is intuitive (stocking more in this period should result in
having more inventory in the following period) the first part relies on the fact that the effect of
stocking more in this period has a secondary effect on transfer demand but an immediate effect on
carry over inventory.
2To accommodate γji = 1 one would simply need a sufficient condition on the support of demand to guarantee
that fDi(x)P(z > Dj) > 0.
12
Lemma 4 Payoff πi(Ri(yj), yj) is nonincreasing in yj.
Proof By the envelope theorem (e.g., Sydsæter, Strøm, and Berck, 2000)
∂πi(Ri(yj), yj)
∂yj=
∂πi(yi, yj)
∂yj
∣∣∣∣∣(Ri(yj),yj)
.
But πi(·) is nonincreasing in yj by Lemma ??, so ∂πi(Ri(yj), yj)/∂yj ≤ 0 as desired.
This lemma is quite intuitive and implies that even when firm i responds optimally to the other
firm’s inventory, they would still prefer the other firm to stock less. We are now ready to state the
main result of this section.
Theorem 1 There exists a unique ESS that is up-to (given sufficiently low initial inventories),
which is also basestock if α = 1.
Proof If the magnitude of the best response derivative is less than 1, then the existence of a unique
equilibrium of the one period pay-off function is guaranteed (Vives, 1999), but this requirement is
met by Lemma ??(b). The fact that it is up-to is given by the concavity of πi(·). Further, if initial
inventory is below the ESS level then it will also be below it in all following periods, by definition of
the inventory transition function. Therefore, so long as initial inventories lie below the equilibrium
levels, then following the equilibrium up-to levels is indeed an ESS. We will show that a basestock
policy is an ESS following the argument used by Bernstein and Federgruen (2004). To this end,
we assume that if in any period t, xit > yi∗ then firm i will not order regardless of the other firm’s
inventory and if xit ≤ yi∗ then firm i will order up-to yi∗ (again regardless of firm j’s inventory).
Under this policy, if initial inventory is above yi∗ then there will be a finite number of periods until
it is below yi∗ and after this point it will lie below yi∗. Further, these periods are negligible in the
infinite horizon when α = 1 and the average reward under such a basestock policy will be the same
as under the up-to policy. It is therefore an ESS.
Theorem 1 establishes that the ESS inventory decisions for the retailers are order up-to and,
under average costs, are also basestock. The up-to portion of the argument reproduces that in NRW
for their Model I: Lost Sales, but for completeness is included here (their proof is also similar). The
result is also similar to that in ABG, but ABG state that they restrict attention to equilibria in
stationary basestock policies; they show that if firm j follows a stationary basestock policy then a
stationary basestock policy is a best response for firm i.
13
4 Markov Equilibrium
Throughout this section we assume α < 1. Define (yi∗, yj∗) as the unique ESS,
πi∗ = πi(yi∗, yj∗)
as the expected one period profit under the ESS, and
V i∗ = πi∗/(1− α)
as the infinite horizon discounted expected payoff under repeated playing of the ESS.
Let V it (x
it, x
jt ) be the profit-to-go function under basestock policy (yi∗, yj∗) when the terminal
value for the game in period T + 1 is equal to V iT+1(x
iT+1, x
jT+1)
∆= V i∗. Define
yi = limy→∞
Ri(y),
which exists because Ri(·) is nonincreasing and bounded below by zero. We will show that, re-
gardless of the other firm’s behavior, firm i will always stock at least yi. Note that the less firm j
stocks the better for firm i from a one period payoff perspective.
Definition 5 Let eiT+1(xi) = 0 for all xi ≥ 0. Then recursively define
eit(x) =
0 0 ≤ x ≤ yi∗
maxz≤x{πi(z, yj)− πi∗ + αP(Xi(z,∞) > yi∗)eit+1(x
it+1)
}x > yi∗
and xit = argmaxx eit(x).
Then eit(x) represents the potential extra reward available to firm i for an overstock of x or less if
firm j reacts in the most favorable fashion possible. Note that, by definition, eit(x) is nondecreasing
and eit(yi∗) = 0. We will require the following property.
Property 1 For xi ≥ 0 and yj ≥ yj∗,
πi(xi, yj) + αP(Xi(xi, yj) > yi∗)eit(xti) ≤ πi∗,
for t = 0, 1, . . . , T .
14
Property ?? is the key assumption for an ESS to remain an equilibrium in all periods. It states
that is not worth firm i’s while to overstock in this period purely for next period’s gain, if the
firm doesn’t get an immediate response from firm j. As this requirement is dependent on t, it is
somewhat unattractive and the following property is stronger but time independent.
Property 2 Define πi = maxx πi(x, yj),
z = argmaxz
{πi(z, yj) + αP(X i(z,∞) > yi∗)
π − πi∗
1− α
},
and p = P(X i(z,∞) > yi∗). Assume, for xi ≥ 0 and yj ≥ yj∗,
πi(xi, yj) + αP(X i(xi, yj) > yi∗)πi − πi∗
1− αp≤ πi∗.
Notice that we can see immediately from Property ?? that if the discount rate is low, if there is
a low probability that overstocked inventory remains high in the following period, or if the payoff
function is not very responsive to the competitor’s inventory level, then this condition, which will
be shown to imply that the ESS forms a ME, is more likely to hold. Such conditions are highly
intuitive.
Lemma 5 Property ?? implies Property ??.
Proof
eit+1(x) ≤ πi − πi∗ + αeit+2(xit+2)
≤T−t−1∑s=0
αs(πi − πi∗)
≤ πi − πi∗
1− α
So that, by the concavity of πi(·) and the nondecreasing nature of cumulative probabilities, xit ≤ z.
Therefore,
eit(x) ≤ πi − πi∗ + αpeit+1(xit+1)
≤T−t∑s=0
(αp)s(πi − πi∗)
≤ πi − πi∗
1− αp.
15
The following theorem shows that under Property ?? or ?? the ESS is a ME. The proof is
inductive. In each period it is shown that if firm j follows the ESS equilibrium then firm i has no
incentive to deviate. No assumptions are made on the behavior if inventory is above the ESS level
but a bound is given for the value in these regions, thus allowing us to prove subgame perfection,
i.e., that there is no incentive to deviate from the equilibrium in order to move off the equilibrium
path in the following period.
Theorem 2 Under Property ?? or Property ??, if initial inventory starts below (yi∗, yj∗) then
ordering up to (yi∗, yj∗) in each period is a Markov Equilibrium. Further, regardless of firm j’s
initial inventory, firm i will always order at least yi.
Proof Notice that as Property ?? implies Property ?? we need only consider the latter. We
proceed by induction where at each step of the induction we will prove:
(a) V it (x
i, xj) = V i∗ for xi ≤ yi∗ and xj ≤ yj∗ and the up-to policy is Markov from period t
onwards when initial inventories are below the up-to levels.
(b) V it (x
i, xj) ≤ V i∗ + eit(xit) for x
i > yi∗
(c) Firm i will always order at least yi in period t and V it (x
i, xj) is constant in xi for xi < yi.
The basis is trivially established in period T + 1, now assume true in period t+ 1 onwards.
Part (a): Assume 0 ≤ xi ≤ yi∗ and 0 ≤ xj ≤ yj∗. We need to show that if firm j follows a yj∗ up-to
policy then firm i’s optimal response is to order up yi∗ and that V it (x
i, xj) = V i∗, which clearly
yields (a) for time t. Now
V it (x
i, xj) = maxx≥xi
{πi(x, yj∗) + αE
[V it+1(X(x, yj∗))
]}≤ max
x≥xi
{πi(x, yj∗) + α
[P(Xi(x, yj∗) ≤ yi∗)V i∗ + P(X i(x, yj∗) > yi∗)(V i∗ + eit+1(x
it+1))
]}= max
x≥xi
{πi(x, yj∗) + α
(V i∗ + P(Xi(x, yj∗) > yi∗)eit+1(x
it+1))
)}≤ πi∗ + αV i∗ = V i∗,
where the first inequality follows by the induction hypotheses (a) and (b) and the second inequality
follows from Property ??. But, by definition, πi(yi∗, yj∗) + αE[V it+1(X(yi∗, yj∗))
]= V i∗ so that
16
ordering up to yi∗ must be a best response for firm i to firm j ordering yj∗ (which is feasible by
the assumption on the initial inventories).
Part (b): Suppose xi > yi∗. We need to show that so long as firm j orders at least yj then
V it (x
i, xj) ≤ V i∗ + eit(xi). Note that we make no claim as to firm i’s response; extra conditions will
be needed to show that firm i’s best response is to order nothing (see Theorem ??). Now, if firm j
orders yj ≥ yj ,
V it (x
i, xj) ≤ maxx≥xi
{πi(x, yj) + α
(V i∗ + P(Xi(x,∞) > yi∗)eit+1(x
it+1)
)}≤ πi∗ + αV i∗ +max
x≥xi
{eit(x)
}≤ V i∗ + eit(x
it).
The first inequality follows by the same reasoning as in Part (a), the second by the definition of
eit(·), and the third by the definition of xit.
Part (c): Suppose that xi < yi. We need to show that firm i orders at least yi and that V it (·)
is constant over this range. But for x < yi, for any yj , Xi(x, yj) < yi and E[V it+1(X(x, yj))
]is
constant in x by the induction hypothesis. Therefore,
V it (x
i, xj) = maxx≥xi
{πi(x, yj) + αE
[V it+1(X(x, yj))
]}= max
x≥yi
{πi(x, yj) + αE
[V it+1(X(x, yj))
]},
where the second equality follows since πi(x, yj) is increasing in x for x < yi (by concavity). Thus,
firm i orders at least yi and V it (x
i, xj) is constant in xi for xi < yi.
A stronger result than the ESS up-to levels being Markov is that they are actually basestock.
However, this requires a stronger condition as follows.
Definition 6 Let εiT+1(xi) = 0 for all xi ≥ 0. Then recursively define
εit(xi) =
0 0 ≤ xi ≤ yi∗
maxz≤xi
{πi(z,Rj(z))− πi∗ + αE
[εit+1(X
i(z,Rj(z)))]}
xi > yi∗.
Then εit(xi) represents the potential extra reward available to firm i for an overstock of xi or less.
Note that by definition εit(xi) is nondecreasing and εit(y
i∗) = 0.
Property 3 For xi ≥ Ri(xj), xj ≥ Rj(xi), πi(xi, xj) + αE[εit+1(X
i(xi, xj))]is nonincreasing in
(xi, xj). Further, πi(xi, Rj(xi)) + αE[εit+1(X
i(xi, Rj(xi)))]is nonincreasing in xi.
17
In particular, Property ?? implies that
πi(xi, yj∗) + αE[εit+1(X
i(xi, yj∗))]
is nonincreasing in x beyond yi∗. This is the key assumption for the ESS to remain an equilibrium:
it is not worth firm i’s while to overstock in this period purely for next period’s gain if it does
not get an immediate response from firm j. To keep basestock always a best response it has to be
assumed at all xj ≥ yj∗, as in the actual property given above, and further the extra gain needs to
be nonincreasing so that no improvement can be made by moving off a high inventory level to an
even higher level. Note that considerable effort was applied to prove the nonincreasing nature of
the functions in Property ?? under some exogenous conditions, but we found no natural conditions.
Definition 7 Let νiT+1(xi, xj) = 0 for all xi, xj ≥ 0. Then recursively define
νit(x
i, x
j) =
0 0 ≤ xi ≤ yi∗, 0 ≤ xj ≤ yj∗
πi(Ri(xj), xj) − πi∗ + αE[νit+1(X(Ri(xj), xj))
]xi ≤ Ri(xj) ≤ yi∗, xj > yj∗
πi(xi, Rj(xi)) + αE[εit+1(X
i(xi, Rj(xi)))]+ αE
[νit+1(X(xi, Rj(xi)))
]− εit(x
i) − πi∗ xi > yi∗, xj ≤ Rj(xi)
πi(xi, xj) + αE[εit+1(X
i(xi, xj))]+ αE
[νit+1(X(xi, xj))
]− εit(x
i) − πi∗ xi > Ri(xj), xj > Rj(xi).
We will show that V it (x
i, xj) = V i∗ + εit(xi) + νit(x
i, xj).
Lemma 6 The function νit(xi, xj) is constant in xi for xi ≤ min(Ri(xj), yi∗), is constant in xj for
xj ≤ min(Rj(xi), yj∗), and is nonincreasing in both xi and xj.
Proof Line 1 of νit(xi, xj): For xj < yj∗, Ri(yj) ≥ yi∗ so constant in xi on desired range and
trivially nonincreasing.
Line 2: Clearly constant in xi and the first term is decreasing in xj (which is above desired range)
by assumption. For the last term, Xj(Ri(yj), yj) ≤ yj and so Ri(Xj(Ri(yj), yj)) ≥ Ri(yj), hence
νit+1(x,Xj(Ri(yj), yj)) is constant in x for x ≤ min(yi∗, Ri(yj)). But X i(Ri(yj), yj) ≤ Ri(yj) ≤ yi∗,
and Xj(Ri(yj), yj) is nondecreasing in yj (by assumption) so by the properties νit+1(·) this term
must be decreasing in xj .
Line 3: The sum of the first two terms is nonincreasing in xi by assumption. For the third term,
Xj(Ri(yj), yj) ≤ yj and so Ri(Xj(Λi(yj), yj)) ≥ Ri(yj), hence νit+1(x,Xj(Ri(yj), yj) is constant in
x for x ≤ min(yi∗, Ri(yj)). But Xi(Ri(yj), yj) ≤ Ri(yj) ≤ yi∗, and Xj(Ri(yj), yj) is stochastically
nondecreasing in yj (by Lemma ??) so by the properties νit+1(·) this term must be decreasing in
xj .
18
Line 4: The sum of the first and second term is nonincreasing in both arguments by assumption
(and xi and xj are above the range they need to be constant). The third and fourth term follow
by assumption on νit(·), X(·), and εit(·).
We are now ready for our last theoretical result.
Theorem 3 Under Property ??, the ESS basestock policy with response functions Ri(·) is a Markov
Equilibrium with V it (x
i, xj) = V i∗ + εit(xi) + νit(x
i, xj).
Proof We proceed by induction. Suppose that the ESS policy is Markov from period t + 1
onwards and V it+1(x
i, xj) = V i∗+εit+1(xi)+νit+1(x
i, xj). Further, suppose in period t firm j follows
the ESS basestock response policy. It suffices to show that firm i’s optimal response is the ESS
basestock response and that V it (x
i, xj) = V i∗ + εit(xi) + νit(x
i, xj).
Case 1: 0 ≤ xi ≤ yi∗, 0 ≤ xj ≤ yj∗.
V it (x
i, xj) = maxx≥xi
{πi(x, yj∗) + α
(πi∗/(1− α) + E
[εit+1(X
i(x, yj∗))]+ E
[νit+1(X(x, yj∗))
])}.
For x < yi∗, πi(x, yj∗) is increasing in x. Further, Xi(x, yj∗) ≤ x < yi∗ and Xj(x, yj∗) ≤ yj∗ so
the third and fourth term are zero. Hence the equilibrium value is at least yi∗. Now, by assump-
tion, πi(x, yj∗) + αE[εit+1(X
i(x, yj∗))]is maximized at x = yi∗, further X(x, yj∗) is stochastically
nondecreasing in x and ν(·) is a nonincreasing function, so yi∗ must be a best response. Further,
V it (x
i, xj) = πi(yi∗, yj∗) + απi∗/(1− α) = V i∗.
Case 2: xi ≤ Ri(xj) ≤ yi∗, xj > yj∗.
V it (x
i, xj) = maxx≥xi
{πi(x, xj) + α
(πi∗/(1− α) + E
[εit+1(X
i(x, xj))]+ E
[νit+1(X(x, xj))
])}.
For x < Ri(xj), πi(x, xj) is increasing in x. Further, X i(x, xj) ≤ x < Ri(xj) ≤ yi∗ so the third term
is zero. For the fourth term X(x, xj) is nondecreasing in x and ν(·) is a nonincreasing function,
hence the equilibrium value is at least Ri(xj). Now, by assumption, πi(x, xj)+αE[εit+1(X
i(x, xj))]
is decreasing for x ≥ Ri(xj). As before, the last term is nonincreasing in x, so Ri(xj) must be a
best response. Further,
V it (x
i, xj) = πi(Ri(xj), xj) + α(πi∗/(1− α) + E
[εit+1(X
i(Ri(xj), xj))]+ E
[νit+1(X(Ri(xj), xj))
])= V i∗ − πi∗ + πi(Ri(xj), xj) + αE
[νit+1(X(Ri(xj), xj))
]= V i∗ + νti (x
i, xj).
19
Case 3: xi > yi∗, xj ≤ Rj(xi).
V it (x
i, xj) = maxx≥xi
{πi(x,Rj(xi)) + α
(πi∗/(1− α) + E
[εit+1(X
i(x,Rj(xi)))]+ E
[νit+1(X(x,Rj(xi)))
])}.
By assumption, πi(x,Rj(xi))+αE[εit+1(X
i(x,Rj(xi)))]is decreasing in x for x > xi > yi∗. Further,
Ri(Rj(xi)) < xi by Lemma ??(b), so it is decreasing on x > Ri(Rj(xi)). But X(x,Rj(xi)) is
stochastically nondecreasing in x and ν(·) is a nonincreasing function, so xi must be a best response.
Further,
V it (x
i, xj) = πi(xi, Rj(xi)) + α(πi∗/(1− α) + E
[εit+1(X
i(xi, Rj(xi)))]+ E
[νit+1(X(xi, Rj(xi)))
])= V i∗ − πi∗ + πi(xi, Rj(xi)) + αE
[εit+1(X
i(xi, Rj(xi)))]+ αE
[νit+1(X(xi, Rj(xi)))
]= V i∗ + εit(x
i) + νti (xi, xj).
Case 4: xi > Ri(xj), xj > Rj(xi).
V it (x
i, xj) = maxx≥xi
{πi(x, xj) + α
(πi∗/(1− α) + E
[εit+1(X
i(x, xj))]+ E
[νit+1(X(x, xj))
])}.
By assumption, πi(x, xj) + αE[εit+1(X
i(x, xj))]is decreasing in x for x > Ri(xj) and X(x, xj) is
stochastically nondecreasing in x and ν(·) is a nonincreasing function, so xi must be a best response.
Further,
V it (x
i, xj) = πi(xi, xj) + α(πi∗/(1− α) + E
[εit+1(X
i(xi, xj))]+ E
[νit+1(X(xi, xj))
])= V i∗ + εit(x
i) + νti (xi, xj).
Theorem ?? shows that a basestock policy is Markov. However, the condition required to prove
it (Property ??) is quite strong. This is because not only must we show that there is no incentive
to deviate from the ESS if inventories lie below the ESS (as was shown under Property ??) but
also that should inventory lie above the ESS then no further order should be placed.
The proofs of Theorems ?? and ?? are inductive in nature, which is natural when trying to
prove subgame perfection. While a finite horizon is assumed, a similar technique could also be used
in the infinite horizon, either directly, or by letting T → ∞ (e.g., Fudenberg and Levine, 1983).
The key to the inductive arguments is the definition of the eit(·) and εit(·) functions, which are
single variable functions in a firm’s own stocking level. Bounding the payoff functions in terms of
20
these functions allows subgame perfection to be proven. It seems likely that a similar technique for
proving that an ESS is also a ME could be used in other settings to those considered here, which
is naturally left as a subject for future research. In the following section we numerically investigate
what happens when the conditions of this section do not hold.
5 Numerical Results
This section describes the results of exercising the model numerically. While the analysis in the
previous sections demonstrates that the Markov Equilibrium structure coincides with the Equilib-
rium in Stationary Strategies structure under some conditions, we have few analytical results when
these conditions do not hold. Specifically, our motivation is to explore the possibility that a firm
may choose to stock higher than the ESS levels in order to gain in the following period, hopeful
they will enter that following period at a high stocking level forcing their rival to stock at their best
response (below the ESS level). The following examples did not satisfy Property 2 in Section 4.
Nor are these the only examples we examined but we consider them representative of the properties
we wish to highlight.
The first example we wish to illustrate may be found in Table ??. Seeking a situation where a
firm will stock at a higher level to enter the following period at a similarly high level, we introduce a
demand distribution with a point mass of probability at zero demand with the remaining probability
distributed across strictly positive demands. The purpose of this zero demand is to allow the high-
stocking firm (with some probability) to continue to stock high in the following period. The numbers
in Table ?? reflect the equilibrium stocking levels for the following data: r1 = 21, r2 = 18, h1 =
1, h2 = 2, γ12 = γ21 = 1, c1 = 14, c2 = 10,Pr(D1 = 0) = Pr(D2 = 0) = 0.6,Pr(D1 = i) = Pr(D2 =
i) = 0.01, i = 1, ..., 40, α = 0.99. Notice that demand is assumed to be discrete and all unsatisfied
demand is assumed to transfer for numerical tractability (our state space is therefore the whole
numbers, suitably bounded). While these two assumptions are not identical to those in the previous
two sections we feel that the insights apply equally well.
We observe four items. Firstly, this is an illustration that the ESS, (29,20), is not the equilib-
rium in every period. This consolidates the message that the Equilibrium in Stationary Strategies
solution is not always a Markov Equilibrium. Secondly, the recommended stocking levels are not
21
Period Equilibrium
T + 1 (29,20)
T (28,21)
T − 1 (28,20)
T − 2 (29,20)
T − 3 (28,21)
T − 4 (28,20)
Table 1: Example for unique equilibrium
consistent, despite this being a problem with stationary parameters. Thirdly, the solution to this
Markov Game is a unique equilibrium in each period (that is not the ESS), although, as we will
observe in Table ??, this is not always the case. And finally, there is a cycling behavior observed,
where the equilibria are repeated in subsequent periods, in this case with a periodicity of three
periods. This regular cycle continues as the horizon length grows. Notice these are equilibrium
solutions of the up-to variables of the Markov model, which automatically implies this is a up-to
equilibrium policy, albeit not a stationary or monotone one. It is also interesting to note that the
equilibria in the Table ?? always have one player on the best response curves generated by the
ESS; for example, firm 1 stocking 28 is a best response to firm 2 stocking 21 but 21 is not a best
response for firm 2 to firm 1 stocking 28 but 20 is.
There are numerous other patterns we observe across a variety of numerical scenarios. Table
?? contains the equilibrium stocking levels for a symmetrical example with parameters: r1 = r2 =
20, h1 = h2 = 5, γ12 = γ21 = 1, c1 = c2 = 10,Pr(D1 = 0) = Pr(D2 = 0) = 0.6,Pr(D1 = i) =
Pr(D2 = i) = 0.01, i = 1, ..., 40, α = 0.99. The ESS is shown in period T + 1: (15,15). The
columns in the table illustrate the equilibria in each period when firm 1 chooses the equilibrium
(in each period) with the highest and lowest stocking level. For example, moving back from the
end of the horizon, in Period T when “Choosing highest” the (17,14) equilibrium is chosen and
when “Choosing lowest” the (14,17) equilibrium is chosen. Notice there are many equilibria which
arise again and again. This is a pattern whereby the same equilibria will appear, although not in
a stationary setting. Another observation is that the equilibria in every period when “Choosing
highest” are the mirror image of those when “Choosing lowest,” although such symmetry does not
22
exist amongst the multiple equilibria within one of these columns. Although not shown in the table,
choosing the ESS (15, 15) is also a ME. Thus, this is an example where the ESS is a ME but it is
not unique.
As noted above, Table ?? illustrates the existence of multiple equilibria. The conditions in
Section ?? guarantee a unique ESS but even in Section ?? we have not shown uniqueness; there we
have shown that the ESS is a ME but we have not ruled out the existence of other ME. Numerically,
we have seen up to five equilibria in a given period, although there seems to be no theoretical bound.
The immediate significance of finding multiple equilibria is that it ostensibly destroys the potential
for finding an equilibrium policy. Unless there is a compelling reason to choose one equilibrium
above the others, the model cannot deliver a valid equilibrium policy over multiple periods since
the ME concept consists of a predictable sequence of actions from a policy, delivering a unique
cost and a predictable state in the following period. The literature gives some guidance as to
what may constitute a compelling reason for choosing one equilibrium over another, including
Pareto improvement where all players encounter a superior outcome for that choice (see Parker and
Kapuscinski, 2010, for an example). Another reason could be a focal point equilibrium where the
custom or norms of the problem context suggest one equilibrium has a greater likelihood of being
chosen, such as jointly choosing the time of a meeting where the equilibrium (12:00pm,12:00pm)
could be preferred over (11:54am,12:08pm). As shown in Table ??, in period T there are three
equilibria: (14,17), (17,14), and (15,15) and each firm prefers a different equilibrium. Without
further information or problem context, it is impossible to validly select one of these equilibria as
a focal point. However, despite the fact that (15,15) is not the preferred equilibrium of either firm,
the ESS levels (15,15) might arguably be an attractive candidate as a mutual choice, particularly
when it is consistent as it is here.
In this example, we again observe a cycling behavior. This is where an equilibrium will reappear
on a regular basis. For example, in Table ?? we see that (14,17) appears under “Choosing highest”
and (17,14) appears under “Choosing lowest” in period T − 1, T − 3, T − 5, T − 7, etc.
As noted, there are equilibria where a firm will stock higher than its ESS level even for symmet-
rical problem sets. This seems consistent with the above-conjectured behavior on the overstocking
firm attempting to credibly commit to an overstock in the following period. For example, we find
23
Period Choosing highest Choosing lowest
T + 1 (15,15) (15,15)
T (14,17) (15,15) (17,14) (14,17) (15,15) (17,14)
T − 1 (13,19) (14,17) (15,16) (16,15) (17,14) (19,13)
T − 2 (13,19) (14,17) (16,15) (20,13) (21,12) (12,21) (13,20) (15,16) (17,14) (19,13)
T − 3 (13,19) (14,17) (17,14) (19,13)
T − 4 (13,19) (18,14) (14,18) (19,13)
T − 5 (13,19) (14,17) (17,14) (19,13)
T − 6 (13,19) (19,13) (13,19) (19,13)
T − 7 (13,19) (14,17) (17,14) (19,13)
T − 8 (13,19) (17,14) (20,13) (21,12) (12,21) (13,20) (14,17) (19,13)
T − 9 (13,19) (14,17) (17,14) (19,13)
T − 10 (13,19) (18,14) (14,18) (19,13)
T − 11 (13,19) (14,17) (17,14) (19,13)
T − 12 (13,19) (16,15) (19,13) (13,19) (15,16) (19,13)
T − 13 (13,19) (14,17) (17,14) (19,13)
T − 14 (13,19) (17,14) (21,12) (12,21) (14,17) (19,13)
T − 15 (13,19) (14,17) (17,14) (19,13)
T − 16 (13,19) (18,14) (14,18) (19,13)
T − 17 (13,19) (14,17) (17,14) (19,13)
T − 18 (13,19) (16,15) (19,13) (13,19) (15,16) (19,13)
Table 2: Example for multiple equilibria
24
(14,17), (17,14), and (15,15) in a symmetrical example where the ESS is (15,15).3 Another obser-
vation is that whenever the target levels do not correspond to the ESS levels, one firm will stock
above its ESS level while the other will stock below its ESS level. This property is observed both in
symmetrical and asymmetrical examples. We found no examples where both firms overstock (which
we had initially hypothesized might be observed) nor any examples where both firms understock
(which is not surprising).
An alternative perspective of this “overstock-understock” behavior is that the firm which is
understocking relative to its ESS is attempting to sabotage the firm which is overstocking. The
understocking firm knows they will be at a disadvantage in the following period if the overstocking
firm continues to have an inventory superiority in the following period, so they understock in the
current period knowing that some of their current-period unsatisfied customers will transfer to the
rival firm and draw down the rival’s inventory. Thus, they can indirectly attempt to reduce the high
stocking level of the other firm by deliberately channeling unmet demand towards them. Consider
an equilibrium in Table ?? of (21,12) in Period T − 2 (when equilibria of (15, 16) and (17,14) are
chosen in Periods T − 1 and T , respectively). With R1(12) = 17 which is lower than 21, we can see
that firm 1 is stocking higher than the ESS best response level. Knowing that R2(17) = 14, clearly
firm 1 is overstocking relative to firm 2’s best response, too. The understocking interpretation may
be seen as firm 2 stocking at 12 rather than its ESS best response of 14 (to firm 1 stocking 17) or
13 (R2(21) = 13). The effect of this understocking is that more of its customers will stockout and
transfer to the overstocking firm, thus reducing the chance that the overstocking firm will enter the
following period at a high stocking level.
One final property observed in Table ?? is what we label as the see-saw effect. This is where
the highest stocking level for firm 1 under “Choosing highest” appears above and below the ESS
levels in alternating periods. Similarly, firm 1’s lowest stocking level see-saws above and below firm
1’s ESS level under “Choosing lowest.” Firm 2’s stocking level similarly follows such an alternating
see-saw effect but will be above its ESS level when firm 1’s stocking level is below its respective
ESS level.
3Clearly, in a symmetrical example which has unequal target stocking levels for one equilibrium, there will be a
corresponding equilibrium with the target levels reversed.
25
6 Conclusions
In this paper we formulate and analyze an inventory duopoly where unsatisfied customers at one
firm may either transfer to the other firm or leave unsatisfied. This model has been previously
studied in the literature but not from the perspective of a Markov Equilibrium. We show that
significantly different dynamics can occur when subgame perfection is required across periods.
We feel that the extant literature on inventory games has been confused by the lack of a good
term for the typical (open-loop or infinite-horizon) type of equilibrium studied, and we suggest
the term an Equilibrium in Stationary Strategies for the standard equilibrium. For our specific
example application we give conditions for when the ESS is also a ME (i.e., is also a closed-loop
equilibrium). The proof technique used to show subgame perfection relies on an inductive argument
showing that any deviation from the ESS in the current period does not provide sufficient reward in
the following period to warrant consideration. While relatively natural, we are not aware of other
papers using this technique and believe it may be a general technique to consider for proving that
an ESS is also a ME.
We show that behavior under a ME may be far from trivial, and future research should study
ways to allow policy insights from the equilibrium. Until that is achieved, we feel that the ESS
concept has significant value and should continue to be studied from an operations perspective.
After all, there are many environments where the macro decisions on operations (e.g., the up-to
levels for an ERP system) are made on a longer timescale then the day to day fluctuations of, say,
inventory. Also, the macro-level insights obtained from such games (e.g., the losses to a firm due
to competition) will likely also frequently carry over to the closed-loop ME setting.
In summary, our primary contribution is a deeper understanding of both the dynamic stockout-
based substitution game and also the general relationship between ESS and ME. Future research
could focus on ESS results for more general stockout-based substitution models to that considered
here and also further methodological results on the analysis of ME.
References
[1] Albright, S.C. and Winston, W., “Markov Models of Advertising and Pricing Decisions,”
Operations Research, Vol. 27, No. 4 (1979) 668-681.
26
[2] Anderson, E., Fitzsimons, G.J., and Simester, D., “Measuring and Mitigating the Cost of
Stockouts in Mail Order Catalogs,” University of Chicago working paper (2005).
[3] Avsar, Z.M. and Baykal-Gursoy, M., “Inventory Control under Substitute Demand: A Stochas-
tic Game Application,” Naval Research Logistics, Vol. 49, (2002) 359-375.
[4] Bassok, Y., Anupindi, R., and Akella, R., “Single-Period Multiproduct Inventory Models with
Substitution,” Operations Research, Vol. 47, No. 4 (1999) 632-642.
[5] Bernstein, F. and Federgruen, A., “Dynamic Inventory and Pricing Models for Competing
Retailers,” Naval Research Logistics, Vol. 51 (2004) 258-274.
[6] Caro, F. and Martınez-de-Albeniz, V., “The Impact of Quick Response in Inventory-Based
Competition,” Manufacturing & Service Operations Management, Vol. 12, No. 3 (2010) 409-
429.
[7] Dixit, A., “The Role of Investment in Entry-Deterrence,” The Economic Journal, Vol. 90, No.
357 (1980) 95-106.
[8] Fudenberg, D. and Levine, D., “Subgame-Perfect Equilibria of Finite- and Infinite-Horizon
Games,” Journal of Economic Theory, Vol. 31, No. 2, (1983) 251-268.
[9] Fudenberg, D. and Tirole, J., Game Theory, MIT Press, Cambridge, MA (1991).
[10] Hall, J. and Porteus, E., “Customer Service Competition in Capacitated Systems,” Manufac-
turing & Service Operations Management, Vol. 2, No. 2 (2000) 144-165.
[11] Heyman, D. and Sobel, M., Stochastic Models in Operations Research, Vol. II, McGraw Hill,
New York (1984).
[12] Honhon, D., Gaur, V., and Seshadri, S., “Assortment Planning and Inventory Decisions Under
Stockout-Based Substitution,” Operations Research, Vol. 58, No. 5 (2010) 1364-1379.
[13] Kirman, A.P. and Sobel, M.J., “Dynamic Oligopoly with Inventories,” Econometrica, Vol. 42,
No. 2, (1974) 279-287.
[14] Li, Q., and Ha, A., “Reactive Capacity and Inventory Competition under Demand Substitu-
tion,” IIE Transactions, Vol. 80 (2008) 707-717.
27
[15] Lippman, S.A. and McCardle, K.F., “The Competitive Newsboy,” Operations Research, Vol.
45, No. 1, (1997) 54-65.
[16] Lu, L.X. and Lariviere, M.A., “Capacity Allocation over a Long Horizon: The Return on
Turn-and-Earn,” working paper, University of North Carolina (2009).
[17] Mahajan, S. and van Ryzin, G., “Inventory Competition under Dynamic Consumer Choice,”
Operations Research, Vol. 49, No. 5 (2001) 646-657.
[18] Nagajaran, M. and Rajagopalan, S., “A Multiperiod Model of Inventory Competition,” Oper-
ations Research, Vol. 57, No. 3 (2009) 785-790.
[19] Nash, J.F., “Equilibrium Points in n-Person Games,” Proceedings of the National Academy of
Sciences of the United States of America, Vol. 36, No. 1 (1950) 48-49.
[20] Nash, J.F., “Non-Cooperative Games,” Annals of Mathematics, Vol. 54, No. 2 (1951) 286-295.
[21] Netessine, S. and Rudi, N., “Centralized and Competitive Inventory Models with Demand
Substitution,” Operations Research, Vol. 51, No. 2, (2003) 329-335.
[22] Netessine, S., Rudi, N., and Wang, Y., “Inventory Competition and Incentives to Back-order,”
IIE Transactions, Vol. 38 (2006) 883-902.
[23] Olsen, T.L. and Parker, R.P., “Inventory Management Under Market Size Dynamics,” Man-
agement Science, Vol. 54, No. 10 (2008) 1805-1821.
[24] Pakes, A. and McGuire, P., “Computing Markov-Perfect Nash Equilibria: Numerical Implica-
tions of a Dynamic Differentiated Product Model,” RAND Journal of Economics, Vol. 25, No.
4 (1994) 555-589.
[25] Parker, R.P. and Kapuscinski, R., “Managing a Non-Cooperative Supply Chain with Limited
Capacity,” forthcoming in Operations Research (2010).
[26] Parlar, M., “Game Theoretic Analysis of the Substitutable Product Inventory Problem with
Random Demand,” Naval Research Logistics, Vol. 35, (1988) 397-409.
[27] Saloner, G., “The Role of Obsolescence and Inventory Costs in Providing Commitment,”
International Journal of Industrial Organization, Vol. 4 (1986) 333-345.
28
[28] Schary, P.B. and Christopher, M., “The Anatomy of a Stockout,” Journal of Retailing, Vol.
55, No. 2, (1979) 59-70.
[29] Sydsæter, K., Strøm, A., and Berck, P., Economists’ Mathematical Manual, Springer, Heidel-
berg, Germany, 2000.
[30] Spence, A.M., “Entry, Capacity, Investment and Oligopolistic Pricing,” The Bell Journal of
Economics, Vol. 8, No. 2 (1977) 534-544.
[31] Tirole, J., The Theory of Industrial Organization, MIT Press, Cambridge, MA (1988).
[32] Vives, X., Oligopoly Pricing: Old Ideas and New Tools, The MIT Press, Cambridge, MA
(1999).
[33] Wang, Y. and Gerchak, Y., “Supply Chain Coordination when Demand is Shelf-Space Depen-
dent,” Manufacturing & Service Operations Management, Vol. 3, No. 1, (2001) 82-87.
[34] Zeinalzadeh, A., Alptekinoglu, A., and Arslan, G., “Inventory Competition under Fixed Order
Costs,” working paper, Southern Methodist University (2010).
29