The Redesign of the Matching Market for American Physicians: Some Engineering Aspects of Economic Design

By ALVIN E. ROTH AND ELLIOTT PERANSON*

We report on the design of the new clearinghouse adopted by the National Resident Matching Program, which annually fills approximately 20,000 jobs for new physicians. Because the market has complementarities between applicants and between positions, the theory of simple matching markets does not apply directly. However, computational experiments show the theory provides good approximations. Furthermore, the set of stable matchings, and the opportunities for strategic manipulation, are surprisingly small. A new kind of “core convergence” result explains this; that each applicant interviews only a small fraction of available positions is important. We also describe engineering aspects of the design process. (JEL C78, B41, J44)

The entry-level labor market for new physicians in the United States is organized via a centralized clearinghouse called the National Resident Matching Program (NRMP). Each year, approximately 20,000 jobs are filled in a process in which graduating physicians and other applicants interview at residency programs throughout the country and then compose and submit Rank Order Lists (ROLs) to the NRMP, each indicating an applicant’s preference ordering among the positions for which she has interviewed. Similarly, the residency programs submit ROLs of the applicants they have interviewed, along with the number of positions they wish to fill. The NRMP processes these ROLs and capacities to produce a matching of applicants to residency programs.

The clearinghouse used in this market dates from the early 1950’s. It replaced a decentralized process that suffered a market failure when residency programs and applicants started to seek each other out individually through informal channels, earlier and earlier in advance of employment, rather than waiting to participate in the larger market. (By the 1940’s, contracts were typically being signed two years in advance of employment.) Although the matching algorithm has been adapted over time to meet changes in the structure of medical employment, roughly the same form of clearinghouse market mechanism has been used since 1951 (see Roth, 1984). The kind of market failure that gave rise to this clearinghouse has since been seen in many markets (Roth and Xiaolin Xing, 1994), a number of which have also organized clearinghouses in response.

In the mid 1990’s, in an environment of rapidly changing health-care financing with many implications for the medical labor market, the market began to suffer a crisis of confidence concerning whether the matching algorithm was unreasonably favorable to employers at the expense of applicants, and whether applicants could “game the system” by strategically manipulating the ROLs they submitted. The controversy was most clearly expressed in an exchange in Academic Medicine (Peranson and Richard R. Randlett, 1995a, b; Kevin J. Williams, 1995a, b). In reaction to this exchange, groups such as the American Medical Student Association together with Ralph Nader’s Public Citizen Health Research Group (1995), and the Medical Student Section of the American Medical Association (AMA-MSS, 1995) advocated that the matching algorithm be changed and/or that the description of the match be changed to give applicants more accurate advice about how to participate.1

* Roth: Department of Economics, and Graduate School of Business Administration, Harvard University, Cambridge, MA 02138 (e-mail: [email protected]); Peranson: National Matching Services, Inc., 595 Bay Street, Suite 301, Box 29, Toronto, ON M5G 2C2, Canada. We thank Aljosa Feldin for able assistance with the theoretical computations reported in Section VI. Parts of this work were sponsored by the National Resident Matching Program, and parts by the National Science Foundation.

749 VOL. 89 NO. 4 ROTH AND PERANSON: MATCHING MARKET FOR PHYSICIANS

Medical-school personnel responsible for advising students about the job market began to report that many students believed the NRMP did not function in the best interest of students, and that students were discussing the possibility of different kinds of strategic behavior. Given the prior history of market failure due to lack of confidence in the market in this and other entry-level professional labor markets, these reports deserved and received the most serious attention.

In this atmosphere, in the fall of 1995 the Board of Directors of the NRMP commissioned the design of a new algorithm for conducting the annual match, and a study comparing it to the existing NRMP algorithm. The present paper reports how the new algorithm was designed, how the two algorithms were compared, and what was learned about the market in the process. (In May 1997, the NRMP Board of Directors decided to switch to the new algorithm, and the first match using the new algorithm was successfully completed in March 1998.)

In the course of designing, testing, and evaluating the new clearinghouse algorithm, some surprising properties of large labor markets emerged. The high transaction costs involved in interviewing place a practical limit on how many interviews are conducted, and one consequence of this is that the set of stable outcomes is very small, and there are very few opportunities for participants to engage in strategic manipulation of their stated preferences when it comes to making and accepting offers. (Neither of these would be the case in the absence of transaction costs.)

Aside from describing these new facts, and presenting some theoretical computation to explain them, we also describe in this paper the process by which the new clearinghouse algorithm was designed, evaluated, and compared with the existing algorithm. At each stage, the process involved computational experiments. This process resembles engineering practice rather than theorem-proving or hypothesis-testing. But despite the fact that economists are increasingly called upon to design markets, there is little or no economic literature devoted to the engineering aspects of economic design and the practical problems of moving from theory about simple markets to workable institutions for complex markets. Yet if we fail to develop such an “engineering” literature, we will fail to profit from design experience in a cumulative way. The present paper then, in addition to presenting some new results, is intended to take a step in the direction of an engineering literature as well, by describing how those facts were learned, and how they impacted design decisions.2

1 At around the same time, the Antitrust Division of the Department of Justice initiated a wide-ranging discovery process concerning these markets. This ultimately gave rise to a fairly narrowly focused consent decree involving the practices of the Association of Family Practice Residency Directors (U.S. District Court for the Western District of Missouri, 1996).

2 Some beginnings of such a literature can also be found in connection with the design of electricity markets (Robert B. Wilson, 1993) and the auction of the radio spectrum (see e.g., John McMillan, 1994, 1995; R. Preston McAfee and McMillan, 1996; Lawrence M. Ausubel et al., 1997; Peter Cramton, 1997; John O. Ledyard et al., 1997; Paul Milgrom, 1997; Charles R. Plott, 1997; David J. Salant, 1997). There is, of course, already something of an engineering-oriented literature in finance; for an innovative example, see Robert J. Shiller (1993).

A rough analogy may be helpful for thinking about how the different parts of this paper hang together. Consider the design of suspension bridges. The Newtonian physics they embody is beautiful both in mathematics and in steel, and college students can be taught to derive the curves that describe the shape of the supporting cables. But no bridge could be built based only on this elegant theoretical treatment, in which the only force is gravity, and all beams are perfectly rigid. Real bridges are built of steel and rest on rock and soil and water, and so bridge design also concerns metal fatigue, soil mechanics, and the forces of waves and wind. Many design questions concerning these real-world complications cannot be answered analytically but, instead, must be explored using physical or computational models. Often these involve estimating magnitudes of phenomena missing from the simple Newtonian models, some of which are small enough to be of little consequence, while others will cause the bridge to fall down if not adequately addressed. Just as no suspension bridges could be built without an understanding of the underlying physics, neither could any be built without understanding many additional features, also physical in nature, but more varied and complex than addressed by the simple model. These additional features, and how they are related to and interact with that part of the physics captured by the simple model, are the concern of the scientific literature of engineering. Some of this is less elegant than the Newtonian model, but it is what makes bridges stand. Just as important, it allows bridges designed on the same basic Newtonian model to be built longer, stronger, and lighter over time, as the complexities and how to deal with them become better understood.

750 THE AMERICAN ECONOMIC REVIEW SEPTEMBER 1999

For the design of the medical labor-market clearinghouse, the underlying theory is the theory of two-sided matching. Simple models of two-sided matching markets have proved to be elegant and tractable, and very useful in understanding the organization and evolution of many markets. But the theory concentrates on simple models in which no worker needs more than one job, and there are no married couples or other connections between workers or between positions. There is a large body of theory relevant to design problems (see e.g., Roth and Marilda Sotomayor, 1990), but none of the theorems applies directly to the medical market, although many of the counterexamples do. That is, many of the existing theorems rest on assumptions not met in the complex medical market, and many of the medical market’s complexities are known to open the door to the possibility of serious design problems. But the counterexamples do not give any guidance to the magnitude of these problems, and for this we will have to rely on computational exploration, both of the data from the medical market itself, and of simpler models which will help explain what is going on in the complex market. In both cases, the computational explorations will be guided by the theory, which will make possible computational experiments that would be impossible to conduct by brute force on such large markets. It seems likely that, as game theory moves from simple conceptual problems to complex design problems, we will need to make more general use of this interaction among theory, computational investigation of market data, and theoretical computation, and that this in turn will produce new problems and directions for traditional theory.

This paper is organized as follows. Section I gives an overview of the medical market and the design problem, and it presents some necessary background by discussing stable matchings and why they are important, how complex markets differ from simple markets with respect to stable matchings, and how the algorithm used by the NRMP prior to this study is structured. Section II presents statistics describing the market and previous match results. These demonstrate that three of the four match variations that make the NRMP a complex market are present in substantial numbers. Section III describes how the new algorithm was designed, including the role of computational experiments. Section IV compares the performance of the two algorithms on the data from recent matches, and Section V looks at the possibilities for strategic behavior when each of the two algorithms is employed. In studying the possibilities for strategic behavior, we will first treat the ROL data as if they were the true preferences of the agents, and then (in Section VI) show why this is justified; we will also explain why the set of stable matchings turns out to be so small. Section VII presents some thoughts on the interplay among theory, computational experiments, and theoretical computation in the design of market mechanisms. The theory of simple markets framed the questions that needed to be answered in the course of this design and suggested how to construct and evaluate computational experiments on the complex system to answer these questions. The magnitudes determined by the computational experiments were then explained with theoretical computations on simple markets, providing results which, with the aid of theory, could be unambiguously interpreted. This interplay was what gave the present design effort its “engineering” flavor, and we suspect that this will generalize to other design efforts. Section VIII contains concluding remarks.

I. Background to the Present Study

The considerable body of theory that has been developed for two-sided matching markets, together with multiple opportunities to observe empirically both the successful and unsuccessful clearinghouse organization of other entry-level labor markets, provided a general road map for both the design and evaluation of a new clearinghouse algorithm. Specifically, there was strong empirical evidence that successful clearinghouses are generally those that produce matchings that are stable in the sense that they do not create “blocking pairs” of agents, not matched to one another, who would mutually prefer to be matched to one another than to accept the matching produced by the clearinghouse. The theory clearly shows that, in sufficiently simple markets (simple in a way that will shortly be made precise), systematic welfare comparisons can be made between different stable matchings, with some being relatively favorable to firms and unfavorable to workers, and some the reverse. In addition, for sufficiently simple markets the theory allows strong conclusions to be drawn about the opportunity and scope for strategic behavior. (For an overview of the theory, relevant parts of which will be reviewed below, see Roth and Sotomayor [1990].)

3 Typically these reversions arise when, for example, the director of a second-year postgraduate residency program arranges with the director of a prerequisite first-year program that his residents will spend their first year in that prerequisite program. However if the second-year program then fails to match with as many residents as were anticipated, this leaves vacancies in the first-year program that can be filled by other applicants.

The goal of the design was to construct an algorithm that would produce stable matchings as favorable as possible to applicants, while meeting the specific constraints of the medical market. The comparisons between the new and existing algorithms were to focus both on how many applicants and residency programs could be expected to receive more-preferred or less-preferred matches under the two algorithms and on how the different algorithms might influence the opportunity or need for strategic behavior by applicants and programs. Closely related issues were what advice could be given to participants in the match when it is conducted with one or the other of the algorithms, and what kinds of changes in the behavior of match participants might be anticipated if the matching algorithm were changed.

These questions were at the heart of the controversy that spilled into the medical journals in 1995. Much of that discussion referred to results in the theoretical literature concerning simple two-sided matching markets. But, although the NRMP originated as a simple market, it has become more complex particularly since the early 1980’s, as it has developed complementarities and linkages between positions and between applicants. These arise through four kinds of “match variations,” introduced to accommodate the changing structure of the medical labor market, namely:

(i) couples in the applicant pool who seek two positions close to one another;

(ii) applicants who seek second-year positions in the match and, if they are successful, have supplemental Rank Order Lists which must be consulted to match them to prerequisite first-year positions;

(iii) residency programs with positions that revert to other programs if they remain unfilled;3 and

(iv) programs that wish to fill an even number of positions if they cannot fill all their positions.

These linkages can be shown to allow situations in which many of the conclusions reached about simpler markets no longer apply.

It was therefore necessary, both in designing the new algorithm and in making comparisons to the existing algorithm, first to conduct computational experiments to determine the extent to which the predictions of the theory of simple matching markets applied to the NRMP. These computational experiments, as well as those employed to compare the two algorithms, were conducted on the ROLs submitted by all applicants and residency programs in the four most recent matches (1993, 1994, 1995, and 1996) and in the 1987 match. The recent matches were selected to have contemporary patterns of preferences among applicants and residency programs, and 1987 was selected for a comparison over a longer period, and specifically because it had the lowest rate of unmatched U.S. seniors in the available data set (6.0 percent, as opposed to the historically high rate of 7.5 percent for 1996).

A number of specialty matches are also run under the auspices of the NRMP, and these are largely free of the match variations which add complexity to the general resident match. The existing theory of simple matching markets therefore provides accurate predictions about the nature and direction of changes to be anticipated in these matches if the existing NRMP algorithm were replaced by the new algorithm. However the theory offers little guidance as to the magnitude of the changes to be expected, and for this purpose, computational experiments on the data of past matches were also needed. These were conducted for the Thoracic Surgery match, for the five years 1991–1994 and 1996.

The design of the new algorithm and the comparisons of the two algorithms will be discussed in detail in the body of the paper. The general conclusions can be summarized by noting that, both for the NRMP and the specialty matches, the effects of changing from the existing algorithm to the newly designed algorithm are in the directions predicted by the theory for simple markets, but the size of these changes is small, and the opportunities for profitable strategic behavior are comparably small for both applicants and programs under either algorithm.

In the course of explaining why the differences are so small, we will present a new kind of “core convergence” result, which shows that the size of the set of stable matchings becomes small as the size of the market increases, even when preferences are uncorrelated, provided that the number of positions for which an applicant can interview remains small (and not otherwise).
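The kind of question behind this result can be explored with a small simulation. The sketch below is only an illustration under stated assumptions and is not the computation reported in Section VI: it builds a hypothetical one-to-one market with n workers and n firms, gives each worker a random list of k firms, lets each firm rank at random the workers who listed it, and counts the workers whose match differs between the worker-proposing and firm-proposing deferred-acceptance outcomes. In a simple market those two outcomes are the extremes of the set of stable matchings, so a worker with the same match in both has only one stable partner. The function names (`da`, `random_market`, `count_multiple_partners`) and the parameters `n`, `k`, and `seed` are invented for this example.

```python
import random

def da(proposer_prefs, receiver_prefs):
    """One-to-one deferred acceptance. Returns {proposer: receiver}."""
    rank = {r: {p: i for i, p in enumerate(ps)} for r, ps in receiver_prefs.items()}
    holder = {}                          # receiver -> proposer currently held
    nxt = {p: 0 for p in proposer_prefs}  # index of next receiver to try
    free = list(proposer_prefs)
    while free:
        p = free.pop()
        if nxt[p] >= len(proposer_prefs[p]):
            continue                     # list exhausted: p stays unmatched
        r = proposer_prefs[p][nxt[p]]
        nxt[p] += 1
        if p not in rank.get(r, {}):
            free.append(p)               # r did not list p: rejected
        elif r not in holder:
            holder[r] = p                # r was vacant: hold p
        elif rank[r][p] < rank[r][holder[r]]:
            free.append(holder[r])       # r prefers p: release the old proposer
            holder[r] = p
        else:
            free.append(p)               # r keeps its current proposer
    return {p: r for r, p in holder.items()}

def random_market(n, k, rng):
    """Hypothetical market: each worker lists k random firms (k <= n)."""
    workers = [f"w{i}" for i in range(n)]
    firms = [f"f{j}" for j in range(n)]
    w_prefs = {w: rng.sample(firms, k) for w in workers}
    f_prefs = {f: [] for f in firms}
    for w, listed in w_prefs.items():
        for f in listed:
            f_prefs[f].append(w)         # firms list exactly those who list them
    for f in firms:
        rng.shuffle(f_prefs[f])          # uncorrelated firm preferences
    return w_prefs, f_prefs

def count_multiple_partners(n, k, seed=0):
    """Number of workers matched differently by the two proposing sides."""
    rng = random.Random(seed)
    w_prefs, f_prefs = random_market(n, k, rng)
    w_opt = da(w_prefs, f_prefs)
    f_opt = {w: f for f, w in da(f_prefs, w_prefs).items()}
    return sum(1 for w in w_prefs if w_opt.get(w) != f_opt.get(w))

if __name__ == "__main__":
    for n in (50, 200, 800):
        print(n, count_multiple_partners(n, k=5), count_multiple_partners(n, k=n))
```

Holding `k` fixed while `n` grows corresponds to short interview lists; setting `k = n` corresponds to the "otherwise" case of complete lists. A single random draw per market size is noisy, so any pattern should be checked over many seeds.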

A. Stable Matchings in Simple and Complex Matching Markets

Centralized matching mechanisms often arise to solve market failures due to unraveling of appointment dates. Perhaps the most important and least controversial empirical finding about centralized matching algorithms is that they are most often successful if the matchings they produce are stable (Roth, 1984, 1990, 1991; Roth and Xing, 1994; John Kagel and Roth, 2000). In a simple matching market, a matching between applicants and residency programs is stable if there is no applicant or program matched to an unacceptable (unlisted) partner, and if there are no applicant–program pairs such that the applicant prefers the program to his/her current match, and the program also prefers the applicant to one of its current matches (or vacant position).4
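The definition of stability in a simple market translates directly into a check. The following is a minimal sketch, assuming ROLs are given as Python lists (most preferred first, with anyone unlisted treated as unacceptable) and a matching is a dictionary from applicant to program or `None`. The function names and the toy data are hypothetical and are not taken from the NRMP software.

```python
def blocking_pairs(applicant_prefs, program_prefs, capacity, matching):
    """Return every (applicant, program) pair that blocks `matching`."""
    a_rank = {a: {p: i for i, p in enumerate(rol)} for a, rol in applicant_prefs.items()}
    p_rank = {p: {a: i for i, a in enumerate(rol)} for p, rol in program_prefs.items()}
    assigned = {p: [] for p in program_prefs}
    for a, p in matching.items():
        if p is not None:
            assigned[p].append(a)

    blocks = []
    for a, ranks in a_rank.items():
        # Being unmatched ranks below every listed program.
        current_rank = ranks.get(matching.get(a), len(ranks))
        for p, r in ranks.items():
            if r >= current_rank or a not in p_rank.get(p, {}):
                continue  # a does not prefer p, or p did not list a
            if len(assigned[p]) < capacity[p]:
                blocks.append((a, p))  # p has a vacant position
                continue
            # Rank of the least-preferred applicant p currently holds.
            worst = max((p_rank[p].get(x, len(p_rank[p])) for x in assigned[p]), default=-1)
            if p_rank[p][a] < worst:
                blocks.append((a, p))
    return blocks

def is_stable(applicant_prefs, program_prefs, capacity, matching):
    """Stable: nobody matched to an unlisted partner, and no blocking pair."""
    for a, p in matching.items():
        if p is not None and (p not in applicant_prefs[a] or a not in program_prefs[p]):
            return False
    return not blocking_pairs(applicant_prefs, program_prefs, capacity, matching)

# Hypothetical toy market: two programs with one position each.
applicants = {"a1": ["X", "Y"], "a2": ["X", "Y"], "a3": ["Y", "X"]}
programs = {"X": ["a2", "a1", "a3"], "Y": ["a1", "a3", "a2"]}
capacity = {"X": 1, "Y": 1}

unstable = {"a1": "X", "a2": "Y", "a3": None}
stable = {"a1": "Y", "a2": "X", "a3": None}
print(blocking_pairs(applicants, programs, capacity, unstable))
print(is_stable(applicants, programs, capacity, stable))
```

In the first matching, a2 and program X block (each prefers the other to its current partner), as do a3 and program Y; the second matching has no blocking pair.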


Therefore, this study, and the controversy which preceded it, focused on choices among algorithms that produce stable matchings. The reason for the controversy is that there can be systematic differences among stable matchings. Appendix C gives formal definitions of stability in simple and complex matching markets, but the basic ideas can be conveyed by considering the “deferred acceptance algorithm” first formally studied by David Gale and Lloyd Shapley (1962).5 There are two basic versions of this algorithm, in each of which one side of the market (firms or workers) makes offers, which the other side can reject or hold to see whether any better offers are forthcoming.

In the worker-proposing version of the algorithm, each worker begins by applying for the position at the top of her preference list. Each firm rejects any unacceptable candidates, and if it has q positions it temporarily holds the (up to) q most-preferred applications it has so far received and rejects the rest. A candidate who is rejected at any step of the algorithm next applies to her next-highest-ranked position (if any remain) among those not yet applied to. The algorithm stops at any step in which no new applications are made, at which point each worker is matched to the firm (if any) holding her application.

4 Among the programs and applicants who have interviewed one another, programs do not list applicants with whom they are unwilling to match, and applicants do not list programs with whom they are unwilling to match. (Unmatched programs and applicants can be matched in a postmatch secondary market called the “scramble,” which takes place primarily in the 24-hour period before the official public announcements of the match results.) Also, programs and applicants generally do not list applicants or programs with whom they have not had interviews [and this is of course an equilibrium, since the clearinghouse produces a stable matching, at which an applicant (program) cannot be matched to a program (applicant) without being listed, so there is no incentive to list programs or applicants with whom one has not had an interview]. There is also a charge to applicants who list more than 15 residency programs, which may dissuade some applicants from listing some programs.

5 Although Gale and Shapley (1962) discussed the algorithm in an abstract setting, it appears that, in various forms, equivalent algorithms have been developed in applied contexts both before and since, with the initial NRMP algorithm, dating from 1951, being the first we know of (Roth, 1984).

In a simple market the resulting matching must be stable (i.e., there are no firm–worker blocking pairs) since, if a worker w prefers firm f to her final match, she must have applied to firm f and been rejected, and hence firm f does not prefer her to any of the workers whose applications it held when the algorithm stopped. Furthermore, Gale and Shapley showed that, when preferences are strict, the particular stable matching produced by the worker-proposing version of the algorithm gives each worker her most-preferred position among those she can get at any stable matching. Even more striking, the firm-proposing version of the algorithm gives every firm that fills q positions its q most-preferred workers among those it can be matched to at any stable matching (Roth, 1985; Roth and Sotomayor, 1989). Much of the controversy about the organization of the NRMP focused on this difference between these two versions of the deferred-acceptance algorithm.
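A two-worker, two-firm example makes the difference between the two versions concrete. The sketch below assumes every firm has a single position, so that one function can serve for either proposing side; the preference lists are hypothetical and chosen so that the two sides disagree.

```python
def da(proposer_prefs, receiver_prefs):
    """One-to-one deferred acceptance. Returns {proposer: receiver}."""
    rank = {r: {p: i for i, p in enumerate(ps)} for r, ps in receiver_prefs.items()}
    holder = {}                          # receiver -> proposer currently held
    nxt = {p: 0 for p in proposer_prefs}
    free = list(proposer_prefs)
    while free:
        p = free.pop()
        if nxt[p] >= len(proposer_prefs[p]):
            continue                     # list exhausted: p stays unmatched
        r = proposer_prefs[p][nxt[p]]
        nxt[p] += 1
        if p not in rank.get(r, {}):
            free.append(p)               # r did not list p
        elif r not in holder:
            holder[r] = p
        elif rank[r][p] < rank[r][holder[r]]:
            free.append(holder[r])       # r trades up and releases its old proposer
            holder[r] = p
        else:
            free.append(p)
    return {p: r for r, p in holder.items()}

# Hypothetical preferences: each worker's first choice ranks her last.
W = {"w1": ["f1", "f2"], "w2": ["f2", "f1"]}
F = {"f1": ["w2", "w1"], "f2": ["w1", "w2"]}

worker_optimal = da(W, F)                              # workers propose
firm_optimal = {w: f for f, w in da(F, W).items()}     # firms propose, re-keyed by worker

print(worker_optimal)
print(firm_optimal)
```

Both outcomes are stable, but the worker-proposing run gives each worker her first choice, while the firm-proposing run gives each firm its first choice and each worker her second.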

But a deferred-acceptance algorithm may fail to produce a stable matching in a market with some of the complexities of the NRMP, such as the presence of couples who submit Rank Order Lists of pairs of positions. The key to the stability of the outcome in simple markets is that (in the worker-proposing version of the algorithm) no firm ever regrets having rejected a worker’s application, since it only does so when it has an application it prefers, and it will be matched to this preferred applicant unless it receives applications it prefers even more. However, in a market containing couples, suppose that a firm f1 receives an application from a worker w1, and rejects an application from a less-preferred worker w′ in order to hold w1’s application. Suppose further that w1 is married to w2, whose application is being held by firm f2, because the pair (f1, f2) is high on the preference list submitted by the couple c = (w1, w2). Finally, suppose that firm f2 now receives an application it prefers and rejects the application of w2. In order for the couple c now to apply to its next-choice pair of firms, (f3, f4), w1 must be withdrawn from firm f1. Thus, firm f1 now regrets having rejected worker w′, and there may be a potential instability involving f1 and w′ (and, if w′ is part of a couple, this instability may involve another firm as well; see Appendix C).

The differences between simple and complex markets involve more than the failure of the deferred-acceptance algorithm to produce stable matchings: they extend to the nonemptiness and structure of the set of stable matchings itself. Some of the important differences are summarized below, by noting theorems about simple matching markets that do not hold when the market contains couples or other linkages that create complementarities between positions or applicants (see Roth and Sotomayor [1990] for a comprehensive treatment and more detailed references to the literature):

(i) In simple matching markets, firm- and worker-optimal stable matchings exist for all possible ROLs and are produced by the firm- and worker-proposing variants of the deferred-acceptance algorithm (Gale and Shapley, 1962; Roth, 1985).

(i′) In markets with complementarities, no stable matching may exist, and even when stable matchings exist there may be no optimal stable matchings for either side of the market (Roth, 1984; Brian Aldershof and Olivia Carducci, 1996).

(ii) In simple markets, the same applicants are matched at every stable matching, and the same positions are filled. (That is, any applicant who is unmatched at one stable matching is unmatched at every stable matching, and the positions that are unfilled are the same at every stable matching.) Furthermore, a firm that fills only some of its positions at a stable matching fills them with the same workers at every stable matching (Roth, 1986).

(ii′) In markets with complementarities, different stable matchings may have different applicants matched and different positions filled (Aldershof and Carducci, 1996).

(iii) In simple markets, when the applicant-proposing algorithm is used (but not when the program-proposing algorithm is used), it is a dominant strategy for applicants to submit ROLs corresponding to their true preferences. No parallel assertion can be made about residency programs that have more than one position (Roth, 1982, 1985).

(iii′) In markets with complementarities, no algorithm exists that chooses a stable matching whenever one exists and makes it a dominant strategy for all agents to state their true preferences (Roth, 1985; Aljosa Feldin, 1999).
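Point (i′) can be checked by brute force on a very small market. The sketch below uses a hypothetical market invented for illustration (it is not an example from the paper): a couple (s1, s2) that lists only the pair (h1, h2), a single applicant s3 who prefers h1 to h2, and two hospitals with one position each. The stability test is a simplified one suited to this setting: a couple blocks with a pair of positions only if it prefers that pair and each hospital is willing to take the corresponding member. The code enumerates every matching in which nobody is matched to an unlisted partner and tests each one.

```python
from itertools import product

# Hypothetical preferences, invented for illustration.
COUPLE_PREFS = [("h1", "h2")]                 # couple (s1, s2) ranks pairs of positions
SINGLE_PREFS = ["h1", "h2"]                   # single applicant s3
HOSPITAL_PREFS = {"h1": ["s1", "s3"], "h2": ["s3", "s2"]}   # one position each

def occupant(matching, h):
    """Applicant currently holding hospital h's position, or None."""
    for s, assigned in matching.items():
        if assigned == h:
            return s
    return None

def would_accept(matching, h, s, leaving=()):
    """True if h lists s and is vacant (or about to be) or prefers s to its occupant."""
    prefs = HOSPITAL_PREFS[h]
    if s not in prefs:
        return False
    cur = occupant(matching, h)
    if cur is None or cur == s or cur in leaving:
        return True
    return prefs.index(s) < prefs.index(cur)

def feasible_matchings():
    """All matchings where every match is to a listed partner and capacities hold."""
    for (a, b), c in product([(None, None)] + COUPLE_PREFS, [None] + SINGLE_PREFS):
        if c is not None and c in (a, b):
            continue                          # position already taken by the couple
        m = {"s1": a, "s2": b, "s3": c}
        if all(h is None or s in HOSPITAL_PREFS[h] for s, h in m.items()):
            yield m

def is_stable(m):
    for h in SINGLE_PREFS:                    # hospitals s3 strictly prefers
        if h == m["s3"]:
            break
        if would_accept(m, h, "s3"):
            return False                      # s3 and h block
    for h, g in COUPLE_PREFS:                 # pairs the couple strictly prefers
        if (h, g) == (m["s1"], m["s2"]):
            break
        if would_accept(m, h, "s1", leaving=("s2",)) and would_accept(m, g, "s2", leaving=("s1",)):
            return False                      # the couple and (h, g) block
    return True

stable = [m for m in feasible_matchings() if is_stable(m)]
print(len(list(feasible_matchings())), stable)
```

Each of the four feasible matchings is blocked: if the couple holds (h1, h2), then s3 and h2 block; if s3 holds h1, the couple and (h1, h2) block; if s3 holds h2, then s3 and the vacant h1 block; and the empty matching is blocked as well.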


Therefore, a major focus of this study was to assess the extent to which these theoretical possibilities play a role in the actual NRMP matches. In the course of this report it will become clear that, while it has always been possible to find stable matchings in the previous years’ NRMP matches (a stable matching has been found in every match at least since the mid 1970’s), it appears that no stable matching is precisely program-optimal or applicant-optimal in any of the years we have examined. However, we will show that applicant-proposing and program-proposing algorithms continue to perform approximately as in the case of simple markets.

B. The Preexisting NRMP Algorithm

The preexisting NRMP algorithm (the one in use in 1995 when this study began, and used through the 1997 matches) is the result of incremental modifications over a period of years. It is primarily, but not entirely, a program-proposing algorithm and deals with match variations through a three-phase process. The first phase produces an initial match by ignoring most match variations, using the program-proposing deferred-acceptance algorithm, modified to let couples hold on to many offers until a late stage in the algorithm. The match produced in this way will in general not be stable (because of the way it handles couples, and because the other match variations are ignored), so the second phase of the program identifies potential instabilities. The third phase of the program uses an algorithm to fix these instabilities one by one and produces a stable match. The processing in this third phase does not always have residency programs proposing. Instead, couples propose in part of the algorithm designed to fix instabilities due to couples, and applicants also propose in part of the algorithm that fixes instabilities related to supplemental (first-year) matches. Thus the 1995 NRMP algorithm is a hybrid: program-proposing in its first phase (which performs the bulk of the matching), and applicant-proposing in some parts of its third phase.

The NRMP specialty matches like Thoracic Surgery are run using an algorithm that is technically a little different from the original NRMP algorithm (it does not handle some of the NRMP match variations such as the use of supplemental lists to form multiyear matches, and it is organized in a single phase). But when no match variations are present, the specialty match algorithm and the 1995 NRMP algorithm are functionally equivalent to the program-proposing deferred-acceptance algorithm in that they all produce the program-optimal stable matching.
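As a rough illustration of the program-proposing deferred-acceptance procedure referred to above, the following Python sketch covers only a simple market (no couples, supplemental lists, reversions, or even/odd requests). The function name, the dictionary layout, and the toy preference lists are assumptions made for this example; it is not the NRMP's implementation.

```python
def program_proposing_da(program_rols, applicant_rols, quotas):
    """Program-proposing deferred acceptance for a simple many-to-one market.

    program_rols[p]   -- applicants ranked by program p, most preferred first
    applicant_rols[a] -- programs ranked by applicant a, most preferred first
    quotas[p]         -- number of positions program p wishes to fill
    Returns a dict mapping each matched applicant to a program.
    """
    # rank[a][p] = position of program p on applicant a's list (lower is better)
    rank = {a: {p: i for i, p in enumerate(rol)}
            for a, rol in applicant_rols.items()}
    next_offer = {p: 0 for p in program_rols}  # next list index p will propose to
    filled = {p: 0 for p in program_rols}      # offers of p currently held
    held = {}                                  # applicant -> program whose offer is held
    active = list(program_rols)                # programs that may still propose

    while active:
        p = active.pop()
        rol = program_rols[p]
        while filled[p] < quotas[p] and next_offer[p] < len(rol):
            a = rol[next_offer[p]]
            next_offer[p] += 1
            if p not in rank.get(a, {}):
                continue                       # a did not rank p: offer is refused
            current = held.get(a)
            if current is None:
                held[a] = p
                filled[p] += 1
            elif rank[a][p] < rank[a][current]:
                held[a] = p                    # a trades up and rejects `current`
                filled[p] += 1
                filled[current] -= 1
                active.append(current)         # rejected program resumes proposing
    return held


# Toy market (hypothetical data): each side's first choices conflict.
program_rols = {"P1": ["a1", "a2"], "P2": ["a2", "a1"]}
applicant_rols = {"a1": ["P2", "P1"], "a2": ["P1", "P2"]}
quotas = {"P1": 1, "P2": 1}
match = program_proposing_da(program_rols, applicant_rols, quotas)
```

In this toy market every program receives its first choice, which is the program-optimal stable matching; an applicant-proposing run on the same lists would instead give each applicant her first choice.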

II. Descriptive Statistics and Original NRMP Match Results

A. The NRMP in the Years 1987 and 1993–1996

Table 1 gives the descriptive statistics of the NRMP match in the five years we consider. Notice that, in each year, a substantial number of the more than 20,000 applicants who participate do so in ways that utilize the match variations that the NRMP allows: about 4 percent participate as couples, and 8–12 percent submit supplemental Rank Order Lists. In addition, in the 1990’s about 7 percent of the 3,000–4,000 programs that participate in each year have positions that could revert to other programs if they remain unfilled (accounting for almost 6 percent of the total quota of positions). Thus, the match variations are a substantial part of the match. Before investigating how these match variations change the properties of stable matches and of strategic behavior, the first task is the design of an applicant-proposing algorithm to produce stable matches that meet the match-variation requirements of thousands of participants.

B. Specialty Matches: Thoracic Surgery in the Years 1991–1994 and 1996

In contrast, the Thoracic Surgery match is a simple match, with no match variations. Its basic descriptive statistics and match results are given in Table 1B.

755 VOL. 89 NO. 4 ROTH AND PERANSON: MATCHING MARKET FOR PHYSICIANS

TABLE 1—DESCRIPTIVE STATISTICS AND ORIGINAL MATCH RESULTS FOR (A) THE NRMP AND (B) THORACIC SURGERY

A. NRMP

Category                                          1987    1993    1994    1995    1996

Applicants (Active, ROL returned):
  Primary ROLs                                  20,071  20,916  22,353  22,937  24,749
  Applicants with supplemental ROLs              1,572   2,515   2,312   2,098   2,436
  Results
    Primary matches                             16,117  17,209  17,725  18,170  18,316
    Supplemental matches                           577   1,294   1,152     990     725
  Couples
    Applicants who are coupled                     694     854     892     998   1,008
    Coupled applicants who matched                 646     794     817     899     912

Programs:
  Active programs                                3,225   3,677   3,715   3,800   3,830
  Active programs with ROL returned              3,170   3,622   3,662   3,745   3,758
  Potential reversions of unfilled positions
    Programs specifying reversion                   69     247     276     285     282
    Positions to be reverted if unfilled           225   1,329   1,467   1,291   1,272
  Programs requesting even/odd matching              4       2       6       7       8

Quotas:a
  Total quota before match                      19,973  22,737  22,801  22,806  22,578
  Changes during match processing
    Quota increases
      Programs                                      22     120     143     124     130
      Positions                                     45     357     357     327     336
    Quota decreases
      Programs                                      23     127     142     128     138
      Positions                                     46     338     338     303     326
  Total quota after match (final quota)         19,972  22,756  22,820  22,830  22,588

Results:
  Positions filled                              16,694  18,503  18,877  19,160  19,041
  Positions unfilled                             3,278   4,253   3,943   3,670   3,547
  Programs filled                                2,100   2,309   2,440   2,599   2,589

B. Thoracic Surgery

Category              1991    1992    1993    1994    1996

Applicant ROLs         127     183     200     197     176
Active programs         67      89      91      93      92
Program ROLs            62      86      90      93      92
Total quota             93     132     141     146     143
Positions filled        79     123     136     140     132

a Quotas include positions in active programs with no ROL returned. Changes during the match are caused primarily by reversions. In some cases, one position is reverted simultaneously to two programs, causing a net increase in the number of positions offered. In addition, a few positions may be dropped from the match during processing to accommodate requests for even/odd matching.

III. Design of the Applicant-Proposing Algorithm

The process by which the applicant-proposing algorithm was designed is roughly as follows. First, a conceptual design was formulated and circulated for comment (Roth, 1996a). This was based on an algorithm for simple markets, modified to deal with the complexities of the NRMP. In order for the design to be coded into a working algorithm, a number of choices had to be made concerning the sequence in which proposals would be made. The sequencing of proposals can be shown to have no effect on the outcome of simple matches, but it could potentially affect the outcome when the NRMP match variations are present. Thus, like the overall architecture of the algorithm, the sequencing of proposals is a design question about which the existing theory gives some general guidance that falls short of a complete engineering specification. Consequently, we performed computational experiments before making sequencing choices. In what follows, we first present the conceptual design (from Roth [1996a]) and then discuss the sequencing experiments and implementation decisions.

A. The Conceptual Design

The algorithm described here is based on the instability-chaining algorithm in Roth and John H. Vande Vate (1990) (which finds stable matchings by resolving applicant–program instabilities one at a time) and on the general design of phase 3 of the preexisting NRMP algorithm.

The object of the algorithm is to produce a stable matching, by assembling a set A(k) of residency programs and applicants and a tentative matching M(k) with the property that there are no instabilities within the set A(k), and no applicant or program in A(k) is matched to anyone outside of A(k). When the set A(k) has grown to include all applicants and programs, the resulting match is stable, and the algorithm stops.

In the applicant-proposing algorithm, the initial set, A(0), consists of all positions offered in the match, and the initial tentative matching has all positions vacant. The algorithm begins by selecting an applicant S(1) from the set of applicants in the match and adding S(1) to A(0) to make the new set A(1).

At any step k of the algorithm, at which a new applicant S(k) has just been added to form the set A(k), the new tentative matching M(k) is formed as follows. First, applicant S(k) [= S(k, 1)] proposes down his Rank Order List [of programs that also rank S(k)], from the top, until the first program is reached that either has a vacant position or prefers S(k) to its least-preferred current tentative match. In the latter case, this least-preferred applicant, S(k, 2), is now rejected by the program in question, and this applicant now proposes down her ROL in a similar way, and so on. Each applicant S(k, n) displaced in this way similarly proposes down his or her ROL.

At some point in this process, an applicant S(k, n) may be displaced who is a member of a couple, or who is displaced from a primary (second-year) position for which she also holds a supplemental (first-year) position. In either case, a second position now potentially becomes vacant, as the spouse of S(k, n) is withdrawn from his tentative match, or as S(k, n) is withdrawn from her supplemental match. In either case, the program whose position is left vacant, P(k, n), is added to a “program stack” to be held for later processing. If S(k, n) is a couple, then both couple members [S(k, n, a) and S(k, n, b)] now propose down their joint ROL of pairs of programs, and they each may displace another applicant. Also, if any S(k, n) has a supplemental ROL associated with her new tentative match, she proposes down it as well, which may also result in the displacement of another applicant. Thus, both couples and supplemental matches may simultaneously displace more than one applicant. One displaced applicant is processed immediately, and any others are added to an “applicant stack” for later processing.

Applicants propose down their ROLs in this way until the applicant stack is empty. (Applicants continue throughout to be able to propose to programs which may be on the program stack.) A residency program is then selected from the program stack, and all of the applicants in A(k) with whom it might form instabilities [i.e., all of the applicants in A(k) who are preferred by the program to its least-preferred current tentative match, and who prefer this program to their current match] are added to the applicant stack, which is processed as before, with applicants proposing down their ROLs from the top.

When both the applicant and program stacks are empty, the tentative matching thus produced is M(k): no instabilities for M(k) are contained in the set A(k), and no applicant or program in A(k) is matched by M(k) to anyone outside of A(k). The algorithm is now ready to pick a new applicant S(k + 1), and start the process again for the set A(k + 1).

When all applicants have been included in the set A(k), even/odd requests and program reversions are adjusted, which causes additions to the applicant and program stacks, which are handled as above. When these stacks are empty, the algorithm stops, and the last tentative match becomes final.
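For the special case of a simple match, the steps above reduce to admitting applicants one at a time and following each chain of displacements S(k, 1), S(k, 2), and so on. The Python sketch below is a minimal illustration under that assumption: it omits couples, supplemental lists, the program stack, reversions, and even/odd requests, and the function name and toy data are hypothetical.

```python
from itertools import permutations


def sequential_applicant_proposing(applicant_rols, program_rols, quotas, order):
    """Admit applicants one at a time (simple market only).

    Each newly admitted applicant S(k) proposes down his or her ROL from the
    top; an applicant displaced along the way proposes in turn, until the
    chain of displacements ends.
    """
    # prank[p][a] = position of applicant a on program p's list (lower is better)
    prank = {p: {a: i for i, a in enumerate(rol)}
             for p, rol in program_rols.items()}
    held = {p: [] for p in program_rols}   # tentative matching M(k), by program

    for s in order:                        # S(1), S(2), ...
        proposer = s
        while proposer is not None:
            displaced = None
            for p in applicant_rols[proposer]:
                if proposer not in prank.get(p, {}):
                    continue               # p does not rank this applicant
                if len(held[p]) < quotas[p]:
                    held[p].append(proposer)        # vacant position
                    break
                if held[p]:
                    # least-preferred current tentative match of program p
                    worst = max(held[p], key=prank[p].get)
                    if prank[p][proposer] < prank[p][worst]:
                        held[p].remove(worst)
                        held[p].append(proposer)
                        displaced = worst           # next proposer in the chain
                        break
            proposer = displaced
    return {a: p for p, members in held.items() for a in members}


# Toy market (hypothetical data): three applicants, two one-position programs.
applicant_rols = {"a1": ["P1", "P2"], "a2": ["P1", "P2"], "a3": ["P1"]}
program_rols = {"P1": ["a3", "a2", "a1"], "P2": ["a1", "a2"]}
quotas = {"P1": 1, "P2": 1}

# Run every admission order and collect the distinct final matchings.
outcomes = {
    frozenset(sequential_applicant_proposing(
        applicant_rols, program_rols, quotas, order).items())
    for order in permutations(applicant_rols)
}
```

In this toy market all six admission orders lead to the same final matching, in line with the order-independence that the theory predicts for simple matches.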

6 Even in a simple matching market, the order in which proposals are made can matter in versions of the Roth and Vande Vate (1990) instability-chaining algorithm in which members of both sides of the market may be chosen to make the next proposal (in contrast to versions in which all proposals are made by one side of the market). Yosef Blum and Uriel G. Rothblum (1999) show that, in such a version of the algorithm, late proposers have an advantage over early proposers.

In a match with no match variations, the applicant and position stacks would always become empty, and the final match would be the applicant-optimal stable matching. When the match variations are present, there is a possibility that at some stages of the algorithm the position stacks would never become empty (i.e., a cycle would occur, in which the same positions reappeared on the stack). Therefore “loop-detectors” need to be added to each stage k. Every loop must involve a position becoming unmatched and made vacant either because a couple or a supplemental assignment has been withdrawn from the position, or a position has been withdrawn from an applicant (e.g., in satisfying an even/odd constraint). Thus, a loop detector can work by keeping a log of when positions become unmatched in these ways [i.e., recording which applicant is unmatched from which position, during the processing of some step A(k)]. If the same pairs appear multiple times, a loop is in progress. How to proceed at this point may depend on the nature of the loop. (It is observed in Roth and Vande Vate [1990] that certain kinds of inessential loops can be rendered harmless by randomizing the order in which applicants and positions are processed from the stacks. Loops due to the nonexistence of a stable matching would be more serious, but the prior experience of the NRMP suggests that these may be rare.)
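The logging idea behind such a loop detector can be sketched in a few lines of Python. The class name, its methods, and the repeat threshold are assumptions introduced for illustration; the text specifies only that repeated (applicant, position) withdrawals within a step signal a loop.

```python
from collections import Counter


class LoopDetector:
    """Log of (applicant, position) withdrawals within one step (hypothetical helper)."""

    def __init__(self, threshold=2):
        # Number of times the same pair must appear before a loop is
        # suspected; the value 2 is an assumption, not taken from the paper.
        self.threshold = threshold
        self.log = Counter()

    def record_unmatch(self, applicant, position):
        """Log a withdrawal; return True if this pair has now repeated."""
        self.log[(applicant, position)] += 1
        return self.log[(applicant, position)] >= self.threshold

    def reset(self):
        """Clear the log, for example when processing of a new step begins."""
        self.log.clear()


detector = LoopDetector()
first = detector.record_unmatch("a1", "P1")    # first withdrawal of this pair
other = detector.record_unmatch("a2", "P1")    # a different pair
repeat = detector.record_unmatch("a1", "P1")   # same pair seen again
```

What to do once a repeat is flagged is left open here, as in the text: the appropriate response may depend on the nature of the loop.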

Thus, the existing theory suggests the general architecture for an applicant-proposing algorithm that can deal with instabilities one at a time as they are detected, and it provides guidance on how the algorithm may possibly fail to find a stable matching. But to determine how often it might fail to produce a stable matching, we need some computational experiments. The experiments reported next, which will help determine the details of the algorithm design, will also show that failures are rare: we will not observe even a single failure when we explore different versions of the algorithm on previous years’ ROL data.

B. Sequencing Questions and Implementation Decisions

In a simple match, without the NRMP match variations, the applicant-proposing algorithm just described always produces the applicant-optimal stable match, and the program-proposing algorithm always produces the program-optimal stable match, regardless of the order in which proposals are processed within the algorithm. One consequence of the fact that these optimal stable matches do not exist in general when the match variations are present is that the order in which applicants and programs are processed may have an effect on the match produced. Thus the sequence in which applicants and programs are processed at various points in the algorithm needs to be considered as part of the design of the applicant-proposing algorithm.

Two issues were considered in conducting and evaluating experiments related to the sequencing of operations in the algorithms.

(i) Do sequencing differences cause substantial or predictable changes in the match result (e.g., do applicants or programs selected first do better or worse than their counterparts selected later)?6

(ii) Does the sequence of processing affect the likelihood that an algorithm will produce a stable matching? (In connection with this latter point, recall that instability-chaining algorithms can cycle, even when stable matchings exist, and certainly when they do not. Therefore, one objective was to consider how sequencing decisions might influence the frequency of “loops” occurring in the algorithm.)

Experiments to test the effect of sequencing were conducted using data from three NRMP matches: 1993, 1994, and 1995.

1. Sequencing Experiments on the Preexisting NRMP Algorithm.—We investigated the effect of different sequencing of operations in variants of the preexisting NRMP algorithm, in part to establish a baseline against which to compare the algorithm to be designed. In the preexisting algorithm, programs are processed in ascending sequence by six-digit program code number. To test the sensitivity of the results to this sequencing, computational experiments were run on the ROL data in which this sequencing was reversed (i.e., programs were processed in descending order by program code number). As expected, the results showed differences, but the differences were small: the largest difference was in 1994 when only four out of 3,662 programs that submitted ROLs received a different match under the alternative ordering, as did four out of 22,353 applicants. Not only are these differences very small, they do not appear to be systematic.7 (A fuller account of the results of these experiments appears in Appendix A.)

The preexisting NRMP algorithm was also investigated for its sensitivity to the sequence in which reversions are processed. Rather than simply changing the order in which reversions occurred, the experiments involved setting the input program quotas to be the final postmatch quotas produced by the preexisting NRMP algorithm. All further reversion processing was then eliminated. These experiments then provided an indication of the differences caused not only by changing the order of reversions, but also by altering the time when reversions enter into the match processing (i.e., all required reversions were assumed to take place simultaneously, at the beginning of match processing). No more than two programs or applicants were observed to be affected by such changes in any of the three years 1993–1995 (see Appendix A).

Finally, it should be noted that no loops were detected in any of these experiments on the preexisting NRMP algorithm. Consequently, despite the presence of match variations, sequencing does not appear to play a significant role in the operation of the preexisting NRMP algorithm.

7 We use the term “very small” informally, but not merely to express an opinion of changes that affect on the order of 0.01 percent of applicants. These changes are also at least an order of magnitude smaller than the main effects we will find due to changes between program-proposing and applicant-proposing algorithms. Since the effects appear to be unsystematic, they do not appear to have any welfare implications, on average.

2. Sequencing Experiments on the Applicant-Proposing Algorithm.—Computational experiments were conducted to measure the impact of:

(i) the sequence in which applicants are admitted to the algorithm for processing;

(ii) the sequence in which couples are processed relative to other applicants; and

(iii) the sequence in which applicants ranked by a program are processed when attempting to fill a program that has been selected from the program stack.

To understand the results of the computational experiments (which are tabulated in detail in Appendix A) it is useful to compare the outcomes from each experiment to those from a fixed baseline. We chose as a baseline an applicant-proposing algorithm in which applicants were processed in ascending order by their applicant codes, regardless of whether they were single or members of couples. (In all cases, when a member of a couple was processed, so was the other member. When applicants were processed in ascending code order, a couple was selected for processing based on the code number of the spouse with the lower applicant code.) When a program was selected from the program stack, applicants were processed in ascending sequence by program rank number. All of these experiments were carried out on the ROL data from the NRMP matches in 1993, 1994, and 1995.

The experiments were conducted in a partial factorial design. The handling of couples had three treatments (couples intermixed with singles, couples first, and couples last); the order of introducing applicants into the match had two treatments (ascending order by applicant code or descending order); and the order of processing applicants when a program is pulled from the stack had two treatments (ascending order by program rank or descending order). The results are that none of these sequencing decisions had a large or a systematic effect on the matching produced. In two-thirds of the cases, the match was the same as in the baseline case. (In the majority of the remaining cases only two applicants received different matches, and the maximum number of applicants affected was 12 out of 22,937, which occurred when a couple received a worse match and initiated a chain of displacements. This happened in two of the 18 cases and involved the same 12 applicants in both cases.)

However, there was an effect of sequencing on the internal processing of the algorithm. The number of loops encountered was fewest when couples were introduced to the match after single applicants. This is not too surprising in view of the fact that no loops would occur in the absence of match variations. The results indicate that loops are least likely to occur when the couples are introduced into the larger market with some tentative matches already assembled, as opposed to when couples enter first, so that the initial tentative matches involve only couples. Introducing couples last reduces the numbers of loops (and hence the potential that in some future match it would be difficult to find a stable matching) without changing the prospects of couples or single applicants in the match.

Finally, experiments related to the sequence in which reversions are processed were performed on an applicant-proposing algorithm. These experiments were similar to those performed on the preexisting NRMP algorithm. Again, no substantial changes were induced by changing the order in which reversions were handled; no changes at all resulted in the 1993 match, and only two applicants and programs were affected in the 1994 and 1995 matches (see Appendix A). Thus, for both the preexisting NRMP algorithm and the applicant-proposing algorithm, there is almost no difference between the results obtained with reversion processing and the results obtained by setting the quotas to the final quotas after reversions and eliminating further reversion processing. (This point simplifies the design of some of the experiments to compare the two algorithms, in connection with strategic behavior by residency programs, to be discussed later in this article.)

Based on the sequencing experiments described above, it was decided to sequence proposals by couples after proposals by single applicants, since this was the order that produced the fewest internal loops.8

Note that we did not at any point choose to randomize the processing order (randomization was shown in Roth and Vande Vate [1990] to allow the algorithm to escape from certain kinds of loops). The reason is that loops do not appear to be a problem with the processing sequence selected, and it was felt that a desirable feature of the match is that it should be precisely reproducible from the ROL data.
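Because the selected sequence is deterministic, it can be written as a plain sort. The Python sketch below illustrates only the admission order (single applicants first in ascending applicant-code order, then couples in ascending order of the lower of their two codes, as listed in footnote 8). The function name, the tuple layout, and the example codes are assumptions made for illustration.

```python
def admission_order(singles, couples):
    """Return applicants in the order they would be admitted for processing.

    singles -- iterable of applicant codes of single applicants
    couples -- iterable of (code_a, code_b) pairs, one pair per couple
    Each element of the result is ("single", code) or ("couple", (low, high)).
    """
    # Single applicants come first, in ascending order by applicant code.
    ordered = [("single", code) for code in sorted(singles)]
    # Couples follow, in ascending order by the lower of their two codes.
    normalized = [tuple(sorted(pair)) for pair in couples]
    ordered += [("couple", pair)
                for pair in sorted(normalized, key=lambda pair: pair[0])]
    return ordered


# Hypothetical applicant codes.
order = admission_order(singles=[30, 10], couples=[(25, 5), (7, 40)])
```

The same inputs always yield the same order, which is the reproducibility property the text asks of the match.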

IV. Differences in the Matches Produced by the Two Algorithms

A. The NRMP

The preexisting NRMP algorithm and the newly designed applicant-proposing algorithm were compared in terms of the matches that they produce for the ROLs submitted in 1987 and 1993–1996. Table 2 gives the results of these comparisons. The first half of the table concentrates on the comparisons from the point of view of applicants; the second half is from the point of view of programs.

TABLE 2—COMPARISON OF RESULTS BETWEEN ORIGINAL NRMP ALGORITHM AND APPLICANT-PROPOSING ALGORITHM

Result                                            1987      1993      1994      1995      1996

Applicants:
  Number of applicants affected                     20        16        20        14        21
  Applicant-proposing result preferred              12        16        11        14        12
  Current NRMP result preferred                      8         0         9         0         9

  U.S. applicants affected                          17         9        17        12        18
  Independent applicants affected                    3         7         3         2         3

  Difference in result by rank number
    1 rank                                          12        11        13         8         8
    2 ranks                                          3         1         4         2         6
    3 ranks                                          2         3         2         2         3
    More than 3 ranks                                2         1         1         2         3
                                               (max 9)   (max 4)   (max 5)   (max 6)   (max 6)

  New matched                                        0         0         0         0         1
  New unmatched                                      1         0         0         0         0

Programs:
  Number of programs affected                       20        15        23        15        19
  Applicant-proposing result preferred               8         0        12         1        10
  Current NRMP result preferred                     12        15        11        14         9

  Difference in result by rank number
    5 or fewer ranks                                 5         3         9         6         3
    6–10 ranks                                       5         3         3         5         3
    11–15 ranks                                      0         5         1         3         1
    More than 15 ranks                               9         4         6         0        11
                                             (max 178)  (max 36)  (max 31)           (max 191)

  Programs with new position(s) filled               0         0         2         1         1
  Programs with new unfilled position(s)             1         0         2         0         0

8 The full details of the sequencing decisions are as follows:

(i) All single applicants are admitted to the algorithm for processing before any couples are admitted.

(ii) Single applicants are admitted for processing in ascending sequence by applicant code.

(iii) Couples are admitted for processing in ascending sequence by the lower of the two applicant codes of the couple.

(iv) When a program is selected from the program stack for processing, the applicants ranked by the program are processed in ascending order by program rank number.

(v) The processing of programs requesting even numbers of matches or reversions of unfilled positions is deferred until all applicants have been admitted for processing.

(vi) Programs requesting even numbers of matches are processed in ascending sequence by program code. An applicant deleted from a program in order to leave an even number of matches in the program is placed on the applicant stack for processing.

(vii) Programs requesting reversions of unfilled positions are processed in ascending sequence by the program code of the program “donating” the unfilled position(s). A program that “receives” a reverted position is placed on the program stack for processing.

(viii) After all reversions have been processed, the requests for reversions are reprocessed, in case any new reversions of unfilled positions are required as a result of changes made in the processing of reversions that have been processed since the last time this reversion request was considered.

(ix) When no further processing is required to satisfy all reversions, requests for even numbers of matches are reprocessed as in point (vi) above, and if any changes are made, requests for reversions are reprocessed as in points (vii) and (viii) above. This iterative processing continues until no further changes are made by even processing or reversion processing. (The possible need for a reverted position to be “unreverted” is checked as part of the check for stability, by using original quotas for programs which have lost positions through reversions.)

Only about 0.1 percent of applicants are affected by the change in algorithms, and of these, most prefer the match they receive under the applicant-proposing algorithm. Note that in two of the five years the number of applicants matched changed by one (one fewer in 1987, one more in 1996). Recall that in a simple match a change from one stable matching to another would never change the number of applicants matched; so here is another case in which the match variations cause a difference, but a difference which turns out to be very small and unsystematic.

Equally few programs are affected by the change of algorithms, and these constitute about 0.5 percent of all programs. Most, but not all, of the programs prefer the match they receive under the preexisting NRMP algorithm, but in 1994 and 1996 slightly more programs would even have preferred the applicant-proposing algorithm to the preexisting NRMP algorithm. Most programs that receive a different match have only one applicant different between the matches produced by the two algorithms. The majority of differences have to do with filling a position with a different applicant; only a small number of positions move from being filled to unfilled or vice versa. Again, this is a consequence of the match variations; as already noted in the case of applicants, it turns out to be both very rare and unsystematic.
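The orders of magnitude quoted above can be checked with a few lines of arithmetic. The Python sketch below uses the counts of affected applicants and programs from Table 2 and, as denominators, the primary ROLs and the active programs with ROL returned from Table 1; that choice of denominators is an assumption, since the text does not say which base the percentages use. Only the first four years in the tables are included, as an illustration.

```python
years = [1987, 1993, 1994, 1995]

# Denominators from Table 1 (choice of denominators is an assumption).
primary_rols = [20071, 20916, 22353, 22937]
programs_with_rol = [3170, 3622, 3662, 3745]

# Numerators from Table 2.
applicants_affected = [20, 16, 20, 14]
programs_affected = [20, 15, 23, 15]

# Percentage of applicants and of programs affected by the change of algorithm.
applicant_pct = [100 * n / d for n, d in zip(applicants_affected, primary_rols)]
program_pct = [100 * n / d for n, d in zip(programs_affected, programs_with_rol)]

for year, a, p in zip(years, applicant_pct, program_pct):
    print(f"{year}: applicants {a:.2f} percent, programs {p:.2f} percent")
```

Under these assumptions the applicant shares fall a little below 0.1 percent in each year, and the program shares lie between roughly 0.4 and 0.6 percent.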

It may be helpful at this point to consider an example of precisely how the match variations can cause a deviation from the predictions of the theory for simple markets; for example, how it can be that a few applicants do worse with the applicant-proposing algorithm than with the program-proposing algorithm. For example, if switching to the applicant-proposing algorithm causes applicant A to improve his match from his second to his first choice, it may be that the first choice now requires a supplemental match that was not required before. If this new supplemental match displaces a previously matched but less-preferred applicant in a program, that displaced applicant is forced to go further down his or her list (i.e., does worse). Furthermore, matching that applicant may displace another applicant, who may displace another, and so on, causing a chain of applicants who do worse (even though, as expected of an applicant-proposing algorithm, this chain of events began with an applicant who did better than he would have if the program-proposing algorithm had been used).

It is worth noting that when we refer to “only 0.1 percent” of applicants, we are talking about a change whose small size we will explain in what follows. But this does not necessarily imply that the associated change in welfare is small. Indeed, in the debate that led to this study, and after our report was circulated to the interested parties, a great deal of discussion stemmed from the view that the difference in welfare was likely to be large for the affected applicants, and likely to be small for the affected programs. This contributed to the decision to adopt the applicant-proposing algorithm, a decision strongly lobbied for by the student organizations, and eventually unanimously adopted by the NRMP Board.9

B. Thoracic Surgery

Because there are no match variations in the Thoracic Surgery matches for the years we consider, they are simple matches and are well described by the existing theory. Consequently, we know that the applicants will all do as well as possible at the stable match produced by the applicant-proposing algorithm, and the programs will all do as poorly as possible at this stable matching. What the theory does not tell us is how large this effect will be; for that we need to look at the data (see Table 3). As discussed in the introduction, the effect turns out to be minimal: in the five years we studied, only four applicants and four programs would have been affected by a change in algorithms; in three of the five years, the applicant-proposing algorithm would have produced the same match as the program-proposing algorithm, indicating that this was the only stable matching in those years. (August Colenbrander [1996] reports similarly small differences in the specialty matches he maintains.)

9 The argument about the size, and relative size, of the welfare effects for applicants and programs can be paraphrased in part roughly as follows. Both programs and applicants have some uncertainty in their rankings. There may not be that much difference between a program’s 7th and 17th ranked candidates. Similarly, applicants may not be able to judge clearly whether they will get a better educational experience at their first- or second-choice programs. But applicants can clearly judge other factors in their preferences, such as whether they would prefer to live in Seattle or Miami, where these programs may be located. Therefore, a change of algorithms may have a big effect on the affected applicants, and only a small one on the affected programs.

V. Differences in Sensitivity to Participant Behavior

The comparisons of match outcomes discussed in the previous section are all based on Rank Order Lists that were submitted for matches made by the preexisting NRMP and specialty match algorithms. While the changes observed when the match was instead produced by the applicant-proposing algorithm were small, a comparison of the algorithms also requires us to consider whether participants might have reason to submit different kinds of ROLs if the new algorithm were to be substituted for the preexisting one. For this purpose, we consider whether participants could have favorably influenced the match, under either algorithm, by submitting different ROLs. The idea is to assess both how many participants could do so and how the number is different for the two algorithms. This will also allow us to determine what kinds of advice can be given to participants about how to participate in the match under either algorithm.

Once again, this is a subject about which the theory of simple matching markets tells us a great deal for markets without the match variations found in the NRMP. To see how well the theory for simple markets approximates the NRMP matches, and also to assess the size of the effects to expect, again required computational experiments on the data. A quick review of the theory will help organize the discussion.


10 This would free us from the computationally impossible task of investigating all possible manipulations by all participants.


762 THE AMERICAN ECONOMIC REVIEW SEPTEMBER 1999

A. Strategic Behavior in Simple and Complex Matching Markets

In a simple matching market, without match variations, it has been shown (Roth, 1982) that there do not exist any stable matching algorithms that completely remove the possibility that some applicant or program can get a better match by submitting an ROL that differs from the applicant or program’s straightforward preferences. However, we have already noted the following:

In simple markets, when the applicant-proposing algorithm is used, but not when the program-proposing algorithm is used, no applicant can possibly improve his match by submitting an ROL that is different from his true preferences. (Recall also that no parallel assertion can be made about residency programs that have more than one position.)

Therefore, in simple markets, we would find strategic opportunities for applicants only when the program-proposing algorithm is used, and the theory tells us what these might be. Specifically, consider the ROL of some applicant, and define a truncation of that ROL to be a shorter ROL that is the same as the original ROL for as many programs as it ranks. We can then say the following:

In simple markets when the program-proposing algorithm is used, every applicant who can do better than to submit his true preferences as his ROL can do so by submitting a truncation of his true preferences. That is, if (holding all other ROLs constant) an applicant would be matched to his kth choice if he submitted his true preferences, and his jth choice (with j < k) if he submitted some other ROL, then he can be matched to his jth choice by submitting a truncation of his true preferences at the jth choice. Furthermore, no part of his original ROL below the kth choice has any effect on the match (Roth and Vande Vate, 1991).

TABLE 3—DIFFERENCE IN RESULT WHEN ALGORITHM CHANGED FROM PREEXISTING SPECIALTY MATCH TO APPLICANT-PROPOSING

Year    Difference
1991    none
1992    2 applicants improve, 2 programs do worse
1993    2 applicants improve, 2 programs do worse
1994    none
1996    none

It can also be shown that truncations are a kind of manipulation that can potentially be identified with the least information about others’ preferences (Roth and Rothblum, 1999).
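As an illustration only, the truncation result for simple markets can be traced in a toy example. The sketch below is not the NRMP algorithm: it assumes a one-to-one market with no match variations, and the function names (`deferred_acceptance`, `program_proposing`) and the two-applicant, two-program preference lists are hypothetical.

```python
def deferred_acceptance(proposer_prefs, receiver_prefs):
    """One-to-one deferred acceptance. Returns {proposer: receiver}."""
    # rank[r][p] = position of proposer p on receiver r's list (lower is better)
    rank = {r: {p: i for i, p in enumerate(lst)}
            for r, lst in receiver_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}
    held = {}                       # receiver -> proposer currently held
    free = list(proposer_prefs)
    while free:
        p = free.pop()
        lst = proposer_prefs[p]
        while next_choice[p] < len(lst):
            r = lst[next_choice[p]]
            next_choice[p] += 1
            if p not in rank.get(r, {}):
                continue            # r did not rank p, so r rejects p
            rival = held.get(r)
            if rival is None or rank[r][p] < rank[r][rival]:
                held[r] = p
                if rival is not None:
                    free.append(rival)   # displaced proposer resumes later
                break
    return {p: r for r, p in held.items()}


def program_proposing(applicant_prefs, program_prefs):
    """Return {applicant: program} when programs make the proposals."""
    by_program = deferred_acceptance(program_prefs, applicant_prefs)
    return {a: p for p, a in by_program.items()}


# Hypothetical true preferences (most preferred first).
applicants = {"a1": ["p1", "p2"], "a2": ["p2", "p1"]}
programs = {"p1": ["a2", "a1"], "p2": ["a1", "a2"]}

truthful = program_proposing(applicants, programs)
# a1 truncates his ROL just above his truthful match (p2), keeping only p1.
truncated = program_proposing({**applicants, "a1": ["p1"]}, programs)
applicant_optimal = deferred_acceptance(applicants, programs)

print(truthful["a1"], truncated["a1"], applicant_optimal["a1"])
```

In this toy market a1 is matched to p2 when he reports truthfully under the program-proposing procedure, and to p1 (his match under the applicant-proposing procedure) when he truncates, which is the pattern the quoted result describes.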

In simple markets, the reason that all successful manipulations can (also) be accomplished by truncations is that, in a simple market, the deferred-acceptance algorithm never “backtracks”: no information in an agent’s ROL is used beyond the point at which that agent is matched. Although we cannot apply this result directly to the complex market, we can do computational experiments to assess how good an approximation is provided by concentrating only on truncations in the investigation of possible strategic manipulations in the NRMP. Specifically, if we find that information about agents’ preferences among options below the point at which they are matched has little effect on the match, then we can be confident that investigating truncations will give us a comparably good approximation for the magnitude of possible strategic manipulations in the complex NRMP market.10

For simple markets, the theory also tells us which applicants can potentially profit from manipulation, and how much:

In simple markets, when the program-proposing algorithm is used, the only applicants who can do better than to submit their true preferences are those who would have received a different match from the applicant-proposing algorithm. Furthermore, the best such applicants can do is to obtain the applicant-optimal match, and they can do this by submitting to the program-proposing algorithm the truncation of their true preferences that stops at the match they would have gotten from the applicant-proposing match (see


763VOL. 89 NO. 4 ROTH AND PERANSON: MATCHING MARKET FOR PHYSICIANS

Gabrielle Demange et al., 1987; Roth and Sotomayor, 1990).

It is important to note that, even in the case of a simple match without match variations, an applicant generally would not have the information needed to submit such a truncation (and if he submitted a truncation that was one program too short he would become unmatched). But this result shows that, in a simple match, we can identify an upper bound on the number of applicants who could possibly profit from manipulating their Rank Order Lists, by seeing how many applicants receive different matches at the two algorithms.

We cannot directly apply this upper bound to the NRMP, because it depends for its proof on the existence of optimal stable matchings for each side of the market, which we know (from the sequencing experiments) do not exist in the NRMP data. But the theory of simple matches allows us to use the computational results reported in Table 2 as a numerical benchmark against which to compare the computational estimates we will make of the scope for possible manipulation. That is, we can compare the estimates we get of how many applicants can potentially profit from strategically stating their ROLs with the numbers of applicants who were observed to get different matches from the two algorithms. If these numbers are close for the program-proposing algorithm (and close to zero for the applicant-proposing algorithm), then the theory of simple matches provides a comparably close approximation for the situation in the complex NRMP market.

The case of programs that have more than one position is not so simple, even in the case of simple matches. Programs may, at least in theory, profit both from truncating their ROLs, and from reducing the number of positions they submit to the match (either by making early arrangements with some applicants or by withholding positions to be filled by unmatched applicants after the match). The temptation for this latter kind of manipulation can be shown to be larger with the program-proposing algorithm than with the applicant-proposing algorithm (see Tayfun Sönmez, 1997, 1999). Thus, in addition to experiments with truncations of ROLs, we must also conduct computational experiments involving reductions in stated capacities.

B. Experiments to Determine Upper Bounds for Profitable Strategic Behavior

1. Preliminary Experiments: Truncation of ROLs at the Match Point.—As noted above, in a simple market, if an applicant is matched to his kth-choice program, or if the lowest-ranked applicant a program is matched to is its kth-choice applicant, truncation of the ROL at the kth entry would have no influence on the match. This is because, in a simple match, the applicant- or program-proposing algorithms never have to “backtrack” on an ROL. But in the NRMP, backtracking can occur, because of the match variations. Thus, before exploring what truncations, if any, could have a strategic effect on the match, it was first necessary to see whether truncation at the match point [i.e., deleting the k + 1 and higher (less-preferred) choices for a participant who was matched to his kth choice] could influence the result of the match under either algorithm, and how much. The truncations of applicant ROLs and program ROLs were investigated separately, for each algorithm, for the 1993, 1994, and 1995 matches. In the majority of cases no change was produced when all ROLs were truncated at the match point; and in no case were more than three applicants affected by such truncations. (Over the more than 60,000 applicants involved in these experiments, only four were affected by truncations of applicants’ ROLs; see Table B1 in Appendix B for the detailed results.) Thus, truncations at the match point, while not entirely without effect, do not play a substantial role; they affect on the order of 0.01 percent of applicants, an order of magnitude smaller than the effects of changing algorithms.
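The simple-market benchmark for this preliminary experiment can be checked in a few lines. The sketch below is an illustration under stated assumptions, not the experiment run on NRMP data: it assumes a one-to-one market with complete, uniformly random preference lists, and the helper names (`deferred_acceptance`, `truncate_at_match`, `random_simple_market`) are hypothetical. It truncates every applicant's list at the match point and reruns the program-proposing procedure.

```python
import random


def deferred_acceptance(proposer_prefs, receiver_prefs):
    """One-to-one deferred acceptance. Returns {proposer: receiver}."""
    rank = {r: {p: i for i, p in enumerate(lst)}
            for r, lst in receiver_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}
    held = {}                       # receiver -> proposer currently held
    free = list(proposer_prefs)
    while free:
        p = free.pop()
        lst = proposer_prefs[p]
        while next_choice[p] < len(lst):
            r = lst[next_choice[p]]
            next_choice[p] += 1
            if p not in rank.get(r, {}):
                continue            # r did not rank p, so r rejects p
            rival = held.get(r)
            if rival is None or rank[r][p] < rank[r][rival]:
                held[r] = p
                if rival is not None:
                    free.append(rival)
                break
    return {p: r for r, p in held.items()}


def truncate_at_match(prefs, match):
    """Delete the k+1 and lower choices of anyone matched to a kth choice."""
    return {x: lst[: lst.index(match[x]) + 1] if x in match else list(lst)
            for x, lst in prefs.items()}


def random_simple_market(n, seed=0):
    """Assumed test market: n applicants, n one-position programs."""
    rng = random.Random(seed)
    ids = list(range(n))
    applicants = {a: rng.sample(ids, n) for a in ids}   # random orderings
    programs = {p: rng.sample(ids, n) for p in ids}
    return applicants, programs


applicants, programs = random_simple_market(30, seed=1)
by_program = deferred_acceptance(programs, applicants)   # program -> applicant
match = {a: p for p, a in by_program.items()}            # applicant -> program
rerun = deferred_acceptance(programs, truncate_at_match(applicants, match))
unchanged = rerun == by_program
print("match unchanged by truncation at the match point:", unchanged)
```

In a market of this simple kind the rerun should reproduce the original matching, so any change observed in the NRMP experiments can be attributed to the match variations rather than to the deferred-acceptance procedure itself.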

Because we have now seen that information beyond the match point influences the outcome for only a tiny percentage of participants, concentrating on truncations will give us a comparably good approximation for the numbers of participants who could potentially profit from any kind of strategic manipulation of ROLs. The computational experiments which follow, therefore, will concentrate on identifying an upper bound on the number of participants who (if they had the necessary information) could potentially profit from strategic behavior involving truncations of their ROLs above the match point, and (for programs) reductions in the number of positions they offer in the match below the number of applicants to which they were matched.

2. Experiments to Determine Upper Bounds.—As discussed above, the kinds of strategic manipulation to be considered involve truncation of ROLs by applicants or programs, and reductions in stated numbers of positions (quotas) by programs. Since we want to know how often a single agent can profitably manipulate the stated ROL, we could in principle conduct a separate experiment for each participant, but this would be computationally infeasible. Consequently we need to design an efficient experiment that will let us tightly bound the number of individuals who can potentially profit from manipulating their ROLs.

The manipulations involving program quotas raise the question of how to handle reversions of positions when quotas are to be different from those in the data. Similar questions arise when truncating program ROLs, as this may increase unfilled positions. All of the experiments concerning strategic behavior of programs handle reversions by fixing quotas at the final quotas observed after the match with the original match data. None of the results are likely to be sensitive to this simplification, as shown by the results of the sequencing experiments discussed in Section III and detailed in Appendix A.

For each of the strategic manipulations whose potential magnitude is to be assessed, the chief difficulty in designing the experiments is that a change in a single ROL or quota has two kinds of effects: it may potentially change the match of the applicant or residency program whose ROL or quota is changed, but it may also potentially change the match of other applicants and residency programs. To see why, suppose that the ROL of some applicant is truncated above his current match point, and that the match under one of the algorithms is rerun after making (only) this change. Then the applicant whose ROL was changed may do better (by being matched to a more-preferred choice) or worse (by being unmatched instead of matched). At the same time, other applicants may do better (and other residency programs may do worse) because of the availability of the position previously held by the applicant whose list was truncated.

This means that, if we truncate a group of applicant ROLs, for example, and see how many of the applicants in this group receive a better match as a consequence of this change, we will be looking at an overestimate of the number of applicants in the group who could have benefited from truncating their own ROL; many of them will have instead profited from someone else’s truncation (even if that person himself became unmatched as a result of his own truncation). Thus, the number obtained in this way would be an upper bound on the number that would have benefited by truncating their own ROL, while holding all others’ constant. But we do not have to settle for this upper estimate; we can refine it iteratively, by now continuing to truncate the ROLs only of those applicants whose match improved as a result of the previous (collective) truncations. This will allow us to further eliminate from the set of truncations those who profited from the truncations of applicants who were themselves harmed by their own truncation. Proceeding in this way, we can continue until no more reductions in the sample are achieved. This final number will still be an upper bound, of course, since even in a group of truncators who all do better when they all truncate their preferences, some may be profiting from the truncated ROLs of the others, not from their own truncation.

Experiments were conducted separately for applicants and for programs, and separately for each of the two algorithms. A computational experiment for applicants in a given year starts by truncating all ROLs just above the (lowest) match point (i.e., every applicant’s primary ROL was truncated just above the match he received when no ROLs were truncated using the algorithm in question). For example, if an applicant originally matched to rank 3 on his primary ROL, the truncated ROL contained only his first two choices. Of course, many applicants were left unmatched by this truncation, while others received preferred matches (these were the only two possibilities at this stage). Then at the next step, the ROLs of those who had truncated their lists but did not improve were restored to their original length, and the process was repeated with the smaller number of truncations that remained. This process was repeated until it converged. Computational experiments for programs were structured similarly, starting with every program’s ROL truncated just above the lowest-ranked match it received. The full results for the NRMP for 1987 and 1993–1996 are given in Appendix B. (Table B2 reports the results of the truncation of applicant ROLs for each algorithm; Table B3 reports the results for programs.) The results can be summarized by looking at the final upper bounds of the number of applicants and the number of programs that could possibly benefit from truncating their ROLs.
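The iterative refinement just described can be sketched as follows. This is a minimal illustration under assumptions, not the procedure as run on the NRMP data: it uses a one-to-one market without match variations, truncates applicant ROLs only, and the names `truncation_upper_bound`, `program_proposing`, and `applicant_proposing` are hypothetical.

```python
def deferred_acceptance(proposer_prefs, receiver_prefs):
    """One-to-one deferred acceptance. Returns {proposer: receiver}."""
    rank = {r: {p: i for i, p in enumerate(lst)}
            for r, lst in receiver_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}
    held = {}                       # receiver -> proposer currently held
    free = list(proposer_prefs)
    while free:
        p = free.pop()
        lst = proposer_prefs[p]
        while next_choice[p] < len(lst):
            r = lst[next_choice[p]]
            next_choice[p] += 1
            if p not in rank.get(r, {}):
                continue            # r did not rank p, so r rejects p
            rival = held.get(r)
            if rival is None or rank[r][p] < rank[r][rival]:
                held[r] = p
                if rival is not None:
                    free.append(rival)
                break
    return {p: r for r, p in held.items()}


def applicant_proposing(applicant_prefs, program_prefs):
    return deferred_acceptance(applicant_prefs, program_prefs)


def program_proposing(applicant_prefs, program_prefs):
    by_program = deferred_acceptance(program_prefs, applicant_prefs)
    return {a: p for p, a in by_program.items()}


def truncation_upper_bound(applicant_prefs, program_prefs, run_match):
    """Set of applicants who still improve when all of them truncate together."""
    base = run_match(applicant_prefs, program_prefs)   # applicant -> program
    truncators = set(base)          # start with every matched applicant
    while True:
        trial = dict(applicant_prefs)
        for a in truncators:
            cut = applicant_prefs[a].index(base[a])
            trial[a] = applicant_prefs[a][:cut]        # just above the match
        result = run_match(trial, program_prefs)
        # A truncator is either unmatched or matched to a preferred choice.
        improved = {a for a in truncators if a in result}
        if improved == truncators:
            return truncators       # converged: no further reduction
        truncators = improved       # restore the ROLs of the others


applicants = {"a1": ["p1", "p2"], "a2": ["p2", "p1"]}
programs = {"p1": ["a2", "a1"], "p2": ["a1", "a2"]}
bound_pp = truncation_upper_bound(applicants, programs, program_proposing)
bound_ap = truncation_upper_bound(applicants, programs, applicant_proposing)
print(len(bound_pp), len(bound_ap))
```

In this hypothetical two-by-two market the bound is two applicants under the program-proposing procedure and zero under the applicant-proposing procedure, in line with what the theory of simple markets predicts.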

The results are reported and analyzed below, first for the NRMP matches, and then for Thoracic Surgery.

a. Results for the NRMP.—The truncation experiments with applicants’ ROLs yield the upper bounds shown in Table 4 for the two algorithms in the years studied. As expected, more applicants can benefit from list truncation under the preexisting NRMP algorithm than under the applicant-proposing algorithm. Note that the number of applicants who could even potentially benefit from truncating their ROLs under the preexisting NRMP algorithm is in each year almost exactly equal to the number of applicants who received a preferred match under the applicant-proposing match (line 2 of Table 2). We will return to this point in a moment, but note that it suggests that this upper bound is very close to the precise number that would be predicted in the absence of match variations.

The truncation experiments with programs’ ROLs yield the upper bounds shown in Table 5. As expected, some programs can benefit from list truncation under either algorithm. However, consistently more programs benefit from list truncation under the applicant-proposing algorithm than under the preexisting NRMP algorithm. Note that, although the numbers of programs in these upper bounds remain small, they are in many cases about twice as large as the number of programs that received a preferred match at the stable matching produced by the algorithm other than the one being manipulated. (That is, referring back to Table 2, we see, for example, that in 1995 only 14 programs preferred the matching produced by the preexisting NRMP algorithm to the one produced by the applicant-proposing algorithm, but we now find 36 programs in our upper bound of programs that could potentially profit from a manipulation of the applicant-proposing algorithm.) It therefore seemed worthwhile to examine these upper bounds further and see if they were overestimates.

For each algorithm, this was first done by taking a 50-percent sample of the programs contained in the upper bound for 1995 and restarting the truncation experiment with only these programs having truncated ROLs. The idea is that, if each of these programs can in fact benefit from its own truncation, the experiment would stop after the first iteration, with no further reductions in the upper bound. But if in fact the upper bound is an overestimate, and some of the programs in it are benefiting not from their own truncated ROLs, but from the truncation of one of the other ROLs in the upper bound, then on average half of such “false positives” in our 50-percent sample would have been benefiting from the truncation by one of the programs in the other 50 percent, which are no longer truncated. In this case we would iterate until the number of truncators who improved their outcome again stabilized at a new, lower upper bound. This is in fact what happened; the new estimates for 1995 (equal to twice the number obtained from the 50-percent sample) are compared to the old ones in Table 6. These results confirm that the numbers of programs that can benefit from the ROL truncations stated earlier are indeed overestimates.

TABLE 4—UPPER LIMIT OF THE NUMBER OF APPLICANTS WHO COULD BENEFIT BY TRUNCATING THEIR LISTS AT ONE ABOVE THEIR ORIGINAL MATCH POINT

        Upper limit
Year    Preexisting NRMP algorithm    Applicant-proposing algorithm
1987    12                            0
1993    22                            0
1994    13                            2
1995    16                            2
1996    11                            9

TABLE 5—UPPER LIMIT OF THE NUMBER OF PROGRAMS THAT COULD BENEFIT BY TRUNCATING THEIR LISTS AT ONE ABOVE THE ORIGINAL MATCH POINT

Year    Preexisting NRMP algorithm    Applicant-proposing algorithm
1987    15                            27
1993    12                            28
1994    15                            27
1995    23                            36
1996    14                            18

A further analysis was undertaken for each of the five years, to compare the specific individual programs and applicants who appear in these upper bounds as potentially benefiting from ROL truncations with the programs and applicants whose results changed when the algorithm changed. This analysis indicated that those who could benefit from ROL truncations were, for the most part, those who did differently (generally worse) when the algorithm was changed from their side proposing to the other side proposing (without ROL truncations). For example, the applicants who can benefit from ROL truncations when the program-proposing algorithm is used are very largely the same as those who benefit when the algorithm is changed to an applicant-proposing algorithm with no ROL truncations. Thus, in this respect also, it appears that the theory for simple markets provides a good approximation of the situation in the NRMP match.

We next turn to the question of capacity manipulation by programs. Recall that in an actual match this could be considered by a program in the context of either an early agreement (for example with an independent applicant) or in anticipation that some positions would be filled postmatch.

TABLE 6—REFINED ESTIMATE OF THE UPPER LIMIT OF THE NUMBER OF PROGRAMS THAT COULD IMPROVE THEIR RESULTS BY TRUNCATING THEIR OWN ROLs IN 1995

Estimate                                    Preexisting NRMP algorithm    Applicant-proposing algorithm
Original results                            23                            36
Current estimate (still an upper limit)     12                            22


An initial experiment was run setting all program quotas to the number of positions filled with the algorithm in question and the original data. (This is analogous to the initial experiment involving truncations of the ROLs at the match point, rather than above it.) In a simple match without NRMP match variations, this would be expected to have no impact on the results. However, with NRMP match data some differences were observed, as seen in Table 7. With the applicant-proposing algorithm, the differences are negligible. However, more differences were observed with the preexisting NRMP algorithm, and the results obtained by setting the quotas to the original positions filled tended to produce better results for the programs.

In order to identify programs that could improve their remaining matches by further reducing their quotas, an iterative technique was employed similar to that used to investigate the effects of ROL truncations. After several iterations revised downward the upper bounds obtained in this way, the resulting upper bounds on the number of programs that could potentially profit from stating lower quotas were as shown in Table 8. Again, these numbers are still estimates of the upper bound; further refinement is still possible. However, given the size of these numbers, it seems clear that only a very small number of programs (less than 1 percent) could improve their remaining matches by reducing their quotas. This does not appear to be an advisable strategy for programs to follow with either algorithm.

b. Results for Thoracic Surgery.—Because the Thoracic Surgery match does not have match variations, the theory tells us precisely which applicants and programs could improve their match by an optimal manipulation. As a check on our computational procedures, we confirmed these predictions by running the same computational experiments on ROL truncations as described for the NRMP matches. The results, summarized in Table 9, are as expected. Thus, in Thoracic Surgery as in the larger and more complex NRMP match, the opportunities for strategic manipulation are essentially nonexistent under either algorithm. (Colenbrander [1996] reaches essentially the same conclusions about the specialty matches he maintains.)


TABLE 7—RESULTS WITH INPUT QUOTAS SET TO POSITIONS FILLED, COMPARED TO ORIGINAL RESULTS

(Pre. = preexisting NRMP algorithm; App. = applicant-proposing algorithm)

                 1993            1994            1995
Result           Pre.    App.    Pre.    App.    Pre.    App.
Programs
  Improve        12      2       9       none    25      none
  Do worse       none    none    3       2       9       2
Applicants
  Improve        none    none    3       2       6       2
  Do worse       12      2       9       none    27      none


VI. Why the Differences Are Small: Insights from the Theory of Simple Markets

All the results to this point can be characterized by noting that the theory of simple matches, without match variations, gives a good prediction of the direction of each of the comparisons, and, in addition, the size of all the changes has been very small. This section explores what insights we can get from simple markets to help explain why these differences are so small. The results in this section are based on computational comparisons similar to those discussed earlier, but now concerning hypothetical markets without any match variations.

The small differences between algorithms we have been seeing reflect that, in each of the years studied, the set of stable matchings has been small, as measured by the number of participants who receive different matches from the program-proposing and applicant-proposing algorithms.11 It is therefore of interest to consider how the set of stable matchings looks in comparably large markets when we concentrate on simple matches. For this purpose, we consider the very simple matching markets with n firms (each with one position) and n applicants, as n approaches the size of the markets we are studying, namely, the specialty markets like Thoracic Surgery and the general NRMP match.

TABLE 8—REVISED ESTIMATE OF THE UPPER BOUND OF THE NUMBER OF PROGRAMS THAT COULD IMPROVE THEIR REMAINING MATCHES BY REDUCING QUOTAS

Year    Preexisting NRMP algorithm    Applicant-proposing algorithm
1987    28                            8
1993    16                            24
1994    32                            16
1995    8                             16
1996    44                            32

One factor that strongly influences the size of the set of stable matchings (which coincides with the core in this simple model) is the correlation of preferences among programs and among applicants. When preferences are highly correlated (i.e., when similar programs tend to agree which are the most desirable applicants, and applicants tend to agree which are the most desirable programs), the set of stable matchings is small. (When preferences are perfectly correlated, then there is a unique stable matching, so both algorithms would produce the same matching.) However, as the correlation of preferences goes down, the size of the set of stable matchings grows, and more and more participants would be matched differently by the two algorithms. This is true independently of the size of the market.

11 This is the natural measure for the size of the set of stable matchings in the present context, since the concern is with how many market participants will be affected by a change in algorithms. Note, however, that it is different from the more common measure of the size of the set of stable matchings, the number of distinct stable matchings. If 20 applicants receive different assignments at different stable matchings, there could be as many as 2^10 = 1,024 different stable matchings, in case the 20 applicants can be resolved into 10 independent pairwise interchanges of positions, or there could be as few as two stable matchings, if all 20 applicants are involved in a single irreducible cycle. In either case, if there are 20,000 jobs being filled, we have been focusing on the approximately 20 applicants who receive different assignments when we conclude that the set of stable matchings is small.


TABLE 9—THORACIC SURGERY: (A) NUMBERS OF APPLICANTS WHO COULD IMPROVE MATCHES BY TRUNCATING THEIR ROLs; (B) NUMBERS OF PROGRAMS THAT COULD IMPROVE MATCHES BY TRUNCATING THEIR ROLs

Algorithm              1991    1992     1993     1994    1996

A. Applicants Who Could Improve by Truncating Their ROLs:
Preexisting NRMP       none    2 (a)    2 (a)    none    none
Applicant-proposing    none    none     none     none    none

B. Programs That Could Improve by Truncating Their ROLs:
Preexisting NRMP       none    none     none     none    none
Applicant-proposing    none    2 (b)    2 (b)    none    none

(a) 2 applicants improve (same ones who did better when the algorithm changed).
(b) 2 programs improve (same programs that did worse when the algorithm changed).


It turns out, however, that the size of the market also plays a critical role, in an interesting way. Consider the case in which preferences are uncorrelated (so the set of stable matchings is large). If every applicant could somehow interview and be interviewed for all of the positions, then the set of stable matchings would grow larger and larger (even as a percentage of the number of applicants who could get different stable matchings) as the number of applicants and positions grew. Figure 1 shows that this percentage grows to over 90 percent by the time n reaches 1,000.

Of course, in a real market there is a limit to how many interviews an applicant can have, or a program can conduct. When we take this into account, we see that the set of stable matchings quickly becomes very small as the market becomes large.

Specifically, let k equal the number of interviews a candidate can have, and let n equal the number of applicants and positions in the market. Then, even when preferences are completely uncorrelated, as k/n becomes small, the set of stable matchings becomes small. For example, if k = 15 (not an unreasonable approximation for the NRMP) and n > 10,000, fewer than 0.1 percent of applicants would receive a different match from the two algorithms.12 That is, even with completely uncorrelated preferences, we see in this simple market the same one-in-a-thousand order of magnitude that we see in the NRMP. And for simple markets the size of the specialty matches like Thoracic Surgery, with n on the order of 100 positions, if we suppose that applicants interview at no more than k = 10 programs we find only about 2 percent of applicants receiving different matches from the two algorithms. Figure 2 graphs the curves for fixed k, as n goes from 10 to 10,000.

12 The variance (based on 1,000 randomly generated simple markets) is well under 0.001 percent.

Especially in view of the fact that preferences are not uncorrelated in the medical matches, this means that the orders of magnitude of the effects studied in the actual matches are very comparable to what we should expect of simple matches with similar values of k and n. Thus (once we look at both k and n) these simple markets turn out to provide a good approximation not only for the direction of the effects we are seeing, but also for their size.
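A computation of this kind can be sketched as follows. The details are assumptions made for illustration, since the text does not spell out how the artificial markets were generated: here each of n applicants ranks k programs drawn uniformly at random, each one-position program ranks the applicants who listed it in a uniformly random order, and the helper names (`random_market`, `fraction_with_different_matches`) are hypothetical.

```python
import random


def deferred_acceptance(proposer_prefs, receiver_prefs):
    """One-to-one deferred acceptance. Returns {proposer: receiver}."""
    rank = {r: {p: i for i, p in enumerate(lst)}
            for r, lst in receiver_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}
    held = {}                       # receiver -> proposer currently held
    free = list(proposer_prefs)
    while free:
        p = free.pop()
        lst = proposer_prefs[p]
        while next_choice[p] < len(lst):
            r = lst[next_choice[p]]
            next_choice[p] += 1
            if p not in rank.get(r, {}):
                continue            # r did not rank p, so r rejects p
            rival = held.get(r)
            if rival is None or rank[r][p] < rank[r][rival]:
                held[r] = p
                if rival is not None:
                    free.append(rival)
                break
    return {p: r for r, p in held.items()}


def random_market(n, k, rng):
    """Assumed generator: uncorrelated preferences, k interviews each."""
    applicant_prefs = {a: rng.sample(range(n), k) for a in range(n)}
    program_prefs = {p: [] for p in range(n)}
    for a, lst in applicant_prefs.items():
        for p in lst:
            program_prefs[p].append(a)   # programs rank only interviewees
    for lst in program_prefs.values():
        rng.shuffle(lst)
    return applicant_prefs, program_prefs


def fraction_with_different_matches(n, k, seed=0):
    """C(n)/n: share of applicants matched differently by the two algorithms."""
    rng = random.Random(seed)
    applicant_prefs, program_prefs = random_market(n, k, rng)
    app_opt = deferred_acceptance(applicant_prefs, program_prefs)
    by_program = deferred_acceptance(program_prefs, applicant_prefs)
    prog_opt = {a: p for p, a in by_program.items()}
    differ = sum(1 for a in range(n) if app_opt.get(a) != prog_opt.get(a))
    return differ / n


for n in (100, 1000):
    print(n, fraction_with_different_matches(n, k=10, seed=1))
```

Repeating the call over many seeds and a range of n, with k held fixed, is one way to trace curves of the kind described for Figure 2; the numbers obtained depend on the assumed generator and need not reproduce the figures reported here.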

The reason this is important for the present study of the NRMP and specialty matches is that, in the theoretical study of simple markets, we can look at what would happen when we know agents’ true preferences, not just the ROLs they submit to the match (whereas in the study of real matches we have been using as data the submitted ROLs). One theoretical possibility for why we find such small potential for strategic manipulation is that our data has been collected after such manipulation has already taken place. That is, one counterhypothesis might explain our results by positing that there are substantial opportunities for strategic manipulation but that these have been exhausted by the time we look at the ROLs submitted to the match, because the participants have already behaved strategically in an optimal way. Another counterhypothesis could be that the hybrid nature of the preexisting NRMP algorithm in fact produces matches that are far from the worst possible stable matching for applicants, and that the set of stable matchings is therefore substantially larger than we detect. The results discussed in this section show that these hypotheses are implausible, because when we looked at similarly sized artificial matches, in which we can examine the hypothetical participants’ true preferences, we find that the set of stable matchings is close to the size we have computed from the ROL data. Thus the study of simple markets provides an explanation of not only the direction of the effects we have been examining, but also their small size.13

FIGURE 1. SIZE OF THE SET OF STABLE MATCHINGS AS A FRACTION OF n, WHEN k = n (UNCORRELATED PREFERENCES)

Note: C(n) is the number of applicants who get different stable matches, when the market size is n.


13 It remains an open problem to develop analytical results that explain why the core of this simple market shrinks as the market grows when the number of interviews an applicant can go on remains constant. The fact that every worker who does get a different job at different stable matchings is involved in certain sorts of preference cycles may provide an avenue for obtaining such results.


VII. Theory and Computation in Economic Design: Some Methodological Reflections

Perhaps the first rule of any design effort is that “details matter.” The details determine what outcomes are even feasible, and so they matter in the most basic aspects of design; and they have implications for all of the market’s properties, so they matter for the subtlest aspects of the design’s consequences. Thus, every design effort will be different. But if we are to develop a body of knowledge about design practice in economics, we need to think about the methodological issues that may be common to many design efforts. This section is an attempt to put the methodological issues encountered in the NRMP design and evaluation into a context that may be useful for other design efforts. Specifically, this design effort involved the continual interplay among various aspects of simple theory, computational experiments, and theoretical computation. The simple theory guided the design of computational experiments on the complex system, which provided unpredicted results that were then explained by theoretical computation.

The reason why there are gaps between theory and design is that, just as design is detailed, theoretical models must often be sparse, to be useful for organizing and directing work in a variety of applications whose connections may become apparent only with the benefit of theory. Much of this paper has therefore been concerned with filling the gaps between simple abstract markets and complex real ones. But before we discuss the filling of gaps, it is useful to recall the essential role played by the theory of simple matching markets. This role ranges from suggesting the basic design of the clearinghouse algorithm and the comparisons of the algorithms, to directing attention to aspects of the market in which problems might be anticipated, and to offering insights into how these might be overcome.

FIGURE 2. SIZE OF THE SET OF STABLE MATCHINGS AS A FRACTION OF n FOR DIFFERENT VALUES OF k (UNCORRELATED PREFERENCES)

Notes: C(n) is the number of applicants who get different stable matches, when the market size is n; k is the number of programs on an applicant’s ROL.

It was the existing simple theory, and the empirical studies it permitted to be conducted on field data, that pointed to the importance of stable matchings. Although counterexamples showed that stable matchings might not exist in the complex American medical market (Roth, 1984), the theory of simple markets suggested a general architecture for an algorithm to find stable matchings. Furthermore, it showed that algorithms in which proposals were issued by applicants could be expected to produce stable matchings as favorable as possible to applicants. In short, the body of theory that existed prior to the start of this design (e.g., as summarized in Roth and Sotomayor [1990]) already constituted a rough road map for the mechanism design and evaluation reported here.

At the same time, the existing body of theory, through counterexamples designed to explore its limits (inspired by empirical studies of existing markets), pointed to questions that needed to be answered. These included the role of sequencing in design of the algorithm, the frequency with which the algorithm might fail to find a stable matching, and the frequency with which opportunities for strategic manipulation might arise. These all required estimations of magnitudes, which in turn required computational experiments on the data. Some of these computational experiments were straightforward to conduct. But for estimating how often strategic opportunities might arise, the theory played an essential role in the design of the computational experiments.

15 Of course, it was necessary to check directly that the program worked, and in fact it was easy to confirm that the matchings it produced were stable as well as feasible with

771 VOL. 89 NO. 4 ROTH AND PERANSON: MATCHING MARKET FOR PHYSICIANS

Specifically, although the main conclusions about strategic behavior do not carry over from the simple to the complex market, the theory of the simple case gives us not only final conclusions, but also insight into the way that strategic behavior works. In the case of misrepresentation of ROLs, the way an applicant might gain an advantage, in either the simple or complex markets, is to state a ROL that causes him, at some point in the algorithm, to make a rejection that would not have been made if he had submitted his true preferences. This rejection causes a residency program to have a vacancy and hence make an offer to another applicant, who in turn may make a different rejection than he would have if the original applicant had stated his true preferences. It is the propagation of this "vacancy chain" through the market that raises the possibility that an applicant could do better than to state his true preferences.14 The fact that the potential advantage comes from rejections being made implies in the simple model that the possibility of profitable strategic misrepresentations of ROLs can be investigated by looking at only the small subset of misrepresentations that consist of truncations. To see if this was approximately true for the complex market required a computational experiment, and (when this proved to be the case) it became computationally feasible to investigate the strategic properties of the complex market, through an experiment concentrating on truncations. Thus, the theory allowed us to see what computational experiments would give us the answer to a question that the theory alone could not answer.
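The vacancy-chain mechanism described above can be illustrated with a minimal sketch. It assumes a one-to-one simple market in which programs issue the offers; this is a deliberate simplification and not the NRMP algorithm. The function name `program_proposing_match` and the two-program, two-applicant market are hypothetical and chosen only for illustration.

```python
def program_proposing_match(program_rols, applicant_rols):
    """One-to-one deferred acceptance with programs making offers.

    Returns a dict mapping each matched applicant to a program.
    An applicant rejects any program that is not on his or her ROL.
    """
    rank = {a: {p: i for i, p in enumerate(rol)}
            for a, rol in applicant_rols.items()}
    next_offer = {p: 0 for p in program_rols}
    held = {}                       # applicant -> program currently held
    free = list(program_rols)       # programs with a vacancy
    while free:
        p = free.pop()
        rol = program_rols[p]
        if next_offer[p] >= len(rol):
            continue                # p has exhausted its ROL and stays unfilled
        a = rol[next_offer[p]]
        next_offer[p] += 1
        if p not in rank.get(a, {}):
            free.append(p)          # rejection: p is not on a's (stated) ROL
        elif a not in held:
            held[a] = p
        elif rank[a][p] < rank[a][held[a]]:
            free.append(held[a])    # the displaced program now has a vacancy
            held[a] = p
        else:
            free.append(p)
    return held


# Hypothetical market: each side's first choice is the other side's last.
programs = {"r1": ["a1", "a2"], "r2": ["a2", "a1"]}
true_rols = {"a1": ["r2", "r1"], "a2": ["r1", "r2"]}

truthful = program_proposing_match(programs, true_rols)

# a1 truncates the ROL after the first choice, which forces a rejection of r1.
truncated_rols = dict(true_rols, a1=["r2"])
truncated = program_proposing_match(programs, truncated_rols)
```

In this toy market the truncation makes a1 reject r1, r1 then offers to a2, a2 releases r2, and r2's next offer reaches a1. The chain of rejections is the only channel through which a1 gains, which is why restricting attention to truncations is natural in the simple model.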

While computational experiments on the data allow us to get answers that may not be available from simple theory, they do not necessarily let us understand why the answers are what they are. In addition, results obtained from exploring a large and complex data set with a large and new piece of software (the new algorithm) need to be checked in some way, to make sure that the results are not due to some unanticipated artifact of the way the algorithm deals with the complexities of the data.15 That is, although properly constructed computational experiments on the data offer us answers to questions we cannot answer with theory alone, we need both to check and to understand these answers before we can have the confidence in them that we would like to have before recommending that the new algorithm be considered for use in the market.

14 The propagation of vacancy chains as such in simple markets is a topic that was explored in the course of this design effort, and is reported in Yossi Blum et al. (1997).

We addressed these issues in two ways: by computational experiments on the data from the Thoracic Surgery matches and by theoretical computation to determine how the size of the set of stable matches behaved in large simple markets. The first of these allowed us to exercise the software on a medical market free of the match variations present in the general medical market. The theory therefore permitted us to interpret the comparisons between the two algorithms as unambiguously measuring the size of the set of stable matchings. The small size of this set therefore had no possibility of resulting from some aspect of how the algorithm deals with match variations.
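A theoretical computation of this kind can be sketched as follows. The sketch makes several assumptions that are not taken from the study: every program has a single position, there are as many programs as applicants, each applicant lists k programs drawn uniformly at random, and each program ranks the applicants who listed it in a uniformly random order (uncorrelated preferences). All function names are hypothetical. The quantity counted is C(n), the number of applicants who receive different matches at the applicant-optimal and program-optimal stable matchings.

```python
import random


def deferred_acceptance(proposer_rols, receiver_rols):
    """One-to-one deferred acceptance; returns dict proposer -> receiver."""
    rank = {r: {p: i for i, p in enumerate(rol)}
            for r, rol in receiver_rols.items()}
    next_choice = {p: 0 for p in proposer_rols}
    held = {}                                  # receiver -> proposer held
    free = list(proposer_rols)
    while free:
        p = free.pop()
        if next_choice[p] >= len(proposer_rols[p]):
            continue                           # ROL exhausted, stays unmatched
        r = proposer_rols[p][next_choice[p]]
        next_choice[p] += 1
        if p not in rank[r]:
            free.append(p)                     # p is unacceptable to r
        elif r not in held:
            held[r] = p
        elif rank[r][p] < rank[r][held[r]]:
            free.append(held[r])
            held[r] = p
        else:
            free.append(p)
    return {p: r for r, p in held.items()}


def random_market(n, k, rng):
    """n applicants with ROLs of length k over n single-position programs."""
    programs = list(range(n))
    applicant_rols = {a: rng.sample(programs, k) for a in range(n)}
    program_rols = {r: [] for r in programs}
    for a, rol in applicant_rols.items():
        for r in rol:
            program_rols[r].append(a)          # programs rank only their listers
    for rol in program_rols.values():
        rng.shuffle(rol)
    return applicant_rols, program_rols


def count_different(applicant_rols, program_rols):
    """C(n): applicants matched differently at the two extreme stable matchings."""
    applicant_optimal = deferred_acceptance(applicant_rols, program_rols)
    program_optimal = deferred_acceptance(program_rols, applicant_rols)
    by_applicant = {a: r for r, a in program_optimal.items()}
    return sum(1 for a in applicant_rols
               if applicant_optimal.get(a) != by_applicant.get(a))


def average_fraction(n, k, trials, seed=0):
    """Average of C(n)/n over `trials` randomly generated markets."""
    rng = random.Random(seed)
    total = sum(count_different(*random_market(n, k, rng)) for _ in range(trials))
    return total / (trials * n)
```

Calling `average_fraction(n, k, trials)` for a fixed k and increasing n gives the kind of series plotted in Figure 2. The numbers it produces depend on the assumptions listed above and should not be read as reproducing the figures reported in the paper.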

The computations on thousands of randomly generated simple markets with fixed length of ROLs and varying numbers of participants allowed us to see how the size of the set of stable matchings shrinks as the market grows, which establishes a new kind of core convergence result. This shows that the match variations in the medical market do not substantially contribute to the size of the set of stable matchings, since the results on the market data are entirely consistent with the results for similarly sized simple markets. Given the theoretical results on strategic misrepresentation, this core convergence result also shows that it is always a best reply for all


regard to all the match variations. The question we are referring to here is not whether the program does what it was designed to do, but rather whether the apparent small size of the set of stable matchings might have to do with some aspect of how the program handles the match variations.


but a tiny percentage of participants in large simple markets to state their true preferences.

Note that we distinguish between what we call the "computational experiments" on the actual NRMP data and the "theoretical computation" on the randomly generated simple markets. This has to do with our view that "theory" resides in the simplicity of the models and systematic nature of the conclusions, rather than the body of mathematical technique traditionally associated with theory. The theoretical computations tell us how the difference between the applicant- and firm-optimal stable matches varies with the size of a simple market. This new computational result, combined with existing theory, allows us to interpret this as precisely measuring the size of the core of the market, and to determine the implications this has for the possibility of profitable strategic manipulation. The theorems explaining why the core must converge as it does will surely follow (see Feldin [1999] for some progress in this direction).

In summary, the design process discussed here involved interplay among various aspects of simple theory, computational experiments, and theoretical computation.16 We suspect that, as we build a body of engineering practice in economics, this will prove to be a general pattern.

VIII. Concluding Remarks

The crisis of confidence that threatened to undermine participation in the NRMP was serious precisely because the kind of market failure which the NRMP was initially developed to correct arose when residency programs and applicants lost confidence in the existing market. But by the time of this modern crisis, the historical market failure and how it was corrected by the NRMP were understood; and so was the fact that similar market failures in British medical markets had occurred and been corrected with stable matching mechanisms, while unstable mechanisms had failed (Roth, 1990, 1991). In addition, the general class of market failures due to unraveling of appointment dates had been identified in many markets (Roth and Xing, 1994). Therefore, although physicians who had participated in the unraveling of the American medical market and in the formation of the NRMP were no longer active, it was not difficult to communicate to the participants in the modern market why it was desirable to focus on changes in the market that would not reignite the unraveling of appointment dates (Roth, 1996b).17 Thus, although what we knew about two-sided matching markets did not provide an immediate solution to the design of a new market for physicians, it provided clear guidelines and suggested clear approaches.

16 Laboratory experiments also have a role to play, although not one we will discuss here (but see Kagel and Roth [2000]).

17 Indeed the initial study proposal (Roth, 1995) quoted Hippocrates's famous dictum that, when preparing to treat a disease,

    The physician must be able to tell the antecedents, know the present, and foretell the future, must mediate these things, and have two special objects in view with regard to disease, namely, to do good or to do no harm.

In this connection it is worth mentioning that, particularly because confidence in the market was the key issue, the study was conducted in an unusually public way, with progress reports posted regularly on the internet (see <http://www.economics.harvard.edu/~aroth/nrmp.html>) and widely distributed to interested organizations of physicians and medical students. A final report, briefly summarizing the overall results as in Tables 1 and 2, was presented to the medical community at large in Roth and Peranson (1997).

It was nevertheless troubling to us at the outset of this design effort that not only did none of the standard theorems about simple matching markets apply directly to the medical market, but counterexamples to the conclusions of many of them were known to exist when the complications of the actual market were present. These counterexamples had the potential to be of great importance, as in the possibility that different stable matchings might yield different levels of employment (a possibility that does not arise in simple markets). Indeed, our results show that in this market this possibility is real and so cannot be ruled out with better theory. But of the more


than 100,000 applicants in the years we studied in detail, only two applicants (one in 1987 and one in 1996) would have changed from employed to unemployed or vice versa at the different stable matchings we consider (see Table 2). Because this difference was both tiny and unsystematic, it did not play a role in the market design.

This and the related results about the small number of applicants who receive different matches at the different stable matchings point to a need to develop theory in ways that will tell us not only about the possibility of different effects, but also about their probability and likely magnitudes. It seems to us that questions about magnitudes of the sort we encountered in the course of this design will often arise in efforts to employ economic theory in the design of institutions for complex markets. Theoretical computation can be a big help in this effort, as it was in the present case in clarifying the unexpected consequences of the simple fact that applicants can interview at only a small fraction of the available positions.

More generally, just as there is a chemical engineering literature (and not just literature about theoretical and laboratory chemistry) and a medical literature (and not just a biology literature), economists need to develop a scientific literature concerned with practical problems of design. An engineering-oriented design literature, and the theory that supports it, will be different from the basic science on which it depends, both in emphasis and in method. If we do not develop such a literature, the practical problems of design will be relegated to the arena of "just consulting," and we will fail to benefit from the accumulation of knowledge which is so evident in other kinds of engineering.


APPENDIX A: RESULTS OF THE COMPUTATIONAL EXPERIMENTS CONCERNED WITH SEQUENCING

Table A1 presents results from the sequencing experiments on the preexisting NRMP algorithm. Table A2 summarizes results of the experiments related to sequencing in the applicant-proposing algorithm. Table A3 compares the results when input quotas are set to final quotas and reversion processing is eliminated to the initial results.

TABLE A1—EFFECTS OF SEQUENCE IN WHICH PROGRAMS ARE PROCESSED

A. Results with Programs Processed in Descending Code Order Compared to Original Results with Preexisting NRMP Algorithm

Result            1993    1994    1995
Programs
  Improve         none    2       2
  Do worse        2       2       none
Applicants
  Improve         2       2       none
  Do worse        none    2       2

B. Sequencing of Reversions: Results with Input Quotas Set to Final Quotas and Reversion Processing Eliminated, Compared to Original Results with Preexisting NRMP Algorithm

Result            1993    1994    1995a
Programs
  Improve         2       none    none
  Do worse        none    2       2
Applicants
  Improve         none    2       2
  Do worse        2       none    none

Notes: In 1994, when some programs and applicants did better while others did worse, there was no correlation between the change in result and the code numbers of the applicants and programs.

a Subsequently, the years 1987 and 1996 were also examined with similar results: no applicants were affected in 1987, two were affected in 1996.


TABLE A2—SUMMARY OF RESULTS OF EXPERIMENTS RELATED TO SEQUENCING IN THE APPLICANT-PROPOSING ALGORITHM

Sequence of processing                                  1993                 1994                                            1995

A. Baseline Results (When Program Selected from Stack, Applicants Processed in Ascending Program Rank Number Sequence)

Applicants ascending; singles and couples intermixed
  Match result(a)                                       —                    —                                               —
  Loops detected                                        3                    6                                               4

B. Applicant and Couples Processing Sequence (When Program Selected from Stack, Applicants Processed in Ascending Program Rank Number Sequence)

Applicants descending; couples last
  Match result                                          same                 same                                            same
  Loops detected                                        0                    0                                               0
Applicants ascending; couples last
  Match result                                          same                 same                                            same
  Loops detected                                        2                    0                                               0
Applicants ascending; couples first
  Match result                                          2 applicants worse   2 applicants worse                              same
  Loops detected                                        3                    6                                               1
Applicants descending; couples first
  Match result                                          2 applicants worse   2 applicants worse                              same
  Loops detected                                        1                    59                                              3

C. Sequence of Processing Applicants Ranked by Program Selected from Program Stack (When Program Selected from Stack, Applicants Processed in Descending Program Rank Number Sequence)

Applicants ascending; singles and couples intermixed
  Match result                                          same                 9 applicants improved, 3 applicants worse(b)    same
  Loops detected                                        17                   148                                             62
Applicants ascending; couples last
  Match result                                          same                 9 applicants improved, 3 applicants worse(b)    same
  Loops detected                                        2                    0                                               0

a This is the base result to which others are compared.

b In part C, the results for the two experiments for 1994 (couples intermixed and couples last) were the same. In both cases, the differences in the results in part C as compared to the baseline results in part A were caused by chains resulting from two applicants doing worse in part C when compared to part A.


TABLE A3—RESULTS WITH INPUT QUOTAS SET TO FINAL QUOTAS AND REVERSION PROCESSING ELIMINATED, COMPARED TO INITIAL RESULTS WITH APPLICANT-PROPOSING ALGORITHM

Result            1993    1994    1995a
Programs
  Improve         none    none    none
  Do worse        none    2       2
Applicants
  Improve         none    2       2
  Do worse        none    none    none

a Subsequently, 1987 and 1996 were also examined, with no applicants affected in 1987 and a single chain of nine affected in 1996.

APPENDIX B: RESULTS OF THE COMPUTATIONAL EXPERIMENTS CONCERNED WITH TRUNCATION OF ROLS AND CAPACITY REDUCTIONS

The results of truncation at the match point are reported in Table B1. Table B2 shows the results for iterative truncations of applicant ROLs, while Table B3 shows the corresponding results for iterative truncations of program ROLs.


TABLE B1—TRUNCATIONS AT THE MATCH POINT

Difference in Result for Both the Preexisting NRMP Algorithm and the Applicant-Proposing Algorithm When Applicant ROLs Are Truncated at the Match Point:
  1993: none
  1994: 2 applicants improve, same positions filled
  1995: 2 applicants improve, same positions filled

Difference in Result for the Preexisting NRMP Algorithm When Program ROLs Are Truncated at the Match Point:
  1993: none
  1994: none
  1995: 2 applicants do worse, same positions filled

Difference in Result for the Applicant-Proposing Algorithm When Program ROLs Are Truncated at the Match Point:
  1993: none
  1994: 3 applicants do worse, same number of positions filled, but not same positions (3 programs filled 1 less position; 1 program filled 1 more position; 1 program filled 2 more positions; 1 additional position was reverted from one program to another)
  1995: none

TABLE B2—RESULTS FOR ITERATIVE TRUNCATIONS OF APPLICANT ROLS

                 1987                            1993                            1994                            1995                            1996
      Original NRMP      Applicant-   Original NRMP      Applicant-   Original NRMP      Applicant-   Original NRMP      Applicant-   Original NRMP      Applicant-
          algorithm       proposing       algorithm       proposing       algorithm       proposing       algorithm       proposing       algorithm       proposing
Run       T   T & I       T   T & I       T   T & I       T   T & I       T   T & I       T   T & I       T   T & I       T   T & I       T   T & I       T   T & I
  1  16,117   4,324  16,116   4,317  17,209   4,546  17,209   4,536  17,725   4,935  17,725   4,934  18,170   5,763  18,170   5,758  18,316   5,805  18,317   5,806
  2   4,324   1,894   4,317   1,887   4,546   2,093   4,536   2,082   4,935   2,361   4,934   2,359   5,763   2,907   5,758   2,899   5,805   2,915   5,806   2,917
  3   1,894     898   1,887     891   2,093   1,036   2,082   1,023   2,361   1,185   2,359   1,183   2,907   1,572   2,899   1,559   2,915   1,569   2,917   1,571
  4     898     437     891     429   1,036     514   1,023     498   1,185     602   1,183     598   1,572     857   1,559     844   1,569     861   1,571     864
  5     437     203     429     194     514     258     498     241     602     292     598     287     857     473     844     460     861     481     864     482
  6     203      93     194      84     258     135     241     116     292     151     287     143     473     251     460     238     481     271     482     271
  7      93      41      84      31     135      73     116      52     151      75     143      66     251     136     238     124     271     157     271     155
  8      41      24      31      13      73      48      52      25      75      40      66      31     136      79     124      67     157      89     155      87
  9      24      18      13       6      48      34      25      12      40      27      31      17      79      45      67      31      89      57      87      55
 10      18      14       6       2      34      27      12       5      27      18      17       7      45      31      31      17      57      36      55      33
 11      14      12       2       0      27      24       5       2      18      14       7       3      31      22      17       8      36      24      33      21
 12      12      12       —       —      24      22       2       0      14      13       3       2      22      18       8       4      24      19      21      15
 13       —       —       —       —      22      22       —       —      13      13       2       2      18      16       4       2      19      15      15      13
 14       —       —       —       —       —       —       —       —       —       —       —       —      16      16       2       2      15      14      13      12
 15       —       —       —       —       —       —       —       —       —       —       —       —       —       —       —       —      14      13      12      11
 16       —       —       —       —       —       —       —       —       —       —       —       —       —       —       —       —      13      12      11      10
 17       —       —       —       —       —       —       —       —       —       —       —       —       —       —       —       —      12      11      10       9
 18       —       —       —       —       —       —       —       —       —       —       —       —       —       —       —       —      11      11       9       9

Notes: Columns labeled "T" report the number of matches involving truncated ROLs. Columns labeled "T & I" report the number of matches involving truncated ROLs and "improved" matches.

TABLE B3—RESULTS FOR ITERATIVE TRUNCATIONS OF PROGRAM ROLS

                 1987                            1993                            1994                            1995                            1996
      Original NRMP      Applicant-   Original NRMP      Applicant-   Original NRMP      Applicant-   Original NRMP      Applicant-   Original NRMP      Applicant-
          algorithm       proposing       algorithm       proposing       algorithm       proposing       algorithm       proposing       algorithm       proposing
Run       T   T & I       T   T & I       T   T & I       T   T & I       T   T & I       T   T & I       T   T & I       T   T & I       T   T & I       T   T & I
  1   2,967   1,345   2,967   1,349   3,342   1,457   3,342   1,462   3,369   1,514   3,369   1,517   3,444   1,538   3,444   1,541   3,410   1,445   3,410   1,444
  2   1,345     670   1,349     675   1,457     740   1,462     748   1,514     809   1,517     813   1,538     783   1,541     790   1,445     727   1,444     725
  3     670     347     675     353     740     382     748     394     809     441     813     444     783     420     790     431     727     384     725     384
  4     347     186     353     194     382     201     394     216     441     249     444     255     420     237     431     248     384     213     384     212
  5     186     100     194     110     201     107     216     122     249     138     255     145     237     130     248     141     213     114     212     115
  6     100      55     110      66     107      64     122      79     138      79     145      86     130      77     141      89     114      71     115      72
  7      55      33      66      44      64      37      79      52      79      44      86      52      77      50      89      62      71      50      72      52
  8      33      21      44      33      37      22      52      37      44      31      52      39      50      35      62      47      50      35      52      39
  9      21      17      33      29      22      15      37      30      31      23      39      32      35      29      47      41      35      26      39      30
 10      17      15      29      27      15      13      30      28      23      20      32      30      29      26      41      38      26      21      30      25
 11      15      15      27      27      13      12      28      28      20      18      30      29      26      24      38      36      21      19      25      23
 12       —       —       —       —      12      12       —       —      18      16      29      28      24      23      36      36      19      18      23      22
 13       —       —       —       —       —       —       —       —      16      15      28      27      23      23       —       —      18      17      22      21
 14       —       —       —       —       —       —       —       —      15      15      27      27       —       —       —       —      17      16      21      20
 15       —       —       —       —       —       —       —       —       —       —       —       —       —       —       —       —      16      15      20      19
 16       —       —       —       —       —       —       —       —       —       —       —       —       —       —       —       —      15      14      19      18
 17       —       —       —       —       —       —       —       —       —       —       —       —       —       —       —       —      14      14      18      18

Notes: Columns labeled "T" report the number of matches involving truncated ROLs. Columns labeled "T & I" report the number of matches involving truncated ROLs and "improved" matches.


18 This definition of stability appears to account only for coalitions of size 1 or 2, but in fact it accounts for coalitions of any size (i.e., stable matchings are in the core; see Roth and Sotomayor [1990]).


APPENDIX C: FORMAL DEFINITIONS OF STABILITY

Simple Matching Markets

For markets without linkages between positions we use the "college admissions" model as reformulated in Roth (1985) and Roth and Sotomayor (1990 Ch. 5). There are two finite and disjoint sets, $F = \{f_1, \ldots, f_n\}$ and $W = \{w_1, \ldots, w_m\}$, of firms and workers. For each firm $f$ in $F$, there is a positive integer $q_f$, which indicates the number of (identical) positions $f$ has to offer.

An outcome is a matching of workers to firms, such that each worker is matched to at most one firm, and each firm is matched to at most its quota of workers. It will be convenient to denote a firm that has some number of unfilled positions as matched to itself in each of those positions, and similarly an unmatched worker will be matched to herself. To give a formal definition, define for any set $X$ an unordered family of elements of $X$ to be a collection of elements, not necessarily distinct, in which the order is immaterial.

A matching $\mu$ is a function from the set $F \cup W$ into the set of unordered families of elements of $F \cup W$ such that:

(i) $|\mu(w)| = 1$ for every worker $w$ and $\mu(w) = w$ if $\mu(w) \notin F$;

(ii) $|\mu(f)| = q_f$ for every firm $f$, and if the number of workers in $\mu(f)$, say $r$, is less than $q_f$, then $\mu(f)$ contains $q_f - r$ copies of $f$;

(iii) $\mu(w) = f$ if and only if $w$ is in $\mu(f)$.

Each worker has preferences over the firms (and the possibility of remaining unmatched in the market), and each firm has preferences over the workers (and the possibility of leaving a position unfilled). All preferences are transitive and strict (recall that, in the markets we consider, participants are obliged to submit rank orders which are necessarily strict). We will write $f_i >_w f_j$ to indicate that worker $w$ prefers $f_i$ to $f_j$. Similarly, $w_i >_f w_j$ represents firm $f$'s preferences $P(f)$ over individual workers. Firm $f$ is acceptable to worker $w$ if $f >_w w$, and worker $w$ is acceptable to firm $f$ if $w >_f f$ (i.e., an acceptable firm is one that is preferable to being unmatched, and an acceptable worker is one which the firm prefers to leaving a position unfilled).


Each worker's preferences over alternative matchings correspond exactly to her preferences over her own assignments at the two matchings. Things are not quite so simple for firms, because even though we have described firms' preferences over workers, each firm with a quota greater than 1 must be able to compare groups of workers in order to compare alternative matchings. It will be sufficient for our purposes to assume merely that a firm's preferences over groups of employees it could be matched with (i.e., over groups of no more than $q_f$ workers) are such that, for any two assignments that differ in only one worker, it prefers the assignment containing the more preferred worker (and is indifferent between them if it is indifferent between the workers). Any preferences of this sort are called responsive to the firm's preferences over individual workers (Roth, 1985).

A matching $\mu$ is individually irrational if $\mu(w) = f$ for some worker $w$ and firm $f$ such that either the worker is unacceptable to the firm or the firm is unacceptable to the worker. Such a matching will also be said to be blocked by the unhappy agent. A firm $f$ and worker $w$ will be said together to block a matching $\mu$ if they are not matched to one another at $\mu$ but would both prefer to be matched to one another than to (one of) their present assignments. That is, $\mu$ is blocked by the firm–worker pair $(f, w)$ if $\mu(w) \neq f$ and if $f >_w \mu(w)$ and $w >_f s$ for some $s$ in $\mu(f)$. (Note that $s$ may equal either some worker $w'$ in $\mu(f)$, or, if one of firm $f$'s positions is unfilled at $\mu(f)$, $s$ may equal $f$.) Matchings blocked in this way by an individual or by a pair of agents are unstable in the sense that there are agents with both the incentive (because preferences are responsive) and the power (under rules that allow any firm and worker to conclude an agreement with each other) to disrupt such matchings. We can now define a matching $\mu$ to be stable if it is not blocked by any individual or any firm–worker pair.18
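The definition of stability in the simple market translates directly into a check that can be run on any proposed matching. The following is a minimal sketch under assumed data structures that are not part of the paper: preferences are Python lists ordered from most to least preferred and containing only acceptable partners, and a matching is a dict from each worker to a firm or to `None` when the worker is unmatched. The function names are hypothetical.

```python
def individually_rational(matching, worker_prefs, firm_prefs, quotas):
    """No agent is matched to an unacceptable partner and no quota is exceeded."""
    load = {f: 0 for f in firm_prefs}
    for w, f in matching.items():
        if f is None:
            continue
        if f not in worker_prefs[w] or w not in firm_prefs[f]:
            return False
        load[f] += 1
    return all(load[f] <= quotas[f] for f in firm_prefs)


def blocking_pairs(matching, worker_prefs, firm_prefs, quotas):
    """All firm-worker pairs (f, w) that block an individually rational matching."""
    assigned = {f: [] for f in firm_prefs}
    for w, f in matching.items():
        if f is not None:
            assigned[f].append(w)
    pairs = []
    for w, prefs in worker_prefs.items():
        current = matching.get(w)
        # Firms that w strictly prefers to the current assignment.
        better = prefs if current is None else prefs[:prefs.index(current)]
        for f in better:
            ranking = firm_prefs[f]
            if w not in ranking:
                continue                      # w is unacceptable to f
            # s may be f itself when one of f's positions is unfilled.
            vacancy = len(assigned[f]) < quotas[f]
            outranks = any(ranking.index(w) < ranking.index(s)
                           for s in assigned[f])
            if vacancy or outranks:
                pairs.append((f, w))
    return pairs


def is_stable(matching, worker_prefs, firm_prefs, quotas):
    return (individually_rational(matching, worker_prefs, firm_prefs, quotas)
            and not blocking_pairs(matching, worker_prefs, firm_prefs, quotas))


# Hypothetical market: f1 has two positions, f2 has one.
worker_prefs = {"w1": ["f1", "f2"], "w2": ["f1", "f2"], "w3": ["f1", "f2"]}
firm_prefs = {"f1": ["w1", "w2", "w3"], "f2": ["w1", "w2", "w3"]}
quotas = {"f1": 2, "f2": 1}

stable_mu = {"w1": "f1", "w2": "f1", "w3": "f2"}
unstable_mu = {"w1": "f1", "w3": "f1", "w2": "f2"}
```

Because firms' preferences over groups are assumed to be responsive, comparing `w` with each individual `s` in $\mu(f)$ is enough; no comparison of whole groups of workers is needed.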

Complex Matches

In the medical markets served by the NRMP, the employers are residency programs, and the workers are physicians applying to those programs. The simple model of the previous section does not allow for the variety of matching requirements observed in the medical market, for which purpose we will have to distinguish between different kinds of applicants and different kinds of residency programs.

Let the set of applicants be $A = A_1 \cup A_2 \cup C$, where $A_1$ is the set of (single) applicants who seek no more than one position, $A_2$ is the set of applicants who may want two jobs, and who submit supplemental lists of first-year jobs in connection with any second-year position on their ROLs that requires a complementary first-year position (and does not come with one automatically), and $C$ is the set of couples, who submit a single ROL listing pairs of positions. A member of $C$ is a couple $\{a_i, a_j\}$ such that $a_i$ is in the set $A_3$ (of husbands) and $a_j$ is in the set $A_4$, and the sets $A_1$, $A_2$, $A_3$, and $A_4$ are sets of applicants, who together make up the entire population of individual applicants, which will be denoted $A' = A_1 \cup A_2 \cup A_3 \cup A_4$. (The $A_i$ may not be disjoint, since members of a couple may also submit supplemental lists.) The reason for denoting the set of applicants both as $A$ and as $A'$ is that, from the point of view of a potential employer, the members of a couple $c = \{a_i, a_j\}$ are two distinct applicants who seek distinct positions (typically in different residency programs), while from the point of view of the couple they are one agent with preferences over pairs of positions.

The set of residency programs is $R = \{r_1, \ldots, r_n\}$, and associated with each program $r$ is a positive integer $q_r$ indicating how many positions it seeks to fill. However, for some programs $r$, $q_r$ may not be a constant at every point in the matching process. There are two reasons why $q_r$ may change. A residency program $r$ may have an agreement with another residency program $r'$ (typically within the same hospital) that if $r$ can only fill $k < q_r$ positions, the remaining $q_r - k$ positions will be added to the capacity of $r'$. In such a situation, the algorithm will change $q_r$ to $k$ and $q_{r'}$ to $q_{r'} + (q_r - k)$. (It can also happen that the $q_r - k$ unfilled positions revert to more than one other residency program, and so the total number of positions need not remain constant, and different positions from a given program may revert to different programs.) The other reason why quotas may vary is that some residency programs wish to have an even number of residents, so a residency program $r$ with quota $q$ may have its quota reduced to $q' = q - 2$ in the event that it can only be matched to a maximum of $q - 1$ residents. (These quota adjustments take place after an initial attempt to make a stable match, and they cause the matching algorithm to continue from the current match; in what follows, discussion of stability will refer to the current quota of a program $r$ at any point in the algorithm, except as indicated.)
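The two quota adjustments can be written out as a small sketch. It covers only the simplest case described above, a reversion from one program to a single other program, and the even-number rule exactly as stated; the function names and the dict-based representation of quotas are assumptions made for illustration and do not describe the NRMP software.

```python
def apply_reversion(quotas, filled, donor, recipient):
    """Move the donor's unfilled positions to the recipient.

    With k = filled[donor] < q_r, the donor's quota becomes k and the
    recipient's quota becomes q_r' + (q_r - k). Returns a new dict.
    """
    k = filled[donor]
    unfilled = quotas[donor] - k
    if unfilled <= 0:
        return quotas                      # nothing to revert
    updated = dict(quotas)
    updated[donor] = k
    updated[recipient] = quotas[recipient] + unfilled
    return updated


def apply_even_rule(quota, max_matchable):
    """Quota q becomes q' = q - 2 when at most q - 1 residents can be matched."""
    if max_matchable == quota - 1:
        return quota - 2
    return quota


# Hypothetical programs: r fills 3 of its 5 positions and reverts the rest to r2.
quotas_after = apply_reversion({"r": 5, "r2": 3}, {"r": 3, "r2": 3}, "r", "r2")
```

After either adjustment the matching algorithm continues from the current match, so any stability check should be run against the adjusted quotas rather than the quotas originally submitted.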

Applicants in the set $A_1$ submit ROLs over residency programs and hence have preferences just like the workers in the simple model discussed earlier. Applicants in the set $A_2$ have on their ROLs at least one second-year program that requires (but does not supply) first-year training as well, and these applicants submit a supplemental ROL for each such position, indicating their preferences for first-year positions, conditional on being matched to a given second-year position. Each couple $c = \{a_i, a_j\}$ in the set $C$ submits, as a single ROL, a ranked list of ordered pairs of positions [i.e., an ordered list of elements of $R \times R$ whose first element is some $(r_i, r_j)$ which is the couple's first-choice pair of positions for $a_i$ and $a_j$, respectively, and so forth]. Each residency program submits as its ROL an ordered list of members of $A'$ (i.e., of individual applicants, whether or not they are members of a couple).

Having thus defined the form in which different kinds of agents state their preferences, we can now define stable matchings. A matching $\mu$ with range $R \cup A'$ is defined as in the simple market, except that for an applicant $a$ in the set $A_2$ it may be that $|\mu(a)| = 1$ or $2$ if $\mu(a)$ matches $a$ to a program for which it has submitted a supplemental ROL. In case $|\mu(a)| = 2$ we will write $\mu(a) = (r_1, r_2)$, where $r_1$ is the (second-year) residency program on $a$'s primary ROL, and $r_2$ is the (first-year) residency program on $a$'s supplemental ROL when $a$ is matched with $r_1$. (When $|\mu(a)| = 1$ it must be that $r = \mu(a)$ is on $a$'s primary ROL.)

As in the case of the simple market considered earlier, we will say that a matching is stable if it is not blocked by any individual agent or by a pair of agents consisting of an individual and a residency program, or by a couple together with one or two residency programs.


A matching $\mu$ is blocked by an individual applicant (in the set $A_1$ or $A_2$), or by a residency program, if $\mu$ matches that agent to some individual or residency program not on its ROL, precisely as in the simple model. A matching is blocked by an individual couple $\{a_i, a_j\}$ if they are matched to a pair $(r_i, r_j)$ not on their ROL. Of course no individual or couple blocks a matching at which the individual or couple is unmatched.

A residency program $r$ and an applicant $a$ in the set $A_1$ together block a matching $\mu$ precisely as in the simple market, if they are not matched to one another and would both prefer to be. A residency program $r$ and an applicant $a$ in the set $A_2$ together block a matching $\mu$ if $r$ prefers $a$ to one of its matches under $\mu$ [i.e., $a >_r s$ for some $s$ in $\mu(r)$], and if either $r >_a r_1 \in \mu(a)$ where the preferences $>_a$ correspond to $a$'s primary ROL, or $r >_a r_2 \in \mu(a)$ where $>_a$ corresponds to $a$'s supplemental ROL for the position $r_1 \in \mu(a)$.

A couple $c = \{a_1, a_2\}$ and residency programs $r$ and $r'$ block a matching $\mu$ if $(r, r') >_c \mu(c)$ and if either:

(i) $a_1 \notin \mu(r)$, $a_1 >_r s$ for some $s \in \mu(r)$ and either $a_2 \in \mu(r')$ or $a_2 >_{r'} s'$ for some $s' \in \mu(r')$; or

(ii) $a_2 \notin \mu(r')$, $a_2 >_{r'} s'$ for some $s' \in \mu(r')$ and either $a_1 \in \mu(r)$ or $a_1 >_r s$ for some $s \in \mu(r)$.
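Conditions (i) and (ii) can be checked mechanically. The sketch below assumes data structures that are not in the paper: a couple's ROL is a list of ordered pairs of programs, `rank[r]` maps each applicant acceptable to program `r` to a position in its ROL (lower is better), and `assigned[r]` lists the applicants currently matched to `r`. As in the simple market, "some $s \in \mu(r)$" includes the program itself when a position is unfilled. The function names are hypothetical, and the case $r = r'$ is not given any special treatment.

```python
def program_prefers(program, applicant, assigned, rank, quotas):
    """True if applicant >_r s for some s in mu(r); s may be an unfilled position."""
    ranking = rank[program]
    if applicant not in ranking:
        return False                        # applicant is not on the program's ROL
    if len(assigned[program]) < quotas[program]:
        return True                         # s is the program itself (a vacancy)
    return any(ranking[applicant] < ranking[s] for s in assigned[program])


def couple_blocks(couple, pair, couple_rol, current_pair, assigned, rank, quotas):
    """Check whether the couple and the pair of programs (r, r') block the matching.

    current_pair is the couple's present assignment, assumed to be on its ROL,
    or None if the couple is unmatched.
    """
    a1, a2 = couple
    r, r_prime = pair
    if pair not in couple_rol:
        return False
    if (current_pair is not None
            and couple_rol.index(pair) >= couple_rol.index(current_pair)):
        return False                        # (r, r') is not preferred to mu(c)
    a1_in_r = a1 in assigned[r]
    a2_in_rp = a2 in assigned[r_prime]
    r_wants_a1 = program_prefers(r, a1, assigned, rank, quotas)
    rp_wants_a2 = program_prefers(r_prime, a2, assigned, rank, quotas)
    cond_i = (not a1_in_r) and r_wants_a1 and (a2_in_rp or rp_wants_a2)
    cond_ii = (not a2_in_rp) and rp_wants_a2 and (a1_in_r or r_wants_a1)
    return cond_i or cond_ii


# Hypothetical example: an unmatched couple (h, w) and two full programs p and q.
assigned = {"p": ["x"], "q": ["y"]}
quotas = {"p": 1, "q": 1}
rol = [("p", "q")]
rank_both_prefer = {"p": {"h": 0, "x": 1}, "q": {"w": 0, "y": 1}}
rank_q_declines = {"p": {"h": 0, "x": 1}, "q": {"y": 0, "w": 1}}

blocks = couple_blocks(("h", "w"), ("p", "q"), rol, None, assigned, rank_both_prefer, quotas)
no_block = couple_blocks(("h", "w"), ("p", "q"), rol, None, assigned, rank_q_declines, quotas)
```

In the second case program q prefers its current resident to w, so neither condition holds even though p would gladly take h. This dependence of one member's prospects on the other's is the source of the complementarities that keep the simple theory from applying directly.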

REFERENCES

Aldershof, Brian and Carducci, Olivia M. “StableMatchings with Couples.”Discrete AppliedMathematics, August 1996, 68(1–2), pp.203–07.

AMA-MSS. “Resolutions for the 1995 Interim Meeting.” [Online: http://www.bcm.tmc.edu/ama-mss/i95res.htm#11], 1995.

American Medical Students’ Association and Public Citizen Health Research Group. “Report on Hospital Bias in the NRMP.” [Online: http://pubweb.acns.nwu.edu/~alan/nrmp2.html], 1995.

Ausubel, Lawrence M.; Cramton, Peter; McAfee, R. Preston and McMillan, John. “Synergies in Wireless Telephony: Evidence from the Broadband PCS Auctions.” Journal of Economics and Management Strategy (Special Issue: Market Design and the Spectrum Auctions), Fall 1997, 6(3), pp. 497–527.

Blum, Yossi; Roth, Alvin E. and Rothblum, Uriel G. “Vacancy Chains and Equilibration in Senior-Level Labor Markets.” Journal of Economic Theory, October 1997, 76(2), pp. 362–411.

Blum, Yosef and Rothblum, Uriel G. “‘Timing Is Everything’ and Marital Bliss.” Journal of Economic Theory, 1999 (forthcoming).

Colenbrander, August. “Match Algorithms Revisited.” Academic Medicine, May 1996, 71(5), pp. 414–15.

Cramton, Peter. “The FCC Spectrum Auctions: An Early Assessment.” Journal of Economics and Management Strategy (Special Issue: Market Design and the Spectrum Auctions), Fall 1997, 6(3), pp. 431–95.

Demange, Gabrielle; Gale, David and Sotomayor, Marilda. “A Further Note on the Stable Matching Problem.” Discrete Applied Mathematics, 1987, 16, pp. 217–22.

Feldin, Aljosa. “Core Convergence in Two-Sided Matching Markets: Some Theoretic Considerations.” Mimeo, University of Pittsburgh, 1999.

Gale, David and Shapley, Lloyd. “College Admissions and the Stability of Marriage.” American Mathematical Monthly, January 1962, 69(1), pp. 9–15.

Kagel, John H. and Roth, Alvin E. “The Dynamics of Reorganization in Matching Markets: A Laboratory Experiment Motivated by a Natural Experiment.” Quarterly Journal of Economics, 2000 (forthcoming).

Ledyard, John O.; Porter, David and Rangel, Antonio. “Experiments Testing Multiobject Allocation Mechanisms.” Journal of Economics and Management Strategy (Special Issue: Market Design and the Spectrum Auctions), Fall 1997, 6(3), pp. 639–75.

McAfee, R. Preston and McMillan, John. “Analyzing the Airwaves Auction.” Journal of Economic Perspectives, Winter 1996, 10(1), pp. 159–75.

McMillan, John. “Selling Spectrum Rights.” Journal of Economic Perspectives, Summer 1994, 8(3), pp. 145–62.

McMillan, John. “Why Auction the Spectrum?” Telecommunications Policy, April 1995, 19, pp. 191–99.


Milgrom, Paul. “Auction Theory in Practice: The Simultaneous Ascending Auction.” Mimeo, Stanford University, 1997.

Peranson, E. and Randlett, R. R. “The NRMP Matching Algorithm Revisited: Theory versus Practice.” Academic Medicine, June 1995a, 70(6), pp. 477–84.

Peranson, E. and Randlett, R. R. “Comments on Williams’ ‘A Reexamination of the NRMP Matching Algorithm’.” Academic Medicine, June 1995b, 70(6), pp. 490–94.

Plott, Charles R. “Laboratory Experimental Testbeds: Application to the PCS Auction.” Journal of Economics and Management Strategy (Special Issue: Market Design and the Spectrum Auctions), Fall 1997, 6(3), pp. 605–38.

Roth, Alvin E. “The Economics of Matching: Stability and Incentives.” Mathematics of Operations Research, 1982, 7, pp. 617–28.

Roth, Alvin E. “The Evolution of the Labor Market for Medical Interns and Residents: A Case Study in Game Theory.” Journal of Political Economy, December 1984, 92(6), pp. 991–1016.

Roth, Alvin E. “The College Admissions Problem Is Not Equivalent to the Marriage Problem.” Journal of Economic Theory, August 1985, 36(2), pp. 277–88.

Roth, Alvin E. “On the Allocation of Residents to Rural Hospitals: A General Property of Two-Sided Matching Markets.” Econometrica, March 1986, 54(2), pp. 425–27.

Roth, Alvin E. “New Physicians: A Natural Experiment in Market Organization.” Science, December 14, 1990, 250, pp. 1524–28.

Roth, Alvin E. “A Natural Experiment in the Organization of Entry-Level Labor Markets: Regional Markets for New Physicians and Surgeons in the United Kingdom.” American Economic Review, June 1991, 81(3), pp. 415–40.

Roth, Alvin E. “Proposed Research Program: Evaluation of Changes to Be Considered in the NRMP Algorithm.” Consultant’s report to the National Resident Matching Program [Online: http://www.economics.harvard.edu/~alroth/nrmp.html], 1995.

Roth, Alvin E. “Interim Report No. 1: Evaluation of the Current NRMP Algorithm, and Preliminary Design of an Applicant-Proposing Algorithm.” Consultant’s report to the National Resident Matching Program [Online: http://www.economics.harvard.edu/~alroth/nrmp.html], March 1996a.

Roth, Alvin E. “The NRMP as a Labor Market.” Journal of the American Medical Association, April 3, 1996b, 275(13), pp. 1054–56.

Roth, Alvin E. and Peranson, Elliott. “The Effects of the Change in the NRMP Matching Algorithm.” Journal of the American Medical Association, September 3, 1997, 278(9), pp. 729–32.

Roth, Alvin E. and Rothblum, Uriel G. “Truncation Strategies in Matching Markets: In Search of Practical Advice for Participants.” Econometrica, January 1999, 67(1), pp. 21–43.

Roth, Alvin E. and Sotomayor, Marilda. “The College Admissions Problem Revisited.” Econometrica, May 1989, 57(3), pp. 559–70.

Roth, Alvin E. and Sotomayor, Marilda. Two-sided matching: A study in game-theoretic modeling and analysis. Econometric Society Monograph Series. Cambridge: Cambridge University Press, 1990.

Roth, Alvin E. and Vande Vate, John H. “Random Paths to Stability in Two-Sided Matching.” Econometrica, November 1990, 58(6), pp. 1475–80.

Roth, Alvin E. and Vande Vate, John H. “Incentives in Two-Sided Matching with Random Stable Mechanisms.” Economic Theory, January 1991, 1(1), pp. 31–44.

Roth, Alvin E. and Xing, Xiaolin. “Jumping the Gun: Imperfections and Institutions Related to the Timing of Market Transactions.” American Economic Review, September 1994, 84(4), pp. 992–1044.

Salant, David J. “Up in the Air: GTE’s Experience in the MTA Auction for Personal Communication Services Licenses.” Journal of Economics and Management Strategy (Special Issue: Market Design and the Spectrum Auctions), Fall 1997, 6(3), pp. 549–72.

Shiller, Robert J. Macro markets: Creating institutions for managing society’s largest economic risks. Clarendon Lectures in Economics. Oxford: Oxford University Press, 1993.

Sonmez, Tayfun. “Manipulation via Capacities in Two-Sided Matching Markets.” Journal of Economic Theory, November 1997, 77(1), pp. 197–204.

Sonmez, Tayfun. “Can Pre-arranged Matches Be Avoided in Two-Sided Matching Markets?” Journal of Economic Theory, May 1999, 86(1), pp. 148–56.

U.S. District Court for the Western District of Missouri, Western Division. United States of America v. Association of Family Practice Residency Directors, Final judgment. May 28, 1996.

Williams, K. J. “A Reexamination of the NRMP Matching Algorithm.” Academic Medicine, June 1995a, 70(6), pp. 470–76.

Williams, K. J. “Comments on Peranson and Randlett’s ‘The NRMP Matching Algorithm Revisited: Theory versus Practice’.” Academic Medicine, June 1995b, 70(6), pp. 485–89.

Wilson, Robert B. Nonlinear pricing. Oxford: Oxford University Press, 1993.