Individual vs Group Decision-Making:
Evidence from a Natural Experiment in Arbitration
Proceedings
Naomi Gershoni*
PRELIMINARY & INCOMPLETE
May 1, 2018
Abstract
The importance of understanding the systematic differences between group and individual
decisions has been well-recognized in the economic literature. However, the vast majority of
empirical evidence on this issue is derived from laboratory experiments, and hence does not
reflect professional incentives and career concerns both of which may play a crucial role. This
paper uses an original data set from the Financial Industry Regulatory Authority’s arbitration
awards as well as a unique regulatory change that exogenously decreased the number of pre-
siding arbitrators from three to one for a specific class of cases. A difference-in-differences
strategy is implemented to identify the impact of the number of decision-makers on the dis-
tribution of outcomes. The findings indicate a tendency of sole arbitrators to render moderate
awards and to facilitate more settlements, whereas groups make more extreme “all or nothing”
decisions. The paper discusses potential underlying mechanisms, and offers evidence to sup-
port a novel explanation to the findings based on the notion that concerns about the adverse
effects of extreme decisions on arbitrators’ reputation are mitigated within panels.
1 Introduction
The importance of understanding the systematic differences between group and individual deci-
sions has been well recognized in the economic literature. Economic decisions in a wide array of
issues and settings are made by groups. Some examples are business decisions made by boards,
*Ben Gurion University of the Negev
policy recommendations by committees, household choices, and judicial panels or jury decisions.
While common wisdom suggests that “two heads are better than one”, the theoretical and experi-
mental literature has long acknowledged that differences between group and individual decisions
may be driven by various psychological and economic mechanisms. In addition, such differences
heavily depend on context and institutions, as well as on the criteria by which the decision is eval-
uated (see e.g. Charness and Sutter, 2012; Goeree and Yariv, 2011; Eliaz et al., 2006; Cooper and
Kagel, 2005). In light of the numerous competing predictions, the scarcity of empirical evidence
based on “real world” observational data is striking, though not surprising. The formation of
groups, as well as the type of decisions they make, are typically endogenous and hence evidence
often points to correlations rather than causal effects. Empirical evidence that comes from laboratory
experiments does not reflect professional incentives and career concerns, which may play a
crucial role.
This paper aims to identify the causal effects of groups on outcomes, utilizing an original data
set of FINRA (Financial Industry Regulatory Authority) arbitration awards and a unique regula-
tory change that exogenously decreased the number of arbitrators deciding on a case from three
to one, for a specific class of cases. A difference-in-differences strategy is applied to identify the
impact of the number of decision-makers on the distribution of outcomes. The findings indicate
a tendency of groups towards more extreme, “all or nothing”, decisions, while sole arbitrators
render more moderate awards. In addition, sole arbitrators appear to facilitate more settlements.
Furthermore, suggestive evidence points towards a specific mechanism that drives these differences.
Deriving a specific hypothesis based on previous studies is impossible, as multiple channels
and mechanisms were suggested leading to contradicting results. While some studies empha-
size the moderating effect of groups on potentially extreme individual opinions, others present
the possibility that groups evoke polarization of individual views and motivate manipulation of
outcomes.
The well-known Condorcet Jury Theorem predicts that majorities are more likely than any
single individual to select the best out of two alternatives, based on the assumption that informa-
tion aggregation is efficient.1 In addition, it was claimed that groups are more predictable and
less extreme due to a ‘compromise effect’ – individual members’ opinions averaging out in order
to reach consensus (Adams and Ferreira, 2010). Even when consensus is not required, studies
document a tendency towards unanimity or ‘dissent aversion’, especially when members of the
group interact repeatedly, e.g. panels of judges (Posner, 2010). Moreover, Adams and Ferreira
1A long line of literature that followed altered and relaxed the underlying assumptions, and a variety of results were derived (see Piketty (1999) for a survey of papers in this line of literature).
(2010) predict that the ‘membership effect’ – groups choosing to exclude individuals with extreme
points of view or such individuals voluntarily leaving groups – will amplify the moderating effect
of groups. Since usually arbitrators are chosen by the parties of the dispute, a similar “selection”
effect is expected (although not necessarily moderating). As to the quality of decisions, experi-
mental studies repeatedly find group decisions to be more consistent with rationality (Kocher and
Sutter, 2005; Baillon et al., 2016; Bornstein and Yaniv, 1998; Rockenbach et al., 2007; Blinder and
Morgan, 2005).
However, others cast doubt on the wisdom and moderation of groups. Social psychology
points to several cognitive biases which may lead to the opposite outcome. The phenomenon
known as group polarization was repeatedly documented in lab experiments and was supported
by a variety of theoretical models (Glaeser and Sunstein, 2009). Two types of potential choice shifts
of groups, either “risky” or “cautious”, were demonstrated under divergent circumstances and
shown to depend on group members’ individual predispositions and the extent of deliberation
(see e.g. Stoner, 1968, 1961; Moscovici and Zavalloni, 1969; Kerr et al., 1996). An even more alarm-
ing view of group behavior was first described by Janis (1982), who coined the term “groupthink”
to describe the tendency of groups to apply self-deception in an attempt to conform with each
other and avoid conflict.2
Ottaviani and Sørensen (2001) stress a different aspect of group decisions: reputation concerns
lead to flaws in information aggregation. They predict that when committee members
present their opinions sequentially, they will end up adopting relatively extreme views, a
phenomenon known as ‘herding’. Levy (2007) analyzes decisions made by committees whose members
are motivated by similar concerns for reputation and focuses on the interaction of these concerns
with the transparency of the decision process. In this setting, transparency promotes decisions
that go against members’ preexisting bias.
Although reputation concerns are dominant in arbitration, the environment is more likely to
generate “adverse reputation effects”, i.e. concerns about acquiring a “bad” reputation for bias, since the
accuracy of decisions cannot be evaluated (Klement and Neeman, 2013).3 As a result, many researchers
hypothesized that arbitrators tend to simply “split the difference” between the parties to a dispute
and avoid extreme decisions (Posner, 2004). Since arbitrators’ deliberations are secret, relative
to sole arbitrators, the group setting reduces transparency and mitigates concerns for reputation.
Hence, panels are expected to give rise to more extreme decisions.
Comparing decisions of sole arbitrators to those of panels to establish causal effects requires
2Following this idea, Benabou (2013) presents a theoretical model of “collective denial and willful blindness in groups, organizations, and markets”.
3Generally, Morris (2001); Ely and Valimaki (2003) predict “adverse reputation effects” when only the actions of agents are observed and the truth remains unknown.
a setting where the assignment of specific cases to one option or the other is random or as good
as random. In reality, the option is assigned either by an initial agreement between the parties to
a dispute or according to a specific rule that assigns different types of cases to different tribunals.
Therefore, the results of a “naive” comparison between the decisions taken by different forums
may actually reflect the underlying difference in the characteristics of the dispute and not the
behavior of arbitrators.
To overcome this challenge, I use a unique data set of FINRA arbitration awards and a change
in the FINRA regulations that exogenously impacted the number of arbitrators assigned to some
cases. The study focuses on claims of investors (customers) against brokerage firms. In March
2009, FINRA raised the threshold for the assignment of cases to three arbitrators (rather than one)
from $50,000 to $100,000. This change in regulation serves as a “natural experiment”: for the
group of cases between the two thresholds, old and new, it allows me to test for differences between
awards given by a panel of three arbitrators and awards given by a sole arbitrator, depending on
filing date. An important feature of this research design is that the rule change is not related to
the hypothesis or outcomes of interest. Rather, it resulted from an unusual load of cases during
the year 2009, probably following the financial crisis that peaked in the preceding year. While
allowing me to consider the rule change as an exogenous shock to tribunal size, this chain of events
raises concerns that any observed pattern will be a result of changes provoked by the financial
crisis. More generally, other factors may have caused some changes over time or awards may
simply exhibit a time trend. To address these potential obstacles, I use a difference-in-differences
framework where the group of cases that were not affected by the change in regulations serves as
a control group. I present elaborate balance tests and a variety of robustness tests to confirm that
this setting allows the identification of the true causal effect of the size of the arbitral tribunal on
arbitration outcomes.
The main outcome considered is the award rate, i.e. the amount of awarded damages relative
to the relief initially requested by the claimant. The results show that the distribution of awards
significantly changes for the treatment group in a way that is not exhibited by the control group
of cases. Specifically, when compared to sole arbitrators, panels of three tend to entirely dismiss
more claims but also award claimants with very high damages when their claims are accepted. In
addition, cases assigned to sole arbitrators are more likely to be settled. These results suggest that
groups tend to be more extreme, although no inference can be made regarding the quality of the
decisions.
Three main mechanisms can explain these findings. First, the selection of arbitrators into pan-
els may be different and hence the difference in outcomes could be due to the difference in arbitra-
tors’ characteristics. I rule out this option using an extended sample of cases and adding arbitrator
fixed effects to the estimated model. The two additional channels that may drive the portrayed
differences between individuals and groups are either the general group polarization phenomenon
or the rational choices of moderate awards by sole arbitrators who are concerned for their repu-
tation. I present evidence on cases that do not involve reputation considerations to establish that
the second explanation is more likely.
In addition to the general contribution of this paper to the analysis of group decision-making,
the findings have policy implications in the specific context of arbitration proceedings. As arbi-
tration is becoming an increasingly widespread alternative to courts, questioning the quality of
arbitrators’ decisions, especially their fairness and non-biasedness, becomes essential. Contrary
to courts, the number of arbitrators assigned to a case is usually determined by the parties to the
dispute. The common practice is to increase the number of arbitrators as the stakes of the case
increase. Choosing more arbitrators imposes higher costs of proceedings; however, the benefit
of this choice is not clear, as none of the (very few) empirical studies that tackle the difference
between group and individual decisions address arbitration decisions.4 On the other hand, the-
oretical and empirical studies of the determinants of arbitrators’ decision-making have always
focused either on decisions of single arbitrators or on decisions of panels, but the two have never
been compared.5
The paper proceeds as follows. Section 2 reviews the FINRA arbitration system in
detail. Next, I describe the unique data set and the empirical strategy. Results and robustness
tests are presented in section 3. Section 4 tests the plausibility of the alternative mechanisms and
presents evidence in support of the “reputation channel”. Section 5 concludes.
2 Background on FINRA Arbitration
FINRA is a private corporation that acts as a self-regulatory organization for the financial industry
by authorization of Congress. The members of the corporation are securities firms and brokerages
that operate in the US. Its main operation is the formulation and enforcement of rules that
are aimed to ensure fair and honest conduct of its members. In addition, FINRA runs the largest
dispute resolution forum in the US securities industry, offering extensive arbitration and media-
tion services. Arbitration cases are divided into two types: customer cases, where the claimant
4The vast majority focus on investment decisions and offer mixed evidence for the association between groups and outcomes (Barber et al., 2003; Barber and Odean, 2000; Bone et al., 1999; Prather and Middleton, 2002). Adams and Ferreira (2010) compare the distribution of bets placed by individuals and groups on ice breakup dates and find groups to be more moderate and to conform more to historic data. However, the setting of a specific wagering game is significantly different from other economic and judicial decisions.
5See for example: Choi et al. (2010). The only exception to this is Marselli et al. (2013), who consider the association between the number of arbitrators and the probability of a dispute resolving in settlement.
is a customer and the respondent is a member, and industry cases, where both parties are either
members or other entities who are involved in the industry, such as employees or suppliers of
member firms.
In customer cases, which are the focus of this study, the arbitration process starts when an
investor files a statement of claim against a brokerage firm. Essentially all agreements between
brokerage firms and their customers include a clause which obliges them to turn to FINRA ar-
bitration in case of a dispute. Therefore, FINRA customer cases are considered to be mandatory
arbitration proceedings, meaning that customers cannot choose to file their claim in court or in any
other dispute resolution forum.6 FINRA exerts a lot of effort to sustain the image of its arbitration
proceedings and arbitrators as fair and objective, since as a group they are often suspected of bias
in favor of their own members. Nevertheless, both the FINRA forum and individual arbitrators
are constantly suspected of bias against one side or the other, members or customers.7
After filing the claim, the arbitrators’ selection process begins. The number of arbitrators ap-
pointed to a case depends on the amount of relief requested in the statement of claim. FINRA
regulations set a threshold amount such that cases with a requested relief equal to or below it are
heard by a sole arbitrator and cases above the threshold are heard by a panel of three arbitrators.8
However, parties may agree to stray from this rule.9 In addition, the regulations determine the
types of arbitrators that should be assigned to the case. Each arbitrator in the FINRA roster is
categorized as either “Public” or “Non-Public” according to the extent of his experience and in-
volvement in the financial industry (e.g. as employee or attorney of a brokerage firm).10 Only
public arbitrators that meet certain requirements of experience and education can serve as chair-
persons.11 In cases where only one arbitrator is appointed, he must be a public arbitrator qualified
6In some cases investors tried to turn to courts to release them from the obligation to use arbitration proceedings. These appeals were mostly denied by the Supreme Court, even though courts and legislatures often expressed the opinion that mandatory arbitration clauses are unfair or might violate the basic principle of the justice system, that everyone is entitled to have their “day in court” (see Shearson/American Express Inc. v. McMahon (482 US 220 [1987]), Rodriguez de Quijas v. Shearson/American Express, Inc. (490 US 477 [1989])).
7Scanning through securities blogs and law firm websites, one finds many statements indicating that FINRA is biased in favor of its own members. One example is “There is a wide-spread perception that the arbitration system is stacked against customers. Some people have expressed concern that many arbitrators are reluctant to mete out large awards against big brokerage firms or to issue strong discovery rulings.” (http://www.sbpllplaw.com/an-outline-of-the-finra-arbitration-process-for-customer-broker-disputes/). Others express exactly the opposite opinion, referring to FINRA as “customer-friendly” and suggesting that FINRA’s own members are interested in taking disputes with customers to other judicial forums (http://www.pepperlaw.com/publications/second-circuit-makes-it-easier-to-avoid-finra-arbitration-2015-04-02/). However, the latter view is much less prevalent.
8It should be noted that the threshold was raised during the studied period and this change in regulations is used as the identification strategy (see section 3.2).
9Before the change in regulations, cases that were assigned to a sole arbitrator could be heard by a panel of three by request of the claimant (without requiring the consent of the respondent).
10Non-public arbitrators are often referred to as “industry arbitrators”. The exact definitions are listed in rules 12100(p) & (u) of FINRA’s Code of Arbitration Procedure for Customer Disputes.
11To qualify as chairperson an arbitrator must also complete FINRA’s chairperson training. The complete require-
to serve as chairperson. In cases with three arbitrators, at least two arbitrators are public arbitra-
tors and at least one must be eligible for chairperson. In most cases the panel of three arbitrators
is a “Majority Public Panel”, i.e. it consists of two public arbitrators and one non-public. This was
the only option for the composition of the panel up to 2011, when the rules changed. Since January
2011, the claimant can choose to have an “All Public Panel”, i.e. appoint only public arbitrators to
the panel.12
FINRA uses the “veto-rank” method for the selection of arbitrators. Each party receives a list of
randomly selected arbitrators and is asked to veto some and rank the others. The arbitrators with
the highest combined scores are appointed. In cases where one arbitrator should be nominated,
parties get one list of ten potential arbitrators. They can veto four out of ten and then they rank
the remaining six. When three arbitrators are required, three different lists of ten arbitrators each
are provided and the same process is applied to each list. Therefore the probability for each party
to get their favorite arbitrators appointed does not depend on the total number of arbitrators
assigned to the case.13
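The veto-rank procedure described above can be illustrated with a short Python sketch for a single list of candidates. This is only a stylized illustration: the text does not specify how the parties' rankings are combined, so the sketch assumes that ranks are summed (a lower total being better), that any arbitrator vetoed by either party is excluded, and that ties are broken by list order. The function name `veto_rank` and the example names are hypothetical.

```python
def veto_rank(candidates, vetoes_a, ranks_a, vetoes_b, ranks_b, seats=1):
    """Pick `seats` arbitrators from one list of candidates.

    ranks_x maps each arbitrator NOT vetoed by party x to a rank (1 = most preferred).
    Assumption (not from the source): combined score = sum of both ranks, lower is better.
    """
    struck = set(vetoes_a) | set(vetoes_b)          # a veto by either party removes a candidate
    eligible = [c for c in candidates if c not in struck]
    # sorted() is stable, so ties keep the original list order (another assumption)
    ordered = sorted(eligible, key=lambda c: ranks_a[c] + ranks_b[c])
    return ordered[:seats]


# One list of ten candidates; each party vetoes four and ranks the remaining six.
candidates = list("ABCDEFGHIJ")
vetoes_claimant = ["A", "B", "C", "D"]
ranks_claimant = {"E": 1, "F": 2, "G": 3, "H": 4, "I": 5, "J": 6}
vetoes_respondent = ["G", "H", "I", "J"]
ranks_respondent = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}

chosen = veto_rank(candidates, vetoes_claimant, ranks_claimant,
                   vetoes_respondent, ranks_respondent)
print(chosen)
```

For a three-arbitrator panel the same function would be applied once to each of the three lists, which is why the chance of seating a preferred arbitrator from a given list does not depend on tribunal size.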
The FINRA guidelines to arbitrators specifically address the question of how a decision should
be made by a panel. The arbitrator’s guide offers the following protocol:
After each panel member has reviewed the case information, the chairperson should
begin the discussion by asking the panel members to express their individual views of
the case. All arbitrators must take part in deliberating the facts and issues, and their
observations and opinions should be heard, acknowledged and considered.
In addition, rule 12410 of the FINRA Code states that all rulings and determinations by the panel
shall be by a majority of the arbitrators. Note that while a decision requires at least two panel
members to agree on liability, the reasons behind their determinations may differ.
Although unanimity is not required, the number of cases where it is recorded that one arbitra-
tor opposed the majority opinion is negligible.14 This is in line with the findings of many studies
that document and rationalize the phenomenon referred to as “dissent aversion” among judges.15
In most cases this tendency is driven by considerations of collegiality.
ments for chairperson eligibility are listed in rule 12400(c) of FINRA’s Code of Arbitration Procedure for Customer Disputes.
12Between 2008 and 2011, there was a pilot program that offered the choice of an “All Public Panel” in selected cases. In my main sample only 46 cases were actually heard by an “All Public Panel”.
13The only exception is in the case of an “All Public Panel” where the parties get only two lists of 10 and the selection process is slightly different. However, as mentioned above, such cases are negligible in our sample.
14Out of 2575 panel decisions in the full sample only 78 cases record a dissenting arbitrator. In some of these cases, it is noted that the dissent concerns only a specific detail or a technical component of the award, e.g. expungement of records or distribution of costs between the parties.
15See for example Epstein et al. (2011).
3 Empirical Analysis of FINRA Arbitration Awards
3.1 Data and Sample
The data was gathered from FINRA arbitration awards, which are publicly available on-line. The
details that were recorded for each case include filing and award dates, number of claimants and
respondents, parties’ representation, causes of claim (or controversy type), type of securities in-
volved in the claim, number of arbitrators and type of forum. The filing of a counter claim or third
party involvement in the proceedings are also recorded.16
The complete database consists of approximately 4000 awards for cases that were filed between
2006 and 2011. I focus on customer cases since specific rules and guidelines apply to
this type of cases. In addition, these cases repeatedly confront parties that belong to two distinct
groups with opposing interests, i.e. customers vs. firms.
Since the data is retrospective, in the sense that we only observe a case once the award is
posted, censoring may affect the results. To avoid such a distortion, I use survival analysis and
only consider cases where the award was given within a limited period of time since the case
was filed.17 It is also important to note that aggregate statistics posted by FINRA suggest that
approximately 80% of cases that are filed are closed before the hearing stage, and hence no award
is issued.18 This does not pose a major concern for the purpose of the analysis, since in such
cases the effect of the arbitrators on the outcome is negligible. For the rest of the cases an award is
issued and posted. In cases that are resolved in settlement at or after the hearing stage, an award is
issued declaring that the parties settled, but the details of the settlement are usually kept secret.19
Obviously these outcomes should be included in the analysis, since arbitrators’ behavior during
the hearing could influence the probability of settlement.
In the main analysis, I use a sub-sample of customer cases that only includes cases with a relief
requested up to $250,000, as explained in further detail in the next section.
16The data was collected with the assistance of a group of law and economics undergraduate students. They were given elaborate written and oral instructions to ensure uniformity and consistency in the coding of the awards’ substance. In the course of collecting the data, students’ questions were asked and answered in an open forum for the sake of clarity and coherence. In addition, records were double checked to avoid any mistakes.
17I elaborate on the choice of this period in section 3.2.
18The reasons for closing a case at such a preliminary stage could be a direct settlement between the parties, bankruptcy of a critical party, forum denying, and others.
19Such awards regularly decide on the issue of expungement, i.e. whether or not the claim will be erased from the member’s records, and on the fees that should be paid by the parties for the arbitration proceedings.
3.2 Empirical Strategy
Comparing the awards given by single arbitrators to those given by panels is not an easy task
since in most cases different types of cases are assigned to different numbers of arbitrators. There-
fore, such a comparison might actually reflect differences in unobserved characteristics of cases
and parties. FINRA’s rules distinguish the types of cases which are assigned to three rather than
one arbitrators, by setting a threshold based on the relief requested by the claimant. Up to the
end of March 2009 the threshold for a single arbitrator was $50,000 and cases above this amount
were assigned to panels of three. Figure 1 presents a ‘naive’ comparison of the distribution of
awards granted by panels of three versus sole arbitrators, for this period (which in our main analysis
serves as the “pre-period”). Awards are measured as rates, i.e. the fraction of the monetary
relief initially requested by the claimants that was approved by the arbitrators in the final award.
Accordingly, the minimal award equals zero, while the maximum is set to one, although it could
potentially be higher if for example the arbitrators ordered the respondent to pay punitive damages.20
There appears to be some difference between the two distributions. Cases that were
heard by sole arbitrators seem to have slightly more “extreme” awards, which grant plaintiffs
with all or nothing, compared to cases with three arbitrators. However, this difference cannot
be directly attributed to the different number of arbitrators, since these cases have many other
observed and unobserved qualities which separate them.
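The award-rate measure defined above can be written as a small Python function. This is a minimal sketch rather than the code used to build the data set; the function name `award_rate` and the example amounts are illustrative assumptions, and the cap at one follows the convention described in the text.

```python
def award_rate(awarded: float, requested: float, cap: float = 1.0) -> float:
    """Share of the requested compensatory relief that was awarded, clipped to [0, cap]."""
    if requested <= 0:
        raise ValueError("requested relief must be positive")
    raw = awarded / requested          # can exceed 1 when punitive damages are added
    return max(0.0, min(raw, cap))


print(award_rate(0, 80_000))         # claim denied entirely
print(award_rate(40_000, 80_000))    # a "split the difference" award
print(award_rate(120_000, 80_000))   # raw ratio 1.5, reported at the cap
```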
In 2009 the threshold for the assignment of three arbitrators was raised to $100,000.21 Press
coverage and securities bloggers suggested that the change was required due to an unusual load
of new claims, which was probably the result of the economic crisis triggered by the collapse of
financial markets.22 This offers a unique opportunity to compare arbitration awards in cases where
the relief requested is between the old and the new threshold, i.e. $50,000 to $100,000. For this group
of cases, if a claim was filed before March 30, 2009, the rules assigned a panel of three arbitrators
to hear the case. However, starting on that date, only one arbitrator is assigned to each case in this
same category. The nature and timing of the described change in FINRA regulations is used as a
“natural experiment”, where the assignment of arbitral tribunal to cases in the affected group is as
good as random. Nevertheless, many other obstacles may stand in the way of correctly identifying
the effect of interest.
20In many cases the award rate exceeds 1 since the claimant is awarded punitive or exemplary damages. The relief requested specifies only the compensatory damages and this is the amount that dictates the choice of forum, which is our main interest. In addition, in most cases the amount of punitive damages requested by the claimant is not specified, so it is impossible to use a consistent measure of total damages requested.
21See FINRA regulatory notice 09-13 “Threshold for Single Arbitrator Cases”.
22Indeed 2009 stands out with an unusually large number of filed claims both in FINRA aggregate statistics and in my sample, as presented in Appendix figure A1.
Figure 1: Distribution of Award Rate Pre & Post Change (All Years)
The distribution of awards before and after the change in regulations for the affected group of
cases is presented in figure 2. This comparison clearly suggests that there was a substantial change
in the distribution of awards and that the difference between panels’ and sole arbitrators’ decisions
is exactly the opposite of the one suggested by figure 1 (where selection is not controlled for).
The fraction of cases where claims are denied entirely or where the award rate is very close to zero
decreased significantly after the number of arbitrators was reduced from three to one. The same is
true regarding awards that grant claimants with approximately the same amount of damages they
requested or more. On the other hand, awards that tend to “split the difference”, i.e. an award rate
close to 0.5, are more frequent under the new rule, which assigns only one arbitrator to each case.
These findings could support the theoretical prediction that when grouped together arbitrators
are less concerned with reputation and therefore not as reluctant as sole arbitrators to give awards
closer to the tails of the distribution and therefore appear biased, but could also be driven by the
more general phenomenon of group polarization.
Although figure 2 exhibits substantial changes in the distribution of award rates, one should
suspect that similar changes occur for all classes of cases and not just for cases that were affected
by the threshold change. This could be the result of a time trend or some other unobserved het-
erogeneity of the pre and post periods. Therefore, to identify the pure effect of the change in
the arbitral tribunal on arbitration results, I use a difference-in-differences (DD) approach. There
are two potential comparison groups: cases where the relief requested by the claimant is below
$50,000, which were assigned to sole arbitrators throughout the entire period and cases with a
relief requested above $100,000, which were assigned to panels of three arbitrators all along. The
Figure 2: Distribution of Award Rate Pre & Post Change (All Years)
second group is limited to cases where the relief requested does not exceed $250,000. Using a
wider range would make the comparability of the groups questionable and raise substantial con-
cerns for unobserved differences within this group.23 I use both groups combined in the DD set-
ting, but control for each category of cases separately to allow for variation in the average award
rate across groups. I also control for cases where the relief requested is equal to or below $25,000
since for such cases the “simplified arbitration” procedure applies.
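To make the grouping concrete, the following Python sketch assigns a case to its comparison category and to the tribunal size expected under the rules. It is an illustration under stated assumptions, not the coding scheme actually used: the group labels and function names are hypothetical, and cases exactly at a threshold are placed in the lower category, following the “equal to or below” wording of the assignment rule.

```python
from datetime import date

RULE_CHANGE = date(2009, 3, 30)   # first filing date under the new threshold
OLD_THRESHOLD = 50_000            # sole-arbitrator ceiling before the change
NEW_THRESHOLD = 100_000           # sole-arbitrator ceiling after the change
SIMPLIFIED_CAP = 25_000           # "simplified arbitration" cases
SAMPLE_CAP = 250_000              # upper bound of the main sample


def case_group(relief: float) -> str:
    """Hypothetical labels for the categories controlled for in the DD setting."""
    if relief <= SIMPLIFIED_CAP:
        return "control_low_simplified"
    if relief <= OLD_THRESHOLD:
        return "control_low"       # sole arbitrator throughout
    if relief <= NEW_THRESHOLD:
        return "treated"           # panel before the change, sole arbitrator after
    if relief <= SAMPLE_CAP:
        return "control_high"      # panel of three throughout
    return "excluded"


def expected_arbitrators(relief: float, filed: date) -> int:
    """Tribunal size implied by the default rule, ignoring party-agreed deviations."""
    threshold = NEW_THRESHOLD if filed >= RULE_CHANGE else OLD_THRESHOLD
    return 1 if relief <= threshold else 3


print(case_group(75_000), expected_arbitrators(75_000, date(2009, 3, 29)))
print(case_group(75_000), expected_arbitrators(75_000, date(2009, 3, 30)))
```

Only the treated category changes its expected tribunal size across the two filing dates; both comparison categories return the same value before and after the rule change.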
With all groups, the issue of compliance should be addressed, since as mentioned above, par-
ties can agree to change the number of arbitrators. It should also be noted, that the rule regarding
switching from one to three arbitrators was slightly changed at the same time that the threshold
for panels was raised. Prior to the change, a request by the claimant was enough to make the
switch, whereas after the change the consent of the respondent was required as well. This change
made it more difficult to switch the number of arbitrators from one to three and hence increased
compliance from approximately 70% to approximately 95%. In practice, it only affected cases in
the first comparison group, where the rules set the number of arbitrators to one throughout the
entire period. 95% was also the rate of compliance with the three-arbitrator default rule, except for
a slightly lower rate in the treatment group post change (it seems reasonable to get lower levels of
compliance with a new rule).
In the main part of the paper, I compare cases by the expected number of arbitrators (according
to the rules) to avoid bias caused by parties’ self-selection of the tribunal.24 Hence, results should
23The maximal relief requested for cases in the full sample is 1.25 billion dollars.
24Conducting the same analysis dropping the cases where the actual tribunal is not the expected one does not affect results significantly. However, since compliance cannot be considered random, such results could be biased and cannot
11
be thought of as intention to treat effects (ITT), rather than actual treatment effects. In section
3.5, I use an Instrumental variable analysis to derive the treatment effect on the compliers. An
indicator for the rule change interacted with being in the treatment group serves as an instrument
for forum size. This is a valid instrument since it is highly correlated with the the number of
arbitrators actually assigned to a case and it affects the outcome only through the endogenous
treatment variable.
The main specification for estimation is (i is an index for a specific case):
$$Y_i = \alpha + \beta_0 \mathit{Treated}_i + \beta_1 \mathit{Post}_i + \beta_2 (\mathit{Treated}_i \times \mathit{Post}_i) + \gamma' X_i + \psi_i$$
where Y is the outcome of interest. The main outcomes considered are a set of indicators
for whether the award rate (AR), i.e. the ratio between the awarded damages and the relief
initially requested in the statement of claim, equals a specific value or lies within a specified range.
The aim is to understand if and how the distribution of award rates changed. Settlements
are coded as an award rate equal to 0.5, implying that the arbitrator’s reputation is not affected by this
outcome.25 I also use an indicator for whether or not the parties agreed to settle the case as an
outcome. This allows me to measure the impact on the settlement rate separately, rather than just consider
settlements as moderate awards granted by the arbitrators.
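The outcome coding described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the bin edges (taken from the column ranges of Table 3) and the choice to treat awards of one or more as their own category, separate from the top range, are assumptions made here and not the paper's actual code.

```python
from typing import Dict, Optional

# Assumed interior ranges (lower bound exclusive, upper bound inclusive).
BINS = [(0.0, 0.2), (0.2, 0.4), (0.4, 0.6), (0.6, 0.8), (0.8, 1.0)]


def award_rate(awarded: Optional[float], requested: float, settled: bool) -> float:
    """Awarded damages divided by relief requested; settlements are coded as 0.5."""
    if settled:
        return 0.5
    if requested <= 0:
        raise ValueError("relief requested must be positive")
    return (awarded or 0.0) / requested


def outcome_indicators(ar: float, settled: bool) -> Dict[str, int]:
    """0/1 indicators for AR = 0, AR = 1 (or more), each interior range, and settlement."""
    out = {"AR=0": int(ar == 0.0), "AR=1": int(ar >= 1.0), "Settled": int(settled)}
    for lo, hi in BINS:
        # Assumption: full awards are excluded from the top range so AR=1 stands alone.
        out[f"{lo}<AR<={hi}"] = int(lo < ar <= hi and ar < 1.0)
    return out
```

Each indicator can then serve as the dependent variable Y in a separate regression, which is how a change in the whole distribution of award rates is traced out one range at a time.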
X is a vector of controls, which can be divided into two groups. The first consists of variables
that refer to the numbers of claimants and respondents, whether or not they are represented by an
attorney, and whether or not the respondent entered an appearance. The second includes a large
set of dummy variables that describe the causes of action and the type of securities involved in
the dispute. A complete list of these variables is presented in table 1 and the notes that follow.
In addition, since the comparison group actually consists of three distinct groups, I allow for a
group-specific effect.
Treated is an indicator for the case being in the affected range of relief requested (i.e. $50,000 to
$100,000) and Post indicates whether the case was filed after the change of rules came into effect.
Treated × Post is the interaction between the two indicators and is the explanatory variable of
interest in this setting. Therefore, β2 is the difference in outcome that is attributed to the assignment
of a case to a single arbitrator rather than to a panel of three arbitrators.
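To make the role of β2 concrete, the sketch below computes the difference-in-differences estimate from the four group means. Without the controls X, this is numerically the same as the OLS coefficient on the interaction term in the specification above. The function name and the toy data are hypothetical and serve only as an illustration; they are not taken from the paper's sample.

```python
from statistics import mean
from typing import Iterable, Tuple

Case = Tuple[int, int, float]  # (treated, post, outcome indicator)


def did_estimate(cases: Iterable[Case]) -> float:
    """(treated post - treated pre) - (control post - control pre)."""
    cells = {(t, p): [] for t in (0, 1) for p in (0, 1)}
    for treated, post, y in cases:
        cells[(treated, post)].append(y)
    m = {key: mean(values) for key, values in cells.items()}  # fails if a cell is empty
    return (m[(1, 1)] - m[(1, 0)]) - (m[(0, 1)] - m[(0, 0)])


# Made-up toy data: the outcome could be, e.g., an indicator for AR = 1.
toy = [
    (0, 0, 1.0), (0, 0, 0.0),  # control, pre
    (0, 1, 1.0), (0, 1, 0.0),  # control, post
    (1, 0, 1.0), (1, 0, 1.0),  # treated, pre
    (1, 1, 1.0), (1, 1, 0.0),  # treated, post
]
print(did_estimate(toy))
```

In the toy data the control mean does not move while the treated mean falls by one half, so the whole change is attributed to the treatment.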
To validate the use of this specification, the observed characteristics of the treatment and control
groups are compared. Table 1 presents the mean of these characteristics for the three groups of
interest in the pre-change period. Then, two measures of the differences between treatment and
control are estimated. First, I test for the significance of the difference in means between treatment
and control. Second, I use the difference-in-differences specification with each covariate as an
outcome. This allows me to evaluate the change in underlying characteristics following the new rule,
and estimate whether the treatment group exhibits a different trend than the control groups. As
we would expect if assignment to treatment was as good as random, the observed characteristics
are very similar. A few significant or marginally significant differences exist, but no more
than would be expected with such a long list of covariates. Nevertheless, I use specifications where the
complete set of observable characteristics is controlled for.

24 (continued) be interpreted effectively.
25 The average award rate for cases that did not end in a settlement and where the claimant’s claims were not
entirely dismissed is 0.56. If cases where the award was zero are also considered, this rate drops to 0.28. Coding the award
rate as 0.3 instead of 0.5 has no significant effect on results.
One more concern is that parties can manipulate the number of arbitrators either by changing
the filing date (around the time that the new rule became effective) or by changing the relief re-
quested (around the threshold). If this is true then the estimates based on the proposed empirical
model will be biased. The change in rules was announced in February 2009, so the concern is that
claimants either rushed to file their claims before the rule came into effect or delayed filing until after
that date, hence actually selecting the number of arbitrators assigned to their case. If this were true,
we would expect to see some irregularity in the number of cases filed by month around the time
of the policy change. In fact, the trends in the number of new claims filed over the first 6 months of
2009 appear to be practically identical when the affected group of cases is compared to the control
group. Table 2 reports the proportion of cases that were filed during the quarter that preceded the
effective date of the new rule and the quarter that followed (out of all cases filed during 2009) by
treatment status. In addition, the difference in means between the groups is calculated for both
the pre and post change periods and found to be very close to zero and statistically insignificant. To
further establish that there is no manipulation on filing date, figure 3 shows the number of cases
filed each day, for a period of four months, around the time the new rule was applied, for the
whole sample and for each sub-group separately. No significant irregularities are found during
this period.
Table 1: Descriptive Statistics & Balance Tests for Treatment & Comparison Group, Pre-Change & Post-Pre Differences

                              Control   Treatment   Diff.                DID
A. Parties’ characteristics
Number of claimants           1.595     1.588       0.007    (0.108)     -0.09    (0.134)
Claimant represented          0.819     0.876       -0.057*  (0.033)     -0.068   (0.044)
Respondent represented        0.984     0.969       0.014    (0.013)     0.038**  (0.017)
B. Case characteristics
Counter claim                 0.027     0.015       0.012    (0.013)     0.008    (0.018)
Security types
Options                       0.027     0.041       -0.014   (0.016)     -0.013   (0.021)
Common stocks                 0.252     0.247       0.005    (0.039)     -0.049   (0.049)
Government bonds              0.019     0.01        0.009    (0.011)     0.004    (0.015)
Controversy types
Margin                        0.055     0.021       0.034*   (0.018)     0.022    (0.02)
Violation of Rules            0.438     0.438       0        (0.044)     -0.057   (0.065)
Excess trade                  0.14      0.134       0.006    (0.031)     -0.031   (0.045)
Failure to supervise          0.529     0.536       -0.007   (0.044)     0.036    (0.066)
Negligence                    0.603     0.675       -0.073   (0.043)     -0.021   (0.062)
Fiduciary breach              0.759     0.722       0.037    (0.039)     0.09     (0.056)
Unsuitability                 0.438     0.428       0.011    (0.044)     0.014    (0.065)
Misrepresentation             0.463     0.418       0.045    (0.044)     -0.004   (0.065)
Fraud                         0.416     0.376       0.04     (0.044)     0.07     (0.064)
Failure to execute            0.066     0.067       -0.001   (0.022)     0.02     (0.035)
Deception                     0.025     0.031       -0.006   (0.014)     0.006    (0.027)

Notes: The table reports means and difference in means for selected characteristics of cases in the treatment and
comparison groups. Columns (1) and (2) report the mean during the pre-change period for each group separately, and column
(3) presents the difference between the groups in the pre period as well as the standard error for this difference.
Column (4) reports the coefficient estimates for a DD estimation comparing the pre- and post periods for the treatment and
control group. To avoid censoring, the duration of cases is limited as explained hereinafter in figure 6 and the text that
follows. Standard deviations and robust standard errors are reported in parenthesis (*** p<0.01, ** p<0.05, * p<0.1).

Table 2: Percentage of Cases Filed Pre vs. Post Change by Treatment Status

                        Treatment         Control           Difference
Filed Pre (Jan-Mar)     0.257 (0.044)     0.265 (0.021)     0.008  (0.049)
Filed Post (Apr-Jun)    0.267 (0.044)     0.261 (0.021)     -0.006 (0.049)

Notes: The table reports the share of cases that were filed within a 3 months period before and after the change in rules
became effective (on March 30, 2009) out of all cases filed during 2009. These shares are presented for the treatment and
control groups separately and the difference between the groups is calculated in the last column. To avoid censoring,
the duration of cases is limited as explained hereinafter in figure 6 and the text that follows. Standard deviations and
robust standard errors are reported in parenthesis (*** p<0.01, ** p<0.05, * p<0.1).

Figure 3: Claim Filing Frequency Pre & Post Change (by Range of Relief Requested)

Figure 4: Distribution of Relief Requested 2008 vs. 2010

To assess the possibility of manipulation around the threshold of relief requested, Figure 4
compares the distribution of relief requested in cases filed in the pre and post periods. There are
some notable differences that may suggest some extent of manipulation. Mainly, it seems that the
percentage of cases where the relief requested was the “old” threshold (50k) or slightly less
decreased following the change, and the number of cases with a relief requested at the “new”
threshold (100k) or slightly less increased. This may reflect a preference of claimants for the
appointment of a single arbitrator. However, comparable and even larger differences in the frequency of
claims occur at 25k and 200k. These changes cannot be explained by this or any other change in
FINRA regulations. Moreover, figure 5 compares the kernel density of all cases filed pre-change
to those filed post change. Again, the most apparent differences do not occur around any of the
thresholds. Comparing the two distributions using the Kolmogorov-Smirnov test for the equality
of distributions, the null hypothesis that the samples are drawn from the same distribution cannot
be rejected (with a p-value of 0.42). It should also be noted that parties can agree to change the size
of the arbitral tribunal and, moreover, prior to the change the consent of the respondent was not
required when the rule assigned a sole arbitrator to the case. This structure significantly reduces
incentives to alter the relief requested in the interest of choosing panels instead of singles or vice
versa. Nevertheless, in Table 4, I test the robustness of the main results to an alternative definition
of the treatment and control group, where the threshold amounts are attributed to these groups,
as though manipulation actually occurred.
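For readers less familiar with the Kolmogorov-Smirnov comparison mentioned above, the following is a minimal sketch of the two-sample test statistic: the largest vertical gap between the empirical distribution functions of relief requested in the two periods. The function name is hypothetical, and the sketch returns only the statistic; converting it to a p-value (such as the 0.42 reported above) requires the test's reference distribution, which is not reproduced here.

```python
from bisect import bisect_right
from typing import Sequence


def ks_statistic(pre: Sequence[float], post: Sequence[float]) -> float:
    """Two-sample Kolmogorov-Smirnov statistic: max |F_pre(x) - F_post(x)|."""
    if len(pre) == 0 or len(post) == 0:
        raise ValueError("both samples must be non-empty")
    sorted_pre, sorted_post = sorted(pre), sorted(post)
    gap = 0.0
    # The empirical CDFs only change at observed values, so checking those suffices.
    for x in sorted(set(sorted_pre) | set(sorted_post)):
        f_pre = bisect_right(sorted_pre, x) / len(sorted_pre)
        f_post = bisect_right(sorted_post, x) / len(sorted_post)
        gap = max(gap, abs(f_pre - f_post))
    return gap
```

A statistic near zero means the two empirical distributions nearly coincide, while a value of one means the samples do not overlap at all.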
Figure 5: Distribution of Relief Requested Pre & Post Change (All Years)

Finally, as mentioned in the previous section, data on awards is retrospective. To avoid censoring,
I drop cases for which the duration from filing to award exceeds a specific number of days,
which is calculated as the 90th percentile of the distribution of case duration for each category
of cases. Figure 6 presents the cumulative distribution of the duration for the treatment and two
control groups. The calculation of these distributions is based on cases filed only during 2006 and
2007. The awards in the sample were collected up to 2012, allowing a maximal duration of 7 years
for cases filed in the beginning of 2006, and 5 years for those filed very close to the end of 2007.
Assuming that the number of cases that take longer than 5 years to complete is practically zero,
I treat the presented distribution as portraying all cases that eventually result in a written award.
The vertical line in each figure marks the 90th percentile, indicating that for the treatment group
90% of cases end within 744 days. For the first control group, which is characterized by lower
amounts requested, the duration of proceedings tends to be slightly shorter, and hence the same
share of resolved cases is reached within 636 days. The opposite is true for the second comparison
group, where resolving cases is likely to take longer and hence the maximal number
of days for cases in the main sample is set to be 776. To portray how this limitation may affect
the results, in the next section, a sensitivity analysis is conducted and boundaries to the estimates
are derived. Moreover, in table 4 I repeat the main analysis using the full sample without any
restriction and a more restricted sample that sets the limit at the 75th percentile for each group.
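The duration restriction described above can be sketched as follows. The percentile rule used here (nearest rank) is an assumption, since the text does not say which definition was applied, and the function names and group labels are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Case = Tuple[str, float]  # (case category, days from filing to award)


def nearest_rank_percentile(values: List[float], pct: float) -> float:
    """Nearest-rank percentile: smallest value with at least pct% of data at or below it."""
    if not values:
        raise ValueError("no durations supplied")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def duration_cutoffs(reference_cases: List[Case], pct: float = 90.0) -> Dict[str, float]:
    """Per-category cutoff computed from reference cases (e.g. those filed in 2006-2007)."""
    by_group: Dict[str, List[float]] = {}
    for group, days in reference_cases:
        by_group.setdefault(group, []).append(days)
    return {g: nearest_rank_percentile(d, pct) for g, d in by_group.items()}


def restrict_sample(cases: List[Case], cutoffs: Dict[str, float]) -> List[Case]:
    """Keep only cases resolved within their category's cutoff."""
    return [(g, d) for g, d in cases if d <= cutoffs[g]]
```

Calling `duration_cutoffs` with `pct=75.0` would give the stricter restriction used in the robustness checks, and skipping `restrict_sample` corresponds to the unrestricted sample.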
It should be noted that the average case duration increased in the period after the change both
in the treated category of cases and in the control group. The point estimate of the change for
the treated group is 33 days and the 95% confidence interval is [10.7,55.8]. The point estimate for
the second control group is even larger: 44 days, and the 95% confidence interval is [25.7,71.7].
This change might be attributed to the increase in caseload following the financial crisis, since
when 2009 is dropped from the sample, the change in mean duration for the treatment category
decreases and is no longer statistically significant.
Figure 6: Case Duration CDF (by Range of Relief Requested). Panels: (a) 0 < RR ≤ 50k; (b) 50k < RR ≤ 100k; (c) 100k < RR ≤ 250k.
3.3 Results
The theoretical prediction I aim to test relies on the idea that when arbitrators make decisions
within a panel, their decision in the specific case will only have a minor effect on their reputation.
The expected consequence would be that, when compared to sole arbitrators’ decisions, a larger
fraction of panel decisions are decisions that are perceived as extreme, i.e. far from the average
decision. This perception is not necessarily related to the accuracy or fairness of the decision under
the specific circumstances. Such a result can also be attributed to the more standard explanation
based on “group polarization”.
Table 3 summarizes the results of a series of DD regressions using indicators for specific values
(or value ranges) of the award rate as outcomes. Columns (1) and (7) present the estimated change
in “extreme” awards due to the switch from three arbitrators to a sole arbitrator. The coefficient
on the interaction term is negative and significant (only at the 10 percent level for AR=0), both
with the basic set of covariates, in row (a), and with the full set of controls, in row (b). Both
specifications include year fixed effects (in addition to the “post” indicator). The interpretation of
this result is that sole arbitrators are less likely to award claimants zero damages and also less
likely to accept all the claims and fully compensate the claimants. At the same time, a positive and
significant change occurs for moderate or “split the difference” awards, where the award rate is
between 0.4 and 0.6.
The magnitude of the effect is very similar regardless of the specification and suggests that
an individual arbitrator is 10 percentage points less likely to entirely deny a claim compared to a
panel of three arbitrators. Out of a baseline level of 51% of claims being denied, this constitutes
an approximately 20% decrease in respondents’ wins when a sole arbitrator is presiding. On the
other extreme, there is a 9 percentage point decrease in awards that grant all of the claimants’
claims. Strikingly, a shift from three arbitrators to one almost entirely eliminates such awards,
as the decrease amounts to 85%.
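The step from a percentage-point change to a relative change is simple arithmetic, sketched below for the claim-denial figures quoted above (a 10 percentage point drop from a 51% baseline). The helper name is hypothetical.

```python
def relative_change(pp_change: float, baseline: float) -> float:
    """Express a change in a share (in proportion units) relative to its baseline level."""
    if baseline <= 0:
        raise ValueError("baseline share must be positive")
    return pp_change / baseline


# 10 percentage points out of a 51% baseline: roughly a 20% decrease.
denial = relative_change(-0.10, 0.51)
print(round(denial, 3))
```

The same calculation applied to full awards requires their baseline share, which is not restated in this section.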
Moreover, column (8) shows a significant increase of 16 percentage points in the likelihood of
settlements. This result seems to be in line with the reputation channel suggested in the introduction,
if one believes that arbitrators can effectively influence the probability of settlement. Since
settlements will not affect the arbitrator’s reputation, sole arbitrators will be more interested in
settlements and therefore act to promote them. At the same time, such a result contradicts the well-known
prediction that as costs of trial increase so does the probability of settlement (Priest & Klein
(1984), Bebchuk (1984)). According to this prediction, since panels of three impose higher costs on the parties
when compared to a sole arbitrator, panels are expected to promote settlements. This result also
contradicts the one presented by Marselli et al. (2013). Their explanation for having more cases
settled in the presence of three arbitrators rather than one is that in such cases results are more
predictable, which fosters successful bargaining between the parties.26
In panel B, the same estimates are presented within arbitrator, as arbitrator fixed-effects are
added to the model. The effects that are estimated are very similar in magnitude, although sig-
nificance levels drop below conventional levels for some of the estimates. The sample is smaller
since singletons are dropped. These results are further discussed in section 4.
Putting together the different effects of the change reveals a tendency of panels to give more
“extreme” awards when compared to singles and to promote fewer settlements. The strongest impact
seems to be on customer wins, implying that sole arbitrators are more reluctant to rule against
brokerage firms than the other way around.
26 In our sample panels actually seem to be less expected and therefore their explanation may fit the results. However,
it is difficult to justify an argument that relies on the fact that parties know and expect panels to be less predictable.

Table 3: The Effect of Number of Arbitrators on Award Rate Distribution & Settlements

       (1)         (2)         (3)         (4)         (5)         (6)          (7)          (8)
       AR=0        0<AR≤.2     .2<AR≤.4    .4<AR≤.6    .6<AR≤.8    .8<AR≤1      AR=1         Settled     Controls   Obs.

Panel A: The effect of the rule change (switching from 3 to 1)
(a)    -0.103*     -0.0193     0.0130      0.200***    0.00683     -0.0792***   -0.0930***   0.164***    NO         1921
       (0.0581)    (0.0288)    (0.0324)    (0.0467)    (0.0273)    (0.0305)     (0.0305)     (0.041)
(b)    -0.107*     -0.0210     0.0183      0.181***    0.00283     -0.00865     -0.0928***   0.160***    YES        1921
       (0.0586)    (0.0218)    (0.0236)    (0.0446)    (0.0215)    (0.0153)     (0.0308)     (0.0404)

Panel B: Including Arbitrator FEs - the within arbitrator effect
(c)    -0.131      -0.0261     0.0102      0.187***    -0.0106     -0.00721     -0.0793      0.193***    NO         1257
       (0.0899)    (0.0470)    (0.0409)    (0.0720)    (0.0298)    (0.0495)     (0.0523)     (0.0639)
(d)    -0.131      -0.0208     0.0173      0.185***    -0.00551    -0.00710     -0.0762      0.192***    YES        1257
       (0.0930)    (0.0329)    (0.0312)    (0.0704)    (0.0241)    (0.0174)     (0.0538)     (0.0644)

Notes: The table reports DD estimates for the effects of changing the number of arbitrators from three to one on the
probability of an award rate (AR) within the range specified in column titles and the probability of settlements. The
‘treatment’ group includes cases with a relief requested higher than 50k and lower or equal to 100k. The comparison
group includes cases with a relief requested between 0 and 50k and cases with a relief requested higher than 100k and
lower or equal to 250k. The sample is restricted to cases where the duration of proceedings is lower or equal to the 90th
percentile of case duration in each group of cases. Panel A reports results for the full sample, whereas in Panel B
arbitrator FEs are added and singletons are dropped from the sample. Specifications (a) and (c) control for filing year
fixed effects. Specifications (b) and (d) also add case controls, including indicators for parties’ representation and
counter claim, a set of indicators for type of securities and a set of indicators for the cause of claim or controversy
type (for the full list of variables see table 1). Standard errors are clustered at the arbitrator level and reported in
parenthesis (*** p<0.01, ** p<0.05, * p<0.1).

3.4 Robustness of Results
3.4.1 Sensitivity Analysis
The results presented so far are only based on a selected sample of approximately 90% of all cases,
due to the retrospective nature of the data. As explained in the previous section, I only use awards
that are given within a specific number of days from filing. To see how this might affect results, I
conduct the following experiment. I use all the awards in the full sample for the pre period, since
from the time the last case was filed (time of the rule change) we have almost 4 years of data on
awards. This allows us to consider the sample for the pre period to be the full sample of cases
filed prior to the change. For the post period, I add artificial observations that follow the “worst
case” scenario for the prediction of interest. More specifically, I increase the control group by
10% and all new observations are assumed to show the pattern predicted for the treatment group,
i.e. all awards are exactly 0.5. The opposite is done for the treatment group, where half of the
added observations have an award rate equal to one and the other half have an award rate equal to zero. Row (b) of
table 4 presents the estimated coefficient on the interaction term for the main outcomes of interest,
when the main specification is estimated using this extended sample. When compared to the main
results (presented again for convenience in row (a)), these should be thought of as the lower bound
estimates. Clearly, the point estimates exhibit the same trends although smaller magnitudes of
change (as expected). Significance levels are maintained, except for the coefficient in column (1),
which indicates the change in the likelihood of awards in favour of the respondent. Keeping in
mind that this scenario is extreme and highly unlikely, this experiment indicates that under
more realistic scenarios, even if the missing cases do not follow the same pattern, results hold both
qualitatively and quantitatively.
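The worst-case augmentation described in the sensitivity analysis can be sketched as below. Several details are assumptions made for illustration: the 10% is applied to the number of post-period observations in each group, counts are rounded to the nearest integer, and the tuple layout and function name are hypothetical.

```python
from typing import List, Tuple

Case = Tuple[int, int, float]  # (treated, post, award_rate)


def add_worst_case(cases: List[Case], share: float = 0.10) -> List[Case]:
    """Append artificial post-period cases that work against the main findings."""
    n_control = sum(1 for treated, post, _ in cases if post == 1 and treated == 0)
    n_treated = sum(1 for treated, post, _ in cases if post == 1 and treated == 1)
    extra_control = int(round(share * n_control))
    extra_treated = int(round(share * n_treated))
    augmented = list(cases)
    # Added control cases mimic the pattern found for sole arbitrators: moderate awards.
    augmented += [(0, 1, 0.5)] * extra_control
    # Added treated cases mimic the panel pattern: half full awards, half zero awards.
    n_full = extra_treated // 2
    augmented += [(1, 1, 1.0)] * n_full + [(1, 1, 0.0)] * (extra_treated - n_full)
    return augmented
```

Re-estimating the main specification on the augmented sample then gives estimates that can be read as a lower bound on the magnitudes, in the spirit of row (b) of table 4.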
3.4.2 More Robustness Tests
Rows (c)-(f) in table 4 display an array of additional robustness tests that reinforce the main findings.
First, in row (c), I restrict the sample to exclude all cases filed during 2009, the year of change.
The aim of this is twofold. First and foremost, it avoids the direct effects of the financial crisis. It
is reasonable to assume that most claims that relate to the events of the crisis were filed during
this year, especially since, as mentioned above, 2009 stands out with a very high number of claims
relative to other years. Second, if there was some manipulation of file dates around the time of
the rule change, it assures us that this does not drive the results. Results are remarkably similar
to the main findings both in sign and in magnitude, although the impact on claim denial and on
settlement is no longer statistically significant due to an increase in standard errors. However, this
could be a result of the 25% decrease in sample size.
In row (d) the treatment and control groups are redefined to account for possible manipulation
around the 50k and 100k cutoffs. In the main specification, cases with a relief requested equal to
50k are included in the control group both before and after the change. Assume that pre-change
claimants tended to choose this amount to make sure their case is heard by a sole arbitrator, even
though they actually intended to request a slightly higher amount. Then actually in the pre period
some of these cases should be in the over 50k group. However in the post period, this threshold
is no longer relevant and there is no basis to believe any manipulation occurs. In accordance
with this story, the 50k cases are moved from the control group into the treatment group in the
pre period (but not in the post period). The same idea is applied to the 100k threshold, where
the threshold cases are moved from the second control group to the treatment group in the pre
period (but not in the post period). This re-definition does not change our main conclusions as the
estimates remain practically identical.
Last, in rows (e) and (f) I test whether the results are sensitive to changes in the duration limi-
tation (aimed at avoiding censoring effects). Row (e) presents the results for a sample without any
restriction on case duration, while row (f) shows the estimates when a more stringent restriction
is applied, i.e. the number of days at the 75th percentile of the case duration distribution. For the
most part, the choice of maximal duration has no effect on results. However, the magnitude of
the effect on award rates of zero fluctuates between an insignificant 8 percentage point decrease
and a significant 14 percentage point decrease. Note that the first estimate is based on the
unlimited sample and hence censoring is not dealt with. For award rates equal to one (or more), the
estimated change remains very similar at 8 percentage points. Overall, the different robustness
tests appear to support the patterns reported in the main part.

Table 4: Robustness & Placebo Tests

                              (1)          (2)             (3)           (4)
                              AR=0         .4 < AR ≤ .6    AR=1          Settled      Observations
(a) Main                      -0.103*      0.200**         -0.093**      0.164**      1921
                              (0.059)      (0.048)         (0.030)       (0.044)
Sensitivity Analysis
(b) ‘Lower Bound’             -0.081       0.110**         -0.062**      0.096**      2213
                              (0.057)      (0.048)         (0.030)       (0.044)
Robustness Tests
(c) Excluding 2009            -0.134*      0.238***        -0.135***     0.206***     1395
                              (0.0685)     (0.0613)        (0.0374)      (0.0572)
(d) Moving Thresholds         -0.116*      0.165***        -0.0753**     0.116***     1921
                              (0.0601)     (0.0510)        (0.0299)      (0.0449)
(e) Unlimited Duration        -0.0863      0.184***        -0.0800**     0.152***     2107
                              (0.0554)     (0.0447)        (0.0315)      (0.0392)
(f) 75th pct Duration         -0.138**     0.188***        -0.0832**     0.168***     1587
                              (0.0617)     (0.0479)        (0.0339)      (0.0404)
Placebo Tests
(g) Only Pre Period           -0.0690      -0.00637        -0.000331     0.0417       1021
                              (0.0838)     (0.0571)        (0.0502)      (0.0410)
(h) Only Pre Period           -0.0626      -0.0171         -0.000742     0.0393       1021
                              (0.0842)     (0.0575)        (0.0509)      (0.0415)

Notes: The table reports DD estimates of various sensitivity, robustness and placebo tests, using the main outcomes of
interest as defined in table 3. The sample is restricted to cases where the duration of proceedings is lower or equal to
the 90th percentile of case duration in each group of cases, except where noted otherwise. The first row reports the main
outcomes (as reported in table 3 Panel A) and serves as a benchmark. Row (b) reports the same estimates for a sample
that includes all cases in the pre period, and cases with duration restricted as explained above for the post period. Fake
observations that follow the ‘worst case’ scenario are added to the post period (i.e. award rate values go exactly the
opposite direction from the findings in the first row). Hence, the second row actually reports the lower bound for the
results in the first row. Row (c) presents the results for the same specifications using a sample that excludes all cases
filed during 2009, which was the year when the number of claims peaked due to the financial crisis. Specification (d)
shifts the threshold between the groups of cases to account for potential manipulation around the cutoffs. Specifications
(e) and (f) test the sensitivity of the results to changing the choice of case duration that accounts for potential censoring.
The last two rows report DD-placebo estimates for the effects of being in the treatment group post two arbitrary dates
in the pre period, August 1st 2007 and September 1st 2006. In this last test, the sample is restricted to cases which were
filed before April 2009, i.e. prior to the actual policy change. Robust standard errors are reported in parenthesis (***
p<0.01, ** p<0.05, * p<0.1).
3.4.3 Placebo Test
The last two rows of table 4 present the results of two different placebo tests to further establish
that results are not coincidental or simply follow a pre-existing trend. Rows (g) and (h) use the
same treatment and control groups but only in the pre period. This period is divided into two
sub-periods using the arbitrary dates, August 1st 2007 and September 1st 2006, so the placebo is
on the “post” indicator. None of the effects are significant. These results are helpful in refuting
concerns regarding any difference in pre-trends between treatment and control groups.27
3.5 Instrumental Variable Estimation
All the preceding results are based on the expected number of arbitrators according to the regula-
tions applied by FINRA at the filing date. However, as mentioned above, regulations allow parties
to agree on a change in the number of arbitrators. Parties may wish to switch from a panel of three
to a sole arbitrator to reduce costs of proceedings. They may also choose to switch based on their
perception of which forum will better serve their interests. In practice, most parties seem to comply
with the rules set by FINRA, especially when claimants are not represented. The “switching
rate” ranges between 4 and 20 percent of cases, where the former is the percentage of cases that
should have panels but choose a sole arbitrator instead, and the latter is the percentage of cases that
switch from one to three.
The lack of full compliance means that the results displayed so far should be thought of as the
“Intention to Treat” effect. Namely, the portrayed change is a consequence of the change in rule
and not in the actual forum hearing the case. To assess the effect of the change in the number of
arbitrators on the compliers, I next present the results of an estimation that uses the interaction
between the rule change and the affected group as an Instrumental Variable for the actual number
of arbitrators. The proposed instrument satisfies the necessary conditions and hence estimators
reported in table 5 are consistent. Column (1) presents the results for the first stage and confirms
that the instrument is highly correlated with a sole arbitrator presiding over the case. In addition,
the exclusion restriction is satisfied since there is no reason to believe that the rule change affects
arbitration outcomes by any other channel, except for the change in the number of arbitrators.
Last, as confirmed above, the timing of the rule change and hence the instrument, could be thought
of as exogenous to the outcomes of interest.
27 Although only the results for two cutoff dates are reported in the table, this exercise was conducted for numerous
different cutoffs. It should also be noted that the same test was conducted for the post period, and, again, no significant
effects were found.
Table 5: Instrumental Variable Estimation

                    (1)                (2)          (3)             (4)           (5)
                    Sole Arbitrator    AR=0         .4 < AR ≤ .6    AR=1          Settled
                    (1st Stage)
TreatedXPost        0.718***
                    (0.035)
Sole Arbitrator                        -0.136*      0.265***        -0.144***     0.224***
                                       (0.0801)     (0.0669)        (0.0439)      (0.0596)
Observations        1921               1921         1921            1921          1921

Robust standard errors in parentheses
*** p<0.01, ** p<0.05, * p<0.1
Columns (2) through (5) present the 2SLS estimators for the main specifications and outcomes.
These estimates are larger in absolute values compared to the DD estimators reported above in
table 3 and significance levels are very similar. Generally, it shows that the effect is bigger for
cases that ‘play by the rules’ and reinforces the conclusion that panels and sole arbitrators behave
and decide differently.
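The link between the intention-to-treat estimates and the 2SLS estimates can be illustrated with the ratio form of the IV estimator: with a single instrument, the IV coefficient equals the reduced-form (ITT) coefficient divided by the first-stage coefficient. The sketch below plugs in the settlement ITT from table 3, row (a), and the first stage from table 5. The helper name is hypothetical, and the identity is exact only when all three regressions use the same sample and controls, so small gaps relative to the reported 2SLS figures are to be expected.

```python
def wald_iv(reduced_form: float, first_stage: float) -> float:
    """Ratio-form IV estimate: reduced-form effect scaled by the first-stage effect."""
    if first_stage == 0:
        raise ValueError("first stage must be non-zero (the instrument must be relevant)")
    return reduced_form / first_stage


# Settlement outcome: ITT of 0.164 and a first stage of 0.718.
settled_iv = wald_iv(0.164, 0.718)
print(round(settled_iv, 3))
```

The ratio comes out at roughly 0.228, close to the 0.224 reported in column (5) of table 5. Because the first stage is below one, the IV estimates are necessarily larger in absolute value than the corresponding ITT estimates.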
4 Exploring the Underlying Mechanisms
As discussed above, three alternative explanations can be offered for the observed differences between
individual and group decisions in this setting. First, litigants may have different preferences
over specific attributes of arbitrators, that depend on whether the chosen arbitrator will serve as a
sole arbitrator or as a member of a panel (or chair). If these attributes are also correlated with the
potential decision, then the results presented above serve as un-biased estimators of the impact of
the rule change, namely the difference in outcomes between a regime where individual arbitra-
tors decide on cases and a regime where groups decide. To explore this option, Panel B of table 3
reports the results of an arbitrator fixed-effects estimation. The estimated model is the same DD
model as in panel A, with the addition of arbitrator fixed effects. Since only arbitrators that qual-
ify as chairs may serve as sole arbitrators, the fixed-effect attributed to each panel of arbitrators is
based on the identity of the panel-chair. In addition to the technical reason for this choice of spec-
ification, it is reasonable to expect that the chair will have the greatest influence on the decision
of the panel, in light of the extensive responsibilities and authorities assigned to panel-chairs by
FINRA’s regulations.
It should be noted that this specification dramatically decreases statistical power, since there
are 457 unique arbitrator names in a sample of 1257 cases. Nevertheless, as can be seen in Panel B
of Table 3, the main effects are in the same direction and of a similar magnitude to the main
results reported in Panel A. The “within” estimators for the change in extreme (zero or one) awards
are negative, implying that the same arbitrator renders fewer extreme awards as a sole arbitrator
than the panels she chairs. These estimates are remarkably similar to the results without fixed
effects, although they are not statistically significant. It is reasonable to assume that the
relatively high standard errors are the result of the limited sample size relative to the number of
unique arbitrator names in the sample. The estimated effects on “average” awards (award rate
between 0.4 and 0.6) and on the tendency to settle are also very similar to the main results and
are significant. Hence, these results suggest that arbitrators change their decision patterns depending
on the type of tribunal, and tend towards less extreme decisions when deciding as individuals
(rather than within groups). Alternatively stated, these results rule out the “selection mechanism”
suggested above.
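The “within” estimation described above can be illustrated with a minimal sketch. It assumes a simplified model with no covariates other than the treatment indicator, the post-change indicator, and their interaction; the function name, variable names, and synthetic data are all hypothetical and are not taken from the paper's code or data. Chair fixed effects are absorbed by subtracting each chair's mean from the outcome and the regressors before running least squares.

```python
import numpy as np

def dd_within_estimate(y, treated, post, chair_ids):
    """DD interaction coefficient after absorbing chair fixed effects.

    Simplified illustration: no additional controls, point estimate only.
    """
    y = np.asarray(y, dtype=float)
    treated = np.asarray(treated, dtype=float)
    post = np.asarray(post, dtype=float)
    # Map each chair identifier to an integer group index.
    _, group = np.unique(np.asarray(chair_ids), return_inverse=True)
    counts = np.bincount(group)

    def demean(v):
        # Subtract the chair-specific mean (the "within" transformation).
        means = np.bincount(group, weights=v) / counts
        return v - means[group]

    regressors = [treated * post, treated, post]
    X_within = np.column_stack([demean(x) for x in regressors])
    beta, *_ = np.linalg.lstsq(X_within, demean(y), rcond=None)
    return beta[0]  # coefficient on treated * post

# Synthetic, noise-free example: four hypothetical chairs, each observed
# in all four (treated, post) cells, with a true interaction effect of -0.2.
chairs = np.repeat(np.arange(4), 4)
treated = np.tile([0, 0, 1, 1], 4)
post = np.tile([0, 1, 0, 1], 4)
chair_effect = np.array([0.3, -0.1, 0.5, 0.0])[chairs]
y = chair_effect + 0.1 * treated + 0.05 * post - 0.2 * treated * post

estimate = dd_within_estimate(y, treated, post, chairs)
print(round(estimate, 6))
```

Because the synthetic outcome contains no noise, the sketch recovers the interaction effect built into the data up to floating-point error. With 457 chairs in 1257 cases, many chairs contribute only a few observations, which is consistent with the loss of precision noted above.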
The second mechanism that potentially drives the results is the group polarization phenomenon,
which implies, in the context of this research, that panels of arbitrators will always tend to give
zero or one awards, regardless of the type of dispute they are asked to decide. In contrast, the
third potential channel is only expected to operate in the presence of reputation concerns, where
sole arbitrators will be more cautious about their reputation and avoid extreme decisions that
may portray them as biased. In the main estimations, I focus only on customer cases, where the
claimant is always the investor and the respondent is a firm. In such cases, litigants may suspect
arbitrators of being systematically biased towards either side, based on each arbitrator’s
past decisions. Hence, it is reasonable to assume that reputation concerns play a role.
However, in other cases (“industry cases”), the parties to the dispute could be two firms or
a firm and a related person (employee, supplier, etc.), and no specific inference about arbitrators’
bias can be made. Table 6 presents the results of estimating the same DD model as above
on a sample of industry cases. These cases were subject to the same change in regulations
that is used to identify the difference between individuals and groups in the sample of customer
cases. The results bear no resemblance to the estimates for the customer cases sample, and
none of the estimates is significant. Although this can be attributed to sample size, the fact that
the patterns of change do not match, and that no distinct pattern of moderation or polarization
can be identified in the industry cases sample, suggests that the most probable mechanism is the
reputation mechanism.
Table 6: The Effect of Number of Arbitrators on Award Rate Distribution & Settlements in Industry Cases

              (1)         (2)         (3)         (4)         (5)         (6)         (7)         (8)
              AR=0        0<AR≤.2     .2<AR≤.4    .4<AR≤.6    .6<AR≤.8    .8<AR<1     AR=1        Settled
TreatedXpost  0.0872      0.0267      -0.0314     -0.137      0.0342      0.339       -0.277      -0.010
              (0.139)     (0.0341)    (0.0731)    (0.138)     (0.0367)    (0.247)     (0.266)     (0.013)
Treated       -0.0333     -0.0597*    -0.0538     0.0947      -0.0612     -0.107      0.318**     0.001
              (0.0379)    (0.0341)    (0.0439)    (0.132)     (0.0527)    (0.206)     (0.160)     (0.014)
Post          -0.0265     0.0749      -0.0156     0.103*      -0.0900     0.0320      -0.0661     0.045
              (0.0624)    (0.0512)    (0.0933)    (0.0566)    (0.0962)    (0.161)     (0.163)     (0.037)
Constant      0.225       0.0215      0.249       0.154       0.0783      0.0310      0.150       -0.022
              (0.336)     (0.0410)    (0.198)     (0.0988)    (0.0683)    (0.317)     (0.292)     (0.035)
Observations  169         169         169         169         169         169         169         169
R-squared     0.317       0.099       0.139       0.154       0.084       0.236       0.273       0.059

Robust standard errors in parentheses
*** p<0.01, ** p<0.05, * p<0.1
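The dependent variables in columns (1) through (7) are indicators for the award rate (AR) falling in a given range. A minimal sketch of how such indicators could be constructed is given below; the function names and example values are hypothetical, and the sketch assumes award rates lie in the unit interval. Settled cases have no award rate, so the indicator in column (8) would be built separately.

```python
import numpy as np

# Bin labels mirror the column headers of Table 6 (columns 1 through 7).
BIN_LABELS = ["AR=0", "0<AR<=.2", ".2<AR<=.4", ".4<AR<=.6",
              ".6<AR<=.8", ".8<AR<1", "AR=1"]

def award_rate_bin(ar):
    """Return the bin label for a single award rate in [0, 1]."""
    if not 0.0 <= ar <= 1.0:
        raise ValueError("award rate must lie in [0, 1]")
    if ar == 0.0:
        return "AR=0"
    if ar == 1.0:
        return "AR=1"
    # Interior bins are open on the left and closed on the right.
    for upper, label in zip((0.2, 0.4, 0.6, 0.8), BIN_LABELS[1:5]):
        if ar <= upper:
            return label
    return ".8<AR<1"

def bin_indicators(award_rates):
    """Map each bin label to a 0/1 array, one entry per case."""
    labels = [award_rate_bin(ar) for ar in award_rates]
    return {name: np.array([int(lab == name) for lab in labels])
            for name in BIN_LABELS}

# Hypothetical award rates for five cases.
example_rates = [0.0, 0.15, 0.5, 0.95, 1.0]
indicators = bin_indicators(example_rates)
```

Each indicator can then serve as the outcome in a separate linear regression on Treated, Post, and their interaction. Since the seven bins are mutually exclusive and exhaustive for cases that end in an award, the indicators sum to one for every such case.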
5 Conclusion
This study presents empirical evidence on the difference between decisions of sole arbitrators
and panels of three arbitrators. The results portray a tendency of panels to make more extreme
decisions, i.e. closer to zero or one in award rates. In addition, panels seem to be less interested in
promoting settlements.
The results are based on a research design that exploits a change in FINRA regulations regarding
the threshold for assigning cases to panels rather than sole arbitrators. A difference-in-differences
setting makes it possible to estimate the causal effect of “singles versus panels”. Interestingly, the
results contradict many theoretical predictions and other findings based on observational studies,
by clearly showing that panels tend to make more “all or nothing” decisions. In particular, these results
are surprising in light of the prevalent conjecture that panels are more predictable, i.e. provide
moderate, ‘split the difference’ decisions, since arbitrators tend to balance each other’s biases or
noisy signals by averaging out the opinions of the panel members.
I show that these findings are robust and most likely driven by the fact that reputation concerns
affect sole arbitrators, driving them to avoid extreme outcomes that could establish their
reputation as biased towards either side. Alternatively stated, when arbitrators decide as a group
they exhibit less concern for their reputation and hence have the courage to make extreme and
even controversial decisions. If arbitrators reach these decisions based on their true evaluation
of the claim, this outcome is desirable. However, such decisions may also be driven by biased
arbitrators, who use the fact that they are part of a group to freely express their biased views. Without
the support of the panel, the same arbitrators are expected to make less biased decisions, to avoid
damaging their reputation. On the downside, unbiased sole arbitrators may be reluctant to render
truthful but extreme decisions, since the truth will never become publicly known and hence they
will be perceived as potentially biased. Finding more opportunities to compare groups and
individuals in different environments with similar career concerns would shed more light on the
systematic differences between group and individual decisions and decision-making processes.
References
Adams, R. and D. Ferreira (2010). Moderation in groups: Evidence from betting on ice break-ups
in Alaska. The Review of Economic Studies 77(3), 882–913.
Baillon, A., H. Bleichrodt, N. Liu, and P. P. Wakker (2016). Group decision rules and group
rationality under risk. Journal of Risk and Uncertainty 52(2), 99–116.
Barber, B. M., C. Heath, and T. Odean (2003). Good reasons sell: Reason-based choice among
group and individual investors in the stock market. Management Science 49(12), 1636–1652.
Barber, B. M. and T. Odean (2000). Too many cooks spoil the profits: Investment club performance.
Financial Analysts Journal 56(1), 17–25.
Benabou, R. (2013). Groupthink: Collective delusions in organizations and markets. The Review of
Economic Studies 80, 429–462.
Blinder, A. S. and J. Morgan (2005). Are two heads better than one? monetary policy by committee.
Journal of Money, Credit and Banking 37(5), 798–811.
Bone, J., J. Hey, and J. Suckling (1999). Are groups more (or less) consistent than individuals?
Journal of Risk and Uncertainty 18(1), 63–81.
Bornstein, G. and I. Yaniv (1998). Individual and group behavior in the ultimatum game: Are
groups more “rational” players? Experimental Economics 1(1), 101–108.
Charness, G. and M. Sutter (2012). Groups make better self-interested decisions. The Journal of
Economic Perspectives 26(3), 157–176.
Choi, S. J., J. E. Fisch, and A. C. Pritchard (2010). Attorneys as arbitrators. The Journal of Legal
Studies 39(1), 109–157.
Cooper, D. J. and J. H. Kagel (2005). Are two heads better than one? team versus individual play
in signaling games. The American Economic Review 95(3), 477–509.
Eliaz, K., D. Ray, and R. Razin (2006). Choice shifts in groups: A decision-theoretic basis. The
American Economic Review 96(4), 1321–1332.
Ely, J. C. and J. Valimaki (2003). Bad reputation. The Quarterly Journal of Economics, 785–814.
Epstein, L., W. M. Landes, and R. A. Posner (2011). Why (and when) judges dissent: A theoretical
and empirical analysis. Journal of Legal Analysis 3, 101.
Glaeser, E. L. and C. R. Sunstein (2009). Extremism and social learning. Journal of Legal
Analysis 1(1), 263–324.
Goeree, J. K. and L. Yariv (2011). An experimental study of collective deliberation.
Econometrica 79(3), 893–921.
Janis, I. L. (1982). Groupthink: Psychological studies of policy decisions and fiascoes. Houghton Mifflin
Boston.
Kerr, N. L., R. J. MacCoun, and G. P. Kramer (1996). Bias in judgment: Comparing individuals and
groups. Psychological Review 103(4), 687.
Klement, A. and Z. Neeman (2013). Does information about arbitrators’ win/loss ratios improve
their accuracy? The Journal of Legal Studies 42(2), 369–397.
Kocher, M. G. and M. Sutter (2005). The decision maker matters: Individual versus group
behaviour in experimental beauty-contest games. The Economic Journal 115(500), 200–223.
Levy, G. (2007). Decision making in committees: Transparency, reputation, and voting rules. The
American Economic Review, 150–168.
Marselli, R., B. C. McCannon, and M. Vannini (2013). Bargaining in the shadow of arbitration.
Available at SSRN 2270481.
Morris, S. (2001). Political correctness. Journal of Political Economy 109(2), 231–265.
Moscovici, S. and M. Zavalloni (1969). The group as a polarizer of attitudes. Journal of Personality
and Social Psychology 12(2), 125.
Ottaviani, M. and P. Sørensen (2001). Information aggregation in debate: who should speak first?
Journal of Public Economics 81(3), 393–421.
Piketty, T. (1999). The information-aggregation approach to political institutions. European
Economic Review 43(4), 791–800.
Posner, R. A. (2004). Judicial behavior and performance: An economic approach. Florida State
University Law Review 32, 1259.
Posner, R. A. (2010). How judges think. Harvard University Press.
Prather, L. J. and K. L. Middleton (2002). Are n+ 1 heads better than one?: The case of mutual fund
managers. Journal of Economic Behavior & Organization 47(1), 103–120.
Rockenbach, B., A. Sadrieh, and B. Mathauschek (2007). Teams take the better risks. Journal of
Economic Behavior & Organization 63(3), 412–422.
Stoner, J. A. (1968). Risky and cautious shifts in group decisions: The influence of widely held
values. Journal of Experimental Social Psychology 4(4), 442–459.
Stoner, J. A. F. (1961). A comparison of individual and group decisions involving risk. Ph. D. thesis,
Massachusetts Institute of Technology.
6 Appendix
(a) Awards in Customer Cases (b) All Cases Filed
Figure A1: Number of Cases by Filing Year
Notes: Figure (a), on the left-hand side, presents the number of cases filed with FINRA arbitration every year between 2006 and 2011. The source for these numbers is statistics posted by FINRA on its website, https://www.finra.org/. Figure (b), on the right-hand side, shows the number of customer cases in the full sample collected for this study, by filing year. Due to the retrospective nature of the data, only cases that were closed by the end of 2012 are counted, and this probably explains the relatively small number of cases in 2011.