Individual vs Group Decision-Making:

Evidence from a Natural Experiment in Arbitration

Proceedings

Naomi Gershoni*

PRELIMINARY & INCOMPLETE

May 1, 2018

Abstract

The importance of understanding the systematic differences between group and individual

decisions has been well-recognized in the economic literature. However, the vast majority of

empirical evidence on this issue is derived from laboratory experiments, and hence does not

reflect professional incentives and career concerns, both of which may play a crucial role. This

paper uses an original data set from the Financial Industry Regulatory Authority’s arbitration

awards as well as a unique regulatory change that exogenously decreased the number of pre-

siding arbitrators from three to one for a specific class of cases. A difference-in-differences

strategy is implemented to identify the impact of the number of decision-makers on the dis-

tribution of outcomes. The findings indicate a tendency of sole arbitrators to render moderate

awards and to facilitate more settlements, whereas groups make more extreme “all or nothing”

decisions. The paper discusses potential underlying mechanisms and offers evidence to support

a novel explanation of the findings, based on the notion that concerns about the adverse

effects of extreme decisions on arbitrators’ reputation are mitigated within panels.

1 Introduction

The importance of understanding the systematic differences between group and individual deci-

sions has been well recognized in the economic literature. Economic decisions in a wide array of

issues and settings are made by groups. Some examples are business decisions made by boards,

*Ben Gurion University of the Negev

policy recommendations by committees, household choices, and judicial panels or jury decisions.

While common wisdom suggests that “two heads are better than one”, the theoretical and experi-

mental literature has long acknowledged that differences between group and individual decisions

may be driven by various psychological and economic mechanisms. In addition, such differences

heavily depend on context and institutions, as well as on the criteria by which the decision is eval-

uated (see e.g. Charness and Sutter, 2012; Goeree and Yariv, 2011; Eliaz et al., 2006; Cooper and

Kagel, 2005). In light of the numerous competing predictions, the scarcity of empirical evidence

based on “real world” observational data is striking, though not surprising. The formation of

groups, as well as the type of decisions they make, are typically endogenous and hence evidence

often points to correlations rather than causal effects. Empirical evidence that comes from laboratory

experiments does not reflect professional incentives and career concerns, which may play a

crucial role.

This paper aims to identify the causal effects of groups on outcomes, utilizing an original data

set of FINRA (Financial Industry Regulatory Authority) arbitration awards and a unique regula-

tory change that exogenously decreased the number of arbitrators deciding on a case from three

to one, for a specific class of cases. A difference-in-differences strategy is applied to identify the

impact of the number of decision-makers on the distribution of outcomes. The findings indicate

a tendency of groups towards more extreme, “all or nothing”, decisions, while sole arbitrators

render more moderate awards. In addition, sole arbitrators appear to facilitate more settlements.

Furthermore, suggestive evidence points towards a specific mechanism that drives these differences.

Deriving a specific hypothesis based on previous studies is impossible, as multiple channels

and mechanisms have been suggested, leading to contradictory results. While some studies

emphasize the moderating effect of groups on potentially extreme individual opinions, others present

the possibility that groups evoke polarization of individual views and motivate manipulation of

outcomes.

The well-known Condorcet Jury Theorem predicts that majorities are more likely than any

single individual to select the best out of two alternatives, based on the assumption that informa-

tion aggregation is efficient.1 In addition, it was claimed that groups are more predictable and

less extreme due to a ‘compromise effect’ – individual members’ opinions averaging out in order

to reach consensus (Adams and Ferreira, 2010). Even when consensus is not required, studies

document a tendency towards unanimity or ‘dissent aversion’, especially when members of the

group interact repeatedly, e.g. panels of judges (Posner, 2010). Moreover, Adams and Ferreira

1 A long line of subsequent literature altered and relaxed the underlying assumptions and derived a variety of results (see Piketty (1999) for a survey of papers in this line of literature).

(2010) predict that the ‘membership effect’ – groups choosing to exclude individuals with extreme

points of view or such individuals voluntarily leaving groups – will amplify the moderating effect

of groups. Since usually arbitrators are chosen by the parties of the dispute, a similar “selection”

effect is expected (although not necessarily moderating). As to the quality of decisions, experi-

mental studies repeatedly find group decisions to be more consistent with rationality (Kocher and

Sutter, 2005; Baillon et al., 2016; Bornstein and Yaniv, 1998; Rockenbach et al., 2007; Blinder and

Morgan, 2005).

However, others cast doubt on the wisdom and moderation of groups. Social psychology

points to several cognitive biases which may lead to the opposite outcome. The phenomenon

known as group polarization has been repeatedly documented in lab experiments and is supported

by a variety of theoretical models (Glaeser and Sunstein, 2009). Two types of potential choice shifts

of groups, either “risky” or “cautious”, were demonstrated under divergent circumstances and

shown to depend on group members’ individual predispositions and the extent of deliberation

(see e.g. Stoner, 1968, 1961; Moscovici and Zavalloni, 1969; Kerr et al., 1996). An even more alarm-

ing view of group behavior was first described by Janis (1982), who coined the term “groupthink”

to describe the tendency of groups to apply self-deception in an attempt to conform with each

other and avoid conflict.2

Ottaviani and Sørensen (2001) stress a different aspect of group decisions: reputation concerns

lead to flaws in information aggregation. They predict that when committee members

present their opinions sequentially, they will end up adopting relatively extreme views, a

phenomenon known as ‘herding’. Levy (2007) analyzes decisions made by committees whose members

are motivated by similar concerns for reputation and focuses on the interaction of these concerns

with the transparency of the decision process. In this setting, transparency promotes

decisions that go against members’ preexisting bias.

Although reputation concerns are dominant in arbitration, the environment is more likely to

generate “adverse reputation effects”, i.e. concerns for “bad” reputation as biased since the accu-

racy of decisions cannot be evaluated (Klement and Neeman, 2013).3 As a result, many researchers

hypothesized that arbitrators tend to simply “split the difference” between the parties to a dispute

and avoid extreme decisions (Posner, 2004). Since arbitrators’ deliberations are secretive, relative

to sole arbitrators, the group setting reduces transparency and mitigates concerns for reputation.

Hence, panels are expected to give rise to more extreme decisions.

Comparing decisions of sole arbitrators to those of panels to establish causal effects requires

2 Following this idea, Benabou (2013) presents a theoretical model of “collective denial and willful blindness in groups, organizations, and markets”.

3 More generally, Morris (2001) and Ely and Valimaki (2003) predict “adverse reputation effects” when only the actions of agents are observed and the truth remains unknown.

a setting where the assignment of specific cases to one option or the other is random or as good

as random. In reality, the option is assigned either by an initial agreement between the parties to

a dispute or according to a specific rule that assigns different types of cases to different tribunals.

Therefore, the results of a “naive” comparison between the decisions taken by different forums

may actually reflect the underlying difference in the characteristics of the dispute and not the

behavior of arbitrators.

To overcome this challenge, I use a unique data set of FINRA arbitration awards and a change

in the FINRA regulations that exogenously impacted the number of arbitrators assigned to some

cases. The study focuses on claims of investors (customers) against brokerage firms. In March

2009, FINRA raised the threshold for the assignment of cases to three arbitrators (rather than one)

from $50,000 to $100,000. This change in regulation serves as a “natural experiment”: for the group

of cases between the two thresholds, old and new, it allows me to test for differences between

awards given by a panel of three arbitrators and awards given by a sole arbitrator, depending on

filing date. An important feature of this research design is that the rule change is not related to

the hypothesis or outcomes of interest. Rather, it resulted from an unusual load of cases during

the year 2009, probably following the financial crisis that peaked in the preceding year. While

allowing me to consider the rule change as an exogenous shock to tribunal size, this chain of events

raises concerns that any observed pattern will be a result of changes provoked by the financial

crisis. More generally, other factors may have caused some changes over time or awards may

simply exhibit a time trend. To address these potential obstacles, I use a difference-in-differences

framework where the group of cases that were not affected by the change in regulations serves as

a control group. I present elaborate balance tests and a variety of robustness tests to confirm that

this setting allows the identification of the true causal effect of the size of the arbitral tribunal on

arbitration outcomes.

The main outcome considered is the award rate, i.e. the amount of awarded damages relative

to the relief initially requested by the claimant. The results show that the distribution of awards

significantly changes for the treatment group in a way that is not exhibited by the control group

of cases. Specifically, when compared to sole arbitrators, panels of three tend to entirely dismiss

more claims but also award claimants with very high damages when their claims are accepted. In

addition, cases assigned to sole arbitrators are more likely to be settled. These results suggest that

groups tend to be more extreme, although no inference can be made regarding the quality of the

decisions.

Three main mechanisms can explain these findings. First, the selection of arbitrators into pan-

els may be different and hence the difference in outcomes could be due to the difference in arbitra-

tors’ characteristics. I rule out this option using an extended sample of cases and adding arbitrator

fixed effects to the estimated model. The two additional channels that may drive the portrayed

differences between individuals and groups are either the general group polarization phenomenon

or the rational choices of moderate awards by sole arbitrators who are concerned for their repu-

tation. I present evidence on cases that do not involve reputation considerations to establish that

the second explanation is more likely.

In addition to the general contribution of this paper to the analysis of group decision-making,

the findings have policy implications in the specific context of arbitration proceedings. As arbi-

tration is becoming an increasingly widespread alternative to courts, questioning the quality of

arbitrators’ decisions, especially their fairness and non-biasedness, becomes essential. Contrary

to courts, the number of arbitrators assigned to a case is usually determined by the parties to the

dispute. The common practice is to increase the number of arbitrators as the stakes of the case

increase. Choosing more arbitrators imposes higher costs of proceedings; however, the benefit

of this choice is not clear, as none of the (very few) empirical studies that tackle the difference

between group and individual decisions address arbitration decisions.4 On the other hand, the-

oretical and empirical studies of the determinants of arbitrators’ decision-making have always

focused either on decisions of single arbitrators or on decisions of panels, but the two have never

been compared.5

The paper proceeds as follows. In section 2 the FINRA Arbitration system is reviewed in

detail. Next, I describe the unique data set and the empirical strategy. Results and robustness

tests are presented in section 3. Section 4 tests the plausibility of the alternative mechanisms and

presents evidence in support of the “reputation channel”. Section 5 concludes.

2 Background on FINRA Arbitration

FINRA is a private corporation that acts as a self-regulatory organization for the financial industry

by authorization of Congress. The members of the corporation are securities firms and brokerages

that operate in the US. Its main operation is the formulation and enforcement of rules that

are aimed to ensure fair and honest conduct of its members. In addition, FINRA runs the largest

dispute resolution forum in the US securities industry, offering extensive arbitration and media-

tion services. Arbitration cases are divided into two types: customer cases, where the claimant

4 The vast majority focus on investment decisions and offer mixed evidence for the association between groups and outcomes (Barber et al., 2003; Barber and Odean, 2000; Bone et al., 1999; Prather and Middleton, 2002). Adams and Ferreira (2010) compare the distribution of bets placed by individuals and groups on ice breakup dates and find groups to be more moderate and to conform more to historic data. However, the setting of a specific wagering game is significantly different from other economic and judicial decisions.

5 See for example: Choi et al. (2010). The only exception to this is Marselli et al. (2013), who consider the association between the number of arbitrators and the probability of a dispute resolving in settlement.

is a customer and the respondent is a member, and industry cases, where both parties are either

members or other entities who are involved in the industry, such as employees or suppliers of

member firms.

In customer cases, which are the focus of this study, the arbitration process starts when an

investor files a statement of claim against a brokerage firm. Essentially all agreements between

brokerage firms and their customers include a clause which obliges them to turn to FINRA ar-

bitration in case of a dispute. Therefore, FINRA customer cases are considered to be mandatory

arbitration proceedings, meaning that customers cannot choose to file their claim in court or in any

other dispute resolution forum.6 FINRA invests considerable effort in sustaining the image of its arbitration

proceedings and arbitrators as fair and objective, since the forum is often suspected of bias

in favor of its own members. Nevertheless, both the FINRA forum and individual arbitrators

are regularly accused of bias by both sides, members and customers.7

After filing the claim, the arbitrators’ selection process begins. The number of arbitrators ap-

pointed to a case depends on the amount of relief requested in the statement of claim. FINRA

regulations set a threshold amount such that cases with a requested relief equal to or below it are

heard by a sole arbitrator and cases above the threshold are heard by a panel of three arbitrators.8

However, parties may agree to stray from this rule.9 In addition, the regulations determine the

types of arbitrators that should be assigned to the case. Each arbitrator in the FINRA roster is

categorized as either “Public” or “Non-Public” according to the extent of his experience and in-

volvement in the financial industry (e.g. as employee or attorney of a brokerage firm).10 Only

public arbitrators that meet certain requirements of experience and education can serve as chair-

persons.11 In cases where only one arbitrator is appointed, he must be a public arbitrator qualified

6 In some cases investors turned to the courts to be released from the obligation to use arbitration proceedings. These appeals were mostly denied by the Supreme Court, even though courts and legislatures often expressed the opinion that mandatory arbitration clauses are unfair or might violate the basic principle of the justice system, that everyone is entitled to have their “day in court” (see Shearson/American Express Inc. v. McMahon (482 US 220 [1987]), Rodriguez de Quijas v. Shearson/American Express, Inc. (490 US 477 [1989])).

7 Securities blogs and law firm websites contain many statements indicating that FINRA is biased in favor of its own members. One example is “There is a wide-spread perception that the arbitration system is stacked against customers. Some people have expressed concern that many arbitrators are reluctant to mete out large awards against big brokerage firms or to issue strong discovery rulings.” (http://www.sbpllplaw.com/an-outline-of-the-finra-arbitration-process-for-customer-broker-disputes/). Others express exactly the opposite opinion, referring to FINRA as “customer-friendly” and suggesting that FINRA’s own members are interested in taking disputes with customers to other judicial forums (http://www.pepperlaw.com/publications/second-circuit-makes-it-easier-to-avoid-finra-arbitration-2015-04-02/). However, the latter view is much less prevalent.

8 It should be noted that the threshold was raised during the studied period and this change in regulations is used as the identification strategy (see section 3.2).

9 Before the change in regulations, cases that were assigned to a sole arbitrator could be heard by a panel of three by request of the claimant (without requiring the consent of the respondent).

10 Non-public arbitrators are often referred to as “industry arbitrators”. The exact definitions are listed in rules 12100(p) & (u) of FINRA’s Code of Arbitration Procedure for Customer Disputes.

to serve as chairperson. In cases with three arbitrators, at least two arbitrators are public arbitrators

and at least one must be eligible for chairperson. In most cases the panel of three arbitrators

is a “Majority Public Panel”, i.e. it consists of two public arbitrators and one non-public. This was

the only option for the composition of the panel up to 2011, when the rules changed. Since January

2011, the claimant can choose to have an “All Public Panel”, i.e. appoint only public arbitrators to

the panel.12

FINRA uses the “veto-rank” method for the selection of arbitrators. Each party receives a list of

randomly selected arbitrators and is asked to veto some and rank the others. The arbitrators with

the highest combined scores are appointed. In cases where one arbitrator should be nominated,

parties get one list of ten potential arbitrators. They can veto four out of ten and then they rank

the remaining six. When three arbitrators are required, three different lists of ten arbitrators each

are provided and the same process is applied to each list. Therefore the probability for each party

to get their favorite arbitrators appointed does not depend on the total number of arbitrators

assigned to the case.13

The FINRA guidelines to arbitrators specifically address the question of how a decision should

be made by a panel. The arbitrator’s guide offers the following protocol:

After each panel member has reviewed the case information, the chairperson should

begin the discussion by asking the panel members to express their individual views of

the case. All arbitrators must take part in deliberating the facts and issues, and their

observations and opinions should be heard, acknowledged and considered.

In addition, rule 12410 of the FINRA Code states that all rulings and determinations by the panel

shall be by a majority of the arbitrators. Note that while a decision requires at least two panel

members to agree on liability, the reasons behind their determinations may differ.

Although unanimity is not required, the number of cases where it is recorded that one arbitrator

opposed the majority opinion is negligible.14 This is in line with the findings of many studies

that document and rationalize the phenomenon referred to as “dissent aversion” among judges.15

In most cases this tendency is driven by considerations of collegiality.

11 To qualify as chairperson an arbitrator must also complete FINRA’s chairperson training. The complete requirements for chairperson eligibility are listed in rule 12400(c) of FINRA’s Code of Arbitration Procedure for Customer Disputes.

12 Between 2008 and 2011, there was a pilot program that offered the choice of an “All Public Panel” in selected cases. In my main sample only 46 cases were actually heard by an “All Public Panel”.

13 The only exception is in the case of an “All Public Panel” where the parties get only two lists of 10 and the selection process is slightly different. However, as mentioned above, such cases are negligible in our sample.

14 Out of 2575 panel decisions in the full sample only 78 cases record a dissenting arbitrator. In some of these cases, it is noted that the dissent only regards a specific detail or a technical component of the award, e.g. expungement of records or distribution of costs between the parties.

15 See for example Epstein et al. (2011).

3 Empirical Analysis of FINRA Arbitration Awards

3.1 Data and Sample

The data was gathered from FINRA arbitration awards, which are publicly available on-line. The

details that were recorded for each case include filing and award dates, number of claimants and

respondents, parties’ representation, causes of claim (or controversy type), type of securities in-

volved in the claim, number of arbitrators and type of forum. The filing of a counter claim or third

party involvement in the proceedings are also recorded.16

The complete database consists of approximately 4000 awards for cases that were filed between

2006 and 2011. I focus on customer cases since specific rules and guidelines apply to

this type of case. In addition, these cases repeatedly confront parties that belong to two distinct

groups with opposing interests, i.e. customers vs. firms.

Since the data is retrospective, in the sense that we only observe a case once the award is

posted, censoring may affect the results. To avoid such a distortion, I use survival analysis and

only consider cases where the award was given within a limited period of time since the case

was filed.17 It is also important to note that aggregate statistics posted by FINRA suggest that

approximately 80% of cases that are filed are closed before the hearing stage, and hence no award

is issued.18 This does not pose a major concern for the purpose of the analysis, since in such

cases the effect of the arbitrators on the outcome is negligible. For the rest of the cases an award is

issued and posted. In cases that are resolved in settlement at or after the hearing stage, an award is

issued declaring that the parties settled, but the details of the settlement are usually kept secret.19

Obviously these outcomes should be included in the analysis, since arbitrators’ behavior during

the hearing could influence the probability of settlement.

In the main analysis, I use a sub-sample of customer cases that only includes cases with a relief

requested up to $250,000, as explained in further detail in the next section.

16 The data was collected with the assistance of a group of law and economics undergraduate students. They were given elaborate written and oral instructions to ensure uniformity and consistency in the coding of the awards’ substance. In the course of collecting the data, students’ questions were asked and answered in an open forum for the sake of clarity and coherence. In addition, records were double checked to avoid any mistakes.

17 I elaborate on the choice of this period in section 3.2.

18 The reasons for closing a case at such a preliminary stage could be a direct settlement between the parties, bankruptcy of a critical party, forum denying, and others.

19 Such awards regularly decide on the issue of expungement, i.e. whether or not the claim will be erased from the member’s records, and on the fees that should be paid by the parties for the arbitration proceedings.

3.2 Empirical Strategy

Comparing the awards given by single arbitrators to those given by panels is not an easy task

since in most cases different types of cases are assigned to different numbers of arbitrators. There-

fore, such a comparison might actually reflect differences in unobserved characteristics of cases

and parties. FINRA’s rules distinguish the types of cases which are assigned to three arbitrators rather than

one, by setting a threshold based on the relief requested by the claimant. Up to the

end of March 2009 the threshold for a single arbitrator was $50,000 and cases above this amount

were assigned to panels of three. Figure 1 presents a ‘naive’ comparison of the distribution of

awards granted by panels of three versus sole arbitrators, for this period (which in our main analysis

serves as the “pre-period”). Awards are measured as rates, i.e. the fraction of the monetary

relief initially requested by the claimants that was approved by the arbitrators in the final award.

Accordingly, the minimal award equals zero, while the maximum is set to one, although it could

potentially be higher if for example the arbitrators ordered the respondent to pay punitive damages.20

It appears that there is some difference between the two distributions. Cases that were

heard by sole arbitrators seem to have slightly more “extreme” awards, which grant plaintiffs

all or nothing, compared to cases with three arbitrators. However, this difference cannot

be directly attributed to the different number of arbitrators, since these cases have many other

observed and unobserved qualities which separate them.
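As an illustration only, the award-rate measure described above could be computed along the following lines. The function names, the cap argument, and in particular the 0.05 tolerance used to label an award as "dismissed" or "full" are assumptions made for this sketch; the paper does not specify these cutoffs here.

```python
def award_rate(awarded, requested, cap=1.0):
    """Share of the requested compensatory relief that was awarded.

    The rate is bounded below by zero and capped at `cap` (one in the text),
    so awards that include punitive damages do not exceed the maximum.
    """
    if requested <= 0:
        raise ValueError("requested relief must be positive")
    return min(max(awarded / requested, 0.0), cap)


def outcome_bin(rate, tol=0.05):
    """Label a rate as 'dismissed', 'full' or 'partial'.

    `tol` is a hypothetical tolerance chosen for illustration only.
    """
    if rate <= tol:
        return "dismissed"
    if rate >= 1.0 - tol:
        return "full"
    return "partial"


if __name__ == "__main__":
    # Made-up amounts, not taken from the FINRA data.
    for awarded in (0, 30_000, 90_000):
        r = award_rate(awarded, 60_000)
        print(awarded, r, outcome_bin(r))
```

Indicators built from such bins are one possible way to describe how the mass of the distribution moves between the tails and the middle.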

In 2009 the threshold for the assignment of three arbitrators was raised to $100,000.21 Press

coverage and security bloggers suggested that the change was required due to an unusual load

of new claims, which was probably the result of the economic crisis evoked by the collapse of

financial markets.22 This offers a unique opportunity to compare arbitration awards in cases where

the relief requested is between the old and the new threshold, i.e. $50,000 to $100,000. For this group

of cases, if a claim was filed before March 30, 2009, the rules assigned a panel of three arbitrators

to hear the case. However, starting on that date, only one arbitrator is assigned to each case in this

same category. The nature and timing of the described change in FINRA regulations is used as a

“natural experiment”, where the assignment of arbitral tribunal to cases in the affected group is as

good as random. Nevertheless, many other obstacles may stand in the way of correctly identifying

the effect of interest.

20 In many cases the award rate exceeds 1 since the claimant is awarded punitive or exemplary damages. The relief requested specifies only the compensatory damages and this is the amount that dictates the choice of forum, which is our main interest. In addition, in most cases the amount of punitive damages requested by the claimant is not specified, so it is impossible to use a consistent measure of total damages requested.

21 See FINRA regulatory notice 09-13 “Threshold for Single Arbitrator Cases”.

22 Indeed 2009 stands out with an unusually large number of filed claims both in FINRA aggregate statistics and in my sample, as presented in Appendix figure A1.

Figure 1: Distribution of Award Rate Pre & Post Change (All Years)

The distribution of awards before and after the change in regulations for the affected group of

cases is presented in figure 2. This comparison clearly suggests that there was a substantial change

in the distribution of awards and that the difference between panels’ and sole arbitrators’ decisions

is exactly the opposite of the one suggested by figure 1 (where selection is not controlled for).

The fraction of cases where claims are denied entirely or where the award rate is very close to zero

decreased significantly after the number of arbitrators was reduced from three to one. The same is

true regarding awards that grant claimants with approximately the same amount of damages they

requested or more. On the other hand, awards that tend to “split the difference”, i.e. an award rate

close to 0.5, are more frequent under the new rule, that assigns only one arbitrator to each case.

These findings could support the theoretical prediction that, when grouped together, arbitrators

are less concerned with reputation and therefore not as reluctant as sole arbitrators to give awards

closer to the tails of the distribution and thereby appear biased, but they could also be driven by the

more general phenomenon of group polarization.

Although figure 2 exhibits substantial changes in the distribution of award rates, one should

suspect that similar changes occur for all classes of cases and not just for cases that were affected

by the threshold change. This could be the result of a time trend or some other unobserved het-

erogeneity of the pre and post periods. Therefore, to identify the pure effect of the change in

the arbitral tribunal on arbitration results, I use a difference-in-differences (DD) approach. There

are two potential comparison groups: cases where the relief requested by the claimant is below

$50,000, which were assigned to sole arbitrators throughout the entire period and cases with a

relief requested above $100,000, which were assigned to panels of three arbitrators all along. The

Figure 2: Distribution of Award Rate Pre & Post Change (All Years)

second group is limited to cases where the relief requested does not exceed $250,000. Using a

wider range would make the comparability of the groups questionable and raise substantial con-

cerns for unobserved differences within this group.23 I use both groups combined in the DD set-

ting, but control for each category of cases separately to allow for variation in the average award

rate across groups. I also control for cases where the relief requested is equal to or below $25,000

since for such cases the “simplified arbitration” procedure applies.
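The grouping of cases described above can be sketched in a few lines. This is a hedged illustration, not the code used for the paper: the constant and function names are hypothetical, and the handling of cases exactly at a threshold follows my reading of the rule that a relief equal to or below the threshold goes to a sole arbitrator.

```python
from datetime import date

# Values taken from the text; the names are illustrative assumptions.
RULE_CHANGE = date(2009, 3, 30)   # claims filed from this date fall under the new rule
OLD_THRESHOLD = 50_000
NEW_THRESHOLD = 100_000
SAMPLE_CAP = 250_000              # upper bound on relief in the main sample
SIMPLIFIED_CAP = 25_000           # "simplified arbitration" applies at or below this


def expected_arbitrators(relief, filed):
    """Number of arbitrators the rules assign, ignoring party agreements."""
    threshold = NEW_THRESHOLD if filed >= RULE_CHANGE else OLD_THRESHOLD
    return 1 if relief <= threshold else 3


def dd_cell(relief, filed):
    """Difference-in-differences indicators for one case, or None if out of sample."""
    if relief > SAMPLE_CAP:
        return None
    return {
        "treated": int(OLD_THRESHOLD < relief <= NEW_THRESHOLD),
        "post": int(filed >= RULE_CHANGE),
        "simplified": int(relief <= SIMPLIFIED_CAP),
        "expected_arbitrators": expected_arbitrators(relief, filed),
    }


if __name__ == "__main__":
    # Made-up cases for illustration.
    print(dd_cell(75_000, date(2009, 1, 15)))
    print(dd_cell(75_000, date(2009, 6, 1)))
```

Only cases in the treated band change their expected tribunal across the two periods; the two comparison groups keep one and three arbitrators respectively throughout.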

With all groups, the issue of compliance should be addressed, since as mentioned above, par-

ties can agree to change the number of arbitrators. It should also be noted, that the rule regarding

switching from one to three arbitrators was slightly changed at the same time that the threshold

for panels was raised. Prior to the change, a request by the claimant was enough to make the

switch, whereas after the change the consent of the respondent was required as well. This change

made it more difficult to switch the number of arbitrators from one to three and hence increased

compliance from approximately 70% to approximately 95%. In practice, it only affected cases in

the first comparison group, where the rules set the number of arbitrators to one throughout the

entire period. Compliance with the three-arbitrator default rule was also approximately 95%, except for a slightly lower rate in the treatment group after the change (somewhat lower compliance with a new rule seems reasonable).

In the main part of the paper, I compare cases by the expected number of arbitrators (according to the rules) to avoid bias caused by parties’ self selection of the tribunal.24 Hence, results should be thought of as intention to treat effects (ITT), rather than actual treatment effects. In section 3.5, I use an instrumental variable analysis to derive the treatment effect on the compliers. An indicator for the rule change interacted with being in the treatment group serves as an instrument for forum size. This is a valid instrument since it is highly correlated with the number of arbitrators actually assigned to a case and it affects the outcome only through the endogenous treatment variable.

23 The maximal relief requested for cases in the full sample is 1.25 billion dollars.

24 Conducting the same analysis dropping the cases where the actual tribunal is not the expected one does not affect results significantly. However, since compliance cannot be considered random, such results could be biased and cannot

The main specification for estimation is (i is an index for a specific case):

$$Y_i = \alpha + \beta_0\,\mathrm{Treated}_i + \beta_1\,\mathrm{Post}_i + \beta_2\,(\mathrm{Treated} \times \mathrm{Post})_i + \gamma' X_i + \psi_i$$

Where Y is the outcome of interest. The main outcomes that are considered are a set of indi-

cators for whether the award rate (AR), i.e. the ratio between the awarded damages and the relief

initially requested in the statement of claim, equals a specific value or lies within a specified range.

The aim of this is to understand if and how the distribution of award rates changed. Settlements

are coded as an award rate equal to 0.5, on the assumption that the arbitrator’s reputation is not affected by this outcome.25 I also use an indicator for whether or not the parties agreed to settle the case as an outcome. This allows me to measure the impact on the settlement rate separately, rather than just consider settlements as moderate awards granted by the arbitrators.
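The outcome coding described above can be sketched as follows. This is only an illustration under stated assumptions: the function names and dictionary keys are hypothetical, the bin edges follow the column titles of Table 3, and the top bin is treated as 0.8 < AR < 1 so that it does not overlap with the AR = 1 indicator (which, following the text, also absorbs rates above one).

```python
def award_rate(awarded, requested, settled):
    """Award rate (AR): awarded damages divided by relief requested.

    Settlements are coded as 0.5, following the convention in the text.
    """
    if settled:
        return 0.5
    return awarded / requested


def outcome_indicators(ar, settled):
    """One binary outcome per AR value or range, plus a settlement indicator.

    Bin edges are an assumption based on the column titles of Table 3.
    """
    return {
        "AR=0": int(ar == 0),
        "0<AR<=.2": int(0 < ar <= 0.2),
        ".2<AR<=.4": int(0.2 < ar <= 0.4),
        ".4<AR<=.6": int(0.4 < ar <= 0.6),
        ".6<AR<=.8": int(0.6 < ar <= 0.8),
        ".8<AR<1": int(0.8 < ar < 1),
        "AR>=1": int(ar >= 1),
        "Settled": int(settled),
    }


# Hypothetical cases: a settlement, a fully denied claim, a fully granted claim.
settled_case = outcome_indicators(award_rate(0, 80_000, True), True)
denied_case = outcome_indicators(award_rate(0, 75_000, False), False)
granted_case = outcome_indicators(award_rate(75_000, 75_000, False), False)
```

Each indicator can then serve as the dependent variable in a separate regression, so that the set of coefficients traces out how the distribution of award rates shifts.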

X is a vector of controls, which can be divided into two groups. The first consists of variables

that refer to the number of claimants and respondents, whether or not they are represented by an attorney, and whether or not the respondent entered an appearance. The second includes a large

set of dummy variables that describe the causes of action and the type of securities involved in

the dispute. A complete list of these variables is presented in table 1 and the notes that follow.

In addition, since the comparison group actually consists of three distinct groups, I allow for a

group-specific effect.

Treated is an indicator for the case being in the affected range of relief requested (i.e. $50,000 to $100,000) and Post indicates whether the case was filed after the change of rules came into effect. Treated × Post is the interaction between the two indicators and is the explanatory variable of interest in this setting. Therefore, β2 is the difference in outcome that is attributed to the assignment of a case to a single arbitrator rather than to a panel of three arbitrators.
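To make the interpretation of β2 concrete, the following minimal sketch computes the difference-in-differences estimate from the four group-by-period means. It is an illustration only: it omits the controls X and the year fixed effects used in the paper (without covariates, the OLS coefficient on the interaction equals this difference of differences), the function name is hypothetical, and the toy data are invented for the example and are not taken from the FINRA sample.

```python
from statistics import mean


def dd_estimate(cases):
    """Difference-in-differences from (treated, post, y) triples.

    Returns (treated post - treated pre) - (control post - control pre),
    which equals beta_2 in the specification without covariates.
    """
    cells = {(t, p): [] for t in (0, 1) for p in (0, 1)}
    for treated, post, y in cases:
        cells[(treated, post)].append(y)
    m = {key: mean(values) for key, values in cells.items()}
    return (m[(1, 1)] - m[(1, 0)]) - (m[(0, 1)] - m[(0, 0)])


# Invented toy data: y is an indicator such as "claim fully denied".
toy_cases = (
    [(1, 0, y) for y in (1, 1, 0, 0)]    # treated, pre:  mean 0.50
    + [(1, 1, y) for y in (1, 0, 0, 0)]  # treated, post: mean 0.25
    + [(0, 0, y) for y in (1, 0, 1, 0)]  # control, pre:  mean 0.50
    + [(0, 1, y) for y in (1, 0, 1, 0)]  # control, post: mean 0.50
)
toy_beta2 = dd_estimate(toy_cases)
```

In this toy example the outcome falls by 25 percentage points in the treated range while the control range is unchanged, so the whole change is attributed to the interaction term.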

To validate the use of this specification, the observed characteristics of the treatment and control groups are compared. Table 1 presents the mean of these characteristics for the three groups of interest, in the pre-change period. Then, two measures of the differences between treatment and control are estimated. First, I test for the significance of the difference in means between treatment and control. Second, I use the difference-in-differences specification with each covariate as an outcome. This allows me to evaluate the change in underlying characteristics following the new rule, and estimate whether the treatment group exhibits a different trend than the control groups. As we would expect if assignment to treatment was as good as random, the observed characteristics are very similar. A few significant or marginally significant differences exist, but no more than would be expected with such a long list of covariates. Nevertheless, I use specifications where the complete set of observable characteristics is controlled for.

24 (cont.) be interpreted effectively.

25 The average award rate for cases that did not resolve in settlement and where the claimant’s claims were not entirely dismissed is 0.56. If cases where the award was zero are also considered, this rate drops to 0.28. Coding the award rate as 0.3 instead of 0.5 has no significant effect on results.

One more concern is that parties can manipulate the number of arbitrators either by changing

the filing date (around the time that the new rule became effective) or by changing the relief re-

quested (around the threshold). If this is true then the estimates based on the proposed empirical

model will be biased. The change in rules was announced in February 2009, so the concern is that

claimants either rushed to file their claims before the rule came into effect or delayed filing till after

that date, hence actually selecting the number of arbitrators assigned to their case. If this was true,

we would expect to see some irregularity in the number of cases filed by month around the time

of the policy change. In fact the trends in number of new claims filed over the first 6 months of

2009 appear to be practically identical when the affected group of cases is compared to the control

group. Table 2 reports the proportion of cases that were filed during the quarter that preceded the

effective date of the new rule and the quarter that followed (out of all cases filed during 2009) by

treatment status. In addition, the difference in means between the groups is calculated for both the pre and post change periods and found to be very close to zero and statistically insignificant. To further establish that there is no manipulation of the filing date, figure 3 shows the number of cases

filed each day, for a period of four months, around the time the new rule was applied, for the

whole sample and for each sub-group separately. No significant irregularities are found during

this period.

To assess the possibility of manipulation around the threshold of relief requested, Figure 4

compares the distribution of relief requested in cases filed in the pre and post periods. There are

some notable differences that may suggest some extent of manipulation. Mainly, it seems like the

percentage of cases where the relief requested was the “old” threshold (50k) or slightly less de-

creased following the change and the number of cases with a relief requested at the “new” thresh-

old (100k) or slightly less increased. This may reflect claimants’ preference for the appointment of a single arbitrator. However, comparable and even larger differences in the frequency of

claims occur at 25k and 200k. These changes cannot be explained by this or any other change in

FINRA regulations. Moreover, figure 5 compares the kernel density of all cases filed pre-change


Table 1: Descriptive Statistics & Balance Tests for Treatment & Comparison Group, Pre-Change & Post-Pre Differences

                              Control    Treatment   Diff.       DID

A. Parties’ characteristics
  Number of claimants         1.595      1.588       0.007       -0.09
                                                     (0.108)     (0.134)
  Claimant represented        0.819      0.876       -0.057*     -0.068
                                                     (0.033)     (0.044)
  Respondent represented      0.984      0.969       0.014       0.038**
                                                     (0.013)     (0.017)

B. Case characteristics
  Counter claim               0.027      0.015       0.012       0.008
                                                     (0.013)     (0.018)
  Security types
    Options                   0.027      0.041       -0.014      -0.013
                                                     (0.016)     (0.021)
    Common stocks             0.252      0.247       0.005       -0.049
                                                     (0.039)     (0.049)
    Government bonds          0.019      0.01        0.009       0.004
                                                     (0.011)     (0.015)
  Controversy types
    Margin                    0.055      0.021       0.034*      0.022
                                                     (0.018)     (0.02)
    Violation of Rules        0.438      0.438       0           -0.057
                                                     (0.044)     (0.065)
    Excess trade              0.14       0.134       0.006       -0.031
                                                     (0.031)     (0.045)
    Failure to supervise      0.529      0.536       -0.007      0.036
                                                     (0.044)     (0.066)
    Negligence                0.603      0.675       -0.073      -0.021
                                                     (0.043)     (0.062)
    Fiduciary breach          0.759      0.722       0.037       0.09
                                                     (0.039)     (0.056)
    Unsuitability             0.438      0.428       0.011       0.014
                                                     (0.044)     (0.065)
    Misrepresentation         0.463      0.418       0.045       -0.004
                                                     (0.044)     (0.065)
    Fraud                     0.416      0.376       0.04        0.07
                                                     (0.044)     (0.064)
    Failure to execute        0.066      0.067       -0.001      0.02
                                                     (0.022)     (0.035)
    Deception                 0.025      0.031       -0.006      0.006
                                                     (0.014)     (0.027)

Notes: The table reports means and difference in means for selected characteristics of cases in the treatment and comparison groups. Columns (1) and (2) report the mean during the pre-change period for each group separately, and column (3) presents the difference between the groups in the pre period as well as the standard error for this difference. Column (4) reports the coefficient estimates for a DD estimation comparing the pre- and post periods for the treatment and control group. To avoid censoring, the duration of cases is limited as explained below in figure 6 and the text that follows. Standard deviations and robust standard errors are reported in parenthesis (*** p<0.01, ** p<0.05, * p<0.1).


Table 2: Percentage of Cases Filed Pre vs. Post Change by Treatment Status

                          Treatment   Control    Difference

Filed Pre (Jan-Mar)       0.257       0.265      0.008
                          (0.044)     (0.021)    (0.049)
Filed Post (Apr-Jun)      0.267       0.261      -0.006
                          (0.044)     (0.021)    (0.049)

Notes: The table reports the share of cases that were filed within a 3-month period before and after the change in rules became effective (on March 30, 2009) out of all cases filed during 2009. These shares are presented for the treatment and control groups separately and the difference between the groups is calculated in the last column. To avoid censoring, the duration of cases is limited as explained below in figure 6 and the text that follows. Standard deviations and robust standard errors are reported in parenthesis (*** p<0.01, ** p<0.05, * p<0.1).

Figure 3: Claim Filing Frequency Pre & Post Change (by Range of Relief Requested)


Figure 4: Distribution of Relief Requested 2008 vs. 2010

to those filed post change. Again, the most apparent differences do not occur around any of the

thresholds. Comparing the two distributions using the Kolmogorov-Smirnov test for the equality

of distributions, the null hypothesis that the samples are drawn from the same distribution cannot

be rejected (with a P-value of 0.42). It should also be noted that parties can agree to change the size

of the arbitral tribunal and moreover, prior to the change the consent of the respondent was not

required when the rule assigned a sole arbitrator to the case. This structure significantly reduces

incentives to alter the relief requested in the interest of choosing panels instead of singles or vice

versa. Nevertheless, in Table 4, I test the robustness of the main results to an alternative definition

of the treatment and control group, where the threshold amounts are attributed to these groups,

as though manipulation actually occurred.
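The Kolmogorov-Smirnov comparison mentioned above can be sketched in a few lines. This is a generic illustration rather than the paper's actual code: the function names are hypothetical, and the p-value uses the standard large-sample series approximation, which is an assumption that may differ from the routine used to obtain the reported P-value of 0.42.

```python
import math
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    for x in a + b:  # the supremum is attained at an observed value
        fa = bisect_right(a, x) / len(a)
        fb = bisect_right(b, x) / len(b)
        d = max(d, abs(fa - fb))
    return d


def ks_pvalue(d, n, m, terms=100):
    """Asymptotic p-value: 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 k^2 lambda^2)."""
    lam = math.sqrt(n * m / (n + m)) * d
    if lam < 0.2:  # the series converges slowly here; the p-value is ~1
        return 1.0
    s = sum((-1) ** (k - 1) * math.exp(-2 * k * k * lam * lam)
            for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2 * s))


# Invented relief-requested amounts (in $1,000), for illustration only.
pre = [10, 25, 50, 60, 100, 200]
post = [12, 25, 48, 75, 100, 180]
d_stat = ks_statistic(pre, post)
p_val = ks_pvalue(d_stat, len(pre), len(post))
```

A large p-value, as in the paper, means that the null hypothesis of a common distribution for the pre and post samples cannot be rejected.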

Finally, as mentioned in the previous section, data on awards is retrospective. To avoid censor-

ing, I drop cases for which the duration from filing to award exceeds a specific number of days,

which is calculated as the 90th percentile of the distribution of case duration for each category

of cases. Figure 6 presents the cumulative distribution of the duration for the treatment and two

control groups. The calculation of these distributions is based on cases filed only during 2006 and

2007. The awards in the sample were collected up to 2012, allowing a maximal duration of 7 years

for cases filed in the beginning of 2006, and 5 years to those filed very close to the end of 2007.

Assuming that the number of cases that take longer than 5 years to complete is practically zero,


Figure 5: Distribution of Relief Requested Pre & Post Change (All Years)

I treat the presented distribution as portraying all cases that eventually result in a written award.

The vertical line in each figure marks the 90th percentile, indicating that for the treatment group

90% of cases end within 744 days. For the first control group, which is characterized by lower

amounts requested, the duration of proceedings tends to be slightly shorter, and hence the same share of resolved cases is reached within 636 days. The opposite is true for the second comparison group, where deciding cases is likely to take longer and hence the maximal number

of days for cases in the main sample is set to be 776. To portray how this limitation may affect

the results, in the next section, a sensitivity analysis is conducted and boundaries to the estimates

are derived. Moreover, in table 4 I repeat the main analysis using the full sample without any

restriction and a more restricted sample that sets the limit at the 75th percentile for each group.
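The duration restriction can be sketched as follows, assuming a nearest-rank definition of the percentile (the paper does not state which percentile convention is used). Function and variable names are hypothetical, and the durations are invented; in the paper the cutoffs are computed from cases filed in 2006 and 2007 and come out at 636, 744 and 776 days for the three categories.

```python
def percentile_cutoff(durations, pct=90):
    """Nearest-rank percentile: smallest duration covering at least pct% of cases."""
    ordered = sorted(durations)
    rank = (pct * len(ordered) + 99) // 100  # integer ceiling of pct/100 * n
    return ordered[max(rank, 1) - 1]


def restrict_sample(cases, reference, pct=90):
    """Keep cases whose duration does not exceed their group's cutoff.

    cases, reference: lists of (group, duration_days) pairs. Cutoffs are
    computed from the reference cases only and then applied to all cases.
    """
    by_group = {}
    for group, days in reference:
        by_group.setdefault(group, []).append(days)
    cutoffs = {g: percentile_cutoff(ds, pct) for g, ds in by_group.items()}
    kept = [(g, d) for g, d in cases if d <= cutoffs[g]]
    return kept, cutoffs


# Invented durations (days) for a single hypothetical category of cases.
reference = [("50k-100k", d) for d in range(100, 1100, 100)]
sample = [("50k-100k", 450), ("50k-100k", 900), ("50k-100k", 950)]
kept, cutoffs = restrict_sample(sample, reference, pct=90)
```

Setting `pct=75` gives the more restrictive sample used in the robustness checks.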

It should be noted that the average case duration increased in the period after the change both

in the treated category of cases and in the control group. The point estimate of the change for

the treated group is 33 days and the 95% confidence interval is [10.7,55.8]. The point estimate for

the second control group is even larger, at 44 days, with a 95% confidence interval of [25.7,71.7].

This change might be attributed to the increase in caseload following the financial crisis, since

when 2009 is dropped from the sample, the change in mean duration for the treatment category

decreases and is no longer statistically significant.


Figure 6: Case Duration CDF (by Range of Relief Requested). Panels: (a) 0 < RR ≤ 50k; (b) 50k < RR ≤ 100k; (c) 100k < RR ≤ 250k.


3.3 Results

The theoretical prediction I aim to test relies on the idea that when arbitrators make decisions

within a panel, their decision in the specific case will only have a minor effect on their reputation.

The expected consequence would be that, when compared to sole arbitrators’ decisions, a larger

fraction of panel decisions are decisions that are perceived as extreme, i.e. far from the average

decision. This perception is not necessarily related to the accuracy or fairness of the decision under

the specific circumstances. Such a result can also be attributed to the more standard explanation

based on “group polarization”.

Table 3 summarizes the results of a series of DD regressions using indicators for specific values

(or value ranges) of the award rate as outcomes. Columns (1) and (7) present the estimated change

in “extreme” awards due to the switch from three arbitrators to a sole arbitrator. The coefficient

on the interaction term is negative and significant (only at the 10 percent level for AR=0), both with the basic set of covariates, in row (a), and with the full set of controls, in row (b). Both specifications include year fixed effects (in addition to the “post” indicator). The interpretation of this result is that sole arbitrators are less likely to award claimants with zero damages and also less likely to accept all the claims and fully compensate the claimants. At the same time, a positive and significant change occurs for moderate or “split the difference” awards, where the award rate is

between 0.4 and 0.6.

The magnitude of the effect is very similar regardless of the specification and suggests that

an individual arbitrator is 10 percentage points less likely to entirely deny a claim compared to a

panel of three arbitrators. Out of a baseline level of 51% of claims being denied, this constitutes

an approximately 20% decrease in respondents’ wins when a sole arbitrator is presiding. On the

other extreme, there is a 9 percentage point decrease in awards that grant all of the claimants’ claims. Strikingly, a shift from three arbitrators to one almost entirely eliminates such awards, as the decrease amounts to 85%.
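The conversion from percentage points to relative changes can be checked with simple arithmetic. The first line uses the AR=0 coefficient from Table 3 and the 51% baseline stated in the text. The second line backs out the baseline share of full awards implied by the AR=1 coefficient and the reported 85% decrease; that baseline is not reported in the text, so it should be read as a back-of-the-envelope inference only.

```latex
\[
\frac{0.103}{0.51} \approx 0.20
\qquad \text{(relative decrease in fully denied claims)}
\]
\[
\frac{0.093}{0.85} \approx 0.11
\qquad \text{(implied baseline share of awards with } AR = 1\text{)}
\]
```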

Moreover, column (8) shows a significant increase of 16 percentage points in the likelihood of

settlements. This result seems to be in line with the reputation channel suggested in the introduc-

tion, if one believes that arbitrators can effectively influence the probability of settlement. Since

settlements will not affect the arbitrator’s reputation, sole arbitrators will be more interested in settlements and will act accordingly. At the same time, such a result contradicts the well-known prediction that as the costs of trial increase, so does the probability of settlement (Priest & Klein

(1984), Bebchuk (1984)). According to this, since panels of three impose higher costs on the parties

when compared to a sole arbitrator, panels are expected to promote settlements. This result also

contradicts the one presented by Marselli et al. (2013). Their explanation for having more cases settled in the presence of three arbitrators rather than one is that in such cases results are more predictable, which fosters successful bargaining between the parties.26

In panel B, the same estimates are presented within arbitrator, as arbitrator fixed-effects are

added to the model. The effects that are estimated are very similar in magnitude, although sig-

nificance levels drop below conventional levels for some of the estimates. The sample is smaller

since singletons are dropped. These results are further discussed in section 4.

Putting together the different effects of the change reveals a tendency of panels to give more “extreme” awards when compared to singles and to promote fewer settlements. The strongest im-

pact seems to be on customer wins, implying that sole arbitrators are more reluctant to rule against

brokerage firms than the other way around.

3.4 Robustness of Results

3.4.1 Sensitivity Analysis

The results presented so far are only based on a selected sample of approximately 90% of all cases,

due to the retrospective nature of the data. As explained in the previous section, I only use awards

that are given within a specific number of days from filing. To see how this might affect results, I

conduct the following experiment. I use all the awards in the full sample for the pre period, since

from the time the last case was filed (time of the rule change) we have almost 4 years of data on

awards. This allows us to consider the sample for the pre period to be the full sample of cases

filed prior to the change. For the post period, I add artificial observations that follow the “worst

case” scenario for the prediction of interest. More specifically, I increase the control group by

10% and all new observations are assumed to show the pattern predicted for the treatment group,

i.e. all awards are exactly 0.5. The opposite is done for the treatment group, where half of the

added observations have an award rate equal to one and the other half equal to zero. Row (b) of

table 4 presents the estimated coefficient on the interaction term for the main outcomes of interest,

when the main specification is estimated using this extended sample. When compared to the main

results (presented again for convenience in row (a)), these should be thought of as the lower bound

estimates. Clearly, the point estimates exhibit the same trends although smaller magnitudes of

change (as expected). Significance levels are maintained, except for the coefficient in column (1),

which indicates the change in the likelihood of awards in favour of the respondent. Keeping in

mind that this scenario is extreme and highly unlikely, this experiment indicates that under

more realistic scenarios, even if the missing cases do not follow the same pattern, results hold both

26 In our sample panels actually seem to be less predictable and therefore their explanation may fit the results. However, it is difficult to justify an argument that relies on the fact that parties know and expect panels to be less predictable.


Table 3: The Effect of Number of Arbitrators on Award Rate Distribution & Settlements

      (1)         (2)         (3)         (4)         (5)         (6)         (7)         (8)
      AR=0        0<AR≤.2     .2<AR≤.4    .4<AR≤.6    .6<AR≤.8    .8<AR≤1     AR=1        Settled     Controls  Obs.

Panel A: The effect of the rule change (switching from 3 to 1)

(a)   -0.103*     -0.0193     0.0130      0.200***    0.00683     -0.0792***  -0.0930***  0.164***    NO        1921
      (0.0581)    (0.0288)    (0.0324)    (0.0467)    (0.0273)    (0.0305)    (0.0305)    (0.041)

(b)   -0.107*     -0.0210     0.0183      0.181***    0.00283     -0.00865    -0.0928***  0.160***    YES       1921
      (0.0586)    (0.0218)    (0.0236)    (0.0446)    (0.0215)    (0.0153)    (0.0308)    (0.0404)

Panel B: Including arbitrator FEs - the within arbitrator effect

(c)   -0.131      -0.0261     0.0102      0.187***    -0.0106     -0.00721    -0.0793     0.193***    NO        1257
      (0.0899)    (0.0470)    (0.0409)    (0.0720)    (0.0298)    (0.0495)    (0.0523)    (0.0639)

(d)   -0.131      -0.0208     0.0173      0.185***    -0.00551    -0.00710    -0.0762     0.192***    YES       1257
      (0.0930)    (0.0329)    (0.0312)    (0.0704)    (0.0241)    (0.0174)    (0.0538)    (0.0644)

Notes: The table reports DD estimates for the effects of changing the number of arbitrators from three to one on the probability of an award rate (AR) within the range specified in column titles and the probability of settlements. The ‘treatment’ group includes cases with a relief requested higher than 50k and lower or equal to 100k. The comparison group includes cases with a relief requested between 0 and 50k and cases with a relief requested higher than 100k and lower or equal to 250k. The sample is restricted to cases where the duration of proceedings is lower or equal to the 90th percentile of case duration in each group of cases. Panel A reports results for the full sample, whereas in Panel B arbitrator FEs are added and singletons are dropped from the sample. Specifications (a) and (c) control for filing year fixed effects. Specifications (b) and (d) also add case controls, including indicators for parties’ representation and counter claim, a set of indicators for type of securities and a set of indicators for the cause of claim or controversy type (for the full list of variables see table 1). Standard errors are clustered at the arbitrator level and reported in parenthesis (*** p<0.01, ** p<0.05, * p<0.1).


qualitatively and quantitatively.
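The construction of the "worst case" sample can be sketched as follows. The text specifies a 10% increase for the control group; applying the same 10% to the treatment group is an assumption made here for illustration, as are the function name and the invented award rates.

```python
def add_worst_case(post_control, post_treated, share=0.10):
    """Append artificial award rates that work against the main finding.

    Control group: added cases all have an award rate of 0.5 (moderate).
    Treatment group: added cases are split between 1.0 and 0.0 (extreme).
    The 10% share for the treatment group is an assumption.
    """
    n_control = round(share * len(post_control))
    n_treated = round(share * len(post_treated))
    ones = n_treated // 2
    control = post_control + [0.5] * n_control
    treated = post_treated + [1.0] * ones + [0.0] * (n_treated - ones)
    return control, treated


# Invented post-period award rates, for illustration only.
post_control = [0.0] * 10 + [0.5] * 6 + [1.0] * 4   # 20 cases
post_treated = [0.0] * 16 + [0.5] * 20 + [1.0] * 4  # 40 cases
aug_control, aug_treated = add_worst_case(post_control, post_treated)
```

Re-estimating the main specification on the augmented post-period sample then yields the lower-bound coefficients, since every added case pushes against the estimated shift toward moderate awards under a sole arbitrator.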

3.4.2 More Robustness Tests

Rows (c)-(f) in table 4 display an array of additional robustness tests, that reinforce the main find-

ings. First, in row (c), I restrict the sample to exclude all cases filed during 2009, the year of change.

The aim of this is twofold. First and foremost, it avoids the direct effects of the financial crisis. It

is reasonable to assume that most claims that relate to the events of the crisis were filed during

this year, especially since, as mentioned above, 2009 stands out with a very high number of claims

relative to other years. Second, if there was some manipulation of file dates around the time of

the rule change, it assures us that this does not drive the results. Results are remarkably similar

to the main findings both in sign and in magnitude, although the impact on claim denial and on

settlement is no longer statistically significant due to an increase in standard errors. However, this

could be a result of the 25% decrease in sample size.

In row (d) the treatment and control groups are redefined to account for possible manipulation

around the 50k and 100k cutoffs. In the main specification, cases with a relief requested equal to

50k are included in the control group both before and after the change. Assume that pre-change

claimants tended to choose this amount to make sure their case is heard by a sole arbitrator, even

though they actually intended to request a slightly higher amount. Then actually in the pre period

some of these cases should be in the over 50k group. However in the post period, this threshold

is no longer relevant and there is no basis to believe any manipulation occurs. In accordance

with this story, the 50k cases are moved from the control group into the treatment group in the

pre period (but not in the post period). The same idea is applied to the 100k threshold, where

the threshold cases are moved from the second control group to the treatment group in the pre

period (but not in the post period). This re-definition does not change our main conclusions as the

estimates remain practically identical.

Last, in rows (e) and (f) I test whether the results are sensitive to changes in the duration limi-

tation (aimed at avoiding censoring effects). Row (e) presents the results for a sample without any

restriction on case duration, while row (f) shows the estimates when a more stringent restriction

is applied, i.e. the number of days at the 75th percentile of the case duration distribution. For the

most part, the choice of maximal duration has no effect on results. The magnitude of the effect on award rates of zero does fluctuate between an insignificant 8 percentage point decrease and a significant 14 percentage point decrease, but the first estimate is based on the unlimited sample, in which censoring is not dealt with. For award rates equal to one (or more), the

estimated change remains very similar at 8 percentage points. Overall, the different robustness


Table 4: Robustness & Placebo Tests

                          (1)         (2)           (3)          (4)
                          AR=0        .4 < AR ≤ .6  AR=1         Settled     Observations

(a) Main                  -0.103*     0.200**       -0.093**     0.164**     1921
                          (0.059)     (0.048)       (0.030)      (0.044)

Sensitivity Analysis
(b) ‘Lower Bound’         -0.081      0.110**       -0.062**     0.096**     2213
                          (0.057)     (0.048)       (0.030)      (0.044)

Robustness Tests
(c) Excluding 2009        -0.134*     0.238***      -0.135***    0.206***    1395
                          (0.0685)    (0.0613)      (0.0374)     (0.0572)
(d) Moving Thresholds     -0.116*     0.165***      -0.0753**    0.116***    1921
                          (0.0601)    (0.0510)      (0.0299)     (0.0449)
(e) Unlimited Duration    -0.0863     0.184***      -0.0800**    0.152***    2107
                          (0.0554)    (0.0447)      (0.0315)     (0.0392)
(f) 75th pct Duration     -0.138**    0.188***      -0.0832**    0.168***    1587
                          (0.0617)    (0.0479)      (0.0339)     (0.0404)

Placebo Tests
(g) Only Pre Period       -0.0690     -0.00637      -0.000331    0.0417      1021
                          (0.0838)    (0.0571)      (0.0502)     (0.0410)
(h) Only Pre Period       -0.0626     -0.0171       -0.000742    0.0393      1021
                          (0.0842)    (0.0575)      (0.0509)     (0.0415)

Notes: The table reports DD estimates of various sensitivity, robustness and placebo tests, using the main outcomes of interest as defined in table 3. The sample is restricted to cases where the duration of proceedings is lower or equal to the 90th percentile of case duration in each group of cases, except where noted otherwise. The first row reports the main outcomes (as reported in table 3 Panel A) and serves as a benchmark. Row (b) reports the same estimates for a sample that includes all cases in the pre period, and cases with duration restricted as explained above for the post period. Fake observations that follow the ‘worst case’ scenario are added to the post period (i.e. award rate values go exactly the opposite direction from the findings in the first row). Hence, the second row actually reports the lower bound for the results in the first row. Row (c) presents the results for the same specifications using a sample that excludes all cases filed during 2009, which was the year when the number of claims peaked due to the financial crisis. Specification (d) shifts the threshold between the groups of cases to account for potential manipulation around the cutoffs. Specifications (e) and (f) test the sensitivity of the results to changing the choice of case duration that accounts for potential censoring. The last two rows report DD-placebo estimates for the effects of being in the treatment group post two arbitrary dates in the pre period, August 1st 2007 and September 1st 2006. In this last test, the sample is restricted to cases which were filed before April 2009, i.e. prior to the actual policy change. Robust standard errors are reported in parenthesis (*** p<0.01, ** p<0.05, * p<0.1).


tests appear to support the patterns reported in the main part.

3.4.3 Placebo Test

The last two rows of table 4 present the results of two different placebo tests to further establish

that results are not coincidental or simply follow a pre-existing trend. Rows (g) and (h) use the

same treatment and control groups but only in the pre period. This period is divided into two

sub-periods using the arbitrary dates, August 1st 2007 and September 1st 2006, so the placebo is

on the “post” indicator. None of the effects are significant. These results are helpful in refuting

concerns regarding any difference in pre-trends between treatment and control groups.27

3.5 Instrumental Variable Estimation

All the preceding results are based on the expected number of arbitrators according to the regula-

tions applied by FINRA at the filing date. However, as mentioned above, regulations allow parties

to agree on a change in the number of arbitrators. Parties may wish to switch from a panel of three

to a sole arbitrator to reduce costs of proceedings. They may also choose to switch based on their

perception of which forum will better serve their interests. In practice, most parties seem to com-

ply with the rules set by FINRA, especially when claimants are not represented. The “switching

rate” ranges between 4 and 20 percent of cases, where the former is the percentage of cases that should have panels but choose a sole arbitrator instead, and the latter is the percentage of cases that

switch from one to three.

The lack of full compliance means that the results displayed so far should be thought of as the

“Intention to Treat” effect. Namely, the portrayed change is a consequence of the change in rule

and not in the actual forum hearing the case. To assess the effect of the change in the number of

arbitrators on the compliers, I next present the results of an estimation that uses the interaction

between the rule change and the affected group as an Instrumental Variable for the actual number

of arbitrators. The proposed instrument satisfies the necessary conditions and hence estimators

reported in table 5 are consistent. Column (1) presents the results for the first stage and confirms

that the instrument is highly correlated with a sole arbitrator presiding over the case. In addition,

the exclusion restriction is satisfied since there is no reason to believe that the rule change affects

arbitration outcomes by any other channel, except for the change in the number of arbitrators.

Last, as confirmed above, the timing of the rule change and hence the instrument, could be thought

of as exogenous to the outcomes of interest.

27 Although only the results for two cutoff dates are reported in the table, this exercise was conducted for numerous different cutoffs. It should also be noted that the same test was conducted for the post period, and, again, no significant effects were found.


Table 5: Instrumental Variable Estimation

                   (1)               (2)         (3)           (4)          (5)
                   Sole Arbitrator   AR=0        .4 < AR ≤ .6  AR=1         Settled
                   (1st Stage)

TreatedXPost       0.718***
                   (0.035)

Sole Arbitrator                      -0.136*     0.265***      -0.144***    0.224***
                                     (0.0801)    (0.0669)      (0.0439)     (0.0596)

Observations       1921              1921        1921          1921         1921

Robust standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1

Columns (2) through (5) present the 2SLS estimators for the main specifications and outcomes.

These estimates are larger in absolute values compared to the DD estimators reported above in

table 3 and significance levels are very similar. Generally, it shows that the effect is bigger for

cases that ‘play by the rules’ and reinforces the conclusion that panels and sole arbitrators behave

and decide differently.
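The relationship between the intention-to-treat estimates and the IV estimates can be illustrated with the Wald ratio: with a single instrument and a single endogenous variable, the 2SLS coefficient equals the reduced-form (ITT) coefficient divided by the first-stage coefficient, provided both are estimated with the same controls. The sketch below applies this to the row (a) coefficients of Table 3 and the first stage in Table 5. The ratios are a back-of-the-envelope check only; they need not coincide with the Table 5 estimates if the underlying specifications differ. The function name is hypothetical.

```python
def wald_ratio(reduced_form, first_stage):
    """IV (Wald) estimate: reduced-form effect scaled by the first-stage effect."""
    return reduced_form / first_stage


# Reduced-form (ITT) coefficients from Table 3, row (a).
itt = {"AR=0": -0.103, ".4<AR<=.6": 0.200, "AR=1": -0.093, "Settled": 0.164}
# First-stage coefficient on TreatedXPost from Table 5, column (1).
first_stage = 0.718

scaled = {name: round(wald_ratio(coef, first_stage), 3)
          for name, coef in itt.items()}
```

Because the first stage is below one, every ratio is larger in absolute value than the corresponding ITT coefficient, in line with the pattern described above.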

4 Exploring the Underlying Mechanisms

As discussed above, three alternative explanations can be offered for the observed differences between individual and group decisions in this setting. First, litigants may have different preferences over specific attributes of arbitrators, that depend on whether the chosen arbitrator will serve as a

sole arbitrator or as a member of a panel (or chair). If these attributes are also correlated with the

potential decision, then the results presented above serve as un-biased estimators of the impact of

the rule change, namely the difference in outcomes between a regime where individual arbitra-

tors decide on cases and a regime where groups decide. To explore this option, Panel B of table 3

reports the results of an arbitrator fixed-effects estimation. The estimated model is the same DD

model as in panel A, with the addition of arbitrator fixed effects. Since only arbitrators that qual-

ify as chairs may serve as sole arbitrators, the fixed-effect attributed to each panel of arbitrators is

based on the identity of the panel-chair. In addition to the technical reason for this choice of spec-

ification, it is reasonable to expect that the chair will have the greatest influence on the decision

of the panel, in light of the extensive responsibilities and authorities assigned to panel-chairs by

FINRA’s regulations.
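The logic of the fixed-effects ("within") comparison can be illustrated with a deliberately simplified sketch. It uses a single regressor and noise-free toy data, whereas the paper's model also includes the DD terms. The function name, chair labels, and all numbers are hypothetical and chosen only for illustration.

```python
import numpy as np


def within_estimator(y, d, groups):
    """Slope of y on d after demeaning both within each group (fixed effects)."""
    y = np.asarray(y, dtype=float)
    d = np.asarray(d, dtype=float)
    groups = np.asarray(groups)
    y_dm = np.empty_like(y)
    d_dm = np.empty_like(d)
    for g in np.unique(groups):
        mask = groups == g
        # Subtracting the group mean removes anything constant within a chair
        y_dm[mask] = y[mask] - y[mask].mean()
        d_dm[mask] = d[mask] - d[mask].mean()
    return float(d_dm @ y_dm / (d_dm @ d_dm))


# Hypothetical toy data: three chairs, each observed both on panel cases (sole=0)
# and on sole-arbitrator cases (sole=1)
chair = np.array(["A", "A", "A", "B", "B", "C", "C", "C"])
sole = np.array([0, 1, 1, 0, 1, 0, 0, 1])
chair_effect = {"A": 0.30, "B": 0.10, "C": 0.50}  # chair-specific propensity
true_beta = -0.10  # assumed within-chair change when deciding alone
extreme_award = np.array([chair_effect[c] for c in chair]) + true_beta * sole

beta_hat = within_estimator(extreme_award, sole, chair)
print(f"within estimate: {beta_hat:.3f}")
```

Because each chair's own propensity is differenced out, the estimate uses only the contrast between the same chair's sole and panel cases, which is what separates a behavioral change from selection of different arbitrators.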

It should be noted that this specification dramatically decreases statistical power, since there are 457 unique arbitrator names in a sample of 1257 cases. Nevertheless, as can be clearly seen from Table 3, the main effects are in the same direction and of a similar magnitude to the main results reported in panel A. The “within” estimators for the change in extreme (zero or one) awards are negative, implying that the same arbitrator renders fewer extreme awards as a sole arbitrator than the panel she chairs. These estimates are remarkably similar to the results without fixed effects, although they are not significant. It is reasonable to assume that the relatively high standard errors result from the limited sample size relative to the number of unique arbitrator names in the sample. The estimated effects on “average” awards (award rate between 0.4 and 0.6) and on the tendency to settle are also very similar to the main results and are significant. Hence, these results suggest that arbitrators change their decision patterns depending on the type of tribunal, and tend towards less extreme decisions when deciding as individuals (rather than within groups). Alternatively stated, these results rule out the “selection mechanism” suggested above.

The second mechanism that potentially drives the results is the group polarization phenomenon, which implies, in the context of this research, that panels of arbitrators will always tend to give zero or one awards, regardless of the type of dispute they are asked to decide on. In contrast, the third potential channel is only expected to operate in the presence of reputation concerns, where sole arbitrators will be more cautious about their reputation and avoid extreme decisions that may portray them as biased. In the main estimations, I focus only on customer cases, where the claimant is always the investor and the respondent is a firm. In such cases, litigants may suspect arbitrators of being systematically biased towards either side, based on each arbitrator’s past decisions. Hence, it is reasonable to assume that reputation concerns play a role.

However, in other cases (“industry cases”), the parties to the dispute could be two firms or a firm and a related person (employee, supplier, etc.), and no specific inference on arbitrators’ bias can be made. Table 6 presents the results of estimating the same DD model as above on a sample of industry cases. These cases were subject to the same change in regulations that is used to identify the difference between individuals and groups in the sample of customer cases. The results do not resemble the estimates from the customer cases sample, and none of the TreatedXpost estimates is significant. Although this can be attributed to sample size, the fact that the patterns of change do not match and that no distinct pattern of moderation or polarization can be identified in the industry cases sample suggests that the most probable mechanism is the reputation mechanism.

Table 6
The Effect of Number of Arbitrators on Award Rate Distribution & Settlements in Industry Cases

               (1)        (2)           (3)            (4)            (5)            (6)           (7)        (8)
               AR=0       0 < AR ≤ .2   .2 < AR ≤ .4   .4 < AR ≤ .6   .6 < AR ≤ .8   .8 < AR < 1   AR=1       Settled

TreatedXpost   0.0872     0.0267        -0.0314        -0.137         0.0342         0.339         -0.277     -0.010
               (0.139)    (0.0341)      (0.0731)       (0.138)        (0.0367)       (0.247)       (0.266)    (0.013)

Treated        -0.0333    -0.0597*      -0.0538        0.0947         -0.0612        -0.107        0.318**    0.001
               (0.0379)   (0.0341)      (0.0439)       (0.132)        (0.0527)       (0.206)       (0.160)    (0.014)

Post           -0.0265    0.0749        -0.0156        0.103*         -0.0900        0.0320        -0.0661    0.045
               (0.0624)   (0.0512)      (0.0933)       (0.0566)       (0.0962)       (0.161)       (0.163)    (0.037)

Constant       0.225      0.0215        0.249          0.154          0.0783         0.0310        0.150      -0.022
               (0.336)    (0.0410)      (0.198)        (0.0988)       (0.0683)       (0.317)       (0.292)    (0.035)

Observations   169        169           169            169            169            169           169        169
R-squared      0.317      0.099         0.139          0.154          0.084          0.236         0.273      0.059

Robust standard errors in parentheses
*** p<0.01, ** p<0.05, * p<0.1

5 Conclusion

This study presents empirical evidence on the difference between decisions of sole arbitrators and panels of three arbitrators. The results portray a tendency of panels to make more extreme decisions, i.e., closer to zero or one in award rates. In addition, panels seem to be less interested in promoting settlements.

The results are based on a research design that exploits a change in FINRA regulations regarding the threshold for assigning cases to panels rather than sole arbitrators. A difference-in-differences setting allows me to estimate the causal effect of “singles versus panels”. Interestingly, the results contradict many theoretical predictions and other findings based on observational studies, by clearly showing that panels tend to make more “all or nothing” decisions. These results are mainly surprising in light of the prevalent conjecture that panels are more predictable, i.e., that they provide moderate, ‘split the difference’ decisions, since arbitrators tend to balance each other’s bias or noisy signals by averaging out the opinions of the panel members.

I show that these findings are robust and most likely driven by the fact that reputation concerns affect sole arbitrators, driving them to avoid extreme outcomes that can potentially establish their reputation as biased towards either side. Alternatively stated, when arbitrators decide as a group they exhibit less concern for their reputation and hence have the courage to make extreme and even controversial decisions. If the arbitrators reach these decisions based on their true evaluation of the claim, this result is desirable. However, such results may also be driven by biased arbitrators, who use the fact that they are part of a group to freely express their biased views. Without the support of the panel, the same arbitrators are expected to make less biased decisions, to avoid damaging their reputation. On the downside, unbiased arbitrators may be reluctant to render truthful but extreme decisions, since the truth will never become publicly known and hence they will be perceived as potentially biased. Finding more opportunities to compare groups and individuals in different environments with similar career concerns would shed more light on the systematic differences between group and individual decisions and decision-making processes.

References

Adams, R. and D. Ferreira (2010). Moderation in groups: Evidence from betting on ice break-ups in Alaska. The Review of Economic Studies 77(3), 882–913.

Baillon, A., H. Bleichrodt, N. Liu, and P. P. Wakker (2016). Group decision rules and group rationality under risk. Journal of Risk and Uncertainty 52(2), 99–116.

Barber, B. M., C. Heath, and T. Odean (2003). Good reasons sell: Reason-based choice among group and individual investors in the stock market. Management Science 49(12), 1636–1652.

Barber, B. M. and T. Odean (2000). Too many cooks spoil the profits: Investment club performance. Financial Analysts Journal 56(1), 17–25.

Benabou, R. (2013). Groupthink: Collective delusions in organizations and markets. The Review of Economic Studies 80, 429–462.

Blinder, A. S. and J. Morgan (2005). Are two heads better than one? monetary policy by committee. Journal of Money, Credit, and Banking 37(5), 798–811.

Bone, J., J. Hey, and J. Suckling (1999). Are groups more (or less) consistent than individuals? Journal of Risk and Uncertainty 18(1), 63–81.

Bornstein, G. and I. Yaniv (1998). Individual and group behavior in the ultimatum game: Are groups more “rational” players? Experimental Economics 1(1), 101–108.

Charness, G. and M. Sutter (2012). Groups make better self-interested decisions. The Journal of Economic Perspectives 26(3), 157–176.

Choi, S. J., J. E. Fisch, and A. C. Pritchard (2010). Attorneys as arbitrators. The Journal of Legal Studies 39(1), 109–157.

Cooper, D. J. and J. H. Kagel (2005). Are two heads better than one? team versus individual play in signaling games. The American Economic Review 95(3), 477–509.

Eliaz, K., D. Ray, and R. Razin (2006). Choice shifts in groups: A decision-theoretic basis. The American Economic Review 96(4), 1321–1332.

Ely, J. C. and J. Valimaki (2003). Bad reputation. The Quarterly Journal of Economics, 785–814.

Epstein, L., W. M. Landes, and R. A. Posner (2011). Why (and when) judges dissent: A theoretical and empirical analysis. Journal of Legal Analysis 3, 101.

Glaeser, E. L. and C. R. Sunstein (2009). Extremism and social learning. Journal of Legal Analysis 1(1), 263–324.

Goeree, J. K. and L. Yariv (2011). An experimental study of collective deliberation. Econometrica 79(3), 893–921.

Janis, I. L. (1982). Groupthink: Psychological studies of policy decisions and fiascoes. Houghton Mifflin Boston.

Kerr, N. L., R. J. MacCoun, and G. P. Kramer (1996). Bias in judgment: Comparing individuals and groups. Psychological Review 103(4), 687.

Klement, A. and Z. Neeman (2013). Does information about arbitrators’ win/loss ratios improve their accuracy? The Journal of Legal Studies 42(2), 369–397.

Kocher, M. G. and M. Sutter (2005). The decision maker matters: Individual versus group behaviour in experimental beauty-contest games. The Economic Journal 115(500), 200–223.

Levy, G. (2007). Decision making in committees: Transparency, reputation, and voting rules. The American Economic Review, 150–168.

Marselli, R., B. C. McCannon, and M. Vannini (2013). Bargaining in the shadow of arbitration. Available at SSRN 2270481.

Morris, S. (2001). Political correctness. Journal of Political Economy 109(2), 231–265.

Moscovici, S. and M. Zavalloni (1969). The group as a polarizer of attitudes. Journal of Personality and Social Psychology 12(2), 125.

Ottaviani, M. and P. Sørensen (2001). Information aggregation in debate: who should speak first? Journal of Public Economics 81(3), 393–421.

Piketty, T. (1999). The information-aggregation approach to political institutions. European Economic Review 43(4), 791–800.

Posner, R. A. (2004). Judicial behavior and performance: An economic approach. Fla. St. UL Rev. 32, 1259.

Posner, R. A. (2010). How judges think. Harvard University Press.

Prather, L. J. and K. L. Middleton (2002). Are n+1 heads better than one?: The case of mutual fund managers. Journal of Economic Behavior & Organization 47(1), 103–120.

Rockenbach, B., A. Sadrieh, and B. Mathauschek (2007). Teams take the better risks. Journal of Economic Behavior & Organization 63(3), 412–422.

Stoner, J. A. (1968). Risky and cautious shifts in group decisions: The influence of widely held values. Journal of Experimental Social Psychology 4(4), 442–459.

Stoner, J. A. F. (1961). A comparison of individual and group decisions involving risk. Ph.D. thesis, Massachusetts Institute of Technology.

6 Appendix

(a) Awards in Customer Cases    (b) All Cases Filed

Figure A1
Number of Cases by Filing Year

Notes: Figure (a), on the left-hand side, presents the number of cases filed with FINRA arbitration every year between 2006 and 2011. The source for these numbers is statistics posted by FINRA on their website, https://www.finra.org/. Figure (b), on the right-hand side, shows the number of customer cases in the full sample collected for this study, by filing year. Due to the retrospective nature of the data, only cases that were closed by the end of 2012 are counted, and this probably explains the relatively small number of cases in 2011.
