
Estimating Preferred Outcomes from

Votes and Text

May 4, 2017

Abstract

We introduce a framework that simultaneously encompasses vote data and text data using a

shared set of underlying preference parameters. Our method allows us to estimate the number

of underlying ideological dimensions, and it is robust to both the “zero inflation” and extreme

outliers commonly encountered in text data. To illustrate its workings, we apply the method to

roll call and floor speech from recent sessions of the US Senate. We find two stable dimensions:

the first aligns closely with the standard ideological dimension that emerges in most analyses

of Congress, while the second identifies legislators with formal leadership positions. We then

use our method to leverage speech in order to impute missing data, to estimate the preferred

outcomes of the rank-and-file using only their words and the vote history of party leaders, and

even to scale newspaper editorials.

Word Count: 9,793 (abstract: 139)


1 Introduction

There is now what is generally regarded as a settled technology for analyzing roll call votes within a

spatial model (e.g., Clinton, Jackman and Rivers, 2004; Poole and Rosenthal, 1997). Likewise, there

are methods for recovering ideological locations from text (e.g. Slapin and Proksch, 2008; Laver,

Benoit and Garry, 2003). However, researchers are commonly confronted with both vote and text

data. Recent research has focused on one or the other, or modeled the two separately. For example,

studies have focused on votes while excluding readily available text data (Barbera, 2015; Ho and

Quinn, 2008), or they have focused only on text, while setting aside easily accessible binary choice

data (Lo, Proksch and Slapin, 2014; Elff, 2013; Quinn et al., 2010). A third group has estimated

latent clusters of co-occurring words, and then modeled ideal points conditional on this clustering

(Gerrish and Blei, 2011, 2012; Wang et al., 2013; Lauderdale and Clark, 2014). In contrast with

each of these analyses, our approach recognizes that both voting and speech are deliberate political

acts.

We introduce a method, Sparse Factor Analysis (SFA), that estimates actors’ spatial preferences

using information from both their word choice and their vote choice. Formally, the method unifies

word and vote choice within a single spatial framework. Legislator preference, vote cut-points,

and the ideological location of words are placed in a single, coherent structure. Statistically, we

estimate rather than assume the number of underlying latent dimensions. The statistical model

places a sparsity prior over the dimension weights, setting irrelevant dimensions to zero. The method

also models two key features of text data: zero-inflation and extreme outliers in the term-document

matrix. Lastly, we provide a data-driven means to balance the information coming from words and

votes.


SFA offers several advantages for the applied researcher. First, as mentioned above, the method

estimates the number of underlying latent dimensions. Second, text data can provide leverage in

cases where votes may have been whipped or otherwise swayed. In strong-party systems, legislative

speech can help add useful variance in the presence of party-line voting. Third, words and votes

are placed in the same, politically meaningful, space. Rather than scoring words off usage patterns

among actors previously assumed to be on the left or the right (e.g., Gentzkow and Shapiro, 2010; Laver, Benoit and

Garry, 2003), joint scaling identifies a common ordering of words, votes, and individuals. Fourth,

ideologically charged words will be estimated to anchor one side of the dimension or the other,

clarifying the concept captured by each dimension. Fifth, rather than requiring common votes or

survey responses (e.g., Bafumi and Herron, 2010), word usage can serve to bridge voting actors.

We illustrate each of these advantages using data from eight recent US Senates. Our estimates

reveal two stable dimensions: a left-right dimension encountered by other analysts (e.g., Clinton,

Jackman and Rivers, 2004; Poole and Rosenthal, 1997) and a second distinguishing leadership from

“rank and file” membership. Second, to explore behavior in the presence of party-line voting, we

treat all vote data except those from the majority and minority leader and whip as “missing.”

Even with such extreme missingness, SFA uses the text data to return reliable preference estimates.

Third, we show how SFA uses votes to orient the words. In recent Senates, ideologically charged

words anchor the left-right dimension, and a set of parliamentary control terms flip sides based on

which party holds the majority. Fourth, we illustrate the method’s ability to use words to “bridge”

across different sets of actors. Specifically, we treat unsigned editorials from the New York Times,

Washington Post, and Wall Street Journal as speeches from legislators who happen not to vote.

SFA recovers a ranking that places the Wall Street Journal to the right, the New York Times to

the left, and the Washington Post in the middle.


SFA builds off the roll call voting model of Clinton, Jackman and Rivers (2004), so it inherits

both the strengths and shortcomings of the standard spatial model (e.g. Poole and Rosenthal,

1985; Ladha, 1991; Clinton, Jackman and Rivers, 2004). We explain the behavioral assumptions

underlying SFA below, and we compare it with the increasingly popular and flexible family of topic

models (Roberts et al., 2014; Grimmer, 2010; Blei, Ng and Jordan, 2003). To help facilitate use of

the method, we make software publicly available in the R package BLINDED. We take particular care to establish the internal and external validity of SFA estimates and to discuss the method’s scope.

The paper proceeds as follows. Section 2 develops our choice-theoretic spatial model encompassing both voting and speech, and Section 3 sets forth our estimator, Sparse Factor Analysis (SFA). Section 4 presents a discussion of some of the key ideas driving SFA and provides a comparison of our approach with topic models. As a validity check, we then apply our model to recent sessions of the US Senate. A brief final section concludes.

2 The Model

The basic inputs for our model are observed binary data, i.e., roll call votes, and observed count data, i.e., words. We observe a stream of votes $V_{lp} \in \{1, 0\}$ for each legislator $l \in \{1, 2, \ldots, L\}$ on each proposal $p \in \{1, 2, \ldots, P\}$. We also observe the number of times $T_{lw} \in \{0, 1, 2, \ldots\}$ legislator $l$ utters each term$^1$ $w \in \{1, 2, \ldots, W\}$.

Associated with each observed vote, $V_{lp}$, is a latent variable $V^*_{lp}$ capturing the intensity with which legislator $l$ will vote Aye on proposal $p$. Similarly, the latent variable $T^*_{lw}$ corresponds with speaker $l$’s proclivity to use term $w$. Higher values of $V^*_{lp}$ are associated with a more likely vote of Aye,

1We will later operationalize terms as stemmed bigrams.


while higher values of $T^*_{lw}$ are associated with observing a larger count for the term.

We assume that legislators’ preferences over outcomes are embedded in a D-dimensional space.

In each dimension $d \in \{1, 2, \ldots, D\}$, member $l$ has a preferred outcome $x_{ld}$. The first dimension generally corresponds with an ideological dimension, ranging from “left” to “right.” A second dimension will, by construction, capture a spatial consideration uncorrelated with the first dimension. For example, if $x_{l1}$ captures the left-right dimension, then $x_{l2}$ may capture an ethnic, linguistic, or racial divide (Lijphart, 1999), or perhaps an institutional characteristic of the chamber

being modeled.

Each dimension has a weight $a_d \ge 0$, common across members, with a higher value of $a_d$ signifying a more relevant dimension. Dimensions with a weight of 0 are considered to be irrelevant.

We discuss this in detail below.

Vote choice. A legislator’s propensity to vote in favor of proposal $p$ is directly proportional to the difference between the legislator’s utility from the proposal, whose location corresponds to $\{z^{aye}_{pd}\}_{d=1}^{D}$, and her utility from the status quo outcome pertinent to proposal $p$, located at $\{z^{nay}_{pd}\}_{d=1}^{D}$, that would prevail if the proposal is defeated.

Just as legislators’ most preferred outcomes can be multidimensional, so are the outcomes themselves. For example, policy $p$ may have both a redistributive and an ethnic aspect: think of a pre-school program, with lessons delivered in the majority ethnicity’s language. In this hypothetical case, $z^{aye}_{p1}$ would be to the left on the left-right dimension, while $z^{aye}_{p2}$ would be closer to the majority ethnicity on the second dimension. The relative weights given to the two dimensions are controlled by the relative magnitude of $a_1$ and $a_2$.

When considering a vote, the legislator must choose between an Aye and Nay outcome. We


operationalize the legislator’s preferences as quadratic:

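$$
\begin{aligned}
V^*_{lp} &= U^{vote}_l\left(\text{Aye}; \{x_{ld}\}_{d=1}^{D}, \{z^{aye}_{pd}\}_{d=1}^{D}\right) - U^{vote}_l\left(\text{Nay}; \{x_{ld}\}_{d=1}^{D}, \{z^{nay}_{pd}\}_{d=1}^{D}\right) \\
&= -\frac{1}{2}\sum_{d=1}^{D} a_d \left(z^{aye}_{pd} - x_{ld}\right)^2 - \left(-\frac{1}{2}\sum_{d=1}^{D} a_d \left(z^{nay}_{pd} - x_{ld}\right)^2\right) - \varepsilon^{vote}_{lp}
\end{aligned}
\tag{1}
$$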

with $\varepsilon^{vote}_{lp}$ a standard normal random variable. Simplifying and combining terms gives a representation of the vote choice as:$^2$

$$
V^*_{lp} = c^{vote}_l + b^{vote}_p + \sum_{d=1}^{D} a_d x_{ld} g^{vote}_{pd} - \varepsilon^{vote}_{lp} \tag{2}
$$

where $c^{vote}_l$ are legislator $l$ specific fixed effects, $b^{vote}_p$ are fixed effects peculiar to proposal $p$, and $a_d$ and $x_{ld}$ are the dimension weights and preferred outcomes discussed above, while $g^{vote}_{pd}$ is the signed distance between the dimension $d$ coordinates pertaining to the Aye and Nay alternatives relevant to the vote on proposal $p$. The terms $c^{vote}_l$ and $b^{vote}_p$ are amalgams of the structural parameters. For

our purposes they are nuisance parameters that capture the baseline propensity for a given proposal

to receive support and for a given member to vote in support of a generic proposal.

Notice that in the context of equation (2) the standard normal preference shock, $\varepsilon^{vote}_{lp}$, combined with $V_{lp} = 1\{V^*_{lp} > 0\}$, leaves us with a probit link between equation (2) and $V_{lp}$, which aligns closely

with the voting model of Clinton, Jackman and Rivers (2004).
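The probit link implied by equation (2) can be illustrated with a minimal sketch. The function name `aye_probability` and all numeric values below are hypothetical, chosen only for illustration; they are not part of the authors' software, and the sketch assumes the parameters are already known rather than estimated.

```python
from statistics import NormalDist


def aye_probability(c_l, b_p, a, x_l, g_p):
    """Pr(V_lp = 1) under equation (2) with a standard normal shock.

    a, x_l and g_p hold one entry per dimension d: the dimension weight,
    the legislator's preferred outcome, and the signed Aye-Nay gap.
    """
    if not (len(a) == len(x_l) == len(g_p)):
        raise ValueError("a, x_l and g_p must have one entry per dimension")
    # Systematic part of the latent vote intensity V*_lp.
    mean = c_l + b_p + sum(a_d * x_d * g_d for a_d, x_d, g_d in zip(a, x_l, g_p))
    # V_lp = 1 when mean - eps > 0, i.e. eps < mean, so Pr(Aye) = Phi(mean).
    return NormalDist().cdf(mean)


# Hypothetical values: two dimensions, the second carrying zero weight.
p_right = aye_probability(0.0, 0.0, a=[1.0, 0.0], x_l=[1.0, 3.0], g_p=[0.8, -2.0])
p_left = aye_probability(0.0, 0.0, a=[1.0, 0.0], x_l=[-1.0, 3.0], g_p=[0.8, -2.0])
```

Because the second dimension has weight zero in this toy setup, changing a legislator's position on it leaves the Aye probability unchanged, which is the sense in which a zero-weight dimension is irrelevant.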

Term choice. Our model of term choice centers on a latent intensity variable $T^*_{lw}$ that maps to the observed count $T_{lw}$ for each term. $T^*_{lw}$ reflects two considerations: the ideological proximity of the term to legislator $l$’s most preferred outcome $(\{x_{ld}\}_{d=1}^{D})$ and the pertinence of the term to the issues of the day. Ideological proximity is the distance from her most preferred outcome to the term’s spatial location $(\{z^{term}_{wd}\}_{d=1}^{D})$. If the member only selected terms based on ideology, then she would simply utter her most preferred terms ad infinitum, regardless of external circumstance. But

2For a specification of the utility functions and a full derivation, see the technical appendix.


no one chooses terms this way. Members’ more extensive active vocabularies stem in part from considerations of pertinence. We decompose pertinence into three components. First, there is the aptness $(s_w)$ of the term to the substantive content of the issues before Congress; for example, we find discussion of mortgage-backed securities in 2009 that was not relevant in 1999. Second, there is the legislator’s baseline verbosity, $(v_l)$, which reflects their inherent garrulousness. Third, there is the diminishing return from overusing a term. We formalize the intensity with which legislator $l$ applies term $w$ as the $T^*_{lw}$ that maximizes:

$$
U^{term}_{lw}\left(T^*_{lw}; \{x_{ld}\}_{d=1}^{D}, \{z^{term}_{wd}\}_{d=1}^{D}\right) = \underbrace{-\frac{1}{2} T^*_{lw} \sum_{d=1}^{D} a_d \left(x_{ld} - z^{term}_{wd}\right)^2}_{\text{Ideology}} + \underbrace{T^*_{lw}\left(v_l + s_w - \frac{1}{2} T^*_{lw} - \varepsilon^{term}_{lw}\right)}_{\text{Pertinence}} \tag{3}
$$

The Ideology component of term usage involves the preferred outcome of each legislator, $\{x_{ld}\}_{d=1}^{D}$, and the spatial location of terms, $\{z^{term}_{wd}\}_{d=1}^{D}$, whereas the Pertinence component is a

function of non-ideological concerns: the legislator’s predilection to speak, the relevance of the term

during this session, and a diminishing returns component that controls the total level of speech.

Maximizing equation (3) with respect to $T^*_{lw}$ leads to an optimal choice of the form:$^3$

$$
T^*_{lw} = c^{term}_l + b^{term}_w + \sum_{d=1}^{D} a_d x_{ld} z^{term}_{wd} - \varepsilon^{term}_{lw} \tag{4}
$$
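The step from equation (3) to equation (4) can be sketched through the first-order condition. The grouping of terms into $c^{term}_l$ and $b^{term}_w$ below is one reading that follows from equation (3) as written; it is an illustration, not a substitute for the full derivation in the appendix.

```latex
\begin{align*}
\frac{\partial U^{term}_{lw}}{\partial T^*_{lw}}
  &= -\frac{1}{2}\sum_{d=1}^{D} a_d \left(x_{ld} - z^{term}_{wd}\right)^2
     + v_l + s_w - T^*_{lw} - \varepsilon^{term}_{lw} = 0 \\
\Rightarrow\quad T^*_{lw}
  &= \underbrace{\Big(v_l - \tfrac{1}{2}\sum_{d=1}^{D} a_d x_{ld}^2\Big)}_{c^{term}_l}
   + \underbrace{\Big(s_w - \tfrac{1}{2}\sum_{d=1}^{D} a_d \big(z^{term}_{wd}\big)^2\Big)}_{b^{term}_w}
   + \sum_{d=1}^{D} a_d x_{ld} z^{term}_{wd} - \varepsilon^{term}_{lw}
\end{align*}
```

The second derivative with respect to $T^*_{lw}$ is $-1$, so the stationary point is a maximum. The cross term $\sum_d a_d x_{ld} z^{term}_{wd}$ comes from expanding the squared distance.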

Like our model of vote choice, which takes the characteristics of legislative proposals as fixed,

this formulation treats the pertinence and ideological content with which words are freighted as

exogenous. However, there is no counterpart to the status quo policy in our model of term usage.

Ceteris paribus, terms are used in our model of speech on the basis of their proximity to the speaker’s

most preferred outcome, and on the pertinence of the word. We summarize each element of our model in Table 1.

3See the Appendix for a full derivation.

| Symbol | Equation | Interpretation |
|---|---|---|
| Observed Outcomes | | |
| $V_{lp}$ | 2 | Observed vote for legislator $l$ on proposal $p$ |
| $T_{lw}$ | 4 | Observed count for legislator $l$ on term $w$ |
| Latent Outcomes | | |
| $V^*_{lp}$ | 2 | Intensity with which legislator $l$ supports proposal $p$ |
| $T^*_{lw}$ | 4 | Intensity with which legislator $l$ uses term $w$ |
| Model Parameters, Common Across Term and Vote Models | | |
| $x_{ld}$ | 2 & 4 | Preferred outcome for legislator $l$ in dimension $d$ |
| $a_d$ | 2 & 4 | Weight for dimension $d$ |
| Model Parameters Pertinent to the Vote Model | | |
| $g^{vote}_{pd}$ | 2 | Gap between Aye and Nay vote in dimension $d$ |
| $c^{vote}_l$ | 2 | Auxiliary legislator parameter |
| $b^{vote}_p$ | 2 | Auxiliary proposal parameter |
| $\varepsilon^{vote}_{lp}$ | 2 | Random component of vote choice |
| Model Parameters Pertinent to the Term Model | | |
| $z^{term}_{wd}$ | 4 | Term location in dimension $d$ |
| $c^{term}_l$ | 4 | Auxiliary legislator parameter |
| $b^{term}_w$ | 4 | Auxiliary term parameter |
| $\varepsilon^{term}_{lw}$ | 4 | Random component of term choice |

Table 1: Elements of the model. For clarity, we present the variables central to our analysis. The ideal point, $x_{ld}$, and dimension weight, $a_d$, for each dimension $d$ are common across the word and vote models, providing an explicit link between the two. The remaining parameters and outcomes vary between the word and vote model, though they perform a similar function in each.

Placing votes and words in a common space. The most preferred outcomes, $\{x_{ld}\}_{d=1}^{D}$, and dimension weights, $\{a_d\}_{d=1}^{D}$, are precisely the same parameters in both the voting proclivity $V^*_{lp}$ in equation (2) and the term use inclination $T^*_{lw}$ in equation (4), while the parameters $c^{term}_l$ and $b^{term}_w$ are individual- and term-specific effects, respectively, that are peculiar to the term model.

However, the link between the latent term intensity, $T^*_{lw}$, and the actual term count $T_{lw}$ is


different from the corresponding link between voting proclivities and votes ($V_{lp} = 1\{V^*_{lp} > 0\}$). A set of cut-points, $\{\tau_k\}_{k=-1}^{\infty}$, partitions $T^*_{lw}$ so that the probability of observing a given term count is the probability of the latent variable falling between two adjacent cut-points. This connects the latent space to the observed term count as:

$$
\begin{aligned}
\Pr(T_{lw} = k) &= \Pr\left(\tau_{k-1} \le T^*_{lw} < \tau_k\right) \\
&= \Phi\left(\tau_k - c^{term}_l - b^{term}_w - \sum_{d=1}^{D} a_d x_{ld} z^{term}_{wd}\right) - \Phi\left(\tau_{k-1} - c^{term}_l - b^{term}_w - \sum_{d=1}^{D} a_d x_{ld} z^{term}_{wd}\right)
\end{aligned}
\tag{5}
$$

with the convention that $\tau_{-1} = -\infty$, and where $\Phi(\cdot)$ denotes the cumulative distribution function of the standard normal.

We note that, under this framework, the posterior density is log-concave, meaning there is a single

mode. This provides an advantage over mixture models such as the topic model, where different

starting values may lead to different results (but see Roberts, Stewart and Tingley, 2015).
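The ordered-probit mapping from latent intensity to counts in equation (5) can be sketched as follows. This is a minimal illustration, assuming a finite list of known cut-points; the function name `count_probabilities` and the numeric values are hypothetical, and counts above the last supplied cut-point are lumped into one final category.

```python
from statistics import NormalDist


def count_probabilities(mean, cutpoints):
    """Pr(T_lw = k) as in equation (5), given a finite list of cut-points.

    mean: c_l + b_w + sum_d a_d * x_ld * z_wd, the systematic part of T*_lw.
    cutpoints: strictly increasing [tau_0, tau_1, ...]; tau_{-1} is -infinity.
    Returns len(cutpoints) + 1 probabilities: one for each count
    k = 0, ..., len(cutpoints) - 1, plus a final entry for all larger counts.
    """
    if any(hi <= lo for lo, hi in zip(cutpoints, cutpoints[1:])):
        raise ValueError("cut-points must be strictly increasing")
    phi = NormalDist().cdf
    # Latent CDF at each cut-point, padded with 0 (tau_{-1}) and 1 (+infinity).
    cdf = [0.0] + [phi(tau - mean) for tau in cutpoints] + [1.0]
    return [hi - lo for lo, hi in zip(cdf, cdf[1:])]


# Hypothetical cut-points and two hypothetical latent means.
probs_low = count_probabilities(0.3, [0.0, 1.0, 2.5])
probs_high = count_probabilities(2.0, [0.0, 1.0, 2.5])
```

A larger systematic component shifts probability mass away from a zero count and toward higher counts, which is how ideological proximity and pertinence raise expected term usage in this setup.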

Modeling assumptions. Before considering any statistical method, the researcher should check

that the assumptions of the model seem plausible in the study at hand. SFA is tightly connected to the standard spatial voting model of Poole and Rosenthal (1985), as implemented in Clinton, Jackman and Rivers (2004). Thus, we note that SFA inherits the strengths, criticisms, and assumptions of the vote model. We first discuss some of the modeling assumptions of SFA and their

implications for the method’s intended scope.

Behaviorally, SFA assumes actors stake out consistent positions with both their votes and their

speech. Legislators’ voting and speech should exhibit a consistent spatial position. This is not to

say that legislators are sincerely revealing their heartfelt beliefs, but rather that they telegraph a

consistent issue position for public consumption. In particular legislators’ floor speeches express a

point of view, and they are not part of a process of deliberation in which a legislator’s positions

evolve in response to the persuasive floor speeches of others. Legislators certainly persuade one

another, but our method requires that such swaying of others’ opinions not take place in the text


being analyzed.

SFA also requires two underlying structural assumptions. First, as with the canonical vote model (e.g., Poole and Rosenthal, 1985; Clinton, Jackman and Rivers, 2004), the voting component of SFA treats the agenda as exogenous: the status quo and the alternative positions associated with each proposal are taken as exogenously given.$^4$

Likewise, the term model requires that the relevance of different terms be commonly perceived

by all speakers, with each term at an exogenously fixed position in the ideological space. Thus

bigrams such as “reproductive rights” or “death tax” are taken to have stable ideological content

across speeches. What about procedural terms such as “rescind”, “quash” or “enact”? Whereas

these words may be part of messages advocating positions on either the right or the left, if they

are used at similar rates by legislators across the political spectrum our estimator will, correctly,

attribute little intrinsic ideological content to such terms.

As with any method, SFA should only be applied in situations where the analyst is well aware

of the assumptions it embodies, either because she believes that the assumptions are reasonable, or to

assess and explore the implications of assumptions on substantive findings. Applied to speeches and

votes from the floor of the US Senate, our estimator treats floor speech as expressive, an assumption

we find plausible. We would be more skeptical about applying the method to judicial argument or to

academic discourse. In any event, we also discuss and implement several methods for assessing the

internal and external validity of our estimator, and we strongly recommend applying these checks

when using SFA.

4Note that we could relax this assumption, allowing the positions associated with a given vote to be perceived

with error, provided all actors share a common perception of their locations (Ladha, 1991, esp. Sec. 2).


3 Estimation

Conditional on knowing the number of dimensions, the voting portion of SFA can be estimated as

a standard Bayesian item response theory (IRT) model (see Jackman, 2009). For all parameters except the cut-points, $\{\tau_k\}_{k=-1}^{\infty}$, and dimension weights, $a_d$, we assume conjugate priors that are normal for mean parameters and inverse-gamma for variance parameters. As we rely on a latent probit specification (Clinton, Jackman and Rivers, 2004; Albert and Chib, 1993), the error terms $\varepsilon^{term}_{lw}$ and $\varepsilon^{vote}_{lp}$ are assumed independent and identically distributed standard normal variables.$^5$

Estimating the number of dimensions. Our method differs from the standard practice of

assuming a number of dimensions for the latent space. Instead we estimate the number of dimensions. We start by placing a Laplacian (LASSO) prior over the dimension weights (Park et al., 2008; Tibshirani, 1996):

$$
\Pr(a_d) \sim \frac{\lambda}{2} \exp\left(-\lambda |a_d|\right) \tag{6}
$$

This provides us with a framework to estimate, rather than to impose, the number of relevant dimensions. In practice, our algorithm estimates most of the dimension weights as being equal to zero, recovering a parsimoniously low-dimensional representation of legislators’ preferences. As part of our estimation, we naturally recover the conditional maximum likelihood estimate of $a_d$, given the estimated values of the other parameters; denote this as $\hat{a}^{ML}_d$. Given $\lambda$, the maximum a posteriori (MAP) estimate for $a_d$ is:

5As much of the estimation is standard, we defer it to a technical appendix.


$$
\hat{a}^{MAP}_d =
\begin{cases}
\hat{a}^{ML}_d - \lambda & \hat{a}^{ML}_d > \lambda \\
0 & \hat{a}^{ML}_d \le \lambda
\end{cases}
\tag{7}
$$

The threshold parameter λ is estimated within a Gibbs sampler (Park et al., 2008). Thus, our

method sets weights to zero for dimensions for which the posterior mode of the dimension weight

is zero.
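The thresholding rule in equation (7) can be illustrated with a short sketch. The function name `map_weight`, the candidate weights, and the value of λ are hypothetical; in the method itself λ is estimated within the sampler rather than fixed by hand.

```python
def map_weight(a_ml, lam):
    """MAP dimension weight from equation (7), given the ML weight and lambda."""
    if lam < 0:
        raise ValueError("the threshold lambda must be non-negative")
    # Weights above the threshold are shrunk by lambda; the rest are set to zero.
    return a_ml - lam if a_ml > lam else 0.0


# Hypothetical conditional ML weights for three candidate dimensions.
weights_ml = [1.40, 0.30, 0.05]
lam = 0.25
weights_map = [map_weight(a, lam) for a in weights_ml]

# Dimensions that survive the threshold are the ones treated as relevant.
n_relevant = sum(1 for w in weights_map if w > 0)
```

With these illustrative numbers the third dimension falls below the threshold and is dropped, while the first two are retained with their weights shrunk toward zero.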

Zero-inflation and robust cut-point estimation. We model the cut-points as a function of

the word count. The likelihood in equation (5) suggests an ordered probit formulation for the cut-

points. Given the data are counts, with values from 0 to several thousand, we cannot fit a cut-point

for each value. Instead, we place a model over the cut-points. The model is designed to handle

three attributes common to text data. First, the data is zero-inflated:$^6$ most members do not use

most terms in a given year. Second, the data is highly skewed: the observed counts range from

0 to the hundreds. Third, the largest values are highly variable from year to year, and we model

cut-points that are robust to shifts at the high end.

To generate cut-points robust to outliers, we model the cut-points as a function of the empirical CDF. The empirical CDF for a given time period is defined as

$$
\hat{F}(c) = \frac{1}{LW} \sum_{l=1}^{L} \sum_{w=1}^{W} 1\left(T_{lw} \le c\right). \tag{8}
$$

Using the empirical CDF is equivalent to working with ranks, rescaled from zero to one.

6The term “zero inflation” owes its etymology to the early use of Poisson models to analyze text. Let $f_{td}$ denote the frequency with which term $t$ is used in document $d$. In the Poisson model $\mathrm{Prob}\{f_{td} = 0\} = \frac{1}{2} \frac{\mathrm{Prob}\{f_{td} = 1\}^2}{\mathrm{Prob}\{f_{td} = 2\}}$, whereas the relative frequency of 0 counts in observed corpora of text is a great deal larger.


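We model the $c$th cut-point as:

$$
\tau_c \mid \beta_0, \beta_1, \beta_2 = \beta_0 + \beta_1 \hat{F}(c-1)^{\beta_2} \tag{9}
$$

where

$$
\hat{F}(-1) = 0 \tag{10}
$$

$$
\beta_0 = \tau_0 = \hat{F}(0) \tag{11}
$$

$$
\beta_1, \beta_2 > 0 \tag{12}
$$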

Modeling the cut-points in terms of $\hat{F}(c)$ instead of $c$ leaves them less sensitive to extreme outliers. Forcing the intercept, and hence first cut-point, to be $\hat{F}(0)$ models the zero-inflation directly. The intercept shifts with the sparsity of the speech data, such that $\Phi(\beta_0)$ is the proportion of zeroes in the speaker-term matrix. Finally, the quasi-linear form of $\beta_1$ and $\beta_2$ allows some flexibility in modeling the cut-points, while still ensuring that they are an increasing function of $c$. The values of $\beta_1$ and $\beta_2$ are estimated via Hamiltonian Monte Carlo (Neal, 2011); see the technical appendix for details.
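The cut-point construction in equations (8) through (12) can be sketched as follows. This is an illustration under stated assumptions: the function names, the toy count matrix, and the values of β1 and β2 are hypothetical (the method estimates β1 and β2 rather than fixing them), and the intercept is set to the empirical CDF at zero, following equation (11) as printed.

```python
from bisect import bisect_right


def empirical_cdf(counts):
    """Build F_hat(c), the share of speaker-term cells with count <= c (equation (8))."""
    flat = sorted(t for row in counts for t in row)
    n = len(flat)

    def f_hat(c):
        if c < 0:
            return 0.0  # equation (10): F_hat(-1) = 0
        return bisect_right(flat, c) / n

    return f_hat


def cutpoint(c, f_hat, beta0, beta1, beta2):
    """tau_c = beta0 + beta1 * F_hat(c - 1) ** beta2 (equation (9))."""
    if beta1 <= 0 or beta2 <= 0:
        raise ValueError("beta1 and beta2 must be positive (equation (12))")
    return beta0 + beta1 * f_hat(c - 1) ** beta2


# Hypothetical 3 x 4 speaker-term matrix: mostly zeros, one extreme count.
counts = [[0, 0, 1, 0], [0, 3, 0, 250], [0, 0, 0, 2]]
f_hat = empirical_cdf(counts)
beta0 = f_hat(0)  # intercept set as in equation (11) as printed
taus = [cutpoint(c, f_hat, beta0, beta1=2.0, beta2=0.5) for c in range(5)]
```

Because the cut-points depend on the counts only through their ranks, replacing the extreme value 250 with a far larger number leaves these first few cut-points unchanged, which is the robustness property the construction is meant to deliver.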

Balancing words and votes. As there are often an order of magnitude more terms than votes,

the researcher may fear that the term data is swamping the vote data. We therefore introduce a parameter, $\alpha$, that controls the relative information coming from each source.$^7$ At $\alpha = 0$, all information on the scaled locations comes from votes; at $\alpha = 1$, all information on the scaled locations comes from words.

We suggest two ways to select $\alpha$. The first involves fitting the model at a range of values of $\alpha$ and presenting the results, showing how they change across these values. This is the strategy we follow in our example

7As a Bayesian model, the sources should be averaged and weighted by their precisions. Since the random errors

in the latent space are standard normal, the precision is 1 for each source.


below, presenting results for $\alpha \in \{0, 1/2, 1\}$.

For the researcher interested in a data-driven means for selecting $\alpha$, we suggest a criterion statistic such that the ideal points are maximally discriminatory.$^8$ Denote the dimension weights and ideal points as a function of $\alpha$: $a_d(\alpha)$ and $x_{ld}(\alpha)$. Our suggested criterion is:

$$
\mathrm{disc}(\alpha) = \sum_{d=1}^{D} \sum_{l=1}^{L} \sum_{l'=1}^{L} a_d(\alpha)^2 \left(x_{ld}(\alpha) - x_{l'd}(\alpha)\right)^2. \tag{13}
$$

All else equal, higher values for $\mathrm{disc}(\alpha)$ attribute a larger fraction of the disparities among legislators’ speech and voting to ideological differences.

All elements of the discrimination statistic are returned from the MCMC output, so our software yields the full posterior density of this statistic, and the optimal value can be selected using the mean. We find that, in our example, the statistic is reasonably concave in $\alpha$ with a well-defined stable maximum at about $\alpha = 0.36$ over all years. We report the results from this statistic in the

supplemental results.
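The criterion in equation (13) is straightforward to compute from a set of weights and ideal points. The sketch below is illustrative only: the function name `discrimination`, the dictionary `fits`, and all numeric values are hypothetical, and it works from single point estimates at each α rather than from the full posterior described above.

```python
def discrimination(a, x):
    """disc = sum_d sum_l sum_l' a_d^2 * (x_ld - x_l'd)^2, as in equation (13).

    a: list of D dimension weights.
    x: list of L ideal points, each a list of D coordinates.
    """
    total = 0.0
    for d, a_d in enumerate(a):
        coords = [x_l[d] for x_l in x]
        # Sum of squared differences over all ordered pairs of legislators.
        total += a_d ** 2 * sum((xi - xj) ** 2 for xi in coords for xj in coords)
    return total


# Hypothetical point estimates (weights, ideal points) at three values of alpha.
fits = {
    0.0: ([1.0], [[1.0], [-1.0]]),
    0.5: ([1.5], [[1.0], [-1.0]]),
    1.0: ([0.5], [[1.0], [-1.0]]),
}
best_alpha = max(fits, key=lambda alpha: discrimination(*fits[alpha]))
```

In practice the grid of α values would be finer, and the statistic would be averaged over posterior draws before taking the maximum.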

Additional uses. We have focused on a situation where both votes and term counts are present.

There are cases where we observe legislative speech, but the votes of members outside the leadership

circle are either missing or they are so heavily whipped we do not trust them. In this case, as we

show below, SFA can leverage the relative handful of votes cast by party leaders and the term

frequencies for “back bench” legislators to recover reliable ideal point estimates that permit us to

predict how members would vote in the absence of “whipping.” Our method really comes into

its own when we use text to link the speech by extramural political actors, such as failed election

candidates, newspaper editors, and even voters responding to open-ended interview questions, with

the speech and hence the voting choices made by legislators.

8We are grateful to BLINDED for suggesting this approach.


Second, SFA recovers the spatial location of words. As these results come from an underlying

measurement model, they have a firmer basis than methods that generate right/left measures based on how often each term is used by actors with known preferences (Gentzkow and Shapiro, 2010; Laver, Benoit

and Garry, 2003), or based on the assumption that the data generating process for term choice is

exogenous with respect to vote choice (Gerrish and Blei, 2011, 2012; Wang et al., 2013; Lauderdale

and Clark, 2014). SFA scales terms and votes simultaneously, providing natural structural estimates

of word affect.

4 SFA for Identifying Latent Spatial Structure

The applied researcher confronts a host of possible text analytic methods in answering a substantive

question. Likely the most popular is the topic model (Roberts et al., 2014; Grimmer, 2010; Blei,

Ng and Jordan, 2003), so we wish to clarify how the user may wish to think about which method,

SFA or the topic model, may be appropriate. We discuss next the choice between a topic model

and SFA, and later address existing methods that have combined both topic models and scaling.

We emphasize that this is not an either/or distinction; easy-to-implement software is available for

each.

How do dimensions and topics differ? The basic difference between SFA and a topic model

is analogous to the difference between cluster models and factor analysis. Topic models will return

clusters of co-occurring words. SFA will return latent factors, locating actors and terms in a latent

space.

Consider the following illustrative example that actually possesses the spatial structure SFA is

designed to capture. Suppose there are ten legislators facing six votes, and that the probability

of voting Aye comes from an underlying process with one ideological dimension, as presented in


[Figure 1 image: grid of vote probabilities with Legislators 1–10 on the rows and Votes 1–6 on the columns; a darker color means more likely to vote Aye. Panel title: Data Generating Process for Comparing SFA and Topic Models: Likelihood of Voting Yes For Each Legislator by Vote.]

Figure 1: Simulated data setup. Legislators are arrayed across rows and votes across columns. The darker the square, the more likely the legislator to vote Aye on that particular vote.

Figure 1. Legislators are arrayed across rows and votes across columns. The darker the square, the

more likely the legislator to vote Aye on that particular proposal. Legislators 1 – 5 are more likely

to vote Aye on the first 3 votes and more likely to vote Nay on the last 3. Legislators 6–10 are

more likely to vote Nay on the first 3 votes, and Aye on the last 3. Legislators 5 and 6 are relative

moderates, while bills 3 and 4 are relatively noncontroversial.9

We fit both SFA and a topic model to a draw of the vote data. In order to fit a topic model, we assume each legislator uttered six “terms” representing their vote and the bill number from a potential vocabulary of twelve terms: {“Aye on 1”, “Nay on 1”, “Aye on 2”, “Nay on 2”, . . ., “Aye on 6”, “Nay on 6”}. We implemented the EM version of SFA and also gave the same data to a Structural Topic Model, as implemented in stm. We fit a three-topic model to the data. Four-, five-, and six-topic models returned qualitatively similar results.

[9] Specifically, let $s_i \in \{-4.5, -3.5, \ldots, 4.5\}$ for legislators $i = 1, \ldots, 10$ and $w_j \in \{-2.5, -1.5, \ldots, 2.5\}$ for votes $j = 1, \ldots, 6$. We drew $Y_{ij} \sim \mathrm{Bern}(\Phi(s_i w_j / 2))$.

SFA Results                                  Topic Model Results
Dimension       Most Preferred   Bill
Displacements   Outcomes         Weights     Topic 1    Topic 2    Topic 3
 0.24            0.87             1.19       Nay on 2   Nay on 4   Aye on 1
 0               1.02             0.84       Nay on 1   Nay on 6   Nay on 5
 0               1.02             0.57       Aye on 4   Aye on 2   Aye on 2
 0               1.02            -0.60       Aye on 6   Aye on 3   Aye on 3
 0               0.36            -1.06
 0               0.14            -1.03
                -1.02
                -0.95
                -1.11
                -1.35

Table 2: Results from SFA and a Topic Model on the Simulated Dataset.
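As a minimal sketch of the simulated setup, the following Python code draws one vote matrix from the data generating process in footnote 9 and encodes each legislator's votes as six “terms” from the twelve-term vocabulary. The function names, the index convention (rows are legislators, columns are votes), and the random seed are our own illustrative choices; the sketch does not reproduce the particular draw analyzed here, and it does not fit SFA or stm.

```python
import math
import random


def normal_cdf(x):
    """Standard normal CDF, Phi(x), computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


# Legislator positions s_i and vote parameters w_j, as listed in footnote 9.
S = [-4.5 + k for k in range(10)]  # -4.5, -3.5, ..., 4.5
W = [-2.5 + k for k in range(6)]   # -2.5, -1.5, ..., 2.5


def aye_probability(s, w):
    """Probability of an Aye vote: Phi(s * w / 2)."""
    return normal_cdf(s * w / 2.0)


def draw_votes(rng):
    """One draw of the 10 x 6 vote matrix; 1 = Aye, 0 = Nay."""
    return [[1 if rng.random() < aye_probability(s, w) else 0 for w in W]
            for s in S]


def votes_to_terms(votes):
    """Encode each legislator's six votes as six 'terms' for a topic model."""
    return [[("Aye" if v else "Nay") + " on " + str(j + 1)
             for j, v in enumerate(row)]
            for row in votes]


rng = random.Random(0)  # arbitrary seed, chosen only for reproducibility
votes = draw_votes(rng)
docs = votes_to_terms(votes)
```

Each element of `docs` is one legislator's six-term “document”; a topic model would take these as input, while a scaling method would take the binary matrix `votes`.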

The left three columns of Table 2 contain the results from SFA. The first column contains the

estimated posterior mode, and only the first dimension has a non-zero mode. The next two columns

contain each legislator’s ideal points and the bill estimates. SFA returns estimates of the underlying

structure, correctly recovering the unidimensional structure of the data generating process, and

identifying legislators 1-4 and 7-10 as relative extremists at opposite ends of the spectrum. SFA

also successfully identifies the relatively moderate legislators, 5 and 6, and correctly notes which

proposals will draw support from which legislators.

The rightmost three columns of Table 2 report the topic model estimates, presenting the first four terms of the three fitted topics. Consider the first topic. Legislators who vote “Nay” on votes 1 and 2 are likely to vote “Aye” on votes 4 and 6. Similarly, considering the second topic, legislators who vote “Nay” on votes 4 and 6 are likely to vote “Aye” on votes 2 and 3. All of this is roughly consistent with the spatial model that generated the data, but the topic model does not draw our attention to the unidimensional nature of the data.

We note, first, that the topic model and SFA produced consistent results. In the course of practical modeling, both should be tried even if one is favored. If their results corroborate each other, the researcher should be more confident that some systematic attribute of the data is being discovered. Second, when our data do possess a low dimensional spatial structure, SFA will identify it. In contrast, while the topic model tracks the behavior of the low dimensional data, the number of topics exceeds the number of dimensions. By itself the topic model will not call the researcher’s attention to the underlying dimensionality of the data.

When the researcher suspects the data are organized around a small number of underlying

dimensions, the SFA model provides an efficient vehicle to elucidate the deep structure of her data.

For the researcher interested in summarizing word co-occurrence in a primarily descriptive way,

topic models are an appropriate tool.

Existing methods integrating topic models and scaling. Some works combine vote data and topic models hierarchically (Gerrish and Blei, 2011, 2012; Wang et al., 2013; Lauderdale and Clark, 2014). While the details differ, in each of these works topics are identified independently of voting choices, while vote choices are conditional on topics. Whereas our Sparse Factor Analysis approach uses the Bayesian LASSO to calibrate dimensionality and then jointly estimates the ideological content of terms and vote choices, the models of Gerrish and Blei (2012) and Lauderdale and Clark (2014) estimate topics independently of binary choice outcomes.[10] The number of dimensions in their voting model is then set equal to the number of topics in their text model, and they use the topic weights from their text model to orient each vote in their multidimensional latent space.[11] Gerrish and Blei (2011) and Wang et al. (2013) model the spatial characteristics of proposals as linear in the topic weights associated with each vote. Each of these works treats the allocation of text into topics as exogenous to the spatial model, and each requires voting data to infer legislators’ ideological orientation. In contrast, our estimator recognizes that legislators’ choice of words is itself an act of political volition, and our model can still estimate legislators’ ideological orientation even when only text is available.

[10] Lauderdale and Clark (2014) focus on court rulings, which they treat as “votes”.

[11] Lauderdale and Clark (2014) are explicit about using the topic weights to orient each vote in a multidimensional space, but the preferred outcome adjustments of Gerrish and Blei (2012) have the same effect.

Comparison with additional methods. SFA is related to several existing methods for scaling

votes, scaling text, and combining multiple outcomes in a single factor analytic model. Our formal

model extends the spatial model of Ladha (1991) to term choice, while our statistical model likewise

generalizes the latent probit model of Clinton, Jackman and Rivers (2004) to include count as well

as binary outcomes.

Our model is also related to the text analytic Wordfish model of Slapin and Proksch (2008). However, Wordfish and SFA contend with zero inflation somewhat differently. Whereas Slapin and Proksch exponentiate the systematic propensity to use a word in order to model a non-negative count (see also Elff, 2013; Bonica, 2014), SFA places word use propensities in the context of a z-scale. This has the effect of modeling zero inflation directly, while the estimator is robust to outliers, as described above in Section 3. As an added bonus, SFA also estimates the underlying dimensionality.
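For reference, the exponentiation referred to here can be written out. The display below is the standard Poisson specification of Wordfish (Slapin and Proksch, 2008), in notation introduced only for this comparison: $y_{ij}$ is the count of term $j$ used by actor $i$, $a_i$ and $\psi_j$ are actor and term fixed effects, $\beta_j$ is the term weight, and $\omega_i$ is the actor's position. These symbols are not used elsewhere in this paper, and $a_i$ should not be confused with the weight $\alpha$ of Section 5.2. SFA's z-scale treatment of counts is the one described in Section 3 and is not restated here.

```latex
% Wordfish: a Poisson count whose rate is the exponentiated linear predictor
y_{ij} \sim \mathrm{Poisson}(\lambda_{ij}), \qquad
\lambda_{ij} = \exp\left(a_i + \psi_j + \beta_j \omega_i\right)
```

Because the rate $\lambda_{ij}$ is strictly positive and enters through an exponential, a single Poisson rate governs both the frequency of zeros and the size of large counts.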

Another modeling strategy, the mixed factor analysis model, has been used to combine data of different types. Quinn (2004) converts observed continuous data to a z-scale, and then combines it with ordinal and categorical data on the same scale; for recent extensions, see Murray et al. (2013); Hoff (2007).[12] Unlike these methods, SFA uses a sparsity prior to estimate the underlying dimensionality.

[12] Like SFA, these are all Gaussian copula models. Our method is more powerful than Murray et al. (2013), since we estimate the cut-points.

Other methods have estimated dimensionality (Hahn, Carvalho and Scott, 2012; Heckman and Snyder Jr., 1997). Aldrich, Montgomery and Sparks (2014) show that sufficiently large cross-party variance can mask important within-party dimensions. We differ from these works in combining both vote (binary) and word (count) data.

To illustrate the use and efficacy of SFA, we now turn to an analysis of text and roll call votes

from the contemporary US Senate.

5 Illustrative Application: The US Senate, 1997–2012

In this section, we apply SFA to recent US Senate data. The analysis proceeds in four steps. First, we describe the data and discuss the viability of SFA in this context. Second, we apply the method and present results. Third, we present several tests of internal validity. Fourth, we present a test of external validity by using the legislative model to scale newspaper editorials.

5.1 Data

We apply SFA to eight recent sessions (105th–112th) of the US Senate. Our algorithm returns estimated spatial preferences for legislators and political content for bills, as well as our calibration of the underlying dimensionality. Our data come from two sources. We use the roll call records of VoteView,[13] while our text consists of floor speeches as gleaned by the Sunlight Foundation.[14] We treat all votes other than “Aye” and “Nay” as missing at random. Following standard practice (e.g., Quinn et al., 2010; Grimmer and Stewart, 2013), we stem, eliminate stop words, and model unigrams and bigrams, including data for the entirety of each session. We trim all terms that are not used at least ten times over the course of a given session by each of at least ten people. A complete summary of the data can be found in the supplemental materials.

[13] http://www.voteview.com/. Last accessed October 27, 2014.

[14] http://www.capitolwords.org/. Last accessed October 24, 2014. Code used to stem and eliminate stop words is available upon request.
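The trimming rule can be sketched as follows. This is an illustration of one reading of the rule, in which a term is retained when at least ten distinct speakers each use it at least ten times in a session; it is not the code referred to in the footnote. The function name, the input layout (a dictionary of per-speaker term counts), and the toy data are assumptions made for the example.

```python
from collections import Counter


def trim_vocabulary(counts_by_speaker, min_uses=10, min_speakers=10):
    """Return the set of terms used at least `min_uses` times by each of
    at least `min_speakers` speakers.

    counts_by_speaker maps speaker -> {term: count} for a single session.
    """
    heavy_users = Counter()  # term -> number of speakers meeting min_uses
    for term_counts in counts_by_speaker.values():
        for term, n in term_counts.items():
            if n >= min_uses:
                heavy_users[term] += 1
    return {term for term, k in heavy_users.items() if k >= min_speakers}


# Toy session with ten hypothetical speakers and stemmed terms.
toy = {"senator_%d" % i: {"budget": 12, "dirksen": 3} for i in range(10)}
toy["senator_0"]["filibust"] = 40  # heavy use, but by one speaker only
kept = trim_vocabulary(toy)
```

In the toy session only “budget” survives: “dirksen” is used by everyone but too rarely, and “filibust” is used often but by a single speaker.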


Does it make sense to apply SFA in this context? We note a pervasive consensus among Congress scholars that strategic voting in Congress is very rare (e.g., Poole and Rosenthal, 1997; Wilkerson, 1999; Ladha, 1994); see Groseclose and Milyo (2010) for an extended discussion. Likewise, previous work has found that Congressional floor speeches are expressive rather than deliberative (Hill and Hurley, 2002; Maltzman and Sigelman, 1996). As further indication that floor speeches are vehicles of expression rather than avenues of persuasion, many perorations are not even read aloud, but are simply entered into the record.[15]

5.2 Results

We present three sets of results. Each corresponds to a weight placed by our estimator on the information coming from words rather than votes ($\alpha \in \{0, 1/2, 1\}$). We present results on the estimated number of dimensions associated with each value of $\alpha$.

Scaling results informed only by votes ($\alpha = 0$). We begin with the model with information coming only from votes. This model places an estimated posterior mass of 100% on one dimension for each Senate. Posterior means of the ideal point estimates correlate closely with DW-NOMINATE estimates, with correlations ranging from 0.95 to 0.98 across the eight Senates analyzed here. See Clinton, Jackman and Rivers (2004, Figure 1) for similar results.

Scaling results informed by words and votes ($\alpha = 1/2$). We next move on to the model that gives equal weight to words and to votes. First, we consider the estimated number of dimensions; see Figure 2. The average posterior density over the number of dimensions, pooling all Senates, is in the top left corner, while the successive sessions are depicted from top to bottom and from left to right. A pronounced mode at two dimensions reappears consistently across Senates.

[15] We feel more comfortable applying the method to floor speeches than we would to discourse during conference committee meetings where genuine deliberation might take place.

Figure 2: Posterior density over number of underlying dimensions for the joint word and vote model. We find a pronounced mode at two dimensions consistently across Senates. The average across all Senates appears in the top left corner. Panels: Average and Senates 105–112; x-axis: Dimensions (0–5); y-axis: Mass.

Not only is the finding of two dimensions consistent, but the two dimensions themselves are stable across sessions. The first closely coincides with the standard ideological dimension uncovered from scaling roll call votes. The second appears to be a leadership dimension, with party leaders assembled near one pole while a variegated mix of “rank and file” partisans and ideological moderates populate the opposite end of the spectrum.

Figure 3 presents the log density of term weights, after scaling votes and terms together. The weights are oriented such that terms more likely to be spoken by Republicans are to the right. Each local mode is labeled by the five terms closest to that mode. The left figure contains results from the 108th Senate, a Republican-led session during President George W. Bush’s tenure. The right figure contains results from the 112th Senate, a Democratic-led session during President Barack Obama’s tenure.

Figure 3: Log density of term weights, after scaling votes and terms together (distribution of words along the ideological dimension when scaled with votes). Each local mode is labeled by the five terms closest to that mode. The left figure presents results from the Republican-controlled 108th Senate; the right figure contains results from the Democratic-led 112th Senate. x-axis: Scale; y-axis: Log Density.

We find a consistent pattern: for the majority party, the most extreme terms relate to parliamentary control (“consent committee,” “author meet,” “meet session”). For the minority party, the first dimension identifies ideologically relevant terms. For the Democrats during the 108th Senate, these terms included “administr,” as the Democrats turned their ire upon the Bush administration, and “health,” a centerpiece of the Democratic policy agenda. In the 112th Senate, with the Democrats in the majority, parliamentary control terms switched their ideological polarity, aligning with the Democrats (“author meet,” “meet session,” “consent committee”). The Republican end of this first dimension reflects that party’s programmatic concerns over fiscal balance (“budget,” “stimulus,” “debt,” “trillion”).

Figure 4: Latent dimensions estimated by SFA, 112th Senate. Legislators’ preferred outcomes on the first dimension (x-axis) and the second (y-axis). The left plot (“Policy and Leadership Dimensions”) labels party leaders, whips, and the chairs of major committees: Reid, Durbin, Schumer, McConnell, Kyl, Alexander, and Thune. In the right plot (“Identifying Moderates with Cutting Lines”), cutting lines separate frequent from infrequent users of the terms “Boehner,” “student loan,” and “fiscal cliff”; the labeled legislators are Boozman, Cochran, Crapo, Ensign, Heller, Kirk, Lugar, and Shelby.

Next, we look at the preferred outcomes of legislators from the 112th Senate. Points in Figure 4 are shaded in proportion to their first-dimension DW-NOMINATE score, showing the agreement between SFA and DW-NOMINATE on the first dimension ($\hat{\rho} \approx 0.95$). The first dimension captures the political battle lines, reflecting legislators’ left-versus-right policy differences. The second, vertical, dimension reflects differences between the terms selected by leaders and those selected by the rank and file. Leaders appear lower down in the lefthand panel of Figure 4 (see the labels corresponding to the names of party leaders, whips, and major committee chairs), while the preferred outcomes of rank and file members span the upper sector of the diagram.


The right plot of Figure 4 contains cutting lines for three terms: “Boehner,” “student loan,” and “fiscal cliff.” The lines were constructed such that legislators on one side are expected to make above-median use of the term, while legislators on the other side are expected to utter the term at below its median frequency. We find leaders are more likely to use the term “Boehner,” the name of the House Speaker during this session. Republicans were more likely to use the term “fiscal cliff,” with leaders the most likely. Democrats were more likely to utter the phrase “student loan,” again with leaders the most likely to employ the term. SFA identifies a group of Republican moderates in the “V” shaped region at the upper center of the panel. Here we label them by name. These moderates are not likely to use either “student loan” or “fiscal cliff,” nor are they likely to invoke the name “Boehner.”
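To make the geometry of a cutting line concrete, the sketch below assumes, purely for illustration, that a term's expected use is linear in a legislator's two preferred-outcome coordinates, so that the cutting line is the set of points where this linear score equals the term's median level. The linear form, the function names, and all numerical values are hypothetical and are not estimates from our model.

```python
def side_of_cutting_line(x, loadings, intercept, threshold):
    """Return +1 if the linear score is above the threshold, -1 if below,
    and 0 if the point lies exactly on the cutting line."""
    score = intercept + sum(a * xi for a, xi in zip(loadings, x))
    return (score > threshold) - (score < threshold)


def cutting_line_x2(x1, loadings, intercept, threshold):
    """Solve intercept + a1 * x1 + a2 * x2 = threshold for x2 (needs a2 != 0)."""
    a1, a2 = loadings
    return (threshold - intercept - a1 * x1) / a2


# Hypothetical term that loads only on the second (leadership) dimension,
# with heavier use lower down in the plot.
loadings, intercept, median_level = (0.0, -1.0), 0.0, 1.0
leader = side_of_cutting_line((0.5, -2.0), loadings, intercept, median_level)
rank_and_file = side_of_cutting_line((0.5, 1.0), loadings, intercept, median_level)
```

With these made-up values the cutting line is horizontal at a second-dimension score of -1: the point at (0.5, -2.0) falls on the above-median side and the point at (0.5, 1.0) on the below-median side.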

Scaling results informed by only words ($\alpha = 1$). We also apply SFA using only information from words. This is not our preferred model, as it ignores vote data, yet SFA still uncovers structure in the text data. The posterior density of estimated dimensionality for pooled floor speeches can be found in Figure 5. Results across all sessions are in the top left corner while the remaining sessions follow in order from top to bottom and from left to right. In contrast with the high concentration of probability on two dimensions in our preferred model, when we exclude the valuable information contained in votes and analyze oratory alone, we obtain a somewhat more diffuse density that accords a 75% probability to there being between five and eight dimensions, and a probability of over 95% that the underlying dimensionality is within the range $[4, 11]$. Looking at individual sessions, we find a similar dimensionality, albeit with some year-to-year variation.
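Statements of this kind are sums of posterior mass over a range of dimension counts. The sketch below shows the computation on a list of posterior draws of the number of dimensions; the function names are our own, and the draws are made-up values for illustration rather than output from our sampler.

```python
from collections import Counter


def posterior_mass(draws):
    """Share of posterior draws at each dimension count."""
    counts = Counter(draws)
    return {k: counts[k] / len(draws) for k in sorted(counts)}


def prob_in_range(draws, lo, hi):
    """Posterior probability that the dimensionality lies in [lo, hi]."""
    return sum(lo <= d <= hi for d in draws) / len(draws)


# Hypothetical draws of the number of dimensions (not results from the paper).
draws = [5, 5, 5, 6, 6, 6, 7, 7, 4, 9]
mass = posterior_mass(draws)
p_5_to_8 = prob_in_range(draws, 5, 8)
p_4_to_11 = prob_in_range(draws, 4, 11)
```

For these ten toy draws, eight fall between five and eight dimensions and all ten fall within $[4, 11]$.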

Figure 6 contains the top ten terms associated with each of the first six dimensions of the

112th Senate.16 We note that the positive and negative level distinction along the y-axis is wholly

16 Results from the 105th–111th Senates are available with the supplemental materials.

24

Page 26: Estimating Preferred Outcomes from Votes and Text · Estimating Preferred Outcomes from Votes and Text May 4, 2017 Abstract We introduce a framework that simultaneously encompasses

●●

● ●

●● ● ● ● ● ● ● ● ●0.

00.

10.

20.

30.

40.

5

Average

3 5 7 9 11 13 15 17 19

● ●●

● ●

● ● ● ● ● ● ●0.0

0.1

0.2

0.3

0.4

0.5

3 5 7 9 11 13 15 17 19

Senate 105

● ●

● ● ●

●●

●●

● ● ● ● ● ●0.0

0.1

0.2

0.3

0.4

0.5

3 5 7 9 11 13 15 17 19

Senate 106

● ●

●●

●● ● ● ● ● ● ● ● ●0.

00.

10.

20.

30.

40.

5

3 5 7 9 11 13 15 17 19

Senate 107

● ●●

● ●●

● ● ● ● ● ● ● ● ●0.0

0.1

0.2

0.3

0.4

0.5

3 5 7 9 11 13 15 17 19

Senate 108

● ●

●●

● ●

● ● ● ● ● ● ● ● ● ●0.0

0.1

0.2

0.3

0.4

0.5

3 5 7 9 11 13 15 17 19

Senate 109

● ● ● ● ● ● ● ● ● ● ●0.0

0.1

0.2

0.3

0.4

0.5

3 5 7 9 11 13 15 17 19

Senate 110

●●

● ● ● ● ● ● ● ● ● ● ●0.0

0.1

0.2

0.3

0.4

0.5

3 5 7 9 11 13 15 17 19

Senate 111

●●

●●

● ●● ● ● ● ● ● ● ●0.

00.

10.

20.

30.

40.

5

3 5 7 9 11 13 15 17 19

Senate 112

Mas

sM

ass

Mas

s

Dimensions Dimensions Dimensions

Figure 5: Estimated underlying dimensionality for Senate floor speeches. Results acrossall sessions are in the top left corner and remaining sessions follow.

arbitrary, as we only identify term levels up to a sign. Looking at the first column, we find that the

first dimension starts with a set of non-controversial terms. These include parliamentary procedural

terms (as opposed to parliamentary control terms) such as today wish, madam rise, and colleague

support. Also on the non-controversial side are martial terms with universally positive affect during

this Congress such as army, air forc, and deploy. On the other side are word stems that will be used

to differentiate issues in other dimensions, such as tax, vote, and peopl. The other dimensions

have at their extremes words connoting some underlying dimension of policy. For example, the

second dimension ranges from judiciary and women’s issues at one end to fiscal concerns at the

other; the fourth goes from a broad set of social welfare concerns to the consideration of judicial

nominees. The dimensions adapt to the issues of the day. Tobacco, for example, is present in the

105th Senate; Iraq comes and goes as an issue, and health care goes from dealing with seniors and

Medicare in the 107th Senate to dealing with students and families in the 112th.

[Figure 6 image not reproduced. Panels: Dimension 1 through Dimension 6, each listing words with positive level in the upper part and words with negative level in the lower part.]

Figure 6: Extreme Terms by Dimension, 112th Senate. Extreme terms for the first six dimensions as estimated by SFA from the 112th Senate. The type size of each term is proportional to the absolute value of the associated coefficient; terms earning positive coefficients appear in the upper part of the panel, those assigned negative coefficients are presented in the lower segment.
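The extreme-term lists shown in Figure 6 can be read off a matrix of term coefficients by sorting each column. The following is a minimal sketch of that bookkeeping only, not the authors' code: the function name `extreme_terms`, the toy vocabulary, and the coefficient values are all assumptions made for illustration.

```python
import numpy as np

def extreme_terms(loadings, vocab, k=3):
    """Return the k most positive and k most negative terms per dimension.

    loadings: (n_terms, n_dims) array of term coefficients (hypothetical input).
    vocab: list of n_terms term stems, aligned with the rows of loadings.
    """
    loadings = np.asarray(loadings, dtype=float)
    result = []
    for d in range(loadings.shape[1]):
        order = np.argsort(loadings[:, d])               # ascending by coefficient
        negative = [vocab[i] for i in order[:k]]         # most negative first
        positive = [vocab[i] for i in order[::-1][:k]]   # most positive first
        result.append({"positive": positive, "negative": negative})
    return result

# Toy coefficients, invented purely for illustration.
vocab = ["army", "deploy", "tax", "vote", "court", "debt"]
loadings = np.array([
    [0.9, 0.0],
    [0.7, 0.1],
    [-0.8, -0.6],
    [-0.5, 0.0],
    [0.0, 0.8],
    [-0.1, -0.9],
])
top = extreme_terms(loadings, vocab, k=2)
for d, terms in enumerate(top, start=1):
    print(f"Dimension {d}: + {terms['positive']}  - {terms['negative']}")
```

A plotting routine could then scale each term's type size by the absolute value of its coefficient, as the figure caption describes.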

Even without including votes in our analysis, SFA selects a relatively parsimonious and informative representation of the Senate.

[Figure 7 image not reproduced. Overall title: Estimating Ideology when Ten Legislators' Vote Data is Missing Completely at Random. Panels: First Dimension (left) and Second Dimension (right). Horizontal axis: Legislator Score (All Data); vertical axis: Legislator Score (Missing Data). Legend: X = Missing Legislators, ● = Observed Legislators.]

Figure 7: Estimated Ideal Points for Ten Legislators Missing at Random. The lefthand panel compares the censored and uncensored estimates (marked by X’s) of the preferred outcomes for the ten randomly censored legislators on the first dimension, while the righthand panel makes the analogous comparison for dimension two.

5.3 Internal Validity

We turn now to assessing SFA’s internal validity in the US Senate data.

Imputing estimates for legislators missing completely at random. First, we

discard the votes cast by ten legislators selected completely at random, coding all of their votes as

“missing,” while we maintain all of their speech data. The left and right panels of Figure 7 plot

the imputed versus fitted values (“X”) for the dropped legislators, for the first (left) and second

(right) dimension. SFA recovers first-dimension preferred outcomes well, except for some

expected attenuation bias. The second dimension ideal points are recovered almost exactly. We

remind the reader that the first SFA dimension coincides closely with the dimension that emerges

from an analysis of the votes alone, and so we might expect it to be more affected by the loss of

voting data, while the accuracy of our second dimension estimates, which are dominated by speech

data, would be expected to suffer less when we censor the votes.

[Figure 8 image not reproduced. Overall title: Estimating Ideology when Only Leaders' Votes are Informative. Left panel (Ideology Dimension): horizontal axis First Dimension (All Data), vertical axis Second Dimension (Imputed). Right panel (Leadership Dimension): horizontal axis Second Dimension (All Data), vertical axis First Dimension (Imputed). Points are labeled D or R; legend: Republicans, Non-Republicans.]

Figure 8: Estimated Ideology when Only Leaders’ Votes are Informative. The voting dimension estimates appear in the left panel, with the censored estimates measured on the vertical (y) axis while the uncensored ones appear on the horizontal (x) axis. In the censored data the salience of the voting dimension drops, so that it becomes the second dimension. The righthand panel exhibits the leadership dimension; again, the censored estimates correspond with the vertical (y) axis and the uncensored ones with the horizontal (x) axis.
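For the random-censoring check behind Figure 7, the data preparation amounts to blanking out whole rows of the vote matrix while leaving the speech data alone. The sketch below shows one way to do that, assuming votes are stored as a legislators-by-roll-calls array with NaN marking a missing vote; the function name `censor_votes_mcar`, the seed, and the toy data are hypothetical and are not taken from the paper.

```python
import numpy as np

def censor_votes_mcar(votes, n_censor, seed=0):
    """Blank out every vote for n_censor legislators chosen uniformly at random.

    votes: (n_legislators, n_rollcalls) array of 0/1 votes.
    Returns the censored copy and the sorted indices of the censored rows.
    """
    rng = np.random.default_rng(seed)
    censored = np.array(votes, dtype=float)   # copy; float so NaN is allowed
    rows = np.sort(rng.choice(censored.shape[0], size=n_censor, replace=False))
    censored[rows, :] = np.nan                # votes dropped; speech is untouched
    return censored, rows

# Toy data: 100 legislators, 50 roll calls (random, for illustration only).
toy_rng = np.random.default_rng(1)
votes = toy_rng.integers(0, 2, size=(100, 50))
censored, rows = censor_votes_mcar(votes, n_censor=10)
print("censored legislators:", rows)
```

The model would then be fit once to `votes` and once to `censored`, with the word counts identical in both runs, and the two sets of estimates compared for the rows in `rows`.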

Imputing estimates for members given only votes from leadership. We next offer a more

challenging test of internal validity. For this analysis, we coded all vote data except for the party

leaders and whips as missing, while maintaining all speech data. This left a vote record for less

than 4% of the Senate. We then compared the SFA ideal point estimates to the SFA estimates


using everyone’s speech, but only leaders’ votes. When we estimate the censored data we again

recover two dimensions, but their order is reversed, with the voting dimension becoming noisier,

and falling into second place, while the leadership dimension, the evidence for which comes almost

entirely through legislative speech, earns the higher dimension weight (see Figure 8). The left panel

of the figure compares estimates for the voting dimension, which is the second dimension estimated

using the heavily censored data (plotted along the vertical y-axis) while it corresponds with the first

dimension of the uncensored estimates (graphed relative to the horizontal x-axis). Observations are

labeled by party, and leaders’ locations are in bold and circled. As one would expect, with less than

1/25th of the voting data, recovery of the first dimension is far from perfect, but remarkably the

imputed scores correlate highly, at more than 0.85, with the estimates based on the full data set.

The right hand panel compares estimates for the “leadership” dimension, which coincides with the

first dimension based on the censored data, but with the second dimension based on the uncensored

data set. In contrast with the voting dimension, the censored estimates correspond closely with

their uncensored counterparts. Of course, the “leadership” dimension is driven mostly by words,

and we did not censor those.
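Because the two dimensions swap order in the censored fit, and because the sign of a latent dimension is arbitrary, comparing censored and uncensored estimates requires matching dimensions before computing correlations. The sketch below illustrates the leaders-only censoring and one simple matching rule (largest absolute Pearson correlation). It is an assumption-laden illustration rather than the authors' procedure: the function names, the leader row indices, and the simulated estimates are all hypothetical.

```python
import numpy as np

def keep_only_leader_votes(votes, leader_rows):
    """Code every non-leader's votes as missing (NaN); leaders' votes stay intact."""
    censored = np.full(np.shape(votes), np.nan)
    censored[leader_rows, :] = np.asarray(votes, dtype=float)[leader_rows, :]
    return censored

def match_dimensions(full, censored):
    """For each full-data dimension, find the censored-data dimension with the
    largest absolute Pearson correlation; return (index, correlation) pairs."""
    full = np.asarray(full, dtype=float)
    censored = np.asarray(censored, dtype=float)
    k = full.shape[1]
    corr = np.corrcoef(full.T, censored.T)[:k, k:]   # cross-correlation block
    best = np.abs(corr).argmax(axis=1)
    return [(int(j), float(corr[i, j])) for i, j in enumerate(best)]

# Toy vote matrix: 100 legislators, of whom rows 0-2 stand in for the leaders.
rng = np.random.default_rng(0)
votes = rng.integers(0, 2, size=(100, 40))
leaders_only = keep_only_leader_votes(votes, leader_rows=[0, 1, 2])
share_with_votes = np.mean(~np.isnan(leaders_only).all(axis=1))

# Simulated estimates: the censored fit swaps the dimensions and flips one sign.
full = rng.normal(size=(100, 2))
noise = 0.3 * rng.normal(size=(100, 2))
censored_est = np.column_stack([-full[:, 1], full[:, 0]]) + noise
pairs = match_dimensions(full, censored_est)
print(share_with_votes, pairs)
```

A negative correlation in a matched pair only signals that the censored dimension is oriented the other way; its magnitude is what speaks to recovery.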

While this last exercise may seem a stunt, we note that in heavily whipped parliaments most

legislators vote their parties, rather than their preferences (e.g., Kellerman, 2012), yet they still

give speeches. In such settings we might use SFA to “bridge” between speeches actually given by

members of a parliament and the votes that they would have cast had they not been “whipped,”

anchoring the exercise by treating the votes of party principals as a genuine reflection of the leaders’

preferences, while analyzing the backbenchers as if their votes, but not their words, were missing.


[Figure 9 image not reproduced. Title: Newspapers Placed on the Same Scale as Legislators. Horizontal axis: First Scaled Dimension; vertical axis: Density. Density curves are labeled Democrats and Republicans; newspaper positions are labeled NYT, Wash Post, and WSJ.]

Figure 9: Scaling newspaper editorials given only their text. This figure presents the relative locations and differences between the ideal points for legislators and newspapers.

5.4 External Validity

So far, we have used SFA to impute the preferences of legislators based on the contemporaneous

behavior of their legislative colleagues. We now turn to the more challenging step of estimating the

preferences of non-legislative actors, namely the authors of newspaper editorials.

We apply SFA to word count data from unsigned editorials published during the two years

that the 112th Congress was in session in the New York Times, the Wall Street Journal, and the


Washington Post, using the same terms we employed in our analysis of the Senate. As above, we

combine the word counts of these editorials with the Senate data, treating the editorials as the

speech of legislators whose voting records are missing.
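Treating editorials as the speech of non-voting legislators is, at the data level, a matter of appending rows: word counts over the same term columns, and an all-missing voting record. The sketch below shows this stacking under the assumption that votes and word counts are held in two aligned arrays; `stack_nonvoters` and the toy arrays are hypothetical names and values, not part of the paper.

```python
import numpy as np

def stack_nonvoters(votes, senate_words, extra_words):
    """Append non-voting speakers to the Senate data.

    votes: (n_senators, n_rollcalls); senate_words: (n_senators, n_terms);
    extra_words: (n_extra, n_terms) counts over the SAME term columns.
    The added rows get all-NaN votes, i.e. they are treated as legislators
    whose voting records are missing.
    """
    votes = np.asarray(votes, dtype=float)
    senate_words = np.asarray(senate_words)
    extra_words = np.asarray(extra_words)
    if extra_words.shape[1] != senate_words.shape[1]:
        raise ValueError("editorial counts must use the Senate vocabulary")
    missing = np.full((extra_words.shape[0], votes.shape[1]), np.nan)
    return np.vstack([votes, missing]), np.vstack([senate_words, extra_words])

# Toy example: 3 senators, 4 roll calls, 5 terms, 2 editorial pages.
votes = np.array([[1, 0, 1, 1], [0, 0, 1, 0], [1, 1, 0, 1]])
senate_words = np.array([[2, 0, 1, 0, 3], [0, 4, 0, 1, 0], [1, 1, 0, 0, 2]])
editorials = np.array([[0, 2, 1, 0, 0], [3, 0, 0, 1, 1]])
all_votes, all_words = stack_nonvoters(votes, senate_words, editorials)
print(all_votes.shape, all_words.shape)
```

Restricting the editorial counts to the Senate vocabulary mirrors the statement above that the same terms are used in both analyses.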

As the term data come from different venues, the Senate floor versus the editorial page, the

exercise is one of “out of sample prediction.” This leaves us with the question of whether the

political meanings of the terms of discourse are the same in both venues. As a first approach

to this issue, we treat the ideal points for both groups as coming from a mean-zero distribution.

Results appear in Figure 9. We orient the dimension so that the Republicans have a positive value.

The densities for the Republican and Democratic Senators are in the background, and the voting

dimension legislator preferred outcomes are plotted as hatch marks along the x-axis. The results

are largely as expected. If we treat the three sets of editorial boards as legislators who do not vote,

we find the Wall Street Journal (WSJ) to the right of the Washington Post (Wash Post) and

the New York Times (NYT) to its left. The distance between the Wall Street Journal and the

Washington Post is about half the estimated distance between the New York Times and the Post.

Of course, these estimates may be somewhat attenuated, as were our imputed positions for the ten

Senators we randomly censored as part of our internal validity check.

6 Conclusion

We propose a method, Sparse Factor Analysis, for combining votes and text data in a single scaling

procedure. The method models both word choice and vote choice in terms of the same spatial

preferences. We furthermore develop a statistical framework that allows us to estimate both

individuals’ most preferred outcomes and the underlying dimensionality of the joint word-vote space.

The resulting framework links the choice-theoretic models of vote and word choice. This tight

connection permits the extension of SFA to more complex decision scenarios (e.g., Clinton and

Meirowitz, 2003). SFA enables the analyst to estimate the underlying number of latent dimensions, rather

than having to impose dimensionality a priori.

Substantively, we analyze legislative speech and roll call voting from eight recent sessions of the

US Senate. Combining both data sources reveals a consistent picture of a two-dimensional Senate,

with a first ideological dimension coinciding with the dimension that emerges when votes alone are

analyzed, while a second procedural dimension distinguishes leaders of both parties from the rank

and file.

While SFA is designed to analyze individuals who both speak and cast votes, it allows us to

attribute policy preferences to non-voting political speakers, a potential we illustrated for the case

of newspaper editorial boards. This may prove useful in confronting the perennial research problem

of imputing the preferred policy outcomes of legislative candidates. While analysts can infer the

ideology of victorious candidates from their subsequent congressional conduct, as they can infer

the leanings of defeated incumbents from their previous voting records, measuring the preferences

of defeated challengers has proven to be a more elusive goal. Yet every challenger spends time

and energy generating political speech. SFA offers the possibility of imputing the policies such a

candidate would have pursued had he been elected.

We hope the approach in this paper also finds purchase beyond the US Congress. For example,

in strong party systems where votes are relatively uninformative, words may be used to help clarify

the within-party divergence in ideal points. We are currently exploring applications of the method

in situations where voting is not perfectly reflective of underlying individual preference or where

ideal points are allowed to evolve over time.


References

Albert, James H. and Siddhartha Chib. 1993. “Bayesian Analysis of Binary and Polychotomous

Response Data.” Journal of the American Statistical Association 88:669–679.

Aldrich, John, Jacob Montgomery and David Sparks. 2014. “Polarization and Ideology: Partisan

Sources of Low Dimensionality in Scaled Roll Call Analyses.” Political Analysis.

Bafumi, Joseph and Michael Herron. 2010. “Leapfrog Representation and Extremism: A Study

of American Voters and Their Members in Congress.” American Political Science Review

104(3):519–542.

Barbera, Pablo. 2015. “Birds of the Same Feather Tweet Together. Bayesian Ideal Point Estimation

Using Twitter Data.” Political Analysis 23(1):76–91.

Blei, David M., Andrew Y. Ng and Michael I. Jordan. 2003. “Latent Dirichlet Allocation.” Journal

of Machine Learning Research 3:993–1022.

Bonica, Adam. 2014. “Mapping the Ideological Marketplace.” American Journal of Political Science

58(2):367–386.

Clinton, Joshua and Adam Meirowitz. 2003. “Integrating Voting Theory and Roll Call Analysis: A

Framework.” Political Analysis 11:381–396.

Clinton, Joshua, Simon Jackman and Douglas Rivers. 2004. “The Statistical Analysis of Roll Call

Data.” American Political Science Review 98:355–370.

Elff, Martin. 2013. “A Dynamic State-Space Model of Coded Political Texts.” Political Analysis

21:217–232.


Gentzkow, Matthew and Jesse M. Shapiro. 2010. “What Drives Media Slant? Evidence From U.S.

Daily Newspapers.” Econometrica 78(1):35–71.

Gerrish, Sean and David Blei. 2011. “Predicting Legislative Roll Calls from Text.” Proceedings of

the 28th International Conference on Machine Learning.

Gerrish, Sean and David M. Blei. 2012. How They Vote: Issue-Adjusted Models of Legislative

Behavior. In Advances in Neural Information Processing Systems 25, ed. F. Pereira, C.J.C. Burges,

L. Bottou and K.Q. Weinberger. Curran Associates, Inc. pp. 2753–2761.

Grimmer, Justin. 2010. “A Bayesian Hierarchical Topic Model for Political Texts: Measuring

Expressed Agendas in Senate Press Releases.” Political Analysis 18(1):1–35.

Grimmer, Justin and Brandon Stewart. 2013. “Text as Data: The Promise and Pitfalls of Automatic

Content Analysis Methods for Political Texts.” Political Analysis 21(3):267–297.

Groseclose, Tim and Jeffrey Milyo. 2010. “Sincere Versus Sophisticated Voting in Congress: Theory

and Evidence.” Journal of Politics 72(1):60–73.

Hahn, P. Richard, Carlos M. Carvalho and James G. Scott. 2012. “A Sparse Factor Analytic Probit

Model for Congressional Voting Patterns.” Journal of the Royal Statistical Society, Series C

61(4):619–635.

Heckman, James J. and James M. Snyder Jr. 1997. “Linear Probability Models of the Demand

for Attributes with an Empirical Application to Estimating the Preferences of Legislators.” The

RAND Journal of Economics 28:S142–S189.

Hill, Kim Quaile and Patricia A. Hurley. 2002. “Symbolic Speeches in the U.S. Senate and Their

Representational Implications.” The Journal of Politics 64(1):219–231.


Ho, Daniel and Kevin Quinn. 2008. “Measuring Explicit Political Positions of Media.” Quarterly

Journal of Political Science 3:353–377.

Hoff, Peter D. 2007. “Extending the Rank Likelihood for Semiparametric Copula Estimation.” The

Annals of Applied Statistics 1(1):265–283.

Jackman, Simon. 2009. Bayesian Analysis for the Social Sciences. Chichester, U.K.: Wiley.

Kellerman, Michael. 2012. “Estimating Ideal Points in the British House of Commons Using Early

Day Motions.” American Journal of Political Science 56(3):757–771.

Ladha, Krishna. 1991. “A Spatial Model of Legislative Voting with Perceptual Error.” Public Choice

68:151–74.

Ladha, Krishna. 1994. “Coalitions in Congressional Voting.” Public Choice 78:43–64.

Lauderdale, Benjamin and Tom Clark. 2014. “Scaling Politically Meaningful Dimensions Using

Texts and Votes.” American Journal of Political Science 58:754–71.

Laver, Michael, Kenneth Benoit and John Garry. 2003. “Extracting Policy Positions from Political

Text Using Words as Data.” American Political Science Review 97(2):311–331.

Lijphart, Arend. 1999. Patterns of Democracy. New Haven: Yale University Press.

Lo, James, Sven-Oliver Proksch and Jonathan B. Slapin. 2014. “Ideological Clarity in Multiparty

Competition: A New Measure and Test Using Election Manifestos.” British Journal of Political

Science 46:591–610.

Maltzman, Forrest and Lee Sigelman. 1996. “The Politics of Talk: Unconstrained Floor Time in

the U.S. House of Representatives.” The Journal of Politics 58(3):819–30.


Murray, Jared S., David B. Dunson, Lawrence Carin and Joseph E. Lucas. 2013. “Bayesian Gaussian

Copula Factor Models for Mixed Data.” Journal of the American Statistical Association 108(502).

Neal, Radford. 2011. MCMC Using Hamiltonian Dynamics. In Handbook of Markov Chain Monte

Carlo, ed. Steve Brooks, Andrew Gelman, Galin Jones and Xiao-Li Meng. Vol. 2 of CRC Handbooks

of Modern Statistical Methods. Chapman and Hall pp. 113–162.

Park, Trevor and George Casella. 2008. “The Bayesian Lasso.” Journal of the American Statistical

Association 103(482):681–686.

Poole, Keith and Howard Rosenthal. 1997. Congress: A Political Economic History of Roll Call

Voting. New York: Oxford University Press.

Poole, Keith T. and Howard Rosenthal. 1985. “A Spatial Model for Legislative Roll Call Analysis.”

American Journal of Political Science 29:357–84.

Quinn, Kevin M. 2004. “Bayesian Factor Analysis for Mixed Ordinal and Continuous Responses.”

Political Analysis 12(4):338–353.

Quinn, Kevin M., Burt L. Monroe, Michael Colaresi, Michael H. Crespin and Dragomir R. Radev.

2010. “How to Analyze Political Attention with Minimal Assumptions and Costs.” American

Journal of Political Science 54(1):209–228.

Roberts, Molly, Brandon Stewart and Dustin Tingley. 2015. Navigating the Local Modes of Big

Data: The Case of Topic Models. In Computational Social Science: Discovery and Prediction,

ed. R. Michael Alvarez. Cambridge.

Roberts, Molly, Brandon Stewart, Dustin Tingley, Christopher Lucas, Jetson Leder-Luis, Shana


Gadarian, Bethany Albertson and David Rand. 2014. “Structural Topic Models for Open Ended

Survey Responses.” American Journal of Political Science 58:1064–1082.

Slapin, Jonathan B. and Sven-Oliver Proksch. 2008. “A Scaling Model for Estimating Time Series

Party Positions from Texts.” American Journal of Political Science 52(3):705–722.

Tibshirani, Robert. 1996. “Regression Shrinkage and Selection via the Lasso.” Journal of the Royal

Statistical Society. Series B (Methodological) 58(1):267–288.

Wang, Eric, Esther Salazar, David Dunson and Lawrence Carin. 2013. “Spatio-Temporal Modeling

of Legislation and Votes.” Bayesian Analysis 8(1):233–268.

Wilkerson, John D. 1999. “‘Killer’ Amendments in Congress.” American Political Science Review

93:535–52.