Curious case of Rotten Tomatoes - Effects of quality signalling in the US
domestic motion picture market
Master’s Thesis 15 credits
Department of Business Studies
Uppsala University
Spring Semester of 2018
Date of Submission: 2018-06-01
Deniss Dobrovolskis
Supervisor: Niklas Bomark
ABSTRACT
Quality signalling in motion picture markets is hardly a new topic. It has been covered by many
researchers over the years. However, most of the previous studies focused on quality signals in
interactions between moviemakers and moviegoers. This study employs a more holistic
approach as the author attempts to evaluate effects of quality signals throughout different stages
of a movie’s life cycle. The author has identified three audiences to which movies are presented,
and each audience generates a quality signal for the next one. Based on the
feedback from test audiences, moviemakers decide when to show movies to professional
critics and when to allow them to publish their reviews. These timelines
become quality signals for the professional critics, who interpret a shorter time slot for review
publication as a signal of low movie quality, and vice versa. Professional critics write
reviews which, when published on review aggregators, become quality signals for
moviegoers. Reviews generated by the initial moviegoers are in turn interpreted by moviegoers
who intend to watch the movies at a later stage.
All three assumptions are operationalised and evaluated in a series of linear regression tests
on a sample containing 130 of the 134 movies widely released in the US and
Canada domestic market in 2017. All of the abovementioned quality signals were found to be
significant, as they could explain at least 40% of the variance of the respective response variables.
Key words: Rotten Tomatoes; box office success; judgement devices; quality signals; reviews;
motion picture market
Master’s thesis,
SAOE,
Spring term 2018, 15 credits
INTRODUCTION
The nature of markets and actor behaviour have been topics of interest for researchers in economic
sociology for quite some time (Akerlof, 1970; Aspers, 2009; Beckert and Rossel, 2013;
MacKenzie, 2006; White, 1981; Zuckerman, 1999). In his influential paper “Where do markets
come from?”, White (1981) attempted to describe the behaviour of a firm in a production market.
The author concluded that firms observe each other and take decisions
based on the behaviour of other firms. White (1981; 2002, pp. 32, 38) argued that although
firms do observe buyers’ behaviour to some degree, it is close to impossible to
anticipate the preferences of each individual buyer, so firms treat buyers as price takers who can
only accept an offer from the firms. However, the firms still face a fundamental challenge that
is present in the markets: asymmetry. This means that the producers know much more
about their products than the buyers who might consume them.
The topic of market asymmetry is, therefore, one of the main areas of interest for researchers who
operate on the border of economics and sociology and analyse the challenges of communicating product
quality from sellers to buyers (Akerlof, 1970; Aspers, 2012; Beckert and Musselin, 2013;
Beckert and Rossel, 2013; White, 1981; Zuckerman, 1999). The motion picture market, in
particular, seems to be very compelling when it comes to analysing the interaction between
producers and consumers (Hsu, 2006; MacKenzie, 2006; Zuckerman, 2003; Zuckerman et al.,
2003). One of the aspects that makes this market so attractive for researchers is that it is rather
transparent (Brown et al., 2012). The Motion Picture Association of America (MPAA, 2013),
an industry organization for content creators in the motion picture, home video,
and television markets of the US and Canada, publishes annual reports with a comprehensive
analysis of the theatrical and home entertainment market environments. These reports contain
viewership numbers, financial statistics and major trends in the industry. For the
purposes of this paper, I will use the latest available report, which is the one for 2017
(MPAA, 2018).
And yet, despite such transparency, the motion picture market is still asymmetrical due to the nature
of the product that is traded in it. Motion pictures are intangible or experience goods,
and the quality of such goods is not known until after consumption (Klein, 1998). Moviegoers
who are interested in watching quality movies and avoiding low-quality ones actively
seek out available quality signals (Bharadwaj et al., 2017). Therefore, moviegoers consult
various popular web resources for quality signals (Kim et al., 2013), such as IMDb and Box
Office Mojo, where they can learn about cast, production crew, budget and awards, and even get
some relevant information about the movies from both professional
critics’ and regular moviegoers’ ratings (Box Office Mojo, 2018a; IMDb, 2018a; Metacritic,
2018).
Interaction between audiences and critics (Goff et al., 2016; Zuckerman, 2003, 2000, 1999a)
and effects of word of mouth (Bharadwaj et al., 2017; Goldenberg et al., 2001; Hsu et al., 2009;
Kim et al., 2013; McKenzie, 2009; Palsson et al., 2013; Rasmussen et al., 2010; Ye et al., 2009;
Zhang et al., 2010) have been widely covered in the academic literature. One common motif that
characterizes most of the previous studies is that they used different sources to
analyse the interaction between critics and audiences and how this interaction affects the
success of movies at the box office. However, as social media has become more and more
widespread, some media actors have become more influential than others. A very good example of
such an influential web resource is Rotten Tomatoes, which acts as an aggregator of critical
reviews (Rotten Tomatoes, 2018a). According to the LA Times, 36% of US moviegoers
consulted Rotten Tomatoes in 2017 before deciding which movie to watch at the
cinema (Faughnder, 2017). Much as another internet resource, YouTube, became
institutionalized (Kim, 2012), internet movie review sites evolved from audience-generated
reviews (IMDb, 2018a) to reviews generated by professional critics (Rotten
Tomatoes, 2018a). Rotten Tomatoes is also recognized by film studios as a legitimate channel
for providing the audiences with information about their products (Cavna, 2017a, 2017b; Fritz,
2016). Recently, Rotten Tomatoes ratings have been used by academics as a legitimate source
of review valence and volume (Bharadwaj et al., 2017; Goff et al., 2016; Kim et al., 2013).
As I have already shown, Rotten Tomatoes is a recognized platform for signalling product
quality to audiences in a very asymmetrical and mediated market. Therefore, in this thesis,
I would like to take the logical next step and use the Rotten Tomatoes rating system as a proxy for
aggregated critical reviews and word of mouth generated by moviegoers. The purpose of this
study is, firstly, to explore what techniques moviemakers use to signal quality to audiences.
Secondly, I will employ statistical analysis to understand how effective these techniques are. In
order to fulfil the purpose of this study, I will start by setting a theoretical foundation on
quality signalling. Thereafter, I will establish the context of the study by presenting results of
previous research on quality signalling in the motion picture market and some empirical data
describing the state of the US and Canada domestic motion picture market in 2017. Once the
theoretical foundation is set and the context is presented, I will describe the methodology I
will use to analyse the data gathered during the course of the study. This section will be followed
by a description of the results. The paper concludes with a section where I present my
conclusions and the implications for academia and business practice.
LITERATURE REVIEW
Producers and product quality
According to White (1981), firms in the market can be grouped by the qualities of their products
as perceived by consumers. This perception plays an important role because,
according to White (1981), producers differ from one another in how their
products are appreciated by consumers. However, appreciation by consumers is (or, probably more
precisely, was in the 1980s) hard to quantify; therefore, firms do not act on the market based
on consumers’ appreciation. Instead, firms act on observable volumes and payments of their
competitors as they are unable to make sense of qualities and/or consumer valuations of the
competitors’ products. Since it is significantly more productive to replicate the behaviour of
one’s peers than to speculate on valuations, reproduction of each other’s behaviour in order to
sustain their niche in the market can be employed as a successful business strategy.
Surely, the transparency of the motion picture market, where information such as box office
revenue (Box Office Mojo, 2018b), consumer behaviour and demographics (MPAA, 2018),
and movie release plans that stretch many years ahead (Couch, 2017; Dockterman, 2018;
McCluskey, 2017) is publicly available, would allow many movie producers to replicate each other’s behaviour
without trying to understand the needs and expectations of moviegoers. However, this might
be an oversimplification of the state of the market. Dubuisson-Quellier (2013, pp. 15–16)
attempts to reconcile White’s theory with the use of marketing and market research to
create a better understanding of consumers’ behaviour. Simply put, the author argues that not
all firms have the resources and manpower to invest in marketing (Dubuisson-Quellier, 2013, p.
7) and, therefore, firms that do not occupy a leading position in a particular market are forced to
observe and replicate the behaviour of the market leaders, as these are assumed to have a better
understanding of customers’ expectations (Dubuisson-Quellier, 2013, p. 13).
Dubuisson-Quellier (2013) argues that replication of the market leaders’ behaviour is a
self-reproducing phenomenon. In fact, Dubuisson-Quellier (2013, pp. 16–17) argues that this
self-reproducing behaviour implies performativity of decision-making in the markets. She argues
that decisions on production quantities are reinforced by observation of the market, which lays
the ground for a decision-making process that relies on the belief that market-leading
companies have a better understanding of consumers’ demands. This reinforcement of
observations creates path-dependency in the mass markets (Dubuisson-Quellier, 2013, p. 14)
as firms engage in self-repetitive behaviour and create very similar products. The renewal rate in
the mass markets is high and the rate of innovation is low, as firms supply consumers with either
their own or a cheaper version of a competitor’s product (Dubuisson-Quellier, 2013, pp. 14, 17).
This notion of path-dependency and product similarity might be exemplified by the high number
of movie sequels and/or movie franchises (Bharadwaj et al., 2017; Kim et al., 2013; Zhao et
al., 2013). And yet, Dubuisson-Quellier (2013, pp. 7–10) argues that firms that have the resources
and manpower to work with marketing would do so in order to shape the demand in the
market. The researcher argues that it is typically the larger companies that employ either
qualitative or quantitative techniques to gather consumer preferences, adjust their products and
then create marketing campaigns in order to persuade buyers to consume their products. Smaller
firms that do not have these resources are then forced to observe bigger firms and mimic their
products (Dubuisson-Quellier, 2013, p. 13). Although Dubuisson-Quellier (2013) presents a
compelling case, it still does not explain why so many firms invest millions in market research
and why even leading companies with a better understanding of consumers’ needs fail when
launching new products. Surely, this cannot be explained by looking only at the producers’
side of things.
Quality signalling and cultural products
So far, we have reconciled White’s classical approach to firm behaviour, which downplays the
importance of producer-consumer interaction, with the endless examples of marketing research
and efforts by firms in the markets. Even White (2002, pp. 16, 32) does not
disregard the importance of signalling quality from producers to consumers. However,
according to White (1981; 2002, p. 16), quality is a social construct, and this creates some
challenges in evaluating and differentiating products based on quality. Beckert et al. (2017) argue
that understanding the valuation of goods in markets has become one of the central problems
in economic sociology. The value of cultural products such as movies is a social construct that has
meaning on several levels, argues DiMaggio (1987). The researcher claims that cultural
products are divided into categories, which allow producers to analyse competition in the
market, consumers to compare different offers on the market, and critics to
classify products even if the products have abstract and intangible artistic content.
Uncertainty in quality of cultural products
In similar fashion, Beckert and Rossel (2013) argue that buyers of artistic products face a
fundamental challenge of uncertainty regarding the quality of art, since it is based on subjective
aesthetic judgements. The quality of artistic products can only evolve from the interaction between
experts, institutions, and media in the art field, which assess the work of artists and confer their
reputation. This reputation is then perceived as a quality signal by buyers and lays the ground
for the value of the artwork. Again, there is a fundamental asymmetry in artistic markets, where
producers of the art may have much more information about the objective properties of their
work. This asymmetry creates uncertainty for the buyers. Therefore, Beckert and Rossel
(2013) argue that buyers look for quality signals based both on the reputation of the artists and
on judgements made by critics of the artwork. Again, since critics base their judgement on their
subjective interpretation of the quality of the art, there is uncertainty regarding the correctness
of the critics’ evaluation. Therefore, the critics themselves are judged on the quality and significance
of their artistic judgement. This creates a feedback loop of sorts, where buyers of art products
evaluate the judgement of the critics, thus granting a higher status to, and institutionalizing, critics
whose evaluations are perceived as reliable. These institutions with higher reliability enjoy
the benefits of a higher reputation, and thus their quality signals are perceived as more reliable.
According to Beckert and Rossel (2013), these institutions serve to reduce the degree of
uncertainty regarding quality in the market. Researchers (Aspers, 2009; Beckert, 1996; Beckert
and Rossel, 2013; Zuckerman, 1999) also argue that stable markets can only exist where uncertainty
regarding product quality is reduced.
Overcoming uncertainty: Step one – establish categories
Beckert and Musselin (2013, pp. 1–5), who wrote a book on the subject of quality, argue that the
construction of the quality of goods consists of three processes. The first process is the construction
of categories with which the goods can be associated. The authors argue that “categories are
boxes within a set of related boxes that form classification systems” (Beckert and Musselin,
2013, p. 2). The authors also employ Bowker and Star's (1999, pp. 10–11) interpretation of
classification, which they define as “spatial, temporal or spatiotemporal segmentation of the
world”. In order to function, a classification system has to possess three fundamental
properties: “There are consistent, unique classificatory principles in operation.”, “The
categories are mutually exclusive.”, “The system is complete”.
Overcoming uncertainty: Step two – find a place in the category
Once the categories have been constructed, the next process is positioning a specific good or
product within its category (Beckert and Musselin, 2013, p. 3). Zuckerman (1999) argues
that for a product to compete in a market, it should be regarded as a legitimate member
of a product category represented in the market. Zuckerman (1999) is, therefore, able to
formulate the notion of a “categorical imperative”. The researcher argues that entities that “do not
exhibit certain common characteristics may not be readily compared to others and are thus
difficult to evaluate. Such offers stand outside of the field of comparison and are ignored as so
many oranges in competition among apples. It is this inattention that constitutes the cost of
illegitimacy”. Zuckerman (2003) applied this approach to the US motion picture market and
found that films that do not fall into a category associated with the type of movie studio suffer
in box office performance as a result. Zuckerman (2003) argues that movies represented
in the categories major or independent perform better at the box office if they are clearly positioned
as belonging to one of the categories and not to both at the same time, as happens when a major film studio
releases an independent movie. Hsu et al. (2009) and Zhao et al. (2013) have also analysed the
US domestic motion picture market and argue that films that stretch over several genres are
subject to an illegitimacy discount and, as a result, receive less attention from audiences.
Overcoming uncertainty: Step three – (E-)valuation with the help of judgement devices
When a category has been established and a product has been clearly placed within this
category, the process of establishing product quality takes place. This process is built upon
establishing product quality differences within a product category. Due to the asymmetrical
nature of the market, valuation of intangible products based on their qualities is a rather
challenging process (Beckert et al., 2017; DiMaggio, 1987; Karpik, 2010, p. 289; Rössel and
Beckert, 2013, p. 2). Differences in product qualities are determined through product
differentiation in a direct comparison of products with each other. This can be done by
employing judgement devices, which are considered “to be the central mechanisms in the
qualification of goods”; the most common type of judgement device would be a rating scale
for products placed within one category. The researchers further claim that without
references to judgement devices it would be too difficult to evaluate goods and buyers’ choices
would become random (Beckert and Musselin, 2013, p. 17). Like many other
phenomena in the market, judgement devices evolve and compete with each other. This
produces ambiguity and uncertainty for audiences when it comes to choosing a judgement
device. Many different aspects play into the choice of judgement device, and the most significant
ones are tradition, power, and trust in the judgement device (Rössel and Beckert, 2013,
p. 18). Therefore, in order to reduce uncertainty, judgement devices may employ ordering of
products according to a scale created and/or facilitated by “market professionals” (Beckert and
Musselin, 2013, p. 22). Beckert and Musselin (2013, p. 23) conclude their reasoning on
judgement devices by claiming that an important factor in the success of a judgement device is
its ability to impose quality criteria upon its audience. That is only achievable through
interaction or signalling of quality between producers and consumers (Beckert and Musselin,
2013, p. 19; Callon et al., 2002, pp. 202–203), either directly or through gatekeepers of cultural goods
such as critics and evaluators (Lamont, 2012).
Judgement devices and critical reviews in the motion picture market
The influence of critical reviews and judgement devices on communicating the quality of intangible
goods is (as mentioned above) widely covered in the academic literature. If we shift focus to the
motion picture market specifically, we immediately see an array of studies that cover the effects
of critical reviews and other judgement devices on box office performance. Bae and Kim
(2013) claim that studies on the effects of critical reviews have produced mixed results; the researchers
find that the valence of reviews and word of mouth plays a more important role than the volume
(number or frequency) of the reviews. Kim et al. (2013) explored the effects of online word of
mouth and expert reviews and found that valence (ranking or rating) played an essential role in
the success of films at the box office. Similar results were found in a number of other studies
on motion picture markets (Brown et al., 2012; Goff et al., 2016; Gopinath et al., 2010; Lee
and Choeh, 2018; McKenzie, 2009; McKenzie, 2008; Moul, 2007; Ye et al., 2009; Zhao et
al., 2013; Zuckerman, 2003).
Bharadwaj et al.’s (2017) study is another example of research that sheds light on the importance
of critical reviews, as the researchers claim that one-third of Americans take
critics’ reviews into consideration when they want to choose a movie to attend. The
researchers also argued that audiences perceive higher movie ratings as a signal of the higher
quality of the final product, and that both the volume of reviews and the valence
of ratings play a role in the box office performance of movies. These findings on the volume of
reviews play well into Rössel and Beckert's (2013, p. 7) argument that the stronger the consensus
on the quality of a product, the lower the uncertainty regarding its quality.
The discussion on the importance of both volume and valence of reviews is especially
interesting when it comes to Rotten Tomatoes, which is an aggregator of reviews compiled by
recognized professional critics (Goff et al., 2016). Therefore, it may satisfy the necessary
conditions for being a legitimate source of both valence and volume of critical reviews, as
acknowledged by the one-third of US moviegoers who visit the website before choosing
a movie to see (Faughnder, 2017). Critical appreciation also seems to matter when
it comes to determining the box office success of movies, as evidenced by the many researchers
who have looked at the motion picture market and highlighted the importance of critical reviews in
forming both total box office and box office on the opening weekend, that is, the weekend
when the movie has its premiere (Bharadwaj et al., 2017; Brown et al., 2012; Lee and Choeh,
2018; McKenzie, 2009; Zhao et al., 2013; Zuckerman, 2003).
Quality of movies is more than critical reviews
Many researchers highlight the importance of the abovementioned decisions. Palsson et al.
(2013) analysed how MPAA ratings influence box office performance and found that an R rating,
which implies the presence of more violent or obscene scenes in a movie, may reduce box office
revenue by 20%. Hsu et al. (2009) argue that genre has great importance when the audience
considers which film to attend. Zhao et al. (2013) argue that films that stretch over several
genres are subject to an illegitimacy discount, as studies show that these kinds of films receive
lower audience ratings and box office results. Films that do not have clear genre boundaries or
have elements of incompatible genres might be ignored by the audience because of the unclear
identity of the films (Zuckerman, 2003). Zhao et al. (2013) also argue that naming convention
has an influence on box office performance, as movies that are part of a recognized franchise
(a continuation of a film series) tend to attract broader audience attention. Actors’ star
power, budget, director, size of the studio and/or the distributor’s market power are other
important factors that influence box office performance, which is why many
researchers use them as variables in their analyses (Bharadwaj et al., 2017; Goff et al.,
2016; Kim et al., 2013; Lee and Choeh, 2018; McKenzie, 2008; McKenzie, 2009; Zhao et
al., 2013; Zuckerman, 2003; Zuckerman et al., 2003). However, Zuckerman (2003) argues that
most of the elements mentioned here have no significant effect on box office performance
once the screen allocation is set. The researcher further argues that, with the exception of
critical reception, the above-mentioned elements have diminishing effects on box office
performance as the number of allocated screens grows, i.e. for independent movies, this number
is 1100.
Movie life cycle
The example above revolved around activities associated with the theatrical run of movies.
However, box office performance during the theatrical release is only one component of the
whole movie life cycle, which usually consists of various stages of production, distribution, and
exhibition (McKenzie, 2012). It is important to separate these stages, as most of the important
decisions that influence a movie’s future are taken long before the movie reaches
theatres. Decisions on the genre, plot, casting, director, budget and MPAA rating are
obviously taken before production or filming starts. However, distribution decisions,
namely how many theatres a movie will be shown in, are usually also taken in advance, as both the studios
that produce movies and the theatres that show them have their own business schedules and have
to plan months (if not years) ahead (Brown et al., 2012).
The above may imply that the allocation of screens is one of the most important
business decisions a distributor takes before releasing a movie to a broader audience.
However, before that, movies are usually shown to test audiences in order to gauge the
initial response to the finished product. Based on this response, distributors may form an
understanding of whether the finished movie will be received positively or negatively by critics and/or
audiences. If a movie is likely to be received negatively, the distributors might choose
to exercise the option of a so-called cold opening. This means that the movie will not be shown
to critics prior to its release to a general audience. This is done in order to avoid the risk of
negative reviews and thereby of sending negative quality signals to the audience. This strategy
seems to be rather successful, as it correlates with a 10–30 percent increase in domestic box office
revenue. The studios are clearly aware of this phenomenon, as the number of cold openings
increased sharply in the middle of 2005. The researchers also found that, for
movies of lower quality, a cold opening is usually correlated with “a pattern of
disappointment”. However, since the cold opening is a successful strategy, the researchers
argued that this could imply that audiences did not perceive the relation between cold
openings and lower movie quality (Brown et al., 2012).
But is all of that still relevant in 2017?
Although the study by Brown et al. (2012) is very thorough and thus convincing, the data set it
was based on covers movies released between 2000 and 2009. A lot has
changed since then, and one may ask whether the quality signals employed in the early 2000s are still
relevant. Before I present the conceptual framework on which I will base my analysis, allow me to
briefly describe the US and Canada motion picture market as of 2017 in order to show what
is at stake for the different actors in the market.
Let me start with the producers’ side of the market. There were 738 movies released in the US
and Canada, with a total domestic box office of $11.1 billion. Of these, 130 movies had a wide
release, i.e. were released in over 600 theatres on the night of the premiere. These 130 movies
accounted for box office revenue of $10.1 billion, which might sound like a lot at first glance.
However, these 130 movies had a combined production budget of $7.8 billion. It is worth mentioning
that the production budget does not equal the total cost of a movie. An additional 80% should usually
be added to the production budget, as this amount corresponds to the marketing budget. Taking
into consideration that box office revenue was distributed in accordance with the Pareto
principle, i.e. 20% of the movies generated 80% of the box office, one might draw the conclusion
that the stakes are very high for movie producers, who are interested in mitigating the effects
of bad quality signals.
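The arithmetic behind this claim can be sketched in a few lines. The box office and production budget figures are those quoted above; applying the 80% marketing uplift uniformly to the aggregate production budget is a simplifying assumption, and the function and variable names are purely illustrative. The sketch also ignores the exhibitors' share of ticket sales as well as international and home entertainment revenue, so it should be read as a rough indication rather than a profitability estimate.

```python
# Back-of-the-envelope cost check for the 130 wide releases of 2017.
# Figures are taken from the text (in billions of US dollars).
# Assumption: the 80% marketing uplift applies uniformly to the
# aggregate production budget; other revenue streams are ignored.

BOX_OFFICE_BN = 10.1         # domestic box office of the 130 wide releases
PRODUCTION_BUDGET_BN = 7.8   # combined production budget
MARKETING_UPLIFT = 0.80      # marketing budget as a share of production budget


def total_cost(production_budget: float, marketing_uplift: float) -> float:
    """Production budget plus the marketing budget derived from it."""
    return production_budget * (1.0 + marketing_uplift)


def domestic_shortfall(box_office: float, cost: float) -> float:
    """Positive value: estimated costs exceed domestic box office revenue."""
    return cost - box_office


cost_bn = total_cost(PRODUCTION_BUDGET_BN, MARKETING_UPLIFT)
shortfall_bn = domestic_shortfall(BOX_OFFICE_BN, cost_bn)

print(f"Estimated total cost: ${cost_bn:.2f} billion")
print(f"Shortfall against domestic box office: ${shortfall_bn:.2f} billion")
```

Under these assumptions the estimated total cost is about $14.04 billion, which exceeds the domestic box office of the same movies by roughly $3.94 billion.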
In an interview, a US-based critic (hereafter Respondent 1) with experience of
interaction with movie studios explained how this interaction might work. At some point in the
early stage of its life cycle, the movie is shown to a test audience that might comprise
either internal or external members. Based on the feedback received from the test audiences,
studios have a good understanding of how the movie will be received by critics. Based on
this anticipation, movie studios set times for screenings for the critics. At the screenings, the
critics receive instructions on when they are allowed to publish reviews in social media
and/or the outlets that they represent. The interviewee stressed that the studios set these time
slots with “mathematical precision”, as the studios are fully aware of the consequences of
review embargoes. Similar accounts were found in several US-based newspapers where the
interaction between movie studios and movie critics was discussed (Ahsan, 2017; Cavna,
2017b; Fritz, 2016). This practice of withholding reviews has some unintended consequences
associated with it, as some entertainment industry insiders argue that the shorter the window
between the lift of the review embargo and the premiere of the movie, the lower the Rotten
Tomatoes score the movie would get (Dickey and Han, 2017). This correlation between
review embargoes and the Rotten Tomatoes score was observed, however, on a rather limited
number of movies (27 major wide releases), and no attempt was made to establish a correlation between
the Rotten Tomatoes score and box office.
Respondent 1 mentioned that instructions regarding review embargoes were usually issued at
the screenings directly to the critics, not via some publicly available source. There were no
sanctions associated with disclosing the timelines communicated by the studios. However, there
were sanctions if reviews were published prior to the lift of the review embargo. These sanctions
would usually mean that the critic who violated the embargo rules would not be accredited for
future screenings. Having said that, Respondent 1 added that in some cases he, as entertainment
editor at a large web resource, would publish reviews a few minutes prior to the official time.
This was due to the competitive nature of outlets, which want to have their reviews published
ahead of competitors in order to attract attention. This was, however, a rather
common practice that did not really give any competitive advantage. In email correspondence
with a UK-based YouTube movie reviewer (hereafter Respondent 2), the same topic was
discussed. The reviewer stated that “Studios tend to employ that method of restraint from film
critics as a way to hold back any negativity. If a film studio has faith in their product they will
want people to tell their audience that the film is good it just makes logically sense etc when
they hold back a movie from being seen by the critics in some cases and only allowing reviews
to be published usually a day before its release or even on the day, most moviegoers tend to
know the film is going to be a let down.”
Respondent 1 also explained that the same principle would apply to Rotten Tomatoes, as
reviews would be published there as soon as possible after the lift of the review embargo. I
asked Respondent 1 whether it would be possible to acquire a list of time slots for
review embargoes, to which the respondent replied that to do so one would need to get in contact
with a large number of critics who had attended all the critical screenings. Having established
that, I asked the respondent if one could use the date of publication of reviews on Rotten
Tomatoes as an indicator of the length of the review embargo. The respondent replied that they
would employ the same technique in their analysis of this phenomenon.
This reply from Respondent 1 confirmed my assumption regarding timelines for review
publication on Rotten Tomatoes and allowed me to formulate my first hypothesis:
H1: All other factors equal, a shorter review turnover leads to a lower Rotten Tomatoes
rating.
Respondent 1 was then asked about the influence of the Rotten Tomatoes score on box office
performance, to which he replied that they couldn’t establish any direct correlation. Respondent
2 didn’t mention any relation between the Rotten Tomatoes score and box office performance.
However, as mentioned earlier, many researchers have covered the effects of critical and
user-generated reviews on box office (Basuroy et al., 2006; Bharadwaj et al., 2017; Brown et al.,
2012; Eliashberg and Shugan, 1997; Gopinath et al., 2010; Kim et al., 2013; Lee and Choeh,
2018; McKenzie, 2009; Oh et al., 2017; Zuckerman, 2003). Moreover, in 2017 Rotten Tomatoes
was a very popular site amongst movie-goers. As mentioned earlier in the text, approximately
one-third of the movie-goers in the US visited this site before choosing a movie to watch
(Faughnder, 2017). SimilarWeb (2018), a web analytics service, ranks Rotten Tomatoes
as the second most visited site in the movies category in the US and the fourth globally.
Amazon’s web analytics service Alexa (2018) shows that Rotten Tomatoes had 12 million
unique visitors per month. Another interesting observation made in Alexa was that traffic on
Rotten Tomatoes was unevenly distributed across the week and usually peaked on the
weekends when movies had their premieres.
This significant interest of internet users in Rotten Tomatoes on premiere weekends led
me to the conclusion that I could use the Rotten Tomatoes score as a proxy for aggregated
critical reviews and test its effects on box office performance. This led to my second hypothesis:
H2: All other factors equal, a higher Rotten Tomatoes score leads to a higher revenue at
box office opening.
While gathering data on Rotten Tomatoes, I discovered that non-critic users could
also leave their reviews on the webpage and that over 3 million reviews were left by non-critic
users for the 130 widely-released movies of 2017 on Rotten Tomatoes. While gathering data from
SimilarWeb (2018) and Alexa (2018), I noted that the web resource in the movies category that
attracted the most traffic was IMDb.com, a resource where non-critics leave
reviews that are aggregated and systemised in the form of a rating. IMDb attracts twice as
many unique visitors as Rotten Tomatoes, i.e. 24 million per month. It is worth noting that IMDb
had the same visit pattern as Rotten Tomatoes, with peaks on the weekends.
This very significant interest of internet users in non-critic reviews led me to the
conclusion that I could try to use the Audience Score as a proxy for aggregated word of
mouth and test its effects on box office performance. This allowed me to formulate my third
and final hypothesis:
H3: All other factors equal, the Audience Score published on Rotten Tomatoes is a better
predictor of box office success than the Rotten Tomatoes score.
Criticism of the presented literature
Before I move on to the methodology, I would like to add some observations on the previous
research. First of all, there was no consensus on which factors besides critical reviews and word
of mouth could influence box office performance. I will expand on this point in the conclusion
of this thesis. Moreover, the role of critics is not as obvious as it might seem. Some
researchers question whether critics can actually influence the decisions of consumers of cultural
products (Eliashberg and Shugan, 1997). Hirsch (1972) argues that critical reception is just a
result of efforts undertaken by the distributors, i.e. critics base their reviews on the extent and
the nature of marketing campaigns. Eliashberg and Shugan (1997) argue that some reviews
might actually be a projection of consumers’ potential reception of a product, rather than a
prediction of the reception. Goff et al. (2016) argued that the difference between the perception
of different attributes between critics and non-critics is so significant that one could argue that
there are two different motion picture markets: a mass market and an artistic/elite market. Rössel
and Beckert (2013) raised a similar example, claiming that the focus of wine producers on
specific critics led to situations where wines were produced with the preferences of a certain
critic in mind and not the preferences of mass consumers.
METHOD
The following section contains a description of the methods used during the study. The section
will start with an explanation of choice of the methods. Thereafter, research design for
respective methods will be presented. This section will conclude with a brief discussion of
advantages and disadvantages of the chosen methods.
Choice of method
The choice of method for this study was a rather challenging process in itself. Most of the
academic literature mentioned in the previous section employed quantitative methods in which
data was collected and hypotheses were tested. However, how can researchers be sure
that the phenomena behind their hypotheses actually exist in real life? This challenge can be
overcome if researchers employ a qualitative method, in which they engage in conversations
with real-life people who are subject to the researched phenomenon. However, both
methods have a number of limitations and can lead to conflicting conclusions. Martin et al.
(2006) showcase this with an example from organizational studies: “ontological and
epistemological differences underlie qualitative and quantitative methods choices, affecting
fundamental ideas about the nature of an organization”. Moreover, Martin et al. (2006) claim
that the choice of method might even (although not necessarily) be influenced by geographical
factors, because researchers in the US are more inclined to perform quantitative studies with
hypothesis testing due to their “neo-positivist assumptions about knowledge building”, while
in Europe and other parts of the world “a broader variety of
ontologies, epistemologies and methods (often qualitative) are preferred”. Therefore,
according to the authors, researchers who limit themselves to only one method can fall into
the interpretational pitfalls associated with that method.
Arnold (2006) argues that a combination of qualitative and quantitative methods might help
researchers to understand both what phenomena arise and perhaps how they arise (when
qualitative techniques are employed) and how often these phenomena occur (when quantitative
techniques are used). Therefore, Arnold (2006) suggests combining both qualitative and quantitative
techniques in order to mitigate the risks associated with the respective method and improve
the validity and generalisability of research. In accordance with Arnold’s advice, I have decided to
use both techniques in my research and therefore go for a mixed method approach (Saunders et
al., 2009, p. 152). Mixed method research is built upon the use of both qualitative and
quantitative data collection techniques and analysis procedures. It is very important to stress
that, in accordance with Saunders et al. (2009, p. 153), qualitative data will be analysed
qualitatively and quantitative data will be analysed using statistical techniques. In other words,
none of the data gathered during the interviews will be quantitised and used in hypotheses
testing.
To summarize my choice of method, I have decided to proceed with the research in two parts.
The first part of the research will be based on interviews with people that have a connection to
the movie industry. This will be a pre-study of sorts in order to create understanding about the
practice of review embargoes and if it has any effects on the valence of critical reviews. During
this part, I will also attempt to identify sources of information about the timelines imposed on
the critics by the review embargoes. The second part of my thesis will revolve around the
statistical analysis of the gathered data in order to understand if the variance in critical
perception and box office performance might be explained by quality signals such as review
turnover and Rotten Tomatoes and Audience scores respectively.
Pre-study
Pre-study for this thesis had 2 aims. First, to establish an empirical foundation that could
highlight validity of choice of Rotten Tomatoes as a proxy for both aggregated critical reviews
and word-of-mouth. Second, to explore nature of interactions between moviemakers and
critics.
To achieve the first aim, I searched for
different combinations of the keywords “Rotten Tomatoes”, “box office”, “critical review” and
“review embargo” both in Google and in the Uppsala University library search engine. These
searches gave me results from both peer-reviewed and non-peer-reviewed sources. Peer-reviewed
articles were used in the literature overview. Claims from non-peer-reviewed
sources, however, went through additional analysis. There were 2 major motifs in the claims from
non-peer-reviewed articles that I used in my thesis. The first claim concerned user
engagement on the Rotten Tomatoes webpage. Using data from the web analytics services
SimilarWeb and Alexa, I wasn’t able to disprove the claim that 1/3 of US moviegoers visited
Rotten Tomatoes before choosing a movie to watch in the cinema: 12 million monthly
visits correspond to 144 million yearly visits, which is considerably higher than 1/3 of the 260
million people who bought a movie ticket in the US and Canada in 2017.
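The comparison above is simple arithmetic and can be reproduced in a few lines. The sketch below uses only the two figures quoted in the text; the function names are hypothetical. Note that summing monthly figures over a year can count the same person many times, so the result only shows that the traffic data does not contradict the one-third claim, not that it proves it.

```python
# Figures quoted in the text; everything else is illustrative.
MONTHLY_VISITS_RT = 12_000_000     # Rotten Tomatoes, per Alexa (2018)
TICKET_BUYERS_2017 = 260_000_000   # US and Canada, 2017


def yearly_visits(monthly_visits: int) -> int:
    """Scale a monthly visit count to twelve months (repeat visitors are not removed)."""
    return monthly_visits * 12


def one_third(population: int) -> float:
    """One third of a population, as used in the claim being checked."""
    return population / 3


visits = yearly_visits(MONTHLY_VISITS_RT)
threshold = one_third(TICKET_BUYERS_2017)

print(f"Yearly visits:              {visits / 1e6:.0f} million")
print(f"One third of ticket buyers: {threshold / 1e6:.1f} million")
print("Claim not contradicted by traffic data:", visits > threshold)
```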
The second aim of the pre-study was to explore the nature of interactions between movie
studios and critics. Exploratory research is usually best served by unstructured interviews
where respondents are given the possibility to talk in depth about specific questions (Saunders
et al., 2009, pp. 321–323). I chose to contact YouTube movie reviewers because, in
my experience, they mention the topic of review embargoes in their content on YouTube. I
contacted 25 critics in total via both Twitter and email. Unfortunately, I received only 1
response, from a UK-based movie reviewer who sent a written answer to my initial email in which
I explained the extent of my study. At a later stage, I noticed that I had sent my emails and Twitter
messages around the same time as “Avengers: Infinity War” was being released in the US
and Canada domestic market, which might have influenced the critics’ decision not to engage in
conversation with me due to scheduling conflicts.
In parallel, I contacted 3 non-academic authors whose articles I used in the first part of
my thesis. I received 1 response, from a US-based critic who agreed to an interview. As a
result, 1 in-depth interview was conducted. The interview was conducted over Skype and
recorded in its entirety. Since this was an unstructured interview, several topics related to review
embargoes were discussed. However, only topics related to my hypotheses are described in
the first part of this thesis.
I have chosen not to disclose the names of the respondents, although I received approval to
publish the name of Respondent 1 during the interview. I contacted both respondents when my
thesis was almost done in order to make sure that I had interpreted their words correctly and to
get final approval for the disclosure of their names. At the time of submission of this thesis, I had
not received any response from the respondents.
Quantitative research
Since the purpose of my quantitative analysis is to attempt to explain the relationship between
the review turnover and the valence of critical reviews, and the relationship between the critical
reviews and the box office performance of the reviewed movies, I will employ multiple
regression analysis (Hair, 2010, pp. 169–171). By employing this type of analysis, I will be
able to “determine relative importance of each independent variable in the prediction of
dependent measure”; assess the relationship between the independent and the dependent
variables; and, finally, I will be able to evaluate the relationship between the different
independent variables in their prediction of the dependent variables (Hair, 2010, p. 170).
The next step in designing my quantitative research was defining sample size, statistical power,
and generalizability. In accordance with Hair's suggestions (2010, pp. 174–176), I will have
a sample size of over 100 observations, and no fewer than 5 observations per independent
variable, in order to have as high generalizability as possible. When it comes to variables
(especially independent variables), it is worth mentioning that in the case of motion pictures, one
must use not only metric data in one’s models but also nonmetric data (Bharadwaj et al., 2017;
Kim et al., 2013; Zuckerman, 2003). As previous research implies, genre, rating and so on
are good examples of data that will be treated as dummy variables.
When working with dummy variables, I will treat my nonmetric variables as dichotomous, i.e.
each category of the respective variable will be assigned a value of 0 or 1. Each variable
that has k nonmetric categories will then be represented by k-1 dummy variables. It is also
important to mention that none of the dependent variables used in my models is nonmetric,
which makes it possible to use a linear regression model (Hair, 2010, p. 177).
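The k-1 coding described above can be sketched as follows. This is a minimal illustration rather than the procedure actually used in the thesis: the function name `dummy_code`, the `is_` column prefix and the example ratings are assumptions made for the example. The category chosen as reference is the one represented by all dummies being 0.

```python
def dummy_code(values, reference):
    """Encode a categorical variable with k categories as k-1 dummy (0/1) columns.

    The reference category gets no column of its own; it is the case in
    which every dummy equals 0.
    """
    categories = sorted(set(values))
    if reference not in categories:
        raise ValueError(f"reference category {reference!r} not present in data")
    kept = [c for c in categories if c != reference]
    rows = [{f"is_{c}": int(v == c) for c in kept} for v in values]
    return kept, rows


# Illustrative MPAA ratings, not rows from the thesis data set.
ratings = ["PG-13", "R", "PG", "G", "R", "PG-13"]
kept, rows = dummy_code(ratings, reference="PG-13")

print(kept)      # the k-1 categories that received a dummy column
for rating, row in zip(ratings, rows):
    print(rating, row)
```

Dropping one category is what keeps the dummies from summing to the intercept column, which would otherwise make the regression coefficients impossible to estimate uniquely.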
The last, but not least important, topic in my analysis is the handling of outliers. Since the size
of my sample should be over 100 observations and only 134 movies had a wide theatrical release in
the US in 2017, I have decided to use data for all the movies in my data sample. However,
during the analysis of my data, it became apparent that the Pareto principle is observable when
it comes to box office performance and movie budgets, i.e. roughly 20% of the movies
accumulate 80% of the box office. This creates a number of problems when it comes to
statistical analysis. According to Hair (2010, p. 71), normality of the data is a fundamental
assumption in multivariate analysis; therefore, data should have a normal distribution.
However, larger sample sizes can reduce the detrimental effects of nonnormality in the analysis
(Hair, 2010, p. 72). Another important assumption of multivariate analysis is homoscedasticity,
which refers to the dependent variable exhibiting equal levels of variance across the range of
the predictor variables (Hair, 2010, p. 72). There are 2 ways
to handle bias that might be created due to conflicts with these assumptions: handling outliers
(Hair, 2010, pp. 68–69) and/or transforming data (Hair, 2010, pp. 77–78). If the need for
these methods arises, I will address it in the analysis part of my paper.
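To illustrate why a transformation can help with strongly right-skewed revenue data, the sketch below compares the skewness of a small, made-up revenue series before and after a natural-log transformation. The revenue values are hypothetical and the `skewness` helper is an assumption of this example. The natural logarithm is used because the logged columns in Table 2 are consistent with it (for instance, ln 2 ≈ 0,69 for the smallest production budget of $2 million).

```python
import math


def skewness(xs):
    """Skewness as the third standardized moment (population form)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5


# Hypothetical revenues in $ million, chosen to be right-skewed; not thesis data.
revenue = [1.6, 5.0, 12.0, 20.0, 35.0, 60.0, 110.0, 620.0]
log_revenue = [math.log(x) for x in revenue]

print(f"skewness, raw values: {skewness(revenue):.2f}")
print(f"skewness, log values: {skewness(log_revenue):.2f}")
```

A skewness close to 0 indicates a roughly symmetric distribution, which is closer to what the normality assumption requires.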
The last 2 important statistical assumptions are linearity and absence of correlated errors (Hair,
2010, p. 76). In order to identify bias associated with non-linearity and correlated errors, I
will examine residuals and create correlation tables respectively in order to identify any
anomalies. If problems with these 2 assumptions arise, I will consider removing variables from
my hypotheses test model. A list of all variables that will be used in my model is presented
closer to the end of this section.
Data Sources
There were two main sources of information on movies and box office performance used in
this study. Boxofficemojo.com was the main source of information about box office
performance for the widely released movies in 2017. I will use MPAA’s Theatrical and Home
Entertainment Market Environment report for 2017 (MPAA, 2018) in order to gather market
and movie-goer data for the descriptive statistics part.
Rottentomatoes.com was used for gathering data on critical perception. It is important
for the purpose of this study to explain why specifically Rotten Tomatoes was used. As
mentioned earlier in the text, Rotten Tomatoes is a review aggregator, which means that instead
of gathering review data from multiple sources I could gather data from one source. This
review aggregation from multiple sources might be the reason why Rotten Tomatoes is a very
popular site amongst movie-goers. As mentioned earlier in the text, approximately one-third of
the movie-goers in the US visited this site before choosing a movie to watch (Faughnder, 2017).
SimilarWeb (2018), a web analytics service, ranks Rotten Tomatoes as the second most
visited site in the movies category in the US and the fourth globally. The number one spot in
this category goes to IMDb.com. Amazon’s web analytics service Alexa (2018) shows that
the amount of traffic on IMDb is twice as high as on Rotten Tomatoes. This raises a question:
why wasn’t IMDb chosen as a data source for this study? The answer to this question lies in
the validity of ratings available to the general audience prior to a movie premiere. IMDb shows
user-generated ratings on its webpage, and a movie needs only 5 reviews to show a rating. At
the time of writing of this paper (on the 19th of May 2018), a movie called “Solo: A Star Wars
Story” has already received a small number of user reviews (IMDb, 2018b), even though the
movie has its premiere on the 25th of May.
Rotten Tomatoes, on the other hand, has developed a quality assurance system with clearly
defined eligibility criteria in order to make sure that only qualified critics can leave their
reviews on the website (Rotten Tomatoes, 2018b). Rotten Tomatoes has divided critics into
two categories, critics and Top critics, depending on whether a critic represents a major outlet
or has a significant social media following. For the purpose of this study, no distinction between
the critics will be made because the Rotten Tomatoes score is not influenced by the critic category.
Movies can, however, get an additional seal of approval (Certified Fresh) from Rotten
Tomatoes if they have a score of more than 75% and at least five Top critics have left a positive
review for the movie. This seal of approval, however, will not be included in this study. This
critic validation system allows the website to have confidence in the score that it publishes
prior to the movie premieres. If we take the same movie as mentioned above, “Solo: A Star
Wars Story”, we can observe that it has received a score of 71% based on 122 reviews. This
should be interpreted as 71% of the 122 critics having given the movie a positive review.
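The interpretation of the score as a share of positive reviews can be made concrete with a short sketch. The two rules below follow the description in this section only; the site's actual criteria may include further conditions. The function names and the 87/35 split of the 122 reviews are assumptions chosen so that the example lands on the 71% mentioned in the text.

```python
def tomatometer(reviews):
    """Share of positive reviews as a whole-number percentage.

    `reviews` is a list of booleans (True = positive review).
    Returns None when there are no reviews, i.e. the movie has no score yet.
    """
    if not reviews:
        return None
    return round(100 * sum(reviews) / len(reviews))


def certified_fresh(score, positive_top_critic_reviews):
    """Seal of approval as described in the text: score above 75% and
    at least five positive reviews from Top critics."""
    return score is not None and score > 75 and positive_top_critic_reviews >= 5


# Hypothetical split: 87 positive and 35 negative reviews out of 122.
reviews = [True] * 87 + [False] * 35
score = tomatometer(reviews)

print(score)                        # 71
print(certified_fresh(score, 12))   # False, because the score is not above 75
```

The sketch also shows why a score says nothing about how enthusiastic the reviews are: each review contributes only a positive or negative vote.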
Data Sample
For the purpose of this study I have decided to include all movies that had a wide release in
the US on their premiere night in 2017. In total 735 movies were released in the US during 2017.
Out of these movies, 134 were released on more than 600 screens on their premiere night, which
is the definition of wide release according to Box Office Mojo. 4 out of these 134
movies were re-releases, which means that these movies had already been released prior to 2017,
e.g. “Casablanca” was re-released in 2017 in order to celebrate its 75th anniversary (Box Office
Mojo, 2018c). Since these movies are already familiar to both critics and audiences, and quality
signals would not have the same interpretation as for movies that have their very first
theatrical run, I have decided to exclude these 4 movies from my analysis.
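The two selection rules above (wide release, no re-releases) amount to a simple filter. The sketch below is illustrative: the `Release` record, its field names and the three example rows are hypothetical, and the threshold is applied as "at least 600 screens", which is one reading of the definition quoted in the text.

```python
from dataclasses import dataclass

WIDE_RELEASE_MIN_SCREENS = 600  # threshold quoted in the text (Box Office Mojo)


@dataclass
class Release:
    title: str
    opening_screens: int
    is_rerelease: bool


def select_sample(releases, min_screens=WIDE_RELEASE_MIN_SCREENS):
    """Keep first-run movies that opened on at least `min_screens` screens."""
    return [
        r for r in releases
        if r.opening_screens >= min_screens and not r.is_rerelease
    ]


# Hypothetical releases for illustration only; not rows from the thesis data set.
releases = [
    Release("Wide first run", 3200, False),
    Release("Limited first run", 45, False),
    Release("Wide re-release", 650, True),
]
sample = select_sample(releases)
print([r.title for r in sample])
```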
Variables
Overview of variables
Table 1 shows an overview of all variables with type, operationalisation and data source used
in hypotheses testing. After the table, a short description of variable Rotten Tomatoes Score
Opening is presented because rather unconventional techniques were employed to gather data
on that variable.
Rotten Tomatoes Score on the night of the premiere
The Rotten Tomatoes score will be represented by the score of the respective movie on Rotten
Tomatoes on the night of its premiere. In order to gather this data, I had to use the Internet
Archive service more commonly known as the Wayback Machine (Internet Archive, 2017). This
service takes snapshots of different
web resources and allows one to see how web pages looked at different times. Therefore, for the
purpose of this study, it was possible for me to access Rotten Tomatoes in the very same state
as it was on the date of the premiere of all 130 movies that are in my sample. This allowed me
to gather the Rotten Tomatoes score as it was on the night of the premiere. However, there are
some caveats when it comes to the score on the premiere night. Firstly, the Wayback Machine might
take several snapshots per day. As a result, one might get different scores depending on which
snapshot one decides to open if reviews are published on the day of the premiere and the score
changes as more and more reviews are published. In some cases, a movie can go from not being
scored at all to having some score. The same phenomenon is also observed over time, i.e. most
movies had different scores on the night of the premiere and at the time when data was gathered.
Secondly, some movies were not featured on the main page of Rotten Tomatoes. This led to
situations when it was not possible to gather the score easily. In these cases, I had to look up
the subpages of Rotten Tomatoes dedicated to the respective movie via the Wayback Machine.
If the movie had a snapshot of its page, the data was gathered from this page. In
cases when there was no snapshot of the movie page from the date of the premiere available, some
additional analysis was performed based on the number of reviews published. If there
were no reviews published prior to the premiere, the movie received a score of “0”. If there
were some reviews published prior to the premiere, the snapshot that was closest to the premiere
date was used to determine the score. By employing this method, I was able to avoid situations
when movies that had some score were labelled as movies with “0” on Rotten Tomatoes. The
topic of changes in scores will be further discussed closer to the end of this paper.
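The decision rules described above can be summarised in a short sketch. It assumes that the snapshot dates and the scores shown on them have already been collected by hand; the function name, the dictionary layout and the example values are hypothetical. Where several snapshots exist on the premiere date, the sketch takes the latest one of that day, which is an assumption, since the text does not state which snapshot was preferred.

```python
from datetime import datetime


def premiere_score(snapshot_scores, premiere, reviews_before_premiere):
    """Pick the score attributed to a movie on its premiere night.

    snapshot_scores: dict mapping snapshot datetime -> score shown on that snapshot
    premiere: datetime of the premiere
    reviews_before_premiere: number of reviews published before the premiere
    """
    same_day = {ts: s for ts, s in snapshot_scores.items()
                if ts.date() == premiere.date()}
    if same_day:
        # Assumption: use the latest snapshot taken on the premiere date.
        return same_day[max(same_day)]
    if reviews_before_premiere == 0:
        # No reviews before the premiere: the movie is scored "0".
        return 0
    if not snapshot_scores:
        return None  # nothing archived; the score cannot be determined this way
    nearest = min(snapshot_scores, key=lambda ts: abs(ts - premiere))
    return snapshot_scores[nearest]


# Hypothetical snapshots for one movie.
premiere = datetime(2017, 6, 2, 19, 0)
snapshots = {
    datetime(2017, 5, 30, 12, 0): 50,
    datetime(2017, 6, 10, 12, 0): 55,
}
print(premiere_score(snapshots, premiere, reviews_before_premiere=12))
```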
Hypotheses testing model
Having presented the variables and methods, I will now present the model that will be
used for hypotheses testing. The generic equation for multiple regression (Hair, 2010, p. 166)
that I will employ looks as follows:
Y = b0 + b1V1 + b2V2 + ... + e
Based on the multiple regression equation, I will use 8 different models to test my
hypotheses. Below I have written out model 7 for reference, listing its dependent and
independent variables (the coefficients and the error term are omitted):
Box Office Total = Audience Score + Audience Score Current Volume + Rotten Tomatoes
Score Current + Rotten Tomatoes Current Volume + Studio Major + Screens Total +
Production Budget + Recognized Property + Animated + Genre Action & Adventure +
Genre Drama + Genre Comedy + Genre Mystery & Suspense + Genre Kids & Family +
Genre Horror + Genre Science Fiction & Fantasy + Genre Other + MPAA rating R
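For readers who want to see how the coefficients b0, b1, b2, ... of such a model are obtained, the following is a minimal ordinary least squares sketch in plain Python. It is not the software or procedure used in the thesis: the function `fit_ols` is an assumption of this example, it solves the normal equations directly, and the toy data is generated without noise so that the known coefficients are recovered.

```python
def fit_ols(X, y):
    """Least-squares coefficients [b0, b1, ..., bk] for y = b0 + b1*V1 + ... + bk*Vk + e.

    Solves the normal equations (A'A) b = A'y by Gauss-Jordan elimination
    with partial pivoting, where A is X with a leading column of ones.
    """
    A = [[1.0] + [float(v) for v in row] for row in X]
    p = len(A[0])
    # Augmented matrix [A'A | A'y].
    M = [
        [sum(a[i] * a[j] for a in A) for j in range(p)]
        + [sum(a[i] * yi for a, yi in zip(A, y))]
        for i in range(p)
    ]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("predictors are perfectly collinear")
        M[col], M[pivot] = M[pivot], M[col]
        pivot_value = M[col][col]
        M[col] = [v / pivot_value for v in M[col]]
        for r in range(p):
            if r != col:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [M[i][p] for i in range(p)]


# Noise-free toy data generated from y = 1 + 2*V1 + 3*V2 (V2 is a 0/1 dummy),
# so the fit should recover coefficients close to [1, 2, 3].
X = [[1, 0], [2, 1], [3, 0], [4, 1], [5, 1], [6, 0]]
y = [1 + 2 * v1 + 3 * v2 for v1, v2 in X]
coefficients = fit_ols(X, y)
print([round(b, 6) for b in coefficients])
```

In practice a statistics package would also report standard errors, R² and significance levels, which this sketch leaves out.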
Advantages and disadvantages of the chosen approach
As mentioned earlier, the combination of methods allowed me to formulate my hypotheses based
on phenomena observable in real life. By gathering data via an interview with a subject-matter
expert, I was able to confirm the validity of the operationalisation of the variable Review
Turnover, which is a central notion in quality signalling between moviemakers and movie critics.
However, this operationalisation is based on only 1 interview, which is a small sample even for
a qualitative method. A disadvantage of the quantitative method for multivariate data analysis is
that researchers might encounter negative effects of multicollinearity (Hair, 2010, pp. 204–205).
From the overview of variables used in my thesis, one can clearly see that I have used 22
variables in a sample with a size of 130. Although I haven’t used all 22 variables in the same
hypothesis test, I address this issue in the next sections of this thesis.
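Multicollinearity is commonly quantified with the Variance Inflation Factor, VIF = 1 / (1 - R²), where R² comes from regressing one independent variable on all the others. The sketch below covers only the simplest case of a model with exactly two independent variables, where R² equals the squared correlation between them. The function names and the example correlations are illustrative assumptions.

```python
def vif_from_r_squared(r_squared):
    """Variance Inflation Factor for a predictor whose regression on the
    other predictors has the given R-squared."""
    if not 0 <= r_squared < 1:
        raise ValueError("R-squared must lie in [0, 1)")
    return 1.0 / (1.0 - r_squared)


def pairwise_vif(r):
    """VIF in a model with exactly two predictors that correlate at r."""
    return vif_from_r_squared(r * r)


# Illustrative correlations between two predictors.
for r in (0.70, 0.80, 0.90):
    print(f"r = {r:.2f} -> VIF = {pairwise_vif(r):.2f}")
```

Under this formula a pairwise correlation of 0,70 corresponds to a VIF of about 2, while a VIF of 5 requires an R² of 0,80 and a VIF of 10 an R² of 0,90. With more than two independent variables the VIF can be high even when no single pairwise correlation is.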
RESULTS
Descriptive statistics
In 2017, the 738 titles that were released theatrically in the US domestic market generated box
office revenue of $11,123 billion. On the premiere weekends, these 738 movies generated $3,365
billion. This means that a movie generated on average $15,07 million during the whole theatrical
run and $4,55 million on the premiere weekend. However, these averages are calculated over all
the movies. For the purpose of this study, we are more interested in the movies that had a wide
release (opened in at least 600 theatres on the premiere weekend). This criterion leaves us with
a sample of 134 movies. These 134 movies generated $10,155 billion over the course of their
theatrical runs and $3,289 billion on the premiere weekends. Observant readers will
immediately recognize that some movies generate significantly more money than others, as the 134
widely-released movies generated approximately 10 times more revenue than the movies that
were not widely released. But even amongst widely-released movies, the
distribution of revenue is very skewed, as opening weekend revenue ranges from $0,6 million
(“The Stray”) to $220 million (“Star Wars: The Last Jedi”).
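The averages and the "approximately 10 times" comparison above can be checked with a few lines of arithmetic. The sketch uses only the totals quoted in this paragraph, expressed in $ million; the variable names are illustrative. Small differences in the last decimal compared with the text may arise because the quoted totals are themselves rounded.

```python
# Totals quoted in the text, in $ million.
TOTAL_REVENUE_ALL = 11_123      # all 738 titles, whole theatrical run
OPENING_REVENUE_ALL = 3_365     # all 738 titles, premiere weekends
TITLES_ALL = 738
TOTAL_REVENUE_WIDE = 10_155     # 134 wide releases, whole theatrical run
TITLES_WIDE = 134

avg_total = TOTAL_REVENUE_ALL / TITLES_ALL
avg_opening = OPENING_REVENUE_ALL / TITLES_ALL

# Revenue of the movies that were not widely released, and the resulting ratio.
revenue_not_wide = TOTAL_REVENUE_ALL - TOTAL_REVENUE_WIDE
ratio_wide_to_not_wide = TOTAL_REVENUE_WIDE / revenue_not_wide

# Averages within each group show how uneven the split is.
avg_wide = TOTAL_REVENUE_WIDE / TITLES_WIDE
avg_not_wide = revenue_not_wide / (TITLES_ALL - TITLES_WIDE)

print(f"average total revenue per title:   {avg_total:.2f}")
print(f"average opening revenue per title: {avg_opening:.2f}")
print(f"wide vs. not wide revenue ratio:   {ratio_wide_to_not_wide:.1f}")
print(f"average per wide release:          {avg_wide:.2f}")
print(f"average per other release:         {avg_not_wide:.2f}")
```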
Let’s change our focus and look at the widely-released movies themselves. Out of 134 movies,
4 movies were re-releases of already known titles. Therefore, I have decided to remove these
movies from my sample and from now on all percentages shown in the further analysis will be
calculated from the sample size of 130. Descriptive statistics for this sample are presented in 2
tables below. Table 2 shows descriptive statistics for continuous variables and Table 3 shows
descriptive statistics for categorical variables.
Table 2 Descriptive statistics for continuous variables used in hypotheses 2 and 3 (N = 130).
    Variable                                Min       Max          Mean        Std Dev     Sum
 1  Total Box Office ($ million)            1,58      620,18       78,06       106,61      10 147,71
 2  Theaters Total                          640,00    4 535,00     2 897,31    1 011,03
 3  Box Office Opening ($ million)          0,60      220,01       25,27       35,40       3 285,38
 4  Theaters Opening                        626,00    4 529,00     2 864,04    1 006,25
 5  Rotten Tomatoes Score Opening           0,00      100,00       46,95       31,43
 6  Rotten Tomatoes Score Current           5,00      99,00        50,02       27,57
 7  Rotten Tomatoes Score Current Volume    5,00      379,00       157,52      96,91       3 057,00
 8  Audience Score                          11,00     94,00        59,29       20,18
 9  Audience Score Volume                   270,00    192 719,00   23 518,54   29 843,05   3 057 410,00
10  Production Budget ($ million)           2,00      300,00       59,65       61,66       7 754,30
11  Production Budget (log)                 0,69      5,70         3,57
12  Box Office Total (log)                  0,46      6,43         3,56
13  Audience Score Volume (log)             5,60      12,17        9,45

Table 3 Descriptive statistics for categorical variables used in hypotheses 2 and 3 (N = 130).
    Variable                     Number   Percentage of N   N
 1  Studio Major                 69       53%               130
 2  Festival Release             19       15%               130
 3  Fresh (premiere)             49       38%               130
 4  Rotten (premiere)            81       62%               130
 5  Fresh (current)              46       35%               130
 6  Rotten (current)             84       65%               130
 7  Sequel                       33       25%               130
 8  Remake or reboot             17       13%               130
 9  Animation                    16       12%               130
10  MPAA G                       2        2%                130
11  MPAA PG                      26       20%               130
12  MPAA PG-13                   55       42%               130
13  MPAA R                       47       36%               130
14  Drama                        79       61%               130
15  Action & Adventure           55       42%               130
16  Comedy                       45       35%               130
17  Horror                       33       25%               130
18  Science Fiction & Fantasy    25       19%               130
19  Mystery & Suspense           23       18%               130
20  Kids & Family                14       11%               130
21  Other genre                  10       8%                130

There are 2 variables that are not shown in Table 2: Review turnover and Rotten Tomatoes
score on the day of the premiere. In the sample, there were 17 movies (13%) that had no rating
or “0” rating on the night of the premiere. 1 movie (0,7%) had a rating of “100”. The average
score for the sample was 46,95. Review turnovers had a much larger spread, from 0 to 224 days.
Such a significant difference in values could lead to errors in the interpretation of data.
Therefore, I applied some additional techniques to my data sample. Firstly, I created a boxplot
to identify the outliers. As a result, all review turnovers longer than 30 days were identified as
outliers. At this point, I had two options: to remove the data from my sample or to do some
additional research on these movies in order to understand what might explain such a spread.
It turned out that of the 20 outliers, 19 movies had their original premiere at a film festival and
1 movie was released outside of the US prior to its domestic premiere. This led to my decision to
exclude the 19 festival movies (see footnote 1) from my data set for the test of the first
hypothesis, because these movies had rather specific conditions during their premiere and there
was an additional quality signal (the movie was included in the programme of a film festival)
that would not be applicable to the rest of the widely-released movies. The relation between the
review turnovers and Rotten Tomatoes scores for the new sample of 111 movies is presented in
Figure I below.
Figure I Relation between the review turnovers and Rotten Tomatoes scores for the new sample of 111 movies.
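A boxplot conventionally flags as outliers the values that lie more than 1,5 interquartile ranges beyond the quartiles (Tukey's fences). The sketch below applies that convention to a made-up series of review turnovers. It is an illustration under stated assumptions: the data is hypothetical, and the text does not specify which quartile method or fence multiplier the boxplot software used, so the cut-off it produces need not match the 30 days reported above.

```python
import statistics


def tukey_fences(xs, k=1.5):
    """Lower and upper fences: quartiles extended by k interquartile ranges."""
    q1, _, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def outliers(xs, k=1.5):
    """Values falling outside the Tukey fences."""
    low, high = tukey_fences(xs, k)
    return [x for x in xs if x < low or x > high]


# Hypothetical review turnovers in days; not the thesis data.
turnover_days = [0, 1, 1, 2, 2, 3, 3, 4, 5, 7, 9, 14, 45, 120, 224]

print(tukey_fences(turnover_days))
print(outliers(turnover_days))
```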
Although I have decided to use the smaller sample for the test of my first hypothesis, I will use
the original 130-movie sample for the tests of the rest of my hypotheses. This is due to the fact
that movie premieres at film festivals are usually not available to general audiences. In the case
of my sample, this didn’t affect revenue on the box office premiere weekend as it was not
registered in Box Office Mojo. The relation between box office and Rotten Tomatoes score is
shown in Figure II below.
1 These 19 movies have an average review turnover that is equal to – 27, which would qualify
these movies as outliers. Another interesting observation about these 19 movies is that they have
an average Rotten Tomatoes score of 66, which is considerably higher than that of the original
sample.
Figure II Relation between the Rotten Tomatoes score and box office at the time of movie premiere (in $ million)
for the sample of 130 movies.
Finally, the last part of the presentation of descriptive statistics is an overview of the
Rotten Tomatoes and Audience scores that were available in April and May of 2018 for the
sample of 130 movies. When it comes to the Rotten Tomatoes score, it is important to mention
that the average score went up and is equal to 50 (compared to 46,95 at the time of the premieres).
The average Audience score is equal to 59,29, which means that the audience score is on average
almost 10 percentage points higher than the current Rotten Tomatoes score and roughly 12
percentage points higher than the Rotten Tomatoes score at the premiere. Since we compare
audience and critics scores after the movies’ theatrical runs, I will present total box office
revenues for the 130 movies. In the same manner as box office at the premiere, total box office
revenue has a skewed distribution, which ranges from $1,58 million (“The Stray”) to $620 million
(“Star Wars: The Last Jedi”). A compilation of Rotten Tomatoes scores, Audience scores and box
office revenue over the whole theatrical run is shown in Figure III below. On closer examination
of Figure III, it becomes apparent that the audience is more generous with its score for the movies
on the lower end and more sparing in its judgement of movies with a high critical appraisal.
Figure III Relation between the Rotten Tomatoes and Audience scores and total box office revenue over the
whole theatrical run (in $ million) for the sample of 130 movies.
Sources: Rotten Tomatoes and Box Office Mojo
Test of Hypotheses
Correlation matrices
Before I present the results of the tests of the hypotheses, I would like to present a short
analysis of correlations in order to determine if the results may be affected by multicollinearity.
As I mentioned earlier, I will use 2 different subsets of the sample in my hypotheses testing. The
first subset contains the 111 movies that did not premiere at a film festival. The correlation
matrix for this subset, which will be used for testing hypothesis 1, is shown in Table 4.
Table 4 Correlation matrix for continuous variables used in testing of hypothesis 1 (N = 111).
Variable                            1         2         3
1 Review Turnover                   1,00
2 Rotten Tomatoes score opening     0,57***   1,00
3 Production Budget                 0,29***   0,30***   1,00
*** p<.01; ** p<.05; * p<.10.
The significant coefficients are written in bold for an easier overview.
My second subset contains all 130 movies released for the first time in the US and Canada in
2017. The correlation matrix for this subset, which is used for testing hypotheses 2 and 3, is
shown in Table 5.
Table 5 Correlation matrix for continuous variables used in testing of hypotheses 2 and 3 (N = 130).
Variables                                1         2         3         4         5         6         7         8         9         10
 1 Rotten Tomatoes Score Opening         1,00
 2 Rotten Tomatoes Score Current         0,88***   1,00
 3 Rotten Tomatoes Score Current Volume  0,63***   0,56***   1,00
 4 Box Office Opening                    0,39***   0,38***   0,65***   1,00
 5 Screens Opening                       0,25***   0,11      0,60***   0,62***   1,00
 6 Box Office Total                      0,42***   0,43***   0,64***   0,94***   0,62***   1,00
 7 Screens Total                         0,27***   0,13      0,60***   0,61***   0,997***  0,62***   1,00
 8 Production Budget                     0,22**    0,19**    0,61***   0,68***   0,64***   0,63***   0,63***   1,00
 9 Audience Score                        0,56***   0,69***   0,32***   0,31***   0,05      0,37***   0,07      0,21**    1,00
10 Audience Score Volume                 0,35***   0,32***   0,68***   0,82***   0,59***   0,79***   0,59***   0,68***   0,26*     1,00
*** p<.01; ** p<.05; * p<.10.
The significant coefficients are written in bold for an easier overview.
Several variables used in hypotheses 2 and 3 have correlations over 0,70, which according to
Hair (2010, pp. 204–205) may result in issues associated with multicollinearity. The Rotten
Tomatoes score at the opening of a movie has a very high correlation (0,88***) with the Rotten
Tomatoes score after the theatrical run. Even higher correlations are observed for the
variables based on box office performance and the number of screens at different points in
time (0,94*** and 0,997***), but none of these pairs is used in the same model in the hypothesis
testing. Therefore, I have not performed Variance Inflation Factor (VIF) analysis on these
variables. However, the volume of the Audience Score and the total box office have a
correlation of 0,79***; therefore, I have performed VIF analysis on the models where this
variable is used. The results of this analysis are summarised in Table 6 below. Both the volume
of the Rotten Tomatoes Score and the volume of the Audience Score have VIF values close to or
over 5, depending on the model. Although Hair (2010, p. 205) argues that in most cases a VIF of
10 will cause problems, in some cases a VIF of 3 to 5 might already do so. Therefore, I have
performed some additional analysis on the models where the volumes of the Rotten Tomatoes Score
and Audience Score were used. Before moving on to the analysis, it is worth mentioning that
Kim et al. (2013), who studied the effects of word of mouth on box office performance
(N = 169), observed similar levels of correlation between word of mouth frequency and box
office.
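The link between pairwise correlations and VIF values can be illustrated with a short sketch. It is written in Python (the thesis itself used R) and is not the author's procedure: it assumes the standard definition VIF_j = 1 / (1 - R_j^2), which equals the j-th diagonal element of the inverse of the predictors' correlation matrix. The function names are hypothetical, and the example takes a hypothetical pair of predictors correlated at 0,79 (the pairwise value quoted in the text), so its output is not comparable with the VIF values in Table 6, which come from the full models.

```python
def invert(matrix):
    """Invert a small square matrix with Gauss-Jordan elimination."""
    n = len(matrix)
    # Augment the matrix with the identity: [A | I].
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular (perfect collinearity)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for row in range(n):
            if row != col:
                factor = aug[row][col]
                aug[row] = [a - factor * b for a, b in zip(aug[row], aug[col])]
    return [row[n:] for row in aug]


def vif_from_correlations(corr):
    """Return one VIF per predictor from the predictors' correlation matrix."""
    inverse = invert(corr)
    return [inverse[i][i] for i in range(len(corr))]


# Hypothetical pair of predictors correlated at r = 0.79.
pair = [[1.0, 0.79],
        [0.79, 1.0]]
vifs = vif_from_correlations(pair)
print([round(v, 2) for v in vifs])  # equals 1 / (1 - 0.79**2) for both
```

With only two predictors the VIF stays well below 5 even at r = 0,79; values near 5 arise when a predictor is jointly explained by several others, which is why the full-model VIFs are the relevant diagnostic.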
Table 6 Variance Inflation Factor analyses for variables used in models 7 and 8 (N = 130).
Variance Inflation Factor summary for model 7 and 8 (N = 130)
Dependent variable = Box Office Total
Independent variables                       Model 7   Model 8
 1 Audience Score                           2,64      2,66
 2 Audience Score Volume                    2,80
 3 Rotten Tomatoes Current Score            3,75      3,86
 4 Rotten Tomatoes Current Volume           4,61      5,10
 5 Studio Major (1)                         1,76      1,72
 6 Screens Total                            3,20      4,63
 7 Production Budget                        3,76
 8 Recognized Property (1)                  2,07      2,05
 9 Animated (1)                             2,95      2,95
10 Genre Action & Adventure (1)             1,75      1,74
11 Genre Drama (1)                          2,19      2,23
12 Genre Comedy (1)                         3,07      3,18
13 Genre Mystery & Suspense (1)             1,51      1,51
14 Genre Kids & Family (1)                  2,93      3,27
15 Genre Horror (1)                         2,20      2,55
16 Genre Science Fiction & Fantasy (1)      1,86      1,80
17 Genre Other (1)                          1,35      1,38
18 MPAA rating R (1)                        1,54      1,43
19 Production Budget (log)                            3,68
20 Audience Score Volume (log)                        4,92
*** p<.01; ** p<.05; * p<.10.
The significant coefficients are written in bold for an easier overview.
Test of Hypothesis 1
In order to understand whether review turnover has an effect on the Rotten Tomatoes score, I
have employed a linear regression analysis. The results of this analysis are shown in Table 7
below. This model can account for almost half of the variance in the Rotten Tomatoes score
(R2 = 0,49 and adjusted R2 = 0,42). Based on the results of the linear regression, review
turnover was found to be a significant variable (b = 3,23; t = 7,08; p < 0,01 and standardised
BETA = 0,57). Recognized Property, Animation and the genres Comedy, Kids & Family and Other
were found to be significant factors in the Rotten Tomatoes score as well. It is worth
highlighting that the genres Comedy, Kids & Family and Other had negative coefficients, which
might imply that critics tend to give lower ratings to movies in these genres.
Table 7 Results of linear regression for model 1. Dependent variable: Rotten Tomatoes. Independent variable:
Review turnover (days). Sample size: 111 movies.
Regression results for model 1 (DV = Rotten Tomatoes score opening, N = 111)
                                        Unstandardised coefficients     Standardised coefficients
Independent variables                   b         t         p           BETA
 1 Review Turnover                      3,231     7,082     < 0,01***   0,567
 2 Studio Major (1)                     5,532     0,991     0,324       0,087
 3 Production Budget                    -0,028    -0,479    0,633       -0,055
 4 Recognized Property (1)              10,413    1,787     0,077*      0,165
 5 Animated (1)                         34,881    3,238     0,002***    0,380
 6 Genre Action & Adventure (1)         1,478     0,238     0,813       0,023
 7 Genre Drama (1)                      -1,184    -0,178    0,859       -0,019
 8 Genre Comedy (1)                     -20,754   -2,397    0,018**     -0,317
 9 Genre Mystery & Suspense (1)         -1,101    -0,145    0,885       -0,013
10 Genre Kids & Family (1)              -21,014   -1,678    0,097*      -0,222
11 Genre Horror (1)                     -11,501   -1,382    0,170       -0,159
12 Genre Science Fiction & Fantasy (1)  -1,040    -0,138    0,891       -0,014
13 Genre Other (1)                      -19,037   -1,901    0,06*       -0,157
14 MPAA rating R (1)                    6,497     1,102     0,273       0,096
Multiple R2 = 0,4918
Adjusted R2 = 0,4177
F (14, 96) = 6,637
p < 0,01***
*** p<.01; ** p<.05; * p<.10.
The significant coefficients are written in bold for an easier overview.
As a part of the regression analysis, I employed the pairs.panels command in R, which gives an
overview of correlations between the variables, histograms, and scatterplots with regression
lines. An example of the output of this command is shown below. As one can see in Figure IV,
the variables Review Turnover and Production Budget are not normally distributed and exhibit
observable positive skewness. Therefore, I have decided to test the robustness of my model by
applying data transformation techniques to the variables with non-normal distributions. The
results of this test are presented in the next section.
Figure IV Output of pairs.panels command in R used on model 1. This figure shows bivariate scatter plots below
the diagonal, histograms on the diagonal and correlations between the variables above the diagonal. Sample size of
111 movies.
*** p<.001; ** p<.01; * p<.05; . p<.10.
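The relation between the multiple and adjusted R2 values reported for the regression models can be checked with a short calculation. The sketch below is in Python (the thesis used R) and assumes the usual formula adjusted R2 = 1 - (1 - R2)(n - 1)/(n - k - 1), with n observations and k predictors; the function name is hypothetical. The inputs are the multiple R2 values, sample sizes and numbers of predictors reported in Tables 7, 8 and 9.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n observations and k predictors (plus intercept)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# (multiple R2, sample size, number of predictors) as reported in Tables 7-9.
reported = {
    "model 1": (0.4918, 111, 14),
    "model 2": (0.5851, 111, 14),
    "model 3": (0.6367, 130, 15),
}
for name, (r2, n, k) in reported.items():
    print(name, round(adjusted_r2(r2, n, k), 4))
```

For these inputs the function reproduces the adjusted values printed in the tables (0,4177, 0,5246 and 0,5889), which suggests the tables follow this standard definition.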
Hypothesis 1 robustness test
As mentioned earlier, I have applied data transformation techniques in order to normalise the
variables Review Turnover and Production Budget. To get distributions as close as possible to
the normal distribution, I used a square root transformation on Review Turnover and a log
transformation on Production Budget. As a result, the output of the pairs.panels command shows
distributions that are much closer to the normal distribution than those of the untransformed
data.
Figure V Output of pairs.panels command in R used on model 2 (robustness test of hypothesis 1). This figure
shows bivariate scatter plots below the diagonal, histograms on the diagonal and correlations between the variables
above the diagonal. Sample size of 111 movies.
*** p<.001; ** p<.01; * p<.05; . p<.10.
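The effect of such transformations on skewness can be illustrated with a small sketch. It is written in Python rather than R, uses invented budget figures (not the thesis data) and a hypothetical helper `skewness` based on the moment coefficient of skewness. It only shows the general mechanism by which a square root or log transformation pulls in a long right tail.

```python
import math


def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5 (population moments)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical production budgets in $ million (illustrative, not thesis data).
budgets = [5, 10, 12, 20, 25, 40, 60, 100, 150, 300]

raw = skewness(budgets)
rooted = skewness([math.sqrt(v) for v in budgets])
logged = skewness([math.log(v) for v in budgets])
print(round(raw, 2), round(rooted, 2), round(logged, 2))
```

On this invented sample the log transformation reduces skewness more than the square root does, which matches the usual rule of thumb that the log is the stronger of the two corrections for right-skewed data.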
With the variables transformed, I have applied linear regression analysis to model 2. The
results of this analysis are shown in Table 8 below. This model can explain even more of the
variance in the Rotten Tomatoes score (R2 = 0,59 and adjusted R2 = 0,52). Based on the results
of the linear regression, review turnover was found to be a significant variable (b = 18,08;
t = 9,06; p < 0,01 and standardised BETA = 0,71). Recognized Property, Animation and the genres
Comedy and Kids & Family were found to be significant factors in the Rotten Tomatoes score, as
in model 1. However, after the data transformation Production Budget became significant at the
10% level (p = 0,073) and genre Other lost its significance (p = 0,159 compared to p = 0,06 in
model 1).
Table 8 Results of linear regression for model 2. Dependent variable: Rotten Tomatoes. Independent variable:
Review turnover (days). Sample size: 111 movies.
Regression results for model 2 (DV = Rotten Tomatoes score opening, N = 111)
                                        Unstandardised coefficients     Standardised coefficients
Independent variables                   b         t         p           BETA
 1 Review Turnover (sqrt)               18,076    9,064     < 0,01***   0,707
 2 Studio Major (1)                     2,124     0,422     0,674       0,034
 3 Production Budget (log)              -5,735    -1,811    0,073*      -0,194
 4 Recognized Property (1)              10,610    1,981     0,050*      0,168
 5 Animated (1)                         33,452    3,423     < 0,01***   0,364
 6 Genre Action & Adventure (1)         2,922     0,513     0,609       0,046
 7 Genre Drama (1)                      3,064     0,509     0,612       0,048
 8 Genre Comedy (1)                     -13,909   -1,781    0,078*      -0,213
 9 Genre Mystery & Suspense (1)         -1,667    -0,242    0,809       -0,020
10 Genre Kids & Family (1)              -23,416   -2,029    0,045*      -0,247
11 Genre Horror (1)                     -8,517    -1,118    0,266       -0,118
12 Genre Science Fiction & Fantasy (1)  1,945     0,294     0,770       0,026
13 Genre Other (1)                      -12,962   -1,421    0,159       -0,107
14 MPAA rating R (1)                    4,790     0,915     0,362       0,071
Multiple R2 = 0,5851
Adjusted R2 = 0,5246
F (14, 96) = 9,671
p < 0,01***
*** p<.01; ** p<.05; * p<.10.
The significant coefficients are written in bold for an easier overview.
Test of Hypothesis 2
My second hypothesis revolved around the effects of the Rotten Tomatoes score on the
performance of movies at the box office on the opening weekend. In order to test it, I have
once again employed linear regression, this time on model 3. The results of this analysis are
shown in Table 9 below. This model can explain around 60% of the variance in the opening box
office (R2 = 0,64 and adjusted R2 = 0,59). Based on the results of the linear regression,
Rotten Tomatoes score (b = 0,31; t = 4,33; p < 0,01 and standardised BETA = 0,27) and production
budget (b = 0,27; t = 4,97; p < 0,01 and standardised BETA = 0,48) were found to be significant
variables. The number of screens the movie was shown on at the premiere, recognized property
status and the genres Action & Adventure and Science Fiction & Fantasy were also found to be
significant factors for the box office opening. Of these, Action & Adventure had a negative
coefficient (b = -9,95), whereas Animation was not significant in this model (p = 0,195).
Table 9 Results of linear regression for model 3. Dependent variable: Box Office Opening. Independent variable:
Rotten Tomatoes score. Sample size: 130 movies.
Regression results for model 3 (DV = Box Office Opening, N = 130)
                                        Unstandardised coefficients     Standardised coefficients
Independent variables                   b         t         p           BETA
 1 Rotten Tomatoes score opening        0,31      4,33      < 0,01***   0,272
 2 Studio Major (1)                     -6,65     -1,31     0,192       -0,094
 3 Screens Opening                      0,01      2,30      0,023**     0,222
 4 Production Budget                    0,27      4,97      < 0,01***   0,479
 5 Recognized Property (1)              11,35     1,99      0,049**     0,157
 6 Animated (1)                         -13,12    -1,30     0,195       -0,122
 7 Genre Action & Adventure (1)         -9,95     -1,87     0,064*      -0,139
 8 Genre Drama (1)                      3,02      0,51      0,610       0,042
 9 Genre Comedy (1)                     10,12     1,39      0,167       0,137
10 Genre Mystery & Suspense (1)         4,63      0,74      0,459       0,050
11 Genre Kids & Family (1)              -4,09     -0,38     0,707       -0,036
12 Genre Horror (1)                     5,02      0,75      0,456       0,062
13 Genre Science Fiction & Fantasy (1)  12,27     1,80      0,075*      0,137
14 Genre Other (1)                      8,03      0,95      0,344       0,061
15 MPAA rating R (1)                    -2,63     -0,53     0,599       -0,036
Multiple R2 = 0,6367
Adjusted R2 = 0,5889
F (15, 114) = 13,32
p < 0,01***
*** p<.01; ** p<.05; * p<.10.
The significant coefficients are written in bold for an easier overview.
Hypothesis 3
Based on my third and final hypothesis, I have designed a number of models in order to
understand whether critical reviews or feedback from other moviegoers is a better predictor of
box office success. I have conducted a series of linear regression tests in the same manner as
Bharadwaj et al. (2017) and Kim et al. (2013). Firstly, I established a baseline where I used
total box office as the dependent variable and movie attributes as independent variables
(model 4). This baseline is later used to compare R2 values and determine which variables have
greater predictive power for box office performance. After that, I added the current Rotten
Tomatoes score and volume (model 5) and the current Audience Score and volume (model 6)
separately. In model 7, I combined both critic- and user-generated scores and their volumes.
Finally, model 8 was tested on the same set of variables as model 7, but the variables
Audience Score Volume, Production Budget and Box Office Total were log-transformed in order to
achieve a more normal distribution of values and higher robustness. The results of the
regression tests of models 4 through 8 are presented in Table 10.
The regression results of model 4 showed that attributes of movies, i.e. genre, MPAA rating,
type of action (animated or not) and membership in a movie franchise, in conjunction with
business decisions such as production budget and the number of screens that a movie was shown
on, can explain around 50% (multiple R2 = 0,54 and adjusted R2 = 0,48) of the variance in box
office performance. Adding the Rotten Tomatoes score and volume (model 5) to the movie
attributes added ca another 12 percentage points to the explained variance (multiple R2 = 0,66
and adjusted R2 = 0,61). Adding the Audience Score and volume (model 6) added approximately 21
percentage points to the explained variance (multiple R2 = 0,74 and adjusted R2 = 0,70). The
combination of both Rotten Tomatoes and Audience scores and volumes (model 7) accounted for
approximately three quarters of the total variance (multiple R2 = 0,76 and adjusted R2 = 0,72).
Finally, model 8, which evaluates the effects of both Rotten Tomatoes and Audience scores and
volumes on box office performance but with Audience Score Volume, Production Budget and Box
Office Total log-transformed in order to achieve a more normal distribution of values, had
even higher explanatory power, as it could account for approximately 85% of the total variation
of the box office (multiple R2 = 0,86 and adjusted R2 = 0,83).
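The comparison of explanatory power across models amounts to simple differences in R2. The sketch below (Python, whereas the thesis used R) takes the rounded multiple R2 values quoted in the paragraph above and computes each model's increment over the baseline model 4, in percentage points. Because the inputs are rounded, an increment can differ by about one point from the figure quoted in the text. Model 8 is left out on purpose: its dependent variable is log-transformed, so its R2 is measured on a different scale and a direct difference against model 4 would not be a like-for-like comparison.

```python
# Multiple R2 values as quoted in the text (rounded to two decimals).
r2 = {"model 4": 0.54, "model 5": 0.66, "model 6": 0.74, "model 7": 0.76}

baseline = r2["model 4"]
# Increment over the baseline, in percentage points of explained variance.
increments = {name: round((value - baseline) * 100)
              for name, value in r2.items() if name != "model 4"}
print(increments)
```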
When it comes to the variables themselves, Screens Opening was the only variable that was
significant in all 4 models and had the highest standardised BETA in the cases when it was
calculated. Other variables that were significant and had relatively high standardised BETAs
were Audience Score Volume (models 6, 7 and 8), Audience Score (models 6 and 8), Rotten
Tomatoes Current Score (models 5 and 7) and the genres Comedy (models 5, 6, 7 and 8) and
Science Fiction & Fantasy (models 4, 5 and 6). It is also worth mentioning that 2 variables,
Animated (model 5) and MPAA rating R (model 8), had low p values (p <.001***) and negative b
coefficients, which might imply that these variables have a negative effect on box office
performance. To summarise the observations around hypothesis 3, one could conclude that the
number of reviews by Rotten Tomatoes users (non-critics), movies’ production budget and the
number of screens that movies are shown on are the more significant predictors of box office
success in the long run.
Having said that, I would like to draw attention to Figure VI, which shows bivariate profiling
of the relationships between the variables used in testing hypothesis 3. One can immediately
see that Audience Score Volume has a rather high degree of correlation with Box Office Total,
Screens Total and Production Budget. One may argue that a large number of screens is a
necessary condition for large box office revenue, as a movie shown on 100 screens cannot
generate as much revenue as a movie shown on 1000, all other factors being equal. As mentioned
at the beginning of this thesis, production budget and marketing budget are heavily correlated
(the marketing budget usually corresponds to 80% of the production budget), and since
marketing activities might affect the interest generated around a movie among both critics and
moviegoers, there might be some degree of causal effect. However, I have not included
marketing budget in my analysis (due to lack of solid data on this variable), and in the
course of this thesis I have not established any causal effects between the variables
available to me. This, in conjunction with the absence of extreme (over 10) VIF values, led me
to the conclusion that even if bias or issues associated with multicollinearity influence my
models, I was not able to establish that due to time constraints and the limitations of the
chosen methods. According to Hair (2010, pp. 643–645), models that involve multiple predictor
constructs may exhibit multicollinearity, and additional analysis is required to determine
whether the multicollinearity is a result of causal interference. This analysis was not
possible in the course of this study due to time constraints.
Last but not least, it is also rather conspicuous that these variables, along with Rotten
Tomatoes Current Volume, show a tighter organisation of points along the regression line, which
is indicative of a linear relationship or correlation (Hair, 2010, p. 39). In other words, the
independent variables (Audience Score Volume, Production Budget, Screens Total and Rotten
Tomatoes Score Volume) exhibit rather equal levels of variance in relation to the dependent
variable (Box Office Total). This means that the relationship between these variables is
homoscedastic, which is rather fortunate for my analysis as homoscedasticity is one of the
statistical assumptions required for linear regression (Hair, 2010, p. 71). In the case of
model 8, the relations between at least 4 variables exhibited homoscedasticity, which may
explain the higher R2 values compared to model 1, where R2 values were lower, yet still rather
high.
DISCUSSION AND IMPLICATIONS
In this thesis, I have attempted to present a comprehensive case for quality signals (Beckert
and Musselin, 2013; Callon et al., 2002; Dubuisson-Quellier, 2013; Spence, 1973; White,
1981) that are sent to different audiences by the moviemakers in the US and Canada domestic
market. Firstly, moviemakers test their movies on test audiences, and based on the results of
such test screenings the studios determine when the movie will be shown to movie critics.
Based on the results of the test screenings, the moviemakers also decide when critics will be
allowed to publish their reviews. When the reviews are published in different sources, they are
also linked to and aggregated on a website called Rotten Tomatoes. Rotten Tomatoes employs a
mechanism where a movie gets a Rotten Tomatoes score based on whether the critical reviews are
positive or negative. Some sources (Ahsan, 2017; Cavna, 2017b; Dickey and Han, 2017) argued
that the shorter the time frame between the publication of reviews and the premiere of a movie
at the box office, the lower the Rotten Tomatoes score the movie would get. This phenomenon is
referred to as a review embargo by professionals in the motion picture market. Therefore, I
have decided to use the review embargo as a quality signal sent by moviemakers to the
professional critics.
Unfortunately, I encountered a number of issues with the operationalisation of review
embargoes. Firstly, information about embargo timelines was not available on the internet for
all the movies released in 2017. Secondly, the sources mentioned above that discussed the
importance of review embargoes were not published in peer-reviewed journals. Therefore, I
decided to conduct a series of interviews with movie critics in order to understand whether the
critics themselves saw review turnover as a quality signal. Although I contacted over 25
critics, only 1 interview was conducted, and 1 movie reviewer answered some of my questions via
email. Although this was a very small sample, I was able to operationalise the notion of review
embargo by using the date of review publication on Rotten Tomatoes as an indicator of the date
when the review embargo was lifted. By applying this method, I was able to formulate my first
hypothesis and test whether review turnover could explain variance in the Rotten Tomatoes
score. Regression tests showed that the models containing review turnover could explain
roughly 40% to 60% of the variance in the Rotten Tomatoes score, depending on whether the data
was transformed (R2 = 0,59 and adjusted R2 = 0,52) or not (R2 = 0,49 and adjusted R2 = 0,42).
After the analysis of the Rotten Tomatoes score, I followed in the footsteps of many
researchers who studied the effects of critical reviews and word of mouth on box office
performance (Basuroy et al., 2006; Bharadwaj et al., 2017; Brown et al., 2012; Eliashberg and
Shugan, 1997; Gopinath et al., 2010; Kim et al., 2013; Lee and Choeh, 2018; McKenzie, 2009; Oh
et al., 2017; Zuckerman, 2003). The main difference between my research and previous research
is that in many cases previous research was based on unstructured reviews and/or reviews and
word of mouth that came from different sources. Therefore, as a part of the pre-study for this
thesis, I investigated the claims that Rotten Tomatoes influences the decisions of one third
of US moviegoers (Faughnder, 2018; Fritz, 2016). The results of my pre-study showed that
around 12 million people in the US visited Rotten Tomatoes on a monthly basis. Moreover, web
traffic statistics showed peaks of unique visits on the weekends when movies premiered. Based
on this observation, I made the assumption that the Rotten Tomatoes score could serve as a
valid quality signal to the moviegoers. Therefore, I used linear regression in order to
determine whether the Rotten Tomatoes score could explain variance in box office performance
on the premiere weekend. The results of the regression tests showed a rather high predictive
power of my model (R2 = 0,64 and adjusted R2 = 0,59).
While gathering data, I noticed that a large number of Rotten Tomatoes users (non-critics)
left their feedback on the movies. For the 130 movies that I tested my models on, more than
3 000 000 non-critical reviews were submitted compared to ca 20 000 reviews written by
professional critics. Given the large number of user-generated reviews and the influence of
word of mouth on box office performance, I decided to use the Audience Score on Rotten Tomatoes
as a proxy for word of mouth valence and the number of user-generated reviews as a proxy for
the volume of word of mouth. Based on that, I formulated my last and final hypothesis in order
to compare ratings left by professional reviewers and moviegoers. Based on a series of linear
regression tests, I could determine that the volume and valence of critical reviews could
explain roughly 60% (multiple R2 = 0,66 and adjusted R2 = 0,61) of the variance in box office
performance over time; the volume and valence of moviegoers’ reviews could explain roughly 70%
(multiple R2 = 0,74 and adjusted R2 = 0,70) of the variance in box office performance over
time; and, finally, the combination of both could account for ca 72% of the total variation of
the box office (multiple R2 = 0,76 and adjusted R2 = 0,72). With some data transformation
techniques applied, the combination of critical and non-critical reviews could explain over 80%
(multiple R2 = 0,86 and adjusted R2 = 0,83) of the total variation of the box office.
Such a high coefficient of determination would, and probably should, raise some eyebrows.
However, Kim et al. (2013), whose study is rather comparable to mine in both method and set of
variables, reported an R2 of 0,87. Bharadwaj et al. (2017), who used a similar methodological
approach but a more complicated set of variables, achieved R2 = 0,82 and adjusted R2 = 0,73.
Zuckerman's (2003) study on the influence of critical reviews on box office performance also
had a coefficient of determination over 0,80. Brown et al. (2012), who studied the effects of
cold openings on box office, also calculated a coefficient of determination over 0,70
(R2 = 0,71). With these examples, I want to illustrate that such coefficients of determination
are rather common in studies covering motion picture markets. However, I must admit that my
research had a somewhat smaller sample size, and due to that my regression tests might have
some degree of bias built in. Nevertheless, I believe that the method and theoretical basis
used in this thesis were sound, and application of the same models to a larger sample, e.g.
widely released movies from 2017 and 2018, might help future researchers test the predictive
power of my models. More importantly, application of a similar method to a larger sample might
add validity to my assumption that user-generated reviews on one of the most popular web pages
can serve as a proxy for word of mouth scattered over many sources on the internet.
Having discussed the models, I would like to draw attention to the variables used in the
models. Based on the analysis, I could conclude that different variables had different degrees
of influence on professional critics and moviegoers. Also, the comparison of the Rotten
Tomatoes Score and the Audience Score seemed to indicate that critics were more nuanced in
their judgement, as their ratings spanned a wider range, with both lower and higher rated
movies, than the ratings generated by non-critic users. This supports an earlier finding by
Goff et al. (2016), who argued for the notion of a two-sided market with mass consumers and
artistic, elite consumers who evaluate the same movies differently. However, it is worth
mentioning that this phenomenon was only observed in the analysis of quality signals perceived
by the professional critics in conjunction with movie premieres (models 1, 2 and/or 3). For
example, the genres Comedy and Kids & Family had negative effects on scores set by critics but
had a positive effect on scores set by audiences and on box office performance. MPAA rating R
exhibited a negative influence on box office performance (model 8), which supports earlier
findings by Palsson et al. (2013).
It was also found that the volume and valence of reviews (both critical and user-generated) had
positive effects on box office (models 5, 6, 7 and 8). This result supports previous findings
by Zuckerman (2003), Moul (2007), McKenzie (2009) and Lee and Choeh (2018). It is also worth
mentioning that the volume of reviews (models 7 and 8) had a greater influence on box office
performance than the valence of reviews, which supports previous findings by Bharadwaj et al.
(2017) and Gopinath et al. (2010). Production budget was also found to be a good predictor of
box office success, which is supported by the findings of Bharadwaj et al. (2017) and Kim et
al. (2013) but contradicted by McKenzie's (2009) results.
Previous studies (Basuroy et al., 2006; Bharadwaj et al., 2017; McKenzie, 2009; Zhao et al.,
2013) found that the recognition factor, in name or by belonging to a movie franchise, was a
significant factor in box office success. This was only partially supported by the findings of
this study, as it was a significant factor prior to or at the time of the movie premiere
(models 1, 2 and 3) but not at the later stages of the movies’ life cycle. This, however,
supports the findings of McKenzie (2009).
So far, the findings of my study generally support findings from previous studies. There were,
however, 2 significant variables that were omitted from my research: advertising budget and
star power. Many researchers (Basuroy et al., 2006; Bharadwaj et al., 2017; Goff et al., 2016;
Karniouchina, 2011; Zuckerman, 2003; Zuckerman et al., 2003) argued for the importance of
either or both of these variables. However, due to lack of access to advertising budget data
and the difficulty of operationalising the star power of directors and/or actors, I decided to
omit these variables from my analysis. Goff et al. (2016) argued that star power was
associated with the probability of above-average critical reviews. Possibly, the omission of
star power from my analysis led to a lower coefficient of determination in the model that was
used to find the relation between review turnover and the Rotten Tomatoes score. Therefore, I
would suggest using star power in future research in order to determine its effects on
critical ratings.
Last, but not least, category researchers argued that movies that fall outside established
categories, or fall into several categories that contradict each other, tend to be less
successful (Hsu, 2006; Hsu et al., 2009; Karniouchina, 2011; Zhao et al., 2013; Zuckerman,
2003; Zuckerman et al., 2003). Although none of the models tested this notion specifically, it
was observed that only 34 movies had a single genre on Rotten Tomatoes, the average number of
genres was 2,18 and the top 20 grossing movies in 2017 all had at least 2 genres. Again, this
does not directly contradict previous findings on identity spanning, but this aspect might be
used in future research, as movies with several genres were successful in 2017. Therefore, it
would be interesting to research which identities could complement each other and which
identities would be more counterproductive.
Theoretical implications
This research contributes to the existing academic literature in several ways. Firstly, in
contrast to previous studies that focused on quality signals sent to the end consumers, I have
also analysed quality signals at the stages of the movies’ life cycle that precede interaction
with end consumers. Results of test screenings may determine when movies will be screened to
professional critics and when the critics will be allowed to publish their reviews. This
amount of time may serve as a quality signal to the critics. I have performed a linear
regression test on this assumption and found that the model including review turnover could
explain over 40% of the variance of Rotten Tomatoes scores on the night of the previews.
Secondly, in the course of the regression tests, I was able to confirm previous findings that
the volume and valence of word of mouth were more significant predictors of box office success
than the volume and valence of critical reviews.
Methodological implications
This study has attempted to expand the toolkit available to researchers. By employing the
Internet Archive, I was able to gather data as it was available to the moviegoers on the day
of the premiere. As shown earlier, Rotten Tomatoes scores changed over time. This means that
when working with aggregated values like web-based ratings, researchers must be aware of this
phenomenon and, where possible, control for this variation. Use of the Internet Archive might
present an opportunity for researchers to perform longitudinal studies of the past and account
for variance in the variables. To give readers an example of the kind of variation one could
discover with this method: “Despicable Me 3” had a rating of 61% upon release but has a rating
of 59% at the moment. A change of 2 percentage points might seem insignificant, but it meant
that the movie went from being “fresh” to being “rotten”. Suddenly, researchers could encounter
problems associated with lumping and splitting (Zerubavel, 1996) as movies transcend categories
over time. Therefore, this change in rating could have effects on regression tests, especially
if logit regression is employed or a significant number of variables change over time.
Researchers could avoid this by employing services such as the Internet Archive.
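The category change described above can be made concrete with a short sketch. It is in Python (the thesis used R) and assumes Rotten Tomatoes' published rule that a Tomatometer score of 60% or higher is "fresh" and anything lower is "rotten"; the function name and the dummy coding are hypothetical and only illustrate how a 2-point drift flips a binary variable of the kind a logit model would use.

```python
FRESH_THRESHOLD = 60  # assumed cut-off: 60% or higher counts as "fresh"


def tomatometer_category(score):
    """Map a 0-100 Tomatometer score to its 'fresh'/'rotten' category."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    return "fresh" if score >= FRESH_THRESHOLD else "rotten"


# "Despicable Me 3": 61% at release, 59% at the time of writing (from the text).
at_release = tomatometer_category(61)
later = tomatometer_category(59)

# Dummy coding as it might enter a logit model: 1 = fresh, 0 = rotten.
dummy_at_release = int(at_release == "fresh")
dummy_later = int(later == "fresh")
print(at_release, later, dummy_at_release, dummy_later)
```

A continuous score absorbs the same drift as a small change in a regressor, whereas the categorical coding turns it into a different observation altogether, which is why the date on which the score was recorded matters.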
Business implications
Finally, I have presented a strong case of interaction between moviemakers and professional critics.
However, the volume of end-consumer generated reviews is a more significant predictor of box office
success. More web traffic is drawn to IMDb, which is a page where user-generated ratings are presented.
Moreover, professional critics seemed to be more nuanced in their judgement, as exemplified by the relation
between Rotten Tomatoes score, Audience score and box office performance. In some cases these
differences are so significant that Goff et al. (2016) argued for a market separation between mass
consumers and artistic elite markets, and Rössel and Beckert (2013) argued that some products are
customized to the taste of critics and not end-consumers. Therefore, I would like to propose that business
owners in mass markets promote more quality signals generated by end-consumers and fewer generated by
professional critics, especially if producers already have access to feedback generated by the
end-consumers, e.g. results of test screenings in motion picture markets.
REFERENCES
Ahsan, S., 2017. Rotten Tomatoes Delays Its Score For Justice League. National Post.
Akerlof, G.A., 1970. The Market for “Lemons”: Quality Uncertainty and the Market Mechanism.
The Quarterly Journal of Economics 84, 488.
Alexa, 2018. Alexa Internet - Site Comparisons [WWW Document]. URL
https://www.alexa.com/comparison#?sites=imdb.com&sites=rottentomatoes.com (accessed
5.19.18).
Arnold, V., 2006. Behavioral research opportunities: Understanding the impact of enterprise systems.
International Journal of Accounting Information Systems 7, 7–17.
https://doi.org/10.1016/j.accinf.2006.02.001
Aspers, P., 2012. Markets in fashion: A phenomenological approach.
Aspers, P., 2009. Knowledge and valuation in markets. Theory and Society 38, 111–131.
https://doi.org/10.1007/s11186-008-9078-9
Bae, J., Kim, B.-D., 2013. IS THE ELECTRONIC WORD OF MOUTH EFFECT ALWAYS
POSITIVE ON THE MOVIE? Academy of Marketing Studies Journal 17, 61.
Basuroy, S., Desai, K.K., Talukdar, D., 2006. An empirical investigation of signaling in the motion
picture industry. Journal of Marketing Research 43, 287.
Beckert, J., 1996. What is sociological about economic sociology? Uncertainty and the embeddedness
of economic action. Theory and Society 25, 803–840. https://doi.org/10.1007/BF00159817
Beckert, J., Musselin, C., 2013. Constructing quality. Oxford University Press.
Beckert, J., Rössel, J., 2013. The price of art. European Societies 15, 178–195.
https://doi.org/10.1080/14616696.2013.767923
Beckert, J., Rössel, J., Schenk, P., 2017. Wine as a Cultural Product. Sociological Perspectives 60,
206–222. https://doi.org/10.1177/0731121416629994
Bharadwaj, N., Noble, C.H., Tower, A., Smith, L.M., 2017. Predicting Innovation Success in the
Motion Picture Industry: The Influence of Multiple Quality Signals. Journal of Product
Innovation Management 34, 659. https://doi.org/10.1111/jpim.12404
Bowker, G.C., Star, S.L., 1999. Sorting things out, Inside technology. MIT Press.
Box Office Mojo, 2018a. About Box Office Mojo [WWW Document]. URL
http://www.boxofficemojo.com/about/?ref=ft (accessed 3.21.18).
Box Office Mojo, 2018b. All Time Domestic Box Office Results [WWW Document]. URL
http://www.boxofficemojo.com/alltime/domestic.htm (accessed 3.26.18).
Box Office Mojo, 2018c. 2017 Yearly Box Office Results - Box Office Mojo [WWW Document].
URL http://www.boxofficemojo.com/yearly/chart/?yr=2017&p=.htm (accessed 5.19.18).
Brown, A.L., Camerer, C.F., Lovallo, D., 2012. To Review or Not to Review? Limited Strategic
Thinking at the Movie Box Office. American Economic Journal: Microeconomics 4, 1–26.
Callon, M., Meadel, C., Rabeharisoa, V., 2002. The economy of qualities. Economy and Society 31,
194.
Cavna, M., 2017a. 2017 is the best year in superhero movie history, according to the Rotten Tomatoes
math. The Washington Post.
Cavna, M., 2017b. “The Emoji Movie” has a zero score on Rotten Tomatoes - critics are eviscerating
the film. The Washington Post.
Couch, A., 2017. DC Movies: A Timeline of the Upcoming Films (And What’s in the Works) [WWW
Document]. The Hollywood Reporter. URL
https://www.hollywoodreporter.com/lists/dc-movies-upcoming-timeline-release-dates-extended-universe-970619
(accessed 5.23.18).
Dickey, J., Han, A., 2017. Embargo dates and Rotten Tomatoes scores: What’s the relationship?
[WWW Document]. URL
https://mashable.com/2017/09/05/movies-rotten-tomatoes-scores-embargo-times/#3Tt72Aot0iqq
(accessed 4.23.18).
DiMaggio, P., 1987. Classification in Art. American Sociological Review 52, 440.
Dockterman, E., 2018. A Complete List of Upcoming Marvel Movies: Dates, Casts | Time [WWW
Document]. URL http://time.com/5167535/upcoming-marvel-movies/ (accessed 5.23.18).
Dubuisson-Quellier, S., 2013. From Qualities to Value: Demand Shaping and Market Control in Mass
Consumption Markets, in: Constructing Quality. Oxford University Press.
Eliashberg, J., Shugan, S.M., 1997. Film Critics: Influencers or Predictors? Journal of Marketing 61,
68.
Faughnder, R., 2018. How Rotten Tomatoes became Hollywood’s most influential — and feared —
website. Los Angeles Times.
Faughnder, R., 2017. How Rotten Tomatoes became Hollywood’s most influential — and feared —
website [WWW Document]. URL
http://www.latimes.com/business/hollywood/la-fi-ct-rotten-tomatoes-20170721-htmlstory.html
(accessed 3.27.18).
Fritz, B., 2016. Studios Live and Die by the Tomato --- Hollywood focuses on sites like Rotten
Tomatoes that distill reviews into single score. Wall Street Journal.
Goff, B., Wilson, D., Zimmer, D., 2016. MOVIES, MASS CONSUMERS, AND CRITICS:
ECONOMICS AND POLITICS OF A TWO-SIDED MARKET. Contemporary Economic
Policy. https://doi.org/10.1111/coep.12180
Goldenberg, J., Libai, B., Muller, E., 2001. Talk of the Network: A Complex Systems Look at the
Underlying Process of Word-of-Mouth. Marketing Letters 12, 211.
https://doi.org/10.1023/A:1011122126881
Gopinath, S., Chintagunta, P.K., Venkataraman, S., 2010. The Effects of Online User Reviews on
Movie Box Office Performance: Accounting for Sequential Rollout and Aggregation Across
Local Markets. Marketing Science 29, 944–957.
Hair, J.F., 2010. Multivariate data analysis. Pearson Education.
Hirsch, P.M., 1972. Processing Fads and Fashions: An Organization-Set Analysis of Cultural Industry
Systems. American Journal of Sociology 77, 639–659. https://doi.org/10.1086/225192
Hsu, G., 2006. Jacks of All Trades and Masters of None: Audiences’ Reactions to Spanning Genres in
Feature Film Production. Administrative Science Quarterly 51, 420–450.
https://doi.org/10.2189/asqu.51.3.420
Hsu, G., Kocak, O., Hannan, M.T., 2009. Multiple category memberships in markets: an integrative
theory and two empirical tests. American Sociological Review 74, 150.
IMDb, 2018a. IMDb | Help [WWW Document]. URL
https://help.imdb.com/article/imdb/general-information/what-is-imdb/G836CY29Z4SGNMK5?ref_=helpart_nav_1#
(accessed 3.21.18).
IMDb, 2018b. Solo: A Star Wars Story [WWW Document]. URL
http://www.imdb.com/title/tt3778644/ (accessed 5.19.18).
Internet Archive, 2017. Wayback Machine [WWW Document]. URL
https://web.archive.org/web/20170602000000*/rottentomatoes.com (accessed 5.19.18).
Karniouchina, E.V., 2011. Impact of star and movie buzz on motion picture distribution and box
office revenue. International Journal of Research in Marketing 28, 62–74.
https://doi.org/10.1016/j.ijresmar.2010.10.001
Karpik, L., 2010. Valuing the unique. Princeton University Press.
Kim, J., 2012. The institutionalization of YouTube: From user-generated content to professionally
generated content. Media, Culture & Society 34, 53.
Kim, S.H., Park, N., Park, S.H., 2013. Exploring the Effects of Online Word of Mouth and Expert
Reviews on Theatrical Movies’ Box Office Success. Journal of Media Economics 26, 98–114.
https://doi.org/10.1080/08997764.2013.785551
Klein, L.R., 1998. Evaluating the Potential of Interactive Media through a New Lens: Search versus
Experience Goods. Journal of Business Research 41, 195–203.
Lamont, M., 2012. Toward a Comparative Sociology of Valuation and Evaluation. Annual Review of
Sociology 38, 201–221. https://doi.org/10.1146/annurev-soc-070308-120022
Lee, S., Choeh, J.Y., 2018. The interactive impact of online word-of-mouth and review helpfulness on
box office revenue. Management Decision 56, 849.
MacKenzie, D., 2006. An Engine, Not a Camera: How Financial Models Shape Markets, MIT Press
Books. The MIT Press.
Martin, J., Frost, P.J., O’Neill, O.A., 2006. Organizational Culture: Beyond Struggles for Intellectual
Dominance, in: The SAGE Handbook of Organization Studies. SAGE Publications Ltd,
London, pp. 725–753. https://doi.org/10.4135/9781848608030
McCluskey, M., 2017. A Complete List of Upcoming Star Wars Movies: Dates, Casts | Time [WWW
Document]. URL http://time.com/5045736/upcoming-star-wars-movies/ (accessed 5.23.18).
McKenzie, J., 2012. THE ECONOMICS OF MOVIES: A LITERATURE SURVEY. Journal of
Economic Surveys 26, 42. https://doi.org/10.1111/j.1467-6419.2010.00626.x
McKenzie, J., 2009. Revealed word-of-mouth demand and adaptive supply: survival of motion
pictures at the Australian box office. Journal of Cultural Economics 33, 279.
https://doi.org/10.1007/s10824-009-9104-4
McKenzie, J., 2008. Bayesian Information Transmission and Stable Distributions: Motion Picture
Revenues at the Australian Box Office. Economic Record 84, 338–353.
https://doi.org/10.1111/j.1475-4932.2008.00495.x
Metacritic, 2018. About Us - Metacritic [WWW Document]. URL
http://www.metacritic.com/about-metacritic (accessed 5.31.18).
Moul, C.C., 2007. Measuring Word of Mouth’s Impact on Theatrical Movie Admissions. Journal of
Economics & Management Strategy 16, 859–892.
https://doi.org/10.1111/j.1530-9134.2007.00160.x
MPAA, 2018. New Report: Global Entertainment Market Expands on Multiple Fronts - MPAA
[WWW Document]. URL
https://www.mpaa.org/wp-content/uploads/2018/04/MPAA-THEME-Report-2017_Final.pdf
(accessed 4.11.18).
MPAA, 2013. Our Story [WWW Document]. Motion Picture Association of America. URL
https://www.mpaa.org/our-story/ (accessed 3.20.18).
Oh, C., Roumani, Y., Nwankpa, J.K., Hu, H.-F., 2017. Beyond likes and tweets: Consumer
engagement behavior and movie box office in social media. Information & Management 54,
25. https://doi.org/10.1016/j.im.2016.03.004
Palsson, C., Price, J., Shores, J., 2013. Ratings And Revenues: Evidence From Movie Ratings.
Contemporary Economic Policy 31, 13–21. https://doi.org/10.1111/j.1465-7287.2012.00315.x
Rasmussen, S.J., Berger, J., Sorensen, A.T., 2010. Positive Effects of Negative Publicity: When
Negative Reviews Increase Sales. Marketing Science 29, 815–827.
Rössel, J., Beckert, J., 2013. Quality Classifications in Competition: Price Formation in the German
Wine Market, in: Constructing Quality. Oxford University Press.
Rotten Tomatoes, 2018a. Rotten Tomatoes: About [WWW Document]. URL
https://www.rottentomatoes.com/about/ (accessed 3.27.18).
Rotten Tomatoes, 2018b. About Critics - Rotten Tomatoes [WWW Document]. URL
https://www.rottentomatoes.com/help_desk/critics/ (accessed 5.19.18).
Saunders, M.N.K., Lewis, P., Thornhill, A., 2009. Research methods for business students. Prentice
Hall.
SimilarWeb, 2018. Top Movies Websites in United States [WWW Document]. URL
https://www.similarweb.com/top-websites/united-states/category/arts-and-entertainment/movies
(accessed 5.19.18).
Spence, M., 1973. Job Market Signaling. The Quarterly Journal of Economics 87, 355–374.
White, H.C., 2002. Markets from networks. Princeton University Press.
White, H.C., 1981. Where Do Markets Come From? American Journal of Sociology 87, 517–547.
Ye, Q., Law, R., Gu, B., 2009. The impact of online user reviews on hotel room sales. International
Journal of Hospitality Management 28, 180–182. https://doi.org/10.1016/j.ijhm.2008.06.011
Zerubavel, E., 1996. Lumping and splitting: Notes on social classification. Sociological Forum 11,
421–433. https://doi.org/10.1007/BF02408386
Zhang, Z., Ye, Q., Law, R., Li, Y., 2010. The impact of e-word-of-mouth on the online popularity of
restaurants: A comparison of consumer reviews and editor reviews. International Journal of
Hospitality Management 29, 694–700. https://doi.org/10.1016/j.ijhm.2010.02.002
Zhao, E.Y., Ishihara, M., Lounsbury, M., 2013. Overcoming the Illegitimacy Discount: Cultural
Entrepreneurship in the US Feature Film Industry. Organization Studies 34, 1747–1776.
https://doi.org/10.1177/0170840613485844
Zuckerman, E.W., 2003. The critical trade-off: identity assignment and box-office success in the
feature film industry. Industrial and Corporate Change 12, 27–67.
https://doi.org/10.1093/icc/12.1.27
Zuckerman, E.W., 2000. Focusing the corporate product: Securities analysts and de-diversification.
Administrative Science Quarterly 45, 591.
Zuckerman, E.W., 1999. The Categorical Imperative: Securities Analysts and the Illegitimacy
Discount. American Journal of Sociology 104, 1398–1438.
Zuckerman, E.W., Kim, T.-Y., Ukanwa, K., Von Rittmann, J., 2003. Robust Identities or Nonentities?
Typecasting in the Feature-Film Labor Market. American Journal of Sociology 108, 1018–
1074.