Political slant in public broadcasting Author: Bart de Goede Supervisors: Dr. Maarten Marx Dr. Johan van Doornik June 23, 2011

Political slant in public broadcasting

DESCRIPTION

Presentation of my bachelor thesis in Information Science. It gives an overview of my attempt to use parsimonious language models on parliamentary proceedings to derive characteristic words for left-wing and right-wing parties, and to compare the occurrences of these words in subtitles of programmes broadcast by Dutch public broadcasting organizations.

Page 1: Political slant in public broadcasting

Political slant in public broadcasting

Author: Bart de Goede

Supervisors: Dr. Maarten Marx, Dr. Johan van Doornik

June 23, 2011

Page 2: Political slant in public broadcasting

Why?

Automatically identify political slant in Dutch public broadcasting

Page 3: Political slant in public broadcasting

Gentzkow & Shapiro (2010)

Gentzkow, M. and Shapiro, J. M. (2010). What drives media slant? Evidence from U.S. daily newspapers. Econometrica, 78(1):35–71.

Econometric research: compare the language use of news outlets to political language

Conclusion: ‘An economically significant demand for news slanted towards one’s own political ideology exists.’

Page 4: Political slant in public broadcasting

Operationalization

Find characteristic words for Republicans and Democrats in the proceedings of Congress

Count relative frequencies of these words in newspapers

Compare occurrence of words between newspapers
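The three steps above can be sketched as follows. This is a minimal illustration, not the procedure of Gentzkow and Shapiro: the word sets, the toy newspaper texts and the function name `relative_frequency` are all hypothetical, and tokenisation is reduced to whitespace splitting.

```python
from collections import Counter


def relative_frequency(tokens, characteristic_words):
    """Share of tokens that belong to a group's characteristic word set."""
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    # Use a set so that a word listed twice is not counted twice.
    hits = sum(counts[w] for w in set(characteristic_words))
    return hits / len(tokens)


# Hypothetical characteristic words and toy newspaper texts.
republican_words = {"tax", "freedom"}
democrat_words = {"workers", "healthcare"}
newspapers = {
    "A": "the tax cut protects freedom and workers".split(),
    "B": "healthcare for workers and lower tax".split(),
}

# Compare the occurrence of both word sets between newspapers.
for name, tokens in newspapers.items():
    rep = relative_frequency(tokens, republican_words)
    dem = relative_frequency(tokens, democrat_words)
    print(name, round(rep, 3), round(dem, 3))
```

Normalising by text length keeps long and short outlets comparable, which is why relative rather than absolute counts are used.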

Page 5: Political slant in public broadcasting

Differences

Dutch versus English

Television instead of newspapers

More political parties

Other technique to derive characteristic words

Other comparison method(s)

Page 6: Political slant in public broadcasting

Television

Subtitles for the hearing impaired (http://tt888.nl)

Data complete from January 2008 to February 2011

Problem: Hardly any useful metadata

Page 7: Political slant in public broadcasting

Television

Solution: TV guide

                          Before             After
Broadcasts with title     16.995             32.491
Unique titles             4.560 --> 2.702    2.238
Broadcast frequency > 2   1.104              1.064

Page 8: Political slant in public broadcasting

Television

Pauw & Witteman: 895.935 words
Nova: 362.844 words
NOS Journaal: 12.609.620 words
NOS Jeugdjournaal: 1.383.728 words
Netwerk: 879.635 words
Goedemorgen Nederland: 760.658 words
EenVandaag: 1.556.642 words
DWDD: 1.626.929 words

Programmes: Buitenhof, DWDD, EenVandaag, Goedemorgen Nederland, Het Elfde Uur, Holland Doc, Knevel en Van den Brink, Netwerk, Nieuwsuur, NOS Jeugdjournaal, NOS Journaal, Nova, Ochtendspits, Pauw & Witteman, PowNews, SchoolTV Weekjournaal, Sinterklaasjournaal, Tegenlicht, Uitgesproken, Vragenuurtje, Zembla

Page 9: Political slant in public broadcasting

Political groups

Hirst, G., Riabinin, Y., Graham, J., and Boizot-Roche, M. (2010). Text to Ideology or Text to Party Status?

Parliamentary period with the greatest overlap with the TV data set: Balkenende IV

Ideology: government versus opposition, not left versus right (Hirst et al., 2010)

Page 10: Political slant in public broadcasting

Political groups

Government (CDA, PvdA and ChristenUnie)

Left-wing opposition (GroenLinks, SP)

Right-wing opposition (PVV, VVD)

Page 11: Political slant in public broadcasting

Parsimonious language models

Hiemstra, D., Robertson, S., and Zaragoza, H. (2004). Parsimonious language models for information retrieval. In Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR ’04, pages 178–185, New York, NY, USA. ACM.

E-step:

$$e_t = tf(t, D) \cdot \frac{\lambda P(t \mid D)}{(1 - \lambda) P(t \mid C) + \lambda P(t \mid D)}$$

M-step:

$$P(t \mid D) = \frac{e_t}{\sum_{t} e_t}$$

Page 12: Political slant in public broadcasting

Parsimonious language models

Probability distribution from word frequencies per document

Compare distribution with collection of documents

Choose terms that are substantially more frequent than expected
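The three steps above can be sketched in Python, following the expectation-maximisation scheme of Hiemstra et al. (2004). This is a rough sketch under stated assumptions, not the thesis implementation: the function name `parsimonious_model`, the values of `lam`, `iterations` and `threshold`, and the toy Dutch tokens are all chosen for illustration only.

```python
from collections import Counter


def parsimonious_model(doc_tokens, collection_tokens, lam=0.1,
                       iterations=50, threshold=1e-4):
    """Estimate a parsimonious P(t|D) with EM.

    lam, iterations and threshold are illustrative assumptions.
    """
    tf = Counter(doc_tokens)
    coll = Counter(collection_tokens)
    doc_total = sum(tf.values())
    coll_total = sum(coll.values())
    if doc_total == 0 or coll_total == 0:
        return {}
    # Start from the maximum likelihood distribution of the document.
    p_doc = {t: c / doc_total for t, c in tf.items()}
    for _ in range(iterations):
        # E-step: credit each term with the share of its frequency that
        # the document model explains better than the collection model.
        e = {}
        for t, p in p_doc.items():
            p_coll = coll[t] / coll_total
            e[t] = tf[t] * (lam * p) / ((1 - lam) * p_coll + lam * p)
        # M-step: renormalise to a probability distribution.
        total = sum(e.values())
        p_doc = {t: v / total for t, v in e.items()}
        # Drop terms that fall below the threshold, then renormalise.
        kept = {t: p for t, p in p_doc.items() if p >= threshold}
        if kept:
            norm = sum(kept.values())
            p_doc = {t: p / norm for t, p in kept.items()}
    return p_doc


# Hypothetical toy data: "de" is frequent everywhere, "klimaat" is not.
document = ["de"] * 3 + ["klimaat"] * 2
collection = ["de"] * 50 + ["het"] * 45 + ["klimaat"] * 5
model = parsimonious_model(document, collection)
for term, prob in sorted(model.items(), key=lambda kv: -kv[1]):
    print(term, round(prob, 4))
```

Terms that the collection already explains well lose probability mass with every iteration, so only words that are substantially more frequent in the document than expected remain.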

Page 13: Political slant in public broadcasting

Parsimonious language models

Filter out corpus-specific stopwords (‘voorzitter’)

Remove noise

Page 14: Political slant in public broadcasting

Parsimonious language models

Page 15: Political slant in public broadcasting

Parsimonious language models

Page 16: Political slant in public broadcasting

Parsimonious language models

Page 17: Political slant in public broadcasting

Comparison

Two methods: estimated probability and Kullback-Leibler divergence

Estimated probability: ‘For each political group, estimate the probability that an arbitrary word in a tv-programme is one of their characteristic words’

$$\hat{P}(q \mid TV) = \sum_{t \in q} \frac{tf_{t,TV}}{|TV|}$$

Kullback-Leibler divergence: ‘Calculate the risk of returning a document to the query’

$$KL(M_d \parallel M_q) = \sum_{t \in V} P(t \mid M_q) \cdot \log \frac{P(t \mid M_q)}{P(t \mid M_d)}$$
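Both comparison methods can be sketched in a few lines of Python. The function names, the toy inputs and the `epsilon` floor for terms missing from the document model are assumptions made for this illustration; the slides do not say how zero probabilities were handled.

```python
import math
from collections import Counter


def estimated_probability(query_terms, tv_tokens):
    """Probability that an arbitrary token in the TV text is a query term."""
    if not tv_tokens:
        return 0.0
    counts = Counter(tv_tokens)
    return sum(counts[t] for t in set(query_terms)) / len(tv_tokens)


def kl_divergence(p_query, p_doc, epsilon=1e-9):
    """Sum over t of P(t|Mq) * log(P(t|Mq) / P(t|Md)).

    Terms with P(t|Mq) = 0 contribute nothing. The epsilon floor for
    terms absent from the document model is an assumed smoothing choice.
    """
    total = 0.0
    for term, pq in p_query.items():
        if pq > 0:
            pd = max(p_doc.get(term, 0.0), epsilon)
            total += pq * math.log(pq / pd)
    return total


# Hypothetical toy data.
group_words = ["klimaat", "zorg"]
subtitles = ["klimaat", "en", "zorg", "klimaat"]
print(estimated_probability(group_words, subtitles))

query_model = {"klimaat": 0.5, "zorg": 0.5}
programme_model = {"klimaat": 0.9, "zorg": 0.1}
print(kl_divergence(query_model, programme_model))
```

A higher estimated probability and a lower divergence both indicate that a programme's language lies closer to that of the political group.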

Page 18: Political slant in public broadcasting

Results

Right never wins

Casual inspection does not reveal ‘strange’ right-wing words

Government and left results are close

Comparison with regular Dutch does suggest a slight preference for left-wing words

Page 19: Political slant in public broadcasting

Conclusions

Language in Dutch public broadcasting is not particularly left-wing (only a slight preference was found)

Descriptive right-wing words are used less

This might be PVV influence; further investigation is needed

Page 20: Political slant in public broadcasting

Questions?