Big Data and Formal Methods of Cultural Analysis
John Mohr, UC-Santa Barbara
Talk presented at the Center for Information, Technology & Society,
University of California, Santa Barbara (10/15/13)
Slides posted @
www.ucsb.soc.edu/ct1
1Tuesday, October 15, 13
“The greatest enterprise of the mind has always been and always will be the attempted linkage of the sciences and the humanities.” (E. O. Wilson, Consilience, 1998, p. 8)
I. Big Data and Social Science.
I. Big Data and Social Science:
A. The Biggest thing yet in social science. Why?
B. Social Science Lags Behind Natural Science. Why?
• Size of workforce, $ investment?
• Humans make bad (non-compliant) research subjects
• Human action, human institutions built out of meaning
• How do you measure meaning?
• The Biggest Problem...
I. Big Data and Social Science:
• The Biggest Problem...Getting good data.
• Limited from the beginning (origins in state statistics) and so we’re used to it.
• For the complexities of consciousness we have surveys
• Leads to lack of common data core, data sharing, scientific theorizing across disciplines (even across sub-disciplines)
• Sociology “Middle-Range theory” (Robert K. Merton).
I. Big Data and Social Science:
• Big Data holds out the promise of changing this.
• Twitter, Facebook, Wikipedia, blogs, email, text archives, etc.
• People make data as they make life (solves the non-compliance problem)
• People deposit traces of meaning along with traces of action (solves the meaning measurement problem)
• Now we just need to train social scientists to get access to Big Data so that it can be analyzed
Ambuj Singh, Divy Agrawal, John Mohr, Stephen Proulx and Subhash Suri (PIs)
Communication, Computer Science, Ecology, Evolution and Marine Biology (EEMB), Electrical and Computer Engineering (ECE), Geography, Mechanical Engineering, Sociology
I. Big Data and Social Science:
• Big Data doesn’t solve the problem of analyzing measured meaning. Here (thanks to the under-development of the science and the unavailability of data) we still have a long way to go.
II. Formal Analysis and the Sociology of Culture
#I 111 Charity Fund of the Chamber of Commerce
#D 1883
#J 42
#W distressed merchants who shall have been members of the Chamber in good repute in the City of New York and whose misfortunes were not the result of any dishonorable transactions
#I 681 New York Juvenile Asylum
#L 176th Street and 10th Avenue
#D 1851
#J 23
#WP truant children of both sexes residents of city committed by Magistrate 7 <=AGE<= 14
#WP truant children of both sexes residents of the city surrendered by parents or guardians 7 <=AGE<= 14
#WP disobedient children of both sexes residents of the city committed by Magistrate 7 <=AGE<= 14
#WP disobedient Children of both sexes residents of the city surrendered by parents or guardians 7 <=AGE<= 14
#J 53
#WP friendless children
#WP surrendered children
The Duality of Culture & Practice: An Example.
• Focus on Ideas (Culture) & Practice.
• Illustrate they are co-constitutive
• Theory & Society, 1997
Vol. 26 (2/3): 305-356.
Example:
What is the Meaning of the term “Indigent”?
Destitution? Distress? Deservingness or Worthiness? Being described as “Fallen”, “Homeless”, “Misfortunate”, “Needy”, or “Poor”?
Look to Practical Implications: Given Advice? Food? Money? Work? Investigated or put in the Poorhouse?
1888: 208 references to these Categories in NYCCD. We look for logical possibility (binary yes/no).
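The binary yes/no coding can be sketched as a small category-by-practice table from which subset relations and lattice nodes are read off. The sketch below is only an illustration of that idea, not the analysis in Mohr & Duquenne: the category/practice pairings in `context` are invented toy data, and the function names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy data: which practices are recorded (yes/no) for which category.
# These pairings are invented for illustration; they are NOT the 1888 directory data.
context = {
    "needy": {"food", "give$"},
    "homeless": {"food", "shelter"},
    "fallen": {"advice", "investigate"},
    "poor": {"food", "give$", "advice"},
}

def common_practices(categories):
    """Practices shared by every category in the set (all practices if the set is empty)."""
    shared = set().union(*context.values())
    for c in categories:
        shared &= context[c]
    return shared

def categories_with(practices):
    """Categories to which every practice in the set is applied."""
    return {c for c, ps in context.items() if practices <= ps}

def formal_concepts():
    """Enumerate the (categories, practices) pairs that form the nodes of the lattice."""
    concepts = set()
    names = sorted(context)
    for r in range(len(names) + 1):
        for subset in combinations(names, r):
            intent = common_practices(subset)
            extent = categories_with(intent)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts

def practice_subsets():
    """Pairs (a, b) where every practice applied to a is also applied to b."""
    return sorted((a, b) for a in context for b in context
                  if a != b and context[a] <= context[b])
```

Brute-force enumeration over all category subsets is only workable for a handful of categories; it is used here because it makes the "what goes with what" and "what is a subset of what" questions easy to read in code.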
• Cultural Distinction x Practice, 1888: What goes with what?
Mohr & Duquenne
• Cultural Distinction x Poverty Practice, 1888: What is a subset of what? (And, vice-versa)
Mohr & Duquenne
• Cultural Distinction x Poverty Practice, 1888: What are the structural articulations that define each other?
Mohr & Duquenne
Split: whenever a category/practice pair (c/p) is such that both are "irreducible," p is the lowest practice not below c, and c is the highest category not above p in the lattice, the pair c/p is said to be "perspective."
Example: paidWk/needy
Lattice diagram legend: f: food; g: give$
• Cultural Distinction x Poverty Practice: A Focus on ‘Splits’ in Lattice Structures as Critical Markers for textual interpretation.
Mohr & Duquenne
• Cultural Distinction x Poverty Practice in 1917: A New Institutional Logic stabilizes.
Mohr & Duquenne
III. More examples of Formal Analysis and the Sociology of Culture
D. Duality Analysis - Examples of Applying Duality Analysis to Culture
• Charles Tilly (1929-2008)
• 1997 “Parliamentarization of Popular Contention in Great Britain, 1758-1834.” Theory & Society 26 (2/3):245-273
• Uses Blockmodels to Analyze Duality of Identities & Actions (in texts).
D. Duality Analysis - Examples of Applying Duality Analysis to Culture.
Ann Mische & Pip Pattison
• Lattice analysis of groups & individual ideologies
• Poetics, 2000
Vol. 27 (2/3): 163-194.
• Ron Breiger examines the duality between the structure of individual influence among Supreme Court justices and the ideological structure of the key issues that split the court.
• “A Tool Kit for Practice Theory.” Poetics 27 (2000): 91-115.
D. Duality Analysis - Examples of Applying Duality Analysis to Culture
• Bernard Harcourt examines the duality of youth gun practices and gun ideologies.
• “Measured Interpretation: Introducing the Method of Correspondence Analysis to Legal Studies.” University of Illinois Law Review vol. 2002 (2003): 979-1018.
D. Duality Analysis - Examples of Applying Duality Analysis to Culture
• John Martin examines the duality of animal species and occupational types.
• “What do animals do all day? The division of labor, class bodies, and totemic thinking in the popular imagination.” Poetics vol. 27 (2000): 195-231.
D. Duality Analysis - Examples of Applying Duality Analysis to Culture
• John Mohr & Francesca Guerra-Pearson
• “The Duality of Niche and Form: The Differentiation of Institutional Space in New York City, 1888-1917.” Pp. 321-368 in Categories in Markets: Origins and Evolution, (Research in the Sociology of Organizations, Vol. 31)
D. Duality Analysis - Examples of Applying Duality Analysis to Culture
• Mohr, John W. and Helene K. Lee. 2000. “From Affirmative Action to Outreach: Discourse Shifts at the University of California.” Poetics: Journal of Empirical Research on Literature, the Media, and the Arts. Special Issue on “Culture and Cognition” edited by Karen Cerulo Vol. 28/1:47-71
D. Duality Analysis - Examples of Applying Duality Analysis to Culture
Figure 3. Discourse Structure of Boundary Programs (labels: Exceptional Ability, Poverty, Intellectual Skills, Race)
Happiness as the Duality of Ritual & Belief with Josep Rodriguez at U Barcelona.
D. Duality Analysis - Examples of Applying Duality Analysis to Culture
IV. Text Mining Tools and the Formal Analysis of Culture.
• More recent work:
• Franco Moretti. 2011. “Network Theory, Plot Analysis.” New Left Review 68: 80-102.
• Uses network tools to map plot structure in Shakespeare vs. traditional Chinese novels.
A. New Developments in Digital Humanities
A. New Developments in Digital Humanities
B. LDA Topic Models.
* Taken from David Blei, Princeton: http://www.cs.princeton.edu/~blei/kdd-tutorial.pdf
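For readers who have not seen a topic model fitted, the following is a minimal sketch of LDA using collapsed Gibbs sampling, which is one common way to estimate the model; the slides do not say which inference method or software was used. The function name `lda_gibbs`, the hyperparameter defaults (`alpha`, `beta`), and the toy documents are all assumptions made for illustration.

```python
import random

def lda_gibbs(docs, num_topics, iterations=100, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA to tokenized documents by collapsed Gibbs sampling.

    Returns (theta, phi, vocab): per-document topic proportions,
    per-topic word probabilities, and the sorted vocabulary.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    # Count tables updated as token-level topic assignments change.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(zs)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid, old = word_id[w], assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][old] -= 1
                topic_word[old][wid] -= 1
                topic_total[old] -= 1
                # Resample its topic in proportion to
                # (topic weight in this document) x (word weight in this topic).
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][wid] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                new = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = new
                doc_topic[d][new] += 1
                topic_word[new][wid] += 1
                topic_total[new] += 1

    theta = [[(n + alpha) / (len(doc) + num_topics * alpha) for n in doc_topic[d]]
             for d, doc in enumerate(docs)]
    phi = [[(n + beta) / (topic_total[k] + vocab_size * beta) for n in topic_word[k]]
           for k in range(num_topics)]
    return theta, phi, vocab

# Invented toy corpus (not from any corpus discussed in the talk).
toy_docs = [
    "poverty relief charity asylum relief".split(),
    "charity asylum poverty relief".split(),
    "security strategy threat alliance strategy".split(),
    "threat alliance security strategy".split(),
]
```

Calling `lda_gibbs(toy_docs, num_topics=2)` returns one row of topic proportions per document and one row of word probabilities per topic; on a real corpus a tested library implementation would normally be preferred over a hand-written sampler.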
Poetics (Special Issue), forthcoming December 2013: “Topic Models and the Cultural Sciences,” edited by John Mohr and Petko Bogdanov
“Topic models: What they are and why they matter.” John Mohr (Soc. UCSB) and Petko Bogdanov (CS UCSB)
“Exploiting Affinities between Topic Modeling and the Sociological Perspective on Culture: Application to Newspaper Coverage of Government Arts Funding in the U.S.” Paul DiMaggio (Sociology, Princeton University), Manish Nag (Sociology, Princeton University), and David Blei (Computer Science, Princeton University).
“Differentiating Language-Usage Through Topic Models.” Daniel A. McFarland (Education, Stanford), Daniel Ramage, Jason Chuang, Jeff Heer, Christopher D. Manning (Computer Science, Stanford) and Daniel Jurafsky (Linguistics, Stanford)
“Rebellion, crime and violence in Qing China, 1722-1911: a topic modeling approach to the ‘great unread.’” Ian Miller (History, Harvard University)
“Elevated Threat-Levels and Decreased Expectations: How Democracy Handles Terrorist Threats.” Tabitha Bonilla and Justin Grimmer (Political Science, Stanford).
“Graphing the Grammar of Motives in U.S. National Security Strategies: Cultural Interpretation, Automated Text Analysis and the Drama of Global Politics.” John W. Mohr (Sociology, UCSB), Robin Wagner-Pacifici (Sociology, The New School), Ron Breiger (Sociology, U of Arizona), Petko Bogdanov (Computer Science, UCSB).
“Defining Population Problems: Using Topic Models for Cross-National Comparison of Disciplinary Development.” Emily Marshall (Department of Sociology, University of Michigan)
“Trawling in the Sea of the Great Unread: Sub-Corpus Topic Modeling and Humanities Research.” Peter Leonard (University of Chicago) and Tim Tangherlini (Scandinavian Studies, UCLA)
“Significant Themes: Topic Modeling the 19th-Century Novel.” Matthew L. Jockers (Department of English, University of Nebraska-Lincoln) and David Mimno (Department of Information Science, Cornell University)
V. Recent Work Using Text Mining Tools to Pursue the Formal Analysis of Culture.
Using LDA Topic Models:
New Project looks at evolution of discourse logics in U.S. National Security Strategy Statements (1990-2010)
Ron Breiger (UA), Robin Wagner-Pacifici (New School) and Petko Bogdanov (CS, UCSB).
• Is there a deep structure?
• An implicit moral ordering?
1I. What are the NSS docs?
• Origins: the Goldwater-Nichols legislation (1986), intended to address inter-service rivalry & chain of command (but it also demanded public accountability from the Executive Branch by asking for an annual review in the NSS).
• Probably the most famous was the 2002 NSS, in which the Bush administration laid out the principle of a right to preemptive attack (justifying the invasion of Iraq).
“Graphing the Grammar of Motives in U.S. National Security Strategies: Cultural Interpretation, Automated Text Analysis and the Drama of Global Politics” (forthcoming, Poetics, 2013). John W. Mohr (Sociology, UCSB), Robin Wagner-Pacifici (Sociology, The New School), Ron Breiger (Sociology, U of Arizona), Petko Bogdanov (Computer Science, UCSB).
Draws on the work of Kenneth Burke (1897-1993), a literary theorist who developed a “Dramatistic Theory.”
The best model that we have for studying the meaningfulness of human discourse is to look to the models of that discourse that humans have themselves made, which is to say we should examine the literary, the poetic, and the dramatic as exemplars for understanding human meanings.
Kenneth Burke
“We shall use five terms as generating principle of our investigation. They are: Act, Scene, Agent, Agency, Purpose. In a rounded statement about motives, you must have some word that names the act (names what took place, in thought or deed), and another that names the scene (the background of the act, the situation in which it occurred); also, you must indicate what person or kind of person (agent) performed the act, what means or instruments he used (agency), and the purpose. Men may violently disagree about the purposes behind a given act, or about the character of the person who did it, or how he did it, or in what kind of situation he acted; or they may even insist upon totally different words to name the act itself. But be that as it may, any complete statement about motives will offer some kind of answers to these five questions: what was done (act), when or where it was done (scene), who did it (agent), how he did it (agency), and why (purpose)” (1945, p. xv).
Current work combines three types of text-mining tools:
1. Use (enhanced) Natural Language Processing tools to find all “Agents.”
2. Use syntactic parser to tag parts of speech (at the sentence level) to find “Acts.”
3. Use topic models to sift the text for more coherent discussion frames to find “Scenes”
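How the three outputs might be combined can be sketched as follows. This is a toy illustration under stated assumptions, not the project's pipeline: the `AGENT`/`VERB` tags stand in for the output of an entity recognizer and a parser, the `topic` label on each paragraph stands in for a topic-model assignment, the sentences are invented, and `min_count=3` is only a guess at what a "3+" cutoff on the network slides could mean.

```python
from collections import Counter

def extract_triples(tagged_sentence):
    """Return (agent, verb, agent) triples from one tagged sentence.

    A triple is emitted for each verb with an agent before it and an agent
    after it, using the nearest agent on each side.
    """
    triples = []
    for i, (token, tag) in enumerate(tagged_sentence):
        if tag != "VERB":
            continue
        left = [t for t, g in tagged_sentence[:i] if g == "AGENT"]
        right = [t for t, g in tagged_sentence[i + 1:] if g == "AGENT"]
        if left and right:
            triples.append((left[-1], token, right[0]))
    return triples

def frequent_ties(paragraphs, min_count=3, topic=None):
    """Count triples, optionally within one topic, and keep the frequent ones."""
    counts = Counter()
    for para in paragraphs:
        if topic is not None and para["topic"] != topic:
            continue
        for sentence in para["sentences"]:
            counts.update(extract_triples(sentence))
    return {triple: n for triple, n in counts.items() if n >= min_count}

# Invented, hand-tagged toy sentences (hypothetical; not NSS text).
s1 = [("United States", "AGENT"), ("will", "O"), ("support", "VERB"), ("allies", "AGENT")]
s2 = [("terrorists", "AGENT"), ("threaten", "VERB"), ("allies", "AGENT")]
toy_paragraphs = [
    {"topic": "terrorism", "sentences": [s1, s1, s2]},
    {"topic": "energy", "sentences": [s1]},
]
```

The surviving triples can be read as edges of an agent-verb-agent network, and passing `topic=` restricts the network to paragraphs assigned to one discussion frame.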
NER finds Agents
State based Agents
Regions as Agents
People & Orgs as Agents
1990 NSS George H. W. Bush (Concept-Verb or Verb-Concept 3+)
1990 NSS George H. W. Bush (Concept-Verb-Concept 3+)
1991 NSS George H. W. Bush (Concept-Verb-Concept 3+)
1995 NSS William Clinton (Concept-Verb-Concept 3+)
1996 NSS William Clinton (Concept-Verb-Concept 3+)
2002 NSS George W. Bush (Concept-Verb-Concept 3+)
2010 NSS Barack Obama (Concept-Verb-Concept 3+)
15 Level Topic Model of NSS Corpus — Topic Distribution Across Years
1990 NSS George H. W. Bush (Concept-Verb-Concept) Topic = 0 (Terrorism)
1991 NSS George H. W. Bush (Concept-Verb-Concept) Topic = 0 (Terrorism)
2002 NSS George W. Bush: All Agent x Verb x Agent ties in paragraphs with topic of Terrorism.
2002 NSS George W. Bush (Concept-Verb; Verb-Concept) Topic = 5 (Energy)
2002 NSS George W. Bush (Concept-Verb-Concept) Topic = 7 (Conflict)
2010 NSS Barack Obama (Concept-Verb-Concept) Topic = 7 (Conflict)
Conclusions:
1. Big Data is (potentially) very good for social science.
2. Still need work on how to use formal models to analyze ideas and culture.
3. Examples from the sociology of culture (relationality/duality).
4. As the sociology of culture moves forward to meet the rise of Big Data, so too the duality of cultural forms is critical.