Citation studies in the humanities

Chris Alen Sula, School of Information & Library Science, Pratt Institute
Matt Miller, NYPL Labs, New York Public Library

Page 1: Citation studies in the humanities

Citation studies in the humanities
Chris Alen Sula
School of Information & Library Science
Pratt Institute

#DH2013 / Citation studies in the humanities / @chrisalensula @thisismmiller

Matt Miller
NYPL Labs
New York Public Library

Page 2: Citation studies in the humanities

Background

‣ scholarly communication: the processes by which scholars share their findings, both formally (e.g., articles) and informally (e.g., tweets, letters, blogs)

‣ bibliometrics: methods for analyzing citation behaviors

‣ Bibliometrics is largely based on studies of scientific and technical corpora (Hérubel and Buchanan, 1994; Lamont, 2000), with relatively few studies in the humanities (cf. Ardanuy, 2013).

(Yan & Ding, 2012)

Page 3: Citation studies in the humanities

Citation networks

Rosvall & Bergstrom (2007)

Page 4: Citation studies in the humanities

Bibliometrics & humanities: Why so little?

‣ lack of data (Linmans, 2010), especially for

‣ monographs (Hammarfelt, 2011), which still form the backbone of humanities work (Larivière et al., 2006)

‣ older sources, which humanists cite with greater frequency than scientists (Heinzkill, 1980)

‣ lack of citations, comparatively speaking

‣ Humanists cite each other less frequently than scientists (Heinzkill, 1980; Swales, 1990; Hellqvist, 2010).

‣ Multi-authored articles are rare (Price, 1966; Pao, 1981, 1982; Sievert and Sievert, 1989; Wiberley, 1989), with around 1.06 authors per article from 1980–2007 (Linmans, 2010).

‣ Humanists do cite and co-author (Leydesdorff, Hammarfelt & Salah, 2011), and DHers have done citation studies (Smith, 2009).

Page 5: Citation studies in the humanities

‣ Humanities discourse differs from scientific discourse.

‣ more integral references, in which authors associate their own views with those they reference (Swales, 1990; Hyland, 1999; Harwood, 2008)

‣ more negative references, which object to other authors’ claims (Meadows, 1974; Brooks, 1985; Cano, 1989).

‣ The mere fact that one humanist cites another says nothing about the type or significance of their relationship.

‣ Understanding and tracking these relationships would give us a richer, more nuanced view of the humanities. Part of that data can come from reference contexts, part from extra-citational information (mentions, likes, real-world relationships, etc.).

Bibliometrics & humanities: Why so little?

Page 6: Citation studies in the humanities

Reference context

(Chubin & Moitra, 1975)

(Frost, 1979)

‣ two example schemas

Page 7: Citation studies in the humanities

Code at http://github.com/thisismattmiller/dh2013-humanities-citation

Page 8: Citation studies in the humanities

Our tool: extraction

‣ Layout recognition used to extract citations and surrounding context (usually 1–2 sentences)
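
A minimal sketch of the context-extraction step, assuming parenthetical author-year citations in text that has already been pulled out of the document. The tool described here works from page layout; the regular expressions, the naive sentence splitter, and the name `citation_contexts` are illustrative assumptions, not the code in the repository.

```python
import re

# Assumed pattern: parenthetical author-year citations such as
# "(Doe, 2001)" or "(Doe, 2001; Roe, 1999)".
CITATION = re.compile(r"\(([^()]*?\b\d{4}[a-z]?)\)")

# Naive sentence splitter: break after ., ! or ? followed by whitespace.
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def citation_contexts(text, window=1):
    """Return (citation, context) pairs, where context is the citing
    sentence plus up to `window` sentences before it."""
    sentences = SENTENCE_END.split(text.strip())
    pairs = []
    for i, sentence in enumerate(sentences):
        for match in CITATION.finditer(sentence):
            start = max(0, i - window)
            context = " ".join(sentences[start:i + 1])
            pairs.append((match.group(1), context))
    return pairs


# Made-up sample text with hypothetical authors.
sample = (
    "Scholars have long debated this question. "
    "The issue is extensively discussed by earlier work (Doe, 2001). "
    "That account appears to overlook monographs (Roe, 1999)."
)

for citation, context in citation_contexts(sample):
    print(citation, "->", context)
```

With `window=1` each context covers at most two sentences, roughly matching the 1–2 sentences of context mentioned above. Abbreviations such as "e.g." would trip this splitter, which is one reason a real extractor needs more care.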

Page 9: Citation studies in the humanities

Our tool: classification

‣ Naïve Bayes classifier using NLTK

sample from positive training set
‣ extensively discussed by
‣ useful discussion
‣ indebted to
‣ groundbreaking work
‣ result confirms the hypothesis

sample from negative training set
‣ contra
‣ appears to overlook
‣ fail to account for
‣ problematic
‣ is unable to
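
To make the classification step concrete, here is a small naïve Bayes sketch trained only on the sample phrases listed above. It is written in plain Python with add-one smoothing rather than with a toolkit, so the function names (`tokenize`, `train`, `classify`) and the bag-of-words features are assumptions for illustration and may differ from what the actual tool does.

```python
import math
from collections import Counter

# Training phrases taken from the two sample lists above.
POSITIVE = [
    "extensively discussed by",
    "useful discussion",
    "indebted to",
    "groundbreaking work",
    "result confirms the hypothesis",
]
NEGATIVE = [
    "contra",
    "appears to overlook",
    "fail to account for",
    "problematic",
    "is unable to",
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def train(labeled_phrases):
    """Count words per label; returns (word_counts, label_counts, vocab)."""
    word_counts = {}
    label_counts = Counter()
    vocab = set()
    for phrase, label in labeled_phrases:
        label_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for word in tokenize(phrase):
            counts[word] += 1
            vocab.add(word)
    return word_counts, label_counts, vocab


def classify(text, model):
    """Pick the label with the highest log posterior (add-one smoothing)."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # ignore words never seen in training
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


data = [(p, "positive") for p in POSITIVE] + [(p, "negative") for p in NEGATIVE]
model = train(data)
print(classify("a groundbreaking work on the topic", model))
print(classify("the author appears to overlook this", model))
```

With only ten short phrases, shared function words such as "to" can tip a decision, which illustrates why a much larger training set matters for this kind of classifier.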

Page 10: Citation studies in the humanities

Data & results

‣ articles sampled for this study

‣ results of citation tool applied to sample set

Page 11: Citation studies in the humanities

Polarity results by discipline

Page 12: Citation studies in the humanities

Broader patterns?

‣ citation frequency x polarity
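
One way to relate citation frequency to polarity is sketched below. It assumes classifier output in the form of (cited work, polarity) pairs with exactly two labels. The records and the function name `frequency_by_polarity` are hypothetical, and none of the values come from the study.

```python
from collections import defaultdict

# Hypothetical classifier output: one (cited work, polarity) pair per reference.
references = [
    ("Work A", "positive"),
    ("Work A", "negative"),
    ("Work A", "positive"),
    ("Work B", "negative"),
    ("Work C", "positive"),
    ("Work C", "positive"),
]


def frequency_by_polarity(refs):
    """Map each cited work to (citation count, share of positive references)."""
    counts = defaultdict(lambda: {"positive": 0, "negative": 0})
    for work, polarity in refs:
        counts[work][polarity] += 1
    table = {}
    for work, c in counts.items():
        total = c["positive"] + c["negative"]
        table[work] = (total, c["positive"] / total)
    return table


for work, (total, share) in sorted(frequency_by_polarity(references).items()):
    print(f"{work}: cited {total}x, {share:.0%} positive")
```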

Page 13: Citation studies in the humanities

Future directions

‣ further manual inspection of articles to determine the reliability of extraction and classification

‣ further training of the sentiment classifier on larger corpora

‣ measures of inter-rater reliability for classification

‣ support for more document layouts

‣ crowdsourced PDF analysis & classifier training
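
For the inter-rater reliability item in the list above, one common measure for two raters is Cohen's kappa, which corrects raw agreement for agreement expected by chance. The sketch below assumes that measure. The choice of statistic and the example labels are hypothetical and not taken from the study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labeling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's label frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical polarity labels for six reference contexts.
rater_1 = ["pos", "pos", "neg", "neg", "pos", "neg"]
rater_2 = ["pos", "neg", "neg", "neg", "pos", "pos"]
print(round(cohens_kappa(rater_1, rater_2), 3))
```

Here the raters agree on four of six items, but half of that agreement is expected by chance, so kappa is well below the raw agreement rate.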

Page 14: Citation studies in the humanities

References

‣ All references are available in the conference proceedings at http://dh2013.unl.edu/abstracts/ab-353.html

‣ Additional references:

‣ Jordi Ardanuy (2013). “Sixty Years of Citation Analysis Studies in the Humanities (1951–2010)” Journal of the American Society for Information Science and Technology 64(8): 1751–1755.

‣ Erjia Yan and Ying Ding (2012). “Scholarly Network Similarities: How Bibliographic Coupling Networks, Citation Networks, Cocitation Networks, Topical Networks, Coauthorship Networks, and Coword Networks Relate to Each Other” Journal of the American Society for Information Science and Technology 63(7): 1313–1326.

‣ Code at http://github.com/thisismattmiller/dh2013-humanities-citation
