Complex Analysis of Large Scale Digital Collections
Reflections on some opportunities and challenges
James Baker, Lecturer in Digital History, @j_w_baker
slideshare.net/drjwbaker
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Exceptions: quotations, embeds from external sources, logos, and marked images.
1794
1808
A search for mentions of various infectious diseases (Cholera, Whooping Cough, Consumption, and Measles) across the 60,000-book dataset
https://github.com/UCL-dataspring
A search for the size of images on each page over time (left) and over the course of two books (right) across the 60,000-book dataset
https://github.com/UCL-dataspring
© oldweb.today
The Old Bailey Online can rightfully describe their 197,000 trials as the “largest body of texts detailing the lives of non-elite people ever published” between 1674 and 1913. But GeoCities, drawing on the material we have between 1996 and 2009, has over thirty-eight million pages.
Ian Milligan, 'Herrenhausen Big Data Lightning Talk: Finding Community in the Ruins of GeoCities', 25 March 2015
[Yahoo!] found the way to destroy the most massive amount of history in the shortest amount of time with absolutely no recourse
Dan Fletcher, 'Internet Atrocity! GeoCities' Demise Erases Web History', TIME Magazine, 9 March 2009
© oldweb.today
wiki.bitcurator.net
Open source digital forensics
this.thatcamp.org