Complex Analysis of Large Scale Digital Collections

Reflections on some opportunities and challenges

James Baker, Lecturer in Digital History
@j_w_baker

slideshare.net/drjwbaker

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Exceptions: quotations, embeds from external sources, logos, and marked images.

Page 2

1794

1808

Page 3

A search for mentions of various infectious diseases (Cholera, Whooping Cough, Consumption, and Measles) across the 60,000 book dataset

https://github.com/UCL-dataspring
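
A search of this kind can be sketched in a few lines of Python. This is a minimal illustration only, not the code used for the slide: the `books` input (an iterable of `(year, text)` pairs) and the function name `count_mentions` are assumptions, since the slide does not describe how the dataset is laid out.

```python
import re
from collections import Counter, defaultdict

# Disease terms named on the slide, lower-cased; matching is case-insensitive.
TERMS = ["cholera", "whooping cough", "consumption", "measles"]


def count_mentions(books, terms=TERMS):
    """Count whole-word mentions of each term, grouped by publication year.

    `books` is a hypothetical iterable of (year, text) pairs; the real
    dataset's structure is not given in the slides.
    """
    # Allow any whitespace (including line breaks) between the words of a
    # multi-word term such as "whooping cough".
    patterns = {
        term: re.compile(
            r"\b" + r"\s+".join(re.escape(word) for word in term.split()) + r"\b",
            re.IGNORECASE,
        )
        for term in terms
    }
    counts = defaultdict(Counter)
    for year, text in books:
        for term, pattern in patterns.items():
            counts[year][term] += len(pattern.findall(text))
    return counts


# Invented sample texts, for illustration only.
sample = [
    (1832, "Cholera reached the town, and cholera spread quickly."),
    (1832, "Measles and whooping\ncough were common."),
    (1850, "She died of consumption."),
]
mentions = count_mentions(sample)
print(dict(mentions[1832]))
```

A plain keyword match like this cannot tell the medical sense of "consumption" from the economic one, so counts for that term would need manual checking.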

Page 4

A search for the size of images on each page over time (left) and over the course of two books (right) across the 60,000 book dataset

https://github.com/UCL-dataspring

Page 6

© oldweb.today

The Old Bailey Online can rightfully describe their 197,000 trials as the “largest body of texts detailing the lives of non-elite people ever published” between 1674 and 1913. But GeoCities, drawing on the material we have between 1996 and 2009, has over thirty-eight million pages.
Ian Milligan, 'Herrenhausen Big Data Lightning Talk: Finding Community in the Ruins of GeoCities', 25 March 2015

[Yahoo!] found the way to destroy the most massive amount of history in the shortest amount of time with absolutely no recourse
Dan Fletcher, 'Internet Atrocity! GeoCities' Demise Erases Web History', TIME Magazine, 9 March 2009

Page 11

wiki.bitcurator.net

Open source digital forensics

Page 12

this.thatcamp.org
