From DARPA to Shakespeare: All the Data we Can Handle

DESCRIPTION

Big Data and Digital Humanities overview presented to CUA LSC874 Digital Humanities Class February 2014.


From DARPA to Shakespeare and all the data we can handle

Big Data and Digital Humanities

February 2014

http://www.darpa.mil/newsevents/releases/2012/03/29.aspx


1. Big Data
2. Libraries & Librarians
3. University Researchers & Beyond
4. Digital Humanities


1. Big Data


High-Performance Computing (HPC) Act of 1991 (Public Law 102-194), as amended by the Next Generation Internet Research Act of 1998 (Public Law 105-305) and the America COMPETES Act of 2007 (Public Law 110-69).

It’s the law!

These laws authorize Federal agencies to set goals, prioritize their investments, and coordinate their activities in networking and information technology research and development.

George O. Strawn, Networking and Information Technology Research and Development (NITRD) Program

From: Hot Topics in Big Data: What You Need to Know Now! FEDLINK, NFAIS, CENDI; December 11, 2012


Big data...
• is a mystery
• is a child of the internet

Big Data has grown from...
• CPUs of information
• Disks of information

...to
• Networks of information
• Sensors everywhere

George O. Strawn, NITRD


Urban computing also aims to deeply understand the nature and science behind the phenomena occurring in urban spaces, using a variety of heterogeneous data sources, such as traffic flows, human mobility, geographic and map data, environment, energy consumption, populations, and economics. Recently, real-world data reflecting city dynamics has become widely available, including users’ mobile phone signals, GPS traces of vehicles and people, ticketing data in public transportation systems, user-generated content (such as tweets, micro-blogs, check-ins, and photos), data from transportation sensor networks (camera and loop sensors) and environment sensor networks (temperature and air quality), as well as data from the Internet of Things.
http://www.meetup.com/UrbanComputing/

Smart Cities

http://www.ibm.com/smarterplanet/ie/en/smarter_cities/overview/index.html?re=CS1

Examples of big data:
• Electronic Health Records
• Text vs tables
• Textual analytics TEI
• Sentiment analysis - FB posts, Twitter
• Distributed data, distributed computing
• Atmospheric sensors, undersea sensors
• Hubble telescope
• Library ERM


Big Data & Science...
• Analyzing output from simulations
• Analyzing instrument output - LHC, Curiosity
• Creating DBs to support wide collaboration: Human Genome Project
• Creating Knowledge Bases from textual information: Semantic Medline
• Proteomics will be bigger than genomics

How do you move 100TB of information within a University or a research area?
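One way to get a feel for that question is a back-of-the-envelope estimate. The sketch below is illustrative only and is not from the presentation: the function name transfer_time_hours, the link speeds, and the efficiency parameter are assumptions made up for this example, and it assumes decimal terabytes (1 TB = 10^12 bytes) and a link that is otherwise idle.

```python
def transfer_time_hours(size_tb: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Estimate the hours needed to move size_tb terabytes over a network link.

    Assumptions (not from the slides): decimal units (1 TB = 1e12 bytes,
    1 Gbit/s = 1e9 bits per second) and a constant fraction `efficiency`
    of the nominal link speed actually available for the transfer.
    """
    if size_tb < 0:
        raise ValueError("size_tb must be non-negative")
    if link_gbps <= 0:
        raise ValueError("link_gbps must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the interval (0, 1]")

    bits_to_move = size_tb * 1e12 * 8                    # terabytes -> bits
    usable_bits_per_second = link_gbps * 1e9 * efficiency
    return bits_to_move / usable_bits_per_second / 3600  # seconds -> hours


if __name__ == "__main__":
    # Hypothetical link speeds, chosen only to show the spread of answers.
    for gbps in (0.1, 1, 10, 100):
        hours = transfer_time_hours(100, gbps)
        print(f"{gbps:>5} Gbit/s: {hours:9.1f} hours ({hours / 24:.1f} days)")
```

Under these idealized assumptions, even a dedicated 1 Gbit/s link needs more than nine days for 100 TB, which is one reason shipping disks is still a common answer.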


• Experimental Science
• Theoretical Science
• Computational Science
• Data Science - Big Data

4th Paradigm of Science


From bits to its...
Does the world consist of ... matter, energy and information?
• Newton - matter and motion
• Steam engine - thermodynamics, matter, energy
• Computer - science of information, matter, energy and information
Data intensive science is revolutionary science

Big Data is TOO BIG To KNOW! The dust hasn't settled; dust is swirling all around us; it is FUN dust!
George O. Strawn


See presentation: Philosophy & Big Data: Big Data, the Individual, and Society, by Melanie Swan, January 24, 2013
http://www.slideshare.net/lablogga/philosophy-and-big-data-big-data-the-individual-and-society


2. Libraries & Librarians
3. University Researchers (YOU) & Beyond


http://d2c2.lib.purdue.edu/publications

Purdue University: D. Scott Brandt and Jake Carlson


Michael Furlough, Associate Dean for Research and Scholarly Communications, Penn State University Libraries

Libraries' roles and challenges:
• Libraries will have to operate on faith
• Libraries will need deep collaboration


Librarians - new roles

Instruction - Best Practices
Data Information Literacy
Collaborate - DMP & more
Data Management Plans
Preserving/curating research
DO
Manage - RDS Services
Keeping up!


• Conversion & Interoperability
• Cultures of Practice
• Databases & Data Formats
• Data Curation & Reuse
• Data Management & Organization
• Data Processing & Analysis
• Data Quality & Documentation
• Discovery & Acquisition
• Ethics & Attribution
• Metadata & Data Description
• Preservation
• Visualization & Representation

See more at: Data Information Literacy Competencies http://wiki.lib.purdue.edu/display/ste/Materials+for+the+DIL+Symposium

Data is information


Build on successes
• MANTRA - Research Data Management Training http://datalib.edina.ac.uk/mantra/
• Data Management Course 2014 - University of Minnesota https://sites.google.com/a/umn.edu/data-management-workshop-series/
• Data Train http://archaeologydataservice.ac.uk/learning/DataTrain#section-DataTrain-AimsObjectives


What do researchers care about?
• Where can I put my stuff?
• What is a data management plan?

Data needs to be...
• available
• findable
• re-usable
• citable


DO


DataNet from NSF http://datafed.org/
Digital Preservation from the LoC http://www.digitalpreservation.gov/
HathiTrust Digital Library http://www.hathitrust.org/
Digital Preservation Network http://www.dpn.org/


Title: State of Sustainability Practices among Minnesota Tourism Businesses, 2007-2013
Authors: Qian, Xinyi (Lisa); Schneider, Ingrid E.
http://conservancy.umn.edu/handle/11299/160507

Title: Public-Use Data from the Obstetrics and Periodontal Therapy (OPT) Study, a randomized trial of periodontal therapy to prevent pre-term birth
Authors: Hodges, James S.; Michalowicz, Bryan S.
http://conservancy.umn.edu/handle/11299/160551

Title: "Laundry Soap" from the Ojibwe Conversational Archives Project
Authors: Hermes, Mary; Tainter, Rose; Kingbird-Porter, Margaret
http://conservancy.umn.edu/handle/11299/160534

https://www.lib.umn.edu/datamanagement/archiving



For all links please see: http://guides.lib.cua.edu/hoffman [tab] BigData
Keeping Research Data Safe http://www.beagrie.com/krds.php


4. Digital Humanities WHY?


4. Digital Humanities ...Using data to tell our story


Data Visualization Catalog

http://blog.visual.ly/the-data-visualization-catalogue/


Visualization

http://www.edwardtufte.com/tufte/posters
http://www.masswerk.at/minard/
http://vannevar.blogspot.com/2009/03/minard-napolean-russia-1812-best-chart.html


http://research.google.com/bigpicture/music/?utm_content=buffer662d6&utm_medium=social&utm_source=twitter.com&utm_campaign=buffer#


http://www.ucl.ac.uk/infostudies/melissa-terras/DigitalHumanitiesInfographic.pdf


http://www.folgerdigitaltexts.org/


JISC media hub http://jiscmediahub.ac.uk/

http://www.dmoz.org/Reference/Knowledge_Management/Knowledge_Discovery/Information_Visualization/

Examples of TEI:
• American Memory (uses a TEI-conformant DTD) http://memory.loc.gov/ammem/index.html
• Early Canadiana Online http://www.canadiana.org/
• Victorian Women Writers Project http://www.indiana.edu/~letrs/vwwp/index.html
• Oxford Text Archive http://ota.ahds.ac.uk/
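To make concrete what a TEI-encoded text lets a researcher do, here is a minimal sketch that counts verse lines per speaker in a small TEI fragment. It is an illustration under stated assumptions, not code from any of the projects above: the sample text, its speakers, and the helper name lines_per_speaker are invented for this example. The TEI namespace URI and the sp, speaker, and l elements are standard TEI P5.

```python
import xml.etree.ElementTree as ET

# TEI P5 namespace; every TEI element in the sample lives in it.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical sample, invented for this sketch (not taken from any archive).
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Sample play (hypothetical)</title></titleStmt>
      <publicationStmt><p>Illustration only</p></publicationStmt>
      <sourceDesc><p>Invented for this sketch</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <sp who="#A"><speaker>A</speaker>
        <l>First invented line.</l>
        <l>Second invented line.</l>
      </sp>
      <sp who="#B"><speaker>B</speaker>
        <l>A one-line reply.</l>
      </sp>
    </body>
  </text>
</TEI>"""


def lines_per_speaker(tei_xml: str) -> dict:
    """Count <l> (verse line) elements directly inside each <sp> (speech), by speaker."""
    root = ET.fromstring(tei_xml)
    counts = {}
    for speech in root.iterfind(".//tei:sp", TEI_NS):
        speaker = speech.findtext("tei:speaker", default="?", namespaces=TEI_NS)
        counts[speaker] = counts.get(speaker, 0) + len(speech.findall("tei:l", TEI_NS))
    return counts


if __name__ == "__main__":
    print(lines_per_speaker(SAMPLE))
```

Real TEI editions nest lines and speeches in more varied ways, so a count like this would need adapting to each project's encoding practice.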


NEVER DONE


• Data is information
• Libraries can be partners in providing value - access and analytics
• Deep Collaboration - Federal, University, Business, Researchers/Industry, Future of Research
• Data Policies
• Renaissance of Archivists
• Librarians as information consultants
• Librarians as researchers


References

2012/03/29 DARPA calls for advances in big data to help the warfighter. (2012). Retrieved from http://www.darpa.mil/newsevents/releases/2012/03/29.aspx

Boyle, D. E., Yates, D. C., & Yeatman, E. M. (2013). Urban sensor data streams: London 2013. Internet Computing, IEEE, 17(6), 12-20. doi:10.1109/MIC.2013.85

Domingo, A., Bellalta, B., Palacin, M., Oliver, M., & Almirall, E. (2013). Public open sensor data: Revolutionizing smart cities. Technology and Society Magazine, IEEE, 32(4), 50-56. doi:10.1109/MTS.2013.2286421

Gladney, H. M. (2012). Long-term digital preservation: A digital humanities topic? Historical Social Research-Historische Sozialforschung, 37(3), 201-217.

IBM smarter cities - overview - Ireland. Retrieved from http://www.ibm.com/smarterplanet/ie/en/smarter_cities/overview/index.html?re=CS1

JADH 2013: ODDly pragmatic: Documenting encoding practices in digital humanities projects by James Cummings on Prezi. Retrieved from http://prezi.com/af2auinap-ug/jadh-2013-oddly-pragmatic-documenting-encoding-practices-in-digital-humanities-projects/

Johnston, L. (2014). A workflow model for curating research data in the University of Minnesota Libraries: Report from the 2013 data curation pilot. University of Minnesota Digital Conservancy.

Pepi, M. (2013). The postmodernity of big data. The New Inquiry. Retrieved from http://thenewinquiry.com/essays/the-postmodernity-of-big-data/

Van den Eynden, V., Corti, L., Woollard, M., Bishop, L., & Horton, L. (2011). Managing and sharing data: Best practice for researchers