Global communities and open cultural data: towards linked open data in libraries, archives and museums
Mia Ridge
Academia Sinica, Taipei, Taiwan
October 2012
About me
Outline
• What problems can linked open data solve?
• Definitions
• Why is open cultural data important?
• History of open cultural data and role of communities
‘James Cook’ = maritime explorer?
• Computers think in strings; people think in ‘things’.
• ‘James Cook’ == ‘Captain Cook’? Only if you’re human.
• Linking both to http://dbpedia.org/page/James_Cook helps a computer know who you mean
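A minimal sketch of the point above, assuming a small hand-made lookup table (the table and the `same_thing` helper are illustrative, not part of any real service). Two different strings only match once both are linked to the same identifier:

```python
# The identifier used on the slide; every name variant points at it.
JAMES_COOK = "http://dbpedia.org/page/James_Cook"

# Illustrative table (an assumption): lower-cased name variants -> identifier.
NAME_TO_URI = {
    "james cook": JAMES_COOK,
    "captain cook": JAMES_COOK,
    "cook, james": JAMES_COOK,
}


def same_thing(name_a, name_b):
    """Return True when both strings resolve to the same known identifier."""
    uri_a = NAME_TO_URI.get(name_a.strip().lower())
    uri_b = NAME_TO_URI.get(name_b.strip().lower())
    return uri_a is not None and uri_a == uri_b


print("James Cook" == "Captain Cook")            # False: the strings differ
print(same_thing("James Cook", "Captain Cook"))  # True: both link to one URI
```

In practice the table would be replaced by links stored in the collection records themselves.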
Types of data
• Metadata: who, what, where, when, material, size, location - the basic ‘tombstone’ data
• Data: the full collections record including descriptions, interpretive themes, narratives, etc
• Digital surrogates: e.g. images of the object, transcribed text of book or document, 3D printer files, etc
APIs (Application Programming Interfaces) are a way for one machine to talk to another:
‘Hi Bob, I’d like a list of objects from you, and hey, Alice, could you draw me a timeline to put the objects on?’
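The conversation above can be sketched in a few lines. This is only an illustration: ‘Bob’ and ‘Alice’ are plain functions standing in for two remote services, the records are invented sample data, and JSON is assumed as the exchange format.

```python
import json


def collections_api():
    """Stand-in for 'Bob': returns a list of object records as JSON text."""
    # Invented sample records, for illustration only.
    records = [
        {"title": "Object B", "year": 1770},
        {"title": "Object C", "year": 1772},
        {"title": "Object A", "year": 1768},
    ]
    return json.dumps(records)


def timeline_api(objects_json):
    """Stand-in for 'Alice': orders the objects by year, one row per object."""
    objects = json.loads(objects_json)
    ordered = sorted(objects, key=lambda obj: obj["year"])
    return [f'{obj["year"]}: {obj["title"]}' for obj in ordered]


# One program asks Bob for objects, then hands them to Alice for a timeline.
for row in timeline_api(collections_api()):
    print(row)
```

Neither service needs to know how the other works; they only need to agree on the format of the data passed between them.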
‘open’
• Open data is freely available for use and redistribution by anyone for any purpose
– Licensors might require attribution
– Licensors might require users to re-share under the same licence
Open licences
• Attribution: ‘You must attribute the work in the manner specified by the author or licensor (but not in any way that suggests that they endorse you or your use of the work)’.
• Sharealike: ‘If you alter, transform, or build upon this work, you may distribute the resulting work only under the same or similar license to this one.’
• Non-commercial: prohibits uses that are ‘primarily intended for or directed toward commercial advantage or private monetary compensation’.
• No derivatives: ‘You may not alter, transform, or build upon this work.’
Linked data
• “data published on the Web in such a way that it is machine-readable, its meaning is explicitly defined, it is linked to other external data sets, and can in turn be linked to from external data sets”.
Source: ‘Linked Data - The Story So Far’
5 stars
★ make your stuff available on the web (whatever format)
★★ make it available as structured data (e.g. excel instead of image scan of a table)
★★★ non-proprietary format (e.g. csv instead of excel)
★★★★ use URLs to identify things, so that people can point at your stuff
★★★★★ link your data to other people’s data to provide context
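A rough sketch of moving from three stars (CSV) to four and five stars. Everything specific here is an assumption made for illustration: the sample row, the `http://example.org/object/` base address, the `KNOWN_PEOPLE` table and the choice of Dublin Core terms as property names. The output uses a simple one-statement-per-line form (N-Triples style), and quotation marks inside titles are not escaped.

```python
import csv
import io

# Three stars: structured data in a non-proprietary format (invented sample).
CSV_TEXT = """id,title,subject
1,Portrait of James Cook,James Cook
"""

BASE = "http://example.org/object/"          # hypothetical address for objects
TITLE = "http://purl.org/dc/terms/title"     # Dublin Core 'title'
SUBJECT = "http://purl.org/dc/terms/subject" # Dublin Core 'subject'

# Five stars: links out to someone else's data (illustrative table).
KNOWN_PEOPLE = {"James Cook": "http://dbpedia.org/page/James_Cook"}


def row_to_triples(row):
    """Turn one CSV row into statements about an object with its own URL."""
    obj = f"<{BASE}{row['id']}>"  # four stars: a URL that identifies the object
    triples = [f'{obj} <{TITLE}> "{row["title"]}" .']
    person = KNOWN_PEOPLE.get(row["subject"])
    if person:
        triples.append(f"{obj} <{SUBJECT}> <{person}> .")  # a thing
    else:
        triples.append(f'{obj} <{SUBJECT}> "{row["subject"]}" .')  # a string
    return triples


for row in csv.DictReader(io.StringIO(CSV_TEXT)):
    for triple in row_to_triples(row):
        print(triple)
```

Names that are not in the table stay as plain strings, so the data can be linked gradually rather than all at once.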
From string to thing
‘James Cook’
From string to thing
‘James Cook’ = http://dbpedia.org/page/James_Cook ...
What is linked open data?
• ‘data or metadata made freely available on the World Wide Web with a standard markup format’
– linked (and linkable): technical requirements
– open: licensing requirements
• Enabling connections and collaboration through interoperability
Open cultural data
• Data from cultural institutions that is made available for use in a machine-readable format under an open licence.
• Linkable: if published at a permanent URL, can be linked to from other projects
• Partial data releases e.g. low-resolution images, metadata-only releases
Why is open cultural data important?
• Helps achieve organisational goals, mission
• Can vastly increase access to content
• Can vastly increase engagement with content
• Can create ‘network effect’ with related institutions
Movements in linked open data
2007 - May
2007 - October
2008
2009
2010
2011
2001: first article about semantic web
2004: standards stabilised
• RDF Schema, RDF, OWL standardised
• SPARQL introduced
2006
2006: Semantic Web Think Tank
Image credit: jon pratty, all rights reserved
2007
Source: http://www.bbc.co.uk/blogs/bbcinternet/2010/02/case_study_use_of_semantic_web.html
• BBC: “a web identifier, with associated HTML pages and machine-readable feeds (RDF/XML, JSON and XML), for every programme the BBC broadcasts—allowing other teams within the BBC to incorporate those pages into new and existing programme support sites, TV Channel and Radio Station sites, and cross programme genre sites such as food, music and natural history”
2008
Europeana prototype launched
2009
Within a few days...
‘raw data now’
Cosmic Collections
2010
Questions... If we built an API, would anyone use it? Can you really crowdsource the creation of collections interfaces?
2011
• Science Museum Group released 240,000 collections records as plain text CSV files.
2012
Europeana CC0
The role of communities
• Online community via social networks, wikis, discussion lists
• Events and meetups important
• Use hashtag like #lodlam for open, international conversation
What is LODLAM?
• Linked Open Data in Libraries, Archives, and Museums Summit: 100 international attendees
• San Francisco, June 2011
• Organised by Jon Voss (@jonvoss) with Kris Carpenter Negulescu, Internet Archive
• Sponsored by the Alfred P. Sloan Foundation, National Endowment for the Humanities and the Internet Archive
Image by martin_kalfatovic, Some rights reserved
4 stars
★ Attribution Share-Alike License (CC-BY-SA/ODC-ODbL)
★★ Attribution License (CC-BY / ODC-BY) with another form of attribution
★★★ Attribution License (CC-BY / ODC-BY) when the licensor considers linkbacks to meet the attribution requirement
★★★★ Public Domain (CC0 / ODC PDDL / Public Domain Mark)
LODLAM 2013 Challenge
• highlight data visualizations, tools, mashups, meshups, and all types of use cases for Linked Open Data in libraries, archives, and museums. Teams will register to submit in one and/or two heats during the fall and the spring.
• Submit: video presentation (no longer than 5 min), title, short description, long description (can include images, photos, mockups, etc), FAQ section by 1 December 2012 or 1 May 2013
LODLAM 2013
• June 2013, Montreal
• Your ideas can win!
• Challenge prize: 2 delegate seats at the Summit and $2,000USD in travel stipends.
• Grand prize $2,000USD.
LODLAM Challenge Criteria
• can be private/public partnerships, academic teams, individuals, private companies, non-profits, just about anyone
• can be prototypes, mockups, design specs, working models
• can be tools or processes for a broad GLAM community
• can be innovative ideas that will advance the entire community
• must have a clearly articulated project goal
• must utilize open data sets
• must include a statement about how it’s distributed, what the IP is and how it is held (does not have to be open)
• will be judged partially on how well the idea is pitched and visualized, just like in Kickstarter: idea + marketing, i.e. points for style
• must clearly describe the problem you’re trying to solve
• must answer the question: if you win, what is your next step?
Thank you
• http://groups.google.com/group/lod-lam
• #lodlam on Twitter
• http://lod-lam.net
• miaridge.com @mia_out