tim-sherratt
For CAARA Residential School, 10 November 2010
The four Es: Doing more with metadata
Tim Sherratt (@wragge)
Archives know the value of metadata.
A metadata fetish?
Metadata is just data about data.
We value it according to our needs.
Once we get past the fetishistic allure, we can see...
Metadata is everywhere.
The four Es
• Extraction
• Enhancement
• Extension
• Experimentation
Extraction
Liberate the metadata trapped within existing processes and systems.
Extraction
• Where is it?
• What is it?
• How do I get it out?
Extraction
• Inside
• Outside
• Neither in nor out
Where is it?
Extraction – where is it?
• Records
• Descriptive systems
• Research
• Websites
• Usage statistics
Inside
Extraction – where is it?
• Research
• Publications
• Social media
Outside
Extraction – where is it?
• Cloud services (eg Flickr)
Neither in nor out
Extraction
• People
• Places
• Subjects
• Dates
• Structure
What is it?
Extraction
• Text mining
• Natural language processing
• Web services
• Crowdsourcing
How do I get it out?
Extraction – examples
Old Weather
Where?
● Ships’ logs
Extraction – examples
Old Weather
What?
● Ship movements
● Weather observations
Extraction – examples
Old Weather
How?
● Crowdsourcing
Extraction – examples
Mapping our Anzacs
Where?
Corrigan James : SERN 5308 : POB Aberfeldie VIC : POE Melbourne VIC : NOK S Corrigan Maggie
● Collection database
Extraction – examples
Mapping our Anzacs
What?
Corrigan James : SERN 5308 : POB Aberfeldie VIC : POE Melbourne VIC : NOK S Corrigan Maggie
● People
Extraction – examples
Mapping our Anzacs
What?
Corrigan James : SERN 5308 : POB Aberfeldie VIC : POE Melbourne VIC : NOK S Corrigan Maggie
● Places
Extraction – examples
Mapping our Anzacs
What?
Corrigan James : SERN 5308 : POB Aberfeldie VIC : POE Melbourne VIC : NOK S Corrigan Maggie
● Relationships
Extraction – examples
Mapping our Anzacs
What?
Corrigan James : SERN 5308 : POB Aberfeldie VIC : POE Melbourne VIC : NOK S Corrigan Maggie
● Other
Extraction – examples
Mapping our Anzacs
How?
● Text mining
Corrigan James : SERN 5308 : POB Aberfeldie VIC : POE Melbourne VIC : NOK S Corrigan Maggie
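The colon-delimited title above already holds a person, two places and a relationship, so a few lines of code can pull them apart. A minimal sketch in Python, assuming the label meanings (SERN as service number, POB as place of birth, POE as place of enlistment, NOK as next of kin); the function and field names are hypothetical and this is not the code behind Mapping our Anzacs:

```python
# Assumed expansions of the labels used in the record title.
FIELD_LABELS = {
    "SERN": "service_number",
    "POB": "place_of_birth",
    "POE": "place_of_enlistment",
    "NOK": "next_of_kin",
}


def parse_title(title):
    """Split a colon-delimited record title into named fields."""
    parts = [part.strip() for part in title.split(":")]
    # The first segment is the name, written surname first.
    record = {"name": parts[0]}
    for part in parts[1:]:
        # Each remaining segment is a label followed by its value.
        label, _, value = part.partition(" ")
        key = FIELD_LABELS.get(label)
        if key is None:
            # Keep anything unrecognised rather than discarding it.
            record.setdefault("other", []).append(part)
        else:
            record[key] = value
    return record


title = ("Corrigan James : SERN 5308 : POB Aberfeldie VIC : "
         "POE Melbourne VIC : NOK S Corrigan Maggie")
record = parse_title(title)
```

The next-of-kin value is left as found ("S Corrigan Maggie"); the leading letter looks like a relationship code, but the sketch does not try to interpret it.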
Extraction – examples
Reference blog
Where?
● Reference inquiries
http://itech.dickinson.edu/archives/
Extraction – examples
Reference blog
What?
● People
● Places
● Subjects
● Access points!
http://itech.dickinson.edu/archives/
Extraction – examples
Reference blog
How?
● Workflow app
● Blog/Drupal
http://itech.dickinson.edu/archives/
Extraction – examples
Factsheet explorer
Where?
● Website
Extraction – examples
Factsheet explorer
What?
● Subjects
● Collection references
http://discontents.com.au/shed/fs/fs_explorer.php
Extraction – examples
Factsheet explorer
How?
● Screen scraping
● ‘See also’ links
http://discontents.com.au/shed/fs/fs_explorer.php
Extraction – examples
JSTOR
Where?
● Footnotes
Extraction – examples
JSTOR
What?
● Collection references
Extraction – examples
JSTOR
What?
● People
Extraction – examples
JSTOR
What?
● Dates
Extraction – examples
JSTOR
What?
● Detailed description!
Extraction – examples
JSTOR
How?
● Screen scraping
● XML from http://dfr.jstor.org/
Extraction – examples
Flickr context harvester
Where?
● Flickr
http://userscripts.org/scripts/show/56135
Extraction – examples
Flickr context harvester
What?
● Comments
● Tags
● Links
http://userscripts.org/scripts/show/56135
Extraction – examples
Flickr context harvester
How?
● Flickr API
● Javascript or...?
● ‘See also’ links?
http://userscripts.org/scripts/show/56135
Extraction – examples
Zotero
Where?
● Research databases
● Zotero groups
Extraction – examples
Zotero
What?
● Notes
● Tags
● Collections
● Gems and strays
● Annotations
Extraction – examples
Zotero
How?
● Zotero everywhere
● Web API
● Integrate into apps
Enhancement
Add structure, meaning, value or context.
Enhancement
Not just what you do, but also what you don’t do.
Enhancement
Following a name
● Entity extraction (eg Open Calais, AlchemyAPI)
‘I say emphatically that the climate has changed’, Henry Hodgson told the Argus in 1928. The experience of seventy-eight years brooked no denial, summers were milder, and thunderstorms were fewer. ‘It is no use telling me that weather bureau statistics do not bear this out’, he added defiantly. ‘You can do anything with statistics, but no statistics will convince me that the climate has not changed radically.’
Henry Hodgson: person
But then what?
Enhancement
Following a name
● Use once and throw away?
http://mysite.com/search?q=Henry+Hodgson
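The "use once" option amounts to turning the extracted name into a search query and keeping nothing. A minimal sketch in Python, assuming the placeholder search address shown on the slide; `search_url` is a hypothetical helper name:

```python
from urllib.parse import urlencode


def search_url(name, base="http://mysite.com/search"):
    """Build a one-off search link for an extracted name."""
    # urlencode escapes the value and joins words with '+'.
    return base + "?" + urlencode({"q": name})


url = search_url("Henry Hodgson")
```

Nothing is stored here, so the work done by the entity extractor is lost as soon as the link has been followed.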
Enhancement
Following a name
● Store as a subject?
Subjects:
thunderstorms
weather
memory
Henry Hodgson
Enhancement
Following a name
● Store as a person?
Subjects:
thunderstorms
weather
memory
People:
Henry Hodgson
Enhancement
Following a name
● Add some structure?
<people>
  <person>
    <firstname>Henry</firstname>
    <surname>Hodgson</surname>
  </person>
</people>
Enhancement
Following a name
● What about the text?
‘I say emphatically that the climate has changed’, <span typeof="foaf:Person">Henry Hodgson</span> told the Argus in 1928. The experience of seventy-eight years brooked no denial, summers were milder, and thunderstorms were fewer. ‘It is no use telling me that weather bureau statistics do not bear this out’, he added defiantly. ‘You can do anything with statistics, but no statistics will convince me that the climate has not changed radically.’
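Marking up a name inside running text can be automated once the extractor has returned it. A minimal sketch in Python, assuming the name appears verbatim in the text and that the `foaf` prefix is declared elsewhere on the page; `mark_up_person` is a hypothetical helper, and `foaf:Person` is the class name defined by the FOAF vocabulary:

```python
import html


def mark_up_person(text, name):
    """Wrap each occurrence of a known name in an RDFa span."""
    # Escape first so the surrounding text is safe to embed in HTML.
    escaped_text = html.escape(text, quote=False)
    escaped_name = html.escape(name, quote=False)
    span = '<span typeof="foaf:Person">' + escaped_name + "</span>"
    return escaped_text.replace(escaped_name, span)


sentence = "Henry Hodgson told the Argus in 1928."
marked = mark_up_person(sentence, "Henry Hodgson")
```

Plain string replacement is enough for a demonstration; real text would need some care over partial matches and repeated names.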
Enhancement
Following a name
● Disambiguation?
People:
Henry Hodgson (1889-1956)
Henry H Hodgson (1902-1974)
Enhancement
Following a name
● Name authorities?
<people>
  <person>
    <firstname>Henry</firstname>
    <surname>Hodgson</surname>
    <id>http://nla.gov.au/nla.party-590379</id>
  </person>
</people>
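The structured record with an authority identifier can be generated rather than typed by hand. A minimal sketch using Python's standard `xml.etree.ElementTree`, assuming the element names from the slide; `person_element` is a hypothetical helper:

```python
import xml.etree.ElementTree as ET


def person_element(firstname, surname, identifier=None):
    """Build a <person> element, adding <id> only when one is known."""
    person = ET.Element("person")
    ET.SubElement(person, "firstname").text = firstname
    ET.SubElement(person, "surname").text = surname
    if identifier:
        ET.SubElement(person, "id").text = identifier
    return person


people = ET.Element("people")
people.append(
    person_element("Henry", "Hodgson", "http://nla.gov.au/nla.party-590379")
)
# Serialise to a string without an XML declaration.
xml_text = ET.tostring(people, encoding="unicode")
```

Keeping the identifier optional means records can be stored as soon as a name is extracted and linked to an authority later.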
Enhancement
The way you store and structure your metadata will affect possibilities for reuse.
Enhancement
Geocoding
● Putting places on a map
Canberra, ACT, Australia -35.28346 / 149.12807
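Geocoding boils down to looking up a place name and getting coordinates back, with some names failing to match. A minimal sketch in Python, assuming a tiny local gazetteer as a hypothetical stand-in for a real geocoding service; the only coordinates used are the Canberra pair from the slide:

```python
# Hypothetical stand-in for a geocoding service: a one-entry gazetteer.
GAZETTEER = {
    "canberra, act, australia": (-35.28346, 149.12807),
}


def geocode(place):
    """Return (latitude, longitude) for a place name, or None if unknown."""
    return GAZETTEER.get(place.strip().lower())


def geocode_all(places):
    """Split place names into those located and those missed."""
    located, missed = {}, []
    for place in places:
        coords = geocode(place)
        if coords is None:
            missed.append(place)
        else:
            located[place] = coords
    return located, missed


places = ["Canberra, ACT, Australia", "Aberfeldie VIC"]
located, missed = geocode_all(places)
success_rate = len(located) / len(places)
```

Keeping the misses alongside the hits makes it easy to report a success rate and to see which place names need manual attention.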
Enhancement
Geocoding services
● Google maps
● Yahoo Placemaker (includes entity extraction)
● GeoNames
● Geoscience Australia (under construction)
● and more...
Enhancement
NMA collection map
http://labs.nma.gov.au/collection/map/
● Two days’ work
● Used GeoNames
● 57% success (2142 places)
● Scotland is not a country
Enhancement
NLA photos map
http://www.paulhagon.com/playground/nla/geo/
● 35,000+ images located
● Used Yahoo Placemaker
● 80% success
● See Paul Hagon’s blog
Enhancement
Topic modelling
● Understanding what it all means
‘I say emphatically that the climate has changed’, Henry Hodgson told the Argus in 1928. The experience of seventy-eight years brooked no denial, summers were milder, and thunderstorms were fewer. ‘It is no use telling me that weather bureau statistics do not bear this out’, he added defiantly. ‘You can do anything with statistics, but no statistics will convince me that the climate has not changed radically.’
Weather forecasting
Enhancement
Topic modelling
● Web services (AlchemyAPI)
● MALLET (trainable)
Enhancement
Crowdsourcing
● Harnessing the wisdom of the crowd
● Seeking specialised knowledge
● Gathering additional context
Enhancement
Mapping our Anzacs
● Scrapbook
● Adding context to records
● More structure?
Enhancement
Archives Outside
● Gathering information
● Blog / Twitter / Flickr
Extension
Push your metadata beyond its boundaries.
Extension
New contexts
● Visualisation
● Mashups
Extension
Visible Archive
● Seeing everything
http://visiblearchive.blogspot.com/
Extension
History Wall
● Endless
● Ephemeral
● Serendipitous
http://visiblearchive.blogspot.com/
http://labs.nma.gov.au/wall/
Extension
Making connections
● Record linkage
● Authority records
Extension
People Australia
● Disambiguation
● Aggregating identities
● Assigning identifiers
http://nla.gov.au/nla.party-479364 me
Extension
People Australia
● Contribute!
● Use identifiers!
● See the wiki
Extension
Identity browser
● Bookmarklet enhanced
● Enriched with RDFa
● Machine tags
http://wraggelabs.com/identities/
Extension
FMTC
● Crowdsource connections
● Semantic linkages
● Harvest metadata back
http://wraggelabs.com/fmtc/
Extension
Setting it free
● Open data
● APIs
● Linked Data
Extension
Linked Open Data
● Become part of the semantic web
● Expose your metadata to the world
● Get started with good URLs and RDFa
Extension
Linked Open Data
Experimentation
Build spaces to play, learn, create and fail.
Experimentation
Share ideas, examples, recipes, tools and code.
Experimentation
TNA Labs
http://labs.nationalarchives.gov.uk/wordpress/
Experimentation
Don’t wait for permission.
now
Experimentation
Do it.
Experimentation
It’s easier than you think.
Homework
● Make good URLs
● Use identifiers
● Fix citation standards
● Expose structures (RDFa)
● Use NLA party ids
Where to find me:
@wragge
words – discontents.com.au
experiments – wraggelabs.com
work – labs.nma.gov.au