This talk was given by Luke McKernan (The British Library) at the "Semantic Media @ The British Library" event on 23 September 2013.
News collections at the British Library
Luke McKernan
Lead Curator, News and Moving Image
Semantic Media @ British Library
www.bl.uk 2
News at the British Library
Newspapers
57,000 separate newspaper, journal, and periodical titles: approximately 100M issues or 750M individual pages, from the 16th century to today, in print and on microfilm
Occupies 50km shelf space
Current acquisition: 1,934 newspaper and weekly/fortnightly periodical titles
Print copies acquired under legal deposit but will move increasingly towards digital acquisition
Physical access at Newspaper Library, Colindale, closing in November 2013 – new reading room at St Pancras early 2014
Online access to 7M pages via British Newspaper Archive
British Newspaper Archive: http://www.britishnewspaperarchive.co.uk
Television and radio news
Began recording television and radio news programmes receivable in the UK in May 2010
Collection now over 30,000 programmes, of which 25,000 are TV, recorded off-air from 20 channels inc. BBC, Al-Jazeera, Russia Today, CNN, CCTV (China), NHK, Bloomberg, France 24, World Service, LBC
40 hours of TV and 22 hours of radio now captured per day
Born digital archive, including Electronic Programme Guide data and subtitles where available
Access onsite only, owing to copyright restrictions, via Broadcast News service
Broadcast News: http://videoserver.bl.uk (onsite only)
Web news
Non-print legal deposit legislation introduced in April 2013 means British Library can start harvesting UK websites
First annual crawl will collect 4.5M .uk websites and web pages
Plans to harvest a few hundred UK news websites on daily basis, beginning later this year
Access onsite only at BL and other legal deposit libraries
More information: http://www.bl.uk/aboutus/stratpolprog/digi/webarch/index.html
Available digital content
Underlying data (XML) for 2M nineteenth-century newspaper pages, including: publication date, newspaper title, issue, uncorrected OCR text
Underlying data (XML) for 30,000 television and radio programmes, including: transmission date, programme title, channel, subtitles (where available)
Possible use of audio and video content onsite
Web news content not available as yet
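As a rough illustration of how the two XML data sets listed above might be brought into one searchable form, the sketch below maps newspaper pages and broadcast programmes onto a single record type and runs a keyword search over OCR text and subtitles together. The element names (publication_date, newspaper_title, ocr_text, transmission_date, programme_title, channel, subtitles), the sample records, and the helper functions are all assumptions made for illustration; they are not the British Library's actual schema or tooling.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Hypothetical sample data: element names and contents are illustrative only.
NEWSPAPER_XML = """
<pages>
  <page>
    <publication_date>1896-04-07</publication_date>
    <newspaper_title>Example Gazette</newspaper_title>
    <issue>1234</issue>
    <ocr_text>The Olympic Games were opened at Athens yesterday</ocr_text>
  </page>
</pages>
"""

BROADCAST_XML = """
<programmes>
  <programme>
    <transmission_date>2012-07-27</transmission_date>
    <programme_title>Example Evening News</programme_title>
    <channel>Example Channel</channel>
    <subtitles>The Olympic opening ceremony begins tonight</subtitles>
  </programme>
  <programme>
    <transmission_date>2012-07-28</transmission_date>
    <programme_title>Example Radio Bulletin</programme_title>
    <channel>Example Radio</channel>
  </programme>
</programmes>
"""


@dataclass
class NewsItem:
    """One common record shape shared by both media."""
    medium: str
    date: str
    title: str
    text: str


def parse_newspaper_pages(xml_text):
    """Turn each <page> element into a NewsItem, using OCR text as the body."""
    root = ET.fromstring(xml_text.strip())
    return [
        NewsItem(
            medium="newspaper",
            date=page.findtext("publication_date", default=""),
            title=page.findtext("newspaper_title", default=""),
            text=page.findtext("ocr_text", default=""),
        )
        for page in root.iter("page")
    ]


def parse_programmes(xml_text):
    """Turn each <programme> element into a NewsItem; subtitles may be absent."""
    root = ET.fromstring(xml_text.strip())
    return [
        NewsItem(
            medium="broadcast",
            date=prog.findtext("transmission_date", default=""),
            title=prog.findtext("programme_title", default=""),
            text=prog.findtext("subtitles", default=""),
        )
        for prog in root.iter("programme")
    ]


def search(items, keyword):
    """Case-insensitive keyword match over item text, oldest first."""
    needle = keyword.lower()
    hits = [item for item in items if needle in item.text.lower()]
    return sorted(hits, key=lambda item: item.date)


if __name__ == "__main__":
    items = parse_newspaper_pages(NEWSPAPER_XML) + parse_programmes(BROADCAST_XML)
    for hit in search(items, "olympic"):
        print(hit.date, hit.medium, hit.title)
```

Programmes without subtitles simply carry empty text here, so they are never matched; a fuller approach would need speech-to-text or programme guide data to cover them.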
How can we bring the different news media together?
Searching Speech
PoliMedia
Newsmap
I Wanted to See All of the News from Today
Guardian Data Blog
Searching Speech: http://britishlibrary.greenbutton.com (onsite only)
PoliMedia: http://polimedia.nl
Newsmap: http://newsmap.jp
I Wanted to See All of the News from Today: http://allnews.greyisgood.eu
Guardian Data Blog: http://www.guardian.co.uk/news/datablog/2012/may/16/bitly-news-map-britain-data