Improving the discovery of
European Historic Newspapers
Rossitza Atanassova, British Library
@RossiAtanassova
IFLA Newspapers, Lyon, 20 August 2014
This project is partially funded under the ICT Policy Support Programme (ICT PSP) as part of the
Competitiveness and Innovation Framework Programme by the European Community
http://ec.europa.eu/ict_psp
Europeana Newspapers is making historic
newspaper pages searchable
http://vimeo.com/100313926
Project outcomes
• Content in 22 languages
ranging from the 17th to the 20th century
• 10 million pages of full text
• Article-level records and
named entities for 2 million
pages
• Aggregation of up to 18
million pages
• Aggregation of metadata for
up to an additional 19 million
pages
• Cross-searchable
newspapers interface at The
European Library
• http://www.theeuropeanlibrary.org/tel4/newspapers
• Issue-level metadata via
Europeana
http://www.europeana.eu/
Statistics
Currently one can search
through
• full-text for over 2 million
pages
• metadata records relating to
over 1 million issues
(links to source libraries)
Search and browse options
Display options
• Metadata, full-text and full
zoomable images
• Metadata, full-text and static
images (full size or snippets)
• Metadata and full-text
• Metadata
Usability testing
• Remote 60-minute test sessions in April 2014
• Conducted by User Vision, Edinburgh
• 12 participants from 5 countries with professional or strong
personal research interest in the content
• 6 task scenarios
• Pre- and post-test questionnaires
• User Vision Report at
http://www.europeana-newspapers.eu/usability-testing-results-for-our-historic-newspapers-browser/
Task success and ease of use ratings
Images in Alan Blackwood, The European Library Newspaper Archive –
Usability Testing, 16/04/2014
User response to the interface
• “Strong positive reaction to the availability of the archive”
• “Aggregated view of content from many sources highly
valued”
• “Basic search functionalities worked well”
• Presentation of images and image navigation controls are
appreciated, as is the display of OCRed text
• Browsing content via the geographical map is popular
• Issues identified with design and functionality: facets, results,
navigation
• Further expectations: print, download, saved searches
Before and after
Changes to landing page
• Prominent browse and
advanced options
• ‘Discover’ tab for browse
options page
• ‘This day in history’ allows
users to scroll through all
relevant issues
Changes to browsing options
• Search by issue date
modified to include a text
input box for the year with
auto-suggestions
• Select title from an
alphabetical index
• Geographical map of Europe
is bigger and uses a better
colour palette to indicate
the number of issues
Changes to results pages
• Sort by relevance,
descending date and
ascending date
• Configure number of items
per page (10-100)
• Further recommendations:
controls to navigate between
results, a ‘back to search
results’ button and a search
input box to allow
modification of search terms
Faceted search and newspaper source page
Integration of the viewer into the Europeana
portal
Next steps with the browser
• Second usability test in
September
• Final version by end of 2014
• Add OCR correction
functionality
• Allow access via API
• Further integration of the
newspapers viewer within
Europeana
Research practices and expectations
• Participants in the usability test have well-established
research practices and high expectations of the site’s
functionality
• Preference for search over browsing
• Greater control over search results
• Multiple layers of search through facets
• Would like to search by subject area and historical period
• User account to save search histories
• Download and print options
• New content notifications and feedback submission option
Researchers’ interest in the Europeana
Newspapers archive
• Interdisciplinary source of
information
• Mass digitised content
• Pan-European cross-
searchable archive
• Transnational comparative
studies
• Text mining for multilingual
content
• Computational analysis and
visualisation of the data
What researchers value
“I see enormous value in an archive that breaks
down national boundaries automatically, where I
can search for content from a range of
countries..” – Bob Nicholson
“The difference lies not just in access but in the
conversion of a massive amount of print into a
searchable resource … This holds the potential to
make connections across newspapers in ways
previously unimaginable.” Matt Rubery
“Now software allows us to work with millions of
pages. By combining words and expressions,
machines uncover patterns that we never even
suspected were there …” Professor Toine Pieters
Digital Humanities approaches to digitised
newspaper archives
• ‘Asymmetrical Encounters: E-Humanity
Approaches to Reference Cultures in
Europe, 1815-1992’
• The project will apply
multilingual text mining
techniques to long runs of
digitised newspapers and
other textual materials
The Victorian Meme Machine project
• Partnership between Bob
Nicholson, Edge Hill
University and British Library
Labs
• Extract Victorian jokes from
19th century British
newspapers
• Crowdsource transcriptions
• Algorithms to pair text with
images
• Share and re-use memes
https://www.youtube.com/watch?v=FN1ZSAz2vMg
Europeana Newspapers Information Days
Final workshop “Newspapers in Europe & the
Digital Agenda for Europe”
• British Library, 29-30
September 2014
• The value of digitised historic
newspapers
• How to overcome the barriers
to improving access to
digitised historic newspapers
• Policy makers, researchers,
librarians, cultural heritage
professionals and newspaper
publishers
Thank you!
For more information visit
www.europeana-newspapers.eu