British Library Labs Roadshow University of Wolverhampton 2017


1 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/Lh4zI6

British Library Labs

What is British Library Labs and what have we learned over the last four years?

1305 – 1400 and 1500 – 1530, 3 April 2017

Learning the Lessons of working with the British Library’s Digital Content and Data for your research

University of Wolverhampton

https://goo.gl/Lh4zI6

Mahendra Mahey, Manager of British Library Labs

@BL_Labs and @mahendra_mahey

mahendra.mahey@bl.uk

It’s all about you… jobs

It’s all about you… subjects

Please complete / correct sheet that is going round

The British Library

St Pancras, London, UK
Legal Deposit Library, reference only
Inside the British Library: space for 1200 readers, around 400,000 visitors per year
Many books are stored 4 storeys below the building

Boston Spa
Document Supply and Storage
Uses low oxygen and robots
Reading room and delivery to London

Stockton-on-Tees
Authors’ right to payment each time their books are borrowed from public libraries

Living Knowledge Vision (2015 – 2023)

Custodianship, Research, Business, Culture, Learning, International

To make our intellectual heritage accessible to everyone, for research, inspiration and enjoyment and be the most open, creative and innovative institution of its kind by 2023.

Roly Keating (Chief Executive Officer of the British Library)

Document: http://goo.gl/h41wW7
Speech: https://goo.gl/Py9uHK

Collections – not just books!

> 180* million items
> 0.8* m serial titles
> 8* m stamps
> 14* m books
> 3* m sound recordings
> 4* m maps
> 1.6* m musical scores
> 0.3* m manuscripts
> 60* m patents

King’s Library

*Estimates

http://www.bl.uk/projects/british-library-labs

Funded by the Andrew W. Mellon Foundation

Wider…not just Researchers

Researchers
https://goo.gl/WutNyi

Artists
http://goo.gl/nNKhQ2

Librarians
Curators
https://goo.gl/9NWZUW

Software Developers
https://goo.gl/7QQ5Tf

Archivists
https://goo.gl/x7b4tg

Educators
https://goo.gl/qh01Mi

Digital research methods

Visualisations

Application Programming Interfaces for datasets e.g. Metadata, Images Annotation

Location based searching & Geo-tagging

Crowdsourcing

Human Computation


How are we doing this?

Competition: Tell us your ideas of what to do with our digital content

Awards: Show us what you have already done with our digital content in research, artistic, commercial and learning and teaching categories

Projects: Talk to us about working on collaborative projects

Why are we doing this?

• Working closely with and listening to those who want to use our digital collections and data for their work

• We can learn how we are and should be supporting them:
  – Access to digital collections?
  – Advice, guidance, technical support, training
  – Services, Tools and Processes?
  – Many more reasons…

• Where are the gaps between what users want and what we can give?

• How do we build the bridges to overcome the gaps?

Born digital

Data all around us!

Knowledge Quarter London: 55 knowledge organisations within 1 mile radius of Kings Cross, http://www.knowledgequarter.london

https://goo.gl/pGO7QY

#bldigital

Digitisation

1-2%* digitised (*estimate)

Partnerships: Commercial & Other Organisations

Amount increasing rapidly

Bias in digitisation

Sample Generator: http://goo.gl/bR9UJL


Have you got X?

https://upload.wikimedia.org/wikipedia/commons/5/50/Real_wuerzburg.jpg

Looking for Physical Content in the British Library


Have you got X digitised?

http://www.yorkmix.com/wp-content/uploads/2014/04/mr-simms-sweet-shoppe-york.jpg

Looking for Digitised Content in the BL

So little digitised?

• Digitisation costs time and resources…

• Still…over 650 Digital Collections but not all found through Google or even online

• Dialogue is either:
  – you are ‘lucky’ and we have the digital content relevant to your research
  – we don’t have exactly what you’re looking for, but is there anything of interest? Let’s talk…

• Artists find this dialogue easier and we tend to attract researchers with ‘fuzzier’ research boundaries

• Access easier for openly licensed content

• More challenging for on-site and in-copyright contemporary material

Challenges of access to Digital Collections

British Library:
• only in Reading Rooms due to ©
• only on site due to © or ethical etc
• not online / available – various storage devices, personal data
• online and open
• online behind paywall

The Story of the Digital Collection…

Digital Collection

Curator
Who paid for the digitisation?
Who did the digitisation?
Technology used
Born digital?
Published
Unpublished
Where is it?
Can it still be accessed?
Generates income
Reputational Risk
Legalities
Political
Ego
Surprises
Metadata
Old format not supported
What media was the digitisation done from?
Documentation
No Metadata
Messy Metadata
Still there?

Good to know the background of a Digital collection if you want to use it for research…

Open Licensed Digital Content?

15%* Openly Licensed
85%* Available onsite
Around 10%* available online
Working through
*Estimates

Breakdown by collection*
Manuscripts 59%
Books 9%
Maps and Views 7%
Newspapers 3%
Archives and Records 3%
Paintings, Prints and Drawings 2%
*Based on digitisation projects

Largest proportion of funding: Public / Private Partnership

How do we give access to onsite-only Digital Collections (85% of our Digital Collections)?

Challenges of access to Digital Collections

Labs Residency Model

[Diagram labels: British Library, READING ROOM, ON SITE, NOT ONLINE, OPEN, £]

Accessing digital collections onsite

• Have to be ‘onsite’

• Need to be security cleared for some collections
  – Hence ‘Researcher in Residence Model’

• Permission required (depending on ‘story’ of collection)

• Content on various media formats

• 20% re-use of material for non-commercial research for some collections

• We are learning ‘pathways’ so that this becomes ‘everyday’ to provide onsite access in the future

Digital collections and Datasets

Playbills, Books, Newspapers (includes OCR)

British National Bibliography
http://bnb.data.bl.uk

http://sounds.bl.uk
http://dml.city.ac.uk/

Music (Recordings & Sheet) & Sounds
http://goo.gl/frSMJt

Broadcast News (TV and Radio)
http://goo.gl/cwThHw

http://goo.gl/pBkisZ
http://goo.gl/E8aRyQ

Usage data
EThOS
Images, Manuscripts & Maps

Qatar Digital Library
http://www.qdl.qa/

International Dunhuang Project
http://idp.bl.uk/

Maps
http://www.bl.uk/maps/

Hebrew Manuscripts
http://goo.gl/4sbCp9

Flickr & Wikimedia Commons
https://goo.gl/LZRmaZ

Open Cultural Heritage Datasets

Collection Guides

• Datasets about our collections: bibliographic datasets relating to our published and archival holdings
• Datasets for content mining: content suitable for use in text and data mining research
• Datasets for image analysis: image collections suitable for large-scale image-analysis-based research
• Datasets from UK Web Archive: data and API services available for accessing UK Web Archive
• Digital mapping: geospatial data, cartographic applications, digital aerial photography and scanned historic map materials

https://data.bl.uk

Discussion list: http://www.jiscmail.ac.uk/CULTURAL-HERITAGE-DATASETS

What did people actually do?

Typical pattern of research for Labs

• Finding invisible things in ‘messy’ historical data

• Unearthing / unlocking hidden histories and data to stimulate new research

• Celebrating hidden histories / data creatively through events, art and performance

Finding things in messy OCR text

• Clean up some manually
• Get human ‘ground truth’
• Write code to find things reliably in it automatically
• Try code on messy content
• Tweak if necessary
• Digital ‘lasso’ around content
• Human sift through

Mrs Folly
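
The "write code, try it on messy content, then human sift" steps could be sketched roughly as below. This is only an illustration, not the method used in any Labs project: the function name `fuzzy_find`, the 0.8 cutoff and the sample sentence are all assumptions made for the example. It slides a window over the OCR text and keeps any window that is approximately equal to the phrase being searched for, so that OCR errors such as "Fol1y" are still caught.

```python
from difflib import SequenceMatcher


def fuzzy_find(text, phrase, cutoff=0.8):
    """Return (word_index, matched_text, score) for each window of words
    in `text` that approximately matches `phrase`.

    The cutoff of 0.8 is an arbitrary starting point; in practice it would
    be tweaked against a manually corrected 'ground truth' sample.
    """
    words = text.split()
    width = len(phrase.split())
    target = phrase.lower()
    hits = []
    for i in range(len(words) - width + 1):
        window = " ".join(words[i:i + width])
        score = SequenceMatcher(None, window.lower(), target).ratio()
        if score >= cutoff:
            hits.append((i, window, round(score, 2)))
    return hits


# Invented sample of messy OCR text (not taken from a real newspaper).
sample = "Laft night Mrs Fol1y appeared at the theatre with Mr. Folly"
for hit in fuzzy_find(sample, "Mrs Folly"):
    print(hit)
```

A loose cutoff acts as the digital ‘lasso’: it also catches near misses such as "Mr. Folly", which is why a human still sifts through the results afterwards.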

Code: Machine Learning / Reading

• Analogies to how humans read / learn

• Machines acquire ‘knowledge’ / data and use that knowledge / data to make sense / identify patterns

• Labs doing this on a case by case basis so methods can vary

• Need computational AND human effort

• Legalities of this process being ‘ironed’ out with publishers

• Often a misunderstood area…

• Computers look for ‘patterns’ or the ‘essence’ of something

Smell of soup & Machine Learning

Thanks to Memo Akten (@memotv on twitter) for the inspiration!

https://goo.gl/toq4Bo

Nasreddin, 13th Century Turkish Sufi
http://web2.uvcs.uvic.ca/elc/studyzone/330/reading/smell1.htm

Victorian Meme Machine (2014)

https://goo.gl/HMqDt3

Bob Nicholson

http://victorianhumour.tumblr.com/

Bob Nicholson interviewed on BBC Radio 4 Making History Programme: http://goo.gl/fmV9ep

And telling jokes to the public: http://goo.gl/xIDRhz

Bob obtained further funding from his university

Looking for more collaborations

https://www.youtube.com/watch?v=-GRgj7Q5OM0

Rob Walker, Victorian Mother-in-law Jokes

Victorian Comedy Night, 7 Nov 2016

Katrina Navickas (2015) Political Meetings Mapper

http://politicalmeetingsmapper.co.uk
https://goo.gl/Qq78Oa

Labs Symposium 2015

https://goo.gl/BSA3be

Interview 2015

The Chartist Newspaper
http://goo.gl/vOLSnH

Chartist Monster Meeting

Chartists Walking Tour and Re-enactment London

Black Abolitionist Performances & their Presence in Britain (2016) – Hannah-Rose Murray

Frederick Douglass

Ellen Craft

Josiah Henson

Ida B Wells

A Performance by Joe Williams & Martelle Edinborough

http://frederickdouglassinbritain.com/

Data-mining verse in 18th Century newspapers

BL Labs Project 16-17, Jennifer Batt

https://goo.gl/5Akthd

Slides courtesy Jennifer Batt

Jennifer Batt @ the BL on World Poetry Day


What thoj' among ourrelves, with too much Heat, or t W: fweutimes.wongle, wvhen we Ihould debate, W – (A confequential Ill which Freedom drawvs, fl t A bad Efficf, but from a noble Caufe) t We can with univeifal Zcal advance, to To cutb the faithlefs Arrogancccof V rance. hi

Dublin Journal 10-14 September, 1745

Slides courtesy Jennifer Batt


Verse: 81% lines begin with initial capital

Prose: 52% lines begin with initial capital

Westminster Journal 3 March 1745

Slides courtesy Jennifer Batt
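
The initial-capital statistic above can be measured with a few lines of code. The following is a minimal sketch, not the project's own code: the function names, the 0.7 threshold (picked only because it lies between the 52% and 81% figures) and the sample lines are all assumptions for illustration.

```python
def initial_capital_ratio(lines):
    """Fraction of non-empty lines whose first character is an upper-case letter."""
    non_empty = [line.strip() for line in lines if line.strip()]
    if not non_empty:
        return 0.0
    capitalised = sum(1 for line in non_empty if line[0].isupper())
    return capitalised / len(non_empty)


def looks_like_verse(lines, threshold=0.7):
    """Very rough heuristic: flag a block as possible verse when most of
    its lines start with a capital. The threshold is an assumed value."""
    return initial_capital_ratio(lines) >= threshold


# Invented examples, not taken from the digitised newspapers.
verse = ["The sun goes down,", "And evening falls,", "on quiet town", "And garden walls."]
prose = ["Yesterday the ship arrived", "from Lisbon with news that",
         "the fleet had sailed. The", "captain reports fair winds."]

print(initial_capital_ratio(verse), looks_like_verse(verse))
print(initial_capital_ratio(prose), looks_like_verse(prose))
```

A single feature like this only narrows the search; candidate blocks would still need checking by a person, especially where OCR errors have corrupted the first character of a line.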

Use of Overproof / OCR Correction?
http://overproof.projectcomputing.com/

Re-OCR with ABBYY FineReader?
https://www.abbyy.com/en-gb/

Virtual Infrastructure for OCR text

OCR text scraped from digitised newspapers and in cloud

Jupyter notebook: write Python code and results in browser
http://jupyter.org

Access available for researchers ‘in residence’


Other experiments with images

Worked better for female faces than men’s

Press

http://mechanicalcurator.tumblr.com
Posts image every 30 minutes

http://www.flickr.com/photos/britishlibrary/

1,020,418 images need tagging!

Creative uses of images

Face recognition

Mechanical Curator
http://goo.gl/qPPgxX

Flickr

Snipping out images from 65,000 Digitised Books*

>600,000,000 views
>20,000,000 tags
https://goo.gl/FgZ4HM

Work @ BL by Ben O’Steen, Labs and Digital Research Team

*Matt Prior - http://goo.gl/j29Tnx

Since Dec 2013

Tumblr

Using other platforms to host BL collections

Links back to Library & community engagement

You can purchase a ‘High Res’ Copy
View in the Library Item Viewer
Download .pdf
All illustrations in book
Other illustrations in books published in same year
View the item in the Library Catalogue
Tags auto generated
User generated Tag
Grouping for image
Same on Wikimedia commons

British Library Flickr Commons Tags


Tagging, Tagging, Tagging…

Tagging a million images

Iterative Crowdsourcing

http://goo.gl/j6fxac

Cardiff University’s Lost Visions Project

Metadata Games
http://www.metadatagames.org/

Top British Library Flickr Commons Taggers
James Heald
Mario Klingemann
Chico 45

Use computational methods

Human Tagger

Special Jury’s Prize (2015)

James Heald – Wikimedia and Map work

https://goo.gl/WYZCB2

http://goo.gl/HNQq5e

https://goo.gl/VPgffL

https://commons.wikimedia.org/

https://goo.gl/djtm1b

Labs Symposium (2015)

Geotagging maps

54,000 Maps found in Flickr 1 million

Human & Computational Tagging & Community engagement

Adam Crymble (2015) Crowdsource Arcade

What if crowd sourcing looked like this?

http://goo.gl/LBfJ4W

http://goo.gl/OH9pOZ

https://goo.gl/7z0j8p

30 mins talk, Labs Symposium (2015)

https://goo.gl/SSRsdd

5 min interview (2015)

http://goo.gl/0APpE8

Game Jam

Using Arcade Games to help Tag images

SherlockNet: Competition Winner 2016

Karen Wang, Luda Zhao and Brian Do

Using Convolutional Neural Networks to Automatically Tag and Caption the British Library Flickr Commons 1 million Image Collection

12 categories

>20 million tags added
>100,000 captions

bit.ly/sherlocknet

Pooled surrounding OCR text on page from similar images

Used Microsoft COCO (photographs) & British Museum Prints and Drawings collections as training sets.

Tags, Captions
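
The idea of pooling OCR text from similar images can be illustrated with a toy sketch. This is not SherlockNet's code: it assumes each image already has a numeric feature vector (for example from a neural network), and the function names, stopword list, identifiers and sample data are all hypothetical. For one image, it finds the most similar images by cosine similarity, pools the words printed near all of them, and proposes the most frequent words as tags.

```python
import math
from collections import Counter

STOPWORDS = frozenset({"the", "and", "with", "for", "from"})  # assumed, minimal


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pooled_tags(query_id, features, page_text, k=2, top_n=3):
    """Suggest tags for one image by pooling the text found near the image
    itself and near its k most similar images."""
    query = features[query_id]
    others = [i for i in features if i != query_id]
    neighbours = sorted(others, key=lambda i: cosine(query, features[i]), reverse=True)[:k]
    counts = Counter()
    for image_id in [query_id] + neighbours:
        for word in page_text.get(image_id, "").lower().split():
            word = word.strip(".,;:!?'\"()")
            if len(word) > 2 and word not in STOPWORDS:
                counts[word] += 1
    return [word for word, _ in counts.most_common(top_n)]


# Hypothetical feature vectors and nearby page text for three images.
features = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
page_text = {"a": "the ship at sea", "b": "a ship in the harbour", "c": "portrait of a lady"}
print(pooled_tags("a", features, page_text, k=1, top_n=1))
```

Pooling across neighbours is what lets a word that appears near several similar images outrank a word that appears next to only one of them.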

Artistic / Creative Works

http://goo.gl/dM8ieA

Mario Klingemann (2015)

https://www.youtube.com/watch?v=Q3SBxO34Zlc

David Normal 2014 and 2015

http://goo.gl/bNxGZZ

Kris Hoffman (2016)

https://goo.gl/QilqqT

Jiayi Chong 2016

Ling Low 2016

https://www.youtube.com/watch?v=bcOP1E5bRE0

https://www.facebook.com/RealmlandStory/

Paul Rand Pierce 2016

A Hat on the Ground Spells Trouble

Tragic Looking Women

44 Men who Look 44

(Notice the direction faces)

Mario Klingemann 2016

https://www.youtube.com/watch?v=xgnxnmqnR7Y

Google Arts and Culture Lab – Experiments with Machine Learning

https://artsexperiments.withgoogle.com/

Imaginary Cities – BL Labs Project 16-17

Michael Takeo Magruder

https://goo.gl/4ARwTy

An artistic exploration seeking to create provocative fictional cityscapes for the Information Age from the British Library’s digital collection of historic urban maps

Lessons Learned & Challenges… (1)

• Start with a conversation (external and internal). Our data isn’t all on Google (yet!) and is not easy to find, so we need to create and embrace serendipity and opportunities for use by talking!

• Progressing an idea sometimes needs several conversations with several stakeholders, tapping into tacit knowledge that isn’t always written down.

• Misunderstandings often arise because of jargon and words that mean different things to different people.

• Learn the story of the collection

• Expectations change when researchers actually see the data and systems and experience the ‘culture’ of the organisation.

• Opening collections requires some people to let go of their emotional and psychological connection to them

Lessons Learned & Challenges… (2)

• Embrace dirty data, it may never be perfect!

• We tend to work with researchers who can be ‘flexible’ with their research questions and are willing to embrace challenges.

• Many researchers have the domain knowledge but lack the technical / digital skills to use Digital Research methods. Should they be teamed up with those who want to solve problems (computer science) or get trained?

• Identifying / bridging gaps for researchers to use data, helping them ‘navigate’ through the Library to get the data they want (sometimes).

• Huge appetite to use digital content & data (e.g. Flickr Commons stats).

• Stimulate the imagination, work fast, give it energy

Labs mindset…

1. Start a conversation and try to support ideas
2. Start with small experiments, but think big!
3. Fail faster (don’t be afraid)
4. Reject perfectionism
5. Good enough is sometimes Good enough
6. Celebrate the uses of collections


The Magic of Openness!

• If digitised / digital collections are not used, what is the point of digitising / keeping them?

• Opening up our digital collections offers new ways for the Library’s content to be remixed and re-imagined

• Opening up our digital collections ‘re-energises’ them and the Library

• Generates plenty of examples to inspire use by others


The Future of BL Labs

• Continue to engage with researchers, learn what they want to do and collect evidence of demand

• Develop Business Model and Support process to make ‘Business as Usual’ at the British Library

• Help to create pathway to developing a Digital Research Suite at the British Library


Taking a peek at our Open Data

A digitised book…


002819694

Optical Character Recognition (OCR) generated Text

Scanned Page

Image on Flickr Commons

https://goo.gl/AC43vs

OCR XML Generated by ABBYY FineReader


Taking a peek at our on-site only accessible data

A digitised newspaper

Accessing digitised newspapers onsite at the BL

Step 1: Windows 7. External access possible through Citrix Server. Results of digitisation exist on Windows file shares!

Step 2 (JISC 1): 12 Volumes, each with terabytes of data

[Steps 3 to 12: images only]

Step 13: Accessing original ‘master’ image (not cropped or post processed) or ‘service’ copy (post processed) and results of OCR available as ALTO XML

Step 14a: Accessing original ‘master’ image (not cropped or post processed) in .TIFF format

Step 14b: Accessing original ‘master’ image (not cropped or post processed)

Step 15a: Accessing ‘service’ copy (post processed) and results of OCR available as ALTO XML

Step 15b: Accessing ‘service’ copy (post processed)

Step 15c: Accessing OCR as ALTO XML
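
For readers unfamiliar with ALTO XML, the sketch below shows one way the OCR text might be read out of such a file with Python's standard library. It is an illustration under assumptions, not a description of the Library's files: the sample document is invented, and it assumes words are stored as `String` elements with `CONTENT` and an optional `WC` (word confidence) attribute inside `TextLine` elements, as the ALTO schema defines. Real files may use a different schema version or omit confidence values.

```python
import xml.etree.ElementTree as ET

# Invented ALTO fragment for illustration only.
SAMPLE = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout><Page><PrintSpace><TextBlock>
    <TextLine>
      <String CONTENT="Dublin" WC="0.98"/><SP/><String CONTENT="Journa1" WC="0.41"/>
    </TextLine>
    <TextLine>
      <String CONTENT="September" WC="0.90"/>
    </TextLine>
  </TextBlock></PrintSpace></Page></Layout>
</alto>"""


def local_name(tag):
    """Strip the '{namespace}' prefix so the code works across ALTO versions."""
    return tag.rsplit("}", 1)[-1]


def alto_lines(xml_text, min_confidence=0.0):
    """Return one string per TextLine, keeping only words whose WC value
    is at least `min_confidence` (words without WC are always kept)."""
    root = ET.fromstring(xml_text)
    lines = []
    for element in root.iter():
        if local_name(element.tag) != "TextLine":
            continue
        words = []
        for child in element:
            if local_name(child.tag) != "String":
                continue
            confidence = float(child.get("WC", "1.0"))
            if confidence >= min_confidence:
                words.append(child.get("CONTENT", ""))
        lines.append(" ".join(words))
    return lines


print(alto_lines(SAMPLE))
print(alto_lines(SAMPLE, min_confidence=0.5))
```

Filtering on word confidence is one simple way to see how much of a page the OCR engine itself regarded as doubtful before deciding whether the text is usable for research.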

Accessing digitised newspapers through Gale Interface (subscription)

[Steps 1 and 2: images only]

It’s all about you…

Please complete / correct sheet that is going round

Explore or Imagine Our Data!

https://data.bl.uk
• Have a look at the data.
• Data Quality
• Issues

Or an idea you have thought of what to do with the data!
http://labs.bl.uk/Ideas+for+Labs

• CSV of Metadata
  https://data.bl.uk/digbks/dig19cbooks-mdata-csv.csv

• 19th Century Books - Book Metadata - 01/09/2013.
  https://data.bl.uk/digbks/db21.html

• Digitised Books - Flickr Tag History - Dec 2013 to March 2016. TSV
  https://data.bl.uk/digbks/db15.html

• Digitised Hebrew Manuscripts - Metadata
  https://data.bl.uk/hebrewmanuscripts/heb1.html

• Digitised Hebrew Manuscripts: Or 2210 - Or 2364
  https://data.bl.uk/hebrewmanuscripts/heb8.html

• Theatrical playbills from Britain and Ireland (OCR text only)
  https://data.bl.uk/playbills/pb2.html

• Portraits of actors, views of theatres and playbills (covering 1750 - 1821 in a single volume)
  https://data.bl.uk/singlesheet/por1.html

• Volumes of Lysons Collectanea (Amusements), comprising broadsides, cuttings, advertisements on amusements. 1660-1840.
  https://data.bl.uk/singlesheet/ad1.html

Smaller datasets
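
As a starting point for looking at data quality in a metadata CSV, a first pass might simply count empty cells per column. The sketch below is illustrative only: the column names and rows in the sample are invented and are not the actual layout of the Library's files, and the function name is hypothetical.

```python
import csv
import io
from collections import Counter


def missing_value_report(csv_file):
    """Count rows, and empty cells per column, in an open CSV file."""
    reader = csv.DictReader(csv_file)
    fieldnames = reader.fieldnames or []
    missing = Counter()
    rows = 0
    for row in reader:
        rows += 1
        for name in fieldnames:
            # Short rows yield None for absent cells; treat those as empty too.
            if not (row.get(name) or "").strip():
                missing[name] += 1
    return rows, {name: missing[name] for name in fieldnames}


# Invented sample; real column names will differ.
sample = io.StringIO(
    "identifier,title,date\n"
    "A1,,1850\n"
    "A2,An Example Title,\n"
)
print(missing_value_report(sample))
```

To try this on a downloaded file, pass an open file object instead of the sample, for example `open(path, newline="", encoding="utf-8")`; the encoding here is an assumption and may need adjusting for a particular dataset.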

Contact us

Mahendra Mahey
Manager of BL Labs

mahendra.mahey@bl.uk
labs@bl.uk