Introduction to British Library digital resources for social scientists

1. Welcome and introduction to British Library digital resources for social scientists
  • John Kaye, Lead Curator, Digital Social Science
  • Peter Webster, Web Archiving Engagement and Liaison Manager
  • 7th December

2. What kind of library are we?
  • We exist for everyone who wants to do research for academic, personal or commercial purposes
  • Our collections cover all known subject areas: sciences, technology, medicine, arts & humanities, social sciences
  • We have a copy of every item published in the UK
  • Our collections cover all formats: sound, images, video, newspapers, maps, manuscripts, databases, books and journals, and much more

3. News, newspapers and magazines

4. News and current events
  • Broadcast news, recording from May 2010
  • Political change in the Middle East
  • Olympic Games
  • Occupy movement

5. Images and photographs
  • Images online
  • Online gallery
  • Photographically illustrated books

6. Online Services

7. Social Science online resources for researchers
  • ESRC online resource
  • Management and Business Studies Portal
  • Social Welfare Portal
  • Social Science blog

8. [no text]

9. Management and Business Studies portal

10. [no text]

11. [no text]

12. Oral history at a
  • 370 collections, from 1 tape to 5,500 (Millennium Memory Bank)
  • 100-150 hours of new digital fieldwork recordings per month
  • 2,200 catalogue records added or updated per year
  • 4,000 public enquiries per year
  • 40 talks and lectures per year
  • 60 training sessions per year with OHS (500+ people)

13. Guides and support
  • Reference services: reading room, telephone, email
  • Help for Researchers web pages
  • Collection guides, eg for government publications
  • Topical bibliographies, eg Globalisation and employment, Gang culture and knife crime, Corporate Social Responsibility, Far Right in Britain
  • Welfare Reform on the Web

14. Exhibitions and

15. Doctoral Open Days 2013
  • 11 February: Social Sciences
  • 18 February: Media, Cultural Studies and Journalism

16. Web archives and digital method
  • Dr Peter Webster, Web Archiving Engagement and Liaison Officer
  • @UKWebArchive / @pj_webster
  • Peter.Webster@bl.uk
  • 7th December 2012

17. The lost web: people (archived 24/5/05)

18. The lost web: people (archived 8/8/05)

19. The lost web: organisations (archived 21/11/12)

20. The lost web: organisations (archived 12/12/08)

21. Our mission: Collect, preserve, and make accessible web sites of cultural and scholarly importance from the UK domain

22. UK Web Archive
  • Selective Web Archive
  • Over 11,000 websites collected since 2004
  • Over 50,000 instances
  • Over 16TB of compressed data
  • British Library, National Library of Wales, JISC
  • Also National Library of Scotland, the National Archives, Wellcome Library
  • Many collaborators, eg Women's Library, Live Arts Development Agency, Quakers in Britain

23. A typical event-based special collection

24. The orphaned web

25. A comprehensive special collection

26. Web archiving: the basics
  • What: Selecting, capturing, storing, preserving and managing access to snapshots of websites over time
  • How: Use crawler software to download websites automatically; selective or domain archiving; provide access in a Web Archive
  • When: Since mid 1990s
  • Who: Heritage and memory organisations, eg BL, The National Archives; university libraries; not-for-profit and commercial organisations, eg Internet Archive; individual researchers
  • Why: Global information resource; artefact of cultural and technology change; representative sample of the web (historical and sociological data that may not be found elsewhere); part of national digital heritage, with legal requirements

27. Selective versus domain archiving
  Two complementary approaches: selective and domain archiving (diagram axes: width and depth)
  • Domain harvesting: typically once/twice a year; domain-wide snapshot; supported by national legislative framework; automated & cost-effective
  • Selective archiving: more frequent gathers, with manual QA; guided by collection policy; can be based on events or themes, e.g. credit crunch; manual & expensive

28. Non-print Legal Deposit 2013: what will we collect?
  • A deposit library is entitled to copy UK publications from the open web.
  • A deposit library is entitled to collect other password-protected material by harvesting, subject to giving at least 1 month's written notice for the publisher to provide a password or access credentials.

29. What will we be collecting?
  Includes resources:
  • that are issued from a .uk or other UK geographic top-level domain, or where part of the publishing process takes place in the UK;
  • but excluding any which are only accessible to audiences outside the UK.

30. What will we NOT be collecting?
  • Film and recorded sound where the audio-visual content predominates
  • Private intranets and emails
  • Personal data in social networking sites or that are only available to restricted groups

31. What will users be able to do with it?
  Users may:
  • access deposited material while on library premises controlled by a deposit library;
  • print one copy of a restricted amount of any deposited material, for non-commercial research or other defined fair dealing purposes such as court proceedings, statutory enquiry, criticism and review or journalism.

32. What will users NOT be able to do with it?
  Users may NOT:
  • use an item simultaneously with another user;
  • make any digital copies, except by specific and explicit licence of the publisher.

33. A web archiving strategy based on prioritisation
  • Domain harvesting (domain crawl): broad sweep of .uk domain; survey and discovery; implement Legal Deposit
  • Events: political, cultural, social and economic events of national interest, eg Olympics 2012
  • Special Collection: focused, thematic collections; support priority subjects

34. JISC UK Web Domain Dataset (1996-2010)
  • Funded by JISC to create a research collection of UK websites
  • Collaboration between the Internet Archive, JISC and the British Library
  • Copy of subset of the Internet Archive's web collection that relates to the UK
  • 470,466 files, mostly arc.gz, with 4,494 warc.gz
  • Total size: 32TB
  • No local access possible through the Internet Archive
  • Can be used to generate secondary datasets and make these available
  • Analytical access the main route

35. Historical Archive HTML Version Analysis

36. N-Gram Search: Prime Ministers

37. N-Gram Search: Social Media

38. Questions?
  • John.Kaye@bl.uk, Twitter: @johnkayeBL
  • Peter.Webster@bl.uk, Twitter: @UKWebArchive / @pj_webster
  • UK Web Archive
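
Note on the N-Gram Search slides (36 and 37): an n-gram search tracks how often a phrase appears in archived text over time. The following is a minimal Python sketch of that idea, not the UK Web Archive's own implementation. The function names, the `SAMPLE_PAGES` data and the choice to normalise by the number of same-length n-grams per year are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def phrase_frequency_by_year(pages, phrase):
    """Share of n-grams per year that match `phrase`.

    `pages` is an iterable of (year, text) pairs. The phrase is
    tokenized the same way as the pages, so matching ignores case
    and punctuation. N-grams may span sentence boundaries.
    """
    target = tuple(tokenize(phrase))
    n = len(target)
    hits, totals = Counter(), Counter()
    for year, text in pages:
        grams = ngrams(tokenize(text), n)
        totals[year] += len(grams)
        hits[year] += grams.count(target)
    # Skip years with no n-grams to avoid dividing by zero.
    return {y: hits[y] / totals[y] for y in sorted(totals) if totals[y]}


# Hypothetical sample data: (capture year, text extracted from a page).
SAMPLE_PAGES = [
    (2005, "Tony Blair spoke today. Tony Blair answered questions."),
    (2008, "Gordon Brown met ministers. Tony Blair was not present."),
    (2008, "Gordon Brown spoke today."),
]

if __name__ == "__main__":
    for name in ("Tony Blair", "Gordon Brown"):
        print(name, phrase_frequency_by_year(SAMPLE_PAGES, name))
```

Normalising by the yearly n-gram total, rather than reporting raw counts, keeps years with more captured text from dominating the trend. A real archive would also need to deduplicate repeated captures of the same page, which this sketch does not attempt.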