EMu, Collections Online, and the Adkin Diaries: Using existing platforms for transcription. Carol Stevenson, Collection Information System Manager, Museum of New Zealand Te Papa Tongarewa

  • Slide 1

EMu, Collections Online, and the Adkin Diaries: Using existing platforms for transcription
Carol Stevenson, Collection Information System Manager
Museum of New Zealand Te Papa Tongarewa
11th Australasian EMu User Conference, Sydney, 3-4 September 2013

Slide 2
"Spent morning shepherding breeding ewes etc. Mustered the 2nd class Waikanae hoggarts (these are now the best we have) + drafted out 105 of the 109 to go UR (North Block). Also mustered the (former) 1st class Waikanae (193) + drafted out 179 to go up to N. Blk with the others. In evening rode down to see Maud helped her develop some plates spent a lovely time with her she's a perfect darling."
http://collections.tepapa.govt.nz/theme.aspx?irn=4382

Slide 3

Slide 4
Background
George Leslie Adkin, 1888 - 1964
Farmer, photographer, geologist, explorer, archaeologist, ethnologist
1 man, 41 diaries, 59 years, over 21,000 days
Thousands of negatives and prints, some albums
Initial deadline: launch of the @life100yearsago Twitter feed, part of the WW100 project
Did everything ourselves
Figured out the process (imaging, cropping, loading, transcription guidelines) as we went

Slide 5
Process
Assess album condition
Photograph album pages, load as Media Assets to album
Crop pages to days, load as Media Assets (derivatives)
Create narrative for days
Load day images to EMu day narrative
Transcribe
Add associated subjects, people, places
Add context to narrative entries for month
Some parts are semi-automated, some completely manual; some need no special skills, others do

Slide 6
Framework
Using existing framework: EMu & Collections Online
CIDOC CRM for building and expressing semantic relationships
Days are conceptual entities, not physical
Links to physical entities: diaries, photographs, albums (Catalogue)
Links to people, places, topics (Thesaurus and Parties)
All content managed in EMu and delivered to Collections Online

Slide 7
Narrative for Day

Slide 8

Slide 9
Narrative Hierarchy

Slide 10

Slide 11

Slide 12
Narrative Associations

Slide 13
Hierarchy links to Catalogue

Slide 14
Catalogue - Media Asset

Slide 15
Media Asset Hierarchy

Slide 16

Slide 17

Slide 18

Slide 19

Slide 20
Existing framework: Cons
No crowdsourcing opportunity
The huge amount of data pushes the current visual design of Collections Online
Hierarchies get very long and can be slow in EMu

Slide 21
Existing framework: Pros
Cheap!
Know how to use it
No set-up
Proved flexibility of system
Full use of thesauri etc.
Links into rest of the collections (this is the most important)
Existing audience

Slide 22
Crowdsourcing
Size of the project is daunting, but the transcription could be manageable through crowdsourcing
The content is interesting: NZ history, early 20th-century courtship, farming, geology, religion, war, politics, weather
Horowhenua locals interested in local history, and in one of their famous sons
History students and educators: bring students closer to primary material, work with cursive handwriting, highlight the importance of accuracy in relation to data, personal biography
Learning history through a first-hand account

Slide 23
Platforms and complex data
There are a number of existing online platforms that look great (Zooniverse, FromThePage), but how do we deal with matching to our structure, vocabularies, authorities?
Could use automated text authority mining, but would then need to match back to authorities and structure, and the text doesn't include concepts that require human understanding (e.g. courtship)
Beyond the scope of crowdsourcing? But does that diminish the value of the data?
External platforms mean lots of data and image handling
Closed crowdsourcing.
Provide volunteers remote access to EMu, with very cut-down access

Slide 24
Where to
Can't do with existing (human) resource
Transcription is only one part of the project; richness comes from the linking of concepts, people, etc.
Need to figure out which parts need to be crowdsourced and which can't
Transcription could enable the adding of contextual and semantic relationships and links to other sources
Options for automating the above
Or, with a focussed crowd and a finite project, maybe we don't need a new platform; could provide training and use existing tools
Make data available for analysis, visualisation, research, fun

Slide 25
"In evening rode down to see Maud showed her some books but there seemed to be a lack of sympathy between us + the evening was a failure."
http://collections.tepapa.govt.nz/theme.aspx?irn=4080

Slide 26
Twitter
Narrative teasers are also Tweets
Link to Collections Online
One of a number of "100 years ago" accounts
Feeds into a group account, @life100yearsago
Also tweet images for days that have them
Dead man tweeting: potential issues of responding to comments etc.

Slide 27

Slide 28

Slide 29

Slide 30
What we've learnt
So much content, so much data
Our existing data structure works really well
Transcription is only one part
Context is needed, or at least useful, for the reader
Enlivens the collection, a step beyond just digitisation and transcription
Need to formalise the project

Slide 31
See
Adkin diaries on Collections Online
@adkin_diary on Twitter
@life100yearsago on Twitter

Slide 32
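
A minimal sketch of the authority-matching problem raised on Slide 23, assuming transcriptions are plain text and authorities are simple labelled records. The record identifiers, the `AUTHORITIES` list, and the `match_authorities` helper are hypothetical and are not part of EMu or Collections Online; the sample text is excerpted from the diary entry on Slide 2.

```python
import re

# Hypothetical authority records: the identifiers and record shapes are
# placeholders for illustration, not real EMu Parties or Thesaurus entries.
AUTHORITIES = [
    {"id": "party-001", "kind": "person", "label": "Maud"},
    {"id": "place-001", "kind": "place", "label": "Waikanae"},
    {"id": "topic-001", "kind": "topic", "label": "courtship"},
]


def match_authorities(text, authorities):
    """Return the authority records whose label appears as a whole word in text."""
    matches = []
    for record in authorities:
        pattern = r"\b" + re.escape(record["label"]) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            matches.append(record)
    return matches


# Excerpt of the transcription quoted on Slide 2.
entry = (
    "Mustered the 2nd class Waikanae hoggarts. "
    "In evening rode down to see Maud helped her develop some plates."
)

found = match_authorities(entry, AUTHORITIES)
found_ids = [record["id"] for record in found]
print(found_ids)
```

Literal matching finds the person and the place but not the courtship topic, which the entry never names; that is the kind of link that would still need a human reader.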