Analysing media in the cloud

DESCRIPTION

A talk about two BBC R&D projects that use cloud computing to process media. The first is a case study of how we used cloud computing to efficiently process a very large media archive and generate metadata. The second describes how that work led us to abstract a service out of it, resulting in a general-purpose cloud service for analysing media. Full talk notes at http://www.cookinrelaxin.com/2013/06/analysing-media-in-cloud.html

Page 1: Analysing media in the cloud

Research & Development

Analysing media in the cloud

An experiment and a marketplace

Tristan Ferne

Executive Producer

BBC Research & Development

Page 2: Analysing media in the cloud

An experiment in using the cloud to process a radio archive

A prototype for the World Service archive

A marketplace for analysing media in the cloud

Page 3: Analysing media in the cloud

ABC-IP

Automatic Broadcast Content Interlinking Project

Unlocking media archives by making better use of metadata

TSB competition for “Metadata: increasing the value of digital content”

BBC R&D and Metabroadcast

May 2011 - May 2013

Page 4: Analysing media in the cloud

The BBC World Service archive

A 3-year digitisation project

50,000 radio programmes from the past 45 years

3 years of continuous audio

500TB of high quality audio
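As a quick sanity check on these figures, the arithmetic can be sketched as below. It assumes 365-day years, which is not stated on the slide, and the variable names are purely illustrative. The resulting hour count matches the 26,280 hours quoted on the "Processing in the cloud" slide.

```python
# Figures taken from the slide; the 365-day year is an assumption.
HOURS_PER_YEAR = 365 * 24

archive_years = 3
programmes = 50_000

# Total duration of the archive in hours.
total_hours = archive_years * HOURS_PER_YEAR

# Implied average programme length in minutes.
avg_programme_minutes = total_hours * 60 / programmes

print(total_hours)                      # 26280
print(round(avg_programme_minutes, 1))  # 31.5
```

So the archive works out at roughly half an hour per programme on average.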

Page 5: Analysing media in the cloud

The missing metadata

Missing fields

Incorrect data

Spelling mistakes

Page 6: Analysing media in the cloud

Listening machines

Page 7: Analysing media in the cloud

Noisy transcripts

to be raised in a crisp and easy gait collar tradition and mystique and net bottle westphal mia ballroom with a fifth will one of your very well that p. c. set a caustic wet plate is sprint says it twice to purposes again who's addicted across stick is a podium which stopped at a slow start to the masses of setting up a world and on top was a big nineteen ninety three after a renewed spirit of the big dig ,comma off trillo .period when you are unable to compose and see what it's stole to working for a while at the guys when i started the eighth that we teach eighteen hamper and a timeless dave they'd each code for my list tinged yellow and io i had no east p. n. c. and i was a big epic tina afoot o'mara i. q. from kodiak and there was so they become kosher shopko misfit and i was a david to compose his team's end and at haas tied to districts in the indian head of i. a. moved to beijing

Page 8: Analysing media in the cloud

Extracting topics

Extract keywords from noisy transcripts

Match to Linked Data topics from DBpedia

Disambiguate using distance within the “semantic” space
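The three steps above can be illustrated with a toy sketch. This is not the project's actual pipeline: the stop-word list, the `CANDIDATES` table (standing in for a DBpedia lookup) and the two-dimensional coordinates (standing in for positions in the "semantic" space) are all invented for illustration. Disambiguation here simply picks the combination of candidate topics that lie closest together.

```python
import math
from collections import Counter
from itertools import combinations, product

# Hypothetical stop-word list; a real system would use a much larger one.
STOPWORDS = {"a", "and", "in", "is", "of", "the", "to", "was"}

# Hypothetical stand-in for a DBpedia lookup:
# keyword -> {candidate topic: made-up position in a toy "semantic" space}.
CANDIDATES = {
    "mercury": {"Mercury_(planet)": (0.9, 0.1), "Mercury_(element)": (0.1, 0.9)},
    "orbit": {"Orbit": (0.8, 0.2)},
    "sun": {"Sun": (1.0, 0.0)},
}

def extract_keywords(transcript, top_n=5):
    """Return the most frequent non-stop-words in the transcript."""
    words = [w.strip(".,") for w in transcript.lower().split()]
    counts = Counter(w for w in words if w and w not in STOPWORDS)
    return [word for word, _ in counts.most_common(top_n)]

def disambiguate(keywords):
    """Choose one topic per keyword so the chosen topics are mutually closest."""
    known = [k for k in keywords if k in CANDIDATES]
    best, best_cost = {}, math.inf
    for choice in product(*(CANDIDATES[k].items() for k in known)):
        # Sum of pairwise distances between the chosen topics.
        cost = sum(math.dist(a[1], b[1]) for a, b in combinations(choice, 2))
        if cost < best_cost:
            best = {k: topic for k, (topic, _) in zip(known, choice)}
            best_cost = cost
    return best

transcript = "the orbit of mercury and the sun is a noisy mercury orbit"
topics = disambiguate(extract_keywords(transcript))
print(topics["mercury"])  # Mercury_(planet)
```

Because "orbit" and "sun" sit near the planet in the toy space, the ambiguous keyword "mercury" resolves to the planet rather than the element. Words with no candidate topic, such as "noisy", are simply ignored.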

Page 9: Analysing media in the cloud

Processing in the cloud

26,280 hours of audio processed

36,729 compute hours on “small” cloud machines

Processed whole archive in 2 weeks at a cost of ~$3,000

Built an API for managing the process
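The figures above imply some useful ratios, worked through in the sketch below. It assumes that "2 weeks" means 14 days of wall-clock time with machines running continuously, and it treats the approximate $3,000 as exact; neither assumption comes from the slide, so the results are rough estimates only.

```python
# Figures taken from the slide.
audio_hours = 26_280
compute_hours = 36_729
total_cost_usd = 3_000        # the slide says "~$3,000"

# Assumption: two weeks of continuous wall-clock time.
wall_clock_hours = 14 * 24

compute_per_audio_hour = compute_hours / audio_hours
cost_per_audio_hour = total_cost_usd / audio_hours
cost_per_compute_hour = total_cost_usd / compute_hours
avg_machines = compute_hours / wall_clock_hours

print(round(compute_per_audio_hour, 2))  # 1.4
print(round(cost_per_audio_hour, 3))     # 0.114
print(round(cost_per_compute_hour, 3))   # 0.082
print(round(avg_machines, 1))            # 109.3
```

Under these assumptions, each hour of audio took about 1.4 machine-hours and cost about 11 cents to process, with roughly 110 "small" machines running in parallel on average.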

Page 10: Analysing media in the cloud

Machines + People

Page 11: Analysing media in the cloud

http://worldservice.prototyping.bbc.co.uk

Page 12: Analysing media in the cloud

http://worldservice.prototyping.bbc.co.uk

Page 13: Analysing media in the cloud

comma – Cloud marketplace for media analysis

TSB competition for “Innovating in the Cloud”

BBC R&D, Somethin’Else and Kite

May 2013 - May 2015

Page 14: Analysing media in the cloud

Media analysis

Topic generation from text

Summarising text

Sentiment analysis

Speaker identification and diarisation

Music identification

Mood classification of audio and video

Face recognition

Segmentation of audio and video

Object and place recognition

Scene detection in video

Subtitle creation

Page 15: Analysing media in the cloud

Problems with media analysis

Computationally intensive

Hard to integrate with other systems

Hard to evaluate and compare

Hard to know what's possible and what’s available

Page 16: Analysing media in the cloud

Making media analysis easy

Algorithm providers upload algorithms

Media owners upload content and choose what they want to analyse

The platform manages:

Computation and scaling

Storing the data

Monitoring

Billing
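The division of responsibilities listed above might be modelled along the following lines. This is a minimal in-memory sketch, not the comma platform's design or API: every class, method and field name here is hypothetical, the pricing model (a flat price per hour of media) is an assumption, and real computation, scaling and storage are replaced by a direct function call and Python lists.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Algorithm:
    """An analysis algorithm uploaded by an algorithm provider (hypothetical)."""
    name: str
    provider: str
    price_per_media_hour: float   # assumed pricing model
    run: Callable[[str], dict]

@dataclass
class Platform:
    """Toy platform handling computation, storage, monitoring and billing."""
    algorithms: dict = field(default_factory=dict)
    media: dict = field(default_factory=dict)
    results: list = field(default_factory=list)   # storing the data
    log: list = field(default_factory=list)       # monitoring
    bills: dict = field(default_factory=dict)     # billing, per media owner

    def upload_algorithm(self, algorithm):
        self.algorithms[algorithm.name] = algorithm

    def upload_media(self, media_id, owner, duration_hours, content):
        self.media[media_id] = (owner, duration_hours, content)

    def analyse(self, media_id, algorithm_name):
        owner, duration_hours, content = self.media[media_id]
        algorithm = self.algorithms[algorithm_name]
        output = algorithm.run(content)   # computation (no scaling here)
        self.results.append((media_id, algorithm_name, output))
        self.log.append(f"{algorithm_name} finished on {media_id}")
        charge = duration_hours * algorithm.price_per_media_hour
        self.bills[owner] = self.bills.get(owner, 0.0) + charge
        return output

platform = Platform()
platform.upload_algorithm(
    Algorithm("word-count", "a-university", 0.5,
              lambda text: {"words": len(text.split())})
)
platform.upload_media("prog-1", "an-archive", 2.0, "to be raised")
output = platform.analyse("prog-1", "word-count")
print(output, platform.bills)
```

In this sketch the media owner only uploads content and names an algorithm; execution, result storage, logging and charging all happen inside `analyse`.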

Page 17: Analysing media in the cloud

The comma marketplace

Algorithm developers; e.g. research departments at universities and SMEs

Media owners; e.g. broadcasters, museums, archives, even individuals

Page 18: Analysing media in the cloud

Analysing media in the cloud

Tristan Ferne, BBC R&D

@tristanf

http://www.bbc.co.uk/rd

http://worldservice.prototyping.bbc.co.uk