7 Fun Things to do with MapReduce Chris Hillman – Teradata Data Scientist [email protected] @chillax7


Page 1: 7 Fun Things to do with MapReduce

7 Fun Things to do with MapReduce

Chris Hillman – Teradata Data Scientist
[email protected]
@chillax7

Page 2: 7 Fun Things to do with MapReduce

Agenda

• Map Tasks
    • Face Detection
    • Character Recognition
    • Speech to Text
• Shuffling
    • Mass Spectrometer processing
• Reducers
    • Text Mining
    • Actual Mining
• Cluster Building

Page 3: 7 Fun Things to do with MapReduce

Face Detection in Images

Step 1. Get a good Open Source Library

Step 2. Check the Example Code


Page 4: 7 Fun Things to do with MapReduce

Character Recognition

More Complex Task than Face Detection

    SELECT *
    FROM RecognizeNumberPlate(
        ON anpr.vehiclelogs
        imagecol('recognizedobject')
    );


Page 5: 7 Fun Things to do with MapReduce

Speech to Text

Fed up with word count examples?

How about counting words in a recorded wav file?
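Once a map task has turned each recording into a transcript, what remains is the familiar word count. The following is a minimal sketch in plain Python, assuming the speech-to-text step has already run. The `transcripts` dictionary, the file names, and the function names are hypothetical stand-ins for whatever a real recognizer would produce.

```python
from collections import defaultdict
from itertools import chain


def map_words(recording_id, transcript):
    """Map: emit (word, 1) for every word in one transcript."""
    for word in transcript.lower().split():
        yield word.strip(".,!?"), 1


def shuffle(pairs):
    """Shuffle: group the emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(word, ones):
    """Reduce: sum the counts for one word."""
    return word, sum(ones)


# Hypothetical transcripts keyed by wav file name (assumed recognizer output).
transcripts = {
    "call1.wav": "Hello, is the cluster up?",
    "call2.wav": "The cluster is up.",
}

pairs = chain.from_iterable(map_words(k, v) for k, v in transcripts.items())
counts = dict(reduce_counts(w, ones) for w, ones in shuffle(pairs).items())
```

The expensive part, the recognition itself, stays in the map phase because each recording can be processed independently of the others.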

Page 6: 7 Fun Things to do with MapReduce

Proteomics

Mass Spectrometers
• Create a lot of data…
• In XML format…
• It’s nasty to work with
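One way to make the XML less nasty is to flatten it into key/value pairs in a map step, so the shuffle can group peaks by spectrum. The sketch below assumes a heavily simplified, hypothetical layout (`run`, `spectrum`, and `peak` elements with `id`, `mz`, and `intensity` attributes); real instrument formats are considerably more involved, and the "most intense peak" reducer is only an illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical, heavily simplified instrument output (not a real file format).
SAMPLE = """
<run>
  <spectrum id="s1">
    <peak mz="100.5" intensity="20.0"/>
    <peak mz="200.25" intensity="80.0"/>
  </spectrum>
  <spectrum id="s2">
    <peak mz="100.5" intensity="5.0"/>
  </spectrum>
</run>
"""


def map_spectra(xml_text):
    """Map: flatten nested XML into (spectrum_id, (mz, intensity)) pairs."""
    root = ET.fromstring(xml_text.strip())
    for spectrum in root.iter("spectrum"):
        for peak in spectrum.iter("peak"):
            yield spectrum.get("id"), (
                float(peak.get("mz")),
                float(peak.get("intensity")),
            )


def shuffle(pairs):
    """Shuffle: group all peaks that belong to the same spectrum."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_base_peak(spectrum_id, peaks):
    """Reduce: keep the most intense peak of one spectrum."""
    return spectrum_id, max(peaks, key=lambda peak: peak[1])


base_peaks = dict(
    reduce_base_peak(sid, peaks)
    for sid, peaks in shuffle(map_spectra(SAMPLE)).items()
)
```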


Page 7: 7 Fun Things to do with MapReduce

Text Mining

First phases are map tasks

Text Extraction and Parsing
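Extraction and parsing fit the map phase because each document can be handled on its own. As a rough sketch, assuming the input documents are HTML, a mapper could strip the markup and split the text into tokens using only the Python standard library. The names `TextExtractor`, `extract_text`, and `map_document` are made up for this example.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the text nodes of an HTML document, ignoring the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def extract_text(html):
    """Text extraction: return the visible text of one HTML document."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def map_document(doc_id, html):
    """Map: extract the text, parse it into lowercase tokens, emit both."""
    text = extract_text(html)
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return doc_id, tokens


# Hypothetical input document.
doc_id, tokens = map_document(
    "doc1",
    "<html><body><h1>Fun Things</h1><p>MapReduce is fun.</p></body></html>",
)
```

The token lists can then be shuffled by term so that reducers handle the later, aggregate phases of the mining.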


Page 8: 7 Fun Things to do with MapReduce

Actual Mining

Comparing Seismic surveys taken at different points in time?
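The slide only poses the question, so the following is one possible reading rather than the method from the talk: key every reading by its grid position, let the shuffle bring the surveys together, and have the reducer compute the change over time. The record layout `(survey_year, x, y, amplitude)` and all values are hypothetical.

```python
from collections import defaultdict

# Hypothetical readings: (survey_year, x, y, amplitude).
readings = [
    (2005, 0, 0, 1.0),
    (2005, 0, 1, 2.0),
    (2012, 0, 0, 1.5),
    (2012, 0, 1, 2.0),
]


def map_reading(year, x, y, amplitude):
    """Map: key each reading by its grid position."""
    return (x, y), (year, amplitude)


def shuffle(pairs):
    """Shuffle: collect the readings from all surveys for each position."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_change(position, samples):
    """Reduce: amplitude change between the earliest and the latest survey."""
    ordered = sorted(samples)  # sorts by year first
    return position, ordered[-1][1] - ordered[0][1]


changes = dict(
    reduce_change(position, samples)
    for position, samples in shuffle(map_reading(*r) for r in readings).items()
)
```

Positions whose value in `changes` is non-zero are the ones where the two surveys disagree.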


Page 9: 7 Fun Things to do with MapReduce

Cluster Building

Why Build your own cluster?
• It’s fun
• You learn lots
• It gets you invited to parties

Physical or Virtual?

Physical – more fun, looks impressive, harder to build, maintain, use, cost of power

Virtual – performance? Easier to test, try different versions, configurations

Page 10: 7 Fun Things to do with MapReduce

Thank you

Chris [email protected]
@chillax7
www.bigdatablog.co.uk
