Introduction to Programming
Application example: Photo OCR
Problem description and pipeline
Machine Learning

The Photo OCR problem
[Image: photo containing the text "LULA Bs ANTIQUE MALL", "LULA Bs", "OPEN"]
Andrew Ng

Photo OCR pipeline
1. Text detection
2. Character segmentation
3. Character classification
[Image: example characters N, A, T]

Photo OCR pipeline
Image -> Text detection -> Character segmentation -> Character recognition

Application example: Photo OCR
Sliding windows
Machine Learning

Text detection
Pedestrian detection

Supervised learning for pedestrian detection
Features: pixels in 82x36 image patches
Positive examples
Negative examples

Sliding window detection
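
The sliding window idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the step size, the scale factors, and the function names `sliding_windows`, `detect` and `classify_patch` are hypothetical and do not come from the slides. Only the 82x36 patch size is taken from the pedestrian example. A real detector would also resize each larger patch back to 82x36 before classifying it, which is omitted here.

```python
from typing import Callable, Iterator, List, Tuple

Window = Tuple[int, int, int, int]  # (top, left, height, width)


def sliding_windows(image_h: int, image_w: int,
                    win_h: int = 82, win_w: int = 36,
                    step: int = 4,
                    scales: Tuple[float, ...] = (1.0, 1.5, 2.0)) -> Iterator[Window]:
    """Yield every window position, for each window scale, that fits in the image."""
    for scale in scales:
        h, w = round(win_h * scale), round(win_w * scale)
        if h > image_h or w > image_w:
            continue  # this scale does not fit in the image at all
        for top in range(0, image_h - h + 1, step):
            for left in range(0, image_w - w + 1, step):
                yield (top, left, h, w)


def detect(image: List[List[float]],
           classify_patch: Callable[[List[List[float]]], bool],
           **window_args) -> List[Window]:
    """Run a patch classifier (assumed to be trained already) on every window."""
    hits = []
    for top, left, h, w in sliding_windows(len(image), len(image[0]), **window_args):
        # Crop the patch; resizing to the training patch size is left out.
        patch = [row[left:left + w] for row in image[top:top + h]]
        if classify_patch(patch):
            hits.append((top, left, h, w))
    return hits
```

A smaller `step` examines more positions and localizes objects more precisely, at the cost of more classifier evaluations.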

Text detection
Positive examples
Negative examples

Text detection [David Wu]
Apply an expansion operator to the classifier output, then draw rectangles around the expanded regions.
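
A minimal sketch of the expansion step and the rectangles, assuming the classifier output has already been thresholded into a binary grid (1 = "text here"). The function names, the expansion radius and the aspect-ratio filter in `looks_like_text_line` are illustrative assumptions, not details given on the slide.

```python
from collections import deque
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right), inclusive


def expand(mask: List[List[int]], radius: int = 1) -> List[List[int]]:
    """Expansion operator: turn on every cell within `radius` of a positive cell."""
    rows, cols = len(mask), len(mask[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if mask[r][c]:
                for rr in range(max(0, r - radius), min(rows, r + radius + 1)):
                    for cc in range(max(0, c - radius), min(cols, c + radius + 1)):
                        out[rr][cc] = 1
    return out


def bounding_boxes(mask: List[List[int]]) -> List[Box]:
    """Return one bounding rectangle per 4-connected region of positive cells."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            top, left, bottom, right = r, c, r, c
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:  # breadth-first flood fill of this region
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < rows and 0 <= nx < cols and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def looks_like_text_line(box: Box, min_aspect: float = 1.5) -> bool:
    """Assumed heuristic: keep boxes that are clearly wider than they are tall."""
    top, left, bottom, right = box
    return (right - left + 1) / (bottom - top + 1) >= min_aspect
```

Expansion merges nearby fragments, so a row of separate character hits becomes one region with a single rectangle around it.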

1D sliding window for character segmentation
Positive examples
Negative examples
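
The 1D version slides a window left to right along a detected text strip and asks a classifier whether the window is centred on a gap between two characters. The sketch below is an assumption-laden illustration: `find_splits`, `cut_segments` and the toy classifier `blank_centre` are hypothetical names, and `blank_centre` merely stands in for a classifier trained on the positive and negative examples.

```python
from typing import Callable, List, Tuple

Strip = List[List[int]]  # rows of pixel values for one detected text region


def find_splits(strip: Strip, is_split: Callable[[Strip], bool],
                win_w: int = 3, step: int = 1) -> List[int]:
    """Slide a fixed-width window along the strip; return the split columns."""
    width = len(strip[0])
    splits = []
    for left in range(0, width - win_w + 1, step):
        patch = [row[left:left + win_w] for row in strip]
        if is_split(patch):
            splits.append(left + win_w // 2)  # centre column of the window
    return splits


def cut_segments(width: int, splits: List[int]) -> List[Tuple[int, int]]:
    """Turn split columns into (start, end) column ranges, end exclusive."""
    edges = [0] + sorted(splits) + [width]
    return [(a, b) for a, b in zip(edges, edges[1:]) if b > a]


def blank_centre(patch: Strip) -> bool:
    """Toy stand-in for a trained classifier: the centre column has no ink."""
    mid = len(patch[0]) // 2
    return all(row[mid] == 0 for row in patch)
```

Each resulting segment would then be passed to the character classifier.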

Application example: Photo OCR
Getting lots of data: Artificial data synthesis
Machine Learning

Character recognition
[Image: example character images N, I, A, Q, T, A]

Artificial data synthesis for photo OCR
Real data
[Image: the sample text "Abcdefg" shown five times]
[Adam Coates and Tao Wang]

Artificial data synthesis for photo OCR
Real data
Synthetic data
[Adam Coates and Tao Wang]

Synthesizing data by introducing distortions
[Adam Coates and Tao Wang]

Synthesizing data by introducing distortions: Speech recognition
Original audio:
Audio on bad cellphone connection
Noisy background: Crowd
Noisy background: Machinery
[www.pdsounds.org]

Synthesizing data by introducing distortions
Distortion introduced should be representative of the type of noise/distortions in the test set.
E.g. audio: background noise, bad cellphone connection.
Usually does not help to add purely random/meaningless noise to your data.
E.g. x_i = intensity (brightness) of pixel i; x_i := x_i + random noise
[Adam Coates and Tao Wang]
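
For the speech example, one way to picture the synthesis is to overlay each clean recording with several realistic backgrounds while keeping the original label. The following is a minimal sketch under assumptions: audio is represented as a plain list of samples, and `mix_background`, `synthesize` and the gain values are hypothetical choices, not part of the slides.

```python
from typing import List, Sequence, Tuple


def mix_background(clean: Sequence[float], background: Sequence[float],
                   gain: float = 0.3) -> List[float]:
    """Overlay a background recording on clean audio; the background loops if shorter."""
    n = len(background)
    return [sample + gain * background[i % n] for i, sample in enumerate(clean)]


def synthesize(clean: Sequence[float], label: str,
               backgrounds: Sequence[Sequence[float]],
               gains: Sequence[float] = (0.1, 0.3)) -> List[Tuple[List[float], str]]:
    """Create one new labeled example per (background, gain) pair.

    The label is unchanged: the distortion alters how the audio sounds,
    not what was said.
    """
    examples = []
    for background in backgrounds:
        for gain in gains:
            examples.append((mix_background(clean, background, gain), label))
    return examples
```

The backgrounds should be sounds expected at test time (crowd, machinery, a bad connection), in line with the point that purely random noise usually does not help.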

Discussion on getting more data
1. Make sure you have a low bias classifier before expending the effort. (Plot learning curves.) E.g. keep increasing the number of features/number of hidden units in neural network until you have a low bias classifier.
2. "How much work would it be to get 10x as much data as we currently have?"
   - Artificial data synthesis
   - Collect/label it yourself
   - Crowd source (e.g. Amazon Mechanical Turk)

Application example: Photo OCR
Ceiling analysis: What part of the pipeline to work on next
Machine Learning

Estimating the errors due to each component (ceiling analysis)
Image -> Text detection -> Character segmentation -> Character recognition
What part of the pipeline should you spend the most time trying to improve?

Component                  Accuracy
Overall system             72%
Text detection             89%
Character segmentation     90%
Character recognition      100%

Another ceiling analysis example
Face recognition from images (Artificial example)
Camera image -> Preprocess (remove background) -> Face detection -> Eyes segmentation, Nose segmentation, Mouth segmentation -> Logistic regression -> Label

Component                        Accuracy
Overall system                   85%
Preprocess (remove background)   85.1%
Face detection                   91%
Eyes segmentation                95%
Nose segmentation                96%
Mouth segmentation               97%
Logistic regression              100%
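
The two tables can be read as cumulative accuracies: each row gives the overall accuracy once that component and every earlier one is replaced by perfect (ground-truth) output. Under that reading, the potential gain from each component is the difference between consecutive rows. The helper name `ceiling_gains` is hypothetical; the numbers are the ones in the tables.

```python
from typing import List, Tuple


def ceiling_gains(baseline: float,
                  stages: List[Tuple[str, float]]) -> List[Tuple[str, float]]:
    """Return (component, accuracy gain in percentage points) for each stage.

    `stages` lists, in pipeline order, the overall accuracy measured when
    that stage and all earlier stages are made perfect.
    """
    gains = []
    previous = baseline
    for name, accuracy in stages:
        gains.append((name, round(accuracy - previous, 1)))
        previous = accuracy
    return gains


photo_ocr = ceiling_gains(72, [
    ("Text detection", 89),
    ("Character segmentation", 90),
    ("Character recognition", 100),
])

face_recognition = ceiling_gains(85, [
    ("Preprocess (remove background)", 85.1),
    ("Face detection", 91),
    ("Eyes segmentation", 95),
    ("Nose segmentation", 96),
    ("Mouth segmentation", 97),
    ("Logistic regression", 100),
])

# The component with the largest gain is the most promising one to work on.
best_ocr = max(photo_ocr, key=lambda item: item[1])
best_face = max(face_recognition, key=lambda item: item[1])
```

For Photo OCR this gives 17 points for text detection, 1 for character segmentation and 10 for character recognition, so text detection is the component most worth improving. In the face example, background removal offers only 0.1 points, while face detection offers 5.9.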