View
245
Download
2
Category
Preview:
Citation preview
Section 5- Computer Vision Libraries
CS231ASection:ComputerVisionLibrariesOverview
AmirSadeghianFeb2018
1Amir Sadeghian
Section 5- Computer Vision Libraries
Overview
• Opencv• DeepLearningFrameworks
• Caffe• Torch• Tensorflow
Amir Sadeghian 2
Section 5- Computer Vision Libraries
OtherCVlibraries
• Vlfeat: AnOpensource librarywithpopular computervisionalgorithmsspecializing inimageunderstanding andlocalfeaturesextractionandmatching.
• scikit-learn:AnopensourcePythonlibrary thatimplementsarangeofmachine learning, preprocessing, cross-validationandvisualizationalgorithms.
• PCL: Astandalone, largescale,openprojectfor2D/3Dimageandpointcloudprocessing.
• SLAMframeworks(bundler, visualsfm,meshlab):Applicationsfor3Dreconstructionusingstructurefrommotion (SFM).
• Librariesforspecifictasks:e.g.trackinglibraries,Detectionlibraries…
Amir Sadeghian 3
Section 5- Computer Vision Libraries
IntroductiontoOpenCV
• Opensourcecomputervisionandmachinelearninglibrary• Containsimplementationsofalargenumberofvisionalgorithms• WrittennativelyinC++,alsohasC,Python,Java,andMATLABinterfaces
• SupportsWindows,Linux,MacOSX,Android,andiOS
Amir Sadeghian 5
Section 5- Computer Vision Libraries
Installation
• Downloadfromhttp://opencv.org andcompilefromsource• Windows:RunexecutabledownloadedfromOpenCV website• MacOSX:InstallthroughMacPorts,easy_install,…• Linux:Installthroughthepackagemanager(e.g.yum,apt)butmakesuretheversionissufficientlyup-to-dateforyourneeds
Amir Sadeghian 6
Section 5- Computer Vision Libraries
BasicOpenCV Structures
• Point,Point2f- 2DPoint
• Size - 2Dsizestructure
• Rect - 2Drectangleobject
• RotatedRect - Rect objectwithangle
• Mat - imageobject
• …
Amir Sadeghian 7
Section 5- Computer Vision Libraries
Point
Amir Sadeghian 8
• 2DPointObject- int x,y;
• SampleFunctions
- Point.dot(<Point>)- computesdotproduct
- Point.inside(<Rect>)- returnstrueifpointisinside
Mathoperators,youmayuse-Pointoperator+
-Pointoperator+=
-Pointoperator-
-Pointoperator-=
-Pointoperator*
-Pointoperator*=
-booloperator==
-booloperator!=
-doublenorm
Section 5- Computer Vision Libraries
Size
• 2DSizeStructure- int width,height;
• Functions- Size.area()- returns(width*height)
Amir Sadeghian 9
Rect
• 2DRectangleStructure- int x,y,width,height;
• Functions
- Rect.tl()- returntopleftpoint
- Rect.br()- returnbottomrightpoint
Section 5- Computer Vision Libraries
cv::Mat
• TheprimarydatastructureinOpenCV istheMatobject.Itstoresimagesandtheircomponents.
• Mainitems
• rows,cols- lengthandwidth(int)
• channels- 1:grayscale,3:BGR
• depth:CV_<bit_depth>C<#chan>
• Seethemanualsformoreinformation
Amir Sadeghian 10
Section 5- Computer Vision Libraries
cv::Mat- Functions
• Mat.at<datatype>(row,col)- returnspointertoalocationintheimage• Mat.channels()- returnsthenumberofchannels• Mat.clone()- returnsadeepcopyoftheimage• Mat.create(rows,cols,TYPE)- re-allocatesnewmemorytomatrix• Mat.cross(<Mat>)- computescrossproductoftwomatrices• Mat.depth()- returnsbit-depthofamatrix• Mat.dot(<Mat>)- computesthedotproductoftwomatrices
Amir Sadeghian 11
Section 5- Computer Vision Libraries
Pixeltypes
• PixelTypes showshowtheimageisrepresentedindata
• BGR- Thedefaultcolorofimread().Normal3channelcolor
• HSV- Hueiscolor,Saturationisamount,Valueislightness.3channels
• GRAYSCALE- Grayvalues,Singlechannel• OpenCV requiresthatimagesbeinBGRorGrayscaleinordertobeshownorsaved.Otherwise,undesirableeffectsmayappear.
Amir Sadeghian 12
Section 5- Computer Vision Libraries
ImageNormalizationandThresholding
• Normalizationremapsarangeofpixelvaluestoanotherrangeofpixelvalues
• voidnormalize(InputArray src,OutputArray dst,...)
• OpenCV providesageneralpurposemethodforthresholdinganimage
• doublethreshold(InputArray src,OutputArraydst,double thresh,double maxval,int type)
• Specifythresholding schemespecifiedbythetypevariable
Amir Sadeghian 14
Section 5- Computer Vision Libraries
ImageSmoothing
• Reducesthesharpnessofedgesandsmoothsoutdetailsinanimage
• OpenCV implementsseveralofthemostcommonlyusedmethods
• voidGaussianBlur(InputArray src,OutputArray dst,...)
• voidmedianBlur(InputArray src,OutputArray dst,...)
• Otherfunctionsincludegenericconvolution,separableconvolution,dilate,anderode.
Amir Sadeghian 15
Section 5- Computer Vision Libraries
ImageSmoothing:SampleImage
Amir Sadeghian 17
Original GaussianBlur MedianBlur
Section 5- Computer Vision Libraries
EdgeDetection
• OpenCV implementsanumberofoperatorstohelpdetectedgesinanimage
• SobelOperator• void cv::Sobel(image in, image out, CV_DEPTH, dx, dy);
• Scharr Operator• void cv::Scharr(image in, image out, CV_DEPTH, dx, dy);
• LaplacianOperator• void cv::Laplacian( image in, image out, CV_DEPTH);
• OpenCV alsoimplementsmulti-stageedgedetectionalgorithmssuchasCannyedgedetection
• Tip:Ifyourimageisnoisy,thenedgedetectionwilloftenexaggeratethenoise
• Sometimessmoothingtheimagebeforerunningedgedetectiongivesbetterresults
Amir Sadeghian 18
Section 5- Computer Vision Libraries
FaceDetection:Viola-Jones
} Robustandfast} IntroducedbyPaulViolaandMichaelJones
} http://research.microsoft.com/~viola/Pubs/Detect/violaJones_CVPR2001.pdf
} Haar-likeFeatures
Amir Sadeghian 21
Section 5- Computer Vision Libraries
AndManyMore…
• ObjectTrackingusingOpenCV• HandwrittenDigitsClassification:AnOpenCV(C++/Python)Tutorial
• EyeDetectorusingOpenCV• ImageRecognitionandObjectDetection• HeadPoseEstimationusingOpenCV• ConfiguringQtforOpenCV• …
Amir Sadeghian 22
Section 5- Computer Vision Libraries
DeepLearningFrameworks
• Caffe• Torch/PyTorch
• NYU• scientificcomputingframeworkinLua• supportedbyFacebook
• TensorFlow• Google• Python
• Theano/Pylearn2• U.Montreal• Python• symboliccomputationandautomaticdifferentiation
• MatConvNet• OxfordU.• DeepLearninginMATLAB
Amir Sadeghian 24
Section 5- Computer Vision Libraries
FrameworkComparison
• Morealikethandifferent• Allexpressdeepmodels• Allareopen-source(contributions differ)• Mostincludescripting forhackingandprototyping
• Nostrictwinners,experimentandchoosetheframeworkthatbestfitsyourwork
Amir Sadeghian 25
Section 5- Computer Vision Libraries
Caffe:Overview
• WhatisCaffe?• Training/Finetuning asimplemodel• DeepdiveintoCaffe!
Amir Sadeghian 26
Section 5- Computer Vision Libraries
WhatisCaffe?
• A deeplearningframework
• Openframework,models,andworkedexamplesfordeeplearning
• 4000+citations,250+contributors,11,000+forks
• Focushasbeenvision,butbranchingout:sequences, reinforcementlearning,speech+text
Amir Sadeghian 27
Section 5- Computer Vision Libraries
Caffe
• PureC++/CUDAarchitecturefordeeplearning• commandline,Python,MATLABinterfaces
• Fast,well-testedcode• Tools,referencemodels,demos,andrecipes• SwitchbetweenCPUandGPU
• Caffe::set_mode(Caffe::GPU);
Amir Sadeghian 28
Section 5- Computer Vision Libraries
Installation
• http://caffe.berkeleyvision.org/installation.html •• CUDA,OPENCV• BLAS(BasicLinearAlgebraSubprograms):operationslikematrixmultiplication,matrixaddition,bothimplementationforCPU(cBLAS)andGPU(cuBLAS).providedbyMKL(INTEL),ATLAS,openBLAS,etc.
• Boost:ac++ library.>Usesomeofitsmathfunctionsandshared_pointer.
• glog,gflags providelogging&commandlineutilities.>Essentialfordebugging.
• leveldb,lmdb:databaseio foryourprogram.>Needtoknowthisforpreparingyourowndata.
• protobuf:anefficientandflexiblewaytodefinedatastructure.>Needtoknowthisfordefiningnewlayers.
Amir Sadeghian 29
Section 5- Computer Vision Libraries
Caffe Tutorial
• Nets,Layers,andBlobs:theanatomyofaCaffe model.• Forward/Backward:theessentialcomputationsoflayeredcompositionalmodels.
• Loss:thetasktobelearnedisdefinedbytheloss.• Solver:thesolvercoordinatesmodeloptimization.• Interfaces:commandline,Python,andMATLABCaffe.• Data:howtocaffeinatedataformodelinput.
http:/caffe.berkeleyvision.org/tutorial/
Amir Sadeghian 30
Section 5- Computer Vision Libraries
Caffe
• Blob:StorageandCommunicationofData• DatablobsareNxCxHxW
• Net:Containsallthelayersinthenetworks• Performs forward/backwardpassthrough theentirenetwork
• Solver:Usedtosettraining/testingparameters• Numberof iterations,backpropagationmethod,etc..
Amir Sadeghian 31
Section 5- Computer Vision Libraries
Training:Step1
• Createalenet_train.prototxt
• DataLayers
• OperationalLayers
• LossLayers
Amir Sadeghian 32
Section 5- Computer Vision Libraries
Training:Step2
• Createalenet_solver.prototxt
Amir Sadeghian 36
Section 5- Computer Vision Libraries
Training:Step3
• #trainLeNet• caffe train-solverexamples/mnist/lenet_solver.prototxt
• #trainonGPU2• caffe train-solverexamples/mnist/lenet_solver.prototxt -gpu 2
• #resumetrainingfromthehalf-waypointsnapshot• caffe train-solverexamples/mnist/lenet_solver.prototxt -snapshotexamples/mnist/lenet_iter_5000.solverstate
Amir Sadeghian 39
Section 5- Computer Vision Libraries
PyCaffe (TraininginPython)
• Addcaffe pythondirectorytopathandimportcaffe
Amir Sadeghian 42
Section 5- Computer Vision Libraries
OpenModelCollection
• TheCaffe ModelZoo• opencollectionofdeepmodelstoshareinnovation
• VGGILSVRC14• Network-in-Network
• MITPlacesscenerecognitionmodelinthezoo• Helpreproduce research• Bundled toolsforloadingandpublishingmodels
• ShareYourModels!
Amir Sadeghian 47
Section 5- Computer Vision Libraries
ReferenceModels
Caffe offersthe
• Modeldefinitions
• Optimizationsettings
• Pre-trainedweightssoyoucanstartrightaway.
Amir Sadeghian 48
Alexnet: Imagenet Classification
Section 5- Computer Vision Libraries
WhyFine-tuning?
• Finetuning isaprocesstotakeanetworkmodelthathasalreadybeentrainedforagiventask,andretrainittomakeitperformasecondsimilartask.
• Why?• Morerobustoptimization• good initializationhelps• Needslessdata• Fasterlearning
Amir Sadeghian 49
Section 5- Computer Vision Libraries
Fine-tuningTricks
• Learnthelastlayerfirst• Caffe layershavelocallearning rates:blobs_lr• Freezeallbutthelastlayerforfastoptimizationandavoidingearlydivergence.
• Stopifgoodenough, orkeepfine-tuning
• Reducethelearningrate• Dropthesolverlearning rateby10x,100x–• Preservetheinitialization frompre-trainingandavoidthrashing
Amir Sadeghian 50
Section 5- Computer Vision Libraries
Trainingtips
• Beforerunningfinal/longtraining• Makesureyoucanoverfit onasmalltrainingset• Makesureyour lossdecreasesoverfirstseveraliterations• Otherwiseadjustparameteruntilitdoes,especiallylearning rate
• Separatetrain/val/testdata
Amir Sadeghian 51
Recommended