Weakly Supervised Correspondence Estimation Zhiwei Jia

Page 1: Weakly Supervised Correspondence Estimation

Weakly Supervised Correspondence Estimation

Zhiwei Jia

Page 2: Weakly Supervised Correspondence Estimation

Weakly Supervised Learning

• 1. Not enough labeled data
• 2. Transfer learning
• 3. Helps to increase the performance of supervised learning
• 4. Provides good insights into solving certain learning problems

Page 3: Weakly Supervised Correspondence Estimation

Learning to See by Moving

• 1. Biological background
• 2. Why use egomotion information as supervision?
  • Availability of “labeled data”
• 3. Overview:
  • Egomotion information as a form of self-supervision
• 4. Main result:
  • The learned visual representation compared favourably to that learnt using direct supervision on the tasks of scene recognition, object recognition, visual odometry and keypoint matching

Page 4: Weakly Supervised Correspondence Estimation

Main Approach

• 1. Correlating visual stimuli with egomotion:
  • Egomotion <==> camera motion
  • Predicting the camera transformation from consecutive pairs of images.
• 2. Visual correspondence can help with visual tasks in general:
  • Pretraining for other tasks

Page 5: Weakly Supervised Correspondence Estimation

Architecture Overview

1. Siamese-style CNN

2. Learning by minimizing the prediction error of egomotion information

3. TCNN only used in the training process
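
As a rough illustration of the Siamese idea, the sketch below applies one shared set of base weights to both images, concatenates the two feature vectors, and lets a separate top layer score the transformation classes. The single linear layers, the function names `base_features` and `predict_transform_scores`, and the weight shapes are all assumptions made for brevity; the slides do not give the actual layer configuration.

```python
def base_features(image, base_weights):
    """Toy stand-in for the base network: one linear layer followed by ReLU.

    `image` is a flat list of pixel values; `base_weights` is a list of rows.
    """
    return [max(0.0, sum(w * x for w, x in zip(row, image)))
            for row in base_weights]


def predict_transform_scores(image1, image2, base_weights, top_weights):
    """Score each transformation class for an image pair.

    Both images go through the SAME base weights (the Siamese part);
    only the concatenated features are seen by the top layer.
    """
    f1 = base_features(image1, base_weights)
    f2 = base_features(image2, base_weights)  # shared weights, second stream
    joint = f1 + f2                           # list concatenation
    return [sum(w * x for w, x in zip(row, joint)) for row in top_weights]


# Hypothetical 2-pixel images and hand-picked weights, for illustration only.
base_w = [[1, 0], [0, 1]]
top_w = [[1, 1, -1, -1], [-1, -1, 1, 1]]
scores = predict_transform_scores([1, 2], [3, 1], base_w, top_w)
```

After training, only `base_features` would be reused for downstream tasks, which matches the slide's note that the top network is needed only during training.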

Page 6: Weakly Supervised Correspondence Estimation

Compared to SFA Training

• xt1, xt2 refer to feature representations of frames observed at times t1, t2 respectively.
• D is a parameterized measure of distance.
• m is a predefined margin, and
• T is a predefined time threshold
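
The equation on this slide did not survive extraction. A contrastive form that is consistent with the symbols listed above can be sketched as follows: frames close in time are pulled together, and frames far apart in time are pushed at least a margin m apart. The Euclidean choice for D, the default values, and the function names are assumptions for illustration.

```python
import math


def euclidean(u, v):
    """Assumed distance measure D: plain Euclidean distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def sfa_contrastive_loss(x_t1, x_t2, t1, t2, m=1.0, T=1, D=euclidean):
    """Contrastive slow-feature loss for one pair of frame features.

    If the frames are at most T steps apart, the loss is their distance.
    Otherwise the loss is the hinge max(0, m - distance).
    """
    d = D(x_t1, x_t2)
    if abs(t1 - t2) <= T:
        return d
    return max(0.0, m - d)


near = sfa_contrastive_loss([0.0, 0.0], [3.0, 4.0], t1=0, t2=1)
far_close = sfa_contrastive_loss([0.0, 0.0], [0.3, 0.4], t1=0, t2=10)
far_apart = sfa_contrastive_loss([0.0, 0.0], [3.0, 4.0], t1=0, t2=10)
```

Unlike egomotion prediction, this loss uses only temporal proximity as its supervisory signal.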

Page 7: Weakly Supervised Correspondence Estimation

Training of this Network

• 1. Transformation parameters as ground truth.
• 2. Training data:
  • MNIST
  • KITTI
  • SF dataset
• 3. Trained networks used for further visual tasks
  • KITTI-Net
  • SF-Net

Page 8: Weakly Supervised Correspondence Estimation

Samples from SF/KITTI Dataset

Page 9: Weakly Supervised Correspondence Estimation

On MNIST

• Translation:
  • integer value in the range [-3, 3]
  • X, Y axes
  • binned into seven uniformly spaced bins
• Rotation:
  • lies within the range [-30°, 30°]
  • Z axis
  • binned into bins of size 3° each, resulting in a total of 20 bins
• SFA:
  • translation in the range [-1, 1], rotation within [-3°, 3°]
• 5 million image pairs
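
Binning turns transformation prediction into classification. A minimal sketch of the binning described above is given here; the function names, the zero-based class indices, and the choice to place the upper boundary (30°) in the last bin are assumptions.

```python
def translation_bin(t):
    """Map an integer translation in [-3, 3] to one of seven classes (0..6)."""
    if not -3 <= t <= 3:
        raise ValueError("translation outside [-3, 3]")
    return t + 3


def rotation_bin(theta_deg, lo=-30.0, hi=30.0, bin_size=3.0):
    """Map a rotation in [lo, hi] degrees to one of (hi - lo) / bin_size classes."""
    if not lo <= theta_deg <= hi:
        raise ValueError("rotation outside range")
    n_bins = int(round((hi - lo) / bin_size))   # 20 for the defaults
    idx = int((theta_deg - lo) // bin_size)
    return min(idx, n_bins - 1)                 # keep hi itself in the last bin
```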

Page 10: Weakly Supervised Correspondence Estimation

On KITTI

• 1. Camera direction as Z axis
• 2. Image plane as XY plane.
• 3. Translations along the Z/X axes
• 4. Rotation about the Y axis (Euler angle)
• 5. Individually binned into 20 uniformly spaced bins each.
• The training image pairs were taken from frames that were at most ±7 frames apart

Page 11: Weakly Supervised Correspondence Estimation

SF Dataset

• Constructed using Google Street View (≈130K images).
• Camera transformation along all six dimensions of transformation.
• Rotations between [-30°, 30°] were binned into 10 uniformly spaced bins, and two extra bins were used for rotations larger and smaller than this range.
• The three translations were individually binned into 10 uniformly spaced bins each.
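
The rotation binning with two overflow bins can be sketched as below. The ordering of the classes (underflow first, overflow last) and the function name are assumptions; the slides only state that 10 uniform bins plus two extra bins were used.

```python
def rotation_bin_with_overflow(theta_deg, lo=-30.0, hi=30.0, n_bins=10):
    """Return a class index in 0..n_bins+1.

    0            -> rotation smaller than lo
    1..n_bins    -> uniformly spaced bins covering [lo, hi]
    n_bins + 1   -> rotation larger than hi
    """
    if theta_deg < lo:
        return 0
    if theta_deg > hi:
        return n_bins + 1
    width = (hi - lo) / n_bins                  # 6 degrees for the defaults
    idx = int((theta_deg - lo) // width)
    return 1 + min(idx, n_bins - 1)             # hi itself stays in the last regular bin
```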

Page 12: Weakly Supervised Correspondence Estimation

Evaluation on MNIST

1. The learned Base-CNN served as pretraining for a ConvNet on MNIST classification

2. Small amount of data.

3. The learned feature representation increases the performance of classification tasks.

Page 13: Weakly Supervised Correspondence Estimation

Evaluation of KITTI-/SF-Net

• Measured in terms of performance on the following visual tasks:
  • 1. Scene classification
  • 2. Large-scale image classification
  • 3. Keypoint matching
  • 4. Visual odometry
    • estimating the camera transformation between image pairs.

Page 14: Weakly Supervised Correspondence Estimation

Scene Classification on SUN dataset

• 397 indoor/outdoor scene categories
• provides 10 standard splits of 5 and 20 training images per class and a standard test set of 50 images per class
• Compare KITTI-Net/SF-Net with:
  • 1. AlexNet pretrained on ImageNet
  • 2. GIST
  • 3. SPM

Page 15: Weakly Supervised Correspondence Estimation
Page 16: Weakly Supervised Correspondence Estimation

1. KITTI-Net outperforms SF-Net and is comparable to AlexNet-20K.

2. Layer 4 and 5 features of KITTI-Net outperform layer 4 and 5 features of KITTI-SFA-Net.

Page 17: Weakly Supervised Correspondence Estimation

Large Scale Image Classification

• All layers of KITTI-Net, KITTI-SFA-Net and AlexNet-Scratch (i.e. CNN with random weight initialization) were finetuned for image classification.
• Comparison of AlexNet using pretrained KITTI-Net vs. AlexNet trained from scratch

Page 18: Weakly Supervised Correspondence Estimation

Keypoint Matching (intra-class)

• PASCAL-VOC 2012 dataset with ground-truth object bounding boxes (GT-BBOX)
• 1. Compute feature maps from layers 2-5
• 2. Matching score for all pairs of GT-BBOX in the same object class.
  • the features associated with keypoints in the first image were used to predict the location of the same keypoints in the second image.
• 3. Error measurement of matching:
  • The normalized pixel distance between the actual and predicted keypoint locations
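
One simple way to realize steps 2 and 3 is nearest-neighbour matching in feature space followed by a normalized distance. In the sketch below, the nearest-neighbour rule, the normalization by the bounding-box diagonal, and all names are assumptions; the slides do not specify the matching rule or the normalizer.

```python
import math


def predict_keypoint(query_feature, features_b):
    """Return the location in image b whose feature is closest to the query.

    `features_b` maps (x, y) locations to feature vectors.
    """
    return min(features_b,
               key=lambda loc: math.dist(query_feature, features_b[loc]))


def normalized_error(predicted, actual, bbox_width, bbox_height):
    """Pixel distance divided by the bounding-box diagonal (assumed normalizer)."""
    return math.dist(predicted, actual) / math.hypot(bbox_width, bbox_height)


# Hypothetical features for two candidate locations in the second image.
features_b = {(0, 0): [1.0, 0.0], (10, 5): [0.0, 1.0]}
pred = predict_keypoint([0.1, 0.9], features_b)
err = normalized_error((3, 4), (0, 0), bbox_width=30, bbox_height=40)
```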

Page 19: Weakly Supervised Correspondence Estimation

Details of Keypoint Matching

Page 20: Weakly Supervised Correspondence Estimation
Page 21: Weakly Supervised Correspondence Estimation

Comparison Result

1. KITTI-Net-20K was superior to AlexNet-20K and AlexNet-100K and inferior only to AlexNet-1M.

2. AlexNet-Rand surprisingly performed better than AlexNet-20K.

Page 22: Weakly Supervised Correspondence Estimation
Page 23: Weakly Supervised Correspondence Estimation

Visual Odometry

• All layers of KITTI-Net and AlexNet-1M were finetuned for 25K iterations using the training set of the SF dataset on the task of visual odometry.

Page 24: Weakly Supervised Correspondence Estimation

Weakness, Limitation & Extension

• 1. Impressive performance on large datasets?
• 2. Instead of pretraining, combine learning with other visual tasks in an online form.

Page 25: Weakly Supervised Correspondence Estimation

Learning Dense Correspondence via 3D-guided Cycle Consistency

• 1. Background:
  • Works in intra-class correspondence estimation via deep learning vs. works for computing correspondence across different object/scene instances.
  • Lack of data for dense correspondence
• 2. Naïve solution: train on rendered 3D models

Page 26: Weakly Supervised Correspondence Estimation

Main Approach

• Utilize the concept of cycle consistency of correspondence flows:
  • the composition of flow fields for any circular path through the image set should have a zero combined flow.
• “meta-supervision”
• End-to-end trained deep network for dense cross-instance correspondence that uses the widely available 3D CAD models.

Page 27: Weakly Supervised Correspondence Estimation

Consistency in the Sense of Flow Field

• Predict a dense flow (or correspondence) field F(a,b): R^2 → R^2 between pairs of images a and b.
• The flow field F(a,b)(p) = (qx − px, qy − py) computes the relative offset from each point p in image a to a corresponding point q in image b.
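
To make cycle consistency concrete, the sketch below composes flow fields along a closed path and checks that the combined flow is zero. It assumes the convention that a point p in image a lands at q = p + F(a,b)(p) in image b; the constant flows and all function names are hypothetical.

```python
def apply_flow(flow, p):
    """Move point p by the offset the flow field assigns to it."""
    dx, dy = flow(p)
    return (p[0] + dx, p[1] + dy)


def compose(*flows):
    """Compose flow fields in order; the result is itself a flow field."""
    def combined(p):
        q = p
        for flow in flows:
            q = apply_flow(flow, q)
        return (q[0] - p[0], q[1] - p[1])   # net offset from the start point
    return combined


# Hypothetical constant flows around the cycle a -> b -> c -> a.
flow_ab = lambda p: (2, 1)
flow_bc = lambda p: (-3, 4)
flow_ca = lambda p: (1, -5)

cycle = compose(flow_ab, flow_bc, flow_ca)
net_offset = cycle((5, 5))
```

A consistent set of flows returns every point to where it started, so `net_offset` is zero here; a nonzero value would be the training signal.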

Page 28: Weakly Supervised Correspondence Estimation

Consistency in the Sense of Matchability

• Why matchability?
• A matchability map M(a,b): R^2 → [0,1] predicting if a correspondence exists, M(a,b)(p) = 1, or not, M(a,b)(p) = 0.

Page 29: Weakly Supervised Correspondence Estimation

Consistency as Supervision

• While we do not know what the ground truth is, we know how it should behave.
• Specifically, for each pair of real training images r1 and r2, find a 3D CAD model of the same category, and render two synthetic views s1 and s2 in similar viewpoints as r1 and r2, respectively.
• Each training quartet <s1, s2, r1, r2>

Page 30: Weakly Supervised Correspondence Estimation

Aim to learn 2D image correspondences that potentially capture the 3D semantics.

Page 31: Weakly Supervised Correspondence Estimation

Compared to Autoencoder

• Reconstruction vs. zero net flow
• Sparsity constraints vs. using the constructed flow field from s1 to s2 as guidance

Page 32: Weakly Supervised Correspondence Estimation

Loss function for Learning Dense Correspondence

Page 33: Weakly Supervised Correspondence Estimation

Loss function for Learning Dense Matchability

Combined loss function is:

Page 34: Weakly Supervised Correspondence Estimation

A Small Issue for Learning Matchability

• Multiplicative composition.
• Could fix M(s1,r1) = 1 and M(r2,s2) = 1, and only train the CNN to infer M(r1,r2)
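
The issue can be illustrated with a small sketch: when matchability along the path s1 → r1 → r2 → s2 is composed by multiplication, many different factor combinations give the same product, so the individual maps are not pinned down. Fixing the two outer factors to 1 leaves only M(r1,r2) to explain the target. The per-pixel scalar view, the cross-entropy penalty, and the function names are assumptions made for illustration.

```python
import math


def composed_matchability(m_s1r1, m_r1r2, m_r2s2):
    """Multiplicative composition of per-pixel matchability along the path."""
    return m_s1r1 * m_r1r2 * m_r2s2


def cross_entropy(target, predicted, eps=1e-7):
    """Binary cross-entropy for one pixel (assumed penalty)."""
    predicted = min(max(predicted, eps), 1.0 - eps)
    return -(target * math.log(predicted)
             + (1.0 - target) * math.log(1.0 - predicted))


# Two different factorizations produce the same composed value.
ambiguous_a = composed_matchability(0.5, 0.5, 1.0)
ambiguous_b = composed_matchability(1.0, 0.25, 1.0)

# With the outer factors fixed to 1, the product equals M(r1, r2) directly.
fixed = composed_matchability(1.0, 0.8, 1.0)
loss = cross_entropy(target=1.0, predicted=fixed)
```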

Page 35: Weakly Supervised Correspondence Estimation

End-to-end Differentiable by Continuous Approximation

• Bilinear interpolation over the CNN predictions on discrete pixel locations.
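
Composing flows requires reading a prediction at a non-integer location, which bilinear interpolation makes possible while staying differentiable in the sampled coordinates (almost everywhere). A minimal sketch on a scalar grid follows; clamping at the border and the function name are assumptions.

```python
import math


def bilinear_sample(grid, x, y):
    """Sample a 2-D grid (list of rows) at a continuous location (x, y).

    x indexes columns and y indexes rows; out-of-range coordinates are clamped.
    """
    height, width = len(grid), len(grid[0])
    x = min(max(x, 0.0), width - 1)
    y = min(max(y, 0.0), height - 1)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    ax, ay = x - x0, y - y0                      # fractional parts
    top = (1 - ax) * grid[y0][x0] + ax * grid[y0][x1]
    bottom = (1 - ax) * grid[y1][x0] + ax * grid[y1][x1]
    return (1 - ay) * top + ay * bottom


grid = [[0.0, 1.0],
        [2.0, 3.0]]
center = bilinear_sample(grid, 0.5, 0.5)
corner = bilinear_sample(grid, 1.0, 0.0)
```

In practice the same weights would be applied to each channel of a predicted flow field.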

Page 36: Weakly Supervised Correspondence Estimation

Overall Architecture

Page 37: Weakly Supervised Correspondence Estimation

Training Process

• Data:
  • The 3D CAD models used for constructing training quartets come from the ShapeNet database, while the real images are from the PASCAL3D+ dataset.
• 1. First initialize the network (partly) to mimic SIFT flow:
  • minimize the Euclidean loss between the network prediction and the SIFT flow output on the sampled pair.
• 2. Then fine-tune the whole network end-to-end to minimize the combined consistency loss

Page 38: Weakly Supervised Correspondence Estimation
Page 39: Weakly Supervised Correspondence Estimation

Evaluation of Learning Performance

• 1. Feature visualization
• 2. Keypoint transfer
• 3. Matchability prediction
• 4. Shape-to-image segmentation transfer

Page 40: Weakly Supervised Correspondence Estimation

Feature Visualization

• Extract conv-9 features from the entire set of car instances in the PASCAL3D+ dataset, and embed them in 2-D with the t-SNE algorithm.
• The result indicates that viewpoint is an important signal for similarity in the learned network

Page 41: Weakly Supervised Correspondence Estimation
Page 42: Weakly Supervised Correspondence Estimation

Keypoint Transfer

• Compute the percentage of correct keypoint transfer (PCK) over all image pairs as the metric for measuring the performance.
• Evaluate the quality of the correspondence output using the keypoint transfer task on the 12 categories from PASCAL3D+
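
PCK is commonly computed by counting a transferred keypoint as correct when it lands within a fraction alpha of the image size from the ground truth. The sketch below follows that common definition; the threshold alpha = 0.1, the use of max(height, width) as the scale, and the function name are assumptions rather than values taken from the slides.

```python
import math


def pck(predicted, ground_truth, height, width, alpha=0.1):
    """Fraction of keypoints transferred to within alpha * max(height, width)."""
    if not ground_truth or len(predicted) != len(ground_truth):
        raise ValueError("need matching, non-empty keypoint lists")
    threshold = alpha * max(height, width)
    correct = sum(1 for p, g in zip(predicted, ground_truth)
                  if math.dist(p, g) <= threshold)
    return correct / len(ground_truth)


# Hypothetical keypoints on a 100 x 100 image (threshold = 10 pixels).
predicted = [(10, 10), (50, 50), (90, 20)]
ground_truth = [(12, 11), (80, 50), (90, 25)]
score = pck(predicted, ground_truth, height=100, width=100)
```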

Page 43: Weakly Supervised Correspondence Estimation
Page 44: Weakly Supervised Correspondence Estimation

Matchability Prediction:

• Evaluate the performance of matchability prediction using the PASCAL-Part dataset, which provides human-annotated labeling.

Page 45: Weakly Supervised Correspondence Estimation

Shape-to-image Segmentation Transfer:

• Shape-to-image correspondence for transferring per-pixel labels (e.g. surface normals, segmentation masks, etc.) from shapes to real images.
• 1. Construct a shape database of about 200 shapes per category, with each shape being rendered in 8 canonical viewpoints.
• 2. Given a query real image, apply the network to predict the correspondence between the query and each rendered view of the same category, and warp the query image according to the predicted flow field.
• 3. Compare the HOG Euclidean distance between the warped query and the rendered views, and retrieve the rendered view with minimum distance.
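
Step 3 amounts to a nearest-neighbour search over rendered views. The sketch below assumes the descriptors have already been computed (one for each rendered view, and one for the query after warping it toward that view); the dictionary layout, the view identifiers, and the function name are hypothetical, and the HOG extraction and warping themselves are not shown.

```python
import math


def retrieve_view(warped_query_descriptors, view_descriptors):
    """Return the id of the rendered view closest to its warped query.

    Both arguments map a view id to a descriptor vector of equal length.
    """
    return min(view_descriptors,
               key=lambda view_id: math.dist(warped_query_descriptors[view_id],
                                             view_descriptors[view_id]))


# Hypothetical 2-D descriptors for two rendered views.
view_descriptors = {"shape1_view0": [0.0, 0.0], "shape1_view1": [1.0, 1.0]}
warped_query_descriptors = {"shape1_view0": [3.0, 4.0],
                            "shape1_view1": [1.0, 1.5]}
best = retrieve_view(warped_query_descriptors, view_descriptors)
```

The per-pixel labels of the retrieved view would then be transferred to the query through the predicted correspondence.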

Page 46: Weakly Supervised Correspondence Estimation
Page 47: Weakly Supervised Correspondence Estimation

Limitation & Extension

• vs. SIFT Flow
• Other visual problems solved by using 3D models