Using a Pretrained Model in Caffe


Process Overview:

1. If we provide the weights argument to the caffe train command, the pretrained weights will be loaded into our model, matching layers by name.

2. If the number of classes being trained is different from that of the pretrained model's data, then the last layer needs to be changed to reflect the correct number of classes. Because this layer is being changed, you should also change its name. Since there is no layer with that name in bvlc_reference_caffenet, that layer will begin training with random weights (see the layer sketch after this overview).

3. We will also decrease the overall learning rate base_lr in the solver prototxt, but boost the lr_mult on the newly introduced layer. The idea is to have the rest of the model change very slowly with the new data, but let the new layer learn fast.

4. Set stepsize in the solver to a lower value than if we were training from scratch, since we're effectively far along in training and therefore want the learning rate to go down faster.

5. Note that we could also entirely prevent fine-tuning of all layers other than fc8_mydata by setting their lr_mult to 0.
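For concreteness, here is a minimal sketch of what the renamed final layer could look like in train_val.prototxt, assuming a CaffeNet-style network. The layer name fc8_mydata, the bottom blob fc7, the filler settings, and num_output: 20 are placeholder assumptions, not values from these notes; substitute your own layer name and class count.

    # Sketch only: renamed final InnerProduct layer with boosted lr_mult.
    # "fc8_mydata", bottom "fc7", and num_output: 20 are placeholder assumptions.
    layer {
      name: "fc8_mydata"            # new name, so no pretrained weights are copied in
      type: "InnerProduct"
      bottom: "fc7"
      top: "fc8_mydata"
      param { lr_mult: 10  decay_mult: 1 }   # weights learn at 10x the base_lr
      param { lr_mult: 20  decay_mult: 0 }   # biases learn at 20x the base_lr
      inner_product_param {
        num_output: 20              # number of classes in the new dataset
        weight_filler { type: "gaussian" std: 0.01 }
        bias_filler { type: "constant" value: 0 }
      }
    }

Conversely, setting lr_mult: 0 in both param blocks of a layer freezes that layer so it is not fine-tuned at all.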

Steps:

1. Change to the caffe directory.
   1. cd ./caffe

2. Make sure you have your training and validation images ready.

3. Download the pretrained model you want to use.
   1. ex: ./scripts/download_model_binary.py models/bvlc_reference_caffenet

4. Set up the layers in the network file (i.e. models/myModel/train_val.prototxt).
   1. Adjust the first layer for both the train and test runs to reflect the dataset.
      1. Change the type to reflect your image data type (http://caffe.berkeleyvision.org/tutorial/layers.html):
         • Database -> Data
         • In-Memory -> MemoryData
         • HDF5 Input -> HDF5Data
         • HDF5 Output -> HDF5Output
         • Images -> ImageData

      2. Fill out the parameters for your data type. Examples (type -> parameter variable name -> { parameters }):
         • Data -> data_param -> { source: ..., backend: LMDB }
         • ImageData -> image_data_param -> { source: ..., new_width: 256 }
            ◦ train.txt is a file containing the location of each image followed by its class, with a new image on each line.
         (A fuller ImageData layer sketch appears at the end of these notes.)
   2. Adjust the last layer to reflect the data set.

      1. Rename the last layer and the "top" field to reflect this data set.
         1. i.e. name "fc8" becomes "fc8_mydata". Same for the "top" field.

      2. Set lr_mult higher than for the other layers, because this layer is starting from random weights while the others are already trained. Try 10, 20.

      3. Set num_output to equal the number of classes in this data set.


      4. In any layers that follow (such as accuracy) that use the output of the last FC layer, change the bottom layer to match the new name of the last FC layer.
         1. i.e. bottom "fc8" becomes "fc8_mydata"

    ". )ote that on any layer, lr3mult can be set to + to disable any fine*tuning of that layer.(. !et up solver file ie modelsmy-odelsolver.protot$t0. 4 sample is below:

    net: "models/myModel/train_val.prototxt"test_iter: 100 # The number of iterations for each test net.test_interval: 1000 # The number of iterations beteen to testin! phases# lr for finetunin! should be loer than hen startin! from scratchbase_lr: 0.001 # be!in trainin! at a learnin! rate of 0.001lr_policy: "step" # learnin! rate policy: drop the learnin! rate in "steps"  # by a factor of !amma every stepsie iterations!amma: 0.1 # drop the learnin! rate by a factor of 10  # $i.e.% multiply it by a factor of !amma & 0.1'# stepsie should also be loer% as e(re closer to bein! donestepsie: )0000 # drop the learnin! rate every )0* iterationsdisplay: )0max_iter: 100000 # train for 100* iterations totalmomentum: 0.+ei!ht_decay: 0.000,snapshot: 10000 # save a snapshot of the ei!hts every 10* iterations

      # this snapshot can be used to resume trainin! from this point.snapshot_prefix: "models/myModel/myModel"# uncomment the folloin! to default to - mode solvin!solver_mode: - 

    >. )ow you have what you need to train../build/tools/caffe train solver models/location_of_solver/solver.prototxt ei!htsmodels/location_of_pretrained_model/pretrained_model.caffemodel

If you need to pick up training from a snapshot, run:

   # resume training from the half-way point snapshot
   ./build/tools/caffe train -solver models/location_of_solver/solver.prototxt \
       -snapshot models/myModel/myModel_iter_10000.solverstate
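Finally, as a companion to step 4, here is a minimal sketch of an ImageData input layer for the TRAIN phase (the sketch referenced above). The file paths, batch_size, crop_size, and mean_file are placeholder assumptions for a CaffeNet-style setup, not values from these notes:

    # Sketch only: ImageData layer for the TRAIN phase; paths and sizes are placeholders.
    layer {
      name: "data"
      type: "ImageData"
      top: "data"
      top: "label"
      include { phase: TRAIN }
      transform_param {
        mirror: true
        crop_size: 227            # CaffeNet input size
        mean_file: "data/ilsvrc12/imagenet_mean.binaryproto"
      }
      image_data_param {
        source: "data/train.txt"  # each line: path/to/image.jpg followed by its class
        batch_size: 50
        new_height: 256           # resize images before cropping
        new_width: 256
      }
    }

A matching TEST-phase layer would typically point at a val.txt list and set mirror: false.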