rnd.azoft.com
Tutorial: Using Convolutional Neural Networks to Detect Object Keypoints
In this tutorial, you will learn how to obtain a convolutional neural network model that detects object keypoints in images. We've chosen license plates as the object to detect, but the tutorial can also serve as a guide for detecting other objects.
Overview
About the project
Choosing images for neural network training
Labeling the keypoints
Data augmentation
Packing a dataset in HDF5
Training a convolutional neural network
Conclusion
Related links
1. Choosing images for neural network training
The training dataset should contain somewhere between a few hundred and a few thousand original (non-augmented) images. The more, the better.
2. Labeling the keypoints
If the keypoints in the original images are not labeled yet, you need to label them. This means saving the keypoint coordinates in a *.txt or *.csv file. Each keypoint has two values: one for the horizontal axis and one for the vertical axis.
Remark: If you label several keypoints, label them in the same sequence in every image. For example, when labeling a license plate, you might mark the upper-left corner of the plate as the first point, the upper-right corner as the second, the lower-left corner as the third, and the lower-right corner as the fourth. Keep the same sequence from then on.
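As a minimal sketch of the labeling step, the snippet below writes one CSV row per image with the four corners always in the same order. The column layout, the corner names, the file name, and the pixel values are assumptions made for illustration; the tutorial itself only requires a *.txt or *.csv file with a consistent keypoint sequence.

```python
import csv
import io

# Fixed keypoint order for a license plate (names are illustrative).
KEYPOINT_ORDER = ("upper_left", "upper_right", "lower_left", "lower_right")


def label_row(image_name, corners):
    """Flatten a dict of named corners into one row: name, x1, y1, ..., x4, y4."""
    row = [image_name]
    for name in KEYPOINT_ORDER:
        x, y = corners[name]
        row.extend([x, y])
    return row


def write_labels(rows, stream):
    """Write a header line plus one row per image to any text stream."""
    writer = csv.writer(stream)
    header = ["image"]
    for name in KEYPOINT_ORDER:
        header.extend([name + "_x", name + "_y"])
    writer.writerow(header)
    writer.writerows(rows)


# Hypothetical pixel coordinates for one image, given in arbitrary order.
corners = {
    "lower_right": (212, 118),
    "upper_left": (40, 60),
    "lower_left": (42, 120),
    "upper_right": (210, 58),
}
buffer = io.StringIO()
write_labels([label_row("plate_0001.jpg", corners)], buffer)
print(buffer.getvalue())
```

Because the order lives in one constant, every row comes out in the same sequence regardless of the order in which the corners were clicked or stored.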
3. Data augmentation
For effective training you need a dataset of several thousand to tens of thousands of images. If the initial dataset is too small, you should augment the images.
Remark: Before starting augmentation, split your dataset into training and control (validation) parts. This guarantees that images produced by augmenting one picture do not end up in both the training and the control part. If you skip this step, you will hardly be able to detect overfitting of the model.
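The split-then-augment order can be sketched as follows. The 20% control fraction, the file names, and the `augmented_names` placeholder (which only derives names instead of producing real images) are assumptions for illustration.

```python
import random


def split_originals(image_names, control_fraction=0.2, seed=0):
    """Shuffle the original image names and split them into (training, control)."""
    names = list(image_names)
    random.Random(seed).shuffle(names)
    n_control = max(1, int(len(names) * control_fraction))
    return names[n_control:], names[:n_control]


def augmented_names(original_name, copies):
    """Placeholder for real augmentation: derive names of the augmented copies."""
    stem, _, ext = original_name.rpartition(".")
    return ["{}_aug{}.{}".format(stem, i, ext) for i in range(copies)]


originals = ["plate_{:04d}.jpg".format(i) for i in range(10)]

# Split first, so every original belongs to exactly one part...
train_originals, control_originals = split_originals(originals)

# ...and only then augment each part separately.
train_set = [n for name in train_originals for n in augmented_names(name, 3)]
control_set = [n for name in control_originals for n in augmented_names(name, 3)]
```

With this order, no augmented copy in the control part shares its source picture with anything in the training part.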
Here are the transformations that can be implemented for the augmentation:
Rotations relative to the center
Perspective distortion
Resize
Shifts
Salt-and-pepper noise
Blurring and sharpening
Erosion and dilation
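Geometric transformations from this list move the keypoints, so the labels have to be transformed together with the image, while pixel-level ones such as noise leave the labels unchanged. The sketch below illustrates both cases in plain Python: rotating keypoints around the image center and adding salt-and-pepper noise to a grayscale image stored as a list of rows. The function names and sample values are hypothetical, and a real pipeline would rotate the image itself with an image library using the same angle convention.

```python
import math
import random


def rotate_keypoints(points, angle_deg, width, height):
    """Rotate (x, y) points around the image center by angle_deg.

    With the y axis pointing down, a positive angle appears clockwise on screen.
    """
    cx, cy = width / 2.0, height / 2.0
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    rotated = []
    for x, y in points:
        dx, dy = x - cx, y - cy
        rotated.append((cx + dx * cos_a - dy * sin_a,
                        cy + dx * sin_a + dy * cos_a))
    return rotated


def salt_and_pepper(image, amount, rng):
    """Return a noisy copy of a grayscale image (rows of values 0..255).

    Each pixel becomes 0 or 255 with total probability `amount`.
    """
    noisy = []
    for row in image:
        new_row = []
        for value in row:
            r = rng.random()
            if r < amount / 2:
                new_row.append(0)        # pepper
            elif r < amount:
                new_row.append(255)      # salt
            else:
                new_row.append(value)    # unchanged pixel
        noisy.append(new_row)
    return noisy


# Hypothetical plate corners in a 256 x 128 image, rotated by 10 degrees.
corners = [(40, 60), (210, 58), (42, 120), (212, 118)]
rotated_corners = rotate_keypoints(corners, 10, 256, 128)

# Noise changes pixels only, so the same corners remain valid for this image.
noisy_image = salt_and_pepper([[128] * 4 for _ in range(2)], 0.1, random.Random(0))
```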
4. Packing a dataset in HDF5
To use the dataset with the Caffe framework, you need to pack it into the HDF5 file format. Normalize pixel values to the range from 0 to 1 and coordinate values to the range from -1 to 1.
Remark: If you applied augmentation to the initial images, the images in the HDF5 file have to be stored in random order.
After packing the dataset in HDF5, you should check the resulting file using the HDF5 Viewer utility. Pixel values have to be between 0 and 1, coordinates have to be between -1 and 1, and the images must not be distorted.
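A minimal sketch of the packing step is shown below. Several details are assumptions rather than something the tutorial specifies: 8-bit single-channel images, the mapping of coordinates to -1..1 via 2 * x / width - 1, and all helper names. The dataset names "data" and "label" are chosen to match the `top` names of a Caffe HDF5Data layer, whose `source` is a text file listing HDF5 file paths.

```python
import random


def normalize_pixels(pixels):
    """Scale 8-bit pixel values (0..255) to the range 0..1."""
    return [[value / 255.0 for value in row] for row in pixels]


def normalize_keypoints(points, width, height):
    """Map pixel coordinates to -1..1 and return a flat list [x1, y1, x2, y2, ...]."""
    flat = []
    for x, y in points:
        flat.extend([2.0 * x / width - 1.0, 2.0 * y / height - 1.0])
    return flat


def shuffled(samples, seed=0):
    """Return the samples in random order, leaving the input untouched."""
    samples = list(samples)
    random.Random(seed).shuffle(samples)
    return samples


def write_hdf5(samples, h5_path, list_path):
    """Write (image, label) pairs to an HDF5 file and to the list file Caffe reads.

    Requires numpy and h5py; they are imported here so that the helpers
    above stay usable without them.
    """
    import h5py
    import numpy as np

    images = np.array([[img] for img, _ in samples], dtype=np.float32)  # N x 1 x H x W
    labels = np.array([lab for _, lab in samples], dtype=np.float32)    # N x 8
    with h5py.File(h5_path, "w") as f:
        f.create_dataset("data", data=images)
        f.create_dataset("label", data=labels)
    with open(list_path, "w") as f:
        f.write(h5_path + "\n")


# Hypothetical 2 x 2 image with four labeled corners.
sample = (normalize_pixels([[0, 255], [128, 64]]),
          normalize_keypoints([(0, 0), (2, 0), (0, 2), (2, 2)], 2, 2))
dataset = shuffled([sample] * 4)
```

Calling `write_hdf5(dataset, "train.h5", "train.txt")` (hypothetical paths) would then produce the file to inspect for the value ranges described above.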
5. Training the convolutional neural network
We recommend starting the training of the neural network with the Adam optimization method.
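For readers unfamiliar with Adam, the update it applies to each weight can be sketched in a few lines. This is a generic illustration of the method on a toy one-parameter problem, not Caffe's implementation; the hyperparameter values are the commonly used defaults and are an assumption, since the tutorial does not state which settings were used.

```python
import math


def adam_step(param, grad, state, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update to a scalar parameter and return the new value.

    `state` holds the running moment estimates and the step counter.
    """
    state["t"] += 1
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad          # first moment
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad * grad   # second moment
    m_hat = state["m"] / (1 - beta1 ** state["t"])                # bias correction
    v_hat = state["v"] / (1 - beta2 ** state["t"])
    return param - lr * m_hat / (math.sqrt(v_hat) + eps)


# Toy problem: minimize f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w = 0.0
state = {"t": 0, "m": 0.0, "v": 0.0}
for _ in range(5000):
    w = adam_step(w, 2 * (w - 3), state, lr=0.01)
```

The step size is scaled per parameter by the running gradient statistics, which is why Adam tends to need less learning-rate tuning on a first attempt.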
The input layer should look like this:
layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  hdf5_data_param {
    source: "/home/user/caffe/examples/regression/regression_train.txt"
    batch_size: 256
  }
}
The number of outputs of the output layer has to equal the number of coordinate values, for example 8 for four keypoints with two coordinates each. For the loss layer, it is better to use EuclideanLoss.
layer {
  name: "ipout"
  type: "InnerProduct"
  bottom: "ip01"
  top: "ipout"
  inner_product_param {
    num_output: 8
    weight_filler {
      type: "msra"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "loss"
  type: "EuclideanLoss"
  bottom: "ipout"
  bottom: "label"
  top: "loss"
}
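To make the loss concrete, the sketch below computes what Caffe's EuclideanLoss layer reports: the sum of squared differences between predictions and labels, divided by twice the batch size. It also converts the 8 network outputs back to pixel coordinates, assuming the labels were normalized as 2 * x / width - 1; that mapping and the function names are assumptions for illustration.

```python
def euclidean_loss(predictions, labels):
    """Sum of squared differences over the batch, divided by 2 * batch size."""
    total = 0.0
    for pred, label in zip(predictions, labels):
        total += sum((p - t) ** 2 for p, t in zip(pred, label))
    return total / (2.0 * len(predictions))


def to_pixels(output, width, height):
    """Convert normalized outputs (-1..1) back to (x, y) pixel pairs."""
    points = []
    for i in range(0, len(output), 2):
        x = (output[i] + 1.0) * width / 2.0
        y = (output[i + 1] + 1.0) * height / 2.0
        points.append((x, y))
    return points


# Hypothetical batch of one sample with 8 outputs (four keypoints).
label = [-0.5, -0.5, 0.5, -0.5, -0.5, 0.5, 0.5, 0.5]
prediction = [-0.4, -0.5, 0.5, -0.5, -0.5, 0.5, 0.5, 0.5]
loss = euclidean_loss([prediction], [label])
corners = to_pixels(prediction, 256, 128)
```

A loss close to zero on the control part, together with plausible pixel coordinates after conversion, is a quick sanity check that the label order and normalization match between packing and inference.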
Conclusion
It is hard to train an accurate and fast neural network in just a few attempts. We made about 20 trials before we got an acceptable result. If you have any questions regarding the idea, the experiment implementation, or the code, we'll be glad to answer them in the comments below.
Related links
We used these works as the basis of the experiment:
Using convolutional neural nets to detect facial keypoints tutorial
Caffe-regression examples
Kaggle face keypoint detection at GitHub
Read the detailed Convolutional Neural Networks for Object Detection project overview