Auro Tripathy
*Random Forests are registered trademarks of Leo Breiman and Adele Cutler
Attributions, code and dataset location (1 minute)
Overview of the scheme (2 minutes)
Refresher on Random Forest and R Support (2 minutes)
Results and continuing work (1 minute)
Q&A (1 minute and later)
ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5651638
R code available here; my contribution http://www.shatterline.com/SkinDetection.html
Data set available here http://www.feeval.org/Data-sets/Skin_Colors.html
Permission to use may be required
All training sets are organized as two-movie sequences
1. A movie sequence of frames in color
2. A corresponding sequence of frames in binary black-and-white, the ground truth
Extract individual frames in JPEG format using ffmpeg, a transcoding tool
ffmpeg -i 14.avi -f image2 -ss 1.000 -vframes 1 14_500offset10s.jpeg
ffmpeg -i 14_gt_500frames.avi -f image2 -ss 1.000 -vframes 1 14_gt_500frames_offset10s.jpeg
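To pull many frame pairs rather than one, the two ffmpeg calls can be generated per offset. The following is a minimal sketch in R that only builds the argument vectors; the helper name `frame_pair_args` and the output file-name pattern are hypothetical, and the flags are limited to the ones used in the commands above.

```r
# Hypothetical helper: build ffmpeg arguments for one color/ground-truth
# frame pair taken at the same offset (in seconds).
frame_pair_args <- function(movie, truth_movie, offset) {
  ss <- sprintf("%.3f", offset)
  stem <- function(path) sub("\\.avi$", "", basename(path))
  make <- function(path) {
    c("-i", path, "-f", "image2", "-ss", ss, "-vframes", "1",
      sprintf("%s_offset%ss.jpeg", stem(path), ss))  # assumed naming pattern
  }
  list(color = make(movie), truth = make(truth_movie))
}

pair_args <- frame_pair_args("14.avi", "14_gt_500frames.avi", 1)
# To actually extract the frames (requires ffmpeg on the PATH):
# system2("ffmpeg", pair_args$color)
# system2("ffmpeg", pair_args$truth)
```

Using the same offset for both movies keeps each color frame aligned with its ground-truth frame.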
Ground-truth Image
The original authors used 8991 such image pairs, each an image along with its manually annotated pixel-level ground truth.
Skin-color classification/segmentation
Uses the Improved Hue, Luminance, Saturation (IHLS) color space
RGB values are transformed to HLS
HLS values are used as feature vectors
Original authors also experimented with Bayesian network, Multilayer Perceptron, SVM, AdaBoost (Adaptive Boosting), Naive Bayes, RBF network
“Random Forest shows the best performance in terms of accuracy, precision and recall”
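For reference, the three metrics named in the quote can be computed from a predicted pixel mask and its ground truth as sketched below. The function name `evaluate_mask` and the five-pixel example are made up for illustration; they are not results from the paper or from this work.

```r
# Hypothetical helper: pixel-level accuracy, precision and recall,
# treating "skin" as the positive class.
evaluate_mask <- function(pred, truth, positive = "skin") {
  tp <- sum(pred == positive & truth == positive)  # skin predicted as skin
  fp <- sum(pred == positive & truth != positive)  # non-skin predicted as skin
  fn <- sum(pred != positive & truth == positive)  # skin missed
  tn <- sum(pred != positive & truth != positive)  # non-skin predicted as non-skin
  c(accuracy  = (tp + tn) / length(truth),
    precision = tp / (tp + fp),
    recall    = tp / (tp + fn))
}

# Toy labels, one entry per pixel (illustrative only)
truth <- c("skin", "skin", "nonskin", "nonskin", "skin")
pred  <- c("skin", "nonskin", "nonskin", "skin", "skin")
metrics <- evaluate_mask(pred, truth)
```

Because skin pixels are usually a minority of the image, precision and recall say more than accuracy alone.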
“The most important property of this [IHLS] space is a ‘well-behaved’ saturation coordinate which, in contrast to commonly used ones, always has a small numerical value for near-achromatic colours, and is completely independent of the brightness function.”
A 3D-polar Coordinate Colour Representation Suitable for Image Analysis, Allan Hanbury and Jean Serra
MATLAB routines implementing the RGB-to-IHLS and IHLS-to-RGB are available at http://www.prip.tuwien.ac.at/~hanbury.
R routines implementing the RGB-to-IHLS and IHLS-to-RGB are available at http://www.shatterline.com/SkinDetection.html
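To make the transform concrete, here is a minimal sketch of the RGB-to-IHLS direction, following the formulas as I understand them from the Hanbury and Serra paper. It is not the routine distributed at either URL above; the function name `rgb_to_ihls`, the [0, 1] input range, and the choice to return NA for the hue of achromatic pixels are assumptions made for this example.

```r
# Hypothetical helper: convert RGB values in [0, 1] to IHLS.
# Returns hue in degrees, luminance and saturation in [0, 1].
rgb_to_ihls <- function(r, g, b) {
  # Luminance: weighted sum of the channels
  y <- 0.2126 * r + 0.7152 * g + 0.0722 * b
  # Saturation: max minus min, which does not depend on the luminance
  s <- pmax(r, g, b) - pmin(r, g, b)
  # Hue: angle around the achromatic axis. The denominator equals
  # sqrt(r^2 + g^2 + b^2 - r*g - r*b - g*b), written so it cannot go negative.
  denom <- sqrt(0.5 * ((r - g)^2 + (r - b)^2 + (g - b)^2))
  ratio <- ifelse(denom > 0, (r - 0.5 * g - 0.5 * b) / denom, 1)
  ratio <- pmin(pmax(ratio, -1), 1)   # guard against rounding outside [-1, 1]
  h <- acos(ratio) * 180 / pi
  h <- ifelse(b > g, 360 - h, h)
  h[denom == 0] <- NA                 # hue is undefined for grey pixels (assumption)
  data.frame(hue = h, luminance = y, saturation = s)
}

# Pure red, green, blue and a mid grey
ihls <- rgb_to_ihls(c(1, 0, 0, 0.5), c(0, 1, 0, 0.5), c(0, 0, 1, 0.5))
```

The grey pixel gets saturation 0, which is the "well-behaved" property the quote describes.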
Package ‘ReadImages’
This package provides functions for reading JPEG and PNG files
Package ‘randomForest’
Breiman and Cutler’s Classification and regression based on a forest of trees using random inputs.
Package ‘foreach’
Support for the foreach looping construct
Stretch goal to use %dopar%
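library(foreach)
library(randomForest)

set.seed(371)
# Grow a small forest per training frame; merge them with randomForest's combine()
skin.rf <- foreach(i = seq_len(nrow(training.frames.list)), .combine = combine,
                   .packages = 'randomForest') %do%
{
  # Read the image
  # Transform from RGB to IHLS
  # Read the corresponding ground-truth image
  # Data is ready (table.data, table.truth); now apply random forest,
  # not using the formula interface
  randomForest(table.data, y = table.truth, mtry = 2, importance = FALSE,
               proximity = FALSE, ntree = 10, do.trace = 100)
}
# Classify the pixels of a test frame with the combined forest
table.pred.truth <- predict(skin.rf, test.table.data)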
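The %dopar% stretch goal could look roughly like the sketch below. It assumes the doParallel package as the parallel backend, which the slides do not name, and it uses synthetic per-frame tables (`make_frame` is a hypothetical stand-in) instead of the skin dataset, so it shows only the mechanics of growing forests in parallel and merging them.

```r
library(foreach)
library(doParallel)   # assumed backend; not named in the slides
library(randomForest)

set.seed(371)
# Hypothetical stand-in for one frame's pixel table and ground truth
make_frame <- function(n = 200) {
  x <- data.frame(hue = runif(n, 0, 360),
                  luminance = runif(n),
                  saturation = runif(n))
  y <- factor(ifelse(x$hue < 50 & x$saturation > 0.2, "skin", "nonskin"),
              levels = c("nonskin", "skin"))
  list(x = x, y = y)
}
frames <- lapply(1:4, function(i) make_frame())

cl <- makeCluster(2)          # two local worker processes
registerDoParallel(cl)
skin.rf <- foreach(f = frames, .combine = randomForest::combine,
                   .packages = "randomForest") %dopar% {
  randomForest(f$x, y = f$y, mtry = 2, ntree = 10)
}
stopCluster(cl)

# The merged forest predicts like any other randomForest object
pred <- predict(skin.rf, frames[[1]]$x)
```

Seeding inside the workers is not handled here, so the individual trees will differ between runs.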
Have lots of decision-tree learners
Each learner’s training set is sampled independently – with replacement
Add more randomness – at each node of the tree, the splitting attribute is selected from a randomly chosen sample of attributes
Each decision tree votes for a classification
Forest chooses the classification with the most votes
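The three ingredients above (bootstrap sampling, random attribute selection, majority voting) can be sketched in base R as follows. The sizes and the helper name `majority_vote` are illustrative assumptions; a real forest would repeat the first two steps for every tree and every node.

```r
set.seed(1)
n <- 10     # number of training rows (illustrative)
p <- 3      # number of attributes, e.g. hue, luminance, saturation
mtry <- 2   # attributes tried at each node

# Each tree trains on n rows drawn with replacement from the training set
boot_idx <- sample(n, n, replace = TRUE)

# At each node, only a random subset of attributes is considered for the split
split_candidates <- sample(p, mtry)

# Hypothetical helper: the forest's answer is the most frequent tree vote.
# Ties go to the label that sorts first alphabetically.
majority_vote <- function(votes) names(which.max(table(votes)))

winner <- majority_vote(c("skin", "nonskin", "skin"))
```

Rows that a bootstrap sample leaves out are the "out-of-bag" rows that randomForest uses for its built-in error estimate.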
Quick training phase
Trees can grow in parallel
Trees have attractive computing properties
For example… Computation cost of making a binary tree is low: O(N log N)
Cost of using a tree is even lower – O(Log N)
N is the number of data points
Applies to balanced binary trees; decision trees often not balanced
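A quick worked number for the lookup cost, assuming a perfectly balanced binary tree over one million data points (the value of N is made up for illustration):

```r
n <- 1e6                             # number of data points (illustrative)
balanced_depth <- ceiling(log2(n))   # levels to descend in a balanced tree
balanced_depth                       # 20
```

An unbalanced decision tree can be deeper than this, so the O(log N) figure is a best case rather than a guarantee.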
My Results? OK, but incomplete due to very small training set. Need parallel computing cluster
ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5651638