Classification of human activities using data from smart phone’s gyroscope and accelerometer, Mathematical Pattern Recognition Project Report, USC Spring 2013


EE559 PROJECT REPORT TITLE: HUMAN ACTIVITY RECOGNITION SYSTEM

NAME: MAISAM SHAHID WASTI

EMAIL: WASTI@USC.EDU

DATE: 06/05/2013


TABLE OF CONTENTS

I. ABSTRACT

II. DATASETS

III. BASELINE PERFORMANCE (PART 1)

IV. BASELINE PERFORMANCE IN REDUCED DIMENSIONS (PART 2)
    A. PRINCIPAL COMPONENT ANALYSIS (PCA)
    B. FISHER’S LINEAR DISCRIMINANT (FISHERM)
    C. COMPARISON WITH BASELINE AND COMMENTS:

V. EVALUATION OF DIFFERENT CLASSIFIERS (PART 3)
    A. METHODOLOGY TO COME UP WITH THE BEST CLASSIFIER:
    B. DISTRIBUTION FREE CLASSIFIER: LINEAR PERCEPTRON (MATLAB: PERLC)
        a) Dimension optimization
        b) Parameters optimization
        c) Cross-validation performance on training set
        d) Performance in 2-dimensional space and comparison with baseline
    C. STATISTICAL CLASSIFIERS: K-NEAREST NEIGHBORS VS PARZEN WINDOWS
        a) Comparison of KNN and Parzen Windows (Matlab: knnc and parzenc)
        b) Parzen windows: Parameter Optimization
        c) Performance in 2-dimensional space and comparison with baseline

VI. THE BEST CLASSIFIER: PARZEN WINDOWS
    A. SPECIFICATIONS
    B. FEATURE SET USED
    C. FIVE-FOLD CROSS-VALIDATION RESULTS AVERAGED OVER 5 TIMES
    D. CONFUSION MATRIX AND ACCURACY RATE ON LABELED TESTING SET
    E. LABELING OF UNLABELED TESTING DATA

VII. REFERENCES

VIII. APPENDIX
    [A – 1] CLASSIFICATION PERFORMANCE OF LINEAR AND QUADRATIC BAYES CLASSIFIER IN DIFFERENT DIMENSIONS (PCA)
    [A – 2] CLASSIFICATION PERFORMANCE OF LINEAR AND QUADRATIC BAYES CLASSIFIER IN DIFFERENT DIMENSIONS (FLD)
    [A – 3] SCATTER PLOTS IN FEATURE SPACE FOR 2-DIMENSIONS FOR LINEAR BAYES CLASSIFIER (FLD USED TO REDUCE DIMENSIONS)
    [A – 4] LINEAR PERCEPTRON DIMENSION OPTIMIZATION
    [A – 5] LINEAR PERCEPTRON PARAMETERS OPTIMIZATION
    [A – 6] LINEAR PERCEPTRON SCATTER PLOT IN 2-DIMENSIONS
    [A – 7] PARZEN WINDOW SCATTER PLOT IN 2-DIMENSIONS
    [A – 8] INSTRUCTION ON RUNNING THE MATLAB CODE


I. ABSTRACT

This report describes a project on classifying human activities from the gyroscope and accelerometer data of a smartphone worn by a person. Labeled data, giving the activity performed together with the feature values extracted from the sensors, were used to design the classifier. Several distribution-free and statistical classifiers were evaluated with different techniques for feature selection and dimensionality reduction. The best performer was the Parzen window classifier, with a 96.57% accuracy rate on the labeled testing data. The procedure followed for evaluating, training and optimizing the classifiers is presented in this report.


II. DATASETS

The following datasets were provided for designing the best-performing classifier for the task.

Dataset        Type       # Samples  # Features  Purpose                         In Matlab
Training Data  Labeled    7352       561         To train the classifier         'TrainSet'
Testing Data   Labeled    1547       561         To test the classifier          'TestSet'
Testing Data   Unlabeled  1400       561         To classify the unlabeled data  'UtestSet'

III. BASELINE PERFORMANCE (PART 1)

Two classifiers, Linear Bayes (ldc) and Quadratic Bayes (qdc), were used to set up a baseline for comparison. 5-fold cross-validation was repeated 5 times on the training data set to obtain the average classification accuracy and its standard deviation.

5-fold cross-validation averaged over 5 repetitions

                         Linear Bayes  Quadratic Bayes
Classification Accuracy  17.13 %       61.43 %
Standard Deviation       0.0002        0.0019
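The repeated cross-validation procedure can be sketched in a few lines of Python. This is an illustration only, not the Matlab code used for the report: the function names, the toy nearest-mean classifier, the toy data and the use of the population standard deviation across repetitions are all assumptions made for the example.

```python
import random
import statistics


def repeated_kfold_accuracy(samples, labels, train_and_predict, k=5, repeats=5, seed=0):
    """Return (mean, std) of the accuracy over `repeats` runs of k-fold cross-validation."""
    rng = random.Random(seed)
    n = len(samples)
    run_accuracies = []
    for _ in range(repeats):
        order = list(range(n))
        rng.shuffle(order)                       # new random partition for every repetition
        folds = [order[i::k] for i in range(k)]  # k disjoint folds covering all samples
        correct = 0
        for fold in folds:
            held_out = set(fold)
            train_idx = [i for i in order if i not in held_out]
            predicted = train_and_predict(
                [samples[i] for i in train_idx],
                [labels[i] for i in train_idx],
                [samples[i] for i in fold],
            )
            correct += sum(p == labels[i] for p, i in zip(predicted, fold))
        run_accuracies.append(correct / n)
    return statistics.mean(run_accuracies), statistics.pstdev(run_accuracies)


def nearest_mean(train_x, train_y, test_x):
    """Toy stand-in classifier: assign each 1-D test point to the closest class mean."""
    means = {}
    for c in set(train_y):
        values = [x for x, y in zip(train_x, train_y) if y == c]
        means[c] = sum(values) / len(values)
    return [min(means, key=lambda c: abs(x - means[c])) for x in test_x]


# Two well-separated 1-D classes (made-up data), so every fold is classified correctly.
xs = [0.0, 0.1, 0.2, 0.3, 0.4, 5.0, 5.1, 5.2, 5.3, 5.4]
ys = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2]
mean_acc, std_acc = repeated_kfold_accuracy(xs, ys, nearest_mean)
print(mean_acc, std_acc)
```

Any classifier can be plugged in through `train_and_predict`, which is why the same loop can serve for comparing several classifiers on equal terms.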

These baseline classifiers were then tested on the labeled Testing set to get the accuracy and confusion matrix, as shown below.

For Testing labeled set  Linear Bayes  Quadratic Bayes
Percent Accuracy         17.58 %       58.82 %


Confusion Matrix for Linear Bayes (rows: true labels, columns: estimated labels)

True label     1    2    3    4    5    6  Total
1            270    0    0    0    0    0    270
2            251    0    0    0    0    0    251
3            199    0    2    0    0    0    201
4            259    0    0    0    0    0    259
5            281    0    0    0    0    0    281
6            274    0    0   10    1    0    285
Total       1534    0    2   10    1    0   1547
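The percent accuracy follows directly from a confusion matrix as the diagonal sum divided by the total number of samples. The short Python sketch below (the function name is chosen for this example) applies that to the Linear Bayes matrix above and reproduces the reported figure.

```python
def accuracy_from_confusion(matrix):
    """Fraction of samples on the diagonal (true label equals estimated label)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Linear Bayes confusion matrix from the table above (rows: true, columns: estimated).
linear_bayes = [
    [270, 0, 0, 0, 0, 0],
    [251, 0, 0, 0, 0, 0],
    [199, 0, 2, 0, 0, 0],
    [259, 0, 0, 0, 0, 0],
    [281, 0, 0, 0, 0, 0],
    [274, 0, 0, 10, 1, 0],
]
print(round(100 * accuracy_from_confusion(linear_bayes), 2))  # 17.58
```

Only 272 of the 1547 samples lie on the diagonal, almost all of them from class 1.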

Confusion Matrix for Quadratic Bayes (rows: true labels, columns: estimated labels)

True label     1    2    3    4    5    6  Total
1            259    1   10    0    0    0    270
2            169   82    0    0    0    0    251
3            125   21   55    0    0    0    201
4             13    0    0  238    5    3    259
5             23    0    1  211   46    0    281
6             16    0    0   39    0  230    285
Total        605  104   66  488   51  233   1547

It can be observed that Quadratic Bayes performs better than Linear Bayes. Quadratic Bayes estimates a separate covariance matrix for each class, which gives the decision boundary more degrees of freedom. That extra flexibility is not guaranteed to help on every feature space, but it clearly does here. The confusion matrix for Quadratic Bayes is also much closer to a diagonal matrix, whereas Linear Bayes assigns almost every sample to class 1.


IV. BASELINE PERFORMANCE IN REDUCED DIMENSIONS (PART 2)

Two dimensionality reduction techniques, Principal Component Analysis and Fisher’s Linear Discriminant Analysis, were tried to measure the impact of the number of dimensions on the performance of the classifiers.

A. PRINCIPAL COMPONENT ANALYSIS (PCA)

PCA is an unsupervised method that maps the features into a lower-dimensional space without using the class labels. The performance of both the Linear Bayes and Quadratic Bayes classifiers was tested in the different dimensions given by PCA, as plotted in Appendix A-1.
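To illustrate what PCA computes, the Python sketch below finds the first principal component of a small made-up 2-D dataset by power iteration on the sample covariance matrix. It is a teaching sketch under stated assumptions (function name, toy data, fixed iteration count, all-ones starting vector) and not the routine used for the report, which would return as many components as requested.

```python
def first_principal_component(points, iterations=200):
    """Unit vector along the direction of largest variance, found by power iteration."""
    n, d = len(points), len(points[0])
    mean = [sum(p[j] for p in points) / n for j in range(d)]
    centered = [[p[j] - mean[j] for j in range(d)] for p in points]
    # Sample covariance matrix of the centered data.
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d  # assumed start; must not be orthogonal to the leading eigenvector
    for _ in range(iterations):
        v = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(c * c for c in v) ** 0.5
        v = [c / norm for c in v]
    return v


# Made-up points lying exactly on the line y = x: all variance is along (1, 1).
line_points = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0)]
direction = first_principal_component(line_points)
print(direction)
```

Note that the class labels never enter the computation, which is the sense in which PCA is unsupervised.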

B. FISHER’S LINEAR DISCRIMINANT (FISHERM)

FLD is a supervised method that maps the features into a lower-dimensional space using the information in the class labels. The number of dimensions FLD can produce is less than the number of classes, so with six activity classes at most 5 dimensions are available. Both the Linear and Quadratic Bayes classifiers were used to see the impact of this dimensionality reduction on classification accuracy, as plotted in Appendix A-2.
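The idea behind FLD is easiest to see in the two-class case, where the discriminant direction is w = Sw^-1 (m_a - m_b), with Sw the within-class scatter matrix and m_a, m_b the class means. The Python sketch below implements only this two-class, 2-D special case on made-up data. The multi-class mapping used in the report is more general, and the function name and data here are assumptions for illustration.

```python
def fisher_direction_2d(class_a, class_b):
    """Two-class Fisher direction w = Sw^-1 (mean_a - mean_b) for 2-D points."""
    def mean(points):
        n = len(points)
        return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

    def scatter(points, m):
        sxx = sum((p[0] - m[0]) ** 2 for p in points)
        sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
        syy = sum((p[1] - m[1]) ** 2 for p in points)
        return sxx, sxy, syy

    ma, mb = mean(class_a), mean(class_b)
    a, b = scatter(class_a, ma), scatter(class_b, mb)
    sxx, sxy, syy = a[0] + b[0], a[1] + b[1], a[2] + b[2]  # within-class scatter Sw
    det = sxx * syy - sxy * sxy
    dx, dy = ma[0] - mb[0], ma[1] - mb[1]
    # Multiply (dx, dy) by the inverse of the symmetric 2x2 matrix Sw.
    return ((syy * dx - sxy * dy) / det, (-sxy * dx + sxx * dy) / det)


# Made-up, elongated classes that overlap along x but separate along the Fisher direction.
class_a = [(1.0, 2.0), (2.0, 3.0), (3.0, 3.0), (4.0, 5.0)]
class_b = [(4.0, 1.0), (5.0, 2.0), (6.0, 2.0), (7.0, 4.0)]
w = fisher_direction_2d(class_a, class_b)
proj_a = [w[0] * x + w[1] * y for x, y in class_a]
proj_b = [w[0] * x + w[1] * y for x, y in class_b]
print(min(proj_a) > max(proj_b))  # True: the 1-D projections do not overlap
```

Unlike PCA, the direction depends on the labels: it is chosen to pull the class means apart relative to the spread within each class.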

Since the maximum classification accuracy of the two methods did not differ much, I preferred Fisher’s Linear Discriminant for the rest of my analysis because it delivers good classification results in far fewer dimensions (5 or fewer), whereas PCA peaks at 61 dimensions for Quadratic Bayes and 270 dimensions for Linear Bayes.

Finally, I chose the Linear Bayes classifier in the 2 dimensions given by Fisher’s Linear Discriminant to produce scatter plots of the Training and Testing data with the decision boundary. The plots are shown in Appendix A-3.


C. COMPARISON WITH BASELINE AND COMMENTS:

The classification accuracy of both the Linear and Quadratic Bayes classifiers increased significantly when they were used in the lower-dimensional space given by FLD. This happens because Fisher’s Linear Discriminant finds the linear combinations of features that best characterize and separate the classes [1]. Hence the classes are better separated for these classifiers in the resulting mapped feature space.

Classification Accuracy Comparison

#Dimensions  Linear Bayes  Quadratic Bayes
561          17.58%        58.82%
5 (FLD)      96%           96.66%

V. EVALUATION OF DIFFERENT CLASSIFIERS (PART 3)

A. METHODOLOGY TO COME UP WITH THE BEST CLASSIFIER:

a) Both distribution-free and statistical classifiers were tested and evaluated to find the classifier with the highest accuracy.

b) The feature dimension giving the best results was chosen, and 5-fold cross-validation was used to compare the classifiers’ performance.

c) The most accurate classifier of each type was chosen and its parameters were optimized.

d) The classifier giving the highest classification accuracy was then chosen as the final classifier.


B. DISTRIBUTION FREE CLASSIFIER: LINEAR PERCEPTRON (MATLAB: PERLC)

a) DIMENSION OPTIMIZATION

First, the linear Perceptron was cross-validated in every dimension available from the Fisher’s Linear Discriminant mapping, and the optimal dimension of 5 was chosen. The plot is shown in Appendix A-4.

b) PARAMETERS OPTIMIZATION

The linear Perceptron has two parameters: the learning rate η and the maximum number of iterations. The training data was randomly split into 80% training and 20% testing data five times, and the classification accuracy was calculated for different learning rates and numbers of iterations. The parameter values that gave the maximum average classification accuracy over the five repetitions were selected. The plots that were used to find the maxima are shown in Appendix A-5.

Optimal Perceptron Parameters

Learning Rate       0.85
Maximum Iterations  100
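The role of the two parameters can be seen in a minimal two-class perceptron written in Python. This is a simplified sketch, not the multi-class Matlab routine used in the report: the function names and toy data are assumptions, and only the default values 0.85 and 100 are taken from the table above.

```python
def train_perceptron(points, labels, learning_rate=0.85, max_iterations=100):
    """Train a two-class perceptron; labels must be +1 or -1. Returns (weights, bias)."""
    weights = [0.0] * len(points[0])
    bias = 0.0
    for _ in range(max_iterations):
        mistakes = 0
        for x, y in zip(points, labels):
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            if y * activation <= 0:  # misclassified, or exactly on the boundary
                weights = [w + learning_rate * y * xi for w, xi in zip(weights, x)]
                bias += learning_rate * y
                mistakes += 1
        if mistakes == 0:  # a full pass without errors: stop early
            break
    return weights, bias


def predict(weights, bias, x):
    """Return +1 or -1 depending on the side of the hyperplane x falls on."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else -1


# Made-up, linearly separable 2-D data.
points = [(2.0, 2.0), (3.0, 3.0), (3.0, 1.0), (-2.0, -2.0), (-3.0, -1.0), (-1.0, -3.0)]
labels = [1, 1, 1, -1, -1, -1]
weights, bias = train_perceptron(points, labels)
predictions = [predict(weights, bias, x) for x in points]
print(predictions == labels)  # True
```

The learning rate scales each correction step, and the iteration cap bounds training time when the classes are not linearly separable and the loop would otherwise never stop.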

c) CROSS-VALIDATION PERFORMANCE ON TRAINING SET

The linear Perceptron gave good cross-validation accuracy on the Training Data in 5 dimensions.

                   Cross-validation accuracy
Linear Perceptron  98.61%


d) PERFORMANCE IN 2-DIMENSIONAL SPACE AND COMPARISON WITH BASELINE

Once the classifier’s parameters were selected, the performance of the Perceptron was examined in the reduced 2-dimensional feature space and plotted as shown in Appendix A-6. The classification accuracy was then calculated for the labeled Testing Data and compared with the baseline classifier’s performance in the same dimensions.

Classification Accuracy in 2-dimensions on Testing Set

Perceptron  Baseline: Linear Bayes
50.80%      66.10%

It can be seen that the Perceptron classifier is not suitable for 2 dimensions, as its classification accuracy falls well below that of the baseline Linear Bayes classifier in 2 dimensions.

C. STATISTICAL CLASSIFIERS: K-NEAREST NEIGHBORS VS PARZEN WINDOWS

a) COMPARISON OF KNN AND PARZEN WINDOWS (MATLAB: KNNC AND PARZENC)

Both the k-Nearest Neighbors and Parzen window classifiers reached their highest cross-validation accuracy in the 5 feature dimensions given by Fisher’s Linear Discriminant, as shown below.

               5-fold Cross-validation Accuracy
kNN            98.64%
Parzen Window  98.69%

Comparing the two classifiers, the Parzen window gives slightly better performance. Hence, Parzen Windows was chosen as the best statistical classifier.
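For reference, the k-nearest-neighbors rule can be sketched in a few lines of Python. This is an illustrative sketch rather than the Matlab routine named above: the function name, the toy data and the choice k = 3 are assumptions, since the report does not state the value of k that was used.

```python
from collections import Counter


def knn_classify(train_points, train_labels, x, k=3):
    """Majority vote among the k training points closest to x (squared Euclidean distance)."""
    by_distance = sorted(
        zip(train_points, train_labels),
        key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], x)),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Made-up 2-D training data with two classes.
train_points = [(0.0, 0.0), (0.5, 0.2), (0.2, 0.6), (3.0, 3.0), (3.4, 2.8), (2.9, 3.5)]
train_labels = [1, 1, 1, 2, 2, 2]
near_first = knn_classify(train_points, train_labels, (0.3, 0.3))
near_second = knn_classify(train_points, train_labels, (3.1, 3.0))
print(near_first, near_second)  # 1 2
```

kNN fixes the number of neighbors and lets the neighborhood size vary, whereas the Parzen window approach fixes the kernel width and lets the number of contributing samples vary.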


b) PARZEN WINDOWS: PARAMETER OPTIMIZATION

The Parzen window classifier has one smoothing parameter that can be optimized. The value of the smoothing parameter found by Matlab’s optimization is 0.6685.
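The effect of the smoothing parameter is easiest to see in code. The Python sketch below classifies a point by comparing class-wise Parzen density estimates, assuming an isotropic Gaussian kernel of width h and class priors proportional to class frequencies. These modeling choices, the function name and the toy data are assumptions for illustration; only the default h = 0.6685 comes from the report.

```python
import math


def parzen_classify(train_points, train_labels, x, h=0.6685):
    """Assign x to the class with the largest prior-weighted Gaussian Parzen density."""
    d = len(x)
    norm = (2 * math.pi * h * h) ** (d / 2)  # Gaussian normalization constant
    kernel_sum = {}
    counts = {}
    for p, c in zip(train_points, train_labels):
        sq_dist = sum((a - b) ** 2 for a, b in zip(p, x))
        kernel_sum[c] = kernel_sum.get(c, 0.0) + math.exp(-sq_dist / (2 * h * h)) / norm
        counts[c] = counts.get(c, 0) + 1
    n = len(train_points)
    scores = {}
    for c in kernel_sum:
        prior = counts[c] / n                 # class frequency in the training set
        density = kernel_sum[c] / counts[c]   # Parzen estimate of p(x | class c)
        scores[c] = prior * density
    return max(scores, key=scores.get)


# Made-up 2-D training data with two classes.
train_points = [(0.0, 0.0), (0.5, 0.2), (0.2, 0.6), (3.0, 3.0), (3.4, 2.8), (2.9, 3.5)]
train_labels = [1, 1, 1, 2, 2, 2]
first = parzen_classify(train_points, train_labels, (0.3, 0.3))
second = parzen_classify(train_points, train_labels, (3.1, 3.0))
print(first, second)  # 1 2
```

A small h makes the density estimate spiky and sensitive to individual training samples, while a large h smooths it toward a single broad bump per class, which is why the value is worth tuning.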

c) PERFORMANCE IN 2-DIMENSIONAL SPACE AND COMPARISON WITH BASELINE

Once the classifier’s parameter was selected, the performance of the Parzen window classifier was examined in the reduced 2-dimensional feature space and plotted as shown in Appendix A-7. The classification accuracy was then calculated for the labeled Testing Data and compared with the baseline classifier’s performance in the same dimensions.

Classification Accuracy in 2-dimensions on Testing Set

Parzen Window  Baseline: Linear Bayes
64.964%        66.10%

It can be concluded that the Parzen window classifier is not suitable for 2 dimensions either, as its classification accuracy is slightly lower than that of the baseline Linear Bayes classifier in 2 dimensions.

VI. THE BEST CLASSIFIER: PARZEN WINDOWS

Out of all the statistical and distribution-free classifiers evaluated, Parzen Windows gave the best performance, as shown below.

Classifier Type    Classifier         5-fold Cross-validation Accuracy
Distribution-Free  Linear Perceptron  98.61%
Statistical        kNN                98.64%
Statistical        Parzen Window*     98.69%
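The final selection step amounts to taking the maximum over the cross-validation accuracies in the table above. A short Python sketch (variable names chosen for this example) makes the size of the margin explicit.

```python
# 5-fold cross-validation accuracies (%) reported in the table above.
cv_accuracy = {
    "Linear Perceptron": 98.61,
    "kNN": 98.64,
    "Parzen Window": 98.69,
}
best = max(cv_accuracy, key=cv_accuracy.get)
margin = cv_accuracy[best] - min(cv_accuracy.values())
print(best, round(margin, 2))  # Parzen Window 0.08
```

The three classifiers lie within 0.08 percentage points of each other in 5 dimensions.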


A. SPECIFICATIONS

B. FEATURE SET USED

The final feature set was obtained by reducing the data to 5 dimensions with Fisher’s Linear Discriminant Analysis.

C. FIVE-FOLD CROSS-VALIDATION RESULTS AVERAGED OVER 5 TIMES

The 5-fold cross-validation results on the training data set, averaged over 5 repetitions, are tabulated below.

5-fold cross-validation averaged over 5 repetitions

                         Parzen Windows
Classification Accuracy  98.62 %
Standard Deviation       0.00024

D. CONFUSION MATRIX AND ACCURACY RATE ON LABELED TESTING SET

The confusion matrix and percentage accuracy on the labeled Testing Data are given below.

Confusion Matrix for Parzen Window Classifier (rows: true labels, columns: estimated labels)

True label     1    2    3    4    5    6  Total
1            267    3    0    0    0    0    270
2              6  245    0    0    0    0    251
3              0    5  196    0    0    0    201
4              0    1    0  231   27    0    259
5              0    0    0   11  270    0    281
6              0    0    0    0    0  285    285
Total        273  254  196  242  297  285   1547

For Testing labeled set  Parzen Window Classifier
Percent Accuracy         96.574 %

Best Classifier  Smoothing Parameter H
Parzen Windows   0.6685


E. LABELING OF UNLABELED TESTING DATA

The unlabeled testing data provided was then labeled using the optimized Parzen window classifier. The histogram indicating how many samples were classified in the respective classes is plotted below.
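Counting the predicted labels per class, which is what such a histogram displays, can be sketched in Python as follows. The label list is made up purely to show the mechanics and does not reflect the actual predictions on the unlabeled set; the function name is likewise an assumption.

```python
from collections import Counter


def label_histogram(predicted_labels, classes=range(1, 7)):
    """Number of samples assigned to each of the six activity classes."""
    counts = Counter(predicted_labels)
    return {c: counts.get(c, 0) for c in classes}


# Hypothetical predictions, for illustration only.
example_predictions = [1, 2, 2, 3, 6, 6, 6, 5, 4, 1]
hist = label_histogram(example_predictions)
for c, n in hist.items():
    print(f"class {c}: {'#' * n} ({n})")
```

Listing every class explicitly keeps classes with zero predictions visible in the output.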

VII. REFERENCES

[1] Wikipedia, "Linear discriminant analysis," http://en.wikipedia.org/wiki/Linear_discriminant_analysis


VIII. APPENDIX

[A – 1] CLASSIFICATION PERFORMANCE OF LINEAR AND QUADRATIC BAYES CLASSIFIER IN DIFFERENT DIMENSIONS (PCA)


[A – 2] CLASSIFICATION PERFORMANCE OF LINEAR AND QUADRATIC BAYES CLASSIFIER IN DIFFERENT DIMENSIONS (FLD)


[A – 3] SCATTER PLOTS IN FEATURE SPACE FOR 2-DIMENSIONS FOR LINEAR BAYES CLASSIFIER (FLD USED TO REDUCE DIMENSIONS)


[A – 4] LINEAR PERCEPTRON DIMENSION OPTIMIZATION

Correction: The Y-axis shows the accuracy as a fraction, that is, the percent accuracy divided by 100. It can be seen that the linear Perceptron gives the best results in the 5 dimensions given by FLD.


[A – 5] LINEAR PERCEPTRON PARAMETERS OPTIMIZATION

A learning rate of 0.85 was chosen as the optimum because it gives the minimum error rate. The zeros on the X-axis are to be avoided.


[A – 6] LINEAR PERCEPTRON SCATTER PLOT IN 2-DIMENSIONS

Interpretation: This is how the linear Perceptron decision boundaries look in the reduced 2-D feature space. As can be seen, the classes are not separable enough by straight lines in two dimensions, so the Perceptron in the 2 dimensions given by FLD will not give good classification accuracy.


[A – 7] PARZEN WINDOW SCATTER PLOT IN 2-DIMENSIONS

Interpretation: Since the Parzen window classifier is not limited to linear decision boundaries like the linear Perceptron, it is better at classifying the six classes than the Perceptron.


[A – 8] INSTRUCTION ON RUNNING THE MATLAB CODE

The code in the M-file is divided into cells, so it has to be executed cell by cell to test it thoroughly. The cells needed to reproduce the final results achieved by the Parzen classifier are run in the sequence tabulated below.

Cell execution sequence to run the best performing classifier (parzenc) in Matlab code

Sequence  Cell  Task that cell performs
Step 1    A     Loads all the data and makes the dataset
Step 2    F     Dimension reduction to 5 dimensions using FLD for all the datasets
Step 3    L     Parzen cross-validation performance
Step 4    M     Calculates the confusion matrix and accuracy rate
Step 5    N     Generates the final labels for the unlabeled Testing Data

The rest of the cells were used for:

1. Analysis of the performance of the classifiers and their comparison

2. Finding the optimal dimensionality for PCA and FLD. FLD was used in the final classifier.

3. Optimization of parameters for Perceptron, kNN and Parzen windows
