
RECOGNIZING CHILDREN’S EMOTIONS USING CONVOLUTIONAL NEURAL NETWORKS

    Amir Hossein Farzaneh, Xiaojun Qi

    DEEP LEARNING

    TRAINING IMAGES

    TRAINING ARTIFICIAL INTELLIGENCE

    NETWORK TOPOLOGY

    EXPERIMENTAL RESULTS

    SAMPLE APPLICATION

    The network is trained using several thousand annotated images of different faces expressing various emotions.

    Microsoft’s Facial Expression Recognition (FER+) dataset contains 35,887 images annotated with eight emotions.
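
    As a rough illustration of how such data can be consumed, the sketch below wraps FER+-style samples in a PyTorch Dataset. It assumes the 48x48 grayscale images and their majority-vote emotion labels have already been exported to NumPy files; the file names and array layouts are hypothetical, not part of the dataset’s original distribution.

```python
# Minimal PyTorch Dataset sketch for FER+-style samples. Assumes the 48x48
# grayscale images and majority-vote labels were exported to NumPy files;
# the file names below are hypothetical.
import numpy as np
import torch
from torch.utils.data import Dataset

EMOTIONS = ["neutral", "happiness", "surprise", "sadness",
            "anger", "disgust", "fear", "contempt"]

class FERPlusDataset(Dataset):
    def __init__(self, images_file="ferplus_images.npy", labels_file="ferplus_labels.npy"):
        self.images = np.load(images_file)   # (N, 48, 48) uint8 pixel arrays
        self.labels = np.load(labels_file)   # (N,) int64 indices into EMOTIONS

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        # Scale pixels to [0, 1] and add a channel dimension: (1, 48, 48).
        image = torch.from_numpy(self.images[idx]).float().unsqueeze(0) / 255.0
        return image, int(self.labels[idx])
```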

    Challenged with children’s faces detected in a video, the network labels each child’s emotion in every frame as happiness or neutral with high accuracy.
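
    A minimal sketch of that per-frame pipeline is given below. It uses an OpenCV Haar-cascade face detector as a stand-in (the poster does not name the detector that was used) and a trained model such as the VGG13-style network sketched later; the video path is hypothetical.

```python
# Per-frame sketch: detect faces with an OpenCV Haar cascade (an assumption;
# the poster does not name the detector) and classify each crop with a
# trained CNN. The video path is hypothetical.
import cv2
import torch

EMOTIONS = ["neutral", "happiness", "surprise", "sadness",
            "anger", "disgust", "fear", "contempt"]

def tag_video(model, video_path="children.mp4"):
    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    capture = cv2.VideoCapture(video_path)
    model.eval()
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        for (x, y, w, h) in detector.detectMultiScale(gray, 1.1, 5):
            # Crop the face, resize to the network's 48x48 input, and classify.
            face = cv2.resize(gray[y:y + h, x:x + w], (48, 48))
            batch = torch.from_numpy(face).float().div(255).view(1, 1, 48, 48)
            with torch.no_grad():
                prediction = model(batch).argmax(dim=1).item()
            print(EMOTIONS[prediction])  # e.g. "happiness" or "neutral"
    capture.release()
```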

    To evaluate the performance of our algorithm, we manually labelled 1,000 images of children’s faces as either “happiness” or “neutral”. Our deep learning algorithm classifies these images with 88.79% accuracy.
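
    The accuracy figure is a straightforward tally over the labelled set; a sketch, assuming the children’s images are wrapped in a dataset like the one above:

```python
# Accuracy sketch over a manually labelled evaluation set (wrapped in a
# Dataset like the FER+ sketch above; names are illustrative).
import torch
from torch.utils.data import DataLoader

def accuracy(model, dataset, batch_size=64):
    loader = DataLoader(dataset, batch_size=batch_size)
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for images, labels in loader:
            predictions = model(images).argmax(dim=1)
            correct += (predictions == labels).sum().item()
            total += labels.numel()
    return 100.0 * correct / total  # e.g. 88.79 on the children's test set
```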

    Deep learning algorithms take many forms. Our research group used a convolutional neural network (CNN) to identify eight facial expressions, namely neutral, happiness, surprise, sadness, anger, disgust, fear, and contempt, in people of different demographics and ages. We challenge our network to identify children’s emotions in the hope of enhancing Human-Computer Interaction (HCI), diagnosing behavioral disorders, and supporting similar applications.

    Evaluation on the FER+ dataset, classification accuracy (%): Training 91.67, Validation 85.33, Test 86.26.

    Over multiple iterations, the network discovers patterns in the data that distinguish the emotions. Convolutional layers identify structural features of the images, which are then combined by the fully connected layers.

    Combining layers of different structure lets the network adapt to recognize images of varying type.
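
    A minimal training-loop sketch for this process is given below, using cross-entropy loss over the eight emotion classes; the optimizer choice and hyperparameters are illustrative, not the exact settings used in this work.

```python
# Minimal training-loop sketch: cross-entropy over the eight emotion classes.
# Optimizer choice and hyperparameters are illustrative, not the exact
# settings used in this work.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

def train(model, train_set, epochs=30, lr=1e-3, batch_size=64):
    loader = DataLoader(train_set, batch_size=batch_size, shuffle=True)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for images, labels in loader:
            optimizer.zero_grad()
            loss = criterion(model(images), labels)  # compare logits to labels
            loss.backward()                          # backpropagate the error
            optimizer.step()                         # update the weights
    return model
```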

    [Network topology diagram: input image → convolutional layers → fully connected layers → predicted emotion, e.g. “happiness”]

    We adopt the state-of-the-art CNN architecture VGG13, tweaking the structure of the network to better suit our application. The output layer generates the probabilities of an arbitrary input image belonging to each of the eight emotion classes. To avoid overfitting, we interleave the convolutional and fully connected layers with dropout layers.
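
    The sketch below shows one plausible VGG13-style stack for 48x48 grayscale inputs with an eight-way output and interleaved dropout; the layer widths, dropout rates, and pooling schedule are illustrative, not our exact tweaked configuration.

```python
# VGG13-style CNN sketch for 48x48 grayscale inputs and eight emotion classes.
# Layer widths and dropout rates are illustrative; the poster's exact
# configuration may differ.
import torch.nn as nn

def conv_block(in_ch, out_ch):
    # Two 3x3 convolutions, then pooling and dropout to reduce overfitting.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1), nn.ReLU(inplace=True),
        nn.MaxPool2d(2),
        nn.Dropout(0.25),
    )

class EmotionVGG13(nn.Module):
    def __init__(self, num_classes=8):
        super().__init__()
        self.features = nn.Sequential(
            conv_block(1, 64),     # 48x48 -> 24x24
            conv_block(64, 128),   # 24x24 -> 12x12
            conv_block(128, 256),  # 12x12 -> 6x6
            conv_block(256, 256),  # 6x6   -> 3x3
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(256 * 3 * 3, 1024), nn.ReLU(inplace=True), nn.Dropout(0.5),
            nn.Linear(1024, 1024), nn.ReLU(inplace=True), nn.Dropout(0.5),
            nn.Linear(1024, num_classes),  # logits; softmax gives class probabilities
        )

    def forward(self, x):
        return self.classifier(self.features(x))
```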

    [Confusion matrix over the eight emotion classes, e.g. happiness, anger, surprise]
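
    The confusion matrix can be rebuilt from the same test predictions; a short sketch using scikit-learn, with the model and loader following the earlier sketches:

```python
# Confusion-matrix sketch over the eight emotion classes, using scikit-learn;
# the model and data loader follow the earlier sketches.
import torch
from sklearn.metrics import confusion_matrix

def emotion_confusion(model, loader):
    y_true, y_pred = [], []
    model.eval()
    with torch.no_grad():
        for images, labels in loader:
            y_pred.extend(model(images).argmax(dim=1).tolist())
            y_true.extend(labels.tolist())
    # Rows are true emotions, columns are predicted emotions.
    return confusion_matrix(y_true, y_pred, labels=list(range(8)))
```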

    [Sample application diagram: detected child’s face → trained CNN → predicted emotion (“happiness” or “neutral”)]

    This project was supported by USU’s Graduate Research and Creative Opportunities (GRCO) Grant Program 2017-2018