A FACEREADER-DRIVEN 3D EXPRESSIVE AVATAR
Crystal Butler | Amsterdam 2013
THE ANIMATION DEVELOPMENT PROCESS

Quantitative Basis
Happiness response curves gathered by FaceReader analysis from a prior study were transformed to create intensity dynamics for animated avatars. The avatars were designed to hinder or facilitate the happiness responses of participants in facial mimicry experiments.

Key Frames
Examining inflection points on the averaged response curve in conjunction with visual inspection of the video stimulus revealed key transition points for facial expressions.

Action Units
Action unit combinations for the basic emotions of disgust (hindering) and happiness (facilitating) were chosen from the Facial Action Coding System by Ekman, Friesen and Hager (2002) and varied based on the happiness response intensity of the source data.
Data from Source Experiment
FACIAL EXPRESSIONS OF HAPPINESS, CONTROL CONDITION
The slide shows an intensity graph of averaged participant happiness responses over the course of viewing an amusing commercial. From the original data set, only the 12 participants who had consistently good-quality facial-fitting results in FaceReader were used. Of a potential peak intensity of 1, the maximum average happiness score was .2433.
Data from Source, Normalized
(H_i - H_min) / (H_max - H_min)
In order to develop more discriminable facial expressions for the avatar animation, the original data was normalized to span the full
range of possible FaceReader emotion scores from 0-1.
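Illustrative only: a minimal sketch of that min-max rescaling in Python (variable names are mine, not from the FaceReader export):

```python
def normalize(scores):
    """Min-max rescale happiness scores so they span the full 0-1 range."""
    h_min, h_max = min(scores), max(scores)
    return [(h - h_min) / (h_max - h_min) for h in scores]

# With the source data, the .2433 peak becomes 1.0 after rescaling.
happiness = [0.01, 0.05, 0.2433, 0.12]
print(normalize(happiness))  # [0.0, 0.171..., 1.0, 0.471...]
```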
Video: Fear Face
HAPPINESS DIPS DURING FEAR DISPLAY
Determining Key Frames
INFLECTION POINTS VS. HIGH SD AREAS

Inflection points are frames at which the slope of the curve, dy/dx calculated using the happiness measurements on either side of the frame, changes sign, indicating that a local maximum or minimum has been reached and that the trend of expressive intensity has changed. Areas of high standard deviation were determined by calculating the SD at each point from its value plus the five frames to either side. Frames with SDs in the top 30% mark areas where happiness intensities are changing rapidly relative to the average.
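A minimal sketch of both detectors, assuming a list of per-frame happiness averages as input (function names are mine):

```python
import statistics

def inflection_frames(h):
    """Frames where the centered slope dy/dx, taken from the neighbors
    on either side, changes sign (a local maximum or minimum)."""
    slopes = [(h[i + 1] - h[i - 1]) / 2.0 for i in range(1, len(h) - 1)]
    return [i + 1 for i in range(1, len(slopes))
            if slopes[i - 1] * slopes[i] < 0]

def high_sd_frames(h, window=5, top_fraction=0.30):
    """Frames whose SD over an 11-frame window (the frame plus five
    frames to either side) falls in the top 30%."""
    sds = [statistics.stdev(h[i - window:i + window + 1])
           for i in range(window, len(h) - window)]
    cutoff = sorted(sds, reverse=True)[int(len(sds) * top_fraction)]
    return [i + window for i, sd in enumerate(sds) if sd >= cutoff]
```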
Key Frames: The Eyes Have It
VISUAL INSPECTION OF INFLECTION POINTS ON THE VIDEO WINS
Hindering Expressions: Disgust
The FACS Investigator's Guide lists six Action Unit (AU) combinations typical of disgust. Three involve Action Unit 9, which wrinkles the nose and is thought to be an innate physiological response to noxious odors. Action Unit 10 is indicated in the other three combinations; it is also found in expressions of anger. Because of these differences, combinations with AU 9 were reserved for video scenes focused on food (Doritos) or the goat, while AU 10 combinations were applied to scenes in which the focus was on the human actor. In order to create expressions strong enough to be recognizable, an intensity floor of .3 was applied to the average happiness measurements, and the AU intensities were then normalized to range from 0-1. Combinations with a greater number of AUs were considered to be more intense, and each group was further broken down into slight, moderate, and strong expressions. Thresholds for AU intensities were determined by dividing the 0-.99 happiness range into nine equal steps of .11.
AU 9 combination    AU 10 combination    Happiness range    Slight        Moderate      Strong
9                   10                   0-.33              .11 = .377    .22 = .454    .33 = .531
9+17                10+17                .34-.66            .44 = .608    .55 = .685    .66 = .762
9+16+25+26          10+16+25+26          .67-.99            .77 = .839    .88 = .916    .99 = .993

(Each cell gives the happiness threshold and its normalized AU intensity.)
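The normalized values above follow a simple floor-and-rescale rule (AU intensity = .3 + .7 x happiness). A sketch of that mapping and the range-based combination choice; this is my reconstruction from the table, not code from the project:

```python
def au_intensity(happiness, floor=0.3):
    """Apply the .3 intensity floor: rescale a 0-1 happiness score into
    the .3-1 range so even slight expressions stay recognizable."""
    return floor + (1.0 - floor) * happiness

def disgust_aus(happiness, food_scene=True):
    """Choose an AU combination: more AUs for higher happiness ranges
    (assumed pairing), AU 9 combos for food/goat scenes, AU 10 combos
    for scenes focused on the human actor."""
    if happiness <= 0.33:
        pair = ("9", "10")
    elif happiness <= 0.66:
        pair = ("9+17", "10+17")
    else:
        pair = ("9+16+25+26", "10+16+25+26")
    return pair[0] if food_scene else pair[1]

print(au_intensity(0.11))  # ~0.377, the 'slight' value in the first row
```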
Storyboarding: Disgust

Key frames are illustrated with clips from the Doritos Goat 4 Sale video and renderings of the avatar's expression as it would appear at that moment. The video frame number is given, along with the point Happiness Average (HA) and corresponding Action Unit designations per key frame.

Key Frames: indicated within red arrow boxes
Transitions: based on the intensity dynamics of the happiness graph, with subjective use of additional AUs or modified intensities to create a more natural flow
Hindering Video
DO EXPRESSIONS OF DISGUST REDUCE FEELINGS OF HAPPINESS?
Facilitating Expressions: Happy

The FACS Investigator's Guide lists only two AU combinations typical of happiness, both closed-mouth smiles. Two more were created by adding AU 25 and AU 25+26 for open smiles with greater intensity. In order to create expressions strong enough to be recognizable, an intensity floor of .3 was applied to the happiness measurements, and the AU intensities were then normalized to range from 0-1. Combinations with a greater number of AUs were considered to be more intense, and were further broken down into slight, moderate, and strong expressions. Thresholds for AU intensities were determined by dividing the four combinations into three subgroups each, for a total of twelve intensity ranges.
Smile type           AU combination   Happiness range   Slight   Moderate   Strong
Lip Corner Pull      12               0-.2475           .3583    .4167      .475
Duchenne Smile       6+12             .2476-.495        .5333    .5917      .6497
Open-lipped Smile    6+12+25          .496-.7425        .708     .7664      .8247
Open-mouthed Smile   6+12+25+26       .7426-.99         .8831    .9414      .9999

(Values are normalized AU intensities.)
Storyboarding: Happy

Key frames are illustrated with clips from the Doritos Goat 4 Sale video and renderings of the avatar's expression as it would appear at that moment. The video frame number is given, along with the point Happiness Average (HA) and corresponding Action Unit designations per key frame.

Key Frames: indicated within red arrow boxes
Transitions: based on the intensity dynamics of the happiness graph, with subjective use of additional AUs or modified intensities to create a more natural flow
Facilitating Video
DO EXPRESSIONS OF HAPPINESS MAGNIFY THAT EMOTION?
WHY AVATARS?
Some Current Research
AVATARS AND AGENTS

Relational Agents Group: This group from Northeastern University, USA, is working on medical applications.

Affective Social Computing Lab: Christine Lisetti, Florida International University, develops FACS-based virtual counselors to provide mental healthcare: "On-Demand VIrtual Counselor (ODVIC): In this project, we design and implement the prototype of On-Demand VIrtual Counselor (ODVIC), intelligent virtual characters who can provide people access to effective behavior change interventions and help them find and cultivate motivation to change unhealthy lifestyles (e.g. excessive alcohol use, overeating). An empathic Embodied Conversational Agent (ECA) delivers the intervention. The health dialog is directed by a computational model of Motivational Interviewing, a novel effective face-to-face patient-centered counseling style which respects an individual's pace toward behavior change."

Rachael Jack: Dr. Jack's work at the University of Glasgow uses avatars generated by a FACS-based facial grammar to create expressions and test recognition across cultures.

Institute for Creative Technologies: This University of Southern California group investigates a variety of uses for virtual humans, focusing on education and training.
Potential Applications
FOR FACEREADER

Mirror AU output for reliability checks or user feedback
Anonymize participant videos to alleviate privacy concerns
Create and identify a wide range of subtle expressions beyond the set of basics

THE FUTURE?

Provide an instructional avatar within the interface to give procedural guidelines and offer help
Create customized avatars that allow for the study of reactions based on race, gender, and age
Study the effect of the presence of an 'other' in various scenarios
Integrate as a tool for recognizing and coding Action Units (as in the video at right)
Use FaceReader as an engine rather than an end product for driving affective, human-agent interactions
Avatar Creation Process
TECHNOLOGY AND FEATURES

Goal: use FaceReader's Action Unit, head and eye pose, and mask data to drive a 3D avatar that mirrors user expressions in real time.
faceshift for Modeling
AUTOMATED MODELING USING THE KINECT
Maya for Motion and Texture
MANUAL ADJUSTMENTS AND ADDITIONS OF BLENDSHAPES
Scripting in Maya
MEL COMMANDS

Read the Action Unit values from the live FaceReader feed via a comma-delimited text file
Update the blendshapes in Maya using those AUs
Get the head angle values from the FaceReader text file and apply them to the neck joint
Get the face mask texture from the live FaceReader feed and apply it to the UV map on the model
Grab a point color from the current mask texture and apply it to the shading node for the head
Set an animation key frame so the capture can be played back
Check for a radio button change indicating that the current facial texture should be fixed
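The deck names MEL, but the same loop can be sketched in Maya's Python API (maya.cmds). The feed path, column order, and node/attribute names below are hypothetical placeholders, not the project's actual setup; texture transfer and the radio-button check are omitted for brevity:

```python
import maya.cmds as cmds

FEED = "C:/FaceReader/live_feed.txt"   # hypothetical path to the live export
AUS = ["AU01", "AU02", "AU04", "AU06", "AU09", "AU10", "AU12", "AU25"]

def apply_frame():
    # Read the latest comma-delimited line from the FaceReader feed.
    with open(FEED) as f:
        fields = f.readlines()[-1].strip().split(",")
    au_values = [float(v) for v in fields[:len(AUS)]]
    rx, ry, rz = (float(v) for v in fields[len(AUS):len(AUS) + 3])

    # Update the blendshape weights from the AU values (targets are
    # assumed to be named after their AUs).
    for name, value in zip(AUS, au_values):
        cmds.setAttr("blendShape1." + name, value)

    # Apply the head angles to the neck joint.
    cmds.setAttr("neck_joint.rotate", rx, ry, rz, type="double3")

    # Key the frame so the capture can be played back later.
    cmds.setKeyframe("blendShape1", "neck_joint")
```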
What's Missing?
Hair, eye movement, blending at the edge of the facial mask, and SPEED!
The Beta
PRIOR TO APPLYING TEXTURES OR ADJUSTING BLENDSHAPES (AND TO SHOW OFF VOICE ACTIVATION!)
The Avatar Now
FACEREADER VS FACESHIFT SMACKDOWN
Subtle Expressions
SOME EXAMPLES FROM FACEREADER AU VIDEO CAPTURES
FaceReader AU Recognition
A PILOT STUDY OF FACEREADER RESPONSES TO TYPICAL AU COMBINATIONS FOUND IN BASIC EXPRESSIONS OF EMOTION

Data is based on a series of 37 videos made to display the Action Units that comprise the prototypes and major variants of the basic emotions according to the FACS Investigator's Guide. Each video is 30 seconds long, with 3 seconds of neutral expression at the beginning and end and a full range of AU intensities peaking around 15 seconds. A full analysis of each video, with emotion and AU recognition, is available.

General Observations
Surprise is the most reliably recognized emotion
Disgust is the most poorly recognized emotion
Anger and disgust are frequently confounded
Fear is often mistaken for surprise
Due to the absence of AU 11, AUs 9 and 10 are mistakenly coded in sadness
Discrimination between AUs 9 and 10 is poor, as is differentiation between AUs 23 and 24
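For concreteness, a sketch of the stimulus intensity envelope implied by that design, assuming a simple linear rise and fall between the neutral pads (the actual videos may have used different easing):

```python
def au_envelope(frame, fps=30, length_s=30, pad_s=3, peak_s=15):
    """Per-frame target AU intensity: neutral for the first and last
    3 seconds, rising to a peak around 15 seconds, then falling."""
    t = frame / fps
    if t < pad_s or t > length_s - pad_s:
        return 0.0                                    # neutral pads
    if t <= peak_s:
        return (t - pad_s) / (peak_s - pad_s)         # ramp up
    return (length_s - pad_s - t) / (length_s - pad_s - peak_s)  # ramp down

# Peak intensity of 1.0 at the 15-second mark of a 30-second video.
print(au_envelope(15 * 30))  # 1.0
```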
WHO WANTS TO TRY IT?