Assess and Augment: Toward Games & Training With Biophysical Sensor

Games & Training + Human Performance Data

Spencer Frazier

Lockheed Martin

• Spencer Frazier

• Lockheed Martin - Lead Senior Software Engineer

• Boston College, USC M.S., Georgia Tech Ph.D. (hiatus)

• Startups: Moglo (Aniphon), Drizly Inc

• Research: Serious Games (Team-It, Mars Game), Multi-Agent Systems, NNs (RNN, LSTM, etc.), Compositional Frameworks for Zero-Shot Learning

• Hobbies: Mask carving, game community (almost 15 years), jamming sensors into everyday life

• Moglo – location-based capture the flag using mobile

• ZombieSC – location-based fitness game, closed-loop incentives based on learning

• Aniphon – location-based creature battling (think Pokemon GO)

• Team-It – @USC, MAS research, evaluating teaming and collaboration with agents

• Mars Game – instructional web game for STEM concepts

• Agent Trust/Real-Time Augmentation – eyetracking, alerting, HR, GSR, CNAP (BP), Wizard of Oz performer

• 1O -> ML -> 2O – take first-order human performance data and learn second-order, assessment-relevant data to lighten the workload on instructors (ML, RNNs, LSTMs)

“Why shouldn’t I take a nap or text or answer emails?” aka “The Agenda”

1. Hardware and assessment-based approaches to collecting human performance data

2. What interesting things you can infer from those data

3. How you can modify your games and training based on that data to save/make time/money

4. Examples (real work, ongoing work and some new ideas)

Plenary+ (More reasons to stay [awake])

• Build trust w/ avatars

  • Should we use an avatar?

• Improve scalability in virtual environments

  • Save resources

  • Improve scalability in VR, specifically

• More immersive environments

• Capitalize on eSports growth

• More

• Human Factors: interactions among humans and other elements of a system

• Human Performance: first order (heart rate) and second order (stress) data from a human

• Sensors/Hardware: how we get the data

• Biophysical: a dirty(?) word that I still like to use

• Agent: person or automated process

• Others: speak up!

Researchgate.net

Hardware – advancing and already “on”

Pupil Labs

Gregory Kovacs (Stanford)

Tobii

Hardware Improvements (Last 5 Years)

• Beyond Medical

  • Though this is still the best for reliable data

• Consumer Awareness/Ownership

• Durable(ish)

• Affordable ($200-$2000 for multi-modal monitoring)

  • Eyetracking is still pricy

• Less bulk/onboarding

• Gender affordances(ish)

Types of First-Order Data

• Eyetracking

  • Gaze

  • Pupillometry

• Heartrate/Pulse/BP

• Perspiration

  • GSR

• Respiration

• EEG

  • Brain Signals

• fNIR

  • Functional Near-IR (O2, Hem in prefrontal cortex)

• Force/Haptics

  • Force sensitive mouse

  • ForceTouch

• Movement (aXYZ)

• Camera Images

  • Video

  • IR Depth

• ...and more!

USC ICT

Krysten Newby (CreativePool)

Types of Second-Order Data

• Gestures/Interaction/Intent

• Expressions (EKMAN)

• Stress (TALEMAN 2009)

• Engagement (SHAGASS 1976)

• Workload (SASSAROLI 2008)

• Sentiment (RAUDONIS 2009)

• Learning Rate (MAKEIG, AYAZ, GRAMANN)

• Trust/Deceit (GRATCH 2012)

• Fatigue (SEOANE 2014)

• Posture [Engagement] (D’MELLO)

• Fitness [Avg HR]

Emotions/Trust/Deceit

• Face and eyes

• Pupil size and reactivity

• Observations

  • Illness/recovery potential (medical)

  • Attraction

  • Cognitive load (larger diameter through task completion)

  • Memory encoding

• Avatars can react or animate appropriately
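
The cognitive load observation above (larger pupil diameter through task completion) can be sketched as a baseline-corrected dilation measure. This is a minimal illustration, not the method used in any of the cited work: the function name, the millimeter units and the rule of discarding non-positive samples (an assumed convention for blinks or tracking loss) are all hypothetical, and real pupillometry also has to control for lighting changes.

```python
from statistics import mean

def pupil_dilation_pct(task_mm, baseline_mm):
    """Percent change in mean pupil diameter during a task vs. a resting baseline.

    Samples <= 0 are dropped; treating them as blinks or tracking loss
    is an assumption of this sketch.
    """
    task = [d for d in task_mm if d > 0]
    base = [d for d in baseline_mm if d > 0]
    if not task or not base:
        raise ValueError("need at least one valid sample in each window")
    base_mean = mean(base)
    return 100.0 * (mean(task) - base_mean) / base_mean

if __name__ == "__main__":
    # Hypothetical diameters in mm; the 0.0 stands in for a blink.
    print(pupil_dilation_pct([3.3, 0.0, 3.3], [3.0, 3.0]))
```

A positive value means the pupil was larger during the task than at rest; an avatar or instructor view could use a sustained rise as one hint of increased load.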

Engagement

• Gaze, pupils

  • Memory encoding based on pupil size

• Posture (slouching)

• Don’t lose early chunks of the “train, practice, assess” cycle in game onboarding or in the classroom (learners left behind) (Jacobson)

Stress/Cognitive Load

• Heartrate

• Blood pressure

• GSR

• fNIR

• Game or training difficulty can respond dynamically (or just report)

• Your own assessment/derived measure
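
As one possible "derived measure", the sketch below turns first-order heart rate and GSR readings into a crude stress index and lets difficulty respond to it. Everything here is an assumption for illustration: the z-score averaging, the thresholds, the 1-10 difficulty scale and all function names are hypothetical, not a validated stress model.

```python
from statistics import mean, stdev

def z_score(value, baseline):
    """Standard score of `value` against resting baseline samples (needs >= 2)."""
    spread = stdev(baseline)
    return 0.0 if spread == 0 else (value - mean(baseline)) / spread

def stress_index(hr, gsr, hr_baseline, gsr_baseline):
    """Hypothetical derived measure: mean of the heart-rate and GSR z-scores."""
    return (z_score(hr, hr_baseline) + z_score(gsr, gsr_baseline)) / 2

def adjust_difficulty(level, index, low=-0.5, high=1.5, min_level=1, max_level=10):
    """Ease off one step when the index is high, ramp up one step when it is low."""
    if index > high:
        level -= 1
    elif index < low:
        level += 1
    return max(min_level, min(max_level, level))

if __name__ == "__main__":
    hr_rest = [60, 62, 64, 66, 68]    # made-up resting heart rates (bpm)
    gsr_rest = [2.0, 2.1, 1.9, 2.0]   # made-up resting skin conductance
    idx = stress_index(75, 2.4, hr_rest, gsr_rest)
    print(idx, adjust_difficulty(5, idx))
```

To "just report" instead of adapting, log the index for the instructor and leave the level unchanged.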

Data Best Practices

• Timing/time is huge

  • You’ll hear different things from your HFR, your architect, your engineer, your data scientist, your customer

• Data format for analysis after the fact is huge

  • xAPI?

  • NoSQL/Relational?

  • Binary?

  • How much to save?

• We need meta-data standards for data collection to ease adoption and interaction with this equipment

  • Nobody wins long term with closed-source, closed-API software*
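
To make the timing and format points concrete, here is a minimal sketch of a timestamped sample record written as one JSON object per line. The field names, the choice of JSON lines and the two-clock approach are assumptions for illustration only; this is not xAPI and not a proposed standard.

```python
import json
import time
from dataclasses import dataclass, asdict, field

@dataclass
class SensorSample:
    sensor: str   # e.g. "heart_rate" (hypothetical naming)
    value: float
    unit: str     # keep units with the data, e.g. "bpm"
    # Monotonic clock: safe for ordering and intervals within one machine.
    t_monotonic_ns: int = field(default_factory=time.monotonic_ns)
    # Wall clock (seconds since the epoch): for aligning across machines.
    t_wall: float = field(default_factory=time.time)
    session_id: str = "unknown"

def to_json_line(sample):
    """Serialize one sample as a single JSON line."""
    return json.dumps(asdict(sample), sort_keys=True)

def from_json_line(line):
    """Rebuild a sample from a line written by to_json_line."""
    return SensorSample(**json.loads(line))

if __name__ == "__main__":
    s = SensorSample("heart_rate", 72.0, "bpm", session_id="demo-01")
    print(to_json_line(s))
```

Recording both clocks is one way to reconcile the different answers on timing: the monotonic stamp orders samples reliably, while the wall-clock stamp lets streams from separate devices be lined up after the fact.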

Game Use Case – Gaze, Pupillometry, Posture

Character authoring, believability, immersion

Advantages (Eyetracking/Posture via Depth)

• Less time spent authoring believable characters

  • Reactive behavior can be shared across all agents in the game

  • Saves $

• Immersion increases as players feel every action influences the environment

  • Flow, presence (Weibel 2011)

• Mirror the character’s posture to establish rapport or just assess engagement

Training Use Case – Gaze, Pupillometry

Trust, engagement

[Figure placeholder: image about alerting/avatars/dash; labels: 20 inch monitor, approx. accuracy]

Research/Methods/Results

• Avatar “Oz” controlled by a SME in the domain in another room, semi-scripted

• Other actual domain experts as subjects

  • 3.5 days, 8+ hours per day, performing a task

• Control: non-emotional avatar, non-augmented team

• Assessing trust, speed, accuracy, engagement…

• Augmenting by dismissing alerts, generated either by Oz or by the system, when the subject looked at the alert

• Survey (Trust)

• Positive feedback from subjects, system improvements

Advantages (Eyetracking)

• Eyetracking for alert dismissal or acknowledgement

  • More time on task

• Learn what visual features are important to a trainee during critical decision-making processes

  • Improve fidelity of those features/entities

  • Decrease fidelity of less important features

• Assess usefulness of an avatar – how often are they engaging it?
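
Gaze-based alert dismissal can be sketched as a dwell timer: the alert is acknowledged once the gaze point has stayed on it long enough. This is an illustration under stated assumptions, not the system used in the study; the `Alert` class, the 0.5 s dwell threshold and the 30 px margin (a stand-in for eyetracker inaccuracy) are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Alert:
    x: float        # top-left corner, screen pixels
    y: float
    width: float
    height: float
    dwell_s: float = 0.0
    dismissed: bool = False

    def contains(self, gx, gy, margin=0.0):
        """True if the gaze point falls inside the alert, grown by `margin`."""
        return (self.x - margin <= gx <= self.x + self.width + margin
                and self.y - margin <= gy <= self.y + self.height + margin)

def update_alert(alert, gaze_x, gaze_y, dt, dwell_threshold_s=0.5, margin_px=30.0):
    """Accumulate continuous dwell time; dismiss once the threshold is reached."""
    if alert.dismissed:
        return alert
    if alert.contains(gaze_x, gaze_y, margin_px):
        alert.dwell_s += dt
        if alert.dwell_s >= dwell_threshold_s:
            alert.dismissed = True
    else:
        alert.dwell_s = 0.0   # looking away resets the timer
    return alert

if __name__ == "__main__":
    alert = Alert(100, 100, 200, 50)
    for _ in range(3):
        update_alert(alert, 150, 120, dt=0.25)
    print(alert.dismissed)
```

Logging the same dwell times per screen region, without dismissing anything, gives a rough picture of which visual features or avatars a trainee actually looks at.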

Game Use Case – Adaptive UI/Rendering

Interface design, responsive UI, dynamic rendering, VR

Advantages (Eyetracking)

• More realistic engagement with environment

  • I don’t always turn my head to look at things in real life

• Adaptive sight-based LOD (also called foveated rendering) saves time optimizing a game for specific hardware, squeezes more out of an engine

  • Saves $

  • Increase scalability of virtual environments/VR

• VR Nausea (Xiao, SparseLightVR)

  • Player retention, $

• Assessment of your end product

  • Used in supermarkets to assess product placement – assess UI speed/accuracy with eyetracking
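
The sight-based LOD idea can be sketched as picking a level of detail from the angle between the gaze direction and the direction to an object. This is a minimal illustration, not any engine's API: the function names and the 5/15/30 degree thresholds are hypothetical, and a real renderer would also account for distance and tracker latency.

```python
import math

def eccentricity_deg(gaze_dir, object_dir):
    """Angle in degrees between two non-zero 3D direction vectors."""
    dot = sum(g * o for g, o in zip(gaze_dir, object_dir))
    norm = math.hypot(*gaze_dir) * math.hypot(*object_dir)
    cos_angle = max(-1.0, min(1.0, dot / norm))   # clamp rounding error
    return math.degrees(math.acos(cos_angle))

def select_lod(angle_deg, thresholds=(5.0, 15.0, 30.0)):
    """LOD 0 is full detail; each threshold exceeded drops one level."""
    for lod, limit in enumerate(thresholds):
        if angle_deg <= limit:
            return lod
    return len(thresholds)

if __name__ == "__main__":
    gaze = (0.0, 0.0, 1.0)            # looking straight ahead
    for obj in [(0.0, 0.0, 1.0), (0.2, 0.0, 1.0), (1.0, 0.0, 0.0)]:
        angle = eccentricity_deg(gaze, obj)
        print(round(angle, 1), select_lod(angle))
```

Objects near the gaze point keep full detail while peripheral ones are rendered more cheaply, which is where the savings on constrained hardware would come from.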

Training Use Case – Alerting/Augmentation (Vitals, Fatigue, HR, BP, etc.)

Advantages (Multi-Modal Monitoring)

• Stress/fatigue system response saves lives

• Build trust with a system performing life-critical tasks during training

• Perform report generation faster

• Increase instructor’s responsiveness to trainee’s physical state

• Enable the next generation of learning (ML) approaches to assessment, classification and decision support by collecting this data now

Game Use Case – fNIR, Learning Rate, Stress

Adaptive difficulty, player assessment

SecondSpectrum

Mars Game

• Web-based

• Blockly

• Pre-calculus, Trig, Programming

• 3-4 Hours of content

• Open Source

• Covers 9th and 10th grade math and programming concepts and aligns to the Common Core State Standards for Mathematics

• Control: Typical written examinations

• Result: More engaged, learned at or slightly above their peers’ rate

• Did not use fNIR but would be very interested to test in this domain

Advantages

• Player washout/adeptness at task

  • eSports roster selection, faster identification of talent, $

• Richer live analysis of players – more for announcers to comment on

  • Did you see that champion pop out of the bush or not?

  • More engaged spectators, $

• Is this task hard enough?

  • Adaptive difficulty engages and retains players, $

Brief Plug for Machine Learning (“Fast Statistics”)

• Second- and third-order insights that we don’t even know about or that are hard to generate (Boyd)

  • Washout

• Collect relevant contextual information and train the trainer (Lamb)

Not just questions… (Discussion, Concerns, Ideas, Requests, Sentiment Analysis)

Education
