Activity and Gesture Sensing
Qian Zhang
Activity Recognition
• "Activity recognition aims to recognize the actions and goals of one or more agents from a series of observations on the agents' actions and the environmental conditions." [1]
• Computer takes raw data as input and recognizes the user’s motion activity
[1] http://en.wikipedia.org/wiki/Activity_recognition
Activities
Process of activity recognition
Sensor Classification
• Video sensors
• Installed at fixed locations
• Passive monitoring
• May raise privacy concerns
• Environment sensors
• WiFi, Bluetooth, Infrared sensors
• Fixed location, limited information
• Wearable sensors
• Equipped on human body
• Smartphones
Mobile Phone Sensors
Accelerometer
Compass
Gyroscope
Barometer
Data Collection
• Sensors
• Single type of sensor
• Single accelerometer
• Multiple accelerometers
• Multi modality
• Different combinations
Data Collection
• Locations
• Single location
• Multiple locations
Preprocessing
Segmentation
[6] L. Wang, T. Gu, X. Tao, and J. Lu. A hierarchical approach to real-time activity recognition in body sensor networks. Pervasive and Mobile Computing, 2012.
[7] N. Ravi, N. Dandekar, P. Mysore, and M. L. Littman. Activity recognition from accelerometer data. In AAAI, 2005.
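A common realization of this step is a fixed-length sliding window with overlap over the raw sensor stream; the sketch below is a minimal Python illustration (the window length, sampling rate, and 50% overlap are illustrative assumptions, not values taken from [6] or [7]).

```python
import numpy as np

def sliding_windows(signal, window_len=128, overlap=0.5):
    """Split a 1-D sensor stream into fixed-length, overlapping windows.

    window_len=128 samples (about 2.5 s at 50 Hz) with 50% overlap is a
    common choice in this literature, but the values here are illustrative.
    """
    step = int(window_len * (1 - overlap))
    return np.array([signal[s:s + window_len]
                     for s in range(0, len(signal) - window_len + 1, step)])

# Example: 10 s of 50 Hz accelerometer-magnitude samples
acc = np.random.randn(500)
print(sliding_windows(acc).shape)  # (num_windows, 128)
```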
Feature Computation
[8] A. M. Khan, Y.-K. Lee, S. Y. Lee, and T.-S. Kim. A triaxial accelerometer-based physical-activity recognition via augmented-signal features and a hierarchical recognizer. Information Technology in Biomedicine, 2010.
Feature Computation
[9] N. Ravi, N. Dandekar, P. Mysore, and M. L. Littman. Activity recognition from accelerometer data. In AAAI, 2005.
[10] L. Bao and S. S. Intille. Activity recognition from user-annotated acceleration data. In Pervasive Computing, Springer, 2004.
[11] J. R. Kwapisz, G. M. Weiss, and S. A. Moore. Activity recognition using cell phone accelerometers. ACM SIGKDD Explorations Newsletter, 2011.
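Typical time-domain features in these works include per-axis mean, standard deviation, energy, and inter-axis correlation; a minimal sketch (the exact feature set varies across [8]–[11]):

```python
import numpy as np

def window_features(window):
    """Basic features for one window of 3-axis accelerometer data
    (shape: samples x 3): per-axis mean, standard deviation, energy,
    and pairwise inter-axis correlation."""
    feats = list(window.mean(axis=0))           # per-axis mean
    feats += list(window.std(axis=0))           # per-axis std deviation
    feats += list((window ** 2).mean(axis=0))   # per-axis energy
    for i, j in [(0, 1), (0, 2), (1, 2)]:       # xy, xz, yz correlation
        feats.append(np.corrcoef(window[:, i], window[:, j])[0, 1])
    return np.array(feats)

# Example: one 128-sample window of synthetic 3-axis data
print(window_features(np.random.randn(128, 3)).shape)  # (12,)
```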
Classification
• Base-level Classification
• Meta-level Classification
Base-level Classification
• Decision Tree
• Decision Table
• K-Nearest Neighbor (KNN)
• Hidden Markov Model (HMM)
• Support Vector Machine (SVM)
• Naïve Bayes, Artificial Neural Networks, Gaussian Mixture Model, etc.
Base-level Classification
Other Classifiers
• Gaussian Mixture Model
• Artificial Neural Network
• Naïve Bayes
• Rule-based Classifier
• Fuzzy Inference
Meta-Level Classification
• Voting
• Each base-level classifier casts a vote; the class label receiving the most votes is the final decision (see the sketch after this list)
• Stacking
• A learning algorithm learns how to combine the predictions of the base-level classifiers
• Cascading
• Iterative process to combine base-level classifiers
• Sub-optimal compared to the others
• Activity classification
• Benefits from more than one base-level classifier
• Feature selection
• Choose the most discriminative feature subset
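As a concrete illustration of the voting scheme above, scikit-learn's VotingClassifier combines base-level classifiers by majority vote; a minimal sketch (the synthetic dataset and classifier settings are placeholders, not from the slides):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for (feature vector, activity label) pairs
X, y = make_classification(n_samples=300, n_features=12, n_informative=6,
                           n_classes=3, random_state=0)

# Majority vote over three base-level classifiers
voter = VotingClassifier(estimators=[
    ('tree', DecisionTreeClassifier(random_state=0)),
    ('knn', KNeighborsClassifier(n_neighbors=5)),
    ('svm', SVC()),
], voting='hard')
voter.fit(X, y)
print(voter.score(X, y))  # training accuracy of the ensemble
```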
Meta-Level Classification
Challenges
• Challenges
• Subject Sensitivity
• Location Sensitivity
• Activity Complexity
• Energy and Resource Constraints
• Data Sparsity
Subject Sensitivity
[13] J. Lester, T. Choudhury, and G. Borriello. A practical approach to recognizing physical activities.
[14] W.-Y. Deng, Q.-H. Zheng, and Z.-M. Wang. Cross-person activity recognition using reduced kernel extreme learning machine.
Location Sensitivity
[15] Y. E. Ustev, O. Durmaz Incel, and C. Ersoy. User, device and orientation independent human activity recognition on mobile phones: challenges and a proposal.
[16] J.-g. Park, A. Patel, D. Curtis, S. Teller, and J. Ledlie. Online pose classification and walking speed estimation using handheld devices.
Activity Complexity
[17] J. Lester, T. Choudhury, and G. Borriello. A practical approach to recognizing physical activities.
[18] M. Khan, S. I. Ahamed, M. Rahman, and R. O. Smith. A feature extraction method for real-time human activity recognition on cell phones.
Energy and Resource Constraints
[19] Z. Yan, V. Subbaraju, D. Chakraborty, A. Misra, and K. Aberer. Energy-efficient continuous activity recognition on mobile phones: An activity-adaptive approach.
[20] Y. Liang, X. Zhou, Z. Yu, and B. Guo. Energy-efficient motion-related activity recognition on mobile devices for pervasive healthcare.
Data Sparsity
[21] D. Guan, W. Yuan, Y.-K. Lee, A. Gavrilov, and S. Lee. Activity recognition based on semi-supervised learning.
[22] M. Mahdaviani and T. Choudhury. Fast and scalable training of semi-supervised CRFs with application to activity recognition. In NIPS, volume 20, 2007.
[23] J. Xie and M. S. Beigi. A scale-invariant local descriptor for event recognition in 1D sensor signals. In Proc. Int. Multimedia and Expo Conf., 2009.
Applications
• Daily Life Monitoring
• Personal Biometric Signature
• Elder and Youth Care
• Localization
• Industry Manufacturing Assisting
Daily Life Monitoring
Personal Biometric Signature
Elder and Youth Care
Localization
• Assist GPS
• Indoor vertical information: 9-1-1 caller location inference, daily routine logging
• GPS signal in the city: weak!
• Augmented localization based on context detected using activity information
• E.g., a theater and a restaurant may be too close for GPS to tell apart, but the activities in each place have their own signature
Industry Manufacturing Assisting
• To assist workers in their daily job
http://www.wearitatwork.com/home/
Agenda
01 Wearable based sensing
02 Wireless radio based sensing
03 Smartphone based sensing
04 Acoustic based sensing
05 Visible light based sensing
I am a Smartwatch and I can Track my User’s Arm
Sheng Shen, He Wang, Romit Roy Choudhury
Understanding human arm motion
How is the arm moving? → 3D co-ordinates
What is the meaning of this motion? → Eating, sleeping, playing, running
What is the meaning of this motion?
Gesture Recognition
Running, Smoking, Drinking, Driving
[WearSys'15] [MobiSys'14] [Rithmio] [WaterMinder]
How is the arm moving? → Posture Tracking (3D co-ordinates)
What is the meaning of this motion? → Gesture Recognition
Arm Posture Tracking - Applications
• Motion Gaming
• Healthcare
• Sports Analytics
• Natural User Interface
Can we track arm postures with a smartwatch alone?
3D arm postures with only a smartwatch
• What is inside a smartwatch?
• Accelerometer: acceleration along 3 axes (X, Y, Z)
• Gyroscope: rotation speed around 3 axes
• Compass: North vector projected to 3 axes
Can we track arm postures with a smartwatch alone?
• What do we need to track to get posture?
Posture = < Elbow Location, Wrist Location, Wrist Rotation >
• Elbow Location: a 3D sphere around the shoulder (DoF: 2)
• Wrist Location: a 3D sphere around the elbow (DoF: 2)
• Wrist Rotation: a 1D angle (DoF: 1)
Can we track arm postures with a smartwatch alone?
Smartwatch = < Accelerometer, Gyroscope, Compass > → ? → Posture = < Elbow Location, Wrist Location, Wrist Rotation >
Approach 1: Dead Reckoning
• Step 1: Sensor fusion of gyroscope, accelerometer, and compass yields wrist orientation over time (roll, pitch, yaw)
Approach 1: Dead Reckoning
• Step 1: Sensor fusion of gyroscope, accelerometer, and compass yields wrist orientation over time
• Step 2: Using the wrist orientation, project accelerometer data to the global frame, remove gravity to get linear acceleration, then integrate twice: acceleration → velocity → wrist location
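Step 2 is plain double integration, which is why the location error grows without bound: any small acceleration bias integrates into a linearly growing velocity error and a quadratically growing location error. A minimal sketch (assuming Step 1 supplies orientation as rotation matrices; all values are illustrative):

```python
import numpy as np

GRAVITY = np.array([0.0, 0.0, 9.81])  # world-frame gravity (m/s^2)

def dead_reckon(acc_body, R_world, dt=0.01):
    """Double-integrate accelerometer samples into a wrist track.

    acc_body: (T, 3) accelerometer readings in the watch frame
    R_world:  (T, 3, 3) rotation matrices from Step 1's sensor fusion
    Any bias in acc_body makes the returned locations drift
    quadratically with time, which is the failure shown next.
    """
    vel = np.zeros(3)
    pos = np.zeros(3)
    track = []
    for a_b, R in zip(acc_body, R_world):
        a_lin = R @ a_b - GRAVITY   # global frame, gravity removed
        vel += a_lin * dt           # first integration: velocity
        pos += vel * dt             # second integration: location
        track.append(pos.copy())
    return np.array(track)
```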
Approach 1: Results
[Plots: wrist location error and wrist orientation error over time]
• Wrist orientation error is okay…
• Wrist location error goes unbounded!
Double integration won’t work in unconstrained space
Approach 2: State Estimation from Bayesian Filters
• Posture space is indeed constrained
• Why not estimate 5 unknowns directly?
Observations: < Accelerometer, Gyroscope, Compass > → Posture
State = < Posture, d(Posture)/dt, d²(Posture)/dt² >
Dimension: 5 + 5 + 5 = 15
Posture = < Elbow Location, Wrist Location, Wrist Rotation >
DoF: 5 = 2 + 2 + 1
The search space is too large for efficient state estimation
Need to find a way to reduce search space!
Reducing search space
What happens if wrist orientation is fixed… (e.g., forearm pointing upward, palm facing towards yourself)
• Elbow Point Cloud: a subset of the elbow sphere
• Wrist Point Cloud: a shift of the elbow point cloud, along the forearm direction
• For a fixed wrist orientation, arm posture space is small!
• This is promising, as we already estimate wrist orientation reasonably well…
• But how can we derive this point cloud for each wrist orientation?
Human Arm Model
Shoulder: θ1, θ2, θ3
θ1: [−60° ~ 180°]
θ2: [−40° ~ 120°]
θ3: [−30° ~ 120°]
Human Arm Model
Shoulder: θ1, θ2, θ3; Elbow: θ4, θ5
θ1: [−60° ~ 180°]
θ2: [−40° ~ 120°]
θ3: [−30° ~ 120°]
θ4: [0° ~ 150°]
θ5: [0° ~ 180°]
Elbow Location = f(θ1, θ2) = l_u · ( cos θ2 sin θ1, sin θ2, −cos θ1 cos θ2 ), where l_u is the upper-arm length
Wrist Location = g(θ1, θ2, θ3, θ4)
Wrist Orientation = a function of (θ1, θ2, θ3, θ4, θ5)
Human Arm Model
Shoulder: θ1, θ2, θ3; Elbow: θ4, θ5
Elbow Location = f(θ1, θ2)
Wrist Location = g(θ1, θ2, θ3, θ4)
Wrist Orientation = a function of (θ1, θ2, θ3, θ4, θ5)
N-N Mapping
1-N Mapping for each orientation
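The elbow location then follows directly from the two shoulder angles; a minimal forward-kinematics sketch (the component order is reconstructed from the slide's vector, and the upper-arm length l_u = 0.30 m is an illustrative value):

```python
import numpy as np

def elbow_location(theta1, theta2, l_u=0.30):
    """Elbow position (m) in the shoulder frame for shoulder angles
    theta1, theta2 (radians); l_u is the upper-arm length (0.30 m is
    an illustrative value)."""
    return l_u * np.array([
        np.cos(theta2) * np.sin(theta1),
        np.sin(theta2),
        -np.cos(theta1) * np.cos(theta2),
    ])

# theta1 = theta2 = 0 puts the elbow straight below the shoulder
print(elbow_location(0.0, 0.0))  # [0. 0. -0.3]
```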
Orientation – Location Mapping
Wrist Orientation (θ1, θ2, θ3, θ4, θ5) → Elbow Location f(θ1, θ2), Wrist Location g(θ1, θ2, θ3, θ4)
1-N Mapping for each orientation
Video: Point Cloud Tracking
[Video panels: RGB video, Kinect ground truth, elbow/wrist point clouds]
How large are the point clouds?
Fraction = Area of Red / Area of Sphere
Median: 8.3%
How large are the point clouds?
Since they are small, what if we simply take an average?
Median Error: (12.0cm, 13.3cm)
What happens if wrist orientation doesn’t change?
Elbow Location Inferencing
Possible elbow locations at each time step t
Accelerometer → wrist acceleration (projection to the global frame and gravity removal, using wrist orientation) → wrist-to-elbow acceleration (using forearm rotation) → elbow acceleration
Elbow Location Inferencing
• At each time step t, the possible elbow locations form the candidate states
• Each state holds the elbow's location and velocity; the inferred elbow acceleration scores the transition from one state to the next (State 1 → State 2)
HMM & Viterbi give the optimal solution!
Viterbi Decoding
T time steps × N possible elbow locations
State (2 locations) → velocity; acceleration (transition) → transition probability
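A minimal Viterbi sketch over N discretized elbow locations; how the emission and transition scores are computed from the point clouds and the elbow acceleration is abstracted into two placeholder arrays:

```python
import numpy as np

def viterbi(log_emit, log_trans):
    """Most likely elbow-location sequence.

    log_emit:  (T, N) log-score of each of N candidate locations per step
    log_trans: (N, N) log transition scores (how well a move matches the
               inferred elbow acceleration)
    """
    T, N = log_emit.shape
    score = log_emit[0].copy()
    back = np.zeros((T, N), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + log_trans   # cand[prev, next]
        back[t] = cand.argmax(axis=0)       # best predecessor per state
        score = cand.max(axis=0) + log_emit[t]
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):           # backtrack
        path.append(back[t, path[-1]])
    return path[::-1]
```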
Evaluation
Median Error: (7.9cm, 9.2cm)
Elbow: 12.0cm → 7.9cm
Wrist: 13.3cm → 9.2cm
Video: Write in the Air
Limitation
• Facing direction
– Need to express arm posture in torso coordinate system
• Tracking on the move
– Body motion will pollute the accelerometer signal
• Speed
– Needs improvement
Conclusion
• Tracking arm postures using motion sensors on a smartwatch alone
• Fusing observations from human kinematics and IMU sensors into an HMM
• <7.9cm, 9.2cm> tracking error for <elbow, wrist>
Abhinav Parate, Meng-Chieh Chiu, Chaniel Chadowitz, Deepak Ganesan, Evangelos Kalogerakis
University of Massachusetts, Amherst
Recognizing Smoking Gestures with Inertial Measurement Units (IMUs)
According to CDC, smoking is responsible for
440,000 deaths in the United States
$96 billion in medical costs
$97 billion in lost productivity
Over a billion smokers worldwide!
Smoking
Smoking Cessation
40% of smokers try to quit each year.
Most efforts end in relapse: less than a 10% success rate.
Well-timed interventions help!
Requires presence of a ubiquitous agent
• Smartphone
– Always with the user
– Can sense user environment
– Real-time intervention
• Wristband
– Equipped with 9-axis Inertial Measurement Unit (IMU)
– Real-time smoking detection
RisQ: A Mobile Solution for Intervention
Hand-to-mouth gesture characteristics
IMU Signals for various hand-to-mouth gestures
[Figure: each gesture shows the same phases (object towards mouth, dwell time, return to normal) for (a) smoking when standing still, (b) smoking when walking, (c) eating with a spoon, (d) drinking from a cup]
1: Orientation-dependent Characteristics
Signal characteristics change with the user's body orientation: the same smoking gesture produces different characteristics when the user faces in opposite directions
2: Unknown Gesture Boundaries
Where does a gesture start? How to identify gesture boundaries in a passive manner?
3: Collecting labels for training
How to collect fine-grained labels for training a classification model?
Outline
• Introduction
• Challenges
• Data Collection using IMUs
• Data Processing Pipeline
• Evaluation
• Conclusion
IMU signal: Background
• Quaternion – a mathematical entity that represents the orientation of an object in 3D space
– q = q_s + q_x i + q_y j + q_z k
– One scalar and 3 imaginary components
• For a rotation by angle a about an axis (x, y, z):
q = cos(a/2) + x sin(a/2) i + y sin(a/2) j + z sin(a/2) k
3D coordinates using Quaternions
• Point p w.r.t. the IMU's local frame of reference
• IMU device orientation in the form of a quaternion q
• Coordinates of p w.r.t. the world frame of reference:
q = cos(a/2) + x sin(a/2) i + y sin(a/2) j + z sin(a/2) k
q′ = cos(a/2) − x sin(a/2) i − y sin(a/2) j − z sin(a/2) k
p′ = q · p · q′
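The rotation p′ = q·p·q′ is two Hamilton products; a minimal numpy sketch (function names are mine):

```python
import numpy as np

def quat_mul(a, b):
    """Hamilton product of quaternions given as (s, x, y, z)."""
    s1, x1, y1, z1 = a
    s2, x2, y2, z2 = b
    return np.array([
        s1*s2 - x1*x2 - y1*y2 - z1*z2,
        s1*x2 + x1*s2 + y1*z2 - z1*y2,
        s1*y2 - x1*z2 + y1*s2 + z1*x2,
        s1*z2 + x1*y2 - y1*x2 + z1*s2,
    ])

def rotate(p, q):
    """Rotate point p from the IMU's local frame to the world frame:
    p' = q . p . q', with q' the conjugate of unit quaternion q."""
    q_conj = q * np.array([1, -1, -1, -1])
    p_quat = np.concatenate(([0.0], p))
    return quat_mul(quat_mul(q, p_quat), q_conj)[1:]

# A 90-degree rotation about z maps the x-axis onto the y-axis
q = np.array([np.cos(np.pi / 4), 0.0, 0.0, np.sin(np.pi / 4)])
print(rotate(np.array([1.0, 0.0, 0.0]), q))  # ~ [0, 1, 0]
```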
Wrist Trajectory using Quaternions
Visualizing gestures using a wristband and an armband equipped with IMUs
Outline
• Introduction
• Challenges
• Data Collection using IMUs
• Data Processing Pipeline
• Evaluation
• Conclusion
Pipeline: Segment Extraction → Feature Extraction → Gesture Classification → Session Detection
• Wrist trajectory computed relative to the elbow
• Peak detection algorithm
• Executed on the phone
[Figure: quaternion stream (q_s, q_x, q_y, q_z) → Wrist Trajectory Computation → Trajectory Smoothing → Wrist's Rest-point Computation → Segment]
Pipeline: Segment Extraction → Feature Extraction → Gesture Classification → Session Detection
• Orientation-Independent Features: a set of 34 spatio-temporal features
• Duration-based features (4): gesture duration, time to raise arm, etc.
• Velocity-based features (6): maximum wrist speed, etc.
• Displacement-based features (6): vertical displacement, XY displacement, etc.
• Angle-based features (18): angle with gravity, angular velocity, etc.
[Figure: Gesture Recognition Layer: per-segment features (average speed, vertical displacement, angular twist, gesture duration) feed supervised gesture classification; Session Detection Layer: a Conditional Random Field model over the features of segments N−1, N, N+1]
Pipeline: Segment Extraction → Feature Extraction → Gesture Classification → Session Detection
Outline
• Introduction
• Challenges
• Data Collection using IMUs
• Data Processing Pipeline
• Evaluation
• Conclusion
Evaluation Dataset
– Dataset
• 28 hours of data from 15 volunteers
• 17 smoking sessions (369 puffs)
• 10 eating sessions (252 food bites)
• 6 drinking sessions
Smoking Session Detection
Leave-one-session-out Cross-validation
Statistic                      Avg ± Std Dev
Duration of smoking sessions   326.21 ± 19.65 s
Error in estimation            65.7 ± 30.6 s
Smoking Gesture Recognition
– 10-fold Cross-validation
• 369 puffs
• 252 bites
• 4976 other gestures
Mechanism        Accuracy   Recall   Precision   FPR
Random Forests   93.00%     0.85     0.72        0.023
CRF              95.74%     0.81     0.91        0.005
CRF improves precision at the cost of a slight drop in recall
User Study
– Recruited 4 subjects for 3 days.
– Used our smoking detection app developed for Android OS.
User Study
Fewer than 2 false positives per day! Rarely missed any smoking session.
Conclusion
• An algorithm to recognize hand-gestures using a wristband – Demonstrated an application to detect smoking in real-time.
• Smartphones in conjunction with wearable accessories present a great platform to sense health-related behaviors like smoking, eating, and so on.
• Remarkable opportunity to create effective intervention strategies using smartphones.
• Software/Code available at: – http://people.cs.umass.edu/~aparate/risq.html
Your Glasses Know Your Diet
Dietary Monitoring
Q. Huang, W. Wang, and Q. Zhang, "Your Glasses Know Your Diet: Dietary Monitoring using Electromyography Sensors", to appear in IEEE Internet of Things Journal, 2017.
Eating Problems in Modern Society
• Skip meals when tied up in meetings or hurrying to meet a deadline
• Eat too fast, without chewing the food thoroughly
• Emotional eating: turning to food not for hunger but for comfort and stress relief
Diet awareness is important for patients with diabetes, obesity, or cardiovascular disease
Logging food by scanning barcode
Tedious human labor; users tend to skip logging unhealthy food
Logging food by life-logging camera
Privacy invasion; may offend nearby people
Monitoring Eating Behavior
Your Glasses Know When You Eat
• When we wear glasses, the lower part of the temporalis muscle is exactly where the glasses touch the skull
• The temporalis muscle is one of the mastication muscles
Chung et al., A glasses-type wearable device for monitoring the patterns of food intake and facial activity, Scientific Reports 7, Article number: 41690 (2017).
One bite
Challenges
• Acquisition of surface EMG signals: normally requires adhesive electrodes
– Much less contact area
– Natural shifting of glasses
• Cross-talk of muscles
– Electrical activity of other muscles
– Confounding actions: speaking, laughing, yawning
• Limited capability of the on-board microcontroller
– Requires a real-time, coarse-grained algorithm
Your Glasses Know When and What You Eat
• 96% accuracy for counting the number of chewing cycles
• 70-90% accuracy for classifying around 10 types of food
Intake Detection: we can use electromyography (EMG) sensors to capture the muscle activity of the temporalis, so as to detect food intake.
Food Classification: the electrical activity level of the mastication muscles, the chewing cycle duration, the number of chewing cycles, etc. are closely related to food texture. We use a J48 decision tree for food classification (a rough sketch follows).
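J48 is Weka's C4.5 decision-tree learner; as a rough stand-in, the same idea can be sketched with scikit-learn's DecisionTreeClassifier (the feature names, values, and food labels below are illustrative assumptions, not the paper's dataset):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Illustrative per-episode features:
# [EMG activity level, mean chewing-cycle duration (s), # chewing cycles]
X = np.array([
    [0.8, 0.55, 42],   # hard texture (e.g., carrot)
    [0.3, 0.70, 15],   # soft texture (e.g., banana)
    [0.6, 0.60, 30],   # intermediate (e.g., bread)
])
y = ['carrot', 'banana', 'bread']

clf = DecisionTreeClassifier()   # rough analogue of Weka's J48 (C4.5)
clf.fit(X, y)
print(clf.predict([[0.75, 0.58, 40]]))  # predicted food label
```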
Agenda
01 Wearable based sensing
02 Wireless radio based sensing
03 Smartphone based sensing
04 Acoustic based sensing
05 Visible light based sensing
Introduction to SmartPhone Sensing
• A Survey of Mobile Phone Sensing
Nicholas D. Lane, Emiliano Miluzzo, Hong Lu, Daniel Peebles, Tanzeem Choudhury, and Andrew T. Campbell
Dartmouth College
Devices use sensors to drive the user experience:
• Phone usage: light sensor – screen dimming; proximity sensor – phone usage
• Content capture: camera – image/video capture; microphone – audio capture
• Location, mapping: GPS – global location; compass – global orientation
• Device orientation: accelerometer & gyroscope – local orientation
Classifying Activities
• Sensors can also collect data about users and their surroundings
• Accelerometer data can be used to classify a user's movement:
– Running
– Walking
– Stationary
• Combining motion classification with GPS tracking can recognize the user's mode of transportation:
– Subway, bike, bus, car, walk…
• Phone cameras can be used to track eye movements across the device for accessibility
• Microphone can classify surrounding sound into a particular context:
– Using an ATM
– Having a conversation
– Driving
– Being in a particular coffee shop
Custom Sensors
• Device sensors are becoming common, but lack special capabilities desired by researchers:
– Blood pressure, heart rate, EEG
– Barometer, temperature, humidity
– Air quality, pollution, carbon monoxide
• Specialized sensors can be embedded into peripherals:
– Earphones
– Dockable accessories / cases
– Prototype devices with embedded sensors
Research Applications - Transportation
• Fine-grained traffic information collected through GPS-enabled phones
– MIT VTrack (2009): 25 GPS/WiFi-equipped cars, 800 hours
– Mobile Millennium Project (2008): GPS mobile app, 5000 users, 1 year
– Google Maps keeps GPS history of all users
• Real-time traffic estimates
• Route analysis (19 minutes to home)
• Navigation / route planning
Research Applications – Social Network
Users regularly share events in their lives on social networks. Smart devices can classify events automatically.
Dartmouth's CenceMe project (2008)
• Audio classifier recognizes when people are talking.
• Motion classification to determine standing, sitting, walking, running.
• Server side senses conversations, combines classifications.
Research Applications – Environmental Monitoring
UCLA’s PEIR project (2008)
App uploads GPS signal and motion classification. Server combines data sources:
• GPS traces
• GIS maps
• Weather data
• Traffic data
• Vehicle emission modeling
Presents a Personal Environmental Impact Report:
• CO and PM2.5 emission impact analysis
• PM2.5 exposure analysis
Research Applications – Health
Sensors can be used to track health and wellness.
UbiFit Garden (2007, 3 months)
• App paired with a wearable motion sensor
• Physical activity continuously logged
• Results represented on the phone's background as a garden
• This "glanceable display" improved user participation dramatically
Sensing Scale and Paradigms
Sensing Paradigms
• Participatory sensing
– User takes out the phone to take a reading
– Users engaged in the activity; requires ease of use and incentive
• Opportunistic sensing
– Minimal user interaction
– Background data collection
– Constantly uses device resources
Sensing Scale
• Personal sensing
• Group sensing
• Community sensing
Sensing Scale – Personal Sensing
Personal Sensing
• Tracking exercise routines
• Automated diary collection
• Health & wellness apps
Sensing is for the sole benefit of the user
• High user commitment
• Direct feedback of results
Sensing Scale – Group Sensing
Group Sensing
• Sensing tied to a specific group
• Users share common interest
• Results shared with the group
• Limited access
Example: UCLA's GarbageWatch (2010)
• Users uploaded photos of recycling bins to improve the recycling program on campus
Sensing Scale – Community Sensing
Community Sensing
• Larger scale sensing
• Open participation
• Users are anonymous
• Privacy must be protected
Examples:
• Tracking bird migrations, disease spread, congestion patterns
• Making a noise map of a city from user-contributed sound sensor readings
Sensing Paradigms
User involvement has its own scale:
Manual (participatory) collection
• Better, fewer data points
• User is in the loop on the sensing activity, taking a picture or logging a reading
• Users must have incentive to continue
Automatic (opportunistic) collection
• Lots of data points, but much noisy/bad data
• Users not burdened by the process, more likely to use the application
• Application may only be active when in the foreground
Mobile Phone Sensing Architecture
Sensing applications share common general structure:
• Sense – Raw sensor data collected from device by app
• Learn – Data filtering and machine learning used
• Inform – Deliver feedback to users, aggregate results
Sensing – Mobile Phone as a Sensor
Programmability
• Mobile devices support 3rd party apps (2008+)
• Mixed API and OS support to access sensor data
• GPS sensor treated as black box
• Sensors vary in features across devices (see 5S)
• Unpredictable raw sensor reporting
• Delivering raw data to cloud poses privacy risks
Sensing – Continuous Sensing
Sampling sensors continuously
• Phone must support background activities
• Device resources constantly used
– CPU used to process data – High power sensors (GPS) polled – Radios frequently used to transmit data – Expensive user data bandwidth used – Degrading user’s phone performance will earn your app an uninstall
Continuous sensing is potentially revolutionary, but must be done with care
– Balance data quality with resource usage – Energy efficient algorithms
Sensing – Phone Context
Mobile phones experience the full gamut of unpredictable activity.
• Phone may be in a pocket, in a car, have no signal, be low on battery…
Sensing application must handle any scenario.
• Phone and its user are both constantly multitasking, changing the context of sensor data
Some advances: • Using multiple devices in local sensing networks
• Context inference (running, driving, in laundry)
Learning – Interpreting Sensor Data
Interpreting potentially flaky mobile data requires context modeling. Data may only be valid during certain contexts (running, outdoors…).
• Supervised learning: Data is annotated manually, these classifications improve machine learning.
• Semi/unsupervised learning: Data is wild and unpredictable, algorithms must infer classifications.
• Accelerometer is cheap to poll and helpful to classify general activity (moving/still)
• Microphone can classify audio environments at cost of CPU resources and algorithm complexity
• Involving the user in automatic classification can be helpful, but adds interaction complexity
Learning – Scaling Models
Many statistical analysis models are too rigid for use on mobile devices. Models must be designed to be flexible enough to be effective for N users.
• Adaptive models can query users for classification if needed.
• A user’s social network can help classify data, such as significant locations.
• Hand annotated labels may be treated as soft hints for a more flexible learning algorithm.
• Complex adaptive algorithms bring increased resource usage.
Inform, Share, Persuade
Once data is analyzed, how are results shared with users? How do we close the loop and keep users engaged?
• Sharing – Connecting with web portals to view and compare data
• Personalized Sensing – Targeting advertising to your habits
• Persuasion – Showing progress towards a common goal, encouraging users
• Privacy – Treating user data mindfully
Share
The sensing application must share its findings with the user to keep them engaged and informed.
• Can be tied with web applications (Nike+)
• Form a community around the data
• Allow users to compare and share their data
• Nike+ collects a simple data set (run time and distance) but users are actively engaging in the web portal
Reports & Studies
Applications
Notification
Visualization
Personalized Sensing
A user’s phone can constantly monitor and classify their daily life; the data collected is highly personal.
– Targeted advertising would love to know just when to show you a certain ad
– Your phone can provide personalized recommendations targeted to your location and activity
– A common sensing platform could feed classifications and data to other apps and services
Persuasion
Sensing applications usually involve a common goal, the reason the user is running the app.
• The goal of a persuasive app is to encourage the user to change their behavior
– Improve fitness and physical activity
– Reduce smoking
– Avoid traffic
– Lower carbon emissions
• Provide comparison data to give the user perspective
• Present aggregated community data
• Accurate models of persuasion are needed so that the user feels engaged and moved to change
Privacy
With your phone sensing you and your activity, user privacy is a major concern
• Advertisers place a high price on accurate ad-targeting data, which the sensing app could provide
• User data may include personal details (GPS locations, habits, conversations)
Approaches
• Personal sensing apps can store private data locally, and share selectively
• Group sensing apps gain privacy by limited trusted membership
• Community sensing apps must ensure user privacy is guaranteed
• Raw sensor data can be processed and filtered locally before uploading more anonymous data to the system
Rajalakshmi Nandakumar, Shyam Gollakota, Nathaniel Watson M.D.
Contactless Sleep Apnea Diagnosis on Smartphones
Perils of Sleep Deficiency
Sleep deficiency (e.g. sleep apnea) raises the risks of hypertension, obesity, CVD, stroke, and mortality.
Sleep Apnea Diagnosed With Polysomnography
Polysomnography is expensive (~$4,000), labor-intensive, and cumbersome.
Can we enable contactless sleep apnea diagnosis on a smartphone ?
Introduces the first contactless system that measures breathing signals on a smartphone
Detects sleep apnea events (e.g. central apnea, hypopnea, obstructive apnea) from the breathing signals
Evaluated in a clinical study with 37 patients over 296 hours
ApneaApp
How do we measure breathing signals using a smartphone?
How to diagnose sleep apnea from the breathing signals?
How do we measure breathing signals using a smartphone?
How to diagnose sleep apnea from the breathing signals?
Challenge: Breathing Motion is Very Minute
Radar Approaches Infeasible on Smartphones
Centimeter resolution requires expensive ultra-wideband (1–2 GHz) radio signal processing
Requires custom, expensive multi-antenna hardware
Key Idea: Transform Phone into Active Sonar
The speaker transmits an 18–20 kHz frequency sweep every 10 ms; the reflected signal arrives at the microphones with a time/frequency offset (Δt, Δf) relative to the transmitted signal that encodes the breathing motion Δd.
Achieves 0.71 cm resolution using only 48 kHz acoustic signals!
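A minimal sketch of generating the 18–20 kHz, 10 ms transmit sweep at a 48 kHz sampling rate using scipy (ApneaApp's exact transmit signal design may differ):

```python
import numpy as np
from scipy.signal import chirp

FS = 48_000          # smartphone audio sampling rate (Hz)
T_SWEEP = 0.010      # 10 ms per sweep
t = np.arange(0, T_SWEEP, 1 / FS)

# Linear frequency sweep 18 kHz -> 20 kHz (inaudible to most adults)
tx = chirp(t, f0=18_000, f1=20_000, t1=T_SWEEP, method='linear')

# Play tx in a loop from the speaker; the reflection's offset (dt, df)
# relative to tx encodes the distance to the reflector.
```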
System Design
[Figure: the speaker transmits the 18–20 kHz, 10 ms chirps; reflections from Person 1 and Person 2 arrive at the microphones from different distances]
• Start at zero distance and search for breathing motion at increasing distances
• Stop the search when a 0.2–0.3 Hz periodic signal is found (see the sketch after this slide)
• With multiple people, continue the search up to the maximum distance
ApneaApp’s Processing Algorithm
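As referenced above, a minimal sketch of the 0.2–0.3 Hz periodicity test on one candidate distance bin (FFT-based; treating the bin's time series as sampled once per chirp frame at an assumed 10 Hz is my simplification):

```python
import numpy as np

def is_breathing(bin_signal, fs=10.0, band=(0.2, 0.3)):
    """True if the dominant frequency of this distance bin's time
    series lies in the 0.2-0.3 Hz breathing band."""
    sig = bin_signal - bin_signal.mean()
    spectrum = np.abs(np.fft.rfft(sig))
    spectrum[0] = 0.0                              # ignore any DC residue
    freqs = np.fft.rfftfreq(len(sig), d=1 / fs)
    f_peak = freqs[spectrum.argmax()]
    return band[0] <= f_peak <= band[1]

# A 0.25 Hz sinusoid (15 breaths/min) over 60 s is flagged as breathing
t = np.arange(0, 60, 0.1)
print(is_breathing(np.sin(2 * np.pi * 0.25 * t)))  # True
```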
How do we measure breathing signals using a smartphone?
How to diagnose sleep apnea from the breathing signals?
Peak Detection Algorithm
Simple peak detection results in false peaks.
Use thresholds for the minimum distance between peaks and the minimum peak amplitude (sketch below).
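Both thresholds map directly onto scipy.signal.find_peaks; a minimal sketch with illustrative values (not ApneaApp's actual thresholds):

```python
import numpy as np
from scipy.signal import find_peaks

def breathing_peaks(breath_signal, fs=10.0):
    """Detect one peak per breath, suppressing false peaks with two
    thresholds: a minimum inter-peak distance (2 s here) and a minimum
    peak amplitude (30% of the maximum); both values are illustrative."""
    peaks, _ = find_peaks(
        breath_signal,
        distance=int(2.0 * fs),                     # min distance between peaks
        height=0.3 * float(np.max(breath_signal)),  # min peak amplitude
    )
    return peaks
```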
Apnea Event Detection
1. Central Apnea: distance between consecutive peaks > 10 s
2. Hypopnea: peaks below a threshold, but still periodic
3. Obstructive Apnea: sudden spikes; amplitude increase > 50%
AHI < 5: No Apnea
5 ≤ AHI < 15: Mild Apnea
15 ≤ AHI < 30: Medium Apnea
AHI > 30: Severe Apnea
Diagnosing Sleep Apnea from Breathing Signals
AHI = (Total # of apnea events) / (Total sleep time in hours)
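Combining the AHI formula with the severity grading from the earlier slide, a minimal sketch:

```python
def apnea_severity(num_events, sleep_hours):
    """AHI = apnea events per hour of sleep, graded per the slide
    (an AHI of exactly 30 is treated here as severe)."""
    ahi = num_events / sleep_hours
    if ahi < 5:
        grade = 'No Apnea'
    elif ahi < 15:
        grade = 'Mild Apnea'
    elif ahi < 30:
        grade = 'Medium Apnea'
    else:
        grade = 'Severe Apnea'
    return ahi, grade

print(apnea_severity(num_events=64, sleep_hours=8.0))  # (8.0, 'Mild Apnea')
```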
We Infer Sleep Time from Body Motion
Sleep time = Total duration – Duration of motion
How well does ApneaApp work?
• 5 participants over 10 hours
• Different distances, phone positions, and orientations
• All sleeping positions
• Different blankets
• 2 people on the bed
Breathing rate accuracy of 99.2% at a range of 1 meter
• Harborview sleep center over one month
• 37 patients over 296 hours (17 female, 20 male; ages 23–93)
• Polysomnography as baseline
How Well Can We Detect Apnea Events?
Intraclass Correlation Coefficient (ICC) between ApneaApp and Polysomnography:
Apnea Event         Correlation
Central Apnea       0.9957
Hypopnea            0.9533
Obstructive Apnea   0.9860
False Positive: 59/26070 (0.22%)
False Negative: 43/1281 (3.35%)
Sleep Time Accuracy of ApneaApp
Motion Detection Accuracy
Mean sleep time error is 37 minutes
Sleep Apnea Diagnosis Accuracy
32/37 patients were classified correctly between no, mild, medium and severe apnea
[Scatter plot: AHI (ApneaApp) vs. AHI (Polysomnography), both 0–60]
Correlation = 0.9816
Limitations
• Works only on Android phones with two microphones.
• High battery drain: the phone's speaker and microphone are switched on for 8 hours.
• Environmental effects not explored; limited testing shows rapid accuracy reduction beyond a 1 m radius of the phone's placement.
• Sleep-time error occurs when patients wake up but don't move.
• Not possible to distinguish between equidistant patients.
Introduces the first contactless system that measures breathing signals on a smartphone
Detects sleep apnea events (e.g. central apnea, hypopnea, obstructive apnea) from the breathing signals
Evaluated in a clinical study with 37 patients over 296 hours
Conclusion