IMTC’2002, Anchorage, AK, USA
Robotics Institute, Carnegie Mellon University
IMTC-2002-1077

Sensor Fusion for Context Understanding
Huadong Wu, Mel Siegel
The Robotics Institute, Carnegie Mellon University
Sevim Ablay
Applications Research Lab, Motorola Labs

sensor fusion
• how to combine outputs of multiple sensor perspectives on an observable?
• modalities may be “complementary”, “competitive”, or “cooperative”
• technologies may demand registration
• variety of historical approaches, e.g.:
  – statistical (error and confidence measures)
  – voting schemes (need at least three)
  – Bayesian (probability inference)
  – neural network, fuzzy logic, etc.
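As a small illustration of the “statistical” style listed above, the sketch below (hypothetical code, not from the system described in this talk) fuses two competitive readings of the same observable by inverse-variance weighting, so the sensor with the smaller error estimate dominates the result.

```java
// Hypothetical sketch of statistical fusion: two competitive sensors report
// the same observable, and their readings are combined by inverse-variance
// weighting so the more confident sensor contributes more.
public final class StatisticalFusion {

    /** A sensor reading: a value plus its estimated standard deviation. */
    public record Reading(double value, double sigma) {}

    /** Fuse two independent readings of the same observable. */
    public static Reading fuse(Reading a, Reading b) {
        double wa = 1.0 / (a.sigma() * a.sigma());   // weight = 1 / variance
        double wb = 1.0 / (b.sigma() * b.sigma());
        double value = (wa * a.value() + wb * b.value()) / (wa + wb);
        double sigma = Math.sqrt(1.0 / (wa + wb));   // fused uncertainty shrinks
        return new Reading(value, sigma);
    }

    public static void main(String[] args) {
        // e.g., two thermometers reporting the same room temperature
        Reading fused = fuse(new Reading(72.0, 3.0), new Reading(70.0, 1.5));
        System.out.printf("fused: %.1f +/- %.1f%n", fused.value(), fused.sigma());
    }
}
```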

context understanding
• best algorithm for human-computer interaction tasks depends on context
• context can be difficult to discern
• multiple sensors give complementary (and sometimes contradictory) clues
• sensor fusion techniques needed
• (but the best algorithm for sensor fusion tasks may depend on context!)

agenda
• a generalizable sensor fusion architecture for “context-aware computing”
  – or (my preference, but not the standard term) “context-aware human-computer interaction”
• a realistic test to demonstrate usability and performance enhancement
• improved sensor fusion approach (to be detailed in the next paper)

background
• current context-sensing architectures (e.g., Georgia Tech Context Toolkit) tightly couple sensors and contexts
• difficult to substitute or add sensors, thus difficult to extend scope of contexts
• we describe a modular hierarchical architecture to overcome these limitations

toward context understanding

[Diagram: sensing hardware (cameras, microphones, etc.) observes the environment situation (people in the meeting room, objects around a moving car, etc.); humans understand such context naturally and effortlessly; moving beyond a traditional system of separate sensors, “information separation + sensor fusion” supports identification, representation, and understanding of context, and adapting behavior to context.]

methodology
• top-down
• adapt/extend Georgia Tech Context Toolkit (Motorola helps support both groups)
• create realistic context and sensor prototypes
• implement a practical context architecture for a plausible test application scenario
• implement sensor fusion as a mapping of sensor data into the context database
• place heavy emphasis on real sensor device characterization and (where needed) simulation

context-sensing methodology: sensor data-to-context mapping
[Figure: sensory output is mapped through a matrix of functions into observations and hypotheses about context; each f_ij interprets the output of sensor j as evidence for context attribute i, and the “?” marks the hypothesis status of the result:]

$$
\begin{pmatrix} \text{context}_1 \\ \text{context}_2 \\ \vdots \\ \text{context}_m \end{pmatrix}
\;\overset{?}{=}\;
\begin{pmatrix}
f_{11}(\text{sensor}_1) & f_{12}(\text{sensor}_2) & \cdots & f_{1n}(\text{sensor}_n) \\
f_{21}(\text{sensor}_1) & f_{22}(\text{sensor}_2) & \cdots & f_{2n}(\text{sensor}_n) \\
\vdots & & & \vdots \\
f_{m1}(\text{sensor}_1) & f_{m2}(\text{sensor}_2) & \cdots & f_{mn}(\text{sensor}_n)
\end{pmatrix}
$$
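Read programmatically, each matrix row is a set of per-sensor interpretation functions whose outputs are fused into one context hypothesis. The fragment below is a hypothetical sketch of that reading; the interface and class names are illustrative, not taken from the implementation.

```java
import java.util.List;
import java.util.function.BinaryOperator;

// Hypothetical sketch: f_ij interprets the output of sensor j as evidence
// about context attribute i; the evidence from all sensors is then fused
// into a single hypothesis for that attribute.
interface SensorToContextMapping<S, C> {
    C interpret(S sensorOutput);   // one f_ij
}

final class ContextAttributeEstimator<S, C> {
    private final List<SensorToContextMapping<S, C>> mappings; // one mapping per sensor
    private final BinaryOperator<C> fuse;                      // how evidence is combined

    ContextAttributeEstimator(List<SensorToContextMapping<S, C>> mappings,
                              BinaryOperator<C> fuse) {
        this.mappings = mappings;
        this.fuse = fuse;
    }

    /** Apply every mapping to its sensor's output and fuse the results. */
    C estimate(List<S> sensorOutputs) {
        C hypothesis = mappings.get(0).interpret(sensorOutputs.get(0));
        for (int j = 1; j < mappings.size(); j++) {
            hypothesis = fuse.apply(hypothesis, mappings.get(j).interpret(sensorOutputs.get(j)));
        }
        return hypothesis;
    }
}
```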

dynamic database
• example: user identification and posture for discerning focus-of-attention in a meeting
• tables (next) list basic information about the environment (room) and its parameters, e.g.:
  – temperature, noise, lighting, available devices, number of people, segmentation of area, etc.
  – initially many details are entered manually
  – eventually a fully “tagged” and instrumented environment can reasonably be anticipated
• weakest link: maintaining currency (see the sketch below)
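Because maintaining currency is the weakest link, one plausible safeguard is to timestamp every entry so consumers can distinguish fresh readings from stale, manually entered defaults; the sketch below is a hypothetical illustration of that idea, not the project's actual schema.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical sketch: every dynamic-database entry remembers when it was
// last refreshed, so a reader can decide whether it is still current.
final class ContextEntry<V> {
    private V value;
    private Instant lastUpdated;

    ContextEntry(V initialValue) {
        this.value = initialValue;
        this.lastUpdated = Instant.now();
    }

    synchronized void update(V newValue) {
        this.value = newValue;
        this.lastUpdated = Instant.now();
    }

    synchronized V value() {
        return value;
    }

    /** True if the entry has not been refreshed within the given lifetime. */
    synchronized boolean isStale(Duration maxAge) {
        return Duration.between(lastUpdated, Instant.now()).compareTo(maxAge) > 0;
    }
}
```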

context classification and modeling

context
  space:  outside (environment), inside (person), activity
  time:   now, history, schedule
  social: self, family, friends, colleagues, acquaintances, etc.

outside (environment)
  location: city, altitude, weather (cloudiness, rain/snow, temperature, humidity, barometric pressure, forecast), location and orientation (absolute, relative to …); change: travelling speed, heading
  proximity: close to: building (name, structure, facilities, etc.), room, car, devices (function, states, etc.), …; vicinity temperature, humidity, vibration; change: walking/running speed, heading
  time: day, date; time of the day (office hours, lunch time, …); season of the year, etc.; history, schedule, expectation
  people: individuals or group (e.g., audience of a show, attendees at a cocktail party): people interaction, casual chatting, formal meeting, eye contact, attention arousing; interruption source: incoming calls, encounters, etc.; social relationship
  audiovisual: non-human sound and human talking (information collection), music, etc.; in-sight objects, surrounding scenery; noise level, brightness
  computing & connectivity: computing environment (processing, memory, I/O, etc.; hardware/software resources & cost), network connectivity, communication bandwidth, communication cost

inside (personal information, feeling & thinking, emotional)
  mood: merry, sad, satisfied, …
  agitation / tiredness
  stress: nervousness
  concentration: focus of attention
  preferences: habits, current
  physical information: name, address, height, weight, fitness, metabolism, etc.

activity
  work: task ID; content: work, entertainment, living chores, etc.
  body: drive, walk, sit, …
  vision: read, watch TV, sight-seeing, …; people: eye contact
  aural (listen/talk): interruptable, …
  hands: type, write, use mouse, etc.

information: sensors
  sound processing (speaker recognition, speech understanding): microphones
  image processing (face recognition, object recognition, 3-D object measuring): cameras, infra-red sensors
  location, altitude, speed, orientation: GPS, DGPS, server IP, RFID, gyro, accelerometers, dead-reckoning
  ambient environment: network resources, thermometer, barometer, humidity sensor, photo-diode sensors, accelerometers, gas sensor
  personal physical state (heart rate, respiration rate, blood pressure, blink rate, galvanic skin resistance, body temperature, sweat): biometric sensors (heart rate / blood pressure / GSR, temperature, respiration, etc.)

context information architecture: dynamic context information database

Background-table-user[Hd]
  Name:       Huadong Wu (κ = 1.0)
  Height:     5’6” (σ = 0.5”)
  Weight:     144 lb (σ = 4 lb)
  Preference: Preference-table-user[Hd]
  …

History-table-user[Hd]
  Name: Huadong Wu
  Time                 Place
  9:06AM-10:55AM       NSH 4102
  Preference: Preference-table-user[Hd]
  …

Room-table: NSH A417
  Area:              Area-table
  Detected people #: 6 (κ > 0.5)
  Detected users:    User-table
  Temperature:       72 ºF (σ = 3 ºF)
  Light condition:   Brightness grade
  Noise level:       60 dB (σ = 6 dB)
  Devices:           Device-table
  Current: …

Inside Area (of room NSH A417)
  Temperature:       72 ºF (σ = 3 ºF)
  Light condition:   Brightness grade
  Noise level:       60 dB (σ = 6 dB)
  Devices:           Device-table
  Detected people #: 4 (κ > 0.5)
  Detected user:     User-table
  …

Entrance Area (of room NSH A417)
  Temperature:       72 ºF (σ = 3 ºF)
  Light condition:   Brightness grade
  Noise level:       60 dB (σ = 6 dB)
  Devices:           Device-table
  Detected people #: 2 (κ > 0.5)
  Detected user:     User-table
  …

User-table
  Name   Background                     Place     Confidence   Activity                      First detected        History
  Hd     Background-table-user[Hd]      Entrance  [0.5, 0.9]   Activity-table-user[Hd]       10:32AM, 06/06/2001   History-table-user[Hd]
  Mel    Background-table-user[Mel]     Entrance  [0.3, 0.7]   Activity-table-user[Mel]      11:48AM, 06/06/2001   History-table-user[Mel]
  Alan   Background-table-user[Alan]    Inside    [0.9, 0.98]  Activity-table-user[Alan]     2:48PM, 06/06/2001    History-table-user[Alan]
  Chris  Background-table-user[Chris]   Inside    [0.4, 0.9]   Activity-table-user[Chris]    10:45AM, 06/06/2001   History-table-user[Chris]
  …
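The tables above attach a confidence κ to identity and count entries and a standard deviation σ to measured quantities; a minimal sketch of such annotated values (hypothetical, for illustration only) might look like this:

```java
// Hypothetical sketch of the annotated values used in the tables above:
// measured quantities carry a standard deviation (sigma), while detected
// facts carry a confidence (kappa) in [0, 1].
public final class ContextValues {

    /** A measured quantity, e.g. temperature 72 F with sigma = 3 F. */
    public record Measured(double value, double sigma, String unit) {}

    /** A detected fact, e.g. "6 people detected" with kappa > 0.5. */
    public record Detected<T>(T value, double kappa) {
        public Detected {
            if (kappa < 0.0 || kappa > 1.0) {
                throw new IllegalArgumentException("kappa must be in [0, 1]");
            }
        }
    }

    public static void main(String[] args) {
        Measured roomTemperature = new Measured(72.0, 3.0, "F");
        Detected<Integer> peopleCount = new Detected<>(6, 0.6);
        System.out.println(roomTemperature + " / " + peopleCount);
    }
}
```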

implementation
• low-level sensor fusion done
• sensing and knowledge integration via LAN
• context maintained in a dynamic database
• each significant entity (user, room, house, ...) has its own dynamic context repository
• dynamic repository maintains and serves context and data to all applications
• synchronization and cooperation among disparate sensor modalities achieved by “sensor fusion mediators” (agents)
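A minimal sketch of the per-entity repository idea, assuming mediators write fused estimates and applications read them (names and API are illustrative, not the actual implementation):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: each significant entity (user, room, house, ...) owns
// one dynamic context repository; sensor fusion mediators publish fused
// estimates into it, and applications query it.
final class DynamicContextRepository {
    private final String entityId;                               // e.g., "room:NSH-A417"
    private final Map<String, Object> attributes = new ConcurrentHashMap<>();

    DynamicContextRepository(String entityId) {
        this.entityId = entityId;
    }

    /** Called by a sensor fusion mediator when it has a new fused estimate. */
    void publish(String attribute, Object fusedValue) {
        attributes.put(attribute, fusedValue);
    }

    /** Called by applications that consume context about this entity. */
    Object query(String attribute) {
        return attributes.get(attribute);
    }

    String entityId() {
        return entityId;
    }
}
```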

system architecture to support sensor fusion for context-aware computing
[Diagram: smart sensor nodes (sensors plus lower-level sensor fusion) and appliances running embedded OSs connect over the intranet, with a gateway to the Internet; a context server performs higher-level sensor fusion and, together with a database server, maintains the site context database and the user context database that applications draw on.]

practical details ...
• context type => sensor fusion mediator
• mediator integrates the corresponding sensors, e.g., by designating some primary and others secondary based on observed or specified performance (a sketch follows this list)
• Dempster-Shafer “theory of evidence” implementation in the accompanying paper
• (white-icon components in the following cartoon are inherited from the Georgia Tech CT)
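One plausible reading of the primary/secondary designation, sketched below with hypothetical names: the mediator ranks the sensors reporting a given context type by an observed performance score and treats the best one as primary.

```java
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of one mediator policy: rank sensors reporting a given
// context type by observed performance and designate the best one primary.
final class SensorRanking {

    record SensorScore(String sensorId, double observedPerformance) {}

    /** Returns the sensors sorted so that index 0 is the primary sensor. */
    static List<SensorScore> designate(List<SensorScore> sensors) {
        return sensors.stream()
                .sorted(Comparator.comparingDouble(SensorScore::observedPerformance).reversed())
                .toList();
    }

    public static void main(String[] args) {
        List<SensorScore> ranked = designate(List.of(
                new SensorScore("camera-widget", 0.82),
                new SensorScore("microphone-widget", 0.74),
                new SensorScore("rfid-widget", 0.91)));
        System.out.println("primary: " + ranked.get(0).sensorId());
    }
}
```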

system configuration diagram

[Diagram: sensor widgets feed sensor fusion mediators, which combine widget outputs (e.g., by the Dempster-Shafer rule or other AI algorithms) and pass them to aggregators; aggregators write context data into the dynamic context database on the site context database server; the site context server also hosts interpreters (Dempster-Shafer rule, other AI algorithms), a resource registry, and a discoverer; applications, including those on the user's mobile computer, consume the context data.]

details inherited from the GT CT
• Java implementation
• BaseObject class provides communication functionality: sending, receiving, initiating, and responding via a multithreaded server
• context widgets, interpreters, and discoverers subclass BaseObject and inherit its functionality
• a service is part of the widget object
• aggregators subclass widgets, inheriting their functionality

object hierarchy and subclass relationship in the context toolkit

  BaseObject
    Widget  (owns Service)
      Aggregator
    Interpreter
    Discoverer
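In code, the hierarchy above might be skeletonized as follows; this is a hypothetical outline for orientation, and the real Georgia Tech Context Toolkit classes have much richer interfaces.

```java
// Hypothetical skeleton of the subclass relationships shown above:
// BaseObject supplies the communication machinery, Widget / Interpreter /
// Discoverer specialize it, a Service belongs to a Widget, and Aggregator
// extends Widget.
class BaseObject {
    /** Send a message to another component and return its reply. */
    String request(String host, int port, String message) {
        return "reply";   // multithreaded client/server communication in the real toolkit
    }
}

class Service {
    void execute(String request) { }                 // actuation offered by a widget
}

class Widget extends BaseObject {
    private final java.util.List<Service> services = new java.util.ArrayList<>();

    void addService(Service s) { services.add(s); }  // a Service is part of a Widget
    void notifySubscribers(String contextEvent) { }  // callback delivery to subscribers
}

class Interpreter extends BaseObject {
    String interpret(String lowLevelContext) { return lowLevelContext; }
}

class Discoverer extends BaseObject { }              // registry of live components

class Aggregator extends Widget { }                  // collects all context about one entity
```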

focus-of-attention application
• neural network estimates head poses
• focus-of-attention estimate based on head-pose probability distribution analysis
• audio reports the speaker, who is assumed to be the focus of the other participants’ attention
• situation is not easy to analyze due to, e.g., dependence of behavior on the discussion topic
• initial results suggest we need a more general fusion approach than Bayesian inference provides (a toy sketch of the Bayesian-style baseline follows)
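The toy sketch below (hypothetical, not the project's code) shows the Bayesian-style baseline referred to in the last bullet: the head-pose posterior over focus targets is reweighted by the audio cue that the current speaker is the likely focus, then renormalized; the argument here is that such single-number probabilistic updates can be too restrictive.

```java
import java.util.HashMap;
import java.util.Map;

// Toy sketch of a Bayesian-style combination of the two cues: a head-pose
// probability distribution over focus targets is reweighted toward the
// current speaker and then renormalized.
final class FocusOfAttention {

    static Map<String, Double> combine(Map<String, Double> headPosePosterior,
                                       String currentSpeaker,
                                       double speakerBoost) {
        Map<String, Double> combined = new HashMap<>();
        double total = 0.0;
        for (Map.Entry<String, Double> e : headPosePosterior.entrySet()) {
            double w = e.getValue() * (e.getKey().equals(currentSpeaker) ? speakerBoost : 1.0);
            combined.put(e.getKey(), w);
            total += w;
        }
        final double norm = total;
        combined.replaceAll((target, weight) -> weight / norm);   // renormalize to sum to 1
        return combined;
    }

    public static void main(String[] args) {
        Map<String, Double> headPose = Map.of("Hd", 0.5, "Mel", 0.3, "Alan", 0.2);
        System.out.println(combine(headPose, "Mel", 3.0));
    }
}
```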

next paper ...
• Dempster-Shafer approach ...
• provides a mechanism for handling “belief” and “plausibility”
• cautiously stated, generalizes Bayes’ law from a priori probabilities to distributions
• (the difficulty, of course, is that usually neither the requisite probabilities nor the distributions are actually known)
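For orientation, these are the standard textbook definitions behind “belief” and “plausibility”, together with Dempster's rule for combining two independent bodies of evidence m1 and m2 over a frame of discernment (the details of their use here are in the accompanying paper):

```latex
\mathrm{Bel}(A) = \sum_{B \subseteq A} m(B), \qquad
\mathrm{Pl}(A)  = \sum_{B \cap A \neq \emptyset} m(B)

(m_1 \oplus m_2)(A) = \frac{1}{1 - K} \sum_{B \cap C = A} m_1(B)\, m_2(C),
\quad A \neq \emptyset, \qquad
K = \sum_{B \cap C = \emptyset} m_1(B)\, m_2(C)
```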

focus-of-attention estimation from video and audio sensors

expectations

[Diagram: sensor widgets feed a sensor fusion mediator that combines their observations and hypotheses (e.g., Dempster-Shafer belief combination) under context AI rules; the mediator carries a dynamic configuration: time interval T, sensor list, updating flag, …]
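The mediator's dynamic configuration shown above could be represented as simply as the record below; this is a hypothetical sketch and the field names are illustrative.

```java
import java.time.Duration;
import java.util.List;

// Hypothetical sketch of the mediator's dynamic configuration: the fusion
// interval T, the sensors currently contributing, and a flag marking that
// the configuration is being updated (so sensors can join or die gracefully).
record MediatorConfiguration(Duration fusionInterval,   // time interval T
                             List<String> sensorIds,    // current sensor list
                             boolean updating) {        // updating flag

    static MediatorConfiguration initial() {
        return new MediatorConfiguration(Duration.ofSeconds(1),
                                         List.of("camera-widget", "microphone-widget"),
                                         false);
    }
}
```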
Expected Performance Boost
1. uncertainty & ambiguity representation to user applications
2. information consolidation & conflict resolution for users
3. adaptive sensor fusion: support switching to suitable algorithms
4. robustness to configuration change, allowing some sensors to die gracefully
5. situational description support, using more, and more complex, context

conclusions
• preliminary experiments demonstrate feasibility of the context-sensing architecture and methodology
• we expect further improvements via:
  – better uncertainty and ambiguity handling
  – fusion of overlapping sensors
  – context-adaptive widget capability
  – sensor fusion mediator coordination of resources
  – context information server support for applications