CAMEO: Year 1 Progress and Year 2 Goals Manuela Veloso, Takeo Kanade, Fernando de la Torre, Paul Rybski, Brett Browning, Raju Patil, Carlos Vallespi, Betsy Ricker

CAMEO: Year 1 Progress and Year 2 Goals

Manuela Veloso, Takeo Kanade,

Fernando de la Torre, Paul Rybski, Brett Browning,

Raju Patil, Carlos Vallespi, Betsy Ricker

CAMEO Internals

Raw video is converted to a continuous mosaic

Faces are detected and motions of people are tracked

Action data is logged for analysis and replay

Meeting simulator used for CAMEO learning validation.

Live person actions are used to seed meeting simulation environment.

Detected person motions are used to classify actions

Actions of the group are used to classify global meeting state

Classified actions reported to other agents

CAMEO’s Connection to other CALO Agents

CAMEO is an example of a physical event capture system. Systems such as these transmit state information about people to the CALO timeline server.

Individualized CALO agents can access this information to obtain updates about their individual users.

Inferring Meeting State with CAMEO: Overview

1) CAMEO observes activities of people in meeting

2) Raw visual motion is segmented into discrete actions

3) High-level meeting state is inferred from the aggregate actions of the group

Training CAMEO to Recognize Human Actions

Tracked person horizontal displacement

Tracked person vertical displacement

A series of raw person actions are tracked and recorded by CAMEO. Action data is manually labeled.

Relative displacement means and variances for each action class

Significant statistics of raw actions are extracted. Action data is now represented as a learned generalization.
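The statistics step described above can be sketched roughly as follows. This is only an illustration, assuming each labeled sample is a tuple of action label, horizontal displacement, and vertical displacement; the function name `learn_action_statistics` and the sample values are hypothetical and not taken from CAMEO.

```python
from collections import defaultdict
from statistics import mean, pvariance


def learn_action_statistics(samples):
    """Group labeled displacements by action class and return per-class
    means and (population) variances for the x and y displacement."""
    by_label = defaultdict(list)
    for label, dx, dy in samples:
        by_label[label].append((dx, dy))

    stats = {}
    for label, displacements in by_label.items():
        xs = [d[0] for d in displacements]
        ys = [d[1] for d in displacements]
        stats[label] = {
            "mean": (mean(xs), mean(ys)),
            "var": (pvariance(xs), pvariance(ys)),
        }
    return stats


# Made-up, manually labeled samples: (label, horizontal, vertical).
labeled = [
    ("stand", 0.0, 4.0),
    ("stand", 1.0, 6.0),
    ("sit", 0.0, -5.0),
    ("sit", -1.0, -3.0),
]
stats = learn_action_statistics(labeled)
```

Each action class is then summarized by two means and two variances instead of the raw tracks.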

Action Recognition

Person action sequences are represented as a simple finite state machine.

State transitions are encoded in a dynamic Bayesian network which infers the current person state as a function of observed human activity and previous state.

Dynamic Bayesian Network

Person Action State Machine
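One way to picture this inference is a single filtering step that combines the previous state belief with the current observation. The sketch below assumes a four-state machine (the state names follow the labels on the classification slide), Gaussian observation models over vertical displacement with positive values meaning upward motion, and made-up transition probabilities; none of these numbers come from CAMEO itself.

```python
import math

STATES = ["sit", "standing", "stand", "sitting"]

# Hypothetical transition model P(next state | previous state).
TRANSITIONS = {
    "sit": {"sit": 0.9, "standing": 0.1},
    "standing": {"standing": 0.6, "stand": 0.4},
    "stand": {"stand": 0.9, "sitting": 0.1},
    "sitting": {"sitting": 0.6, "sit": 0.4},
}

# Hypothetical (mean, variance) of vertical displacement for each state.
OBSERVATION = {
    "sit": (0.0, 1.0),
    "standing": (5.0, 4.0),
    "stand": (0.0, 1.0),
    "sitting": (-5.0, 4.0),
}


def gaussian(x, mean, variance):
    """Gaussian probability density at x."""
    return math.exp(-((x - mean) ** 2) / (2 * variance)) / math.sqrt(
        2 * math.pi * variance
    )


def filter_step(belief, dy):
    """Update the state belief given one observed vertical displacement."""
    # Predict: push the previous belief through the transition model.
    predicted = {state: 0.0 for state in STATES}
    for previous, p_previous in belief.items():
        for nxt, p_transition in TRANSITIONS[previous].items():
            predicted[nxt] += p_previous * p_transition
    # Correct: weight each state by the likelihood of the observation.
    weighted = {s: predicted[s] * gaussian(dy, *OBSERVATION[s]) for s in STATES}
    total = sum(weighted.values())
    return {s: w / total for s, w in weighted.items()}


belief = {"sit": 1.0, "standing": 0.0, "stand": 0.0, "sitting": 0.0}
for dy in [0.0, 5.0, 5.0, 0.0, 0.0]:  # still, rising, rising, still, still
    belief = filter_step(belief, dy)
most_likely = max(belief, key=belief.get)
```

Because "sit" and "stand" share the same observation model here, only the transition structure tells them apart, which is the role the state machine plays in the description above.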

Classification of Person State in a Meeting

[Figure: classified person state (Sit, Sitting, Stand, Standing) plotted against time in seconds]

Example of person state classification:

Here, the states of a person are correctly classified by the Bayesian network. The parameters of the activity data are learned from previously recorded meeting data.

Classification of the Meeting State

Global meeting state is defined by the aggregate activities of every person attending the meeting.

An example of a global meeting finite state machine. CAMEO can be set up to recognize different meeting types.

Classification of meeting state using the defined state machine.
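As a rough illustration of how aggregate person states could drive a meeting-level state machine, the sketch below uses three hypothetical meeting states and one hand-written rule (exactly one person standing suggests a presentation). The states, rule, and allowed transitions are assumptions made for the example, not the machine shown on the slide.

```python
# Hypothetical allowed transitions between global meeting states.
ALLOWED = {
    "No Meeting": {"General Discussion"},
    "General Discussion": {"Presentation", "No Meeting"},
    "Presentation": {"General Discussion"},
}


def observe(person_states):
    """Map the current per-person states to the meeting state they suggest."""
    if not person_states:
        return "No Meeting"
    standing = sum(1 for state in person_states.values() if state == "stand")
    return "Presentation" if standing == 1 else "General Discussion"


def step(current, person_states):
    """Advance the meeting state machine by one observation.

    The suggested state is accepted only if the state machine allows the
    transition; otherwise the current state is kept.
    """
    proposed = observe(person_states)
    if proposed == current or proposed in ALLOWED[current]:
        return proposed
    return current


state = "No Meeting"
state = step(state, {"Jim": "sit", "Wendy": "sit"})    # people arrive
state = step(state, {"Jim": "stand", "Wendy": "sit"})  # one person stands
```

Swapping in a different `ALLOWED` table and `observe` rule is one way such a setup could be adapted to other meeting types.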

Generating Meeting Summary

• Meeting event log becomes summary

• Low- and high-level events can be organized into a hierarchy

• Meeting can be viewed at any requested level of detail from summary to captured video (and eventually audio)

2004-02-03 Project Status Report

13:04:05 Meeting Start

13:12:12 General Discussion

13:19:45 Presentation

13:24:23 General Discussion

13:29:29 Meeting End

Generating Meeting Summary

• Meeting event log becomes summary

• Low- and high-level events can be organized into a hierarchy

• Meeting can be viewed at any requested level of detail from summary to captured video (and eventually audio)

2004-02-03 Project Status Report

13:04:05 Meeting Start
13:12:12 General Discussion
13:19:45 Presentation
    13:19:45 Jim stands
    13:19:50 Jim walks to podium
    13:20:00 Jim speaks
    13:22:04 *Unknown* speaks
    13:22:45 Jim speaks
    13:30:23 Wendy stands
    13:30:37 Wendy walks to podium
    13:30:42 Wendy speaks
    13:33:04 Wendy sits down
    13:33:04 Jim speaks
    13:38:50 Jim sits down
13:40:23 General Discussion
13:50:29 Meeting End
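The idea of viewing the same log at different levels of detail can be sketched with a small tree of events. The `Event` class and `render` function below are hypothetical names chosen for this illustration; the sample entries are taken from the log above.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    time: str
    label: str
    children: list = field(default_factory=list)  # lower-level events


def render(events, max_depth, depth=0):
    """Return log lines, descending at most max_depth levels into the tree."""
    lines = []
    for event in events:
        lines.append("  " * depth + f"{event.time} {event.label}")
        if depth < max_depth:
            lines.extend(render(event.children, max_depth, depth + 1))
    return lines


log = [
    Event("13:04:05", "Meeting Start"),
    Event("13:12:12", "General Discussion"),
    Event("13:19:45", "Presentation", [
        Event("13:19:45", "Jim stands"),
        Event("13:19:50", "Jim walks to podium"),
        Event("13:20:00", "Jim speaks"),
    ]),
]

summary = render(log, max_depth=0)   # high-level events only
detailed = render(log, max_depth=1)  # includes person-level events
```

Calling `render` with `max_depth=0` gives the summary view, while larger values expose progressively lower-level events.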

Protecting Individuals' Privacy

• Recognition is voluntary. CAMEO only recognizes people it has registered.

• We can digitally represent video logs so faces are distorted or represented only as shapes:

Raw video with tracking information

Stored video log after privacy filtering

Some ways CALO Agents could use CAMEO Data

• What meetings happened when?

• Who was at the meeting?

• Who was sitting, standing, or speaking?

• Where were people looking?

• Who was talking?

• What were people doing?

• Who was pointing at what?

• What happened during the formal presentation?

• What happened during the general discussion?

• What is a general/detailed summary of the meeting?

• What did person 'x' contribute to the meeting?

• How to replay a meeting from a specific point in time?

• How to replay specific parts of the meeting?

Some ways CALO Agents could use CAMEO Data

• What meetings happened when?
– When a meeting starts, CAMEO can post an event to the timeline server indicating the start time of the meeting. By querying the timeline server for events with the appropriate tag, CALO agents could determine the start times of the various meetings and obtain other information about them, such as what each meeting was about.
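The post-and-query interaction can be pictured with an in-memory stand-in. The real CALO timeline server interface is not described in these slides, so the `TimelineServer` class, its `post` and `query` methods, the `meeting_start` tag, and the sample topics are all hypothetical.

```python
class TimelineServer:
    """Hypothetical in-memory stand-in for the CALO timeline server."""

    def __init__(self):
        self._events = []

    def post(self, tag, timestamp, **attributes):
        """Store one event with a tag, an ISO timestamp, and extra attributes."""
        self._events.append({"tag": tag, "timestamp": timestamp, **attributes})

    def query(self, tag):
        """Return all events with the given tag, oldest first."""
        matching = [event for event in self._events if event["tag"] == tag]
        return sorted(matching, key=lambda event: event["timestamp"])


server = TimelineServer()
# CAMEO side: post an event when each meeting starts.
server.post("meeting_start", "2004-02-03T13:04:05", topic="Project Status Report")
server.post("person_action", "2004-02-03T13:19:45", person="Jim", action="stands")
server.post("meeting_start", "2004-02-02T09:00:00", topic="Planning")

# CALO agent side: ask which meetings happened and when.
meetings = server.query("meeting_start")
```

ISO-formatted timestamps sort correctly as plain strings, which keeps the sketch free of date parsing.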

Some ways CALO Agents could use CAMEO Data

• Who was at the meeting?
– Face recognition is required. This can be done by applying image-matching algorithms (SVD, template matching, etc.) to measure how close a given face is to the faces in a database. A database of saved faces must therefore be available.
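A minimal sketch of the template-matching variant is shown below, assuming each face is a flattened list of grayscale pixel values of equal length. The function names, the 0.8 threshold, and the tiny four-pixel "faces" are illustrative assumptions; a real system would use far larger images and likely a subspace method such as SVD.

```python
import math


def normalized_correlation(a, b):
    """Normalized cross-correlation of two equal-length pixel lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [x - mean_b for x in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(x * x for x in db))
    if denom == 0:
        return 0.0  # a constant image carries no pattern to match
    return sum(x * y for x, y in zip(da, db)) / denom


def identify(face, database, threshold=0.8):
    """Return the best-matching registered name, or None if nothing is close."""
    best_name, best_score = None, threshold
    for name, template in database.items():
        score = normalized_correlation(face, template)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Hypothetical registered faces (flattened pixel values).
database = {"Jim": [10, 200, 30, 180], "Wendy": [200, 10, 180, 30]}

match = identify([12, 190, 35, 170], database)
no_match = identify([10, 20, 200, 180], database)
```

Returning `None` when no template is close enough corresponds to reporting an unregistered person as unknown.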

Some ways CALO Agents could use CAMEO Data

• Who is sitting, standing, or speaking?
– By tracking the positions of people as they move around, we should be able to tell who is sitting and who is standing. Depending on how animated the faces are in that state, we should also be able to tell who is speaking by how much their faces are bobbing around.

Some ways CALO Agents could use CAMEO Data

• Where are people looking?
– To determine where people are looking, a profile face detector is needed. With it, we should be able to tell which direction each person is facing and correlate this with the other faces in the image to figure out where in the image people are likely to be looking.
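As a very rough illustration of correlating facing direction with the other faces, the sketch below assumes the detector reports, for each person, a horizontal image position and a coarse facing label ("left", "right", or "front"), with "left" meaning toward smaller x. It then picks the nearest face on that side. The names, positions, and the `gaze_target` function are hypothetical.

```python
def gaze_target(name, faces):
    """Guess who `name` is looking at.

    faces maps a person's name to (x_position, facing), where facing is
    "left", "right", or "front". Returns a name, or None if undetermined.
    """
    x, facing = faces[name]
    if facing == "front":
        return None  # looking toward the camera, not at another person
    candidates = []
    for other, (other_x, _) in faces.items():
        if other == name:
            continue
        on_facing_side = other_x < x if facing == "left" else other_x > x
        if on_facing_side:
            candidates.append((abs(other_x - x), other))
    return min(candidates)[1] if candidates else None


# Hypothetical detections: name -> (x position in pixels, facing direction).
faces = {
    "Jim": (100, "right"),
    "Wendy": (300, "left"),
    "Pat": (500, "left"),
}
targets = {name: gaze_target(name, faces) for name in faces}
```

This ignores depth and vertical position entirely, so it is only meant to show the kind of correlation the text describes.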

Some ways CALO Agents could use CAMEO Data

• Who was talking?
– Besides tracking face movements, audio data could be recorded by instrumenting CAMEO or the meeting attendees with microphones (i.e., Alex Rudnicky). With multiple microphones in the room, sound localization techniques would be required.
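One common building block for sound localization is estimating the time difference of arrival between two microphones by cross-correlation. The slides do not say which technique would be used, so the sketch below is only an example of the general idea; the function name and the toy signals are made up.

```python
def estimate_delay(reference, delayed, max_lag):
    """Return the lag (in samples, within +/- max_lag) at which `delayed`
    best lines up with `reference`, found by brute-force cross-correlation."""

    def correlation(lag):
        return sum(
            reference[n] * delayed[n + lag]
            for n in range(len(reference))
            if 0 <= n + lag < len(delayed)
        )

    return max(range(-max_lag, max_lag + 1), key=correlation)


# Toy signals: the same short pulse arrives 3 samples later at microphone 2.
mic1 = [0.0] * 20
mic2 = [0.0] * 20
mic1[5], mic1[6] = 1.0, 0.5
mic2[8], mic2[9] = 1.0, 0.5

lag = estimate_delay(mic1, mic2, max_lag=5)
```

A positive lag means the sound reached the second microphone later, suggesting the speaker is closer to the first; dividing the lag by the sample rate gives the delay in seconds.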

Some ways CALO Agents could use CAMEO Data

• What were people doing?
– Beyond the relative positions of people's bodies in the room, more detailed information could be obtained with a full-body tracker. Including information about the room itself, such as what else is in it (tables, whiteboards, or chairs), would let CAMEO report more detailed information.

Some ways CALO Agents could use CAMEO Data

• Who was pointing at what?
– We need even more detailed full-body tracking. By tracking arms and arm positions with a stereo camera (i.e., Trevor Darrell), we should be able to figure out where the person is pointing. By putting a stereo head on a panning mount, a lot of information about the environment could be obtained very easily. Even by extending the 2D tracker so that it identifies arms as being attached to bodies, we might be able to get this information. However, this works only as long as the person is pointing in a direction perpendicular to CAMEO. Having two CAMEOs would be a good way to solve this problem.

Some ways CALO Agents could use CAMEO Data

• What happened during the formal presentation?
– Information has to be collated and merged in such a way that the speaker is identified and information regarding the speech and PowerPoint presentation is processed (CALO-MMD group).

Some ways CALO Agents could use CAMEO Data

• What happened during the general discussion?
– Information has to be collated and merged in such a way that the speakers are identified and information regarding the speech is processed (CALO-MMD group).

Some ways CALO Agents could use CAMEO Data

• What is a general/detailed summary of the meeting?
– Given a state machine that describes the most common things in a meeting, we could cluster the individual events into larger states that indicate the various sections of the meeting. The sections could be based on a generic agenda (intro, formal presentation, questions, open discussion, wrap-up) or even on a specific agenda that is provided to CAMEO ahead of time. People print out agendas and often bring them to formal meetings so that everyone can follow along.

Some ways CALO Agents could use CAMEO Data

• What did person 'x' contribute to the meeting?
– Tracking an individual person's speech and gestures allows the events posted to the timeline server to be gathered and clustered into a personalized state machine. This can be viewed at a very fine level of detail (individual gestures and actions) or as a high-level description such as "person x didn't talk very much".
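The two levels of description can be sketched by counting one person's events and deriving a one-line summary from the counts. The event tuples below are taken from the example meeting log; the `describe` function and the threshold of two speaking turns are assumptions made for the illustration.

```python
from collections import Counter

# (time, person, action) tuples, as they might be read back from the log.
events = [
    ("13:19:45", "Jim", "stands"),
    ("13:19:50", "Jim", "walks to podium"),
    ("13:20:00", "Jim", "speaks"),
    ("13:22:04", "*Unknown*", "speaks"),
    ("13:22:45", "Jim", "speaks"),
    ("13:30:23", "Wendy", "stands"),
    ("13:30:37", "Wendy", "walks to podium"),
    ("13:30:42", "Wendy", "speaks"),
    ("13:33:04", "Wendy", "sits down"),
    ("13:33:04", "Jim", "speaks"),
    ("13:38:50", "Jim", "sits down"),
]


def describe(events, person, min_speaking_turns=2):
    """Return (per-action counts, one-line summary) for one person."""
    counts = Counter(action for _, who, action in events if who == person)
    if counts["speaks"] < min_speaking_turns:
        summary = f"{person} didn't talk very much"
    else:
        summary = f"{person} spoke {counts['speaks']} times"
    return counts, summary


jim_counts, jim_summary = describe(events, "Jim")
wendy_counts, wendy_summary = describe(events, "Wendy")
```

The counts correspond to the fine-grained view and the summary string to the high-level one.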

Some ways CALO Agents could use CAMEO Data

• How to replay a meeting from a specific point in time?
– The raw movie files are available. Once the individual person events are classified, their timestamps can be retrieved from the timeline server and the video can be replayed from that point.
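Turning an event timestamp into a seek position is simple arithmetic once the recording start time is known. The sketch below assumes the video starts at the logged meeting start and that both times fall on the same day; the function name is hypothetical.

```python
from datetime import datetime


def replay_offset_seconds(recording_start, event_time, fmt="%H:%M:%S"):
    """Seconds into the recording at which the event occurs."""
    start = datetime.strptime(recording_start, fmt)
    event = datetime.strptime(event_time, fmt)
    offset = (event - start).total_seconds()
    if offset < 0:
        raise ValueError("event happened before the recording started")
    return offset


# Seek position for the start of the presentation in the example log.
offset = replay_offset_seconds("13:04:05", "13:19:45")
```

The resulting offset could then be handed to whatever video player is used for replay.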

Some ways CALO Agents could use CAMEO Data

• How to replay specific parts of the meeting, e.g., introductions, discussion after the presentation, wrap-up?
– We need to create a probabilistic meeting ontology that we can use to parse and tag the meeting, identifying parts of the meeting with different probabilities. We can learn models of different types of meetings by learning the probabilistic parameters of an ontology, or the Bayesian dependencies from meeting type, people, and purpose to the format of the meeting.
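One simple instance of learning probabilistic parameters is estimating section-to-section transition probabilities from meetings whose sections have already been labeled. This is only a sketch of that one idea, not the proposed ontology; the function name and the three example meetings are made up, with section names borrowed from the generic agenda mentioned earlier.

```python
from collections import Counter, defaultdict


def learn_transitions(meetings):
    """Estimate P(next section | current section) from labeled meetings."""
    counts = defaultdict(Counter)
    for sections in meetings:
        for current, nxt in zip(sections, sections[1:]):
            counts[current][nxt] += 1
    return {
        current: {nxt: n / sum(followers.values()) for nxt, n in followers.items()}
        for current, followers in counts.items()
    }


# Hypothetical labeled meetings, each a sequence of section names.
meetings = [
    ["intro", "formal presentation", "questions", "wrap-up"],
    ["intro", "open discussion", "wrap-up"],
    ["intro", "formal presentation", "open discussion", "wrap-up"],
]
model = learn_transitions(meetings)
```

Separate models could be learned per meeting type, and a new meeting tagged with the sections that are most probable under the matching model.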