
Activity Report 2009

Theme: Vision, Perception and Multimedia Understanding

INSTITUT NATIONAL DE RECHERCHE EN INFORMATIQUE ET EN AUTOMATIQUE

Project-Team Pulsar

Perception Understanding Learning Systems for Activity Recognition

Sophia Antipolis - Méditerranée

Table of contents

1. Team
2. Overall Objectives
   2.1. Presentation
      2.1.1. Research Themes
      2.1.2. International and Industrial Cooperation
   2.2. Highlights of the Year
3. Scientific Foundations
   3.1. Introduction
   3.2. Scene Understanding for Activity Recognition
      3.2.1. Introduction
      3.2.2. Perception for Activity Recognition
      3.2.3. Understanding for Activity Recognition
      3.2.4. Learning for Activity Recognition
   3.3. Software Engineering for Activity Recognition
      3.3.1. Introduction
      3.3.2. Platform for Activity Recognition
      3.3.3. Software Modeling for Activity Recognition
      3.3.4. Behavioral Models for Activity Recognition
      3.3.5. Safeness of Systems for Activity Recognition
4. Application Domains
   4.1. Overview
   4.2. Video Surveillance
   4.3. Detection and Behavior Recognition of Bioaggressors
   4.4. Medical Applications
5. Software
   5.1. Pegase
   5.2. SUP
   5.3. Clem
6. New Results
   6.1. Scene Understanding for Activity Recognition
      6.1.1. Introduction
      6.1.2. A Video Surveillance System for Quasi Real-Time Pest Detection and Counting (ARC BioSerre)
      6.1.3. Real-Time Monitoring of TORE Plasma
      6.1.4. Human Detection and Re-identification
      6.1.5. Controlled Object Detection
      6.1.6. Learning Shape Metrics
      6.1.7. Online People Trajectory Evaluation
      6.1.8. People Tracking Using HOG Descriptors
      6.1.9. Human Gesture Learning and Recognition
      6.1.10. Monitoring Elderly Activities Using a Multi-Sensor Approach and Uncertainty Management
         6.1.10.1. Monitoring Daily Activities of Elderly Living Alone at Home
         6.1.10.2. Monitoring Activities of Alzheimer Patients from CHU Nice
      6.1.11. Trajectory Clustering for Activity Learning
   6.2. Software Engineering for Activity Recognition
      6.2.1. Introduction
      6.2.2. Model-Driven Engineering for Video Surveillance
      6.2.3. Scenario Analysis Module
      6.2.4. Multiple Services for a Device-Adaptive Platform for Scenario Recognition
      6.2.5. The Clem Workflow
7. Contracts and Grants with Industry
8. Other Grants and Activities
   8.1. European Projects
      8.1.1. VICoMo Project
      8.1.2. COFRIEND Project
   8.2. International Grants and Activities
      8.2.1. Joint Partnership with Vietnam
      8.2.2. Shape Metrics and Statistics
   8.3. National Grants and Collaborations
      8.3.1. SYSTEM@TIC SIC Project
      8.3.2. VIDEO-ID Project
      8.3.3. BioSERRE: Video Camera Network for Early Pest Detection in Greenhouses
      8.3.4. MONITORE: Real-Time Monitoring of TORE Plasma
      8.3.5. Long-Term People Monitoring at Home
      8.3.6. Intelligent Cameras
      8.3.7. Semantic Interpretation of 3D Seismic Images by Cognitive Vision Techniques
      8.3.8. Scenario Recognition
   8.4. Spin-off Partner
9. Dissemination
   9.1. Scientific Community
   9.2. Teaching
   9.3. Theses
      9.3.1. Theses in Progress
      9.3.2. Theses Defended
10. Bibliography

1. Team

Research Scientist
Monique Thonnat [ Team Leader up to September 2009, Research Director (DR) Inria, HdR ]
François Brémond [ Team Leader since September 2009, Research Associate (CR) Inria, HdR ]
Guillaume Charpiat [ Research Associate (CR) Inria ]
Sabine Moisan [ Research Associate (CR) Inria, HdR ]
Annie Ressouche [ Research Associate (CR) Inria ]

Faculty Member
Jean-Yves Tigli [ Assistant Professor, Faculty Member Nice Sophia-Antipolis University, on secondment since September 2009 ]

External Collaborator
Jean-Paul Rigault [ Professor, Faculty Member Nice Sophia-Antipolis University ]
Christophe Tornieri [ Keeneo ]
Daniel Gaffé [ Faculty Member Nice University and CNRS-LEAT Member ]

Technical Staff
Bernard Boulay [ Development Engineer, European Project COFRIEND ]
Etienne Corvée [ Development Engineer, European Project COFRIEND ]
Erwan Demairy [ IR2 INRIA for the SUP Software Platform's ADT ]
Jose-Luis Patino Vilchis [ Development Engineer, European Project COFRIEND ]
Valery Valentin [ Development Engineer, SIC Project up to September 2009 ]
Daniel Zullo [ Development Engineer, CIU Santé, since February 2009 ]

PhD Student
Slawomir Bak [ Nice Sophia-Antipolis University, VideoID Grant ]
Duc Phu Chau [ Paca Grant, Nice Sophia-Antipolis University ]
Thi Lan Le [ Hanoi University and Nice Sophia-Antipolis University up to February 2009 ]
Mohammed-Bécha Kaâniche [ Paca Lab Grant, Nice Sophia-Antipolis University up to November 2009 ]
Anh-Tuan Nghiem [ Nice Sophia-Antipolis University, SIC Project ]
Guido-Tomas Pusiol [ Nice Sophia-Antipolis University, CORDIs ]
Rim Romdhame [ Nice Sophia-Antipolis University, CIU Santé Project ]
Nadia Zouba [ Nice Sophia-Antipolis University ]

Post-Doctoral Fellow
Ikhlef Bechar [ ARC BioSerre ]
Mohammed-Bécha Kaâniche [ since December 2009 ]
Vincent Martin [ ARC Monitore ]

Administrative Assistant
Catherine Martin [ Secretary (SAR) Inria ]

Other
Piotr Bilinski [ Internship EGIDE Inria from March 2009 to September 2009 ]
Carolina Garate [ Internship EGIDE Inria from April 2009 to October 2009 ]
Saurabh Goyal [ Internship EGIDE Inria from May 2009 to July 2009 ]
Guiseppe Santoro [ Internship EGIDE Inria from March 2009 to September 2009 ]
Anja Schnaars [ Internship ANR Monitore since September 2009 ]


2. Overall Objectives

2.1. Presentation

2.1.1. Research Themes

Pulsar is focused on cognitive vision systems for Activity Recognition. We are particularly interested in the real-time semantic interpretation of dynamic scenes observed by sensors. We thus study spatio-temporal activities performed by human beings, animals and/or vehicles in the physical world.

Our objective is to propose new techniques in the field of cognitive vision and cognitive systems for mobile object perception, behavior understanding, activity model learning, and dependable activity recognition system design and evaluation. More precisely, Pulsar proposes new computer vision techniques for mobile object perception, with a focus on real-time algorithms and 4D analysis (e.g. 3D models and long-term tracking). Our research work includes knowledge representation and symbolic reasoning for behavior understanding. We also study how statistical techniques and machine learning in general can complement a priori knowledge models for activity model learning. Our research work on software engineering consists of designing and evaluating effective and efficient activity recognition systems. Pulsar takes advantage of a pragmatic approach, working on concrete activity recognition problems to propose new cognitive system techniques inspired by and validated on applications, in a virtuous cycle.

Within Pulsar we focus on two main application domains: safety/security and healthcare. There is an increasing need to automate the recognition of activities observed by sensors (usually CCD cameras, omnidirectional cameras and infrared cameras, but also microphones and other sensors such as optical cells or physiological sensors). The safety/security application domain is a strong basis which ensures both a precise view of the research topics to develop and a network of industrial partners, ranging from end-users to integrators and software editors, providing data, problems and funding. Pulsar is also interested in developing activity monitoring applications for healthcare (in particular assistance for the elderly).

2.1.2. International and Industrial Cooperation

Our work has been applied in the context of more than 7 European projects, such as AVITRACK, SERKET, CARETAKER and COFRIEND. We have industrial collaborations in several domains: transportation (CCI Airport Toulouse Blagnac, SNCF, INRETS, ALSTOM, RATP, Rome ATAC Transport Agency (Italy), Turin GTT (Italy)), banking (Crédit Agricole Bank Corporation, Eurotelis and Ciel), security (THALES R&T FR, THALES Security Syst, INDRA (Spain), EADS, Sagem, Bertin, Alcatel, Keeneo, ACIC, BARCO, VUB-STRO and VUB-ETRO (Belgium)), multimedia (Multitel (Belgium), Thales Communications, IDIAP (Switzerland)), the civil engineering sector (Centre Scientifique et Technique du Bâtiment (CSTB)), the computer industry (BULL), the software industry (SOLID, a software editor for multimedia databases (Finland), AKKA) and the hardware industry (ST-Microelectronics). We have international cooperations with research centers such as Reading University (UK), ARC Seibersdorf Research GmbH (Vienna, Austria), ENSI Tunis (Tunisia), National Cheng Kung University (Taiwan), National Taiwan University (Taiwan), MICA (Vietnam), IPAL (Singapore), I2R (Singapore), NUS (Singapore), the University of Southern California (USC), the University of South Florida (USF), and the University of Maryland.

2.2. Highlights of the Year

Pulsar is a project team created in January 2008 in continuation of the Orion project. François Brémond has led the Pulsar team since September 2009, and Monique Thonnat, the former leader, has taken on responsibilities at INRIA at the national level: she is now both deputy scientific director at INRIA and a researcher in the Pulsar team. We have a new software platform called SUP (Scene Understanding Platform) for activity recognition, based on sound software engineering paradigms. We have also continued original work on learning techniques such as data mining in large multimedia databases. For instance, we have been able to learn the behavior profiles of nine elderly people living in the Ger'home laboratory by processing more than 9 x 4 hours of video and sensor (e.g. pressure and contact sensors) recordings. We have also learned gesture descriptors to recognize actions such as walking and jumping.


3. Scientific Foundations

3.1. Introduction

Pulsar pursues two main research axes: scene understanding for activity recognition and software engineering for activity recognition.

Scene understanding is an ambitious research topic which aims at solving the complete interpretation problem, ranging from low-level signal analysis up to the semantic description of what is happening in a scene viewed by video cameras and possibly other sensors. This problem requires solving several issues, which are grouped into three major categories: perception, understanding and learning.

Software engineering methods make it possible to ensure genericity, modularity, reusability, extensibility, dependability, and maintainability. To tackle this challenge, we rely on the sound theoretical foundations of our models and on state-of-the-art software engineering practices such as components, frameworks, (meta-)modeling, and model-driven engineering.

3.2. Scene Understanding for Activity Recognition

Participants: Guillaume Charpiat, François Brémond, Sabine Moisan, Monique Thonnat.

3.2.1. Introduction

Our goal is to design a framework for the easy generation of autonomous and effective scene understanding systems for activity recognition. Scene understanding is a complex process where information is abstracted through four levels: signal (e.g. pixel, sound), perceptual features, physical objects and events. The signal level is characterized by strong noise and ambiguous, corrupted and missing data. Thus, to reach a semantic abstraction level, models and invariants are the crucial points. A still open issue consists in determining whether these models and invariants are given a priori or are learned. The whole challenge consists in organizing all this knowledge in order to capitalize experience, share it with others and update it along with experimentation. More precisely, we work on the following research axes: perception (how to extract perceptual features from signal), understanding (how to recognize a priori models of physical object activities from perceptual features) and learning (how to learn models for activity recognition).

3.2.2. Perception for Activity Recognition

We are proposing computer vision techniques for physical object detection and control techniques for the supervision of a library of video processing programs.

First, for the real-time detection of physical objects from perceptual features, we design methods either by adapting existing algorithms or by proposing new ones. In particular, we work on information fusion to handle perceptual features coming from various sensors (several cameras covering a large-scale area, or heterogeneous sensors capturing more or less precise and rich information). Also, to guarantee the long-term coherence of tracked objects, we are adding a reasoning layer to a classical Bayesian framework modeling the uncertainty of the tracked objects. This reasoning layer takes into account the a priori knowledge of the scene for outlier elimination and long-term coherency checking. Moreover, we are working on providing fine and accurate models for human shape and gesture, extending the work we have done on human posture recognition by matching 3D models and 2D silhouettes. We are also working on gesture recognition based on 2D feature point tracking and clustering.

A second research direction is to manage a library of video processing programs. We are building a perception library by selecting robust algorithms for feature extraction, by ensuring that they work efficiently under real-time constraints, and by formalizing their conditions of use within a program supervision model. In the case of video cameras, at least two problems are still open: robust image segmentation and meaningful feature extraction. For these issues, we are developing new learning techniques.


3.2.3. Understanding for Activity Recognition

A second research axis is to recognize subjective activities of physical objects (i.e. human beings, animals, vehicles) based on a priori models and objective perceptual measures (e.g. robust and coherent object tracks).

To reach this goal, we have defined original activity recognition algorithms and activity models. Activity recognition algorithms include the computation of spatio-temporal relationships between physical objects. All the possible relationships may correspond to activities of interest, and all have to be explored in an efficient way. The variety of these activities, generally called video events, is huge and depends on their spatial and temporal granularity, on the number of physical objects involved in the events, and on the event complexity (number of components constituting the event).

Concerning the modeling of activities, we are working in two directions: uncertainty management for expressing probability distributions, and knowledge acquisition facilities based on ontological engineering techniques. For the first direction, we are investigating classical statistical techniques and logical approaches. For example, we have built a language for video event modeling and a visual concept ontology (including color, texture and spatial concepts), to be extended with temporal concepts (motion, trajectories, events, ...) and other perceptual concepts (physiological sensor concepts, ...).
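To give the flavor of such a model, here is a hypothetical sketch of a composite event expressed as a plain Python data structure; the event name, sub-events and constraint syntax are invented for illustration and do not reproduce the team's actual event description language.

```python
# Hypothetical composite event model; names and constraints are
# illustrative, not the team's actual video event language.
bank_attack = {
    "event": "Bank_Attack",
    "physical_objects": [("robber", "Person"), ("teller", "Person")],
    "components": [                      # primitive sub-events
        ("e1", "inside_zone", ["robber", "entrance_zone"]),
        ("e2", "inside_zone", ["teller", "back_zone"]),
        ("e3", "stays_at",    ["robber", "safe_zone"]),
    ],
    "constraints": [
        ("before", "e1", "e3"),          # temporal ordering
        ("duration_gt", "e3", 10.0),     # e3 lasts more than 10 s
    ],
}
```

A recognition engine would then search the stream of primitive events for combinations satisfying these spatio-temporal constraints.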

3.2.4. Learning for Activity Recognition

Given the difficulty of building an activity recognition system with a priori knowledge for a new application, we study how machine learning techniques can automate the building or completion of models at the perception level and at the understanding level.

At the perception level, to improve image segmentation, we are using program supervision techniques combined with learning techniques. For instance, given an image sample set associated with ground truth data (manual region boundaries and semantic labels), an evaluation metric together with an optimisation scheme (e.g. the simplex algorithm or a genetic algorithm) is applied to select an image segmentation method and to tune its parameters. Another example, for handling illumination changes, consists in applying clustering techniques to intensity histograms to learn the different classes of illumination context for dynamic parameter setting.
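A minimal sketch of this tuning loop, assuming a user-supplied segment(image, params) function and annotated ground-truth masks (both are placeholders, not the actual platform API):

```python
import numpy as np
from scipy.optimize import minimize

def overlap_error(pred_mask, gt_mask):
    # 1 - Jaccard index between predicted and ground-truth regions
    inter = np.logical_and(pred_mask, gt_mask).sum()
    union = np.logical_or(pred_mask, gt_mask).sum()
    return 1.0 - inter / max(union, 1)

def objective(params, images, gt_masks, segment):
    # average segmentation error over the annotated sample set
    return np.mean([overlap_error(segment(img, params), gt)
                    for img, gt in zip(images, gt_masks)])

def tune_segmentation(images, gt_masks, segment, init_params):
    # Nelder-Mead is the simplex algorithm mentioned above; a genetic
    # algorithm could replace it for non-smooth parameter spaces.
    result = minimize(objective, np.asarray(init_params, dtype=float),
                      args=(images, gt_masks, segment),
                      method="Nelder-Mead")
    return result.x  # tuned segmentation parameters
```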

At the understanding level, we are learning primitive event detectors. This can be done, for example, by learning visual concept detectors using SVMs (Support Vector Machines) with perceptual feature samples. An open question is how far we can go in weakly supervised learning for each type of perceptual concept (i.e. reducing the human annotation effort). A second direction is the learning of typical composite event models for frequent activities, using trajectory clustering or data mining techniques. We call a composite event a particular combination of several primitive events.
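For the SVM-based concept detectors, a sketch along these lines is plausible (scikit-learn is an assumed library choice; the feature extraction step is left abstract):

```python
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def train_concept_detector(features, labels):
    # features: (n_samples, n_dims) perceptual feature vectors
    # labels:   1 if the visual concept is present, 0 otherwise
    detector = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
    detector.fit(features, labels)
    return detector

# usage: detector.predict(new_features) flags primitive events, which
# composite event models can then combine.
```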

Coupling learning techniques with a priori knowledge techniques is a promising way to recognize meaningful semantic activities.

The new techniques proposed for activity recognition systems (first research axis) in turn help specify the needs for new software architectures (second research axis).

3.3. Software Engineering for Activity Recognition

Participants: François Brémond, Erwan Demairy, Sabine Moisan, Annie Ressouche, Jean-Paul Rigault, Monique Thonnat.

3.3.1. Introduction

The aim of this research axis is to build general solutions and tools to develop systems dedicated to activity recognition. For this, we rely on state-of-the-art software engineering practices to ensure both sound design and easy use, providing genericity, modularity, adaptability, reusability, extensibility, dependability, and maintainability.


This year we focused on four aspects: the definition of a joint software platform with KEENEO, the study of model-driven engineering approaches to facilitate platform usage, the extension of behavioral models, and formal verification techniques to design dependable systems.

3.3.2. Platform for Activity Recognition

Figure 1. Global Architecture of an Activity Recognition Platform.

In the former project team Orion, we developed two platforms: VSIP, a library of real-time video understanding modules, and LAMA [12], a software platform enabling the design not only of knowledge bases, but also of inference engines and additional tools. LAMA offers toolkits to build and to adapt all the software elements that compose a knowledge-based system or a cognitive system.

Pulsar will continue to study generic systems and object-oriented frameworks to elaborate a methodology for the design of activity recognition systems. We want to broaden the approach that led to LAMA and to apply it to the other components of the activity recognition platform, in particular the image processing ones. We also wish to contribute to setting up, in the long term, a complete software engineering methodology to develop activity recognition systems. This methodology should be based on model engineering and formal techniques.

To this end, Pulsar plans to develop a new platform (see Figure 1) which integrates all the necessary modules for the creation of real-time activity recognition systems. Software generators provide designers with perception, software engineering and knowledge frameworks. Designers will use these frameworks to create both dedicated activity recognition engines and interactive tools. The perception and evaluation interactive tools enable a perception expert to create a dedicated perception library. The knowledge acquisition, learning and evaluation tools enable a domain expert to create a new dedicated knowledge base.

This platform will rely on the LAMA experience, but necessitates some architectural changes and model extensions. We plan to work in the following three research directions: models (adapted to the activity recognition domain), platform architecture (to cope with deployment constraints such as real time or distribution), and system safeness (to generate dependable systems). For all these tasks we shall follow state-of-the-art software engineering practice and, if needed, we shall attempt to set up new ones.

The new platform should be easy to use. We should thus define and implement tools to support modeling, design, and verification inside the framework. Another important issue deals with graphical user interfaces. It should be possible to plug existing (domain- or application-dependent) graphical interfaces into the platform. This requires defining a generic layer to accommodate various sorts of interfaces. This is clearly a medium/long-term goal, at least in its full generality.

3.3.3. Software Modeling for Activity Recognition

Developing integrated platforms such as SUP is a current trend in video surveillance. It is also a challenge, since these platforms are complex and difficult to understand, use, validate, and maintain. The situation gets worse when considering the huge number of choices and options, both at the application and platform levels. Dealing with such variability requires formal modeling approaches for task specification as well as for software component description.

Model-Driven Engineering (MDE) [62] is a recent line of research that appears as an excellent candidate to support this modeling effort, while providing means to make models operational and even executable. Our goal is to explore and enrich MDE techniques and model transformations to support the development of product lines for domains presenting multiple variability factors, such as video surveillance.

The long-term scientific objective concerns research activities both in Model-Driven Engineering and in video surveillance. On the MDE side, we wish to identify the limits of current techniques when applied to real-scale complex tasks. On the video surveillance side, the trend is toward integrated software platforms, which requires formal modeling approaches for task specification as well as for software component description.

This MDE approach is complementary to the Program Supervision one, which has been studied by Orion for a long time [21]. Program Supervision focuses on programs, their models and the control of their execution. MDE also covers task specification and transformations to a design and implementation.

3.3.4. Behavioral Models for Activity Recognition

Pursuing the work done in Orion, we need to consider other models to express knowledge about activities, their actors, their relations, and their behaviors.

The evolution toward activity recognition requires various theoretical studies. We need to complement the knowledge representation model with a first-class notion of relations. The incorporation of a model of time, both physical and logical, is mandatory to deal with temporal activity recognition, especially in real time. A fundamental concern is to define an abstract model of scenarios to describe and recognize activities. Supporting distributed systems is unavoidable for current software systems, and this requires a model of distribution. These conceptual models will lead us to define corresponding ontologies.

Finally, handling uncertainty is a major theme of Pulsar and we want to introduce it into our platform; this requires deep theoretical studies and is a long-term goal.

3.3.5. Safeness of Systems for Activity Recognition

Another aim is to build dependable systems. Since traditional testing is not sufficient, it is important to rely on formal verification techniques and to adapt them to our component models.

In most activity recognition systems, safeness is a crucial issue. It is a very general notion dealing with the protection of persons and goods, respect of privacy, or even legal constraints. However, when designing software systems, it ultimately boils down to software security. In Orion, we already provided toolkits to ensure the validation and verification of systems built with LAMA. First, we offered a knowledge base verification toolkit for verifying the consistency and completeness of a base, as well as the adequacy of the knowledge with regard to the way an engine is going to use it. Second, we also provided an engine verification toolkit that relies on model-checking techniques to verify that the BLOCKS library has been used in a safe way during the design of knowledge-based system engines.


The generation of dependable systems for activity recognition is an important challenge. System validation really is a crucial phase in any development cycle. Partial validation by tests, although required in the first phase of validation, appears too weak for the system to be completely trusted. An exhaustive validation approach using formal methods is clearly needed. Formal methods help to produce code that has been formally proved, and whose size and frequency can be estimated. Consistently with our component approach, it appears natural to rely on component modeling to perform a verification phase in order to build safe systems. Thus we study how to ensure safeness for components whose models take into account time and uncertainty.

Nevertheless, software dependability cannot be proved by relying on a single technique. Some properties are decidable and can be checked using formal methods at the model level. By contrast, some other properties are not decidable and require non-exhaustive methods such as abstract interpretation at the code level. Thus, a verification method to ensure generic component dependability must take into account several complementary verification techniques.

4. Application Domains

4.1. Overview

While the focus of our research is to develop techniques, models and platforms that are generic and reusable, we also put effort into the development of real applications. The motivation is twofold. The first is to validate the new ideas and approaches we introduce. The second is to demonstrate how to build working systems for real applications in various domains, based on the techniques and tools we have developed. Indeed, the applications we have achieved cover a wide variety of domains: intelligent visual surveillance in the transport domain, as well as applications in the biological and medical domains.

4.2. Video Surveillance

The growing feeling of insecurity among the population has led private companies as well as public authorities to deploy more and more security systems. For the safety of public places, video-camera-based surveillance techniques are commonly used, but the multiplication of cameras leads to the saturation of transmission and analysis means (it is difficult to supervise hundreds of screens simultaneously). For example, 1000 cameras are viewed by only two security operators monitoring the Brussels subway network. In the framework of our work on automatic video interpretation, we have studied the design of an automatic platform which can assist video surveillance operators.

The aim of this platform is to act as a filter, selecting the scenes which may be interesting for a human operator. The platform is based on the cooperation between an image processing component and an interpretation component using artificial intelligence techniques. Thanks to this cooperation, the platform automatically recognizes different scenarios of interest in order to alert the operators. This work has been carried out with academic and industrial partners, in particular in the European projects PASSWORDS, AVS-PV, AVS-RTPW, ADVISOR, AVITRACK, CARETAKER, SERKET and CANTATA, and more recently the European projects VICoMo and COFRIEND, the national projects SIC and VIDEO-ID, and the industrial projects RATP, CASSIOPEE, ALSTOM and SNCF. A first set of very simple applications for the indoor night surveillance of a supermarket (AUCHAN) showed the feasibility of this approach. A second range of applications investigated parking lot monitoring, where the rather large viewing angle makes it possible to see many different objects (cars, pedestrians, trolleys) in a changing environment (illumination, parked cars, trees shaken by the wind, etc.). This set of applications allowed us to test various methods for tracking, trajectory analysis and recognition of typical cases (occlusion, creation and separation of groups, etc.).


We have studied and developed video surveillance techniques in the transport domain, which requires the analysis and recognition of groups of persons observed from a lateral and low-position viewing angle in subway stations (the subways of Nuremberg, Brussels, Charleroi, Barcelona, Rome and Turin). We have worked with industrial companies (Bull, Vigitec, Keeneo) on the design of an intelligent video surveillance platform which is independent of any particular application. The principal constraints are the use of fixed cameras and the possibility to specify the scenarios to be recognized, which depend on the particular application, based on scenario models which are independent of the recognition system.

In parallel with the video surveillance of subway stations, projects based on the video understanding platform have started for bank agency monitoring, train car surveillance and aircraft activity monitoring, to manage complex interactions between different types of objects (vehicles, persons, aircraft). A new challenge consists in combining video understanding with learning techniques (e.g. data mining), as is done in the CARETAKER and COFRIEND projects, to infer new knowledge about observed scenes.

4.3. Detection and Behavior Recognition of Bioaggressors

In the environmental domain, Pulsar is interested in automating the early detection of bioaggressors, especially in greenhouse crops, in order to reduce pesticide use. Attacks (from insects or fungi) demand almost immediate decisions to prevent irreversible proliferation. The goal of this work is to define innovative decision support methods for in situ early pest detection, based on video analysis and scene interpretation from multi-camera data. We promote a non-destructive and non-invasive approach to allow rapid remedial decisions by producers. The major issue is to reach a sufficient level of robustness for continuous surveillance.

During the last decade, most studies on video applications for the surveillance of biological organisms were limited to constrained environments where camera conditions are controlled. By contrast, we aim at monitoring pests in their natural environment (greenhouses). We thus intend to automate pest detection, in the same way as the management of climate, fertilization and irrigation is carried out by a control/command computer system. To this end, vision algorithms (segmentation, classification, tracking) must be adapted to cope with illumination changes, plant movements, and insect characteristics.

Traditional manual counting is tedious, time consuming and subjective. We have developed a generic approach based on a priori knowledge and adaptive methods for vision tasks. This approach can be applied to insect images in order, first, to automate the identification and counting of bioaggressors, and ultimately, to analyze insect behaviors. Our work takes place within the framework of cognitive vision [54]. We propose to combine image processing, neural learning, and a priori knowledge to design a complete system from video acquisition to behavior analysis. The ultimate goal is to integrate a module for insect behavior analysis. Indeed, the recognition of some characteristic behaviors is often closely related to epicenters of infestation. Coupled with an optimized spatial sampling of the video cameras, it can be of crucial help for rapid decision support.

Most studies on behavior analysis have concentrated on human beings. We intend to extend cognitive vision systems to monitor non-human activities. We will define scenario models, based on the concepts of states and events related to objects of interest, to describe the scenarios relative to whitefly behaviors. We shall also rely on ontologies (such as a video event ontology). Finally, in the long term, we want to investigate data mining for biological research. Indeed, biologists require new knowledge to analyze bioaggressor behaviors. A key step will be to match numerical features (based on trajectories and density distributions, for instance) with their biological interpretations (e.g., predation or centers of infestation).

This work takes place within a two-year collaboration (ARC BioSERRE), started in 2008, between Pulsar (INRIA Sophia Antipolis - Méditerranée), Vista (INRIA Rennes - Bretagne Atlantique), INRA Avignon UR407 Pathologie Végétale (Institut National de la Recherche Agronomique), and the CREAT Research Center (Chambre d'Agriculture des Alpes-Maritimes).


4.4. Medical Applications

In the medical domain, Pulsar is interested in the long-term monitoring of people at home, which aims at supporting caregivers by providing information about the occurrence of worrying changes in people's behavior. We are especially involved in the Ger'home project, funded by the PACA region and the Conseil Général (CG06), in collaboration with two local partners: CSTB and the Nice City hospital. In this project, an experimental home integrating new information and communication technologies has been built in Sophia Antipolis. It addresses the issue of monitoring and learning about people's activities at home, using autonomous and non-intrusive sensors. The goal is to detect the sudden occurrence of worrying situations, such as any slow change in a person's frailty. We have also started a collaboration with the Nice hospital to monitor Alzheimer patients with the help of geriatric doctors. The aim of the project is to design an experimental platform providing services, and to allow testing of their efficiency.

5. Software

5.1. Pegase

Since September 1996, the Orion team (and now the Pulsar team) has distributed the program supervision engine PEGASE, based on the LAMA platform. The Lisp version has been used at Maryland University and at Genset (Paris). The C++ version (PEGASE+) is now available and is operational at ENSI Tunis (Tunisia) and at CEMAGREF, Lyon (France).

5.2. SUP

Figure 2. Components of the Scene Understanding Platform (SUP).

SUP is a Scene Understanding Software Platform written in C and C++ (see Figure 2). SUP is the continuation of the VSIP platform. SUP splits the video processing workflow into several modules, such as acquisition, segmentation, etc., up to scenario recognition. Each module has a precise interface, and different plugins implementing these interfaces can be used for each step of the video processing. This generic architecture is designed to facilitate:

1. integration of new algorithms in SUP;

2. sharing of the algorithms among the team.

Currently, 15 plugins are available, covering the whole processing chain. Several plugins use the Genius platform, an industrial platform based on VSIP and exploited by Keeneo, the start-up created in July 2005 by the research team.


The goals of SUP are twofold:

1. From a video understanding point of view, the goal of SUP is that all the researchers of the Pulsar team can share the implementations of their research through this platform;

2. From a software engineering point of view, the goal is to integrate the results of the dynamic management of applications when applied to video surveillance.

5.3. Clem

The Clem Toolkit [56] is a set of tools devoted to designing, simulating, verifying and generating code for LE [60], [59] programs. The latter is a synchronous language supporting modular compilation. The language also supports automata, possibly designed with a dedicated graphical editor. The Clem toolkit comes with a simulation tool. Hardware descriptions (VHDL) and software code (C) are generated from LE programs. Moreover, we also generate files to feed the NuSMV model checker [52] in order to validate program behaviors.

6. New Results

6.1. Scene Understanding for Activity Recognition

Participants: Slawomir Bak, Ikhlef Bechar, Bernard Boulay, François Brémond, Guillaume Charpiat, Duc Phu Chau, Etienne Corvée, Mohammed Bécha Kaâniche, Vincent Martin, Sabine Moisan, Anh-Tuan Nghiem, Guido-Tomas Pusiol, Monique Thonnat, Valery Valentin, Jose-Luis Patino Vilchis, Nadia Zouba.

6.1.1. Introduction

This year, Pulsar has tackled several scene understanding issues and proposed new algorithms along the three following research axes:

• Perception: people detection and human gesture recognition;

  • Understanding: multi-sensor activity recognition and gesture recognition using learned local motion descriptors;

• Learning: online and offline trajectory clustering.

6.1.2. A Video Surveillance System for Quasi Real-Time Pest Detection and Counting (ARC BioSerre)

Participants: Ikhlef Bechar, Vincent Martin, Sabine Moisan.

In the framework of the BioSerre project, we investigate a video surveillance solution for the early detection of pest attacks, as part of pest management methods. Our system is to be used in a greenhouse endowed with a (WiFi) network of video cameras. We presented this work in June 2009 in Paris during the Salon Européen de la Recherche et de l'Innovation (SERI'09), in October 2009 in Bordeaux during the Journées des ARCs de l'INRIA, and on December 10th-11th in Sophia Antipolis at the 2009 INRA-INRIA seminar.

On top of the classical challenges in video surveillance (lighting changes, shadows, etc.), we have to face specific challenges:

  • The high resolution of the video frames needed by the application (about 1.3 megapixels per frame and about 2 frames every 1.5 sec.), which is necessary to visualize the insects of interest but constitutes a serious challenge for quasi-real-time processing;

• The very low spatial resolution and color contrast of the harmful insects of interest in the videos;

  • The lack of powerful discriminative features for the insect species of interest, because their low spatial resolution in the videos does not allow us to see their detailed shapes.


The application is divided into different modules. A video acquisition module acquires videos from the remote cameras and stores them locally on a PC where the core of the application is running. Thanks to the trap extraction module, only the region of interest in a video (the trap area) is processed. Then, a background subtraction module maintains a statistical model of the background and detects pixels which deviate significantly from the learned background model. The detected pixels are then processed by an insect presence detection module (IPDM) to decide whether they are likely to be insect pixels or not (e.g. due to illumination changes). A video frame being divided into many patches (to speed up the subsequent image processing), a mere counting of the pixels classified as insect pixels by the IPDM allows the system to vote for the patch which will be processed by the next module, the insect detection module (IDM). The IDM consists of low-level image processing operations (RGB to gray-scale image transformation, image convolution, image differentiation, local maxima extraction, perceptual grouping, etc.) and relies on a rough prior geometric model of the insects of interest (i.e., a salient rectangular intensity profile) to extract the patterns likely to correspond to insects of interest. The insect classification module involves some additional processing in order to classify the extracted patterns into insects or fake patterns generated either by noise or by illumination reflections. In order not to redo the detection of a previously detected insect, a (cheap) insect tracking module maintains a list of the insects already detected in previous frames and updates it whenever a new insect is detected by the IDM and confirmed by insect classification. All these routines are repeated continuously during scheduled daytime, and the counting results are stored and analyzed in quasi-real time.
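The patch-voting step can be sketched as follows; the patch size and the use of a boolean mask from the IPDM are assumptions made for illustration:

```python
import numpy as np

PATCH = 64  # patch side in pixels (assumed value)

def select_patch(insect_pixel_mask):
    """insect_pixel_mask: boolean array, True where the IPDM classified
    a pixel as insect-like; returns the patch with the most votes."""
    h, w = insect_pixel_mask.shape
    best_score, best_patch = 0, None
    for y in range(0, h - PATCH + 1, PATCH):
        for x in range(0, w - PATCH + 1, PATCH):
            score = insect_pixel_mask[y:y + PATCH, x:x + PATCH].sum()
            if score > best_score:
                best_score, best_patch = score, (y, x)
    return best_patch  # None if no patch received any vote
```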

Figure 3. Detection of whiteflies in a video frame (1280 × 960 px). The central part of each image represents the yellow sticky trap which attracts insects (whiteflies in this case). (a) Original frame: trapped whiteflies correspond to bright spots in the image. (b) Detections superimposed on the original frame (colors are used only for rendering convenience and for tracking purposes).

We have shown the feasibility of a video surveillance system for pest management in a greenhouse and managed to overcome most of the initially posed image processing challenges. Currently, we are testing and refining our algorithms. The prototype should be deployed soon in actual greenhouse sites of our INRA partners (Avignon) for further testing and validation.

6.1.3. Real-Time Monitoring of TORE Plasma

Participants: Vincent Martin, François Brémond, Monique Thonnat.

This project aims at developing an intelligent system for the real-time surveillance of the plasma evolving in Tore Supra or other devices. The first goal is to improve the reliability of, and upgrade, the current real-time control system operating at Tore Supra. The ultimate goal is to integrate such a system into the future ITER imaging diagnostics. In this context, a first collaboration has recently started between the Plasma Facing Component group of CEA Cadarache and the Pulsar project-team. The goal is to detect events (expected or not) in real time, in order to control the power injection for the protection of the Plasma Facing Components (PFC). In the case of a known event, the detection must lead to the identification of this event. Otherwise, a learning process is proposed to assimilate this new type of event. In this way, the objective of the project is twofold: machine protection and thermal event understanding. The system may take multimodal data as inputs: plasma parameters (plasma density, power injected by the heating antenna, plasma position, ...), infrared and visible images, and signals coming from other sensors such as spectrometers and bolometers. Recognized events are returned as outputs, with their characteristics, for further physical analysis. In this application, we benefit from the large amount of data accumulated during thousands of pulses and over several devices. We rely on an ontology-based representation of the a priori domain expert knowledge of thermal events. This thermal event ontology is based on visual concepts and video event concepts useful to describe and to recognize a large variety of events occurring in thermal videos.

New results in thermal event detection

This year, we have focused on the improvement and the assessment of the thermal event detection system. As seen in Table 1, the proposed approach outperforms the previous system, which was based on the detection of threshold overruns in specified regions of interest. Improvements are visible both in terms of sensitivity (fewer false negatives than the previous system) and precision (no false positives). This work has been published in [42]. An extended journal version has been accepted and will be published next year.

Table 1. Comparison between the arcing event detection results obtained with the current system (CS) used at Tore Supra and with the proposed approach (PA), where TP = true positive counts, FN = false negative counts, and FP = false positive counts.

Antenna   No. of   Annotated        TP          FN          FP
          pulses   arcing events    CS    PA    CS    PA    CS    PA
C2        11       73               68    70    5     3     51    0
C3        7        17               11    13    6     4     7     0
C2 + C3   18       90               79    83    11    7     58    0

New results in thermal event understanding

In the case of a complex thermal event, we have studied learning techniques for temporal behavior modeling. Our case study focuses on B4C flakes on the vertical edge of the Faraday screen. They are due to the flaking of the B4C coating caused by the heating from fast ion losses. Temperature may exceed the acceptable threshold without apparent risk of damage. The recognition of a B4C flake is not obvious and relies on a fine physical analysis of the temporal behavior of the hot spot. Our goal is to build statistical models from training samples composed of positive and negative examples of B4C flakes. We used Hidden Markov Models trained with the temperature of detected hot spots and the injected power as inputs (a minimal sketch of this scheme is given at the end of this subsection). Preliminary results are convincing, and further evaluation of the proposed approach needs to be pursued. Finally, this approach may be extended to the modeling and automatic recognition of other complex thermal events.

New software development

We are currently developing a Plasma Imaging data Understanding Platform (PInUP) dedicated to thermal event recognition and understanding. This platform is inspired by VSUP and is composed of several modules and knowledge bases. PInUP also embeds a dedicated tool for video annotation. The goal of this tool is twofold: first, to build an annotation base of observed thermal events with precise spatiotemporal information, and second, to retrieve from the resulting base useful information on thermal events, for instance for further PFC aging analysis (see Figure 4).
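Returning to the B4C flake study: a minimal sketch of the two-class HMM scheme, written with hmmlearn (an assumed library choice; the report does not name one), where each observation pairs hot-spot temperature with injected power:

```python
import numpy as np
from hmmlearn.hmm import GaussianHMM

def fit_hmm(sequences, n_states=3):
    # sequences: list of (T_i, 2) arrays [temperature, injected power]
    X = np.vstack(sequences)
    lengths = [len(s) for s in sequences]
    model = GaussianHMM(n_components=n_states)
    model.fit(X, lengths)
    return model

# one HMM trained on positive examples, one on negative examples;
# a new hot-spot sequence is labeled by the higher log-likelihood
def is_b4c_flake(seq, hmm_flake, hmm_other):
    return hmm_flake.score(seq) > hmm_other.score(seq)
```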


Figure 4. Left: snapshot of the video annotation panel of PInUP. Bounding box spatiotemporal positions and event descriptions are stored in a MySQL base. Right: snapshot of the annotation manager of PInUP, used to retrieve thermal events previously annotated in plasma videos.

This platform is going to be deployed at Tore Supra and will be used by physicists and by the persons in charge of the infrared imaging diagnostics. Concerning the real-time detection of thermal events, we are currently implementing the most costly algorithms on an FPGA to meet real-time constraints. The real-time monitoring system should be operational in April 2010 and will work in parallel with the existing system at Tore Supra.

6.1.4. Human Detection and Re-identification

Participants: Slawomir Bak, Etienne Corvée, François Brémond, Saurabh Goyal.

Human activity recognition requires the detection and tracking of people in often congested scenes captured by surveillance cameras. The common strategies used to detect objects at high frame rates rely on segmenting and grouping foreground pixels against a background scene captured by static cameras. The detected objects in a 3D-calibrated environment are then classified according to predefined 3D models such as persons, luggage or vehicles. However, whenever occlusion occurs, objects are no longer classified as single individuals but are associated with a group of objects. Hence, standard tracking systems fail to differentiate objects within a group and often lose their tracks.

One way to handle occlusion issues is to use multiple cameras viewing the same scene from different locations, in order to cover the fields of view where occlusion occurs: e.g. when one camera fails to track a person, another camera takes over the tracking of this person. People detection in difficult scenarios can also be improved by extracting local descriptors. In the Pulsar team, we have used Histograms of Oriented Gradients (HOG) to model human appearance and faces. The aim is to model body parts and poses of people to better define people's shapes. Figure 5 shows results obtained by the HOG face detector, and Figure 6 shows tracking results for people detected by the HOG human detector.
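As an illustration of the HOG principle (not the team's own detector, which is custom-trained), OpenCV ships a default HOG person model:

```python
import cv2

hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

def detect_people(frame):
    # returns a list of (x, y, w, h) person bounding boxes
    rects, weights = hog.detectMultiScale(frame, winStride=(8, 8),
                                          padding=(8, 8), scale=1.05)
    return rects
```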

Figure 5. Examples of detected faces in the TrecVid (left) and Ger'home (right) datasets

Figure 6. Examples of tracked people in the TrecVid (left) and Caviar (right) datasets

We are not only adding visual signatures to better track people in independent cameras, but also using visual signatures to allow people tracking across different cameras. These visual signatures also allow us to re-identify people in more complex networks of cameras whose fields of view do not overlap, e.g. in underground stations or airports. It is thus desirable to determine whether a given person of interest has previously been observed by other cameras in such a network. This constitutes the person re-identification issue. Our first re-acquisition algorithm combines Haar-like features with dominant colors extracted on the mobile objects found by the HOG-based human detector described above.

People's visual signatures are described by the Haar-like descriptors shown in Figure 7 and by dominant colors extracted from two regions, the upper and lower body parts, as shown in Figure 8 (see the sketch after the two figures). The AdaBoost algorithm is adapted to take these visual signatures as input and to construct a model for each individual. We have tested our algorithm in a scenario with two non-overlapping cameras: 10 people from the Caviar database and 40 people from the TrecVid database.

Figure 7. (a) original image; (b) foreground/background separation mask; (c) discriminative Haar-like features

Figure 8. The dominant color separation technique: (a) original image; (b) upper-body dominant color mask; (c) lower-body dominant color mask.
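A sketch of the signature construction, with k-means dominant colors and a per-person AdaBoost model; the cluster count, the half-height body split and the scikit-learn classifier are illustrative assumptions, and the Haar-like features that complete the signature in the actual algorithm are omitted here:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import AdaBoostClassifier

def dominant_colors(pixels, k=3):
    # pixels: (n, 3) foreground RGB pixels of one body region
    km = KMeans(n_clusters=k, n_init=5).fit(pixels)
    return km.cluster_centers_.ravel()          # k x 3 color values

def signature(crop, mask):
    # crop: (H, W, 3) person image; mask: (H, W) boolean foreground
    h = crop.shape[0] // 2
    upper = crop[:h][mask[:h]]                  # upper-body pixels
    lower = crop[h:][mask[h:]]                  # lower-body pixels
    return np.concatenate([dominant_colors(upper),
                           dominant_colors(lower)])

def train_person_model(pos_signatures, neg_signatures):
    # one AdaBoost model per individual (one-vs-rest)
    X = np.vstack(list(pos_signatures) + list(neg_signatures))
    y = [1] * len(pos_signatures) + [0] * len(neg_signatures)
    return AdaBoostClassifier(n_estimators=50).fit(X, y)
```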

6.1.5. Controlled Object Detection

Participants: Anh-Tuan Nghiem, François Brémond, Monique Thonnat.

Detecting mobile objects is an important task in many video analysis applications, such as video surveillance, people monitoring, and video indexing for multimedia. Among the various object detection methods, those based on adaptive background subtraction, such as the Gaussian mixture model, the kernel density estimation method, and the codebook model, are the most popular. However, a background subtraction algorithm alone cannot easily handle problems such as adapting to changes of the environment, removing noise, or detecting ghosts.


To help background subtraction algorithms deal with these problems, we have constructed a controller for managing object detection algorithms. Being independent of any particular background subtraction algorithm, this controller has two main tasks:

• Supervising background subtraction algorithms to update their background representation.

  • Adapting the parameter values of background subtraction algorithms to suit the current conditions of the scene.

To supervise background subtraction algorithms in updating their background representation, the controller employs the feedback from the classification and tracking tasks. With this feedback, the controller can ask background subtraction algorithms to apply appropriate updating strategies to different blob types. For example, if the feedback from the classification and tracking tasks identifies a noise region, the controller will ask the background subtraction algorithms to update the corresponding region quickly, so that this noise does not occur again in the detection results. With the updating supervision of the controller, background subtraction algorithms can handle various problems such as removing noise, keeping track of objects of interest, managing stationary objects, and removing ghosts.

To adapt the parameter values of background subtraction algorithms, the controller first needs to evaluate the foreground detection results. This evaluation is realized with the help of the feedback from the classification and tracking tasks. Based on this evaluation, the background subtraction algorithm may change its parameter values to achieve better performance. A simplified sketch of this control loop is given below.
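In this sketch, the blob taxonomy, the learning rates, the quality threshold and the methods on the bg and feedback objects are hypothetical stand-ins for the real interfaces:

```python
def controller_step(bg, blobs, feedback):
    # task 1: supervise background updating per blob type
    for blob in blobs:
        kind = feedback.blob_type(blob)   # e.g. noise, ghost, object
        if kind in ("noise", "ghost"):
            # absorb the region into the background model quickly
            bg.update_region(blob.region, learning_rate=0.5)
        elif kind in ("object_of_interest", "stationary_object"):
            # freeze updating so the object is not absorbed
            bg.update_region(blob.region, learning_rate=0.0)
    # task 2: adapt parameters from the evaluated detection quality
    if feedback.detection_quality(blobs) < 0.5:   # threshold assumed
        bg.adjust_parameters()
```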

This work has been published in [44].

6.1.6. Learning Shape Metrics

Participants: Anja Schnaars, Guillaume Charpiat.

The notion of shape is important in many fields of computer vision, from tracking to scene understanding. As for usual object features, it can be used as a prior, as in image segmentation, or as a source of information, as in gesture classification. When image classification or segmentation tasks require high discriminative power or precision, the shape of objects naturally appears relevant to our human minds. However, shape is a complex notion which cannot be dealt with directly like a simple parameter in R^n. Modeling shape manually is tedious, and one arising question is that of learning shapes automatically.

Shape evolutions, as well as shape matchings or image segmentation with a shape prior, involve the preliminary choice of a suitable metric in the space of shapes. Instead of choosing a particular one, we propose a framework to learn shape metrics from a set of example shapes, designed to handle sparse sets of highly varying shapes, since typical shape datasets, like human silhouettes, are intrinsically high-dimensional and non-dense. We formulate the task of finding the optimal metric on an empirical manifold of shapes as a classical minimization problem ensuring smoothness, and compute its global optimum fast.

To achieve this, we design a criterion to compute point-to-point matchings between shapes which deals with topological changes. Then, given a training set of shapes, we use these matchings to transport deformations observed on any shape to any other one. Finally, we estimate the metric in the tangent space of any shape, based on the transported deformations, weighted by their reliability. We performed successful experiments on difficult sets; in particular we considered the case of a girl dancing fast (Figure 9). For each shape from the training set, we estimate the most probable deformations (see Figure 10) that it can undergo. More precisely, we estimate the shape metric (deformation costs) that fits the training set best, which leads to a shape prior. We also proposed several applications.
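The report does not state the criterion explicitly; as an illustration only, a minimization of this kind could take the following form, where the symbols, the transport term and the normalization are our assumptions rather than the published formulation. Given shapes $s_i$ and deformations $\delta_{i,k}$ transported to $s_i$ with reliability weights $w_{i,k}$, one may seek a metric $g_i$ per tangent space:

```latex
% Illustrative objective only; not the authors' exact criterion.
\min_{\{g_i\}} \;
  \sum_i \sum_k w_{i,k}\, \delta_{i,k}^{\top} g_i \,\delta_{i,k}
  \;+\; \lambda \sum_{i \sim j} \bigl\| g_i - \tau_{j \to i}(g_j) \bigr\|^2
\qquad \text{s.t.}\ \operatorname{tr}(g_i) = c
```

Here $\tau_{j \to i}$ transports the metric of a neighboring shape $s_j$ to $s_i$ via the point-to-point matching, $\lambda$ weights the smoothness term over the shape manifold, and the trace constraint rules out the degenerate zero metric: observed, reliable deformations should be cheap, but not trivially so.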

The novelty of this work is both theoretical and practical. On the theoretical side, usual approaches consist either in estimating a mean shape pattern and characteristic deformations, or in using kernels based on distances between shapes, while here the framework is based on reliable deformations and transport, and we provide a criterion on metrics to be minimized. On the practical side, usual approaches require either low shape variability in the training set or a high sampling density, neither of which is affordable in practice. Our assumptions are much weaker, so we can deal with much more general datasets; for example, the framework is well suited to videos. This work was published in [35]. The links between texture and shape are also being studied, following an approach developed in [51].

Figure 9. Examples of shapes (from the training set)

Figure 10. Example of probable deformation

6.1.7. Online People Trajectory Evaluation

Keywords: Object Tracking, Online Evaluation.

Participants: Duc Phu Chau, François Brémond, Monique Thonnat.

A tracking algorithm can provide satisfying results in some scenes and poor results in other real-world scenes. A performance evaluation measure is necessary to quantify how reliable a tracking algorithm is in a particular scene. Many types of metrics have been proposed to address this issue, but most of them depend on ground truth data to compare tracking results against. We propose in this work a new online evaluation method that is independent of ground truth data. We want to compute the quality (i.e. coherence) of the obtained trajectories based on a set of seven features. Based on the frequency of each feature for the mobile objects, these features are divided into two groups: "one-time features" and "every-time features". While "one-time features" can be computed once (one unique time) for a mobile object (e.g. the temporal length of its trajectory, the zone where the object leaves the scene), "every-time features" can be computed at each frame during the tracked duration (e.g. color, speed, direction, area and shape ratio).

For each feature, we define a local score in the interval [0, 1] to determine whether the mobile object is correctly tracked or not. The quality of a trajectory is estimated by the summation of local scores computed from the extracted features. The score decreases when the system detects a tracking error and increases otherwise.

Using the seven features, a global score in the interval [0, 1] is defined to evaluate online the quality of a tracking algorithm at each frame. When the global score is greater than 0.5, the tracking algorithm performance is rather good, whereas a global score lower than 0.5 means the tracker generally fails to track the detected objects accurately.
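
A minimal sketch of such a score combination (in Python; the feature names, values and uniform weighting are illustrative, not the published formulation):

  def global_score(local_scores, weights=None):
      # Combine per-feature local scores (each in [0, 1]) into a global
      # trajectory quality score in [0, 1]; uniform weights by default.
      if weights is None:
          weights = [1.0 / len(local_scores)] * len(local_scores)
      return sum(w * s for w, s in zip(weights, local_scores))

  # Hypothetical local scores for the seven features:
  scores = {"trajectory_length": 0.9, "exit_zone": 1.0, "color": 0.8,
            "speed": 0.7, "direction": 0.6, "area": 0.9, "shape_ratio": 0.8}
  quality = global_score(list(scores.values()))
  print("tracking is", "rather good" if quality > 0.5 else "poor")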


We have tested our approach on video sequences of the Caretaker project 1 and the Gerhome project 2. In figure 11, the person in the left image is tracked by a tracking algorithm and the right image shows the online evaluation score of the tracking algorithm. The results of the proposed online evaluation method are compatible with the output of the offline evaluation tool using ground truth data. These experiments validate the performance of the online evaluation method. This work has been published in [37].

Figure 11. Online tracking evaluation of a Gerhome sequence with global score computing

6.1.8. People Tracking Using HOG Descriptors
Keywords: Learning, Understanding.

Participants: Piotr Bilinski, Carolina Garate, Mohammed Bécha Kaâniche, François Brémond.

We propose a multiple object tracking algorithm that handles occlusions. First, for each detected object we compute feature points using the FAST algorithm [61]. Second, for each feature point we build a descriptor based on Histograms of Oriented Gradients (HOG) [53]. Third, we track feature points using these descriptors. Object tracking is possible even if objects are partially occluded. If several objects are merged and detected as a single one, we assign the newly detected feature points in this single object to one of the occluded objects. We apply a probabilistic method for this task using information from the previous frames, like object size and motion information (i.e. speed and orientation). We use multi-resolution images to decrease the processing time. Our approach has been tested on a synthetic video sequence and the public datasets KTH and CAVIAR. The preliminary tests confirm the effectiveness of our approach. This work has been published in [33].
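
A sketch of the feature point detection and descriptor matching core, assuming OpenCV; the patch size, descriptor layout and matching threshold are illustrative, not the published settings:

  import cv2
  import numpy as np

  fast = cv2.FastFeatureDetector_create()
  # 16x16 window, one 16x16 block of four 8x8 cells, 9 bins: 36-D descriptor
  hog = cv2.HOGDescriptor((16, 16), (16, 16), (8, 8), (8, 8), 9)

  def describe(gray, keypoints):
      # One HOG descriptor per feature point, on a patch around it
      # (corners clamped so the window stays inside the image).
      h, w = gray.shape
      locs = [(min(max(int(k.pt[0]) - 8, 0), w - 16),
               min(max(int(k.pt[1]) - 8, 0), h - 16)) for k in keypoints]
      return hog.compute(gray, locations=locs).reshape(len(locs), -1)

  def match(desc_prev, desc_cur, max_dist=1.0):
      # Greedy nearest-neighbour matching of descriptors between frames.
      matches = []
      for i, d in enumerate(desc_prev):
          dists = np.linalg.norm(desc_cur - d, axis=1)
          j = int(np.argmin(dists))
          if dists[j] < max_dist:
              matches.append((i, j))
      return matches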

An extension of this tracker has been proposed for crowd analysis. The real-time recognition of crowd dynamics in public places is becoming essential to avoid crowd-related disasters and ensure the safety of people. We introduce a new approach for crowd event recognition. Our study begins with the previous tracking method, based on HOG descriptors, and then uses pre-defined models (i.e. crowd scenarios) to recognize crowd events. We define these scenarios using statistical analysis of the datasets used in the experimentation. The approach is characterized by combining a local analysis with a global analysis for crowd behavior recognition. The local analysis is enabled by a robust tracking method, and the global analysis is done by a scenario modeling stage. This work has been published in [39].

6.1.9. Human Gesture Learning and Recognition
Keywords: Learning, Understanding.

1 http://www-sop.inria.fr/pulsar/personnel/Francois.Bremond/topicsText/caretakerProject.html
2 http://www-sop.inria.fr/members/Francois.Bremond/topicsText/gerhomeProject.html


Participants: Mohammed Bécha Kaâniche, François Brémond.

We aim at recognizing gestures (e.g. hand raising) and, more generally, short actions (e.g. falling, bending) accomplished by an individual in a video sequence. Many techniques have already been proposed for gesture recognition in specific environments (e.g. laboratory), using the cooperation of several sensors (e.g. camera network, individual equipped with markers). Despite these strong hypotheses, gesture recognition is still brittle and often depends on the position of the individual relative to the cameras. We propose to reduce these hypotheses in order to conceive a general algorithm enabling the recognition of the gestures of an individual acting in an unconstrained environment and observed through a limited number of cameras. The goal is to estimate the likelihood of gesture recognition as a function of the observation conditions.

We propose a gesture recognition method based on local motion learning. First, for a given individual in a scene, we track feature points over the whole body to extract the motion of the body parts. We expect the feature points to be sufficiently distributed over the body to capture fine gestures. We have chosen corner points as feature points to improve the detection stage and HOG (Histograms of Oriented Gradients) as descriptors to increase the reliability of the tracking stage. Thus, we track the HOG descriptors in order to extract the local motion of the feature points.

In order to recognize gestures, we propose to learn and classify gestures based on the k-means clustering algorithm and the k-nearest neighbors classifier. For each video in a training dataset, we generate all local motion descriptors and annotate them with the associated gesture. Then, for each training video taken separately, the descriptors are clustered into k clusters using the k-means clustering algorithm. The k parameter is set empirically. Each cluster is associated with its corresponding gesture, so similar clusters can be labeled with different gestures. Finally, with all generated clusters as a database, the k-nearest neighbors classifier is used to classify gestures occurring in the test dataset. A video is classified according to the number of neighbors which have voted for a given gesture, providing the likelihood of the recognition.

We demonstrate the effectiveness of our motion descriptors by recognizing the actions of the KTH and IXMAS public databases. This work has been published in [41], [23].
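
A minimal sketch of these learning and classification stages, assuming scikit-learn; descriptor extraction is abstracted away and the values of k are illustrative:

  import numpy as np
  from sklearn.cluster import KMeans
  from sklearn.neighbors import KNeighborsClassifier

  def build_gesture_database(training_videos, k=20):
      # training_videos: list of (descriptors, gesture) pairs, one (N, D)
      # descriptor array per video; each video is clustered separately.
      centers, labels = [], []
      for descriptors, gesture in training_videos:
          km = KMeans(n_clusters=k, n_init=10).fit(descriptors)
          centers.append(km.cluster_centers_)   # k clusters per video...
          labels.extend([gesture] * k)          # ...all labeled with its gesture
      return np.vstack(centers), np.array(labels)

  def classify(test_descriptors, centers, labels, n_neighbors=5):
      knn = KNeighborsClassifier(n_neighbors=n_neighbors).fit(centers, labels)
      votes = knn.predict(test_descriptors)     # one vote per local descriptor
      gestures, counts = np.unique(votes, return_counts=True)
      # Winning gesture plus the fraction of votes as recognition likelihood.
      return gestures[np.argmax(counts)], counts.max() / counts.sum()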

6.1.10. Monitoring Elderly Activities, Using Multi Sensor Approach and Uncertainty Management
Participants: Nadia Zouba, Bernard Boulay, Valery Valentin, Rim Romdhame, Daniel Zullo, François Brémond, Monique Thonnat.

6.1.10.1. Monitoring Daily Activities of Elderly Living Alone at Home

Participants: Nadia Zouba, Valery Valentin, Bernard Boulay, François Brémond, Monique Thonnat.

In the framework of monitoring elderly activities at home, we have proposed an approach combining heterogeneous sensor data for recognizing these activities. It consists in combining data provided by video cameras with data provided by environmental sensors to monitor the interaction of people with their environment.

In this work we have made a strong effort in event modeling. The result is 100 models representing our knowledge base of events for home care applications. This knowledge base can be used in other applications in the same domain.

In this approach we have also proposed a sensor model able to give a coherent representation of the information provided by various types of physical sensors. This sensor model includes the uncertainty in sensor measurements.

The approach is used to define a behavioral profile for each person and to compare these behavioral profiles. The first step to establish the behavioral profile of an observed person is to determine his/her daily activities. This behavioral profile is defined as a set of the most frequent and characteristic (i.e. interesting) activities of an observed person. The basic goal of determining a behavioral profile is to measure variables from persons during their daily activities in order to capture deviations in activity and posture, so as to facilitate timely intervention or provide automatic alerts in emergency cases.


In order to evaluate the whole proposed activity monitoring framework, several experiments have been performed. The main objectives of these experiments are to validate the different phases of the activity monitoring framework, to highlight interesting characteristics of the approach, and to evaluate the potential of the framework for real world applications.

The results of this approach are shown for the recognition of Activities of Daily Living (ADLs) of real elderly people living in an experimental apartment (the Gerhome laboratory) equipped with video sensors and environmental sensors.

Results comparing volunteer 1 (male, 64 years old) and volunteer 2 (female, 85 years old), observed during 4 hours, show the greater ADL ability of the 64-year-old adult compared to that of the 85-year-old:

• Volunteer 1 (64 years) changed zones more often than volunteer 2 (85 years) (for "entering livingroom": 20 vs. 13), and did this at a quicker pace (00:01:15 vs. 00:02:42), showing a greater ability to walk.

• Volunteer 1 was more often seen "sitting on chair" (15 vs. 4), but volunteer 2 was "sitting on chair" for a longer duration (03:30:29 vs. 01:36:43), also showing a greater ability of volunteer 1 to move in the apartment.

• Volunteer 1 used the "upper cupboard" more than volunteer 2 (22 vs. 9), and in a quicker way (00:00:57 vs. 00:04:43).

• Volunteer 1 was more able to use the stove (fewer trials for "using stove": 35 vs. 106).

• Similarly, volunteer 1 was "bending" twice as much as volunteer 2 (30 vs. 15), and in a quicker way (00:00:03 vs. 00:00:12), showing greater dynamism for the younger volunteer.

More details about the proposed approach and the obtained results are described in [49], [48] and in [29].

In the current work, the proposed activity recognition approach was evaluated in the experimental laboratory (Gerhome) with fourteen real elderly people. The next step of this work is to test this approach in a hospital environment (see the next section), involving more people with different wellness and health status (e.g. Alzheimer patients).

6.1.10.2. Monitoring Activities for Alzheimer Patients from CHU Nice

Participants: Rim Romdhame, Daniel Zullo, Nadia Zouba, Bernard Boulay, François Brémond, Monique Thonnat.

We propose to develop a framework for monitoring Alzheimer patients as a continuation of the ADL monitoring framework. The basic goal is to determine behavioral profiles of Alzheimer patients and evaluate these profiles. With the help of doctors, a specific scenario has been established to evaluate the behaviors of Alzheimer patients. Some experiments have been performed in a room in the CHU of Nice equipped with video cameras, where elderly people and medical volunteers have spent between 15 min and 1 hour:

• 1 Alzheimer Volunteer (80 years old)3

• 3 Elderly Volunteers (64-85 years old)

• 5 Young Volunteers (20-30 years old)

• 1 medical staff Volunteer (25-30 years old)

3 He and his relatives have signed an agreement.


The second goal of this framework is to handle the uncertainty of event recognition. Most previous approaches able to recognize events while handling uncertainty are 2D approaches which model an activity as a set of pixel motion vectors. These 2D approaches can only recognize short and primitive events but cannot address composite events. We propose a video interpretation approach based on uncertainty handling. The main goal is to improve the techniques of automatic video data interpretation by taking into account the imprecision of the recognition. To attain our goal, we have extended the event recognition algorithm described by T. Vu by modeling the scenario recognition uncertainty and computing the precision of the 3D information characterising the mobile objects moving in the scene. We have used the 3D information to compute the spatial probability of the event. We have also computed the temporal probability of an event based on its spatial probability at the previous instant. This approach is validated using a homecare application which tracks elderly people living at home and recognizes events of interest specified by gerontologists. This work has been published in [46].
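
One simple recursive form consistent with this description (a sketch only; the combination rule actually used in [46] may differ, and the weighting is illustrative):

  def temporal_probability(p_spatial_t, p_temporal_prev, alpha=0.7):
      # The event probability at time t blends the current spatial
      # evidence with the event probability at the previous instant.
      return alpha * p_spatial_t + (1.0 - alpha) * p_temporal_prev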

6.1.11. Trajectory Clustering for Activity Learning
Keywords: Data-mining, Learning, Understanding.

Participants: José Luis Patino Vilchis, Guido-Tomas Pusiol, François Brémond, Monique Thonnat.

Trajectory information is a rich descriptor which can produce essential information for activity learning and understanding. Our work on the analysis of underground stations with trajectory clustering [34] has shown that the trajectory patterns associated with the clusters are indicative of specific behaviors and activities. Our new studies explore the application of trajectory clustering in two new domains (with different behavior types): 1) monitoring of elderly people at home; 2) monitoring the ground activities at an airport dockstation.

1) Monitoring of elderly people at home.

In this work we propose a framework to recognize and classify loosely constrained activities with minimal supervision. The framework uses basic trajectory information as input and goes up to video interpretation. The work reduces the gap between low-level information and semantic interpretation, building an intermediate layer of Primitive Events. The proposed representation for primitive events aims at capturing small coherent units of motion over the scene, with the advantage of being learnt in an unsupervised manner. We propose the modeling of an activity using Primitive Events as the main descriptors. The activity model is built in a semi-supervised way using only real tracking data.

The approach is composed of 5 steps. First, people are detected and tracked in the scene and their trajectories are stored in a database, using a region-based tracking algorithm. Second, the topology of the scene is learnt, using the regions where the person usually stands and stops, interacting with fixed objects in the scene. Third, the transitions between these Slow Regions are computed by cutting the observed person trajectory; these transitions correspond to short units of motion and can be seen as basic elements constituting more complex activities. Fourth, a coarse activity ground-truth is manually produced on a reference video corresponding to the first monitored person; Primitive Event histograms are computed and labelled by performing a matching stage with this ground-truth. Fifth, using these labeled Primitive Event histograms, the activities of a second monitored person can be automatically recognized.
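
A sketch of the final recognition step, assuming primitive events are encoded as integer codes; the histogram normalization and nearest-neighbour distance are illustrative choices:

  import numpy as np

  def event_histogram(primitive_events, n_codes):
      # Normalized histogram of primitive event codes over a time window.
      h = np.bincount(primitive_events, minlength=n_codes).astype(float)
      return h / max(h.sum(), 1.0)

  def recognize(window_events, labeled_histograms, n_codes):
      # labeled_histograms: (histogram, activity label) pairs obtained by
      # matching against the manually annotated reference video.
      h = event_histogram(window_events, n_codes)
      dists = [np.linalg.norm(h - ref) for ref, _ in labeled_histograms]
      return labeled_histograms[int(np.argmin(dists))][1]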

We validate the approach by recognizing and labeling modeled activities in a home-care application (Gerhome project). The video datasets used capture the living room and kitchen of an apartment. Each video dataset contains an aged person performing activities, such as "the person is eating". These activities are learnt and discovered using different video datasets. This work has been published in [45].

2) Monitoring the ground activities at an airport dockstation.

In this work we employ trajectory-based analysis for activity extraction from apron monitoring at the Toulouse airport in France. We aim at helping the infrastructure managers; for the everyday operation we provide environmental figures, which include the location and number of people in the monitored areas (occupation map), and the activities themselves. We have thus built a system to 1) learn which monitored areas are normally occupied, and then 2) perform activity pattern discovery with interpretable semantics.


Trajectory clustering is employed mainly to discover the points of entry and exit of mobiles appearing in the scene. Proximity relations between the resulting clusters of detected mobiles, as well as between clusters and contextual elements of the scene, are employed to build the occupancy zones and characterise the different ongoing activities of the scene. We study the scene activity at different granularities, which gives the activity description either in broad terms or with detailed information, thus managing different information levels. By including temporal information we are able to find spatio-temporal patterns of activity. Thanks to an incremental learning procedure, the system is capable of handling large amounts of data. We have applied our algorithm to five video datasets corresponding to different monitoring instances of an aircraft in the airport docking area, which corresponds to about five hours of analysed video. Figure 12 shows occupancy zones at the system-selected information levels for scene activity reporting. We elaborate activity maps with a semantic description of the discovered zones (and thus of the associated activities). From the analysed sequences, we were able to recognize activities such as 'GPU arrival', 'Loading', 'Unloading'.

Figure 12. Activity maps for the sequence 'cof1'. The left panel shows the user defined zones to be monitored. The right panel shows the most employed discovered zones. Colors (from blue to red) indicate the occupancy level of all zones.

6.2. Software Engineering for Activity Recognition
Participants: François Brémond, Erwan Demairy, Sabine Moisan, Anh-Tuan Nghiem, Annie Ressouche, Jean-Paul Rigault, Monique Thonnat, Jean-Yves Tigli, Christophe Tornieri.

6.2.1. Introduction
Participants: Monique Thonnat, François Brémond, Erwan Demairy, Christophe Tornieri.

This year Pulsar has developed a new software platform: the SUP platform. It is the backbone of the team experiments to implement the new algorithms proposed by the team in perception, understanding and learning. We study a meta-modeling approach to support the development of video surveillance applications based on SUP. We also introduce the notion of relation in our knowledge description language. We study the development of a scenario recognition module relying on formal methods to support activity recognition in the SUP platform. We began to study the definition of multiple services for a device adaptive platform for scenario recognition. We continue to develop the Clem toolkit around a synchronous language dedicated to activity recognition applications.


SUP Software Platform

SUP is designed as a framework allowing several video surveillance workflows to be implemented. Currently, the workflow is static for a given application. A given workflow is the composition of several plugins, each of them implementing an algorithmic step in the video processing (e.g. the segmentation of images, the classification of objects, etc.).

The design of SUP allows the selected plugins to be executed at runtime. Currently 15 plugins are available:

• 6 plugins are wrappers on industrial implementations of algorithms (made available by Keeneo). They allow a quick deployment of a video processing chain encompassing image acquisition, segmentation, short-term and long-term tracking. These algorithms are robust and efficient, but with the drawback that some of them can lack accuracy.

• 9 are implementations by the team members which cover the following fields:

– one segmentation removing the shadows;

– two classifiers, one being based on postures and one on people detection;

– four frame-to-frame trackers, using as algorithms: (i) a simple tracking by overlapping, (ii) neural networks, (iii) tracking of feature points, (iv) tracking specialized for persons in a crowd;

– two scenario recognizers, one generic, allowing the expression of probabilities on the recognized events, and the other one focusing on the recognition of events based on postures.

From a software engineering point of view, the goal is to obtain a dynamically reconfigurable platform, as described in the next section.

6.2.2. Model-Driven Engineering for Video Surveillance
Participants: Sabine Moisan, Jean-Paul Rigault, Guiseppe Santoro.

This year we have explored how model-driven engineering techniques can support the configuration and dynamic adaptation of the video surveillance systems designed with our SUP platform. This work is done in collaboration with the MODALIS team of the UNSA/CNRS I3S laboratory.

In the video surveillance community, the focus has moved from individual vision algorithms to integrated and generic software platforms (such as SUP), and now to the security, scalability, evolution, and ease of use of these platforms. The latter trends require a modeling effort for video surveillance component platforms as well as for application specification. Our approach is to apply modeling techniques to the application specification (describing the video surveillance task and its context) as well as to the implementation (assembling the software components). Model-driven engineering (MDE) uses models to represent partial or complete views of an application or a domain, possibly at different abstraction levels. MDE offers techniques to transform a source model into a target one.

The number of different tasks (such as detection, counting, or tracking), the complexity of contextual information, and the relationships among them induce many possible variants. The first activity of a video surveillance application designer is to sort out these variants to precisely specify the function to realize and its context. In our case study, the underlying software architecture is component-based (SUP). The processing chain consists of components that transform data before passing it to other components. As a result, the designer has to map this specification to software components that implement the needed algorithms.

The challenge is to cope with the many causes of variability, functional as well as nonfunctional, on both sides (specification and implementation). Hence, we first decided to separate these two concerns. We then applied domain engineering to identify the reusable elements on both sides. This led to two models: a generic model of video surveillance applications (for short, application model) and a model of video processing components and chains (component platform configuration model, for short, platform model). Both of them are feature models expressing variability factors. Feature models are a popular formalism used to model software product line commonalities and variabilities. They compactly define all features in a product line and their valid combinations. A feature model is basically an AND-OR graph with constraints which organizes hierarchically a set of features while making explicit the variability. Our models are also enriched with intra- and inter-model constraints. Inter-model constraints specify how the system should adapt to changes in its environment. It is convenient to use the same kind of models on both sides, leading to a uniform syntax. Feature models are appropriate to describe variants; they are simple enough for video surveillance experts to express their requirements. Yet, they are powerful enough to be liable to static analysis [58]. In particular, the inter-feature constraints can be analyzed as a SAT problem.
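
A minimal sketch of such an analysis, with a brute-force satisfiability check standing in for a real SAT solver; the feature names echo figure 13 and the constraints are illustrative:

  from itertools import product

  features = ["Counting", "Intrusion", "Night", "HeadLight", "HeadLightDetection"]
  constraints = [
      lambda f: f["Counting"] or f["Intrusion"],   # at least one task selected
      lambda f: (not (f["Night"] and f["HeadLight"])) or f["HeadLightDetection"],
  ]

  def valid_configurations(features, constraints):
      # Enumerate all assignments, keep those satisfying every constraint.
      for values in product([False, True], repeat=len(features)):
          f = dict(zip(features, values))
          if all(c(f) for c in constraints):
              yield f

  print(sum(1 for _ in valid_configurations(features, constraints)))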

The application model describes the relevant concepts and features from the stakeholders' point of view, in a way that is natural in the video surveillance domain: characteristics and position of sensors, context of use (day/night, in/outdoors, target task)... It also includes notions of quality of service (performance, response time, detection robustness, configuration cost...). Such a model provides a user friendly way to specify a problem for SUP. The platform model describes the different software components and their assembly constraints (ordering, alternative algorithms...). Ultimately, we wish to automatically generate a SUP component assembly from an application specification, using model to model transformations. The first model can be transformed into one or several valid component configurations, in order to map the specification onto software components that implement the needed algorithms. Due to the multiple causes of variability, the result of the transformation is usually a set of possible component assemblies fulfilling the task and the context specification; hence the designer has to manually fine-tune the assembly. This approach allows designers to define the static initial configuration of a video surveillance system [31].

Figure 13. Simplified feature model of video surveillance applications. (The diagram decomposes a video surveillance application into features such as Task (Counting, Intrusion), QoS (Precision, Sensitivity, Quality), Object of interest, and Scene context (Scene description, Environment, Lighting, Noise).)

Concretely, we have developed a generic feature diagram editor to manipulate the models, using ECLIPSE meta-modeling facilities (EMF, ECORE, GMF...). At this time, we have a first prototype that allowed us to represent both models. The current tool only supports natural language constraints, but we have experimented with the KERMETA workbench to implement some model to model transformations. We have also developed an interface with SAT tools to verify the generated configurations.

An additional challenge is to manage the dynamic variability of the context, to cope with possible run-time changes of implementation triggered by context variations (e.g. lighting conditions, changes in the reference scene, etc.). Video surveillance systems are indeed a good example of dynamic adaptive systems, i.e. software systems which have to dynamically adapt in order to cope with a changing environment. Such systems must be able to sense their environment, to autonomously select an appropriate configuration and to efficiently migrate to this configuration. Handling these issues at the programming level proves to be challenging due to the large number of contexts and of software configurations. The use of models at run-time is an alternative solution. In our approach, the adaptation logic is defined by adaptation rules, corresponding to the dependency constraints between specification elements in one model and software variants in the other [30]. A context change at run-time corresponds to a new configuration in the application model and must lead to a new configuration of the platform model. Adaptation rules use the constraint language of feature models (i.e. a propositional logic-based language). These rules address features possibly connected with "and", "or", "not"; their actions correspond to changes in the processing chain. As an example, the following adaptation rule:

Night and HeadLight implies HeadLightDetection

states that if the context changes (from day) to night and if headlights (e.g. of vehicles) must be taken into account, a new component from the platform (HeadLightDetection module) must be integrated in the running processing chain. To ensure its usability, the proposed approach has been built on top of a Domain Specific Modeling Language (DSML [55]) for adaptive systems.
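
A minimal sketch of how such a rule might be triggered at run-time; the rule encoding and chain representation are illustrative, not the DSML of [55]:

  def apply_adaptation_rules(context, chain, rules):
      # Each rule is (condition on the context, component to integrate).
      for condition, component in rules:
          if condition(context) and component not in chain:
              chain.append(component)  # integrate module into running chain
      return chain

  rules = [(lambda c: c["Night"] and c["HeadLight"], "HeadLightDetection")]
  chain = ["Acquisition", "Segmentation", "Tracking"]
  print(apply_adaptation_rules({"Night": True, "HeadLight": True}, chain, rules))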

6.2.3. Scenario Analyze Module
Participants: Bernard Boulay, Sabine Moisan, Annie Ressouche, Jean-Paul Rigault.

This research axis contributes to the study and the development of a module to analyze scenarios.

This year we have studied models of scenarios dealing with both real time (to be realistic and efficient in the analysis phase) and logical time (to benefit from well-known mathematical models allowing re-usability, easy extension and verification). Scenarios are mostly used to specify the way a system may react to sensor inputs. Therefore, models of scenarios must also take into account the uncertainty of sensor results. To address these needs (logical time, real time and uncertainty) we have defined a language to express scenarios as compositions of sub-scenarios. Basic scenarios are composed of events, while general scenarios are expressed as compositions of temporal relations (before, during, overlap) between sub-scenarios. Temporal constraints can also be expressed. Moreover, the language supports the definition of external types and functions in order to allow scenarios to handle events of different kinds.

For behavior recognition, as for all automated systems, validation is a crucial phase and an exhaustive approach to validation is clearly needed. To be trusted, behavior analysis must rely on formal methods from the very beginning of its design. Formal methods help to produce sound code whose size and frequency can be estimated. Hence, we defined a synchronous model for our scenario language. This modeling supports the representation of scenarios as equation systems which compute two Boolean variables for each scenario: beginning, true when the first event(s) or sub-scenario(s) of the scenario is recognized, and termination, true when the scenario is recognized. This theoretical approach leads to the definition of a scenario analyze module (SAM). This module provides users with (1) a simulation tool to test scenario behaviors; (2) a recognition program in C for each scenario, which must be completed by the user with the definition of external types and functions in a C or C++ environment; (3) an exhaustive verification of safety properties relying on the model checking techniques our approach allows. The latter also offers the possibility to define the safety properties we want to prove as "observers" [57] expressed in the scenario language.
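
To illustrate the two flags, here is a minimal sketch, in Python rather than the generated C, of how a compiled "before" composition might behave; the class names and stepping scheme are illustrative:

  class Event:
      # Basic scenario: recognized when its event name occurs in a frame.
      def __init__(self, name):
          self.name = name
          self.beginning = self.termination = False
      def step(self, events):
          self.beginning = self.termination = self.name in events

  class Before:
      # Composite scenario "A before B": begins when A begins, terminates
      # when B terminates after A has already terminated.
      def __init__(self, a, b):
          self.a, self.b, self.a_done = a, b, False
          self.beginning = self.termination = False
      def step(self, events):
          self.a.step(events); self.b.step(events)
          self.beginning = self.a.beginning
          self.a_done = self.a_done or self.a.termination
          self.termination = self.a_done and self.b.termination

  s = Before(Event("enter_zone"), Event("sit_down"))
  for frame_events in [{"enter_zone"}, set(), {"sit_down"}]:
      s.step(frame_events)
  print(s.termination)  # True: "enter_zone before sit_down" recognized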

6.2.4. Multiple Services for Device Adaptive Platform for Scenario Recognition
Participants: Annie Ressouche, Jean-Yves Tigli.

Activity recognition and monitoring systems based on multi-sensor and multi-device approaches are more and more popular to enhance event production for scenario analysis. The underlying software and hardware infrastructures can be considered as static (no changes during the overall recognition process), quasi-static (no changes between two reconfigurations of the process) or really dynamic (depending on the dynamic appearance and disappearance of numerous sensors and devices in the scene, communicating with the system, during the recognition process).


In this last case, we need to partially and reactively adapt the application to the evolution of the environment, while preserving invariants required for the validity of the recognition process.

In order to address such a challenge, our research tries to federate the inherent constraints of platforms devoted to action recognition, like SUP, with a service-oriented middleware approach to deal with dynamic evolutions of the system infrastructure. Recent results, using a Service Lightweight Component Architecture (SLCA) [28] to compose services for devices and Aspects of Assembly (AA) to adapt them in a reactive way [27], present interesting prospects to deal with multi-device and variable systems. They provide a user-friendly separated description for adaptations that shall be applied and composed at runtime as soon as the corresponding required devices are present (in a context-sensitive security middleware layer for example [40]). They also exhibit performances and response times that allow reactive adaptation on appearance and disappearance of devices.

However, although composition between these adaptations can verify proved properties, the use of blackbox components in the composition model of SLCA does not allow extracting a model of their behavior. Thus, existing approaches do not really succeed in ensuring that the usage contract of these components is not violated during application adaptation. Only a formal analysis of the component behavior models, associated with a sound modeling of the composition operation, will allow us to guarantee that usage contracts are respected.

In this axis, we propose to rely on a synchronous modeling of component behavior and component assembly to allow the usage of model checking techniques to formally validate service composition.

We began to consider this topic in 2008 through a collaborative action (SynComp) between the Rainbow team at the University of Nice Sophia Antipolis and the INRIA Pulsar team. Within the Rainbow team, the SLCA/AA experimental platform called WComp is dedicated to the reactive adaptation of applications in the domain of ubiquitous computing. During this collaboration, the management of concurrent access in WComp has been studied as a main source of disturbance for the invariant properties. A modeling of the behavior of components and of their accesses in a synchronous model has been defined in WComp.

This approach allows us to benefit from model checking techniques to ensure that there are no unpredictable states of WComp components on concurrent access. This year, during his training, Vivien Fighiera (already involved in the SynComp action) completed the theoretical work done in SynComp. He studied how to prove safety properties regarding WComp component models relying on the NuSMV [52] model checker. This year, the collaboration between Rainbow and Pulsar has been strengthened since Jean-Yves Tigli has been a full-time researcher in the Pulsar team since September, on a sabbatical year sponsored by INRIA. Now, we plan to model the overall assembly of WComp components with a synchronous approach to allow the usage of model checking techniques to formally validate application design. In order to obtain results based on experimental scenarios to evaluate SynComp improvements for the adaptive recognition process, we plan to integrate the SUP platform as a software services provider.

The SUP platform gathers a set of modules devoted to designing applications in the domain of activity recognition. WComp is aimed at assembling services which evolve in a dynamic and heterogeneous environment. Indeed, the services provided by SUP can be seen as complex high-level services whose functionalities depend on the SUP treatments, the latter dealing with the dynamic changes of the environment. Thus, considering SUP services as web services for devices, for example, the devices associated with SUP services will be discovered dynamically by WComp and used with other heterogeneous devices.

6.2.5. The Clem Workflow
Participants: Annie Ressouche, Daniel Gaffé.

This research axis concerns the theoretical study of a synchronous language LE with modular compilation and the development of a toolkit (figure 14) around the language to design, simulate, verify and generate code for programs.

The LE language agrees with the Model-Driven Software Development philosophy, which is now well known as a way to manage complexity, to achieve a high level of re-use, and to significantly reduce the development effort. Therefore, we benefit from a formal framework well suited to compilation and formal validation. In practice, we defined two semantics for LE: a behavioral semantics to define a program by the set of its behaviors, avoiding ambiguities in program interpretations; and an equational semantics to allow modular compilation of programs into software and hardware targets (C code, VHDL code, FPGA synthesis, observers...). Our approach fulfills two main requirements of critical realistic applications: modular compilation to deal with large systems and a model-based approach to perform formal validation.

Figure 14. The Clem Toolkit

The main originality of this work is to be able to manage both modularity and causality. Indeed, only a few approaches consider modular compilation because there is a deep incompatibility between causality and modularity. Causality means that for each event generated in a reaction, there is a causal chain of events leading to this generation. No causal loop may occur. Program causality is a well-known problem with synchronous languages, and therefore it needs to be checked carefully. Thus, relying on semantics to compile a language ensures a modular approach but requires completing the compilation process with a global causality check. To tackle this problem, we introduced a new way to check causality from already checked sub-programs, and the modular approach we infer.

This year we focused on the algorithms to check causality. To compile LE programs we rely on an equational semantics that translates each program into an equation system. A program is causal (i.e. it has no causality cycle) if its associated equation system has no dependency cycle between its variables. Thus, checking causality amounts to finding an evaluation order for equation systems. Usually, several total orders are valid when sorting an equation system, and choosing one particular order prevents modularity. Indeed, starting with two totally ordered equation systems could result in a falsely cyclic equation system when merging them. To ensure modularity, we define a sorting algorithm that computes all the valid partial orders of an equation system. We complete our approach by also defining an algorithm to merge two previously sorted equation systems. These algorithms are the cornerstone of our approach. This year we defined the merge algorithm and we proved (1) that our sorting algorithm is the greatest fixed point of a function defined on the dependency graph of equation systems; (2) that the merge algorithm is correct (i.e. we get the same partial orders using the merge algorithm on two previously sorted equation systems as when sorting the union of the two equation systems).
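
A sketch of the level-based idea behind such partial-order sorting (Kahn-style propagation; it illustrates the principle, not the published algorithm):

  from collections import defaultdict

  def sort_equations(deps):
      # deps: variable -> set of variables its equation reads. Variables
      # are grouped into levels; every valid total order respects them.
      indeg = {v: len(d) for v, d in deps.items()}
      users = defaultdict(set)
      for v, d in deps.items():
          for u in d:
              users[u].add(v)
      levels, ready = [], [v for v, n in indeg.items() if n == 0]
      while ready:
          levels.append(ready)
          nxt = []
          for u in ready:
              for v in users[u]:
                  indeg[v] -= 1
                  if indeg[v] == 0:
                      nxt.append(v)
          ready = nxt
      if sum(len(l) for l in levels) < len(deps):
          raise ValueError("causality cycle detected")
      return levels

  print(sort_equations({"a": set(), "b": {"a"}, "c": {"a"}, "d": {"b", "c"}}))
  # [['a'], ['b', 'c'], ['d']]: b and c remain unordered with each other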


On the other hand, this year we have extended the language to support data handling. According to the LE principle, only signals carry data. Data belong to predefined types (usual programming language types) or to some external types. We improved the syntax to allow data definition for signals. Then we also extended the internal exchange format (LEC), which we defined to support modularity, to take values into account. We also improved the compiler of the language to deal with signal data and, finally, we improved the code generation. Now, the integration of data must be achieved in the simulator and in the code we generate to feed the NuSMV model checker.

7. Contracts and Grants with Industry

7.1. Industrial Contracts
The Pulsar team has strong collaborations with industrial partners through European projects and national grants, in particular with STMicroelectronics, Bull, Thales, Sagem, Alcatel, Bertin, AKKA, Metro of Turin (GTT), Metro of Paris (RATP) and Keeneo.

8. Other Grants and Activities

8.1. European Projects
The Pulsar team has been involved this year in two European projects: a project on multimedia information processing (ViCoMo) and a project on machine learning and activity monitoring (COFRIEND).

8.1.1. ViCoMo Project
ViCoMo is an ITEA 2 European project which started on the 1st of October 2009 and will last 36 months. This project concerns advanced video-interpretation algorithms on video data that are typically acquired with multiple cameras. ViCoMo is focusing on the construction of realistic context models to improve the decision making of complex vision systems and to produce a faithful and meaningful behavior. The context of an event (a crime, a group activity, or a computer-aided diagnosis) can be found with multiple sensors, like a multi-camera set-up in a surveillance system. The general goal of the ViCoMo project is thus to find the context of events that were captured by the cameras or image sensors, and to model the context such that reliable reasoning about an event can be established. Hence, modeling of events and the 3D surroundings helps to recognize the behavior of persons, objects and events in a 3D view.

The project is executed by a strong international consortium, including large high-tech companies (e.g. Philips, Acciona, Thales) and smaller innovative SMEs (CycloMedia, VDG Security), and is complemented with relevant research groups and departments from well-known universities (TU Eindhoven, University of Catalonia, Free University of Brussels) and research institutes (INRIA, CEA List, Multitel). Participating countries are France, Spain, Finland, Turkey, Belgium and the Netherlands.

8.1.2. COFRIEND Project
COFRIEND is a European project in collaboration with Akka, the University of Hamburg (Cognitive Systems Laboratory), the University of Leeds, the University of Reading (Computational Vision Group) and Toulouse-Blagnac Airport. It began at the beginning of February 2008 and will last 3 years. The main objective of this project is to develop techniques to recognise and learn automatically all servicing operations around aircraft parked on aprons.

8.2. International Grants and Activities
Pulsar is involved in an academic collaboration with MICA and the University of Hanoi in Vietnam (a joint PhD has been defended at the beginning of 2009).


8.2.1. Joint Partnership with Vietnam
Pulsar has been cooperating with the Multimedia Research Center MICA in Hanoi on semantics extraction from multimedia data. Currently we continue through the joint supervision by A. Boucher and M. Thonnat of Thi Lan Le's PhD on video retrieval (funded by an AUF grant). Thi Lan Le defended her PhD in February 2009 on the topic: Semantic-based Approach for Image Indexing and Retrieval [24], [26].

8.2.2. Shape metrics and statistics
Pulsar is collaborating with Guillermo Sapiro's team at the University of Minnesota on the design of new shape distances and shape statistics. Guillaume Charpiat visited Sapiro's team for 3 weeks this July in Minneapolis.

8.3. National Grants and Collaborations
The Pulsar team has six national grants: the first two concern the involvement of the team in a "pôle de compétitivité" and an ANR project on video surveillance. Two projects concern long-term people monitoring at home. We continue both our collaboration with INRA and our collaboration with STMicroelectronics. We also continue our collaboration with Ecole des Mines de Paris.

8.3.1. SYSTEM@TIC SIC Project
Pulsar is strongly involved in the SYSTEM@TIC "pôle de compétitivité" and in particular in the SIC project (Sécurité des Infrastructures Critiques), which is a strategic initiative in perimeter security. More precisely, the SIC project is funded for 42 months, with industrial partners including Thales, EADS, BULL, SAGEM, Bertin and Trusted Logic.

8.3.2. VIDEO-ID Project
Pulsar is participating in an ANR research project on intelligent video surveillance and people biometrics. The project lasts 3 years and will be over in February 2011. The involved partners are: Thales-TSS, EURECOM, TELECOM and Management SudParis, UIC, Metro of Paris (RATP), DGA and STSI.

8.3.3. BioSERRE: Video Camera Network for Early Pest Detection in Greenhouses
Pulsar cooperates with Vista (INRIA Rennes - Bretagne Atlantique), INRA Avignon UR407 Pathologie Végétale and the CREAT Research Center (Chambre d'Agriculture des Alpes Maritimes) in an ARC project (BioSERRE) for early detection of crop pests, based on video analysis and interpretation.

8.3.4. MONITORE: Real-time Monitoring of TORE Plasma
Pulsar is involved in an Exploratory Action called MONITORE for the real-time monitoring of imaging diagnostics to detect thermal events in tore plasma. This work is a preparation for the design of the future ITER nuclear reactor and is done in partnership with the Plasma Facing Component group of the IRFM/CEA at Cadarache. This action is supported by EFDA (European Fusion Development Agreement) through a research fellowship attributed to V. Martin. Through this action, Pulsar is also a member of the FR-FCM (French federation for research on controlled magnetic fusion).

8.3.5. Long-term People Monitoring at Home
Pulsar has a collaboration with CSTB (Centre Scientifique et Technique du Bâtiment) and the Nice City Hospital (Groupe de Recherche sur la Trophicité et le Vieillissement) in the CIU Santé project, funded by DGCIS. The CIU Santé project is devoted to experimenting with and developing techniques that allow long-term monitoring of elderly people at home. In this project an experimental room has been set up in the Nice hospital, relying on the research of the Pulsar team concerning event recognition.


We have started the SWEET-HOME project, an ANR TECSAN French project running from Nov 1st 2009 to 2012 (3 years) on long-term monitoring of elderly people at hospital, with Nice City Hospital, Actis Ingenierie, the MICA Center (CNRS unit - UMI 2954) in Hanoi, Vietnam, the SMILE Lab at National Cheng Kung University, Taiwan, and National Cheng Kung University Hospital. The INRIA grant is 240 Keuros out of 689 Keuros for the whole project. The SWEET-HOME project aims at building an innovative framework for modeling activities of daily living (ADLs) at home. These activities can help assessing the evolution of elderly diseases (e.g. Alzheimer, depression, apathy) or detecting precursors such as unbalanced walking, speed, walked distance, psychomotor slowness, frequent sighing and frowning, and social withdrawal resulting in increasing indoor hours. The SWEET-HOME project focuses on two aspects related to Alzheimer disease: (1) to assess the initiative ability of the patient and whether the patient is involved in goal-directed behaviors; (2) to assess walking disorders and potential risks of falls. With this focus, the goal is to collect and combine multi-sensor (audio-video) information to detect activities and assess behavioral trends, to provide user services at different levels. In this project, experimental rooms are used in the Nice-Cimiez Hospital for monitoring Alzheimer patients.

8.3.6. Intelligent Cameras
This year Pulsar has completed a cooperation with STMicroelectronics. A PhD thesis on the design of intelligent cameras including gesture recognition algorithms was successfully defended on the 28th of October 2009 (Mohammed Bécha Kaâniche).

8.3.7. Semantic Interpretation of 3D Seismic Images by Cognitive Vision Techniques
A cooperation took place with IFP (French Petrol Institute) and Ecole des Mines de Paris in the framework of the joint supervision by M. Thonnat and M. Perrin of Philippe Verney's PhD at IFP. This thesis [25] was defended in September 2009. The title is: Interprétation géologique de données sismiques par une méthode supervisée basée sur la vision cognitive.

8.3.8. Scenario Recognition
Another collaboration with Ecole des Mines de Paris started in October 2006 through the joint supervision by A. Ressouche and V. Roy of Lionel Daniel's PhD at CMA. The topic is: "Principled paraconsistent probabilistic reasoning - applied to scenario recognition and voting theory".

8.4. Spin-off Partner
Keeneo (http://www.keeneo.com) is a spin-off of the Orion/Pulsar team which aims at commercializing video surveillance solutions. The company was created in July 2005 with six co-founders from the Orion/Pulsar team and one external partner. We have a joint collaboration for building the new Pulsar SUP platform in the framework of an ADT (Action de Développement Technologique).

9. Dissemination

9.1. Scientific Community
• Monique Thonnat is a member of the editorial board of the journal Image and Vision Computing (IVC), and is co-editor of a special issue of the journal Computer Vision and Image Understanding (CVIU).

• Monique Thonnat is a reviewer for the journals CVIU (Computer Vision and Image Understanding) and MVA (Machine Vision and Applications).

• Monique Thonnat is a Program Committee member for the following conferences: CVPR09 (22nd IEEE International conference on computer vision and pattern recognition), ICCV09 (12th IEEE International conference on computer vision), TAIMA09 (6ièmes atelier sur le traitement et analyse de l'information: methodes et applications), CVPR10 (23rd IEEE International conference on computer vision and pattern recognition).


• Monique Thonnat is an expert reviewing the Cognitive System project IST eTRIMMS for the European Commission, and for the programme CIBLE 2009 of the Région Rhône-Alpes.

• Monique Thonnat is a reviewer for the following theses: Robert Lundh (PhD, Univ. Orebro, Sweden), Wolfgang Ponweiser (PhD, TU Wien, Austria), Regis Clouard (HDR, Univ. Caen), Sebastien Derivaux (Univ. Strasbourg).

• Monique Thonnat had an invited talk at the Technical University of Vienna, Austria on the 23rd of November on Semantic Activity Recognition.

• Monique Thonnat had an invited talk at CEA in Cadarache on the 4th of April 2009 on activity recognition based on vision systems.

• Monique Thonnat had an invited talk at the INTECH seminar "Santé, Maintien à Domicile" in Grenoble on April the 28th.

• Monique Thonnat is a member of the scientific board of ENPC, Ecole Nationale des Ponts et Chaussées, since June 2008.

• Monique Thonnat was a member of the scientific board of INRIA Sophia Antipolis (bureau du comité des projets) from September 2005 until September 2009.

• Monique Thonnat was president of the hiring committee for junior scientists CR1 and CR2 for INRIA Sophia-Méditerranée.

• Monique Thonnat is deputy scientific director in charge of the domain Perception, Cognition and Interaction at INRIA since September 2009.

• Monique Thonnat and François Brémond are co-founders and scientific advisors of Keeneo, the video surveillance start-up created to exploit Pulsar research results on the VSIP/SUP software.

• François Brémond is a reviewer for the journals: IEEE Transactions on Pattern Analysis and Machine Intelligence, IEEE Intelligent Systems, Pattern Recognition Letters, IEEE Transactions on Multimedia, Computer Vision and Image Understanding, Artificial Intelligence Journal, Transactions on Systems, Man, and Cybernetics, Sensors Journal, Machine Vision and Applications Journal.

• François Brémond is a Program Committee member of the ACM Multimedia 2009 Workshop Multimedia in Forensics, the International Conference on Imaging for Crime Detection and Prevention (ICDP09), Tracking Humans for the Evaluation of their Motion in Image Sequences (THEMIS 2009), Pattern Recognition and Artificial Intelligence for Human Behaviour Analysis (PRAI4HBA), the International Conference on Computer Vision Systems (ICVS 2009), and Visual Surveillance (VS2009).

• François Brémond is a reviewer for the conferences and workshops: International Conference on Computer Vision (ICCV09), British Machine Vision Conference (BMVC09), Asian Conference on Computer Vision (ACCV 2009), Computer Vision and Pattern Recognition (CVPR09).

• François Brémond had an invited talk at "Etats généraux personnes âgées" in October 2009 in Marseille.

• François Brémond was a reviewer for the PhD defenses of Fida El Baf from La Rochelle University, Pau Baiget from the Computer Vision Center at the Autonomous University of Barcelona, and Baptiste Hemery from ENSICAEN - GREYC.

• François Brémond is an ANR reviewer for the 2009 edition of the Project Call: "Concepts, Systèmes et Outils pour la Sécurité Globale".

• François Brémond is an Expert for EC INFSO in the framework of Ambient Assisted Living FP7.

• François Brémond was an area chair at the IEEE Advanced Video and Signal Based Surveillance (AVSS2009) conference, Genova, Italy, 2-4 September 2009.

• François Brémond is an organizer of the "Journée SOOS de la SFO : Concepts Avancés de Réseaux de Cameras Optroniques pour la Surveillance Automatisée", 15 October 2009, Paris.


• François Brémond is a member of the 2nd European Network for the Advancement of Artificial Cognitive Systems, Interaction and Robotics (EUCogII network).

• Sabine Moisan is a member of the Scientific Council of INRA for Applied Computing and Mathematics (MIA Department).

• Jean-Paul Rigault is a member of AITO, the steering committee for several international conferences including in particular ECOOP. He is also a member of the Administration Board of the Polytechnic Institute of Nice University.

• Annie Ressouche is a member of the Inria Cooperation Locales de Recherches (Colors) committee.

• Guillaume Charpiat is organizing a working group about shapes between several teams at INRIA Sophia-Antipolis.

• Guillaume Charpiat is a reviewer for the journals: the International Journal of Computer Vision (IJCV), Transactions on Pattern Analysis and Machine Intelligence (TPAMI), the Journal of Mathematical Imaging and Vision (JMIV), Computer Vision and Image Understanding (CVIU) and Medical Image Analysis.

• Guillaume Charpiat's work on image colorization is the subject of a popular science article published by Interstices.

• Vincent Martin is a reviewer for the journals: Transactions on Instrumentation & Measurement (TIM), Computers & Electronics in Agriculture (COMPAG).

• Jean-Yves Tigli is an ACM member and Program chair of the ACM International Mobility Conference 2009.

• Jean-Yves Tigli is a member of the expert committee of the Sectoral consulting group (GCS3) of the Ministry for Higher Education and Research, DGRI A3, on "Ambient Intelligence" since 2009.

• Jean-Yves Tigli had an invited talk on "Middleware for Ubiquitous Computing" at DAAD (German Academic Exchange Service) CTDS'09, Summer School on Current Trends in Distributed Systems, 24-26 September, Gammarth, Tunisia.

9.2. Teaching
• Pulsar is a hosting team for the Master of Computer Science of UNSA.

• Teaching at Master EURECOM on Video Understanding (3 h, F. Brémond).

• Teaching at the Master of Computer Science at the Polytechnic School of Nice Sophia Antipolis University, course on Synchronous Languages and Verification (24 h, A. Ressouche).

• Teaching in the "Module Général d'Intégration" at Ecole des Mines de Paris, course on the SCADE tool and verification (6 h, A. Ressouche).

• Jean-Paul Rigault is a full professor at the Polytechnic School of Nice Sophia Antipolis University (Computer Science Department).

• Jean-Yves Tigli is an Assistant Professor at the Polytechnic Institute of Nice University.

• Sabine Moisan and Jean-Paul Rigault obtained the best paper award of the Educators' Symposium at MODELS'09 [43].

9.3. Thesis
9.3.1. Thesis in Progress

• Slawek Bak: People detection in temporal video sequences by defining a generic visual signature of individuals, Nice Sophia-Antipolis University.

• Duc Phu Chau: Object Tracking for Activity Recognition, Nice Sophia-Antipolis University.


• Guido-Tomas Pusiol: Learning Techniques for Video Understanding, Nice Sophia-Antipolis University.

• Rim Romdham: Event Recognition in Video Scenes with Uncertain Knowledge, Nice Sophia-Antipolis University.

• Anh Tuan Nghiem: Learning Techniques for the Configuration of the Scene Understanding Process, Nice Sophia-Antipolis University.

• Nadia Zouba: Multi Sensor Analysis for Homecare Monitoring, Nice Sophia-Antipolis University.

9.3.2. Thesis Defended

• Lan Le Thi: Semantic-based Approach for Image Indexing and Retrieval, Nice Sophia-Antipolis University and Hanoi University (Vietnam).

• Mohammed-Bécha Kaâniche: Human Gesture Recognition from video sequences, Nice Sophia-Antipolis University.

• Philippe Verney: Interprétation géologique de données sismiques par une méthode supervisée basée sur la vision cognitive, IFP (French Petrol Institute).

10. Bibliography
Major publications by the team in recent years

[1] A. AVANZI, F. BRÉMOND, C. TORNIERI, M. THONNAT. Design and Assessment of an Intelligent Activity Monitoring Platform, in "EURASIP Journal on Applied Signal Processing, Special Issue on "Advances in Intelligent Vision Systems: Methods and Applications"", vol. 2005:14, 08 2005, p. 2359-2374.

[2] H. BENHADDA, J. PATINO, E. CORVEE, F. BREMOND, M. THONNAT. Data Mining on Large Video Recordings, in "5eme Colloque Veille Stratégique Scientifique et Technologique VSST 2007, Marrakech, Morocco", 21st - 25th October 2007.

[3] B. BOULAY, F. BREMOND, M. THONNAT. Applying 3D Human Model in a Posture Recognition System, in "Pattern Recognition Letters", vol. 27, no 15, 2006, p. 1785-1796.

[4] F. BRÉMOND, M. THONNAT. Issues of Representing Context Illustrated by Video-surveillance Applications, in "International Journal of Human-Computer Studies, Special Issue on Context", vol. 48, 1998, p. 375-391.

[5] N. CHLEQ, F. BRÉMOND, M. THONNAT. Image Understanding for Prevention of Vandalism in Metro Stations, in "Advanced Video-based Surveillance Systems", Kluwer A.P., Hingham, MA, USA, November 1998, p. 108-118.

[6] V. CLÉMENT, M. THONNAT. A Knowledge-Based Approach to Integration of Image Processing Procedures, in "CVGIP: Image Understanding", vol. 57, no 2, March 1993, p. 166-184.

[7] F. CUPILLARD, F. BRÉMOND, M. THONNAT. Tracking Group of People for Video Surveillance, in "Video-Based Surveillance Systems", The Kluwer International Series in Computer Vision and Distributed Processing, Kluwer Academic Publishers, 2002, p. 89-100.

[8] B. GEORIS, F. BREMOND, M. THONNAT. Real-Time Control of Video Surveillance Systems with Program Supervision Techniques, in "Machine Vision and Applications Journal", vol. 18, 2007.


[9] C. LIU, P. CHUNG, Y. CHUNG, M. THONNAT. Understanding of Human Behaviors from Videos in NursingCare Monitoring Systems, in "Journal of High Speed Networks", vol. 16, 2007, p. 91-103.

[10] S. LIU, P. SAINT-MARC, M. THONNAT, M. BERTHOD. Feasibility Study of Automatic Identification ofPlanktonic Foraminifera by Computer Vision, in "Journal of Foramineferal Research", vol. 26, no 2, April1996, p. 113–123.

[11] N. MAILLOT, M. THONNAT, A. BOUCHER. Towards Ontology Based Cognitive Vision, in "Machine Visionand Applications (MVA)", vol. 16, no 1, December 2004, p. 33-40.

[12] S. MOISAN. Une plate-forme pour une programmation par composants de systèmes à base de connaissances,Université de Nice-Sophia Antipolis, April 1998, Habilitation à diriger les recherches.

[13] S. MOISAN, A. RESSOUCHE, J.-P. RIGAULT. Blocks, a Component Framework with Checking Facilities for Knowledge-Based Systems, in "Informatica, Special Issue on Component Based Software Development", vol. 25, no 4, November 2001, p. 501-507.

[14] J. PATINO, H. BENHADDA, E. CORVEE, F. BREMOND, M. THONNAT. Video-Data Modelling and Discovery, in "4th IET International Conference on Visual Information Engineering VIE 2007, London, UK", 25th - 27th July 2007.

[15] J. PATINO, E. CORVEE, F. BREMOND, M. THONNAT. Management of Large Video Recordings, in "2nd International Conference on Ambient Intelligence Developments AmI.d 2007, Sophia Antipolis, France", 17th - 19th September 2007.

[16] M. THONNAT, M. GANDELIN. Un système expert pour la description et le classement automatique de zooplanctons à partir d'images monoculaires, in "Traitement du signal, spécial I.A.", vol. 9, no 5, November 1992, p. 373-387.

[17] M. THONNAT, S. MOISAN. What Can Program Supervision Do for Software Re-use?, in "IEE Proceedings - Software, Special Issue on Knowledge Modelling for Software Components Reuse", vol. 147, no 5, 2000.

[18] M. THONNAT. Vers une vision cognitive: mise en oeuvre de connaissances et de raisonnements pour l'analyse et l'interprétation d'images, Université de Nice-Sophia Antipolis, October 2003, Habilitation à diriger les recherches.

[19] M. THONNAT. Toward an Automatic Classification of Galaxies, in "The World of Galaxies", Springer Verlag, 1989, p. 53-74.

[20] A. TOSHEV, F. BRÉMOND, M. THONNAT. An A priori-based Method for Frequent Composite Event Discovery in Videos, in "Proceedings of 2006 IEEE International Conference on Computer Vision Systems, New York, USA", January 2006.

[21] V. T. VU, F. BRÉMOND, M. THONNAT. Temporal Constraints for Video Interpretation, in "Proc. of the 15th European Conference on Artificial Intelligence, Lyon, France", 2002.

[22] V. T. VU, F. BRÉMOND, M. THONNAT. Automatic Video Interpretation: A Novel Algorithm for Temporal Scenario Recognition, in "The Eighteenth International Joint Conference on Artificial Intelligence (IJCAI'03)", 9-15 August 2003.

Year Publications

Doctoral Dissertations and Habilitation Theses

[23] M.-B. KAÂNICHE. Human Gesture Recognition from video sequences, Université de Nice Sophia-Antipolis,October 2009, http://tel.archives-ouvertes.fr/tel-00428690/en/, Ph. D. Thesis.

[24] T.-L. LE. Indexation et Recherche de Vidéo pour la Vidéo Surveillance, Université de Nice Sophia-Antipolis,February 2009, http://tel.archives-ouvertes.fr/tel-00393866/en/, Ph. D. Thesis.

[25] P. VERNEY. Interprétation géologique de données sismiques par une méthode supervisée basée sur la vision cognitive, Ecole Nationale Supérieure des Mines de Paris, September 2009, Ph. D. Thesis.

Articles in International Peer-Reviewed Journal

[26] T.-L. LE, M. THONNAT, A. BOUCHER, F. BRÉMOND. Surveillance video indexing and retrieval using object features and semantic events, in "IJPRAI special issue on Visual Analysis and Understanding for Surveillance Applications", 2009.

[27] J.-Y. TIGLI, S. LAVIROTTE, G. REY, V. HOURDIN, D. CHEUNG, E. CALLEGARI, M. RIVEILL. WComp middleware for ubiquitous computing: Aspects and composite event-based Web services, in "Annals of Telecommunications", vol. 64, no 3-4, 2009, ISSN 0003-4347 (Print), ISSN 1958-9395 (Online).

[28] J.-Y. TIGLI, S. LAVIROTTE, G. REY, V. HOURDIN, M. RIVEILL. Lightweight Service Oriented Architecture for Pervasive Computing, in "IJCSI International Journal of Computer Science Issues", vol. 4, no 1, 2009, ISSN (Online): 1694-0784, ISSN (Print): 1694-0814.

[29] N. ZOUBA, F. BRÉMOND, M. THONNAT, A. ANFOSSO, E. PASCUAL, P. MALLEA, V. MAILLAND, O. GUERIN. A computer system to monitor older adults at home: Preliminary results, in "The Gerontechnology Journal: Official Journal of the International Society for Gerontechnology", vol. 8, no 3, 2009, p. 129-139.

International Peer-Reviewed Conference/Proceedings

[30] M. ACHER, P. COLLET, F. FLEUREY, P. LAHIRE, S. MOISAN, J.-P. RIGAULT. Modeling Context and Dynamic Adaptations with Feature Models, in "[email protected] Workshop, Denver, CO, USA", October 2009.

[31] M. ACHER, P. LAHIRE, S. MOISAN, J.-P. RIGAULT. Tackling High Variability in Video Surveillance Systems through a Model Transformation Approach, in "ICSE'2009 - MISE Workshop, Vancouver, Canada", May 2009.

[32] S. BAK, S. SURESH, E. CORVÉE, F. BRÉMOND, M. THONNAT. Fusion of motion segmentation and learning based tracker for visual surveillance, in "The International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISAPP), Lisbon, Portugal", 5 - 8 February 2009.

[33] P. BILINSKI, F. BRÉMOND, M.-B. KAÂNICHE. Multiple object tracking with occlusions using HOG descriptors and multi resolution images, in "Proceedings of the 3rd International Conference on Imaging for Crime Detection and Prevention (ICDP'09), Kingston University, London, UK", December 2009.

[34] C. CARINCOTTE, F. BRÉMOND, J.-M. ODOBEZ, L. PATINO, B. RAVERA, X. DESURMONT. Multimedia Knowledge-Based Content Analysis over Distributed Architecture, in "The Networked and Electronic Media, 2009 NEM Summit - "Towards Future Media Internet", Saint Malo, France", September 2009.

[35] G. CHARPIAT. Learning Shape Metrics based on Deformations and Transport, in "Proceedings of ICCV 2009 and its Workshops, Second Workshop on Non-Rigid Shape Analysis and Deformable Image Alignment (NORDIA), Kyoto, Japan", September 2009.

[36] D.-P. CHAU, F. BRÉMOND, E. CORVÉE, M. THONNAT. Repairing People Trajectories Based on Point Clustering, in "The International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISAPP), Lisbon, Portugal", 5 - 8 February 2009.

[37] D.-P. CHAU, F. BRÉMOND, M. THONNAT. Online evaluation of tracking algorithm performance, in "The 3rd International Conference on Imaging for Crime Detection and Prevention, London, United Kingdom", 3 December 2009.

[38] E. CORVÉE, F. BRÉMOND. Combining face detection and people tracking in video surveillance, in "Proceedings of the 3rd International Conference on Imaging for Crime Detection and Prevention, ICDP 09, London, UK", December 2009.

[39] C. GARATE, P. BILINSKI, F. BRÉMOND. Crowd Event Recognition Using HOG Tracker, in "Proceedings of the Twelfth IEEE International Workshop on Performance Evaluation of Tracking and Surveillance (invited paper), Winter-PETS 2009, Snowbird, Utah, USA", December 2009.

[40] V. HOURDIN, J.-Y. TIGLI, S. LAVIROTTE, M. RIVEILL. Context-Sensitive Authorization for Asynchronous Communications, in "4th International Conference for Internet Technology and Secured Transactions (ICITST), London, UK", November 2009.

[41] M.-B. KAÂNICHE, F. BRÉMOND. Tracking HOG Descriptors for Gesture Recognition, in "The 6th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), Genoa, Italy", 2 - 4 September 2009.

[42] V. MARTIN, F. BRÉMOND, J.-M. TRAVERE, V. MONCADA, G. DUNAND. Thermal Event Recognition Applied to Tokamak Protection during Plasma Operation, in "IEEE International Instrumentation and Measurement Technology Conference", May 2009, p. 1690-1694.

[43] S. MOISAN, J.-P. RIGAULT. Teaching Object-Oriented Modeling and UML to Various Audiences, in "MODELS'09 Educators' Symposium, Denver, CO, USA", October 2009, best paper award.

[44] A.-T. NGHIEM, F. BRÉMOND, M. THONNAT. Controlling Background Subtraction Algorithms for Robust Object Detection, in "The 3rd International Conference on Imaging for Crime Detection and Prevention, London, United Kingdom", 3 December 2009.

[45] G. PUSIOL, F. BRÉMOND, M. THONNAT. Trajectory-based Primitive Events for Learning and Recognizing Activity, in "Second IEEE International Workshop on Tracking Humans for the Evaluation of their Motion in Image Sequences (THEMIS 2009), Kyoto, Japan", October 2009.

[46] R. ROMDHANE, F. BRÉMOND, M. THONNAT. Handling Uncertainty for Video Event Recognition, in "The 3rd International Conference on Imaging for Crime Detection and Prevention, London, United Kingdom", 3 December 2009.

[47] M. SIALA, N. KHLIFA, F. BRÉMOND, K. HAMROUNI. People Detection in Complex Scene Using a Cascade of Boosted Classifiers based on Haar-like-features, in "IEEE Intelligent Vehicle Symposium (IEEE IV 2009)", June 2009.

[48] N. ZOUBA, F. BRÉMOND, M. THONNAT, A. ANFOSSO, E. PASCUAL, P. MALLEA, V. MAILLAND, O. GUERIN. Assessing Computer Systems for the Real Time Monitoring of Elderly People Living at Home, in "The 19th IAGG World Congress of Gerontology and Geriatrics (IAGG), Paris, France", 5 - 7 July 2009.

[49] N. ZOUBA, F. BRÉMOND, M. THONNAT. Multisensor Fusion for Monitoring Elderly Activities at Home, in "The 6th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), Genoa, Italy", 2 - 4 September 2009.

[50] M. ZUNIGA, F. BRÉMOND, M. THONNAT. Incremental Video Event Learning, in "Proceedings of the 7th IEEE International Conference on Computer Vision Systems, Liège, Belgium", October 2009.

Scientific Books (or Scientific Book chapters)

[51] G. CHARPIAT, I. BEZRUKOV, Y. ALTUN, M. HOFMANN, B. SCHÖLKOPF. Machine Learning Methods for Automatic Image Colorization, in "Computational Photography: Methods and Applications", R. LUKAC (editor), CRC Press, 2009.

References in notes

[52] A. CIMATTI, E. CLARKE, E. GIUNCHIGLIA, F. GIUNCHIGLIA, M. PISTORE, M. ROVERI, R. SEBASTIANI, A. TACCHELLA. NuSMV 2: an OpenSource Tool for Symbolic Model Checking, in "Proceedings of CAV, Copenhagen, Denmark", E. BRINKSMA, K. G. LARSEN (editors), LNCS, no 2404, Springer-Verlag, July 2002, p. 359-364, http://nusmv.irst.itc.it.

[53] N. DALAL, B. TRIGGS. Histograms of Oriented Gradients for Human Detection, in "CVPR, San Diego, CA, USA", vol. 1, IEEE Computer Society Press, June 20-25, 2005, p. 886-893, http://www.acemedia.org/aceMedia/files/document/wp7/2005/cvpr05-inria.pdf.

[54] ECVISION. A Research Roadmap of Cognitive Vision, no IST-2001-35454, IST Project, 2005, Technical report.

[55] F. FLEUREY, A. SOLBERG. A Domain Specific Modeling Language Supporting Specification, Simulation and Execution of Dynamic Adaptive Systems, in "MoDELS, Denver, CO, USA", 2009, p. 606-621.

[56] D. GAFFÉ, A. RESSOUCHE. The Clem Toolkit, in "Proceedings of the 23rd IEEE/ACM International Conference on Automated Software Engineering (ASE 2008), L'Aquila, Italy", September 2008.

[57] N. HALBWACHS, F. LAGNIER, P. RAYMOND. Synchronous observers and the verification of reactive systems, in "Third Int. Conf. on Algebraic Methodology and Software Technology, AMAST'93, Twente", M. NIVAT, C. RATTRAY, T. RUS, G. SCOLLO (editors), Workshops in Computing, Springer Verlag, June 1993.

[58] C. KÄSTNER, S. APEL, S. TRUJILLO, M. KUHLEMANN, D. BATORY. Guaranteeing Syntactic Correctness for All Product Line Variants: A Language-Independent Approach, in "TOOLS (47)", 2009, p. 175-194.

[59] A. RESSOUCHE, D. GAFFÉ, V. ROY. Modular Compilation of a Synchronous Language, no 6424, INRIA, January 2008, http://hal.inria.fr/inria-00213472, Research Report.

[60] A. RESSOUCHE, D. GAFFÉ, V. ROY. Modular Compilation of a Synchronous Language, in "Software Engineering Research, Management and Applications", R. LEE (editor), Studies in Computational Intelligence, vol. 150, Springer, 2008, p. 157-171, selected as one of the 17 best papers of the SERA'08 conference.

[61] E. ROSTEN, T. DRUMMOND. Machine learning for high-speed corner detection, in "ECCV, Graz, Austria", vol. 1, Springer, May 7-13, 2006, p. 430-443, http://mi.eng.cam.ac.uk/~er258/work/rosten_2006_machine.pdf.

[62] T. STAHL, M. VOELTER. Model-driven Software Development: Technology, Engineering, Management,Wiley, 2006.