Upload
vanduong
View
214
Download
0
Embed Size (px)
Citation preview
Philipp Slusallek and Team
German Research Center for Artificial Intelligence (DFKI)
Research Area “Agents and Simulated Reality (ASR)”
Excellence Cluster Multimodal Computing and Interaction (MMCI)
Intel Visual Computing Institute (IVCI)
Center for IT Security, Privacy, and Accountability (CISPA)
Saarland University
Scalable Learning from Synthetic Data: Teaching Autonomous Systems via Synthetic Sensor
Data Generated by Learned Models of the Real World
German Research Center for
Artificial Intelligence (DFKI)
• Motto „Computer with Eyes, Ears, and Common Sense“
• Overview Largest AI research center worldwide (founded in 1988)
Germany’s leading research center for innovative SW technologies
5 sites in Germany (SB, HB, KL; B, OS)
18 research areas, 9 competence centers, 7 living labs
More than 485 researchers (~858 with students)
Budget of more than 44 M€ (2017)
More than 60 spin-offs
The Pentagon of
Innovation
Saarbrücken
Berlin
Bremen
Osnabrück
Kaiserslautern
DFKI is a Joint Venture of:
Rhineland-Palatinate
SaarlandSaarland
Bremen
Deutschland GmbH
‘Blue Sky‘
Basic Research
Commercialization/
Exploitation
DFKI Covers the Complete
Innovation Cycle
Labs at the
University
Application-
oriented
Basic
Research
Applied
Research
and
Development
Transfer
Projects
DFKI projects
for
external
clients and
shareholders
Spin-off
Companies
with DFKI
equity
DFKI
projects
for
federal
government,
EU
DFKI
projects
for
state
governments,
clients and
shareholders
External
Clients
Shareholders
The Research and Innovation
Pipeline in Saarland
Max-Planck
Institutes
University
Business Units
Blue-SkyResearch
BasicResearch
AppliedResearch
ProduktPrototype
Industry Research Valley of DeathTM
Intel-VCI
1 Research 10 Engineering 100
Start-Ups (new IT-Incubator Saar)
DFKI
ASREngineersResearchers
Demonstrator
Agents & Simulated Reality:Graphics & AI & HPC & Security
Research:
Topics &
Teams
Philipp Slusallek
Programmable
Linked Data
René Schubotz
Large-Scale Virtual
Environments
Philipp Slusallek
Multimodal
Computing
and
Interaction
High-Performance
Graphics & Computing
Richard Membarth
Distributed Realistic
Graphics
Philipp Slusallek
Computational
3D Imaging
Tim Dahmen
Knowledge and Technology Transfer
Visualization
Center
Georg Demme
Industry
Transfer
N.N.
Strategic
Relations
Hilko Hoffmann
SW-Engineering &
Organization
Georg Demme
Scientific Director
Survivable Systems
and Services
Philipp Slusallek
Intelligent
Information Systems
Matthias Klusch
Multiagent
Systems
Klaus Fischer
Distributed
Displays
Alexander Löffler
Visual
Computing
Philipp Slusallek
Applica-
tion
Areas
Autonomous Driving
Christian Müller
Smart Environments
Hilko Hoffmann
Industrie 4.0
Ingo Zinnikus
Comp. Sciences
Tim Dahmen
Autonomous
Driving
Christian Müller
Smart System
Security
Stefan Nürnberger
Overview
• Motivation
Why synthesize data for (deep) machine learning?
• Autonomous Driving
Learning for critical situations
Need for scalable system design
Combination of long- and short-term research and transfer
• Example: Behavior & Motion Models for Pedestrians
Model building from motion capture data
Automatic motion synthesis from high-level constraints
• Conclusions
Autonomous System &
Critical Situations
• Our World is extremely complex Geometry, Appearance, Motion, Weather, Environment, …
• Systems must make accurate and reliable decisions Especially in Critical Situations
Increasingly making use of (deep) machine learning
• Learning of critical situations is essentially impossible Too little data even for “normal” situations
Critical situations rarely happen in reality – per definition!
Extremely high-dimensional models
Goal: Scalable Learning from synthetic input data Continuous benchmarking & validation (“Virtual Crash-Test“)
Autonomous Driving:
The Problem
Learning from
real data
Typical situations
with high probability
Critical situations
with low probability
Learning from
synthetic data
Learning for
Long-Tail Distributions Goal: Validation of correct behavior by
covering high variability of input data
Scalability
Critical situations
with low probability
Does This Even Work?
• Recent example [ECCV 2016]
Instrumented GTA-V car racing game (per pixel labels)
Improved semantic segmentation by 2.6%
But many limitations: No control, one environment, …
SzenarienSzenarien
Teil-ModelleTeil-ModelleTeil-Modelle
Reality
Digital
Reality
Partial
Models
Simulation/
Rendering
Scenarios
Concrete
Instances of
Scenarios
Configuration &
Learning
Modeling &
Learning
Geometry
Appearance
Behavior
Motion
Environment
…
e.g. kid in front of a car e.g. kid in front of the car in exactly this way
Synthetic Sensor Data,
Labels, …
Traffic
Situations
Adaptation to
simulated environment
(sensor types)
Validation
Creating coverage via
parameterized instantiating &
genetic programming
Sensor & Actor DataCar
With
Autonomous
System /
Autonomous
CarCar
SzenarienSzenarien
Teil-ModelleTeil-ModelleTeil-Modelle
Reality
Digital
Reality
Partial
Models
Simulation/
Rendering
Scenarios
Concrete
Instances of
Scenarios
Configuration &
Learning
Modeling &
Learning
Geometry
Appearance
Behavior
Motion
Environment
…
e.g. kid in front of a car e.g. kid in front of the car in exactly this way
Synthetic Sensor Data,
Labels, …
Traffic
Situations
Adaptation to
simulated environment
(sensor types)
Creating coverage via
parameterized instantiating &
genetic programming
Sensor & Actor DataCar
With
Autonomous
System /
Autonomous
Car
Continuous
Validation &
Adaptation
Car
Digital Reality: Challenges &
Parallel Track Approach
• Long-Term Research & Development Challenges
How to learn models of the complex world around us?
How to automatically assemble target situations?
How to generate suitable coverage of input data?
How to perform simulations of these models?
How to render multi-modal sensor data (e.g. radar)?
How to validate and adapt models based on the real world?
How to set up a SW architecture to support all this?
How to bring together academia and industry quickly?
• Simultaneous Short-Term Benefits & Outcome
Can start applying early results immediately !!!
AI for
Autonomous Driving/Systems
• Requirements and Challenges Learning of critical situations
• Creating proper models of the real world (via learning)
• Generating the required input data (via models)
• Covering the required variability of the input data
• Requirements on the size, accuracy and robustness of the data?
Benchmarking the development processes • Reproducible and standardized test scenarios
• Scalable and fast simulation, rendering, and learning
• Open architecture for integrating different models & simulations
Validating the learned behavior for autonomous driving• Calibrating synthetic data against real data
• Identifying and adapting insufficient and missing models
• Setting up a ”Virtual TÜV” for autonomous vehicles/systems
REACT Project
• BMBF-funded Project at DFKI
Autonomous Driving: Modeling, Learning and Simulation Environment for Pedestrian Behavior in Critical Traffic Situations
>5 researchers over three years, starting now
• Exploring Key Challenges
Motion Models and Motion Synthesis for Pedestrians
Modeling and Simulating High-Level Behavior
Hybrid Deep Learning for Agent-Based Simulations
Automated Creation & Evaluation of Critical Situations
High-Performance Deep Learning (w/ AnyDSL)
REACT Project: Overview
• Learning pipeline with multiple feedback loops
OpenDS
• Example for highly visible automotive DFKI platform
Open Source driving simulation platform
Validated tasks for psychological driver distraction research
Internationally widely used in research and industry• More than 10k registered users
• Including: Google, Bosch, Continental, Honda, TomTom, Nuance, …
• Including: CMU, Stanford, Berkeley, MIT, TUM, TU-Berlin, …
Also used for driver education (e.g. developing countries)
USP: Highly flexible infrastructure for research
Related Projects
• BMW + VW (1 PhD + Transfer Project) Scalable learning from synthetic data
• Intel (1 PostDoc) Systems architectures for scalable learning (with BMW, etc.)
• TASS International (Projects) Extensions of commercial traffic simulation system (PreScan)
Machine learning, rendering, etc.
• Offis Research Center (WS + Proposal) Validation and Verification
• TU Darmstadt (WS + Proposal) Radar simulation
• Audi, ExCluster-DL, HP-DLF, SmartMaas, …
Initial Scenario: Autonomous
Breaking System (Euro NCAP)
• Autonomous breaking test conducted by ADAC
Significant efforts invested in tests in 2016
Images from: https://www.adac.de/infotestrat/tests/crash-test/notbremsassistent_2016/default.aspx?ComponentId=250194&SourcePageId=31956
Initial Scenario: Autonomous
Breaking System (Euro NCAP)
• Insufficient Results in Recent Test by ADAC (2016)
Problems: Night, speed, moving legs, only forward looking, …
But even difficult cases were handled by at least some cars
• Important for Pedestrian Protection (30% of deaths)
Potentially strong impact with relatively simple setup
Easier tests, wider coverage, less false positives, …
[Nils Tiemann, Ein Beitrag zur Situationsanalyse im vorausschauenden Fußgängerschutz, Diss, Uni Duisburg-Essen, 2012]
Partial Models
for Pedestrian Motion
Motion Models
• Human motion generation
based on a semantically
parameterized Motion Model
• Goals and constraints given
by behavior simulation
• Developed in EU project
“Interact” with Daimler
right start
left start
right stance
left stance
left end
right end
Motion Capture Data
Motion Model Synthesized Motion
Preprocessing
Input MotionKey Frame Definition & Extraction
Motion Segmentation
& Categorization
Motion Alignment
Functional Data Analysis
Dimension Reduction
Discrete frames
Functional motion data
Statistical Motion Synthesis
Constraints
Scaled FPCASegmentation
Spatial and
Temporal
Alignment
Statistical
Modelling
Clustering in
Feature
Space
𝑠1, … , 𝑠𝑇𝑔𝑢𝑒𝑠𝑠
𝑠1, … , 𝑠𝑇∗
𝑠𝑖𝐶𝑜𝑛𝑠𝑡𝑟𝑎𝑖𝑛𝑡𝑠𝑖
Back
ProjectionOptimization
Graph Walk
Generation
Graph of
k-Means
Trees
Graph of
Statistical
Models
𝐶𝑜𝑛𝑠𝑡𝑟𝑎𝑖𝑛𝑡𝑠
onlinek-Means Tree
Search
Motion Capture
New Motion
Preprocessing
𝑠: latent motion parameters
offline
Morphable Graph
• Morphable Graph provides a graph-based statistical approach for motion modeling
• Models infinite variation of human motion with finite high-level structures (motion primitives)
• Each motion primitive (node) is described as a statistical model
• Transforms motion synthesis problem to discrete directed graph search problem
right foot start
left foot start
right foot walk
left foot walk
left foot end
right foot end
Morphable graph for normal walking
Statistical Modeling in a
Low-Dimensional Space• Representation
Motion data is represented in low-dimensional space using functional principal component analysis (FPCA).
The set of all motions forms amotion manifold in that space
• Synthesis Samples are randomly selected from this manifold A Maximum A-Posteriori (MAP) control strategy enables support
for arbitrary spatial and temporal constraints (𝑐𝑢𝑠𝑒𝑟) via optimization in the low dimensional motion space 𝑆
𝑠𝑀𝐴𝑃 = argmax𝑠
𝑝𝑟 𝑐𝑢𝑠𝑒𝑟|𝑠𝑒𝑟𝑟𝑜𝑟 𝑚𝑒𝑎𝑠𝑢𝑟𝑒
∗ 𝑝𝑟 𝑠𝑝𝑟𝑖𝑜𝑟 𝑙𝑖𝑘𝑒𝑙𝑖ℎ𝑜𝑜𝑑
Motion ≠ Motion
• Long history in motion research (>40 years)
E.g. Gunnar Johansson's Point Light Walkers (1974)
Significant interdisciplinary research (e.g. psychology)
• Humans can easily discriminate different styles
E.g. gender, age, weight, mood, ...
Based on minimal information (e.g. few points on body)
• Can we teach machines to do the same?
Detect if approaching pedestrian will cross the street
Started collaboration with Prof. Nico Troje, BioMotionLab• Parameterized Motion Models and Style Transfer
Open Questions:
Motion Models
• What are adequate motion models?
• What is the relation of machine learning and models?
• Automatic and scalable model building from data?
• If and how to integrate physics into our models?
• Hierarchical and more detailed models?
• Modular but better integrated technology stack?
• …
Partial Models
for Pedestrian Behavior
Pedestrian Simulations
Traditionally Simulation of Macrosopic Pedestrian Behavior
Pedestrian Simulations
Need Simulation of Microsopic Pedestrian Behavior!
Social Connections
InterestsIntentionsSwarms
Multi-Agent System
Agents
Behaviors
Belief Bases
Services
TriplestoresRDF4J SPARQL 1.1
Ag
ents
Ma
na
ge
r
Endpoints
Qu
ery
Pro
cesso
r
Execution
Agent Domain with Sensors and Actions
(REST/RDF Resources)
Editor
Behaviors
(Behavior Tree)
Plu
gin
Syste
m
AJAN is a modular agent web service for creatingautonomous systems. The main goal is to address aheterogeneous community with an easy to use, flexibleand powerful AI tool for different domains, such like 3Dsimulations, programmable web or home automation.
• Agent Model defined in RDF:o Behavior (SPARQL - Behavior Trees (BT))o Agent knowledge (RDF/RDFS)o Service description (RDF Action Vocabulary)
• Agent interacts in REST/RDF domains:o Perception & Action serviceso Resources
• Using triple stores (RDF4J, SPARQL 1.1) for storing agent models
• Web editor for agent modeling
• Offers plugin system:o GraphPlan extended BT
(for Planning in STRIPS domains)
Multi-Agent System
Neural Network
Agent BeliefsRDF-Triplestores
reading(pedestrian,sign)…available(bus8)blinking(bus8,left)leaves(bus8,busStop)
is(pedestiran,tim)friend(tim)funny(tim)
Hybrid learning combines deep learning withother knowledge representation and reasoningtechniques.
Area: Scene understanding
• Agent learns to semanticallyunderstand the perceived scene data(image) according to own RDF-encoded domain model (ontology =concept base + object base)
• Use of semantic reasoning on domainmodel in support of deep learning-based image analysis (semanticsegmentation, instance recognition) inpartially observable environment
• Incremental domain model updatewith new or modified concepts
Current focus
Impact of synthetic data on accuracyof DNNs for semantic segmentation
Analysis of ontology-based DNN vs.DNNs for (part-based) semanticsegmentation (e.g. in case of partialocclusions)
rawimages
Red Sign
Crossing
Multi-Agent System
C A A
∞
Ø
→
→
→
→
?
Neural Network
Extending Behavior Treewith Learning -Node
R
MDPs
Execute Animations
Learn how tocross streets
Pedestrian Behavior
Multi-Agent System
C A A
∞
Ø
→
→
→
→
!
Neural Network
POMDPs
Extending BT with
Probabilistic Planning
Pedestrian Behavior
Target Online navigation to targets inpartially observed (or unknown) areas
in scene with detected other agents
GenerateBehaviors
Area: Deep Learning and Planning
Agent learns to recognize actual and intended behavior of other agents for own action planning (navigation) in POMDPs and based on its local belief base (domain, agent models)
Current focus
Analysis of approximated online POMDP planning (e.g. DESPOT) vs. deep reinforcement learning (e.g. A3C) in POMDP for navigation
Use of local belief base (e.g. auxiliary tasks in DRL) to improve performance
Integration of model-free RL and DRL with BTs and performance analysis
Model Pedestrians
• Intuitive Pedestrian Modeling:
– Intuitive user interface to quickly model a broad variety of pedestrians
– For modeling personal interests, social contacts, partial knowledge
– Pedestrian models based on traffic science studies
User Interfaces:
Open Questions:
Behavior Models
• Best trade-off between user control and automation?
• How can we best learn behavior from real data?
• How to integrate (deep) learning and classical AI?
• How to best create semantic 3D models?
• How to create modular service architecture?
• Best data formats to communicate between services?
• …
Learning for Autonomous
Systems using AI & CG & HPG
• Scalable Approach to Learning from Synthetic Data
Learning of (relatively) low-dimensional partial models• Individual models from “normal” data (e.g. motion, environment)
Assembling a complete model from partial models• Automatically configuring critical situations and their variations
Simulating & rendering synthetic input data for each sensor• Better rendering algorithms, more accurate simulations
Iterative improvement of the entire process• Validating the models with the real world
• In parallel: Incremental learning of better models
Open architecture for scalable system design• Ability to integrate contribution from industry & research partners
Digital Reality is AI
AI is Digital Reality
• Creating an architecture and building blocks that offer Perception
• … to find out what is around
Machine Learning (Deep Learning)• … to build/adapt models automatically
Knowledge Representation• … to represent our models of the environment
Reasoning• ... to infer and generalize based on existing knowledge
Planning• … to plan and simulate in the virtual world, exploring possible options,
evaluating their performance, and making decisions
Acting• … to manipulate and move around in the real/virtual world
Next Steps
• Setting up network between industry & academia• Academia:
Long-term research work on key research challenges Short-term validation of the general approach Many groups with focus on specific partial models Integration and networking between partners DFKI established as coordinator with large industry network
• Industry: Creation of initial models, acquisition of real data, validation Integration into their development processes
• Collaboration Framework Open architecture for learning from simulated, synthetic data Started on “OpenDR: Open Digital Reality”
Conclusions
• Goal: Ability to Create a “Digital Reality” Machine Learning is the best known method
Still insufficient for many (critical) situations
• Learning from Synthetic (and Real) Data Requires learning, modeling, simulation, and validation
• Big Challenges Ahead Many promising partial results – but largely islands
Requires closer collaboration of industry & academia
• AI a Central Component of Future Systems Machine Learning / Deep Learning
From specific to more general solutions