Upload
others
View
8
Download
0
Embed Size (px)
Citation preview
Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved. 3/26/2014
Ikuro Sato, Hideki Niihara
R&D Group, Denso IT Laboratory, Inc.
Beyond Pedestrian Detection:
Deep Neural Networks Level-Up Automotive Safety
1/26
Advanced Driver Assistance Systems (ADAS)
3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved.
pedestrian detection with infrared camera forerunning car detection with radar
Images taken from DENSO’s TV ad.
2/26
What would a skilled driver do?
3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved. 3/26
What would a skilled driver do?
3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved. 4/26
Today’s Technology
• Only to detect pedestrians with a camera
– limited autonomous maneuvers possible, e.g., slows down to stop
3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved.
today’s industrial technology
5/26
The Goal
• Able to extract richer information about pedestrians
– various autonomous maneuvers possible like a skilled driver
3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved.
future industrial technology
old-aged man finishing crossing
running toward the vehicle
man holding smartphone, crossing intersection
6/26
Three Technical Challenges
• multiclass classification
– Need to identify various annotative info on top of binary classification
• visual feature design
– Who knows what visual features best separate a man with a
smartphone from a man without?
• real-time processing
3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved. 7/26
Promising Tools
• multiclass classification
– need to identify various “states” on top of binary classification
• visual feature design
– Who knows what visual features best separate a man with a cell phone
from a man without?
• real-time processing
3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved.
Deep Neural Networks
GPGPUs
8/26
• Why Deep Neural Networks?
• Why GPGPUs?
• Demo Setup
3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved. 9/26
Basics
• Neural Network: one of the machine learning algorithms for
classification and/or regression
3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved.
input 𝑥
hidden ℎ
simple Neural Network
By tuning 𝑊𝑘, one obtains an arbitrary function.
weight 𝑊1
𝑊2
ℎ = 𝑔 𝑊1𝑥
𝑦 = 𝑔 𝑊2ℎ
𝑔: differentiable nonlinear function
(simplified)
output 𝑦
10/26
Basics
• Neural Network: one of the machine learning algorithms for
classification and/or regression
3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved.
input 𝑥 output
𝑦
hidden ℎ
simple Neural Network
By tuning 𝑊𝑘, one obtains an arbitrary function.
weight 𝑊1
𝑊2
Can have arbitrary many units
multiclass
11/26
• Convolutional Neural Network (CNN) comprising
– Convolutional layer(s) local feature extraction
– Pooling layer(s) dimensionality reduction
– Fully-connected layer(s) classification/regression
simple CNN
Neural Networks for Visual Tasks
3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved.
input image C P F F
12/26
• Convolutional Neural Network (CNN) comprising
– Convolutional layer(s) local feature extraction
– Pooling layer(s) dimensionality reduction
– Fully-connected layer(s) classification/regression
Neural Networks for Visual Tasks
3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved.
input image C P F F
visual features automatically extracted
simple CNN
13/26
• Transforming data vectors at multiple layers help to increase the
level of abstraction of visual contents (Bengio, 2009).
Why is “Deep” Architecture Advantageous?
3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved.
C P P P F F F C C
edges set of segments parts classifier
Deep CNN
14/26
• Why Deep Neural Networks?
• Why GPGPUs?
• Demo Setup
3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved. 15/26
Major Reason
• Classification can be boosted.
– Deep CNN can run at real-time with a compact Tegra K1 processor.
3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved.
why?
Deep CNN is very GPU friendly!
A Deep CNN typically has a small number of sequential operations, each of which is dominated by highly-parallelizable, massive sums-of-products.
16/26
Minor Reasons
• Classification can be boosted.
– Deep CNN can run at real-time with compact Tegra K1 processor.
• Learning can be boosted.
– It can take a month without GPUs. Takes only several days with GPUs.
• Development can be boosted.
– Capability of processing generic programing language (i.e., CUDA)
makes R&D process efficient.
– No more fixed-points!
3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved. 17/26
Minor Reasons
• Classification can be boosted.
– Deep CNN can run at real-time with compact Tegra K1 processor.
• Learning can be boosted.
– It can take a month without GPUs. Takes only several days with GPUs.
• Development can be boosted.
– Capability of processing generic programing language (i.e., CUDA)
makes R&D process efficient.
– No more fixed-points!
3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved. 18/26
• Why Deep Neural Networks?
• Why GPGPUs?
• Demo Setup
3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved. 19/26
Problem Statement
• Real-time pedestrian detection with depth, height, and body
orientation estimations
– requires discrete-valued classification and real-valued regression tasks
3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved. 20/26
Deep CNN Model
• A single deep CNN with 9 (processing) layers
– 3 Convolutional + (Max) Pooling layers
– 3 Fully-connected layers
3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved.
16 maps 16 maps
16 maps
16 units
32 units 140x60
gray image
21/26
Deep CNN Model
• Output layer contains:
3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved.
1 binary classification unit for pedestrian/background
1 regression unit for relative x-coord. of the body center
1 regression unit for relative y-coord. of the top of the head
1 regression unit for relative y-coord. of the toe
4 classification units for body orientations
𝑥
𝑦
𝑦head
𝑦toe
𝑥center
scanning window
22/26
Deep CNN Model
• Output layer contains:
3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved.
1 binary classification unit for pedestrian/background
1 regression unit for relative x-coord. of the body center
1 regression unit for relative y-coord. of the top of the head
1 regression unit for relative y-coord. of the toe
4 classification units for body orientations
𝑥
𝑦
𝑦head
𝑦toe
scanning window converted to depth & height
𝑥center
23/26
Processor
• NVIDIA’s Jetson platform equipped with Tegra K1 processor
– 192 Kepler CUDA cores
– 200 GFLOPS (automotive K1)
– memory shared between GPU and CPU (ARM)
– low power consumption for automotive applications, e.g., ADAS
3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved.
Jetson
24/26
Processor
• NVIDIA’s Jetson platform equipped with Tegra K1 processor
– 192 Kepler CUDA cores
– 200 GFLOPS (automotive K1)
– memory shared between GPU and CPU (ARM)
– low power consumption for automotive applications, e.g., ADAS
3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved.
Jetson Deep CNN
25/26
Take-Home Messages
1. Extracting rich information about pedestrians will be
one of the key technologies for automotive safety.
2. Deep Neural Networks are the best algorithms for
visual recognition tasks.
3. GPGPUs enable real-time processing.
– Demonstrated that Tegra K1 indeed does!
3/26/2014 Copyright (C) 2014 DENSO IT LABORATORY, INC. All Rights Reserved. 26/26