CS231n: Lecture 1 -Fei-Fei Li & Ranjay Krishna & Danfei Xu
CS231n: Convolutional Neural Networks for Visual Recognition
Lecture 1: Introduction
24-Mar-21
Welcome to CS231n
2015 2016 2017 2018 2019 2020
[Figure: Computer Vision and its related fields: Neuroscience, Deep learning, Machine learning, Speech and NLP, Information retrieval, Mathematics, Computer Science (graphics, algorithms, theory, systems, architecture, …), Biology, Engineering, Physics, Robotics, Cognitive sciences, Psychology, Image processing, optics]
Artificial Intelligence (AI)
Machine Learning (ML)
Deep Learning (DL)
Convolutional Neural Network (CNN)
Computer Vision
• Object detection
• Object classification
• Scene understanding
• Semantic scene segmentation
• 3D reconstruction
• Object tracking
• Human pose estimation
• Activity recognition
• VQA
• …
Jiajun Wu
Fei-Fei Li
Juan Carlos Niebles
Silvio Savarese
Today’s agenda
• A brief history of computer vision
• CS231n overview
Evolution’s Big Bang: Cambrian Explosion, 530-540 million years B.C.
This image is licensed under CC-BY 3.0
This image is licensed under CC-BY 2.5
This image is licensed under CC-BY 2.5
Camera Obscura
Leonardo da Vinci,
16th Century AD
This work is in the public domain
This work is in the public domain
Gemma Frisius, 1545
This work is in the public domain
Encyclopedia, 18th Century
Where did we come from?
The known story – Neuroscience inspired AI
Hubel and Wiesel, 1959
Low-Level Details
Neural Networks (Digital)
Cortical Column (Biological)
High-Level Patterns
F. Rosenblatt, 1957
Rumelhart, Hinton & Williams, 1986
“The mere formulation of a problem is often far
more essential than its solution, which […]
requires creative imagination and marks
real advances in science.”
- Albert Einstein, 1921
Where did we come from?
The not-so-known story – the search for computer vision’s “North Star”
Larry Roberts, 1963, 1st thesis of Computer Vision
1960s: Interpretation of synthetic world
David Marr, 1970s
This image is CC0 1.0 public domain This image is CC0 1.0 public domain
Input image → Edge image → 2 ½-D sketch → 3-D model
• Input Image: perceived intensities
• Primal Sketch: zero crossings, blobs, edges, bars, ends, virtual lines, groups, curves, boundaries
• 2 ½-D Sketch: local surface orientation and discontinuities in depth and in surface orientation
• 3-D Model Representation: 3-D models hierarchically organized in terms of surface and volumetric primitives
Stages of Visual Representation, David Marr, 1970s
D. Lowe. IJCV, 1992
Edges, segmentation, and perception
Normalized Cut (Shi & Malik, 1997)
Image is CC BY 3.0
Image is public domain
Image is CC-BY 2.0; changes made
3D reconstruction
S. Agarwal et al. ICCV, 2009
• Generalized Cylinder: Brooks & Binford, 1979
• Pictorial Structure: Fischler and Elschlager, 1973
D. Lowe. ICCV, 1999
Single Object Recognition
Spatial Pyramid Matching, Lazebnik, Schmid & Ponce, 2006
Level 0 Level 1
Image is CC0 1.0 public domain
Histogram of Gradients (HoG)
Dalal & Triggs, 2005
Deformable Part Model
Felzenszwalb, McAllester, Ramanan, 2009
[Figure axes: orientation, frequency]
Image is CC0 1.0 public domain
Face Detection, Viola & Jones, 2001
Image is public domain
CVPR topic distribution: 2000
In the meantime…
I. Biederman, Science, 1972
Potter et al., 1970s
Rapid Serial Visual Presentation (RSVP)
150 ms!!
Thorpe et al., Nature, 1996
Kanwisher et al., J. Neuro., 1997
Epstein & Kanwisher, Nature, 1998
Neural correlates of object & scene recognition
A Computer Vision/AI “holy grail” – Object Recognition
Fei-Fei et al. 2004
Everingham et al. 2006-2012
There are MANY objects; organized HIERARCHICALLY
• Biederman: Recognition by Component, 1987
• Eleanor Rosch: Principles of Categorization, 1978
George A. Miller
Psychology, Cognitive Science
Princeton University
G. A. Miller, Communications of the ACM, 1995
J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li & L. Fei-Fei. CVPR, 2009.
22,000 categories, 15,000,000 images
Image: Steel drum
Output: Scale, T-shirt, Steel drum, Drumstick, Mud turtle ✔
Output: Scale, T-shirt, Giant panda, Drumstick, Mud turtle ✗
Russakovsky et al. IJCV 2015
The Image Classification Challenge:
1,000 object classes
1,431,167 images
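The ✔ and ✗ marks on the two example outputs are consistent with the challenge's top-5 criterion: a prediction counts as correct if the true label appears among the five guesses. The sketch below illustrates that check. It assumes the image's label is "Steel drum", and the helper names `is_top5_correct` and `top5_error` are hypothetical, not taken from the lecture or from any challenge toolkit.

```python
def is_top5_correct(predictions, ground_truth):
    """Return True if the ground-truth label is among the first five predictions."""
    return ground_truth in predictions[:5]


def top5_error(all_predictions, all_labels):
    """Fraction of examples whose label is missing from the top five predictions."""
    wrong = sum(
        not is_top5_correct(preds, label)
        for preds, label in zip(all_predictions, all_labels)
    )
    return wrong / len(all_labels)


# The two example outputs from the slide, both scored against the same image.
outputs = [
    ["Scale", "T-shirt", "Steel drum", "Drumstick", "Mud turtle"],
    ["Scale", "T-shirt", "Giant panda", "Drumstick", "Mud turtle"],
]
labels = ["Steel drum", "Steel drum"]

for preds, label in zip(outputs, labels):
    print(preds, "correct" if is_top5_correct(preds, label) else "wrong")
print("top-5 error:", top5_error(outputs, labels))
```

The first output contains the label and the second does not, so one of the two counts as an error.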
Classification Task: classification error by year
Year   2010  2011  2012  2013  2014  2015   2016  2017
Error  0.28  0.26  0.16  0.12  0.07  0.036  0.03  0.023
Legend: Human, Shallow models
Deng et al. CVPR, 2009; Russakovsky et al. IJCV, 2015
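To make the shape of the chart concrete, the sketch below computes the year-over-year drop in classification error. It assumes the eight plotted values pair one-to-one, in order, with the years 2010 to 2017. The function name `absolute_drops` is hypothetical.

```python
# Error values read off the chart, paired in order with the years on the axis
# (the pairing is an assumption based on there being eight values and eight years).
errors = {
    2010: 0.28, 2011: 0.26, 2012: 0.16, 2013: 0.12,
    2014: 0.07, 2015: 0.036, 2016: 0.03, 2017: 0.023,
}


def absolute_drops(errors_by_year):
    """Map each year to the decrease in error relative to the previous year."""
    years = sorted(errors_by_year)
    return {
        year: errors_by_year[prev] - errors_by_year[year]
        for prev, year in zip(years, years[1:])
    }


drops = absolute_drops(errors)
largest = max(drops, key=drops.get)
print("largest single-year drop:", largest, round(drops[largest], 3))
```

Under that pairing, the largest single-year drop falls in 2012.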
1998: LeCun et al.
• # of transistors: 10^6
• # of pixels used in training: 10^7
2012: Krizhevsky et al.
• # of transistors: 10^9 (GPUs)
• # of pixels used in training: 10^14
[Figure labels: Input, Image Maps, Convolutions, Subsampling, Fully Connected, Output]
Figure copyright Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton, 2012. Reproduced with permission.
Year 2010: NEC-UIUC [Lin CVPR 2011]
• Dense descriptor grid: HOG, LBP
• Coding: local coordinate, super-vector
• Pooling, SPM
• Linear SVM
Year 2012: SuperVision [Krizhevsky NIPS 2012]
Year 2014: GoogLeNet [Szegedy arxiv 2014], VGG [Simonyan arxiv 2014]
• Figure legend: Pooling, Convolution, Softmax, Other
• Layers: Image, conv-64, conv-64, maxpool, conv-128, conv-128, maxpool, conv-256, conv-256, maxpool, conv-512, conv-512, maxpool, conv-512, conv-512, maxpool, fc-4096, fc-4096, fc-1000, softmax
Year 2015: MSRA [He ICCV 2015]
Lion image by Swissfrog is licensed under CC BY 3.0
Figure copyright Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton, 2012. Reproduced with permission.
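The conv / maxpool / fc layer list on this slide is enough to estimate the size of such a network. The sketch below counts learnable parameters for that stack. The slide does not state these details, so the code assumes 3×3 convolution kernels that preserve spatial size, a 224×224×3 input, 2×2 max pooling that halves each spatial dimension, and a bias term per output unit. The names `CONV_STACK`, `FC_STACK` and `count_parameters` are hypothetical.

```python
# Layer list as shown on the slide: integers are conv output channels, "M" is a maxpool.
CONV_STACK = [64, 64, "M", 128, 128, "M", 256, 256, "M", 512, 512, "M", 512, 512, "M"]
FC_STACK = [4096, 4096, 1000]


def count_parameters(conv_stack, fc_stack, in_channels=3, in_size=224, kernel=3):
    """Count weights and biases, assuming size-preserving convs and 2x2 pooling."""
    channels, size, total = in_channels, in_size, 0
    for layer in conv_stack:
        if layer == "M":
            size //= 2  # assumed 2x2 max pooling with stride 2
        else:
            total += kernel * kernel * channels * layer + layer  # weights + biases
            channels = layer
    features = channels * size * size  # flattened input to the first fc layer
    for width in fc_stack:
        total += features * width + width
        features = width
    return total, size


total, final_size = count_parameters(CONV_STACK, FC_STACK)
print(f"final feature map: {final_size}x{final_size}, parameters: {total:,}")
```

Under these assumptions most of the parameters sit in the first fully connected layer rather than in the convolutions.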
Prior Work
Image Captioning
A man riding a horse drawn carriage down a street
Dense Captioning
JKF, CVPR 2016
Horse pulling a cart. A wheel on a cart. A window on a building. A horse in a picture. A large white umbrella. A woman sitting on a bench. Man sitting on a motorcycle. A cart with a cart.
Our Recent Work
Paragraph Captioning
KJKF, CVPR 2017
A man is riding a carriage on a street. Two people are sitting on top of the horses. The carriage is made of wood. The carriage is black. The carriage has a white stripe down the side. The building in the background is a tan color.
Image Captioning: Richer Descriptions
Results: spatial, comparative, asymmetrical, verb, prepositional
[Figure: relationship graph with objects person, person, ski, shirt, snow and predicates left of, taller than, wear, on]
Krishna*, Lu*, Bernstein, Fei-Fei, ECCV 2016
CVPR topic distribution: 2000 vs. 2013
The Deep Learning Revolution
Computation, Data, Algorithms
AI’s Explosive Growth & Impact
• Attendance at AI conferences (Source: The Gradient)
• Startups Developing AI Systems (Source: Crunchbase, VentureSource, Sand Hill Econometrics)
• Enterprise Application AI Revenue (Source: Statista)
Many Applications of computer vision
Slide source: World Capital Partners, 2017
How to take care of seniors while keeping them safe?
• Monitor Patients with Mild Symptoms
• Manage Chronic Conditions
• Early Symptom Detection of COVID-19
[Figure labels: Low-cost, Burden-free, Versatile, Scalable; Mobility, Infection, Sleep, Diet]
Today’s agenda
• A brief history of computer vision
• CS231n overview