Machine Vision Group
Dept. of Electrical and Information Engineering and Infotech Oulu
Local Binary Pattern (LBP) methods in motion and activity analysis
Matti Pietikäinen
University of Oulu, Finland
http://www.ee.oulu.fi/mvg
Texture is everywhere: from skin to scene images
Starting point
2-D surface texture is a two-dimensional phenomenon characterized by:
• spatial structure (pattern)
• contrast (‘amount’ of texture)

Property:
Transformation   Pattern     Contrast
Gray scale       no effect   affects
Rotation         affects     no effect
Zoom in/out      affects     ?

Thus,
1) contrast is of no interest in gray scale invariant analysis
2) often we need a gray scale and rotation invariant pattern measure
Local Binary Pattern and Contrast operators
Ojala T, Pietikäinen M & Harwood D (1996) A comparative study of texture measures with classification based on feature distributions. Pattern Recognition 29:51-59.
An example of computing LBP and C in a 3x3 neighborhood:

example     thresholded     weights
6 5 2       1 0 0             1   2   4
7 6 1       1   0           128       8
9 8 7       1 1 1            64  32  16

Pattern = 11110001
LBP = 1 + 16 + 32 + 64 + 128 = 241
C = (6+7+8+9+7)/5 - (5+2+1)/3 = 4.7
Important properties:
• LBP is invariant to any monotonic gray level change
• computational simplicity
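
The 3x3 example can be reproduced with a few lines of NumPy. This is a minimal sketch, not the authors' implementation: the function name `lbp_and_contrast` and the constant `WEIGHTS` are illustrative, and returning a contrast of zero when all neighbors fall on the same side of the threshold is an assumption.

```python
import numpy as np

# Binomial weights of the eight neighbours, laid out as in the example
# (the centre position carries no weight).
WEIGHTS = np.array([[1, 2, 4],
                    [128, 0, 8],
                    [64, 32, 16]])


def lbp_and_contrast(patch):
    """Return (LBP code, contrast C) for one 3x3 gray-level patch."""
    patch = np.asarray(patch, dtype=float)
    is_neighbour = np.ones((3, 3), dtype=bool)
    is_neighbour[1, 1] = False
    # Threshold the neighbours against the centre pixel.
    above = is_neighbour & (patch >= patch[1, 1])
    below = is_neighbour & ~above
    code = int(WEIGHTS[above].sum())
    # C = mean of neighbours at or above the centre minus mean of those below.
    if above.any() and below.any():
        contrast = patch[above].mean() - patch[below].mean()
    else:
        contrast = 0.0  # assumption: no contrast when one group is empty
    return code, float(contrast)


code, c = lbp_and_contrast([[6, 5, 2], [7, 6, 1], [9, 8, 7]])
print(code, format(code, "08b"), round(c, 1))
```

Because only the sign of each comparison enters the code, any monotonic gray level change leaves the LBP value unchanged, while C does change.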
Multiresolution LBP
- arbitrary circular neighborhoods
- uniform patterns
- multiple resolutions
- rotation invariance
- gray scale variance as contrast measure
Ojala T, Pietikäinen M & Mäenpää T (2002) Multiresolution gray-scale and rotation invariant texture classification with Local Binary Patterns. IEEE Transactions on Pattern Analysis and Machine Intelligence 24(7):971-987.
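
A circular neighborhood of P samples on radius R could be sketched as below. The function name `circular_lbp`, the counter-clockwise bit ordering starting from the right-hand neighbor, the bilinear interpolation of off-grid samples, and the small comparison tolerance are all assumptions of this sketch rather than details taken from the slides.

```python
import numpy as np


def circular_lbp(image, P=8, R=1.0):
    """LBP codes with P samples on a circle of radius R (interior pixels only)."""
    img = np.asarray(image, dtype=float)
    h, w = img.shape
    m = int(np.ceil(R))  # margin so every sample stays inside the image
    ys, xs = np.mgrid[m:h - m, m:w - m]
    center = img[m:h - m, m:w - m]
    codes = np.zeros(center.shape, dtype=np.int64)
    for p in range(P):
        angle = 2.0 * np.pi * p / P
        # Rounding removes floating-point noise for axis-aligned samples.
        dy = round(-R * np.sin(angle), 10)
        dx = round(R * np.cos(angle), 10)
        sy, sx = ys + dy, xs + dx
        y0 = np.floor(sy).astype(int)
        x0 = np.floor(sx).astype(int)
        y1 = np.minimum(y0 + 1, h - 1)
        x1 = np.minimum(x0 + 1, w - 1)
        fy, fx = sy - y0, sx - x0
        # Bilinear interpolation of the gray value at the sample position.
        value = (img[y0, x0] * (1 - fy) * (1 - fx) + img[y0, x1] * (1 - fy) * fx
                 + img[y1, x0] * fy * (1 - fx) + img[y1, x1] * fy * fx)
        # The tolerance keeps interpolated ties from flipping due to rounding.
        codes |= (value >= center - 1e-9).astype(np.int64) << p
    return codes
```

Running the operator with several (P, R) pairs, for example (8, 1), (16, 2) and (24, 3), and combining the resulting histograms is one way to obtain a multiresolution description.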
‘Uniform’ patterns
‘Uniform’ patterns (P=8): U=0, U=2
Examples of ‘nonuniform’ patterns (P=8): U=4, U=6, U=8
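
The uniformity measure U counts the 0/1 transitions in the circular bit pattern, and a pattern is called uniform when U is at most 2, as defined in the 2002 paper cited earlier. A small sketch follows; the function names `uniformity` and `riu2` are illustrative.

```python
def uniformity(code, P=8):
    """Number of 0/1 transitions in the circular P-bit pattern."""
    bits = [(code >> p) & 1 for p in range(P)]
    return sum(bits[p] != bits[(p + 1) % P] for p in range(P))


def riu2(code, P=8):
    """Rotation invariant uniform mapping: number of 1-bits for uniform
    patterns, and the single label P + 1 for all nonuniform patterns."""
    if uniformity(code, P) <= 2:
        return bin(code).count("1")
    return P + 1


# The pattern 11110001 from the 3x3 example is uniform (U = 2).
print(uniformity(0b11110001), riu2(0b11110001))
```

With P = 8 this mapping yields P + 2 = 10 labels (0 to 9), which is why the riu2 histogram has bins 0 to P+1.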
Texture primitives detected by the LBP
Spot, Spot/flat, Line end, Edge, Corner
Estimation of empirical feature distributions
Input image (region) is scanned with the chosen operator(s), pixel by pixel, and operator outputs are accumulated into a discrete histogram.
[Figure: histogram of LBP^riu2_{P,R} with bins 0 1 2 3 4 5 6 7 ... P+1; histogram of VAR_{P,R} with bins 0 1 2 3 4 5 6 7 ... B-1; joint histogram of two operators, LBP^riu2_{P,R} / VAR_{P,R}]
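
Accumulating two operator outputs into a joint histogram could look like the sketch below. It assumes that the LBP^riu2 labels (integers 0 to P+1) and the VAR values have already been computed per pixel; the function name `joint_histogram` and the equal-population quantization of VAR into B bins are assumptions of this sketch.

```python
import numpy as np


def joint_histogram(lbp_labels, var_values, P=8, B=16):
    """Joint (P + 2) x B histogram of riu2 labels and quantized VAR values."""
    lbp = np.asarray(lbp_labels).astype(int).ravel()
    var = np.asarray(var_values, dtype=float).ravel()
    # Assumed quantization: cut values chosen so that each of the B bins
    # receives roughly the same number of samples.
    cuts = np.quantile(var, np.linspace(0, 1, B + 1)[1:-1])
    var_bins = np.searchsorted(cuts, var, side="right")
    hist = np.zeros((P + 2, B), dtype=np.int64)
    # np.add.at accumulates correctly when index pairs repeat.
    np.add.at(hist, (lbp, var_bins), 1)
    return hist
```

In practice the cut values would be estimated once from training data and reused for every image, so that histograms from different images stay comparable.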
LBP has been highly successful
LBP has become widely used in various applications due to its high discriminative power, tolerance against illumination changes and computational simplicity. Among the applications are:
• Visual inspection
• Image and video retrieval
• Biomedical image analysis
• Aerial image analysis, remote sensing
• Facial image analysis
• Etc.
For a bibliography of LBP-related research, see http://www.ee.oulu.fi/research/imag/texture
Examples of using LBP in other groups
- Object detection: Zhang et al., 2006
- On-line boosting: Grabner & Bischof, 2006
- Object classification: Lisin et al., 2005; Autio, 2006
- Color-texture based indexing: Yao & Chen, 2003; Connah & Finlayson, 2006
- Inspection of ceramic tiles: Lopes, 2005; Novak & Hocenski, 2005
- Classification of underwater images: Marcos et al., 2005; Clement et al., 2005; Blaschko et al., 2005
- Aerial image segmentation: Urdiales et al., 2004
- Segmentation of multispectral remote sensing images: Lucieer et al., 2005
- Intravascular tissue characterization: Pujol & Radeva, 2005
- Mobile robot navigation: Hong et al., 2002; Davidson & Hutchinson, 2003
- Steganalysis for steganography: Lafferty & Ahmed, 2004
- Designing aesthetically interesting and informative displays: Fogarty et al., 2001
- Overhead view person recognition: Cohen et al., 2000
- Face recognition: G. Zhang et al., 2004; W. Zhang et al., 2005; Li et al., 2006; Rodriguez & Marcel, 2006
- Face detection: Jin et al., 2004
- Facial expression recognition: Shan et al., 2005; Liao et al., 2006
- Gender classification: Sun et al., 2006; Lian & Lu, 2006
Face analysis using local binary patterns
• Face recognition is one of the major challenges in computer vision
• We proposed (ECCV 2004, PAMI 2006) a face descriptor based on LBPs
• Our method has already been adopted by many leading scientists
- e.g. T.S. Huang, J. Kittler, S.Z. Li, W. Gao, H. Ai, B. Triggs, S. Gong, S. Marcel
• Outstanding results in face recognition and authentication, face detection, facial expression recognition, gender classification
• Our approach will have a significant role in a new EU project ”Mobile Biometry” (2008-2010) coordinated by IDIAP (Switzerland)
Face description with LBP
Ahonen T, Hadid A & Pietikäinen M (2006) Face description with local binary patterns: application to face recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 28(12):2037-2041. (an early version published at ECCV 2004)
A facial description for face recognition:
LBP in AuthenMetric F1
Institute of Automation, Chinese Academy of Sciences
FP7 project: Mobile Biometry (MOBIO) 2008-2010 (www.mobioproject.org)
• The aim is to investigate multiple aspects of biometric authentication based on the face and voice in the context of mobile devices
• To increase security and user acceptance - using standard sensors already available on mobile phones
• Coordinator: IDIAP Research Institute (CH)
• Partners: University of Manchester (UK), University of Surrey (UK), Université d’Avignon (FR), Brno University of Technology (CZ), University of Oulu (FI), IdeArk (CH), eyeP Media (CH)
• A technology transfer tool referred to as MOBIO ”Community of Interest” will be formed
Subtracting the background and detecting moving objects
Heikkilä M & Pietikäinen M (2006) A texture-based method for modeling the background and detecting moving objects. IEEE Transactions on Pattern Analysis and Machine Intelligence 28(4):657-662. (an early version published at BMVC 2004)
Overview of the approach
• We use an LBP histogram computed over a circular region around the pixel as the feature vector.
• The history of each pixel over time is modeled as a group of K weighted LBP histograms: {x1,x2,…,xK}.
• The background model is updated with the information of each new video frame, which makes the algorithm adaptive.
• The update procedure is identical for each pixel.
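
The per-pixel model of K weighted LBP histograms could be sketched as follows. This is a simplified illustration, not the published implementation: the class name `PixelBackgroundModel`, histogram intersection as the similarity measure, the parameter names (`alpha_b`, `alpha_w`, `T_p`, `T_b`), their default values, and the initial weight given to a replaced histogram are all assumptions.

```python
import numpy as np


def histogram_intersection(a, b):
    """Similarity of two normalized histograms, in [0, 1]."""
    return float(np.minimum(a, b).sum())


class PixelBackgroundModel:
    """K weighted LBP histograms describing the history of one pixel."""

    def __init__(self, n_bins, K=3, alpha_b=0.05, alpha_w=0.05, T_p=0.7, T_b=0.6):
        self.histograms = np.full((K, n_bins), 1.0 / n_bins)
        self.weights = np.full(K, 1.0 / K)
        self.alpha_b, self.alpha_w = alpha_b, alpha_w
        self.T_p, self.T_b = T_p, T_b

    def update(self, h):
        """Classify the new normalized histogram h, then adapt the model.
        Returns True when the pixel looks like foreground."""
        h = np.asarray(h, dtype=float)
        sims = np.array([histogram_intersection(m, h) for m in self.histograms])

        # Background = heaviest histograms whose cumulative weight exceeds T_b.
        order = np.argsort(-self.weights)
        cumulative = np.cumsum(self.weights[order])
        n_background = int(np.searchsorted(cumulative, self.T_b, side="right")) + 1
        background = order[:n_background]
        is_foreground = not np.any(sims[background] >= self.T_p)

        best = int(np.argmax(sims))
        if sims[best] < self.T_p:
            # No model histogram matches: replace the lowest-weight one.
            worst = int(np.argmin(self.weights))
            self.histograms[worst] = h
            self.weights[worst] = 0.01  # assumed low initial weight
            self.weights /= self.weights.sum()
        else:
            # Blend the best match towards h and raise its weight.
            self.histograms[best] = (self.alpha_b * h
                                     + (1 - self.alpha_b) * self.histograms[best])
            matched = np.zeros_like(self.weights)
            matched[best] = 1.0
            self.weights = self.alpha_w * matched + (1 - self.alpha_w) * self.weights
        return bool(is_foreground)
```

One such model would be kept for every pixel and fed the LBP histogram of that pixel's circular region for each new frame; a histogram that keeps recurring gradually gains weight and is absorbed into the background.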
Detection of moving objects
A texture based method for modeling the background and detecting moving objects
Dynamic texture descriptors for motion analysis
• Dynamic (or temporal) textures are textures in motion
• We proposed (PAMI 2007) simple spatiotemporal LBP descriptors for dynamic texture recognition, outperforming the state-of-the-art
• This approach has been applied to facial expression recognition (PAMI 2007), face and gender recognition from video sequences (AMFG 2007, ICPR 2008), visual speech recognition (HCM2007), and recognition of actions (BMVC 2008) - with excellent results
• Our approach has potential for significant contributions in many applications and fundamental problems of motion and activity analysis
Dynamic texture recognition
• Determine the emotional state of the face
Zhao G & Pietikäinen M (2007) Dynamic texture recognition using local binary patterns with an application to facial expressions. IEEE Transactions on Pattern Analysis and Machine Intelligence 29(6):915-928. (parts of this were earlier presented at ECCV 2006 Workshop on Dynamical Vision and ICPR 2006)
Dynamic texture
Motivation
–Dynamic textures or temporal textures are textures with motion.
–There are many DTs in the real world, including sea waves, smoke, foliage, fire, showers and whirlwinds.
Volume Local Binary Patterns (VLBP)
Sampling in volume
Thresholding
Multiply
Pattern
LBP from Three Orthogonal Planes (LBP-TOP)
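
The idea of LBP-TOP, computing LBP histograms on the XY, XT and YT planes of a video volume and concatenating them, could be sketched as below. This simplified version uses the basic 3x3 operator on every plane and only interior voxels; the function names and the (T, Y, X) array layout are assumptions, and the published descriptor supports different radii and sample counts per plane.

```python
import numpy as np

# Neighbour offsets in the same clockwise order as the weights 1, 2, 4, ..., 128.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_plane_codes(plane):
    """Basic 3x3 LBP codes for the interior pixels of a 2-D array."""
    p = np.asarray(plane, dtype=float)
    h, w = p.shape
    center = p[1:-1, 1:-1]
    codes = np.zeros(center.shape, dtype=np.int64)
    for bit, (dy, dx) in enumerate(OFFSETS):
        neighbour = p[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= (neighbour >= center).astype(np.int64) << bit
    return codes


def lbp_top_histogram(volume):
    """Concatenated, per-plane normalized LBP histograms (3 x 256 bins)."""
    v = np.asarray(volume, dtype=float)  # assumed layout: (T, Y, X)
    T, Y, X = v.shape
    plane_sets = [
        [v[t, :, :] for t in range(1, T - 1)],  # XY planes: appearance
        [v[:, y, :] for y in range(1, Y - 1)],  # XT planes: horizontal motion
        [v[:, :, x] for x in range(1, X - 1)],  # YT planes: vertical motion
    ]
    parts = []
    for planes in plane_sets:
        codes = np.concatenate([lbp_plane_codes(p).ravel() for p in planes])
        hist = np.bincount(codes, minlength=256).astype(float)
        parts.append(hist / hist.sum())
    return np.concatenate(parts)
```

Applying `lbp_top_histogram` to each block volume of a sequence and concatenating the results gives a block-based description of the kind used for the face and mouth regions.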
DynTex database
• Our methods outperformed the state-of-the-art in experiments with DynTex and MIT dynamic texture databases
Facial expression recognition
Zhao G & Pietikäinen M (2007) Dynamic texture recognition using local binary patterns with an application to facial expressions. IEEE Transactions on Pattern Analysis and Machine Intelligence 29(6):915-928.
• Determine the emotional state of the face
• Regardless of the identity of the face
(a) Non-overlapping blocks (9 x 8) (b) Overlapping blocks (4 x 3, overlap size = 10)
(a) Block volumes (b) LBP features from three orthogonal planes (c) Concatenated features for one block volume with the appearance and motion
Database
Cohn-Kanade database:
• 97 subjects
• 374 sequences
• Age from 18 to 30 years
• Sixty-five percent were female, 15 percent were African-American, and three percent were Asian or Latino.
Happiness Angry Disgust
Sadness Fear Surprise
Comparison with different approaches
Method              People Num  Sequence Num  Class Num  Dynamic  Measure                 Recognition Rate (%)
[Shan, 2005]        96          320           7 (6)      N        10 fold                 88.4 (92.1)
[Bartlett, 2003]    90          313           7          N        10 fold                 86.9
[Littlewort, 2004]  90          313           7          N        leave-one-subject-out   93.8
[Tian, 2004]        97          375           6          N        -------                 93.8
[Yeasin, 2004]      97          ------        6          Y        five fold               90.9
[Cohen, 2003]       90          284           6          Y        -------                 93.66
Ours                97          374           6          Y        two fold                95.19
Ours                97          374           6          Y        10 fold                 96.26
Demo For Facial Expression Recognition
• Low resolution
• No eye detection
• Translation, in-plane and out-of-plane rotation, scale
• Illumination change
• Robust with respect to errors in face alignment
Visual Speech Recognition
• Visual speech information plays an important role in speech recognition under noisy conditions or for listeners with hearing impairment.
• A human listener can use visual cues, such as lip and tongue movements, to enhance the level of speech understanding.
• The process of using the visual modality is often referred to as lipreading, which is to make sense of what someone is saying by watching the movement of their lips.
• The McGurk effect [McGurk and MacDonald 1976] demonstrates that inconsistency between audio and visual information can result in perceptual confusion.
Zhao G, Pietikäinen M & Hadid A (2007) Local spatiotemporal descriptors for visual speech recognition. Proc. 2nd International Workshop on Human-Centered Multimedia (HCM2007), Augsburg, Germany, 57-65.
Appearance Features-Local Spatiotemporal Descriptors For Visual Information
(a) Volume of utterance sequence
(b) Image in XY plane (147x81)
(c) Image in XT plane (147x38) at y = 40
(d) Image in TY plane (38x81) at x = 70
Overlapping blocks (1 x 3, overlap size = 10).
Mouth region images
LBP-XY images
LBP-XT images
LBP-YT images
Features in each block volume.
Mouth movement representation.
Experiments
Two databases:
1) Our own visual speech database:
20 persons, each uttering ten everyday greetings one to five times. In total, 817 sequences from 20 speakers were used in the experiments.
Phrases included in the dataset.
C1   “Excuse me”
C2   “Good bye”
C3   “Hello”
C4   “How are you”
C5   “Nice to meet you”
C6   “See you”
C7   “I am sorry”
C8   “Thank you”
C9   “Have a good time”
C10  “You are welcome”
2) Tulips1 audio-visual database
12 subjects, each pronouncing the first four digits in English twice. In total, 96 sequences.
Experimental Results-Own database
Mouth regions from the dataset.
Speaker-independent:
[Figure: recognition results (%) by phrase index (C1 to C10) for 1x5x3 block volumes, 1x5x3 block volumes (features just from XY plane), and 1x5x1 block volumes]

Results of speaker-independent experiments.
Eye detection            Manual   Automatic
Blocks (1x5x3)           60.6%    58.6%
Blocks (1x5x3+1x5x2)     62.4%    59.6%
Experimental Results-Tulips1 audio-visual database
Mouth images with translation, scaling and rotation from Tulips1 database.

Comparison to other methods on Tulips1 audio-visual database (speaker independent).
Method          Features                                  Normalization   Results (%)
[Arsic 2006]    MRPCA                                     Y               81.25
[Arsic 2006]    MI MRPCA                                  Y               87.5
[Gurban 2005]   Temporal Derivatives Features             Y               80
Ours            LBP-TOP_{8,8,8,1,1,1}, blocks: 3x6x2      N               92.71
Demo for visual speech recognition
Recognition of actions
2D texture based approach
V Kellokumpu, G Zhao & M Pietikäinen, "Texture Based Description of Movements for Activity Analysis". In Proc. VISAPP 2008
[Figure: weights w1, w2, w3, w4]
• Demonstration
Dynamic texture based approach
[Figure: xt and yt planes]
Dynamic texture based approach
V Kellokumpu, G Zhao & M Pietikäinen, "Human Activity Recognition using a Dynamic Texture Based Approach". BMVC 2008.
Feature histogram of a bounding volume
Dynamic texture based approach
KTH dataset, SVM: 93.8%
[Confusion matrix over the classes Box, Clap, Wave, Jog, Run, Walk; nonzero entries, row by row:]
.980 .020
.855 .145
.032 .108 .860
.977 .020 .003
.01 .987 .003
.033 .967
DT segmentation
Chen J, Zhao G & Pietikäinen M (2008) Unsupervised dynamic texture segmentation using local spatiotemporal descriptors, ICPR 2008, in press.
A demo show
Segmentation of a dynamic texture
Input Output
Experimental results
Results on the sequence ocean-fire-small
(a) Frame 8 (b) Frame 21 (c) Frame 40
(d) Frame 60 (e) Frame 80 (f) Frame 100
Experimental results
Results on a real challenging sequence
(a) Frame 5 (b) Frame 10
Summary
• LBP and its spatiotemporal extensions are very effective methods for motion and activity analysis
• Our recent research has focused on applications in detection and tracking of moving objects, face, facial expression and gender recognition from videos, visual speech recognition, recognition of human actions, gait recognition
• The methods should be powerful in various industrial problems
- computationally simple
- robust to illumination variations
- robust to localization errors