
CUbRIK research presented at SSMS 2012


CONTEXT-BASED PEOPLE RECOGNITION in CONSUMER PHOTO COLLECTIONS

Markus Brenner, Ebroul Izquierdo

MMV Research Group, School of Electronic Engineering and Computer Science

Queen Mary University of London, UK

{markus.brenner, ebroul.izquierdo}@eecs.qmul.ac.uk

Face Detection and Basic Recognition

Initial steps: Image preprocessing, face detection and face normalization

Descriptor-based: Local Binary Pattern (LBP) texture histograms

Similarity metric: Chi-Square Statistics

Basic face recognition: k-Nearest-Neighbor
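The similarity metric and the basic recognizer listed above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names, the 0.5 scaling factor (one common variant of the chi-square distance), the value of k and the toy histograms are all assumptions.

```python
from collections import Counter


def chi_square_distance(h1, h2, eps=1e-10):
    """Chi-square distance between two histograms of equal length."""
    return 0.5 * sum((a - b) ** 2 / (a + b + eps) for a, b in zip(h1, h2))


def knn_predict(query, gallery, k=3):
    """Label a query histogram by majority vote among its k nearest gallery faces.

    gallery is a list of (histogram, person_label) pairs.
    """
    ranked = sorted(gallery, key=lambda item: chi_square_distance(query, item[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy 3-bin histograms standing in for real LBP descriptors (invented data).
gallery = [
    ([0.7, 0.2, 0.1], "alice"),
    ([0.6, 0.3, 0.1], "alice"),
    ([0.1, 0.2, 0.7], "bob"),
    ([0.2, 0.1, 0.7], "bob"),
]
print(knn_predict([0.65, 0.25, 0.10], gallery, k=3))  # alice
```

Each face is classified independently here, which is the baseline that the graph-based model is meant to improve on.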

Graph-based Recognition

Model: pairwise Markov Network (graph nodes represent faces)

Unary Potentials: likelihood of faces belonging to particular people

Pairwise Potentials: encourage spatial smoothness, encode the exclusivity constraint and the temporal domain

Topology: only the most similar faces are connected with edges

Inference: maximum a posteriori (MAP) solution via Loopy Belief Propagation (LBP, not to be confused with the Local Binary Pattern descriptor)
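To make the inference step concrete, the following is a generic max-product belief propagation sketch over a small pairwise Markov network. It is not the authors' code: the data layout, the synchronous update schedule, the iteration count and the toy potentials are assumptions chosen for illustration.

```python
def max_product_bp(unary, edges, iterations=20):
    """Approximate MAP labelling of a pairwise Markov network.

    unary[n][l]          -- potential of node n taking label l
    edges[(n, m)][a][b]  -- potential of node n taking label a and node m label b
    """
    num_labels = len(unary[0])
    # Store each edge in both directions so messages can flow either way.
    pot = {}
    for (n, m), table in edges.items():
        pot[(n, m)] = table
        pot[(m, n)] = [[table[a][b] for a in range(num_labels)]
                       for b in range(num_labels)]
    # msgs[(src, dst)][l] is the message from src about dst taking label l.
    msgs = {key: [1.0] * num_labels for key in pot}

    def belief_without(node, excluded):
        """Unary potential times all incoming messages except the one from `excluded`."""
        out = list(unary[node])
        for (src, dst), msg in msgs.items():
            if dst == node and src != excluded:
                out = [o * v for o, v in zip(out, msg)]
        return out

    for _ in range(iterations):
        new_msgs = {}
        for (src, dst) in pot:
            partial = belief_without(src, dst)
            msg = [max(partial[a] * pot[(src, dst)][a][b] for a in range(num_labels))
                   for b in range(num_labels)]
            total = sum(msg) or 1.0
            new_msgs[(src, dst)] = [v / total for v in msg]  # normalise for stability
        msgs = new_msgs

    labels = []
    for n in range(len(unary)):
        belief = belief_without(n, excluded=None)
        labels.append(max(range(num_labels), key=belief.__getitem__))
    return labels


# Three faces, two people. Faces 0 and 1 share a photo (exclusivity);
# faces 1 and 2 look alike (smoothness). All numbers are invented.
unary = [[0.9, 0.1], [0.5, 0.5], [0.5, 0.5]]
edges = {(0, 1): [[0, 1], [1, 0]], (1, 2): [[2, 1], [1, 2]]}
print(max_product_bp(unary, edges))  # [0, 1, 1]
```

In the example, the two ambiguous faces are resolved purely through context: face 1 cannot be the person already identified in the same photo, and face 2 follows face 1 because the two are connected by a similarity edge.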

Social Semantics

Individual appearance for a more effective graph topology (used to regularize the number of edges)

Unique People Constraint models exclusivity: a person cannot appear more than once in a photo

Pairwise co-appearance: people appearing together bear a higher likelihood of appearing together again

Groups of people: use data mining to discover frequently appearing social patterns
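The co-appearance and group cues above can be illustrated by counting over labeled training photos. This is a minimal sketch under stated assumptions: the function names, the support threshold and the brute-force enumeration (standing in for whichever data mining method the authors actually use) are hypothetical.

```python
from collections import Counter
from itertools import combinations


def co_appearance_counts(photos):
    """Count how often each unordered pair of people shares a photo."""
    counts = Counter()
    for people in photos:
        for pair in combinations(sorted(set(people)), 2):
            counts[pair] += 1
    return counts


def frequent_groups(photos, min_support=2, max_size=3):
    """Groups of 2..max_size people that appear together in at least min_support photos."""
    counts = Counter()
    for people in photos:
        unique = sorted(set(people))
        for size in range(2, max_size + 1):
            for group in combinations(unique, size):
                counts[group] += 1
    return {group: n for group, n in counts.items() if n >= min_support}


# Invented labels for four photos.
photos = [["anna", "ben", "carl"], ["anna", "ben"], ["anna", "ben", "carl"], ["dana"]]
print(co_appearance_counts(photos)[("anna", "ben")])      # 3
print(frequent_groups(photos)[("anna", "ben", "carl")])   # 2
```

Enumerating every subset is only practical because a single photo contains few people; a larger collection would call for a proper frequent-itemset algorithm.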

Body Detection and Recognition

… when faces are obscured or invisible

Detect upper and lower body parts

Bipartite matching of faces and bodies

Graph-based fusion of faces and clothing
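The bipartite matching of faces and bodies can be pictured as a small assignment problem within each photo. The sketch below is an assumption-laden illustration, not the poster's method: it uses Euclidean distance between a face centre and a hypothetical anchor point at the top of each detected body as the cost, and brute-force search instead of a dedicated assignment algorithm.

```python
from itertools import permutations
from math import dist


def match_faces_to_bodies(face_centres, body_anchors):
    """Assign each face to a distinct body so that the total distance is minimal.

    Brute force over permutations, which is fine for the handful of people
    in one photo. Assumes len(face_centres) <= len(body_anchors).
    Returns a dict mapping face index to body index.
    """
    best_cost, best = float("inf"), ()
    for perm in permutations(range(len(body_anchors)), len(face_centres)):
        cost = sum(dist(face_centres[f], body_anchors[b]) for f, b in enumerate(perm))
        if cost < best_cost:
            best_cost, best = cost, perm
    return dict(enumerate(best))


# Invented pixel coordinates: two faces, three detected bodies.
faces = [(10, 10), (50, 12)]
bodies = [(52, 30), (11, 28), (90, 30)]
print(match_faces_to_bodies(faces, bodies))  # {0: 1, 1: 0}
```

The unmatched third body in the example corresponds to the case named above, where a person's face is obscured or invisible but the body is still detected.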

[Figure: example graph with face nodes f1, f2 and f3; labels: unary potential, pairwise potential, face]

Aim

Resolve identities of people primarily by their faces

Incorporate rich contextual cues of personal photo collections where few individual people frequently appear together

Perform recognition by considering all contextual information at the same time (unlike traditional approaches that usually train a classifier and then predict identities independently)

$$u(w_n) = \frac{1}{Z}\, f_f(w_n)$$
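Read as a unary potential, the formula above normalizes a face similarity score by a constant Z. A minimal sketch of that normalization follows; the mapping from chi-square distance to similarity via exp(-d) and the input format are assumptions, since the poster does not spell out the face similarity function.

```python
from math import exp


def unary_potential(distances_by_person):
    """Normalise per-person face similarity scores so that they sum to one.

    distances_by_person maps a person label to the smallest chi-square distance
    between the face and that person's training samples (assumed input).
    """
    scores = {p: exp(-d) for p, d in distances_by_person.items()}  # assumed similarity form
    z = sum(scores.values())  # normalisation constant Z
    return {p: s / z for p, s in scores.items()}


potentials = unary_potential({"alice": 0.2, "bob": 1.5})
print(potentials["alice"] > potentials["bob"])  # True
```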

Experiments

Public Gallagher Dataset: ~600 photos, ~800 faces, 32 distinct people

Our dataset: ~3300 photos, ~5000 faces, 106 distinct people

All photos shot with a typical consumer camera

Considering only correctly detected faces (87%)

[Figure: three recognition setups over nodes labelled Tr and Te. (1) Face similarity only: all samples are independent. (2) Graph based on face similarities, with a unary potential at every node. (3) Graph combining face, upper body and lower body similarity, with a unary potential at every node.]

$$
p(w_n, w_m) =
\begin{cases}
\tau, & \text{if } w_n = w_m \wedge i_n \neq i_m \\
0, & \text{if } w_n = w_m \wedge i_n = i_m \\
co(w_n, w_m), & \text{otherwise}
\end{cases}
$$
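The three cases of the pairwise potential translate directly into a small function. In this sketch, w denotes the candidate person label of a face and i the photo it comes from, which is how the exclusivity constraint suggests reading the symbols; the default value of tau and the toy co-appearance table are assumptions.

```python
def pairwise_potential(w_n, w_m, i_n, i_m, co, tau=1.5):
    """Pairwise potential between two face nodes.

    w_n, w_m -- candidate person labels of the two faces
    i_n, i_m -- identifiers of the photos the faces come from
    co       -- callable returning the co-appearance potential of two people
    tau      -- reward for the same person in different photos (value assumed)
    """
    if w_n == w_m:
        # Same label: rewarded across photos, forbidden within one photo.
        return tau if i_n != i_m else 0.0
    return co(w_n, w_m)


# Invented co-appearance scores, looked up symmetrically.
table = {frozenset(("anna", "ben")): 1.2}
co = lambda a, b: table.get(frozenset((a, b)), 1.0)

print(pairwise_potential("anna", "anna", "img1", "img1", co))  # 0.0
print(pairwise_potential("anna", "anna", "img1", "img2", co))  # 1.5
print(pairwise_potential("anna", "ben", "img1", "img1", co))   # 1.2
```

The zero in the second case is what enforces the Unique People Constraint: any labeling that places the same person twice in one photo receives zero probability.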

[Figure: bar chart of gain at 3% training for + Graph. Model, + Social Semantics and + Body parts; vertical axis from 0% to 25%]

[Figure: LBP histograms computed for each block]
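A block-wise Local Binary Pattern descriptor of the kind the figure refers to can be sketched in plain Python. The basic 8-neighbour operator, the 2x2 block grid and the per-block normalization are assumptions made for illustration; the poster does not state which LBP variant or grid size is used.

```python
def lbp_code(img, r, c):
    """8-neighbour LBP code of the pixel at (r, c); neighbours >= centre set a bit."""
    center = img[r][c]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def block_lbp_histogram(img, grid=(2, 2)):
    """Concatenate normalised 256-bin LBP histograms computed per block."""
    rows, cols = len(img) - 2, len(img[0]) - 2  # interior pixels only
    codes = [[lbp_code(img, r + 1, c + 1) for c in range(cols)] for r in range(rows)]
    features = []
    for by in range(grid[0]):
        for bx in range(grid[1]):
            hist = [0] * 256
            for r in range(by * rows // grid[0], (by + 1) * rows // grid[0]):
                for c in range(bx * cols // grid[1], (bx + 1) * cols // grid[1]):
                    hist[codes[r][c]] += 1
            total = sum(hist) or 1
            features.extend(h / total for h in hist)
    return features


# A flat 6x6 grayscale patch: every neighbour equals the centre, so every code is 255.
flat = [[5] * 6 for _ in range(6)]
descriptor = block_lbp_histogram(flat)
print(len(descriptor), descriptor[255])  # 1024 1.0
```

Keeping one histogram per block preserves coarse spatial layout, so that texture around the eyes is not pooled with texture around the mouth.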