
Gowri Somanath, Rohith MV, Chandra Kambhamettu
Video/Image Modeling and Synthesis (VIMS) Lab,
Department of Computer and Information Sciences, University of Delaware, USA.

http://vims.cis.udel.edu

First IEEE International Workshop on Benchmarking Facial Image Analysis Technologies (BeFIT) 2011

Context and motivation
VADANA overview
Comparison to existing datasets
VADANA subject and image metadata and annotations
Experiments with face verification
Standard partitions for benchmarking
Comparison of various baseline and state-of-the-art algorithms
Discussion and future directions

- Face recognition / person identification
- Face verification
- Age-related studies: face verification across age progression, age estimation, etc.
- Blood-relation / kinship verification
- Expression analysis
- Gender estimation
- Pose determination
- ...


A benchmark dataset should provide:
- A large-scale testbed to test the scalability of algorithms. For recognition, scale may be determined by the number of subjects and images; for verification, it may be measured by the number of image pairs available for training and testing.
- 'Real-world', 'uncontrolled', 'in the wild' data to check practical applicability.
- Preset, standardized training/testing partitions and uniform performance measures to allow fair and direct comparison of algorithms.

Key aspects
- Images: 2298 images from 43 subjects (26 male, 17 female); mostly high-quality 24-bit color digital images (30 scanned).
- Annotations: subject-level: gender, blood-relations (if any); image-level: age, pose, facial hair, occlusions, spectacles.
- Distribution: a large number of images per subject (depth vs. breadth); the largest number of intra-personal pairs (essential for verification); a good distribution of images across different age groups.
- 'Real-world' data with natural pose, expression and illumination variations.

(Figure: sample images organized along age and identity/subject axes, illustrating the data available for verification, age-progression and morphing studies, as well as for age estimation and aging-model studies.)

(Comparison table: FGNET, MORPH, LFW, PubFig and VADANA compared on natural variations, high-quality images, age annotation, a large number of intra-personal pairs, and other annotations; VADANA covers all of these.)

* FGNET and VADANA include other image annotations (such as pose). In addition, VADANA includes other subject-related annotations.

                                      | VADANA            | FGNET          | MORPH (Album 1) | MORPH (Album 2)
# subjects (# adults)                 | 43 (35)           | 82 (64)        | 631 (623)       | 13673 (13060)
Age range                             | 0-78              | 0-69           | 16-69           | 16-99
Total # images (adults)               | 2298 (1913)       | 1002 (363)     | 1690 (1520)     | 55608 (52271)
# images per subject                  | 3-300 (avg: 53)   | 6-18 (avg: 12) | 1-6 (avg: 2)    | 1-53 (avg: 4)
Total # intra-personal pairs (adults) | 168,833 (146,528) | 5,808 (1,164)  | 1,597 (1,324)   | 138,368 (130,397)

Numbers in parentheses indicate counts when only images with age >= 18 (adults) are considered.


Depth of dataset vs. breadth: for face verification, a large number of intra-personal image pairs is desired, which requires a large number of images per subject taken across different ages and imaging conditions. The other datasets offer more subjects (breadth), whereas VADANA offers more images per subject (depth).


VADANA has the largest number of intra-personal pairs among age-annotated datasets. This allows large-scale testing of algorithms, with more training and testing pairs available.

                                          | VADANA            | FGNET         | MORPH (Album 1) | MORPH (Album 2)
Total # intra-personal pairs (adults)     | 168,833 (146,528) | 5,808 (1,164) | 1,597 (1,324)   | 138,368 (130,397)
Age gap range (adults)                    | 0-37 (0-37)       | 0-54 (0-45)   | 0-29 (0-28)     | 0-69 (0-69)
# intra-pairs in age gap [0,2] (adults)   | 158,262 (137,683) | 855 (146)     | 402 (351)       | 128,742 (120,903)
# intra-pairs in age gap [3,5] (adults)   | 8,517 (7,967)     | 1,162 (229)   | 458 (389)       | 7,896 (7,805)
# intra-pairs in age gap [6,8] (adults)   | 696 (685)         | 932 (169)     | 292 (246)       | 664 (657)
# intra-pairs in age gap [9,inf) (adults) | 1,358 (193)       | 2,859 (620)   | 441 (338)       | 1,066 (1,032)


The [0,2] age gap shows the least appearance change due to aging, and is therefore generally used to estimate an upper bound on algorithm performance.


An age gap of ~10 years is encountered in applications such as passport image verification.

(Figure: for each dataset (VADANA, FGNET, MORPH Album 1, MORPH Album 2), areas are drawn proportional to the number of intra-personal pairs in each age-gap group. Maxima per gap: [0,2] VADANA ~158K; [3,5] VADANA ~8.5K; [6,8] FGNET ~0.9K; [9,inf) FGNET ~0.2K. VADANA's age-gap pairs come from depth rather than breadth.)

Variations
- VADANA: wide range of natural variations in pose, expression and illumination; facial hair, spectacles; partial occlusions.
- Mostly frontal pose (some profile poses), neutral expression, controlled illumination.

Other metadata
- VADANA: gender, relation between subjects (if any), facial hair, spectacles, pose.
- FGNET: gender, facial hair, spectacles, 68 fiducial points.
- MORPH (Album 1): gender, race.
- MORPH (Album 2): gender, race.

Image quality
- VADANA: 24-bit color digital images (30 scanned).
- FGNET: mostly scanned images.
- MORPH (Album 1): digitally scanned at 300 dpi; grayscale PGM.
- MORPH (Album 2): 8-bit color images.

Aligned faces
- VADANA: yes (an additional parallel version with all faces aligned).
- FGNET: no.
- MORPH (Album 1): no.
- MORPH (Album 2): no.

# images in age group (# subjects) | VADANA    | FGNET    | MORPH (Album 1)
[0,18]                             | 387 (15)  | 687 (80) | 247 (209)
[19,30]                            | 1084 (13) | 186 (62) | 921 (513)
[31,60]                            | 761 (21)  | 122 (35) | 516 (304)
[61,inf)                           | 66 (4)    | 8 (4)    | 6 (5)


VADANA release contents:
- Raw image set: face images for each subject (varying image sizes).
- Aligned image set: images resized to 250 x 250 pixels and aligned using funneling.
- Subject data: subject ID, gender, year of birth.
- Image data: age; binary indicators for facial hair, spectacles and occlusions; horizontal and vertical pose (left, right, center; up, down, center).
- Relationship data: sibling pairs and parent-child pairs (e.g., [011,020] indicates subject 011 is the parent of subject 020).
- Experiment sets: multiple partitions of the dataset for uniform comparison.
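A minimal sketch of how the subject, image and relationship annotations listed above could be represented in code is given below. The class and field names (and the example values) are illustrative assumptions; they do not reflect the dataset's actual file format.

```python
# Sketch of the VADANA annotation structure described above.
# Field names and example values are illustrative assumptions only.
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class ImageRecord:
    path: str            # face image (raw, or the 250x250 funneled/aligned version)
    age: int             # subject's age in this image
    facial_hair: bool    # binary indicators
    spectacles: bool
    occlusion: bool
    h_pose: str          # "left" | "center" | "right"
    v_pose: str          # "up" | "center" | "down"

@dataclass
class SubjectRecord:
    subject_id: str      # e.g. "011"
    gender: str          # "M" | "F"
    year_of_birth: int
    images: List[ImageRecord] = field(default_factory=list)

# Relationship data as subject-ID pairs, e.g. ("011", "020") = parent and child.
ParentChildPair = Tuple[str, str]
SiblingPair = Tuple[str, str]

# Hypothetical example record:
subject = SubjectRecord(
    subject_id="011", gender="M", year_of_birth=1970,
    images=[ImageRecord(path="011/img_0001.jpg", age=35, facial_hair=False,
                        spectacles=True, occlusion=False,
                        h_pose="center", v_pose="center")],
)
parent_child_pairs: List[ParentChildPair] = [("011", "020")]
```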

We include preset standard partitions (experiment sets) of VADANA for benchmarking purposes.

For each set:
- Fix the age-gap bin.
- Fix a maximum limit on the number of intra-pairs per subject.
- Sample the dataset to obtain the intra-pairs.
- Divide the intra-pairs into folds such that
  ▪ subjects across folds are non-overlapping (hence training and testing subjects are non-overlapping), and
  ▪ the number of pairs in each fold is nearly equal.

Each fold also has a corresponding set of extra-pairs; the number of extra-pairs equals the number of intra-pairs in the fold.

The different sets for each age gap have less than 60% overlap (achieved by random sampling of images and pairs per subject).
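A minimal sketch of this fold construction (fix an age-gap bin, cap intra-pairs per subject, and split into subject-disjoint folds of roughly equal size) is shown below; the function name, the greedy balancing step and the input format are illustrative assumptions, not the released partitioning code.

```python
# Sketch of building subject-disjoint folds of intra-personal pairs for one age-gap bin.
import random
from collections import defaultdict
from itertools import combinations

def build_folds(images, age_gap, max_pairs_per_subject, n_folds, seed=0):
    """images: list of (subject_id, image_id, age) tuples; age_gap: (lo, hi)."""
    rng = random.Random(seed)
    lo, hi = age_gap

    # 1. Group images by subject.
    by_subject = defaultdict(list)
    for subject_id, image_id, age in images:
        by_subject[subject_id].append((image_id, age))

    # 2. Enumerate intra-personal pairs whose age difference falls in the bin,
    #    capped per subject by random sampling.
    pairs_by_subject = {}
    for subject_id, recs in by_subject.items():
        pairs = [(a, b) for a, b in combinations(recs, 2)
                 if lo <= abs(a[1] - b[1]) <= hi]
        rng.shuffle(pairs)
        pairs_by_subject[subject_id] = pairs[:max_pairs_per_subject]

    # 3. Greedily assign whole subjects to folds: subjects never overlap across
    #    folds, and fold sizes stay nearly equal.
    folds = [[] for _ in range(n_folds)]
    for subject_id in sorted(pairs_by_subject,
                             key=lambda s: -len(pairs_by_subject[s])):
        min(folds, key=len).extend(pairs_by_subject[subject_id])
    return folds
```

Each fold would additionally be matched with an equal number of randomly sampled extra-personal (different-subject) pairs, as described above.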

Age gap | # of intra-pairs per fold x # folds | # of sets
[0,2]   | ~7000 x 5 folds                     | 4
[3,5]   | ~240 x 3 folds                      | 2
[6,8]   | ~145 x 2 folds                      | 3

For each age gap, performance is averaged over the sets.

For each set, CAR and CRR are measured for different parameters of the algorithm:
▪ Correct Acceptance Rate (CAR) = # of correctly classified intra-pairs / total # of intra-pairs
▪ Correct Rejection Rate (CRR) = # of correctly classified extra-pairs / total # of extra-pairs

The ROC curve is obtained by plotting the CARs and CRRs (averaged over the folds).

Accuracy is reported as the performance at the Equal Error Rate (EER), i.e., the CAR at the point on the ROC curve where CAR = CRR.
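Assuming a verification algorithm that outputs one score per pair (higher meaning "more likely the same person"), the CAR/CRR sweep and the accuracy at EER described above can be sketched as follows; this is a generic illustration of the protocol, not the original evaluation code.

```python
# Sweep a decision threshold over pair scores, compute CAR and CRR,
# and report the accuracy at the equal error rate (CAR where CAR ~= CRR).
import numpy as np

def car_crr_curve(intra_scores, extra_scores, thresholds):
    intra, extra = np.asarray(intra_scores), np.asarray(extra_scores)
    cars = np.array([np.mean(intra >= t) for t in thresholds])  # accepted intra-pairs
    crrs = np.array([np.mean(extra < t) for t in thresholds])   # rejected extra-pairs
    return cars, crrs

def accuracy_at_eer(intra_scores, extra_scores, num_thresholds=1000):
    scores = np.concatenate([np.asarray(intra_scores), np.asarray(extra_scores)])
    thresholds = np.linspace(scores.min(), scores.max(), num_thresholds)
    cars, crrs = car_crr_curve(intra_scores, extra_scores, thresholds)
    idx = np.argmin(np.abs(cars - crrs))  # operating point where CAR ~= CRR
    return cars[idx]

# Illustrative usage with synthetic scores:
rng = np.random.default_rng(0)
print(accuracy_at_eer(rng.normal(1.0, 1.0, 500), rng.normal(0.0, 1.0, 500)))
```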

Baseline: a variant of eigenfaces.
- Form the eigenfaces basis from 10 randomly sampled images from the subjects.
- Select enough eigenvectors to explain at least 75% of the variance.
- Represent each face image by projecting it onto the eigenfaces basis.
- Use the difference of these representations as the feature for a given image pair.
- Use a Random Forest classifier for binary classification of each pair as an intra-pair or an extra-pair.
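A minimal sketch of this baseline with scikit-learn is given below: fit a PCA basis that explains at least 75% of the variance, use the difference of projections as the pair feature, and classify pairs with a random forest. The array shapes, parameter values and the synthetic data are illustrative assumptions.

```python
# Eigenfaces-style baseline sketch: PCA projection difference + Random Forest.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier

def fit_eigenfaces_basis(sampled_faces, variance=0.75):
    # Keep enough eigenvectors to explain at least 75% of the variance.
    return PCA(n_components=variance, svd_solver="full").fit(sampled_faces)

def pair_features(pca, faces_a, faces_b):
    # Pair feature = difference of the eigenface projections of the two images.
    return pca.transform(faces_a) - pca.transform(faces_b)

# Hypothetical usage with random vectors standing in for flattened face images:
rng = np.random.default_rng(0)
pca = fit_eigenfaces_basis(rng.random((10, 64 * 64)))   # 10 randomly sampled images

train_a, train_b = rng.random((200, 64 * 64)), rng.random((200, 64 * 64))
labels = rng.integers(0, 2, 200)                        # 1 = intra-pair, 0 = extra-pair
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(pair_features(pca, train_a, train_b), labels)

test_a, test_b = rng.random((50, 64 * 64)), rng.random((50, 64 * 64))
pair_scores = clf.predict_proba(pair_features(pca, test_a, test_b))[:, 1]
```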

Pair-matching algorithm of Nowak et al., 2007.
- Works by training classifiers on features extracted from image patches.
- Implementation provided by the authors.
- We used the same parameters as for the LFW dataset and found the method to be fairly stable under slight parameter changes.

State-of-the-art face verification algorithm under age progression by Ling et al., 2010.
- Gradient Orientation Pyramids (GOP) are used as features to train SVM classifiers (Gaussian kernel).
- Specifically designed to handle age separation in image pairs; reported to be state-of-the-art on FGNET and the Passport dataset.
- No public implementation is currently available; our implementation was verified by closely matching the authors' results on FGNET.
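As a rough illustration of the gradient-orientation-pyramid idea (not the implementation of Ling et al.), one can compute per-pixel normalized gradient directions at several pyramid levels and feed a pair-level feature to an SVM with a Gaussian (RBF) kernel. The number of levels, the smoothing, and the choice of per-pixel cosine as the pair feature are simplifying assumptions.

```python
# Rough sketch of GOP-style features for an image pair, classified with an RBF-kernel SVM.
import numpy as np
from scipy.ndimage import gaussian_filter, zoom
from sklearn.svm import SVC

def gradient_orientation(img, eps=1e-6):
    gy, gx = np.gradient(img.astype(float))
    mag = np.sqrt(gx ** 2 + gy ** 2) + eps
    return np.stack([gx / mag, gy / mag], axis=-1)  # unit gradient direction per pixel

def gop(img, levels=3):
    feats, cur = [], img.astype(float)
    for _ in range(levels):
        feats.append(gradient_orientation(cur))
        cur = zoom(gaussian_filter(cur, sigma=1.0), 0.5)  # smooth and downsample
    return feats

def pair_feature(img_a, img_b, levels=3):
    # Per-pixel cosine between the two orientation fields, concatenated over levels.
    return np.concatenate([np.sum(a * b, axis=-1).ravel()
                           for a, b in zip(gop(img_a, levels), gop(img_b, levels))])

# Hypothetical usage with random images standing in for aligned face pairs:
rng = np.random.default_rng(0)
pairs = [(rng.random((64, 64)), rng.random((64, 64))) for _ in range(40)]
X = np.stack([pair_feature(a, b) for a, b in pairs])
y = rng.integers(0, 2, len(pairs))                      # 1 = intra-pair, 0 = extra-pair
svm = SVC(kernel="rbf", gamma="scale").fit(X, y)
```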

All experiments were performed on the [0,2] age gap and on the aligned version of VADANA. The eigenfaces baseline performs very well on the current benchmark dataset (FGNET) but only just above chance on VADANA, showing the contrasting nature of the two.

Accuracy (%)                      | VADANA | FGNET    | LFW | Jain
Eigenfaces (PCA + RF)             | 52.33  | 99.33    | -   | -
Nowak pair matching (SIFT + ERFC) | 61.52  | 67 ± 2.2 | 73* | 84.2*
Ling et al. (SVM + GOP)           | 57.43  | 73       | -   | -

* Obtained from the respective original papers.


LFW and Jain are large-scale real-world datasets. The Nowak algorithm performs better on VADANA than the other two methods, consistent with its good performance on other 'in the wild' imagery.



The state-of-the-art algorithm specifically designed for verification across age progression performs better than eigenfaces, yet its performance on VADANA does not reach the accuracy it achieves on FGNET.


Algorithms based on image differences (eigenfaces, Ling et al.) perform well on FGNET but not on VADANA, whereas Nowak pair matching, which is based on image patches, performs well on VADANA compared to the other two. The number of training pairs may be a contributing factor in Nowak's success, since FGNET offers only 148 intra-pairs in [0,2] while VADANA offers 35,000. Image quality may also be a minor contributing factor (SIFT does better).


VADANA provides:
- Real-world: a large-scale, real-world, age-annotated dataset.
- Depth vs. breadth: the largest number of intra-personal pairs in total and in the [0,2] and [3,5] age gaps.
- Various cross-sections: different cross-sections and variations within and across subjects (pose, expression, illumination and subject appearance).
- Distribution: a good distribution of subjects and images across ages and age gaps.
- Annotations: subject and image annotations useful for various other facial analysis algorithms.
- Benchmark: multiple standard partitions for uniform comparison.

Future directions

- Detailed comparison of the different types of algorithms (patch-based vs. image-difference) to determine which scales better to large-scale, age-annotated, real-world data.
- Quantitative study of the sensitivity of the algorithms to different factors (age difference, gender, image quality, etc.).

Requests can be sent through the dataset webpage http://vims.cis.udel.edu/vadana.html
