
AVA: A Large-Scale Database for Aesthetic Visual Analysis

Calvin Deutschbein

An Aside: Aesthetics

● Per Wikipedia, aesthetics is the study of beauty and taste.

● It is highly subjective, especially at current knowledge levels

● As with many cases in image recognition, establishment of ground truth is difficult

An Aside: Aesthetics

Top 4 Google Results for “beautiful” and “ugly”

Increasing subtlety increases difficulty

An Aside: Aesthetics

● Which of these is more beautiful?

● The right image has higher page rank, but...

Paper Goals

● Develop a consistent database for a clear ground truth in aesthetics testing
– As with ImageNet and Caltech 101, this spurs and focuses research

● Through experimentation, demonstrate the merits of an improved database
– Consider both scale and quality

Database Creation

● Artistic nature heavily complicates database creation
– For rating art, humans often use:
  ● Vested critics, e.g. journalists, content creators
  ● Established evaluation frameworks, e.g. star ratings
  ● Criticism aggregation, e.g. Rotten Tomatoes, the Oscars

● How to do this on the scale of an ML dataset?
– Many methods (e.g. Mechanical Turk) fail to provide vested critics

DPChallenge.com

● Photograph challenge site
– Participants create and evaluate: “vested critics”
– Challenges are named, so challenge names can serve as semantic tags

● Submissions for contests are ranked
– This determines “ground truth” aesthetics

● As this work is already complete, building the database is only a matter of data mining

An Example: “Fireworks”

Place 1/157: 7.4/10
Place 157/157: 4.2/10

Semantic tags are also associated with some images, but these images were associated with none

Anyway, this is novel

Another Nicety: Elegant Spreads

● Distributions of scores are approx. normal!

● Standard deviation is a function of mean score!

● High variance occurs on unusual images!

● This is as good as one could hope for...
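The mean and spread mentioned above can be computed directly from an image's vote histogram. The following is a minimal sketch, assuming votes are integers on a 1 to 10 scale (the slides only show scores out of 10); the function name `score_stats` and the example histogram are hypothetical and not taken from the dataset.

```python
import math


def score_stats(vote_counts):
    """Return (mean, standard deviation) of a vote histogram.

    vote_counts[i] is the number of voters who gave score i + 1,
    so a 10-element list covers scores 1 through 10 (assumed scale).
    """
    total = sum(vote_counts)
    if total == 0:
        raise ValueError("image has no votes")
    mean = sum((i + 1) * n for i, n in enumerate(vote_counts)) / total
    variance = sum(n * ((i + 1) - mean) ** 2
                   for i, n in enumerate(vote_counts)) / total
    return mean, math.sqrt(variance)


# Hypothetical, roughly bell-shaped histogram (illustration only).
example_votes = [0, 1, 3, 10, 30, 40, 30, 10, 3, 1]
mean, std = score_stats(example_votes)
print(f"mean={mean:.2f} std={std:.2f}")
```

Plotting `std` against `mean` over many images is one way to inspect the relationship between the two that the slide describes.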

Examples of Niceness

Legend gives percentage of results in colored cluster

Semantics & Score

Something interesting happens at extremes...

Exercising AVA

● To demonstrate the usefulness of AVA, it was used in three ways:
– Generic aesthetic quality categorization
– Content-based aesthetic classification
– “Style” categorization

Rating Aesthetic Quality

● Perhaps the most interesting test (linear SVM):
– Binary classification a la social media
– Produced better results when trained on more data
  ● Diminishing returns (not surprising)
– Including middling images improves results
  ● Including only extreme examples led to model confusion when confronted with ambiguous images
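One way to picture the "middling images" trade-off is as a labeling rule with an exclusion band around the decision threshold. This is a sketch under stated assumptions: the threshold of 5.0, the `margin` parameter, and the function name `binary_labels` are illustrative choices, not values confirmed by the slides.

```python
def binary_labels(mean_scores, threshold=5.0, margin=0.0):
    """Map image id -> 1 (high quality) or 0 (low quality).

    Images whose mean score lies strictly within `margin` of `threshold`
    are dropped as ambiguous. margin=0.0 keeps every image.
    Both default values are assumptions made for illustration.
    """
    labels = {}
    for image_id, score in mean_scores.items():
        if score >= threshold + margin:
            labels[image_id] = 1
        elif score <= threshold - margin:
            labels[image_id] = 0
    return labels


# Hypothetical mean scores (illustration only).
scores = {"a": 7.4, "b": 4.2, "c": 5.3}
all_images = binary_labels(scores)                # keeps middling image "c"
extremes_only = binary_labels(scores, margin=0.5)  # drops "c"
print(all_images, extremes_only)
```

Training on `extremes_only` corresponds to the "only extreme examples" setting; training on `all_images` corresponds to including middling images.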

Content Classification

● This leveraged the semantic tags

● Class-specific SVMs performed better than generic SVMs that didn't utilize classification
– Only true for content-based models
– In other cases (color, SIFT), generic is favored
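The difference between class-specific and generic models comes down to how the training set is partitioned before fitting. The sketch below shows only that partitioning step, assuming each example carries at most one semantic tag; the function name `partition_for_training`, the tag names, and the feature vectors are hypothetical, and the classifier itself is left out.

```python
from collections import defaultdict


def partition_for_training(examples):
    """Split (features, label, tag) triples into training sets.

    Returns (per_tag, generic):
      per_tag maps each semantic tag to its (features, label) pairs,
              one training set per class-specific model;
      generic holds every example, tagged or not, for the generic model.
    """
    per_tag = defaultdict(list)
    generic = []
    for features, label, tag in examples:
        generic.append((features, label))
        if tag is not None:
            per_tag[tag].append((features, label))
    return dict(per_tag), generic


# Hypothetical examples: (feature vector, binary label, semantic tag).
examples = [
    ([0.1, 0.9], 1, "landscape"),
    ([0.4, 0.2], 0, "landscape"),
    ([0.7, 0.3], 1, "portrait"),
    ([0.5, 0.5], 0, None),  # untagged image: generic model only
]
per_tag, generic = partition_for_training(examples)
print(sorted(per_tag), len(generic))
```

A separate classifier would then be fit on each entry of `per_tag`, and one more on `generic`, so the two approaches can be compared on the same images.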

“Style” Categorization

● Different photographic styles were used and recognized by DPChallenge voters
– Examples include silhouette and vanishing point

● This is novel to this paper
– It leverages the domain expertise of voters heavily

● The SVMs could now say “why” something is beautiful instead of just whether or not it is