Petacat: Applying ideas from Copycat to image understanding

Page 1: Petacat :   Applying ideas from Copycat  to image understanding

Petacat: Applying ideas from Copycat

to image understanding

Page 2: Petacat :   Applying ideas from Copycat  to image understanding

How Streetscenes Works (Bileschi, 2006)

1. Densely tile the image with windows of different sizes.

2. HMAX C2 features are computed in each window.

3. The features in each window are given as input to each of five trained support vector machines (“pedestrian”, “car”, “bicycle”, “building”, “tree”).

4. If any SVM returns a classification with a score above a learned threshold, that object is said to be “detected”.
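The four steps above can be sketched as a plain sliding-window loop. This is only an illustration of the exhaustive search, not Bileschi's code: `tile_windows`, `detect`, the `features` callable (standing in for HMAX C2 computation), the per-category scoring functions (standing in for trained SVMs), and the stride are all assumptions.

```python
from typing import Callable, Dict, Iterator, List, Sequence, Tuple

Window = Tuple[int, int, int]  # (x, y, size) of a square window; hypothetical layout


def tile_windows(width: int, height: int, sizes: Sequence[int],
                 stride_frac: float = 0.5) -> Iterator[Window]:
    """Step 1: densely tile the image with windows of different sizes."""
    for size in sizes:
        stride = max(1, int(size * stride_frac))
        for y in range(0, height - size + 1, stride):
            for x in range(0, width - size + 1, stride):
                yield (x, y, size)


def detect(width: int, height: int, sizes: Sequence[int],
           features: Callable[[Window], List[float]],
           classifiers: Dict[str, Callable[[List[float]], float]],
           thresholds: Dict[str, float]) -> List[Tuple[str, Window, float]]:
    """Steps 2-4: score every window with every classifier and keep
    the (label, window, score) triples whose score exceeds the threshold."""
    detections = []
    for win in tile_windows(width, height, sizes):
        feats = features(win)  # stand-in for HMAX C2 features
        for label, score_fn in classifiers.items():  # stand-in for trained SVMs
            score = score_fn(feats)
            if score > thresholds[label]:
                detections.append((label, win, score))
    return detections
```

The cost is the product of the number of windows and the number of categories, which is the scalability problem raised on the following slides.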

Page 3: Petacat :   Applying ideas from Copycat  to image understanding

Object detection (here, “car”) with HMAX model (Bileschi, 2006)

Page 5: Petacat :   Applying ideas from Copycat  to image understanding

Limitations of Streetscenes approach for “image understanding”

• Exhaustive search – not scalable

• Does not recognize spatial and abstract relationships among objects for whole scene understanding

• Has no prior knowledge about object categories and their place in “conceptual space”

• HMAX model is completely feed-forward; no feedback to allow context to aid in scene understanding.

  – Where should feedback come in?

Page 6: Petacat :   Applying ideas from Copycat  to image understanding

Representation of High-Level Knowledge: A Simple Semantic Network (or “Ontology”)

[Figure: semantic network for “Dog walking”. Nodes: Person, Dog, leash, walking. Link labels: holds, attached to, action (twice).]

Page 7: Petacat :   Applying ideas from Copycat  to image understanding

But...

Page 8: Petacat :   Applying ideas from Copycat  to image understanding

Modified Ontology

[Figure: the “Dog walking” network (nodes Person, Dog, leash, walking; link labels holds, attached to, action) with added nodes Dog Group and running.]

Page 9: Petacat :   Applying ideas from Copycat  to image understanding

Modified Ontology: Allowing “conceptual slippage”

[Figure: the “Dog walking” network (nodes Person, Dog, leash, walking; link labels holds, attached to, action) with added nodes Dog Group and running.]

Page 10: Petacat :   Applying ideas from Copycat  to image understanding

But...

Page 11: Petacat :   Applying ideas from Copycat  to image understanding
Page 12: Petacat :   Applying ideas from Copycat  to image understanding

Modified Ontology

[Figure: the “Dog walking” network (nodes Person, Dog, leash, walking, running, Dog Group; link labels holds, attached to, action) with added nodes Cat and Iguana.]

Page 13: Petacat :   Applying ideas from Copycat  to image understanding

But...

Page 14: Petacat :   Applying ideas from Copycat  to image understanding

But...

Page 15: Petacat :   Applying ideas from Copycat  to image understanding

But...

Page 16: Petacat :   Applying ideas from Copycat  to image understanding

But...

Page 17: Petacat :   Applying ideas from Copycat  to image understanding

Modified Ontology

[Figure: the “Dog walking” network (nodes Person, Dog, leash, walking, running, Dog Group; link labels holds, attached to, action) with added nodes Cat, Iguana, Bicycle, Car, and Helicopter.]

Page 18: Petacat :   Applying ideas from Copycat  to image understanding

But...

Page 19: Petacat :   Applying ideas from Copycat  to image understanding

[Figure: network of concepts, including Person, Dog, Leash, Outside, Inside, Ground, Walking, Running, Standing, Tree, Stick, Close to, Far from, Beach, Sidewalk, Attached to, Grass, Lawn mower, Gasoline, Runway, Airplane, Helicopter, Above, Left of, Holding, Dog walking, Dog grooming, Car, Sky, Army, Track, Fanny pack, Backpack.]

Page 25: Petacat :   Applying ideas from Copycat  to image understanding

Need dynamical process of constructing representation.

Information gained during the unfolding of perception feeds back to guide the directions the perceptual process takes.

– Ongoing perception of “context” brings in appropriate concepts and conceptual slippages, and avoids exhaustive search

– Prior, higher-level knowledge interacts with lower-level vision in both directions (bottom-up and top-down).

– Concepts are “fluid”, allowed to “slip” in certain contexts.

• This allows perception of essential similarity in the face of superficial differences—i.e., analogy-making.

Page 27: Petacat :   Applying ideas from Copycat  to image understanding

Active Symbol Architecture (Hofstadter et al., 1995)

• Basis for:

  – Copycat (analogy-making), Hofstadter & Mitchell

  – Tabletop (analogy-making), Hofstadter & French

  – Metacat (analogy-making and self-awareness), Hofstadter & Marshall

  and many others…

Page 28: Petacat :   Applying ideas from Copycat  to image understanding
Page 29: Petacat :   Applying ideas from Copycat  to image understanding
Page 30: Petacat :   Applying ideas from Copycat  to image understanding

Active Symbol Architecture (Hofstadter et al., 1995)

[Figure: components of the architecture: semantic network, workspace, perceptual agents (codelets), and temperature.]

Page 31: Petacat :   Applying ideas from Copycat  to image understanding

Petacat:

(Descendant of Copycat)

Integration of Active Symbol Architecture and HMAX

Initial task: Decide if image is an instance of “taking a dog for a walk”, and if so, how good an instance it is.

Page 32: Petacat :   Applying ideas from Copycat  to image understanding

Semantic Network

[Figure: semantic network centered on the situation “taking a dog for a walk”. Node types: Object, Action, Spatial Relation. Object and location nodes: person, dog, cat, horse, leash, rope, belt, string, sidewalk, road, beach, trail, grass, outdoors, indoors. Action nodes: walks, runs, stands, sits, drives, flies, swims. Spatial relation nodes: is on, is touching, is in front of, is behind, is next to. Link labels: has location, has action, has component.]

Page 33: Petacat :   Applying ideas from Copycat  to image understanding

Semantic Network

[Figure: the semantic network for “taking a dog for a walk”, with a legend distinguishing its two kinds of links: property links and slip links.]
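Property links and slip links can be held in a small graph structure. The sketch below is an assumption about one possible representation, not Petacat's implementation: the class names, the “Situation” node kind, and the slip strengths are hypothetical, and only the node and link labels come from the slide.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    name: str
    kind: str  # e.g. "Object", "Action", "Spatial Relation"
    property_links: List[Tuple[str, str]] = field(default_factory=list)  # (label, target)
    slip_links: Dict[str, float] = field(default_factory=dict)  # target -> strength


class SemanticNetwork:
    def __init__(self) -> None:
        self.nodes: Dict[str, Node] = {}

    def add_node(self, name: str, kind: str) -> None:
        self.nodes[name] = Node(name, kind)

    def add_property(self, source: str, label: str, target: str) -> None:
        self.nodes[source].property_links.append((label, target))

    def add_slip(self, a: str, b: str, strength: float) -> None:
        # Slip links are stored symmetrically in this sketch.
        self.nodes[a].slip_links[b] = strength
        self.nodes[b].slip_links[a] = strength

    def slippages(self, name: str, min_strength: float = 0.0) -> List[str]:
        """Concepts that `name` may slip to, strongest first."""
        links = self.nodes[name].slip_links
        candidates = [n for n, s in links.items() if s >= min_strength]
        return sorted(candidates, key=lambda n: -links[n])


def build_example() -> SemanticNetwork:
    net = SemanticNetwork()
    net.add_node("taking a dog for a walk", "Situation")  # hypothetical kind
    for obj in ("person", "dog", "cat", "horse", "leash", "rope"):
        net.add_node(obj, "Object")
    for part in ("person", "dog", "leash"):
        net.add_property("taking a dog for a walk", "has component", part)
    net.add_slip("dog", "cat", 0.6)     # strengths are made-up values
    net.add_slip("dog", "horse", 0.3)
    net.add_slip("leash", "rope", 0.7)
    return net
```

For example, `build_example().slippages("dog")` lists the concepts that “dog” is allowed to slip to, ordered by the assumed link strength.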

Page 34: Petacat :   Applying ideas from Copycat  to image understanding

Semantic Network: Properties of nodes

[Figure: the semantic network for “taking a dog for a walk”, with property links and slip links, annotated with properties of nodes.]

Page 35: Petacat :   Applying ideas from Copycat  to image understanding

Workspace

Page 36: Petacat :   Applying ideas from Copycat  to image understanding

Semantic network

Workspace

Page 37: Petacat :   Applying ideas from Copycat  to image understanding

Semantic network

Perceptual Agents (Codelets)

Codelets as active symbols

Page 38: Petacat :   Applying ideas from Copycat  to image understanding

[Figure: semantic network for “taking a dog for a walk”, with the same nodes and link labels as on the earlier Semantic Network slides.]

Page 39: Petacat :   Applying ideas from Copycat  to image understanding

[Figure: semantic network for “taking a dog for a walk”, with the same nodes and link labels as on the earlier Semantic Network slides.]

Page 40: Petacat :   Applying ideas from Copycat  to image understanding

[Figure: semantic network for “taking a dog for a walk”, with the same nodes and link labels as on the earlier Semantic Network slides.]

Page 41: Petacat :   Applying ideas from Copycat  to image understanding

[Figure: semantic network for “taking a dog for a walk”, with the same nodes and link labels as on the earlier Semantic Network slides.]

Page 42: Petacat :   Applying ideas from Copycat  to image understanding

Dog?

Illustration of what we plan to have happen – not a real run of Petacat

Page 43: Petacat :   Applying ideas from Copycat  to image understanding

Dog? Dog?

Person?

Illustration of what we plan to have happen – not a real run of Petacat

Page 44: Petacat :   Applying ideas from Copycat  to image understanding

Dog? Dog?

Sidewalk?

Person?

Illustration of what we plan to have happen – not a real run of Petacat

Page 45: Petacat :   Applying ideas from Copycat  to image understanding

Dog? Dog?

Sidewalk?

Person?

Dog?

Outdoors?

Illustration of what we plan to have happen – not a real run of Petacat

Page 46: Petacat :   Applying ideas from Copycat  to image understanding

Dog? Dog?

Sidewalk?

Person?

Dog?

Outdoors?

Scout codelets: Send C1 features in window to corresponding SVM. If positive result, post builder codelet with urgency equal to SVM’s confidence.

Illustration of what we plan to have happen – not a real run of Petacat

Page 47: Petacat :   Applying ideas from Copycat  to image understanding

Scout codelets: Send C1 features in window to corresponding SVM. If positive result, post builder codelet with urgency equal to SVM’s confidence.

Scout results in this example:

• Dog? negative

• Dog? negative

• Sidewalk? positive: 0.4

• Person? negative

• Outdoors? positive: 0.7

• Dog? positive: 0.8

Illustration of what we plan to have happen – not a real run of Petacat
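The scout step described above might look roughly like the following. This is a sketch under assumptions: `BuilderCodelet`, `run_scout`, and the `svm_confidence` callable (returning a signed confidence, positive for a positive classification) are hypothetical names, and the coderack is modeled as a plain list.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Window = Tuple[int, int, int]  # (x, y, size); hypothetical layout


@dataclass
class BuilderCodelet:
    category: str
    window: Window
    urgency: float  # set equal to the scout SVM's confidence


def run_scout(category: str, window: Window,
              svm_confidence: Callable[[str, Window], float],
              coderack: List[BuilderCodelet]) -> bool:
    """Run the cheap C1-based check for `category` in `window`.
    On a positive result, post a builder codelet whose urgency is the confidence."""
    confidence = svm_confidence(category, window)  # stand-in for C1 features + SVM
    if confidence > 0:
        coderack.append(BuilderCodelet(category, window, urgency=confidence))
        return True
    return False
```

With the values on the slide, the “Dog?” window scored 0.8 would post a builder with urgency 0.8, while the negative “Person?” window would post nothing.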

Page 48: Petacat :   Applying ideas from Copycat  to image understanding

Builder codelets: Ask HMAX to compute C2 features using prototypes specific to the object (or scene), and send them to corresponding SVM. If positive, decide to build structure with probability equal to SVM confidence. Break competing structures if necessary.

Scout results in this example:

• Dog? negative

• Dog? negative

• Sidewalk? positive: 0.4

• Person? negative

• Outdoors? positive: 0.7

• Dog? positive: 0.8

Illustration of what we plan to have happen – not a real run of Petacat

Page 49: Petacat :   Applying ideas from Copycat  to image understanding

Builder codelets: Ask HMAX to compute object-/scene-specific C2 features, and send them to corresponding SVM. If positive, decide to build structure with probability equal to SVM confidence. Break competing structures if necessary.

Outdoors

Dog

Illustration of what we plan to have happen – not a real run of Petacat
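A builder codelet's decision rule can be sketched as follows. The function name, the representation of the workspace as a set of labels, and the `competitors` argument are assumptions made for illustration; the only behavior taken from the slide is "build with probability equal to SVM confidence, breaking competing structures if necessary".

```python
import random
from typing import Iterable, Set


def run_builder(structure: str, c2_confidence: float, workspace: Set[str],
                rng: random.Random, competitors: Iterable[str] = ()) -> bool:
    """Decide whether to build `structure` in the workspace.

    c2_confidence is assumed to lie in [0, 1] for a positive result from the
    object-specific C2 SVM, and to be <= 0 for a negative result.
    """
    if c2_confidence <= 0:
        return False  # negative classification: nothing is built
    if rng.random() >= c2_confidence:
        return False  # positive, but the probabilistic decision went against building
    for rival in competitors:
        workspace.discard(rival)  # break competing structures if necessary
    workspace.add(structure)
    return True
```

Because the decision is stochastic, a structure with confidence 0.4 is built less often than one with confidence 0.8, but it is not ruled out.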

Page 50: Petacat :   Applying ideas from Copycat  to image understanding

[Figure: semantic network for “taking a dog for a walk”, with the same nodes and link labels as on the earlier Semantic Network slides.]

Page 51: Petacat :   Applying ideas from Copycat  to image understanding

[Figure: workspace. Built structures: Dog, Outdoors. Candidates under consideration: Dog?, Leash? (two), Sidewalk?, Person? (two).]

Illustration of what we plan to have happen – not a real run of Petacat

Page 52: Petacat :   Applying ideas from Copycat  to image understanding

[Figure: workspace. Built structures: Dog, Outdoors, Sidewalk, and two Person structures (strength 0.75 and strength 0.6).]

Illustration of what we plan to have happen – not a real run of Petacat

Page 53: Petacat :   Applying ideas from Copycat  to image understanding

[Figure: workspace. Built structures: Dog, Person, Outdoors, Sidewalk.]

Illustration of what we plan to have happen – not a real run of Petacat

Page 54: Petacat :   Applying ideas from Copycat  to image understanding

[Figure: semantic network for “taking a dog for a walk”, with the same nodes and link labels as on the earlier Semantic Network slides.]

Page 55: Petacat :   Applying ideas from Copycat  to image understanding

[Figure: workspace. Built structures: Dog, Person, Outdoors, Sidewalk. Candidates under consideration: Leash? (two), Dog? (two), Sidewalk?, Rope?.]

Illustration of what we plan to have happen – not a real run of Petacat

Page 56: Petacat :   Applying ideas from Copycat  to image understanding

[Figure: workspace. Built structures: Dog, Person, Outdoors, Sidewalk, Leash, and a second Dog (weak).]

Illustration of what we plan to have happen – not a real run of Petacat

Page 57: Petacat :   Applying ideas from Copycat  to image understanding

[Figure: workspace. Built structures: Dog, Person, Outdoors, Sidewalk, Leash, Dog (weak), Dog (strong).]

Illustration of what we plan to have happen – not a real run of Petacat

Page 58: Petacat :   Applying ideas from Copycat  to image understanding

[Figure: workspace. Built structures: Dog, Person, Outdoors, Sidewalk, Leash, and a second Dog.]

Illustration of what we plan to have happen – not a real run of Petacat

Page 59: Petacat :   Applying ideas from Copycat  to image understanding

[Figure: semantic network for “taking a dog for a walk”, with the same nodes and link labels as on the earlier Semantic Network slides.]

Page 60: Petacat :   Applying ideas from Copycat  to image understanding

Once objects begin to be built, relation and grouping codelets can run on them.

[Figure: workspace. Built structures: Dog, Dog, Person, Outdoors, Sidewalk, Leash. Relations shown: is next to (two), is in front of (two). Group shown: Dog group.]

Illustration of what we plan to have happen – not a real run of Petacat

Page 61: Petacat :   Applying ideas from Copycat  to image understanding

Once objects begin to be built, relation and grouping codelets can run on them.

[Figure: workspace. Built structures: Dog, Dog, Person, Outdoors, Sidewalk, Leash. Relations shown: is next to (two). Group shown: Dog group.]

Illustration of what we plan to have happen – not a real run of Petacat

Page 62: Petacat :   Applying ideas from Copycat  to image understanding

[Figure: workspace. Built structures: Dog, Dog, Person, Outdoors, Sidewalk, Leash. Relations shown: is next to (three). Group shown: Dog group.]

Illustration of what we plan to have happen – not a real run of Petacat

Page 65: Petacat :   Applying ideas from Copycat  to image understanding

How codelets decide where to look

System starts out with weak segmentation (e.g., “normalized cuts” algorithm)

System creates “heat maps” for location and scale of objects in general (at each pixel, the probability of finding an object at this location and at a particular height/width of bounding box).

Object scout codelets choose location and scale probabilistically from these heat maps.

Page 67: Petacat :   Applying ideas from Copycat  to image understanding

How codelets decide where to look

When codelets look for individual object categories (e.g., dog), object-specific heat maps are created

As codelets build structure, heat maps are continually updated to reflect prior (learned) expectations about location and scale as a function of location and scale of “built” objects (as well as original weak segmentation).

[Figure: person heat map for an image containing a built Dog structure, with two “Person?” candidate windows.]
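Choosing where to look, and updating that choice as structures are built, can be sketched with a heat map stored as a dictionary of window weights. Everything here is an assumption for illustration: the `(x, y, height, width)` window key, the function names, and the `expectation` callable that stands in for learned expectations given already-built objects.

```python
import random
from typing import Callable, Dict, Tuple

Window = Tuple[int, int, int, int]  # (x, y, height, width); hypothetical layout
HeatMap = Dict[Window, float]       # window -> non-negative weight


def sample_window(heat_map: HeatMap, rng: random.Random) -> Window:
    """Pick a location and scale with probability proportional to its weight."""
    windows = list(heat_map)
    weights = [heat_map[w] for w in windows]
    return rng.choices(windows, weights=weights, k=1)[0]


def update_heat_map(heat_map: HeatMap,
                    expectation: Callable[[Window], float]) -> HeatMap:
    """Reweight each window by an expectation factor and renormalize.
    If every window is ruled out, the original map is returned unchanged."""
    updated = {w: p * expectation(w) for w, p in heat_map.items()}
    total = sum(updated.values())
    if total == 0:
        return dict(heat_map)
    return {w: p / total for w, p in updated.items()}
```

For instance, once a dog has been built, an `expectation` function could favor person-sized windows near the dog, so that later “Person?” scouts sample mostly from that region.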

Page 69: Petacat :   Applying ideas from Copycat  to image understanding

How Petacat makes a final decision

“Situation” codelet is more likely to run when temperature is low.

[Figure: temperature display, the “taking a dog for a walk” situation, and the workspace. Built structures: Dog, Dog, Dog group, Person, Leash, Outdoors, Sidewalk. Relations shown: is next to (three).]

Illustration of what we plan to have happen – not a real run of Petacat

Page 71: Petacat :   Applying ideas from Copycat  to image understanding

Situation codelet tries to match prototypical situation with existing workspace structures, possibly allowing slippages.

[Figure: workspace structures (Dog, Dog, Dog group, Person, Leash, Outdoors, Sidewalk; two “is next to” relations) shown beside the prototypical situation “taking a dog for a walk”, whose labels are person, leash, dog, and outdoors, with links has component (three), has location, is next to, and is in front of.]
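Matching a prototypical situation against workspace structures, with slippages allowed, can be sketched as a simple lookup. This is a deliberately reduced illustration: it matches concept names only, ignores relations, and the function name and the slippage table are assumptions rather than Petacat's mechanism.

```python
from typing import Dict, List, Optional, Set


def match_situation(required: List[str], workspace: Set[str],
                    slippages: Dict[str, List[str]]) -> Optional[Dict[str, str]]:
    """Map each required concept to a workspace structure.

    A concept matches itself if present; otherwise the first permitted
    slippage found in the workspace is used. Returns None if any required
    concept has neither a direct match nor a slipped one.
    """
    mapping: Dict[str, str] = {}
    for concept in required:
        if concept in workspace:
            mapping[concept] = concept
            continue
        substitute = next(
            (s for s in slippages.get(concept, []) if s in workspace), None)
        if substitute is None:
            return None
        mapping[concept] = substitute
    return mapping
```

A workspace containing a dog group and a rope could then still match the “taking a dog for a walk” prototype, with “dog” slipping to “dog group” and “leash” slipping to “rope”.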

Page 74: Petacat :   Applying ideas from Copycat  to image understanding

[Figure: workspace structures matched against the prototypical situation “taking a dog for a walk” (person, leash, dog, outdoors); the workspace Dog group appears in the match together with an “is next to” relation.]

If resulting temperature is low enough, classify scene as positive.

If situation codelet fails enough times or does not run for a long time, program has increasing chance of ending with negative classification.

Page 75: Petacat :   Applying ideas from Copycat  to image understanding

If Petacat classifies the picture as positive, the temperature at the end of the run gives a measure of how good an instance the picture is (e.g., of the “dog walking” situation).
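The decision rule described on the last few slides can be sketched as below. The 0 to 100 temperature scale, the threshold, the linear "goodness" score, and the rates in `give_up_probability` are all assumptions chosen for illustration; the slides state only that low temperature yields a positive classification, that final temperature measures how good an instance the picture is, and that repeated failure or inactivity raises the chance of a negative ending.

```python
from typing import Optional, Tuple


def decide(match_succeeded: bool, temperature: float,
           threshold: float = 30.0) -> Tuple[str, Optional[float]]:
    """Classify as positive if the situation match succeeded and the
    resulting temperature (assumed scale 0..100) is low enough.
    The second value is a goodness score in [0, 1], higher for lower temperature."""
    if match_succeeded and temperature <= threshold:
        return ("positive", 1.0 - temperature / 100.0)
    return ("undecided", None)


def give_up_probability(failures: int, idle_steps: int,
                        per_failure: float = 0.05,
                        per_idle_step: float = 0.001) -> float:
    """Chance of ending with a negative classification; it grows with the
    number of failed situation codelets and with time since one last ran."""
    return min(1.0, per_failure * failures + per_idle_step * idle_steps)
```

An "undecided" result simply means the run continues until either a later match succeeds or the give-up probability fires.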

Page 81: Petacat :   Applying ideas from Copycat  to image understanding

Summary: How does Petacat avoid exhaustive search?

Recall Streetscenes system, which, given an image, does exhaustive search over:

• Window size and location in the image

  In Petacat, codelets choose window size and location based on learned expectations and perceived context, with probabilities continually changing as more information is obtained.

• C1, C2 features in windows

  Codelets request C2 features only in “relevant” windows, and request only C2 features that are relevant to what the codelet is looking for.

• Object categories (e.g., car, pedestrian, tree, etc.)

  Codelets look for object categories that are activated by context, based on prior expectations and currently perceived information.

Page 82: Petacat :   Applying ideas from Copycat  to image understanding

Summary: How does Petacat avoid exhaustive search?

• Petacat effects a parallel terraced scan (Hofstadter, 1995):

Codelets build structures at a rate (urgency) based on their perceived promise, which is continually updated as new information is perceived.

Temperature allows this (continually changing) rate to depend on the global state of the system.
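One way to picture urgency-based, temperature-dependent selection is the sketch below. The exponent rule is an assumption chosen for illustration and is not the formula used in Copycat or Petacat: at temperature 100 codelets are chosen in proportion to urgency, and as temperature falls the choice concentrates on the most urgent codelet. Urgencies are assumed to be positive.

```python
import random
from typing import List, Sequence


def selection_probabilities(urgencies: Sequence[float],
                            temperature: float) -> List[float]:
    """Turn urgencies (> 0) into selection probabilities.
    Temperature is clamped to the assumed range 1..100."""
    exponent = 100.0 / min(max(temperature, 1.0), 100.0)
    top = max(urgencies)
    # Dividing by the largest urgency keeps the powers from underflowing to zero.
    weights = [(u / top) ** exponent for u in urgencies]
    total = sum(weights)
    return [w / total for w in weights]


def choose_codelet(names: Sequence[str], urgencies: Sequence[float],
                   temperature: float, rng: random.Random) -> str:
    """Stochastically pick the next codelet to run."""
    probs = selection_probabilities(urgencies, temperature)
    return rng.choices(list(names), weights=probs, k=1)[0]
```

Under this rule many possibilities are explored in parallel while the system is uncertain, and effort shifts toward the most promising ones as a coherent interpretation forms.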

Page 83: Petacat :   Applying ideas from Copycat  to image understanding

Relation to neuroscience/psychophysics

– Gilbert & Sigman (2007): Emphasis on the role of top-down processing in vision.

  • “V1 and V2 may work as ‘active blackboards’ that integrate and sustain the result of computations performed in higher areas.”

– Kahneman, Treisman, and Gibbs (1992): Notion of “object files”: temporary and modifiable perceptual structures, created on the fly in working memory, which interact with a permanent network of concepts.

– Churchland, Ramachandran, and Sejnowski: Theory of “interactive vision”.

– Treisman and colleagues: Shift between parallel, random, “pre-attentive” bottom-up processing and more deterministic, focused, serial, “attentive” top-down processing.

Page 85: Petacat :   Applying ideas from Copycat  to image understanding

Does Petacat understand pictures?

Understanding (MM’s definition):

- Ability to appropriately use one’s knowledge and make appropriate conceptual slippages in a wide variety of environments/contexts.

- Ability to use one’s existing concepts to learn new concepts.