Cognitive Processes PSY 334 Chapter 2 – Perception July 3, 2003

Law of Pragnanz

Of all the possible interpretations, we will select the one that yields the simplest or most stable form.

Simple, symmetrical forms are seen more easily.

In compound letters, the larger figure dominates the smaller ones.

Visual Illusions

Depend on experience. Influenced by culture.

Illustrate normal perceptual processes. These are not errors but rather failures of perception in unusual situations.

Visual Pattern Recognition

Bottom-up approaches: template-matching, feature analysis, recognition by components.

Template-Matching

A retinal image of an object is compared directly to stored patterns (templates). The object is recognized as the template that gives the best match.

Used by computers to recognize patterns.

Evidence shows human recognition is more flexible than template-matching: people easily recognize items that vary in size, place, orientation, or shape, or that are blurred or broken (ambiguous or degraded).
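
The template-matching idea can be sketched in a few lines of Python. This is only a toy illustration: the 5x5 grids, the names TEMPLATES, match_score and recognize, and the cell-by-cell scoring rule are all assumptions made for this example, not part of the lecture material.

```python
from typing import Dict, List

Grid = List[str]  # each string is one row; '#' is ink, '.' is blank

# Hypothetical 5x5 templates, invented for this illustration.
TEMPLATES: Dict[str, Grid] = {
    "T": ["#####", "..#..", "..#..", "..#..", "..#.."],
    "L": ["#....", "#....", "#....", "#....", "#####"],
}


def match_score(image: Grid, template: Grid) -> int:
    """Count the cells where the image and the template agree."""
    return sum(
        a == b
        for image_row, template_row in zip(image, template)
        for a, b in zip(image_row, template_row)
    )


def recognize(image: Grid) -> str:
    """Return the name of the stored template with the best match."""
    return max(TEMPLATES, key=lambda name: match_score(image, TEMPLATES[name]))


# The same "T" moved one column to the right.
SHIFTED_T: Grid = [".####", "...#.", "...#.", "...#.", "...#."]

if __name__ == "__main__":
    print(match_score(TEMPLATES["T"], TEMPLATES["T"]))  # perfect overlap
    print(match_score(SHIFTED_T, TEMPLATES["T"]))       # overlap after the shift
    print(recognize(SHIFTED_T))
```

In this toy case a one-column shift drops the match with the "T" template from 25 of 25 cells to 16 of 25, which hints at why a pure template matcher is less tolerant of changes in place, size, or orientation than people are.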

Feature Analysis

Stimuli are combinations of elemental features. Features are recognized and combined. Features are like the output of edge detectors.

Features are simpler, so problems of orientation, size, etc., can be solved.

Relationships among features are specified to define the pattern.

Evidence for Feature Analysis

Confusions – people make more errors when letters presented at brief intervals contain similar features: G misclassified as C (21), as O (6), as B (1), as 9 (1).

When a retinal image is held constant, the parts of the object disappear: whole features disappear, and the remaining parts form new patterns.
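
The link between shared features and confusions can be sketched as follows. The feature sets below are invented for illustration (they are not taken from the experiment), and Jaccard overlap is just one assumed way to measure feature similarity.

```python
from typing import Dict, FrozenSet, List

# Hypothetical feature inventories, made up for this sketch.
FEATURES: Dict[str, FrozenSet[str]] = {
    "G": frozenset({"curve", "opening_right", "horizontal_bar"}),
    "C": frozenset({"curve", "opening_right"}),
    "O": frozenset({"curve", "closed_loop"}),
    "B": frozenset({"curve", "closed_loop", "vertical_line"}),
}


def similarity(a: str, b: str) -> float:
    """Jaccard overlap: shared features divided by all features in either letter."""
    fa, fb = FEATURES[a], FEATURES[b]
    return len(fa & fb) / len(fa | fb)


def rank_confusable(target: str) -> List[str]:
    """Other letters ordered from most to least feature overlap with the target."""
    others = [name for name in FEATURES if name != target]
    return sorted(others, key=lambda name: similarity(target, name), reverse=True)


if __name__ == "__main__":
    for letter in rank_confusable("G"):
        print(letter, round(similarity("G", letter), 2))
```

Under these assumed features, C overlaps most with G, then O, then B, which is the same ordering as the confusion counts on the slide. The ordering depends entirely on how the features are chosen, so the sketch illustrates the logic of the argument rather than testing it.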

Object Recognition

Biederman’s recognition-by-components:

Parts of the larger object are recognized as subobjects.

Subobjects are categorized into types of geons – geometric ions.

The larger object is recognized as a pattern formed by combining geons.

Only edges are needed to recognize geons.

Tests of Biederman’s Theory

Object recognition should be mediated by recognition of object components.

Two types of degraded figures presented for brief intervals: components (geons) missing, or line segments missing.

At brief intervals (65-100 ms) subjects could not recognize components when segments were missing.

Speech Recognition

The physical speech signal is not broken up into parts that correspond to recognizable units of speech:

Undiminished sound energy at word boundaries – gaps are illusory.

Cessation of speech energy in the middle of words.

Word boundaries cannot be heard in an unfamiliar language.

Phoneme Perception

No one-to-one letter-to-sound correspondence.

Speech is continuous – phonemes are not discrete (separate) but run together.

Speakers vary in how they produce the same phoneme.

Coarticulation – phonemes overlap. The sound produced depends on the sound immediately preceding it.

Feature Analysis of Speech

Features of phonemes appear to be:

Consonantal feature (consonant vs vowel).

Voicing – do the vocal cords vibrate or not.

Place of articulation – where the vocal tract is constricted (where the tongue is placed).

The phoneme heard by listeners changes as you vary these features. Sounds with similar features are confused.

Categorical Perception

For speech, perception does not change continuously but abruptly at a category boundary.

Categorical perception – failure to perceive gradations among stimuli within a category. Pairs of [b]’s or [p]’s sound alike despite differing in voice-onset times.
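
The abrupt change at a category boundary can be sketched with a steep identification function over voice-onset time. The boundary (30 ms) and slope values here are assumptions chosen for illustration, and the logistic form is only one convenient way to draw such a curve.

```python
import math

BOUNDARY_MS = 30.0  # assumed [b]/[p] boundary, for illustration only
SLOPE = 0.5         # assumed steepness per millisecond


def prob_p(vot_ms: float) -> float:
    """Probability of labeling a sound as [p] at a given voice-onset time."""
    return 1.0 / (1.0 + math.exp(-SLOPE * (vot_ms - BOUNDARY_MS)))


def label_difference(vot_a: float, vot_b: float) -> float:
    """How differently two sounds are labeled (0 = same label, 1 = opposite)."""
    return abs(prob_p(vot_a) - prob_p(vot_b))


if __name__ == "__main__":
    # Three pairs, each 20 ms apart in voice-onset time.
    print("within [b]:     ", round(label_difference(0, 20), 3))
    print("across boundary:", round(label_difference(20, 40), 3))
    print("within [p]:     ", round(label_difference(40, 60), 3))
```

All three pairs differ by the same 20 ms, yet only the pair that straddles the assumed boundary receives clearly different labels, which is the pattern the slide describes.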

Two Views of Categorical Perception

Weak view – stimuli are grouped into recognizable categories.

Strong view – we cannot discriminate among items within such a category.

Massaro – people can discriminate within a category but have a bias to say items are the same despite differences.

Category boundaries can be shifted by fatiguing the feature detectors.

Top Down Processing

General knowledge (context, high-level thinking) combines with interpretation of low-level perceptual units (features).

Context limits the possibilities so fewer features must be processed:

Word superiority effect – D or K vs WORD or WORK – words do 10% better.

To xllxstxatx, I cxn rxplxce xvexy txirx lextex of x sextexce xitx an x, anx yox stxll xan xanxge xo rxad xt wixh sxme xifxicxltx.
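
The demonstration sentence follows a simple rule: count letters only (skipping spaces and punctuation) and replace every third one with an x. A minimal sketch of that rule, with a hypothetical function name, lets you try the effect on other sentences.

```python
def mask_every_third_letter(sentence: str, mask: str = "x") -> str:
    """Replace every third alphabetic character with `mask`.

    Spaces and punctuation are copied through and do not count as letters.
    """
    out = []
    letters_seen = 0
    for ch in sentence:
        if ch.isalpha():
            letters_seen += 1
            if letters_seen % 3 == 0:
                out.append(mask)
                continue
        out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(mask_every_third_letter("To illustrate, I can replace every third letter"))
```

Applied to "To illustrate, I can replace every third letter", the function reproduces the opening of the slide's example. Changing the modulus from 3 to 2 removes half the letters and gives a feel for how much context is needed before reading breaks down.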

Context and Speech

Phoneme restoration effect:

It was found that the *eel was on the axle.
It was found that the *eel was on the shoe.
It was found that the *eel was on the orange.
It was found that the *eel was on the table.

The identification of the missing word depends on what happens after it.

Faces and Scenes

When parts are presented in isolation, more feature information is needed to recognize them.

Face parts are recognized with less detail when in the context of a face.

Subjects are better able to identify objects when they are part of coherent novel scenes rather than jumbled scenes.

Models of Object Perception

Two competing models explain how context and feature information are combined:

Massaro’s FLMP (fuzzy logic model of perception) – context and detail are two independent sources of information.

McClelland & Rumelhart’s PDP model – connectionist model in which both sources of information interact.

Testing the FLMP Model

Four kinds of stimuli: only an e can make a real word; only a c can make a real word; both letters can make a word; neither letter can make a word.

Within each group, stimuli go from e to c.

Subjects saw each stimulus word briefly and had to identify the letter, e or c.

FLMP Results

Observed frequencies for naming a letter e increase as it has more e features, but also as the context demands an e.

Bayes’ theorem gives a formula for combining the independent contributions of two sources of information.

Massaro’s results conform to predictions of Bayes’ theorem, suggesting that the two information sources are independent of each other.
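
For two alternatives (e vs c), the independent combination of feature and context evidence can be sketched with the usual multiplicative rule: multiply the support for e from each source, multiply the support for c from each source, and normalize. The function name and the numeric values below are assumptions for illustration; they are not Massaro's data.

```python
def flmp_prob_e(feature_support: float, context_support: float) -> float:
    """Probability of reporting "e" from two independent sources of support.

    Each argument is the degree (strictly between 0 and 1) to which that
    source favors "e"; one minus the value is its support for "c".
    """
    for value in (feature_support, context_support):
        if not 0.0 < value < 1.0:
            raise ValueError("support values must be strictly between 0 and 1")
    for_e = feature_support * context_support
    for_c = (1.0 - feature_support) * (1.0 - context_support)
    return for_e / (for_e + for_c)


if __name__ == "__main__":
    # Hypothetical support values: features from weak to strong evidence for "e".
    for features in (0.2, 0.5, 0.8):
        row = [round(flmp_prob_e(features, ctx), 2) for ctx in (0.2, 0.5, 0.8)]
        print(features, row)
```

A neutral source (0.5) leaves the other source's value unchanged, and two sources that agree reinforce each other (0.8 and 0.8 combine to about 0.94). This matches the qualitative pattern on the slide: reports of e rise both with more e features and with context that favors e.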

Testing the PDP Model

Activation spreads from features to excite letters and from letters to excite words (bottom up processing).

Activation also spreads from words to the component letters (top-down processing).

The more activation, the more likely the correct letter will be identified: TRAP vs TRIP

Comparing the Two Models

Subjects heard a phoneme that varied from r to l in two contexts:

A syllable beginning with t – tr or tl.

A syllable beginning with s – sl or sr.

Both the FLMP and PDP models were compared to actual subject data. FLMP was close to what subjects did. PDP was too strongly affected by context.

PDP Model Describes More

The PDP model suggests that information is not separately processed but that each letter affects every other letter.

Recognition of “a” in MAVE is almost as good as recognizing it in MADE. This occurs because MAVE is similar to many other words with an A in that position.

We do not have a context but four letters that each influence the others.
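
The MAVE point can be illustrated by counting word "neighbors" that differ from the string in exactly one position. The word list below is a small hand-picked sample, not a real lexicon, and counting neighbors is only a rough stand-in for the activation a PDP network would compute.

```python
from typing import List

# Small hypothetical lexicon chosen for this sketch.
LEXICON: List[str] = [
    "MADE", "MAKE", "MALE", "MATE", "MAZE",
    "CAVE", "GAVE", "HAVE", "SAVE", "WAVE",
    "MOVE", "TRAP", "TRIP",
]


def shared_positions(a: str, b: str) -> int:
    """Number of positions at which two strings have the same letter."""
    return sum(x == y for x, y in zip(a, b))


def neighbors(string: str, lexicon: List[str]) -> List[str]:
    """Words of the same length that differ from `string` in exactly one position."""
    return [
        word for word in lexicon
        if len(word) == len(string)
        and shared_positions(word, string) == len(string) - 1
    ]


def support_for_letter(string: str, position: int, lexicon: List[str]) -> int:
    """How many neighbors have the same letter as `string` at `position`."""
    return sum(word[position] == string[position] for word in neighbors(string, lexicon))


if __name__ == "__main__":
    print(neighbors("MAVE", LEXICON))
    print(support_for_letter("MAVE", 1, LEXICON))  # neighbors with A in second position
```

In this sample, 10 of MAVE's 11 neighbors have an A in the second position, so a nonword can still receive plenty of word-level support for that letter.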

Marr

Depth cues (texture gradient, stereopsis) – where are edges in space?

How are visual cues combined to form an image with depth?

Primal sketch – extracts features.

2-1/2 D sketch – identifies where visual features are in relation to observer (depth).

3-D model – refers to the representation of the objects in a scene, combines context.

Putting it All Together

The output of these stages (see Fig 2.31) is a representation of an object and its location.

This output is used as input to higher-level cognitive processes.

Conscious awareness (a higher-level process) involves the recognition stage, but lots of processing occurs first.
