1 Pattern and Speech Recognition Pattern Recognition John Beech School of Psychology PS1000


Pattern and Speech Recognition

Pattern Recognition

John Beech

School of Psychology

PS1000

 


Pattern Recognition

The term “pattern recognition” can refer to being able to recognise 2-D patterns, in particular alphanumerical characters. But “pattern recognition” is also understood to be the study of how we recognise objects in our environment. The term “object recognition” is more specific and is just about recognising 3-D objects.

We appear to be very flexible in this respect, e.g. handwriting takes many different forms and can still be read (usually). See the figure on the next slide.


Examples of different letter As.


Pattern Recognition

• We are still better than computers as pattern recognisers. E.g. at Carnegie-Mellon a program called “Gimpy” has an 850-word dictionary. It chooses a word from this dictionary and mangles the image by warping the letters and putting distractions (e.g. blobs) in the foreground and background. The web user has to recognise the word, which humans (even a child) can do easily, but computer programs find this very difficult.

• As we find pattern recognition so easy, psychologists often resort to very brief presentations or use reaction time measures.


Pattern Recognition

For this topic we are going to cover the following areas:

1. Template theory

2. Feature theory and its evidence

3. Global vs local processing

4. Structural theory


1. Template theory

• Template theory suggests that a copy (template) of every known pattern is held in long-term memory (LTM). The template that most closely matches the incoming pattern produces recognition.

• These templates are holistic and unanalysed. But there are difficulties…


1. Template theory

The problems with templates

• The template, when the comparison is made, has to be in the same position and at the same angle AND be the same size. So there is the problem that a system using templates would be continuously adjusting templates to match up, and this is supposing that the correct template was chosen in the first place.

• Another difficulty is that there is such a great variety of potential patterns, even if we were to confine ourselves just to recognising handwriting. Thus stimuli can be very complex, so a template system might be unworkable.
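The position problem can be made concrete with a small sketch. This is only an illustration under stated assumptions: the 5 x 5 grid, the `TEMPLATE_L` pattern and the scoring rule (proportion of agreeing cells) are all hypothetical choices, not part of template theory itself.

```python
# Minimal template-matching sketch (all names and the grid are illustrative).
# A pattern is a tuple of strings; "#" marks an inked cell, "." a blank one.

TEMPLATE_L = (
    "#....",
    "#....",
    "#....",
    "#....",
    "####.",
)


def match_score(pattern, template):
    """Return the proportion of cells on which pattern and template agree."""
    cells = [
        p == t
        for p_row, t_row in zip(pattern, template)
        for p, t in zip(p_row, t_row)
    ]
    return sum(cells) / len(cells)


def shift_right(pattern, n=1):
    """Move the pattern n cells to the right, padding with blank cells."""
    return tuple("." * n + row[:-n] for row in pattern)


same_place = match_score(TEMPLATE_L, TEMPLATE_L)            # perfect overlap
shifted = match_score(shift_right(TEMPLATE_L), TEMPLATE_L)  # same L, moved one cell

print(same_place, shifted)
```

The same letter moved by a single cell no longer lines up with its own template, so a pure template system would first have to normalise position (and, by the same argument, angle and size) before comparing.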


1. Template theory

Although one might dismiss template theory, there could be situations in which it is viable. There is evidence that sensory information is preserved briefly in a way that could be viewed as unanalysed templates. These could be subsequently analysed during the process of pattern recognition.


1. Template theory

• Phillips (1974) gave subjects patterns (8 x 8 matrices) such as this:

• It was on for 1 second and then went off. Straight afterwards another matrix would appear that was either identical or similar in pattern, and subjects had to say whether it was the same or different.

• Half the time this 2nd matrix was in the same location and in the other half of trials it had moved by exactly the width of one cell.

• Phillips also varied the time between the two matrices.

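[Figure: design of the experiment. The second matrix was either still or moved, and was either the same as or different from the first.]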


1. Template theory

• The results on the right are what Phillips found. In the “still” condition, where the two matrices were in the same location, accuracy is very good at the shortest interstimulus interval (20 msec). But this declines rapidly. So a sensory store was being used, but information decayed from it very rapidly.

• When we look at the “move” condition, where the second matrix was in a different position, the sensory store couldn’t be used so interstimulus interval has no effect on accuracy.

• This might be seen as evidence for some kind of rapid template matching in the 20 msec “still” condition.


2. Feature theory

• Template theory might give a good account of how a sensory image is stored. But such an image is only fleeting. So this may not ultimately have much of a role in the processing of the image.

• To process the image further it is more likely that some kind of description of the image is used. To build such a description the parts of the image have to be described, which is where feature theory comes in.

• Gibson (1969) proposed that when children process patterns they discover features that are used to differentiate one pattern from the other.

• Consider M and N. A child initially is likely to find these two confusing, so the slanting line on the M that slants upwards is a distinctive feature that can be used to differentiate the two.


2. Feature theory

Some of the evidence for feature theory:

• Simple patterns can be broken into units or features.

• Hubel & Wiesel (1962) observed single cell firing in the cat’s visual cortex. Rotating a small bar of light (by 7 deg) affected the firing, suggesting that the angle of the bar was being analysed.


2. Feature theory

Some of the evidence for feature theory:

• Their experimental work suggested simple cells with excitatory and inhibitory regions (the white and grey regions below), so the cell fires if there is input (light) in the excitatory part of its receptive field. So it is responding to spatial features. Then there are complex cells, as shown below. Complex cells respond to edges at a particular orientation, and they operate over a larger region. Hypercomplex cells could respond, e.g., to 2 edges at right angles in an even larger region.
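The idea of a cell with excitatory and inhibitory regions can be sketched as a small weighted sum. This is a toy model under assumptions: the 3 x 3 layout, the +1/-1 weights and the firing threshold are invented for illustration and are not taken from the physiological recordings.

```python
# Toy "simple cell": +1 marks an excitatory region, -1 an inhibitory region.
# Layout, weights and threshold are illustrative assumptions only.
RECEPTIVE_FIELD = (
    (-1, 1, -1),
    (-1, 1, -1),
    (-1, 1, -1),
)

# Stimuli: 1 = light falling on that part of the field, 0 = no light.
VERTICAL_BAR = (
    (0, 1, 0),
    (0, 1, 0),
    (0, 1, 0),
)
HORIZONTAL_BAR = (
    (0, 0, 0),
    (1, 1, 1),
    (0, 0, 0),
)


def response(stimulus, field=RECEPTIVE_FIELD):
    """Sum the light over the field, weighting each cell by its region type."""
    return sum(
        light * weight
        for s_row, f_row in zip(stimulus, field)
        for light, weight in zip(s_row, f_row)
    )


def fires(stimulus, threshold=2):
    """The cell 'fires' only when the net excitation reaches the threshold."""
    return response(stimulus) >= threshold


print(response(VERTICAL_BAR), fires(VERTICAL_BAR))
print(response(HORIZONTAL_BAR), fires(HORIZONTAL_BAR))
```

A bar aligned with the excitatory strip drives the toy cell strongly, while the same bar rotated by 90 degrees falls partly on inhibitory regions and fails to reach threshold, which is the sense in which such a cell analyses orientation.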


2. Feature theory

Pritchard (1960) stabilised an image on the retina. An image projected onto the retina in this way disappears systematically, with features being lost.


2. Feature theory

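• Neisser (1964): searching for a target letter (e.g. Q) in a matrix of letters shows the influence of features. The target is typically found faster among letters with dissimilar (angular) features than among letters with similar (rounded) features. Compare searching for the Q in these two lists:

F T H L I
L M V X Z
E T V Q H
I N H E T

O D B C U
D O S G B
R P B O U
Q S G U R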


2. Feature theory

The pandemonium model (Selfridge, 1959)

• This relies on the extraction of features.

• Level 1: feature demons. Level 2: cognitive demons. Level 3: the decision demon.

• Predicts confusion between letters with similar visual features.


2. Feature theory: The pandemonium model (Selfridge, 1959)

Feature demons Cognitive demons Decision demon

Vertical lines? Letter A?

Horizontal lines? Letter B?

Oblique lines? Letter C?

Right angles? Letter D?

Acute angles? Letter E?

Discontinuous curves? Letter F?

Continuous curves? etc
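The three levels in the diagram can be sketched in a few lines. This is a minimal sketch, not Selfridge's program: the feature counts in `LETTER_FEATURES`, the scoring rule (fewer mismatches means a louder shout) and the function names are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical feature counts per letter; rough guesses for illustration only.
LETTER_FEATURES = {
    "A": {"horizontal": 1, "oblique": 2, "acute_angle": 3},
    "E": {"vertical": 1, "horizontal": 3, "right_angle": 4},
    "F": {"vertical": 1, "horizontal": 2, "right_angle": 3},
    "O": {"continuous_curve": 1},
}


def feature_demons(marks):
    """Level 1: each feature demon counts occurrences of its own feature."""
    return Counter(marks)


def cognitive_demons(counts):
    """Level 2: each letter demon shouts louder the better its features fit."""
    shouts = {}
    for letter, expected in LETTER_FEATURES.items():
        names = set(expected) | set(counts)
        mismatch = sum(abs(expected.get(n, 0) - counts.get(n, 0)) for n in names)
        shouts[letter] = -mismatch  # 0 is the loudest possible shout
    return shouts


def decision_demon(shouts):
    """Level 3: pick the loudest shout; also report the runner-up."""
    ranked = sorted(shouts, key=shouts.get, reverse=True)
    return ranked[0], ranked[1]


# Features found in an image of the letter E (again, an assumed description).
marks = ["vertical"] + ["horizontal"] * 3 + ["right_angle"] * 4
winner, runner_up = decision_demon(cognitive_demons(feature_demons(marks)))
print(winner, runner_up)
```

With these assumed features the runner-up for an E is F, which shares most of its features. This is the kind of confusion between visually similar letters that the model predicts.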


2. Feature theory: An evaluation

• In relation to template theory, feature theory is more efficient as there would probably be far fewer features than templates.

• Stimulus equivalence: this is an advantage for feature theory - a character could have many different forms and still be recognised by feature theory.

• However, the figure shows the effect of context on pattern recognition. The middle letter is the same in its features but is read differently depending on its context. Feature theory does not acknowledge that context is important.

• 3-D stimuli are more difficult to identify from features. One needs more than just features: one needs to know the relationships between features.


2. Feature theory: An evaluation

• Featural analysis needs time. But if we need to process the features of individual letters, how is this done so fast? For instance, we read at 100-400 words/min (average 300), which leaves little time for analysis. One estimate is that normal reading in this manner would need 5,000 feature detections a second.

• We can read sentences quite well, even if some of the letters are missing. In this case we use our knowledge of grammar, of words and so on to help us to fill in the gaps. E.g. “W_ f_ll _n g_ps _n s_nt_nc_s w_ll”

• Similarly, when shown even a simple pattern (e.g. a letter), one has to know the spatial relationships between the features. Is it possible to recognise an overall shape before examining its features?
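The gap-filling example can be sketched as matching each incomplete word against known words. This is a minimal sketch under assumptions: the tiny `VOCABULARY` list and the `candidates` helper are hypothetical, and real readers also draw on grammar and meaning, which the sketch does not model.

```python
import re

# A hypothetical mini-vocabulary; a reader's real word knowledge is far larger.
VOCABULARY = ["we", "fill", "fall", "in", "on", "gaps", "sentences", "well", "wall"]


def candidates(masked_word, vocabulary=VOCABULARY):
    """Return the known words consistent with the letters that are visible."""
    # Each "_" stands for exactly one missing letter.
    pattern = re.compile(masked_word.lower().replace("_", "[a-z]"))
    return [word for word in vocabulary if pattern.fullmatch(word)]


masked = "W_ f_ll _n g_ps _n s_nt_nc_s w_ll"
options = [candidates(word) for word in masked.split()]

for word, opts in zip(masked.split(), options):
    print(word, "->", opts)
```

Word knowledge alone settles some gaps (only one known word fits “s_nt_nc_s”) but leaves others open (“f_ll” could be “fill” or “fall”), which is where knowledge of grammar and of the sentence as a whole has to do the remaining work.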


3. Global v. local processing

• Psychologists (and philosophers before them) have been interested in the concept of processing local aspects of stimuli vs. the global impression of the stimulus. The global aspect suggests that one is processing something more than the sum of the parts. This was close to the ideas of the Gestalt psychologists.

• One way of illustrating this is by looking at the following:

AAAAAAA        EEEE
AAAAAAA       EE  EE
AA           EE    EE
AAAAAAA      EEEEEEEE
AAAAAAA      EEEEEEEE
AA           EE    EE
AAAAAAA      EE    EE
AAAAAAA      EE    EE


3. Global v. local processing

• Navon (1977) examined this issue as follows:

• He produced large upper case letters made up of smaller letters (see fig. of global letter H composed of local letter Ss).

• People heard a faint ‘H’ or ‘S’ sound while being shown one of these pictures.

• The task was to say whether they had heard an S or an H.

• Found: if the global H (composed of local Ss) was shown and the sound was actually H, subjects were faster than if the global H was shown and the sound was an S. The local letters had no effect; in fact most subjects failed to notice them. So in this case the whole can be seen before the parts.
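Stimuli of this kind are easy to generate, as the sketch below shows. The 5 x 5 letter shapes in `GLOBAL_SHAPES` and the function name are assumptions for illustration; Navon's actual stimuli were drawn differently.

```python
# Hypothetical 5 x 5 bitmaps for the global letters; "#" marks a filled cell.
GLOBAL_SHAPES = {
    "H": ("#...#", "#...#", "#####", "#...#", "#...#"),
    "S": ("#####", "#....", "#####", "....#", "#####"),
}


def navon_figure(global_letter, local_letter):
    """Draw the global letter using copies of the local letter."""
    rows = GLOBAL_SHAPES[global_letter]
    return "\n".join(
        "".join(local_letter if cell == "#" else " " for cell in row).rstrip()
        for row in rows
    )


# A global H built from local Ss (a conflicting figure).
print(navon_figure("H", "S"))
print()
# A global H built from local Hs (a consistent figure).
print(navon_figure("H", "H"))
```

Crossing the two global letters with the two local letters gives the consistent and conflicting figures needed to ask whether the global or the local level interferes with the other.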


3. Global v. local processing

In another experiment, Navon (1977) gave his subjects two separate tasks.

Task 1

They had to decide if they had been given a global H or a global S.

Result

They were not affected by the local letters.

Task 2

They had to decide if they’d been given local H or local S.

Result

They were affected by the global letters. So if given local Hs they found it more difficult if these were within a global S rather than a global H.

Navon concluded that it seems one invariably processes overall shape.


4. Structural descriptions

• A problem for featural theory is that we construct descriptions of patterns in the absence of any features. This picture is a good example of that:

• Here we create the triangle even though there are actually no edges of the triangle shown.

• It seems likely that we use structural descriptions. These consist of propositions, or statements, describing the nature of the relationships between elements in a picture or pattern. For instance, for the letter L it might be: ‘it has a horizontal line and a vertical line. They are joined so that the left-most part of the horizontal is connected to the lowest point of the vertical line’.
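The description of the letter L can be written down as a set of propositions. This is only a sketch under assumptions: the `Line` type, the wording of the propositions and the `describe` function are hypothetical, and the sketch handles only horizontal and vertical lines.

```python
from typing import NamedTuple


class Line(NamedTuple):
    orientation: str  # "horizontal" or "vertical"
    start: tuple      # (x, y) of the left-most end or the lowest end
    length: float


JOIN = ("joined", "left-most part of horizontal", "lowest point of vertical")


def describe(lines):
    """Return propositions about the parts and how they join, ignoring length."""
    props = {("has", line.orientation) for line in lines}
    horizontals = [l for l in lines if l.orientation == "horizontal"]
    verticals = [l for l in lines if l.orientation == "vertical"]
    for h in horizontals:
        for v in verticals:
            if h.start == v.start:
                props.add(JOIN)
    return props


L_DESCRIPTION = {("has", "horizontal"), ("has", "vertical"), JOIN}

small_l = [Line("vertical", (0, 0), 2), Line("horizontal", (0, 0), 1)]
tall_l = [Line("vertical", (0, 0), 10), Line("horizontal", (0, 0), 3)]
upside_down_t = [Line("vertical", (0, 0), 2), Line("horizontal", (-1, 0), 2)]

print(describe(small_l) == L_DESCRIPTION)
print(describe(tall_l) == L_DESCRIPTION)
print(describe(upside_down_t) == L_DESCRIPTION)
```

Two Ls with very different line lengths receive the same description, while a pattern with the same two lines joined in a different place does not. The relations between the parts, not the parts alone, carry the identity of the pattern.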


4. Structural descriptions

These descriptions are about the important aspects or the structure of the picture. For instance, length of lines is not important.

Featural theory does not make any discrimination between features on the grounds of importance.

Structural descriptions are important for 3-D structures.

Early work by Winston (1975) involved producing a program to analyse pictures of blocks. It looked for L, T and K shapes as well as for arrows and forks. The program worked out concave and convex edges as well as surfaces and likely shapes (see next Figure).


4. Structural descriptions: Winston 1975


4. Structural descriptions

Biederman’s component model

• He believed that structural descriptions wouldn’t describe the lines and curves in an object.

• Instead, they would describe simple volumes, such as cones, cubes, cylinders, etc. So Biederman thought in terms of a limited number of components that could describe most objects. Different arrangements can produce different objects. E.g. the mug and pail in the picture.

• Biederman (1985) suggested that we need only 35 geons, as he called them, to represent the world of objects as we know them.

• Thus structural descriptions need only be concerned with the relations between a limited set of components.
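The mug and pail example can be sketched as two descriptions that share their components but differ in one relation. The component names and the two relations used here are assumptions about how such objects might be described; they are not taken from Biederman's own notation.

```python
def structural_description(components, relations):
    """Bundle a set of components with the relations that hold between them."""
    return (frozenset(components), frozenset(relations))


# Assumed descriptions: same two components, different arrangement.
mug = structural_description(
    {"cylinder", "curved_handle"},
    {("curved_handle", "attached_to_side_of", "cylinder")},
)
pail = structural_description(
    {"cylinder", "curved_handle"},
    {("curved_handle", "attached_to_top_of", "cylinder")},
)

same_components = mug[0] == pail[0]
same_object = mug == pail

print(same_components, same_object)
```

A list of components alone cannot tell the two objects apart; only the relation between the components does, which is why the description has to include relations.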


4. Structural descriptions

Biederman’s component model

In his experiment he removed 65% of the contours from objects such as those shown to the right. For the object on the left the contour was removed from the middle of the segments, while for the one on the right it was removed from the vertices (where the segments joined). The objects were then presented briefly (100 msec), and performance was 70% v. 50% for the left and right objects respectively (averaging over similar types of objects). So people find relational information very useful.


4. Structural descriptions

Conclusion for structural descriptions

Structural theories are necessary because they extend feature theories by showing the relationships between features. To work, they need quite a powerful descriptive language to account for our impressive abilities at pattern recognition.


Conclusion of pattern recognition

1. Template theory suggests that patterns are matched to internal templates. But this is unworkable with complex stimuli.

2. Feature theory is more efficient than template theory and there are several strands of evidence suggesting the use of featural analysis: e.g. Hubel & Wiesel (1962), Pritchard (1960), Neisser (1964). The Pandemonium model shows how featural analysis might operate. However, featural analysis has problems as it is insufficient by itself to account for many things, such as the effect of context and the relationships between features.

3. Global vs. local processing – it appears that global processing is more important than featural processing (e.g. Navon 1977). People have a preference for overall shape over individual features.

4. Structural theory involves propositional descriptions of the important aspects of the structure of a picture. Structural theory has more to offer than featural theory because it shows the relationships between features. But the challenge for structural theory is to account for the impressive abilities of humans over computers in the sphere of pattern recognition.