6
Seeing slow and seeing fast: two limits on perception Alex O. Holcombe School of Psychology, University of Sydney, Brennan MacCallum Bldg (A18), Sydney, NSW 2006, Australia Video cameras have a single temporal limit set by the frame rate. The human visual system has multiple temporal limits set by its various constituent mechan- isms. These limits seem to form two groups. A fast group comprises specialized mechanisms for extracting per- ceptual qualities such as motion direction, depth and edges. The second group, with coarse temporal resol- ution, includes judgments of the pairing of color and motion, the joint identification of arbitrary spatially sep- arated features, the recognition of words and high-level motion. These temporally coarse percepts might all be mediated by high-level processes. Working at very differ- ent timescales, the two groups of mechanisms collab- orate to create our unified visual experience. Spatial versus temporal scales Our visual system has a spatial grain, making small details completely invisible, such as the individual mol- ecules forming the surface you are currently viewing. Less appreciated is that our processing of time is also confined to coarse scales. Events confined to thousandths of a second, and sometimes even hundredths of a second or longer, are not perceived. For example, until brief-exposure photogra- phy was invented in the 1870s, it was not known whether or not a galloping horse lifts all four hooves off the ground at any one time [1]. Just as the microscope probed small spatial scales long before stroboscopic photography investigated small temporal scales, so vision scientists have traditionally con- centrated on the spatial resolution of vision rather than its temporal resolution. However, recent years have seen an upsurge in interest in temporal processing. Much of this growth has been in areas of temporal illusions, such as distortions in time perception [2], asynchronies in binding the features of an object [3], and possible interactions be- tween object motion and the time an object is perceived [4]. Recent reviews of these topics are available [57], but miss- ing from the literature is a modern synopsis of the temporal limits of vision the timescales on which the machinery of perception operates. These temporal limits set the founda- tion necessary to understand temporal illusions, and in bringing together the diverse psychophysical literature on temporal limits, a broad generalization emerges regarding the functional organization of the visual system. Understanding multiple temporal scales Let us begin with a naı¨ve question does your visual experience display the world at five, twenty or one hundred frames per second? In the case of the cinema, we know there is a frame rate, a temporal grain of 24 per second. However, our visual system does not operate like movie equipment. Visual processing is more continuous [8] although see Ref. [9] for the suggestion of a discrete component. In continuous processing, although there is no frame rate because discrete snapshots are not taken, the system still can be said to have a temporal grain. The grain, or temporal interval over which the system blurs information together, is known as ‘temporal resolution’. A movie camera has one temporal resolution, reflecting its frame rate. But the visual system has multiple temporal resolutions. The high temporal resolution of some of our perceptual mechanisms presents a problem for the cinematographer working with a film shot, as most are, at 24 pictures per second. If films were projected at 24 frames per second, the flicker would be obvious and annoying because our flicker perception mechanism is fast enough to detect 24 Hz flicker. To thwart both flicker perception and some motion artifacts that our fast motion mechanisms are sensitive to, cinema projectors flick each frame on and off twice before moving on to the next frame. The resulting 48 per second (48 Hz) flicker rate is almost invisible to us. Although the technique eliminates the flicker perception, one might expect other problems to remain. As each object is shown twice in the same location before moving on, a cine- matic car chase might appear as a stuttering procession rather than as the familiar fluid and seamless experience. Fortunately for the cinematographer, our visual perception fails to register the stuttering a consequence of the coarse temporal scale at which most perceptual mechanisms pro- cess the scene. As described later, the 24 pictures per second rate is already too fast for many of our visual mechanisms. Our perception comprises some mechanisms, like flicker and motion, which inform us of rapid changes, whereas others are restricted to much slower changes and thus fooled by a 24 Hz simulacrum of reality. Two of the temporal limits underpinning experience can be demonstrated with a simple animation. In Movie 1 (see supplementary material online), adjacent striped patches alternate between leftward-tilted and rightward-tilted. When the alternation rate is only a few per second (top row), it is easy to perceive whether the two patches are always opposite in orientation (top left) or are always oriented in the same way (top right). If the patches are alternated at a faster rate, say between 8 and 12 frames per second (second row), it becomes difficult or impossible to judge whether the patches have the same orientation or opposite orientations [10]. Interestingly, although too fast Review Corresponding author: Holcombe, A.O. ([email protected]). 216 1364-6613/$ see front matter ß 2009 Elsevier Ltd. All rights reserved. doi:10.1016/j.tics.2009.02.005

Review Seeing slow and seeing fast: two limits on perception€¦ · Seeing slow and seeing fast: two limits on perception Alex O. Holcombe School of Psychology, University of Sydney,

Embed Size (px)

Citation preview

Seeing slow and seeing fast: two limitson perceptionAlex O. Holcombe

School of Psychology, University of Sydney, Brennan MacCallum Bldg (A18), Sydney, NSW 2006, Australia

Video cameras have a single temporal limit set by theframe rate. The human visual system has multipletemporal limits set by its various constituent mechan-isms. These limits seem to form twogroups. A fast groupcomprises specialized mechanisms for extracting per-ceptual qualities such as motion direction, depth andedges. The second group, with coarse temporal resol-ution, includes judgments of the pairing of color andmotion, the joint identification of arbitrary spatially sep-arated features, the recognition of words and high-levelmotion. These temporally coarse percepts might all bemediated by high-level processes.Working at very differ-ent timescales, the two groups of mechanisms collab-orate to create our unified visual experience.

Spatial versus temporal scalesOur visual system has a spatial grain, making smalldetails completely invisible, such as the individual mol-ecules forming the surface you are currently viewing. Lessappreciated is that our processing of time is also confined tocoarse scales. Events confined to thousandths of a second,and sometimes even hundredths of a second or longer, arenot perceived. For example, until brief-exposure photogra-phy was invented in the 1870s, it was not known whetheror not a galloping horse lifts all four hooves off the groundat any one time [1].

Just as the microscope probed small spatial scaleslong before stroboscopic photography investigated smalltemporal scales, so vision scientists have traditionally con-centrated on the spatial resolution of vision rather than itstemporal resolution. However, recent years have seen anupsurge in interest in temporal processing. Much of thisgrowth has been in areas of temporal illusions, such asdistortions in time perception [2], asynchronies in bindingthe features of an object [3], and possible interactions be-tween object motion and the time an object is perceived [4].Recent reviews of these topics are available [5–7], but miss-ing from the literature is a modern synopsis of the temporallimits of vision – the timescales on which the machinery ofperception operates. These temporal limits set the founda-tion necessary to understand temporal illusions, and inbringing together the diverse psychophysical literature ontemporal limits, a broad generalization emerges regardingthe functional organization of the visual system.

Understanding multiple temporal scalesLet us begin with a naı̈ve question – does your visualexperience display the world at five, twenty or one hundred

frames per second? In the case of the cinema,we know thereis a frame rate, a temporal grain of 24 per second. However,our visual system does not operate like movie equipment.Visual processing is more continuous [8] – although see Ref.[9] for the suggestion of a discrete component. In continuousprocessing, although there is no frame rate because discretesnapshots are not taken, the system still can be said to havea temporal grain. The grain, or temporal interval overwhichthe system blurs information together, is known as‘temporal resolution’. A movie camera has one temporalresolution, reflecting its frame rate. But the visual systemhas multiple temporal resolutions.

The high temporal resolution of some of our perceptualmechanisms presents a problem for the cinematographerworking with a film shot, as most are, at 24 pictures persecond. If films were projected at 24 frames per second, theflicker would be obvious and annoying because our flickerperception mechanism is fast enough to detect 24 Hzflicker. To thwart both flicker perception and some motionartifacts that our fast motion mechanisms are sensitive to,cinema projectors flick each frame on and off twice beforemoving on to the next frame. The resulting 48 per second(48 Hz) flicker rate is almost invisible to us.

Although the technique eliminates the flicker perception,onemight expect other problems to remain. As each object isshown twice in the same location before moving on, a cine-matic car chase might appear as a stuttering processionrather than as the familiar fluid and seamless experience.Fortunately for the cinematographer, our visual perceptionfails to register the stuttering – a consequence of the coarsetemporal scale at which most perceptual mechanisms pro-cess the scene. As described later, the 24 pictures per secondrate is already too fast for many of our visual mechanisms.Our perception comprises some mechanisms, like flickerand motion, which inform us of rapid changes, whereasothers are restricted tomuchslower changes and thus fooledby a 24 Hz simulacrum of reality.

Two of the temporal limits underpinning experience canbe demonstrated with a simple animation. In Movie 1 (seesupplementary material online), adjacent striped patchesalternate between leftward-tilted and rightward-tilted.When the alternation rate is only a few per second (toprow), it is easy to perceive whether the two patches arealways opposite in orientation (top left) or are alwaysoriented in the same way (top right). If the patches arealternated at a faster rate, say between 8 and 12 frames persecond (second row), it becomes difficult or impossible tojudge whether the patches have the same orientation oropposite orientations [10]. Interestingly, although too fast

Review

Corresponding author: Holcombe, A.O. ([email protected]).

216 1364-6613/$ – see front matter ! 2009 Elsevier Ltd. All rights reserved. doi:10.1016/j.tics.2009.02.005

for the temporal relationship of the two features to beperceived, this rate is nevertheless slow enough to identifythe two orientations [11] and to detect the flicker. Not untilthe rate is increased much further would the movie appearcompletely fused (bottom row), with neither the relation-ship of the two patches, their individual patterns nor theirflicker perceived.

Each speed limit probably corresponds to the narrowesttemporal scale at which a perceptual mechanism operates.Consider the perception of fluorescent lights, which illu-minate the workdays of most office workers. Fluorescentlights flicker – 120 times per second (120 Hz) in the USAand 100 Hz in many other places. We do not perceive thisflicker because our underlying visual detectors averagetheir inputs over a longer interval. Indeed, our visualmechanisms average over 20 ms ormore, blurring togetheran entire 10 ms cycle of 100 Hz flicker and therefore oblit-erating the changes in luminance in the fluorescent lightsand in the 48 Hz flicker of the double-projected cinema.

At first, one might think this integration interval couldbe revealed by presenting a stimulus a single time ratherthan in a cycle, and determining the shortest duration forwhich it can still be perceived. But, a 5 ms stimulus flash, ifsufficiently high contrast, can be analyzed even by a mech-anism that operates at a slow 100 ms scale because aver-aging over 100 ms will not eliminate the stimulusinformation. This point has been key to a debate in theliterature on the timescale of visual grouping [12]. Thealternating displays referred to in this article are specifi-cally designed to provide no information when a mechan-ism integrates over the cycle of alternation. Restrictingourselves to such displays will lead us to ignore for now(but see Box 1) the interesting literature on the effect ofvarying presentation duration of a single stimulus.

Althoughmany studies have probed the temporal limitsof visual judgments using the alternating stimulus tech-nique, because these studies have used very differentstimuli and tasks, the results usually cannot be comparedwith quantitative precision. However, in some cases thespeed limits found differ so much that the differences areunlikely to be caused by experiment details. Figure 1shows some of these limits. The limits fall into two clusters,slow and fast, separated by a large gap.

Fast limits on visual perceptsThe fast cluster includes flicker perception. The ‘flickerfusion’ limit, like others, depends on mean luminance,luminance contrast and other factors, but is usually at least50 Hz on ordinary computer screens. Only six other types ofvisual judgmentshavebeen foundtobeof comparable speed:

First-order motionA moving sinusoidal grating with bright peaks and darktroughs is seen through awindow. Evenwhen its velocity isso high that 30 of its bright peaks pass each location persecond (30 Hz), observers still perceive its direction ofmotion [13]. If the input to motion detectors was blurredor jumbled over this timescale, determining the motiondirection would be impossible. Flicker is a degenerate caseof motion and the same bank of fast detectors could beresponsible for both the motion and flicker limits.

Depth from binocular disparityTwo identical gratings, one in each eye, are viewed throughidentical windows. One is spatially shifted relative to theother, introducing a binocular disparity that is perceived asdepth. Even when they move at a 30 Hz rate, the depth isstill perceived [14]. If the input to depth mechanisms wereblurred at a >30 ms timescale, detecting the depth wouldbe impossible at this rate.

Edges and texture boundariesAgainst a gray background, a field of white dots is adjacentto a field of black dots. Both fields are set in rapid alter-nation between white and black, but out of phase – whenone is white, the other is black. The conspicuous boundarybetween them is perceived even at fast rates of 30 Hz[15,16], whereas at faster rates the black and white areaveraged by the brain into the same color as the back-ground. Texture boundaries defined by orientation differ-ences can also be seen at fast rates [11].

Binding of color and orientation sharing a spatiallocationIf a red, right-tilted patch is alternated with a green, left-tilted patch, it is easy to distinguish from the complemen-tary pairing of red left alternating with green right, even atfast 20 Hz rates [17] (demo at www.psych.usyd.edu.au/staff/alexh/research/binding/).

Color and orientation as individual featuresFast binding of color and orientation could not occur with-out fast processing of the individual color and orientation

Box 1. Temporal masking and temporal resolution

The stimulus presentation method of the visual temporal maskingliterature bears a strong resemblance to that of the experimentsreviewed in this article. In both, two stimuli follow in rapidsuccession. But instead of alternating between two stimuli to beidentified as in Movie 1 (see supplementary material online),masking displays usually consist of a non-repetitive display, with asingle target stimulus followed (or preceded) by a neutral high-contrast pattern which impairs the visibility of the target. Thediscussion in the ‘Understanding multiple temporal scales’ sectionof this article explains why such experiments do not necessarilyreveal temporal resolution. However, poor temporal resolutionmight, nonetheless, be the reason for the invisibility, resulting inthe target integrating with the mask.

Studies of visual masking currently focus on the more complicatedpossibility that a mask presented at certain times disrupts selectedcomponents of target processing, such as edge versus surfaceprocessing or feedback versus feedforward mechanisms [51]. Maskscould also, rather than integrating (blurring) with the target assuggested by the temporal resolution concept, simply terminatetarget processing [52]. The implications for the understanding of thetemporal limits reviewed here is not understood. Detailed computa-tional models of masking have proliferated and are successful inexplaining the data from traditional masking paradigms [53], but theydo not seem capable of explaining the temporal limits reviewed here.The critical exposure duration for visibility in most masking studies isshorter than that for the temporal resolution studies in the slowcategory here, yet longer than that for the temporal resolution studiesin the fast category. In principle, this difference could be due to thepresence of low temporal frequency information in the non-repetitivemasking displays, but this is unlikely to explain the differences inevery case. The challenge is to createmodels of visual processing thatcan bridge both paradigms.

Review Trends in Cognitive Sciences Vol.13 No.5

217

features. Indeed, the features might be perceived at rateseven faster than is their pairing [18], although thisremains unsettled as the slower limit for the pairing couldinstead be caused by an asynchrony in processing of thefeatures.

Binding of local orientation elements into a global formThe visual system can efficiently integrate hundreds oflocal oriented elements distributed across the visual fieldinto an overall shape, and do so even when rapid alterna-tion requires the computation to occur over less than 30 ms[19] (demo at http://www.psych.usyd.edu.au/staff/colinc/HTML/dynamics.htm). Symmetry judgments of such pat-terns might be similarly rapid [20].

Slow limits on visual perceptsMeasurements of other visual judgments all seem to haveyielded much slower temporal limits. The pairing of twoorientations was one such limit (Movie 1 in supplementarymaterial online). For a variety of features, such pairing orbinding judgments are confined to rates less than 4 Hz. Astunning example occurs with a field of dots moving backand forth behind a window. When the dots change direc-tion, they also change color, between black and white(Movie 2 in supplementary material online). The color-motion pairing (leftward with white or with black) is easyto perceive at slow alternation rates, as can be seen in theversion at top of Movie 2. However, at a faster rate of !5alternations per second (bottom of Movie 2) it is difficult toperceive the pairing between color and motion [21,22],although this rate is far slower than the flicker fusion rate.Many observers report that, despite an inability to deter-mine the pairing (even after compensation for a possibleasynchrony in feature processing [22]), the constituentcolors and motions are still easily perceived. As in Movie

1, apparently the process of pairing visual features takesmore time than identifying them.

Other percepts are confined to similarly slow rates. It ispossible that each is extracted by a separate process withthe similar speed limits only a coincidence. A more inter-esting possibility is that a common, late visual processingstage limits the whole lot. This slow-limit group includes:

Binding of form and color across spaceAlternating color–shape pairings can only be reported atrates below !3 Hz [17] (demo at www.psych.usyd.edu.au/staff/alexh/research/binding/).

Binding of global form with colorTwo specially constructed dot patterns that form distinctshapes alternate, with all the dots of one red and all thedots of the other green. At alternation rates faster thanseveral a second it is very difficult to determine the shape–color pairing even though the shapes and colors themselvesare easily identified [19].

Direction change and acceleration perceptionA moving stimulus alternates between two speeds (accel-eration) or two directions. When the alternation occursfaster than several per second, these changes are unper-ceivable [23].

Attention-mediated motion and other forms of higher-order motionConventional Reichhardt-like motion detectors can onlydetect a common subset of moving stimuli. Other stimuliare thought to reveal a high-level motion mechanismpossibly controlled by attention [24] with a low temporallimit [25,26]. Its 8–10 Hz limit makes it the fastest ofthis list.

Figure 1. Temporal limits on visual judgments. Each colored strip represents the results of an experiment in which the rate of change of a display was varied to determinethe maximum rate at which something could be perceived. In a flicker perception experiment, the stimulus alternates between two luminance values. Flicker is perceivedfrom slow rates up to speeds exceeding 50 Hz, indicated by the fading of the strip around that rate.

Review Trends in Cognitive Sciences Vol.13 No.5

218

Word perceptionCertain pairs of words (such as ‘jump’ and ‘pink’), whenalternated in the same location, cannot be distinguishedfrom another matched pair (‘junk’ and ‘pimp’; demoat www.psych.usyd.edu.au/staff/alexh/research/words) atrapid rates as one perceives only the sum, which is thesame for both pairs. They can be distinguished only at ratesslower than several items per second [27].

Why do temporal limits form two distinct groups?What could account for the gulf (Figure 1) separating thetemporal limits of these two sets of visual judgments?Possibly this is simply selective reporting (Box 2), butthere is reason to believe instead that the fast limits reflectspecialized mechanisms, with the slow limits imposedwhen a visual representation must be constructed bycentral, possibly attentive processing. This notion of fastperipheral processing and slower central processing is anold one [28,29]. However, over the last decade or so thenumber of established slow limits (Figure 1) has doubled.Words, on the low-resolution list, are not recognized untilinferotemporal cortex [30], and attention-mediated motionand binding of arbitrary features are apparently mediatedby parietal cortex [31,32]. Furthermore, some of the limitsare set by long-range, object-based summation [33] ratherthan retinally local alternation rate or short-rangestimulus trajectory [34]. Because visual signals probablyare not compared with auditory signals until late in thesensory processing streams, the finding that audio-visualbinding is limited to 4 Hz [35] is yet another indication thatcentral stages are limited to low frequencies.

If late stages of processing including visual experienceare indeed slow in general, then onemight wonder how anyconscious visual judgment could show a fast rate limit.Fortunately, when visual cognition cannot extract infor-mation because it is confined to rapid rates, fast lower-levelmechanisms can create labels so that later stages can knowwhat is going on at short timescales. These labels aresubsequently temporally combined before reaching visualexperience. This yields a temporally extended object suchas a single moving surface [13] or, in the case of alternatingcolored gratings, a transparent display of two seeminglysimultaneous surfaces [36] (demo at http://www.psych.usyd.edu.au/staff/alexh/research/transparency/Home.html).However, with visual cognition thus dependent on special-ized mechanisms to inform it of high frequency information,much is lost.

Understanding high-level visual processingThe temporal limits of the early stages of vision are typi-cally explainedwith the linear systems concept of temporalblurring used to explain the flicker limit earlier in thisarticle [37,38]. It is uncertain whether the slow limits ofhigh-level visual processing can be explained in the sameway. The temporal filtering or blurring explanationassumes that all information ascending the hierarchyeventually makes it to perception. Coarse temporal resol-ution would cause a stimulus to be inappropriately com-bined with the stimuli that precede and follow it,obliterating rapid changes, but nothing is actually dis-carded. Theories of visual cognition propose instead that

only some visual representations are transferred to cogni-tion, with the rest ignored [39,40].

These twin frameworks, temporal resolution theory andattentional selection, rarely coincide in the literature, withthe first applied only to low-level temporal limits and thesecond only to higher limits at visual cognition. A rareexception has arisen with motion perception, which ispresent in both our fast and slow list. That the form ofmotion perception mediated by attention [24,25] seems tohave amuch slower temporal limit, and greater attentionalinvolvement, provides evidence for the present theory of a

Box 2. A true dichotomy?

The limits discussed in the main text, and schematized in Figure 1,seem to form two distinct groups, slow and fast. This review isintended to be comprehensive in reviewing the high-resolutionlimits, but due to space and relevance considerations some limits inthe slow group have been omitted, and others are yet to bemeasured. These considerations, and others discussed below, leaveopen the possibility that the suggested dichotomy is not trulydichotomous.

The fast group of limits are not homogeneous, as underscored bythe existence of the different flicker limits. Flicker between twocolors of the same luminance (equiluminant flicker) is commonlyinvisible above !25 Hz, much slower than the 50 Hz typicallyobserved with flicker of white and black. This puts it on the slowend of the fast limits reviewed, and the large difference from theluminance flicker limit could undermine the idea of the dichotomybetween slow and fast limits. However, when the stimuli are set tosimilar signal strength by equating cone contrast, the flicker limitsare not nearly as different [46]. Thus, a large difference in temporallimit can stem from an unfair comparison rather than a difference inthe temporal resolution of the underlying mechanisms. This issuemust still be addressed for many comparisons of temporal limits.

The divide between slow and fast limits can be partially bridgedby distance between stimulus elements. Some of the fast temporallimits require identification and binding of two simultaneousstimuli. As the visual cortices are retinotopically organized, if thebinding is to take place within visual cortex then far-flung neuronsmust interact when the stimulus elements are far apart. If signalsdiffuse gradually across cortex, then noise can accumulate andinteractions across greater distances will be temporally imprecise.Behaviorally, the temporal limit for some judgments does decreasewith stimulus separation [16]. As signals ascend the processinghierarchy, receptive field sizes increase until there is little spatialselectivity and the neural connectivity distance is less dependent onthe spatial stimulus separation. Nevertheless, one judgmentclassified here as high-level, the same-different judgments oforientation described in the introduction and shown in Movie 1(see supplementary material online), is affected by distance [10].When the oriented elements are abutting, whether they are thesame orientation or different can be perceived at rates as high as8 Hz, whereas once they are a few degrees apart as in the movie, thejudgment is impossible above 4–5 Hz. Perhaps the relevant compu-tation is mediated at an intermediate level of the visual system.Distance dependence of the other putatively high-level judgments isan underexplored yet important issue.

An important omission from this article’s list of temporal limits is anaspect of percepts that is not thought to be high level, but nonethelesshas coarse temporal resolution. The perceived brightness of a patch isinfluenced by the surrounding spatial context, as in the simultaneouscontrast illusion, where a grey patch on a dark background appearsmuch lighter than the same grey on a bright background. If thebackground is alternated between bright and dark, the grey alternatesin appearance at the same rate, but only up to a backgroundalternation rate of about five times per second [47]. This is consistentwith other evidence that surface appearance involves signals thatspread slowly across the visual cortex [48–50], and is consistent withthe distance effects described in the previous paragraph.

Review Trends in Cognitive Sciences Vol.13 No.5

219

large temporal divide, and some support for the use of twinframeworks to explain the high-level limits.

The discounting of information by attention before cog-nition is reached is commonly experienced when viewing arapid-cut music video. You might feel you have seen andheard many things in the video, but that most went, youmight say, in one ear and out the other. ‘Attentional blink’experiments document one of the temporal limits involved.In these experiments, participants are asked to process twostimuli at different times, much like the attempt to followthe succession of brief scenes in themusic video. Processingof the second stimulus is severely impaired if the firststimulus appeared shortly before. The culprit is thoughtto be a time-consuming stage required to process the firststimulus, such as short-termmemory consolidation [40], orone required to switch to processing the second stimulus,such as re-setting of attentional control [41].

Could the slow process involved in the attentional blinkbe the same as that which causes the resolution limit to below for binding widely separated features and for perceiv-ing acceleration? Both limits are probably caused by limitsat high-level processing stages. However, the attentionalblink requires a first task as a trigger, possibly to consumehigh-level resources, whereas the perceptual limitsreviewed here occur despite the benefit of full resources.The attentional blink is substantially more time-consum-ing than even the slowest visual percept reviewed here andmight be imposed by a bottleneck or switching limit at amore cognitive stage.

Experiments designed to target specific perceptual pro-cesses sometimes miss and instead reflect low-resolution,high-level processes. Flash-lag experiments, designed tomeasure position perception, have observers report theposition of a moving object at the time of a sudden flash.The moving object is perceived farther along its trajectorythan it actually was at the time of the flash, as if the visualsystem actively shifted it. However, the task requiresbinding the time of the flash with the simultaneous pos-ition of the moving object, and binding separated elementsusually has low temporal resolution. Consistent with this,the positions reported vary across trials, occupying a 60–70 ms swath of the moving object’s trajectory [42]. Indeed,at higher velocities this distribution fills a larger spatialarea, such that the temporal imprecision remains constant(Linares, D. et al. unpublished). The limit on task perform-ance is therefore a process with coarse temporal resolution.Could this coarse binding limitation be related to the shiftin the direction of motion? Possibly yes (as suggested byRef. [7]) because other work on bindingmotionwith a staticfeature also yields a temporal bias – the static feature isperceived as co-occurring with a later phase of the motion[3,43]. Also consistent with a causal role for high-levelbinding is that the flash-lag shift is not evident at mid-level visual mechanisms [44].

SummaryThe title of this article refers to the phenomenon of ‘seeingfast’, which might seem ungrammatical in isolation. How-ever, the meaning here is not to see quickly or rapidly, butrather to see things that occupy fast timescales. Afterhitting the retina, visual signals rocket towards cortex,

and on the way only changes on the order of a few milli-seconds are lost [45], perhaps due to membrane fluctu-ations and temporal summation of signals at geniculateand geniculo-cortical synapses. Shortly after reaching cor-tex, specialized motion detectors and edge detectors cross-correlate the incoming signals, outputting representationsof certain events at narrow timescales. These specialized,high temporal resolution motion detectors and edge detec-tors are replicated across the visual field. But with a highcost in cortical territory to be paid for having these special-purpose mechanisms, evolution has provided for only aselect set. Visual signals continue past the secondaryvisual cortices andmove towards visual cognition, but thenhit slow going. Visual cognition can make nearly anyjudgment about its inputs, but these computations areso slow that information at narrow timescales cannot beaccessed if not already explicitly represented by the low-level specialized detectors. Making matters worse is thatcognition is also limited in resources and only able toprocess one or a few objects at a time. Not all incomingsignals can then be processed, just as for Lucille Ball inher famous encounter with a fast assembly line (http://youtube.com/watch?v=4wp3m1vg06Q). Hapless Lucy wastold she must box every chocolate on the line; fortunatelycognition can afford to let many signals just pass it by, withonly a few needing its special treatment. Attentional selec-tion is usually able to choose a few for further processingand discard the rest. However, with certain repetitivetrains of stimuli, such as in the demonstrations of slowlimits described in this article, attention is unable toselect one stimulus and isolate it from succeeding visualrepresentations. In color–motion binding (Movie 2 insupplementary material online) for example, the secondcolor and second motion are fed into visual cognition at atime when it has only just begun to bind the first pair, andour percept remains unbound.

Vision science has a strong foundation in psychophysics,which can functionally identify the early visual stages andtheir simple filtering properties. But higher stages bearlittle resemblance to simple filters.Up in the clouds of visualcognition, as processing becomes more general and flexible,conventional psychophysics loses its power. To understandthis realm, further research (Box 3) should focus on thehallmarks of higher-level processes: low capacity managed

Box 3. Outstanding questions

If high-level visual processing has poor temporal resolution, whatabout inter-sensory processing? Intensive work has only recentlybegun on this question, with Fujisaki and Nishida [35,54,55] leadingthe way.

Why are low temporal resolution processes low temporalresolution? It is not understood what aspect of the underlyingneural processing is so time-consuming.

Low-level luminance mechanisms adapt their integration timesomewhat to fit the current input [56]. Do higher mechanisms alsodo so? Alternatively, could they rely on fixed component mechan-isms working simultaneously on different timescales [57] so each isavailable if needed?

How do visual mechanisms collaborate so harmoniously thatmost people never realize their visual experience comprisesdisparate rate limits? How do visual processing mechanisms withdifferent processing times keep perceptual qualities in temporalregister? (see Ref. [7].)

Review Trends in Cognitive Sciences Vol.13 No.5

220

by a selection process, attentional resource demands, and asargued here, limited temporal resolution.

AcknowledgementsOver a decade ago, Patrick Cavanagh introduced me to some of the coreideas of this article and since then has continued to collaborate on relatedprojects and provide sage advice. Alex L. White, Paul Dux, Daniel Linaresand two anonymous reviewers provided many helpful comments.

Appendix A. Supplementary dataSupplementarymaterial associated with this article can befound at doi:10.1016/j.tics.2009.02.005.

References1 Coe, B. (1992)Muybridge and the Chronophotographers.MOMI/British

Film Institute2 Johnston, A. et al. (2006) Spatially localized distortions of event time.

Curr. Biol. 16, 472–4793 Moutoussis, K. and Zeki, S. (1997) A direct demonstration of perceptual

asynchrony in vision. Proc. Biol. Sci. 264, 393–3994 Whitney, D. and Murakami, I. (1998) Latency difference, not spatial

extrapolation. Nat. Neurosci. 1, 656–6675 Eagleman, D.M. (2008) Human time perception and its illusions. Curr.

Opin. Neurobiol. 18, 131–1366 Whitney, D. (2002) The influence of visual motion on perceived

position. Trends Cogn. Sci. 6, 211–2167 Nishida, S. and Johnston, A. Problems of Space and Time in Perception

and Action (Nijhawan, R. and Khurana, B., eds), Cambridge UniversityPress (in press)

8 Kline, K. et al. (2004) Illusory motion reversal is caused by rivalry, notby perceptual snapshots of the visual field. Vision Res. 44, 2653–2658

9 VanRullen, R. andKoch, C. (2003) Is perception discrete or continuous?Trends Cogn. Sci. 7, 207–213

10 Motoyoshi, I. (2004) The role of spatial interactions in perceptualsynchrony. J. Vis. 4, 352–361

11 Motoyoshi, I. and Nishida, S. (2001) Temporal resolution oforientation-based texture segregation. Vision Res. 41, 2089–2105

12 Farid, H. (2002) Temporal synchrony in perceptual grouping: acritique. Trends Cogn. Sci. 6, 284–288

13 Burr, D.C. and Ross, J. (1982) Contrast sensitivity at high velocities.Vision Res. 22, 479–484

14 Morgan, M.J. and Castet, E. (1995) Stereoscopic depth perception athigh velocities. Nature 378, 380–383

15 Ramachandran, V.S. andRogers-Ramachandran, D.C. (1991) Phantomcontours: a new class of visual patterns that selectively activates themagnocellular pathway in man. Bull. Psychon. Soc. 29, 391–394

16 Forte, J. et al. (1999) Spatial limitations of temporal segmentation.Vision Res. 39, 4052–4061

17 Holcombe, A.O. and Cavanagh, P. (2001) Early binding of feature pairsfor visual perception. Nat. Neurosci. 4, 127–128

18 Bodelon, C. et al. (2007) Temporal resolution for the perception offeatures and conjunctions. J. Neurosci. 27, 725–730

19 Clifford, C.W.G. et al. (2004) Rapid global form binding with loss ofassociated colors. J. Vis. 4, 1090–1101

20 Tyler, C.W. et al. (1995) Multiple mechanisms for the detection ofmirror symmetry. Spat. Vis. 9, 79–100

21 Moradi, F. and Shimojo, S. (2004) Perceptual-binding and persistentsurface segregation. Vision Res. 44, 2885–2899

22 Arnold, D.H. (2005) Perceptual pairing of colour and motion. VisionRes. 45, 3015–3026

23 Werkhoven, P. et al. (1992) Visual processing of optic acceleration.Vision Res. 32, 2313–2329

24 Cavanagh, P. (1992) Attention-based motion perception. Science 257,1563–1565

25 Lu, Z.L. and Sperling, G. (2001) Three-systems theory of human visualmotion perception: review and update. J. Opt. Soc. Am. A Opt. ImageSci. Vis. 18, 2331–2370

26 Verstraten, F.A.J. et al. (2000) Limits of attentive tracking revealtemporal properties of attention. Vision Res. 40, 3651–3664

27 Holcombe, A.O. and Judson, J. (2007) Visual binding of English andChinese word parts is limited to low temporal frequencies. Perception36, 49–74

28 He, S. et al. (1997) Attentional resolution. Trends Cogn. Sci. 1, 115–121

29 Van de Grind, W.A. et al. (1973) Handbook of Sensory Physiology (Vol.VII/3A), pp. 431-573, Springer Verlag

30 Vinckier, F. et al. (2007) Hierarchical coding of letter strings in theventral stream: dissecting the inner organization of the visual word-form system. Neuron 55, 143–156

31 Battelli, L. et al. (2001) Unilateral right parietal damage leads tobilateral deficit for high- level motion. Neuron 32, 985–995

32 Friedman-Hill, S.R. et al. (1995) Parietal contributions to visualfeature binding: evidence from a patient with bilateral lesions.Science 269, 853–856

33 Cavanagh, P. et al. (2008) Mobile computation: spatiotemporalintegration of the properties of objects in motion. J. Vis. 8, 1.1–23

34 Nishida, S. et al. (2007) Human visual system integrates color signalsalong a motion trajectory. Curr. Biol. 17, 366–372

35 Fujisaki, W. and Nishida, S. (2005) Temporal frequency characteristicsof synchrony-asynchrony discrimination of audio-visual signals. Exp.Brain Res. 166, 455–464

36 Holcombe, A.O. (2001) A purely temporal transparency mechanism inthe visual system. Perception 30, 1311–1320

37 Kelly, D.H. (1979) Motion and vision. II. Stabilized spatiotemporalthreshold surface. J. Opt. Soc. Am. 69, 1340–1349

38 Burr, D.C. and Morrone, M.C. (1993) Impulse-response functions forchromatic and achromatic stimuli. J. Opt. Soc. Am. A 10, 1706–1713

39 Broadbent, D.E. (1958) Perception and Communication. PergamonPress

40 Chun, M.M. and Potter, M.C. (1995) A two-stage model for multipletarget detection in rapid serial visual presentation. J. Exp. Psychol.Hum. Percept. Perform. 21, 109–127

41 Di Lollo, V. et al. (2005) The attentional blink: resource depletion ortemporary loss of control? Psychol. Res. 69, 191–200

42 Murakami, I. (2001) The flash-lag effect as a spatiotemporal correlationstructure. J. Vis. 1, 126–136

43 Holcombe, A.O. (2009) Temporal binding favours the early phase ofcolour changes, but not of motion changes, yielding the colour-motionasynchrony illusion. Vis. Cogn. 17, 232–253

44 Linares, D. and Lopez-Moliner, J. (2007) Absence of flash-lag whenjudging global shape from local positions. Vision Res. 47, 357–362

45 Gegenfurtner, K.R. et al. (1997) Functional properties of neurons inmacaque area V3. J. Neurophysiol. 77, 1906–1923

46 Metha, A.B. andMullen, K.T. (1996) Temporal mechanisms underlyingflicker detection and identification for red-green and achromaticstimuli. J. Opt. Soc. Am. A Opt. Image Sci. Vis. 13, 1969–1980

47 De Valois, R.L. et al. (1986) Temporal properties of brightness and colorinduction. Vision Res. 26, 887–897

48 Paradiso, M.A. and Nakayama, K. (1991) Brightness perception andfilling-in. Vision Res. 31, 1221–1236

49 Caputo, G. (1998) Texture brightness filling-in. Vision Res. 38, 841–85150 Ringach, D.L. and Shapley, R. (1996) Spatial and temporal properties

of illusory contours and amodal boundary completion. Vision Res. 36,3037–3050

51 Breitmeyer, B. (2007) Visual masking: past accomplishments, presentstatus, future developments. Adv. Cogn. Psychol. 3, 9–20

52 Reeves, A. (2007) An analysis of visual masking, with a defense of‘Stopped Processing’. Adv. Cogn. Psychol. 3, 57–65

53 Francis, G. (2007) What should a quantitative model of masking looklike and why would we want it? Adv. Cogn. Psychol. 3, 21–31

54 Fujisaki, W. and Nishida, S. (2008) Top-down feature-based selection ofmatching features for audio-visual synchrony discrimination. Neurosci.Lett. 433, 225–230

55 Fujisaki, W. and Nishida, S. (2007) Feature-based processing of audio-visual synchrony perception revealed by random pulse trains. VisionRes. 47, 1075–1093

56 Stromeyer, C.F. and Martini, P. (2003) Human temporal impulseresponse speeds up with increased stimulus contrast. Vision Res. 43,285–298

57 Koenderink, J.J. (1988) Scale-time. Biol. Cybern 58, 159–162

Review Trends in Cognitive Sciences Vol.13 No.5

221