In human vision, the optics of the eye map neighboring points of the environment onto neighboring photoreceptors in the retina. This retinotopic encoding principle is preserved in the early visual areas. Under normal viewing conditions, due to the motion of objects and to eye movements, the retinotopic representation of the environment undergoes fast and drastic shifts. Yet, perceptually our environment appears stable suggesting the existence of non-retinotopic representations in addition to the well-known retinotopic ones. Here, we present a simple psychophysical test to determine whether a given visual process is accomplished in retino- or non-retinotopic coordinates. As examples, we show that visual search and motion perception can occur within a non-retinotopic frame of reference. These findings suggest that more mechanisms than previously thought operate non-retinotopically. Whether this is true for a given visual process can easily be found out with our “litmus test.”
Retinotopic organization plays a fundamental role in our investigations of the visual cortex and in our conceptualizations of its functions. For example, fMRI studies rely heavily on retinotopic mapping of cortical areas (e.g., Tootell, Hadjikhani, Mendola, Marret, & Dale, 1998). Moreover, many neuroscientific theories rely implicitly or explicitly on retinotopic processing. For example, feature integration theory assumes that attention operates on retinotopically organized feature maps (Treisman & Gelade, 1980). However, not all processes are strictly retinotopic. For example, it has been shown that neurons can “shift” the retinotopic position of their receptive fields before a saccade is executed (e.g., Duhamel, Colby, & Goldberg, 1992). A dissociation between perceived position and retinotopic position has also been demonstrated in an fMRI study where the retinotopic representation of a stationary window was found to shift when patterns (Gabor patches) inside this window underwent drifting motion (Whitney et al., 2003). Progress in visual neuroscience depends critically on the ability to determine which functions have their bases in retinotopic representations and which functions have their bases in non-retinotopic representations.
The classical experimental technique to distinguish between retinotopic and non-retinotopic processing is the saccadic stimulus presentation paradigm (SSPP; e.g., Davidson, Fox, & Dick, 1973; Golomb, Chun, & Mazer, 2008; Irwin, 1996; Knapen, Rolfs, & Cavanagh, 2009; McRae, Butler, & Popiel, 1987; Melcher, 2005, 2007, 2008; Melcher & Colby, 2008; Melcher & Morrone, 2003). In SSPP, observers are asked to make a saccadic eye movement from one fixation point to a second fixation point (Figure 1). Two stimuli, one before the eye movement and a second one after the eye movement, are presented briefly. As shown in Figure 1, the retinotopic shift, generated by the eye movement, causes different relative alignments of the two stimuli according to retinotopic and spatiotopic coordinate systems. SSPP is a natural and compelling way to investigate non-retinotopic processing across saccades. However, SSPP is not applicable to fast, short-lived processes that require the presentation of stimuli with brief inter-stimulus intervals (ISIs) because the latency, duration, and variability of saccadic eye movements limit the minimum ISIs that can be reliably introduced between the stimulus presented before and the stimulus presented after the eye movement. Finally, the involvement of the motor system or phenomena such as saccadic suppression can complicate the interpretation of the findings in SSPP.
Here, we present a simple but powerful test, which overcomes these shortcomings. Our test for non-retinotopic visual processing is based on a version of the Ternus–Pikler display (Petersik & Rice, 2006; Pikler, 1917; Ternus, 1926). Three disks in a first frame are presented for 100 ms and followed by an ISI of a variable duration. In a second frame, the disks are shifted one position rightward and, after another ISI, the sequence starts over again (Figure 2). For relatively long ISIs, “group motion” is perceived, i.e., observers perceive the three disks moving in tandem back and forth (Figure 2a; Video 1). For short ISIs, the outer disks are perceived to move back and forth while the inner two disks appear stationary (“element motion”; Figure 2b; Video 2). Previously, we have used this paradigm to demonstrate non-retinotopic processing of form information (Oğmen, Otto, & Herzog, 2006; Otto, Oğmen, & Herzog, 2008). Here, we present a generalized version that can be used as a “litmus test” of non-retinotopic processing for virtually any kind of visual process. We illustrate this with three examples.
Observers viewed the stimuli on a PHILIPS 201B4 or ViewSonic G90f+/b CRT monitor driven by a standard accelerated graphics card. Screen resolution was set to 1280 by 1024 pixels at 75-Hz refresh rate.
An iViewX-HiSpeed eye tracker from SensoMotoric Instruments (SMI) was used to record eye positions in the first and third experiments. It was set up for binocular mode at 500-Hz sampling frequency. Signals of both eyes were averaged in order to reduce noise.
A total of twenty-three observers took part in three experiments. Observers had normal or corrected-to-normal vision in at least one eye, as assessed with the Freiburg Visual Acuity Test (Bach, 1996). All but two observers were naive to the purposes of the experiments. Naive observers were paid 20 CHF per hour. The general purpose of the experiment was explained to them, and they gave signed informed consent. All experiments were approved by the local ethics commission, and observers were told they could quit the experiment at any time.
First, we show an example of how motion perception can occur within a non-retinotopic frame of reference.
Eight paid observers, naive to the purpose of the experiment, participated. Observers viewed the stimuli binocularly from a distance of 3 m in a dimly lit room. The stimulus was a variant of the Ternus–Pikler display (Figure 2). A black dot was inserted in each of the three white disks (Figure 3). The outer disks always contained a dot in the center. In the central disk, the dot position advanced from frame to frame along the trajectory of a clockwise or counterclockwise rotation. The dot completed a full rotation in four frames (Video 5).
The disks were 0.5 arcdeg in diameter, 0.6 arcdeg away from each other (center-to-center distance), and displayed with a luminance of 56 cd/m2. The disks were shifted by one inter-disk distance in successive frames. The black dots had 0.073 arcdeg diameter. The trial began with a 2.6 arcmin fixation dot that stayed on the screen for 1 s and was followed by the stimulus.
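The angular sizes above translate into physical sizes on the screen through the viewing distance. The following is a minimal sketch in Python, assuming the standard relation s = 2 d tan(θ/2); the function name `visual_angle_to_size` is illustrative and not taken from the study.

```python
import math

def visual_angle_to_size(angle_deg, distance_m):
    """Physical extent (in metres) that subtends angle_deg at distance_m."""
    return 2.0 * distance_m * math.tan(math.radians(angle_deg) / 2.0)

# Disk diameter and centre-to-centre spacing at the 3 m viewing distance
disk_diameter_m = visual_angle_to_size(0.5, 3.0)
disk_spacing_m = visual_angle_to_size(0.6, 3.0)
print(f"disk diameter: {disk_diameter_m * 1000:.1f} mm")
print(f"disk spacing:  {disk_spacing_m * 1000:.1f} mm")
```

Converting these lengths to pixels would additionally require the pixel pitch of the monitor, which is not reported here.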
Frame duration was always 120 ms. Three conditions were tested: group motion, element motion, and no motion (Figure 3). In the group motion and element motion conditions, the spatial stimulus configuration was identical. The ISI was set to 210 ms in the former and 0 ms in the latter. In the no motion condition, the outmost disks were omitted from the group motion display (ISI 210 ms). In each trial, stimulus presentation was randomized for condition (group motion, element motion, no motion), dot rotation direction (clockwise or counterclockwise), rotation starting point (0 or 180 degrees), and display motion direction (leftward or rightward first). Therefore, one of 3 × 2 × 2 × 2 = 24 different stimulus configurations was randomly presented on every trial. Four blocks of 48 trials were run for each observer.
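The factorial design described above can be sketched as follows. This is an illustration only: the names are hypothetical, and drawing configurations independently with replacement on each trial is an assumption, since the text does not say whether the 48 trials of a block were balanced across configurations.

```python
import itertools
import random

CONDITIONS = ["group motion", "element motion", "no motion"]
ROTATIONS = ["clockwise", "counterclockwise"]
START_ANGLES_DEG = [0, 180]
FIRST_SHIFTS = ["leftward", "rightward"]

# Full factorial design: 3 x 2 x 2 x 2 stimulus configurations
CONFIGURATIONS = list(
    itertools.product(CONDITIONS, ROTATIONS, START_ANGLES_DEG, FIRST_SHIFTS)
)

def make_block(n_trials=48, rng=None):
    """Draw one configuration at random (with replacement) for every trial."""
    rng = rng or random.Random()
    return [rng.choice(CONFIGURATIONS) for _ in range(n_trials)]

block = make_block(rng=random.Random(0))  # seeded only for reproducibility
print(len(CONFIGURATIONS), "configurations,", len(block), "trials in the block")
```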
Four observers, three naive and one experienced, participated in an additional experiment using eye tracking. For this experiment a larger stimulus was used (1.5 arcdeg element diameter, 1.8 arcdeg inter-element distance, 0.205 arcdeg dot diameter) to allow better tracking. A 3.1-arcmin diameter fixation dot preceded the beginning of each trial, stayed on the screen for 1 s, and was followed by the stimulus. For the trial to start, observers had to fixate within a window of a width of 1 arcdeg centered on the fixation dot for 300 ms. The distance between the observer and the screen was 66 cm. Two observers ran three blocks of 128 trials each under the group motion condition only (two further observers performed three blocks of 48 trials only). Only correct trials were evaluated. The stimulus was randomized for dot rotation direction, rotation starting point, and display starting position. Hence, eight different stimulus variants were presented during this experiment. For every eye-tracking pattern, we considered separately the horizontal and vertical components of the traces, mirrored and averaged them to obtain two patterns. This procedure eliminated potential artifacts due to systematic changes in pupil diameter.
With an ISI of 210 ms, three disks are perceived moving as a group horizontally back and forth (group motion, see Figure 3a). Within the central disk, a dot appears to be rotating (Video 4, in this video the central dot is presented at eight positions along a complete rotation around the center). This rotation is computed non-retinotopically, i.e., after group motion is established. The role of non-retinotopic motion processing becomes immediately evident when group motion is obliterated, either by setting the ISI to 0 ms (element motion, Figure 3b) or by removing the outmost disks (no motion, Figure 3c). Under these latter conditions, clearly, no motion of the disks is perceived and no dot rotation either (Videos 6 and 7). The small dots inside the left and right central disks appear to move vertically up–down or horizontally left–right, respectively. This motion occurs because the closest dot-to-dot matches in successive frames fall on “retinotopic” horizontal and vertical trajectories within the two central disks.
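The contrast between the two readings of the same frame sequence can be made concrete with a small sketch. It assumes, for illustration, that the rotating dot occupies the top, right, bottom, and left of the central disk in successive frames and that screen positions are numbered as slots 0 to 3; these names and the slot numbering are not from the original study.

```python
OFFSETS_CLOCKWISE = ["top", "right", "bottom", "left"]  # 90 deg steps

def make_frames(n_frames=4, clockwise=True, start=0):
    """Each frame maps a screen slot (retinotopic position) to a dot offset."""
    step = 1 if clockwise else -1
    frames = []
    for k in range(n_frames):
        shift = k % 2  # the display sits one slot further right on odd frames
        dot = OFFSETS_CLOCKWISE[(start + step * k) % 4]
        # outer disks carry a centred dot, the middle disk the rotating dot
        frames.append({shift: "center", shift + 1: dot, shift + 2: "center"})
    return frames

def read_retinotopic(frames, slot):
    """Dot offsets found at one fixed screen slot across frames."""
    return [frame[slot] for frame in frames]

def read_group_based(frames):
    """Dot offsets found in the middle element of the group across frames."""
    return [frame[(k % 2) + 1] for k, frame in enumerate(frames)]

frames = make_frames()
print("group-based:       ", read_group_based(frames))
print("retinotopic slot 1:", read_retinotopic(frames, 1))
print("retinotopic slot 2:", read_retinotopic(frames, 2))
```

Under these assumptions, following the middle element of the group yields the full rotation, whereas each fixed screen slot only sees the dot alternate between the disk center and positions along a single axis, vertical in one slot and horizontal in the other.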
It is important to note that the stimulus of the group motion condition pictured in Figure 3a is identical to the stimulus of the no motion condition illustrated in Figure 3c, except for the missing outer left and right disks. Hence, the contextual outer disks determine whether retinotopic or non-retinotopic motion is perceived. These contrasting predictions make the Ternus–Pikler display a simple yet powerful technique to test whether or not a given computation is carried out in retinotopic coordinates.
To quantify these effects, we designed an experiment in which the dot could rotate either clockwise or counterclockwise. The observer’s task was to indicate the perceived rotation direction by pressing one of two buttons. We presented this stimulus in the group motion, element motion, and no motion conditions (Figure 3, Videos 5–7). Bonferroni-corrected multiple comparisons (α = 0.05) showed significantly better performance in the group motion condition than in both the element motion and the no motion conditions (Figure 3d).
Can eye movements explain these results? Since the back and forth motion of the display is highly predictable, observers might be making eye movements to track the disks back and forth. We recorded eye movements in four observers under the group motion condition using a larger stimulus (see Methods section). A typical horizontal eye movement pattern for one naive observer under the group motion condition is shown (Figure 3e). Clearly, no significant eye movements were found indicating that observers were maintaining reliable fixation even when they clearly perceived group motion and performed well in the task (average accuracy = 89.5%, SEM = 0.05). Eye tracking results for the other observers are very similar (results not shown).
In the second example, we show that motion detectors are susceptible to retinotopic adaptation of coherent motion even though the perceived motion is incoherent.
Eight new observers, naive to the purpose of the experiment, viewed the stimuli from a distance of 2 m. We used a Ternus–Pikler display with squares carrying Gabor patches. Squares had a luminance of 40 cd/m2, a side length of 2 arcdeg, and a center-to-center distance of 2.8 arcdeg. Gabors were sinusoidal luminance modulations (1.6 cpd) of 50% Michelson contrast, constrained by a Gaussian window with a σ of 0.5 arcdeg centered at the middle of the squares. Gabor carriers drifted either upward or downward at a speed of 31 arcmin/s. Each trial started with a 1.3 arcmin diameter fixation dot lasting 500 ms, followed by the stimulus. The adapting sequence was presented for 4800 ms. After the adapting sequence, a 500-ms blank screen preceded the motion aftereffect (MAE) test sequence, which stayed on the screen for 300 ms. With both an ISI and a frame duration of 200 ms, group motion of three squares was perceived (Videos 8 and 9).
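The Gabor parameters above can be assembled into a luminance function. The sketch below is one plausible parameterization, not the original stimulus code: it assumes that the square luminance serves as the mean luminance of the patch, that the carrier is in cosine phase at the center at t = 0, and that the grating is modulated along the vertical axis because it drifts up or down.

```python
import math

MEAN_LUMINANCE = 40.0      # cd/m^2, taken here as the square luminance
CONTRAST = 0.5             # Michelson contrast of the carrier
SPATIAL_FREQ = 1.6         # cycles per degree
SIGMA = 0.5                # arcdeg, standard deviation of the Gaussian window
DRIFT_SPEED = 31.0 / 60.0  # arcdeg/s (31 arcmin/s)

def gabor_luminance(x, y, t, direction=1):
    """Luminance at (x, y) in arcdeg from the square centre at time t (s).

    direction = +1 drifts the carrier toward +y (upward), -1 toward -y.
    """
    envelope = math.exp(-(x * x + y * y) / (2.0 * SIGMA ** 2))
    phase = 2.0 * math.pi * SPATIAL_FREQ * (y - direction * DRIFT_SPEED * t)
    return MEAN_LUMINANCE * (1.0 + CONTRAST * envelope * math.cos(phase))

# Temporal frequency implied by drift speed and spatial frequency (Hz)
temporal_freq_hz = DRIFT_SPEED * SPATIAL_FREQ
print(f"implied temporal frequency: {temporal_freq_hz:.3f} Hz")
print(f"peak luminance at the centre: {gabor_luminance(0.0, 0.0, 0.0):.1f} cd/m^2")
```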
Two conditions were tested. In the “retinotopic” condition, Gabors presented at the same retinotopic location drifted always in the same direction. The direction of drift for the Gabors positioned to the left of the virtual midline was always opposite to the direction of drift for the Gabors positioned to the right of the midline (Figure 4). With this arrangement, the Gabor in the central square was perceived to be drifting alternately upward and downward.
To measure motion aftereffects (MAEs), two squares carrying Gabor patches were displayed at the two opposite sides of the midline. A nulling technique was used to measure MAE. The Gabors drifted with five different velocities either up- or downward (−21, −10.5, 0, 10.5, 21 arcmin/s, Figure 4a). Observers responded by pressing one of two buttons to indicate the perceived drift direction of the test Gabors. Responses as a function of test Gabor drift velocity were fit with a cumulative Gaussian, whose parameters were adjusted by a maximum likelihood procedure. The inflection point of the best fitting curve was taken as a measure of the speed at which the Gabors appeared subjectively stationary. This speed was used as an estimate of the MAE magnitude.
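The nulling analysis can be sketched as a cumulative Gaussian fitted by maximum likelihood. The example below uses a coarse grid search in place of whatever optimizer was actually employed, and the response counts are synthetic (generated noise-free from μ = 5 arcmin/s and σ = 10 arcmin/s) purely to show that the procedure recovers the inflection point; none of these numbers are data from the experiment.

```python
import math

def cumulative_gaussian(v, mu, sigma):
    """Probability of an 'upward' response at test velocity v."""
    return 0.5 * (1.0 + math.erf((v - mu) / (sigma * math.sqrt(2.0))))

def log_likelihood(velocities, n_up, n_total, mu, sigma):
    """Binomial log-likelihood of the response counts (constant terms dropped)."""
    total = 0.0
    for v, k, n in zip(velocities, n_up, n_total):
        p = min(max(cumulative_gaussian(v, mu, sigma), 1e-9), 1.0 - 1e-9)
        total += k * math.log(p) + (n - k) * math.log(1.0 - p)
    return total

def fit_psychometric(velocities, n_up, n_total):
    """Grid-search maximum likelihood estimate of (mu, sigma)."""
    best = (-math.inf, None, None)
    for i in range(-42, 43):        # mu from -21 to 21 in steps of 0.5
        mu = i * 0.5
        for j in range(1, 81):      # sigma from 0.5 to 40 in steps of 0.5
            sigma = j * 0.5
            ll = log_likelihood(velocities, n_up, n_total, mu, sigma)
            if ll > best[0]:
                best = (ll, mu, sigma)
    return best[1], best[2]

# Test velocities from the text (arcmin/s); counts are synthetic, see above.
velocities = [-21.0, -10.5, 0.0, 10.5, 21.0]
n_total = [16] * 5
n_up = [n * cumulative_gaussian(v, 5.0, 10.0) for v, n in zip(velocities, n_total)]

mu_hat, sigma_hat = fit_psychometric(velocities, n_up, n_total)
print(f"estimated null velocity: {mu_hat} arcmin/s (sigma = {sigma_hat})")
```

The fitted μ is the inflection (50%) point of the curve, which corresponds to the velocity at which the test Gabors would appear stationary.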
In the “non-retinotopic” condition, Gabors at the same retinotopic location reversed their motion direction from one frame to the other. Perceptually, a coherent upward (or downward) drift motion was perceived (Figure 4b). We measured the MAE by presenting three squares with Gabors at the locations of the last frame of the adapting sequence (Figure 4). In this way, we could assess if any non-retinotopic MAE was produced by the central square as it moved across different locations. The direction and velocity were again varied according to the method of constant stimuli. We asked observers to indicate the direction of motion of the central Gabor by pressing one of two buttons. As before, the 50% point of the best fitting function was taken as the speed at which the Gabor in the central square was perceived to be stationary. In the two adaptation sequences, Gabor contrast was 80%.
Four blocks, consisting of 40 trials each, were run in each condition. Display motion direction was constant within one block.
In the first condition, the central Gabor was perceived as alternatingly moving upward and downward (Figure 4a, Video 8), even though, at each retinotopic position, Gabors were (invisibly) always drifting either upward or downward. With this setup, we found a strong motion aftereffect (Figure 4c; one-sample t-test, p < 0.0001). Hence, retinotopic motion detectors adapt strongly even though the “retinotopic” motion is not perceived consciously.
In the non-retinotopic condition, we arranged the Gabors in such a way that the Gabor in the central square was perceived to drift consistently in one direction, e.g., upward. Retinotopically, Gabors drifted up and down in alternating frames (Figure 4b, Video 9). This retinotopic alternating motion is invisible to the observer. A much smaller adaptation was found (Figure 4c; one sample t-test, p = 0.015). Future research will determine whether this weak MAE can be attributed to the adaptation of non-retinotopic motion detectors and how it relates to nonlocation specific MAEs (Freeman, Sumnall, & Snowden, 2003; Snowden & Milne, 1997; Von Grünau & Dube, 1992; Whitney & Cavanagh, 2003).
In visual search, a target has to be searched among distracters, e.g., a horizontal, green line among red and green vertical lines. In each trial, the target is either present or absent. It is generally assumed that basic features are represented in feature maps coding, e.g., color or orientation (Figure 5a). These maps are implicitly (e.g., Huang & Pashler, 2007; Treisman & Gelade, 1980) or explicitly (Palmer, 1999, p. 532) assumed to be retinotopic. According to the feature integration theory (FIT; Treisman & Gelade, 1980), a retinotopic master map operates on the feature maps. If, for example, a horizontal green line has to be searched for, the master map directs focal attention to the same retinotopic location in the color and orientation map to determine whether they contain a green and a horizontal “entry,” respectively. This kind of search is referred to as conjunction search and reaction times increase when the number of distracters increases. Here, we show that attentional selection can operate on non-retinotopic coordinates.
Five observers participated in this experiment. In the group motion condition (Figure 5b), we used a Ternus–Pikler display composed of two outer squares and a central disk, which, as usual, were shifted back and forth from frame to frame (we used this setup with two squares and one disk to further enhance the group motion percept). We overlaid three search displays, one on each square or disk. In the second frame, the display was shifted either rightward or leftward. The ISI and frame duration ranged from 80 ms to 120 ms, individually adjusted for each observer to optimize the effects. Subjectively, observers perceived one search display on each of the three perceived Ternus–Pikler elements (Video 10).
In the no motion condition (Figure 5c), we omitted the outmost left and right square (Video 11). In the group motion condition, the disk was always at the central position and therefore the target location was predictable from the first frame. To make the target location predictable also in the no motion condition, for a given block, the sequence started always with the disk on one side of the square (always on the left of the square in Figure 5c).
Emphasis was on accuracy but observers were asked to respond as quickly as possible. Observers were familiarized with the stimuli and response system. The task for the observer was to report the presence or absence of a horizontal green line within the central disk. Acoustic feedback was provided upon incorrect responses.
Squares (2.2 arcdeg side) and disk (3.1 arcdeg diameter) were spaced 3.2 arcdeg center to center. Both disk and squares had a gray surface (8 cd/m2) and were surrounded by a thin outline of 0.05 arcdeg (64 cd/m2) to increase the contrast with the black background (0.2 cd/m2). Search displays contained sixteen lines in two colors (red and green) of two orientations (horizontal and vertical). Lines were 800″ wide and 300″ long with a luminance of 20 cd/m2. The central disk had always either one or no horizontal green line (depending on whether a target present or a target absent trial was being presented). The remaining lines were horizontal red or vertical green or red. The outer squares were composed of lines of any combination of color and orientation. Observers viewed the stimuli from a distance of 150 cm in a dimly lit room. A 3.5-arcmin diameter fixation dot preceded every trial, stayed on the screen for 1500 ms, and was followed by the stimulus.
Stimulus presentation was interrupted as soon as the observer responded, or halted after 7 cycles if no response had been provided. Because the ISI and frame duration were individually adjusted and one cycle comprises two frames and two ISIs, the maximum stimulus duration varied from 2240 ms (7 × 4 × 80 ms) to 3360 ms (7 × 4 × 120 ms). Two blocks of 80 trials each were run for each condition. Only reaction times of correct trials were taken into account. Moreover, an outlier rejection procedure was applied by recursively excluding data points more than 3 standard deviations from the observer mean in each condition.
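One common reading of such a recursive outlier rejection is to recompute the mean and standard deviation after every pass until no further points are removed. The sketch below follows that reading as an assumption; the reaction times are hypothetical values for a single observer and condition, not data from the experiment.

```python
import statistics

def reject_outliers(reaction_times, n_sd=3.0):
    """Repeatedly drop values beyond n_sd standard deviations from the mean.

    The mean and SD are recomputed after every pass until nothing is removed.
    """
    kept = list(reaction_times)
    while len(kept) > 2:
        mean = statistics.mean(kept)
        sd = statistics.stdev(kept)
        inside = [rt for rt in kept if abs(rt - mean) <= n_sd * sd]
        if len(inside) == len(kept):
            break
        kept = inside
    return kept

# Hypothetical reaction times (ms) for one observer in one condition
rts = [612, 598, 640, 575, 630, 605, 590, 620, 615, 600, 585, 625, 2900]
clean = reject_outliers(rts)
print(f"kept {len(clean)} of {len(rts)} reaction times")
```

Note that with very few trials a single extreme value cannot exceed 3 sample standard deviations at all, so a criterion of this kind only becomes effective with a sufficient number of trials per condition.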
For the group motion condition, eye movements were recorded from one additional naive observer and averaged across 35 trials to obtain the plot in Figure 5e, which shows the mean horizontal eye position (blue line) and its 95% confidence interval (red lines). During eye movement recording, observers viewed the stimuli from a distance of 66 cm. The stimulus was scaled to have the same size on the retina as in the previous experiment. Again, a fixation dot preceded stimulus presentation and was displayed for 1500 ms, followed by stimulus onset. Observers had to fixate the dot for at least 300 ms to allow the trial to start. Data were also obtained for one of the authors (results not shown).
Performance in terms of accuracy is clearly better in the group motion condition compared to the condition where the outer squares are omitted (Figure 5d, paired t-test, p = 0.012). In this latter “no motion” condition, the search displays are “integrated” retinotopically across successive frames. Because the search displays are different, the displays strongly mask each other. In the group motion condition, the search displays are clearly visible because of the non-retinotopic integration across frames. As a result, search is relatively easy. Our results indicate that attention operates on the feature maps after non-retinotopic integration has occurred (Figure 5).
The early visual system is retinotopically organized (Tootell et al., 1998). However, this retinotopic organization is insufficient to support perception under natural viewing conditions. When the eyes move, the retinotopic representation of the environment undergoes drastic shifts, yet our percepts remain stable (Wurtz, 2008). In addition to this “eye movement problem” for retinotopic representations, there is also an “object movement problem”: Moving objects stimulate retinotopic receptive fields only briefly not allowing sufficient time for the computation of the characteristics of the stimulus (Oğmen, 2007). Studies addressing the limitations of retinotopic representations dealt primarily with the “eye movement problem” by using saccadic stimulus presentation paradigms (SSPPs). As discussed in the Introduction section, in SSPP retinotopic and non-retinotopic representations are contrasted by presenting stimuli before and after a saccadic eye movement (e.g., Davidson et al., 1973; Golomb et al., 2008; Irwin, 1996; Knapen et al., 2009; McRae et al., 1987; Melcher, 2005, 2007, 2008; Melcher & Colby, 2008; Melcher & Morrone, 2003; gaze modulation: d’Avossa et al., 2007; Nishida, Motoyoshi, Andersen, & Shimojo, 2003; Wenderoth & Wiese, 2008). However, SSPP is not well suited for moving stimuli or for fast, short-lived processes that require the presentation of stimuli with brief ISIs. Finally, the involvement of the eye motor system or phenomena such as saccadic suppression can complicate the interpretation of the findings in SSPP.
Our Ternus–Pikler paradigm overcomes these limitations. First, it can be used with both eye movement and steady fixation paradigms; thus, one can dissociate the influence of eye-movement-related processes. Second, short-lived visual processes can be tested because ISIs can be much shorter than in SSPP. When using appropriate stimulus configurations, the ISI can be reduced even to 0 ms and group motion is still perceived (e.g., Kramer & Yantis, 1997; Scott-Samuel & Hess, 2001). The Ternus–Pikler stimulus can also be presented repetitively for long durations and can therefore be used for processes that require long presentation times, as illustrated in our adaptation experiments. Third, our paradigm allows retinotopic and non-retinotopic processes to be pitted directly against each other because of the spatially overlapping elements in the Ternus–Pikler display, which mask each other when retinotopic integration prevails. Simple parametric manipulations (such as ISI, figural characteristics of elements, and omission of flanking elements) can modulate the percept from group to element or to no motion, thereby offering strong control conditions (see also Cavanagh, Holcombe, & Chou, 2008; Shimozaki, Eckstein, & Thomas, 1999).
Based on these advantages, we have reported several novel findings. For example, visual search is usually assumed to rely on retinotopic feature maps. Here, we have shown that attention can operate on non-retinotopic feature maps when group motion prevails in the Ternus–Pikler display (Figure 5d). There are no eye movements during search (Figure 5e) and hence attention is covert. Interestingly, largely retinotopic attention was found in a cueing paradigm where attention was also covert (Golomb et al., 2008; but see Cavanagh et al., 2008). The non-retinotopic processing in visual search involves non-retinotopic form processing because the search elements are integrated across frames. This becomes immediately evident when the flanking squares are omitted: retinotopic integration occurs and search displays mask each other.
This non-retinotopic form processing shows again the sensitivity of our paradigm because studies using SSPP never found form processing to be non-retinotopic (Irwin, 1991; Irwin, Yantis, & Jonides, 1983). This holds also for a paradigm based on apparent motion, which is very similar to our Ternus–Pikler display (Cavanagh et al., 2008). However, other studies employing different paradigms found integration of form (Nishida, 2004; Yin, Shimojo, Moore, & Engel, 2002). In recent studies, we have shown that non-retinotopic form processing can even occur with features close to the hyperacuity range, i.e., with stimuli of which the crucial features are in the range of photoreceptor spacing (Oğmen et al., 2006; Otto, Oğmen, & Herzog, 2006, 2008).
In the first experiment, we have shown evidence for non-retinotopic motion processing, i.e., motion that becomes apparent only after group motion is established (Figure 3d, ISI 210 ms). This motion is invisible when retinotopic integration occurs (Figure 3d, ISI 0 ms, no flank conditions). This motion processing can be computationally understood as a two-step process. The motion correspondences between Ternus–Pikler elements (e.g., disks) provide the reference frame (Mack, 1986) against which local motion is computed. From this perspective, our stimulus paradigm provides a powerful link between non-retinotopic processes and reference frames in perception (Bertamini & Proffitt, 2000; Duncker, 1929; Johansson, 1973).
Interestingly, retinotopic adaptation occurred in Example 2 as the result of coherent retinotopic motion whose coherence was invisible to the observer. The percept was that of incoherent motion (Figure 4b). This finding suggests that retinotopic motion detectors adapted “unconsciously” to the underlying retinotopic motion.
Based on these findings, we suggest that our test is a litmus test for several reasons. First, it is easy to implement by simply placing the stimuli of interest on the Ternus–Pikler display. Second, non-retinotopic effects can often be verified just by looking at the display (Videos 4 and 10). Third, very short-lived non-retinotopic processes can be detected. Fourth, retinotopic processing can be pitted against non-retinotopic processing. Fifth, the Ternus–Pikler display can easily be adapted to neurophysiological needs because the spatiotemporal parameters of the display can be flexibly adjusted without obliterating group motion.
Our test may provide a first, rough guide as to whether to record from retinotopic or non-retinotopic areas. We are confident that our simple but compelling approach can be applied to virtually any field of visual research, such as filling-in, reading, contrast detection, and the attentional blink, to name just a few.
This work was supported by the Swiss National Fund (SNF) projects “Dynamics of Feature Integration,” Pro*Doc “Processes of Perception”, and in part by award R01 EY018165 from the National Institutes of Health (NIH). The content is solely the responsibility of the authors and does not necessarily represent the official views of NIH.
Commercial relationships: none.
Marco Boi, Laboratory of Psychophysics, Brain Mind Institute, Ecole Polytechnique Fédérale de Lausanne (EPFL), Switzerland.
Haluk Öğmen, Department of Electrical and Computer Engineering, Center for Neuro-Engineering and Cognitive Science, University of Houston, Houston, TX, USA.
Joseph Krummenacher, Department of Psychology, University of Fribourg, Switzerland.
Thomas U. Otto, Laboratoire Psychologie de la Perception, Université Paris Descartes, France.
Michael H. Herzog, Laboratory of Psychophysics, Brain Mind Institute, Ecole Polytechnique Fédérale de Lausanne (EPFL), Switzerland.