Smooth pursuit eye movements transform 100 ms of visual motion into a rapid initiation of smooth eye movement followed by sustained accurate tracking. Both the mean and variation of the visually-driven pursuit response can be accounted for by the combination of the mean tuning curves and the correlated noise within the sensory representation of visual motion in extrastriate visual area MT. Sensory-motor and motor circuits have both housekeeping and modulatory functions, implemented in the cerebellum and the smooth eye movement region of the frontal eye fields. The representation of pursuit is quite different in these two regions of the brain, but both regions seem to control pursuit directly with little or no noise added downstream. Finally, pursuit exhibits a number of voluntary characteristics that happen on short time scales. These features make pursuit an excellent exemplar for understanding the general properties of sensory-motor processing in the brain.
Something happens. We act. Or not. That simple sequence characterizes most of our lives: Roger Federer returning service, bringing a cup of coffee to our lips, or moving our eyes to look at an object of interest all follow the same sequence from sensation to action. Of course, what happens between sensation and action is what makes us human. Sensory inputs provide a context for thoughts, decisions, and memories, and the consequent choices between action and inaction. To understand ourselves and provide a framework for applying basic research to improve human lives, we need to explicate fully how we process sensory inputs, make plans, and act.
For somatic movements such as reaching, we know the basic anatomical substrate for using sensory inputs to plan and execute movements. Signals from cortical sensory areas are relayed to the motor cortex through specific sensory-motor areas in the parietal cortex (Rizzolatti et al., 1998). There are precisely defined recurrent loops from the motor parts of the cerebral cortex to the basal ganglia or the cerebellum, and back (Middleton and Strick, 2000). Descending pathways from the cortex and the brainstem access motor synergies and modulate reflex pathways in the spinal cord, leading to coordinated activity in motoneurons (e.g., Drew et al., 2004). The resulting movements are, in general, both accurate and quite precise (e.g., Shadmehr and Mussa-Ivaldi, 1994). But, we do not understand what happens in the circuits between sensation and action. We know how sensory and motor events are represented in the brain, but not how the former are transformed to achieve the latter.
Eye movements, because they have many simplifying properties as neural systems, offer the opportunity to understand what happens between sensation and action. For example, the organization of the motor effectors is relatively simple for eye movements, and the final motor and pre-motor circuits in the brainstem are well understood. The cortical sensory-motor circuits for both saccadic and smooth pursuit eye movements follow the same general outline given above for reaching (Lynch and Tian, 2006). Studies of voluntary eye movements have elucidated: 1) how the parameters of a sensory stimulus are estimated from the response of a population of sensory neurons; 2) sensory and motor sources of motor variation; 3) the differences between cerebral and cerebellar control of movement; 4) the processing of signal and noise in sensory-motor circuits; 5) the neural basis for target choice and motor attention; and 6) mechanisms and roles for efference copy signals. Thus, eye movements provide an excellent model system for obtaining answers to general neuroscience questions. In addition, eye movements are of particular importance to primates like us, who rely heavily on vision and must move their eyes to point the foveae at stationary and moving objects of interest.
In the present review, I have two purposes. First, I summarize the work of my own laboratory on smooth pursuit eye movements over the past 20 years, placing an emphasis on how the full body of knowledge we have created fits together into a conceptual story of how a sensory-motor system works. I attempt to identify areas of possible controversy, disagreement, or alternate hypotheses, and to place our work in the context of others’. Still, to keep the review accessible to a general readership, many important but specialized points are finessed. Second, I place our knowledge of the neural mechanisms of pursuit eye movements in the larger context of general neuroscience issues in sensation, sensory-motor transformations, and motor control. I think this latter purpose is especially important because of the potential value of smooth pursuit eye movements for obtaining quantitative answers to central questions in systems neuroscience, such as the six issues enumerated in the previous paragraph.
In the 1970’s, neuroscientists discovered a specialized set of cortical visual areas that process visual motion (Dubner and Zeki, 1971; Kaas 1973). The middle temporal visual area, known as V5 or MT, is at the heart of those areas. It contains neurons that respond selectively to moving targets, and that are tuned for the direction and speed of motion across their receptive fields (Maunsell and van Essen, 1983). Earlier, Rashbass (1961) had shown that smooth pursuit eye movements, like the neurons in MT, respond selectively to visual motion. Rashbass used the “step-ramp” target motion shown at the top of Figure 1. Here, a subject starts by fixating a stationary target. At an unexpected time, the fixation target disappears and a tracking target appears eccentric in the receptive field, moving back towards its initial position. The step-ramp motion creates a competition between target position and motion: if pursuit is driven by signals that report target position relative to the eye, then smooth eye movement should be initiated toward the position of the target; if pursuit is driven by target motion, then smooth eye movement should take the eye in the direction of target motion, away from target position. As illustrated in Figure 1, reality corresponds to the latter situation. Pursuit is initiated in the direction of target motion and a subsequent saccadic eye movement (arrow in Figure 1A) corrects the residual difference between the positions of the eye and target.
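The competition between position and motion in the step-ramp paradigm can be made concrete with a few lines of Python. This is only a sketch: the step size (3 deg to the left) and ramp speed (15 deg/s to the right) are hypothetical values chosen for illustration, and the eye is assumed to remain at the original fixation point, as it does during the pursuit latency.

```python
def step_ramp_position(t_ms, step_deg=-3.0, speed_deg_s=15.0):
    """Target position (deg) relative to the original fixation point.

    Before t = 0 the target sits at the fixation point. At t = 0 it jumps
    to step_deg and then moves at speed_deg_s back toward its starting
    position. Default values are hypothetical.
    """
    if t_ms < 0:
        return 0.0
    return step_deg + speed_deg_s * t_ms / 1000.0


speed = 15.0  # deg/s; positive = rightward (hypothetical)
step = -3.0   # deg; negative = leftward (hypothetical)

for t in (0, 50, 100, 150):
    # With the eye still at 0 deg, target position equals position error.
    error = step_ramp_position(t, step, speed)
    print(f"t = {t:3d} ms  position error = {error:+.2f} deg  "
          f"target velocity = {speed:+.1f} deg/s")
```

Throughout this early interval the position error is negative (leftward) while the target velocity is positive (rightward), so the direction of the first smooth eye movement tells us which signal is driving pursuit.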
The convergent findings of a movement and a cortical area that are both selectively responsive to target motion led to tests of the hypothesis that area MT provides the sensory inputs that drive pursuit. Indeed, lesions within MT caused deficits in the use of visual motion for guiding pursuit eye movements (Newsome et al., 1985). Importantly, MT has a topographic organization so that different parts of the visual field are represented in an orderly way across the full extent of MT. Lesions confined to the parts of MT that process motion signals from defined, small areas of the visual field revealed a motion scotoma for the initiation of pursuit. If the targets used to initiate pursuit moved across unaffected parts of the visual field, then the initiation of pursuit was normal, indicating that the ablations had not affected the animal’s ability to generate the motor act itself. If the targets moved across the affected part of the visual field, then monkeys were not able to initiate pursuit. Instead, they waited for the target to move outside the region represented by the lesion, or used a saccade to move the target into an intact region of the visual field. Further experiments showed that microstimulation within MT could alter the speed and direction of pursuit with latencies as short as 25 ms (Groh et al., 1997; Carey et al., 2005). Thus, it seems reasonable to think of MT as a major source of the sensory inputs that guide pursuit eye movements, with the understanding that other cortical areas also may contribute.
Inspection of the velocity traces in Figure 1 reveals that there are two very different phases of pursuit. During the latency before the initiation of pursuit and the initial rising phase, the target moves with respect to the eye and there is substantial image motion across the retina (shaded area in velocity traces) to drive the visual motion system and pursuit (Lisberger and Westbrook, 1985). Later, eye velocity matches target velocity almost perfectly so that there is little or no image motion to drive pursuit (ignoring the large downward spike of eye velocity caused by the saccade). Indeed, using a computer to stabilize the target with respect to the moving eye during steady-state pursuit reveals that steady-state eye velocity persists almost perfectly in the absence of image motion (Morris and Lisberger, 1987). In the absence of a continuing drive, the elasticity and viscosity of the orbit would cause the eye to come to a halt within 200 ms. Extra-retinal signals must provide the neural drive to keep the eye moving. Consequently, we think of two separate phases of pursuit – initiation versus steady-state – with different, retinal versus extra-retinal, control signals and probably different control strategies in the brain.
The representation of image motion in MT is a “place code”. MT neurons are tuned for the speed and direction of image motion, with different neurons showing maximum responses for different “preferred” speeds and directions (Maunsell and van Essen, 1983). Thus, any given target motion causes activity in many MT neurons, with the largest response in the neurons whose preferred direction and speed match the target motion. Because of the tuned responses of MT neurons, neither the firing rate of any single neuron nor the average response amplitude across neurons provides an unambiguous estimate of target direction and speed. These estimates must come from comparing the responses of neurons across the population. The nature of the place code in MT contrasts with the rate code that represents and drives pursuit in the motor system (Groh, 2001). In the floccular complex of the cerebellum, for example, pursuit is represented largely by two groups of Purkinje cells that prefer eye motion either towards the side on which they reside, or downward (Krauzlis and Lisberger, 1996). Each Purkinje cell participates in all pursuit movements in or near its preferred direction, showing firing rates that scale monotonically with pursuit speed (Lisberger and Fuchs, 1978).
Figure 2 illustrates the problem of estimating sensory parameters from a population response. It diagrams a cartoon population response of 7 MT neurons, each tuned for a different speed of image motion. Suppose that we deliver the motion of a bright target at a speed of 6 deg/s or 10 deg/s. Because of their speed tuning, each neuron responds differently to the two speeds of motion. For the slower speed, the neuron with a preferred speed of 6 deg/s emits 6 spikes while the other neurons emit fewer spikes. The population response shows a peak at 6 deg/s. For the faster speed, the neuron with a preferred speed of 6 deg/s now emits only 3 spikes, while the neuron with a preferred speed of 10 deg/s now emits 6 spikes. For each target motion, it is obvious by inspection that we can guess (“estimate”) the speed of the stimulus by reporting the preferred speed of the neurons with the largest response. To understand how the brain decodes the population response, however, we need a way to perform this “inspection” mathematically.
Simply adding up the total activity across the population response yields 18 spikes for the bright targets moving at either 6 or 10 deg/s and does not allow us to discriminate between the two speeds. Labeling each neuron according to its preferred speed helps. Now, the weighted sum, calculated by summing the product of preferred speed and number of spikes across the population, allows us to discriminate between the motion of the bright targets at 6 and 10 deg/s. Consider, however, the smaller responses (Sclar et al., 1990) of the same population of model MT neurons for a dim target that moves at 10 deg/s. The weighted sum is smaller for motion at 10 deg/s for the dim than for the bright target: decoding with a weighted sum of the population activity risks confounding slow motion of a bright target with fast motion of a dimmer target.
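The failure of the two simple decoders can be checked numerically. The sketch below uses hypothetical spike counts for 7 model neurons; they are not read from Figure 2, but were chosen to agree with the numbers quoted in the text (18 spikes in total for each bright target, and 6 versus 3 spikes for the neuron that prefers 6 deg/s). The dim target is assumed to scale every response by a common factor of 0.6, which is also an illustrative choice.

```python
# Hypothetical mean spike counts for 7 model MT neurons (illustration only).
preferred_speed = [2, 4, 6, 8, 10, 12, 14]   # deg/s, assumed spacing
bright_6 = [3, 3, 6, 3, 3, 0, 0]             # bright target at 6 deg/s
bright_10 = [0, 0, 3, 3, 6, 3, 3]            # bright target at 10 deg/s
dim_10 = [0.6 * n for n in bright_10]        # dim target at 10 deg/s


def total_activity(responses):
    """Sum of spike counts across the population."""
    return sum(responses)


def weighted_sum(responses, preferred):
    """Sum of (spike count x preferred speed) across the population."""
    return sum(r * p for r, p in zip(responses, preferred))


for label, resp in (("bright, 6 deg/s", bright_6),
                    ("bright, 10 deg/s", bright_10),
                    ("dim, 10 deg/s", dim_10)):
    print(f"{label:17s} total = {total_activity(resp):5.1f}  "
          f"weighted sum = {weighted_sum(resp, preferred_speed):6.1f}")
```

With these counts, total activity is identical for the two bright targets, and the weighted sum for the dim target at 10 deg/s matches that for the bright target at 6 deg/s, which is exactly the confound described above.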
Adding one more element in the decoding computation solves the problem. An unambiguous estimate of target speed emerges if we normalize the weighted sum through division by the total activity across the population. Now, the estimate of target speed is equal to the actual target speed and is insensitive to the amplitude of the population response. This way of estimating target speed for a population response is called “vector averaging”, and is represented by the equation:
$$\text{estimated target speed} = \frac{\sum_i MT_i \, PS_i}{\sum_i MT_i} \qquad (1)$$
where MT_i and PS_i are the response and the preferred speed of the i-th MT neuron. Salinas and Abbott (1994) showed that the use of vector averaging to determine the center-of-mass of the population response is approximately an optimal linear estimator for a well-behaved population response. We regard vector averaging as a way of quantifying the content of the population response in MT, but without any strong implications for an implementation in the brain. Weighting of each neuron’s response by its preferred speed in the numerator could be accomplished by synaptic strengths. Normalization by the total population activity in the denominator is somewhat more challenging to implement, because of the need to divide with neurons, although a number of neural mechanisms have been proposed in theoretical and computational papers (Heeger, 1993; Chance et al., 2002).
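The insensitivity to response amplitude follows directly from the form of the normalized weighted sum. As a sketch, assume that a change in target contrast scales every neuron's response by a common gain g > 0 (an idealization introduced here for illustration, not a claim about measured MT responses):

```latex
\frac{\sum_i (g\,MT_i)\,PS_i}{\sum_i g\,MT_i}
  = \frac{g \sum_i MT_i\,PS_i}{g \sum_i MT_i}
  = \frac{\sum_i MT_i\,PS_i}{\sum_i MT_i}
```

The gain cancels, so the estimate depends only on how activity is distributed across preferred speeds. By the same algebra, a gain that differs across neurons does not cancel and can shift the estimate.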
The strong relationship between target speed and the magnitude of the change in eye velocity in the first 100 ms of the initiation of pursuit (Lisberger and Westbrook, 1985) would be expected if vector averaging describes the population decoding computation used by the brain. Stronger support comes from the effects of degrading the population response with sampled motion. When we presented targets that flashed sequentially at different locations separated systematically in space (Δx) and time (Δt), we found an illusion of increased estimates of target speed by both pursuit and perception (Churchland and Lisberger, 2000, 2001). For example, Figure 3C shows time averages of eye velocity in response to target motion at 15 deg/s for smooth target motion (black) versus target motion that is flashed at spatially separated locations every 32 ms (red trace). After a slight delay, the eye velocity for degraded motion rose more rapidly to a higher level of eye velocity compared to the eye velocity for smooth motion.
We think of the magnitude of the initial pursuit response as a probe for the properties of the visual signals that are driving pursuit. This view is justified by the fact that the latency from visual motion to eye motion is 100 ms, so that the first 100 ms of the pursuit response is driven only by visual motion and can be considered as the “open-loop” response of the system. Therefore, we take the change in eye velocity in the first 100 ms of pursuit as an index of the estimate of target speed that is driving pursuit, and we can think of the increased pursuit response as an illusion of increased target speed for degraded motion. Figure 3D quantifies the illusion of increased target speed by plotting the normalized pursuit response as a function of the interval between flashes of the target moving in sampled motion (black symbols and lines). Pursuit’s estimate of target speed increases as a function of the temporal interval between target flashes up to an interval of 48 ms before declining at longer intervals. Perception’s estimates show the same illusion (Churchland and Lisberger, 2001).
To our surprise, every MT neuron we recorded from showed a decreased response amplitude for the same degraded motion that created increased estimates of target speed (Churchland and Lisberger, 2001). We inferred, therefore, that the population response was indeed smaller for degraded motion than for smooth motion. This observation shows that smaller population responses do not necessarily lead to estimates of lower target speeds, and contradicts the predictions of any decoding computation that uses the amplitude of the population response alone to estimate speed.
Figure 3 shows how the population response for degraded motion can have a decreased amplitude but still lead to estimates of faster target motion, compared to smooth motion. For smooth motion, assume that all 5 neurons in a model population have peak responses of 100 (Figure 3A, “smooth”). When the target moves at 20 deg/s, neurons with preferred speed close to 20 will have close to maximal responses and neurons with higher or lower preferred speeds will have smaller responses. We can view the population response (Figure 3B, blue symbols and curve) by plotting each model neuron’s response as a function of its preferred speed. With more model MT neurons, the (mean) population response would be described by the blue curve, which has a peak of 100 for neurons with preferred speeds of 20. Decoding the population response with vector averaging would yield a speed estimate of 20 deg/s. For degraded target motion, suppose that the reduction in tuning curve amplitude is greater for neurons with smaller preferred speeds (Figure 3A, “degraded”). Now, creating the population responses by plotting each neuron’s response to target motion at 20 deg/s reveals a smaller population response that is shifted to the right, towards higher preferred speeds (Figure 3B, red symbols and curve). Decoding the population response with vector averaging would yield an estimate that speed was faster than 20 deg/s.
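The argument can be reproduced with a toy population. The sketch below is an assumption-laden illustration, not the model behind Figure 3: it uses 5 neurons with Gaussian speed tuning on a linear speed axis, a hypothetical tuning width of 5 deg/s, and hypothetical gain factors for degraded motion that are smallest for the neurons with the lowest preferred speeds.

```python
import math


def gaussian_tuning(stim_speed, pref_speed, peak=100.0, width=5.0):
    """Response of a model neuron; the Gaussian shape and width are assumed."""
    return peak * math.exp(-((stim_speed - pref_speed) ** 2) / (2.0 * width ** 2))


def vector_average(responses, preferred):
    """Weighted sum of preferred speeds, normalized by total activity."""
    return sum(r * p for r, p in zip(responses, preferred)) / sum(responses)


preferred_speed = [10.0, 15.0, 20.0, 25.0, 30.0]   # deg/s, hypothetical
target_speed = 20.0

# Smooth motion: every neuron has a peak response of 100.
smooth = [gaussian_tuning(target_speed, p) for p in preferred_speed]

# Degraded motion: amplitude is reduced most for low preferred speeds.
gain = [0.3, 0.4, 0.5, 0.6, 0.7]                    # hypothetical values
degraded = [g * r for g, r in zip(gain, smooth)]

print("total activity, smooth vs degraded:", sum(smooth), sum(degraded))
print("estimate for smooth motion:  ", vector_average(smooth, preferred_speed))
print("estimate for degraded motion:", vector_average(degraded, preferred_speed))
```

The degraded population response is smaller overall, yet its center of mass lies above 20 deg/s because the low-preferred-speed side of the population has been suppressed more than the high side.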
The scenario outlined in Figures 3A and B is exactly what we observed in the responses of MT neurons for smooth versus degraded target motion (Churchland and Lisberger, 2001). We think that the effect was greater in neurons with lower preferred speeds because the smaller spatial limits of their receptive field mechanisms were exceeded at lower spatial separations in the degraded target motion. When we used vector averaging to decode MT population responses for target motions of the same speed but different intervals between flashes, we found an effect that paralleled the illusion in pursuit perfectly (Figure 3D, red symbols and lines). We conclude that the illusion of increased target speed results from a systematic effect of degraded motion on the sensory population response in MT. The success of Figure 3 in accounting for the effects of apparent motion stimuli on pursuit initiation makes us mindful that pursuit performance may be determined primarily by the sensory representation of motion, a conclusion similar to those outlined by Pack and Born (2001) for pursuit and in the work of Miles and colleagues (e.g., Sheliga et al., 2008) for visually-driven ocular following.
For now, we regard vector averaging as a metaphor rather than as a neural mechanism for estimating target speed. Indeed, it may be an imperfect metaphor because it does not account for the relationship between MT responses and pursuit initiation for all target forms and contrasts (Krekelberg et al., 2006; Priebe and Lisberger, 2004). At the same time, an elegant experiment in the saccadic system showed that vector averaging is an excellent metaphor for the process that decodes the population response in the superior colliculus to control saccade direction and amplitude (Lee et al., 1988). Thus, vector averaging may be a useful and general way to think about neural population decoding. We have found that a number of other mechanisms with potential neural implementations, such as statistically-motivated computations (Deneve et al., 1999; Beck et al., 2008) and maximum likelihood calculations (Jazayeri and Movshon, 2006), estimate target motion with means and variances quite similar to those obtained from Equation (1). Therefore, we take advantage of Equation (1) as a simple way to determine the center of mass of the MT population response. At the same time, we leave aside the unanswered, and in our view important, question of how the brain actually estimates target speed and direction from the population response in MT, or decodes any other neural population response for that matter.
We showed above that the mean MT population response predicts mean pursuit eye velocities for degraded, sampled target motion as well as for normal, smooth target motion. However, our analysis so far is based on mean responses of MT neurons and eye movements, and mean responses may not reveal as much as we’d like about nervous system function. To quote an unidentified meteorologist I heard on the radio quite a few years ago: “there is no such thing as a normal winter’s rainfall, just an average winter’s rainfall, and that never happens”. The same is true of the brain. Neuroscientists frequently measure average responses across repetitions of the same sensory stimulus, but the brain does not know about the mean responses of neurons. It must act on the basis of the brief, single responses of many neurons for a single target motion. In the case of MT and pursuit, action must be based on just a few action potentials in each of many MT neurons. Fortunately, at least under a limited set of stimulus conditions that are similar to those used for pursuit, MT neurons provide >80% of their maximal information about the direction of target motion within the first 100 ms of their response (Osborne et al., 2004).
Refocusing on the real problem solved by the brain raises the key question of understanding trial-by-trial variations in pursuit and neuron responses. Because neural responses are quite variable, there is a risk that motor behavior also could be quite variable. In pursuit, this is true. In Figure 4A, the black line shows the mean pursuit eye velocity as a function of time and the gray ribbon indicates the impressive trial-by-trial variance. In contrast, the eye velocity of the vestibulo-ocular reflex (VOR) is about 7 times more precise for a similar mean eye velocity trajectory (Figure 4B). Because pursuit and the VOR use the same brainstem interneurons and extraocular motoneurons, the variation in the initiation of pursuit cannot be attributed to noise added by the final motor pathways.
We think that the variation in pursuit results from sensory errors in estimating target speed. In support of our view, quantitative analysis showed that almost 95% of pursuit variation can be understood in terms of sensory errors in estimating the speed, direction, and time of onset of target motion (Osborne et al., 2005). Further, the magnitude of the errors in estimating these parameters for pursuit predicts an ability to discriminate between different directions or speeds of target motion that is only slightly larger than found in tests of perception (Osborne et al., 2007). The similarity of the estimation errors for pursuit and perception suggests that the variation in pursuit arises mainly from noise in the sensory representation in MT, in a population response that is shared by neural sub-systems responsible for perception and action. In contrast, perhaps little noise is added downstream in the private components of the perceptual or pursuit systems (Figure 4, schematic diagram). This implies that the motor system downstream from the sensory representation of target motion follows the sensory estimates of target speed and direction almost perfectly, even when those estimates are erroneous or noisy. The work of Gegenfurtner et al. (2003) does not agree totally with our conclusion of a shared noise source for perception and action, but a number of other studies have postulated shared visual inputs for pursuit and perception (e.g., Stone and Krauzlis, 2003; Pack and Born, 2001).
Neurons in MT, and most of the rest of the brain, are noisy – the trial-by-trial variance of spike count is approximately equal to the mean spike count as expected for neurons with Poisson spiking statistics. Why is this noise not eliminated when the brain averages across the many MT neurons that must be active for any target motion? The answer lies in noise correlations across MT. The spike counts of neurons with similar stimulus preferences tend to fluctuate up and down together from trial to trial (Zohary et al., 1994; Bair et al., 2001; Huang and Lisberger, 2009). Correlated fluctuations cannot be eliminated by averaging across neurons, and therefore have the potential to appear in the ultimate behavioral output.
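The reason averaging cannot remove correlated noise can be seen from the standard expression for the variance of a mean. The sketch below assumes the simplest possible case, in which all neurons have the same variance and every pair shares the same correlation; the correlation value of 0.2 is hypothetical and is not an estimate from MT recordings.

```python
def variance_of_population_mean(n_neurons, single_var, corr):
    """Variance of the average of n equally noisy neurons.

    Assumes every neuron has variance single_var and every pair of
    neurons has the same noise correlation corr.
    """
    independent_part = single_var / n_neurons
    shared_part = (1.0 - 1.0 / n_neurons) * corr * single_var
    return independent_part + shared_part


for n in (1, 10, 100, 1000, 10000):
    uncorrelated = variance_of_population_mean(n, 1.0, 0.0)
    correlated = variance_of_population_mean(n, 1.0, 0.2)
    print(f"N = {n:5d}  r = 0: {uncorrelated:.5f}   r = 0.2: {correlated:.5f}")
```

With independent noise the variance of the average falls as 1/N, whereas with correlated noise it approaches a floor equal to the correlation times the single-neuron variance, no matter how many neurons are averaged.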
If we make a model MT population response with the same trial-by-trial variation in spike count found in MT neurons, and the same neuron-neuron correlations, then we find, indeed, that the trial-by-trial variation in estimates of target speed is comparable to that found in pursuit behavior. In Figure 5A, we assemble population responses, defined as the response of all individual neurons in the population plotted as a function of their preferred speeds. As in Figure 3B, it is important to remember that the curves in Figure 5A are not the tuning curves of individual neurons. As expected, the mean population response (red curve) is smooth. In contrast, there is considerable trial-by-trial variation in population responses for individual target motions (black curves), and this trial-by-trial variation leads to estimates of target speed that also vary from trial to trial. If we had made a large population of model MT neurons with the variation of the individual neurons contrived to be independent of all others, then the variation across the population response would average away and the estimates of target speed (and pursuit) would be quite precise. However, Figure 5B summarizes our data (Huang and Lisberger, 2009) showing significant neuron-neuron correlation of spike counts in the first 150 ms of the response of MT neurons, the relevant time interval for pursuit. Importantly, the correlations have structure: pairs of MT neurons with similar preferred speeds were much more likely to show statistically significant MT-MT correlations (red symbols), most of which were positive.
Using Equation (1) to decode target speed from a realistic model population response that includes structured neuron-neuron correlations reveals a single important principle. For large populations of neurons (here ~4,000), the variance of the estimate of target speed increases as a function of the peak MT-MT correlation in the model population, as long as the correlations have the structure described by Figure 5B. For example, the model with the peak correlation (0.36) that best described our recordings from pairs of MT neurons produced an asymptotic variance of target speed estimate of 0.8 (Figure 5C). In contrast, if the correlation between MT neurons lacks the structure shown in Figure 5B but instead is flat, so that strong MT-MT correlations with values of 0.3 are the same for all pairs of neurons, then the variances of estimated target speed are little different from those for an uncorrelated model MT population (R: MT-MT equals zero). This counterintuitive result can be understood by remembering that Equation (1) finds the center of gravity in terms of the preferred speed of the neurons with the largest response, but does so by taking into account the responses of all active neurons. If the entire population code fluctuates up and down in a correlated fashion, then the center of the population response will not fluctuate and the estimate of target speed will be very reliable. If, on the other hand, the spike counts of neurons of similar preferred speeds fluctuate up and down together while those of different preferred speeds fluctuate independently, then the center of mass of the population response will be much more variable.
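The principle can be illustrated with a small simulation. This is a deliberately crude sketch and not the model behind Figure 5: it assumes 200 neurons with Gaussian tuning on a linear speed axis, Gaussian noise whose variance equals the mean response, and a "structured" condition in which neurons share noise only within blocks of 20 neighbors along the preferred-speed axis. All parameter values, including the correlation of 0.3, are hypothetical.

```python
import math
import random


def estimate_variance(structure, corr=0.3, n_neurons=200, n_trials=500, seed=0):
    """Variance across trials of the vector-average speed estimate.

    structure: "independent", "flat" (one noise source shared by all
    neurons), or "structured" (noise shared only within blocks of
    neurons that have similar preferred speeds).
    """
    if structure not in ("independent", "flat", "structured"):
        raise ValueError(f"unknown structure: {structure}")
    rng = random.Random(seed)
    target, width, peak, block = 20.0, 8.0, 20.0, 20
    pref = [40.0 * (i + 0.5) / n_neurons for i in range(n_neurons)]
    mean = [peak * math.exp(-((p - target) ** 2) / (2.0 * width ** 2)) for p in pref]
    sd = [math.sqrt(m) for m in mean]          # Poisson-like: variance = mean
    r = 0.0 if structure == "independent" else corr
    w_shared, w_private = math.sqrt(r), math.sqrt(1.0 - r)
    estimates = []
    for _ in range(n_trials):
        if structure == "flat":
            g = rng.gauss(0.0, 1.0)
            shared = [g] * n_neurons
        else:
            per_block = [rng.gauss(0.0, 1.0) for _ in range(n_neurons // block + 1)]
            shared = [per_block[i // block] for i in range(n_neurons)]
        resp = [max(0.0, m + s * (w_shared * sh + w_private * rng.gauss(0.0, 1.0)))
                for m, s, sh in zip(mean, sd, shared)]
        estimates.append(sum(x * p for x, p in zip(resp, pref)) / sum(resp))
    mu = sum(estimates) / n_trials
    return sum((e - mu) ** 2 for e in estimates) / (n_trials - 1)


variances = {s: estimate_variance(s) for s in ("independent", "flat", "structured")}
for name, value in variances.items():
    print(f"{name:12s} variance of speed estimate = {value:.4f}")
```

In this toy model the flat correlation leaves the variance of the estimate close to the uncorrelated case, because a fluctuation shared by the whole population changes its amplitude but not its center of mass. Correlations confined to neurons with similar preferred speeds move the center of mass from trial to trial and inflate the variance several-fold.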
The analysis presented so far shows that the variation in estimates of target speed and direction by pursuit is approximately the same in magnitude as that for perception, and that the variation in the response of the correlated population of neurons in MT could be the source of the variation in pursuit. Further, for most target forms, the mean center of the MT population response is closely related to the mean estimate of target speed by pursuit. We suggest that most features of the initiation of pursuit can be explained by the mean, variation, and neuron-neuron correlations in the sensory population response in MT. Of course, the correlated fluctuations in MT responses probably arise much earlier in the visual system, perhaps even in the retina. However, their presence in MT seems to be sufficient to control the accuracy and precision of the initiation of pursuit. Indeed, it is not necessary to posit that any additional variation is added to the initiation of pursuit downstream from the population decoding step of sensory-motor processing. However, it also is important to remember that we have analyzed only the visually-driven initiation of pursuit. Later in the pursuit response, real motor noise may accumulate far downstream, as postulated by Harris and Wolpert (1998).
So far, we have provided evidence that the mean and variance of the initiation of pursuit can be assigned to the properties of the sensory representation in area MT. We have suggested that the downstream sensory-motor and motor circuits must be reacting reliably to the estimates of target speed and direction that emanate from MT. The next question is how pursuit is represented in different parts of the pursuit circuit and whether those representations help us understand what neural computations are performed in different areas.
The skeleton circuit diagram for pursuit eye movements (Lynch and Tian, 2006) appears in Figure 6A. From MT, motion signals are transmitted via a cortico-cortical pathway through the parietal sensory-motor cortex to the smooth eye movement region of the frontal eye fields (FEFSEM) (Stanton et al., 2005). There is some evidence (Dursteler et al., 1988), but no definitive proof that area MST is the parietal area for pursuit. In other motor systems, the relationship between parietal neurons and a specific movement has been established by the presence of activity that builds up in the “delay” interval between presentation of a target and execution of a movement. In LIP and MIP, for example, delay activity occurs selectively for saccadic eye movements and reaching movements, respectively, but not for the other (Snyder et al., 1997). The recent development of a delay period task for pursuit, and the demonstration of direction selective delay period activity in the supplementary eye fields (Shichinohe et al., 2009), offers the chance for a similar identification of the parietal sensory-motor neurons for pursuit.
From MT, the parietal sensory-motor cortex, and the FEFSEM, signals are transmitted through a variety of brainstem relay nuclei to at least two regions of the cerebellum, the oculomotor vermis and the floccular complex (for review, see Lynch and Tian, 2006). The floccular complex can affect extraocular motoneurons via a disynaptic pathway that involves an identified group of “floccular target neurons” or “FTNs” in the vestibular nuclei (Lisberger et al., 1994; Scudder and Fuchs, 1992). In addition, the FEFSEM provides a strong projection to the basal ganglia (Cui et al., 2003), where neurons have been found that discharge in relation to pursuit (Basso et al., 2005; Lynch, 2009) and PET studies identify a region of strong activity during pursuit (O’Driscoll et al., 2000). The floccular complex and FTNs have been implicated in motor learning in both the vestibulo-ocular reflex (Lisberger, 1994) and pursuit (Kahlon and Lisberger, 2000; Medina and Lisberger, 2008, 2009), perhaps using similar mechanisms, while the oculomotor vermis plays an important role in motor learning for saccadic eye movements (Soetedjo and Fuchs, 2006), suggesting that it could have a similar function for pursuit.
The textbook picture of the discharge of neurons in the pursuit circuit beyond MT is that neurons in different parts of the circuit transmit very similar neural signals. Indeed, in at least area MST (Newsome et al., 1988), the FEFSEM (MacAvoy et al., 1991), the dorso-lateral pontine nucleus (Mustari et al., 1988), the cerebellar floccular complex (Stone and Lisberger, 1990), and the oculomotor vermis (Shinmei et al., 2002), many neurons show a response to visual motion at the initiation of pursuit and some degree of sustained firing related to eye velocity during steady-state pursuit. During the latter, steady-state, phase of pursuit, all areas seem to combine signals related to a) eye velocity in the head and b) head velocity in the world to represent eye velocity in the world, called “gaze velocity” (Lisberger and Fuchs, 1978; Thier and Erickson, 1992; Fukushima et al., 2000; Shinmei et al., 2002; Ono et al., 2004). The re-emergence of the same basic signal throughout the circuit underscores the world coordinate system as the fundamental reference frame used to compute signals that drive pursuit. The broad similarity of neural responses in different areas of the pursuit circuit could be related to the existence of recurrent loops from broad areas of the cerebral cortex through both the cerebellum and the basal ganglia (Middleton and Strick, 2000).
The small rasters in the circuit diagram of Figure 6A provide some details of the discharge of neurons in different parts of the pursuit circuit, and illustrate one of the major processing tasks of the pursuit circuitry. The responses of neurons in area MT are transient, as expected given the transient presence of visual image motion just before and during the initiation of pursuit (shaded area in Figure 1). The responses of Purkinje cells in the cerebellum consist of a transient that is thought to arise from visual inputs (Miles and Fuller, 1975; Stone and Lisberger, 1990), followed by a sustained response that persists along with eye velocity when a target is stabilized with respect to the moving eye during steady-state pursuit (Stone and Lisberger, 1990). Recall that eye velocity also is sustained through an interval of image stabilization during steady-state tracking (Morris and Lisberger, 1987). Thus, the transformation from sensation to action must include converting a transient sensory response in MT into sustained extra-retinal signals that drive a sustained motor response.
Careful scrutiny reveals that the responses of neurons in the FEFSEM (Schoppik et al., 2008) are more similar to those of floccular neurons than of MT neurons, but still differ in potentially important ways. The raster for one FEFSEM neuron in Figure 6A, for example, shows an early peak in firing related to the onset of pursuit, and a later peak during steady-state tracking. The differences between floccular and FEFSEM responses are illustrated in Figures 6B and C, where the color of each pixel indicates firing rate normalized for each neuron’s maximum, and each horizontal line shows the firing of a different neuron as a function of time from 100 ms before to 500 ms after the onset of target motion. In effect, each horizontal line uses color to represent the peri-stimulus time histogram for a single neuron. Neurons are stacked from bottom to top according to the time when they reached 95% of their peak firing rates. In the FEFSEM (Figure 6B), the onset of the response occurs over the full time range of the movement in different neurons, and each neuron reaches a narrow peak at different times throughout the first 500 ms of target motion. Thus, different neurons in the FEFSEM appear to participate most strongly in pursuit for brief intervals that occur at different times during the movement, and the full population of neurons tiles the entire movement time. In the floccular complex of the cerebellum (Figure 6C), in contrast, all Purkinje cells start to fire between 110 and 140 ms after the onset of target motion. They then reach broad peaks, mostly within the first 200 ms of pursuit, and usually continue to fire throughout the pursuit movement.
Differences in the relationship between firing rate and eye movement for the floccular complex versus the FEFSEM underscore likely differences in the function of the two areas during pursuit. For floccular Purkinje cells, quantitative analysis of the dynamics of firing rate in terms of the parameters of eye motion reveals that >90% of the variance of each Purkinje cell’s average simple spike response as a function of time can be characterized as a linear combination of eye position, velocity, and acceleration (Shidara et al., 1993; Krauzlis and Lisberger, 1994; Hirata and Highstein, 2001; Medina and Lisberger, 2009). Analysis of the kinematics reveals that most PCs have preferred directions aligned with the pulling directions of one pair of the extraocular muscles (Krauzlis and Lisberger, 1996). Thus, each Purkinje cell participates throughout the pursuit eye movement, and is concerned with dynamics and kinematics independent of time. We imagine that floccular Purkinje cells operate at a low level to generate the correct levels of muscular force in particular muscles. In contrast, more complex models are needed to account for the relationship between the average firing of FEFSEM neurons and eye movement, and the models account for ~50% of the variance, mainly because of the temporally restricted responses and higher variation in the firing rates of FEFSEM neurons (Schoppik et al., 2008). Therefore, we imagine that the FEFSEM works at a higher level to regulate pursuit in a temporally-selective fashion. We suggest that the FEFSEM divides the pursuit movement into small time epochs and allows independent control of pursuit performance and/or learning in each time epoch (Schoppik et al., 2008).
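The linear description of Purkinje cell firing can be made concrete with a small regression sketch. The code below is only an illustration, not the analysis used in the cited studies: it fits rate ≈ a·position + b·velocity + c·acceleration + d by ordinary least squares and reports the fraction of variance explained. The function names, the synthetic eye-motion profile, and the coefficient values are all hypothetical, and the published analyses may include details (for example, a time lag between firing and eye motion) that are omitted here.

```python
import math


def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - known) / aug[r][r]
    return x


def fit_kinematic_model(rate, pos, vel, acc):
    """Least-squares fit of rate ~ a*pos + b*vel + c*acc + d.

    Returns ([a, b, c, d], r_squared).
    """
    rows = [[p, v, a, 1.0] for p, v, a in zip(pos, vel, acc)]
    k = 4
    # Normal equations: (X^T X) coeffs = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, rate)) for i in range(k)]
    coeffs = solve_linear_system(xtx, xty)
    pred = [sum(c * x for c, x in zip(coeffs, r)) for r in rows]
    mean_rate = sum(rate) / len(rate)
    ss_res = sum((y - p) ** 2 for y, p in zip(rate, pred))
    ss_tot = sum((y - mean_rate) ** 2 for y in rate)
    return coeffs, 1.0 - ss_res / ss_tot


def synthetic_eye_motion(n_steps=500, dt=0.001):
    """Hypothetical pursuit-like kinematics (not recorded data).

    Eye velocity rises to 20 deg/s with a damped oscillation; position is the
    running sum of velocity and acceleration is the analytic derivative.
    """
    pos, vel, acc = [], [], []
    p = 0.0
    for i in range(n_steps):
        t = i * dt
        decay = math.exp(-t / 0.05)
        v = 20.0 * (1.0 - decay * math.cos(30.0 * t))
        a = 20.0 * decay * (math.cos(30.0 * t) / 0.05 + 30.0 * math.sin(30.0 * t))
        p += v * dt
        pos.append(p)
        vel.append(v)
        acc.append(a)
    return pos, vel, acc


# Build a noise-free model firing rate from made-up coefficients, then recover them.
pos, vel, acc = synthetic_eye_motion()
rate = [1.0 * p + 2.0 * v + 0.05 * a + 80.0 for p, v, a in zip(pos, vel, acc)]
coeffs, r_squared = fit_kinematic_model(rate, pos, vel, acc)
print("fitted coefficients:", coeffs)
print("fraction of variance explained:", r_squared)
```

Because the synthetic rate is built without noise, the fit here is essentially exact by construction; with real average firing rates, the variance explained would be the quantity of interest (>90% for floccular Purkinje cells, ~50% for FEFSEM neurons with more complex models, as described above).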
Our evaluation of the trial-by-trial variation in pursuit behavior suggested to us that we might learn more about neural processing for pursuit by recording the neural and behavioral variation simultaneously during many repetitions of the same tracking target motion. This is a different axis of analysis from the standard approach of studying the time course of mean firing rates, and it provided two insights. First, it revealed another way in which responses of neurons in the floccular complex and the FEFSEM differed, and provided additional evidence that the two structures have different roles in pursuit. Second, it provided some unexpected insights into how signal and noise are processed in the brain.
We measured “neuron-pursuit correlations” by recording the neural and behavioral responses to many repetitions of the same target motion, and then computing the trial-by-trial correlation between firing rate and eye movement at every millisecond throughout the movement, for each neuron individually. For example, Figure 7A summarizes the correlation between eye movement and firing rate for one Purkinje cell at one time during the initiation of pursuit. Each point plots data from an individual trial, showing a strong co-variation of instantaneous firing rate and eye velocity even though the visual stimulus was the same in each trial. Across time (Figure 7B), the time average of neuron-pursuit correlation for floccular Purkinje cells was close to zero before the onset of pursuit, reached a peak of almost 0.6, accounting for 36% of variance, during the initiation of pursuit, and then settled to a lower, non-zero level during steady-state tracking.
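As a rough illustration of the measurement, the sketch below computes a Pearson correlation across trials between firing rate and eye velocity, separately at each time point. It is a minimal sketch under stated assumptions, not the authors' analysis code: the function names, the simulated trials, and the way a shared trial-by-trial fluctuation is injected are all hypothetical. The squared correlation gives the fraction of variance shared, which is how a correlation of 0.6 corresponds to 36% of the variance.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def neuron_pursuit_correlation(rates, eye_vel):
    """Trial-by-trial correlation at each time point.

    rates[trial][time] and eye_vel[trial][time] hold one neuron's firing rate
    and the eye velocity on repeated presentations of the same target motion.
    Returns one correlation coefficient per time point.
    """
    n_time = len(rates[0])
    return [
        pearson([trial[t] for trial in rates], [trial[t] for trial in eye_vel])
        for t in range(n_time)
    ]


# Simulated example (made-up numbers): a fluctuation shared by neuron and
# behavior is absent at the first time point, strong at the second, and weaker
# at the third, loosely mimicking fixation, initiation, and steady state.
rng = random.Random(0)
coupling = [0.0, 1.0, 0.5]
rates, eye_vel = [], []
for _ in range(300):
    shared = rng.gauss(0.0, 1.0)
    rates.append([100.0 + 10.0 * (w * shared + rng.gauss(0.0, 1.0)) for w in coupling])
    eye_vel.append([20.0 + 2.0 * (w * shared + rng.gauss(0.0, 1.0)) for w in coupling])

for t, r in enumerate(neuron_pursuit_correlation(rates, eye_vel)):
    print(f"time point {t}: r = {r:.2f}, variance shared = {r * r:.2f}")
```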
Comparison of the neuron-pursuit correlations recorded in the floccular complex and the FEFSEM revealed large, but quite different, neuron-pursuit correlations in the two structures (Figures 7C and D). Here, we show results from all individual neurons instead of averages: each horizontal line in the color map shows neuron-pursuit correlations as a function of time for an individual neuron. The lines for different neurons are stacked vertically, ordered by the time at which 95% of the peak correlation was reached. In the floccular complex (Figure 7D) all Purkinje cells showed the same basic profile of neuron-pursuit correlation as a function of time. All reached peak neuron-pursuit correlations about 100 ms after the onset of target motion. Many continued to show neuron-pursuit correlations later in the behavior, but at lower levels. In the FEFSEM (Figure 7C), in contrast, different neurons reached their peak neuron-pursuit correlations at different times. Thus, the temporal structure of neuron-pursuit correlations in the FEFSEM implies that each individual neuron makes its most important contribution to pursuit in a specific, narrow time window, with the full pursuit response tiled systematically by the full population of FEFSEM neurons. This supports our earlier suggestion that each FEFSEM neuron could be controlling pursuit selectively at its own specific time during the movement.
Twenty years ago, the sorts of neuron-pursuit correlations illustrated in Figure 7C and D were unimaginable. We viewed the variation in the nervous system as noise, and we assumed that the noise was eliminated by averaging across neurons with nominally similar tuning and independent noise. Thinking changed gradually, starting with the work of Johnson et al. (1973) on the somatosensory system, the finding of Zohary et al. (1994) that neurons in MT have correlated noise, and the observation of Britten et al. (1996) that one MT neuron could predict behavior (albeit weakly). Now it is clear that trial-by-trial fluctuations in neural responses are correlated across neurons, that the presence of correlated noise limits the power of averaging across neurons, and that the correlated noise will cause variation in the resulting behavior (Shadlen et al., 1996). The neuron-pursuit correlations illustrated in Figure 7 are less shocking in the modern context, and they turn out to allow a fairly quantitative understanding of how signal and noise are processed in the pursuit sensory-motor system.
Knowing the variance of the behavior, the variance of the neural firing, and the neuron-pursuit correlation in the floccular complex allowed us to calculate how much of the variation in pursuit was added downstream from the Purkinje cells (Medina and Lisberger, 2007). We did so under the assumptions that many (at least 1,000) Purkinje cells were involved in the behavior and that their signals were combined linearly at downstream areas. The calculation implied that all the variation in the first 100 ms of pursuit was present in the responses of the Purkinje cells, so that no noise was added downstream (Figure 7E). This is consistent with our hypothesis, based on behavioral measures and recording from area MT, that the variation in pursuit initiation arises entirely from correlated noise in the sensory representation of visual motion in MT. Later in the pursuit response however, our calculations implied that noise is added downstream. As suggested by Harris and Wolpert (1998) for noise of “motor” origin, the noise added downstream from the floccular complex scales as a function of the magnitude of eye speed (calculations for pursuit of 3 target speeds are shown in Figure 7E) and accumulates as a function of time. Thus, we suggest that there are two fundamentally different sources of pursuit variation, as there also seem to be for saccadic variation (van Beers, 2007). During the first 100 ms of pursuit, which is driven by visual signals, variation arises from the sensory input. During the later, steady-state phase of pursuit, which is driven by extra-retinal signals, variation arises deep in the motor system.
For the FEFSEM, we obtained an intuitive understanding of the processing of signal and noise through a simple model of population decoding. We created model FEFSEM populations constrained to have the mean neuron-neuron correlation of 0.18 that we measured during pursuit. Then, we ran many simulated trials, computing the average “pursuit” commanded by the model population and the “neuron-pursuit correlation” for each unit in the model. To understand the performance of the model, we define the product of the neuron-pursuit correlations for two neurons in a pair, which we will call RNB* and plot as a color code in Figure 7F. Several principles emerge. 1) As the size of the pool of model neurons increases, moving along a horizontal line in Figure 7F, each neuron contributes less to the population drive for movement and the magnitude of neuron-pursuit correlations decreases; so does RNB*. 2) For large pools with no noise added downstream, shown in the upper-right corner of Figure 7F, the product of the neuron-pursuit correlations of a pair of neurons (RNB*) approaches the theoretical limit of the value of the neuron-neuron correlations (0.18 in our data and simulations). For smaller pool sizes, RNB* can be larger than the neuron-neuron correlations because few neurons are contributing. 3) As the amount of variation added downstream increases, moving from top to bottom along any vertical line in Figure 7F, the neuron-pursuit correlation can explain less of the behavioral variance and therefore decreases, as does RNB*. In our data, RNB* was approximately equal to the neuron-neuron correlation and fell in the region indicated by the diagonal yellow stripe in Figure 7F (Schoppik et al., 2008). This yellow stripe defines the possible combination of pool sizes and downstream noise that would be compatible with our observations of neuron-neuron and neuron-pursuit correlations. 
As pool sizes smaller than 100 neurons seem implausible, we suggest that very little variation is added downstream from the signals emanating from the FEFSEM.
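The three principles can be reproduced with a closed-form toy version of the decoding model. This is a simplified stand-in, not the simulation of Schoppik et al. (2008): it assumes every model neuron has unit variance, every pair shares the same correlation rho, the behavior is the plain average of the pool, and downstream noise is independent with variance expressed relative to a single neuron's variance. Under those assumptions the covariance between any one neuron and the pooled signal is c = (1 + (N - 1)·rho)/N, the behavioral variance is c plus the downstream variance, and the neuron-pursuit correlation is c / sqrt(c + downstream variance). The function and parameter names are hypothetical.

```python
import math


def neuron_pursuit_r(pool_size, rho, downstream_var=0.0):
    """Correlation between one neuron and the behavior in the toy model.

    Assumes unit-variance neurons, uniform pairwise correlation rho, behavior
    equal to the pool average plus independent noise of variance
    downstream_var (in units of single-neuron variance).
    """
    pooled_cov = (1.0 + (pool_size - 1) * rho) / pool_size
    return pooled_cov / math.sqrt(pooled_cov + downstream_var)


def pair_product(pool_size, rho, downstream_var=0.0):
    """Product of the neuron-pursuit correlations of two neurons in a pair.

    In this symmetric toy model both neurons have the same correlation with
    behavior, so the product is simply that correlation squared.
    """
    return neuron_pursuit_r(pool_size, rho, downstream_var) ** 2


rho = 0.18  # mean neuron-neuron correlation reported in the text
for downstream_var in (0.0, 0.1):
    for pool_size in (10, 100, 1000, 10000):
        product = pair_product(pool_size, rho, downstream_var)
        print(f"pool={pool_size:6d} downstream_var={downstream_var:.1f} "
              f"pair product={product:.4f}")
```

With no downstream noise the pair product equals c, which exceeds rho for small pools and approaches rho as the pool grows; any added downstream variance pulls the product below that limit. That is the same qualitative pattern as principles 1 to 3 above.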
For both the cortex and the cerebellum, the analysis of neuron-pursuit and neuron-neuron correlations leaves us with the interpretation that very little noise is added to the commands for pursuit downstream from these areas. For the cerebellum, this statement is valid only for the initiation of pursuit, and for the FEFSEM it is valid only for brief epochs that are different in each neuron and that systematically tile the duration of the movement. It is actually quite remarkable that so little noise seems to be added to motor commands late in the process, and quite startling that the motor system seems to be capable of following its sensory inputs with such high reliability.
Pursuit operates on a very quick time scale. Only 100 ms elapses between the onset of target motion and the initiation of smooth eye motion. The short time scale tempts us to think of pursuit as a sensory-motor reflex. Yet, the richness of the pursuit circuit, its similarity to circuits for saccadic eye movements and arm movements, and the temporal diversity of contributions from the FEFSEM belie such a simple view. Much happens between sensation and action, even within the very short time that is available. Figures 8 and 9 summarize some of the higher functions that occur in relation to pursuit.
We (re)-discovered and codified modulation or “gain control” (cf. Robinson, 1986; Luebke and Robinson, 1988) through the target motion outlined in Figure 8A (Schwartz and Lisberger, 1994). Here, a spot target started to move with a step-ramp target motion. In half of the trials, the target underwent a perturbation consisting of a single cycle of a 10 Hz sine wave (black arrow in Figure 8A). The perturbation evoked a clear response (red arrow in Figure 8A) if the perturbation was superimposed on smooth target motion, but little or no response if superimposed on a stationary fixation target (compare eye velocity traces in Figure 8B). The behavioral paradigm was contrived so that the image motion on the retina was the same during pursuit and fixation, while the state of the pursuit system was quite different. We concluded, as illustrated in the schematic diagram at the top of Figure 8, that the pursuit circuit contains a gain, or “volume”, control, and that the gain is high when the subject is tracking and low when the subject is fixating. We think that turning the gain control to “loud” is an essential step in initiating and maintaining pursuit of a moving target. As noted previously by Luebke and Robinson (1988), pursuit is not merely fixation of a moving target: it is more. Pursuit, like saccades (Fischer and Boch, 1983), requires activation and can proceed only after active release from fixation.
The schematic at the top of Figure 8 includes the output of the FEFSEM as one of the variables that can control the gain of visual-motor transmission for pursuit. The evidence for including the FEFSEM, summarized in Figures 8C and D, comes from micro-stimulation in the FEFSEM (Tanaka and Lisberger, 2001, 2002). In these experiments, we introduced micro-electrodes into the FEFSEM, and recorded until we found a site where neurons responded selectively during pursuit and not during other kinds of eye movements. We then switched the electronics to allow stimulation through the micro-electrode. Stimulation with high currents evoked an eye movement that was larger during pursuit than during fixation (Figure 8C, red versus black trace), consistent with the possibility that the site of gain modulation for pursuit is somewhere in the pathway from the FEFSEM to the motor system. We then delivered stimulation at lower frequencies as a stationary fixation point was perturbed briefly with the same tiny 10 Hz sine wave used to codify gain control in the first place. As shown in Figure 8D, perturbation of a stationary fixation target without activation of the FEFSEM evoked a very small smooth eye velocity (purple trace). During micro-stimulation of the FEFSEM (black trace), in contrast, the same perturbation of the fixation target caused a much larger smooth eye velocity response. We take this finding as evidence that activity emanating from the FEFSEM can control the gain of visual-motor transmission for pursuit. A role in gain control is entirely compatible with the suggestion in a prior section that different neurons in the FEFSEM control pursuit in different brief time epochs during the movement. Indeed, the temporal specificity of the involvement of each individual neuron provides a mechanism for temporally specific control of the gain of visual-motor transmission for pursuit (Tabata et al., 2008).
Several of our papers suggest that gain control is more than just arousal for smooth pursuit eye movements. We suggest that it is a form of motor attention and can be applied differentially to one of several moving targets to make choices about what objects to track (Schoppik and Lisberger, 2006; Garbutt and Lisberger, 2006; Gardner and Lisberger, 2001, 2002). An example appears in the cartoons of Figure 9. Here, two targets are moving upwards and to the right at the same time and the monkey is allowed to track either one. Pre-saccadic pursuit (blue arrows) takes the eye obliquely up and right in a direction that represents the average of the two simultaneous target motions. Then, execution of a saccade to the target moving to the right (A) or up (B) turns pre-saccadic pursuit that is averaging (blue arrows) into post-saccadic pursuit (red arrows) that is selective for the rightward or upward target motion (Gardner and Lisberger, 2001).
The emergence of target-selective pursuit immediately after the saccade (region within the ellipses in Figure 9) makes it impossible to invoke the use of visual inputs from the fovea as a mechanism of target selection; it would take 60-70 ms for the visual inputs from the fovea to have their first effect on smooth eye velocity. Instead, we conclude that saccades are tightly linked to target choice for pursuit. Interestingly, saccades evoked by micro-stimulation in the frontal eye fields or the superior colliculus have the same target-selective effect (Gardner and Lisberger, 2002), suggesting that execution of the saccade itself can cause target choice.
We think of target choice as an expression of voluntary control of the gain of visual-motor transmission for visual signals arising at the specific spatial location of the target chosen for tracking. Our hypothesis is that execution of a saccade is one way to increase the gain of visual-motor transmission selectively for the target at the endpoint of the saccade. There also may be a mechanism that chooses targets in parallel for saccades and pursuit (Liston and Krauzlis, 2003), and humans are able to bias pursuit towards one of two targets even without a saccade (Garbutt and Lisberger, 2006). The target-selective nature of gain control suggests that it can be understood as a form of “motor attention” (Schoppik and Lisberger, 2006). As a homology to the spatial search light theory of perceptual attention (Posner, 1980), there seems to be a spatial aperture for the visual inputs that drive pursuit, and visual motion signals arising within that aperture seem to obtain preferential access to the pursuit motor circuits (Schoppik and Lisberger, 2006; Garbutt and Lisberger, 2006). The same concept helps to understand how the pursuit system can keep the moving eyes pointed at a moving target in spite of the potential drag provided by the oppositely-directed image motion from the stationary surroundings (Miles et al., 1991).
Efference copy is a general property of neural circuits in the brain. For example, spino-cerebellar pathways include the ventral spino-cerebellar tract (VSCT), which appears to assemble the same signals as are transmitted to motoneurons, and to send a surrogate of the motor command to the cerebellum to assist in coordinating movement (e.g., Arshavsky et al., 1978). In the saccadic eye movement system, efference copy has been postulated to allow the brain to remember the location of future targets in a spatial coordinate frame in the face of ongoing saccades (Mays and Sparks, 1980). Part of the neural substrate for this function exists in a pathway from the superior colliculus through the thalamus to the saccadic regions of the frontal eye fields (Sommer and Wurtz, 2002).
Armed with an understanding of the multiple processes that operate to initiate smooth tracking, can we use the pursuit system to understand how efference copy contributes to motor control? In the pursuit system, efference copy signals are present almost everywhere. Efference copy signals in the pursuit system are related to ongoing eye velocity and persist without decrement when the eyes continue to move in the absence of visual motion stimuli during steady-state pursuit. Signals that match these criteria have been recorded in Purkinje cells of the floccular complex of the cerebellum (Stone and Lisberger, 1990) and extrastriate visual area MST (Newsome et al., 1988), and probably exist also in at least the FEFSEM, the caudate nucleus, and the oculomotor vermis.
One hint of the function of efference copy in pursuit comes from the observation that pursuit can maintain steady-state tracking with eye velocity essentially equal to target velocity so that there is next to no image motion (Figure 1A). If image motion is removed during steady-state tracking by stabilizing the target on the moving eye (Morris and Lisberger, 1987), or if the target is blinked as if it went behind a tree (Newsome et al., 1988; Churchland et al., 2003), then the eye continues to move smoothly without a decrement in speed. Thus, as noted initially by Yasui and Young (1975), even small visual motion inputs cannot explain the maintenance of eye velocity nearly equal to target velocity during steady-state pursuit. As mentioned earlier in this paper, we cannot invoke the mechanical forces of inertia and momentum. Left to itself with constant innervation, the eyeball would slow to a stop within a few hundred milliseconds. We think of the maintenance of eye velocity during steady-state pursuit as a housekeeping function that the pursuit system performs automatically, without voluntary intervention.
The current theory of the maintenance of pursuit postulates that the command for eye velocity at any given time becomes an efference copy that is fed back and becomes part of the command for eye velocity in the immediate future. Then, eye velocity is maintained automatically unless there is a difference between target and eye velocity. As a consequence, any image motion drives a change in smooth eye velocity (an eye acceleration) that corrects the error between target and eye velocity (Lisberger et al., 1981). Debate is not about the principle of using efference copy to maintain eye velocity in the absence of image motion, but rather about the exact implementation in the brain. Lesions of the floccular complex (Zee et al., 1981), oculomotor vermis (Takagi et al., 2000), MST (Dürsteler and Wurtz, 1988), FEFSEM (Keating, 1991), and dorsolateral pontine nucleus (Ono et al. 2003) cause eye velocity to lag behind target velocity during steady-state pursuit, so all are potential substrates of the extra-retinal maintenance of steady-state pursuit. I favor a primary role for the cerebellum in this function (Lisberger, 2009), as shown by the red feedback pathway in Figure 10, and I think that other areas support different components of pursuit (e.g. gain control, see below). Still, it is plausible that the extra-retinal maintenance of steady-state pursuit is shared, with different sites contributing in different ways and on different time scales (e.g., Newsome et al., 1988). As an alternative to a direct role in maintaining steady-state pursuit, Komatsu and Wurtz (1988) pointed out that the corollary discharge in MST could support the perceptual phenomenon that a target is still perceived as moving even during pursuit that is good enough to eliminate visual motion inputs.
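The principle of this theory can be sketched as a discrete-time simulation. The code below is a toy model under stated assumptions, not the published implementation: the current eye-velocity command is carried forward as an efference copy, and delayed image motion (target velocity minus eye velocity) adds an eye acceleration to it. The function name, the gain of 5 per second, the 80 ms delay, and the 1 ms time step are hypothetical values chosen only to give a stable, well-damped response.

```python
def simulate_pursuit(target_vel, stabilize_from=None, accel_gain=5.0,
                     delay_steps=80, dt=0.001):
    """Toy efference-copy model of pursuit maintenance.

    target_vel: target velocity (deg/s) at each time step.
    stabilize_from: step index from which image motion is forced to zero,
        mimicking stabilization of the target on the moving eye (None = never).
    accel_gain: eye acceleration per unit image motion (1/s), hypothetical.
    delay_steps: visual delay in time steps, hypothetical.
    Returns the eye velocity (deg/s) at each time step.
    """
    eye = [0.0]
    image_motion = []
    for step in range(len(target_vel) - 1):
        if stabilize_from is not None and step >= stabilize_from:
            slip = 0.0  # target stabilized on the eye: no image motion
        else:
            slip = target_vel[step] - eye[step]
        image_motion.append(slip)
        delayed = image_motion[step - delay_steps] if step >= delay_steps else 0.0
        # Efference copy: the next command is the current command (eye[step])
        # plus an acceleration driven by delayed image motion.
        eye.append(eye[step] + accel_gain * delayed * dt)
    return eye


# Target is stationary for 100 ms, then moves at 20 deg/s; the image is
# stabilized 1 s after the onset of target motion.
target = [0.0] * 100 + [20.0] * 1400
eye = simulate_pursuit(target, stabilize_from=1100)
for step in (100, 200, 400, 1000, 1499):
    print(f"t = {step} ms: eye velocity = {eye[step]:.2f} deg/s")
```

In this sketch eye velocity converges on target velocity while image motion is available and then holds its value once image motion is removed, because the fed-back command persists when the visual term is zero. Where in the brain the feedback is implemented is exactly the point the sketch leaves open.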
An efference copy signal related to eye motion seems essential for maintaining a high gain of visual-motor transmission during pursuit, perhaps through the component of discharge in the FEFSEM that is related to smooth eye velocity. Even though credible image motion is the first impetus for increasing the gain of visual-motor transmission and initiating the transformation of fixation into pursuit, image motion becomes small and unreliable during steady-state pursuit. Visual signals alone could not keep the gain of visual-motor transmission high. Figure 10 proposes that eye motion signals provided by an efference copy (magenta feedback pathway) would keep the gain of visual-motor transmission high so that the system can be responsive to any new visual inputs throughout the course of a single movement (see Figure 8A and B). Operationally, the brain is combining image and eye motion signals to create a representation of target motion that is present throughout the initiation and steady-state intervals of pursuit, whether or not the target motion is causing image motion.
Pursuit eye movements are particularly important to us as primates with foveal vision, because they allow us to keep our fovea pointed at objects that are moving. In addition, pursuit is an interesting system to study because of its accessibility to detailed physiological and behavioral analysis, and because of the possibility that interesting general brain functions such as population decoding, sensory-motor integration, and target choice can be studied rigorously within a known circuit. It seems likely that principles learned from the study of eye movements will generalize to other movements. In many ways, the problem solved by pursuit is similar to that solved when Barry Bonds hits a home run or Roger Federer returns service: from sensation to action in a very short time. The pursuit circuit parallels those for saccadic eye movements, reaching, and grasping; the circuit homology implies functional homology as well. Finally, pursuit implements voluntary features that are gracefully integrated into the quick sensory-motor response, features that seem likely to be important for all kinds of movements. Thus, what we’ve learned about pursuit so far, and what we’ll learn as a field going forward, seems likely to provide a broad understanding of how sensation is converted to action, and what happens in between, for all kinds of sensory modalities and motor acts.
I am grateful to collaborators, students, and postdoctoral fellows whose research and ideas contributed to the content of this review. Special thanks to David Schoppik, Javier Medina, Mark Churchland, and Sonja Hohl for providing old data to make the new figures in this paper. Joonyeol Lee, Yu-Qiong Niu, Kris Chaisanguanthum, Jennifer Li, Jin Yang, and John O’Leary provided helpful comments on earlier versions of the manuscript. Research was supported by the Howard Hughes Medical Institute, the National Eye Institute (EY03878 and EY017210), the National Institute of Mental Health (MH077970), and the Sloan and Swartz Foundations.