Internal models are a key feature of most modern theories of motor control. Yet, it has been challenging to localize internal models in the brain, or to demonstrate that they are more than a metaphor. In the present review, I consider a large body of data on the cerebellar floccular complex, asking whether floccular output has features that would be expected of the output from internal models. I argue that the simple spike firing rates of a single group of floccular Purkinje cells could reflect the output of three different internal models. 1) An eye velocity positive feedback pathway through the floccular complex provides neural inertia for smooth pursuit eye movements, and appears to operate as a model of the inertia of real-world objects. 2) The floccular complex processes and combines input signals so that the dynamics of its average simple spike output are appropriate for the dynamics of the downstream brainstem circuits and eyeball. If we consider the brainstem circuits and eyeball as a more broadly conceived “oculomotor plant”, then the output from the floccular complex could be the manifestation of an inverse model of “plant” dynamics. 3) Floccular output reflects an internal model of the physics of the orbit where head and eye motion sum to produce gaze motion. The effects of learning on floccular output suggest that it is modeling the interaction of the visually-guided and vestibular-driven components of eye and gaze motion. Perhaps the insights from studying oculomotor control provide groundwork to guide the analysis of internal models for a wide variety of cerebellar behaviors.
Modern theories of the neural control of movement posit internal models that serve the critical function of providing interfaces between neural signals in different frames of reference (e.g. Miall et al., 1993; Shidara et al., 1993; Shadmehr and Mussa-Ivaldi, 1994; Wolpert et al., 1995; Schweighofer et al., 1998; Kawato, 1999; Bursztyn et al., 2006; Bo et al., 2008). For example, the brain must “know” about the mechanical structure, or “kinematics”, of the arm and about its dynamical response across time to generate the correct forces in a large group of agonists and antagonists. This “knowledge” can be formalized as an internal model of the arm that allows the brain to convert a goal specified in terms of the desired final hand position, for example, into the appropriate amplitude and time course of muscular activity across multiple muscles and joints.
Internal models come in many flavors and functions. An inverse dynamics model would convert a command for joint trajectory or final joint angle into a time varying pattern of muscular activity. An inverse kinematics model would take a command for final hand position as its input, and convert the command into a description of the required kinematics in terms of a sequence of trajectories of angles for the shoulder, elbow, and wrist. A forward model might take internal feedback of the commands sent to muscles as its input, and convert this “corollary discharge” or “efference copy” into predictions of the sensory feedback that would result, for comparison with the actual sensory feedback.
The concept of internal models frequently is applied to arm movements, but it also has a place in the control of eye movement. For the vestibulo-ocular reflex (Fernandez and Goldberg, 1971) and smooth pursuit eye movements (Miles and Fuller, 1976; Lisberger and Fuchs, 1978a), the higher inputs that specify a desired eye movement consist of signals related to the desired smooth eye velocity. For saccades, the command signal comprises a brief, high-frequency burst specifying the instantaneous eye velocity during a saccade (Luschei and Fuchs, 1972; van Gisbergen et al., 1981). The oculomotor “plant”, defined as the eyeball, extraocular muscles, and tissues that surround them, has dynamics in the sense that a transient pulse of firing in the motoneurons will rotate the eye rapidly but not instantaneously, and then will allow the eye to drift back toward straight-ahead gaze with a time constant of ~150 ms. To induce the eye to stay at the correct final rotation, the brain must have an inverse model of the dynamics of the plant and must use that model to generate exactly the correct motoneuron commands to rotate the eye to the desired location or at the desired speed, and then to hold the eye at the final angle of rotation. Skavenski and Robinson (1973) formalized this concept. They proposed that the inverse model comprises a neural integrator that converts eye velocity commands to eye position signals. Their model, illustrated in Figure 1, is one of the earliest examples of an inverse dynamics model, and has been largely substantiated by recordings from neurons related to eye movements and vestibular stimulation in the brainstem.
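The role of the neural integrator can be illustrated with a minimal simulation. This is only a sketch under stated assumptions: the plant is reduced to a single first-order lag with the ~150 ms time constant mentioned above, the motoneuron command is the sum of a velocity term and an integrated (position) term, and the function name, pulse size, and time step are hypothetical choices made for illustration rather than values taken from Skavenski and Robinson (1973).

```python
TAU = 0.15   # assumed plant time constant in seconds (the ~150 ms quoted in the text)
DT = 0.001   # simulation step in seconds (illustrative choice)

def simulate_eye_position(velocity_command, use_integrator):
    """Eye position for an assumed first-order plant: TAU * dE/dt = -E + u."""
    eye = 0.0
    integrated = 0.0
    trace = []
    for v in velocity_command:
        integrated += v * DT  # neural integrator: velocity command -> position signal
        # Motoneuron command: velocity term plus (optionally) the integrated position term.
        u = TAU * v + (integrated if use_integrator else 0.0)
        eye += DT * (u - eye) / TAU  # forward-Euler update of the plant
        trace.append(eye)
    return trace

# Hypothetical command: 100 deg/s for 100 ms (a 10 deg movement), then 900 ms of silence.
command = [100.0] * 100 + [0.0] * 900
with_integrator = simulate_eye_position(command, use_integrator=True)
without_integrator = simulate_eye_position(command, use_integrator=False)

print("final position with integrator:   ", round(with_integrator[-1], 2))
print("final position without integrator:", round(without_integrator[-1], 2))
```

With the integrated term, the simulated eye holds its final angle; without it, the eye moves transiently and then drifts back toward straight-ahead gaze, which is the behavior the inverse model is meant to prevent.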
The kinematics of eye movements also becomes an interesting concept that may be relevant to a discussion of internal models if we consider eye movements as part of a larger motor system that specifies the position of gaze in three-dimensional space. We can think of gaze in terms of 4 joints. The first “joint” allows conjugate rotation of the two eyes within the orbit, changing the horizontal and vertical position of gaze while maintaining the point of convergence of the visual angles of the two eyes at a fixed distance. The second joint allows “vergence” by adjusting the relative angles of the two eyes to bring the position of gaze to objects at different distances. The third joint is at the neck, which can move gaze by rotating the head. Finally, horizontal and vertical rotation of the eye within the orbit causes predictable torsion of the eye for saccades and smooth pursuit eye movements, but not for the vestibulo-ocular reflex (Tweed and Vilis, 1990; Crawford and Vilis, 1991), creating a fourth, fairly complicated joint. The goal of the brain is to control gaze, not just eye rotation, through a complex coordination of the 4 joints.
If we accept that the brain must transform signals between coordinate frames, and that the required transformations are defined by the mechanics and dynamics of the effector apparatus, then the question is not whether the brain operates as an internal model of the body. It must. Instead, the questions are whether internal models are localized at specific sites in the brain, how they are constructed through neural circuits, and whether specific patterns of neural discharge can be understood in terms of internal models. The first step must be to find potential loci for further examination by identifying patterns of neural discharge that could represent the output of internal models.
Much prior research has pointed to the cerebellum as a potential locus of internal models. An exhaustive list of prior suggestions is beyond the scope of my review article, but I give a handful of examples, in alphabetical order by first author:
Angelaki et al. (2008) and Yakusheva et al. (2008) have discovered that cerebellar neurons can disambiguate tilt and translation signals through convergence of vestibular inputs from semicircular canals and otolith neurons, implying that the cerebellum creates an internal model of the physics of the world.
Bastian et al. (1996) demonstrated that cerebellar lesions disrupt compensation for the “interaction” torques between the different segments of a limb, implying that the cerebellum contains a model of arm kinematics and dynamics that allows it to predict and counteract interaction torques.
Cerminara et al. (2009) showed that the simple-spike firing of Purkinje cells in the lateral cerebellum follows the velocity of a moving target even when the target has disappeared briefly, suggesting that the cerebellum contains an internal model of the physics of object motion.
Ghasia et al. (2008) found that putative cerebellar target neurons discharge in relation to a change in ocular torsion that is not represented in the firing of extraocular motoneurons (Ghasia and Angelaki, 2005), and that therefore results from the mechanical properties of the orbit. In a motor system such as eye movements that does not rely on proprioceptive feedback (Keller and Robinson, 1971), their observations imply that the cerebellum contains a model of ocular mechanics.
Imamizu et al. (2000) used functional imaging to show that specific voxels in the cerebellar cortex have BOLD signals that remain modified after a subject has learned a motor task involving creation of an internal model of a previously-unknown tool.
Ito (2008) illustrates how the concept of cerebellar internal models for motor control can be extended to internal models of mental representations, consistent with the finding of abundant cerebellar projections to regions of the cerebral cortex outside of traditional motor areas (Dum and Strick, 2003).
Pasalar et al. (2006) used recordings from the cerebellar cortex of monkeys to show that the output from the region they studied could result from an internal model of kinematics, but not of dynamics.
The theory of cerebellar learning also could be an important facet of the operation of internal models in the cerebellum. According to the cerebellar learning theory, errors in movement are signaled by consistently timed spikes on the climbing fiber input to the cerebellum. In turn, climbing fibers cause long-term depression of the synapses from parallel fibers onto Purkinje cells, specifically for the parallel fibers that were active at or just before the time the climbing fiber input arrived (Marr, 1969; Albus, 1971; Ito, 1972). The extension of the cerebellar learning theory to cerebellar internal models proposes that depression of the parallel fiber to Purkinje cell synapses corrects the internal model in the cerebellum so that the next instance of a given movement is closer to perfection (e.g. Imamizu et al., 2000; Ito 2005, 2008). The single unit recording studies of Gilbert and Thach (1977) and Ojakangas and Ebner (1992) are compatible with the hypothesis that the cerebellum contains adaptable internal models, with adaptation guided by error inputs from climbing fibers.
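The learning rule described above can be sketched in a few lines. This is a deliberately reduced toy, not a model from the cited papers: it assumes a binary climbing fiber signal that fires whenever the Purkinje cell rate exceeds a desired rate, it implements only depression (no compensating potentiation), and the weights, activity pattern, and `ltd_fraction` are hypothetical values.

```python
def run_trial(weights, pf_activity, desired_rate, ltd_fraction=0.1):
    """One trial: compute Purkinje cell output, then depress the active synapses on error."""
    pc_rate = sum(w * x for w, x in zip(weights, pf_activity))
    # Assumed error signal: the climbing fiber fires only when the output is too high.
    climbing_fiber = 1.0 if pc_rate > desired_rate else 0.0
    # Depression is applied only to parallel fibers that were active on this trial.
    updated = [w * (1.0 - ltd_fraction * climbing_fiber * x)
               for w, x in zip(weights, pf_activity)]
    return updated, pc_rate

weights = [1.0, 1.0, 1.0, 1.0]       # hypothetical parallel fiber synaptic weights
pf_activity = [1.0, 0.0, 1.0, 0.0]   # only the first and third parallel fibers are active
rates = []
for _ in range(20):
    weights, rate = run_trial(weights, pf_activity, desired_rate=1.0)
    rates.append(rate)

print("first and last Purkinje cell rates:", rates[0], round(rates[-1], 3))
print("final weights:", [round(w, 3) for w in weights])
```

In this sketch the synapses from inactive parallel fibers are left untouched, and depression stops once the error signal disappears, which captures the specificity that the theory assigns to the conjunction of parallel fiber and climbing fiber activity.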
What would internal models look like in terms of the discharge of cerebellar neurons, and how should learning situations change the discharge of cerebellar neurons if internal models are a learned property of the cerebellum? The goal of my review article is to illuminate these two questions using data from the floccular complex of the cerebellar cortex during normal smooth pursuit eye movements, and during the vestibulo-ocular reflex before and after motor learning. After a general review of the inputs and outputs of the floccular complex, I will discuss three ways in which its simple spike output could be interpreted as the output of an internal model.
The logic in my review is based on reinterpretation of the responses of a single unified group of Purkinje cells that are found throughout the floccular complex. The explications of the different internal models are based on different features of the responses of the Purkinje cells during eye movements, but I imagine all three proposed internal models to be reflected in the output of this single group of Purkinje cells. Further, it is important to remember that prior papers have interpreted the responses of the Purkinje cells in light of legitimate concepts that largely overlooked the concept of internal models in motor control. Those interpretations do not need to change to consider the same data in light of adaptable internal models in the cerebellum.
Anatomically, the floccular complex comprises the 10–11 folia of the ventral paraflocculus and flocculus, where the latter structure has been considered classically to be part of the vestibulo-cerebellum. Until recently, there was debate about whether the flocculus and ventral paraflocculus were a single structure with a unified function, or were functionally distinct structures that play separate roles in motor learning in the VOR versus pursuit, respectively. The suggestion of functionally distinct structures was supported by demonstrations of different anatomical pathways for visual inputs to the two structures (Gerrits and Voogd, 1989; Glickstein et al., 1994), and by some physiological observations (Nagao et al., 1992). However, Rambold et al. (2002) showed that partial ablations of the flocculus and ventral paraflocculus caused linked deficits in pursuit eye movements and learning in the vestibulo-ocular reflex. The deficits in the two behaviors had linked magnitudes that scaled together in proportion to the size of the lesion. Further, Lisberger et al. (1994b) reported the same basic Purkinje cell response properties in both the ventral paraflocculus and the flocculus. On the basis of the latter reports, we now refer to the functionally relevant part of the two structures as the floccular complex.
The floccular complex clearly plays an important role in both smooth pursuit eye movements and motor learning in the vestibulo-ocular reflex. Purkinje cells (PCs) in the floccular complex show strong modulation of firing rate during visually-guided smooth pursuit eye movements in monkeys (Miles and Fuller, 1975; Lisberger and Fuchs, 1978a; Noda and Suzuki, 1979a; Leung et al., 2000) and changes in modulation in association with motor learning in the vestibulo-ocular reflex (Miles et al., 1980b; Lisberger et al., 1994b; Hirata and Highstein, 2001). Lesions of the floccular complex cause deficits in visually-guided eye movements (Takemori and Cohen, 1974; Zee et al., 1981; Rambold et al., 2002), as well as in motor learning in the vestibulo-ocular reflex (Lisberger et al., 1984; Rambold et al., 2002). Stimulation in the floccular complex evokes smooth eye movements after a latency of about 10 ms (Lisberger, 1994). Two sets of observations imply that the floccular complex is only part of the cerebellar system for smooth eye tracking, however, and that some functions reside elsewhere. First, the floccular complex seems to play its most important role when the head is stationary or is moved passively, rather than during tracking with active head turns (Belton and McCrea, 1999, 2000). Second, at least the oculomotor vermis and possibly more lateral cerebellar areas play potentially-important roles in pursuit eye movements (Takagi et al., 2000; Hiramatsu et al., 2008).
Of the floccular PCs that show strong modulation during pursuit eye movements, many also show strong independent modulation by vestibular stimulation, but only under specific behavioral conditions (Lisberger and Fuchs 1974, 1978a; Miles and Fuller 1976). Floccular PCs influence the activity of extraocular motoneurons via disynaptic pathways (e.g. Highstein 1973), employing an interneuron that is part of the brainstem’s three-neuron vestibulo-ocular reflex pathway (Figure 2) and that is called interchangeably a floccular target neuron or “FTN” (Lisberger et al., 1994a) or an eye-head velocity or “EHV” neuron (Scudder and Fuchs, 1992). Knowing how spikes in floccular PCs cause eye movements over a fully characterized disynaptic pathway provides a functional context that facilitates interpretation of floccular output.
Analysis of signal processing in the floccular complex has led to the conclusion that the simple spike firing rate of floccular PCs is assembled by combining several different input signals (Figure 2). Recordings from mossy fibers and other small elements in the granule cell layer have revealed three main classes of inputs.
Eye movement mossy fiber inputs show spontaneous firing when the eye is at straight-ahead gaze, steady firing that is related linearly to eye position in a preferred direction along either the horizontal or vertical axis, and a weaker component that is related to eye velocity (Lisberger and Fuchs, 1978b; Miles et al., 1980a; Noda and Suzuki, 1979b; Noda and Warabi, 1982). They show similar modulation of firing rates during visually driven smooth eye movements and the vestibulo-ocular reflex in the dark, and some emit bursts of spikes that precede saccadic eye movements (Lisberger and Fuchs, 1978b). The eye movement inputs to the floccular complex have firing patterns that are remarkably similar to those in the oculomotor regions of the nucleus prepositus hypoglossi (McFarland and Fuchs, 1992), a brainstem nucleus with strong projections to the floccular complex (McCrea and Baker, 1985; Langer et al., 1985). Prepositus neurons are thought to transmit to the floccular complex a corollary discharge reporting the eye movement command that has been delivered at the same time to extraocular motoneurons (Stone and Lisberger, 1990; Green et al., 2007).
Vestibular mossy fiber inputs also show spontaneous firing when the eye is at straight-ahead gaze, and modulate their firing mainly in relation to head velocity during vestibular stimulation. Many of the vestibular input elements have the characteristics of “vestibular-only” neurons in the brainstem, while some show pauses during saccades and others show firing related to steady eye position as found in the “position-vestibular-pause” neurons in the vestibular nuclei (Lisberger and Fuchs, 1978b; Miles et al., 1980a). For the most part, the vestibular input to the floccular complex shows similar modulation of firing rate whether the eyes are undergoing a vestibulo-ocular reflex, or are preventing the expression of the vestibulo-ocular reflex by tracking a target that rotates exactly with the moving head (Lisberger and Fuchs, 1978b). The classical view was that the floccular complex receives inputs from primary vestibular afferents, but modern neuroanatomical studies (Gerrits et al., 1989) and the properties of the fibers recorded therein (Lisberger and Fuchs, 1978b; Miles et al., 1980a) imply that most of the vestibular inputs come from the vestibular nucleus, from a variety of second-order vestibular neurons (Highstein et al., 1987).
Visual mossy fiber inputs to the floccular complex have been studied less thoroughly than eye movement or vestibular inputs, but have been identified definitively in studies by Miles and Fuller (1976) and Noda (1986). The visual input appears to arise from the visual motion system, and is conveyed to the floccular complex through the dorsolateral pontine nucleus, the nucleus reticularis tegmenti pontis, and possibly the accessory optic nuclei in the brainstem (Glickstein et al., 1994; Gerrits et al., 1984; Langer et al., 1985).
The simple spike firing rate of floccular PCs can be described as related to gaze velocity or eye velocity, at least during passive sinusoidal eye and head motion (Lisberger and Fuchs, 1978a; Miles et al., 1980a). As illustrated in Figure 3, simple spike firing rate is strongly modulated during tracking of sinusoidal target motion at 0.5 Hz with the head stationary. The average firing rate reaches a peak that coincides with the peak of eye velocity toward the side of recording, a.k.a. ipsiversive eye velocity (Figure 3, left column). If the monkey is subjected to sinusoidal head rotation and is rewarded for keeping the eyes stationary in the orbit by tracking a target that rotates exactly with him, then simple spike firing rate still is strongly modulated even though the eyes are quite still in the orbit. Firing rate now reaches a peak that coincides with the peak of ipsiversive head velocity (Figure 3, right column). Finally, if the monkey generates eye velocity that is equal and opposite to head velocity in the light or the dark, then simple spike firing rate is at best weakly modulated even though both the eyes and head are undergoing sinusoidal motion that was effective alone in causing modulation of simple spike firing rate during tracking (Lisberger and Fuchs, 1974; Miles and Fuller, 1976). There is one possible caveat to the signal processing outlined above. In squirrel monkeys, at least, the head velocity component during cancellation of the VOR is not as impressive when the monkey tracks with active head turns (Belton and McCrea, 2000). It is not clear whether this difference is related to the use of different mechanisms to prevent the expression of the VOR with passive versus active head turns. The VOR is mainly counteracted by a pursuit eye movement for the parameters used for passive head rotation (Lisberger 1990; Cullen et al., 1991), but the gain of vestibular transmission might be suppressed more strongly during rapid, active head turns.
In general, simple spike firing during visual tracking and passive angular head rotation can be described as:
FR(t) = b·Ė(t) + d·Ḣ(t) + rr    (1)
where FR(t) is the simple spike firing rate, Ė(t) and Ḣ(t) represent eye and head velocity as functions of time, b and d are the sensitivities to eye and head velocity, and rr represents the spontaneous firing rate around which the input signals cause modulation. In many PCs, the values of b and d are approximately equal. As a result, the PCs show equal modulation during pursuit eye motion with the head stationary or sinusoidal head motion with the eyes stationary in the orbit. When eye and head velocity are equal and opposite, as during the vestibulo-ocular reflex, the eye and head velocity components of firing are also equal and opposite and largely cancel (Lisberger and Fuchs, 1978a). In a significant fraction of PCs (Belton and McCrea, 1999, 2000), d is small compared to b so that simple spike firing is modulated in relation to the eye velocity induced by both visual and vestibular stimuli. Many PCs also show some modulation of simple spike firing rate in relationship to eye position and some show strong modulation (Noda and Suzuki, 1979a), but for the overall population, eye velocity is emphasized relative to eye position.
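The consequences of the relative sizes of b and d can be checked numerically. The sketch below assumes the linear form implied by the definitions above, a firing rate equal to b times eye velocity plus d times head velocity plus rr; the sensitivities, baseline rate, and 30 deg/s stimulus amplitude are hypothetical numbers chosen only for illustration.

```python
import math

def simple_spike_rate(eye_velocity, head_velocity, b=1.0, d=1.0, rr=100.0):
    """Assumed linear description: rate = b * eye velocity + d * head velocity + rr."""
    return b * eye_velocity + d * head_velocity + rr

def modulation_amplitude(eye_amp, head_amp, b=1.0, d=1.0, samples=200):
    """Half the peak-to-peak firing rate over one cycle of sinusoidal motion."""
    rates = []
    for k in range(samples):
        s = math.sin(2.0 * math.pi * k / samples)
        rates.append(simple_spike_rate(eye_amp * s, head_amp * s, b, d))
    return (max(rates) - min(rates)) / 2.0

# Three behavioral conditions, with eye and head velocity amplitudes in deg/s.
pursuit = modulation_amplitude(30.0, 0.0)        # head stationary, eyes track
cancellation = modulation_amplitude(0.0, 30.0)   # head moves, eyes still in the orbit
vor = modulation_amplitude(-30.0, 30.0)          # eye velocity equal and opposite to head
vor_small_d = modulation_amplitude(-30.0, 30.0, d=0.2)  # a cell with d small relative to b

print(pursuit, cancellation, vor, vor_small_d)
```

With b equal to d, modulation is the same during pursuit and cancellation and vanishes during the vestibulo-ocular reflex; when d is made small relative to b, modulation persists during the reflex and follows eye velocity.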
A potential visual component of simple spike firing rate emerges during the initiation of smooth pursuit eye movements for step-ramp target motion. As illustrated in the bottom panel of Figure 4, step-ramp target motion starts with the monkey fixating a stationary target. At an unexpected time labeled “zero”, the target displaces in one direction and ramps at constant speed in the other. After 700 ms of motion, in this instance, the target undergoes a small onward step and stops. The combination of a properly coordinated step amplitude and ramp speed allows the initiation of pursuit tracking without any saccadic eye movement (Rashbass, 1961), simplifying the analysis of neural responses during pursuit. Unlike sinusoidal target motion, which is associated with only small amounts of retinal image motion when the monkey is tracking accurately, step-ramp motion has an early interval just after the onset of target motion when there is large image motion because the target is moving but the latency for pursuit initiation has not elapsed and the eyes are stationary.
In many PCs (Figure 4), simple spike firing rate shows a large transient in response to the onset of target motion followed by the expected sustained response related to eye velocity (Stone and Lisberger, 1990); note in Figure 4 that the average firing rate is sustained as long as eye velocity is sustained, until the time indicated by the two downward arrows. The transient and sustained responses are direction selective in the sense that they are positive or negative during tracking of step-ramp target motion in the on- versus off-direction for simple spike responses. We think that the transient response at the initiation of pursuit is driven by visual mossy fiber inputs (Stone and Lisberger, 1990). Visually-driven components of simple spike firing also have been demonstrated by providing motion of the visual surround while a monkey fixates a stationary target, and by changing the motion of the visual surround relative to a moving tracking target (Noda and Warabi, 1986).
It is difficult to prove that the transient in simple spike firing rate is driven by visual image motion, especially as image velocity can be related to eye acceleration quite closely (Lisberger et al., 1981; Lisberger and Westbrook, 1985). Indeed, the simple spike response of floccular PCs during pursuit with the head stationary can be described well by:
FR(t) = a·Ë(t) + b·Ė(t) + c·E(t) + rr    (2)
where Ë(t), Ė(t), and E(t) are the time-varying eye acceleration, eye velocity, and eye position, a, b, and c are the corresponding sensitivities, and rr is the spontaneous firing rate (Shidara et al., 1993; Medina and Lisberger, 2007). As illustrated in Figure 5 for a PC recorded in my laboratory, the simple-spike firing rate predicted by the best-fitting version of Equation (2) (red trace) follows the actual firing rate (black trace) quite closely. Further, the contributions of the different components of the regression are quite clear. The change in steady firing rate from the start to the end of the trial can be accounted for by the sensitivity to eye position (dashed purple trace), the transient at the onset of pursuit by the sensitivity to eye acceleration (green trace), and the sustained response during steady tracking by the sensitivity to eye velocity (solid purple trace). Because the head is stationary, there is no component related to head velocity.
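A regression of this kind can be sketched with ordinary least squares. The example below is an illustration under stated assumptions, not the fitting procedure of the cited studies: it assumes firing rate is a weighted sum of eye acceleration, velocity, and position plus a baseline, it uses a synthetic eye movement and a noise-free synthetic firing rate, and the coefficient values and helper names are hypothetical.

```python
import math

DT = 0.001  # sample interval in seconds (illustrative choice)

def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - known) / aug[row][row]
    return x

def fit_firing_rate(accel, vel, pos, rate):
    """Least-squares weights for acceleration, velocity, position, and a constant."""
    columns = [accel, vel, pos, [1.0] * len(rate)]
    gram = [[sum(x * y for x, y in zip(ci, cj)) for cj in columns] for ci in columns]
    moment = [sum(x * y for x, y in zip(ci, rate)) for ci in columns]
    return solve_linear(gram, moment)  # normal equations

# Synthetic pursuit initiation: eye velocity rises toward 20 deg/s with a damped ripple.
times = [k * DT for k in range(700)]
vel_full = [20.0 * (1.0 - math.exp(-t / 0.05) * math.cos(25.0 * t)) for t in times]
accel = [(vel_full[k + 1] - vel_full[k]) / DT for k in range(len(vel_full) - 1)]
vel = vel_full[:-1]
pos, total = [], 0.0
for v in vel:
    total += v * DT
    pos.append(total)

# Hypothetical "true" sensitivities used to generate the synthetic firing rate.
TRUE_WEIGHTS = (0.05, 1.5, 2.0, 80.0)  # acceleration, velocity, position, baseline
rate = [TRUE_WEIGHTS[0] * a + TRUE_WEIGHTS[1] * v + TRUE_WEIGHTS[2] * p + TRUE_WEIGHTS[3]
        for a, v, p in zip(accel, vel, pos)]

fitted = fit_firing_rate(accel, vel, pos, rate)
print("fitted weights:", [round(w, 4) for w in fitted])
```

Because the synthetic rate is generated without noise, the fit recovers the generating weights; with recorded data the same procedure would return best-fitting sensitivities whose separate contributions can be plotted, as in Figure 5.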
In this section, I suggest that the eye velocity sensitivity of PCs results from a positive feedback circuit through the floccular complex that can be viewed as a model of the inertia of smoothly moving objects. Figure 6 presents the evidence that the eye velocity component of PC firing is due to a corollary discharge input (from Stone and Lisberger, 1990). Here, the target started with a step-ramp motion. After 400 ms, a signal specifying the monkey’s eye position was fed back to the experimental-control computer and used to drive target position. As a result, the target was stabilized with respect to the moving eye, and moved wherever the eye moved for an interval of 300 ms. During the interval when the target was stabilized (between the vertical dashed lines), the eye continued to move without loss of speed, and the simple spike firing rate of the PC under study remained high without a noticeable decrease. Thus, both smooth eye motion and the elevated simple spike firing of PCs are maintained even when retinal image motion is removed by image stabilization. In contrast, the discharge of MT neurons shows a clear decrease when an analogous experiment is performed (Newsome et al., 1988), implying that MT responses during pursuit are driven by retinal signals while floccular responses are driven by extra-retinal signals.
An interpretation in terms of an internal model is outlined by Figure 7A. The interpretation starts from two facts: i) the disynaptic connections to motoneurons imply that floccular output drives eye velocity; and ii) floccular inputs reflect the ongoing eye velocity. Thus, floccular PCs control eye velocity and receive feedback about eye velocity, implying that the eye velocity component of PC firing is configured as a positive feedback loop. In the model represented by Figure 7A, eye velocity at time t (Ė[t]) is driven by a combination of two signals: eye velocity at time t − Δt (Ė[t − Δt]) and image velocity from time t − 100 ms (İ[t − 100]). If Δt is small, then the positive feedback loop acts as an integrator and will maintain automatically both eye velocity (Morris and Lisberger, 1987) and the discharge of PCs (Stone and Lisberger, 1990) during steady-state pursuit, even in the absence of image motion during perfect tracking or stabilized vision. Thus, a positive feedback loop through the floccular complex predicts the two features of the data in Figure 6, that both eye velocity and simple spike firing persist when image motion from the target is eliminated. Absent positive feedback, the velocity of the eye would decay quickly to zero during stabilized vision because of the mechanics of the oculomotor plant.
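The integrating behavior of the loop can be shown with a discrete-time sketch. It assumes the simplest reading of the scheme above: eye velocity at each 1 ms step equals a feedback gain times the previous eye velocity plus a visual gain times the image velocity from 100 ms earlier. The gains, the pulse of image motion, and the function name are hypothetical, and image motion is imposed open-loop, as it would be during target stabilization.

```python
def pursuit_velocity(image_velocity, feedback_gain, visual_gain=0.01, visual_delay_ms=100):
    """Eye velocity driven by its own recent value plus delayed image velocity (1 ms steps)."""
    eye_velocity = [0.0]
    for t in range(1, len(image_velocity)):
        delayed = image_velocity[t - visual_delay_ms] if t >= visual_delay_ms else 0.0
        eye_velocity.append(feedback_gain * eye_velocity[t - 1] + visual_gain * delayed)
    return eye_velocity

# Hypothetical stimulus: 100 ms of image motion at 20 deg/s, then a stabilized image (zero).
image = [20.0] * 100 + [0.0] * 600

held = pursuit_velocity(image, feedback_gain=1.0)    # perfect positive feedback
leaky = pursuit_velocity(image, feedback_gain=0.98)  # weakened feedback

print("eye velocity at end, perfect feedback: ", round(held[-1], 3))
print("eye velocity at end, weakened feedback:", round(leaky[-1], 3))
```

With a feedback gain of one, the loop integrates the transient visual input and holds eye velocity after image motion is removed; with a gain below one, eye velocity decays toward zero, which is the outcome expected in the absence of effective positive feedback.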
Perhaps the positive feedback configuration of the eye velocity component of PC firing provides neural inertia, ensuring that an eyeball in smooth pursuit motion remains in motion. Similarly, physical inertia will cause real objects in motion to remain in motion. Therefore, we can view the positive feedback of eye velocity through the floccular complex as an internal model of the physics of objects in our world. In a similar vein, Cerminara et al. (2009) showed that the activity of PCs in the lateral cerebellum models the velocity of a moving object even when the target disappears briefly. Further, Angelaki et al. (2004) and Yakusheva et al. (2008) came to a similar conclusion, that the representation of tilt and translation in the brainstem, the cerebellar cortex of the nodulus, and the fastigial nucleus represented an internal model of the physics of the world. Perhaps internal models of the physics of the world are a general operating principle in many areas of the cerebellum.
Another way to think about the function of (and necessity for) an eye velocity positive feedback pathway through the floccular cortex is outlined by Figure 7B. Consider the pursuit of step-ramp target motion. The onset of target motion creates retinal image motion that drives pursuit. However, as eye velocity accelerates and tracks the target almost perfectly, retinal image motion disappears. Thus, a transient retinal image velocity (solid trace in the middle graph of Figure 7B) leads to a sustained eye velocity (solid trace in top graph of Figure 7B). An eye velocity positive feedback circuit would act as an integrator and convert a transient input into a sustained output. It also would allow image motion to serve as a command for changes in eye velocity, which are eye accelerations. Figure 7B shows the plausibility of these ideas: the eye acceleration at the initiation of pursuit (solid trace in bottom graph of Figure 7B) follows the image motion by 100 ms, but shows a remarkable resemblance to the image motion when shifted forward in time and scaled to superimpose on eye acceleration (dashed versus solid trace in middle panel of Figure 7B).
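The conversion of a transient image velocity into a sustained eye velocity can also be simulated in closed loop. The sketch below assumes that image velocity is target velocity minus eye velocity, and that image velocity from 100 ms earlier acts as a command for eye acceleration; the gain of 3 per second and the 20 deg/s ramp are hypothetical values chosen to keep the loop well damped.

```python
DT = 0.001          # simulation step in seconds
DELAY_STEPS = 100   # 100 ms visual delay, as in the text
GAIN = 3.0          # assumed gain from image velocity to eye acceleration (1/s)

def closed_loop_pursuit(target_velocity):
    """Return (eye_velocity, image_velocity) traces for a list of target velocities."""
    eye_velocity, image_velocity = [], []
    eye = 0.0
    for t, target in enumerate(target_velocity):
        image_velocity.append(target - eye)  # retinal image motion at this instant
        delayed = image_velocity[t - DELAY_STEPS] if t >= DELAY_STEPS else 0.0
        eye += GAIN * delayed * DT           # delayed image motion commands eye acceleration
        eye_velocity.append(eye)
    return eye_velocity, image_velocity

# Ramp target motion at 20 deg/s for 2 s (the position step of a step-ramp is not modeled).
eye_vel, image_vel = closed_loop_pursuit([20.0] * 2000)

print("image velocity at start and end:", image_vel[0], round(image_vel[-1], 3))
print("eye velocity at end:", round(eye_vel[-1], 3))
```

Image velocity is large only during the first few hundred milliseconds and then falls toward zero, while eye velocity rises and remains near target velocity, mirroring the relationship between the middle and top graphs of Figure 7B.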
It is possible to think of the eye velocity output from the floccular complex as a representation of target velocity in world coordinates. Such a signal could come from cortical areas, where similar extraretinal responses have been reported (e.g. Newsome et al., 1988; Fukushima et al., 2000), or it could be constructed in a brainstem-cerebellum positive feedback circuit, as I have proposed here. There are theoretical advantages to positive feedback with the short latencies provided by a brainstem-cerebellar loop: longer latencies would have a propensity to oscillate. Further, a plausible neural substrate exists in the relationship between the floccular complex and the nucleus prepositus. Thus, while I cannot exclude a cortical origin for eye velocity output from the floccular complex, I favor the alternative that it arises in subcortical circuits.
It is traditional to think of inverse dynamic models in terms of compensation for the physical plant of the effector organ. In the oculomotor system, the brainstem velocity-to-position integrator proposed by Skavenski and Robinson (1973) is widely viewed as “the” inverse dynamic model that compensates for plant dynamics. To think about the dynamics of floccular output, I will extend the concept of the “plant” to comprise both the eyeball and the brainstem circuits that are downstream from floccular output. My thinking is premised partly on the fact that smooth pursuit is an evolutionarily recent specialization compared to the vestibulo-ocular reflex and saccades: pursuit is most capable of driving highly effective smooth tracking of small objects in primates (although not all species have been studied).
Pursuit signals emanate from the floccular complex and then share the brainstem circuits that must already have been specialized to transform vestibular and saccadic commands into the correct eye movements. Therefore, floccular output may have had to adapt to provide dynamics that are suitable given the pre-existing brainstem circuits: from the perspective of the floccular output, the brainstem and eyeball together may function as the “plant”. To drive eye movement effectively, the output of the floccular complex should have dynamics that complement the filter properties of downstream processing, including the final oculomotor circuits in the brainstem and the mechanics of the eyeball. Here I suggest that the mossy fiber inputs to PCs create the output that would be expected if an inverse model of those dynamics existed in the floccular cortex.
To understand the dynamics of floccular output and its possible function as an inverse model of downstream dynamics, I ask how the pooled floccular output must be transformed to generate the smooth pursuit eye movements recorded at the same time. To do so, we went one step further than did Shidara et al. (1993). Krauzlis and Lisberger (1994) started by averaging the simple spike responses of a sample of PCs to obtain a pooled floccular output during pursuit of step-ramp target motion in the on-direction (Figure 8A) and off-direction (Figure 8B). Comparison of the average firing rates (solid traces) and eye velocities (dashed traces) reveals two problems that make it impossible for a single set of downstream circuits, with a single set of dynamics, to convert both firing rates into eye movement. First, the amplitudes do not match: downstream circuits would need to amplify the responses during off-direction pursuit selectively. Second, the dynamics of eye velocity (Luebke and Robinson, 1988) and firing rate (Krauzlis and Lisberger, 1994) are different at the onset and offset of pursuit: for on-direction pursuit, there is a large overshoot in firing rate at the onset but not the offset of pursuit; for off-direction pursuit, the situation is the opposite. To resolve these problems, Krauzlis and Lisberger (1994) assumed that the output from the floccular complexes on the two sides of the brain might be combined before being transformed by a single brainstem circuit. They created an opponent floccular output (Figure 8C), computed as the difference between the averaged outputs from the floccular complexes on the two sides of the brain, and then processed the opponent output through a model of the brainstem and plant dynamics (Figure 8D). The output of the model provided a good account of the eye movements recorded simultaneously with the PC firing rates at the onset and offset of pursuit for step-ramp target motions (Figure 8E).
The opponent population response in Figure 8C shows transient overshoots and undershoots at the onset and offset of pursuit that are symmetrical. Because of the excellent prediction obtained with this approach, Krauzlis and Lisberger (1994) concluded that the “transient overshoots exhibited in the firing rate of PCs can provide appropriate compensation for the lagging dynamics of the oculomotor plant”. In an earlier study, Stone and Lisberger (1990) had provided evidence that the transient overshoots, while correlated with eye acceleration, are actually driven by visual motion inputs to the floccular complex. Thus, the transients provided by the visual mossy fiber inputs to floccular PCs contribute a critical component that allows the pooled floccular output to compensate for dynamics in the downstream processing. Without the extra transient provided by the visual input, the eye movement driven by floccular output would show lower eye acceleration than it does. The logic outlined here does not prove that the floccular complex contains an inverse model of downstream processing. But it does provide a valid way to interpret the data: the output from the floccular complex appears to be customized to compensate for the dynamics of downstream processing, a signature of inverse dynamics models.
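The compensation argument can be illustrated with a toy simulation. The sketch below assumes, purely for illustration, that downstream processing behaves as a single first-order lag with a 50 ms time constant. It is not the brainstem and plant model used by Krauzlis and Lisberger (1994), and every name and parameter value in it is hypothetical. A command that carries a transient overshoot (the desired eye velocity plus a scaled copy of its derivative) is passed through the lag and compared with a command that lacks the transient.

```python
import numpy as np

def low_pass(command, tau, dt):
    """First-order lag, tau * dy/dt = command - y, integrated with forward Euler."""
    y = np.zeros_like(command)
    for i in range(1, len(command)):
        y[i] = y[i - 1] + dt * (command[i - 1] - y[i - 1]) / tau
    return y

dt = 0.001                      # time step in seconds
t = np.arange(0.0, 1.0, dt)
tau_plant = 0.05                # assumed lag of downstream processing (s)
tau_rise = 0.02                 # assumed rise time of the desired eye velocity (s)

# Desired eye velocity at pursuit onset: a smooth rise to 20 deg/s.
desired = 20.0 * (1.0 - np.exp(-t / tau_rise))

# Inverse of the assumed lag: desired velocity plus tau_plant times its derivative.
# The forward difference matches the Euler step used in low_pass.
derivative = np.diff(desired, append=desired[-1]) / dt
command = desired + tau_plant * derivative

eye_with_transient = low_pass(command, tau_plant, dt)   # tracks the desired velocity
eye_plain = low_pass(desired, tau_plant, dt)            # no transient: sluggish onset

print(f"peak command: {command.max():.1f}, steady command: {command[-1]:.1f}")
print(f"eye velocity at 50 ms, with transient: {eye_with_transient[50]:.1f}")
print(f"eye velocity at 50 ms, without transient: {eye_plain[50]:.1f}")
```

In this toy case the command overshoots its steady level at pursuit onset, and only the command with the overshoot produces the desired rise in eye velocity, which parallels the reduced eye acceleration expected if the visual transient were missing.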
The floccular complex appears to perform one additional transformation of its inputs to configure its outputs in a way that matches the dynamics of downstream processing. Recall that the vestibular and saccadic commands for eye movements comprise command signals for eye velocity. To share brainstem circuits with these other kinds of eye movements, floccular output also should be a command for smooth eye velocity. Yet, the eye movement inputs to the floccular complex are mainly mossy fibers that arise in the nucleus prepositus. They transmit signals that are dominated by eye position with a significant but smaller component related to eye velocity (McFarland and Fuchs, 1992; McCrea and Baker, 1985; Langer et al., 1985). In contrast, the simple-spike outputs from the floccular complex are dominated by eye velocity (Miles and Fuller, 1976; Lisberger and Fuchs, 1978a), with a significant but smaller component related to eye position in some neurons (Noda and Warabi, 1982). Somehow, the circuits of the floccular cortex must reduce the eye position input and emphasize the eye velocity input. Only by providing an output related to eye velocity can the floccular complex co-opt the brainstem VOR pathways and drive smooth pursuit eye movements with the correct dynamics.
Here, I suggest that the combination of head and eye movement inputs to floccular PCs acts as an internal model of the kinematics of smooth gaze control during smooth eye and head movement. As illustrated in Figure 9A and B, it is possible to move gaze through space either by rotating the eyes with a stationary head (A), or by rotating the head and keeping the eyes stationary in the orbit (B). Either strategy has the same effect of changing the angle of the eyes with respect to the stationary world, even though the motor signals and strategies are quite different. Further, the action caused by the vestibulo-ocular reflex (VOR), which reacts to head rotation with an equal and opposite eye rotation, means that the eyes continue to point at the same location in space even though the head moves in space and the eyes move in the orbit (C). The ultimate goal of smooth eye movement is to point the eyes at the desired stationary or moving object, using visual tracking systems to rotate gaze smoothly and vestibular mechanisms to compensate for head turns and prevent gaze from being destabilized by our own motions.
The situation cartooned in Figures 9A–C is also defined by an equation that describes how the combination of eye motion in the orbit (Ė) and head motion in space (Ḣ) determines gaze motion (Ġ), defined as the velocity of the eyes in space:
Ġ = Ė + Ḣ    (3)
The same relationship and terminology could be used for eye/head/gaze position or acceleration. Equation (3) defines one aspect of the kinematics of the orbit, namely the combination of rotation of the eyes and head to control where the eyes are pointed in the visual world. As I discussed in relation to Figure 3, the discharge of many floccular PCs combines signals related to eye and head velocity to achieve an output signal. Because the two sets of signals are equally weighted, at least on average, these cells produce an output that is related to gaze velocity and were named the “gaze velocity Purkinje cells” or “GVP” cells by Robinson (1981). Because they mimic the combination of eye and head velocity that occurs physically in the orbit (Figure 9D), the GVP cells (Figure 9E) can be regarded as the output of an internal model of eye-head kinematics. The head and eye movement signals appear to enter the floccular complex independently, leading me to assume that the internal model resides in the floccular cortex.
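As a small worked example of this kinematic sum, the sketch below evaluates gaze velocity for the three situations cartooned in Figure 9A–C. The velocities of 10 deg/s are arbitrary illustrative values, and the function name is hypothetical.

```python
def gaze_velocity(eye_in_orbit, head_in_space):
    """Gaze velocity is the sum of eye-in-orbit and head-in-space velocity (deg/s)."""
    return eye_in_orbit + head_in_space

# (eye velocity in the orbit, head velocity in space), in deg/s
cases = {
    "A: eyes rotate, head stationary": (10.0, 0.0),
    "B: head rotates, eyes stationary in orbit": (0.0, 10.0),
    "C: VOR, eyes counter-rotate against the head": (-10.0, 10.0),
}

gaze = {name: gaze_velocity(eye, head) for name, (eye, head) in cases.items()}
for name, value in gaze.items():
    print(f"{name}: gaze velocity = {value:+.1f} deg/s")
```

Cases A and B yield the same gaze velocity from different motor strategies, and case C yields zero gaze velocity. A cell that weights eye and head velocity equally would therefore be modulated in A and B and unmodulated in C.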
Do the GVP cells represent the output from an internal model of the mechanics of the orbit, or of the interaction between the VOR and visually-guided smooth tracking? The latter idea, expressed in Figure 10, is based on the concept that a command for eye motion in the orbit is generated by adding floccular output related to gaze velocity and vestibulo-ocular signals calibrated to compensate for head turns and stabilize gaze. During head motion in light or dark, the VOR causes eye motion that is equal in size and opposite in direction to head motion so that gaze motion is minimized. The same is true of GVP cells: during the VOR in light or dark the head and eye velocity components cancel each other so that simple-spike firing rate is, on average, unmodulated. Because the VOR results from vestibular inputs to brainstem pathways, there is no need for the floccular complex to intervene. In fact, the eye velocity positive feedback through the floccular complex would be detrimental during the VOR. If active, the eye velocity positive feedback would integrate a vestibular input that signals head velocity, creating a command for a constant eye acceleration. By nulling the eye velocity positive feedback with a head velocity input, the positive feedback can be suspended during the VOR, but enabled during visual tracking.
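The argument about suspending positive feedback during the VOR can be made concrete with a toy loop. The sketch below is an assumption-laden illustration, not the model of Figure 10: the eye velocity command is taken to be the sum of a brainstem VOR signal and a floccular signal, and the floccular signal is a low-pass filtered copy of its inputs, fed back through eye velocity. The time constant, the weights, and the function name are all hypothetical.

```python
import numpy as np

def simulate_vor(head_velocity, head_weight, vor_gain=1.0, tau=0.1, dt=0.001):
    """Toy loop: eye = -vor_gain * head + pc, where pc low-pass filters
    (eye + head_weight * head). Feeding eye velocity back is the positive feedback."""
    pc = np.zeros_like(head_velocity)
    eye = np.zeros_like(head_velocity)
    for i in range(len(head_velocity)):
        eye[i] = -vor_gain * head_velocity[i] + pc[i]
        if i + 1 < len(head_velocity):
            drive = eye[i] + head_weight * head_velocity[i]
            pc[i + 1] = pc[i] + dt * (drive - pc[i]) / tau
    return eye, pc

head = np.full(500, 10.0)   # constant head rotation at 10 deg/s for 0.5 s

# Positive feedback left active with no head velocity input to null it.
eye_open, pc_open = simulate_vor(head, head_weight=0.0)

# Head velocity input weighted equally with the eye velocity feedback.
eye_null, pc_null = simulate_vor(head, head_weight=1.0)

print(f"no head input:  eye velocity goes from {eye_open[0]:.1f} to {eye_open[-1]:.1f} deg/s")
print(f"head input = 1: eye velocity goes from {eye_null[0]:.1f} to {eye_null[-1]:.1f} deg/s")
```

Without the head velocity input, the floccular signal in this toy loop ramps steadily, so eye velocity changes at a constant rate, that is, a constant eye acceleration. With the head velocity input matched to the VOR, the floccular signal stays at zero and the eye simply counter-rotates against the head.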
In normal monkeys, we cannot distinguish the two alternatives that GVP cells form internal models of the physical kinematics of the orbit versus the interaction between the VOR and visually-guided tracking. Because zero gaze velocity is the baseline status of the VOR, the two alternatives predict the same data for PCs. However, the degeneracy of the two alternatives is broken after learning in the VOR because the physical gaze velocity driven by the adapted VOR is non-zero. If a monkey wears magnifying spectacles for several days to persistently double the size of the visual image, then the amplitude of the VOR increases so that the images seen through the spectacles are stabilized during head turns (Miles and Fuller, 1974). Subsequent vestibular rotation in darkness reveals that the gain of the VOR has increased dramatically. Because eye rotation is now much larger than head rotation, the VOR in the dark produces a large physical gaze motion in space.
Recordings from GVP cells after motor learning in the VOR support the alternative that the floccular complex may be acting as an internal model of the interaction of the VOR and visually-guided smooth tracking. Increases in the gain of the VOR are associated with increases in the strength of the vestibular input to GVP cells (Miles et al., 1980b; Lisberger et al., 1994b; Hirata and Highstein, 2001), modeled by the parameter d in Equation (1) and Figure 10. As a result, the null point for cancellation of the eye and head velocity inputs to GVP cells moves: from the control situation where eye and head velocity are equal to the adapted situation where a larger eye movement is needed to null the vestibular input. After increases in the gain of the VOR, the null point for head and eye velocity inputs occurs when eye movements are larger than head movements and approximately equal to those induced by the adapted VOR in darkness. It appears that learning in the VOR adjusts the signal processing in the floccular complex in a way that maintains a good model of the interaction of the VOR and visually-guided smooth tracking, rather than a good model of the kinematics of the orbit.
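The shift of the null point can be sketched numerically. In the example below, the modulation of a GVP cell is treated as a weighted sum of eye and head velocity, and the weight on head velocity is a stand-in for the strength of the vestibular input. The adapted VOR gain of 1.6 and the matching head weight are hypothetical values chosen for illustration, and the function names are not taken from the source.

```python
def gvp_modulation(eye_velocity, head_velocity, head_weight=1.0, eye_weight=1.0):
    """Toy GVP cell: weighted sum of eye and head velocity inputs (arbitrary units)."""
    return eye_weight * eye_velocity + head_weight * head_velocity

def gaze_velocity(eye_velocity, head_velocity):
    """Physical gaze velocity: eye-in-orbit plus head-in-space velocity (deg/s)."""
    return eye_velocity + head_velocity

head = 10.0                      # head velocity during rotation in darkness (deg/s)

# Control: VOR gain of 1 and equal weights, so firing and gaze are both nulled.
eye_control = -1.0 * head
control_firing = gvp_modulation(eye_control, head, head_weight=1.0)
control_gaze = gaze_velocity(eye_control, head)

# After learning: hypothetical VOR gain of 1.6, with the head weight grown to match.
adapted_gain = 1.6
eye_adapted = -adapted_gain * head
adapted_firing = gvp_modulation(eye_adapted, head, head_weight=adapted_gain)
adapted_gaze = gaze_velocity(eye_adapted, head)

# For comparison: the same adapted eye movement if the head weight had not changed.
unlearned_firing = gvp_modulation(eye_adapted, head, head_weight=1.0)

print(f"control: firing modulation {control_firing:+.1f}, gaze {control_gaze:+.1f} deg/s")
print(f"adapted: firing modulation {adapted_firing:+.1f}, gaze {adapted_gaze:+.1f} deg/s")
print(f"adapted eye, unchanged weight: firing modulation {unlearned_firing:+.1f}")
```

In the adapted case the toy cell is unmodulated even though physical gaze velocity is not zero, which is the pattern that separates a model of the VOR and pursuit interaction from a model of orbital kinematics.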
In thinking about the seemingly complex situation that occurs when spectacles are used to adjust the gain of the VOR, it is important to remember that motor learning in the VOR is intended, in real life, to respond to deficits in vestibular inputs or the strength of extraocular muscle action. Thus, motor learning is an adjustment that is intended to restore the situation where eye velocity is equal in amplitude and opposite in direction to head velocity. The parallel adjustment in the vestibular inputs to GVP cells in association with motor learning in the VOR would maintain the situation where floccular simple-spike output is unmodulated when physical gaze velocity is zero.
I suggest that the floccular cortex is the site of the internal model of the kinematics of the orbit, and I propose that the learned adjustment of the internal model also occurs in the cerebellar cortex. It is tempting to think of learning in the cerebellar cortex in terms of the original cerebellar learning theory (Marr, 1969; Albus, 1971; Ito, 1972), but there is now evidence for multiple sites of plasticity in the cerebellar cortex (Hansel et al., 2000; Mittmann and Häusser, 2007; Jörntell and Hansel, 2006). There also is some evidence for fast learning in the cerebellar cortex (Medina and Lisberger, 2008) and slower learning in the deep cerebellar nuclei (Kassardjian et al., 2005; Shutoh et al., 2006). Future research will have to localize the sites of internal model learning more precisely in the cerebellar cortex and determine the relative importance of learning in the deep nuclei and the cerebellar cortex at different phases of motor learning.
The concept of internal models provides a conceptual structure that helps us think about how the brain controls movement. Many authors have suggested that the cerebellum is a locus of internal models, and that the popular theory of cerebellar learning might come into play to allow cerebellar internal models to be learned and adjusted. There is little doubt that the brain operates as if it contained internal models, but the attractiveness of internal models as a theoretical concept does not mean that they will be localized in a way that allows neurophysiologists to recognize them in the electrical activity of neurons.
The smooth oculomotor system and its associated cerebellar control from the floccular complex provide an excellent system in which to ask about neural representations of internal models, because of the ease of quantifying oculomotor kinematics and dynamics and because of the disynaptic connection from floccular PCs to extraocular motoneurons. I have used these advantages to show how the physiological and anatomical organization of the floccular complex may represent and implement internal models. Given the repeated structure of the cerebellar cortex, it seems reasonable that other motor systems may employ their private pieces of cerebellar cortex in similar ways. Perhaps the insights we can derive based on the advantages of studying oculomotor control provide groundwork that will guide efforts to recognize the potential representation and mechanisms of internal models for a wide variety of cerebellar behaviors.
I thank Philip Sabes, Dora Angelaki, and Hilary Heuer for helpful comments on the manuscript. I am grateful to, and acknowledge the contributions of, a number of former collaborators: Albert Fuchs, Frederick Miles, Leeland Stone, Edward Morris, Richard Krauzlis, Dianne Broussard, Helen Bronte-Stewart, and Javier Medina. Preparation of the review was supported by the Howard Hughes Medical Institute.