Goal-directed behavior can be thought of as dynamic links between sensory stimuli and motor acts. Neural correlates of many of the intermediate events of both auditory and visual goal-directed behaviors are found in the posterior parietal cortex. Here, we review studies that have focused on how neurons in the lateral intraparietal area (area LIP) differentially process auditory and visual stimuli. Together, these studies suggest that area LIP contains a modality-dependent representation that is highly dependent on behavioral context.
Goal-directed behavior is characterized by the flexible mapping of sensory stimuli onto actions (Snyder, 2000). A single stimulus can be arbitrarily mapped onto different actions. Alternatively, different stimuli can be mapped onto the same action. The dynamic quality of these links is the key to goal-directed behavior since it allows humans and other animals to respond adaptively to different environmental scenarios and not just through reflexive loops.
If goal-directed behavior is viewed as dynamic links, we encounter computational questions as to how the nervous system forms, maintains, and alters these links. For example, at any moment in time, we are bombarded with a variety of stimuli from different modalities. How do we choose which of these many stimuli will be the endpoint (cause) of an action? How do we choose which to ignore? Also, the mapping between stimuli and actions is often not one-to-one. Stimuli from different sensory modalities, for example, may elicit the same action. If you have just robbed a bank, the sound of police sirens or the sight of police lights may elicit the same action: run! In other situations, however, the same stimulus can elicit different actions. The sound of the same police sirens may elicit a different action than “run!” if you own the bank that has been robbed. Finally, the valence of a stimulus, the manner in which it is categorized, or even motivational state might affect which action is chosen or whether an action is even selected (Russ et al., 2007).
One cortical region that plays an important part in both auditory and visual goal-directed behavior is the parietal cortex. Parietal activity reflects many of the intermediate processes between sensation and action that are essential for goal-directed behavior.
Indeed, in functional subdivisions of the posterior parietal cortex of rhesus monkeys, such as the lateral intraparietal area (area LIP) and the medial intraparietal area (see below for more details on these parietal areas), neurons are modulated by attention and salience (Bisley et al., 2003; Cohen et al., 2004b; Colby et al., 1999; Gifford III et al., 2004; Goldberg et al., 2002; Gottlieb et al., 2005; Gottlieb et al., 1998; Ipata et al., 2008; Kusunoki et al., 2000; Powell et al., 2000), response selection (Baldauf et al., 2008; Cohen et al., 2004a; Gnadt et al., 1988; Platt et al., 1997; Snyder et al., 2000), coordinate transformations (Andersen et al., 1997; Buneo et al., 2008; Mullette-Gillman et al., 2005; Mullette-Gillman et al., In Press; Sabes et al., 2002; Snyder et al., 1998; Stricanne et al., 1996), and decisions to “act” on a sensory stimulus (Bendiksby et al., 2006; Klein et al., 2008; McCoy et al., 2005; Platt et al., 1999; Schall, 2004; Shadlen et al., 2001; Sugrue et al., 2004; Sugrue et al., 2005). These variables are also reflected in the activity of the human parietal cortex (Binkofski et al., 1998; Connolly et al., 2003; Connolly et al., 2000; DeSouza et al., 2000; Dinstein et al., 2008; Huettel et al., 2006; Jancke et al., 2001; Karnath et al., 2001; Kastner et al., 2000; Kawashima et al., 1996; Levy et al., 2007; Luna et al., 1998; Rushworth et al., 2001; Schluppeck et al., 2005; Schluppeck et al., 2006; Silver et al., 2005; Tosoni et al., 2008). 
Moreover, and key for this review, both non-human and human studies have demonstrated that the parietal cortex is activated during tasks that use auditory or visual stimuli (Ahveninen et al., 2006; Bremmer et al., 2001a; Bremmer et al., 2001b; Bushara et al., 2003; Bushara et al., 1999; Butters et al., 1970; Cohen et al., 2000; Cohen et al., 2004b; Crottaz-Herbette et al., 2004; Cusack et al., 2000; Deouell et al., 2000a; Deouell et al., 2000b; Gifford III et al., 2004; Griffiths et al., 1998; Grunewald et al., 1999; Karabanov et al., 2009; Linden et al., 1999; Mazzoni et al., 1996; Mullette-Gillman et al., 2005; Phan et al., 2000; Schlack et al., 2005; Stricanne et al., 1996; Warren et al., 2002).
Here, we focus on area LIP and review a series of studies on the auditory and visual properties of LIP neurons. First, though, we briefly review the anatomy of the posterior parietal cortex and area LIP.
In both humans and monkeys (Andersen, 1987; Hyvärinen, 1982), the posterior parietal cortex forms a circuit with sensory areas and with frontal, temporal, and limbic areas. The posterior parietal cortex contains a number of functional subdivisions including area 7a, area 7b, the medial temporal area, the medial superior temporal area, the medial intraparietal area, and the lateral intraparietal area (LIP) (Andersen, 1987; Andersen et al., 1989).
Visual and auditory input to area LIP is well-described. Area LIP is classically considered to be part of the dorsal visual processing stream (Ungerleider et al., 1982). Consequently, LIP neurons receive input from neurons in extrastriate visual areas as well as cortical and brainstem areas involved with saccadic eye movements (Asanuma et al., 1985; Blatt et al., 1990; Lynch et al., 1985). The main source of auditory input to area LIP is the temporoparietal cortex (Divac et al., 1977; Hyvärinen, 1982; Pandya et al., 1969), which is part of the parabelt of auditory cortex (Kaas et al., 1998). Neurons in this region of the cortex are sensitive to the location of a sound (Leinonen et al., 1980; Pandya et al., 1985) and, consequently, may support any role that area LIP has in auditory spatial processing. Auditory input to area LIP may also arise via input from multimodal cortical areas (Baizer et al., 1991; Blatt et al., 1990; Seltzer et al., 1991). Indirect auditory input may also reach area LIP through its connections with the frontal cortex or the superior colliculus; neurons in these brain areas receive input from auditory areas and have responses that are modulated by auditory stimuli (Andersen et al., 1985; Andersen et al., 1990; Barbas, 1988; Barbas et al., 1981; Cavada et al., 1989; Hackett et al., 1999; Harting et al., 1980; Kaas et al., 1998; Romanski et al., 1999; Russo et al., 1994; Schall et al., 1995; Stanton et al., 1995; Vaadia, 1989).
We should note that since the prefrontal and parietal cortices are highly interconnected (Andersen et al., 1985; Barbas et al., 1981; Petrides et al., 1984; Schall et al., 1995; Stanton et al., 1995), it is not surprising that many of the aforementioned intermediate processes that are found in the parietal cortex are also found in the prefrontal cortex. Indeed, it is hypothesized that parietal, prefrontal, and other cortical areas form a functional loop that potentially creates and maintains internal representations and possibly transforms these representations into motor acts (Barash et al., 2006; Chafee et al., 2000).
In one of the earliest examinations of LIP auditory activity, Grunewald, Linden, and Andersen (Grunewald et al., 1999) tested how training history modulates LIP responses. They first asked “naïve” monkeys to fixate a light while an auditory stimulus was presented at different peripheral locations. In their study, naïve monkeys were those that had not been operantly trained to associate an auditory stimulus with an action and a subsequent reward; these monkeys, though, had been trained to associate a gaze shift to a visual-stimulus location for a juice reward. Grunewald et al. (1999) found that when the monkeys fixated a light, LIP activity was modulated by the locations of visual stimuli but was not modulated by auditory stimuli at comparable locations. However, following auditory training, LIP neurons were modulated by the locations of auditory stimuli when the monkeys participated in this visual-fixation task.
This pattern of results was interpreted to suggest that laboratory-based auditory training induced some form of “oculomotor salience” on the auditory stimuli. That is, the auditory responses reflected the fact that the monkeys learned to associate the stimuli with an action (shift gaze toward their location) to receive a reward (Assad, 2003; Gottlieb et al., 1998; Kusunoki et al., 2000; Linden et al., 1999).
However, if we hypothesize that LIP neurons reflect stimulus salience (Kusunoki et al., 2000), a different interpretation of the Grunewald et al. study emerges. Namely, auditory stimuli are salient but, in naïve monkeys, LIP auditory responses may be suppressed by a more salient central fixation light. Why would the fixation light suppress the auditory responses? One possibility is that, in primates, visual stimuli are inherently more salient than auditory stimuli (Posner et al., 1976). A second possibility is that, whereas the auditory stimulus was irrelevant for successful completion of the task, the fixation light was highly salient because the monkeys were required to maintain their gaze on the light to receive a reward.
If this suppression hypothesis is correct, a natural prediction would be that if the central light is removed, LIP neurons should be modulated by auditory stimuli even in the absence of auditory training. To test this prediction, we (Gifford III et al., 2004) recorded LIP activity while naïve monkeys listened to auditory stimuli and either (1) fixated a central light or (2) fixated in the dark without a central light. Consistent with our prediction, LIP neurons were modulated by auditory stimuli and had spatially limited response fields.
We interpreted these data to suggest that LIP neurons reflect the relative salience of stimuli. In naïve monkeys, LIP neurons do not code auditory stimuli when they compete with a more salient visual stimulus. However, they do code these same stimuli when the competing stimulus (i.e., the central fixation light) is removed from the environment. A recent human electrophysiological and behavioral study is consistent with these ideas: when visual stimuli are removed from the environment, neural processes are “freed” that allow for enhanced auditory processing (Haroush et al., 2008).
Why do LIP neurons respond differently to auditory and visual stimuli? One hypothesis is that differences between LIP auditory and visual activity simply reflect differences between the physical properties of these two classes of stimuli. One way to address this issue would be to equate the auditory and visual stimuli along a particular psychophysical axis. However, this solution is not as straightforward as it would appear since there is no principled way to determine which axis (e.g., bandwidth or intensity) is the “proper” one along which to equate the two stimuli (Spence et al., 2000).
A second hypothesis is that modality-dependent activity reflects differences between the auditory and visual perceptual systems. For instance, the visual system has a higher spatial acuity than the auditory system (Blauert, 1997; Brown et al., 1978a; Brown et al., 1978b; Brown et al., 1980; Recanzone et al., 1998; Wightman et al., 1993). In contrast, in other situations such as when timing information is critical or when visual-spatial information becomes unreliable, aspects of an auditory stimulus may be more perceptually salient than aspects of a visual stimulus (Alais et al., 2004; Fendrich et al., 2001; Shams et al., 2000; Welch et al., 1980). Consequently, depending on the nature of the task, information provided by the visual perceptual system may be more or less salient than information provided by the auditory system. This idea has been formalized computationally within the context of Bayesian inference, where multi-modal percepts are formed as a function of the more “reliable” stimulus (Burr et al., 2006; Deneve et al., 2004; Ma et al., 2008).
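The reliability weighting mentioned above can be illustrated with a minimal sketch. It assumes that each modality provides an independent, Gaussian-distributed estimate of stimulus location, in which case the combined estimate weights each cue by its inverse variance. The function name, parameter names, and numerical values are hypothetical and chosen for illustration only; they are not taken from the cited studies.

```python
def combine_cues(x_visual, sigma_visual, x_auditory, sigma_auditory):
    """Reliability-weighted estimate of a location from two cues.

    Assumption (not from the reviewed studies): each cue is an
    independent Gaussian estimate with standard deviation sigma.
    """
    # Reliability is defined here as the inverse variance of a cue.
    r_visual = 1.0 / sigma_visual ** 2
    r_auditory = 1.0 / sigma_auditory ** 2

    # Each cue is weighted by its share of the total reliability.
    w_visual = r_visual / (r_visual + r_auditory)
    estimate = w_visual * x_visual + (1.0 - w_visual) * x_auditory

    # The combined estimate is never less reliable than either cue alone.
    sigma_combined = (1.0 / (r_visual + r_auditory)) ** 0.5
    return estimate, sigma_combined, w_visual


# Hypothetical example: a visual cue at 0 deg and an auditory cue at 10 deg.
# With a sharp visual cue (sigma = 1 deg), vision dominates the estimate.
sharp = combine_cues(0.0, 1.0, 10.0, 4.0)
# With a degraded visual cue (sigma = 8 deg), audition dominates instead.
blurred = combine_cues(0.0, 8.0, 10.0, 4.0)
print(sharp, blurred)
```

Under these assumptions, the weight on the visual cue falls from about 0.94 to 0.2 when the visual cue is degraded, which captures the idea that the more reliable modality contributes more to the percept in a given context.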
A third non-exclusive hypothesis is that modality-dependent LIP activity reflects differences in the relationship between a stimulus and the cognitive or behavioral requirements of a task. For example, during saccade tasks, LIP neurons may respond more to visual stimuli than to auditory stimuli (Linden et al., 1999) because, as discussed above, in the context of planning eye movements, visual stimuli are more salient (Kusunoki et al., 2000; Toth et al., 2002) than auditory stimuli. This possibility holds even when the outward behavior of the monkeys is similar (e.g., a saccade to a stimulus location). Indeed, similar outward behavior does not eliminate the possibility that animals use different cognitive strategies and different neural circuits to solve analogous versions of the same task (Gibson et al., 1997).
To test these hypotheses and, in particular, the third hypothesis, we had monkeys participate in the predictive-cueing task (Cohen et al., 2004b). The predictive-cueing task is a version of Posner's cueing paradigm (Posner, 1980) that tests the allocation of attention. In our version of the task, monkeys shifted their gaze to a visual target whose location was predicted by the location of an auditory or visual cue. As found in other cueing tasks (Driver et al., 1998; Posner, 1980), the monkeys' response latency to the target was shorter when the cue predicted the target location than when the cue was not predictive. More importantly, this “predictive effect” was the same regardless of whether the cue was an auditory cue or a visual cue. This result suggested that, within the context of the task, the auditory cue and visual cue had the same task-related salience. However, despite this equivalence, the mean firing rate of LIP neurons was significantly higher when the visual cue was presented than when the auditory cue was presented.
This result suggests that the link between a stimulus and a task is not the only determinant of the level of LIP activity (Cohen et al., 2004b). If it were the only factor, then LIP activity in response to the auditory cue and visual cue should have been the same during the predictive-cueing task. But is this difference between auditory and visual activity absolute or is it task dependent? To address this question, we had monkeys participate in a memory-guided saccade task; in this task, monkeys made saccades to the remembered locations of auditory stimuli or visual stimuli. We found that visual activity during the memory-guided saccade task was comparable to that seen during the predictive-cueing task. However, auditory activity during the memory-guided saccade task was substantially lower than that seen during the predictive-cueing task.
Since auditory activity was task dependent, it suggests that changes in LIP activity reflect differences between the behavioral or task context (salience) of an auditory stimulus (Kusunoki et al., 2000). A similar result was reported by Linden et al. (1999) who observed that LIP neurons were modulated more by an auditory stimulus that signaled the location of a future eye movement than by an auditory stimulus that did not signal the location of a future eye movement. Overall, it appears that the factors that contribute to LIP activity are complex and depend both on the stimulus modality and the behavioral task. However, these modality-dependent differences can be minimized (and hence differences between the representation of salience may be minimized) when auditory and visual stimuli are explicitly equated as they were in the predictive-cueing task.
So far, we have discussed differences between how area LIP represents the more cognitive attributes of auditory and visual stimuli. In this section, we focus on more fundamental properties of the parietal cortex and area LIP: namely, the auditory and visual spatial properties of LIP neurons (Mullette-Gillman et al., 2005; Mullette-Gillman et al., In Press), which underlie area LIP's role in forming extra-personal spatial representations of attention, salience, and other related factors.
Three important points emerge from the Mullette-Gillman studies (2005; In Press). First, bimodal auditory and visual LIP neurons code comparable regions of auditory space. Second, the reference frame of LIP activity during the presentation of a visual or an auditory stimulus is complex and differs from that previously reported (Cohen et al., 2002; Stricanne et al., 1996). That is, LIP neurons do not preferentially code visual and auditory spatial information in a canonical eye- or head-centered reference frame¹, respectively. Instead, the reference frame of both visual and auditory activity can be best described as existing within a continuum of reference frames from eye-centered to head-centered representations. Within this continuum, the auditory and visual reference frames of bimodal LIP neurons are in rough correspondence. Finally, when the monkeys were actually making saccades to the location of the auditory or visual target, we found that LIP activity was still not represented in an eye-centered reference frame but continued to exhibit this head-to-eye-centered continuum. We did, though, find that between the time of target presentation and the time of the saccade there was a slight improvement in the correspondence between visual and auditory signals: auditory signals shifted their coordinates to become slightly more similar to the coordinates of the visual signals. Although the reason why the nervous system uses this continuum of reference frames is not known, the continuum is seen in other parietal regions (Schlack et al., 2005) and other cortical systems (Batista et al., 2007; Wu et al., 2006; Wu et al., 2007), suggesting that it may be a ubiquitous computational format (Pouget et al., 1997). Another possibility is that trying to define the reference frame of cortical (or brainstem) activity in a format that is based on sensory properties, muscle forces, etc. may be too simplistic or even ultimately incorrect (Batista et al., 2007; Mullette-Gillman et al., In Press).
That is, there is no a priori reason to believe that just because LIP neurons are involved in some aspect of spatial processing, they have to represent that information in a code that is dependent on the sensory stimulus or the eventual motor act (Batista et al., 2007).
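The notion of a continuum between head-centered and eye-centered reference frames can be made concrete with a small sketch. It assumes a one-dimensional (horizontal) space, a Gaussian response field, and a single hypothetical parameter, alpha, that describes how far the response field shifts with eye position: alpha = 0 corresponds to a head-centered field, alpha = 1 to an eye-centered field, and intermediate values to hybrid frames. This is an illustrative model only and is not the analysis used in the studies reviewed here; all names and values are assumptions.

```python
import math


def head_to_eye(target_head_deg, eye_position_deg):
    """Express a head-centered horizontal location in eye-centered coordinates."""
    return target_head_deg - eye_position_deg


def hybrid_response(target_head_deg, eye_position_deg, preferred_deg,
                    alpha, width_deg=15.0):
    """Normalized response of a model neuron with a Gaussian response field.

    The field center (in head-centered coordinates) shifts by
    alpha * eye position. alpha = 0 gives a head-centered field,
    alpha = 1 an eye-centered field (hypothetical parameterization).
    """
    center_head_deg = preferred_deg + alpha * eye_position_deg
    distance = target_head_deg - center_head_deg
    return math.exp(-0.5 * (distance / width_deg) ** 2)


def estimate_alpha(eye_positions_deg, field_centers_head_deg):
    """Least-squares slope of field center (head coordinates) against eye position."""
    n = len(eye_positions_deg)
    mean_eye = sum(eye_positions_deg) / n
    mean_center = sum(field_centers_head_deg) / n
    covariance = sum((e - mean_eye) * (c - mean_center)
                     for e, c in zip(eye_positions_deg, field_centers_head_deg))
    variance = sum((e - mean_eye) ** 2 for e in eye_positions_deg)
    return covariance / variance


# Hypothetical neuron whose field center moves by half of each eye displacement.
eye_positions = [-12.0, 0.0, 12.0]
field_centers = [4.0, 10.0, 16.0]
print(estimate_alpha(eye_positions, field_centers))
```

In this toy example the estimated shift is half of the change in eye position, so the model neuron is neither head-centered nor eye-centered but lies midway along the continuum.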
An underlying premise of many LIP studies is that area LIP's role in goal-directed behavior is the same regardless of whether the stimulus is auditory or visual (Cohen et al., 2002). However, as we have discussed above, there are substantial functional differences between auditory and visual LIP activity. Moreover, since these studies used a head-fixed preparation, a significant confound in the monkeys' behavior and the subsequent neural activity may have been introduced (Populin, 2006). Consequently, it is reasonable to hypothesize that these differences do not relate entirely to differences between stimulus saliency, which suggests an alternative role for auditory signals in area LIP.
We hypothesize that area LIP does not perform the same computations on auditory and visual signals (e.g., reflect their salience as a function of firing rate). Instead, auditory and visual signals may play substantially different roles. One possibility is to consider that area LIP is essentially a visual structure and that one of the main functions of auditory signals, and perhaps other extra-visual signals, is to modulate/enhance the computations that area LIP performs on visual stimuli. As such, we suggest that future studies test how the combined integrative effect of simultaneous auditory and visual presentation (Alais et al., 2004; Stein et al., 1993) modulates LIP neurons using more behaviorally-relevant tasks (Populin, 2006). For instance, if there are multiple visual stimuli, a concurrent auditory stimulus at the location of one of the visual stimuli may increase its salience (Kusunoki et al., 2000) and allow for attentional shifts (Goldberg et al., 2002) or eye-movement plans (Snyder et al., 2000) toward its location. Area LIP's primary role then may be to create a visual representation of extra-personal space in which extra-visual signals are used to modulate these representations.
YEC was supported by grants from the NIDCD-NIH and the NIMH-NIH.
¹ A reference frame can be thought of as a set of axes that describes the location of an object. In the earliest stages of sensory processing, auditory, visual, and other sensory signals are coded in different reference frames. For example, describing the location of an auditory stimulus depends initially on the brain's capacity to correlate differences between the time of arrival and intensity of a sound at the two ears with a location, as well as the brain's ability to correlate the location-dependent filtering properties of the ears and head with a sound location. Consequently, identifying the location of a sound depends on the location of the head relative to the location of the sound. This reference frame is referred to as a “head-centered” reference frame. In contrast, describing the location of a visual stimulus depends initially on the pattern of light that falls on the retinas and the resulting pattern of activity in the photoreceptors. This reference frame is referred to as an “eye-centered” reference frame.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.