Acoustic filter properties of A1 neurons can dynamically adapt to stimulus statistics, classical conditioning, instrumental learning and the changing auditory attentional focus. We have recently developed an experimental paradigm that allows us to view cortical receptive field plasticity on-line as the animal meets different behavioral challenges by attending to salient acoustic cues and changing its cortical filters to enhance performance. We propose that attention is the key trigger that initiates a cascade of events leading to the dynamic receptive field changes that we observe. In our paradigm, ferrets were initially trained, using conditioned avoidance training techniques, to discriminate between background noise stimuli (temporally orthogonal ripple combinations) and foreground tonal target stimuli. They learned to generalize the task for a wide variety of distinct background and foreground target stimuli. We recorded cortical activity in the awake behaving animal and computed on-line spectrotemporal receptive fields (STRFs) of single neurons in A1. We observed clear, predictable task-related changes in STRF shape while the animal performed spectral tasks (including single tone and multi-tone detection, and two-tone discrimination) with different tonal targets. A different set of task-related changes occurred when the animal performed temporal tasks (including gap detection and click-rate discrimination). Distinctive cortical STRF changes may constitute a “task-specific signature”. These spectral and temporal changes in cortical filters occur quite rapidly, within 2 minutes of task onset, and fade just as quickly after task completion or, in some cases, persist for hours. The same cell could multiplex by differentially changing its receptive field in different task conditions. On-line dynamic task-related changes, as well as persistent plastic changes, were observed at a single-unit, multi-unit and population level.
Auditory attention is likely to be pivotal in mediating these task-related changes since the magnitude of STRF changes correlated with behavioral performance on tasks with novel targets. Overall, these results suggest the presence of an attention-triggered plasticity algorithm in A1 that can swiftly change STRF shape by transforming receptive fields to enhance figure/ground separation, by using a contrast matched filter to filter out the background, while simultaneously enhancing the salient acoustic target in the foreground. These results favor the view of a nimble, dynamic, attentive and adaptive brain that can quickly reshape its sensory filter properties and sensori-motor links on a moment-to-moment basis, depending upon the current challenges the animal faces. In this review, we summarize our results in the context of a broader survey of the field of auditory attention, and then consider neuronal networks that could give rise to this phenomenon of attention-driven receptive field plasticity in A1.
How is the cortical representation of sound influenced by attention? Since the work of Hubel, Galambos and colleagues (1959), it has been known that the responses of single neurons in auditory cortex can be strongly modulated by attention. In their pioneering study in awake cat auditory cortex, a brief but prescient paper in the style of anecdotal neurophysiology, they observed that the responses of some cells (~10%) were highly dependent upon whether or not the cat was aroused by the presented sounds, or attended to the acoustic stimuli. As Hubel wrote (personal communication 2006): “One day I entered the triple soundproof room to see if the cat was still alive, and discovered that rattling the doorknob, or keys, produced clear and lively responses … I found that almost anything I did that made a noise elicited firing as long as the cat appeared interested.” Some of their “attention” units were located in A1, others in higher auditory cortical areas. In their study, they noted several characteristics of sounds that elicited an attentive state in their cats and led to enhanced neural responses: (i) novelty (i.e. novel sounds were better than repeated sounds), (ii) meaning (i.e. natural sounds were better than clicks or tones), (iii) multisensory spatial coherence (i.e. acoustic stimuli presented simultaneously with a matched visual source were better than sounds without a matched visual counterpart). Although the cats were fully awake in the experiments of Hubel and colleagues, they were not behaviorally trained on any auditory task, so it was not possible in this early study to systematically explore the role of goal-directed attention in modulating sensory processing, a challenge left for future research. One caveat, noted by Hubel, is that this study did not control for pinnae movement, nor measure neuronal directionality tuning, thus leaving open the question of whether the observed effects were truly the results of either spatial or feature-based attention.
Other contemporary experimentalists working during this time period on the awake cat or monkey auditory cortex, such as Katsuki, did not mention the presence of any such “attention” units. However, in a fairly thorough study of responses in awake cat auditory cortex (Evans and Whitfield, 1964), the authors wrote: “About one third of the units responding to sound could be stimulated only by clicks or ‘odd’ sounds, such as the jangling of keys. Many of them gave inconsistent responses unless the attention of the cat was attracted to the source of the sound. These resemble the ‘attention’ units reported by Hubel et al. (1959). Some of these units had very low thresholds, but most required loud ‘startling’ sounds for consistent stimulation.” Curiously, these researchers said almost nothing more about these “attention” neurons in the rest of this paper, or in two subsequent publications on the awake cat auditory cortex, perhaps because “… all of those units which required ‘odd’ sounds to stimulate them, or where much ingenuity and experiment were necessary to obtain the ‘attention’ of the unit, were obtained from cortex which was relatively inactive …”. In the following 20 years, a handful of neurophysiological studies continued the investigation of the effects of auditory attention on cortical processing in the context of behavior (including Hocherman et al., 1976; Pfingst et al., 1977; Benson & Hienz, 1978; Miller et al., 1980). These studies demonstrated increases in evoked discharge for an attended stimulus compared to an identical non-attended stimulus and showed that these effects could occur with remarkably short latency.
However, as Hubel and colleagues (1959) had ruefully noted: “Unfortunately attention is an elusive variable that no one has as yet been able to quantify.” It remains so today. Although there has been considerable research on auditory attention over the past fifty years, using a variety of approaches (psychoacoustic, behavioral, neurophysiological (single unit and EEG), MEG, fMRI neuroimaging), the underlying neural mechanisms remain mysterious. Moreover, to make the problem even more challenging, there is clear evidence that attention itself, defined as a top-down selection process that focuses cortical processing resources on the most relevant sensory information in order to maintain goal-directed behavior in the presence of multiple, competing distractions, is hardly a unitary phenomenon, but may be comprised of several distinct behavioral and neural processes (Posner and Peterson, 1990; Desimone and Duncan, 1995; Parasuraman, 1998; Ahveninen, et al., 2006; Johnson and Zatorre, 2006).
So, what do we currently know about auditory attention? We know that auditory attention allows us to rapidly direct our acoustic focus towards sounds of interest in our acoustic environment. Attention can be bottom-up (sound-based) or top-down (task-dependent), and top-down control can trump involuntary attention switching to task-irrelevant distractor sounds (Sussman et al., 2003), perhaps through top-down attentional modulation by the prefrontal cortex of the deviance detection system in the auditory cortex (Doeller et al., 2003). Attention provides a top-down salience filter that, in conjunction with bottom-up “pop-out” auditory salience (Kayser et al., 2005), is thought to pass only a small part of the incoming acoustic information to higher auditory areas. Attentional mechanisms can modulate neural activity encoding the spatial location and/or the acoustic attributes of the selected targets and the early sensory representation of attended stimuli (Ahveninen, et al., 2006). This is illustrated by one of the best-known examples of auditory attention, the “cocktail party effect” (Cherry, 1953; Haykin and Chen, 2005), where we can easily and selectively eavesdrop on different speakers in a crowded room brimming with multiple conversations. Cherry speculated on possible cues to its solution, including location, lip-reading, mean pitch differences, different speaking speeds, male/female speaking voices, or accents. However, whatever the cues, or the exact mechanisms involved in deciphering them, it is clear that in order to accomplish this feat of selective attention to a single stream in a natural environment with multiple sound sources, we must already be highly proficient at auditory scene analysis (or ASA). As Bregman’s influential studies emphasized (Bregman, 1990), listeners have to solve the ASA problem in order to extract one or more relevant auditory streams from the mixture of sources that typify their acoustic environment.
Sound sources may differ in a variety of acoustic cues (location, instantaneous fundamental frequency, or the patterns of energy envelope modulation in different frequency bands) that facilitate grouping. There is evidence that the brain has a fairly sophisticated pre-attentive automatic scene analysis system that parses the acoustic scene into streams and analyzes stability and novelty, even for task-irrelevant streams (Winkler et al., 2003). This automatic process may correspond to what Bregman referred to as a “bottom-up” or “primitive” grouping. In addition, Bregman suggested a set of top-down grouping processes which he termed “schema-driven” mechanisms, based on acquired expectations from prior experience or knowledge. Recent results (Carlyon, 2004; Cusack et al., 2004; Wrigley and Brown, 2004; Molholm et al., 2005, Snyder et al. 2006) also suggest the presence of two cortical mechanisms of streaming – an automatic “pre-attentive” segregation of sounds and an attention-dependent streaming mechanism. The process of auditory scene analysis sets the stage and seamlessly interacts with the auditory attention system (Naatanen et al., 2001; Opitz et al., 2005; Sussman, 2005). Thus, an explanation of the cocktail party effect must include an understanding of the interplay between ASA, and our abilities to direct spatial attention to sound sources within the acoustic scene and/or to direct featural attention by focusing on distinctive acoustic vocal features (such as fundamental frequency, timbre, accent, intonation) in order to identify individual speaker voices (Ahveninen, et al., 2006).
There may be a similarity between attention in the auditory and visual modalities, where a two-component framework for attentional selection (top-down and bottom-up) has also emerged from psychophysical and behavioral studies. Two sets of mechanisms are thought to operate in parallel in both modalities: using either bottom-up, automatic, image-based saliency cues or top-down, attentional, task-dependent cues. Another fundamental similarity (and duality) common to both modalities is that attention can either be spatial or feature-based. We will continue to explore the comparison between visual and auditory attention in the final section.
Overall enhancement of human auditory cortex activity by selective attention has been shown by functional MRI (Grady et al., 1997; Jancke et al., 1999, 2003; Rama & Courtney, 2005; Voisin et al., 2006), PET (Zatorre et al., 1999; Hugdahl et al., 2000; Alho et al., 2003; Johnson & Zatorre, 2005), EEG (Hillyard et al., 1973) and MEG (Woldorff et al., 1993; Ozaki et al., 2004; Ahveninen, et al., 2006). Auditory attention can selectively be directed to a rich variety of features including spatial location, auditory pitch, frequency or intensity, to tone duration or FM direction or slope, to speech vs. nonspeech streams. What is the neural locus of these auditory attentional effects? Although some human imaging studies have shown clear attentional modulatory effects in A1, as well as other primary and secondary auditory cortical regions, other studies (Petkov et al., 2004) report greater effects of auditory attention in higher auditory association areas, at least in a dual task paradigm (comparing responses when one sensory modality is attended and the other is ignored). Petkov and colleagues suggest that there may be two distinct types of auditory cortical pathways, one of which faithfully transmits acoustic information for all incoming stimuli and is unaffected by attentional bias, and another which is attentionally labile, is strongly modulated by attention and analyzes the acoustic features of behaviorally relevant sounds. Although intriguing, it is possible that this distinction would evaporate if subjects were tested with other auditory task conditions besides pitch discrimination in a dual task context (Petkov et al., 2004) which might reveal additional attentional modulatory effects in complementary cortical areas (Ahveninen, et al., 2006). The work of Brechmann and Scheich (2005) demonstrates that attentional focus on different features of the same acoustic stimuli leads to differential hemispheric activation of auditory cortex. 
There is also some evidence for hemispheric specialization of the attentional system – for example a study by Zatorre and colleagues (1999) suggests that auditory attention to either spatial location or tonal frequency activates a common network of right hemisphere cortical regions. A recent MEG/fMRI paper (Ahveninen, et al., 2006) provides further evidence for the presence of dual selective-attention effects on sound localization and identification. Additional evidence for lateralization is provided by a recent ERP study (Alain et al., 2006) that observed plastic changes in event-related potentials during rapid perceptual learning while listeners were trained to distinguish between two phonetically distinct vowels. These changes occurred in right auditory cortex and right anterior STG/inferior prefrontal cortex and were dependent upon auditory attention to the phonetic discrimination task.
In general, bimodal selective attention usually leads to widespread increased activity in relevant sensory cortices while simultaneously leading to decreased activity in irrelevant sensory cortices (Johnson and Zatorre, 2006). Other association cortical areas in the attentional network (Posner and Peterson, 1990) are also activated in auditory attention – such as the posterior parietal cortex (Cohen et al., 2005; Shomstein and Yantis, 2004, 2006), and right inferior frontal and dorsolateral prefrontal cortex (Voisin et al., 2006). Moreover, neuroimaging studies of the thalamus (Frith and Friston, 1997) and physiological (McAlonan et al., 2006) and neuroanatomical (Sakoda et al., 2004) studies of the thalamic reticular nucleus suggest that the different thalamic nuclei may play important roles in attentional modulation and in helping direct the shifting focus of attention (Crick, 1984). Most recently, physiological studies by Otazu and Zador (2006) have observed an attention-driven overall enhancement of spontaneous activity in the medial geniculate thalamus, which may play a role in generating selective responses in auditory cortex. There is also evidence for auditory attentional modulation of activity in the superior colliculus in mammals, in parallel with the demonstration of top-down gain control in the avian midbrain (Winkowski and Knudsen, 2006). And a recent study (Perez-Gonzalez et al., 2005) has shown the presence of novelty detector neurons in the inferior colliculus that may contribute to subcortical attentional, arousal or orienting responses. In sum, these results suggest that auditory attention involves a wide range of auditory cortical and subcortical structures, and also integrates into a multisensory attentional network that includes parietal and frontal cortical regions (Bidet-Caulet et al., 2005; Foxe et al., 2005; Peers et al., 2005; Serences and Yantis, 2006; Raz and Buhle, 2006).
Looking at the whole set of brain areas involved in the control of auditory attention reveals a richly interconnected network, that includes the computation of early auditory features, the location of acoustic items of interest, recognition of auditory objects, and the planning of actions.
In light of this review of auditory attention, a number of key questions emerge, including the following: (1) what is the relationship between auditory spatial attention and auditory feature-based attention? (2) what are the contributions of different neural loci (including multiple subcortical and auditory cortical areas) to these different forms of auditory attention, and what are the network dynamics for this widely distributed set of structures modulated by auditory attention? (3) what is the relation between arousal, vigilance and attention? (4) what is the neural basis of pre-attentive processing represented by the MMN, and how does it integrate with attention on a cellular and network level? (5) what role does attention play in modulating neuronal receptive fields in A1? (6) what are the neural mechanisms underlying attentional effects such as STRF shape changes? (7) is the time course of task-related plasticity compatible with the time course of attention? (8) what is the relationship between learning and attention, and how does task training shape the direction of attention? We will touch on some of these questions in the review, and leave others for future work. However, in section II, our discussion will focus on the results of some of our recent experimental studies to explore the possible role of attention in modulating A1 receptive field properties.
The adaptive functions of the cerebral cortex rely upon flexibility and plasticity of information processing networks. Many previous studies have demonstrated that local and global properties of the auditory cortex (specifically in A1) are extraordinarily plastic in their response properties under a variety of training procedures (see Weinberger, 1998, 2000, 2003a,b; Fritz et al., 2005b). Receptive fields and frequency response profiles of A1 neurons can be attentionally gated to adaptively assume different states or filter properties depending upon the behavioral demands of the ongoing task. An important study by Polley and colleagues (2006) shows that differential cortical plasticity arises during perceptual learning when animals attend to different features of the same acoustic stimulus set. Attention may also be instrumental in shifting from one cortical state to another. A recent study by Blake and colleagues (2006) demonstrates that a combination of acoustic stimuli and reward is insufficient to evoke cortical plasticity in the absence of an active, behavioral link between the two, and emphasizes the importance of forging dynamic links between sensory stimuli and motor actions during task learning (Cohen et al., 2005).
We began our current research on the effects of auditory attention on primary auditory cortex because we thought it might be valuable to study the impact of attention by examining dynamic changes in receptive field shape under different auditory attention conditions, in which the animal needs to focus on different salient acoustic features or cues in order to perform the task. To quantify these attention-driven, on-line adaptive changes in auditory cortex, we developed a new procedure to study rapid task-related receptive field plasticity in the awake, behaving animal (Fritz et al., 2003; 2004a; 2005a,b). In our experimental paradigm, ferrets are trained to distinguish behaviorally between two variable and broad acoustic categories, target and reference sounds, which have different reward value. Once they have learned the task, and the accompanying listening strategy (Wright and Fitzgerald, 2004), one may say that the ferrets have learned to prune the incoming acoustic input and extract the salient acoustic cues to differentiate reference and target sounds. In our paradigm, although the target stimulus is highly variable from day to day, once one of the daily behavioral physiology sessions has begun, the reference and target characteristics are fixed for the rest of the 10-30 minute session. Hence, after one or two trials, the ferret knows what to listen for, providing an opportunity for top-down attentional guidance. We suggest that the target of attention is selected at the top level of the auditory processing hierarchy, and that by top-down biasing, earlier sensory processing of the acoustic features of the attended target is then enhanced. This combination of stimulus variability during training, and stimulus stability during testing, gives us the ideal opportunity to measure neuronal plasticity in a cortical population that is poised for change, but has not already been biased by prior repetitive training with fixed target stimuli.
Since we have summarized our methods in previous papers (Fritz et al., 2003, 2004a, 2005a, 2005b), we will only briefly sketch them in this review. Readers interested in methodological details should consult the original papers cited above. The essence of our approach is to record from single neurons in A1 while the animal performs a variety of different auditory tasks, with the goal of quantitatively analyzing the nature and time-course of state-dependent, task-dependent adaptive plasticity in the auditory cortex on a cellular and network level. Once we obtain a stable recording of an isolated A1 neuron in the awake ferret, the design of our experiments is to (1) rapidly and comprehensively characterize the cortical STRF in the “non-attentive” or “pre-behavioral” condition (we also refer to this initial STRF as the “passive” or “quiescent” STRF), (2) characterize the behavioral STRF while the animal is actively and attentively engaged in one type of auditory task and compare this “behavioral-STRF-1” to the initial “pre-behavioral” and subsequent “post-behavioral” quiescent STRFs, (3) if possible, characterize and compare STRF plasticity in the same cell while the animal performs a different auditory task or tasks (leading to “behavioral-STRF-2”, “behavioral-STRF-3”, etc.). All experiments follow the same basic behavioral paradigm of conditioned avoidance, developed by Heffner and Heffner (1995), which we have slightly modified in our experimental design. The animal is trained to continuously lick water from a spout for a variable number of reference sounds (1-7), and learns by conditioned avoidance to refrain from licking after hearing a single distinctive warning target in order to avoid mild shock.
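The trial structure just described can be sketched in a few lines of Python. This is only an illustrative sketch, not the software used in the experiments: the function names `make_trial` and `shock_delivered` are hypothetical, the uniform random choice of the number of references and of the TORC index is an assumption, and the default of 30 TORCs is taken from the stimulus set described later in this review.

```python
import random

def make_trial(n_torcs=30, min_refs=1, max_refs=7, rng=None):
    """Build one trial: 1-7 reference sounds followed by a single target.

    Each reference is a (label, torc_index) pair. Drawing the count and the
    TORC index uniformly at random is an assumption made for illustration.
    """
    rng = rng or random.Random()
    n_refs = rng.randint(min_refs, max_refs)  # inclusive on both ends
    refs = [("reference", rng.randrange(n_torcs)) for _ in range(n_refs)]
    return refs + [("target", None)]

def shock_delivered(label, licked):
    """Conditioned avoidance rule: licking is safe during references,
    but licking after the warning target leads to a mild shock."""
    return label == "target" and licked

if __name__ == "__main__":
    trial = make_trial(rng=random.Random(1))
    print([label for label, _ in trial])
```

Keeping the contingency in a separate function makes it explicit that the animal's correct behavior differs by stimulus class: keep licking through the references, stop licking after the target.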
In all detection experiments, reference sounds are drawn from a class of ripple stimuli called TORCs (temporally orthogonal ripple combinations) which are temporally and spectrally rich, broadband stimuli that also serve during physiological experiments to characterize the STRF of the cell under study. By contrast, in detection experiments, the target sound varies from one experiment to another with distinctive cues that have salient spectral or temporal, or combined spectrotemporal features. We grouped the tasks by the type of acoustic target that the animal must attend in order to perform the task (tones, tones in noise, silent gaps, tone duration, click rate, FM sweep direction, acoustic streams, CV phonemes, etc.). The major goal of the research we will describe here was to investigate auditory cortical plasticity induced by tonal targets in spectral tasks, and to contrast their effects in two distinct behavioral contexts: single tone and multiple-tone detection, and two-tone discrimination. We shall also describe preliminary results of our studies of plasticity arising from temporal tasks in which the animal performs either gap detection or click rate discrimination.
The ferrets learned a general or “cognitive” version of the detection and discrimination tasks for a variety of tasks and target stimuli. After training, they reached a stable behavioral performance level in which they could perform equally well on any tone detection or discrimination task for any target frequencies chosen during the experiment (target frequencies were randomly chosen from a seven-octave range of 125 Hz to 16 kHz). After initial training, the animal received a surgical headpost implant that allowed the head to be stably positioned during physiological recording. After recovery from surgery, the ferrets were re-trained on the task while restrained in a cylindrical, horizontal holder, with the head fixed in place.
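As a quick check on the stated range, 16 kHz is exactly seven doublings of 125 Hz. The sketch below verifies this and shows one way a target frequency could be drawn from the range. The text says only that targets were randomly chosen; sampling uniformly on an octave (log2) axis is an assumption here, and the function names are hypothetical.

```python
import math
import random

LOW_HZ = 125.0
HIGH_HZ = 16000.0

def octave_span(low_hz, high_hz):
    """Number of octaves (doublings) between two frequencies."""
    return math.log2(high_hz / low_hz)

def random_target_frequency(rng, low_hz=LOW_HZ, high_hz=HIGH_HZ):
    """Draw a frequency uniformly on an octave axis (assumed scheme)."""
    octaves = rng.uniform(0.0, octave_span(low_hz, high_hz))
    return low_hz * 2.0 ** octaves

if __name__ == "__main__":
    print(octave_span(LOW_HZ, HIGH_HZ))
    print(round(random_target_frequency(random.Random(0))))
```

Sampling on an octave axis gives each octave the same chance of containing the target, which roughly matches the logarithmic frequency axis on which tonotopic maps and STRFs are usually plotted.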
We recorded single and multi-unit responses to a set of 30 different TORCs in the different behavior conditions using standard physiological techniques. In all conditions, receptive fields were based upon the neural responses to TORCs (for a detailed discussion of TORCs, and the use of TORC responses to characterize neuronal STRFs using reverse correlation techniques see Klein et al., 2000; Depireux et al., 2001; Miller et al., 2002, Escabi and Read, 2003). The intensity of tone stimuli, and TORC stimuli for all STRF measurements for all behavioral and passive control studies conducted while recording from this site, was held constant – at an amplitude value that typically was selected within the range 60-75 dB SPL. This intensity range was chosen for behavioral reasons, to ensure that the animal could clearly hear the task stimuli. We note that an essential feature of our method for quantifying changes in the STRF is that we have focused on changes in normalized STRF shape (see Fritz et al., 2003, 2005a,b). We have looked for, but not observed any consistent changes in overall gain during task performance for any of our tasks.
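The emphasis on normalized STRF shape, rather than overall gain, can be made concrete with a small sketch. The normalization below (dividing each STRF by its peak absolute value) is an assumption chosen for simplicity; the exact procedure is described in the original papers cited above and may differ. STRFs are represented as plain nested lists (frequency rows by time-lag columns), and the names `normalize` and `strf_diff` are hypothetical.

```python
def normalize(strf):
    """Scale an STRF (rows = frequency, columns = time lag) to unit peak
    magnitude. Peak normalization is an assumed, illustrative choice."""
    peak = max(abs(v) for row in strf for v in row)
    if peak == 0:
        return [row[:] for row in strf]
    return [[v / peak for v in row] for row in strf]

def strf_diff(active, passive):
    """Difference of normalized STRFs: behavioral minus pre-behavioral.
    A pure gain change cancels out, so only shape changes remain."""
    a, p = normalize(active), normalize(passive)
    return [[av - pv for av, pv in zip(ar, pr)] for ar, pr in zip(a, p)]

if __name__ == "__main__":
    passive = [[0.0, 1.0], [-0.5, 2.0]]
    gain_only = [[3 * v for v in row] for row in passive]
    print(strf_diff(gain_only, passive))  # all zeros: no shape change
```

Under this definition a positive value at a frequency row means relative facilitation during behavior, and a negative value means relative suppression.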
In the tone detection task, the ferret identifies the presence of a tone against a background of TORCs. This may be thought of as a tone detection task, or alternatively as a discrimination task in which the ferret discriminates between tones (narrow-band) and broad-band rippled noise (the TORCs). In either case, the ferret must attend to the appearance of a narrow-band tonal stimulus (which changes randomly between sessions but is fixed in frequency during each behavioral session) in order to avoid mild shock. The most common change in an STRF observed during this task was “facilitation” at the target frequency as a result of either an enhancement of its excitatory field, or a weakening of the inhibitory sidebands. These overall effects of tone detection on the population STRFdiff are illustrated in Figure 1 (top panel), and confirm the specificity of these effects at a population level.
In greater detail, 72% of cortical cells showed a significant STRF change while the animal was engaged in performance of a detection task (Fritz et al. 2003a). In 80% of these cells, a facilitative or positive STRF change occurred, i.e., an enhancement of the excitatory fields or a reduction of the inhibitory sidebands during the tone detection task. Evidence for a correlation of neuronal plasticity and behavior was provided by comparing the relationship between behavioral performance and the pattern of STRF changes (see Figures 2 and 5 in Fritz et al., 2003a). In the majority of cases, these STRF changes persisted after the behavior was over, whereas in other cases, the behavior STRFs reverted immediately to their pre-behavioral state. In some experiments, STRFs from the same neuron were measured in a series of tone-detection tasks with different target frequencies, which were chosen in order to probe different excitatory and inhibitory regions of the same STRF. Remarkably, in a number of these cases, the effects of performing successive tone-detection tasks were imprinted on the STRF for long durations of time, well after the tasks were completed, for up to 4-5 hours (we expect that persistence of these effects may last much longer, but we are limited by our current experimental design, which permits a maximum of 6 hours of daily recording from A1 with electrodes which are inserted and withdrawn each day). Since over half of all cells encountered in our experiments exhibited STRF changes that persisted after one or more tasks, we have recently measured the population average difference between post-behavior, passive STRFs and pre-behavior, passive STRFs to confirm the presence of long-term post-behavioral changes.
This novel (‘pre-post’) STRFdiff reveals a compelling set of changes that are very similar to the (‘pre-behavior’) population STRFdiff between the active STRFs and the pre-passive STRFs for all of the spectral tasks we have analyzed (single tone and multiple tone detection and two-tone discrimination). The fact that the reshaped STRF continues in changed form following task completion suggests that attention triggers a set of task-related plastic changes that do not require the maintenance of active attention in order to persist.
When an animal is discriminating between the frequencies of two tones (as opposed to simply detecting the presence of one tone as above), the STRFs could change adaptively so as to improve performance by enhancing “foreground” over “background”, facilitating the STRF at the target (foreground) frequency while suppressing it at the reference (background) frequency. This hypothesis is consistent with earlier results obtained by Edeline and Weinberger (1993) and Blake et al. (2002). In theory, another equally viable strategy to achieve discrimination might be to enhance “background” while suppressing “foreground” stimuli, since the key point for the nervous system is to enhance the contrast between the stimuli. In order to investigate this question, ferrets were trained on a two-tone discrimination task, which was a modified version of the earlier detection task except that now each reference sound was a hybrid consisting of the combination of a TORC and a tone. The TORC stimulus was immediately followed (with no delay) by a 500 ms reference tone (distinct from the target tone) (see Fritz et al., 2005a), and this tone was invariant for the rest of the reference sequences for that session. The target was also a hybrid stimulus, consisting of a TORC followed by a 500 ms target tone that was distinct from the reference tone in frequency. The animal soon learned to attend to the reference tone frequency and respond only when the frequency changed, which occurred when the target tone was presented. Our overall results, summarized at a population level, show that a pattern of specific suppression of the facilitatory fields of STRFs at the reference frequency is prevalent during discrimination (Fritz et al., 2005a,b), although there is still enhancement at the target frequency as in the single-tone detection task (see middle panel of Figure 1).
Interestingly, in overall amplitude and in spectral selectivity, the average STRF change (at the reference frequency) during discrimination appears to be exactly the opposite of the average STRF change (at the target frequency) during detection and may be operating by a common mechanism (with a sign reversal).
In summary, a comparison of the average population STRF differences aligned at target frequency for the detect task (similar to that seen for target in two-tone discrimination tasks) and aligned at reference frequency for the two-tone discrimination task showed opposite effects. The average STRF change at target frequency, irrespective of task, shows enhancement, whereas the average STRF change at reference frequency in the two-tone discrimination task shows suppression. Given these opposite effects, it was possible to play one effect against another, for example by recording from the same neuron in two different task conditions in which the same stimulus had two different meanings – as reference in the first task (two-tone discrimination) and as target in the second task (single tone detect). As predicted, neurons show a chameleon-like adaptive ability to change their STRFs based upon changing task conditions. Similar effects were seen at a population level in multiple task sequences (see Fritz et al., 2005a,b).
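Because the target (or reference) frequency differs from one recording to the next, a population average of STRF differences is only meaningful after each difference has been shifted so that the frequency of interest falls in the same row. A minimal sketch of such an alignment follows. The function name, the use of integer frequency-bin indices, and the handling of rows that fall off the edge of the frequency axis (they are simply left out of the average for that row) are all assumptions for illustration, not the published analysis.

```python
def align_and_average(diffs, align_bins, half_width):
    """Average STRF differences after centering each on its own tone.

    diffs: list of STRF differences (rows = frequency, columns = time lag).
    align_bins: per-cell row index of the target (or reference) frequency.
    half_width: number of frequency rows kept on each side of that row.
    """
    n_lags = len(diffs[0][0])
    n_rows = 2 * half_width + 1
    sums = [[0.0] * n_lags for _ in range(n_rows)]
    counts = [0] * n_rows
    for diff, center in zip(diffs, align_bins):
        for offset in range(-half_width, half_width + 1):
            src = center + offset
            if 0 <= src < len(diff):  # skip rows beyond the frequency axis
                out_row = sums[offset + half_width]
                for k, v in enumerate(diff[src]):
                    out_row[k] += v
                counts[offset + half_width] += 1
    return [[v / c if c else 0.0 for v in row]
            for row, c in zip(sums, counts)]

if __name__ == "__main__":
    cell_a = [[0.0], [1.0], [0.0]]   # change at row 1, tone at row 1
    cell_b = [[1.0], [0.0], [0.0]]   # change at row 0, tone at row 0
    print(align_and_average([cell_a, cell_b], [1, 0], half_width=1))
```

In the aligned average, enhancement at the target frequency shows up as a positive value in the center row, and suppression at the reference frequency (when aligning on the reference tone) as a negative one.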
Among the many questions that arose during our study of single tone detection was whether comparable effects would occur if the animal attended to multiple simultaneously presented tones. Since it is not possible for ferrets or humans to extract individual tone frequency components from a random complex chord, we reasoned that the animal would not be able to focus its attention on specific frequency locations, and could not rehearse the complex “chord” in auditory short-term memory. Thus, the new multiple-tone detection (MTD) experiment was designed so that the results might help clarify whether specific focal attention to a particular location on the frequency axis was necessary in order to obtain receptive field plasticity. Harmonic chords with multiple component tones were used as targets. The population average from a subset of our preliminary (unpublished) data, incorporating only the results of experiments where the multitone targets had either two or three components, each separated by 1 octave, is shown in Figure 1 (lower panel). These preliminary results suggest that during performance of the MTD task, A1 neurons show a pattern of enhancement and suppression similar to what might be predicted from a linear summation of the effects seen in single tone detection. Moreover, on average, we see significant changes in the STRFs at two or more tone frequencies in the chord complex, suggesting that there was a faithful global imprint of much, or all, of the auditory chord object on the reshaped STRF.
To generalize, our data suggest that it may be reasonable to summarize the STRF changes we observe during performance of spectral tasks in terms of a contrast matched filter hypothesis. We propose that neurons in A1 may change their receptive field properties so as to enhance the contrast between reference and target stimuli. In its simplest form, this hypothesis is illustrated in Figure 1 (right side figures for all three panels), for the single-tone and multi-tone detection tasks, and two-tone discrimination tasks. We are currently developing a formal model of the contrast matched filter hypothesis and also experimentally testing the predictive power of this hypothesis with new reference-target pairs in newly developed spectral and spectro-temporal tasks.
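One way to make the contrast matched filter idea concrete is the toy sketch below. It is not the formal model mentioned above, which the text describes as still under development. It only assumes, for illustration, that the change in a spectral slice of the STRF is proportional to the difference between the target and reference spectra. The function names, the `gain` parameter, the stimulus levels and the eight-channel frequency axis are all hypothetical.

```python
def contrast_filter_update(strf_slice, target_spectrum, reference_spectrum, gain=0.5):
    """Return a new spectral STRF slice shifted toward (target - reference).

    Hypothetical toy rule, not the authors' model: each frequency channel is
    changed in proportion to how much more energy the target has than the
    reference in that channel.
    """
    if not (len(strf_slice) == len(target_spectrum) == len(reference_spectrum)):
        raise ValueError("all inputs must cover the same frequency channels")
    return [
        weight + gain * (target - reference)
        for weight, target, reference in zip(
            strf_slice, target_spectrum, reference_spectrum
        )
    ]


def tone_plus_torc(n_channels, tone_channel, torc_level=0.25, tone_level=1.0):
    """Crude spectrum of a hybrid stimulus: flat TORC energy plus one tone."""
    spectrum = [torc_level] * n_channels
    spectrum[tone_channel] += tone_level
    return spectrum


# Two-tone discrimination: reference tone in channel 2, target tone in channel 5.
passive = [0.0] * 8
reference = tone_plus_torc(8, tone_channel=2)
target = tone_plus_torc(8, tone_channel=5)
active = contrast_filter_update(passive, target, reference)
# The shared TORC energy cancels, leaving enhancement at the target channel
# and suppression at the reference channel.
```

With a TORC-only reference, as in single-tone detection, the same rule yields enhancement at the target channel alone, which is the qualitative pattern described for that task.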
Recent papers have shown that temporal plasticity can be induced in A1 by pairing electrical stimulation of cholinergic basal forebrain with tone pips at a high temporal rate (Kilgard and Merzenich, 1998), or by auditory perceptual learning (Beitel et al., 2003, Bao et al., 2004). In order to study the STRF changes in A1 that may result from performance on temporal tasks, ferrets were trained on a gap detection task, in which they learned to detect the presence of a short gap (25-150 ms) in TORCs, and also on a click-rate discrimination task in which they learned to discriminate between the rate of clicks in TORC-click combinations with click rates in the flutter range (5-40 Hz). Unlike the spectral tasks (single tone detection, MTD and two-tone frequency discrimination described above) in which the ferret attended to one or more spectral frequencies, in both temporal tasks, the ferret needed to attend primarily to the salient temporal characteristics of the sounds in order to perform well on the task. We conjectured that both temporal tasks should yield faster STRF dynamics, as evidenced, for example, by shortening peak latencies, shorter durations of facilitatory (or suppressive) fields, and/or by a concomitant sharpening of the outlines of the facilitatory and suppressive fields along the temporal axis. All of these effects have been observed, and illustrative examples of actual STRF changes in two different temporal tasks are given in Figure 2. The top panel illustrates (a) temporal sharpening of the facilitatory receptive field and (b) a temporal shift in peak latency, during a gap detection task. In the lower panel, another example of dynamic temporal changes in peak latency in the facilitatory field is shown from the results of an experiment in which the animal performed two successive click rate discrimination tasks. In this cell, we observed a shift in STRF peak latency from ~30 ms (passive prebehavioral STRF) to ~15 ms (first behavioral STRF).
Remarkably, during continuous recording from the same cell, we found that the temporal changes gradually faded over a time course of hours, but then immediately re-occurred when the animal engaged in a second click rate discrimination task (see Figure 2, lower panel).
We have now recorded from over 80 neurons in ferrets performing temporal tasks (either gap detection or click rate discrimination) and find consistent temporal STRF changes. Although some of these temporal changes in the STRF may reflect general arousal or behavioral response timing constraints common to all of our task conditions, these preliminary data suggest that it may be possible to influence both the temporal and the spectral dimensions of the STRF, depending upon the behavioral task and the task-salient stimuli. An open question, which we are currently exploring experimentally, is whether the contrast filter hypothesis generalizes to temporal and spectrotemporal tasks as well as to purely spectral tasks.
Approaching the question of the neural basis of selective goal-directed attention at an oblique angle, the experiments described above suggest that rapid auditory task-related plasticity is an ongoing process that occurs as the animal switches between different tasks and dynamically adapts auditory cortical STRFs in response to changing acoustic demands and attentional focus on salient acoustic cues. Rapid plasticity modifies STRF shape in a manner consistent with enhancing the behavioral performance of the animal, monitored through externally supplied feedback signals (whether the observed changes in STRF filter shape are actually necessary for optimal, or even good, behavioral performance is a question we do not answer here, and plan to address in future experimental lesion studies). The specific form of the STRF change is dictated by the salient acoustic cues of the foreground as well as the background signals in the behavioral task, is modulated both by general arousal and by selective attention, and can be successfully described at a population level by the contrast filter hypothesis.
We return to a question posed in the introduction about the relative time scales (onset and duration) for attention and plasticity. Our findings on attentionally modulated task-related plasticity suggest that some forms of cortical receptive field plasticity may occur on a rapid time-scale. Our results are consistent with previous studies of A1 cortical receptive field plasticity that have shown similar swiftness in onset (Weinberger and Diamond, 1987, Edeline et al., 1993, Edeline, 1999, Ma and Suga, 2003). Many of the cortical changes we observe have short and evanescent life-times, and many STRFs return to their original shapes soon after the behavior is over, and the attentional focus has changed. However, since most cortical sensory neurons participate in multiple behavioral contexts, it is likely that their receptive field properties are continuously being modified, against the basic scaffolding of the synaptic inputs, as the animal enters new acoustic environments and initiates new tasks. In a sense, the STRF gives “linear” snapshots of a set of adaptive transformations of the receptive field in different behavioral and attentional contexts. We suggest that plasticity is part of an ongoing process that is constantly adapting and re-organizing cortical receptive fields to meet the challenges of an ever-changing environment and new behavioral demands (Edeline, 2003). We suggest that these rapid effects are attentionally driven – however, as indicated above, we have also demonstrated the long-term persistence of these effects, long after auditory attention may have shifted, suggesting that attention can initiate these changes, which may then be sustained by non-attention-related mechanisms. Presumably, if these effects are attentively driven they could occur on the order of seconds, rather than minutes. 
In new experiments, we are currently seeking to sharpen our temporal resolution from 1-2 minutes to a few seconds, so that we can clarify whether these receptive field changes can take place on the timescale of rapid attentional shifts, which can occur in a blink of less than one second (focused auditory attention in humans can selectively modulate sensory processing as early as 20 ms post-stimulus onset (Woldorff et al., 1993), volitional deployment of visual attention takes ~200-300 ms in monkeys, at a behavioral level our ferrets typically respond to targets within ~300-400 ms).
In the visual domain, the influences of attention include enhanced neuronal and behavioral sensitivity, improved discriminability and spatial resolution, as well as accelerated and more accurate information processing. Although most research on visual attention has focused on space-based attention, visual attention can also be directed to visual features such as color, orientation, texture, shape or direction of motion. What insights into auditory attention mechanisms, pathways and sites of action can we derive from current research findings in visual attention? There has been considerable recent research on spatial and featural attentional effects in LGN (O'Connor et al., 2002), V1 (Crist et al., 2001; Ito and Gilbert, 1999; Li et al., 2004; Motter, 1993; Reynolds et al., 2000; Roelfsema et al., 1998, Treue, 2001) , V4 (McAdams and Maunsell, 1999, 2000), the temporal-occipital area (TEO) as well as in parietal and prefrontal areas, and has been recently reviewed (Reynolds and Chelazzi, 2004; Maunsell and Treue, 2006). Although the mechanisms of visual attention are still unknown, it is clear that attentional effects are found at multiple levels of visual processing in an extensive attentional neural network. Moreover, there is an attentional gradient of effects, so influence of attention increases as one ascends the visual hierarchy from a few percent in early visual cortex and thalamus, to around 10-20% in MT (Martinez-Trujillo and Treue, 2002) and V4 (McAdams and Maunsell, 2000), and is most powerful in parietal and prefrontal cortex, where the neural representation is dominated by behavioral salience (although we note that attention-driven increases in baseline firing rates may be higher than the increased gain of neuronal responses to attended stimuli).
Spatial attention is selection based on stimulus position. The “spotlight” multiplicative model suggests that space-based visual attention changes the gain, or strength, of neuronal responses without changing their underlying response properties or tuning curve. This multiplicative gain effect on firing rates of cortical neurons, without affecting their selectivity, has been shown at multiple locations in the visual system, such as in V4 (Motter, 1993; McAdams and Maunsell, 1999). On the basis of their studies of attentional effects on neural processing in V4, Reynolds and colleagues (2000) have proposed a different model, which suggests that attention acts by enhancing effective contrast or stimulus strength. Parallels between space-based and feature-based visual attention led to the feature-similarity gain model proposed by Martinez-Trujillo and Treue (1999), in which they suggested that neuronal gain changes depend upon the similarity of the features of the currently attended, behaviorally salient target and the sensory selectivity of the neuron. Maunsell and Treue (2006) argue that there is a fundamental equivalence of these different multiplicative models of the modulatory effects of attention. A unified gain model may also apply to the somatosensory cortex, where similar attention-driven gain changes have also been shown (Sripati and Johnson, 2006), also with no effect on the neuronal tuning width.
It has been proposed that the tonotopic (cochleotopic) axis of frequency space in the auditory system is a one dimensional projection map of the acoustic world comparable to the 2-D retinal projection of visual space in the visual system. If so, a possible implication of this analogy is that the receptive field changes that we observe when selective attention is focused on one salient frequency (as in our single tone detection task) should be comparable to selective spatial attention to one location in the visual field. However, before proceeding further with this comparison, an important caveat is in order: our single tone detection task may not be a truly parallel experiment to the spatial attention studies described above in the visual and somatosensory system – the central reason for this lies in the design of our task, which allows the ferret to attend to the difference between the reference and the target stimuli, rather than necessarily focusing on the target stimulus frequency alone. In fact, in the multi-tone detection task described above, we emphasize that the ferret may be paying global attention to the task, as the observed effects cannot fully be attributed to selective attention (since the ferret simply cannot discern (nor attend) to specific individual component frequencies in the multi-tone stimulus). So, two real questions emerge from this caveat: (1) if not frequency, what are our ferrets really attending to in the single tone detection task? (recent studies in our laboratory (Atiani et al., 2006) indicate that it is not simply a distinction between broadband and narrow band stimuli), (2) what is the best design for an auditory task which is truly parallel in design, and fully comparable to the spatial attention studies described above in vision and touch?
While keeping this caveat firmly in mind, what may we learn from the comparison of “spatial attention” in the auditory and visual domain? One striking difference, from the results of our studies on attention-driven effects in A1, is that we have not found any consistent evidence of gain changes in our task (the usual neuronal hallmark of visual spatial attention), although we have observed clear changes in receptive field shape. Thus, at first glance, our findings in primary auditory cortex cannot be explained by any of the multiplicative models. The central drawback of these models for explaining our tone-detection data is that we observe additive, rather than multiplicative, effects of acoustic salience (i.e. an enhanced facilitatory response area in the receptive field if the target frequency is placed near a facilitatory field in the neural STRF, and a net decrease of a suppressive response area if the target frequency is placed near a suppressive field in the STRF). None of the multiplicative models can account for our single-tone-detection results, nor can they explain the differential neural responses we have observed in the two-tone frequency discrimination task, which are most easily explained by a differential push-pull mechanism enhancing the attended stimulus and suppressing the unattended stimulus. However, even if not seen in A1 in our task conditions, it is possible that multiplicative attention-driven gain changes could be occurring earlier (or later) in the auditory pathway. Moreover, although the current evidence from the visual spatial attention literature favors a multiplicative model, not all observed cortical effects of visual spatial attentional modulation are multiplicative (Reynolds and Chelazzi, 2004).
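The distinction between multiplicative and additive effects drawn above can be illustrated with a small numerical sketch. It is a toy comparison under stated assumptions, not an analysis of recorded data: the five-channel spectral profile, the gain of 1.5 and the additive step of 0.25 are invented for illustration, and the function names are hypothetical. The target is placed on a suppressive (negative) region of the profile, where the two accounts make opposite predictions.

```python
def multiplicative_gain(profile, gain):
    """Scale every point of the profile by the same factor (pure gain change)."""
    return [gain * value for value in profile]


def additive_shift(profile, target_index, delta):
    """Add a fixed amount at the target channel only (local additive change)."""
    shifted = list(profile)
    shifted[target_index] += delta
    return shifted


def normalized_shape(profile):
    """Divide by the largest absolute value so that only the shape remains."""
    peak = max(abs(value) for value in profile)
    if peak == 0:
        raise ValueError("profile is all zeros")
    return [value / peak for value in profile]


# Hypothetical spectral slice of an STRF: suppressive at channel 0,
# facilitatory around channel 2.
passive = [-0.5, 0.0, 1.0, 0.5, 0.0]
target_index = 0

scaled = multiplicative_gain(passive, 1.5)
shifted = additive_shift(passive, target_index, 0.25)

# A pure gain change deepens the suppression at the target and leaves the
# normalized shape untouched; the additive change weakens the suppression
# and alters the shape of the profile.
```

In this sketch the gain model makes the suppressive field at the target stronger, whereas the additive change reduces it, which corresponds to the net decrease of a suppressive response area described in the text.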
Models of visual search emphasize the role of task-related top-down factors in modulating cortical target-feature-detectors or filters (Wolfe, Cave and Franzel, 1989; Pomplun, 2006; Rao et al., 2002; Najemnik and Geisler, 2005; Oliva et al., 2003). Feature-based attention, a location-independent form of selective attention, is likely to enhance salient feature representation in the higher order visual field that is related to a particular feature. For example, McAdams and Maunsell (2000) showed effects in V4 neurons of shifting attention between feature dimensions (color and orientation). They found that the neural representation of stimuli, even in parts of the visual field that had no relevance to the task, was modulated by feature-based attention. As an example of a possible approach to a feature-based study of auditory attention, we are currently studying attentional modulation in a frequency-independent task in which the ferrets have been trained to respond to the frequency contour of tone pairs or FM sweeps (Yin et al., 2007). Our research exploring possible dynamic receptive field changes in such a featural task follows earlier studies by Brosch and colleagues (Brosch et al., 2005; Selezneva et al., 2007), who have observed a long-term increase in the proportion of neurons preferring downward contours in A1 of monkeys trained on a frequency-independent tone contour task (in which reward was associated with downward contours).
Recent studies in visual processing have also led to a re-evaluation of the concept of the receptive field, which historically implied a well-defined, unchanging region of sensory sensitivity. Although attentional state is known to modulate neural responses, it has been thought that the shape and position of the receptive field should remain fixed. This assumption was questioned in the auditory system over 20 years ago by Weinberger and colleagues. In the visual community, it was questioned over a decade ago, in a study (Duhamel et al., 1992) that showed a “predictive remapping” of visual receptive fields in the lateral intraparietal area (LIP) or parietal eye field, a phenomenon in which the receptive field of many LIP cells appears to shift in the same direction as an intended saccadic eye movement, immediately prior to the saccade. Other studies have also shown visual receptive field shifts before impending saccades, in LIP and in V4 (Tolias et al., 2001), in relation to movement of the attentional focus in V4 (Connor et al., 1997) and MT (Womelsdorf et al., 2006), in the context of changing arm position in ventral premotor cortex (Graziano, Yap and Gross, 1994), and also in relation to tool use extending manual grasp (Iriki et al., 1996; Maravita and Iriki, 2004). Thus, it has been conjectured that spatial attention changes receptive field profiles by shifting their centers towards attended locations and by shrinking them around attentional loci. Although we have earlier argued (III.4) for a comparison of visual spatial attention with auditory frequency selective attention, the results on shifting receptive field tuning described above are compatible with our studies in the primary auditory cortex that suggest that A1 receptive fields are dynamically reshaped by task context and attentional focus (Fritz et al., 2003, 2005a,b).
This again raises the open question of whether the appropriate comparison should be between “real” spatial attention in vision and audition, or between visual spatial attention and auditory frequency attention.
A recent study of the auditory space map in the tectum of the barn owl (Witten et al., 2006) found that the receptive fields of tectal neurons shifted toward an approaching sound, with a magnitude that increased systematically with increasing stimulus velocity. Their results demonstrate that the auditory space map shifts dynamically, and compensates adaptively for the direction and speed of sound stimulus motion. Thus, selective attention in the visual as well as the auditory cortex, may be partially based on short-term receptive field plasticity leading to modifications in receptive field shape or position that increase neuronal selectivity for relevant information, and/or link to future motor actions. Such plasticity may represent a fundamentally different mechanism than multiplicative amplification of neuronal responses in a fixed receptive field. In a larger sense, receptive field plasticity could lead to dynamic cortical representations that could support attention to any currently salient set of stimulus features.
What is the basis of top-down modulation of spatial attention? Moore and Armstrong (2003) recently showed that microstimulation of the frontal eye fields (at levels too small to elicit eye movements) led to attention-like enhancement of V4 responses. These results suggest a tight coupling between planned eye movements and predictive attentional gain increases. Similar results have been obtained in the auditory system of the barn owl by Winkowski and Knudsen (2006), who found that microstimulation of the forebrain gaze control field changed the responsiveness of matched neurons in the topographic map of auditory space in the midbrain tectum. In a natural context, top-down attentional signals in the owl could spotlight a spatial location and sharpen auditory tuning, thus enhancing precision of spatial localization for sounds emanating from this point in space. In keeping with the idea of delayed top-down feedback underlying spatial attention, recent studies (Noesselt et al., 2002) have shown long latencies (~150-250 ms) for attentional effects on V1, occurring well after the initial stimulus-driven response (~60-90 ms).
Another highly relevant concept from the visual literature is the idea of the saliency map (Koch and Ullman, 1985), which was developed as part of a model for implementing a massive bottom-up parallel search that pools information from multiple feature maps across space, and uses a winner-take-all strategy to select the most salient location, which “pops out” and receives the focus of attention. This approach has recently been adapted to auditory processing by Kayser and colleagues (2005). In some studies of visual saliency maps (found in superior colliculus, pulvinar nucleus of the thalamus, and in different areas in frontal, parietal, and visual cortex, such as V1 and V4) the peak activity corresponds to the object that will be the next target of a saccade (Mazer and Gallant, 2004). There are likely to be multiple salience maps, which interface to form one distributed salience system. As Treue observes (2004), the saliency map is not simply a tool for directing gaze to potentially relevant parts of visual space, but also appears to be the basis of perceptual judgements. An integrated, distributed saliency map combines bottom-up sensory effects along with top-down feature-based and space-based attentional modulatory effects and is a dynamically updated, current representation of stimulus strength and behavioral relevance across visual space. A closely related concept is the task-relevance map (Navalpakkam and Itti, 2005). In these conceptualizations, attention is an emergent property of the integrated, distributed salience network, rather than a separate system in its own right (Shipp, 2004).
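The pooling and winner-take-all steps of a saliency map can be sketched in a few lines. This is a deliberately simplified illustration, not the Koch and Ullman (1985) implementation: it assumes one-dimensional feature maps over four locations, a weighted sum as the pooling rule, and per-feature weights standing in for top-down modulation. All names and numbers are hypothetical.

```python
def combine_feature_maps(feature_maps, weights=None):
    """Pool several feature maps into one saliency map by a weighted sum."""
    if not feature_maps:
        raise ValueError("at least one feature map is required")
    n_locations = len(feature_maps[0])
    if any(len(fmap) != n_locations for fmap in feature_maps):
        raise ValueError("all feature maps must cover the same locations")
    if weights is None:
        weights = [1.0] * len(feature_maps)  # purely bottom-up pooling
    if len(weights) != len(feature_maps):
        raise ValueError("one weight per feature map is required")
    return [
        sum(weight * fmap[loc] for weight, fmap in zip(weights, feature_maps))
        for loc in range(n_locations)
    ]


def winner_take_all(saliency):
    """Return the index of the most salient location."""
    return max(range(len(saliency)), key=lambda loc: saliency[loc])


# Hypothetical activity in two feature maps over four locations.
color_map = [0.1, 0.9, 0.2, 0.1]
motion_map = [0.2, 0.1, 0.3, 0.8]

# Equal weights: the strong color signal at location 1 wins.
bottom_up_winner = winner_take_all(combine_feature_maps([color_map, motion_map]))

# Up-weighting motion (a stand-in for feature-based attention) moves the
# winner to location 3.
top_down_winner = winner_take_all(
    combine_feature_maps([color_map, motion_map], weights=[1.0, 2.0])
)
```

The same stimulus thus selects different locations depending on the weights, which is one simple way to picture how top-down modulation and bottom-up salience could be combined in a single map.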
In summary, we find that some A1 cortical cells undergo rapid, short-term, context-dependent, adaptive changes of their receptive field properties when an animal performs an auditory task that has specific behavioral demands and stimulus feature salience (Diamond and Weinberger, 1989; Fritz et al., 2003a, 2005a,b). Not all cortical neurons display plasticity, which may represent a cortical compromise in the trade-off between stability and adaptability of sensory information processing. Similarly, not all cortical neurons display other measures of attentional modulation. We suggest that such rapid task-related plasticity is adaptive and is a part of an ongoing, dynamic process that underlies normal, active listening, where the listener is attending to a stream of acoustic events in its environment. In this view, plasticity plays a functional role by causing a selective re-setting of the cortical circuitry. This tweaking of synaptic input strengths leads to changes in the receptive field properties of cortical neurons, which may enable the animal to achieve enhanced performance of the auditory task. Achieving goals in changing environments requires adaptive behavior. Since changes to organisms occur continuously in a dynamic environment, it would obviously be adaptively useful if animals continuously modulated their nervous systems on-line (Mountcastle, 1995; Ulanovsky et al., 2004) and forged dynamic links between sensory stimuli and motor actions (Cohen et al., 2005). The spectrotemporal receptive field (or STRF) in A1 sits at the focal juncture of this process, depicted by the model shown in Figure 3. In a trained and well-behaving animal that engages in a previously learned task, the STRF swiftly adapts so as to enhance behavioral performance, monitored through externally supplied (reward or aversive) feedback signals. How does this occur?
We suggest that a critical step is the arrival of a cascade of rapid top-down signals (emanating from auditory association cortex and prefrontal cortex) that are sent as soon as a target is identified, based upon incoming acoustic information and task category expectations. We propose that this top-down signal is sent to subcortical neuromodulatory structures (such as nucleus basalis), which initiate an automatic barrage of activity that leads to STRF changes in A1. One specific possible scenario might be that the top-down signal from frontal cortex enhances activity in the nucleus basalis, leading to a consequent increase in acetylcholine release in auditory cortex that acts in turn on post-synaptic muscarinic M2 receptors on A1 pyramidal cells, and specifically enhances synaptic weights of co-active synapses that are still simultaneously responding to the current incoming target stimulus, which (critically for this hypothesis) is still being presented to the animal. Thus, in a nutshell, we propose that the animal attends to the target, leading to a top-down target recognition signal that triggers the neuromodulator projections to gate plasticity in A1.
Although still conjectural, the role of the neuromodulators in mediating rapid plasticity is plausible and supported by many experimental studies. Neuromodulators such as acetylcholine, dopamine, noradrenaline and serotonin are all influential in mediating plasticity and stimulus coding (Gu, 2002; Manunta and Edeline, 2004; Hurley et al., 2004) through direct as well as indirect projections (Bouret and Sara, 2004). The projection from the basal forebrain cholinergic system may be particularly important in mediating cortical plasticity during learning (Conner et al., 2003). This mechanism becomes even more plausible given the recent study of Froemke and colleagues (2006), who have shown significant enhancement in synaptic amplitude (EPSPs) of thalamocortical projections to primary auditory cortex after paired electrical stimulation of nucleus basalis with acoustic stimulation for ~6 sec (which is strikingly similar to the amount of time (~9 sec) that the ferret is exposed to the target in our tone detection task for our shortest behavioral physiology sessions (of total length 2 min)). This highlights one important arena for such rapid synaptic modulation, namely the role of acetylcholine in influencing thalamic and A1 activity via the thalamocortical input fibers (see Mooney et al., 2006). Another possible arena for rapid synaptic change may be the set of widespread subthreshold horizontal synaptic connections found in sensory and motor neocortex (Das and Gilbert, 1995; Huntley, 1997; Rioult-Pedotti et al., 1998; Laubach et al., 2000), which exhibit plasticity and whose synaptic efficacy has been shown to strengthen in procedural motor learning (Rioult-Pedotti et al., 2000).
As mentioned earlier, all of the tasks in our study are “cognitive” in the sense that the animal is trained on each task with a broad range of different target stimulus values. Given this diverse training set, the animal generalizes, and eventually learns the “rule” or the basic structure of the same-different task, independent of stimulus value. Once the ferret has learned the basic structure of the paradigm or “task-schema”, it knows almost everything about what to expect when presented with a task-variant, except the specific acoustic “values” (or properties) of target and reference. The acoustic features of target and reference can vary several times/day during different behavioral sessions. This a priori knowledge of the task is presumably embedded in the functional architecture of the auditory processing cortical network. The auditory cortical network (manifested in the STRFs of individual neurons), can shift to different dynamic states defined by this functional architecture. Thus signal processing during task performance consists of a matching operation in which incoming acoustic information is compared with these neural “states of expectancy”. Although the ferret can respond appropriately to all stimuli, the specific form of task related plasticity depends upon the currently relevant stimuli or salient features as well as the structure of the learnt task.
In the case of the tone detection task, ferrets were trained to detect the presence of any pure tone in the context of broadband noise, and hence learned a general sensorimotor schema or mapping (which could be summarized as a rule: if you hear any pure tone, stop licking the waterspout for two seconds). In a particular behavioral session, where only one tonal frequency was used, the ferret performed the task and focused its attention on the salient frequency, leading to a reshaping of A1 receptive fields to enhance response at this frequency. It is important to emphasize that as many as 2/3 of cortical neurons in A1 showed such frequency-selective enhancement during tone detection task performance (Fritz et al., 2003a). Such short-term plasticity changes may also occur in the human auditory cortex (Menning et al., 2000, Jancke et al., 2001).
It is important to distinguish between such general, cognitive training on multiple task variants, each comparatively simple for animals to perform, which is characteristic of our studies on the one hand, and specific, behaviorally challenging, perceptual learning on the other hand (Recanzone et al., 1993; Ahissar, 2001; Crist et al., 2001; Gilbert et al., 2001; Beitel et al., 2003; Ghose, 2004; Li et al., 2004). In perceptual learning, typically the animals are trained over a prolonged period of time (often months or years) to asymptotic performance levels where they can make fine sensory discriminations, and typically this learning is highly specific for the particular stimulus configuration used during training, and for position in visual space (vision), or in frequency space (audition) and consequently does not generalize. In striking contrast, in our experiments, the animals were trained as generalists within and between task variants, over a time course of weeks, and were tested on tasks which were comparatively easy for them, often more than an order of magnitude above threshold (for example, we have found that ferrets' threshold for two-tone discrimination is about 1/16th of an octave, and yet we typically used differences of 1/2 octave or more for our two-tone discrimination study (Fritz et al., 2005a)). In recent, elegant research on the neural basis of perceptual learning by Gilbert and colleagues (Li et al., 2004), monkeys were highly trained on two different visual discrimination tasks based on different attributes of the same visual stimulus at the same visual location. One of the visual spatial discrimination tasks was a three-line bisection task, and the other was a vernier acuity task. The training effects fail to transfer more than a few degrees across retinotopic locations, suggesting that perceptual learning was occurring in V1. 
After this training, Gilbert and colleagues observed no change in receptive field properties of V1 neurons, nor any attention-related gain changes, however they found a task-dependent change in response to the identical visual stimulus, and on the influence of contextual stimuli placed outside the traditional receptive field. This state or task-dependent change allows the same neuronal population to multiplex and mediate different perceptual functions. They attribute this adaptive, dynamic multiplexing to a combination of changes in local circuits that arose during perceptual learning and top-down control that allows switching between the two different network states that correspond to each task condition. In contrast, the results of our training procedures do not lead to crystallized neural networks in A1 specialized precisely for each task. Rather, the ferrets learn a general set of rules that can be applied to any task variant condition, and hence use a different neural switching strategy than the monkeys in the dual perceptual learning tasks. The neurons in ferret A1 are also likely to be influenced by top-down signals, but appear to multiplex by rapidly reshaping their receptive fields to adapt to specific task demands and salient cues.
Of course, as indicated in the introduction, our description of the possible role of auditory attention in dynamically modulating cortical filters in A1 is only one small part of the whole story of the relationship between sound and attention. In general, there can be remarkably strong effects of attention on auditory processing in active listening, as observed in the familiar psychoacoustic phenomena of FM completion and phonemic restoration. An interesting window into the role of these top-down influences in the human brain is provided by studies showing that human auditory cortex is activated even by silence, in the complete absence of acoustic stimulation, when there is an expectation of sound (Raij et al., 1997; Hughes et al., 2001; Voisin et al., 2006). This is an extraordinary display of the importance of attentive expectation in shaping cortical responses. A recent brain-imaging study (Engelien et al., 2000) underlines the additional point that these attentive effects on auditory processing are likely to occur throughout the auditory cortex, not just in A1. This is shown by their research on a “deaf-hearing” neurological patient with extensive bilateral destruction of auditory cortices (including the primary auditory fields) who was still able to marshal sufficient auditory attention to perceive sound onsets and offsets. Conscious attentive perception of sounds in this patient may have arisen from top-down projections from prefrontal cortex to the remaining non-primary auditory cortex. Recently, two forms of auditory neglect have been described: one an attentional deficit associated with basal ganglia lesions, and the other an auditory spatial deficit associated with parieto-prefrontal lesions (Bellmann et al., 2001; Clark & Thiran, 2004). 
These brain imaging and neurological results provide a useful reminder that, in order to fully understand the role of attention in the auditory system, we must not focus only on processing in primary auditory cortex, but must also look well beyond A1.
We have recently initiated a new set of studies to examine the possible role of attention-driven, top-down influences in mediating task-related changes in A1 (Fritz et al., 2004b). The prefrontal cortex (PFC) is known to be involved in working memory, encoding of task-relevant features, task monitoring, task switching, executive control and goal-directed behavior (Miller and Cohen, 2001; Miller et al., 2002). It may also play a role in top-down attentional modulation of salient sensory inputs, and their linkage to a repertory of actions. We asked whether top-down inputs from PFC to A1 might contribute to task-related plasticity in the primary auditory cortex of the ferret as the animal focused attention on salient acoustic cues, and switched attention between targets in different auditory tasks (preliminary results are described in an earlier paper (Fritz et al., 2005b)). As mentioned above, we are conducting studies of feature-based auditory attention in order to distinguish feature-based from object-based attention, and we have also initiated experiments to see whether attentional enhancement spreads to unattended features of attended objects. Since attentional modulation has also been shown to lead to enhanced synchrony in the visual cortex, we are examining synchrony in the context of our studies of auditory attention. In current studies we are also testing specific predictions of the hypothesis outlined above (and shown in figure 3), recording in A1, PFC and multiple other levels in order to integrate our understanding of the auditory attentional network.
We would like to thank Bob Galambos and David Hubel for discussions of the early studies of auditory attention, Nima Mesgarani for assistance with task development and software programming, David Klein for computational analysis, Henry Heffner for continuing generous advice and guidance on behavioral training, Pingbo Yin for discussion on auditory featural attention, and Kevin Donaldson for help with ferret care and training. We are also grateful for the grant support of NIDCD, NIH.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.