Bottom-up (stimulus-driven) and top-down (attentional) processes interact when a complex acoustic scene is parsed. Both modulate the neural representation of the target in a manner strongly correlated with behavioral performance.
The mechanism by which a complex auditory scene is parsed into coherent objects depends on poorly understood interactions between task-driven and stimulus-driven attentional processes. We illuminate these interactions in a simultaneous behavioral–neurophysiological study in which we manipulate participants' attention to different features of an auditory scene (with a regular target embedded in an irregular background). Our experimental results reveal that attention to the target, rather than to the background, correlates with a sustained (steady-state) increase in the measured neural target representation over the entire stimulus sequence, beyond auditory attention's well-known transient effects on onset responses. This enhancement, in both power and phase coherence, occurs exclusively at the frequency of the target rhythm, and is revealed only when contrasting two attentional states that direct participants' focus to different features of the acoustic stimulus. The enhancement originates in auditory cortex and covaries with both the behavioral task and the bottom-up saliency of the target. Furthermore, the target's perceptual detectability improves over time, correlating strongly, within participants, with the target representation's neural buildup. These results have substantial implications for models of foreground/background organization, supporting a role of neuronal temporal synchrony in mediating auditory object formation.
Attention is the cognitive process underlying our ability to focus on specific aspects of our environment while ignoring others. By its very definition, attention plays a key role in differentiating foreground (the object of attention) from unattended clutter, or background. We investigate the neural basis of this phenomenon by asking listeners to attend to different components of a complex acoustic scene. We present a spectrally and dynamically rich, but highly controlled, stimulus while participants perform two complementary tasks: to attend either to a repeating target note in the midst of random interferers (“maskers”), or to the background maskers themselves. Simultaneously, the participants' neural responses are recorded using magnetoencephalography (MEG). We hold all physical parameters of the stimulus fixed across the two tasks while manipulating one free parameter: the attentional state of listeners. The experimental findings reveal that auditory attention strongly modulates the sustained neural representation of the target signals in the direction of boosting foreground perception, much like known effects of visual attention. This enhancement originates in auditory cortex, and occurs exclusively at the frequency of the target rhythm. The results show a strong interaction between the neural representation of the attended target and the behavioral task demands, the bottom-up saliency of the target, and its perceptual detectability over time.