In contrast to the complex acoustic environments we encounter everyday, most studies of auditory segregation have used relatively simple signals. Here, we synthesized a new stimulus to examine the detection of coherent patterns (‘figures’) from overlapping ‘background’ signals. In a series of experiments, we demonstrate that human listeners are remarkably sensitive to the emergence of such figures and can tolerate a variety of spectral and temporal perturbations. This robust behavior is consistent with the existence of automatic auditory segregation mechanisms that are highly sensitive to correlations across frequency and time. The observed behavior cannot be explained purely on the basis of adaptation-based models used to explain the segregation of deterministic narrowband signals. We show that the present results are consistent with the predictions of a model of auditory perceptual organization based on temporal coherence. Our data thus support a role for temporal coherence as an organizational principle underlying auditory segregation.
Even when seated in the middle of a crowded restaurant, we are still able to distinguish the speech of the person sitting opposite us from the conversations of fellow diners and a host of other background noise. While we generally perform this task almost effortlessly, it is unclear how the brain solves what is in reality a complex information processing problem.
In the 1970s, researchers began to address this question using stimuli consisting of simple tones. When subjects are played a sequence of alternating high and low frequency tones, they perceive them as two independent streams of sound. Similar experiments in macaque monkeys reveal that each stream activates a different area of auditory cortex, suggesting that the brain may distinguish acoustic stimuli on the basis of their frequency.
However, the simple tones that are used in laboratory experiments bear little resemblance to the complex sounds we encounter in everyday life. These are often made up of multiple frequencies, and overlap—both in frequency and in time—with other sounds in the environment. Moreover, recent experiments have shown that if a subject hears two tones simultaneously, he or she perceives them as belonging to a single stream of sound even if they have different frequencies: models that assume that we distinguish stimuli from noise on the basis of frequency alone struggle to explain this observation.
Now, Teki, Chait, et al. have used more complex sounds, in which frequency components of the target stimuli overlap with those of background signals, to obtain new insights into how the brain solves this problem. Subjects were extremely good at discriminating these complex target stimuli from background noise, and computational modelling confirmed that they did so via integration of both frequency and temporal information. The work of Teki, Chait, et al. thus offers the first explanation for our ability to home in on speech and other pertinent sounds, even amidst a sea of background noise.