Bats echolocating in the natural environment face the formidable task of sorting signals from multiple auditory objects, echoes from obstacles, prey, and the calls of conspecifics. Successful orientation in a complex environment depends on auditory information processing, along with adaptive vocal-motor behaviors and flight path control, which draw upon 3-D spatial perception, attention, and memory. This article reviews field and laboratory studies that document adaptive sonar behaviors of echolocating bats, and point to the fundamental signal parameters they use to track and sort auditory objects in a dynamic environment. We suggest that adaptive sonar behavior provides a window to bats’ perception of complex auditory scenes.
The sensory world of an animal is noisy, complex, and dynamic. From a barrage of stimuli, an animal must detect, sort, group, and track biologically relevant signals. Parsing, integrating, and organizing complex sensory stimuli to support species-specific survival behaviors are tasks of perception, which invoke higher level processes of scene analysis, both within and across modalities.
The perceptual organization of sound, commonly referred to as auditory scene analysis, has received a great deal of research attention in the human literature over the past 20 years (Bregman, 1990), and represents an important advance in our understanding of auditory perception. In human psychoacoustic research, this problem has been successfully studied in laboratory experiments with simplified acoustic stimuli, but there remains the challenge, in humans and other animals, to understand perception of complex sounds in the natural environment.
Auditory scene analysis is a fundamental problem faced by all organisms that use acoustic signals for social communication, territorial defense, navigation or predator evasion. Understanding the processes that support the analysis of natural scenes is further complicated by the fact that an animal's perception of stimuli may depend upon its behavioral goal, biological state, and environmental context. Here, we consider the problem of natural scene analysis by turning to the echolocating bat, an animal that can negotiate a complex auditory world in complete darkness (Griffin, 1958). For the echolocating bat, the analysis of auditory scenes builds upon its active production of sounds that reflect from objects in the environment (Moss and Surlykke, 2001). The bat adaptively adjusts the features of its sonar vocalizations in response to information obtained from echo returns; therefore, the bat's sonar vocalizations provide a window into its perceptual world. Specifically, the directional aim, timing, frequency content, intensity, and duration of sonar signals used by a bat to “illuminate” the environment have a direct impact on the information available to its acoustic imaging system. In turn, the bat's perception of the echo scene guides its adjustments of the features of subsequent sonar vocalizations. Therefore, the bat's adaptive sonar behavior can shed light on the fundamental processes that underlie auditory scene analysis by echolocation. Indeed, the bat's active sonar behavior allows us to listen in on the signals that are used to perform a variety of auditory tasks. Here we review laboratory and field studies that point to the features echolocating bats use to represent information about a complex acoustic environment and the general importance of an animal's actions to its perception of natural scenes.
Bats are the most ecologically diverse group of mammals, with more than 1100 extant species of which around 950 echolocate (Simmons et al., 2008). Each species of bat has a distinct repertoire of signals that it uses for echolocation, and the features of these sounds determine the acoustic information available to its sonar imaging system. Bat sonar signals fall broadly into two categories: constant frequency (CF) signals, used by some specialized species, and frequency modulated (FM) signals, employed by the majority of bats studied so far. Bats using CF signals typically forage in dense foliage, and some of these species lower the frequency of their sonar vocalizations as they fly to compensate for Doppler shifts in returning echoes (Schnitzler, 1968), stabilizing echo returns around a reference frequency and isolating Doppler shifts in echoes that come from fluttering insect prey (Schnitzler et al., 1983). FM-bats hunting in the open generally produce signals of narrow bandwidth, whereas FM-bats foraging closer to vegetation typically produce shorter, broadband signals (Schnitzler and Kalko, 1998), which are well suited for 3-D target localization (Simmons, 1973; Moss and Schnitzler, 1995) and for separating figure and ground (Moss and Surlykke, 2001; Moss et al., 2006). Work is in progress to develop telemetry microphones small enough for bats to carry in natural flight (see e.g. Hiryu et al., 2005), but to date there are no recordings of the sonar calls and echoes from the field, which would directly demonstrate the complexity of a bat's auditory scene. However, behavioral observations and echolocation call design in different contexts and habitats give indirect evidence of the challenges bats encounter and the acoustic tools they use to deal with a broad range of perceptual tasks.
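The Doppler shift compensation described above can be illustrated with a short calculation. The following is a minimal sketch, not a model taken from the studies cited: it assumes a bat flying straight toward a stationary reflector, a speed of sound of 343 m/s, and the standard two-way Doppler relation. The 80 kHz reference frequency and 5 m/s flight speed are hypothetical illustration values.

```python
SPEED_OF_SOUND = 343.0  # m/s in air; assumed value


def echo_frequency(emitted_hz, flight_speed):
    """Echo frequency heard by a bat flying at flight_speed (m/s)
    straight toward a stationary reflector (two-way Doppler shift)."""
    return emitted_hz * (SPEED_OF_SOUND + flight_speed) / (SPEED_OF_SOUND - flight_speed)


def compensated_emission(reference_hz, flight_speed):
    """Emission frequency that brings the echo back at reference_hz.
    This is the inverse of echo_frequency, so the bat must call lower."""
    return reference_hz * (SPEED_OF_SOUND - flight_speed) / (SPEED_OF_SOUND + flight_speed)


if __name__ == "__main__":
    reference = 80_000.0  # Hz, hypothetical reference frequency
    speed = 5.0           # m/s, hypothetical flight speed
    emitted = compensated_emission(reference, speed)
    print(f"emit at {emitted:.0f} Hz, echo returns at {echo_frequency(emitted, speed):.0f} Hz")
```

Under these assumptions the emitted frequency falls below the reference by a few kHz, and the echo from the stationary background returns at the reference frequency, so that additional shifts introduced by fluttering prey stand out.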
The echolocating bat's auditory system receives and processes sonar calls and echoes from its environment for the task of spatial orientation, but it is essentially a “standard mammalian auditory system” (Suga, 1988, 1990; Covey and Casseday, 1999). The hearing range of bats extends from around 1kHz to well above 100kHz or even much higher, depending on the species (Moss and Schnitzler, 1995). Frequency sensitivity and resolution are extremely high in some CF-bats (Long and Schnitzler, 1975) and these auditory functions are supported by mechanical specializations of the basilar membrane and large populations of neurons tuned to sounds around the echo reference frequency of their biosonar (Bruns, 1980; Neuweiler et al., 1980). By contrast, auditory behavior, basilar membrane mechanics, and frequency tuning of neurons in the central auditory systems of FM bats show patterns similar to other mammals (Kössl and Vater, 1995; Moss and Schnitzler, 1995).
Many of the same cues used by other animals to localize sound and to process complex patterns of acoustic information are exploited by bats for spatial orientation and perception by sonar. Binaural cues for sound localization are used to estimate the azimuthal position of an acoustic target (Popper and Fay, 1995; Blauert, 1996). Monaural cues are also important for assigning a location in azimuth, but are considered essential for determining the elevation of a sound in space (Batteau, 1967; Wightman and Kistler, 1989, 1997). The bat's pinna-tragus system produces changes in the spectrum of incoming echoes, creating patterns of interference that are used by the bat to estimate target elevation (Wotton et al., 1996). Inter-aural spectral cues produced by the directionality of the two-ear system may provide additional information for determining target angle in the vertical plane (Grinnell and Grinnell, 1965; Aytekin et al., 2004). The bat estimates the third spatial dimension, target range, from the time delay between the outgoing vocalization and returning echo (Simmons, 1973), and FM-bats show extraordinary spatial selectivity along the range axis (Simmons, 1979; Menne et al., 1989; Moss and Schnitzler, 1989, 1995; Simmons et al., 1990; Surlykke, 1992).
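The relation between echo delay and target range can be written as a two-line calculation. This is a minimal sketch assuming a constant speed of sound of 343 m/s and a stationary bat and target; the function names are illustrative only.

```python
SPEED_OF_SOUND = 343.0  # m/s in air; assumed value


def echo_delay(target_range_m):
    """Round-trip travel time (s) of a call to a target and back."""
    return 2.0 * target_range_m / SPEED_OF_SOUND


def target_range(echo_delay_s):
    """Target distance (m) implied by a measured call-echo delay."""
    return SPEED_OF_SOUND * echo_delay_s / 2.0


if __name__ == "__main__":
    for distance in (0.5, 1.0, 5.0):
        print(f"{distance:4.1f} m -> {echo_delay(distance) * 1000:.2f} ms")
```

With these assumptions each meter of target distance adds a little under 6 ms of echo delay.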
As an insectivorous bat flies toward its prey, the features of its sonar vocalizations change. Sonar emission patterns have been used to divide the bat's foraging sequence into different phases: search, approach, and terminal buzz, typical for all aerial insectivores as well as for trawling species hunting for prey on or just above water surfaces (Griffin et al., 1960; Simmons et al., 1979). In general, search phase signals of aerial insectivorous bats are comparatively long in duration and narrow in bandwidth (Schnitzler and Kalko, 1998). Search signals are well suited for target detection, with energy concentrated in a restricted frequency band over an extended time window. In the big brown bat, for example, the repetition rate of search signals is low (5–10Hz), call duration long (up to 15–20ms), and frequency modulation shallow (sweeping from about 26 to 24kHz). Once the bat detects and selects a prey item, it produces approach phase signals at a higher repetition rate (20–80Hz) with steep FM (fundamental sweeping from about 65–25kHz) and reduced duration (2–5ms). In the final phase of capture, terminal buzz signals shorten further (0.5–1ms) and are produced at a very high repetition rate (up to around 170 calls per second) (Surlykke and Moss, 2000). The broadband approach and terminal phase signals are adapted for target localization in azimuth, elevation, and range (Moss and Schnitzler, 1995).
The sound production pattern of a foraging bat is illustrated in Figure 1. The calls characteristic of search, approach, and terminal buzz phases of insect pursuit are more than a stereotyped sequence of sonar signals. They form part of a complex set of adaptive behaviors to changing acoustic information (Moss and Surlykke, 2001). The bat's active control over the timing, duration, intensity, and bandwidth of its outgoing sonar transmissions allows this animal to regulate the flow of acoustic information it uses to perceive the auditory objects that comprise its dynamic environment.
Adaptations of call design to a bat's habitat and hunting strategy point to distinct acoustic features in the animal's auditory world and the problems its perceptual system must solve. Bats hunting insects out in the open sky, where all other objects are far away, often encounter fairly “clean” echo returns from isolated objects. The acoustic “snapshots” from the open sky are far less complex than those a bat encounters when hunting close to vegetation. Prey capture in mid-air may involve a limited set of cues, primarily echo delay and interaural differences of echo returns (Popper and Fay, 1995; Thomas et al., 2004). In addition to these target location cues, there is probably a more comprehensive set of situation-dependent cues comprising the bat's auditory space percepts (Müller and Schnitzler, 1999), including distance to ground or bank (Verboom et al., 1999).
At the moment of calling, the bat hears its outgoing sonar vocalization, probably at reduced intensity due to the attenuation caused by the middle ear muscles (Jen and Suga, 1976; Kick and Simmons, 1984). In between calls, there may be environmental sounds (e.g. wind noise or calls of neighboring bats) impinging on the bat's ears, but primarily there are comparatively long gaps, except when a nearby potential insect prey appears, reflecting an echo back to the bat (Figures 1B,C). The behaviors of aerial hunting bats support the notion of such a simple acoustic situation. Aerial hunting bats will try to capture pebbles or other objects thrown into the air (Griffin, 1958), indicating that they operate by a very simple rule, assigning any object echo to a potential prey item. However, they will not continue reacting to repeated pebble throws, demonstrating that if necessary, bats can make use of their extremely fine echo discrimination ability to differentiate between objects (Griffin et al., 1965; Simmons et al., 1974; Habersetzer and Vogler, 1983). Bats flying closer to vegetation adapt their signals to be shorter and more broadband (Neuweiler, 1989; Schnitzler and Kalko, 1998; Schnitzler et al., 2003). The most cluttered habitat is encountered by bats taking insects right next to or gleaning from vegetation (Figures 1B,D). The bats inhabiting this zone have adapted their calls to be extremely short and broadband (Fenton, 1990; Kingston et al., 1999). The similarity of changes in call structure, i.e. short duration and increase of bandwidth and sweep rate close to clutter, across many species from many families, points to their functional significance in a complicated scene, where short broadband calls probably serve to enhance discrimination of objects against background.
Many bats are flexible in their call design (Kalko and Schnitzler, 1993; Schnitzler and Kalko, 1998); the genus Tadarida, for example, is considered “extremely adaptable” (Simmons et al., 1979). However, access to a variety of habitats is not just a question of being able to produce the optimal sonar calls. Foraging in a complex habitat close to vegetation also requires broad wings and slow maneuverable flight, and some bats like Tadarida may be restricted to flying out in open or semi-open space due to their size and long, narrow wings (Fenton, 1990). This emphasizes that active behavior in a natural environment requires a combination of perception and motor skills.
Echolocating bats use sonar to represent the location and features of objects in the natural environment. It has been suggested that bats can discriminate between different tree types based on statistical differences between returning echoes. The conclusion was based on theoretical analysis of information content in returning echoes (Müller and Kuc, 2000; Yovel et al., 2008). Thus, in theory, bats should be able to use sonar to discriminate between different plants, given enough time. However, a bat flying at high speed, with noise from wind, conspecifics, other bats and ultrasonic insects, having to dodge branches and other obstacles while detecting minute insect prey or fruit in between leaves and twigs, may not have the time or attentional resources to do so. Furthermore, natural interactions between bats and plants imply that it is, in fact, quite difficult for bats to discriminate between plants under natural conditions. The bat-pollinated neotropical vine Mucuna holtonii directs its echolocating pollinators, the phyllostomid bat Glossophaga commissarisi, to its virgin flowers by a small concave “mirror” that works like an optical cat's eye, but in the acoustic domain, reflecting most of the energy of the bats’ echolocation calls back to the bat (Helversen and Helversen, 1999).
We hypothesize that the bat's perceptual system organizes acoustic information from a complex and dynamic environment into echo streams, allowing it to track spatially distributed auditory objects (sonar targets) as it flies. We define an echo stream as a sequence of sonar returns that can be assigned to a distinct source or auditory object. The acoustic parameters that can contribute to perceptual streams are echo direction, duration, intensity, frequency, and timing.
Here we present in Figure 2 a schematic illustration of sonar streams constructed from coherent changes in echo delay, the bat's cue for target distance. Figure 2 (upper panel) shows a bat in an environment that contains a single prey item and trees at different distances. In this simplified scenario, the insect and trees are located in front of the bat, and the bat receives echoes from these objects at different and changing delays. Each panel represents a new slice in time, separated by a fixed interval. In each panel, the bat's position relative to the trees and insect changes, as the bat pursues its prey. Below, each horizontal line in the plot corresponds to a sonar vocalization, starting from 1500ms before capture until time zero (from top to bottom on the left y-axis), when the bat intercepts the prey. The separation between the lines corresponds to the repetition rate of sonar vocalizations produced by the bat at different distances to the prey and obstacles. The resulting streams of echoes at changing delays are shown as open boxes with widths corresponding to sonar call duration and color coding as the insect (black) and trees (red, blue, and green) in the panel above. The signal durations and intervals are based on a pursuit sequence recorded from Eptesicus fuscus in the wild. The right y-axis shows the echolocation phases of insect pursuit from search to terminal buzz phase. As the bat flies closer to the trees and insect, the echo delays shorten. The echo amplitudes are estimated to illustrate relative differences due to changes in distances to the objects and the fact that bats reduce the output intensity as they close in on a target (Hartley and Suthers, 1989; Surlykke and Kalko, 2008). When the bat has passed an object, the echo delay increases as the bat's distance from that object increases, and the echo amplitude decreases rapidly to reflect the directionality of the sonar call with low intensity radiated in the backward direction.
Each of the reflecting objects appears as a distinct ridge with a particular slope, corresponding to the rate of delay change of echoes over time. In this display, one can visually identify and track the returning echoes from the trees and insect over time.
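The ridges described here can be reproduced in a toy simulation. The sketch below is an illustration under strong simplifying assumptions that are not taken from the recorded pursuit sequence: the bat flies in a straight line at constant speed, all objects (including the insect) are stationary points on the flight path, calls are emitted at a fixed interval, and sound travels at 343 m/s. All names, positions, and speeds are hypothetical.

```python
SPEED_OF_SOUND = 343.0  # m/s in air; assumed value


def echo_delays(call_times, bat_speed, object_positions):
    """For each object, return the echo delay (s) following every call.

    call_times: call emission times in seconds
    bat_speed: flight speed in m/s along a straight line starting at x = 0
    object_positions: dict mapping object name -> position (m) on that line
    """
    streams = {name: [] for name in object_positions}
    for t in call_times:
        bat_x = bat_speed * t
        for name, x in object_positions.items():
            distance = abs(x - bat_x)  # grows again once the bat has passed
            streams[name].append(2.0 * distance / SPEED_OF_SOUND)
    return streams


if __name__ == "__main__":
    times = [i * 0.1 for i in range(11)]       # one call every 100 ms
    objects = {"tree": 3.0, "insect": 6.0}     # hypothetical positions in m
    for name, delays in echo_delays(times, 5.0, objects).items():
        print(name, [round(d * 1000, 1) for d in delays])  # delays in ms
```

In this simplified geometry each object's delays change by the same amount from call to call (a slope of 2v/c in delay per unit time), so successive echoes from one object line up as a ridge, and the tree's delays begin to lengthen once the bat has flown past it.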
Figure 2 highlights two important aspects of bat echolocation which we hypothesize play an important role in the analysis of natural scenes. (1) Bats actively control the features of their sonar calls that are used to represent the environment. Here we show adjustments in call timing, duration, and amplitude with changing target distance, but other signal parameters are also adaptively adjusted. The active control of sonar vocalizations directly impacts what the bat hears and also suggests how the bat perceptually organizes dynamic acoustic information. (2) The bat may perceptually organize echoes from objects at changing positions into streams, which allow the bat to track moving auditory objects over time. In this example, echo delay streams from the insect become most salient when the bat's sonar repetition rate is high. To build a representation of echo streams, the bat must integrate echo information over time, and its vocal behavior can directly influence perceptual grouping processes. These notions are elaborated in the sections below.
While research over the last several decades has elucidated the echo features that bats use to localize and discriminate sonar targets, there remains an incomplete understanding of the larger problem of auditory scene analysis, namely, how echo features are perceptually organized to represent the natural scene. We propose that the bat's adaptive echolocation behavior can shed light on this problem.
Below, we consider field and laboratory data on echolocation behavior, showing that bats adaptively control echo direction, delay, frequency, and timing, which the animal can then use to represent a complex auditory scene. These adaptive vocal-motor behaviors give rise to distinct echo features that the bat can use to segregate and group auditory objects into echo streams. Echo streams present a natural dimension for the bat to organize acoustic information in a dynamic and complex acoustic environment. The bat's adaptive sonar signal design provides a window to the perceptual representations it builds of the environment.
The bat's sonar beam can be likened to an auditory flashlight. Beam width constitutes a spatial filter, determining which limited region of space is sampled at a given point in time. The width of the sonar beam constrains the bat's “field of view.” Different lines of data demonstrate that beam width is under the bat's dynamic control. Myotis daubentonii emits calls of higher directionality in the wild compared to those produced in the lab. At peak power frequency, 55kHz, the half amplitude angle measured in the lab was 40°, but only 20° in the wild, where the “sonar view” is narrower, but with longer range due to the higher on-axis intensity, thus sampling further ahead but not as far to the side as in the lab. Beam width may be controlled by adjusting mouth aperture, such that opening the mouth wider results in a narrower beam (Surlykke et al., 2009b). Adjusting call frequency is another way of controlling beam width. Thus, by lowering the frequency by almost an octave, vespertilionid bats increase beam width dramatically in the last phases of the pursuit (Jakobsen and Surlykke, 2010). Face morphology also affects beam width and shape. Bats from the families Rhinolophidae and Phyllostomidae emit their calls through the two nostrils. Mouth-emitting bats have simple faces resembling those of other mammals, but nose-emitting bats are characterized by a nose-leaf, a fleshy structure around and above the nostrils. The association between nose-emission and nose-leaf suggests a role for the nose-leaf in sonar beam shaping. Nose-leaf morphology can be quite complicated, but in phyllostomid bats from the New World tropics, with relatively simple lancet-shaped nose-leaves, this fleshy structure around the nostrils seems to restrict the beam mainly in the vertical direction (Vanderelst et al., 2010).
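The dependence of beam width on emitter aperture and call frequency can be sketched with a circular piston in an infinite baffle, a common textbook simplification of a sound emitter. This model is an assumption introduced here for illustration and is not described in the text above; the emitter radius is a hypothetical value, and the "half amplitude angle" is taken to mean the off-axis angle at which sound pressure falls to half its on-axis value.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air; assumed value


def bessel_j1(x, terms=30):
    """First-order Bessel function of the first kind, by power series."""
    return sum(
        (-1) ** m / (math.factorial(m) * math.factorial(m + 1)) * (x / 2.0) ** (2 * m + 1)
        for m in range(terms)
    )


def piston_directivity(x):
    """Relative pressure amplitude 2*J1(x)/x, where x = k * a * sin(theta)."""
    return 1.0 if x == 0.0 else 2.0 * bessel_j1(x) / x


def half_amplitude_angle_deg(frequency_hz, radius_m):
    """Off-axis angle (degrees) where amplitude drops to half the on-axis value."""
    lo, hi = 0.0, 3.8  # directivity falls monotonically on this interval
    for _ in range(60):  # bisection for piston_directivity(x) == 0.5
        mid = (lo + hi) / 2.0
        if piston_directivity(mid) > 0.5:
            lo = mid
        else:
            hi = mid
    wavenumber = 2.0 * math.pi * frequency_hz / SPEED_OF_SOUND
    sine = ((lo + hi) / 2.0) / (wavenumber * radius_m)
    return 90.0 if sine >= 1.0 else math.degrees(math.asin(sine))


if __name__ == "__main__":
    radius = 0.005  # m, hypothetical effective emitter radius
    for freq in (55_000.0, 27_500.0):
        print(f"{freq / 1000:.1f} kHz: {half_amplitude_angle_deg(freq, radius):.1f} deg")
```

Within this simplified model, a larger aperture narrows the beam and a lower frequency broadens it, which is the direction of both effects described above.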
By directing the beam axis towards specific objects, the bat influences which echo information it samples from the environment. Phyllostomid bats may move the nose-leaf to steer the beam independent of head aim to selectively sample echoes from certain objects (Weinbeer and Kalko, 2007). Vespertilionid bats and other mouth emitters control beam direction and thus the region of space they inspect by moving the head. As mentioned earlier, CF bats employ a different strategy to sort echoes from vegetation and insect prey, by listening for Doppler shifts introduced by prey wing movements (Schnitzler and Flieger, 1983; von der Emde and Schnitzler, 1986, 1990; von der Emde and Menne, 1989). These examples serve to illustrate the enormous diversity of bats and indicate that there may be more than one echolocation solution to the perceptual challenges that arise in a complex acoustic scene. The echolocating bat's sonar adaptations to different habitat and foraging requirements in the wild present a window to the animal's active analysis of acoustic information in a natural environment and can be used to motivate carefully designed laboratory studies.
Laboratory studies of sonar emission patterns of the big brown bat, E. fuscus, show that sonar beams are broad enough to collect echo information from objects within a 60–90° cone (Hartley and Suthers, 1989), which could enable simultaneous inspection of several objects in the frontal plane (Ghose and Moss, 2003). However, the results of a recent study clearly demonstrate for the first time that bats encountering a complex environment shift the directional aim of their sonar beam to accurately and sequentially point the central axis of the sonar beam in the direction of closely spaced objects (Surlykke et al., 2009a). This was determined by taking measurements of the directional aim of the bat's sonar beam as it performed a dual task, obstacle avoidance and insect capture. Bats were trained to fly through one of two openings in a fine net. Behind only one of the openings a tethered insect was presented, at variable distances behind the net (see Figures 3A,B). A microphone array was used in this study to reconstruct the sonar beam pattern as the bat inspected the net obstacle and the prey (Figures 3A,C). Changes in the directional aim (acoustic gaze) of sonar calls showed that the bat first inspected the obstacle, and then shifted its gaze to the more distant prey, before negotiating its way past the obstacle. The bat sequentially shifted the axis of its sonar beam to different objects with an accuracy of ~5° (see Figure 3E). These findings indicate that the bat negotiates a complex acoustic environment by shifting its attention (acoustic gaze) sequentially to closely spaced objects. Doppler shift compensation measured in a CF bat also suggests shifts in attention to different walls in a flight room (Hiryu et al., 2005).
Visual animals may engage similar behavioral strategies by moving their eyes to sequentially sample closely spaced objects in a scene (e.g. Kano and Tomonaga, 2009). We therefore propose that the components of scene analysis detailed here for the echolocating bat apply more generally to the analysis of natural scenes in a broad range of animals that rely on different sensory modalities.
Bat sonar operates in three-dimensional space, and the bat's echolocation behavior influences the information it gathers along the range axis. In particular, the bat makes adjustments in the intensity and duration of calls to sample echoes from targets at different distances, i.e. higher intensity and longer duration for more distant objects.
Emitted intensity directly impacts the sonar operating range. The operating range of echolocation varies with species and habitat, and refers to the distance over which echoes return to the bat's ears at levels that are above the animal's detection threshold. Adequate sound energy must impinge upon objects to return an audible echo. Bats that fly high in open space may be able to detect echoes from the ground many tens of meters away, due to the large reflection surface of the ground, which is not a point target (Jung et al., 2007). Ultrasound attenuates rapidly with distance (Lawrence and Simmons, 1982), and to collect audible echoes from small insects, some bats produce echolocation calls with source levels (at 0.1m from the mouth) up to ca. 140dB SPL (Surlykke and Kalko, 2008), a finding that demonstrates call intensities emitted in the wild are much greater than previously believed (Griffin, 1958). Furthermore, bat sonar call intensities depend on habitat, with the highest intensities being emitted by bats hunting in uncluttered space (Holderied et al., 2005; Surlykke and Kalko, 2008) whereas bats hunting closer to clutter produce lower intensities (Kingston et al., 2003; Brinkløv et al., 2010). The lowering of emitted intensity in the lab parallels other acoustic changes in sonar signals, for example, shorter calls with broader bandwidths (Surlykke and Moss, 2000), indicating that a collection of call parameter adjustments aid sonar operation in more cluttered environments (Brinkløv et al., 2010). Clutter from more distant objects is reduced by the attenuation of call intensity, and echo returns become less noisy (Johnson et al., 2008).
The duration and timing of the sonar calls are also tied to sonar range. Holderied et al. (2005) showed that call duration increased with emitted intensity in E. bottae, an open air forager, and they estimated a maximum detection range of 21m for large objects. Thus, the longer the range, the more intense and the longer the call. The same relation between duration and intensity was shown for E. fuscus in the lab (Møhl and Surlykke, 1989). In general, bats emitting FM signals avoid overlap between their outgoing sounds and the returning echoes (Figure 3D), and they wait until the echo has been received before producing the next sonar call, hence “placing” the echo in a window between the end of one pulse and the beginning of the next. Holderied and von Helversen (2003) report that the interval between successive sonar calls fits an estimate of maximum detection range for 11 European bat species, and they suggest that this match ensures receipt of a prey echo before producing the next call. The same timing control of signals is observed in echolocating dolphins (Au, 1993), indicating that this is a general strategy for echolocating animals to avoid call-echo assignment confusions.
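The echo window between the end of one pulse and the beginning of the next can be translated into a range window. The sketch below is an illustration under stated assumptions: the pulse interval is measured from call onset to call onset, the echo has the same duration as the call, the speed of sound is 343 m/s, and the bat is treated as stationary during one interval. The search and buzz values are taken from the call parameters quoted earlier for the big brown bat; the function name is hypothetical.

```python
SPEED_OF_SOUND = 343.0  # m/s in air; assumed value


def echo_window(pulse_interval_s, call_duration_s):
    """Return (nearest, farthest) target range in meters for which the whole
    echo falls between the end of one call and the onset of the next."""
    nearest = SPEED_OF_SOUND * call_duration_s / 2.0
    farthest = SPEED_OF_SOUND * (pulse_interval_s - call_duration_s) / 2.0
    return nearest, farthest


if __name__ == "__main__":
    phases = {
        "search (10 Hz, 15 ms calls)": (1.0 / 10.0, 0.015),
        "buzz (170 Hz, 0.5 ms calls)": (1.0 / 170.0, 0.0005),
    }
    for label, (interval, duration) in phases.items():
        near, far = echo_window(interval, duration)
        print(f"{label}: {near:.2f} m to {far:.2f} m")
```

Shorter calls move the near edge of the window closer to the bat, while shorter intervals pull the far edge in, so the high-rate, short-duration buzz calls confine the unambiguous window to targets at close range.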
As a bat inspects an object, it adjusts the duration of its call to avoid overlap between sonar pulses and echo returns from the target of interest (see Figure 3D). Such vocal-motor adjustments in call duration provide an indirect measure of where the bat is attending along the distance axis. This is illustrated in data from laboratory studies. For example, in a target discrimination task, the big brown bat adjusted the duration of calls to the distance of the object it was inspecting (Falk et al., 2010). The bat was trained in this study to discriminate between two small (16mm diameter) spheres, one smooth (S+) and the other textured (S−). The bat received a food reward for tapping S+ as it flew by. When the bat flew towards the S− sphere, it initially decreased the duration of its calls to avoid overlap with echoes from the object. However, before flying past this non-rewarded target, the bat increased the duration of its calls, experiencing pulse-echo overlap, suggesting that it had shifted its attention to a more distant rewarded target (S+). The data from this study indicate that the bat adjusted its calls to sample echo information from different objects at changing distances. Similarly, in the obstacle avoidance/insect capture task described above, the bat made adjustments in call duration as it approached the net, shortening its calls to avoid pulse-net echo overlap (Surlykke et al., 2009a). When the bat was close to the net and had planned its path through the opening, but before flying through, it increased the duration of its calls, tolerating pulse-net echo overlap. Directional aim of the sonar beam provided an independent measure of where the bat was focusing its sound and thus confirmed that call duration was adjusted to the more distant worm (Figure 3E). Data from these studies demonstrate that the big brown bat makes vocal-motor adjustments to shift its acoustic gaze to sequentially sample different objects along the range axis.
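The logic of using call duration as an indicator of attended range can be made concrete with a small check for pulse-echo overlap. This is a sketch under assumptions, not the analysis used in the studies cited: sound speed is taken as 343 m/s, overlap is defined as the echo onset arriving before the call has ended, and the object distances and call durations are hypothetical.

```python
SPEED_OF_SOUND = 343.0  # m/s in air; assumed value


def overlap_free_duration(target_distance_m):
    """Longest call (s) that ends before the echo from this distance returns."""
    return 2.0 * target_distance_m / SPEED_OF_SOUND


def overlapped_objects(call_duration_s, object_distances_m):
    """Names of objects whose echoes start before the call has ended."""
    return [
        name
        for name, distance in object_distances_m.items()
        if call_duration_s > overlap_free_duration(distance)
    ]


if __name__ == "__main__":
    scene = {"near sphere": 0.3, "far sphere": 1.2}  # hypothetical distances, m
    for duration_ms in (1.0, 3.0):
        hit = overlapped_objects(duration_ms / 1000.0, scene)
        print(f"{duration_ms} ms call overlaps echoes from: {hit or 'nothing'}")
```

In this hypothetical scene a 3 ms call overlaps the echo from the near object but not from the far one, which is the pattern interpreted above as a shift of attention to the more distant target.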
Shortening a broadband signal is typical of bats negotiating more cluttered environments. A broadband, brief signal allows for more precise time determination of echo return (Surlykke, 1992; Figure 4), which makes it easier to discriminate between prey and background. Field results supporting this notion showed, based on duration, frequency, and bandwidth, that the clutter-adapted M. septentrionalis has a shorter maximum sonar operating range than M. lucifugus, a species known to forage in a variety of habitats but mainly in uncluttered areas (Broders et al., 2004). Myotis nattereri emits calls of extremely broad bandwidth, with a downward FM-sweep from 130 to 30kHz, i.e. spanning two octaves, and is able to forage within a few cm of clutter (Ratcliffe and Dawson, 2003; Siemers and Schnitzler, 2004).
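The bandwidth of an FM sweep in octaves follows directly from the ratio of its start and end frequencies. The short sketch below applies this to the M. nattereri sweep mentioned here and to the big brown bat search and approach sweeps quoted earlier; the function name is illustrative.

```python
import math


def sweep_octaves(start_khz, end_khz):
    """Bandwidth of a frequency sweep in octaves (log2 of the frequency ratio)."""
    high, low = max(start_khz, end_khz), min(start_khz, end_khz)
    return math.log2(high / low)


if __name__ == "__main__":
    sweeps = {
        "M. nattereri (130 to 30 kHz)": (130.0, 30.0),
        "E. fuscus approach (65 to 25 kHz)": (65.0, 25.0),
        "E. fuscus search (26 to 24 kHz)": (26.0, 24.0),
    }
    for label, (start, end) in sweeps.items():
        print(f"{label}: {sweep_octaves(start, end):.2f} octaves")
```

The 130 to 30 kHz sweep comes out at slightly more than two octaves, whereas the shallow search-phase sweep covers only a small fraction of an octave.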
In this section, we have provided several examples of sonar call duration and intensity adjustments to objects at different distances, which contribute to the bat's perception of the natural scene. Auditory scene analysis invokes shifts in attention to object features and locations, and discretely sampled information must be ultimately integrated over time and space (Moss and Surlykke, 2001). The data summarized here show that the bat adapts the amplitude and temporal features of its calls to inspect objects at different distances, and serve to illustrate how this animal's actions play directly into its representation of a complex environment.
When bats forage in cluttered environments, where a single sonar call results in a cascade of sonar echoes, the animal may experience ambiguity about the delays of echoes associated with a given sonar call. As noted above, bats typically adjust the intervals between successive calls to avoid such ambiguity, waiting for echo returns before producing the next sonar call. However, echolocating bats sometimes encounter complex environments that prevent sorting of calls and echoes by pulse interval (PI) adjustments alone. Hiryu et al. (2010) discovered that big brown bats, E. fuscus, flying through an array of echo reflecting obstacles made frequency adjustments between alternating sonar calls, presumably to tag time dispersed echoes with a given sonar call by using echo frequency information. This result suggests that bats may treat the cascade of echoes following each sonar vocalization as representing one complete view of the auditory scene. If the integrity of one view of the acoustic scene is compromised by overlap of one echo cascade with the next, the bat changes its call frequencies to create the conditions for segregating echoes associated with a given sonar vocalization, thus providing strong evidence for the bat's use of frequency cues to sort information about a complex acoustic scene.
When bats forage together in groups, they face an additional challenge, namely to separate the echoes of their own signals from the signals and echoes of neighboring bats. Call design is much more similar within a species than across species, which would seem to create severe problems for acoustic orientation and prey detection in proximity of conspecifics. However, field data show that a number of species, e.g. Pipistrellus pygmaeus, M. daubentonii, Rhinopoma hardwickei (Habersetzer, 1981), Tadarida brasiliensis (Gillam et al., 2007) or Noctilio sp. (Barak and Yom-Tov, 1989) hunt in groups, perhaps due to food abundance or water availability. Sound recordings from a group of Noctilio leporinus and N. albiventris in Panama (cited in Moss and Surlykke, 2001) reveal a chaos of sound, where it would appear to be absolutely impossible to detect prey echoes. However, the bats feed successfully under these acoustically challenging conditions.
A recent laboratory study investigated strategies used by echolocating animals to reduce interference from conspecifics by placing pairs of big brown bats in a situation where they competed for a single prey item (Chiu et al., 2009). This laboratory study used high-speed 3-D video and microphone array recordings that permitted unambiguous assignment of calls to the individual vocalizing bat. The results showed that the big brown bat made adjustments in the spectral characteristics of its calls when it flew with conspecifics, and the magnitude of these adjustments depended on the baseline similarity of calls produced by the individual bats when flying alone (Figure 5). Bats that produced sonar calls with similar baseline signal design made larger adjustments in their sonar calls than those bats whose baseline call designs were already dissimilar. Field recordings from the same species showed frequency adjustments of up to 8 kHz, when two individuals flew closely together (Surlykke and Moss, 2000). Bates et al. (2008) demonstrated that frequency adjustments of paired big brown bats can aid in target detection. It is noteworthy that free-tailed bats, Tadarida brasiliensis, can prevent mutual interference by avoiding emission of sounds at the same time (Jarvis et al., 2010). Also, Gillam et al. (2007) reported that free-tailed bats changed emitted call frequency in response to signal playbacks in the field by 3–4 kHz, corroborating Habersetzer's (1981) suggestion that the frequency shifts in Rhinopoma hardwickei were jamming avoidance responses (see Table 1). These findings imply that frequency features of sonar calls produced by different bats aid each individual in segregating echoes of its own sonar vocalizations from the acoustic signals of neighboring bats (Chiu et al., 2009).
Distinct frequency components of an individual's calls could be used by the bat to hear out the signals of interest (echoes from its own calls) from background (calls and echoes from other bats).
The same laboratory study of bats foraging in pairs led to the surprising discovery that echolocating bats sometimes go silent. The prevalence of silent behavior depended on the spatial separation of the bats as they flew together and also on the baseline similarity of their calls when they flew alone, suggesting that silence is at least in part a jamming avoidance response. In addition, the trailing bat tended to go silent, raising the possibility that it used the vocalizations and echoes from the leading bat to orient (Chiu et al., 2008). Silent behavior in bats suggests possible connections between scene analysis by echolocation in bats and other animals that listen passively to the natural soundscape (Figure 6).
The problem of jamming is not restricted to conspecifics. Sympatric bats and other ultrasound-emitting animals also contribute to the complexity of the soundscape. In the tropics, where bat density and diversity are particularly high, the variety of call design across species is pronounced. Many bats emit search calls with alternating dominant frequency (e.g. Emballonuridae; Jung et al., 2007), and some bats (e.g. Cormura brevirostris, Molossus molossus) produce even more sophisticated calls with three or more different tones (Guillén-Servent and Ibáñez, 2007; Surlykke and Kalko, 2008). A range of fast and high flying, aerial hawking bats regularly alternate peak frequencies between subsequent calls during search flight, as is known for the European noctule Nyctalus noctula with its characteristic “plip-plop” search calls. The significance of this behavior remains controversial (Kingston et al., 2003). Frequency changes may help bats to effectively sort and assign echoes from their own calls in a complex environment, where ambiguity about call-echo assignment can arise. Another advantage of alternating calls is that the “time stamp” of each call increases maximum detection range by marking calls to discriminate between echoes of successive calls (Jung et al., 2007). The emballonurid bat, Saccopteryx bilineata, is well known for emitting calls alternating between 43 and 47 kHz (Jung et al., 2007). However, it may skip the low frequency and emit only high frequency calls, apparently when in transit between the roost and hunting ground. The frequency alternation always occurs in hunting situations, which further supports the notion that frequency alternation is adaptive for separating target and background echoes (Ratcliffe et al., 2010).
These observations point to the high importance of frequency as a cue in echolocating bats for separating and tracking auditory objects. The bat may listen in on a restricted frequency band to separate its own calls and echoes from those of other echolocating bats. In radar technology, “frequency hopping” (Jankiraman, 2007) is used to reduce jamming by rapidly switching the frequency of the transmitted energy and detecting only that frequency during the receiving time window. A broad range of stimulus features may be used to sample information from the natural environment for the analysis of natural scenes. Data presented in this section provide examples of active adjustments of sonar call spectral features by bats foraging in the presence of conspecifics and other sympatric bats. Such vocal-motor adjustments may allow the bat to perceptually segregate echoes from its sonar calls from the signals of neighboring bats.
Echolocation calls are discrete signals that yield brief acoustic snapshots of the environment. Because of the interrupted nature of sonar calls and echoes, bats must integrate echo information over time to build up a representation of an auditory scene and track dynamic objects. Does the temporal patterning of bat vocalizations yield insights into the information-processing strategies and integration windows for representing auditory objects in the natural scene? In human listening experiments, the perception of auditory streams depends not only on the spectral features of acoustic signals but also on the temporal spacing between stimuli (Bregman, 1990). For example, when a human listener is presented with a series of tones that alternate between high and low frequencies at a low temporal rate, the subject hears out the individual tones. However, when the rate of presentation is increased, the listener begins to perceive two separate sound streams, one low frequency and the other high frequency. The perception of auditory streams, in this example, depends on the frequency separation of high and low tones and the temporal interval between successive sounds (Bregman and Campbell, 1971). A similar phenomenon occurs in visual motion perception: when neighboring lights flash slowly and sequentially along the perimeter of a marquee, a viewer perceives each flash as a separate visual event; however, if neighboring lights flash sequentially at a higher rate, the viewer perceives a stream of light that moves along the perimeter. The perception of visual movement in this example depends on the spatial separation of the neighboring lights and the time interval between successive flashes (Körte, 1915).
The interval between successive echolocation calls directly impacts the timing of echo returns, with consequences for sorting and tracking sonar objects in a dynamic environment. Literature on bat echolocation behavior typically describes a continuous and regular decrease in pulse interval with a reduction in target range (e.g. Nachtigall and Moore, 1988). However, as detailed below, the decrease may not be as regular as generally characterized. The temporal patterning often contains periods of stable call production, embedded in sequences with decreasing call intervals. As noted above, humans report that the perception of auditory streams depends on the timing of acoustic stimuli (Bregman, 1990), and by extension, the bat's temporal control over echo returns would be expected to influence the animal's perceptual representation of objects comprising the natural scene. Therefore, quantitative analysis of the temporal characteristics of sonar calls produced by bats in complex auditory environments provides insight to the information that allows the bat to segregate, group, and track echoes from different objects over time and space.
Analysis of sounds produced by the bat E. fuscus as it forages shows that the sound repetition rate does not change continuously over time. Rather, it remains relatively high and stable over extended periods: during the approach phase, the sound repetition rate may plateau at around 50–60 Hz, interrupted by longer PI gaps, for time periods as long as 200 ms (Moss et al., 2006; see Figure 7). The grouping of calls over time periods exceeding a wingbeat cycle demonstrates that the link between respiration and call emission (Suthers et al., 1972) may be broken, even though coupling call emission to wing beat saves energy (Waters and Wong, 2007). Thus, we infer that when the rhythm is broken, the bat makes adjustments in sonar call timing in response to perceptual demands for echo streaming. The breaks in pulse production serve to open up a temporal window for the bat to listen for echoes from more distant objects before producing the next sonar vocalization. The bat may also need the longer intervals between sound groups to integrate echo sequences and update motor behaviors (Wilson and Moss, 2003). The stable periods of sound repetition rate (sonar “strobe groups”) occur when the bat is selecting a target, changing the direction of its flight path, or flying in proximity to obstacles (Moss et al., 2006).
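The idea of stable-rate sound groups separated by longer gaps can be illustrated with a simple run-finding procedure on call onset times. The sketch below is an illustration under stated assumptions, not the criterion used in the cited studies: the function name, the 5% stability tolerance, and the three-call minimum are all hypothetical choices.

```python
def find_sound_groups(call_times, stability_tol=0.05, min_calls=3):
    """Find runs of calls emitted at a near-constant pulse interval (PI).

    call_times    : call onset times in any consistent unit, in ascending order.
    stability_tol : assumed tolerance; a PI belongs to the current run if it
                    differs from the run's first PI by no more than this fraction.
    min_calls     : assumed minimum number of calls for a run to count as a group.
    Returns a list of groups, each a list of indices into call_times.
    """
    intervals = [t1 - t0 for t0, t1 in zip(call_times, call_times[1:])]
    groups = []
    start = 0  # index of the first interval in the current run
    for i in range(1, len(intervals) + 1):
        run_ended = (
            i == len(intervals)
            or abs(intervals[i] - intervals[start]) > stability_tol * intervals[start]
        )
        if run_ended:
            # Intervals start..i-1 span calls start..i.
            if i - start + 1 >= min_calls:
                groups.append(list(range(start, i + 1)))
            start = i
    return groups


# Hypothetical onset times in ms: 18 ms PIs (about 56 Hz) split by one 50 ms gap.
example_groups = find_sound_groups([0, 18, 36, 54, 104, 122, 140])
```

With these hypothetical onset times, the calls before and after the 50 ms gap are returned as two separate groups. A real analysis would need to tune the tolerance and minimum group size to the recording conditions.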
Grouping of sounds has been reported from field studies of a number of bats from different families, e.g. the vespertilionids M. nattereri (Melcón et al., 2007) and E. fuscus (Surlykke and Moss, 2000), Phyllostomidae (Weinbeer and Kalko, 2007), and Rhinolophidae (Schnitzler, 1968). Also, small vespertilionid bats from Malaysia hunting close to clutter emit groups of 2–15 pulses at high pulse repetition rates (37±105 Hz) of high frequency and very broad bandwidth. The longest pulse groups and highest within-group repetition rates have been recorded in Kerivoula pellucida, the most maneuverable of the nine studied Kerivoulinae and Murininae species, supporting the hypothesis that inter-call interval indicates the degree of clutter tolerance (Kingston et al., 1999).
We hypothesize that stable signal repetition rates have immediate consequences for the bat's perception of space (Moss and Surlykke, 2001; Moss et al., 2006). Neural recordings from the midbrain of the awake bat show that sonar “strobe groups” influence spatial-temporal response profiles of neurons in the bat auditory system that are hypothesized to play a role in coding sonar target distance. Specifically, a class of echo-delay tuned neurons in the bat superior colliculus exhibit facilitated responses and narrow tuning to pulse-echo pairs presented at stable PIs. However, the echo-delay tuning collapses when stimulus intervals are jittered, even by as little as 20–30% (Gifford and Moss, 2005; Ulanovsky and Moss, 2008). This finding suggests that activity of neurons responsive to echo delay can be gated by the temporal stability of successive sonar calls, which in turn may influence the animal's perception of targets along the range axis.
The bat's adjustments of sonar signal repetition rate and duration are tied to target range; however, echolocation parameters also depend on the bat's azimuth and elevation relative to a selected prey item, and most importantly, its plan of attack. For example, when a bat approaches an insect, flies past it and returns to intercept it, the temporal patterning of the animal's sonar signals is distinctly different from that produced by a bat that flies directly to intercept the insect (Moss and Surlykke, 2001). Thus, the temporal patterning of the bat's echolocation signals provides explicit data on the motor commands that feed directly back to the auditory system for spatially-guided behavior. The temporal clustering of calls into groups may also reveal the time window over which echolocating bats integrate pulse-echo information to build up a representation of the auditory scene.
In summary, echolocating bats produce sounds in groups, and the bat may use the collection of echoes from such sound groups to process and update acoustic information gathered from the environment. Given the importance of temporal patterning to the perceptual organization of sound patterns in human listeners, we propose that the sound groups contribute to the bat's perceptual organization of echo streams from a spatially complex environment.
Several themes emerged from our review that we discuss below, both in the context of scene analysis by echolocation and in other animal systems. We propose that the components of scene analysis detailed here for the echolocating bat apply more generally to the analysis of natural scenes in a broad range of animals that rely on hearing, as well as other senses.
The bat's active sonar system offers indirect access to its perceptual world, which then presents a special opportunity to tap into the processes supporting the analysis of natural auditory scenes. The features of a bat's sonar signals have a direct impact on the echo information available to its acoustic imaging system. In turn, the bat's perception of objects in the environment influences its motor behavior. Therefore, the bat's adaptive sonar behavior can shed light on its perception of auditory objects in a natural scene and how its control over sonar signals can contribute directly to perceptual grouping and segregation of dynamic sound streams.
Research from field and laboratory studies demonstrates the bat's control over the frequency, timing, and direction of sonar calls, which leads us to propose that these parameters are used by the bat to segregate and track auditory objects in a dynamic environment. Figure 8 summarizes in schematic form the finding that bats echolocating in a complex environment adjust the frequency and/or direction of sonar vocalizations to stabilize these parameters in echo returns. By maintaining some constancy in sound frequency (Figure 8A) and direction (Figure 8B), the bat may be able to hear out auditory streams from selected objects in the midst of echoes from background targets and signals from other bats. As a bat flies towards a target, echo delay necessarily changes, and the bat must track coherent patterns of object distance changes over time. In the case of changing echo delay (Figure 8C), the bat may be able to hear out streams of echo delay that are shortening in a predictable temporal pattern, which depends on the angle between the bat's flight direction and the object.
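The dependence of the echo-delay pattern on the angle between flight direction and object can be sketched geometrically. The example below is a simplified illustration, not a model from the reviewed studies: it assumes a stationary object, a bat flying in a straight line at constant speed, and a sound speed of 343 m/s; the function name and parameters are hypothetical.

```python
import math


def echo_delays_s(initial_range_m, angle_deg, speed_m_s, call_times_s, c=343.0):
    """Echo delays from a stationary object for a bat on a straight flight path.

    angle_deg is the angle at t = 0 between the flight direction and the line
    from the bat to the object (0 means flying straight at the object).
    The sound speed c is an assumed value.
    """
    theta = math.radians(angle_deg)
    delays = []
    for t in call_times_s:
        travelled = speed_m_s * t
        # Law of cosines gives the bat-to-object distance after flying `travelled`.
        squared = (initial_range_m ** 2 + travelled ** 2
                   - 2.0 * initial_range_m * travelled * math.cos(theta))
        current_range = math.sqrt(max(0.0, squared))
        delays.append(2.0 * current_range / c)
    return delays


# Hypothetical approach: object 3.43 m away, bat flying at 3.43 m/s.
call_times = [0.0, 0.25, 0.5]
head_on = echo_delays_s(3.43, 0.0, 3.43, call_times)
oblique = echo_delays_s(3.43, 60.0, 3.43, call_times)
```

Under these assumptions the head-on delays shorten linearly with time, whereas the oblique delays shorten more slowly, so objects at different angles produce distinguishable delay trajectories across successive calls.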
Bat echolocation highlights the importance of action to perception, as this animal's motor behaviors give rise to the very stimuli that it uses to guide behaviors. It is important to note, however, that action influences perception, not only in echolocating bats, but in a variety of animal systems that rely on different modalities for sensing their environments. Some senses are referred to as active, because the animal detects stimuli that result from its own production of energy. This applies to echolocation in bats and toothed whales that produce and process sound energy to perceive objects in their surroundings, and also to electrolocation in African mormyrid and South American gymnotiform fishes that generate electric fields to orient in murky waters. The dynamic adjustment of signal production by active sensing animals has immediate influence on the stimuli they can use to monitor their perceptual worlds.
The link between action and perception may be less obvious in so-called passive sensing animals that rely on environmental energy to perceive their surroundings, but it is no less important. For example, vision involves active processes to seek out task-relevant visual information. Visual input is determined by gaze control, modulated by head movements, fixation-saccade cycles of eye movements, and accommodation. Active gaze control has been demonstrated in visual animals as different as jumping spiders (Tarsitano and Andrew, 1999), stalk-eyed flies (Ribak et al., 2009), zebra finches (Eckmeier, 2008), and humans (Henderson, 2003).
Perception also depends on motor adjustments in other “passive” sensory modalities. Acoustic cues for passive sound localization are enhanced by head movements that bring the sound source into auditory midline, as has been convincingly demonstrated in behavioral experiments with barn owls (Knudsen et al., 1979; Konishi and Knudsen, 1979) and cats (Tollin and Populin, 2005). Acoustically orienting robots (TeleHead) confirmed that dynamic cues produced by head movements play important roles in auditory localization (Toshima and Aoki, 2006). Similarly, the control of perception through action is essential for animals relying mainly on olfaction (sniffing, Bensafi et al., 2003), whisking (movement of whiskers, Metha et al., 2007), and touch (Catania and Remple, 2004). The echolocating bat is a valuable model for studying the role of action in perception, not because it is unique, but because its dynamic call modifications provide us with direct access to the link between action and perception.
Echolocating bats experience a dynamic acoustic environment as they fly. The bat's own movement, combined with movement of insect prey, results in time-varying echo parameters that the bat must track over time. To accomplish basic tasks like insect capture, the bat must integrate brief “acoustic snapshots” of objects in the environment to build a representation of a natural scene. An insectivorous bat must also anticipate its point of contact with prey to prepare appropriate movements for target interception with the wing or interfemoral membrane. These perceptual and behavioral requirements suggest that sequential analysis of acoustic stimuli is central to the bat's perception of its natural scene.
Psychophysical experiments have demonstrated that echolocating bats can integrate acoustic information that arrives sequentially over time (Moss and Surlykke, 2001). In human listening experiments, the integration of acoustic stimuli into perceptual streams depends critically on the interval between successive sounds (Bregman, 1990), and we argue, based on the adaptive changes of temporal patterning in sonar calls, that a similar perceptual phenomenon operates in bats. Thus, the production of sonar sound groups, recorded in both the laboratory and the field (e.g. Kingston et al., 1999; Moss et al., 2006; Weinbeer and Kalko, 2007) reflects the bat's control over echo streams. Since the bat actively adjusts the timing of its signals, we infer that the intervals selected by the animal under different environmental and task conditions contribute to its perceptual segregation and integration of echo returns over time.
Visual animals with well-developed fovea move their eyes to scan and fixate objects in the environment (Land and Hayhoe, 2001). Sequential fixations are interrupted by saccades, during which visual perception is suppressed (Volkmann et al., 1978). Thus, high-resolution snapshots of visual information must be integrated over time to perceive a continuous and stable world, as reported by human viewers. The agile flight of echolocating bats in cluttered environments (Fenton, 1990; Kingston et al., 1999; Ratcliffe and Dawson, 2003; Siemers and Schnitzler, 2004) shows that they can operate by listening to stroboscopic echo returns, and this leads us to hypothesize that bats integrate dynamic and interrupted sensory information to represent complex natural scenes. This hypothesis is bolstered by laboratory studies with simplified stimuli (Moss and Surlykke, 2001).
Attention, learning, and memory contribute to the analysis of natural scenes, as these cognitive processes allow an animal to efficiently manage the sensory load from a complex and noisy environment (e.g. Henderson and Hollingworth, 1999; Knudsen, 2007; Walker et al., 2008). In laboratory studies reviewed in this paper, we present several examples of the bat's adaptive control of the sonar beam aim and range (e.g. Ghose and Moss, 2006; Ghose et al., 2006; Surlykke et al., 2009a; Falk et al., 2010), which directly influence the direction and amplitude of echo returns from a limited region in space. Since the animal's beam directing behavior determines the echo information it samples, we propose that the bat's “acoustic gaze” provides an indicator of its attention to objects in space, similar to the relation between foveation and spatial attention in visual animals. In other words, the bat's sonar beam may be a physical manifestation of the animal's “spotlight of attention” (Broadbent, 1958), allocating information-processing resources to a restricted region of a complex and noisy natural scene.
Learning and memory can also reduce the information-processing load on the bat's sonar imaging system. Both anecdotal reports and experimental findings document that bats performing routine tasks in familiar environments rely on memory rather than echo information to orient and navigate. Changes in familiar environments sometimes lead to mistakes by the bat, as it fails to listen to echoes that could guide appropriate behavior. For example, Griffin (1958) describes bats, returning to the roost after an evening's hunt, crashing into a newly erected cave barricade even though it reflected strong echoes. Griffin termed this mishap the “Andrea Doria Effect,” because the bats seemed to ignore important information from echo returns to guide their behavior and instead favored spatial memory. More recently, Jensen et al. (2005) conducted a laboratory study to investigate the bat's use of an acoustic landmark to guide spatial navigation. Bats learned to associate a landmark with passage through a net and could reliably navigate the obstacle when the landmark and net opening were moved together. However, on catch trials, when the landmark and net opening were moved to an unfamiliar configuration, the bat crashed into the echo-reflecting net at a location adjacent to the landmark, where the animal had come to anticipate an opening. These findings suggest that spatial orientation in a complex environment can be aided by learning and memory, and that this strategy fails only when (often artificial) changes occur in the environment.
Echolocating bats actively control the features of their sonar calls in response to information gathered from the environment, and we hypothesize that active adjustments in sonar signal design play directly into bat perception of complex and dynamic acoustic scenes. This hypothesis cannot be tested using phenomenological reports, as in many studies of human auditory scene analysis. Instead, objective measures of the bat's adaptive behaviors provide us with a window to the animal's perception. Here, we have reviewed field and laboratory studies of bat sonar behavior that demonstrate adjustments in call direction, duration, intensity, timing, and frequency, and we propose that these signal parameters are fundamental to the perceptual grouping and segregation of auditory objects in a complex environment.
Grouping and segregation of auditory objects is a general problem of auditory scene analysis that all hearing animals must solve. When a bat echolocates in a complex environment, a single vocalization results in a cascade of echoes from objects at different directions and distances. By the time a flying bat produces a subsequent vocalization, the relative position of these objects changes, creating a new “acoustic snapshot” of the environment. Grouping echoes returning from different objects across calls invokes sequential scene analysis processes, which may give rise to distinct perceptual streams. Parsing information carried by overlapping echoes from different objects invokes simultaneous scene analysis processes. The bat's sequential analysis of echo returns enables target tracking, while its simultaneous analysis enables figure-ground segregation. Target tracking and figure-ground segregation are commonly viewed as tasks of the visual system, and this parallel with echolocation suggests that comparative studies across animal systems and modalities may help to deepen our understanding of natural scene analysis.
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
C.F. Moss and A. Surlykke gratefully acknowledge the generous support of the Institute of Advanced Study in Berlin, which allowed them the time and space to write this review. We would also like to thank Wei Xian for assistance with data analysis and figure preparation. Melville Wohlgemuth contributed the top panel to Figure 1. Collection of experimental data reported in this manuscript was supported by NSF grant, “Active Sensing for Three-Dimensional Auditory Localization,” NIBIB grant, “CRCNS: Innovative technologies inspired by biosonar” to CFM, NIDCD P-30 grant in “Comparative and Evolutionary Biology of Hearing” (R. Dooling and A. Popper, Co-PIs) and the Danish Natural Science Research Council to AS. Discussions with Albert Bregman, Chen Chiu, Michael Lewicki, and Bruno Olshausen motivated some of the ideas presented in this review. We are grateful for comments from Shihab Shamma, Melville Wohlgemuth, James Simmons, and Khaleel Razak, who helped us to improve the manuscript.