Animals, including humans, use interaural time differences (ITDs) that arise from different sound path lengths to the two ears as a cue of horizontal sound source location. The nature of the neural code for ITD is still controversial. Current models differentiate between two population codes: either a map-like rate-place code of ITD along an array of neurons, consistent with a large body of data in the barn owl, or a population rate code, consistent with data from small mammals. Recently, it was proposed that these different codes reflect optimal coding strategies that depend on head size and sound frequency. The chicken makes an excellent test case of this proposal because its physical pre-requisites are similar to small mammals, yet it shares a more recent common ancestry with the owl. We show here that, like in the barn owl, the brainstem nucleus laminaris in mature chickens displayed the major features of a place code of ITD. ITD was topographically represented in the maximal responses of neurons along each isofrequency band, covering approximately the contralateral acoustic hemisphere. Furthermore, the represented ITD range appeared to change with frequency, consistent with a pressure gradient receiver mechanism in the avian middle ear. At very low frequencies, below 400 Hz, maximal neural responses were symmetrically distributed around zero ITD and it remained unclear whether there was a topographic representation. These findings do not agree with the above predictions for optimal coding and thus revive the discussion as to what determines the neural coding strategies for ITDs.
Accurate coding of temporal information has direct behavioral relevance for the computation of sound source location. Birds and mammals show exquisite sensitivity to interaural time differences (ITDs): when sound comes from one side of the body, it reaches one ear before the other. The brain uses these ITDs to compute sound location in the horizontal (azimuthal) plane (Konishi 2003; Yin 2002).
There is general agreement that the basic sensitivity for ITD and binaural correlation arises through a cross-correlation-like comparison of inputs to the two ears (Batra and Yin 2004; Joris and Yin 2007; Yin et al. 1987). The cross-correlator neurons act as coincidence detectors (reviews in Grothe 2003; Konishi 2003; Yin 2002). The coincidence detection is performed separately and in parallel in many narrowly tuned frequency channels. The sound waveform is encoded by phase-locked neural discharges in the auditory nerve, i.e. by a precise correlation between the phase of the stimulus and the firing of spikes. Coincidence detection between such inputs from each ear gives rise to a discharge pattern that varies cyclically as a function of interaural phase difference, showing a maximum when both inputs are in phase and a minimum when they are 180° out of phase. Thus, sensitivity to interaural phase differences (IPDs) is created. IPD is a relative measure of time and, knowing the stimulus period, can be translated into absolute ITD. In fact, within each narrowly tuned frequency channel, IPD and ITD are interchangeable. ITD is the physical cue to the azimuthal position of a sound source. A current controversy centers on the question of how the coding of a range of ITDs enables the nervous system to precisely localize sound sources along the azimuthal plane.
In principle, an array of coincidence detectors could be set up, situated along interdigitating or counter-current delay line inputs from each ear. In such a circuit, the delay lines introduce successively greater input delays to the coincidence detectors they contact serially. In consequence, each individual coincidence detector fires maximally at the phase difference between its inputs that exactly compensates for the conduction delay introduced at its place. Such a circuit, generating a place map of interaural phase difference at each frequency, is well known as the place-code model or Jeffress model, after Jeffress (1948). However, the task of ITD coding is affected by both head size and the ability to phase lock. The sharpness of ITD selectivity of the individual coincidence detectors increases for neurons with higher characteristic frequency because their temporal precision is greater. For example, the spikes of an auditory neuron phase-locking to a 5 kHz stimulus (with a period of 200 μs) show a temporal dispersion of about ±40 μs around the preferred phase; for a neuron phase-locking to 1 kHz (with a period of 1 ms) the temporal dispersion is typically ±100 μs (Köppl 1997). In coincidence detector neurons using such inputs, this results in correspondingly steeper slopes for the 5 kHz and shallower slopes for the 1 kHz ITD selectivity curves (Batra and Yin 2004). Animals with smaller heads that naturally experience a smaller ITD range therefore have less precise information available at equivalent frequencies than animals with larger heads.
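The delay-line arrangement described above can be illustrated with a minimal sketch. It assumes idealized counter-current delay lines with equal delay steps and a cosine-shaped ITD tuning of each detector; the function names, the number of detectors and the step size are hypothetical choices for illustration, not values from any data set.

```python
import math

def jeffress_best_itds(n_detectors, step_delay_us):
    """Best ITD (in µs, positive = contralateral leading) of each detector.

    Assumption: the ipsilateral axon enters at detector 0 and the
    contralateral axon at the opposite end, each adding `step_delay_us`
    of conduction delay per detector passed.
    """
    best = []
    for k in range(n_detectors):
        ipsi_delay = k * step_delay_us
        contra_delay = (n_detectors - 1 - k) * step_delay_us
        # The detector responds maximally when the acoustic lead of the
        # contralateral ear cancels the extra contralateral conduction delay.
        best.append(contra_delay - ipsi_delay)
    return best

def detector_response(itd_us, best_itd_us, freq_hz):
    """Normalized cyclic response (0..1) of one detector to a pure tone."""
    phase = 2 * math.pi * freq_hz * (itd_us - best_itd_us) * 1e-6
    return 0.5 * (1 + math.cos(phase))

if __name__ == "__main__":
    for place, best in enumerate(jeffress_best_itds(5, 50)):
        print(place, best, round(detector_response(0, best, 1000), 3))
```

Each place along the array thus carries its own best ITD, and the response at any place repeats with the stimulus period, which is why IPD and ITD are interchangeable within one frequency channel.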
Animals with smaller heads also do not have the option of simply using higher frequencies. As the above example illustrates, phase-locking is a process that demands increasing temporal precision in spike generation with increasing frequency. Due to the biophysical limitations of the cell membranes involved, phase-locking faces a clear upper frequency limit. For the auditory neurons providing the input to the coincidence detector circuits discussed here, this upper limit varies between 3 and 10 kHz in different species (review in Köppl 1997).
The basic problem of the interaction between head size and the frequency range available for creating the neural code of ITD was formalized in a model of IPD representation (Harper and McAlpine 2004). Assuming that the ITDs an animal naturally encounters should be coded with maximal accuracy, Harper and McAlpine (2004) argued that the neural representation of IPD within the population of the first binaural coincidence detectors should conform to either one of two distinct strategies, depending on head size and frequency range. One is a homogeneous distribution of the maxima of their selectivity curves (hereafter called best IPD), collectively covering the physiological ITD range of the animal within each frequency band. Although the model does not address the question as to how the distribution is achieved, such a distribution is consistent with the Jeffress model and an orderly representation of best IPDs along input delay lines. The second strategy of ITD coding is characterized by a non-homogeneous distribution of best IPD, with distinct subpopulations of neurons within each frequency band. The best IPDs of each population fall within a narrow range and often outside the physiological range of the animal. Instead of the maxima, the slopes of the IPD-selectivity curves cover the physiological range, and each slope covers most of this range. Various terms and variations have been suggested for this broad category of models in the past, summarized as Left–Right Count-Comparison models by Colburn and Kulkarni (2005). Here, the term two-channel model will be used, emphasizing the fact that all the coincidence detectors of each brainstem hemisphere together are believed to comprise one channel (or population). The relative excitation in the two channels from the two hemispheres is assumed to be read out as a correlate of ITD and thus as azimuthal sound source location (review in Palmer 2004).
Experimental evidence for both types of models of ITD coding exists. As has been reviewed by many authors (e.g. Konishi 2003), all of the characteristics of the Jeffress model appear fulfilled in the relevant brainstem nucleus (Nucleus laminaris, NL) of the barn owl, at least within the frequency range that has been extensively studied (above 3 kHz; Carr and Konishi 1990; Pena et al. 1996). Experimental data from the equivalent brainstem nucleus (medial superior olive, MSO) in the gerbil provide the clearest support for the two-channel model (Brand et al. 2002). In addition, a likely neural mechanism has been revealed in the gerbil for creating the unique distribution of best IPDs. It relies on additional phase-locked inhibitory inputs to the coincidence detector (MSO) neurons and does not require input delay lines (reviewed in Grothe 2003). However, data from different mammalian species are often ambiguous and their interpretation in support of the Jeffress model on the one hand or the two-channel model on the other is intensely controversial (recent summaries in Joris and Yin 2007; McAlpine 2005; Palmer 2004).
A virtue of the optimal coding scheme suggested by Harper and McAlpine (2004) is that it makes clear predictions for specific examples of head sizes and frequencies about which coding strategy should be optimal and thus allows for experimental testing. As a general rule, a Jeffress-like code and homogeneous representation of best IPDs is optimal at frequencies high enough so that the head’s ITD range exceeds ±0.5 cycles, while one or two channels with discrete populations of best IPD are optimal at frequencies below that. The barn owl and the gerbil were put forward as examples where experimental data clearly fit those predictions; however, this has recently been challenged for the low-frequency range of the owl (Wagner et al. 2007).
The key prediction of Jeffress’ model, a topographic map of best ITD in the MSO or NL, has not been experimentally addressed recently. In 1990, Carr and Konishi used physiological and anatomical techniques to show that axonal delay lines form maps of ITD in the NL of the barn owl. In the cat, two studies provided anatomical evidence for axonal delay lines in the contralateral afferents (Beckius et al. 1999; Smith et al. 1993), while Yin and Chan (1990) showed a correlation between best delay and rostrocaudal position in the MSO. However, the owl has been challenged as a highly specialized and potentially untypical case (e.g. McAlpine 2005) and the evidence in the cat is not conclusive (Joris and Yin 2007).
We have therefore examined this key prediction in the chicken, an unspecialized bird with a small range of physiological ITDs (Hyson et al. 1994) and a relatively low range of frequencies of phase-locking (Salvi et al. 1992), both similar to the values in the gerbil. Harper and McAlpine’s (2004) optimal coding scheme predicts ITD coding in discrete channels for frequencies up to 3 kHz, i.e., up to the limit of phase-locking. However, anatomical studies show that the chicken Nucleus magnocellularis (NM) projects in a delay-line pattern to NL (Parks and Rubel 1975; Young and Rubel 1983) and appropriate conduction delays have been measured in brain-slice preparations of this circuit (Overholt et al. 1992). This suggests a map-like representation of a range of IPDs, inconsistent with the prediction of a uniform population of neurons on each side of the brainstem. However, it is unknown whether those delay lines determine the responses of NL neurons in the mature chicken in vivo and if so, what range of IPDs they cover. We have carried out in vivo recordings of NL activity, combined with histological verification of recording sites. We show that the NL contains a systematic, gradual representation of the animal’s ITD range. This and a host of monaural and binaural response properties investigated are entirely consistent with the Jeffress model.
Experiments were carried out on 22 chickens aged between 17 and 41 days after hatching. Animal husbandry and experimental protocols were approved by the Regierung von Oberbayern, Germany (AZ 209.1/211-2531-56/04) for a first series of experiments and by the University of Sydney, NSW, Australia (Animal Ethics Committee Approval No. K03/1-2007/3/4526) for a subsequent series.
Anesthesia was induced by intramuscular injections of 20 mg/kg ketamine hydrochloride (Ketavet by Pharmacia GmbH, Erlangen, Germany or Ketamine by Parnell Laboratories, Alexandria, NSW, Australia) and 3 mg/kg xylazine (Rompun by Bayer Vital GmbH, Leverkusen, Germany or Ilium Xylazil-20 by Troy Laboratories, Smithfield, NSW, Australia) and maintained with supplementary doses as necessary until switching to isoflurane (see below). In addition, a subset of animals received approximately 20 mg/kg metamizol-sodium (Vetalgin by Intervet GmbH, Unterschleissheim, Germany) every 3–4 h, as required by German authorities. Body temperature was maintained at 41°C by a heating blanket wrapped around the animal and feedback-controlled by a cloacal temperature probe. An EKG recording via needle electrodes placed in the muscles of the right wing and left leg was constantly monitored. The trachea was cut and intubated. After opening the abdominal air sac just below the ribs, a constant, humidified gas flow of 150–400 ml/min (approximately 1 ml/g body weight) was connected to the tracheal tube. Spontaneous breathing ceased under these conditions. The gas was either carbogen or pure oxygen, mixed with 0.8–1.5% isoflurane. The head was held in a constant position and the skull was opened to expose the cerebellum. The medial sinus was ligated, and most of the cerebellum aspirated to expose the dorsal surface of the brainstem.
Thin-walled glass microelectrodes were filled with 5% neurobiotin in 2 M K-acetate, positioned above the relevant brainstem area under visual control and then advanced remotely with a piezo device (Inchworm 700, Burleigh, Fishers, NY). Responses to acoustic stimuli were monitored continuously until we were confident that the electrode was within the cellular layer of NL. Responses were amplified (Intra 767, World Precision Instruments, Sarasota); the amplified signal was usually high-pass filtered at 300 Hz (module PC1, Tucker-Davis Technologies (TDT), Alachua), except for the extreme low-frequency recordings, and fed in parallel to an A/D converter (TDT DD1) and a threshold discriminator (TDT SD1) with subsequent event counter (TDT ET1). As single-unit spike recordings could only rarely be achieved, most of the recordings were of the neurophonic potential, a sinusoidal evoked potential reflecting the frequency of a pure-tone stimulus. For neurophonic recordings, the TTL trigger threshold was subjectively adjusted for optimal signal-to-noise ratio. Both the analog and the TTL signal could be stored by custom-written software (xdphys, California Institute of Technology).
At selected recording sites, neurobiotin was deposited iontophoretically, usually by passing 250 nA of positive direct current for 10 min.
Chickens were placed in a sound-attenuating chamber for all measurements. Closed, custom-made sound systems, containing small earphones (Sony MDR-E818LP) and miniature microphones (Knowles EM 3068), were placed at the entrance of both ear canals, but not tightly sealed. The sound systems were calibrated individually for both amplitude and phase before the recordings.
Stimuli were generated separately for the two ears by custom-written software (xdphys, Caltech), using a TDT AP2 signal processing board. Both channels were then fed to the earphones via D/A converters (TDT DD1), anti-aliasing filters (TDT FT6-2) and attenuators (TDT PA4). Stimuli were tone bursts of 50 ms duration (including 5 ms linear ramps), presented at a rate of 5 s⁻¹, or clicks, presented at a rate of 10 s⁻¹.
Monaural frequency-versus-level responses for both ipsi- and contralateral stimulation were recorded first, by presenting tones from a matrix of frequencies and sound pressure levels in random sequence, repeated three times. Monaural tuning curves were derived from these as described in Köppl and Carr (2003), using the recorded TTL signal in all cases. The means of the ipsi- and contralateral CFs and thresholds were taken as the CF and threshold of the recording site.
Responses to 500 repetitions of monaurally presented clicks were recorded. For single-unit recordings, a peristimulus-time histogram (PSTH) with a bin width of 0.02 ms was calculated, using the TTL signal. Latency was defined as the earlier of the first two consecutive bins exceeding the tallest bin in a 10 ms interval preceding the stimulus. For neurophonic recordings, the averaged analog response waveform was analyzed as described in Wagner et al. (2005) for NL neurophonic data from the barn owl. Briefly, the waveform was high-pass filtered to exclude components below the CF and subsequently fitted with a gammatone function. This type of analysis was also well applicable to chicken neurophonic click responses, provided the cut-off frequencies of the filter functions were adjusted to the lower frequency range of the chicken. Fitted waveforms of ipsi- and contralateral responses were then superimposed. The median difference between 2 and 4 consecutive maxima and minima was taken as the difference in response latency, with positive values indicating contralateral leading.
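The single-unit latency criterion described above can be sketched as follows. This is a minimal illustration, assuming spike times are pooled over all repetitions and expressed in ms relative to click onset; the function name, the argument names and the 20 ms post-stimulus search window are hypothetical and not taken from the original analysis software.

```python
import math

def click_latency_ms(spike_times_ms, bin_ms=0.02, baseline_ms=10.0, window_ms=20.0):
    """Latency of a click response from a PSTH, or None if no bin pair qualifies.

    The criterion is the tallest bin within `baseline_ms` before the click.
    Latency is the start of the first post-click bin that, together with the
    bin following it, exceeds this criterion.
    """
    n_pre = round(baseline_ms / bin_ms)
    n_post = round(window_ms / bin_ms)   # assumed search window after the click
    counts = [0] * (n_pre + n_post)
    for t in spike_times_ms:
        idx = math.floor((t + baseline_ms) / bin_ms)
        if 0 <= idx < len(counts):
            counts[idx] += 1
    criterion = max(counts[:n_pre])
    for i in range(n_pre, len(counts) - 1):
        if counts[i] > criterion and counts[i + 1] > criterion:
            return (i - n_pre) * bin_ms
    return None
```

Requiring two consecutive bins above the baseline maximum makes the estimate robust against a single chance coincidence of spontaneous spikes.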
Responses to 100 repetitions of monaurally presented tones at a frequency close to the CF, and a level of 40–60 dB SPL, corresponding to an average of 16 dB above threshold, were recorded. Using the TTL signal in all cases, mean phase and vector strength (VS) were derived from these according to Goldberg and Brown (1969). Only VS values with a significance level of 0.01 or below were accepted. The difference between the mean phases for ipsi- and contralateral stimulation was then calculated as a predictor of best IPD, using the click responses as a guideline as to which side was leading.
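As an illustration of the phase-locking measures, the following sketch computes vector strength and mean phase from spike times. The significance estimate uses the common Rayleigh approximation p ≈ exp(−n·VS²); that the original analysis used exactly this test is an assumption, and the function and argument names are hypothetical.

```python
import math

def vector_strength(spike_times_s, freq_hz):
    """Return (VS, mean phase in cycles, approximate Rayleigh p-value)."""
    n = len(spike_times_s)
    # Phase of each spike within the stimulus cycle, in radians.
    phases = [2 * math.pi * ((t * freq_hz) % 1.0) for t in spike_times_s]
    c = sum(math.cos(p) for p in phases) / n
    s = sum(math.sin(p) for p in phases) / n
    vs = math.hypot(c, s)                        # length of the mean vector
    mean_phase = (math.atan2(s, c) / (2 * math.pi)) % 1.0
    p_value = math.exp(-n * vs * vs)             # Rayleigh approximation (assumed test)
    return vs, mean_phase, p_value
```

A VS of 1 means all spikes fall at the same stimulus phase, whereas a VS near 0 means spikes are spread evenly over the cycle; with this approximation, a site would be accepted when the returned p-value is 0.01 or below.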
Sensitivity for ITD was tested with tones presented binaurally with various time disparities. ITD was usually varied within ±1 stimulus period, in steps no larger than one-tenth of the period. Stimulus level was the same as for monaural phase responses (40–60 dB SPL or, on average, 16 dB above monaural thresholds). Usually 10 stimulus repetitions were presented at each ITD. As a rule, for single units, the TTL signal, i.e., spike rate, was used for further analysis; for neurophonic recordings, the amplitude of the analog waveform was used. The only exceptions to this were data from the earliest experiments (9 of 44 neurophonic sites), where the analog signal was not saved. For these neurophonic data, TTL counts exceeding a subjectively set threshold were also used. We found later, comparing both types of analysis for neurophonic recordings, that the results did not differ systematically, but that the neurophonic amplitude provided a better signal-to-noise ratio. The frequencies at which ITD sensitivity was tested always included the CF previously determined from monaural tuning curves. Because of the well-known sharpness of tuning in the lower auditory centres of birds, the range of frequencies over which responses could be obtained was limited. We developed empirical criteria for the acceptance or rejection of data at particular frequencies for further analysis. For single units, the mean spike rate and standard deviation were determined for each ITD and an index of modulation derived by calculating the difference between minimal and maximal mean rate and dividing it by the maximal standard deviation observed. Data were discarded if this index was below 1.5. For neurophonic recordings, the averaged analog response waveform at each ITD was fitted with a cosine function at the stimulus frequency. The amplitude of this fit, $A$, was then divided by $\sqrt{2}$ times the standard deviation, $\sigma$, of the averaged waveform:

$$\text{index} = \frac{A}{\sqrt{2}\,\sigma}$$
The value of this index is 1 if the waveform is identical to the fitted cosine and becomes zero if the waveform contains no stimulus frequency component. Data were discarded if this index remained below 0.7 at all ITDs tested. Acceptable data according to these criteria usually fell within a range of 0.2–0.5 octaves around CF (median 0.33 octaves). The neurophonic amplitudes or, for single units, the spike rate, as a function of ITD were then fitted with a cosine function at the respective stimulus frequency (Viete et al. 1997) to determine best IPD, defined as the peak closest to zero IPD. In cases where the minimum fell close to zero IPD and it was thus ambiguous which peak defined the best IPD, click responses and the CD (see next) were used to resolve the laterality. Finally, a linear regression of best IPD as a function of frequency was calculated, the slope of which corresponds to the CD and the y-intercept to the CP (Yin and Kuwada 1983).
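The acceptance criteria and the cosine fit described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the cosine at a known frequency is fitted by ordinary least squares, the waveform standard deviation is taken as the population value, and all function names are hypothetical rather than those of the original analysis software.

```python
import math
import statistics

def solve3(m, v):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(col + 1, 3):
            f = a[r][col] / a[col][col]
            for k in range(col, 4):
                a[r][k] -= f * a[col][k]
    x = [0.0] * 3
    for r in (2, 1, 0):
        x[r] = (a[r][3] - sum(a[r][k] * x[k] for k in range(r + 1, 3))) / a[r][r]
    return x

def fit_cosine(x_cycles, y):
    """Least-squares fit of y = offset + A*cos(2*pi*(x - x0)), x in stimulus cycles.

    Returns (offset, A, x0) with x0 in (-0.5, 0.5], i.e. the peak closest to zero.
    """
    basis = [[1.0, math.cos(2 * math.pi * x), math.sin(2 * math.pi * x)] for x in x_cycles]
    m = [[sum(b[i] * b[j] for b in basis) for j in range(3)] for i in range(3)]
    v = [sum(b[i] * yi for b, yi in zip(basis, y)) for i in range(3)]
    offset, a, b = solve3(m, v)
    return offset, math.hypot(a, b), math.atan2(b, a) / (2 * math.pi)

def neurophonic_index(fit_amplitude, waveform):
    """Fitted amplitude divided by sqrt(2) times the waveform's standard deviation."""
    return fit_amplitude / (math.sqrt(2) * statistics.pstdev(waveform))

def modulation_index(rates_per_itd):
    """(max mean rate - min mean rate) / largest standard deviation across ITDs."""
    means = [statistics.mean(r) for r in rates_per_itd]
    sds = [statistics.stdev(r) for r in rates_per_itd]
    return (max(means) - min(means)) / max(sds)
```

Applied to an ITD function, `fit_cosine` takes ITD multiplied by stimulus frequency as `x_cycles` and returns the best IPD as `x0`; applied to an averaged neurophonic waveform, it takes time multiplied by stimulus frequency and its amplitude feeds `neurophonic_index`.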
Chickens were fixed by cardiovascular perfusion with 4% buffered paraformaldehyde, the brains were extracted, cryo-protected by infiltration with 30% sucrose and cross-sectioned on a cryostat. Neurobiotin was visualized using standard ABC (Vector Laboratories, Burlingame, CA) and diaminobenzidine protocols on floating sections. Finally, the sections were serially mounted and counterstained with cresyl violet.
All sections containing NL were identified and the position of NL’s medial edge relative to the midline was measured in each. The linear extent of the nucleus was then measured along the neuron chain, regardless of its orientation within the section, as well as the position of any neurobiotin label along that dimension. These data were then used to construct a flat surface view of NL and determine the position of label in normalized coordinates, with reference to the total mediolateral and rostrocaudal extent of NL. Note that the mediolateral dimension in this scheme represents an artificially flattened view of NL and is not identical to the brain’s mediolateral axis. This is different to the surface projection of NL used by Rubel and Parks (1975). All measurements were carried out with the use of image analysis software (AnalySIS by Soft Imaging Software, Münster, Germany).
We report a total of 43 neurophonic recordings, 3 extracellular multi-unit recordings, 14 extracellular single-unit spike recordings and 4 intracellular recordings from the NL of the chicken in vivo. Thirty-four of these recording sites were histologically located within the cellular layer of NL by neurobiotin labeling. The neurophonic is a sinusoidal evoked potential reflecting the frequency of the pure-tone stimulus (Schwarz 1992; Sullivan and Konishi 1986). It is more easily and stably recorded than single units, but its precise origin has not been explored. We suggest that it predominantly originates from the NL neurons, for the following reasons. In the chicken, the maximal amplitude of the neurophonic potential is very localized and falls sharply with distance from the cellular monolayer of NL (Schwarz 1992). We were able to confirm this well-localized nature. Using high-impedance electrodes (typically between 10 and 25 MΩ), the location of maximal neurophonic amplitude could usually be judged to within 50 μm by audiovisual criteria, and care was taken to position the electrode at this maximum before recording and subsequent iontophoresis of neurobiotin. In addition, we observed that the neurophonic thresholds to ipsi- and contralateral stimulation were most similar at that point and provided another useful criterion. In many cases, 1–3 labeled cell bodies were later seen in the histological sections, confirming that the electrode was in the neuron layer. In the rare cases of intracellular recordings (indicated by a sudden jump of the recorded DC potential to between −20 and −50 mV) the response appeared like a magnified neurophonic with either small or no spikes superimposed on the sinusoidal waveform. For direct comparison, we have a brief intracellular recording from an NL neuron, obtained within 25 μm of a neurophonic recording with the same electrode. 
In addition, there is one case of an extracellular spike recording and, after the loss of spikes, the corresponding neurophonic recording at the same site. In both cases, the ITD selectivities of the neurophonic potential and the corresponding single unit were very similar (Fig. 1).
Characteristic frequencies (CF) ranged from 80 to 3,500 Hz. Figure 2 shows the distributions of mean CFs and thresholds, as well as the differences between the monaurally determined measures for each recording site. Most of our data were obtained between 1 and 2.5 kHz CF, as this region of NL was most easily accessible. There was no systematic mismatch between the monaural CFs or the monaural thresholds of a particular recording site. Median differences were 0 Hz for the paired CFs (interquartile range −75 to +100 Hz) and 1 dB for the thresholds (interquartile range −3.5 to +5 dB). Wilcoxon tests showed that there were no significant differences between the paired samples (p = 0.84 for CFs, p = 0.65 for thresholds).
Monaural click responses of neurophonic recordings usually showed a clear oscillatory component of a frequency close to CF. Comparing the responses to ipsi- and contralateral clicks provided an unambiguous measure of laterality and a prediction for best ITD (Fig. 3a). In a minority of cases (7 neurophonic recordings), the response waveforms could not be unambiguously matched because of significant differences in shape; these were excluded from further analysis. For multi- or single-unit recordings, PSTHs of spike responses to monaural click stimuli were used (Fig. 3b, c). With one exception, all recording sites analysed (n = 41) showed either an ipsilaterally leading click response or equal response latencies to both sides. Ipsilateral lead times ranged up to 1,020 μs, but mostly fell below 400 μs (see also Fig. 12). The one exception was a low-frequency single unit (CF 138 Hz), responding 160 μs earlier to the contralateral click.
All recording sites tested (n = 53) displayed significant phase locking to monaural pure-tone stimuli near CF. Vector strengths for ipsi- and contralateral stimulation were generally similar (no median difference) and decreased with increasing frequency (Fig. 4). The difference between the preferred response phases to ipsi- and contralateral stimulation was used to predict the preferred ITD (see below), using the click responses and/or CD to resolve phase ambiguity.
All NL recordings showed sensitivity for ITD, in the form of cyclic changes of neurophonic amplitude or of spike rate with variations in ITD. The cycle period corresponded to the period of the stimulus, which was chosen to be close to the CF. Best IPDs ranged from −0.57 to +0.58 cycles (Fig. 5b; median +0.131; positive values indicating contralateral leading, negative values indicating ipsilateral leading). Note that best IPDs beyond ±0.5 cycles could and did occur because monaural click responses and/or the CD were used as the ultimate indicators of laterality. For example, if the peak of the IPD function closest to zero fell at −0.445 cycles, but the click responses indicated a shorter delay for the ipsilateral response, the best IPD occurred at a contralateral-leading stimulus, i.e. +0.555 cycles in this example. Best IPDs corresponded to best ITDs from −770 to +834 μs (Fig. 5a; median +90 μs). There was no systematic change of best IPD or best ITD with CF (Fig. 6a, b). Best ITD (but not best IPD) values showed an increasing range of scatter with decreasing CF.
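The conversion between best IPD and best ITD, including the laterality correction in the example above, can be sketched as follows. The function name and arguments are hypothetical, and the 2 kHz stimulus frequency in the usage note is an arbitrary illustration; the sketch is meant only for sites where the laterality of the peak is ambiguous and an independent indicator (click responses and/or CD) is available.

```python
def resolve_best_ipd(peak_ipd_cycles, freq_hz, contra_leading):
    """Return (best IPD in cycles, best ITD in µs) after fixing laterality.

    peak_ipd_cycles: fitted peak closest to zero IPD, in (-0.5, 0.5]
    contra_leading:  True if the independent indicators point to a
                     contralateral-leading best ITD, False for ipsilateral
    """
    ipd = peak_ipd_cycles
    if contra_leading and ipd < 0:
        ipd += 1.0          # move to the next peak on the contralateral side
    elif not contra_leading and ipd > 0:
        ipd -= 1.0          # move to the next peak on the ipsilateral side
    itd_us = ipd / freq_hz * 1e6   # one cycle lasts 1/f seconds
    return ipd, itd_us
```

For a peak at −0.445 cycles with contralateral-leading laterality, this returns +0.555 cycles, which at an assumed stimulus frequency of 2 kHz corresponds to a best ITD of 277.5 μs.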
CP and CD were determined according to the methods of Yin and Kuwada (1983). All recording sites analyzed conformed to their linearity criteria. Figures 7, 8 and 9 illustrate three examples, a single unit with CF 450 Hz and two neurophonic recordings with CFs at 1.75 and 2.25 kHz, respectively. CPs were expressed on a scale from −0.5 to +0.5 cycles and covered nearly that whole range (−0.49 to +0.45, n = 48). Their distribution was clearly not uniform, with most values (31 of 48 or 68%) falling within ±0.2 (Fig. 10a). The median CP was 0.053. This means that in the majority of cases, the CD fell near a peak in the ITD responses, as for the examples shown in Figs. 7 and 8. Only in a minority of cases did the CD fall closer to a minimum in the ITD responses, as for the example shown in Fig. 9.
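The regression underlying CD and CP can be illustrated with a short sketch. It assumes an unweighted ordinary least-squares fit of best IPD (in cycles) against stimulus frequency (in Hz); any weighting or linearity testing used in the original method is not reproduced here, and the function name is hypothetical.

```python
def characteristic_delay_and_phase(freqs_hz, best_ipds_cycles):
    """Return (CD in µs, CP in cycles wrapped to the interval [-0.5, 0.5))."""
    n = len(freqs_hz)
    mean_f = sum(freqs_hz) / n
    mean_p = sum(best_ipds_cycles) / n
    sxx = sum((f - mean_f) ** 2 for f in freqs_hz)
    sxy = sum((f - mean_f) * (p - mean_p) for f, p in zip(freqs_hz, best_ipds_cycles))
    slope = sxy / sxx                      # cycles per Hz, i.e. seconds
    intercept = mean_p - slope * mean_f    # phase at zero frequency
    cp = (intercept + 0.5) % 1.0 - 0.5
    return slope * 1e6, cp
```

A CP near zero means the regression line passes through the origin, so the CD coincides with a peak of the ITD functions at all frequencies tested; a CP near ±0.5 means it coincides with a minimum.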
If coincidence detection between inputs from both sides were the main determinant of the binaural sensitivity for ITD, then the monaural responses should predict the best ITD. The difference in delay between ipsi- and contralateral click responses was indeed inversely correlated with best ITD for CFs above 0.4 kHz (Fig. 12a, Spearman rank correlation, ρ = −0.68, p < 0.001, n = 35). Similarly, monaural phase responses predicted best ITD very well at CFs above 0.4 kHz (Fig. 12b, Spearman rank correlation, ρ = −0.84, p < 0.001, n = 40). However, these correlations did not hold for CFs below 0.4 kHz, where the data scattered considerably more. Here, the difference between monaural click responses did not necessarily agree with the difference between monaural phase responses and neither systematically predicted the best ITD (Fig. 12a, b).
We successfully labeled 34 recording sites, comprising 4 single units, 28 neurophonic and 2 multi-unit extracellular spike recordings. Two examples are shown in Fig. 13. The chicken NL is tonotopically organized, with the lowest frequencies represented caudolaterally and the highest rostromedially (Rubel and Parks 1975). Accordingly, isofrequency bands run from caudomedial to rostrolateral. Our recording sites covered the full extent of this axis, i.e. labeled sites were found from the medial to the lateral extremes of NL. Best ITD was systematically related to position along the isofrequency axis. This is perhaps most strikingly illustrated by one case where three sites were recorded and labeled along the 1.3 kHz band in an individual NL (Fig. 14). A further seven pairs of recording sites with similar CF from an individual NL where one or both were successfully labeled, and two pairs without label, showed the same trend. Without exception, the best ITD changed towards increasingly contralateral values when moving rostrolaterally within NL. A Wilcoxon test confirmed that this change was highly significant over all pairs (p = 0.001, one-tailed).
In order to normalize and pool the positions of labeled sites across animals we exploited the fact that labeled arbors of NM axons were often seen emanating from the injection site, outlining parts or all of the corresponding isofrequency band. This additional information was used to define the angle of the respective isofrequency band in a reconstructed surface view of the individual nucleus and derive the position of recording sites along these bands (Fig. 15a). In cases without axonal label, the median angle of isofrequency bands (30°), determined from all experiments, was assumed.
There was a highly significant correlation of anatomical position along the isofrequency axis and all three parameters of preferred interaural timing, the best IPD, the best ITD and the CD (Fig. 15b–d). Values close to zero were represented near the caudomedial edge and values corresponding to sounds in the contralateral hemisphere occurred increasingly rostrolaterally. Linear regressions indicated a mapped range for best IPD of 0.63 cycles (−0.07 to +0.56, Fig. 15b) and for CD of 386 μs (−100 to +285, Fig. 15d). Expressed as best ITD (Fig. 15c), the maps furthermore appeared to differ with frequency. Regressions carried out separately for different frequency ranges suggested a mapped range of 518 μs (−94 to +425) for 0.8 to 1.6 kHz, but only 274 μs (+5 to +269) above 1.6 kHz; there were only three data points for the CF-range 0.4–0.8 kHz, which fell along a line covering 915 μs (shown dashed in Fig. 15c). However, these differences in the mapped ITD range remain tentative, as an analysis of covariance for differences in the regression slopes between the frequency ranges did not support them as significant (p = 0.15). A regression over all data of best ITD showed a range of 436 μs (−82 to +354). It remained unclear whether there is a systematic map at all at very low frequencies, below 400 Hz. We have only three labeled sites for this frequency range, all of which were located at similar relative positions along the isofrequency band and scattered widely in their best ITDs.
The data shown here for the NL of the chicken are among the most comprehensive sets of in vivo recordings from the NL and its mammalian analog, the MSO. Although our sample of single-unit recordings appears small, it is well known that such recordings from the NL and the MSO are difficult to achieve in vivo (e.g. Guinan et al. 1972; Konishi 2003). This is probably due to an unusually small and variable amplitude of spikes in the mature somata of these neurons (Ashida et al. 2007b; Kuba et al. 2006; Scott et al. 2005, 2007). There is thus a crucial difference between recording well-isolated spikes and recording from the cell bodies. In order to achieve simultaneous electrophysiological characterization and histological verification of recording sites within NL, we consistently aimed for the cell body layer, which would have reduced our chances to obtain good single-unit recordings. The majority of recordings reported here are of the neurophonic potential. Intracellular records suggested that the neurophonic potential originates within the NL cells, similar to what has recently been reported for the NL of the barn owl (Ashida et al. 2007a). Furthermore, the neurophonic potential in the chicken is very well localized to the unique cellular monolayer of NL, suggesting its origin in the cells (Schwarz 1992). Finally, neurophonic responses and closely neighboring intracellular or spike responses were very similar. We thus suggest that the neurophonic is a valid reflection of the responses of NL neurons.
Since this is the first extensive characterization of binaural responses from the chicken NL in vivo, we will briefly discuss how the data relate to the well-established concept of coincidence detection. We will then focus on the discussion as to which of the current models of ITD coding is most consistent with the chicken data. This can be broken down into several questions which will be addressed separately: (a) whether there is a systematic representation of ITD, (b) how the neural best ITDs are distributed across the total range found and (c) how the range of neural best ITDs compares to the natural ITD range of the animal. Finally, we will briefly address the implications of our findings for the evolution of ITD coding.
NL neurons are excited by monaural stimulation of either ear and, when binaurally stimulated, are sensitive to changes in IPD (e.g., Carr and Konishi 1990). The present data confirmed this for the chicken in vivo. Coincidence detection between the ipsi- and contralateral inputs is thought to underlie the sensitivity to IPD in both the avian NL and mammalian MSO (e.g. Grothe et al. 2004). A prerequisite for coincidence detection is phase-locking to monaural stimulation, which was also confirmed for the chicken NL throughout its CF range.
A crucial test that is commonly employed for in vivo data is that the timing of the monaural responses should predict the ITD of maximal binaural response (Batra and Yin 2004; Carr and Konishi 1990; Goldberg and Brown 1969; Yin and Chan 1990). In the chicken, monaural click and phase responses both predicted best ITD very well, in agreement with coincidence detection. The large scatter observed at the lowest frequencies, below 400 Hz, could be partly due to pronounced interaural canal effects (discussed below) which would have led to deviations of the effective stimuli from what was acoustically presented.
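The logic of this test can be sketched in a few lines. The example assumes that monaural response phases are given in cycles as lags relative to the stimulus, that positive ITD means contralateral leading, and that the prediction is wrapped to within ±0.5 cycles; these conventions and the function name `predicted_best_itd` are assumptions made for illustration and are not taken from the study's analysis code.

```python
def predicted_best_itd(phase_contra, phase_ipsi, freq_hz):
    """Predict best IPD (cycles) and best ITD (microseconds) from monaural phases.

    If the contralateral input lags the ipsilateral input by some fraction of a
    cycle, the inputs coincide when the contralateral sound leads by that same
    fraction. The difference is wrapped into the interval [-0.5, 0.5) cycles.
    """
    ipd = (phase_contra - phase_ipsi + 0.5) % 1.0 - 0.5
    itd_us = ipd / freq_hz * 1e6
    return ipd, itd_us

# Hypothetical example: contralateral response lags by 0.15 cycles at 1 kHz
ipd, itd_us = predicted_best_itd(phase_contra=0.60, phase_ipsi=0.45, freq_hz=1000.0)
print(f"predicted best IPD = {ipd:.2f} cycles, best ITD = {itd_us:.0f} us")
```

Because the prediction from phase alone is ambiguous by whole cycles, an independent measure such as the click response or the CD is needed to decide which response peak is the correct one.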
CP should theoretically be zero if coincidence detection between excitatory inputs underlies ITD sensitivity (Yin and Kuwada 1983). Neural values for CP in the MSO indeed always cluster near zero (Batra et al. 1997; Spitzer and Semple 1995; Yin and Chan 1990), as they did in the present study. However, a substantial and as yet unexplained spread is also typical (review in Batra et al. 1997). The variation seen in the chicken was no exception and is thus considered in agreement with coincidence detection.
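In the framework of Yin and Kuwada (1983), CD and CP are the slope and the intercept of a straight line fitted to best IPD as a function of stimulus frequency. The sketch below illustrates that relationship under the assumption that the phases have already been unwrapped; the function name and the data values are hypothetical and do not reproduce the fitting procedure used in the study.

```python
import numpy as np

def characteristic_delay_and_phase(freq_hz, best_ipd_cycles):
    """Fit best IPD (cycles) against frequency (Hz).

    The slope has units of cycles per Hz, i.e. seconds, and is the
    characteristic delay (CD). The intercept at 0 Hz is the characteristic
    phase (CP) in cycles.
    """
    slope, intercept = np.polyfit(freq_hz, best_ipd_cycles, 1)
    cd_us = slope * 1e6
    cp_cycles = intercept
    return cd_us, cp_cycles

# Synthetic, noise-free example: CD = 100 us, CP = 0.05 cycles
freq_hz = np.array([800.0, 1000.0, 1200.0, 1400.0])
best_ipd = 0.05 + 100e-6 * freq_hz

cd_us, cp = characteristic_delay_and_phase(freq_hz, best_ipd)
print(f"CD = {cd_us:.1f} us, CP = {cp:.3f} cycles")
```

A CP near zero means that the ITD curves measured at different frequencies share a common peak, which is the expectation for coincidence detection between excitatory inputs.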
A final interesting point is that the monaural CFs were, on average, perfectly matched in the chicken, as they are in the NL of the barn owl (Pena et al. 2001). This does not support the stereausis model for NL, which postulates a systematic mismatch in the cochlear locations of origin, and thus CF, between the inputs from both sides as a source of delay (Shamma et al. 1989).
The data reported here for the chicken verify a key prediction of the Jeffress model, a topographic representation of a range of best ITD. We found a systematic change of best ITD along the isofrequency axis. This was shown both for individual NL, by multiple recordings within the same isofrequency band, and for recording sites pooled across animals. The representation was restricted to contralateral auditory space, from near zero ITD caudomedially to increasingly contralateral-leading ITDs rostrolaterally. This direction of representation agrees with the anatomical orientation of axonal delay lines in the contralateral inputs (Young and Rubel 1983) and physiological delay lines in vitro (Overholt et al. 1992). Axonal delay lines are the second key element of the Jeffress model.
Very few attempts have been made to date to experimentally test for such maps of ITD at the level of the NL or MSO. The best documented case is the barn owl where a representation of contralateral space was also found (Carr and Konishi 1990; Sullivan and Konishi 1986). There are several interesting differences between the chicken and the owl NL, not least the different anatomical orientation of the ITD maps. However, this is convincingly explained by the hyperplasia of the owl’s NL and its specialisations for high-frequency processing (reviews in Grothe et al. 2004; Kubke et al. 2004). A recent in vitro study on the emu’s NL showed physiological delay lines along the same anatomical axis and in the same direction as in the chicken (MacLeod et al. 2006), supporting the hypothesis that this is the plesiomorphic pattern in birds. We may thus assume that axonal delay lines and maps of ITD, the two key elements of the Jeffress model, are a typical feature of the avian NL. The only mammal where the MSO has been probed for a topographical representation of ITD is the cat. A representation of contralateral space along the rostrocaudal dimension of the nucleus was suggested and is in agreement with the direction of reported axonal delay lines in the inputs. However, the evidence is still tentative and controversial (recent review in Joris and Yin 2007).
Maps of ITD and axonal delay lines are clear evidence in support of the Jeffress model. We believe they also argue against the alternative two-channel model of ITD coding. A central tenet of the latter is a concentration of best IPDs in each frequency band around a uniform value (McAlpine et al. 2001). A certain range of random scatter around the average IPD value might be expected in a biological system; however, a topographic representation of that range is not required, indeed should not exist, if natural selection favored a convergence of best IPD values toward a uniform value. By definition, selection toward a uniform value would select against tuning to different values of IPD and, in consequence, against the formation of a systematic representation of IPD. Parsimony suggests that such maps would not exist without selective pressure to maintain them.
A distribution in the strictest sense of the Jeffress model should cover the (contralateral) range of natural ITDs, although not necessarily homogeneously (Jeffress et al. 1956). Best IPDs should then show a widening range with increasing CF. In contrast, a distribution in the strictest sense of the two-channel model should be focussed on one particular IPD value across frequencies, typically 45 degrees (McAlpine 2005; McAlpine et al. 2001) and, consequently, will show a regular decrease in best ITD with increasing CF. Real neural distributions usually do not obviously conform to either and their interpretation varies widely, even for comparable sets of data from the same species (Hancock and Delgutte 2004; Yin and Chan 1990). In addition, often the relevant MSO or NL data are not available and inferences have to be made from recordings in their target areas, usually in the midbrain IC. We tend to agree with the recent summary by Joris and Yin (2007) who concluded that, with the exception of the gerbil MSO, all published mammalian MSO and IC data display rather broad distributions which appear incompatible with the narrow predictions of the two-channel model. A constant value of best IPD across frequencies has been suggested (Brand et al. 2002; Hancock and Delgutte 2004; McAlpine et al. 2001), but its significance in the face of substantial scatter remains controversial. For best ITD, what is typically observed is a larger spread of values at lower frequencies (Hancock and Delgutte 2004; Joris et al. 2006; McAlpine et al. 2001) which, of course, corresponds to an increase of average best ITD but is scant evidence for a real relationship between best ITD and frequency.
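The two strict predictions can be made concrete with the relation best ITD = best IPD / CF. The sketch below is illustrative only: the 45 degree value (0.125 cycles) is the one quoted above for the two-channel model, whereas the maximal ITD of 200 μs used for the Jeffress case is a hypothetical placeholder, as are the function names.

```python
def two_channel_best_itd_us(cf_hz, best_ipd_cycles=0.125):
    """Best ITD (microseconds) implied by a fixed best IPD; 0.125 cycles = 45 degrees."""
    return best_ipd_cycles / cf_hz * 1e6

def jeffress_max_best_ipd_cycles(cf_hz, max_itd_us):
    """Largest best IPD (cycles) needed to cover a fixed ITD range at a given CF."""
    return max_itd_us * 1e-6 * cf_hz

for cf in (500.0, 1000.0, 2000.0):
    itd = two_channel_best_itd_us(cf)
    ipd = jeffress_max_best_ipd_cycles(cf, max_itd_us=200.0)  # 200 us is a placeholder
    print(f"CF {cf:6.0f} Hz: two-channel best ITD = {itd:6.1f} us, "
          f"Jeffress max best IPD = {ipd:.2f} cycles")
```

Under these assumptions the two-channel best ITD halves with every doubling of CF, whereas the span of best IPDs required by a strict Jeffress arrangement doubles.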
The chicken NL data reported here were also broadly distributed, but differed in some important aspects. There was no indication for a common best IPD value across frequencies or any discernable trend. Instead, the median best IPD appeared to fluctuate widely (Fig. 6b), consistent with random fluctuations due to minor sampling biases in the different frequency bands. Most interestingly, the scatter was distinctly symmetrical around zero at low frequencies of a few hundred Hz, a fact even more obvious in the distribution of best ITD (Fig. 6a). This is highly unusual and indicates that there is no bias towards a representation of the contralateral auditory hemisphere, as there clearly is at higher frequencies. Although one has to be aware of interaural–canal effects in birds, especially at low frequencies (discussed below), it is difficult to see how that could lead to a sign reversal and thus the erroneous assignment of an ipsilateral-leading best ITD. More likely, the symmetrical scatter around zero reflects an increasing ambiguity in determining the peak of a very broad ITD selectivity curve (Goldberg and Brown 1969) and random CF mismatches, i.e., random differences in the cochlear delays of the inputs from both sides (Joris et al. 2006). In either case, it implies an average value near zero best ITD. This runs contrary to the two-channel model which predicts the best ITDs at such low frequencies to cluster around a large value outside the physiological range (see discussion below). It might be consistent with the Jeffress model, but only if the observed range of best ITDs in this low-frequency range is topographically mapped in NL, which remains open at present. Whether mapped or not, the symmetrical distribution of best ITD around zero indicates a remarkable shift from a predominantly contralateral representation to one of the entire azimuthal space.
The suggestion of a change in ITD representation at low frequencies is intriguing in the light of earlier observations in the barn owl that the low-frequency regions of the NL and its inputs are anatomically different to the higher frequency regions, in a way that suggested a breakdown of the delay-line structure (Köppl and Carr 1997). Unfortunately, physiological data from those low-frequency regions of NL are extremely scarce (Carr and Konishi 1990; Carr and Köppl 2004) and allow no conclusions at present. Wagner et al. (2007) recently published distributions of best ITD for a large sample of owl midbrain neurons. They found an increasing range of best ITD values with decreasing frequency. However, the format in which the data were shown (recordings pooled for both sides of the brain without normalization to ipsi- or contralateral leading) allows no distinction between a symmetrical distribution around zero ITD and a contralaterally biased representation.
In the chicken, the response peak nearest to zero ITD was not always the best ITD, resulting in some best IPDs beyond ±0.5 cycles, outside the so-called π-limit (Fig. 6a). In mammals, neural responses outside the π-limit are rarely observed in the midbrain (Marquardt and McAlpine 2007), in contrast to the barn owl where such responses are a typical feature and well explained by the Jeffress model (Wagner et al. 2007). Interestingly, Marquardt and McAlpine (2007) have suggested that the π-limit may be due to a phase shift underlying interaural delays, as opposed to morphological delay lines. Also, the absence of detectors beyond the π-limit has been attributed to redundancy since the periodicity and relative magnitudes of the peaks in the cross-correlation function beyond the π-limit are not separable (Thompson et al. 2006). The chicken data may be interpreted as either conforming to the π-limit or not, depending on how much significance is attached to the few data points falling outside. Ambiguity in selecting the correct response peak from two similarly sized ones has been blamed for such outliers in mammalian data sets (Marquardt and McAlpine 2007). It is worth pointing out that this can be excluded for the chicken, since the CD and/or monaural responses were used to determine laterality.
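Expressed in time, the π-limit corresponds to a best ITD of half a stimulus period, so it tightens as frequency rises. The following sketch shows the conversion and the resulting test; the function names and the example values are hypothetical and serve only to illustrate the definition.

```python
def best_ipd_cycles(best_itd_us, freq_hz):
    """Convert a best ITD in microseconds to a best IPD in cycles."""
    return best_itd_us * 1e-6 * freq_hz

def pi_limit_itd_us(freq_hz):
    """Best ITD (microseconds) that corresponds to an IPD of 0.5 cycles."""
    return 0.5 / freq_hz * 1e6

def within_pi_limit(best_itd_us, freq_hz):
    """True if the best IPD does not exceed 0.5 cycles in magnitude."""
    return abs(best_ipd_cycles(best_itd_us, freq_hz)) <= 0.5

# Hypothetical examples
for itd_us, f in ((300.0, 1000.0), (400.0, 1600.0)):
    print(f"{itd_us:.0f} us at {f:.0f} Hz: IPD = {best_ipd_cycles(itd_us, f):.2f} cycles, "
          f"limit = {pi_limit_itd_us(f):.1f} us, inside = {within_pi_limit(itd_us, f)}")
```

In this example a best ITD of 400 μs lies inside the limit at 1 kHz but outside it at 1.6 kHz, which is why responses beyond the π-limit become more likely at higher frequencies for a given ITD.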
In summary, the distributions of best IPD and best ITD in the chicken, as in most other species, are not consistently supportive of either the Jeffress model or the two-channel model. We interpret the substantial scatter of values at any one frequency as more likely compatible with a Jeffress-like code. Intriguingly, the chicken data suggest a shift from the usual contralaterally-biased representation to one centred around zero ITD in the low-frequency regions of NL. This is clearly in conflict with the two-channel model.
To determine how the range of neural best ITDs compares with the natural ITD range, it is important to clarify what the physiological range of ITD in the chicken is. Avian middle ears are not enclosed in bullae as they are in mammals, but are acoustically connected through skull spaces collectively termed the interaural canal. Ears connected like this may function as pressure difference receivers (Calford and Piddington 1988). Depending on the physical dimensions of the head and on the wavelength and the attenuation across the interaural canal, significant interactions between the sounds reaching the eardrum from both sides may result in increased directional cues. Although agreed upon in principle, the precise extent of those effects in different species of birds is still controversial (recent reviews in Klump 2000; Christensen-Dalsgaard 2005). For the chicken, the best measurements of the actual ITD, using cochlear microphonics (Hyson et al. 1994), support a pressure difference receiver mechanism with increasing effect towards lower frequencies. Extrapolating Hyson et al.’s (1994) data to more mature chickens with a head size of up to 25 mm, as used in our experiments, we derive maximal ITDs of about ±160 μs at high frequencies, rising to ±300 μs at 800 Hz. Below 800 Hz, ITDs for the chicken are unknown, but data from other bird species suggest that they will continue to increase with decreasing frequency (Calford and Piddington 1988; Larsen et al. 1997, 2006). It is important to note that interaural canal transmission, especially at low frequencies, is severely affected by cumulative changes in skull air pressure under anesthesia (Larsen et al. 1997) and possibly also by tightly sealing sound systems into the ear canals (Rosowski and Saunders 1980), because both conditions affect eardrum impedance. We assume that middle-ear function was near normal under our experimental conditions, since those conditions were avoided.
An interesting feature of the topographical representation of ITD in the chicken NL was that the mapped range appeared to increase with decreasing frequency, a striking correlation with the physical properties of the middle ear. For the two frequency bands with the most data, 0.8–1.6 kHz and >1.6 kHz, the mapped ranges were −94 to +425 μs and −5 to +269 μs. This is a reasonable match with the above estimates of ±300 and ±160 μs, respectively. Conclusive comparisons must await more extensive measurements of older chickens’ ITD range over a broader frequency range than currently available. Also, the median best ITD of all our recordings in NL fell at +90 μs, clearly within the physiological range of the chicken.
In summary, the ranges of neural best ITD topographically represented in the chicken NL match the estimated physiological ranges well. In addition, the majority of best ITD values clearly fell within physiological range. This is entirely consistent with the Jeffress model of ITD coding. Is it also consistent with the two-channel model? A crucial observation that led to the revival of the two-channel model was that in the guinea pig, many neurons in the IC appear to have their best ITDs outside the animal’s physiological range (McAlpine et al. 2001). According to the predictions in Harper and McAlpine (2004) and using the above estimates for physiological ITDs in the chicken, best ITDs should clearly fall outside the physiological range at low frequencies of a few hundred hertz. The data instead showed a clustering of best ITDs around zero and thus contradict this prediction.
Taken together, ITD coding in the chicken NL is more broadly consistent with the Jeffress model than with the two-channel model. Thus, contrary to expectations from the optimal coding scheme of Harper and McAlpine (2004), a Jeffress-like place code of ITD could be an evolutionarily stable strategy for an animal with a relatively small head and a limited ability of its neurons to phase-lock to high frequencies. Similarly, Wagner et al. (2007) concluded that ITD coding in the low-frequency range of the barn owl did not conform to the predictions of optimal coding (Harper and McAlpine 2004). This suggests that either ITD coding is not always optimal or that factors not included in the model are of overriding importance. We discuss two such potential factors: no selective pressure for optimal coding and other useful aspects of the neurons’ code.
The relative importance of sound localization in the ecological context of the animal species will impose different selective pressures on the ITD coding circuits (Wagner et al. 2007). Sound localization abilities of the chicken may be optimal for its environment, but not optimal in theoretical terms. This argument, however, simply pushes the problem further back in evolutionary time, as the Jeffress-like layout of the chicken’s ITD coding circuit must have been selected for at some time. Indeed, all available evidence suggests that it is the plesiomorphic condition for birds (Grothe et al. 2004). Paleontological studies show that early birds and their dinosaurian ancestors were predominantly small creatures, similar in head size to pigeons or chickens (review in Chiappe and Dyke 2002), providing no retrospective support for optimal coding.
The usefulness of any neural code for ITD at the level of NL or MSO must depend on how it is read at higher levels of the auditory system. As Takahashi et al. (2003) have pointed out, different aspects of the same neurons’ discharges may be used for different behavioral tasks, e.g. spatial discrimination vs. sound localization, thus rendering the strict distinction between a place code and a population code obsolete. Along similar lines, Joris and Yin (2007) have argued that ITD coding circuits also convey useful information about binaural correlation. Psychophysical studies have shown that humans and owls can localize phantom sound sources well until the correlation declines to a very low value, below which their performance deteriorates (Blauert and Lindemann 1986; Grantham and Wightman 1979; Jeffress et al. 1962; Saberi et al. 1998). Binaural neurons are sensitive to changes in binaural correlation mostly at the peak of the ITD curve and not at the slope (reviewed in Joris and Yin 2007). Thus neurons with best ITDs within the physiological range are most useful for decorrelation detection. These additional constraints suggest that the assumptions of the two-channel model are insufficient. Sensory systems have evolved to extract behaviorally relevant information and organize it into a format that allows subsequent neural stages to process the information rapidly and efficiently (Konishi 1986). The formation of maps of ITD in owls and chickens suggests that such maps engender a profound computational advantage (van Hemmen 2005).
We are grateful to Mark Konishi for the generous gift of the software “xdphys” custom-written in his lab and used in our experiments. We also wish to thank Jose-Luis Peña, Richard Kempter and Hermann Wagner for the use and support of data analysis routines and Birgit Seibel for excellent technical assistance with the histology. Jacob Christensen-Dalsgaard, Nicol Harper, Mark Konishi, Geoff Manley, Jose-Luis Peña and several anonymous reviewers kindly commented on earlier versions of the manuscript. Supported by a Humboldt Research Award to CEC, DFG grant KO 1143 / 12-2 and University of Sydney R&D grant to CK and NIH grant 000436 to CEC.
Christine Köppl, Lehrstuhl für Zoologie, Technische Universität München, Lichtenbergstr. 4, 85747 Garching, Germany. Department of Physiology (F13), University of Sydney, Sydney, NSW 2006, Australia; Email: ckoeppl@physiol.usyd.edu.au.
Catherine E. Carr, Lehrstuhl für Zoologie, Technische Universität München, Lichtenbergstr. 4, 85747 Garching, Germany. Program in Neuro- and Cognitive Science, Department of Biology, University of Maryland, College Park, MD 20742-4415, USA.