The quality of temporal coding of sound waveforms in the monaural afferents that converge on binaural neurons in the brainstem limits the sensitivity to temporal differences at the two ears. The anteroventral cochlear nucleus (AVCN) houses the cells that project to the binaural nuclei, which are known to have enhanced temporal coding of low-frequency sounds relative to auditory nerve (AN) fibers. We applied a coincidence analysis within the framework of detection theory to investigate the extent to which AVCN processing affects interaural time delay (ITD) sensitivity. Using monaural spike trains to a 1-s broadband or narrowband noise token, we emulated the binaural task of ITD discrimination and calculated just noticeable differences (jnds). The ITD jnds derived from AVCN neurons were lower than those derived from AN fibers, showing that the enhanced temporal coding in the AVCN improves binaural sensitivity to ITDs. AVCN processing also increased the dynamic range of ITD sensitivity and changed the shape of the frequency dependence of ITD sensitivity. Bandwidth dependence of ITD jnds from AN as well as AVCN fibers agreed with psychophysical data. These findings demonstrate that monaural preprocessing in the AVCN improves the temporal code in a way that is beneficial for binaural processing and may be crucial in achieving the exquisite sensitivity to ITDs observed in binaural pathways.
Sensitivity to acoustic differences at the two ears supports human spatial perception (for general and recent reviews see Trahiotis et al. 2005; Colburn and Kulkarni 2005). At low frequencies, two temporal cues have been widely studied: interaural correlation (ρ) and interaural time delay (ITD). ITD is the dominant cue for localizing low-frequency sounds (Wightman and Kistler 1992; Macpherson and Middlebrooks 2002). Behavioral studies in different species, and especially humans, report that just noticeable differences (jnds) in ITD are on the order of tens of microseconds (Klumpp and Eady 1956; Durlach and Colburn 1978; Bernstein 2001).
The classical view on neural ITD sensitivity, inspired by the Jeffress model (Jeffress 1948), is that the monaural pathways preserve temporal features of the sound waveforms in their temporal discharge patterns, and that ITD is explicitly represented by a topographic array of binaural cells in the medial superior olive (MSO) of the brainstem. MSO neurons act as coincidence detectors: they only discharge when receiving coincident spikes from their monaural afferents (Goldberg and Brown 1969; Yin and Chan 1990; Spitzer and Semple 1995). Experimental studies have found that binaural neurons in the IC, which receive input from the MSO, can achieve ITD jnds as small as those obtained in humans (Skottun et al. 2001; Shackleton et al. 2003). Obviously, temporal acuity of input spikes to the coincidence detectors affects ITD sensitivity, but it has not been examined to what extent temporal information in the monaural pathways may limit the performance of binaural neurons.
A rigorous mathematical analysis of the limitations in ITD sensitivity imposed by monaural inputs requires an exhaustive characterization of the statistical process that generates the responses of monaural pathways to the stimulus waveforms studied. This model-based approach to binaural processing was introduced by Colburn (1973, 1977), using a generic model of steady-state responses of the auditory nerve (AN) to pure tones. Colburn then evaluated the limitations imposed by the monaural inputs by analyzing the performance of an “ideal observer”, i.e., a perfect central processor dealing with the stochastic monaural inputs. In the current study, we use the actual responses of individual monaural neurons to wideband noise to analyze the limits of ITD sensitivity imposed by monaural properties. Limited recording time does not allow one to construct exhaustive models of the responses of individual neurons. Repeated presentation of a single noise token, however, yields an accurate statistical description of the variability across responses to identical stimuli that limits performance. With only a few assumptions concerning which aspects of the response are relevant to binaural processing (Colburn and Latimer 1978; Louage et al. 2006), this statistical description enables us to derive the limits of performance along the lines indicated by Colburn (1973). This approach is somewhat abstract in that it does not implement a concrete model of binaural processing by, say, a single coincidence detector having a given internal delay. Our approach, however, enables us to focus exclusively on the role of monaural processing without specific assumptions on binaural processing.
Previous studies (Joris et al. 1994; Louage et al. 2005) have reported that low-frequency trapezoid body (TB) fibers discharge with higher temporal accuracy and consistency when compared with their inputs, the AN fibers. It has therefore been hypothesized that cochlear nucleus processing of multiple converging AN inputs is a critical step in arriving at high ITD sensitivity in the binaural nuclei (Joris et al. 1994). We have studied how well the monaural pathways support the discrimination of ρ (Louage et al. 2006). Obviously, monaural pathways in themselves cannot be sensitive to a binaural cue: our study therefore made use of a coincidence analysis of monaural spike trains. We used the shuffled autocorrelogram (SAC) (Joris 2003; Louage et al. 2004) within the framework of detection theory (Green and Swets 1966) and found that decorrelation jnds derived from responses of TB fibers were lower than those derived from AN fibers. These findings suggested that the enhanced synchronization of TB fibers indeed improves sensitivity to binaural correlation. Here, we use the same approach to establish the performance in an ITD discrimination task based on responses of individual monaural neurons. We report that ITD jnds derived from TB fibers are lower than those of AN fibers, that jnds for ITD are correlated with jnds for ρ, and that different internal delays are most appropriate for the ITD and decorrelation discrimination task.
Our experimental methods are described in detail in previous reports (Louage et al. 2005, 2006) and are only briefly summarized here. We recorded from the AN and the TB in separate experiments. Pentobarbital anesthetized cats were placed in a soundproof room. A sealed acoustic driver was inserted into one or both exposed ear canals and calibrated with a 1/2 inch condenser microphone and a probe tube close to the eardrum.
The AN was exposed via a posterior fossa approach, and the TB via a ventral approach to the skull base. All data were recorded with glass micropipettes filled with 3 M NaCl. The neural signal was converted to spike times referenced to the stimulus onset with a peak detection triggering circuit with an accuracy of 1 μs.
The search stimulus was a noise burst (duration 300 ms, repeated every 500 ms, 70 dB SPL, bandwidth 40 kHz). When recording from the TB, the search stimulus was delivered to both ears. When the activity of a single fiber was isolated, the excitatory ear was determined. For each fiber encountered, a threshold tuning curve was obtained with a tracking algorithm that provided spontaneous rate, CF, and threshold. Short tone bursts at CF (duration 25 ms, repeated every 100 ms, 200 repetitions, rise-fall time 2.5 ms, starting in sine phase) were then presented at increasing SPL in 10-dB steps. Next, a rate-level function was obtained to a broadband Gaussian noise (1,000 ms, repeated every 1,200 ms, five to ten repetitions). The bandwidth of the broadband noise was generally set from 50 to 8,000 Hz or from 100 to 30,000 Hz, depending on CF. The broadband noise was presented from 10 to 90 dB SPL in 5- or 10-dB steps. For a subset of fibers, a rate level function was also obtained to a narrowband Gaussian noise (the narrowband noise had a bandwidth of 100 Hz and was centered on the CF of the fiber).
After a fiber’s basic physiological parameters and rate-level functions were collected, we delivered the broadband noise with many repetitions, usually 25 to 100, so as to collect roughly 3,000 spikes. The first level (overall level re 20 μPa) tested was usually 70 dB SPL; the next levels were usually 50, 30, 80, 60, and 10 dB SPL. For a subset of fibers, we delivered narrowband tokens at the same overall levels. If time allowed, responses to broadband and narrowband sequences were collected at additional SPLs.
Fibers of the TB were classified into different categories based on the shape of their PSTH (binwidth 0.1 ms) to short pure tone bursts at CF, presented at multiple SPLs including at least 60, 70, and 80 dB SPL. “Primary-like” (PL) PSTHs resemble those of AN fibers, with an initial peak followed by a monotonic decline in rate to a steady-state response (Pfeiffer 1966). “Primary-like-with-notch” (PLN) fibers have PSTHs with a brief notch following the initial peak. This notch is difficult to detect for fibers which phase-lock and have a CF lower than 1,200 Hz (Pfeiffer 1966; Smith et al. 1993), and such fibers were therefore classified as “phase-lockers” (PHL). PSTHs with regularly spaced peaks of discharge, the period of which was unrelated to the stimulus waveform, were classified as “chopper” (CHOP). Fibers for which no responses to short tone bursts were available were classified as “no PSTH”.
The computation of “binaural” thresholds from monaural spike trains is based on coincidence counts and shuffled correlograms, which are constructed from intervals between spikes across different repetitions of the stimulus, excluding intervals occurring within single spike trains. The details are described in our previous work (Joris 2003; Louage et al. 2004). The relation between binaural neurophysiology, coincidence detection, and correlograms of monaural spike trains will be elaborated at the end of this section. Here, we explore the limits that the monaural neural inputs impose on binaural performance. The theoretical framework for this analysis is provided by signal detection theory (Green and Swets 1966). The neural data are converted to a decision variable D, and the limits of performance are determined by the statistics of D. Ideally, the mathematical expression describing the decision variable in terms of the neural responses is derived from theoretical considerations involving the mathematical concept of the ideal observer. Such an approach, however, requires an exhaustive mathematical description of the stimulus-response relation: for each stimulus, the instantaneous firing probability has to be known in full detail. Colburn (1973, 1977), starting from a heuristic model of pure-tone responses of the AN, has elaborated the theoretical analysis of binaural performance. In the present study, we use the actual responses of monaural neurons to wideband noise rather than a model that reproduces the main properties of the responses. The limited recording time for each neuron precludes an exhaustive characterization of instantaneous firing probability in response to the stimulus. Our treatment of the binaural processor is therefore more heuristic than that of Colburn. Apart from these differences in emphasis, our analysis and assumptions are very similar to the work described in Colburn (1973, 1977) and Colburn and Latimer (1978).
The analysis presented here also closely parallels that of our previous study (Louage et al. 2006), in which the sensitivity to changes of interaural correlation ρ was examined.
We assume here that binaural processing is exclusively based on the relative timing of action potentials between the left and right inputs (Colburn 1977). Our starting point is therefore the correlogram rather than the instantaneous firing rate. The computation of D is illustrated in Figure 1. From the experimentally obtained spike trains (e.g., s1 and s2 in Fig. 1A) in response to a noise token, we created new spike trains (e.g., s2Δ in Fig. 1A) by introducing small delays Δ ranging from 0 to 500 μs. We then constructed, for each imposed delay Δ, correlograms h(τ; Δ) (Fig. 1B) of all possible pairs of an original and a delayed spike train. In the notation h(τ; Δ), τ denotes the inherent delay parameter of the correlogram; Δ denotes the delay imposed on the second spike train prior to computing the correlogram. As further explained below, τ corresponds to structural internal delay within the nervous system, and Δ corresponds to ITD. h(τ; Δ) is analogous to the coincidence count term Lm in Eq. 1 of Colburn and Latimer (1978), with the exception that the Lm represent independent inputs, whereas the value of h(τ; Δ) for each value of τ represents identical inputs. Pairs containing the original spike train and the delayed version of the same spike train were excluded. This exclusion is essential for the procedure to mimic the cross-coincidence analysis of a collection of identical, independent monaural inputs having a range of internal delays (Louage et al. 2004).
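The pairing procedure described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names, the 50-μs bin width, and the synthetic jittered spike trains are assumptions introduced here. Intervals are taken as (delayed spike time) minus (original spike time), so that an imposed delay Δ shifts the correlogram toward positive τ.

```python
import numpy as np

def pair_correlogram(train_a, train_b, delay=0.0, max_lag=5e-3, bin_width=50e-6):
    """Histogram of all intervals (t_b + delay) - t_a between two spike trains (s)."""
    a = np.asarray(train_a, dtype=float)
    b = np.asarray(train_b, dtype=float) + delay  # imposed delay, the stand-in for ITD
    intervals = (b[None, :] - a[:, None]).ravel()
    n_bins = int(round(2 * max_lag / bin_width))
    edges = np.linspace(-max_lag, max_lag, n_bins + 1)
    counts, _ = np.histogram(intervals, bins=edges)
    return counts, edges

def shuffled_correlograms(trains, delay=0.0, max_lag=5e-3, bin_width=50e-6):
    """Return (h, tau): one correlogram per ordered pair of *different* trains."""
    rows = []
    for i, a in enumerate(trains):
        for j, b in enumerate(trains):
            if i == j:
                continue  # never pair a train with its own delayed copy
            counts, edges = pair_correlogram(a, b, delay, max_lag, bin_width)
            rows.append(counts)
    tau = 0.5 * (edges[:-1] + edges[1:])  # bin centers
    return np.array(rows), tau

# Synthetic stand-in for repeated responses to one noise token (assumption):
# a shared set of event times, jittered independently on each repetition.
rng = np.random.default_rng(0)
base = np.sort(rng.uniform(0.05, 1.0, 100))
trains = [np.sort(base + rng.normal(0.0, 100e-6, base.size)) for _ in range(5)]

h0, tau = shuffled_correlograms(trains, delay=0.0)    # ITD = 0
h1, _ = shuffled_correlograms(trains, delay=1e-3)     # exaggerated 1-ms delay
H0, H1 = h0.mean(axis=0), h1.mean(axis=0)             # grand correlograms
```

With five repetitions this yields 20 ordered pairs. The 1-ms delay is exaggerated only to make the shift of the central peak easy to see; the study itself uses delays of 0 to 500 μs.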
The central processor has to discriminate between correlograms h(τ) from pairs of undelayed spike trains (Fig. 1B, left column) and correlograms from pairs with the second spike train delayed (Fig. 1B, right column). These two situations correspond to ITD=0 and ITD>0, respectively. Note that imposing a delay Δ on the second spike train is equivalent to shifting the τ axis by an amount Δ, i.e., h(τ; Δ)=h(τ−Δ;0). The same equivalence is present in the grand correlogram H(τ; Δ), which is obtained by averaging all the individual h(τ; Δ) (Fig. 1C). The task of the central processor is to detect changes due to the imposed delay Δ. In order to quantify the effect of Δ on the correlograms h(τ; Δ), one needs a set of weighting factors to convert the differences between the delayed and undelayed correlograms into a single decision variable while taking into account the sizes and signs of the local differences. In Colburn and Latimer (1978, Eq. 1), these weighting factors (cm in their Eq. 1) are derived from the model for AN responses and the requirement of optimal detection. Because we are using genuine monaural spike trains rather than a theoretical model, our weighting factors must be chosen to optimize sensitivity of the measured correlograms to interaural delays. This optimization only considers the basic statistical properties of the correlograms, i.e., their mean and variance. To arrive at a decision variable D that is optimized for detecting variations of Δ, one must first take the weighting factors w(τ) proportional to the local changes of h(τ) induced by changes in Δ from zero, and then take into account the variance of h(τ). For the small delays considered here, the change (a horizontal shift) is well described by the derivative of the grand correlogram H(τ; Δ) with respect to Δ, taken at Δ=0.
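After division by the variance σ²(τ) across pairs of h(τ) to incorporate the statistical weight of the contribution at delay τ, one obtains for the weighting factor
$$w(\tau) = \frac{1}{\sigma^{2}(\tau)} \left.\frac{\partial H(\tau;\Delta)}{\partial \Delta}\right|_{\Delta=0} \qquad (1)$$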
We approximated the derivative of H by [H(τ; 0)−H(τ; Δ)]/Δ with Δ=50 μs (Fig. 1D), smoothed by convolving it with a rectangular window of 100 μs. For all fibers, w(τ) had zero value at τ=0. This follows from the symmetric shape of H(τ; 0). The absolute weighting function shows maxima at the steep flanks of H(τ). The decision variable is computed by weighting each bin of h(τ) with w(τ) and summing over the range of available delays τ:
$$D = \sum_{\tau} w(\tau)\, h(\tau) \qquad (2)$$
This decision statistic is optimal in the following technical sense: D is proportional to the delay that minimizes, in a weighted least-square sense, the deviation between the observed h(τ) and the undelayed grand mean correlogram H(τ;0). Most of the analyses are based on a 10-ms range of delays centered around zero, large enough to capture the meaningful part of the correlogram. In some of the later analyses, the effect of restricting this range will be explored.
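The weighting and summation steps can be illustrated as follows. This is a sketch under stated assumptions rather than the authors' implementation: the correlograms are synthetic Poisson counts around a Gaussian-shaped peak, the bin width is taken to be 10 μs, and bins with zero variance are simply given zero weight. The finite difference [H(τ; 0)−H(τ; Δ)]/Δ, the Δ of 50 μs, and the 100-μs rectangular smoothing window follow the text.

```python
import numpy as np

def weighting_function(h_ref, h_shift, delta, bin_width, smooth=100e-6):
    """w(tau): smoothed finite-difference slope of the grand correlogram / variance."""
    H_ref = h_ref.mean(axis=0)        # grand correlogram at ITD = 0
    H_shift = h_shift.mean(axis=0)    # grand correlogram at ITD = delta
    slope = (H_ref - H_shift) / delta
    k = max(1, int(round(smooth / bin_width)))          # window length in bins
    slope = np.convolve(slope, np.ones(k) / k, mode="same")
    var = h_ref.var(axis=0)           # variance across spike-train pairs, per bin
    # Assumption: bins with zero variance get zero weight instead of dividing by 0.
    return np.divide(slope, var, out=np.zeros_like(slope), where=var > 0)

def decision_variable(h, w):
    """One value of D per correlogram: weighted sum over internal delays tau."""
    return h @ w

def synthetic_correlograms(delta, n_pairs, tau, rng, peak=20.0, floor=2.0, width=300e-6):
    """Hypothetical stand-in for measured h(tau; delta): Poisson counts."""
    mean = floor + peak * np.exp(-0.5 * ((tau - delta) / width) ** 2)
    return rng.poisson(mean, size=(n_pairs, tau.size)).astype(float)

bin_width = 10e-6                                   # assumed bin width
tau = (np.arange(1000) + 0.5) * bin_width - 5e-3    # -5 to +5 ms
rng = np.random.default_rng(1)
h_0 = synthetic_correlograms(0.0, 400, tau, rng)
h_50 = synthetic_correlograms(50e-6, 400, tau, rng)
h_100 = synthetic_correlograms(100e-6, 400, tau, rng)

w = weighting_function(h_0, h_50, delta=50e-6, bin_width=bin_width)
D_0 = decision_variable(h_0, w)      # distribution of D at ITD = 0
D_100 = decision_variable(h_100, w)  # distribution of D at ITD = 100 us
```

With this sign convention for the finite difference, the mean of D falls as the imposed delay grows, which matches the decrease of the PDF means with target ITD reported in the Results.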
By reducing the correlogram h(τ) to a single number D, the central processor’s problem of judging the imposed delay Δ is reduced to setting a criterion on the value of D. The performance of the central processor is entirely determined by the statistics of the decision variable. For each stimulus, many instances of D are available because multiple responses to the same stimulus are available, from which a high number of pairs of spike trains can be extracted. We computed D for all possible spike train pairs from the collection of responses. Figure 1E shows how the distribution of the decision variable is affected by Δ; two probability density functions of D are displayed, one corresponding to Δ=0 μs and the other to Δ=100 μs. The two distributions are almost identical, but shifted, versions of each other. The slight deviations in shape stem from boundary effects created by shifting the spike trains (cf. Fig. 1B). The width of each distribution reflects internal noise in the neural responses; the distance between the two distributions reflects the effect of a time delay of 100 μs. The distance or “detection index” d′ between distributions corresponding to a certain ITD and a reference ITD of 0 μs was calculated according to (Green and Swets 1966)
where μ and σ denote mean and standard deviation of D and the subscript ref refers to the reference condition, Δ=0. By convention, the ITD jnd was defined as the value of Δ at which d′=1.
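As an illustration of how a detection index can be computed from two sets of decision-variable values, the sketch below uses one common form from signal detection theory: the absolute difference of the means divided by the root-mean-square of the two standard deviations. That particular normalization is an assumption made here and may differ from the expression used in Eq. 3; the function name is likewise hypothetical.

```python
import numpy as np

def detection_index(d_target, d_ref):
    """d' between two samples of D (assumed form: |mean difference| / RMS of the SDs)."""
    d_target = np.asarray(d_target, dtype=float)
    d_ref = np.asarray(d_ref, dtype=float)
    rms_sd = np.sqrt(0.5 * (d_target.var(ddof=1) + d_ref.var(ddof=1)))
    return abs(d_target.mean() - d_ref.mean()) / rms_sd

# Toy example: both samples have unit standard deviation, means differ by 2.
example_dprime = detection_index([1.0, 2.0, 3.0], [-1.0, 0.0, 1.0])
```

When the two distributions are shifted copies of each other, as described for Figure 1E, the two standard deviations are nearly equal and most reasonable normalizations give nearly the same d′.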
The calculation of d′ according to Eq. 3 is most appropriate for Gaussian statistics of the decision variable D. A more general metric of performance, which does not assume Gaussian statistics, is provided by the receiver-operating characteristic (ROC) curve (Fig. 1F). The ROC curve is derived from the distributions in Figure 1E, and depicts the probability of hits versus the probability of false alarms for all possible response criteria. A property of ROC curves is that the area under the curve expresses the percent correct answers of the central processor in case of a two-alternative forced-choice task (Green and Swets 1966). For example, with the two distributions corresponding to Δ=0 or 100 μs (Fig. 1E), the central processor will correctly make 76 out of 100 decisions (Fig. 1F).
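The ROC construction can be sketched without any Gaussian assumption by sweeping a criterion over the pooled values of D. The code below is an illustration with hypothetical function names. It assumes the observer responds "delayed" whenever D falls at or below the criterion, in line with the decrease of D with increasing ITD reported in the Results.

```python
import numpy as np

def roc_points(d_target, d_ref):
    """False-alarm and hit probabilities for every possible criterion on D."""
    t = np.sort(np.asarray(d_target, dtype=float))
    r = np.sort(np.asarray(d_ref, dtype=float))
    criteria = np.sort(np.concatenate([t, r]))
    # Respond "delayed" when D <= criterion (assumed response rule).
    hits = np.searchsorted(t, criteria, side="right") / t.size
    false_alarms = np.searchsorted(r, criteria, side="right") / r.size
    # Prepend the (0, 0) corner so the curve runs from (0, 0) to (1, 1).
    return np.concatenate([[0.0], false_alarms]), np.concatenate([[0.0], hits])

def percent_correct(d_target, d_ref):
    """Area under the ROC curve (trapezoid rule), expressed in percent."""
    fa, hit = roc_points(d_target, d_ref)
    area = np.sum(np.diff(fa) * 0.5 * (hit[1:] + hit[:-1]))
    return 100.0 * area

# Gaussian toy data with unit variance and means one standard deviation apart.
rng = np.random.default_rng(0)
pc_gauss = percent_correct(rng.normal(-1.0, 1.0, 4000), rng.normal(0.0, 1.0, 4000))
```

For equal-variance Gaussian distributions the area equals Φ(d′/√2), so d′=1 corresponds to roughly 76% correct in a two-alternative forced-choice task.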
Nearly all computational models conceptualize the binaural system as an array of coincidence detectors that compares spike timings of left- and right-ear inputs at different internal delays (e.g., Colburn 1973). Integrated over time, the output of the entire array is akin to a running crosscorrelation of the spike trains feeding into the array, in which each “tap” (i.e., a comparison at one delay τ) in the array is provided by one coincidence detector. Spike train autocorrelograms (Figs. 1B, C, 2A, etc.) are counts of coincidences of an array of coincidence detectors regularly spaced in internal delay τ and connected to two input fibers that are identical in their physiological properties (Louage et al. 2004). It is important to clearly distinguish between τ, which is a structural property internal to the binaural system, and time delay Δ (ITD), which is the independent stimulus variable. The correlograms at two different Δ (Fig. 1B, left vs. right column; Fig. 1C, solid vs. dashed) thus simulate the output of an array of coincidence detectors to the same binaural stimulus, delivered sequentially at two different ITDs. In physiological terms, they simulate the output of an array of MSO neurons, tuned to the same CF but with different internal delays, to a single binaural stimulus sequentially delivered at two ITDs.
We report results for 356 TB fibers obtained from 14 cats and for 443 AN fibers obtained from 19 cats. First, we show data for individual AN and TB fibers, and next we report population data. To calculate an ITD jnd, we have to choose an analysis window (i.e., the post-stimulus time segment over which the spike trains are considered), a range of internal delays τ, and a reference ITD. The analysis window always extended from 50 to 1,000 ms. When not stated otherwise, τ ranged from −5 to +5 ms, and the reference ITD was 0 ms. The implications of these latter two choices will be discussed later.
Figure 2 shows steps (each column) in the analysis to determine the ITD jnds for three AN fibers, arranged from low (top) to high (bottom) CF. The three curves in each panel of the first column show the grand correlograms, H(τ) (as in Fig. 1C), of spike train pairs corresponding to ITD=0. The grand correlograms are proportional to the normalized SACs reported in our previous publications (e.g., Louage et al. 2004). The shape of the correlograms is consistent with the expected autocorrelation function of the “effective” stimulus to the fiber as determined by the mechanical and transduction events that precede spike initiation at the cochlear site which excites the fiber. For a detailed description of the shape of correlograms, see Louage et al. (2004).
Figure 2, second column, illustrates probability density functions (PDFs) of the decision variable D corresponding to target ITDs ranging from 0 to 500 μs, in steps of 25 μs. The PDFs corresponding to ITD=0 have zero mean and are depicted by a bold line. Depending on the number of stimulus presentations, a PDF is typically defined by 4,000 to 13,000 values of the decision variable, each ordered pair of stimulus presentations yielding a single value. For all fibers, the mean values of the PDFs decrease with increasing target ITD. The shape and standard deviation of the PDFs of one fiber are nearly invariant with ITD, because the decision variables corresponding to different ITDs are retrieved from the same set of responses. Compared with the low-CF fiber (Fig. 2B), PDFs of the high-CF fiber (Fig. 2H) are narrow but their mean values are relatively invariant with increasing target ITD.
Figure 2, right column, illustrates the detection index, d′, versus ITD; we will refer to d′ versus ITD as an interaural time sensitivity curve (ITSC). For the fiber with the lowest CF (Fig. 2C), the ITSC has a linear shape up to ITDs of 150 μs. The ITD jnd (99 μs) was expressed as the amount of ITD needed to reach d′=1 and was determined by linear interpolation. For the mid-CF fiber (Fig. 2F), the linear range is confined to smaller values of ITD. Note that the nonmonotonicity of the ITSC reflects periodicity of the grand correlogram, and that at relatively large ITDs, the detection index loses its meaning because the weighting function only applies to small ITDs (Eq. 1). The ITD jnd (35 μs), however, falls well within the linear range of the ITSC. The ITSC of the high CF (Fig. 2I) fiber does not cross d′=1 up to target ITDs of 500 μs: we label it as undefined.
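Reading a jnd off an ITSC by linear interpolation can be sketched as below. The function name and the toy curve are illustrative assumptions. The function returns None when the curve never reaches the criterion, mirroring the "undefined" label used in the text.

```python
import numpy as np

def itd_jnd(itds, dprimes, criterion=1.0):
    """ITD at which d' first reaches the criterion, by linear interpolation.

    Returns None if d' stays below the criterion over the tested ITDs.
    """
    itds = np.asarray(itds, dtype=float)
    dprimes = np.asarray(dprimes, dtype=float)
    reached = np.nonzero(dprimes >= criterion)[0]
    if reached.size == 0:
        return None                      # jnd undefined
    k = reached[0]
    if k == 0:
        return float(itds[0])            # criterion already met at the first ITD
    x0, x1 = itds[k - 1], itds[k]
    y0, y1 = dprimes[k - 1], dprimes[k]  # y0 < criterion <= y1, so y1 > y0
    return float(x0 + (criterion - y0) * (x1 - x0) / (y1 - y0))

# Toy ITSC sampled in 25-us steps (values are made up for illustration).
toy_itds = [0.0, 25.0, 50.0, 75.0, 100.0]
toy_dprimes = [0.0, 0.4, 0.8, 1.2, 1.6]
toy_jnd = itd_jnd(toy_itds, toy_dprimes)   # crossing lies between 50 and 75 us
```

Taking the first crossing keeps the estimate inside the initial linear range of the ITSC, which matters for fibers whose ITSC is nonmonotonic at larger ITDs.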
Figure 3A shows the ITD jnds versus CF using 70-dB-SPL broadband noise for a group of 377 AN fibers. The AN jnds show a clear dependence on CF: they decrease with increasing CF up to 2–3 kHz, and are mostly undefined (symbols plotted at 450 μs) above CFs of 4 kHz. This steep transition region coincides with the CF range where envelope synchronization starts to outweigh fine-structure synchronization in the responses to broadband noise (Joris 2003; Louage et al. 2004). At all CFs, the lowest jnds of AN fibers with low (≤18 spikes/s) or high (>18 spikes/s) spontaneous rate (SR) are similar but the highest jnds mostly occur with low/medium-SR AN fibers (inverted triangles).
Figure 3B shows the same data as Figure 3A, but with jnds expressed relative to the period corresponding to CF (Tcf). With increasing CF, jnd/Tcf slowly increases up to CFs of 2 kHz, and steeply increases at CFs above 2 kHz. Below 1 kHz, ITD jnds are on average 4% of Tcf.
Figure 3C shows ITD jnd versus CF when the jnd is defined at d′=0.3. Compared with the ITD jnds obtained at d′=1 (Fig. 3A), those obtained at d′=0.3 are on average 3.4 times lower as expected from the linear shape of the ITSC at small target ITDs (Fig. 2, right column). Lowering the threshold criterion reduces the ceiling effect in Figure 3A and allows for differentiation of ITD discrimination performance among fibers with high CFs. The collapse of performance for CFs higher than ~3.5 kHz is still quite dramatic. Note that lowering the threshold corresponds to the optimal use of (1/0.3)², i.e., ~11, independent fibers with identical physiological properties, because the collective d′ squared equals the sum of the squared individual d′ values (Green and Swets 1966).
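The arithmetic behind the figure of about 11 fibers can be written out explicitly. Here N is the number of independent, identical fibers and d′ is the detection index of a single fiber; the subscripted symbols are notation introduced only for this illustration.

```latex
\[
d'_{N} \;=\; \sqrt{\sum_{i=1}^{N} d'^{\,2}_{i}} \;=\; \sqrt{N}\, d'
\quad\Longrightarrow\quad
N \;=\; \left(\frac{d'_{N}}{d'}\right)^{2}
  \;=\; \left(\frac{1}{0.3}\right)^{2} \;\approx\; 11.1
\]
```

In other words, an ITD that gives d′=0.3 in one fiber would give a collective d′ of 1 if about 11 such fibers were combined optimally.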
We find that the ITD jnds of TB fibers are usually smaller than those of AN fibers. The low ITD jnds of TB fibers are an expected consequence of their more accurate time coding, as reflected by the high and narrow central peaks of correlograms (Louage et al. 2004) obtained from responses to broadband noise (Fig. 4, left column). Figure 4 shows data for three representative TB fibers, with the same layout as Figure 2. Compared with the low-CF AN fiber (Fig. 2A), the correlogram of the PHL fiber (Fig. 4A) has a much higher and narrower central peak, the ITSC is steeper (Fig. 4C), and the ITD jnd (16 μs) is lower. The ITSC and jnd of the PL fiber (Fig. 4, middle row) are rather similar to those of the mid-CF AN fiber (Fig. 2, middle row). Compared with the high-CF AN fiber (Fig. 2G), the correlogram of the PLN fiber (Fig. 4G) has a higher central peak, and the jnd is defined.
Figure 5 shows the ITD jnds obtained at 70 dB SPL for a population of 212 TB fibers. The solid line envelopes the lowest jnds obtained from AN fibers (Fig. 3A). In the region of transition between phase-locking to fine-structure and envelope (~2–4 kHz), the jnds are similar for AN and TB. Below and above this region, many TB fibers yield jnds lower than the lowest jnds of all AN fibers in the same region. Overall, the CF dependence of TB fibers is weaker than in AN fibers.
The population data shown so far were all obtained at 70 dB SPL, but for many fibers noise was presented at additional levels. Figure 6 shows data for four fibers, arranged from low (top) to high CF (bottom). The first column shows superimposed correlograms representing different SPLs as indicated in the right upper corner of each panel. The second column shows the ITSCs which correspond to the SPLs in the first column. The third column shows panels with two ordinates: the right ordinate indicates the average firing rate and the left ordinate indicates the ITD threshold.
The first row of Figure 6 shows a low-CF (230 Hz) PHL fiber. With increasing SPL, the ITD jnd slightly decreases (Fig. 6C); this trend was seen for all fibers that synchronized to the fine-structure of the waveform. The second row of Figure 6 shows a mid-CF (2,530 Hz) PL fiber. The left panel shows two correlograms obtained at 30 and 70 dB SPL. Both correlograms have an oscillatory shape with a frequency that corresponds to CF, which reflects synchronization to fine-structure. As with the low-CF PHL fiber, ITD jnds slightly decrease with increasing SPL.
The bottom two rows of Figure 6 show two high-CF fibers: a PLN (third row) and an AN (fourth row) fiber. Their correlograms have the shape of a single central peak (Fig. 6G, J). The ITD jnd versus SPL curves (Fig. 6I, L) show a minimum at SPLs in the upper end of the dynamic range of their rate-level curves. With decreasing SPL, ITD jnds increase and eventually, when the fiber is not driven, become undefined. With increasing SPL, jnds also increase: for the PLN fiber, this increase is small whereas for the AN fiber the increase is marked and the ITD jnd is undefined at 60 dB SPL. The nonmonotonic shape of the jnd versus level curves of high-CF fibers probably reflects two counterbalancing mechanisms. On the one hand, at low SPL, too few stimulus-coupled spikes are available for an accurate estimate; on the other hand, at high SPLs, envelope coding declines (Fig. 6G, J).
Figure 7A shows all measured ITD jnds derived from responses of TB (N=356) and AN fibers (N=443). Multiple points connected by a vertical line represent those TB fibers from which we obtained jnds at more than one SPL. When only one SPL was tested, a single data point represents a single fiber. For graphical clarity, multiple jnds obtained from single AN fibers are not connected. Undefined jnds are not illustrated except when all thresholds were undefined; in those cases, a data point is plotted at the horizontal dashed line near 500 μs. Different symbols represent the fiber types. The stimuli were presented at levels ranging from 10 to 100 dB, with a median of 60 dB SPL. In general, the CF dependency of ITD jnds and the contrast between AN and TB (Figs. 2 and 5) persist when stimuli are delivered at SPLs other than 70 dB: jnds are lower at low CFs, and the most ITD-sensitive TB fibers have lower jnds than any AN fibers.
Figure 7B shows ITD discrimination performance expressed as the percentage of correct answers when discriminating waveforms with 0 and 50 μs ITDs. The percentage of correct answers is derived from the ROC curve (see “Methods” and Fig. 1F). Figure 7B is based on the same responses as Figure 7A but, for each fiber, data points indicate percent correct for the SPLs which yielded the best performance. Performance of TB and AN fibers changed with CF: below 1 kHz, the average performance of TB and AN fibers was 87% and 76%, respectively; between 1 and 4 kHz, the average performance of TB and AN fibers was 77% and 83%, respectively; and above 4 kHz, average performance of TB and AN fibers was 67% and 58%, respectively. Among TB fibers, ITD sensitivity for the PSTH classes is not equal: average performance of PHL, PL, PLN, and chopper fibers was 88%, 74%, 71%, and 69%, respectively.
It has repeatedly been shown that envelope phase-locking of ventral cochlear nucleus neurons occurs over an extended dynamic range when compared with AN fibers (Joris et al. 2004). We checked whether this would also translate to envelope ITD sensitivity to noise. Figure 8A shows ITD jnds versus sound level above the rate threshold for broadband noise for AN (thick lines) and TB (thin lines) fibers. Fibers with CFs<4 kHz are excluded because the focus here is on envelope coding. Compared with AN fibers, most TB fibers yield lower jnds that are also defined over a wider range of sound levels. The extended range of ITD sensitivity for TB fibers is particularly clear in Figure 8B, where jnds are plotted versus sound level relative to the level at which the lowest jnd occurs.
We evaluated ITD sensitivity for 100-Hz-wide narrowband noise centered at the fiber’s CF. Compared with broadband noise, the effective stimulus for narrowband noise is less affected by peripheral filtering and has slower envelope fluctuations. We have previously described correlograms of responses to narrowband noise (Louage et al. 2006; Fig. 12). For CFs above the phase-locking limit, the correlograms to narrowband noise have the shape of a single broad peak with a halfwidth ranging from 5 to 8 ms, i.e., much wider than the peak of correlograms to broadband noise. For CFs below the phase-locking limit, correlograms have the shape of a damped oscillation with an oscillation frequency that corresponds to the CF of the fiber.
Figure 9A plots ITD jnds for narrowband noise versus CF. Points at multiple SPLs are again connected with a vertical line. Undefined jnds are not illustrated except when all thresholds were undefined; in those cases the data points are plotted at the horizontal dashed line. Up to CFs of 3.5 kHz, most jnds are between 15 and 100 μs; above 3.5 kHz the jnds are undefined.
A comparison between jnds to narrow- and broadband noise is shown in Figure 9B. Only data for which jnds were obtained at multiple SPLs are shown. Fibers for which all of the jnds were undefined are not shown. Each line represents a single fiber; the line end with a symbol represents the smallest jnd obtained from responses to broadband noise and the line end without a symbol represents the smallest jnd obtained from responses to narrowband noise. The direction of the line thus indicates whether jnds are lower or higher in the broadband condition. At the lowest CFs, the lines are small (one exception) and can go either way. This is as expected since the tuning bandwidth of these neurons is small, so that restriction of the noise bandwidth to 100 Hz has little effect on the responses (Mc Laughlin et al. 2007, 2008). For fibers at midfrequencies, with CFs in the 2–3 kHz range, the narrowband condition tends to yield lower jnds. For CFs above 3.5 kHz, the situation is reversed and more extreme: the smallest jnd always occurs for broadband noise and the jnd for narrowband noise is undefined. These latter trends are clearly present across unit types.
In the next sections, results refer exclusively to data obtained with broadband noise. We previously reported decorrelation jnds for AN and TB fibers (Louage et al. 2006). For a subset of fibers (N=93), we can directly compare the discrimination performance of a fiber for two different tasks: ITD discrimination and correlation discrimination. Figure 10A shows the decorrelation jnd versus the ITD jnd for AN and TB fibers with CFs lower than 2.8 kHz. Each data point represents a single fiber. There is a weak trend for fibers with low ITD jnds to also yield low decorrelation jnds, and vice versa. Decorrelation sensitivity and ITD sensitivity are more tightly linked when the ITD jnds are expressed relative to the period corresponding to CF (TCF) (Fig. 10B).
Figure 10C shows decorrelation jnds versus ITD jnds for high-CF (>4 kHz) TB fibers. Compared with fibers that synchronize to the fine-structure of the waveform, decorrelation and ITD jnds are more tightly linked in high-CF fibers. We have no high-CF AN fibers for which the jnd was defined for both the ITD and the decorrelation discrimination task; this is mainly due to the scarceness of high-CF ITD jnds. At both low CFs (Fig. 10A) and high CFs (Fig. 10C), chopper fibers are below the general trend of other fibers. This reflects a combination of two effects: the higher threshold of choppers to ITDs (Figs. 5 and 7) and their lower thresholds to decorrelation (Louage et al. 2006, Figs. 6 and 8).
The comparisons in Figure 10 show that the jnds of the central processor for ITD and decorrelation discrimination tasks are related when they are directly compared. The analyses presented in the next section, however, will show that sensitivity to ITDs and correlation depend in contrasting ways on analysis parameters (such as the range of internal delays) and on derived metrics (such as firing rate).
Binaural recordings indicate that the range and average value of internal delays τ depend on CF (McAlpine et al. 2001; Brand et al. 2002; Hancock and Delgutte 2004a; Joris et al. 2006a, b; Siveke et al. 2006; Pecka et al. 2008). In our analysis so far, the available range of τ values was taken very wide and fine-grained, and was centered at the peak of the correlograms (−5 to +5 ms). Because such a wide range of internal delays is not actually available in the binaural system, it is important to examine the effect of the width and center of this range on the performance of the central processor for the two tasks. Describing the range of internal delays by its center value τc and its width W, Eq. 2 becomes
For instance, a choice of W=150 μs and τc=−50 μs results in an available range of internal delays (τ values) of −125 to +25 μs. Such a restriction of available internal delays corresponds to changing the structural properties of the conceptualized binaural system. Restricting the range of internal delays is analogous to the introduction of the function p(τ) by Colburn (1977, Eq. 3), which describes the population density of coincidence detectors having internal delay τ. In Figure 11, different choices of W (A: 50 μs, B: 150 μs) and τc (−50, 0, and +50 μs in both A and B) are visualized in the context of a delay line network (see METHODS). The left column (Fig. 11A) shows an arrangement where the width of internal delays is reduced to a single internal delay at τc, and for which we test how the jnd of that single coincidence detector depends on τc. In the right column (Fig. 11B), the range of internal delays W is wider (150 μs) and encompasses three coincidence detectors. Again, we test how the centering of this range affects discrimination performance.
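The mapping from W and τc to a set of coincidence detectors can be sketched in a few lines of Python. This is a minimal illustration, not the analysis code used in the study: the function names are hypothetical, and it assumes internal delays spaced 50 μs apart over ±5 ms, with a detector counted as available when its center lies strictly inside the range.

```python
BIN_US = 50  # spacing of internal delays in microseconds (assumed from the text)

def delay_range(tau_c_us, width_us):
    """Return the lower and upper edge of the available internal-delay range."""
    half = width_us / 2
    return tau_c_us - half, tau_c_us + half

def detectors_in_range(tau_c_us, width_us, bin_us=BIN_US, max_delay_us=5000):
    """List the internal delays (detector centers) lying strictly inside the range."""
    centers = range(-max_delay_us, max_delay_us + 1, bin_us)
    # Integer comparison: |center - tau_c| < W / 2
    return [c for c in centers if 2 * abs(c - tau_c_us) < width_us]

print(delay_range(-50, 150))         # (-125.0, 25.0)
print(detectors_in_range(-50, 150))  # [-100, -50, 0]
print(detectors_in_range(50, 50))    # [50]
```

With W=150 μs the range holds three detectors, and with W=50 μs it collapses to the single detector at τc, in line with the two arrangements of Figure 11.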
Note that a single pair of monaural inputs is considered throughout. Therefore, if larger values of W enhance ITD sensitivity, this is not caused by effectively adding more monaural inputs, but by a more exhaustive use of the available monaural information. An increase in W enables comparison of two spike trains of one neuron over a wider range of internal delays. Consider the individual correlograms of Figure 1B: the number of coincidences at each bin gives the number of coincidences for that one pair of spike trains for one internal delay, corresponding to the bin center. For example, the number of coincidences at bin +50 μs equals the output for a W of 50 μs and τc of +50 μs (Fig. 11A, top). The numbers of coincidences in the bins centered at 0, 50, and 100 μs give the output for one pair of spike trains examined with W of 150 μs and τc of +50 μs (Fig. 11B, top). Manipulation of the values of W and τc allows us to determine the range of internal delays across pairs of monaural inputs that is optimally useful for ITD discrimination.
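To make the relation between correlogram bins and detector outputs concrete, the following Python sketch counts coincidences for one pair of spike trains at each internal delay inside the range. It is an illustration under stated assumptions, not the procedure of the study: the function names and the toy spike times are hypothetical, a spike pair is taken to be coincident when its time difference falls in the 50-μs bin centered at the internal delay, and W is assumed to be an odd multiple of the bin width.

```python
def count_coincidences(spikes_a_us, spikes_b_us, tau_us, window_us=50):
    """Count spike pairs whose time difference (b - a) falls in the bin centered at tau_us."""
    half = window_us / 2
    n = 0
    for t_a in spikes_a_us:
        for t_b in spikes_b_us:
            if -half <= (t_b - t_a) - tau_us < half:
                n += 1
    return n

def detector_outputs(spikes_a_us, spikes_b_us, tau_c_us, width_us, bin_us=50):
    """Coincidence count of every detector inside the range set by tau_c and W."""
    n_side = (width_us // bin_us - 1) // 2  # detectors on each side of tau_c
    delays = [tau_c_us + k * bin_us for k in range(-n_side, n_side + 1)]
    return {tau: count_coincidences(spikes_a_us, spikes_b_us, tau, bin_us)
            for tau in delays}

# Toy spike times in microseconds (placeholders, not recorded data)
train_a = [0, 1000, 2000]
train_b = [40, 1060, 2100]

print(detector_outputs(train_a, train_b, tau_c_us=50, width_us=50))   # {50: 2}
print(detector_outputs(train_a, train_b, tau_c_us=50, width_us=150))  # {0: 0, 50: 2, 100: 1}
```

Widening W from 50 to 150 μs adds no new spikes; it only reads out more bins of the same correlogram, which is the sense in which the available monaural information is used more exhaustively.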
Figure 12 illustrates the effect of W and τc on the sensitivity to ITD (left column) and interaural decorrelation (right column). Each row shows data of a single fiber, and each panel shows the central portion of the normalized SAC to broadband noise (thick line and left ordinate), as well as the probability of correct response (thin lines with symbols, right ordinate). In the left column, performance refers to discriminating ITDs of 0 and 100 μs, and in the right column performance refers to discriminating normalized correlations of 1 and 0.77. Different symbols refer to different values of W, from 50 to 7,500 μs as indicated in the inset (panel G), and the horizontal position of each symbol corresponds to the center value of internal delays, τc.
In all cases, widening of the range of internal delays (larger W, indicated with larger symbols) results in curves that move up, indicating improved performance. In the PHL fiber, excellent performance is reached for a range of internal delays W as small as 50 μs. Given the finite (50 μs) width of the coincidence window (the maximum temporal separation between two spikes to be “coincident”), such a small value of W amounts to using only a single internal delay centered at 150 μs (or −150 μs) in the PHL fiber of Figure 12C. However, for such small W, high performance is restricted to a narrow range of center positions τc. For flanking positions, performance quickly drops to chance. Note that the distribution of internal delays specified by Eq. 4 is rectangular, unlike the function p(τ) of Colburn (1977). To estimate performance on a different ΔITD, the curves in Figure 12 (left column) can simply be shifted along the abscissa. For example, discrimination of 150 and 250 μs is obtained by adding 150 μs to all x values. If the PHL fiber supplied a single internal delay centered at 150 μs, performance for discrimination between 150 and 250 μs would be at chance, since the trough at 0 μs in Figure 12C would now be at 150 μs. To reach maximal performance in the ΔITD task, and in particular to reach maximal performance over a range of ITDs, fibers need to supply coincidence detectors over a range of internal delays.
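The shift along the abscissa can be written out explicitly. The Python sketch below is only an illustration: the function name is hypothetical and the performance values are placeholders chosen to mimic a trough at 0 μs flanked by high performance at ±150 μs, not values read from Figure 12C.

```python
def shift_performance_curve(curve, reference_itd_us):
    """Re-reference a P(correct)-versus-tau_c curve to a new baseline ITD.

    `curve` maps the center of the internal-delay range (microseconds) to the
    probability of a correct response for discriminating ITDs of 0 and 100 us.
    """
    return {tau_c + reference_itd_us: p for tau_c, p in curve.items()}

# Placeholder values only (not data from the study)
base_curve = {-150: 0.95, 0: 0.50, 150: 0.95}

# Same curve, now describing discrimination of 150 versus 250 us
shifted = shift_performance_curve(base_curve, 150)
print(shifted[150])  # 0.5: a single detector at 150 us is now at chance
```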
Interestingly, different choices of τc are required for best performance in the two tasks. In other words, if only a narrow range of internal delays is available, the optimal position along the internal delay axis differs for the ITD and decorrelation discrimination task. For the ITD discrimination task, optimal position is at the slopes of the central peak of the correlogram, and for the decorrelation discrimination task, optimal position is at the central peak itself (Fig. 12, all panels). This holds for both AN and TB fibers, irrespective of CF.
In summary, widening of the range of internal delays improves performance (curves in Fig. 12 move up); makes it less dependent on stimulus parameters (curves in Fig. 12 flatten); and generalizes to different tasks (ΔITD and Δρ). How broad this range should be to obtain maximal performance over a wide range of parameters is something we have not systematically explored. Nevertheless, Figure 12 shows that CF, fiber type, and nature of the task will be important to consider, and that ranges that span a large fraction of the central peak of the correlogram will give the most robust performance.
We applied a coincidence analysis to mutually delayed responses to frozen noise. The resulting coincidence counts were processed by a central processor that judged whether the responses were delayed or not. The ITD sensitivity of the central processor was much higher with responses of TB fibers than with those of AN fibers and was sometimes of the same order of magnitude as human ITD jnds.
Our quantitative analyses are closely related to the theoretical work of Colburn (1973, 1977) and Colburn and Latimer (1978). The methodological connections are addressed in more detail in the METHODS section. A number of clear differences exist between Colburn’s approach and ours. Colburn’s analysis is based on the ideal observer operating on virtual data produced by a heuristic model of AN responses; we are using a heuristic decision variable extracted from actual monaural data. Colburn’s work on ITD detection uses pure-tone stimuli; we use noise stimuli. Colburn’s predictions of ITD sensitivity are based on estimates derived from decision statistics integrated over all binaural neurons (all CFs, all internal delays); our analysis mimics a single array of coincidence detectors with statistically identical inputs spanning a range of internal delays. The latter contrast probably explains why Colburn’s model predicts much smaller ITD thresholds than the ones derived from our neural data. The methodological differences, however, should not obscure the fact that Colburn’s theoretical approach could be readily transformed to the “behavioral analysis” of electrophysiological data. In that sense, our work confirms that Colburn’s approach contains all the key ingredients for the quantitative assessment of cross-coincidence analysis.
Recordings in the cochlear nucleus have shown higher vector strengths than in the AN both to pure tones (Rhode and Smith 1986; Blackburn and Sachs 1989; Carney 1990; Paolini et al. 2001) and to the envelope of amplitude modulated tones (Frisina et al. 1990; Rhode and Greenberg 1994; Wang and Sachs 1994; Joris and Smith 1998; Joris and Yin 1998). Enhanced phase-locking to fine-structure is particularly conspicuous in recordings from axons in the TB (Joris et al. 1994; Joris and Yin 1998), which are predominantly derived from bushy cells (Spirou et al. 1990; Smith et al. 1991; Smith et al. 1993) and are thus part of the monaural pathways providing input to the binaural system. In previous studies, we found that temporal enhancement in the TB also occurs in response to broadband noise, both to fine-structure and to envelope (Louage et al. 2005). It was hypothesized (Joris et al. 1994; Louage et al. 2005) that temporal enhancement serves to improve binaural sensitivity in the superior olivary complex, but direct testing was required to assess to what extent increased vector strength or higher and narrower peaks of autocorrelograms indicate “enhanced” forms of temporal coding. Indeed, TB responses can also be said to be more distorted (rather than enhanced) representations of the acoustic waveform than AN responses. In the present study, we find that ITD discrimination based on TB responses is better than that based on AN responses. The improvement is CF-dependent and cannot be captured with a single number (Figs. 5 and 7): it is most dramatic above and below the transition region between phase-locking to fine-structure and envelope, in the CF ranges at which TB correlograms differ most strongly from AN correlograms (Louage et al. 2005, Fig. 6). Together with our previous study on decorrelation sensitivity (Louage et al. 2006), these findings thus support the hypothesis that monaural processing in the anteroventral cochlear nucleus (AVCN) is advantageous for a high level of accuracy in binaural processing.
We processed spike trains of monaural neurons to test performance limits on binaural tasks. Although our analysis is inspired by actual binaural ingredients (internal delays, coincidence detection), it is not intended to closely model binaural processing. Real afferents converging on an MSO neuron are bound to differ more strongly than in our analysis (where we compare a single neuron with itself); to be more numerous (only one input from each side in our analysis); and to be less systematically arranged in internal delay (±5 ms in 50 μs steps, for most of our analysis). Moreover, there is more to binaural interaction than coincidence detection and pure time delays: the MSO also receives inhibitory inputs (Cant and Hyson 1992; Grothe and Sanes 1994; Smith et al. 1998; Brand et al. 2002; Pecka et al. 2008), there are subtractive interactions, e.g., in the lateral superior olive (Joris and Yin 1990; Finlayson and Caspary 1991; Batra et al. 1997; Tollin and Yin 2005), phase delays are known to exist (Yin and Kuwada 1983; McAlpine et al. 2001; Hancock and Delgutte 2004b; Marquardt and McAlpine 2007), etc. Finally, for none of the TB responses studied here do we have direct evidence that they terminate on MSO neurons. As discussed previously (Louage et al. 2005), we can be confident that the vast majority of the PHL fibers, which have the lowest jnds, are bushy cell axons, but they are likely a mix of spherical and globular bushy cell axons of unknown proportions but with a predominance of the latter (Joris and Smith 2008). All of these reservations make it clear that many additional factors need to be considered in a comparison of the thresholds reported here (e.g., Fig. 7) with binaural physiology or human perception, but on the other hand these reservations do not diminish our conclusion that TB fibers are superior to AN fibers in these binaural tasks.
Interestingly, the best ITD jnds of single IC neurons are comparable to the jnds found here (Shackleton et al. 2003, 2006), even though an IC neuron presumably represents only one sampling point, i.e., one internal delay value of τ. The limited analysis shown here (Fig. 12) suggests that the monaural afferents indeed allow good discrimination even when the range of available internal delays is very restricted. There are many differences in the stimuli and analyses between the study of Shackleton et al. (2003) and ours which hinder a straightforward comparison: a closer comparison between jnds at the monaural and binaural level is one topic that merits further study.
In summary, while our present and preceding studies show that temporal coding for binaural tasks improves between AN and TB, it remains to be investigated how this improvement contributes to actual binaural interaction.
ITD thresholds derived from AN fibers exhibit a marked CF dependence (Fig. 3). Up to CFs of a few kHz, responses are dominated by fine-structure synchronization, and jnds are on average 4% of the TCF of the fiber (Fig. 3B). The jnds improve with increasing CF (Fig. 3A), but abruptly deteriorate above 3–4 kHz where they steeply increase from a minimum near 20 μs to undefined values (>500 μs, Fig. 3A) for most fibers. This transition is in the range where, in response to broadband noise, synchronization to fine-structure is lost and is replaced by synchronization to the envelope (Joris 2003; Louage et al. 2004). At yet higher CFs, where synchronization is purely to envelopes, ITD jnds are also mostly undefined (Fig. 3A).
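As a rough worked example of the 4% figure, the Python sketch below converts a CF into its period TCF and scales it by that fraction. It is a back-of-the-envelope illustration only: the function names are hypothetical, the 4% value is the average reported above rather than a law, and it applies only to CFs below the transition near 3–4 kHz.

```python
def period_us(cf_hz):
    """Period corresponding to CF (T_CF), in microseconds."""
    return 1e6 / cf_hz

def jnd_from_fraction(cf_hz, fraction=0.04):
    """ITD jnd predicted if the jnd is a fixed fraction of T_CF (average value assumed)."""
    return fraction * period_us(cf_hz)

for cf in (500, 1000, 2000):
    print(f"CF {cf} Hz: T_CF = {period_us(cf):.0f} us, jnd ~ {jnd_from_fraction(cf):.0f} us")
```

Under this assumption the predicted jnd falls from about 80 μs at 500 Hz to about 20 μs at 2 kHz, consistent with jnds that improve with CF toward a minimum near 20 μs before the abrupt deterioration.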
Interestingly, the CF dependence of jnds is less marked in the TB (Figs. 5 and 7), where the lowest jnds gradually increase with increasing CF. The fact that the TB jnds at low CFs show an increase with CF, rather than the decrease seen in the AN, is likely due to the enhanced fine-structure synchronization which is particularly prominent at the lowest CFs (Joris et al. 1994; Louage et al. 2005). The fact that most TB jnds at high CFs are so much lower than in the AN probably also reflects enhanced temporal coding, but is less expected. As mentioned earlier, TB fibers show enhanced envelope coding to noise, but the difference with the AN is less marked than for coding of fine-structure (Louage et al. 2005). That the ITD jnds are mostly undefined in the AN (Fig. 3) and substantially lower in the TB (Fig. 5) is remarkable but not as mysterious as it first appears. The AN fibers at these CFs clearly show envelope phase-locking and narrow distributions of the decision variable (Fig. 2G), but these distributions are little dependent on ITD (Fig. 2H), while in the TB such dependency is present (Fig. 4H). In the AN, there is a limited range of internal delays over which the coincidence rate differs from chance (value of 1), and the correlation index, i.e., the ratio between peak values and chance, is small (Joris 2003; Louage et al. 2004; Joris et al. 2006a, b). The latter ratio is generally higher in AVCN neurons (Louage et al. 2005). Moreover, correlograms of the latter neurons often show a decrease in coincidence rate below chance level at delays flanking the central peak (Figs. 4G, 6G, and 12E; Louage et al. 2005). Both of these factors will make the distribution of the decision variable more dependent on ITD.
We observed more tolerance for SPL in the TB than in the AN (Figs. 6 and 8). In particular, ITD discrimination thresholds of high-CF TB fibers are lower over an extended SPL range when compared with AN responses. AVCN processing might thus contribute to the relative insensitivity of ITD jnds to SPL (Blauert 1983; Simon et al. 1994; van de Par et al. 2001).
The bandwidth dependence of our neural data can be summarized as follows: at low CFs, ITD jnds with broadband and narrowband noise differ only marginally, but at high CFs, ITD jnds with broadband noise are invariably lower than those with narrowband noise (Fig. 9). The most obvious explanation for this reduced ITD sensitivity of high-CF fibers is the lack of fast envelope fluctuations in narrowband responses, as reflected by the excessive widening of narrowband noise correlograms (Louage et al. 2006, Fig. 12). No qualitative differences between the AN and TB seem to exist with respect to these bandwidth effects. Effects of bandwidth similar to our findings are found in the human psychophysical literature. At high frequencies (4 and 8 kHz), psychophysical ITD discrimination of bands of noise improves with bandwidth (Bernstein and Trahiotis 1994). Several psychophysical studies have addressed ITD-induced lateralization of bands of noise of varying center frequency and bandwidth. To the extent that ITD-induced lateralization reflects sensitivity to ITDs, these psychophysical data show the same bandwidth effects as our neural thresholds: at low frequencies (<1,600 Hz), the effect of bandwidth on lateralization is marginal (Schiano et al. 1986); at high frequencies (4 and 8 kHz), the amount of lateralization increases markedly with bandwidth (Bernstein and Trahiotis 2003).
An important recent finding in binaural physiology is the inverse relationship between best delay and CF in IC neurons of the guinea pig (McAlpine et al. 2001), which has since been observed in other species and binaural structures as well (Brand et al. 2002; Hancock and Delgutte 2004a; Joris et al. 2006a, b; Siveke et al. 2006; Pecka et al. 2008). It was proposed that this relationship serves to position the binaural sensitivity such that its steepest slope is at 0 ITD so as to optimize the detection of ITDs (McAlpine et al. 2001; Harper and McAlpine 2004). At low CFs, this requires best delays outside the physiological range. The analysis here confirms that maximal ITD sensitivity is obtained near the steepest slopes (Figs. 1D and 12), but at these delays the sensitivity to decorrelation is small; vice versa, maximal sensitivity to decorrelation is at the peak, where ITD sensitivity is small (Figs. 1D and 12; Louage et al. 2006). The need for acute discrimination of both ITDs and decorrelation, combined with this difference in optimal delay, may explain why the observed distribution of best delay in the IC deviates from modeling predictions (Harper and McAlpine 2004) and shows a range of best delays in low-CF neurons (Yin and Chan 1990; McAlpine et al. 2001; Hancock and Delgutte 2004a; Joris et al. 2006a, b; Joris and Yin 2007). More generally, the results illustrate that discrimination performance benefits from the availability of a pattern of coincidences across a range of coincidence detectors with different internal delays.
This paper was substantially improved by comments on a previous version by Steve Colburn. Supported by the Fund for Scientific Research—Flanders (G.0083.02 and G.0392.05) and Research Fund K.U.Leuven (OT/09/50 and OT/05/57).
Open Access This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
Marcel van der Heijden, Phone: +31-10-7043567, Email: m.vanderheyden/at/erasmusmc.nl.
Philip X. Joris, Phone: +32-16-345741, Fax: +32-16-345993, Email: Philip.Joris/at/med.kuleuven.be.