Sensitivity to interaural time differences (ITDs) in high-frequency bandpass-filtered periodic and aperiodic (jittered) pulse trains was tested at a nominal pulse rate of 600 pulses per second (pps). It was found that random binaurally-synchronized jitter of the pulse timing significantly increases ITD sensitivity. A second experiment studied the effects of rate and place. ITD sensitivity for jittered 1200-pps pulse trains was significantly higher than for periodic 600-pps pulse trains, and there was a relatively small effect of place. Furthermore, it could be concluded from this experiment that listeners were not solely benefiting from the longest interpulse intervals (IPIs) and the instances of reduced rate by adding jitter, because the two types of pulse trains had the same longest IPI. The effect of jitter was studied using a physiologically-based model of auditory nerve and brainstem (medial superior olive neurons). It was found that the random timing of the jittered pulses increased firing synchrony in the auditory periphery, which caused an improved rate-ITD tuning for the 600-pps pulse trains. These results suggest that a recovery from binaural adaptation induced by temporal jitter is possibly related to changes in the temporal firing pattern, not spectral changes.
The experiments described here were motivated by recent work on interaural time difference (ITD) sensitivity in cochlear-implant (CI) users and past work on the binaural adaptation phenomenon observed in ITD perception of normal-hearing (NH) listeners. Hafter and Dye (1983) presented evidence that ITD sensitivity decreases for high-frequency modulated stimuli, like bandpass-filtered pulse trains, if the modulation rate is too high. By systematically varying the number and rate of pulses in a train, they found that increasing the pulse rate decreases the usefulness of the binaural information after the onset. Later, Hafter and Buell (1990) showed that a recovery from binaural adaptation can be produced by inserting a change or “trigger” in the signal. They reported a recovery from adaptation when doubling or halving one or more intervals in a pulse train with a 2.5-ms interpulse interval (IPI). They also reported a recovery from binaural adaptation to a pulse train by adding short trigger signals such as diotic sinusoids or diotic, monotic, or uncorrelated noise bursts. Hafter and Buell (1990) concluded that the recovery effect from a trigger was most likely due to a temporary spectral change in the signal. The previously described studies were performed with a fixed number of pulses. In fact, even for a periodic stimulus with a fixed duration (such as pulse trains, sinusoidally amplitude-modulated tones, or transposed tones), there is decreasing ITD sensitivity with increasing modulation rate (e.g., Bernstein and Trahiotis, 2002; Majdak and Laback, 2009).
Recent work with CIs readdressed the binaural adaptation phenomenon and introduced a new method to cause a recovery from binaural adaptation (Laback and Majdak, 2008). CIs use high-rate electric pulses to encode acoustic information. Several recent studies have shown that, similar to NH listeners, ITD sensitivity in CI listeners rapidly decreases with increasing pulse rate beyond a few hundred pulses per second (pps) (Majdak et al., 2006; Laback et al., 2007; van Hoesel, 2007). Laback and Majdak (2008) hypothesized that this pulse rate limitation is a form of binaural adaptation and showed that introducing binaurally-synchronized jitter (referred to as binaural jitter) can substantially increase ITD sensitivity at rates of 800–1515 pps. Because direct electric stimulation at one interaural electrode pair was used in that experiment, the jitter changed only the temporal properties of the stimuli, not the spectral. Therefore, they concluded that the recovery from binaural adaptation is caused by ongoing temporal changes in the signal.
In this study, we examined if a similar improvement in ITD sensitivity could be achieved in acoustic hearing by introducing binaural jitter into high-frequency bandpass-filtered pulse trains. In experiment 1, we tested the effect of binaural jitter for 600-pps pulse trains, a pulse rate at which listeners normally have difficulty in detecting waveform ITDs (Majdak and Laback, 2009). In experiment 2, we tested the hypothesis that the improvement in ITD sensitivity depends on only the longest IPIs of a jittered pulse train. We then modeled the response of the auditory periphery and brainstem to jittered pulse trains to observe the likely changes to the physiological firing patterns introduced by jitter in an attempt to understand the listeners' ITD sensitivity.
The effect of binaural jitter on ITD sensitivity has already been investigated in two earlier studies using sinusoids (Nordmark, 1976; Blauert, 1981). Nordmark (1976) reported surprisingly small just noticeable differences (JNDs) around 1.5 μs for a temporally-jittered 4-kHz carrier. Blauert (1981) replicated Nordmark's experiment and found JNDs that were two orders of magnitude larger (around 170 μs). If the latter measurement is correct, this means that ITD JNDs for jittered sinusoids are comparable to those for other high-frequency stimuli that have an amplitude modulation (AM) (Henning, 1974) or a frequency modulation (FM) (Henning, 1980). This might be expected if a jittered sinusoid is viewed as an FM with a random modulation frequency. Note, however, the fundamental difference between the studies by Nordmark (1976) and Blauert (1981), which used jittered sinusoids, and both Laback and Majdak's (2008) study and the present study, which used pulse trains. The purpose of using jittered sinusoids in the earlier studies was to present usable ITD information at high center frequencies. The purpose of this study is to investigate the effect of jitter on ITD rate limitations for pulsatile stimuli, which are commonly associated with CI processing strategies, and to more deeply understand how temporal jitter affects ITD sensitivity.
Six listeners participated in this experiment. All listeners were between 24 and 37 years old and had normal hearing according to standard audiometric tests. Two listeners were authors of this study (NH2 and NH10). All six listeners were experienced with virtual sound localization. Three listeners (NH2, NH8, and NH10) had extensive experience in lateralizing pulse trains. From preliminary tests and training, we determined that listeners NH2, NH8, and NH10 could lateralize pulse trains with relatively small amounts of jitter compared to the other listeners. Therefore, we divided the listeners into a high-sensitivity group (NH2, NH8, and NH10) and a low-sensitivity group (NH12, NH14, and NH15). Also, NH8 was markedly more sensitive to ITD than the other five listeners. Thus, he was given smaller ITD values to lateralize to avoid ceiling effects.
A personal computer system was used to control the experiment. The stimuli were output via a 24-bit stereo A/D-D/A converter (ADDA 2402, Digital Audio Denmark) using a sampling rate of 96 kHz/channel. The analog signals were sent through a headphone amplifier (HB6, TDT) and an attenuator (PA4, TDT). The signals were presented to the subjects via headphones (HDA200, Sennheiser). Calibration of the headphone signals was performed using a sound level meter (2260, Brüel & Kjær) connected to an artificial ear (4153, Brüel & Kjær).
The stimuli were 500-ms pulse trains composed of 10.4-μs monophasic pulses, corresponding to one sampling interval at a sampling rate of 96 kHz. The pulse rate was 600 pps, which has an IPI of 1667 μs. A recent study by Majdak and Laback (2009) showed that ITD sensitivity degrades to chance around 500 pps for most NH listeners, which is generally consistent with studies that use other types of modulated stimuli (e.g., Bernstein and Trahiotis, 2002). Thus, stimuli with a 600-pps pulse rate have the property that there is substantial room for improvement in ITD sensitivity.
A waveform ITD was introduced by delaying the temporal position of the pulses at one ear relative to the other ear. The ITD values were 100, 200, 400, and 600 μs for all but one listener. The other listener, who was unusually sensitive to ITDs, was tested with ITD values of 20, 50, 100, and 150 μs. To minimize the detection of ITD in the onset and offset of the stimulus, 150-ms linear ramping was applied to the pulse trains. The full-on duration of the stimuli was 200 ms. The −3-dB duration of the stimuli was 288 ms.
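The relation between the 150-ms ramps, the 200-ms full-on portion, and the 288-ms −3-dB duration can be checked with a few lines of arithmetic. The sketch below assumes that the ramps are linear in amplitude and that −3 dB is taken as an amplitude ratio of 10^(−3/20); the function name and defaults are illustrative and not taken from the original study.

```python
def minus_3db_duration_ms(total_ms=500.0, ramp_ms=150.0):
    """Time during which a linearly ramped envelope stays within 3 dB of its peak."""
    threshold = 10 ** (-3 / 20)          # amplitude ratio for -3 dB, about 0.708
    full_on_ms = total_ms - 2 * ramp_ms  # plateau between the two ramps
    # A linear ramp crosses `threshold` after threshold * ramp_ms, so the last
    # (1 - threshold) fraction of each ramp lies above the -3 dB point.
    return full_on_ms + 2 * ramp_ms * (1 - threshold)


print(round(minus_3db_duration_ms()))  # prints 288
```

Under these assumptions, each ramp contributes roughly 44 ms above the −3-dB point, which together with the 200-ms plateau gives the reported value.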
Jitter in the timing of the pulses was applied to the stimuli. Periodic pulse trains [Fig. 1(A)] had a constant IPI, whereas the jittered pulse trains [Fig. 1(B)] had randomly-varied IPIs. The nominal IPI corresponded to the average IPI over the stimulus duration. To preserve the ITD information in the pulse timing, the jitter was synchronized between the two ears (indicated by the constant length of the arrows in Fig. 1). The jitter followed a rectangular distribution, where the parameter k defines the width of the distribution relative to the nominal IPI. The parameter k ranges from 0 (periodic, no jitter) to 1 (maximum jitter). A jittered pulse train was “constructed” pulse by pulse. For each pulse added, the IPI was varied within the range of IPI·(1±k). Thus, for k=1, the largest possible IPI was twice the nominal IPI and the smallest possible IPI was zero. For the three high-sensitivity listeners, the jitter values were k=0 (periodic condition, no jitter), 1/128, 1/32, 1/8, and 1/3. For the three low-sensitivity listeners, the jitter values were k=0, 1/8, 1/3, 1/2, and 3/4.¹ Each trial used a new random jitter manifestation.
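The pulse-by-pulse construction described above can be sketched as follows. This is a minimal illustration, not the authors' stimulus code: the function name, the choice of which ear is delayed, and the use of microsecond pulse times (rather than sample indices at 96 kHz) are all assumptions.

```python
import random


def jittered_pulse_times(rate_pps, k, duration_us, itd_us=0.0, rng=None):
    """Return (left, right) pulse times in microseconds for a binaurally jittered train."""
    rng = rng or random.Random()
    nominal_ipi = 1e6 / rate_pps          # e.g. about 1667 us at 600 pps
    left, t = [], 0.0
    while t < duration_us:
        left.append(t)
        # Rectangular (uniform) jitter: each IPI lies in nominal_ipi * (1 +/- k).
        t += rng.uniform(nominal_ipi * (1 - k), nominal_ipi * (1 + k))
    # The same jittered timing is used at both ears; only a constant delay
    # (the ITD) is added, so the jitter is binaurally synchronized.
    right = [x + itd_us for x in left]
    return left, right


left, right = jittered_pulse_times(600, 1 / 3, 500_000.0, itd_us=200.0,
                                   rng=random.Random(0))
```

Because the jitter is drawn once and shared by both ears, every pulse pair carries the same ITD regardless of k, which is the property the arrows in Fig. 1 indicate.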
The pulse trains were passed through a digital sixth-order bandpass Butterworth filter. The spectral center frequency of the band was 4.6 kHz. The spectral bandwidth was 1.5 kHz. The A-weighted sound pressure level of the stimuli was 72 dB (re: 20 μPa). In a control condition, Gaussian white noises were used as stimuli, filtered by the same sixth-order Butterworth bandpass filter that was used for the jittered pulse trains.
Binaurally-uncorrelated, low-pass filtered, white noise was used to mask low-frequency components that might contain useful binaural cues. The corner frequency was 3500 Hz, with a 24-dB/oct roll-off, and the A-weighted sound pressure level of the noise was 61 dB. The sound pressure spectrum level at 2 kHz was 35.8 dB (re: 20 μPa in a 1-Hz band).
A two-interval, two-alternative forced-choice procedure was used in a lateralization discrimination test. The first interval contained a reference stimulus with zero ITD and zero k evoking a centralized auditory image. The second interval contained the target stimulus with non-zero ITD and one of the five values of k. The interstimulus interval was 400 ms. The listeners indicated whether the second stimulus was perceived to the left or to the right of the first stimulus by pressing a button. Visual response feedback was provided after each trial. The listeners controlled when the next trial began.
For the pulse trains, a block contained 2000 trials consisting of 100 presentations of four ITD and five k values in a randomized order. For the noises, a block contained 400 trials consisting of 100 presentations of four different values of ITD. The 100 repetitions per condition were presented in a balanced format with 50 targets on the left and 50 targets on the right. The chance rate was 50%. Listeners took a break every 200–250 trials, which was approximately every 15 min. The order of the two blocks was balanced over listeners.
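The block structure described above (every ITD-by-k combination presented 100 times, half to the left and half to the right, in random order) can be sketched as below. The dictionary keys and the function name are hypothetical, and the ITD and k values are those of the low-sensitivity group in experiment 1.

```python
import itertools
import random


def build_block(itds_us, k_values, reps=100, rng=None):
    """Build a randomized block with reps/2 left and reps/2 right targets per condition."""
    rng = rng or random.Random()
    trials = []
    for itd, k in itertools.product(itds_us, k_values):
        sides = ["left"] * (reps // 2) + ["right"] * (reps // 2)
        trials.extend({"itd_us": itd, "k": k, "side": s} for s in sides)
    rng.shuffle(trials)  # randomized presentation order within the block
    return trials


block = build_block([100, 200, 400, 600], [0, 1 / 8, 1 / 3, 1 / 2, 3 / 4])
print(len(block))  # prints 2000
```

Passing a single placeholder value for k (for example `[None]`) gives the 400-trial noise block.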
Listeners were trained before the main test started. The training began with stimuli that had k=1/3 and ITD=600 μs. Values of k and/or ITD were decreased as listeners' performance improved. Training continued until performance saturated. Listeners were separated into high-sensitivity and low-sensitivity groups after the training. A listener was placed in the high-sensitivity group if they could discriminate left from right more than 80% of the time for pulse trains with k=1/3 and ITDs of 200, 400, and 600 μs. The training period lasted between 4 and 8 h depending on the listener.
Figure 2 shows the results of the experiment. The high-sensitivity listeners (NH2, NH8, and NH10) are plotted in the top row. The low-sensitivity listeners (NH12, NH14, and NH15) are plotted in the bottom row. Several of the psychometric functions asymptote well below 100% correct. Several of the listeners show a decrease in percent correct (Pc) for the periodic (k=0) 400-μs condition where the ITD is ambiguous for a 600-pps pulse train. The data show that adding jitter to pulse trains increases ITD discrimination performance and eliminates the decreases in Pc due to the ITD ambiguity. The performance for the jittered pulse trains seems to be approximately limited by the performance for the bandpass-filtered noise stimuli.
A two-way repeated-measures analysis of variance (RM ANOVA) (factors: ITD and k) was performed. The values of Pc were transformed using the rationalized arcsine transform proposed by Studebaker (1985) so as not to violate the homogeneity of variance assumption required for an ANOVA. The RM ANOVA showed that the main effects were highly significant (p<0.0001 for both), but the interaction was not statistically significant (p=0.47). Tukey HSD post-hoc tests were performed separately for the two listener groups to determine the value of k that shows a significant increase from the periodic condition. For the high-sensitivity listeners, k=0 did not differ from k=1/128 (p=0.94) and k=1/32 (p=0.22); k=0 significantly differed from k=1/8 (p=0.0004) and k=1/3 (p<0.0001). For the low-sensitivity listeners, k=0 did not differ from k=1/8 (p=0.30); k=0 significantly differed from k=1/3 (p=0.001), k=1/2 (p<0.0001), and k=3/4 (p<0.0001).
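For readers unfamiliar with the rationalized arcsine transform, the sketch below shows its commonly cited form, in which a score of X correct out of N trials is mapped to rationalized arcsine units (RAU). The function name is illustrative, and it is assumed here that the transform was applied to the number of correct responses out of the 100 presentations per condition.

```python
import math


def rau(correct, n_trials):
    """Rationalized arcsine units for `correct` responses out of `n_trials`."""
    theta = (math.asin(math.sqrt(correct / (n_trials + 1)))
             + math.asin(math.sqrt((correct + 1) / (n_trials + 1))))
    # Linear rescaling so that mid-range values stay close to percent correct.
    return (146.0 / math.pi) * theta - 23.0


print(round(rau(50, 100), 6))  # prints 50.0
```

The transform leaves mid-range scores nearly unchanged while stretching scores near 0% and 100%, which is what stabilizes the variance near the floor and the ceiling.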
To more easily compare our results to those of previous studies, JNDs were estimated for each listener.² The threshold criterion was set to 70% and JNDs are reported in Table I. Some JNDs could not be computed because there were no Pc values above 70%.
The data in Fig. 2 show that introducing jitter to the pulse timing of an acoustic pulse train can substantially improve ITD discrimination performance. Depending on the sensitivity of the listener, the amount of jitter that increased ITD sensitivity was different. The high-sensitivity listeners showed significant improvements for jitter values as small as 1/8. The low-sensitivity listeners showed significant improvements for jitter values as small as 1/3.
Listeners showed no or low sensitivity to ITD in the periodic 600-pps pulse trains, which was expected based on pilot tests. In many cases for low values of k, JNDs could not be determined (ND in Table I), consistent with previous studies of ITD sensitivity at this rate (Majdak and Laback, 2009). In contrast to our results, studies using comparable-rate pulse trains reported determinable JNDs (Hafter and Dye, 1983; Dye and Hafter, 1984). This difference is most likely due to the fact that we used long (150-ms) temporal ramping and thus avoided onset cues, which are known to be important at such high rates (e.g., Saberi and Perrott, 1995; Laback et al., 2007).
Binaurally-jittered pulse trains have not been tested before in acoustic hearing. Hafter and Buell (1990) tested the effect of inserting one or three gaps of 5 or 7.5 ms in a regular pulse train with a standard IPI of 2.5 ms and observed improvements of ITD JNDs as large as a factor of 2. Even though this modification has some similarities with binaural jitter, a direct comparison to our results is hindered by the several differences in the stimuli, including the pulse rate, the signal duration, and the manner of IPI modification. Laback and Majdak (2008) reported the effect of binaural jitter in electric pulse trains presented to CI listeners. They observed large improvements in ITD sensitivity similar to the improvements of the NH listeners in the current study. Again, a quantitative comparison between the two studies is hindered by differences in the stimuli, most importantly the difference between acoustic and electric hearing, but also the different pulse rates and the fact that the electric stimuli intentionally included a slowly-varying envelope modulation.
Nordmark (1976) and Blauert (1981) measured ITD sensitivity to jittered sinusoids, which can produce random AM at the output of some auditory filters as a result of FM-to-AM conversion. Thus, the jittered sinusoids may have temporal characteristics similar to those of our jittered pulse trains. Of the two previous studies, our JNDs agree with the measurement of Blauert, who found an average JND of 173 μs for 5% jitter. For our experiment, the average JND was 213 μs for the high-sensitivity listeners for k=1/32 (about 3% jitter). The JNDs were not determinable for the low-sensitivity listeners for this jitter value. Our JNDs may be slightly larger than Blauert's because we used low-frequency masking noise, which could have increased JNDs (Bernstein and Trahiotis, 2004).
Blauert (1981) measured an average JND of 35 μs for 1-octave noise centered at 4 kHz. Our average JND was 102 μs for a 1/2-octave noise. Assuming increasing sensitivity with increasing spectral bandwidth (Bernstein and Trahiotis, 1994), our JNDs are expected to be larger than Blauert's JNDs. As mentioned before, we included a low-frequency masking noise, which could have further increased JNDs.
The results show that ITD sensitivity increases as the amount of jitter increases. This gain appears to be limited to the performance achieved with the bandpass-filtered noise stimuli. A discussion of the similarities between noise and jittered pulse trains is provided in the general discussion.
By introducing jitter, portions of the pulse trains have a relatively long instantaneous IPI, which decreases the instantaneous rate. At high center frequencies, low-rate modulated stimuli are easier to lateralize than unmodulated stimuli (Henning, 1974), or high-rate modulated stimuli (Bernstein and Trahiotis, 2002). It could be that the increase in ITD sensitivity with jitter was due to the listeners more effectively utilizing the long IPIs in the pulse train compared to the short IPIs. This hypothesis has two forms. The first form is that listeners utilized the IPIs longer than some critical absolute value. The second form is that listeners utilized the IPIs relatively longer than the surrounding IPIs. In this experiment, we tested the first form of this hypothesis. To do this, we used periodic 600-pps pulse trains and jittered 1200-pps pulse trains (k=1). If the performance at 1200 pps with jitter exceeds the performance at 600 pps without jitter, then the absolute length of the IPI cannot be the sole signal property that causes the increased ITD sensitivity with increasing jitter. This is because the maximum IPI for a 1200-pps pulse train with k=1 is precisely the IPI for a 600-pps pulse train without jitter.
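The equality of the longest IPIs in the two stimuli follows directly from the IPI·(1±k) jitter range used in experiment 1. The short check below makes the arithmetic explicit; the helper name is illustrative.

```python
def ipi_range_us(rate_pps, k):
    """Shortest and longest possible IPI (in microseconds) for a given rate and jitter k."""
    nominal = 1e6 / rate_pps
    return nominal * (1 - k), nominal * (1 + k)


shortest, longest = ipi_range_us(1200, 1.0)   # jittered 1200-pps train, k=1
periodic_600 = ipi_range_us(600, 0.0)[1]      # periodic 600-pps train
print(round(longest), round(periodic_600))    # prints 1667 1667
```

Every IPI in the jittered 1200-pps train is therefore at most as long as the constant IPI of the periodic 600-pps train, so any advantage for the jittered train cannot come from IPIs exceeding that value.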
To keep the number of resolved harmonics constant and the spectral bandwidth approximately constant in terms of critical bands [an equivalent rectangular bandwidth of approximately 2.3 at both center frequencies (Moore and Glasberg, 1983)], the spectral center frequency and bandwidth of the 1200-pps stimulus were increased by a factor of 2 relative to the 4.6-kHz pulse train. As control conditions, a 600-pps pulse train was tested at a 9.2-kHz center frequency and a 1200-pps pulse train was tested at 4.6 kHz. This allowed us to further study the effects of the rate and place parameters.
This experiment used the same methods as experiment 1 and tested three conditions. The first condition used stimuli that had a spectral center frequency of 4.6 kHz and a bandwidth of 1.5 kHz, like those of experiment 1, but with a pulse rate of 1200 pps. The jitter value was either k=0 or 1. The second and third conditions used stimuli that had a center frequency of 9.2 kHz and a bandwidth of 3 kHz. The second condition had 600-pps stimuli with k=0 or 1/3 for the high-sensitivity listeners or k=0 or 3/4 for the low-sensitivity listeners. These values of k matched the largest values of k for the 600-pps data tested in experiment 1 for particular listener groups. The third condition had 1200-pps stimuli with k=0 or 1 for all the listeners. The A-weighted sound pressure level was 72 dB for all of the stimuli. The masking noise was the same as used in experiment 1. The same six listeners participated in this experiment. For listener NH8, ITD values of 50 and 100 μs were tested. For the other listeners, the ITD values of 200 and 400 μs were tested.
Figure 3 shows the results for experiment 2. The 4.6-kHz, 600-pps data are repeated from experiment 1. From the figure, it can be easily seen that for any specific condition the performance for the jittered pulse trains was always greater than for the periodic pulse trains. For the periodic conditions, there appear to be substantial floor effects as most of the Pc values are near 50%. For some jittered conditions and listeners, there appear to be some ceiling effects.
To determine the effects of place and rate, specific cases were compared with an RM ANOVA. The values of k=1/3, 3/4, or 1 were considered as a single condition with jitter for the statistical analysis. This is reasonable because the high-performance listeners in experiment 1 seemed to show a saturation of performance (approximately the performance for the bandpass noise) at k=1/3 and the low-performance listeners at k=3/4. It is assumed that a value of k=1 would only marginally improve the performance. First, one of the most important comparisons for this experiment is between jittered 1200-pps pulse trains at 9.2 kHz and periodic 600-pps pulse trains at 4.6 kHz. There was a significant difference between these two conditions (p<0.0001). This indicates that the absolute length of the IPIs due to jittering pulse trains does not cause the observed improvements in ITD sensitivity. However, the comparison between these two conditions might be confounded by the effect of the different places and bandwidths.
The effect of place was tested using an RM ANOVA including only conditions with jitter to avoid floor effects. There was a significant decrease in performance with increasing place (p=0.001). Thus, even though there was an effect of place, it does not confound the conclusion of a larger performance for the jittered 9.2-kHz pulse trains compared to the periodic 4.6-kHz pulse trains. Rather, the effect of place reduces the difference.
It is possible that the increased sensitivity for the jittered 1200-pps pulse trains at 9.2 kHz compared to the periodic 600-pps pulse trains at 4.6 kHz is due to the increased spectral bandwidth used at 9.2 kHz. Therefore, we compared jittered 1200-pps pulse trains to periodic 600-pps pulse trains for a fixed place of 4.6 kHz. There was a significant difference between these two conditions (p<0.0001). Since the jittered 1200-pps pulse trains showed a higher performance than the periodic 600-pps pulse trains for a constant spectral bandwidth, it stands to reason that it was not the increased bandwidth that increased the performance when the place was changed.
This experiment tested the hypothesis that jitter increases ITD sensitivity because it increases some IPIs beyond some absolute duration. Long IPIs may provide a benefit to listeners due to the refractoriness of some auditory neurons. This hypothesis can be rejected because it was shown that it was much easier to lateralize jittered 1200-pps pulse trains than periodic 600-pps pulse trains while systematically varying the rate and place parameters. Varying both parameters was necessary because by changing the rate, the number of resolved harmonics in the stimulus changed. The comparisons showed that place and rate had a comparatively small effect on ITD sensitivity compared to the jittering of pulse timing.
Physiological measurements of responses of ventral cochlear-nucleus (VCN) chopper cells to maximum length sequence pulse trains (essentially jittered pulse trains) have been performed by Burkard and Palmer (1997). Their measurements showed that jitter increases the probability of a VCN neuron firing at certain time instances. Although spherical bushy VCN cells, not chopper VCN cells, project to the medial superior olive (MSO) (Smith et al., 1993), insight may be gained from Burkard and Palmer's (1997) measurements. We hypothesized that the responses of the auditory nerve (AN) fibers, the input of the VCN, would become more synchronous after the introduction of jitter. We also hypothesized that increased synchrony will cause a sharpening of rate-ITD tuning. We assumed that sharpening of rate-ITD tuning is related to improvements in ITD sensitivity. Therefore, we modeled AN and MSO responses to binaurally-jittered pulse trains.
The model of the auditory periphery that was used was developed by Meddis (2006). We will briefly describe the model. A physical acoustic stimulus was filtered by a human outer and middle ear model based on Huber et al. (2001). The filtering of a human basilar membrane was modeled with a dual-resonance non-linear filter with parameters based on Tables II and III in Lopez-Poveda and Meddis (2001). The inner-hair cell (IHC) cilia, IHC presynapse, and AN synapse parameters were from Tables II, III, and IV of Meddis (2006), respectively. Only high-spontaneous rate fibers were modeled. The refractory time for the AN fibers was 0.75 ms. The model sampling rate was 10 kHz. The input stimuli had the same parameters (level, duration, rise-fall time, etc.) as the stimuli used in the experiments.
Figure 4 shows sample post-stimulus time histograms (PSTHs) for periodic and jittered (k=0.9) 100-pps and 600-pps pulse trains with a 4.6-kHz center frequency. The auditory filter was centered at 4.6 kHz. The 100-pps pulse trains show synchronous responses to both the periodic and jittered conditions, and there is little noticeable difference in the PSTHs with the exception for the expected aperiodic timing of peaks for the jittered pulse train. In contrast, for the periodic 600-pps pulse trains, the synchrony is not evident. Additionally, the jittered 600-pps pulse train shows noticeably higher peaks in the PSTH compared to the periodic 600-pps pulse train, which can easily be seen in Fig. 4(C).
We measured the firing rate and synchrony of the AN fibers' responses. Each measurement was made over 50 unique PSTHs. All calculations were made over the entire 500-ms stimulus duration. Figure 5 shows the average firing rate and the correlation index (CIn) (described below) for the average of five pulse train manifestations for five pulse rates (100, 300, 600, 900, and 1200 pps) as a function of k (0, 0.05, 0.1, 0.25, 0.5, 0.75, 0.9, and 1). The responses of two filters with best frequencies (BFs) of 4.6 and 9.2 kHz were modeled. The AN firing rates for the 100-pps pulse trains are around 50–70 spikes/s. Theoretically, the firing rate should be near 100 spikes/s. As stated before, because of the long temporal onset and offset ramps, the −3-dB duration of the stimuli was only 288 ms. If the temporal ramping were omitted from the input stimuli, the firing rates would be higher. The other pulse rates have a higher firing rate, limited by the refractory time of the AN fibers. For both center frequencies and all pulse rates, the AN firing rate decreases slightly with jitter. For the higher pulse rates, a small decrease in firing rate may be expected with jitter because, after short IPIs, pulses will be missed because of the refractory effects. This will not necessarily be compensated by the longer IPIs because it will depend on the lengths of the surrounding IPIs.
We quantitatively measured the change in synchrony when jitter was added to a pulse train. Since jittered pulse trains are aperiodic, a common metric like the synchronization index is not appropriate. Instead, we used a metric to allow for aperiodic stimuli, called the CIn (Joris et al., 2006). The CIn is based on the counting of neural response spike coincidences from multiple presentations of the same stimulus. Mathematically, the CIn is
$$\mathrm{CIn} = \frac{2 N_c}{M(M-1)\, r^{2}\, \omega\, T},$$
where Nc is the number of individual neuron firing coincidences, r is the average firing rate, M is the number of presentations, ω is the coincidence window duration, and T is the duration of the stimulus. The factor of 2 is necessary because we used the number of unordered pairs for our calculation, not the number of ordered pairs as in Joris et al. (2006). For our modeling, we used ω=100 μs and M=50 presentations. The CIn has a value of 1 for an uncorrelated response, a value greater than 1 for a correlated response, and a value of 0 for an anticorrelated response.
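A coincidence count of this kind can be sketched in a few lines. The code below is not the authors' implementation: it assumes that the index has the form 2·Nc / [M(M−1)·r²·ω·T], which is consistent with the symbol definitions and the factor of 2 given above, and it assumes that two spikes from different presentations count as coincident when they lie within ±ω/2 of each other. Function and variable names are hypothetical.

```python
from itertools import combinations


def correlation_index(spike_trains, window_s, duration_s):
    """CIn from M spike-time lists (seconds), counting unordered coincidences."""
    m = len(spike_trains)
    total_spikes = sum(len(train) for train in spike_trains)
    rate = total_spikes / (m * duration_s)     # average firing rate r (spikes/s)
    half = window_s / 2.0                      # assumed +/- omega/2 coincidence rule
    n_c = 0
    # Each unordered pair of presentations is visited once.
    for a, b in combinations(spike_trains, 2):
        a_sorted, b_sorted = sorted(a), sorted(b)
        lo = 0
        for t in a_sorted:
            # Skip spikes in b that are too early to coincide with t.
            while lo < len(b_sorted) and b_sorted[lo] < t - half:
                lo += 1
            j = lo
            while j < len(b_sorted) and b_sorted[j] <= t + half:
                n_c += 1
                j += 1
    return 2.0 * n_c / (m * (m - 1) * rate ** 2 * window_s * duration_s)


# Two identical presentations give many more coincidences than chance.
print(correlation_index([[0.1, 0.2], [0.1, 0.2]], 100e-6, 1.0) > 1.0)  # prints True
```

With this normalization, independent Poisson-like responses are expected to give a value near 1, because the expected number of chance coincidences per pair of presentations is r²·ω·T.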
The bottom row of Fig. 5 shows the CIn. The dotted lines show the CIn for k=0. For an increase in jitter, both center frequencies and all pulse rates show an increase in CIn, hence more synchronous firing. The 100- and 300-pps pulse trains show increases for k greater than or equal to 0.5. In contrast, the higher pulse rates show an increase in CIn for values of k as small as 0.05. For a condition that showed a significant increase in ITD sensitivity in experiment 1 (high-sensitivity listeners), namely, the 600-pps pulse trains for k=1/8=0.125, there is an increase in firing synchrony.
Blauert (1981) postulated that the increase in ITD sensitivity to jittered sine tones was due to FM-to-AM conversion of the signal, which may happen due to the steep slopes of the auditory bandpass filters. He postulated this especially for off-frequency filters toward higher frequencies. To investigate the use of off-frequency cues for ITD sensitivity, we modeled the response of auditory filters with BFs from 3 to 9 kHz. We used periodic and jittered (k=0.9) 600-pps pulse trains, all with a 4.6-kHz spectral center frequency. Like before, we averaged our results over five different jitter manifestations. Figure 6 shows the results of varying the BF of the auditory filter. The jittered pulse trains always have a slightly smaller firing rate than the periodic pulse trains for all auditory filters modeled, although this difference is approximately constant for all BFs. The jittered pulse trains always have a larger CIn than the periodic pulse trains for all auditory filters modeled. The largest differences were for auditory filters away from the center frequency of the stimulus, in line with Blauert's (1981) notion that the AM in off-frequency filters could be important for detecting ITDs. Another explanation for the larger CIn (hence improved synchrony) with increasing auditory filter BF would be that the basilar membrane impulse response becomes shorter with increasing center frequency because the auditory filter bandwidth increases. The importance of the length of the basilar membrane impulse response is also supported by the results for periodic pulse trains in Fig. 5; the CIn for the 9.2-kHz band is consistently higher than the CIn for the 4.6-kHz band.
Because we observed an increase in AN firing synchrony with an increase in jitter, we wondered if such an increase could be utilized by the MSO to improve ITD perception. We simply used the response of the AN as the MSO input because it is presently unknown exactly how primary-like spherical bushy VCN cells alter the AN firing pattern. The modeled MSO neuron was a simple excitatory-excitatory coincidence counter. Thirty excitatory synapses (15 per side) provided the input to the cell. The cell fired if there was a coincident firing from the left and right inputs within a 100-μs window. One-hundred unique PSTHs were made and 30 (15 for each side) were randomly selected without replacement as the input to the MSO cell.³ After a coincidence, the cell went into a refractory state where no firing occurred for 1 ms (Scott et al., 2005; Scott et al., 2007).⁴ Each MSO measurement was repeated 100 times using different random sets of 30 PSTHs, chosen from the same pool of 100 PSTHs.
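The behavior of such a coincidence counter can be illustrated with a small event-driven sketch. It is only one possible reading of the description above: it assumes that the spikes of all synapses on one side are pooled into a single list, that any left spike and any right spike arriving within the window of each other trigger an output spike, and that inputs arriving during the refractory period are discarded. The function name and argument names are hypothetical.

```python
def mso_spike_times(left_spikes, right_spikes, window_s=100e-6, refractory_s=1e-3):
    """Output spike times (seconds) of a simple excitatory-excitatory coincidence counter."""
    # Merge both sides into one time-ordered event list; side 0 = left, 1 = right.
    events = sorted([(t, 0) for t in left_spikes] + [(t, 1) for t in right_spikes])
    last_input = [float("-inf"), float("-inf")]  # most recent usable input per side
    ready_at = float("-inf")                     # end of the current refractory period
    output = []
    for t, side in events:
        if t < ready_at:
            continue                             # cell is refractory: input ignored
        if t - last_input[1 - side] <= window_s:
            output.append(t)                     # left/right coincidence: cell fires
            ready_at = t + refractory_s
            last_input = [float("-inf"), float("-inf")]  # inputs are consumed
        else:
            last_input[side] = t
    return output


left = [0.010, 0.0105, 0.020]
right = [0.01005, 0.0106, 0.030]
print(mso_spike_times(left, right))  # prints [0.01005]
```

Delaying one side by an ITD before calling the function and dividing the number of output spikes by the stimulus duration gives one point of a rate-ITD curve under these assumptions.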
To show how an increased firing synchrony could translate to increased ITD sensitivity, we calculated MSO firing rates for binaural AN inputs with a range of ITDs. To support the psychophysical data, we expect sharper rate-ITD tuning for jittered pulse trains. The model MSO neuron had a best ITD of 0 μs. Figure 7 shows the rate-ITD curves for responses of auditory filters with BFs between 4 and 9 kHz. The input stimuli were a periodic and a jittered (k=0.9) 600-pps pulse train with a 4.6-kHz center frequency. Little difference in the tuning could be seen between the shapes of the curves for the periodic and jittered pulse trains at 4 and 5 kHz. At higher BFs, the periodic nature of the rate-ITD tuning curves is apparent for the periodic pulse trains. This periodic nature is not seen for the jittered pulse trains. As expected, because the CIn is larger for the jittered pulse trains, there was sharper rate-ITD tuning. The MSO firing rate decreases for all ITDs for best auditory filter frequencies of 6 kHz and higher. This is due to the decreasing firing rate in the AN with increasing BF, seen in Fig. 6(A).
These experiments were inspired by previous studies on the binaural adaptation phenomenon (Hafter and Dye, 1983; Hafter and Buell, 1990). The results show that introducing binaurally-synchronized jitter into the timing of high-frequency filtered pulse trains considerably improves ITD sensitivity of NH listeners, consistent with the hypothesis that a change in the ongoing signal causes a recovery from binaural adaptation.
However, while Hafter and Buell (1990) and Hafter (1997) argued that the recovery effect is mediated by a discernible short-term change to the spectrum, the study by Laback and Majdak (2008) on the effect of binaural jitter on ITD sensitivity in CI listeners suggests that temporal changes alone can cause a recovery. Direct electrical stimulation at a single interaural electrode pair allowed the introduction of jitter to the pulse timing without concomitant spectral changes. The improvements in ITD sensitivity by binaural jitter in electrical hearing were similar to those observed in the present study with acoustic hearing. Of course, it is possible that different mechanisms are responsible for the improvements in electric and acoustic hearing. Hence, we cannot entirely dismiss the possibility that spectral changes also contributed to the recovery effect in acoustic hearing. It is worth noting that jitter, created by randomly modulating the rate of pulses, will cause an additional AM signal due to FM-to-AM conversion from auditory filtering in NH listeners. In contrast, in CI listeners, the auditory filters are bypassed. However, additional AM is probably created in both NH and CI listeners in the auditory system via synaptic transmission properties and neural membrane time constants.
In experiment 2, we hypothesized that the improvement in ITD sensitivity was due only to the introduction of long IPIs. Long IPIs could reduce the instantaneous modulation rate below the lowpass cutoff of a modulation filter or allow temporary recovery from refractoriness in auditory neurons. The data showed higher performance for jittered 1200-pps pulse trains compared to periodic 600-pps pulse trains. Since the pulse trains for the two different rates had longest IPIs of the same duration, this implies that the absolute length of the IPI is not as important as the relative IPI in the context of the surrounding pulses. Similar results can be found using electric stimulation (Laback and Majdak, 2008). Two listeners for whom comparable data are available showed significantly higher percent correct scores for jittered pulse trains (k=3/4) at 1515 pps compared to the scores for periodic pulse trains at 800 pps for an ITD of 600 μs. Nevertheless, there is no reason to assume that long IPIs do not improve ITD sensitivity; having only long IPIs, however, is not sufficient to improve ITD sensitivity. Rather, the temporal jitter, which combines both long and short IPIs, seems to be the necessary condition for improved ITD sensitivity.
We modeled the neural response characteristics in order to determine if response changes in the auditory periphery and brainstem might reflect the behavioral changes in ITD sensitivity. The results indicate that jitter increases the synchrony in the neural spike pattern of the ongoing signal. This is especially the case for auditory filters with BFs higher than the center frequency of the stimulus. It is quite likely that the increased synchrony makes it easier for the binaural system to detect an ITD, given that the jitter is synchronized between the two ears.
Modeling the basic operation of MSO neurons, we also showed improved rate-ITD tuning for auditory filters with BFs higher than the center frequency of the stimulus. While the simple MSO model was able to capture some of the expected trends in the data, numerous improvements could be made to the modeling, which might show a greater contrast between the rate-ITD tuning curves for periodic and jittered signals. For example, the model could include spherical bushy VCN cells, which act as the input to the MSO and may act as monaural coincidence detectors (Carney, 1990). Also, a true physiological model of the MSO could be used (Han and Colburn, 1993), particularly one that includes elements that improve timing aspects, such as dendrites (Agmon-Snir et al., 1998). VCN bushy cells and MSO principal neurons contain low-threshold potassium channels, which are thought to play a role in coincidence detection (Manis and Marx, 1991; Smith, 1995). High-rate pulse trains, which would produce a relatively constant synaptic input to these neurons, may produce sustained activation of the low-threshold potassium channels, which, if included in the model, would suppress neuron repolarization and block firing (Colburn et al., 2008). Finally, inclusion of inhibitory effects in the VCN (Burkard and Palmer, 1997) or at higher centers like the inferior colliculus (Smith and Delgutte, 2008) may help the use of models to understand the jitter effect.
The modeling results provide an explanation for the jitter effect in terms of increased synchrony of the neural response at the level of the AN, which is not inconsistent with recent ITD sensitivity measurements in CI listeners by van Hoesel (2008). This explanation is somewhat different from the hypothesis proposed by Hafter and Buell (1990) that the recovery from binaural adaptation induced by inserting a change (trigger) in a pulse train is an active process involving some kind of change detector. Hafter and Buell (1990) reported that other types of changes applied to the ongoing part of a pulse train, such as the insertion of short trigger signals (either monotic or diotic), cause recovery. The question arises if the effect of these changes could also be explained by an increase in synchrony. The answer is probably no, since a monotic or a diotic trigger signal in the spectral frequency region of the pulse train could disrupt the ITD. This is because the changes in the neural firing pattern would be unassociated with the ITD. This seems to imply that the recovery effect observed by Hafter and Buell (1990) with those triggers is mediated via another mechanism, which may involve a true change detector. However, our modeling results do not necessarily rule out the restarting explanation of Hafter and Buell (1990), and further work needs to be done to investigate the underlying mechanisms of recovery from binaural adaptation.
An interesting aspect of the data obtained in this study, which was also seen for electric hearing in Laback and Majdak (2008), is that binaural jitter resolves the ambiguity in the ongoing ITD cue that occurs whenever the ITD exceeds one-half of the IPI. Majdak et al. (2006) showed that CI listeners lateralize periodic pulse trains to the wrong (lagging) side for fine-structure ITDs exceeding one-half of the IPI. However, in our study, for jittered pulse trains, listeners lateralize to the correct side, even for ITDs that approach or exceed one-half of the IPI. For example, the NH listeners lateralized jittered 1200-pps pulse trains (IPI = 833 μs) with a 400-μs ITD, which is about one-half IPI, to the correct (leading) side in nearly all the trials. The CI listeners in Laback and Majdak (2008) correctly lateralized jittered pulse trains at rates from 800 to 1515 pps with ITDs falling within the range of one-quarter to one IPI. There are at least two possible explanations of how binaural jitter could resolve the ITD ambiguity. First, the auditory system could process and analyze the jittered pulse trains as a temporal structure, thus integrating information across time. For a pulse train with an ITD of one-half of the IPI, the classical cross-correlation model of binaural interaction (e.g., Colburn, 1977) predicts an ambiguous pair of peaks in the case of a periodic pulse train. However, in the case of a jittered pulse train, the “wrong” peak disappears and only the peak corresponding to the correct ITD remains. Second, the auditory system could pick out interaural pulse pairs that are separated from adjacent pairs by a large IPI. This corresponds to a so-called multiple looks model (Viemeister and Wakefield, 1991), where the auditory system stores samples or “looks” of the signal in memory and accesses and processes them selectively.
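The cross-correlation argument can be illustrated with idealized pulse times. The sketch below is not the model used in this study. It ignores peripheral filtering and neural coding, and simply counts left/right pulse pairs that align at a given internal delay. It assumes that jitter draws each IPI uniformly from the nominal IPI times [1-k, 1+k]; that jitter definition, the function names, and the 20-μs tolerance are assumptions made for illustration.

```python
import numpy as np

def pulse_times(rate, n_pulses, k, rng):
    """Pulse times (s) with IPIs uniform in nominal*[1-k, 1+k] (assumed jitter rule)."""
    nominal_ipi = 1.0 / rate
    ipis = nominal_ipi * (1.0 + k * rng.uniform(-1.0, 1.0, size=n_pulses - 1))
    return np.concatenate(([0.0], np.cumsum(ipis)))

def coincidences_at_lags(left, right, lags, tol=20e-6):
    """For each lag (right minus left, in s), count pulse pairs aligned within tol."""
    diffs = (right[None, :] - left[:, None]).ravel()
    return np.array([int(np.sum(np.abs(diffs - lag) <= tol)) for lag in lags])

rate, n, itd = 1200.0, 100, 400e-6   # left ear leads by 400 us
lags = [itd, itd - 1.0 / rate]       # correct lag, and the "wrong" lag one IPI away
rng = np.random.default_rng(0)
for label, k in (("periodic", 0.0), ("jittered", 0.9)):
    left = pulse_times(rate, n, k, rng)
    right = left + itd               # binaurally synchronized jitter: same timing, shifted
    print(label, coincidences_at_lags(left, right, lags))
```

For the periodic train, the two lags collect almost the same number of aligned pairs, which is the ambiguous pair of peaks. For the jittered train, only chance alignments remain at the wrong lag, while every pulse pair still aligns at the correct one.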
Finally, based on these findings, we would like to reconsider the interpretation of experiments on the ITD sensitivity to high-frequency bandpass-filtered white noise. In particular, Bernstein and Trahiotis (1994) showed that the ITD JND for bandpass-filtered noise centered at 4 kHz is independent of noise bandwidth up to a bandwidth of at least 800 Hz. The mean envelope rate in filtered noise corresponds to about 64% of the bandwidth (Rice, 1953). Thus, the 800-Hz bandwidth corresponds to a modulation rate of 512 Hz. For sinusoidal AM tones, two-tone complexes, or transposed tones, ITD JNDs could not be measured at modulation rates of 512 Hz or greater (Bernstein and Trahiotis, 1994, 2002). In order to explain the comparatively high ITD sensitivity for the noise, Bernstein and Trahiotis (1994) suggested that the listeners may shift their attention to lower-frequency “internal filters” or critical bands, which would result in a narrower critical bandwidth and consequently a lower rate of envelope fluctuation. Bernstein and Trahiotis (1994) also noted that this strategy reduces the rate of envelope fluctuation with no loss in depth of modulation. In light of the results presented in this study, an alternative explanation for the comparatively high sensitivity to filtered noise is the temporal jitter in the envelope (see footnote 5). In other words, we propose that it is the random temporal variation in envelope maxima and minima that causes the high ITD sensitivity for noise rather than the strategy of down-shifting the internal filters. Note that down-shifting of filters cannot explain our results as our pulse trains had a constant spectral bandwidth for all amounts of jitter, including the periodic condition. Naturally, it is also possible that the variation in the amplitudes of the envelope maxima is a relevant factor in case of filtered noise. Due to the similar performance for jittered pulse trains and filtered noise in our study, we assume that the temporal variation is the more important factor.
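The envelope-rate figure quoted above follows from a one-line calculation. The symbols $f_{\mathrm{env}}$ (mean envelope rate) and $B$ (noise bandwidth) are notation introduced here for illustration only:

```latex
\[
f_{\mathrm{env}} \approx 0.64\,B = 0.64 \times 800~\mathrm{Hz} = 512~\mathrm{Hz}
\]
```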
We would like to thank Mr. Michael Mihocic for running experiments and our listeners. We would like to thank the associate editor Dr. Richard Freyman and two anonymous reviewers for numerous improvements to this work. We would like to thank Dr. Brian Moore, Dr. Zachery Smith, Dr. Andrew Brughera, Dr. Laurel Carney, and Dr. Philip Joris for useful discussions about the binaural jitter phenomenon. We would like to thank Dr. Raymond Meddis for help using his model. We would like to particularly thank Dr. Bertrand Delgutte and Dr. Kenneth Hancock for helping us to understand the pertinent physiology. This study was funded in part by the Austrian Science Fund (FWF Project No. P18401-B15).
1. The k=1/3 value was really 516/1667=0.31. We intended to use k=1/3, but there was a mistake in the experimental program.
2. JNDs were estimated from a maximum-likelihood cumulative Gaussian fit to the Pc data using PSIGNIFIT version 2.5.41 (see http://bootstrap-software.org/psignifit/, Last viewed 6/18/09), a software package for fitting psychometric functions to psychophysical data (Wichmann and Hill, 2001a, 2001b). The function used was a Weibull function.
3. Due to the extremely large amount of time needed to compute 3000 AN firing patterns (30 AN fibers × 100 MSO measurements) per condition, we assumed that 30 AN fiber PSTHs randomly chosen from a pool of 100 were sufficient to represent the variance of 3000 PSTHs.
4. Refractory times of 2 and 4 ms were also tried. The different refractory times resulted in the same basic trends in the data.
5. Perceptually, periodic pulse trains have a tonal quality. Jitter introduces a noisy or scratchy quality to the pulse trains. The physical and perceptual qualities of temporally-jittered pulse trains and noise were summarized in Pierce et al. (1977). In that study, it is stated that “… the central limit theorem tells us that at high enough rates, for which many pulses do overlap, the [jittered pulse train] approaches Gaussian noise.” Hence, the ITD sensitivity for jittered pulse trains being bounded by the performance for noise seems consistent with the fact that jittered pulse trains become physically and perceptually similar to noises.