Two forward-masking experiments were conducted with six cochlear implant listeners to test whether asymmetric pulse shapes would improve the place-specificity of stimulation compared to symmetric ones. The maskers were either cathodic-first symmetric biphasic, pseudomonophasic (i.e. with a second anodic phase longer and lower in amplitude than the first phase), or “delayed pseudomonophasic” (identical to pseudomonophasic but with an inter-phase gap) stimuli. In Experiment 1, forward-masking patterns for monopolar maskers were obtained by keeping each masker fixed on a middle electrode of the array and measuring the masked thresholds of a monopolar signal presented on several other electrodes. The results were very variable and no difference between pulse shapes was found. In Experiment 2, six maskers were used in a wide bipolar (BP + 9) configuration: the same three pulse shapes as in Experiment 1, either cathodic-first relative to the most apical or relative to the most basal electrode of the bipolar channel. The pseudomonophasic masker showed a stronger excitation proximal to the electrode of the bipolar pair for which the short, high-amplitude phase was anodic. However, no difference was obtained with the symmetric and, more surprisingly, with the delayed pseudomonophasic maskers. Implications for cochlear implant design are discussed.
Multichannel cochlear implants (CIs) attempt to mimic the natural tonotopic encoding of the cochlea by stimulating different populations of auditory nerve fibers along the implanted electrode array. One potential limitation of contemporary devices lies in the so-called “channel interactions” phenomenon. Due to the conductive properties of the perilymph in the scala tympani, the current generated at one electrode site spreads out widely over the cochlea. As a consequence, the different stimulation channels excite overlapping neural populations and presumably degrade the spectral resolution of the sounds transmitted to the brain. Although existing CIs possess up to 22 intracochlear electrodes, little or no improvement in speech recognition is observed when the number of active electrodes is increased above approximately eight (Fishman et al., 1997; Friesen et al., 2001, 2005). This may not impose a severe limitation for the perception of speech in quiet, where only a few channels are needed (Shannon et al., 1995) and where many CI users perform well. However, it may impair performance in listening situations where more independent spectral channels are needed (e.g. for the perception of speech in noise or for the perception of music), and where CI users have been shown to perform poorly (Friesen et al., 2001; McDermott, 2004). Several approaches have been followed to reduce channel interactions in CIs, and the present study forms part of this effort.
One approach is to place the electrodes closer to the excitable neural elements (Shepherd et al., 1993). Some benefits of perimodiolar over outer-wall placements have been reported, including lower thresholds, better electrode discrimination and narrower forward-masking profiles (Cohen et al., 2001; 2006). To even further minimize the distance between the electrodes and the nerve, an alternative for future CI devices may be to implant the electrode array directly within the modiolus (Badi et al., 2002; Hillman et al., 2003; Middlebrooks and Snyder, 2007).
Another approach is to manipulate the electrode configuration. Most clinical strategies use a monopolar configuration which consists of stimulating each intracochlear electrode with reference to a remote (usually extracochlear) electrode. Theoretically, configurations that involve stimulation between closer intracochlear contacts, such as bipolar or quadrupolar, should produce a more spatially-focused excitation and therefore reduce channel interactions compared to monopolar (van den Honert and Stypulkowski, 1987; Jolly et al., 1996). However, behavioral measures in CI users failed to demonstrate sharper tuning or clear speech-perception benefits with bipolar or quadrupolar configurations compared to the usual monopolar one (Pfingst et al., 1997; Mens and Berenstein, 2005; Kwon and van den Honert, 2006). One reason for this arises from the fact that, in masking experiments, different masker types are presented at equal loudnesses. As Kwon and van den Honert (2006) have pointed out, the current used for bipolar or quadrupolar configurations may need to be increased to recruit sufficient auditory nerve fibers, thereby undermining the putative increases in selectivity. For speech perception, the potential advantages of bipolar or quadrupolar stimulation may be further compromised by irregularities in neural survival patterns (Kwon and van den Honert, 2006; Bierer, 2007), and with the possible occurrence of a bimodal spatial excitation pattern in bipolar configuration (Chatterjee et al., 2006; Snyder et al., 2008).
A third possibility, which is the focus of the present study, is to modify the electrical waveform. It has been shown that pulse duration and pulse polarity are determinant factors of spatial selectivity in functional electrical stimulation (Grill and Mortimer, 1996; McIntyre and Grill, 1999, 2000, 2002). In contemporary CIs, electrical signals are trains of amplitude-modulated symmetric biphasic pulses consisting of two opposite-polarity phases (anodic and cathodic), which may both stimulate the nerve. Physiological studies in animals have shown that monophasic anodic and cathodic pulses initiate spikes at different locations along the nerve fibers, as shown by the longer latencies obtained with cathodic than with anodic stimulation (Miller et al., 1999, 2004). This latency difference is the result of anodic pulses initiating spikes at a more central locus on the fibers than cathodic pulses. The fact that anodic stimuli initiate action potentials at a more remote site than cathodic stimuli is not specific to auditory nerve stimulation and has been observed in other extracellular neural stimulation studies (cf. Ranck, 1975).
To optimize the place-specificity of stimulation, action potential initiation should ideally be restrained to neural sites proximal to the electrode. The predictions of a computational model of the human cochlea developed by Rattay et al. (2001) suggest that this might be better achieved with cathodic stimuli. These authors first calculated the extracellular potentials produced by a monopolar electrode along sixteen modeled neurons spanning the entire cochlea. They found that the across-neuron difference in extracellular voltage along their axis was larger at their peripheral end and became smaller at more central locations. This trend is due to the specific geometry of cochlear neurons which get closer to each other as they extend from their peripheral terminal to the modiolus. The model of Rattay et al. (2001) further predicts that cathodic stimuli would produce a large depolarization (and therefore elicit spikes) at the level of the peripheral processes of most neurons. Because these neurons receive different contributions from the electrode (large contribution for neurons right next to the electrode and smaller contributions for more distant ones), even two neighboring neurons have very different thresholds. In contrast, for anodic stimuli, the largest depolarization is obtained at a level central to the cell body, where fibers are more tightly packed. The resulting effect is that neighboring neurons have much more similar thresholds and may be equally excited by the same stimulus current (cf. Table 2 in Rattay et al. 2001). The predicted threshold differences between the several modeled neurons and the neuron closest to the electrode (neuron 7) are illustrated in Fig. 1 for both monophasic anodic and cathodic stimuli. It can be seen that the cathodic pulse does produce the more progressive spatial recruitment, i.e. for “x” dB above the threshold of neuron 7, the stimulus level is suprathreshold for fewer neurons with a cathodic than with an anodic pulse. 
Based on these predictions, we would expect cathodic stimulation to produce a more place-specific excitation than anodic stimulation. Note that this model of the human auditory nerve assumes that the neurons’ peripheral processes are not degenerated and also predicts that the threshold of the lowest-threshold neuron (number 7) is lower for a cathodic than for an anodic pulse. Our recent findings on polarity sensitivity in human CI users contrast with these predictions as we found the anodic phase to produce a stronger masking effect than the cathodic phase when stimulating at the same current level (Macherey et al., 2008). Although the reason for this trend remains unknown, one explanation could be a substantial loss of peripheral processes (cf. Macherey et al., 2008 for discussion). In such a “degenerated scenario”, the place-specificity difference between anodic and cathodic stimuli discussed above may not hold.
The spatial selectivity of monophasic cathodic pulses cannot be directly evaluated in CI users because this would impose the delivery of a net DC charge to the nerve, which can cause damage to the tissue (Shepherd et al. 1999). One way to approach the situation encountered with cathodic monophasic stimulation may be to use asymmetric biphasic pulses for which the contribution of the anodic phase is reduced by making it longer and lower in amplitude (Miller et al., 2001; van Wieringen et al., 2005; Macherey et al., 2006). If a monophasic cathodic pulse excites a more spatially restricted neural population than an anodic pulse, we would expect such cathodic asymmetric pulses to provide a more spatially-focused excitation than symmetric ones when presented in monopolar configuration. This hypothesis will be tested in Experiment 1.
A different reason for using asymmetric pulses was suggested by a computational model of the guinea pig cochlea (Frijns et al., 1996). Frijns et al. argued that the use of asymmetric rather than symmetric pulses would approximately double the number of independent channels of a CI with longitudinal bipolar electrodes. Their model predicted that the fibers proximal to the electrode for which the short, high-amplitude phase of the pulse is cathodic would be more effectively excited than the fibers proximal to the other electrode of the pair. In contrast, with biphasic stimulation, there would be a bimodal pattern of excitation, reflecting equal stimulation at sites near to each electrode. These predictions are consistent with more recent physiological recordings by Bonham et al. (2003), who measured neural activity along the tonotopic axis of the inferior colliculus of guinea pigs following intracochlear bipolar stimulation with asymmetric pulses. They found that the place of the peak of excitation shifted when inverting the polarity of the asymmetric pulse. This shift was also consistent with a more effective excitation of fibers near the electrode for which the short, high-amplitude phase was cathodic. This finding probably arises from the fact that, for a given current level, cathodic pulses are more effective in exciting nerve fibers than anodic pulses, as found in most animal physiological studies (e.g. Miller et al., 1999). However, as previously mentioned, our recent work on polarity sensitivity in human CI users showed the opposite trend, i.e. that anodic phases have a stronger masking effect than cathodic phases (Macherey et al., 2008). We may therefore expect the opposite effect to the one predicted by Frijns’ model and obtained by Bonham et al. (2003), i.e. that fibers proximal to the electrode for which the short, high-amplitude phase is anodic are more effectively activated. This hypothesis will be tested in Experiment 2.
Channel interactions can be assessed using several psychophysical (reviewed in Shannon et al., 2004) or electrophysiological techniques (Cohen et al., 2003; Abbas et al., 2004). In the two experiments reported here, we investigated non-simultaneous interactions using a psychophysical forward-masking paradigm (Lim et al., 1989; Shannon, 1990; Cohen et al., 1996; Chatterjee and Shannon, 1998; Chatterjee et al., 2006; Kwon and van den Honert, 2006). Forward masking refers to the threshold shift of a signal presented after a masker stimulus. It is related to the degree to which auditory nerve fibers that are important for the detection of the signal also respond to the masker, and whose response to the signal therefore is affected by neural refractoriness at the auditory nerve and more centrally. By keeping the masker on a fixed electrode and measuring the threshold shift of the signal on different electrodes spanning the entire electrode array, it is possible to measure the spatial spread of excitation produced by a given masker.
Six postlingually deafened CI users (S1-S6) participated in a series of two psychophysical forward-masking experiments. All six subjects performed Experiment 1 while only four (S1-S4) performed Experiment 2. All subjects had been implanted with a perimodiolar electrode array (HiFocus II for all subjects except S4, who has a Helix) manufactured by Advanced Bionics and consisting of 16 intracochlear electrodes. Each electrode contact has a rectangular shape (0.4 × 0.5 mm) and is made of platinum and iridium. Electrodes are numbered from 1 to 16, from the most apical to the most basal one. The distance between two adjacent contacts is 1.1 mm for the HiFocus II and 0.85 mm for the Helix. Table 1 summarizes the information for each subject, including electrode array, age, duration of deafness, duration of CI use and etiology of deafness. Testing was approved by the K.U. Leuven Medical Ethical Committee and was in accordance with the Declaration of Helsinki. Subjects were paid for participating.
Forward-masked thresholds were measured for three different masker shapes in both monopolar (experiment 1) and bipolar (experiment 2) configurations (Fig. 2). The three shapes included a symmetric biphasic cathodic-first (BI-C), a pseudomonophasic cathodic-first (PS-C) and a delayed pseudomonophasic cathodic-first (DPS-C) stimulus. Each of these three stimuli had a total duration of 400 ms, a rate of 104 pps, and the duration of the cathodic phase was always 97 μs. The duration of the second (anodic) phase was also 97 μs for the BI-C stimulus, whereas, for the PS-C and DPS-C pulses, it was eight times longer, with an amplitude reduced by the same factor in order to maintain charge-balancing. The inter-phase gap (IPG) of DPS-C was set to 4.3 ms to present the second phase approximately midway between the first phases of two consecutive pulses. Each current-source of the implant is coupled to a DC-blocking capacitor which induces a current flow during the inter-phase and inter-pulse gaps of DPS-C. The amplitude of this current flow was similar to that previously measured by Macherey et al. (2006), i.e. about 50 dB lower than the pulse amplitude.
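The timing and charge-balance constraints of the three masker shapes can be illustrated with a minimal sketch. The names `Pulse`, `make_pulse`, `net_charge` and `pulse_period_us` are hypothetical, and a zero inter-phase gap is assumed for BI-C and PS-C; only the phase duration (97 μs), the ratio of 8, the 4.3-ms gap and the 104-pps rate are taken from the text.

```python
from dataclasses import dataclass


@dataclass
class Pulse:
    first_phase_us: float    # duration of the leading (cathodic) phase
    first_amp_ua: float      # negative value = cathodic
    gap_us: float            # inter-phase gap (assumed 0 for BI-C and PS-C)
    second_phase_us: float   # duration of the anodic phase
    second_amp_ua: float     # positive value = anodic


def make_pulse(shape, amp_ua, phase_us=97.0, ratio=8, ipg_us=4300.0):
    """Build one cathodic-first pulse of the requested shape."""
    if shape == "BI-C":
        return Pulse(phase_us, -amp_ua, 0.0, phase_us, amp_ua)
    if shape == "PS-C":
        return Pulse(phase_us, -amp_ua, 0.0, phase_us * ratio, amp_ua / ratio)
    if shape == "DPS-C":
        return Pulse(phase_us, -amp_ua, ipg_us, phase_us * ratio, amp_ua / ratio)
    raise ValueError(f"unknown pulse shape: {shape}")


def net_charge(p):
    """Net charge per pulse in uA*us; zero means charge-balanced."""
    return p.first_phase_us * p.first_amp_ua + p.second_phase_us * p.second_amp_ua


def pulse_period_us(rate_pps=104.0):
    """Interval between the onsets of two consecutive pulses."""
    return 1e6 / rate_pps


def second_phase_onset_us(p):
    """Onset of the second phase relative to the onset of the first phase."""
    return p.first_phase_us + p.gap_us
```

With these values, the second phase of DPS-C starts about 4.4 ms after pulse onset, within a pulse period of about 9.6 ms, which is in line with the "approximately midway" placement described above.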
The three stimuli were used as maskers of a 19-ms BI-C signal presented 19 ms after the end of each masker (i.e. after the end of the second phase of the last masker pulse). The signal was always presented in monopolar mode (for both monopolar and bipolar maskers), had the same phase duration as the masker (97 μs), and a much higher rate of 937 pulses per second. We used a monopolar signal to avoid the marked variations in absolute sensitivity that can occur across the electrode array with bipolar stimulation (e.g. Bierer, 2007). Unlike previous forward-masking studies with CIs which used identical masker and signal, we used different stimulation rates to avoid possible confusion effects, as suggested by psychoacoustical data (Moore and Glasberg, 1982; Neff, 1985). Specifically, if the signal is presented on the same electrode as the masker and has the same pulse rate, it is possible that it will be perceived as a mere continuation of the masker. This could prevent the signal from being detected, even when it provides substantial stimulation of the auditory nerve. Furthermore, when the signal and masker are presented on different electrodes, these confusion effects would be expected to decrease, leading to an over-estimate of the degree of tonotopic selectivity (Moore et al., 1984).
The stimuli were presented through direct stimulation of the implant via the clinical programming interface provided by the implant manufacturer. The psychophysical tests were performed using the APEX software platform (Laneau et al., 2005) and the BEDCS software developed by Advanced Bionics (Litvak, 2003). As previously described in Macherey et al. (2006), a finite number of current-level values can be accessed using this equipment. Specifically, the current level is coded on eight bits and the minimal current step that can be used depends on the current range that is selected. The following experiments were performed either in the 0-255 μA or in the 0-510 μA current range. These two ranges have a different minimal current-step size of 1 μA and 2 μA respectively. For asymmetric pulses with a ratio of duration between phases of 1/8, the minimal current-step was increased by a factor of 8, leading to current-step sizes of 8 μA and 16 μA for current ranges 0-255 μA and 0-510 μA, respectively (cf. Macherey et al., 2006).
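The step sizes quoted above can be reproduced with a short sketch. It assumes that the step equals the upper limit of the range divided by the 255 non-zero codes of an 8-bit value, which is an inference consistent with the reported 1 μA and 2 μA steps rather than a documented property of the equipment. The function names are hypothetical, and rounding to the nearest step is likewise an assumption.

```python
def min_current_step_ua(range_max_ua, phase_ratio=1, bits=8):
    """Smallest usable current step for a given range and phase-duration ratio.

    Assumption: step = range_max / (2**bits - 1), scaled by the ratio between
    the long and short phases for asymmetric pulses.
    """
    levels = 2 ** bits - 1          # 255 non-zero codes for 8 bits
    return range_max_ua / levels * phase_ratio


def quantize_level_ua(current_ua, step_ua):
    """Round a requested current to the nearest accessible level (assumed rule)."""
    return round(current_ua / step_ua) * step_ua
```

For example, `min_current_step_ua(510, phase_ratio=8)` gives the 16-μA step reported for asymmetric pulses in the 0-510 μA range.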
In this experiment, we measured forward-masking patterns produced by the three maskers described in Fig. 2 (BI-C, PS-C, and DPS-C). The maskers were presented in monopolar configuration, on an electrode located in the middle of the array (electrode 9) with reference to the case (extracochlear) electrode of the implant. If the cathodic phase produces a more spatially compact neural excitation than the anodic phase, then asymmetric pulses (PS-C and DPS-C) should yield a narrower forward-masking pattern than BI-C. We used DPS-C in addition to PS-C because DPS-C has the advantage of requiring much less current than BI-C or PS-C to evoke the same loudness (Macherey et al., 2006). In the following paragraphs, the term “channel” will refer to the active electrode of the monopolar channel.
A preliminary aim of this experiment was to equate the effectiveness of the three masker types for a signal presented on the same channel, so that it is possible to directly compare the masked threshold differences produced by moving the signal to more distant locations. The experimental design was therefore divided into six steps:
The advantages and limitations of this method over those previously used will be discussed further in section IV.A. For the meantime it is worth noting that the maskers are equated not for loudness, but for the amount of masking produced at a single site. This avoids the situation of having a masker that produces a narrow spread being increased in level so that its loudness matches that of the others, and of this loudness increase being bought at the expense of a broader spread resulting from the increased current.
The masker equating procedure resulted in the masker levels displayed in Table 2. Consistent with previous reports (Macherey et al., 2006), these are usually lowest for the DPS-C stimulus. Fig. 3 shows the absolute and masked thresholds for the six subjects and the three different masker types. As expected, the masked thresholds on the masker channel (9) are similar for the three masker types. Because the pattern of absolute thresholds differed across subjects, the same data are re-plotted in terms of amount of masking in Fig. 4. The functions show in general a maximum of excitation on the masker channel and a decrease as the signal channel is more spatially remote from the masker channel. For S4, the maximum of excitation is located at a more basal region (channel 10 or 12; cf. Fig. 4). This may reflect the particular threshold trend of this subject, who shows lower absolute thresholds near the base (Fig. 3). The shift in masking peak may therefore indicate that fibers at a more basal location than electrode 9 are more effectively masked than fibers proximal to electrode 9 due to e.g. a specific pattern of neural survival or a “kink” in the electrode array. Furthermore, the absolute amount of threshold shift was variable across subjects. For example, the difference in excitation between channel 9 and 16 was only about 3 dB for S1 and 9 dB for S2 (Fig. 3).
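For readers who wish to reproduce the conversion from thresholds to amount of masking, a minimal sketch follows. It assumes that thresholds are expressed as currents in μA and that the amount of masking is the ratio of masked to absolute threshold expressed in dB (20 log10 of a current ratio); neither the function names nor this exact convention are stated in the text.

```python
import math


def db_re(current_ua, reference_ua):
    """Level of a current relative to a reference current, in dB."""
    return 20.0 * math.log10(current_ua / reference_ua)


def amount_of_masking_db(masked_threshold_ua, absolute_threshold_ua):
    """Threshold shift produced by the masker, in dB (assumed convention)."""
    return db_re(masked_threshold_ua, absolute_threshold_ua)
```

Under this convention, a masked threshold that is twice the absolute threshold corresponds to about 6 dB of masking.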
Two-way repeated-measures ANOVAs were conducted on the mean data of the six subjects. As we aimed to compare the BI-C excitation pattern with the other two, we performed two different analyses. One analysis compared the masked thresholds of BI-C and PS-C, and the other compared the masked thresholds of BI-C and DPS-C. The treatment factors were the masker type and the signal channel. If the hypothesis that asymmetric pulses are more selective than symmetric pulses is confirmed, this should be reflected by a significant interaction between masker type and signal channel as the amount of masking produced by symmetric and asymmetric maskers should differ more and more when the signal channel is moved further from the masker channel. For both analyses, there was a significant between-subject effect (F(1,5)=11.2, p=0.02 for BI-C/PS-C and F(1,5)=11.6, p=0.02 for BI-C/DPS-C). However, there was no effect of masker type (F(1,5)=0.78, p=0.42 for BI-C/PS-C and F(1,5)=0.78, p=0.42 for BI-C/DPS-C), nor of the interaction between masker type and signal channel (F(1,5)=1.26, p=0.31 for BI-C/PS-C and F(1,5)=1.26, p=0.31 for BI-C/DPS-C). The only significant contributor was the signal channel, reflecting the obvious fact that there is usually more masking near the masker channel than at other cochlear locations, whatever the masker type. Therefore, the hypothesis that asymmetric pulses with a long, low-amplitude anodic phase would improve spatial focusing compared to symmetric pulses in monopolar configuration is not supported by these data.
Because of the large inter-subject variability, the data were also analyzed statistically for each subject separately, again using two-way repeated-measures ANOVAs (with each masked threshold estimate counting as a repetition). When the interaction between masker type and signal channel was not significant, subsequent paired-samples t-tests were performed on individual channels using the Bonferroni correction. A statistical summary is given in Table 3.
S1 and S4 demonstrated a sharper excitation pattern for PS-C than for BI-C, as demonstrated by a significant effect of masker type. Although no significant interaction between masker type and channel was found, paired-sample t-tests (df=5) revealed significant differences on channel 6 (p=0.012) for S1 and on channel 4 (p=0.009), 6 (p=0.009), 12 (p=0.018), 14 (p=0.009) and 16 (p=0.018) for S4. For the other four subjects (S2, S3, S5 and S6), the BI-C and PS-C patterns were very similar and the effect of masker type was not significant.
The BI-C and DPS-C maskers produced significantly different patterns in five of the six subjects (all except S5). For S6, none of the differences observed on individual channels (6 and 14) survived the Bonferroni correction. DPS-C led to significantly less masking on channel 10 (p<0.001) and 14 (p=0.045) for S1 and on channel 6 (p=0.045) for S4. However, for S2 and S3, the opposite was true, i.e. the DPS-C pattern was significantly broader (showing more masking) than the BI-C pattern. For these two subjects, one side of the DPS-C pattern (the basal side for S2 and the apical side for S3) had a shallower slope than the BI-C pattern. This observation was corroborated by the statistical tests which revealed a significant effect of the interaction between masker type and channel for the two subjects S2 and S3. Moreover, paired-samples t-tests revealed significant differences on channel 9 (p=0.008), 10 (p=0.013), 12 (p=0.002), 14 (p<0.001) and 16 (p<0.001) for S2 and on channel 2 (p=0.002), 4 (p=0.001), 6 (p<0.001), 9 (p=0.022), and 16 (p=0.025) for S3.
In summary, there appeared to be no consistent advantages of using asymmetric rather than symmetric pulses in monopolar configuration. Although some improvements were sometimes obtained on a few channels for some of the subjects using PS-C instead of BI-C, it is worth noting that asymmetric pulses could also degrade spatial selectivity, as shown by the trends of S2 and S3 for the DPS-C masker.
As already mentioned in the Introduction, Frijns and colleagues (1996) suggested that improved spatial selectivity may be obtained using pseudomonophasic pulses in longitudinal bipolar configuration because such stimuli should focus the stimulation near one electrode of the pair. To investigate this hypothesis, we performed a second forward-masking experiment with four subjects (S1-S4) using maskers presented in a relatively wide bipolar configuration (Bipolar + 9). We chose such a large separation to maximize the probability of obtaining a bimodal excitation pattern with symmetric biphasic pulses (Chatterjee et al., 2006; Snyder et al., 2008), i.e. two peaks of excitation near each electrode of the bipolar pair. The two electrodes (4, 14) were selected because they showed similar absolute threshold values of the signal for three of the subjects (S1, S2, and S4). Given the particular threshold trend of subject S3 (Fig. 3), it was not possible to find two electrodes widely separated with similar absolute thresholds and we used the same electrode pair as for the others. Six maskers were evaluated: BI-C, PS-C, and DPS-C (cathodic-first relative to electrode 4), and the same three stimuli reversed in polarity, that we will name BI-A, PS-A, and DPS-A (anodic-first relative to electrode 4, i.e. cathodic-first relative to electrode 14).
The six masker stimuli were first loudness-balanced at MCL. This was done by first determining the MCL for the BI-A masker and further balancing the other five maskers relative to BI-A (cf. Table 2). For each loudness adjustment, two stimuli were continuously presented to the subjects, the first being the reference, constant across the experiment, the second being the target to balance, with a gap of approximately 500 ms in between. Subjects could adjust the level of the second stimulus by pressing one of six buttons (three to increase, three to decrease), each button corresponding to a different step size (one, two or three current-steps), until they perceived the levels to sound equal. For each pair of stimuli that needed to be compared, at least four loudness-balanced measures were performed.
We subsequently measured the masked thresholds of the same monopolar BI-C signal as used in Experiment 1 presented on each of the two masker electrodes and for the six different maskers. For each condition, six repetitions were performed. If the anodic phase is more effective than the cathodic phase, we would expect the asymmetric maskers to predominantly excite neurons proximal to the electrode of the bipolar channel for which the short, high-amplitude phase is anodic. Note that, here, the maskers were not equated to give the same amount of masking at a single site (as in Experiment 1) because we expected two remote masking peaks (proximal to each electrode) for the symmetric pulse shape (cf. Chatterjee et al. 2006). Furthermore, we were not interested in the sharpness of the pattern per se but rather in the relative masking differences between the two electrode sites for the different maskers.
In addition to the masking experiment, the pitches evoked by the six bipolar maskers were also compared, as we would expect different places of excitation to elicit different pitch percepts. The six stimuli were pitch ranked using the optimally efficient “Midpoint Comparison Procedure” that was originally developed to assist the fitting of auditory brainstem implants (Long et al., 2005). The algorithm involves making pitch comparisons between pairs of stimuli, with the provisional pitch-ordering being updated as more comparisons are made. Each new stimulus is initially compared to the middle-ranking one in the provisional list, and the list is successively “bisected” as more comparisons are made. This procedure was repeated at least ten times and produced a mean rank and a standard error associated with each of the six stimuli.
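The bisection logic described above can be sketched as a binary insertion into a provisional ranking. This is an illustration of the general idea rather than the published implementation of Long et al. (2005): the function names are hypothetical, `is_higher(a, b)` stands in for the listener's judgment that stimulus `a` is higher in pitch than `b`, and shuffling the presentation order on each run is an assumption.

```python
import random
import statistics


def midpoint_rank(stimuli, is_higher):
    """Return the stimuli ordered from lowest to highest pitch."""
    ranked = []
    for s in stimuli:
        lo, hi = 0, len(ranked)
        while lo < hi:
            mid = (lo + hi) // 2            # compare with the middle-ranking item
            if is_higher(s, ranked[mid]):
                lo = mid + 1                # s belongs in the upper half
            else:
                hi = mid                    # s belongs in the lower half
        ranked.insert(lo, s)
    return ranked


def rank_summary(stimuli, is_higher, runs=10, seed=0):
    """Mean rank and standard error per stimulus over repeated runs."""
    rng = random.Random(seed)
    ranks = {s: [] for s in stimuli}
    for _ in range(runs):
        order = list(stimuli)
        rng.shuffle(order)                  # assumed: random order on each run
        for position, s in enumerate(midpoint_rank(order, is_higher), start=1):
            ranks[s].append(position)
    return {
        s: (statistics.mean(r), statistics.stdev(r) / len(r) ** 0.5)
        for s, r in ranks.items()
    }
```

With a perfectly consistent listener the standard error is zero; inconsistent judgments across runs are what produce the non-zero standard errors associated with each stimulus.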
Fig. 5 shows a diagram with the expected contributions of each phase of the maskers on the excitation of fibers near electrode 4 (left part of the panel) and 14 (right part). The letters “S” and “w” correspond to expected strong and weak masking effects, respectively. The hypotheses are that a short, high-amplitude phase would produce a stronger excitation than a long, low-amplitude phase (Moon et al., 1993; Macherey et al., 2006) and that an anodic phase would produce a stronger excitation than a cathodic phase (Macherey et al., 2008). For a BI-C stimulus, the first phase is anodic relative to electrode 14 and should predominantly excite fibers near electrode 14 while the second phase is anodic relative to electrode 4 and should excite fibers near electrode 4. We would therefore expect equal excitation at both electrode sites. A similar effect is expected for BI-A. The situation should, however, be different for PS-C where the second phase has a long, low-amplitude phase and should have a weak masking effect. PS-C should therefore primarily excite fibers near electrode 14. Symmetrically, the excitation should be larger near electrode 4 for PS-A. A similar trend as for PS is expected for the DPS pulse shape because the two phases are also asymmetric.
The results of the forward-masking experiment are illustrated in Fig. 6. Each panel corresponds to one of the three pulse shapes (left for BI, center for PS and right for DPS) and shows, for each subject, the threshold shift obtained with an anodic-first masker (left part of the panels) and with a cathodic-first masker (right part). Solid lines connecting filled symbols represent the threshold shifts of a signal presented on channel 4 whereas dotted lines connecting open symbols are for a signal on channel 14. Thick lines show the mean data across subjects. For each pulse shape (BI, PS, and DPS), we performed a two-way repeated-measures ANOVA on the mean threshold shifts of the four subjects. The treatment factors were the signal channel and the leading polarity.
The masker polarity did not have any effect on the threshold shift of the signal for the BI and DPS pulse shapes, as suggested by a lack of significance of the polarity (F(1,3)=0.59, p=0.5 for BI and F(1,3)=1.6, p=0.3 for DPS) and interaction factors (F(1,3)=2.38, p=0.2 for BI and F(1,3)=0.014, p=0.9 for DPS). However, the PS pulse shape showed consistent differences in threshold shifts that were dependent on the pulse polarity, as illustrated by the opposite trend of the solid and dotted thick lines in the middle panel of Fig. 6. This result was consistent with what was expected: for a signal presented on channel 4, the threshold shift was larger when the leading (short, high-amplitude) phase of the masker was anodic relative to channel 4 (i.e. using PS-A). Similarly, for a signal presented on channel 14, the threshold shift was larger when the leading phase of the masker was cathodic relative to channel 4 (i.e. using PS-C; except for S1 who showed no difference between the two polarities for this specific channel). This finding, which is supported by a significant signal channel X polarity interaction (F(1,3)=11.9, p=0.041), demonstrates that a PS stimulus presented to a wide bipolar channel can more effectively excite one region of the cochlea over another depending on its leading polarity. As we expected from our previous study (Macherey et al., 2008) and contrary to Frijns’ model (1996), the region of the cochlea located near the electrode for which the short, high-amplitude phase is anodic was more effectively stimulated.
Fig. 7.A shows the results of the pitch ranking task. Three of the four subjects (S1, S2, and S4) ranked the PS-C stimulus as higher in pitch than the PS-A stimulus, consistent with a stronger activation of the basal part of the cochlea due to the PS-C pulse being anodic relative to electrode 14. S3 was, however, unable to rank the BI-A, BI-C, PS-A, and PS-C stimuli. It is worth noting that the three subjects who showed a difference in pitch between PS-A and PS-C were not the same three subjects who showed a clear interaction in the masking experiment. The DPS-A and DPS-C stimuli evoked similar pitch percepts and, surprisingly, were in most cases ranked as lower in pitch than the other four stimuli. Finally, we performed paired-sample t-tests on the individual ranks obtained for BI-A and BI-C for the four subjects and found the pitch of BI-C to be significantly higher than that of BI-A for S1 and S2 (p<0.02) despite the fact that there was no masking difference between electrode 4 and 14 for these two pulse shapes. It is therefore possible that the order of the phases also matters.
The mean pitch ranks were correlated with the differences in threshold shift between electrode 14 and electrode 4 for the six maskers. We would expect the pitch to be higher when this threshold shift difference is large than when it is small or negative, as a large difference would indicate a relatively stronger excitation of the basal region. Fig. 7.B shows the masking difference as a function of the pitch rank for the four subjects. Prior to performing the correlation, the masking data were standardized: for each subject, the mean threshold shift difference (averaged across maskers) was subtracted from each individual data point. The within-subjects correlation turned out to be significant (r=0.64, p=0.001).
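The standardization step described above can be sketched in a few lines of Python. This is only an illustration: the subject labels, the data values and the helper function names are hypothetical, not the measurements of this study, and the significance test is not reproduced.

```python
from math import sqrt


def center_within_subject(values):
    """Subtract the subject's own mean (across maskers) from each value."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def pearson_r(x, y):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# HYPOTHETICAL data: one value per masker for each subject.
# masking_diff = threshold shift on electrode 14 minus that on electrode 4 (dB).
subjects = {
    "A": {"pitch_rank": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
          "masking_diff": [-1.0, 0.0, -0.5, 1.0, 0.5, 2.0]},
    "B": {"pitch_rank": [2.0, 1.0, 4.0, 3.0, 6.0, 5.0],
          "masking_diff": [3.0, 3.5, 4.0, 5.0, 6.0, 5.5]},
}

# Pool the points across subjects after removing each subject's mean difference.
pooled_rank, pooled_diff = [], []
for data in subjects.values():
    pooled_rank.extend(data["pitch_rank"])
    pooled_diff.extend(center_within_subject(data["masking_diff"]))

r = pearson_r(pooled_rank, pooled_diff)
print(f"pooled correlation after within-subject centering: r = {r:.2f}")
```

Centering removes the overall offset between subjects (subject "B" has larger differences throughout in this made-up data set), so that the pooled correlation reflects only the within-subject relation between masking difference and pitch rank.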
The improvement of future CI devices will probably require an increase in the number of independent stimulation channels. The implantation of a higher number of electrodes may achieve this goal only if each electrode is able to stimulate a spatially restricted portion of the cochlea. Therefore it is necessary to develop techniques that selectively stimulate fibers proximal to an active electrode whilst producing as little excitation as possible at the other electrode sites.
In Section II, we have presented the results of a forward masking experiment that compared the spatial profiles produced by three different maskers. A complication in comparing the spatial spread of excitation for different stimuli (e.g. for different electrode configurations or different waveforms) comes from the fact that the masker levels are usually set to achieve a particular loudness percept (Chatterjee et al., 2006; Kwon and van den Honert, 2006). This can lead to masking functions that are offset vertically from each other by substantial amounts, and which need to be normalized before their sharpness can be compared. However, the normalization process may only be valid if the growth of masking is linear and also similar for all channels of the implant, as it is equivalent to multiplying all masked thresholds of a given pattern by the same arbitrary factor. The work of Nelson and Donaldson (2001; 2002) provides insight into the validity of this hypothesis. They measured the amount of masking produced on a masker channel as a function of masker level in single-pulse and pulse-train forward masking experiments. They showed the amount of masking to increase linearly with masker level (expressed in μA). However, the slope of this increase differed across subjects and, more importantly, across electrodes measured within the same subject. Extrapolating the results of their studies, the normalization of forward-masked excitation patterns relative to a given channel may well lead to an under- or an over-estimate of the degree of tonotopic selectivity in CI stimulation.
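This concern can be illustrated with a toy calculation. The sketch below assumes, purely for illustration, that the masking on each channel is proportional to the local excitation with a channel-specific slope. The function name, the slopes, the masker level and the spread value are hypothetical and are not taken from the data of Nelson and Donaldson.

```python
def normalized_edge_masking(spread, slope_peak, slope_edge, masker_level):
    """Edge-channel masking divided by peak-channel masking.

    ASSUMED toy model: masking on a channel = slope * excitation, where the
    excitation equals masker_level at the peak channel and
    spread * masker_level at the edge channel.
    """
    masking_peak = slope_peak * masker_level
    masking_edge = slope_edge * spread * masker_level
    return masking_edge / masking_peak


true_spread = 0.5  # hypothetical: the edge receives half of the peak excitation

# Equal slopes: the normalized value equals the true relative spread.
same = normalized_edge_masking(true_spread, slope_peak=1.0, slope_edge=1.0,
                               masker_level=200.0)

# Steeper growth of masking at the edge: the pattern looks broader than it is.
steeper = normalized_edge_masking(true_spread, slope_peak=1.0, slope_edge=1.5,
                                  masker_level=200.0)

# Shallower growth of masking at the edge: the pattern looks sharper than it is.
shallower = normalized_edge_masking(true_spread, slope_peak=1.0, slope_edge=0.5,
                                    masker_level=200.0)

print(same, steeper, shallower)
```

In this toy model the normalized value matches the true spread only when the two slopes are equal; unequal slopes bias the apparent selectivity in one direction or the other.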
In the present study, we have adopted an alternative method to compare the masking patterns produced by different stimuli, by equating the amount of masking on a given channel. This procedure has the advantage of avoiding normalization: if two functions are equal at the peak, then the one that is lower at the edge will always reflect better selectivity. Nevertheless, it also has its drawbacks. First, the masker-equating step may be difficult to achieve. There is some variability in the measures that may prevent determination of the exact masker levels that mask the signal by the same amount. Second, because the maskers are not compared at the same loudness, the method may not be appropriate for identifying the most spatially focused stimulus to be used in contemporary commercial implant systems. This is because existing clinical fitting procedures usually involve setting all individual channels to a particular loudness (i.e. MCL).
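The comparison rule underlying this method (equal masking at the peak, with lower masking at the edge indicating better selectivity) can be sketched as follows. The function name, the tolerance and the threshold-shift values are hypothetical; the tolerance merely stands in for the measurement variability that limits how well two maskers can be equated.

```python
def compare_at_equal_peak(pattern_a, pattern_b, peak_channel, tolerance_db=0.5):
    """Compare two masking patterns given as {channel: threshold shift in dB}.

    Returns, for every off-peak channel measured with both maskers, which
    masker produced the smaller threshold shift ("a", "b" or "tie").
    Raises ValueError if the patterns are not equated on the peak channel
    to within tolerance_db (a hypothetical criterion).
    """
    if abs(pattern_a[peak_channel] - pattern_b[peak_channel]) > tolerance_db:
        raise ValueError("maskers are not equated on the peak channel")
    result = {}
    for ch in sorted(set(pattern_a) & set(pattern_b)):
        if ch == peak_channel:
            continue
        if pattern_a[ch] < pattern_b[ch]:
            result[ch] = "a"
        elif pattern_b[ch] < pattern_a[ch]:
            result[ch] = "b"
        else:
            result[ch] = "tie"
    return result


# HYPOTHETICAL threshold shifts (dB) for two maskers equated on channel 9.
masker_a = {5: 1.0, 7: 2.5, 9: 6.0, 11: 2.0, 13: 0.5}
masker_b = {5: 2.0, 7: 3.5, 9: 6.1, 11: 3.0, 13: 1.5}

lower = compare_at_equal_peak(masker_a, masker_b, peak_channel=9)
print(lower)
```

With these made-up values, masker "a" gives the smaller threshold shift on every off-peak channel and would therefore be judged the more selective of the two, without any normalization of the patterns.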
Our methodology may, however, have potential clinical relevance for future implant designs employing high-density electrode arrays. In such a configuration, it is likely that the stimulation levels of each electrode will not be set to achieve a particular loudness percept but rather to stimulate a restricted group of fibers. Ultimately, one would want to be able to stimulate independently every surviving nerve fiber (or, more realistically, as many distinct groups of surviving nerve fibers as possible). Consequently, there is a need to identify the stimulus parameters that will optimally achieve this, and we think the method described here constitutes a step forward. Furthermore, it should provide a more direct way to compare human data with animal or theoretical modeling data because it does not involve loudness judgments by the subject. In physiological and modeling experiments, spatial selectivity usually refers to the ability to stimulate a discrete group of neurons without stimulating neighboring ones. In practice, this is assessed by measuring the spatial extent of stimulation for a number of dB above the threshold of the lowest-threshold neuron, which is usually the one closest to the electrode (e.g. Snyder et al., 2008). Our approach is similar in that we determine how much “unwanted” stimulation there is for a fixed masking effect at the masker site. Another potentially important factor that we have taken into account is the confusion effect between masker and signal (Moore and Glasberg, 1982; Neff, 1985). We have proposed a way to overcome it by using different pulse rates for masker and signal.
The forward masking patterns measured in Experiment 1 showed some variability across subjects and pulse shapes. The BI-C and PS-C maskers produced very similar patterns in four of the subjects. Only S1 and S4 demonstrated sharper tuning for a PS-C compared to a BI-C masker. This sharpening was particularly marked for S4. The fact that this subject is the only one with a Helix electrode array may only be a coincidence, as the main difference from the HiFocus II array (implanted in all other subjects) is the smaller distance between intracochlear contacts. It is unlikely that this would affect place specificity since maskers and signals were monopolar stimuli in Experiment 1.
Although there was no overall interaction between target electrode and masker type, some subjects did show a difference between the BI-C and DPS-C patterns. DPS-C clearly produced a broader excitation than BI-C in S2 and S3. Several studies have shown that less current is required to evoke the same loudness percept when the IPG is increased (McKay and Henshall, 2003; Carlyon et al., 2005; van Wieringen et al., 2005, 2006). Prado-Guitierrez et al. (2006) demonstrated that the effect of IPG on threshold was correlated with neural survival. One explanation that could account for their observation is that neural degeneration is usually accompanied by a partial demyelination of neurons. This tends to increase the time constants and could possibly reduce the effect of IPG on neural excitation. It is therefore possible that the across-channel differences between BI-C and DPS-C obtained in Experiment 1 for some subjects relate to differences in neural survival. In regions of particularly high neural survival, we would expect the IPG of the DPS-C masker to be more effective, and the DPS-C masker therefore to produce more masking relative to BI-C than in regions of poor neural survival.
The fact that we did not observe consistent results across subjects may be related to individual differences such as the position of the electrode relative to the neural elements and the specific degree of degeneration and of demyelination of the nerve fibers. All these factors are known to affect the polarity sensitivity of neurons (Ranck, 1975; Rattay et al., 2001). Also, we made the implicit assumption in Experiment 1 that the cathodic phase would be the most effective phase of PS-C and DPS-C. The effects of phase duration on threshold and loudness have previously been studied using symmetric biphasic pulses (Shannon, 1985; Moon et al., 1993). Both of these studies showed that long-phase duration pulses are usually less effective (needing more charge) than shorter ones to reach threshold or to obtain the same loudness sensation, at least for phase durations similar to those used here (97 and 776 μs). However, the exact mechanisms of interaction between phase duration and polarity in pseudomonophasic pulses are, at present, unclear, and it is possible that the long, low-amplitude anodic phase was still a significant contributor to neural excitation, as suggested by recent masking data (Macherey et al., 2008).
In Experiment 2, we demonstrated that a pseudomonophasic stimulus with no IPG presented in a relatively wide bipolar configuration (Bipolar + 9) could more specifically excite the fibers proximal to one or the other electrode of the bipolar pair, depending on its polarity. Consistent with our previous work on polarity sensitivity (Macherey et al., 2008), we showed that fibers proximal to the electrode for which the short, high-amplitude phase was anodic were more effectively excited than fibers proximal to the other electrode of the bipolar channel. This reduced bimodality was, however, not observed with a DPS masker, which is identical to PS except that it has a long IPG. There may be two different explanations for this.
First, for our asymmetric maskers, the stimulus ended with the long, low-amplitude phase. Although we would expect this phase to produce less neural excitation than the short, high-amplitude phase, it is, in the case of DPS-C, 4.4 ms closer to the signal. Hence, it is possible that the greater effectiveness of the anodic, high-amplitude phase was counteracted by it being temporally further from the signal. However, this explanation fails to account for the finding that leading polarity affects pitch for PS but not for DPS pulses (Fig. 7.A).
A second possible explanation relates to the current levels that were used. We showed that polarity sensitivity is prominent at comfortable listening level (e.g. Macherey et al., 2006), i.e. that less current is needed for PS-A than for PS-C (in monopolar mode) to evoke the same loudness. However, these two pulse shapes do not show any difference at threshold. As DPS pulses need overall less current than PS pulses to reach the same loudness (cf. Table 2), it is possible that the levels used for these stimuli are below the level at which polarity starts to matter. This would also be consistent with the fact that we observed an MCL difference between monopolar DPS-A and DPS-C for a 22-μs but not for a 97-μs phase duration in our previous publications (because short phase durations require higher current levels than longer ones to evoke the same loudness; Macherey et al., 2006; 2008).
It still remains to be demonstrated whether the polarity effect we observed using this wide bipolar configuration (Bipolar + 9) holds for narrower configurations. Snyder et al. (2008) noted that bimodal excitation patterns using bipolar, symmetric biphasic pulses have been observed in CI users only when the distance between the two electrodes is large (cf. Chatterjee et al., 2006). This probably arises from the fact that the two regions of activation overlap more and more as the distance between the electrodes is reduced. In a narrow bipolar configuration we may still expect a PS pulse to sharpen one side of the spatial pattern compared to a BI pulse and focus the excitation near one electrode of the pair, as suggested by recordings in the inferior colliculus of guinea pigs (Bonham et al., 2003). Clearly, more work is needed to confirm these predictions. Moreover, it is likely that the place-specificity of asymmetric pulses will depend on the neural survival patterns of individual patients. For example, if one electrode of the bipolar pair is in a “dead” region and the other in a “live” region, the polarity of the pulse will probably not affect the place of stimulation: only the same “live” region will be stimulated for both PS-A and PS-C pulses. Concerning the implementation of such pseudomonophasic pulses in a realistic speech processing strategy (e.g. with a pulse rate higher than 800 pulses per second per channel), some compromises will have to be made regarding the rate of stimulation, the duration of the long, low-amplitude phase, the bipolar separation and the non-simultaneity of stimulation (cf. van Wieringen et al., 2008).
An additional potential advantage of the pseudomonophasic shape arises from the fact that the electrode array of a CI is usually not inserted all the way up to the apex. Consequently, some nerve fibers lying near the apex may not be excited by existing stimulation strategies. When symmetric pulses are presented in bipolar mode, the perceived pitch (“place” pitch) may correspond to the average of the two peaks of excitation corresponding to each electrode. The use of a bipolar configuration combined with pseudomonophasic pulses may provide access to more apical (low-frequency) regions of the cochlea, by presenting the effective short, high-amplitude anodic phase of the pulse to the most apical electrode of the implant.
We would like to thank the subjects for their enthusiasm, their patience and the long time they have devoted to this research project. We also thank Leo Litvak for providing information on the electrode geometry, Dr. Frank Rattay for helpful discussions and three anonymous reviewers for critical comments on a previous version of this manuscript. This research was supported by the Fonds voor Wetenschappelijk Onderzoek FWO-Vlaanderen (FWO G.0233.01) and the Research Council of the Katholieke Universiteit Leuven (OT/03/58).
1 The differences between BI-C and DPS-C on channel 9, although small relative to the across-channel differences (0.48 and 0.35 dB for S2 and S3, respectively), were statistically significant and might have biased the results. This reveals a certain difficulty in optimally equating the maskers, due to some variability in the results of step (iii) of the experimental design. For these two subjects, the statistical analyses were repeated with the forward masking data of DPS-C shifted to give the same mean excitation as the BI-C data on channel 9. The three terms of the two-way repeated-measures ANOVA (masker type, channel and their interaction) were still statistically significant. Furthermore, the paired-sample t-tests showed that the differences between BI-C and DPS-C on the three most apical channels of S2 and on the three most basal channels of S3 remained significant.