Fundamental frequency (F0) discrimination between two sequentially presented complex (target) tones can be impaired in the presence of an additional complex tone (the interferer) even when filtered into a remote spectral region [H. Gockel, R. P. Carlyon, and C. J. Plack, J. Acoust. Soc. Am., 116, 1092-1104 (2004)]. This “pitch discrimination interference” (PDI) is greatest when the interferer and target have similar F0s. The present study measured PDI using monaural or diotic complex-tone interferers and “Huggins pitch” or diotic complex-tone targets. The first experiment showed that listeners hear a “Complex Huggins Pitch” (CHP), approximately corresponding to F0, when multiple phase transitions at harmonics of (but not at) F0 are present. The accuracy of pitch matches to the CHP was similar to that for an equally loud diotic tone complex presented in noise. The second experiment showed that PDI can occur when the target is a CHP while the interferer is a diotic or monaural complex tone. In a third experiment, similar amounts of PDI were observed for CHP targets and for loudness-matched diotic complex-tone targets. Thus, a conventional complex tone and CHP appear to be processed in common at the stage where PDI occurs.
Despite many decades of research, the mechanism underlying the pitch perception of complex tones remains a matter of some dispute. One recent finding is that even when stimuli are represented quite differently in the auditory periphery, their pitches interact more centrally in an obligatory fashion that can impair performance in a forced-choice task. Specifically, when listeners are required to compare the fundamental frequencies (F0s) of two sequentially presented harmonic complexes, each filtered so that all of its components are unresolved by the peripheral auditory system, performance can be disrupted by the addition of a group of resolved frequency components that are filtered into a lower frequency region (Gockel et al., 2004; 2005; 2009b). The degree of this “Pitch Discrimination Interference” (PDI) depends on the similarity between the F0s of the interfering and target complexes. As Gockel et al. (2004) pointed out, this suggests either that the pitches of resolved and unresolved complexes are initially processed by a common mechanism, or that, if separate pitch mechanisms do exist, they are converted at some obligatory stage of processing into a common code.
In the experiments described here, we used PDI to study the commonality of processing between two stimuli that are initially processed very differently by the auditory system: resolved harmonic complexes and broadband noises that give rise to a binaurally generated pitch, termed Huggins Pitch (HP). When samples of white noise are presented to each ear, which are identical in all frequency regions except for an interaural phase transition in a narrow frequency band, listeners perceive a faint pitch corresponding to the center frequency of that band (Cramer and Huggins, 1958). Each noise, when presented separately to one ear only, sounds just like white noise. When both noises are presented together, the perception is that of a noise, coming from the center of the head, and an additional tone, which is lateralized to one ear or the other (Raatgever and Bilsen, 1986). The perception of the HP crucially depends on the input from both ears being combined in some way. Thus, its percept must be derived from auditory processing at or higher than the level of the brainstem. Most current theories on the processing leading to the perception of a dichotic pitch assume the existence of an internal central spectrum that has a peak at the center frequency of the narrow band containing the interaural phase transition. How exactly this central spectrum is generated is still a matter of dispute (Raatgever and Bilsen, 1986; Culling et al., 1998b; Hartmann and Zhang, 2003). Two well-known binaural models, the equalization-cancellation (EC) model (Durlach, 1960; 1972) and the central-activity pattern (CAP) model (Raatgever and Bilsen, 1986), both depend on interaural delay times. In the EC model, the left and right channel are subtracted after a preceding equalization stage, while in the CAP model, the left and right channel signals which are tuned in frequency and interaural time delay are added. The details of these models are beyond the scope of this paper. 
The binaural pitch is determined by the central spectrum. In the case of a complex pitch (containing multiple phase transition regions), the pitch of the binaural input has been assumed to be determined via a central pattern recognition process similar to those that can be applied to monaural or diotic pitch stimuli (Terhardt, 1974; Goldstein, 1973), which do not require binaural interaction (Raatgever and Bilsen, 1986). Note, however, that Akeroyd and Summerfield (1999) suggested a fully-temporal account of the perception of dichotic pitches, in which the output of an analysis of interaural timing based on Culling et al.’s (1998b; 1998a) modified-equalization-cancellation model, feeds into a temporal-pitch model based on autocorrelation.
Our first experiment replicated and extended an earlier finding showing that listeners can perceive the “missing fundamental” of complex HP (“CHP”) when multiple phase transitions occur at frequencies that are integer multiples of a common F0, but when there is no transition at F0 (Bilsen, 1977); our listeners could match the binaural residue pitch of a CHP as accurately as they could match that of an equally loud diotic complex tone presented in noise. We then went on to show not only that PDI occurred between a group of monaurally or diotically presented resolved harmonics and CHP, but that the amount of this interference was similar to that when the CHP was replaced by an equally loud resolved harmonic complex presented in a noise background. Our results thus show that CHP is processed in common with monaural pitches at an obligatory stage of processing, and that PDI can occur between stimuli for which the initial stages of processing are likely to be very different.
In all experiments, a two-interval procedure was used. The details differed across experiments and will be described below for each experiment in turn. The mean F0 of the target complex tones was 200 Hz. The CHP stimulus had interaural phase transitions in frequency bands centered on the second to the fifth harmonics, i.e., centered on 400, 600, 800, and 1000 Hz. A stimulus which is generated by introducing a narrowband phase transition in a noise that is otherwise identical at the two ears leads to the perception of a pitch (termed HP-), lateralized either to the left or the right - depending on the subject, with the diotic noise being perceived at the center of the head (Raatgever and Bilsen, 1986; Hartmann and Zhang, 2003; Zhang and Hartmann, 2008). The same stimulus configuration except for a global phase shift of 180° (termed HP+) leads to the perception of a pitch that is perceived at the center of the head and a wideband noise with diffuse lateralization, i.e., lateralized to both left and right sides of the head with a tendency to spread towards the center. The HP- configuration rather than a HP+ configuration was used in the present study, as (i) the HP- configuration results in a stronger pitch than the HP+ configuration (Hartmann and Zhang, 2003) and (ii) it allowed us to also investigate the effect of relative lateralization of target sound and interfering sound for the maximum achievable difference between their perceived locations, i.e., to opposite sides (for more details see Exp.II).
All stimuli were generated in MATLAB. The CHP was generated from a 1000-ms band-limited Gaussian noise sampled at 40 kHz. It was generated in the spectral domain by first applying an FFT to the noise and then modifying the phases of one of two matched buffers representing the left and right channels. A linear shift with frequency from 0 to 2π radians was added to the phases for frequency components from 3% below to 3% above the center frequency of the chosen harmonics. Applying an inverse FFT to the two spectral buffers gave the signal waveforms for the left and right channels. One out of 21 pre-generated realizations of the CHP, based on 21 realizations of Gaussian noise, was selected at random for each presentation of a CHP. The Gaussian noise extended up to 1.1 kHz in Experiment 1, where the F0 of the CHP was fixed at 200 Hz, and up to 1.2 kHz in all other experiments, where the mean F0 of the CHP stimuli was roved between trials. Its spectrum level was 37.8 dB (re: 20 μPa). The overall root-mean-square (rms) level of the CHP stimulus was 68.2 dB SPL in Experiment 1 and 68.6 dB SPL in the other experiments.
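The phase manipulation described above can be written as a function of frequency. The following is a minimal Python sketch, not the authors' MATLAB code; the function name `interaural_phase_shift` and its parameters are illustrative assumptions. It returns the phase added to one channel: a linear ramp from 0 to 2π across the band from 3% below to 3% above each chosen harmonic, and zero elsewhere.

```python
import math

def interaural_phase_shift(freq_hz, f0_hz=200.0, harmonics=(2, 3, 4, 5), rel_width=0.03):
    """Phase (radians) added to one channel at freq_hz; 0 outside the transition bands."""
    for h in harmonics:
        centre = h * f0_hz
        lo = centre * (1.0 - rel_width)   # 3% below the harmonic
        hi = centre * (1.0 + rel_width)   # 3% above the harmonic
        if lo <= freq_hz <= hi:
            # Linear shift with frequency from 0 to 2*pi across the band.
            return 2.0 * math.pi * (freq_hz - lo) / (hi - lo)
    return 0.0
```

Under these assumptions the shift equals π at the center of each band, so the two channels are in antiphase exactly at 400, 600, 800, and 1000 Hz when F0 is 200 Hz. In a full synthesis this shift would be added to the phase of each FFT bin of one channel before the inverse FFT.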
The duration of all stimuli was 1000 ms, including 40-ms raised-cosine onset and offset ramps. The silent interval between the two intervals within a trial was 500 ms. All tones were generated digitally. They were played out using a 16-bit digital-to-analog converter (CED 1401 plus), with a sampling rate of 40 kHz. Stimuli were passed through an antialiasing filter (Kemo 21C30) with a cutoff frequency of 17.2 kHz (slope of 96 dB/oct), and presented using Sennheiser HD250 headphones.
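As an illustration of the gating, the sketch below builds an amplitude envelope for a 1000-ms stimulus at 40 kHz with 40-ms raised-cosine ramps included in the total duration. It is a hedged example: the function name and the exact ramp formula (half a period of a raised cosine) are assumptions, since the text does not give the formula used.

```python
import math

def raised_cosine_envelope(n_samples, fs_hz=40000, ramp_s=0.040):
    """Amplitude envelope with raised-cosine onset and offset ramps."""
    n_ramp = int(round(ramp_s * fs_hz))   # 1600 samples for 40 ms at 40 kHz
    env = [1.0] * n_samples
    for i in range(n_ramp):
        gain = 0.5 * (1.0 - math.cos(math.pi * i / n_ramp))
        env[i] = gain                     # onset ramp, rising from 0
        env[n_samples - 1 - i] = gain     # offset ramp, mirror image
    return env

envelope = raised_cosine_envelope(40000)  # 1000 ms at 40 kHz
```

Multiplying a stimulus waveform sample by sample with `envelope` would apply the ramps.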
The objective of the first experiment was to determine whether a CHP tone, without a phase transition at the F0, is perceived by human listeners as having a residue pitch, and to determine how salient that pitch is. We are aware of only one other study investigating the perception of CHP in the absence of the fundamental. In that study (Bilsen, 1977) there were two phase transitions (at 600 Hz and 800 Hz) and only two subjects, one of whom was Bilsen himself. Many listeners do not hear a residue pitch when a complex tone contains only two harmonics, even with monaural or diotic pitches (Smoorenburg, 1970). In Bilsen’s study, his own matches of a pure tone frequency to the pitch of a CHP showed an average value of 205 Hz with a relatively narrow distribution, while the matches of the second subject showed a much wider distribution around 200 Hz. Here, the aim was to establish the pitch values and distribution of pitch matches for a greater number of listeners, for CHP with four phase transitions rather than two, which would be expected to lead to a clearer residue type pitch.
Listeners adjusted the F0 of a complex tone with harmonics seven to 14 to match the residue pitch of a CHP with four phase transitions, centered on harmonics two to five of 200 Hz. Pitch matches to the CHP were compared with pitch matches to a diotic complex also containing harmonics two to five of 200 Hz, presented either in quiet or simultaneously with a diotic noise. When the diotic tone complex was presented with the diotic noise, the noise itself was identical to the noise used to generate the CHP stimuli (before phase shifts were applied), while the level of the tone complex was adjusted such that it had the same loudness as the tonal component in the CHP stimulus. The level of the diotic tone complex necessary to achieve this was determined for each subject individually in a loudness-matching experiment. Comparison of the distribution of the pitch matches to the CHP with that for the diotic complex tone, presented either in silence or at equal loudness in noise, provided information about the relative salience of the residue pitches derived from the three stimuli.
The CHP and the diotic complex tone in noise, both “containing” harmonics two to five of a fundamental of 200 Hz, were presented in alternation. One of two virtual boxes lit up on a computer monitor in synchrony with each presentation of the CHP and the diotic complex. The CHP stimulus and the noise were fixed in level (68.2 dB SPL rms at each ear) and subjects had to adjust the level of the diotic complex so that its loudness was equal to that of the tonal component in the CHP. Subjects adjusted the level by moving a virtual slider on the monitor. The slider scale was marked from “−5” at the left hand side to “+5” on the right hand side, in steps of one. Moving the slider from “0” to “−5” attenuated the diotic complex by 5 dB on the next trial. Moving the slider from “0” to “+5” increased the level of the diotic complex by 5 dB on the next trial. The slider could be moved to any position, thus allowing fine adjustments in level. After each presentation of the two stimuli, the slider’s position was automatically moved to “0”, before subjects could indicate the next desired adjustment of the level of the diotic tone complex. If subjects did not move the slider, the same stimulus pair was presented again. Subjects were encouraged to “bracket” the matching level several times, by making the diotic complex clearly softer than the tonal component of the CHP and then clearly louder (or vice versa), before making the fine adjustments. They were also encouraged to listen a few times to the same stimulus pair before indicating the next adjustment with the slider. Subjects pressed a virtual button on the monitor to indicate when they were satisfied with the loudness match. The matching level was defined as the level of the diotic tone complex presented immediately before the subject indicated a loudness match.
The starting level of the diotic complex was varied quasi-randomly in the range from 50-60 dB SPL per component, i.e., 56-66 dB SPL overall rms level. This range was chosen after some informal listening so that it covered levels at which the diotic tone was clearly louder or clearly softer than the tonal component in the CHP. For each presentation of the stimulus pair, one out of the 21 pre-generated CHP realizations was chosen randomly and the diotic noise that was presented simultaneously with the diotic tone in the other interval was identical to the noise used to generate the CHP stimulus (before phase shifts were applied).
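The conversion between per-component and overall rms level follows from summing the power of equal-level components. A small Python sketch of that relation is given below; the function name is an illustrative assumption, and the components are assumed to add incoherently.

```python
import math

def overall_level_db(per_component_db, n_components):
    """Overall rms level of n equal-level components whose powers add."""
    return per_component_db + 10.0 * math.log10(n_components)

# Four harmonics (two to five) at the ends of the starting range.
low = overall_level_db(50.0, 4)    # about 56 dB SPL
high = overall_level_db(60.0, 4)   # about 66 dB SPL
```

Four components add about 6 dB to the per-component level, which gives the 56-66 dB SPL range quoted above.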
The interval that contained the CHP was varied across blocks. A message on the monitor indicated to the subjects whether the level of the tone in the first or the second interval had to be adjusted to be equally loud to the tone heard in the other interval. At least 20, but more typically 40, loudness matches were collected for each subject (20 matches for each order of the two stimuli).
Subjects adjusted the F0 of a complex tone with harmonics seven to 14 to match: (1) the residue pitch of a CHP with phase transitions at 400, 600, 800, and 1000 Hz presented at an rms level of 68.2 dB SPL; (2) the residue pitch of a diotic complex with harmonics 400, 600, 800, and 1000 Hz presented in quiet at an rms level of 38 dB SPL; and (3) the residue pitch of the same diotic complex as in (2) but presented simultaneously with a diotic noise which was the same as in the CHP (one out of 21 realizations randomly drawn for each trial), at a level that was determined for each subject so that the tonal component was equally loud as for the CHP (around 62 dB SPL rms level, see results below). The matching complex was presented diotically in quiet with an rms level of 35 dB SPL (26 dB SPL per component).
The pitch matching procedure was essentially the same as the loudness matching procedure described above. Moving the slider from 0 to +5 or −5 increased/decreased the F0 of the matching sound to be presented on the next trial by 5 Hz. The slider could be moved to any position and returned to its central value at the start of each trial. The starting value of the F0 of the adjustable sound varied randomly in the range 200 ± 15 Hz. The interval that contained the matching sound was varied across blocks. A message on the monitor indicated to the subjects whether the F0 of the tone in the first or the second interval had to be adjusted to match the pitch of the tone heard in the other interval. For each subject and condition, 50 pitch matches were collected for each order of the two stimuli. The results from the two orders of the stimuli were averaged.
Four subjects, with various degrees of musical training, participated in all conditions of experiment 1. They ranged in age from 19 to 28 years, and their quiet thresholds at octave frequencies between 250 and 4000 Hz were within 15 dB of the ISO (2004) standard. To familiarize subjects with the procedure and equipment, they were given between 2 and 4 h of practice.
There were no significant differences between the loudness matches obtained for the two orders of stimulus presentation, and so the results were averaged across orders. The mean rms level of the diotic complex presented simultaneously with noise at the point of equal loudness with the tonal part of the CHP was (standard error, SE, in brackets) 62.1 dB SPL (0.2) for subject 1, 61.9 dB SPL (0.1) for subject 2, 61.3 dB SPL (0.27) for subject 3 and 63.0 dB SPL (0.14) for subject 4. The mean (and SE) across subjects was 62.1 dB SPL (0.3).
The noise had a spectrum level in its passband of 37.8 dB (re: 20 μPa), and an rms level of 65.6 dB in the frequency band from 400-1000 Hz, which contained the harmonics of the diotic complex. Therefore, equal loudness of the tonal component of the CHP and the diotic tone complex in noise occurred when the signal-to-noise ratio of the latter was only −3.5 dB. Figure 2 shows the excitation patterns (following Moore et al., 1997) calculated separately for the noise background (dashed line) and for the diotic tone complex (dotted line) at the level of equal loudness with the tonal part of the CHP. The excitation pattern for the combined stimulus is shown by the solid line. At the level of equal loudness, the partial loudness of the diotic tone complex in the white noise was calculated as 22.5 phons or 0.192 sones (following Moore et al., 1997). This level is 6.6 dB above the masked threshold for the diotic tone in the noise predicted by Moore et al.’s (1997) loudness model. The latter result indicates that the loudness of the CHP corresponds approximately to that of the diotic tone complex presented at a sensation level of 6.6 dB SL. Overall, these results show that a discernible but relatively faint pitch was heard in the CHP, with good agreement between the four subjects.
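The band level and signal-to-noise ratio quoted above can be reproduced from the spectrum level with a short calculation. The Python sketch below assumes a flat noise spectrum within the band; the function and variable names are illustrative.

```python
import math

def band_level_db(spectrum_level_db, bandwidth_hz):
    """rms level of flat-spectrum noise in a band of the given width."""
    return spectrum_level_db + 10.0 * math.log10(bandwidth_hz)

noise_in_band = band_level_db(37.8, 1000.0 - 400.0)  # about 65.6 dB
tone_level = 62.1                  # mean matched level of the diotic complex
snr_db = tone_level - noise_in_band                   # about -3.5 dB
```

Under the further assumption that the noise is flat from 0 Hz up to its cutoff, the same function gives about 68.2 dB for a 1.1-kHz bandwidth and about 68.6 dB for 1.2 kHz, in line with the overall CHP levels reported for the experiments.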
Figure 3 shows the means and distributions of the F0 of the diotic complex containing harmonics 7-14 when its pitch was matched to that of the target sound. There were no significant differences between the pitch matches obtained for the two orders of stimulus presentation, and results were averaged across the two orders. Figures 3(a)-(d) show the 90th, 75th, 50th, 25th, and 10th percentiles of the distributions of the pitch matches for each of the four subjects. Figure 3(e) shows the mean matched F0 across subjects and the size of the typical SE, i.e., the average across the four individual SEs that were first calculated separately for each subject.
The mean F0 matched to the CHP was 202.3 Hz. In spite of some differences between subjects with regard to their pitch matching reliability, the agreement between subjects was good. The matched F0 was somewhat above the “true F0” of the CHP of 200 Hz, but this was also true for the mean matched F0s to the other target sounds (201.6 Hz for the diotic tone in silence and 203.2 Hz for the diotic tone in noise). A repeated measures one-way ANOVA, using the mean of the matched F0s from each subject and condition as input, showed no significant difference between the mean pitch matches across the three different target sounds. There was a tendency for the SE of the matches to be smallest for the diotic tone in silence. However, a repeated measures one-way ANOVA, with the SE of the matched F0s from each subject and condition as input, showed no significant difference between the SE of the pitch matches across the three different target sounds.
Overall these results show that the CHP, in the absence of a phase transition at the F0, evoked a residue pitch which corresponded well to that of a diotic complex tone with harmonics at frequencies corresponding to the center frequencies of the phase transitions in the CHP. The salience of the residue pitch of the CHP, as assessed by the SE of the pitch matches, corresponded well with that of the loudness-matched diotic complex in noise.
Having established that a residue pitch can be perceived for a CHP with missing fundamental, the main objective of the second experiment was to investigate whether PDI would occur between the CHP as the target and a monaural or diotic complex tone as the added sound (interferer). The perception of the residue pitch of a CHP and a monaural or diotic residue pitch involve, at least initially, different processes: perception of the former requires binaural interaction while perception of the latter does not (see Introduction). If PDI were observed between these two stimuli, it would indicate that PDI can occur between stimuli that are initially processed in a different way. This in turn would be consistent with the idea mentioned in the Introduction that the pitches of resolved and unresolved complexes could initially be processed by different mechanisms rather than the same mechanism, and that the reported PDI might have occurred at a later stage of processing where the pitch information was converted into a common code. Gockel et al. (2009b) recently reported significant PDI between complex tones presented to opposite ears, showing that PDI can occur at or after the stage where pitch information from the two ears has been combined. This strengthens the possibility that PDI might be observed between a CHP and a monaural or diotic pitch.
The second objective was to investigate whether the previously reported dependence of PDI on the similarity between the F0s of the target and the interferer would also be observed between a CHP target and a monaural or diotic interferer. To assess this, the F0 of the interferer either corresponded to the nominal F0 of the target or it was increased by 40%. If a similar dependence on F0 similarity was observed, it would support the interpretation that the impairment with a CHP target and PDI between monaural pitches are caused by similar processes.
The third objective was to investigate the role of perceived location of the target and interferer. Gockel et al. (2009b) observed significantly less PDI when the interferer was presented contralaterally to the target than when it was presented ipsilaterally. Thus, relative ear of entry of target and interferer played an important role in PDI. In the present study, the perceived location of the tonal part of the CHP varied across subjects, but was very reliable within a subject (Zhang and Hartmann, 2008). By presenting the interferer either to the left ear, or to the right ear, or diotically, its perceived location was varied relative to that of the tonal part of the CHP, the latter varying across subjects.
In the non-shifted F0 conditions, the interferer was a complex tone containing harmonics 7-14 with an F0 corresponding to the nominal F0 of the target. Thus the interferer contained higher harmonics than the target, but still had a salient pitch (Moore and Glasberg, 1988; Houtsma and Smurzynski, 1990; Moore and Peters, 1992; Moore et al., 2006). The level of the interferer was chosen such that it was perceived as equal in loudness to the tonal component in the CHP target stimulus. The level of the tone complex necessary to achieve this was determined individually for each subject in a loudness-matching experiment (see Appendix).
Subjects had to discriminate between the F0s of two sequentially presented CHPs, i.e., they had to indicate which of the two HP complexes, both with phase transitions around harmonics 2-5, had the higher F0. The mean F0 was varied across trials from 181.8 Hz (200/1.1) to 220 Hz (200×1.1), to encourage subjects to compare the pitch of the two targets presented in each interval. In each trial, in one randomly chosen interval the target complex had an F0 equal to F0−ΔF0/2, while in the other interval its F0 was F0+ΔF0/2. The difference in F0 between the two target tones in a trial, ΔF0, was fixed, and percent-correct performance was measured. The size of ΔF0 was chosen for each subject individually, in a preliminary experiment, so that performance in terms of d' was between about 1.6 and 1.8 in condition None, which was the easiest condition. The values of ΔF0 were 1.4%, 0.9%, 1%, 1.8%, and 1.4% for subjects 1, 2, 3, 4, and 5 respectively. Correct-answer feedback was provided after every trial.
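The trial structure can be sketched as follows. This is a hedged Python illustration rather than the authors' code: the function name is hypothetical, the F0 difference between the two targets (written ΔF0 here) is taken as a percentage of the trial's mean F0, and the mean F0 is assumed to be drawn uniformly on a log-frequency axis, since the distribution of the rove is not stated.

```python
import random

def make_trial(delta_f0_percent, nominal_f0_hz=200.0, rove_factor=1.1, rng=random):
    """Return (f0_interval1, f0_interval2, mean_f0) for one two-interval trial."""
    # Assumption: mean F0 roved log-uniformly between 200/1.1 and 200*1.1 Hz.
    mean_f0 = nominal_f0_hz * rove_factor ** rng.uniform(-1.0, 1.0)
    delta = mean_f0 * delta_f0_percent / 100.0
    lower = mean_f0 - delta / 2.0
    higher = mean_f0 + delta / 2.0
    # The higher-F0 target falls in a randomly chosen interval.
    if rng.random() < 0.5:
        return lower, higher, mean_f0
    return higher, lower, mean_f0
```

For example, `make_trial(1.4)` returns two target F0s that differ by 1.4% of their mean, with the mean lying between about 181.8 and 220 Hz.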
The target sounds were either presented alone (condition “None”) or with a synchronously gated interferer. The interferer was either a harmonic complex containing harmonics 7-14 with an F0 that was equal to the mean of the F0s of the two target sounds, and thus its F0 was never identical to that of a target sound, or a harmonic complex containing harmonics 5-10 with an F0 that was 40% higher than the mean target-F0. The F0 of the interferer was always identical in the two intervals of a trial, and thus was non-informative for the task. In the non-shifted case, the interferer was presented either diotically, or monaurally, either to the side where the tonal percept of the CHP was lateralized (condition “Ipsi”) or to the opposite side (condition “Contra”). In these conditions, the level of the interferer corresponded to the individually determined levels of equal loudness between the interferer and the tonal percept of the CHP; across subjects, this corresponded to an average rms level of the interferer of about 40.6 dB SPL and 36.6 dB SPL in the monaural and the diotic condition, respectively (for details, see Appendix). In the pitch-shifted condition (condition “Dio_PS”), the interferer covered the same frequency range as the non-shifted interferers, and was presented diotically at the same rms level as the diotic non-shifted interferer.
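That the pitch-shifted interferer covers the same frequency range as the non-shifted one can be checked directly from the harmonic numbers. The Python sketch below uses the nominal 200-Hz F0 for illustration (in the experiment the mean F0 was roved); the function name is an assumption.

```python
def harmonic_frequencies(f0_hz, first, last):
    """Frequencies of harmonics first..last (inclusive) of f0_hz."""
    return [n * f0_hz for n in range(first, last + 1)]

non_shifted = harmonic_frequencies(200.0, 7, 14)      # 1400 ... 2800 Hz
shifted = harmonic_frequencies(200.0 * 1.4, 5, 10)    # F0 raised by 40%
```

Both lists span roughly 1400 to 2800 Hz, with eight components in the non-shifted case and six in the shifted case.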
The four interferer conditions were tested in blocks of 105 trials each. Each block with an interferer present was preceded by a block of 55 trials with the target alone. This was done so subjects knew the characteristics of the target sound. The first five trials within each block were considered as “warm-up” trials and results from those were discarded. One block was run for each interferer condition in turn, before additional blocks were run in any other condition. At least 400, but usually 500 trials were collected for each subject in each interferer condition. As each block with an interferer was preceded by a 55-trial block without an interferer, this means that performance in the target alone condition is based on at least 800, but usually 1000 trials. Subjects received from 3-8 hours of practice until performance seemed to be stable within conditions, before data collection proper was started.
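The trial counts above follow from the block structure once the five warm-up trials per block are discarded. A brief Python check, with illustrative names and assuming four or five blocks per interferer condition:

```python
def scored_trials(block_len, n_blocks, warm_up=5):
    """Trials that count towards the results after discarding warm-up trials."""
    return (block_len - warm_up) * n_blocks

N_INTERFERER_CONDITIONS = 4
for blocks_per_condition in (4, 5):
    per_condition = scored_trials(105, blocks_per_condition)
    # Every interferer block is preceded by one 55-trial target-alone block.
    target_alone = scored_trials(55, blocks_per_condition * N_INTERFERER_CONDITIONS)
    print(blocks_per_condition, per_condition, target_alone)
```

Four blocks per condition give 400 scored trials per interferer condition and 800 for the target alone; five blocks give 500 and 1000.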
Five subjects participated in all conditions of experiment 2. Four of them had also taken part in experiment 1. They ranged in age from 19 to 45 years, and their quiet thresholds at octave frequencies between 250 and 4000 Hz were within 15 dB of the ISO (2004) standard. All subjects had previously participated in other experiments on PDI.
Figure 4 shows performance (d') for F0 discrimination for the CHP tone presented either alone (condition “None”) or simultaneously with the interferer. The interferer was presented at a level leading to the same loudness as the tonal part of the CHP, as determined individually for each subject (see Appendix for details). In the absence of the interferer, d' values were between 1.6 and 1.8 for all of the subjects (Fig. 4a-e). This was as intended, and shows that F0 discrimination for CHP is good for relatively small values of ΔF0 (0.9%-1.8%). It was not quite as good as for a complex tone containing resolved harmonics presented at moderate but higher sensation level, for the same values of ΔF0 (Gockel et al., 2009a), but it was clearly better than for complex tones containing only unresolved harmonics (see e.g. Houtsma and Smurzynski, 1990; Shackleton and Carlyon, 1994; Gockel et al., 2004; 2006). In the presence of an interferer with F0 centered between those of the target complexes in the two intervals (3 bars in the middle), performance deteriorated for all subjects. However, the degree of impairment varied across subjects and conditions. Subject 4 (Fig. 4d, the first author) showed markedly higher performance when the perceived location of the interferer was different from that of the target than when it was similar, i.e., diotic and contralateral presentation both led to higher performance levels than ipsilateral presentation. The other four subjects did not seem to be able to take advantage of the difference between the lateralization of the target and the interferer. When the F0 of the interferer was increased by 40% above the nominal target F0 (condition Dio_PS), performance improved relative to that for the non-shifted conditions, for all subjects. Figure 4f shows the mean data and standard errors across subjects.
On average, PDI, defined as the difference between the d' value in condition None and that observed in the presence of an interferer, was about 0.7 when the interferer’s F0 was centered at the nominal target-F0 and about 0.2 when the interferer’s F0 was shifted.
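The PDI measure can be illustrated with a short Python sketch. The conversion from proportion correct to d' shown here is the standard relation for an unbiased observer in a two-interval forced-choice task; it is an assumption for illustration, as the text does not state which conversion was used, and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def dprime_2i2afc(proportion_correct):
    """d' for an unbiased observer in a two-interval forced-choice task."""
    # inv_cdf requires a proportion strictly between 0 and 1.
    return sqrt(2.0) * NormalDist().inv_cdf(proportion_correct)

def pdi(dprime_none, dprime_with_interferer):
    """Pitch discrimination interference: drop in d' caused by the interferer."""
    return dprime_none - dprime_with_interferer
```

Under this conversion a d' between 1.6 and 1.8 corresponds to roughly 87-90% correct, and a drop from a d' of 1.7 to 1.0 would be a PDI of 0.7.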
A repeated-measures one-way ANOVA, calculated with the mean d' values from all subjects and conditions as input, showed that there was a highly significant difference between conditions [F(4,16)= 12.92, p<0.001]. Post hoc contrasts based on Fisher’s least significant difference procedure showed that performance in condition None was significantly higher than in conditions Ipsi and Contra (p<0.01) and Diotic (p<0.05). In contrast, in the presence of the F0-shifted interferer performance did not differ significantly from that observed in condition None and was significantly higher than that observed in condition Diotic (p<0.01).
The results show that significant PDI does occur between a CHP target and a monaural or diotic interferer. The size of the PDI depended on the similarity of the F0s of the CHP target and the interferer. This indicates that the PDI for a binaural pitch target and a conventional pitch interferer, two stimuli which initially are processed in a different way, is likely to be caused by the same process as the PDI observed between conventional pitch stimuli. The present results further support the idea that PDI occurs at least partly at or after the stage at which pitch-relevant information is combined across the two ears. In the present experiment, a difference in perceived lateralization of the target and interferer was an ineffective cue, except for one subject. In contrast, Gockel et al. (2009b) found that presentation of the target to one ear and the interferer to the other ear significantly reduced (but did not abolish) PDI in comparison to the case when both tones were presented to the same ear. Thus, relative ear of entry seems to have a more powerful influence on PDI than perceived lateralization. This could be another example of the small effect of perceived location in contrast to the significant effects of relative ear of entry reported for concurrent sound segregation (Culling and Summerfield, 1995; Hukin and Darwin, 1995; Gockel and Carlyon, 1998; Gockel, 2000; Darwin and Hukin, 2004).
After establishing that PDI can occur between a CHP target and a monaural or diotic conventional pitch interferer, we assessed whether, for the same diotic pitch interferer, PDI would be larger if the target was a conventional pitch complex than when it was a CHP. In other words, is there any benefit if the target and the interferer are initially processed in a different way compared to when they are processed in the same way?
Two different stimuli were used as targets. The first was the same CHP as used in Experiment II. The second was a diotic tone complex, also containing harmonics two to five of 200 Hz, presented simultaneously with a diotic noise. This second target is the same as was used for pitch matching, as described for Experiment I. Briefly, the diotic noise was identical to the noise used to generate the CHP stimuli (before phase shifts were applied) and the diotic tone complex was presented at a level leading to the same loudness as for the tonal component in the CHP stimulus, as determined for each subject in Experiment I. Only the diotic (not the monaural) interferers from Experiment II were used, i.e., a tone complex containing harmonics 7-14 with an F0 equal to the mean of the F0s of the two target sounds (nominally 200 Hz) or containing harmonics 5-10 with an F0 that was nominally 280 Hz.
The procedure was the same as that used in the PDI experiment described above. The four conditions with an interferer were tested in blocks of 105 trials each. Each block with an interferer present was preceded by a block of 55 trials with the target alone. The first five trials within each block were considered as “warm-up” trials and results from those were discarded. Conditions with the two different targets were blocked within each session; in one half of the session the CHP was the target and in the other half, the diotic tone in noise was the target. The order of the two was counterbalanced across sessions. One block was run for each interferer condition in turn, before additional blocks were run in any other condition. At least 500 trials were collected for each subject in each condition.
Four subjects, aged between 19 and 45 years, participated in all conditions of experiment 3. All of them had also taken part in experiment 2. The F0 differences between the two targets (ΔF0) were 1.4%, 0.9%, 1%, and 1.8%, for subjects 1, 2, 3, and 4, respectively.
When the targets were presented alone, the mean d' values (with standard errors across the four subjects given in brackets) were 1.70 (0.05) for the CHP and 1.86 (0.09) for the diotic tone. Figure 5 shows the PDI found using the two interferers. When the interferer’s F0 was equal to the mean of the targets’ F0s, the PDI was between 0.65 and 0.8. Both of these values were quite close to those observed in the previous experiment using the CHP as target. When the interferer’s F0 was increased, the PDI values were between 0.24 and 0.14. Again, both of these values were quite similar to those obtained with the CHP as target in the previous experiment.
A repeated-measures two-way ANOVA, with factors type of target and F0 of the interferer, was calculated, using the mean PDI values from all subjects and conditions as input. The results showed a significant main effect of interferer F0 [F(1,3)= 52.36, p<0.01]. The main effect of target type and the interaction were both not significant. Therefore, PDI did not differ significantly when the target was a conventional complex pitch and when it was a CHP. In other words, PDI was not significantly smaller when the target and the interferer were initially processed in a different way than when they were initially processed in the same way. To assess whether the presence of the pitch-shifted interferer produced significant impairment, one-sample t-tests were calculated on the mean PDI values observed with the 280-Hz interferer from all subjects. The results showed that PDI for the pitch-shifted interferer was significantly larger than zero when the CHP was the target (p<0.05; two-tailed) but was not significantly above zero when the diotic tone was the target.
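The one-sample t-test against zero used here can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the four PDI values are made up for the example (the individual data are not given in this section), the function name is hypothetical, and the critical value 3.182 is the standard two-tailed 5% value of t for 3 degrees of freedom.

```python
import math
import statistics


def one_sample_t(values, mu0=0.0):
    """Return (t statistic, degrees of freedom) for a test of mean == mu0."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)
    t = (mean - mu0) / (sd / math.sqrt(n))
    return t, n - 1


# Two-tailed 5% critical value of Student's t for df = 3.
T_CRIT_DF3 = 3.182

# HYPOTHETICAL per-subject PDI values for four subjects (not from the paper).
pdi_hypothetical = [0.10, 0.30, 0.20, 0.25]

t, df = one_sample_t(pdi_hypothetical)
significant = abs(t) > T_CRIT_DF3
print(f"t({df}) = {t:.2f}, significant at p < 0.05: {significant}")
```

With only four subjects, the test has 3 degrees of freedom, so a mean PDI must exceed roughly 3.2 standard errors to differ significantly from zero. This helps explain how a small mean PDI can reach significance for one target type but not for the other.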
In summary, the results indicate that the size of the impairment in F0 discrimination of two sequentially presented target tones caused by a simultaneous interferer is not (or only slightly) affected by whether the target and the interferer are both diotic tones or whether the target is a CHP and the interferer is a diotic tone. In the latter case, the target and interferer are likely to be initially processed in a different way. The absence of a significant benefit from the difference in initial processing indicates that the processes underlying PDI most likely do not differentiate between the “origin” of their input.
Overall, the results for a CHP tone were similar to those for a loudness-matched diotic complex tone with respect to the pitch matches and also with respect to the PDI that was observed in the presence of a diotic interferer. Thus, PDI can occur between stimuli which initially are likely to be processed in a different way. This further supports the notion that PDI is unlikely to occur at a very peripheral stage of auditory processing. It indicates that a conventional complex tone and a CHP are likely to be processed in common at the stage where PDI occurs. If PDI occurred at the stage of pitch extraction, the current results would support the idea of the existence of a central pitch processor common to binaural and monaural signals as expressed by Bilsen (1977). Alternatively, PDI may occur after the pitch extraction processes, which might not be in common for binaural and monaural pitches. If this is so, then the pitches must either be transformed into a common code at a later stage, or not be independently accessible.
This work was supported by EPSRC Grant EP/D501571/1. We thank Brian Moore and Brian Glasberg for providing us with their program to calculate binaural loudness as described in Moore and Glasberg (2007). We also thank Brian Moore, Richard Freyman, and three anonymous reviewers for helpful comments on an earlier version of this paper.
This appendix describes the loudness-matching experiment between the tonal component of the CHP target and the interferer in experiment 2.
The procedure was the same as that used for the loudness matches in experiment 1. Here, subjects adjusted the level of a complex tone containing harmonics 7-14 with an F0 of 200 Hz, which was presented in silence, so that it had the same loudness as the tonal part of the CHP stimulus. The latter was the same CHP stimulus with phase transitions around harmonics 2-5 of a nominal F0 of 200 Hz that was investigated in experiment 1, except that the noise band extended up to 1.2 kHz rather than 1.1 kHz. The noise band was extended up to 1.2 kHz to allow for the F0-randomization across trials applied in the following experiments on PDI (see below). There were three conditions: The conventional pitch complex was presented to the left ear, to the right ear, or diotically. These three conditions and the order of target and adjustable sounds were varied across blocks of usually 10 matches and counterbalanced across subjects. The starting level of the conventional pitch complex was varied randomly in the range from 21 to 31 dB SPL per component, i.e., 30-40 dB SPL rms level. This range was chosen after some informal listening so that it covered levels at which the diotic tone complex was clearly louder or clearly softer than the tonal component in the CHP stimulus. At least 40 loudness matches were collected for each condition and subject (20 matches for each order of the two stimuli), and results were averaged across the two orders of stimuli.
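The conversion from level per component to overall rms level quoted above follows from adding the powers of the eight components (harmonics 7-14). The sketch below reproduces that arithmetic, assuming equal-level components at distinct frequencies so that their powers simply add; the function name is illustrative only.

```python
import math


def rms_level_db(per_component_db, n_components):
    """Overall level (dB) of n equal-level components whose powers add."""
    return per_component_db + 10 * math.log10(n_components)


n_components = 14 - 7 + 1  # harmonics 7 to 14 inclusive

low = rms_level_db(21, n_components)
high = rms_level_db(31, n_components)
print(f"{low:.1f} to {high:.1f} dB SPL rms")
```

Eight components add about 9 dB to the per-component level, which gives the 30-40 dB SPL rms range stated in the text.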
Table I shows the mean rms level of the complex tone containing harmonics 7-14 of an F0 of 200 Hz at which the complex was judged to be equally loud to the tonal part of the CHP stimulus, for each of the three conditions. For four of the five subjects the lateralization of the tonal part of the CHP stimulus was towards the right, while for subject 04 the tone was clearly perceived at the left. Thus, ipsilateral presentation of the tone complex containing harmonics 7-14 meant presentation to the right ear for four subjects and presentation to the left ear for subject 04. A repeated-measures one-way ANOVA showed that there was a significant difference between the matched loudness levels in the three conditions [F(2,8)= 8.1, p<0.05].² Post hoc contrasts based on Fisher’s least significant difference procedure showed that the level of the diotic tone required to match the loudness of the CHP was significantly lower than that in the monaural conditions (p<0.05) but, as expected, there was no significant difference between ipsilateral and contralateral presentation. In the diotic condition, the matched level was 4.05 dB lower than the average of the matched levels in the two monaural conditions, i.e., diotic presentation of the complex tone increased its loudness by an amount corresponding to a 4.05 dB increase in level.
Loudness calculations (following Moore and Glasberg, 2007) showed that, for monaural presentation, the matched level corresponds to a loudness value of 1.766 sones (47.6 phons) while for diotic presentation, the matched level corresponds to a loudness value of 1.977 sones (49.2 phons). Thus, the 4.05 dB difference between the matched levels in the monaural and the diotic conditions is close to the 5.6 dB predicted by the recent modification of Moore et al.’s (1997) loudness model. While some empirical data suggest doubling of loudness with diotic presentation, more recent studies suggest less than that (for an overview see Moore and Glasberg, 2007), and the recent modification of the model was developed specifically to account for these more recent findings. Thus, the present data are reasonably in line with the recent findings.
The alert reader will have noticed the discrepancy between the loudness values calculated for the partial loudness of the diotic tone in noise containing harmonics 2-5 at the matched level in experiment 1 (22.5 phons) and for the diotic interferer containing harmonics 7-14 presented in silence calculated here (49.2 phons). Both sounds were adjusted in level to be equally loud to the tonal component in the CHP stimulus, and thus, one might expect their loudness values to be equal to each other. A possible explanation for why the diotic interferer was matched at a higher level is that subjects included the energy of the noise to a certain degree when they assessed the loudness of the tonal component of the CHP stimulus. The interferer was presented in quiet, and thus would need to be adjusted to a higher level to be perceived as equally loud. Also, because the interferer had a very different timbre from the tonal part of the CHP stimulus, loudness comparison would have been difficult. In contrast, the diotic tone in noise containing harmonics 2-5 (experiment 1) was perceptually very similar to the CHP. Therefore, loudness comparison would have been easier, and, importantly, it would be unlikely that the noise would contribute differentially to the loudness of the tonal percept in the two stimuli.
¹ It is unlikely that listeners did not perceive the residue pitch of the CHP stimulus and that this match was only obtained because the pitch chroma of the adjusted sound (about 200 Hz) matched the pitch chroma of the first phase transition (about 400 Hz) in the CHP stimulus. Firstly, the pitch of the CHP stimulus and the pitch of the diotic tone in noise, which were presented sequentially in the loudness matching part of Experiment 1, did not appear to have an octave relationship but rather they sounded the same. Secondly, Gockel et al. (2009b) showed that PDI was minimal when the F0s of target and interferer differed by as much as one octave. Thus, if the pitch of the CHP stimulus were mainly determined by the individual phase transition around 400 Hz, then one would not expect to observe the amount of PDI that was found in Experiments II and III.
² Throughout the paper, if appropriate, the Huynh-Feldt correction was applied to the degrees of freedom (Howell, 1997). In such cases, the corrected significance value is reported.
PACS: 43.66.Hg, 43.66.Rq, 43.66.Ba