To investigate frequency-specific timing information in the scalp-recorded response, as well as in an auditory nucleus that contributes to the surface response, we examined phase coherence between responses to speech stimuli recorded from the inferior colliculus and the surface vertex of guinea pigs. A major goal of this work was to examine whether stimulus-specific phase signatures measured at the scalp represent an epiphenomenon associated with far-field (i.e., scalp-recorded) measurement of neural activity, or alternatively whether these activity patterns represent a more fundamental temporal processing phenomenon, as evidenced by similar activity patterns in near-field responses. Here we show that phase differences in surface-recorded responses measured in guinea pig reflect stimulus phase attributes: phase differences were prominent in the formant transition period of surface responses for both the /ga/ vs. /ba/ and /da/ vs. /ba/ stimulus comparisons, and were less prominent for the /ga/ vs. /da/ comparison. Importantly, these observations were also evident in surface recordings measured in human subjects (Skoe et al., 2011), suggesting similar underlying neural mechanisms in human and animal auditory systems. Near-field responses measured from the ICc showed results similar to those described for the surface recordings: the GaDa comparison elicited smaller phase differences relative to GaBa and DaBa. However, in contrast to the surface responses, significant, but relatively subtle, phase differences were also evident in the steady-state portion of the near-field ICc response. Finally, we showed a divergence between magnitude and phase spectra in ICc responses to these stimulus comparisons. Taken together, these results strongly suggest that the phase signatures elicited by speech sounds represent a fundamental temporal processing phenomenon that generalizes to both the surface-recorded brainstem response and the localized auditory nuclei that contribute to it. In addition, we expect that the phase sensitivity shown here for synthesized CV syllables would make this analysis method applicable to responses evoked by a variety of stimuli, including natural speech and non-speech sounds, or the same sound delivered in different manners, such as within different maskers.
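For readers who wish to extend this kind of analysis to other stimuli, the core computation — a time-frequency phase difference between two responses — can be sketched as follows. This is a minimal illustration using an STFT cross-spectrum; the function name, window length, and example signals are ours, not the original analysis pipeline:

```python
import numpy as np
from scipy.signal import stft

def phase_difference(resp_a, resp_b, fs, nperseg=256):
    """Time-frequency phase difference (radians) between two responses.

    Positive values mean resp_a leads resp_b at that time-frequency bin.
    """
    f, t, Xa = stft(resp_a, fs=fs, nperseg=nperseg)
    _, _, Xb = stft(resp_b, fs=fs, nperseg=nperseg)
    # Phase of the cross-spectrum, wrapped to (-pi, pi]
    return f, t, np.angle(Xa * np.conj(Xb))

# Example: two 100 Hz tones, the second lagging by a quarter cycle
fs = 6400                       # chosen so 100 Hz falls exactly on an STFT bin
tt = np.arange(0, 0.5, 1 / fs)
a = np.sin(2 * np.pi * 100 * tt)
b = np.sin(2 * np.pi * 100 * tt - np.pi / 2)   # b lags a by pi/2
f, seg_times, dphi = phase_difference(a, b, fs)
bin100 = int(np.argmin(np.abs(f - 100)))
# Median over interior time segments should be close to +pi/2 (a leads b)
lag = np.median(dphi[bin100, 2:-2])
```

Averaging such phase-difference maps across trials, and testing bins against zero, yields the kind of significant-point maps discussed below.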
There are a number of important similarities and differences between responses measured from the surface and the ICc in the current study. First, while there was a general correspondence between phase attributes in surface and ICc responses, phase differences recorded directly from the inferior colliculus encoded stimulus distinctions more broadly than those recorded at the cortical surface. These near-field responses revealed phase differences in both the formant transition and steady-state portions of the responses across all assessed frequency ranges (70 Hz to 3 kHz) and stimulus comparisons. The phase differences evident during the steady-state portion of the response, although significant, were substantially smaller than those seen during the formant transition. Near-field recordings revealed a richer pool of significant points than the surface recordings, but the basic response pattern was similar across recording sites: phase differences were most salient in the formant transition portion of the GaBa and DaBa comparisons.
Another notable difference between the surface and ICc responses is that higher-frequency phase differences, while present in the direct IC recordings, are not observed at the surface. A plausible explanation for this lack of sensitivity of surface responses at higher frequencies is that neural information accessible via near-field electrophysiological recordings is often reduced or absent from far-field recordings (see Wood et al., 1981 for a review). A contributing factor to this phenomenon is that responses from multiple sources are combined during volume conduction to the surface of the brain. Another factor is that while both near- and far-field recordings include contributions from a population of neurons, near-field techniques record activity from a much smaller neural population. Other midbrain regions, and indeed possibly other regions within the IC, may not distinguish these stimuli by phase differences, and when the signal is recorded by an electrode at the cortical surface, differences detectable via near-field techniques may be obscured by other signals simultaneously transmitted to the surface.
The convergence of phase-based results from surface and ICc responses strongly suggests that the phase differences recorded from the surface are generated, at least in part, by the ICc. This finding provides novel support for shared response features between the ICc and surface responses and adds to a literature strongly suggesting that the ICc plays an important role in shaping the surface-recorded auditory brainstem response. For example, it has been shown that the FFR, which is an important attribute of the surface-recorded auditory brainstem response to speech (Hornickel et al., 2009a; Johnson et al., 2008b), is greatly attenuated when the IC is cooled and is evident again after the IC is warmed (Smith et al., 1975). While this previous work focused on shared magnitude spectra between the surface and ICc, results from the current study add to our knowledge by showing shared phase spectra between the surface and ICc in response to speech sounds. Given the convergence of findings from both amplitude (Chandrasekaran et al., 2010; Marsh et al., 1974; Smith et al., 1975) and phase spectra across the surface and ICc in animal models of the auditory system, we hypothesize that the representations of both amplitude and phase spectra in the human brainstem response reflect generalized auditory mechanisms that can be traced back to the properties of the nuclei which contribute to the surface response. It is hoped that future studies may further test this hypothesis by examining perceptually important attributes of the human brainstem response (Hornickel et al., 2011; Hornickel et al., 2009b; Johnson et al., 2008b) in animal models of the auditory system.
An important consideration of this work, as well as of the previously published phase-related work in the human auditory system (Skoe et al., 2011), is how the auditory system might make use of this phase-related information in the processing of complex signals, including speech sounds. One hypothesis is that the low-frequency phase sensitivity evident at the scalp and in the ICc represents an additional coding cue that may help facilitate the discrimination of speech stimuli. The logic for this hypothesis is grounded in the fact that the upper frequency limit of phase-locking capability decreases along the ascending auditory pathway (see Joris et al., 2004 for a review). Therefore, the transposition of higher-frequency stimulus differences to lower response frequencies may serve as a non-linear mechanism for encoding fast-moving frequency modulations that exceed the phase-locking capability of higher levels. This transposition may reflect the processing of amplitude modulations invoked by physical mechanisms of vocal production involving the fundamental frequency and its harmonics. John and Picton (2000) used simple amplitude-modulated stimuli to illustrate how the phase of low-frequency envelope responses from the brainstem conveys information present in higher-frequency regions of the stimuli. It is hoped that future studies may test this hypothesis by further examining the relationship between low-frequency phase and higher-frequency components of acoustical stimuli, as well as their possible link to perception.
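This idea can be made concrete with a toy simulation: if the response to an amplitude-modulated tone is delayed by a carrier-dependent latency (a crude stand-in for cochlear travel time to the carrier's place), that latency difference appears as a phase shift of the low-frequency envelope component, even though the carriers themselves lie well above the modulation rate. The latency values and helper functions below are hypothetical, chosen only to make the arithmetic concrete:

```python
import numpy as np
from scipy.signal import hilbert

FS = 20000          # sampling rate (Hz)
FM = 100.0          # modulation (envelope) rate, an F0-like frequency
T = np.arange(0, 0.5, 1 / FS)

def envelope_phase(sig, fs, fm):
    """Phase (radians) of the Hilbert envelope of `sig` at modulation rate fm."""
    env = np.abs(hilbert(sig))
    env -= env.mean()
    spec = np.fft.rfft(env * np.hanning(len(env)))
    freqs = np.fft.rfftfreq(len(env), 1 / fs)
    return np.angle(spec[np.argmin(np.abs(freqs - fm))])

def toy_response(fc, latency_s):
    # Toy "response": the AM stimulus delayed by a carrier-dependent latency,
    # standing in for cochlear travel time (latency values are hypothetical).
    return (1 + np.cos(2 * np.pi * FM * (T - latency_s))) * \
           np.cos(2 * np.pi * fc * (T - latency_s))

phi_low = envelope_phase(toy_response(500.0, 8e-3), FS, FM)    # apical: slower
phi_high = envelope_phase(toy_response(4000.0, 6e-3), FS, FM)  # basal: faster
dphi = np.angle(np.exp(1j * (phi_high - phi_low)))
# The 2 ms latency difference maps to a 2*pi*100*0.002 = 0.4*pi shift of the
# 100 Hz envelope component, transposing a high-frequency distinction downward.
```

In this sketch the envelope response frequency is identical for both carriers; only its phase distinguishes them, which is the sense in which low-frequency phase can carry higher-frequency stimulus information.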
We did not see differences in phase encoding across different CF groups in the near-field ICc data. This lack of differential encoding is not unprecedented; different CF regions of the IC have been found to respond similarly to vowels (Watanabe et al., 1978). However, the magnitude of responses did vary by CF region, with higher-frequency CF regions producing larger responses, particularly in the lower frequency range of the responses. Modeling work has shown that mid- and high-frequency cochlear regions are primary contributors to low-frequency FFRs (Dau, 2003), which could help explain why we see greater FFR amplitudes in higher CF regions. The dissociation between magnitude and phase encoding indicates that although responses across the tonotopic map do differ, the differences in response magnitude do not impact the encoding of the small timing differences that differentiate the stimuli: response timing is encoded similarly across CF regions.
Phase differences in the steady-state period of the near-field responses were not expected, as the stimuli are identical during this time period, and no phase differences were evident in the steady-state portion of the far-field responses. Small phase differences were, however, seen in the human dataset collected by Skoe et al. (2011). In that study, steady-state phase shifts indicated that responses to higher-frequency content phase-lagged responses to lower-frequency content, the opposite of the direction that would be predicted if they were simply carry-over effects from earlier time periods. In the present study, phase oscillations at 100 Hz, a frequency present throughout all stimuli as the fundamental frequency, were evident in the near-field data. As seen in , these oscillation patterns did not center on the zero-phase-difference line: the DaBa and GaDa responses were superimposed on phase shifts of opposite direction (phase-lead and phase-lag, respectively) and were 180° out of phase with one another. The GaBa response, which showed the least steady-state oscillation, ran closest to the zero-phase-difference line. Therefore, both the oscillation pattern and the static phase shift differed with stimulus comparison, suggesting that later phase encoding may provide contextual information concerning spectro-temporal sound patterns earlier in the stimulus, in this case the formant transition. This concept is supported by data from Watanabe and Sakai (1978), in which steady-state IC responses to the vowel /a/ were found to differ depending on whether the vowel was presented in isolation or preceded by connecting speech. More recently, context effects were reported in human brainstem responses to speech syllables (Chandrasekaran et al., 2009). The effects seen here may be related to the contextual effects reported in these papers.
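The distinction between a static phase shift and oscillation around it can be made explicit: given a time series of phase differences at 100 Hz, the static component is the circular mean and the oscillation is the residual. The sketch below uses synthetic traces that merely mimic the qualitative pattern described above (opposite static shifts, antiphase oscillations); the numeric values are illustrative, not data:

```python
import numpy as np

def static_and_oscillation(phase_trace):
    """Split a phase-difference time series (radians) into a static shift
    (circular mean) and the residual oscillation around it."""
    static = np.angle(np.mean(np.exp(1j * phase_trace)))
    osc = np.angle(np.exp(1j * (phase_trace - static)))
    return static, osc

# Synthetic traces mimicking the described pattern: opposite static shifts,
# 100 Hz oscillations 180 degrees out of phase (illustrative, not real data)
t = np.linspace(0, 0.1, 200)
daba = +0.3 + 0.1 * np.sin(2 * np.pi * 100 * t)           # phase-lead
gada = -0.3 + 0.1 * np.sin(2 * np.pi * 100 * t + np.pi)   # phase-lag, antiphase

s_daba, o_daba = static_and_oscillation(daba)
s_gada, o_gada = static_and_oscillation(gada)
r = np.corrcoef(o_daba, o_gada)[0, 1]   # near -1 for antiphase oscillations
```

Separating the two components in this way allows the static shift and the oscillation pattern to be compared across stimulus contrasts independently.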