Psychophysical forward masking is an increase in threshold of detection of a sound (probe) when it is preceded by another sound (masker). This is reminiscent of the reduction in neuronal responses to a sound following prior stimulation. Studies in the auditory nerve and cochlear nucleus using signal detection theory techniques to derive neuronal thresholds showed that in centrally projecting neurons, increases in masked thresholds were significantly smaller than the changes measured psychophysically. Larger threshold shifts have been reported in the inferior colliculus of awake marmoset. The present study investigated the magnitude of forward masking in primary auditory cortical neurons of anaesthetised guinea-pigs. Responses of cortical neurons to unmasked and forward masked tones were measured and probe detection thresholds estimated using signal detection theory methods. Threshold shifts were larger than in the auditory nerve, cochlear nucleus and inferior colliculus. The larger threshold shifts suggest that central, and probably cortical, processes contribute to forward masking. However, although methodological differences make comparisons difficult, the threshold shifts in cortical neurons were, in contrast to subcortical nuclei, actually larger than those observed psychophysically. Masking was largely attributable to a reduction in the responses to the probe, rather than either a persistence of the masker responses or an increase in the variability of probe responses.
Psychophysical forward masking manifests as an increase in the threshold of detection of a sound (probe) when it is preceded by another sound (masker) (Plomp 1964; Elliott 1969). The physiological basis of it is not known. The present study analysed responses of primary auditory (AI) cortical neurons to pairs of sequential tones with signal detection theory (SDT) methods, to accurately measure the threshold of detection of the probe, and the shift in threshold with the addition of a preceding masker. We compared these thresholds with similar measurements from the periphery and in sub-cortical nuclei. This is an important step in establishing the contributions of different auditory nuclei to forward masking.
Psychophysical forward masking results partly from adaptation in the auditory nerve (AN). Adaptation to a masker reduces a fibre’s response to a subsequent probe (Harris and Dallos 1979; Smith 1977, 1979). However, analysis of AN fibres’ responses with SDT techniques (Relkin and Pelli 1987), analogous to the methods used to measure thresholds psychophysically, showed that increases in threshold due to a preceding masker were smaller (<20 dB) than measured psychophysically (>35 dB) for similar stimulus conditions (Relkin and Turner 1988; Turner et al. 1994). This suggested that forward masking is not a result of peripheral processing alone. Subsequently, SDT-based analysis of forward-masked thresholds in the ventral cochlear nucleus (VCN), revealed that the magnitude of forward masking in chopper units, which form a major projection to the inferior colliculus (IC), was similar to the AN (Bleeck et al. 2006; Ingham et al. 2006). Thus psychophysical forward masking cannot be accounted for by responses in the brainstem.
A recent study by Nelson et al. (2009), published after our data collection was completed, investigated correlates of psychophysical forward masking in the IC of awake marmosets. With stimuli similar to those used by Jesteadt et al. (1982) in humans, the mean increase in threshold with masker level across the population of neurons was 0.5 dB/dB, consistent with psychophysical data. However, a study in anaesthetised guinea-pig IC found threshold shifts that were similar to those in the VCN of the same species (Ingham, N., personal communication).
Previous studies of AI neuronal responses have provided evidence of a cortical contribution to forward masking (Calford and Semple 1995; Brosch and Schreiner 1997; Sugimoto et al. 2002; Wehr and Zador 2003, 2005; Scholl et al. 2008). Cortical responses are often completely suppressed immediately after a stimulus, and suppression lasts at least 40 ms (Brosch and Schreiner 1997). Such strong suppression has not been observed subcortically (Schreiner 1981). However, SDT-based measures of probe threshold are not available, and the quantitative relationship between cortical forward suppression and psychophysical forward masking remains unknown. The present study measured responses to many different probe levels, for different masker levels. Responses were compared on a trial-by-trial basis between the probe and no-probe conditions to derive probe detection thresholds more comparable with psychophysical measurements. Stimulus conditions were chosen to allow a direct comparison with previous sub-cortical studies (Relkin and Turner 1988; Ingham et al. 2006).
Recordings were made from either the left or right primary auditory cortex in 18 guinea pigs weighing between 410 and 877 g. Animals were initially anaesthetized with intra-peritoneal injections of urethane (1.3 g/kg, in 20% solution in 0.9% saline) and a 0.2 ml intra-muscular injection of Hypnorm (comprising fentanyl citrate 0.315 mg/ml and fluanisone 10 mg/ml; Janssen, High Wycombe, UK). Atropine sulphate (0.06 mg/kg) was administered subcutaneously at the beginning of the experiment to reduce bronchial secretions. Anaesthesia was maintained throughout the experiment with 0.2 ml intramuscular injections of Hypnorm on indication of the pedal withdrawal reflex. Core temperature was monitored with a rectal probe and maintained in the range 37–38°C with a thermostatically controlled heating blanket. End-tidal CO2 was monitored and kept within normal physiological limits by artificially respiring with oxygen. Heart rate was monitored using a pair of electrodes inserted into the skin, on either side of the animal's thorax. The animal was placed inside a sound attenuating chamber, in a stereotaxic frame in which hollow plastic specula replaced the ear bars to allow direct visualisation of the tympanic membrane and closed-field sound presentation. A polythene tube was inserted into an opening in each bulla to equalize middle ear pressure, and the hole was resealed with petroleum jelly. A craniotomy with a diameter of around 5 mm was performed to expose the auditory cortex (Wallace et al. 2000), the dura was removed, and the surface of the brain was covered with 1.5% agar to stabilize the recordings and to prevent desiccation. All procedures were performed in accordance with the UK Animals (Scientific Procedures) Act 1986.
Recordings were made with glass-insulated tungsten electrodes (Bullock et al. 1988) in a linear array attached to a single circuit board. The multi-electrode signals were fed into a Tucker Davis Technologies Medusa headstage amplifier (Alachua, FL). All electrodes were advanced together into AI by a piezoelectric motor (Burleigh Inchworm IW-700/710). Spikes were recorded and analyzed on-line using Brainware (developed by J. Schnupp, University of Oxford, UK). Candidate spikes were amplified and discriminated from background noise on-line using a software level discriminator. They were further analyzed off-line with Plexon (Dallas, TX) software to isolate action potentials from separate single and/or multi-units.
Units were isolated with a 100-ms long wideband (0–48 kHz) noise played diotically at approximately 90 dB SPL. Although the exact extent of AI is somewhat idiosyncratic (Wallace et al. 2000), nevertheless it is straightforward to locate central AI and to confirm that units were primary cortical neurons, because they were in the correct spatial region with a smooth increase in best frequency as the electrodes were moved towards more caudal regions (Fig. 3 in Wallace et al. 2000), and also by their short latencies, around 10 ms. The use of multi-electrodes often also allowed us to check the direction of the tonotopic gradient with one electrode penetration, and this was confirmed in subsequent penetrations. Also, most of the units were very sensitive to tones. When a unit was isolated its frequency response area and characteristic frequency (CF) were determined by measuring its response to 50 ms long pure tones of varying frequency and level, played diotically. Frequency was typically varied between 100 and 30,000 Hz in steps of one third octave or smaller. The level was changed by attenuating the signal between 10 and 100 dB (from a maximum of ~100 dB SPL) in 10-dB steps.
After determining the characteristic frequency of the units of interest from all electrodes, the spike count evoked by forward masked tones was measured. Two different stimulus sets were used. One set used stimuli similar to those used by Relkin and Turner (1988) in which a 102-ms long tone (masker) was followed by another tone with a duration of 25 ms (probe), in most cases with no gap between the masker and probe. This set will be referred to hereafter as the ‘(102,25)’-ms condition. In a second set, both masker and probe had a duration of 50 ms. This set will be referred to as the ‘(50,50)’-ms condition. The shorter masker and longer probe were expected to produce a smaller effect of masking relative to the (102,25)-ms condition. It also meant that the threshold of the masker could be determined accurately from the threshold of the probe. All tones were gated with 2-ms on and off cosine-squared ramps and had a frequency near to the CF of the unit, although on some occasions a compromise was made to allow simultaneous recordings from more than one electrode site. The masker level and the probe level were varied independently. The probe level was varied from below threshold up to suprathreshold levels (>50 dB in 5 dB steps). Masker conditions were typically presented at 3–5 different levels 20 dB apart, including a no-masker condition. Five different masker conditions required approximately 90 minutes of recording. For 15 units the masker–probe interval was increased from 0 (presented to all units) to 10, 20, 25, 50 and 100 ms. Each of these three parameters (probe level, masker level and masker–probe interval) was independently varied from trial to trial, so that a single presentation of all stimulus conditions was completely randomised. Stimuli were presented diotically in the closed field and the interstimulus interval varied between 1 and 1.6 s. The minimum interstimulus interval was set as a parameter in the software whilst the maximum depended on computer hardware constraints. Sound levels were calibrated to dB SPL using a 1-mm probe tube connected to a 4134 Brüel and Kjaer microphone situated within a few millimetres of the tympanic membrane.
In a psychophysical forward-masking experiment each trial typically consists of two intervals: one in which only the masker is presented and another in which the masker is followed by the probe. In each trial, the participant has to indicate which of the two intervals contained the probe. The probe level is varied between trials and the level at which the probe is correctly detected on a certain proportion of the trials is determined. Forward masked thresholds, in our study, were derived using a method similar to that described by Britten et al. (1992), which is illustrated in Figure 1. This method is analogous to the two-alternative forced choice task used in psychophysics. For each cortical neuron the response to the probe was determined as the number of spikes that occurred within the time window of presentation of the probe (Fig. 1A). This time window was of the same duration as the probe stimulus and was set to start according to the latency of the unit, assessed by a post-stimulus time histogram of the response to both masker and probe. The spike count evoked by the probe was compared, for each stimulus presentation, with that measured within the equivalent temporal window when no probe was presented. Detection was considered to occur when the probe condition elicited more spikes than the no-probe condition. If responses were equal then a guess was made. Responses to the probe and no-probe conditions were paired and compared across all the 50 repetitions of each condition, and the number of ‘correct responses’ counted. The probability that a neuron responded in such a way as to allow correct detection of the probe was calculated as the proportion of pairs for which the response to the probe was higher than the response to the no-probe condition. This procedure was repeated for the different probe levels to derive the neurometric function (Fig. 1B; probability of correct responses as a function of probe level) for each unit, for each masker level and masker–probe time interval. The resulting neurometric function is equivalent to a psychometric function generated using a two-alternative forced choice procedure, and constitutes a prediction of performance based on the spike counts from a single unit. The threshold of detection of the probe was finally estimated from the neurometric function as the probe level for which a correct response occurred on at least 60% of the comparisons (horizontal dashed line in Fig. 1B).
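The trial-by-trial comparison described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the analysis code used in the study: the function names `percent_correct` and `detection_threshold` are hypothetical, and the linear interpolation between tested probe levels is an assumption, since the text does not say how the 60% point was located between levels.

```python
import random

def percent_correct(probe_counts, no_probe_counts, rng=None):
    """Proportion of paired trials on which the probe window held more spikes.

    Ties are resolved by a coin flip, mirroring the 'guess' in the text.
    """
    rng = rng or random.Random(0)
    pairs = list(zip(probe_counts, no_probe_counts))
    if not pairs:
        raise ValueError("need at least one pair of spike counts")
    correct = 0
    for probe, no_probe in pairs:
        if probe > no_probe:
            correct += 1
        elif probe == no_probe and rng.random() < 0.5:
            correct += 1  # equal counts: guess
    return correct / len(pairs)

def detection_threshold(levels_db, p_correct, criterion=0.60):
    """Lowest probe level at which the neurometric function reaches criterion.

    Assumption: linear interpolation between adjacent tested levels.
    Returns None when the function never reaches the criterion.
    """
    points = sorted(zip(levels_db, p_correct))
    for i, (level, pc) in enumerate(points):
        if pc >= criterion:
            if i == 0:
                return level
            prev_level, prev_pc = points[i - 1]
            frac = (criterion - prev_pc) / (pc - prev_pc)
            return prev_level + frac * (level - prev_level)
    return None
```

Calling `percent_correct` once per probe level, for a fixed masker level and masker–probe interval, gives one neurometric function; the masked threshold minus the unmasked threshold is then the threshold shift. A `None` return corresponds to a condition in which no threshold could be estimated.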
Only the units that were responsive to the stimuli during the experiment and that provided reliable neurometric functions were considered for further analysis. Units which on visual examination yielded noisy neurometric functions, or for which it was not possible to collect more than 30 repetitions of each stimulus condition were excluded from the study. In total, 35 units were excluded on the basis of these criteria.
To predict psychophysical thresholds based on the combined responses of a population of cells, we employed a signal-detection model similar to Micheyl et al. (2005). For each unit, for each probe, masker and gap condition, we constructed a spike-count histogram with the distribution of the number of spikes obtained across repetitions. These were normalised by the total number of stimulus presentations for that condition in that cell (normally 50) in order to yield spike-count probability distributions. For a given stimulus condition the spike count distributions from all the units for which that stimulus condition was available were convolved together to generate population spike-count distributions (Fig. 1C; note however that spike count distributions can be considered for individual units as well as large populations). These distributions were designated as a condition in which the probe was ‘present’. They indicate the probability that the sum of the spike counts across the population will be a given value, in response to a single presentation of the probe tone for a particular stimulus condition. Population distributions for the response in the absence of the probe were generated from the response when there was no probe (the probe was maximally attenuated usually with 100 dB attenuation, ~0 dB SPL. This was always well below threshold). The distributions obtained from responses to the no-probe and probe conditions were compared to determine the probability with which the two conditions could be distinguished. Since the constituent units varied across each stimulus condition, comparisons were only made with a given ‘probe present’ condition and the ‘probe absent’ condition in the exact same sub-population of units. To generate a percentage correct value for a given stimulus condition, we employed a method like that used for individual units. Five hundred spike counts were drawn randomly from each of the two distributions. 
Each consecutive spike count pair was compared and ‘trials’ in which the spike count was higher in the ‘probe present’ condition were labelled as ‘correct’. If the spike counts were identical a guess was made. This yielded population neurometric functions identical in form to those of the individual units. Stimulus conditions were grouped irrespective of CF, were expressed in dB relative to the behavioural audiogram of a guinea-pig (Fay 1988), and were grouped into 5-dB bins for the probe levels and 10-dB bins for the masker levels.
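A minimal sketch of the population analysis, assuming NumPy, is given below. The function names are hypothetical and the random seed is arbitrary; only the steps named in the text (normalised histograms, convolution across units, 500 random draws per distribution, guessing on ties) come from the description above.

```python
import numpy as np

def count_distribution(counts):
    """Normalised spike-count histogram: P(count = k) for k = 0..max(counts)."""
    hist = np.bincount(np.asarray(counts, dtype=int))
    return hist / hist.sum()

def population_distribution(unit_distributions):
    """Distribution of the summed spike count across units.

    Convolving the per-unit distributions treats the units as independent.
    """
    pop = np.array([1.0])  # point mass at a summed count of zero
    for dist in unit_distributions:
        pop = np.convolve(pop, dist)
    return pop / pop.sum()  # guard against floating-point drift

def population_percent_correct(present, absent, n_draws=500, seed=0):
    """Draw n_draws counts from each distribution and compare them pairwise."""
    rng = np.random.default_rng(seed)
    a = rng.choice(len(present), size=n_draws, p=present)
    b = rng.choice(len(absent), size=n_draws, p=absent)
    guesses = rng.random(n_draws) < 0.5  # coin flips used only for ties
    correct = (a > b) | ((a == b) & guesses)
    return float(correct.mean())
```

For a given stimulus condition, `present` and `absent` would be built from the same sub-population of units, as the text requires.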
The convolution performs an unweighted average of the spike count distributions of the different units available and, hence, implicitly assumes that their responses are statistically independent. Thus any potential information carried by individual units is equally and maximally valued. Neurons tuned to similar stimuli can provide independent information (Gawne et al. 1996; Nirenberg et al. 2001), particularly when responses are noisy. Responses of neurons in primary visual cortex were found to be almost independent if spikes produced by individual neurons were tracked separately, whilst activity summed across neurons produced redundancy (Reich et al. 2001). The sample of units considered for analysis contains units from different depths in different animals (12 animals for the (102,25)-ms condition and six for the (50,50)-ms condition). The probability of correlated activity between the units analysed is therefore small. Note also that since not all stimulus conditions were available in every unit, the number of units varied across stimulus conditions (from a few units to around 30).
Comparisons of spike count distributions to test for persistence of the response to the masker after the offset of the masker, or suppression below the spontaneous rate, were made using bootstrap methods to predict the probability that two distributions with the measured difference in means could arise if the two groups belonged to the same underlying spike count distribution. The samples from both original distributions were pooled, and then repeatedly randomly redistributed into two groups (500 times). For each resampling, the mean difference between the two groups was calculated. This generated a distribution of differences for the means, and allowed us to estimate the probability (our p value) that these two groups could have the observed difference in the mean if the samples were drawn from a single spike count distribution.
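The resampling procedure can be sketched as follows, assuming NumPy. The function name `resampling_p_value` is hypothetical, and the use of a two-sided comparison (absolute difference in means) is an assumption, as the text does not state whether the test was one- or two-sided.

```python
import numpy as np

def resampling_p_value(group_a, group_b, n_resamples=500, seed=0):
    """Probability of a mean difference at least as large as the observed one
    when the pooled samples are randomly redistributed into two groups."""
    rng = np.random.default_rng(seed)
    a = np.asarray(group_a, dtype=float)
    b = np.asarray(group_b, dtype=float)
    observed = abs(a.mean() - b.mean())
    pooled = np.concatenate([a, b])
    hits = 0
    for _ in range(n_resamples):
        shuffled = rng.permutation(pooled)
        # Split the shuffled pool into two groups of the original sizes
        diff = abs(shuffled[:a.size].mean() - shuffled[a.size:].mean())
        if diff >= observed:
            hits += 1
    return hits / n_resamples
```

With 500 resamples the smallest non-zero p value that can be resolved is 1/500 = 0.002.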
We recorded the responses of primary cortical neurons to individual pure tones, and pairs of sequentially presented pure tones: one ‘masker’ tone followed by a ‘probe’ tone. Tones were of equal frequency, close to the neurons’ characteristic frequencies (CF), and varied widely in level. Figure 1A shows in schematic form how a percentage of correct detections was derived by comparing, on a trial-by-trial basis, spike counts in a period when the probe is presented (left panel), and when the probe is absent (right panel). The resulting ‘neurometric’ functions (Fig. 1B) describe the percentage of trials on which the firing rate of an individual neuron would allow correct identification of the presence of the probe tone, as a function of the sound level of the probe. Forward masking is measured as a shift in this function, or a shift in a criterion threshold, to higher sound levels. This SDT based method is conceptually similar to the method used to derive detection thresholds in a psychophysical two-alternative forced choice task (see ‘Methods’ for details).
The SDT analysis can also be understood by considering spike count distributions: histograms of the number of spikes elicited by a stimulus on each presentation (Fig. 1C and D). Percentage correct is determined by the degree of overlap between the spike count distributions for the probe conditions (coloured distributions in Fig. 1C), and the no-probe condition (dotted black line in Fig. 1C). Figure 1C shows cartoon distributions of spike counts, and shows how they might be expected to change with either a change in the level of the probe or of the masker. The effect of changing stimulus conditions can be due to shifts in the shape and position of both spike count distributions. It thus also depends upon the statistics of the spike count in the absence of the probe (see Fig. 1D), and so depends on any effects of the masker on spike activity that considerably outlast the stimulus (see ‘Methods’ for further details).
The results described below are based on the SDT analysis of 98 units: 48 in the (102,25)-ms condition (19 single units and 29 multi-units) and 50 units in the (50,50)-ms condition (14 single units and 36 multi-units).
Figure 2 shows the neurometric functions (left column) and the corresponding rate-level functions (right column; mean number of spikes for each stimulus presentation) for five example units: three multi-units (MU: recordings in which the shape of the action potential varies considerably and most likely reflects the activity of a cluster of nearby neurons) and two single units (SU: recordings where the shapes of the action potentials are very similar and most likely originate from a single neuron). Each line represents a different masker condition (‘multiplication symbol’ corresponds to the no-masker condition). Changes in both the neurometric and rate-level functions caused by the presence of a masker were different for different units. For some units the addition of a masker (lines without symbols in Figure 2; masker levels are indicated in the insets) produced a shift of the neurometric and rate-level functions towards higher levels as the level of the masker increased (Fig. 2A, B and C). This corresponds to the distribution shift shown in Figure 1C when the level of the masker is increased. Neurometric functions were in some instances very similar to rate-level functions (Fig. 2A and B), whilst in other cases there were clear qualitative differences (Fig. 2C, D and E), such as a change in the gradient of one of the functions (Fig. 2D). Often these differences were attributable to a ceiling effect when the neurometric functions approached 100%, corresponding to conditions when spike count distributions did not overlap at all. However, not all units reached the maximum performance (100% correct responses), even when the neurometric function saturated (Fig. 2A, B and E). Failing to reach the maximum indicates that the spike-count distributions (see Fig. 1C) for the probe and no-probe conditions partially overlap even at high probe levels.
For some units the presence of the masker caused a drop in the spike count to below the spontaneous firing rate of the unit (Fig. 2D), with stronger reductions occurring at higher masker levels. This would correspond to having the distribution obtained from the no-probe condition (dotted line in Fig. 1C) moving to lower spike counts, and would actually increase percentage correct if it were not always outweighed by a larger reduction in the response to the probe (colored lines in left panel in Fig. 1C). Thus the masker depressed spontaneous activity, but not as much as it suppressed the driven firing rates to the probe. Other units occasionally showed the opposite effect: the presence of the masker produced an increase of the spontaneous activity relative to the unmasked condition (Fig. 2E). This corresponds to the shift in the no-probe distribution shown in Figure 1D. Some units also showed nonmonotonic spike count vs. level functions that were also apparent in the shape of the corresponding neurometric function (Fig. 2B). The suppressive effect here cannot be attributed to an overall reduction in the discharge rate of the neuron (i.e. spike rate adaptation as is seen in the AN). In the example shown (as in all other non-monotonic units seen) a 70 dB SL masker evoked few spikes yet was more effective than a 30 dB SL masker.
The examples shown illustrate quite typical levels of masking seen in cortical units. We often observed large shifts in the neurometric functions in the presence of a masker. At some high masker levels, the percentage correct could remain at or near chance (50%; Fig. 2A, B). These threshold shifts were often attributable to a reduction in the response to the probe and a shift in the firing rate to higher stimulus levels. The functions in AI were qualitatively different to those in AN (Relkin and Turner 1988) and resembled more closely those in IC (Nelson et al. 2009). In the AN, the presence of the masker produces a shift of the neurometric functions along the intensity axis (Relkin and Turner 1988) and this shift saturates at high masker levels. Taken together with the previous results, at the level of individual recordings our data support the hypothesis that there is a larger effect of preceding stimuli on the detection of the probe in cortical responses than seen at any level in the auditory pathway up to IC.
We do not know how populations of sensory neurons contribute to a percept, or a decision in a psychophysical task. A reasonable possibility is that we select out sub-populations of neurons appropriate to the task. If we can do this with infinite precision, performance would be limited only by the most sensitive neurons. While this seems unlikely, behavioural thresholds must be some function of the thresholds across a population of neurons and we have therefore analysed our data in several different ways.
Threshold shifts across our population of recorded units were first evaluated by assessing, separately for each unit, the SDT-derived shift (relative to the unmasked threshold) in probe detection threshold as a function of the level of the masker. This analysis is very similar to that used in previous physiological studies. The criterion for correct detection was arbitrary; if it were set too high, then many neurons would not have exceeded it at any probe level. Thresholds for correct detection were therefore calculated as the level at which neurometric functions gave 60% correct responses. In the (102,25)-ms condition, using a 60% criterion increased the number of points by 32% relative to using 75%; in the (50,50)-ms condition the increase was 42%. The choice of this criterion did not affect the qualitative properties of the plots.
Threshold shifts across the population, derived with SDT analysis and presented in Figure 3A and B, increased reliably with the level of the masker above the unmasked threshold. The effect of relative masker level on the amount of masking produced was similar across units, and showed no signs of saturating even at the highest levels. For a given masker level, the spread of threshold shifts was larger in the (50,50)-ms condition, with some units showing threshold shifts smaller than in the (102,25)-ms condition, for the same range of masker levels. However, in the (102,25)-ms condition the probe is shorter than the masker. The threshold of detection of the masker is probably therefore lower than the threshold of detection of the probe (in the absence of the masker), which is the reference level used to determine the masker level in Figure 3A. Thus the masker levels plotted in Figure 3A might be slightly underestimated, exaggerating the difference in masking between the two conditions. The slopes of the growth of masking functions, determined from a regression of the data in Figure 3A and B (considering only maskers above the probe threshold), are 0.79 and 0.80 dB/dB for the (102,25)-ms and (50,50)-ms conditions, respectively. Expressed in this way, the population of units appears fairly homogeneous in its behaviour. It again suggests that large threshold shifts were ubiquitous in the auditory cortex, and furthermore were proportional to the level of the masker above the unmasked threshold.
The proportion of units for which it was not possible to estimate either a detection threshold (the neurometric function did not reach the 60% correct point) or a threshold shift is indicated in the top histograms of Figure 3A and B. The histograms show that the proportion of occasions when there was complete masking increased with the level of the masker, and is larger in the (102,25) than in the (50,50)-ms condition.
For both (102,25) and (50,50)-ms stimulus conditions, some units showed an effect of masking (threshold shifts >0) even when the masker level was below the detection threshold for the probe when presented alone. However, a failure to meet the SDT criterion for threshold does not necessarily mean that there was no response to the masker. In Figure 3A and B responses to the masker that were less than 20% above the spontaneous rate (a commonly used criterion to define neural thresholds) are indicated by black symbols, whilst masker conditions that elicited firing rates more than 20% above spontaneous rate are indicated by open symbols. This demonstrates two things. Firstly, in both stimulus conditions, in the majority of instances in which the response to the masker does not meet either criterion, shifts in threshold are fairly small (mostly <10 dB). Thus, forward masking occurs mainly when there is a response to the masker. Secondly, in Figure 3B, the majority of symbols below 0 dB re. probe/masker threshold are open. That is, although the masker was below threshold according to SDT analysis, it evoked a response above the spontaneous rate. So detection thresholds obtained with SDT analysis can differ from those based on a criterion increase in the mean firing rate. Interestingly, the difference is much less marked for the (102,25)-ms condition, suggesting that the differences between criteria depend somewhat on stimulus conditions.
The threshold shifts obtained for the (102,25)-ms condition in Figure 3A can be compared with reports of sub-cortical data using similar stimuli. Cortical units showed minimum threshold shifts of 38 dB for a masker 40 dB above the unmasked threshold, and masking was complete above this level in most units. The maximum threshold shift estimated from responses of individual AN fibers was 20 dB (Relkin and Turner 1988; Turner et al. 1994). Cortical threshold shifts were also larger than those derived from the N1 component of the CAP (28 dB for a 40 dB increase in masker level; Relkin and Smith 1991). Thresholds derived from SDT-based analysis of forward masked responses in the VCN differ across neuron types (Bleeck et al. 2006). The magnitude of forward masking was largest for onset and onset chopper units (a mean of 30 dB for a 40 dB increase in masker level; Ingham et al. 2006), consistent with previous studies that did not use SDT (Shore 1995). However, onset units in the VCN are local inhibitory interneurons, and so their effect is on other units in the cochlear nucleus. Threshold shifts derived from chopper units, which form a major excitatory projection to the IC, were lower (9 dB for a 40 dB increase in masker level; Ingham et al. 2006). Unpublished data from the IC of anaesthetised guinea-pig suggest that units which respond in a sustained manner to tones show forward masking similar to that of VCN chopper units (Ingham, personal communication). Cortical threshold shifts were also larger than those seen in units in the IC of awake marmoset (a mean of 18 dB for a 40 dB increase in masker level; Fig. 6, Nelson et al. 2009), albeit with a 20 ms gap between the masker and the probe, which would be expected to reduce masking. Also in that study, the mean slope of the growth of masking function across the population when plotted in this way was considerably lower than in our cortical data (0.5 dB/dB).
We also estimated the growth of masking slopes in individual units by fitting a straight line to the individual growth of masking functions. Slopes varied between about 0 (i.e. no threshold shift in the presence of the masker) and 1.23 dB/dB, with a mean slope of 0.49 dB/dB for the (102,25)-ms condition and 0.41 dB/dB for the (50,50)-ms condition (panels C and D in Fig. 3). This compares with a mean of 0.26 dB/dB (max: 0.5 dB/dB) in the AN (Fig. 7 in Relkin and Turner 1988) and 0.36 dB/dB (max: 1.46 dB/dB) in the IC (Fig. 5 in Nelson et al. 2009). The difference in the masker–probe interval used in the cortical and IC experiments is likely to contribute to the differences in slope reported. Indeed, several psychophysical studies have shown that the slope of the growth of masking function decreases as the interval separating the masker from the probe increases (Jesteadt et al. 1982; Moore and Glasberg 1983; Widin and Viemeister 1979). Furthermore, a recent study of the effects of different pulse rates on the transmission of temporal information to the auditory cortex, through cochlear implant stimulation of the auditory nerve in anesthetised guinea-pigs (Kirby and Middlebrooks 2010), observed growth of masking slopes near 1 at masker offset and shallower slopes at masker–probe intervals of 64 ms (Fig. 8 in Kirby and Middlebrooks 2010).
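As an illustration of the straight-line fit, the sketch below estimates a growth of masking slope by least squares, assuming NumPy. The function name and the option to discard maskers at or below the unmasked threshold are hypothetical, and the fitting routine actually used in the study is not specified in the text.

```python
import numpy as np

def growth_of_masking_slope(masker_levels_db, threshold_shifts_db,
                            above_threshold_only=True):
    """Least-squares slope (dB/dB) and intercept (dB) of threshold shift
    against masker level, both relative to the unmasked threshold."""
    x = np.asarray(masker_levels_db, dtype=float)
    y = np.asarray(threshold_shifts_db, dtype=float)
    if above_threshold_only:
        keep = x > 0  # assumption: fit only maskers above unmasked threshold
        x, y = x[keep], y[keep]
    if x.size < 2:
        raise ValueError("need at least two points to fit a line")
    slope, intercept = np.polyfit(x, y, 1)
    return float(slope), float(intercept)
```

A slope of 1 dB/dB would mean that each decibel added to the masker raises the probe threshold by one decibel; a slope near 0 means the masker has no effect on threshold.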
We also observed that larger slopes occurred for units with lower unmasked thresholds (r² = 0.75 in Fig. 3C and r² = 0.46 in Fig. 3D). That is, the effect of masker level was stronger for the units more sensitive to the tone when there was no masker. This seems to be because units with higher thresholds do not show much threshold shift for maskers that are below their unmasked threshold. Such a relationship has not been reported sub-cortically. We did not find a clear relationship between the slope and the difference between the tone frequency and CF as there was between slope and unmasked threshold. We also did not find a clear relationship between unmasked threshold and the difference between the tone frequency and the CF. This observation reflects the variation of the receptive fields of the cortical units recorded. Many had wide receptive fields for which thresholds did not rise rapidly away from CF.
Although a wide range of probe frequencies was tested, 66% of these were between 1 and 6 kHz. Only a minority of frequencies tested was below 1 kHz (8%) and above 6 kHz (26%). Although we did not observe any systematic differences across CF, it was not possible to make reliable conclusions concerning possible effects of frequency in the growth of masking slopes reported.
The threshold shifts obtained here for cortical neurons, for a masker level of 40 dB re. unit unmasked threshold, were also larger than the shift obtained in psychophysical studies. Plack and Oxenham (1998), using a 100-ms masker and 30-ms probe, measured a threshold shift of 23 dB for a masker approximately 40 dB above the threshold of detection of the probe (Fig. 5 of Plack and Oxenham 1998, 2 ms gap). We obtained physiological threshold shifts of 38 dB or more. The shifts in threshold we see were also larger than those seen by Turner et al. (1994), in which the durations and ramps of their stimuli matched ours exactly, but in which the masker was a narrowband noise. In their data, a masker 40 dB above absolute threshold produced a shift of only 8 dB. Cortical threshold shifts are also larger than those obtained by Jesteadt et al. (1982). As the masker increased from approximately 20 to 80 dB SPL, that is, for a 60 dB increase in masker level, thresholds increased by 32 dB (Fig. 1 in Jesteadt et al. 1982). However, in this psychophysical study both the masker and the gap were longer than in our study, making the comparison difficult. It appears nevertheless, that the threshold shifts we see in cortical neurons are larger than those observed psychophysically.
The previous analysis considered the effect of a masker on probe threshold within each single unit. However, physiological differences between neurons, in their receptive fields and thresholds (Fig. 2), mean that detecting a probe of a given frequency may not be mediated by the same neuron or groups of neurons at all masker levels. Potentially, we are able to select out that sub-population of neurons that has the lowest threshold in each set of conditions. In this case, we should not consider the threshold shift within a single neuron, but rather the absolute threshold for a single masker condition. Figure 4 shows the data from Figure 3, but with probe thresholds and masker levels expressed relative to the guinea pig audiometric threshold (Fay 1988; this reduced the variability across neurons when compared in dB SPL because the auditory periphery varies in its sensitivity to different frequencies). The histograms to the left of panels A and B indicate the distribution of unmasked thresholds for the two stimulus conditions. Probe detection thresholds were estimated from the neurometric functions as the probe level at which the function reaches 60%.
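As an illustration of reading a threshold off a neurometric function at the 60% criterion, a minimal sketch is given below. Linear interpolation between tested probe levels is an assumption (the text only states that the threshold is the level at which the function reaches 60%), and the function name and example values are hypothetical.

```python
import numpy as np

def neurometric_threshold(probe_levels_db, percent_correct, criterion=60.0):
    """Return the lowest probe level at which percent correct reaches `criterion`.

    Linear interpolation between adjacent tested levels is assumed.
    Returns None if the function never reaches the criterion, which would
    correspond to complete masking over the tested range.
    """
    levels = np.asarray(probe_levels_db, dtype=float)
    pc = np.asarray(percent_correct, dtype=float)
    if pc[0] >= criterion:
        return float(levels[0])  # already at criterion at the lowest level
    for i in range(1, len(levels)):
        lo, hi = pc[i - 1], pc[i]
        if lo < criterion <= hi:
            frac = (criterion - lo) / (hi - lo)
            return float(levels[i - 1] + frac * (levels[i] - levels[i - 1]))
    return None

# Hypothetical neurometric function: crosses 60% halfway between 10 and 20 dB
example_threshold = neurometric_threshold([0, 10, 20, 30], [50, 55, 65, 90])
```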
The overall increase in the threshold level with masker level was again evident for both stimulus conditions. Probe thresholds of 60 and 55 dB or more were observed at a masker level of about 60 dB SL, for both masking conditions, respectively (Fig. 4A and B). At masker levels much higher than this, masking was often complete. A statistical analysis of the ratio between probe threshold and masker level showed a significant difference between the two stimulus conditions (Kolmogorov-Smirnov test, p<0.05). Thresholds at low masker levels tended to be lower in the (50,50)-ms condition relative to the (102,25)-ms condition. Overall, the effect of masking was larger in the longer masker/shorter probe condition, as expected. Thresholds and threshold shifts were not different for single and multi-units.
The rationale for looking at the data in this way was that perceptual performance might be determined by the most sensitive neurons in a population. Figure 4 shows lines drawn approximately through the minimum masked thresholds at each masker level, which would represent the best performance achieved by the most sensitive neurons across the population of units at each masker level. In both conditions minimum thresholds increased nearly linearly with masker level with slopes of 0.85 dB/dB (Fig. 4A) and 1.2 dB/dB (Fig. 4B), with the most sensitive units having lower thresholds in the shorter masker/longer probe condition.
This representation of the data is probably more comparable with the way these neurons might be used in a psychophysical forward masking experiment. It again confirms that forward masking is a robust and strong effect across our population of units, and supports the hypothesis that forward suppression in central neurons may contribute to psychophysical forward masking. Also again, it suggests that we are seeing a stronger effect than is observed psychophysically.
Figures 2, 3 and 4 illustrate the effect of masking (as an increase in threshold) calculated for individual units and for the most sensitive units at each masker level. However, it is not necessarily the case that the detection of the probe is based on the information yielded by the best individual units. We of course do not know how these spike counts are combined across a population of cells during a psychophysical task. In order to gain some insight into how a population of neural responses might determine the effect of forward masking, we adapted the signal-detection analysis to deal with a large number of cells, similar to a signal detection model proposed by Micheyl et al. (2005). This method constructs population spike count distributions based on the assumption that the neurons are behaving in a statistically independent manner (see ‘Methods’ section). These population spike count distributions were then employed in a Monte-Carlo simulation, which allowed us to generate population neurometric functions similar to those for individual units.
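A Monte-Carlo population analysis of this general kind can be sketched as below. This is not the procedure from the Methods section, which is not reproduced here. It assumes, purely for illustration, that each simulated trial draws one spike count per unit independently from that unit's recorded counts, that counts are summed across units, and that ties are scored as half correct. All names are hypothetical.

```python
import numpy as np

def population_percent_correct(probe_counts, no_probe_counts, n_sim=10000, seed=0):
    """Monte-Carlo estimate of population percent correct.

    probe_counts, no_probe_counts: one sequence of per-trial spike counts per unit.
    Units are resampled independently (the independence assumption); summing
    counts across units is an illustrative pooling rule, not a confirmed one.
    """
    rng = np.random.default_rng(seed)
    pop_probe = np.zeros(n_sim)
    pop_blank = np.zeros(n_sim)
    for unit_probe, unit_blank in zip(probe_counts, no_probe_counts):
        # Independent draw for every unit on every simulated trial
        pop_probe += rng.choice(np.asarray(unit_probe, dtype=float), size=n_sim)
        pop_blank += rng.choice(np.asarray(unit_blank, dtype=float), size=n_sim)
    wins = np.sum(pop_probe > pop_blank)
    ties = np.sum(pop_probe == pop_blank)
    return float(100.0 * (wins + 0.5 * ties) / n_sim)

# Hypothetical two-unit population in which the probe always adds spikes
example_pc = population_percent_correct([[3, 4], [2, 5]], [[0, 1], [0, 0]])
```

Repeating this at each probe level would give one point per level of a population neurometric function for a given masker condition.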
Population neurometric functions were computed from all the units for which a given condition was tested. Figure 5A shows the resulting population neurometric functions computed for the (102,25)-ms condition, when there was no silent gap between the masker and the probe. Population neurometric functions were estimated for the unmasked condition (the leftmost line in panel A, with ‘plus’ symbol) and for masker levels ranging from 30 to 90 dB SL (see legend), distributed in intervals of 10 dB (sound level is again represented relative to a guinea-pig audiogram). The dashed line at 50% indicates chance performance. Neurometric functions shifted monotonically towards higher probe levels (towards the right in the plot) as the masker level increased from the unmasked condition to a 90 dB SL masker. As for individual units, thresholds were derived from the population neurometric functions as the level at which the function reached 60% correct responses. These thresholds are represented in Figure 5B for the (102,25)-ms (‘multiplication symbol’) and (50,50)-ms (‘empty circles’) groups separately. Also represented are the minimum individual thresholds at each masker level, in the (102,25)-ms condition taken from Figure 4A (grey line). In the absence of any masker, the population threshold was 12 dB SL or less, and rose steadily with masker level (Fig. 5B). Population thresholds were similar for the (102,25)-ms and the (50,50)-ms conditions (Fig. 5B). Thresholds derived from the population analysis were remarkably similar to the minimum thresholds derived from the analysis of individual units, despite the fact that the population analysis included all units irrespective of their individual neurometric functions. This result suggests that the performance of the population model, in this task and with the methods used, is determined essentially by the performance of the individual units best suited to detecting a particular probe. 
This is consistent with the observation of similar thresholds and threshold shifts for single and multi-units (Figs. 3A, B, 4A and B), indicating that the performance of multi-units is determined by the most sensitive unit within the ensemble.
For 15 units, the masker–probe interval was varied and threshold shifts estimated. Figure 6 A-C illustrates the resulting neurometric functions for three units, for different intervals and different masker levels. As the masker–probe separation increased the neurometric functions became closer to the unmasked condition (asterisks), consistent with a weaker effect of masking at longer intervals. Recovery was still not complete at a separation of 100 ms. At these long intervals thresholds were still about 5 dB higher than the unmasked thresholds (Fig. 6D). This is consistent with previous observations of the effect of a preceding masker on the response to the probe that can extend for 53–430 ms, for masker and probe durations of 30 ms (Brosch and Schreiner 1997). For the higher masker levels (circles and squares), it was not possible to estimate a threshold at short masker–probe intervals (dashed lines in Fig. 6D), which might suggest that the effect of a short interval is stronger the higher the masker level.
The SDT analyses used here to derive the neurometric thresholds compare, trial-by-trial, the response evoked by the probe with the spike count in the same time-window of a trial when the probe is not presented, as in Figure 1A. The masker is present in both trials. The trial-by-trial comparison of the spike counts is very similar to measuring the overlap of the spike-count distributions for when the probe is present and when it is absent (Fig. 1C). Increasing the level of a preceding masker might reduce the mean spike count evoked by the probe (Fig. 1C), and/or increase the variability of the response to the probe. Alternatively, a preceding masker can produce a change in the spike count distribution of the no-probe trials (Fig. 1D). To investigate these alternatives, we examined the spike count statistics in response to the probe tones as a function of probe and masker levels (Fig. 7), and changes in firing rate due to a preceding masker when no probe was presented (Fig. 8).
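The comparison of probe and no-probe spike counts can be illustrated with a short sketch. It assumes that every probe trial is compared with every no-probe trial and that ties count as half correct, which makes the result equal to the area under the ROC curve. The exact pairing and tie rules used in the study are not given in this section, so both are assumptions, as is the function name.

```python
import numpy as np

def percent_correct(probe_counts, no_probe_counts):
    """Percent of (probe, no-probe) trial pairs in which the probe trial
    has the larger spike count; ties are scored as half correct (assumption)."""
    p = np.asarray(probe_counts, dtype=float)[:, None]     # column: probe trials
    n = np.asarray(no_probe_counts, dtype=float)[None, :]  # row: no-probe trials
    wins = np.sum(p > n)
    ties = np.sum(p == n)
    return float(100.0 * (wins + 0.5 * ties) / (p.size * n.size))

# Hypothetical counts: fully separated distributions give 100% correct,
# identical distributions give chance performance (50%)
example_separated = percent_correct([2, 3], [0, 1])
example_chance = percent_correct([1, 1], [1, 1])
```

With this measure, a masker can lower performance either by reducing the probe-evoked counts or by raising the counts on no-probe trials, which are the two alternatives examined in the text.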
The mean number of spikes evoked by the probe per presentation was low (Figs. 2A-E, 7A and C). The average spike count, calculated across the 50 repetitions in the unmasked condition, was never more than 8.5 spikes per presentation. In some cases probes evoked no more than one spike per presentation (22 units) and therefore neural activity is better expressed in terms of probability of firing (Calford and Semple 1995). Despite this, spike count showed an orderly increase that correlated very well with percentage correct. In most units (e.g. Fig. 7A), this relationship was tight across all masker conditions. Occasionally, however, a unit was found for which the ratio between the spike count and the variance changed for different masker conditions (see below). Individual Spearman correlation coefficients calculated between average spike counts and number of correct responses, considering all probe and all masker levels together for a given unit, varied between 0.33 and 1 for all units, with 90% of the units having Spearman coefficients higher than 0.6. A high correlation coefficient (as in Fig. 7A, but not in Fig. 7C) suggests that the main effect of a masker was to reduce the spike count in response to the probe, directly determining the detectability of the probe. Thus, the effect of adding a masker on the response to the probe was in most cases very similar to reducing the sound level of the probe (as in Fig. 1C).
Despite the tight relationship between correct detection and spike count, we nevertheless wanted to know whether spike count variability (or more precisely the shape of the spike count distributions) also played a role in determining thresholds. The high Spearman correlations do not rule this out, since they only test for order, and not for proportionality of changes in spike rate and percentage correct. The statistics of cortical neurons has been studied extensively in the visual system, where firing approximates a Poisson process, and so the variance increases with the spike rate (e.g. Tolhurst et al. 1983; Shadlen and Newsome 1998). In this case, an increase in probe level must produce an increase in firing rate that can overcome an increase in variability (the width of the spike count distribution in response to the probe). However, in auditory cortex single units may approximate binomial behaviour, firing once very reliably for a short tone-pip, whilst recordings of multiple-units can show greater than Poisson-like variability (DeWeese et al. 2003; Shadlen and Newsome 1998). In our data, consisting of single and multi-units, it was therefore possible that variance in recordings may increase more than, or less than the increase in firing rate, or even decrease with the increase in firing rate (variability is zero for a binomial process with a probability of one). What we found was that the increase in mean spike count in response to the probe was accompanied by an increase in variance. In the example unit in Figure 7B, below the 60% correct threshold (open symbols) the mean spike count followed a Poisson distribution, with the variance and mean being very similar (Fig. 7B and D). The firing of many units approximated Poisson processes, but variance was observed to increase less than the mean probability of firing (Fig. 7B). 
This tended to be only above the 60% threshold for probe detection (filled symbols), indicating that the detectability of the probe was also improved by a smaller increase in variance than would be expected from a Poisson model.
A convenient way to consider the effect of firing rate and variability together is through the Fano factor, the ratio between the variance in spike count and the associated mean (variance/mean; Buracas et al. 1998; DeWeese et al. 2003). The Fano factor across the population was mostly slightly larger than 1 for probe levels below detection threshold (Fig. 7E and F), especially for the longer masker condition. That is, at these probe levels the variance in spike count was usually larger than the mean. For probe levels above threshold the Fano factor could be larger or smaller than one (Fig. 7E and F), but was usually smaller than the value below threshold in a given unit. Thus, the increase in the detectability of a probe was mediated by increases in firing rate with increasing sound level, which competed with increases in variance (as illustrated in Fig. 1C), although the variance grew less than would be expected if the neurons closely obeyed Poisson statistics.
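For concreteness, the Fano factor defined above (variance/mean of the spike count) can be computed as in the following sketch. The use of the unbiased sample variance (`ddof=1`) and the function name are assumptions made for illustration.

```python
import numpy as np

def fano_factor(spike_counts):
    """Variance-to-mean ratio of spike counts across repeated presentations.

    Uses the unbiased sample variance (ddof=1), which is an assumption.
    Returns NaN when the mean count is zero, where the ratio is undefined.
    """
    counts = np.asarray(spike_counts, dtype=float)
    mean = counts.mean()
    if mean == 0:
        return float("nan")
    return float(counts.var(ddof=1) / mean)

# Hypothetical counts: a perfectly reliable unit has a Fano factor of 0,
# whereas a Poisson process would give a value near 1
example_reliable = fano_factor([1, 1, 1, 1])
example_variable = fano_factor([0, 2, 0, 2])
```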
Occasionally a unit (e.g. Fig. 7C) was encountered in which the masker shifted the relationship between percentage correct and firing rate. This suggests that some quantity other than the spike count in response to the probe was affecting masking. One can see here that the spike rate increases with the masker level, particularly below threshold. This effect can be explained by neuronal activity evoked by the masker that persists after the end of the masker, ‘swamping’ the response to the probe (as illustrated in Fig. 1D), and has been widely proposed (Moore et al. 1988; Oxenham 2001; Plomp 1964; Zwicker 1984) as a mechanism to explain forward masking. In this unit, it considerably reduces the Spearman correlation (to 0.60). Despite this behaviour the variance in response to the probe almost exactly matched the spike count (Fig. 7D).
Masking by persistence was investigated across the population by analysing and comparing firing probabilities in conditions where the probe was not presented. The mean spike count in the time window after the presentation of the masker was subtracted from that calculated in an equivalent time window when the masker was absent. Negative values are indicative of persistence of the activity after the presentation of the masker. The difference in firing rate was calculated separately for each unit and masker level (Fig. 8A and B). In the majority of the cases the difference was near to zero. We tested statistically whether these differences were significant individually for each masker condition in each unit (p<0.01 uncorrected, bootstrap test; see Methods). Negative differences in spike count, that would suggest persistence, were significant in 4% of the cases in the (102,25)-ms condition, and in 22% of the cases in the (50,50)-ms condition. The positive differences were significant in 2% of the cases in the (102,25)-ms condition (panel A) and 7% in the (50,50)-ms condition (panel B). Overall, this suggests that in the majority of stimulus conditions persistent response to the masker plays little or no role in forward masking, but that this may be somewhat dependent on stimulus conditions. For short maskers, a sizeable minority of our data showed a persistent response that outlasted the masking stimulus. Brosch and Schreiner (1997) observed no overlap of responses evoked by the masker and the probe beyond stimulus onset asynchronies of 20 ms, with masker and probe 30-ms long. In the IC, only a small proportion of units showed effects of persistence (Nelson et al. 2009).
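The persistence measure and a bootstrap test on it can be sketched as follows. The details of the bootstrap test used in the study are in the Methods section, which is not reproduced here, so this is a generic percentile-style bootstrap on the difference of means offered only as an illustration. The function name, the number of resamples and the two-sided p-value rule are all assumptions.

```python
import numpy as np

def bootstrap_mean_difference(no_masker_counts, masker_counts, n_boot=5000, seed=0):
    """Difference in mean spike count (masker absent minus masker present)
    in the probe time window on no-probe trials, with a bootstrap p-value.

    Negative differences indicate more spikes after the masker (persistence).
    The p-value is twice the smaller tail proportion of the resampled
    differences relative to zero, capped at 1 (a generic rule, assumed here).
    """
    rng = np.random.default_rng(seed)
    a = np.asarray(no_masker_counts, dtype=float)
    b = np.asarray(masker_counts, dtype=float)
    observed = a.mean() - b.mean()
    # Resample trials with replacement, separately for each condition
    boot_a = rng.choice(a, size=(n_boot, a.size), replace=True).mean(axis=1)
    boot_b = rng.choice(b, size=(n_boot, b.size), replace=True).mean(axis=1)
    diffs = boot_a - boot_b
    p = 2.0 * min(np.mean(diffs <= 0), np.mean(diffs >= 0))
    return float(observed), float(min(p, 1.0))

# Hypothetical unit with strong persistent firing after the masker
example_quiet = [0, 0, 1, 0, 0, 1, 0, 0, 0, 0] * 5
example_after_masker = [3, 4, 3, 5, 4, 3, 4, 5, 3, 4] * 5
example_diff, example_p = bootstrap_mean_difference(example_quiet, example_after_masker)
```

Because the test is applied separately to each masker condition in each unit without correction, some significant cases are expected by chance at the p<0.01 level.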
This study estimated forward masked thresholds in AI neurons using SDT. This method of deriving detection thresholds takes into account not only the mean spike count of the response but also the variability in cortical responses. Analysing the threshold of units in this way is much more analogous to the behavioural process, and so is a better basis for comparison with psychophysics. It is clear from our data that SDT-based thresholds do not always give the same values as more arbitrary methods (e.g. the 20% criterion in Fig. 3). Further analysis showed that response variability grew with the increase in mean firing rate, and therefore reduced the increase in detectability that an increase in spike count alone would have produced. SDT analysis also allowed us to take account of the possibility that persistent responses to the masker might contribute to forward masking. Finally, it is also amenable to expansion to population-based ‘signal detection models’. This allowed us to show that performance of the population could approach that of the best performing units in that population.
Threshold shifts derived from SDT measures of AI cortical activity were larger than those estimated from AN fibers (Relkin and Turner 1988; Turner et al. 1994) and sub-cortical auditory neurons (Ingham et al. 2006; Nelson et al. 2009), as we hypothesised. Growth of masking functions were qualitatively different to the nerve and had a wide dynamic range more comparable to those in the IC. Quantitatively, forward masking was stronger than any previous sub-cortical measures, with larger threshold shifts and steeper growth of masking slopes. Thresholds often increased monotonically until masking was complete. The large magnitude of masking was seen across the population of neurons, and across all the analyses.
In addition to the likely genuine cortical contribution to forward masking, there are several other factors that might have influenced our results. Following Relkin and Turner (1988), in the present study there was no silent gap between masker and probe. As mentioned above, this likely contributed to the larger threshold shifts and growth of masking slopes observed in our study relative to previous physiological studies, namely the results in the IC in awake marmosets (Nelson et al. 2009), where a 10 ms gap was used in the default condition. In our data, the introduction of a 25 ms gap could make the difference between complete suppression and a measurable threshold (Fig. 6). Nevertheless, the absence of a masker–probe interval does not fully explain the differences observed.
Another factor to consider is anaesthesia (Gaese and Ostwald 2001; Syka et al. 2005; Zurita et al. 1994). Brosch and Scheich (2008) found no differences in forward inhibition in cortical neurons between Ketamine-anaesthetised and awake monkeys. However, responses to click trains in awake and Ketamine-anaesthetised rat showed that anaesthesia significantly increases the duration of suppression by preceding stimuli in auditory cortex (Rennaker et al. 2007). Pentobarbital markedly increases this effect relative to Ketamine (Fig. 3 in Wehr and Zador 2005). It also reduces the proportion of AI neurons that phase lock to amplitude modulated tones, and increases trial-to-trial variability, relative to awake-state responses (Ter-Mikaelian et al. 2007). Some studies under Urethane, as used here, found little effect on sensory responses (Angel and Gratton 1982; Angel 1984). However, in the auditory cortex it decreases spontaneous and driven activity (Albrecht and Davidowa 1989; Capsius and Leppelsack 1996). Unlike Ketamine, which acts principally on excitatory NMDA receptors to reduce sensitivity to glutamate, or Pentobarbital, which primarily accentuates the effect of inhibitory GABAA receptors, Urethane anaesthesia is produced by a modest effect on multiple receptor systems (Hara and Harris 2002). The effects of Hypnorm, a combination of a mu-opioid agonist (fentanyl) and a dopamine antagonist (fluanisone), on cortical activity are not clear. Thus, anaesthesia may well be accentuating the effects of forward masking in our data compared with responses in an awake animal, but to what degree it is difficult to say. It seems unlikely that anaesthesia can explain all of the differences we observe compared to the IC of awake marmosets (Nelson et al. 2009).
Temporal precision in the IC is comparatively unaffected by anaesthesia, and even in an awake preparation cortical responses adapt more quickly, and synchronise less well to amplitude modulations above 10–12 Hz (Ter-Mikaelian et al. 2007).
The masking we observed might have been influenced by the intertrial interval (ITI; which ranged from 1–1.6 s). However, since all stimulus conditions were randomised, and thresholds are based on a comparison between two spike counts, this effect is probably small. Although ITI is known to influence cortical responses (Hocherman and Gilat 1981), Brosch and Schreiner (1997) found that masked tuning curves were independent of ITI. Nevertheless, if short ITIs reduced the responses evoked by probes, but had little effect on the low firing rates in ‘no probe’ conditions, thresholds might be elevated slightly.
The stimuli used here were chosen to match previous neurophysiological data, but are less well suited to direct comparison with psychophysics. The complications of anesthesia, the diversity of CF range, the wide range of masker levels, and species differences demand cautious comparison. Although forward masking in animal models has been less extensively investigated behaviourally, it appears similar to that seen in humans (Halpern and Dallos 1986; Salvi et al. 1982).
Recent psychophysical studies showed that the slopes of forward-masking growth-of-masking functions for pure tones can vary between 0.5 and close to 1 dB/dB (Plack and Oxenham 1998), depending on whether the probe and masker are in the compressive or in the linear region of the basilar membrane. For very short gaps between the masker and the probe, masking is strong and the threshold level of the probe can be quite similar to that of the masker, and growth of masking can approach 1 dB/dB. Jesteadt et al. (1982) measured slopes that varied from below 0.5 to close to 1 dB/dB. On the other hand, much less masking (0.22 dB/dB) is seen for stimuli using the same ramp-times and durations as our stimuli, but with narrowband noise maskers (Turner et al. 1994). Thus, there is a variation in the psychophysics which may be attributable to differences in stimulus parameters.
The magnitude of forward masking reported here, using SDT, suggests that, consistent with previous studies (Calford and Semple 1995; Brosch and Schreiner 1997; Eggermont 2000; Wehr and Zador 2005), a substantial amount of the effect of preceding stimuli on responses measured in the auditory cortex is cortical in origin. Attributing threshold shifts to a reduction in the firing rate in response to the probe (Fig. 7), and the short masker–probe gaps investigated here, suggest that masking may be a combination of synaptic depression (Wehr and Zador 2005), after-hyperpolarization and inhibition (see Brosch and Schreiner 1997 for a detailed discussion). Non-monotonic spike count vs. level functions and neurometric functions are consistent with a change in the balance between excitatory and inhibitory neuronal inputs, either at cortical level (Calford and Semple 1995; Brosch and Schreiner 1997) or below (e.g. Greenwood and Maruyama 1965). A physiological and modeling study (Eggermont 2000) of responses of cortical neurons to gap-in-noise stimuli suggested that strong responses to the onsets of sounds induced after-hyperpolarization that dominated suppression for 40–50 ms after the first onset. Although these were broadband stimuli, this is consistent with the strong masking we observed in the (50,50)-ms condition, and the predominance of monotonic neurometric functions. On the other hand, the recent study by Kirby and Middlebrooks (2010) of the effects of pulse rate on the expression of temporal information in the AI of anaesthetised guinea pigs did not observe a change in the effect of masking with cortical depth, which suggests a subcortical origin to the central contribution to the effect of masking.
The thresholds and threshold-shifts under forward-masking, measured using SDT were larger than seen sub-cortically and in the auditory periphery, where threshold shifts were not as large as seen psychophysically. Instead, cortical threshold shifts appear to be larger than seen psychophysically.
In addition to the explanations we have already discussed, it is possible that probe detection is not based on the spike counts of individual units or of the population of units, as we have assumed here, and that other parameters of the neuronal activity have to be considered. For example, the degree of correlated activity to the probe across a population of neurons might vary with masking. It also remains to be investigated whether forward masking is different in other auditory cortical fields. Nevertheless, the current data suggest, as proposed by Relkin and Turner (1988), that there is a substantial central contribution to forward masking. When considered together with recent studies of auditory streaming in the auditory cortex (Bee and Klump 2005; Fishman et al. 2004; Micheyl et al. 2005; Fishman et al. 2001), it seems likely that central processes of adaptation play multiple roles in our perception of on-going sounds.
Work supported by the Medical Research Council, UK. We would like to thank Chris Scholes, who helped in some of the data collection. We are also grateful to Ian Winter and Michael Akeroyd for their comments on the manuscript, Chris Plack for fruitful discussions and help using the temporal window model of forward masking, which aided our understanding of the psychophysical data, and finally to two anonymous reviewers for their comments and suggestions on preliminary versions of the manuscript.