The cochlear nucleus angularis (NA) is widely assumed to form the starting point of a brain stem pathway for processing sound intensity in birds. Details of its function are unclear, however, and its evolutionary origin and relationship to the mammalian cochlear-nucleus complex are obscure. We have carried out extracellular single-unit recordings in the NA of ketamine-anesthetized barn owls. The aim was to re-evaluate the extent of heterogeneity in NA physiology because recent studies of cellular morphology had established several distinct types. Extensive characterization, using tuning curves, phase locking, peristimulus time histograms and rate-level functions for pure tones and noise, revealed five major response types. The most common one was a primary-like pattern that was distinguished from auditory-nerve fibers by showing lower vector strengths of phase locking and/or lower spontaneous rates. Two types of chopper responses were found (chopper-transient and a rare chopper-sustained), as well as onset units. Finally, we routinely encountered a complex response type with a pronounced inhibitory component, similar to the mammalian typeIV. Evidence is presented that this range of response types is representative for birds and that earlier conflicting reports may be due to methodological differences. All five response types defined were similar to well-known types in the mammalian cochlear nucleus. This suggests convergent evolution of neurons specialized for encoding different behaviorally relevant features of the auditory stimulus. It remains to be investigated whether the different response types correlate with morphological types and whether they establish different processing streams in the auditory brain stem of birds.
The cochlear nucleus is the first brain stem nucleus of the auditory pathway and, in birds, is subdivided into the nucleus magnocellularis (NM) and the nucleus angularis (NA), both of which are contacted by collaterals of the same auditory-nerve fibers (review in Carr and Code 2000). Most studies of the avian cochlear nucleus have concentrated on the NM. It is widely accepted that the bird NM is the equivalent of the spherical bushy cell population in the mammalian anteroventral cochlear nucleus. This is supported by many detailed similarities in both anatomy and physiology and the specialized role of that particular cell type in temporal auditory processing for sound localization (e.g., reviews in Carr et al. 2001; Oertel 1999; Trussell 1999). Yet the auditory system does much more than localize sources via interaural time differences; it enables localization using level differences and spectral cues and a whole range of sound-recognition behavior (e.g., Dooling et al. 2000). With extreme specialization for temporal coding in NM, we look to the other nucleus of the pair, NA, as a likely candidate to lay the foundations for these other tasks.
Studies on the cellular morphology of NA agree that it is a heterogeneous nucleus with several different neuron types (Boord and Rasmussen 1963; Häusler et al. 1999; Soares and Carr 2001). Soares et al. (2002) have recently used in vitro whole cell recording to characterize the intrinsic firing properties and single-cell morphology of NA in the chicken. They showed several distinct physiological classes of cells. Their physiological types correlated with cellular morphology and the morphological types were similar to those found in the owl. These results would predict diversity in NA in vivo responses as well; however, previous recordings in the NA provided conflicting reports on the response types present and indicated an unusual species specificity at such an early stage of auditory processing. For example, responses in the barn owl were reported to be very uniform (Sullivan 1985; Sullivan and Konishi 1984), whereas both more varied and more complex responses had been found in the redwing blackbird (Sachs and Sinnott 1978). The chicken, while also varied, showed yet other response categories (Warchol and Dallos 1990). In the light of our recent findings on the organization and cell types in NA, we have therefore reexamined the in vivo physiology in the barn owl NA.
Results are reported from experiments on six barn owls (Tyto alba pratincola), five males aged between 1 and 13 yr and one 6-mo-old individual of unknown sex. Most animals were used in two to three separate experiments spaced several days apart. Anesthesia was induced by intramuscular injections of 10–14 mg/kg ketamine hydrochloride (Ketavet, Phoenix, St. Joseph, MO) and 2–3 mg/kg xylazine (xyla-ject, Phoenix); supplementary doses of ketamine and xylazine were administered according to individual needs (on average approximately 7 mg · kg⁻¹ · h⁻¹ ketamine and 1.5 mg · kg⁻¹ · h⁻¹ xylazine). Body temperature was continually measured by a probe inserted into the owl's cloaca and kept constant at 39°C by a feedback-controlled heating blanket wrapped around the owl's body (Harvard Instruments, Braintree, MA). An electrocardiogram (EKG) was recorded by fine needle electrodes inserted into a leg muscle and a muscle of the contralateral wing; this was periodically displayed on an oscilloscope and/or broadcast on an audiomonitor to check for muscle potentials (associated with breathing or waning anesthesia) and frequency and regularity of the heart beat. At the end of experiments after which the animal was allowed to recover, about 0.06 mg/kg buprenorphine hydrochloride (Buprenex, Reckitt and Colman Products, Hull, UK) was administered.
The owl's head was firmly held in a controlled position by a custom-designed setup, using earbars and a beak holder. After removing some feathers, cutting the skin, and gently roughing the skull surface, a metal headplate, as well as a short metal pin marking a standardized zero point, were permanently glued to the skull. After this, the ear bars and beak holder were removed and the head held firmly by the headplate alone. Three different approaches were then used to place electrodes into the brain stem: stereotactically through the main part of the cerebellum, stereotactically through the cerebellar flocculus, and using landmarks after aspirating the cerebellum and exposing the brain stem (the 2nd and 3rd were used in terminal experiments only). In all cases, after placing the electrodes on the brain surface, they were lowered by a custom-built stepping motor, controlled from outside the experimental chamber.
In the first approach, an opening was made in the skull around the desired area relative to the zero point and the dura mater was cut open, taking care to avoid blood vessels. Each electrode was individually centered over the zero point and moved defined amounts in the rostrocaudal and mediolateral axes before being driven down into the brain. In addition, the electrode was slightly angled laterally in most penetrations. Electrode positions were re-zeroed after any change in angulation, thus keeping the entrance points into the cerebellum reproducible. The NA was found between 1 and 5 mm caudal to the zero point, with mediolateral coordinates dependent on angulation, and being between 0.5 and 4 mm from zero.
For the second approach, a skull opening lateral to the cerebellum was made, providing a view onto the area of the semicircular canals. After removing the bone overlying the surface of the cerebellar flocculus, electrodes were introduced using various combinations of medial and rostrocaudal angles established in past experiments (Köppl et al. 1993).
In the final approach, a skull opening was made, exposing the caudal cerebellum on one side, leaving the midsagittal sinus covered, but exposing far laterally. After removing the dura mater, the accessible cerebellum was aspirated, exposing the auditory brain stem. Electrodes were placed under visual control, aiming for the very lateral edge of the brain stem just caudal to the cerebellar peduncle.
Owls were placed on a vibration-isolated table within a sound-attenuating chamber (IAC, New York) that was closed during all recordings. Commercial, Epoxylite-coated tungsten electrodes (Frederick Haer) were used, preferably with impedances around 15–20 MΩ. A grounded silver wire, placed under the animal's skin around the incision, served as the reference. Electrode signals were amplified and filtered by a custom-built headstage and amplifier. The recording was then passed in parallel to an oscilloscope, a threshold discriminator (Tucker-Davis Technologies, TDT; SD1), and an A/D converter (TDT DD1) connected to a personal computer via an optical interface (TDT OI). Transistor-transistor logic (TTL) pulses from the threshold discriminator were also registered by the personal computer via an additional timing module (TDT ET1), with a precision of 10 μs. The TTL pulses were also fed to the z axis of the oscilloscope displaying the recording trace, providing a visual aid for adjusting the TTL trigger level. In addition, a continuously refreshed, software-generated display of the waveforms that triggered TTL pulses aided in trigger judgement. Finally, a continuously refreshed display of the interspike interval distribution helped to judge unit isolation. For well-isolated units, there were no intervals shorter than about 0.6 ms, representing the absolute refractory period. In most cases, neural responses were only saved as TTL timing events. In two experiments, however, examples of analog waveforms were also saved. We did not check for prepotentials in the spike waveforms, assuming that these can only be expected in cells with large, endbulb-like terminals, which are not found in NA (e.g., Carr and Boudreau 1991).
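The interspike-interval check used to judge unit isolation can be sketched offline as follows. This is a minimal illustration, not the acquisition software used in the study: the function names and the example spike times are hypothetical, and the 0.6-ms default is simply the approximate refractory period quoted above.

```python
def interspike_intervals(spike_times_ms):
    """Return successive interspike intervals (ms) from spike times (ms)."""
    ordered = sorted(spike_times_ms)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def refractory_violations(spike_times_ms, refractory_ms=0.6):
    """Count intervals shorter than the assumed absolute refractory period."""
    intervals = interspike_intervals(spike_times_ms)
    return sum(1 for isi in intervals if isi < refractory_ms)


# Hypothetical spike times (ms); the pair 12.0 / 12.3 is only 0.3 ms apart.
spikes = [2.0, 5.1, 12.0, 12.3, 20.4]
print(refractory_violations(spikes))
```

A count of zero is consistent with a well-isolated single unit; any nonzero count suggests that spikes from more than one unit crossed the trigger level.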
Acoustic stimuli were digitally generated by custom-written software (Xdphys written in Dr. M. Konishi's lab at Caltech) driving a signal-processing board (TDT DSP2). After passing a D/A converter (TDT DD1) and an anti-aliasing filter (TDT FT6–2), the signals were variably attenuated (TDT PA4), impedance-matched (TDT HB4), and attenuated by an additional fixed amount before being fed to commercial miniature earphones (SONY MDR-E424). Two separate channels of signals could be generated, passing through separate channels of all associated hardware and driving two separate earphones. The earphones were housed in custom-built, calibrated, closed sound systems, inserted into the owl's left and right ear canal, respectively. Sound pressure levels were calibrated individually at the start of each experiment using built-in miniature microphones (Knowles EM3068 or TM3568, in 2 different sets of sound systems used).
While lowering the electrode, noise bursts (50-ms duration, 5 bursts/s) were played to the ipsilateral ear as search stimuli. Once auditory responses were discernible, different frequencies and both ipsi- and contralateral stimuli were tested to judge the position of the electrode. After isolating spikes, the characteristic frequency (CF) was estimated audiovisually and the TTL trigger level was adjusted carefully. The following protocol was then tested in full if possible or until the unit was lost.
Tone bursts (50-ms total duration, 5-ms rise-fall time, 200-ms cycle) of different frequencies and levels were presented in random order. Levels varied in 5-dB steps from below the initial threshold estimate to at least 60 dB SPL, frequencies varied in 50- to 250-Hz steps (depending on CF, smaller steps for units of lower CFs). Average discharge rates over three repetitions of each stimulus, within the stimulus window (not corrected for neural latency) were calculated. This paradigm was chosen to derive iso-rate tuning curves comparable to those published for the auditory nerve of the barn owl (Köppl 1997a, see there for detailed methods). The rate criteria for excitatory and, if present, inhibitory threshold responses, were fixed individually, depending on the spontaneous rate and variability. For most units, the criterion fell between 10 and 30 spikes/s above spontaneous rate.
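As a rough illustration of how an iso-rate tuning curve can be derived from such data, the sketch below averages the rate over repetitions and takes, for each frequency, the lowest tested level at which the mean rate reaches a criterion above spontaneous rate. The detailed procedure is given in Köppl (1997a) and may differ (for example, by interpolating between levels); the function names, the data layout, and the example values are assumptions made for illustration only.

```python
from statistics import mean


def iso_rate_threshold(rates_by_level, spont_rate, criterion_above_spont):
    """Lowest tested level (dB SPL) whose mean rate reaches the criterion.

    rates_by_level maps level -> list of rates (spikes/s), one per repetition.
    Returns None if no tested level reaches the criterion.
    """
    target = spont_rate + criterion_above_spont
    for level in sorted(rates_by_level):
        if mean(rates_by_level[level]) >= target:
            return level
    return None


def tuning_curve(responses, spont_rate, criterion_above_spont=20.0):
    """Map each frequency (Hz) to its iso-rate threshold (dB SPL)."""
    curve = {}
    for freq, rates_by_level in responses.items():
        threshold = iso_rate_threshold(
            rates_by_level, spont_rate, criterion_above_spont
        )
        if threshold is not None:
            curve[freq] = threshold
    return curve


def characteristic_frequency(curve):
    """Frequency with the lowest threshold on the tuning curve."""
    return min(curve, key=curve.get)


# Hypothetical responses: frequency (Hz) -> level (dB SPL) -> three repetitions.
responses = {
    1800: {30: [4, 6, 5], 35: [10, 12, 8], 40: [30, 28, 32]},
    2000: {30: [20, 30, 28], 35: [60, 55, 65], 40: [90, 95, 85]},
    2200: {30: [5, 5, 5], 35: [6, 4, 5], 40: [12, 10, 11]},
}
curve = tuning_curve(responses, spont_rate=5.0, criterion_above_spont=20.0)
print(curve, characteristic_frequency(curve))
```

The criterion is passed in per unit because, as stated above, it was fixed individually depending on spontaneous rate and variability.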
Tone bursts (50-ms total duration, 5-ms rise-fall time, fixed starting phase, 200-ms cycle) at the estimated CF and a level at least 20 dB above threshold were presented 100 times. These parameters were chosen to estimate the maximal vector strength (VS) a unit could produce. Only statistically significant VS were accepted (Rayleigh test, P < 0.01).
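For readers unfamiliar with the measure, vector strength and its significance test can be sketched as below. The text does not state which form of the Rayleigh test was used; this sketch assumes the common large-sample approximation p ≈ exp(−n·VS²), and the function names and example spike trains are hypothetical.

```python
import math


def vector_strength(spike_times_s, freq_hz):
    """Vector strength of spike phases relative to a tone of freq_hz."""
    n = len(spike_times_s)
    if n == 0:
        raise ValueError("no spikes")
    phases = [2.0 * math.pi * freq_hz * t for t in spike_times_s]
    cos_sum = sum(math.cos(p) for p in phases)
    sin_sum = sum(math.sin(p) for p in phases)
    return math.hypot(cos_sum, sin_sum) / n


def rayleigh_p(vs, n_spikes):
    """Large-sample approximation of the Rayleigh p value: exp(-n * VS^2)."""
    return math.exp(-n_spikes * vs * vs)


def is_significant(spike_times_s, freq_hz, alpha=0.01):
    """True if phase locking is significant at the given alpha (assumed test form)."""
    vs = vector_strength(spike_times_s, freq_hz)
    return rayleigh_p(vs, len(spike_times_s)) < alpha


# Hypothetical spike trains for a 1-kHz tone:
locked = [k / 1000.0 for k in range(50)]   # one spike per cycle, same phase
spread = [k * 0.00025 for k in range(40)]  # spikes spread evenly over four phases
print(vector_strength(locked, 1000.0), is_significant(locked, 1000.0))
print(vector_strength(spread, 1000.0), is_significant(spread, 1000.0))
```

A VS near 1 indicates that all spikes fall at the same stimulus phase; a VS near 0 indicates no phase preference.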
Tone bursts (50-ms total duration, 1.6-ms rise/fall time, variable starting phase, 200-ms cycle) at the estimated CF, usually 20–30 dB above threshold, were presented 300 times. For stimuli below 1 kHz, rise/fall time was routinely changed to 5 ms to minimize spectral splatter. Also, in some early experiments, stimulus rise/fall times of 5 ms and 500 or 100 repetitions were generally used. For many units, several levels of PSTH were recorded, usually 20 dB apart. PSTHs with a standard bin width of 0.1 ms were used to judge the general shape of the PSTH and to derive the minimal response latency (1st 2 consecutive bins to exceed the maximum encountered outside the response). In addition, first-spike analyses (after Young et al. 1988) were carried out, using the first spike in each run after the minimal latency. Interspike interval distributions for all spikes during the response were calculated and regularity analyses (after Young et al. 1988) were carried out, using a time window of 12–20 ms after the minimal response latency for the calculations of mean interspike interval, mean ± SD, and mean coefficient of variation (CV). Mean regularity values were only used if the mean discharge rate was ≥100 spikes/s. These parameters and procedures closely followed the standard methods of many mammalian studies.
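The PSTH-based latency criterion and the regularity measure can be sketched as follows. This is a simplified illustration under stated assumptions, not the analysis code of the study: the baseline window standing in for "outside the response" is supplied by the caller, the CV is computed over a single window rather than as a time-resolved function (as in Young et al. 1988), the population SD is used, and all names and example data are hypothetical.

```python
import math
from statistics import mean, pstdev


def psth(trials_ms, duration_ms, bin_ms=0.1):
    """Spike counts per bin, pooled over trials (times in ms re stimulus onset)."""
    n_bins = int(round(duration_ms / bin_ms))
    counts = [0] * n_bins
    for spikes in trials_ms:
        for t in spikes:
            index = math.floor(t / bin_ms)
            if 0 <= index < n_bins:
                counts[index] += 1
    return counts


def minimal_latency(counts, baseline_bins, bin_ms=0.1):
    """Start (ms) of the first two consecutive bins above the baseline maximum.

    baseline_bins is a (first, last) bin range assumed to lie outside the response.
    """
    first, last = baseline_bins
    ceiling = max(counts[first:last])
    for i in range(len(counts) - 1):
        if counts[i] > ceiling and counts[i + 1] > ceiling:
            return i * bin_ms
    return None


def interval_cv(trials_ms, window_start_ms, window_end_ms):
    """CV (SD / mean) of interspike intervals that begin inside the window."""
    intervals = []
    for spikes in trials_ms:
        ordered = sorted(spikes)
        for earlier, later in zip(ordered, ordered[1:]):
            if window_start_ms <= earlier < window_end_ms:
                intervals.append(later - earlier)
    if len(intervals) < 2:
        return None
    return pstdev(intervals) / mean(intervals)


# Hypothetical data: two trials with an onset response near 5 ms, one late spike.
trials = [[5.05, 5.15], [5.05, 5.15], [30.05]]
counts = psth(trials, duration_ms=40.0)
print(minimal_latency(counts, baseline_bins=(200, 400)))
```

To mirror the window described above, `interval_cv` would be called with `window_start_ms` and `window_end_ms` set to 12 and 20 ms after the minimal latency.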
Tone bursts (50-ms total duration, 5-ms rise-fall time, 330-ms cycle) of different frequencies and levels were presented in random order. Levels varied in 3-dB steps from below the CF threshold to at least 60 dB SPL and usually about 10 frequencies (surrounding and including CF) were tested. Average rates over 10 repetitions of each stimulus within the stimulus window (not corrected for neural latency) were calculated. Spontaneous discharge rates were derived from no-stimulus trials also included randomly. These data provided more detailed response maps as well as the rate-level function at CF, with parameters identical to those used on auditory-nerve fibers in the barn owl (Köppl and Yates 1999). Saturation discharge rates were estimated as the average rate within the saturated part of the rate-level function as judged subjectively. If no saturation was apparent and the function covered ≥40 dB above threshold, the rate at the highest level tested was taken. In the case of nonmonotonic rate-level functions (defined as showing a decline of more than 20% below the highest rate), the highest driven rate, irrespective of level, was taken. Dynamic ranges were only calculated for monotonic rate-level functions, as the range covering from 10 to 90% of the increase above spontaneous rate to saturation rate. The 10 and 90% points were estimated from linear regressions through adjacent data points.
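The nonmonotonicity criterion and the 10–90% dynamic range can be illustrated with the sketch below. It makes two simplifying assumptions that are not taken from the text: the 20% decline is measured on the absolute rate at levels above the peak, and the 10 and 90% points are found by two-point linear interpolation rather than by regression through several adjacent points. Function names and example values are hypothetical.

```python
def is_nonmonotonic(rates, max_decline=0.20):
    """True if the rate falls more than max_decline below the peak at higher levels."""
    peak_index = max(range(len(rates)), key=lambda i: rates[i])
    floor_rate = (1.0 - max_decline) * rates[peak_index]
    return any(r < floor_rate for r in rates[peak_index + 1:])


def level_at_rate(levels, rates, target):
    """Level at which the function first rises through target (linear interpolation)."""
    for i in range(len(rates) - 1):
        low, high = rates[i], rates[i + 1]
        if high > low and low <= target <= high:
            fraction = (target - low) / (high - low)
            return levels[i] + fraction * (levels[i + 1] - levels[i])
    return None


def dynamic_range(levels, rates, spont_rate, saturation_rate):
    """Level range (dB) spanning 10 to 90% of the rise from spontaneous to saturation.

    Returns None for nonmonotonic functions, for which no dynamic range is defined.
    """
    if is_nonmonotonic(rates):
        return None
    span = saturation_rate - spont_rate
    low = level_at_rate(levels, rates, spont_rate + 0.1 * span)
    high = level_at_rate(levels, rates, spont_rate + 0.9 * span)
    if low is None or high is None:
        return None
    return high - low


# Hypothetical rate-level function (levels in dB SPL, rates in spikes/s).
levels = [0, 10, 20, 30, 40, 50]
rates = [10, 10, 60, 110, 110, 110]
print(dynamic_range(levels, rates, spont_rate=10.0, saturation_rate=110.0))
```

In this example the 10 and 90% points fall at 12 and 28 dB SPL, giving a dynamic range of 16 dB.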
In 20 units (including all response types), the rate-level function at CF was repeated for contralateral stimulation. In all those cases, the threshold to contralateral stimulation was higher than to ipsilateral stimulation and the difference was consistent with physical cross-talk between the middle ears under similar conditions (Köppl 1997a), i.e., consistent with exclusively monaural input.
Responses to noise were collected analogously to the paradigms described in the preceding text for pure-tone stimuli, except that band-limited noise (200 Hz to 12 kHz; synthesized freshly for each run) was presented. The level of noise stimuli across frequencies was equalized on-line, i.e., all noise levels are given as spectrum levels.
At selected recording sites, small lesions were induced by passing pulsed, positive current of 1–4 μA for 10 s to 8 min through the electrode. After a survival time of several hours to 13 days, the owl was killed by an anesthetic overdose and perfused transcardially with saline, followed by fixative (various buffered aldehyde mixtures). The brain was dissected out and cryoprotected by incubation in 30% buffered sucrose until it sank. Frozen sections were cut in the same (approximately transverse) plane as the electrodes had penetrated. Sections were mounted on gelatin-coated slides, stained with cresyl violet and coverslipped. All sections surrounding and including NA were examined at low magnification, documenting the extent of NA and any damage or glial accumulation potentially associated with lesions.
Electrodes were aimed stereotactically or using landmarks and the decision whether NA was actually penetrated in a given track was based on physiological criteria. The physiological responses of auditory areas adjacent to NA, i.e., the auditory nerve, NM, and NL are well known (e.g., Köppl 1997a; Pena et al. 1996; Sullivan and Konishi 1984). In addition, the well-known tonotopic organizations of the different nuclear areas (Carr and Konishi 1990; Köppl 2001; Takahashi and Konishi 1988) served as a further guideline.
Six recording sites were marked by electrolytic lesions. Five of these were later identified in histological sections of NA (examples in Fig. 1). The sixth lesion could not be found, presumably because the animal did not, as intended, recover from the experiment, and the survival time was too short to reveal the characteristic glial accumulation. One further track was approximately confirmed by an ink mark placed on the surface of the brain stem. The tonotopic position of lesion sites corresponded with the best frequencies estimated immediately before current application. The tracks where the five successful lesions were placed accounted for 12 of the units reported here; all of those were classified as NA and all response types [except the rare chopper-sustained (chop-S)] were represented. Using the lesioned sites as calibration points and the stereotaxic coordinates of electrode penetrations in the same individual as guidelines, we could reconstruct the approximate rostrocaudal position for about 70% of all recorded units. These recordings covered the rostral two-thirds of NA. The remaining units either came from brain stem hemispheres without any anatomical confirmation of recording sites or from penetrations aimed using landmarks. In the mediolateral axis, a detailed reconstruction of anatomical position was not possible. In this axis, the nucleus is much narrower and electrode angulation introduced an additional variable that could not be measured and reproduced as precisely as the electrode's X and Y positions. However, characteristic frequency changes along the mediolateral axis of NA in the barn owl (Köppl 2001), such that the lowest CFs can only be found at the most medial positions and the highest CFs at the most lateral extreme. Judging by this criterion, we have sampled the entire medio-lateral extent of NA (see data in the following text).
In summary, all recording sites that could be confirmed anatomically were confirmed to have been within NA, corroborating our judgements based on physiological criteria during the experiment.
Responses of a total of 76 single units recorded in the NA are reported here, covering a range of CFs from 0.45 to 10.2 kHz. The responses clearly were not uniform but could be classified into several different categories. Because they appeared to be comparable to types previously defined in the mammalian cochlear nucleus, we have adhered to the established nomenclature. Table 1 summarizes the types, their relative abundance, and several salient statistics of their response behavior. Figure 2 presents the decision tree that was developed, using a range of criteria, including the traditional PSTH parameters. The most common response of NA neurons was of the primary-like variety, followed by chopper-transient (chop-T), typeIV, onset, and, rarest, chop-S responses.
Note that in addition to the sample of NA units, there are 26 units that were classified as “auditory nerve” or “uncertain” (Table 1). This reflects some difficulty in separating potential auditory-nerve responses from primary-like NA units. We assumed that our sharp tungsten electrodes were capable of recording from both cell bodies and large axons or dendrites (see e.g., Joris 1998) and that all of these could in principle be encountered anywhere within the nucleus. Therefore a conservative interpretation was used, classifying all units whose values for a list of parameters fell within the known range of auditory-nerve fibers (Köppl 1997a,b; Köppl and Yates 1999) as “auditory nerve.” Based on previous studies on the avian NA (Sachs and Sinnott 1978; Sullivan and Konishi 1984; Warchol and Dallos 1990) and our own auditory-nerve data from the barn owl (Köppl 1997a,b), a relatively low vector strength of phase locking and/or a relatively low spontaneous discharge rate were considered decisive parameters for separating NA units from auditory-nerve inputs. Note that no specific criterion values can be universally used for this distinction because both parameters are strongly CF dependent. For example, a unit with a CF of 5 kHz and a vector strength of 0.5 would have been classified as auditory nerve, whereas a unit showing the same vector strength at a CF of 2 kHz would have been classified as NA. Cases where both the vector strength and the spontaneous discharge rate were consistent with auditory-nerve responses, but other characteristics did not fit an auditory-nerve fiber (e.g., a high maximal discharge rate) were classified as “uncertain.” Latency was, unfortunately, of little use in unit classification (see following text, after introducing the response types).
We will first describe and illustrate with examples the typical features of each NA response type. Then the different types will be compared and salient differences and similarities highlighted. It should be emphasized that most of the response types did not appear to be sharply distinguished, i.e., their characteristics overlapped to some degree for individual parameters. In addition, many parameters showed an overall CF-related variation that we have tried to separate and disregard in our classification of response types. A number of scatter plots will be shown in addition to individual examples to illustrate the range of responses and the extent of overlap between types.
Figure 3 illustrates an example of a primary-like response. The characteristic feature of primary-like units was, of course, a primary-like PSTH, showing a vigorous discharge at stimulus onset that gradually adapted to a steady, lower discharge (Fig. 3C). Occasionally, the PSTH showed a clear notch after a peak at response onset. However, this was confined to high levels of 40 dB or more above threshold and intermediate forms were seen. Also, such notches may sometimes be observed in auditory-nerve responses at high levels (own unpublished observations). Therefore we do not feel there is evidence for a distinct response type “primary-like with notch” (however, see also “onset units” in the following text). Interspike interval distributions of both spontaneous and evoked discharges were Poisson-like (Fig. 3F). Mean CVs were around 0.8 – 0.9 (median: 0.84), except in low-frequency units where cycle-by-cycle phase-locking produced a more regular response.
All primary-like units with a CF less than 5 kHz showed significant phase locking. Although their vector strengths were generally lower than those of auditory-nerve fibers of comparable CF, they could approach auditory-nerve values at CFs less than 1.5 kHz (Fig. 4). At CFs above 5 kHz, vector strengths were very low or not statistically significant. Spontaneous discharge rates were mostly below those of auditory-nerve fibers of comparable CF (Fig. 5A and Table 1).
Tuning curves of primary-like units were comparable to auditory-nerve tuning curves, showing a similar range of thresholds, Q10 dB and Q40 dB values. Four of 33 primary-like units showed evidence for side-band inhibition along the high-frequency flank of the excitatory tuning curve.
Rate-level functions of primary-like units were monotonic (Fig. 3B), with a median saturation discharge rate of 257.5 spikes/s and a median dynamic range of 30 dB (Table 1). Responses to noise were similar to pure-tone responses, with respect to PSTH (Fig. 3G), regularity of discharge and saturation rates.
Figures 6 and 7 illustrate two examples of chop-T responses. Chop-T units showed a regular discharge pattern at stimulus onset that produced several distinct peaks in the PSTH (Figs. 6C and 7C). This did not depend on the stimulus rise time in three cases tested (rise time varied between 1 and 5 ms; example in Fig. 7C). Interspike intervals always increased within the first 10–20 ms of the stimulus (Figs. 6D and 7D); the CV mostly decreased but could also remain steady or increase slightly (Figs. 6E and 7E). Mean CVs fell between 0.38 and 0.84 (median: 0.53). At the upper end of this range, there was some overlap with primary-like units. Besides the characteristic peaks in the PSTH, chop-T units also typically did not show a Poisson-like distribution of interspike intervals in their sound-evoked discharge. Their distributions were more skewed, with an increased proportion of short intervals (Figs. 6F and 7F).
Spontaneous rates of chop-T units were typically low, with many units showing no spontaneous discharge at all (Fig. 5A and Table 1). Chop-T units were also characterized by inferior phase-locking. Although about half of them did show significant phase-locking, their vector strengths remained at the low end of those of primary-like units of comparable CF (Fig. 4).
Chop-T units had a median saturation discharge rate of 316 spikes/s and a median dynamic range of 23 dB but showed large ranges for both parameters (Table 1). There was no correlation between saturation rate and dynamic range. Discharge rates and dynamic ranges in response to CF tones and to noise, respectively, showed no systematic differences across the population. However, the PSTH of noise responses never showed clear chopping (Fig. 6G).
Chop-T units at the low end of the saturation discharge range showed PSTH with little sustained activity after the initial chopping peaks. Therefore we earlier contemplated a separate response type, “onset-chopper.” However, a more detailed analysis did not produce any evidence for this in the owl. Onset-choppers in cats and guinea pigs are characterized by their large dynamic range, small difference in threshold to tones and noise, and often more vigorous response to noise (Joris and Smith 1998; Rhode and Smith 1986; Winter and Palmer 1995). While these individual characteristics were occasionally observed in our chop-T units (e.g., small threshold difference in Fig. 7B), none displayed all of them. Also, the PSTH in response to noise never showed chopping, while the PSTH of onset-choppers are very similar in response to tones and noise (Winter and Palmer 1995).
An unexpected phenomenon was a decrease in spike size during medium- to high-level stimulation, observed in about one quarter of all our chopper-type neurons (including chop-S, an example of which is shown in Fig. 8, H and I). In early experiments, we may have rejected some units like these, mistaking the nonuniform spike size as indication for a multi-unit recording. However, the presence of a refractory period in the interspike-interval histogram confirmed that the spikes were from single units despite the variable spike height.
Chop-S responses were rare in our sample and are only represented by two units (from different animals). One example is shown in Fig. 8. They had the longest response latencies and were clearly above all other unit types in this respect (Fig. 9). This was our main criterion for placing them in a category of their own, instead of interpreting them as one extreme of the chopper-type range.
Chop-S units showed the lowest mean CVs of all types and their CV remained constant throughout the stimulus duration (Fig. 8E). Also, in one case the unit chopped to both pure-tone and noise stimuli (Fig. 8, C and G; in the other case, the noise spectrum level was not sufficiently above threshold). Both chop-S units had no spontaneous activity and reached only moderate saturation discharge rates around 200 spikes/s. Their dynamic ranges were rather different, at 16 and 35 dB, respectively.
Units that showed only a single prominent, initial peak in their PSTH and no evidence for chopping were classified as onset. A second characteristic was that the maximal discharge rate to noise was higher than that to tones at CF. In other aspects, this group was heterogeneous; however, considering the overall rarity of onset units and thus small sample size (n = 6), we refrain from further subdividing. Figures 10 and 11 illustrate the range of responses. The unit shown in Fig. 10 represents an extreme case of onset response with virtually no spiking during the remainder of the stimulus. The unit illustrated in Fig. 11 showed a robust sustained response after the onset and was in some aspects reminiscent of primary-like-with-notch responses in mammals. However, the distinction between onset and primary-like-with-notch is not always sharp in mammals either (e.g., Blackburn and Sachs 1989; Rhode and Smith 1986). Our final criterion for classifying this (and another similar unit) among onset responses was the clearly more vigorous response to noise. The SD of the first spike, a measure for the synchrony of the initial discharge, was extremely low for the unit shown in Fig. 10 but, as a population, actually larger in onset units than in all other types. However, this mainly reflects the fact that the onset spike occasionally failed, producing a distribution of first-spike times with extreme outliers and thus making SD an inappropriate measure of dispersion. If an onset spike was fired, it was temporally precise, summing up, over many stimulus repetitions, to the characteristic sharp peak in the PSTH. The PSTH in response to noise was more primary-like in all cases tested, i.e., showed less synchronization to the stimulus onset (Figs. 10D and 11G). Onset units did not appear to phase-lock well (Fig. 4); however, our sample is small and restricted in CF.
Eleven units were classified as typeIV, with CFs between 0.6 and 7.7 kHz. Their defining characteristic was a pronounced nonmonotonic behavior of the discharge rate across different levels at CF, typically showing little to moderate excitation at low levels and inhibition at higher levels (Fig. 12B). The PSTH showed a clear onset response, followed by varying degrees of inhibition, depending on the unit and the stimulus level; we call this pattern onset-inhibitory. Units that responded with net excitation at low levels showed a sustained excitatory response at those levels, i.e., the PSTH changed with stimulus level (Figs. 12, C–E, and 13, C–E). At levels near the transition between net excitation and net inhibition, the PSTH could also look pauser-like. When testing a range of frequencies and levels, complex response maps resulted, with interleaving excitatory and inhibitory areas (Fig. 12A). CF and threshold were defined in these cases as the most sensitive point, regardless of whether the response was net excitatory or inhibitory. All typeIV units had high spontaneous discharge rates, typically above 100 spikes/s (Fig. 5A and Table 1).
Most typeIV units were also tested with noise stimuli. The majority showed purely excitatory responses and a monotonic increase in discharge rate with level (Fig. 13B). This was accompanied by a primary-like pattern in the PSTH (Fig. 13F). Three units gave similar responses to noise and tones, i.e., a net inhibition at higher levels, with an onset-inhibitory PSTH (Fig. 12, B and F).
Inhibition was seen to varying degrees in the different type IV units. At one extreme were units that showed little or no net inhibition, i.e., whose discharge rate never clearly decreased below the spontaneous rate (example in Fig. 13). At the other extreme were units that showed no clear excitation, i.e., whose only response appeared to be a net decrease of the discharge below spontaneous rate. However, the onset response in the PSTH was always present (example in Fig. 14). This was our final criterion for not placing such units into a (purely inhibitory) category of their own. It was also our impression that the responses recorded from typeIV units had a snapshot character, meaning that the degree of inhibition seen in an individual neuron could change over time. Significant changes in the discharge behavior over time could be documented for three single units. One example where inhibition became more pronounced with time is shown in Fig. 15. Stability or instability of the inhibitory response component did not appear to be related to the administration of anesthetic agents.
Our sample covered the full range of CFs expected for the barn owl, although only primary-like units were found throughout the whole range. Especially conspicuous was the absence of any type other than primary-like among the highest CFs, more than 8 kHz. Chop-T units, although the second most frequent type, were also restricted at the low-frequency end and were only encountered at CFs between about 2 and 8 kHz. Anatomically, we did not observe any differences in where the various response types were found.
Perhaps surprisingly, latency was not clearly different between units classified as auditory nerve or NA, respectively, or between most of the NA response types. A number of variables influence latency, most importantly the CF and, using a fixed rise time, both the absolute and relative level (above threshold) of the stimulus (e.g., Heil and Neubauer 2001). It seemed that the scatter introduced by those variables largely obscured the small latency differences between auditory-nerve fibers and NA units (Fig. 9). The only exceptions were the chop-S units, which had clearly different latencies. Higher sound levels or clicks might have provided more consistent information, but unfortunately unit isolation could be compromised with such stimuli. We also evaluated mean and median first-spike latency, and these showed greater variability than the minimal latency shown in Fig. 9.
Except for the typeIV, the different response types appeared to grade into each other. The extremes of the primary-like and chop-T responses, for example, were clearly different; however, a few units could only be classified by defining an arbitrary borderline value for the mean CV and subjectively classifying their PSTH shape. Similarly, chop-T units with low saturation discharge rates may form a continuum with onset units.
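The role of the CV in separating regular (chopper-like) from irregular (primary-like) discharges can be sketched as follows. This is a simplified version that pools all interspike intervals across trials; a full regularity analysis tracks the mean and SD of the intervals as a function of time after stimulus onset. The spike times are invented for illustration, and no borderline value is implied.

```python
import statistics

def isi_cv(spike_trains_ms):
    """Coefficient of variation (SD / mean) of interspike intervals.

    spike_trains_ms: list of trials, each a sorted list of spike times in ms.
    Intervals are pooled over all trials (a simplification).
    """
    intervals = []
    for train in spike_trains_ms:
        intervals.extend(b - a for a, b in zip(train, train[1:]))
    return statistics.stdev(intervals) / statistics.mean(intervals)

# Hypothetical regular (chopper-like) and irregular (primary-like) trials.
regular = [[0.0, 2.0, 4.1, 6.0, 8.1]]
irregular = [[0.0, 1.0, 5.0, 6.0, 12.0]]
cv_regular = isi_cv(regular)
cv_irregular = isi_cv(irregular)
print(f"regular CV = {cv_regular:.2f}, irregular CV = {cv_irregular:.2f}")
```

Because the CV varies continuously between such extremes, any cutoff applied to it is a convention rather than a natural boundary, which is the point made in the paragraph above.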
A number of parameters did not differ at all between most types of units. Tuning curves largely reflected the auditory-nerve inputs in terms of tuning and thresholds (Fig. 16). Only a minority of primary-like and chop-T units (4 of 33 and 2 of 17, respectively) showed evidence for off-CF inhibitory inputs along the high-frequency flank of their excitatory tuning curves. TypeIV units tended to have thresholds among the most sensitive; however, there was no statistically significant difference in threshold between the response types (Kruskal-Wallis H-test, P = 0.06; only units with a CF less than 8 kHz tested to minimize the variation of threshold with CF). Dynamic ranges were not significantly different between auditory-nerve units and the different NA response types (Kruskal-Wallis H-test, P = 0.22; typeIV units excluded because no dynamic range was defined for nonmonotonic rate-level functions). For the driven discharge range, i.e., the difference between spontaneous and saturation rate, significant differences were revealed. However, it was only the onset group that differed from all others (Kruskal-Wallis H-test with subsequent pairwise Mann-Whitney U-tests; typeIV units excluded). This is probably entirely due to the low saturation discharge rates of onset units, which similarly differed from those of all other types (see also Table 1).
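For readers unfamiliar with the rank-based comparison used above, the following sketch computes the Kruskal-Wallis H statistic from first principles. It omits the tie correction and does not compute a P value; the group values are invented and are not data from this study.

```python
def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic (without tie correction) and its df."""
    pooled = sorted(value for group in groups for value in group)
    # Assign each distinct value the average of the 1-based ranks it occupies.
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2
        i = j
    n_total = len(pooled)
    rank_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)
    return h, len(groups) - 1

# Hypothetical saturation rates (spikes/s) for three response types.
primary_like = [210, 250, 230, 270]
chopper = [220, 260, 240, 280]
onset = [60, 80, 70, 90]
h, df = kruskal_wallis_h(primary_like, chopper, onset)
print(f"H = {h:.2f} with {df} degrees of freedom")
```

H is compared against a chi-square distribution with the returned degrees of freedom; a significant result only indicates that some group differs, which is why pairwise tests are needed afterwards to identify which one. In practice, library routines such as `scipy.stats.kruskal` and `scipy.stats.mannwhitneyu` would normally be used instead of a hand-written version.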
The two main findings of this study are that there are five main physiological types in the barn owl NA and that these show basic similarities to responses in the mammalian cochlear nucleus. Because our results appear to be at variance with previous studies in the owl and chicken, we will first attempt a new synthesis of NA physiology in birds, showing that technical issues are a major concern and that our results may in fact be representative for all birds. We then highlight the quantitative similarities in the response types of the barn owl and mammals and discuss why these similarities are remarkable and what their implications are for the evolution of the auditory system. Finally, we discuss the implications for amplitude coding in birds.
There are several previous studies of NA physiology in different bird species (Hotta 1971; Sachs and Sinnott 1978; Sullivan 1985; Sullivan and Konishi 1984; Warchol and Dallos 1990). They differed substantially in their findings, and our new data seem to add yet more variety and conflict with earlier reports on the same species, the barn owl. We argue, however, that the observed differences between species and studies may be to a large extent methodological in nature. Our study had two important advantages over previous work. First, increasingly standardized methods exist for analyzing and classifying unit responses in the mammalian cochlear nucleus. While we did not assume a priori that our data would fit existing mammalian categories, we made use of the quantitative methods of analysis (e.g., regularity analysis, Young et al. 1988) as a basis for classification. Second, a large database was available about the responses of auditory-nerve fibers in the same species (Köppl 1997a,b; Köppl and Yates 1999), which proved essential for recognizing primary-like NA units. Sullivan (1985) found a large majority (92%) of chopper units in the barn owl NA with the remaining 8% being onset units. We believe that having no auditory-nerve data available, Sullivan conservatively interpreted any primary-like units he encountered as auditory nerve. This is consistent with the data shown in Sullivan and Konishi (1984) for phase-locking of presumed auditory-nerve fibers and NA units. The vector strengths for their NA units are largely overlapping with those of our chop-T units and many of their auditory-nerve values fall within the range of our primary-like units. Furthermore, “large, easily-isolated spikes” were used as one criterion to distinguish NA units from auditory-nerve fibers (Sullivan and Konishi 1984). This may have biased the sample in a different way compared with our set of data. 
We often found it difficult to obtain good unit isolation in NA and had best success with relatively high-impedance tungsten electrodes. Unit isolation was especially critical in the case of typeIV units, where vigorous background discharges could sometimes be observed during inhibition of the isolated unit. Several distinct morphological cell types are known in the barn owl NA, with different soma sizes and varying extents of dendritic arbours (Soares and Carr 2001). Should there be a correlation between morphology and physiology, it is possible that the probability and quality of extracellular recordings differs for the various response types. TypeIV units may also be mistaken for nonauditory neurons if CF and threshold are determined audiovisually (Sullivan 1985), instead of more extensively probing for the response map. Their weak excitatory responses are difficult to discern above the relentless spontaneous discharge and inhibitory responses may be partly masked by background excitatory discharge. We thus suggest that a combination of technical factors could explain the substantial differences between our data and previous work on the barn owl.
Another study on the redwing blackbird, a songbird, found a large majority (about 81%) of primary-like responses, about 10% typeIV units and some onset (5%) and pauser (4%) units (Sachs and Sinnott 1978). Interestingly, no chopper-like responses at all were observed. We believe that technical factors may have precluded a distinction between what we call primary-like and chop-T neurons. Sachs and Sinnott's study predated the introduction of regularity analysis as a useful criterion for this distinction and the bin width of the PSTHs shown would have been too crude to visually reveal the characteristic fast chopping pattern of chop-T responses (unfortunately, no bin width was specified, but it can be estimated at 5–10 ms from the figures). Indeed, if our primary-like and chop-T categories in the barn owl are combined, the distributions of the four remaining response types are similar to the redwing blackbird. A further intriguing point is the detailed similarity of typeIV responses in the redwing blackbird and the owl. In both species, typeIV neurons had high spontaneous rates, although most other NA neurons had lower spontaneous rates than auditory-nerve units. Their response maps showed interleaving excitatory and inhibitory areas, which differed between individual neurons. The rate-level functions at CF showed varying degrees of inhibition below spontaneous rate and the PSTH showed an onset-inhibitory response at levels with net inhibition. Finally, Sachs and Sinnott (1978) also provided evidence that the inhibitory component in typeIV responses is dynamic, by eliminating it through systemic administration of barbiturate.
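The bin-width argument above can be made concrete with a small sketch. It bins the same synthetic spike trains at a fine and a coarse resolution. The 2-ms chopping interval, the jitter, and the number of trials are assumptions chosen for illustration; only the coarse bin width of 10 ms is taken from the estimate given above.

```python
def psth(spike_times_ms, bin_width_ms, duration_ms):
    """Count spikes per bin over [0, duration_ms)."""
    n_bins = int(round(duration_ms / bin_width_ms))
    counts = [0] * n_bins
    for t in spike_times_ms:
        index = int(t // bin_width_ms)
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts

# Synthetic chopper-like response: 20 trials, spikes every 2 ms starting at
# 3 ms, with a small deterministic jitter of up to +/-0.2 ms per trial.
spikes = []
for trial in range(20):
    jitter = (trial % 5 - 2) * 0.1
    spikes.extend(3 + 2 * i + jitter for i in range(8))

fine = psth(spikes, bin_width_ms=0.5, duration_ms=20)
coarse = psth(spikes, bin_width_ms=10, duration_ms=20)
print("fine bins:  ", fine)
print("coarse bins:", coarse)
```

With fine bins, peaks alternate with empty bins and the regular chopping is visible; with coarse bins, each bin averages over several chopping cycles and the pattern cannot be distinguished from a sustained, primary-like discharge.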
Warchol and Dallos (1990) recorded in the chicken NA and, based on rate-level functions at CF and visual inspection of the PSTH at a fixed level above CF-threshold, also defined several response types. The majority (40%) were primary-like, 28% were chopper responses, and 8% onset responses. This corresponds almost exactly to what we found in the barn owl. However, the remaining 24% of the chicken neurons showed the unusual combination of a nonmonotonic rate-level function, suggestive of typeIV responses, and a chopper-like PSTH. Although chopper PSTHs have been associated with typeIV response maps in mammals (see e.g., review by Romand and Avan 1997), this was never seen in the redwing blackbird and the barn owl. A confounding factor may have been the use of barbiturate for the induction of anesthesia in the chicken. Barbiturate has been shown to eliminate the inhibitory response component of typeIV neurons (Sachs and Sinnott 1978) and, depending on the duration of this effect, may have altered the responses recorded by Warchol and Dallos (1990). Further studies are needed to determine whether typeIV neurons are truly present in the chicken.
In summary, although the results from studies of the avian NA in different species and across different labs appear to differ substantially, technical issues may be responsible for a large part of these differences. Taking those into account, the available data can be reconciled into a fairly consistent scheme across species. There are four to five different response types in NA that may partly grade into each other. The most common one in all species is probably the primary-like type, followed by a chopper-like response type. Further, more detailed studies are needed to clarify whether the chop-T classification found appropriate for most chopper responses in the barn owl is typical for birds in general. Although Sullivan (1985) classified his chopper units as transient, this was not backed by regularity analysis. Chopper responses of the sustained type were only clearly documented in two rare cases in the owl, and their unusually long latencies may indicate that these are not NA neurons but possibly afferents from the superior olive, known to project to NA (e.g., Yang et al. 1999). Onset responses are also typically present, in low proportions of less than 10%. TypeIV responses were unambiguously documented in the owl and the redwing blackbird, in similar proportions of 10–15%. They are thus likely to be a typical feature of the avian NA.
The great similarity of the physiological responses in the cochlear nucleus between birds and mammals is our most interesting result. There is firm evidence now for primary-like, chopper-transient, onset and typeIV responses in the NA, as well as another primary-like population in the NM (Sachs and Sinnott 1978; Sullivan 1985; Sullivan and Konishi 1984; Warchol and Dallos 1990). As in mammals (Blackburn and Sachs 1989; Joris et al. 1994), primary-like units tend to show inferior phase-locking compared with auditory-nerve fibers at high frequencies; however, there is no firm evidence for their superior phase-locking at frequencies less than 1 kHz (Köppl 1997b; this study). In addition, NA primary-like units show lower, and NM units higher, average spontaneous rates than auditory-nerve fibers. The primary-like-with-notch response type is probably absent in birds. Chopper responses in birds appear to be mainly of the transient type, with only tentative evidence for a rare sustained type (this study). This is in contrast to mammals, where chop-S units are always common and often the predominant type of chopper (e.g., Blackburn and Sachs 1989; Rhode and Smith 1986; Winter and Palmer 1990; Young et al. 1988). However, the distinction of mammalian chop-T and -S units by their degree of regularity is somewhat arbitrary, and Rhode and Smith (1986) have argued that there is a continuum. It is thus possible that birds simply show less variation or a more skewed distribution within the same basic response type. Onset units are also definitely present in the NA, but they were fairly rare in all studies. In mammals, onset units are commonly subdivided into three variants with onset-chopper being the most common (e.g., Rhode and Smith 1986; Winter and Palmer 1995). Some onset units in birds [e.g., our Fig. 10; Fig. 9 (Sachs and Sinnott 1978); Fig. 11 (Sullivan 1985)] are similar to the onset-I type of mammals, and no evidence has been found so far for onset-chopper responses. However, any definite classification must await a larger sample. Finally, typeIV responses in birds resemble those of mammals in great detail, even showing the same variation in the inhibitory component between units and its anesthesia sensitivity (reviews in Rhode and Greenberg 1992; Sachs and Sinnott 1978; Young and Davis 2002; this study).
Considering the fundamental morphological differences between the cochlear nuclei of birds and mammals, the great similarity of response types was somewhat surprising. NA has no immediately obvious equivalent in mammals. It is a heterogeneous nucleus with no prominent subdivisions and with several morphological cell types distributed across the tono-topic gradient (Häusler et al. 1999; Soares and Carr 2001). This is in contrast to the complexity of the mammalian cochlear nucleus with many subdivisions and most cell types characteristically concentrated within those (e.g., review in Cant 1992). In addition, while some of the morphological cell types in the NA bear obvious similarities to types found in mammals, e.g., the radiate and planar multipolar neurons, the most common avian type, termed “stubby” (Soares and Carr 2001), has no equivalent in mammals.
This raises the very interesting question of how birds implemented those same physiological types. TypeIV units represent an especially intriguing case because this response type, in mammals, is typically found in the DCN (recent review in Young and Davis 2002), a part of the cochlear nucleus with cerebellum-like organization that birds do not have. An elaborate circuit, including direct projections from the auditory nerve and a multitude of both excitatory and inhibitory connections from other cochlear-nucleus neurons (both DCN and VCN) shapes the typeIV response. It appears that the neuronal morphologies and circuits underlying these responses may well be different, representing an intriguing case of independent evolution in birds and mammals, similar to that pointed out for the avian visual wulst and mammalian visual cortex (Pettigrew and Konishi 1976). Indeed, recent evidence from fossil studies on the middle ear (Clack 1997) and comparative studies on the inner ear (Manley and Köppl 1998) suggests that the sensitive, high-frequency hearing of airborne sound may be a relatively late development in vertebrate evolution that happened independently in the major clades. Such dramatic changes in the auditory periphery may have imposed powerful selective pressures on the central auditory system (Wilczynski 1984), leading to the independent evolution of sophisticated auditory processing capabilities. Consistent with this hypothesis, onset- and chopper-like responses were also found in the dorsal medullary nucleus, the primary auditory brain stem nucleus of frogs (Hall and Feng 1990), whose homology to the cochlear nucleus of other vertebrates is highly controversial (e.g., review by McCormick 1999). Future studies are needed that explore the links between cellular morphology and physiology in the avian NA, as well as details of the neural circuitry and pharmacology.
A classic view of the avian cochlear nucleus is that of a clear dichotomy between its two main subdivisions, with NM specializing in temporal coding and NA in sound level coding. This interpretation is heavily based on the elegant experiments of Takahashi et al. (1984), which showed that silencing NA eliminated the sound-level-based selectivity of midbrain “space-specific” neurons involved in sound localization. Later, the data of Sullivan and Konishi (1984) suggested a nearly homogenous neuron population in NA with large dynamic ranges, low spontaneous and high saturation discharge rates, i.e., well suited for the coding of sound level at frequencies near their CF. Our results, as well as studies in other avian species, however, have established a much greater variety of responses and showed that the classic depiction of NA is too simplistic. Instead, they suggest that NA, in contrast to NM, is the starting point of multiple, distinct auditory processing streams. Because nothing is known about potentially different projection targets of the different response types, we may draw inferences based on comparisons with the mammalian literature.
Clearly, as earlier experiments (Takahashi et al. 1984) had shown, NA is involved in the processing of sound level. However, sound level processing may involve more than coding for a wide dynamic range via average discharge rate. Certainly, none of our response types appeared to be specialized for large dynamic ranges or high driven rates. Although individual examples of both were found, at the population level, the response types did not differ from each other or from the auditory nerve in these respects. We suggest that instead, the major response types in NA may all mediate different aspects of level coding. In addition to the rate information relayed by primary-like units, the lower discharge variability of chopper-type responses may make these units particularly suitable as inputs to interaural level comparison circuits, where accurate, invariant information about the levels at both ears is important (Manley et al. 1988; Mogdans and Knudsen 1994; Shofner and Dye 1989). Finally, typeIV units may encode spectral features, i.e., level information integrated over a wider band of frequencies (Yu and Young 2000).
In mammals, it is thought that typeIV DCN neurons are involved in the detection of spectral notches, characteristic nulls in the spectrum that are caused by the acoustic filtering properties of the pinna and provide reliable cues to sound direction (May 2000; Young and Davis 2002). Could they also fulfill this role in birds? In the barn owl, localization cues in the form of monaural spectral notches do exist at high frequencies above 6 kHz (Keller et al. 1998). This is due to the specialized feather mask that effectively works like an immobile pinna. It has also been shown that owls are able to discriminate noisy signals with missing components relative to a learned reference, i.e., are in principle able to recognize spectral-notch-like signals (Konishi and Kenuk 1975). However, there is little evidence that the animal uses spectral-notch cues in sound localization (Poganiatz and Wagner 2001). Moreover, we found typeIV neurons in the owl's NA with CFs down to frequencies less than 1 kHz. At those frequencies, monaural spectral cues have not been shown for any avian species but could theoretically be generated by a pressure difference receiver mechanism mediated by the interaural canal in many birds. Whether these cues could be of sufficient magnitude and could, e.g., take the form of sharp spectral notches, is still controversial (recent review in Klump 2000). In summary, there is little evidence to support the hypothesis that typeIV neurons in birds are used in sound localization. They may instead serve the more general function of detecting sharp spectral features (notches or peaks) in communication or other environmental sounds (Young and Davis 2002).
In addition to sound level processing being more varied than previously appreciated, it may not be the only function of NA. The presence of onset units argues for an additional involvement in temporal processing. In mammals, onset responses are found in several cell types in the cochlear nucleus (reviews in Rhode and Greenberg 1992; Romand and Avan 1997), and some are thought to encode temporal features such as broadband transients (e.g., Oertel et al. 2000). NA onset neurons may serve a similar function. This does not question the prominent and well-established role of NM in temporal processing. However, it emphasizes that NM is a very specialized nucleus, focused on one particular aspect of temporal coding (i.e., phase locking), while NA is more versatile and probably the source of multiple ascending auditory pathways.
We acknowledge C. Malek for programming support, Dr. J. Pena for assistance with Matlab, and Dr. D. Soares for help with data acquisition and histology. Knowles generously donated samples of miniature microphones. G. Manley and several anonymous reviewers kindly commented on an earlier version of the manuscript.
The work was supported by a Heisenberg Fellowship of the Deutsche Forschungsgemeinschaft to C. Köppl, by National Institute on Deafness and Other Communication Disorders Grant DC-D00436 to C. E. Carr and by the University of Maryland Center for Neuroscience.