Two questions regarding syllable order selectivity in the zebra finch Area X are as follows. (i) How do neurons encode song temporal structure? In particular, for how long on average do Area X neurons integrate auditory information in song? (ii) Is syllable order selectivity influenced by experience, in particular sensory acquisition of tutor song? We first analysed syllable order selectivity in normal birds, to address the first question, and then compared the neural properties of these birds with those of birds raised without tutor song exposure (isolate birds) to address the second question.
Syllable order selectivity of Area X is highly variable across different birds
Consistent with previous reports in both Area X and the upstream nucleus HVC (Margoliash, 1983; Margoliash & Fortune, 1992; Lewicki & Konishi, 1995; Lewicki & Arthur, 1996; Doupe, 1997; Rosen & Mooney, 2000; Coleman & Mooney, 2004), Area X neurons in our normal birds showed sensitivity to the sequence of syllables in BOS (67 neurons from 19 birds). Most neurons responded to BOS more strongly than to roBOS, exhibiting substantial syllable order selectivity. When we quantified the syllable order selectivity with an SI, which approaches 1.0 if a neuron has high order selectivity and is close to 0.5 if it has low order selectivity (see Materials and methods), most neurons had SI values larger than 0.5, indicating their preference for forward BOS over roBOS. The degree of syllable order selectivity was, however, highly variable across neurons. The SI varied from 0.5 (i.e. no order selectivity) to 1.0 (i.e. high order selectivity) and was distributed widely across that range.
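The behaviour of such an index can be sketched as follows. The exact definition of SI is given in Materials and methods and is not reproduced here; the form used below, RS_BOS / (RS_BOS + RS_roBOS), is only an assumption chosen because it yields 0.5 for equal responses and approaches 1.0 as the roBOS response vanishes. The function name and the response values are hypothetical.

```python
def selectivity_index(rs_bos, rs_robos):
    """Assumed form of SI: RS_BOS / (RS_BOS + RS_roBOS).

    This formula is an illustrative assumption, not the definition taken
    from Materials and methods.
    """
    total = rs_bos + rs_robos
    if total <= 0:
        raise ValueError("summed response strength must be positive")
    return rs_bos / total


# Hypothetical response strengths (arbitrary units)
si_equal = selectivity_index(10.0, 10.0)   # roBOS as effective as BOS
si_strong = selectivity_index(10.0, 1.0)   # roBOS much weaker than BOS
print(si_equal, si_strong)
```

Under this assumed form, a neuron that responds equally to both stimuli sits at the lower bound of the range and a neuron that barely responds to roBOS sits near the upper bound.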
Fig. 2 Syllable order selectivity of Area X neurons in normal birds. (A) The mean response strength (RS) to BOS and roBOS in individual neurons are plotted. The diagonal line indicates where cells would lie if the mean values of RS to the two stimuli were equal.
What causes the highly variable degree of syllable order selectivity across different neurons? Syllable order selectivity is calculated from neural responses to forward and reversed versions of BOS. Because BOS differs between different birds, the degree of order selectivity may vary across birds depending on the acoustic structure of their song. Alternatively, because Area X is known to have more than one type of neuron (Farries & Perkel, 2002), different types of neurons may have different degrees of syllable order selectivity, even in the same birds. To test these hypotheses, we examined the variability of syllable order selectivity across different birds as well as within individual birds.
We found that the degree of syllable order selectivity in Area X neurons was highly variable across individual birds. Two birds illustrate the extremes. The bird ‘normal 15’ had Area X neurons that responded to roBOS as strongly as to forward BOS, thus exhibiting very low order selectivity. In contrast, many Area X neurons of ‘normal 16’ responded well to forward BOS but much more weakly to roBOS, so that they showed high order selectivity. Within individual birds, however, the variance of order selectivity was relatively small: all Area X neurons in normal 15 had SI values smaller than 0.7, showing little preference for BOS over roBOS, while many Area X neurons of normal 16 had much higher SI values, indicating a strong preference for forward BOS over roBOS. This trend for smaller variance in syllable order selectivity within each bird and higher variance across individual birds was observed in Area X of most birds that we tested and, accordingly, the variance across individual birds was significantly larger than the variance within individual birds (P < 0.0001, one-way ANOVA across different birds). These results suggest that the highly variable degree of syllable order selectivity is not due to the heterogeneity of neurons within individual birds, but could be due to bird-dependent factors such as the acoustic structure of BOS.
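The comparison of across-bird and within-bird variance can be sketched with a hand-written one-way ANOVA F statistic, where each group holds the SI values of the neurons recorded in one bird. The bird labels and SI values below are hypothetical, and the sketch stops at the F statistic and its degrees of freedom rather than computing a P value.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    all_values = [v for g in groups for v in g]
    n_total = len(all_values)
    k = len(groups)
    grand_mean = sum(all_values) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Between-group sum of squares: spread of group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-group sum of squares: spread of values around their own group mean
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical SI values, grouped by bird
si_by_bird = {
    "bird_a": [0.52, 0.55, 0.60, 0.58],
    "bird_b": [0.85, 0.90, 0.95, 0.88],
    "bird_c": [0.70, 0.72, 0.68, 0.75],
}
f_stat, df_b, df_w = one_way_anova_f(list(si_by_bird.values()))
print(f_stat, df_b, df_w)
```

A large F value indicates that differences between birds dominate the spread of SI values among neurons of the same bird, which is the pattern described above.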
Fig. 3 Highly variable degrees of syllable order selectivity in Area X neurons. (A) Single-unit recordings of an Area X neuron with low syllable order selectivity from one bird, ‘normal 15’ (top), and of an Area X neuron with high syllable order …
The degree of syllable order selectivity is correlated with syllable duration
The highly variable degree of syllable order selectivity across different birds raises the question of what song features might cause this. Syllable order selectivity results from a neuron integrating auditory information across syllables and, thus, the degree of order selectivity could depend on aspects of the temporal structure of song, for instance the lengths of song syllables. To investigate this possibility, we examined syllable order selectivity in individual birds in relation to their song syllable duration.
We found that the degree of syllable order selectivity in Area X neurons was inversely correlated with the mean syllable duration of each bird's song. That is, a song with longer syllables on average elicited stronger roBOS responses, resulting in lower syllable order selectivity. In general, roBOS became as effective as BOS at eliciting responses when average syllable duration approached 150–200 ms (note where the regression line crosses the x-axis). We also observed a similar phenomenon when we did not take the variability of neurons within birds into account, but simply compared mean syllable duration with the mean SI value of all neurons sampled within a nucleus in individual birds (r = −0.663, P < 0.002). These results suggest that the different degrees of syllable order selectivity across birds reflect features of the temporal structure of song, such as mean syllable duration.
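The correlation and the point at which roBOS becomes as effective as BOS can be sketched as below. The data are hypothetical (one mean SI per bird), and the sketch assumes that the "crossing" of interest is the syllable duration at which the fitted line reaches SI = 0.5, i.e. no order selectivity.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def linear_fit(xs, ys):
    """Least-squares slope and intercept of y on x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical per-bird values: mean syllable duration (ms) and mean SI
durations_ms = [60, 80, 100, 120, 140, 160]
mean_si = [0.95, 0.85, 0.80, 0.70, 0.60, 0.55]

r = pearson_r(durations_ms, mean_si)
slope, intercept = linear_fit(durations_ms, mean_si)
# Duration at which the fitted line reaches SI = 0.5 (assumed crossing point)
duration_at_no_selectivity = (0.5 - intercept) / slope
print(r, duration_at_no_selectivity)
```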
Fig. 4 Scatter plot compares syllable order selectivity with mean syllable duration of song in Area X neurons (top); oscillograms of two example songs are shown below, one with relatively long (middle) and one with short (bottom) mean syllable duration.
Possible neural mechanisms underlying the negative correlation between syllable order selectivity and mean syllable duration
What is the neural basis for this negative correlation between syllable order selectivity and mean syllable duration? One possible explanation is that Area X neurons integrate auditory information in BOS over a limited time window, so that syllable order selectivity is mainly determined by the relative time scales of the neural integration time window and syllable duration. This hypothesis can be explained as follows.
Fig. 5 Possible neural coding mechanisms underlying syllable order selectivity. In each row, schematic drawings of simplified amplitude envelopes of BOS (left column) and its roBOS (middle column) are shown on the bottoms, and the expected neural responses (PSTHs) …
If we assume that Area X neurons integrate auditory information in song using an integration time window that is limited in duration (e.g. 100 ms) and relatively constant throughout song and even across different birds, zebra finch song syllables (normally 20–200 ms duration) will be either shorter or longer than the integration time window depending on their duration. If syllables are longer than the neural integration time window, Area X neurons will be integrating auditory information for a shorter period of time than the individual syllable durations when they respond to BOS. Because reversing the order of syllables maintains the local temporal structure within individual syllables, the neurons will still be able to receive enough information from the syllables, even in roBOS, to generate a response as strong as that in forward BOS (top row of the schematic). Conversely, if individual syllables are shorter than the neural integration time window, that is, if a neuron needs to integrate acoustic information for longer than individual syllables in order to generate firing after the syllables in BOS, reversing the order of those syllables will place an acoustic change within the integration time window. Then those syllables in roBOS will evoke a much weaker response than those in forward BOS (bottom row). Thus, responsiveness of Area X neurons to roBOS will be determined mainly by the lengths of syllables contained in song. Many zebra finch songs do not have just long or short syllables, but instead have a mixture of syllable durations (middle row). Even for such songs, any syllables longer than the integration time window will evoke a response in roBOS as strong as those in forward BOS, while any syllables shorter than the time window will evoke a much weaker response in roBOS, resulting in a compound response to the entire roBOS stimulus, with a combination of strong and weak firing. Therefore, the overall neural response to roBOS (i.e. mean firing rate during roBOS) will vary depending on the proportion of long and short syllables in the song, and should still be positively correlated with mean syllable duration. Because roBOS response is inversely related to syllable order selectivity (SI) by definition (see Materials and methods), the relationship between roBOS response and mean syllable duration will result in a negative correlation between syllable order selectivity and mean syllable duration (right column of the schematic).
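The logic of this hypothesis can be illustrated with a deliberately crude toy model. All specifics below are assumptions made for illustration: a 100-ms window, an all-or-none rule in which syllables at least as long as the window evoke a full roBOS response and shorter ones a weak response of 0.2, a BOS response fixed at 1.0, and an SI of the form RS_BOS / (RS_BOS + RS_roBOS). None of these values comes from the data.

```python
def robos_response(durations_ms, window_ms=100.0, weak=0.2):
    """Mean per-syllable roBOS response under the toy all-or-none rule."""
    responses = [1.0 if d >= window_ms else weak for d in durations_ms]
    return sum(responses) / len(responses)


def predicted_si(durations_ms, window_ms=100.0, weak=0.2):
    """Predicted SI, assuming SI = RS_BOS / (RS_BOS + RS_roBOS)."""
    rs_bos = 1.0  # forward song assumed to evoke a full response
    rs_robos = robos_response(durations_ms, window_ms, weak)
    return rs_bos / (rs_bos + rs_robos)


# Hypothetical songs: lists of syllable durations in ms
songs = {
    "short": [40, 60, 80, 50],
    "mixed": [40, 150, 80, 180],
    "long": [150, 180, 200, 160],
}
for name, durations in songs.items():
    mean_dur = sum(durations) / len(durations)
    print(name, mean_dur, round(predicted_si(durations), 3))
```

In this toy setting the predicted SI falls as the proportion of syllables longer than the window rises, reproducing the sign of the relationship between order selectivity and mean syllable duration.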
Although this hypothesis can explain the variable degree of syllable order selectivity of Area X neurons and its negative dependence on mean syllable duration of individual songs, other factors could also influence order selectivity. For instance, some songs have repeats of acoustically similar syllables or of similar syllable pairs. Because these syllables provide very similar acoustic information to neurons during both forward and roBOS, regardless of their duration, they will lower the order selectivity of the neurons independently of the neural integration time. Although this factor seems unlikely to explain by itself the negative correlation between order selectivity and mean syllable duration that we found, the syllable repetitions could contribute to this negative correlation to some extent. For example, if the syllables or syllable pairs repeated in a motif tended to be long, many songs with syllable repeats would have both relatively long mean syllable duration and low order selectivity.
Altering the definitions of song syllables can change syllable order selectivity
We wondered how well the hypothesis of a limited neural integration time could explain the negative correlation between syllable order selectivity and mean syllable duration, and how much other factors such as repeated syllables might contribute to the negative correlation (see above). To address this question, we examined the effect of different syllable segmentations of the same song on syllable order selectivity and mean syllable duration. We presented birds with two different versions of roBOS, with different syllable definitions and thus different mean syllable durations, and measured the syllable order selectivity of Area X neurons. If the negative correlation between syllable order selectivity and mean syllable duration reflects the relative time scales of a limited neural integration time and syllable duration, changing the mean syllable duration by simply changing the syllable definitions will lead to different degrees of syllable order selectivity, depending on the mean duration of the newly defined syllables. Thus, Area X neurons should show a decrease in order selectivity as we cause the mean syllable duration to become longer, resulting in a negative correlation between syllable order selectivity and mean syllable duration even in the same neuron. In contrast, if other factors such as repeated syllables play critical roles in determining the syllable order selectivity of Area X neurons, neurons will not necessarily show a decreased degree of syllable order selectivity when mean syllable duration is increased by changing syllable definition.
Syllables in zebra finch songs are generally defined as segments of sound separated by silent intervals, but because the intervals are often very short (< 5 ms; for example, the interval between syllables ‘e’ and ‘f’ of the example song), in many cases the sound segments before and after the short interval can be defined either as two separate syllables (‘e’ and ‘f’) or as a single syllable (‘C’) composed of two subsyllables, depending on the experimenter. We segmented the example song in two different ways, ‘a–i’ and ‘A–E’, using different lengths of minimum intersyllable intervals (see Materials and methods), and generated two different versions of roBOS: roBOS1 is generated from finely segmented BOS (syllables ‘ihgfedcba’) and roBOS2 is generated from more grossly segmented BOS (syllables ‘EDCBA’).
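The effect of the minimum intersyllable interval on segmentation and on the resulting reverse-order stimulus can be sketched on a toy "song" in which letters stand for sound and dots for silence. The function names, the toy string and the gap threshold are hypothetical; the actual segmentation criteria are those given in Materials and methods.

```python
def merge_segments(segments, min_gap):
    """Merge consecutive (onset, offset) segments separated by gaps < min_gap."""
    merged = [segments[0]]
    for onset, offset in segments[1:]:
        prev_onset, prev_offset = merged[-1]
        if onset - prev_offset < min_gap:
            merged[-1] = (prev_onset, offset)  # treat both as one syllable
        else:
            merged.append((onset, offset))
    return merged


def mean_duration(segments):
    """Mean syllable duration, in the same units as the segment boundaries."""
    return sum(offset - onset for onset, offset in segments) / len(segments)


def reverse_syllable_order(signal, segments):
    """Reverse the order of syllables and gaps, keeping each piece intact."""
    pieces = []
    cursor = 0
    for onset, offset in segments:
        if onset > cursor:
            pieces.append(signal[cursor:onset])  # silent gap
        pieces.append(signal[onset:offset])      # syllable, internal order kept
        cursor = offset
    if cursor < len(signal):
        pieces.append(signal[cursor:])
    return "".join(reversed(pieces))


song = "AAA.BB..CCCC"                    # toy song: one character per time bin
fine = [(0, 3), (4, 6), (8, 12)]         # fine segmentation (three syllables)
coarse = merge_segments(fine, min_gap=2)  # gaps shorter than 2 bins are bridged

robos1 = reverse_syllable_order(song, fine)
robos2 = reverse_syllable_order(song, coarse)
print(robos1, mean_duration(fine))
print(robos2, mean_duration(coarse))
```

With the coarser segmentation, the first two sound segments travel together as one syllable, so the reversed stimulus preserves more of the forward song's local structure and the mean syllable duration is longer.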
The peristimulus time histograms (PSTHs) of an example Area X neuron illustrate the responses to BOS and the two versions of roBOS. The roBOS1 evoked much weaker responses than BOS, while roBOS2 evoked responses as strong as BOS, resulting in higher syllable order selectivity for BOS vs roBOS1 than for BOS vs roBOS2. Because roBOS1 had a shorter mean syllable duration (83.1 ms) than roBOS2 (138.5 ms), this result shows that syllable order selectivity is negatively correlated with mean syllable duration even for the same song, simply segmented differently (filled circles). This negative correlation, together with the strong roBOS2 response, comparable to the BOS response, supports the idea that the neuron has an integration time for generating the BOS response that is relatively limited in duration throughout the song, regardless of the acoustic structure of syllables, and mostly shorter than the mean syllable duration of roBOS2. A similar negative correlation (a decrease in syllable order selectivity after the artificial increase in mean syllable duration) was observed in most neurons in all the birds that we tested (17/19 neurons from three birds), and the SI in many cells approached 0.5 when mean syllable duration was about 150 ms. Thus, this manipulation of syllable length further supported our hypothesis of a limited neural integration time, across different neurons and different birds.
Syllable order selectivity of Area X neurons may reflect acoustic similarity over a limited neural integration time
All the neurons analysed here are selective for BOS, responding strongly to BOS but not to completely reversed BOS or conspecific songs (Kojima & Doupe, 2007). If such a song-selective neuron responds to roBOS as well as it does to BOS, this means that it has detected similar acoustic segments within the two songs. Thus, the degree of syllable order selectivity should reflect the acoustic similarity that the neuron detects between forward and roBOS. Given this, the correlation between syllable order selectivity and mean syllable duration can be decomposed into two different correlations: (i) syllable order selectivity is correlated with the acoustic similarity that neurons detect between BOS and roBOS; and (ii) the acoustic similarity is in turn correlated with mean syllable duration. We have so far investigated possible neural underpinnings of syllable order selectivity by focusing on the relationship of order selectivity to mean syllable duration. However, because BOS–roBOS acoustic similarity is the more direct determinant of syllable order selectivity, we next estimated the BOS–roBOS acoustic similarity and examined its relationship to syllable order selectivity (arrow 1), as well as that to mean syllable duration (arrow 2), in order to further test the hypothesis of a limited integration time in Area X neurons.
Because the hypothesis assumes that Area X neurons integrate acoustic information over a limited time window throughout song, and across neurons and birds, we estimated song acoustic similarity under the same assumptions – that is, our similarity measure compares segments of forward and reverse order songs using a time window that is constant in duration throughout song and across birds (see Materials and methods for details). We compared the values of this measure for each song to the same bird's neuronal order selectivity in order to examine the relationship between the acoustic similarity and syllable order selectivity (arrow 1). If such a measure assesses acoustic similarity in a manner that resembles the neural mechanisms underlying syllable order selectivity, then it should be correlated with syllable order selectivity – that is, songs with more similar segments in forward and reversed versions should be associated with neurons with lower order selectivity. Moreover, this measure has a variable – the size of the time window used for the acoustic comparison – that can be used to estimate the time scale of neural integration that Area X neurons might employ. Our results so far show that the roBOS becomes as effective as BOS at eliciting responses when average syllable duration is about 150–200 ms, indicating that most neurons integrate auditory information over shorter times than that in responding to BOS. By exploring a range of acoustic similarity window sizes, including both below and above 150 ms, and assessing which window duration leads to the strongest correlation between forward–reverse song acoustic similarity and order selectivity, we could provide more direct estimates of the average time scale of neural integration, for all neurons.
Our acoustic similarity measure (see Materials and methods for details), which we call ‘average BOS–roBOS acoustic similarity’, is based on cross-correlation between the roBOS and BOS spectrograms. This measure makes the simple assumption that responsiveness to a sound segment in roBOS is determined by the acoustic similarity between this roBOS segment and the most similar segment in BOS. It therefore calculates a correlation coefficient between the first roBOS segment (chosen to be of a particular, fixed duration, i.e. the acoustic comparison window) and the most similar segment of the same duration in BOS (i.e. a cross-correlation coefficient between the roBOS segment and the whole BOS), then repeats this procedure for all subsequent roBOS windows in the song (shifting the window 2 ms across the song for each repeat), and averages the results. This provides a measure of the maximum possible similarity of roBOS to forward BOS, for a given comparison window duration. In order to estimate what the duration of the relevant acoustic comparison window might be for the neurons, we calculated the acoustic similarity between forward and reversed songs multiple times, each time with a different, fixed acoustic comparison window size, ranging from 10 ms to 500 ms.
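The sliding-window procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used for the analyses: spectrograms are represented as lists of time frames (each a list of frequency-bin values), the "correlation coefficient" is taken to be the Pearson correlation of the flattened time-frequency windows, windows with no variance are scored as 0, and the window and step are expressed in frames rather than milliseconds. The toy syllables are hypothetical.

```python
import math


def _corr(a, b):
    """Pearson correlation of two equal-length vectors (0.0 if either is flat)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    if saa == 0 or sbb == 0:
        return 0.0
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return sab / math.sqrt(saa * sbb)


def _flatten(frames):
    return [v for frame in frames for v in frame]


def average_similarity(bos, robos, window, step=1):
    """Average, over roBOS windows, of the best correlation with any BOS window."""
    bos_windows = [_flatten(bos[t:t + window])
                   for t in range(len(bos) - window + 1)]
    scores = []
    for start in range(0, len(robos) - window + 1, step):
        segment = _flatten(robos[start:start + window])
        scores.append(max(_corr(segment, w) for w in bos_windows))
    return sum(scores) / len(scores)


# Toy spectrograms: three 4-frame "syllables" with 2 frequency bins each
syl_a = [[1, 0], [2, 0], [3, 0], [4, 0]]
syl_b = [[0, 4], [0, 1], [0, 3], [0, 2]]
syl_c = [[2, 2], [0, 1], [3, 0], [1, 3]]
bos = syl_a + syl_b + syl_c
robos = syl_c + syl_b + syl_a  # syllable order reversed, syllables intact

sim_short = average_similarity(bos, robos, window=4)   # window = one syllable
sim_long = average_similarity(bos, robos, window=12)   # window = whole song
print(sim_short, sim_long)
```

In this toy case the syllable-length window yields a higher average similarity than the song-length window, because more roBOS windows find a closely matching segment in the forward song.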
The degree of average BOS–roBOS acoustic similarity (plotted on the abscissa) depended strongly on the length of the segments being compared. When the comparison window was short (and shorter than most syllables, < 20 ms), the similarity measure was high for all songs, and covered only a small range of values across different songs, regardless of the individual songs' mean syllable durations (green dots). This is presumably because most such short roBOS segments found a matching segment in the forward song and generated correlation coefficients close to 1.0. In contrast, when the comparison window was longer (~100 ms), the similarity measure had a wider and lower range of values (red dots), likely because a considerable fraction of roBOS windows of this length will include more than one syllable placed in reverse order. Such segments will not find a matching segment (a high cross-correlation value) in the forward song. Similarity measured using this window size thus also depends on how well the comparison window size corresponds to the durations of the syllables in the song being compared and, accordingly, is significantly positively correlated with the mean syllable duration of the individual songs. When the comparison window was longer than most syllables in any bird (500 ms), the similarity measure was low for all songs (blue dots), probably because most roBOS windows included multiple syllables placed in reverse order. These similarity values were also not significantly correlated with mean syllable duration. Thus, when song segments are compared using an intermediate range of comparison time windows, average BOS–roBOS acoustic similarity is positively correlated with mean syllable duration, supporting the relationship between song similarity and syllable length hypothesized above (arrow 2).
When we then compared the song similarity measures to the syllable order selectivity of Area X neurons, we found that we could account for a significant fraction of order selectivity, but again only for a certain range of acoustic comparisons. The average BOS–roBOS acoustic similarities calculated using windows with intermediate durations (50–200 ms) were significantly negatively correlated with syllable order selectivity. That is, neurons in birds with songs of high average BOS–roBOS acoustic similarity (measured in 50–200-ms windows) tended to have low degrees of syllable order selectivity. With short (< 50 ms) and long (> 200 ms) windows, however, no significant correlations were observed. This indicates that the average BOS–roBOS acoustic similarities calculated using the intermediate time windows best account for the syllable order selectivity of those neurons, demonstrating the correlation indicated by arrow 1. This result, together with the correlation between the acoustic similarity and mean syllable duration (arrow 2), therefore corroborates the idea that Area X neurons have relatively limited integration times throughout song and across different neurons and birds, and suggests that the average integration time is on the order of 100 ms. This agrees well with the marked loss of order selectivity we observed when mean syllable duration was closer to 150–200 ms, which indicates that the neural integration time windows are mostly less than that length.
Fig. 8 Average BOS–roBOS acoustic similarity can account for syllable order selectivity. (A) The average BOS–roBOS acoustic similarities calculated with 10-, 100- and 500-ms windows (green, red and blue dots) are plotted against the syllable …
Together, the results of all our experimental manipulations and analyses suggest that much of the syllable order selectivity of Area X neurons can be explained by a relatively simple neural mechanism in which neurons integrate over a limited time window throughout song, on the order of 100 ms duration, regardless of the differences in acoustic structure between individual songs.
Area X mechanisms for encoding song temporal context in isolate birds were comparable to those in normal birds
Song-selective neurons with temporal context sensitivity have been hypothesized to be involved in song learning: their response to a sequence of syllables, at least in the song motor nucleus RA (the robust nucleus of the arcopallium), appears to be a prediction of premotor activity for the next syllable (Dave & Margoliash, 2000), which could be very useful in learning the full song sequence. Because songbirds acquire their song by first memorizing the song of an adult individual (the tutor) and then matching their vocalizations to the tutor song model, the sequence-predicting activity of song neurons may reflect sensory learning of tutor song. Consistent with this possibility, many neurons in the songbird anterior forebrain pathway, including Area X, develop selectivity for the temporal order of tutor song (Solis & Doupe, 1997; Yazaki-Sugiyama & Mooney, 2004). To address the issue of whether learned tutor song information contributes to the development of song coding mechanisms, we raised zebra finches without exposure to tutor song (isolate birds). We then examined neural encoding of BOS temporal context in isolate Area X neurons, and compared the properties of these neurons with those of normal birds.
All the isolate birds (n = 10) whose neural activity we recorded developed stereotyped songs, which often included abnormal acoustic features such as broadband noise notes, upward sweeps, an abundance of call-like notes (harmonic stacks), and abnormally long and short syllables and intervals. The isolate songs shared no song notes with their father's song, except for generic call-like notes. The spectrograms and statistical analyses of these songs have been published elsewhere (Kojima & Doupe, 2007).
Despite the abnormal acoustic properties of the isolate birds' songs, we found, using the same series of analyses of syllable order selectivity as in normal birds, that the neural encoding of song temporal structure in Area X of isolate birds was comparable to that of normal birds (44 neurons in 10 isolate birds). The results of those analyses in isolate birds are as follows. (i) The degree of syllable order selectivity was highly variable across neurons. The SI was distributed widely from 0.5 (non-selective) to 1.0 (highly selective), and neither the mean nor the variance of SI values was significantly different from that in normal birds (unpaired t-test for means, P = 0.23; F-test for equality of variances, P = 0.17). (ii) Syllable order selectivity was negatively correlated with the mean syllable duration of individual songs. As in normal birds, the roBOS became as effective as BOS at eliciting responses when average syllable duration approached 150–200 ms, suggesting that Area X neurons in isolate birds integrate auditory information over a limited time window, shorter than 150 ms. (iii) This idea was supported by a decrease in syllable order selectivity after an artificial increase of mean syllable duration, by changing syllable definition, in most isolate Area X neurons that we tested (16/17 neurons in four isolate birds). (iv) The idea of a limited integration time window was further corroborated by direct comparison between syllable order selectivity and the average BOS–roBOS acoustic similarity, calculated using comparison windows of fixed duration. As in normal birds, the average BOS–roBOS acoustic similarity in isolate birds was significantly correlated both with the mean syllable duration of individual songs and with the syllable order selectivity of Area X neurons, but only when it was calculated using time windows of intermediate durations (50 and 100 ms). This indicates that a neural mechanism with a relatively short, constant integration time window (of 50–100 ms duration) can account for much of syllable order selectivity in Area X neurons of isolate birds.
Fig. 9 Area X neural properties in isolate birds, revealed with the same analyses that were used in normal birds, are comparable to those in normal birds. (A) The mean response strength (RS) to BOS and roBOS in all Area X neurons recorded from isolate birds.
These results in isolate birds, highly comparable to those in normal birds, demonstrate that the neuronal mechanisms for temporal selectivity can develop without learned tutor song experience. Together with the conserved time scale of temporal context integration across marked differences in song acoustic structure among normal birds, these findings illustrate the independence of temporal coding in Area X neurons from the bird's experience during song development.