During avian vocal learning, birds memorize conspecific song patterns and then use auditory feedback to match their vocal output to this acquired template. Some models of song learning posit that during tutoring, conspecific visual, social, and/or auditory cues activate neuromodulatory systems that encourage acquisition of the tutor’s song and attach incentive value to that specific acoustic pattern. This hypothesis predicts that stimuli experienced during social tutoring activate cell populations capable of signaling reward. Using immunocytochemistry for the protein product of the immediate early gene c-Fos, we found that brief exposure of juvenile male zebra finches to a live familiar male tutor increased the density of Fos+ cells within two brain regions implicated in reward processing: the ventral tegmental area (VTA) and substantia nigra pars compacta (SNc). This activation of Fos appears to involve both dopaminergic and non-dopaminergic VTA/SNc neurons. Intriguingly, a familiar tutor was more effective than a novel tutor in stimulating Fos expression within these regions. In the periaqueductal gray (PAG), a dopamine-enriched cell population that has been implicated in emotional processing, Fos labeling also was increased after tutoring, with a familiar tutor once again being more effective than a novel conspecific. Since several neural regions implicated in song acquisition receive strong dopaminergic projections from these midbrain nuclei, their activation in conjunction with hearing the tutor’s song could help establish sensory representations that later guide motor sequence learning.
During vocal learning, songbirds memorize songs produced by others (“template acquisition”) and then practice their own vocalizations until they resemble the acquired template. Most birds exhibit strong stimulus biases in this learning, and social interactions with tutors optimize learning and influence the selection of a particular song pattern to be imitated (Immelmann, 1969; Clayton, 1987; 1988; Marler & Peters, 1988; Williams, 1990; Baptista & Gaunt, 1997; Roper & Zann, 2006). It is not known how and where such contextual factors modulate the process of vocal learning, but one likely possibility is that conspecific cues present during tutoring activate neuromodulatory systems that encode stimulus salience and/or incentive value, and modulate plasticity in the song system or other high level auditory regions.
The neurotransmitter dopamine (DA) modulates neural and behavioral plasticity, and is important for reward processing, arousal, and emotional response (for review, see Behbehani, 1995; Bandler et al., 2000; Schultz, 2002; Wise, 2004; Fields et al., 2007). A majority of dopaminergic neurons reside in the midbrain ventral tegmental area (VTA) and substantia nigra pars compacta (SNc), and neurons within these regions fire in response to intrinsically rewarding stimuli and to stimuli associated with primary rewards (Schultz, 2002). Moreover, blocking DA receptor activation within specific VTA efferent targets (e.g., striatum) impairs certain forms of reward-based learning (for review, see Beninger & Miller, 1998; Setlow & McGaugh, 1998; Wise, 2004; Dalley et al., 2005). Dopaminergic neurons also are abundant in the periaqueductal gray (PAG), a region with diverse functions including modulation of emotional state, pain processing, vocal expression, and defensive behavior (Behbehani, 1995).
In songbirds, VTA, SNc and PAG neurons innervate several regions necessary for song perception, learning, and production (Lewis et al., 1981; Appeltants et al., 2000; Appeltants et al., 2002; Castelino et al., 2007; Gale et al., 2008). Thus, conspecific cues associated with song tutoring could modulate vocal learning through dopaminergic inputs to one or more of these regions. Such cues could include visual and/or social stimuli delivered by the tutor, or even conspecific song itself, which does have intrinsic reward value as indicated by the physiological and behavioral responses it elicits in young naive songbirds (Dooling & Searcy, 1980; Adret, 1993; Whaling et al., 1997; Braaten & Reynolds, 1999; Houx & ten Cate, 1999). To assess the plausibility that VTA, SNc and/or PAG activation participates in song acquisition, we used Fos immunocytochemistry to assess neuronal activation in these regions during song tutoring in young male zebra finches. The results suggest that stimuli present during song tutoring activate these neurons and that, within two of the regions (VTA and PAG), tutoring by a familiar caretaker is a more effective stimulus than is tutoring by a novel adult male.
The University of Rochester’s Institutional Animal Care and Use Committee approved all procedures used in this study. Zebra finches (Taeniopygia guttata) were bred and raised in our laboratory on a 14:10 light:dark cycle with food and seed available ad libitum. In this species, sensory acquisition normally occurs between posthatch d25–65 (Immelmann, 1969; Price, 1979; Eales, 1985; 1987b). Juveniles were reared by both parents until posthatch day 30 (d30) and then individually isolated in soundproof chambers. On d35, juvenile males were transferred to an experimental room and either 1) placed adjacent to a stimulus female, or 2) tutored for 1 hour in a cage with their father or a novel tutor, adjacent to a stimulus female. These experimental sessions began at the onset of the light cycle to ensure that juveniles had not sung during the immediately preceding hours. Siblings were always distributed across experimental groups. Pupils were housed behind a unidirectional microphone interfaced to a sound-activated recording system (Avisoft-Recorder) with triggering configuration set to parameters previously confirmed to capture any subsong or tutor song produced. At the onset of each recording session, a human observer monitored the real-time spectrographic display to log when the first tutor song was produced and to ensure that the computer was accurately capturing and saving audio files. Tutors typically began singing immediately after the lights came on; none of the subjects used for analysis produced song on the day of tutoring. The experimental session lasted for 1 hour; for tutored birds, this began when the tutor first sang.
One hour after transfer to the experimental room (or 1 hour after the onset of tutor song for tutored birds), juveniles were deeply anesthetized and then perfused with 0.1 M phosphate-buffered saline (PBS), followed by fixation with cold 4% paraformaldehyde and 0.8% glutaraldehyde in 0.1 M phosphate buffer (pH = 7.4). Brains were removed, post-fixed for 2 h, and then sunk in 30% sucrose overnight. Frozen coronal sections (30 μm) were cut and stored in cryoprotectant at −4°C until used for ICC. A series of sections from each subject was stained with an antibody against mouse tyrosine hydroxylase (TH) to identify the VTA and SNc, as well as catecholaminergic cell bodies within the PAG. The VTA was divided into anterior, central, and posterior portions as follows (see Figure 1): anterior VTA = sections that include VTA but are anterior to the SNc; central VTA = sections that include anterior SNc and in which the SNc and VTA are connected by only a narrow bridge of cells; posterior VTA = sections that include VTA and SNc with only a slight narrowing between these two nuclei.
ICC experiments used antibodies to TH (#MAB318, Chemicon, Temecula, CA) and Fos (#SC-253, Santa Cruz Biotechnology, Santa Cruz, CA) that have been used previously in several songbird species including zebra finches (Bottjer, 1993; Bolhuis et al., 2000; Bolhuis et al., 2001; Riters et al., 2004; Heimovics & Riters, 2005; Bharati & Goodson, 2006; George et al., 2006; Hara et al., 2007; Alger et al., 2009). Specificity of the Fos antibody has been confirmed recently in starlings by the lack of immunostaining following preabsorption to the cognate peptide (Alger et al., 2009). For the TH antibody, reactive specificity has been confirmed in chicken, frog, lizard and zebrafish (technical information provided by Chemicon). Furthermore, for both antibodies, studies cited above confirmed that immunolabeling is absent in sections run without the primary antibody. Similarly, we confirmed the absence of immunolabeling in sections where either the primary or secondary was omitted. We also confirmed the specificity of our secondary antibodies by running sections with either the rabbit anti-Fos followed by anti-mouse secondary, or the mouse anti-TH followed by anti-rabbit secondary.
Sections matched for location along the AP axis were washed in 0.1 M PBS (pH = 7.4) with 0.3% Triton-X (TX), incubated in an endogenous peroxidase inhibitor (0.03% hydrogen peroxide in 10% methanol and phosphate buffer), washed again in PBS/TX, and preblocked in 10% normal goat serum (NGS). For single labeling to visualize TH expression, sections were incubated with mouse-anti-TH (1:15,000) overnight at room temperature, washed in PBS/TX, preblocked in 10% NGS, and incubated with goat anti-mouse secondary for 40 min (1:200, Vector Laboratories). Visualization was achieved using an ABC Elite kit and diaminobenzidine tablet solution (Biomeda, Foster City, CA) to produce brownish-red cytoplasmic staining. For double labeling, Fos immunolabeling preceded TH labeling. Sections were incubated in rabbit anti-Fos (1:9000) for four nights at 4°C, washed in PBS/TX, preblocked in 10% NGS, and incubated with goat anti-rabbit secondary for 40 min (1:200, Vector Laboratories). Visualization of Fos was achieved using an ABC Elite kit with an SG stain (Vector Laboratories) to produce blue-black nuclear staining. Sections then were washed with PBS/TX, preblocked in 10% NGS and processed for TH staining as described above. Sections then were rinsed in 0.1 M PBS, mounted on gelatin-coated slides, air-dried overnight, dehydrated in a graded alcohol series to xylene, and coverslipped.
Two separate experiments were conducted. Experiment 1 consisted of 2 groups: untutored controls (n=5) and pupils tutored by a familiar male (n=8; tutor was the father). For both treatment groups, an unfamiliar stimulus female was present in an adjacent cage. In this experiment, the analyses focused on Fos expression within the VTA and SNc only. For each bird, an average density of Fos+ cells was calculated from bilateral estimates taken from anterior, central, and posterior VTA. The SNc at the level of posterior VTA also was analyzed, except in one of the tutored animals where histological artifact precluded SNc analysis. Two fields (dorsal and ventral) per hemisphere were sampled for sections through anterior and central VTA. Only a single field (centrally placed) could be sampled for posterior VTA and for SNc. For each animal, the density of Fos+ cells was determined (blind to experimental condition) by counting immunoreactive cells within sampling grids (59,030 or 72,896 μm² at 40X) placed at equivalent positions within the left and right VTA and SNc.
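The density calculation described above can be sketched as follows. This is a minimal illustration and not the authors' analysis code: the function names, the conversion to cells per mm², and any counts passed in are assumptions introduced here; only the two grid areas (59,030 and 72,896 μm²) come from the text.

```python
from statistics import mean

UM2_PER_MM2 = 1_000_000  # 1 mm^2 = 10^6 um^2 (unit conversion chosen for this sketch)


def field_density(cell_count, grid_area_um2):
    """Fos+ cells per mm^2 within one sampling grid."""
    if grid_area_um2 <= 0:
        raise ValueError("grid area must be positive")
    return cell_count * UM2_PER_MM2 / grid_area_um2


def bird_density(fields):
    """Average density across all sampled fields for one bird.

    `fields` is an iterable of (cell_count, grid_area_um2) pairs, e.g. the
    left and right fields from anterior, central, and posterior VTA.
    """
    densities = [field_density(count, area) for count, area in fields]
    if not densities:
        raise ValueError("at least one sampled field is required")
    return mean(densities)
```

For example, a call such as `bird_density([(12, 59_030), (9, 59_030), (15, 72_896)])` (hypothetical counts) would return a single per-bird value that can then be compared across treatment groups. Normalizing by grid area matters here because two different grid sizes were used.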
Experiment 2 consisted of 3 groups: untutored controls, pupils tutored by a familiar tutor (father), and pupils tutored by an unfamiliar tutor. An unfamiliar stimulus female was always present in an adjacent cage. For this experiment, unused sections from all birds from Experiment 1 were processed along with sections from a new cohort of 4 control, 4 familiar tutored, and 5 unfamiliar tutored birds. Final sample sizes were: 9 untutored, 12 familiar tutored, 5 unfamiliar tutored. For this experiment, an average density of Fos+ cells in the VTA of each bird was calculated by averaging bilateral estimates from sections at the transition from anterior to central VTA and from posterior VTA. The SNc also was analyzed as described for Experiment 1. In addition, the analysis in this experiment was extended in several ways. First, Fos+ cells in the VTA/SNc were characterized as being either TH+ or TH− to assess whether group effects on Fos expression discriminated between these two cell types. Second, within the VTA and SNc, the proportion of TH+ cells expressing Fos protein was calculated. Third, to qualitatively evaluate the distribution of Fos+ cells within VTA/SNc, a camera lucida was used to plot the location of every Fos+ cell within the three sections chosen for quantitative analysis in a subset of animals (3 father tutored, 3 untutored, 4 novel tutored). For this spatial analysis, Fos+ cells also were characterized as being either TH+ or TH−. Finally, we extended the analysis to the PAG. For each bird, coronal sections from the mid and posterior VTA levels were used for this analysis. The PAG at this level includes dopaminergic cells projecting to the song-related brain regions HVC, RA, and Area X (Appeltants et al., 2000; Appeltants et al., 2002; Castelino et al., 2007). In the sections sampled for analysis, all TH+ cells within the PAG were counted, and the proportion of those also immunoreactive for Fos was calculated.
The irregular distribution of TH+ cells made it difficult to discern the actual boundaries of the PAG, so we limited our analysis to the TH+ cells specifically. All statistical comparisons were restricted to data from sections processed together. Either t-tests (2-tailed) or ANOVA followed by Bonferroni posthoc tests were used to evaluate group differences in the density of Fos+ cells. Group differences in the proportions of double labeled Fos+ cells and TH+ cells also were analyzed by ANOVA; arcsine root transformation was applied to data sets where many proportions were at the extreme range (e.g., between 0.0 – 0.3). Nonparametric tests were also used for all of these comparisons to protect against violations of any assumptions associated with parametric tests: statistical outcomes were unchanged from those reported here. Finally, regression analysis was used to evaluate relationships between the number of songs heard and the density of Fos+ cells within both the VTA and SNc.
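Two of the computations named above, the arcsine root transformation of proportions and a two-tailed two-sample t-test, can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the function names are hypothetical, and the pooled-variance (Student) form of the t statistic is assumed because the text does not say which variant was used.

```python
import math
from statistics import mean, variance


def arcsine_sqrt(p):
    """Arcsine square-root transform of a proportion in [0, 1], in radians."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("proportion must lie in [0, 1]")
    return math.asin(math.sqrt(p))


def pooled_t(group_a, group_b):
    """Two-sample Student's t with pooled variance (assumed variant).

    Returns (t, df), where df = n_a + n_b - 2.
    """
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    if pooled_var == 0:
        raise ValueError("t is undefined when both groups have zero variance")
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / std_err, df
```

The transform stretches values near 0 and 1, which is why it is commonly applied when many proportions sit at the extreme of the range. A p-value would still have to be obtained from the t distribution with the returned degrees of freedom, for instance from a statistics package.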
Adult zebra finch song consists of a sequence of notes (most with a clear harmonic structure) that is organized into a phrase that is typically repeated several times. Songs usually, but not always, begin with a series of repeated introductory notes. In contrast, subsong is an immature song pattern that juveniles begin to produce around 30 days of age. It is highly variable, lacks obvious phrase structure, and individual notes lack clear harmonic structure and generally are longer in duration than calls or song notes of adults. Because our sensitive recording parameters resulted in many sound files that lacked song (triggered by calls or wing beats), we used Avisoft-SASlab Pro to display frequency spectrographs of all files, and we retained only those that contained song. Although subsong was readily recorded in other experiments using these same recording parameters, no subsong occurred during experimental sessions on the day of sacrifice. This is likely because juveniles were not acclimated to the experimental room. Files were reviewed to count the total number of tutor song bouts in a recording session. For this purpose, a song bout was operationally defined as one or more song phrases followed by at least 1 second of baseline energy that did not contain any song notes or phrases.
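The operational definition of a song bout given above can be expressed as a small counting routine. This is a sketch only: it assumes that song phrases have already been annotated as (onset, offset) times in seconds, a step the authors performed by inspecting spectrograms, and it assumes that the last phrase in a recording closes a bout. The function and parameter names are hypothetical.

```python
def count_song_bouts(phrases, min_gap_s=1.0):
    """Count song bouts from annotated phrase times.

    `phrases` is a list of (onset_s, offset_s) tuples. A new bout starts
    whenever the silence since the end of the preceding song material is
    at least `min_gap_s` seconds (1 s in the definition used here).
    """
    bouts = 0
    last_offset = None
    for onset, offset in sorted(phrases):
        if last_offset is None or onset - last_offset >= min_gap_s:
            bouts += 1
        # Track the latest offset seen so overlapping annotations are handled.
        last_offset = offset if last_offset is None else max(last_offset, offset)
    return bouts
```

Phrases separated by less than the gap threshold are merged into one bout, so a tutor that repeats its phrase several times in quick succession contributes a single bout to the session total.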
Brief exposure to a familiar tutor increased the density of Fos+ cells in both the VTA and SNc. In Experiment 1, 35d old males that had been isolated for the previous 5 days were either tutored by a familiar song tutor (their father) or were left untutored. Tutored birds heard an average of 15 tutor song bouts during the experimental session (range = 2–63); none of the pupils sang during these sessions. As shown in Figure 2, the average density of Fos+ cells in the VTA of tutored birds was approximately 70% greater than in untutored birds (t=2.37, df=11, p<.05). Within the SNc, the average density of Fos+ cells in tutored birds was more than 2X greater than in untutored birds; however, this difference was not statistically significant (p=.172; but see below). A representative photomicrograph showing Fos immunoreactivity is shown in Figure 3.
The mean density of Fos+ cells was higher in tutored birds than in untutored birds across all A-P levels of the VTA (main effect of treatment: F (1,31) =10.37, p<.005). However, the density of Fos+ cells also varied along the A-P axis, particularly within tutored animals (main effect of location: F (2,31) =4.16, p<.025; Figure 4). Furthermore, while there was no significant interaction between treatment and location within the VTA, the effect of tutoring appeared most robust within the anterior and posterior sections of this nucleus.
In a second experiment, we found that the tutoring-induced increase in Fos expression within the VTA and SNc was sensitive to familiarity with the tutor (Figure 5). Tutoring by a familiar male caused the greatest increase in Fos expression within both regions. For the VTA, there was a significant effect of treatment (F (2, 23)=8.47, p<.002), and while the density of Fos+ cells in the VTA was greater in both groups of tutored birds than in untutored birds, Bonferroni posthoc tests revealed that only the familiar tutored and the untutored controls differed significantly (t=4.08; p<.01). For the SNc, there also was a significant overall effect of treatment (F (2, 22)=6.52, p<.01), and as in the VTA, only the group tutored by a familiar male expressed a significantly higher density of Fos+ cells than the untutored controls (t=3.57; p<.01). However, in this region, the increased density of Fos+ cells in the group tutored by an unfamiliar male approached significance (t=2.22; p=.068). It should be noted that the Fos expression levels measured in Experiment 2 were considerably higher than those observed in the first experiment. This likely stems from differences in antibody batch and/or histochemical reactions, since the higher expression in Experiment 2 was evident in both new subjects and those that were re-run from the first experiment.
Within some behavioral paradigms, the amount of vocal imitation relates inversely to the number of song repetitions heard within tutoring sessions (Tchernichovski et al., 1999). We therefore explored whether the density of Fos+ cells in the VTA or SNc varied systematically with the number of song bouts heard. For both familiar and unfamiliar-tutored birds, no significant correlations were found between the number of song bouts heard and Fos expression in either the VTA or SNc.
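A check of this kind can be sketched with a Pearson correlation coefficient between the number of song bouts heard and Fos+ cell density. The text reports only that no significant correlations were found and does not give the exact statistic used, so the choice of Pearson's r, the function name, and any input values are assumptions made for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations about the means.
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return s_xy / math.sqrt(s_xx * s_yy)
```

Here `x` would hold the bout counts for the tutored birds in one group and `y` their Fos+ densities in the VTA or SNc. Significance would be assessed separately, for example from the t distribution with n − 2 degrees of freedom.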
Although song tutoring increased the overall density of Fos+ cells in the VTA and SNc, it did not alter the proportion of Fos+ cells that were also TH+, nor did it alter the distribution of Fos+ cells or double-labeled cells within these regions. Across all groups, about 30% of the Fos+ cells in the VTA were TH+, and about 10% of the Fos+ cells in the SNc were TH+. One-way ANOVA of the percentage of double-labeled (Fos+/TH+) cells did not reveal a significant main effect of treatment in either of these regions (Figure 6). Fos+ cells (both TH+ and TH−) were not obviously restricted to particular locations within the 3 control and 3 father-tutored animals in which the location of every Fos+ VTA and SNc cell was plotted. Furthermore, tutoring did not appear to affect the distribution of these cells (see Figure 7).
Although a substantial number of Fos+ cells in the VTA and SNc were also TH+ (see above), only a small proportion of the total population of TH+ cells in these regions expressed detectable labeling by the Fos antibody. In the VTA, approximately 10% of the TH+ cells were also Fos+, while in the SNc, only about 1% of the TH+ cells were Fos+. Although the incidence of Fos labeling among TH+ cells was greater among tutored birds than untutored birds in both regions, group differences did not reach statistical significance (Figure 8).
In contrast to the VTA and SNc, a large proportion (40–65%) of the TH+ cells in the PAG expressed Fos protein, and in this region, a substantially larger percentage of TH+ neurons were double-labeled in tutored birds than in untutored controls (Figure 8). One-way ANOVA on the percentage of TH+ neurons in the PAG that were also Fos+ revealed a significant effect of tutoring (F (2,20)=7.942; p<.005). While the percentage of TH+ PAG neurons that were double labeled was greater in both groups of tutored birds than in untutored birds, Bonferroni posthoc tests revealed that only the familiar tutored and the untutored controls differed significantly (t = 3.89; p<.01).
Finally, we examined Area X for the presence of Fos+ cells because previous work has shown that brief song tutoring promotes the phosphorylation of calcium/calmodulin-dependent protein kinase II within this striatal/pallidal region of the song system (Singh et al., 2005). Very few Fos+ cells were present in Area X, even in birds that were tutored by a familiar adult male (data not shown), in agreement with previous studies reporting a lack of song-induced Fos expression within Area X (Kimpo & Doupe, 1997; Bailey & Wade, 2006).
Our results support several working hypotheses concerning how pathways encoding incentive value, arousal and/or stimulus salience could modulate vocal learning (Doya & Sejnowski, 2000; Troyer & Doupe, 2000; Fiete et al., 2007). First, we show that stimuli present during social tutoring increase Fos expression in the VTA and SNc, as well as among TH+ neurons within the PAG. These regions all provide dopaminergic input to areas implicated in song learning and production. Thus, this result is consistent with the notion that neuromodulatory pathways originating in these dopamine-enriched regions may facilitate template encoding within forebrain areas involved in perceptual learning and could contribute to species-specific stimulus preferences exhibited during song acquisition. Second, we show that a familiar tutor (i.e., the father) is more effective than a novel tutor in elevating overall Fos expression within the VTA and SNc, as well as among TH+ cells within the PAG. It is possible that young males find the presence of a familiar caretaker especially rewarding or arousing. Social interactions with adult males can bias the selection of song patterns for vocal imitation, and in particular, levels of parental care strongly influence model selection (Immelmann, 1969; Eales, 1987a; 1989; Williams, 1990; Roper & Zann, 2006). Thus, the specific song pattern produced by a caregiver could be favored during acquisition because it is accompanied consistently by a stimulus set that strongly activates neuromodulatory systems.
We do not know what specific aspects of the tutoring experience drive the Fos genomic response in VTA, SNc and PAG neurons. In zebra finches, the VTA contains auditory-responsive neurons (Gale & Perkel, 2006). Thus, one possibility is that the salient stimulus is song itself. Naïve songbirds discriminate between conspecific and heterospecific song, and the former has reward value (Dooling & Searcy, 1980; Adret, 1993; Nelson & Marler, 1993; Houx & ten Cate, 1999). Furthermore, the stronger activation elicited by the familiar tutor could reflect prior learning of that song pattern, since zebra finches can reproduce elements of a tutor’s song even when exposed to it only until 30 days after hatching (Immelmann, 1969; Yazaki-Sugiyama & Mooney, 2004; personal observations).
While song may contribute to the Fos responses measured after tutoring, the VTA, SNc and PAG can be activated by a wide variety of rewarding and/or arousing stimuli, and thus it is most likely that a constellation of stimuli associated with the tutoring experience contributes to this response. For example, sexually motivated behavior in male quail promotes expression of Fos (and Zenk/zif-268/egr-1, another immediate early gene) in the VTA and PAG as well as in several other catecholaminergic cell populations (Charlier et al., 2005). In adult songbirds, various socio-sexual behaviors increase Fos and/or Zenk expression in these same cell groups, and levels can be modulated by social context and hormonal condition. For example, Zenk expression is elevated in the VTA and PAG of singing as compared to silent male songbirds (Maney & Ball, 2003; Heimovics & Riters, 2005; Hara et al., 2007; Lynch et al., 2008), and at least in zebra finches, both Zenk expression and modulation of electrophysiological activity in the VTA are greater when song is directed towards a female, as opposed to when the singer is alone (Yanagihara & Hessler, 2006; Hara et al., 2007). While pupils in our study did not produce song, it seems likely that these singing-related changes do not merely reflect motor-related activity, since Zenk expression levels in these areas do not relate reliably to the number of songs produced (Riters et al., 2004; Heimovics & Riters, 2005; Lynch et al., 2008; but see Maney & Ball, 2003), as is the case in other song regions (Jarvis & Nottebohm, 1997). Also, singing-induced Zenk expression in the VTA is attenuated when female-directed song occurs in the absence of visual input (Hara et al., 2009). Thus, in these social contexts, as well as in the social tutoring paradigm employed here, VTA, SNc and PAG activation likely reflects the additive effect of multiple salient stimuli.
However, a critical point is that regardless of the specific stimuli responsible for activating VTA, SNc and PAG neurons during social tutoring, this activation normally would coincide with exposure to the tutor’s specific song pattern. This coincidence could facilitate the encoding of that particular auditory pattern, contribute to species-specific biases in learning, and even attach an incentive value to that song pattern that would be critical during later vocal practice. A cogent example is provided by studies showing that in rats, pairing VTA stimulation with a specific auditory stimulus increases the cortical representation for that stimulus, while explicit backward conditioning with VTA activation decreases its representation (Bao et al., 2001; Bao et al., 2003).
Dopaminergic cell groups in the midbrain receive afferent input from several areas that could process salient cues conveyed during the tutoring experience. For instance, in mammals, afferents to the VTA/SNc include brainstem, hypothalamic, striatal, and cortical areas (Geisler & Zahm, 2005; McHaffie et al., 2006; Fields et al., 2007). Although not all of these projections have been confirmed in songbirds, increased Fos expression within the VTA during sexually motivated song production is accompanied by similarly increased expression in the preoptic area of the hypothalamus, a region that projects to VTA and PAG in songbirds (Riters & Alger, 2004; Riters et al., 2004; Heimovics & Riters, 2005). Interestingly, the enhanced effect of a familiar tutor on VTA and PAG activation suggests that regions encoding previous sensory and/or social experience also impact these regions, either directly or indirectly. Regarding auditory experience, at least two separate known projections could convey such information to the VTA. The robust nucleus of the archopallium (RA) and HVC (acronym used as proper name) are two regions that form part of a descending vocal motor pathway that is essential for normal song production. Adjacent to RA is an auditory responsive “cup” region that innervates the VTA and receives direct input both from a “shelf” region ventral to HVC (Vates et al., 1996; Mello et al., 1998) and from regions of auditory pallium that exhibit long term habituation to repeated presentations of the same song (Mello et al., 1992; Fortune & Margoliash, 1995; Stripling et al., 1997). This plasticity could underlie discrimination of familiar from unfamiliar song (for review, see (Mello et al., 2004)). Auditory information also could reach the VTA/SNc from Area X via the ventral pallidum (Gale et al., 2008). 
Area X receives auditory input from HVC (Katz & Gurney, 1981; Lewicki, 1996; Mooney, 2000) and is a specialized region of the avian basal ganglia that contains both striatal and pallidal cell types (Farries & Perkel, 2002; Carrillo & Doupe, 2004; Reiner et al., 2004). This region is critical for normal song development (Sohrabji et al., 1990; Scharff & Nottebohm, 1991), and biochemical changes observed in Area X during tutoring suggest it either encodes or receives information about tutor familiarity (Singh et al., 2005).
The VTA, SNc and/or PAG could influence the perception and/or encoding of song through their projections to any of a number of song-related brain regions. These neurons innervate Area X and HVC, and the PAG also projects to RA (Lewis et al., 1981; Appeltants et al., 2000; Appeltants et al., 2002; Castelino et al., 2007; Gale et al., 2008). VTA neurons also innervate the caudomedial nidopallium (R. Pinaud, personal communication), a region of the auditory lobule that has been linked to song perception and template encoding (Mello et al., 2004; Bolhuis & Gahr, 2006; Phan et al., 2006). The majority (>95%) of X-projecting neurons within the VTA/SNc are TH+ (Person et al., 2008), as are most (~60%) HVC-projecting VTA/SNc neurons (Appeltants et al., 2000). Although our analyses of TH+/Fos+ neurons within the VTA and SNc failed to reveal a statistically significant effect of tutoring on this specific cell population, they clearly contribute to the overall effect of tutoring on Fos expression, since tutoring did not alter the proportion of TH+ cells represented among the Fos+ population. Interestingly, while only about 2% of the Zenk+ population in the VTA/SNc of adult birds co-label for TH after singing (Riters et al., 2004; Castelino & Ball, 2005; Heimovics & Riters, 2005; Hara et al., 2007), we found that 30% of the VTA neurons and 10% of the SNc neurons that were Fos+ also expressed TH after tutoring. Although these differences may relate to differences in either the age of the subjects or Zenk vs. Fos expression patterns, they may also reflect differences in which neuronal subpopulations are activated by different stimulus sets. VTA/SNc in mammals contain both GABAergic and cholinergic cells that are TH− (Kawaguchi, 1993; Tepper & Bolam, 2004), and at least the GABAergic interneurons are present within the avian VTA/SNc (Gale & Perkel, 2006; Hara et al., 2007).
Thus, the tutoring-induced Fos response likely also includes one or both of these non-dopaminergic populations. It will be important to combine Fos ICC with positive markers for TH- cell types, as well as retrograde tracing. Also, measures of DA release during tutoring could reveal how VTA/SNc/PAG activation during song tutoring impacts song regions. In adults, singing-related DA release has been measured in Area X by in-vivo microdialysis (Sasaki et al., 2006), and similar approaches could be applied to characterize dopaminergic modulation during song tutoring.
One way that ascending VTA/SNc/PAG projections could affect learning is through the modulation of synaptic changes triggered by auditory input. In other words, if DA release modulates synaptic plasticity within regions implicated in vocal learning, encoding a specific song pattern could be facilitated when it co-occurs with the activation of midbrain dopaminergic cell groups. Studies of corticostriatal synaptic plasticity indicate that DA facilitates both long-term potentiation (LTP) and long-term depression (LTD) through its actions on D1 and D2 receptors respectively (Centonze et al., 2001; Kerr & Wickens, 2001; Calabresi et al., 2007; Surmeier et al., 2007). Corticostriatal LTP involves several distinct mechanisms, including DA modulation of cholinergic interneurons as well as direct stimulation of plasticity-related molecular pathways within striatal neurons (Jay, 2003; Calabresi et al., 2007). The latter likely involves D1 receptor-mediated increases in cAMP that activate protein kinase A and stimulate the phosphorylation of dopamine- and cAMP-regulated phosphoprotein 32 kDa (DARPP-32). DARPP-32 potently inhibits protein phosphatase 1 (Svenningsson et al., 2004), and its activation would sustain phosphorylation of calcium/calmodulin-dependent protein kinase II (CaMKII), as well as both NMDA and AMPA receptors, all changes associated with increased synaptic efficacy (Soderling & Derkach, 2000; Lisman & Zhabotinsky, 2001; Genoux et al., 2002). Hence, signatures of this cascade could be used to begin a search for regional targets of DA action during acquisition.
Some progress has been made in this regard for Area X, a striatal/pallidal region that has been implicated in both template acquisition and sensorimotor learning (Sohrabji et al., 1990; Scharff & Nottebohm, 1991; Singh et al., 2005; Haesler et al., 2007). Within this region, dopamine influences neuronal excitability of the medium spiny neurons and is necessary for NMDA receptor-dependent LTP elicited at HVC and LMAN glutamatergic synapses on these neurons (Ding & Perkel, 2002; Ding et al., 2003; Ding & Perkel, 2004). These same neurons express both CaMKII and DARPP-32 (Hein et al., 2007), and song tutoring markedly increases the phosphorylation of CaMKII within Area X, with familiar song being especially effective in this regard (Singh et al., 2005). Preliminary results from our laboratory indicate that infusions of the D1/D2 antagonist ifenprodil block this tutoring-induced CaMKII phosphorylation (Hein et al., 2005). In light of the present results, these findings suggest that the co-occurrence of VTA/SNc/PAG activation and auditory input during tutoring could facilitate learning by modulating striatal LTP. It would be interesting to assess whether social tutoring increases other markers of DA-regulated LTP (such as DARPP-32 phosphorylation) in Area X or other regions implicated in song learning. Such studies could focus efforts to assess how song learning is affected by disrupting biochemical signaling cascades during tutoring.
The present results may have implications for two other important aspects of vocal learning. First, sensorimotor learning (i.e., vocal practice) could be impacted by the associations formed during tutoring between specific song patterns and activation of reward-related pathways. Vocal practice is generally viewed as a form of reinforcement-based learning (Doya & Sejnowski, 2000; Troyer & Doupe, 2000; Fiete et al., 2007). When birds produce sounds resembling the song template, learning progresses because such sounds are rewarding, and they are therefore retained selectively in the bird’s repertoire. Second, VTA/SNc/PAG activation driven by conspecific cues during song acquisition also could contribute to the precise stimulus preferences evident during this phase of learning. Marler and Peters (1988) showed that imitation of heterospecific song syllables is facilitated if such syllables are embedded with conspecific elements. It would be fascinating if such hybrid songs are more effective than heterospecific songs in driving the activation of VTA/SNc or PAG neurons. Social experience also could modify the response of these neuromodulatory systems to heterospecific song tutoring. Birds will mimic heterospecific song if they are raised by that species (Immelmann, 1969; Price, 1979; Eales, 1987a; Clayton, 1989). Perhaps such cross fostering makes heterospecific visual and/or auditory cues more effective for driving VTA, SNc or PAG neurons. Recordings of VTA/SNc/PAG activity in awake, behaving birds (Yanagihara & Hessler, 2006), coupled with manipulations of DA action during discrete stages of vocal learning, could help assess how these regions contribute to this complex instance of patterned motor learning.
We thank Heather Bradstreet, Adam Neidert, and Erin Phillips for expert technical support. NIH MH068546 and the Whitehall Foundation supported this work.