Primates often rely on vocal communication to mediate social interactions. Although much is known about the acoustic structure of primate vocalizations and the social context in which they are usually uttered, our knowledge about the neocortical control of audio–vocal interactions in primates is still incipient, being mostly derived from lesion studies in squirrel monkeys and macaques. To map the neocortical areas related to vocal control in a New World primate species, the common marmoset, we employed a method previously used with success in other vertebrate species: analysis of the expression of the immediate early gene Egr-1 in freely behaving animals. The neocortical distribution of Egr-1 immunoreactive cells in three marmosets that were exposed to the playback of conspecific vocalizations and vocalized spontaneously (H/V group) was compared to data from three other marmosets that also heard the playback but did not vocalize (H/n group). The anterior cingulate cortex, the dorsomedial prefrontal cortex and the ventrolateral prefrontal cortex presented a higher number of Egr-1 immunoreactive cells in the H/V group than in H/n animals. Our results provide direct evidence that the ventrolateral prefrontal cortex, the region that comprises Broca's area in humans and has been associated with auditory processing of species-specific vocalizations and orofacial control in macaques, is engaged during vocal output in marmosets. Altogether, our results support the notion that the network of neocortical areas related to vocal communication in marmosets is quite similar to that of Old World primates. The vocal production role played by these areas and their importance for the evolution of speech in primates are discussed.
Many primate species rely on vocal communication to mediate social interactions (Epple, 1968; Seyfarth et al., 1980; Boinski, 1993; Boinski et al., 1994; Clark and Wrangham, 1994; Vitale et al., 2003; Arnold and Zuberbühler, 2006), and the complexity of their call repertoires appears to be a function of adaptive pressure (Stephan and Zuberbühler, 2008). In spite of the importance of primate vocal communication, our understanding of its neural underpinnings is still quite limited. Most studies have focused on the auditory processing of species-specific calls (Aitkin et al., 1988; Aitkin and Park, 1993; Rauschecker et al., 1995; Wang et al., 1995; Lu et al., 2001; Bendor and Wang, 2005; Wang et al., 2005; Petkov et al., 2008), whereas relatively little is known about the brain representation of vocal-motor programs.
Early studies based on electrical stimulation (Jürgens et al., 1967; Jürgens and Ploog, 1970) or lesions (Sutton et al., 1974; Aitken, 1981) implicated the anterior cingulate cortex (ACC) in vocal control in macaques and squirrel monkeys. These results were supported by anatomical studies indicating a connection between the ACC and the periaqueductal gray (PAG), a brainstem structure involved in vocal motor output in monkeys (Jürgens and Pratt, 1979a,b; Jürgens and Zwirner, 1996; An et al., 1998) and other vertebrates (Kittelberger et al., 2006; Jürgens, 2009). The ACC was also found to have bidirectional connections with auditory associative areas (Barbas et al., 1999), and to exert a predominantly inhibitory control over extensive secondary auditory regions in the superior temporal gyrus (Müller-Preuss et al., 1980). In contrast, no evidence was initially found for an involvement of neocortical areas such as the left ventrolateral prefrontal cortex (VLPFC), which in humans corresponds to Broca's area for articulated speech (comprising Brodmann's areas 44 and 45). A main conclusion of these experiments was the notion that the ACC may be the sole cortical structure for vocal control in non-human primates, unlike in the evolutionary branch that led to the origin of the human vocal pathway.
However, ethological approaches associated with quantitative histological analysis, high-resolution recordings of neuronal activity, and pharmacological or molecular manipulations of brain activity have improved our understanding of the neocortical system for vocal communication in non-human primates (Ghazanfar and Hauser, 1999). A variety of studies suggest the involvement of areas other than the ACC, especially parts of the prefrontal cortex, in the processing of species-specific auditory stimuli. For example, an auditory domain that includes areas 12 and 45 of the macaque brain has been found to be strongly responsive to species-specific vocalizations (Romanski and Goldman-Rakic, 2002). Functional imaging studies using playbacks of conspecific vocalizations have detected activation of this same region in one out of two monkeys (Petkov et al., 2008), and of the homologues of Broca's and Wernicke's areas in the macaque brain (Gil-da-Costa et al., 2006, but see Ghazanfar and Miller, 2006 for a critique of this work). Diffusion tensor imaging data have suggested an evolutionary gradient regarding the degree of connectivity between Broca's and Wernicke's areas: reduced in macaques, moderate in chimpanzees, and abundant in humans (Rilling et al., 2008). Furthermore, electrical stimulation of an area in macaques that is cytoarchitectonically homologous to area 44 in humans has been shown to elicit orofacial responses (Petrides et al., 2005). Despite these achievements, direct evidence of the involvement of the VLPFC in the control of vocal output in non-human primates is still missing.
The analysis of the expression of immediate early genes (IEGs) has been used to generate high-resolution maps of brain activation in response to specific stimuli in behaving animals (Chaudhuri, 1997). This approach has been particularly successful in the identification of auditory (Mello et al., 1992; Ribeiro et al., 1998) and vocal-motor representations in birds (Jarvis and Nottebohm, 1997; Jarvis and Mello, 2000; Jarvis et al., 2000; Mello, 2002). The avian studies of vocal control have focused on ZENK (a.k.a. zif-268, Egr-1, NGFI-A, and krox-24), an IEG highly sensitive to neuronal depolarization that is involved in synaptic plasticity (Wisden et al., 1990; Nottebohm, 1997; Jones et al., 2001; Mello, 2002; Knapska and Kaczmarek, 2004). In saddle-back tamarins, expression of the IEG c-FOS has been found to increase in the ACC, dorsomedial prefrontal cortex (DMPFC), and VLPFC (Jürgens et al., 1996), but the vocalizations in this study were evoked by electrical stimulation of the PAG rather than occurring spontaneously, thus limiting the interpretation of the findings. Preliminary Egr-1 immunolabeling data from marmosets undergoing spontaneous vocal production suggest that frontal neocortical areas are activated in this condition (Simões et al., 2007, 2008). Preliminary c-FOS data in marmosets also suggest the involvement of neocortical areas in antiphonal calling (Miller et al., 2005).
The common marmoset (Callithrix jacchus) stands out as a model organism that may yield further insights into the biology of vocal communication in non-human primates. Marmosets present a relatively simple neocortical architecture, very conspicuous vocal behavior (Epple, 1968; Winter, 1978; Mendes et al., 2009), and a complex social hierarchy (Yamamoto et al., 2009). Marmosets and the related cotton-top tamarins also exhibit robust antiphonal calling (Ghazanfar et al., 2001; Miller and Wang, 2006), i.e. a tendency to vocalize back upon hearing species-specific vocalizations, facilitating the design of naturalistic vocal communication paradigms. The marmoset also offers unique opportunities for comparisons between New and Old World primates. Its auditory pathways have been characterized (Aitkin and Park, 1993) and some important features of the auditory cortex have been described, including its tonotopic organization (Aitkin et al., 1986; Bendor and Wang, 2005), connectivity (Aitkin et al., 1988; de la Mothe et al., 2006), and key parameters of the auditory representation of species-specific vocalizations (Wang et al., 1995; Wang and Kadia, 2001; Nagarajan et al., 2002; Bendor and Wang, 2007). In contrast, knowledge about the vocal control pathways in marmosets remains scarce. Importantly, it is still unclear whether the ACC is the predominant cortical vocal area in marmosets, or whether the involvement of prefrontal areas in vocal behavior, at least during auditory processing, can also be extended to this New World species.
Here we describe the neocortical distribution of Egr-1 immunoreactivity in marmosets spontaneously engaged in vocal production. We took advantage of individual differences in the tendency to respond to playbacks of conspecific calls and compared animals that did or did not vocalize during the auditory stimulation. Four areas were analyzed: the auditory cortex (AC), the ACC, the DMPFC, and the VLPFC. We hypothesized that the three frontal areas would present a greater number of Egr-1 reactive cells per unit area in the vocalizing animals than in the hearing-only ones, whereas no significant difference between the groups would be found in the AC. Such a pattern would show that frontal neocortical areas, such as the DMPFC and the VLPFC, are activated during vocal production in non-human primates.
All animal work, including housing, surgical, and recording procedures, was in strict accordance with the National Institutes of Health guidelines and was approved by the Edmond and Lily Safra International Institute of Neuroscience of Natal Committee for Ethics in Animal Experimentation. Seven adult male common marmosets (C. jacchus) reared in captivity were initially housed individually for 24 h in sound-attenuating chambers (sound attenuation of ~50 dB; 40×40×50 cm, dimensions in accordance with Council, 1996).
After the 24-h isolation period, six animals were stimulated for 45 min (Figure 1A) with a playback of conspecific vocalizations. The playback tape was obtained by continuously recording freely uttered vocalizations of adult captive marmosets unfamiliar to the animals investigated here (sampling rate 44 kHz, duration of 45 min). The vocalizations present in the playback were phee calls (4.92 calls/min), twitter calls (1.75 calls/min), and chatter calls (0.38 calls/min) (Figure 1B). The tape was not edited and was presented non-stop to the experimental animals at 70 dB at 1 m through a high fidelity speaker (Selenium Super Tweeter ST350, frequency response: 2,500–20,000 Hz) positioned inside the chamber but kept out of the animals’ reach by an acrylic screen. The “hearing and vocalizing” group (H/V, n=3) consisted of animals that spontaneously vocalized upon hearing the playback (phee calls, number of calls ranging from 41 to 75; one animal also uttered 19 chatter calls). In contrast, the “hearing only” group (H/n, n=3) consisted of animals that did not vocalize at all during presentation of the stimulus.
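As a rough check on the composition of the stimulus, the call rates quoted above can be converted into approximate call counts for the 45-min tape. The sketch below is illustrative arithmetic only; the function and variable names are hypothetical, and the resulting counts are derived from the rounded rates, not reported by the authors.

```python
# Call rates (calls/min) and stimulus duration as stated in the text.
STIMULUS_MINUTES = 45
CALL_RATES_PER_MIN = {"phee": 4.92, "twitter": 1.75, "chatter": 0.38}


def expected_call_counts(rates_per_min, minutes):
    """Approximate number of calls of each type over the whole playback."""
    return {call: rate * minutes for call, rate in rates_per_min.items()}


counts = expected_call_counts(CALL_RATES_PER_MIN, STIMULUS_MINUTES)
total_calls = sum(counts.values())
# Fraction of all calls on the tape that are phee calls.
phee_share = counts["phee"] / total_calls

for call, n in counts.items():
    print(f"{call}: ~{n:.0f} calls")
print(f"phee share of all calls: {phee_share:.0%}")
```

Under these rates, phee calls make up roughly 70% of the calls on the tape.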
The animals were killed 60 min after the onset of stimulation so as to match the peak of Egr-1 protein expression (Knapska and Kaczmarek, 2004). The sound-attenuated chambers were filled with 5% isoflurane in oxygen (Cristália, Brazil), and after 5 min animals received an overdose of sodium thiopental (Cristália, Brazil; 50 mg/kg, intraperitoneal injection). Animals were then intracardially perfused with heparinized phosphate-buffered saline (PBS) at 37°C, followed by 4% paraformaldehyde in 0.1 M phosphate buffer (PB), pH 7.4, at 4°C. The brains were removed, washed for 24 h in 0.1 M PB, pH 7.4, at 4°C, and cryoprotected for another 24 h in 20% sucrose in 0.1 M PB at 4°C. The brains were then rapidly frozen in embedding medium (Tissue-Tek, Japan) using a mix of dry ice and ethanol, stored at −80°C, sectioned coronally at 20 μm on a cryostat (Microm HM 550, Germany), and thaw-mounted on glass slides (SuperFrost Plus, VWR International, USA). To facilitate the identification of areas of interest, serial sections (one every 200 μm) from all brains were stained for Nissl (0.1% cresyl violet).
Selected sections corresponding to the cortical areas of interest (see definition below) were processed for immunohistochemistry for the Egr-1 protein according to a standard protocol (Mello and Ribeiro, 1998; Ribeiro et al., 1998). Briefly, the sections were: (1) washed for 30 min in 0.1 M PB; (2) incubated for 30 min in a blocking buffer (BB) solution (0.5% fresh skim milk and 0.3% Triton X-100 in 0.1 M PB); (3) incubated overnight in rabbit primary antibody (1:100 dilution in BB; SC-189; Santa Cruz Biotechnology, USA); (4) washed for 30 min in 0.1 M PB; (5) incubated for 2 h in biotinylated goat anti-rabbit secondary antibody (1:200 dilution in BB; BA-1000; Vector Labs, USA); (6) washed for 30 min in 0.1 M PB; (7) incubated in an avidin–biotin–peroxidase complex (PK-4000; Vectastain Standard ABC kit; Vector Labs, USA) for 2 h; and (8) placed in a solution containing 0.03% 3,3'-diaminobenzidine (DAB; D5637; Sigma, USA) and 0.001% hydrogen peroxide in 0.1 M PB. The reaction was stopped after a few minutes by rinsing the sections in 0.1 M PB, pH 7.4. Sections were then dehydrated through a series of graded alcohols and coverslipped with Entellan (Merck, Germany). In order to verify the specificity of the labeling, the primary antibody was replaced by blocking buffer in some test sections. Because a large number of brain sections were analyzed per animal, the overall number of sections for the entire study was too large to be processed in a single immunohistochemistry batch. Therefore we reacted and quantified the sections in three smaller batches, each including sections from one animal in the H/V group paired with comparable sections from one animal in the H/n group.
The coronal sections containing areas of interest were identified by consulting the brain atlas of the common marmoset (Stephan et al., 1980). The AC was sampled in sections at A+5.5 mm (Figure 2A). This specific portion of the AC was chosen because of its involvement in the processing of frequencies around 8 kHz (Aitkin et al., 1988), which are characteristic of phee calls, the prevailing call type in our stimulus tape. The other three cortical areas of interest were sampled in more anterior sections at A+13.8 mm (Burman et al., 2006), which include the most anterior portion of the ACC, next to the genu (Paus, 2001), area 6m of the DMPFC (Jürgens et al., 1996; Burman et al., 2006), and the transitional region of area 12/45, which most resembles area 45 in macaques (Burman et al., 2006) (Figures 2D,G,J).
For each cortical area examined, three adjacent coronal sections from each animal were chosen for quantitative analyses. In each section, Neurolucida software (MicroBrightField, Inc., USA) was used to delimit regions of interest (ROIs), defined as 200-μm wide square boxes drawn sequentially over the cerebral cortex of both hemispheres so as to sample the tissue at regular intervals from the outer cortical layers to the white matter (Figures 2B,E,H,K). In each area, the first ROI was positioned on the border between layers I and II, as defined by inspection of adjacent sections stained with cresyl violet. The subsequent ROIs were oriented perpendicularly to the cortical surface line and equally spaced at 100 μm intervals, for a total of 4–6 non-overlapping boxes, depending on the cortical depth of each area (Figures 2C,F,I,L). All cells within ROIs identified as immunolabeled for Egr-1 were counted at 40× magnification. Labeled cells could be unambiguously identified due to a characteristic nuclear pattern of staining (Figure 1C). We did not rank the cells with respect to the intensity of immunolabeling. In order to perform group comparisons with the resulting cell counts, we combined the data from the three separately reacted batches. For this purpose, we first used a normalization procedure that consisted of dividing the number of labeled cells within each ROI by the total number of labeled cells in the corresponding batch of reacted sections. For a general group comparison, we then plotted the normalized labeled cell counts for all ROIs from the three animals in each group in box plots (Figure 4). Due to the low n, we did not subject the data to intergroup statistical analysis.
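The ROI layout and the batch normalization described above can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes that the 100 μm interval is the gap between consecutive 200 μm boxes (which is consistent with the depth ranges quoted in the Results, such as 0–200, 300–500, and 600–800 μm) and that the batch total is taken over every ROI counted in that batch. All function names, ROI keys, and counts are hypothetical.

```python
def roi_depth_ranges(n_rois, width_um=200, gap_um=100):
    """Depth range (start, end) in micrometres of each ROI, measured from the
    layer I/II border. Assumes a gap of `gap_um` between consecutive boxes."""
    step = width_um + gap_um
    return [(i * step, i * step + width_um) for i in range(n_rois)]


def normalize_by_batch(counts_by_batch):
    """Divide each ROI count by the total labeled-cell count of its batch.

    counts_by_batch maps batch id -> {roi key: raw cell count}, where a batch
    holds the ROIs of one H/V animal and its paired H/n animal.
    """
    normalized = {}
    for batch, roi_counts in counts_by_batch.items():
        total = sum(roi_counts.values())
        if total == 0:
            raise ValueError(f"batch {batch!r} has no labeled cells")
        normalized[batch] = {roi: n / total for roi, n in roi_counts.items()}
    return normalized


# Hypothetical raw counts for one batch, keyed by (group, area, ROI index).
raw = {
    "batch1": {
        ("HV", "VLPFC", 0): 30, ("HV", "VLPFC", 1): 50,
        ("Hn", "VLPFC", 0): 8, ("Hn", "VLPFC", 1): 12,
    }
}
normalized = normalize_by_batch(raw)
depths = roi_depth_ranges(4)
```

Dividing by the batch total puts the three batches on a common scale, so that differences in staining intensity between immunohistochemistry runs do not dominate the pooled comparison.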
The sequential positioning of the ROIs made it possible to assess the distribution of labeled cells across different cortical depths within each area analyzed. Thus, for a more detailed regional analysis, we calculated the ratio of the normalized labeled cell counts for each ROI between the H/V animal and its H/n counterpart, within each of the three separately reacted batches. The values obtained for each ROI from the pairs of matched animals were then averaged across the three separately reacted batches, converted into a pseudocolor scale, and displayed over anatomical drawings of the brain areas analyzed (B graphs in Figures 5–8).
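A minimal sketch of this ratio map is given below, assuming the normalized counts of each matched pair are already available per ROI. The data structure and names are hypothetical, and the handling of ROIs with no labeled cells in the H/n animal (skipped here) is an assumption, since the text does not specify it.

```python
from statistics import mean


def hv_hn_ratio_map(batches):
    """Average, across batches, of the per-ROI ratio H/V : H/n.

    batches is a list of dicts {"HV": {roi: value}, "Hn": {roi: value}}
    holding normalized counts for one matched pair of animals each.
    """
    ratio_map = {}
    for roi in batches[0]["HV"]:
        ratios = []
        for pair in batches:
            hn = pair["Hn"][roi]
            if hn > 0:  # assumption: undefined ratios are left out of the mean
                ratios.append(pair["HV"][roi] / hn)
        ratio_map[roi] = mean(ratios) if ratios else None
    return ratio_map


# Hypothetical normalized counts for two ROIs in three matched pairs.
pairs = [
    {"HV": {"roi0": 0.50, "roi1": 0.25}, "Hn": {"roi0": 0.25, "roi1": 0.25}},
    {"HV": {"roi0": 0.75, "roi1": 0.25}, "Hn": {"roi0": 0.25, "roi1": 0.50}},
    {"HV": {"roi0": 0.25, "roi1": 0.50}, "Hn": {"roi0": 0.25, "roi1": 0.25}},
]
ratio_map = hv_hn_ratio_map(pairs)
```

A ratio above 1 marks an ROI with relatively more labeling in the vocalizing animal of the pair; these averaged values are what a pseudocolor scale would then encode.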
For a more quantitative evaluation of regional/laminar differences in labeling, we plotted the raw cell counts per ROI for individual animals in each group, keeping track of the position of the ROI relative to cortical depth (Graphs A in Figures 5–8). Values across cortical layers were compared using non-parametric statistics (Kruskal–Wallis test followed by Mann–Whitney tests with Bonferroni correction for multiple comparisons) implemented with MATLAB (The MathWorks, Inc., USA). For each cortical region analyzed we also plotted and compared the raw cell count values from all ROIs in the left and right hemispheres (Mann–Whitney test, C graphs in Figures 5–8).
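The post hoc step of this procedure (pairwise Mann–Whitney tests with Bonferroni correction) can be illustrated with the sketch below. It is not the authors' MATLAB code: it is a pure-Python stand-in that computes an exact two-sided p-value by enumerating all relabelings of the pooled data, which is practical only for small samples, and it omits the Kruskal–Wallis omnibus test. The function names and the example counts are hypothetical.

```python
from itertools import combinations


def u_statistic(x, y):
    """Mann-Whitney U for x versus y: each (a, b) pair scores 1 if a > b,
    0.5 if tied, 0 otherwise."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)


def exact_two_sided_p(x, y):
    """Exact two-sided p-value from every split of the pooled data into
    groups of the original sizes."""
    pooled = list(x) + list(y)
    n_x = len(x)
    centre = n_x * len(y) / 2.0  # expected U under the null hypothesis
    observed = abs(u_statistic(x, y) - centre)
    hits = total = 0
    for idx in combinations(range(len(pooled)), n_x):
        chosen = set(idx)
        gx = [pooled[i] for i in idx]
        gy = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        if abs(u_statistic(gx, gy) - centre) >= observed - 1e-12:
            hits += 1
    return hits / total


def pairwise_bonferroni(groups, alpha=0.05):
    """All pairwise tests; a pair is significant if p < alpha / n_pairs."""
    pairs = list(combinations(sorted(groups), 2))
    threshold = alpha / len(pairs)
    results = {}
    for a, b in pairs:
        p = exact_two_sided_p(groups[a], groups[b])
        results[(a, b)] = (p, p < threshold)
    return results


# Hypothetical raw counts per ROI depth bin (three sections each).
example = {"0-200": [12, 15, 11], "300-500": [20, 22, 19], "600-800": [31, 28, 35]}
results = pairwise_bonferroni(example)
```

With only three values per group, the smallest attainable exact two-sided p-value is 0.1, so no pairwise comparison in this toy example can pass the corrected threshold, however well separated the groups are.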
Overall, the number of Egr-1 immunoreactive cells in the ACC, DMPFC and VLPFC was higher in the H/V group than in the H/n animals (Figures 3 and 4). On the other hand, no major difference was found between the groups in the AC. Detailed results for each cortical area are presented below.
In the AC, we observed statistically significant differences in the number of labeled cells across layers within the H/V animals, but the pattern was not consistent across animals, since the differences were detected at different cortical depths in different animals (Figure 5A). The highest ratios between the normalized counts of H/V and H/n groups were found at ROIs positioned between 300 and 800 μm in the left hemisphere and between 0 and 200 μm in the right hemisphere (Figure 5B), which could suggest a differential recruitment of layer IV in the left hemisphere and layer II in the right hemisphere when animals vocalize. Nevertheless, no statistically significant differences were observed between hemispheres in H/V animals (Figure 5C).
In the ACC of the H/V animals, we observed a non-significant trend for increased labeling in the ROIs between 600 and 800 μm in all three animals (Figure 6A). Interestingly, a trend for increased labeling was found in the ROIs between 300 and 500 μm in two of the H/n animals (data not shown), suggesting a differential recruitment of the supragranular layers of the ACC when animals vocalize. The highest ratios between H/V and H/n animals in the ACC occurred in the most medial ROIs of the right hemisphere (Figure 6B). Two animals showed statistically significant laterality, in favor of the left hemisphere (Figure 6C, left>right in batches 2 and 3).
In the DMPFC, the depth distribution of Egr-1 labeled cells in two H/V animals revealed a trend for increased labeling between 300 and 800 μm, but we observed the opposite in the remaining animal (Figure 7A). The highest ratios between H/V and H/n groups were found in the most superficial ROIs, especially in the right hemisphere (Figure 7B). Only one animal of the H/V group showed a statistically significant difference between hemispheres, with more labeling in the left one (Figure 7C; left>right in batch 3).
In the VLPFC, the pattern of cortical depth distribution in the H/V animals showed a consistent trend of increased labeling in the ROIs positioned between 600 and 800 μm (Figure 8A). These ROIs were positioned over cortical layer IV and its borders (Figure 2D), as defined by cresyl-violet staining of adjacent brain sections. The same pattern was also found in the H/n animals, but with a lower absolute number of immunoreactive cells (data not shown). This suggests that the VLPFC is more active as a whole when animals vocalize. However, the pseudocolored map (Figure 8B) shows that the outer ROIs between 0 and 500 μm exhibit the highest Egr-1 labeling ratios between H/V and H/n animals. Altogether, these data indicate that, although the increase in immunoreactive cells provoked exclusively by vocal output is widespread through the VLPFC, layers II and III are more strongly recruited by vocal behavior than the other layers. Laterality was observed in two animals, but for different hemispheres (Figure 8C).
Our results show that Egr-1 expression, measured as the number of nuclei immunopositive for the Egr-1 protein per unit area, is strongly induced in the ACC, DMPFC, and VLPFC when animals vocalize upon hearing conspecific calls (H/V group), but not when they hear these calls without vocalizing (H/n group). In contrast, the AC showed increased Egr-1 protein expression in both H/n and H/V animals. None of the areas analyzed showed a consistent pattern of lateralization, and therefore no conclusion regarding this issue could be drawn from the data.
The equivalent levels of Egr-1 labeling observed in the AC for the H/V and H/n groups were expected, because the AC corresponds to the primary auditory cortex (Aitkin et al., 1988) and both groups of animals were similarly exposed to playbacks of conspecific calls. Higher Egr-1 labeling in the H/V group than in the H/n animals, which occurred in the more anterior areas investigated (ACC, DMPFC, and VLPFC), also matched the expectation that these areas are required for vocal control in marmosets (Jürgens et al., 1996). Altogether, the data provide direct evidence that the prefrontal areas mentioned above are engaged in vocal communication not only by auditory processing but also – and most importantly – by vocal output. In particular, these results are consistent with the ACC playing a role in the neural control of vocalizations in primates. The data also provide direct evidence of the involvement of the DMPFC and especially the VLPFC in the control of vocal output in a non-human primate species. The functional contribution of each of these areas to marmoset vocal communication remains to be determined.
Early lesion studies (Sutton et al., 1974; Aitken, 1981) were designed to test the hypothesis that non-human primates exert volitional – and not only emotional – control over their vocal output. Those studies were based on discriminative vocal conditioning tasks, not on audio–vocal interactions relevant to social context. Therefore, repetitive conditioning may have biased the results. It is currently accepted that distinct subdivisions of the ACC are differentially involved in a variety of cognitive and motor functions, but mostly as an interface among cognition, emotion, volition, and motor output (Paus, 2001). The functional contribution of the ACC to vocal control is probably restricted to the voluntary initiation of vocal utterances (Müller-Preuss et al., 1980; Paus, 2001). Evidence from squirrel monkeys indicates that inactivation of the PAG blocks vocalizations elicited by stimulation of the cingulate cortex (Düsterhöft et al., 2000). That study revealed that some ACC neurons were particularly active during the short amount of time elapsed between hearing the vocalizations of another individual and uttering an antiphonal response (Düsterhöft et al., 2000). A similar study revealed that PAG inactivation blocks vocalizations elicited by electrical stimulation of forebrain sites such as the cingulate cortex and the hypothalamus, but does not affect vocalizations elicited by stimulation of the caudal midbrain, pons or medulla (Siebert and Jürgens, 2003), where neuronal firing has been found to be correlated with different spectral features of the vocalizations (Lüthe et al., 2000). Taken together, these findings suggest that the ACC works as a volitional gate for vocal production.
In humans, focal bilateral lesions of the ACC are associated with akinetic mutism, characterized by a marked impairment in the spontaneous initiation of speech (Paus, 2001). Patients with unilateral lesions of the ACC display aprosodic and monotonous speech, characterized by hesitation (Paus, 2001). These observations support an involvement of the ACC in the volitional control of emotional utterances. However, similar effects have been reported in patients with lesions of the supplementary motor area (SMA) (Laplane et al., 1977; Ziegler et al., 1997; Krainik et al., 2003), a region of the premotor cortex that corresponds to BA 6, in the dorsomedial region of the prefrontal cortex. The DMPFC investigated in the present report most probably corresponds to the agranular area 6m (Burman et al., 2006). In squirrel monkeys, pharmacological blockade of the PAG by a glutamatergic antagonist prevented the vocal emission elicited by electrical stimulation of the ACC, but did not block vocalizations elicited by SMA stimulation (Jürgens and Zwirner, 1996). These results led the group to postulate the existence of two parallel vocal pathways, one involving the PAG and controlled by the ACC and another independent of the PAG and controlled by the SMA. According to this view, the former pathway would be responsible for the utterance of innate vocalizations related to the emotional state of the subject, while the latter pathway would trigger learned vocalizations (Jürgens, 2002).
If the ACC and the DMPFC are selectively involved in the voluntary initiation of vocal output, which area would be responsible for the control of the acoustic features of the vocalizations in the non-human primate brain? The debate on the possible existence of a homologue of Broca's area in the VLPFC of marmosets dates back to the first half of the last century (Brodmann, 1909; von Bonin and Peden, 1947). The latter authors considered the anterior portion of the area along the frontoparietal operculus of the marmoset brain to be a homologue of Broca's area. More recent work did not refer to any cytoarchitectonic areas that might represent a marmoset homologue of human or macaque BA 44 (Burman et al., 2006). Rather, it indicates that the ventrolateral area of the marmoset prefrontal cortex resembles cortical area 47/12 of the macaque brain (Petrides and Pandya, 2002), characterized by a sharply defined layer II and a well-developed layer IV, although not as thick as in adjacent areas 10 and 46. Still according to this study, the marmoset VLPFC is limited dorsally by a transitional region that resembles area 45 in the macaque and human brains, with a thick layer IV limited by large darkly Nissl-stained neurons in layers III and V (Burman et al., 2006). Since the ventrolateral region resembles both area 47/12 in macaques and area 45 in macaques and humans, it was called area 12/45 in marmosets (Burman et al., 2006).
The injection of anterograde and retrograde tracers revealed that area 12/45 is the most extensively connected among the prefrontal areas analyzed, which also comprised dorsal, orbital, medial, and lateral areas (Roberts et al., 2007). In macaques, both areas 47/12 and 45 receive polymodal afferents: while area 47/12 receives robust inputs from associative visual areas in the rostral inferotemporal cortex, area 45 receives inputs from rostral auditory regions in the superior temporal cortex (Romanski et al., 1999; Petrides and Pandya, 2002). In the marmoset brain, it is not possible to make a clear distinction between areas 12 and 45, although some degree of specialization has been observed in projections to secondary visual areas (Burman et al., 2006; Roberts et al., 2007).
Our results show intense activation of the marmoset VLPFC as a whole during vocal production. Statistical analyses within individuals indicate that layer IV and its borders with layers III and V comprise the region with the highest number of labeled cells within the VLPFC (Figure 8A). Area 45 in the monkey brain has been suggested to play a specific role in the active – i.e. not automatic – retrieval of mnemonic information, when the stimuli do not bear reliable relationships to other stimuli or to particular contexts, thus creating the need for some degree of judgment (Petrides, 1996). In humans, BA 45 is believed to take part in episodic memory retrieval (Cabeza et al., 2002), while BA 44 seems to integrate a subvocal rehearsal system for verbal working memory (Baddeley, 1992; Paulesu et al., 1993). However, the importance of the VLPFC for the evolution of speech and vocal control in primates is not restricted to its activation during vocal behavior. In addition to an involvement in the control of orofacial musculature (Petrides et al., 2005) and working memory, the VLPFC comprises motor area F5c, the site where mirror neurons were first discovered in macaques (Rizzolatti et al., 1996). This area has been considered by some authors as the macaque homologue of human area 44, and the mirror system has been suggested to integrate a core area for the motor learning of speech. However, area F5 is agranular (Rizzolatti et al., 1996), while area 44 is dysgranular (Amunts et al., 1999). Indeed, a dysgranular area similar to area 44, just neighboring area F5, has been described in the monkey brain (Petrides et al., 2005). On the other hand, both area F5 in macaques and area 44 in humans have been reported to be responsive to both hand and mouth movements (Rizzolatti et al., 2002).
Although the inclusion of area 44 in the mirror system is still debatable, the apparent superimposition of hand and mouth representations in human cortical areas involved with complex actions and necessary for speech control gives intriguing clues regarding the evolution of speech.
A comparative analysis of our marmoset data with data from areas 47/12, 44, and 45 in macaques and 44 and 45 in humans suggests a progressive anatomical and functional specialization of the VLPFC areas towards the fine control of vocal expression in primates (Table 1), in a manner similar to that proposed by Rilling et al. (2008) for the evolution of the arcuate fasciculus. The involvement of these areas in vocal control in non-human primates suggests that their primordial vocal function greatly precedes human speech, and probably served as a pre-adaptation for its emergence. While the ACC seems to modulate volitional motor outputs by interfacing cognition and emotion, the VLPFC may have evolved as the site of spoken language in humans by interfacing crucial capabilities underlying speech, such as high-level orofacial control, working memory, and the mirror system.
Our results support the notion that diverse cortical structures are involved in the control of vocal communication in marmosets. The cortical areas investigated here, previously reported to be associated with auditory processing and vocal control in humans and macaques, are also activated in the marmoset brain during hearing and vocal production. Most importantly, our data provide direct evidence that the VLPFC, a key region for speech control in humans, is also activated during vocal production in a non-human primate. The overall coherence of our results with the recent literature on vocal control in the human and macaque brains seems to push the debate on the evolution of speech back in the primate evolutionary branch, so as to include New World monkeys in the picture.
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.