A key characteristic of human language efficiency is that more frequently used words tend to be shorter in length—the ‘law of brevity’. To date, no test of this relationship between frequency of use and length has been carried out on non-human animal vocal communication. We show here that the vocal repertoire of the Formosan macaque (Macaca cyclopis) conforms to the pattern predicted by the law of brevity, with an inverse relationship found between call duration and rate of utterance. This finding provides evidence for coding efficiency in the vocal communication system of this species, and indicates commonality in the basic structure of the coding system between human language and vocal communication in this non-human primate.
Formosan macaque; communication; language; coding; primate
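The inverse relationship described above can be checked with a rank correlation between call duration and rate of use. The following is a minimal sketch only: the abstract does not say which statistic the authors used, so Spearman's rank correlation is an assumption here, and the call durations and rates are hypothetical values invented for illustration, not data from the study.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical repertoire: one entry per call type (not real data).
call_duration_s = [0.08, 0.15, 0.30, 0.55, 0.90]
calls_per_hour = [42.0, 30.0, 12.0, 9.0, 2.0]

rho = spearman(call_duration_s, calls_per_hour)
print(f"Spearman rho = {rho:.2f}")
```

A negative rho is the pattern the law of brevity predicts: call types used more often tend to be shorter. With a real repertoire, a significance test on rho would also be needed.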
All non-human primates communicate with conspecifics using vocalizations, a system involving both the production and perception of species-specific vocal signals. Much of the work on the neural basis of primate vocal communication in cortex has focused on the sensory processing of vocalizations, while relatively little data are available for vocal production. Earlier physiological studies in squirrel monkeys had cast doubt on the involvement of primate cortex in vocal behaviors. The aim of the present study was to identify areas of common marmoset (Callithrix jacchus) cortex that are potentially involved in vocal communication. In this study, we quantified cFos expression in three areas of marmoset cortex – frontal, temporal (auditory), and medial temporal – under various vocal conditions. Specifically, we examined cFos expression in these cortical areas during the sensory, motor (vocal production), and sensory–motor components of vocal communication. Our results showed an increase in cFos expression in ventrolateral prefrontal cortex as well as the medial and lateral belt areas of auditory cortex in the vocal perception condition. In contrast, subjects in the vocal production condition showed increased cFos expression only in dorsal premotor cortex. During the sensory–motor condition (antiphonal calling), subjects exhibited cFos expression in each of the above areas, as well as increased expression in perirhinal cortex. Overall, these results suggest that various cortical areas outside primary auditory cortex are involved in primate vocal communication. These findings pave the way for further physiological studies of the neural basis of primate vocal communication.
immediate early gene expression; common marmoset; vocal communication; frontal cortex; auditory cortex; medial temporal cortex
The complexity of human communication has often been taken as evidence that our language reflects a true evolutionary leap, bearing little resemblance to any other animal communication system. The putative uniqueness of human language poses serious evolutionary and ethological challenges to a rational explanation of human communication. Here we review ethological, anatomical, molecular, and computational results across several species to set boundaries for these challenges. Results from animal behavior, cognitive psychology, neurobiology, and semiotics indicate that human language shares multiple features with other primate communication systems, such as specialized brain circuits for sensorimotor processing, the capability for indexical (pointing) and symbolic (referential) signaling, the importance of shared intentionality for associative learning, affective conditioning and parental scaffolding of vocal production. The most substantial differences lie in the higher human capacity for symbolic compositionality, fast vertical transmission of new symbols across generations, and irreversible accumulation of novel adaptive behaviors (cultural ratchet). We hypothesize that increasingly complex vocal conditioning of an appropriate animal model may be sufficient to trigger a semiotic ratchet, evidenced by progressive sign complexification, as spontaneous contact calls become indexes, then symbols and finally arguments (strings of symbols). To test this hypothesis, we outline a series of conditioning experiments in the common marmoset (Callithrix jacchus). The experiments are designed to probe the limits of vocal communication in a prosocial, highly vocal primate separated from the human lineage by 35 million years, so as to shed light on the mechanisms of semiotic complexification and cultural transmission, and serve as a naturalistic behavioral setting for the investigation of language disorders.
vocal learning; conditioning; operant; marmoset; semiotics; language disorders
Vocalizations are behaviorally critical sounds, and this behavioral importance is reflected in the ascending auditory system, where conspecific vocalizations are increasingly over-represented at higher processing stages. Recent evidence suggests that, in macaques, this increasing selectivity for vocalizations might culminate in a cortical region that is densely populated by vocalization-preferring neurons. Such a region might be a critical node in the representation of vocal communication sounds, underlying the recognition of vocalization type, caller and social context. These results raise the questions of whether cortical specializations for vocalization processing exist in other species, their cortical location, and their relationship to the auditory processing hierarchy. To explore cortical specializations for vocalizations in another species, we performed high-field fMRI of the auditory cortex of a vocal New World primate, the common marmoset (Callithrix jacchus). Using a sparse imaging paradigm, we discovered a caudal-rostral gradient for the processing of conspecific vocalizations in marmoset auditory cortex, with regions of the anterior temporal lobe close to the temporal pole exhibiting the highest preference for vocalizations. These results demonstrate similar cortical specializations for vocalization processing in macaques and marmosets, suggesting that cortical specializations for vocal processing might have evolved before the lineages of these species diverged.
Conversational turn-taking is an integral part of language development, as it reflects a confluence of social factors that mediate communication. Humans coordinate the timing of speech based on the behaviour of another speaker, a behaviour that is learned during infancy. While adults in several primate species engage in vocal turn-taking, the degree to which similar learning processes underlie its development in these non-human species or are unique to language is not clear. We recorded the natural vocal interactions of common marmosets (Callithrix jacchus) occurring with both their sibling twins and parents over the first year of life and observed at least two parallels with language development. First, marmoset turn-taking is a learned vocal behaviour. Second, marmoset parents potentially played a direct role in guiding the development of turn-taking by providing feedback to their offspring when errors occurred during vocal interactions, similar to what has been observed in humans. Though species differences are also evident, these findings suggest that similar learning mechanisms may be implemented in the ontogeny of vocal turn-taking across our Order, a finding that has important implications for our understanding of language evolution.
marmoset; turn-taking; vocal learning
Non-human primate communication is thought to be fundamentally different from human speech, mainly due to vast differences in vocal control. The lack of these abilities in non-human primates is especially striking if compared to some marine mammals and bird species, which has generated somewhat of an evolutionary conundrum. What are the biological roots and underlying evolutionary pressures of the human ability to voluntarily control sound production and learn the vocal utterances of others? One hypothesis is that this capacity has evolved gradually in humans from an ancestral stage that resembled the vocal behavior of modern primates. Support for this has come from studies that have documented limited vocal flexibility and convergence in different primate species, typically in calls used during social interactions. The mechanisms underlying these patterns, however, are currently unknown. Specifically, it has been difficult to rule out explanations based on genetic relatedness, suggesting that such vocal flexibility may not be the result of social learning.
To address this point, we compared the degree of acoustic similarity of contact calls in free-ranging Campbell's monkeys as a function of their social bonds and genetic relatedness. We calculated three different indices to compare the similarities between the calls' frequency contours, the duration of grooming interactions and the microsatellite-based genetic relatedness between partners. We found a significantly positive relation between bond strength and acoustic similarity that was independent of genetic relatedness.
Genetic factors determine the general species-specific call repertoire of a primate species, while social factors can influence the fine structure of some of the call types. The finding is in line with the more general hypothesis that human speech has evolved gradually from earlier primate-like vocal communication.
Male Rocky Mountain elk (Cervus elaphus nelsoni) produce loud and high fundamental frequency bugles during the mating season, in contrast to the male European Red Deer (Cervus elaphus scoticus), which produces loud and low fundamental frequency roaring calls. A critical step in understanding vocal communication is to relate sound complexity to anatomy and physiology in a causal manner. Experimentation at the sound source, often difficult in vivo in mammals, is simulated here by a finite element model of the larynx and a wave propagation model of the vocal tract, both based on the morphology and biomechanics of the elk. The model can produce a wide range of fundamental frequencies. Low fundamental frequencies require low vocal fold strain, but large lung pressure and large glottal flow if sound intensity level is to exceed 70 dB at 10 m distance. A high-frequency bugle requires both large muscular effort (to strain the vocal ligament) and high lung pressure (to overcome phonation threshold pressure), but at least 10 dB more intensity level can be achieved. Glottal efficiency, the ratio of radiated sound power to aerodynamic power at the glottis, is higher in elk, suggesting an advantage of high-pitched signaling. This advantage is based on two aspects: first, the lower airflow required for aerodynamic power and, second, an acoustic radiation advantage at higher frequencies. Both signal types are used by the respective males during the mating season and probably serve as honest signals. The two signal types relate differently to physical qualities of the sender. The low-frequency sound (Red Deer call) relates to overall body size via a strong relationship between acoustic parameters and the size of vocal organs and body size. The high-frequency bugle may signal muscular strength and endurance, via a ‘vocalizing at the edge’ mechanism, for which efficiency is critical.
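The definition of glottal efficiency given above (radiated sound power divided by aerodynamic power at the glottis) can be turned into a small worked calculation. This is a sketch under stated assumptions, not the authors' model: it assumes spherical spreading in a free field to convert an intensity level at a distance into radiated power, takes aerodynamic power as lung pressure times glottal flow, and uses hypothetical pressure and flow values. Only the 70 dB at 10 m figure and the 10 dB difference come from the abstract.

```python
import math

REFERENCE_INTENSITY = 1e-12  # W/m^2, standard reference for intensity level


def radiated_power_watts(level_db, distance_m):
    """Radiated power implied by an intensity level at a given distance.

    Assumes the source radiates equally in all directions (spherical
    spreading), which is a simplification.
    """
    intensity = REFERENCE_INTENSITY * 10 ** (level_db / 10)
    return intensity * 4 * math.pi * distance_m ** 2


def aerodynamic_power_watts(lung_pressure_pa, glottal_flow_m3_per_s):
    """Aerodynamic power at the glottis: pressure times volume flow."""
    return lung_pressure_pa * glottal_flow_m3_per_s


def glottal_efficiency(level_db, distance_m, lung_pressure_pa, glottal_flow_m3_per_s):
    """Ratio of radiated sound power to aerodynamic power."""
    radiated = radiated_power_watts(level_db, distance_m)
    aerodynamic = aerodynamic_power_watts(lung_pressure_pa, glottal_flow_m3_per_s)
    return radiated / aerodynamic


# Hypothetical lung pressure (2 kPa) and glottal flow (1 L/s), held fixed
# so that only the intensity level differs between the two cases.
efficiency_70_db = glottal_efficiency(70, 10, 2000.0, 0.001)
efficiency_80_db = glottal_efficiency(80, 10, 2000.0, 0.001)

print(f"70 dB at 10 m: efficiency = {efficiency_70_db:.5f}")
print(f"80 dB at 10 m: efficiency = {efficiency_80_db:.5f}")
```

Because the decibel scale is logarithmic, a 10 dB gain at the same aerodynamic cost corresponds to a tenfold increase in radiated power, and therefore in efficiency as defined here.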
More than 5,000 species of mammals share a basic larynx design. Many of them use the larynx to produce an enormous variability of sounds, but only in a handful of species has the physiology of sound production been studied. It is impracticable in most species because observation requires invasive techniques. Furthermore, many mammals do not spontaneously vocalize if they are manipulated or handled. We have constructed a finite element model of vocal fold tissue vibration on the basis of morphological and biomechanical features of the Rocky Mountain elk vocal organs. Operating within reasonable physiological parameter ranges, it allows the investigation of sound production efficiency as well as selective forces. The model can produce sounds with fundamental frequencies ranging between 60 and 1,200 Hz, covering not only some of the natural vocal repertoire of the elk's high-pitched bugle calls but also those of its close relative, the European Red Deer, who produces low-pitched roaring sounds with a similar anatomy. The approach is of broader interest, first because techniques can be adapted to other mammal species using only landmark anatomical and biomechanical features, and second, because simulations can serve as playbacks for perception studies investigating the role of vocalizations in communication.
Noisy acoustic environments present several challenges for the evolution of acoustic communication systems. Among the most significant is the need to limit degradation of spectro-temporal signal structure in order to maintain communicative efficacy. This can be achieved by selecting for several potentially complementary processes. Selection can act on behavioral mechanisms permitting signalers to control the timing and occurrence of signal production to avoid acoustic interference. Likewise, the signal itself may be the target of selection, biasing the evolution of its structure to comprise acoustic features that avoid interference from ambient noise or degrade minimally in the habitat. Here, we address the latter topic for common marmoset (Callithrix jacchus) long-distance contact vocalizations, known as phee calls. Our aim was to test whether this vocalization is specifically adapted for transmission in a species-typical forest habitat, the Atlantic forests of northeastern Brazil. We combined seasonal analyses of ambient habitat acoustics with experiments in which pure tones, clicks, and vocalizations were broadcast and rerecorded at different distances to characterize signal degradation in the habitat. Ambient sound was analyzed from intervals throughout the day and over rainy and dry seasons, showing temporal regularities across varied timescales. Broadcast experiment results indicated that the tone and click stimuli showed the typically inverse relationship between frequency and signaling efficacy. Although marmoset phee calls degraded over distance with marked predictability compared with artificial sounds, they did not otherwise appear to be specially designed for increased transmission efficacy or minimal interference in this habitat. We discuss these data in the context of other similar studies and evidence of potential behavioral mechanisms for avoiding acoustic interference in order to maintain effective vocal communication in common marmosets.
Callithrix jacchus; vocal communication; behavioral ecology; sound broadcasts; sound window
The ability to record well-isolated action potentials from individual neurons in naturally behaving animals is crucial for understanding neural mechanisms underlying natural behaviors. Traditional neurophysiology techniques, however, require the animal to be restrained, which often restricts natural behavior. An example is the common marmoset (Callithrix jacchus), a highly vocal New World primate species, used in our laboratory to study the neural correlates of vocal production and sensory feedback. When marmosets are restrained by traditional neurophysiological techniques, their vocal behavior is severely inhibited. Tethered recording systems, while proven effective in rodents, pose limitations in arboreal animals such as the marmoset that typically roam in a three-dimensional environment. To overcome these obstacles, we have developed a wireless neural recording technique that is capable of collecting single-unit data from chronically implanted multi-electrodes in freely moving marmosets. A lightweight, low power and low noise wireless transmitter (headstage) is attached to a multi-electrode array placed in the premotor cortex of the marmoset. The wireless headstage is capable of transmitting 15 channels of neural data with signal-to-noise ratio (SNR) comparable to a tethered system. To minimize radio-frequency (RF) and electro-magnetic interference (EMI), the experiments were conducted within a custom designed RF/EMI and acoustically shielded chamber. The individual electrodes of the multi-electrode array were periodically advanced to densely sample the cortical layers. We recorded single-unit data over a period of several months from the frontal cortex of two marmosets. These recordings demonstrate the feasibility of using our wireless recording method to study single neuron activity in freely roaming primates.
Action potential; free-roaming; marmoset; multi-channel; multi-electrode array; natural behavior; neurophysiology; neural telemetry; primate; single-unit; vocalization; wireless
To understand the evolution of acoustic communication in animals, it is important to distinguish between the structure and the usage of vocal signals, since both aspects are subject to different constraints. In terrestrial mammals, the structure of calls is largely innate, while individuals have a greater ability to actively initiate or withhold calls. In closely related taxa, one would therefore predict a higher flexibility in call usage compared to call structure. In the present study, we investigated the vocal repertoire of free living Guinea baboons (Papio papio) and examined the structure and usage of the animals’ vocal signals. Guinea baboons live in a complex multi-level social organization and exhibit a largely tolerant and affiliative social style, contrary to most other baboon taxa. To classify the vocal repertoire of male and female Guinea baboons, cluster analyses were used and focal observations were conducted to assess the usage of vocal signals in the particular contexts.
In general, the vocal repertoire of Guinea baboons largely corresponded to the vocal repertoire of other baboon taxa. The usage of calls, however, differed considerably from other baboon taxa and corresponded with the specific characteristics of the Guinea baboons’ social behaviour. While Guinea baboons showed a diminished usage of contest and display vocalizations (a common pattern observed in chacma baboons), they frequently used vocal signals during affiliative and greeting interactions.
Our study shows that the call structure of primates is largely unaffected by the species’ social system (including grouping patterns and social interactions), while the usage of calls can be more flexibly adjusted, reflecting the quality of social interactions of the individuals. Our results support the view that the primary function of social signals is to regulate social interactions, and therefore the degree of competition and cooperation may be more important to explain variation in call usage than grouping patterns or group size.
Evolution; Vocal communication; Call structure; Call usage; Guinea baboon; Social complexity; Competition
Spoken language and learned song are complex communication behaviors found in only a few species, including humans and three groups of distantly related birds – songbirds, parrots, and hummingbirds. Despite their large phylogenetic distances, these vocal learners show convergent behaviors and associated brain pathways for vocal communication. However, it is not clear whether this behavioral and anatomical convergence is associated with molecular convergence. Here we used oligo microarrays to screen for genes differentially regulated in brain nuclei necessary for producing learned vocalizations relative to adjacent brain areas that control other behaviors in avian vocal learners versus vocal non-learners. A top candidate gene in our screen was a calcium-binding protein, parvalbumin (PV). In situ hybridization verification revealed that PV was expressed significantly higher throughout the song motor pathway, including brainstem vocal motor neurons relative to the surrounding brain regions of all distantly related avian vocal learners. This differential expression was specific to PV and vocal learners, as it was not found in avian vocal non-learners nor for control genes in learners and non-learners. Similar to the vocal learning birds, higher PV up-regulation was found in the brainstem tongue motor neurons used for speech production in humans relative to a non-human primate, macaques. These results suggest repeated convergent evolution of differential PV up-regulation in the brains of vocal learners separated by more than 65–300 million years from a common ancestor and that the specialized behaviors of learned song and speech may require extra calcium buffering and signaling.
Understanding the role of avian vocal communication in social organisation requires knowledge of the vocal repertoire used to convey information. Parrots use acoustic signals in a variety of social contexts, but no studies have evaluated cross-functional use of acoustic signals by parrots, or whether these conform to signal design rules for different behavioural contexts. We statistically characterised the vocal repertoire of 61 free-living Lilac-crowned Amazons (Amazona finschi) in nine behavioural contexts (nesting, threat, alarm, foraging, perched, take-off, flight, landing, and food soliciting). We aimed to determine whether parrots demonstrated contextual flexibility in their vocal repertoire, and whether these acoustic signals follow design rules that could maximise communication.
The Lilac-crowned Amazon had a diverse vocal repertoire of 101 note-types emitted at least twice, 58 of which were emitted ≥5 times. Threat and nesting contexts had the greatest variety and proportion of exclusive note-types, although the most common note-types were emitted in all behavioural contexts but with differing proportional contribution. Behavioural context significantly explained variation in acoustic features, where threat and nesting contexts had the highest mean frequencies and broad bandwidths, and alarm signals had a high emission rate of 3.6 notes/s. Three Principal Components explained 72.03 % of the variation in temporal and spectral characteristics of notes. Permutated Discriminant Function Analysis using these Principal Components demonstrated that 28 note-types (emitted by >1 individual) could be correctly classified and significantly discriminated from a random model.
Acoustic features of Lilac-crowned Amazon vocalisations in specific behavioural contexts conformed to signal design rules. Lilac-crowned Amazons modified the emission rate and proportional contribution of note-types used in each context, suggesting the use of graded and combinatorial variation to encode information. We propose that evaluation of vocal repertoires based on note-types would reflect the true extent of a species’ vocal flexibility, and the potential for combinatorial structures in parrot acoustic signals.
Electronic supplementary material
The online version of this article (doi:10.1186/s12983-016-0169-6) contains supplementary material, which is available to authorized users.
Animal communication; Lilac-crowned Amazon; Psittaciformes; Signal design rules; Tropical dry forest
Researchers have described multilevel societies with one-male, multifemale units (OMUs) forming within a larger group in several catarrhine species, but not in platyrrhines. OMUs in multilevel societies are associated with extremely large group sizes, often with >100 individuals, and the only platyrrhine genus that forms groups of this size is Cacajao. We review available evidence for multilevel organization and the formation of OMUs in groups of Cacajao, and test predictions for the frequency distribution patterns of male–male and male–female interindividual distances within groups of red-faced uakaris (Cacajao calvus ucayalii), comparing year-round data with those collected at the peak of the breeding season, when group cohesion may be more pronounced. Groups of Cacajao fission and fuse, forming subgroup sizes at frequencies consistent with an OMU organization. In Cacajao calvus ucayalii and Cacajao calvus calvus, bachelor groups are also observed, a characteristic of several catarrhine species that form OMUs. However, researchers have observed both multimale–multifemale groups and groups with a single male and multiple females in Cacajao calvus. The frequency distributions of interindividual distances for male–male and male–female dyads are consistent with an OMU-based organization, but alternative interpretations of these data are possible. The distribution of interindividual distances collected during the peak breeding season differed from those collected year-round, indicating seasonal changes in the spatial organization of Cacajao calvus ucayalii. We suggest a high degree of flexibility may characterize the social organization of Cacajao calvus ucayalii, which may form OMUs under certain conditions. Further studies with identifiable individuals, thus far not possible in Cacajao, are required to confirm the social organization.
Breeding system; Mating system; One-male unit; Pitheciine; Social structure
Catecholamines, which include the neurotransmitters dopamine and noradrenaline, are known modulators of sensorimotor function, reproduction, and sexually motivated behaviors across vertebrates, including vocal-acoustic communication. Recently, we demonstrated robust catecholaminergic (CA) innervation throughout the vocal-motor system in the plainfin midshipman fish, Porichthys notatus, a seasonal breeding marine teleost that produces vocal signals for social communication. There are two distinct male reproductive morphs in this species: type I males establish nests and court females with a long duration advertisement call, while type II males sneak-spawn to steal fertilizations from type I males. Like females, type II males can only produce brief, agonistic, grunt-type vocalizations. Here, we tested the hypothesis that intrasexual differences in the numbers of CA neurons and their fiber innervation patterns throughout the vocal-motor pathway may provide neural substrates underlying divergence in reproductive behavior between morphs. We employed immunofluorescence (-ir) histochemistry to measure tyrosine hydroxylase (TH, rate-limiting enzyme in catecholamine synthesis) neuron numbers in several forebrain and hindbrain nuclei as well as TH-ir fiber innervation throughout the vocal pathway in type I and type II males collected from nests during the summer reproductive season. After controlling for differences in body size, only one group of CA neurons displayed an unequivocal difference between male morphs: the extraventricular vagal-associated TH-ir neurons, located just lateral to the dimorphic vocal motor nucleus (VMN), were significantly greater in number in type II males. In addition, type II males exhibited greater TH-ir fiber density within the VMN and greater numbers of TH-ir varicosities with putative contacts on vocal motor neurons.
This strong inverse relationship between the predominant vocal morphotype and CA innervation of vocal motor neurons suggests catecholamines may function to inhibit vocal output in midshipman. These findings support catecholamines as direct modulators of vocal behavior and differential CA input appears reflective of social and reproductive behavioral divergence between male midshipman morphs.
teleost; dopamine; noradrenaline; catecholamine; vocal motor; alternative reproductive tactics; posterior tuberculum; locus coeruleus; preoptic area
Primates are intensely social and exhibit extreme variation in social structure, making them particularly well suited for uncovering evolutionary connections between sociality and vocal complexity. Although comparative studies find a correlation between social and vocal complexity, the function of large vocal repertoires in more complex societies remains unclear. We compared the vocal complexity found in primates to both mammals in general and human language in particular and found that non-human primates are not unusual in the complexity of their vocal repertoires. To better understand the function of vocal complexity within primates, we compared two closely related primates (chacma baboons and geladas) that differ in their ecology and social structures. A key difference is that gelada males form long-term bonds with the 2–12 females in their harem-like reproductive unit, while chacma males primarily form temporary consortships with females. We identified homologous and non-homologous calls and related the use of the derived non-homologous calls to specific social situations. We found that the socially complex (but ecologically simple) geladas have larger vocal repertoires. Derived vocalizations of geladas were primarily used by leader males in affiliative interactions with ‘their’ females. The derived calls were frequently used following fights within the unit suggesting that maintaining cross-sex bonds within a reproductive unit contributed to this instance of evolved vocal complexity. Thus, our comparison highlights the utility of using closely related species to better understand the function of vocal complexity.
derived vocalizations; group size; homologous vocalizations; social complexity; vocal complexity; vocal repertoire
Determining whether a species' vocal communication system is graded or discrete requires definition of its vocal repertoire. In this context, research on domestic pig (Sus scrofa domesticus) vocalizations, for example, has led to significant advances in our understanding of communicative functions. Despite their close relation to domestic pigs, little is known about wild boar (Sus scrofa) vocalizations. The few existing studies, conducted in the 1970s, relied on visual inspections of spectrograms to quantify acoustic parameters and lacked statistical analysis. Here, we use objective signal processing techniques and advanced statistical approaches to classify 616 calls recorded from semi‐free ranging animals. Based on four spectral and temporal acoustic parameters (quartile Q25, duration, spectral flux, and spectral flatness) extracted from a multivariate analysis, we refine and extend the conclusions drawn from previous work and present a statistically validated classification of the wild boar vocal repertoire into four call types: grunts, grunt‐squeals, squeals, and trumpets. While the majority of calls could be sorted into these categories using objective criteria, we also found evidence supporting a graded interpretation of some wild boar vocalizations as acoustically continuous, with the extremes representing discrete call types. The use of objective criteria based on modern techniques and statistics with respect to acoustic continuity advances our understanding of vocal variation. Integrating our findings with recent studies on domestic pig vocal behavior and emotions, we emphasize the importance of grunt‐squeals for acoustic approaches to animal welfare and underline the need for further research investigating the role of domestication on animal vocal communication.
acoustic communication; graded vocalizations; sus scrofa; vocal repertoire; wild boar
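Of the four acoustic parameters named above, spectral flatness is perhaps the least self-explanatory. The sketch below uses the common textbook definition, the geometric mean of the power spectrum divided by its arithmetic mean, which is an assumption: the abstract does not state the exact formula or software the authors used. The signals are synthetic (a pure tone and seeded random noise), not recordings, and the small `eps` offset is an illustrative choice to avoid taking the logarithm of zero.

```python
import cmath
import math
import random


def power_spectrum(signal):
    """Power at positive frequencies from a naive discrete Fourier transform.

    The DC and Nyquist bins are skipped. This is O(n^2) and meant only
    for short illustrative signals.
    """
    n = len(signal)
    spectrum = []
    for k in range(1, n // 2):
        coeff = sum(
            signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def spectral_flatness(power, eps=1e-12):
    """Geometric mean over arithmetic mean of a power spectrum.

    Values near 1 indicate a noise-like (flat) spectrum, values near 0
    a tonal one.
    """
    shifted = [p + eps for p in power]
    geometric = math.exp(sum(math.log(p) for p in shifted) / len(shifted))
    arithmetic = sum(shifted) / len(shifted)
    return geometric / arithmetic


n_samples = 128
# Pure tone with exactly 8 cycles in the analysis window.
tone = [math.sin(2 * math.pi * 8 * t / n_samples) for t in range(n_samples)]
# Seeded Gaussian noise so the example is reproducible.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]

tone_flatness = spectral_flatness(power_spectrum(tone))
noise_flatness = spectral_flatness(power_spectrum(noise))
print(f"tone flatness:  {tone_flatness:.4f}")
print(f"noise flatness: {noise_flatness:.4f}")
```

On this definition, a harmonic call would be expected to score closer to the tone and a noisy call closer to the noise, which is why a measure of this kind can help separate tonal from noisy call types.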
Vocal fold control was critical to the evolution of spoken language, much as it today allows us to learn vowel systems. It has, however, never been demonstrated directly in a non-human primate, leading to the suggestion that it evolved in the human lineage after divergence from great apes. Here, we provide the first evidence for real-time, dynamic and interactive vocal fold control in a great ape during an imitation “do-as-I-do” game with a human demonstrator. Notably, the orang-utan subject skilfully produced “wookies” – an idiosyncratic vocalization exhibiting a unique spectral profile among the orang-utan vocal repertoire. The subject instantaneously matched human-produced wookies as they were randomly modulated in pitch, adjusting his voice frequency up or down when the human demonstrator did so, readily generating distinct low vs. high frequency sub-variants. These sub-variants were significantly different from spontaneous ones (not produced in matching trials). Results indicate a latent capacity for vocal fold exercise in a great ape (i) in real-time, (ii) up and down the frequency spectrum, (iii) across a register range beyond the species-repertoire and, (iv) in a co-operative turn-taking social setup. Such ancestral capacity likely provided the neuro-behavioural basis of the more fine-tuned vocal fold control that is a human hallmark.
Comparative analyses used to reconstruct the evolution of traits associated with the human language faculty, including its socio-cognitive underpinnings, highlight the importance of evolutionary constraints limiting vocal learning in non-human primates. After a brief overview of this field of research and the neural basis of primate vocalizations, we review studies that have addressed the genetic basis of usage and structure of ultrasonic communication in mice, with a focus on the gene FOXP2 involved in specific language impairments and neuroligin genes (NL-3 and NL-4) involved in autism spectrum disorders. Knockout of FoxP2 leads to reduced vocal behavior and eventually premature death. Introducing the human variant of FoxP2 protein into mice, in contrast, results in shifts in frequency and modulation of pup ultrasonic vocalizations. Knockout of NL-3 and NL-4 in mice diminishes social behavior and vocalizations. Although such studies may provide insights into the molecular and neural basis of social and communicative behavior, the structure of mouse vocalizations is largely innate, limiting the suitability of the mouse model to study human speech, a learned mode of production. Although knockout or replacement of single genes has perceptible effects on behavior, these genes are part of larger networks whose functions remain poorly understood. In humans, for instance, deficiencies in NL-4 can lead to a broad spectrum of disorders, suggesting that further factors (experiential and/or genetic) contribute to the variation in clinical symptoms. The precise nature as well as the interaction of these factors is yet to be determined.
Autism; communication; evolution; FOXP2; mice; neuroligin; speech; ultrasound; vocalization
Response properties of primary auditory cortical neurons in the adult common marmoset monkey (Callithrix jacchus) were modified by extensive exposure to altered vocalizations that were self-generated and rehearsed frequently. A laryngeal apparatus modification procedure permanently lowered the frequency content of the native twitter call, a complex communication vocalization consisting of a series of frequency modulation (FM) sweeps. Monkeys vocalized shortly after this procedure and maintained voicing efforts until physiological evaluation 5–15 months later. The altered twitter calls improved over time, with FM sweeps approaching but never reaching the normal spectral range. Neurons with characteristic frequencies <4.3 kHz that had been weakly activated by native twitter calls were recruited to encode self-uttered altered twitter vocalizations. These neurons showed a decrease in response magnitude and an increase in temporal dispersion of response timing to twitter call and parametric FM stimuli but a normal response profile to pure tone stimuli. Tonotopic maps in voice-modified monkeys were not distorted. These findings suggest a previously unrecognized form of cortical plasticity that is specific to higher-order processes involved in the discrimination of more complex sounds, such as species-specific vocalizations.
auditory cortex; plasticity; primate; vocalization; learning; twitter call
High background noise is an important obstacle to successful detection and perception of an intended acoustic signal. To overcome this problem, many animals modify their acoustic signal by increasing the repetition rate, duration, amplitude or frequency range of the signal. An alternative method to ensure successful signal reception, yet to be tested in animals, involves the use of two different types of signal, where one signal type may enhance the other in periods of high background noise. Humpback whale communication signals comprise two different types: vocal signals, and surface-generated signals such as ‘breaching’ or ‘pectoral slapping’. We found that humpback whales gradually switched from primarily vocal to primarily surface-generated communication with increasing wind speeds and background noise levels, though they kept both signal types in their repertoire. Vocal signals have the advantage of higher information content but may have the disadvantage of losing this information in a noisy environment. Surface-generated sounds have energy distributed over a greater frequency range and may be less likely to become confused in periods of high wind-generated noise but have less information content than vocal sounds. Therefore, surface-generated sounds may improve detection or enhance the perception of vocal signals in a noisy environment.
acoustic communication; humpback whales; background noise; acoustic behaviour; communication strategy
Bats are among the most gregarious and vocal mammals, with some species demonstrating a diverse repertoire of syllables under a variety of behavioral contexts. Despite extensive characterization of big brown bat (Eptesicus fuscus) biosonar signals, there have been no detailed studies of adult social vocalizations. We recorded and analyzed social vocalizations and associated behaviors of captive big brown bats under four behavioral contexts: low aggression, medium aggression, high aggression, and appeasement. Even limited to these contexts, big brown bats possess a rich repertoire of social vocalizations, with 18 distinct syllable types automatically classified using a spectrogram cross-correlation procedure. For each behavioral context, we describe vocalizations in terms of syllable acoustics, temporal emission patterns, and typical syllable sequences. Emotion-related acoustic cues are evident within the call structure by context-specific syllable types or variations in the temporal emission pattern. We designed a paradigm that could evoke aggressive vocalizations while monitoring heart rate as an objective measure of internal physiological state. Changes in the magnitude and duration of elevated heart rate scaled to the level of evoked aggression, confirming the behavioral state classifications assessed by vocalizations and behavioral displays. These results reveal a complex acoustic communication system among big brown bats in which acoustic cues and call structure signal the emotional state of a caller.
The prefrontal cortex is associated with cognitive functions that include planning, reasoning, decision-making, working memory, and communication. Neurophysiology and neuropsychology studies have established that dorsolateral prefrontal cortex is essential in spatial working memory while the ventral frontal lobe processes language and communication signals. Single-unit recordings in nonhuman primates have shown that ventral prefrontal cortex (VLPFC) neurons integrate face and vocal information and are active during audiovisual working memory. However, whether VLPFC is essential in remembering face and voice information is unknown. We therefore trained nonhuman primates in an audiovisual working memory paradigm using naturalistic face-vocalization movies as memoranda. We inactivated VLPFC with reversible cortical cooling and examined performance when faces, vocalizations, or both faces and vocalizations had to be remembered. We found that VLPFC inactivation impaired subjects' performance in audiovisual and auditory-alone versions of the task. In contrast, VLPFC inactivation did not disrupt visual working memory. Our studies demonstrate the importance of VLPFC in auditory and audiovisual working memory for social stimuli but suggest a different role for VLPFC in unimodal visual processing.
SIGNIFICANCE STATEMENT The ventral frontal lobe, or inferior frontal gyrus, plays an important role in audiovisual communication in the human brain. Studies with nonhuman primates have found that neurons within ventral prefrontal cortex (VLPFC) encode both faces and vocalizations and that VLPFC is active when animals need to remember these social stimuli. In the present study, we temporarily inactivated VLPFC by cooling the cortex while nonhuman primates performed a working memory task. This impaired the ability of subjects to remember a face and vocalization pair or just the vocalization alone. Our work highlights the importance of the primate VLPFC in the processing of faces and vocalizations in a manner that is similar to the inferior frontal gyrus in the human brain.
monkey; multisensory; visual working memory; vocalization
Vocal learning underlies acquisition of both language in humans and vocal signals in some avian taxa. These bird groups and humans exhibit convergent developmental phases and associated brain pathways for vocal communication. The transcription factor FoxP2 plays critical roles in vocal learning in humans and songbirds. Another member of the forkhead box gene family, FoxP1, also shows high expression in brain areas involved in vocal learning and production. Here, we investigate FoxP2 and FoxP1 mRNA and protein in adult male budgerigars (Melopsittacus undulatus), a parrot species that exhibits vocal learning as both juveniles and adults. To examine these molecules in adult vocal learners, we compared their expression patterns in the budgerigar striatal nucleus involved in vocal learning, the magnocellular nucleus of the medial striatum (MMSt), across birds in different vocal states: vocalizing to a female (directed), vocalizing alone (undirected), and non-vocalizing. We found that both FoxP2 mRNA and protein expression were consistently lower in MMSt than in the adjacent striatum regardless of vocal state, whereas previous work has shown that songbirds exhibit downregulation in the homologous region, Area X, only after singing alone. In contrast, FoxP1 levels were high in MMSt compared to the adjacent striatum in all groups. Taken together, these results strengthen the general hypothesis that FoxP2 and FoxP1 have specialized expression in vocal nuclei across a range of taxa, and suggest that the adult vocal plasticity seen in budgerigars may be a product of persistent downregulation of FoxP2 in MMSt.
budgerigar; neural gene expression; FoxP1; FoxP2; open-ended vocal learning; vocal behavior
The evolution of the autonomic nervous system provides an organizing principle to interpret the adaptive significance of physiological systems in promoting social behavior and responding to social challenges. This phylogenetic shift in neural regulation of the autonomic nervous system in mammals has produced a neuroanatomically integrated social engagement system, including neural mechanisms that regulate both cardiac vagal tone and muscles involved in vocalization. Mammalian vocalizations are part of a conspecific social communication system, with several mammalian species modulating acoustic features of vocalizations to signal affective state. Prosody, defined by variations in rhythm and pitch, is a feature of mammalian vocalizations that communicates emotion and affective state. While the covariation between physiological state and the acoustic frequencies of vocalizations is neurophysiologically based, few studies have investigated the covariation between vocal prosody and autonomic state. In response to this paucity of scientific evidence, the current study explored the utility of vocal prosody as a sensitive index of autonomic activity in human infants during the Still Face challenge. Overall, significant correlations were observed between several acoustic features of the infant vocalizations and autonomic state: shorter heart period, and reductions in heart period and respiratory sinus arrhythmia following the challenge, were associated with dampened modulation of the acoustic features (fundamental frequency, variance, 50% bandwidth, and duration) that are perceived as prosody.
Infant vocalizations; Prosody; Polyvagal Theory; Autonomic nervous system; Heart rate; Respiratory sinus arrhythmia
Across mammals, many vocal sounds are produced by airflow-induced vocal fold oscillation. We tested the hypothesis that stress-strain and stress-relaxation behavior of rat vocal folds can be used to predict the fundamental frequency range of the species’ vocal repertoire. As a first approximation, vocal fold oscillation has been modeled by the string model, but it is not known whether this concept applies equally to large and small species. The shorter the vocal fold, the more the ideal string law may underestimate normal mode frequencies. To accommodate the very small size of the tissue specimen, a custom-built miniaturized tensile test apparatus was developed. Tissue properties of 6 male rat vocal folds were measured. Rat vocal folds demonstrated the typical linear stress-strain behavior in the low strain region and an exponential stress response at strains larger than about 40%. Approximating the rat’s vocal fold oscillation with the string model suggests that fundamental frequencies up to about 6 kHz can be produced, which agrees with frequencies reported for audible rat vocalization. Individual differences and time-dependent changes in the tissue properties parallel findings in other species, and are interpreted as universal features of the laryngeal sound source.
larynx; viscoelastic properties; bioacoustics; anisotropy
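The string model mentioned in the abstract above can be sketched numerically. The ideal string law gives the fundamental frequency as f0 = (1 / 2L) * sqrt(stress / density), where L is the vibrating length. The abstract reports neither the vocal fold length nor the tissue density used, so the 1 mm length and 1040 kg/m^3 density below are illustrative assumptions only, and the function names are hypothetical.

```python
import math

# Illustrative assumptions, not values taken from the abstract.
ASSUMED_LENGTH_M = 0.001          # vibrating vocal fold length of 1 mm
ASSUMED_DENSITY_KG_M3 = 1040.0    # soft-tissue density


def string_model_f0(length_m, stress_pa, density_kg_m3=ASSUMED_DENSITY_KG_M3):
    """Fundamental frequency (Hz) of an ideal string under longitudinal stress."""
    if length_m <= 0 or density_kg_m3 <= 0 or stress_pa < 0:
        raise ValueError("length and density must be positive, stress non-negative")
    return math.sqrt(stress_pa / density_kg_m3) / (2.0 * length_m)


def stress_for_f0(f0_hz, length_m, density_kg_m3=ASSUMED_DENSITY_KG_M3):
    """Invert the string law: tissue stress (Pa) needed to reach a target f0."""
    return density_kg_m3 * (2.0 * length_m * f0_hz) ** 2


# Stress the assumed string would need to reach the roughly 6 kHz upper
# bound quoted in the abstract.
required_stress_pa = stress_for_f0(6000.0, ASSUMED_LENGTH_M)
recovered_f0_hz = string_model_f0(ASSUMED_LENGTH_M, required_stress_pa)
```

Under these assumed values the required stress comes out at roughly 150 kPa. Because f0 scales with 1/L, halving the length doubles the predicted frequency at the same stress, which is why the law is sensitive to vocal fold size.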