The evolution of language and its mechanisms has been a topic of intense speculation and debate, particularly with respect to the question of innate endowment. Modern biological sciences (neurobiology and neuroethology) have made great strides in understanding the proximate and ultimate causes of behavior. These insights are generally ignored in the debate regarding linguistic knowledge, especially in the realm of syntax, where core theoretical constructs have been proposed unconstrained by evolutionary biology. Taking the perspective of organismal biology offers a principled approach to the study of language that is sensitive to its evolutionary context, a growing trend in other domains of cognitive science as well. The emergence of a research program in the comparative biology of syntax is one concrete example of this trend.
Language is a highly specialized behavior, characterized by symbolic communication, complex pattern structures, and diverse, if not unlimited, meanings. It develops under strong innate constraints but in interaction with a complex environment, representing a cultural heritage that is passed on across generations but may be lost as a language dies out. The evolutionary history of language is shrouded in mystery because there is no record of the vocal or gestural behavior or its functional consequences in pre-lingual hominids, and no agreement as to when and how language emerged and how it subsequently evolved.
The putatively unique attributes of language such as generativity (an unlimited number of possible sentences) [1-3] and syntactic complexity are often cited [4-7] as motivation to posit the existence of core language processes that are uninformed by all other psychological processes, with the biological study of language being limited to human biology. The apparent uniqueness of certain functional attributes of human language such as generativity could be taken as suggesting that there is little to be gained by comparisons with other species and a broader biological perspective. The emergence of a fledgling research program in the comparative biology of syntax, however, and the controversies it has engendered regarding the uniqueness of recursive syntactic processing in humans, motivate re-evaluation of these issues [8, 9].
The comparative approach is applicable to all organisms, including humans, and can give insight into unique attributes of language. Apparently unique specializations are well known and studied as a standard component of evolutionary theory. Furthermore, from the perspective of evolutionary biology, the broader context of communication, social interaction, cognition, and speciation forms a valuable and indeed necessary framework from which to view human language.
Adopting a non-evolutionary perspective on language ignores the central organizing theory of modern biology and all that has sprung from it. Any substitute theory can be scrutinized from the evolutionary perspective for possible limitations. Indeed, in the non-evolutionary approach to the study of language, a principal shortcoming has been the reliance on first-principles arguments such as parsimony, unconstrained by established biological fact. While such arguments can be very powerful, divorcing the study of human behavior from the foundation of modern biology ultimately abandons many constraints of empirical science, as we argue has occurred for the study of language.
Unlike other empirical sciences, the objects of study in biology embody a rich evolutionary history, which is an essential component of the answer to the questions “how” and “why”. A non-biological approach based on first principles such as parsimony can easily overlook the consequences of evolutionary history [see 10].
Consider how a non-biological approach would fail in the analysis of a system strongly dependent on feedback. Weakly electric fish sample their environment by producing low-current, time-varying signals with their electric organs. Changes in the distribution of current along the body surface represent information to the fish about changes in the local environment. The discharges of two fish in close proximity can degrade electrolocation performance. To cope with this, individuals change parameters of their discharge to establish sufficient separation so that behavioral performance is unaffected, a behavior known as the jamming avoidance response. For example, in species that produce sinewave-like discharges, the fish with the lower frequency will lower its frequency and the fish with the higher frequency will increase its frequency.
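The behavioral outcome just described can be sketched in a few lines of Python. This is only a toy illustration of the observable behavior (two sinewave-like discharges moving apart in frequency), not a model of the neural mechanism the fish actually uses. The function name, the frequencies in Hz, the step size, and the separation threshold are all hypothetical values chosen for illustration.

```python
def jamming_avoidance(f1, f2, min_separation=10.0, step=1.0, max_steps=1000):
    """Shift two discharge frequencies apart until they differ by min_separation.

    All numbers are illustrative assumptions, not measured values.
    """
    for _ in range(max_steps):
        if abs(f1 - f2) >= min_separation:
            break  # discharges no longer interfere in this toy model
        if f1 <= f2:
            # The lower-frequency fish lowers its frequency, the higher one raises it.
            # (A tie is broken arbitrarily by treating f1 as the lower fish.)
            f1 -= step
            f2 += step
        else:
            f1 += step
            f2 -= step
    return f1, f2


low, high = jamming_avoidance(400.0, 402.0)
print(low, high)
```

Note that this sketch compares the two frequencies directly, which is exactly the kind of shortcut a first-principles design would take; it reproduces the outcome of the behavior without saying anything about how the fish achieves it.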
Parsimony dictates a simple electronic circuit that directly sums input and output to control an amplifier mimicking the fish behavior, but such a circuit uses feedback in a fundamentally different fashion than the fish does in reality. The jamming avoidance response is thought to have evolved from an ancestral circuit, common in many species of fish, in which the brainstem pacemaker nucleus does not project back up to the brain. It is surprising, but only from a first-principles perspective naive of the biology, that the fish cannot directly sense its own electric organ discharge independent of environmental cues. The fish also adjusts its expectation of changes in the electric field based on its own tail movements (the tail being where the electric organ resides), a behavior that cancels predictable changes in the electric field and heightens responses to unexpected signals. Tail movements become part of the feedback. This is exceedingly elegant engineering, making electrolocation a premier vertebrate model system for studying brain and behavior [11, 12]. Such solutions are non-intuitive from a non-biological perspective, but abound in evolutionary history.
Superficially such animal examples may seem irrelevant to the student of language. But this is how evolution solves problems in brain and behavior, and is likely to also constrain language. Animal studies give insight into motor control, perception, learning, and cognition, defining powerful constraints on brain and behavior, and cannot logically be ignored in language research.
Can the complexity of an organism be explained from a first-principles perspective uninformed by an evolutionary perspective? Consider vocal learning, whereby young birds acquire auditory memories through exposure to adult song, then acquire their own songs with feedback from trial-and-error practice guided by comparison with the auditory memory acting as a “template”. Vocal learning depends on species-specific constraints on what signals can trigger memory formation, an innate predisposition for acquiring conspecific songs, innate and acquired templates, and a host of other factors, especially social interactions [see 14]. Learning may proceed by instructional but also selectional and “action-based” processes. Vocal learning does not proceed monotonically from simple to complex vocalizations, and different individuals of the same species may adopt fundamentally different learning strategies. Complex sleep-dependent circadian variation in singing patterns may be fundamental to vocal learning, arising perhaps from replay-like mechanisms that invoke sensory information during sleep interacting with sensorimotor information during singing.
How might one explain such complexity? The neuroethological approach is to combine the study of proximate and ultimate mechanisms. This approach has yielded an enormous trove of mechanistic, behavioral, and evolutionary insight, and is beginning to forge deep interactions with psychology. There is an emerging explanation for all the complexity in biology and behavior: the parts are beginning to fit together [see 12, 20, 21]. Indeed, this kind of approach has begun to shape some views of language [e.g., 22, 23] and other aspects of psychology presumed to be unique to humans [24, 25]. Computational modeling informed by this kind of biological perspective has been able to demonstrate very complex language behavior and language learning from relatively simple neural mechanisms.
Without a biological perspective, however, only a first-principles approach unconstrained by evolutionary theory remains. This has often been the approach in language research, in which putative observable facts of linguistic complexity are set against a very real lack of understanding of biological mechanisms and of the richness of social and environmental interactions. The danger is that biological ignorance leads to an overestimation of the benefits of first-principles reasoning, with no estimate of the scientific errors introduced. Without sufficient empirical grounding, there is little constraint on theoretical adequacy.
Consider, for example, the nature/nurture debate. Children generally learn language with little intentional direction by adults. This observation (rarely quantified) is often cited as evidence for a special innate language module, discounting the pervasive social interaction and communication engaged in and observed by children as merely constraining the operating characteristics of the module. Indeed, recent research suggests that the presence of a real social agent providing contingent social feedback may be more important to aspects of language learning than the conditions of explicit informational feedback characterized by some theoretical positions. Yet this same observation of learning without intentionally directed instruction holds for a great many behaviors in animals. Bird song is a good example: its function has been extensively described in terms of adult males directing their songs towards mated females or competing males. As in humans, the social interactions of juveniles are important for veridical song learning in many species of birds, but male birds have not been reported to intentionally direct their songs towards juveniles in order to tutor them. Just as with human infants, however, contingent social responding without intentional instruction can shape song learning in birds [31, 32]. A particularly compelling example is the recent demonstration that free-ranging juvenile song sparrows engage in “social eavesdropping”, preferentially approaching playback of interactively singing males over playback of single singing males.
Collectively, these observations do not argue against specialized systems for vocal learning with strong innate components: they go towards explaining them. From the perspective of organismal biology, the similarities and differences in learning mechanisms across many species and across domains of behavior are broadly instructive, and inform even a case such as language that may (or may not) have evolved suddenly and recently, possibly only in a single lineage represented by a single extant species. Indeed, none of the claims regarding language remove it, in principle or pragmatically, from the realm of experimental biology.
The relatedness among species was the fundamental insight of Darwin, an insight of immense and profound significance that has shaped all aspects of biology and most aspects of psychology. Such relatedness also applies to behavior [Box 1], but again a biological perspective helps illuminate relationships that might otherwise not be apparent. Ethologists have long recognized that each species occupies its own special perceptual world. For speciation through behavioral isolation, this is a necessary condition. By virtue of speciation humans indeed live in their own unique perceptual space, as linguists correctly point out, but so do the many species of weakly electric fish, songbirds, and perhaps all other species.
Pictured (Figure I) is a phylogenetic tree emphasizing some groups of birds and mammals and two complex, broadly distributed behaviors: echolocation (blue) [46, 47] and vocal learning (red) [48-50]. The behaviors may be present in only a subset of species in each group. Note that similar complex behaviors can evolve independently in multiple lineages, as is also observed for recursive syntactic processing (green), currently known only for humans [see 5] and starlings. All three of these behaviors might superficially be regarded as not having intermediate forms.
At the same time, arguments about the uniqueness of language tend to take the form of identifying “the” single attribute that is unique, supporting the perspective of a special endowment that suddenly and mystically created humans. But language evolved to fit multiple traits of the human brain, not the other way around. There is a long list of “unique” traits that were said to define language but fell by the wayside as they were demonstrated in other primates, other mammals, and birds [35, 36]. A vast body of biological knowledge indicates that specialization involves adaptation of a host of traits, not one. Which attribute of the sound localization system is unique in the champion species, the barn owl: the asymmetrical ears, the hypertrophied brain stem structures, or the topographic map of auditory space in the midbrain? Which single adaptation in songbirds created vocal learners? Also, as commonly used, the uniqueness of language refers only to extant species, ignoring the extensive evolution in the hominids.
The assumption of a unique human language organ has cut off scientific discourse from the rest of biology and from the rest of psychology. The contributions of biology become effectively constrained to human neurobiology. To compound the problem, while there are many studies in the neurobiology of language, much of this work focuses on metalinguistic judgments carried out with patients with focal brain damage, or uses neuroimaging under the linguistic assumptions that have governed much of psycholinguistic research for half a century (see below).
The effect of this assumption has been to generate a de novo “biology” of human uniqueness, a “hopeful monster” in the sense of Goldschmidt that arises discontinuously in evolution, suddenly assembling new and old parts to produce new complexity. This too is a first-principles approach unrelated to the broader field of organismal biology. It licenses argumentation that is unconstrained by data. For example, to regard language as specially encapsulated, isolated from other biological processes, relieves the study of language from being informed by biological fact.
We hold that psychology is a subset of organismal biology, and the study of language a subset of psychology. The explosion in knowledge in psychology has enormously influenced aspects of organismal biology (principally, the neurosciences) and the converse has also generally been true [see 39]. However, in taking an extreme position on the human uniqueness of language and the modular isolability of language processes [e.g., 38], the study of language does not benefit directly from these scientific advances.
A fundamental premise for many linguists is that the diversity of languages springs from a common cognitive structure, latent in all humans, called Universal Grammar (UG). The assumption of this core cognitive construct, from which language emerges and by which language structures are constrained, derives from arguments that the structures of language are unlearnable and depend on innate knowledge, that some capacities for language are unique to humans, and that language is a cognitively isolated module that does not depend on other aspects of human cognition. This black box approach to a fundamental question in cognitive science encourages linguists and psycholinguists to avoid consideration of evolutionary biology, and of animal research in general.
Consider one kind of empirical evidence for UG: where different languages come into contact, interlocutors adopt a pidgin form that borrows from the different languages. Children raised in this environment form a creole that regularizes linguistic properties, going beyond the information given. The linguistic evidence of the pidgin underdetermines the induced properties of the creole. The inference from this observation is that there must be some innate mechanism in children with the latent power of language structure (i.e., UG) waiting to be triggered by the linguistic properties of the pidgin.
Yet Feher et al. recently reported that zebra finches show an interesting ability to regularize their songs. When zebra finches are raised as isolates, their song structures are distinctly different from those of birds raised in the wild under normal rearing circumstances. However, when isolate zebra finches tutor other fledglings, the song of the students moves in the direction of the wild-type song, and over tutoring generations converges on that song type. The re-development of wild-type songs from isolate songs by tutoring shares much with the observations of creolization. Regardless of the way the results of this study are interpreted, one conclusion is inescapable: the regularization of the structural patterning of communicative signals across generations is not unique to humans. Species-specific elements of UG can be viewed from the perspective of a pattern broadly present in evolutionary history, and this places additional pressure on the claim for the uniqueness of UG in the primate lineage.
By shifting the terms of scientific discourse from the assumption of a core set of linguistic universals to the actual distinctive behaviors of language populations, the study of language moves in the direction of the research methods of the experimental sciences. This also implies an obligation to consider testable mechanisms that can explain that behavior, and thereby calls into question the longstanding competence/performance distinction that has served as a barrier to the empirical study of language use and processing. The shift to the study of behavior also brings with it a different standard for language research and for what should count as evidence in understanding language processes. It is worth noting that the linguistic study of phonetics has always held itself to this empirical standard, and there has been a growing movement among linguists to develop a new laboratory phonology and experimental studies of semantics. With the increasing development of laboratory studies in linguistics, going beyond the individual use of linguistic intuition, comes the benefit of intersubjective testability, which calls for adopting the standard that if a hypothesis is not falsifiable, then it is not a scientific statement.
The higher levels in the Chomskian hierarchy of syntactic structures have long been held to be available uniquely to humans, and recently it has been proposed that these alone constitute the core faculty of language that is uniquely human. Since recursive syntactic structure is limited to higher levels of the Chomskian hierarchy [Box 2], this has stimulated some extreme arguments regarding human uniqueness. If recursion could not have evolved because there cannot be intermediate forms, the same argument should apply to numerous biological traits that superficially might appear categorical, such as sexual reproduction. Complex, seemingly categorical vertebrate traits such as echolocation and vocal learning most decidedly did evolve, and independently, multiple times [Box 1]. It is a mistake to confuse a failure to conceive of an adaptive explanation with evidence against its existence. And such argumentation is potentially dangerous in that it may unintentionally appeal to creation pseudoscience.
Finite state grammars (FSGs) are at the lowest level in the hierarchy of grammars first introduced by Chomsky. An FSG only represents the transition from the current state to the next state to determine if a test pattern meets or violates its rules. Higher levels are collectively referred to as phrase structured grammars (PSGs), and involve some memory for past states. Context free grammars (CFGs), which can be computed through a recursive mechanism, are at the next level up from FSGs.
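To make the state-transition idea concrete, the following minimal Python sketch recognizes a finite state pattern using nothing but a lookup from the current state and the next item to the next state. The particular pattern chosen here, strictly alternating a and b items, written (ab)^n, is an illustrative assumption, as are the state names and the function name.

```python
# Toy finite state grammar accepting (ab)^n for n >= 1.
# Each entry maps (current state, next item) to the next state.
TRANSITIONS = {
    ("start", "a"): "after_a",
    ("after_a", "b"): "after_b",
    ("after_b", "a"): "after_a",
}
ACCEPTING = {"after_b"}


def fsg_accepts(pattern):
    """Return True if the pattern obeys the grammar's transition rules."""
    state = "start"
    for item in pattern:
        state = TRANSITIONS.get((state, item))
        if state is None:
            return False  # no legal transition: the pattern violates the rules
    return state in ACCEPTING


print(fsg_accepts("abab"))  # alternating pattern
print(fsg_accepts("aabb"))  # not a legal alternation
```

The recognizer keeps no record of how many items it has seen; only the current state matters, which is the sense in which an FSG involves no memory for past states.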
Hauser et al. defined recursion as a unique human computational capacity necessary for the generativity of language, by which a finite rule description can generate an infinity of patterns. Consider a Mandelbrot set, a fractal pattern with a nested hierarchical structure and a recursive definition (Figure Ia). Naturally occurring patterns such as coastlines or mountain surfaces are well approximated by such recursive fractals, although such approximations are not taken as evidence for a natural computational capacity. In computational linguistics, recursive patterns can be notated as production rules such as S -> ab | aSb. This is read as defining two forms of valid sentences (patterns) specified by the grammar. The simplest form (ab) consists of an item from the a vocabulary (e.g., male speech syllables or starling “warble” motifs) followed by an item from the b vocabulary (e.g., female syllables or “rattle” motifs). Arbitrarily longer forms can be made by substituting the definition in place of the S between an a and a b item. This defines legal patterns of the form a^n b^n, in which any number of a items are followed by a matching number of b items. This definition is both recursive (it depends on itself for productivity) and center embedded. Center embedded grammars are CFGs, but not all CFGs are center embedded.
In computational linguistics, recursion without limit (to infinity) is the basis for analyzing the power of different grammars such as FSGs and CFGs. In reality, humans rarely achieve, and only awkwardly, even a recursive level of three when using center embedded sentences in natural speech (“The house that the man I married bought, burned.”). A recursive level of three corresponds to a highly impoverished fractal pattern (Figure Ib) of no geophysical interest. We suggest that embracing an infinite recursion for linguistics, unsupported by corresponding data on actual human performance, arises from an unfettered first-principles perspective.
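The production rule S -> ab | aSb can be written out directly as a pair of recursive Python functions, one that generates patterns and one that recognizes them. This is a minimal sketch for illustration only; the function names are hypothetical, and single characters stand in for items from the a and b vocabularies.

```python
def generate(n):
    """Apply S -> aSb (n - 1) times, then S -> ab, giving n a's followed by n b's."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if n == 1:
        return "ab"
    return "a" + generate(n - 1) + "b"


def cfg_accepts(pattern):
    """Recognize a pattern by undoing the rule from the outside in."""
    if pattern == "ab":
        return True  # simplest form: S -> ab
    if len(pattern) >= 4 and pattern[0] == "a" and pattern[-1] == "b":
        return cfg_accepts(pattern[1:-1])  # strip one a ... b pair: S -> aSb
    return False


print(generate(3))
print(cfg_accepts("aabb"), cfg_accepts("abab"))
```

Each call to cfg_accepts strips one outer a and b pair and hands the interior back to the same rule, which mirrors the center embedded structure: the innermost ab sits in the middle of the pattern, wrapped by matching outer items.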
Analysis of recursion as a uniquely human capacity is particularly vexing in light of the lack of a single specific and accepted formal definition. A complementary approach has been to examine related behaviors in other animals, recasting the presumption of human uniqueness as an empirical scientific inquiry into the biology of language. Non-vocal learners (cotton-top tamarin monkeys, Saguinus oedipus) were exposed (without reinforcement) to sequences of heterospecific sounds (human syllables) described either by a finite state grammar (FSG) or by one drawn from the more complex phrase structured grammars (PSGs) [see Box 2]. The ability of the tamarins to recognize and generalize the FSG patterns shows rule-like induction for FSG patterns. The failure of the tamarins to perform similarly for PSG strings is difficult to interpret, as are null results more generally. When the pattern perception problem was tested with conspecific sounds in a vocal learner, the European starling (Sturnus vulgaris), the birds were able to learn and generalize both the FSG and the PSG in an instrumental learning procedure, including generalization to longer patterns. Specific testing of recognition strategies indicated that this recognition ability was not due to the use of simple tricks like matching the beginnings or endings of the patterns: the starlings’ recognition performance was consistent with the grammars upon which they were trained. Thus starlings achieve both recursive processing and hierarchical processing. Within the PSGs, the simplest class of grammars that explains the starling results is the context free grammars (CFGs), one level higher than FSGs in the Chomsky grammar hierarchy [Box 2].
The conclusion that recursive hierarchical processing is not available to tamarin monkeys but is uniquely available to humans (and starlings) has not been universally well received. Given the variety of opinions regarding the definition of recursion, questions regarding the validity of the methodology for testing for recursive hierarchical processing are the most challenging to resolve. Less persuasive are arguments that resort to speculation regarding starling auditory subitization, while ignoring the known behavioral biology of starling song recognition.
The grammatical abilities of starlings raise additional, broader issues. First, the birds exhibited more complex pattern perception skills for vocal elements than heretofore reported. That this was observed in a vocal learning species is surely no coincidence. Songbirds (and other vocal-learning birds) have large, well differentiated forebrain regions involved in song production and perception. Such specializations are missing from non-vocal learning birds. This emphasizes the importance of quantitative differences in brain regions devoted to a particular task. The massive quantitative advantage that humans enjoy for vocal behavior should be evaluated in this light, not ignored in service of an ideological search for purely qualitative explanations. Our unique vocal behavior may substantially arise from quantitative advantages. Second, the work introduced by the primate studies, which then motivated the starling studies, represents an experimental approach to linguistics not previously available. In a linguistics that embraces all opportunities for scientific progress, this new avenue should be welcomed.
We find compelling evidence that language is a phenomenon of evolutionary biology and within the reach of biological investigation. An example is the emergence of a nascent field of the comparative biology of syntax. As with other parts of psychology in which human uniqueness arguments have served as a barrier to biological approaches [10, 24], the fundamental assertion that arises from this is a simple one: language researchers who fail to embrace biological approaches will be increasingly left behind.
We thank Michael D. Beecher, Tiffany Bloomfield, Anne Henly, Lori L. Holt, Michael C. LaBarbera, and Ofer Tchernichovski, who commented on versions of the manuscript. This work was supported in part by grants from the NIDCD to DM and the John Templeton Foundation to HCN and NIH grant DC00378.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.