Lexical tones are a phonetic contrast necessary for conveying meaning in a majority of the world’s languages. Various hearing, speech, and language disorders affect the ability to perceive or produce lexical tones, thereby seriously impairing individuals’ communicative abilities. The number of tone language speakers is increasing, even in otherwise English-speaking nations, yet insufficient emphasis has been placed on clinical assessment and rehabilitation of lexical tone disorders. The similarities and dissimilarities between lexical tones and other speech sounds make a richer scientific understanding of their physiological bases paramount to more effective remediation of speech and language disorders in general. Here we discuss the cognitive and biological bases of lexical tones, emphasizing the neural structures and networks that support their acquisition, perception, and cognitive representation. We present emerging research on lexical tone learning in the context of the clinical disorders of hearing, speech, and language that this body of research will help to address.
Today, more than 1.3 billion people, accounting for roughly a fifth of the world’s population, speak at least one Chinese language natively. Chinese languages, as well as many other languages in Africa, Asia, and the Americas, are tone languages that use pitch contrastively to indicate word meaning. Millions more are speakers of the related pitch accent languages, such as those of Japan and parts of Europe, which use pitch to distinguish a more limited set of words. As the world becomes increasingly multicultural, research and clinical tools for assessing and treating communicative deficits involving lexical tones are warranted even for parts of the world where traditionally non-tone languages are spoken natively. This is especially true in the United States, where the number of both native- and second-language speakers of tone languages has grown rapidly, from >1.2 million speakers in 1990 to >2 million in 2000, when Chinese had become the third most widely spoken language in the country.1 In this article, we provide an introduction to the physiological bases of lexical tone acquisition, with an emphasis on how this knowledge may facilitate the remediation of related disorders of production and perception.
Speech production typically relies on acoustic energy produced by vibrations of the vocal folds, which manifests as a talker’s fundamental frequency (F0) and is perceptually correlated with pitch. For a native speaker of English (a non-tone language), pitch conveys prosodic information, signaling communicative aspects of speech such as intonation and stress. In tone languages, pitch is used to contrast the meanings of individual words.2 For example, in Mandarin Chinese the syllable /ma/ spoken with a high-level pitch pattern (or tone) means “mother,” but it means “to scold” if spoken with a falling pitch pattern. Tones are described in relation to a talker’s pitch range and change in pitch over time.3 For example, “high level” means the pitch starts at the higher end of a talker’s speaking pitch range and remains relatively constant throughout the syllable. “Rising” and “falling” tones are collectively known as contour tones. Such descriptions represent a phonetic or surface description of lexical tones. Various analyses of tonal phonology and underlying psycholinguistic representation have been proposed, and clinicians who are interested in the relationships between underlying phonological processing and articulation disorders are directed to recent texts in phonological theory, including tonal phonology.4–6
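The surface description above, in which a tone is characterized by where it sits in a talker’s pitch range and how it changes over time, can be illustrated with a minimal sketch. The function name `describe_tone`, the log-scaled range normalization, and the cutoffs of 0.2, 1/3, and 2/3 are illustrative assumptions rather than values taken from the literature reviewed here, and the sketch deliberately ignores dipping or other complex contours.

```python
import math


def describe_tone(f0_hz, range_hz, change_threshold=0.2):
    """Label an F0 contour relative to a talker's speaking pitch range.

    f0_hz: F0 samples (Hz) across one syllable, in time order.
    range_hz: (low, high) bounds of the talker's speaking pitch range in Hz.
    change_threshold: hypothetical cutoff, as a fraction of the range,
        separating level tones from contour tones.
    """
    low, high = range_hz
    # Use a log scale, since perceived pitch is roughly logarithmic in F0.
    span = math.log(high / low)
    # 0.0 is the bottom of the talker's range, 1.0 is the top.
    rel = [math.log(f / low) / span for f in f0_hz]
    change = rel[-1] - rel[0]  # net start-to-end movement only
    if change > change_threshold:
        return "rising"
    if change < -change_threshold:
        return "falling"
    mean = sum(rel) / len(rel)
    if mean >= 2 / 3:
        height = "high"
    elif mean <= 1 / 3:
        height = "low"
    else:
        height = "mid"
    return f"{height} level"
```

Because the contour is normalized to each talker’s own range, the same label can apply to a 220 Hz syllable from one talker and a 130 Hz syllable from another, which is the sense in which tones are described relative to the talker rather than in absolute frequency.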
The diagnosis and rehabilitation of lexical tone deficits may be facilitated by a rich understanding of the neural and psychological representations of lexical tones, and how these representations develop during first-language acquisition in children or second-language learning in adults. Evidence suggests that infants only a few days old are sensitive to differences in syllabic pitch contours.7 The perception of pitch is a useful behavioral adaptation for identifying auditory streams and appears to be largely conserved across primate evolution.8 As such, it is not surprising that pitch perception may be an endogenous property of the human auditory system at birth. However, as with other speech sounds, the ontogeny of pitch perception abilities is strongly influenced by infants’ native-language environment. Infants growing up in a non-tone language environment stop attending to changes in syllabic pitch contours by 9 months, whereas those in a tone-language environment continue to exhibit sensitivity to that phonetic distinction.9,10 Similarities can be found between this pattern of tone development and the acquisition of consonant and vowel contrasts11 and may represent a general property of environmental influences in the development of adult-like speech and hearing.12
Just as infants show developmental differences in perceptual sensitivity to lexical tone based on their linguistic environment, long-term exposure to a tone language results in demonstrable differences in perception and neural representation of sound in adulthood. For example, native speakers of English are less accurate at identifying or discriminating Mandarin lexical tone contours, whereas native speakers of Mandarin show biases in the perception of nonspeech pitch-contour stimuli mediated by the lexical tone categories of that language.13 Listeners with different language backgrounds are also likely to make use of different acoustic information available in tone contours.14,15 Native speakers of tone languages rely primarily on the overall shape of a tone contour, whereas native speakers of non-tone languages attend primarily to the average pitch or starting/ending pitch levels. There is also some evidence for experience-related differences in pitch-contour saliency: Unattended variation in pitch contour in a directed-attention task is more distracting to native speakers of a tone language than to speakers of a non-tone one.16
Early information about the neural systems underlying lexical tone production and perception came from the study of patients with brain injuries. Pioneering studies found that many individuals with aphasia showed deficits in producing and/or perceiving lexical tones.17 Native Thai speakers with Broca’s, conduction, or transcortical motor aphasias demonstrated reduced accuracy relative to normal or right-hemisphere injured subjects. The intelligibility of tones produced by individuals with aphasia was also found to be compromised.18 However, it is not clear whether the origins of the tone deficits seen in left-hemisphere injured individuals are phonetic (motor speech), phonological, or both. Tone intelligibility was more affected for speakers with nonfluent aphasia than it was for those with right-hemisphere injury and fluent aphasia, although those with nonfluent aphasia may have also been apraxic.19 At the perceptual level, it has been argued that both acoustic-phonetic and phonological impairments contribute to the difficulty in perceiving tones experienced by individuals with aphasia.20 More comprehensive discussions of aphasia in tone languages are available in other reviews.21
Although there is ample evidence that tone deficits are associated with a left-hemisphere injury, lesion studies alone cannot prove a left-hemispheric “specialization” of lexical tones.22 However, recent advances in neuroimaging technology have begun to provide converging evidence that lexical tone perception abilities are indeed correlated with activity in a left-hemisphere brain network.
Traditional tone perception research has mainly focused on psychological processes23,24 and gross hemispheric specialization similar to the studies just discussed.25 Functional neuroimaging studies using both positron emission tomography (PET) and functional magnetic resonance imaging (fMRI) have begun to further characterize specific brain regions implicated in the perception of both lexical and nonlexical tones. Pitch perception is associated with activation in the right inferior frontal gyrus (IFG) when the pitch patterns are not linguistically relevant.26 However, the pattern of brain activity when listening to lexical tones depends on linguistic relevance. When Mandarin speakers discriminate tones in Mandarin words (in which pitch patterns are linguistically relevant), they exhibit an increased activation in left-hemisphere regions that are associated with various levels of linguistic processing, including perisylvian, frontal, and parietal cortices. English-speaking participants, for whom the tones do not carry linguistic relevance, instead activate right-hemisphere structures, including the IFG and superior temporal gyrus (STG).27 In another study, the left IFG was also shown to be important for processing linguistically relevant lexical tones in native Thai-speaking subjects.28 In the converse situation, both Mandarin- and English-speaking individuals showed activation in the right insula and IFG when discriminating Mandarin pitch patterns embedded in English words, where they lack linguistic relevance for both groups.29 Taken together, these results clearly point to a functional significance of left-hemisphere regions for linguistically relevant lexical tone perception, whereas right-hemisphere regions support nonlinguistic tonal tasks.
An important question is whether the greater left-hemisphere activity for lexical tones in native tone-language speakers is a result of long-term stored categorical (phonological) representation of lexical tones per se or whether it is a reflection of underlying lexical-semantic processes associated with meaningful words.22 Thai tones superimposed on Mandarin syllables (“tonal chimeras”) and Mandarin tones on the same syllables (real words) have been used to distinguish between these two possibilities. The only brain region that showed increased activation to native-language tones across both Thai and Chinese listeners was the left planum temporale (PT), indicating that, of the left-hemisphere regions described earlier, it is this structure in particular that supports the prelexical phonological processing of tones.30 This is consistent with a broad literature in which the left PT is identified as a major site for mapping between the auditory and linguistic (phonological) representations of speech.31,32
Functional neuroimaging studies generally involve a control condition similar to the experimental task for comparative purposes. This comparison produces a map of the brain regions that exhibit significantly increased activation to the experimental condition (in these cases, lexical tone perception) relative to the control task. In the studies already described, many areas in the right hemisphere are also activated during lexical tone perception, but such activations do not differ significantly between experimental and control conditions. In sum, both lesion and neuroimaging studies point to the importance of the left hemisphere, especially the left IFG region and PT, in lexical tone perception. Crucially, by using cross-linguistic designs that compare processing in tone and non-tone language users, these neuroimaging studies have demonstrated that lateralization patterns are determined specifically by long-term experience with lexical tones.
PET and fMRI effectively reveal the neural structures involved in processing lexical tones. However, those methods lack sufficient temporal resolution to reveal details about the time course of lexical tone processing. Electrophysiological methods such as electroencephalography (EEG) and magnetoencephalography, which have temporal resolution on the order of milliseconds, are better suited to that line of inquiry. For example, EEG has been used to reveal that information about lexical tones is accessed at a similar point in time to vowels and consonants, and contributes similarly to word processing.33 Cross-linguistic electrophysiological studies have demonstrated experience-dependent differences in the neural encoding of linguistically relevant pitch at the earliest stages of auditory processing. Long-term experience with a tone language modulated the magnitude of the mismatch negativity (MMN) to tones in both speech34a and nonspeech35 contexts. The MMN is an auditory-evoked cortical potential that peaks 100 to 250 milliseconds after the onset of a stimulus and is used to assess preattentive sensitivity to auditory categories. Similar studies have helped reveal how the relative saliency of the acoustic dimensions of lexical tones is encoded in early cortical activity, and how this encoding is sensitive to language background.36 Hemispheric asymmetries also exist at preattentive stages of lexical tone processing. Lexical tones elicit more robust MMN responses from native speakers of Mandarin in the right hemisphere relative to the left, whereas consonants evoke the opposite asymmetry.37 These results have led some to suggest that lexical tones are initially processed on the basis of acoustic features,37 which exploit a general right-hemisphere bias for slow time-varying information,38 before being mapped onto phonological and semantic representations in the left hemisphere at later stages of processing.
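As a rough illustration of how an MMN magnitude might be quantified, the sketch below assumes the common oddball convention of subtracting the averaged response to a frequent “standard” stimulus from the averaged response to a rare “deviant,” then locating the most negative point in the 100 to 250 millisecond window mentioned above. The function name `mmn_peak` and its parameters are hypothetical, and the studies cited here may have used different procedures.

```python
def mmn_peak(standard_uv, deviant_uv, sample_rate_hz, window_ms=(100, 250)):
    """Find the most negative point of the deviant-minus-standard wave.

    standard_uv, deviant_uv: averaged responses in microvolts, equal length,
        with sample 0 aligned to stimulus onset (an assumption of this sketch).
    Returns (latency_ms, amplitude_uv).
    """
    if len(standard_uv) != len(deviant_uv):
        raise ValueError("responses must have the same number of samples")
    # Difference wave: response to the deviant minus response to the standard.
    diff = [d - s for s, d in zip(standard_uv, deviant_uv)]
    start = int(window_ms[0] * sample_rate_hz / 1000)
    stop = min(len(diff), int(window_ms[1] * sample_rate_hz / 1000) + 1)
    if start >= stop:
        raise ValueError("recording is too short for the search window")
    # The MMN is a negativity, so search for the minimum inside the window.
    idx = min(range(start, stop), key=lambda i: diff[i])
    return idx * 1000 / sample_rate_hz, diff[idx]
```

Comparing the returned amplitude across listener groups, or across electrodes over the two hemispheres, is one simple way the group and asymmetry effects described above could be expressed numerically.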
Remarkably, sensitivity to linguistically relevant pitch patterns has also been demonstrated at the level of the brainstem, as indexed by the frequency-following response (FFR), an ensemble response that reflects the phase-locking of neurons in the rostral brainstem (inferior colliculus and lateral lemniscus). Native speakers of Mandarin exhibited more accurate representation of the pitch contours of lexical tones than English speakers.39 It has been shown that efferent signals from the cortex can shape the response properties of neurons at subcortical stages of processing.40,41 The long-term effects of such corticofugal mechanisms may facilitate brainstem encoding of pitch in native tone-language speakers.39,42 Taken together, studies that examine preattentive processing of lexical tones in the brainstem and cortex suggest that long-term exposure to linguistic pitch patterns influences neural responses even at levels of auditory processing early enough to be considered nonlinguistic or domain general.
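To make “accuracy of pitch-contour representation” concrete, the sketch below estimates F0 in a short frame from its autocorrelation peak and then scores how closely a response F0 track follows the stimulus F0 track. This is one plausible approach under stated assumptions, not necessarily the analysis used in the cited FFR studies; the names `estimate_f0` and `pitch_tracking_error`, the 80 to 400 Hz search range, and the mean-absolute-difference measure are all illustrative choices.

```python
import math


def estimate_f0(frame, sample_rate_hz, f0_min=80.0, f0_max=400.0):
    """Estimate F0 (Hz) of one short frame from its autocorrelation peak."""
    min_lag = max(1, int(sample_rate_hz / f0_max))
    max_lag = min(int(sample_rate_hz / f0_min), len(frame) - 1)
    best_lag, best_value = None, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation of the frame at this lag.
        value = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    if best_lag is None:
        raise ValueError("frame too short for the requested F0 range")
    return sample_rate_hz / best_lag


def pitch_tracking_error(stimulus_f0, response_f0):
    """Mean absolute F0 difference (Hz) between two equally long F0 tracks."""
    if len(stimulus_f0) != len(response_f0) or not stimulus_f0:
        raise ValueError("tracks must be non-empty and equally long")
    total = sum(abs(s - r) for s, r in zip(stimulus_f0, response_f0))
    return total / len(stimulus_f0)
```

In use, `estimate_f0` would be applied to successive frames of both the stimulus and the recorded response, and a smaller `pitch_tracking_error` would indicate a response that follows the stimulus pitch contour more faithfully.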
Although substantial progress has been made in understanding the nature of infant acquisition of tone-language phonology and the long-term effects of lexical tone processing on the adult central auditory system, only recently has empirical work begun to investigate the acquisition of lexical tones by adult second-language learners. Explicit laboratory training has been shown to result in improvements in lexical tone identification in English-speaking adults, an ability that can generalize to both novel words and talkers.43 Training on lexical tones not only improves tone perception; perceptual training alone may also result in modest but significant improvements in the production of those same tones.44
Based on the neurophysiology and neuroimaging studies reported here, one might expect native speakers of a tone language to show an advantage over non-tone language speakers in the acquisition of a second tone language. When both native speakers of English and Mandarin were trained on Thai lexical tones, the Mandarin participants not only outperformed the English cohort on an initial discrimination task of those tones, but they also showed significant improvement after training, whereas the English participants did not.45 However, learning transfer across tone languages might not always be strictly advantageous. When native speakers of English and Mandarin were trained on the tones of Cantonese, significant native language-related patterns of learning were evident: Due to interactions with their extant lexical tone categories, Mandarin participants exhibited different patterns of learning successes and failures compared with native English speakers, who had no such prior categories.46
In the studies cited, learning was assessed via differences between groups, and the focus was always on learning the tone contours themselves. The role of individual differences in successful learning of lexical tones has also been investigated. In one such study,47 differences in learning outcomes between participants who successfully learned a lexical tone vocabulary versus those who did not resulted specifically from participants’ ability to learn the pitch contours, not segmental features of the vocabulary, and learning success was predicted by performance on a pretraining pitch contour identification task. A major determinant of performance on the pretraining pitch contour identification task, and subsequent successful learning of the vocabulary, was the extent of participants’ prior musical experience. We return to this point later.
Similar to the differences in neural processing of lexical tone seen after long-term experience with a tone language, short-term laboratory training in lexical tone identification also affects the cortical locus and representation of lexical tone processing. Short-term training on a pitch-contour identification task has been shown to result in increased activation in regions of the left superior temporal lobe and the right IFG.48 Although this result contrasts with previous studies implicating the left IFG in native-language lexical tone tasks, it may actually reflect the results of lexical tone learning in a nonlinguistic context.26,27,29 However, there is no a priori reason to believe that efficient second-language learning must rely on the same structures as one’s native language, and the complementary functions of the right IFG may facilitate effective remediation of lexical tone perception in individuals with insults to the original language-processing regions of the left hemisphere.
In a study using only participants with no prior experience with lexical tones, training lexical tones in a lexical (linguistic) context increased activation in left posterior superior temporal regions, left dorsolateral prefrontal cortex, and left IFG,49 consistent with the studies of native tone-language speakers discussed earlier.28,30 When accounting for individual variability in tone learning, increased activation in the posterior STG distinguished the most successful learners, suggesting increased reliance on auditory/perceptual analysis underlying learning success, whereas less successful learners were characterized by a diffuse cortical network including general-purpose memory and attentional regions in the frontal cortex.49
Some evidence also indicates that lexical tone training changes the cortical representation of sound categories, as measured through MMN, and that these changes, too, depend on the interaction with native-language lexical tone categories.50 Lastly, recent evidence suggests that short-term lexical tone training can result in subcortical changes in pitch encoding. In a study measuring the brainstem FFR of native English speakers both before and after training on Mandarin lexical tones, training resulted in a higher fidelity encoding of the most challenging pitch contour, suggesting corticofugal modulation of these low-level circuits complemented the development of categorical distinctions upstream.51
As is evident from many of the training studies cited here, pretraining experience is very important to the learning outcomes of any training paradigm. For example, particular experience with a tone language is sometimes beneficial to learning new lexical tone contrasts,45 but the influence of prior tone categories is not always strictly advantageous.46,52 Recent work has begun to investigate differences in the learning outcomes of various laboratory tone-training paradigms, as well as interactions with individual learners’ pretraining needs and abilities. Identification and discrimination training have been shown to be equally effective in learning Thai tones by native speakers of either English or Mandarin.53 Interestingly, native Mandarin speakers are able to learn to identify lexical tones based on visual production cues alone, including movements of the head, neck, and mouth.54 Individuals with peripheral or central hearing disorders that impair the perception of lexical tones may be able to integrate the visual indices of tones into their speech-reading repertoire. It may also be the case that individuals with congenital amusia (“tone-deafness”) and nonnative speakers of a tone language could benefit from similar visual training in tone identification as an accompanying cue to auditory strategies. The integration of auditory and visual information facilitates accurate speech perception, especially in adverse listening environments,55 and it may result from visual inputs to auditory association cortices associated with the perception of orofacial gestures.56 This possibility is also consistent with larger sensorimotor models of phonological processing,31 as well as evidence that lexical tone production improves following perceptual training.44 The extent to which explicit training on lexical tone production might also facilitate the development of more robust perceptual representations remains an open question.
Work showing that not all learners benefit from the same type of training47 has recently been extended to reveal an interaction between pretraining auditory abilities and training paradigm design on learning outcome: Participants with low pretraining auditory ability benefit from low-variability training (in which uninformative auditory variability is minimized) and are disproportionately impaired by high-variability training, whereas individuals with high pretraining auditory ability can also benefit from high-variability training.57 These results provide a cautionary note to the standard view that high-variability training is always most beneficial.
Non-tone language speakers also exhibit an effect of prior experience on the acquisition of lexical tone in adulthood, based largely in their musical experience. Individuals with a musical background are more likely to be successful in learning lexical tones.47 Similarly, electrophysiological correlates of sensitivity to the auditory information in lexical tones show effects of musical experience: After native tone-language speakers, the largest MMN responses to lexical tones are elicited from musicians; the smallest responses are from nonmusicians.34b
Although it is largely unknown how musical and linguistic pitch-contour categories interact, the benefit of musical experience for successful lexical tone learners might arise from higher pitch fidelity in the subcortical auditory system and more developed cortical structures for pitch processing. For example, musicians have a more accurate brainstem FFR to the vocal pitch contours in Mandarin lexical tones than do nonmusicians.42 Similarly, successful learners of lexical tone exhibit significantly larger left Heschl’s gyri than less successful learners.58 Heschl’s gyrus is the location of primary auditory cortex and the cortical locus of pitch processing in humans.59,60 Similar to successful lexical tone learners, musicians also have larger Heschl’s gyri than nonmusicians.61 However, one should be cautious in interpreting causal relationships among musical experience, neural structure and function, and lexical tone learning. Some studies have indicated strong heritability in the morphology of these regions,62 as well as the co-occurrence of tone languages and the population distribution of certain gene varieties,63 suggesting genetic factors may also contribute to adult lexical tone learning.
The extent to which unique neural systems distinguish lexical tone production from non-tone speech production is unknown. To date, few comprehensive neuroimaging studies of lexical tone production have been undertaken. However, clinical data on tone language-speaking patients with neuromotor disorders such as Parkinson’s disease and cerebral palsy, which affect lexical tone production and speech in general, have provided valuable complementary evidence. For example, a Cantonese single-word intelligibility procedure revealed that the overall speech intelligibility of Cantonese-speaking individuals with cerebral palsy was compromised.64 Furthermore, pitch and prosody predict the intelligibility of Cantonese speakers with dysarthria.65 Deficits involving lexical tones in addition to other segments were also found in patients with dysarthria.66 The reduced intelligibility of lexical tones in a patient with Parkinson’s disease and hypokinetic dysarthria was attributable to an overall reduced speaking pitch range.67 Because Parkinson’s disease often affects the prosodic components of speech,68,69 it is not surprising that lexical tone production is also affected. More recently, examination of the effects of intensive voice therapy on Cantonese speakers with Parkinson’s disease indicated that, although pitch and pitch-range increases are associated with an improvement in intonation, lexical tone production may still be impaired.70 Studies continue to suggest that speakers of both Cantonese71 and Mandarin72 with cerebral palsy show difficulties producing lexical tones that have sufficient acoustic contrasts when compared with normal speakers. As described earlier, several studies have also examined tone deficits in aphasic patients, which are more fully reviewed elsewhere.21,22
As with speech production in general, the production of lexical tones may rely in part on the ability to perceive lexical tones and make online adjustments to speech based on auditory feedback.73 The influences of perception on refining production provide an additional hurdle to hearing-impaired individuals who communicate in a tone language. Several investigations have examined lexical tone perception in individuals with hearing impairment and cochlear implants. The majority of these studies focused on children, finding that current cochlear implants are inadequate for allowing listeners to discriminate between various Cantonese lexical tones.74,75 However, cochlear implantation was associated with better phonological skills than hearing aids,76 and earlier implantation has been associated with benefits in the development of lexical tone production.77 Profoundly hearing-impaired Cantonese adolescents produce lexical contrasts without the variations in F0 that are the principal perceptual component of lexical tones, a result suggested to arise from limited laryngeal control due to an inability to receive auditory feedback.78 Thus listeners are likely to have substantial difficulties when attempting to understand the speech of hearing-impaired individuals, where clearly conveyed lexical tones are necessary for effective communication.
Although many formal tests assess English-speaking individuals with communicative deficits, fewer address communication skills in tone languages, let alone specifically targeting lexical tones. A Cantonese version of the Western Aphasia Battery has been developed.79 However, because it is modeled after the English version, lexical tones are not formally assessed. For speech-language pathologists working with individuals who speak a tone language, it is possible to follow some of the testing procedures developed in published laboratory research studies, but it is important to note that the intent may not necessarily have been the development of a clinical assessment tool. For example, clinicians may consider using a minimal-pair Cantonese tone perception test, based on data obtained from 21 individuals with aphasia and 8 normal participants in determining deficits in lexical tone perception.20
Few data exist to describe recovery after brain injury among speakers of tone languages, and there are no data regarding the efficacy of treatment programs that specifically target lexical tones. One way to conceptualize treatment programs is to distinguish the articulatory and phonological components of the deficits. For example, for patients with dysarthria, it is important to determine whether a general pitch control difficulty exists, and, if so, targeting pitch variation could potentially manage lexical tone deficits. If the disorder is phonologically based, one may investigate which set of tones is affected, for example, whether contour tones are substituted by level tones or whether pitch levels are interchanged. Treatment in this case may involve building perceptual awareness of the distinctions between the different types of tones, progressing from discrimination to identification, from closed set to open set (from having a few to many choices), and from single syllables to conversational speech. If a functional approach to treatment is chosen, increasing the semantic content through longer or more situationally relevant utterances may decrease the reliance on single words whose meaning depends on lexical tones. Moreover, much insight may be gained through laboratory work on nonnative tone language acquisition.
In the earlier sections, we discussed several basic research studies that developed effective ways to train second-language lexical tone perception or production in the laboratory.36,45,47,53,54 With appropriate adjustments, such methodologies may effectively translate into the clinic for treatment of lexical tone disorders. For example, patients with hearing impairments may benefit from training on the same visual cues that have been shown to be effective for the identification of lexical tones in a laboratory setting.54
This article has focused on providing information about the basic science of lexical tone perception, production, and learning, and its potential clinical applications. Although some basic and clinical work has been done, significant gaps still exist in our understanding of these topics compared with work on consonants and vowels. We suggest several considerations for future work. For example, to examine the complexities of lexical tones accurately, there is a distinct need to move away from focusing on classic issues such as hemispheric asymmetries.80 Instead, focusing on how different regions contribute to the various stages of auditory processing, and examining how interactions between these regions (e.g., IFG, PT, STG) give rise to holistic phenomena in lexical tone perception, is likely to be more informative. Research on the combination of lexical tone and word learning can further enhance our understanding of the neuroplasticity underlying the interactions between acoustics and abstract categories. From a clinical standpoint, additional work is required to develop normative data on perception and production in different tone languages, delineate accurate characterizations of typical developmental trajectories, and compare the effectiveness of various treatment programs.
With this review, we emphasize that only limited research has so far been conducted concerning clinical disorders of lexical tones. Although characteristics of lexical tone deficits have been documented in various clinical populations, and learning studies with the potential to contribute to treatment have been conducted, diagnostic and treatment investigations have only just begun to emerge. To serve a broad population of first- and second-language tone-language speakers, an influx of studies with direct clinical applications is greatly needed. However, the present lack of such research does not warrant a reduction or denial of clinical services to speakers of tone languages. Languages of the world differ in a wide variety of features, and although lexical tones may be a feature of many languages, additional features are prevalent in other languages that may be unfamiliar to clinicians (e.g., three-way voicing contrast in Thai and Hindi). When working with a tone-language speaker, clinicians should not focus only on the fact that a language is tonal, but rather on whether and what kind of clinical evidence can be found that would enhance services for these individuals, and on whether the clinician can speak the language. The possibility of an interpreter proficient in the language should similarly be considered when necessary. Multiple reviews have discussed important guidelines for working with nonnative and bilingual populations, including populations that speak a language the treating clinician may not be familiar with81,82 (see also Centeno in this issue). Clinicians are advised to consider those guidelines carefully when working with tone-language speakers.
The Communication Neural Systems Research Group (principal investigator, Patrick Wong) at Northwestern University is supported by the National Science Foundation (BCS-0719666) and the National Institutes of Health (R01DC 008333, R21DC007468, R03HD051827, and R21DC009652). More information can be found at http://www.soc.northwestern.edu/wong.
Serving Linguistically and Culturally Diverse Adults with Communication Disorders: Multidisciplinary Perspectives and Evidence—A Clinical Forum; Guest Editors, José G. Centeno, Ph.D., and Kathryn Kohnert, Ph.D.
Semin Speech Lang 2009;30:162–173. Copyright © 2009 by Thieme Medical Publishers, Inc., 333 Seventh Avenue, New York, NY 10001, USA. Tel: +1(212) 584-4662.
Learning Outcomes: As a result of this activity, the reader will be able to (1) describe the neural bases of lexical tone perception in native- and second-language speakers, (2) discuss the impact of disorders of hearing, speech, and language on lexical tone production and perception, and (3) discuss the implications of various approaches to training/rehabilitation for reestablishing native-like lexical tone abilities.