Speakers use a range of acoustic means to communicate meanings at the lexical and sentence level. In a study of Mandarin-English bilinguals, Gandour, Tong, Talavage, Wong, Dzemidzic et al. (2007)
examined the neural processing of two sentence-level prosodic phenomena used in both English and Mandarin: contrastive stress (sentence-initial or sentence-final) and modality (declarative or interrogative). They concluded that sentence-level prosody in Mandarin and English is mediated by a common neural system, one that involves an interplay between right-hemisphere and left-hemisphere mechanisms (see also Friederici and Alter, 2004
). It is unlikely, then, that there will be any brain differences between Chinese and non-Chinese speakers as a function of differences in sentence-level prosody.
However, in contrast to English and other European languages, Mandarin (along with other Chinese languages such as Cantonese, Hakka and Hokkien) is a tonal language. In a tonal language, it is not only consonants and vowels that distinguish different words (i.e., that carry lexical effects) but also pitch patterns. Clearly these two aspects of the speech signal must be coordinated and a plausible view is that synchrony is achieved with respect to each syllable (see Xu and Liu, in press
). The processing of pitch so that it can achieve lexical effects constitutes a possible source of brain differences between Chinese and non-Chinese speakers.
The primary acoustic correlate of tones is the fundamental frequency of voice (F0
). The four tones of Mandarin, for instance, are conventionally labelled as tone 1, 2, 3 and 4 and differ in pitch height and the shape of the pitch contour (see Chao, 1968
). They can be described as high-level (tone 1), rising (tone 2), low-dipping (tone 3) and falling (tone 4). To take a conventional example to illustrate the lexical effects of tone in Mandarin, the syllable /ma/ means “mother” when spoken in the first tone (/ma1
/), “hemp” when spoken in the second tone (/ma2
/), “horse” when spoken in the third tone (/ma3
/) and a reproach when spoken in the fourth tone (/ma4/).
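The /ma/ example can be expressed as a small lookup table, purely for illustration (the glosses follow the text; the `gloss` helper is a hypothetical convenience, not anything from the cited work):

```python
# Illustrative only: the four lexical readings of the Mandarin
# syllable /ma/, keyed by tone number.
MA_READINGS = {
    1: "mother",      # tone 1: high-level
    2: "hemp",        # tone 2: rising
    3: "horse",       # tone 3: low-dipping
    4: "a reproach",  # tone 4: falling
}

def gloss(tone: int) -> str:
    """Return the English gloss of /ma/ spoken in the given tone."""
    return MA_READINGS[tone]

print(gloss(1), "|", gloss(3))  # mother | horse
```

The point of the mapping is simply that pitch pattern alone, with segments held constant, selects among distinct lexical entries.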
When spoken in isolation, tones are easy to distinguish in terms of their F0
contours but variability is introduced when tones are spoken in sentence contexts. In normal speech, pitch patterns must shift 5-8 times per second. Nonetheless, regardless of the preceding tone, the F0
contour of the syllable converges over time on the characteristic contour of the underlying tone. In accounting for these data, Xu and Wang (2001)
proposed that speakers aim to reach an articulatory goal associated with the lexical tone. More recently, Gauthier, Shi and Xu (2007)
provided an existence proof, using a self-organising map simulation, that listeners could use the movement of the contour towards the underlying pitch target (the velocity of F0
) to categorise the tone. In their view the object of speech perception is the articulatory gesture. In the next section, we review what is known about the neural regions involved in processing tone.
5.1 Behavioural and functional studies of tone processing
Behavioural studies using dichotic listening techniques (e.g., Wang, Jongman and Sereno, 2001
) indicate that the ability to identify lexical tone is primarily lateralised to the left hemisphere. In contrast, other pitch-related abilities with no lexical effects appear to be predominantly lateralised to the right hemisphere (e.g., Blumstein and Cooper, 1974
; Warrier and Zatorre, 2004
). Neuroimaging data support the importance of left hemisphere structures in tone perception but also indicate the continued relevance of right-hemisphere structures.
In a PET study, Klein, Zatorre, Milner, and Zhao (2001)
contrasted the neural basis of pitch perception in two groups of speakers (n = 12 in each): Mandarin-English speakers and native English speakers. Participants judged whether a pair of monosyllabic Mandarin words was identical or not. Half the word pairs had the same tone (e.g., /t’ou2
/) and half had a different tone (e.g., /fei2
/). Most relevant here is the researchers’ comparison of the two language groups performing the tone discrimination task relative to a silent baseline. Consistent with previous findings on the perception of pitch (Zatorre, Evans, Meyer and Gjedde, 1992
), native English speakers showed greater activation in right frontal and right temporal regions, indicating that these regions mediate the processing and maintenance of pitch information. In contrast, for the Mandarin-English group, relative to the English group, all the observed differences were in left hemisphere frontal, temporal and parietal regions. This differential outcome supports the view that it is the linguistic relevance of complex auditory stimuli that determines which neural mechanisms are engaged (e.g., Gandour, Wong, Hsieh, Weinzapfel and Hutchins, 2000
, see also Gandour, 2006
for a review). For Mandarin speakers, tone carries lexical significance leading to the activation of left hemisphere networks.
The four tones of Mandarin bear an imperfect relationship to any one prosodic property of English (stress, accent and intonation), and so, in order to acquire vocabulary in Mandarin, native English speakers might need to develop novel processes to integrate tones and phonetic contrasts (see Wang, Sereno, Jongman and Hirsch, 2003
; see also Zhang and Wang, 2007, this issue). Wang et al. used a focussed training programme to help six native English speakers distinguish lexical tones as part of their efforts to learn Mandarin as a second language. In their fMRI study, participants identified the tone of a spoken Mandarin word. Wang et al. found that the learning of lexical tones was associated with an increased volume of activation in existing language areas (Wernicke’s area) and in a neighbouring area of the left superior temporal gyrus. They also found an increased volume of activation in the right inferior frontal gyrus, which they linked to pitch processing. Thus they suggest that when native speakers of English acquire lexical tones, they increase demands on existing language areas (e.g., Wernicke’s area) as well as on those involved in pitch processing.
Turning now to work on connected speech, theorists have tended to emphasise the importance of Wernicke’s area because lesions in this left posterior temporal region are most typically associated with aphasia (e.g., Turner, Kenyon, Trojanowski, Gonotas and Grossman, 1996
). Narain, Scott, Wise, Rosen, Leff, Iversen and Matthews (2003)
argue that the left posterior temporal cortex is likely to be a core component of a working memory network specialised for language comprehension (Aboitiz & Garcia, 1997
), with more anterior temporal regions also important for comprehension (e.g., Crinion, Lambon-Ralph, Warburton, Howard and Wise, 2003
). For instance, patients with semantic dementia, who suffer a progressive loss in knowing the meanings of single words, typically show deterioration in the anterior and ventral temporal lobe (e.g., Mummery, Patterson, Price, Ashburner, Frackowiak and Hodges, 2000
). Further, both anterior and posterior regions of the left temporal lobe are activated in the passive comprehension of both connected speech and writing (Spitsyna, Warren, Scott, Turkheimer and Wise, 2006
, see also Narain et al., 2003
). Although activity is higher in left temporal lobe regions for the passive comprehension of speech and writing, there is also activity in the homotopic right cortex (see Spitsyna et al., p. 7330). In short, language comprehension in English utilises temporal lobe regions in both the left and right hemispheres.
What of the processing of connected speech in Mandarin compared to English? Using baselines controlling for the acoustic complexity of the speech signal (as in Narain et al., 2003
; Spitsyna et al., 2006
), Scott (2004)
examined the neural correlates of the passive perception of English and Mandarin sentences in native speakers of these languages. As would be expected, the intelligibility of the stimulus sentences determined relative regional activation. As in Narain et al. (2003
), for native English speakers (n = 8), intelligible English sentences strongly activated the length of the left lateral temporal neocortex, with peaks in a posterior region (Wernicke’s area) and in a more anterior region (anterior superior temporal sulcus). For native Mandarin speakers processing intelligible Mandarin sentences, the right superior temporal gyrus and sulcus were also strongly activated. Scott also observed that the activation in the right posterior superior temporal sulcus was close to the region associated with the processing of lexical tone, independent of the intelligibility of the sentence.
In summary, current imaging data do not allow us to draw a strong function-structure inference. However, they are compatible with the notion that proficient Chinese speakers make extensive use of right hemisphere temporal lobe regions in processing Chinese. And so, on the assumption that structural change follows functional demand, we might expect anatomical differences in the brains of Chinese and non-Chinese speakers. In the next section, we consider existing literature on the neural markers for the acquisition of a tone language.
5.2 Whole-brain structural studies of Chinese speakers
Kochunov, Fox, Lancaster, Tan, Amunts, Zilles et al. (2003)
looked at differences in brain shape between 20 English-speaking Caucasians born in the USA and 20 Chinese-speaking Asians born in mainland China and currently living in the USA. They analysed MRI scans using deformation field morphometry (see section 1), which identifies brain-surface differences between groups, and reported that right parietal, left frontal and left temporal regions were larger in the Chinese-speaking group while a left superior parietal region was larger in the Caucasian group.
Their basic position is that the observed differences reflect anatomical plasticity and derive from the different processing requirements of Chinese over English rather than genetic differences in the morphometry of Asian and Caucasian brains. The differences, they argue, are consistent with evidence from functional imaging studies, although they note that the role of the parietal cortex in Chinese-language processing is unknown (p. 963). This study offers useful initial evidence that the processing requirements of a tonal language may induce changes in brain anatomy. One caution is that the study matched the samples for age, educational level and handedness but not for the number of languages spoken – the Chinese-speaking Asians were most likely bilingual in Chinese and English – so we cannot know for sure which, if any, of the observed differences may be attributable to the acquisition of an additional language.
In the next section (5.3) we report work-in-progress on grey-matter differences between Chinese and non-Chinese speakers.
5.3 VBM study on Chinese-English speakers
In order to examine the effects of the acquisition of a tone language, Crinion et al. (in preparation
) compared grey-matter density in four groups of subjects comprising the factorial combination of (1) Chinese-speaking or not and (2) English as a first language or not. They were therefore able to look for an effect of speaking Chinese that was common to both Asian and English subjects, thereby excluding any confounds from differences in the morphometry of Asian and European brains. The 78 participants ranged in age from 18 to 71 years (30 male, 48 female; mean age 31.5; sd = 14.4) and the number of languages spoken ranged from one to seven (mean 2.8, sd = 1.4).
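The logic of this 2 × 2 factorial design can be illustrated with a short sketch: crossing "speaks Chinese" with "English as L1" lets the effect of speaking Chinese be estimated as a difference of marginal means, averaged over the L1 factor and hence independent of ethnicity. The cell means below are invented numbers for illustration, not the study's data:

```python
# Hypothetical mean grey-matter density per cell (arbitrary units);
# the 2 x 2 design crosses speaking Chinese with English as L1.
cell_means = {
    ("chinese", "english_L1"): 0.52,     # English speakers who learned Chinese
    ("chinese", "other_L1"): 0.53,       # native Chinese speakers
    ("no_chinese", "english_L1"): 0.48,  # English monolinguals
    ("no_chinese", "other_L1"): 0.47,    # European multilinguals
}

def main_effect_of_chinese(cells: dict) -> float:
    """Marginal mean for Chinese speakers minus that for non-speakers,
    averaging over the L1 factor."""
    chin = [v for (c, _), v in cells.items() if c == "chinese"]
    none = [v for (c, _), v in cells.items() if c == "no_chinese"]
    return sum(chin) / len(chin) - sum(none) / len(none)

print(round(main_effect_of_chinese(cell_means), 3))  # 0.05
```

Because the Chinese-speaking marginal mean pools native Chinese speakers with English speakers who learned Chinese, any effect that survives cannot be attributed to ethnicity alone, which is the inferential point of the design.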
The analyses to date indicate highly significant effects of speaking Chinese. Compared to English monolinguals and European multilinguals, Chinese speakers showed highly significant enhancement of grey-matter density in two regions of the right hemisphere: the superior temporal gyrus, anterior to Heschl’s gyrus, and a region in the inferior frontal gyrus. In the left hemisphere, the Chinese speakers showed two regions of increased grey-matter density: one in the middle temporal gyrus and a second in the superior temporal gyrus (posterior to Heschl’s gyrus); see the figure below. Critically, the difference from monolingual English speakers and European multilinguals was shown both by Chinese speakers with Chinese as L1 and by those with Chinese as L2. Such data confirm that the observed effect is a language effect rather than an effect of ethnicity.
Figure: Positive effect of speaking Chinese (Chinese speakers and English learners of Chinese > English monolinguals and European multilinguals); P = 0.05, corrected for whole-brain analyses.
In addition, speaking more than one language, whether European or Chinese, yielded an increase in grey matter density in the posterior supramarginal parietal region identified by the Mechelli et al. (2004)
study. That is, there is a commonality of effect consistent with this region’s role in the integration of sound and meaning. Chinese speakers must also coordinate tone with other aspects of the speech signal in speech perception and production, and this coordination demand may recruit the additional right-hemisphere and left-hemisphere structures identified in the VBM analysis.