There has been evidence for functional abnormalities of the verbal working memory system in schizophrenia. Verbal working memory crucially involves the interplay between the anterior and posterior language systems, and previous studies have shown converging evidence for abnormalities in the posterior language system in schizophrenia. In this functional magnetic resonance imaging study, we measured cortical activity in chronic schizophrenic patients and matched healthy controls during auditory and visual verbal working memory tasks. We employed 1) regional analyses specifically targeting the posterior language system and 2) analyses of functional connectivity between anterior and posterior language regions. We performed these analyses separately for each memory stage and modality. In the regional analyses, the left sylvian–parietal–temporal (Spt) area consistently showed reduced activation during encoding and retrieval stages in schizophrenia. Magnitudes of activation in the left posterior superior temporal sulcus were correlated with the severity of delusions at every memory stage. Functional connectivity analyses revealed reduced connectivity between the left Spt and the anterior insula during the encoding of auditory words. In addition, the connectivity strength was correlated with the severity of auditory hallucinations. These findings identify abnormal components in the verbal working memory system and illustrate their possible overlap with the mechanisms of core schizophrenic symptoms.
Schizophrenia is characterized by various deficits in perception and cognition, as well as its hallmark clinical symptoms such as delusions and auditory hallucinations. Among all its neurocognitive dysfunctions, a deficit in working memory has been regarded as one of the core features of schizophrenia (Goldman-Rakic 1994). Working memory is defined as active maintenance and manipulation of information for a short period of time (Baddeley 1992). Significant impairments of working memory have been reported in both the verbal and nonverbal domains (for a meta-analysis, see Lee and Park 2005). However, one study showed that auditory verbal working memory was more impaired than tone working memory in a subgroup of schizophrenics, suggesting that verbal working memory can be disproportionately affected in at least a subset of patients (Wexler et al. 1998). Working memory consists of encoding, rehearsal, and retrieval stages. A recent meta-analysis of behavioral studies under various types of working memory paradigms demonstrated significant impairments in the encoding and/or early part of maintenance (Lee and Park 2005). Alternatively, it is also possible that the largest deficit for patients lies in late rehearsal, given the fact that schizophrenic patients typically show deficits in sustained attention (Bergman et al. 1995; Liu et al. 2002) and that attention is more likely to be distracted as retention/rehearsal becomes longer. Therefore, it remains an open question which stages in working memory are particularly impaired in schizophrenia, as well as the neuronal substrates underlying such abnormalities.
The neural substrates of verbal working memory have been intensively investigated in healthy individuals with noninvasive functional imaging techniques. Many of these studies were motivated by the model of the “phonological loop” (Baddeley 1992), which consists of the active motor–articulatory component called the “subvocal rehearsal system” and the passive memory capacity of the “phonological short-term store.” These studies provided converging evidence that these 2 subcomponents have separate neural substrates. Subvocal rehearsal is largely associated with the anterior language system, including the left inferior frontal gyrus (IFG), premotor cortex (PMC), anterior insula, and the bilateral supplementary motor area (SMA), whereas phonological short-term memory store is associated with the posterior language system, including the left temporoparietal cortex and/or inferior parietal cortex (Paulesu et al. 1993; Smith and Jonides 1998). The neuroanatomy of the posterior language system for verbal working memory was further elaborated by a recent model of auditory language processing (Hickok and Poeppel 2004, 2007). In this model, auditory verbal working memory is mediated by the “dorsal stream” of auditory language processing. The left sylvian–parietal–temporal (Spt) area is a site for the interface between the phonological networks in the bilateral superior temporal gyrus and sulcus (STG and STS) and the articulatory networks in the anterior language system. Consistent with this model, previous functional magnetic resonance imaging (fMRI) studies reported sustained activation in the left Spt and posterior STS (pSTS) during the subvocal rehearsal of auditory words (Hickok et al. 2003; Buchsbaum, Olsen, Koch, and Berman, 2005). Because there was no auditory stimulation during the rehearsal period, this sustained activation may reflect cortical activity induced by the interaction with the anterior subvocal rehearsal system. 
The temporal pattern of activation in these 2 areas was clearly distinct from that of the middle part of STG/STS (mSTG/STS) where transient activation was observed only during auditory stimulation (Hickok et al. 2003; Buchsbaum, Olsen, Koch, and Berman, 2005). These observations indicate that the left Spt and pSTS are critical for auditory verbal working memory (for neuropsychological evidence, see Shallice and Warrington 1977; Basso et al. 1982; Buchsbaum and D'Esposito 2008), although the functional roles of these areas in visual verbal working memory have not been clarified. A recent fMRI study proposed that the left inferior temporal cortex including the fusiform gyrus is recruited for visual verbal working memory by a top-down signal from the prefrontal cortex, based on the observation of sustained activation in this region during the rehearsal of visual words (Fiebach et al. 2006).
Previous neuroimaging studies of schizophrenia have accumulated evidence for cortical abnormalities in the posterior half of STG. Particularly, reduced gray matter volume in the posterior STG is one of the most consistently replicated findings of structural abnormalities of schizophrenia (McCarley et al. 1999; Honea et al. 2005). Several studies have also found associations of the posterior STG gray matter volume with the severity of core clinical symptoms such as formal thought disorder (Shenton et al. 1992) and delusions (Menon et al. 1995). Because the posterior STG overlaps the Spt and pSTS for verbal working memory, these structural studies raise the possibility of functional alteration of these subareas in schizophrenia. Other than verbal working memory, the polymodal areas in the pSTS and the inferior parietal cortex were proposed to constitute a part of the system for social cognition/perception, based on the observation that various socially significant cues (such as eye and mouth movements and familiarity of voice) modulate the activity of these areas (Frith 2007; Blakemore 2008). Shared anatomical substrates for language processing and social cognition, including social attention, have been postulated in the STS (Hein and Knight 2008; Redcay 2008). Also, abnormal activation of the social and verbal representations may underlie various schizophrenic symptoms (Wible et al. 2008). Although these considerations suggest a possible association between the functions of the posterior verbal working memory areas and the mechanisms of schizophrenic symptoms, no study has explicitly tested this possibility.
In addition to the posterior perisylvian areas, functional abnormalities of the frontal cortex in schizophrenics have long been suggested by the “hypofrontality” hypothesis (Ingvar and Franzen 1974). However, the findings of functional imaging studies of working memory remain controversial with regard to the strength of the prefrontal activation in schizophrenia. Some studies found reduced activation of the prefrontal areas in patients, whereas other studies reported “hyperactivity” in the same areas (for review, see Weinberger and Berman 1996). Given the neuroanatomical model of verbal working memory in which the frontal articulatory networks crucially interact with the posterior language areas (Smith and Jonides 1998; Hickok and Poeppel 2007), it is possible that cortical abnormalities of the frontal cortex are manifested in terms of functional integration (e.g., functional connectivity) with the posterior language system rather than in the magnitude of activation. Together with the regional abnormalities in the posterior language system, such abnormalities at the large-scale network level may contribute to clinical symptoms of schizophrenia. Indeed, it has been posited that an abnormal influence of the speech production system on the auditory cortex may underlie auditory hallucinations (Frith and Done 1988; Shergill et al. 2000).
In this fMRI study, we measured the cortical activity of patients with chronic schizophrenia and matched healthy controls during a verbal working memory task with auditory and visual conditions. In the first part of the study, we focused on regional activation in the posterior language areas, namely the left Spt, pSTS, and mSTG/STS. The second part of the study examined the functional integration of these areas with the anterior articulatory rehearsal system in the prefrontal cortex and adjacent structures (i.e., anterior insular cortex). To examine possible regional abnormalities in the posterior language areas, we identified cortical activation of interest in each individual brain without spatial normalization. This procedure is crucial for identifying separate but spatially close areas around the posterior Sylvian fissure because it has been reported that the left Spt is relatively small (Buchsbaum, Olsen, Koch, Kohn, et al. 2005) and the morphology of the posterior perisylvian areas is highly variable among individuals (Steinmetz et al. 1990; Westbury et al. 1999). Given the possibility that auditory and visual verbal working memory depend on partially different neural systems and that verbal memory in schizophrenia is impaired only at a specific memory stage, these regional and connectivity abnormalities were examined separately for the encoding, rehearsal, and retrieval stages and for the auditory and visual conditions. Lastly, the measures of regional activation and functional connectivity were examined for their possible associations with schizophrenic symptoms. By employing these analyses, we aimed to examine potential functional abnormalities in the verbal working memory system in schizophrenia and to investigate the possible involvement of specific components of that system in the clinical symptoms of schizophrenia.
Fourteen patients with chronic schizophrenia and 14 matched healthy control subjects participated in this study. Patients were recruited from the VA Boston HealthCare System, Brockton Campus. All patients were medicated and had a history of prior hospitalizations due to their symptoms. Control subjects were recruited through advertisements and group matched to patients on age, gender, socioeconomic status (SES), parental socioeconomic status (PSES), and handedness (Table 1). Inclusion criteria for both patients and controls were as follows: right handedness, age between 20 and 60 years, no hearing impairments, no history of electroconvulsive therapy, no history of neurological illness, no substance abuse or dependence history during the last 5 years (assessed using the addiction severity index and the Diagnostic and Statistical Manual of Mental Disorders (DSM)-IV criteria), no alcohol use in the 24 h prior to testing, verbal IQ above 75, and English as a first language.
All patients were diagnosed using the structured clinical interview for DSM-IV (SCID-P; First et al. 1995) and chart information when applicable. Normal control subjects received a screening telephone interview, which was used to ascertain information regarding mental health, neurological illness, and developmental disabilities. All control subjects were interviewed with the SCID-NP (Spitzer et al. 1990) to rule out axis I diagnoses and the Family History Research Diagnostic Criteria (FHRDC) instrument (Andreasen et al. 1977) to rule out any history of mental illness in first-degree relatives. Handedness was assessed by a modified Oldfield (1971) inventory and parental SES by the Hollingshead (1965) 2-factor index of social position. Clinical measures for patients included the Scale for the Assessment of Negative Symptoms (SANS; Andreasen 1981) and the Scale for the Assessment of Positive Symptoms (SAPS; Andreasen 1984). Clinical measures were administered at the time of entry into the study and at subsequent follow-up sessions. Chlorpromazine equivalent dosage for the 2 weeks preceding the initial session was calculated for patients, as well as the dosage of anticholinergic medication, benzodiazepines, lithium, and novel antipsychotic agents (Bezchilibnyk-Butler and Jefferies 1996).
Each trial consisted of 3 different phases (Fig. 1): the first presentation (3 s), rehearsal (15 s), and second presentation (probe) phase (3 s). The second presentation phase was followed by a rest period of 16 s. Each run consisted of 6 auditory trials and 6 visual trials, so that each run lasted 7 min 40 s, including the initial and final resting periods. Two runs were administered for each subject (12 auditory trials and 12 visual trials in total). In each trial, 3 different words were presented (1 word/s) during the first presentation phase. Subjects were required to memorize the words for subsequent recall. During the rehearsal period, a red crosshair flashed every second. Subjects were asked to rehearse 1 of the 3 words covertly each time the crosshair flashed. The flashing crosshair is crucial because, without any external cue, individuals might rehearse the words at significantly different rates during the long rehearsal period. In the second presentation phase, a 3-word list was presented and subjects were asked to respond by pressing either the higher button of a response pad if the first and second word lists were the same in terms of word order or the lower button if they were not the same. On half of the trials, the word order was the same, and on half of the trials, it was different, counterbalanced across auditory and visual conditions. Each subject was presented with 36 words for auditory presentation (3 words per trial×12 trials) and 36 words for visual presentation. Two different word sets were used for auditory and visual presentations in each subject, and those 2 word sets were counterbalanced between subjects within each subject group.
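The run length can be checked against the phase durations given above. The following is a minimal sketch in Python; the 2-s TR is taken from the imaging parameters reported for the functional sequence, and treating the time not accounted for by trials as the combined initial and final rest is an assumption of this sketch, since the text does not give those two durations separately.

```python
# Durations in seconds, taken from the task description.
FIRST_PRESENTATION_S = 3
REHEARSAL_S = 15
PROBE_S = 3
REST_S = 16
TRIALS_PER_RUN = 6 + 6            # auditory + visual trials
RUN_DURATION_S = 7 * 60 + 40      # 7 min 40 s
TR_S = 2                          # repetition time of the functional sequence

trial_s = FIRST_PRESENTATION_S + REHEARSAL_S + PROBE_S + REST_S
task_s = trial_s * TRIALS_PER_RUN

# Assumption: the remainder is the initial plus final resting period.
padding_s = RUN_DURATION_S - task_s
volumes_per_run = RUN_DURATION_S // TR_S

print(trial_s, task_s, padding_s, volumes_per_run)
```

Under these assumptions each trial lasts 37 s, the 12 trials occupy 444 s, and the run length corresponds to the 230 volumes reported for each run.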
In all, 72 monosyllabic words were selected from a psycholinguistic corpus for English words (MRC Psycholinguistic Database). Each word was controlled for the following factors: syntactic category (noun), letter length (range=3–6 letters), Kucera–Francis word frequency (range=30–90), and imageability rating (range=400–800). For auditory presentation, each word was read by a female voice and digitized using speech synthesis software (Natural Reader, NaturalSoft, Vancouver, Canada). Each speech stimulus in the 3-word lists was presented binaurally through a high-fidelity sound delivery system (Avotec, Stuart, FL) every second (duration=400–800 ms; interstimulus interval [ISI]=200–600 ms) at a maximum intensity of 95 dB SPL during the first and second presentation phases. For visual presentation, each word was presented in lower case through an fMRI-compatible screen (Resonance Technology, Northridge, CA) every second (duration=500 ms; ISI=500 ms). Subjects' responses were collected via a fiber-optic button box held by the subject while in the magnet. The button box was connected to a PC interface that allowed for the collection of response time and accuracy data (Current Designs, Philadelphia, PA). Presentation of stimuli and collection of responses were performed using Presentation software (Neurobehavioral Systems, Albany, CA).
All images were acquired on a GE 3.0-T Signa system using a standard single-channel birdcage head coil (GE Medical Systems, Milwaukee, WI). Functional images were collected with a gradient-echo echo-planar imaging (EPI) sequence (repetition time [TR]/echo time [TE]=2000/30 ms; field of view [FOV]=22 cm; flip angle=90°; matrix=64×64). Each volume consisted of 32 axial slices (3 mm thickness; no gap). In all, 230 volumes were acquired for each run (7 min 40 s). Low-resolution T2 images were also acquired (TR/TE=5000/68 ms; FOV=22 cm; matrix=256×196) using the same location, plane orientation, and slice thickness as the functional images. In a separate session, high-resolution sagittal spoiled-gradient-recalled (SPGR) T1 images were acquired (TR/TE=40/3 ms; FOV=24 cm; matrix=256×256; 128 slabs; 1.5 mm thickness) from each subject. The high-resolution SPGR images were used for anatomical localization and 3D visualization of the activation.
We performed image processing and analyses in Matlab (MathWorks, Natick, MA) using the Statistical Parametric Mapping (SPM2) software package (Wellcome Department of Cognitive Neurology, London, UK). The first 3 volumes of images in each run were discarded to allow for T1 equilibration. The remaining 227 volumes were corrected for slice-timing differences (sinc interpolation) and then spatially realigned using a rigid-body transform to account for head movement over time. In order to identify activation of interest in the perisylvian areas (the left Spt, pSTS, and mSTG/STS; see Introduction) in the native space, spatial normalization to the standard brain space was not performed within the preprocessing stages. The realigned images were spatially smoothed using a Gaussian kernel with a full width at half maximum (FWHM) of 7 mm. Time series data at each voxel were temporally filtered using a high-pass filter with a cutoff of 128 s.
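For readers who want the smoothing kernel in terms of its standard deviation, the conversion can be sketched as follows. This is a minimal Python illustration, assuming that the 7 mm kernel width refers to the full width at half maximum (FWHM) and that the in-plane voxel size equals FOV/matrix (220 mm/64); the function name is hypothetical and is not part of SPM.

```python
import math


def fwhm_to_sigma(fwhm_mm):
    """Convert a Gaussian FWHM to its standard deviation (same units)."""
    return fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


FWHM_MM = 7.0                     # smoothing kernel width from the text
IN_PLANE_VOXEL_MM = 220.0 / 64    # assumption: FOV / matrix size
SLICE_THICKNESS_MM = 3.0          # slice thickness from the protocol

sigma_mm = fwhm_to_sigma(FWHM_MM)
sigma_in_plane_vox = sigma_mm / IN_PLANE_VOXEL_MM
sigma_through_plane_vox = sigma_mm / SLICE_THICKNESS_MM

print(round(sigma_mm, 3), round(sigma_in_plane_vox, 3), round(sigma_through_plane_vox, 3))
```

Under these assumptions the kernel has a standard deviation of just under 3 mm, that is, slightly less than one voxel in each direction.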
Cortical activation at each memory stage (encoding, rehearsal, and retrieval) under auditory and visual conditions was estimated separately using a general linear model. The 2 runs of fMRI time series data were modeled as separate sessions in subject-specific fixed-effect models. Task-related activation was modeled using 4 separate regressors for encoding, early rehearsal, late rehearsal, and retrieval, each of which was set at 0, 6, 12, and 18 s (with reference to the onset of each trial), respectively (for similar modeling, see Buchsbaum, Olsen, Koch, and Berman, 2005; Fiebach et al. 2006). We used different sets of regressors for the auditory and visual trials, resulting in 8 task-related regressors (4 memory stages × 2 conditions). As to each of the regressors, the canonical hemodynamic response function was convolved with a boxcar function with a length of 2.5 s. The estimation of activation for each task-related regressor was performed in the native space without spatial normalization.
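The construction of the stage regressors described above can be sketched for a single trial as follows. This is an illustration in plain Python rather than the SPM2 implementation: the double-gamma parameters (peak shape 6, undershoot shape 16, ratio 1/6) are a common approximation of the canonical hemodynamic response and are an assumption here, as are the 0.1-s sampling grid and all function and variable names.

```python
import math

TR_S = 2.0        # repetition time from the acquisition protocol
DT_S = 0.1        # fine sampling grid; an assumption of this sketch
BOXCAR_S = 2.5    # boxcar length stated in the text
ONSETS_S = {      # regressor onsets relative to trial onset
    "encoding": 0.0,
    "early_rehearsal": 6.0,
    "late_rehearsal": 12.0,
    "retrieval": 18.0,
}


def double_gamma_hrf(t_s):
    """Double-gamma response: peak shape 6, undershoot shape 16, ratio 1/6."""
    if t_s <= 0.0:
        return 0.0
    peak = t_s ** 5 * math.exp(-t_s) / math.gamma(6)
    undershoot = t_s ** 15 * math.exp(-t_s) / math.gamma(16)
    return peak - undershoot / 6.0


def stage_regressor(onset_s, n_scans):
    """Convolve a boxcar starting at onset_s with the HRF, sampled every TR."""
    n_fine = int(round(n_scans * TR_S / DT_S))
    start = int(round(onset_s / DT_S))
    stop = int(round((onset_s + BOXCAR_S) / DT_S))
    hrf = [double_gamma_hrf(k * DT_S) for k in range(int(round(32.0 / DT_S)))]
    signal = [0.0] * n_fine
    for i in range(start, min(stop, n_fine)):
        for j, h in enumerate(hrf):
            if i + j >= n_fine:
                break
            signal[i + j] += h * DT_S   # discrete convolution on the fine grid
    step = int(round(TR_S / DT_S))
    return signal[::step]               # keep one sample per scan


# One regressor per memory stage for a single 40-s window (20 scans).
design = {stage: stage_regressor(onset, n_scans=20) for stage, onset in ONSETS_S.items()}
peak_scan = {stage: reg.index(max(reg)) for stage, reg in design.items()}
print(peak_scan)
```

In a full model the same four regressors would be built separately for the auditory and visual trials, giving the 8 task-related regressors described in the text.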
Because the left Spt and pSTS were expected to show sustained activation throughout the rehearsal stage in the absence of auditory stimulation (Hickok et al. 2003; Buchsbaum, Olsen, Koch, and Berman, 2005), we used the late rehearsal regressor in the auditory and visual trials to identify activation of these 2 areas. The early rehearsal regressor was not used for assessing rehearsal-related activation because it was closer in time to the sensory stimulation and was therefore contaminated by stimulus-induced activation (Zarahn et al. 1997). We used the regressor for auditory encoding to identify activation of the left mSTG/STS because activation of this area is dependent on auditory sensory stimulation (Hickok et al. 2003; Buchsbaum, Olsen, Koch, and Berman, 2005). The significance of activation in these regions was assessed using an uncorrected threshold of P<0.005 at the voxel level with an extent threshold of 6 contiguous voxels. The anatomical classification of activation peaks was performed by coregistering functional images with each subject's SPGR image. The Spt and pSTS were defined as local maxima of significant activation during the late rehearsal period in the planum temporale and in the posterior one-third of STS, respectively. The mSTG/STS was defined as a local maximum of significant activation in the middle STS or STG (excluding Heschl's gyrus and sulcus) during the auditory encoding period, in which there was no significant activation during the late rehearsal period in either the auditory or visual condition. We defined the anterior end of the mSTG/STS as 10 mm posterior to the anterior commissure in each brain. The anterior boundary was set to exclude areas in the semantic-related processing stream for auditory language comprehension (Scott et al. 2000; Scott 2005) and to confine our analysis to the phonological system around Heschl's gyrus (Hickok and Poeppel 2007).
We extracted time series data from the voxel with the highest t value and its 26 surrounding voxels (a 3×3×3 voxel cube) to define each region of interest (ROI) for the time-course analysis. When only 1 of the 2 regions (Spt and pSTS) showed significant activation, we defined the coordinates of the missing area by assuming that the Spt and pSTS are separated by 12 mm along the z-axis. This definition was based on the average separation between the 2 areas in cases where both areas showed a significant level of activation. In all cases, at least one of the 2 areas (Spt or pSTS) showed significant activation. For each subject, we calculated the mean percent signal change at each memory stage by averaging the percent signal change over 6–10, 14–18, and 22–26 s (with reference to the onset of the trial) for encoding, rehearsal, and retrieval, respectively.
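The ROI definition and the stage-wise averaging can be sketched as follows. This is a minimal Python illustration on made-up numbers: how the baseline for percent signal change was defined is not stated in the text, so the baseline argument below is an assumption, as are the function names and the toy time course.

```python
from itertools import product

TR_S = 2.0
WINDOWS_S = {                 # averaging windows relative to trial onset
    "encoding": (6, 10),
    "rehearsal": (14, 18),
    "retrieval": (22, 26),
}


def cube_indices(peak):
    """Return the peak voxel and its 26 neighbours (a 3x3x3 cube)."""
    x, y, z = peak
    return [(x + dx, y + dy, z + dz) for dx, dy, dz in product((-1, 0, 1), repeat=3)]


def percent_signal_change(timecourse, baseline):
    """Express a raw time course as percent change from a baseline value."""
    return [100.0 * (value - baseline) / baseline for value in timecourse]


def stage_means(psc):
    """Average percent signal change over the scans inside each window."""
    means = {}
    for stage, (start_s, stop_s) in WINDOWS_S.items():
        scans = range(int(start_s / TR_S), int(stop_s / TR_S) + 1)
        means[stage] = sum(psc[i] for i in scans) / len(scans)
    return means


# Toy trial-averaged ROI signal (arbitrary units), one value per scan.
toy_signal = [100.0] * 3 + [102.0] * 3 + [100.0] + [101.0] * 3 + [100.0] + [103.0] * 3 + [100.0] * 5
roi_voxels = cube_indices((30, 25, 12))     # hypothetical peak coordinates
means = stage_means(percent_signal_change(toy_signal, baseline=100.0))
print(len(roi_voxels), means)
```

With a 2-s TR, each window covers 3 scans (for example, the scans at 6, 8, and 10 s for encoding).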
We performed functional connectivity analyses for each of the memory stages (encoding, rehearsal, and retrieval) separately following the procedures described by Rissman et al. (2004; for applications of this method, see Buchsbaum, Olsen, Koch, and Berman, 2005; Fiebach et al. 2006). Each functional connectivity analysis was conducted on a voxel-by-voxel basis in the native space of the individual subject. Activation at a memory stage in each trial was modeled as a unique event so that an entire series of task-related activation was modeled by 96 separate regressors (4 stages × 24 trials per subject). The resulting series of 96 activation estimates (β series) was subsequently sorted into 8 different series types depending on memory stage (4 stages) and the modality of stimuli (auditory and visual). Therefore, this resulted in separate β series that consisted of 12 data points for each memory stage during the auditory and visual conditions. We then created a seed β series by collapsing across the 27 voxels of a seed area (a voxel with the highest t value in the ROI analysis and its surrounding 26 voxels) for each series type. This seed β series was used for calculating correlation (Pearson's r) with the β series in each voxel in the whole brain. Therefore, this analysis created a set of correlation maps (8 maps: 4 memory stages × 2 conditions) for each individual subject. The resulting stage- and condition-specific correlation maps were then normalized to the Montreal Neurological Institute (MNI) space using the transformation matrix created when normalizing the mean image of realigned functional image series to the MNI space. Finally, voxels with significant correlations were estimated using random effects 1-sample t-tests for controls and patients separately. Resulting t maps were thresholded at t>3.85 (P<0.001, uncorrected for multiple comparisons). 
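The sorting and correlation steps of the β series approach can be sketched as follows. This is an illustration in plain Python on synthetic numbers, not the analysis code used in the study: the trial order (alternating auditory and visual trials), the toy β values, and all names are assumptions of the sketch.

```python
import math
from collections import defaultdict

STAGES = ("encoding", "early_rehearsal", "late_rehearsal", "retrieval")


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def split_beta_series(betas, labels):
    """Sort one beta estimate per (trial, stage) into series by stage and modality."""
    series = defaultdict(list)
    for beta, key in zip(betas, labels):
        series[key].append(beta)
    return dict(series)


# 24 trials x 4 stages = 96 estimates; the modality order is a placeholder.
labels = [
    (stage, "auditory" if trial % 2 == 0 else "visual")
    for trial in range(24)
    for stage in STAGES
]
seed_betas = [math.sin(0.7 * i) for i in range(96)]     # toy seed-region betas
voxel_betas = [2.0 * b + 0.5 for b in seed_betas]       # toy voxel, linearly related

seed_series = split_beta_series(seed_betas, labels)
voxel_series = split_beta_series(voxel_betas, labels)
correlation = {key: pearson_r(seed_series[key], voxel_series[key]) for key in seed_series}
print(len(correlation), round(correlation[("encoding", "auditory")], 3))
```

Repeating the correlation for every voxel in the brain would give one correlation map per series type, that is, 8 maps per subject. In the actual analysis, the seed series is the average over the 27 seed voxels rather than a single sequence.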
For a subsequent ROI analysis, we calculated mean β value in each region for controls and patients separately by identifying a voxel with the highest t value and collapsing its surrounding cubic cluster of 27 voxels. For the auditory condition, the left Spt served as the seed area based on the model of the auditory verbal working memory (Hickok and Poeppel 2007; see Introduction). For the visual condition, the left fusiform gyrus, as well as the left Spt, was used as the seed area, based on the previous report of an interaction between this area and the prefrontal cortex (Fiebach et al. 2006; see Introduction). Significant activation of the left fusiform gyrus was identified in each individual brain using the encoding regressor of the visual trials. We then created a correlation map for the encoding period of the visual condition for each subject. After normalizing the individual correlation maps, we performed a second-level analysis for group comparisons. Statistical threshold was set at P<0.005, uncorrected for multiple comparisons.
Both schizophrenic and control subjects were able to comprehend and complete the verbal working memory task without difficulty, and no subject was disqualified due to practice failure. The mean accuracy rate and reaction time for the auditory and visual conditions of the 2 groups are summarized in Table 2. Both groups showed a high accuracy rate (more than 90%) in each condition. There was no significant difference between groups either in the auditory or in the visual condition (t-test, P>0.1). Although the reaction time of schizophrenic patients tended to be longer than that of control subjects, this difference did not reach a significant level either in the auditory or in the visual condition (P>0.05). These behavioral results exclude the possibility that group differences in fMRI activation reflect the effects of poor performance in patients (Price and Friston 1999; Price et al. 2006).
We examined head movement during scanning for each subject group. The translational and rotational movements were calculated in millimeters and in radians, respectively. Head movement in both translation and rotation was closely matched between groups (translational movement=0.217±0.139 mm in healthy controls and 0.216±0.121 mm in schizophrenic patients; rotational movement=0.0027±0.001 radians in healthy controls and 0.0025±0.002 radians in schizophrenic patients; t-test, P>0.5).
The first part of the fMRI analyses focused on activation of verbal working memory areas in the left posterior language system. Using the estimate of activation during the rehearsal period, we reliably identified significant activation of the left Spt and pSTS in the individual brain in both subject groups (Table 3). In many schizophrenic and control subjects, we observed that both auditory and visual conditions produced peaks of activation at close locations within the left Spt and pSTS (for a representative case for control and schizophrenic subjects, see Fig. 2). For the left Spt and pSTS, we calculated a mean distance of the peaks of activation between the 2 conditions in the individual subject and found that the mean displacement was within a single voxel size in all of x, y, and z directions (left Spt: 1.78±2.32 mm [mean ± standard deviation, along x], 1.24±1.94 mm [y], and 1.21±1.57 mm [z] in controls, 1.5±2.18 mm [x], 1.71±3.12 mm [y], and 1.21±1.80 mm [z] in schizophrenics. Left pSTS: 2.43±4.94 mm [x], 1.21±2.19 mm [y], and 1.00±2.28 mm [z] in controls, 2.86±2.21 mm [x], 2.07±2.43 mm [y], and 1.43±2.59 mm [z] in schizophrenics). The fact that significant activation in the left Spt and pSTS was reliably identified in the separate analyses for auditory and visual modalities suggests that these 2 areas are crucial for the maintenance of abstract phonological information, rather than for echoic or sensory specific short-term memory.
We next calculated the mean time course of the fMRI signals of these 2 critical areas for verbal working memory and the left mSTG/STS (Fig. 3). For the auditory condition, all 3 areas showed clear activation at the time of auditory stimulation (i.e., encoding and retrieval) in both schizophrenic and control groups. Sustained activation in the left Spt and pSTS was significantly higher than the baseline during the rehearsal period, whereas the signal in the left mSTG/STS dropped close to the baseline level. These results confirmed the previous findings that the left Spt and pSTS are involved in both speech perception and (covert) production, whereas the activation in mSTG/STS is auditory perception dominant (Hickok et al. 2003). Group comparisons at each memory stage revealed a significant reduction of activation in the left Spt during encoding and retrieval (P<0.05) for schizophrenic patients, but activation during rehearsal was closely matched. In the left pSTS and mSTG/STS, there was no significant difference between groups throughout the trial. Because mSTG/STS is a strongly unimodal area, it was expected that this area would be sensitive to various acoustic factors, such as sound intensity (Jancke et al. 1998; Langers et al. 2007). The comparable activation between groups in this area indicates that observed group difference in the left Spt does not reflect any uncontrolled acoustic factors.
In the visual condition, both the left Spt and pSTS showed peaks of activation at encoding and retrieval stages in both groups with sustained activation during rehearsal as was found during the auditory condition. However, there was clearly reduced activation during the encoding and retrieval stages relative to the auditory condition, reflecting the fact that these areas play critical roles in speech perception as well as speech production (Hickok et al. 2003). There was no clear task-related activation observed in the left mSTG/STS during the visual condition. As in the auditory condition, schizophrenic patients showed significantly reduced activation during encoding and retrieval in the left Spt. Reduced activation was also found in the pSTS only during encoding. Taken together with the results of the auditory condition, abnormal activity was most consistently found in the left Spt during encoding and retrieval. This indicates that functional abnormalities in this area reflect mnemonic processes rather than perception of a particular sensory modality. Notably, none of the 3 areas showed significant group differences during rehearsal in either the auditory or visual conditions. It is also worth noting here that we minimized the difference of each individual's subvocal rehearsal rate by flashing the crosshair at a constant interval.
We next calculated the correlation between clinical scores in the SANS and SAPS and the percent signal change at each memory stage in the left Spt, pSTS, and mSTG/STS. We found that delusions of reference and global delusions scores showed significant positive correlation with activation in the left pSTS during all 3 stages in both auditory and visual conditions (delusions of reference: r=0.66, P<0.01 during auditory encoding; r=0.66, P<0.05 during auditory rehearsal; r=0.54, P<0.05 during auditory retrieval; r=0.67, P<0.01 during visual encoding; r=0.74, P<0.01 during visual rehearsal; and r=0.63, P<0.05 during visual retrieval; for correlations with the score of global delusions, see Fig. 4). The fact that significant correlations were observed in the 6 independent tests strongly suggests an association between the severity of delusions and the left pSTS activation. On the other hand, the left Spt and mSTG/STS did not show consistent correlations with these 2 scores. Therefore, among the 3 posterior language areas, cortical activation in the pSTS was selectively associated with the severity of delusions. There was no other clinical score that showed significant correlation with any areas in all the 6 tests. We also note that significant correlation with the score of global delusions was found in activity of the right STS during all 3 stages in the visual condition (r=0.65, P<0.05 during encoding; r=0.74, P<0.01 during rehearsal; and r=0.54, P<0.05 during retrieval) and during rehearsal in the auditory condition (r=0.55, P<0.05 during rehearsal; r=0.26, P>0.1 during encoding; and r=0.32, P>0.1 during retrieval; see Supplementary Fig. 1).
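For orientation, a correlation coefficient can be converted to a t statistic given the sample size. The sketch below assumes Pearson correlations computed over all 14 patients (12 degrees of freedom); the text does not state the type of correlation or whether every patient contributed to every test, so both are assumptions, and the function name is hypothetical.

```python
import math


def r_to_t(r, n):
    """Return the t statistic and degrees of freedom for a Pearson r with n pairs."""
    df = n - 2
    t_value = r * math.sqrt(df) / math.sqrt(1.0 - r * r)
    return t_value, df


N_PATIENTS = 14   # assumption: all patients entered each correlation

# Coefficients reported for delusions of reference and left pSTS activation.
reported_r = {
    "auditory encoding": 0.66,
    "auditory rehearsal": 0.66,
    "auditory retrieval": 0.54,
    "visual encoding": 0.67,
    "visual rehearsal": 0.74,
    "visual retrieval": 0.63,
}

for label, r in reported_r.items():
    t_value, df = r_to_t(r, N_PATIENTS)
    print(f"{label}: r={r:.2f}, t({df})={t_value:.2f}")
```

The resulting t values can be compared with tabulated critical values for 12 degrees of freedom; because the reported coefficients are rounded to 2 decimal places, values close to a threshold should be interpreted with caution.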
The second part of the fMRI analysis was aimed at investigating possible abnormalities of functional integration between the posterior verbal working memory areas and the anterior articulatory system. Using the left Spt as the seed area (see Introduction), we first performed whole brain functional connectivity analyses for each memory stage under the auditory condition (Table 4). During the encoding stage, significant correlations were found in several key areas in the anterior articulatory system in both subject groups (Indefrey and Levelt 2004): the junction of the left anterior insula and ventral IFG (hereafter anterior insula), dorsal IFG and PMC, and pre-SMA (Fig. 5A). We defined the posterior end of the pre-SMA using the y-coordinates of the anterior commissure (y=0) based on previous cytoarchitectonic and diffusion tensor imaging (DTI) studies (Zilles et al. 1996; Johansen-Berg et al. 2004). Performing an ROI analysis for the 3 areas (Fig. 5B), we observed high correlation values in the left anterior insula in healthy controls, whereas schizophrenic patients showed a marked reduction of correlation (P<0.05). In contrast, the left IFG/PMC showed high correlation values for both groups with no significant differences between the groups (P>0.5). In the pre-SMA, although both groups showed high correlations, there was a significant reduction in the patient group (P<0.05).
During the rehearsal stage, significant correlations were identified in the left anterior insula, IFG/PMC, and bilateral pre-SMA in each group, although correlation values were generally reduced compared with those during encoding (Fig. 5C). We performed group comparisons based on the ROI analysis and did not observe significant differences between groups in any of the 3 ROIs (P>0.1; Fig. 5D). During the retrieval stage, we observed high correlations within the 3 ROIs in both subject groups (Fig. 5E). The ROI analysis revealed that, in contrast to the encoding stage, schizophrenic patients showed comparable connectivity in the left anterior insula when compared with controls (P>0.1; Fig. 5F). As during encoding and rehearsal, the left IFG/PMC showed strong correlations with no significant difference between groups (P>0.1). However, we observed a marked reduction of correlation in pre-SMA in schizophrenic patients (P<0.05).
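The seed-based ROI comparison described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the pipeline used in the study: it assumes Pearson correlation between a seed (left Spt) time series and an ROI time series, a Fisher z-transform before group comparison, and a Welch t statistic for the group contrast. All function names and all time series are hypothetical toy values.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length time series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def seed_connectivity(seed_ts, roi_ts):
    """Fisher z-transformed seed-to-ROI correlation (assumed transform)."""
    return math.atanh(pearson_r(seed_ts, roi_ts))


def welch_t(a, b):
    """Welch t statistic for the difference between two group means."""
    na, nb = len(a), len(b)
    mean_a, mean_b = sum(a) / na, sum(b) / nb
    var_a = sum((v - mean_a) ** 2 for v in a) / (na - 1)
    var_b = sum((v - mean_b) ** 2 for v in b) / (nb - 1)
    return (mean_a - mean_b) / math.sqrt(var_a / na + var_b / nb)


# Hypothetical seed time series and noise patterns (toy values only).
seed = [0.0, 1.0, 2.0, 1.0, 0.0, -1.0, -2.0, -1.0]
patterns = [
    [1, -1, 1, -1, 1, -1, 1, -1],
    [1, 1, -1, -1, 1, 1, -1, -1],
    [-1, 1, 1, -1, -1, 1, 1, -1],
]


def make_roi(amplitude, pattern):
    """ROI series = seed signal plus scaled noise (toy construction)."""
    return [s + amplitude * p for s, p in zip(seed, pattern)]


# Toy groups: the ROI tracks the seed closely in controls, less so in patients.
control_z = [seed_connectivity(seed, make_roi(a, p))
             for a, p in zip([0.2, 0.3, 0.4], patterns)]
patient_z = [seed_connectivity(seed, make_roi(a, p))
             for a, p in zip([0.8, 1.0, 1.2], patterns)]

t_value = welch_t(control_z, patient_z)
print(f"Welch t (controls vs. patients) = {t_value:.2f}")
```

Running the same comparison separately for each ROI and each memory stage gives the kind of stage-by-stage table of group differences that the text summarizes.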
These results indicate that deficits in interplay between the anterior and posterior language systems are differentially manifested depending on memory stage. To further clarify stage-dependent changes of abnormal connectivity, we calculated effect size representing the group difference during each memory stage of the 3 ROIs (Fig. 6A). Although each of the ROIs showed unique patterns of abnormality, of particular note is that the left anterior insula showed a highly selective abnormality during encoding. The posterior language areas and left anterior insula are thought to be connected through the arcuate fasciculus. A recent DTI study observed that the strength of connectivity in the arcuate fasciculus in schizophrenic patients varied depending on the presence of auditory hallucinations (Hubl et al. 2004). Therefore, it is possible that functional connectivity of this pathway may be related to the severity of auditory hallucinations. This possibility was supported by the observation of a significant positive correlation between the SAPS scores of auditory hallucinations and measures of functional connectivity during encoding (P<0.05; Fig. 6B).
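The stage-wise effect size comparison can be illustrated with a short sketch. The specific effect size measure is not named in this section, so Cohen's d with a pooled standard deviation is used here purely as an assumption, and the connectivity values are hypothetical.

```python
import math


def cohens_d(group_a, group_b):
    """Cohen's d (pooled SD) for the difference between two group means."""
    na, nb = len(group_a), len(group_b)
    mean_a, mean_b = sum(group_a) / na, sum(group_b) / nb
    var_a = sum((v - mean_a) ** 2 for v in group_a) / (na - 1)
    var_b = sum((v - mean_b) ** 2 for v in group_b) / (nb - 1)
    pooled_sd = math.sqrt(((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2))
    return (mean_a - mean_b) / pooled_sd


# Hypothetical connectivity values (controls, patients) for one ROI,
# listed per memory stage. These are NOT data from the study.
stages = {
    "encoding": ([0.9, 1.0, 1.1, 1.2], [0.5, 0.6, 0.7, 0.8]),
    "rehearsal": ([0.6, 0.7, 0.8, 0.9], [0.6, 0.65, 0.8, 0.85]),
    "retrieval": ([0.8, 0.9, 1.0, 1.1], [0.7, 0.85, 0.95, 1.0]),
}

effect = {stage: cohens_d(controls, patients)
          for stage, (controls, patients) in stages.items()}

for stage, d in effect.items():
    print(f"{stage}: d = {d:.2f}")
```

Expressing each stage on a common standardized scale is what allows a selective abnormality, such as one confined to encoding, to be distinguished from a uniform reduction across stages.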
To examine whether similar impairments would be observed in the encoding of visual words, we performed the functional connectivity analysis for the encoding period of the visual words using the left Spt as the seed area. Although we observed that the 3 critical speech production areas, including the left anterior insula, showed significant connectivity with the left Spt, the ROI analysis indicated no significant group differences in the strength of connectivity (P>0.1; Fig. 7A; for the results of the whole brain analysis, see Supplementary Table 1). It is possible that the functional roles of the frontotemporal pathways involving the left Spt may dynamically change depending on the modality of verbal information. This possibility is consistent with neuropsychological observations of selective impairments of auditory verbal memory after left temporoparietal damage (Shallice and Warrington 1977; Basso et al. 1982).
We next examined functional connectivity using the left fusiform gyrus (often called visual word form area) as the seed area (Fiebach et al. 2006). Analysis of the time course of activation of this area confirmed sustained activation during rehearsal in both groups (Supplementary Fig. 2), a result consistent with the previous study of healthy subjects (Fiebach et al. 2006). A group comparison of the correlation maps revealed a significant reduction of r values in the patient group in the junction of the left anterior insula and the inferior frontal operculum (Fig. 7B; for other areas with significant reduction, see Supplementary Table 2). This result indicates that, although encoding-related abnormalities of the left anterior insula are apparent in both auditory and visual verbal working memory, the abnormality during visual verbal working memory may be associated with deficient functional integration with the left fusiform gyrus rather than with the left Spt.
The present study examined functional abnormalities of the verbal working memory system in schizophrenia using auditory and visual words. Regional analyses within the posterior language regions were used to examine activity within critical functional subregions of the verbal working memory network. Functional connectivity analyses were also used to examine the interaction between the anterior and posterior language systems. The regional analyses of the individual brains revealed reduced activation in the left Spt during encoding and retrieval for both auditory and visual conditions in patients. Functional connectivity analyses revealed a reduced coupling of activation between the left Spt and the anterior articulatory areas involved in auditory verbal working memory. In particular, the left anterior insula showed a marked reduction of connectivity with the left Spt during the encoding of auditory words. These patterns of selective abnormalities are consistent with a meta-analysis showing that the encoding stage is particularly deficient in schizophrenia (Lee and Park 2005). We also observed associations between functional measures of specific components of the verbal working memory system and clinical symptoms of schizophrenia: Magnitudes of activation in the bilateral pSTS were significantly correlated with the severity of delusions, and the strength of the connectivity between the left Spt and the anterior insula was correlated with the severity of auditory hallucinations. These findings indicate that particular components of the verbal working memory system may be related to distinct clinical symptoms of schizophrenia.
We replicated previous fMRI findings of functional differentiation within the posterior language regions during auditory verbal working memory and found these functional regions in schizophrenic patients as well as in healthy controls (Hickok et al. 2003; Buchsbaum et al. 2005). Within each subject group, we observed that the left Spt and pSTS showed sustained activation throughout rehearsal in both auditory and visual conditions, whereas the left mSTG/STS showed transient activation only during the auditory presentation of words. Within the left Spt and pSTS, we identified activation independently for the auditory and visual conditions in the individual brain and found that the discrepancies of the activation foci between the 2 conditions were within subvoxel size on average. Our results indicate that, at the level of resolution typically used in fMRI procedures, the left Spt and pSTS are crucially involved in both auditory and visual verbal working memory.
Among areas in the posterior language system, the left Spt showed a consistent reduction of activation in schizophrenic subjects during the encoding and retrieval stages of auditory and visual verbal working memory. Reduced activation during encoding and retrieval (but not during rehearsal) was also reported in a previous fMRI study of working memory in schizophrenia, although it was mainly observed in the prefrontal cortex (Johnson et al. 2006). Other than functional roles in verbal working memory, a recent fMRI study proposed that the left planum temporale plays a critical role in perception of voices in external auditory space (“outside head”) as opposed to internal space (“inside head”; Hunter et al. 2003). It was further proposed that dysfunction of this region leads to a false perception of externally located voices (i.e., auditory verbal hallucinations). Our findings of abnormal activation of this area in the patient group may be relevant to such deficient perception of voices in external auditory space, although we did not find significant association between the left Spt activation and the severity of auditory hallucinations in our correlation analyses.
A number of functional imaging and neurophysiological experiments have demonstrated that the pSTS is one of the cortical convergence zones of auditory and visual information not only in the language domain (Calvert 2001; Driver and Noesselt 2008) but also in the social and communication domain (e.g., speaker's identity and emotional state; Campanella and Belin 2007; Kreifelts et al. 2007). The importance of the pSTS for social perception/cognition has been supported by sensitivity of this region to various social and communication cues, such as direction of eye gaze (Puce et al. 1998), mouth movements (Pageler et al. 2003), and emotion and familiarity of voices (Kriegstein and Giraud 2004; Beaucousin et al. 2007). A recent meta-analysis study examined the locations of STS activation under multiple tasks, including theory of mind, biological motion processing, face processing, speech processing, and audiovisual integration and argued against strict functional subdivisions in the pSTS based on the observation of a cluster of activation in this area for all the tasks (Hein and Knight 2008).
From these observations, it was proposed that the systems for processing language and social cognition are overlapping and/or adjacent in and around the posterior STG/STS and act as a coordinated system (Kreifelts et al. 2007; Wible et al. 2008). It is possible that the same area of the pSTS can be involved in different functions by virtue of different patterns of interactions with other brain areas (Hein and Knight 2008). An even stronger claim postulates the existence of a shared system for language and social cognition in the STS, proposing a common process underlying the language and social domains, that is, analysis of auditory and visual sequences in order to interpret their meaning and communicative significance (Redcay 2008). In either case, the converging view suggests that the pSTS is involved not only in speech perception and production but also in analyzing socially significant signals and goal-directed behavior in the service of understanding the intentions of others and of attributing mental states to others (Wright et al. 2003; Pelphrey et al. 2004). The abnormal computations and/or misinterpretations of social signals (e.g., gaze direction) have been associated with certain types of delusional beliefs (e.g., delusions of reference and persecutory delusions; Hooker and Park 2005; Hoffman 2007). It is also worth noting here that the temporoparietal junction (which encompasses the pSTS) was shown to play a critical role in several aspects of social cognition including theory of mind, empathy, and agency, especially in the right hemisphere (Decety and Lamm 2007). Furthermore, abnormal activity in this region has been associated with delusions of alien control (Spence et al. 1997; Blakemore and Frith 2003).
Various delusional beliefs may be manifested depending on the extent and foci of the abnormal activation in and around the pSTS social cognitive processing system by provoking aberrant thoughts about one's own and other people's intentions and mental states (Brunet-Gouet and Decety 2006; Wible et al. 2008). An association between the left posterior STG and delusions has also been reported in a previous volumetric study (Menon et al. 1995). Our findings of a positive correlation between the pSTS activity and the severity of delusions during all memory stages may indicate increased sensitivity of this area to both endogenous (i.e., inner speech/subvocalization) and exogenous (i.e., auditory and visual signals) stimuli in patients with severe delusions. This possibility is consistent with a recent proposal that schizophrenic symptoms may stem from overactivity in areas around the pSTS whose representational content is closely related to the symptom (Wible et al. 2008).
Previous functional imaging studies have shown converging evidence that the anterior part of the left insula is critically involved in verbal working memory (Paulesu et al. 1993; Indefrey and Levelt 2004; Koelsch et al. 2009). In the present study, strong functional connectivity with the left Spt was observed during the encoding of auditory words in control subjects, whereas the connectivity was markedly reduced in schizophrenic patients. Other than verbal working memory processes, a series of recent brain lesion studies indicated that the anterior insula is involved in several auditory processes, such as allocating auditory attention to novel auditory stimuli, and phonological processing (for review, see Bamiou et al. 2003). These neuropsychological findings are consistent with our observation of particularly strong connectivity during the auditory condition.
Although conduction aphasia has been traditionally regarded as a disconnection syndrome caused by lesions in white matter tracts connecting the anterior and posterior language areas (i.e., the arcuate fasciculus; Geschwind 1965), recent neuropsychological studies have accumulated evidence that at least some types of conduction aphasia are due to cortical damage, typically involving the left temporoparietal/inferior parietal cortex and the left anterior insula (Damasio and Damasio 1980; Anderson et al. 1999). Conduction aphasia involves impairments in the repetition of heard words, phonemic errors in speech production, and naming difficulties, which has led to the claim that conduction aphasia can be characterized as “phonological encoding deficits” (Shallice and Warrington 1977). Based on this claim, we suggest that the selective reduction of functional connectivity during encoding reflects an impairment of the phonological encoding of auditory words in schizophrenia. Previous morphometric studies reported a gray matter volume reduction of the bilateral anterior insula in first episode schizophrenic patients (Kasai et al. 2003) as well as in chronic patients (Makris et al. 2006). One of the new findings of our functional approach is that the abnormalities involving this region may be manifested in a context-dependent way. Such a finding was made possible by employing memory stage-specific analyses that decomposed verbal working memory into temporally distinct stages.
The involvement of the bilateral SMA (including SMA proper and pre-SMA) in overt speech production has been replicated in a number of functional imaging studies (for review, see Indefrey and Levelt 2004). It has been proposed that, whereas the SMA proper is predominantly implicated in motor outputs, the function of the pre-SMA is related more to high-order aspects of cognition (Picard and Strick 1996). A recent fMRI study suggested that there is a rostrocaudal gradient in the SMA and that the anterior parts are particularly involved in effortful word selection and the retrieval process (Alario et al. 2006). Our results revealed reduced functional connectivity between the pre-SMA and the left Spt only during the encoding and retrieval stages but not during the rehearsal period. In particular, marked reduction of connectivity during retrieval suggests that deficiencies in speech production in schizophrenia may be apparent during the word retrieval process. It has been proposed that speech production involves neural circuitry between the pre-SMA/SMA proper and the posterior perisylvian areas, disruption of which leads to transcortical motor aphasia (TCMA; Freedman et al. 1984). TCMA is characterized by perseveration of speech, mutism, and reduced spontaneous speech (Alexander et al. 1989), all of which are typically observed in schizophrenic patients with negative symptoms. We suggest that the interaction between the pre-SMA and the posterior language areas is impaired in schizophrenia particularly for word retrieval, which may contribute to speech production deficits in this disease.
It has been proposed that the left fusiform gyrus is recruited for visual verbal working memory by virtue of top-down signals from the anterior language system (Fiebach et al. 2006). There are at least 2 major white matter tracts that connect the fusiform gyrus and the frontal cortex, that is, the inferior longitudinal fasciculus and the inferior frontooccipital fasciculus, both of which may support language processing (Catani et al. 2002, 2003). Although there has been little evidence for structural abnormalities in schizophrenia in either the inferior longitudinal fasciculus or the inferior frontooccipital fasciculus, several DTI studies reported abnormalities in the uncinate fasciculus, the pathway between the temporal pole and the ventral part of the frontal cortex (Kubicki et al. 2002; Park et al. 2004). Because the inferior longitudinal pathway projects to the temporal pole, it is possible that structural abnormalities in the uncinate fasciculus may be related to the present finding of reduced functional connectivity between the left fusiform gyrus and the anterior insula. Although several functional imaging studies suggested that the left inferior longitudinal fasciculus is involved in aspects of language processing (Vigneau et al. 2006), its contribution to language functions has been questioned by a recent cortical stimulation study (Mandonnet et al. 2007). Further work is needed to clarify the relationships between these ventral pathways and specific language functions, as well as the impact of abnormalities in these pathways on schizophrenic symptoms.
We observed that the severity of auditory hallucinations was positively correlated with the strength of functional connectivity between the left anterior insula and Spt. This result suggests that this neural pathway may be a part of the mechanisms of auditory hallucinations. Because this pathway is critical for speech production (Geschwind 1965; Hickok and Poeppel 2007), this finding is consistent with the previous model of involvement of the speech production system in auditory hallucinations (Frith and Done 1988; Shergill et al. 2000). The critical involvement of the left anterior insula has also been supported by a recent fMRI study observing activation in this area prior to the onset of auditory hallucinations (Hoffman et al. 2008). A previous DTI study observed that schizophrenic patients experiencing auditory hallucinations showed larger fractional anisotropy (FA) values in large parts of the arcuate fasciculus than those without auditory hallucinations, although healthy controls showed significantly larger FA values overall compared with the combined group of patients with and without auditory hallucinations (Hubl et al. 2004). Similarly, our analyses of functional connectivity revealed reduced connectivity overall in schizophrenic patients compared with the control group (Fig. 5B), but correlation analysis within the patient group indicated stronger connectivity as the hallucinations became more severe (Fig. 6B). Taken together, converging evidence from our study and others suggests that the connectivity strength of this pathway has a significant impact on auditory hallucinations in schizophrenics.
On the other hand, a significant reduction of connectivity in the patient group indicates that our finding must be viewed in the context of other comorbid factors of schizophrenia. We suggest that this pathway is part of a larger functional system that may be compromised in schizophrenia and may contribute to auditory hallucinations. Activity of the primary auditory cortex has been reported during patients’ auditory hallucinations (Dierks et al. 1999; van de Ven et al. 2005), and a recent review raised the possibility that abnormal interactions of the primary auditory cortex and the adjacent multisensory areas (such as the pSTS) may contribute to auditory hallucinations (Wible et al. 2008). Indeed, several electrophysiological studies demonstrated feedback from the multisensory areas in the STS to the auditory perceptual areas (Driver and Noesselt 2008; Ghazanfar et al. 2008). In addition to the speech production system, the involvement of the hippocampus has also been indicated (Suzuki et al. 2003; Woodruff 2004; Wible et al. 2008). We suggest that the connectivity between the left Spt and anterior insula may be part of the mechanisms producing abnormal activation in the auditory cortex that underlies auditory hallucinations.
Several functional imaging studies reported significant gender differences in the organization of the language and verbal working memory systems in healthy populations (Speck et al. 2000). Behavioral evidence indicates that male schizophrenic patients are more likely to be impaired than female patients in several language components, such as grammar, semantics, and phonology (Walder et al. 2006). Because the participants in our study were mostly male, it is possible that female patients would show different patterns of abnormality compared with the ones observed in the present study. A previous fMRI study using a sentence completion task reported that the temporal correlation of activity between the left STG and several areas in the frontal cortex was negatively correlated with the severity of auditory hallucinations in schizophrenia (Lawrie et al. 2002). Because all patients who hallucinated were female subjects in that study, it is difficult to directly compare their results with ours. These observations together indicate the need for future studies to investigate possible gender-specific deficits in the verbal working memory system in schizophrenia.
In this study, we employed an easy verbal working memory task to control task performance between the 2 groups (Price and Friston 1999; Price et al. 2006). Under this condition, we observed reduced activation and connectivity in several critical components of the verbal working memory system (e.g., the left Spt and its connection with the left anterior insula). Given the absence of reliable evidence indicating that schizophrenics are more “efficient” in processing verbal working memory or an alternative explanation for the reduced activation in patients, these results strongly suggest a deficiency of the verbal working memory system in schizophrenia. Although our data do not provide direct behavioral evidence for verbal working memory deficits in our sample, we predict that behavioral impairments would be observed in a more demanding task that is designed for detecting even subtle abnormalities of the system. This prediction is consistent with a previous fMRI study of schizophrenia showing reduced activation for verbal working memory with comparable task performance during scanning (Stevens et al. 1998), and its companion study demonstrating behavioral impairments using a larger number of trials and a larger sample size (Wexler et al. 1998).
In the present study, we employed an approach to decompose the well-defined system of verbal working memory into key anatomical components (areas and pathways), and, for each component, we investigated possible functional abnormalities and associations with schizophrenic symptoms. Because each component of the verbal working memory system has been well characterized in terms of anatomical location as well as function (see Introduction), it was possible to evaluate the relationship or spatial overlap between the verbal working memory components (e.g., the pSTS, the left Spt–anterior insula pathway) and systems for other related functions, such as social cognition and speech production, each of which has been suggested to contribute to schizophrenic symptoms (Frith and Done 1988; Shergill et al. 2000; Brunet-Gouet and Decety 2006; Wible et al. 2008).
Although our approach has the advantage of focusing on a well-defined set of brain components, this resulted in excluding other systems that may be critical for schizophrenic symptoms. As discussed for auditory hallucinations, we do not claim that a particular component of the verbal working memory system is solely responsible for a particular symptom. For the mechanism of delusions, we do not exclude the possibility that the pSTS may be a part of the large-scale network for social cognition (Brunet-Gouet and Decety 2006). Possible involvement of the “default network” in schizophrenia has been indicated (Garrity et al. 2007; Wible et al. 2008). This large-scale network comprises the lateral temporoparietal cortex (Gusnard and Raichle 2001), which may include the Spt and pSTS. If the baseline activity of these areas is elevated in schizophrenia (Wible et al. 2008), these areas may show reduced activity to externally imposed tasks in patients as in the present study. Therefore, our findings within the verbal working memory system need to be integrated with these relevant systems that may together contribute to symptoms of schizophrenia. The verbal working memory system provides ideal conditions for such investigations of possible overlaps with the other systems because each component of the system has been characterized both structurally and functionally, and some of the components, particularly areas in the posterior system, have been demonstrated to be reliably identifiable in the individual brain.
In conclusion, the present study employed both regional analyses of the posterior language regions and analyses of functional connectivity between the anterior and posterior language regions to investigate functional abnormalities of the verbal working memory system in schizophrenia. Our findings of abnormalities within localized language regions and in the functional integration of a large-scale verbal working memory network may begin to bridge the gap between neural mechanisms for deficient cognitive functions and core clinical symptoms in schizophrenia.
The National Institute of Mental Health (RO1 MH067080-01A2) to C.G.W.; the Department of Veterans Affairs, a Research Enhancement Award Program to R.W.M.
We gratefully acknowledge Istvan Moroz for technical support, Israel Molina for careful reading of the manuscript, and Laura Rosow for data collection. Conflict of Interest: None declared.