Reading aloud involves computing the sound of a word from its visual form. This may be accomplished 1) by direct associations between spellings and phonology and 2) by computation from orthography to meaning to phonology. These components have been studied in behavioral experiments examining lexical properties such as word frequency; length in letters or phonemes; spelling–sound consistency; semantic factors such as imageability; measures of orthographic or phonological complexity; and others. Effects of these lexical properties on specific neural systems, however, are poorly understood, partially because high intercorrelations among lexical factors make it difficult to determine if they have independent effects. We addressed this problem by decorrelating several important lexical properties through careful stimulus selection. Functional magnetic resonance imaging data revealed distributed neural systems for mapping orthography directly to phonology, involving left supramarginal, posterior middle temporal, and fusiform gyri. Distinct from these were areas reflecting semantic processing, including left middle temporal gyrus/inferior temporal sulcus, bilateral angular gyrus, and precuneus/posterior cingulate. Left inferior frontal regions generally showed increased activation with greater task load, suggesting a more general role in attention, working memory, and executive processes. These data offer the first clear evidence, in a single study, for the separate neural correlates of orthography–phonology mapping and semantic access during reading aloud.
Reading single words aloud is usually construed as involving the computation of a target phonological code from orthographic (visual) input. The phonological code is translated into the sequence of articulatory gestures that underlie overt pronunciation. Semantic (meaning) information may also contribute to this computation to some degree, particularly in cases where spelling–sound correspondences are unusual (e.g., “yacht”; Strain et al. 1995; Plaut et al. 1996). Numerous behavioral studies and a few functional neuroimaging studies of reading aloud have used factors such as word frequency, spelling–sound consistency, and imageability to investigate components of the reading process. In general, 2 main approaches have been used: 1) a factorial approach in which 1 or 2 of these factors are manipulated while holding a few other relevant factors constant and 2) a correlational approach (e.g., multiple linear regression) in which the influence of several variables is investigated at once. The factorial approach suffers from the limitation that investigation of only 1 or 2 variables at a time can at best yield a partial picture of the reading system. The regression approach, in contrast, allows for the simultaneous investigation of many variables of interest, but in practice, these variables are often correlated with each other (e.g., word frequency and length tend to be negatively correlated), making it difficult to attribute a unique role to any single variable. The approach used here was to investigate the simultaneous influence of multiple reading-related variables, with the critical difference that the factors were decorrelated from each other by careful stimulus selection. This approach was used to investigate the neural systems that support orthographic, phonological, and semantic processes in reading aloud, using both behavioral and functional magnetic resonance imaging (fMRI) data.
A major advantage of ensuring that the lexical factors of interest are uncorrelated is that any spatial overlap of brain activation across factors is attributable to a shared neural substrate, rather than a statistical correlation between factors. Six factors of particular relevance to orthographic, phonological, and semantic processes were decorrelated in this stimulus set: word frequency, spelling–sound consistency, imageability, bigram frequency, biphone frequency, and length in letters. Below, we describe each factor, its possible relevance to reading aloud, and relevant behavioral and functional neuroimaging evidence.
Reading aloud is strongly influenced by the frequency with which words are encountered in the language. Low-frequency words elicit longer processing times and higher error rates than high-frequency words. This pattern obtains not only for reading aloud (Monsell 1991) but with an even greater effect size for lexical decision (Forster and Chambers 1973; Schilling et al. 1998; Balota et al. 2004) and picture naming (Huttenlocher and Kubicek 1983; Hennessey and Kirsner 1999). The generalization of this effect across tasks is relevant because lexical decision does not require a speech response, and picture naming involves nonverbal input. Thus, although frequency effects may also be present at the level of overt articulation or visual encoding of letters, neither is necessary to elicit these effects. The longer response latencies elicited by reading low-frequency words may arise from multiple sources, one of which is the relative difficulty of mapping from orthography to phonology (for a review see Monsell 1991).
Several functional neuroimaging studies of single-word reading have reported increased activation for low- compared with high-frequency words in the inferior frontal gyrus (IFG) and anterior insula bilaterally (more consistently on the left), in the supplementary motor area (SMA), and in various left temporal regions (Fiez et al. 1999; Joubert et al. 2004; Kronbichler et al. 2004; Hauk, Davis, and Pulvermüller 2008). Some of the left temporal lobe foci have been in or near the so-called visual word-form area (VWFA; Cohen et al. 2000). Word-frequency effects in the VWFA have been taken as evidence for whole-word orthographic processing (i.e., activation of an orthographic lexicon) in this region (Joubert et al. 2004; Kronbichler et al. 2004; Hauk, Davis, and Pulvermüller 2008), whereas frequency effects in the IFG have generally been interpreted as evidence of phonological processing in this region (Bookheimer 2002).
It is important to note, however, that caution is required when interpreting brain activations that are accompanied by increases in task difficulty. Functional neuroimaging measurements are very sensitive to differences in response time (RT), accuracy, attention, working memory load, and level of effort between tasks (see, e.g., Honey et al. 2000; Adler et al. 2001; Braver et al. 2001; Ullsperger and Yves von Cramon 2001; Gould et al. 2003; Binder et al. 2004; Binder, Medler, et al. 2005; Mitchell 2005; Desai et al. 2006; Lehmann et al. 2006; Tregallas et al. 2006). These differences in task difficulty are thought to exert effects on brain activity by modulating domain-general cognitive processes necessary for completing any task. Likely examples of such domain-general systems include a sustained attention network for maintaining arousal, a selective attention system for focusing neural resources on a particular modality or sensory object in the environment (e.g., a visual display), a working memory system for keeping task instructions and task-relevant sensory representations accessible, a response-selection mechanism for mapping the contents of working memory to a response, a response-inhibition system for preventing premature or prepotent responses from being made in error, and an error-monitoring system for adjusting response criteria and RT deadlines to minimize such errors. If this is the case, and if the level of activation in these systems depends on general task demands, then it follows that activation can never be attributed with certainty to specific linguistic processes when this activation has resulted from a contrast in which general task demands differ. Specifically, brain areas whose task-related activation is thought to reflect attention and working-memory demands (e.g., SMA, anterior cingulate, IFG, anterior insula, precentral sulcus, and dorsal parietal lobe; Corbetta et al. 1998; Carter et al. 1999; LaBar et al. 1999; Duncan and Owen 2000; Derrfuss et al. 2005; Grosbras et al. 2005; Owen et al. 2005) largely overlap those whose activation increases for low- compared with high-frequency words (Fiez et al. 1999; Hauk, Davis, and Pulvermüller 2008). Moreover, these same areas show activation correlated with increases in RT during reading aloud (Binder, Medler, et al. 2005). Hence, neural effects of, for example, decreasing word frequency may be indistinguishable from attention or working-memory processes to the extent that they spatially overlap with effects of RT.
Although the focus in neuroimaging studies has been on the greater activation associated with decreasing word frequency, increasing word frequency may also be expected to have positive effects on activation, particularly in lexical–semantic systems. Word frequency is correlated with the extent of repeated exposure to a lexical concept; thus, access to word meaning is likely to be more extensive and more automatic in the case of high-frequency words. Higher-frequency words appear in more contexts (Adelman et al. 2006) and are judged to be more familiar (Toglia and Battig 1978; Baayen et al. 2006) compared with lower-frequency words. Word frequency facilitates performance on semantic decision tasks (e.g., deciding if a word denotes an object belonging to a particular conceptual category), suggesting that semantic information is more easily available for high-frequency words (Monsell et al. 1989; Chee et al. 2002). In word-association tasks, higher-frequency words are more likely to be produced as associates, suggesting that they have stronger associative connections with other words (Nelson and McEvoy 2000). Although such studies might lead to the expectation of greater activation in semantic processing areas for high-frequency words, only 2 prior imaging studies have reported such a pattern (Prabhakaran et al. 2006; Carreiras et al. 2009). In general, reading studies that directly compared high- and low-frequency words showed no relative activation for the high-frequency condition (Fiez et al. 1999; Chee et al. 2002; Fiebach et al. 2002; Joubert et al. 2004; Kronbichler et al. 2004; Carreiras et al. 2006; Hauk, Davis, and Pulvermüller 2008). Thus, although it may be reasonable to expect activation for high-frequency words in brain areas that support semantic processing, support for this from functional neuroimaging is scarce.
Spelling–sound consistency is another factor that affects the orthography–phonology computation. “Friends” of a word share the same rime pronunciation, whereas “enemies” of a word have a rime that is spelled the same but is pronounced differently. In general, words with inconsistent spelling–sound mappings (e.g., PINT has many enemies such as MINT, HINT, LINT, but no friends) elicit longer naming latencies than words with consistent spelling–sound mapping (Baron and Strawson 1976; Glushko 1979; Andrews 1982; Taraban and McClelland 1987; Jared 1997, 2002). These effects are greater for lower-frequency words compared with higher (Seidenberg et al. 1984). Computational models of reading and associated behavioral evidence suggest that when a word is used relatively infrequently and the mapping between spelling and sound is highly atypical, semantic information is used to help achieve the correct phonological representation (Strain et al. 1995; Plaut et al. 1996; Strain and Herdman 1999; Harm and Seidenberg 2004; Woollams 2005). Hence, brain activation elicited by reading low-frequency, low-consistency words might be interpreted in 2 ways. It may reflect increased use of neural resources for orthography–phonology mapping for low-frequency, low-consistency words, or, alternatively, recruitment of the semantic system. Several studies have reported activation in language-related prefrontal cortical regions such as IFG for reading inconsistent compared with consistent words (Herbster et al. 1997; Fiez et al. 1999; Binder, Medler, et al. 2005; Mechelli et al. 2005), and such results are typically interpreted as reflecting neural systems for orthography–phonology mapping. However, because low-frequency, low-consistency words also elicit longer naming latencies, such activations may be confounded with attention and executive processes, as described above (Binder, Medler, et al. 2005). 
Functional imaging evidence for recruitment of semantic processing areas in reading low-frequency, low-consistency words, on the other hand, is scarce.
Imageability, which refers to the ease with which a word evokes a mental image, is a semantic factor that facilitates word recognition in the lexical decision task (James 1975; Kroll and Merves 1986; Kounios and Holcomb 1994; Binder, Westbury, et al. 2005). Highly imageable words are thought to have a richer or more easily accessed semantic representation (Shallice 1988; Paivio 1991; Schwanenflugel 1991). Although semantic processing is usually considered to have a minimal role in reading aloud, there is evidence that it plays a role, particularly for less familiar, more difficult words (Patterson et al. 1985; Patterson and Hodges 1992; Strain et al. 1995; Strain and Herdman 1999; Woollams 2005). According to connectionist models of word reading, if both word frequency and spelling–sound consistency are low, the orthography–phonology computation is both less accurate and less efficient. Computing the correct phonological code requires additional input via the orthography–semantics–phonology pathway in such models (Plaut et al. 1996). Input along this semantically mediated pathway is greater for words that are highly imageable. This theory predicts faster latencies for low-frequency, low-consistency words that are highly imageable compared with those that are less imageable. This pattern has been observed in multiple studies of reading aloud (Strain et al. 1995; Strain and Herdman 1999; Shibahara et al. 2003; Woollams 2005).
Several functional neuroimaging studies of imageability have also been performed. In a single-word reading aloud study similar to the one performed here, Binder, Medler, et al. (2005) found increased activation in bilateral angular, superior frontal, and precuneus/posterior cingulate gyri as word imageability increased. Activations for lower-imageability words were found in bilateral anterior cingulate cortex, left precentral gyrus, and left IFG/anterior insula, similar to the activations reported for lower-frequency words in the studies mentioned previously. Similar inferior frontal activations for low- relative to high-imageability words have been reported in several studies using lexical and semantic decision tasks (Perani et al. 1999; Friederici et al. 2000; Fiebach and Friederici 2003; Noppeney and Price 2004; Binder, Westbury, et al. 2005; Sabsevitz et al. 2005). Several reading-related studies also reported activation for high-imageability words in precuneus/posterior cingulate and angular gyrus (Binder, Westbury, et al. 2005; Sabsevitz et al. 2005; Bedny and Thompson-Schill 2006), although exceptions to this pattern have also been reported (Pexman et al. 2007; Hauk, Davis, Kherif, and Pulvermüller 2008). On balance, these findings suggest that precuneus/posterior cingulate and angular gyrus play a prominent role in processing word meaning, a notion further supported by a recent large-scale meta-analysis of functional neuroimaging studies of word-related semantics (Binder et al. 2009).
A final primary factor of interest, bigram frequency, was included as a measure of orthographic familiarity. One of the few behavioral studies to examine the impact of bigram frequency on reading aloud found no effect (Strain and Herdman 1999). Bigram frequency has, however, been shown to affect other reading phenomena such as tachistoscopic word and letter perception (Biederman 1966; Broadbent and Gregory 1968; Rumelhart and Siple 1974; Rice and Robinson 1975; Binder et al. 2006) and lexical decision (Gernsbacher 1984; Westbury and Buchanan 2002), leading many researchers to control for this variable in studies of reading aloud (e.g., Waters and Seidenberg 1985; Taraban and McClelland 1987; Monsell et al. 1989; Strain et al. 1995; Jared 1997; Weekes 1997; Hino and Lupker 2000; Jared 2002; O'Malley and Besner 2008).
The study of bigram effects in the functional neuroimaging literature has been motivated by the theory that familiar letter combinations (e.g., bigrams and trigrams) are represented in the brain, particularly in VWFA, with more frequently encountered combinations having stronger representations (Dehaene et al. 2005; Binder et al. 2006; Vinckier et al. 2007). Activation of these sublexical orthographic codes speeds letter recognition, perhaps through an interactive activation mechanism (McClelland and Rumelhart 1981), and damage to their neural representations results in letter-by-letter reading (Binder and Mohr 1992; Leff et al. 2001; Cohen, Henry, et al. 2004). Relevant functional brain-imaging data for bigram frequency effects comes from 2 recent studies using nonwords, both of which found increased activity with increasing bigram frequency in left mid-fusiform gyrus (Binder et al. 2006; Vinckier et al. 2007).
Biphone frequency and letter length were included primarily to ensure that effects of the other variables could not be attributed to correlations with these factors. Biphone frequency is a measure of phonotactic complexity shown to affect speech production tasks such as nonword pronunciation (Majerus et al. 2002; Goldrick and Larson 2008; Graves et al. 2008) and picture naming (Vitevitch et al. 2004), with low-biphone-frequency words showing a processing disadvantage compared with high-biphone-frequency words. Effects of letter length (number of letters) could in principle also relate to phonotactic processing in that longer words may be more difficult to pronounce. Given that the stimuli in the present experiment are all monosyllabic, however, it is reasonable to expect letter-length effects to arise primarily at the level of visual encoding, as suggested by functional neuroimaging studies showing increased activation in primary visual cortex for longer words (Mechelli et al. 2000; Wydell et al. 2003).
The principal aim of the present study was to examine in more detail the neural systems supporting reading aloud. One unresolved issue is whether there are regions in the IFG that are specifically modulated by word frequency, spelling–sound consistency, or imageability. IFG activation has been reported for low values of all of these variables (i.e., more difficult conditions), although no studies have examined all 3 variables concurrently to assess the degree of overlap of the activated regions. Similar IFG regions are also modulated by response latency, suggesting that at least some of this activation may represent general executive and attention processes modulated by task difficulty. We predicted that common brain regions would be modulated by task difficulty across all 3 lexical variables, as well as by RT, and that these would include areas previously identified with attention, working memory, and other general executive processes. In addition, there may be areas of IFG overlap restricted to the word frequency and consistency variables, which would suggest a more specific role for these areas in orthography-to-phonology mapping.
A second issue concerns activation of the semantic system. Current neuroimaging evidence for engagement of the semantic system during reading aloud is limited (Binder, Medler, et al. 2005), despite modeling and behavioral evidence that semantic information plays a role in this task. We predicted 2 effects concerning the semantic system. First, increasing values of word frequency and imageability were expected to produce increasing activation in semantic regions such as the angular gyrus, ventral temporal lobe, and precuneus/posterior cingulate, indicating more extensive activation of semantic information with increasing concept familiarity and imageability. Second, we expected increased activation in the semantic system as spelling–sound consistency decreases. Although the predicted effects of frequency and imageability could be attributed to incidental processing of semantic information, semantic activation associated with decreasing consistency would be strong evidence for a direct contribution from semantics in reading aloud.
The stimuli were 465 monosyllabic English words, which were selected to ensure that the following 6 factors were uncorrelated: letter length, word frequency, spelling–sound consistency, imageability, bigram frequency, and biphone frequency (see Table 1 for extreme examples). Word-frequency values were obtained from CELEX (Baayen et al. 1995) in terms of occurrences per million and log transformed. Consistency was defined as the number of friends minus the number of “enemies.” Comparisons were based on phonetic transcriptions from CELEX that were transformed, when necessary, into standard American English pronunciations. Bigram frequencies were length and position constrained. For each 2-letter combination in a word, the frequencies of all words of the same length containing the same bigram in the same position were summed and log transformed. After calculating this value for each bigram in the word, these figures were then summed and divided by the total number of bigrams in the word to give a mean log-transformed positional bigram frequency. The same procedure was performed for biphones to yield mean log-transformed positional biphone frequency. Compared with unconstrained biphone frequency (cf. Vitevitch and Luce 2004), this method is more predictive of reaction time in an auditory pseudoword repetition task (Graves et al. 2008).
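The two derived measures described above, consistency (friends minus enemies) and mean log-transformed positional bigram frequency, can be illustrated with a minimal Python sketch. This is not the authors' software. The toy lexicon, its frequencies, the ASCII rime transcriptions, and the choice of base-10 logarithms are all assumptions made for illustration; the text does not state the log base.

```python
import math
from collections import defaultdict

def friends_and_enemies(word, rimes):
    """Count words sharing the rime spelling with the same (friends)
    or a different (enemies) rime pronunciation."""
    spelling, sound = rimes[word]
    friends = enemies = 0
    for other, (other_spelling, other_sound) in rimes.items():
        if other == word or other_spelling != spelling:
            continue
        if other_sound == sound:
            friends += 1
        else:
            enemies += 1
    return friends, enemies

def consistency(word, rimes):
    """Consistency as defined in the text: friends minus enemies."""
    friends, enemies = friends_and_enemies(word, rimes)
    return friends - enemies

def build_bigram_table(lexicon):
    """Sum word frequencies per (word length, position, bigram)."""
    table = defaultdict(float)
    for word, freq in lexicon.items():
        for pos in range(len(word) - 1):
            table[(len(word), pos, word[pos:pos + 2])] += freq
    return table

def mean_log_bigram_frequency(word, table):
    """Mean of the log-transformed positional bigram sums for one word.
    The word must be in the lexicon used to build the table."""
    logs = [math.log10(table[(len(word), pos, word[pos:pos + 2])])
            for pos in range(len(word) - 1)]
    return sum(logs) / len(logs)

# Hypothetical rime table: word -> (rime spelling, rime pronunciation).
toy_rimes = {"pint": ("int", "aInt"), "mint": ("int", "Int"),
             "hint": ("int", "Int"), "lint": ("int", "Int")}

# Hypothetical frequencies (occurrences per million).
toy_lexicon = {"mint": 10.0, "hint": 20.0, "pint": 5.0,
               "mist": 15.0, "minty": 2.0}

bigram_table = build_bigram_table(toy_lexicon)
mint_value = mean_log_bigram_frequency("mint", bigram_table)
```

In this toy example PINT has no friends and three enemies, so its consistency is −3, whereas MINT has two friends and one enemy. The five-letter word contributes nothing to the four-letter bigram sums because the table is constrained by word length as well as position. The same table-building function could be applied to phoneme strings to obtain the biphone measure.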
Imageability values were obtained from a database of imageability ratings compiled from 6 sources (Paivio et al. 1968; Toglia and Battig 1978; Gilhooly and Logie 1980; Bird et al. 2001; Clark and Paivio 2004; Cortese and Fugett 2004), the first 3 available through the MRC Psycholinguistic Database (Wilson 1988).
Stimulus selection began with a corpus of 1650 nonhomographic, monosyllabic words containing 4–6 letters, a subset of the monosyllabic words used by Seidenberg and McClelland (1989). This corpus was divided into 8 orthogonal cells by fully crossing high and low levels of word frequency, consistency, and imageability. The cell with the smallest number of items, the low-frequency/inconsistent/high-imageability cell, contained 38 words. Next, 38 items were selected at random from each of the other 7 cells, yielding a nucleus of 304 words (8 cells × 38 words) for which frequency, consistency, and imageability were uncorrelated. Finally, words were selected from a list of 895 additional monosyllabic words in the Seidenberg and McClelland (1989) list in order to decorrelate the sample in terms of bigram and biphone frequency. This enlarged the sample to 465 words. Two correlation matrices are given in Table 2. The upper half presents correlation values among the 6 variables of interest across the 1650 words in the starting corpus. The lower half presents correlation values among the same variables across the final set of 465 words. A list of the 465 stimuli and their associated values is given in the supplemental material (Table S1).
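The first stage of this procedure (crossing high and low levels of 3 factors, finding the smallest cell, and sampling that many items from every cell) can be sketched as follows. This is an illustrative sketch only: the cutoffs that define "high" and "low", the toy corpus, and the function name are hypothetical, since the text does not report how the levels were defined.

```python
import itertools
import random
from collections import defaultdict

def balanced_nucleus(items, cutoffs, seed=0):
    """Cross high/low levels of each factor and draw an equal-sized
    random sample from every cell of the crossing."""
    cells = defaultdict(list)
    for item in items:
        # A cell is identified by one high/low flag per factor.
        key = tuple(item[factor] > cut for factor, cut in cutoffs.items())
        cells[key].append(item)
    # The smallest cell fixes how many items every cell contributes.
    n_per_cell = min(len(members) for members in cells.values())
    rng = random.Random(seed)
    nucleus = []
    for key in sorted(cells):
        nucleus.extend(rng.sample(cells[key], n_per_cell))
    return nucleus, n_per_cell

# Hypothetical cutoffs and a toy corpus with unequal cell sizes.
cutoffs = {"frequency": 0.5, "consistency": 0.5, "imageability": 0.5}
cell_sizes = [5, 6, 7, 8, 9, 10, 11, 3]
toy_corpus = []
for levels, size in zip(itertools.product([0.0, 1.0], repeat=3), cell_sizes):
    for _ in range(size):
        toy_corpus.append(dict(zip(cutoffs, levels)))

nucleus, n_per_cell = balanced_nucleus(toy_corpus, cutoffs)
```

Because every cell contributes the same number of items, each factor is high in exactly half of the nucleus regardless of the levels of the other two, which is what removes the correlations among the dichotomized factors. The second stage, adding words to decorrelate bigram and biphone frequency, is not covered by this sketch.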
The 20 participants (13 females) were all healthy, literate adults with normal or corrected-to-normal vision, were right handed on the Edinburgh handedness inventory (Oldfield 1971), and spoke English as a first language. Mean age was 23.2 (standard deviation, SD: 3.4), and mean years of education was 16.6 (SD: 3.1). A verbal IQ estimate from the Wechsler Test of Adult Reading (Wechsler 2001) was available for 19 of the 20 participants, with a mean standard score of 109.6 (SD: 8.3). Participants provided written informed consent according to local Institutional Review Board protocols and were paid an hourly stipend.
The fMRI experiment used a fast event-related design with continuous acquisition. On each trial, a word was displayed for 1000 ms, then replaced with a fixation cross. Words subtended a horizontal visual angle of less than approximately 7°. Participants were instructed to “read each word aloud as quickly and accurately as possible.” Participants spoke into an MRI-compatible microphone placed near the mouth and secured to the head coil. The fMRI session included 5 runs of single-word reading aloud. Each run lasted 8 min and consisted of 93 reading trials; these trials were randomly intermixed with 139 baseline (fixation) trials, resulting in a variable intertrial interval ranging from 2 to 34 s (mean: 4.9 s, SD: 3.72 s). Following this were 5 runs of pseudoword reading aloud. The pseudoword data were not included in the current analyses and will not be discussed further.
MRI data were acquired using a 3.0-T GE Excite system (GE Healthcare, Waukesha, WI) with an 8-channel array head radio frequency receive coil. High-resolution, T1-weighted anatomical reference images were acquired as a set of 134 contiguous axial slices (0.938 × 0.938 × 1.000 mm) using a spoiled-gradient-echo sequence. Functional scans were acquired using a gradient-echo echoplanar imaging (EPI) sequence with the following parameters: 25-ms echo time, 2-s repetition time, 192-mm field of view, 64- × 64-pixel matrix, in-plane voxel dimensions 3.0 × 3.0 mm, and slice thickness of 2.5 mm with a 0.5-mm gap. Thirty-two interleaved axial slices were acquired, and each of the 5 functional runs consisted of 240 whole-brain image volumes.
Image analysis was performed using AFNI (http://afni.nimh.nih.gov/afni) (Cox 1996). For each subject, the first 6 images in each time series were discarded prior to regression analysis to avoid initial saturation effects. Images were slice timing corrected and spatially coregistered. Estimates of the 3 translation and 3 rotation movements at each time point, computed during registration, were saved for use as noise covariates. Image volumes containing artifacts were identified using AFNI's 3dToutcount program and subsequently removed from the analysis. A local Pearson correlation algorithm (Saad et al. 2009) was used to align each T1-weighted structural volume to the same EPI reference volume that was used to align the functional scans.
Audio recordings of the reading responses were processed using a combination of a freely available correlation-based noise subtraction algorithm (Cusack et al. 2005) and custom software developed in-house. This approach suppressed scanner noise while leaving the speech signal intact and automatically paired reading responses with stimulus onset markers. RTs were calculated from stimulus onset to response onset. Values more than 2 SDs from each subject's own mean were checked and, when necessary, manually determined by visual and auditory inspection of the audio file. Responses were considered errors if the subject stuttered, produced a mispronunciation, failed to respond, or responded with an RT more than 3 SDs from the group mean. These RTs, calculated for each item and each subject individually, were used as a covariable in the fMRI regression analysis.
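The SD-based screening of RTs described above can be sketched in a few lines of Python. This is a hedged illustration, not the in-house software: the function name and the RT values are hypothetical, and the use of the sample SD (n − 1 denominator) is an assumption.

```python
from statistics import mean, stdev

def flag_outliers(rts, n_sd):
    """Return indices of RTs more than n_sd standard deviations
    from the mean of the supplied values."""
    center = mean(rts)
    spread = stdev(rts)  # sample SD; an assumption, not stated in the text
    return [i for i, rt in enumerate(rts)
            if abs(rt - center) > n_sd * spread]

# Hypothetical RTs (ms) for one subject; the last value is suspicious.
subject_rts = [580, 600, 590, 610, 595, 605, 585, 1400]

to_check = flag_outliers(subject_rts, n_sd=2)   # candidates for manual review
too_extreme = flag_outliers(subject_rts, n_sd=3)
```

With these toy values the final trial exceeds the 2-SD criterion and would be inspected by hand, but it does not exceed 3 SDs of this small sample. In the study the 3-SD criterion was applied relative to the group mean, so the second call would be made on pooled RTs rather than on a single subject's values.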
Voxelwise multiple linear regression was performed using 3dDeconvolve (Ward 2006). This analysis included the following covariables of no interest: a fourth-order polynomial to model low-frequency trends, the 6 previously calculated motion parameters, and a term for signal in the ventricles used to model noise. Covariables of interest were modeled using a gamma variate estimate of the hemodynamic response function and consisted of the following 14 terms: binary variables for 1) successful reading aloud trials and 2) trials in which the subject made an erroneous response; continuous, mean-centered values for 3) RT, 4) letter length, 5) word frequency, 6) consistency, 7) imageability, 8) bigram frequency, and 9) biphone frequency; and 10–14) interaction terms of interest. Interaction terms included in the model were those that we predicted to have an effect on brain activity (interactions of word frequency with letter-length, consistency, bigram frequency, and imageability). Detailed consideration of these interactions, however, is beyond the scope of the current report and will be presented in a subsequent article.
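To make the structure of the continuous regressors concrete, the sketch below builds one amplitude-modulated regressor from mean-centered trial values and a gamma-variate response function. It is not AFNI's implementation. The functional form and the parameter values p = 8.6 and q = 0.547 are assumptions based on a commonly cited gamma-variate model; the text does not report which parameters were used. Onsets and RT values are hypothetical.

```python
import math

def gamma_variate(t, p=8.6, q=0.547):
    """Gamma-variate response scaled to a peak of 1 at t = p * q seconds.
    Parameter values are assumptions, not taken from the text."""
    if t <= 0:
        return 0.0
    return (t / (p * q)) ** p * math.exp(p - t / q)

def mean_center(values):
    """Subtract the mean so the regressor is orthogonal to a constant
    term defined over the same trials."""
    m = sum(values) / len(values)
    return [v - m for v in values]

def modulated_regressor(onsets, amplitudes, n_volumes, tr=2.0):
    """Sum amplitude-scaled responses, sampled once per TR."""
    return [sum(a * gamma_variate(k * tr - onset)
                for onset, a in zip(onsets, amplitudes))
            for k in range(n_volumes)]

# Hypothetical trial onsets (s) and RTs (ms) for three trials.
onsets = [0.0, 6.0, 14.0]
rts = [550.0, 600.0, 650.0]

centered_rts = mean_center(rts)
rt_regressor = modulated_regressor(onsets, centered_rts, n_volumes=15)
```

Mean-centering means that the binary regressor for successful trials captures the average response to reading, while each continuous regressor captures only trial-to-trial deviations associated with its variable. A full design matrix would hold one such column per covariable, alongside the nuisance terms listed above.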
The resulting coefficient maps for each participant were linearly resampled in standard stereotaxic space to a voxel size of 1 mm³ and spatially smoothed with a 6-mm full-width-half-maximum (FWHM) Gaussian kernel. These smoothed coefficient maps were then passed to a random effects analysis comparing the coefficient values with a null hypothesis mean of zero across participants. The resulting group activation maps were thresholded at a voxelwise P < 0.01, uncorrected. A cluster-extent threshold was then calculated using AlphaSim to perform Monte Carlo simulations estimating the probability of spatially contiguous voxels for a range of alpha values. Input arguments to AlphaSim include 1) the voxelwise threshold (P < 0.01 in this case), 2) a cluster connection radius specifying the minimum distance for which clusters are considered distinct (here r = 4.24 mm, the diagonal length along the face of a single voxel), and 3) the level of smoothing. Considering the fact that raw MRI data contain a degree of smoothness introduced, for example, during image reconstruction from k-space (Friedman et al. 2006), the actual smoothness of the images was calculated from error residuals using the AFNI program 3dFWHMx. The resulting FWHM values (in mm) in 3 directions of x = 9.2, y = 9.1, and z = 7.7 were input to AlphaSim. This resulted in the removal of clusters smaller than 2052 μL (76 contiguous voxels in the original image space), for a whole-brain corrected probability threshold of P < 0.05.
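The cluster volume and connection radius reported above follow from the acquisition geometry, as the short arithmetic sketch below shows. It assumes that the 0.5-mm slice gap is counted as part of each voxel's extent, giving an effective 3 × 3 × 3 mm voxel; this assumption is not stated in the text but is consistent with the reported numbers. The function name is hypothetical.

```python
import math

def effective_voxel_volume_ul(in_plane_mm, slice_mm, gap_mm):
    """Voxel volume in microliters (1 mm^3 = 1 microliter), counting the
    inter-slice gap as part of the slice extent (an assumption)."""
    return in_plane_mm * in_plane_mm * (slice_mm + gap_mm)

voxel_ul = effective_voxel_volume_ul(3.0, 2.5, 0.5)   # 27.0
min_cluster_ul = 76 * voxel_ul                        # 2052.0
face_diagonal_mm = 3.0 * math.sqrt(2.0)               # about 4.24
```

A connection radius equal to the face diagonal means that voxels sharing a face or an edge are treated as belonging to the same cluster, whereas voxels touching only at a corner (3√3 ≈ 5.20 mm apart) are not.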
Errors in reading aloud (mispronunciations, false starts, omissions, and latencies greater than 3 SDs from the mean) were very infrequent (2.6% overall) and not analyzed further. Using simultaneous multiple linear regression analysis for which the dependent measure was each subject's mean-centered RT to each word, we examined effects on RT of the 6 variables of interest, as well as of the interactions of word frequency with letter length, consistency, bigram frequency, and imageability. The overall mean RT was 588 ms (SD: 123). Unique variance was explained by letter length (β = 7.3, P < 0.0001), frequency (β = −22.6, P < 0.05), and consistency (β = −0.4, P < 0.05). As expected, the directions of these effects were such that words with more letters, lower frequency, and more enemies were associated with longer latencies. No other main effects or interactions were significant. In addition to including all variables of interest in the same regression analysis to determine the unique effects of each variable, it is also informative to examine separate pairwise correlations between each variable and RT. This method broadly agreed with the full regression model, with letter length, word frequency, and consistency showing reliable correlations with RT. The two methods differed only in that imageability showed a reliable pairwise correlation with RT but did not explain unique variance in the full model. Pairwise correlations were as follows: letter length, r = 0.174, P < 0.001; word frequency, r = −0.193, P < 0.0001; consistency, r = −0.092, P < 0.05; imageability, r = −0.097, P < 0.05; bigrams, r = −0.0036, P > 0.05; and biphones, r = −0.0254, P > 0.05.
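The relation between pairwise correlations and unique effects can be illustrated for the simplest case of two standardized predictors, where the regression coefficients have a closed form: β₁ = (r_y1 − r_y2·r_12) / (1 − r_12²). The sketch below uses the pairwise correlations of word frequency and imageability with RT reported above; the predictor intercorrelations passed to the function are hypothetical values chosen for illustration and are not taken from Table 2. The full model in the study included more terms, so this is a simplification.

```python
def standardized_betas(r_y1, r_y2, r_12):
    """Standardized coefficients for a two-predictor linear regression,
    computed from the three pairwise correlations."""
    denom = 1.0 - r_12 ** 2
    beta_1 = (r_y1 - r_y2 * r_12) / denom
    beta_2 = (r_y2 - r_y1 * r_12) / denom
    return beta_1, beta_2

r_rt_frequency = -0.193      # reported pairwise correlation with RT
r_rt_imageability = -0.097   # reported pairwise correlation with RT

# Hypothetical case 1: the two predictors are perfectly uncorrelated.
uncorrelated = standardized_betas(r_rt_frequency, r_rt_imageability, 0.0)

# Hypothetical case 2: the two predictors correlate at r = 0.5.
correlated = standardized_betas(r_rt_frequency, r_rt_imageability, 0.5)
```

When the predictors are uncorrelated, each coefficient equals the corresponding pairwise correlation, so the two analyses agree. When the predictors are correlated, the coefficient for the weaker predictor can shrink toward zero, which is why the pairwise and simultaneous analyses are reported side by side.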
The general contrast of all successful reading responses compared with fixation baseline (left side of Fig. 1) revealed activation in the standard overt reading network as seen in numerous prior studies (e.g., Fiez and Petersen 1998; Turkeltaub et al. 2002). Activation was observed bilaterally in peri-Rolandic cortices (pre- and postcentral gyri), inferior frontal and insular cortex, superior temporal gyri, SMA, dorsal anterior cingulate cortex, intraparietal sulci (IPS), and occipital and inferior occipito-temporal cortices, as well as in subcortical nuclei such as the thalamus. There was no obvious lateralization of activation. Areas showing greater activation for fixation than successful responses included several areas often reported to show task-induced deactivation, such as bilateral precuneus and posterior cingulate gyri, bilateral ventromedial prefrontal cortex, left angular gyrus, and right parahippocampus. Table 3 gives a complete list of coordinates for activation maxima, where positive z-scores represent areas activated for reading aloud compared with fixation and negative z-scores indicate areas activated for viewing fixation compared with reading aloud. Coordinates are listed for extreme maxima (in the case of positive values) or extreme minima (for negative values) that are at least 30 mm apart and appear within significant clusters.
With the exception of the contrast between successful reading aloud and fixation, all other results are from analyses of continuous covariables. Results of these analyses are described in terms of correlations between each regressor and blood oxygen level–dependent (BOLD) signal, not activation in comparison with a baseline. In Figures 1 and 2, hot colors indicate areas where BOLD signal correlated positively with the covariable, and cool colors indicate areas where BOLD signal correlated negatively with the covariable (i.e., greater BOLD signal for decreasing values of the covariable). For RT (right side of Fig. 1), the only significant effects were increases in activity with increasing RT (i.e., positive correlations). These included broadly similar patterns in left and right hemispheres, with somewhat greater activation on the left. This pattern was seen in the IFG extending both to the middle frontal gyrus and anterior insula, the inferior frontal junction (IFJ, the junction of the precentral and inferior frontal sulci), peri-Rolandic cortices, SMA, thalamus, and posterior superior temporal sulci. Exclusively left-sided activation appeared in IPS, supramarginal gyrus, and temporo-occipital sulcus. See Table 4 for a complete list of clusters along with their activation maxima and coordinates.
Effects were obtained for each of the 6 stimulus properties of interest. Correlations for the 4 primary factors (word frequency, spelling–sound consistency, imageability, and bigram frequency) are shown in Figure 2; a complete list of cluster maxima is given in Table 4. Positive correlations between BOLD signal and letter length were observed in bilateral medial, ventral, and polar occipital cortices. The signal in left parahippocampus was negatively correlated with length. See the supplementary figure for a map of these correlations.
Positive correlations for word frequency occurred in bilateral angular gyri; bilateral posterior cingulate gyri, subparietal sulcus, and precuneus; and left superior frontal sulcus. Negative correlations for word frequency (i.e., increasing BOLD signal intensity for lower-frequency words) were found in left IFJ, IFG, anterior insula, IPS, and subgenual cingulate, and bilaterally in SMA, thalamus, medial occipital cortex, and ventral occipito-temporal cortex (left > right).
Correlations with spelling-to-sound consistency (number of friends minus number of enemies) were all in the negative direction, indicating increasing neural activity for words with more enemies than friends. These areas were all in the left hemisphere and included IFJ, IFG/anterior insula and cortex along the inferior temporal sulcus (ITS) and middle temporal gyrus (MTG).
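The consistency measure (number of friends minus number of enemies) can be illustrated with a short sketch. It assumes the conventional definitions, which are not restated in this section: a friend is another word that shares the target's spelling body and pronounces it the same way, and an enemy shares the body but pronounces it differently. The `Entry` type, the toy lexicon, and the ASCII rime codes are hypothetical and serve only as an example.

```python
from collections import namedtuple

# word: spelling; body: orthographic body (vowel plus following consonants);
# rime: an ASCII stand-in for how that body is pronounced in the word.
Entry = namedtuple("Entry", ["word", "body", "rime"])

DEMO_LEXICON = [
    Entry("mint", "int", "Int"),
    Entry("hint", "int", "Int"),
    Entry("lint", "int", "Int"),
    Entry("pint", "int", "aInt"),
]


def consistency(target, lexicon):
    """Return friends minus enemies for `target` within `lexicon`."""
    friends = 0
    enemies = 0
    for entry in lexicon:
        # Skip the target itself and words with a different spelling body.
        if entry.word == target.word or entry.body != target.body:
            continue
        if entry.rime == target.rime:
            friends += 1
        else:
            enemies += 1
    return friends - enemies
```

In this toy lexicon, "mint" has 2 friends and 1 enemy (a score of 1), whereas "pint" has no friends and 3 enemies (a score of −3), so more negative values mark less consistent words.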
Imageability was associated with correlated activity changes broadly similar to those of word frequency, though the areas modulated by imageability were less extensive. Activity positively correlated with imageability occurred in bilateral angular gyri and bilateral precuneus. Areas negatively correlated with imageability included left IFJ and left lateral and ventral occipital cortex, spreading to the posterior inferior temporal gyrus.
Bigram frequency elicited only negatively correlated activity (greater activity for words with lower bigram frequency), which occurred in bilateral posterior MTG and left supramarginal gyrus. Biphone frequency was not significantly correlated with activity in any brain region.
We also reran the analysis with a model that included all variables except RT to examine the possibility, as discussed in Wilson et al. (2009), that inclusion of an RT regressor might have distorted effects of the psycholinguistic variables with which it was correlated. Results of this analysis were nearly identical to those of the full model.
There was widespread spatial overlap of areas exhibiting a main effect of successful naming and those showing increased activation with increasing RT. Positive effects of RT also overlapped with negative effects of word frequency, consistency, and imageability, primarily in left IFJ. RT overlapped with negative consistency and frequency effects in left IFG and anterior insula. In the upper part of Figure 3 are composite maps showing effects of lexical variables that overlapped extensively with effects of RT, raising the possibility that some of these effects could be related to general performance processes (e.g., attention, cognitive control, and working memory). As mentioned in the Introduction, specific lexical and general task processes are difficult to disentangle because both are related to time on task. Areas of the left IFG and IFJ have been implicated in both types of processes, and both areas show increased activity for reading words with decreasing imageability, decreasing consistency, decreasing frequency, and longer RT. This 4-part overlap is shown in the top part of Figure 3. In contrast, there were 2 regions in left IFG that showed isolated effects of decreasing spelling–sound consistency (red areas in Fig. 3). The more ventral of these is in the anterior aspect of the pars orbitalis and the more dorsal in pars triangularis. Several areas outside the frontal lobe also showed extensive overlap between positive RT and negative frequency effects. These included the left IPS, bilateral anterior cingulate gyrus, bilateral calcarine sulcus, and bilateral thalamus.
Effects of decreasing word frequency, consistency (to a small extent), and increasing RT, but not imageability, also overlap in the left mid-fusiform gyrus (ventral surface in the upper part of Fig. 3). This area has previously been implicated in processing of visual word forms but has not typically been associated with general performance effects (although RT effects in this area for reading aloud were reported previously by Binder, Medler, et al. 2005).
A combination of effects that may involve lexical processes more specifically is shown in the lower part of Figure 3. Areas showing increasing activity for words of increasing frequency overlap mainly with areas showing increasing activity for words of increasing imageability. This overlap is seen primarily in bilateral angular gyri and left precuneus (light green in the lower part of Fig. 3). Neither of these regions shows any RT effects. Similarly, posterior temporal and inferior parietal areas showing increasing activity with decreasing bigram frequency show no overlap with areas modulated by RT. Finally, inferior temporal regions showing increasing activity for less consistent words also show little or no overlap with areas modulated by RT.
Both separate and overlapping patterns of neural activity were detected for the 6 uncorrelated factors of interest. The data suggest a neural architecture in which distinct orthography–phonology and semantic pathways are engaged during reading aloud. The results also clarify the role of several left inferior frontal regions in reading aloud.
As illustrated in the upper part of Figure 3, effects in the negative direction (relatively greater activity for lower stimulus property values) for word frequency, consistency, and imageability all overlapped with positive RT effects in left IFJ, and negative effects of frequency and consistency overlapped with RT in left IFG. Previous studies of reading suggested involvement of these areas in phonological processing (Démonet et al. 1992; Fiez and Petersen 1998; Fiez et al. 1999, 2006; Price 2000; Bookheimer 2002; Fiebach et al. 2002; Price, Gorno-Tempini, et al. 2003; Joubert et al. 2004; Mechelli et al. 2005), whereas other studies have associated essentially the same areas with more general functions such as cognitive control, attention, and working memory (LaBar et al. 1999; Derrfuss et al. 2005; Owen et al. 2005). Of the major computational models of reading (e.g., Plaut et al. 1996; Coltheart et al. 2001; Harm and Seidenberg 2004; Perry et al. 2007), we are aware of none that attempt to disentangle reading-specific effects from more general performance effects. Our data suggest the possibility that the areas shown in white and those in orange within the RT outline in the upper part of Figure 3 may be engaged in general task-performance processes such as cognitive control, attention, or working memory, which are sensitive to any increase in task load regardless of the source of the increased demand.
In contrast to these left inferior frontal regions, the left mid-fusiform gyrus shows areas of overlap between positive RT effects and negative effects of word frequency and to some extent consistency but not imageability. Were these activations purely related to general processing demands, activity negatively correlated with imageability would be expected as well (as is the case in left inferior frontal regions). This left mid-fusiform gyrus area has been referred to as the “visual word-form area” (VWFA) because of its preferential response to well-formed letter strings (Cohen et al. 2002; McCandliss et al. 2003; Cohen, Jobert, et al. 2004; Dehaene et al. 2005). Two reviews place the center coordinate for this area in Talairach and Tournoux (1988) space at x, y, and z = −43, −54, and −12 (Cohen et al. 2002) and −42, −55, and −12 (Bolger et al. 2005), whereas a review that transformed coordinates to Montreal Neurological Institute (MNI) space (for a discussion of MNI and Talairach spaces see http://imaging.mrc-cbu.cam.ac.uk/imaging/MniTalairach) gives it as −44, −58, and −15 (Jobard et al. 2003). All report an SD of approximately 5 mm, placing these coordinates within 1 SD of each other. Although the nearest local minimum for word frequency and local maximum for RT are somewhat anterior to this location, both clusters extend to clearly include the VWFA coordinate. Not only has this area been shown to be positively correlated with graphotactic probability in functional brain-imaging studies of healthy readers (Binder et al. 2006; Vinckier et al. 2007), but it is also one of the major areas that, when damaged, leads to a type of acquired dyslexia known as pure alexia (i.e., alexia without agraphia; Binder and Mohr 1992; Leff et al. 2001; Cohen, Henry, et al. 2004).
There has been debate, however, about whether this region supports orthographic processing per se or a more general process (Price and Devlin 2003), perhaps related to the mapping between visual input and phonology (Price, Winterburn, et al. 2003; Sandak et al. 2004; Hillis et al. 2005). Our results, showing activation of this area with longer RT and lower values of word frequency and spelling–sound consistency but not imageability, suggest that it may support a relatively specific yet integrative function such as the mapping between orthography and phonology.
An alternate possibility that cannot be ruled out in this study is that top-down attention systems amplify processing of orthographic codes in order to help complete the mapping to phonology. This interpretation, however, rests on the assumption that attention systems can selectively modulate orthographic processing, and we are aware of no studies that directly demonstrate this. Hence, the overlap of word frequency and consistency, but not imageability, effects in the putative VWFA suggests that this region supports integration of orthographic and phonological information. Additionally, although this discussion of VWFA has focused on properties of the area surrounding its center coordinate, there is also evidence of graded function along the left fusiform gyrus, particularly in the posterior–anterior direction. For example, along the left fusiform gyrus Vinckier et al. (2007) reported posterior activity modulated by letter frequency, somewhat more anterior activity for bigram frequency, and more anterior still for quadrigram frequency. Functional heterogeneity within the left fusiform gyrus can also be seen in the work of Hauk, Davis, Kherif, and Pulvermüller (2008) and Hauk, Davis, and Pulvermüller (2008), who report activation for words compared with baseline (viewing length-matched hash marks) within 1 cm anterior to the McCandliss et al. (2003) coordinate, activity modulated by word frequency 2 cm anterior, and activity modulated by imageability 1 cm medial to the VWFA coordinate. Further study will no doubt help clarify the functional heterogeneity in this region.
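The spacing of the 3 published VWFA center coordinates quoted earlier in this discussion can be checked with a small sketch. The coordinates are those given in the text; the function names are hypothetical, and how "within 1 SD" should be operationalized (per axis or as straight-line distance) is an assumption left open here. The sketch compares the numbers as quoted and applies no transform between Talairach and MNI space.

```python
import math
from itertools import combinations

# Center coordinates (x, y, z in mm) as quoted in the text.
VWFA_CENTERS = {
    "Cohen et al. 2002 (Talairach)": (-43, -54, -12),
    "Bolger et al. 2005 (Talairach)": (-42, -55, -12),
    "Jobard et al. 2003 (MNI)": (-44, -58, -15),
}


def per_axis_difference(a, b):
    """Absolute difference along each of the x, y, and z axes."""
    return tuple(abs(p - q) for p, q in zip(a, b))


def compare_centers(centers):
    """List per-axis differences and Euclidean distance for every pair."""
    rows = []
    for (name_a, a), (name_b, b) in combinations(centers.items(), 2):
        rows.append((name_a, name_b, per_axis_difference(a, b), math.dist(a, b)))
    return rows


if __name__ == "__main__":
    for name_a, name_b, axes, distance in compare_centers(VWFA_CENTERS):
        print(f"{name_a} vs {name_b}: per-axis {axes} mm, distance {distance:.2f} mm")
```

For these coordinates, no single-axis difference exceeds 4 mm, which can be compared against the reported SD of approximately 5 mm.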
Another set of overlaps involves positive correlations between neural activation and increasing values of word frequency and imageability (lower part of Fig. 3). These regions, which include the angular gyrus and precuneus/posterior cingulate cortex bilaterally, have been strongly implicated in semantic processes (Binder et al. 2009) and have shown activation with increasing word imageability in previous imaging studies (Jessen et al. 2000; Binder, Medler, et al. 2005; Binder, Westbury, et al. 2005; Sabsevitz et al. 2005; Bedny and Thompson-Schill 2006). One can also intuit that higher-frequency words are more likely to elicit automatic activation in a semantic network due to their extensive exposure in relation to uncommon words. Word frequency is highly correlated with concept familiarity (Toglia and Battig 1978; Baayen et al. 2006), contextual diversity (i.e., the proportion of documents that contain the word; Adelman et al. 2006), and probability of word association (Nelson and McEvoy 2000). Word frequency facilitates performance on semantic decision tasks, suggesting that semantic information is more easily available for high-frequency words (Monsell et al. 1989; Chee et al. 2002). Consistent with these observations, increasing word frequency produced correlated increases in BOLD signal in essentially the same brain regions that were modulated by increasing imageability and over an even larger spatial extent within these regions than the areas modulated by imageability.
Surprisingly, however, positive effects of word frequency have only rarely been reported in previous neuroimaging studies. In one study, higher-frequency words activated left temporal and parietal regions during reading and semantic decision tasks when compared with a low-level baseline task, whereas lower-frequency words did not (Chee et al. 2002). However, these activations did not survive a direct contrast between high- and low-frequency words. The left angular gyrus was activated in another study comparing silent reading of high-frequency words with consonant strings (Joubert et al. 2004), but again, this activation did not survive a direct comparison between high- and low-frequency words. To our knowledge, only 2 previous studies have found positive activations related to word frequency. Using fMRI during a lexical decision task, Carreiras et al. (2009) observed activation in the precuneus in a direct comparison between high- and low-frequency words, which they interpreted as reflecting “more pronounced semantic associations” for those items. Similarly, using an auditory lexical decision task, Prabhakaran et al. (2006) observed activation in the precuneus, along with left middle temporal and angular gyri, in a direct comparison between high- and low-frequency words. Other studies that examined frequency effects, however, reported no activations related to increasing frequency (Fiez et al. 1999; Fiebach et al. 2002; Kronbichler et al. 2004; Carreiras et al. 2006; Nakic et al. 2006; Hauk, Davis, and Pulvermüller 2008). Three of these studies restricted their word frequency analyses to areas that were more active for words compared with a resting baseline (Fiez et al. 1999; Kronbichler et al. 2004; Carreiras et al. 2006). 
As can be seen by comparing the successful reading condition in Figure 1 with the word-frequency result in Figure 2, if the word frequency analysis had been restricted to areas showing activation for words compared with the resting baseline, the areas more active for higher-frequency words would have been excluded. As discussed elsewhere (Binder et al. 2009), the semantic system appears to be active during resting and other passive states. One implication of this is that activity in the set of areas sometimes referred to as the default mode network (Gusnard and Raichle 2001), which includes bilateral posterior cingulate/precuneus, dorsomedial prefrontal cortex, and angular gyri, may at least partially reflect semantic processes. Thus, contrasts that use a resting baseline are likely to miss these regions.
Other studies that did not exclude brain regions from the frequency contrast may have been less sensitive because frequency was treated as a categorical variable (Fiebach et al. 2002; Nakic et al. 2006) or because of smaller stimulus and subject sample sizes. The reason for the lack of activation for high-frequency words in the Hauk, Davis, and Pulvermüller (2008) study is less clear but may relate to the fact that their stimuli were presented more rapidly. Stimuli were displayed for 100 ms in their study, compared with 1000 ms in ours, and they used a fixed stimulus onset asynchrony of 2.5 s, whereas our ITIs included random variation with a mean of 4.9 s. Thus, their subjects may have been less likely to engage in extensive semantic association and extralexical tasks such as mental imagery. On the other hand, their subjects read silently, whereas ours read aloud, introducing a further difference that makes the studies difficult to compare.
To further examine the relationship of increasing word frequency with increasing semantic content, we examined the relationship between frequency and familiarity using the norms available for our stimuli in the MRC psycholinguistic database and between frequency, familiarity, and number of semantic features for a separate set of words (McRae et al. 2005). Familiarity measures are based on ratings for which subjects presumably use semantic information along with other types of information such as how often or recently a word was used (Balota et al. 1991). Of our 465 stimuli, 297 had familiarity ratings, and 230 had meaningfulness ratings in the study by Toglia and Battig (1978). Frequency was correlated with familiarity (r = 0.77) and with meaningfulness (r = 0.38), both reliable at P < 0.001. For a separate set of 541 nouns denoting living and nonliving things, word frequency was correlated with number of semantic features, with correlations ranging from r = 0.12 to 0.19 (all significant at P < 0.01), depending on the source of the frequency measures (McRae et al. 2005). Familiarity correlated with number of semantic features to an even greater extent (r = 0.23, P < 0.0001) than did frequency. Thus, the positive correlations of frequency with familiarity, meaningfulness, and number of semantic features all support the interpretation that the overlapping activation for higher-frequency and higher-imageability words observed in bilateral precuneus/posterior cingulate and angular gyrus reflects semantic processing during reading aloud.
Areas of increased BOLD signal for words with inconsistent spelling–sound mappings were observed in the left MTG and ITS (Fig. 2) and were largely distinct from areas modulated by other variables (Fig. 3). As described in the Introduction, inconsistent words are the ones most likely to benefit from activation of semantic codes. Hence, the increased activity associated with such words is likely to reflect neural systems supporting task-related recruitment of semantic processing. This interpretation is particularly clear for activation in left MTG/ITS, a region reliably associated with lexical–semantic processing (Binder et al. 2009). Two left IFG areas (red areas, Fig. 3) were also specifically modulated by spelling–sound consistency. In contrast to other neighboring IFG regions, these 2 areas—in pars orbitalis and triangularis—showed activation changes that were specifically related to consistency and not to RT or other variables modulating task load. These IFG areas have often been implicated in semantic processing (Binder et al. 2009) and in some studies have been assigned a specific role in controlled semantic retrieval (Badre and Wagner 2002). We propose that these left IFG regions are involved specifically in top-down attentional modulation of semantic networks in the MTG/ITS. These frontal regions become transiently more active during processing of words with inconsistent spelling–sound mapping, providing an attentional input that helps strengthen the word's lexical–semantic representation.
To our knowledge, the present results provide the first imaging evidence in healthy adults for activation of the semantic system with decreasing spelling–sound consistency. Previous studies of this variable reported mainly left IFG activations, which were interpreted as evidence of phonological processing (Herbster et al. 1997; Fiez et al. 1999; Mechelli et al. 2005) or domain-general task load effects (Binder, Medler, et al. 2005). The most salient difference between these previous studies and the current one is that the former treated consistency as a categorical variable, classifying words as either regular/consistent or irregular/inconsistent. In this study, consistency, like the other variables of interest, was treated as continuous, with stimuli expressing a range of values calculated in terms of number of friends minus number of enemies. Two previous reading studies of children that also treated consistency as a continuous variable (Bolger, Hornickel, et al. 2008; Bolger, Minas, et al. 2008) revealed greater activation for lower consistency words in left posterior inferotemporal regions (inferior temporal and fusiform gyri), suggesting that use of continuous values for consistency affords greater sensitivity to temporal lobe activation. The current findings extend those of Bolger, Hornickel, et al. (2008) and Bolger, Minas, et al. (2008) to healthy adults and show that temporal regions modulated by consistency are not modulated by other lexical variables or by RT. Frost et al. (2005) also manipulated consistency, along with frequency and imageability, in reading aloud, but restricted their analyses to 3 a priori regions of interest (left IFG, MTG, and angular gyrus). They too found activation for low-consistency words in the MTG, which they interpreted as related to lexical semantics, in part because activity in this region was also greater for high-imageability words.
The neural findings related to bigram frequency (Fig. 2) were somewhat unexpected. As a measure of graphotactics, bigram frequency was expected to correlate positively with activity in left mid-fusiform gyrus, as reported in previous studies using nonwords (Binder et al. 2006; Vinckier et al. 2007). The lack of such a finding in the current study may arise from the fact that the range of bigram frequencies is compressed for words compared with nonwords, with few words in the very low bigram frequency range. In fact, the response function reported by Binder et al. (2006), relating BOLD response in the VWFA to bigram frequency, suggests that the effect is greatest in the low bigram frequency range and reaches an asymptote at higher ranges. Lack of correlation between graphotactic probability and activation in left mid-fusiform gyrus is also not without precedent. In a study of silent single-word reading by Hauk, Davis, and Pulvermüller (2008), orthographic typicality (a composite variable that included bi- and trigram probabilities) showed no association with activation in left ventral temporal cortex.
On the other hand, the increase in BOLD signal associated with decreasing bigram frequency in the present study has important implications. Decreases in bigram frequency likely increase the difficulty of mapping from orthography to phonology. This scenario matches well with the localization of these effects to bilateral posterior MTG and superior temporal sulcus (Figs. 2 and 3). These areas on the left were found in a meta-analysis of word production studies to be implicated in phonological retrieval (Indefrey and Levelt 2004). The finding of increased activity associated with decreasing bigram frequency in the left supramarginal gyrus also fits well with neuropsychological findings that associate damage in this area with conduction aphasia (Damasio and Damasio 1980; Damasio 1998; Saffran 2000; Alexander 2003), a syndrome characterized by deficits in phonological retrieval, and with phonological agraphia, an impairment in mapping from sublexical phonological to grapheme representations (Roeltgen et al. 1983; Alexander et al. 1992). In addition, a recent meta-analysis of functional neuroimaging studies of dyslexia found consistent underactivation for impaired compared with unimpaired readers in posterior middle temporal, superior temporal, and supramarginal gyri (Richlan et al. 2009), areas largely overlapping those shown in Figure 2 for reading words of decreasing bigram frequency. Together, these converging lines of evidence suggest a central role for these areas in computing orthography–phonology correspondences.
The overall pattern that emerges across these findings is schematically summarized in Figure 4. Areas in blue are implicated in the direct mapping from orthography to phonology; areas in yellow and red reflect activation of semantic codes from orthography, and areas in green may constitute a common system supporting general attention, working memory, and executive processes. Note, however, that these colors represent somewhat loose groupings that may not share exactly the same information processing roles. For example, we propose that the inferior temporal area shown in yellow on the lateral and ventral surfaces plays a stronger role than angular gyrus in mapping from semantics to phonology for the purpose of generating a phonological code. This interpretation is compatible with results from a meta-analysis of word production studies by Indefrey and Levelt (2004), in which they suggest that the transition from lexical–semantic to phonological processing occurs along the MTG. Activation in the angular gyrus and precuneus/posterior cingulate regions (red in Fig. 4), on the other hand, may reflect incidental activation of semantic representations for words for which more semantic information happens to be available (e.g., high-frequency and/or high-imageability words). This distinction is supported by the fact that the MTG/ITS, but not the angular gyrus or precuneus/posterior cingulate, was modulated by decreasing spelling–sound consistency, suggesting that the MTG/ITS plays a more central role in the task of generating phonology.
Additional support for this distinction comes from studies of neurodegenerative disorders. Patients with semantic dementia tend to exhibit surface dyslexia, giving regularized pronunciations of low-consistency words such as “sew” (pronounced to rhyme with “dew”; Patterson and Hodges 1992). Such patients have primarily anterior temporal lobe damage (Nestor et al. 2006; Brambati et al. 2009; Wilson et al. 2009), though their damage appears to extend posteriorly to include the area of inferior temporal activity seen here for words of decreasing consistency. The association of anterior and inferior temporal lobe damage in semantic dementia with surface dyslexia is highly reliable, with the severity of surface dyslexia increasing with degree of overall semantic impairment (Woollams et al. 2007).
By comparison, patients with Alzheimer disease (AD) show widespread pathology in temporal and parietal areas that prominently include the medial temporal lobe, posterior cingulate/precuneus, and lateral posterior temporo-parietal regions (Arnold et al. 1991), largely sparing ventral and lateral anterior temporal regions (Buckner et al. 2005). Relative to patients with semantic dementia, AD patients show preserved reading aloud of low-consistency words (Noble et al. 2000), at least until later stages of impairment (Strain et al. 1998). Instead, AD patients show impairment on a range of tasks that may be more related to semantic feature knowledge than the mapping of semantics to phonology. For example, in a study examining feature knowledge and priming effects, AD patients produced a larger proportion of shared compared with distinctive features describing concrete concepts such as “apple,” “horse,” and “chair,” compared with age-matched controls (Alathari et al. 2004). The relative loss of distinct compared with shared features has been invoked to account for hyperpriming effects seen in AD (Martin 1992). In lexical decision, for example, although overall performance is impaired compared with age-matched controls, AD patients show better performance than age-matched controls when a target such as “illness” is preceded by a related prime such as “doctor” (Chertkow et al. 1989; Giffard et al. 2001). This effect obtains across multiple levels of relatedness (Alathari et al. 2004). In addition, AD patients often show category-related semantic deficits, with a relatively greater impairment in naming biological compared with nonbiological entities that becomes more discrepant with increasing degree of overall naming impairment (Whatmough et al. 2003).
Finally, although semantic and phonemic fluency tasks presumably place similar demands on executive control processes, AD patients show greater impairment on semantic compared with phonemic fluency tasks (for a review and meta-analysis see Salmon et al. 1999; Henry et al. 2004). Thus, the current results, in addition to studies of neurodegenerative disorders, suggest that during reading the semantic system may be functionally segregated, with the angular gyrus and precuneus/posterior cingulate supporting semantic feature knowledge, and the left MTG/ITS utilizing semantic information for mapping orthography to phonology.
Two potential limitations of this study warrant mention. One has to do with the choice of which variables to examine, the other with stimulus selection. Decorrelating 6 psycholinguistic variables so that their effects can be examined separately and simultaneously is more than has been done in most functional neuroimaging studies. More or different variables could have been chosen, though it is difficult to see how many more could have been decorrelated from the others while still maintaining a sufficiently large and representative stimulus set. An example of one variable of theoretical interest for word recognition that was left unexamined is orthographic neighborhood size. Defined in terms of Coltheart's N (Coltheart et al. 1977), for our stimuli, this variable is significantly correlated with both letter length (r = −0.595, P < 0.0001) and bigram frequency (r = 0.476, P < 0.0001). Although the finding of positive correlation of BOLD signal with letter length in occipital areas fits with previous reports (Mechelli et al. 2000; Wydell et al. 2003), the negative correlation in the left parahippocampal gyrus was unexpected and could potentially be related to increasing orthographic neighborhood size.
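For readers unfamiliar with orthographic neighborhood size, the following sketch shows one way to compute it. It assumes the conventional definition of Coltheart's N: the number of words of the same length that differ from the target by exactly one letter, with letter positions preserved. The function name `coltheart_n` and the toy word list are hypothetical; actual values depend on the reference lexicon, which is not specified here.

```python
def coltheart_n(word, lexicon):
    """Count same-length words differing from `word` in exactly one position."""
    word = word.lower()
    # De-duplicate and normalize case so repeated entries are counted once.
    candidates = {w.lower() for w in lexicon}
    count = 0
    for other in candidates:
        if other == word or len(other) != len(word):
            continue
        mismatches = sum(a != b for a, b in zip(word, other))
        if mismatches == 1:
            count += 1
    return count


if __name__ == "__main__":
    toy_lexicon = ["cat", "cot", "cut", "car", "bat", "cart", "dog"]
    print(coltheart_n("cat", toy_lexicon))
```

Because longer words have fewer one-letter neighbors in most lexicons, a measure defined this way tends to fall as letter length rises, in line with the negative correlation with length reported above.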
A second limitation relates to stimulus selection. An advantage of studying single-word reading is that words can be selected in such a way as to tightly control the properties of the set. This introduces a potential disadvantage, however, in that the more highly selected the set, the greater risk that results obtained from that set will not hold for other, similarly selected sets of words. This concern, however, may not be great for the current stimulus set because the correlations among the 6 variables were not very high in the original corpus from which the current stimuli were drawn. Of the 15 pairings among the variables, only one had a correlation of 0.4, and none of the other pairings were correlated above 0.2 (Table 2). Thus, decorrelating these 6 variables may not have entailed a large degree of distortion relative to the original corpus.
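The count of 15 pairings follows from choosing 2 of the 6 variables, and the kind of check summarized in Table 2 can be sketched as below. This is an illustration under stated assumptions rather than the procedure actually used: the function names and the column layout (one column per variable, one row per word) are hypothetical, and no corpus data are reproduced here.

```python
from itertools import combinations

import numpy as np

VARIABLES = [
    "letter length",
    "word frequency",
    "consistency",
    "imageability",
    "bigram frequency",
    "biphone frequency",
]


def variable_pairs(names):
    """All unordered pairings of the named variables."""
    return list(combinations(names, 2))


def correlated_pairs(data, names, threshold):
    """Return (name_a, name_b, r) for pairs whose |Pearson r| exceeds threshold.

    `data` holds one row per word and one column per variable, with columns
    in the same order as `names`.
    """
    r = np.corrcoef(np.asarray(data, dtype=float), rowvar=False)
    flagged = []
    for i, j in combinations(range(len(names)), 2):
        if abs(r[i, j]) > threshold:
            flagged.append((names[i], names[j], float(r[i, j])))
    return flagged
```

Calling `len(variable_pairs(VARIABLES))` gives 15, matching the number of pairings cited above; `correlated_pairs` with a threshold of 0.2 would list the pairings that exceed that value in a given corpus.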
This study was designed to examine separate and overlapping effects of multiple factors related to reading aloud. Our goal was to reveal neural substrates for computing phonology from orthography, a process that may involve semantic access to varying degrees depending on the type of stimulus. Although this approach to some extent incorporates the basic assumptions of the parallel distributed processing (PDP) approach to models of word reading, rather than being specifically designed to test competing models, some of our findings may help adjudicate among existing accounts. One major difference between extant models of reading aloud is the relative importance of semantics. Dual-route models posit 2 distinct pathways, a sublexical pathway optimized for processing words that follow grapheme–phoneme correspondence rules and a lexical pathway optimized for processing words whose grapheme–phoneme correspondences represent exceptions to the rules. Notably, the lexical pathway in dual-route models does not include an implementation of semantics (Coltheart et al. 1993, 2001). PDP models, on the other hand, posit that the same basic computational principles obtain throughout the reading system and predict some role for semantics, although to varying degrees, in reading all words (Seidenberg and McClelland 1989; Plaut et al. 1996; Harm and Seidenberg 2004). Our results support the latter class of models in that highly familiar and imageable words were associated with activation in areas prominently implicated in lexical semantics. More importantly, low-consistency words, which PDP models claim benefit from semantic activation, recruited left MTG/ITS, an area demonstrated in numerous studies to be implicated in lexical–semantic processing (Binder et al. 2009). Hence, the current study provides evidence of at least some role for semantics in reading aloud, consistent with PDP accounts of reading.
This work was supported by National Institutes of Health grants from the National Institute of Neurological Disorders and Stroke to J.R.B. (R01 NS033576) and the Eunice Kennedy Shriver National Institute of Child Health and Human Development to W.W.G. (F32 HD056767).
We thank David A. Medler, PhD, for valuable advice in the planning stages of this project and for providing the composite imageability database. We also thank Liya Assefa and Edward Possing for indispensable help with extracting reaction times from the audio recordings.
Conflict of Interest: None declared.