Studies of skilled reading (Price & Mechelli, 2005), its acquisition in children (Shaywitz et al., 2002; Turkeltaub, Gareau, Flowers, Zeffiro, & Eden, 2003), and its impairment in patients with pure alexia (Leff et al., 2001), all highlight the importance of the left posterior fusiform cortex in visual word recognition. We used visual masked priming and fMRI to elucidate the specific functional contribution of this region to reading and found that: i) unlike words, repetition of pseudowords (“solst-solst”) did not produce a neural priming effect in this region, ii) orthographically related words such as “corner-corn” did produce a neural priming effect, but iii) this orthographic priming effect was reduced when prime-target pairs were semantically related (“teacher-teach”). These findings conflict with the notion of stored visual word forms and instead suggest that this region acts as an interface between visual form information and higher-order stimulus properties such as its associated sound and meaning. Importantly, this function is not specific to reading, but is also engaged when processing any meaningful visual stimulus.
Functional neuroimaging studies have identified a left lateralized occipito-temporal region consistently engaged by word reading but outside of the classic neurological model of reading (Price, 2000). Activation in this area is typically centered on the posterior occipitotemporal sulcus and spreads medially and laterally onto adjacent fusiform and ventral inferior temporal gyri, respectively. Because it occurs quickly after stimulus presentation (approximately 150–200 msec post-onset) (L. Cohen et al., 2000; Nobre, Allison, & McCarthy, 1994; Tarkiainen, Helenius, Hansen, Cornelissen, & Salmelin, 1999) and is unaffected by the font, case, and visual hemifield of presentation, Cohen and colleagues consider it the first stage of abstract orthographic processing and refer to the region as the “visual word form area” (L. Cohen et al., 2000; L. Cohen et al., 2002; Dehaene, Le Clec’H, Poline, Bihan, & Cohen, 2002; Dehaene et al., 2001). In addition, both real words and pseudowords (that is, pronounceable letter strings that do not form a valid word, such as “melk”) activate this region relative to consonant letter strings, false fonts, or simple fixation (Fiez & Petersen, 1998; Nobre et al., 1994; Polk et al., 2002; Price, Wise, & Frackowiak, 1996; Rumsey et al., 1997). This suggests that the stored visual information is pre-lexical (L. Cohen et al., 2000; L. Cohen et al., 2002). In other words, by this account the left posterior occipito-temporal region stores combinations of letters which adhere to the orthographic regularities of the language, such as bigrams (Dehaene, Cohen, Sigman, & Vinckier, 2005). The letter string “pl,” for instance, is a pre-lexical representation that would be activated equally by “plane,” “apple,” “plint,” and “taple.” According to this pre-lexical visual word form hypothesis, abstract combinations of letters are stored in this region.
A slightly different account was put forward by Kronbichler et al. (2004), who argued that the region acts as an orthographic lexicon that stores lexical, rather than pre-lexical, representations. Evidence for this claim came from an fMRI study that found an inverse relation between activation in the left posterior fusiform cortex and the frequency of the written word in print. The authors argued that the stored visual representation must correspond to whole words because word frequency is a property of the whole word rather than its component letters or bigrams. By this lexical visual word form (VWF) hypothesis, pseudowords partially activate a cohort of word representations due to overlapping orthography. That is, “plint” partially activates lexical representations for “pint,” “lint,” “plane,” etc., thereby activating the region. Although this second hypothesis differs in its emphasis on lexical, rather than pre-lexical, representations, both visual word form hypotheses agree that abstract visual word forms of some kind are stored in this posterior fusiform region.
Here we propose an alternate account in which the left posterior fusiform cortex acts as an interface between abstract visual form information and higher order properties of the stimulus such as its associated sound and meaning. In reading, subtle visual differences often indicate dramatic differences in meaning. Consider “acne” and “acre,” where a small difference in visual form separates a skin condition from a measure of land. In this example, the extension of a single vertical line is crucial to identifying the word correctly; moreover, similar visual subtleties are not limited to Roman scripts, but are found in all written languages, be they alphabetic, syllabic, or logographic (Figure 1). Thus, reading requires linking very fine-grained visual form processing with higher order properties of the stimulus such as its associated meaning or sound pattern in order to uniquely identify the word. We therefore hypothesize that the left posterior fusiform gyrus integrates abstract visual form information with these higher order properties, in order to respond appropriately to a stimulus (cf. Price & Friston, 2005).
To summarize, there are three accounts of left posterior fusiform involvement in reading: two visual word form (VWF) hypotheses in which either pre-lexical or lexical letter-strings are stored in the region, and a third account in which the region acts as an interface between visual form information and higher order properties of the stimulus. Each of these generates distinct predictions that can be tested using priming and functional magnetic resonance imaging (fMRI). Previous studies have shown that at a neural level both repetition (Dehaene et al., 2001; Vuilleumier, Henson, Driver, & Dolan, 2002) and semantic priming (Kotz, Cappa, von Cramon, & Friederici, 2002; Mummery, Shallice, & Price, 1999) tend to manifest as a reduction in activation, possibly due to habituation of neuronal responses (Desimone, 1996). Consequently, by manipulating the relation between the prime and target word, we can investigate sensitivity to pre-lexical, lexical and conceptual relations in order to distinguish between accounts:
If pre-lexical letter combinations are stored in this region, then pseudowords should show repetition-induced decreases in activation similar to those seen in repetition priming for words. In both cases the priming mechanism is the same, namely shared letter strings. In contrast, neither of the other two hypotheses predicts a neural priming effect. According to the lexical VWF hypothesis, pseudowords have no lexical entries and therefore there is nothing to prime. By the visual interface hypothesis, repeated pseudowords facilitate both visual form processing and access to phonological form, but an unsuccessful search for target meaning would undo these processing benefits. In other words, we assume pseudowords are processed as if they were words and this includes semantic processing in order to associate a meaning with the word-like stimulus (Price et al., 1996; Rumsey et al., 1997). It is the additional processing demands of this failed semantic search that explain why pseudoword repetition does not benefit processing in the left posterior fusiform gyrus.
The lexical VWF hypothesis predicts that orthographically similar but distinct lexical items, such as “corner-corn,” should not lead to reduced activation because they have separate lexical representations. In fact, there may even be competition between the visually similar word forms, which would be expected to increase activation due to greater processing demands. In contrast, the other hypotheses predict a clear reduction in activation for semantically unrelated word pairs that share visual form.
Pairs that also share meaning (e.g. “teacher-teach”) differentiate between the pre-lexical VWF and visual interface hypotheses because only the latter predicts a modulation of the neural priming effect. More specifically, shared meaning makes the prime and target words more difficult to differentiate semantically, increasing the processing demands during integration. This results in a smaller reduction in activation for related, relative to unrelated, orthographic pairs. In contrast, the pre-lexical VWF account predicts that shared meaning should not influence the priming effect since the stored representations do not have associated meanings.
We used a visual masked priming paradigm and fMRI to evaluate these competing hypotheses. Masked priming ensured that all priming effects were the result of automatic, unconscious processes rather than strategic processing adopted by the participants (Dehaene et al., 2001; Forster & Davis, 1984). Crucially these predictions are specific to the left posterior fusiform cortex, so the analysis focused solely on this region.
Twelve healthy volunteers (5F, 7M), all native speakers of British English, participated in this study. Ages ranged from 18 to 25 years (mean = 21) and all were strongly right handed, as assessed with the Edinburgh handedness inventory. Participants were briefed on scanner safety and gave written consent before taking part. Ethical approval was granted by the Oxford Research Ethics Committee.
In the main experiment, participants saw a series of letter strings presented one at a time on a computer screen and decided whether each was an actual English word or not (i.e. made a lexical decision). A trial began with a fixation point presented for 1 sec followed by a visual mask of meaningless symbols presented for 500 msec. This was followed immediately by a prime (in lower case) for 33 msec, which was then replaced by the target string in upper case (see Fig. 3a). By appearing immediately after the prime (inter-stimulus interval of 0 msec) and in a different case, the target acted as a backward mask for the prime (cf. Forster & Davis, 1984). Participants indicated whether the target was an actual English word or not by a button press. The next trial did not begin until the subject had responded or 1500 msec had elapsed. Response times (RT) and accuracy were recorded. There was a short practice session before scanning for subjects to become familiar with the task. None of the items used outside of the scanner in practice or pre-testing was repeated during scanning.
There were a total of 280 trials, half of which had word targets. These were divided into five word conditions and three non-word conditions, each consisting of 28 word-pairs, except consonant letter strings (Table 1). The first four conditions manipulated Lexicality (words, pseudowords) and Repetition (repeated, unrelated) to determine whether words and pseudowords yield equivalent priming effects: 1) Unrelated pairs (e.g. “event-RUG”) shared neither form nor meaning and served as the baseline for evaluating the word priming effects. 2) Repeated words had identical orthography and meaning (e.g. “plant-PLANT”). 3) Pseudowords were orthographically legal, pronounceable non-word targets (e.g. “dollar-TAVE”). 4) Repeated pseudowords such as “solst-SOLST” were used to evaluate repetition priming effects on pseudowords. In this design, all primes except those in the repeated pseudoword condition were words to avoid biasing responses based solely on the prime, consistent with previous behavioural studies (Boudelaa & Marslen-Wilson, 2001; Forster & Veres, 1998; Pastizzo & Feldman, 2002).
Next, Form (±orthography) and Meaning (±semantics) relations between primes and targets were manipulated in conditions 1, 5–7 to test whether words sharing visual form (i.e., overlapping stems) also produced neural priming effects and whether these were modulated by the semantic relationship between the words: 5) Orthographically overlapping but unrelated pairs had identical orthography except for an additional segment on the prime (e.g., “corner-CORN”). 6) Orthographically overlapping and conceptually related pairs (e.g. “teacher-TEACH”) also shared a stem but were similar in meaning. 7) Semantically related pairs were synonyms with no orthographic similarity (e.g. “notion-IDEA”).
Finally, a baseline condition was included for evaluating word and pseudoword reading: 8) Unpronounceable consonant letter string targets such as “donkey-NKLX.” This condition included 84 items to equate the number of word and non-word targets while keeping the number of trials in the main experimental conditions (1–7) constant. In other words, the experiment comprised a low level baseline and two 2 × 2 manipulations which shared the Unrelated condition (#1).
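As a quick consistency check on the design described above, the trial counts can be tallied in a few lines. The sketch below is illustrative only: the condition labels are shorthand invented here, not names used in the study, and the counts are taken directly from the text.

```python
# Trial counts per condition as described in the text.
# The dictionary keys are shorthand labels chosen for this sketch (assumption).
CONDITIONS = {
    "unrelated_word":      {"n": 28, "target": "word"},     # 1) event-RUG
    "repeated_word":       {"n": 28, "target": "word"},     # 2) plant-PLANT
    "pseudoword":          {"n": 28, "target": "nonword"},  # 3) dollar-TAVE
    "repeated_pseudoword": {"n": 28, "target": "nonword"},  # 4) solst-SOLST
    "form_only":           {"n": 28, "target": "word"},     # 5) corner-CORN
    "form_and_meaning":    {"n": 28, "target": "word"},     # 6) teacher-TEACH
    "meaning_only":        {"n": 28, "target": "word"},     # 7) notion-IDEA
    "consonant_string":    {"n": 84, "target": "nonword"},  # 8) donkey-NKLX
}


def count_targets(kind):
    """Total number of trials whose target is of the given kind."""
    return sum(c["n"] for c in CONDITIONS.values() if c["target"] == kind)


total_trials = sum(c["n"] for c in CONDITIONS.values())
word_trials = count_targets("word")
nonword_trials = count_targets("nonword")

# The two 2 x 2 manipulations and the baseline cell they share.
LEXICALITY_BY_REPETITION = {
    "unrelated_word", "repeated_word", "pseudoword", "repeated_pseudoword",
}
FORM_BY_MEANING = {
    "unrelated_word", "form_only", "form_and_meaning", "meaning_only",
}
shared = LEXICALITY_BY_REPETITION & FORM_BY_MEANING

print(total_trials, word_trials, nonword_trials)  # 280 140 140
print(shared)  # {'unrelated_word'}
```

The 84 consonant-string trials are what bring the non-word total up to 140, matching the number of word targets.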
All word stimuli were matched across conditions for rated familiarity (Mean ± SD, 530 ± 63, F=2.2, p=0.06) and imageability (482 ± 99, F=1.1, p=0.34) based on the MRC Psycholinguistic database (M. Coltheart, 1981). In addition, written word frequencies in British usage were matched across word conditions (88 ± 212; F=1.1, p=0.34) based on values per million in the Celex database (Baayen & Piepenbrock, 1995). Finally, the number of syllables (1.8 ± 0.8, F=0.1, p=0.94) and number of letters (5.7 ± 1.9, F=0.4, p=0.72) were matched across all word conditions except the repetition condition, where these values were significantly smaller (syllables = 1.1 ± 0.4, letters = 4.4 ± 0.6) because the primes did not include an additional segment, as was present in the other four conditions. Non-word items matched lexical trials in letter length.
During scanning, items were presented in two runs to prevent fatigue and their order was counter-balanced across subjects. Within each run, the order of presentation was pseudo-randomized in an event-related design, with the constraint that transition frequencies between conditions were equated. The inter-trial interval varied according to subjects’ response times and thus led to a “jittered” sampling of the haemodynamic response (Josephs & Henson, 1999). To verify this, peri-stimulus data acquisition times per condition were computed for all trials across conditions and subjects and statistically compared for differences in distributions. Despite the large number of pair-wise comparisons (n=28), no difference even approached significance (all p>0.2), thus ensuring an unbiased sampling of the HRF across conditions.
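The two quantities mentioned in this paragraph can be illustrated with a short sketch. It assumes, for illustration only, that stimulus onsets are timed from the start of the first retained volume and that volumes begin at exact multiples of the TR; the onset values are invented. The number of pair-wise comparisons follows from the eight conditions.

```python
from itertools import combinations

TR_S = 3.0        # repetition time in seconds, from the acquisition parameters
N_CONDITIONS = 8  # eight conditions in the design


def peristimulus_offset(onset_s, tr_s=TR_S):
    """Time elapsed since the start of the current volume at stimulus onset.

    Assumes volumes start at 0, TR, 2*TR, ... on the same clock as the onsets.
    """
    return onset_s % tr_s


# Every unordered pair of conditions: 8 choose 2.
pairs = list(combinations(range(1, N_CONDITIONS + 1), 2))
print(len(pairs))  # 28

# Hypothetical onsets (seconds); response-dependent trial lengths spread the
# offsets across the TR, so the haemodynamic response is sampled at many phases.
onsets = [0.0, 2.5, 5.25, 9.5]
offsets = [peristimulus_offset(t) for t in onsets]
print(offsets)  # [0.0, 2.5, 2.25, 0.5]
```

Comparing the distributions of such offsets between each pair of conditions is what yields the 28 tests reported above; the specific test used is not stated in the text.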
All scans were carried out using the Varian-Siemens 3T scanner at the Centre for Functional Magnetic Resonance Imaging of the Brain in Oxford. A Magnex head-dedicated gradient insert coil was used in conjunction with a birdcage head radio frequency coil tuned to 127.4 MHz. Functional imaging consisted of 21 T2*-weighted echo-planar image (EPI) slices (TR = 3 sec, TE = 30 msec, FOV = 192 × 256 mm, matrix = 64 × 64) giving a notional 3 × 4 × 5 mm resolution. An automated shimming algorithm was used to reduce magnetic field inhomogeneities (Wilson et al., 2002). For anatomical localisation purposes, a T1-weighted scan was acquired (3D Turbo FLASH sequence, TR = 15 msec, TE = 6.9 msec) with 1 mm² in-plane resolution and 1.5 mm slice thickness.
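The notional in-plane resolution follows directly from the field of view and matrix size. A minimal sketch of the arithmetic is given below; the 5 mm slice thickness is read off the stated 3 × 4 × 5 mm resolution rather than given as a separate parameter in the text.

```python
# Acquisition parameters from the text.
FOV_MM = (192.0, 256.0)    # field of view along the two in-plane axes
MATRIX = (64, 64)          # acquisition matrix
SLICE_THICKNESS_MM = 5.0   # inferred from the stated 3 x 4 x 5 mm resolution

# In-plane voxel size is field of view divided by matrix size on each axis.
in_plane_mm = tuple(fov / n for fov, n in zip(FOV_MM, MATRIX))
voxel_mm = in_plane_mm + (SLICE_THICKNESS_MM,)

voxel_volume_mm3 = voxel_mm[0] * voxel_mm[1] * voxel_mm[2]

print(voxel_mm)          # (3.0, 4.0, 5.0)
print(voxel_volume_mm3)  # 60.0
```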
Reaction times (RTs) were measured from the onset of the target string. To minimise the effect of outliers in the RT data, the median RT for correct responses was calculated per condition per subject for use in the statistical analyses. One subject had RTs approximately 200 msec greater than the group mean and was therefore removed from both the behavioural and functional image analyses. In addition, three (out of 280) word-pairs were removed because accuracy on these trials was at chance.
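The RT summary described above (median of correct-trial RTs per condition per subject) can be sketched as follows. The trial records, field names, and values are hypothetical and serve only to show why the median is less sensitive to a single slow response than the mean.

```python
from collections import defaultdict
from statistics import mean, median


def median_rt_per_cell(trials):
    """Median RT of correct responses for each (subject, condition) cell."""
    cells = defaultdict(list)
    for trial in trials:
        if trial["correct"]:  # incorrect trials are excluded
            cells[(trial["subject"], trial["condition"])].append(trial["rt_ms"])
    return {cell: median(rts) for cell, rts in cells.items()}


# Hypothetical trials for one subject in one condition (not real data).
trials = [
    {"subject": "s01", "condition": "repeated_word", "rt_ms": 520, "correct": True},
    {"subject": "s01", "condition": "repeated_word", "rt_ms": 540, "correct": True},
    {"subject": "s01", "condition": "repeated_word", "rt_ms": 1400, "correct": True},
    {"subject": "s01", "condition": "repeated_word", "rt_ms": 480, "correct": False},
]

medians = median_rt_per_cell(trials)
print(medians[("s01", "repeated_word")])  # 540
print(mean([520, 540, 1400]))             # 820, pulled up by the slow trial
```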
After removing the first 4 images of each session to allow for T1 equilibrium, functional images were realigned (Jenkinson, Bannister, Brady, & Smith, 2002) using the FSL software (http://www.fmrib.ox.ac.uk/fsl) in order to correct for small head movements. No participant moved more than 1.5 mm in any direction and rotations were less than 1.3°. Functional images were registered to the participant’s structural scan and then to the MNI 152-mean brain using an affine procedure (Jenkinson & Smith, 2001). Finally, each image was smoothed with a 5 mm full-width half-maximum Gaussian filter. The FSL software was used to compute individual subject analyses in which the time series were pre-whitened to remove temporal autocorrelation (Woolrich, Ripley, Brady, & Smith, 2001). Each of the eight conditions was modelled separately and only included correct trials. Incorrect trials, temporal derivatives, and estimated motion parameters were included as covariates of no interest to increase statistical sensitivity. Random effects group analyses identified significant priming effects as reductions in BOLD signal relative to the appropriate baseline condition in our region-of-interest, namely the left posterior fusiform gyrus. This was defined as a sphere with a 1 cm radius centred on (−42, −57, −15), the peak coordinates of Cohen et al.’s (2002) “visual word form area”. This ROI was used to identify the precise region of the left posterior fusiform cortex engaged by word and pseudoword reading. We calculated the voxel-level height threshold (Z>3.0) corresponding to p<0.05 corrected for multiple comparisons within this ROI (Worsley et al., 1996). Subsequent analyses were based on the mean percent BOLD signal change within the voxels commonly activated by both word and pseudoword reading.
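Two of the steps above have simple geometric definitions that can be sketched directly: membership in the spherical ROI and the Gaussian kernel width implied by a 5 mm FWHM. The sketch below is not the FSL implementation. It assumes, for illustration, a standard-space grid with 2 mm spacing whose voxel centres lie on even millimetre coordinates; the function names are invented here.

```python
import math

ROI_CENTRE_MM = (-42, -57, -15)  # sphere centre, from the text
ROI_RADIUS_MM = 10.0             # 1 cm radius


def in_roi(xyz_mm, centre=ROI_CENTRE_MM, radius=ROI_RADIUS_MM):
    """True if a coordinate lies within the spherical ROI."""
    return math.dist(xyz_mm, centre) <= radius


def fwhm_to_sigma(fwhm_mm):
    """Standard deviation of a Gaussian with the given full width at half maximum."""
    return fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def roi_voxels(step_mm=2):
    """Grid points inside the ROI, assuming voxel centres at multiples of step_mm."""
    radius = int(ROI_RADIUS_MM)

    def axis(centre):
        low = centre - radius
        start = low + (-low % step_mm)  # first multiple of step_mm >= low
        return range(start, centre + radius + 1, step_mm)

    cx, cy, cz = ROI_CENTRE_MM
    return [
        (x, y, z)
        for x in axis(cx)
        for y in axis(cy)
        for z in axis(cz)
        if in_roi((x, y, z))
    ]


print(round(fwhm_to_sigma(5.0), 3))  # 2.123 (mm)
print(in_roi((-42, -60, -18)))       # True: about 4.2 mm from the centre
print(len(roi_voxels()))             # number of 2 mm grid points in the sphere
```

Restricting the multiple-comparisons correction to the voxels returned by a mask of this kind is what makes the Z>3.0 threshold specific to the ROI rather than to the whole brain.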
Before the fMRI experiment began, two preliminary experiments were conducted to verify that visually masked words were not consciously recognisable. Words were presented on a computer screen for either 33 or 200 msec and were forward and backward masked with meaningless symbol strings. In the first task, participants were asked to match the presented word to one of two choices and guess if uncertain. In the second, they read the words aloud as accurately as possible. The results are summarised in Figure 2. In both tasks, performance was at ceiling when words were presented for 200 msec. By contrast, words presented for only 33 msec were very difficult to report. Only 4/220 words (1.7%) were read aloud successfully, and accuracy in the matching task (52.9%) was not significantly different from chance (binomial test, p=.98). These results confirm that forward and backward masked words presented for only 33 msec were not consciously perceived by the participants, even when the task specifically required participants to attend to them. This suggests that the masked primes in the main lexical decision experiment were not visible to participants, and this was confirmed in post-hoc de-briefings. Some subjects reported “occasionally seeing something flash up” before the target, but none recognised these as words.
The accuracy and RT results of the main experiment are displayed in Figure 3, where error bars indicate the standard error of the mean. Overall, accuracy levels were very high, indicating that participants had no trouble performing the task. Accuracy was analysed with two 2 × 2 repeated measures analyses of variance (ANOVA). The first examined the effects of Lexicality (words vs. pseudowords) and Repetition (repeated vs. unrelated) and found a significant main effect of Lexicality (F(1,10)=11.7, p<0.01) and a significant Lexicality × Repetition interaction (F(1,10)=11.3, p<0.01). The main effect of Repetition did not reach significance (F(1,10)=2.6, p>0.1). In other words, accuracy was higher for words than pseudowords in general, and repetition improved accuracy in the pseudoword, but not the word, condition. The second ANOVA examined the effects of Form (±orthography) and Meaning (±semantics). Neither main effect reached significance (Form: F(1,10)=3.6, p=0.09; Meaning: F(1,10)=1.5, p>0.1) but the interaction was significant (F(1,10)=5.8, p<0.05), indicating that subjects made more errors specifically in the Orthographic overlap conditions (e.g. “corner-CORN”).
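In a fully within-subject 2 × 2 design, each effect has a single degree of freedom, so its F statistic equals the square of a one-sample t statistic computed on the corresponding per-subject contrast, with n − 1 degrees of freedom (10 here, since 11 participants remained after one exclusion). The sketch below illustrates this for the interaction term. It is not the analysis code used in the study, and the accuracy values are invented.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(values):
    """t statistic and degrees of freedom for testing mean(values) against zero."""
    n = len(values)
    return mean(values) / (stdev(values) / sqrt(n)), n - 1


def interaction_contrast(a1b1, a1b2, a2b1, a2b2):
    """Per-subject difference of differences for a 2 x 2 within-subject design."""
    return [(p - q) - (r - s) for p, q, r, s in zip(a1b1, a1b2, a2b1, a2b2)]


# Hypothetical proportion-correct scores for 11 subjects (not the study's data).
word_repeated = [0.96, 0.96, 0.93, 0.93, 0.96, 0.96, 0.96, 0.96, 0.93, 1.00, 0.93]
word_unrelated = [0.96, 0.93, 0.96, 0.89, 0.96, 0.93, 1.00, 0.96, 0.93, 0.96, 0.89]
pseudo_repeated = [0.93, 0.89, 0.93, 0.86, 0.89, 0.93, 0.89, 0.93, 0.86, 0.93, 0.86]
pseudo_unrelated = [0.86, 0.82, 0.89, 0.79, 0.86, 0.89, 0.82, 0.86, 0.79, 0.89, 0.82]

contrast = interaction_contrast(
    word_repeated, word_unrelated, pseudo_repeated, pseudo_unrelated
)
t_value, df = one_sample_t(contrast)
f_value = t_value ** 2  # F(1, df) for the Lexicality x Repetition interaction

print(df)  # 10
print(f"F(1,{df}) = {f_value:.2f}")
```

The same approach gives the main effects by replacing the difference of differences with the appropriate sum or average of cells.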
Reaction times (RTs) were analysed in an identical fashion. The first ANOVA examined the effects of Lexicality (words vs. pseudowords) and Repetition (repeated vs. unrelated) and found a highly significant main effect of Lexicality (F(1,10)=111.8, p<0.001), indicating that participants responded more quickly to words than pseudowords. There was no main effect of Repetition (F(1,10)=2.4, n.s.) and the interaction showed a trend towards significance (F(1,10)=4.5, p=0.06). Planned comparisons showed a significant 30 msec facilitation for repetition priming of words (t(10)=2.9, p<0.05) but a non-significant 2 msec inhibition for pseudowords (t(10)=0.2, n.s.). Although there was no facilitation of RTs for repeated pseudowords, there was a significant improvement in accuracy, suggestive of a possible speed-accuracy trade-off for pseudoword repetition priming.
A second ANOVA tested the effects of Form (±orthography) and Meaning (±semantics) on RTs. There was a significant main effect of Form (F(1,10)=8.2, p<0.05) with no effect of Meaning (F(1,10)=1.7, n.s.) and no significant interaction (F(1,10)=0.9, n.s.). Planned comparisons indicated that pairs with overlapping orthography but different meanings such as “corner-CORN” produced a mean 24 msec facilitation (t(10)=2.7, p<0.05), while pairs sharing both form and meaning (e.g. “teacher-TEACH”) produced a mean 25 msec facilitation (t(10)=3.0, p<0.05). There was no significant priming for semantically related but visually unrelated pairs such as “notion-IDEA” (13 msec facilitation, t(10)=1.5, n.s.). This comparison, therefore, showed that only word pairs with overlapping orthography produced reliable behavioural priming effects.
The functional imaging analyses began by identifying the specific region of the left posterior fusiform gyrus engaged in both word and pseudoword reading by separately comparing unrelated words and pseudowords to the consonant letter string baseline (Figure 4). The peak activation for words relative to consonant letter strings is shown in red and was located in the left occipitotemporal sulcus (−42, −60, −18; Z=3.6, p<0.05 corrected for multiple comparisons within the ROI) and extended both medially onto the convexity of the posterior fusiform gyrus and laterally onto the inferior temporal gyrus. The peak voxel for the pseudoword comparison was located on the left inferior temporal gyrus (−44, −54, −16; Z=3.7, p<0.05) and is shown in orange. Although the activation for pseudowords was more anterior and lateral to the word activation, the two clusters overlapped extensively (shown in yellow). The region of overlap consisted of 42 voxels (336 mm³ volume) with a centre-of-gravity at (−42, −58, −16). Within the region of overlap there was no significant difference in mean percent BOLD signal change between words and pseudowords (t(10)=0.4, n.s.). Subsequent analyses evaluated the functional characteristics of the region of activation common to word and pseudoword reading.
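The reported overlap volume and the relative positions of the two peaks can be checked with simple arithmetic. The sketch below assumes 2 mm isotropic voxels in standard space, which is not stated in the text but is what makes 42 voxels correspond to 336 mm³; coordinates follow the convention that more negative x is more lateral in the left hemisphere and larger y is more anterior.

```python
import math

# Peak coordinates (mm) from the text.
WORD_PEAK = (-42, -60, -18)
PSEUDOWORD_PEAK = (-44, -54, -16)

N_OVERLAP_VOXELS = 42
VOXEL_SIZE_MM = 2  # assumed isotropic standard-space voxel size

overlap_volume_mm3 = N_OVERLAP_VOXELS * VOXEL_SIZE_MM ** 3
peak_distance_mm = math.dist(WORD_PEAK, PSEUDOWORD_PEAK)

# Left hemisphere: larger |x| is more lateral; larger y is more anterior.
more_lateral = abs(PSEUDOWORD_PEAK[0]) > abs(WORD_PEAK[0])
more_anterior = PSEUDOWORD_PEAK[1] > WORD_PEAK[1]

print(overlap_volume_mm3)           # 336
print(round(peak_distance_mm, 2))   # 6.63
print(more_lateral, more_anterior)  # True True
```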
In the first analysis, the mean percent BOLD signal changes per condition per subject were computed and entered into a 2-way ANOVA testing the effects of Lexicality (words vs. pseudowords) and Repetition (repeated vs. unrelated). The main effect of Lexicality was not significant (F(1,10)=3.1, p>0.1), but there was a trend towards a main effect of Repetition (F(1,10)=4.4, p=0.06), driven primarily by a significant interaction (F(1,10)=4.7, p=0.05) indicating that words, but not pseudowords, led to a reduction in BOLD signal (Fig. 4b, top row). In fact, repetition priming for words led to a 67% reduction in BOLD signal relative to unrelated word pairs (t(10)=3.2, p<0.05) with a peak voxel at (−44, −62, −18). In contrast, repetition of pseudowords led to a 24% increase in BOLD signal, although this was not significantly different from pseudoword targets with unrelated primes (t(10)<1). Even within the original spherical ROI, no voxel showed a significant pseudoword repetition priming effect. These results are consistent with both the lexical VWF and visual interface hypotheses, which predict no neural priming for pseudowords, but conflict with the pre-lexical VWF hypothesis that predicts equivalent repetition priming effects for both words and pseudowords.
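Percentages such as those above express the primed response relative to the unrelated baseline. A minimal sketch of that calculation follows, assuming the reduction is computed as the difference between baseline and primed signal divided by the baseline; the text does not spell out the formula, and the signal values below are invented and unrelated to the reported results.

```python
def percent_reduction(baseline, primed):
    """Reduction in signal relative to baseline, in percent.

    Positive values indicate neural priming (less signal when primed);
    negative values indicate an increase in signal.
    """
    return 100.0 * (baseline - primed) / baseline


# Hypothetical mean percent BOLD signal changes (not the study's data).
print(round(percent_reduction(0.40, 0.30), 1))  # 25.0  (a reduction)
print(round(percent_reduction(0.20, 0.25), 1))  # -25.0 (an increase)
```

Because the measure is a ratio, it becomes unstable when the baseline response is close to zero, which is one reason the statistical tests are run on the signal changes themselves rather than on the percentages.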
A second analysis evaluated the effects of orthographic and semantic relatedness on activation in this left posterior fusiform region. Once again, mean percent BOLD signal change was entered into a 2-way ANOVA with Form (±orthography) and Meaning (±semantics) as independent factors. There was a significant main effect of Form (F(1,10)=8.8, p<0.05), which was qualified by a significant interaction (F(1,10)=6.3, p<0.05) indicating that although there was priming for all word pairs which shared visual form, the effect was reduced when they also shared meaning (e.g., “teacher-TEACH”). There was no significant main effect of Meaning (F(1,10)<1). These results are shown in Figure 4b (bottom row). Planned comparisons confirmed a significant neural priming effect for words sharing visual form but not meaning (t(10)=4.8, p<0.001) and a smaller effect for pairs sharing both form and meaning (t(10)=2.3, p=0.05). Words with similar meaning but distinct visual forms (e.g. “notion-IDEA”) did not show a reliable priming effect (t(10)=1.8, p>0.1).
These findings are inconsistent with the lexical VWF hypothesis, which predicts either no orthographic priming or an increase in BOLD signal due to competition between distinct lexical entries. The observation that orthographically related pairs led to a neural priming effect is, on the other hand, consistent with both the pre-lexical VWF and visual interface hypotheses. When visually related pairs also shared meaning (e.g. “teacher-TEACH”), the neural priming effect was reduced in magnitude. This modulation is difficult to explain in terms of pre-lexical letter-string representations (which carry no meaning), but it is predicted by the visual interface hypothesis. By this account, the shared visual form facilitates identification, thus reducing visual processing demands, although these effects are partially offset by the additional processing needed to distinguish between conceptually similar targets, thus reducing the size of the priming effect.
Consistent with previous studies, we have shown that a region of the left posterior fusiform cortex is engaged by both word and pseudoword reading relative to consonant letter strings (see Mechelli, Gorno-Tempini, & Price, 2003 for a review), and that the same region shows significant reductions in BOLD signal associated with case-independent repetition priming (Dehaene et al., 2004; Dehaene et al., 2001). These findings suggest that the area is engaged in processing abstract visual form information necessary for visual word identification and are consistent with the accepted notion that visual information becomes progressively less related to the specific features of retinal stimulation as it moves forward in the ventral visual stream. Ventral extrastriate regions compute abstract visual properties such as form (Grill-Spector & Malach, 2004), colour (Hadjikhani, Liu, Dale, Cavanagh, & Tootell, 1998; Wade, Brewer, Rieger, & Wandell, 2002), and depth (Neri, Bridge, & Heeger, 2004), although reading tends to rely primarily on form. To this level, all accounts of fusiform function agree. By extracting abstract visual form information, the region allows a visual representation to be mapped onto other aspects of the word such as its meaning (semantics) or sound (phonology). However, three new findings help to clarify the precise role of the left posterior fusiform cortex and thereby distinguish between competing explanations of fusiform involvement in reading. Specifically, we found that: i) unlike words, repeated pseudowords did not produce a neural priming effect in this region, ii) orthographically related words did produce a neural priming effect, but iii) this orthographic priming effect was reduced when the prime-target pairs were semantically related. Three theories of fusiform function in reading are considered in light of these findings.
According to the pre-lexical visual word form hypothesis, neurons in the left posterior fusiform region are tuned to sub-lexical combinations of letters that commonly co-occur in a written language (Dehaene et al., 2005; Dehaene et al., 2004). Visual word recognition occurs in a serially organized, staged approach starting with visual feature detectors in extrastriate cortex, proceeding through letter detectors and letter-cluster detectors in the posterior fusiform, and then activating lexical representations stored in more anterior multimodal fusiform areas. By this account one would expect repetition priming effects for both words and pseudowords, at least at the level of letter and letter-string detectors. However, only repeated words produced a neural priming effect (i.e. decreased BOLD signal) in the posterior fusiform, while repeated pseudowords led to a slight increase in signal. Interestingly, Fiebach and colleagues (2005) have also reported the same interaction in the left posterior fusiform, namely a neural priming effect for repeated words but no change in the haemodynamic responses for repeated pseudowords, even when participants were consciously aware of the primes. This lack of a priming effect for pseudowords is difficult to explain in terms of letter and letter-string detectors. In addition, the current study showed that the orthographic priming effect was modulated by the semantic relatedness of the pair. That is, words related in both form and meaning (e.g. “teacher-teach”) produced a significantly smaller neural priming effect than those which only shared form (e.g. “corner-corn”). Semantic modulation of the neural priming effect seems incompatible with the claim that pre-lexical letter combinations are stored in this region and poses a similar problem to that of previously reported word frequency effects (Kronbichler et al., 2004; Kuo et al., 2003). 
Together these findings question the adequacy of the pre-lexical VWF hypothesis as an explanation for posterior fusiform contributions to reading.
An alternate account suggests that only whole word patterns are stored in the left posterior fusiform cortex and that they serve as recognition units during reading (Kronbichler et al., 2004). By this hypothesis, frequency effects are easily explained as a property of the stored word (Morton, 1969) and pseudoword repetition priming is not expected because no neural representation of novel letter strings exists to be primed. However, the orthographic priming effects reported here are unexpected because word pairs such as “corner-corn” have independent lexical entries in memory. With no shared representation between them, there is nothing that can prime. Moreover, many theoretical accounts suggest that visual similarity increases competition between words and thus increases processing demands (M. Coltheart, Rastle, Perry, Langdon, & Ziegler, 2001; McClelland & Rumelhart, 1981). The observation that visually similar word pairs reduced BOLD signal in the posterior fusiform is therefore problematic for the lexical VWF hypothesis. In addition, psychophysical studies show that visual word recognition in skilled readers is considerably less efficient than expected if neurons responded to whole word patterns (Pelli, Farell, & Moore, 2003). These findings suggest that reading relies on independent detection of simple features rather than multi-letter features, in contrast to both the lexical and pre-lexical VWF hypotheses.
What, then, is the nature of fusiform involvement in reading? Anatomically, the posterior fusiform is part of inferotemporal cortex and thus sits atop a ventral processing hierarchy that extracts basic visual properties such as form and colour (DeYoe & Van Essen, 1988). Moreover, the specific region appears to be congruent with the visual field map VO-1, which responds most strongly to foveal presentation and has a less precise retinotopic map than earlier areas (Wandell, Brewer, & Dougherty, 2005). In other words, the region processes fine-grained visual form such as is necessary to rapidly distinguish between visually similar, but conceptually distinct, stimuli such as written words. Bottom-up projections from early visual cortices (Distler, Boussaoud, Desimone, & Ungerleider, 1993) provide simple feature information which is combined and temporarily instantiated as a pattern of activation over a neuronal population in the posterior fusiform. This transient representation is concurrently shaped by both bottom-up and top-down constraints, thereby integrating both visual and non-visual information. This non-visual information includes, but is not limited to, semantic and phonological aspects of the stimulus which arrive via bidirectional connections with more rostral temporal lobe structures (Catani, Jones, Donato, & Ffytche, 2003; Distler et al., 1993), including portions of the superior temporal sulcus involved in phonological processing (Belin, Zatorre, Lafaille, Ahad, & Pike, 2000; Scott, Blank, Rosen, & Wise, 2000) and lateral and inferior regions of the anterior temporal pole involved in semantic memory (Hodges, Graham, & Patterson, 1995; Vandenberghe, Price, Wise, Josephs, & Frackowiak, 1996).
More generally, the left posterior fusiform cortex is only one component of the neural system engaged by reading (Price et al., 2003; Price et al., 2005). Within this system, functional connectivity studies of reading show that BOLD signal in the left posterior fusiform gyrus is temporally coupled with other left hemisphere language areas including the inferior frontal gyrus (i.e. Broca’s area), middle and superior temporal gyri, and more anterior fusiform regions (Bitan et al., 2005; Bokde, Tagamets, Friedman, & Horwitz, 2001; Mechelli et al., 2005). Electrophysiological evidence further suggests that this processing is not serially staged, but concurrent (Martin, Nazir, Thierry, Paulignan, & Demonet, 2005; Pammer et al., 2004). In other words, the integration of visual form and non-visual associations occurs as a highly interactive process (McClelland & Rumelhart, 1981) rather than as a feed-forward step in a serial mapping of vision onto sound and meaning (L. Cohen et al., 2002). Visual word recognition is then a dynamic constraint-satisfaction process, integrating bottom-up visual constraints with top-down contextual constraints including meaning, phonotactics, and morpho-syntax. The engaged brain regions interact and collectively settle into a short-lived, but stable, distributed pattern of activation. In summary, we propose that the posterior fusiform cortex transiently instantiates a representation of a visual stimulus that interfaces between its invariant visual characteristics (e.g. form) and higher order properties of that stimulus. Note that nothing about this claim is specific to reading – a point we will return to shortly.
By this hypothesis, repetition of words and pseudowords leads to reduced activation because the invariant representation of the visual stimuli, namely the prime and the target, is the same despite differences in their physical characteristics. A similar facilitation is present when the prime and target share a stem (e.g. “corner-corn”). This facilitation, however, is reduced when the words also share meaning (e.g. “teacher-teach”) because of the additional semantic processing necessary to differentiate between the similar meanings. In other words, to recognize “teacher” correctly requires distinguishing it from the similar concept “teach” and this is more demanding than distinguishing the meaning of “corner” from “corn.” This explanation highlights the point that the posterior fusiform is primarily driven by visual information but that this can be influenced by non-visual factors. This was clearly demonstrated by the finding that shared visual form led to significant neural priming whereas shared meaning did not. These results are consistent with previous studies showing that visual, but not auditory, stimuli engage the region (Dehaene et al., 2002).
The main difference between this account and the two visual word form hypotheses is that our hypothesis is not specific to reading – any meaningful stimulus would be expected to engage these same processes. In this context, “meaningful” depends critically on the task. Pictures of nonsense objects, for instance, can be meaningful when associated with particular hand actions (Phillips, Humphreys, Noppeney, & Price, 2002; Price & Devlin, 2003) and pseudowords are meaningful in that they have an associated sound pattern. In other words, as long as the stimulus affords higher-order, non-visual properties that must be integrated with visual information, we would expect the left posterior fusiform to be engaged to some extent. Consistent with this, object recognition engages the same posterior fusiform region as reading (Ishai, Ungerleider, Martin, Schouten, & Haxby, 1999; Levy, Hasson, Avidan, Hendler, & Malach, 2001; McCandliss, Cohen, & Dehaene, 2003; Price & Devlin, 2003). In addition, repetition priming studies demonstrate that activation in the posterior fusiform cortex is reduced for repeated real-world objects, independent of their size or viewpoint, but repeated nonsense objects do not show a neural priming effect in this region (Henson, Shallice, & Dolan, 2000; Vuilleumier et al., 2002), similar to the current findings for real words vs. pseudowords. Recognizing a visually presented object or picture (e.g. a tiger) requires computing invariant attributes such as its form, colour, motion, depth, etc. and integrating this information with its name (“tiger”) and meaning (“a ferocious cat that lives in Asian forests”).
The fact that objects are typically associated with multiple visual attributes, while written words are distinguished almost exclusively by their form, may help to explain why objects activate the fusiform more strongly than words (Moore & Price, 1999; Price & Devlin, 2003; Price & Mechelli, 2005): objects place greater demands on visual integration. Proponents of the VWF hypothesis, on the other hand, explain the overlapping activation for words and objects as an artifact of limited spatial resolution in functional neuroimaging. They argue that there are separate sub-populations of letter-string and object detectors within the same macroanatomic region, in much the same way that ocular dominance columns are interdigitated in V1 (L. Cohen & Dehaene, 2004). Additional studies using very high resolution functional imaging (Cheng, Waggoner, & Tanaka, 2001; Yacoub et al., 2001) or intra-operative recordings (Kreiman, Koch, & Fried, 2000; Ojemann, Schoenfield-McNeill, & Corina, 2002) will be necessary to test these claims.
In summary, we propose that the left posterior fusiform cortex transiently instantiates an invariant representation of a visual stimulus that includes not only form, but other visual attributes when they carry relevant information. This representation is modulated by top-down projections from higher-order association cortices in order to integrate non-visual properties of the stimulus such as its meaning or sound. This account explains a range of visual word recognition findings, including activation for words and pseudowords relative to low level baselines, case-invariant repetition priming effects for words but not pseudowords, and the orthographic priming effects for unrelated and related words seen in the current study. In addition, it provides a parsimonious explanation for the common fusiform activation seen in both object recognition and word reading. In doing so, it leads away from cognitive-based parcellations of cortex and towards an understanding of brain function in terms of information processing grounded in known anatomical and neurophysiological properties of the region.
We thank Holly Bridge, Cathy Price, Christian Fiebach and an anonymous reviewer for helpful discussions and we thank Charvy Narain, Rami Niazy, and Hawkan Lau for providing the examples used in Fig. 1. The work was supported by Wellcome Trust grants to JTD and HLJ, MRC grants to PMM, and an NIH NRSA Award F32-DC00374 and R01 grant MH55628-06 to LMG.