Consistent with previous studies, we have shown that a region of the left posterior fusiform cortex is engaged by both word and pseudoword reading relative to consonant letter strings (see
Mechelli, Gorno-Tempini, & Price, 2003 for a review), and that the same region shows significant reductions in BOLD signal associated with case-independent repetition priming (
Dehaene et al., 2004;
Dehaene et al., 2001). These findings suggest that the area is engaged in processing abstract visual form information necessary for visual word identification and are consistent with the accepted notion that visual information becomes progressively less related to the specific features of retinal stimulation as it moves forward in the ventral visual stream. Ventral extrastriate regions compute abstract visual properties such as form (
Grill-Spector & Malach, 2004), colour (
Hadjikhani, Liu, Dale, Cavanagh, & Tootell, 1998;
Wade, Brewer, Rieger, & Wandell, 2002), and depth (
Neri, Bridge, & Heeger, 2004), although reading tends to rely primarily on form. Up to this point, all accounts of fusiform function agree. By extracting abstract visual form information, the region allows a visual representation to be mapped onto other aspects of the word such as its meaning (semantics) or sound (phonology). However, three new findings help to clarify the precise role of the left posterior fusiform cortex and thereby distinguish between competing explanations of fusiform involvement in reading. Specifically, we found that: i) unlike repeated words, repeated pseudowords did not produce a neural priming effect in this region, ii) orthographically related words did produce a neural priming effect, but iii) this orthographic priming effect was reduced when the prime-target pair was semantically related. Three theories of fusiform function in reading are considered in light of these findings.
According to the pre-lexical visual word form (VWF) hypothesis, neurons in the left posterior fusiform region are tuned to sub-lexical combinations of letters that commonly co-occur in a written language (
Dehaene et al., 2005;
Dehaene et al., 2004). Visual word recognition occurs in a serially organized, staged approach starting with visual feature detectors in extrastriate cortex, proceeding through letter detectors and letter-cluster detectors in the posterior fusiform, and then activating lexical representations stored in more anterior multimodal fusiform areas. By this account one would expect repetition priming effects for both words and pseudowords, at least at the level of letter and letter-string detectors. However, only repeated words produced a neural priming effect (i.e. decreased BOLD signal) in the posterior fusiform, while repeated pseudowords led to a slight increase in signal. Interestingly,
Fiebach and colleagues (2005) have also reported the same interaction in the left posterior fusiform, namely a neural priming effect for repeated words but no change in the haemodynamic responses for repeated pseudowords, even when participants were consciously aware of the primes. This lack of a priming effect for pseudowords is difficult to explain in terms of letter and letter-string detectors. In addition, the current study showed that the orthographic priming effect was modulated by the semantic relatedness of the pair. That is, words related in both form and meaning (e.g. “teacher-teach”) produced a significantly smaller neural priming effect than those that shared only form (e.g. “corner-corn”). Semantic modulation of the neural priming effect seems incompatible with the claim that pre-lexical letter combinations are stored in this region and poses a problem similar to that posed by previously reported word frequency effects (
Kronbichler et al., 2004;
Kuo et al., 2003). Together these findings question the adequacy of the pre-lexical VWF hypothesis as an explanation for posterior fusiform contributions to reading.
An alternative account suggests that only whole word patterns are stored in the left posterior fusiform cortex and that they serve as recognition units during reading (
Kronbichler et al., 2004). By this hypothesis, frequency effects are easily explained as a property of the stored word (
Morton, 1969) and pseudoword repetition priming is not expected because no neural representation of novel letter strings exists to be primed. However, the orthographic priming effects reported here are unexpected because word pairs such as “corner-corn” have independent lexical entries in memory. With no shared representation between them, there is nothing to be primed. Moreover, many theoretical accounts suggest that visual similarity increases competition between words and thus increases processing demands (M.
Coltheart, Rastle, Perry, Langdon, & Ziegler, 2001;
McClelland & Rumelhart, 1981). The observation that visually similar word pairs reduced BOLD signal in the posterior fusiform is therefore problematic for the lexical VWF hypothesis. In addition, psychophysical studies show that visual word recognition in skilled readers is considerably less efficient than expected if neurons responded to whole word patterns (
Pelli, Farell, & Moore, 2003). These findings suggest that reading relies on independent detection of simple features rather than multi-letter features, in contrast to both the lexical and pre-lexical VWF hypotheses.
What, then, is the nature of fusiform involvement in reading? Anatomically, the posterior fusiform is part of inferotemporal cortex and thus sits atop a ventral processing hierarchy that extracts basic visual properties such as form and colour (
DeYoe & Van Essen, 1988). Moreover, the specific region appears to be congruent with the visual field map VO-1 which responds most strongly to foveal presentation and has a less precise retinotopic map than earlier areas (
Wandell, Brewer, & Dougherty, 2005). In other words, the region processes fine-grained visual form of the kind needed to rapidly distinguish between visually similar, but conceptually distinct, stimuli such as written words. Bottom-up projections from early visual cortices (
Distler, Boussaoud, Desimone, & Ungerleider, 1993) provide simple feature information which is combined and temporarily instantiated as a pattern of activation over a neuronal population in the posterior fusiform. This transient representation is concurrently shaped by both bottom-up and top-down constraints, thereby integrating both visual and non-visual information. This non-visual information includes, but is not limited to, semantic and phonological aspects of the stimulus which arrive via bidirectional connections with more rostral temporal lobe structures (
Catani, Jones, Donato, & Ffytche, 2003;
Distler et al., 1993), including portions of the superior temporal sulcus involved in phonological processing (
Belin, Zatorre, Lafaille, Ahad, & Pike, 2000;
Scott, Blank, Rosen, & Wise, 2000) and lateral and inferior regions of the anterior temporal pole involved in semantic memory (
Hodges, Graham, & Patterson, 1995;
Vandenberghe, Price, Wise, Josephs, & Frackowiak, 1996).
More generally, the left posterior fusiform cortex is only one component of the neural system engaged by reading (
Price et al., 2003;
Price et al., 2005). Within this system, functional connectivity studies of reading show that BOLD signal in the left posterior fusiform gyrus demonstrates temporal coupling with other left hemisphere language areas including the inferior frontal gyrus (i.e. Broca’s area), middle and superior temporal gyri, and more anterior fusiform regions (
Bitan et al., 2005;
Bokde, Tagamets, Friedman, & Horwitz, 2001;
Mechelli et al., 2005). Electrophysiological evidence further suggests that this processing is not serially staged, but concurrent (
Martin, Nazir, Thierry, Paulignan, & Demonet, 2005;
Pammer et al., 2004). In other words, the integration of visual form and non-visual associations occurs as a highly interactive process (
McClelland & Rumelhart, 1981) rather than as a feed-forward step in a serial mapping of vision onto sound and meaning (L.
Cohen et al., 2002). Visual word recognition is thus a dynamic constraint-satisfaction process, integrating bottom-up visual constraints with top-down contextual constraints including meaning, phonotactics, and morpho-syntax. The engaged brain regions interact and settle into a short-lived, but stable, distributed pattern of activation spanning these regions. In summary, we propose that the posterior fusiform cortex transiently instantiates a representation of a visual stimulus that interfaces between its invariant visual characteristics (e.g. form) and higher order properties of that stimulus. Note that nothing about this claim is specific to reading – a point we will return to shortly.
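The notion of settling under joint bottom-up and top-down constraints can be made concrete with a toy simulation. The sketch below is purely illustrative and is not a model fitted to the present data: the function name `settle`, the two candidate units, and all parameter values are assumptions chosen for exposition. Each unit stands for one candidate interpretation of a stimulus; it receives bottom-up (visual) support and top-down (contextual) support, is inhibited by its rival, and is updated repeatedly until the activation pattern stops changing.

```python
def settle(bottom_up, top_down, inhibition=0.5, rate=0.2,
           tol=1e-6, max_steps=1000):
    """Relax unit activations until they stop changing.

    bottom_up, top_down: per-unit support values (hypothetical units).
    Returns the final activations and the number of update steps taken.
    """
    n = len(bottom_up)
    act = [0.0] * n
    for step in range(1, max_steps + 1):
        new_act = []
        for i in range(n):
            # Competing candidates inhibit one another.
            rivals = sum(act[j] for j in range(n) if j != i)
            net = bottom_up[i] + top_down[i] - inhibition * rivals
            # Move a fraction `rate` of the way toward the net input,
            # with the net input clipped to the range [0, 1].
            target = min(1.0, max(0.0, net))
            new_act.append(act[i] + rate * (target - act[i]))
        change = max(abs(a - b) for a, b in zip(new_act, act))
        act = new_act
        if change < tol:
            return act, step
    return act, max_steps


if __name__ == "__main__":
    # Two candidates with identical visual support (assumed values).
    visual = [0.6, 0.6]
    # Without contextual support the candidates remain tied.
    print(settle(visual, [0.0, 0.0]))
    # Contextual support for the first candidate breaks the tie.
    print(settle(visual, [0.3, 0.0]))
```

With equal visual support, the two units settle to the same activation; adding contextual support to one of them yields a stable pattern in which that unit dominates. The point of the sketch is only that the final pattern reflects visual and non-visual constraints jointly, rather than a strictly feed-forward mapping.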
By this hypothesis, repetition of words leads to reduced activation because the invariant representation of the visual stimuli, namely the prime and the target, is the same despite differences in their physical characteristics. A similar facilitation is present when the prime and target share a stem (e.g. “corner-corn”). This facilitation, however, is reduced when the words also share meaning (e.g. “teacher-teach”) because of the additional semantic processing necessary to differentiate between the similar meanings. In other words, to recognize “teacher” correctly requires distinguishing it from the similar concept “teach” and this is more demanding than distinguishing the meaning of “corner” from “corn.” This explanation highlights the point that the posterior fusiform is primarily driven by visual information but that this can be influenced by non-visual factors. This was clearly demonstrated by the finding that shared visual form led to significant neural priming whereas shared meaning did not. These results are consistent with previous studies showing that visual, but not auditory, stimuli engage the region (
Dehaene et al., 2002).
The main difference between this account and the two visual word form hypotheses is that our hypothesis is not specific to reading – any meaningful stimulus would be expected to engage these same processes. In this context, “meaningful” depends critically on the task. Pictures of nonsense objects, for instance, can be meaningful when associated with particular hand actions (
Phillips, Humphreys, Noppeney, & Price, 2002;
Price & Devlin, 2003) and pseudowords are meaningful in that they have an associated sound pattern. In other words, as long as the stimulus affords higher-order, non-visual properties that must be integrated with visual information, we would expect the left posterior fusiform to be engaged to some extent. Consistent with this, object recognition also engages the same posterior fusiform region as reading (
Ishai, Ungerleider, Martin, Schouten, & Haxby, 1999;
Levy, Hasson, Avidan, Hendler, & Malach, 2001;
McCandliss, Cohen, & Dehaene, 2003;
Price & Devlin, 2003). In addition, repetition priming studies demonstrate that activation in the posterior fusiform cortex is reduced for repeated real-world objects, independent of their size or viewpoint, but repeated nonsense objects do not show a neural priming effect in this region (
Henson, Shallice, & Dolan, 2000;
Vuilleumier et al., 2002), similar to the current findings for real words vs. pseudowords. Recognizing a visually presented object or picture (e.g. a tiger) requires computing invariant attributes such as its form, colour, motion, and depth, and integrating this information with its name (“tiger”) and meaning (“a ferocious cat that lives in Asian forests”). The fact that objects are typically associated with multiple visual attributes while written words are distinguished almost exclusively by their form may help to explain why objects activate the fusiform more strongly than words (
Moore & Price, 1999;
Price & Devlin, 2003;
Price & Mechelli, 2005), namely due to their greater visual integration requirements. Proponents of the VWF hypothesis, on the other hand, explain the overlapping activation for words and objects as an artifact of limited spatial resolution in functional neuroimaging. They argue that there are separate sub-populations of letter-string and object detectors within the same macroanatomic region, in much the same way that ocular dominance columns are inter-digitated in V1 (L.
Cohen & Dehaene, 2004). Additional studies using very high resolution functional imaging (
Cheng, Waggoner, & Tanaka, 2001;
Yacoub et al., 2001) or intra-operative recordings (
Kreiman, Koch, & Fried, 2000;
Ojemann, Schoenfield-McNeill, & Corina, 2002) will be necessary to test these claims.
In summary, we propose that the left posterior fusiform cortex transiently instantiates an invariant representation of a visual stimulus that includes not only form, but also other visual attributes when they carry relevant information. This representation is modulated by top-down projections from higher-order association cortices in order to integrate non-visual properties of the stimulus such as its meaning or sound. This account explains a range of visual word recognition findings, including activation for words and pseudowords relative to low-level baselines, case-invariant repetition priming effects for words but not pseudowords, and the orthographic priming effects for semantically unrelated and related word pairs seen in the current study. In addition, it provides a parsimonious explanation for the common fusiform activation seen in both object recognition and word reading. In doing so, it leads away from cognitively based parcellations of cortex and towards an understanding of brain function in terms of information processing grounded in known anatomical and neurophysiological properties of the region.