This fMRI study investigated top-down letter processing with an illusory letter detection task. Participants responded whether one of a number of different possible letters was present in a very noisy image. After initial training that became increasingly difficult, they continued to detect letters even though the images consisted of pure noise, which eliminated contamination from strong bottom-up input. For illusory letter detection, greater fMRI activation was observed in several cortical regions. These regions included the precuneus, an area generally involved in top-down processing of objects, and the left superior parietal lobule, an area previously identified with the processing of valid letter and word stimuli. In addition, top-down letter detection activated the left inferior frontal gyrus, an area that may be involved in the integration of general top-down processing and letter-specific bottom-up processing. These findings suggest that these regions may play a significant role in top-down as well as bottom-up processing of letters and words, and are likely to have reciprocal functional connections to more posterior regions in the word and letter processing network.
The identification of separate letters in a visually presented word is foundational to an individual's success in reading, and the inability to do so can lead to debilitating cognitive impairments such as dyslexia. To better understand this ability, there has been extensive research on the neural bases of word processing over the last two decades. Taken together, many neuroimaging studies have identified a distributed cortical network for visual word processing that ranges from the ventral occipitotemporal cortex (Cohen & Dehaene, 2004; Dehaene, Cohen, Sigman, & Vinckier, 2005; James & James, 2005; Dietz, Jones, Gareau, Zeffiro, & Eden, 2005) to the frontal cortex (Reinke, Fernandes, Schwindt, O'Craven, & Grady, 2008; for a review see Tan, Laird, Li, & Fox, 2005; Dien, 2009).
Current understanding of this word processing network has emphasized the feed-forward connections. Visual input is processed systematically, starting with early stages of line detection in V1 and V2, leading to prelexical shape information extraction in V4, and finally reaching the fusiform gyrus for integration of letter strings and word forms independent of location, size, and case (McCandliss, Cohen, & Dehaene, 2003; Dehaene, Cohen, Sigman, & Vinckier, 2005). This last stage has been termed the ‘visual word form area’ (VWFA, Cohen et al., 2002; Cohen & Dehaene, 2004; McCandliss, Cohen, & Dehaene, 2003; but see Price and Devlin, 2003; Reinke, Fernandes, Schwindt, O'Craven, & Grady, 2008, for other interpretations of this area). From the VWFA, information is passed along to several language areas in the temporal, parietal, and frontal cortices, more specifically the left middle temporal gyrus, left superior parietal lobule, and inferior frontal gyrus (Tan, Laird, Li, & Fox, 2005; Reinke, Fernandes, Schwindt, O'Craven, & Grady, 2008). These areas are purportedly involved in grapheme-to-phoneme conversion and semantic decoding (see Jobard, Crivello, & Tzourio-Mazoyer, 2003 for a review).
Despite the success of this account in explaining neural processing of words, pure bottom-up word processing stands in opposition to classic behavioral phenomena, such as the ‘word superiority effect’ in which a letter is identified more rapidly in the context of a word (McClelland & Rumelhart, 1981). These behavioral results gave rise to the ‘interactive-activation’ connectionist model, in which letter identification arises from the interplay between top-down and bottom-up information. In keeping with this approach, researchers have speculated that frontal areas may play a role in the processing of letters and words, and recent studies support this account. For instance, studies have shown the influence of task demands on both electrophysiological (Ruz & Nobre, 2008) and fMRI (Devlin, Rushworth, & Matthews, 2005) measures of word identification. However, these studies examined the interplay between top-down and bottom-up processing rather than isolating top-down effects; bottom-up input remained strong, providing unambiguous visual information about the presented word or letter. Thus, it is not clear whether the initial bottom-up input served to elicit subsequent top-down feedback, or whether top-down expectations might have played an important role regardless of bottom-up information. To the best of our knowledge, no studies have examined the neural activity of top-down letter processing in the absence of strong bottom-up input.
The current study isolated top-down components of letter detection by asking participants to indicate the presence or absence of a letter in different noise images when no letters were actually presented. The task was not detection of a particular letter, which would prompt a search for particular line segments, but rather to detect whether one of several different possible letters was presented. Over the last several years, this method has been successfully used to isolate top-down mechanisms of face detection (Zhang et al., 2008; Li et al., 2009). Application of this method to letter detection thus may also provide informative evidence about the neural mechanisms involved in the top-down processing of letters. By examining fMRI responses for the illusory detection of letters, we could reveal the neural regions involved in top-down letter processing without contamination from bottom-up information.
Twenty-four Chinese participants (twelve males, mean age = 21.2 years, SD = 2.6) with normal or corrected-to-normal vision participated in this study after giving their informed consent. All participants were familiar with the Roman alphabet through exposure to Pinyin, a phonetic form of the Chinese language introduced in the first grade of elementary school. This study was approved by the Human Research Protection Program of Tiantan Hospital, Beijing, China.
Pure noise stimuli (Fig. 1c) were created by additively combining Gaussian blobs of different sizes at random locations. Each pure noise image was used only once, either alone or in combination with a letter image. Detection images that contained a letter were created by subtracting a blurred version of a chosen letter (a, s, c, e, m, n, o, r, or u) from a noise image. Because the letters were light against a black background (intensity 0), the background was unchanged after subtraction, which left the noise image undisturbed in those regions, whereas the regions where the letter existed were made darker. Easy-to-detect letters (Fig. 1a) and hard-to-detect letters (Fig. 1b) were created by subtracting the letter at 60% or 35% of its full value, respectively. Checkerboard images (Fig. 1d) were additionally used to calculate each participant's baseline hemodynamic response.
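The stimulus construction described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' stimulus code: the image size, the number of blobs, the range of blob widths, the rescaling to [0, 1], the clipping at 0, and the toy vertical-bar "letter" are all hypothetical choices that the text does not specify.

```python
import math
import random


def gaussian_blob_noise(size, n_blobs, rng):
    """Sum Gaussian blobs of random width at random locations (values rescaled to [0, 1])."""
    img = [[0.0] * size for _ in range(size)]
    for _ in range(n_blobs):
        cx, cy = rng.uniform(0, size), rng.uniform(0, size)
        sigma = rng.uniform(1.0, size / 4)  # assumed range of blob sizes
        for y in range(size):
            for x in range(size):
                d2 = (x - cx) ** 2 + (y - cy) ** 2
                img[y][x] += math.exp(-d2 / (2 * sigma ** 2))
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    return [[(v - lo) / (hi - lo) for v in row] for row in img]


def embed_letter(noise, letter, strength):
    """Subtract a letter image (0 = background) scaled by `strength`; clip at 0 (assumption)."""
    return [
        [max(0.0, n - strength * l) for n, l in zip(noise_row, letter_row)]
        for noise_row, letter_row in zip(noise, letter)
    ]


rng = random.Random(0)
noise = gaussian_blob_noise(16, 20, rng)
# Toy stand-in for a blurred letter: a light vertical bar on a black (0) background.
letter = [[1.0 if 6 <= x <= 9 and 3 <= y <= 12 else 0.0 for x in range(16)] for y in range(16)]
easy = embed_letter(noise, letter, 0.60)  # letter subtracted at 60% of full value
hard = embed_letter(noise, letter, 0.35)  # letter subtracted at 35% of full value
```

Because the letter background is 0, pixels outside the letter keep their noise value, while pixels inside the letter become darker, and more so at 60% than at 35%.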
The experiment consisted of two parts: an initial training period where actual letters were presented on 50% of trials, and a testing period where only noise images were presented. Participants were scanned only during the testing period. The training session consisted of six blocks, each of which included 20 detection images and 8 checkerboard images. The first two blocks contained an equal number of easy to detect (Fig. 1a) and pure noise (Fig. 1c) stimuli. The next two blocks contained an equal number of hard to detect (Fig. 1b) and pure noise stimuli. Trials in the last two training blocks used pure noise stimuli on all detection trials. Participants were instructed that half of the detection images would contain letters and the other half would not and that the detection task would become progressively more difficult. They were instructed to press one button on a response device with their left or right index finger when they detected a letter or a second button with their opposite index finger when they did not detect a letter. Whether the detection finger was left or right was counterbalanced across subjects. Participants were instructed not to respond to the checkerboard images. Each trial started with a 200 ms fixation crosshair followed by either the detection image or checkerboard image for 600 ms. Participants' responses were collected during a blank screen presented for 1,200 ms after each trial. The aim of the training session was to teach participants the nature of the experiment, and to motivate them to attempt to detect letters even in pure noise images.
Four testing sessions followed the training blocks, with each consisting of 40 checkerboard images and 120 pure noise trials presented in random order. The task instructions were the same as for the training session and participants were instructed that half the images contained letters.
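The trial counts and timing implied by this design can be tallied as follows. The sketch uses only the numbers stated in the procedure; the assumption that trials follow each other with no additional gap (used for the per-session duration) is ours and is not stated in the text.

```python
# Trial timing from the procedure (milliseconds).
FIXATION_MS = 200
STIMULUS_MS = 600
RESPONSE_MS = 1200
TRIAL_MS = FIXATION_MS + STIMULUS_MS + RESPONSE_MS  # 2000 ms per trial

# Training: six blocks of 20 detection images plus 8 checkerboards.
easy_block = {"letter": 10, "noise": 10, "checkerboard": 8}
hard_block = {"letter": 10, "noise": 10, "checkerboard": 8}
noise_block = {"letter": 0, "noise": 20, "checkerboard": 8}
training_blocks = [easy_block] * 2 + [hard_block] * 2 + [noise_block] * 2

training_letter_trials = sum(b["letter"] for b in training_blocks)
training_detection_trials = sum(b["letter"] + b["noise"] for b in training_blocks)

# Testing: four scanned sessions of 120 pure noise trials plus 40 checkerboards.
TEST_SESSIONS = 4
NOISE_PER_SESSION = 120
CHECKER_PER_SESSION = 40
total_noise_trials = TEST_SESSIONS * NOISE_PER_SESSION

# Assumes back-to-back trials with no extra inter-trial interval.
session_seconds = (NOISE_PER_SESSION + CHECKER_PER_SESSION) * TRIAL_MS / 1000

print(TRIAL_MS, training_letter_trials, training_detection_trials)
print(total_noise_trials, session_seconds)
```

Under these assumptions each trial lasts 2,000 ms, the scanned sessions contain 480 pure noise trials in total, and a real letter appears on 40 of the 120 training detection trials.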
Structural and functional MRI data were collected using a 3.0 T MR imaging system (Siemens Trio, Germany) at Tiantan Hospital. fMRI was collected using a single shot, T2*-weighted gradient-echo planar imaging (EPI) sequence (TR/TE = 2,000/30 ms; 32 slices; 4 mm thickness; matrix = 64×64) covering the whole brain with a resolution of 3.75× 3.75 mm. High-resolution anatomical scans were acquired with a three-dimensional enhanced fast gradient-echo sequence, recording 256 axial images with a thickness of 1 mm and a resolution of 1×1 mm.
Spatial preprocessing and statistical mapping were performed with SPM5 software (http://www.fil.ion.ucl.ac.uk/spm/, Friston et al., 1995). The first three scans of each testing session were excluded to account for signal saturation. After slice-timing correction, spatial realignment, and normalization to the MNI152 template (Montreal Neurological Institute), the scans of all sessions were resampled into 2×2×2 mm³ voxels and then spatially smoothed with an isotropic 6 mm full-width at half-maximum (FWHM) Gaussian kernel. The time series of each session was high-pass filtered (cutoff period = 128 s) to remove low-frequency noise such as scanner drift (Friston et al., 1995).
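For readers less familiar with these parameters, the short sketch below converts them into more directly interpretable quantities. It relies only on the standard relation between the FWHM and the standard deviation of a Gaussian, FWHM = 2·sqrt(2·ln 2)·σ, and is not taken from the SPM5 code; the variable names are ours.

```python
import math

FWHM_MM = 6.0          # smoothing kernel, from the text
VOXEL_MM = 2.0         # resampled voxel size, from the text
HIGHPASS_PERIOD_S = 128.0
TR_S = 2.0
DISCARDED_SCANS = 3


def fwhm_to_sigma(fwhm):
    """Standard deviation of a Gaussian with the given full width at half maximum."""
    return fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


sigma_mm = fwhm_to_sigma(FWHM_MM)            # about 2.55 mm
sigma_voxels = sigma_mm / VOXEL_MM           # about 1.27 voxels
highpass_hz = 1.0 / HIGHPASS_PERIOD_S        # about 0.0078 Hz
discarded_seconds = DISCARDED_SCANS * TR_S   # 6 s removed per session

print(round(sigma_mm, 3), round(sigma_voxels, 3), highpass_hz, discarded_seconds)
```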
Trials from the testing sessions were classified according to whether participants did or did not detect a letter, resulting in two regressors convolved with a prototypical hemodynamic response function (HRF) to produce the letter response and no-letter response conditions. For each participant, scans of all testing sessions were combined and analyzed using a general linear model (GLM). Movement parameters were added to the GLM as additional regressors to account for residual head motion. After participant-specific parameter estimates were computed, a conventional whole-brain analysis was performed at the group level using random-effects analysis to contrast the letter response and no-letter response conditions, thresholded at p < 0.05 (FWE corrected) with cluster ≥ 5.
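The construction of such condition regressors can be illustrated with a minimal sketch. It is not the SPM5 implementation: the double-gamma shape parameters (6 and 16, with an undershoot ratio of 1/6) are a commonly used approximation that we assume here, onsets are expressed in whole scans, the 200 ms fixation offset within each trial is ignored, and the onset lists are hypothetical.

```python
import math

TR = 2.0  # seconds, from the acquisition parameters


def double_gamma_hrf(t, peak_shape=6.0, under_shape=16.0, ratio=1.0 / 6.0):
    """Assumed double-gamma HRF: a positive response minus a scaled late undershoot."""
    if t <= 0:
        return 0.0
    peak = t ** (peak_shape - 1) * math.exp(-t) / math.gamma(peak_shape)
    under = t ** (under_shape - 1) * math.exp(-t) / math.gamma(under_shape)
    return peak - ratio * under


def build_regressor(onset_scans, n_scans, hrf_len_s=32.0):
    """Convolve a stick function (one impulse per trial onset) with the HRF sampled at the TR."""
    kernel = [double_gamma_hrf(i * TR) for i in range(int(hrf_len_s / TR))]
    regressor = [0.0] * n_scans
    for onset in onset_scans:
        for k, h in enumerate(kernel):
            if onset + k < n_scans:
                regressor[onset + k] += h
    return regressor


# Hypothetical onsets (in scans) for trials sorted by the participant's response.
letter_regressor = build_regressor([10, 25], n_scans=60)
no_letter_regressor = build_regressor([5, 18, 40], n_scans=60)
```

Sorting the same pure noise trials into the two regressors by response, rather than by stimulus, is what lets the contrast between them reflect the participant's percept instead of any difference in the images.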
Participants detected letters on 34.6% (SD 19.0%) of the 480 pure noise detection trials. There was no significant difference in response time between letter responses and no-letter responses (letter response: Mean = 760.81 ms, SD =183.78 ms; no-letter response: Mean = 741.31 ms, SD=175.13 ms; t(23) = 1.422, p = 0.169).
A conventional whole-brain analysis identified a distributed network showing more activation for letter responses than for no-letter responses with a threshold of p < 0.05 (FWE corrected) and cluster ≥ 5 (Table 1). This network included the left inferior frontal gyrus (IFG, Fig. 2a), the left superior parietal lobule (SPL, Fig. 2b), the right precuneus (Fig. 2c), the right middle occipital gyrus (MOG, Fig. 2d), and the right middle temporal gyrus (MTG, Fig. 2e). In contrast, the reverse comparison did not identify significant activation anywhere in the brain (Table 1).
The within-subject contrasts between letter response trials and no-letter response trials mainly served to identify the Regions of Interest (ROIs). Next, brain-behavior correlations for these ROIs were examined across subjects, which is particularly informative considering the large individual differences in the proportion of trials on which participants gave illusory letter responses (see Fig. 2). ROI selection and these correlations are not necessarily independent, considering that both used the same data, which can potentially distort inferential conclusions (Kriegeskorte, Simmons, Bellgowan, & Baker, 2009). In particular, brain-behavior correlation values should be considered cautiously in light of selection effects (Vul, Harris, Winkielman, & Pashler, 2009). However, it is important to note that these ROIs were selected based on within-subjects contrasts rather than the between-subjects correlations, and this procedure should prevent correlation inflation (E. Vul, personal communication, June 8, 2009).
To compare participants' individual behavior with detection-related BOLD signal changes, the average letter/no-letter difference in percent signal change (PSC) within each activated region was correlated with average response time and also with the proportion of letter responses of each individual. There were no significant correlations between average response time and the difference in activation between letter and no-letter responses in any region (p > 0.05). However, the correlation between the proportion of letter responses and the difference in activation between letter and no-letter responses was significant within the left IFG (r = -0.636, p = 0.001) and marginally significant within the left SPL (r = -0.355, p = 0.088), but not within the other regions (the right precuneus, r = -0.038, p = 0.860; the right MOG, r = -0.235, p = 0.268; the right MTG, r = -0.195, p = 0.361; see Fig. 2).
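As a consistency check on these values, a Pearson correlation can be converted to a t statistic with n - 2 degrees of freedom using t = r·sqrt(n - 2)/sqrt(1 - r²). The sketch below applies this standard relation to the reported correlations, assuming all 24 participants entered each correlation; the critical value 2.074 is the two-tailed 0.05 cutoff for 22 degrees of freedom.

```python
import math

N_PARTICIPANTS = 24      # assumes every participant contributed to each correlation
T_CRIT_DF22 = 2.074      # two-tailed critical t at alpha = 0.05, df = 22


def r_to_t(r, n):
    """t statistic (df = n - 2) for testing a Pearson correlation against zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


reported_r = {
    "left IFG": -0.636,
    "left SPL": -0.355,
    "right precuneus": -0.038,
    "right MOG": -0.235,
    "right MTG": -0.195,
}

t_values = {region: r_to_t(r, N_PARTICIPANTS) for region, r in reported_r.items()}
significant = {region: abs(t) > T_CRIT_DF22 for region, t in t_values.items()}

for region, t in t_values.items():
    print(f"{region}: t(22) = {t:.2f}, exceeds critical value: {significant[region]}")
```

Under this assumption only the left IFG correlation exceeds the critical value, and the left SPL falls short of it, in line with the significant and marginal results reported above.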
This study is the first to identify neural correlates of top-down letter processing in the absence of any systematic bottom-up information. By examining illusory letter detection to pure noise images, the reported activation differences can be attributed primarily to endogenous top-down processing. Compared to trials where letters were not detected, illusory letter detection was associated with greater activation in several cortical areas ranging from the occipital cortex to the frontal cortex.
Together with the existing evidence regarding these areas, the present findings shed light on the specific functions of these areas in top-down processing. The present study revealed that the SPL was activated by the letter response relative to the no-letter response. Different sub-regions of the SPL have been shown to be involved in various processes such as spatial attention, episodic memory retrieval, and word processing. Recent studies reported that some sub-regions of the SPL were specifically activated by naming letters, but not by naming objects or scrambled letters, with an activation peak (-38, -50, 48) highly consistent with the one found here (-34, -50, 50), suggesting that this region may be specifically involved in letter processing rather than word processing or other object processing (Joseph, Cerullo, Farley, Steinmetz & Mier, 2006; Joseph, Gathers & Piper, 2003). In line with this suggestion, Price, Wise, and Frackowiak (1996) found that this region of the SPL was activated by pseudowords relative to words, and attributed this activation to phonological coding during the translation of individual letters into their sounds. Patients with damage to the left SPL and posterior temporal cortex were unable to retrieve sounds from letters (Friedman, Ween & Albert, 1993). Given this evidence, the activation of the SPL elicited by the letter response relative to the no-letter response in the present study may be related to the phonological translation from letter into sound when a letter was ‘detected’.
It should be noted that some sub-regions within the left SPL have also been reported to be involved in a number of non-letter-related top-down processes, such as imagery of objects (-13, -74, 52, Mechelli, Price, Friston & Ishai, 2004), imagery of famous faces (-38, 37, 40, Ishai, Haxby & Ungerleider, 2002), and detection of faces in noise (-45, -44, 44, Zhang et al., 2008). However, the loci of the peak activation reported in these studies are not consistent with that found in the present study (-34, -50, 50). This is especially notable for the study of Zhang et al. (2008), who used the same experimental paradigm as the present study except that their participants were required to detect faces instead of letters in noisy images. The differences between these previous studies and the present study suggest that the sub-region of the SPL identified by the present results might indeed play a unique role in the top-down processing of letters.
With regard to the left IFG, there is evidence to suggest that it may have a broad top-down function. The above-mentioned study (Zhang et al., 2008) used exactly the same experimental paradigm as the present study except that participants were required to detect faces instead of letters in pure noise images. They found that the left IFG (-48, 13, 24) was also activated when faces were purportedly detected in noise pictures relative to when faces were not detected. Further, as revealed by our correlation analyses, the proportion of letter detections made by each participant was negatively correlated with the BOLD signal difference between letter and no-letter responses in the left IFG. In other words, the larger a participant's activation difference, the less likely the participant was to report having seen letters. One explanation for this finding is provided by Huettel, Song & McCarthy (2005), who suggested that the posterior left IFG is related to decision making under uncertainty and that the activation of this region increases with the uncertainty of a decision. On this account, participants who were more liberal in responding may have decided whether a letter was present under less uncertainty, and therefore a small amount of top-down activation was sufficient to prompt letter detection. In contrast, for conservative individuals, the uncertainty of the detection task may have been greater even though the same noise pictures were used for all participants. As a result, more top-down activation was required to produce illusory letter detection in this more uncertain situation.
However, recent studies have reported that the IFG is specifically activated by the bottom-up processing of letters (Joseph, Cerullo, Farley, Steinmetz & Mier, 2006; Joseph, Gathers & Piper, 2003; Pernet, Franceries, Basan, Cassol, Démonet & Celsis, 2004). For example, some studies have reported that the left IFG responded more to the naming of letters and objects than to the matching of letters or objects, suggesting that this region may be involved in phonological processing (Joseph, Cerullo, Farley, Steinmetz & Mier, 2006; Joseph, Gathers & Piper, 2003). The locus of the left IFG activation in these studies (-47, 5, 27) is highly consistent with that of our study (-48, 5, 26). Further, although our participants were instructed only to indicate with a response key whether a letter was detected in the noise picture during the scanning sessions, they reported silently reading the letter when they ‘detected’ it in the noise picture. Unlike words, letters have little semantic content and carry only simple phonological information. Therefore, in line with Joseph and colleagues (Joseph, Cerullo, Farley, Steinmetz & Mier, 2006; Joseph, Gathers & Piper, 2003), the activation of the left IFG in our study may be due to the phonological processing of the ‘detected’ letters. This interpretation is also supported by evidence from studies of word processing (Xu et al., 2001), in which the same region of the left IFG (-44, 4, 28) was activated by pseudoword rhyming relative to color matching with letters. More interestingly, this region (-42, 4, 26) responded more to alphabetic words than to Chinese characters (for a review, see Tan, Laird, Li & Fox, 2005). Thus, the evidence from the top-down and bottom-up studies taken together suggests that the sub-region of the left IFG identified in the present study might be involved in the integration of bottom-up letter-specific processes and general top-down processes.
With regard to the function of the precuneus, previous studies have demonstrated that the bilateral precuneus is involved in different processes, such as mental imagery, working memory, and episodic memory (Cabeza & Nyberg, 2000). For example, studies have shown that the bilateral precuneus produced greater activation during visual imagery (e.g., of famous faces) than during passive viewing of letter strings (e.g., Ishai, Haxby, & Ungerleider, 2002). Additionally, the precuneus has been reported to be activated by visual imagery of faces, houses, and chairs (Mechelli, Price, Friston, & Ishai, 2004). Also, Zhang et al. (2008), who used the same experimental paradigm as the present study, found that the left precuneus (-30, -62, 36) showed more activation when a face was detected in a noise picture than when no face was detected. In the present study, to find a letter in the noise picture, individuals had to match the pattern seen in the noisy picture against the letters stored in their memory. Once that match was successful, they could report the detection of a letter. Therefore, the activation of the precuneus in the present study may be due to the retrieval of letter information, on the basis of which a letter was ‘detected’. Our finding that the precuneus is active during top-down letter detection is consistent with the existing findings, and suggests that the precuneus may be more generally involved in top-down object identification for a variety of object classes, including letters.
The right MOG and right MTG also responded more to the letter response than to the no-letter response. This finding is not readily related to the previous literature. The activation of these regions within visual cortex may be due to top-down modulation by the SPL (Silvanto, Muggleton, Lavie & Walsh, 2009); however, their exact roles in the top-down processing of letters need to be further investigated.
The present study, using the pure noise paradigm, represents the first attempt to reveal the neural regions involved in the top-down processing of letters. Further research is needed, however. For example, the pure noise paradigm should be used along with existing bottom-up paradigms to compare the neural regions involved in top-down and bottom-up processing of letters. Further, the same pure noise paradigm can be used to contrast the top-down processing of letters with that of digits, shapes, and other objects, as well as that of words, using a within-subjects design. Such a design should provide firmer evidence as to the extent to which some of the regions activated during top-down letter detection are indeed specific to letter or word processing while others are associated with the top-down processing of objects generally. Furthermore, functional connectivity analyses are needed to identify the neural networks specifically involved in the top-down processing of letters as opposed to that of words, digits, shapes, and other objects. Such additional research should greatly help to elucidate the neural organization involved in the top-down processing of various ecologically salient objects in our environment.
In this study, we examined the neural correlates of top-down letter perception. When participants detected letters in pure noise images, a number of brain regions showed higher activity than on trials when participants did not detect letters. These regions included the precuneus, an area generally involved in top-down processing of objects, and the left superior parietal lobule, an area previously identified with the processing of valid letter and word stimuli. In addition, top-down letter detection activated the left inferior frontal gyrus, an area that may be involved in the integration of general top-down processing and letter-specific bottom-up processing. These findings suggest that these regions may play a significant role in top-down as well as bottom-up processing of letters and words, and are likely to have reciprocal functional connections to more posterior regions in the word and letter processing network.
This paper is supported by the Project for the National Basic Research Program of China (973) under Grant No.2006CB705700, Changjiang Scholars and Innovative Research Team in University (PCSIRT) under Grant No.IRT0645, CAS Hundred Talents Program, CAS scientific research equipment develop program under Grant No. YZ200766, the Knowledge Innovation Project of the Chinese Academy of Sciences under Grant No. KGCX2-YW-129, the National Natural Science Foundation of China under Grant No. 30672690, 30600151, 60532050, 60621001, 30873462, 60910006, 30970769, 30970771, Beijing Natural Science Fund under Grant No.4071003, Technology Key Project of Beijing Municipal Education Commission under Grant No.KZ200910005005, the Joint Research Fund for Overseas Chinese Young Scholars under Grant No. 30528027.