Design and Stimulus Materials
The stimuli were designed such that they could be counterbalanced both across the two tasks (LD and RJ) and across the three relatedness conditions (directly related, indirectly related and unrelated). In order to counterbalance in this way, three hundred word triplets were developed such that target words (e.g. “stripes”) were paired with directly related primes (e.g. “tiger”) and indirectly related primes (e.g. “lion”). Word triplets were taken from those used in previously published studies (Balota and Lorch 1986; McNamara and Altarriba 1988; Weisbrod, et al. 1999) or else were developed for the current study.
For 113 of these triplets, we conducted a free association experiment (described by Kreher et al. 2006, Experiment 3) in which 30 participants who did not participate in the fMRI experiment generated 5 associates to either the primes from the directly related word-pairs, the primes from the indirectly related word-pairs, or the target words. The directly related targets were almost always generated as associates, while the unrelated targets were almost never generated as associates, from the primes of the directly related word-pairs. The theoretical mediating words or the primes of the directly related word-pairs were often generated from the primes of the indirectly related word-pairs, while the targets of the indirectly related word-pairs were almost never generated from the primes of the indirectly related word-pairs (for details, see Kreher et al. 2006). For the additional 187 triplets, all the directly related word-pairs, but none of the indirectly related and unrelated word-pairs, had an associative strength on the Edinburgh Associative Thesaurus (Coltheart 1981) of greater than zero; and, again, the associative strength of the indirect primes to their theoretical mediating words was much greater than the associative strength between the indirectly related word-pairs.
A second norming study established that, although individuals generally generated the mediator words of the indirectly related word-pairs when given both prime and target, they did not generate the mediator word when they were given only the prime. Finally, fifteen subjects who did not take part in the fMRI experiment performed a Relatedness Judgment task on the word-pairs, rating how related in meaning the two words of each pair were on a five-point scale, using three counterbalanced lists; the directly related word-pairs (mean = 4.41, SD = 0.56) were rated as more related in meaning than the indirectly related word-pairs (mean = 3.11, SD = 0.60) [t(299) = 27.207, p < .01], which were, in turn, rated as more related in meaning than the unrelated word-pairs (mean = 1.45, SD = 0.37) [t(299) = 40.579, p < .01].
These word triplets were then used to counterbalance targets across six lists in a Latin Square design. Each participant saw one list during the LD task and one list during the RJ task. This ensured that no individual would see the same prime or target more than once (avoiding repetition priming effects), but that, across all participants, exactly the same targets would be seen in all three relatedness conditions in both tasks and that, across all participants, exactly the same primes would be viewed in the directly related and the unrelated conditions. Thus, in each of the six lists there were 150 pairs: 50 directly related pairs, 50 indirectly related pairs and 50 unrelated pairs. In addition, the frequency and number of letters of both primes and targets were matched across the six lists and the three Relatedness conditions (no main effect of List and no List by Relatedness interaction, p > 0.5). Then, to each list, 50 word-nonword trials were added. All nonword targets were phonologically permissible strings in English and they were all derived from words that were unrelated to their primes. The nonwords were also counterbalanced across the LD and RJ tasks (they were included in the RJ task so that, counterbalanced across all participants, exactly the same stimulus lists could be used in both tasks).
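The rotation of triplets through conditions can be sketched as follows. This is a minimal illustration and not the assignment procedure actually used: it assumes the 300 triplets are split into six blocks of 50, that lists 1-3 rotate the first three blocks through the conditions and lists 4-6 rotate the remaining three, and that each participant receives one list from each set. The function name `build_lists` and the block scheme are hypothetical, and the re-pairing of primes and targets to form unrelated pairs is not modeled.

```python
from collections import Counter

CONDITIONS = ("direct", "indirect", "unrelated")

def build_lists(n_triplets=300, n_lists=6):
    """Assign triplet indices to lists and conditions by rotation (assumed scheme)."""
    half = n_lists // 2            # 3 lists in each complementary set
    block = n_triplets // n_lists  # 50 triplets per block
    lists = []
    for list_index in range(n_lists):
        # set_id selects which half of the triplets the list draws on;
        # rotation selects which condition each block receives in this list.
        set_id, rotation = divmod(list_index, half)
        assignment = {}
        for b in range(half):
            condition = CONDITIONS[(b + rotation) % len(CONDITIONS)]
            start = (set_id * half + b) * block
            for triplet in range(start, start + block):
                assignment[triplet] = condition
        lists.append(assignment)
    return lists

lists = build_lists()
pairs_per_condition = [Counter(a.values()) for a in lists]
```

Under these assumptions, every list holds 50 pairs per condition, every triplet appears once in each condition across lists, and any list from the first set shares no triplets with any list from the second set, so a participant given one of each never sees a repeated item.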
Given that, on the RJ task, participants classified 50% of the indirectly related word-pairs as related and 50% as unrelated (reported in the Results), the relatedness proportion (RP) was approximately 0.5. The nonword ratio (the number of word-nonword pairs divided by the number of word-nonword pairs plus unrelated pairs) (Neely, et al. 1989) was 0.4. An example stimulus set is given in .
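The two ratios can be reproduced from the list composition given above. The sketch below assumes that the half of the indirectly related pairs judged as unrelated are counted with the unrelated pairs in the nonword ratio; that assumption is ours, chosen because it recovers the reported value of 0.4, and the function names are hypothetical.

```python
def relatedness_proportion(n_related, n_word_pairs):
    """Related word-pairs as a fraction of all word-word pairs."""
    return n_related / n_word_pairs

def nonword_ratio(n_nonword, n_unrelated):
    """Word-nonword pairs / (word-nonword pairs + unrelated pairs)."""
    return n_nonword / (n_nonword + n_unrelated)

# Per-list counts stated in the text.
n_direct = n_indirect = n_unrelated = n_nonword = 50

# Assumption: half of the indirect pairs are treated as related, half as unrelated.
indirect_as_related = 0.5 * n_indirect
indirect_as_unrelated = n_indirect - indirect_as_related

rp = relatedness_proportion(n_direct + indirect_as_related,
                            n_direct + n_indirect + n_unrelated)
nwr = nonword_ratio(n_nonword, n_unrelated + indirect_as_unrelated)
```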
Example of word pairs, counterbalanced across conditions, derived from the triplet “lion-tiger-stripes”
Participants in fMRI study
Participants were recruited by advertisement. All were right-handed as assessed using the modified Edinburgh Handedness Inventory (Oldfield 1971; White and Ashton 1976). Selection criteria required all participants to have normal or corrected-to-normal vision, to be native speakers of English, and to have learned no other language before the age of five. In addition, volunteers were not taking any medication and were screened to exclude the presence of psychiatric and neurological disorders and to exclude contraindications for MRI. Written consent was obtained from all subjects before participation according to the established guidelines of the Massachusetts General Hospital Institutional Review Board. Two subjects were excluded because of scanning artifacts and one subject was excluded because his behavioral performance was at chance. This left sixteen participants in total (14 males and 2 females; mean age: 42).
Stimulus presentation and Tasks
During scanning, each participant viewed one list during the LD task and one list during the RJ task (lists were fully counterbalanced across participants as explained above). Each list was divided into three functional runs, each lasting 4 minutes and 10 seconds. The LD task was performed during the first three functional runs, and the RJ task was performed during the last three functional runs. The LD task always took place before the RJ task so that participants were not explicitly alerted to the semantic relationships between the word-pairs, which could potentially have biased their lexical decisions.¹
During the LD task, subjects decided as quickly and as accurately as possible whether the target was a real English word or a nonword. During the RJ task, subjects decided as quickly and as accurately as possible whether the target was related or unrelated in meaning to the prime. Participants were explicitly told that, when they saw target nonwords during the RJ task, they should indicate that these were not related in meaning to the primes. In both tasks, participants indicated their decisions by pressing one of two buttons using the index and middle fingers of their left hand (counterbalanced across subjects). Participants practiced the LD task before scanning and the RJ task inside the scanner after carrying out the LD task. Subjects’ accuracy and reaction times (RTs) on both tasks were recorded.
In both tasks, each trial consisted of the prime (500msec), a blank screen (300msec), the target (500msec), and then another blank screen (300msec). Thus, the SOA was 800msec. Between word-pairs, a question mark appeared (1100msec) followed by a blank screen (300msec). The four trial types appeared in pseudorandom order, in all runs, interspersed among 100 visual fixation trials (fixate on a “+” for variable durations of 1000msec-8000msec, mean: 3000msec). The random interleaving of these fixation or ‘null-events’ amongst the word-pairs enabled the efficient estimation and deconvolution of the entire hemodynamic response (Burock, et al. 1998).
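The trial timeline can be laid out explicitly to check the stated SOA. This sketch assumes that the question mark and the blank screen that follows it belong to the same trial cycle as the word-pair; the event labels and the helper `onset_of` are hypothetical.

```python
# Event sequence for one trial, durations in msec, as given in the text.
TRIAL_EVENTS_MSEC = [
    ("prime", 500),
    ("blank_after_prime", 300),
    ("target", 500),
    ("blank_after_target", 300),
    ("question_mark", 1100),
    ("blank_after_question_mark", 300),
]

def onset_of(label, events=TRIAL_EVENTS_MSEC):
    """Return the onset (msec from trial start) of the named event."""
    elapsed = 0
    for name, duration in events:
        if name == label:
            return elapsed
        elapsed += duration
    raise ValueError(f"unknown event: {label}")

# Stimulus onset asynchrony: prime onset to target onset.
soa_msec = onset_of("target") - onset_of("prime")

# Length of one full trial cycle under the assumption stated above.
trial_cycle_msec = sum(duration for _, duration in TRIAL_EVENTS_MSEC)
```

With these durations the SOA is the prime duration plus the first blank (500 + 300 msec), and one full cycle is a whole multiple of the 1sec bin used later in the FIR model.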
MRI data acquisition
Subjects underwent two structural scans on a 1.5 Tesla scanner (Siemens Medical Solutions, Iselin NJ), each constituting a 3D MPRAGE sequence (128 sagittal slices, 1.3mm thickness, TR: 7.25msec, TE: 3msec, flip angle: 7°, bandwidth: 195 Hz/pixel, in-plane resolution: 1.3mm × 1mm). Functional imaging took place in a 3.0T head-only Siemens Allegra scanner. Blood oxygen level dependent (BOLD) signal was imaged using a T2*-weighted gradient-echo pulse sequence (TR: 2sec, TE: 25msec, flip angle: 90°) with 33 transverse slices covering the whole brain (125 images per slice, 3mm thickness, 0.9mm between slices). The in-plane resolution was 3.13×3.13mm (64×64 matrix, 200mm FOV). 125 images were acquired during each functional run for a total run time of 4mins 10sec. Head motion was minimized using pillows and a forehead strap. The first four volumes of each functional run were discarded to allow the magnetization to equilibrate.
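The functional acquisition parameters are internally consistent, as the arithmetic below illustrates. This is only a check of the numbers reported above; the variable names are ours.

```python
# Functional acquisition parameters reported in the text.
tr_sec = 2
volumes_per_run = 125
discarded_volumes = 4
fov_mm = 200
matrix_size = 64

# Run duration: 125 volumes at one volume per TR.
run_duration_sec = tr_sec * volumes_per_run
run_minutes, run_seconds = divmod(run_duration_sec, 60)

# In-plane voxel size: field of view divided by matrix size.
in_plane_mm = fov_mm / matrix_size

# Volumes left for analysis after discarding the first four per run.
retained_volumes = volumes_per_run - discarded_volumes
```

The run duration works out to 4 minutes and 10 seconds, and the in-plane voxel size of 3.125mm rounds to the reported 3.13mm.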
Behavioral data analysis
Accuracy
On the LD task, the frequencies with which nonwords were classified as words (false positive errors) and with which words were classified as nonwords (false negative errors) are reported. On the RJ task, the frequencies with which the unrelated words were classified as related (false positive errors) and with which the related words were classified as unrelated (false negative errors) are reported. In addition, the frequency with which the indirectly related words were classified as unrelated is reported. Note that the judgments of the indirectly related word-pairs were subjective: they could be judged as related or unrelated depending on whether, within the time period given, participants were able to retrieve a potential mediator. They therefore cannot be considered correct responses or errors per se.
Given our a priori predictions, we performed planned repeated-measures 2 (Task) × 2 (Relatedness) ANOVAs that contrasted (a) the directly related and the unrelated word-pairs, (b) the indirectly related and the unrelated word-pairs, and (c) the directly related and the indirectly related word-pairs. Planned paired t-tests within the LD or RJ tasks were conducted to examine the source of any interactions between Task and Relatedness. Both subjects analyses (in which RTs were averaged over all items in each relatedness condition) and items analyses (in which RTs were averaged over all subjects in each relatedness condition) were conducted. In both subjects and items analyses, Task and Relatedness were within-subject factors. In all ANOVAs and t-tests, the dependent variable was RTs to the correctly-answered trials: for the LD task, these were the trials on which the targets were correctly classified as words; for the RJ task, these were the trials on which participants classified the directly related and the indirectly related word-pairs as related, and the unrelated word-pairs as unrelated. Because, as discussed above, during the RJ task, the decision as to whether the indirectly related words were related or unrelated was subjective, all analyses that involved the RJ task were repeated (a) including all RTs to indirectly related word-pairs, regardless of how they were classified in the RJ task, and (b) including RTs to indirectly related word-pairs that were judged as unrelated in the RJ task.
Alpha was set to 0.05. All analyses were repeated after logarithmically transforming the data and yielded the same pattern of findings.
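In a 2 × 2 design in which both factors are repeated measures, the Task by Relatedness interaction is equivalent to a one-sample t-test on each subject's difference of differences, with F(1, n − 1) = t². The sketch below illustrates that equivalence for a subjects analysis. The RT values are invented purely for illustration and are not data from this study, and the function name is hypothetical.

```python
import math
from statistics import mean, stdev

def interaction_f(rel_ld, unrel_ld, rel_rj, unrel_rj):
    """F for the Task x Relatedness interaction in a fully within-subject 2 x 2 design.

    Each argument is a list of per-subject mean RTs (same subject order).
    """
    # Priming effect (unrelated minus related) in each task, then their difference.
    diffs = [(ul - rl) - (ur - rr)
             for rl, ul, rr, ur in zip(rel_ld, unrel_ld, rel_rj, unrel_rj)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t ** 2, (1, n - 1)

# Invented per-subject mean RTs (msec), for illustration only.
rel_ld = [600, 620, 580, 640]
unrel_ld = [650, 660, 640, 700]
rel_rj = [700, 710, 690, 720]
unrel_rj = [800, 830, 790, 850]

f_value, dof = interaction_f(rel_ld, unrel_ld, rel_rj, unrel_rj)
```

An items analysis would follow the same logic with RTs averaged over subjects for each item rather than over items for each subject.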
In order to increase the signal-to-noise ratio, the two structural scans for each participant were averaged together, after motion correction, to create a single volume.² This high signal-to-noise volume was then subjected to an automated segmentation procedure by which the surface representing the gray/white border was reconstructed and inflated to yield a 2D representation of the cortical surface (Dale, et al. 1999; Dale and Sereno 1993; Fischl, et al. 2001) using FreeSurfer software developed at the Martinos Center, Charlestown, MA (http://surfer.nmr.mgh.harvard.edu/).
Functional images were motion corrected using the AFNI algorithm (Cox 1996; Cox and Jesmanowicz 1999). Images were corrected for temporal drift, normalized and spherically smoothed using a 3D spatial filter (full-width-half-max: 8.7mm). The functional images were then analyzed with a General Linear Model (GLM) using a finite impulse response (FIR) model, implemented in the FreeSurfer Functional Analysis Stream (FS-FAST). The FIR model gave estimates of the hemodynamic response every 1sec because stimuli were allowed to onset at the half-TR as well as at the full 2sec TR. It allowed us to address our hypotheses without assumptions about the shape of the hemodynamic response (Burock, et al. 1998; Burock and Dale 2000; Dale 1999).
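To illustrate how onsets at the half-TR yield 1sec resolution from a 2sec TR, the sketch below builds a generic FIR design matrix for one condition. It is not the FS-FAST implementation: the function name, the number of bins and the example onsets are hypothetical. Rows are acquired volumes and columns are post-stimulus bins of 1sec.

```python
def fir_design_matrix(onsets_msec, n_volumes, tr_msec=2000, bin_msec=1000, n_bins=4):
    """Generic FIR design matrix: rows are volumes, columns are post-stimulus bins."""
    design = [[0] * n_bins for _ in range(n_volumes)]
    for onset in onsets_msec:
        for b in range(n_bins):
            sample_time = onset + b * bin_msec
            # A bin is observed only if a volume is acquired at that time.
            if sample_time % tr_msec == 0:
                volume = sample_time // tr_msec
                if volume < n_volumes:
                    design[volume][b] += 1
    return design

# One onset on the TR (0 msec) and one at the half-TR (5000 msec).
design = fir_design_matrix([0, 5000], n_volumes=6)
```

In this toy case the on-TR onset contributes to the even-numbered bins and the half-TR onset to the odd-numbered bins, so that across many jittered trials every 1sec bin of the response is sampled.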
The cortical surface of each individual was morphed/registered onto an average spherical surface representation to align sulci and gyri across subjects (Fischl, et al. 1999a; Fischl, et al. 1999b). This structural spherical transform was used to map the GLM parameter estimates and residual error variances of each participant’s functional data to a common spherical coordinate system (Fischl, et al. 1999a; Fischl, et al. 1999b). Each participant’s data were then smoothed on the surface tessellation using an iterative nearest-neighbor averaging procedure, equivalent to applying a two-dimensional Gaussian smoothing kernel with a FWHM of approximately 8.5mm. Because this smoothing procedure was restricted to the cortical surface, averaging data across sulci or outside gray matter was avoided.
In the LD task, BOLD activity was examined to correctly-answered trials (i.e. the trials on which the targets were correctly classified as words). In the RJ task, BOLD activity was examined to correctly-answered unrelated and related trials and to the indirectly related trials that were classified as related. However, because relatedness decisions to these indirectly related word-pairs are subjective, we also examined BOLD activity in the RJ task to all indirectly related word-pairs (regardless of how they were classified), as well as to indirectly related word-pairs that were judged as unrelated. We note any differences in the findings revealed by these different analyses.
Because the LD and RJ tasks may have engaged neural processes at different latencies, we first examined the hemodynamic time courses that were generated during each of these tasks and to each type of word-pair, without any assumption about their overall shapes. These hemodynamic time courses were generated by averaging activity across voxels within temporal and prefrontal regions of interest at each TR (using the FIR model) and across all participants, see . The time window that captured the peak of this hemodynamic response across the two tasks and the three relatedness conditions was approximately 3–6 seconds. Therefore, all the statistical maps described below were constructed by summing activity at each voxel across this time-epoch.
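Summing across the peak window amounts to adding the FIR estimates whose post-stimulus times fall within 3–6 seconds. The sketch below assumes 1sec bins starting at stimulus onset and treats both window edges as inclusive; both are our assumptions, and the estimates shown are invented for illustration rather than taken from the data.

```python
def sum_over_window(fir_estimates, bin_sec=1, window_sec=(3, 6)):
    """Sum FIR estimates whose post-stimulus time lies in the window (edges inclusive)."""
    start, end = window_sec
    return sum(beta for i, beta in enumerate(fir_estimates)
               if start <= i * bin_sec <= end)

# Invented FIR estimates at 0, 1, 2, ... sec post-stimulus for one voxel.
example_estimates = [0.0, 0.1, 0.4, 0.9, 1.2, 1.0, 0.6, 0.3, 0.1]
window_sum = sum_over_window(example_estimates)
```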
Hemodynamic time courses within a priori regions of interest, showing modulation of activity to directly related, indirectly related and unrelated word-pairs in the LD and RJ tasks.
We first constructed a statistical map examining the regions activated across both tasks relative to the low-level baseline fixation condition. We also determined whether any of these regions were differentially modulated across the two tasks. We then constructed statistical maps based on planned 2 (Relatedness: directly related versus unrelated, or indirectly related versus unrelated) × 2 (Task: LD versus RJ) repeated measures ANOVAs to show the main effects of Relatedness as well as Task by Relatedness interactions. In these analyses, only ‘highest order’ effects are shown/reported. In other words, clusters that we report as showing main effects for a particular factor are those that failed to show Task by Relatedness interactions. In order to determine the sources of any significant Task by Relatedness interactions as well as to examine hemodynamic modulation by semantic relationship within the LD and RJ tasks within regions that did not show significant main effects or interactions in the overall ANOVA maps, we also constructed statistical maps comparing the directly related and unrelated word-pairs and comparing the indirectly related and unrelated word-pairs for the LD and RJ tasks separately. Finally, we constructed statistical maps that directly contrasted the directly and indirectly related word-pairs for each of the LD and RJ tasks. This enabled us to determine the specificity of any hemodynamic modulation to directly versus indirectly related word-pairs.
Correction for multiple comparisons depended on whether voxels fell within or outside a priori regions of interest (see ). Within regions of interest (), p values for sets of contiguous voxels (clusters) were computed using a permutation test (Nichols and Holmes 2002) with 10000 iterations; a cluster was only considered significant if, on this permutation test, its p value was less than 0.05. These clusters are indicated with a * in and . Outside regions of interest, we also report clusters that covered at least 300mm², with a corrected threshold for rejection of the null hypothesis of p < 0.05, identified on the basis of a Monte Carlo simulation across the whole cortical surface (Doherty, et al. 2004). These clusters are indicated with a # in and .
Interactions with Task: Differences between the LD and RJ Tasks in hemodynamic modulation across all word-pairs relative to fixation
Hemodynamic modulation: Directly related vs. Unrelated