To test these alternatives, normal subjects were scanned while performing a manual Stroop task (Stroop, 1935). In this task, subjects must name the ink color of the presented letters while ignoring the word spelled out by the letters. On congruent trials, the color of the ink matched the word (e.g. the word “red” written in red ink), whereas on incongruent trials, the color of the ink did not match the word (e.g. the word “red” written in green ink). Incongruent trials produced a state of high cognitive interference as indicated by higher mean error rates and higher median RTs across subjects (congruent error rate = 2.7%, s.d. = 2.7%; incongruent error rate = 4.8%, s.d. = 1.0%; paired t-test, p = 0.022, df = 22; congruent RT = 831 ms, s.d. = 104 ms; incongruent RT = 958 ms, s.d. = 133 ms; paired t-test, p = 2 × 10⁻⁸, df = 22; see Fig S1 for RT distributions).
Standard fMRI multiple regression techniques were used to replicate previous results from the conflict detection literature (Botvinick et al., 1999; Carter et al., 1998; Kerns et al., 2004) showing increased activity in dMFC during incongruent, as compared to congruent, trials (). To test whether differences in RT alone would produce a similar activation pattern, congruent trials with slow RTs (greater than the median, mean of subgroup = 1119 ms) were compared to congruent trials with fast RTs (less than the median, mean of subgroup = 711 ms). Slow RT trials produced greater activity in the dMFC () even when there was no difference in congruency. However, a lack of incompatible features may not necessarily eliminate interference; we therefore confirmed that slow and fast congruent trials have equally low levels of conflict by measuring error likelihood, which, according to the conflict monitoring model, is proportional to conflict (Fig S2). No significant difference in error likelihood existed between slow and fast trials (fast error: mean = 2.9%, s.d. = 3.8%; slow error: mean = 2.2%, s.d. = 2.0%; paired t-test, p = 0.41, df = 22), confirming low conflict independent of response duration. To further test whether the dMFC can be activated in the absence of response conflict, subjects were asked to view a flashing checkerboard and press a button when the stimulus disappeared. Since no choice decision was required and only one response was possible, no response conflict could exist; nevertheless, activity in dMFC was proportional to time on task (Fig S3).
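The median split described above can be sketched as follows. This is a minimal illustration only: the function name `median_split` and the RT values are hypothetical and are not taken from the study data, and trials falling exactly on the median are assumed to be dropped.

```python
from statistics import mean, median

def median_split(rts):
    """Split RTs (ms) into fast (< median) and slow (> median) subgroups.

    Assumption: trials exactly at the median are excluded from both groups.
    """
    med = median(rts)
    fast = [rt for rt in rts if rt < med]
    slow = [rt for rt in rts if rt > med]
    return med, fast, slow

# Synthetic congruent-trial RTs in ms (illustrative, not study data)
congruent_rts = [650, 700, 720, 760, 800, 840, 900, 1000, 1150, 1300]
med, fast, slow = median_split(congruent_rts)
print(med, mean(fast), mean(slow))
```

The two subgroup means then play the role of the "mean of subgroup" values reported for the fast and slow congruent trials.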
These results demonstrate that response duration can affect dMFC activation, even in the absence of competing responses. But is RT a more powerful predictor of dMFC activity than response conflict? To test this, fast RT incongruent trials (mean of subgroup = 783 ms) were compared against slow RT congruent trials (mean of subgroup = 1190 ms). If response conflict drives the dMFC response, more activity should exist on fast RT incongruent trials due to interference from competing responses. On the contrary, dMFC activity was greater on slow congruent than fast incongruent trials (), even though error rates were higher on fast incongruent trials (fast incong error = 4.4%, slow cong error = 2.2%; paired t-test, p = 0.033, df = 22). Thus, dMFC activation tracks response duration, rather than the presence of incompatible stimulus-response features or increases in error likelihood.
Standard fMRI analysis methods rely on the general linear model, which makes assumptions about the intensity and duration of the underlying neuronal activity as well as the shape of the hemodynamic response function. Because incorrect assumptions can lead to invalid conclusions, the data were reanalyzed using event-related averaging, a model-free analysis. When averaged across all RTs, voxels in the dMFC showed larger BOLD responses on incongruent than congruent trials (), confirming the standard regression analysis result (). We then tested for a relationship between dMFC activity and conflict in the absence of RT differences by comparing BOLD responses for trials with RTs within 100 ms of the global median. By comparing trials near the median, the time needed to reach decision threshold is held roughly constant (mean RT of cong trials near median = 884 ms, s.d. = 117 ms; mean RT of incong trials near median = 891 ms, s.d. = 114 ms; paired t-test, p = 0.856, df = 44) and the resulting decisions vary only by the presence or absence of incompatible stimulus features. No significant differences in dMFC activity between congruent and incongruent trials were present when RT was held constant (). Finally, a comparison of fast incongruent and slow congruent trials resulted in larger BOLD responses for the congruent condition (; though error rates were significantly higher on fast incongruent trials – see above). Again, BOLD activity was related to response duration, not conflict; taken together, these data are inconsistent with the conflict monitoring model.
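As a rough sketch of the two steps in this paragraph, the code below averages BOLD segments time-locked to trial onsets and selects trials with RTs within 100 ms of the global median. The function names, the baseline correction (subtracting the first sample of each segment), and the synthetic signal are all assumptions made for illustration; the preprocessing actually used in the study is not specified here.

```python
import numpy as np

def event_related_average(bold, onsets, window):
    """Average BOLD segments time-locked to trial onsets (in samples).

    Assumption: each segment is baseline-corrected by subtracting its
    first sample; segments running past the end of the series are skipped.
    """
    bold = np.asarray(bold, dtype=float)
    segments = []
    for onset in onsets:
        if onset + window <= len(bold):
            seg = bold[onset:onset + window]
            segments.append(seg - seg[0])
    return np.mean(segments, axis=0)

def near_median(rts, tolerance_ms=100.0):
    """Boolean mask of trials whose RT lies within tolerance of the median."""
    rts = np.asarray(rts, dtype=float)
    return np.abs(rts - np.median(rts)) <= tolerance_ms

# Synthetic time series: the same small response follows each onset
bold = np.zeros(60)
onsets = [5, 25, 45]
for onset in onsets:
    bold[onset:onset + 5] += [0.0, 1.0, 2.0, 1.0, 0.0]

era = event_related_average(bold, onsets, window=5)
mask = near_median([700, 850, 900, 950, 1200])
print(era, mask)
```

Because no response shape is assumed, the averaged segment simply recovers whatever waveform follows the onsets, which is what makes the approach model-free.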
Computational models of conflict monitoring argue that both RT and error likelihood are determined by the degree of conflict in the decision (Botvinick et al., 2001; Siegle et al., 2004). The stronger the activation of the incorrect response, the greater its interference with the correct response, leading to more errors and slower RTs. Moreover, for any given set of initialization parameters, the model produces a one-to-one relationship between the three variables (Fig S2). Thus, the model predicts that error likelihood and RT can both be used as measures of conflict, and more importantly, that this relationship depends only on the input to the stimulus layer (i.e. the activation of the color and word units). If the three variables were not one-to-one, that is, if the relationship between error likelihood, RT, and conflict changed with context (e.g. congruency), this would provide evidence against the model.
We tested whether the relationship between error likelihood and RT remained constant (one-to-one). This was done by splitting each subject’s RT distribution into ten equal quantiles. Error rates for congruent and incongruent trials were compared within each quantile to determine the degree of response conflict for each condition. A plot of error likelihood as a function of RT () shows that incongruent trials generate higher error rates than congruent trials for each RT (paired t-test of congruent quantiles vs. incongruent quantiles, p = 0.033, df = 9). In addition, incongruent trials have higher error likelihood than congruent trials at the majority of RT quantiles (; paired t-test of congruent vs. incongruent trials within each quantile, p < 0.05, df = 22). These data demonstrate that the frequency of conflict-induced errors is not uniquely related to RT as predicted by the conflict monitoring model (Fig S2).
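The quantile analysis for a single subject can be sketched as below. The function name and the synthetic trials are hypothetical, and the sketch assumes that quantile boundaries are defined on the subject's pooled RT distribution (congruent and incongruent trials together) using rank-based, equal-sized bins.

```python
import numpy as np

def error_rate_by_rt_quantile(rts, errors, incongruent, n_quantiles=10):
    """Error rate per RT quantile, separately for each congruency condition.

    Returns an array of shape (2, n_quantiles): row 0 = congruent,
    row 1 = incongruent. Empty cells are left as NaN.
    Assumption: bins are equal-sized groups of the pooled, RT-sorted trials.
    """
    rts = np.asarray(rts, dtype=float)
    errors = np.asarray(errors, dtype=bool)
    incongruent = np.asarray(incongruent, dtype=bool)

    # Rank trials by RT, then map ranks onto n_quantiles equal-sized bins
    order = np.argsort(rts, kind="stable")
    bins = np.empty(len(rts), dtype=int)
    bins[order] = np.arange(len(rts)) * n_quantiles // len(rts)

    rates = np.full((2, n_quantiles), np.nan)
    for q in range(n_quantiles):
        for cond in (0, 1):
            sel = (bins == q) & (incongruent == bool(cond))
            if sel.any():
                rates[cond, q] = errors[sel].mean()
    return rates

# Synthetic subject: 20 trials, alternating congruent / incongruent;
# errors occur only on incongruent trials with RT >= 600 ms
rts = np.arange(20) * 10 + 500
incongruent = np.arange(20) % 2 == 1
errors = incongruent & (rts >= 600)
rates = error_rate_by_rt_quantile(rts, errors, incongruent)
print(rates)
```

Comparing the two rows of `rates` quantile by quantile corresponds to asking whether error likelihood differs between conditions at matched RT.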
An analysis of the BOLD activity in dMFC demonstrated a monotonic increase as a function of trial-to-trial RT (). No significant difference between congruency conditions (paired t-test of quantiles, p = 0.90, df = 9) was present. A comparison of congruent and incongruent trials within each quantile showed a significant difference only at a single point, consistent with a false positive rate of 0.05 (paired t-test, p < 0.05, df = 22; mean difference between conditions = −0.0049, equivalent to a signal change of less than 10⁻⁵ %). Furthermore, if dMFC were a conflict detector, then dMFC activity should be proportional to the amount of conflict, as measured by error likelihood, at each RT. However, after controlling for RT, there was no relationship between BOLD activity and error likelihood: the difference in error likelihood () and the difference in BOLD activity () were not correlated (Pearson r = 0.04, p = 0.78). To further quantify this relationship, a sequential (or hierarchical) linear regression was performed on the BOLD data in . For congruent trials, RT explained 43.4% of the variance; the addition of error likelihood to the model increased this value by 4.2%, but the increase was not significant (p = 0.10). For incongruent trials, RT explained 89.7% of the variance. The addition of error likelihood increased this value by less than 1 × 10⁻⁵ %. These data suggest that response conflict cannot explain a significant or physiologically relevant amount of variance in the dMFC.
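A sequential regression of this kind can be sketched as follows: fit BOLD on RT alone, then on RT plus error likelihood, and examine the change in explained variance. The helper names and the synthetic data are hypothetical and do not reproduce the reported values; the F statistic shown is the standard test for an increment in R², assumed here to be the kind of test behind the reported p-values.

```python
import numpy as np

def r_squared(predictors, y):
    """R^2 of an ordinary least-squares fit with an intercept."""
    X = np.column_stack([np.ones(len(y))] + list(predictors))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

def delta_r2_f(r2_reduced, r2_full, n, k_full, k_added=1):
    """F statistic for the R^2 increment when k_added predictors are added."""
    return ((r2_full - r2_reduced) / k_added) / ((1.0 - r2_full) / (n - k_full - 1))

# Synthetic quantile-level data (illustrative only): BOLD depends on RT alone
rng = np.random.default_rng(0)
rt = np.linspace(600.0, 1400.0, 10)           # mean RT per quantile, ms
err = rng.uniform(0.0, 0.1, size=10)          # error likelihood per quantile
bold = 0.001 * rt + rng.normal(0.0, 0.05, size=10)

r2_rt = r_squared([rt], bold)                 # step 1: RT only
r2_full = r_squared([rt, err], bold)          # step 2: RT + error likelihood
f_stat = delta_r2_f(r2_rt, r2_full, n=len(bold), k_full=2)
print(r2_rt, r2_full - r2_rt, f_stat)
```

Entering RT first gives it credit for any variance it shares with error likelihood, so the increment measures only what error likelihood explains beyond RT.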
The quantile analysis was repeated on a voxel-by-voxel basis to determine if any region of the dMFC showed activity consistent with the conflict monitoring model. No significant clusters were found (), even using an extremely lenient significance threshold of p = 0.01, uncorrected for multiple comparisons; this result indicates that the relationship between RT and dMFC activity does not depend on the exact position, shape, or extent of the region of interest tested. It is possible that methodological differences in our Stroop task may have produced brain activity that is uncharacteristic of previous studies. Voxels that showed greater activity on incongruent trials in our Stroop task were compared to those of previous Stroop () and non-Stroop () conflict studies. The position, shape, and extent of our region of interest were remarkably consistent with those of previous studies (). To determine whether “conflict” voxels (from ) were equivalent to “RT” voxels, we performed a novel GLM analysis in which the design matrix consisted of a single RT regressor that did not differentiate between congruent and incongruent trials. The voxels identified by the Incong > Cong contrast from were subsumed by those correlated with the RT-only regressor; this was also true for the majority of voxels from previous studies ().
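One way to build a single RT regressor that ignores congruency is sketched below. How the regressor was actually constructed is not stated above, so everything here is an assumption: each trial is modeled as a boxcar lasting for that trial's RT, convolved with a simple gamma-shaped response function, and sampled once per scan. The function names and parameter values are hypothetical.

```python
import math
import numpy as np

def gamma_hrf(dt, duration=30.0):
    """Simple gamma-shaped response (peak near 5 s), normalized to unit sum.

    Assumption: this stands in for whatever hemodynamic model a real
    analysis package would supply.
    """
    t = np.arange(0.0, duration, dt)
    h = t ** 5 * np.exp(-t) / math.factorial(5)
    return h / h.sum()

def rt_regressor(onsets_s, rts_s, n_scans, tr, dt=0.1):
    """Single regressor: per-trial boxcars of duration RT, convolved with HRF.

    Congruent and incongruent trials are treated identically.
    """
    n_fine = int(round(n_scans * tr / dt))
    neural = np.zeros(n_fine)
    for onset, rt in zip(onsets_s, rts_s):
        start = int(round(onset / dt))
        stop = min(n_fine, start + max(1, int(round(rt / dt))))
        neural[start:stop] = 1.0
    bold = np.convolve(neural, gamma_hrf(dt))[:n_fine]
    step = int(round(tr / dt))
    return bold[::step][:n_scans]

# One trial at 2 s, modeled with a fast and a slow RT (seconds)
reg_fast = rt_regressor([2.0], [0.7], n_scans=20, tr=2.0)
reg_slow = rt_regressor([2.0], [1.2], n_scans=20, tr=2.0)
print(reg_fast.max(), reg_slow.max())
```

Under this construction a longer RT yields a larger predicted response regardless of trial type, which is the property that lets an RT-only model be compared against the Incong > Cong contrast.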
Finally, the quantile analysis was used to test whether any voxels outside the dMFC showed greater activity for incongruent than congruent trials (). Only the left inferior frontal gyrus, which includes Broca’s area, expressed significantly greater BOLD activity per unit time on incongruent trials. This result demonstrates that the lack of significant voxels in dMFC is not due to insufficient statistical power, but rather, to the fact that interference between alternative semantic representations of color is localized to left IFG, not dMFC.