Past studies have suggested that attentional control tasks such as the Stroop task and the task switching paradigm may be sensitive to the early detection of Dementia of the Alzheimer type (DAT). The current study combined these tasks to create a Stroop switching task. Performance was compared across young adults, older adults, and individuals diagnosed with “Very Mild” dementia. Results indicated that this task strongly discriminated healthy aging from early stage DAT. In a logistic regression analysis, incongruent error rates from the Stroop switching task discriminated healthy aging from DAT better than any of the other 18 cognitive tasks given in a psychometric battery.
In their seminal chapter on the role of attention in goal-driven behavior, Norman and Shallice (1986) argued that people need to exert attentional control whenever they encounter situations that involve planning, troubleshooting, technical difficulty, novel sequences of action, or the need to overcome a strong habitual response. As an example of the last situation, imagine a recent retiree driving on the interstate with the goal of visiting a new friend who lives off exit 13. If the previous job she held for 30 years required her to take exit 12, it would not be surprising if she accidentally took this exit and perhaps drove some distance toward her former workplace before noticing the error.
Action slips such as these can lead to confusion, embarrassment, and even fears of dementia. In fact, such errors are often called “senior moments” and are interpreted as a “memory” problem (i.e., forgetting to make the correct turn). However, this memory interpretation is not entirely correct. If asked, it is unlikely that the woman really “forgot” where she was going. Instead, her failure was likely in resisting the strong habit of taking exit 12. As argued by Norman and Shallice (1986), attention is required to keep internal goals active long enough to exert influence over our actions. Because memory is dependent upon attending to goal-relevant information and ignoring goal-irrelevant information during encoding (see Hasher, Zacks, & May, 1999, for a review), breakdowns in the control of attention often manifest themselves as memory problems.
The difficulty in separating memory from attention has important implications for the diagnosis and treatment of Alzheimer’s disease (AD). In fact, Balota and Faust (2001) have argued that DAT individuals have specific difficulties in attentional selection of relevant information over irrelevant information and that this difficulty contributes to observed memory problems. According to Balota and Faust’s attentional control framework, attentional control underlies our ability to orchestrate thought and action in accord with internal goals; that is, to use a goal to modulate competition between relevant and irrelevant information. This view contrasts with the common assumption that memory alone is the earliest and most detectable sign of AD (Grady et al., 1988; Haxby et al., 1988; Lafleche & Albert, 1995; Reid et al., 1996). However, although it is acknowledged that memory impairment occurs early, recent evidence has demonstrated concurrent deficits in executive function even prior to diagnosis of AD (Albert, Moss, Tanzi, & Jones, 2001; Backman, Jones, Berger, Laukka, & Small, 2005). Similarly, while it is widely accepted that medial-temporal brain areas are affected earliest in the progression of AD (Braak & Braak, 1991), frontal areas such as the prefrontal cortex and anterior cingulate also show considerable pathology early on (Killiany et al., 2000; Mintun et al., 2007; van der Flier et al., 2002; Yamaguchi et al., 2001). Because attentional control systems involve complex coordination of multiple sources of information from multiple brain areas, they may be especially vulnerable to early neuropathology accompanying AD (see Balota et al., in press, for more discussion).
The best known paradigm to investigate attentional control is the classic Stroop task (Stroop, 1935), in which participants are told to name the ink color in which words are presented while ignoring the words themselves. The irrelevant words can be either congruent (the word blue written in blue ink), incongruent (the word red written in blue ink), or unrelated to the color (the word poor written in blue ink). Spieler, Balota, and Faust (1996) investigated effects of both healthy aging and DAT on Stroop performance and found that aging and dementia produced distinct patterns of performance. Relative to young adults, healthy older adults produced larger Stroop interference (incongruent − neutral) in reaction times (RTs), but not error rates, suggesting that older adults could successfully overcome the automatic “word” response, but at a cost to RT. In contrast, both very mild and mild DAT individuals, based on the clinical dementia rating scale (CDR 0.5 and CDR 1, respectively; Morris, 1993), produced larger Stroop interference effects in RTs and exaggerated error rates to incongruent stimuli relative to age-matched healthy control individuals. The DAT groups also showed larger facilitation effects (neutral − congruent RT) than young and healthy older adults, indicative of incorrectly reading the word rather than the color on congruent trials (see MacLeod, 1991, for discussion).
Kane and Engle (2003) provided evidence for two separable control processes necessary in the Stroop task: Goal maintenance and response competition. According to Kane and Engle, goal maintenance reflects the ability to maintain the appropriate task set (e.g., respond “color” and ignore “word”) across trials whereas response competition reflects the ease with which people can select between appropriate and inappropriate competing response tendencies. A loss in goal maintenance results in quick errors in which the person simply responds to the word without any influence from the potentially competing color. Behaviorally, this manifests as both an increase in errors to incongruent stimuli and also as an increase in RT facilitation for congruent stimuli (since accidentally reading the word in this condition does not produce an error). This is exactly the pattern found by Spieler et al. (1996) for DAT individuals, and indeed Spieler et al. used the same attentional control interpretation to accommodate these results. However, even with correct goal maintenance, one still must overcome the competition for incongruent stimuli between the inappropriate “word” dimension and the appropriate “color” dimension. Although the healthy older adults showed no deficits in goal maintenance, they did show deficits in the resolution of this response competition, as manifest in longer RTs to incongruent items, relative to neutral items. Castel, Balota, Hutchison, Logan, and Yap (2007) recently found a similar pattern to Spieler et al. (1996) using another response competition paradigm, the Simon task (Simon, 1969), in which participants were told to respond to the direction of an arrow (e.g., → or ←) presented in the left, middle, or right of the computer screen.
In a recent imaging study, De Pisapia and Braver (2006) reported evidence for both the involvement of lateral prefrontal cortex (lPFC) and anterior cingulate (ACC) activation in Stroop performance. Specifically, according to De Pisapia and Braver, lPFC is involved in active maintenance of task goals whereas ACC is necessary for the detection of conflict. If the lPFC fails to maintain the task goal then the person becomes reliant upon the ACC to detect conflict prior to the individual committing an error. If an error does occur, communication between the ACC and lPFC serves to refresh and maintain the goal on subsequent trials. Of course, these two neural systems nicely parallel the operations hypothesized to be involved in attentional control.
Studies such as those by Spieler et al. (1996) and Castel et al. (2007) suggest that attentional selection tasks such as Stroop and Simon, which require both goal maintenance and competition resolution, may be especially sensitive measures of early disruptions in performance created by the onset of AD. In a longitudinal study designed to test this possibility, Balota et al. (in press) tracked the Spieler et al. (1996) healthy older subjects to predict progression to DAT within the next 13 years. Interestingly, those healthy older adults who eventually converted to DAT (N = 12) had incongruent error rates that were 2.2 times higher (M = 17.4%) than those (N = 35) that did not eventually convert to DAT (M = 7.9%). Balota et al. then compared the predictive power of Stroop error rates to that of tasks involving episodic memory (e.g., logical memory & associative recall), simple span (e.g., forward & backward digit span), spatial abilities (e.g., Benton copy & WAIS-R Block Design), and processing speed (e.g., Crossing off task & Trail Making forms A & B). Of all these other measures, only the WAIS-R Block Design significantly discriminated between converters and nonconverters. When entered into a regression equation, incongruent error rates were the strongest predictor of eventual diagnosis of DAT.
Given the potential importance of breakdowns in attentional control systems in early stage DAT, it is possible that tasks that place a premium on control may be especially useful in early discrimination. It is in this light that we explore the task switch paradigm (Allport, Styles, & Hsieh, 1994; Rogers & Monsell, 1995). In this paradigm, two or more tasks are intermixed within a block of trials and the participant must attend to cues that designate the appropriate task for the current trial. For instance, a participant may be shown two numbers and have to switch between adding and subtracting the second number from the first. The general finding from such tasks is that participants are slower and/or less accurate in responding on a “switch” trial than on a non switch trial, as long as the stimuli are compatible with both tasks (Spector & Biederman, 1976). Accurate performance in task switch paradigms requires participants to maintain multiple task sets (i.e., rules that govern the mapping between stimuli and their appropriate responses) in working memory while selecting the appropriate task set for the current trial. On switch trials, participants must engage in task set reconfiguration, which involves deactivating the previously relevant task set and retrieving and enabling the now-relevant one (Rogers & Monsell, 1995). Although some reconfiguration can be initiated endogenously when given enough time between the cue and the next stimulus, a “residual switch cost” still remains. Rogers and Monsell interpreted this residual switch cost as reflecting an exogenous trigger necessary to complete the reconfiguration process (but see Allport & Wylie, 1999, 2000; Waszak et al., 2003; and Wylie & Allport, 2000, for an alternative explanation of residual switch costs).
As with the Stroop task, task switching paradigms have been found to activate regions of PFC (Brass et al., 2005; Braver, Reynolds, & Donaldson, 2003). In addition, task switching typically activates posterior parietal regions (Eppinger, Kray, Mecklinger, & John, 2007; Filoteo et al., 1992), presumably involved in shifts of attention from one stimulus dimension to another. Because both Stroop and task switch paradigms involve related attentional control processes (such as the necessity to maintain and utilize task goals to suppress the more dominant process), it is not surprising that researchers have combined these tasks in order to investigate group differences in attentional control. Versions of Stroop switch tasks have been used to examine presumed cognitive control deficits among older adults (Eppinger, Kray, Mecklinger, & John, 2007), individuals with traumatic brain injury (Perlstein, Larson, Dotson, & Kelly, 2006), ADHD (Wu, Anderson, & Castiello, 2006), and Parkinson’s disease (Woodward, Bub, & Hunter, 2002).
However, to our knowledge only one study (Fine et al., 2008) has used a Stroop switch task to discriminate healthy aging from the type of cognitive decline typically seen in DAT. Fine et al. used a Stroop switching task in which participants were given an incongruent color word presented either inside or outside of a small box on a computer screen. Participants were instructed to name the ink color if the stimulus appeared outside the box and name the word if it appeared inside the box. Participants also performed three other tasks: A standard (color-naming of words only) version of Stroop, color naming of ink patches, and word reading for stimuli presented in black ink. The standardized difference between Stroop switch performance and average performance on the other 3 measures, called a Stroop discrepancy score, was then used to predict an individual’s degree of decline on a dementia-rating scale (DRS; Mattis, 1973) over the following year. Fine et al. compared the predictive power of this discrepancy score for cognitive decline to that of Apolipoprotein E (ApoE) genotype, a known predictor of AD. Specifically, individuals with the e4 allele of this gene have increased risk for developing AD (e.g., Blacker et al., 1997; Corder et al., 1993; Henderson et al., 1995). Of interest, Fine et al. found that those older adults who declined in cognitive performance over the year (as measured by the DRS) had larger Stroop switch discrepancy scores than those whose cognitive performance remained stable. In fact, using logistic regression, the discrepancy score significantly predicted whether or not an individual showed cognitive decline (75% correct classification rate), yet ApoE status did not (67% correct classification rate).
Importantly, when examining the 4 tasks separately, the difference between the decliner and stable groups was greater in the Stroop switch task than any of the other 3 tasks, suggesting the “switch” component may increase the sensitivity of the Stroop task in predicting cognitive decline.
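The logic of this kind of classification analysis can be sketched in a few lines. The example below is purely illustrative: the discrepancy scores and group labels are synthetic, not Fine et al.’s data, and the model is a plain gradient-descent logistic regression with a single predictor rather than whatever software they used.

```python
import math

def fit_logistic(xs, ys, lr=0.1, epochs=5000):
    """Gradient-descent logistic regression with one predictor."""
    w = b = 0.0
    n = len(xs)
    for _ in range(epochs):
        gw = gb = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))  # predicted P(decline)
            gw += (p - y) * x
            gb += p - y
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b

def classification_rate(xs, ys, w, b):
    """Proportion correctly classified with a 0.5 probability threshold."""
    hits = sum((1.0 / (1.0 + math.exp(-(w * x + b))) >= 0.5) == bool(y)
               for x, y in zip(xs, ys))
    return hits / len(ys)

# Synthetic example: "decliners" (1) tend to have larger discrepancy scores.
scores = [0.1, 0.2, 0.3, 0.4, 1.0, 1.1, 1.2, 1.3]
declined = [0, 0, 0, 0, 1, 1, 1, 1]
w, b = fit_logistic(scores, declined)
rate = classification_rate(scores, declined, w, b)
```

With these well-separated synthetic scores the model classifies every case correctly; real classification rates (75% in Fine et al.) reflect overlapping score distributions across groups.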
Although the Fine et al. (2008) results are provocative, there are some limitations that prevent a complete understanding of the relationship between task switching and DAT. One limitation is that their Stroop switch task (the color word interference task, CWIT; Delis, Kaplan, & Kramer, 2001) reports only an overall time to complete the entire block of trials. Reporting only an overall completion time score prevents researchers from examining two critical components of performance: Error analyses and trial-specific switching effects. As discussed previously, errors on incongruent trials are often the most sensitive measure in discriminating performance between healthy older adults and DAT individuals (Balota et al., in press; Castel et al., 2007; Spieler et al., 1996). Using only overall completion times misses this important information (see Balota et al., in press, for further discussion of this issue). Similarly, the overall completion time score renders task switch trials inseparable from task repeat trials. Thus, this measure confounds effects of switching between tasks with overall difficulty of mixing tasks, relative to single task conditions. In fact, several researchers have found that the cost of “mixing” is much greater than the cost of “switching”. Under mixed task conditions, performance even on task repetition trials is much slower and less accurate than under single task conditions (Pashler, 2000). This mixing cost is thought to represent a sustained effort to keep multiple tasks active in working memory, whereas the switch cost presumably reflects the more transient updating or reconfiguration of S-R mappings (Braver, Reynolds, & Donaldson, 2003). Past research has demonstrated that older adults show a large increase in mixing cost, but only a subtle (often nonsignificant) increase in switching costs, relative to young adults (Kramer et al., 1999).
Thus, it is unknown whether Fine et al.’s “cognitive decliners” were impaired relative to the stable group on simply overall RT (due to mixing costs) or on switch trials specifically. This confound of mixing and switching is especially problematic for the Fine et al. study because their “decliner” group was older (M = 78.8) than their “stable” group (M = 74.4) and thus may have been more impaired by overall mixing costs.
Two other factors limit generalization of the Fine et al. (2008) results to the study of dementia and attentional control. First, the samples differed in initial cognitive performance. Not only did the “decliner” group start out older than the “stable” group, but they also started out with lower scores on the DRS. In fact, the “decliner group” started out lower than the “stable” group ended up. Perhaps the decline in DRS accelerates once performance drops beyond certain levels. Second, neither sample progressed into levels typically used as screening cutoffs for DAT. Even though the decliners had lower DRS scores than the “stable” group, their final scores (M = 133.3) were still within the range of healthy older adults (130–144, Rosser & Hodges, 1994). Therefore, although the Fine et al. study is provocative, further work is needed to explore whether Stroop task switching (1) discriminates DAT individuals from healthy controls or (2) predicts eventual progression into DAT from an initially healthy sample. In the current study, we primarily emphasize the power of this paradigm for discrimination.
The present study examined the performance of young adults, healthy older adults, and very mild DAT individuals in a trial-by-trial computerized version of the Stroop switching task. Examining performance on a trial-by-trial basis allows testing for not only overall group differences in the task (reflecting a general difficulty in “mixing” tasks) but also specific deficits among Stroop interference, switch costs, or their potential interaction (i.e., a task switch asymmetry). We also report evidence from psychometric tests that are available on these individuals to determine the extent to which more traditional cognitive measures are useful in discriminating between healthy aging and early stage DAT. It is important to emphasize here that the DAT individuals are at the earliest detectable stage of Alzheimer’s Disease. Indeed, the MMSE scores for the very mild DAT individuals (MMSE = 28.2) are only 1 point lower than the healthy control individuals (MMSE = 29.2) in this sample. The important question is whether the stress placed on the attentional system by the present Stroop switching task will afford better discrimination between these high functioning DAT individuals and the healthy controls, compared to standard psychometric measures.
Overall, it is predicted that performance on this task should decline as a function of both healthy aging and dementia. However, aging and dementia should produce separate patterns of deficits relative to young adults. Specifically, although both healthy older adults and DAT individuals should show overall worse performance than young adults, indicative of difficulty mixing multiple tasks, incongruent error rates should be especially sensitive at discriminating DAT from healthy aging (similar to Spieler et al., 1996 and Castel et al., 2007). In addition, if trial-specific switch costs are due to an automatic retrieval of previous competing S–R mappings (Allport & Wylie, 2000), then we should observe a similar switch cost pattern across all groups, because age-related changes in automatic processes are relatively preserved across the life span (see Balota, Dolan, & Duchek, 2000; Hasher & Zacks, 1988). However, if trial-specific switch costs are due instead to persisting inhibition of the current task set from previous trials, then we might expect reduced costs from the DAT group in whom inhibitory functions have been argued as especially deficient (Balota & Faust, 2001).
Older adult participants were recruited from the Washington University Alzheimer’s Disease Research Center (ADRC), and consisted of 64 healthy older adults and 32 individuals with very mild DAT. There was no significant difference in age (p > .41) between healthy older adults (M = 77.24, SD = 9.80) and DAT individuals (M = 78.78, SD = 5.89). There was also no significant difference in education level (p > .91) between healthy older adults (M = 14.66, SD = 2.61) and DAT individuals (M = 14.72, SD = 2.88). In addition, 30 younger adults (age 25 or younger) were recruited from the Washington University student community and participated for course credit or were paid $10. The younger adults had a mean age of 20.8 years (SD = 1.5).1
The healthy older adults and the individuals with DAT were seen by a physician and completed a battery of psychometric tests approximately once a year. All participants were screened for neurological, psychiatric, or medical disorders with the potential to cause dementia. The inclusion and exclusion criteria for diagnosis of DAT have been described in detail elsewhere (e.g., Morris, McKeel, Fulling, Torack, & Berg, 1988; Morris, 1993) and conform to those outlined in the criteria of the National Institute of Neurological and Communicative Disorders and Stroke—Alzheimer’s Disease and Related Disorders Association (McKhann et al., 1984). Dementia severity for each individual was staged in accordance with the Washington University Clinical Dementia Rating (CDR) Scale (Hughes, Berg, Danziger, Coben, & Martin, 1982; Morris, 1993). According to this scale, a CDR of 0 indicates no cognitive impairment, a CDR of 0.5 indicates very mild dementia, a CDR of 1.0 indicates mild dementia, and a CDR of 2.0 indicates moderate dementia. At the Washington University Medical School ADRC, a CDR of 0.5 has been found to accurately indicate the earliest stages of AD (Morris, McKeel, & Storandt, 1991). All of the current very mildly demented individuals had a CDR 0.5 rating. Both the reliability of the CDR and the validation of the diagnosis (based upon autopsy) by the research team have been excellent (93% diagnostic accuracy) and well documented (e.g., Berg et al., 1998).
In addition to participating in the experimental task, all of the older adults participated in a two hour battery of psychometric tests as part of a larger longitudinal study of cognitive performance in healthy aging and DAT. The results from the psychometric tests are displayed in Table 1. The Wechsler Memory Scale included the Logical Memory immediate test (recall of scoring units 0–23), Logical Memory delayed test (recall of scoring units 0–25), Forward and Backward Digit Span (# correct digits, 0–8 or 0–7, respectively), Associate Memory Recognition (0–7) and Recall (0–21) (Wechsler & Stone, 1973), and the Selective Reminding Task (Grober, Buschke, Crystal, Bang, & Dresner, 1988). The Wechsler Adult Intelligence Scale (WAIS) included the Information (scoring range 0–29), Block Design (scoring range 0–48), and Digit Symbol (scoring range 0–90) subtests, scored according to the manual (Wechsler, 1955). Participants also received the Benton Visual Retention Test and the Benton Copy Test (# correct, # errors) (Benton, 1963), and Part A and Part B of the Trail Making Test (# of seconds to complete) (Armitage, 1945). Part B of the Trail Making Test not only assesses visual perceptual-motor performance, but also requires the ability to alternate between well-learned sequences (alphabet and numbers). Tests of semantic/lexical retrieval and word fluency included the Boston Naming Test (Goodglass & Kaplan, 1983a) and the Animal (Goodglass & Kaplan, 1983b) and Word S–P (Thurstone & Thurstone, 1949) Fluency tests (# correct of 60; # named in 1 min., respectively). The National Adult Reading Test was used as an assessment of word comprehension. Finally, the Mini Mental State exam (MMSE) was given as an assessment of overall cognitive functioning.
Psychometric tests are scored such that greater scores indicate better performance, with the exception of Trail Making A and B and Benton copy errors, for which higher scores indicate poorer performance. Psychometric testing always occurred within a two-month window of the Stroop switch task testing session. As shown in Table 1, as expected, the very mild DAT group performed more poorly than the healthy older group on most tests. Because younger adults were not recruited by the ADRC, they did not receive the psychometric battery.
The experiment was run on a Pentium II IBM-compatible computer, with a standard 15 inch monitor, and was implemented using E-prime software. Participants viewed the display from an approximate distance of 50 cm. The response cues (“color” or “word”) and Stroop stimuli were presented in Courier New size 16 font in the center of the screen against a black background. The Stroop stimuli were taken from Spieler et al. (1996) and consisted of either color words (red, green, blue, and yellow) or neutral words (bad, deep, poor, and legal) matched to the color words in phoneme characteristics and printed word frequency.
Participants received 144 experimental trials consisting of 68 neutral trials and 76 incongruent trials. The four color words were presented 19 times and appeared in each of the three incongruent colors (e.g., the word “red” presented in green, blue, or yellow). Similarly, the matched neutral words were presented 17 times each and appeared in each of the same three colors as their matched color word (e.g., the word “bad” presented in green, blue, or yellow). There were no congruent trials in the experiment.
The Stroop task was embedded in a battery of other tasks investigating memory and attention performance that lasted approximately 2 hours. At the beginning of the Stroop task, participants were given verbal instructions regarding the nature of the task. Participants were instructed that they would be cued prior to each trial whether they should name the color in which a word is presented or whether they should name the word itself. Either the cue “color” or the cue “word” was presented in the center of the screen for 1400 ms and was immediately followed by the Stroop stimulus. The “color” and “word” cues were presented in an alternating runs (e.g., AABBAABB) fashion (Rogers & Monsell, 1995) such that every 2 trials the participant would switch their responding from one dimension (color or word) to the other. This paradigm allows for the comparison of performance on switch trials (i.e., AB, BA) with performance on non switch trials (i.e., AA, BB). As noted by Wu et al. (2006), however, using this sequence does not provide a “pure” measure of switch costs because costs can persist beyond the first trial of a sequence (Allport & Wylie, 2000). Nonetheless, previous researchers have consistently found larger costs on the first trial (the current “switch” trial) than on the next trial (the current “non switch” trial), suggesting that this difference can still capture some component of switching difficulty.2
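The alternating-runs structure can be made concrete with a short sketch (illustrative code, not the original E-prime script) that generates the AABB cue sequence and labels each trial as switch or non switch:

```python
def alternating_runs(n_trials, tasks=("color", "word")):
    """Generate an AABBAABB... cue sequence (Rogers & Monsell, 1995).
    The first trial of each two-trial run is a "switch" trial and the
    second is a "non switch" trial; the very first trial has no
    preceding task, so it is labeled "non switch" here."""
    trials = []
    for i in range(n_trials):
        task = tasks[(i // 2) % len(tasks)]              # AABB pattern
        kind = "switch" if (i % 2 == 0 and i > 0) else "non switch"
        trials.append((task, kind))
    return trials
```

For example, `alternating_runs(8)` yields color, color, word, word, color, color, word, word, with every odd-numbered run onset labeled a switch trial.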
Participants were given 8 sample trials in the alternate runs format with the microphone turned off to establish that they understood the instructions. If the participant responded correctly on the sample trials, they were presented with 16 practice trials with the microphone turned on and were asked to respond as quickly and accurately as possible into the microphone. Following the practice trials, participants received the 144 experimental trials. The words and colors were presented in a fixed-random order with the constraint that neither the word nor its color appeared in the immediately preceding trial, preventing immediate positive or negative priming from one trial to the next. Self-paced rest breaks were given approximately every 40 trials. Participants’ RTs were recorded with ms accuracy using E-prime’s PST serial response box and an experimenter coded each response as (a) correct response, (b) response error, or (c) microphone error. Response errors consisted of either responding with the wrong word (e.g., responding “green” to the word “green” written in blue) or responding with a blended word (e.g., “gre-blue”).
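The fixed-random ordering with its no-repeat constraint can be sketched as a constrained shuffle. The code below is again illustrative rather than the original trial-list generator, and the stimulus counts in the usage note are reduced for brevity:

```python
import random

def constrained_order(stimuli, seed=0, max_restarts=1000):
    """Order (word, color) stimuli so that neither the word nor the ink
    color of a trial matches the immediately preceding trial.  Uses a
    simple greedy draw with random restarts when it dead-ends."""
    rng = random.Random(seed)
    for _ in range(max_restarts):
        pool, seq = list(stimuli), []
        while pool:
            candidates = [s for s in pool
                          if not seq or (s[0] != seq[-1][0]
                                         and s[1] != seq[-1][1])]
            if not candidates:
                break                     # dead end: reshuffle and retry
            choice = rng.choice(candidates)
            seq.append(choice)
            pool.remove(choice)
        if len(seq) == len(stimuli):
            return seq
    raise RuntimeError("no valid order found")
```

For instance, a small set such as four words crossed with three ink colors can be ordered this way so that no word and no color repeats across adjacent trials.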
Only correct responses were considered for the RT analyses. A separate mean and standard deviation were computed for neutral and incongruent stimuli and for color and word cues. Outliers were removed using the modified nonrecursive procedure suggested by Van Selst and Jolicoeur (1994). This procedure removed 2.9% of the correct RTs. Task Switching effects were computed by subtracting the mean RT in the non switch condition from the switch condition. Response Cue effects were computed by subtracting the mean RT in the word cue condition from the color cue condition. Stroop interference effects were computed by subtracting the mean RT in the neutral item condition from the incongruent item condition.
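These difference scores are straightforward once the RT distributions are trimmed. The sketch below uses hypothetical condition labels and a simplified fixed 2.5 SD cutoff per condition; note that the actual analysis used Van Selst and Jolicoeur’s (1994) modified nonrecursive procedure, whose SD criterion varies with cell size:

```python
from statistics import mean, stdev

def trim_and_effects(rts_by_condition, sd_cutoff=2.5):
    """Trim correct RTs beyond a fixed SD cutoff within each condition
    (a simplification of the moving-criterion procedure), then compute
    the three difference-score effects reported in the text."""
    trimmed = {}
    for cond, rts in rts_by_condition.items():
        m, s = mean(rts), stdev(rts)          # needs >= 2 RTs per cell
        trimmed[cond] = [rt for rt in rts if abs(rt - m) <= sd_cutoff * s]
    return {
        "switch_cost": mean(trimmed["switch"]) - mean(trimmed["non switch"]),
        "cue_effect": mean(trimmed["color cue"]) - mean(trimmed["word cue"]),
        "stroop": mean(trimmed["incongruent"]) - mean(trimmed["neutral"]),
    }
```

Each effect is simply the trimmed mean of the harder condition minus the trimmed mean of the easier condition, so positive values indicate a cost.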
The trimmed RTs for each participant were also transformed into z-scores based upon each participant’s overall trimmed mean RT and standard deviation. This transformation accounts for group differences in overall RT and variability which can either produce spurious group x difference score interactions or mask true group x difference score interactions (see Faust, Balota, Spieler, & Ferraro, 1999; Hutchison, Balota, Cortese, & Watson, 2008, for more discussion). Rather than report redundant RT analyses using both measures, we only report RT effects that were confirmed by the z-score analyses.
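The z-transformation described above is simply each participant’s trimmed RTs standardized against that participant’s own overall mean and SD, so that condition effects are expressed in units of within-person variability. A minimal sketch:

```python
from statistics import mean, stdev

def z_transform(rts):
    """Standardize a single participant's trimmed RTs against his or her
    own overall mean and SD.  Expressing effects in these units keeps
    group differences in overall speed and variability from producing
    spurious (or masking true) group x difference score interactions
    (Faust, Balota, Spieler, & Ferraro, 1999)."""
    m, s = mean(rts), stdev(rts)
    return [(rt - m) / s for rt in rts]
```

For example, RTs of 400, 500, and 600 ms become −1, 0, and +1, regardless of whether the participant is overall fast or slow.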
Arithmetic means for young, old, and very mild DAT participants based on individual participants’ trimmed mean RTs, z-score transformed means, and errors are presented in Table 2. We first compared young with healthy older participants and DAT participants in order to examine effects of aging and dementia. We then examined the ability of incongruent error rates in the Stroop switch task to predict DAT status in the older adults (CDR 0 vs. CDR 0.5) and compared this to that of the 18 tests in the psychometric battery. Unless otherwise noted, each effect referred to as statistically significant is associated with a two-tailed p < .05.
RTs were analyzed using the general linear model with Response Cue, Switch, and Stroop Interference conditions varied within-subjects and Group as a between-subjects variable. As anticipated, participants were faster to respond to words than colors, faster to respond to neutral items than incongruent items, and young adults responded faster than healthy older adults, who, in turn, responded faster than DAT individuals. These observations were confirmed by main effects of Response Cue [F (1, 121) = 97.21, MSE = 49,001], Stroop Interference [F (1, 121) = 99.92, MSE = 14,901], and Group [F (2, 93) = 16.42, MSE = 449,880]. In addition to these main effects, there were two significant interactions in trimmed RTs and z-scores. First, as shown in the top half of Figure 1, the predicted Switch x Response Cue interaction was significant [F (1, 121) = 16.86, MSE = 10,871], indicating costs when switching to a word response (34 ± 19 ms), but a marginal switch benefit when switching to a color response (−23 ± 24 ms). [Hereafter, when reporting an X ± Y ms effect, Y refers to the 95% confidence interval.] Although the significant cost when switching to word responses replicates Allport et al. (1994) and others, the benefit when switching to a color response does not. As can be seen from Figure 1, this benefit when switching to a color response occurred primarily for the very mild DAT group (switch benefit of 50 ± 45 ms), whereas the young and older adults showed no such benefit (switch benefits of 6 ± 45 ms and 15 ± 31 ms, respectively), consistent with the literature. Second, Stroop Interference was three times larger when participants were cued to respond to the color dimension (122 ± 29 ms) than when they were cued to respond to the word dimension (42 ± 22 ms), F (1, 121) = 15.78, MSE = 22,047. This Stroop Interference x Response Cue interaction is shown in the top half of Figure 2.
As can be seen, all three groups showed larger Stroop interference (incongruent – neutral) during color naming than word naming. Although less in word naming than color naming, significant Stroop Interference in word naming is of interest, since incongruent color names do not typically produce such interference (MacLeod, 1991). This finding is common, however, on switch trials in the Stroop switching task (Allport et al., 1994) and has been taken as evidence for incomplete reconfiguration of the new task set (i.e., name the word not the color), allowing for competition from the competing color dimension.
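The interference and switch-cost effects discussed here are simple difference scores (incongruent minus neutral RT; switch-trial minus repeat-trial RT). A minimal sketch of these contrasts, using hypothetical condition means rather than the study's data:

```python
# Minimal sketch of the difference scores reported above.
# The RT values passed in below are hypothetical placeholders,
# not means from the study.

def stroop_interference(incongruent_rt, neutral_rt):
    """Stroop interference = incongruent minus neutral RT (ms)."""
    return incongruent_rt - neutral_rt

def switch_cost(switch_rt, nonswitch_rt):
    """Switch cost = switch-trial minus repeat-trial RT (ms).
    A negative value is a switch benefit."""
    return switch_rt - nonswitch_rt

# Hypothetical condition means (ms), cued to the color dimension:
color_interference = stroop_interference(850, 728)  # 122 ms interference
word_switch_cost = switch_cost(700, 666)            # 34 ms switch cost
```

A positive interference or switch-cost score indicates slowing relative to the baseline condition, matching the sign conventions used in the text.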
Errors were analyzed in the same manner as RTs. The same main effects found for RTs were also obtained in error rates. As can be seen from Table 2, participants made fewer errors to words than colors, fewer errors to neutral items than incongruent items, and young adults made fewer errors (3.9%) than older adults (11.5%) who, in turn, made fewer errors than VM DAT individuals (20.3%). These observations were confirmed by main effects of Response Cue [F (1, 123) = 80.40, MSE = 453], Stroop Interference [F (1, 123) = 82.14, MSE = 105], and Group [F (2, 123) = 26.28, MSE = 637]. In addition, there was a main effect of Task Switching [F (1, 123) = 12.39, MSE = 61] in which participants made more errors on switch trials than nonswitch trials.
The RT interactions were replicated in error rates. There was a marginal Switch x Response Cue interaction [F (1, 123) = 3.75, MSE = 48, p < .06], indicating a cost when switching to a word response (2.7 ± 1.3%), but not when switching to a color response (0.9 ± 1.5%). These data are shown in the bottom half of Figure 1. There are three interesting aspects of this finding. First, all three groups showed significant costs when switching to a word response (3.0%, 2.0%, and 3.3% for young, healthy older, and Very Mild DAT, respectively). As argued by Allport et al. (1994), the nondominant S–R mapping for color naming requires a more strongly imposed task set. As a result, switch costs (which are created when items associated with the previous task set emerge in a different task) are stronger during word reading trials. This explains why one typically finds more interference switching from color naming to word reading than vice-versa. Of current debate, however, is whether such interference (typically found in young adults) is due to persisting inhibition or automatic retrieval of the previously irrelevant task set. The lack of any group difference in this pattern supports the hypothesis by Allport and Wylie (2000) that the cost reflects an automatic retrieval of incompatible S–R mappings that must be overcome, rather than a persisting inhibition from control exerted over an incompatible task set on the previous trial. If the latter were true, one would expect greater costs when switching to a word response for young adults and healthy older adults, than for DAT individuals.
Second, unlike the RTs, there was no switch benefit observed for DAT individuals in switching to a color response. Thus, if the switch benefit for DAT individuals switching to a color response observed in the RT analyses is a real, rather than spurious, effect, then it likely reflects the resolution of response competition rather than goal maintenance. Third, the asymmetrical switch cost effect did not significantly interact with stimulus type (p > .09). In fact, for our healthy older adults and DAT participants, the “neutral” items showed more of a switch asymmetry than the incongruent items. This is perhaps surprising in that neutral word names should only mildly interfere with color naming (MacLeod, 1991). However, it should be stressed that these items were not truly “neutral” because both the ink color and the word itself were valid responses, whereas in most Stroop studies the neutral item, such as a row of Xs, is never a valid response. Thus, the “neutral” words produced more interference than in most studies, particularly for our older and DAT participants.
There was also a significant interaction between Response Cue and Group, F (2, 123) = 19.06, MSE = 453. As can be seen in Figure 1, young adults showed no difference in responding to colors versus words (1.3 ± 5.5%) whereas the difference in responding to color versus words increased for older adults (12.0 ± 3.8%) and dramatically for the very mild DAT group (24.8 ± 5.3%). Follow-up analyses confirmed that the effect of Response Cue significantly differed across all three groups. This pattern suggests that young adults can successfully inhibit the more dominant word response in favor of a color response. By contrast, older adults and DAT individuals have much less success inhibiting the dominant word response, likely making fast “word” responses before a conflict is even detected.
There were three interactions involving Stroop Interference, and these can be seen in the bottom half of Figure 2. First, overall Stroop Interference (collapsed across Response Cue) was not significant for young adults (0.6 ± 2.6%), but increased for older adults (6.3 ± 1.8 %) and for DAT individuals (11.8 ± 2.5 %), F (1, 123) = 18.39, MSE = 105. Second, as expected, Stroop Interference interacted with Response Cue, F (1, 123) = 114.36, MSE = 134. Large positive Stroop Interference was obtained when participants were cued to respond to the color dimension (14.5 ± 2.6%), but small negative Stroop Interference occurred when participants were cued to respond to the word dimension (−2.1 ± 1.2%). Finally, both of these 2-way interactions were qualified by a significant three-way Response Cue x Stroop Interference x Group interaction [F (2, 123) = 20.43, MSE = 134]. Both the positive Stroop Interference during color naming and the negative Stroop Interference during word naming increased across age and dementia, though only the interference increase for color naming was significant. When responding to words, the small negative Stroop Interference was not significant for either young adults (−1.3 ± 2.4%) or healthy older adults (−1.6 ± 1.7%), but was significant for DAT individuals (−3.4 ± 2.3%). When responding to color, Stroop Interference increased 11.8% from young adults (2.4 ± 4.9%) to older adults (14.2 ± 3.3%) and an additional 12.7% from older adults to DAT individuals (26.9 ± 4.9%). Separate post hoc ANOVAs comparing 2 groups at a time revealed that both of these increases in Stroop Interference for color naming across Group were significant. As with the Response Cue x Group interaction above, this Stroop Interference x Group interaction replicates earlier findings by Spieler et al. (1996) and Castel et al. (2007) that error rates in the Stroop task can be a particularly useful discriminator between healthy aging and early stage DAT. 
Healthy older adults produced fewer errors than DAT individuals in both Stroop Interference and Response Cue effects. However, the findings from this paradigm differ from the earlier studies in that errors also discriminated younger from older adults. It is possible that adding the switching component to this congruency task disrupted older adults’ ability to maintain the appropriate task set, yet did not cause such problems for young adults. Indeed, the very low overall error rate for young adults (3.9 ± 3.3%) is impressive given the complexity of this task.
As mentioned in the introduction, Balota et al. (in press) found that incongruent errors in a Stroop task were the best predictor of which healthy non-demented individuals at the time of testing would become diagnosed with DAT over the next 13 years. The other psychometric measures given to individuals in that study included twelve of the eighteen measures used in the current study (shown in Table 1), including tasks involving episodic memory (e.g., logical memory, associative recall), simple span (e.g., forward & backward digit span), spatial abilities (e.g., Benton copy & WAIS-R block design), and processing speed (e.g., Trail Making forms A & B). As with Balota et al., we chose to directly compare the ability of incongruent error rates in discriminating very mild DAT from healthy aging [t (94) = 4.35, p < .001] to that of the other psychometric tests (t-values shown in Table 1). Based upon previous work by Balota et al. (in press), Castel et al. (2007), and Spieler et al. (1996), it was predicted that errors to incongruent items during the color naming portion of the current Stroop Switch task would be especially sensitive at discriminating healthy aging from the onset of dementia and would perhaps even outperform the currently available psychometric tasks.
The intercorrelations among the 18 psychometric tasks (as well as incongruent error rates in the current study) are presented in Table 3. It is of interest that incongruent error rates produced significant correlations with all 18 psychometric tests. Thus, it is possible that incongruent error rates provide a general assessment of overall cognitive functioning, such as one’s degree of attentional control. It has been argued elsewhere (Balota et al., 1999; Balota & Faust, 2001) that a single construct such as attentional control may in fact underlie performance deficits in DAT across a wide range of cognitive tasks. Also of interest in Table 3 is the degree of intercorrelations among the psychometric tests themselves. Such intercorrelation among predictors can create the problem of multicollinearity (Cohen, Cohen, West, & Aiken, 2002; Hosmer & Lemeshow, 2000) in interpretation of regression coefficients. Namely, if predicted variance in the dependent measure is highly shared among multiple predictor variables, the contribution of each variable could be underestimated.
In order to diminish the problem of multicollinearity, we used scores on 16 of the 18 tasks (excluding MMSE and NART) to create unit-weighted composite scores that, based upon previous research, represent abilities across 6 cognitive domains or abilities. The composite score for each domain was the average z-score per participant across each task. These scores (and their component tasks) included a Memory score (Logical memory immediate, Logical memory delayed, Associative recognition, Associative recall, Selective reminding), a Speed/Switching score (Trails A, Trails B, Digit-Symbol), a Spatial Ability score (WAIS Block, Benton Copy), a Digit Span score (Digits forward, Digits backward), a Knowledge score (WAIS information, Boston Naming), and a Verbal Fluency score (Word Fluency, Animal Fluency).
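The unit-weighted composite computation described above — standardize each task across participants, then average each participant's z-scores over the tasks in a domain — can be sketched as follows. The raw scores and task groupings in the example are hypothetical placeholders, not the study's data:

```python
import statistics

def z_scores(raw):
    """Standardize one task's raw scores across participants."""
    mean = statistics.mean(raw)
    sd = statistics.stdev(raw)
    return [(x - mean) / sd for x in raw]

def composite(task_score_lists):
    """Unit-weighted composite for one cognitive domain:
    average each participant's z-scores across the domain's tasks.
    Each inner list holds one task's raw scores, participant-aligned."""
    z_by_task = [z_scores(task) for task in task_score_lists]
    n_participants = len(task_score_lists[0])
    return [statistics.mean(task[i] for task in z_by_task)
            for i in range(n_participants)]

# Hypothetical raw scores for two tasks in one domain, three participants:
speed_composite = composite([[30, 40, 50],     # e.g., Trails A
                             [60, 80, 100]])   # e.g., Trails B
```

Note that in practice timed tasks such as Trails (where higher scores mean worse performance) would need their sign reversed before averaging with accuracy-based tasks; the sketch omits that step.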
A logistic regression predicting DAT status (very mild versus healthy older adult) was conducted using incongruent errors (transformed into z-scores) plus the composite scores for each of the 6 cognitive domains (Memory, Speed/Switching, Verbal Fluency, Spatial Ability, Digit Span, and Knowledge).3 The results of this analysis are shown in Table 4. The correct classification rate was 81%, which was significantly [χ2 (7, N = 91) = 27.20] above the base rate classification of 68% if one simply predicted that all participants in the sample were CDR 0 (68% of all older adults in the sample were CDR 0). As can be seen, both incongruent errors and Memory significantly discriminated between CDR 0 and 0.5 individuals.4
In addition to the analysis above, we conducted two logistic regression analyses that investigated (a) the ability of incongruent color naming errors to improve predictability of DAT beyond predictions based upon each of the 18 current psychometric tests and (b) the ability of each of the current psychometric tests to improve predictability of DAT beyond predictions based solely on incongruent color naming errors. All 18 psychometrics were entered individually in this analysis because (a) comparing incongruent error rates to a single test at a time reduces the multicollinearity problem (i.e., shared variance among multiple predictor variables) that motivated the earlier analysis and (b) the predicted variance of any individual test might be diluted by its inclusion within a general composite score. The results from these two logistic regressions are summarized in Tables 5a and 5b. As can be seen in Table 5a, 12 of the 18 psychometric tests significantly predicted DAT when entered in the initial step of the logistic regression equations. Importantly, adding incongruent color naming errors significantly increased the predictability of DAT regardless of which psychometric test was entered first.
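The logic of these stepwise comparisons — asking whether adding a second predictor improves fit beyond a nested one-predictor model — can be sketched as a likelihood-ratio test between nested logistic regressions. The log-likelihood values in the example are hypothetical placeholders, not fits to the study's data:

```python
import math

def lr_test_df1(loglik_reduced, loglik_full):
    """Likelihood-ratio test for adding one predictor to a nested
    logistic regression (df = 1). Returns (chi_square, p_value).

    For df = 1 the chi-square survival function reduces to
    erfc(sqrt(x / 2)), so no external stats library is needed."""
    chi_sq = -2.0 * (loglik_reduced - loglik_full)
    p = math.erfc(math.sqrt(chi_sq / 2.0))
    return chi_sq, p

# Hypothetical log-likelihoods: a model with one psychometric test alone,
# then the same model after adding incongruent color naming errors.
chi_sq, p = lr_test_df1(loglik_reduced=-50.0, loglik_full=-48.0)
```

A significant chi-square here would indicate that the added predictor carries discriminative information beyond the baseline test, which is the question Tables 5a and 5b address in both orders of entry.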
Although incongruent errors increased the predictability of DAT above and beyond that of psychometric tests, this was not true for 17 of the 18 psychometric tests when the order of entry was reversed. As can be seen in Table 5b, the selective reminding task was the only task that significantly increased R2 when entered after incongruent color naming errors (with marginal improvements from delayed logical memory and associative recall). None of the other remaining variables helped to categorize individuals as DAT versus healthy control once incongruent errors were already entered.
In summary, among all psychometric variables, incongruent error rate was the best discriminator of healthy older adults from very mild DAT individuals, followed by the selective reminding task. Of all 18 psychometric variables, only the selective reminding task significantly boosted R2 when entered after incongruent Stroop errors, whereas Stroop errors boosted R2 regardless of which task it followed. These results suggest that the current Stroop switch task and episodic memory measures are best for early discrimination of healthy aging from DAT.
The purpose of the present investigation was to examine how well a switching version of the Stroop task could capture attentional control deficits that presumably accompany aging and Alzheimer’s dementia. The results indicated the Stroop switch task was sensitive to group differences in age and, importantly, exceeded the performance of current psychometric tests in discriminating healthy aging from early stage DAT.
As expected, when cued to respond to color, individuals with very mild DAT produced larger Stroop interference in error rates than healthy older adults. This finding replicates past studies by Spieler et al. (1996) and Castel et al. (2007) in which very mild DAT individuals showed increased errors in the incongruent Stroop and Simon conditions, respectively. This pattern of errors on incongruent trials has become a signature of early stage DAT. In the current study, it is likely that the added requirement of switching between naming words and colors weakened participants’ ability to consistently ignore words in favor of color naming. This is especially true for healthy older adults and early stage DAT individuals, who showed considerable increases in errors during color naming trials relative to young adults (see Figure 2). In contrast to the Spieler et al. (1996) and Castel et al. (2007) results, we found that Stroop interference errors on color naming trials were also greater for older adults than young adults. However, as previously mentioned, it is likely that the addition of the switch component to the Stroop task increased the attentional demands of the task and thus disrupted older adults’ ability to maintain the appropriate task set.
In contrast to the robust group differences in Stroop interference observed in the current study, there was little-to-no difference in cross-trial switch costs across groups. All three groups showed larger costs when switching to word naming than when switching to color naming. This pattern replicates the interesting switch cost asymmetry first observed by Allport et al. (1994) and lends support to involuntary retrieval accounts of residual switch costs (Allport & Wylie 1999, 2000; Waszak et al., 2003; Wylie & Allport, 2000) in which stimuli involuntarily invoke previous S–R mappings, rather than to a persisting controlled inhibition of the previously irrelevant (but currently relevant) task (Mayr & Keele, 2000; Gilbert & Shallice, 2002; Yeung & Monsell, 2003). If the latter were true, then one would have expected diminished switch costs among DAT individuals, in whom controlled inhibitory processes are presumably deficient.
Our regression analysis further bolstered the claim that Stroop interference errors are particularly sensitive to early stage DAT. The ability of incongruent errors in the Stroop Switch task to predict DAT status was examined using logistic regression and this discrimination ability was compared to that of the 18 psychometric measures. Of particular interest, incongruent error rates were better at discriminating early stage DAT from healthy controls than each of the standard 18 psychometric tests available on these participants. Moreover, only one of those tasks (the selective reminding task) significantly predicted DAT status when entered into the logistic regression analysis after incongruent errors.
As discussed previously, frontal brain areas such as the prefrontal cortex and anterior cingulate show considerable pathology early in DAT (Killiany et al., 2000; Mintun et al., 2007; van der Flier et al., 2002; Yamaguchi et al., 2001). Given that both the Stroop task and Task Switch paradigms have been shown to activate such frontal regions (Brass et al., 2005; Braver, Reynolds, & Donaldson, 2003; De Pisapia and Braver, 2006) it is not surprising that a task combining these paradigms is sensitive to early impairments accompanying DAT. An abbreviated version of this task may therefore be useful as part of the diagnostic tools available to clinicians, as suggested by Fine et al. (2008).
In addition to the Stroop Switch task, it is of interest that average performance across the 5 Memory tasks also discriminated very mild DAT from controls. Of course, this finding is also not surprising, since many studies have found evidence for memory measures as early discriminators between healthy aging and DAT (see Bäckman et al., 2005, for a review) and medial temporal brain areas show some of the earliest neuropathology linked to DAT (Braak & Braak, 1991). In a recent meta-analysis, Bäckman et al. (2005) reported that the largest effect sizes in discriminating DAT from healthy aging came from tasks measuring episodic memory (e.g., California Verbal Learning Test, Logical Memory tests), executive functioning (e.g., Stroop task, Trails B), and perceptual speed (e.g., Digit-Symbol task, Letter cancellation task). The current evidence for the importance of the Stroop Switch task and Memory measures for early discrimination is therefore in line with the meta-analysis of Bäckman et al.
Within the memory domain, it is of interest that the selective reminding task was especially effective at discriminating DAT. As with the current study, past research has demonstrated that this task can accurately discriminate healthy aging from DAT (Grober et al., 1988; Grober et al., 2000). In the selective reminding task (Grober et al., 1988), participants are presented 16 stimuli at study and then are queried for the immediate cued recall of each stimulus item. If the participant fails to retrieve the item, they are then given the cue and item together. After a short delay, the participants are given a free recall test for all 16 items and then a cued recall test for those items not recalled during the free recall period. Importantly, this entire procedure is then repeated two more times (with 16 new items each time). Each participant receives both a free recall score (how many items they recalled during the 3 free recall attempts, 0–48) and a total recall score (free recall score plus how many items recalled when given the cue, 0–48). Only the free recall score was used in the present study (the total recall score did not aid in discriminating DAT).
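The free and total recall scores just described amount to a simple tally across the three study-test blocks. A minimal sketch, in which the helper name and the item sets are hypothetical (not part of the published task materials):

```python
def score_selective_reminding(block_free_recalls, block_cued_recalls):
    """Score the three study-test blocks of the selective reminding task.

    block_free_recalls: one set per block of items produced on that
        block's free recall test.
    block_cued_recalls: one set per block of additional items produced
        on that block's subsequent cued recall test.
    Returns (free_recall_score, total_recall_score), each out of 48
    (3 blocks x 16 items)."""
    free = sum(len(items) for items in block_free_recalls)
    total = free + sum(len(items) for items in block_cued_recalls)
    return free, total

# Hypothetical participant output across the three blocks:
free_score, total_score = score_selective_reminding(
    [{"apple", "chair"}, {"dog"}, {"pen", "cup", "hat"}],
    [{"sock"}, set(), {"map"}],
)
```

Only the free recall component of this score entered the present analyses, per the text above.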
One reason why this task is especially effective at discriminating DAT may lie in the requirement to repeat the study-test sequences across 3 separate blocks. This repetition requires subjects on later blocks to recall currently relevant items while suppressing potentially interfering items from earlier blocks. Indeed, an examination of Grober et al.’s (1988) initial data shows that the free recall of healthy older adults improved over trials whereas the recall of DAT participants did not, increasing the difference between groups across blocks. It is therefore possible that any practice effects among DAT participants are counteracted by the buildup of such proactive interference across blocks (see Tse et al., 2009, for recent evidence of strong proactive and retroactive interference effects in early stage DAT).
Neuroimaging researchers have identified the critical importance of prefrontal cortices to memory monitoring processes such as those needed to avoid proactive interference from earlier trials (Feredoes, Tononi, Postle, & Smith, 2006; Hedden & Yoon, 2006; Postle, Brush, & Nick, 2004). Hence, part of attentional control involves the ability to exclude no-longer-relevant information from entering into the current search set (Unsworth & Engle, 2007). Indeed, tasks designed to measure working memory (e.g., Ospan or Reading Span tasks) are often influenced to a large extent by participants’ ability to overcome proactive interference from previous trials (Bunting, 2006; Lustig, May, & Hasher, 2001).
Given that the selective reminding task likely involves attentional control in the form of memory monitoring (presumably located in the PFC), it is not surprising that it surpasses other memory tasks in its discrimination of DAT. It is also not surprising, therefore, that this task correlates more strongly with Stroop Interference errors in the current study than any other memory task. Sommers and Huff (2003) also obtained a correlation between Stroop performance and performance in a memory task (the Deese–Roediger–McDermott, or DRM, false memory paradigm) that required memory monitoring between studied and highly similar, but unstudied, items. Indeed, this is precisely the account afforded by Balota et al. (1999) in explaining the relative increased false memory in early stage DAT, compared to healthy control individuals, with the DRM stimuli.
In a similar vein, because attentional control requires the ability to maintain a goal across time to modulate competition between relevant and irrelevant information, this ability should rely partially on memory processes. Thus, like the selective reminding task, the Stroop task is not process pure. However, the fact that the Stroop Switch and selective reminding tasks function so well in discriminating DAT suggests perhaps that it is the process of coordinating attention and memory that is most sensitive to DAT, rather than any one particular cognitive domain or system.
The current results converge with accumulating evidence that errors on incongruent trials are a particularly useful marker for early stage DAT. The advantage of incongruent errors in discriminating healthy aging and very mild DAT over memory tasks in the present cross-sectional analysis is somewhat surprising. It was expected that the ability of memory tests to predict DAT status cross-sectionally should be artificially inflated, because memory deficits are explicitly used in the diagnosis of DAT. However, even under these conditions, which should strongly bias memory measures, incongruent Stroop errors were still the best discriminator of healthy aging from DAT. In this light, our work converges with that of Fine et al., who emphasized the power of a Stroop Switching task to detect early cognitive decline. One could perhaps argue that the Stroop Switch task was more difficult than the memory measures and that, had difficulty been equated, memory tasks would have prevailed. However, the nature of what makes a task “difficult” becomes important. We would argue that the most “difficult” memory tasks are those that require some of the same PFC regions involved in Stroop to monitor memory between currently relevant items and highly similar or recent distractors (McCabe, Roediger, McDaniel, Balota, & Hambrick, in press; Jacoby, 1999; Jennings & Jacoby, 1997; Roediger & McDermott, 1995). Thus, we believe it is important to consider the coordination of attentional control systems and memory systems in developing a better understanding of the cognitive changes in the earliest detectable forms of DAT.
This work was supported by NIA PO1 AGO3991 and AGO5681. Thanks are extended to Martha Storandt for providing the psychometrics, and to John Morris and the clinicians at Washington University ADRC for their careful description of the healthy older adults.
Publisher's Disclaimer: The following manuscript is the final accepted manuscript. It has not been subjected to the final copyediting, fact-checking, and proofreading required for formal publication. It is not the definitive, publisher-authenticated version. The American Psychological Association and its Council of Editors disclaim any responsibility or liabilities for errors or omissions of this manuscript version, any version derived from this manuscript by NIH, or other third parties. The published version is available at www.apa.org/journals/pag
1There were also 5 Mild DAT individuals in this study. However, data from these individuals are not included due to the small sample. In general, the data from the Mild group resembled those from the Very Mild group. However, there was one Mild DAT participant with 100% errors in color naming and 99% accuracy in word naming, suggesting either an inability to understand the directions or, more interestingly, a complete inability to overcome the dominant “word” response in favor of the nondominant color response.
2Allport and Wylie (2000) found that performance on no switch trials of an alternating runs paradigm was impaired relative to performance under pure task blocks. In an experiment in which tasks were blocked separately, Allport et al. (1994, Exp. 4) found evidence for impairment from competing S–R mappings even after 100 trials in the new task. In other experiments in which tasks were blocked, Allport and Wylie (2000, see also Wylie and Allport, 2000) found a “restart” effect for the later task (longer RT to the first trial in a sequence following a break in trials) even though the task itself did not switch. Allport and Wylie used both the persistent cost on no switch trials and the restart effects to question the ability of the alternating runs paradigm to measure accurate switch costs. It should be emphasized, however, that the impairment was greater for initial trials following a task switch than for initial trials that simply followed a rest break, suggesting that the initial trial following a switch reflects more than simply a restart cost.
3This analysis excludes data from 5 (2 healthy old and 3 very mild DAT) of the 96 total healthy old and very mild DAT participants for whom one or more of the psychometric tests were missing.
4In running diagnostics for collinearity, all tolerance values for the predictors used in this analysis were greater than .4 and none of the Variance Inflation Factors were greater than 2.5 (See Hosmer & Lemeshow, 2000). Thus, we feel confident that the problem of multicollinearity was reduced in this analysis.
5The Selective Reminding Task also includes an additional recognition test.