The Iowa Gambling Task (IGT) is assumed to measure executive functioning, but this has not been empirically tested by means of both convergent and discriminant validity. We used structural equation modeling (SEM) to test whether the IGT is an executive function (EF) task (convergent validity) and whether it is not related to other neuropsychological domains (discriminant validity). Healthy community-dwelling participants (N = 214) completed a comprehensive neuropsychological battery. We analyzed the conventional IGT metric and three alternative metrics based on the overall difference of advantageous minus disadvantageous choices made during the last 60 IGT responses and advantageous minus disadvantageous choices based on two specific decks of cards (D minus A). An a priori six-factor hierarchical model of neuropsychological functioning was confirmed with SEM. Attention and processing speed were grouped as “non-associative” factors. Fluency, executive functioning, visual learning/memory, and verbal learning/memory were grouped as higher-level “associative” factors. Of the non-associative factors, attention, but not speed, predicted IGT performance. When each associative factor was entered along with attention, only EF improved the model fit, and only for metrics based on trials 41–100. SEM indicates that metrics based on trials 1–100 are influenced by attention, and that metrics based on trials 41–100 are influenced by attention and EF. The IGT's association with attention is twice as strong as its association with EF. Conceptually, the IGT is a multi-trait task involving novel problem-solving and attentional domains to a greater extent, and executive functioning to a lesser extent.
The Iowa Gambling Task (IGT), frequently referred to as the Bechara Gambling Task, was developed as a psychometric probe for deficits in real-life decision-making manifested by neurologic patients with lesion of the ventromedial prefrontal cortex (VMPFC; Bechara, Damasio, Damasio, & Anderson, 1994). The standardized computer-administered version of the IGT has undergone extensive demographically based norming and is available as a clinical and research tool (Bechara, 2007). The individual starts with $2000 and is tasked with increasing his/her money over 100 game trials (Bechara et al., 1994). The task involves choosing cards from four decks (A–D) in which all decks yield monetary awards and penalties. However, the contingency schedule for awards and penalties is such that selections from decks C and D are advantageous after the first 20 trials, whereas those from decks A and B are disadvantageous. The test taker is told only that some decks are better than others. Empirical investigations indicate that test takers generally become aware of the risk parameters after 20 (Maia & McClelland, 2004) to 40 (Brand, Recknor, Grabenhorst, & Bechara, 2007) trials, though the exact inflection point varies. Therefore, the IGT can be considered as a measure of decision-making first under ambiguity and later under known risk.
The impetus for developing the IGT came from patients with VMPFC lesions who manifested psychosocial dysfunction secondary to very poor decision-making, but who otherwise performed normally on detailed neuropsychological testing (e.g., Eslinger & Damasio, 1985). Patients with VMPFC lesions were shown to make more disadvantageous and fewer advantageous choices than both healthy controls and patients with lesions to other brain areas (Bechara et al., 1994). The IGT has also been strongly associated with the somatic marker hypothesis (SMH: Damasio, 1994), which argues that emotion and cognition are integrated in the orbital frontal cortex and are both critical for purposes of decision-making. Reasoning processes arising from emotion are influenced by crude biasing signals from the periphery or central representation of the periphery (Dunn, Dalgleish, & Lawrence, 2006). Physiological (autonomic) differences during IGT performance between prefrontal lesion patients and healthy controls consistent with the SMH have been reported (Bechara, Tranel, Damasio, & Damasio, 1996). The IGT paradigm has also been critiqued as an inadequate measure of how people behave based on future consequences (Colombetti, 2008), as some healthy adults are able to predict the outcome of the next card after as few as 20 selections from the 100 selection administration (Maia & McClelland, 2004). In such persons, the IGT might not operationalize the SMH (Colombetti, 2008).
As the IGT is adopted more broadly clinically and in neural and health science research, the construct that it assesses has been both broadened and refined. One way to consider the construct validity of a neuropsychological measure is in terms of brain–behavior relationships. It is clear that patients with lesions perform more poorly than healthy controls, though the sensitivity to lesion site may be broader than what was intended in the initial development of the instrument. Lesions of the amygdala (Bechara, Damasio, Damasio, & Lee, 1999), right parietal cortex and insular and somatosensory cortices (Tranel, Bechara, & Damasio, 2000), and anterior and posterior OFC (Bechara, 2007) produce the same pattern of deficit in advantageous IGT decision-making as do VMPFC lesions, though another group reported dorsal but not orbital lesion produced IGT deficit (Manes et al., 2002). The IGT is also sensitive to neurologic conditions such as multiple sclerosis (Roca et al., 2008) and frontotemporal dementia (Torralva, Roca, Gleichgerrcht, Bekinschtein, & Manes, 2009). Therefore, the IGT may be sensitive but not specific to frontal lesions (Alvarez & Emory, 2006).
IGT performance has been shown to be reduced in a number of populations in which decision-making difficulties would be expected, such as persons with chemical dependence (Rotheram-Fuller, Shoptaw, Berman, & London, 2004; Rodgers et al., 1999), and thought disorder (Nakamura et al., 2008). In terms of convergent validity, relative to two other gambling tasks, the IGT was demonstrated more sensitive to differences in decision-making of prefrontal subregion groups (Manes et al., 2002). Both the IGT and the Probability-Associated Gambling Task were sensitive to differences in decision-making capacity of those with mild cognitive impairment; however, the two gambling tasks were not associated with each other, and only the IGT was associated with figural memory, categorical thinking, and numeracy (Zamarian, Weiss, & Delazer, 2010).
Convergent/discriminant validity, the degree to which a measurement converges with similar measurements and diverges from dissimilar measurements, is another key index of overall validity (Weiner & Greene, 2008). Buelow and Suhr (2009) have argued that the total difference score metric from the IGT represents a diffuse amalgam of processes. Nevertheless, a review of the literature indicates that the IGT is assumed to measure executive function (EF) (Clark, Iversen, & Goodwin, 2001; Roca et al., 2008; Suhr & Hammers, 2010; Torralva et al., 2009; Verdejo-Garcia, Lopez-Torrecilas, Calander, Delgado-Rodriguez, & Bechara, 2009). EF, though not necessarily considered a unitary cognitive process, has been defined as the ability to organize a sequence of actions toward a goal (Anderson, Jacobs, & Anderson, 2008; Fuster, 2008), or, as the ability to activate and inhibit response sequences guided by internal neural representations whereby the frontal lobes “select” from a range of behavioral “programs” or routines (Eslinger & Chakara, 2004). Some of the most commonly used clinical measures of EF include the Wisconsin Card Sorting Test (WCST), Verbal Fluency, and the Color Word Interference Test (Alvarez & Emory, 2006; Rabin, Burton, & Barr, 2007). This assumption is not surprising given that the IGT professional manual presents data on the relationship of the measure to EF tasks, namely the WCST and the Tower of London, but not tasks from other neuropsychological domains, implicitly suggesting that it is an EF task, or at least, has more in common with that domain than others. Indeed, in one investigation, it was concluded that the IGT and the WCST performed successfully and similarly (Verdejo-Garcia et al., 2009). 
However, in two empirical reports, the IGT was considered more sensitive to the presence of neurologic disease than the “classic” executive functioning tasks (i.e., Trail Making Test, Verbal Fluency, WCST; Roca et al., 2008; Torralva et al., 2009), and in another was more sensitive to the presence of serious chemical dependence (Rotheram-Fuller et al., 2004). It has been reported that the last four blocks (80 trials) of the IGT are associated with a measure of EF, the WCST, and not with a measure of intelligence, the Wechsler Abbreviated Scale of Intelligence among healthy controls (Brand et al., 2007). Although this is a reasonable test of construct validity, it is not a stringent test, which arguably would include neuropsychological measures varying as to degree of similarity to the IGT.
There has also been concern that IGT performance could be influenced by component processes such as working memory (Manes et al., 2002). In that investigation, decision-making tasks thought to place few demands on working memory were compared with the IGT across frontal lesion and matched control participants. No difference in the validity of the IGT and “non-working-memory-saturated” decision-making tasks was found (Manes et al., 2002).
In summary, evidence for the construct validity of the IGT is solid, either when lesion populations or other populations with presumptive poor decision-making are concerned. This is notable given the low internal consistency of the instrument (Gansler, Jerram, Vannorsdall, & Schretlen, in press) with correlations between odd and even trials blocks at .5 or less depending on the metric already reported for the same sample involved in this report. On the other hand, evidence for the convergent/discriminant validity of the IGT is present but weaker. Evidence for discriminant validity, in particular, is either lacking or contradictory. As such, whether the IGT is best conceptualized as a “stand-alone” measure that does not belong to a domain or as one that measures an aspect of executive functioning remains unclear (Buelow & Suhr, 2009; Dunn et al., 2006; Toplak, Sorge, Benoit, West, & Stanovich, 2010). It is possible that the assumption that the IGT assesses executive functioning has been accepted without adequate empirical investigation and that the IGT composite score represents an artificial amalgam of various psychological processes including both “hot” and “cold” decision-making tendencies (Buelow & Suhr, 2009). Others have been more sanguine regarding the construct validity of the IGT, claiming that its lack of association with intellectual ability in general, and EF in particular is a validation of the SMH of “hot” decision-making (Toplak et al., 2010). The issue of convergent/discriminant validity, and thus domain membership, is critical to the interpretation of clinical neuropsychological data, which relies on nomothetic and ipsative sifting of data (Lezak, Howieson, & Loring, 2004) as multiple dissociations between performances are made (Russell, 1984). The same issue is important to investigators using the IGT to relate emotional and cognitive processes within the decision-making context. 
Although there are various definitions of executive function, one widely accepted definition is the ability to organize a sequence of actions toward a goal (Fuster, 2008), and another influential definition involves the capacity to activate and inhibit response sequences guided by internal neural representations (Eslinger & Chakara, 2004). Thus, from a face validity perspective, it appears that the IGT should relate more to EF tasks than other neuropsychological tasks.
Questions. To address the above validity concerns, we designed an investigation to test three hypotheses. First, we hypothesized that a model would demonstrate that IGT performance is significantly related to general intelligence. Second, we hypothesized that, in a neuropsychological model, IGT performance would relate to the other measures of executive function. Third, we hypothesized that, within the neuropsychological model, IGT performance would be relatively free from influence by other neuropsychological domains, such as learning and memory.
Several assumptions guided this approach. First, a latent factor structure informed both theoretically and empirically (Carroll, 2005; Horn & Blankson, 2005; McGrew, 2005; Spearman, 1927) best determines the neuropsychological measure domain membership (Dowling, Hermann, La Rue, & Sager, 2010; Gladsjo et al., 2004; Holtzer et al., 2007; McCracken & Franzen, 1992; Pena et al., 2009; Tulsky & Price, 2003).
Second, structural equation modeling (SEM; Kline, 2005), permitting a theoretically guided (i.e., confirmatory factor analytic) approach to examining underlying relationships within a database, is an appropriate method for examining domain-based relationships, which involve the relation of latent to manifest variables. To interpret neuropsychological test battery results in terms of impairment in discrete brain regions, neuropsychologists make assumptions about the underlying organization of cognitive functioning into distinct ability areas (Gladsjo et al., 2004).
Third, a hierarchical factor structure (Carroll, 2005), with neuropsychological domains at the first level, and a general factor at the second level, would effectively demonstrate the IGT and domain-based relationships. Most neuropsychological (Gladsjo et al., 2004) and intellectual measures (Spearman, 1927) are positively associated. Direct influence of a domain upon a specific test score is also indirectly influenced by general neuropsychological ability (G-NP).
Fourth, the conventional IGT metric (CD-AB trials 1–100), disadvantageous selections from decks A and B subtracted from advantageous selections from decks C and D over 100 trials, would produce less robust effects than alternative metrics that reduce error due to location bias by removing selections to decks B and C, as well as error due to the heterogeneity of decision-making processes by removing early trials (Gansler et al., in press). The difference score over the last 60 trials (CD-AB trials 41–100), selection from deck A subtracted from deck D over 100 trials (D-A trials 1–100), and selection from deck A subtracted from deck D over the last 60 trials (D-A trials 41–100) produce more robust construct and criterion validity results (Gansler et al., in press).
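As an illustration only, the four metrics defined above can be computed from a trial-by-trial record of deck selections. The following Python sketch assumes that a participant's choices are stored as a sequence of 100 deck labels ("A" to "D"); the function names, the data layout, and the example participant are hypothetical and are not part of the IGT software or of the analyses reported here.

```python
from typing import Dict, Sequence, Set


def net_score(choices: Sequence[str], good: Set[str], bad: Set[str],
              first_trial: int = 1) -> int:
    """Selections from `good` decks minus selections from `bad` decks,
    counted from `first_trial` (1-based) through the last trial."""
    window = choices[first_trial - 1:]
    n_good = sum(1 for deck in window if deck in good)
    n_bad = sum(1 for deck in window if deck in bad)
    return n_good - n_bad


def igt_metrics(choices: Sequence[str]) -> Dict[str, int]:
    """Conventional metric and the three alternative metrics described in the text."""
    if len(choices) != 100:
        raise ValueError("expected exactly 100 trials")
    return {
        "CD-AB trials 1-100": net_score(choices, {"C", "D"}, {"A", "B"}, 1),
        "CD-AB trials 41-100": net_score(choices, {"C", "D"}, {"A", "B"}, 41),
        "D-A trials 1-100": net_score(choices, {"D"}, {"A"}, 1),
        "D-A trials 41-100": net_score(choices, {"D"}, {"A"}, 41),
    }


# Hypothetical participant: samples all decks evenly for 40 trials,
# then shifts mostly to the advantageous decks for the last 60 trials.
choices = list("ABCD" * 10) + ["D"] * 30 + ["C"] * 20 + ["A"] * 5 + ["B"] * 5
metrics = igt_metrics(choices)
for name, value in metrics.items():
    print(f"{name}: {value}")
```

For this invented participant the early, evenly spread selections cancel out, so the two CD-AB scores coincide, as do the two D-A scores. In real data the early trials add variance that the trials 41–100 metrics are intended to remove.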
Participants were drawn from an initial community sample of 394 adults recruited from the Baltimore, MD, and Hartford, CT, areas to participate in the Aging, Brain Imaging, and Cognition (ABC) study (Testa, Winicki, Pearlson, Gordon, & Schretlen, 2009). Participants were recruited via random digit dialing, written invitation to Medicare beneficiaries aged 65 and older, and telephone calls to listings selected in pseudo-random fashion from residential directories. The ABC study was conducted in two phases: Participants (n = 215) who entered the study during phase 1 (1995–1998) were recruited from Baltimore. The IGT was not used at that time. However, 110 phase 1 participants returned during phase 2 (1999–2005), and they completed the IGT, which was added to the phase 2 study protocol. Another 179 participants were recruited during phase 2. These included 86 adults from Baltimore and 93 from Hartford. Thus, 289 participants were administered the IGT. Each participant underwent a physical and neurological examination, psychiatric interview, laboratory blood tests, brain MRI scan, and cognitive testing over 1–2 days. Participants were classified as unhealthy if they had a neurological condition known to affect cognitive functioning (e.g., Parkinson's disease, multiple sclerosis), severe/life-threatening medical problems (e.g., congestive heart failure with poorly controlled hypertension and diabetes), or significant psychiatric illness (e.g., schizophrenia, bipolar disorder, major depression, substance dependence). In order to boost homogeneity of variance, unhealthy participants were not included in this report. The remaining 245 participants were classified as reasonably healthy. IGT data for 23 of these participants were excluded or lost, and we excluded one individual who did not complete the remainder of cognitive testing. 
Finally, eight participants were excluded either due to invalid performance on the IGT (responding exclusively to decks C and D) or due to statistical outlier status on one or more of the neuropsychological composite scores. This left data for 214 healthy participants available for study. The IGT was administered using a prepublication copy of the standardized computer-based version obtained from the test's author. All participants gave written informed consent, and the study was approved by the Johns Hopkins Medicine and Hartford Hospital IRBs.
As shown in Table 1, the healthy ABC study participants were on average around 55 years of age, slightly more than half were women, and on average they had several years of education beyond the high-school level.
Neuropsychological test procedures were administered by trained research assistants working under the supervision of board-certified neuropsychologists. A battery of 25 neuropsychological tests was administered from all functional domains. That battery included seven subtests of the Wechsler Adult Intelligence Scale-Revised (WAIS-R; Wechsler, 1981), Information, Digit Span, Arithmetic, Similarities, Picture Completion, Block Design, and Digit Symbol; two subtests from the Wechsler Memory Scale-Revised (Wechsler, 1987), Logical Memory and Visual Reproduction; and the Matrix Reasoning subtest from the Wechsler Abbreviated Scale of Intelligence (WASI; Wechsler, 1999). Additional procedures included the Brief Test of Attention (Schretlen, 1997), the Trail Making Test Parts A and B (Reitan, 1958), Calibrated Ideational Fluency Assessment (Schretlen & Vannorsdall, 2010), the Grooved Pegboard Test (Klove, 1963), the Boston Naming Test (Goodglass & Kaplan, 2000), Visual Motor Integration (Beery & Buktenica, 1997), the Rey Complex Figure Test (Rey, 1941), Connors' Continuous Performance Test (Connors, 1995), the Hopkins Verbal Learning Test, revised, subtests of Learning Trials (1–3) and Delayed Recall Trial (Brandt & Benedict, 2001), the Brief Visuospatial Memory Test, revised, subtests of Learning Trials (1–3) and Delayed Recall Trial (Benedict, 1997), and the categories and perseverative error scores of the modified WCST (Schretlen, 2010). Only a portion of these measures, which are listed in Figs 1 and 2, was used to create the SEMs.
Latent variable analyses were conducted using AMOS 7.0 (Arbuckle, 2006). The chi-squared goodness of fit adjusted for degrees of freedom (CMIN and CMIN/DF), comparative-fit index (CFI), root-mean-square error of approximation (RMSEA), and Akaike information criterion (AIC) were used to assess model fit. Conclusions about model fit were derived from the consideration of all measures; no single measure was considered as a “gold standard”. In order to facilitate comparisons among competing variations within larger models (i.e., conventional IGT metric vs. alternative IGT metrics within the model considering general intellectual functioning), delta AIC (ΔAIC) is reported. The measure allows for the comparison of model variations to determine the variation that best accounts for the information provided by the data. By convention, the model variation with the lowest AIC is said to have ΔAIC of zero and all other model variations are compared with that “baseline.” Burnham and Anderson (2002) offer the following criteria to interpret ΔAIC: Δ of 0–2 indicates that the model variation has substantial support as an alternative, Δ of 4–7 indicates that the model variation has weak support as an alternative, and Δ of >10 indicates that the model variation has essentially no support as an alternative. To assess the significance of individual and unique associations of functional domains to IGT metrics, the maximum likelihood estimates (MLEs) were used with α set at 0.05, and the standardized regression coefficients that lay at or below MLE statistical threshold (β weights) were used to compare the degree of association.
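The ΔAIC comparison described above amounts to subtracting the smallest AIC in a set of model variations from each AIC and reading the difference against the stated bands. The Python sketch below illustrates this using the four AIC values reported in the Results for the general-intelligence models of the conventional IGT metric. The function names are hypothetical, and the bands are coded exactly as stated in the text, so differences that fall between the stated bands (2–4 and 7–10) are returned as unclassified rather than assigned to a neighboring band.

```python
from typing import Dict


def delta_aic(aic_by_model: Dict[str, float]) -> Dict[str, float]:
    """Difference between each model's AIC and the smallest AIC in the set."""
    best = min(aic_by_model.values())
    return {name: round(aic - best, 2) for name, aic in aic_by_model.items()}


def support(delta: float) -> str:
    """Interpretation bands as stated in the text (after Burnham & Anderson, 2002)."""
    if 0 <= delta <= 2:
        return "substantial support"
    if 4 <= delta <= 7:
        return "weak support"
    if delta > 10:
        return "essentially no support"
    return "not classified by the stated bands"


# AIC values as reported in the Results for models estimating the conventional IGT metric.
reported_aic = {"G": 125.99, "Gc": 123.71, "Gvis": 123.30, "Gf": 118.31}

deltas = delta_aic(reported_aic)
for model, delta in deltas.items():
    print(f"{model}: dAIC = {delta:.2f} ({support(delta)})")
```

With these inputs the Gf model is the baseline (ΔAIC of 0), and the remaining differences match the reported values of 7.68, 5.40, and 4.99.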
Guided by the extended Gf–Gc theory, a hierarchical model (Carroll, 2005) was fitted including the latent variables of fluid intelligence (Gf), crystallized intelligence (Gc), and visual processing abilities (Gvis) at the lower level and general ability (G) at the upper level (Fig. 1). The model was under-identified when the other second-order factors (i.e., short-term acquisition/working memory or processing speed) from the Gf–Gc theory were incorporated. The CFI for the G–Gf–Gc–Gvis model was 0.968 and RMSEA was 0.098, indications of superior and adequate fit, respectively. χ2 was significant (CMIN = 51.62, CMIN/DF = 3.04, p< .001), but this statistic may not be a reliable fit indicator when the sample size is large (Kline, 2005, p. 136). A sequence of nested structural equation models was then fitted to examine the unique associations of G and the three lower-level abilities, Gf, Gc, and Gvis, to the IGT.
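As a rough check on how the reported fit indices relate to one another, RMSEA can be recomputed from the chi-squared statistic, its degrees of freedom, and the sample size. The sketch below assumes the commonly used single-group formula, RMSEA = sqrt(max(CMIN − df, 0) / (df × (N − 1))). The degrees of freedom (17 for the G–Gf–Gc–Gvis model and 24 for the model that adds the conventional IGT metric) are not stated in the text and are inferred here from the reported CMIN and CMIN/DF values, and the function name is hypothetical.

```python
import math


def rmsea(chi_square: float, df: int, n: int) -> float:
    """Root-mean-square error of approximation for a single-group model.

    Uses sqrt(max(chi_square - df, 0) / (df * (n - 1))); a chi-square at or
    below its degrees of freedom therefore yields an RMSEA of zero.
    """
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


N = 214  # healthy participants in the analysis

# df values are inferred from the reported CMIN and CMIN/DF (an assumption).
base_model = rmsea(51.62, 17, N)      # G-Gf-Gc-Gvis model
with_igt = rmsea(65.99, 24, N)        # same model with the conventional IGT added

print(f"base model RMSEA: {base_model:.3f}")
print(f"model with IGT RMSEA: {with_igt:.3f}")
```

Under these assumptions the recomputed values agree to three decimals with the RMSEA values of 0.098 and 0.091 reported for the two models.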
When the IGT was added to the model, overall fit indices altered slightly but not meaningfully, in both favorable and unfavorable directions (CMIN = 65.99, CMIN/DF = 2.75, p < .001; CFI = 0.962; RMSEA = 0.091; AIC = 125.99, ΔAIC = 7.68 [relative to the minimum value among models including the IGT]). The standardized coefficient indicated that G (the second level) was meaningfully associated with the IGT (β = 0.25; p ≤ .001). Next, the direct association of the IGT with each of the first-level factors (Gf, Gc, and Gvis) was assessed, and that model also incorporated the indirect influence of G, as the path was drawn from the second level to the first level and then to the IGT. When the individual association of each first-level latent variable with the IGT was examined, the IGT was estimated meaningfully as indicated by the standardized coefficients (Gf: β = 0.34, p ≤ .001; Gc: β = 0.31, p ≤ .001; Gvis: β = 0.31, p ≤ .001). Model fit was virtually identical for Gc and Gvis (Gc: CMIN = 63.71, CMIN/DF = 2.65, p≤ .001, CFI = 0.964, RMSEA = 0.088, AIC = 123.71, ΔAIC = 5.40; Gvis: CMIN = 63.28, CMIN/DF = 2.64, p≤ .001, CFI = 0.964, RMSEA = 0.088, AIC = 123.30, Δ AIC = 4.99) and slightly better for Gf (CMIN = 58.31, CMIN/DF = 2.43, p≤ .001, CFI = 0.969, RMSEA = 0.082, AIC = 118.31, ΔAIC = 0).
In order to examine unique associations of the latent variables with the IGT, each possible combination of two latent variables was estimated simultaneously. When Gc and Gvis were simultaneously entered, both continued to meaningfully estimate the IGT (βs = 0.20 and 0.196, respectively, p = .016 and .020; CMIN = 57.96, CMIN/DF = 2.52, p ≤ .001, CFI = 0.969, RMSEA = 0.082, AIC = 119.96, ΔAIC = 1.65). When Gf and Gvis were simultaneously entered, a unique relationship with Gf was observed (Gf: β = 0.36, p = .025; Gvis: β = −0.02, NS; CMIN = 58.29, CMIN/DF = 2.53, p ≤ .001, CFI = 0.964, RMSEA = 0.088, AIC = 120.92, ΔAIC = 2.61). When Gf and Gc were simultaneously entered, a unique relationship with Gf was again observed (Gf: β = 0.30, p = .02; Gc: β = 0.04, NS; CMIN = 58.214, p ≤ .000, CMIN/DF = 2.531, CFI = 0.964, RMSEA = 0.088, AIC = 120.214, ΔAIC = 1.90).
Results indicate that G, or general intellectual ability, is meaningfully associated with IGT performance and that this association is most robustly and uniquely influenced by the Gf (fluid) component of G. This can be concluded because the relationship between IGT and Gc and Gvis, which are significant when viewed either individually or when Gc and Gvis are modeled together, becomes non-significant when either is modeled with Gf. This indicates that the relationships between the IGT and Gc and Gvis are driven by variance in Gc and Gvis that is shared with Gf.
Each IGT alternative metric was put through the same analytic process as the conventional IGT. For purposes of ease of presentation, the relationship of Gf with each IGT alternative metric is presented, for, as with the conventional IGT, that aspect of intelligence bore the most robust and unique relationship with decision-making (Table 2). When all fit indices were considered, IGT estimation and model fit improved, as predicted, with the alternative IGT metrics, and this tendency was optimal for D-A trials 1–100.
To focus on neuropsychological domains, the model was switched from one of general intelligence to one of specific neuropsychological evaluation domains. This six-factor model was guided by theory and prior empirical work representative of clinical neuropsychological data sets (Pena et al., 2009). The strategy had several advantages including the incorporation of the executive function variable critical to the hypothesis, and the high degree of similarity of the neuropsychological task constituents between this report and that of Peña and colleagues (2009). A critical difference with that study was the inclusion of a general factor on the second level here, which is consistent with the assumption that the direct influence of a first-level latent factor may be better demonstrated in conjunction with the indirect influence of the second-level general factor. The general factor here refers to G-NP and does not represent Spearman's G as the general factor presumably did in the previous model. The six-factor two-level model (Fig. 2) fit the data adequately (CMIN = 197.98, CMIN/DF = 2.302, p= .000, CFI = 0.932, RMSEA = 0.078, AIC = 295.98). The addition of the IGT conventional metric as the manifest variable improved model fit slightly and by standardized coefficient demonstrated that G-NP was meaningfully associated with the IGT (β = 0.319, p≤ .001; CMIN = 219.57, CMIN/DF = 2.20, p< .001, CFI = 0.928, RMSEA = 0.075, AIC = 323.57). The standardized coefficients indicated all the latent variables were individually associated with the IGT (attention β = 0.357, speed β = −0.316, fluency β = 0.326, visual memory β = 0.323, verbal memory β = 0.328, executive function β = 0.297; all p ≤ .001). Model fit varied minimally as the six latent variables were selected to estimate the IGT (range of CMIN/DF = 2.17–2.24; range of CFI = 0.925–0.928; range of RMSEA = 0.074–0.076, range of AIC = 320.84–328.35).
In order to examine the unique association of executive function with the IGT and bring order to the six latent variables on the first level, we grouped the latter into non-associative (attention, processing speed) and associative (executive function, fluency, verbal learning and memory, visual learning and memory) variables. The rationale for this approach stems from the WAIS-IV, in which intellectual factors were divided into cognitive proficiency and general aptitude factors, where cognitive proficiency consists of the processing speed and working memory factors and general ability consists of the verbal comprehension and perceptual organization factors (Lichtenberger & Kaufman, 2009). Cognitive proficiency tasks require little or no associative processes and, therefore, are referred to as non-associative, whereas general aptitude tasks have associative components. In order to examine unique relationships in the context of a positive manifold, the IGT would be estimated by the non-associative latent variables first, and then its unique association would be tested by adding EF to the model with non-associative latent variables shown to estimate the IGT. This would allow the relationship of EF and IGT to be observed while accounting for the influence of the non-associative variables, related to both. When the latent variables of attention and speed were entered simultaneously, a unique association with the latent attention variable alone was indicated (attention β = 0.627, p ≤ .001; speed β = 0.259, NS; CMIN/DF = 2.18, p = .000, CFI = 0.930, RMSEA = 0.075, AIC = 322.71). When the latent variables of attention and EF were then entered simultaneously, with speed dropped for lack of a unique association, a unique association with attention only was observed (attention β = 0.285, p ≤ .001, EF β = 0.127, NS; CMIN/DF = 2.17, p = .000, CFI = 0.930, RMSEA = 0.074, AIC = 320.71, ΔAIC = 2.461 [relative to D-A trials 1–100 AIC of 318.249]).
To test the prediction that alternative IGT metrics would demonstrate more robust association with latent neuropsychological variables, these metrics were entered into the same two-level six-factor model just described. Consistent with the prediction, the latent G-NP variable in that model predicted a larger amount of the variance in the manifest alternative IGT metric scores (CD-AB trials 41–100 β = 0.367, p ≤ .001, CMIN = 212.43, CMIN/DF = 2.22, p < .001, CFI = 0.927, RMSEA = 0.076, AIC = 326.43, ΔAIC = 8.185 [relative to latent G-NP with D-A trials 1–100, AIC of 318.249]; D-A trials 1–100 β = 0.426, p ≤ .001, CMIN = 214.25, CMIN/DF = 2.14, p = .000, CFI = 0.932, RMSEA = 0.073, AIC = 318.25, ΔAIC = 0; D-A trials 41–100 β = 0.437, p ≤ .001, CMIN = 217.54, CMIN/DF = 2.18, p < .001, CFI = 0.930, RMSEA = 0.074, AIC = 321.54, ΔAIC = 3.288) (conventional IGT β was 0.319; Table 3). When the non-associative latent factors of attention and speed were entered simultaneously, a unique association of attention with the IGT difference last two thirds and total deck D minus deck A metrics was found, but not for deck D minus deck A last two thirds (CD-AB trials 41–100: attention β = 0.609, p = .043, speed β = 0.137, NS; CMIN/DF = 2.13, p < .001, CFI = 0.933, RMSEA = 0.073, AIC = 328.14, ΔAIC = 11.24 [relative to attention with D-A trials 1–100, AIC of 316.90]; D-A trials 1–100: attention β = 0.61, p ≤ .043, speed β = 0.14, NS; CMIN/DF = 2.13, p < .001, CFI = 0.933, RMSEA = 0.073, AIC = 316.90, ΔAIC = 0; D-A trials 41–100: attention β = 0.180, NS, speed β = −0.269, NS).
Unique associations for the attention and executive function latent variables with IGT trials 41–100 were found when both were entered simultaneously (CD-AB trials 41–100: attention β = 0.302, p ≤ .001, EF β = 0.152, p = .05; CMIN/DF = 2.19, p < .001, CFI = 0.928, RMSEA = 0.076, AIC = 325.282, ΔAIC = 9.787 [relative to attention and EF with D-A trials 1–100, AIC of 315.495]; D-A trials 1–100: attention β = 0.406, p ≤ .001, EF β = 0.107, NS; CMIN/DF = 2.09, p < .001, CFI = 0.934, RMSEA = 0.072, AIC = 315.495, ΔAIC = 0; D-A trials 41–100: attention β = 0.360, p ≤ .001, EF β = 0.166, p ≤ .05; CMIN/DF = 2.14, p < .001, CFI = 0.931, RMSEA = 0.074, AIC = 320.686, ΔAIC = 5.191). Therefore, the IGT metrics of advantageous minus disadvantageous selections over the last 60 trials of the test, particularly D-A trials 41–100, were more robustly associated with general neuropsychological function than the conventional metric and were directly related to the attention and EF variables.
To parallel the design for EF, the other associative variables (fluency, visual learning and memory, and verbal learning and memory) were simultaneously entered into the model with the attention variable to estimate the conventional IGT. When attention and fluency were entered simultaneously, neither estimated the conventional IGT metric (CD-AB trials 1–100—attention β = 0.53, NS, fluency β = −0.14, NS; CMIN/DF = 2.19, p< .001, CFI = 0.929, RMSEA = 0.075, AIC = 322.61, ΔAIC = 0.29 [relative to attention and verbal memory—IGT AIC of 322.32]). The same was true when attention and visual memory were entered simultaneously (CD-AB trials 1–100—attention β = 0.53, NS, visual memory β = −0.17, NS; CMIN/DF = 2.19, p< .001, CFI = 0.929, RMSEA = 0.075, AIC = 322.49, ΔAIC = 0.17). When attention and verbal memory were entered simultaneously, only the latent variable of attention estimated the IGT (CD-AB trials 1–100—attention β = 0.559, p= .046, verbal memory β = 0.203, NS; CMIN/DF = 2.19, p< .001, CFI = 0.929, RMSEA = 0.075, AIC = 322.32, ΔAIC = 0). The conventional IGT metric did not have a unique association with any of the “associative functions” after its relationship with the “non-associative” function of attention was accounted for.
To parallel the design for EF, the other associative variables (fluency, visual learning and memory, and verbal learning and memory) were simultaneously entered into the model with the attention variable to estimate the three alternative IGT metrics. As the overall model changed very little across those nine estimations, the ranges for the model fit statistics are presented here for reader convenience (CMIN/DF 2.13–2.19, p-values were all < .001, CFI 0.926–0.933, RMSEA 0.073–0.076, AIC 317.09–328.07). None of the combinations of the attention latent variable with fluency, visual learning and memory, or verbal learning and memory estimated CD-AB trials 41–100 (attention β and fluency β were 0.116 and 0.270, NS; attention β and visual memory β were 0.194 and 0.194, NS; and attention β and verbal memory β were 0.260 and 0.133, NS). None of the combinations of the attention latent variable with fluency, visual learning and memory, or verbal learning and memory estimated D-A trials 1–100 (attention β and fluency β were 0.475 and −0.007, NS; attention β and visual memory β were 0.523 and −0.054, NS; and attention β and verbal memory β were 0.454 and 0.014, NS). None of the combinations of the attention latent variable with fluency, visual learning and memory, or verbal learning and memory estimated D-A trials 41–100 (attention β and fluency β were 0.060 and 0.370, NS; attention β and visual memory β were 0.101 and 0.350, NS; and attention β and verbal memory β were 0.081 and 0.381, NS).
The purpose of this investigation was to determine the degree to which decision-making as measured by the IGT was influenced by general intellectual and executive functioning ability and whether it was relatively free from influence by other domains. Critically, the alternative metrics captured the influence of the latent variables to a greater extent than the conventional CD-AB trials 1–100 metric. The data support the conclusion that the IGT is a novel problem-solving task in the format of a decision-making task, with a robust attentional component introduced by the 100-trial format and with a considerably less robust executive functioning component relevant only to the last two thirds of the administration. Confirmatory factor analysis and SEM supported the hypotheses that the IGT is influenced by general intellect in the first model and, in the second (neuropsychological) model, robustly by attention and considerably less robustly by executive ability. The D-A metric involving all 100 trials is particularly saturated with the attention component, whereas the influence of EF emerges for metrics based solely on the later trials (41–100). As the interpretation of neuropsychological performance is predicated on the putative domains that tasks are thought to assess (Lezak et al., 2004), these results underscore domain membership rather than “stand-alone” status for the IGT. Conceptually, the IGT can be viewed as a multi-trait task involving novel problem-solving and attentional domains to a greater extent, and executive functioning to a lesser extent. From a research perspective, this finding may speak to the need to modify stimulus presentation or administration to reduce error and increase saturation for higher cognitive processes.
This finding also has theoretical as well as practical ramifications, in light of recent assertions that evidence across studies of a lack of association between the IGT and intellect supports the SMH dissociation of “hot” decision-making processes and general cognition. IGT variance accounted for by “cold” decision-making processes may help to explain its discrimination of diagnostic groups such as MS (Roca et al., 2008) despite not being designed for such a purpose. This investigation and other recent work (Suhr & Hammers, 2010) contradict the hot decision-making assertion and should engender further theoretical debate on the issue, though it is possible that evidence of this dissociation would be more forthcoming in clinical populations with fronto-limbic compromise.
Given that intelligence tests are the most commonly used assessment tools in clinical neuropsychology (Butler, Retzlaff, & Vanderploeg, 1991; Rabin et al., 2007), and components of intelligence tests are deeply embedded in clinical neuropsychological factor structures (Gladsjo et al., 2004; Tulsky & Price, 2003), our first question concerned the relation of intelligence to decision-making ability as measured by the IGT. Support was found for the hypothesis that decision-making is estimated by general intelligence (G) indirectly and Gf uniquely and directly, accounting for 34%–40% of variance, depending on whether a conventional or alternative IGT metric was used. The influence of Gf on decision-making was most robust for the D-A trials 1–100 metric, consistent with predictions that alternative IGT metrics would exhibit more robust relationships (Gansler et al., in press). Metrics based on deck D minus deck A calculations may benefit from the removal of the location bias error introduced by selections from decks B and C, which are less salient in the horizontal stimulus array. As Gf is defined as the capacity for identifying relationships, comprehending implications, and drawing inferences within content that is either novel or equally familiar to all (Horn & Blankson, 2005), the finding of an association of Gf and IGT performance has face validity and intuitive appeal. Crystallized (Gc) and visual (Gvis) intellectual abilities were not uniquely related to IGT performance. The individual associations of those variables with the IGT were due to variance they shared with Gf. This finding is compatible with another recent report by Suhr and Hammers (2010) of an association between IGT “failure” and lowered WASI Matrix Reasoning performance, a task considered a good indicator of Gf (Lichtenberger & Kaufman, 2009). In contrast, Brand and colleagues (2007) found no association of IGT trial blocks and intelligence.
However, that work differed in statistical approach (i.e., multiple regression) and used fewer participants, both of which are relevant given the apparently modest effect sizes in operation. It differed in measurement approach as well, using an index of intelligence combining g-fluid (WASI Matrix Reasoning) with g-crystallized (WASI Vocabulary), which, based on the present results, would have lowered the strength of association with intelligence.
Less robust support was also found for the second hypothesis, namely, that the IGT has an EF component, but their relationship is complex and deserves elaboration. The second model, with six factors of specific neuropsychological ability at the first level and G-NP at the second level, showed that each factor individually influenced IGT performance. The latent variable of G-NP modeled as a second-level factor influenced both conventional and unconventional IGT metrics. A nested model approach was adopted in order to identify any unique influences on IGT performance. Of the non-associative cognitive proficiency factors (attention and speed), attention was found to exert a robust influence on IGT performance. When the associative ability factors (EF, fluency, visual learning and memory, and verbal learning and memory) were each entered with attention, only EF showed a unique influence on IGT performance, providing evidence for convergent and discriminant validity, though the influence of EF was modest in size. This relationship applied to D-A trials 41–100 and CD-AB trials 41–100, but not to CD-AB trials 1–100, consistent with the notion that later trials measure decision-making under conditions of known risk. The latent attention and executive functioning variables accounted uniquely for 37% and 17% (rounded up), respectively, of the variance in D-A trials 41–100. For CD-AB trials 1–100 and D-A trials 1–100, only the latent attention variable uniquely accounted for variance (29% and 41%, respectively, when entered with EF). Internal consistency of the IGT as measured by odd-even block correlation is low (coefficients from .219 to .547 depending on the metric, as calculated in this sample in a previous report) and sets an upper limit on validity indicators such as those calculated here (Gansler et al., in press).
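To illustrate how low reliability caps validity, the sketch below applies the classical test theory bound, under which an observed correlation between two measures cannot exceed the square root of the product of their reliabilities. It rests on two assumptions that are not taken from the original analysis: the reported odd-even coefficients are used directly as reliability estimates (without any Spearman-Brown correction), and the criterion measure is treated as perfectly reliable. The function name `validity_ceiling` is hypothetical.

```python
import math

# Lowest and highest odd-even block correlations reported for this sample.
reliability_low, reliability_high = 0.219, 0.547


def validity_ceiling(r_xx, r_yy=1.0):
    """Classical test theory bound: |r_xy| <= sqrt(r_xx * r_yy).

    r_yy defaults to 1.0, i.e. the criterion is assumed perfectly reliable
    (an illustrative simplification, not a claim about the battery).
    """
    return math.sqrt(r_xx * r_yy)


ceiling_low = validity_ceiling(reliability_low)
ceiling_high = validity_ceiling(reliability_high)
print(f"validity ceiling: {ceiling_low:.3f} to {ceiling_high:.3f}")
```

Under these assumptions, the least reliable metric could correlate with any criterion at roughly .47 at most, which gives some context for the modest size of the coefficients reported here.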
In terms of validity, determined by both convergent and discriminant methods, the CD-AB trials 1–100 metric emerges as a member of the attention domain, whereas the CD-AB trials 41–100 metric emerges as a member of both the attention and the EF domains (though it is more robustly associated with attention). This finding echoes that of Brand and colleagues (2007), who found an association of later, but not all, IGT trial blocks with performance on the Wisconsin Card Sorting Task. Those investigators noted that early IGT trials may best be thought of as representing implicit memory functioning, and others have described early trials as measuring rational exploratory behavior (Dunn et al., 2006). The finding that CD-AB trials 41–100 and D-A trials 41–100 are influenced by both attention and EF, whereas CD-AB trials 1–100 is not, might help explain why performance on later blocks of the IGT is more sensitive to certain neurologic conditions (Roca et al., 2008; Torralva et al., 2009). It reinforces the notion that later trials represent decision-making under conditions of known risk (Brand et al., 2007). The influence of attentional processes could account for the insensitivity of IGT trials 1–100 in a study comparing persons with manic depressive illness to healthy controls (Clark et al., 2001) and may help to explain the circumstances under which it does and does not discriminate groups. It also echoes the distinction between cognitive proficiency variables, such as the attention latent variable, and general ability variables, such as the EF latent variable, made by Lichtenberger and Kaufman (2009) in their interpretation of intelligence tests.
The IGT administration calls for the participant to organize a sequence of actions toward a goal (i.e., maximizing the starting amount of $2,000; Fuster, 2008) and to activate and inhibit response sequences (i.e., by selecting cards from decks C and D but not A and B; Eslinger & Chakara, 2004). Thus, the IGT broadly fits these definitions of EF. It differs from some EF tasks by its incorporation of secondary reinforcer proxies ($2,000 in game money), and this might align it more closely with EF measures of social and emotional control (Beer, Shimamura, & Knight, 2004) than with the cognitive control aspects of classic EF tasks (Alvarez & Emory, 2006). In this sense, the IGT is responsive to calls for more ecologically valid instruments in clinical neuropsychology (Rabin et al., 2007), while being understood in the context of classical neuropsychological functional domains. Nevertheless, given the limitations that test reliability places on test validity, alterations in the administration procedures and stimulus presentation would be advantageous.
The results of this study are not completely consistent with previous research, including our own (Gansler et al., in press), on the relationship of the IGT with other neuropsychological variables. In that investigation, using a univariate correlational approach, the IGT was less robustly associated with attention and executive factors than with other factors. This discrepancy likely results from our use of SEM to examine these relationships in the present study. Using latent variables, which is central to SEM, provides advantages over the sole reliance on observed variables, which are used in regression models. A latent variable incorporates the results of several observed variables to produce a representation of a theoretical construct that is not directly measured (Kline, 2005). In all cases, an observed variable incorporates some amount of variance that is a function of the vicissitudes of the test's construction or administration, in addition to the part of the variance that is related to the underlying psychological construct it purports to measure, because it is impossible to measure a construct perfectly (Kline, 2005). In the case of the IGT, unwanted variance may include location bias introduced by the horizontal array of four decks, and the changing nature of the decision-making process from early to late trials. When observed variables alone are used, the variance that is not directly related to the psychological construct (error variance) cannot be separated from the variance that does measure the construct. When multiple observed variables contribute to a latent variable, the variance of the latent variable is constructed from the overlapping variance of the observed variables, reducing the impact of the error variance that is unique to each observed variable. Thus, the latent variable affords a purer measure of the construct because the error variance is better controlled.
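The attenuating effect of error variance described above can be illustrated with a small simulation. The Python sketch below is not the SEM used in this study: it substitutes a simple average of several noisy indicators for a true latent variable, and every quantity in it (sample size, number of indicators, error variances, the 0.5 path to the outcome) is an arbitrary assumption chosen for illustration. The names `pearson` and `simulate` are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation computed from first principles."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate(n=5000, k=4, seed=0):
    """Compare one noisy indicator with the mean of k noisy indicators.

    Each indicator is the true construct score plus independent error.
    The outcome depends on the true score only, so indicator-specific
    error can only weaken the observed association.
    """
    rng = random.Random(seed)
    single, composite, outcome = [], [], []
    for _ in range(n):
        true_score = rng.gauss(0, 1)
        indicators = [true_score + rng.gauss(0, 1) for _ in range(k)]
        single.append(indicators[0])
        composite.append(sum(indicators) / k)
        outcome.append(0.5 * true_score + rng.gauss(0, math.sqrt(0.75)))
    return pearson(single, outcome), pearson(composite, outcome)


r_single, r_composite = simulate()
print(f"single indicator r = {r_single:.3f}, composite r = {r_composite:.3f}")
```

Because the errors are independent across indicators, averaging shrinks their contribution, and the composite correlates more strongly with the outcome than any single indicator does. A latent variable in SEM goes further by modeling the shared variance explicitly, but the direction of the effect is the same.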
In addition, latent variables provide greater ecological validity. Any administered measure of a psychological construct does not fully and uniquely measure that construct. Utilizing several measures to evaluate a construct enhances conceptual convergence among the measures (e.g., Cohen, 1959). Therefore, the latent variable is a more valid reflection of the underlying construct than any individual measure. Finally, SEM tests all relationships among variables simultaneously, including direct and indirect paths between variables. This type of analysis cannot be completed with regression models. SEM, therefore, more validly models reality, where effects are complexly and inextricably interrelated. For these reasons, the strengths of SEM lead us to believe that the results of this study provide a valid representation of the latent constructs that underlie IGT performance.
Limitations of the present study include reliance on healthy participants alone, which has the advantage of homogeneity but the disadvantage of not generalizing to clinical populations. The EF latent variable was operationalized by two metrics from the WCST, whereas the other latent variables were operationalized by two distinct neuropsychological tasks. Thus, the EF latent variable could have been particularly vulnerable to error variance associated with a single measurement paradigm.
This work was supported by the National Institute of Mental Health (MH60504: Aging, Brain Imaging and Cognition study).
The authors gratefully acknowledge the statistical consultation of Dr Amy Marks and Dr Lance Swenson.