J Neurosci Methods. Author manuscript; available in PMC 2011 March 30.
Published in final edited form as:
PMCID: PMC2832711
NIHMSID: NIHMS161886

A cognitive neuroscience based computerized battery for efficient measurement of individual differences: Standardization and initial construct validation

Abstract

There is increased need for efficient computerized methods to collect reliable data on a range of cognitive domains that can be linked to specific brain systems. Such need arises in functional neuroimaging studies, where individual differences in cognitive performance are variables of interest or serve as confounds. In genetic studies of complex behavior, which require particularly large samples, such trait measures can serve as endophenotypes. Traditional neuropsychological tests, based on clinical pathological correlations, are protracted, require extensive training in administration and scoring, and leave lengthy paper trails (double-entry for analysis). We present a computerized battery that takes an average of 1 hour and provides measures of accuracy and speed on 9 neurocognitive domains. The tests are cognitive neuroscience-based in that they have been linked experimentally to specific brain systems with functional neuroimaging studies. We describe the process of translating tasks used in functional neuroimaging to tests for assessing individual differences. Data are presented on each test with samples ranging from 139 (81 female) to 536 (311 female) of carefully screened healthy individuals ranging in age from 18 to 84. Item consistency was established with acceptable to high Cronbach alpha coefficients. Inter-item correlations were moderate to high within domain and low to nil across domains, indicating construct validity. Initial criterion validity was demonstrated by sensitivity to sex differences and the effects of age, education and parental education. These results encourage the use of this battery in studies needing an efficient assessment of major neurocognitive domains such as multisite genetic studies and clinical trials.

1. Introduction

There is increased demand for an efficient and reliable method of measuring individual differences in cognitive domains that can be linked to brain systems. The application of functional neuroimaging with “neurobehavioral probes” (Gur et al., 1992) has contributed to understanding complex measures by dissociating their more basic components. Efforts to integrate neurobiology and genetics in studying heritable brain disorders increasingly incorporate quantitative continuous cognitive measures. Such “endophenotypes” complement the dichotomous diagnostic approach applied in genetic studies, and are needed to construct a mechanistic neurobiological model of neurodevelopmental disorders such as schizophrenia and autism (Gottesman and Gould, 2003; Gur et al., 2007), and of other disorders affecting performance.

The increased demand for efficient neurocognitive testing has confronted limitations of available test batteries, which are clinically based paper-and-pencil tests requiring extensive training in administration and scoring and producing a paper trail unmanageable in large-scale studies. Furthermore, traditional neuropsychological batteries are based on clinical-pathological correlations, where patients with brain disorders are tested and deficits are linked to disease-related brain systems. The advent of functional neuroimaging has enabled experimental approaches to isolating brain systems recruited for specific behavioral tasks. The evolving experimental fields of cognitive and affective neurosciences (e.g., Davidson, Jackson and Kalin, 2000; Posner and DiGirolamo, 2000; Panksepp, 1998) have yielded novel insights with more narrowly defined behavioral tasks. There has therefore been demand to adapt such “cognitive neuroscience-based” tasks into computerized tests that have been linked to specific brain systems in functional neuroimaging studies (e.g., RFA-MH-08-090, Adapting Basic Cognitive Measures for Clinical Assessment of Schizophrenia; see http://www.nimh.nih.gov/research-funding/grants/requests-for-applications.shtml).

The process of translating concepts and tasks developed by basic neuroscientists into clinically applicable instruments is challenging beyond the heterogeneity of clinical manifestations and the extent and pervasiveness of deficits. Several issues require consideration when adapting functional neuroimaging tasks for use as neurocognitive tests to determine individual differences. First, there is an inherent contrast between the goals of tasks in functional neuroimaging compared to tests used for establishing individual differences. The goal of tasks in functional imaging studies is to activate a specific neural circuitry, and the task is made deliberately easy so that performance is nearly perfect to avoid frustration confounds, or performance is used as a “nuisance variable.” Even when comparing patients to controls on a task where patients are putatively impaired, such methodological considerations have motivated investigators to seek ways for equating difficulty levels when contrasting activation between groups, at the expense of making the resulting tasks different (Holcomb et al., 2000). In contrast to tasks for neuroimaging studies, tests used in psychometric trait assessment focus on individual differences and must avoid ceiling and floor effects. Such tests need to walk the tightrope between being too easy, which will mask individual differences, and too hard, which can decrease motivation. It is especially important to construct tests so that patients have some experience of success and are not totally frustrated, while healthy people are sufficiently challenged. Finally, it is necessary to consider that some clever but subtle manipulations that yield elegant results in college undergraduates may prove of limited utility in patients with brain disorders.

We have developed and described a systematic procedure (Gur et al., 1992) for selecting appropriate behavioral constructs, assembling test items, and performing the various stages of validation. We have developed a battery of computerized tests and established its comparability with a traditional neuropsychological battery in a healthy normative sample (Gur et al., 2001a) and in patients with schizophrenia (Gur et al., 2001b). This battery has been applied in large-scale genetic studies yielding measures with substantial heritability and linkage (Aliyu et al., 2006; Greenwood et al., 2007; Gur et al., 2007a,b; Almasy et al., 2008). Here we present the method for developing the computerized battery, an evaluation of its psychometric properties, and initial evidence for its construct validity.

The computerized neurocognitive battery (CNB) takes approximately an hour to administer and includes tests that measure the following domains: abstraction and mental flexibility, attention, working memory, episodic memory (word, face, and spatial recognition memory), language reasoning, spatial processing, sensorimotor, motor speed, and emotion identification. The CNB yields measures of accuracy (number of correct responses) and speed [median response time (RT) for correct items]. This feature permits evaluation of the individual differences in strategy pertinent to the speed-accuracy tradeoff (Smith and Kounios, 1996). The previous version of the CNB has been supplemented with measures of working memory (the letter n-back paradigm, extensively validated in functional neuroimaging studies, e.g., Braver et al., 1997; Ragland et al., 1997; 2002; Rodriguez-Jimenez et al., 2009; Minzenberg et al., 2009), and a motor speed test (finger tapping, another measure of a narrow domain extensively validated with functional neuroimaging, e.g., Aizenstein et al., 2004). The current version also replaced the facial emotion identification test that only uses black-and-white happy and sad Caucasian faces. The new emotion identification test is based on a more advanced method used for face affect acquisition, described earlier in this journal (Gur et al., 2002a). The test includes four emotions (happy, sad, anger and fear), and uses an ethnically diverse set of posers. Here we present the CNB method of construction, normative data, and construct validation data by examining inter-correlations among the tests, and initial criterion validation through examining its sensitivity to performance-related individual differences dimensions including sex differences and age and education effects.

2. Method

2.1. Validation sample

Samples used during stages of developing and validating the battery included primarily college undergraduates who received course credit for participation. The sample for the normative study consisted of healthy volunteers recruited by a consortium of research centers. The sample size varies from 139 to 539 (see Table 1) depending on their inclusion in the participating sites. Participants were recruited through advertising in community outlets including newspapers, shopping centers, and community centers and organizations. Potential participants were screened medically and psychiatrically as detailed elsewhere (Calkins et al., 2006; Aliyu et al., 2006; Gur et al., 2007a) and only individuals with no medical or psychiatric diagnosis were included in this sample. In addition, participants had no first-degree relatives with psychotic or mood disorders.

Table 1
Sample sizes (#F=number of females), administration times, means, standard deviations (SD) and Cronbach’s alpha coefficients for the tests in the battery.

2.2. The development of CNB

We have developed a set of computerized neurobehavioral measures using tests that incorporate behavioral neuroscience-based tasks used in functional neuroimaging. Our approach to the task-to-test adaptation and the validation process was detailed in Gur et al. (1992). Briefly, we constructed tasks that can be presented with a wide range of difficulty levels while maintaining dimensional integrity and comparability between the “offline” assessment of individual differences and the system activated in a functional neuroimaging study. Tasks first undergo a process of construction and evaluation by a team of experimental and clinical investigators. This group helps carry the tasks through conceptualization, task assembly, stimulus assessment, and constructing a preliminary version. The resulting version is submitted to psychometric study to assess reliability and construct validity and obtain normative data. Data are also obtained at this stage to document comparability of test versions and the effects of retesting (“practice effects”). Tests included in the CNB have undergone this development process. The “pipeline” for adapting neuroscience based tasks for use as psychometric tests is represented in Figure 1.

Figure 1
A flow diagram for adapting cognitive neuroscience tasks as psychometric tests in neuroimaging and clinical studies.

The following stages are undertaken:

1. Conceptualization. This takes place in weekly meetings of faculty, fellows and students, where the neuroscience literature is reviewed with a focus on its relevance to neurodevelopmental disorders. At this stage we consult with basic science collaborators and colleagues and often invite consultants from other institutions.

2. Assembly. Potential stimuli are examined by the group and discussed in relation to complexity, physical characteristics, and suitability for patients. At this phase we also select the response modality and assemble the first version of a test, which includes a large number of items. This stage can require a large-scale project, as when we recorded facial expressions from 150 actors and actresses to generate the stimuli for the facial emotion identification tests (Gur et al., 2002a). Initial test versions are first evaluated internally by members of the center and feedback is obtained. Based on this experience, instructions are evaluated and revised. We involve experts in education and linguistics for feedback on content and presentation of the instructions. We then assemble an initial computerized version of the test that includes a training module.

3. Screening. This version is then piloted with students in a computer laboratory setting, where about 100–150 participants are tested in groups averaging 10 people. They take the tests under supervision of laboratory staff. Here we emphasize conscientious performance by test takers and obtaining thorough feedback from participants.

4. Construction. In this phase we perform an item analysis of the pilot data and, based on this analysis, we assemble alternate forms of the test. We then administer the test to small samples from a targeted patient population and from community controls (N = ~20 in each group), and perform item analysis to arrive at the final version that is submitted to a more rigorous validation process. Based on this experience and the psychometric analysis of the pilot data, we finalize the test. This includes instructions, training procedures, scoring and quality assurance.

5. Validation. The newly developed test is administered in conjunction with the current version of the CNB to a normative sample of healthy people from the community. Here we examine correlations between test performance and demographic features.

6. Dissemination. When a test is validated we make it available to the scientific community. The current versions of the tests have been written in the Adobe Flash® platform and are available via the web. The web-based platform provides a selection of computerized batteries as well as automated scoring and other features. An investigator interested in using the CNB can register for an account on the PennCNP® website at https://penncnp.med.upenn.edu/request.pl by submitting a form to verify research credentials and IRB compliance.

2.3. Method of administration

The CNB is administered using clickable icons on desktop or laptop computers, in a fixed order. The tests were implemented on Macintosh® computers using the PowerLaboratory® program (Chute and Westall, 1997). An Applescript® routine is used to collect participant IDs and basic demographic information and to present the tests in a prescribed order. A research assistant reads instructions for each test and observes as the participant performs the tests. Assistants are trained to administer the tests in a standard fashion, fostering optimal performance without aiding the participant. The test administrator provides a status code and comments regarding the validity of collected data. The results are uploaded to a data repository using an automated script, and scored using a program written in the Python programming language. A neuropsychologist reviews test sessions to determine data validity. For each domain, accuracy (number of correct responses) and speed (response time for correct answers) are computed.
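As an illustration of the two indices, the following minimal Python sketch computes accuracy (number of correct responses) and speed (median response time for correct responses) from trial-level data. The trial representation (pairs of a correctness flag and a response time in milliseconds) and the function name `score_test` are assumptions made for this example; it does not reproduce the actual scoring program.

```python
from statistics import median


def score_test(trials):
    """Return (accuracy, speed) for one test.

    `trials` is a hypothetical list of (correct, rt_ms) pairs.
    Accuracy is the count of correct responses; speed is the median
    response time over correct responses only (None if none are correct).
    """
    correct_rts = [rt for correct, rt in trials if correct]
    accuracy = len(correct_rts)
    speed = median(correct_rts) if correct_rts else None
    return accuracy, speed


# Made-up trials: three correct responses and one error.
trials = [(True, 812.0), (False, 1540.0), (True, 640.0), (True, 955.0)]
print(score_test(trials))  # (3, 812.0)
```

Restricting the median to correct trials keeps slow error responses from contaminating the speed index.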

As a first step in testing, the subject is acquainted with the computer and mouse and performs an un-speeded version of the Mouse Practice task (MP; Gur et al., 2001a). This practice is administered at the beginning of the session to assure the participant has sufficient skills in moving and clicking the mouse. Such practice helps “level the playing field” for people inexperienced with computers, such as the elderly. In addition, each test begins with a practice module, to assure understanding of the instructions. The domains and tests for each domain are (example stimuli for the tests are provided in Figure 2):

Figure 2
Examples of test stimuli and procedures.

Abstraction and mental flexibility (ABF)

Penn Conditional Exclusion Test (PCET; Kurtz et al., 2004) is a measure of abstraction and concept formation, with alternate forms. Subjects decide which of 4 objects does not belong with the other 3 based on one of three sorting principles (e.g., shape, size, line thickness). Sorting principles change after 10 successive correct responses, and feedback is used to guide discovery of the principle and indicate its change. There are 4 alternate forms available. An accuracy score is calculated by multiplying the proportion of correct responses by the number of categories attained (out of 3 possible).
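The PCET accuracy score described above can be sketched as follows. This is a minimal illustration of the stated formula (proportion of correct responses multiplied by the number of categories attained); the function name, argument names and the example counts are hypothetical.

```python
def pcet_accuracy(n_correct, n_trials, categories_attained, max_categories=3):
    """PCET accuracy: proportion correct times categories attained.

    The formula follows the description in the text; all names and the
    input validation are assumptions made for this sketch.
    """
    if n_trials <= 0:
        raise ValueError("n_trials must be positive")
    if not 0 <= n_correct <= n_trials:
        raise ValueError("n_correct must lie between 0 and n_trials")
    if not 0 <= categories_attained <= max_categories:
        raise ValueError("categories_attained out of range")
    return (n_correct / n_trials) * categories_attained


# Made-up example: 60 of 80 responses correct, all 3 categories attained.
print(pcet_accuracy(60, 80, 3))  # 2.25
```

Under this formula the score ranges from 0 to 3, so a participant who responds accurately but attains few categories still receives a low score.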

The PCET measures the ability to discover principles by hypothesis testing, where the principle shifts after its discovery is established. Such sorting tasks have been examined psychometrically (e.g., Miyake et al., 2000) and demonstrated robust involvement of frontal regions in extensive functional neuroimaging studies (e.g., Marenco et al., 1992; Adler et al., 2001; Crone et al., 2008; Specht et al., 2009). Individual differences in performance on such tasks have been related to frontal lobe volumes (Gunning-Dixon and Raz, 2003), dopaminergic influx (Vernaleken et al., 2007), and proneness to psychosis (Tallent and Gooding, 1999).

Attention (ATT)

The Penn Continuous Performance Test (PCPT; Kurtz et al., 2001) uses a standard CPT paradigm. The participant responds to a set of 7-segment displays presented 1/sec., whenever they form a digit (NUMBERS, initial 3 min) or letter (LETTERS, next 3 min). The number of true positive responses is recorded as the accuracy score and the median response time for true positive responses is the measure of attention speed.

The CPT is a well-established paradigm for measuring individual differences in vigilance and its developmental changes (e.g., Lin et al., 1999; Riccio et al., 2002). It has been related to activation of a frontal-parietal network (e.g., Adler et al., 2001; Ogg et al., 2008). The CPT task can help in the diagnosis of attention deficit disorders (Kooistra et al., 2009) and in detecting genetic susceptibility to psychosis (Chen and Faraone, 2009).

Working memory (WM)

The Letter N-Back (LNB; Ragland et al., 2002) presents letters for 500ms, and the participant has an additional 2000ms to respond by pressing the spacebar. There are three conditions: 0-Back - press the spacebar when the letter presented is an "X"; 1-Back - press when the letter presented is the same as the previous letter; 2-Back – press when the letter presented is the same as the one just before the previous letter. Following a training period, the test presents three blocks of each condition in a pre-determined order, for a total of 135 trials. The number of correct responses is recorded as the measure of accuracy and median response times for correct responses as a measure of speed.
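The three response rules can be made concrete with a short Python sketch that marks which letters in a sequence are targets under each condition. The function name and the example sequence are hypothetical, and the sketch covers only the target rule, not stimulus timing or block order.

```python
def nback_targets(letters, n):
    """Mark target positions for the n-back condition (n = 0, 1 or 2).

    0-back: the letter is "X".
    n-back (n >= 1): the letter matches the one presented n positions earlier.
    """
    targets = []
    for i, letter in enumerate(letters):
        if n == 0:
            targets.append(letter == "X")
        else:
            targets.append(i >= n and letter == letters[i - n])
    return targets


sequence = list("ABBXBXA")  # made-up letter stream
for n in (0, 1, 2):
    hits = [i for i, is_target in enumerate(nback_targets(sequence, n)) if is_target]
    print(f"{n}-back target positions: {hits}")
# 0-back target positions: [3, 5]
# 1-back target positions: [2]
# 2-back target positions: [4, 5]
```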

The domain of “working memory” was conceptualized by Baddeley (2000) as an episodic buffer, where multimodal information is continuously integrated. It has been studied extensively with functional neuroimaging, and demonstrated to recruit dorsolateral prefrontal regions (Braver et al., 1997; Ragland et al., 1997; 2002; Rodriguez-Jimenez et al., 2009; Minzenberg et al., 2009). Working memory has been shown to have a significant genetic loading (Blokland et al., 2008), and to decline with aging in relation to the integrity of the dorsolateral prefrontal cortex and related regions (Mattay et al., 2006). Abnormal activation has been linked to schizophrenia (Callicott et al., 2003; Perlstein et al., 2003), bipolar illness (Drapier et al., 2008) and traumatic brain injury (Newsome et al., 2007).

Episodic Memory (MEM)

Verbal Memory. The Penn Word Memory Test (PWMT; Gur et al., 1997) presents 20 target words that are then mixed with 20 distractors equated for frequency, length, concreteness and low imageability. The participant’s score reflects the number of correctly recognized targets and correctly rejected foils. Median response times for correct responses serve as a measure of speed. A 20 min delayed recall procedure is also administered.

Face Memory. The Penn Face Memory Test (PFMT; Gur et al., 1997) presents 20 digitized faces that are then mixed with 20 distractors equated for age, gender and ethnicity. The participant’s score reflects the number of correctly recognized targets and correctly rejected foils, and median response times for correct responses serve as a measure of speed. The procedure is repeated at 20 min delay.

Spatial Memory. The Visual Object Learning Test (VOLT; Glahn et al., 1997) uses Euclidean shapes as stimuli with the same paradigm as the word and face tests. The participant’s score reflects the number of correctly recognized targets and correctly rejected foils, and again median response times for correct responses serve as a measure of speed. The procedure is repeated at 20 min delay.

Two forms are available for each test.
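The recognition score shared by the three memory tests (correctly recognized targets plus correctly rejected foils) can be illustrated with a brief Python sketch. The item representation and function name are assumptions made for this example.

```python
def recognition_score(items):
    """Count hits plus correct rejections.

    `items` is a hypothetical list of (is_target, endorsed) pairs, where
    `endorsed` is True when the participant judged the item as previously seen.
    """
    hits = sum(1 for is_target, endorsed in items if is_target and endorsed)
    correct_rejections = sum(
        1 for is_target, endorsed in items if not is_target and not endorsed
    )
    return hits + correct_rejections


# Made-up responses: 2 targets (one missed) and 2 foils (one false alarm).
items = [(True, True), (True, False), (False, False), (False, True)]
print(recognition_score(items))  # 2
```

With 20 targets and 20 distractors, this score would range from 0 to 40.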

This triad of declarative memory tasks examines the main visual modalities in an immediate and delayed recognition format. Such recognition memory tasks activate frontal and bilateral anterior medial temporal lobe regions (Gur et al., 1997; Jackson and Schacter, 2004). They have been applied extensively in functional neuroimaging studies both in healthy people and in patients with brain dysfunction (Gur et al., 1997). Performance on these tasks is heritable and deficits indicate vulnerability to neuropsychiatric disorders, including schizophrenia (Aliyu et al., 2006; Almasy et al., 2008; Greenwood et al., 2007) and brain disorders such as epilepsy (Saykin et al., 1989).

Language Reasoning (LAN; Gur et al., 1982)

The abbreviated Penn Verbal Reasoning Test consists of 8 verbal analogy problems from the Educational Testing Service (ETS) factor-referenced test kit. The number of correct responses is entered as the accuracy score. Two forms are available.

Verbal reasoning is a reliable measure of general intellectual abilities that has been shown to activate left temporo-parietal regions (Gur et al., 1982; 2000). Individual differences in performance have substantial heritabilities (Almasy et al., 2008; Greenwood et al., 2007). Reduced activation has been associated with aging (Gur et al., 1987a) and abnormal activation has been linked to brain dysfunction (Gur et al., 1987b).

Spatial Processing (SPA)

The Judgment of Line Orientation (JOLO) is a computerized version of Benton’s (Benton, Varney, and Hamsher, 1978) test (Gur et al., 1982). It presents two lines at an angle, and participants indicate the corresponding lines on a simultaneously presented array. The number of correct responses is entered as the accuracy score.

The JOLO is a reliable measure of spatial orientation abilities that has been shown to activate right temporo-parietal regions (Gur et al., 1982; 2000). Individual differences in performance have substantial heritabilities (Almasy et al., 2008; Greenwood et al., 2007). Reduced activation has been associated with aging (Gur et al., 1987a) and abnormal activation has been linked to brain dysfunction (Gur et al., 1987b).

Sensorimotor Processing Speed (SM)

The Mouse Practice task (MP; Gur et al., 2001a) requires moving the mouse and clicking as quickly as possible on a green square that disappears after the click. The square gets increasingly small. Since it is rare for participants to miss a target, no accuracy score is calculated and the median response time is used as the main dependent measure. Two forms are available. An un-speeded version of this task is administered at the beginning of the session to assure the participant has sufficient skills in moving and clicking the mouse.

Motor Speed (MOT)

The Computerized Finger Tapping Test (CTAP; Coleman et al., 1997; Almasy et al., 2008) measures how quickly the participant can press the spacebar using only the index finger. After a practice trial with each hand, the test presents five trials for the dominant hand alternating with five trials for the non-dominant hand. In each, the participant is asked to tap the spacebar repeatedly for 10s when the green "GO" screen is presented. The computer records the number of taps.

The finger-tapping procedure provides a measure of motor speed, which has been used in multiple functional neuroimaging studies because it maps well into motor and premotor regions contralateral to the moving finger, a response which diminishes with aging (Aizenstein et al., 2004). It correlates with other measures of motor speed, and shows a robust sex difference favoring males (Saykin et al., 1995).

Emotion Identification (EMO)

We used the 40-item facial affect identification Emotion Recognition test (ER-40; the method for its development was described earlier in this journal, Gur et al., 2002a). Facial displays of 4 emotions (Happy, Sad, Anger, Fear) and Neutral faces, 8 each, are presented and the subject identifies the emotion in a multiple-choice format. The facial stimuli are balanced for gender, age, and ethnicity. The number of correct identifications is used as an accuracy score, and median response times for correct responses serve as a measure of speed. Two forms are available.

Affect identification is the only test in this battery that measures social cognition, a domain of increased interest in behavioral neuroscience (Davidson et al., 2000). It has been applied in large normative populations (Mathersul et al., 2008; Williams et al., 2008), showing sex differences and age effects. Measures show high heritability, and deficits in facial affect identification have been linked to genetic susceptibility to neuropsychiatric disorders (Aliyu et al., 2006; Almasy et al., 2008; Greenwood et al., 2007). The facial affect recognition task has been applied in multiple functional neuroimaging studies and demonstrated to activate temporo-limbic regions (Gur et al., 2002b; Moser et al., 2007; Derntl et al., 2008), a response that diminishes with aging (Gunning-Dixon et al., 2003). Abnormal activation has been linked to schizophrenia (R.E. Gur et al., 2002) and depression (Beesdo et al., 2009).

3. Results

3.1. Performance and internal consistency

The performance data for each test and information on its internal consistency are provided in Table 1. As can be seen, administration time of the CNB is approximately one hour and it yields moderate to high coefficient alpha values. Internal consistency estimates were higher for speed than for accuracy measures.

The intercorrelations among the tests are presented in Table 2. As can be seen, the measures are more highly intercorrelated for speed (lower triangle) than for accuracy (upper triangle). They also suggest greater intercorrelations within memory measures, while other correlations range from moderate to nil. These correlations thus support the construct validity of the battery.

Table 2
Intercorrelations among the performance indices for tests included in the battery. The upper diagonal shows correlations with accuracy (hence none available for SM and MOT), and the lower diagonal shows correlations for RT. Correlations significant at ...

3.2. Sensitivity to dimensions of individual differences

Initial criterion validation of the battery included an examination of its sensitivity to dimensions of individual differences known to affect performance. We evaluated sex differences, correlations with age, and correlation with the socioeconomic indices of education and parental education.

Sex Differences

Men and women did not differ in age (mean±SD men 36.7±13.5, range 18–84; women 36.4±15.3, range 18–84, t = 0.25, df = 533, p = 0.8035), education (men 15.2±2.4, range 10 to 20; women 15.4±2.4, range 8 to 20, t = 0.87, df = 533, p = 0.3822) or parental education (men 13.8±3.4, range 3 to 20; women 14.1±3.1, range 2 to 20, t = 1.23, df = 533, p = 0.2188). Figure 3 shows means of males and females on the neurocognitive measures. As can be seen, men were more accurate and faster on the attention domain and the spatial task, and faster on the motor speed task, while women were more accurate in the working memory, word and face memory, and emotion processing tests.

Figure 3
Mean (±SEM) of men (blue bars) and women (red bars) on the tests included in the battery. Domain names as in Table 1. Note that no accuracy measures are available for the sensorimotor test because no errors were made and for the motor speed test ...

Age Effects

Figure 4 shows correlations with age for both accuracy and speed, with error bars indicating 95% confidence intervals based on 1000 bootstrapped correlations. As can be seen, accuracy is negatively correlated with age for several measures, while response time increases with age for nearly all measures. The exception is working memory, which does not show reduced performance with aging either for accuracy or for speed. Immediate word memory, verbal reasoning and spatial processing do not show reduced accuracy with age.
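A bootstrapped confidence interval of this kind can be sketched in Python as follows. The text specifies only that the intervals are based on 1000 bootstrapped correlations; the percentile method, the resampling of participants as (age, score) pairs, the function names and the simulated data are all assumptions made for this illustration.

```python
import random
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def bootstrap_ci(x, y, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for r (assumed method, not taken from the text)."""
    rng = random.Random(seed)
    n = len(x)
    rs = []
    while len(rs) < n_boot:
        # Resample participants with replacement, keeping pairs intact.
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        ys = [y[i] for i in idx]
        if len(set(xs)) < 2 or len(set(ys)) < 2:
            continue  # skip degenerate resamples with zero variance
        rs.append(pearson_r(xs, ys))
    rs.sort()
    lower = rs[round((alpha / 2) * n_boot)]
    upper = rs[round((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Simulated data: response time increasing with age, plus noise.
rng = random.Random(1)
ages = [rng.uniform(18, 84) for _ in range(100)]
rts = [500 + 5 * age + rng.gauss(0, 50) for age in ages]
low, high = bootstrap_ci(ages, rts)
print(round(pearson_r(ages, rts), 2), round(low, 2), round(high, 2))
```

An interval that excludes zero corresponds to a correlation whose sign is reliable at the chosen level.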

Figure 4
Correlations of age with accuracy (black bars) and response time (RT; gray bars) indices of performance on the tests. Error bars indicate 95% confidence intervals based on 1000 bootstraps. As seen, the effects of age are stronger for speed than for accuracy, ...

Correlations with education and parental education

Figure 5 shows correlations with education (left panel) and parental education (right panel) for both accuracy and speed. As can be seen, education is associated with greater accuracy and speed for abstraction and flexibility, working memory and spatial processing, better accuracy for face and spatial memory, language reasoning and emotion processing, and faster motor speed. Correlations with parental education are generally higher and are seen both for accuracy and speed. Thus, parental education is associated with both better accuracy and speed for executive functions (abstraction, attention and working memory), immediate word memory, spatial and emotion processing, better accuracy in face and spatial memory, and faster sensorimotor processing and motor speed.

Figure 5
Correlations of education (left panel) and parental education (right panel) with accuracy (black bars) and response time (RT; gray bars) indices of performance on the tests. Error bars indicate 95% confidence intervals based on 1000 bootstraps. ...

4. Discussion

The results indicate the overall feasibility of using this brief computerized battery in research. A large sample of healthy volunteers was studied without difficulties; the average administration time was about an hour and the measures yielded moderate to high indices of reliability, construct validity and, more preliminarily, criterion validity.

Regarding reliability, the coefficient alpha values indicated acceptable to high internal consistency for all tests. They were expectedly higher for response times, for which they ranged from .777 for delayed spatial recognition to .969 for CPT numbers, than for accuracy, for which they ranged from .587 for the ER40 to .954 for total CPT. Thus, each test samples a coherent behavioral domain. While test-retest reliability is important to establish as well, and this effort is underway, high internal consistency indices augur well for good repeatability of the measures. The high coefficient alphas suggest good reliability because, as a generalization of the split-half reliability, they estimate the likelihood of obtaining the same scores under multiple simultaneous administrations (Allen and Yen, 1979).
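For reference, coefficient alpha for a test with k items can be computed from the item variances and the variance of the total score, alpha = [k / (k − 1)] × [1 − (sum of item variances) / (variance of total scores)]. The Python sketch below implements this standard formula; the function name and the small data matrix are hypothetical and are not taken from the normative sample.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a participants-by-items score matrix.

    `scores` is a list of rows, one per participant, each holding k item
    scores. Sample variances are used for items and totals alike.
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha requires at least two items")
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Made-up binary item scores for 4 participants on 3 items.
scores = [
    [1, 1, 1],
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 0],
]
print(round(cronbach_alpha(scores), 3))  # 0.75
```

Alpha approaches 1 when items covary strongly, so that the variance of the total score greatly exceeds the sum of the item variances.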

The inter-correlations among tests were again generally higher for response time, where they ranged from .042 between SM and PVRT to .899 between CPT for numbers and CPT for letters, than for accuracy, where they ranged from −.075 between CPT for numbers and LNB to .632 between CJOLO and PVRT. The intercorrelations overall were moderate enough to indicate that the tests are diverse and do not measure the same domain repetitively. Notably, the highest accuracy correlation was observed between a spatial test, requiring the judgment of line orientation, and a purely language test, requiring verbal reasoning. Both were expected to measure higher order reasoning related to lateralized cortical processing, and their correlation supports the underlying model of brain-behavior relationships, thus supporting the construct validity of the battery.

The results of initial criterion validation were encouraging. The tests showed sensitivity to sex differences, with the expected better performance for males on spatial and motor tasks and better performance for females on memory and emotion processing tasks (Halpern et al., 2007). Less expected was the better performance for males on the CPT letters condition and for females on the LNB test. These new findings of sex differences suggest better vigilance abilities in males and better working memory abilities in females; they need replication.

The sensitivity of the tests to effects of normal aging provides further validation. Performance showed moderate to high correlations with age, and these were more pronounced for speed than for accuracy. Furthermore, as expected from the literature, the effects of aging were more pronounced for memory and “fluid intelligence” tasks (abstraction and mental flexibility, attention) than for “crystallized intelligence” tasks (working memory, verbal reasoning, spatial processing). While older adults may have less experience with computers, the tasks were designed to require minimal computer skills from the participants. In addition, a training module familiarizes all participants with the computer interface, and practice is provided before each task. Anecdotally, older adults were fully engaged, and age effects were similar to those seen with paper-and-pencil measures.

Correlations with education and parental education were apparent, although they were generally higher for accuracy than for speed. More education was associated with better working memory, verbal reasoning and spatial processing skills, as well as better memory accuracy and faster motor speed. Higher parental education was associated with better performance across domains.

The overall results are positive, and the data presented here (and available to investigators, along with all the tests and a manual, at https://penncnp.med.upenn.edu) can be used as a normative database for comparison with a range of clinical populations. The data can also facilitate identification of young people at risk for brain disorders and may offer avenues for early intervention and rehabilitation. The computerized format offers many advantages over traditional measures. It provides an efficient means to obtain both accuracy and speed, requires minimal training of administrators, and offers automated scoring, and its procedures and measures are linked to experimental data obtained in functional neuroimaging. The CNB has been successfully implemented in large-scale genetic studies that, in further support of its validity, have reported significant heritability of performance across domains (Greenwood et al., 2007; Gur et al., 2007). Instructions have been translated into several languages and used in research including genetic studies and pharmacological trials. The incorporation of neurocognitive and affective neuroscience probes in genetically informative samples can provide the next stage of identifying brain systems that are targets for intervention, aiming to modify deficits observed in cognitive and emotion processing. Our current methodological efforts are to find the smallest number of items needed for each test, so as to eventually create even more efficient multiple forms of the battery for repeated studies.

Acknowledgments

Supported by grants MH-084856, MH-64045 and MH-60722.

We thank the research teams that assisted in task development and administration, and the following NIMH supported collaborating consortia: Multiplex Multigenerational Investigation of Schizophrenia (MGI MH49142), Project Among African Americans to Explore Risks for Schizophrenia (PAARTNERS MH66121) and Consortium on the Genetics of Schizophrenia (COGS MH65578).

Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

References

  • Adler CM, Sax KW, Holland SK, Schmithorst V, Rosenberg L, Strakowski SM. Changes in neuronal activation with increasing attention demand in healthy volunteers: an fMRI study. Synapse. 2001;42:266–272. [PubMed]
  • Aizenstein HJ, Clark KA, Butters MA, Cochran J, Stenger VA, Meltzer CC, Reynolds CF, Carter CS. The BOLD hemodynamic response in healthy aging. J Cogn Neurosci. 2004;16:786–793. [PubMed]
  • Aliyu MH, Calkins ME, Swanson CL, Jr, Lyons PD, Savage RM, May R, Wiener H, McLeod-Bryant S, Nimgaonkar VL, Ragland JD, Gur RE, Gur RC, Bradford LD, Edwards N, Kwentus J, McEvoy JP, Santos AB, McCleod-Bryant S, Tennison C, Go RC, Allen TB. Project among African-Americans to explore risks for schizophrenia (PAARTNERS): recruitment and assessment methods. Schizophrenia Research. 2006;87:32–44. [PubMed]
  • Allen M, Yen W. Introduction to Measurement Theory. Monterey, CA: Brooks/Cole; 1979.
  • Almasy L, Gur RC, Haack K, Cole SA, Calkins ME, Peralta JM, Hare E, Prasad K, Pogue-Geile MF, Nimgaonkar V, Gur RE. A genome screen for quantitative trait loci influencing schizophrenia and neurocognitive phenotypes. Am J Psychiatry. 2008;165:1185–1192. [PMC free article] [PubMed]
  • Baddeley A. The episodic buffer: a new component of working memory. Trends Cogn Sci. 2000;4:417–423. [PubMed]
  • Beesdo K, Lau JY, Guyer AE, McClure-Tone EB, Monk CS, Nelson EE, Fromm SJ, Goldwin MA, Wittchen HU, Leibenluft E, Ernst M, Pine DS. Common and distinct amygdala-function perturbations in depressed vs anxious adolescents. Arch Gen Psychiatry. 2009;66:275–285. [PMC free article] [PubMed]
  • Benton AL, Varney NR, Hamsher KD. Visuospatial judgment. A clinical test. Arch Neurol. 1978;35:364–367. [PubMed]
  • Blokland GA, McMahon KL, Hoffman J, Zhu G, Meredith M, Martin NG, Thompson PM, de Zubicaray GI, Wright MJ. Quantifying the heritability of task-related brain activation and performance during the N-back working memory task: a twin fMRI study. Biol Psychol. 2008;79:70–79. [PMC free article] [PubMed]
  • Braver TS, Cohen JD, Nystrom LE, Jonides J, Smith EE, Noll DC. A parametric study of prefrontal cortex involvement in human working memory. Neuroimage. 1997;5:49–62. [PubMed]
  • Calkins ME, Dobie DJ, Cadenhead KS, Olincy A, Freedman R, Green MF, Greenwood TA, Gur RE, Gur RC, Light GA, Mintz J, Nuechterlein KH, Radant AD, Schork NJ, Seidman LJ, Siever LJ, Silverman JM, Stone WS, Swerdlow NR, Tsuang DW, Tsuang MT, Turetsky BI, Braff DL. The Consortium on the Genetics of Endophenotypes in Schizophrenia: model recruitment, assessment, and endophenotyping methods for a multisite collaboration. Schizophr Bull. 2007;33:33–48. [PMC free article] [PubMed]
  • Callicott JH, Egan MF, Mattay VS, Bertolino A, Bone AD, Verchinksi B, Weinberger DR. Abnormal fMRI response of the dorsolateral prefrontal cortex in cognitively intact siblings of patients with schizophrenia. Am J Psychiatry. 2003;160:709–719. [PubMed]
  • Chen WJ, Faraone SV. Sustained attention deficits as markers of genetic susceptibility to schizophrenia. Am J Med Genet. 2000;97:52–57. [PubMed]
  • Chute DL, Westall RF. PowerLaboratory. Devon, Pa: MacLaboratory, Incorporated; 1997.
  • Coleman AR, Moberg PJ, Ragland JD, Gur RC. Comparison of the Halstead-Reitan and infrared light beam finger tappers. Assessment. 1997;4:277–286.
  • Crone EA, Zanolie K, Van Leijenhorst L, Westenberg PM, Rombouts SA. Neural mechanisms supporting flexible performance adjustment during development. Cogn Affect Behav Neurosci. 2008;8:165–177. [PubMed]
  • Davidson RJ, Jackson DC, Kalin NH. Emotion, plasticity, context, and regulation: Perspectives from affective neuroscience. Psychol Bull. 2000;126:890–909. [PubMed]
  • Derntl B, Windischberger C, Robinson S, Lamplmayr E, Kryspin-Exner I, Gur RC, Moser E, Habel U. Facial emotion recognition and amygdala activation are associated with menstrual cycle phase. Psychoneuroendocrinology. 2008;33:1031–1040. [PubMed]
  • Drapier D, Surguladze S, Marshall N, Schulze K, Fern A, Hall MH, Walshe M, Murray RM, McDonald C. Genetic liability for bipolar disorder is characterized by excess frontal activation in response to a working memory task. Biol Psychiatry. 2008;64:513–520. [PubMed]
  • Glahn DC, Gur RC, Ragland JD, Censits DM, Gur RE. Reliability, performance characteristics, construct validity, and an initial clinical application of a visual object learning test (VOLT). Neuropsychology. 1997;11:602–612. [PubMed]
  • Gottesman II, Gould TD. The endophenotype concept in psychiatry: etymology and strategic intentions. Am J Psychiatry. 2003;160:636–645. [PubMed]
  • Greenwood TA, Braff DL, Light GA, Cadenhead KS, Calkins ME, Dobie DJ, Freedman R, Green MF, Gur RE, Gur RC, Mintz J, Nuechterlein KH, Olincy A, Radant AD, Seidman LJ, Siever LJ, Silverman JM, Stone WS, Swerdlow NR, Tsuang DW, Tsuang MT, Turetsky BI, Schork NJ. Initial heritability analyses of endophenotypic measures for schizophrenia: the consortium on the genetics of schizophrenia. Arch Gen Psychiatry. 2007;64:1242–1250. [PubMed]
  • Gunning-Dixon FM, Gur RC, Perkins AC, Schroeder L, Turner T, Turetsky BI, Chan RM, Loughead JW, Alsop DC, Maldjian J, Gur RE. Age-related differences in brain activation during emotional face processing. Neurobiol Aging. 2003;24:285–295. [PubMed]
  • Gunning-Dixon FM, Raz N. Neuroanatomical correlates of selected executive functions in middle-aged and older adults: a prospective MRI study. Neuropsychologia. 2003;41:1929–1941. [PubMed]
  • Gur RC, Gur RE, Obrist WD, Hungerbuhler JP, Younkin D, Rosen AD, Skolnick BE, Reivich M. Sex and handedness differences in cerebral blood flow during rest and cognitive activity. Science. 1982;217:659–661. [PubMed]
  • Gur RC, Gur RE, Obrist WD, Skolnick BE, Reivich M. Age and regional cerebral blood flow at rest and during cognitive activity. Arch Gen Psychiatry. 1987;44:617–621. [PubMed]
  • Gur RC, Gur RE, Silver FL, Obrist WD, Skolnick BE, Kushner M, Hurtig HI, Reivich M. Regional cerebral blood flow in stroke: hemispheric effects of cognitive activity. Stroke. 1987;18:776–780. [PubMed]
  • Gur RC, Erwin RJ, Gur RE. Neurobehavioral probes for physiologic neuroimaging studies. Arch Gen Psychiatry. 1992;49:409–414. [PubMed]
  • Gur RC, Ragland JD, Mozley LH, Mozley PD, Smith R, Alavi A, Bilker W, Gur RE. Lateralized changes in regional cerebral blood flow during performance of verbal and facial recognition tasks: Correlations with performance and "Effort". Brain and Cognition. 1997;33:388–414. [PubMed]
  • Gur RC, Alsop D, Glahn D, Petty R, Swanson CL, Maldjian JA, Turetsky BI, Detre JA, Gee J, Gur RE. An fMRI study of sex differences in regional activation to a verbal and a spatial task. Brain and Language. 2000;74:157–170. [PubMed]
  • Gur RC, Ragland D, Moberg PJ, Turner TH, Bilker WB, Kohler C, Siegel SJ, Gur RE. Computerized neurocognitive scanning: I. Methodology and validation in healthy people. Neuropsychopharmacology. 2001;25:766–776. [PubMed]
  • Gur RC, Ragland J, Moberg P, Bilker W, Kohler C, Siegel S, Gur R. Computerized neurocognitive scanning: II. The profile of schizophrenia. Neuropsychopharmacology. 2001;25:777–788. [PubMed]
  • Gur RC, Sara R, Hagendoorn M, Marom O, Hughett P, Macy L, Turner T, Bajcsy R, Posner A, Gur RE. A method for obtaining 3-dimensional facial expressions and its standardization for use in neurocognitive studies. J Neurosci Methods. 2002;115:137–143. [PubMed]
  • Gur RC, Schroeder L, Turner T, McGrath C, Chan RM, Turetsky BI, Alsop D, Maldjian J, Gur RE. Brain activation during facial emotion processing. NeuroImage. 2002;16:651–662. [PubMed]
  • Gur RE, McGrath C, Chan RM, Schroeder L, Turner T, Turetsky BI, Kohler C, Alsop D, Maldjian J, Ragland JD, Gur RC. An fMRI study of facial emotion processing in schizophrenia. Am J Psychiatry. 2002;159:1992–1999. [PubMed]
  • Gur RE, Calkins ME, Gur RC, Horan WP, Nuechterlein KH, Seidman LJ, Stone WS. The Consortium on the Genetics of Schizophrenia: neurocognitive endophenotypes. Schizophr Bull. 2007;33:49–68. [PMC free article] [PubMed]
  • Gur RE, Nimgaonkar VL, Almasy L, Calkins ME, Ragland JD, Pogue-Geile MF, Kanes S, Blangero J, Gur RC. Neurocognitive endophenotypes in a multiplex multigenerational family study of schizophrenia. Am J Psychiatry. 2007;164:813–819. [PubMed]
  • Halpern D, Benbow C, Geary DC, Gur RC, Hyde J, Gernsbacher MA. The science of sex differences in science and mathematics. Psychological Science in the Public Interest. 2007;8:1–52.
  • Holcomb HH, Lahti AC, Medoff DR, Weiler M, Dannals RF, Tamminga CA. Brain activation patterns in schizophrenic and comparison volunteers during a matched-performance auditory recognition task. Am J Psychiatry. 2000;157:1634–1645. [PubMed]
  • Jackson O, 3rd, Schacter DL. Encoding activity in anterior medial temporal lobe supports subsequent associative recognition. Neuroimage. 2004;21:456–462. [PubMed]
  • Kooistra L, Crawford S, Gibbard B, Ramage B, Kaplan BJ. Differentiating attention deficits in children with fetal alcohol spectrum disorder or attention-deficit-hyperactivity disorder. Dev Med Child Neurol. 2009 Jun 22; [Epub ahead of print] [PubMed]
  • Kurtz MM, Ragland JD, Bilker W, Gur RC, Gur RE. Comparison of the continuous performance test with and without working memory demands in healthy controls and patients with schizophrenia. Schizophr Res. 2001;48:307–316. [PubMed]
  • Kurtz MM, Ragland JD, Moberg PJ, Gur RC. The Penn Conditional Exclusion Test: a new measure of executive-function with alternate forms for repeat administration. Arch Clin Neuropsychol. 2004;19:191–201. [PubMed]
  • Lin CC, Hsiao CK, Chen WJ. Development of sustained attention assessed using the continuous performance test among children 6–15 years of age. J Abnorm Child Psychol. 1999;27:403–412. [PubMed]
  • Marenco S, Coppola R, Daniel DG, Zigun JR, Weinberger DR. Regional cerebral blood flow during the Wisconsin Card Sorting Test in normal subjects studied by xenon-133 dynamic SPECT: comparison of absolute values, percent distribution values, and covariance analysis. Psychiatry Res. 1993;50:177–192. [PubMed]
  • Mathersul D, Palmer DM, Gur RC, Gur RE, Cooper N, Gordon E, Williams LM. Explicit identification and implicit recognition of facial emotions: II. Core domains and relationships with general cognition. J Clin Exp Neuropsychol. 2008;19:1–14. [PubMed]
  • Mattay VS, Fera F, Tessitore A, Hariri AR, Berman KF, Das S, Meyer-Lindenberg A, Goldberg TE, Callicott JH, Weinberger DR. Neurophysiological correlates of age-related changes in working memory capacity. Neurosci Lett. 2006;392:32–37. [PubMed]
  • Minzenberg MJ, Laird AR, Thelen S, Carter CS, Glahn DC. Meta-analysis of 41 functional neuroimaging studies of executive function in schizophrenia. Arch Gen Psychiatry. 2009;66:811–822. [PMC free article] [PubMed]
  • Miyake A, Friedman NP, Emerson MJ, Witzki AH, Howerter A, Wager TD. The unity and diversity of executive functions and their contributions to complex "Frontal Lobe" tasks: a latent variable analysis. Cogn Psychol. 2000;41:49–100. [PubMed]
  • Moser E, Derntl B, Robinson S, Fink B, Gur RC, Grammer K. Amygdala activation at 3T in response to human and avatar facial expressions of emotions. J Neurosci Methods. 2007;161:126–133. [PubMed]
  • Newsome MR, Scheibel RS, Steinberg JL, Troyanskaya M, Sharma RG, Rauch RA, Li X, Levin HS. Working memory brain activation following severe traumatic brain injury. Cortex. 2007;43:95–111. [PubMed]
  • Ogg RJ, Zou P, Allen DN, Hutchins SB, Dutkiewicz RM, Mulhern RK. Neural correlates of a clinical continuous performance test. Magn Reson Imaging. 2008;26:504–512. [PubMed]
  • Panksepp J. Affective neuroscience: the foundations of human and animal emotions. New York: Oxford University Press; 1998.
  • Perlstein WM, Dixit NK, Carter CS, Noll DC, Cohen JD. Prefrontal cortex dysfunction mediates deficits in working memory and prepotent responding in schizophrenia. Biol Psychiatry. 2003;53:25–38. [PubMed]
  • Posner MI, DiGirolamo GJ. Cognitive neuroscience: Origins and promise. Psychol Bull. 2000;126:873–889. [PubMed]
  • Ragland JD, Glahn DC, Gur RC, Censits DM, Smith RJ, Mozley PD, Alavi A, Gur RE. PET regional cerebral blood flow change during working and declarative memory: Relationship with task performance. Neuropsychology. 1997;11:222–231. [PubMed]
  • Ragland JD, Turetsky BI, Gur RC, Gunning-Dixon F, Turner T, Schroeder L, Chan R, Gur RE. Working memory for complex figures: an fMRI comparison of letter and fractal n-back tasks. Neuropsychology. 2002;16:370–379. [PubMed]
  • Riccio CA, Reynolds CR, Lowe P, Moore JJ. The continuous performance test: a window on the neural substrates for attention? Arch Clin Neuropsychol. 2002;17:235–272. [PubMed]
  • Rodriguez-Jimenez R, Avila C, Garcia-Navarro C, Bagney A, Aragon AM, Ventura-Campos N, Martinez-Gras I, Forn C, Ponce G, Rubio G, Jimenez-Arriero MA, Palomo T. Differential dorsolateral prefrontal cortex activation during a verbal n-back task according to sensory modality. Behav Brain Res. 2009;205:299–302. [PubMed]
  • Saykin AJ, Gur RC, Sussman NM, Gur RE. Memory deficits before and after temporal lobectomy: Effect of laterality and age of onset. Brain and Cognition. 1989;9:191–200. [PubMed]
  • Saykin AJ, Gur RC, Gur RE, Shtasel DL, Flannery KA, Mozley LH, Malamut BL, Watson B, Mozley PD. Normative neuropsychological test performance: Effects of age, education, gender and ethnicity. Applied Neuropsychology. 1995;2:79–88. [PubMed]
  • Smith RW, Kounios J. Sudden insight: all-or-none processing revealed by speed-accuracy decomposition. J Exp Psychol Learn Mem Cogn. 1996;22:1443–1462. [PubMed]
  • Specht K, Lie CH, Shah NJ, Fink GR. Disentangling the prefrontal network for rule selection by means of a non-verbal variant of the Wisconsin Card Sorting Test. Hum Brain Mapp. 2009;30:1734–1743. [PubMed]
  • Tallent KA, Gooding DC. Working memory and Wisconsin Card Sorting Test performance in schizotypic individuals: a replication and extension. Psychiatry Res. 1999;89:161–170. [PubMed]
  • Vernaleken I, Buchholz HG, Kumakura Y, Siessmeier T, Stoeter P, Bartenstein P, Cumming P, Gründer G. 'Prefrontal' cognitive performance of healthy subjects positively correlates with cerebral FDOPA influx: an exploratory [18F]-fluoro-L-DOPA-PET investigation. Hum Brain Mapp. 2007;28:931–939. [PubMed]
  • Williams LM, Mathersul D, Palmer DM, Gur RC, Gur RE, Gordon E. Explicit identification and implicit recognition of facial emotions: I. Age effects in males and females across 10 decades. J Clin Exp Neuropsychol. 2008;19:1–21. [PubMed]