Cognitive impairment in schizophrenia is often severe, enduring, and contributes significantly to chronic disability. But clinicians have difficulty in assessing cognition due to a lack of brief instruments. We evaluated whether a brief battery of cognitive tests derived from larger batteries could generate a summary score representing global cognitive function. Using data from 3 previously published trials, we calculated the corrected item-total correlations (CITCs) or the correlation of each test with the battery total score. We computed the proportion of variance that each test shares with the global score excluding that test ($R_t^2 = \mathrm{CITC}_t^2$) and the variance explained per minute of administration time for each test ($R_t^2/\mathrm{min}$). The 3 tests with the highest $R_t^2/\mathrm{min}$ were selected for the brief battery. The composite score from the trail making test B, category fluency, and digit symbol correlated .86 with the global score of the larger battery in 2 of the studies and correlated between .73 and .82 with the total battery scores excluding these 3 tests. A Brief Cognitive Assessment Tool for Schizophrenia (B-CATS) using the above 3 tests can be administered in 10–11 min. The full batteries of the larger studies have administration times ranging from 90 to 210 min. Given prior research suggesting that a single factor of global cognition best explains the pattern of cognitive deficit in schizophrenia, an instrument like B-CATS can provide clinicians with meaningful data regarding their patients’ cognitive function. It can also serve researchers who want an estimate of global cognitive function without requiring a full neuropsychological battery.
Schizophrenia is a disorder that affects 1% of the population and costs the US tens of billions of dollars a year, about half of which is attributable to indirect costs, such as unemployment.1 While positive and negative symptoms contribute to morbidity, multiple studies have demonstrated that cognitive impairment in schizophrenia contributes most to chronic disability and unemployment.2 The global cognitive deficit in schizophrenia is identifiable by the first episode of psychosis, endures over time, and is large—averaging between 1 and 2 standard deviations (SDs) below that of healthy control subjects.3 Patients with schizophrenia are especially impaired in the areas of verbal memory, attention, speed of processing, and executive function, with deficits up to 2.5 SD below control subjects.3 As clinicians and researchers alike become more attuned to the importance of this cognitive impairment, there is a growing need for evaluative tools to allow clinicians to appropriately identify and treat the cognitive burden of schizophrenia.
In the area of schizophrenia and cognition, projects such as the Measurement and Treatment Research to Improve Cognition in Schizophrenia (MATRICS) and the Treatment Units for Research on Neurocognition and Schizophrenia (TURNS) are directing the attention of the pharmaceutical industry to cognitive impairment as a target for future treatments for schizophrenia (www.matrics.ucla.edu, www.turns.ucla.edu). At the same time, psychosocial and cognitive rehabilitation researchers are developing and evaluating promising new nonpharmacological strategies for enhancing cognition in schizophrenia.4,5 However, even as new pharmacological, rehabilitative, and psychotherapeutic interventions are being developed, there remains a serious gap in this area: clinicians (and researchers) do not have a well-validated instrument to measure cognition that can be administered and interpreted easily in a clinical setting. Furthermore, clinicians are poor at accurately evaluating cognitive functioning in typical clinical interviews and often underestimate the degree of deficit.6 As a result, clinicians in office or hospital practice cannot effectively evaluate which patients are candidates for new treatments or how well those treatments work in individual patients.
The administration of a full neuropsychological battery and its interpretation by a neuropsychologist remains the “gold standard” for assessing the pattern and degree of cognitive deficit in mental illness. The principal advantage of longer batteries is that these may identify patterns of strengths and weaknesses across multiple functional domains. Seven such domains were identified as important to schizophrenia through a consensus process (MATRICS).7 It has also been suggested, however, that a single generalized cognitive deficit best characterizes schizophrenia, leading to questions about the added value of identifying more subtle patterns of cognitive strengths and weaknesses.8–10 It further remains unclear whether longer batteries of tests (eg, MATRICS battery, 65 min) adequately specify multiple, discrete cognitive domains. Moreover, administration of a comprehensive battery is time consuming, expensive, and generally unavailable in most practice settings. Other brief tools that have been developed, such as the Repeatable Battery for the Assessment of Neuropsychological Status (RBANS)11 and the Brief Assessment of Cognition in Schizophrenia (BACS),12 show large correlations with the global scores from a full neuropsychological battery and correlate with measures of functional outcome.11,13 Unfortunately, administration length of the BACS (35 min) and RBANS (25 min) is often longer than a typical medication management appointment. Other problems, such as clinicians’ lack of familiarity with the psychometric administration procedures and interpretation, limit the usefulness of the above tools for clinical work.
To address these needs, Velligan et al14 developed a 15-min battery of 3 tests called the Brief Cognitive Assessment (BCA). The tests in the BCA were selected by experts based on their experience with neuropsychological testing in schizophrenia and appreciation of the need for a short and easily administered and interpreted battery.
Our study had similar aims but used different methods. Our goal was to empirically derive a battery that could be administered in well less than 15 min. We a priori decided that the battery should have an administration time of 12 min or less. Therefore, in this article, we focused our derivation strategies on brevity. Specifically, we hypothesized that a few short tests, extracted from a larger neurocognitive battery, could account for a meaningful amount of the total variance of the global score from the comprehensive battery. We developed the battery by evaluating the psychometric relations of each test to the global scores derived from more comprehensive neuropsychological batteries and selecting those tests that account for the greatest amount of variance in the global score per minute of administration time.
This article is meant as a demonstration that empirical derivation strategies can be used to develop a very brief (10–12 min) battery that accounts for a large portion of the variance of the global cognitive score of longer batteries. Different derivation strategies would have undoubtedly selected different tests. The method described below thus does not necessarily lead to the “best” possible brief battery but rather illustrates a process for developing simple and abbreviated cognitive batteries for use with schizophrenia subjects.
Approval was obtained from the UCLA institutional review board to perform the analyses described below.
Candidate test variables were extracted from data in more comprehensive neuropsychological batteries used in 3 completed National Institute of Mental Health-sponsored studies: the first-episode schizophrenia (FES) study (N = 73)8; the clozapine, haloperidol, olanzapine, risperidone (CHOR) study (N = 56),15 and the Clinical Antipsychotic Trials of Intervention (CATIE) study (N = 1005).16 Sample sizes are lower than those in the original studies because only complete records (no missing data on any test variable) were used. Demographic and study sample details on the complete study populations have been published elsewhere.8,15,16 Table 1 describes the demographic characteristics of the subjects included in the current analyses. Demographic data from the CATIE data set are reproduced with permission from the publication by Keefe et al,16 as the data set provided for the current analyses deliberately excluded demographic information as an additional privacy measure. The neurocognitive tests included in the FES, CHOR, and CATIE batteries, and the overlap among the batteries are in table 2.
All analyses were carried out using SPSS version 15.0 software. We compared means (Student t test) and frequencies (Pearson chi-square test) between subjects with complete vs incomplete neuropsychological data in each battery to assess for possible differences in population among subjects included vs excluded in the current analyses.
We computed the corrected item-total correlation (CITC) for each test score relative to the composite score (the average of all standardized test scores) within each battery using the reliability procedure. The CITC is the correlation of each standardized test score in the battery with the standardized battery total score excluding the test score itself, thereby controlling for part-whole correlation. We conducted principal factor analyses and examined the loadings of each test score on the first principal factor to confirm the construct validity of the test variables derived from each reliability analysis (table 2). As there was negligible difference between the results of the reliability analyses and the factor analyses, the test scores were ranked from highest to lowest by CITC within each battery. To determine not only which tests provided the highest correlations with total battery scores but also which were the most efficient, we selected the 5 tests from each battery that had the highest CITC and computed an index of “variance per minute” (VPM) to assess the amount of variance in global scores contributed by each individual test score, given its typical administration time. We first computed the variance of the total battery explained by each test excluding itself ($R_t^2 = \mathrm{CITC}_t^2$) and then divided this by administration time ($\mathrm{VPM}_t = R_t^2/\mathrm{min}_t$). Administration times were derived from the original publication16 and/or the administration records from the studies.8,15 The mean administration time was used for tests with overlap among the batteries.
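The CITC and VPM computations described above can be sketched in a few lines of Python with NumPy. This is a minimal illustration, not the SPSS reliability procedure used in the study: the function name `citc_and_vpm`, the array layout, and the use of sample standard deviations for standardization are assumptions made here for the example.

```python
import numpy as np

def citc_and_vpm(scores, minutes):
    """Return (CITC, R^2, VPM) per test.

    scores  : array of shape (n_subjects, n_tests), complete records only
    minutes : administration time of each test, in minutes
    (Hypothetical helper; names and layout are illustrative assumptions.)
    """
    scores = np.asarray(scores, dtype=float)
    minutes = np.asarray(minutes, dtype=float)

    # Standardize every test so that all tests contribute equally.
    z = (scores - scores.mean(axis=0)) / scores.std(axis=0, ddof=1)

    n_tests = z.shape[1]
    citc = np.empty(n_tests)
    for t in range(n_tests):
        # Composite of all the OTHER tests: excluding test t removes
        # the part-whole inflation of the correlation.
        rest = np.delete(z, t, axis=1).mean(axis=1)
        citc[t] = np.corrcoef(z[:, t], rest)[0, 1]

    r2 = citc ** 2        # variance shared with the rest of the battery
    vpm = r2 / minutes    # variance explained per minute of testing
    return citc, r2, vpm
```

Ranking tests by the returned `vpm` array, rather than by `citc` alone, favors tests that are both informative and quick to administer, which is the efficiency criterion the text describes.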
Overlapping tests were included only once in the further analyses. We calculated the average CITC and VPM for each of the top tests over all 3 study samples. Tests with no overlap with other batteries were included using the CITC and VPM from the battery in which the tests were included. Tests overlapping in 2 or all batteries were averaged by taking the mean CITC and VPM from all batteries in which the tests were included. The top 3 tests by VPM were selected for the Brief Cognitive Assessment Tool for Schizophrenia (B-CATS). We chose 3 tests because 3 was the greatest number of tests we could include while remaining under the predetermined 12-min administration time limit. One of those tests was included in the FES and CHOR batteries; the other 2 tests overlapped in all 3 batteries. We calculated the Pearson product moment correlations between the B-CATS and the total test batteries from the FES and CHOR studies and the test batteries excluding the 3 tests comprising the B-CATS.
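The pooling, selection, and correlation steps in this paragraph can be illustrated as follows. The sketch assumes per-battery VPM values are held in nested dictionaries and that test scores are already standardized; the function names and data layout are hypothetical and are not taken from the study.

```python
import numpy as np

def average_across_batteries(per_battery):
    """Mean of a statistic (CITC or VPM) over the batteries containing each test.

    per_battery: {battery_name: {test_name: value}}
    A test present in only one battery keeps that battery's value.
    """
    pooled = {}
    for values in per_battery.values():
        for test, value in values.items():
            pooled.setdefault(test, []).append(value)
    return {test: sum(v) / len(v) for test, v in pooled.items()}

def select_by_vpm(mean_vpm, k=3):
    """Names of the k tests with the highest mean VPM."""
    return sorted(mean_vpm, key=mean_vpm.get, reverse=True)[:k]

def composite_correlations(z, names, chosen):
    """Pearson r of the brief composite with (a) the full-battery total
    and (b) the total excluding the chosen tests.

    z: standardized scores, shape (n_subjects, n_tests); names: column labels.
    """
    z = np.asarray(z, dtype=float)
    idx = [names.index(c) for c in chosen]
    rest = [i for i in range(len(names)) if i not in idx]
    brief = z[:, idx].mean(axis=1)
    r_total = np.corrcoef(brief, z.mean(axis=1))[0, 1]
    r_excluding = np.corrcoef(brief, z[:, rest].mean(axis=1))[0, 1]
    return r_total, r_excluding
```

Reporting both correlations brackets the part-whole problem: the first includes the brief tests in the criterion, and the second removes them entirely.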
The subjects included in the current analyses were those with complete initial neuropsychological data from the FES, CHOR, and CATIE studies. Analysis of demographic data suggests that in the FES and CHOR samples, subjects included in the current analyses differed from those excluded from the analyses on certain variables. In the FES sample, subjects with complete neuropsychological data had significantly fewer average years of education than those with incomplete data (13.08 vs 14.21 y, respectively, P < .005). Subjects did not differ on demographic variables of age, sex, or race. In the CHOR sample, subjects with complete neuropsychological data compared with those with incomplete data were, on average, significantly younger and had more years of education (mean age of 39 y with an average of 11 y of education vs mean age of 42 y with an average of 9.2 y of education, P < .05 for both age and education). Subjects did not differ on variables of sex and race. The CATIE data set used in our analyses deliberately excluded demographic data for the purposes of participant privacy, so the corresponding comparison of means could not be calculated. To address concerns that the differences in age and education between subjects with complete data and those with incomplete data may bias the results, we conducted a median split for age and education and correlated B-CATS with the total scores and the total scores excluding the B-CATS tests of FES and CHOR. Differences in correlation coefficients were small, with correlations between global scores and B-CATS ranging from .84 to .88 for education and .88 to .89 for age in the CHOR study and .82 to .86 for education and .83 to .89 for age in the FES study.
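The median-split check described above amounts to computing the same correlation separately in the lower and upper halves of a covariate such as age or education. The sketch below is illustrative only; the function name is hypothetical, and assigning values equal to the median to the lower group is an assumption, since the text does not say how ties were handled.

```python
import numpy as np

def median_split_correlations(covariate, brief, total):
    """Pearson r between brief and total scores within each half of a
    median split on `covariate` (for example, age or years of education).

    Assumption: subjects exactly at the median go to the lower group.
    """
    covariate = np.asarray(covariate, dtype=float)
    brief = np.asarray(brief, dtype=float)
    total = np.asarray(total, dtype=float)

    low = covariate <= np.median(covariate)
    high = ~low

    r_low = np.corrcoef(brief[low], total[low])[0, 1]
    r_high = np.corrcoef(brief[high], total[high])[0, 1]
    return r_low, r_high
```

Similar values of the two correlations suggest that the brief composite tracks the global score about equally well in both halves of the sample.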
The reliability analyses demonstrated good internal consistency, as evidenced by coefficient alphas ranging from .84 to .94. The principal factor analysis confirmed the ranking of tests obtained by the reliability analyses of each battery (table 2). The 5 test scores with highest CITC in each study were selected for further analysis.
CITC and VPM statistics for the top tests by CITC are summarized in table 3. We ranked by VPM the 5 tests with the highest CITC scores from each sample. The top 3 tests—the trail making test part B (TMT B) (time to completion), category fluency, and the Wechsler Adult Intelligence Scale digit symbol substitution (digit symbol)—were chosen for the B-CATS (see table 4).
Because the CATIE study did not include TMT B in its battery, the CATIE data set was not included in the correlation equations. The B-CATS correlated .86 with the global scores of the larger neuropsychological batteries in both the FES and the CHOR studies and correlated between .73 and .82 with the total scores excluding those 3 tests (table 5), thereby eliminating part-whole correlations. The 3 tests correlated minimally to moderately with each other, with inter-item correlations ranging from .27 to .48.
The average administration times of the 3 B-CATS tests are 4 min (TMT B), 3 min (category fluency—animals, fruits, vegetables), and 3.4 min (digit symbol).10,15,16 Altogether, the estimated administration time of the B-CATS is 10.5–11 min. A description of the 3 tests and their estimated administration time are included in table 4. FES and CHOR studies included only animal fluency under category fluency. The average administration time of animal fluency alone is 1 min.8,15
To assess a “B-CATS-like” battery with the CATIE data set, we chose the top 3 tests by VPM included in the CATIE battery and correlated their aggregate with the CATIE battery global score. A B-CATS of digit symbol, category fluency, and letter-number sequencing correlated .87 with the global score from the CATIE battery and correlated .79 with the global score excluding digit symbol, category fluency, and letter-number sequencing. This version of the B-CATS has an administration time of approximately 12.5 min.
We found that 3 tests—TMT B, category fluency, and digit symbol—correlated .86 with the larger FES and CHOR batteries and correlated between .73 and .82 with the total scores from the batteries excluding the B-CATS tests. The correlations of B-CATS with the larger battery total scores are within the range of the typical test-retest correlation coefficients for these measures. The tests themselves have small-to-moderate inter-item correlations ranging from .27 to .48 across both the FES and the CHOR batteries. Thus, the B-CATS should provide a valid estimate of general cognitive ability. The B-CATS has an expected administration time of 10–11 min in contrast to the full batteries of the FES, CHOR, and CATIE studies, which have administration times ranging from 90 to 210 min.
Our decision to examine the correlation of the B-CATS with the total score from the full neuropsychological battery and the battery excluding the B-CATS tests serves as both an undercorrection and an overcorrection of the problem of part-whole correlation. The actual correlation of the B-CATS with the full batteries is likely somewhere in between. An ongoing validity study (see below) will address this issue more fully by comparing the B-CATS with a separately administered neuropsychological battery.
The alternate B-CATS composed of category fluency, digit symbol, and letter-number sequencing calculated for comparison to the CATIE battery also correlated very highly with the global score, although the estimated administration time is 1.5–2.5 min longer. This suggests that other combinations of brief tests may also correlate highly with the total score of a larger battery. The tests we chose are not unique in capturing a large proportion of the variance of the total score, although they are 3 especially brief tests. But if other B-CATS were needed—for computer-based administration, for targeting of a specific domain, for emphasis on the highest correlation between global scores, with less regard to administration time—similar approaches could construct multiple combinations of tests for use as BCA batteries.
An approach combining expert input with empirical derivation could also be used. For instance, if the goal in constructing the B-CATS had been domain diversity, instead of brevity, only tests from different domains might have been considered for inclusion. The construction of the above battery also involved expert opinion. For instance, the choice of the VPM as a criterion for test inclusion was based on our opinion that it provided the best measure of efficiency per test. Our decision to include 3 tests, instead of 1 or 2, was based on our belief that 3 tests were “better.” This was because 3 tests correlated more highly with the global cognitive score than did 1 or 2 tests while remaining within the desired administration time. If we had gone strictly by VPM, we would have chosen only 1 test because VPM dropped as more tests were added.
Because our goal in this instance was to design an instrument for clinical use, the administration time of the B-CATS was a crucial consideration. Despite the fact that over 20 years of research have demonstrated pervasive and profound cognitive deficits in schizophrenia,17 clinicians remain unable to measure cognition in their patients. And while neuropsychological evaluation is available to some, the need to refer patients to neuropsychologists, and the cost of comprehensive neuropsychological testing, frequently results in a lack of any form of cognitive assessment. Those patients may therefore benefit from a more integrated brief assessment within their regular appointments. A typical outpatient psychiatric practice allots 15–20 min for medication management appointments. An administration time of even 15 min would likely leave clinicians without enough time to enquire about symptoms status and overall functioning, medication compliance, and side effects and monitor for the potential metabolic and movement side effects of antipsychotic agents. Our tool is estimated to take between 10 and 11 min. Furthermore, the administration time may be shortened by reducing category fluency to animal fluency alone (eliminating 2 min of testing time). Both the CHOR and the FES studies included only animal fluency under category fluency. The effect of including only animal fluency is being evaluated during our ongoing validity study. Furthermore, a web-based program (see below) for scoring and interpretation of the results will reduce evaluation time and increase usability of the tool.
The B-CATS can also provide researchers with an approximate global cognition score without requiring a full neuropsychological battery. While a brief battery such as the B-CATS is not a substitute for a comprehensive neuropsychological battery, many clinical trials that do not include cognition as a primary target may still gain value from an estimate of global cognition that is brief and easily administered. The B-CATS can provide an estimate of general cognitive ability in about 10 min, saving significant resources and reducing testing strain on subjects. Furthermore, several pharmacological agents with specific cognitive domain targets (for a review of such potential targets, see Tamminga18) are currently under development. Testing of these agents will require comprehensive assessment of the domain of interest. The B-CATS could provide a brief but sufficient estimate of general cognitive ability, freeing time and resources for a more extensive assessment of the domain or domains of interest.
The B-CATS is not designed to assess cognitive function at the domain level. Currently, there is debate in the field about the pattern of cognitive deficits in schizophrenia, in particular, whether performance on all cognitive domains predicts general cognitive function in aggregate or whether generalized cognitive ability hierarchically informs the performance on subdomains of cognition. While some research and expert opinion support a domain model of cognitive deficit in schizophrenia,7,19,20 recent factor and structural equation analyses from large studies with comprehensive neuropsychological batteries9,16,21–23 have demonstrated that a single factor of global cognitive function better explains the deficit pattern of schizophrenia (one of these analyses was conducted on the CATIE data set used in this study16). Specifically, these studies have shown that the best fit for the neurocognitive performance data of people with schizophrenia is either a 1-factor model, in which the factor of global cognitive ability explains the majority of the variance in performance on the specific tests included in the battery without significant input from cognitive subdomains,21,23 or a hierarchical model, in which the global factor influences performance on the subdomains, which in turn predict performance on the individual tests in the battery.9,16,22 The measurement of the global factor over more specific domain assessment has functional significance as well. In a 2000 meta-analysis, Green et al24 concluded that global cognition scores (vs individual tests or domains) correlate most highly with measures of functional outcome.
The B-CATS uses 3 existing neuropsychological tests, TMT B, digit symbol, and category fluency. All 3 tests require participation from multiple cognitive domains. For instance, digit symbol requires the contribution of motor and processing speed,25,26 visual scanning,27 and learning and memory.25,26 Category fluency utilizes, among other areas, verbal fluency and language skills,28,29 processing speed, and various memory processes, including verbal memory and semantic organization.30,31 TMT B is a test that uses a complex series of cognitive skills including set shifting,32 executive function and working memory,33 attention,34 motor and processing speed,32 and visuospatial scanning14,29,32 (not all involved domains may be assessable, however, if trails B is scored only for administration time and not for errors33). Despite the interaction of skills from multiple cognitive domains, currently, many researchers consider digit symbol and category fluency to be measures of processing speed,7,19,22 and generally in factor analyses, trails B loads on the processing speed domain.19 Processing speed tests (or rather tests assigned to the domain of processing speed) may be particularly well suited for capturing the cognitive deficits associated with schizophrenia. A recent meta-analysis35 suggests that when comparing the performance of patients with schizophrenia with comparison subjects, digit symbol substitution (and to a lesser degree category fluency) has a cumulative effect size of 1½ times the global across-domain effect size (Hedges g = −1.57 for digit symbol, −1.41 for category fluency, and −0.98 for global effect size). Furthermore, there is 73% nonoverlap in scores between patients with schizophrenia and healthy control subjects vs 55% nonoverlap for the global effect size. 
The effect size of digit symbol performance in relatives of people with schizophrenia was also larger than that of the global effect size, suggesting that digit symbol may be a marker of a potential cognitive endophenotype. Another study demonstrated that after controlling for digit symbol, no further significant differences in cognitive functioning existed between subjects with schizophrenia and control subjects.36 Finally, 2 recent studies demonstrated that processing speed deficits are the most highly correlated with community functioning, after controlling for general cognitive ability,37 or the most broadly and directly correlated with several measures of functional capacity and outcome.20 These recent data support the use of a processing speed–based measure for screening and assessment of general cognitive function. However, the domain homogeneity of the B-CATS is a limitation, and different derivations of B-CATS could emphasize domain diversity (although that would likely lengthen administration time).
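The nonoverlap percentages quoted above alongside the Hedges g values are consistent with Cohen's U1 statistic for two equal-variance normal distributions. That reading is an assumption made here for illustration; the text does not name the statistic it used. Under that assumption, the conversion from a standardized effect size to percent nonoverlap can be sketched as:

```python
import math

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def cohen_u1(d):
    """Cohen's U1: proportion of the combined area of two equal-variance
    normal distributions, separated by |d| standard deviations, that is
    not shared by both distributions.
    (Assumed to be the 'nonoverlap' statistic; not confirmed by the text.)
    """
    p = normal_cdf(abs(d) / 2.0)
    return (2.0 * p - 1.0) / p
```

With this formula, effect sizes of 1.6 and 1.0 give about 0.731 and 0.554, close to the 73% and 55% quoted in the text, while the unrounded values of 1.57 and 0.98 give roughly 0.72 and 0.55. The small gap suggests the quoted figures may come from a lookup table at rounded effect sizes, although the text does not say.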
The BCA, constructed by Velligan et al,14 is an almost identical group of tests, composed of verbal fluency (letters and categories), trails A and B, and Hopkins verbal learning test (HVLT). The BCA correlates .72 with a more comprehensive neuropsychological battery, although without controlling for part-whole correlation. However, the inclusion of both letters (FAS) and categories (animal), and the use of the HVLT, which is a series of 3 trials of 12 words, increases administration time to 15 min or more. The B-CATS is shorter and was derived empirically. On the other hand, the BCA evaluates verbal memory, a cognitive domain greatly affected by schizophrenia (although a recent factor analysis suggests that verbal memory deficits are not separable from the generalized cognitive deficit23). Finally, the BCA correlates in the moderate range with several measures of functional outcome.14 The similarity of the B-CATS to the BCA provides empirical support for the expert consensus that the included tests capture a high degree of variance in a comprehensive battery’s total score.
While our goal was not to derive the best possible brief battery, we believe that the B-CATS is a reasonable choice for cognitive assessment by clinicians, based on its strong correlation with the global cognitive scores from full neuropsychological batteries and its brevity and ease of administration. We plan to make it or some version of it available for clinicians (and researchers). More work is required, however, before the B-CATS is ready for clinical use. We are currently validating the B-CATS in a sample of 100 subjects with schizophrenia. We are comparing performance on the B-CATS to the global score on the MATRICS consortium consensus battery (MCCB) at 2 time points (to assess test-retest reliability and practice effects). We are also correlating the scores of the B-CATS and the MCCB to performance on a measure of functional capacity, the University of California at San Diego performance-based skills assessment. The administrators of the B-CATS in this study are clinicians and research assistants with limited to no training in psychometrics to model the real-world goal of administration of the B-CATS by clinicians without psychometric training.
We are also constructing a website to simplify administration and scoring of the B-CATS for clinicians. Ideally, the website will provide some structure to help clinicians without psychometric experience administer and score the tests correctly. Despite the great need for a clinical tool to assess cognition in schizophrenia, there is concern that clinicians without formal training in psychometrics may administer the battery in such a way that it loses interpretability. However, a benefit of the B-CATS is that the test instructions are quite simple. By providing users with access to the B-CATS website, we hope to improve the reliability of administration by clinicians. The website will allow clinicians to download and print test instructions and alternate versions of the tests, and the site will provide a template to enter age, gender, education, and scores to obtain a normed B-CATS score. Clinicians will also be able to use the website to ask questions about test administration and scoring and watch a brief video demonstration of the correct administration of the B-CATS.
In conclusion, the B-CATS is a brief measure composed of existing cognitive tests that provides a global cognition score. It is highly correlated with the global scores from comprehensive neuropsychological batteries that take between 6 and 20 times as long to administer. The B-CATS has been developed specifically for use by clinicians, and the 3 tests have straightforward and easy instructions for administration and scoring. Researchers may also find it a useful tool for obtaining a fast estimate of global cognitive ability that does not require administration by a psychometrist.
National Alliance for Research on Schizophrenia and Depression Young Investigator grant (20062767 to I.M.H.).