Postoperative cognitive dysfunction (POCD) is a decline in cognitive function from preoperative levels, which has been frequently described after cardiac surgery. The purpose of this study was to examine the variability in measurement and definitions for POCD.
Electronic medical literature databases (EMBASE, MEDLINE, Psychinfo, and Cumulative Index of Nursing and Allied Health Literature) were searched for the intersection of the search terms: “thoracic surgery” and “cognition, dementia, and neuropsychological test”. Abstracts were reviewed independently by 2 reviewers. English articles with more than 50 participants published since 1995 that performed preoperative and postoperative psychometric testing in patients undergoing cardiac surgery were reviewed in their entirety. Data relevant to the measurement and definition of POCD were abstracted and compared to the recommendations of a 1995 Consensus Statement on measurement of POCD.
Sixty-two studies of POCD in patients undergoing cardiac surgery were identified. The full set of recommended neuropsychological tests was used in fewer than half of these studies. Cognitive domains measured most frequently were attention (n=56; 93%) and memory (n=57; 95%); motor skills were measured less frequently (n=36; 60%). Four definitions of POCD emerged: percent decline (n=15), standard deviation decline (n=14), factor analysis (n=13), and analysis of performance on individual tests (n=12). Because of variability in its measurement, the prevalence of POCD varied by over 10-fold across studies.
There is marked variability in the measurement and definition of POCD. This heterogeneity may impede progress by reducing the ability to compare studies about the causes and treatment of POCD.
Advances in surgical techniques, perfusion systems, and perioperative management have reduced the mortality associated with cardiac surgery1. However, postoperative cognitive dysfunction (POCD) remains a common outcome with potential to adversely impact quality of life2,3. POCD is a decline in performance on neuropsychological tests relative to preoperative levels. To adequately capture cognitive performance in important domains, several neuropsychological tests are needed4. Because multiple tests are used to assess a particular domain, the scoring of these tests varies, and different tests are highly correlated, the methodology used to analyze the tests and arrive at a determination of impairment can have a large impact on the reported prevalence of POCD.5 One problem with variable criteria for POCD is that the results of studies may not be comparable.
Challenges with comparability of POCD measures have been recognized for a long time6. To address these challenges, the Statement of Consensus on Assessment of Neurobehavioral Outcomes after Cardiac Surgery (Consensus Statement), published in 1995, recommended the core battery, timing, and additional comorbid conditions for assessment of POCD.7 Since the publication of this statement, there have been many studies that have investigated POCD as a neuropsychological end-point after cardiac surgery, but whether they have followed the consensus statement is unknown. The purpose of this manuscript is to perform a comprehensive literature review to determine: a) if a standardized neuropsychological battery for POCD has emerged in accordance with Consensus Statement recommendations, b) if the comorbidities, assessment timing, and learning effect recommendations of the Consensus Statement are being utilized, and c) if standard analytic criteria for POCD have emerged.
The study was conducted with approval of the IRB. We searched the following databases: EMBASE, MEDLINE, Psychinfo, and Cumulative Index of Nursing and Allied Health Literature. The search covered studies published between June 1995 and May 2009 and was limited to English language and human subjects. The cardiac surgery search term was created by the combination of the following medical subject headings (MeSH): “cardiac surgery, coronary artery bypass graft, heart surgery, OR thoracic surgery” and the keyword searches: “CABG, valve replacement, OR valve repair”. The cognitive term included the MeSH terms “cognition, dementia, neuropsychology, OR neuropsychological tests” combined with the keywords “post operative cognitive dysfunction, dementia, cognitive impairment, OR neuropsychological tests.” The abstracts identified from the intersection of the cardiac surgery and cognitive terms were independently reviewed by two reviewers and relevant studies were identified for full text review. The intersection was limited to the dates of interest, human studies, English language, and adults (≥18 years). Criteria for full text review included: children were not the study population, patients underwent cardiac surgery, and cognitive function was measured. For abstracts on which the reviewers disagreed, the manuscript was reviewed to determine if the study met inclusion criteria. Additionally, the reference lists of selected articles were reviewed for additional articles of interest.
Manuscripts selected for inclusion were prospective studies of cardiac surgery patients that assessed both preoperative and postoperative cognitive function. Studies were excluded that used a cognitive screening test, such as the Mini Mental State Examination8, as the only measure of cognitive function. Studies of cardiac procedures such as angiography, angioplasty, or valvuloplasty were excluded. We excluded studies with <50 patients and intervention studies with <25 patients per arm, because these studies would lack the size to accurately define POCD. Studies that assessed POCD solely in the first three postoperative days were excluded, because the distinction of POCD from postoperative delirium would be clouded.9 Studies with multiple publications from the same cohort were reviewed, but were reported once.
From the selected studies, we abstracted study characteristics, patient demographics and psychometric assessments using the framework of the Consensus Statement. The number of patients enrolled and the number who completed the study at the last follow-up were abstracted. Age, prior stroke, and operative procedure were recorded. Because there is an impact of learning on repeat neuropsychological test administration, we recorded if learning was accounted for in the identification of POCD. In addition to these core features, we recorded if studies included assessment of anxiety and depression, because these conditions can impact cognitive testing. The Consensus Statement named four neuropsychological tests as a core battery to cover three cognitive domains (Figure 1) including the Rey Auditory Verbal Learning Test(11) for verbal memory, Trailmaking test A and B10 for attention, and grooved pegboard11 for motor skills. For each study, we identified the number of the core tests utilized to assess POCD, the cognitive domains covered by the tests administered, and the postoperative timing of the follow-up assessments.12 We recorded the analytic criteria for POCD used and the prevalence of POCD reported.
Figure 2 summarizes the search strategy and results. Abstracts identified with electronic databases (n=1311) were reviewed for potential fit of the selection criteria and 190 articles were identified for full review. The full review included a manual search of references, which identified an additional 31 articles. Post-hoc inclusion of keywords from reference-list identified articles did not identify additional articles. Studies were excluded if they were secondary analyses (n=54), had a sample size <50 (n=40), used only screening cognitive instruments (n=29), were review articles (n=16), or were otherwise ineligible (case reports, letters, dissertations; n=20). Overall, sixty-two studies of postoperative cognitive dysfunction after cardiac surgery met inclusion criteria.
Studies are presented according to study design: Table 1 includes prospective cohort studies without controls (n=14); Table 2 includes prospective studies with controls (n=8); and Table 3 includes intervention studies (n=40) of which 38 were randomized.
Table 4 describes the data collection by type of study. While most studies included some of the neuropsychological tests named in the Consensus Statement, less than half included all four tests. Coverage of the memory (n=57; 95%) and attention (n=56; 93%) domains was very good, while motor skill (n=36; 60%) was less commonly measured. In prospective cohort studies with and without controls, postoperative testing was likely to occur after the recommended three month time period. However, slightly less than half of randomized controlled trials measured POCD after 3 months. Roughly half of studies reported preoperative assessments for depression and anxiety. The impact of learning was accounted for in very few studies and neurological exam was documented in less than half. One prospective study with controls13,14 and one intervention study15,16 adhered to the neuropsychological testing battery, test timing, and comorbidity measurement of the Consensus Statement.
Table 5 lists the analytic criteria used to define POCD and the number of studies in which these criteria appear. Many studies analyzed POCD using more than one analysis methodology. There are four major analytic criteria used for defining POCD. In the ‘percentage decline’ definition, a patient must decline a percentage (usually 20%) from baseline in a specified number of tests (usually 2). The ‘SD decline’ criterion requires a reference population (baseline performance, normative data, or control) to define the standard deviation for the employed battery and then creates the dichotomous outcome based on a decline greater than the standard deviation. The reference population used to define the standard deviation is not consistent among studies, nor is the magnitude of the standard deviation decline (i.e. 1 SD, 1.5 SD, 2 SD). The ‘factor analysis’ methodology uses raw neuropsychological data to group highly correlated tests into several (3–4) latent cognitive domains, which are continuous variables. The latent cognitive domains can be dichotomized (usually at 1 SD) to define decline. ‘Individual test analysis’ assesses performance on individual neuropsychological measures and several continuous outcome variables. Individual test analysis generally does not create a dichotomous definition of decline. Finally, a growing number of studies (n=10) are reporting continuous measures (Z-score) of neuropsychological performance, both as individual tests and as a composite. The composite score is generally a sum or mean of individual test Z-score change from baseline.
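To make the percentage decline, SD decline, and composite Z-score definitions concrete, the following is a minimal Python sketch and not the method of any particular reviewed study. The function names, the test labels, and the scores are hypothetical. It assumes that higher scores mean better performance (timed tests such as Trailmaking would need to be sign-reversed first), and it assumes that the SD decline criterion, like the percentage decline criterion, must be met on a minimum number of tests.

```python
from statistics import mean

def percent_decline(baseline, followup, pct=0.20, min_tests=2):
    """Decline of at least `pct` of baseline on at least `min_tests` tests."""
    declined = sum(
        1 for test, pre in baseline.items()
        if pre > 0 and (pre - followup[test]) / pre >= pct
    )
    return declined >= min_tests

def sd_decline(baseline, followup, reference_sd, n_sd=1.0, min_tests=2):
    """Decline greater than `n_sd` reference SDs on at least `min_tests` tests.

    `min_tests` is an assumption; studies differ on this detail.
    """
    declined = sum(
        1 for test, pre in baseline.items()
        if (pre - followup[test]) > n_sd * reference_sd[test]
    )
    return declined >= min_tests

def composite_z_change(baseline, followup, reference_sd):
    """Mean of per-test Z-score change from baseline (negative = decline)."""
    return mean(
        (followup[test] - pre) / reference_sd[test]
        for test, pre in baseline.items()
    )

# Hypothetical scores for one patient; higher = better on every test.
baseline = {"verbal_memory": 50, "attention": 40, "motor": 60}
followup = {"verbal_memory": 38, "attention": 30, "motor": 59}
reference_sd = {"verbal_memory": 10, "attention": 8, "motor": 12}

by_percent = percent_decline(baseline, followup)
by_1sd = sd_decline(baseline, followup, reference_sd, n_sd=1.0)
by_1_5sd = sd_decline(baseline, followup, reference_sd, n_sd=1.5)
composite = composite_z_change(baseline, followup, reference_sd)
```

With these illustrative numbers the same patient is classified as having POCD under the 20% rule and the 1 SD rule but not under the 1.5 SD rule, which shows how the choice of criterion alone can change reported prevalence.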
Table 6 reports the range of the prevalence of POCD by criteria and follow-up interval. There is a high degree of variability of the reported prevalence. In some cases, the ratio in prevalence from minimum to maximum is greater than ten-fold. Studies often showed an increase in POCD prevalence at longer follow-up intervals, but most of these studies did not account for the impact of age and atherosclerosis on cognitive function over these time periods.
This review identified 62 studies that assessed POCD after cardiac surgery and examined the adherence to the recommendations of the Consensus Statement that was published in 1995.7 We found significant variability in the neuropsychological tests and the timing of the tests used to measure POCD. Most batteries covered the domains of attention and verbal memory, while motor function was measured less frequently. Half of studies assessed anxiety or depression and a few accounted for the learning effect. Consequently, standard analytic criteria for POCD did not emerge, indicating that the Consensus Statement guidelines are not widely accepted or applied. The resultant heterogeneity in how POCD is measured and defined may limit the ability to compare POCD outcomes across studies and possibly impede progress in the field.
Studies of cardiac surgery have inherent variability because of patient factors (age, education, comorbidity), cardiac surgery factors (hypothermia, cardiopulmonary bypass, cross clamp, bleeding), physiologic factors (inflammation, microembolization, blood brain barrier function), intraoperative factors (anesthesia, cerebral oxygenation, hypotension), perioperative factors (medication, sleep, complications), and postoperative factors (rehabilitation, depression, social supports). Identifying POCD in patients becomes more difficult when variable measurement of cognitive function with different neuropsychological tests and multiple analytic criteria are utilized. Thus, confronted with two POCD studies with different results, it is difficult to know whether the differences are substantive or simply related to how POCD is measured and defined. Development of standardized criteria for neuropsychiatric conditions such as delirium17, Alzheimer’s disease18, and depression19 has allowed clinical and basic science research in these conditions to progress.
Ultimately, using standardized criteria creates a dichotomous definition of POCD. In this review, we found that recent studies analyze and report both a dichotomous definition and a continuous/summary measure of cognitive function. While calculation of a dichotomous definition has clinical applicability, it reduces statistical power in the study. Additionally, the mechanism of how multiple neuropsychological tests are combined into a single measure of cognitive function remains the subject of a debate because of the cognitive domain overlap of neuropsychological tests, the method of combination (e.g. mean/sum of Z-scores, confirmatory factor analysis, etc), and the impact of learning. Thus, it may be timely to utilize the wealth of evidence from recent studies to revisit measurement methods and definitions for POCD.
Importantly, the data from the studies identified in this review can play a key role in the development of a standardized battery and analytic criteria for POCD, which can address the challenges associated with POCD in several ways. First, each neuropsychological test measures more than one cognitive domain (e.g. performance on Trailmaking requires attention as well as working memory and motor skills) and thus, the tests are highly correlated. As a result, impairment in one cognitive domain may have effects on tests that predominantly measure other cognitive domains (e.g. impaired psychomotor skill will affect performance on Trailmaking, independent of attention or working memory). Using previous studies, a standardized cognitive battery would define the degree of contribution of a neuropsychological test to each cognitive domain and ensure adequate coverage of all appropriate cognitive domains. Second, the information about floor effects (i.e. poor initial performance which cannot decline)20 and ceiling effects (i.e. excellent initial performance which cannot improve) can be obtained from the current literature and used to optimize selection of neuropsychological tests to detect clinically significant change.21 Third, learning effects can be measured and factored into a standardized neuropsychological battery and analytic criteria for POCD.15,22,23 The learning effect occurs because repeated administration of tests increases the knowledge of the test structure and thus, performance tends to improve with repeated administration. Fourth, a standardized battery would help define the test-retest reliability of the neuropsychological tests. Reliable neuropsychological tests are important to reduce regression to the mean, where performance at the extreme (high or low) will tend to move toward the mean on repeat testing.24 Finally, using the current literature, the contribution of individual variability vs. true change would likely be better characterized.20,21 For example, neuropsychological performance can be affected by factors not related to cognitive function (e.g. sleep the night prior, frustration of commute to testing center, or fatigue towards the end of testing). The current literature could be used to establish normative values for defining significant change. Ultimately, this change definition would need to be validated against a change in social or occupational function to demonstrate that it was clinically significant.
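One way the learning effect could be factored into an analytic criterion is sketched below in Python. This is an illustrative assumption rather than a method taken from the reviewed studies: it presumes that a comparison group that did not undergo surgery was tested at the same intervals, and it expresses a patient's change relative to the mean and standard deviation of the change seen in that group. The function name and all numbers are hypothetical.

```python
from statistics import mean, stdev

def practice_adjusted_z(patient_change, control_changes):
    """Z-score of a patient's change after removing the expected learning effect.

    patient_change: follow-up score minus baseline score for the patient.
    control_changes: the same difference for each member of a non-surgical
        comparison group (assumed to capture learning and retest variability).
    """
    learning_effect = mean(control_changes)   # average gain from repeat testing
    retest_spread = stdev(control_changes)    # individual variability in change
    return (patient_change - learning_effect) / retest_spread

# Hypothetical data: controls improve by 3 points on average at retest.
control_changes = [2, 4, 3, 5, 1]
adjusted = practice_adjusted_z(-1, control_changes)
```

In this example a raw decline of only 1 point corresponds to an adjusted change of roughly 2.5 standard deviations below what repeat testing alone would predict, because the patient failed to show the improvement seen in the comparison group.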
The development and validation of a standardized neuropsychological battery and analytic criteria could help advance POCD to the level of a clinical disorder by improving efficiency of measurement, identifying patients at high risk, and ensuring clinical meaning of the outcome. If POCD were more easily operationalized, smaller physician groups would be empowered to measure POCD to improve operative technique, anesthesia protocols, and perioperative care without requiring external funding to conduct a research study (reimbursement for cognitive testing may be necessary). A standardized battery and criteria would also be a boon to research in this area. For example, when standardized criteria for delirium were developed17 the number of research studies published on delirium increased by over 100% in the subsequent 10 years compared to the 10 years prior.
In conclusion, using the recommendations of the 1995 Statement of Consensus on Assessment of Neurobehavioral Outcomes after Cardiac Surgery as a framework, the present systematic literature review identified 62 unique studies of POCD and analyzed the adherence to these recommendations. While the cognitive domains of attention and memory are included in nearly all studies, there is significant variability in the coverage of other cognitive domains and in the individual neuropsychological tests used to measure POCD. Moreover, no standard analytic criteria for POCD have emerged. This heterogeneity limits the ability to compare POCD amongst studies. A unified battery and analytic criteria would improve comparability, address measurement challenges such as learning, floor, and ceiling effects, and ultimately, advance science in this field by allowing clinicians and investigators to develop a better understanding of the causes of POCD and thereby to develop strategies for its prevention or treatment.
Financial Disclosure: Dr Rudolph is supported by a VA Rehabilitation Research and Development Career Development Award. Additional support for this award was provided by NIH grants (AG026781, AG029861, AG027549, AG030618, AG028189, AG008812).
Conflict of Interest: The authors have no financial conflict of interest to declare.