Current criteria for mild cognitive impairment (MCI) require “essentially intact” performance of activities of daily living (ADLs), which has proven difficult to operationalize. We sought to determine how well the Functional Activities Questionnaire (FAQ), a standardized assessment of instrumental ADLs, delineates the clinical distinction between MCI and very mild Alzheimer’s disease (AD). We identified 1801 subjects in the National Alzheimer’s Coordinating Center Uniform Data Set with MCI (n=1108) or very mild AD (n=693) assessed with the FAQ and randomized them to development and test sets. Receiver operating characteristic (ROC) analysis of the development set identified optimal cut-points that maximized the sensitivity and specificity of FAQ measures for differentiating AD from MCI; these cut-points were then validated with the test set. ROC analysis of total FAQ scores in the development set produced an area under the curve of 0.903 and an optimal cut-point of 5/6, which yielded 80.3% sensitivity, 87.0% specificity, and 84.7% classification accuracy in the test set. Bill paying, tracking current events, and transportation (p’s<0.005) were the FAQ items of greatest diagnostic utility. These data suggest that the FAQ exhibits adequate sensitivity and specificity when used as a standardized assessment of instrumental ADLs in the diagnosis of AD versus MCI.
Mild cognitive impairment (MCI) frequently represents an intermediate stage between normal aging and dementia1. Subjects meeting criteria for MCI are at elevated risk of subsequent progression to dementia2. Current criteria establish general guidelines for the diagnosis of MCI: 1) subjective cognitive complaint, 2) essentially intact activities of daily living (ADLs), 3) objective cognitive impairment, and 4) not demented1.
One of the defining features of MCI that distinguishes it from mild dementia is the requirement for “essentially intact” ADLs. This distinction is particularly important for the identification of multiple-domain amnestic MCI subjects, who are at the highest risk for subsequent progression to AD3,4 and would otherwise fulfill DSM-IV criteria for dementia5. However, exactly what level of function constitutes “essentially intact” ADLs has yet to be consistently defined, largely because both baseline ADL responsibilities and the extent of ADL decline that constitutes significant disability vary widely among individuals. Therefore, the assessment of ADLs has most frequently been operationalized through the use of clinician judgment1.
Subjects fulfilling criteria for MCI demonstrate small but significant declines in the performance of instrumental ADLs (IADLs) relative to normal controls6–14. IADLs are clearly better preserved in MCI than in mild AD9,11,15,16, but the precise threshold of functional decline that separates MCI from dementia remains uncertain. An empiric approach to this issue is to quantify clinically determined differences in functional performance between these groups using a formal assessment of IADLs. Standardization of the extent of IADL decline allowable for MCI would serve to increase the reliability of this diagnostic classification.
The Functional Activities Questionnaire (FAQ)17 is a commonly used IADL scale that effectively discriminates between normal control and demented subjects, with sensitivity ranging from 85 to 98% and specificity ranging from 71 to 91%17–22. Preliminary results suggest that the FAQ may also have utility for distinguishing between MCI and mild AD23.
The aim of the present study was to confirm and extend these findings using the large subject population included in the Uniform Data Set (UDS) compiled by the National Alzheimer’s Coordinating Center (NACC). The UDS contains standardized data from subjects evaluated by the Alzheimer’s Disease Centers (ADCs) supported by the National Institute on Aging (NIA)24,25. Functional impairment can be globally staged in the UDS using the Clinical Dementia Rating Scale26, but IADLs are addressed in greater detail with the FAQ. FAQ data for subjects meeting criteria for MCI or AD were extracted from the UDS. We analyzed these data to determine the utility of the FAQ for distinguishing between MCI and very mild AD and to identify the cut-points for global FAQ indices and individual FAQ item scores that most closely correspond with clinical diagnoses.
The NACC UDS contains data from 31 ADCs with current or prior funding from the NIA. We identified 1108 MCI and 693 AD subjects who were ≥ 50 years old, had Mini-Mental State Examination (MMSE) scores ≥ 24, had been assessed with the FAQ, and whose data had been entered into the UDS by May 29, 2007. MCI subjects fulfilled Petersen criteria (subjective cognitive complaint, objective cognitive impairment, essentially normal functional activities, and not demented)1 and were subdivided into single-domain amnestic (48.0%), single-domain non-amnestic (14.6%), multiple-domain amnestic (30.7%), and multiple-domain non-amnestic (6.7%) groups based upon the presence or absence of memory and other cognitive impairments as determined by clinical judgment and/or neuropsychological testing. AD subjects fulfilled National Institute of Neurological and Communicative Disorders and Stroke-Alzheimer’s Disease and Related Disorders Association (NINCDS-ADRDA) criteria27 for possible (15%) or probable (85%) AD. The majority of MCI and AD subjects in the NACC database who fulfilled the inclusion criteria for age and MMSE scores were assessed with the FAQ (MCI: 96.3%; AD: 98.3%). Demographic characteristics, including age, gender, race, education, and MMSE scores, were similar between NACC participants with and without FAQ assessments (all p’s>0.05). Written consent, approved by the Institutional Review Board of each ADC, was obtained from each of the subjects or their designated surrogate.
IADL performance was quantified using the FAQ17. This instrument was administered to an informant, who rated each subject’s performance over the preceding 4 weeks on 10 separate categories of IADLs: 1) writing checks, paying bills, keeping financial records; 2) assembling tax or business records; 3) shopping alone; 4) playing a game of skill; 5) making coffee or tea; 6) preparing a balanced meal; 7) keeping track of current events; 8) attending to and understanding a television program, book, or magazine; 9) remembering appointments, family occasions, medications; and 10) traveling out of the neighborhood. Performance in each category was rated as follows: 0 = normal; 1 = has difficulty, but does by self; 2 = requires assistance; 3 = dependent. Activities that could not be rated, either because the subject never performed them prior to developing cognitive difficulties or because the informant had insufficient information to provide a valid response, were not scored. Overall FAQ performance was evaluated using two separate methods: total FAQ score, which included only subjects who had valid scores on all items (66% of participants), and average score across FAQ items with valid responses (mean FAQ item score), which included all subjects. These analyses allowed for the comparison of the relative utility of complete versus incomplete FAQ data for distinguishing between MCI and AD. Although the FAQ was administered as part of the UDS assessments, the diagnostic criteria for AD and MCI do not require its use, nor do they recommend specific cut-points. In order to determine whether FAQ scores significantly influenced clinical diagnoses, clinical core directors of the 29 active ADCs were surveyed regarding the specific role of FAQ scores in diagnosis at their center (i.e. none, supportive, or specific cut-point).
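The two global indices described above can be sketched in code. This is a minimal illustration with hypothetical ratings (the function name and example data are ours, not part of the UDS); `None` marks an item the informant could not validly rate:

```python
def faq_scores(items):
    """Compute the two global FAQ indices from 10 item ratings (0-3).

    `items` is a list of 10 ratings; None marks an item that could not
    be validly rated. The total FAQ score is defined only when all 10
    items are valid; the mean FAQ item score averages whatever items
    have valid responses.
    """
    valid = [r for r in items if r is not None]
    total = sum(valid) if len(valid) == 10 else None
    mean_item = sum(valid) / len(valid) if valid else None
    return total, mean_item

# Hypothetical examples: complete responses vs. one unratable item
complete = [1, 0, 0, 1, 0, 0, 1, 0, 2, 1]       # total = 6, mean = 0.6
incomplete = [1, 0, None, 1, 0, 0, 1, 0, 2, 1]  # total undefined, mean = 6/9
```

This mirrors why the two indices trade off against each other: the total score discards any subject with a missing item, while the mean item score retains everyone at the cost of averaging over differing item sets.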
Statistical analyses were performed using SPSS 15.0 for Windows (SPSS Inc., Chicago, IL). Demographic variables were compared between the MCI and AD groups using independent-samples t-tests (age, education, MMSE scores) and Pearson’s chi-square tests (gender, race). The distributions of total FAQ scores [F(1183)=229.37, p<0.001] and mean FAQ item scores [F(1799)=314.37, p<0.001] violated assumptions of homogeneity of variance, as scores in the MCI group were skewed towards 0, while scores in the AD group were more evenly distributed. Therefore, total FAQ scores, mean FAQ item scores, and FAQ scores for individual items were compared between groups using Mann-Whitney U tests with Bonferroni correction for multiple comparisons where appropriate.
We used receiver operating characteristic (ROC) analysis to describe the differences between the diagnostic groups. For these analyses, subjects were stratified by diagnosis and randomized into development (N=901) and test (N=900) sets. The development set was used to determine optimal cut-points on FAQ measures for differentiating between AD and MCI and the more difficult distinction between probable AD and multiple-domain amnestic MCI. These cut-points were then validated using the test set. For subjects with complete FAQ data, the diagnostic value of individual FAQ items for distinguishing between MCI and AD was evaluated using stepwise logistic regression analysis that included age, race, MMSE scores, and all remaining FAQ items as covariates.
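The cut-point search underlying an ROC analysis can be sketched as follows. This is a minimal stand-in for the SPSS ROC procedure actually used, run on hypothetical score distributions (the data below are invented for illustration and do not reproduce the study's results); it scans candidate thresholds and keeps the one maximizing sensitivity + specificity (Youden's J):

```python
def best_cutpoint(mci_scores, ad_scores):
    """Return (J, threshold, sensitivity, specificity) for the threshold
    maximizing Youden's J, treating AD as the positive class and
    'score >= threshold' as a positive (AD) call."""
    best = None
    for t in sorted(set(mci_scores + ad_scores)):
        sens = sum(s >= t for s in ad_scores) / len(ad_scores)
        spec = sum(s < t for s in mci_scores) / len(mci_scores)
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, t, sens, spec)
    return best

# Hypothetical total FAQ scores for two diagnostic groups
mci = [0, 1, 2, 3, 4, 5, 5, 5, 7]
ad = [5, 6, 7, 8, 8, 9, 11]
j, threshold, sens, spec = best_cutpoint(mci, ad)  # threshold = 6 here
```

For these invented data the scan lands on a threshold of 6, i.e. a 5/6 cut-point: scores below 6 are called MCI, scores of 6 or above are called AD.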
Active ADCs that responded to the survey were divided into two groups based on whether FAQ scores were used in their diagnostic processes. Sensitivity, specificity, and classification accuracy for the optimum cut-points on total FAQ and mean FAQ item scores were separately calculated for each active center and compared between groups using independent samples t-tests.
Comparisons of demographic variables between the MCI and AD groups (Table 1) indicated that AD subjects were significantly more likely to be non-Hispanic Whites (p<0.001), older (p=0.015), and have lower MMSE scores (p<0.001) than MCI subjects. Informants for AD subjects provided information on fewer FAQ items (p<0.001) and were less likely to provide data for all FAQ items (p<0.001) than informants for MCI subjects.
As expected, AD subjects had significantly higher total FAQ scores [Figure 1a; Z(1185)=−22.99, p<0.001] and mean FAQ item scores [Figure 1b; Z(1801)=−26.27, p<0.001] than MCI subjects. Significantly higher scores in the AD group were also seen for each individual FAQ item (all p’s<0.001; Bonferroni-corrected critical p=0.005), both in the subgroup with complete FAQ data and in the overall cohort (Figures 1c and 1d). Higher scores on each of these indices indicate greater functional impairment.
Subjects with complete versus incomplete FAQ responses demonstrated significant differences in demographic and FAQ variables. MCI subjects with complete FAQ data were more likely to be male [56.6% vs. 43.8%, χ2(1,1108)=15.09, p<0.001], have higher MMSE scores [27.91 vs. 27.28; t(1106)=5.36, p<0.001], and have lower mean FAQ item scores [0.24 vs. 0.38; t(1106)=4.56, p<0.001] than those with incomplete data. There was no difference in the distribution of MCI subtypes between subjects with complete versus incomplete responses [χ2(3,1108)=5.06, p=0.17]. AD subjects with complete FAQ data were more likely to be male [63.3% vs. 38.0%; χ2(1,693)=43.48, p<0.001] and have higher mean FAQ item scores [1.32 vs. 1.00, Z(691)=−5.78, p<0.001], but marginally less likely to meet NINCDS-ADRDA criteria for probable AD [82.8% vs. 88.0%; χ2(1,693)=3.61, p=0.057] than those with incomplete data.
Demographic variables, FAQ scores, proportion of subjects with complete FAQ data, and distributions of MCI subtype and AD diagnostic categories did not differ between the development and test sets (data not shown, all p’s >0.05). Separate ROC curves were generated from the development set to determine the optimal cut-points using total FAQ scores or mean FAQ item scores for distinguishing between AD and MCI (Figure 2).
ROC analysis of total FAQ scores produced an area under the curve (AUC) of 0.903 [Figure 2a; 95% confidence interval (CI): 0.876–0.930, p<0.001] and d’ of 1.80. The optimal cut-point was between 5 and 6, which yielded 82.9% sensitivity, 83.9% specificity, and 83.6% classification accuracy when applied to the development set, and 80.3% sensitivity, 87.0% specificity, and 84.7% classification accuracy when applied to the test set. Slightly poorer discrimination was seen between probable AD and multiple-domain amnestic MCI when this cut-off was used with the test set: 80.3% sensitivity, 81.3% specificity, and 80.7% classification accuracy. These findings indicate that total FAQ scores < 6 were most consistent with a clinical diagnosis of MCI, and total FAQ scores ≥ 6 were most consistent with a clinical diagnosis of AD.
Using the test set, logistic regression analysis of the total FAQ score cut-point versus clinical diagnosis was conducted. After adjusting for age, race, and MMSE score, this analysis yielded a Nagelkerke R2 value of 0.579. A total FAQ score ≥ 6 was independently associated with a diagnosis of AD vs. MCI [β=2.97, S.E.=0.24, Wald χ2=150.06, odds ratio (OR)=19.51, CI=12.13–31.38, p<0.001] and with a diagnosis of probable AD vs. multiple-domain amnestic MCI (β=2.65, S.E.=0.34, Wald χ2=60.12, OR=14.15, CI=7.24–27.64, p<0.001).
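As a consistency check on these figures, each reported odds ratio is simply the exponentiated logistic regression coefficient, which can be verified in a couple of lines:

```python
import math

# Odds ratio = exp(beta) in logistic regression; the reported
# coefficients reproduce the reported ORs to rounding error.
beta_ad_vs_mci = 2.97  # total FAQ >= 6, AD vs. MCI
beta_prob_ad = 2.65    # probable AD vs. multiple-domain amnestic MCI

or_ad = math.exp(beta_ad_vs_mci)  # ~19.5, matching the reported OR of 19.51
or_prob = math.exp(beta_prob_ad)  # ~14.2, matching the reported OR of 14.15
```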
ROC analysis of mean FAQ item scores produced an AUC of 0.864 (Figure 2b; CI: 0.840–0.889, p<0.001) and d’ of 1.49. The optimal cut-point was between 0.436 and 0.437, which yielded 82.4% sensitivity, 76.5% specificity, and 78.8% classification accuracy for distinguishing AD from MCI in the development set and 81.8% sensitivity, 77.4% specificity, and 79.1% classification accuracy in the test set. Similar results were obtained when this cut-point was used to distinguish between probable AD and multiple-domain amnestic MCI in the test set: 82.4% sensitivity, 70.9% specificity, and 78.3% classification accuracy. The use of mean FAQ item score allowed for the inclusion of a greater number of subjects than the use of the total FAQ score but resulted in poorer discrimination between groups.
ROC data for the diagnostic value of individual FAQ items were separately derived from the development set for subjects with valid data for all items and all subjects with valid data for each item. The optimal cut-off point for each item was a score ≥ 1 (i.e. presence of any impairment). For subjects with complete FAQ data, the items that yielded the best discriminative power between AD and MCI included: paying bills (86% sensitivity, 77.5% specificity, and 80.3% classification accuracy), assembling tax records (88.6% sensitivity, 71.9% specificity, and 77.4% classification accuracy), and traveling outside the neighborhood (80.3% sensitivity, 77.5% specificity, and 78.4% classification accuracy). Discriminative indices were consistently higher for the subset of subjects with valid data for all FAQ items than for the overall cohort.
In order to determine which individual FAQ items were independently associated with a clinical diagnosis of AD, stepwise logistic regression analysis was performed using the development set and adjusted for age, race, and MMSE score. This analysis yielded a Nagelkerke R2 value of 0.618 and indicated that subjects with any impairment on paying bills (β=1.28, S.E.=0.33, Wald χ2=14.76, OR=3.60, CI=1.87–6.91, p<0.001); shopping alone (β=0.83, S.E.=0.33, Wald χ2=6.52, OR=2.29, CI=1.21–4.34, p=0.011); tracking current events (β=0.85, S.E.=0.30, Wald χ2=8.15, OR=2.34, CI=1.31–4.20, p=0.004); traveling outside the neighborhood (β=0.87, S.E.=0.30, Wald χ2=8.59, OR=2.39, CI=1.33–4.28, p=0.003); or playing a game of skill (β=0.70, S.E.=0.31, Wald χ2=4.97, OR=2.01, CI=1.09–3.72, p=0.026) were more likely to be diagnosed with AD. When cut-offs on these individual items were applied to the test set, their discriminative power for identifying AD remained poorer than that obtained using global FAQ indices: 60.6–84.6% sensitivity, 78.9–88.5% specificity, and 77.9–80.9% classification accuracy for subjects with valid data for all items and 56.1–81.1% sensitivity, 75.0–84.3% specificity, and 75.3–77.3% classification accuracy for all subjects.
Of the 29 ADCs actively collecting UDS data, 28 responded to the survey regarding the use of FAQ for diagnosis. These centers contributed data for 93.0% of the subjects included in our analyses. Nineteen centers (comprising 77.4% of subjects) do not use the FAQ for diagnosis and 9 centers (comprising 22.6% of subjects) use FAQ data only as supporting information. None of the ADCs implement a specific cut-point on FAQ scores for distinguishing between MCI and AD. Sensitivity, specificity, and classification accuracy of optimal cut-points for total FAQ and mean FAQ item scores did not differ between centers that considered FAQ scores during diagnosis and those that did not (all p’s>0.1; Table 2).
The analyses presented here focus on the association of clinical diagnoses of AD versus MCI with IADL performance as measured by the FAQ. Expert clinicians were more likely to diagnose AD when total FAQ scores were ≥ 6 and more likely to diagnose MCI when total FAQ scores were < 6. Specific FAQ items that distinguished AD from MCI assessed bill paying, shopping, tracking current events, transportation, and playing games. These findings provide an empiric basis for using the FAQ or similar IADL measures to help distinguish MCI from very mild AD.
Previous studies using the FAQ support its utility for distinguishing between demented and non-demented subjects17–22,28. However, the ideal cut-point for identifying demented subjects has not been consistently established. The original report describing the FAQ used two different criteria: dependency (i.e. item score = 3) in at least two categories of IADLs, which yielded 85% sensitivity and 81% specificity; and a total score ≥ 5 used in conjunction with a battery of neuropsychological testing, which yielded 91% sensitivity and 89% specificity17. The optimal cut-point in our study was a total FAQ score ≥ 6, which is similar to cut-points used by other groups, which range from ≥ 5 (reference 22) to ≥ 8 (references 19, 28). Prior studies have included both normal controls and subjects with more advanced AD, resulting in larger effect sizes. The overall sensitivity and specificity of the FAQ for a diagnosis of AD in this study is slightly lower than previously reported, likely because we used the FAQ to make more difficult distinctions between MCI and very mild AD and between multiple-domain amnestic MCI and very mild probable AD. The latter classification is particularly challenging, because these two diagnoses have overlapping patterns of cognitive impairment and differ only in degree of functional impairment. Our findings indicate that distinctions based on total FAQ scores corresponded well to consensus diagnoses of expert clinicians at NIA-funded ADCs and establish a threshold for the extent of functional decline consistent with a diagnosis of dementia.
Applying a cut-point of 5/6 on total FAQ scores to distinguish AD from MCI in the current sample yields a positive likelihood ratio of 5.61 and a negative likelihood ratio of 0.22, which results in a small to moderate change in pretest probability of a subject being diagnosed with AD29. However, 14.5% of MCI subjects had scores ≥ 6 and 18.5% of AD subjects had scores < 6. Such imprecision might be an expected consequence of imposing a categorical diagnosis of MCI on the continuum between normal aging and dementia. The subtleties and heterogeneity of functional performance between individuals may elude strict definition by standardized instruments and continue to require the more subjective interpretations offered by clinical judgment. Assessment with newer and more precise ADL scales9,30 or those that focus on any impairment on the specific IADLs that best differentiate AD from MCI may result in clinical diagnosis of AD at earlier stages of disease.
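Likelihood ratios follow directly from sensitivity and specificity: LR+ = sensitivity/(1 - specificity) and LR- = (1 - sensitivity)/specificity. A quick check using the test-set estimates for the 5/6 cut-point (80.3% sensitivity, 87.0% specificity) illustrates the arithmetic; note that the likelihood ratios reported in this paragraph were apparently computed on the full sample, so these illustrative values differ slightly:

```python
def likelihood_ratios(sensitivity, specificity):
    """Positive and negative likelihood ratios for a binary test."""
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg

# Test-set estimates for the total FAQ 5/6 cut-point (illustrative)
lr_pos, lr_neg = likelihood_ratios(0.803, 0.870)  # ~6.2 and ~0.23
```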
Our results are consistent with prior work indicating that mild but measurable impairments on IADLs are detectable in MCI6–14. Further analyses of these data indicate that greater total FAQ scores are seen in amnestic versus non-amnestic MCI. However, total FAQ scores do not differ between single-domain and multiple-domain amnestic MCI31.
A limitation of the FAQ is that informants may not be able to provide responses on certain items, either because the subject never performed them prior to developing cognitive impairment or because the informant had insufficient information to rate the subject’s current performance. The original version of the FAQ allowed informants to speculate on subjects’ potential to perform activities that they had never previously pursued17. However, such responses are not included in the UDS version of the FAQ. As a result, a significant percentage of subjects (34.2%) had incomplete FAQ data, though only 4.7% were missing data for > 2 items. Incomplete FAQ responses were seen more frequently in the AD group than the MCI group. Informants for MCI subjects may have been more knowledgeable about their participants or known them for a longer period of time than the informants for AD subjects. It is also possible, albeit less likely, that AD subjects may have had fewer premorbid IADL responsibilities than MCI subjects. The issue of incomplete FAQ responses was addressed by analyzing the mean score for items with valid responses. This measure allowed for the inclusion of all AD and MCI subjects, but exhibited poorer discriminative power than the total FAQ score. These data suggest that the FAQ has greater utility for distinguishing between MCI and AD when complete data are available.
Some of the ADCs use FAQ scores as supportive data in their diagnostic procedures, but none of the centers that responded to our survey implement a specific cut-point to distinguish between MCI and AD. However, the sensitivities, specificities, and classification accuracies of the optimal cut-points derived from our analyses were similar between ADCs that use the FAQ diagnostically and those that do not. Variability in FAQ usage is therefore unlikely to have significantly skewed our results.
This study includes a large subject population drawn from multiple ADCs around the United States, underscoring the generalizability of our findings across disparate regional populations. The relatively robust nature of our findings is further supported by the similar sensitivity, specificity, and classification accuracy seen with the total FAQ score cut-point in the development and test sets. However, there are a few factors that may limit the interpretation of our results. The study population comprised a convenience sample of highly educated subjects volunteering for research at major academic centers and may not be representative of epidemiological samples or those with greater ethnic diversity. Nevertheless, the FAQ has previously demonstrated utility for identifying demented subjects in population-based studies conducted in several other countries19,20,22,28. Diagnostic classification in the UDS is derived from consensus clinical diagnoses based upon the current criteria for MCI1 and the NINCDS-ADRDA criteria for AD27. The UDS includes a core neuropsychological battery24, but the NACC does not specify which additional cognitive tests can be used to supplement that battery, does not establish performance thresholds that define cognitive impairment, and does not stipulate the precise role of test scores in the consensus diagnosis process. Therefore, it remains possible that variability in the interpretation of the diagnostic criteria for AD, MCI, and MCI subtypes across different ADCs may have influenced our results. Finally, our analyses were not adjusted for the presence or severity of any comorbid medical conditions that could have differentially influenced IADLs between groups.
Although functional impairment is a core feature of the DSM-IV criteria for AD5 and a supportive feature of the NINCDS-ADRDA criteria for AD27, some investigators have suggested that ADL measures have little utility for the diagnosis of dementia or AD32,33. In contrast, our work is consistent with previous reports that ADL measures provide sufficient ecological validity for distinguishing demented and non-demented subjects19,22,30. The findings reported here provide an upper threshold for the extent of functional deficits seen in MCI, suggest specific categories of IADLs that most effectively discriminate between MCI and AD, and offer further support for the implementation of formal IADL measures, such as the FAQ, as part of the inclusion and exclusion criteria in future studies of MCI, particularly clinical trials, to allow for better standardization of diagnoses across investigators and research centers.
This research was supported by the National Institute on Aging (P50 AG 16570, U01 AG016976), the National Alzheimer’s Coordinating Center, the Alzheimer’s Disease Research Centers of California, and the Sidell-Kagan Foundation. We would like to thank Nathaniel Mercaldo for his assistance with data management.
Conflicts of Interest: ET, BWB, EW, and PHL have nothing to disclose. DSK, in the past year, has served as a consultant to GlaxoSmithKline, on a Data Safety Monitoring Board for Sanofi-Aventis, and as an investigator in a clinical trial sponsored by Elan and Forest. JLC, in the past year, has served as a consultant to Abbott, Acadia, Accera, Adamas, Astellas, Avanir, Bristol-Myers Squibb, CoMentis, Eisai, Elan, EnVivo, Forest, GlaxoSmithKline, Janssen, Lilly, Lundbeck, Medivation, Merck, Merz, Myriad, Neuren, Novartis, Noven, Pfizer, Prana, reMYND, Schering-Plough, Sonexa, Takeda, Toyama, and Wyeth; on the speakers’ bureau for Eisai, Forest, Janssen, Lundbeck, Merz, Novartis, and Pfizer; and owned stock in Adamas, Prana, and Sonexa.