This report describes a pilot study to evaluate feasibility of new home-based assessment technologies applicable to clinical trials for prevention of cognitive loss and Alzheimer disease.
Community-dwelling nondemented individuals ≥ 75 years old were recruited and randomized to 1 of 3 assessment methodologies: (1) mail-in questionnaire/live telephone interviews (MIP); (2) automated telephone with interactive voice response (IVR); and (3) internet-based computer kiosk (KIO). Brief versions of cognitive and noncognitive outcomes were adapted to the different methodologies and administered at baseline and 1 month. An Efficiency measure, consisting of direct staff-to-participant time required to complete assessments, was also compared across arms.
Forty-eight out of 60 screened participants were randomized. The dropout rate across arms from randomization through 1-month was different: 33% for KIO, 25% for IVR, and 0% for MIP (Fisher Exact Test P = 0.04). Nearly all participants who completed baseline also completed 1-month assessment (38 out of 39). The 1-way ANOVA across arms for total staff-to-participant direct contact time (ie, training, baseline, and 1-month) was significant: F (2,33) = 4.588; P = 0.017, with lowest overall direct time in minutes for IVR (Mn = 44.4; SD = 21.5), followed by MIP (Mn = 74.9; SD = 29.9), followed by KIO (Mn = 129.4; SD = 117.0).
In this sample of older individuals, a higher dropout rate occurred in those assigned to the high-technology assessment techniques; however, once participants had completed baseline in all 3 arms, they continued participation through 1 month. High-technology home-based assessment methods, which do not require live testers, began to emerge as more time-efficient over the brief time of this pilot, despite initial time-intensive participant training.
Major challenges exist for conducting Alzheimer disease (AD) primary prevention trials in recruitment and retention of elderly cohorts and evaluating them over extended periods of time. First, elderly populations often present with a range of physical, social, and health issues, which limit study participation, particularly when a clinical site is geographically remote. Second, identifying a target population for a clinical trial within a general medical setting requires establishing operational study criteria and diagnostic categories at entry to classify homogeneous subgroups of participants with or without memory deficits or other forms of mild cognitive impairment (MCI). Third, assessment batteries need to be developed that are: sensitive to change over time; have sufficient breadth to tap key domains that mark transition from cognitive health to dementia; and suitable for large-scale administration.
This report describes a pilot study in which we incorporated the cognitive and noncognitive domains that Ferris et al1 identified as important to characterizing the transition to dementia. Our goal was to adapt these assessment measures to a home-based longitudinal study of a population of community-dwelling elderly people, using technologies that reduce the need for face-to-face assessment and participant travel. If such home-based assessment is feasible, it would allow a potentially larger and broader sampling of individuals to participate in studies.2
This study included screening procedures to identify nondemented elders, categorizing them as having normal cognition or MCI by a battery of well-established in-person neuropsychologic measures.3,4 Participants were then randomly assigned to 1 of 3 experimental methods of home-based longitudinal assessment: (1) mail-in/phone (MIP): A validated cognitive assessment was conducted by phone with a live, trained tester, supplemented by written mail-in questionnaires assessing noncognitive domains; (2) assessment and data collection through telephone-based automated interactive voice response (IVR)5–7 for all cognitive and noncognitive measures (no live tester); and (3) computer-based, video-directed assessment, and data collection through kiosk (KIO) (no live tester). Abbreviated versions of cognitive and noncognitive measures from key domains associated with change from nondemented to MCI and dementia were administered in formats specific to the technology of experimental method of assessment.
In addition, medication adherence to a twice-daily vitamin regimen, a performance-based instrumental activity of daily living, was assessed by methods specific to each technology arm.
The study was conducted by the Alzheimer Disease Cooperative Study (ADCS), a national consortium of sites with expertise in the conduct of clinical trials in cognitive loss and dementia.8 The purpose of this pilot study was to establish the feasibility of recruitment, randomization, and in-home evaluation of elderly participants. In addition to measures of participant performance, measures of efficiency (staff-to-participant contact needs) were developed to compare these different in-home evaluation methods. The pilot study offered an opportunity to standardize operational procedures, automate scoring, and establish transfer of data from individual sites to a centralized database, all of which provided a model for a subsequent large-scale study.
Participants were recruited from 3 ADCS sites similar to those likely to be recruited into a primary prevention trial.
Inclusion criteria were: age ≥ 75 years; Mini-Mental State Examination (MMSE)9 score ≥ 26; willingness to take study multivitamins; minimal computer skills or willingness to learn; English fluency; ability to answer and dial a telephone; adequate speech, hearing, and vision to complete assessments; and independent living. A study partner was encouraged but not required.
Exclusion criteria were: dementia diagnosis; using prescriptive cognitive-enhancing drugs; intent to continue use of own nonprotocol multivitamins during the study; history or presence of major psychiatric, neurologic, or neurodegenerative conditions associated with significant cognitive impairment; unstable housing arrangements over the 3-month duration of the pilot study; or current participation in a clinical trial involving CNS medications or cognitive testing.
Each site targeted recruitment at communities with large numbers of age-appropriate elderly participants and located close enough to the research site to ensure ready access to in-person technical support. Each site used a variety of recruitment strategies, for example, lectures on memory enhancement, outreach through direct mail, and posting fliers. Informed consent was obtained and participants were offered a small financial incentive for study completion, in accordance with local institutional review boards (IRBs).
Participants were randomized to 1 of 3 home-based assessment methods:
Mail-in/phone (MIP). The cognitive assessment was conducted by a trained evaluator through live telephone call with the participant. Noncognitive assessment and the experimental medication adherence were conducted by mail.
Interactive Voice Response (IVR) Assessment through automated telephone. A standard large-key telephone was installed in the home; cognitive, noncognitive, and medication adherence assessments were conducted through an automated telephone system. Responses were recorded and scored through voice recognition and keypad entry, requiring no live staff.
Kiosk (KIO) Computerized Assessment. A computer kiosk, consisting of a touch screen and attached telephone handset, and in most cases a new broadband internet connection, was installed in the home. Cognitive and noncognitive tests were presented orally and visually by an on-screen videotaped tester; responses were recorded by voice through a telephone handset or by touch screen. Experimental medication adherence was assessed by MedTracker,10,11 an instrumented 7-day reminder pill box, linked by Bluetooth to the in-home study computer to record precise times of pill-taking.
A day-long investigator meeting was conducted to train staff, including installation of equipment for IVR and KIO arms, and procedures for training participants in the use of in-home equipment. For the MIP arm that required a live phone tester for the cognitive assessment, coordinators received training in test administration and scoring. Tester competence was confirmed by a certification test assessing knowledge of administration and scoring of the cognitive assessment battery.
Preparatory steps at each site included identifying local broadband internet service providers and establishing relationships with targeted community sites. After participants signed consent forms, an in-person screening visit was conducted to determine eligibility. During screening the participant was administered an in-person neuropsychologic battery taken from the Uniform Data Set of the National Alzheimer Coordinating Center that is widely used in clinical research protocols (NACC-UDS, ADNI).3,12 The neuropsychologic battery assessed verbal episodic memory, attention, semantic memory/language, psychomotor speed, and executive functioning.3 The clinician categorized participants as normal or Mild Cognitive Impairment (MCI) based on impressions of memory impairment from interview and available neuropsychologic evaluation.
Training visit. The training visit took place in the participant’s home for the IVR and KIO arms and either by phone or at the participant’s home for the MIP arm. The training visit consisted of a mock demonstration of test taking by the participant. The baseline evaluation was scheduled within the next week.
Baseline Visit. For the MIP group the cognitive battery was conducted by phone with a live tester; the tester reminded the participant to mail in the paper-and-pencil noncognitive battery in a prestamped, addressed envelope. For the 2 high-technology arms, appointments were scheduled through their respective automated technologies. If the visit was not completed, staff contacted the participant and the effort by staff was captured on the Efficiency form. No staff time was required for either the cognitive or noncognitive evaluations in the IVR and KIO arms.
One-Month Follow-up. An assessment was scheduled 1 month after the baseline visit and was to be initiated by the participant. If a visit was missed, staff contacted the participant and the effort was captured on the Efficiency form.
The cognitive13–17 and noncognitive18–25 evaluations each included 8 domains, represented by brief instruments suitable for repeated assessment. Tests were adapted to the technological format of each of the study arms, preserving as much as possible the integrity of the original in-person test.
The cognitive performance battery was designed to require about 30 to 40 minutes, presented in a set sequence (Table 1). The test order was designed to achieve an approximately 15- to 20-minute interval between the immediate and delayed recall of the East Boston Memory Story. A checklist of adverse events was collected midway through the cognitive battery as a noncognitive filler. Participants were requested to allot 40 minutes for the in-home cognitive test session and were encouraged not to take breaks. If a break did occur, however, testing resumed at the beginning of the discontinued test.
The noncognitive portion of the home-based assessment (Table 1) was completed by mail for the MIP arm, by automated telephone for the IVR arm and by automated computer kiosk for the KIO arm. Participants were instructed to take the study multivitamin in the morning and evening and an experimental medication adherence measure was generated for each of the arms as described above. All participants also returned pills at the end of the pilot study, a gold standard medication adherence rate that will be used in the Main Study.
Staff contact was measured by the frequency and length of time spent with participants, either in-person or by phone. These contacts included: nonscheduled contacts with staff (eg, nonevaluation phone calls), in-person staff time to train participants in the use of the assessment methods (ie, MIP, IVR, or KIO), and staff time to administer the cognitive battery for the MIP arm by phone. (The other 2 arms required no staff time for home-based experimental testing.) A total staff-to-participant length of direct contact variable (in minutes) was generated by summing: (a) time to train; (b) baseline additional staff-to-participant contact; (c) baseline cognitive testing (MIP only); (d) 1-month additional staff-to-participant contact; e) 1-month cognitive testing (MIP only).
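The composite variable described above is a simple sum of per-participant contact times, with the live-tester components applying only to the MIP arm. A minimal sketch follows (illustrative only, not the study's actual data-management code; the record field names are hypothetical, and times are in minutes):

```python
def total_direct_contact(record, arm):
    """Sum total staff-to-participant direct contact time (minutes)
    for one participant, per the composite described in the text."""
    total = (record["training"]
             + record["baseline_extra_contact"]
             + record["month1_extra_contact"])
    # Live cognitive testing time applies only to the MIP arm;
    # the IVR and KIO arms required no staff time for testing.
    if arm == "MIP":
        total += (record["baseline_cog_testing"]
                  + record["month1_cog_testing"])
    return total

# Hypothetical participant record for illustration
example = {"training": 15.2,
           "baseline_extra_contact": 5.0,
           "month1_extra_contact": 3.0,
           "baseline_cog_testing": 22.0,
           "month1_cog_testing": 22.0}
print(round(total_direct_contact(example, "MIP"), 1))  # 67.2
print(round(total_direct_contact(example, "IVR"), 1))  # 23.2
```

The same record yields a much smaller total under the automated arms because components (c) and (e) drop out, which is exactly the mechanism behind the efficiency differences reported below.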
The disposition of recruited participants was reported for this pilot study according to the CONSORT recommendations (Fig. 2) and key descriptive data on participant demographic and health status were summarized. To address feasibility of enrollment, randomization, and repeated data collection over the brief, 1-month interval, the 3 arms were compared with respect to (a) the number of participants discontinued after randomization; (b) length of time from screening to baseline; and (c) retention rate through 1-month follow-up assessment. Statistical comparisons among methods were also made on the composite “efficiency” measure, described above. The number of incomplete tests in the cognitive battery (out of a total of 8) was compiled across arms.
Categoric data (ie, frequency counts) were analyzed by Fisher Exact Test of equal proportions and if P < 0.05, pairwise comparisons were conducted. For continuously measured variables, 1-way analyses of variance (ANOVA) were conducted across arms. If Bartlett Test of homogeneity of variances was nonsignificant (P ≥ 0.05) for a given variable, a Tukey post hoc analysis was conducted. If Bartlett test was significant (P < 0.05), Welch t-tests26 (accommodating unequal variances) were conducted, with Hochberg27 adjustment for multiple comparisons (ie, pairwise comparisons = MIP:IVR, MIP:KIO, IVR:KIO). All tests were 2-tailed with α set at 0.05.
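The Hochberg step-up adjustment applied to the 3 pairwise comparisons works from the largest raw p-value down, multiplying each successively smaller p-value by an increasing factor and enforcing monotonicity. A small pure-Python sketch (an illustration of the adjustment only, with hypothetical raw p-values; it is not the study's analysis code):

```python
def hochberg_adjust(pvalues):
    """Hochberg step-up adjusted p-values for a family of tests.

    Process p-values from largest to smallest: the largest stays
    unadjusted; each smaller one becomes min(running minimum,
    multiplier * p), where the multiplier counts how many p-values
    are at or below it.
    """
    m = len(pvalues)
    # Indices ordered from the largest raw p-value to the smallest
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for rank, idx in enumerate(order):
        candidate = (rank + 1) * pvalues[idx]  # multiplier grows as p shrinks
        running_min = min(running_min, candidate)
        adjusted[idx] = min(running_min, 1.0)  # p-values are capped at 1
    return adjusted

# Hypothetical raw p-values for the MIP:IVR, MIP:KIO, IVR:KIO contrasts
print(hochberg_adjust([0.012, 0.030, 0.200]))
```

With these inputs the adjusted values are approximately 0.036, 0.060, and 0.200: only the smallest comparison would survive at α = 0.05, illustrating how the adjustment guards the family-wise error rate across the 3 contrasts.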
The means and standard deviations for scores on the cognitive and noncognitive measures were summarized for each arm at baseline and 1 month. A test-retest Pearson r correlation was calculated between baseline and 1-month scores to estimate the reliability of these abbreviated tests. Statistical comparisons of cognitive and noncognitive scores across arms were deferred until a large sample size is collected in the main study.
A total of 60 participants were enrolled in the study, of which 48 (80%) were randomized (Fig. 2). The reasons that 12 participants discontinued before randomization included: general unwillingness to be randomized (n = 3); only willing to be randomized to KIO arm (n = 1); health concerns (n = 2); failure on MMSE (n = 1); unwillingness to take study vitamin (n = 1); and unspecified reasons unrelated to any of the arms (n = 4). The 48 randomized participants were distributed as: MIP n = 14; IVR n = 16; and KIO n = 18. Nine of the 48 randomized participants discontinued before baseline: 6 from KIO (33%), 3 from IVR (19%). Five of the 6 participants who discontinued from KIO reported reasons specific to that arm’s technology: for example, “too much trouble getting broadband internet.” Two of the 3 IVR participants who discontinued were described as “unable to complete the baseline.” The others withdrew with no specific comment recorded. The dropout rate across arms from randomization through completion of the study at 1 month follow-up was significantly different: 33% for KIO, 25% for IVR, and 0% for MIP (Fisher Exact Test P = 0.04; none of the pairwise comparisons were significant).
The mean intervals in days that elapsed between screening and baseline evaluation were: MIP = 30.1 (SD = 14.8); IVR = 24.5 (SD = 8.8); and KIO = 35.6 (SD = 15.2). These time intervals were not significantly different across arms [F (2,36) = 2.17; P = 0.129]. Nor were there significant differences across arms in retention rate from baseline to Month 1 evaluation (Fisher Exact Test, P = 0.641). Of the 39 participants who completed baseline, only 1 participant (in the IVR arm) failed to complete the assessments at Month 1, yielding a 97.4% retention.
Demographic and health status variables were not significantly different across the 3 arms. Overall, the mean age of the 39 randomized participants was 82.1 (SD = 5.2) and the mean number of years of education was 15.8 (SD = 2.9; Range = 8 to 20). A total of 29 participants (74%) were female and 8 (21%) were married. The ethnicity of the pilot sample was homogeneous: 36 (92.3%) White; 2 (5.1%) African-American; and 1 (2.6%) Native American. The near-ceiling mean score of 29.0 (SD = 1.2) on the MMSE was reflective of the study cutoff score of ≥ 26. A total of 27 participants (69%) reported a cardiovascular medical condition, 18 (46%) self-reported a memory complaint, and 7 (15%) met the clinician rating for MCI.
Scorable data were produced for all cognitive and noncognitive experimental tests by the completion of the pilot study. Study coordinators from each of the 3 pilot sites (2 in New York City and 1 in Portland, OR) reported similar technical start-up issues. They reported no implementation problems with the low-technology MIP arm.
Site staff encountered the most frequent and time-consuming difficulties in the setup of the KIO arm. These difficulties included: delays in broadband installation; difficulty installing speakers; the need to permit remote access and multiple computer access for the participant; insufficient power from the MedTracker battery pack; and the need for real-time assistance with computer installation. This technical assistance was not captured in the Efficiency measures, which focused exclusively on contacts that staff made directly with participants. In addition, data were lost for Digit Span Backwards because the length of pauses in utterances had not been considered in the original KIO scoring paradigm, and administrations were terminated prematurely until this problem was corrected. One KIO participant was also missing the Abbreviated TICS at baseline and 1 month because of hearing difficulty. All other experimental tests, both cognitive and noncognitive, were complete for this arm.
The 3 arms were compared at baseline and Month 1 with respect to the number of incomplete tests in the home-based experimental cognitive battery (max = 8 tests). At baseline, none of the participants in the MIP (n = 14) or IVR (n = 13) arms were missing any tests. In the KIO arm 8 participants were missing 1 test (Digit Span Backward), 1 participant was missing 2 tests (Digit Span Backward and the Abbreviated TICS); 3 participants had all cognitive tests completed. The Fisher Exact Test across groups was highly significant (P < 0.001). At Month 1, although all 12 KIO participants were retained for Month 1 testing, 9 participants were missing Digit Span Backward and 1 participant was missing both Digit Span Backward and Abbreviated TICS. As with baseline testing, none of the MIP (n = 14) or IVR (n = 12) participants had any incomplete tests at Month 1. The Month 1 Fisher Exact Test of number of incomplete tests across arms was highly significant (P < 0.001).
Noncognitive tests were largely completed in all 3 arms. All participants had complete data for the BCFSI, QOL, Behavioral, and ADL measures at both baseline and Month 1, and the CGIC was completed for all groups at Month 1. On the Participant Status measure, 2 of 12 IVR participants were missing information about cognitive-enhancing medications at Month 1. On the Resource Use Inventory, 1 of 13 IVR participants was missing the number of hours/week with helpers at baseline.
For medication adherence, all 12 KIO participants had scorable MedTracker data, and according to the number of returned pills, all participants had taken at least some medication. In the MIP arm, all participants mailed in self-reported use of medication; however, 2 of 14 MIP participants had a missing return date, so adherence was unscorable, and 1 MIP participant returned all pills. In the IVR arm, 1 of 12 participants was missing medication adherence data collected by self-reported automated phone response. In addition, 1 IVR participant had a missing return date for pill count and 1 IVR participant returned all pills. The medication adherence data collection in this pilot was targeted only at standardized implementation of the procedures; the actual measurement and data analysis (how the experimental arms each compare with the gold standard of returned pill count) have been deferred to the Main Study.
At baseline, 9 (75%) participants in KIO required study coordinator contacts outside of training and testing, as compared with 7 IVR (54%) and 3 (21%) MIP participants. The overall Fisher Exact Test was significant (P = 0.024), with a significant pairwise comparison, using the Hochberg adjustment, between KIO and MIP (P = 0.048). By 1 month, however, despite a borderline level of significance (Fisher Exact P = 0.078), the magnitude of the differences was smaller. The numbers of participants requiring outside staff contact at 1 month were: 8 KIO (67%), 11 IVR (85%), and 6 MIP (43%).
As noted in Table 2, the arms differed significantly in the amount of staff-to-participant training time required: F(2,36) = 8.834; P = 0.001. The mean training times in minutes for each arm’s technology were: MIP Mn = 15.2 (SD = 6.2); IVR Mn = 31.6 (SD = 6.2); and KIO Mn = 104.2 (SD = 101.2). KIO training time was significantly longer than MIP (P = 0.022) and IVR (P = 0.031), and IVR training time was significantly longer than MIP (P < 0.001). At baseline, additional staff-to-participant contact time varied across groups [F (2,36) = 3.988; P = 0.027]; however, none of the pairwise comparisons were significant. By 1 month there were no significant differences across groups in additional staff-to-participant contact time [F(2,36) = 0.144; P = 0.866]. The 1-way ANOVA across arms for total staff-to-participant direct contact time (ie, training time + additional contact time + time live tester spent in experimental cognitive testing for MIP arm) was significant: F (2,33) = 4.588; P = 0.017. The group that required the lowest overall direct time in minutes was IVR (Mn = 44.4; SD = 21.5), followed by MIP (Mn = 74.9; SD = 29.9), followed by KIO (Mn = 129.4; SD = 117.0). The only significant pairwise difference was found between IVR and MIP, the former being the most time efficient (P = 0.034) with respect to total staff-to-participant direct contact time through the entire length of the study.
The scores for cognitive and noncognitive tests by arm are presented in Table 3. Pearson test-retest reliability coefficients comparing baseline and 1 month scores were significant (P < 0.001) for all cognitive and noncognitive experimental measures, ranging from 0.58 to 0.84. As the sample size was small in this pilot, statistical comparisons of scores were deferred until data are analyzed in the main study.
This pilot study showed the challenges of recruitment of elderly individuals to a home-based study of cognitive and noncognitive change using traditional and novel methods for evaluation. The cohort was quite old, with a mean age above 80, and nearly 70% reported cardiovascular conditions, thereby showing that an “at risk” population can be recruited. Randomization into the novel technology arms (KIO and IVR) was associated with an increased likelihood of discontinuation and dropout before baseline; participants assigned to KIO in particular cited the inconvenience of the technology. This finding points to the need to vet new technologies fully with real-life experience before deploying them to home use. It is noteworthy that once participants were familiarized with study procedures (ie, completing baseline) they remained engaged, as supported by the over 90% retention from baseline to 1-month follow-up across all arms.
Objective findings corroborate subjective accounts of greater difficulty with the initiation of the novel, automated methods, and of particular difficulty with the KIO installation. Staff time was greater for the KIO arm, particularly during training and baseline data collection, and did not include the additional time for installation or the “real time” technical assistance for computer troubleshooting. Nevertheless, the staff time invested at startup for the new technologies yielded greater efficiency over time, even over this short study. For example, when total staff time was tallied for the training, baseline, and 1-month follow-up periods, the IVR arm was already more time efficient than the MIP arm. Given the estimate from the current pilot study that each live cognitive assessment in the MIP arm requires 22 minutes to complete, after 3 additional follow-up assessments the KIO arm would also be projected to require less staff time than MIP, assuming no additional technological problems occurred. Longer follow-up with more assessments might therefore result in significant savings with these automated methodologies.
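The break-even arithmetic can be checked with a short calculation using the pilot's reported mean times (a back-of-envelope sketch, not study code; it assumes each additional assessment adds the ~22 minutes of live testing only in the MIP arm and no further staff time in the automated arms):

```python
# Per-assessment incremental staff time (minutes): only MIP needs a
# live tester for each follow-up cognitive assessment.
per_assessment = {"MIP": 22.0, "IVR": 0.0, "KIO": 0.0}

# Mean total direct contact time through 1-month follow-up, as reported
through_month1 = {"MIP": 74.9, "IVR": 44.4, "KIO": 129.4}

def projected_total(arm, extra_assessments):
    """Cumulative staff time after additional follow-up assessments."""
    return through_month1[arm] + per_assessment[arm] * extra_assessments

for extra in range(5):
    totals = {arm: round(projected_total(arm, extra), 1)
              for arm in through_month1}
    print(f"+{extra} assessments: {totals}")
```

After 2 additional assessments MIP (118.9 minutes) is still below KIO (129.4), but after 3 it reaches 140.9 minutes and overtakes KIO, matching the projection in the text.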
This protocol represents the first use of this cognitive battery without in-person administration. It is also the first use of the abbreviated noncognitive assessments. Data collection for all measures was nearly complete across all arms. The test-retest reliability of all experimental measures was good, supporting their stability when collected by these novel formats. There were no significant differences across the 3 arms in any of the experimental cognitive scores at 1 month, with scores from each of the 3 experimental arms comparable with in-person “gold standard” tests.
Medication adherence served as a performance-based measure of “activities of daily living” and the data collection techniques were unique to each of the 3 technologies. Scorable data were obtained from all but 3 participants, indicating that these methods are reasonable for data collection in an elderly population.
This pilot study informed the currently ongoing, much larger study designed to evaluate the sensitivity of these methods over a 4-year interval. Some of the limitations of the pilot study, such as limited diversity of race/ethnicity and levels of education, have been addressed. The main study targets participants with a diversity of demographic factors that play a role in generalizability of findings (eg, 20% minority enrollment is required at each participating site).
The pilot study identified the gap between the relatively high initial enthusiasm for higher technologies (such as computers) and the reluctance to participate fully once randomized. It also identified fears about the size and inconvenience of internet-based technologies, which prompted us to develop strategies to address these concerns. For example, computer kiosks delivered in shipping cartons intimidated potential participants with small living quarters, and delivery of uncrated computers helped alleviate this concern. Providing companionship while service people installed cable connections overcame anxiety about having a “stranger” in the home. “Real time” help lines for local site staff for installation problems allowed participants to see technical issues as manageable.
The most encouraging finding was the evidence that efforts early in the trial had long-lasting benefits in terms of participation and follow-up. The main study will provide larger sample sizes in which we will be able to evaluate and compare methods for efficiency and accuracy in early detection of cognitive deterioration and dementia.
The authors are grateful to Dr Paul Aisen for review of the manuscript and editorial advice. The authors would also like to acknowledge the contributions of Tracy Reyes and Ben Barth at Healthcare Technology Systems; Jacques H. de Villiers, Rachel Coulston, Esther Klabbers, and John Paul Hosom at the OHSU Center for Spoken Language and Understanding; Jessica Payne-Murphy at the OHSU Oregon Center for Aging & Technology; and Study Coordinators Melissa Rushing at Mount Sinai School of Medicine; and Erica Maya at NYU School of Medicine.
Supported by these NIA grants: U01AG10483, P50AG005138, P30AG008051, and P30AG024978. Development of the Kiosk and MedTracker was supported in part by grants from NIA (P30-AG024978; P30-AG08017) and Intel Corporation.