A randomized, double-blind, 2-group parallel study comparing tarenflurbil with placebo for 18 months was conducted at 133 participating trial sites. Written informed consent was obtained from participants, their legally authorized representatives, or both. The institutional review boards of the participating institutions approved the study.
Patients were predominantly recruited from dementia clinics and were enrolled from February 21, 2005, through April 30, 2008. Initial enrollment criteria specified community-dwelling patients with mild to moderate AD severity (Mini-Mental State Examination [MMSE] scores of 15–26, inclusive). Patients were assigned to treatment with tarenflurbil at doses of either 400 mg or 800 mg twice a day. After analysis of phase 2 data indicated that patients with mild AD appeared to have a more robust response to 800 mg of tarenflurbil twice daily,7 and after discussion with the US Food and Drug Administration, the enrollment criteria were changed on May 27, 2005, to enroll only patients with mild AD (MMSE score of 20–26, inclusive). Treatment was modified by ending the 400-mg group: 42 patients with mild AD were switched to the 800-mg group and 32 patients with moderate AD were discontinued from the trial. The revised analysis plan therefore excluded those with moderate AD.
The remainder of the inclusion and exclusion criteria remained unchanged throughout the trial: age 55 years or older and living in the community, meeting criteria for dementia by the Diagnostic and Statistical Manual of Mental Disorders (Fourth Edition) (DSM-IV), and having probable AD by National Institute of Neurological and Communicative Disorders and Stroke–Alzheimer's Disease and Related Disorders Association criteria.8 Additional inclusion criteria at screening included no clinically significant focal intracranial pathology as assessed by CT or MRI within the previous 12 months, a modified Hachinski ischemic score9 of less than 4, at least 6 years of education or sufficient work history to exclude mental retardation, adequate vision and hearing to participate in study assessments, and a reliable caregiver who saw the patient at least 4 days a week and could accompany the patient to each clinic visit.
Participants taking an acetylcholinesterase inhibitor were enrolled provided they had been taking that specific medication for at least 6 months before taking the study drug. Participants taking memantine were enrolled if they had been taking stable doses for at least 3 months before taking the study drug. Randomization was stratified by use or nonuse of cholinesterase inhibitors and memantine. Participants taking antidepressant, antipsychotic, or anxiolytic drugs, vitamin E, or Ginkgo biloba were eligible, provided that the dose was stable for at least 3 months prior to randomization. Chronic aspirin use for cardioprotective therapy was allowed.
Participants were excluded if they had evidence of epilepsy; focal brain lesion; head injury with loss of consciousness or confusion after the injury; DSM-IV-TR (Text Revision) criteria for any major psychiatric disorder including psychosis, major depression, bipolar disorder, or alcohol or substance abuse; history of upper gastrointestinal tract bleeding requiring surgery, transfusion, or both within 3 years or documented evidence of active gastric or duodenal ulcer disease within 3 months; history or evidence of active malignancy, except for prostate cancer, basal cell carcinoma, or squamous cell carcinoma of the skin within 24 months of entry; a chronic or acute renal, hepatic, or metabolic disorder; any use of AD immunotherapy or recent use of any investigational therapy or major surgery; an uncontrolled cardiac condition (New York Heart Association class III or IV); anticoagulant therapy such as warfarin within 12 weeks of enrollment; use of any CYP2C9 enzyme inhibitor or the CYP2C9 enzyme substrates losartan, phenytoin, tamoxifen, torsemide, and fluvastatin within 2 weeks of enrollment; recent history of chronic use of nonsteroidal anti-inflammatory drugs (NSAIDs) at any dose or aspirin greater than 325 mg/d; or history of hypersensitivity to any NSAIDs including cyclooxygenase 2 (COX-2)–specific inhibitors. Race was determined by self-report and was assessed to evaluate possible drug effect modification.
Eligible participants were randomized by a central randomization schema generated by the sponsor. The randomization tables were maintained in a locked file room of the head of the Quality Assurance department. The clinical system was used to assign blinded drug treatment kits. Both dosages of tarenflurbil were administered as 2 tablets twice a day: the 400-mg group received a single tarenflurbil tablet and the 800-mg group received 2 tarenflurbil tablets at each dose. After the protocol amendment, tarenflurbil was administered only at 800 mg twice daily. Participants in the placebo group took 2 tablets identical to the tarenflurbil tablets twice a day to ensure blinding. Participants were not asked to guess their randomization group.
Adverse event monitoring, physical examinations including vital signs measurement, standard resting 12-lead electrocardiograms, and blood and urine sample collection for clinical laboratory analysis and determination of plasma tarenflurbil concentration were performed at the screening visit and at months 1, 3, 6, 9, 12, 15, and 18. Additional adverse event monitoring was performed via telephone with caregivers at week 2 and every month between scheduled visits. All participants were assessed 30 days after the last dose of study medication. A central laboratory was used throughout the study.
Outcome Measures and Power Estimates
Co-primary efficacy outcomes were cognition as assessed by the Alzheimer Disease Assessment Scale–Cognitive Subscale (ADAS-Cog, 80-point version)10 and functional ability as assessed by the Alzheimer Disease Cooperative Study activities of daily living (ADCS-ADL, 78-point scale).11 A key secondary outcome measure assessed global function with the Clinical Dementia Rating (CDR) sum of boxes (CDR-sb, 18-point scale).12 Additional secondary outcomes included the MMSE (30-point scale),13 Neuropsychiatric Inventory (144-point scale),14 quality of life scale (QOL-AD, 13–52 points),15 Caregiver Burden Inventory (96-point scale),16 and 70-point version of ADAS-Cog. Blood samples were collected and stored for population pharmacokinetic analysis and for apolipoprotein E (APOE) genotype testing.
The power estimates were based on the joint power for detecting a difference between treatment groups in the changes from baseline to month 18 on the ADAS-Cog and the ADCS-ADL scales. Statistical power for each end point was calculated separately assuming an effect size of 20% (ie, treatment difference divided by SD), using a 2-sided test and a 5% significance level. Assuming that the SD of changes from baseline to month 18 was approximately 10.0 for ADAS-Cog and 13.0 for ADCS-ADL, with 800 patients per group, the study would have had at least 98% power to detect a treatment difference in the changes from baseline to month 18 of 2.0 points for ADAS-Cog and 2.6 points for ADCS-ADL. No adjustment for dropouts was made in this calculation because the co-primary analyses used the z score last-observation-carried-forward imputation algorithm. Because the co-primary end points were expected to be correlated, the joint power would have been in excess of 0.96 (0.98²) for detecting treatment differences.
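The per-end point power figure can be approximated with a short calculation. The following is a minimal sketch, assuming a two-sample z test (normal approximation) with equal group sizes; the function name is illustrative, and the sponsor's actual calculation method is not described in the text. Both end points share the same standardized effect size (2.0/10.0 = 2.6/13.0 = 0.20), so a single call covers both.

```python
from statistics import NormalDist


def two_sample_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a 2-sided, two-sample z test.

    effect_size is the treatment difference divided by the SD.
    Assumes equal group sizes and a normal approximation.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic under the alternative.
    shift = effect_size * (n_per_group / 2) ** 0.5
    # Probability of rejecting in either tail.
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)


# Effect size 0.20 with 800 patients per group, as stated in the text.
power_each = two_sample_power(effect_size=0.20, n_per_group=800)

# If the two end points were independent, joint power would be the product;
# positive correlation between them would typically raise it above this value.
joint_power_if_independent = power_each ** 2

print(round(power_each, 3), round(joint_power_if_independent, 3))
```

Under this approximation the per-end point power comes out close to the 98% quoted, and the product gives a figure near 0.96. Small differences from the published numbers would be expected from rounding and from the choice of test.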
The primary analysis was performed on changes from baseline to month 18 in total score for ADAS-Cog and ADCS-ADL. Slopes of total scores for both scales were evaluated as a secondary outcome. The key secondary efficacy end point was change from baseline to month 18 in CDR-sb score, and slopes were also evaluated. Other secondary efficacy end points were changes from baseline to month 18 for MMSE, Neuropsychiatric Inventory, QOL-AD, and Caregiver Burden Inventory. Safety end points included incidence of adverse events, clinical laboratory tests, vital signs, electrocardiogram, and physical examination.
All efficacy analyses were performed using the intent-to-treat population, which in this instance consisted of all participants who were randomized, had mild AD at screening, and received at least 1 dose of study medication. Participants initially randomized to the 400-mg group were pooled with the 800-mg group. A z score last-observation-carried-forward method was used to impute missing data for the main change-from-baseline analysis of each efficacy end point. A missing value was replaced with a value that was the same number of SDs from the treatment group mean at that time point as that participant's last observed value [z score = (observed value − treatment group mean)/treatment group SD]. This imputation method accounts for AD being a progressive disease and for data that may not be missing at random (ie, patients who progress more quickly may be more likely to withdraw). No imputation was used for comparison of slopes. A per-protocol analysis that included only those participants who completed double-blind therapy and who had no major protocol violations was also performed for the co-primary and key secondary end points. Participants with moderate AD were not included in any of these analyses.
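The z score last-observation-carried-forward rule can be illustrated as follows. This is a minimal sketch, not the trial's analysis code: it assumes that the treatment group mean and SD at each visit are computed from the scores observed at that visit and that the sample SD (n − 1 denominator) is used, neither of which is specified in the text. The function name and all scores are hypothetical.

```python
from statistics import mean, stdev


def z_score_locf(last_value, group_at_last_visit, group_at_target_visit):
    """Impute a missing score by carrying the participant's z score forward.

    last_value: the participant's last observed score.
    group_at_last_visit: observed scores in the same treatment group at that visit.
    group_at_target_visit: observed scores in the same group at the missing visit.
    """
    # z score = (observed value - treatment group mean) / treatment group SD
    z = (last_value - mean(group_at_last_visit)) / stdev(group_at_last_visit)
    # Place the participant the same number of SDs from the later group mean.
    return mean(group_at_target_visit) + z * stdev(group_at_target_visit)


# Hypothetical scores, for illustration only (not trial data).
month_12_scores = [20, 24, 28, 32, 36]
month_18_scores = [22, 28, 34, 40, 46]

# A participant last seen at month 12 with a score of 36.
imputed_month_18 = z_score_locf(36, month_12_scores, month_18_scores)
print(round(imputed_month_18, 1))
```

Unlike plain last-observation-carried-forward, which would keep the score at 36, this rule moves the imputed value along with the group's mean and spread, which is how it reflects ongoing progression.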
Change-from-baseline analyses were conducted using an analysis of covariance model with treatment group, clinical site, and current use of acetylcholinesterase inhibitor, memantine, or both as fixed effects with the baseline score as the covariate. The slopes analyses were conducted using a repeated measures linear mixed model, with random intercepts and slopes, baseline score and time as covariates, factors for treatment group, clinical site, and current use of acetylcholinesterase inhibitor, memantine, or both, and a term for treatment group × time interaction. Time was treated as a continuous variable. Because both co-primary change-from-baseline analyses had to be statistically significant at the .05 level to meet study objectives, no adjustment for multiple comparisons was made for co-primary analyses.
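One way to write the slopes model described above is shown below. This is a sketch under stated assumptions: the notation, the normality assumptions, and the unstructured random-effects covariance are illustrative choices and are not given in the text.

```latex
\begin{aligned}
Y_{ij} &= \beta_0 + \beta_1 B_i + \beta_2 t_{ij} + \beta_3 G_i
          + \beta_4 \,(G_i \times t_{ij}) + \gamma_{s(i)} + \delta_{m(i)}
          + b_{0i} + b_{1i}\, t_{ij} + \varepsilon_{ij}, \\
(b_{0i},\, b_{1i})^{\top} &\sim N(\mathbf{0}, \boldsymbol{\Sigma}), \qquad
\varepsilon_{ij} \sim N(0, \sigma^2)
\end{aligned}
```

Here $Y_{ij}$ is the score of participant $i$ at visit $j$, $B_i$ the baseline score, $t_{ij}$ continuous time, $G_i$ the treatment group indicator, $\gamma_{s(i)}$ the clinical site effect, $\delta_{m(i)}$ the effect of current acetylcholinesterase inhibitor or memantine use, and $b_{0i}$, $b_{1i}$ the random intercept and slope. In this form, $\beta_4$ is the treatment group × time interaction, that is, the between-group difference in slopes.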
A gatekeeper approach was used to control for multiple comparisons for the slopes analyses of the ADAS-Cog and ADCS-ADL and for the change-from-baseline and slopes analyses for the CDR-sb. After performing the primary analysis, treatment comparisons were planned in this order: (1) slopes for the ADAS-Cog; (2) slopes for the ADCS-ADL; (3) change from baseline to month 18 in the CDR-sb; (4) slopes for the CDR-sb. Each comparison could only be considered statistically significant if it, and all preceding analyses, were statistically significant at the .05 level. No multiple comparison adjustments were used for other efficacy end points.
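The gatekeeper rule amounts to fixed-sequence testing, which can be sketched as below. The function name is illustrative and the P values are invented for the example; they are not results from this trial.

```python
def gatekeeper(ordered_p_values, alpha=0.05):
    """Apply a fixed-sequence (gatekeeper) rule to ordered comparisons.

    ordered_p_values: list of (label, p_value) pairs in the prespecified order.
    A comparison counts as significant only if it and every earlier
    comparison in the sequence have p < alpha.
    """
    results = []
    gate_open = True
    for label, p_value in ordered_p_values:
        # Once any comparison fails, the gate stays closed for all later ones.
        gate_open = gate_open and p_value < alpha
        results.append((label, gate_open))
    return results


# Hypothetical P values in the prespecified order (not trial results).
example = [
    ("ADAS-Cog slopes", 0.01),
    ("ADCS-ADL slopes", 0.20),
    ("CDR-sb change from baseline", 0.03),
    ("CDR-sb slopes", 0.04),
]
decisions = gatekeeper(example)
for label, significant in decisions:
    print(label, significant)
```

In this example only the first comparison is declared significant: the second fails, so the third and fourth cannot be considered significant even though their nominal P values are below .05.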
Additional efficacy analyses included categorization of participants who improved from baseline at each visit for ADAS-Cog, ADCS-ADL, and CDR-sb. For ADAS-Cog, these criteria had to be met on 2 consecutive visits to qualify as an improver. Blood samples were collected for the measurement of tarenflurbil and S-flurbiprofen. A population pharmacokinetic (PK) model was developed using the plasma concentration data collected. The tarenflurbil PK parameters area under the curve from 0 to 12 hours (AUC[0–12 h]) and Cmax (maximum plasma drug concentration) were estimated for each participant and used to explore potential relationships between tarenflurbil plasma concentrations and the primary efficacy outcomes using a mixed model.
Safety analyses were conducted using all participants, including those with mild or moderate AD, who received at least 1 dose of the study drug (safety population). The Pearson χ² test was used for all treatment comparisons based on categorical variables (eg, demographic and baseline characteristics; proportions of improvers). Analysis of variance or analysis of covariance was used to compare treatment groups for continuous variables. Statistical analyses used SAS software version 9.1.3 (SAS Institute Inc, Cary, North Carolina). P<.05 was considered significant.
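For a comparison of two groups on a binary variable such as improver status, the Pearson test reduces to a 2 × 2 table with 1 degree of freedom. The sketch below is written in Python rather than the SAS software used in the study, omits any continuity correction (an assumption, since the text does not say whether one was applied), and uses invented counts.

```python
from math import erfc, sqrt


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for [[a, b], [c, d]].

    Rows are treatment groups; columns are the two outcome categories.
    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts (not trial data): 30 of 100 improvers vs 20 of 100.
statistic, p_value = pearson_chi2_2x2(30, 70, 20, 80)
print(round(statistic, 3), round(p_value, 3))
```

With these made-up counts the P value is above .05, so the difference would not be considered significant under the threshold stated above.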