The Look AHEAD (Action for Health in Diabetes) Study is a long-term clinical trial that aims to determine the cardiovascular disease (CVD) benefits of an intensive lifestyle intervention (ILI) in obese adults with type 2 diabetes. The study was designed to have 90% statistical power to detect an 18% reduction in the CVD event rate in the ILI Group compared to the Diabetes Support and Education (DSE) Group over 10.5 years of follow-up.
The original power calculations were based on an expected CVD rate of 3.125% per year in the DSE group; however, a much lower-than-expected rate in the first 2 years of follow-up prompted the Data and Safety Monitoring Board (DSMB) to recommend that the Steering Committee undertake a formal blinded evaluation of these design considerations. The Steering Committee created an Endpoint Working Group (EPWG) that consisted of individuals masked to study data to examine relevant issues.
The EPWG considered two primary options: (1) expanding the definition of the primary endpoint and (2) extending follow-up of participants. Ultimately, the EPWG recommended that the Look AHEAD Steering Committee approve both strategies. The DSMB accepted these modifications, rather than recommending that the trial continue with inadequate statistical power.
Trialists sometimes need to modify endpoints after launch. This decision should be well justified and should be made by individuals who are fully masked to interim results that could introduce bias. This article describes this process in the Look AHEAD study and places it in the context of recent articles on endpoint modification and recent trials that reported endpoint modification.
Weight loss is commonly recommended to overweight and obese adults, especially those with type 2 diabetes, in order to reduce cardiovascular risk; however, the effect of weight loss on cardiovascular disease (CVD) outcomes has never been tested definitively. The results of epidemiologic studies have been inconsistent, and randomized controlled trials of weight loss have generally focused on short-term changes in intermediate endpoints like blood pressure and serum lipids [2,3]. The Look AHEAD (Action for Health in Diabetes) Study (Clinicaltrials.gov Identifier: NCT00017953) was designed to be the definitive test of the long-term health benefits of a lifestyle intervention aimed at weight loss in adults with type 2 diabetes.
The primary hypothesis was that the lifestyle intervention would reduce the incidence of a composite endpoint of incident CVD defined as cardiovascular death (including fatal myocardial infarction and stroke), nonfatal myocardial infarction, or nonfatal stroke. The secondary hypothesis was that the lifestyle intervention would reduce the incidence of a composite endpoint of all-cause death or incident CVD-related secondary outcomes defined in aggregate as myocardial infarction, stroke, coronary artery bypass graft (CABG) surgery or percutaneous coronary (PC) intervention, hospitalization for congestive heart failure (CHF), carotid endarterectomy, or surgical bypass or percutaneous intervention for peripheral arterial disease.
The sample size was based on the aim of detecting an 18% difference in the primary endpoint in the intensive lifestyle intervention (ILI) compared to the Diabetes Support and Education (DSE) control group. (During the design of the trial, targets of 15%–20% for interventions were identified as conveying significant public health benefit. The Steering Committee selected an 18% intervention effect on which to base power projections because this appeared feasible and required a cohort sufficiently large to meet other objectives of the trial.) We made the following assumptions: (1) The rate of incident CVD in the DSE group would be 3.125% per year (corresponding to the projected rate of incident CVD in a population of overweight and obese adults with type 2 diabetes eligible for participation in Look AHEAD); (2) the cohort would be recruited uniformly over 2.5 years of follow-up; (3) in all, 2% of participants would be lost to endpoint ascertainment annually; and (4) participants lost to follow-up would be similar to their counterparts in regard to both treatment and endpoint risk. Based on these assumptions, we determined that 5000 participants followed up for a maximum of 11.5 years would yield 92% power (with two-sided α = 0.05) to detect an 18% relative difference in the composite primary endpoint (i.e., an absolute event rate of 3.125 per 100 person-years in the DSE group versus an absolute event rate of 2.562 per 100 person-years in the ILI group).
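The design assumptions above can be turned into a rough power estimate. The sketch below is not the method the investigators used (the text does not specify it); it assumes constant (exponential) event and dropout hazards, treats the annual rates as hazards, assumes 1:1 allocation, and uses the Schoenfeld log-rank approximation. The function names and defaults are illustrative. Under these assumptions the estimate for the design scenario should land near 90%.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def expected_event_fraction(hazard, dropout_hazard, min_follow_up, max_follow_up):
    """Average probability of an observed event when potential follow-up is
    uniformly distributed between min_follow_up and max_follow_up (years).
    Assumes exponential event and dropout times (an assumption, not from the paper)."""
    k = hazard + dropout_hazard
    span = max_follow_up - min_follow_up
    # Mean of exp(-k * t) over t uniform on [min_follow_up, max_follow_up]
    mean_event_free = (exp(-k * min_follow_up) - exp(-k * max_follow_up)) / (k * span)
    return hazard / k * (1.0 - mean_event_free)


def approximate_power(control_rate, relative_reduction=0.18, n_total=5000,
                      accrual_years=2.5, max_follow_up=11.5,
                      annual_loss=0.02, alpha=0.05):
    """Schoenfeld approximation to log-rank power with 1:1 allocation."""
    dropout_hazard = -log(1.0 - annual_loss)        # 2% lost per year
    treated_rate = control_rate * (1.0 - relative_reduction)
    min_follow_up = max_follow_up - accrual_years   # last participant enrolled
    per_arm = n_total / 2
    events = per_arm * (
        expected_event_fraction(control_rate, dropout_hazard, min_follow_up, max_follow_up)
        + expected_event_fraction(treated_rate, dropout_hazard, min_follow_up, max_follow_up)
    )
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2)
    z = sqrt(events) / 2 * abs(log(1.0 - relative_reduction)) - z_alpha
    return NormalDist().cdf(z)


design_power = approximate_power(control_rate=0.03125)
print(f"Approximate power at the design event rate: {design_power:.2f}")
```

Calling `approximate_power` with a lower `control_rate` shows how quickly power erodes when the control-group event rate falls short of the design assumption; the results are approximations and need not match the trial's own projections.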
The 3.125% event rate was a best estimate based on reported CVD event rates among individuals similar to those who were to be recruited for the study. Specifically, we used longitudinal data from diabetic participants in the Atherosclerosis Risk in Communities (ARIC) study and the Cardiovascular Health Study (CHS). We assumed that 75% of Look AHEAD participants would have no history of CVD and that their age distribution would represent an equal mixture of the ARIC (45–64 at baseline) and CHS (65–75 at baseline) samples. The overall event rate for the combined ARIC + CHS diabetic population was estimated to be 3.72%. This event rate was adjusted 5% upward to account for silent myocardial infarctions (which were not part of the ARIC/CHS event estimates) and then 20% downward to account for both a healthy volunteer effect and the overall decline in CVD event rates in the United States since the 1980s. This produced the anticipated 3.125% event rate used for power calculations (3.72% × 1.05 × 0.80 = 3.125%).
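The arithmetic behind the anticipated event rate can be checked directly. The short sketch below uses only the figures quoted in the text; the variable names are illustrative.

```python
# Figures quoted in the text; variable names are illustrative.
pooled_rate = 3.72                    # % per year, combined ARIC + CHS diabetic population
silent_mi_adjustment = 1.05           # 5% upward for silent myocardial infarctions
volunteer_secular_adjustment = 0.80   # 20% downward: healthy volunteers, secular decline

anticipated_dse_rate = pooled_rate * silent_mi_adjustment * volunteer_secular_adjustment
# An 18% relative reduction gives the implied rate in the intervention group.
implied_ili_rate = anticipated_dse_rate * (1 - 0.18)

print(f"Anticipated DSE rate: {anticipated_dse_rate:.3f}% per year")
print(f"Implied ILI rate:     {implied_ili_rate:.3f}% per year")
```

The product is 3.1248, which rounds to the 3.125% used for the power calculations.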
As Look AHEAD reached the 2-year mark, the Data and Safety Monitoring Board (DSMB), charged with monitoring study progress, noted that the actual event rate in the DSE group was much lower than expected and informed the Steering Committee through the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) Project Office. The Steering Committee therefore reconsidered the power projections in light of this lower-than-expected event rate. For example, at a hypothetical 2% event rate in the DSE group over 11.5 years of follow-up (considerably lower than the anticipated 3.125% rate), the trial would have 80% power to detect an 18% effect on the primary endpoint, which is less than the originally planned 90% power but still acceptable. However, as Look AHEAD reached the 3-year mark, the DSMB observed that the event rate in the DSE group was only 0.7% per year. In response to continuing DSMB concern, the Steering Committee created an Endpoint Working Group (EPWG) to carefully consider the ramifications of the lower-than-expected event rate and to recommend alternative approaches to preserve the study’s integrity and maximize its scientific value.
The EPWG included five Look AHEAD clinical investigators well versed in clinical trials, three National Institutes of Health (NIH) scientists, and two clinical trials experts not affiliated with Look AHEAD (Table 1). Two members also served on the study’s Adjudication Committee, responsible for ascertaining endpoints. All EPWG members were masked to study results, including the three NIH scientific officers (Evans, Kaufman, and Geller). The EPWG was charged with evaluating all relevant aspects of the study bearing on the lower-than-expected event rate, including the duration of follow-up and definition of the primary endpoint. To avoid bias and to maximize masking of the study investigators, the EPWG only examined outcome data from the DSE control group and shared only summary data when making its recommendations to the Steering Committee. Neither the EPWG nor the Steering Committee was ever privy to the overall event rate (DSE and ILI combined), so data on the event rate in the DSE group did not effectively unblind them. The EPWG convened regularly to deliberate and to review updated event rates. A timeline of events and key study dates are provided in Figure 1.
The EPWG identified three possible reasons for the unexpectedly low CVD event rates:
The Steering Committee weighed all three of these hypothetical concerns during the design phase and deflated the projected event rate by 20%. In retrospect, however, the original event rate projections were simply not conservative enough.
As the EPWG considered expanding the primary composite endpoint to include additional endpoints, it posed the following five questions: (1) Does obesity consistently predict the occurrence of the endpoint in longitudinal epidemiologic studies? (2) Is the endpoint of sufficient clinical importance to serve in a composite endpoint alongside myocardial infarction, stroke, and CVD death? (3) Is the endpoint related to atherosclerotic CVD? As originally written, the protocol enshrined atherosclerotic CVD as the study’s main focus. Insofar as ‘new’ primary endpoints fall under this general category, they fit more naturally with the original conceptual framework. (4) How susceptible is the endpoint to ascertainment bias? It is crucial to avoid including in the primary endpoint any event that might be susceptible to ascertainment bias; for example, chronic stable angina might be differentially detected in the Lifestyle Intervention Group, because of more frequent exercise (which might induce symptoms) and more frequent contact with study staff (which might trigger safety measures leading to medical evaluation and treatment). (5) How acceptable is the endpoint to the study’s stakeholders and scientific and clinical audience? Midstream changes in primary endpoints are apt to be viewed with some suspicion by Look AHEAD’s stakeholders and audience: the less traditional or familiar the endpoint, the more suspicious they might be.
The ideal additional endpoint(s) would therefore fit the following five criteria: (a) related to obesity, (b) high clinical importance, (c) related to atherosclerotic CVD, (d) low risk of ascertainment bias, and (e) acceptable to stakeholders and audience. With these criteria in mind, the EPWG reviewed the following nine endpoints:
The EPWG members selected these nine endpoints based on their experience in other trials of diabetes treatment and CVD prevention and on their knowledge of the epidemiology of type 2 diabetes and obesity.
The EPWG evaluated each of these possible additional primary endpoints in detail. These deliberations are summarized in Table 2, with particular attention to how each endpoint might meet the five criteria listed above.
This deliberation led the EPWG to narrow the range of potentially acceptable endpoints to four: all-cause mortality, hospitalized angina, urgent revascularization, and hospitalized CHF. The rationale for excluding other endpoints was as follows: (1) Incident CKD, as defined by using serum creatinine to estimate glomerular filtration rate, was considered to have insufficient clinical importance; (2) cancer was too incongruent with our original endpoints and the evidence to determine which cancers are ‘obesity related’ was still unsettled; (3) LVH is primarily asymptomatic and suffers from disagreement about definition; (4) deep venous thrombosis/pulmonary embolism has a wide range of gravitas, is less related to atherosclerosis than are other vascular endpoints, and has not been widely used in trials or epidemiologic studies as part of a composite vascular endpoint; and (5) fractures have a wide range of severity and are inversely associated with body mass index.
Thus, the EPWG carefully considered the four remaining endpoints: all-cause mortality, hospitalized angina, urgent revascularization, and hospitalized CHF. These were reduced to three when the EPWG determined that virtually all of the urgent revascularizations in Look AHEAD occurred in participants who otherwise met criteria for hospitalized angina. The deliberations regarding all-cause mortality, hospitalized angina, and hospitalized CHF are summarized below.
The argument in favor was that all-cause mortality is the bottom line for patients and physicians and would provide a way to capture effects on important non-CVD events, like cancer or liver disease. The argument against was that (1) these non-CVD events are best treated as secondary endpoints, since the primary hypothesis focuses on CVD per se, and (2) all-cause mortality may introduce ‘noise’ in the form of accidental deaths and nonobesity-related cancers (e.g., brain and lung).
The argument in favor was that hospitalized angina would capture the ‘aborted’ myocardial infarctions related to secular improvements in acute cardiac care; would be consistent in tone, therefore, with recent thinking on CVD endpoints (see Luepker et al.); and is fully congruent with the original hypothesis. The argument against was that (1) it might be difficult to distinguish ‘urgent’ cases from ‘chronic’ cases, the latter of which would be susceptible to ascertainment bias in an unblinded study, and (2) it might be difficult to agree upon a specific definition. However, the current Look AHEAD definition of hospitalized angina (see Figure 2) mitigated both concerns: the definition clearly excluded chronic stable angina and the definition had already been smoothly implemented by the Adjudication Committee for several years without generating significant disagreements among committee members.
The argument in favor was that CHF is common and important, it was already a component of the composite secondary endpoint, and it might be improved by weight loss along a variety of physiologic pathways (e.g., better exercise tolerance, reduced reliance on thiazolidinediones, and improved lung function). The argument against was that (1) CHF is a heterogeneous syndrome related not only to atherosclerosis but also to hypertension, renal disease, and other causes (e.g., valvular heart disease) generally not discernible from the records available to the study adjudicators, and (2) it is often difficult to distinguish from other causes of acute dyspnea, especially chronic obstructive pulmonary disease and pneumonia.
After deliberation, the EPWG unanimously favored hospitalized angina and unanimously rejected hospitalized CHF. A large majority was against all-cause mortality, but a minority favored it. Further discussion led to a consensus that the additional primary endpoint plus all-cause mortality should be an additional major secondary analysis in the main results of this study.
Having evaluated the possible options for expanding the primary endpoint on purely scientific grounds, the EPWG then turned to the practical matter of whether the potential additional endpoint would occur frequently enough to augment the overall event rate. The coordinating center determined that adding hospitalized angina to the primary endpoint definition would approximately double the event rate in the DSE (control) group to 1.25%–1.35% per year.
After over 2 years of monitoring, research, and deliberation, the EPWG made the following recommendations to the Look AHEAD Steering Committee as a means to address the lower-than-expected event rate:
Either (a) expanding the primary endpoint to include hospitalized angina without extending study duration or (b) extending study duration by 2 years without expanding the primary endpoint would increase statistical power to detect an effect of 18% from roughly 50% to only 70%. Adopting both recommendations would push statistical power to roughly 75%, assuming no change in the event rate in the second half of the study. If the underlying event rate increased in the second half of the study, then adoption of both recommendations would increase power to above 80% – that is, into the conventional range for randomized controlled trials. EPWG members recommended applying the same effect size (18% difference in risk of the composite primary endpoint) to the expanded endpoint, because the causal pathways to hospitalized angina were so similar to the pathways leading to myocardial infarction and CVD death. The change in the definition required an adjustment in the formal statistical monitoring of the trial to preserve the prior level of α spending while transitioning to rules based on the expanded endpoint.
Even with the strongest scientific rationale, the EPWG considered how these recommendations might be viewed by the scientific audience outside Look AHEAD. In the end, the EPWG consensus was that the recommended modifications to the study protocol would be well accepted for the following four reasons: (1) they are responsive to generally recognized secular trends in CVD, (2) they are concordant with the study’s original conceptual framework and primary hypothesis, (3) they were developed and proposed by a group that was aware only of event rates in the comparison group and otherwise fully blinded to treatment effects, and (4) they were proposed 4 years before the originally planned date of study close-out (December 2012).
The EPWG presented its recommendations to the Steering Committee on 8 April 2008. The Steering Committee unanimously supported the recommendations. Throughout the decision-making process, the DSMB chose to remain silent on the specifics of the protocol change, because it believed that its unblinded status could otherwise introduce bias. Beyond asking the Steering Committee to review the low event rate in the DSE group, the DSMB never indicated (a) whether it would have ended the trial for futility had the protocol not been changed or (b) whether the specific changes allayed its original concern. Although neither the EPWG nor the Steering Committee was privy to DSMB deliberations, both groups recognized that the DSMB was in the awkward position of having prompted an inquiry into event rates and endpoint definition without having the latitude to comment on the investigators’ findings or response. From the perspective of the EPWG and the Steering Committee, the investigators had responded to the DSMB’s concern, and the only visible outcome was that the DSMB allowed the trial to continue.
Rigorous adherence to study protocol is an acknowledged cornerstone of trial methodology. In practice, however, it appears that many trialists modify their approach after the study goes into the field. In fact, Chan [13,14] and Mathieu estimate that about one-third of properly registered trials undergo a modification of the primary outcome(s) following registration. That such changes often appear to favor the intervention heightens the fear that failure to adhere to predetermined endpoints can inflate the type I error rate.
Nonetheless, trialists do recognize that there may be appropriate reasons for modifying endpoints after launch. For example, in 2002, Wittes argued for some ‘agility’ in study design, especially for long-term trials that may see secular trends in standard of care or relevant endpoints, or that may simply have been underpowered based on a priori calculations. Indeed, over the past decade, several large trials have changed endpoints after launch. Table 3 summarizes changes made in six major trials over the past 7 years. Most have done so apparently without sacrificing integrity, impact, or acceptability. The one possible exception is the PROActive trial, which drew criticism for relying heavily on a secondary endpoint that was introduced only a few weeks before the trial was closed [17,18]. In two of these trials (FIELD and PEACE), the definition of the primary endpoint was expanded to increase the event rate in the face of unexpectedly low power, similar to the situation in Look AHEAD. In two trials (NAVIGATOR [21,22] and EUROPA), the primary endpoint was narrowed in response to new data that emerged after launch.
Wittes and Evans agree that individuals who are blinded to trial results should be the ones who decide about changing the primary endpoint; unblinded individuals might be tempted to modify endpoints so as to favor positive results. Evans poses a series of questions to trialists who are considering a change: (1) What data triggered the review? (2) Have interim results been reviewed? (3) Who is making the decision? Ideally, endpoint reviews would be triggered by factors other than low event rates (e.g., secular trends in endpoint classification) and there would be no review of interim results. In Look AHEAD, internal data (lower-than-expected event rates) triggered the review, and the DSMB and unblinded Coordinating Center investigators had reviewed interim results. However, the decision makers were investigators in the EPWG and Steering Committee and NIH project scientists, all of whom were fully blinded. From Evans’ perspective, therefore, these answers seem to put Look AHEAD in a ‘gray zone’ with regard to propriety; hence we prepared this article to make our decision-making process fully transparent.
In studies of CVD, extending the duration of follow-up is a logical remedy for decreased power related to lower-than-expected event rates. Because CVD is the leading cause of death in the general population, longer follow-up will inevitably lead to more events, especially in middle-aged and older populations with underlying CVD risk factors, like type 2 diabetes and obesity. Less has been written about decision making leading to a change in study duration.
Longer duration poses a special challenge in trials designed to test a behavioral intervention for two reasons: (1) It may be difficult to maintain the contrast between the intervention group and the comparison group in later years; (2) as the cohort ages, the accumulating comorbid conditions might interfere with adherence. The decision to increase Look AHEAD’s duration by 2 years required reconsenting participants for extended duration, but the intervention was otherwise unmodified.
In the later part of the twentieth century, CVD researchers began to observe secular trends in the incidence, treatment, and natural history of coronary heart disease that had important ramifications for endpoint definition. To create a new standard for CVD endpoints across studies, an international panel of experts representing the American Heart Association, the National Heart, Lung, and Blood Institute (NHLBI), the Centers for Disease Control, the World Heart Federation, and the European Society of Cardiology released a scientific statement on ‘Case Definitions for Acute Coronary Heart Disease in Epidemiology and Clinical Research Studies’. The statement identifies a standard approach in using ‘unstable angina pectoris’ as part of a component definition of coronary heart disease along with nonfatal myocardial infarction (MI). According to this scheme, ‘unstable angina’ is defined as new or changing cardiac symptoms with positive ECG findings. The additional Look AHEAD primary endpoint of ‘hospitalized angina’ (see Figure 2) is designed to capture episodes of unstable angina as part of a composite definition of CVD. Indeed, over the past decade, most major cardiovascular trials have used composite endpoints, typically with three or four components. This is certainly true for trials of diabetes treatment.
The experience regarding endpoint modification in Look AHEAD teaches several lessons for long-term trials.
First, in a long-term trial, it would be prudent to prespecify plans for checking event rates during the course of the trial. The plans should include options that could be triggered if the observed rate is much lower than expected. The options might include (a) extending duration, (b) expanding the endpoint, or (c) stopping the trial for lack of statistical power. In Look AHEAD, the Coordinating Center conducted regular rate checks under the supervision of the DSMB, but there was no prespecified plan to react when the rates were low.
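A prespecified rate check of this kind could be as simple as the sketch below. It is illustrative only and is not the procedure used in Look AHEAD: it assumes a Poisson event count, uses a normal approximation on the log rate for the confidence interval, and flags the trial when the upper confidence limit falls below the design rate. The event count and person-years in the usage line are hypothetical, chosen only to give a rate of 0.7 per 100 person-years.

```python
from math import exp, sqrt


def control_rate_check(events, person_years, design_rate_per_100, z=1.96):
    """Compare the observed control-group event rate with the design assumption.

    Uses a normal approximation on the log rate (Poisson count assumed);
    the decision rule 'upper limit below design rate' is a hypothetical trigger.
    """
    if events <= 0 or person_years <= 0:
        raise ValueError("events and person_years must be positive")
    rate = 100.0 * events / person_years   # events per 100 person-years
    half_width = z / sqrt(events)          # standard error of log(rate), times z
    lower = rate * exp(-half_width)
    upper = rate * exp(half_width)
    return {
        "rate": rate,
        "lower": lower,
        "upper": upper,
        "below_design": upper < design_rate_per_100,
    }


# Hypothetical counts for illustration; only the 3.125 design rate comes from the text.
check = control_rate_check(events=35, person_years=5000, design_rate_per_100=3.125)
print(check)
```

Restricting the check to the control group, as the EPWG did, keeps the reviewers masked to the treatment effect.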
Second, even absent low event rates, it might be useful to prespecify a point at which the endpoint definition is reexamined in light of secular trends and prevailing practice.
Third, it might be helpful to use the number of events to drive duration, rather than setting duration a priori based on estimated event rates. Of course, this approach requires that the funding source makes a somewhat open-ended commitment in terms of duration, which may not be feasible in some circumstances.
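In an event-driven design, the target is a number of events rather than a calendar date. A minimal sketch of that target, assuming the Schoenfeld approximation for a log-rank comparison with 1:1 allocation and treating the 18% relative reduction as a hazard ratio of 0.82 (both assumptions, not stated in the text):

```python
from math import ceil, log
from statistics import NormalDist


def required_events(relative_reduction, power, alpha=0.05):
    """Total primary events needed (both arms combined) under the
    Schoenfeld approximation with equal allocation."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2)   # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    log_hazard_ratio = log(1.0 - relative_reduction)
    return ceil(4.0 * (z_alpha + z_beta) ** 2 / log_hazard_ratio ** 2)


events_for_90 = required_events(relative_reduction=0.18, power=0.90)
events_for_80 = required_events(relative_reduction=0.18, power=0.80)
print(f"Events needed for 90% power: {events_for_90}")
print(f"Events needed for 80% power: {events_for_80}")
```

Follow-up would then continue until the target count is reached, whatever the underlying event rate turns out to be.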
Finally, it is good to be as conservative as feasible about projected event rates.
Funding: This study is supported by the Department of Health and Human Services through the following cooperative agreements from the National Institutes of Health: DK57136, DK57149, DK56990, DK57177, DK57171, DK57151, DK57182, DK57131, DK57002, DK57078, DK57154, DK57178, DK57219, DK57008, DK57135, and DK56992. The following federal agencies have contributed support: National Institute of Diabetes and Digestive and Kidney Diseases; National Heart, Lung, and Blood Institute; National Institute of Nursing Research; National Center on Minority Health and Health Disparities; Office of Research on Women’s Health; the Centers for Disease Control and Prevention; and the Department of Veterans Affairs. This research was supported in part by the Intramural Research Program of the National Institute of Diabetes and Digestive and Kidney Diseases. The Indian Health Service (IHS) provided personnel, medical oversight, and use of facilities. The opinions expressed in this article are those of the authors and do not necessarily reflect the views of the IHS or other funding sources.
Additional support was received from The Johns Hopkins Medical Institutions Bayview General Clinical Research Center (M01RR02719) and the Prevention & Control Core of the Baltimore Diabetes Research & Training Center (P60KD079637); the Massachusetts General Hospital Mallinckrodt General Clinical Research Center and the Massachusetts Institute of Technology General Clinical Research Center (M01RR01066); the University of Colorado Health Sciences Center General Clinical Research Center (M01RR00051) and Clinical Nutrition Research Unit (P30 DK48520); the University of Tennessee at Memphis General Clinical Research Center (M01RR0021140); the University of Pittsburgh General Clinical Research Center (GCRC) (M01RR000056), the Clinical Translational Research Center (CTRC) funded by the Clinical & Translational Science Award (UL1 RR 024153) and NIH grant (DK 046204); and the Frederic C. Bartter General Clinical Research Center (M01RR01346).
The following organizations have committed to make major contributions to Look AHEAD: FedEx Corporation; Health Management Resources; LifeScan, Inc., a Johnson & Johnson Company; OPTIFAST® of Nestle HealthCare Nutrition, Inc.; Hoffmann-La Roche, Inc.; Abbott Nutrition; and Slim-Fast Brand of Unilever North America.