Several designs have been used for cancer screening RCTs. Most studies have employed the traditional two-arm design [1
], which aims to determine whether the screening intervention results in benefit, that is, a reduction in cause-specific mortality. In the two-arm design, participants in one arm receive the screening exam of interest while those in the other arm serve as controls, receiving either no screening exam as part of the trial or an exam routinely used in the population as a screening modality. Similar designs have been used to address questions about screening frequency [25
], age-specific effectiveness [26
], and the effect of adding one screening modality to another [27
]. One trial, the Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer Screening Trial, is testing multiple regimens using two study arms. Participants in the intervention arm received screening exams for three cancers: all participants received lung (chest x-ray) and colorectal (flexible sigmoidoscopy) cancer screens, women also received ovarian cancer screens (transvaginal ultrasound and CA-125 evaluation), and men also received prostate cancer screens (digital rectal exam and PSA level evaluation). Participants in the control arm received no trial exams [32
]. PLCO provides an excellent example of many of the issues confronted when designing and conducting an RCT [33
A major design issue for PLCO involved deciding whether to conduct one trial or four separate trials, one for each cancer site. A comparison of costs and logistics revealed that evaluating screening modalities for the four cancers in one trial was less costly and more efficient, as one administrative structure and coordinating center could be used. Another design issue involved the evaluation of multiple tests for a single cancer site. Rather than evaluating each test for the same cancer site in a different arm, it was decided that combinations of screening tests would be evaluated. For example, digital rectal exam (DRE) and PSA testing were administered together at each screening round, rather than being evaluated in separate trials. The primary reasons for combining modalities were cost constraints and the desire to evaluate the regimen of combined interventions. If a combined regimen does not work, testing the individual procedures would not be warranted. If the combined regimen does work, each test could be independently evaluated in separate RCTs.
Several designs were considered for PLCO. The two primary competitors were the reciprocal control design and the all-versus-none design. The reciprocal control design would have had three arms: one devoted to screening for prostate or ovarian cancer, one to colorectal cancer screening, and one to lung cancer screening. Since screening would have been undertaken for only one cancer site (per gender in the case of the prostate/ovarian arm) in any given arm, participants in the other two arms would have served as controls. This design was ultimately deemed infeasible because of the cost of screening all participants. Furthermore, the reciprocal control design was expected to result in substantial contamination: because every participant would be screened for something, all would be aware of the other trial-administered screening tests, which they might then seek out. The all-versus-none design, with participants randomized to one of two arms, was therefore chosen. In the spirit of a multiphasic screening endeavor, one arm served as a control, while screens for all cancer sites of interest were administered in the other arm. Use of the all-versus-none design assumes, reasonably, that the screening tests for each cancer do not detect any of the other cancers of interest, and that the endpoints (death from each of the four cancers) are not related. It was further decided to employ the so-called “stop screen” approach, in which screening is performed for a fixed number of years or screening rounds and then stopped, but follow-up continues for endpoint ascertainment [34
]. This approach was chosen because it had been used successfully in breast and colorectal cancer screening trials, and because it is the only design that allows direct assessment of overdiagnosis. Overdiagnosis is the identification through screening of cancers that never would have surfaced clinically in the absence of screening. This phenomenon has been observed repeatedly [35
] and must now be considered the rule rather than the exception in cancer screening.
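The logic behind assessing overdiagnosis in a stop-screen trial can be illustrated with a short calculation. Once screening has stopped and follow-up has continued long enough for the control arm to "catch up" on cancers that would have surfaced clinically, any persistent excess in cumulative incidence in the screened arm is commonly read as an estimate of overdiagnosis. The Python sketch below is a minimal illustration under that assumption; the class and function names and the counts are hypothetical and are not taken from PLCO or any other trial.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArmCounts:
    """Cumulative cancer diagnoses in one trial arm at the end of follow-up."""
    participants: int
    cancers: int

    @property
    def cumulative_incidence(self) -> float:
        # Proportion of the arm diagnosed with the cancer of interest.
        return self.cancers / self.participants


def excess_incidence(screened: ArmCounts, control: ArmCounts) -> float:
    """Absolute excess cumulative incidence (screened arm minus control arm)."""
    return screened.cumulative_incidence - control.cumulative_incidence


def overdiagnosis_fraction(screened: ArmCounts, control: ArmCounts) -> float:
    """Excess cases as a share of all cancers diagnosed in the screened arm.

    Assumes follow-up after the last screen is long enough that lead time
    no longer explains the excess (an assumption, not a guarantee).
    """
    if screened.cancers == 0:
        raise ValueError("screened arm has no diagnosed cancers")
    # Cases expected in the screened arm if it had the control arm's incidence.
    expected = control.cumulative_incidence * screened.participants
    return (screened.cancers - expected) / screened.cancers


# Hypothetical counts, for illustration only.
screened = ArmCounts(participants=10_000, cancers=550)
control = ArmCounts(participants=10_000, cancers=500)

print(f"excess per 1,000: {1000 * excess_incidence(screened, control):.1f}")
print(f"overdiagnosis fraction: {overdiagnosis_fraction(screened, control):.3f}")
```

Other denominators are possible (for example, screen-detected cancers only), and the choice changes the reported fraction, so any such estimate should state which one it uses.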
In addition to allowing for assessment of overdiagnosis, a stop-screen design is often necessary because resources typically are not available to screen throughout the life of the study. The number of screening rounds therefore must be decided upon at the design stage in this type of trial, although changes can be and often are made as the study progresses. In PLCO, the initial regimen of four annual screens for PSA and CA-125, later expanded to six, was a trade-off between the number of screens expected to be necessary to produce an effect (should one exist) and available and anticipated resources. In the early years of PLCO, sigmoidoscopy was administered at the first and fourth annual screens, although the fourth annual screen eventually was replaced with a screen at the sixth annual visit to reflect emerging clinical practice. An annual interval between screens is typically chosen in RCTs, as it is the interval most likely to be used once mass screening programs are established in the community. Compared to less-frequent screening, an annual interval increases the likelihood of detecting a broad spectrum of preclinical conditions, thus providing a better representation of the natural history of the cancers under study. A longer interval is less desirable in most instances, as it might allow some rapidly growing lesions, which are likely to be lethal but could be cured if found early, to escape detection. An exception to the use of an annual interval is found in endoscopic screening for colorectal cancer: our current understanding of colorectal cancer progression resulted in a five-year screening interval in PLCO.
Duration of follow-up, the time from randomization to cessation of event ascertainment, is another important design parameter. In PLCO, a minimum of 10 years of follow-up was initially chosen to allow for sufficient time for mortality reductions, should any exist, to emerge. Although follow-up intervals of at least 7 years were typical in breast cancer screening trials [1
], it was assumed that the longer natural history of prostate cancer, and perhaps of other cancers under study, warranted a longer follow-up period. In the National Lung Screening Trial (NLST), modeling of the disease and screening processes resulted in the decision to capture endpoint events over an approximately 7-year period [19
]. It must be recognized that these and other design parameters were chosen using the best information available at the time of design and may be subject to change as a result of data gathered during the trial and other information. In the Minnesota FOBT trial [25
], the screening and follow-up periods chosen at the beginning of the trial were both ultimately extended, which gave a mortality reduction the opportunity to emerge.