The mission of the National Cancer Institute’s Early Detection Research Network (EDRN) is to identify and validate cancer biomarkers for clinical use. Since its inception, EDRN investigators have learned a great deal about the process of validating biomarkers for clinical use. Translational research requires a broad spectrum of research expertise, and coordinating collaborative activities can be challenging. The EDRN has developed a robust triage and validation system that serves the roles of both “facilitator” and “brake”.
The system consists of (i) establishing a reference set of specimens collected under Prospective-specimen-collection-Retrospective-Blinded-Evaluation (PRoBE) design criteria; (ii) using the reference set to pre-validate candidate biomarkers before committing to full scale validation; (iii) doing full scale validation for those markers that succeed in pre-validation; and (iv) ensuring that the reference set is sufficiently large in numbers and volumes of sample that future candidate biomarkers can also be studied with it. This system provides rigorous and efficient evaluation of candidate biomarkers and biomarker panels. Reference sets should also be constructed to enable high quality biomarker discovery research.
We describe the process of establishing our system and hope it will serve as an example of how to validate biomarkers for clinical application. We also describe the biospecimen reference sets that are available in the EDRN and hope this will encourage the biomarker research community, from academia or industry, to use this resource to advance their biomarkers into clinical use.
The Early Detection Research Network (EDRN) is a research consortium established by the National Cancer Institute in 1999 with the mission to translate promising cancer biomarkers to clinics for use in assisting with medical decisions. Of particular interest are biomarkers for cancer risk prediction, diagnosis, early detection, and prognosis. The main drive for establishing the consortium was the observation that patients with early stage cancer had better survival rates than those with late stage cancer. It was assumed that shifting cancer diagnosis to an earlier stage may improve cancer patients’ survival. In addition, it appeared that there were many published candidate biomarkers but very few whose clinical usefulness had been validated. Given the tremendous advances that had occurred in genomics and proteomics, the hope was that these newly discovered biomarkers would lead to substantial improvements in early diagnosis and prediction of cancer risk, thereby reducing the burden of cancer on the US population. The mission of the EDRN is to facilitate this process.
Since its inception, EDRN investigators have learned a great deal about the process of validating biomarkers for clinical use. Translational research requires a broad spectrum of research expertise, and coordinating collaborative activities can be challenging. The players include molecular biologists, clinicians and population scientists whose cultures and research methods are quite different. Nevertheless, with a common goal, we have established a robust system for biomarker triage and validation that helps ensure good biomarkers are chosen and rigorously validated. Here we describe the process of establishing our system and hope it will serve as an example of how to validate biomarkers for clinical application. We also describe the biospecimen reference sets that are available in the EDRN and hope this will encourage the biomarker research community, from academia or industry, to use this resource to advance their biomarkers into clinical use.
During 2000–2002, there were many reports on protein profiling using surface-enhanced laser desorption/ionization (SELDI) for cancer diagnosis. As many of these strongly indicated potential clinical utility (1–3), the EDRN decided to validate SELDI protein profiling for prostate cancer diagnosis. Because this technology was unconventional in that it did not identify informative proteins but relied on a protein mass spectrometry pattern to distinguish cancer patients from controls, EDRN investigators opted for a 3-stage validation process (4). The first stage was to demonstrate that SELDI protein profiles, when applied to the same serum pool, are reproducible across different laboratories. After many efforts at standardization, this stage was successfully completed (5), demonstrating that the profiles are indeed reproducible. The second stage was to determine if the serum profile could distinguish subjects with biopsies that were positive for prostate cancer from subjects with negative biopsies, using high quality specimens from a well-designed multi-center study. Results were negative (6); the study did not reproduce previously identified informative peaks or identify new informative peaks that distinguished prostate cancer patients from patients with negative biopsies for prostate cancer. The third stage, to determine if prostate cancer could be detected early, would have been based on Prostate Cancer Prevention Trial (PCPT) sera, but was then not pursued.
The first two stages of this SELDI validation study taught investigators several important lessons. The first was identification of an important source of bias that may have been responsible for the early “promising” results that did not subsequently validate. This biasing factor was the length of time and conditions under which serum was stored for prostate cancer cases and non-prostate cancer controls (7). It turned out that protein peak intensities were negatively associated with storage length (7). Unfortunately, sera for cancer patients tended to have been collected over many years and used multiple times, while sera for controls tended to have been collected recently and subjected to fewer freeze-thaw cycles. In the stage-2 validation study that EDRN conducted, case and control samples were chosen to be similar with regard to storage length and number of freeze-thaw cycles (no more than one freeze-thaw cycle), as well as other clinical and epidemiologic factors (age and race).
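The kind of matching described above can be sketched as a simple check on a specimen inventory. The following Python sketch is illustrative only: the `Specimen` record, its field names, the choice of the median as a summary, and the toy values are all assumptions made for this example, not EDRN data structures or data. It keeps specimens with no more than one freeze-thaw cycle and reports how far apart cases and controls are in storage time.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class Specimen:
    """Hypothetical inventory record; field names are assumptions."""
    specimen_id: str
    is_case: bool             # True = positive biopsy, False = negative biopsy
    storage_years: float      # time spent in frozen storage
    freeze_thaw_cycles: int   # number of prior freeze-thaw cycles


def eligible(specimens, max_freeze_thaw=1):
    """Keep specimens with at most `max_freeze_thaw` prior freeze-thaw cycles."""
    return [s for s in specimens if s.freeze_thaw_cycles <= max_freeze_thaw]


def storage_gap(specimens):
    """Median storage time of cases minus median storage time of controls."""
    cases = [s.storage_years for s in specimens if s.is_case]
    controls = [s.storage_years for s in specimens if not s.is_case]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    return median(cases) - median(controls)


# Made-up records, for illustration only.
toy = [
    Specimen("A", True, 6.0, 3),
    Specimen("B", True, 2.0, 1),
    Specimen("C", True, 1.5, 0),
    Specimen("D", False, 1.0, 0),
    Specimen("E", False, 2.0, 1),
]
kept = eligible(toy)
print(len(kept), storage_gap(toy), storage_gap(kept))
```

A gap near zero after filtering suggests that storage time is balanced between groups; a real study would also compare the full distributions and the other matching factors (age and race).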
The second lesson was the crucial importance of having available an unbiased well designed set of specimens to evaluate a marker. Although we suspected the source of bias in the promising preliminary studies, we needed high quality specimens to definitively test the marker. It took a long time (2 years) to get the required number of subjects satisfying the tight inclusion and exclusion criteria from existing repositories. We could have avoided a 2 year delay in getting an answer to the value of the SELDI marker had we had a reference set available.
In 2004, after lengthy discussions, the EDRN’s Genitourinary (GU) Collaborative Group decided to establish a prostate cancer reference set that could address the various issues identified in the SELDI validation study. The first decision was that the reference set should be designed with a clear clinical application in mind and that specimens should be collected from the target clinical population without any potential for bias. Three clinical application settings were identified. The first was to help men who were candidates for biopsy under current clinical guidelines make a decision about whether or not to undergo the biopsy procedure. The intent was to reduce the number of unnecessary biopsies performed without reducing the detection rate of prostate cancer. The second setting was to help men who had a negative biopsy make decisions about having subsequent biopsies. The third setting was to aid men in making treatment decisions after a positive biopsy for prostate cancer. While the third clinical question is probably the most important in the clinical care of prostate cancer, answering this question requires long term follow up in order to collect mortality data. The GU Group decided to initially focus on the first application, men recommended for their first biopsy. Prostate specific antigen (PSA) would not be a good marker in this population, as these men were most likely candidates for biopsy because of elevated serum PSA levels. This clinical application presents a realistic and practical problem with potential impact for individuals and for the population.
Moreover the study design for this application is straightforward. Serum samples could be collected prospectively from men prior to biopsy and outcomes from the pathology report would be available shortly thereafter. This feature eliminates common sources of biases in case-control designs where specimens are collected after the disease status is known.
The second decision was that the reference set would be used for triaging biomarkers. Investigators, within or outside of EDRN, would be invited to submit their biomarkers to be evaluated in a blinded fashion on the reference set as a pre-validation step. Note that if such a reference set had been available, the full scale SELDI validation study could have been avoided. If a marker performed well in the reference set, a full scale validation study would be undertaken.
The third decision was that although the sample size of the pre-validation reference set was chosen to be 120 (60 men with positive biopsy and 60 men with negative biopsy), recruitment would continue to allow a validation study to be completed in a timely fashion if some markers were found to merit validation. The unbiased selection of cases and controls, the uniform serum collection and the fact that the same type of specimens were used for pre-validation and validation studies ensure that there is a high probability that the performance of the marker observed in the pre-validation set will hold up in the full scale validation study and that the biomarker, if validated, will have clinical utility.
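To give a sense of the precision a pre-validation set of this size can offer, the sketch below computes a Wilson score confidence interval for a proportion such as sensitivity estimated from 60 positive-biopsy men. This is an illustration under stated assumptions, not the calculation the GU Group used to choose the sample size (the source does not describe it); the function name `wilson_interval` and the example count of 48 detected cases are hypothetical.

```python
import math


def wilson_interval(successes, n, z=1.959964):
    """Wilson score interval for a binomial proportion.

    `z` defaults to the normal quantile for a two-sided 95% interval.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must lie between 0 and n")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Hypothetical example: a marker flags 48 of the 60 positive-biopsy men.
low, high = wilson_interval(48, 60)
print(f"sensitivity 0.80, 95% CI {low:.2f} to {high:.2f}")
```

With 60 subjects per group the interval spans roughly 20 percentage points, which is adequate for triage but is one reason a larger full scale validation study follows for markers that pass.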
In 2005, similar discussions occurred in the other three EDRN organ-specific collaborative groups (lung, gastrointestinal, and breast/gynecologic) and a number of reference set studies were designed and executed thereafter. One important development is the growing expectation that any EDRN validation study must also produce a reference set that can be used for validation of future biomarkers for the same clinical application. Indeed this has become an important criterion in evaluating and approving proposals for validation studies.
The prostate cancer pre-validation set was quickly established using specimens contributed from three EDRN Clinical and Validation Centers (CVCs). Biomarker discovery laboratories were invited to present their markers to the EDRN collaborative group that ranked them and voted for access to the pre-validation set. Markers from four labs were approved for pre-validation and aliquots of serum were sent to each of them in a blinded fashion. The markers approved for pre-validation were hk2, hk4, hk11, TSP-1, %[-2]proPSA, and EPCA2. Assay results were sent to the EDRN Data Management and Coordinating Center (DMCC) for analysis and comparison. Only %[-2]proPSA passed the pre-validation stage (8), and it went on for a successful full scale validation study (9). This biomarker received FDA approval in 2012.
Widespread PSA screening complicates biomarker evaluation for prostate cancer risk prediction or diagnosis because in current clinical practice a biopsy is usually triggered by elevated serum PSA levels. The three clinical applications on which the GU group has focused are practically important given the reality of PSA screening. If PSA screening patterns or criteria for post-PSA work-up change, the target population for application of a biomarker might also change and the validation results from the current reference set might not generalize. On the other hand, constructing a reference set for general population screening would require obtaining biopsies from men who according to current practice would not undergo biopsy. This is only feasible in some large prostate cancer prevention trials that require biopsies from all participants regardless of their PSA values, such as occurred in the Prostate Cancer Prevention Trial (PCPT). Teaming up with such large cohorts would allow EDRN to address that question.
Another difficulty in designing a reference set for general population screening of prostate cancer is the high prevalence of indolent prostate cancers. It seems more efficient to first address the third clinical application proposed above and learn more about aggressive prostate cancer before designing a general population screening biomarker study. Focusing on the clinical applications for which studies are more feasible and more likely to give fruitful results is an important lesson learned in the EDRN.
Most EDRN reference sets are prospectively designed and coordinated with multi-center specimen collections. For each prospectively established reference set, a protocol and manual of operations (MOP) were developed that were internally and externally reviewed. Training on the MOP and site audits for quality control were conducted by the EDRN DMCC staff. An EDRN Standard Operating Procedure (SOP) was developed that requires blood tubes to sit for 30–60 minutes at room temperature after collection; if not processed immediately, they may then be stored in a 4 °C refrigerator for no more than 4 hours. Long term storage is at −80 °C or colder. There are variations in specimen processing requirements in some reference set protocols, but this information, along with the SOP, is available on the EDRN public portal (Google EDRN → Resource). Adherence to the protocols is usually excellent for prospective studies coordinated by the DMCC because of training and auditing. Actions were taken when deviations were identified. In one study, one collection site was suspended and its specimens were not used because of serious protocol deviations discovered by auditing. The extent of adherence to the protocol varies for retrospectively constructed reference sets, i.e. those for which sites identify existing specimens that satisfy the protocol. The DMCC has been examining all retrospectively constructed reference sets and documenting identified protocol deviations. The findings will be added to the EDRN public portal page describing each reference set. Specimens were shipped to the Frederick National Laboratory of the National Cancer Institute and clinical data were stored centrally at the DMCC using EDRN’s Validation Study Information Management System (VSIMS).
Most EDRN reference sets adhere to the Prospective-specimen-collection-Retrospective-Blinded-Evaluation (PRoBE) study design criteria (10) and are, therefore, strongly unbiased for their intended clinical application context, i.e. specimens are collected from a clinically relevant and well defined cohort in the absence of knowledge about patient outcome. Constructing PRoBE designed reference sets is efficient because the lengthy specimen collection period can begin even before markers become available for testing. For example, the industrial partner who owns the license for %[-2]proPSA included a subset of the data from the EDRN reference set in their Food and Drug Administration (FDA) approved trial (they restricted the range of PSA and so did not include the whole reference set). Moreover, combinations of markers, tested at different times but on the same reference set specimens, can be evaluated with such reference sets. Another attractive feature of PRoBE designed reference sets is that they avoid biases and ethical issues associated with studies that obtain biomarker test results at the time of recruitment, where diagnostic workup and patient management may be influenced by biomarker values. In the PRoBE design, biomarker tests are obtained retrospectively after the patient has been treated. Pepe et al (10) provide details and extensive discussion of the PRoBE design.
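The specimen-handling limits stated in the SOP earlier in this section (30–60 minutes at room temperature, no more than 4 hours at 4 °C, long term storage at −80 °C or colder) lend themselves to an automated check during site audits. The sketch below is a minimal illustration; the `HandlingRecord` type, its field names, and the `sop_deviations` function are hypothetical and do not describe the DMCC's actual audit tooling or VSIMS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HandlingRecord:
    """Hypothetical per-tube handling log; field names are assumptions."""
    room_temp_minutes: float   # time the tube sat at room temperature
    refrigerated_hours: float  # time held at 4 C (0 if processed immediately)
    storage_temp_c: float      # long term storage temperature in Celsius


def sop_deviations(record):
    """Return a list of deviations from the handling limits quoted in the text."""
    problems = []
    if not 30 <= record.room_temp_minutes <= 60:
        problems.append("room-temperature time outside 30-60 minutes")
    if record.refrigerated_hours > 4:
        problems.append("refrigerated hold longer than 4 hours")
    if record.storage_temp_c > -80:
        problems.append("long term storage warmer than -80 C")
    return problems


# Made-up records, for illustration only.
compliant = HandlingRecord(room_temp_minutes=45, refrigerated_hours=2, storage_temp_c=-80)
deviant = HandlingRecord(room_temp_minutes=20, refrigerated_hours=5, storage_temp_c=-20)
print(sop_deviations(compliant))
print(sop_deviations(deviant))
```

Returning a list of deviations rather than a single pass/fail flag mirrors the documentation of identified protocol deviations described above, since each deviation can be recorded alongside the reference set.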
Discussions are currently underway on expanding use of reference sets for biomarker discovery research. One striking observation from our experience with reference sets is the drastic decrease in performance of many biomarkers that show very promising results at the discovery phase. As we have observed, bias from specimens used in discovery studies is difficult to disentangle if the specimen collection does not meet PRoBE design criteria. Although it is helpful that the EDRN has reference sets to pre-validate these markers and eliminate false leads, it would be more efficient if biomarkers that moved out of the discovery phase had a higher chance of retaining their performance. Using high quality specimens that come from the target population for the intended clinical application at the discovery phase will increase the chances of better biomarkers moving into the validation pipeline, thereby increasing the chance of successful validation.
Even good reference sets have limitations. For example, they are derived from specific institutions, and observed biomarker performance may not generalize to other populations. Consequently, external validation of markers is also necessary. For the EDRN prostate cancer reference set, the prevalence of positive biopsy is over 40%, somewhat higher than that in the general population of men who currently undergo biopsy for prostate cancer diagnosis at their local clinics. This suggests that the EDRN reference set may not represent the general population and argues for external validation of biomarkers that validate on this set.
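One reason prevalence matters is that predictive values depend on it even when a marker's sensitivity and specificity are unchanged. The sketch below applies Bayes' rule to show this; the sensitivity and specificity values and the lower comparison prevalence of 25% are hypothetical numbers chosen for illustration, and only the "over 40%" figure comes from the text.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Positive and negative predictive values from Bayes' rule.

    All three inputs must be strictly between 0 and 1.
    """
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("prevalence", prevalence)):
        if not 0 < value < 1:
            raise ValueError(f"{name} must be strictly between 0 and 1")
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical marker: 90% sensitivity, 50% specificity.
for prevalence in (0.40, 0.25):
    ppv, npv = predictive_values(0.90, 0.50, prevalence)
    print(f"prevalence {prevalence:.2f}: PPV {ppv:.3f}, NPV {npv:.3f}")
```

Under these assumed values the positive predictive value falls noticeably as prevalence drops, which illustrates why performance observed on a higher-prevalence reference set should be confirmed externally.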
For less prevalent cancers, such as ovarian cancer and pancreatic cancer, construction of PRoBE designed reference sets for early detection may not be feasible at the discovery stage or even at a non-pivotal validation trial stage. Precious samples are likely to be saved for validation studies. Community healthcare systems have much better long term follow up than tertiary healthcare centers, and their results are more generalizable. Partnering with them will greatly enhance our ability to establish reference sets for early detection of less prevalent cancers. Even when full adherence to PRoBE standards cannot be achieved, as occurred with several EDRN reference sets constructed in earlier years, some of the PRoBE design principles can and should still be incorporated into construction of specimen reference sets. These include ensuring that specimens are collected according to a rigorous protocol, ensuring that data are documented on factors that might influence biomarker values or disease characteristics, and incorporating specimens from multiple centers. Such specimen sets would be of much greater value than those used in most discovery studies at present. Blinding and randomization should also be reliably implemented centrally; in many published biomarker discovery papers, by contrast, one cannot tell if or how blinding and randomization have been done.
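Central blinding and randomization of the kind recommended above can be sketched in a few lines. The example below is a minimal illustration, not the EDRN's procedure: the function names, the `BL0001`-style codes, and the specimen identifiers are hypothetical. A coordinating center shuffles the specimens, issues opaque codes to the assay laboratory, and keeps the key so that results can be unblinded only after they are returned.

```python
import random


def blind_aliquots(specimen_ids, seed=None):
    """Shuffle specimens and assign opaque codes.

    Returns the key {blinded_code: specimen_id}, to be held centrally.
    The laboratory receives only the blinded codes.
    """
    ids = list(specimen_ids)
    if len(set(ids)) != len(ids):
        raise ValueError("specimen identifiers must be unique")
    rng = random.Random(seed)
    rng.shuffle(ids)  # randomize so code order carries no case/control signal
    return {f"BL{i:04d}": sid for i, sid in enumerate(ids, start=1)}


def unblind(results, key):
    """Map assay results keyed by blinded code back to specimen identifiers."""
    return {key[code]: value for code, value in results.items()}


# Made-up identifiers, for illustration only.
specimens = ["case-01", "ctrl-01", "case-02", "ctrl-02"]
key = blind_aliquots(specimens, seed=7)
lab_results = {code: 1.0 for code in key}  # placeholder assay values
print(sorted(unblind(lab_results, key)))
```

Recording the seed and archiving the key centrally makes it possible to report afterwards exactly how blinding and randomization were done.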
There should be a policy for access to these high quality specimens for biomarker discovery. The policy could be similar to that used for access to pre-validation sets, except that for discovery purposes one does not need preliminary data indicating a potential clinical application. However, the policy should require a strong rationale for the approach, a strong design for the proposed study, evidence of a robust assay, and agreement that samples will be blinded when performing the assay, with unblinding only after the assay results have been sent to the central data repository and archived for later combination with other markers to construct marker panels. Panel construction can be postponed for a certain period to allow the marker developers to exploit their intellectual capital in their markers, but there should be a pre-planned use of the reference set data in order to try to identify panels with significantly better performance than any single marker. The availability of high quality specimens for biomarker discovery, triage and pre-validation, and validation in a coordinated system will enhance the translation of biomarkers from bench to clinical use and benefit patients.
“This is an un-copyedited authored manuscript copyrighted by the American Association for Clinical Chemistry (AACC). This may not be duplicated or reproduced, other than for personal use or within the rule of ‘Fair Use of Copyrighted Materials’ (section 107, Title 17, U.S. Code) without permission of the copyright owner, AACC. The AACC disclaims any responsibility or liability for errors or omissions in this version of the manuscript or in any version derived from it by the National Institutes of Health or other parties. The final publisher-authenticated version of the article will be made available at http://www.clinchem.org 12 months after its publication in Clinical Chemistry.”