In the past decade, biomarker discovery has become ubiquitous in cancer research. However, despite this interest in biomarker research, few newly-characterized biomarkers have emerged as clinically-used entities. Here, we review the current state of biomarker research in cancer and identify challenges that stall many biomarker discovery efforts. We outline a model for systematic biomarker discovery, exemplified by recent efforts in prostate cancer, in which bioinformatics plays a central role in identifying promising new candidate biomarkers. Finally, we review the role of the National Cancer Institute’s Early Detection Research Network (EDRN) in biomarker studies and the importance of EDRN-led efforts to establish a research standard for more effective biomarker discovery efforts.
The past decade has witnessed a great surge in biomarker research, with thousands of research articles nominating an ever-increasing number of putative cancer biomarkers. Indeed, in 2010 over 20,000 papers investigated biomarkers, including over 8,000 biomarkers for cancer and about 600 that were suggested as useful for early detection. Such a tremendous research output has fueled expectations that effective biomarker-based diagnostics would rapidly take form. Yet, this has not been the case, and only a handful of newly-characterized biomarkers have been approved by the Food and Drug Administration (FDA). Here, we review general problems with biomarker research that have led to this low success rate and highlight promising biomarkers emerging in the field of prostate cancer.
The central goal of biomarker research is to elucidate a molecule whose measurement provides information about a patient’s disease or risk of disease. Defining the intended clinical application is, therefore, of paramount importance if a biomarker is to be effectively translated into a clinical setting. However, this step is neglected for many biomarker candidates, thereby complicating clinical translation and making independent validation of candidate biomarkers difficult. For example, merely discovering a biomarker for cancer versus normal patients is not adequate without defining several key parameters: how is the biomarker measured in human patients (tissue, serum, urine, etc.), what patient population receives the test, and how is the test interpreted? These issues plague many previous studies, limiting the impact of biomarker discoveries that are not aligned with the intended clinical use and leading to poorly-designed biomarker studies that commonly fail [4,5].
Another challenge of biomarker research has been the role of serendipity in biomarker discovery. Use of convenience samples in discovery continues to complicate the generalizability of biomarkers for a specific use since these biomarkers suffer from being discovered by chance and are plagued by bias, leading to low rates of validation by independent groups. In addition, the continued use of non-human sources, such as cell lines and animal models, further complicates biomarker discovery by introducing biomarkers not reproducible in humans.
Although it is encouraging that the number of approved cancer biomarkers has increased modestly in the last 3 years, progress remains slow when the number of approved biomarkers is compared with the number published each year. Because of a perceived underachievement of long-term returns on investment, many private industry research programs avoid biomarker research, in stark contrast to drug discovery efforts. This attitude is in accordance with the generally low reimbursement for diagnostics when compared to drugs. Taken together, there is a need for systematic discovery and validation of biomarkers in both public and private settings.
The process of biomarker discovery ought to be no different than drug discovery. It requires sustained, long-term efforts starting from basic research all the way to FDA filing. It is estimated that it costs more than one billion dollars to find one effective cancer drug. Biomarkers for early detection may be even harder because they entail finding the disease among asymptomatic populations. The cost of developing robust diagnostic FDA-approved biomarkers is speculative because there is a lack of published historical data.
In light of these challenges and difficulties, the NCI established a program called the Early Detection Research Network (EDRN) with the goal of developing and evaluating biomarkers for cancer risk, early detection, diagnosis and prognosis. EDRN promotes a vertical approach to conducting biomarker research, whereby biomarkers are developed in Biomarker Developmental Laboratories (BDLs), refined and cross-validated by Biomarker Reference Laboratories (BRLs), and validated in collaboration with Clinical Validation Centers (CVCs), all within one organization. The focus is on coordinating multiple resources with a goal of minimizing the barriers to the rapid and efficient transfer of biomarker developments between entities.
The process of biomarker discovery is guided by a statistical five-phase criterion, termed a prospective-specimen-collection, retrospective-blinded-evaluation (PROBE) design (Table 1) [9–11]. This five-phase approach has established both a scientific standard and a roadmap for successfully translating biomarker research from the laboratory to the clinic. Phase 1 is devoted to discovery and involves exploratory studies to identify potentially useful biomarkers. Phase 2 refers to validation, where biomarkers are tested to determine their capacity for distinguishing between people with cancer and those without. Phase 3 determines the capacity of a biomarker to detect preclinical disease by testing the marker against tissues collected longitudinally from research cohorts. Phase 4 includes prospective screening studies on biomarker performance in large populations, and determines the false referral rate. Phase 5 refers to the final period, in which large-scale population studies evaluate both the role of the biomarker for cancer detection and its overall screening impact. EDRN has been charged with conducting studies up to Phase 3 (Table 1). Studies conducted beyond Phase 3 require collaborations with the clinical trial community, which may have appropriate cohorts and samples for validating biomarkers per our guidelines [9,10].
EDRN uses these guidelines for triaging biomarkers and distinguishing useful biomarkers from those which do not show clinical utility. When judged using the PROBE criteria, it becomes apparent that the vast majority of candidate biomarkers found in the published literature are in Phase 1, lacking the validation or downstream development and refinement necessary for clinical translation of a biomarker discovery effort. While these candidate biomarkers may reflect promising research findings, the issue then arises of how to select individual biomarkers for further evaluation, because biomarker development is a long, labor-intensive process not suitable for most candidates. In light of this, how do we determine which biomarkers are most promising for clinical translation? And at what point in the biomarker development process should this selection occur?
Although there are no definitive answers to these questions, the EDRN has provided the first step by establishing a cohort of clinical reference samples (CRS) to facilitate the initial stage of biomarker validation. Independent validation using CRS can inform a “go” or “no go” decision. In addition, considerations about the method of biomarker evaluation in patients also impact a “go” or “no go” decision, as tissue-based biomarkers often require a biopsy or other invasive procedure. Biomarkers measured in serum or urine therefore offer a less invasive (and often preferable) alternative to tissue-based biomarkers.
The goal of the EDRN is twofold: (1) to discover novel biomarkers, and (2) to evaluate the efficacy and appropriateness of existing biomarkers. The example of prostate cancer here is instructive. Widespread screening of asymptomatic men by serum prostate-specific antigen (PSA) level has dramatically increased the incidence of prostate cancer in Western populations following its adoption in the early 1990s [12–14]. But PSA is a flawed biomarker, plagued by a low specificity and positive predictive value [15,16]. In fact, typically 66–75% of men with an elevated PSA (>4 ng/mL) do not have prostate cancer. To help improve prostate cancer screening efforts, EDRN investigators identified about 108 biomarkers that could potentially be tested as an adjunct to PSA (S.S., unpublished data). Of these 108, about 58 were measured with reproducible tests that yielded reproducible data (S.S., unpublished data). After critical review of these 58, only five were selected for further validation, indicating that very few biomarkers exhibit promise for clinical translation (S.S., unpublished data).
The EDRN, therefore, emphasizes the need for systematic, rational discovery of biomarkers using a well-designed research plan and set of clinical samples. Moreover, biomarkers with well-characterized biological roles in tumorigenesis are of particular interest, as these represent functional components of cancer biology. Candidate biomarkers identified through the use of convenience samples should be approached with caution, as success in biomarker development results from systematic approaches that provide informative data, address sound clinical questions, and plan for implementation in response to clinical needs. A clinical validation study of new biomarkers should be entertained only if likely to provide significant enhancement over the current standard of care; in other words, a poorly designed clinical study is worse than no study at all. Furthermore, novel and relevant biomarkers should be sought by prospectively designing clinical studies with that purpose at the forefront rather than piggybacking on ongoing studies. Here, we review the discovery of novel prostate cancer biomarkers as an example of systematic, rational biomarker development.
Over the past decade, emerging biomarkers in prostate cancer have demonstrated the power of rational and systematic nomination of biomarker candidates. Our research group at the University of Michigan has employed bioinformatics-driven analyses of prostate cancer RNA to discover novel components of prostate cancer biology. By integrating large-scale expression profiling datasets of human prostate cancer, we have defined a set of genes with dramatic overexpression in a subset of cases (outliers) that are often overlooked by conventional overexpression or copy-number analyses. Such efforts facilitated the discovery of common chromosomal translocations of the ETS family of transcription factors (ERG, ETV1, ETV4, ETV5), which occur in approximately 50% of patients [18,19]. These translocations result in high expression of ETS factors by generating a gene fusion with the ETS factor under the control of androgen-responsive promoters such as TMPRSS2 [18,20]. The most common of these gene fusions, TMPRSS2-ERG, accounts for ~90% of all ETS fusions found in prostate cancer (Fig. 1A and 1B) [17,20], and in vitro studies have demonstrated that induction of AR signaling in prostate cancer cells results in genomic colocalization of the TMPRSS2 and ERG genes, which thereby facilitates the formation of the gene fusion in the presence of DNA damage (Fig. 1A).
Because the TMPRSS2-ERG gene fusion is both tissue-specific and cancer-specific, it represents a promising biomarker for prostate cancer. Indeed, these chromosomal rearrangements offer distinct advantages over the current biomarkers for prostate cancer, such as PSA, because their detection is indicative of cancer, unlike PSA, and their functional role in promoting prostate cancer progression is well-characterized. Currently, detection of ERG fusions (by fluorescence in situ hybridization), ERG protein (by IHC), or TMPRSS2-ERG mRNA in urine can all be used as diagnostic tools for prostate cancer [22–24].
Yet, 50% of prostate cancers do not harbor an ETS gene fusion, meaning that these fusions lack sensitivity as a screening assay. Similar bioinformatics analyses of ETS-negative prostate cancers led to the identification of the serine protease inhibitor SPINK1 as an outlier in ~20% of ETS-negative cancers (10% of all prostate cancers) (Fig. 1B). SPINK1 expression was also observed to be an independent predictor of biochemical recurrence after resection, and in vitro studies demonstrated a role for SPINK1 in tumor cell invasion [25,26]. Similarly, rare fusions of the RAF family kinases have also been found in 1–2% of patients (Fig. 1B). These findings suggest that the systematic nomination of biomarkers can identify complementary biomarkers that also represent aspects of disease biology.
Because individual outliers such as TMPRSS2-ERG or SPINK1 lack sensitivity, their ideal clinical application is as part of a multiplexed assay that provides information on multiple biomarkers. Detection of outlier RNA in urine sediments is a particularly promising avenue for biomarker development because acquisition of urine sediments does not require an invasive biopsy procedure.
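The arithmetic behind the case for multiplexing can be sketched in a few lines of Python. This is only an illustration using the approximate tissue-level frequencies quoted above (ETS fusions in ~50% of cancers, SPINK1 outlier expression in ~20% of the ETS-negative remainder); the function name `panel_coverage` is hypothetical, and the result is the fraction of cancers carrying at least one marker, not the sensitivity of any actual urine assay.

```python
def panel_coverage(ets_fraction: float, spink1_in_ets_negative: float) -> float:
    """Fraction of cancers positive for at least one of two outlier markers.

    Assumes, as in the text, that SPINK1 outlier expression is counted only
    among ETS-negative cancers, so the two groups do not overlap.
    """
    for value in (ets_fraction, spink1_in_ets_negative):
        if not 0.0 <= value <= 1.0:
            raise ValueError("fractions must lie in [0, 1]")
    ets_negative = 1.0 - ets_fraction
    # Convert the conditional frequency into a share of all cancers.
    spink1_overall = ets_negative * spink1_in_ets_negative
    return ets_fraction + spink1_overall


# Approximate frequencies quoted in the text (illustrative only).
coverage = panel_coverage(ets_fraction=0.50, spink1_in_ets_negative=0.20)
print(f"{coverage:.0%}")  # prints 60%
```

Under these figures, adding SPINK1 to an ETS-based test raises the share of cancers carrying a detectable outlier from about 50% to about 60%, which is why no single outlier is expected to suffice on its own.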
To this end, multiple groups have investigated multiplexed panels of urine biomarkers for prostate cancer diagnosis in PSA-prescreened cohorts. These studies have cumulatively defined an emerging role for two urine biomarkers, TMPRSS2-ERG and PCA3, whose detection improves the ability to distinguish men with prostate cancer from men without [23,28–30]. PCA3 is a long noncoding RNA (lncRNA) whose 3.7 kb RNA product is both spliced and polyadenylated, yet does not encode a protein (Fig. 1C). lncRNAs have recently been shown to contribute to prostate cancer pathogenesis and are emerging as a poorly understood layer of cancer biology. Although its function in prostate cancer cells is unclear, PCA3 ranks as one of the best prostate cancer biomarkers in tissue-based studies, with >90% of prostate cancers exhibiting PCA3 overexpression (Fig. 1C) [33,34]. Using a first-generation assay in a laboratory setting, retrospective analysis of urine PCA3 values shows a sensitivity of 54% and a specificity of 74%, and the combination of PCA3 with TMPRSS2-ERG improves the area-under-the-curve (AUC) metric of the receiver-operating-characteristic (ROC) curve over each individually.
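To make the sensitivity and specificity figures concrete, the following Python sketch converts them into positive and negative predictive values using Bayes' rule. The 25% cancer prevalence is an assumption chosen for illustration, in line with the statement above that 66–75% of men with an elevated PSA do not have prostate cancer; the function name `predictive_values` is hypothetical, and combining figures from different cohorts in this way is illustrative rather than a reported result.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) for a binary test at a given disease prevalence."""
    for value in (sensitivity, specificity, prevalence):
        if not 0.0 <= value <= 1.0:
            raise ValueError("inputs must lie in [0, 1]")
    # Expected fractions of the tested population in each outcome cell.
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    if true_pos + false_pos == 0.0 or true_neg + false_neg == 0.0:
        raise ValueError("predictive values are undefined for these inputs")
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity and specificity quoted for the first-generation urine PCA3 assay;
# the 25% prevalence is an assumed value for a PSA-prescreened population.
ppv, npv = predictive_values(sensitivity=0.54, specificity=0.74, prevalence=0.25)
print(f"PPV={ppv:.2f}, NPV={npv:.2f}")  # prints PPV=0.41, NPV=0.83
```

The same function can be rerun with a different prevalence to see how strongly predictive values depend on the population in which a test is applied, which is one reason the intended clinical setting has to be defined before validation.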
To translate these findings into clinical practice, commercial CLIA-grade urine assays have been developed for PCA3 and TMPRSS2-ERG (Table 2). Tomlins et al. recently used these assays to conduct a large-scale analysis of PCA3 and TMPRSS2-ERG in PSA-prescreened populations at multiple community clinic settings as well as academic institutions. In both the academic and community cohorts, a combined urine measurement of TMPRSS2-ERG and PCA3 outperformed PSA in distinguishing patients with biopsy-proven cancer (the AUC for TMPRSS2-ERG + PCA3 was 0.71–0.77 and the AUC for PSA was 0.60–0.61). Of note, TMPRSS2-ERG and PCA3 also provided additional improvement over the Prostate Cancer Prevention Trial (PCPT) risk calculator for prostate cancer, which integrates serum PSA level, age, patient and family history to predict risk for prostate cancer. While the prognostic significance of PCA3 and TMPRSS2-ERG individually has been debated in conflicting studies, Tomlins et al. found that the combined TMPRSS2-ERG + PCA3 measurement was associated with risk of high-grade cancer, and an increased TMPRSS2-ERG score was individually correlated with the number of biopsy cores with cancer and maximum tumor dimension, but not PSA level or ultrasound volume. EDRN investigators are similarly conducting a multi-institutional validation study testing the utility of a PCA3 urine test to assist clinicians in decision making for initial biopsy or repeat biopsy.
While TMPRSS2-ERG and PCA3 are promising biomarkers, they have limitations as well. First, the measurement of TMPRSS2-ERG and PCA3 is determined relative to the urine KLK3 (PSA) mRNA level, and thus a low yield of prostate cells from a urine sediment sample can render the tests uninformative. Second, these assays are more expensive and more complicated to perform, thus requiring specialized personnel to perform the tests.
As a result, the EDRN is also testing biomarkers developed elsewhere that may provide additive value to the panel. New –omic technologies, such as metabolomics and proteomics, have nominated novel biomarkers that have not previously been measured. An example of this is sarcosine, a metabolite produced by cells during the metabolism of the amino acid glycine, which was nominated by systematic analyses of prostate tissue cohorts. Elevated sarcosine levels characterize prostate cancer, particularly metastatic cancers, and sarcosine may play a role in cancer aggressiveness in both prostate and lung cancer [38,39]. Development of urine assays for sarcosine is also underway (Table 1), and the EDRN is poised to incorporate promising biomarkers into its programs. We are currently testing a number of biomarkers using the prostate cancer reference sets, which comprise samples from patients selected according to the following criteria: (1) men over 40 years of age, (2) scheduled for prostate biopsy due to an elevated PSA (>2.5 ng/mL) or rising PSA (>0.5 ng/mL/yr), (3) risk factors for prostate cancer (e.g., family history) even in the absence of an elevated PSA, (4) abnormal digital rectal exam (DRE), (5) percent free PSA <15%, (6) no prior history of prostate cancer or prostate biopsy, and (7) prostate biopsy with at least 10 cores taken in a laterally directed fashion. Blood from these patients is collected prior to prostate biopsy.
In addition, the EDRN is also re-evaluating PSA in prostate cancer screening. Despite the long-time use of a 4.0 ng/mL cutoff for a 'normal' PSA value, it has been acknowledged that only about 25% of men with such an elevated value will be found to have prostate cancer at prostate biopsy and ~15–20% of men with a PSA <4.0 ng/mL may also have prostate cancer. Because of this, three-quarters of men with an elevated PSA who have a biopsy undergo the procedure unnecessarily. Recent data from large longitudinal screening programs and from the Prostate Cancer Prevention Trial now suggest that the risk of prostate cancer is equally elevated (20–25%) even among men with serum PSA levels from 2.5 ng/mL to 4.0 ng/mL. Additional indications for prostate biopsy include a rising PSA, an abnormal digital rectal examination, or even a lower PSA value for a patient with other risk factors. For example, in accord with the initial demonstration from Hopkins that family history is linked to prostate cancer risk, an EDRN group has demonstrated that for a 65-year-old man with a first-degree relative with prostate cancer, a PSA of 1.8 carries a 25% positive predictive value for prostate cancer (http://edrn.nci.nih.gov/resources/sample-reference-sets/Prostate%20Ref%20SOP.pdf). An opportunity exists for a biomarker in this application to reduce the number of unnecessary initial and repeat biopsies in men who are ultimately proven to not have prostate cancer while maintaining a very high level of sensitivity.
EDRN serves both as an accelerator that drives good biomarkers through to the clinic and as a brake that uses good clinical design to eliminate biomarkers without any added value. Two examples of this are [−2]pro-PSA, a precursor molecule of PSA which contains two additional amino acids at the N-terminus of the molecule [16,43], and percent free PSA (%fPSA), which represents the relative fraction of serum PSA proteins not complexed with other serum proteins [44,45]. Biochemically, PSA begins as a precursor protein and undergoes several proteolytic cleavage steps to produce the mature PSA protein, which can complex in the blood with alpha-1-antichymotrypsin. In prostate cancer, improper processing and complexing of PSA results in an increase in [−2]pro-PSA [14,46] and a decrease in the %fPSA [44,45]. Changes in [−2]pro-PSA and %fPSA in prostate cancer also associate with more aggressive, higher-grade disease [16,47,48]. Widespread interest in these biomarkers has further spurred development of commercial assays for [−2]pro-PSA and %fPSA (Table 2).
To pursue these biomarkers, the EDRN conducted a multi-center study of [−2]pro-PSA in combination with PSA and fPSA for prostate cancer detection in patients with relatively low PSA levels of 2.0 to 10.0 ng/mL. The objective of the study was to evaluate [−2]pro-PSA, fPSA, and PSA using a mathematical formula, the prostate health index (phi), defined as $\mathrm{phi} = ([-2]\mathrm{proPSA} / \mathrm{fPSA}) \times \sqrt{\mathrm{PSA}}$, to enhance specificity for detecting overall and high-grade prostate cancer. For the 2–10 ng/mL PSA range, at 80–95% sensitivity, the specificity and AUC (0.703) of phi exceeded those of PSA and %fPSA. Increasing phi was associated with a 4.7-fold increased risk of prostate cancer and a 1.61-fold increased risk of Gleason ≥7 disease upon biopsy. The AUC for phi (0.724) exceeded that of %fPSA (0.670) in discriminating between prostate cancer with Gleason ≥ 4+3 vs. lower grade disease or negative biopsies.
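As a minimal sketch, the phi formula can be written as a small Python function, reading the exponent on PSA as a square root. The units in the parameter names ([−2]pro-PSA in pg/mL, fPSA and PSA in ng/mL) are an assumption, since the text does not state them, and the input values in the example are hypothetical numbers chosen only to make the arithmetic easy to follow.

```python
import math


def prostate_health_index(p2psa_pg_ml: float, fpsa_ng_ml: float, psa_ng_ml: float) -> float:
    """Compute phi = ([-2]proPSA / fPSA) * sqrt(PSA).

    Units are assumed (not stated in the text): [-2]proPSA in pg/mL,
    free PSA and total PSA in ng/mL.
    """
    if p2psa_pg_ml < 0 or psa_ng_ml < 0:
        raise ValueError("concentrations must be non-negative")
    if fpsa_ng_ml <= 0:
        raise ValueError("free PSA must be positive to form the ratio")
    return (p2psa_pg_ml / fpsa_ng_ml) * math.sqrt(psa_ng_ml)


# Hypothetical patient values, for illustration only.
print(prostate_health_index(p2psa_pg_ml=15.0, fpsa_ng_ml=1.0, psa_ng_ml=4.0))  # prints 30.0
```

Because total PSA enters only through its square root, the index is driven mainly by the ratio of [−2]pro-PSA to free PSA, which is the quantity reported to shift in prostate cancer.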
The clinical translation of novel biomarkers can transform disease management. However, biomarker discovery and development is fraught with challenges, resulting in the eventual abandonment or failure of the vast majority of candidate biomarkers. The EDRN has proposed a rational paradigm for biomarker development that focuses on systematic, evidence-based discovery and validation of biomarkers as a prerequisite for further advancement of nominated biomarkers to clinical trials. Biased discovery methods and poorly designed clinical trials compromise many biomarker studies, complicating efforts to independently validate these markers and define their appropriate clinical use. New discoveries of urine RNA biomarkers (TMPRSS2-ERG, PCA3) and novel derivations of serum PSA (fPSA, [−2]pro-PSA) in prostate cancer now promise to change the clinical detection and management of this disease. Together, the combination of multiple biomarkers will offer advantages over each individual biomarker in clinical use. These biomarkers serve as examples demonstrating the potential benefit biomarker development has on clinical oncology.
We would like to acknowledge the numerous labs, authors, and publications that we were unable to cite in this review due to space restrictions. This work was supported by the Early Detection Research Network grant U01 CA 11275 (to A.M.C.), the Department of Defense grants PC100171 (to A.M.C.) and PC094290 (to J.R.P.), and the NIH Prostate Specialized Program of Research Excellence grant P50CA69568 (to A.M.C.). A.M.C. is supported by the Doris Duke Charitable Foundation Clinical Scientist Award, the Prostate Cancer Foundation, the American Cancer Society, and the Howard Hughes Medical Institute. J.R.P. is a Fellow of the University of Michigan Medical Scientist Training Program. A.M.C. is a Taubman Scholar of the University of Michigan.
Conflict of Interest Disclosure
A.M.C. serves as an advisor to Gen-Probe, Inc., who has developed diagnostic tests using PCA3 and TMPRSS2-ERG. The University of Michigan has licensed the development of TMPRSS2-ERG-based prostate cancer diagnostic assays to Gen-Probe and A.M.C. is named as a co-inventor. Gen-Probe was not involved in the writing or approval of this manuscript.