The primary findings of this study are 44 potential lung cancer biomarkers that discriminate stages I-III NSCLC cases from at-risk heavy smoker controls and that can be combined into classifier panels that meet and exceed pre-specified performance criteria. The results of this study are novel in the following respects: (1) most of the proteins identified in this study have not been identified previously as serum lung cancer biomarkers; (2) we have identified novel protein biomarker panels that distinguish lung cancer cases from appropriate controls with high sensitivity and specificity in an independent, blinded verification set; and (3) this study achieves a new level of evidentiary standard in clinical proteomic biomarker studies as a result of a large sample size, a study design that controls preanalytical variability, and the unique capability of this proteomic technology to interrogate the circulating proteome quantitatively with a breadth, sensitivity, and dynamic range unmatched by other broad serum profiling platforms, including mass spectrometry, antibody arrays, and autoantibody arrays. This study is the first large-scale application of this technology and the largest clinical proteomic biomarker study to date. As such, this study aims to overcome critical confounders and limitations of clinical proteomic biomarker studies that contribute largely to the lack of translation to the clinic due to false discovery. These confounders and limitations include clinical sample integrity, preanalytical variability, and inadequate study design and power.
The best overall performing classifier used 12 of the 44 biomarkers and achieved 91% sensitivity and 84% specificity in cross-validated training and similar performance of 89% sensitivity and 83% specificity in blinded validation. These results provide evidence that these biomarkers are valid and that the classifier was not over-fit to the training data. This performance and the biological plausibility of the 12 biomarkers (discussed below) are encouraging for the next phase of development – validation in an independent clinical study.
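For readers less familiar with these metrics, the sketch below shows how sensitivity and specificity are computed from classifier calls. The function name and the counts are hypothetical and chosen only to illustrate the arithmetic; they are not data from this study.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp/fn: cancer cases called positive/negative.
    tn/fp: controls called negative/positive.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case and one control")
    sensitivity = tp / (tp + fn)  # fraction of cases correctly called positive
    specificity = tn / (tn + fp)  # fraction of controls correctly called negative
    return sensitivity, specificity


# Hypothetical counts (illustration only, not from the study).
sens, spec = sensitivity_specificity(tp=89, fn=11, tn=83, fp=17)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

A small gap between cross-validated and blinded values of these two quantities is what supports the statement that the classifier was not over-fit.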
The 12 biomarkers identified in this study encompass functions of cell movement, inflammation, and immune monitoring that may contribute to cancer development. Most of the 12 proteins have been associated generally with cancer biology, some have been identified as candidate lung cancer biomarkers, none have been validated as lung cancer biomarkers, and none are used clinically. Four of the 12 proteins have been identified in serum and lung cancer tissue or cell culture as candidate lung cancer biomarkers: cadherin-1, endostatin, HSP90, and pleiotrophin. Eight of the 12 proteins, CD30 ligand, LRIG3, MIP-4, PRKCI, RGM-C, SCF-sR, sL-Selectin, and YES, have not been identified previously in serum as lung cancer biomarkers and represent novel findings.
Seven of the 12 proteins (CD30 ligand, endostatin, HSP90, MIP-4, pleiotrophin, PRKCI, and YES) were up-regulated in lung cancer in this study, consistent with their proposed biological roles in proliferation, invasion, or host inflammatory and immune response to the tumor. CD30 ligand is a member of the TNF ligand superfamily, which stimulates T-cell growth. Up-regulation of this protein correlates with proliferation in hematological malignancies. Endostatin, best known as an inhibitor of angiogenesis, has elevated serum levels in several cancers. Overexpression of endostatin and its parent extracellular matrix protein, collagen XVIII, has been associated with poor prognosis in NSCLC.
The chaperone HSP90α is important for the stability and function of a wide range of oncoproteins, including BCR-ABL, ERBB2, EGFR, BRAF, and AKT, among others, and inhibitors of this protein are now in oncology clinical trials, including trials in NSCLC. HSP90 may also play a role in tumor cell resistance to complement-mediated cytotoxicity. MIP-4 is over-expressed in ovarian and gastric cancers, and may have a role in immunosuppression of the host tumor response. Pleiotrophin is a growth factor with both mitogenic and angiogenic properties, and its levels in the serum of NSCLC patients have been reported to correlate with disease stage and prognosis. PRKCI is an oncogene that is often amplified in NSCLC, and its over-expression in lung tumors correlates with poor prognosis. YES, another protein kinase and a member of the Src family of tyrosine kinases, has a role in malignant transformation, and increased protein levels have been reported in early stages of hepatocarcinoma.
We observed decreased levels of proteins in the serum of lung cancer patients compared to controls, including cadherin-1, LRIG3, sL-selectin, SCF-sR, ERBB1, and RGM-C. Lower circulating levels of many of these proteins are associated with relief of inhibition of growth and invasion. For example, cadherin-1 is critical for cell adhesion and indirectly affects transcriptional regulation circuits through β-catenin. Consistent with our results, reduced expression has been reported in lung cancer, and loss of cadherin-1 is a key event leading to loss of adherence, tumorigenicity, and metastasis. The LRIG family consists of membrane proteins with soluble leucine-rich repeat domains and immunoglobulin-like domains. Down-regulation of expression of this protein in glioblastoma cell lines resulted in increased proliferation and invasion, decreased apoptosis, and increased EGFR expression, leading to the hypothesis that LRIG is a tumor suppressor. L-selectin plays a role in activation of naïve lymphocytes that participate in immune surveillance and antitumor immunity. It also mediates the adherence of lymphocytes to endothelial cells. Lower expression of L-selectin may be a component of the immune suppression observed in many cancer patients.
Some of the proteins described in this study are the soluble domains of membrane receptors, and the function of the circulating form of these proteins may oppose that of their membrane-bound counterparts. Turner et al. proposed that soluble SCF-receptors regulate kit activation. Our results suggest that a low level of SCF-sR fails to titrate SCF, which makes more SCF available for binding cancer cells. Unlike the membrane-bound form, soluble RGM-C inhibits hepcidin expression. We find that RGM-C is down-regulated in NSCLC serum, consistent with increased intracellular iron and proliferative cell growth.
The limitations of this study include the following. We did not test cases prior to clinically apparent disease. We did not demonstrate organ specificity, and many of the markers are known to be elevated in other cancers. However, the markers will be used in combination and in the proper diagnostic context, such as with imaging, smoking history, and symptoms. We did not validate our findings in an independent set of clinical samples. Our multi-center study was designed to minimize the effects of potential preanalytical variability, which the design mitigates but does not eliminate. All of these limitations will be addressed in the next phase of development, which is enabled by the positive results of this study.
The biomarkers that we discovered have several potential clinical applications. The first application is early detection of lung cancer in long-term smokers when it may be cured by surgery. Our results are a significant improvement on the performance of other recently published lung cancer biomarker studies aimed at early diagnosis using mass spectrometry or gene expression. This performance could allow for testing of individuals with increased lung cancer risk, with subsequent CT screening based on the blood test result.
A second potential application is a test for diagnosing lung cancer in subjects with suspicious lung nodules identified by CT, which could help mitigate the problem of morbidity and cost associated with surgical interventions. CT screening reveals suspicious nodules in ~40% of long-term smokers, but ~97% are likely benign. Protocols for managing these patients balance the risk of “watchful waiting” with definitive and costly invasive procedures. Watchful waiting monitors nodule growth by periodic follow-up CTs, but may miss the opportunity for early surgical cure. Invasive procedures incur the risk of complications and death that arise from biopsy or futile thoracotomy for benign lesions. This risk might be reduced by a new strategy to assess nodule volume doubling time by CT. However, CT radiation itself increases cancer risk.
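To put the nodule figures in context, the sketch below applies Bayes' rule to this setting. It assumes, purely for illustration, that the blinded-validation performance (89% sensitivity, 83% specificity) would carry over to subjects with CT-detected nodules and that about 3% of those nodules are malignant. Neither assumption was tested in this study, and the function name is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)  # P(cancer | positive test)
    npv = true_neg / (true_neg + false_neg)  # P(no cancer | negative test)
    return ppv, npv


# Assumed inputs: validation performance from this study and the
# ~3% malignancy rate among CT-detected nodules (illustration only).
ppv, npv = predictive_values(sensitivity=0.89, specificity=0.83, prevalence=0.03)
print(f"PPV={ppv:.3f}, NPV={npv:.3f}")
```

Under these assumptions most positive results would still come from benign nodules, while a negative result would leave only a small residual probability of cancer. This is consistent with using such a test alongside imaging and clinical history rather than on its own.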
Based on the discoveries reported here, we have initiated clinical validation studies of populations at risk for lung cancer. Our goal is to develop a clinical blood test to enable an earlier diagnosis. This study is the first to be published in a sequence of successful biomarker discovery studies that we have already completed in different cancers and demonstrates the power of our proteomic technology to discover robust biomarkers in important diseases. This general approach can also be applied to discover biomarkers for many more conditions including infectious, inherited, neurological, and metabolic diseases.