A challenge in the treatment of lung cancer is the lack of early diagnostics. Here, we describe the application of monoclonal antibody proteomics for discovery of a panel of biomarkers for early detection (stage I) of non-small cell lung cancer (NSCLC). We produced large monoclonal antibody libraries directed against the natural form of protein antigens present in the plasma of NSCLC patients. Plasma biomarkers associated with the presence of lung cancer were detected via high throughput ELISA. Differential profiling of plasma proteomes of four clinical cohorts, totaling 301 patients with lung cancer and 235 healthy controls, identified 13 lung cancer-associated (p < 0.05) monoclonal antibodies. The monoclonal antibodies recognize five different cognate proteins identified using immunoprecipitation followed by mass spectrometry. Four of the five antigens were present in non-small cell lung cancer cells in situ. The approach is capable of generating independent antibodies against different epitopes of the same proteins, allowing fast translation to multiplexed sandwich assays. Based on these results, we have verified in two independent clinical collections a panel of five biomarkers for classifying patient disease status with a diagnostics performance of 77% sensitivity and 87% specificity. Combining CYFRA, an established cancer marker, with the panel resulted in a performance of 83% sensitivity at 95% specificity for stage I NSCLC.
Lung cancer (LC) is the most common cause of cancer-related death, and the low overall 5-year survival rate, ranging from 10 to 14% (1), at least in part reflects the problem of late diagnosis of the disease. On the other hand, early detection significantly improves LC survival (2). Although recent attempts at early detection of lung cancer via computed tomography (CT) screening of nonsymptomatic subjects at high risk were effective (3) and showed improved survival (4), the relatively low specificity of CT imaging creates a pressing need for better identification of LC patients in the group of individuals presenting with small solitary pulmonary nodules. Currently used plasma protein biomarkers for LC, such as carcinoembryonic antigen (CEA), cytokeratin-19 fragment (CYFRA), squamous cell carcinoma antigen (SCC), and neuron-specific enolase, exhibit insufficient sensitivity and specificity (5) for population screening and routine diagnosis of LC and are used only in specific cases to monitor the efficacy of radio- and chemotherapy (6). Unbiased gene expression experiments (7), proteomics (8), and auto-antibody profiling (9) efforts have so far also failed to discover reliable and clinically useful markers for the early detection of LC (10).
Global proteome profiling holds promise for testing the hypothesis of whether protein species exist that are NSCLC-specific. However, global proteomic analysis of plasma has been hampered by methodological issues such as the lack of appropriate affinity reagents for downstream verification and development of clinically useful diagnostic tests. Because of this bottleneck, only very few reported biomarkers were subsequently validated, and practically none made it to the clinic (11).
Global antibody proteomics approaches (12, 13) have aimed to generate libraries of antibodies that cover most or all individual proteins in the human proteome, but they used recombinant proteins as immunogens. Although this was an important development, the limitation of using recombinant proteins as antigens is that their antigenic epitopes are not presented in the natural state. Another issue with these initiatives is that they are inherently limited by our current understanding of the natural human genome and proteome. Our efforts focused on the development of a new technology, termed mAb proteomics, that yields fit-for-purpose affinity reagents against native epitopes, allowing rapid translation of research results into clinically useful immunoassays without any a priori knowledge or bias. Here we present its application for the discovery of plasma protein biomarkers in LC. We produced complex mAb libraries in the form of hybridoma supernatants harvested from cultures of somatic fusion and termed them nascent mAb libraries. The nascent mAb libraries were targeted to the immunogenic epitome of the NSCLC plasma proteome. Differential screening (cancer versus control) of the libraries identified mAbs detecting NSCLC-associated plasma protein epitope markers, some of which were also present in the cancer tissue samples. Ultimately, we identified five biomarkers whose levels were statistically different in the plasma of NSCLC patients and healthy controls. Among them, four proteins, α-1 antichymotrypsin (ACT), leucine-rich α-2 glycoprotein 1 (LRG1), haptoglobin (Hpt), and complement factor H (CFH), were previously associated with LC (14–17), whereas complement factor nine (C9) is a biomarker for which no quantitative studies demonstrating an association with cancer have been previously reported.
Screening of cloned and complex plasma proteome-specific mAb libraries with the cognate antigens led to the detection of antibody partners, allowing the development of sandwich immunoassays. Combination of the biomarkers with CYFRA (18) resulted in a diagnostic performance that may provide sufficient specificity to complement CT imaging in population screening of asymptomatic subjects with a high risk of LC.
Plasma samples from patients with newly diagnosed lung cancer and no previous treatment were obtained from informed patients and apparently healthy individuals after obtaining their written consent by a clinical protocol approved by the regional/local ethics committee and the institutional review board of the clinic/company (see Table I) from Proteogenex (Culver City, CA) under clinical protocol PG-ONC 2003/1, Asterand (Royston, UK) under clinical protocol AST-FB-003 and from the Department of Pulmonology of the University of Debrecen in Hungary under clinical protocol RKEB/IKEB:2422-2005. Plasma specimens for cohorts I and III were obtained using K2-EDTA as anticoagulant, whereas specimens for cohorts II and IV were obtained using citrate as anticoagulant. Lung cancer staging was done according to the American Joint Committee on Cancer and was based on information in the final histopathology report having the LC-histotype according to the World Health Organization classification (19). Clinical data including stage at diagnosis, histology, additional pulmonary pathologies, smoking habits, and general patient demographics are presented in Table I for each cohort.
Depletion of 12 abundant proteins was performed using a commercially available SEPPRO IgY12 LC10® (12.7 × 79.0 mm) column from Beckman Coulter (Fullerton, CA) on a BioCad chromatography HPLC work station (Applied Biosystems, Foster City, CA). Chromatography was performed according to the protocol supplied by the vendor, with minor buffer modifications. Briefly, a plasma sample (250 μl) was thawed and diluted by the addition of 750 μl of buffer A (25 mM Tris, 0.5 M NaCl, 1 mM MnCl2, 1 mM CaCl2, and 0.05% sodium azide, pH 7.4). The diluted plasma was loaded onto the SEPPRO IgY12 column at a flow rate of 0.5 ml/min for 30 min; the flow rate was then increased to 2 ml/min for the remainder of the run. The unbound proteins (depleted fraction) were washed off with binding buffer, and the depleted fraction was collected into a 15-ml centrifugal filter (Amicon) with a cut-off at 5 kDa. The depleted plasma was concentrated by centrifugation at 3,500 × g. The bound proteins were eluted from the column with stripping buffer (100 mM glycine, pH 2.5) and collected into a separate tube. The column packing material was neutralized with 100 mM Tris-HCl, pH 8.0, for 10 min and re-equilibrated with binding buffer, before performing protein depletion from the next plasma sample. A total of 20 runs for the lung cancer plasma from cohort I (see Table I) were carried out. The concentrated and depleted fractions were pooled before further processing.
Glycoprotein enrichment was performed using a multi-lectin affinity chromatography column containing a mixture of three lectins: concanavalin A, wheat germ agglutinin, and jacalin lectin from Vector Laboratories (Burlingame, CA), as previously described (20).
Protein normalization intended to reduce the dynamic range of protein concentration of the depleted plasma and the glycosylated protein fraction was performed as previously described (21). Briefly, an immunoaffinity column was prepared by covalently linking rabbit polyclonal antibodies raised against normal human serum from Sigma to HiTrap protein G HP columns (4.6 × 100) from GE Healthcare (Chalfont St. Giles, UK) using 15 mM dimethyl pimelimidate and 15 mM dimethyl suberimidate. For enrichment of the depleted plasma and glycosylated protein enriched fraction, 1 mg of protein was loaded onto the column and incubated for 5 min before washing with 1× PBS buffer at pH 7.0. The flow-through represents the enriched protein fraction, whereas the bound proteins were eluted with stripping buffer B from Agilent Technologies (Santa Clara, CA). The column was equilibrated with 1× PBS buffer, pH 7.0, between runs.
Two groups of four female BALB/c mice, at least 8 weeks of age from Charles River Laboratories (Lyon, France) were injected subcutaneously in the rear footpads and at the base of the tail, each group with one of the complex antigen protein mixtures in the presence of Freund's adjuvant. Each mouse received 10 μg of either glycoprotein-enriched depleted plasma (group A) or normalized glycoprotein-enriched depleted plasma (group B) on days 1, 15, and 29. At least 3 weeks after the third immunization, an intravenous injection with 10 μg of the complex antigen mixture in PBS was performed to boost the immune response. The mice were housed and treated in an animal facility accredited by the Regional Department of Veterinary Services (accreditation number 91-228-107), and all of the procedures were approved by the local ethical committee.
Four days following the final intravenous injection, the mice were sacrificed by cervical dislocation, the spleens were removed by dissection, and a cell suspension was prepared. Somatic hybridization of spleen cells with Sp2/0-Ag14 myeloma cells was performed using polyethylene glycol, and the fused cells were seeded into 96-well microplates under conditions that achieve oligo- or monoclonality for hybrid cell lines. Cultures were maintained under hypoxanthine-aminopterin-thymidine selection according to standard procedures (22). After allowing the cells to grow for 2 weeks, the IgG antibody production in each well was measured by ELISA using the protocol described below for HTS direct ELISA on 25 μl of hybridoma supernatant and horseradish peroxidase-coupled goat anti-mouse IgG as the detection antibody. Immunoglobulin-secreting hybridoma supernatants derived from this initial stage are termed nascent mAb libraries, to distinguish them from libraries of cloned and characterized mAbs. Cloning of selected hybridomas was performed using hypoxanthine-aminopterin-thymidine-containing semisolid medium, based on methylcellulose ClonaCell-HY medium D from StemCell Technologies (Vancouver, Canada), in 4-cm culture dishes. After 4–10 days, hybridoma cell colonies were picked and transferred into 96-well plates filled with hypoxanthine-aminopterin-thymidine-containing complete medium.
Ascites fluid was produced by intraperitoneally injecting 1–5 × 10⁶ cloned hybridoma cells into incomplete Freund's adjuvant-primed BALB/c mice, using a protocol that was approved by the animal care committee of the animal facility. Ascitic fluid was cleared of cells and cell debris, and IgG was purified using a two-step affinity chromatography procedure consisting of prepurification on a Pierce thiophilic adsorbent column from Thermo Fisher Scientific (Waltham, MA) followed by further purification using Protein G-Sepharose 4 FF from GE Healthcare as recommended by the suppliers. Following concentration by ultrafiltration, the purity of the preparations was assessed by SDS-PAGE, and the protein concentration was determined using the BCA protein assay kit from Pierce. The purified antibodies were stored as working aliquots at −78 °C.
Depletion of the most abundant plasma proteins was performed using commercially available columns according to the protocols supplied by the vendors. The Human-7 Multiple Affinity Removal System from Agilent Technologies (Santa Clara, CA) was used to deplete the pooled plasma used in the first step of the screening process, as well as the individual plasma samples used in the second screening (see Fig. 1). This depletion system was chosen because it provided the most efficient and reproducible depletion of abundant plasma proteins with no apparent carry over. Single-use SpinTrap depletion columns from GE Healthcare were used to deplete individual plasma samples in the third screening as well as in the experiments used to determine and verify the performance of the antibody panel, removing any possibility of sample carryover that might occur from multi-use columns. The quality and reproducibility of the depletion was determined and controlled by ELISA using commercially available mAbs against human albumin (Zymed Laboratories Inc., South San Francisco, CA) and IgG (Southern Biotechnology Associates, Inc., Birmingham, AL). The depleted plasma (pooled or individual) was labeled with bifunctional NHS-biotin containing a long alkyl chain as a spacer, EZ-Link Sulfo-NHS-LC-Biotin from Pierce according to the manufacturer's instructions. The biotinylated depleted plasma samples were used as tracers in the high throughput ELISA experiments. Biotinylation of purified proteins (listed in supplemental Table 1) was also performed according to the manufacturer's instructions.
384-well high protein-binding plates (Corning Inc., Lowell, MA) were coated with goat anti-mouse Igγ chain specific polyclonal antibody (goat anti-mouse IgG) (Southern Biotechnology Associates, Inc.) for 2 h at room temperature. Following washing, the wells were blocked with 0.5% BSA in PBS at 4 °C overnight. Undiluted culture supernatants from nascent hybridomas or from cloned cell lines were added in quadruplicates. Mouse anti-human mAb against albumin (Zymed Laboratory Inc.) was spotted in eight wells on each plate and used as positive control. Tissue culture medium was also added to eight wells as a negative control. The plates were incubated overnight at 4 °C, and following washing, all wells were incubated with tracers (biotinylated depleted plasma). Biotinylated human serum albumin (supplemental Table 1) was added to the wells containing the positive controls, and the incubation continued for 90 min at room temperature. The unbound proteins were then removed by washing, and horseradish peroxidase-coupled avidin (Vectastain Elite ABC peroxidase kit from Vector Laboratories) was added to each well for detection of bound biotinylated proteins, as specified by the vendor's protocol. Following washing, color development was carried out by adding freshly prepared substrate solution (H2O2; Sigma) and chromogen (o-phenylenediamine; Sigma) to each well. The reaction was followed by kinetic reading for 4 min at 450 nm at 37 °C, at 32-s intervals using ELISA readers from Molecular Devices (Toronto, Canada). All of the steps amenable to robotization were automated.
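The kinetic read-out described above yields a short time series of absorbance values per well, which is then reduced to a single rate. The following is a minimal Python sketch of that reduction, assuming an ordinary least-squares slope over all supplied reads; the function name `kinetic_rate` and the example values are hypothetical, and the sketch does not reproduce how SoftMax Pro selects the linear portion of a curve.

```python
def kinetic_rate(times_s, absorbances):
    """Least-squares slope of absorbance versus time (absorbance units per second)."""
    n = len(times_s)
    if n != len(absorbances) or n < 2:
        raise ValueError("need at least two paired readings")
    mean_t = sum(times_s) / n
    mean_a = sum(absorbances) / n
    # Sum of squared time deviations and of time/absorbance cross-products.
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    sxy = sum((t - mean_t) * (a - mean_a) for t, a in zip(times_s, absorbances))
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    return sxy / sxx


# Hypothetical well: eight reads taken 32 s apart, within the 4-min window.
times = [32 * i for i in range(8)]
reads = [0.05 + 0.002 * t for t in times]  # made-up, perfectly linear signal
rate = kinetic_rate(times, reads)
```

In practice only the reads from the linear phase of the color reaction would be passed to such a function.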
The Vmax was calculated from the linear portion of the curves using SoftMax Pro software from Molecular Devices. The Z′ factor, a metric to quantify the quality of the screening experiment with respect to reproducibility and data scatter (23), was calculated for each plate using the positive and negative control wells. Plates with a Z′ factor below 0.5 (usually less than 10% in a screening experiment) were excluded and repeated. The positive and negative controls were used to normalize the data across the plates.
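As an illustration of the plate quality check, the Z′ factor is commonly written as Z′ = 1 − 3(σp + σn)/|μp − μn|, where μ and σ are the mean and standard deviation of the positive (p) and negative (n) control wells. The Python sketch below applies that formula to eight control wells of each type. The well values are made up, and the `normalize` helper is only one plausible way of scaling signals between the control means; the text does not state the normalization formula that was actually used.

```python
from statistics import mean, stdev


def z_prime(pos, neg):
    """Z' = 1 - 3 * (sd_pos + sd_neg) / |mean_pos - mean_neg|."""
    return 1.0 - 3.0 * (stdev(pos) + stdev(neg)) / abs(mean(pos) - mean(neg))


def normalize(value, pos, neg):
    """Assumed scaling: 0 at the negative-control mean, 1 at the positive-control mean."""
    return (value - mean(neg)) / (mean(pos) - mean(neg))


# Hypothetical rates from the eight positive and eight negative control wells of one plate.
pos_wells = [100.0, 102.0, 98.0, 100.0, 101.0, 99.0, 100.0, 100.0]
neg_wells = [10.0, 11.0, 9.0, 10.0, 10.5, 9.5, 10.0, 10.0]

quality = z_prime(pos_wells, neg_wells)
plate_ok = quality >= 0.5  # plates below 0.5 were excluded and repeated
scaled = normalize(55.0, pos_wells, neg_wells)
```

A plate with tight, well-separated controls gives a Z′ close to 1, whereas noisy or overlapping controls push it toward or below 0.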
Outliers for each group of replicates were removed using an automated procedure based on the mean and standard deviation values of the multiple measurements. In total, 9.2% of all generated data were considered as aberrant. The data obtained after normalizing and averaging the replicates (supplemental Fig. 1) were further analyzed using statistical methods.
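The exact outlier rule is not given in the text beyond its reliance on the mean and standard deviation of the replicates. One possible form of such a rule is sketched below in Python: each replicate is compared with the mean and standard deviation of the remaining replicates. The leave-one-out design and the cut-off `k = 3.0` are assumptions made for illustration, not the procedure used in the study.

```python
from statistics import mean, stdev


def drop_outliers(replicates, k=3.0):
    """Keep a replicate unless it lies more than k SDs from the mean of the others."""
    kept = []
    for i, value in enumerate(replicates):
        others = replicates[:i] + replicates[i + 1:]
        if len(others) < 2:
            kept.append(value)  # too few remaining points to estimate a spread
            continue
        if abs(value - mean(others)) <= k * stdev(others):
            kept.append(value)
    return kept


# Hypothetical quadruplicate with one aberrant well.
cleaned = drop_outliers([1.0, 1.1, 0.9, 5.0])
averaged = mean(cleaned)
```

The leave-one-out comparison is used here because, with only four replicates, a single extreme value inflates the pooled standard deviation enough to mask itself.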
The normality of the distribution of the results was estimated using the Shapiro-Wilk test. The distribution of the results for each hybridoma was calculated separately for the control and lung cancer samples. Nonparametric statistical analyses were applied: differences between two independent groups were determined using the Mann-Whitney U test; differences between more than two groups were assessed using the Kruskal-Wallis one-way analysis of variance test. All statistical tests were two-sided and were performed using R statistical software (www.cran.r-project.org). A predictive model for discriminating lung cancer cases from healthy controls using the panels of mAbs was produced using the freely available machine learning toolkit Weka (http://www.cs.waikato.ac.nz/~ml/) with a linear support vector machine and sequential minimal optimization algorithm. The model was established on the entire data set using 10-fold cross-validation. Logistic regression based on the sequential minimal optimization algorithm predictions was calculated to produce probabilities of class values (here lung cancer versus control) and to generate the receiver operating characteristics (ROC) curves.
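For readers who want to see what the two-group comparison computes, the following is a minimal Python sketch of a two-sided Mann-Whitney U test. It uses the tie-corrected normal approximation without continuity correction, which is an assumption of this sketch; the analyses in the study were run in R, whose defaults may differ (for example, exact p values for small samples without ties). The function name and example values are hypothetical.

```python
import math


def mann_whitney_u(x, y):
    """Return (U for sample x, two-sided p value) using the normal approximation."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        t = j - i                     # size of this group of tied values
        avg_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        tie_term += t ** 3 - t
        i = j
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    mu = n1 * n2 / 2.0
    var = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var == 0:
        return u_x, 1.0               # all values tied: no evidence of a difference
    z = (u_x - mu) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return u_x, p


# Hypothetical normalized signals for one mAb in controls and in cases.
u_stat, p_value = mann_whitney_u([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```

Dividing U by the product of the two group sizes gives the probability that a value from one group exceeds a value from the other, which links this test to the area under the ROC curve used later for ranking the biomarkers.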
Immunoaffinity magnetic beads were prepared for each antibody by mixing the antibody with protein G-coupled Dynabeads (Invitrogen) followed by covalent cross-linking of the bound antibody using dimethoxypropane, according to the manufacturer's instructions. The antigen was immunoprecipitated by mixing the beads with either total or depleted plasma according to the specific objective of the experiment. Following overnight incubation at 4 °C to allow antigen binding, the beads with bound antigen were washed, and the antigen was eluted using the elution buffer supplied with the kit. Proteins from the resulting eluate as well as the unbound fraction were separated by SDS-PAGE and the gels subjected to Coomassie Blue or silver staining. The band corresponding to the specifically immunoprecipitated protein was excised, subjected to in-gel trypsin digestion, and processed for mass-spectrometric analysis.
Analysis was performed at different MS laboratories. At the Laboratory of Proteomics Research at Biological Research Center of the Hungarian Academy of Sciences (Szeged, Hungary) analysis was done using a Thermo LCQ Fleet three-dimensional ion trap coupled with Eldex nanoHPLC system. Mascot Distiller (version 22.214.171.124) was used as a peak list-generating software and default peak picking parameters were used for the triple-play method except the precursor charge state was considered both 2 and 3 (default charges) in all cases and the peak profile parameters were changed: minimum peak width 0.1 Da, expected peak width 0.5 Da, and maximum peak width 2.5 Da. Mascot Daemon (version 2.2.04.) software was used as a search engine on the NCBInr (20080718) database with 6,833,826 actual protein entries and without species restriction. At the Proteomics Core Facility of the Department of Biochemistry and Molecular Biology of the Medical and Health Science Center of University of Debrecen (Debrecen, Hungary), analysis was done using ABI Q-Trap 4000 MS/MS system coupled with Agilent 1100 nanoLC. ProID (ABSciex) (version 1.4) was used as a peak list-generating software and search engine with default parameters. The SwissProt database (version 20090201) with 512,205 actual protein entries was searched without species restriction. At the Plate-Forme Spectrométrie de Masse et Protéomique at the Université Pierre et Marie Curie, Paris, France, analysis was done using the 4700 Proteomics Analyzer MALDI-TOF/TOF from Applied Biosystems. Peak lists were generated with Data Explorer 4.6 with default parameters and were submitted to the Mascot search engine with the NCBInr 20091010 database (9,868,855 sequences) restricted to human sequences.
Ten μg of each total and depleted protein sample was separated by SDS-PAGE on 4–20% Tris-glycine gels from Invitrogen and transferred to a nitrocellulose membrane (Whatman). The membranes were blocked for 1 h in PBS-Tween (pH 7.2) containing 5 mg/ml polyvinylpyrrolidone (Sigma) and incubated with the specific primary antibody (0.2 or 0.4 μg/ml) overnight. Immunoblots were developed by peroxidase-conjugated anti-mouse antibody from Southern Biotechnology Associates, Inc. and an ECL detection kit (Pierce). Protein bands were detected using different exposure times with a Gel Logic 1500 imaging system and Kodak MI software from Eastman Kodak (Rochester, NY). PageRuler Prestained Protein Ladder (Fermentas; Thermo Fisher Scientific) and MagicMarkXP Western protein ladder (Invitrogen) were used for molecular weight determination.
HEK293T cells at ~60% confluency were transfected with the plasmid, ORIG-SC321789, from OriGene (Rockville, MD) containing the True-Clone (Human NM_052972) for human LRG1 using turbofectin 8.0 (OriGene) according to the manufacturer's description. Cell supernatants were collected 24, 48, and 72 h after transfection and stored at −20 °C.
The experiment was based on the affinity of cytochrome c for LRG1 described in Ref. 24. The 96-well high-binding plates (Corning Inc.) were coated with cytochrome c from horse heart (Sigma) for 1 h at room temperature. Following washing, unbound sites were blocked with 0.5% BSA in PBS at room temperature for 30 min. Supernatants containing recombinant human LRG1, added to the wells at different dilutions, were incubated for 1 h at room temperature. After washing, specific antibodies were added at a concentration of 1 μg/ml and incubated for 1 h at room temperature. Following washing, all of the wells were incubated with horseradish peroxidase-coupled anti-mouse IgG (Southern Biotechnology Associates) for 30 min at room temperature. Finally, unbound proteins were removed by washing, and the plates were developed by adding freshly prepared substrate (H2O2) solution to each well containing tetramethylbenzidine (Sigma) as a chromogen. The reaction was terminated after 10 min of incubation, and the end point was read at 450 nm.
ELISA for CYFRA, CEA, and SCC were performed according to the manufacturer's instructions using respectively CYFRA 21–1 EIA, CanAg ACE (CEA) EIA, and CanAg SCC EIA kits from Fujirebio Diagnostics, Inc. (Malvern, PA).
All frozen and formalin-fixed paraffin-embedded surgical specimens used in this study were obtained from the research file of the Department of Pathology, University of Debrecen, utilizing protocols approved by the institutional review boards. Following hematoxylin-eosin staining of sections (4 μm), immunohistochemical labeling was carried out with the use of the Envision (biotin-free) peroxidase-based detection kit (Dako, Glostrup, Denmark) for mouse monoclonals using the purple VIP™ substrate-chromogen visualization from Vector Laboratories followed by either methyl-green or hematoxylin nuclear counterstaining, as previously described (25, 26). All of the antibodies were diluted in protein-free antibody diluent (Dako, Denmark) and found to be reactive on tissues within the range of 1–4 μg/ml on native/cryostat and paraffin sections, respectively. To check the staining specificity for the test mAbs, each run included sections for negative control staining where isotype-specific irrelevant control mAbs (Dako, Denmark) were used in place of the primary antibodies. Tissue reactivities for the mAbs were then scored as previously described (26).
In our reverse approach for biomarker discovery (monoclonal antibody proteomics) (27), we first generated nascent and cloned libraries of affinity reagents (mAb) against native plasma protein epitopes, and then via unbiased high throughput direct ELISA profiling, we identified mAbs that discriminate plasma samples of LC patients from plasma samples of healthy controls (Fig. 1a). To generate the appropriate immunogen (Fig. 1b), we depleted the 12 most abundant plasma proteins representing ~90% (w/w) of the pooled samples from twenty NSCLC cases (cohort I in Table I) by means of a commercially available immunoaffinity depletion column. The small clinical cohort included patients with advanced cancer disease from the two main histological subtypes of NSCLC, adenocarcinoma, and squamous cell carcinoma. Two strategies were used to obtain protein fractions of high complexity from the depleted plasma proteins (Fig. 1). In the first strategy, plasma was depleted of abundant proteins. As a second strategy, glycosylated proteins, known as a source of potential cancer biomarkers (28), were enriched using multi-lectin affinity chromatography (20). Both the depleted plasma and glycosylated protein enriched fractions were then subjected to immunoaffinity chromatography based normalization (21) aimed to reduce representational differences among the remaining plasma proteins and to maximize the number of protein species reaching the immunogenic threshold during immunization. Hybridoma supernatants containing at least 50 ng/ml IgG were then tested with biotinylated plasma tracers using direct ELISA in a series of screening experiments designed to identify antibodies that detect plasma biomarkers for LC (Fig. 1b). 
The technical variability of the direct ELISA assay, in six identical experiments, was 10.6% across the full range of signal intensities, equivalent to the performance of protein microarrays and significantly better than that of shotgun mass spectrometry protein profiling (29). The reproducibility of the tracer preparation evaluated by the independent processing and screening of plasma aliquots with a panel of mAbs was above 90% (supplemental Fig. 1).
In the initial screening step, all 1051 IgG-producing nascent hybridomas from two somatic cell fusions were tested with depleted and biotinylated pooled plasma from 21 healthy control subjects and the 20 NSCLC patients used for immunogen preparation (cohort I) (Fig. 2a). Plasma pooling was necessary to reduce the need for biological material (hybridoma supernatants and human plasma) and to reduce the biological variability of this step. A total of 184 nascent hybridoma supernatants were found to detect a normalized signal ratio higher than 1.5-fold between the disease and normal pools. The majority of antibodies (87%) detected a higher signal in the pooled LC plasma. The next screening steps involved testing the antibodies with individual plasma samples. First, the 184 candidate mAbs were examined with plasma from a small clinical cohort of 32 subjects (20 LC cases and 12 control subjects; Table I). The screening identified 61 hybridomas that discriminated (p < 0.05) the group of control subjects from subgroups of LC cases. The next screening step aimed to select those mAb candidates that separate mostly early stage LC from healthy subjects, thus securing the identification of biomarkers for early lung cancer detection. To this end, all of the 61 mAb candidates were screened with a larger and independent clinical cohort (cohort III; Table I and Fig. 2b) heavily weighted toward early stage (stage I and II) lung cancer. In this screening step, we identified 24 hybridomas that detected a difference of at least 15% between the apparent abundance of the cognate protein in plasma of the control and LC cases (based on the median group values, p < 0.05). The 13 best discriminating hybridomas were further characterized (supplemental Table 2). At this stage, all hybridomas were cloned at least two times. We sequenced the variable regions of the IgG heavy chain cDNAs, and the results confirmed the expectation that all mAbs were from independent clones (supplemental Fig. 2).
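The first selection step amounts to a fold-change filter on the pooled-plasma signals. The Python sketch below shows one way to express it, under the assumption that a hybridoma qualifies when its cancer-to-control ratio exceeds 1.5 in either direction (an interpretation consistent with 87% of the selected antibodies, rather than all of them, showing the higher signal in the LC pool). The hybridoma IDs and signal values are made up.

```python
def select_by_fold_change(signals, threshold=1.5):
    """Select hybridomas whose pooled signals differ by more than `threshold`-fold.

    `signals` maps a hybridoma ID to (cancer_pool_signal, control_pool_signal).
    Returns a dict of selected IDs and their cancer/control ratios.
    """
    selected = {}
    for name, (cancer, control) in signals.items():
        if cancer <= 0 or control <= 0:
            continue  # a ratio is undefined without two positive signals
        ratio = cancer / control
        if ratio > threshold or ratio < 1.0 / threshold:
            selected[name] = ratio
    return selected


# Hypothetical normalized signals for three hybridoma supernatants.
pools = {"hyb_A": (3.0, 1.0), "hyb_B": (1.1, 1.0), "hyb_C": (0.5, 1.0)}
hits = select_by_fold_change(pools)
up_in_cancer = [name for name, ratio in hits.items() if ratio > 1.0]
```

Here `hyb_A` is retained as increased in cancer, `hyb_C` as decreased, and `hyb_B` is discarded.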
Because the mAb libraries were generated against highly complex and quasi-representative mixtures of immunogenic proteins, mAb proteomics was not biased by a priori knowledge of cognate biomarker antigens. Cognate protein identification was performed at the distal phase of the process (30).
Ten mAbs that displayed clonal stability and high mAb yield were subjected to protein identification wherein immunoprecipitates were analyzed by shotgun mass spectrometry. The results revealed four different plasma proteins: Hpt beta chain for mAbs Bsi0071, CFH for mAb Bsi0271, Complement C9 (C9) for mAb Bsi0270 and Bsi0272, and leucine-rich α-2 glycoprotein 1 (LRG1) for mAb Bsi0351, Bsi0352, and Bsi0392. Peptide sequences from the MS experiments and the protein identifiers are summarized in supplemental Table 3. Visual pattern analysis of Western blots of SDS-PAGE gels of immunoprecipitates of mAbs from a cloned mAb library and subsequent sandwich ELISA screening suggested that mAbs Bsi0358 and Bsi0359 react with ACT, whereas Bsi0077 reacts with CFH. Protein identification was validated by ELISA experiments with purified natural plasma proteins (Fig. 3a) and recombinant LRG1 expressed in HEK293 cells (Fig. 3b).
Our results show that the 13 leading candidate mAbs recognized at least five different biomarkers, whose concentrations in the plasma range from 40 μg/ml (C9) to a few mg/ml (Hpt). Mimotope redundancy testing with phage display suggested that antigenic epitopes were all independent (supplemental Table 4).
To address the question of whether the protein biomarkers are detectable in the cancer cells and to assess their specificity for LC tissue, five mAbs (one for each analyte) were tested on 286 archived tissue microarray cores (each 3 mm in diameter) from 85 cases with histologically and immunohistochemically well characterized stage I and II NSCLC. In addition, small cell LC and 42 normal tissue samples representing 16 different tissue types including intact lung were also examined (supplemental Table 5). The analysis of LC and other normal tissue specimens indicated that four of the mAbs, Bsi0033 (Hpt), Bsi0077 (CFH), Bsi0272 (C9), and Bsi0358 (ACT) distinctly and specifically reacted with LC cells (Fig. 3c), whereas normal tissues including lung had no or low reactivity. Bsi0392 (LRG1) did not react with the tissue sections (Table II). Similar to previously published reports (31), the anti-Hpt-specific mAb (Bsi0033) was shown to preferentially stain adenocarcinoma as compared with squamous cell carcinoma (Table II).
Although direct ELISAs of the labeled plasma samples with the single mAb candidates do not provide absolute measurements of the biomarker concentrations, we have used the relative values (normalized rates from kinetic readings) obtained in the screening experiments to estimate the diagnostic capacity of each biomarker. The values measured in the plasma of 219 LC cases and 169 healthy controls of clinical cohort III are shown in Fig. 4 (a–e). An increase of biomarker concentration in LC cases is clear for LRG1, ACT, C9, and Hpt. In the case of CFH, we observed a slight decrease. There was no association of the level of the biomarkers with gender neither in the group of cases nor in the group of controls. Similarly, smoking habits evaluated in two groups (current and occasional smokers versus previous smokers and nonsmoking individuals) did not correlate with any of the biomarker levels (not shown).
The discrimination capacity of the selected mAbs as potential diagnostics tools was determined from the area under the receiver operating characteristics curves (AUC and ROC, respectively). Biomarkers with the best performance were determined by the AUC as LRG1 and ACT, followed by Hpt, C9, and CFH (Fig. 4f). The sensitivity at 95% specificity was different among the different biomarkers and ranged from 45% for LRG1 to 13% for CFH when estimated for all LC stages. The performance of the biomarkers was evaluated for early stage LC patients (stage I) and compared with three current biomarkers for LC CYFRA, CEA, and SCC (32). From the biomarkers reported here, LRG1 had again the best performance with an AUC of 0.78, better than that of CEA (0.70) and SCC (0.67) but lower than CYFRA (0.84) (Fig. 4g).
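The two performance measures used above, the AUC and the sensitivity at a fixed 95% specificity, can be computed directly from the per-sample signals. The Python sketch below shows both under the assumption that a higher signal indicates cancer (for a marker that decreases in LC, such as CFH, the signal would be negated first). The function names and example values are hypothetical and are not data from the study.

```python
def auc(cases, controls):
    """Probability that a random case scores above a random control (ties count 0.5)."""
    wins = 0.0
    for c in cases:
        for h in controls:
            if c > h:
                wins += 1.0
            elif c == h:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def sensitivity_at_specificity(cases, controls, specificity=0.95):
    """Fraction of cases above the lowest cut-off that reaches the requested specificity."""
    for cutoff in sorted(set(controls)):
        # Specificity: fraction of controls at or below the cut-off (called negative).
        spec = sum(1 for h in controls if h <= cutoff) / len(controls)
        if spec >= specificity:
            return sum(1 for c in cases if c > cutoff) / len(cases)
    raise ValueError("no cut-off reaches the requested specificity")


# Hypothetical signals: 20 controls and 5 cases.
control_signals = list(range(1, 21))
case_signals = [15, 18, 20, 25, 30]

area = auc(case_signals, control_signals)
sens95 = sensitivity_at_specificity(case_signals, control_signals)
```

Reporting sensitivity at a fixed high specificity, rather than the AUC alone, reflects the screening setting, where false positives among healthy subjects must be kept low.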
The apparent protein redundancy observed at the biomarker candidate level prompted us to screen a complex (>600 mAbs) cloned plasma proteome-specific mAb library for the identification of partner mAbs to develop robust quantitative sandwich assays for each biomarker protein (Table III). Testing multiple candidates for each mAb resulted in assays for four of the biomarkers (Hpt, C9, ACT, and CFH). The library did not contain matched pair antibodies for LRG1 (24). Notably, the initially identified mAbs were also capable of reacting in sandwich format for three of the biomarkers (Hpt, CFH, and C9), suggesting recognition of structurally independent protein epitopes by the respective mAbs. In the case of ACT and LRG1, the mAbs were probably generated against structurally overlapping epitopes because they were not able to form matched pairs. The epitope mapping experiment using peptide libraries showed that two of the mAbs against LRG1 (Bsi0351 and Bsi0352) had some sequence similarity (22%) between the peptide sets (supplemental Table 4), also confirming similar mimotope affinities. The two assays developed for this biomarker, using either cytochrome c or a previously published antibody 2F5.A2 (24) as a sandwich partner, provided similar results (Table III). Because of the better dynamic range of the sandwich assays, all biomarkers except CFH showed improved discrimination between LC and healthy samples. In contrast to the single mAb-driven capture assays, the sandwich ELISA assays do not require a sample preparation step, and thus they are much more suitable for clinical applications as in vitro immunodiagnostics tools.
Using support vector machines and 10-fold cross-validation of the results generated on clinical cohort III (all LC stages included), we have determined a panel classifier composed of the five markers reported here. As expected, the panel increased the accuracy of the prediction (AUC of 0.88) compared with individual markers and provided specificity of 87% with a sensitivity of 77% (Fig. 4h, solid red line). Interestingly, of the known biomarkers, only CYFRA provided additional performance to the panel for discriminating stage I NSCLC patients and healthy controls. The results indicate that the combined performance with CYFRA is superior (Fig. 4h, solid black line) with an AUC of 0.93 and a sensitivity of 84% at 95% specificity across all stages of NSCLC. For stage I NSCLC, we observed slightly lower performance than that for more advanced stages (Fig. 4h, inset) but could still achieve a sensitivity of 83% at 95% specificity. To verify the performance of the panel on an independent collection, we have further measured plasma samples from an independent clinical cohort of 45 LC cases and 63 age-matched healthy controls (clinical cohort IV). The plasma samples were collected at a single clinical center and processed using a different analytical procedure as compared with clinical cohort III. The estimated performances of both panels were very similar both in terms of AUC and diagnostic performance (Fig. 4h, dotted lines), suggesting that the biomarker panel is not strongly dependent on the type of analytical preprocessing as reported for other plasma biomarker panels (33).
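The evaluation scheme described above, a support vector machine assessed by 10-fold cross-validation, can be sketched as follows. This is an illustration under stated assumptions, not the analysis pipeline of the study: the kernel and settings are not given here, so the sketch substitutes a plain linear SVM trained by Pegasos-style stochastic subgradient descent, assumes standardized marker values, and runs on synthetic five-marker data. All function names are hypothetical.

```python
import random


def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))


def train_linear_svm(X, y, lam=0.01, epochs=50, seed=0):
    """Pegasos-style subgradient descent on the L2-regularized hinge loss.

    Labels must be +1 or -1. A constant 1.0 is appended to every sample so
    that the bias is learned (and regularized) as the last weight.
    """
    rng = random.Random(seed)
    Xa = [list(x) + [1.0] for x in X]
    w = [0.0] * len(Xa[0])
    order = list(range(len(Xa)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decreasing step size
            violated = y[i] * dot(w, Xa[i]) < 1.0  # margin check with current w
            w = [wj * (1.0 - eta * lam) for wj in w]  # shrink (regularization)
            if violated:
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, Xa[i])]
    return w


def stratified_folds(y, k=10, seed=0):
    """Assign each sample to one of k folds, balancing cases and controls."""
    rng = random.Random(seed)
    fold_of = [0] * len(y)
    for label in (-1, 1):
        members = [i for i, yi in enumerate(y) if yi == label]
        rng.shuffle(members)
        for pos, i in enumerate(members):
            fold_of[i] = pos % k
    return fold_of


def cross_validated_scores(X, y, k=10):
    """Score every sample with a model that never saw it during training."""
    fold_of = stratified_folds(y, k)
    scores = [0.0] * len(y)
    for fold in range(k):
        train = [i for i in range(len(y)) if fold_of[i] != fold]
        w = train_linear_svm([X[i] for i in train], [y[i] for i in train])
        for i in range(len(y)):
            if fold_of[i] == fold:
                scores[i] = dot(w, list(X[i]) + [1.0])
    return scores


def pooled_auc(scores, y):
    """Probability that a case outscores a control (ties count one half)."""
    cases = [s for s, yi in zip(scores, y) if yi == 1]
    controls = [s for s, yi in zip(scores, y) if yi == -1]
    wins = sum(1.0 if c > h else 0.5 if c == h else 0.0
               for c in cases for h in controls)
    return wins / (len(cases) * len(controls))


# Synthetic data: 100 "cases" and 100 "controls", five standardized markers.
data_rng = random.Random(42)
X = [[data_rng.gauss(1.0, 1.0) for _ in range(5)] for _ in range(100)]
X += [[data_rng.gauss(-1.0, 1.0) for _ in range(5)] for _ in range(100)]
y = [1] * 100 + [-1] * 100

scores = cross_validated_scores(X, y)
cv_auc = pooled_auc(scores, y)
print("cross-validated AUC on synthetic data:", round(cv_auc, 3))
```

Because each sample is scored only by a model trained on the other nine folds, the resulting AUC estimates performance on unseen subjects rather than the fit to the training data. Verification on a separately collected cohort, as done with clinical cohort IV, remains the stronger test.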
Hpt, LRG1, ACT, and C9 have been reported as acute phase proteins (34, 35). As a regulatory protein of the complement cascade, CFH also plays an important role in inflammation. Therefore, we tested acute pneumonia patients with elevated C-reactive protein levels and a group of patients with other inflammatory and auto-immune benign pulmonary diseases (clinical cohort IV, other LD). Although the COPD, fibrosis, and sarcoidosis patient groups were not distinguishable from controls, the calculated lung cancer index was slightly elevated in pneumonia patients (Fig. 4i).
We have developed a widely applicable approach for the generation of fit-for-purpose affinity reagents in the form of mAbs directed against disease-associated biomarkers requiring no a priori knowledge of disease biology and have demonstrated its utility using lung cancer as an example. Four of the five lung cancer-associated biomarkers reported here have been previously associated with cancer with varying levels of experimental evidence, whereas quantitative studies demonstrating complement C9 as a cancer biomarker in plasma were not reported. Haptoglobin in general, as well as the differentially glycosylated forms of this abundant plasma protein, has the strongest clinical evidence as a general cancer biomarker (36, 37). In LC, the levels of serum haptoglobin and its different glycoforms have been demonstrated as potential biomarkers for both small cell lung cancer (38) and NSCLC (14). We have also published results previously obtained with antibody Bsi0033, for which we showed a higher affinity for glycosylated forms of haptoglobin (30). There is also strong evidence for an association of CFH with bladder (39) and ovarian cancers (40). Autoantibodies to CFH were reported in sera of patients with early stage NSCLC (41). The association of the increased level of ACT in plasma/serum with pancreatic cancer was demonstrated in two proteomics studies (42, 43). In LC, the specific staining of NSCLC tumors with a monoclonal antibody that recognized ACT was reported and demonstrated that the tumor cells synthesized ACT de novo (16). Our results confirmed this observation and also provide evidence that early stage LC is associated with an increased plasma concentration of ACT. The increase in serum/plasma levels of LRG1 has been associated with the presence of ovarian cancer (44, 45) as well as lung cancer in two semi-quantitative proteomics studies (15, 46) with nine and six patients, respectively. 
We are thus providing the first independent validation of this biomarker using a quantitative immunoassay on two independent clinical cohorts with 264 LC cases and 232 healthy controls. The C9 component of the complement system has not been associated as a plasma marker with cancers of any type.
Three of the biomarkers identified by our study (C9, Hpt, and ACT) are seemingly acute phase proteins (34), and there is some recent evidence that LRG1 is also induced by human IL-6 and synergistically up-regulated with either IL-1β or TNF-α in a pattern similar to those of type 1 acute phase proteins (35). The combination of increased plasma levels of Hpt, ACT, and LRG1 was recently shown to occur in patients with ovarian carcinoma (44). The increase of the levels of other acute phase proteins, such as C-reactive protein and serum amyloid protein, was associated with LC in several independent reports (47–49). It is now well accepted that cancer induces an inflammatory reaction, the nature and the extent of which is variable and depends on the cancer type (50). Interestingly, a generalized inflammatory reaction, as deduced from nonspecific inflammation markers, appears to play a particularly important role in lung cancer, and it is a strong negative prognostic factor for overall survival (51). Anti-inflammatory medication, such as corticosteroid treatment of COPD patients, has been reported to decrease LC incidence (52), supporting the notion that the inflammatory reaction promotes cancer even at the initial stages. Another recent study showed that inflammatory responses to environmental exposures such as tobacco smoke also correlate with an increased risk of LC (53). However, the profiles of up-regulated acute phase proteins were reported to be specific both in the case of ovarian cancer (54) and in the case of LC, because other inflammatory lung diseases such as COPD did not reveal increases of the same proteins (47).
In the prototype acute inflammatory clinical condition, bacterial pneumonia, associated with elevated levels of C-reactive protein, we detected some up-regulation of Hpt, C9, LRG1, and ACT, suggesting that the mechanisms that induce transcriptional activation and increase the production rates of these biomarkers share some but not all regulatory elements with the inflammatory response. The fact that we also detected four of the biomarkers in the cancer cells by immunohistology strongly suggests that proinflammatory stimuli, e.g., cytokines, in concert with yet unknown cancer-specific mechanisms, actually induce ACT, C9, Hpt, and CFH expression by the tumor tissue.
Considering the circulating levels (~0.1–0.5 mg/ml) of the markers and the size of a stage I LC mass (1–2-cm diameter), it is unlikely that the tumor alone produces the markers detected in the plasma. Instead, it is more likely that as a result of the local inflammatory response, the surrounding lung tissue or more distant sites also contribute to the increase in the levels of the circulating biomarkers. Inclusion of CYFRA (55) together with the biomarkers reported here increased the diagnostic performance. In addition, keratin fragments detected by the CYFRA assay are released from the cancer cells (56), which is likely to eliminate the potential confounding effect of inflammation. Results from the National Lung Screening Trial of over 54,000 current and former smokers using low dose spiral computed tomography (LD CT) demonstrate that LC is detectable at early asymptomatic stages (3), leading to a 20% reduction in cancer deaths (4). However, LD CT suffers from low specificity because roughly 95% of the patients with solitary pulmonary nodules detected by LD CT did not actually have detectable cancer. To reduce unnecessary interventions because of overdiagnosis, there is an imminent need for a clinical test that provides the required specificity in conjunction with LD CT screening. Although the panel identified in this study needs further validation on a larger clinical cohort and may not be suitable as a stand-alone test, its combination with CYFRA may provide sufficient specificity to complement imaging in routine use and population screening for lung cancer.
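The order-of-magnitude argument at the start of this paragraph can be made explicit with a short calculation. The concentrations and tumor diameters are those quoted above; the plasma volume of about 3 liters and the tissue density of 1 g/ml are assumptions introduced here for illustration and are not taken from the study.

```python
import math

# Assumptions for illustration only (not from the study):
PLASMA_VOLUME_ML = 3000.0        # approximate adult plasma volume
TISSUE_DENSITY_G_PER_ML = 1.0    # soft tissue treated as water-like


def tumor_mass_g(diameter_cm: float) -> float:
    """Mass of a spherical tumor of the given diameter."""
    radius = diameter_cm / 2.0
    volume_ml = 4.0 / 3.0 * math.pi * radius ** 3  # 1 cm^3 equals 1 ml
    return volume_ml * TISSUE_DENSITY_G_PER_ML


def circulating_pool_g(conc_mg_per_ml: float) -> float:
    """Total amount of one marker in the plasma at the given concentration."""
    return conc_mg_per_ml * PLASMA_VOLUME_ML / 1000.0


for diameter in (1.0, 2.0):
    print(f"tumor of {diameter:.0f} cm diameter: {tumor_mass_g(diameter):.2f} g")
for conc in (0.1, 0.5):
    print(f"marker at {conc} mg/ml: {circulating_pool_g(conc):.2f} g in plasma")
```

Under these assumptions the circulating pool of a single marker (roughly 0.3 to 1.5 g) is of the same order as the entire mass of a stage I tumor (roughly 0.5 to 4 g). This is consistent with the conclusion that tissue other than the tumor must contribute to the plasma levels.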
In conclusion, our hypothesis-free approach of screening the plasma proteome with hybridoma libraries not only delivered results consistent with previously reported lung cancer biomarkers but also uncovered new ones. Increasing the size of the initial libraries as well as the sensitivity of the screening assay is expected to increase the number of identified biomarkers because it is unlikely that the majority of specific lung cancer markers are among the high and medium abundance proteins. It is necessary to point out here that the screening steps performed in the study were designed to separate cancer from control cases. Other monoclonal antibodies that would detect not only the presence of the disease but the histological subtype as well can be identified by generating and screening the libraries with alternative hypotheses. The observed redundancy of antibodies generated against the same biomarkers also suggested that the global number of lung cancer-specific biomarker epitopes may be higher than reported here. Further characterization of the specificity of the LC-associated mAbs toward potential lung cancer-specific forms of the identified biomarker proteins should provide more insight into the biology of the disease. In principle, the development of antibodies for diagnostics is a difficult process with several limitations. One of the major strengths of our approach is that it inherently couples biomarker discovery with in vitro diagnostic development. Because the discovery phase is already driven by mAbs, translation, including the generation of reagents suitable for the development of specific in vitro diagnostic grade immunoassays, could be fast and effective.
We thank Edward M. Rubin (Lawrence Berkeley National Laboratory and Joint Genome Institute) for critically reviewing the manuscript; David Page from the University of Wisconsin for an introduction to Weka and SVM; Prof. G. Bolbach and T. Blasco from the mass spectrometry facilities of University Pierre et Marie Curie for the mass spectrometry experiments; Gabor Zahuczky from University of Debrecen Genomed (Debrecen, Hungary) for the V-region sequencing; and Sándor Cseh from TargetEx Ltd. (Dunakeszi, Hungary) for the phage display experiments. We also thank Jozsef Lazar from BSI Kft. for help with the recombinant LRG1 and Celine Leclerc and Alexandra Kremeurt from BSI SAS for measurements of the known LC biomarkers. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
A. G., B. L. K., M. G. K., I. K., W. H., N. T., C. M. B., A. J., Y. K., and L. T. hold shares and warrants of Biosystems International SAS. M. G. K., I. K., W. H., N. T., J. K., C. M. B., A. J., Y. K., and L. T. are employees of Biosystems International.
L. T. conceived the experiments; M. G. K., B. L. K., and L. T. designed the experiments; I. K. and A. G. contributed to data interpretation of immunologic and separation experiments, respectively; W. H., I. K., and J. K. contributed to immunization, antibody production and characterization; A. J. contributed to specific experimental design and execution of high throughput hybridoma generation and cloning; N. T. contributed to specific experimental design and execution of high throughput screening; C. M. B. and M. H. contributed to specific experimental design and execution of sample and immunogen preparation; C. M. B. contributed to specific experimental design and execution of library screening for antibody partners; Y. K. and M. G. K. contributed to specific design and execution of statistical analysis; E. C. contributed to specific clinical protocol design and execution of sample collection; B. D. executed and analyzed the IHC experiments; and M. G. K. and L. T. wrote the manuscript.
* This work was supported in part by National Institutes of Health Grant CA126220, National Institutes of Health Grant GM15847 (to B. L. K.), Contribution 991 from the Barnett Institute (to B. L. K.), and Grant OTKA K-81839 from the Hungarian Government (to G. A.).
This article contains supplemental text, Tables S1–S5, and Figs. S1 and S2.
1 The abbreviations used are: