We investigated the potential of in-depth quantitative proteomics to reveal plasma protein signatures that reflect lung tumor biology. We compared plasma protein profiles of four mouse models of lung cancer with profiles of models of pancreatic, ovarian, colon, prostate, and breast cancer and two models of inflammation. A protein signature for Titf1/Nkx2-1, a known lineage-survival oncogene in lung cancer, was found in plasmas of mouse models of lung adenocarcinoma. An EGFR signature was found in plasma of an EGFR mutant model, and a distinct plasma signature related to neuroendocrine development was uncovered in the small-cell lung cancer model. We demonstrate relevance to human lung cancer of the protein signatures identified on the basis of mouse models.
Mouse models of cancer, in particular genetically engineered mouse models (GEMs) in which genetic alterations in oncogenes and tumor suppressors known to be mutated in human cancer are introduced, recapitulate many features of the human disease on a more uniform genetic background. For example, mouse models have been used to study the role of mutant KRAS and EGFR in the genesis of lung adenocarcinomas. As in humans with EGFR mutant lung cancer, mice with this disease respond to treatment with a tyrosine kinase inhibitor (Politi et al., 2006). Similarly, the deletion of Rb and Trp53, genes that are inactivated in the vast majority of human small-cell lung cancers (SCLCs), in the murine lung leads to the development of tumors that recapitulate the features of this histological subtype (Meuwissen et al., 2003; Schaffer et al., 2010).
To determine whether we could identify plasma protein signatures that are common to lung adenocarcinomas or that reflect pathways driving tumor development, we examined the plasma proteomes of several mouse lung tumor models, in comparison with the plasma proteomes of models of other tumor types. We also extended these findings to human patients with lung cancer.
Plasma proteins from tumor-bearing mice and age-matched littermate controls were subjected to quantitative profiling by mass spectrometry as previously described (Faca et al., 2008) (see Table S1, which is available with this article online). Plasma was collected from three well-characterized models of lung adenocarcinoma: TetO-EGFRL858R/CCSP-rtTA (Lung-EGFR) (Politi et al., 2006), TetO-Kras4bG12D/CCSP-rtTA (Lung-Kras) (Fisher et al., 2001), and urethane treated (Lung-Urethane) (Horio et al., 1996). Plasma was collected 5–6 weeks (Lung-EGFR) and 5–6 months (Lung-Kras) after starting doxycycline treatment, representing early and intermediate stages of tumorigenesis, respectively (Fisher et al., 2001; Politi et al., 2006). For the urethane mouse model, plasma was collected at 38–42 weeks of age from mice bearing multiple adenomas and adenocarcinomas that harbor frequent Kras and Trp53 mutation (Horio et al., 1996). Plasma was also collected from a mouse model of small-cell lung cancer (Lung-SCLC) (Meuwissen et al., 2003), aged for 6–9 months following AdCre infection. In the SCLC model, AdCre-infected Trp53lox/lox; Rblox/lox mice develop neuroendocrine tumors that resemble SCLCs histologically (Meuwissen et al., 2003). The resulting proteomic data from tumor-bearing mice and matched controls for all four lung cancer models were compared with corresponding data from other models for which plasma proteome data was available to delineate lung cancer signatures. 
They consisted of two mouse models of breast cancer (MMTV-rtTA/TetO-NeuNT; (Breast-HER2) [Moody et al., 2002; Pitteri et al., 2011] and Tg(MMTV-PyMT)634Mul; (Breast-PyMT 0.5 cm) and (Breast-PyMT 1.0 cm), representing early and late stages of tumor development, respectively [Pitteri et al., 2008]), two prostate cancer models (Ptenpc−/− mice bearing nonmetastatic tumors compared with Ptenpc−/−;Smad4pc−/− mice bearing metastatic tumors; (Prostate-Strain Comparison) [Ding et al., 2011]), and one model each of colon (ApcΔ580/+; (Colon-Apc) [Hung et al., 2009]), ovarian (LSL-KrasG12D/+;PtenloxP/loxP; (Ovary-Kras/Pten) [Dinulescu et al., 2005; Pitteri et al., 2009]) and pancreatic ductal adenocarcinoma (PDAC) (Pdx1-Cre;LSL-KrasG12D;Ink4a/Arflox/lox; (Pancreas-PanIN [pancreatic intraepithelial neoplasia]) and (Pancreas-PDAC), representing early and late stages of tumor development, respectively [Faca et al., 2008]). Additionally, the plasma proteomes of two mouse models of inflammatory disease were assessed to ascertain whether alterations observed in the plasma of the tumor models were related to general inflammatory processes. These included a mouse model bearing necrotic granulomatous tissue induced by carrageenan-sponge implantation (Confounder-Acute Inflammation) and a model consisting of type II collagen-induced arthritis (Confounder-Chronic Inflammation) (Kelly-Spratt et al., 2011).
For each model, we quantified the relative concentrations of proteins in case and control plasmas by differential cysteine alkylation in intact proteins using isotopically labeled acrylamide (Faca et al., 2006). We identified protein products from 5361 unique genes with <5% false discovery rate from ~13 million spectra in 14 proteomic profiling experiments with plasmas from individual mouse models (Table S2). Case:control protein concentration ratios were calculated according to identification of cysteine-containing peptides from the protein products of 2261 unique genes (Table S3).
Unsupervised hierarchical clustering analysis of quantified plasma proteins revealed clustering of models by organ type, with all four lung cancer models clustering together as well as the two breast cancer models, suggestive of organ-type specific plasma protein signatures (Figure 1). To further explore protein signatures associated specifically with lung adenocarcinoma, the plasma proteome data was analyzed for proteins that were significantly elevated (case:control ratio > 1.25; p < 0.05, t test) in mice bearing lung adenocarcinoma tumors or only identified in tumor bearing mice from at least two of the three lung adenocarcinoma models, but not in other mouse models. This filtering strategy yielded 13 proteins (Table 1; Lung adeno subgroup in Table S4) including the known lung surfactant proteins (Sftpb and Sftpd) (Pérez-Gil, 2008). Three additional proteins (Igsf4a, Ppbp, and Prtg) exhibited increased levels in mice bearing adenocarcinomas as well as in mice bearing SCLC tumors but not in nonlung tumor models, thus representing proteins more broadly associated with lung cancer (Table 1; Lung cancer subgroup in Table S4). Independent evidence of PPBP as a lung cancer marker has been reported on the basis of human studies (Yee et al., 2009). Proteins with significantly decreased concentrations (case:control ratio < 0.8; p < 0.05, t test) were also identified in the lung adenocarcinoma models (Table S4). Of potential broader cancer relevance, we identified 16 additional proteins that were altered across multiple cancer models, but not in confounder mouse models, and thus represent potential broad epithelial tumor signatures (Common subgroup in Table S4). One such protein is Wfdc2, which has been associated with ovarian cancer (Hellström et al., 2003). Of interest, WFDC2 was found to be expressed in human lung tumor tissue, although not previously known to be elevated in lung cancer plasmas (Bingle et al., 2006; Galgano et al., 2006). 
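The filtering rule described above (elevated or case-only in at least two of the three lung adenocarcinoma models, and not elevated in any other model) can be sketched in a few lines. This is only an illustrative sketch: the data layout, the function names, and the treatment of every non-adenocarcinoma model as "other" are assumptions made for the example, not the analysis code used in the study.

```python
# Hypothetical data layout: {protein: {model: (case_control_ratio, p_value)}}.
# A ratio of None marks a protein identified only in tumor-bearing mice.
LUNG_ADENO = {"Lung-EGFR", "Lung-Kras", "Lung-Urethane"}


def is_elevated(ratio, p_value, min_ratio=1.25, alpha=0.05):
    """True if the protein is case-only or significantly elevated."""
    if ratio is None:  # identified only in tumor-bearing mice
        return True
    return ratio > min_ratio and p_value < alpha


def lung_adeno_specific(profiles, min_models=2):
    """Proteins elevated in >= min_models lung adenocarcinoma models
    and elevated in none of the remaining models."""
    selected = []
    for protein, by_model in profiles.items():
        hits = {m for m, (r, p) in by_model.items() if is_elevated(r, p)}
        in_adeno = hits & LUNG_ADENO
        elsewhere = hits - LUNG_ADENO
        if len(in_adeno) >= min_models and not elsewhere:
            selected.append(protein)
    return sorted(selected)
```

Relaxing the `elsewhere` check to tolerate the Lung-SCLC model would correspond to the broader lung cancer subgroup mentioned in the text.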
Another subset of seven proteins was elevated in plasmas from both cancer models and the inflammatory models (Inflammation subgroup in Table S4). This subset included platelet-derived chemokine Cxcl4/Pf4, a cleaved form of which has been recently reported as associated with breast ductal carcinoma in situ (Solassol et al., 2010) and Angptl3, which has not been previously associated with lung cancer. Angptl3 plays a role in lipid metabolism (Koishi et al., 2002), induces angiogenesis (Camenisch et al., 2002), and has been found to be associated with the sera of ovarian cancer (Lin et al., 2009).
Next, we examined whether proteins with increased levels in plasmas from lung tumor-bearing mice are expressed in lung cancer cells and may be released into the extracellular space (Faça and Hanash, 2009). To this end, the proteomes of 21 human lung adenocarcinoma cell lines were analyzed by mass spectrometry to identify proteins in whole cell extracts, and proteins specifically on the cell surface or released into the medium. These comparisons revealed that 25 of 39 proteins found at increased levels in plasmas from mice with lung adenocarcinoma (Table S5) were identified in conditioned media, and an overlapping set of 26 proteins were found in the cell surface compartment of lung adenocarcinoma cell lines. Twenty one of the 39 proteins were enriched more than 5-fold in the conditioned media compared to their abundance in whole cell extracts (Table 1 and Table S5). Together, these profile comparisons suggest that lung cancer cells are a likely contributing source for the increased levels of these proteins in plasmas from mice with lung adenocarcinomas.
Sftpb and Sftpd, which were elevated in lung adenocarcinoma plasma profiles, are secreted by alveolar cells and play an essential role in lung function by reducing surface tension (Pérez-Gil, 2008). These proteins are encoded by genes that are known targets of Titf1/Nkx2-1, a master transcription factor of peripheral airway cells and a known lineage-specific determinant of survival in lung cancer cells (Tanaka et al., 2007; Weir et al., 2007). Expression of Titf1/Nkx2-1 protein in lung tumors was confirmed by immunohistochemistry (Figure 2A). On this basis, we speculated that additional targets of Titf1/Nkx2-1 might be represented at increased levels in the plasma proteomes of the lung adenocarcinoma models. To further delineate a Titf1/Nkx2-1 plasma proteome signature, we examined mRNA microarray data from a set of 111 human NSCLC tumors, including both adenocarcinomas and squamous cell carcinomas (accession number, GSE3141; Bild et al., 2006). Orthologous genes encoding four proteins with increased levels in the murine lung adenocarcinoma plasmas (SFTPB, SFTPD, NPC2, and WFDC2) exhibited a strong positive correlation with TITF1/NKX2-1 mRNA levels (Spearman correlation coefficient > 0.6; p < 0.00001) (Table S6). NPC2, a cholesterol transfer protein implicated in Niemann-Pick disease (Vanier and Millat, 2004), was not previously known to be elevated in lung cancer plasma. Expression of the NPC2 gene showed a strong correlation with TITF1/NKX2-1 mRNA across the set of human lung tumors (Spearman correlation coefficient = 0.64), providing supportive evidence for NPC2 as a potential target of this transcription factor in lung cancer cells. Immunohistochemical staining for Npc2 and Sftpb in mouse lung cancer tissue (Figure 2A) indicated that these proteins as well as Titf1/Nkx2-1 are coexpressed in lung tumor cells.
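For readers unfamiliar with the statistic, the Spearman coefficient used for the mRNA comparisons is the Pearson correlation of rank-transformed values. The following is a minimal standard-library sketch; the function names are illustrative, and it does not compute the accompanying p value or reproduce the authors' software.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

In this setting, `x` would hold TITF1/NKX2-1 expression values across tumors and `y` the expression values of a candidate target gene in the same tumors.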
Additional data associating NPC2 with lung cancer included identification of NPC2 in media of all 21 lung adenocarcinoma cell lines (Table S5) and occurrence of NPC2 among 300 proteins found in pleural effusions associated with lung cancer (Pernemalm et al., 2009).
To further determine the role of Titf1/Nkx2-1 in the expression of genes encoding proteins found at elevated levels in plasmas from lung cancer models, we examined the effects of inhibiting Titf1/Nkx2-1 in two lung adenocarcinoma cell lines with high expression of TITF1/NKX2-1: HCC4019 (KRAS mutant) and H3255 (EGFR L858R mutant) (Figure 2B). TITF1/NKX2-1 knockdown experiments were performed twice with each cell line. Gene expression analysis yielded a total of 964 genes with average case:control RNA ratios of less than 0.75 in one or both cell lines following knockdown. Plasma data were available for protein products of 34 genes among the 964 potential TITF1/NKX2-1-regulated genes, including Sftpa1, Sftpb, Sftpd, and Npc2 (Table S7). The average plasma ratios of these proteins in tumor-bearing mice from lung adenocarcinoma mouse models relative to controls were significantly higher than plasma ratios for other mouse models (mean ± standard deviation: 2.28 ± 0.07 versus 1.20 ± 0.37; p = 0.0005, t test) (Figure 2C), providing further supporting evidence for a Titf1/Nkx2-1 regulated protein set in plasmas of mice with lung adenocarcinoma. We also observed a strong correlation between expression of genes in this set and TITF1/NKX2-1 mRNA levels in NSCLC tumors according to expression array analysis (Table S7), further supporting their regulation by TITF1/NKX2-1 transcription factor.
We searched for dominant networks among proteins with significantly altered levels in plasmas from genetically engineered lung cancer mouse models based on Ingenuity Pathway Analysis (http://www.ingenuity.com/). We observed 21, 15, and 17 networks with statistical scores of more than 10 (score = −log10 [p value] based on the right-tailed Fisher’s exact test) in EGFR, Kras, and SCLC lung cancer mouse models, respectively. Protein networks involving TGFβ were observed in plasma with high significance among all lung cancer models (network #4 in the Lung-EGFR model; #5 in the Lung-Kras model; #2 in the Lung-SCLC model) (Table S8). NFκB was also found in networks with high significance among all three models. Networks involving TGFβ or NFκB may reflect a host response to the tumors based on the associated proteins and/or may reflect functional roles for these pathways in lung tumor formation. NFκB signaling has been shown to be required for mouse lung adenocarcinoma development (Meylan et al., 2009). Interestingly a network involving Egfr was the most statistically significant network (score = 38) in the EGFR mutant mouse model (Figure 3A), whereas it was observed as network #8 (score = 23) in the Kras mutant mouse model. A network involving Egfr was also highly significant (score = 41) in the SCLC model (Figure S1A). Messenger RNA expression of Egfr was found to be elevated in tumor tissue compared to normal lung tissue in this SCLC model (accession number, GSE18534; Schaffer et al., 2010). Thus, this SCLC model may recapitulate EGFR-positive human SCLC, which occurs in ~40% of SCLC (Schmid et al., 2010). The plasma Egfr network observed in the EGFR mutant mouse model includes proteins that bind directly with EGFR (Met, Cd44, Cdh1, Ndn, Sh3bgrl, and Rin1) and proteins that interact indirectly with EGFR (Adam10 and Trak2/Als2cr3). 
Among proteins that bind directly with EGFR, Met, Cdh1, and Sh3bgrl were predominantly elevated in tumor bearing mice from the EGFR mutant mouse model (Table S3). Moreover CDH1 occurred at higher levels in EGFR mutant lung adenocarcinoma cell lines than in KRAS mutant lung adenocarcinoma cell lines (Figure 3B). Interestingly, one form of Cd44 (IPI00265503) was only identified in the EGFR mutant mouse model. This form represents a longer Cd44 variant (Figure 3C). Proteomic analysis of 21 lung adenocarcinoma cell lines revealed that peptides encoding the CD44 variant region were identified in 4 of 10 EGFR mutant cell lines, but none of 11 KRAS mutant cell lines (p = 0.0351, Fisher’s exact test) (Figure S1B). Concordantly, alternative splicing RNA variants of CD44 were associated with EGFR mutant compared with KRAS mutant lung adenocarcinoma cell lines (p = 0.0237, Fisher’s exact test) (Figure 3D).
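The CD44 variant comparison above (4 of 10 EGFR mutant versus 0 of 11 KRAS mutant cell lines) can be checked directly from the hypergeometric distribution. The sketch below assumes a two-sided test, which the text does not specify; for these counts the one-sided and two-sided values coincide, and the result agrees with the reported p = 0.0351.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):  # P(top-left cell == x) under the null hypothesis
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # small relative tolerance guards against floating-point ties
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


# Rows: EGFR mutant, KRAS mutant. Columns: variant detected, not detected.
p = fisher_exact_two_sided(4, 6, 0, 11)
print(round(p, 4))  # 0.0351
```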
Interestingly, although proteins that form an Egfr network were observed in plasma from tumor-bearing mice in the EGFR mouse model, levels of endogenous mouse Egfr in plasma from tumor-bearing mice in this model as well as the two other lung adenocarcinoma models were reduced (Table S4) compared to other models (p = 0.0018, t test) (Figure 3E). We hypothesized that soluble factors produced by lung tumors and/or through a host response may induce down-regulation of circulating levels of soluble Egfr, a major source of which in the mouse is the liver. Indeed, Egfr mRNA expression levels in liver from tumor-bearing EGFR mutant mouse model were decreased compared to controls (data not shown). In view of reports that IL-6 is induced at high levels in some primary human lung adenocarcinomas (Gao et al., 2007) and that its levels are increased in the plasma of patients with lung cancer (De Vita et al., 1998; Kaminska et al., 2006), although IL-6 was not detected by mass spectrometry in plasmas from mouse models possibly because of assay sensitivity, we examined the systemic effect of IL-6 administration on plasma levels of soluble Egfr. A similar reduction in plasma Egfr levels was observed with IL-6 treatment as in plasmas from lung tumor-bearing mice (Figure 3E), consistent with our hypothesis that soluble factors may induce down-regulation of Egfr in host tissues with lung tumor development.
To assess protein levels in the context of treatment-induced changes in tumor volume, we compared plasma protein profiles of tumor-bearing mice carrying the EGFRL858R transgene before and after erlotinib treatment (Lung-Regression model; Politi et al., 2006). CCSP-rtTA; EGFRL858R mice were fed doxycycline for 5–6 weeks to induce tumors and were subsequently treated with 25 mg/kg/day of erlotinib for two weeks, after which plasma was collected. Plasma samples from mice in which tumor regression was documented were analyzed and compared to their control erlotinib-treated littermates. In total, levels of 91 of 164 proteins that were either elevated or decreased in plasmas from tumor-bearing mice in the EGFR model and that were quantified in the regression model, returned toward baseline (Figure 3F and Table S9). Thus, using this approach, we have identified proteins (e.g., Npc2 and Adam10) whose abundance reflects tumor progression and regression status in the EGFR model.
SCLC exhibits distinct molecular characteristics compared to lung adenocarcinoma. Analysis of the plasma proteome of the SCLC mouse model yielded 116 proteins that were significantly elevated (case:control ratio > 1.25; p < 0.05, t test) in tumor-bearing mice but not in either the EGFR or Kras model (Figure 4A and Table S10). Ingenuity Pathway Analysis of these 116 proteins revealed that 33 (28.4%) proteins were associated with Neurological Disease or Nervous System Development and Function pathways (Table S10), pointing to a neural signature in the plasma protein profile of the SCLC mouse model, consistent with the neuroendocrine feature of SCLC. A significant network in the SCLC model consisted of proteins in the proteasome-ubiquitin pathway (Figure S1A and Table S8). Anti-apoptotic Bcl-2 overexpression, induced by proteasome-mediated NFκB activation, is observed in up to 90% of SCLCs, and inhibition of the proteasome pathway represents a potential therapeutic modality (Ben-Ezra et al., 1994; Lara et al., 2006). A subset of the proteins with increased plasma levels (case:control ratios > 1.5) in tumor-bearing mice in the SCLC model also exhibited increased expression (case:control ratio > 1.25) of their corresponding genes in tumor tissue compared to normal lung tissue from the SCLC mouse model (accession number, GSE18534; Schaffer et al., 2010) (Figure 4B), including Ncam1/Cd56, which has been previously associated with SCLC (Ledermann et al., 1994; Vangsted et al., 1994).
To determine the potential relevance to human lung cancer of findings from mouse models, we performed assays in human blood samples of a set of proteins consisting of SFTPB, WFDC2, ANGPTL3, and EGFR for NSCLC and ROBO1 for SCLC, for which ELISAs were available. One source of human plasmas was from newly diagnosed smokers with operable NSCLC (n = 28) and control subjects matched for age, sex, smoking status, ethnicity, and plasma collection protocol (n = 39) (newly diagnosed set). A second set consisted of sera collected from subjects 0–11 months before they were diagnosed with NSCLC (n = 26) and matched control subjects who remained cancer free over a 4-year follow-up period (n = 26) (prediagnostic set) that were part of the Carotene and Retinol Efficacy Trial cohort study (Goodman et al., 2004). Demographics of subjects in these two sample sets are summarized in Table 2. For the group of newly diagnosed subjects, EGFR levels were decreased in subjects with lung cancer, and SFTPB, WFDC2, and ANGPTL3 levels were elevated with statistical significance, concordant with our findings in mouse models (p < 0.05 in t test or Mann-Whitney test) (Figure 5A). A receiver operating characteristic (ROC) analysis of the combined panel of EGFR, SFTPB, WFDC2, and ANGPTL3 using a linear combination rule yielded an AUC of 0.882 for the newly diagnosed set (Figure 5B). To further assess the potential contribution of the marker panel to early detection of lung cancer, we compared performance of the marker panel in the prediagnostic sample set with the performance of a previously validated panel (Qiu et al., 2008) consisting of autoantibodies to ANXA1, YWHAQ, and LAMR1 (autoantibody panel) in the same sample set (Figures 5C and 5D). For the group of subjects with sera collected before a diagnosis of NSCLC, an AUC of 0.808 was observed for the EGFR, SFTPB, WFDC2, and ANGPTL3 panel compared with 0.828 for the autoantibody panel and an AUC of 0.898 for the two panels combined.
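The panel AUCs reported above summarize how well a single combined score separates cases from controls. As a rough illustration, the AUC equals the probability that a randomly chosen case scores higher than a randomly chosen control (the Mann-Whitney statistic). In the sketch below the marker weights are hypothetical placeholders, since the text does not describe how the coefficients of the linear combination were chosen.

```python
def panel_score(levels, weights):
    """Linear combination of marker levels for one subject.

    Both arguments are dicts keyed by marker name; the weights are
    hypothetical and would normally be fitted or fixed in advance.
    """
    return sum(weights[marker] * levels[marker] for marker in weights)


def auc(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (case, control) pairs in which the case
    scores higher; ties contribute one half.
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))
```

Because EGFR was decreased in cases while the other three markers were elevated, a panel of this kind would give EGFR a negative weight so that higher scores consistently indicate disease.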
Given the finding of substantially increased levels of the neural protein Robo1 in tumor-bearing mouse plasma from the SCLC mouse model (Figure 4B), we examined the levels of this protein in human plasmas. An ELISA targeting the ectodomain of ROBO1 yielded significantly increased levels in plasma from patients with SCLC in comparison with levels in plasmas from control subjects without cancer (n = 10 and 39, respectively; Figure 5E). On the other hand, the levels of ROBO1 were not significantly elevated in plasmas from subjects with NSCLC (n = 28) in the newly diagnosed set (data not shown), suggesting that ROBO1 is an SCLC-specific biomarker, concordant with the findings from the SCLC mouse model.
Genetically engineered mouse models of human cancer have generated remarkably concordant histopathology, genetic profiles, and response to therapeutics when compared to their human counterparts. However, the use of such models to identify blood-based cancer signatures has been limited, and most published studies have focused on single models (Faca et al., 2008; Pitteri et al., 2011). The present study compared plasma proteomes of a relatively large number of mouse models on the basis of quantitative mass spectrometry. Analysis of the plasma proteome profiles resulted in a clustering of the four lung tumor models together, separate from other tumor models. Among 39 proteins present at increased levels in plasmas from mice with lung adenocarcinoma (Table S5), 17 were extracellular proteins, seven were localized to the plasma membrane, and 11 were intracellular, as determined by Ingenuity Pathway analysis. Furthermore, 21 of the proteins were demonstrated to be enriched in conditioned media of lung adenocarcinoma cell lines.
Of note, some previously described protein associations with lung cancer (e.g., CEA, CYFRA21-1, and serum amyloid A) were not revealed in our proteomic analysis of plasmas from lung adenocarcinoma mouse models. CEA (Ceacam5) was neither quantified nor identified in any mouse models, and CYFRA21-1 (fragment of Krt19) was only identified in the acute inflammation model. On the other hand, CEACAM5 and KRT19 were identified in 13 and 16 of 21 human lung adenocarcinoma cell lines, respectively (data not shown). These discrepancies may be due to the limitations of assay sensitivity of mass spectrometry or the lack of significant elevations of these proteins at early stages of tumor development (Van’t Westeinde and van Klaveren, 2011). Serum amyloid A (Saa1) was identified in all mouse models used in this study, but was not quantified (Tables S2 and S3), because none of the identified Saa1 peptides contained cysteine (data not shown). Therefore, Saa1 still could be a potential biomarker of lung cancer.
Strong evidence was obtained for a Titf1/Nkx2-1-related signature in plasmas from mice with lung adenocarcinomas. The evidence included expression of Titf1/Nkx2-1 and several of its targets in mouse lung tumors, a positive correlation between mRNA levels of protein signature genes and TITF1/NKX2-1 expression in human lung cancers and cell lines, and reduced expression of protein signature genes following knockdown of TITF1/NKX2-1 with short inhibitory RNA. The finding that a master regulator specific to a tissue lineage during airway development, as in the case of TITF1/NKX2-1, regulates production of proteins that are released into circulation with tumor development provides a new avenue to search for cancer protein signatures in plasma based on protein products of genes under the control of master developmental regulators that are also expressed during tumor development.
The discovery of an Egfr network in plasma from the mutant EGFR lung cancer model with the highest statistical significance among networks observed in that model suggests the occurrence in plasma of proteins that inform about genes and pathways driving tumor development. The Egfr signature included Adam10, which is known to be induced by Egfr signaling (McCulloch et al., 2004; Yan et al., 2002). Interestingly, Cdh1 and Cd44, which were also part of the Egfr containing network, are substrates of Adam10 (Maretzky et al., 2005; Murai et al., 2004). CD44 isoforms have been reported to occur in cancer cells (Ponta et al., 2003). Although inclusion of variant exons is stimulated by both RAS/MAPK pathway and EGF signaling (Huot et al., 2009; Weg-Remers et al., 2001), alternatively spliced isoforms of CD44 were found to occur predominantly in EGFR mutant lung adenocarcinoma cell lines concordant with our proteomic finding of peptides encoding Cd44 variant exon sequences only in the EGFR mutant mouse model (Figure 3D). Alternative splicing of CD44 has been linked to epithelial-mesenchymal transition triggers (Warzecha et al., 2009). Our CD44 peptide findings in the EGFR mutant mouse model and EGFR mutant lung adenocarcinoma cell lines may be related to an epithelial phenotype (Deng et al., 2009).
In our study, we observed distinct plasma protein signatures in SCLC compared to NSCLC. They included proteins that reflected a neural lineage consistent with neuroendocrine features of SCLC. Mouse Robo1 was the neural protein most elevated in plasma from the SCLC mouse model compared to control mice. Robo1 is a highly conserved axon guidance receptor and a member of the NCAM family. Targeted deletion of the Robo1 gene affects lung development and results in bronchial hyperplasia (Xian et al., 2001), and Slit-Robo1 signaling induces angiogenesis (Wang et al., 2003). Assay of ROBO1 plasma levels in human SCLC cases yielded concordant findings with the mouse model with statistically significantly increased levels in cases compared to healthy controls.
We obtained further evidence for concordance between findings in plasmas from lung cancer mouse models and in human blood samples, with observations of reduced levels of circulating EGFR in our study and in a prior study (Lemos-González et al., 2007) and increased levels of circulating SFTPB, WFDC2, and ANGPTL3 in our study. Of particular relevance to lung cancer detection are the results of assays using prediagnostic sera, which also yielded concordant results, indicating the potential utility of findings from mouse models for blood-based early detection of lung cancer and for monitoring tumor status in subjects with the disease.
All animal experiments were conducted in accordance with institutional and national guidelines and regulations, under approval by the IACUC at Fred Hutchinson Cancer Research Center. Plasma samples from mouse models were obtained from tumor-bearing mice and control littermates. For each mouse model, an independent pool of case and control plasma was created. Pools of case and control plasma were formed by combining aliquots of plasma from 4–10 mice (Table S1). Details on mouse models, plasma sample preparation, mass spectrometry analysis, and statistical analysis are provided in Supplemental Experimental Procedures.
Twenty-one lung adenocarcinoma cell lines were profiled (Table S5). Detailed analysis procedures and methods for cell culture, collection of whole cell extracts, conditioned media and cell surface proteins, and mass spectrometry analysis are given in Supplemental Experimental Procedures.
For unsupervised hierarchical clustering of mouse plasma protein profiles, the Cluster program (http://rana.lbl.gov/EisenSoftware.htm) was used to perform complete linkage hierarchical clustering of plasma proteins after filtering for at least 40% presence and log2 transformation, and the result was displayed with the aid of TreeView software (http://rana.lbl.gov/EisenSoftware.htm) (Eisen et al., 1998).
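The clustering steps named above can be illustrated with a small standard-library sketch: a presence filter, a log2 transform, and a naive complete linkage merge loop. This is not the Cluster program itself. The reading of "40% presence" as "quantified in at least 40% of experiments", the use of `None` for missing values, and the caller-supplied distance matrix are all assumptions of the example, and the distance metric used in the study is not stated.

```python
from math import log2


def preprocess(ratios, min_presence=0.4):
    """Keep proteins quantified in >= min_presence of experiments and
    log2-transform their case:control ratios (None = not quantified)."""
    kept = {}
    for protein, row in ratios.items():
        present = [v for v in row if v is not None]
        if len(present) / len(row) >= min_presence:
            kept[protein] = [log2(v) if v is not None else None for v in row]
    return kept


def complete_linkage(dist):
    """Naive agglomerative clustering on a symmetric distance matrix.

    Returns the merge history as (cluster_a, cluster_b, distance) tuples,
    where each cluster is a tuple of item indices.
    """
    clusters = [(i,) for i in range(len(dist))]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # complete linkage: distance between the farthest members
                d = max(dist[a][b] for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges
```

The merge history is what a dendrogram viewer such as TreeView would draw; the loop is cubic in the number of items and is meant only to show the linkage rule.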
For pathway analysis, Ingenuity Pathway Analysis (IPA) Software (Ingenuity Systems, Mountain View, CA) was used to perform pathway analysis of elevated/decreased and uniquely identified proteins from plasma of genetically engineered lung mouse models.
Tissues were fixed in 4% paraformaldehyde overnight at room temperature, placed in 70% ethanol, and sent for paraffin embedding and sectioning (Histoserv). The primary antibodies used for immunohistochemistry were anti-NPC2 (used at a dilution of 1:500, Sigma Cat. No. HPA000835), anti-surfactant protein B (used at a 1:250 dilution; Abcam Cat. No. ab40876), and anti-thyroid transcription factor 1 (used at a dilution of 1:200; Dako Cat. No. M3575).
Microarray data for TITF1/NKX2-1 knockdown experiments was generated with the Illumina Human HT-12 array platform. Details are found in Supplemental Experimental Procedures.
Plasma samples were obtained at the time of diagnosis from subjects with operable NSCLC (n = 28), subjects with SCLC (n = 10), and healthy control subjects (n = 39) matched for age (±5 years), sex, smoking history, and plasma collection protocol. Twenty-six pairs of prediagnostic NSCLC serum samples and matched controls were collected as part of the Carotene and Retinol Efficacy Trial (CARET); case sera were drawn 0–11 months prior to diagnosis, when the subjects were completely asymptomatic. Control subjects were matched for age, sex, and smoking history and were not diagnosed with cancer over a 4-year follow-up period, irrespective of their state of general health otherwise. All human plasma samples were obtained with informed consent and the approval of the IRB at Fred Hutchinson Cancer Research Center.
Levels of circulating EGFR (R&D Systems, dilution 1:300), SFTPB (USCN life, dilution 1:2000), WFDC2 (IBL-America, dilution 1:10), and ANGPTL3 (IBL-America, dilution 1:50) were measured according to the manufacturer’s protocols. The resulting data were normalized according to the mean of control subjects in each assay, and p values were calculated by use of the t test and Mann-Whitney test. To avoid biasing issues in ROC analysis, panels were generated using a linear combination of all assayed proteins. The level of ROBO1 was determined by in-house sandwich ELISA for the ectodomain of ROBO1. Details on the ROBO1 ELISA assay are provided in Supplemental Experimental Procedures.
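One plausible reading of the normalization step is that each reading in an assay is divided by the mean of that assay's control readings, so that controls average 1.0 and values from different assays become comparable. The sketch below illustrates that reading only; the text does not say whether normalization was by division or by some other operation, and the function name is hypothetical.

```python
from statistics import mean


def normalize_to_controls(case_values, control_values):
    """Divide every reading by the mean of the control readings.

    Returns (normalized_cases, normalized_controls); after this step
    the control group averages 1.0 within the assay.
    """
    reference = mean(control_values)
    cases = [v / reference for v in case_values]
    controls = [v / reference for v in control_values]
    return cases, controls
```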
We have applied a comparative strategy of mouse models of cancer to uncover protein signatures in plasma that reflect cell lineages of lung cancer, or that reflect pathways driving tumor development. Proteins not previously associated with lung cancer were identified in plasmas from lung cancer models compared with plasmas from other cancer models or models of inflammation. We also obtained evidence for concordant findings in human lung cancer cell lines and in plasmas collected from subjects with lung cancer at the time of diagnosis as well as in sera collected from asymptomatic subjects prior to diagnosis supporting the merits of comparative profiling of mouse models of cancer for the discovery of proteomic signatures relevant to humans.
Funding support was provided by the National Cancer Institute’s Mouse Models of Human Cancer program; the Canary, Labrecque, and Uniting Against Lung Cancer Foundations; and the Department of Defense (DOD) Congressionally Mandated Lung Cancer Research program. K.S.P. is a Parker B. Francis Fellow. D.M.D. is supported by the DOD (grant no. W81XWH-10-1-0263) and by the Burroughs Wellcome, V Foundation, Mary Kay Ash, and Rivkin Foundations.
Supplemental Information includes one figure, 10 tables, supplemental references, and supplemental Experimental Procedures and can be found with this article online at doi:10.1016/j.ccr.2011.08.007.
The authors declare no competing financial interests.
Microarray data sets were deposited in the National Center for Biotechnology Information’s Gene Expression Omnibus (GEO) database (http://www.ncbi.nlm.nih.gov/geo) with the accession number GSE28480.