By means of in vivo selection, transcriptomic analysis, functional verification and clinical validation, here we identify a set of genes that marks and mediates breast cancer metastasis to the lungs. Some of these genes serve dual functions, providing growth advantages both in the primary tumour and in the lung microenvironment. Others contribute to aggressive growth selectively in the lung. Many encode extracellular proteins and are of previously unknown relevance to cancer metastasis.
Metastasis is frequently a final and fatal step in the progression of solid malignancies. Tumour cell intravasation, survival in circulation, extravasation into a distant organ, angiogenesis and uninhibited growth constitute the metastatic process1. The molecular requirements for some of these steps may be tissue specific. Indeed, the proclivity that tumours have for specific organs, such as breast carcinomas for bone and lung, was noted more than a century ago2.
The identity and time of onset of the changes that endow tumour cells with these metastatic functions are largely unknown and are a subject of debate. It is believed that genomic instability generates large-scale cellular heterogeneity within tumour populations, from which rare cellular variants with augmented metastatic abilities evolve through a darwinian selection process2,3. Work on experimental metastasis with tumour cell lines has demonstrated that reinjection of metastatic cell populations can lead to enrichment in the metastatic phenotype4–6. Recently, however, the existence of genes expressed by rare cellular variants that specifically mediate metastasis has been challenged7. Transcriptomic profiling of primary human carcinomas has identified gene expression patterns that, when present in the bulk primary tumour population, predict a poor prognosis for patients8–10. The existence of such signatures has been interpreted to mean that genetic lesions acquired early in tumorigenesis are sufficient for the metastatic process, and that consequently no metastasis-specific genes exist. However, it is unclear whether these genes predicting metastatic recurrence are also functional mediators.
The lungs and bones are frequent sites of breast cancer metastasis, and metastases to these sites differ in terms of their evolution, treatment, morbidity and mortality11. Reasoning that each organ places different demands on circulating cancer cells for the establishment of metastases, we sought to identify genes expressed in breast cancer cells that selectively mediate lung metastasis and that are correlated with the propensity of primary human breast cancers to relapse to the lungs.
The cell line MDA-MB-231 was derived from the pleural effusion of a breast cancer patient suffering from widespread metastasis years after removal of her primary tumour12. Individual MDA-MB-231 cells grown and tested as single-cell-derived progenies (SCPs) have distinct metastatic abilities and tissue tropisms13 despite having similar expression levels of genes constituting a validated Rosetta-type poor prognosis signature9 (Supplementary Fig. S1). These different metastatic behaviours, including different tropisms to bone and lung, are associated with discrete variation in overall gene expression patterns (Supplementary Fig. S1; ref. 13). We therefore proposed that organ-specific metastasis must be determined by genes that are distinct from a Rosetta-type poor prognosis signature and are differentially expressed within the MDA-MB-231 population. Indeed, previous work has shown this to be true for most of the genes linked to the activity of bone metastatic subpopulations4,13.
To identify genes that mediate lung metastasis we tested parental MDA-MB-231 cells and the 1834 sub-line (an in vivo isolate with no enhancement in bone metastatic behaviour4; Fig. 1a) by injection into the tail vein of immunodeficient mice (Fig. 1b). Metastatic activity was assayed by bioluminescence imaging (BLI) of luciferase-transduced cells as well as gross examination of the lungs at necropsy. The 1834 cells exhibited limited but significant lung metastatic activity compared with the parental population (Fig. 1b). When 1834-derived lung lesions were expanded in culture and reinoculated into mice, these cells (denoted LM1 subpopulations; Fig. 1a) showed increased lung metastatic activity. Another round of selection in vivo yielded second-generation populations (denoted LM2) that were rapidly and efficiently metastatic to the lungs (Fig. 1b). Histological analysis confirmed that LM2 lesions replaced large areas of the lung parenchyma, whereas 1834 cells exhibited intravascular growth with less extensive extravasation and parenchymal involvement (Fig. 1c). Inoculation of as few as 2 × 10³ LM2 cells was sufficient for the emergence of aggressive lung metastases, whereas inoculation of 2 × 10⁵ parental cells left only a residual, indolent population in the lungs (Fig. 1d). Furthermore, the enhancement in lung metastatic activity was tissue specific. When LM2 populations were inoculated into the left cardiac ventricle to facilitate bone metastasis, their metastatic activity was comparable to that of the parental and 1834 populations, and it was markedly inferior to that of a previously described, highly aggressive bone metastatic population (Fig. 1b).
To identify patterns of gene expression associated with aggressive lung metastatic behaviour, we performed a transcriptomic microarray analysis of the highly and weakly lung-metastatic cell populations. The gene list obtained from a class comparison between parental and LM2 populations was filtered to exclude genes that were expressed at low levels in a majority of samples and to ensure a threefold or higher change in expression level between the two groups. A total of 95 unique genes (113 probe sets) met these criteria: 48 were overexpressed and 47 underexpressed in cell populations most metastatic to the lungs (Fig. 2a and Supplementary Table 2). This gene set was largely distinct from the bone metastasis gene-expression signature previously identified in bone metastatic isolates derived from the same parental cell line4. In fact, only six genes overlapped with concordant expression patterns between the two groups (Supplementary Table 3).
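The filtering step described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `filter_genes`, the intensity cutoff `min_level`, the use of group means and the toy gene names are all assumptions made for illustration. Only the threefold criterion and the exclusion of genes expressed at low levels in a majority of samples come from the text.

```python
from statistics import mean

def filter_genes(expr, parental, lm2, min_fold=3.0, min_level=100.0):
    """Keep genes with a >= min_fold change between two groups.

    expr: {gene: {sample: linear-scale intensity}}
    min_level is a hypothetical cutoff; the text does not state one.
    """
    selected = {}
    samples = parental + lm2
    for gene, values in expr.items():
        # Exclude genes expressed at low levels in a majority of samples.
        n_low = sum(1 for s in samples if values[s] < min_level)
        if n_low > len(samples) / 2:
            continue
        mean_par = mean(values[s] for s in parental)
        mean_lm2 = mean(values[s] for s in lm2)
        if mean_par <= 0 or mean_lm2 <= 0:
            continue  # a fold change is undefined here
        fold = mean_lm2 / mean_par
        if fold >= min_fold:
            selected[gene] = ("over", fold)
        elif fold <= 1.0 / min_fold:
            selected[gene] = ("under", fold)
    return selected

# Toy data with made-up gene names and intensities.
expr = {
    "GENE_A": {"P1": 200, "P2": 220, "L1": 900, "L2": 840},
    "GENE_B": {"P1": 600, "P2": 660, "L1": 150, "L2": 165},
    "GENE_C": {"P1": 300, "P2": 310, "L1": 400, "L2": 420},
    "GENE_D": {"P1": 10, "P2": 12, "L1": 60, "L2": 300},
}
selected = filter_genes(expr, parental=["P1", "P2"], lm2=["L1", "L2"])
for gene, (direction, fold) in sorted(selected.items()):
    print(gene, direction, round(fold, 2))
```

In this toy input, GENE_C fails the fold-change criterion and GENE_D is dropped because three of its four samples fall below the assumed intensity cutoff.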
Hierarchical clustering with the 95-gene list confirmed a robust relationship between this gene expression signature and the lung-specific metastatic activity of cell populations selected in vivo (Fig. 2a). In addition, this gene expression signature segregated the SCPs (which were not used in generating the gene list) into two major groups, one transcriptomically resembling the parental cells, the other more similar to the lung-metastatic populations selected in vivo. This latter group of SCPs was also more metastatic to lung than the former group (Fig. 2b). However, unlike the LM2 populations, none of the lung-metastatic SCPs concordantly expressed all of the genes in the lung metastasis signature (Fig. 2a). Consistent with this was our observation that the lung metastatic activity of the LM2 populations was about one order of magnitude greater than the most aggressive SCPs (Fig. 2b). We postulated that the subset of genes from the 95-gene signature that are uniformly expressed by all lung-metastatic SCPs and populations selected in vivo might confer baseline lung-metastatic functions, which we define as lung metastagenicity. Genes expressed exclusively in the most aggressive LM2 populations may serve specialized, lung-restricted functions, which we collectively describe as lung-metastatic virulence. A final list of 54 candidate lung metastagenicity and virulence genes was selected for further evaluation (Supplementary Methods and Supplementary Table 4).
A subset of biologically interesting genes overexpressed in the 54-gene list was selected for functional validation. These genes include those encoding the epidermal-growth-factor family member epiregulin (EREG), which is a broad-specificity ligand for the HER/ErbB family of receptors14,15, the chemokine GRO1/CXCL1 (ref. 16), the matrix metalloproteinases MMP1 (collagenase 1)17 and MMP2 (gelatinase A)18, the cell adhesion molecule SPARC19, the interleukin-13 decoy receptor IL13Rα2 (ref. 20) and the cell adhesion receptor VCAM1 (refs 21, 22) (Fig. 2a). These genes encode secretory or receptor proteins, indicating possible roles in the tumour cell microenvironment. In addition to these genes, we included the transcriptional inhibitor of cell differentiation and senescence ID1 (refs 23, 24) and the prostaglandin-endoperoxide synthase PTGS2/COX2 (ref. 25). Northern blot analysis of the various cell populations selected in vivo revealed expression patterns for these genes that were correlated with metastatic behaviour (Fig. 2c). SPARC, IL13RA2, VCAM1 and MMP2 belong to the subset of genes whose expression is generally restricted to aggressive lung-metastatic populations and are rarely expressed (less than 10% prevalence for VCAM1 and IL13Rα2, and less than 2% prevalence for SPARC and MMP2) in randomly picked SCPs (data not shown). In contrast, the expression of ID1, CXCL1, COX2, EREG and MMP1 is not restricted to aggressive lung metastasis populations but increases with lung metastatic ability. Analysis of protein expression for these genes confirmed that the differences in mRNA levels translated into significant alterations in protein levels (Supplementary Fig. S2).
To determine whether these genes have a causal function in lung metastasis, they were overexpressed by retroviral infection in the parental population either individually, in groups of three, or in groups of six (Supplementary Fig. S3). Only cells overexpressing ID1 alone were modestly more active at forming lung metastases than cells infected with vector controls (Fig. 3a). Consistent with the hypothesis that metastasis requires the concerted action of multiple effectors was our observation that combinations of these genes invariably led to more aggressive metastatic activity and that some combinations recapitulated the aggressiveness of the 4175 LM2 population (Fig. 3b). Triple combinations of lung metastasis genes in parental cells did not enhance bone metastatic activity (Supplementary Fig. S4), supporting their identity as tissue-specific mediators of metastasis. The requirement for some of these genes was tested by stably decreasing their expression in 4175 (LM2) cells with short-hairpin RNA-mediated interference (RNAi) vectors (Fig. 3c). A decrease in ID1, VCAM1 or IL13Rα2 levels decreased the lung metastatic activity of 4175 cells more than tenfold (Fig. 3d). These effects were not due to activation of the RNAi machinery, because efficient knockdown of another gene, ROBO1, did not inhibit lung metastasis formation (data not shown). Collectively, the results show that these nine genes are not only markers but also functional mediators of lung-specific metastasis.
A biologically meaningful and clinically relevant gene profile that mediates lung metastasis might be expressed uniquely by a subgroup of patients that suffered relapse to the lung and it should be associated with the clinical outcome. To test this, a cohort of 82 breast cancer patients treated at our institution was used in a univariate Cox proportional hazards model to relate the expression level of each lung metastasis signature gene with clinical outcome. Twelve of the 54 genes are significantly associated with lung-metastasis-free survival, including MMP1, CXCL1 and PTGS2 (Supplementary Table 5). A cross-validated multivariate analysis using a linear combination of each of the 54 genes weighted by the univariate results26 distinguished between patients with a high risk and those with a low risk for developing lung metastasis (10-year lung-metastasis-free survival of 56% versus 89%, P = 0.0018; see Supplementary Fig. S5) but not bone metastasis (70% versus 79%, P = 0.31). When a similar multivariate analysis was performed by weighting each gene by a t-statistic derived from a comparison of its expression in the LM2 cell lines with that in the parental MDA-MB-231 cells, the 54 genes again distinguished patients at high risk for developing lung metastasis (62% versus 88%, P = 0.01; see Supplementary Fig. S5) but not bone metastasis (75% versus 79%, P = 0.49). These results indicate that a clinically relevant subgroup of patients might express certain combinations of lung metastasis signature genes.
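The weighted linear combination used in this multivariate analysis can be sketched as a per-patient risk score. The sketch below is an illustration under stated assumptions rather than the published procedure: the weights and expression values are invented, the median split into high-risk and low-risk groups is an assumed cutoff, and the cross-validation step is omitted. The names `risk_scores` and `split_by_median` are hypothetical.

```python
from statistics import median

def risk_scores(expression, weights):
    """Score each patient as the sum over genes of weight * expression.

    expression: {patient: {gene: expression value}}
    weights: {gene: weight}, e.g. a univariate Cox statistic or a
    t-statistic; the values used below are invented for illustration.
    """
    return {
        patient: sum(weights[g] * profile[g] for g in weights)
        for patient, profile in expression.items()
    }

def split_by_median(scores):
    """Assign 'high' or 'low' risk relative to the median score.

    The median cutoff is an assumption, not taken from the text.
    """
    cutoff = median(scores.values())
    return {p: ("high" if s > cutoff else "low") for p, s in scores.items()}

# Hypothetical weights for three of the signature genes.
weights = {"MMP1": 0.8, "CXCL1": 0.5, "PTGS2": 0.4}
expression = {
    "pt1": {"MMP1": 2.0, "CXCL1": 1.0, "PTGS2": 1.0},
    "pt2": {"MMP1": 0.5, "CXCL1": 0.2, "PTGS2": 0.1},
    "pt3": {"MMP1": 1.5, "CXCL1": 1.5, "PTGS2": 0.5},
    "pt4": {"MMP1": 0.1, "CXCL1": 0.4, "PTGS2": 0.3},
}
scores = risk_scores(expression, weights)
groups = split_by_median(scores)
print(groups)
```

The two resulting groups would then be compared by lung-metastasis-free and bone-metastasis-free survival, as reported above.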
To determine directly the extent to which breast cancers express the lung metastasis signature in a manner resembling the LM2 cell lines, the 54 genes were used to cluster the Memorial Sloan-Kettering Cancer Center (MSKCC) data set hierarchically. Manual inspection of branches in the dendrogram revealed a group of primary tumours that concordantly expressed many elements of this signature (Fig. 4a, dashed red box). In particular, a subgroup of primary tumours expressed to various degrees most of the nine genes that were functionally validated. Many patients who developed lung metastasis were among this group. Tumours in this group predominantly expressed markers of clinically aggressive disease, including negative oestrogen receptor (ER)/progesterone receptor status, a Rosetta-type poor-prognosis signature8, and a basal cell subtype of breast cancer27. There was no association of our signature with a high expression of HER2. A molecularly similar subgroup of breast cancer was identified when the clustering analysis was repeated on a previously published Rosetta microarray data set of breast cancer patients9 (Supplementary Fig. S6), indicating that the findings might not be unique to our cohort of patients.
Although the results of the hierarchical clustering are indicative, this approach can lead to arbitrary class assignments and is generally not ideal for class prediction28. We therefore took advantage of the repeated observation of our signature in two independent data sets. For training purposes the Rosetta data set was used to define a group of patients expressing the lung metastasis signature most resembling the LM2 cell lines (Supplementary Fig. S7). All 48 of the 54 lung metastasis genes that were shared between the MSKCC and Rosetta data set microarray platforms were subsequently used to generate a classifier to distinguish these tumours from the remaining tumours in the cohort (Supplementary Table 6). This classifier was then applied to the MSKCC cohort to identify tumours expressing the lung metastasis signature in a manner resembling the LM2 cell lines. These patients had a markedly worse lung-metastasis-free survival (P < 0.001; Fig. 4b) but not bone-metastasis-free survival (P = 0.15; Fig. 4b). These results were independent of ER status and classification as a Rosetta-type poor-prognosis tumour (Fig. 4c). Six of the nine genes that we tested in functional validation studies (MMP1, CXCL1, PTGS2, ID1, VCAM1 and EREG) were among the 18 most univariately significant (P < 0.05) genes that distinguished the patients used to train the classifier (Table 1 and Supplementary Fig. S7, cluster 3), and classification using only these 18 genes gave similar results (data not shown). The three remaining genes (SPARC, IL13RA2 and MMP2) are members of the lung metastasis virulence subset and were expressed only in the most highly metastatic cell lines in our model system (Fig. 2c).
It is unknown how and when metastasis genes are activated29. One explanation for the expression of a lung metastasis signature in a subgroup of primary breast cancers is that these genes may confer a growth advantage on the primary tumour while allowing growth at distant sites7. To test this hypothesis, MDA-MB-231 cells were injected orthotopically into the mammary fat pad of immunodeficient mice. We found that the 1834 (LM0) and 4175 (LM2) cell populations were progressively more aggressive at growing in the mammary fat pad than the parental cell line. This was correlated with expression of lung metastagenicity genes (Figs 2c and 5a) and was not due to a general enhancement of growth because the 4175, 1834 and parental populations had comparable abilities to metastasize to bone (refer to Fig. 1b). Furthermore, the 4175 and 1834 populations were also more metastatic to the lungs from the orthotopic site after primary tumour resection, recapitulating the phenotypes observed with the tail vein metastasis assay (Fig. 5b). In contrast, the virulently bone-metastatic population 1833 (ref. 4) was only marginally more aggressive in the mammary fat pad than the parental cells and did not metastasize to lung after primary tumour resection (Fig. 5a, b).
To identify which of the genes in the lung metastasis signature might be conferring growth at the primary tumour site, we quantified mammary-fat-pad tumour growth of 4175 cell populations with stable knockdown of various lung metastasis genes that were previously assayed for effects on metastatic behaviour (refer to Fig. 3c, d). Whereas knockdown of IL13Rα2, SPARC and VCAM1 decreased lung metastatic ability but not orthotopic tumour growth, knockdown of ID1 resulted in a statistically significant reduction in both (Figs 3d and 5c). These data indicate that some lung metastasis genes might facilitate both breast tumorigenicity and lung metastagenicity, whereas others confer growth advantages exclusively in the lung microenvironment.
We have identified a set of genes that mediates breast cancer metastasis to lung and is clinically correlated with the development of lung metastasis when expressed in primary breast cancers. Many of the genes in this signature have not previously been linked to metastasis. Together with bone, the lung is one of the most frequent targets of breast cancer metastasis in humans. We provide evidence that these two sites impose different requirements for the establishment of metastases by circulating cancer cells. In addition to providing clinical validation, potential prognostic tools and possible targets for cancer treatment, the present findings shed new light on the biology of breast cancer metastasis.
Many of the genes in the lung metastasis signature are frequently expressed in all MDA-MB-231 subpopulations that metastasize to the lungs, regardless of whether these cells were randomly picked from the parental cell line or selected in vivo. Most of these genes, which we denote as promoting lung metastagenicity, encode extracellular products including growth and survival factors (for example the HER/ErbB receptor ligand epiregulin), chemokines (CXCL1), cell adhesion receptors (for example ROBO1) and extracellular proteases (MMP1). They also include intracellular enzymes (for example COX2) and transcriptional regulators (for example ID1), as well as several other downregulated genes. Their expression pattern is tightly correlated with lung metastatic activity. When tested by overexpression in poorly metastatic cells or by RNAi-mediated knockdown in highly metastatic cells, several genes in this group function as mediators of lung metastasis but not bone metastasis. Furthermore, in the cohort of human breast cancer primary tumours examined, those expressing the lung metastasis signature had a significantly poorer lung-metastasis-free survival but not bone-metastasis-free survival. This signature therefore seems to include a set of clinically relevant genes that mediate a metastagenicity function30,31 with selectivity for the lung.
Besides our data, other recent findings reveal the existence of metastasis gene signatures expressed by primary tumours. It is unclear at what point these metastasis gene signatures are acquired during the process of tumorigenesis because the selection pressure for this acquisition is unknown. One possibility is that elements of metastasis gene signatures might have a function in primary tumour growth. Consistent with this idea is the observation that the in vivo selected cell lines expressing the lung metastagenicity signature are more tumorigenic when implanted in the mammary glands of mice. Despite promoting growth in the mammary gland and in the lung, these genes are not general mediators of neoplastic growth. Many lung metastasis signature genes therefore seem to enhance growth both within the breast and the lung (Fig. 5d). These overlapping functions might explain how cells expressing genes involved in metastasis can be selected for in the primary tumour, providing insight into the interpretation of primary tumour microarray data.
Another subset of the lung metastasis genes is overexpressed only in rare, virulently metastatic cells selected in vivo. Several of these genes mediate lung metastasis in our functional assays. Many in this class encode extracellular proteins (for example SPARC and MMP2). With some exceptions (for example the receptors IL13RA2 and VCAM1) this group of genes is sporadically expressed in human primary breast tumours. We propose that these genes act mainly as virulence genes30,31 that may allow tumours to aggressively invade, colonize and grow in the lungs without markedly contributing to primary tumour growth (Fig. 5d). Thus, their expression may be rare in primary tumours but strongly selected for once such cells reach the lung. Supporting this model, a recent study analysing MMP2 expression in matched primary breast cancers and pleural effusions found that MMP2 levels are specifically enriched at the metastatic site32.
Breast cancer is a heterogeneous disease with diverse metastatic behaviour. As a consequence, patients differ widely in prognosis and survival. Attempts to classify this disease molecularly have yielded several useful markers of poor prognosis. However, to our knowledge none of these markers have yet been shown to act as functional mediators that account for the diversity of breast cancer metastases. In contrast, our lung metastasis signature seems to identify poor-prognosis patients who are at high risk of developing lung metastasis, which is consistent with the functional testing done experimentally. Further studies with additional patient cohorts, and a delineation of the role of these genes in specific steps of the metastatic process, should lead to a better understanding of the biology of metastasis and its susceptibilities to treatment.
The parental MDA-MB-231 cell line was obtained from the American Type Culture Collection. Its derivative cell lines and SCPs were described previously4. Cells were grown in high-glucose DMEM medium with 10% fetal bovine serum. For bioluminescent tracking, cell lines were retrovirally infected with a triple-fusion protein reporter construct encoding herpes simplex virus thymidine kinase 1, green fluorescent protein (GFP) and firefly luciferase13,33. GFP-positive cells were enriched by fluorescence-activated cell sorting.
All animal work was done in accordance with a protocol approved by the Institutional Animal Care and Use Committee. Balb/c nude mice (NCI) 4–6 weeks old were used for all xenografting studies. For lung metastasis formation, 2 × 10⁵ viable cells were washed and harvested in PBS and subsequently injected into the lateral tail vein in a volume of 0.1 ml. Endpoint assays were conducted at 15 weeks after injection unless significant morbidity required that the mouse be euthanized earlier. For bone metastasis, 10⁵ cells in PBS were injected into the left ventricle of anaesthetized mice (100 mg kg⁻¹ ketamine, 10 mg kg⁻¹ xylazine)4. Mice were imaged for luciferase activity immediately after injection to exclude any that were not successfully xenografted.
For mammary-fat-pad tumour assays, cells were harvested by trypsinization, washed twice in PBS and counted. Cells were then resuspended (10⁷ cells ml⁻¹) in a 50:50 solution of PBS and Matrigel. Mice were anaesthetized, a small incision was made to reveal the mammary gland and 10⁶ cells were injected directly into the mammary fat pad. The incision was closed with wound clips and primary tumour outgrowth was monitored weekly by taking measurements of the tumour length (L) and width (W). Tumour volume was calculated as πLW²/6. For metastasis assays, tumours were surgically resected when they reached a volume greater than 300 mm³. After resection, the mice were monitored by bioluminescent imaging for the development of metastases.
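As a worked example of the arithmetic in this paragraph, the sketch below computes the tumour volume from the stated formula (π × L × W² / 6) and checks it against the 300 mm³ resection threshold. It assumes that L and W are measured in millimetres, which the text implies but does not state; the function names and example measurements are hypothetical.

```python
import math

RESECTION_THRESHOLD_MM3 = 300.0  # tumours were resected above this volume

def tumour_volume_mm3(length_mm, width_mm):
    """Volume from the formula in the text: pi * L * W^2 / 6."""
    return math.pi * length_mm * width_mm ** 2 / 6.0

def ready_for_resection(length_mm, width_mm):
    """True once the calculated volume exceeds the resection threshold."""
    return tumour_volume_mm3(length_mm, width_mm) > RESECTION_THRESHOLD_MM3

# Injecting 10^6 cells from a 10^7 cells/ml suspension implies this volume.
injection_volume_ml = 1e6 / 1e7

# Invented caliper readings (mm) for two illustrative tumours.
example_small = tumour_volume_mm3(6.0, 6.0)
example_large = tumour_volume_mm3(10.0, 8.0)
print(round(example_small, 1), ready_for_resection(6.0, 6.0))
print(round(example_large, 1), ready_for_resection(10.0, 8.0))
print(injection_volume_ml)
```

The implied injection volume of 0.1 ml follows directly from the stated cell number and suspension density.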
Mice were anaesthetized and injected retro-orbitally with 1.5 mg of d-luciferin (15 mg ml⁻¹ in PBS). Imaging was completed between 2 and 5 min after injection with a Xenogen IVIS system coupled to Living Image acquisition and analysis software (Xenogen). For BLI plots, photon flux was calculated for each mouse by using a rectangular region of interest encompassing the thorax of the mouse in a prone position. This value was scaled to a comparable background value (from a luciferin-injected mouse with no tumour cells), and then normalized to the value obtained immediately after xenografting (day 0), so that all mice had an arbitrary starting BLI signal of 100.
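The normalization of BLI signal described above might be implemented along the following lines. The text does not specify how the photon flux was "scaled" to background, so this sketch assumes a simple division by the background flux measured in the same imaging session. The function name `normalize_bli` and all numerical values are hypothetical.

```python
def normalize_bli(flux_by_day, background_by_day):
    """Normalize thoracic photon flux so that day 0 equals 100.

    flux_by_day: {day: photon flux for one mouse}
    background_by_day: {day: flux from a luciferin-injected mouse with
    no tumour cells}. Dividing by background is an assumption; the
    text only says the value was "scaled" to background.
    """
    scaled = {day: flux_by_day[day] / background_by_day[day]
              for day in flux_by_day}
    baseline = scaled[0]  # value obtained immediately after xenografting
    return {day: 100.0 * value / baseline for day, value in scaled.items()}

# Invented flux readings for a single mouse.
flux = {0: 2.0e6, 7: 1.0e6, 14: 8.0e6}
background = {0: 1.0e4, 7: 1.0e4, 14: 2.0e4}
normalized = normalize_bli(flux, background)
print(normalized)
```

Expressing every time point relative to day 0 allows mice that received slightly different numbers of viable cells to be plotted on a common scale.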
Methods for RNA extraction, labelling and hybridization for DNA microarray analysis of the cell lines have been described previously4. For the primary breast tumour data, tissues from primary breast cancers were obtained from therapeutic procedures performed as part of routine clinical management. Samples were snap-frozen in liquid nitrogen and stored at −80 °C. Each sample was examined histologically in cryostat sections stained with hematoxylin and eosin. Regions were dissected manually from the frozen block to provide a consistent tumour cell content of greater than 70% in tissues used for analysis. All studies were conducted under protocols approved by the MSKCC Institutional Review Board. RNA was extracted from frozen tissues by homogenization in TRIzol reagent (Gibco/BRL) and evaluated for integrity. Complementary DNA was synthesized from total RNA by using a dT primer tagged with a T7 promoter. The RNA target was synthesized by transcription in vitro and labelled with biotinylated nucleotides (Enzo Biochem). The labelled target was assessed by hybridization to Test3 arrays (Affymetrix). All gene expression analysis was performed with an HG-U133A GeneChip (Affymetrix). Gene expression was quantified with MAS 5.0 or GCOS (Affymetrix).
The Kaplan–Meier method was used to estimate survival curves, and the log-rank test was used to test for differences between curves using WinSTAT (R. Fitch Software). The site of distant metastasis for the patients in the MSKCC data set was determined from patient records. Patients with lung metastasis developed metastasis to the lung only or to the lung within months of metastasis to other sites. A detailed description of analytical methods used in the paper is provided in Supplementary Methods.
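For readers unfamiliar with the Kaplan–Meier method, a minimal product-limit estimator is sketched below. It is a generic illustration, not the WinSTAT implementation used in the study; it omits the log-rank test and confidence intervals, and the follow-up times and event flags are invented.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times: follow-up time for each patient
    events: 1 if the event (e.g. lung metastasis) occurred, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events:
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set.
        n_at_risk -= n_removed
    return curve

# Invented follow-up data (years) for six patients.
times = [2, 3, 3, 5, 8, 10]
events = [1, 1, 0, 1, 0, 0]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(t, round(s, 3))
```

Censored patients contribute to the number at risk until their last follow-up but never lower the curve themselves, which is why the estimate only steps down at times 2, 3 and 5 in this example.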
Descriptions of additional experimental procedures used are given in Supplementary Methods.
We thank R. Benezra, Y. Kang, C. Hudis, L. Norton, N. Rosen and C. VanPoznak for insights and discussions, and K. Manova and the staff of the Molecular Cytology Core Facility for assistance with immunohistochemistry. A.J.M. is a recipient of the Leonard B. Holman Research Pathway fellowship. G.P.G. is supported by an NIH Medical Scientist Training Program grant, a fellowship from the Katherine Beineke Foundation and a Department of Defense Breast Cancer Research Program pre-doctoral traineeship award. J.M. is an Investigator of the Howard Hughes Medical Institute. This research is supported by the W.M. Keck Foundation and an NIH grant to J.M., and a US Army Medical Research grant to W.G.
Supplementary Information is linked to the online version of the paper at www.nature.com/nature.
Author Information All microarray data have been submitted to the Gene Expression Omnibus (GEO) under accession number GSE2603. Reprints and permissions information is available at npg.nature.com/reprintsandpermissions. The authors declare no competing financial interests.