Previous studies from our laboratory have demonstrated that metastatic propensity is significantly influenced by the genetic background upon which tumors arise. We have also established that human gene expression profiles predictive of metastasis are not only present in mouse tumors with both high and low metastatic capacity, but also correlate with genetic background. These results suggest that human metastasis-predictive gene expression signatures may be significantly driven by genetic background, rather than acquired somatic mutations. To test this hypothesis, gene expression profiling was performed on inbred mouse strains with significantly different metastatic efficiencies. Analysis of previously described human metastasis signature gene expression patterns in normal tissues permitted accurate categorization of high or low metastatic mouse genotypes. Furthermore, prospective identification of animals at high risk of metastasis was achieved by using mass spectrometry to characterize salivary peptide polymorphisms in a genetically heterogeneous population. These results strongly support the role of constitutional genetic variation in modulation of metastatic efficiency and suggest that predictive signature profiles could be developed from normal tissues in humans. The ability to identify those individuals at high risk of disseminated disease at the time of clinical manifestation of a primary cancer could have a significant impact on cancer management.
Elucidation of the mechanisms governing the development of metastases is of great importance given that the majority of cancer-related patient mortality is due to metastatic disease . While it is recognized that metastasis occurs as a result of complex interactions between normal tissues and neoplastic cells, most contemporary studies primarily focus on how tumor-specific changes affect both the biological and clinical outcome. For example, a number of studies have identified metastasis suppressor genes, whose loss of function in primary tumors promotes metastatic progression . The ability of a tumor to successfully form a metastasis, however, is potentially influenced by many complex and interacting factors that are not necessarily inherent to the primary tumor itself. Indeed, it is plausible that the metastatic capacity of any cells shed by a primary tumor is significantly influenced by variability in the functional state of the normal tissues that metastatic cells come into contact with, both while in transit to and while in residence at the site of secondary lesion formation. Therefore, the observed differential metastatic capacity evident in human tumorigenesis is a function not only of somatic events within tumor cells, but also of constitutional genetic heterogeneity affecting gene expression in transit and secondary sites. It has been demonstrated that constitutional genetic polymorphism in mice can influence almost any measurable trait , including gene expression, with steady state levels of mRNAs in tissues being significantly influenced by genetic background [4–7]. Additionally, such studies have shown that it is possible to map genetic factors controlling transcript levels in numerous organisms.
These observations suggest that a complex process like metastasis could be significantly influenced by the genetic background upon which a tumor arises. To address this question, we have previously conducted a series of studies demonstrating that the genetic background upon which the polyoma middle-T (PyMT) transgene is carried has a significant impact on the metastatic efficiency of resulting mammary tumors. Specifically, when PyMT mice were crossed with inbred mice of different strains, the resulting F1 progeny demonstrated significantly different frequencies of macroscopic lung metastasis formation, ranging from ~10-fold fewer than the original FVB inbred background to ~2-fold more . The variation in the number of observed metastases in the F1 progeny is most likely the result of constitutional genetic polymorphism in the recipient animals, since the tumors in all of the progeny result from polyoma middle-T transgene activation. In concordance with these observations, we have recently identified a gene in which hereditary variation significantly contributes to the modulation of metastatic efficiency in this system .
In addition, we have observed that a number of genes described as components of a metastasis predictive signature set  were differentially regulated between tumors of different genotypes and metastatic potential, thus largely replicating the published human metastasis signature profile [11, 12]. Since all tumors in our model system were induced by the same oncogenic event, the induction of the PyMT transgene, we hypothesize that genetic polymorphism, rather than oncogenic mutation as previously suggested , regulates expression of constituents of such metastasis predicting gene signature profiles. Furthermore, if this hypothesis is correct, it would be possible to make two important predictions: first, that signature genes should be differentially expressed in normal tissues prior to tumor induction; and second, that by profiling normal tissues prior to the occurrence of metastatic disease, it should theoretically be possible to prospectively predict metastatic risk, given that the polymorphisms responsible for differential signature gene expression exist in germ line DNA.
To test this hypothesis we have examined the expression of a known  and a novel metastasis predictive gene signature profile in normal tissues within the PyMT mouse mammary tumor model to attempt to differentiate between high and low metastatic genotypes. Furthermore, we aim to explore the possibility of prospectively identifying animals at high risk of metastasis by analyzing normal tissues prior to metastasis development.
Transgenic DBA and NZB F1 hybrids were generated by crossing two inbred strains, [FVB/N-TgN(MMTV-PyVT)634Mul × inbred (DBA/2J or NZB/B1NJ)]. FVB F1 transgenic mice were generated by crossing inbred FVB/N-TgN(MMTV-PyVT)634Mul × FVB/NJ. Transgene positive DBA F1 males were then bred to DBA/2J females to generate DBA N2 backcross animals.
The AKXD RI panel consists of 25 inbred sublines, each carrying a different combination of the original parental genomes, that were derived from interbreeding AKR/J (a high metastatic genotype) and DBA/2J (a low metastatic genotype) as previously described . The PyMT animal FVB/N-TgN(MMTV-PyVT)634Mul was bred to 18 of the 25 AKXD sublines to generate 18 [PyMT × AKXD]F1 sublines to provide genetic mapping information for metastasis efficiency modifier genes . Transgene positive female animals were aged for 100 days and euthanized by carbon dioxide exposure, and tissues were harvested for analysis. Lungs were sectioned and metastases were quantitated using a Leica Q500IW image analysis system, as previously described .
Mammary tumors were excised from female transgenic DBA F1, NZB F1 strains or AKXD sublines. Normal mammary gland, lung tissues and whole blood collected from non-transgenic FVB F1 or F1 progeny of [FVB/N × DBA/2J] or [FVB/N × NZB/B1NJ] crosses were snap-frozen upon harvesting and stored at −80°C. Mammary gland and lung were selected because they are the source of the primary tumor and the metastatic target organ in this model. Blood was also assayed because its relatively easy clinical accessibility models potential use in human populations. In some instances, tissues were placed in RNAlater™ (Ambion, Inc.) prior to freezing at −80°C, in accordance with the manufacturer’s recommended protocol. Prior to RNA isolation, all tissues were pulverized while frozen solid on dry ice in an RNase free environment.
Total RNA extractions from tissue samples were carried out using TRIzol® Reagent (Life Technologies, Inc.) according to the standard protocol. Total RNA was prepared from whole blood using the QIAamp RNA blood mini kit (Qiagen) per the manufacturer’s instructions. RNA quantity and quality were determined by the Agilent Technologies 2100 Bioanalyzer (Bio Sizing Software version A.02.01., Agilent Technologies) and/or the GeneQuant Pro (Amersham Biosciences Corp.). Samples containing high-quality total RNA with A260/A280 ratios between 1.8 and 2.1 were purified with the RNeasy Mini Kit (Qiagen). An on-column genomic DNA digestion was performed as part of this purification step using the RNase-Free DNase Kit (Qiagen).
First-strand cDNA was synthesized from total RNA by reverse transcription using the ThermoScript™ RT-PCR System (Invitrogen), according to the manufacturer’s protocol. Synthesized cDNA template was diluted 1:40 prior to use in quantitative real-time PCR (qPCR). Primers were either purchased from the Applied Biosystems Assays-On-Demand™ Gene Expression product line or designed using Primer3 (Applied Biosystems, Foster City, CA). Reactions were prepared using Taqman® Universal PCR Mastermix (Applied Biosystems) or QuantiTect™ SYBR® Green PCR Mastermix (Qiagen) per the manufacturers’ protocols, and were performed in triplicate on an ABI Prism® 7900HT Sequence Detection System. Target gene expression was normalized to that of Gapdh using Taqman® Rodent GAPDH (Applied Biosystems) internal control primers.
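The normalization to Gapdh described above can be illustrated with a short calculation. The sketch below assumes the widely used 2^-ΔΔCt convention and roughly 100% amplification efficiency for both assays; the text does not state which relative quantification formula was applied, and the function name and Ct values are hypothetical.

```python
from statistics import mean


def relative_expression(target_ct, gapdh_ct, calib_target_ct, calib_gapdh_ct):
    """Fold change of a target gene versus a calibrator sample (2^-ddCt).

    Each argument is a list of technical-replicate Ct values (e.g. triplicates).
    ASSUMPTION: both assays amplify with ~100% efficiency.
    """
    d_ct_sample = mean(target_ct) - mean(gapdh_ct)            # normalize to Gapdh
    d_ct_calib = mean(calib_target_ct) - mean(calib_gapdh_ct)  # same for calibrator
    dd_ct = d_ct_sample - d_ct_calib
    return 2.0 ** (-dd_ct)


# Hypothetical triplicate Ct values: test sample versus a calibrator sample.
fold = relative_expression(
    target_ct=[24.0, 24.1, 23.9],
    gapdh_ct=[18.0, 18.0, 18.0],
    calib_target_ct=[25.0, 25.0, 25.0],
    calib_gapdh_ct=[18.0, 18.0, 18.0],
)
print(f"fold change vs calibrator: {fold:.2f}")
```

Because one cycle corresponds to a doubling under the efficiency assumption, a target that crosses threshold one cycle earlier than in the calibrator (after Gapdh correction) is reported as about two-fold more abundant.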
Purified total RNA for each strain used in Affymetrix GeneChip assays was processed as previously described . Hybridizations were performed on Affymetrix Murine Genome Moe430 A and B GeneChip® Arrays. Microarrays were processed using an Agilent GeneArray Scanner with Affymetrix Microarray Suite version 5.0.0.032 software.
Further analysis of raw microarray data, including normalization and statistical analysis, was performed using BRB-Array Tools v3.2.0 [17, 18]. Array data were filtered to exclude genes possessing < 20% of expression data having at least a 1.5-fold change in either direction from the gene’s median value, and p-value of the log-ratio variation < 0.01. Metastasis predictive gene signature expression profiles were then determined using the class prediction function of BRB ArrayTools. Models that prospectively predict sample class were created using the Compound Covariate Predictor  and Diagonal Linear Discriminant Analysis functions . The models incorporated genes that were differentially expressed at the P < 0.05 significance level, as assessed by the random variance t-test . The prediction error of each model was estimated using leave-one-out cross-validation (LOOCV) as described by Simon et al. The entire model building process, including gene selection, was repeated for each LOOCV training set. Additionally, we evaluated whether the cross-validated error rate estimate for each model was significantly less than one would expect from random prediction. Class labels were randomly permuted, and the entire LOOCV process was repeated. The significance level is reported as the proportion of random permutations that gave a cross-validated error rate no greater than the cross-validated error rate obtained with real data. One thousand random permutations were used.
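The cross-validation and permutation scheme described above can be sketched in a few lines. This is an illustration under stated assumptions, not the BRB ArrayTools implementation: an ordinary pooled-variance t statistic stands in for the random variance t-test, genes are selected by a |t| cutoff (`t_cut`, hypothetical) rather than by P value, and the toy expression matrix is invented. Each class needs at least three samples so that a variance can still be computed after one sample is left out.

```python
import random
from math import sqrt
from statistics import mean, variance


def t_stat(a, b):
    """Pooled-variance two-sample t statistic (ASSUMPTION: stand-in for the
    random variance t-test)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    if pooled == 0:
        return 0.0  # sketch-level guard: skip genes with no within-class variance
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))


def train_ccp(X, y, t_cut):
    """Select genes with |t| >= t_cut and build a compound covariate predictor."""
    weights = {}
    for g in range(len(X[0])):
        high = [row[g] for row, lab in zip(X, y) if lab == 1]
        low = [row[g] for row, lab in zip(X, y) if lab == 0]
        t = t_stat(high, low)
        if abs(t) >= t_cut:
            weights[g] = t

    def score(row):
        # compound covariate: t-weighted sum over the selected genes
        return sum(w * row[g] for g, w in weights.items())

    mean_high = mean(score(r) for r, lab in zip(X, y) if lab == 1)
    mean_low = mean(score(r) for r, lab in zip(X, y) if lab == 0)
    threshold = (mean_high + mean_low) / 2
    return lambda row: 1 if score(row) > threshold else 0


def loocv_error(X, y, t_cut):
    """Leave-one-out error; gene selection is repeated inside every fold."""
    errors = 0
    for i in range(len(X)):
        predict = train_ccp(X[:i] + X[i + 1:], y[:i] + y[i + 1:], t_cut)
        errors += predict(X[i]) != y[i]
    return errors / len(X)


def permutation_p(X, y, n_perm, t_cut, seed=0):
    """Proportion of label permutations whose LOOCV error is no greater than
    the error obtained with the real labels."""
    rng = random.Random(seed)
    observed = loocv_error(X, y, t_cut)
    hits = 0
    for _ in range(n_perm):
        shuffled = y[:]
        rng.shuffle(shuffled)
        hits += loocv_error(X, shuffled, t_cut) <= observed
    return observed, hits / n_perm


# Hypothetical data: 8 samples x 3 genes; only the first gene tracks the class.
X = [[5.0, 1.0, 3.0], [5.2, 1.3, 2.9], [4.9, 0.8, 3.1], [5.1, 1.1, 3.0],
     [3.0, 1.2, 3.1], [2.8, 0.9, 2.8], [3.1, 1.0, 3.2], [2.9, 1.1, 3.0]]
y = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = high metastatic, 0 = low metastatic

observed_error, p_value = permutation_p(X, y, n_perm=200, t_cut=3.0)
print(observed_error, p_value)
```

Repeating gene selection inside each fold matters: selecting genes once on the full data set and then cross-validating only the classifier would let the held-out sample influence the gene list and bias the error estimate downward.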
Clustering analysis of relative gene expression levels from normal mammary gland, as determined by qPCR was performed using the tree joining with complete linkage and Euclidean distance functions of the program Statistica ver 5.5.
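For readers without access to Statistica, agglomerative clustering with complete linkage and Euclidean distance can be sketched as follows. This is a minimal pure-Python illustration of the technique, not the Statistica routine; the function name and the toy expression values are hypothetical, and it requires Python 3.8 or later for `math.dist`.

```python
from math import dist  # Euclidean distance, Python 3.8+


def complete_linkage(points, n_clusters):
    """Agglomerative clustering; returns clusters as lists of sample indices."""
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # complete linkage: distance between the two farthest members
        return max(dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        # find the pair of clusters with the smallest complete-linkage distance
        _, p, q = min(
            (linkage(clusters[p], clusters[q]), p, q)
            for p in range(len(clusters))
            for q in range(p + 1, len(clusters))
        )
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return clusters


# Hypothetical relative expression values (2 genes) for 5 samples.
samples = [[1.0, 1.1], [0.9, 1.0], [1.1, 0.9], [3.0, 3.2], [3.1, 2.9]]
groups = complete_linkage(samples, n_clusters=2)
print(groups)
```

Stopping at two clusters mirrors the question asked of the data here, namely whether the high and low metastatic genotypes fall into separate branches of the tree.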
The Affymetrix .CEL files were normalized using the RMA method, averaged for each AKXD recombinant inbred strain, and loaded into the internet based quantitative trait loci (QTL) analysis package WebQTL (http://webqtl.org/) . The database was then searched for probe sets for each of the metastasis predictive signature genes described in Ramaswamy et al. . The interval mapping function of WebQTL was performed with each of the probe sets to identify regions of the genome that correlated with signal intensity levels of each probe set to identify potential expression QTLs (eQTLs).
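The idea behind an expression QTL scan can be illustrated with a single-marker regression, which is a simplification of the interval mapping performed by WebQTL. The sketch below assumes each recombinant inbred strain carries one of two parental alleles at a marker (labelled "A" for AKR/J and "D" for DBA/2J here) and computes a LOD score from the residual sums of squares with and without the genotype term. The function name, strain values and genotype strings are all hypothetical.

```python
from math import log10
from statistics import mean


def marker_lod(expression, genotypes):
    """LOD score for association between strain-mean expression and a marker.

    LOD = (n / 2) * log10(RSS_null / RSS_marker), the usual marker-regression form.
    """
    n = len(expression)
    grand = mean(expression)
    rss_null = sum((v - grand) ** 2 for v in expression)
    rss_marker = 0.0
    for allele in set(genotypes):
        group = [v for v, g in zip(expression, genotypes) if g == allele]
        group_mean = mean(group)
        rss_marker += sum((v - group_mean) ** 2 for v in group)
    if rss_marker == 0:
        return float("inf")  # genotype explains all of the variation
    return (n / 2) * log10(rss_null / rss_marker)


# Hypothetical strain-averaged signal for one probe set in 8 RI strains.
expression = [8.1, 8.3, 7.9, 8.2, 6.0, 6.2, 5.9, 6.1]
linked_marker = list("AAAADDDD")    # genotype tracks expression
unlinked_marker = list("ADADADAD")  # genotype unrelated to expression

lod_linked = marker_lod(expression, linked_marker)
lod_unlinked = marker_lod(expression, unlinked_marker)
print(round(lod_linked, 2), round(lod_unlinked, 2))
```

A genome scan repeats this calculation at every marker; genome-wide significance thresholds are normally set by permutation, which is omitted from this sketch.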
Saliva was collected from DBA N2 animals at 60 days of age, an age at which the majority of animals have developed palpable tumors. Animals were fasted for three hours prior to sample collection, anesthetized using an intraperitoneal injection of Avertin (0.010 g/ml) and injected subcutaneously with pilocarpine (0.5 mg/ml in PBS) at a concentration of 0.5 mg/kg mouse weight. Saliva was collected with a sterile plastic pipette tip attached to a syringe with animals placed head down on an inclined platform. Saliva was then centrifuged to pellet debris, aliquoted into sterile eppendorf tubes, snap frozen and stored at −80°C.
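As a worked example of the dosing arithmetic above, the injection volume follows directly from the stated dose (0.5 mg/kg) and stock concentration (0.5 mg/ml). The body weight used below is a hypothetical value chosen for illustration.

```python
def injection_volume_ul(body_weight_g, dose_mg_per_kg=0.5, stock_mg_per_ml=0.5):
    """Volume of pilocarpine stock (in microliters) for one animal."""
    dose_mg = dose_mg_per_kg * body_weight_g / 1000.0  # g -> kg
    volume_ml = dose_mg / stock_mg_per_ml
    return volume_ml * 1000.0                          # ml -> microliters


# Hypothetical 25 g mouse: 0.0125 mg of pilocarpine, i.e. about 25 microliters.
volume = injection_volume_ul(25.0)
print(f"{volume:.1f} ul")
```

Because the dose and the stock concentration share the same numerical value, the volume works out to 1 ml per kg of body weight, or 1 µl per gram.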
Following quantification of salivary protein concentration by BCA assay (Pierce), samples were diluted 1:1 in 9 M urea and 2% CHAPS, and vortex mixed for 30 minutes at 4°C. Ten microliters of each saliva protein sample (100 μg/ml) was then mixed with 90 μl of binding buffer (4.5 M urea, 1% CHAPS and 50 mM NaOAc, pH 4.5), and applied to individual spots on CM10 chips (Ciphergen Biosystems, Fremont, CA) in duplicate. The chips were then incubated at room temperature for 2 hours with agitation, and washed three times with binding buffer followed by a final rinse with deionized water. Following air drying, two 1 μl applications of matrix solution (saturated solution of α-cyano-4-hydroxycinnamic acid in 50% acetonitrile with 0.5% trifluoroacetic acid) were applied to each chip spot. Once dried, the chips were analyzed in a ProteinChip Biology System reader (PBS II, Ciphergen Biosystems). Mass spectral data over an m/z range of 2,000–30,000 were collected by averaging at least 70 laser shots from various regions of the spot surface, with a laser intensity of 160 and a detection sensitivity of 10. Saliva mass spectra were calibrated externally using the All-in-1 peptide standard (Ciphergen) per the manufacturer’s instructions and normalized using total ion current. Peak identification (signal/noise ≥ 3) and alignment (cluster mass window at 0.3%) were performed using Ciphergen Biowizard 3.1.1 (Ciphergen).
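Total ion current (TIC) normalization can be sketched as follows. The exact procedure used by the Ciphergen software is not described in the text, so this illustration assumes one common convention: each spectrum is rescaled so that its summed intensity equals the mean summed intensity across all spectra. The function name and intensity values are hypothetical.

```python
def tic_normalize(spectra):
    """Rescale each spectrum so its total ion current equals the mean TIC.

    ASSUMPTION: spectra are intensity lists sampled on a common m/z axis.
    """
    tics = [sum(s) for s in spectra]
    target = sum(tics) / len(tics)
    return [[x * target / tic for x in s] for s, tic in zip(spectra, tics)]


# Hypothetical intensities: the second spot gave twice the overall signal.
raw = [[10.0, 20.0, 30.0], [20.0, 40.0, 60.0]]
normalized = tic_normalize(raw)
print(normalized)
```

After rescaling, the two toy spectra become identical, which shows the purpose of the step: differences in overall signal between spots are removed so that only the relative peak pattern is compared.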
SELDI MS data were randomly divided into a training set containing spectra from 28 mice, and a test set containing spectra from 14 mice. Mice in the training set were classified as high- or low-metastatic phenotypes based on their metastatic indexes compared to the average metastatic index of the whole population. Intensity signals of protein profiles from each animal were obtained by averaging peak intensities of the duplicate spectra. Class prediction was performed as described above in analysis of Affymetrix GeneChip® Data using BRB ArrayTools. Models incorporated m/z values that were differentially expressed among m/z values at the 0.006 significance level as assessed by the random variance t-test. These m/z peak values were then applied to the test set to classify the metastatic phenotype (in relation to the average metastatic index of the training set) of the mouse using the compound covariate predictor and diagonal linear discriminant analysis algorithms.
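The data preparation steps described above (averaging duplicate spectra, random division into 28 training and 14 test animals, and labelling against an average metastatic index) can be sketched as follows. This is an illustration only: the helper names, the random seed and the simulated metastatic indexes are hypothetical, and the cutoff here is taken as the mean of whichever set of animals is passed as the reference.

```python
import random
from statistics import mean


def average_duplicates(spectrum_a, spectrum_b):
    """Average peak intensities from the two duplicate spots of one animal."""
    return [(a + b) / 2 for a, b in zip(spectrum_a, spectrum_b)]


def split_mice(mouse_ids, n_train=28, seed=0):
    """Randomly divide animals into a training set and a test set."""
    ids = list(mouse_ids)
    random.Random(seed).shuffle(ids)
    return ids[:n_train], ids[n_train:]


def label_by_mean(metastatic_index, reference_ids, target_ids):
    """Label target animals high/low relative to the mean index of a reference group."""
    cutoff = mean(metastatic_index[m] for m in reference_ids)
    labels = {m: ("high" if metastatic_index[m] > cutoff else "low") for m in target_ids}
    return cutoff, labels


# Hypothetical metastatic indexes for 42 animals.
rng = random.Random(1)
metastatic_index = {f"mouse{i:02d}": rng.uniform(0.0, 37.0) for i in range(42)}

train_ids, test_ids = split_mice(metastatic_index)
cutoff, train_labels = label_by_mean(metastatic_index, list(metastatic_index), train_ids)
print(len(train_ids), len(test_ids), round(cutoff, 2))
```

Fixing the seed makes the split reproducible, which is useful when the same partition has to be reused across the two classification algorithms.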
To test the hypothesis that tumor-derived metastasis signature profile genes were also differentially expressed in normal tissues, qPCR was performed for 10 of the 17 metastasis signature genes defined by Ramaswamy et al. . Nine out of ten of the genes exhibited differential expression between the FVB and NZB F1 normal mammary tissue samples, (Figure 1a), while all 10 of the genes were differentially expressed in the FVB versus DBA F1 comparison. In addition, the expression patterns of 9 out of 10 of the signature set genes were concordant between NZB and DBA F1’s.
Evidence for differential expression of signature genes was also observed in lung tissues, though the magnitude of these differences was significantly reduced compared to the mammary tissue. Only two genes in the DBA and five genes in the NZB versus FVB comparisons differed by 1.5-fold or greater (Figure 1b). The remaining genes, however, demonstrated consistent, albeit modest, expression differences suggesting that the observed variation was likely real. As in the mammary qPCR experiments, most gene expression patterns were concordant between the NZB and DBA F1 lung tissues. However, the expression patterns observed in the lung were significantly different from those seen in mammary tissue. Furthermore, both profiles differed in some respects from the published metastasis signature , which is not an entirely unexpected observation given the divergent cellular compositions of lung, mammary and tumor samples. It is therefore highly improbable that these dissimilar tissue types would have identical gene expression patterns.
Gene expression profiling was also performed on mRNA extracted from whole blood. Nine of the ten profile genes failed to amplify, which is not surprising given that the original gene expression profile was derived from solid tumors rather than cells of hematopoietic origin. One gene, Pttg1, was detectable, and displayed strain-specific expression levels similar to those observed in normal mammary tissue, with concordance observed between the DBA and NZB strains (Figure 1c).
Differential expression of metastasis predictive expression signature genes provides strong, yet not definitive, evidence that genetic background influences both metastatic and prognostic signatures. Co-localization of the metastasis efficiency modifying genes and any genetic factors controlling expression levels of individual members of the signature profile would provide additional support of this hypothesis. Therefore, gene expression profiling was performed on three independent tumor samples from each of the 18 substrains of the AKXD recombinant inbred mouse panel used in previous studies to map the metastasis efficiency modifier locus Mtes1 . Due to the lack of statistical power associated with only 18 genotypes in the sample set, a statistically significant genetic factor influencing expression levels was found with only one gene, Col1a1 (see Figure 2). Suggestive genetic factors were observed for 13 genes, while three additional genes did not produce either suggestive or significant results. However, of the 14 genes exhibiting suggestive or significant results, 9 were linked to chromosomes associated with metastatic efficiency modifier loci [chrs 4, 9, 11, 17; [15, 24] and unpublished results], consistent with a possible link between the metastasis and gene expression modifying loci.
Although the normal tissue data are consistent with the hypothesis that genetic background influences both metastatic capacity and the gene signature profile, one may argue that expression of the human metastasis predictive gene set may be irrelevant with regard to metastatic efficiency in our mouse model. Therefore, we generated a mouse metastasis gene signature profile to test whether these phenomena could be observed using mouse genes directly linked to metastatic propensity. The AKXD recombinant inbred (RI) metastasis and gene expression data were used to generate a mouse metastasis predictive gene expression profile, analogous to the previously described human signature profiles. The Affymetrix .CEL data were loaded into BRB ArrayTools and normalized by the RMA method. The average density of pulmonary metastases in each of the 18 AKXD RI sublines was then analyzed using the Class Prediction tool to generate a signature profile. A 58 probe-set classifier that predicted metastasis with 77% accuracy was identified from the 18 segregating genotypes (see Table 1).
Quantitative RT-PCR analysis was performed to determine whether predictive genes were differentially expressed in normal tissue. Primers were therefore designed against each gene in the predictive profile. Thirty eight of the 58 primer pairs selectively amplified cDNA but not genomic DNA and were used for qPCR experiments (see Table 1). The relative abundance of each of the genes of the mouse metastasis classifier was quantified in a set of normal FVB (n=5) or [FVB × DBA]F1 (n=4) mammary glands, and used in an unsupervised clustering analysis to determine whether the high metastatic FVB genotype would cluster separately from the low metastatic [FVB × DBA]F1 genotype. Consistent with our hypothesis, expression of the mouse metastasis gene signature set in normal tissues segregated the high and low metastatic genotypes (see Figure 3).
These data are consistent with the possibility that the metastasis predictive gene profile is at least partially the result of germline polymorphism rather than somatic mutation, and the gene expression modifiers and metastasis efficiency modifiers may be related. If true, this suggests that metastasis signature profiles may be functioning as a surrogate for genotype, with each expression profile member representing one or more polymorphic sites in a genome-wide scan, analogous to a complex trait mapping study. Since the polymorphisms underlying a significant fraction of the gene expression differences are present in germline DNA, this would suggest that it should be possible to use other technologies that permit genome-wide screens of polymorphisms to predict metastatic capacity. In addition, since germ line polymorphisms are present in all tissues of the body, it should be possible to perform the assays in tissues other than tumor tissue.
To test these possibilities, mass spectrometric profiling of mouse saliva was performed. Examination of microarray data from high and low metastatic genotypes revealed significant differences in the expression of a number of different salivary protein components (see Figure 4). In addition, a literature search revealed that known polymorphisms in salivary gland proteins largely segregated mouse strains into known high- and low-metastatic classes [25–27]. Saliva was therefore collected from FVB and DBA F1 animals and assayed by Ciphergen SELDI-TOF to determine whether the two genotypes would cluster independently. One hundred and thirty eight peptide peaks from the saliva profiles were used in an unsupervised clustering analysis. As expected, the two genotypes segregated with only a single error (Figure 5). Furthermore, transgene-positive samples did not segregate separately from transgene-negative samples in either the FVB or DBA F1 samples, arguing against the possibility that the presence of tumor cells might influence a salivary peptide predictive profile.
To extend these results to a genetically heterogeneous population more analogous to human cancer populations, a backcross was performed to generate PyMT-positive N2 animals. Saliva was collected at 60 days of age, and saliva protein profiles were analyzed in duplicate on Ciphergen CM10 chips. Animals were held until 100 days of age, when their metastatic burden was determined by measurement of pulmonary metastasis density on coronal H & E sections. The number of metastases in the N2 population ranged from 0–37 per cm² of lung tissue.
From the spectra of a training set consisting of 28 animals, a predictive peptide profile was developed that would classify animals as either above or below the median metastasis value of the training set. Four hundred and sixty nine peptide peaks were analyzed to determine a predictive peptide set. Cross-validation was performed to identify a predictive profile consisting of peptide peaks with m/z values of 2153.99, 2481.01, 2770.25, 2524.64 and 2567.03 that could correctly categorize training set samples with ~80% accuracy, and a sensitivity and specificity of 77–85% (Table 1). An additional 14 samples were subsequently classified using the predictors, and the predicted assignment was compared to the experimental metastasis data. Approximately 80% of the samples (11/14) of the test set were correctly classified (Table 2).
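The accuracy, sensitivity and specificity figures quoted above are simple ratios of classification counts. The sketch below shows the arithmetic; only the total of 11 correct calls out of 14 test samples comes from the text, while the breakdown into true and false positives and negatives is hypothetical and chosen purely to illustrate the calculation.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion-matrix counts.

    tp/fn: high-metastatic animals called high/low;
    tn/fp: low-metastatic animals called low/high.
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# HYPOTHETICAL breakdown consistent with 11 of 14 test samples correct.
metrics = classification_metrics(tp=5, fn=2, tn=6, fp=1)
print({name: round(value, 3) for name, value in metrics.items()})
```

With only 14 test animals each misclassification shifts the accuracy by about 7 percentage points, so the ~80% figure carries a wide uncertainty.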
The recent observation that gene expression profiles from bulk tumors can define high and low metastatic propensities prior to the formation of overt clinical secondary lesions [10, 28, 29] suggests that a significant proportion of cells in a tumor must exhibit the predictive gene expression profile. Thus, it has been suggested that metastatic potential is encoded early in tumorigenesis, most likely by the oncogenic mutations themselves . However, these results are inconsistent with the commonly accepted progressive theory of metastasis [30, 31], which predicts that only a subpopulation of a tumor obtains full metastatic competency. We argue that our previous genetic and expression studies [8, 12, 15] explain this paradox. It would therefore follow that the genetic background from which a tumor arose might significantly influence the metastatic efficiency of a primary tumor, and be a major determinant of the prognostic expression profile [11, 32]. If this theory was pursued to its logical conclusion, it should therefore be possible to identify those individuals at high risk using differential expression patterns in normal tissues.
To test this hypothesis, we have examined expression of the proposed metastasis predictor gene set in normal tissues. Consistent with our hypothesis, the data presented here demonstrate that a significant fraction of the predictive metastatic signature genes proposed by Ramaswamy et al. are differentially expressed in normal tissues between high- (FVB) and low- (DBA, NZB F1) metastatic genotypes prior to the induction of malignant disease. These data, in addition to other studies demonstrating the significant impact of genetic background on gene expression [4–7], suggest that tumor metastatic propensity and predictive gene expression patterns are significantly influenced by constitutional polymorphisms rather than solely by oncogenic mutations. These data are also entirely consistent with the recent report by Malins et al.  who demonstrated that it was possible to distinguish metastatic and non-metastatic prostate cancer patients based on DNA structure in normal prostate epithelium, suggesting that predisposition to metastasis is constitutively encoded.
More importantly, these observations suggest that prospective identification of patients at higher risk of developing disseminated disease using readily available samples instead of the more difficult to obtain tumor tissue might well be possible. In support of this hypothesis, we have demonstrated that it is possible to use saliva peptide polymorphisms to prospectively identify individuals at high risk for metastatic disease in a genetically segregating population. This approach has enabled the protein expression pattern itself, independent of the identity of the proteins or peptides, to be employed as a discriminator of metastatic propensity. Although significant improvements in the sensitivity and specificity of the assays are obviously required to achieve the level of accuracy required for clinical use, the ability to correctly categorize ~80% of the samples based on a relatively small training set suggests this methodology, or some variant of it, might be of significant value for clinical prognosis.
To achieve the required level of accuracy, a variety of strategies are currently being pursued. Foremost is the use of more sensitive instrumentation to increase the resolution, sensitivity, reproducibility and the number of the data points used to derive signature profiles. Increased sensitivity will likely permit the identification of more reliable and informative markers than is currently possible with a low resolution methodology such as SELDI-TOF. Second, a much larger cohort of animals or individuals than was used in this proof-of-principle study will likely be required to generate a clinically usable predictor. Significant variability in the metastasis frequency exists in both high and low metastatic genotypes, resulting in a small fraction of animals that are highly predisposed to metastasis phenotypically resembling low metastatic strains and vice versa. Inclusion of misclassified individuals in the training set would therefore reduce the accuracy of the predictor and is likely partially responsible for the 20% inaccuracy rate observed in the current study. Significant increases in the number of samples used to generate the predictor should reduce the confounding effects of misclassified samples, thus improving the overall predictive value. Third, analysis of tissues other than saliva might provide a better prognostic profile. While perhaps the most easily available non-invasive human tissue, salivary secretions might not possess adequate complexity to provide sufficient marker variability to enable accurate prognosis in the more genetically complex human population. Consequently, blood or serum samples may provide a richer source of polymorphisms than saliva for more accurate clinical diagnosis.
The ability to prospectively identify those patients at increased risk of tumor dissemination also raises the potential for pre-emptive use of anti-metastatic therapies. Identifying patients with a higher likelihood of developing metastases prior to detectable secondary disease and enrolling them in metastasis-prevention regimens might significantly reduce the burden of secondary lesions. Preliminary studies of the feasibility of such a strategy are currently ongoing in our laboratory, and by using a small molecule agent, caffeine, we have demonstrated the ability to significantly reduce the efficiency of pulmonary colonization in the PyMT animal . If these studies can be translated into human populations, similar strategies and agents might eventually lessen metastasis-associated mortality and morbidity. Alternatively, identifying high risk patients would also permit better monitoring to enable earlier therapeutic intervention against metastatic disease.
Finally, it is also important to point out that while the outcome of this study might be extrapolated to have implications for the seed and soil hypothesis proposed by Paget , there are a number of important distinctions. To date most studies examining metastasis-related genes have focused on factors intrinsic to the tumor cell (seed) or cells surrounding the tumor cell (soil) in the secondary site. Since normal tissues and autochthonous tumors have the same constitutional genetic information, genetic variation that modifies metastatic efficiency can operate in the tumor cell, the stroma of the secondary site, in intermediate tissues during dissemination, or any combination of the above simultaneously. These data suggest that genetic heterogeneity influences both seed and soil between individuals, not somatic changes within the “seed” or the interaction of the “seed” with the different “soils” that exist within individuals. Thus, at present, this study addresses the role of constitutional genetic polymorphism in metastatic efficiency at the level of the whole organism, though individual contributions of the “seed” and “soil” are likely to be critical determinants in the efficiency of metastatic dissemination.
This research was supported by the Intramural Research Program of the NIH and the National Cancer Institute.