Additional methods for cell culture assays, quantitative real-time PCR, immunoblotting, generation of knockdown and overexpression cell lines, tumor xenografting, histological analyses, long-term survival assays and statistical analyses are provided in a supplementary section.
Primary tumor and metastasis tissue samples
We compiled a microarray dataset of 615 patients from MSKCC and EMC (EMC344, EMC189 and MSK82; GEO accession numbers GSE2603, GSE5327, GSE2034 and GSE12276). These datasets were all normalized using MAS5.0, and each microarray was centered to the median of all probes. For each patient, metastasis-free survival (MFS) is defined as the time interval between surgery and the diagnosis of metastasis.
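The per-array median centering described above can be illustrated with a minimal sketch. It assumes that centering is done by subtracting the array median in log2 space; the text does not state whether centering was additive or multiplicative, so this choice, the function name and the toy intensities are all hypothetical.

```python
import math
from statistics import median

def median_center_log2(intensities):
    """Log2-transform one array's probe intensities and subtract the array median.

    Assumption: centering is a subtraction in log2 space (not stated in the text).
    """
    logged = [math.log2(x) for x in intensities]
    array_median = median(logged)
    return [value - array_median for value in logged]

# Toy intensities for a single hypothetical array (not real data).
toy_array = [64.0, 128.0, 256.0, 512.0, 1024.0]
centered = median_center_log2(toy_array)
```

After centering, the median probe on every array sits at zero, which makes arrays from different cohorts comparable on a common scale.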
Archival human breast carcinoma metastasis specimens were obtained and processed in compliance with protocols approved by the MSKCC Institutional Review Board. Samples were snap frozen in liquid nitrogen and stored at -80°C. Each sample was examined histologically using hematoxylin and eosin stained cryostat sections. Regions were manually dissected from the frozen block to provide consistent tumor cell content of greater than 70% in tissues used for analysis. RNA was extracted from frozen tissues by homogenization in TRIzol reagent (GIBCO/BRL) and evaluated for integrity. Complementary DNA was synthesized from total RNA using a T7-promoter-tagged-dT primer. RNA target was synthesized by in vitro transcription and labeled with biotinylated nucleotides (Enzo Biochem, Farmingdale, NY). Labeled target was assessed by hybridization to Test3 arrays (Affymetrix, Santa Clara, CA). All gene expression analysis was carried out using HG-U133A (for 36 samples) or HG-U133plus2 (for 29 samples) GeneChips. Seven samples were profiled on both platforms and the data were averaged for the common probes. Gene expression was quantitated using GCOS.
Molecular pathway gene-expression signatures
To predict pathway activation from microarray gene-expression data, we used gene expression profiles derived from the overexpression of c-Src, H-Ras, β-catenin, E2F3 and Myc in quiescent mammary epithelial cells (Bild et al., 2006). We derived a gene-expression signature classifier for each of these pathways using a false discovery rate of 0.05 and a fold-change of 1.5 as criteria. The 605 genes that met these thresholds in the c-Src responsive gene set were filtered using the EMC-344 data set to eliminate non-informative genes. After eliminating genes that were either expressed at a low level (raw intensity < 64, i.e. 2^6, in more than 25% of tumors) or non-variable across samples (standard deviation < 0.8 for log2 intensity, equivalent to the median of all genes), 159 genes remained; these constitute the c-Src responsive signature (SRS) used here.
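The two filtering criteria can be expressed as a short sketch. The thresholds are taken from the text; the function name, the gene labels and the intensities are hypothetical, and the use of the sample (n - 1) standard deviation is an assumption, since the text does not specify the estimator.

```python
import math
from statistics import stdev

LOW_INTENSITY = 64.0     # raw intensity threshold (2^6), from the text
MAX_LOW_FRACTION = 0.25  # gene is dropped if below threshold in more than 25% of tumors
MIN_LOG2_SD = 0.8        # minimum standard deviation of log2 intensity, from the text

def is_informative(raw_values):
    """Return True if a gene passes both the expression and the variability filter."""
    low_fraction = sum(v < LOW_INTENSITY for v in raw_values) / len(raw_values)
    if low_fraction > MAX_LOW_FRACTION:
        return False  # expressed at a low level in too many tumors
    # Assumption: sample standard deviation (n - 1 denominator).
    log2_sd = stdev([math.log2(v) for v in raw_values])
    return log2_sd >= MIN_LOG2_SD

# Hypothetical raw intensities across four tumors (not real data).
toy_genes = {
    "GENE_A": [32.0, 40.0, 500.0, 900.0],      # low in 50% of tumors
    "GENE_B": [200.0, 210.0, 190.0, 205.0],    # expressed but nearly constant
    "GENE_C": [100.0, 400.0, 1600.0, 6400.0],  # expressed and variable
}
kept = [name for name, values in toy_genes.items() if is_informative(values)]
```

Only the gene that is both expressed and variable survives the filter in this toy example.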
The SRS was applied to the EMC-344 and MSK-82 datasets, which are based on HG-U133A and were combined, and to the EMC-189 dataset, which is based on HG-U133plus2 and was processed separately. To search for breast cancers with an SRS expression pattern similar to that of c-Src-activated mammary epithelial cells (Bild et al., 2006), we performed unsupervised clustering (using the heatmap.2 function in the gplots package of the R statistical software). This procedure consistently revealed two clusters (R index = 0.85) (McShane et al., 2002). One cluster was identified as SRS+ on the basis that it exhibits gene-expression similarity to c-Src-activated mammary epithelial cells, as gauged by positive Pearson’s correlation coefficients and “metagene” scores (Bild et al., 2006), and that it is enriched for ER+ tumors (Collins and Webb, 1999; Ishizawar and Parsons, 2004). The other cluster showed the opposite characteristics and was denoted SRS−.
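The correlation criterion used to name the clusters can be sketched as follows. This is only an illustration of one of the criteria (the sign of the Pearson correlation with the reference profile); it does not reproduce the clustering itself, the metagene scores or the ER enrichment check. The function names, cluster names and values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)

def label_clusters(cluster_profiles, reference_profile):
    """Label a cluster SRS+ if its mean profile correlates positively with the reference."""
    return {
        name: "SRS+" if pearson(profile, reference_profile) > 0 else "SRS-"
        for name, profile in cluster_profiles.items()
    }

# Hypothetical log-ratios over four signature genes (not real data).
reference = [1.5, 0.8, -1.2, -0.5]  # c-Src-activated cells vs. control
clusters = {
    "cluster_1": [0.9, 0.4, -0.7, -0.2],
    "cluster_2": [-0.8, -0.3, 0.6, 0.4],
}
labels = label_clusters(clusters, reference)
```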
The same approach was applied to the β-catenin, E2F3, H-Ras and c-Myc pathways. The TGFβ pathway was gauged as previously described (Padua et al., 2008). For the TCF/Wnt pathway, we performed unsupervised clustering using the dominant-negative TCF4 (dnTCF4) signature (van de Wetering et al., 2002). One of the two clusters significantly overexpresses the vast majority of the dnTCF4 genes, and tumors in this cluster are therefore defined as TCF/Wnt+.
To examine the prognostic value of SRS in different subsets of breast cancers, we divided the breast cancer samples based on their ER status or molecular subtypes. For ER status, we used either published pathological annotations (for GSE2603, GSE5327 and GSE2034) or, when the pathological status was not available (for GSE12276), the intensity of probe “205225_at” (ESR1) on the Affymetrix chip. We used a raw intensity of 1000 as the cutoff to distinguish ER− from ER+. This cutoff has been established as appropriate when the data are normalized with MAS5.0 and the global scaling is set to 600 (Foekens et al., 2006). Molecular subtype classification was done as previously described (Smid et al., 2008) according to published classifiers (Perou et al., 1999; Sorlie et al., 2003). Of note, a non-trivial proportion of luminal tumors cannot be unambiguously assigned to either the luminal A or the luminal B subtype. We therefore merged the two subtypes in some analyses.
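The ER assignment rule can be summarized in a few lines. The cutoff of 1000 is from the text; the function name and argument layout are hypothetical, and treating an intensity exactly equal to the cutoff as ER+ is an assumption, since the text does not say how ties were handled.

```python
ER_CUTOFF = 1000.0  # raw MAS5.0 intensity of probe 205225_at (ESR1), from the text

def er_status(pathology, esr1_intensity):
    """Prefer the pathological annotation; fall back on ESR1 probe intensity."""
    if pathology in ("ER+", "ER-"):
        return pathology
    if esr1_intensity is None:
        return None  # neither source of information is available
    # Assumption: an intensity exactly at the cutoff is called ER+.
    return "ER+" if esr1_intensity >= ER_CUTOFF else "ER-"

# Hypothetical samples: (pathological annotation, ESR1 intensity).
calls = [er_status("ER-", 5000.0), er_status(None, 1500.0), er_status(None, 300.0)]
```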
Survival analyses were carried out using the “survival” package of R. P values were calculated with the “survdiff” command in the package, which is based on log-rank tests. When the sample size was small, we also performed Fisher’s exact test. Kaplan-Meier curves were drawn with the “survfit” command in the same package. Cox proportional hazards regression analyses were performed using the “coxph” method in the same package.
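For readers unfamiliar with these procedures, the following is a minimal standard-library sketch of the Kaplan-Meier estimator and the two-group log-rank test. It is not the R “survival” package code used in the study; the function names and the follow-up data are hypothetical, and the p value relies on the chi-square distribution with one degree of freedom.

```python
import math

def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time."""
    data = sorted(zip(times, events))
    at_risk, survival, curve, i = len(data), 1.0, [], 0
    while i < len(data):
        t, deaths, removed = data[i][0], 0, 0
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]  # event flag: 1 = metastasis, 0 = censored
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve

def logrank(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p value)."""
    pooled = list(zip(times_a, events_a)) + list(zip(times_b, events_b))
    event_times = sorted({t for t, e in pooled if e})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(x >= t for x in times_a)  # at risk in group A just before t
        n_b = sum(x >= t for x in times_b)
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    return chi2, math.erfc(math.sqrt(chi2 / 2))

# Hypothetical follow-up times (months) and event flags, not real data.
times_a, events_a = [5, 8, 12, 20, 24], [1, 1, 1, 0, 0]
times_b, events_b = [15, 22, 30, 36, 40], [1, 0, 0, 0, 0]
curve_a = kaplan_meier(times_a, events_a)
chi2, p_value = logrank(times_a, events_a, times_b, events_b)
```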
Cytokine gene expression analysis
We compiled a list of 260 cytokine genes using the GO database (cytokine activity entry: GO:0005125). These cytokines were mapped to 404 probes on the Affymetrix HG-U133A platform. We screened these probes in each tissue sample for those whose intensity value was greater than the median of all genes and that were significantly overexpressed in bone metastasis samples compared with other metastases (t test with Welch’s correction, p < 0.05 with correction for multiple tests).
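The Welch-corrected t test mentioned above can be sketched as follows. The sketch computes only the test statistic and the Welch-Satterthwaite degrees of freedom; converting these to a p value requires the t distribution, which the Python standard library does not provide, and the multiple-testing correction is not shown because the text does not name the method used. The function name and the values are hypothetical.

```python
import math
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    se2_a = variance(group_a) / len(group_a)  # squared standard error, group A
    se2_b = variance(group_b) / len(group_b)
    t_stat = (mean(group_a) - mean(group_b)) / math.sqrt(se2_a + se2_b)
    dof = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(group_a) - 1) + se2_b ** 2 / (len(group_b) - 1)
    )
    return t_stat, dof

# Hypothetical log2 intensities of one probe (not real data).
bone_metastases = [9.1, 9.4, 8.8, 9.6]
other_metastases = [7.0, 7.5, 6.8, 7.2, 7.1]
t_stat, dof = welch_t(bone_metastases, other_metastases)
```

A positive statistic here indicates higher expression in the bone metastasis group; the degrees of freedom always lie between the smaller group size minus one and the pooled value.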
Tumor xenografts and analysis
All procedures involving mice were approved by the MSKCC Institutional Animal Care and Use Committee. Details are provided in the supplementary methods section.
Statistical analysis
The bone metastasis assay in the BoM-1833 line was repeated twice (n = 7-10 for each cohort each time). The results were pooled and shown as . The same result was later reproduced independently in two additional experiments. For the bone metastasis assay of CN34-BoM2 (), lung colonization assay (), orthotopic proliferation (), intratibial growth (), and Dasatinib treatment (), one experiment was performed with n = 10-15 in each cohort. Results are reported as mean ± SEM (standard error of the mean), as indicated in the figure legends. Comparisons between Kaplan-Meier curves were performed using the log-rank test. Other comparisons were performed using an unpaired two-sided t test without the equal variance assumption unless otherwise specified.
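As a reminder of how the reported summary statistic is computed, the sketch below derives the mean and SEM for one cohort. The function name and the measurements are hypothetical; the SEM is taken as the sample standard deviation divided by the square root of the cohort size.

```python
import math
from statistics import mean, stdev

def mean_sem(values):
    """Return (mean, standard error of the mean) for one cohort."""
    return mean(values), stdev(values) / math.sqrt(len(values))

# Hypothetical measurements from a cohort of eight animals (not real data).
cohort = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
cohort_mean, cohort_sem = mean_sem(cohort)
```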
Accession numbers
The raw and normalized data of the breast cancer metastases have been deposited in the Gene Expression Omnibus (GEO) database (GSE14020).