Breast cell lines (BCLs) were obtained from the ATCC or from collections developed in the laboratories of Drs. S. Ethier (SUM44, SUM52, SUM149, SUM159, SUM185, SUM190, SUM225, SUM229), V.J. Möbus (BrCa-MZ-01), and V. Catros (S68). All BCLs tested were derived from carcinomas except MCF10A, which was derived from fibrocystic disease, and the HMEC-derived 184A1, which was derived from normal mammary tissue. The cell lines were grown using the recommended culture conditions (Supplementary Table 1). All experiments were done with subconfluent cells in the exponential phase of growth.
ALDEFLUOR assay and separation of the ALDH-positive population by FACS
ALDH activity was assessed in 33 BCLs representing the main molecular subtypes of human breast cancer. The ALDEFLUOR kit (StemCell Technologies, Durham, NC, USA) was used to isolate the population with high ALDH enzymatic activity using a FACStarPLUS (Becton Dickinson) as previously described (15). Briefly, cells were incubated in ALDEFLUOR assay buffer containing ALDH substrate (BAAA, 1 μmol/L per 1×10^6 cells). In each experiment, a sample of cells was stained under identical conditions with 50 mmol/L of diethylaminobenzaldehyde (DEAB), a specific ALDH inhibitor, as a negative control. The sorting gates were established using PI-stained cells for viability. Prior to RNA profiling or injection into NOD/SCID mice, the purity of the sorted populations was checked by double sorting 10,000 ALDEFLUOR-positive and -negative cells from the BrCa-MZ-01 and SUM159 cell lines. For both cell lines, sorted ALDEFLUOR-positive populations contained more than 98% ALDEFLUOR-positive cells, and no ALDEFLUOR-positive cells were detected in the ALDEFLUOR-negative population.
Tumorigenicity in NOD/SCID mice
Tumorigenicity of ALDEFLUOR-positive, -negative, and unseparated SUM159, MDA-MB-453, and BrCa-MZ-01 cells was assessed in NOD/SCID mice. Fat pads were prepared as described (15).
ALDEFLUOR-positive, -negative, and unseparated cells from 184A1, SUM149, and SUM159 were plated as single cells in ultra-low attachment plates (Corning, Acton, MA) at low density (5,000 viable cells/ml). Cells were grown in serum-free mammary epithelial basal medium (Cambrex Bio Science, Walkerville, MD) for 3-7 days, as described (16). The capacity of cells to form spheres was quantified after treatment with different doses of IL8 (GenWay Biotech, San Diego, CA) added to the medium.
Total RNA was extracted from frozen ALDEFLUOR-positive and -negative cells using the DNA/RNA All Prep Maxi Kit, according to the manufacturer's instructions (Qiagen, Sample and Assay Technologies, The Netherlands). Eight BCLs were used for transcriptional analysis: 184A1, BrCa-MZ-01, HCC1954, MDA-MB-231, MDA-MB-453, SK-BR-7, SUM149, and SUM159. RNA integrity was checked by denaturing formaldehyde agarose gel electrophoresis and microanalysis (Agilent Bioanalyzer, Palo Alto, CA).
Gene expression profiling with DNA microarrays
Gene expression analyses used Affymetrix U133 Plus 2.0 human oligonucleotide microarrays containing over 47,000 transcripts and variants, including 38,500 well-characterized human genes. Preparation of cRNA, hybridizations, washes, and detection were done as recommended by the supplier. Expression data were analyzed by the Robust Multichip Average (RMA) method in R using Bioconductor and associated packages (17), as described (18). RMA performed background adjustment, quantile normalization, and summarization of the 11 oligonucleotides per gene.
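To illustrate the quantile normalization step named above, the following is a minimal Python sketch, not the Bioconductor implementation used for the analysis. It assumes one list of intensities per sample and breaks ties by position, which is a simplification; the function name and the toy values are hypothetical.

```python
def quantile_normalize(columns):
    """Quantile-normalize equal-length sample columns (illustrative sketch).

    Each value is replaced by the mean, across samples, of the values that
    share its rank. Ties are broken by position, a simplification.
    """
    n_rows = len(columns[0])
    sorted_cols = [sorted(col) for col in columns]
    # Reference distribution: mean of the k-th smallest value across samples.
    reference = [
        sum(col[k] for col in sorted_cols) / len(columns) for k in range(n_rows)
    ]
    normalized = []
    for col in columns:
        # Indices of this sample's values, from smallest to largest.
        order = sorted(range(n_rows), key=lambda i: col[i])
        out = [0.0] * n_rows
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


# Hypothetical toy data: three samples, four probes each.
samples = [[5.0, 2.0, 3.0, 4.0], [4.0, 1.0, 4.5, 2.0], [3.0, 4.0, 6.0, 8.0]]
normalized = quantile_normalize(samples)
```

After normalization, every sample has the same distribution of values, so between-array intensity differences no longer drive downstream comparisons.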
Before analysis, a filtering step removed genes with low and poorly measured expression, defined as an expression value below 100 units in all 16 samples, retaining 25,285 genes/ESTs. A second filter, based on the standard deviation (SD) of expression, was applied for unsupervised analyses to exclude genes showing little variation in expression across the samples. SD was calculated on log2-transformed data, in which the lowest values were first floored to a minimal value of 100 units (the background intensity), retaining 13,550 genes/ESTs with an SD greater than 0.5. An unsupervised analysis was done on the 16 samples of ALDEFLUOR-positive and -negative cells using these 13,550 genes. Before hierarchical clustering, filtered data were log2-transformed and submitted to the Cluster program (19), using data median-centered on genes, Pearson correlation as the similarity metric, and centroid linkage clustering. Results were displayed using the TreeView program (19). To identify and rank genes discriminating ALDEFLUOR-positive and -negative populations, a Mann-Whitney U test was applied to the 25,285 genes/ESTs, and the false discovery rate (FDR) was used to correct for multiple testing (see Supplementary Table 2 for the complete data set). The classification power of the discriminator signature was illustrated by classifying samples by hierarchical clustering. Leave-one-out cross-validation (LOOCV) was applied to estimate the prediction accuracy of the identified molecular signatures and the validity of the supervised analysis: each sample was excluded in turn and classified by linear discriminant analysis (LDA) (20) using a model defined on the remaining samples.
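The two filters and the per-gene test described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the code used for the study: the Mann-Whitney p-value uses the normal approximation without tie correction, and the Benjamini-Hochberg procedure is assumed for the FDR because the text does not name the method. All function names are hypothetical.

```python
import math

FLOOR = 100.0    # background intensity stated in the text
SD_CUTOFF = 0.5  # SD threshold on log2 data stated in the text


def is_expressed(values, threshold=FLOOR):
    """First filter: drop a gene only if it is below threshold in all samples."""
    return any(v >= threshold for v in values)


def log2_sd(values, floor=FLOOR):
    """Sample SD of log2 intensities after flooring low values to `floor`."""
    logs = [math.log2(max(v, floor)) for v in values]
    mean = sum(logs) / len(logs)
    return math.sqrt(sum((x - mean) ** 2 for x in logs) / (len(logs) - 1))


def mann_whitney_p(a, b):
    """Two-sided Mann-Whitney U p-value (normal approximation, no tie correction)."""
    pooled = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    rank_sum_a = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j for tied values
        rank_sum_a += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(a), len(b)
    u = rank_sum_a - n1 * (n1 + 1) / 2.0
    mu = n1 * n2 / 2.0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    return math.erfc(abs(u - mu) / sigma / math.sqrt(2.0))


def benjamini_hochberg(p_values):
    """BH-adjusted p-values in input order (assumed FDR procedure)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

In this sketch a gene would enter the supervised analysis if `is_expressed` returns true, and the unsupervised analysis if, in addition, `log2_sd` exceeds `SD_CUTOFF`. With eight samples per group, an exact test would be preferable to the normal approximation used here for brevity.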
After ALDEFLUOR-positive and ALDEFLUOR-negative populations from different cell lines were sorted, total RNA was isolated using the RNeasy Mini Kit (QIAGEN) and used for real-time quantitative RT-PCR (qRT-PCR) assays in an ABI PRISM® 7900HT sequence detection system with a 384-well block module and automation accessory (Applied Biosystems). Primers and probes for the TaqMan system were selected from the Applied Biosystems website. The sequences of the PCR primer pairs and fluorogenic probes used are available on the Applied Biosystems website (CXCR1 assay ID: Hs_00174146_mi; FBXO21 assay ID: Hs_00372141_mi; NFYA assay ID: Hs_00953589_mi; NOTCH2 assay ID: Hs_01050719_mi; RAD51L1 assay ID: Hs00172522_mi; TBP assay ID: Hs_00427620_mi). The relative mRNA expression levels of CXCR1, FBXO21, NFYA, NOTCH2, and RAD51L1 were computed with respect to the internal standard gene TBP to normalize for variations in RNA quality and in the amount of input cDNA, as described previously (21).
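As an illustration of normalizing a target gene to an internal standard such as TBP, the sketch below uses the comparative Ct calculation, 2^-(ΔCt). This is an assumption: the exact computation is the one described in reference 21, which is not reproduced here. The sketch also assumes roughly 100% amplification efficiency for both assays, and the Ct values are hypothetical.

```python
def relative_expression(ct_target, ct_reference):
    """Target mRNA level relative to a reference gene, as 2^-(dCt).

    Both arguments are lists of replicate Ct values. Assumes about 100%
    amplification efficiency for both assays (illustrative assumption).
    """
    mean_target = sum(ct_target) / len(ct_target)
    mean_reference = sum(ct_reference) / len(ct_reference)
    return 2.0 ** -(mean_target - mean_reference)


def fold_change(ct_target_a, ct_ref_a, ct_target_b, ct_ref_b):
    """Ratio of normalized expression in population A over population B."""
    return relative_expression(ct_target_a, ct_ref_a) / relative_expression(
        ct_target_b, ct_ref_b
    )


# Hypothetical triplicate Ct values for a target gene and TBP.
positive = relative_expression([25.0, 25.0, 25.0], [24.0, 24.0, 24.0])
negative = relative_expression([27.0, 27.0, 27.0], [24.0, 24.0, 24.0])
ratio = positive / negative
```

Dividing by the reference gene signal in each sample cancels differences in RNA quality and cDNA input, so the remaining ratio reflects the target gene itself.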
Assays were done in triplicate in transwell chambers with 8 μm pore polycarbonate filter inserts for 12-well plates (Corning, NY). Filters were coated with 30 μl of ice-cold 1:6 basement membrane extract (Matrigel, BD-Bioscience) in DMEM/F12 and incubated for 1 hour at 37°C. Cells were added to the upper chamber in 200 μl of serum-free medium. For the invasion assay, 5,000 cells were seeded on the Matrigel-coated filters, and the lower chamber was filled with 600 μl of medium supplemented with 10% human serum (Cambrex) or with 600 μl of serum-free medium supplemented with IL8 (100 ng/mL). After 48 hours of incubation, the cells on the underside of the filter were counted using light microscopy. Invasion was expressed relative to that of the corresponding unseparated cell line under the serum condition.
For luciferase gene transduction, 70% confluent cells from HCC1954, MDA-MB-453, and SUM159 were incubated overnight with a 1:3 precipitated mixture of lentiviral supernatants Lenti-LUC-VSVG (Vector Core, Ann Arbor, MI) in culture medium. The following day, the cells were harvested with trypsin/EDTA and subcultured at a ratio of 1:6. After 1 week of incubation, cells were sorted according to the ALDEFLUOR phenotype, and luciferase expression was verified in each sorted population (ALDEFLUOR-positive and ALDEFLUOR-negative) by adding 2 μl of 0.0003% D-luciferin (Promega, Madison, WI) to the culture medium and counting photon flux with a charge-coupled device camera system (Xenogen, Alameda, CA) (Supplementary Figure 1).
Six-week-old NOD/SCID mice were anesthetized with a 2% isoflurane/air mixture and injected in the left ventricle of the heart with 100,000 cells in 100 μL of sterile Dulbecco's PBS lacking Ca2+ and Mg2+. For each of the three cell lines (HCC1954, MDA-MB-453, SUM159) and for each population (ALDEFLUOR-positive, ALDEFLUOR-negative, and unsorted), three animals were injected.
Baseline bioluminescence was assessed before inoculation and each week thereafter. Mice were anesthetized with a 2% isoflurane/air mixture and given a single i.p. dose of 150 mg/kg D-luciferin (Promega, Madison, WI) in PBS. For photon flux counting, we used a charge-coupled device camera system (Xenogen, Alameda, CA) with a nose-cone isoflurane delivery system and a heated stage for maintaining body temperature. Results were analyzed after 2 to 12 minutes of exposure using the Living Image software provided with the Xenogen imaging system. Signal intensity was quantified as the sum of all detected photon flux counts within a uniform region of interest placed manually during data postprocessing. Normalized photon flux is the ratio of the photon flux detected each week after inoculation to the photon flux detected before inoculation.
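The normalization defined above is a simple ratio, which can be written out as a short Python sketch. The function name and the flux values are hypothetical and serve only to make the definition concrete.

```python
def normalized_photon_flux(weekly_flux, baseline_flux):
    """Divide each weekly photon flux by the pre-inoculation baseline flux."""
    if baseline_flux <= 0:
        raise ValueError("baseline photon flux must be positive")
    return [flux / baseline_flux for flux in weekly_flux]


# Hypothetical readings for one animal: baseline, then weeks 1 and 2.
baseline = 1.0e5
weekly = [2.0e5, 8.0e5]
normalized = normalized_photon_flux(weekly, baseline)
```

A normalized value above 1 indicates signal above the pre-inoculation background for that animal, which makes animals with different baseline readings comparable.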
Results are presented as the mean ± SD of at least three independent experiments for each group. Statistical analyses were done with SPSS software (version 10.0.5). Correlations between sample groups and molecular parameters were calculated with Fisher's exact test or one-way ANOVA for independent samples. A p-value <0.05 was considered significant.
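For readers who want to see the arithmetic behind two of these quantities, the following Python sketch computes a mean ± SD and a two-sided Fisher's exact test for a 2×2 table. It is an illustration only, not a reproduction of the SPSS analysis; the function names and example numbers are hypothetical, and the two-sided p-value sums the probabilities of all tables no more likely than the observed one, which is one common convention.

```python
import math
import statistics


def mean_sd(values):
    """Return (mean, sample standard deviation) of replicate measurements."""
    return statistics.mean(values), statistics.stdev(values)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def table_probability(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return math.comb(row1, x) * math.comb(row2, col1 - x) / math.comb(total, col1)

    observed = table_probability(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum all tables that are no more probable than the observed one.
    return sum(
        table_probability(x)
        for x in range(lowest, highest + 1)
        if table_probability(x) <= observed * (1 + 1e-9)
    )


# Hypothetical example: three replicate measurements and a 2x2 count table.
mean_value, sd_value = mean_sd([1.0, 2.0, 3.0])
p_value = fisher_exact_two_sided(3, 0, 0, 3)
```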