Invasive ductal carcinoma (IDC) and invasive lobular carcinoma (ILC) are the two major histological types of breast cancer worldwide. Whereas IDC incidence has remained stable, ILC is the most rapidly increasing breast cancer phenotype in the United States and Western Europe. It is not clear whether IDC and ILC represent molecularly distinct entities and what genes might be involved in the development of these two phenotypes. We conducted comprehensive gene expression profiling studies to address these questions. Total RNA from 21 ILCs, 38 IDCs, two lymph node metastases, and three normal tissues was amplified and hybridized to ~42,000-clone cDNA microarrays. Data were analyzed using hierarchical clustering algorithms and statistical analyses that identify differentially expressed genes (significance analysis of microarrays) and minimal subsets of genes (prediction analysis for microarrays) that succinctly distinguish ILCs and IDCs. Eleven of the 21 ILCs (52%; “typical” ILCs) clustered together and displayed gene expression profiles different from those of IDCs, whereas the other ILCs (“ductal-like” ILCs) were distributed among different IDC subtypes. Many of the genes differentially expressed between ILCs and IDCs code for proteins involved in cell adhesion/motility, lipid/fatty acid transport and metabolism, immune/defense response, and electron transport. Many genes that distinguish typical and ductal-like ILCs are involved in regulation of cell growth and immune response. Our data strongly suggest that over half the ILCs differ from IDCs not only in histological and clinical features but also in global transcription programs. The remaining ILCs closely resemble IDCs in their transcription patterns. Further studies are needed to explore the differences between ILC molecular subtypes and to determine whether they require different therapeutic strategies.
Invasive ductal carcinoma (IDC) and invasive lobular carcinoma (ILC) are the major histological types of invasive breast cancer among women of different races worldwide, ranging from 47 to 79% and 2 to 15%, respectively (Harris et al., 2000 ). Although histologically disparate, these tumor types show clinical similarities and differences. Characteristics such as tumor site, size, grade, and stage at presentation are similar for both types (Winchester et al., 1998 ). ILCs often present with subtler signs on physical examination and mammography due to their characteristic histology and absence of a sclerotic tissue reaction. In contrast to a mammographic mass, asymmetric density or architectural distortion are the predominant mammographic signs in more ILCs than IDCs, whereas malignant calcifications are less frequent in ILCs (Newstead et al., 1992 ). IDC and ILC are managed similarly, but whether overall survival rates of patients differ is controversial (Yeatman et al., 1995 ; Toikkanen et al., 1997 ; Winchester et al., 1998 ). However, the metastatic patterns of IDC and ILC are clearly different, with gastrointestinal, gynecologic, and peritoneal-retroperitoneal metastases, particularly to endocrine-related sites such as adrenal glands and ovaries, markedly more prevalent in ILCs (Dixon et al., 1991 ; Borst and Ingold, 1993 ; Bumpers et al., 1993 ; Sastre-Garau et al., 1996 ).
At the molecular level, IDC and ILC seem to show more differences than similarities. They differ in hormone receptor status with 55–72% of IDCs being estrogen receptor (ER) positive compared with 70–92% of ILCs, and 33–70% of IDCs being progesterone receptor (PR) positive compared with 63–67% of ILCs (Harris et al., 2000 ). A number of proteins have also been found to be differentially expressed in IDC and ILC, including vimentin, cathepsin D, thrombospondin, E-cadherin, vascular endothelial growth factor, cytokeratin 8, and cyclin A (Domagala et al., 1990 ; Bedner et al., 1995 ; Serre et al., 1995 ; Berx et al., 1996 ; Lee et al., 1998 ; Lehr et al., 2000 ; Coradini et al., 2002 ). In addition, differences in genetic alterations in IDC and ILC have been observed. Genes such as ERBB2 (Rosenthal et al., 2002 ) and p21 (Rey et al., 1998 ) show a markedly higher amplification rate in IDC than in ILC. In contrast, loss of chromosome 16q (site of the E-cadherin gene) is observed at a much higher frequency in ILC than in IDC (Serre et al., 1995 ; Cleton-Jansen, 2002 ), and particularly more frequent in ILC than poorly differentiated IDC (Buerger et al., 2000 ). However, IDC and ILC share certain characteristics in gene expression. Well differentiated IDC and ILC show similar expression of some proliferation and cell cycle regulated genes, including cyclin D1, p16, p27, mdm-2, and mib-1 (Geradts and Ingram, 2000 ; Soslow et al., 2000 ), and similar bcl-2 and HIF-1α expression (Coradini et al., 2002 ).
Recent research reported a disproportionate increase of ILCs in the United States and Europe, possibly associated with increased usage of combined hormone replacement therapy (Li et al., 2000a ,b , 2003 ; Daling et al., 2002 ; Verkooijen et al., 2003 ). In the United States, ductal carcinoma incidence rates remained essentially constant from 1987 to 1999, whereas lobular carcinoma rates increased steadily, significantly increasing the proportion of breast cancer with a lobular component from 9.5 to 15.6% during that time period. In Switzerland, there has been a mean annual increase in the incidence of IDC of 1.2% compared with a mean annual increase of 14.4% for ILC during the period 1976–1999. Use of combined hormone replacement therapy, but not estrogen replacement therapy alone, seems to increase the risk of developing ILC by 2.7-fold, whereas the increase in IDC risk is only 1.5-fold (Li et al., 2003 ).
Because ILC is the most rapidly increasing breast cancer phenotype, more difficult to diagnose than IDC, and yet is treated similarly to IDC, it is imperative to determine whether the clinical treatment of ILC should differ from IDC. To individualize breast cancer treatment, a molecular understanding of the mechanisms that underlie the development of these two phenotypes is crucial.
It is not clear whether IDC and ILC represent molecularly distinct entities and what genes might be involved in the development of these two phenotypes. Traditional studies have focused on a small number of genes in a large number of cases, whereas microarray analysis has provided us a powerful tool to explore gene expression on a genome-wide scale. By using this technology, breast cancers have been molecularly classified into a number of subtypes associated with different clinical outcomes (Sorlie et al., 2001 , 2003 ; West et al., 2001 ; Ahr et al., 2002 ; van't Veer et al., 2002 ). Sets of genes have been identified as signatures of each subtype and may be potentially useful in drug design and patient care (Perou et al., 2000 ; Ahr et al., 2001 ; van de Vijver et al., 2002 ). However, IDC has been the dominant histological subtype investigated in most of these studies.
To better understand the biology of the two predominant phenotypes of breast carcinoma at the molecular level, we conducted comprehensive studies on the gene expression profiles of IDC and ILC by using RNA amplification and cDNA microarray technology. A total of 64 breast tissues from American and Norwegian patients, including 21 ILCs, 38 IDCs, two lymph node metastases, and three normal tissues, were analyzed on arrays containing >42,000 clones. Hierarchical clustering analysis (Eisen et al., 1998 ), significance analysis of microarrays (SAM) (Tusher et al., 2001 ), prediction analysis for microarrays (PAM) (Tibshirani et al., 2002 ), and Pearson's correlation analysis were used to address whether ILCs and IDCs are molecularly distinct entities, what similarities and differences exist in the gene expression profiles of these two phenotypes of breast cancer, and whether there are molecularly distinct subtypes within ILCs.
We selected a total of 59 primary breast cancer cases of which 28 were from Stanford Hospital (BC samples) and 31 from a series of patients from Ullevål Hospital, Oslo (ULL samples) (Bukholm et al., 1997 ). Cases had been accrued in accordance with local institutional review board guidelines. Of these, 38 were IDCs and 21 were ILCs. The distribution of cases according to patient source, lymph node status, tumor grade, patient age, the expression of hormone receptors (ER and PR) and a prognostic marker ERBB2/HER2/neu, and the tumor component (the percentage of carcinoma cells on an adjacent frozen or permanent section of the solid tumor) are shown in Table 1. For complete details for each case, see Clinical and Pathology Parameters on the Web supplement at http://genome-www.stanford.edu/breast_cancer/lobular/. The IDCs and ILCs had similar tumor characteristics, except that almost all of the ILCs were grade II tumors and most were from patients >55 years of age at diagnosis. In addition, no HER2/neu-positive tumors were present in the 14 ILCs with known HER2/neu status, whereas 8 of 31 IDCs with known HER2/neu status had positive expression. Two lymph node metastases and three normal breast samples from five IDC patients were also included in the study.
Primary breast carcinomas were frozen in either liquid nitrogen or on dry ice within 20 min after devascularization and stored at -80°C. Frozen sections were cut from primary breast carcinoma specimens and stained with hematoxylin and eosin to confirm tumor content. Specimens in which at least 40% of the cells were carcinoma cells were used in this study. Two experienced breast pathologists separately reviewed, classified, and graded all tumor specimens according to the modified Scarff-Bloom-Richardson method (Elston and Ellis, 1993 ). Cases with discrepancies were reviewed together to obtain consensus. Details of tumor specimen histology are available on the Web at http://genome-www.stanford.edu/breast_cancer/lobular/.
Total RNA was isolated from primary tumor tissue using TRIzol solution (Invitrogen, Carlsbad, CA) after homogenization with a PowerGen model 125 (Fisher Scientific, Pittsburgh, PA). The concentration of total RNA was determined using a GeneSpec I spectrophotometer (Hitachi, Yokohama, Japan), and the integrity of the RNA was assessed using a 2100 Bioanalyzer (Agilent Technologies, Palo Alto, CA). Amplification of total RNA was performed using an optimized protocol described previously (Zhao et al., 2002 ). Amplified tumor RNA was labeled with Cy5, and amplified RNA from Universal Human Reference total RNA (Stratagene, La Jolla, CA) was labeled with Cy3. The labeling and hybridization of amplified RNA to cDNA microarrays containing >42,000 elements were performed as described previously (Zhao et al., 2002 ). Complete experimental protocols can be found at http://genome-www.stanford.edu/breast_cancer/lobular/ or http://www.stanford.edu/group/sjeffreylab/. Multiple batches of arrays were used in this study; this did not seem to influence the sample distribution in hierarchical clustering analysis in any significant way. Details of the normalization of the intensity levels can be found at http://genome-www5.stanford.edu/help/results_normalization.shtml.
The arrays with hybridized probes were scanned using an Axon scanner. The scanned images were analyzed first using GenePix Pro 3.0 software (Axon Instruments, Foster City, CA), and spots judged to be of poor quality by visual inspection were removed from further analysis. The resulting data collected from each array were submitted to the Stanford Microarray Database (SMD; http://genome-www5.stanford.edu/microarray/SMD) (Sherlock et al., 2001 ; Gollub et al., 2003 ). Only features with a regression correlation (among all pixels within a feature) >0.5 and a signal intensity >50% above background in both Cy5 and Cy3 channels were retrieved from SMD.
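The two retrieval criteria can be illustrated with a minimal sketch. This is not the SMD implementation: the field names are hypothetical, and "signal intensity >50% above background" is interpreted here as signal greater than 1.5 times background, which is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    """One spot on one array (hypothetical field names, not SMD's schema)."""
    clone_id: str
    regression_corr: float  # pixel-wise Cy5/Cy3 regression correlation
    cy5_signal: float
    cy5_background: float
    cy3_signal: float
    cy3_background: float


def passes_quality(f, min_corr=0.5, min_fold=1.5):
    """Return True if the feature meets both retrieval criteria.

    min_fold=1.5 encodes the assumed reading of ">50% above background".
    """
    cy5_ok = f.cy5_signal > min_fold * f.cy5_background
    cy3_ok = f.cy3_signal > min_fold * f.cy3_background
    return f.regression_corr > min_corr and cy5_ok and cy3_ok


# Toy features, invented for illustration only.
features = [
    Feature("clone_A", 0.92, 1800.0, 300.0, 1500.0, 280.0),  # passes both tests
    Feature("clone_B", 0.41, 2000.0, 300.0, 1900.0, 280.0),  # low correlation
    Feature("clone_C", 0.85, 400.0, 300.0, 1200.0, 280.0),   # weak Cy5 signal
]
kept = [f.clone_id for f in features if passes_quality(f)]
```

Requiring both channels to pass means a spot is dropped whenever either the tumor (Cy5) or the reference (Cy3) measurement is unreliable, since the log ratio depends on both.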
Hierarchical Clustering. A hierarchical clustering algorithm (Eisen et al., 1998 ) was applied to group genes and samples on the basis of their similarities in expression, and the results were visualized using TreeView software (http://rana.lbl.gov/EisenSoftware.htm). The first clustering was performed on all 64 samples by using 4539 clones representing 3341 genes whose expression varied at least threefold from the mean abundance across all samples in at least three samples and was measurable in at least 70% of the samples included in the analysis. The second clustering was performed on 59 IDCs and ILCs by using 78 clones representing 45 named genes identified in PAM analysis. The third clustering was performed on the 59 primary tumors by using 481 genes (represented by 548 clones) of 500 intrinsic genes identified previously (Perou et al., 2000 ; Sorlie et al., 2001 , 2003 ) (see below) whose expression was measurable in at least 70% of the samples included in the analysis. The criteria for spot quality control and gene filtering before hierarchical clustering are somewhat arbitrary. However, prior work has shown that expression variations selected similarly reliably reflect changes in expression levels measured by other methods (Lossos et al., 2002 ; Chen et al., 2003 ).
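The gene filter used for the first clustering can be sketched as follows. The sketch assumes that expression values are log2 ratios and that "threefold from the mean" is evaluated in log space (that is, relative to the geometric mean of the ratios); both are assumptions, and the function name and toy values are illustrative only.

```python
import math


def passes_variation_filter(log2_ratios, fold=3.0, min_samples=3, min_present=0.7):
    """Decide whether one clone is kept for clustering.

    log2_ratios: one value per sample; None marks an unmeasurable spot.
    The clone is kept if it is measurable in at least `min_present` of the
    samples and deviates from its mean by at least `fold` in at least
    `min_samples` samples.
    """
    present = [v for v in log2_ratios if v is not None]
    if len(present) < min_present * len(log2_ratios):
        return False
    mean = sum(present) / len(present)
    cutoff = math.log2(fold)  # a threefold change is about 1.585 in log2 units
    n_variable = sum(1 for v in present if abs(v - mean) >= cutoff)
    return n_variable >= min_samples


# Toy clones across 10 samples, invented for illustration.
variable = [2.0, 2.0, 2.0, -2.0, -2.0, -2.0, 0.0, 0.0, 0.0, None]
flat = [0.1, -0.2, 0.0, 0.3, -0.1, 0.2, 0.0, 0.1, -0.3, 0.1]
sparse = [3.0, -3.0, 3.0, None, None, None, None, -3.0, None, None]

results = [passes_variation_filter(g) for g in (variable, flat, sparse)]
```

In this toy example only the first clone is retained: the second never departs far enough from its mean, and the third is measurable in too few samples.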
SAM. Genes with potentially significant changes in expression between ILCs and IDCs were identified using the SAM procedure (Tusher et al., 2001 ) (http://www-stat.stanford.edu/~tibs/SAM/), which computes a two-sample t-statistic of ILCs and IDCs for the normalized log ratios of gene expression levels for each gene. It thresholds the t-statistics to provide a “significant” gene list and provides an estimate of the false discovery rate (the percentage of genes identified by chance alone) from randomly permuted data. We performed a SAM analysis on 32,345 clones representing 20,375 genes whose expression was measurable in at least 70% of the 59 primary tumors and filtered only by spot quality (regression correlation >0.5, signal intensity >50% above background in both Cy5 and Cy3 channels). We used a selection threshold giving the lowest median estimate of 0.6 false positive genes (false discovery rate of 0.1%).
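The general idea of thresholding a two-sample statistic and estimating the number of false positives from permuted class labels can be sketched as below. This is a simplified illustration and not the SAM software: a fixed cutoff on the absolute statistic stands in for SAM's own thresholding procedure, the constant `s0` is an arbitrary illustrative value, and the toy data are invented.

```python
import math
import random
import statistics


def t_stat(a, b, s0=0.05):
    """Two-sample statistic with pooled variance.

    s0 is a small positive constant added to the denominator so that genes
    with very low variance do not dominate (arbitrary value, assumption).
    """
    na, nb = len(a), len(b)
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    ss = sum((x - ma) ** 2 for x in a) + sum((x - mb) ** 2 for x in b)
    s = math.sqrt(ss / (na + nb - 2) * (1 / na + 1 / nb))
    return (ma - mb) / (s + s0)


def estimate_fdr(genes, labels, cutoff, n_perm=200, seed=0):
    """genes: {name: [log ratio per sample]}; labels: 1 (ILC) or 0 (IDC)."""
    def called(lbls):
        out = []
        for name, vals in genes.items():
            a = [v for v, lab in zip(vals, lbls) if lab == 1]
            b = [v for v, lab in zip(vals, lbls) if lab == 0]
            if abs(t_stat(a, b)) >= cutoff:
                out.append(name)
        return out

    significant = called(labels)
    rng = random.Random(seed)
    false_counts = []
    for _ in range(n_perm):
        shuffled = labels[:]
        rng.shuffle(shuffled)  # break the link between labels and expression
        false_counts.append(len(called(shuffled)))
    median_false = statistics.median(false_counts)
    fdr = median_false / len(significant) if significant else 0.0
    return significant, median_false, fdr


# Toy data: 5 "ILC" and 5 "IDC" samples, one truly different gene.
labels = [1] * 5 + [0] * 5
genes = {
    "gene_diff": [3.0, 3.1, 2.9, 3.2, 2.8, 0.0, 0.1, -0.1, 0.2, -0.2],
    "gene_noise1": [0.1, -0.2, 0.0, 0.3, -0.1, 0.2, 0.0, 0.1, -0.3, 0.1],
    "gene_noise2": [0.0, 0.2, -0.1, 0.1, 0.0, -0.2, 0.1, 0.0, 0.2, -0.1],
}
significant, median_false, fdr = estimate_fdr(genes, labels, cutoff=4.0)
```

The median count of genes called in the permuted data plays the role of the "median estimate of false positive genes" quoted in the text; dividing it by the number of genes called in the real data gives the estimated false discovery rate.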
PAM. PAM performs sample classification using the nearest shrunken centroid method with automatic gene selection and cross-validation (Tibshirani et al., 2002 ). It uses a threshold parameter Δ to select genes for class prediction. The first PAM classification, comparing ILCs with IDCs, was performed using the PAM for R package (http://www-stat.stanford.edu/~tibs/PAM/) on the 32,345 clones filtered as described above. We varied Δ to find a value that balanced prediction accuracy with the number of genes in the predictive model. A threshold of 2.9, which gave the lowest overall error rate with the minimum number of predictive genes, was selected. Another PAM analysis was performed comparing typical ILCs and IDCs (details available on the Web supplement). A final PAM classification was performed comparing typical ILCs with ductal-like ILCs on 23,914 clones representing 15,281 genes whose expression was measurable in at least 80% of the 21 ILCs and filtered only by spot quality as described above. A threshold of 2.3 was chosen for the same reasons.
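The nearest shrunken centroid idea can be sketched in a few lines. This is an illustrative reimplementation, not the PAM for R package: the constant `s0`, the form of the class scaling factor `m_k`, and the toy data are assumptions, and cross-validation is omitted.

```python
import math


def soft_threshold(d, delta):
    """Shrink d toward zero by delta; values within delta of zero become 0."""
    return math.copysign(max(abs(d) - delta, 0.0), d)


def fit_shrunken_centroids(X, y, delta, s0=0.1):
    """X: samples x genes (list of lists); y: class label per sample."""
    n, p = len(X), len(X[0])
    classes = sorted(set(y))
    overall = [sum(row[i] for row in X) / n for i in range(p)]
    members = {k: [row for row, lab in zip(X, y) if lab == k] for k in classes}
    means = {k: [sum(r[i] for r in rows) / len(rows) for i in range(p)]
             for k, rows in members.items()}
    # Pooled within-class standard deviation for each gene.
    s = []
    for i in range(p):
        ss = sum((row[i] - means[lab][i]) ** 2 for row, lab in zip(X, y))
        s.append(math.sqrt(ss / (n - len(classes))))
    centroids = {}
    for k in classes:
        # Class scaling factor; this particular form is an assumption.
        m_k = math.sqrt(1.0 / len(members[k]) - 1.0 / n)
        cent = []
        for i in range(p):
            scale = m_k * (s[i] + s0)
            d = (means[k][i] - overall[i]) / scale
            cent.append(overall[i] + scale * soft_threshold(d, delta))
        centroids[k] = cent
    priors = {k: len(members[k]) / n for k in classes}
    return {"centroids": centroids, "overall": overall,
            "s": s, "s0": s0, "priors": priors}


def selected_genes(model):
    """Indices of genes whose centroid survives shrinkage in some class."""
    p = len(model["overall"])
    return [i for i in range(p)
            if any(c[i] != model["overall"][i]
                   for c in model["centroids"].values())]


def predict(model, x):
    """Assign x to the class with the smallest standardized distance."""
    def score(k):
        cent = model["centroids"][k]
        dist = sum(((xi - ci) / (si + model["s0"])) ** 2
                   for xi, ci, si in zip(x, cent, model["s"]))
        return dist - 2.0 * math.log(model["priors"][k])
    return min(model["centroids"], key=score)


# Toy data: gene 0 separates the classes, gene 1 is mostly noise.
X = [[2.0, 0.3], [2.2, -0.1], [1.8, 0.1],
     [-2.0, -0.1], [-1.8, 0.1], [-2.2, -0.3]]
y = ["ILC"] * 3 + ["IDC"] * 3
model = fit_shrunken_centroids(X, y, delta=2.0)
```

Raising Δ removes more genes from the classifier because more class centroids are shrunk all the way to the overall centroid, which is the trade-off between accuracy and gene count described above. In the toy example, `selected_genes(model)` keeps only gene 0.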
Pearson's Correlation Analysis Using Centroids. Previously, an “intrinsic” gene list had been selected consisting of 500 genes represented by 561 clones whose expression varied the least in successive samples from the same patient's tumor but which showed the most variation among tumors from different patients (Perou et al., 2000 ; Sorlie et al., 2001 , 2003 ). Five sets of centroids (i.e., profiles, consisting of the average expression for the 500 intrinsic genes) corresponding to the five subtypes of breast carcinomas were recently published using data from 122 breast samples (Perou et al., 2000 ; Sorlie et al., 2001 , 2003 ). In total, 455 of the centroid genes (represented by 484 clones) were measurable in at least 70% of the 59 primary tumors in our dataset. We then computed the Pearson's correlation coefficients (r) of each of our samples to each of these five sets of centroids. An r threshold of 0.14 (the 95th percentile) was generated by permutation of the expression values for each gene and computing the maximal correlation of each resulting sample with one of the original five centroids. This was repeated 10 times, and the 95th percentile of these correlations was used as the cutoff to categorize 56 out of 59 primary tumors into the five subtypes.
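The centroid assignment and the permutation-based cutoff can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually run: the permutation step is interpreted as shuffling each gene's values across samples, the percentile is taken by simple indexing into the sorted null values, and the two toy centroids and samples are invented.

```python
import math
import random


def pearson(x, y):
    """Pearson's r over positions where both values are present (not None)."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return 0.0  # correlation is undefined for a constant profile
    return sxy / math.sqrt(sxx * syy)


def best_subtype(sample, centroids):
    """Return (subtype, r) for the centroid with the highest correlation."""
    return max(((name, pearson(sample, c)) for name, c in centroids.items()),
               key=lambda t: t[1])


def assign(sample, centroids, r_min):
    """Assign a subtype only if the best correlation exceeds r_min."""
    name, r = best_subtype(sample, centroids)
    return (name if r > r_min else None), r


def permutation_cutoff(samples, centroids, n_rounds=10, pct=0.95, seed=0):
    """Percentile of the best-centroid r after shuffling genes across samples."""
    rng = random.Random(seed)
    n_genes = len(samples[0])
    null_r = []
    for _ in range(n_rounds):
        columns = []
        for g in range(n_genes):
            col = [s[g] for s in samples]
            rng.shuffle(col)  # permute this gene's values across samples
            columns.append(col)
        for i in range(len(samples)):
            shuffled = [columns[g][i] for g in range(n_genes)]
            null_r.append(best_subtype(shuffled, centroids)[1])
    null_r.sort()
    return null_r[min(int(pct * len(null_r)), len(null_r) - 1)]


# Toy centroids over six genes (two subtypes only, for brevity).
centroids = {
    "basal": [1.0, 0.8, -1.0, -0.9, 0.1, 0.0],
    "luminal_A": [-1.0, -0.7, 0.9, 1.1, 0.0, 0.2],
}
sample_a = [0.9, 1.1, -0.8, -1.2, 0.0, 0.1]
sample_flat = [0.1, -0.1, 0.1, -0.1, 0.2, -0.2]

subtype_a, r_a = assign(sample_a, centroids, r_min=0.14)
subtype_flat, r_flat = assign(sample_flat, centroids, r_min=0.14)
cutoff = permutation_cutoff([sample_a, sample_flat], centroids)
```

A sample whose best correlation does not exceed the cutoff is left unassigned (`None` here), which corresponds to the three tumors that could not be categorized.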
Hierarchical clustering of the 64 samples was performed using the selected 4539 clones representing 3341 genes whose expression varied more than threefold from the overall mean abundance in at least three samples (Figure 1). In the dendrogram shown in Figure 1A, four distinct groups of tumors are apparent, suggesting that the tumors can be divided into four types on the basis of the 3341 differentially expressed genes. The association of tumors within this unsupervised cluster is not due to gene filtering criteria, because the tumor associations are maintained when the data selection criteria are varied. It also seems that the tumor cell content and the adipose and immune components have little influence on this cluster pattern (see Clinical and Pathology Parameters on the Web supplement at http://genome-www.stanford.edu/breast_cancer/lobular/). One striking feature is that 11 of the 21 ILCs (52%) were found in group IV, which also contains three normal breast samples. This suggests that this group of ILCs differs in gene expression profile from IDCs and has more gene expression similarities with normal breast than IDCs do. We refer to this group of ILCs as “typical” ILCs. A fraction of the other ILCs share similar gene expression profiles with IDCs and are referred to as “ductal-like” ILCs. The relatedness of typical ILCs to normal samples is not likely due to the composition of the tumors, because five of eight ILCs with a relatively low percentage of tumor cells (40–60%) clustered elsewhere with IDCs. In addition, genes such as E-cadherin and basal epithelial cell markers (e.g., KRT5, KRT17, and epidermal growth factor receptor [EGFR]) show significantly different expression levels in typical ILCs and normal samples (Figure 1, D, H, and I).
It is also worth noticing that the two lymph node metastases clustered together with the primary tumors they derived from, consistent with our previous findings, suggesting a similar gene expression profile between primary tumor and lymph node metastasis. Each normal sample (derived from the same breast as a corresponding primary tumor but taken from a distant location) exhibited expression profiles similar to other normals (our unpublished data) and different from their corresponding IDC (Figure 1A).
Group I tumors have high relative expression of ER and its regulated genes (Figure 1F). This group displays low relative expression of basal epithelial cell markers, including basal keratins and EGFR (Figure 1, H and I), adipose (Figure 1J) and stromal tissue markers (Figure 1G). Interestingly, the ER-overexpressing group I tumors differentially express genes involved in proliferation and cell cycle regulation (Figure 1C). Group II IDC tumors exhibit the lowest relative expression of ER and its regulated genes (Figure 1F) and high relative expression of basal epithelial cell markers, EGFR, and proliferation and cell cycle-regulated genes (Figure 1, C, H, and I). Stromal and adipose tissue markers in group II are present mainly in the ILC samples (Figure 1, G and J). Groups III and IV are similar in that they both show relatively low proliferation/cell cycle activities (Figure 1C) but differ in other signatures. Specifically, group III has relatively high expression of ER and its regulated genes (Figure 1F), stromal tissue markers (Figure 1G), and variable expression of basal epithelial cell (although relatively low EGFR) and adipose tissue markers (Figure 1, H, I, and J). Group IV tumors, consisting of the typical ILCs, have mixed expression of ER and its regulated genes (Figure 1F) and stromal tissue markers (Figure 1G), variable expression of basal epithelial markers (with relatively low EGFR expression) (Figure 1, H and I) but very high relative expression of adipose tissue markers (Figure 1J). Two markers, E-cadherin and ERBB2, are almost absent from group IV tumors but present in several tumors in the other three groups (Figure 1, D and E). These results suggest that the typical ILCs are molecularly different from IDCs. It is worth noting that group III mainly consists of patients <55 years of age and most had lymph node metastases.
More than one-half the patients in group IV (typical ILCs) also had lymph node metastases but were at least 55 years old at diagnosis.
To identify genes whose expression differs significantly between ILCs and IDCs, we performed SAM analysis (Tusher et al., 2001 ) (http://www-stat.stanford.edu/~tibs/SAM/). SAM selected 474 clones representing 378 unique genes at the lowest median number of falsely significant genes, 0.6. Of the 378 genes, 150 have known biological functions, including 75 genes that show high expression in ILCs and low expression in IDCs, and 75 genes that show the reverse pattern. Most of the 150 genes can be categorized into five biological processes according to Gene Ontology annotations (Ashburner et al., 2000 ): cell adhesion/motility, lipid/fatty acid metabolism, immune and defense response, electron transport, and nucleosome assembly (Table 2). Many genes involved in signal transduction, regulation of transcription, and small molecule transport and metabolism were also among the genes identified by SAM (see Web supplement for full list).
To explore the question of which genes best discriminate ILCs and IDCs, we performed PAM analysis. This method of nearest shrunken centroids is used in cancer class prediction to find genes that best characterize cancer types. Here, we used PAM to identify a minimal subset of genes that succinctly characterized ILCs and IDCs. By using a threshold of 2.9 (Figure 2A), a set of 78 clones representing 45 named genes were selected (Figure 2B), 44 of which were also present in the list of genes identified by SAM. ILCs and IDCs were separated based on the expression pattern of these genes with an overall error rate of 0.15. Specifically, 18 of 21 ILCs (86%) and 32 of 39 IDCs (82%) were correctly classified. BC-L-014, ULL-L-014, and ULL-L-028 were the exceptions and they all belonged to the ductal-like ILCs. When the 78 clones were used in a hierarchical clustering of all 59 tumor samples, the same three ductal-like ILC samples were placed on a main ductal branch containing most of the IDCs, separate from the lobular branch that contained 18 ILCs (Figure 2C). All typical ILCs clustered together in a core on the lobular branch with ductal-like ILCs positioned at the edges. Two group I IDCs (ULL-D-056 and ULL-D-216) and three group II IDCs (BC-D-007, BC-D-032, and BC-D-035) also are on the lobular branch, although most are on one edge near the ductal-like ILCs. Each of the IDCs on the lobular branch is ER and/or PR positive (see Clinical and Pathology Parameters on the Web supplement).
The most important discriminator identified by PAM is cadherin 1 (CDH1, E-cadherin). Four different clones representing CDH1 were among the top discriminators (Figure 2B). Their average expression ratio in ILCs was 4.2-fold lower than that in IDCs, consistent with previous immunohistological studies of CDH1 in ILCs and IDCs. It is worth noting that BC-D-048 has low expression of CDH1 similar to ILCs, which is consistent with invasiveness and unfavorable prognosis (Siitonen et al., 1996 ; Hunt et al., 1997 ; Nagae et al., 2002 ). Seven other genes (SORBS1, VWF, AOC3, MMRN, ITGA7, CD36, and ANXA1) functioning in cell adhesion were also selected as discriminators, suggesting that cell adhesion properties differ between ILCs and IDCs. A number of other genes with high ranks among the identified discriminators are involved in lipid/fatty acid transport and metabolism, including FABP4, LPL, PLIN, ANXA1, and CD36, indicating a potential difference in lipid/fatty acid metabolism between ILC and IDC tumor tissue. An interesting electron transport gene overexpressed in ILCs is glutathione peroxidase 3, which catalyzes the reduction of hydrogen peroxide, organic hydroperoxides, and lipid peroxides, protecting cells against oxidative damage. Together, these results demonstrate that the majority of ILCs can be distinguished from IDCs by expression patterns of a small set of genes involved in several biological processes.
When typical ILCs were compared with IDCs by PAM analysis (see Web Supplement), 26 clones representing 14 named genes were identified that best distinguished the two groups with an overall misclassification error rate of 0.102 (0% error rate for the typical ILCs, 13% error rate for the IDCs). Twenty-one of the 26 clones were present among the 78 clones previously identified by PAM that distinguished ILCs and IDCs. Among the five clones not previously identified, there were two named genes: PDE2A (phosphodiesterase 2A, cGMP-stimulated) and EBF (early B-cell factor). These two genes are also present in a PAM analysis that distinguishes typical ILCs from ductal-like ILCs (Figure 4, B and C), discussed below.
To further assess the degree of differences between gene expression profiles in ILCs and IDCs, and to compare that to the previous classification into five subclasses (luminal A, luminal B, ERBB2, basal, and normal-like), we performed Pearson's correlation by using the five sets of centroids recently defined in Sorlie et al. (2003 ). These sets of centroids consist of the average expression of the 500 intrinsic genes corresponding to each of the five subtypes. The Pearson's correlation coefficients between the expression ratio of 455 intrinsic genes in our 59 tumor samples, and the five sets of centroids were calculated. Fifty-six of 59 carcinomas were assigned to a subtype by the highest r (Figure 3, A and B), confirming the existence of the five centroids also in this set of tumors. The three tumors that could not be classified using an r threshold of 0.14 (determined by multiple permutations of gene expression values) were all typical ILCs (ULL-L-024, ULL-L-058, and ULL-L-105, colored gray in Figure 3).
The correlation coefficients between our 59 samples and the centroids of the five subtypes provide additional evidence that typical ILCs are different from ductal-like ILCs and IDCs in their gene expression profile. Seven of the eight typical ILCs that have >0.14 correlation coefficients were assigned to the normal-like subtype (Figure 3A), consistent with hierarchical clustering results shown in Figure 1. Only one typical ILC was assigned to another subtype (BC-L-090, assigned to basal subtype with an r of 0.25 compared with the ductal-like lobular BC-L-014 assigned to basal subtype with an r of 0.7). In contrast, only one of the 10 ductal-like ILCs was present in the normal-like subtype group (ULL-L-168, with an r of 0.3). Five of 10 ductal-like ILCs showed high correlation with the corresponding set of centroids for their subtypes (r > 0.3). Notably, the basal subtype had the highest correlation with the centroids compared with other subtypes, suggesting a highly consistent gene expression pattern associated with basal subtype tumors.
When variation in expression of 481 intrinsic genes was used to order the 59 samples in a hierarchical clustering, two features of the dendrogram were evident (Figure 3B). First, samples tended to cluster based on their correlation to the centroids of the subtypes. For example, seven of 10 basal subtype tumors clustered together, consistent with the high r among basal subtype IDCs observed above. Second, six of the 11 typical ILCs clustered together on the normal-like subtype branch, whereas only one of the 10 ductal-like ILCs clustered with this group, confirming that this group of ILCs has characteristic gene expression patterns different from IDCs and ductal-like ILCs. When we ordered the 38 IDCs only using the intrinsic genes, the dendrogram showed an even clearer separation of the five subtypes (see Web supplement). This is not surprising because the centroids were essentially derived from IDCs and thus have a high power of classification for IDCs.
The expression patterns of the intrinsic genes characterizing the five subtypes are largely in agreement with previous reports. For example, the basal epithelial cell markers, including keratins 5 and 17, were relatively highly expressed in the basal subtype (Figure 3, I and J), whereas ER and most of the other ER coexpressing genes failed to express in this subtype (Figure 3H). Genes representing tumor markers such as ERBB2 and MUC1 also showed relatively low expression in the basal subtype (Figure 3, D and F). Interestingly, a cluster of genes with diverse functions is highly expressed in basal and ERBB2 subtypes (Figure 3K) and seems inversely related to ER expression. Another cluster of genes shows relatively low expression in basal and luminal B subtypes (Figure 3G), with relative overexpression in luminal A and normal-like subtypes.
To identify a minimum set of genes that best discriminate typical ILCs from ductal-like ILCs, PAM was performed on 23,914 clones representing 15,281 genes whose expression was measurable in at least 80% of the 21 ILCs. Seventy-six clones representing 44 genes with known functions were selected at an overall error rate of 9% (Figure 4, A and B). These genes function in a number of biological processes according to Gene Ontology annotations (for details, see Web supplement). Many of these genes are involved in regulation of cell growth (CDKN1C, G0S2, PDGFA, KIT, and F2 relatively overexpressed and MAP3K8 relatively underexpressed in the typical ILCs) and immune response (AOC3, IGJ, F2, F3, and IGLL1 relatively overexpressed and DEFB1, HLA-C relatively underexpressed in the typical ILCs). When the 76 clones were used in hierarchical clustering of the 21 ILCs, typical ILCs and ductal-like ILCs were separated into two groups with 100% accuracy (Figure 4C). The two genes identified in the PAM analysis of typical ILCs compared with IDCs (see Web supplement) but not identified on the original SAM list of clones distinguishing ILCs and IDCs, PDE2A (phosphodiesterase 2A) and EBF (early B-cell factor) are also relatively overexpressed in typical ILCs (Figure 4C). Together, these results strongly suggest the existence of two groups of ILCs differing in gene expression profiles.
We have systematically surveyed gene expression of 38 IDCs and 21 ILCs on a genome-wide scale by using RNA amplification and cDNA microarray techniques. Our data strongly suggest that a subgroup of ILCs, which we call typical ILCs, differs from IDCs not only in histological structure and clinical features but also in global transcription programs. Three different statistical methods used to analyze the expression patterns all provided evidence supporting this conclusion. First, hierarchical clustering analyses showed that ILCs separate into two groups: typical ILCs that tend to cluster together and ductal-like ILCs that cluster with different subgroups of IDCs. Second, PAM analysis showed that ILCs could be separated from IDCs at a fairly high success rate on the basis of expression variations of only 78 transcripts and that the typical ILCs were more closely related to one another than the ductal-like ILCs were when clustering was performed using these selected genes. Third, Pearson's correlation analysis revealed that the expression pattern of the intrinsic genes in typical ILCs correlates poorly with previously characterized expression patterns of all but one IDC subtype, whereas the correlation of the IDCs in this study with previous IDC subtypes is much higher. The differences between ILCs and IDCs we observed are not explained by different cellular composition of the samples, because the overall percentage of malignant epithelial component in ILCs was comparable with that in the IDCs when assessed by the same pathologists.
It is believed that all breast carcinomas, including both IDC and ILC, start in the terminal ductal lobular unit (TDLU) (Wellings et al., 1975; Wellings, 1980; Russo et al., 1990, 2001; Russo and Russo, 1994). The malignant epithelial cells in IDC or ILC may reflect differences in the cell of origin within the TDLU (progenitor cell differences) or differences in the point at which the cancer started during the TDLU lobular maturation process (type 1 lobule for IDC vs. type 2 lobule for ILC). This might explain why we see some lobular carcinomas as a distinct subtype and others with gene expression more similar to that of ductal carcinoma: there may be a continuum in the occurrence of epithelial carcinomas within the TDLU or from cells derived during the continuum of the TDLU maturation process.
SAM analysis suggests that genes differentially expressed between IDCs and both groups of ILCs are involved in cell adhesion, lipid/fatty acid metabolism, immune/defense/stress responses, electron transport, and nucleosome assembly. How the differences in gene expression between ILCs and IDCs translate to differences in clinical and microscopic properties of the tumors is not clear. However, several hypotheses can be offered based on information from previous studies. First, the differential expression of cell adhesion molecules may account for some of the differences observed in invasion patterns of ILCs and IDCs. The classical invasion pattern of ILCs is characterized by single files or cords of small cohesive cells that diffusely infiltrate the stromal tissues (Harris et al., 2000). In contrast, IDCs are characterized by tubule formation or solid sheets of tumor cells. Different morphological patterns of invasion may be associated with different adhesive properties between the malignant epithelial cells themselves and with surrounding tissues. It is notable that nine of the 11 typical ILCs (82%) showed classical lobular morphology, which was seen in only four of the 10 ductal-like ILCs (40%). The other two typical ILCs (ULL-L-111 and ULL-L-190) showed classic lobular morphology mixed with a trabecular or trabecular/alveolar growth pattern. Importantly, the ductal-like lobular sample that clustered with basal-like IDCs (BC-L-014) grew as a solid variant with <5% ductal edges. In addition to the multiple cell adhesion molecules, two genes involved in cell motility (ANXA1 and ENPP2) are differentially expressed between IDCs and ILCs and may influence differences in the migration ability of tumor cells during invasion.
A recent study (Gupta et al., 2003) used E-cadherin immunostaining to analyze IDCs with and without lymphovascular tumor emboli. Although this cell adhesion molecule is characteristically lost in ILCs and may even be lost in some high-grade IDCs, the study suggested that the diffuse, strong E-cadherin expression observed in IDCs may play a role in tumor growth as intravascular nests or emboli within lymphatics when lymphovascular invasion exists. In E-cadherin-negative tumors that metastasize, individual cells may be able to migrate and travel in the vasculature and lymphatics differently than tumor emboli, which are composed of clusters of cells, potentially explaining the different patterns of distant metastatic spread in ILCs and IDCs.
The differential expression of genes involved in lipid/fatty acid metabolism is complex but may be partially responsible for different proliferation rates of tumor cells in ILCs and IDCs. Breast epithelial cell proliferation and differentiation are controlled by multiple factors, including growth factors, hormones, and fatty acids. Growth factors may cause phospholipid hydrolysis with release of fatty acids and lipoxygenase products that stimulate cell growth. Dietary intake can affect fatty acid metabolism in tissues. Specific polyunsaturated fatty acids found in vegetable oils, such as linoleic acid, may promote tumor growth, whereas other polyunsaturated fatty acids found in fish oil or monounsaturated fats, such as oleic acid in olive oil, are either neutral or inhibitory (Rose et al., 1997; Natarajan and Nadler, 1998; Stoll, 1998; Bartsch et al., 1999). In our series, the adipose-enriched cluster seems to have an inverse relationship with the proliferation/cell cycle cluster. Profiling whole tissue, nonmicrodissected specimens detects gene expression averaged across all cells in the tumor sample. We have begun studies to define the cell types that contribute to the observed differential expression, but we have not yet identified which cellular components (epithelial or adipose) are expressing the adipose-enriched gene cluster. Differentiated mammary cells are designed to make lipid during lactation, and malignant epithelial cells may potentially represent a source of adipose-enriched genes in more highly differentiated tumors or ILCs. Therapy that induces cell differentiation, such as ligands to retinoid X receptors, has been shown to increase expression of adipocyte-related genes that inhibit cellular proliferation and cause tumor regression. The source of the expressed genes seems to be both malignant epithelial cells and preadipocytes (fibroblasts that differentiate into adipocytes) (Agarwal et al., 2000).
Preadipocytes, but not mature adipocytes, have also been shown to secrete molecules that inhibit DNA synthesis in murine mammary carcinoma cells (Rahimi et al., 1998), suggesting both paracrine and autocrine effects. Conversely, some ductal-like ILCs and IDCs may elicit a different response caused by inflammatory cells in the extracellular matrix, as evidenced by a higher expression of immunoglobulins, chemokines, and collagen. The differential expression of immune and defense response genes may explain the favorable prognosis of ILCs observed in some studies (Toikkanen et al., 1997). Together, the gene expression profiles of ILCs and IDCs suggest a different interaction between stromal and epithelial cells in these two types of tumors, possibly due to differences in cross talk between stromal and malignant epithelial cells.
Genetic alterations have been proposed to be the basis for tumor initiation and progression. This raises the question of whether the differences in gene expression between ILCs and IDCs are due to differences in the genetic makeup of these two types of tumors. Pandis et al. (1996) reported significant differences in karyotypic patterns between ILCs and IDCs in 125 breast carcinomas. In contrast to IDCs, ILCs were characterized by few, often balanced chromosomal aberrations, yielding a near-diploid karyotype. Although no tumor type-specific patterns of aberrations were identified, Flagiello et al. (1998) reported highly recurrent der(1;16)(q10;p10) and other 16q (location of the E-cadherin gene) alterations in ILCs. Recently, Gunther et al. (2001) demonstrated that ILCs have significant losses of 16q, 22, and 17p and q, and gains in 1q and 8q. These chromosomal changes may contribute to type-specific properties of ILCs. Pollack et al. (2002) compared DNA copy number changes and gene expression in parallel on a genome-wide scale by using the same DNA microarrays in 44 primary breast carcinomas. They found that variations in DNA copy number contribute to gene expression changes to a remarkable degree and estimated that at least 12% of mRNA variation in breast cancer can be directly attributed to copy number variation. We are currently performing array-based CGH on our 59 samples for comparison with the gene expression profiles. The aim is to understand whether DNA copy number alterations differ between typical ILCs, ductal-like ILCs, and the IDCs that cluster with the ductal-like ILCs, and to determine whether specific chromosomal abnormalities may play a direct role in type-specific development of these different tumor types.
In conclusion, gene expression profiling has revealed distinct patterns of gene expression among ILCs and IDCs. Differences in a number of biological processes such as cell adhesion and lipid/fatty acid metabolism may contribute to the type-specific properties of IDCs and ILCs. Our data strongly suggest that over half of ILCs (called the typical ILCs) differ from IDCs not only in histological and clinical features but also in global transcription programs. The remaining ILCs (called the ductal-like ILCs) closely resemble IDCs in their transcription patterns. The finding of two subsets of ILCs has important clinical implications for targeted therapies. Further studies are required to explore whether the ductal-like ILCs should be treated similarly to IDCs of their particular molecular phenotype (basal-like, luminal A or B, and ERBB2 expressing), and whether different, type-specific treatment may be indicated for the typical ILCs. A larger cohort of samples is being analyzed to confirm the existence of different molecular subtypes of ILCs. In addition, further studies on pure populations of epithelial and stromal cells from these two tumor subtypes using microdissection techniques may help us better understand the mechanisms underlying the development and epithelial–stromal interactions of the different tumor phenotypes. Correlative studies using clinical follow-up data are in progress.
The Stanford Microarray Database group and Stanford Functional Genomics Facilities are acknowledged for supporting the experiments and data analysis. We thank Therese Sørlie for insights in the data analysis, Michelle Ferrari and Maureen Chang for contributions in the management of the breast cancer database, and Susan Overholser for creating the Web supplement and assistance in configuring the database and preparation of this manuscript. This study was supported by Public Health Service grant U01CA85129 from the National Cancer Institute, National Institutes of Health, Department of Health and Human Services, the Norwegian Cancer Society grant 99061, and the Research Council of Norway, grant number 137012/310. Anita Langerød is a fellow of The Norwegian Cancer Society.
Article published online ahead of print. Mol. Biol. Cell 10.1091/mbc.E03-11-0786. Article and publication date are available at www.molbiolcell.org/cgi/doi/10.1091/mbc.E03-11-0786.