Analyses of the expression patterns of thousands of genes using DNA microarrays have demonstrated great diversity among tumors arising in the same organ and with apparently similar histopathology. This has raised hopes that classification schemes based on molecular profiling may better capture the complex behavior of tumors and lead to improved prognostication and tailor-made therapeutic strategies. We were the first to show that specific subclasses of breast cancer, defined by gene expression profiling, are distinct biological entities associated with significant differences in outcome for patients with locally advanced breast cancer [4]. Subsequently, this has been validated both by us and by other groups in different types of breast cancer patient cohorts [3]. Here, we confirm that the molecular subtypes of breast tumors also exist in early breast cancer (T1/T2), using three different microarray platforms. Owing to the small sample size, only the luminal A and basal-like groups could be robustly identified, although the other, less represented subtypes could also be recognized. These two subtypes are easily distinguishable in several tumor data sets and their expression profiles appear to be anti-correlated, as has also been shown for breast cancer cell lines [24], but the cellular pathways affected are not known in detail. We show here that the differences in gene expression patterns between the two main subtypes reflect the levels of activation of distinct signaling pathways. These changes may have been programmed at a relatively early stage in the progression of the cancer, which would imply that the fate of the tumor is set early. This is in accordance with previous reports on breast cancer [3]. Other groups have analyzed gene expression in ductal carcinoma in situ (DCIS) for comparison with invasive carcinomas and highlighted transcripts that may be important for transformation and invasion [13]. Extensive studies of DCIS and other pre-invasive stages of tumors will shed further light on this hypothesis and substantiate the value of gene expression-based classification for the prognosis of breast cancer at an early stage.
Specifically, in this study a more in-depth molecular characterization of these breast cancer phenotypes was carried out, providing new insights into the biology of the disease at the molecular level. The distinct and characteristic molecular mechanisms revealed by the protein classification and biological pathway analysis provide further evidence that these molecular subtypes represent biologically distinct disease entities that may require different therapeutic strategies. For example, our results indicate that the luminal A subtype shows coordinated activation of genes involved in steroid/estrogen signaling and fatty acid metabolism. Fatty acid synthase (FAS)-dependent endogenous fatty acid synthetic activity has been found to be abnormally elevated in a subset of aggressive breast carcinomas [30], in particular ERBB2-overexpressing tumors [31], whereas here, high expression of many genes involved in fatty acid/lipid metabolism and degradation was coupled to the luminal A phenotype, known to be associated with a relatively good prognosis [10]. Although no correlation between fatty acid metabolism and the estrogen and progesterone receptor expression status of tumors has been documented in cancer, our results may indicate some level of cross-talk between fatty acid metabolism and steroid signaling that could affect apoptosis, cell proliferation and possibly the response to hormonal treatment in this subtype of breast cancer. Indeed, it has been speculated that some lipids may modulate steroid metabolism [32].
Such molecular profiling of clinically relevant subtypes of breast tumors provides opportunities to identify novel targets that can be exploited for targeted therapy of the disease. Among the 1210 genes most differentially expressed between luminal A and basal-like tumors, 145 are predicted to encode secreted proteins, based on the prediction methodology published in a recent paper [33]. Secreted proteins include a variety of biomolecules, such as cytokines, chemokines, hormones and digestive enzymes, that play pivotal regulatory roles and are very important sources of protein therapeutics. We also identified five G protein-coupled receptors (GPCRs) among these signature genes; this gene family is well established as a source of small-molecule drug targets.
Although we analyzed only 20 tumor biopsies, data were collected using three different microarray platforms: a two-color fluorescence-based cDNA microarray, a 60-mer oligo microarray using two-color fluorescence detection and a 60-mer oligo microarray using chemiluminescence detection. Of the 16,611 genes common to these three platforms, 1019, 1054 and 1164 genes, respectively, were identified as differentially expressed between luminal A and basal-like tumors. Of these, 319 genes were common to all three technologies, which corresponds to an overall consistency of approximately 30%. The overlap might prove even higher if probes were matched by sequence rather than by gene identifier, as has recently been shown [34]. A few recently published studies have compared variability and consistency between microarray platforms, with differing results [35]. Our study shows that although there is variability between the platforms, the gene expression profiles obtained with all three technologies are highly correlated with the biological variation in the data, and the same tumor subtype pattern was identified with all three methods.
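The consistency figure can be checked with simple arithmetic. The sketch below is only one plausible reading of how "overall consistency" was derived: it assumes the 319 shared genes are expressed as a fraction of each platform's own list of differentially expressed genes, which the text does not state explicitly. The platform labels and the helper name `percent_shared` are illustrative.

```python
# Counts are taken from the text; the mapping to platforms follows the order
# in which the platforms are listed ("respectively").
differential = {
    "cDNA two-color": 1019,
    "oligo two-color": 1054,
    "oligo chemiluminescence": 1164,
}
shared = 319  # genes differentially expressed on all three platforms


def percent_shared(shared_count, list_size):
    """Return the shared genes as a percentage of one platform's gene list."""
    return 100.0 * shared_count / list_size


# Assumption: consistency is measured against each platform's own list.
per_platform = {
    name: percent_shared(shared, size) for name, size in differential.items()
}
mean_percent = sum(per_platform.values()) / len(per_platform)

for name, pct in per_platform.items():
    print(f"{name}: {pct:.1f}% of its differential genes are shared")
print(f"mean across platforms: {mean_percent:.1f}%")
```

Under this assumption the three platform-specific fractions fall between roughly 27% and 31%, in line with the figure of about 30% quoted above.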
The minimal set of 54 genes that best characterized the luminal A and basal-like subtypes was identified from the genes differentially expressed on all three platforms and validated using TaqMan® assays. Convincingly, clustering of expression data from all four methods grouped the experiments by tumor sample of origin and not by platform. Hence, these genes provide a robust set of potential prognostic molecular markers, although the set covers only the two main subtypes. More thorough characterization of significantly larger sample sets is needed to provide prognostic predictor sets for all subtypes.
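The grouping by tumor rather than by platform can be illustrated with a toy check. This is a minimal sketch, not the clustering procedure used in the study: it assumes a Pearson correlation similarity (the text does not name the metric), replaces hierarchical clustering with a simple nearest-neighbor test, and uses invented expression values and hypothetical platform labels `P1` to `P4` with platform effects modeled as a scale and an offset.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Invented toy profiles over six marker genes (not data from the study).
base = {
    "tumor_1": [2.0, 1.5, -1.0, -2.0, 0.5, -0.5],
    "tumor_2": [-1.8, -1.2, 1.1, 2.2, -0.4, 0.6],
    "tumor_3": [0.5, -2.0, 1.5, -0.5, 2.0, -1.0],
}
# Hypothetical platform effects as (scale, offset) applied to every gene.
platforms = {"P1": (1.0, 0.0), "P2": (0.6, 1.0), "P3": (1.4, -0.5), "P4": (0.8, 2.0)}

profiles = {
    (tumor, platform): [scale * value + offset for value in values]
    for tumor, values in base.items()
    for platform, (scale, offset) in platforms.items()
}


def nearest(key):
    """Return the key of the most correlated other profile."""
    others = [k for k in profiles if k != key]
    return max(others, key=lambda k: pearson(profiles[key], profiles[k]))


# True when every profile's closest match is the same tumor on another platform.
grouped_by_tumor = all(nearest(key)[0] == key[0] for key in profiles)
print("profiles group by tumor of origin:", grouped_by_tumor)
```

Because Pearson correlation is unchanged by a positive rescaling or a constant shift of a profile, platform effects of this simple kind do not pull samples away from their tumor of origin; real platform differences are noisier than this toy model.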