We aimed to better understand the molecular changes that are present in breast epithelium before clinical or pathological evidence of breast cancer. Therefore, we examined gene expression in 73 snap-frozen microdissected NlEpi samples. We used microarray (and qPCR) in 42 samples (18 RM, 18 HN, and 6 PM), and qPCR alone in 31 independent samples (8 RM, 17 HN, and 6 PM). The microarray analysis identified an age-independent profile consisting of 98 probe sets (corresponding to 86 genes) differentially expressed in the NlEpi of breast cancer cases (HN), compared with controls (RM). Using these 98 probe sets, PM samples clustered with HN and away from RM samples. Prospective validation by qPCR was high (84%) and uniform in the independent HN–RM comparison. In the independent PM–RM comparison, prospective validation was lower on average (58%) and more heterogeneous among samples, as might be expected when dealing with cases at variable risk. The 98 probe sets included many transcription factors and were implicated in cancer-related pathways, in particular MAPK. These results suggest that the HN expression profile is not an effect of the tumour. Instead, it may be a marker of increased risk of breast cancer development (e.g. field cancerisation), or may reveal some of breast cancer's earliest genomic alterations. Further study of early alterations may identify new preventive agents and risk-assessment tools.
Our findings raise several points for consideration. First, this study differs in fundamental ways from our initial report (
Tripathi et al, 2008). The sample size is considerably larger. As age influences gene expression (
Yau et al, 2007;
Anders et al, 2008;
Euhus et al, 2008), we tightly age-matched RM and HN subjects, permitting us to identify an age-independent profile. HN samples were balanced for ER+ and ER− tumour status to make our results more generalisable. We used a novel statistical approach, well suited to a small sample size, to identify differentially expressed genes. We prospectively validated the microarray findings in independent samples. Finally, we examined a rarely studied type of sample – cancer-free breasts from patients undergoing PM because of high breast cancer risk. We are unaware of any earlier reports of gene expression in PMs. Existing papers identify histological and clinical characteristics of PM samples, leading some to advocate for more attention to the genetics of PM (
Scott et al, 2003;
Goldflam et al, 2004;
Yi et al, 2009). Thus, our samples and study design are unique.
Second, despite the important expansions and refinements to the initial study, the RM–HN differences we report here are consistent with those from the earlier study. Even though only 35 probe sets (approximately one in three) overlap, in both studies we identify genes belonging to multiple biological and molecular categories, with transcription factors and the p38MAPK pathway apparently overrepresented. This suggests that gene expression differences between the NlEpi of RM and HN samples are a generalisable finding, applicable to a heterogeneous set of breast cancer samples. Clustering analyses (
Supplementary Figures S2 and S3) suggest that the NlEpi of ER+ cases resembles control epithelium (RM) more closely than does the NlEpi of ER− cases.
Investigations of gene expression in HN breast epithelium are limited (
Finak et al, 2006;
Grigoriadis et al, 2006;
Tripathi et al, 2008;
Chen et al, 2009). Our results contrast with one study that did not find differences in gene expression between morphologically normal epithelium microdissected from RM and HN samples (
Finak et al, 2006). The discrepancy may be explained by differences in study design and analysis. Our results are consistent with those of another study (
Grigoriadis et al, 2006), although that study had no samples comparable with our HN samples, and with our own earlier findings (
Tripathi et al, 2008). Another study (
Chen et al, 2009) found a proliferative gene expression signature in morphologically benign tissue (not necessarily HN) of patients with invasive carcinoma, but no controls were reported.
Third, evaluation of the RM–HN gene list leads to several speculations. One speculation emerges from the observation that the RMs that co-cluster with HNs using the 98-probe-set list are significantly older than the RMs that do not cluster with HNs. As the 36 RM and HN samples were tightly age-matched, and all RM cases lacked a personal or strong family history of breast cancer, this age-related clustering of RMs is consistent with the hypothesis that ageing itself is associated with genomic changes resembling those of early cancer. Another speculation relates to the function of transcription factors. We found multiple transcription factors whose expression was decreased in HN samples (e.g.
ATF3,
MAFF, and
TXNIP). Transcription factors may be preferentially regulated by methylation (
Bloushtain-Qimron et al, 2008) and methylation (and other epigenetic events) is believed to contribute to the early stages of carcinogenesis (
Tommasi et al, 2009). Thus, early epigenetic events could lead to the decreased expression of transcription factors that we see in HN. These epigenetic changes may in some manner determine the subtype of tumour that may arise from a particular cell or TDLU. A final speculation relates to the potential function of the p38MAPK pathway. We found that MAPK pathway gene expression was decreased in HN compared with RM samples, but was less decreased in PM samples. Thus, MAPK pathway deregulation may be implicated early in breast cancer development, and may differentiate PM from HN epithelium. The functions of MAPK in cell cycle and transcriptional regulation and in the immune response may thus inhibit tumourigenesis (
Bradham and McClay, 2006;
Cuenda and Rousseau, 2007). Therefore, it is not surprising to see a higher expression of genes in this pathway in NlEpi from women without breast cancer (RM and perhaps PM samples). Alternatively, a decreased expression of MAPK in the epithelium may reflect signals arising from the microenvironment surrounding the epithelium. If so, this is consistent with the view that the microenvironment has a crucial function in suppressing malignant transformation or behaviour (
Spencer et al, 2007;
Hu and Polyak, 2008).
Finally, our analysis of PM gene expression shows that, in general, PM samples resemble HN rather than RM samples. As the PM breast does not contain cancer, the HN-like changes cannot be an effect of the tumour. Instead, they may be a marker of increased breast cancer risk – the concept of mammary field cancerisation is longstanding, and has recently been reviewed (
Heaphy et al, 2009). Alternatively, the HN-like changes could reflect breast cancer's earliest gene expression changes. The variable validation rate among PM samples would be expected in a heterogeneous group composed of high-risk women. Low validation of specific genes could reflect splice variants. If future studies confirm these findings, then evaluation of gene expression in NlEpi could improve risk assessment and affect clinical decision making with regard to PM, a controversial procedure (
Borgen et al, 1998;
Hartmann et al, 1999,
2004;
Eisinger, 2007;
Giuliano et al, 2007;
Briasoulis and Roukos, 2008). These findings could also help to identify new preventive agents, namely drugs or interventions that modify or reverse this transcriptional signature.
The primary limitation of our study is the sample size, which is relatively small because of the nature of our samples: fresh, microdissected primary epithelium from untreated women. We have made every effort to counterbalance this limitation by using a statistical analysis suitable for small sample sizes and by using novel and important samples that provide accurate in vivo data.