In the July 10, 2008 issue of Journal of Clinical Oncology,1 we reported a large-scale genomic analysis illustrating that mRNA expression levels of key breast cancer-associated genes, ER-α, ER-β, epidermal growth factor receptor (EGFR), and human epidermal growth factor receptor 2 (HER2), varied in an age-related manner. Moreover, when stratified by age, breast tumors arising in younger women (≤ 45 years) were enriched with > 350 biologically relevant gene sets compared with those arising in older women (≥ 65 years).2 Breast cancer is no longer viewed as a single disease, but rather a compilation of several distinct subtypes defined via gene expression analysis.2 Microarray techniques have divided breast cancer into intrinsic subtypes: luminal A, luminal B, HER2-enriched, and basal-like, each with unique prognostic and therapeutic implications.3,4 On the basis of findings from Carey et al,5 we hypothesized that (1) breast tumors arising in younger women may be more enriched for aggressive subtypes and (2) age-specific biologic differences observed in breast carcinomas may be highly subtype dependent. To evaluate the relationship between age and breast cancer subtype, and to account for potential confounding variables not previously included, we have performed new analyses on data from Anders et al1 and include a similar analysis performed on a second independent microarray-based breast tumor data set.
To explore our hypotheses, we chose to reanalyze our previous data set; however, we limited our current analyses to a combination of two of the four large data sets used previously, Gene Expression Omnibus series (GSE) accessions GSE4922 and GSE7849,6,7 termed data set A. Please note that two of the four previous data sets (GSE2034 and GSE3143)8,9 were excluded because they lacked complete clinical data (ie, histologic grade). Our first goal was to define the distribution of intrinsic subtypes assigned by the PAM50 classifier10 and to determine whether subtype correlated with age group. We hypothesized that more aggressive subtypes (ie, basal-like) would be over-represented among breast carcinomas arising in younger women (≤ 45 years), whereas older women (≥ 65 years) would more commonly be diagnosed with luminal tumors.5 As expected, there was a significant association between subtype and age (P = 3.8e-06; Table 1). Specifically, a higher proportion of younger women were diagnosed with basal-like (odds ratio [OR], 12.27; 95% CI, 3.96 to 45.0) and HER2-enriched (OR, 4.63; 95% CI, 1.50 to 16.48) breast tumors.
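As a reading aid for the odds ratios above, the following minimal Python sketch shows how an OR and an approximate 95% CI can be obtained from a 2 × 2 table of age group by subtype. It is an illustration only: the function name, the Wald (Woolf) interval, and the cell counts are assumptions and are not taken from Table 1. The asymmetric intervals reported here suggest a different (possibly exact) method was used, so the sketch is not expected to reproduce them.

```python
import math


def odds_ratio_wald(a, b, c, d, z=1.96):
    """Odds ratio with a Wald (Woolf) confidence interval for a 2x2 table.

    a: younger women with the subtype of interest
    b: younger women with the reference subtype
    c: older women with the subtype of interest
    d: older women with the reference subtype
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Wald interval")
    estimate = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


# Hypothetical counts for illustration only (NOT the counts behind Table 1)
estimate, lower, upper = odds_ratio_wald(20, 30, 5, 60)
print(f"OR = {estimate:.2f}; 95% CI, {lower:.2f} to {upper:.2f}")
```

An interval whose lower bound exceeds 1 indicates that the subtype is over-represented among younger women relative to the reference subtype at the chosen confidence level.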
Recognizing that age-specific differences in subtype distribution were present, we next examined other potential confounding variables and noted that grade and sample source were associated with age (Table 1). We hypothesized that accounting for all significant clinicopathologic features, namely intrinsic subtype and grade, could affect the number of age-associated genes; we therefore used a statistical model that can account for confounding variables. To test our hypothesis, we built a linear regression model for each gene's expression value using age alone or in combination with significant clinical variables (ols function in R package Design). Before analysis, the log2 intensities of the gene expression from data set A, Affymetrix one-channel data, were row (gene) median centered and column (sample) standardized. The two data sets (GSE4922 and GSE7849)6,7 were combined using distance weighted discrimination to detect and remove batch bias.11 The first linear model of gene expression contained age alone (unadjusted model). A second linear model contained additive terms for age, grade, subtype, and sample source (adjusted model). Higher order interactions were not considered. We then transformed the P values for the age term to q values (the false discovery rate at which the gene is significant) using the method of Benjamini and Hochberg.12 A false discovery rate of 5% was used to identify significant genes. Within data set A, the unadjusted model of breast tumor gene expression differences by age alone (≤ 45 v ≥ 65 years) yielded 693 differentially expressed genes (q < 0.05; Table 2). Correction for the significant clinicopathologic features (grade, subtype, sample source; Table 1) with the adjusted model yielded zero gene differences (q < 0.05) between breast tumors of the previously defined age groups.
Because the number of age-associated genes diminished to zero after adjustment, we did not believe that gene set enrichment analysis, as previously reported, would have added further to this analysis.
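The conversion of per-gene P values for the age term into q values can be sketched as follows. This is a minimal pure-Python version of the Benjamini and Hochberg step-up adjustment, written for illustration; the function name and the example P values are hypothetical and are not the values from this analysis, which was carried out in R.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values (q values) in input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value to the smallest so that each q value
    # is the minimum of p * m / rank over all ranks at or above its own
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


# Hypothetical per-gene P values for the age term (illustration only)
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
q = benjamini_hochberg(p)

# Genes called significant at a false discovery rate of 5%
significant = [i for i, value in enumerate(q) if value < 0.05]
print(significant)
```

In this toy example several genes have unadjusted P values below .05, but only the two smallest remain below the 5% false discovery rate threshold after adjustment.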
As is standard for the field, we elected to evaluate our findings in a second independent data set. To conduct our secondary analysis, we pooled 344 clinically annotated breast tumors assayed on the full genome Agilent microarrays from four publications10,13–15 and 12 new arrays (GSE20624, obtained with institutional review board approval), by selecting the subset of samples from these publications that had the same complete clinical data as data set A; this was termed data set B. Similar to data set A, there was an association between age (≤ 45 v ≥ 65 years) and intrinsic subtype in data set B (P = 1.6e-04; Table 3). Younger women were more commonly diagnosed with basal-like breast tumors (OR, 5.1; 95% CI, 2.43 to 11.11) and HER2-enriched breast tumors (OR, 3.16; 95% CI, 1.18 to 8.73). We used the same statistical approach described above to evaluate age-specific gene expression differences. Specific to data set B, the gene expression data are Agilent two-channel data. The log2(R/G) of the gene expression was LOWESS normalized, row (gene) median centered, and column (sample) standardized. Comparison of breast tumor gene expression differences by age alone (≤ 45 v ≥ 65 years) yielded 2,154 genes differentially expressed between age-defined classes (q < 0.05; Table 2). Correction for additive effects of significant clinicopathologic features (including intrinsic subtype, estrogen receptor status, grade, and nodal status; Table 3) yielded only one gene difference between breast tumors of age-defined groups (SLC25A20, solute carrier family 25, member 20). An identical analysis evaluating age-specific gene expression differences by age < 45 v ≥ 45 years yielded similar findings; within data set B, 778 genes differentially expressed by age group diminished to zero gene differences when correcting for subtype and other significant clinicopathologic features (ER status and histologic grade) that differed between age-defined groups.
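The two-channel preprocessing described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the function names and the toy intensities are hypothetical, the LOWESS normalization step is deliberately omitted, and the sample standard deviation is assumed for column standardization because the choice of estimator is not specified here.

```python
import math
import statistics


def log2_ratios(red, green):
    """Per-probe log2(R/G) for matrices laid out as genes (rows) x samples (columns)."""
    return [[math.log2(r / g) for r, g in zip(red_row, green_row)]
            for red_row, green_row in zip(red, green)]


def median_center_rows(matrix):
    """Subtract each gene's median across samples from that gene's values."""
    centered = []
    for row in matrix:
        row_median = statistics.median(row)
        centered.append([value - row_median for value in row])
    return centered


def standardize_columns(matrix):
    """Scale each sample (column) to mean 0 and sample standard deviation 1."""
    columns = list(zip(*matrix))
    means = [statistics.mean(col) for col in columns]
    sds = [statistics.stdev(col) for col in columns]
    return [[(value - means[j]) / sds[j] for j, value in enumerate(row)]
            for row in matrix]


# Hypothetical raw intensities: 3 genes x 3 samples (illustration only)
red = [[200, 400, 800], [100, 100, 400], [50, 200, 100]]
green = [[100, 100, 100], [100, 200, 100], [100, 100, 100]]

ratios = log2_ratios(red, green)
centered = median_center_rows(ratios)
processed = standardize_columns(centered)
```

Row centering removes gene-specific baseline differences, and column standardization places all samples on a common scale before the per-gene linear models are fit.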
Results of this analysis continue to refine our understanding of the biology of breast cancer arising in younger women and support our hypothesis that younger women's breast tumors are enriched for more aggressive intrinsic subtypes, namely, basal-like. This finding is complementary to our prior report illustrating that breast tumors arising in younger women are characterized by lower mRNA expression of ER-α, ER-β, and PR, but higher expression of EGFR1, a known marker of the basal-like subtype.16 Although we recognize that our current analysis is not population-based, our results are entirely consistent with those of the population-based Carolina Breast Cancer Study, which reported that basal-like breast tumors occurred at a higher prevalence among premenopausal African American patients.5 Most important, our current analysis strongly suggests that biologic differences present between breast carcinomas arising at the extremes of age are heavily influenced by genes associated with intrinsic breast cancer subtype and grade, both of which are highly correlated with age. We recognize that our analysis is not designed to address age-related differences in tumor-host interfaces (which most certainly vary by age) and may not be entirely reflective of tumors arising in very young women (< 35 years), both areas deserving of future research in (ideally) prospectively collected, clinically annotated data sets. Our results, however, suggest there are few age-specific differences in breast tumor biology. Age alone does not appear to provide an additional layer of biologic complexity above that of breast cancer subtype and grade; therefore, when considering treatment programs, decisions should be driven by subtype biology and performance status, and much less influenced by age.