Amplification of the epithelial growth factor receptor gene ERBB2 (HER2, NEU) in breast cancer is associated with a poor clinical prognosis. In mammary gland development, this receptor plays a role in ductal and lobuloalveolar differentiation. We conducted a systematic investigation of the role of genetic variation of the ERBB2 gene in breast cancer risk in a study of 842 histologically-confirmed invasive breast cancer cases and 1108 controls from the Shanghai Breast Cancer Study. We observed that the ERBB2 gene resides within a locus of high linkage disequilibrium, comprised of three major ancestral haplotypes in the study population. These haplotypes are marked by simple tandem repeat and single nucleotide polymorphisms, including the missense variants I655V and P1170A. We observed a risk-modifying effect of a highly polymorphic simple tandem repeat within an evolutionarily conserved region, 4.4 kb upstream from the ERBB2 transcription start site. Under a dominant genetic model, the age-adjusted odds ratio was 1.74 (95% CI 1.27−2.37). Its association with breast cancer, and with breast cancer stratified by histology, by histological grade, and by stage, remained significant after correction for multiple comparisons. In contrast, we observed no association of ERBB2 SNP haplotypes with breast cancer predisposition.
The epithelial growth factor receptor gene ERBB2 is commonly over-expressed in invasive breast cancer, carrying a poor prognosis (1). The receptor serves as an effective therapeutic target of both monoclonal antibodies and small molecule tyrosine kinase inhibitors (2, 3). A breadth of literature describes somatic alterations of ERBB2 in breast cancer. Adenocarcinomas of the breast arise in the terminal duct/lobular unit and are histologically classified by growth pattern and grade. ERBB2 over-expression is observed predominantly among high-grade tumors. Relative to tumors of ductal histologic type (tumors of no special type), tumors of the most common special type, lobular, rarely over-express ERBB2 (4-9).
Systematic investigation of a potential role for inherited germline variants of ERBB2 in breast cancer risk has been conducted using tagging SNPs and haplotype-based analysis in two prior study populations. These include one by Benusiglio et al. of breast cancer subjects from the population-based Anglian Breast Cancer Study (10, 11), and one by Han et al. within a hospital-based Korean study population (12). The former evaluated Caucasian subjects at five regional SNPs that capture much of the SNP information content at the locus among HapMap CEU subjects (13). The latter evaluated Korean subjects at six SNPs that were selected as tagging SNPs specifically within the study population. Neither of these studies observed evidence to support a significant association of common ERBB2 haplotypes with breast cancer risk. The CGEMS genome wide association study of breast cancer evaluated a single SNP within ERBB2, also without significant association (14).
In this study, we sought to characterize common genetic variation at ERBB2 within a well-characterized Han Chinese study population, to assess LD patterns of both single nucleotide and simple tandem repeat (STR) variants, and to systematically evaluate the potential contribution of ERBB2 germline variation to breast cancer risk. Given the notable somatic role of ERBB2 in breast cancer, the hypothesis that ERBB2 germline variation might influence risk of invasive breast cancer is worthy of careful investigation. We also considered the possibility that ERBB2 variation might modify the specific histologic forms of breast cancer, given the established role of the gene in the differentiation of normal breast tissue (15-17). We tested this hypothesis within the Shanghai Breast Cancer Study using single allele and haplotype-based analysis.
The Shanghai Breast Cancer Study has been previously described (18, 19). Briefly, study subjects were female, recruited between August 1996 and March 1998. All subjects were permanent residents of urban Shanghai without a prior history of any cancer and were alive at the time of interview. The study included 1,459 incident breast cancer cases diagnosed at an age between 25 and 64 years in Shanghai during the study period (91% of eligible cases). Cancer diagnoses for all patients were reviewed and confirmed by two senior pathologists of the Shanghai Tumor Hospital, Fudan University. Unaffected controls were randomly selected from the general population using the Shanghai Resident Registry, a population registry containing demographic information for all residents of urban Shanghai. Inclusion criteria for controls were identical to those for cases, with the exception of a breast cancer diagnosis. Controls were frequency matched on age (5 year intervals) to the expected age distribution of the case subjects in a 1:1 ratio. The study included 1,556 control subjects (90.3% of matched eligible controls). Blood samples were collected from each of 1193 (82%) invasive breast cancer cases and 1310 (84%) controls for DNA extraction (Gentra Systems). For these subjects, the mean age was 47.5 for cases and 47.1 for controls. Sixty-eight percent of the breast cancers were premenopausal. All study participants provided written informed consent under an approved institutional review board protocol.
Hematoxylin and eosin stained slides of 842 of 1151 invasive breast cancer cases with genotype data were available for study. The slides were independently reviewed by Drs. Sanders, Boulos, Olivares and Page, board certified pathologists with expertise in breast cancer diagnosis, for detailed evaluation of breast cancer histological characteristics. Slides were randomly assigned to each pathologist. Invasive breast cancers were graded and sub-typed based on the Nottingham modification of the Scarff-Bloom-Richardson grading system (low, intermediate, or high-grade) and histologic features highlighting special types (Table 1) (20-22). Special type tumors are, by definition, greater than 90% pure in pattern and include classic lobular, tubular, cribriform, medullary, and mucinous cancers. In tumors defined as variants of special types, the defining histologic features comprise 70% to 90% of the lesion and include lobular, tubular, tubular-lobular, tubular-cribriform-lobular, and medullary variants (22, 23). Further, categories of no special type (ductal) with lobular, tubular, cribriform, medullary, mucinous, and micropapillary features are designated when the “special” pattern represents less than 70% of the total tumor. The category of no special type (ductal) with lobular features was designated if the tumor had at least 2 of the following features that are typically associated with lobular carcinoma: 1) signet ring cells or lobular cytology, 2) single-filing or targetoid growth pattern, or 3) a dispersed pattern of infiltration that represented less than 70% of the total tumor. Cases approaching the lower threshold and upper limits for these diagnoses were jointly reviewed and a consensus diagnosis reached at a multi-headed microscope. Tumors were staged using the TNM system approved by the American Joint Committee on Cancer (6th edition) (24).
To capture genetic diversity of ERBB2 for haplotype estimation, we screened SNPs within NCBI dbSNP for common polymorphism in the study population. Ninety-three annotated SNPs and three STRs across the 28.5 kb ERBB2 coding region, 17.8 kb of 5’ flanking sequence, and 23.6 kb of 3’ flanking sequence (Figure 1, and Supplementary Table 1) were screened in quadruplicate among 22 Han Chinese controls to test for polymorphism. This screening set provided greater than 90% power to detect polymorphism with a minor variant frequency ≥ 0.05. SNPs rs1801201 (I654V), rs4252633 (W452C), and rs2172826 (P927R) were each predicted to change amino acid residues and were screened further among 356 Han Chinese (half cases, half controls), without observed polymorphism. An additional SNP (R929Q) was discovered upon sequence confirmation of other polymorphisms but was heterozygous in only one sample.
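The power statement above can be approximated with a simple binomial argument: a variant of frequency q goes undetected only if none of the 2n sampled chromosomes carries it. The sketch below assumes independent chromosomes and error-free genotyping, which is an assumption made here and not the authors' stated method. The function name `detection_power` is hypothetical. Under these assumptions, 22 diploid subjects give a detection probability of roughly 0.9 at q = 0.05, and higher for more common variants.

```python
def detection_power(maf: float, n_subjects: int) -> float:
    """Probability that at least one of 2*n_subjects chromosomes carries
    the minor allele, assuming independent draws (illustrative model)."""
    if not 0.0 <= maf <= 1.0:
        raise ValueError("maf must lie between 0 and 1")
    if n_subjects < 0:
        raise ValueError("n_subjects must be non-negative")
    chromosomes = 2 * n_subjects
    # The variant is missed only if every chromosome carries the major allele.
    return 1.0 - (1.0 - maf) ** chromosomes


if __name__ == "__main__":
    # 22 screened subjects, as in the text; the frequencies are examples.
    for maf in (0.05, 0.10, 0.20):
        print(f"MAF {maf:.2f}: power = {detection_power(maf, 22):.3f}")
```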
We genotyped SNPs by single nucleotide primer extension and fluorescence polarization in 384-well format. Reaction processing entailed three steps: a 4.4 μl PCR reaction, addition of 4 μl of an exonuclease I (New England Biolabs, Beverly, MA) and calf intestinal alkaline phosphatase (Promega, Madison, WI) reagent mix to degrade unincorporated primer and dephosphorylate dNTPs, and a final addition of 4 μl of an Acyclopol and Acycloterminator reagent mix for the primer extension reaction (AcycloPrime™ FP SNP Detection System, Perkin-Elmer, Boston, MA) (25). Each PCR mixture included 0.1 unit AmpliTaq Gold DNA polymerase, 1x Buffer II (Applied Biosystems, Foster City, CA), 2.5 mM MgCl2, 0.25 mM dNTPs, 335 nM of each primer, and 2 ng DNA template. We detected incorporation of R110- and TAMRA-labeled terminators by fluorescence polarization on a Molecular Devices / LJL Analyst HT. Primers are provided in Supplementary Table 1. The completion rate for the sought genotypes was 96%. CEPH control individual 1347−02 was included on all genotyping assay plates, with a study-wide 98.7% genotype concordance rate.
STRs were genotyped by 5’-dye-labeled fluorescent amplimers, detected on an ABI PRISM® 3700 (Applied Biosystems, Foster City, CA). Primers (Supplementary Table 1) were designed using a tailing strategy to promote full non-templated nucleotide addition by AmpliTaq Gold DNA polymerase (Applied Biosystems, Foster City, CA), providing unambiguous detection of alleles separated by one base pair (26). PCR conditions were as described above. Allele fragment size estimation was accomplished using the internal size standard Genescan 400HD ROX and the local Southern algorithm of GENESCAN software. Editing of alleles was performed in GENOTYPER (Applied Biosystems, Foster City, CA).
Characterization of haplotypes and LD was conducted among a pilot subset of subjects of the Shanghai Breast Cancer Study that included 178 cases and 178 controls. Pairwise LD was estimated by D’ for SNPs (using Haploview (27)) and by both D’ and multi-allelic D’ for STRs (using MIDAS (28)). Tagging SNPs were selected based upon a pairwise r2 threshold of 0.8 and a minor allele frequency threshold of 0.05. When multiple SNPs were assigned as tagging SNPs for a particular bin, the SNP with most robust assay performance was selected for that bin. Tagging SNPs (rs2517951 and rs1136201 (formerly “rs1801200”)) and STRs were genotyped among all cases and controls. Each of the SNPs and STRs was in Hardy-Weinberg equilibrium by likelihood ratio test. Subject diplotypes were estimated by the Bayesian method implemented in PHASE version 2.1 (29-31). Diplotypes of highest probability ≥ 0.9 were employed for subsequent tests of association.
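The tagging step described above, grouping SNPs whose pairwise r2 is at least 0.8 and keeping one representative per bin, can be sketched as a greedy binning procedure. This is a minimal illustration, not the authors' implementation: the function `greedy_bins`, the SNP names, and the r2 values are all hypothetical, and the representative is chosen here by coverage, whereas the study chose the SNP with the most robust assay.

```python
from typing import Dict, List, Tuple


def greedy_bins(
    snps: List[str],
    r2: Dict[Tuple[str, str], float],
    threshold: float = 0.8,
) -> List[Tuple[str, List[str]]]:
    """Group SNPs into LD bins; returns (representative, members) pairs."""

    def pair_r2(a: str, b: str) -> float:
        if a == b:
            return 1.0
        # Accept either key order; unknown pairs are treated as unlinked.
        return r2.get((a, b), r2.get((b, a), 0.0))

    remaining = list(snps)
    bins: List[Tuple[str, List[str]]] = []
    while remaining:
        # Pick the SNP that covers the most remaining SNPs at the threshold.
        best = max(
            remaining,
            key=lambda s: sum(pair_r2(s, t) >= threshold for t in remaining),
        )
        members = [t for t in remaining if pair_r2(best, t) >= threshold]
        bins.append((best, members))
        remaining = [t for t in remaining if t not in members]
    return bins


if __name__ == "__main__":
    # Hypothetical r2 values: three tightly linked SNPs and one independent SNP.
    example_r2 = {
        ("snp1", "snp2"): 0.95,
        ("snp1", "snp3"): 0.85,
        ("snp2", "snp3"): 0.90,
        ("snp1", "snp4"): 0.10,
        ("snp2", "snp4"): 0.15,
        ("snp3", "snp4"): 0.19,
    }
    for tag, members in greedy_bins(["snp1", "snp2", "snp3", "snp4"], example_r2):
        print(tag, members)
```

In practice the representative of each bin could be swapped for any member that exceeds the threshold with all others, which is how an assay-performance criterion would enter.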
The χ2 test statistic was used to evaluate differences in allele and haplotype frequency of case and control groups. Alleles or haplotypes with an overall frequency <0.04 were grouped for analysis. Permutation testing was used to assess significance. Nominally significant risk or protective variants identified in contingency tests were subsequently modeled by logistic regression conditioned on age to estimate odds ratios (OR) and 95% confidence intervals (CI) (Intercooled Stata 10, Stata Corporation, College Station, TX). Mitotic rate differences were evaluated using Wilcoxon rank-sum tests. Unless specifically noted, P values are unadjusted for multiple comparisons. All P values are calculated with respect to two-sided alternative hypotheses.
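The contingency testing described above (pooling of rare alleles, a χ2 statistic on the case/control by allele table, and significance by permutation) can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' code: the function names are hypothetical, the data are invented, and labels are permuted per chromosome, whereas a full analysis might permute at the subject level.

```python
import random
from collections import Counter
from typing import List, Sequence


def pool_rare(alleles: Sequence[str], min_freq: float = 0.04,
              pooled_label: str = "rare") -> List[str]:
    """Replace alleles below min_freq overall frequency with one pooled label."""
    counts = Counter(alleles)
    n = len(alleles)
    return [a if counts[a] / n >= min_freq else pooled_label for a in alleles]


def chi_square(labels: Sequence[str], alleles: Sequence[str]) -> float:
    """Pearson chi-square statistic for a group-by-allele contingency table."""
    n = len(labels)
    row = Counter(labels)
    col = Counter(alleles)
    observed = Counter(zip(labels, alleles))
    stat = 0.0
    for group in row:
        for allele in col:
            expected = row[group] * col[allele] / n
            stat += (observed[(group, allele)] - expected) ** 2 / expected
    return stat


def permutation_p(labels: Sequence[str], alleles: Sequence[str],
                  n_perm: int = 2000, seed: int = 0) -> float:
    """Empirical P value: share of label shuffles with a statistic at least
    as large as the observed one (with the usual +1 correction)."""
    rng = random.Random(seed)
    observed = chi_square(labels, alleles)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if chi_square(shuffled, alleles) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Invented chromosome-level data, for illustration only.
    labels = ["case"] * 100 + ["control"] * 100
    alleles = ["119"] * 30 + ["121"] * 70 + ["119"] * 10 + ["121"] * 90
    alleles = pool_rare(alleles)
    print("chi-square:", chi_square(labels, alleles))
    print("permutation P:", permutation_p(labels, alleles, n_perm=1000))
```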
All cases included in the study were confirmed to have invasive breast cancer on histologic review. Seventy-nine percent of tumors were no special type (ductal) carcinomas (Table 1). Eleven percent of tumors were categorized as tubular or lobular of special or variant types, with good prognoses. Other special types represented 9% of tumors. Using the Nottingham combined grading system (20, 21), the lobular and tubular special type tumors consisted of: 65% low grade, 31% intermediate grade, and 3% high-grade. The anticipated converse trend was observed for no special type tumors: 11% low grade, 35% intermediate grade, and 54% high-grade tumors. Mitotic activity in all lobular, tubular, and their good prognosis variant carcinomas as a group averaged 2 mitoses/10 high power fields. In contrast, all no special type carcinomas averaged 12 mitoses/10 high power fields. The remaining special type carcinomas as a group averaged 4 mitoses/10 high power fields, predominantly as a result of the medullary and medullary variant carcinomas. Relative to the no special type (ductal) carcinomas, the mitotic rates of tubular and lobular tumors of special or variant types and of the remaining special type groups were significantly lower (P <0.00005 for each comparison).
To characterize ancestral versions of the ERBB2 gene within the study population, we sought to identify haplotype-defining polymorphisms and disequilibrium architecture. We found that 19 of 93 screened database SNPs were commonly polymorphic in the study population (Figure 1), including two altering evolutionarily conserved amino acids: rs1136201 (I655V, minor allele frequency 0.13), and rs1058808 (P1170A, minor allele frequency 0.40). We observed strong disequilibrium among SNPs spanning the locus, with a minimum observed D’ value of 0.89 among controls. Eighteen of the SNPs appeared to mark the same two ancestral DNA fragments in the study population (all pairwise r2 values ≥ 0.8). We selected rs2517951 as the tagging SNP for this LD bin due to its robust assay, effectively also distinguishing rs1058808 (P1170A) alleles among subjects. Only rs1136201 (I655V) provided independent information (maximum pairwise r2 value 0.19 with other SNPs) and was also selected as tagging SNP. Together, these tagging SNPs identify three common haplotypes among study subjects, marking versions of the receptor with residue combinations: I655-P1170, I655-A1170, and V655-A1170 (frequencies of 0.60, 0.27, and 0.13, respectively).
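The contrast noted above, where D’ is high while r2 is low for the same pair of SNPs, follows directly from how the two measures are defined. The sketch below computes D, D’, and r2 for two biallelic loci from one haplotype frequency and two allele frequencies. The function name `ld_stats` and the input frequencies are hypothetical and chosen only for illustration; they are not the study's estimates, which were obtained with Haploview.

```python
from typing import Tuple


def ld_stats(p_ab: float, p_a: float, p_b: float) -> Tuple[float, float, float]:
    """Return (D, D', r2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a:  frequency of allele A at the first locus
    p_b:  frequency of allele B at the second locus
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b
    # D' scales D by the largest magnitude it could take at these allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1.0 - p_b), (1.0 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1.0 - p_a) * (1.0 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))
    return d, d_prime, r2


if __name__ == "__main__":
    # Hypothetical case with only three of four possible haplotypes present:
    # AB = 0.2, aB = 0.3, ab = 0.5, Ab = 0.0, so p_a = 0.2 and p_b = 0.5.
    d, d_prime, r2 = ld_stats(p_ab=0.2, p_a=0.2, p_b=0.5)
    print(f"D = {d:.3f}, D' = {d_prime:.2f}, r2 = {r2:.2f}")
```

When one of the four possible haplotypes is absent, D’ reaches 1 regardless of allele frequency, while r2 stays low unless the two allele frequencies are similar. That is why a SNP can sit in strong disequilibrium with its neighbors and still need its own tag.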
We additionally identified three novel polymorphic STRs (denoted A, B, and C in Figure 1) within the 5’ alternative coding region/promoter. STR A and B alleles had high pairwise D’ values with tagging SNPs at ERBB2 (range 0.86 to 0.96, and 0.72 to 0.95, respectively, for alleles of > 0.04 frequency). Pairwise multiallelic D’ values between STRs and tagging SNPs are presented to the lower right in Figure 1. In contrast to STRs A and B, STR C was considerably more polymorphic (respective heterozygosities: 0.58, 0.68, and 0.93). A total of 28 alleles were observed for STR C in the study population, eleven with a frequency > 0.04. The major alleles of STR C had pairwise D’ values ranging from 0.04 to 1.0 with alleles of the tagging SNPs.
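Heterozygosity for a multi-allelic marker such as an STR is commonly summarized as one minus the sum of squared allele frequencies. The sketch below assumes this expected-heterozygosity estimator; the text does not state which estimator was used, so this is an assumption, and the function names are hypothetical. It also shows the upper bound 1 − 1/k for a marker with k alleles, which for 28 alleles is about 0.96.

```python
from collections import Counter
from typing import Hashable, Sequence


def expected_heterozygosity(alleles: Sequence[Hashable]) -> float:
    """1 - sum(p_i^2) over observed allele frequencies p_i."""
    if not alleles:
        raise ValueError("at least one allele observation is required")
    counts = Counter(alleles)
    n = len(alleles)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def max_heterozygosity(n_alleles: int) -> float:
    """Upper bound, reached when all alleles are equally frequent."""
    if n_alleles < 1:
        raise ValueError("n_alleles must be at least 1")
    return 1.0 - 1.0 / n_alleles


if __name__ == "__main__":
    # Invented allele sizes (bp) observed on eight chromosomes.
    sample = [113, 119, 119, 121, 125, 125, 125, 127]
    print(f"expected heterozygosity: {expected_heterozygosity(sample):.3f}")
    print(f"bound for 28 alleles:    {max_heterozygosity(28):.3f}")
```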
Select alleles of each of the STRs marked ancestral SNP haplotypes reasonably well. For example, allele 135 (bp) of STR A had a pairwise r2 value of 0.85 with tagging SNP rs2517951, and allele 136 had a pairwise r2 value of 0.74 with tagging SNP rs1136201. Allele 346 of STR B had a pairwise r2 value of 0.62 with tagging SNP rs2517951, and allele 354 had a pairwise r2 value of 0.46 with tagging SNP rs1136201. In contrast, all pairwise r2 values between alleles of STR C and tagging SNPs were less than 0.06, with the exception of allele 113 and rs1136201 (I655V) (r2 = 0.30). STR C is embedded within evolutionarily conserved sequence, raising the possibility of a direct role in gene function. We employed the STRs and tagging SNPs to capture genetic diversity for subsequent tests of association between ERBB2 and breast cancer.
We conducted tests of association with breast cancer among the histologically confirmed invasive breast cancer cases and controls by contingency tests, presented in Table 2. Among single allele tests of association, an allele of STR C was nominally significant, with an excess of the 119 allele among cases (P=0.002). The association was also evident when comparing the superset of cases including both those with and without available histological slides (n=1151) to controls (P=0.0016, data not shown). The frequency of V655 was 0.125 among histologically confirmed cases, and 0.127 among controls, yielding no evidence for association with invasive breast cancer. The frequency of V655 among all cases (additionally including those without histological slides) was also not significantly different than among controls (P=0.60, data not shown). We also evaluated possible evidence for the association of ERBB2 haplotypes with breast cancer risk; these included haplotypes marked by tagging SNPs, marked additionally by STRs A and B, and finally additionally including the highly polymorphic STR C. No common haplotype yielded significant evidence of association with breast cancer risk (Table 2).
We further estimated the effect size and strength of the potential association of STR C alleles 119 and 125 with breast cancer under a dominant logistic regression model adjusting for age (Table 3). Recessive and additive models were not evaluated because only one case was homozygous for allele 119, and only four were homozygous for allele 125. Allele 119 carried an OR=1.74 (95% CI 1.27−2.37, P = 0.001, Bonferroni corrected for the 47 tests of Tables 2 and 3 to P = 0.046), while allele 125 carried an OR=0.78 (95% CI 0.61−1.00, P=0.051). Upon stratification for the major histological subtypes, allele 119 carried an OR=1.81 (95% CI 1.30−2.51, P < 0.001, corrected to P < 0.046) for ductal breast cancer, while allele 125 carried an OR=0.34 (95% CI 0.15−0.74, P = 0.007, corrected to P = 0.28) for lobular and tubular breast cancer (Table 3). The odds ratio estimate of allele 119 with breast cancer was similar for both low- and high-grade cancer, but greater for advanced stage cancer (OR=2.09, 95% CI 1.40−3.12, P < 0.001, corrected to P < 0.046). Tests of STR C allele 119 that were stratified by ductal histology, by high combined histologic grade, and by advanced stage each remained significant upon correction for multiple comparisons.
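The dominant coding and the Bonferroni adjustment used above can be illustrated with a short sketch. It classifies each subject as a carrier or non-carrier of a given allele, computes an unadjusted odds ratio with a Woolf (log-based) 95% confidence interval, and multiplies a P value by the number of tests. This is a simplification under stated assumptions: the study used age-adjusted logistic regression in Stata, the function names are hypothetical, and the genotype counts are invented. Because the reported P values are rounded, multiplying a rounded value by 47 will not necessarily reproduce the published corrected figure to the last digit.

```python
import math
from typing import Sequence, Tuple

Genotype = Tuple[int, int]  # two allele sizes in bp


def dominant_table(cases: Sequence[Genotype], controls: Sequence[Genotype],
                   allele: int) -> Tuple[int, int, int, int]:
    """Counts of (carrier cases, non-carrier cases,
    carrier controls, non-carrier controls) under a dominant model."""
    a = sum(allele in g for g in cases)
    c = sum(allele in g for g in controls)
    return a, len(cases) - a, c, len(controls) - c


def odds_ratio_ci(a: int, b: int, c: int, d: int,
                  z: float = 1.96) -> Tuple[float, float, float]:
    """Unadjusted odds ratio with a Woolf confidence interval."""
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


def bonferroni(p_value: float, n_tests: int) -> float:
    """Bonferroni-corrected P value, capped at 1."""
    return min(1.0, p_value * n_tests)


if __name__ == "__main__":
    # Invented genotypes, for illustration only.
    cases = [(119, 121)] * 30 + [(121, 125)] * 70
    controls = [(119, 125)] * 20 + [(121, 121)] * 80
    table = dominant_table(cases, controls, allele=119)
    or_, lo, hi = odds_ratio_ci(*table)
    print(f"OR = {or_:.2f} (95% CI {lo:.2f} to {hi:.2f})")
    print(f"Bonferroni for 47 tests: {bonferroni(0.001, 47):.3f}")
```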
ERBB2 plays a key role in the differentiation of mammary tissue (15-17). The ERBB2 receptor heterodimerizes with EGFR, a receptor with a prominent role in ductal morphogenesis during adolescence. It also heterodimerizes with ERBB3 and ERBB4, receptors with a prominent role in lobuloalveolar morphogenesis. Somatic alteration of ERBB2 expression is well-established in breast cancer, with clinical utility for stratification of breast cancer patients. Over-expression of ERBB2 is observed in approximately a third of breast cancers and is a prognostic indicator of decreased overall and disease-free survival (1). Lobular carcinomas, though, rarely manifest ERBB2 amplification and over-expression (4, 5). Better overall and disease-free survival has been suggested to be associated with both lobular and tubular histology (22, 32). Given the established role for ERBB2 in breast cancer progression and in normal breast differentiation, our investigation tested the hypothesis that heritable variation of ERBB2 influences risk of breast cancer and of its major histological types.
As a preface to this goal, we identified common variation of ERBB2 to enable comprehensive tests of association with breast cancer. We employed a tagging SNP approach, specifically selecting tagging SNPs among a large set directly tested for polymorphism in the study population. We screened more than twice as many candidate SNPs for polymorphism at this locus as were included in build 36 / phase II of the HapMap. We also developed STR markers within the promoter region to capture additional genetic diversity within the study population. The most notable overall observation is the lack of evidence to support a significant association between ERBB2 SNPs and breast cancer initiation, despite the wealth of information supporting its role in breast cancer progression. These results are concordant with those of Han et al. (12), and Benusiglio et al. (10).
A relatively large number of prior studies have evaluated the role of the I655V variant in breast cancer risk, with inconsistent results. The first of these studies was an investigation within a relatively small subset of the Shanghai Breast Cancer Study, observing a significant association of the minor valine allele with breast cancer risk (33). This association did not replicate in our investigation of a much larger number of cases. We do not believe that these results are an artifact of genotyping assay interference by an adjacent SNP. Frank et al. observed that a rare A to G variant (rs1801201, I654V) might confound genotyping of rs1136201 (I655V), two variants only 3 bases apart (34). Because the rare V654 occurred only when adjacent to V655, an I654/V654 heterozygote might falsely genotype as an I655/I655 homozygote. However, rs1801201 was not polymorphic among our screening set of 372 study subjects, and so is not likely to have been a significant source of confounding in our study.
No prior study has included an investigation of simple tandem repeat polymorphism of ERBB2. Information content of STRs A and B appeared to be similar (though not identical) to that of the SNP haplotypes. We have previously observed that multi-allelic STRs can very efficiently tag SNP haplotypes, and may also be functional candidates in disease (35). STR C was among the most polymorphic markers that we have observed. Alleles of this STR were not highly correlated with the tested SNPs by pairwise r2. We cannot fully exclude a potential role for allele 119 of STR C in breast cancer risk, after correction of significance for multiple comparisons. The association of allele 119 with breast cancer risk was strongest among those with aggressive (high grade or stage) disease, a facet closely intertwined with the histologic types. The potential association of STR C alleles 119 and 125 alternatively with major histological types of breast cancer (ductal and lobular-tubular, respectively) is intriguing. It is conceivable that the half-helical turn of these ERBB2 alleles, which separate evolutionarily conserved flanking sequences, modifies risk for specific types of breast cancer. Our observed association of ERBB2 STR C may be worthy of independent evaluation to distinguish a true association from a multiple comparisons artifact. Given the concordance between our study and the two prior haplotype-based evaluations of ERBB2, it seems unlikely that common population single nucleotide variants at the gene play a significant role as modifiers of breast cancer risk.
In summary, we characterized common genetic variation at ERBB2 within a well-characterized Han Chinese study population, assessed LD patterns of both SNPs and STRs, and systematically evaluated the potential contribution of ERBB2 germline variation to breast cancer risk. We observed no association of common SNP variants or haplotypes of ERBB2 with breast cancer risk. We did, however, observe a risk modifying effect of a highly polymorphic STR within an evolutionarily conserved region of the gene.
This study was supported by the V Foundation for Cancer Research and National Cancer Institute grants P50 CA098131, P30 CA068485, and R01 CA050486. The Shanghai Breast Cancer Study was supported by National Cancer Institute grants R01 CA64277 and R01 CA90899. We thank the study participants and staff of the Shanghai Breast Cancer Study.