The incidence of breast cancer is 35% lower in Hispanic women living in the San Francisco Bay Area than in non-Hispanic white women. We have previously described a significant association between genetic ancestry and risk of breast cancer in a sample of US Hispanics/Latinas. We re-tested the association in women residing in Mexico because of the possibility that the original finding may have been confounded by US-specific unmeasured environmental exposures. We genotyped a set of 106 ancestry informative markers (AIMs) in 846 Mexican women with breast cancer and 1,035 unaffected controls and estimated genetic ancestry using a maximum likelihood method. Odds ratios (OR) and 95% confidence intervals (CI) for ancestry modeled as a categorical and as a continuous variable were estimated using logistic regression and adjusted for reproductive and other known risk factors. Greater European ancestry was associated with increased breast cancer risk in this new and independent sample of Mexican women residing in Mexico. Compared to women with 0-25% European ancestry, the risk was increased for women with 51-75% and 76-100% European ancestry (OR=1.35, 95% CI: 0.96-1.91 and OR=2.44, 95% CI: 0.94-6.35, respectively; p for trend=0.044). For every 25% increase in European ancestry (modeled as a continuous variable) there was a 20% increase in risk of breast cancer (95% CI: 1.03-1.41, p=0.019). These results suggest that non-genetic factors play a crucial role in explaining the difference in breast cancer incidence between Latinas and non-Latina white women, and they also point to the possibility of a genetic component to this difference.
Breast cancer incidence varies substantially worldwide. Europe, Australia, North America, Argentina and Uruguay have age adjusted breast cancer incidence rates of up to 101 per 100,000, while most African, Asian and Latin American countries have incidences of less than 52 per 100,000 (1). This variation might be the result of differences in reproductive, hormonal, and lifestyle factors, and possibly genetic factors between populations (2-4). Differences in breast cancer incidence have been observed not only between countries, but also between populations within countries. For example, Hispanic women in the San Francisco Bay Area had a breast cancer incidence that was 35% lower than that for non-Hispanic white women for the period 1998-2002 (5). A large proportion of the difference in incidence disappears in second and third generation immigrants, suggesting a major environmental influence on breast cancer risk (2). The degree to which genetic factors play a role is unknown.
With respect to breast cancer susceptibility, US Latinas are a unique group to study because of the diversity of both their environmental and cultural exposures and their genetic ancestry. The term “Hispanic” or “Latino” describes a heterogeneous population with a shared history of colonization and a common language, but does not refer to a fixed biological entity or a single common ancestry. In terms of their genetic background, Latinos represent a complex mix of Indigenous American, European, and African ancestries (6). Understanding cancer susceptibility in this population is crucial since Latinos account for 14% of the total US population, and are predicted to comprise 25% of the population by the year 2050 (7).
Unknown environmental factors might be important in explaining some of the differences in breast cancer risk among subsets of Latina women. Among US Latinas, those born in the US have a higher risk of post-menopausal breast cancer compared to foreign born Latinas, even after controlling for reproductive and other risk factors (2). Genetic factors might also contribute to differences in breast cancer risk among subgroups of Latinas. We recently demonstrated that European genetic ancestry was associated with increased risk of breast cancer among US Latinas (8). We compared the mean genetic ancestry among US Latina breast cancer cases and US Latina controls from the San Francisco Bay Area and found that cases had significantly greater European ancestry and significantly lower Indigenous American ancestry even after adjusting for known reproductive and other environmental risk factors, including place of birth and age at migration.
We sought to re-test the association in women residing in Mexico because of the possibility that the original association between genetic ancestry and breast cancer risk in US Latinas might be confounded by unmeasured environmental exposures. Specifically, we investigated the relationship between breast cancer risk and genetic ancestry using a new and independent sample of 1,881 women in Mexico. We hypothesized that if there is a US-specific environmental component that increases breast cancer risk and that is associated with European genetic ancestry in US Latinas, then there should be no association between genetic ancestry and breast cancer risk in the Mexican women.
Analyses were performed using DNA and epidemiologic data from a multicenter case-control study designed to examine predictors of breast cancer risk among Mexican women aged 35-69 years who resided for at least 5 years in Mexico City, Monterrey or Veracruz.
Newly diagnosed cases were identified at 12 hospitals from the major healthcare systems in Mexico: the Mexican Institute of Social Security (IMSS, 6 hospitals), the Social Security system for state workers (ISSSTE, 2 hospitals) and the Ministry of Health (SS, 4 hospitals), which provides health care to those who do not belong to either of the other two systems. Inclusion criteria for cases were a) a histologically confirmed new diagnosis of breast cancer between the years 2004 and 2007, including invasive and in situ tumors; b) no previous treatment such as radiotherapy, chemotherapy, or anti-estrogens in the last 6 months; and c) no present treatment with exemestane, letrozole, anastrozole, or megestrol. Pregnant women and cases known to be HIV infected were excluded from the study. Only two cases were using anti-estrogens; they were excluded because we were interested in measuring breast density, which anti-estrogens could have modified. Only one case known to be HIV positive was excluded from the study. Cases and controls were derived from the same source populations.
Controls were selected based on a probabilistic multistage sampling design. One or more geo-statistical areas (from Spanish, Área Geoestadística Básica) considered in the catchment area of the participating hospital were randomly selected. Women were randomly sampled to obtain specific numbers of women in each 5-year age category (35-39, 40-44, 45-49, 50-54, 55-59, 60-64, and 65-69 years) based on the age distribution of cases reported by the Mexican Tumor Registry in 2002. If more than one woman age 35-69 years was present in the household, only one was selected (25 controls and 1 case within our sample had a sister of similar age living in the same household) (9). Trained personnel visited the selected households and determined willingness to participate in the study.
Data collection included the administration of a structured questionnaire at the participant's home, and collection of anthropometric measurements and a blood sample at the hospital. In addition, a mammogram was taken for controls. The response rate of the cases was 95.5% for Mexico City, 94.4% for Monterrey and 97.4% for Veracruz. The response rate of the controls was 87.4% for Mexico City, 90.1% for Monterrey, and 97.6% for Veracruz. The complete sample included 1,000 cases and 1,074 controls. DNA for the present study was available for a total of 1,938 subjects (880 cases and 1,058 controls). A more detailed description of the multicenter population-based case-control study has been published recently (9).
All participants provided written informed consent before the in-person interview. The study was approved by the institutional review board at each institution participating in this collaborative study.
The health questionnaire collected information on socio-demographic characteristics, reproductive factors, use of oral contraceptives and hormone replacement therapy (HRT), family and personal history of chronic diseases, personal history of sexually transmitted diseases, histories of body size, smoking, and alcohol consumption, and history of medical X-rays and mammograms. Information on usual dietary intake in the past year was collected using a food frequency questionnaire adapted from the Willett questionnaire (10) and validated in a sample of Mexican women (11).
Study participants were asked about usual physical activity during a one-week period that reflected the activity performed during the last 12 months before any breast cancer symptoms were perceived in cases and during the last 12 months before recruitment in controls (12). Three different categories of physical activity were defined: strenuous, moderate and light (13).
Standing height, weight, hip and waist circumferences were measured by the interviewer at the hospital. Body mass index (BMI) was calculated as measured weight (kg) divided by measured height (m) squared.
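The BMI definition above can be written as a one-line calculation. The sketch below is illustrative only; the function name and the example participant are hypothetical and do not come from the study data.

```python
def body_mass_index(weight_kg: float, height_m: float) -> float:
    """Return BMI in kg/m^2: measured weight (kg) divided by measured height (m) squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


# Hypothetical participant (not study data): 65 kg, 1.60 m.
example_bmi = body_mass_index(65.0, 1.60)
```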
A set of 106 SNPs that can separate Indigenous American, African and European ancestry was used to estimate the proportion of genetic ancestry in the sample of Mexican women with breast cancer and unaffected controls. Simulation studies have demonstrated that ~100 ancestry informative markers (AIMs) with allele frequency differences similar to the ones we used are required to achieve a correlation coefficient of >0.9 with true ancestry (14). The AIMs used in this study were bi-allelic SNPs selected from the Affymetrix 100K SNP chip (Affymetrix, Santa Clara, CA). AIM selection was based on calculations of allele frequency differences between Europeans, West Africans and Indigenous Americans. The SNPs chosen maximize information for more than one ancestral population pairing, with a large difference in allele frequency between ancestral populations (>0.5). The AIMs are widely spaced throughout the genome and have a well-balanced distribution across all 22 autosomal chromosomes. The average distance between markers is about 2.4 × 10^7 bp. The parental population samples that were genotyped on the Affymetrix 100K SNP chip included 42 Europeans (Coriell's North American Caucasian panel), 37 West Africans (non-admixed Africans living in London, UK and South Carolina) and 30 Indigenous Americans (15 Mayans and 15 Nahuas) (8, 15).
Genotyping of the 106 AIMs for the Mexican women was performed at the Children's Hospital Oakland Research Institute, using multiplex PCR coupled with single base extension methodology, with allele calls made on a Sequenom analyzer. Primers and reaction conditions have been previously published (8). Laboratory personnel genotyped the samples without knowledge of case/control status.
A total of 1,938 Mexican cases and controls were newly genotyped for this study (880 cases and 1,058 controls). The average call rate for the SNPs was 98.9%. After we removed 3 SNPs with a call rate smaller than 90%, the average call rate for the SNPs was 99.2%. The average sample call rate was 98.7%. After we removed 57 samples (34 cases, 23 controls) with a call rate smaller than 85%, the average call rate for the samples was 99.6% (the excluded samples had no significant differences for the variables analyzed). We genotyped 66 duplicate pairs, and of these, three pairs were excluded from the mismatch analysis because the call rate for one of the duplicates was low (7%, 26% and 25%) compared to the high call rate of most samples in the study. The overall error rate without including the duplicate pairs with low call rates was 0.02%. All the AIMs were in Hardy-Weinberg equilibrium.
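The call-rate filters described above (SNPs below 90%, samples below 85%) can be sketched as follows. This is a minimal illustration, not the pipeline actually used: the data layout, the function names, and the choice to compute sample call rates only over the SNPs that survive the first filter are all assumptions.

```python
from typing import Dict, List, Optional, Tuple

Genotype = Optional[int]  # 0, 1 or 2 copies of an allele; None means no call


def call_rate(calls: List[Genotype]) -> float:
    """Fraction of non-missing genotype calls (0.0 for an empty list)."""
    if not calls:
        return 0.0
    return sum(c is not None for c in calls) / len(calls)


def filter_by_call_rate(
    genotypes: Dict[str, List[Genotype]],
    snp_min: float = 0.90,
    sample_min: float = 0.85,
) -> Tuple[List[int], List[str]]:
    """Return (indices of retained SNPs, IDs of retained samples)."""
    samples = list(genotypes)
    n_snps = len(genotypes[samples[0]])
    # Step 1: drop SNPs whose call rate across all samples is below snp_min.
    kept_snps = [
        j for j in range(n_snps)
        if call_rate([genotypes[s][j] for s in samples]) >= snp_min
    ]
    # Step 2 (assumed order): drop samples whose call rate over the
    # retained SNPs is below sample_min.
    kept_samples = [
        s for s in samples
        if call_rate([genotypes[s][j] for j in kept_snps]) >= sample_min
    ]
    return kept_snps, kept_samples
```

With real data the two thresholds would be left at their defaults; the function accepts other values only so that it can be exercised on small toy inputs.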
Genotype and phenotype information was available for a total of 1,881 women from Mexico (846 cases and 1,035 controls).
We used a maximum likelihood approach to estimate genetic ancestry at the individual level (16, 17). To compare the characteristics of cases and controls for Mexican women, we used t-tests for continuous variables and Fisher's exact tests for categorical variables. Mean genetic ancestry was estimated as the average of the individual genetic ancestry estimates within a group.
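To illustrate what a maximum likelihood estimate of individual ancestry involves, the sketch below maximizes a binomial genotype likelihood with an EM iteration, holding the ancestral allele frequencies fixed. It is not the implementation of references 16 and 17; the function name, the EM scheme, the fixed iteration count, and the assumption of complete genotypes are illustrative choices.

```python
from typing import List, Sequence


def estimate_ancestry(
    genotypes: Sequence[int],
    freqs: Sequence[Sequence[float]],
    n_iter: int = 200,
) -> List[float]:
    """Maximum likelihood ancestry proportions for one individual.

    genotypes[j]: copies (0, 1 or 2) of the reference allele at marker j.
    freqs[k][j]:  reference-allele frequency at marker j in ancestral
                  population k (must lie strictly between 0 and 1).
    Missing genotypes are not handled in this sketch.
    """
    n_pops = len(freqs)
    n_markers = len(genotypes)
    q = [1.0 / n_pops] * n_pops  # start from equal proportions
    for _ in range(n_iter):
        new_q = [0.0] * n_pops
        for j, g in enumerate(genotypes):
            # Expected reference-allele frequency for this individual.
            p = sum(q[k] * freqs[k][j] for k in range(n_pops))
            for k in range(n_pops):
                # Expected number of allele copies drawn from population k.
                new_q[k] += g * q[k] * freqs[k][j] / p
                new_q[k] += (2 - g) * q[k] * (1.0 - freqs[k][j]) / (1.0 - p)
        # Each marker contributes two allele copies, so proportions sum to 1.
        q = [v / (2 * n_markers) for v in new_q]
    return q
```

Each iteration reassigns the individual's 2 × (number of markers) allele copies to the ancestral populations in proportion to how well each population explains them, so the estimates always sum to one.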
To assess the association between breast cancer risk and genetic ancestry in the sample of Mexican women residing in Mexico, we used unadjusted and adjusted logistic regression models. European and Indigenous American genetic ancestry were each modeled as a categorical variable (0-25%=1, 26-50%=2, 51-75%=3 and 76-100%=4). We also evaluated the models with genetic ancestry represented as a continuous variable (% genetic ancestry). Both the unadjusted and adjusted models included as covariates the recruitment site (Mexico City, Veracruz or Monterrey) and the type of health insurance to which the individuals belonged (SSA, ISSSTE and IMSS). We stratified the unadjusted analysis by recruitment site to evaluate whether the association between genetic ancestry and breast cancer risk was consistent across the three sites. The multivariate models adjusted for European ancestry, age (continuous), family history of breast cancer in first-degree relatives (yes, no), personal history of benign breast disease (yes, no), age at menarche (continuous), number of full-term pregnancies (continuous 0-6, ≥7), age at first full-term pregnancy (1=20 years or less; 2=between 21 and 30 years; 3=more than 30 years; 4=no full-term pregnancies), breast-feeding (ever, never), history of HRT use/menopausal status (0=postmenopausal/ever use of HRT; 1=premenopausal/ever use of HRT; 2=postmenopausal/no HRT; 3=premenopausal/no HRT), alcohol intake during the reference year (defined as the 12 months previous to diagnosis in cases and the 12 months previous to recruitment in controls) (one or more drinks a month in a year or longer=1, otherwise 0), daily caloric intake during the reference year (continuous), education (none=0, some elementary school=1, completed elementary school=2, high school=3, college=4, postgraduate studies=5), moderate physical activity (hours per week, continuous), and socioeconomic status (SES) (low, medium, high).
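The four-level ancestry coding can be sketched as a small helper, together with the indicator (dummy) variables a logistic model would use with 0-25% as the referent. Both function names are hypothetical, and the treatment of non-integer percentages (upper bounds inclusive, so 25.0 falls in category 1 and 25.1 in category 2) is an assumption, since the text lists integer ranges only.

```python
from typing import Tuple


def ancestry_category(percent: float) -> int:
    """Map percent ancestry (0-100) to the codes 1..4 used in the text.

    Assumption: each upper bound is inclusive (25.0 -> 1, 25.1 -> 2).
    """
    if not 0.0 <= percent <= 100.0:
        raise ValueError("percent ancestry must be between 0 and 100")
    if percent <= 25.0:
        return 1
    if percent <= 50.0:
        return 2
    if percent <= 75.0:
        return 3
    return 4


def dummy_code(category: int) -> Tuple[int, int, int]:
    """Indicator variables for categories 2, 3 and 4; category 1 is the referent."""
    if category not in (1, 2, 3, 4):
        raise ValueError("category must be 1, 2, 3 or 4")
    return (int(category == 2), int(category == 3), int(category == 4))
```

Under this coding, the odds ratio reported for a category is the exponentiated coefficient of its indicator, interpreted relative to women with 0-25% ancestry.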
For the construction of the SES variable, we combined information about different belongings (i.e., gas or electric stove, water heating system, radio or cassette recorder, television, videocassette recorder, CD player, refrigerator, washing machine, microwave oven, blender, vacuum cleaner, water pump, motorcycle, car or van, fixed phone, cellular phone, computer, and dish antenna). The polychoric correlation was used for the construction of an SES index (18), categorized into tertiles among controls (low, medium, high).
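The final categorization step, tertiles defined among controls and then applied to everyone, can be sketched as below. The sketch assumes a numeric SES index has already been computed for each woman; it does not reproduce the polychoric-correlation step, and the function names and the handling of values that fall exactly on a cutpoint are assumptions.

```python
import statistics
from typing import List, Sequence


def tertile_cutpoints(control_index: Sequence[float]) -> List[float]:
    """Two cutpoints that split the controls' SES index into three equal groups."""
    return statistics.quantiles(control_index, n=3)


def ses_category(value: float, cutpoints: Sequence[float]) -> str:
    """Classify an SES index value using cutpoints derived from controls."""
    low_cut, high_cut = cutpoints
    if value <= low_cut:
        return "low"
    if value <= high_cut:
        return "medium"
    return "high"
```

Deriving the cutpoints from controls only and then applying them to cases keeps the categories anchored to the source population rather than to the case mix.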
Individuals with missing data were dropped from the multivariate analysis (161 cases and 180 controls). We evaluated models including both European and African ancestry. We present results based on a model that included European ancestry as the predictor. Indigenous American ancestry is the counterpart of European ancestry and therefore the results can also be interpreted in terms of the former.
We evaluated possible interactions between genetic ancestry and other risk factors (e.g., HRT, BMI, parity, age at first full-term pregnancy, breastfeeding and menopausal status).
All statistical analyses were performed using the program STATA (19) and all tests were two-sided.
Characteristics of breast cancer cases and controls from Mexico are presented in Table 1. Mexican cases were significantly older at diagnosis, had fewer full-term pregnancies, were less likely to breast-feed, were more likely to report a personal history of benign breast disease, a family history of breast cancer, or history of HRT use, and had higher alcohol intake and higher daily caloric intake. Cases also reported a significantly higher level of education and socioeconomic status and lower BMI than controls (the relationship between BMI and breast cancer risk was only observed among pre-menopausal women). Finally, women with breast cancer had more European and less Indigenous American ancestry than controls. There were no significant differences between cases and controls in age at menarche or African ancestry. The proportion of European and Indigenous American ancestry in Mexican controls differed by recruitment site. Monterrey had the largest proportion of European ancestry (40%, standard deviation (sd)=16) compared to Mexico City (28%, sd=19) and Veracruz (30%, sd=18). The proportion of Indigenous American ancestry was estimated to be 54% (sd=15) in Monterrey, 69% (sd=20) in Mexico City, and 64% (sd=20) in Veracruz. The proportion of African ancestry was estimated to be 6% (sd=5) in Monterrey, 3% (sd=4) in Mexico City, and 6% (sd=8) in Veracruz.
We investigated the association between reproductive, demographic and lifestyle characteristics and genetic ancestry among the controls (Table 2) and observed that SES, education, daily kilocalorie intake and family history of breast cancer differed significantly by ancestry category: family history was more common, and SES, education and daily kilocalorie intake were higher, among women with higher European ancestry. We also explored the relationship between the characteristics of the controls and SES (Table 3). These results show a very strong relationship between SES and genetic ancestry, all reproductive variables, education, alcohol consumption and daily kilocalorie intake. Women with higher SES tend to have fewer full-term pregnancies, higher European ancestry, breast-feed less, consume more alcohol and kilocalories and have more years of education than women with lower SES.
The association between genetic ancestry and breast cancer risk in the sample of Mexican women is shown in Table 4. The unadjusted model showed a strong association with European genetic ancestry. Using 0-25% European ancestry as the referent, odds ratios were 1.16 (95% CI=0.94-1.43, p=0.171) for the 26-50% category, 1.80 (95% CI=1.35-2.39, p<0.001) for the 51-75% category, and 2.22 (95% CI=1.46-7.11, p=0.004) for the 76-100% category. When known risk factors were adjusted for (Table 4), the association with European ancestry was attenuated but the trend remained statistically significant (26-50% OR=1.01, 95% CI=0.78-1.30, p=0.962; 51-75% OR=1.35, 95% CI=0.96-1.91, p=0.087; 76-100% OR=2.44, 95% CI=0.94-6.35, p=0.067; p for trend=0.044). We chose to represent ancestry as a categorical variable since it allowed us to compare the effect size for the extremes of the ancestry distribution. However, the effect is also seen when ancestry is entered as a continuous variable in the model. For the model that included genetic ancestry as a continuous variable, the OR for every 25% increase in European ancestry was 1.20 (95% CI=1.03-1.41, p=0.019). To ensure that there was no confounding due to regional differences between cases and controls, the unadjusted model was stratified by recruitment site (Mexico City, Monterrey, Veracruz), with all results showing the same trend as the global analysis (Table 4) and the effect weakening as sample size decreased.
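Because continuous ancestry enters a logistic model linearly on the log-odds scale, an odds ratio reported per 25% increase can be re-expressed for any other increment. The sketch below shows that arithmetic using the reported OR of 1.20 and its confidence limits; the function name is hypothetical, and the conversion holds only under the linear-in-log-odds assumption.

```python
import math


def rescale_odds_ratio(or_value: float, from_step: float, to_step: float) -> float:
    """Convert an OR per `from_step` units of exposure to an OR per `to_step` units.

    Assumes OR = exp(beta * step), i.e. the exposure is modeled linearly
    on the log-odds scale.
    """
    beta = math.log(or_value) / from_step  # log-odds change per single unit
    return math.exp(beta * to_step)


# Reported estimate: OR=1.20 (95% CI 1.03-1.41) per 25% European ancestry.
or_per_10 = rescale_odds_ratio(1.20, 25, 10)
ci_per_10 = (
    rescale_odds_ratio(1.03, 25, 10),
    rescale_odds_ratio(1.41, 25, 10),
)
```

The same function gives the implied OR for a 50% difference in ancestry (the square of the per-25% estimate), which is a different quantity from the categorical estimates above because those do not assume linearity.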
In the adjusted model, the associations between breast cancer and alcohol consumption, parity, family history, age at menarche, benign breast disease, kilocalorie intake and moderate physical activity were in the expected direction (Table 4). When the model testing the association between genetic ancestry and breast cancer risk included SES and education as the only covariates, the two covariates showed an effect on breast cancer risk. However, when other covariates were added to the model, the effect of SES and education became non-significant (Table 4). Number of full-term pregnancies, daily kilocalorie intake and benign breast disease were the variables that, when added to the model, absorbed most of the effect of SES and education.
We found no evidence of significant interaction between genetic ancestry and HRT use, BMI, parity, age at first full-term pregnancy, breastfeeding or menopausal status.
We found an association between European genetic ancestry and breast cancer risk among Mexican women who reside in Mexico. Women with greater European ancestry had an increased risk of breast cancer compared to women with lower European ancestry, which could also be interpreted as Indigenous American ancestry being protective against the development of breast cancer. We have previously reported a significant association between genetic ancestry and breast cancer risk in US Latinas, with a 39% increase in breast cancer risk with every 25% increase in European ancestry (8). If the observed association in the US Latinas were solely due to confounding by a specific environmental risk factor associated with genetic ancestry in the US, then we would expect not to observe an association in the Mexican women. However, given that the association between breast cancer risk and genetic ancestry was also present in Latina women residing outside the US, it is reasonable to conclude that if there is an environmental component that is driving the association between genetic ancestry and breast cancer risk, it must be common to both US Latinas and Mexican women. The association is weaker in the Mexican women than in the US Latinas, which could be due to differences in the available covariates. It may also be that unmeasured confounding was greater in the US samples.
Among Mexican women the association between genetic ancestry and breast cancer risk was attenuated after adjustment for known risk factors. Number of full-term pregnancies, daily caloric intake, benign breast disease and SES/education were the main variables responsible for this attenuation. Notably, SES and education, which were associated with genetic ancestry and with breast cancer risk in the univariate models, did not have a significant effect on breast cancer risk once other variables, such as number of full-term pregnancies and kilocalorie intake, were included in the model.
The results of this study confirmed our previous finding in an independent sample of Mexican women living outside the US. The effect of ancestry on breast cancer risk could be due to the effect of unmeasured environmental confounders present both in Mexico and the US. However, our results also provide support to the hypothesis of a genetic component underlying the association between genetic ancestry and breast cancer risk.
A limitation of the present study is the possibility that our measure of SES might not be optimal. Our models included adjustment for SES and education. However, our data included only 3 categories for SES and this may not capture the true variation in the population. The correlation between SES and genetic ancestry in the Mexican sample is high (the correlation coefficient is 0.23). If the true SES were more strongly correlated with genetic ancestry then it would be difficult to separate the effect of the two.
Another limitation of our analysis is that we treated each ancestral population, Europeans, Indigenous Americans and Africans, as one population despite the fact that numerous populations contributed to the gene pool of contemporary Latinos, and, in particular, there were multiple Indigenous American populations who were genetically (20) and culturally diverse. Therefore, our results in Mexican women may not be generalizable to populations from different parts of the Americas.
In summary, we replicated our previous results, although with weaker effects, showing an association between European genetic ancestry and breast cancer risk in an independent sample of Latina women living in three different regions of Mexico. Future studies in other populations in the Americas will allow us to further explore the effect of genetic ancestry on breast cancer risk in diverse environments, populations with different breast cancer incidence, and in groups with different proportions of European, Indigenous American and African ancestry.
FUNDING: National Cancer Institute (R01CA120120 to EZ); CONACyT (SALUD-2002-C01-7462 to GTM); UCSF Clinical and Translational Sciences Institute (Career Development Award to LF); Prevent Cancer Foundation (Postdoctoral Fellowship to LF); National Institutes of Health (HL078885 and HL088133 to EGB); RWJF Amos Medical Faculty Development Award (to EGB); The Sandler Family Supporting Foundation (to EGB); National Cancer Institute (Redes En Acción, U01-CA86117 to EPS); National Cancer Institute (R01 CA63446 and R01 CA77305 to EMJ); California Breast Cancer Research Program (7PB-0068 to EMJ).
NOTES: The authors would like to thank the study participants. We would also like to thank the physicians responsible for the project in the different participating hospitals in Mexico: Dr. Germán Castelazo (IMSS, Hospital de la Raza, Ciudad de México, DF), Dr. Sinhué Barroso Bravo (IMSS, Hospital Siglo XXI, Ciudad de México, DF), Dr. Fernando Mainero Ratchelous (IMSS, Hospital de Gineco-Obstetricia No 4. “Luis Castelaco Ayala”, Ciudad de México, DF), Dr. Hernando Miranda Hernández (SS, Hospital General de México, Ciudad de México, DF), Dr. Joaquín Zarco Méndez (ISSSTE, Hospital 20 de Noviembre, Ciudad de México, DF), Dr. Edelmiro Pérez Rodríguez (Hospital Universitario, Monterrey, Nuevo León), Dr. Jesús Pablo Esparza Cano (IMSS, Hospital No. 23 de Ginecología, Monterrey, Nuevo León), Dr. Heriberto Fabela (IMSS, Hospital No. 23 de Ginecología, Monterrey, Nuevo León), Dr. José Pulido Rodríguez (SS, Hospital Metropolitano Dr “Bernardo Sepulveda”, Monterrey, Nuevo León), Dr. Manuel de Jesús García Solis (SS, Hospital Metropolitano Dr “Bernardo Sepulveda”, Monterrey, Nuevo León), Dr. Fausto Hernández Morales (ISSSTE, Hospital General, Veracruz, Veracruz), Dr. Pedro Coronel Brizio (SS, Centro Estatal de Cancerología “Dr. Miguel Dorantes Mesa”, Xalapa, Veracruz), Dr. Vicente A. Saldaña Quiroz (IMSS, Hospital Gineco-Pediatría No 71, Veracruz, Veracruz), and Teresa Shama Levi (INSP, Cuernavaca, Morelos).