The Sister Study [National Institute of Environmental Health Sciences (NIEHS) 2011] is a prospective cohort study of 50,884 U.S. and Puerto Rican volunteer women 35–74 years of age who were not previously diagnosed with breast cancer but had a full or half sister who was diagnosed with breast cancer. Details of the study design have been described elsewhere (D’Aloisio et al. 2010). Study enrollment occurred between 2003 and 2009. The Sister Study was approved by the Institutional Review Board (IRB) of the NIEHS, National Institutes of Health, and the Copernicus Group IRB; all participants provided their informed consent. Computer-assisted telephone interviews assessed demographics, medical history, and possible risk factors for breast cancer and other conditions. Participants also completed a self-administered questionnaire on family history that included early-life events. Prepaid phone cards were given to participants to contact their mother or other relatives for assistance with completing family history questionnaires.
Our analysis included 3,534 black women who were 35–59 years of age when they completed baseline enrollment activities in the Sister Study. Similar to our previous analysis, we excluded women ≥ 60 years of age because of temporal trends in ultrasound use for fibroid diagnosis and because of the reduced likelihood that these women had a living mother to ask about early-life events (D’Aloisio et al. 2010). A total of 3,567 black women, 35–59 years of age, were available at baseline. We excluded 10 women with missing data on fibroid status, 21 with missing age at fibroid diagnosis, and 2 with unlikely ages at fibroid diagnosis (for example, fibroids are rarely seen in girls < 12 years of age who have not begun menstruation). For comparison to results in black women, we updated our previously published results in non-Hispanic white women (n = 19,972), which were from an analysis done prior to completion of cohort enrollment (D’Aloisio et al. 2010). Data from 27,786 non-Hispanic white women (hereafter referred to as “whites”) who were 35–59 years of age at baseline and who had complete data on fibroid diagnoses constituted the updated comparison group.
During the baseline telephone interview, women reported whether a physician or health professional had ever diagnosed them as having uterine fibroids and their age at diagnosis. Because fibroids are highly prevalent and because their prevalence increases with age until menopause, many older women have undiagnosed fibroids. Therefore, we focused our study on early-onset fibroids to limit misclassification of women as noncases. For white women, we previously defined early-onset fibroids as a diagnosis at ≤ 35 years of age (D’Aloisio et al. 2010) based on ultrasound screening results that have shown that fibroid prevalence accelerates after 35 years of age (Baird et al. 2003; Laughlin et al. 2010). However, Laughlin et al. (2010) reported that the estimated age at onset may be about 10 years earlier for black women. Because our fibroid diagnoses are based on self-reported data, we used a conservative estimate to define early-onset fibroids in black women as a reported diagnosis at ≤ 30 years of age. We considered noncases to be all women who did not report early-onset fibroids, regardless of whether they reported a later fibroid diagnosis.
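The case definition above can be written out as a short rule. The following is a minimal sketch for illustration only, not the code used in the study; the function name, argument names, and race labels are assumptions.

```python
from typing import Optional

# Maximum age at diagnosis (years) that counts as "early onset",
# taken from the definitions stated in the text.
EARLY_ONSET_MAX_AGE = {"black": 30, "white": 35}


def is_early_onset_case(race: str, ever_diagnosed: bool,
                        age_at_diagnosis: Optional[int]) -> bool:
    """Return True for an early-onset fibroid case, False for a noncase.

    Women never diagnosed, and women diagnosed after the race-specific
    threshold, are both treated as noncases.
    """
    if not ever_diagnosed:
        return False
    if age_at_diagnosis is None:
        # Women with a missing age at diagnosis were excluded from analysis,
        # so this hypothetical helper refuses to classify them.
        raise ValueError("age at diagnosis is required for diagnosed women")
    return age_at_diagnosis <= EARLY_ONSET_MAX_AGE[race]
```

Under this rule, a black woman diagnosed at age 31 and a white woman diagnosed at age 36 are both noncases, even though each reported a fibroid diagnosis.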
Exposure and covariate assessment.
As previously described, most of the early-life exposures were ascertained in self-administered family history questionnaires that were completed at baseline (D’Aloisio et al. 2010). Early-life exposures included birth weight (pounds/ounces), categorical gestational age at birth, singleton or multiple birth, ever breastfed during infancy, ever fed soy formula, and maternal factors involving the index pregnancy (maternal age at birth, residence and work on a farm, smoking, DES use, prepregnancy diabetes, gestational diabetes, preeclampsia, and gestational hypertension). Gestational diabetes was defined as reporting gestational diabetes during the index pregnancy and reporting no history of diabetes before the index pregnancy. We defined mothers as having had gestational hypertension during the index pregnancy only if pregnancy-related high blood pressure was reported in the absence of preeclampsia. To allow for uncertainty, response options were “definitely,” “probably,” “probably not,” and “definitely not” as well as “don’t know” for most early-life questions. However, “don’t know” was the only option available to address uncertainty for gestational age, birth weight, singleton or multiple birth, and maternal age at birth.
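The two derived maternal conditions defined above reduce to simple exclusion rules. The sketch below is illustrative only; the function and argument names are hypothetical, and the inputs are assumed to be already coded as yes/no.

```python
def gestational_diabetes(reported_during_pregnancy: bool,
                         diabetes_before_pregnancy: bool) -> bool:
    """Gestational diabetes: reported for the index pregnancy with no
    history of diabetes before that pregnancy."""
    return reported_during_pregnancy and not diabetes_before_pregnancy


def gestational_hypertension(pregnancy_high_blood_pressure: bool,
                             preeclampsia: bool) -> bool:
    """Gestational hypertension: pregnancy-related high blood pressure
    reported in the absence of preeclampsia."""
    return pregnancy_high_blood_pressure and not preeclampsia
```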
Questionnaires provided additional information on multiple births including number of babies delivered from the index pregnancy, sex of multiple birth sibling(s), and zygosity based on the participant’s perception and test results when available. Using this information, we assigned twin births as monozygotic (reporting a twin sister who is genetically identical to the participant based on perception or testing), dizygotic (reporting a twin brother or a twin sister who is not genetically identical to the participant), or unknown (missing data to categorize type of twin birth). For the < 1% of black or white participants who reported having two or more multiple birth siblings, we combined them with twin birth categories according to the following criteria: monozygotic (all were genetically identical sisters), dizygotic (no sisters who are genetically identical to the participant), polyzygotic (having a genetically identical sister and a nonidentical multiple birth sibling), or unknown (missing data to categorize multiple births). Birth order was calculated using birthdates reported by participants for full siblings and half-siblings sharing the same mother. Participants reported sisters’ birthdates during the telephone interview and brothers’ birthdates in the family history questionnaire. Questionnaires also asked how long participants were breastfed or fed with soy formula, and whether soy formula consumption occurred during the first two months of life. Whether the participant’s mother was alive at baseline was reported in the family history questionnaire.
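The zygosity categories above can be summarized as a small classification rule. This is a sketch under stated assumptions rather than the study's actual coding: each multiple birth sibling is represented by a hypothetical flag (True for a genetically identical sister, False for a brother or nonidentical sister, None when the information is missing), and any missing flag is assumed to yield "unknown."

```python
from typing import Optional, Sequence


def classify_multiple_birth(identical: Sequence[Optional[bool]]) -> str:
    """Classify a participant's birth from flags for each co-multiple sibling.

    An empty sequence means the participant had no multiple birth siblings.
    """
    if not identical:
        return "singleton"
    if any(flag is None for flag in identical):
        # Assumption: one missing flag is enough to leave the type unknown.
        return "unknown"
    if all(identical):
        return "monozygotic"   # every sibling is an identical sister
    if not any(identical):
        return "dizygotic"     # no identical sisters
    return "polyzygotic"       # an identical sister plus a nonidentical sibling
```

For twins the sequence has one element, so only the monozygotic, dizygotic, and unknown outcomes are possible, which matches the twin categories in the text.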
Participants reported childhood exposures during the baseline telephone interview. Childhood socioeconomic factors included educational level of parents or guardians in the household when the participant was 13 years of age, family income level based on self-reported categories (poor, low, middle, or well off), and food insecurity (not having enough to eat at times during childhood). Developmental factors were age at menarche and height and weight relative to peers at age 10 years. Information on participant characteristics at baseline, including age, education, smoking status, alcohol intake, ages at live births and stillbirths, and menopausal status, was assessed during the telephone interview. Body mass index at baseline was calculated using home visit measurements of weight (pounds) and height (inches).
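Because body mass index is defined in metric units (kg/m²), measurements in pounds and inches must be converted first. The following sketch shows one way to do this; the function name is hypothetical, and the conversion factors are the exact international definitions rather than values taken from the study.

```python
KG_PER_POUND = 0.45359237  # exact by definition of the international pound
M_PER_INCH = 0.0254        # exact by definition of the international inch


def bmi_from_imperial(weight_lb: float, height_in: float) -> float:
    """Body mass index (kg/m^2) from weight in pounds and height in inches."""
    if weight_lb <= 0 or height_in <= 0:
        raise ValueError("weight and height must be positive")
    weight_kg = weight_lb * KG_PER_POUND
    height_m = height_in * M_PER_INCH
    return weight_kg / height_m ** 2
```

For example, a woman weighing 150 pounds at a height of 65 inches has a body mass index of about 25.0 kg/m².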
Statistical analyses. We estimated risk ratios (RR) with 95% confidence intervals (CI) using log-binomial regression for associations between each exposure and early-onset fibroid diagnosis. For infancy feeding and maternal exposures involving the index pregnancy (except for maternal age at birth), women were classified as exposed if they reported “definitely” or “probably” for a given factor and as unexposed if they reported “probably not” or “definitely not.” We did not analyze data on how long participants were breastfed or fed with soy formula because a substantial proportion of participants did not know this information. For birth weight analyses, we created the following categories: < 2,500 g (low), 2,500–3,999 g, and ≥ 4,000 g (high) based on clinical definitions of low and high birth weight and similarity of fibroid risk within categories. For maternal age and birth order, we created a combined index variable based on similarities in associations with fibroids and correlations between young maternal age at birth and being the firstborn using the following categories: < 20 years old and firstborn, ≥ 20 years old and firstborn, < 20 years old and not firstborn, and ≥ 20 years old and not firstborn.
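The variable coding described above can be illustrated as follows. This is a minimal sketch, not the study's code: all function names and category labels are assumptions, responses other than the four certainty options are treated as missing, and birth weight reported in pounds and ounces is converted to grams using the exact definition of the pound.

```python
from typing import Optional

GRAMS_PER_POUND = 453.59237              # exact by definition
GRAMS_PER_OUNCE = GRAMS_PER_POUND / 16   # 16 ounces per pound


def code_exposure(response: str) -> Optional[bool]:
    """Map a certainty response to exposed (True), unexposed (False),
    or missing (None, e.g. "don't know")."""
    answer = response.strip().lower()
    if answer in {"definitely", "probably"}:
        return True
    if answer in {"probably not", "definitely not"}:
        return False
    return None


def birth_weight_category(pounds: float, ounces: float = 0) -> str:
    """Categorize birth weight using the cut points given in the text."""
    grams = pounds * GRAMS_PER_POUND + ounces * GRAMS_PER_OUNCE
    if grams < 2500:
        return "low"
    if grams < 4000:
        return "2500-3999"
    return "high"


def maternal_age_birth_order(maternal_age: float, firstborn: bool) -> str:
    """Combined maternal age at birth and birth-order index (four levels)."""
    age_part = "<20" if maternal_age < 20 else ">=20"
    order_part = "firstborn" if firstborn else "not firstborn"
    return f"{age_part} and {order_part}"
```

With these cut points, a reported birth weight of 5 pounds 8 ounces (about 2,495 g) falls in the low category, whereas 5 pounds 9 ounces (about 2,523 g) does not.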
We considered variables as confounders if they either influenced reporting or were associated with the fibroid outcome and early-life and childhood exposures. Specifically, in all race-specific log-binomial regression models, we included participant’s age and education because they may affect reporting of fibroids and exposures, and we included the maternal age at birth/birth-order index variable because it was associated with fibroids in our data and influences other early-life and childhood factors. We also adjusted for childhood family income in analyses for whites but not for blacks because childhood family income was only associated with the fibroid outcome for whites. Because of missing data on confounders, our adjusted analyses are based on 3,201 black women and 27,048 white women. For many of the observations excluded from the adjusted analyses, women did not complete family history questionnaires (n = 203 blacks; n = 361 whites), which resulted in missing data for maternal age at birth, birth order, and other early-life exposures. Despite evidence that early menarche is a risk factor for fibroids (Marshall et al. 1998; Wise et al. 2004), we did not adjust for age at menarche because it may be on the causal pathway between exposures of interest and fibroids. In secondary analyses, we further addressed whether other early-life factors may be acting as confounders by reevaluating associations after restricting analyses to smaller subsets of women without the potentially related factors. For example, we reevaluated the association with DES after excluding maternal conditions involving the index pregnancy (diabetes and hypertensive disorder). We conducted statistical analyses using SAS software (version 9.2; SAS Institute Inc., Cary, NC).
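To show what a risk ratio and its 95% CI represent, the sketch below computes the unadjusted RR for a single binary exposure from a 2 × 2 table, with a Wald interval on the log scale. This is a simplification: the models described above were adjusted for confounders and fit in SAS, and an adjusted log-binomial model requires iterative fitting that is not reproduced here. For one binary exposure with no covariates, the log-binomial estimate coincides with this crude RR. The function name and the example counts are hypothetical.

```python
import math


def risk_ratio(exposed_cases: int, exposed_total: int,
               unexposed_cases: int, unexposed_total: int,
               z: float = 1.959964):
    """Crude risk ratio with a Wald confidence interval on the log scale.

    Returns (rr, lower, upper); z = 1.959964 gives a 95% interval.
    """
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    rr = risk_exposed / risk_unexposed
    # Standard error of log(RR) for two independent binomial proportions.
    se_log_rr = math.sqrt(1 / exposed_cases - 1 / exposed_total
                          + 1 / unexposed_cases - 1 / unexposed_total)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts, not study data: 30 cases among 100 exposed women
# and 20 cases among 100 unexposed women.
rr, lower, upper = risk_ratio(30, 100, 20, 100)
print(f"RR = {rr:.2f} (95% CI: {lower:.2f}, {upper:.2f})")
```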
To assess whether our primary results for blacks were biased due to a high proportion of missing data for some early-life exposures, we also estimated associations with early-life exposures after performing multiple imputation analyses. These analyses assumed that missing data were not related to the actual values of missing variables but may have been associated with values of other study variables (i.e., we assumed that data were missing at random). We obtained 10 imputation data sets from multiple imputation by chained equations using IVEware (version 0.1; University of Michigan, Ann Arbor, MI), which fits a series of regression models conditional on other variables and imputes missing data based on the values predicted for each participant in these models (Raghunathan et al. 2001). The multiple imputation regression models included all early-life factors (excluding soy formula consumption in the first two months of life because of multicollinearity with any soy formula consumption), childhood family income, participant’s age, participant’s education, mother’s vital status at baseline, and early-onset fibroid status. Childhood family income and maternal vital status were not confounders in log-binomial regression models; however, they were included in multiple imputation regression models because they influenced whether early-life data were missing. Log-binomial regression analyses for associations with early-onset fibroids were repeated in the 10 imputation data sets as previously described, and regression estimates were combined using PROC MIANALYZE in SAS (version 9.2; SAS Institute Inc.). In secondary analyses after imputation of missing data, we also explored whether there was possible confounding among the early-life exposures by adjusting for additional early-life factors in log-binomial regression models. Multiple imputation of missing data allowed for simultaneous adjustment for several early-life factors in log-binomial regression models without reductions in sample size.
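Combining estimates across imputed data sets follows Rubin's rules, which PROC MIANALYZE implements. The sketch below illustrates the core calculation for a single coefficient (for example, a log RR); it is not the SAS procedure itself, it omits the degrees-of-freedom adjustment used for interval estimation, and the function name is hypothetical.

```python
from statistics import mean, variance
from typing import Sequence, Tuple


def pool_rubin(estimates: Sequence[float],
               variances: Sequence[float]) -> Tuple[float, float]:
    """Pool one coefficient across m imputed data sets using Rubin's rules.

    estimates: the coefficient estimated in each imputed data set
    variances: the squared standard error from each imputed data set
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching estimates and variances from >= 2 data sets")
    pooled = mean(estimates)        # average of the m estimates
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, total
```

With log RR estimates from the 10 imputed data sets, exponentiating the pooled estimate gives the combined RR, and the square root of the total variance gives its standard error on the log scale.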