Data for this analysis came from a Children's Oncology Group (COG) case–control study of infant leukemia. Cases were collected in two phases. Both phases required cases to have a confirmed diagnosis of acute leukemia prior to 1 year of age. Patients could be diagnosed with either ALL or AML. Cases who died before the study period were eligible for study participation, since the main source of data collection was the child's mother. Children were eligible if they had not been diagnosed with Down syndrome, had a biological mother who spoke English or Spanish (Phase II only), had a biological mother available by telephone, and were treated or diagnosed at a participating COG institution in the USA or Canada. Once cases were identified, either the treating physician was contacted and asked for permission to contact the child's mother, or parental consent to contact was obtained directly. Mothers with physician approval or consent were sent a letter explaining the study and notifying them that they would be contacted by phone. The first phase of recruitment included cases diagnosed between 1 January 1996 and 13 October 2002; the second phase included cases diagnosed between 1 January 2003 and 31 December 2006. In Phase I, 348 cases were confirmed eligible from 126 participating COG institutions, and 240 of these (69%) completed interviews. In Phase II, 345 cases were identified by 133 participating COG institutions as potentially eligible for the study. Of those eligible, 203 (59%) completed interviews.
Controls were selected in two phases coinciding with the case periods. In Phase I, controls were selected through random digit dialing (RDD). Numbers were generated using a modification of the methods proposed by Waksberg (Robison and Daigle, 1984). Potential phone numbers were generated from case phone numbers at diagnosis. The area code and exchange of the case phone number were retained and the last four digits were randomly selected in order to obtain a control number. For each number, up to nine contact attempts were performed. If the number resulted in no contact, a refusal or an ineligible household, subsequent numbers were generated until an eligible control agreed to participate in the study. The mother's name and address were then obtained along with permission to send a letter. Controls were obtained from 25 516 telephone numbers selected using RDD, of which 11 713 were identified as residential numbers. Using the method outlined by Slattery et al. (1995), the RDD household screening response rate was 67%. Maternal telephone interviews were successfully completed for 254 of 430 potentially eligible controls, giving a field response rate of 59% and an overall response rate of 40%.
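The number-generation step described above can be illustrated with a minimal sketch. It assumes 10-digit North American numbers; the function names, the example number and the seeded generator are hypothetical and are not taken from the study protocol, which followed a modification of the Waksberg method that is not reproduced here.

```python
import random

MAX_CONTACT_ATTEMPTS = 9  # per generated number, as stated in the text


def rdd_candidate(case_number: str, rng: random.Random) -> str:
    """Keep the case's area code and exchange; randomize the last four digits."""
    digits = "".join(ch for ch in case_number if ch.isdigit())
    if len(digits) != 10:
        raise ValueError("expected a 10-digit number (area code + exchange + line)")
    prefix = digits[:6]                      # area code (3) + exchange (3)
    suffix = f"{rng.randrange(10_000):04d}"  # random line number 0000-9999
    return prefix + suffix


def candidate_stream(case_number: str, rng: random.Random, max_numbers: int):
    """Yield successive candidate numbers for one case.

    In the field, generation would stop as soon as an eligible control agreed
    to participate; here the caller simply stops iterating.
    """
    for _ in range(max_numbers):
        yield rdd_candidate(case_number, rng)
```

For example, `list(candidate_stream("612-555-0142", random.Random(1), 3))` would give three candidate numbers sharing the prefix `612555`. The sketch does not model contact attempts, refusals or eligibility screening.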
Phase II controls were selected through state birth registries. Sixteen states that could release birth records and had registered a large number of infant leukemia cases in Phase I were approached about participation, 15 of which ultimately provided rosters of birth certificate (BC) data. Controls were frequency matched to cases on year of birth and region of residence based on the Phase I case distribution. The 15 states were allocated to regions to facilitate geographical matching. An introductory letter was sent to 270 potential controls providing information about the study and indicating that an interviewer would contact them by phone. Phone contact was attempted for each potential control successively until an eligible control agreed to participate. In both phases, controls were required to have a biological mother who spoke English or Spanish (Phase II only) and was available by telephone. Initial contact letters were sent to mothers of 270 children from randomly selected BCs, of which 267 were found eligible. A total of 70 mothers completed the interview and one partially completed the interview, giving a total field response rate of 27% (71/267).
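The control response rates quoted for the two phases can be checked with a few lines of arithmetic. The sketch below assumes that the Phase I overall rate is the product of the household screening rate and the field response rate; the reported figures are consistent with that reading, but the text does not state the formula explicitly. Variable names are illustrative.

```python
# Phase I (RDD controls)
phase1_screening = 0.67          # reported household screening rate, taken as given
phase1_field = 254 / 430         # completed interviews / potentially eligible controls
phase1_overall = phase1_screening * phase1_field  # assumed: screening x field

# Phase II (birth registry controls)
phase2_field = 71 / 267          # 70 complete + 1 partial interview / eligible

for label, value in [
    ("Phase I field response", phase1_field),
    ("Phase I overall response", phase1_overall),
    ("Phase II field response", phase2_field),
]:
    print(f"{label}: {value:.0%}")
```

Rounded to whole percentages, these values match the 59%, 40% and 27% reported in the text.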
Information was collected for cases and controls through maternal interview. The maternal interview included questions about pregnancy history, maternal exposures during pregnancy with the participating (index) child, family history of cancer and other diseases, and information about the medical history of the mother. Several questions about infertility and infertility treatment were also asked, including length of time to index pregnancy, history of infertility (more than 1 year of trying without becoming pregnant), history of doctor's visits by mother or index biological father due to non-pregnancy, specific infertility treatment, use of female hormones for ovulation stimulation, and use of female hormones for infertility or conditions related to infertility.
MLL status was determined using the case's file from his or her initial COG institution. Information about molecular or cytogenetic testing for MLL gene rearrangements at the time of diagnosis was collected and reviewed by three independent reviewers. Infants were ultimately classified into three classes: MLL+ by molecular or cytogenetic methods, MLL− by molecular or cytogenetic methods, or not enough information to determine MLL status. A total of 69 cases had unknown MLL status after review.
The institutional review boards at the University of Minnesota and the participating COG institutions approved this study. In addition, health departments for the states providing BCs also reviewed and approved this study. All participants provided informed consent prior to participating in the study.
Exposures of interest in this study included maternal age (continuous), history of recurrent pregnancy loss (2 or more, 1 or none), time to index pregnancy (not trying, <1 year, ≥1 year), specific infertility treatment (medication, surgery, or other) (yes/no), and use of ovulation-stimulating drugs before or during early pregnancy (yes/no).
In addition, a composite infertility variable was constructed based on latent class analysis (LCA), including maternal age, history of recurrent pregnancy loss (2 or more, 1 or none), history of infertility (more than 1 year of trying without becoming pregnant), visit to a doctor by mother or index biological father due to non-pregnancy (yes/no), and use of ovulation-stimulating drugs before or during early pregnancy (yes/no) (Formann and Kohlmann, 1996). This method was used since infertility is difficult to measure and a couple's ‘true’ infertility status is usually unknown. The LCA combines information from many different variables in order to obtain a better measurement of the unknown ‘true’ infertility status. Models with and without maternal age were explored in order to determine whether the effect of infertility operated only through maternal age or whether infertility was a risk factor independent of age.
The analysis used the conditional independence model, which assumes that the observed variables are independent of one another given class membership. This means that once infertility (as defined by the LCA model) is taken into account, the observed variables used to measure infertility are not related to one another. Since more than three observed variables were used, the model was identifiable and no additional constraints were needed. LCA was conducted using M-Plus software (Muthen and Muthen, 1998–2004). Both two- and three-class models were fitted to the data, and model selection relied on the Bayesian information criterion (BIC). Predicted class membership was categorized and used as a predictor in a logistic regression model along with potential confounders.
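To make the conditional independence assumption and the BIC comparison concrete, the following is a minimal sketch of a latent class model fitted by expectation-maximization. It is not the M-Plus procedure used in the study. It assumes that all indicators are binary (a continuous variable such as maternal age would need to be dichotomized or modeled differently), and every function and variable name is hypothetical.

```python
import numpy as np


def log_joint(Y, pi, theta):
    """log P(y_i, class = c) for each subject i and class c.

    Conditional independence: given the class, the item probabilities multiply,
    so their logs add.
    """
    return (np.log(pi)
            + Y @ np.log(theta).T
            + (1 - Y) @ np.log(1 - theta).T)


def fit_lca(Y, n_classes, n_iter=500, seed=0):
    """Fit a latent class model to an (n_subjects x n_items) 0/1 array by EM."""
    rng = np.random.default_rng(seed)
    n, n_items = Y.shape
    pi = np.full(n_classes, 1.0 / n_classes)                    # class prevalences
    theta = rng.uniform(0.25, 0.75, size=(n_classes, n_items))  # P(item = 1 | class)
    for _ in range(n_iter):
        # E-step: posterior class probabilities for each subject
        lj = log_joint(Y, pi, theta)
        post = np.exp(lj - np.logaddexp.reduce(lj, axis=1, keepdims=True))
        # M-step: update prevalences and item probabilities
        weight = post.sum(axis=0) + 1e-12
        pi = weight / weight.sum()
        theta = np.clip((post.T @ Y) / weight[:, None], 1e-6, 1 - 1e-6)
    lj = log_joint(Y, pi, theta)
    marginal = np.logaddexp.reduce(lj, axis=1, keepdims=True)
    loglik = float(marginal.sum())
    n_params = (n_classes - 1) + n_classes * n_items
    bic = float(-2.0 * loglik + n_params * np.log(n))
    return {"pi": pi, "theta": theta, "posterior": np.exp(lj - marginal),
            "loglik": loglik, "bic": bic}


# Simulated data for illustration only (not study data): five binary indicators
# generated from two latent classes.
rng = np.random.default_rng(1)
in_class_a = rng.random(500) < 0.3
item_prob = np.where(in_class_a[:, None], 0.8, 0.1)
Y = (rng.random((500, 5)) < item_prob).astype(float)

fits = {k: fit_lca(Y, k) for k in (2, 3)}
best_k = min(fits, key=lambda k: fits[k]["bic"])               # lower BIC preferred
predicted_class = fits[best_k]["posterior"].argmax(axis=1)     # modal class per subject
```

The `predicted_class` vector plays the role of the categorized class membership that is then entered as a predictor in the logistic regression. EM can converge to local optima, so in practice several random starts would be compared.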
Descriptive methods were used to assess the appropriateness of statistical analysis and the functional forms of the relationship between exposures and outcome. Multivariate models were constructed after considering matching variables as well as confounders including maternal age (continuous), maternal education, maternal race, smoking during pregnancy, household income, gestational age and birthweight. Exposures were included in the logistic analysis if at least four cases and controls were represented in all exposure categories. Birth year was included in all analyses as a matching factor. Results are reported as odds ratios (ORs) and 95% confidence intervals (CIs).
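As a rough illustration of how an OR and its 95% CI are obtained, the sketch below computes an unadjusted OR with a Wald interval from a 2×2 table, together with a minimum cell count check. This is a simplification: the study's estimates came from logistic regression models adjusted for confounders, which this sketch does not reproduce. The function names are hypothetical, and reading the inclusion rule as "every cell of the table reaches the minimum count" is an assumption.

```python
import math


def enough_data(cells, min_count=4):
    """True if every exposure-by-status cell has at least min_count subjects."""
    return all(n >= min_count for n in cells)


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald 95% CI.

    a: exposed cases      b: unexposed cases
    c: exposed controls   d: unexposed controls
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Made-up counts for illustration only (not study data)
table = (10, 20, 5, 40)
if enough_data(table):
    or_, lo, hi = odds_ratio_ci(*table)
    print(f"OR = {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

The `min_count` argument is left as a parameter so that a different threshold can be applied where an analysis calls for one.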
In addition to the combined leukemia analysis, subgroups based on subtype (ALL, AML) and by MLL status (MLL+, MLL−) were examined separately. Each subgroup of cases was compared with the entire control set since there was no basis for selecting a subset of control children and using all of the controls maximized power. Model-based analysis was performed if there were at least two cases and controls in each exposure category within the subgroup. All logistic regression analyses were performed using SAS 9.1 (SAS Institute, Cary, NC, USA).