The AHS prostate cancer nested case–control study has been described in detail previously (Koutros et al. 2010b). Briefly, eligible cases were white pesticide applicators who a) were diagnosed with prostate cancer between 1993 and 2004 after enrollment in the AHS cohort, b) provided a buccal cell sample, and c) had no previous history of cancer except nonmelanoma skin cancer. Eligible controls were white male applicators in the cohort who a) provided a buccal cell sample, b) had no previous history of cancer except nonmelanoma skin cancer, and c) were alive at the time of case diagnosis. Previous work in the AHS has demonstrated minimal differences with respect to a variety of characteristics between participants who did and did not provide a buccal cell sample (Engel et al. 2002). Controls were frequency matched 2:1 to cases by date of birth (± 1 year). Based on these inclusion criteria, 841 cases and 1,659 controls were identified. As described previously (Koutros et al. 2010b), exclusions because of an insufficient number of available chips (164 controls with the lowest DNA mass), quality control issues [insufficient/poor DNA quality (n = 20) or < 90% completion rate for genotyping assays (n = 88)], or a genetic background that was inconsistent with European ancestry [< 80% European ancestry using STRUCTURE software, version 2.3.3 (n = 3) (Pritchard et al. 2000) or significant deviation from the first two components in principal components analysis (n = 5)] resulted in a final sample size of 776 cases and 1,444 controls. All participants provided written informed consent, and the study was approved by the institutional review boards of all participating institutions.
Information on lifetime use of 50 pesticides was captured in two self-administered questionnaires completed during cohort enrollment (1993–1997). All 2,220 nested case–control study participants completed the first (enrollment) questionnaire, which inquired about ever/never use of the 50 pesticides, as well as duration (years) and frequency (average days per year) of use for a subset of 22 of the pesticides; 1,439 of these men (60.4% of cases and 67.2% of controls) completed the second (take-home) questionnaire, which inquired about use of the remaining 28 pesticides. A previous AHS analysis demonstrated similar characteristics, except for age, between cohort participants who completed the take-home questionnaire and those who did not (Tarone et al. 1997). For each pesticide, we computed total lifetime days of application (number of years × days per year applied) using midpoints of the indicated categories. We also computed an intensity-weighted metric by multiplying the total lifetime days by an intensity score, which was derived from an algorithm based on mixing status, application method, equipment repair, and use of personal protective equipment (Dosemeci et al. 2002) that was recently updated (Coble J, personal communication). For permethrin, we summed exposure variables for crop and animal applications because these were asked about separately. We categorized lifetime days and intensity-weighted lifetime days of application for each pesticide into a three-level, ordinal-valued variable (none/low/high), with low and high categories distinguished by the median among exposed controls. Because of statistical power limitations, we excluded the 10 pesticides with < 10% prevalence among the cases (trichlorfon, ziram, aluminum phosphide, ethylene dibromide, maneb/mancozeb, chlorothalonil, carbon tetrachloride/carbon disulfide, dieldrin, aldicarb, and 2,4,5-trichlorophenoxypropionic acid), leaving 39 available for analysis. All analyses were based on AHS data release version P1REL0712.04 [National Cancer Institute (NCI), Rockville, MD].
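The exposure metrics described above can be illustrated with a minimal Python sketch. The function names, the example values, and the choice to place values equal to the median in the "low" category are assumptions made for illustration; the text does not specify how ties at the median were handled, and the intensity score itself comes from the published algorithm, which is not reproduced here.

```python
from statistics import median


def lifetime_days(years, days_per_year):
    """Total lifetime days of application: years of use x days per year."""
    return years * days_per_year


def intensity_weighted_days(years, days_per_year, intensity_score):
    """Lifetime days multiplied by an externally supplied intensity score."""
    return lifetime_days(years, days_per_year) * intensity_score


def categorize(values, is_control):
    """Map each exposure value to 'none', 'low', or 'high'.

    The low/high cut point is the median among exposed controls.
    Assumption: values equal to the median are assigned to 'low'.
    """
    exposed_controls = [v for v, c in zip(values, is_control) if c and v > 0]
    cut = median(exposed_controls)
    categories = []
    for v in values:
        if v == 0:
            categories.append("none")
        elif v <= cut:
            categories.append("low")
        else:
            categories.append("high")
    return categories
```

For example, with hypothetical values `[0, 10, 20, 30, 40]` and control flags `[True, True, False, True, True]`, the exposed controls are 10, 30, and 40, so the cut point is 30.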
Genotyping and single-nucleotide polymorphism (SNP) selection.
DNA was extracted from buccal cells using the Autopure protocol (Qiagen Inc., Valencia, CA). Genotyping was performed at the NCI Core Genotyping Facility using a custom Infinium® BeadChip assay (iSelect™) from Illumina Inc. (San Diego, CA) as part of an array of 26,512 SNPs in 1,291 candidate genes. Blinded duplicate samples (2%) were included, and SNP concordance ranged from 96% to 100%. Tag SNPs were chosen to cover candidate DNA repair genes for three ancestry populations [Caucasian (CEU), Japanese Tokyo (JPT) + Chinese Beijing (CHB), and Yoruba people of Ibadan, Nigeria (YRI)] in the HapMap Project [data release 20/phase II, National Center for Biotechnology Information (NCBI) Build 36.1 assembly, dbSNP b126 (International HapMap Project 2011)] to allow use of this custom iSelect panel for studies containing different ethnic populations. Tag SNPs were chosen using a modified version of the method described by Carlson et al. (2004) as implemented in the Tagzilla module of the GLU software package, version 1.0b2 (Jacobs 2010). For each candidate gene, SNPs within the region spanning 20 kb 5′ of the start of transcription to 10 kb 3′ of the end of the stop codon were grouped using a binning threshold of r² = 0.80, and one tag SNP per bin was selected. Bins were created for each HapMap population, and the optimal tag SNPs were selected such that all three populations were adequately covered at a minimum binning threshold of r² = 0.80. Select SNPs previously reported as being potentially functional were also included.
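The idea of grouping SNPs into bins at an r² threshold and selecting one tag SNP per bin can be sketched as a simple greedy procedure. This is a simplified, single-population illustration in the spirit of Carlson et al. (2004). It is not the Tagzilla implementation, and it does not attempt the multi-population optimization described above. The function name, the SNP identifiers, and the r² values are hypothetical.

```python
def greedy_ld_bins(snps, r2, threshold=0.80):
    """Greedily group SNPs into LD bins and pick one tag SNP per bin.

    snps: list of SNP identifiers.
    r2: dict mapping frozenset({snp_a, snp_b}) to pairwise r-squared;
        missing pairs are treated as r-squared = 0 (an assumption).
    Returns a list of (tag_snp, bin_members) tuples.
    """
    remaining = list(snps)
    bins = []
    while remaining:
        def partners(s):
            # SNPs still unbinned that are in LD with s at or above threshold
            return [t for t in remaining
                    if t != s and r2.get(frozenset((s, t)), 0.0) >= threshold]

        # The tag is the SNP that captures the most remaining SNPs;
        # ties are broken by input order.
        tag = max(remaining, key=lambda s: len(partners(s)))
        members = [tag] + partners(tag)
        bins.append((tag, members))
        remaining = [s for s in remaining if s not in members]
    return bins
```

With four hypothetical SNPs where `rs2` is in high LD with both `rs1` and `rs3`, the sketch selects `rs2` as the tag for a three-SNP bin and leaves `rs4` as a singleton bin.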
There were 31 BER genes included in the iSelect platform, which were selected based on supplementary information from a review of DNA repair genes (Wood et al. 2005). Of the 698 tag SNPs selected and genotyped for these genes, 626 remained after quality control exclusions (completion rate < 90% or Hardy-Weinberg equilibrium p-value < 1 × 10⁻⁶). We further restricted SNPs to those with a minor allele frequency (MAF) of ≥ 10% among controls because of limited power for interaction assessments with rarer variants, which resulted in 394 SNPs.
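The three SNP filters above (completion rate, Hardy-Weinberg equilibrium, and MAF) can be sketched as follows. The text does not say which Hardy-Weinberg test was used, so the 1 degree of freedom chi-square test here is an assumption, as are the function names and the simplification that all three filters are evaluated on a single set of genotype counts.

```python
import math


def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """P-value from a 1-df chi-square test of Hardy-Weinberg proportions."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom
    return math.erfc(math.sqrt(chi2 / 2))


def minor_allele_frequency(n_aa, n_ab, n_bb):
    """Frequency of the less common allele, from genotype counts."""
    a_count = 2 * n_aa + n_ab
    b_count = 2 * n_bb + n_ab
    return min(a_count, b_count) / (a_count + b_count)


def passes_qc(counts, n_samples, min_completion=0.90,
              hwe_alpha=1e-6, min_maf=0.10):
    """Apply completion-rate, HWE, and MAF filters to one SNP.

    counts: (n_aa, n_ab, n_bb) genotype counts among genotyped samples.
    n_samples: number of samples attempted for this SNP.
    """
    completion = sum(counts) / n_samples
    return (completion >= min_completion
            and hwe_chi2_pvalue(*counts) >= hwe_alpha
            and minor_allele_frequency(*counts) >= min_maf)
```

A SNP with hypothetical counts (250, 500, 250) out of 1,000 attempted samples passes all three filters, whereas counts of (500, 0, 500) fail the Hardy-Weinberg filter because no heterozygotes are observed.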
Statistical analysis. We used unconditional logistic regression models adjusted for age (< 60, 60–69, ≥ 70 years) and state (Iowa or North Carolina) to estimate main effect odds ratios (ORs) and 95% confidence intervals (CIs) for the associations of the 39 pesticides and 394 BER SNPs with prostate cancer risk and to evaluate pesticide × SNP interactions. Adding family history of prostate cancer and ever/never use of the 5 pesticides most highly correlated with a given pesticide did not alter inference, so these variables were not included in the models.
We examined both intensity-weighted and unweighted pesticide exposure variables, and results were similar; therefore, here we present results only for the intensity-weighted variables. For pesticide main effects analysis and interaction analysis, we used the three-level, ordinal-valued pesticide variables. For the tests of trend with pesticide exposure, we created new variables for each pesticide by assigning participants the value of the median intensity-weighted (or unweighted) lifetime days among controls for their respective exposure category (none/low/high). For SNP main effects analysis, we used variables coded as the number of variant alleles (0, 1, or 2), assuming a log-additive genetic model. To test for interaction, we computed p-values from a 1 degree of freedom likelihood ratio test (LRT), using the three-level, ordinal-valued pesticide variables and assuming the dominant genetic model. We used SAS software (version 9.1; SAS Institute Inc., Cary, NC) to estimate ORs for pesticide main effects and stratified effects by genotype, as well as interaction p-values, and PLINK (Purcell et al. 2007) to estimate ORs for SNP main effects. We evaluated interactions between pesticides and haplotypes for SNPs in linkage disequilibrium (LD) blocks within a gene of interest using generalized linear models, assuming the additive genetic model for haplotypes and treating the most common haplotype as the referent group, using the haplostats package in R (Sinnwell and Schaid 2009). Haplotypes with frequency < 1% were collapsed into a single group. We identified LD blocks using Haploview software (Barrett et al. 2005) based on control data and considering tag SNPs with MAF ≥ 1% among controls. We also used Haploview to compute r² values among controls for pairings of SNPs.
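The 1 degree of freedom interaction test can be sketched in a few lines of Python. The sketch assumes the interaction enters the model as a single product term between the ordinal pesticide variable (coded 0, 1, 2) and a dominant-model carrier indicator, which is one coding consistent with a 1 degree of freedom test; the text does not spell out the coding. Model fitting is not shown: the log-likelihoods are taken as given from models fitted with and without the product term, and all names and numbers are hypothetical.

```python
import math


def dominant_carrier(variant_alleles):
    """Dominant genetic model: 1 if at least one variant allele, else 0."""
    return 1 if variant_alleles >= 1 else 0


def interaction_term(pesticide_level, variant_alleles):
    """Product of ordinal exposure (0/1/2) and the carrier indicator."""
    return pesticide_level * dominant_carrier(variant_alleles)


def lrt_pvalue_1df(loglik_reduced, loglik_full):
    """P-value for a likelihood ratio test with 1 degree of freedom.

    loglik_reduced: log-likelihood of the model without the product term.
    loglik_full: log-likelihood of the model with the product term.
    """
    stat = max(0.0, 2.0 * (loglik_full - loglik_reduced))
    # Survival function of a chi-square variable with 1 degree of freedom
    return math.erfc(math.sqrt(stat / 2.0))
```

As a check on the scale, a log-likelihood improvement of 1.92 gives a test statistic of 3.84 and a p-value of approximately 0.05.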
We used SAS to calculate false discovery rate (FDR)-adjusted interaction p-values with the intensity-weighted pesticide variables (Benjamini and Hochberg 1995). We conducted the FDR analysis by gene (number of comparisons = 39 pesticides × number of tag SNPs for gene) to account for the differing numbers of SNPs by gene. Interactions meeting FDR < 0.2 were considered robust to adjustment for multiple comparisons.
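The gene-wise FDR adjustment can be illustrated with a short Python sketch of the Benjamini-Hochberg step-up procedure, applied separately to the interaction p-values for each gene. The analysis itself was run in SAS; this sketch only illustrates the calculation, and the function names, gene labels, and p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def fdr_by_gene(pvalues_by_gene, threshold=0.2):
    """Adjust p-values within each gene and flag those below the threshold.

    pvalues_by_gene: dict mapping a gene label to the list of interaction
    p-values for that gene (pesticides x tag SNPs).
    Returns a dict mapping gene to a list of (adjusted_p, meets_threshold).
    """
    results = {}
    for gene, pvals in pvalues_by_gene.items():
        adjusted = benjamini_hochberg(pvals)
        results[gene] = [(q, q < threshold) for q in adjusted]
    return results
```

Adjusting within each gene means that the number of comparisons for a gene scales with its own number of tag SNPs, so genes with many SNPs do not dilute the adjustment for genes with few.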
We present two sets of results for pesticide × SNP interactions. One set encompasses interactions meeting FDR < 0.2. The second set encompasses interactions with a p-value < 0.01 for both intensity-weighted and unweighted exposure metrics and a significantly increased risk (α = 0.05) of prostate cancer following a monotonic pattern with increasing pesticide exposure in one genotype group and no significant association in the other group. We did not focus on interactions involving increased risk with exposure in one genotype group and decreased risk in the other (sometimes referred to as a qualitative interaction) because the biological basis for such a pattern is unclear and a chance effect of the exposure of interest in one of two population subgroups will force this pattern when there is no main effect of the exposure and no confounding (Weiss 2008