Case and control DNA samples were obtained under IRB-approved protocols. For the samples from the Israeli Blood Bank specifically, the corresponding institutional review boards and the National Genetic Committee of the Israeli Ministry of Health approved the studies. DNA samples from 963 prostate cancer cases were used in this study. Of these, 885 cases presented at Memorial Sloan-Kettering Cancer Center (MSKCC) with histologically confirmed prostate cancer and reported that all four grandparents were Jewish and from Eastern Europe. An additional 78 cases from Montreal, Canada, for whom both parents were of Ashkenazi ancestry, were included. These patients were treated for prostate cancer at one of the three McGill University-affiliated hospitals: the Royal Victoria and Montreal General Hospital sites of the McGill University Health Center and the Jewish General Hospital. Control DNA was collected from 1854 healthy men in New York and Israel: 419 participants in the New York Cancer Project (Mitchell et al, 2004), 194 samples from the National Laboratory for the Genetics of Israeli Populations (NLGIP) (www.tau.ac.il/medicine/NLGIP/nlgip.htm), and 1241 healthy individuals from the Israeli Blood Bank. All controls self-reported that all four grandparents were of Ashkenazi ancestry. These samples have been previously described (Kirchhoff et al, 2004; Shifman et al, 2008; Tischkowitz et al, 2008). The cases ranged in age from 26 to 94 years (mean 68, s.d. 8.3). The controls ranged in age from 18 to 98 years (mean 46, s.d. 15.2).
A total of 29 SNPs of interest were identified from previous reports of association with prostate cancer risk in recent genome-wide association studies (GWAS) and other studies (). Most were selected because one of the GWAS papers reported them as significantly associated with prostate cancer risk. We also included numerous SNPs at 8q24 that were discovered in follow-up studies of this locus after its initial identification by linkage and association. Finally, we included one SNP (rs7008482), reported as a prostate cancer risk SNP in the African-American population, to test whether it had an effect in the Ashkenazi population despite showing no association in the other European populations studied.
Known prostate cancer risk SNPs successfully tested in this study
Samples from MSKCC, McGill University, the NY Cancer Project, and the NLGIP were genotyped at MSKCC using Sequenom MassARRAY technology. This includes all cases and 31% of the controls. We designed two multiplex assays to genotype all 29 SNPs using Assay Design software (Sequenom, San Diego, CA, USA). PCR amplification and extension were performed using Sequenom iPLEX Gold reagents according to the manufacturer's protocol, and the products were analysed on the Sequenom MassARRAY system (Sequenom). Genotypes were called using the Typer 4.0 software package (Sequenom).
For quality control of the Sequenom data, we first manually inspected the cluster plots. The data were then processed with PLINK (Purcell et al, 2007). In all, 112 individuals with more than 20% missing data were removed. All SNPs had <20% missing data and showed no significant deviation from Hardy–Weinberg equilibrium in controls (P>0.01; ). Six SNPs showed significant differences in genotype call rate between cases and controls (P<0.01; FDR<0.05) and were therefore removed from further consideration.
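As an illustration of the Hardy–Weinberg check described above, the following is a minimal sketch of a 1-d.f. chi-square goodness-of-fit test on genotype counts. It is an assumption for illustration only: the text does not state which test statistic was applied to the Sequenom data, PLINK may use an exact test instead, and the function name `hwe_chi_square` is hypothetical.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Return (chi2, p_value) for a 1-d.f. Hardy-Weinberg goodness-of-fit test.

    n_aa, n_ab, n_bb are observed genotype counts (two homozygote classes
    and the heterozygote class). This is the asymptotic chi-square test,
    not an exact test.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 d.f.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    chi2, p_value = hwe_chi_square(25, 50, 25)
    # A SNP would be flagged if p_value fell below the chosen threshold
    # (0.01 for the Sequenom control data in the text).
    print(chi2, p_value)
```

In this sketch a SNP would be retained when the P-value computed in controls exceeds 0.01, matching the threshold quoted above.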
Genotype counts and deviation from Hardy–Weinberg equilibrium stratified by disease status and source study
The controls from the Israeli Blood Bank were processed separately because they had initially been genotyped genome-wide as part of a separate study unrelated to cancer. These samples were fully anonymised immediately after collection, and genomic DNA was subsequently extracted from the blood samples using the Nucleon kit (GE Healthcare, Piscataway, NJ, USA). The samples were genotyped on Illumina HumanOmni1-Quad arrays (Illumina, San Diego, CA, USA) according to the manufacturer's specifications under protocols approved by the Institutional Review Board of the North Shore-LIJ Health System. SNPs were excluded if they met any of the following criteria: call rate <98%, minor allele frequency <0.02, or Hardy–Weinberg exact test P<0.000001. Samples were filtered for cryptic identity and first- or second-degree relatedness using pairwise identity-by-descent (IBD) estimation (PI_HAT >0.20) in PLINK with 128 403 LD-pruned (r²>0.2) genome-wide SNPs, and for population stratification using principal component analysis with ancestry-informative markers specific for the Ashkenazi Jewish (AJ) population.
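The SNP-level exclusion thresholds listed above can be sketched as a simple filter. This is an illustrative assumption rather than the pipeline actually used: the function names, the 0/1/2 genotype coding with `None` for missing calls, and the treatment of the three criteria as independent exclusions are all hypothetical choices.

```python
def snp_summary(genotypes):
    """Return (call_rate, maf) for one SNP.

    genotypes: list with one entry per sample, coded as the count (0, 1, 2)
    of an arbitrary reference allele, or None for a missing call.
    """
    called = [g for g in genotypes if g is not None]
    if not genotypes or not called:
        raise ValueError("no called genotypes for this SNP")
    call_rate = len(called) / len(genotypes)
    freq = sum(called) / (2 * len(called))
    maf = min(freq, 1.0 - freq)  # minor allele frequency
    return call_rate, maf


def passes_snp_qc(call_rate, maf, hwe_p,
                  min_call_rate=0.98, min_maf=0.02, min_hwe_p=1e-6):
    """Apply the three thresholds quoted in the text; a SNP failing any is excluded."""
    return call_rate >= min_call_rate and maf >= min_maf and hwe_p >= min_hwe_p


if __name__ == "__main__":
    toy = [0] * 90 + [1] * 9 + [None]  # hypothetical genotypes for 100 samples
    call_rate, maf = snp_summary(toy)
    print(call_rate, maf, passes_snp_qc(call_rate, maf, hwe_p=0.5))
```

The Hardy–Weinberg P-value is passed in rather than computed here, since the text specifies an exact test whose implementation is left to the genotype-analysis software.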
Of the 23 SNPs that passed quality control in the Sequenom genotyping, 20 were directly genotyped on the Illumina chip. The remaining three SNPs were analysed using the Sequenom data only. Association analysis was performed in PLINK using logistic regression. Regression was performed twice: once without adjustment for age and once with adjustment for age at diagnosis (cases) or at sample collection (controls). Multiple testing was accounted for by controlling the false discovery rate at 5% using the Benjamini–Hochberg procedure (Benjamini and Hochberg, 1995).
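The Benjamini–Hochberg step-up procedure named above can be sketched in a few lines. This is a generic illustration of the procedure, not the code used in the study; the function name `benjamini_hochberg` and the example P-values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans marking which hypotheses are rejected.

    Step-up rule: find the largest rank k such that the k-th smallest
    P-value is <= (k / m) * fdr, then reject the k smallest P-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            k_max = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical P-values, one per tested SNP
    example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print(benjamini_hochberg(example, fdr=0.05))
```

Because the rule is step-up, a P-value can be rejected even when it exceeds its own rank threshold, provided a larger P-value further down the sorted list meets its threshold.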
To compute the power to detect association for each SNP, we assumed the previously reported odds ratio (OR), the allele frequencies observed in our control population, and a sample size equal to the number of successfully genotyped cases and controls. We used a previously reported method to compute the power at a significance level of 0.05 (Klein, 2007).
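To make the inputs to the power calculation concrete, the sketch below uses a generic normal approximation for a two-sided comparison of allele frequencies between cases and controls. This is a swapped-in textbook approximation, not the cited method of Klein (2007); it assumes a multiplicative (per-allele) model and that the control allele frequency approximates the population frequency. The function names and the example numbers are hypothetical.

```python
from statistics import NormalDist


def case_allele_freq(control_freq, odds_ratio):
    """Allele frequency expected in cases given the control frequency and a per-allele OR."""
    return odds_ratio * control_freq / (1 + control_freq * (odds_ratio - 1))


def allelic_power(control_freq, odds_ratio, n_cases, n_controls, alpha=0.05):
    """Approximate power of a two-sided allelic test (normal approximation)."""
    norm = NormalDist()
    p0 = control_freq
    p1 = case_allele_freq(p0, odds_ratio)
    # Each individual contributes two alleles.
    se = (p1 * (1 - p1) / (2 * n_cases) + p0 * (1 - p0) / (2 * n_controls)) ** 0.5
    z = abs(p1 - p0) / se
    z_crit = norm.inv_cdf(1 - alpha / 2)
    return norm.cdf(z - z_crit) + norm.cdf(-z - z_crit)


if __name__ == "__main__":
    # Hypothetical inputs: control allele frequency 0.3, OR 1.2,
    # 800 genotyped cases and 1500 genotyped controls.
    print(allelic_power(0.3, 1.2, 800, 1500))
```

With an odds ratio of 1 the function returns the significance level itself, which is a quick sanity check on the approximation.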
As a reference population of non-Ashkenazi white Americans, we used the GWAS data from the CGEMS Prostate Cancer GWAS – Stage 1 – PLCO (phs000207.v1.p1) in dbGaP (http://www.ncbi.nlm.nih.gov/gap), removing duplicate individuals. To test for heterogeneity of the OR between the CGEMS data and our data, we used the Breslow–Day test as implemented in PLINK.
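For readers unfamiliar with the Breslow–Day test, the following is a minimal sketch for the two-stratum case (for example, one 2×2 allele-count table per data set). It is an assumption-laden illustration, not PLINK's implementation: it omits the Tarone correction, assumes no zero cells, and uses hypothetical function names and a table layout of (case risk alleles, case other alleles, control risk alleles, control other alleles).

```python
import math


def mantel_haenszel_or(tables):
    """Common odds ratio across strata; each table is (a, b, c, d)."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den


def expected_a(table, psi):
    """Expected count in cell a under common OR psi, with margins held fixed."""
    a, b, c, d = table
    r1, r2, m1 = a + b, c + d, a + c
    if abs(psi - 1.0) < 1e-12:
        return r1 * m1 / (r1 + r2)
    # Solve (1 - psi) A^2 + (r2 - m1 + psi (r1 + m1)) A - psi r1 m1 = 0
    qa = 1.0 - psi
    qb = r2 - m1 + psi * (r1 + m1)
    qc = -psi * r1 * m1
    root = math.sqrt(qb * qb - 4 * qa * qc)
    lo, hi = max(0, m1 - r2), min(r1, m1)
    for cand in ((-qb + root) / (2 * qa), (-qb - root) / (2 * qa)):
        if lo - 1e-9 <= cand <= hi + 1e-9:
            return cand
    raise ValueError("no admissible root for expected count")


def breslow_day_two_strata(table1, table2):
    """Return (common_or, statistic, p_value); the statistic has 1 d.f."""
    tables = [table1, table2]
    psi = mantel_haenszel_or(tables)
    stat = 0.0
    for table in tables:
        a, b, c, d = table
        r1, r2, m1 = a + b, c + d, a + c
        exp_a = expected_a(table, psi)
        cells = (exp_a, r1 - exp_a, m1 - exp_a, r2 - m1 + exp_a)
        var = 1.0 / sum(1.0 / x for x in cells)
        stat += (a - exp_a) ** 2 / var
    p_value = math.erfc(math.sqrt(stat / 2.0))  # chi-square survival, 1 d.f.
    return psi, stat, p_value


if __name__ == "__main__":
    # Hypothetical allele counts for two data sets
    print(breslow_day_two_strata((50, 50, 10, 90), (10, 90, 50, 50)))
```

A small P-value from this test indicates that the odds ratios in the two data sets differ by more than chance would explain; restricting the sketch to two strata keeps the reference distribution at 1 d.f., so no external statistics library is needed.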