Illumina 610Quad SNP genotypes were obtained for all 66 samples. Before conducting association-based analyses, we subjected the SNP data set to rigorous quality control, excluding samples and SNPs with poor call rates. As mentioned previously, one sample was excluded because its call rate was <95%; the remaining samples had average call rates across all SNPs of >99%. Thus, the final analysis was based on 65 samples. We then critically evaluated the data set for ancestral differences by principal component analysis (Figure 1). Although minor differences were apparent, all individuals genotyped were broadly comparable in ancestry. We therefore considered that the data set could be treated as uniform without introducing significant systematic bias, so as to maximise power to detect important associations under the assumption of homogeneity and an ancestral risk haplotype for RAM. In all analyses we treated individuals with RAM as affected and all other family members as of unknown phenotype.
Figure 1 Principal component analysis of SNP genotypes showing the extent of ethnic variability in the TC cohort. The first two principal components of the analysis were plotted. HapMap CEU (Caucasian) individuals are denoted by grey triangles, CHB (Chinese Han ...
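The sample-level call-rate filter described above can be sketched as follows. This is a minimal illustration only, not the pipeline used in the study: the genotype encoding (0/1/2 with `None` for a failed call), the function names, and the toy data are all assumptions; only the 95% threshold comes from the text.

```python
from typing import Dict, List, Optional, Tuple

# Assumed encoding: 0/1/2 copies of an allele, None for a failed genotype call.
Calls = List[Optional[int]]


def sample_call_rate(calls: Calls) -> float:
    """Fraction of SNPs with a non-missing genotype call for one sample."""
    if not calls:
        return 0.0
    return sum(c is not None for c in calls) / len(calls)


def filter_samples(
    genotypes: Dict[str, Calls], min_call_rate: float = 0.95
) -> Tuple[Dict[str, Calls], List[str]]:
    """Keep samples whose call rate is at least min_call_rate; list the rest."""
    kept: Dict[str, Calls] = {}
    excluded: List[str] = []
    for sample_id, calls in genotypes.items():
        if sample_call_rate(calls) >= min_call_rate:
            kept[sample_id] = calls
        else:
            excluded.append(sample_id)
    return kept, excluded


# Toy data (hypothetical sample IDs, 20 SNPs each).
toy = {
    "S1": [0, 1, 2, 1] * 5,       # every SNP called
    "S2": [0, 1, None, 1] * 5,    # one in four calls missing
}
kept, excluded = filter_samples(toy)
```

In practice a dedicated tool would apply the same rule to the full array; the sketch only makes the threshold logic explicit.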
The median distance between the 575 272 autosomal SNPs in the Illumina 610Quad arrays was ~2.7 Kb and ~88% of the genome was within 10 Kb of a SNP marker. In this study, the heterozygosity of markers was ~94%, hence almost as many SNPs present on that array are heterozygous in this Jewish population as in the general Caucasian population.
We systematically interrogated haplotypes defined by a varying number of SNPs. Haplotypes defined by >12 SNPs proved too computationally intensive to recover on a genomewide basis. We therefore restricted our search for a disease-associated risk locus to 12-SNP haplotypes.
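To make the idea of a 12-SNP haplotype window concrete, the sketch below enumerates windows of consecutive SNPs along one chromosome. It is an assumption-laden illustration: the text does not state whether windows overlapped or how they were placed, so `step` is a hypothetical parameter, and the SNP identifiers are invented.

```python
from typing import Iterator, Sequence, Tuple


def snp_windows(
    snp_ids: Sequence[str], size: int = 12, step: int = 12
) -> Iterator[Tuple[str, ...]]:
    """Yield windows of `size` consecutive SNPs, advancing by `step` SNPs.

    step == size gives non-overlapping windows; step == 1 gives a fully
    sliding window. Trailing SNPs that do not fill a window are skipped.
    """
    for start in range(0, len(snp_ids) - size + 1, step):
        yield tuple(snp_ids[start:start + size])


# 30 hypothetical SNPs ordered by position on one chromosome.
chrom_snps = [f"rs{i}" for i in range(1, 31)]
windows = list(snp_windows(chrom_snps))               # non-overlapping
sliding = list(snp_windows(chrom_snps, step=1))       # fully overlapping
```

The number of windows, and therefore the number of haplotype tests, depends strongly on `step`, which is one reason longer or more densely overlapping haplotypes quickly become expensive genomewide.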
This analysis yielded 41 414 haplotype tests of association with RAM across the genome. In all, 66 haplotype tests provided evidence for an association between genotype and RAM at P<0.001, including multiple haplotypes on chromosomes 18 and 10. The strongest associations were shown at 18q21.1 (P=7.5 × 10⁻⁵), 18q21.32 (P=2.8 × 10⁻⁵) and 10q21.3 (P=1.6 × 10⁻⁴).
Manhattan plot of genomewide haplotype test P-values for the association between haplotypes and RAM. The –log10 P-values (y axis) are presented at their chromosomal positions (x axis).
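The −log10 transformation used on the y axis can be illustrated with the three P-values reported in the text. This is a small sketch for orientation only; the dictionary layout and function name are assumptions, and the locus labels are copied as printed.

```python
import math

# P-values for the strongest associations, as reported in the text.
reported = {
    "18q21.1": 7.5e-5,
    "18q21.32": 2.8e-5,
    "10q21.3": 1.6e-4,
}


def neg_log10(p: float) -> float:
    """Return -log10(p) for a P-value in (0, 1]."""
    if not 0.0 < p <= 1.0:
        raise ValueError("P-value must lie in (0, 1]")
    return -math.log10(p)


# Height of each signal on the plot, and the height of the P = 0.001 cut-off.
heights = {locus: neg_log10(p) for locus, p in reported.items()}
threshold = neg_log10(0.001)

for locus, height in sorted(heights.items(), key=lambda kv: -kv[1]):
    print(f"{locus}: -log10(P) = {height:.2f}")
```

Smaller P-values map to taller points, so the P<0.001 cut-off corresponds to a horizontal line at 3 on the plot.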
Details of haplotypes showing evidence of association with RAM at P<0.001
A number of genes map to the 18q21.1 region of association, including PIAS2, KATNAL2, TCEB3CL, TCEB3C and TCEB3B, whereas the 18q21.31 region is devoid of genes. In contrast to the other associations, the 10q21.3 signal was characterised by a large number of neighbouring haplotype associations, eight of which provided evidence for an association at P<0.001. These haplotypes all mapped within a 2 Mb region of 10q21.3 and all annotate the catenin alpha 3 (CTNNA3) gene. Among the top 66 associations, we identified only two other genes that were annotated by multiple haplotype tests showing evidence for an association at P<0.001. Specifically, TMC1 on 9q21.13 and PCDH15 on 10q21.1 were captured two and three times, respectively, by haplotype associations.
CTNNA3 is part of the Wnt signalling pathway and, although speculative, represents an attractive basis for susceptibility given the role of dysfunctional Wnt signalling in radiosensitivity. In view of this, we explored the possibility that a common or restricted set of coding sequence changes in CTNNA3 might underlie the 10q21.3 association. For completeness we also screened the leucine rich repeat transmembrane neuronal 3 (LRRTM3) gene, which maps within CTNNA3.
Haplotype –log10 P-values for the 10q21.3 region. Beneath the plot are the five isoforms of CTNNA3 and two of LRRTM3 with exons shown as black blocks and UTRs as empty blocks, annotated as per Supplementary Table 2.
Nine coding sequence changes were identified in the same 65 individuals whose DNA passed QC in the genomewide stage. These included five polymorphic variants documented in dbSNP (four in CTNNA3 and one in LRRTM3) and four novel changes (three in CTNNA3 and one in LRRTM3). Seven of the variants identified were missense changes, six in CTNNA3 and one in LRRTM3. None of the missense changes identified was confined to individuals with a RAM phenotype (Supplementary Table 2). Furthermore, none of the sequence changes was predicted to affect the function of the expressed protein.