There were 195 genes with strong evidence of association with CD risk in the logic-regression gene-level SNP-SNP-interaction analysis of the WTCCC GWAS data; the 40 with the strongest evidence are tabulated (all are shown in Table S1
). Notably, all nine regions of the genome showing strong evidence of association by the single-SNP analysis of WTCCC data12
, as well as seven of the eight regions showing moderate evidence of association, were represented among the 195 genes. Thirty-seven (63%) of the 59 chromosomal locations previously identified by a meta-analysis of single-SNP studies involving over 22,000 cases and 29,000 controls were included in the 195 genes. Also included in the 195 genes that showed strong evidence of association were three genes located in IBD1
(Chr 16q12), two genes in IBD2
(Chr 12q13), six genes in IBD3
(Chr 6p21, HLA region), eight genes in IBD5
(Chr 5q31-33), two genes in IBD6
(Chr 19p13), and one gene in IBD7
(Chr 1p36), all of which are well-established chromosomal regions for CD risk; however, no gene in IBD4
(Chr 14q11-12) was included. In addition, a number of chromosomal regions that did not show strong or moderate evidence of association in the single-SNP analysis of WTCCC nevertheless had three or more genes among the 195, namely 1q32, 2q14, 8p12, 10q22, 10q26, 11p14, and 18q22. These are indicated by green highlighting in the tables. Furthermore, the 195 genes include clusters corresponding to certain gene families. For example, genes associated with phosphoprotein phosphatase activity (e.g., PPM1K
) showed strong evidence of association with CD risk, of which only PTPN2
had been previously implicated.
Forty genes with the strongest evidence for association with Crohn's Disease risk, with chromosomal locations, numbers of SNPs, approximate p-values, and Bayes factors.
Intestine Specific Homeobox (ISX
) was the gene most strongly associated with CD risk in our WTCCC logic-regression-based analysis and represents a new CD susceptibility gene. Homeobox genes encode DNA-binding proteins, many of which are thought to be involved in early embryonic development. ISX
is a transcription factor that regulates gene expression in the intestine 
. The logic structure of ISX
is shown alongside the resulting risk-group frequencies and odds ratios. Based on the genotypes of the SNPs in the two trees, three risk groups emerge: a reference risk group (1540 cases/2562 controls); a low-risk group (1 case/372 controls, estimated odds ratio 0.0045); and a high-risk group (207 cases/2 controls, estimated odds ratio 172.2). Both the low- and high-risk groups are defined by uncommon variants with over 150-fold effect sizes. We compared the allele frequencies of these SNPs with those of the HapMap CEU population as an informal check on the possibility of genotyping errors for the rare variants.
Logic structures, frequencies, and associated Crohn's Disease odds ratios of the ISX gene (p-value < 3.8×10⁻⁶).
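The odds ratios quoted for the ISX risk groups can be checked from the case/control counts alone. The sketch below assumes they are unadjusted odds ratios relative to the reference group (the text does not say whether they come from a fitted model); the function name `odds_ratio` is illustrative and not taken from the study.

```python
def odds_ratio(cases, controls, ref_cases, ref_controls):
    """Odds of being a case in a group, divided by the odds in the reference group."""
    return (cases / controls) / (ref_cases / ref_controls)


# Counts as reported in the text: (cases, controls)
REFERENCE = (1540, 2562)
LOW_RISK = (1, 372)
HIGH_RISK = (207, 2)

low_or = odds_ratio(*LOW_RISK, *REFERENCE)
high_or = odds_ratio(*HIGH_RISK, *REFERENCE)

print(f"low-risk OR:  {low_or:.4f}")
print(f"high-risk OR: {high_or:.1f}")
# A protective effect is read as the reciprocal of the odds ratio
print(f"low-risk fold reduction: {1 / low_or:.0f}")
```

Under this assumption the computed values agree with the reported 0.0045 and 172.2 to the stated precision, and both effects exceed 150-fold in their respective directions. With only 1 case in the low-risk group and 2 controls in the high-risk group, these point estimates are very sensitive to single observations.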
We note that using the WTCCC dataset for discovery and the dbGaP non-Jewish and Jewish datasets for replication is untenable, because of the observed population differences (Figure S1
) and the difference in study power due to the large differences in sample sizes. Another disadvantage is the difference in genotyping platforms between these datasets, including their genotyping error rates and genomic coverage. Nonetheless, we applied the same method of analysis to the dbGaP non-Jewish and Jewish GWAS datasets. Since this analysis focused on the 195 genes with strong evidence of association with CD risk in the WTCCC analysis, the BF threshold for strong evidence at this stage of the analysis is 2.29. We applied this threshold to the larger BF of the two dbGaP GWAS analyses. Seventeen genes showed strong evidence of CD-risk association in both stages of the analysis. Seven of these seventeen genes are located in regions of the genome that showed strong or moderate evidence of association with CD by the single-SNP analysis of WTCCC data 
. Of the remaining ten genes, TMEM183A
are both located on chromosome 1q32. Chromosome 1q32 has been shown to be associated with the risk of ankylosing spondylitis, which is linked to CD 
: this region was not identified by the single-SNP WTCCC or dbGaP analyses. A gene of organic anion transporter, SLCO6A1
, showed strong evidence of association in all three GWASs despite not having been previously implicated in CD risk: this is notable in view of the known association of SLC22A4
), genes of organic cation transporters, with CD risk 
. In addition to SLCO6A1
, three genes (IL23R
, and TMEM183A
) showed consistently strong evidence across the three datasets.
Seventeen genes with strong evidence for association with Crohn's Disease risk in WTCCC and in one or both of the non-Jewish and Jewish dbGaP GWASs, with chromosomal locations, numbers of SNPs, approximate p-values, and Bayes factors.
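The second-stage rule described above (take the larger of the two dbGaP Bayes factors for each gene and compare it with the threshold of 2.29) can be sketched as follows. This is a minimal illustration, not the authors' code: the gene labels and BF values are hypothetical placeholders, the BFs are assumed to be on the same scale as the quoted threshold, and the use of a strict inequality is an assumption.

```python
BF_THRESHOLD = 2.29  # second-stage threshold for strong evidence, as quoted in the text

# Hypothetical Bayes factors for illustration only (NOT values from the study)
dbgap_bf = {
    "GENE_A": {"non_jewish": 3.10, "jewish": 0.40},
    "GENE_B": {"non_jewish": 1.20, "jewish": 2.05},
    "GENE_C": {"non_jewish": 0.90, "jewish": 2.60},
}


def replicated_genes(bf_by_gene, threshold=BF_THRESHOLD):
    """Return {gene: larger BF} for genes whose larger dbGaP BF exceeds the threshold."""
    selected = {}
    for gene, bfs in bf_by_gene.items():
        best = max(bfs.values())  # larger of the non-Jewish and Jewish BFs
        if best > threshold:      # strict inequality is an assumption
            selected[gene] = best
    return selected


result = replicated_genes(dbgap_bf)
print(result)
```

Taking the maximum means a gene counts as replicated if either dbGaP dataset supports it, which matches the caption's "one or both" wording.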