Genome-wide association (GWA) studies provide insight into multigenic diseases through the identification of susceptibility genes and etiological pathways. In addition, identification of shared variants among autoimmune disorders provides insight into common disease pathways. We previously reported association of a nonsynonymous single nucleotide polymorphism (nsSNP) rs763361/Gly307Ser in the immune response gene CD226 on chromosome 18q22 with type 1 diabetes (T1D) susceptibility. Here, we report efforts towards identifying the causal variant by exonic resequencing and tag SNP mapping of the 18q22 region in both T1D and multiple sclerosis (MS). Analysis of newly available samples in T1D (2,088 cases and 3,289 controls) and autoimmune thyroid disease (AITD) (821 cases and 1,920 controls) gave strong support for the Ser307 association with T1D (P = 3.46 × 10−9) and continued potential evidence for AITD (P = 0.0345). In addition, we provide convincing evidence for association of Gly307Ser with MS (P = 4.20 × 10−4) and some evidence for another autoimmune disease, rheumatoid arthritis (RA) (P = 0.017). The Ser307 allele of rs763361 in exon 7 of CD226 predisposes to T1D, MS, possibly AITD and possibly RA, and based on the tag SNP analysis, could be the causal variant.
Type 1 diabetes (T1D), multiple sclerosis (MS), autoimmune thyroid disease (AITD) and rheumatoid arthritis (RA) are organ-specific autoimmune diseases mediated by self-reactive T cells and other cells of the adaptive and innate immune systems. T1D is characterized by inflammation of the pancreatic islets of Langerhans with destruction of insulin-producing β-cells1, while in MS and RA, there is selective white matter or joint tissue destruction, respectively2; 3. Sibling and twin studies indicate a major genetic component of the familial clustering of these common diseases4-6 with the major susceptibility loci in the HLA region7-10. Recent genome-wide association (GWA) studies have successfully identified many non-HLA single nucleotide polymorphisms (SNPs) associated with common disease susceptibility11. In addition to the identification of associated SNPs, these investigations are providing insight into genes and mechanisms shared among autoimmune diseases. Examples include STAT4 in RA and systemic lupus erythematosus (SLE)12, and IL2RA in T1D13, Graves' disease14 and MS15; 16.
We recently reported a GWA study of nsSNPs in T1D that provided strong statistical evidence for association at chromosome 18q2217. In 6,021 T1D cases and 6,088 controls, Gly307Ser, located in CD226, gave P = 2.82 × 10−8 (odds ratio (OR) for the minor allele = 1.16, 95% confidence interval (c.i.) = 1.10-1.22), and in 2,997 parent-child trios it gave P = 0.0281 (relative risk (RR) = 1.08, 95% c.i. = 1.00-1.16). Combining the results from the case-control and family studies yielded P = 1.38 × 10−8.
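As a rough consistency check on figures like these, a two-sided Wald P-value can be approximated from a reported odds ratio and its 95% c.i. The sketch below assumes the interval is symmetric on the log scale; the function name is hypothetical and this is not the stratified logistic regression used in the study. Because the published OR and c.i. are rounded to two decimals, the result should agree with the reported P-value only in order of magnitude.

```python
import math


def p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate a two-sided Wald P-value from an OR and its 95% c.i.

    Assumes the interval was built as log(OR) +/- z_crit * SE, so the
    standard error can be recovered from the width of the log-scale interval.
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided standard normal tail probability: P = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p


# Values quoted in the text for Gly307Ser in the T1D case-control collection.
se, z, p = p_from_or_ci(1.16, 1.10, 1.22)
print(f"SE(log OR) = {se:.4f}, z = {z:.2f}, approximate P = {p:.2e}")
```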
CD226 (also known as DNAX accessory molecule 1, DNAM-1) is a 67 kDa type I membrane protein involved in the adhesion and co-stimulation of T cells18. It belongs to the immunoglobulin supergene family of receptors, containing two Ig-like domains in the extracellular region and is constitutively expressed on the majority of natural killer (NK) cells, CD4+ and CD8+ T cells, monocytes, platelets and a subset of B cells18. Further, in an experimental model of MS, experimental autoimmune encephalomyelitis (EAE), anti-CD226 mAb treatment delayed the onset and reduced the severity of EAE19. The Gly307Ser variant could alter expression or signalling of CD226 as it occurs in the molecule's cytoplasmic tail17. It was, therefore, of interest to explore whether Gly307Ser was the causal variant in the region and a shared risk locus for autoimmune disease.
Here we report an initial fine-mapping study of the 18q22 region in both T1D and MS by means of exonic resequencing and a tag SNP mapping approach based around Gly307Ser. We provide no evidence against the hypothesis that the nonsynonymous SNP (Gly307Ser) is the causal variant in the 18q22 region for MS and T1D. Moreover, we extended this analysis to include RA (and additional AITD samples), suggesting that Ser307 predisposes to a range of human autoimmune diseases.
In order to increase SNP density and detect as-yet-unknown SNPs in the coding region or identify SNPs that may disrupt intron/exon splice sites present in CD226, we resequenced the exonic regions in the roughly 50 kb linkage disequilibrium (LD) block (exons 4, 5, 6, and 7) containing Gly307Ser and 3 kb of 3′ flanking sequence in 32 individuals chosen from the HapMap CEPH collection. This led to 7.7 kb of DNA being resequenced and the identification of 13 SNPs in three exons and the 3′ flanking sequence (exon 5 was not sequenced due to PCR failures, see materials and methods). When compared with the publicly available SNPs in dbSNP build 128, two SNPs were found to be novel polymorphisms. They were located in exon 6: a nsSNP (Ala279Leu, ss102661466) and a synonymous SNP (Gln282Gln, ss102661465) with minor allele frequencies (MAFs) of 0.065 and 0.078, respectively. As these variants are functional candidates, they were genotyped in the T1D case-control collection. The single locus tests provided little evidence of an association with T1D disease susceptibility: Ala279Leu P = 2.08 × 10−3 (OR = 1.15; 95% c.i. 1.05-1.26) and Gln282Gln P = 0.0298 (OR = 0.90; 95% c.i. 0.82-0.99) (Supplementary Information Table 1). Nor did the forward logistic regression analysis adding either novel SNP to Gly307Ser provide evidence (minimum P = 0.0542) of an independent association with T1D susceptibility, whereas Gly307Ser added significantly to both SNPs (minimum P = 8.10 × 10−6), indicating that neither novel SNP is independently associated with T1D susceptibility.
Further, we selected and tested a set of tag SNPs (see materials and methods) to investigate the association previously identified in the 18q22 region. Gly307Ser was still the most associated with T1D in 8,109 cases and 9,377 controls (P = 1.32 × 10−8; OR = 1.13, 95% c.i. 1.08-1.18) (Table 1). We conducted a forward logistic regression analysis testing the addition of each SNP to Gly307Ser and found that none added significantly. There was, therefore, no evidence for a known polymorphism (with a MAF > 0.05 and r2 with Gly307Ser > 0.25) in the CD226 region that showed stronger association with T1D than Gly307Ser or had an independent effect on T1D susceptibility.
We then proceeded to test Gly307Ser in a cohort of MS samples consisting of 1,275 trios, 1,194 USA cases, 595 USA controls, 993 UK cases and 9,377 UK controls (UK controls are the same as in our T1D association study). The combined P-value was 4.20 × 10−4 (Table 2). In order to test the hypothesis that Gly307Ser was the causal variant in MS, we tested the same set of T1D tag SNPs in an extended set of 1,318 MS trios, 1,769 MS US cases, 2,508 US controls, and 1,003 MS UK cases and used the genotyping data already available for 9,377 UK controls. Consistent with our T1D study, we obtained no evidence against the hypothesis that CD226 Gly307Ser is the causal variant associated with MS in the 18q22 region (Table 3).
As Gly307Ser was found to be associated with both T1D and MS susceptibility, it was of interest to examine a collection of RA samples and an additional cohort of autoimmune thyroid disease cases. We tested Gly307Ser in 3,595 RA cases and 3,214 controls and obtained some evidence of association, at P = 0.017 (OR = 1.09; 95% c.i. 1.02-1.16) (Table 2). We obtained no evidence of heterogeneity of association between males and females (P = 0.90), nor between RF positive and RF negative cases (P = 0.86), nor between anti-CCP negative and anti-CCP positive cases (P = 0.45). This suggests that Gly307Ser is associated with RA as a whole rather than with a particular sub-phenotype. Further, adding 821 AITD cases to our previously published data17 (N = 2,958 cases, N = 5,431 controls) we obtained potential evidence for association, at P = 0.0345 (OR = 1.08; 95% c.i. 1.00-1.15) (the controls are the same as those used in our T1D association study but matched geographically) (see Supplementary Table 4 for Graves' disease and Hashimoto's disease results reported separately).
GWA scans in common human autoimmune diseases have recently identified many loci associated with disease susceptibility. Understanding allelic heterogeneity and homogeneity among diseases provides insight into common gene function and pathways. Here, we examined the gene encoding CD226, a molecule expressed on the surface of haematopoietic cells that has independently been implicated in the pathogenesis of animal models of autoimmune diseases. Our resequencing efforts and tagging approach aimed at narrowing the association in the 18q22 region provided no evidence against the hypothesis that CD226 Gly307Ser is the causal variant in T1D and MS. In addition, the International Multiple Sclerosis Genetics Consortium (IMSGC) extended the evidence supporting an association of Gly307Ser with MS (see accompanying short report). In an additional sample of 3,610 MS cases, 324 controls and 1,036 trios, the IMSGC further validated Gly307Ser association with MS (P= 5.4 × 10−8) (Table XX in IMSGC paper, see editorial for complete analysis and overlap between studies). Taken together with our association study of the role of Gly307Ser in a collection of RA cases and additional data for Gly307Ser in AITD (P = 0.0345), we provide initial evidence for the CD226 gene to be shared among at least four common human autoimmune diseases.
We note, however, that until a more complete set of polymorphisms is identified and genotyped in a large collection of cases and control subjects, we cannot exclude another variant in LD with Gly307Ser being the causal variant. Future successful resequencing of exon 5 may provide as yet undiscovered variants that will need to be assessed for disease susceptibility. In addition, the CD226 region may harbour other, independent associations with susceptibility to disease that our tag mapping approach was not designed to identify, as has previously been shown for another T1D susceptibility locus containing the IL2RA gene20, although the data indicate that this is not the case for common variants in the CD226 region.
CD226 is implicated in natural killer cell mediated cytotoxicity as well as Th1 cell mediated immune response18; 19. Phosphorylation of the cytoplasmic tail of CD226 assists in co-localization with LFA-1 and cell activation21. Our genetic association data now justify studies of the functional consequences of the Gly307Ser variant in adaptive and innate immune responses. We have previously hypothesized17 that the SNP could disrupt a splice site enhancer, or silencer, thereby altering RNA splicing, as has been demonstrated for other immune related genes (human CD45 and mouse Ctla4)22, resulting in a putative CD226 isoform that either acts as a non-functional (non-signalling) protein or has a novel function. Alternatively, this amino acid substitution could alter the signalling cascade by affecting the two known phosphorylation sites at positions 322 and 32921; 23, which share a critical role in CD226 and the immune response.
All case and control subjects were of self-reported white ethnicity and were enrolled under study protocols approved by the Institutional Review board of each institution that contributed. Written informed consent was obtained from the participants or their guardians.
T1D cases were recruited as part of the Juvenile Diabetes Research Foundation/Wellcome Trust Diabetes and Inflammation Laboratory's British case collection (Genetic Resource Investigating Diabetes)17. Control samples were obtained from the British 1958 Birth Cohort (B58C), and WTCCC Blood Service controls11. Cases and controls were chosen to be matched geographically.
Healthy adult control subjects were recruited through the Brigham and Women's Hospital, the University of Cambridge and the University of California at San Francisco, as previously described15. All were unrelated individuals with no history of chronic inflammatory disease. MS cases were collected as described in our recent investigation of patients with MS15. All subjects with MS met the McDonald criteria for MS24.
DNA from UK RA patients who were over 18 years old and satisfied the American College of Rheumatology criteria for RA was available from six centres in the UK, and five of these centres provided controls, as described in our recent investigation25. Because the samples were collected from six centres, the results could have been affected by population substructure and heterogeneity. A stratified analysis by centre revealed that the association of Gly307Ser with rheumatoid arthritis was independently observed in 4 of the 5 centres tested (one centre had no controls, so association could not be statistically tested there). No heterogeneity was detected among the samples from the different centres, and the combined evidence across centres (using a Cochran-Mantel-Haenszel test) attained a significance level virtually the same as that from the combined samples.
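The Cochran-Mantel-Haenszel statistic mentioned above can be sketched for a set of per-centre 2 × 2 allele-count tables as follows. This is a minimal illustration, not the analysis code used in the study: the function name is hypothetical, the counts are invented for demonstration only, and no continuity correction is applied (the text does not say whether one was used).

```python
import math


def cmh_test(tables):
    """Cochran-Mantel-Haenszel test over a list of 2x2 tables.

    Each table is (a, b, c, d): a = risk alleles in cases, b = other alleles
    in cases, c = risk alleles in controls, d = other alleles in controls.
    Returns the Mantel-Haenszel common odds ratio, the 1 d.f. chi-square
    statistic (no continuity correction) and its P-value.
    """
    sum_a = sum_expected = sum_var = 0.0
    or_num = or_den = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        sum_a += a
        # Expectation and variance of a under the null, given the margins.
        sum_expected += (a + b) * (a + c) / n
        sum_var += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
        or_num += a * d / n
        or_den += b * c / n
    chi2 = (sum_a - sum_expected) ** 2 / sum_var
    # Upper tail of a chi-square with 1 d.f.: P = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return or_num / or_den, chi2, p_value


# Invented allele counts for five hypothetical centres (illustration only).
centres = [
    (520, 480, 490, 510),
    (610, 590, 580, 620),
    (330, 310, 300, 320),
    (450, 430, 440, 460),
    (280, 260, 270, 290),
]
common_or, chi2, p_value = cmh_test(centres)
print(f"MH common OR = {common_or:.3f}, chi2 = {chi2:.2f}, P = {p_value:.3g}")
```

A centre without controls contributes nothing to the variance term and would have to be left out of the table list, which mirrors the remark in the text that association could not be tested for that centre.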
As previously reported17 as part of the autoimmune thyroid disease (AITD) UK National Collection, 2,958 unrelated individuals were recruited including 2,295 with Graves' disease and 663 with Hashimoto's Thyroiditis. Cases and controls were chosen to be matched geographically.
Polymorphisms were identified by resequencing 32 Centre d'Etude du Polymorphisme Humain (CEPH) DNA samples, which are the same samples used by HapMap (www.hapmap.org). Sequencing was performed using Applied Biosystems' BigDye chemistry (version 3.1) and the sequences resolved using an ABI 3700 Genetic Analyzer. Analyses of the sequence traces were performed using the Staden package, and traces were scored independently by a second operator by hand. Annotations are available from T1DBase (see URLs), together with sequence and polymorphism data at the T1DBase PosterPages (see URL). Primer sequences are available upon request. Due to problems with design of primers to amplify exon 5, this exon could not be successfully sequenced and any as yet undiscovered variants that may reside in this exon are not part of our association analysis.
SNPs were genotyped using the iPLEX™ Sequenom MassARRAY® platform, or TaqMan® (Applied Biosystems) in accordance with the manufacturer's instructions. Cases and controls were genotyped and data scored twice to minimize error, with the second operator being unaware of case-control status or family structure. None of the SNPs deviated significantly from Hardy-Weinberg equilibrium in controls and unaffected parents (P > 0.05) (except for rs17208112, see tag SNP section).
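A Hardy-Weinberg check of the kind described can be illustrated with a 1 d.f. chi-square goodness-of-fit test on genotype counts. The text does not state which test was used, so the choice of the chi-square test (rather than, say, an exact test) is an assumption, and the function name and counts below are hypothetical.

```python
import math


def hwe_chi2(n_aa, n_ab, n_bb):
    """Chi-square test (1 d.f.) for Hardy-Weinberg equilibrium.

    Takes the observed counts of the three genotypes at a polymorphic
    biallelic SNP and returns (chi2, P-value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square with 1 d.f.: P = erfc(sqrt(chi2 / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))


# Invented control genotype counts (illustration only).
chi2, p_value = hwe_chi2(2710, 2640, 650)
print(f"chi2 = {chi2:.3f}, P = {p_value:.3f}")
print("passes the P > 0.05 filter" if p_value > 0.05 else "flag for review")
```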
All statistical analyses were performed in the Stata or R statistical packages (see URLs).
The case-control data were analysed using logistic regression models stratified by 12 geographical regions across England, Scotland and Wales to minimise loss of power due to geography26. When analysing a single SNP, we performed a one degree of freedom (1 d.f.) likelihood ratio test to determine whether a 1 d.f. multiplicative allelic effects model or a 2 d.f. genotypic effects model better fit the data27.
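The 1 d.f. likelihood ratio comparison of the multiplicative and genotypic models can be sketched for a single unstratified 2 × 3 genotype table. This is a simplified illustration under stated assumptions: the regional stratification used in the study is omitted, the function names are hypothetical, the multiplicative model is fitted by a plain Newton-Raphson iteration, and the counts are invented.

```python
import math


def log_likelihood(cases, controls, probs):
    """Binomial log-likelihood given P(case | genotype) for each genotype."""
    return sum(c * math.log(p) + k * math.log(1.0 - p)
               for c, k, p in zip(cases, controls, probs))


def fit_multiplicative(cases, controls, max_iter=50):
    """Fit logit P(case | g) = b0 + b1 * g, g = copies of the minor allele."""
    b0 = b1 = 0.0
    for _ in range(max_iter):
        u0 = u1 = i00 = i01 = i11 = 0.0
        for g, (c, k) in enumerate(zip(cases, controls)):
            n = c + k
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * g)))
            w = n * p * (1.0 - p)
            u0 += c - n * p            # score for the intercept
            u1 += g * (c - n * p)      # score for the allelic effect
            i00 += w
            i01 += g * w
            i11 += g * g * w
        det = i00 * i11 - i01 * i01
        d0 = (i11 * u0 - i01 * u1) / det   # Newton step: inverse info * score
        d1 = (i00 * u1 - i01 * u0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < 1e-12:
            break
    return b0, b1


def allelic_vs_genotypic(cases, controls):
    """1 d.f. likelihood ratio test: multiplicative vs 2 d.f. genotypic model."""
    b0, b1 = fit_multiplicative(cases, controls)
    p_mult = [1.0 / (1.0 + math.exp(-(b0 + b1 * g))) for g in range(3)]
    # The genotypic model is saturated: fitted risks equal observed proportions.
    p_geno = [c / (c + k) for c, k in zip(cases, controls)]
    lr = 2.0 * (log_likelihood(cases, controls, p_geno)
                - log_likelihood(cases, controls, p_mult))
    lr = max(lr, 0.0)  # guard against tiny negative values from rounding
    return b1, lr, math.erfc(math.sqrt(lr / 2.0))


# Invented counts for genotypes with 0, 1 and 2 minor alleles.
b1, lr, p_value = allelic_vs_genotypic([2500, 2600, 900], [2900, 2500, 700])
print(f"per-allele OR = {math.exp(b1):.3f}, LR = {lr:.3f}, P = {p_value:.3f}")
```

A large P-value here means the genotypic model does not fit significantly better, so the simpler multiplicative model would be retained.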
We used forward logistic regression to assess the evidence against the most significant SNP being the sole associated variant in the region (in other words, whether this SNP alone was sufficient to model the association). For the purposes of this analysis, we did not assume any specific mode of inheritance for the most associated SNP (A>a) or for any additional SNP with significant independent effects on T1D, so genotype risks of A/A and A/a were modeled relative to the a/a genotype. We then used a 1-d.f. test for adding each of the remaining SNPs to the model by assuming multiplicative allelic effects for the additional SNPs.
To delimit the disease-associated region and select an informative set of tags, we analyzed the LD (using r2 and D′28) structure of CD226 in DNA samples obtained from 32 individuals from the CEPH collection genotyped by the International HapMap project (www.hapmap.org)29. The tagging strategy involved the selection of SNPs with minor allele frequencies (MAFs) > 0.05 and r2 values > 0.25 with Gly307Ser. Available data allowed for the analysis of 205 SNPs within CD226. Of the 135 SNPs in the LD block, 43 had MAFs > 0.05 and r2 > 0.25 with Gly307Ser. Eleven SNPs were sufficient to pairwise tag this LD block (r2 > 0.80) as determined in HapMap (release 21). If an association were not observed in 5,500 cases and 5,500 controls, the SNP would not be genotyped in additional cases.
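The selection criteria above (MAF > 0.05, r2 > 0.25 with the index SNP, pairwise tagging at r2 > 0.80) can be illustrated on phased haplotypes. The sketch below uses a simple greedy cover as a stand-in; the text does not specify the tagging algorithm, so the greedy procedure, the function names, and the toy haplotypes are all assumptions made for illustration.

```python
def minor_allele_frequency(alleles):
    """MAF of a SNP coded as 0/1 across phased haplotypes."""
    freq = sum(alleles) / len(alleles)
    return min(freq, 1.0 - freq)


def r_squared(a, b):
    """Pairwise LD: r2 = D^2 / (pA(1-pA) pB(1-pB)) from phased haplotypes."""
    n = len(a)
    p_a, p_b = sum(a) / n, sum(b) / n
    p_ab = sum(x & y for x, y in zip(a, b)) / n
    denom = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    return 0.0 if denom == 0 else (p_ab - p_a * p_b) ** 2 / denom


def select_tags(snps, index_snp, min_maf=0.05, min_r2_index=0.25, tag_r2=0.80):
    """Greedy pairwise tagging of SNPs in LD with the index SNP.

    snps maps a SNP name to its list of 0/1 alleles, one per haplotype.
    """
    ref = snps[index_snp]
    pool = [name for name, col in snps.items()
            if minor_allele_frequency(col) > min_maf
            and r_squared(col, ref) > min_r2_index]
    untagged = set(pool)
    tags = []
    while untagged:
        # Pick the SNP that captures the most still-untagged SNPs.
        best = max(pool, key=lambda s: sum(
            r_squared(snps[s], snps[t]) > tag_r2 for t in untagged))
        tags.append(best)
        untagged -= {t for t in untagged
                     if r_squared(snps[best], snps[t]) > tag_r2}
    return tags


# Toy haplotypes (8 chromosomes); SNP names are hypothetical.
toy = {
    "index": [1, 1, 1, 1, 0, 0, 0, 0],
    "proxy": [1, 1, 1, 1, 0, 0, 0, 0],
    "partial": [1, 1, 1, 0, 0, 0, 0, 0],
    "unlinked": [1, 0, 1, 0, 1, 0, 1, 0],
}
print(select_tags(toy, "index"))
```

In this toy set, "unlinked" is excluded because its r2 with the index SNP is 0, "proxy" is captured by "index", and "partial" (r2 = 0.6 with the index SNP) needs its own tag.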
rs17208112, a singleton SNP tagging itself, failed quality control tests in both our T1D and UK MS cohorts due to an adjacent SNP disrupting the binding specificity of the probe and was hence removed from the data set.
This work was funded by the Juvenile Diabetes Research Foundation International and the Wellcome Trust. We gratefully acknowledge the participation of all the patients and control subjects. We acknowledge use of the DNA from the British 1958 Birth Cohort collection, funded by the Medical Research Council and Wellcome Trust. We also thank The Avon Longitudinal Study of Parents and Children laboratory in Bristol and the British 1958 Birth Cohort team, including S. Ring, R. Jones, M. Pembrey, W. McArdle, D. Strachan and P. Burton for preparing and providing the control DNA samples. We thank Cristin Aubin at the Broad Institute. We thank Helen Schuilenburg and Nigel Ovington for data support as well as Oliver Burren for bioinformatics support. DNA samples were prepared by P. Clarke, J. Denesha, D. Harrison, S. Hawkins, M. Himsworth, T. Mistry, N. Taylor, N. Ubani. This study makes use of data generated by the Wellcome Trust Case Control Consortium. A full list of the investigators who contributed to the generation of the data is available from www.wtccc.org.uk.
URLs: British 1958 Birth Cohort: http://www.b58cgene.sgul.ac.uk/; T1DBase: http://t1dbase.org (and UK mirror site, http://dil.t1dbase.org); Stata: http://www.stata.com/; R: http://www.r-project.org/; rpart: http://cran.r-project.org/; Haploview: http://www.broad.mit.edu/mpg/haploview/; gbrowse: http://www.gmod.org/; bioconductor: http://www.bioconductor.org/; dbSNP: http://www.ncbi.nlm.nih.gov/projects/SNP/