|Home | About | Journals | Submit | Contact Us | Français|
Study design: E.F.R., F.C., Y.K., M.J.O., B.D.K., D.U., W.C., C.I.A., J.O'S., M.Ga., D.L.K., A.G. Analysis: E.F.R., Y.K., M.J.O., B.Y., W.C., C.I.A., M.B.D., G.R.W., M.Ga., D.L.K., A.G., Sample procurement and data generation: E.F.R., F.C., Y.K., M.J.O., N.A., C.S., J.M.L., B.Y., B.D.K., A.C., O.A., Z.E., H.A., D.U., I.T.-T., G.A.-D., W.C., C.I.A., M.B.D., A.A.K., G.A., B.E., O.J.B., V.G.K., Ph.K., E.B.-C., M.S., F.F., M.Gh., W.E.R.O., Y.-H.C., D.B., J.O'S., G.R.W., M.Ga., D.L.K., A.G. Writing: E.F.R., Y.K., M.J.O., B.Y., C.I.A., J.O'S., M.Ga., D.L.K., A.G.
Behçet's disease (BD) is a genetically complex disease of unknown etiology, characterized by recurrent inflammatory attacks affecting orogenital mucosa, eyes, and skin. We performed a GWAS with 311,459 SNPs in 1215 BD patients and 1278 healthy controls from Turkey. We confirmed the known HLA-B51 association and identified a second, independent association within the MHC Class I region. We also found one SNP with genome-wide evidence for association within the gene encoding interleukin-10 (IL10). A meta-analysis including a total of 2430 cases and 2660 controls established associations with the IL10 variant (rs1518111, P=3.54×10-18, odds ratio 1.45 with 95% confidence interval 1.34-1.58) and with a variant located between the interleukin-23 receptor (IL23R) and interleukin 12 receptor β2 (IL12RB2) genes (rs924080, P=6.69×10-9, odds ratio 1.28 with 95% confidence interval 1.18-1.39). The disease-associated IL10 variant (rs1518111 A allele) was associated with diminished mRNA expression and low protein production, suggesting novel therapeutic targets for BD.
BD is prevalent in Middle Eastern countries with the highest rate (4 in 1000 individuals) found in Turkey. A genetic contribution to BD is supported by the high sibling recurrence risk ratio (λs), estimated from 11.4 to 52.5 in the Turkish population1. HLA-B51 is the most strongly associated known genetic factor2, however it accounts for less than 20% of the genetic risk, even in familial cases3, indicating that genetic factors remain to be discovered. Candidate gene studies and several small GWASs4,5 have examined BD genetics, but the studies have been generally underpowered, making interpretation and replication of the results problematic.
We therefore performed a GWAS of 311,459 autosomal SNPs in 1215 BD cases and 1278 healthy controls from Turkey (Fig. 1, Supplementary Fig. 1, and Supplementary Table 1). Only SNP genotype data that met strict quality control standards (see Online Methods) were included. A principal components method (Online Methods) was used to evaluate population stratification in the cases and the controls. After correction for 6 PCs, λGC, a measure of genomic inflation, was reduced from 1.06 to 1.05 (Supplementary Fig. 2a, b, c). Correcting for 6 PCs in the absence of the strongly associated MHC region SNPs reduced λGC from 1.05 to 1.04 (Supplementary Fig. 2d). Given the minimal degree of population stratification, uncorrected data are presented. P < 5.0 × 10-8 was considered genome-wide significance.
The most significantly associated SNPs (P value < 10-44) were located on chromosome 6 in the MHC region. To evaluate the contribution of the HLA-B51 type to BD, we determined the HLA-B types of 1190 of the cases and 1257 of the controls. Occurrence of the HLA-B51 type (one or two copies) was found in 59.1% of cases and only 29.3% of controls (odds ratio = 3.49 [95% CI = 2.95 to 4.12], P = 5.47 × 10-50). Within the HLA-B region the most significantly associated SNPs were located from telomeric to the HLA-B coding region to centromeric to MICA (encoding MHC Class I chain related sequence A), which has been suggested to be the source of the BD-HLA-B51 association6. We found the HLA-B51 type was more strongly associated with disease than was any genotyped SNP (HLA-B51 allele frequency = 0.352 in cases and 0.159 in controls, P = 1.44 × 10-54, Fig. 2). Strong linkage disequilibrium (LD) was observed between HLA-B51 and all the SNPs located from HLA-B to more than 62 kb centromeric to the MICA gene, despite spanning several blocks of LD (Supplementary Fig. 3). This LD pattern is observed because the HLA-B51 variant is found almost exclusively on a single extended haplotype. This extended haplotype occurred at 0.321 frequency in cases and 0.144 frequency in controls. Interestingly, the identical SNP haplotype, but lacking HLA-B51, occurred in cases and controls equally at 0.04 frequency, suggesting that HLA-B51 is required for the disease association in the HLA-B region.
We next performed a conditional logistic regression analysis of the 292 SNPs from the MHC region with allelic Chi-squared P < 0.0001, specifying HLA-B51 as a covariate. None of the HLA-B/MICA region SNPs remained significantly associated with BD in the conditional analysis (Fig. 2). After accounting for the effect of HLA-B51, three SNPs within the HLA-A gene region with disease-associations retained genome-wide significance. The most strongly associated SNP, rs9260997, located 50 kb centromeric to the HLA-A gene, had a regressor P value = 5.49 × 10-9, suggesting a highly significant association independent of the HLA-B51 association. Coincidentally, an association of markers within the HLA-A region, independent of HLA-B51, was recently described in Japanese BD patients with attribution of the association to the HLA-A26 allele5.
In the association data (Fig. 1) we identified near genome-wide significance for one non-MHC SNP, rs936551 (P = 5.29 ×10-8), located at the telomeric end of the short arm of chromosome 4, within the promoter region of CPLX1, which encodes complexin-1, a regulator of exocytosis during vesicle membrane fusion. Additionally, one SNP located within the first intron of the IL10 gene, rs3024490, was strongly suggestive for association with a P value = 2.22 × 10-7 (Fig. 1). We also identified an additional 66 SNPs from an additional 49 chromosomal regions with suggestive evidence for disease association, P < 0.0001 (Supplementary Table 2).
To better evaluate genetic associations within the CPLX1 and IL10 gene regions, we genotyped the same case and control samples for additional SNPs from the regions predicted by HapMap CEU data to be in LD with the disease-associated SNP. We also fine-mapped the IL23R/IL12RB2 region, which was identified in an independent GWAS of Behçet's disease in the Japanese population7 and within which we found three SNPs with p < 0.0001 (Supplementary Table 2).
In the CPLX1 region, the additional genotyped SNPs increased the estimated coverage (at r2 greater than 0.8) of HapMap Phase II SNPs with minor allele frequency greater than 0.05 from 57% to 93%. The most significantly associated SNPs were located in the promoter region of the gene, although none were more significantly associated than rs936551 (Supplementary Fig. 4).
In the IL10 gene region, we genotyped 27 additional SNPs resulting in coverage of 100% of the HapMap Phase II SNPs with greater than 5% minor allele frequency in the CEU samples. Five SNPs were found strongly associated with BD, with one SNP, rs1518111, with genome-wide significance (P = 1.88 × 10-8, Fig. 3a). The disease-associated SNPs were located in the promoter region and the first, second, and third introns of the gene (Supplementary Fig. 5) and all were in strong LD with one another.
In the IL23R/IL12RB2 region we genotyped 11 additional SNPs increasing coverage from 58% to 88% of the HapMap Phase II SNPs with greater than 5% minor allele frequency in the CEU samples. No SNPs were found with stronger association than rs924080 from the GWAS analysis (Fig. 3b). We identified a recombination hotspot located in the intergenic region between IL23R and IL12RB2 (Fig. 3b), which is also seen in the HapMap Caucasian and Asian populations (data not shown). The BD-associated variants were located on the IL23R side of the hotspot, suggesting the disease association is more likely to influence IL23R than IL12RB2. Variants within the IL23R gene have been associated with ankylosing spondylitis8, psoriasis9, and inflammatory bowel disease10, diseases with some phenotypic overlap with each other and with BD. Neither of two IL23R coding variants associated with these seronegative diseases was associated with BD (Fig. 3b).
We genotyped the BD-associated IL10, IL23R/IL12RB2, and CPLX1 SNPs in five additional collections of BD cases and controls from Turkey, the Middle East, Europe, and Asia (see Table 1 and Online Methods). A surrogate marker (rs11248047) was used in the place of the CPLX1 disease-associated SNP (rs936551), which failed quality control in the replication assay (both markers shown in Supplementary Fig. 4). The results are shown in Table 1. A number of the UK Caucasian and Middle Eastern Arab samples were previously tested for association of two IL10 promoter region SNPs with BD. Although an association with disease was reported in the UK collection, the association had not replicated in the Middle Eastern Arab collection11. The BD-associated IL10 SNP identified in the Turkish population, rs1518111, reached statistical significance after correction for three tests (P < 0.017) in the Middle Eastern Arab and Greek collections and was of nominal significance (P < 0.05) in the UK Caucasian collection (Table 1). For the most part, the IL23R/IL12RB2 and CPLX1 SNPs showed the same trend as found in the Turkish samples; however the associations did not reach statistical significance in these smaller collections. Association of these markers was not observed in the small Turkish replication collection. Although this lack of association could be explained by small sample size, an alternative explanation could be that the markers are associated with severe disease, as we have noted that disease manifestations in the replication collection (drawn from a Dermatology clinic) were milder than in the discovery collection used for the GWAS, which was obtained from a Rheumatology clinic.
While analyzing these data we learned that another group had also performed a GWAS for Behçet's disease7. We exchanged data for cross validation purposes and found that the IL10 and the IL23R/IL12RB2 SNP associations were also strongly supported by the data obtained from the Japanese population, whereas the CPLX1 association was not observed. A meta-analysis of all the association data (including a total of 2430 BD cases and 2660 controls) provided strong evidence for associations of the IL10 and IL23R/IL12RB2 loci with BD (Table 1). Association of the CPLX1 variant failed to replicate in any of the additional collections and failed to reach genome-wide significant association in a meta-analysis of the data. Analysis of even larger numbers of replication samples will be required to establish whether the CPLX1 locus contributes to BD susceptibility.
Several SNP haplotypes of the IL10 gene promoter have been reported to be associated with regulation of the gene's expression, but the reported allelic effects have been inconsistent and difficult to reconcile12-14. We therefore tested for a difference in expression of the gene from the disease-associated haplotype and other haplotypes by measuring allelic imbalance of the rs1518111 variant in pre-mRNA from monocytes isolated from 8 healthy individuals heterozygous for the disease-associated allele. In 8 donors, the pre-mRNA transcript with the rs1518111 A allele was found at reduced levels compared with the G allele. Expression from the A bearing chromosome was 35% of expression from the G bearing chromosome (Fig. 4a).
To assess whether this expression difference was relevant in terms of cytokine production, we activated mononuclear cells from healthy Turkish donors with lipopolysaccharide (LPS) and found significantly lower amounts of IL-10 protein in supernatants from donors homozygous for the rs1518111 A allele compared with individuals with one or two G alleles (Fig. 4b). We also cultured monocytes from ethnically diverse healthy donors from the United States and stimulated them with two Toll-like receptor and nucleotide-binding oligomerization domain containing receptor ligands, the lipoprotein, Pam3Cys and muramyl dipeptide. Individuals homozygous for the BD-associated rs1518111 A allele produced significantly less IL-10 than individuals with one or two G alleles (Fig. 4c), whereas no statistically significant variation was found in TNFα production (data not shown).
Taken together, these data suggest that a genetic predisposition for low IL-10 expression is a risk factor for BD. There are extensive data in mouse models linking reduction in IL-10 with inflammation15. Variants in the IL10 gene region have been shown to be associated with ulcerative colitis (UC)16, type I diabetes17, systemic lupus erythematosus18, and severe juvenile rheumatoid arthritis (JRA)12, and IL-10 receptor mutations were recently shown to cause early onset enterocolitis19. Interestingly, the SNP found in the UC, type 1diabetes, and lupus studies, rs3024505, was not associated with BD (Fig. 3a) and the BD-associated IL10 variants were not associated with these three diseases, suggesting a different IL-10-related disease mechanism in BD compared with these other diseases. On the other hand, the IL10 association with severe JRA is the same as for BD.
Variants from several candidate genes have been examined for association with BD. Although no variants within these genes were found with genome-wide significant association in the Turkish GWAS, 6 of the 12 genes examined contained at least one SNP with nominal significance (P < 0.05, Supplementary Table 3). A much larger sample collection will be required to determine whether any of these genes harbor true BD susceptibility alleles.
In summary, this GWAS provides new insights into the genetic factors that contribute to BD. It not only supports the association of HLA-B51 as the primary MHC association with BD, but also reveals another independent MHC Class I association telomeric to the HLA-B gene. Most importantly, this study demonstrates that common variations in the IL10 gene and in the vicinity of the IL23R/IL12RB2 genes predispose to BD. Functional analyses indicate that the disease-associated IL10 variants cause decreased expression of this anti-inflammatory cytokine, and thereby, possibly in concert with commensal microorganisms20, result in an inflammation-prone state, thus suggesting a mechanistic hypothesis and possible therapeutic targets.
The genotype data for the 311,459 SNPs in 1215 Behçet's disease cases and 1278 healthy controls from Turkey have been deposited in the National Institutes of Health database of genes and phenotypes, dbGaP (http://www.ncbi.nlm.nih.gov/sites/entrez?db=gap), accession number: phs000272.v1.p1
This research was supported by the Intramural Research Program of the National Institute of Arthritis and Musculoskeletal and Skin Diseases of the National Institutes of Health, by the Istanbul University Research Fund, and by the UK Behçet's Syndrome Society. We thank all of our patients for their enthusiastic support of our studies aiming to understand and find better treatments for Behçet's disease, and Ms. Omur Aksakalli and Ms. Hatice Ustek for their help in blood collection.
GWAS samples were collected at the Behçet's Disease Clinic of Istanbul Faculty of Medicine, Division of Rheumatology, one of the largest centers for BD in Turkey, as consecutive patients fulfilling the International Study Group Criteria21. Demographic and clinical characteristics of the patients are given in Supplementary Table 1. Healthy controls were selected among blood bank donors or healthy volunteers to match by geographic location the place of birth of the patients (Supplementary Fig. 1). All healthy controls were questioned for the absence of recurrent oral aphthous ulcers or other BD-related manifestations, or a family history for BD. All study participants provided written informed consent, and the study was approved by the Ethics Committee of the Istanbul Faculty of Medicine.
The replication study samples were from 1) a non-overlapping set of Turkish cases and controls obtained from the Department of Dermatology Behçet's Disease Clinic of the Istanbul Faculty of Medicine, 2) a collection of BD cases and controls of Middle Eastern Arab origin11, 3) a collection of Greek cases and controls22, 4) a collection of UK Caucasian BD cases and controls11, and 5) a collection of Korean BD cases and controls collected at Yonsei University College of Medicine in Seoul, South Korea. All cases fulfilled the International Study Group Criteria.
Blood samples were collected from all participants and genomic DNA was isolated using the MagNA Pure Instrument with the MagNA Pure Compact DNA Isolation Kit for 1 ml volume (Roche, Mannheim, Germany).
All DNA samples were typed using the Infinium assay (Illumina). The first 332 (146 cases and 186 controls) samples were typed on Human CNV370-Duo v1.0 chips. The remaining 2329 samples (1175 cases and 1150 controls) were typed on Human CNV370-Quad v3.0 chips. Samples with high call rates using Illumina-provided HapMap clusters (greater than 99% on Duo chips and greater than 99.5% on Quad chips) were clustered and these cluster locations were used to determine the genotypes of all the samples. All 2657 samples (1321 cases and 1336 controls) had > 95% call rate. SNPs used in the analysis were located on the autosomes, had > 95% call rate, > 1% minor allele frequency, and Hardy Weinberg equilibrium p-value > 0.00001 determined separately with Duo and Quad chip data. There were 335,887 autosomal SNPs on the Duo and 334,556 SNPs on the Quad chips. Filtering removed 9436 SNPs from the Duo and 11,886 SNPs from the Quad chip data. The Duo and Quad data were merged omitting SNPs not present on both chips (14,970 non-overlapping SNPs from the Duo and 11,188 from the Quad chips). The merged, autosomal data included 311,482 SNPs. Heterozygosity of X chromosome SNPs was used to predict the sex of the samples. We removed 19 samples with sex mismatch. PLINK23 was used to identify samples with genetic relatedness indicating identity (or monozygotic twin) or first, second, or third degree relatives. Only one of each pair of identical samples was included; if however the identity could not be confirmed, both samples were removed. Only one of each pair of first to third degree related samples was included, selecting the sample that was the case or the one with the highest call rate. A total of 145 samples were removed for relatedness. The samples used in the genome-wide association analysis included 1215 cases and 1278 controls. The SNP genotypes from these 2493 individuals were refiltered using the same criteria (> 95% call rate, > 1% minor allele frequency, and Hardy-Weinberg equilibrium > 0.00001) resulting in 311,459 SNPs in the final analysis.
The SNP data were analyzed for SNP association using Sequence Variation Suite 7 (SVS7) software (Golden Helix, Bozeman, Montana). A principal components analysis method was used to detect possible stratification present in the cases and the controls. The genomic inflation factor λGC was 1.06 before correction for the top 6 principal components and 1.05 after correction, and without the MHC region markers was 1.05 without correction and 1.04 with correction for 6 PCs (Supplementary Fig. 2).
Additional SNPs were selected for genotyping in the regions of LD surrounding the disease-associated SNPs identified by the genome-wide association analysis. We used the Tagger algorithm in Haploview24 to select SNPs to augment the coverage of the GWAS SNPs to obtain 100% coverage of the HapMap Phase II SNPs with greater than 5% minor allele frequency in the CEU HapMap population with pairwise r2 > 0.8. Genotyping of the same samples used for the GWAS was performed using the Sequenom iPLEX gold method (Sequenom, San Diego, California). Case and control DNAs were run at the same time and genotypes determined using Typer 4.0 software (Sequenom). Genotype clusters were edited from the combined case and control data without knowledge of the case/control status of individual samples. Of the 30, 13, and 47 SNPs added respectively to the IL10, IL23R, and CPLX1 regions, 26, 11, and 30 SNPs passed quality control. The concordance rate for SNPs genotyped on the Illumina chip and Sequenom iPLEX platforms was 99.83%.
Replication samples were genotyped for rs1518111, rs924080, and rs11248047 with TaqMan® SNP Genotyping Assays (Applied Biosystems, Foster City, California) using the manufacturer's protocol and recommended quality control measures. Genotype clusters were edited from the combined case and control data without knowledge of the case/control status of individual samples.
HLA-B types were determined from case and control DNA samples using a reverse sequence-specific oligonucleotide (SSO) method (One Lambda, Canoga Park, California) and a Bio-Plex 200 suspension array system (BioRad, Hercules, California). Samples were scored for 0, 1, or 2 HLA-B51 alleles.
Peripheral blood mononuclear cells (PBMCs) were isolated from whole blood by centrifugation through Ficoll-Hypaque solution (Biocall Separation solution cat: L6115, BioChrome AG, Berlin, Germany) and washed twice in phosphate buffered saline (PBS). Cells were resuspended in culture medium at a concentration of 1 × 106 cells per ml. Cells were cultured in complete RPMI 1640 supplemented with 10% heat inactivated fetal bovine serum (cat. SH3008803, Thermo Scientific, Waltham, Massachusetts), 2 mM L-glutamine, penicillin (100 U/mL), streptomycin (100 ug/mL) (cat. 15670-063, Invitrogen) at 37°C in a humidified atmosphere containing 5% CO2 (Thermo Scientific). Monocytes were obtained from PBMCs by negative selection (Monocyte Isolation Kit II; Miltenyi Biotec, Gladbach, Grmany). The monocytes were washed and cultured at 2.5 × 105 cells per ml in RPMI 1640 containing 5% FBS.
A genotype assay for rs1518111 (C___8828803_1_, Applied Biosystems, Foster City, California), an intronic SNP within the IL10 gene, was used in a quantitative expression assay to compare the relative expression of the disease-associated and the other copy of the IL10 gene in pre-mRNA from heterozygous carriers of the disease-associated haplotype. Monocytes were isolated from heterozygous carriers of the IL10 disease-associated haplotype. Total RNA was treated with DNase I (Invitrogen) prior to synthesis of cDNA, which was used in the assay for relative expression of the two copies of the gene. The expression data were analyzed in triplicate and corrected for the detection difference of the two alleles using genomic DNA from each donor. The relative expression of the disease-associated allele compared with other alleles was determined using the ΔΔCt method:
Relative expression of the A allele compared with the G allele = 2-ΔΔCt
ΔΔCt = (Avg FAM (A) Ct cDNA – Avg FAM (A) Ct gDNA) - (Avg VIC (G) Ct cDNA – Avg VIC (G) Ct gDNA).
The significance of the difference in expression of the A allele relative to the G allele was determined with a Wilcoxon matched pairs signed rank test with P < 0.05 considered significant.
For LPS-stimulated PBMCs, LPS (E. coli, Serotype O26:B6, cat. L-3274, Sigma Aldrich, St. Louis, Missouri) was added to the cells (1 μg/106 cells) and incubated 24 hours. Supernatants were collected, and IL-10 concentration was analyzed by enzyme-linked immunosorbent assay (ELISA) with a commercially available kit according to manufacturer's protocol (Human IL-10 ELISA Kit, cat. KHC0101, Invitrogen). Absorbance was measured at 450 nm in a DTX 800 Multimode Detector (Beckman Coulter, Brea, California). The assay detection limit was 7.8 pg/ml.
For MDP + Pam3Cys-stimulated monocyte experiments, cells were stimulated for 24 hours with medium, or medium supplemented with MDP (10 μg/ml; Sigma Aldrich) and Pam3Cys (1 μg/ml; InvivoGen, San Diego, California). Supernatants were collected and concentrations of IL-10 and TNF-α were measured with the Bio-Plex Pro Human Cytokine x-plex assay. Data were collected using the Bio-Plex 200 system according to the manufacturer's protocol (Bio-Rad).
A Mann-Whitney test was used to test for significant differences among genotypes with P < 0.05 considered significant. For LPS-stimulated PBMC, 11 homozygotes for the A allele, 13 heterozygotes, and 14 homozygotes for the G allele were used. For the MDP + Pam3Cys stimulated monocytes, 11 homozygotes for the A allele, 18 heterozygotes, and 18 homozygotes for the G allele were used.
None of the authors have competing financial interests.