|Home | About | Journals | Submit | Contact Us | Français|
In previous linkage and genome-wide association studies we identified 17 susceptibility loci for generalized vitiligo. By a second genome-wide association study, meta-analysis, and independent replication study, we have now identified 13 additional vitiligo-associated loci, including OCA2-HERC2, a region of 16q24.3 containing MC1R, a region of chromosome 11q21 near TYR, several immunoregulatory loci including IFIH1, CD80, CLNK, BACH2, SLA, CASP7, CD44, IKZF4, SH2B3, and a region of 22q13.2 where the causal gene remains uncertain. Functional pathway analysis shows that most vitiligo susceptibility loci encode immunoregulatory proteins or melanocyte components that likely mediate immune targeting and genetic relationships among vitiligo, malignant melanoma, and normal variation of eye, skin, and hair color.
In generalized vitiligo patches of depigmented skin and hair result from autoimmune destruction of epidermal melanocytes1, epidemiologically associated with other autoimmune diseases2. In previous linkage analyses and a genome-wide association study (GWAS1; Supplementary Table 1), we identified 14 confirmed and 3 suggestive vitiligo susceptibility loci3–5 in persons of non-Hispanic European (EUR) ancestry. Most encode immunoregulatory proteins, and several are associated with other autoimmune diseases. However, one, TYR, encodes tyrosinase, a key enzyme of melanin biosynthesis that likely mediates immune targeting of melanocytes. Causal variation at TYR is inversely associated with vitiligo and malignant melanoma6, suggesting vitiligo may be related to immune surveillance of melanoma7.
To identify additional vitiligo susceptibility loci, we performed a second GWAS (GWAS2; Supplementary Table 1), with meta-analysis of GWAS1 and GWAS2 (GWAS-MA) to enhance statistical power. GWAS2 included 450 EUR generalized vitiligo cases and genotype data from 3,182 EUR controls from the database of Genotypes and Phenotypes (dbGaP). The GWAS-MA demonstrated improved significance of almost all significant loci from GWAS1 (Supplementary Table 2) and suggestive association (P < 10−4 for multiple SNPs across a region) of 24 novel loci (Supplementary Figure 1 and Supplementary Table 3), of which six achieved genome-wide significance (P < 5 × 10−8) (Table 1). At all 24 novel loci, we imputed genotypes using 1000 Genomes Project data, performed logistic regression to identify independent association signals, and genotyped the most significant SNPs at each locus (Supplementary Table 3) in an independent replication cohort of 1440 EUR cases and 1316 EUR controls. We then performed overall meta-analysis of GWAS1, GWAS2, and the replication study, using conservative criteria for confirming association: (i) nominal association in the replication study (P < 0.05), (ii) consistent high-risk alleles in GWAS1, GWAS2, and the replication study, (iii) non-significant heterogeneity across all three studies (P > 1.09 × 10−3), and (iv) overall combined genome-wide significance (P < 5 × 10−8).
As shown in Table 1, we confirmed association of vitiligo with 13 novel loci. Among the most interesting, at chromosome 15q12–q13.1 the GWAS-MA showed suggestive association of SNPs (nt 27886016–28392261) spanning OCA2 upstream to within HERC2 (Fig. 1), especially rs12913832 (P = 3.29 × 10−7) and imputed SNP rs1129038 (P = 3.23 × 10−7) (r2 = 0.99). OCA2 is causal for oculocutaneous albinism type 2, encodes a melanosomal membrane transporter8, and plays a major role in determining skin, hair, and eye color. The replication study and overall meta-analysis confirmed association of both rs1129038 (P = 3.91 × 10−8, OR 1.22) and rs12913832 (P = 3.81 × 10−8, OR 1.22) (Table 1). Furthermore, the SNP alleles that are low-risk for vitiligo are strongly associated with gray/blue eye color9–11 and with elevated risk of malignant melanoma12,13, tagging a founder variant within HERC2 that down-regulates transcription of the OCA2 allele in cis11,14. OCA2 is thus analogous to TYR: both encode melanocyte antigens presented by HLA-A*0215,16, for both vitiligo protection is associated with reduced functional protein, and for both susceptibility to vitiligo and melanoma constitute genetic opposites7, perhaps modulating immune surveillance for melanoma (Supplementary Fig. 2). Furthermore, we predict that gray/blue eye color should be under-represented, and tan/brown eye color over-represented, among vitiligo patients. To test this, we surveyed 1206 EUR vitiligo patients, confirming both predictions: among vitiligo patients the prevalence of gray/blue eye color (26.8%) was greatly reduced and tan/brown eye color (43.2%) greatly elevated compared to both USA17 (P < 0.0001) and Australian18 (P < 0.0001) EUR individuals; Table 2). Compared to persons with gray/blue eye color, the OR for vitiligo was 2.98 in persons with tan/brown eye color and 2.25 in persons with green/hazel eye color, indicating additional eye color genes besides OCA2 constitute risk loci for vitiligo, and indeed TYR is associated both with vitiligo3 and with green/hazel eye color19.
At chromosome 16q24.3, the GWAS-MA showed complex association of SNPs spanning nt 89647951–90078022, particularly rs8049897 (P = 2.03 × 10−7) and imputed SNPs rs9926296 (P = 4.34 × 10−11) and rs4785587 (P = 1.08 × 10−8) (Supplementary Fig. 3a), confirmed by the replication study and overall meta-analysis (rs9926296 P = 1.82 × 10−13, OR 0.79). The associated region contains 20 genes, notably including MC1R, encoding the melanocortin receptor, a regulator of melanogenesis and minor vitiligo autoantigen, associated with malignant melanoma and with skin and hair color20.
At 11q21, the GWAS-MA showed association with rs4409785 (nt 95311422) (P = 2.26 × 10−10) and imputed SNP rs11021232 (P = 9.20 × 10−10) (Supplementary Fig. 3b), confirmed by the replication study and overall meta-analysis (rs4409785 P = 1.57 × 10−13, OR 1.34). These SNPs are located in a 559 kb region containing no known genes, approximately 6.28 Mb distal to TYR. These SNPs are not in linkage disequilibrium with TYR SNPs (r2=0), and remain highly significant when conditioned on common causal TYR SNPs rs1042602 and rs1126809. We speculate this region might harbor a regulatory element affecting TYR transcription in cis.
Besides OCA2, MC1R, and the chromosome 11q21 locus, most vitiligo-associated loci encode immunoregulatory proteins. At chromosome 2q24.2, the GWAS-MA showed association with SNPs spanning nt 163076146–163154363, between IFIH1 and FAP, particularly rs2111485 (P = 1.67 × 10−10) (Supplementary Fig. 3c), confirmed by the replication study and overall meta-analysis (rs2111485 P = 4.91 × 10−15, OR 0.77). IFIH1 encodes an interferon-induced RNA helicase involved in antiviral innate immune responses21, associated with type 1 diabetes22, Graves' disease23, multiple sclerosis24, psoriasis25, and perhaps lupus26.
At 3q13.33, the GWAS-MA showed suggestive association of SNPs (nt 119276377–119197379) spanning and upstream of CD80, particularly rs4330287 and imputed SNP rs59374417 (both P = 3.97 × 10−7; r2 = 1.0) (Supplementary Fig. 3d), confirmed by the replication study and overall meta-analysis (rs59374417 P = 3.78 × 10−10, OR 1.34). CD80 is a surface protein on activated B-cells, monocytes, and dendritic cells that co-stimulates T cell priming27,28.
At 4p16.1, the GWAS-MA showed suggestive association of SNPs (nt 10702156–10729386) upstream of CLNK, including rs16872571 (P = 2.50 × 10−7) and several imputed SNPs, particularly rs11940117 (P = 9.00 × 10−8) (Supplementary Fig. 3e), confirmed by the replication study and overall meta-analysis (rs16872572 P = 1.56 × 10−8, OR 1.21). CLNK encodes mast cell immunoreceptor signal transducer, a positive regulator of immunoreceptor signaling29.
At 6q15, the GWAS-MA showed suggestive association of SNPs (nt 90941239-91915693) spanning BACH2, particularly rs3757247 (P = 2.14 × 10−5) (Supplementary Fig. 3f), confirmed by the replication study and overall meta-analysis (P = 2.53 × 10−8, OR 1.20). BACH2 encodes a transcriptional repressor of B cells30, and is associated with type 1 diabetes31,32, celiac disease33, and Crohn's disease34.
At 8q24.22, the GWAS-MA showed suggestive association of SNPs (nt 133929917-133979872) spanning the TG/SLA locus, at which the two genes are interdigitated and encoded on the opposite strands. Most significant were rs853308 (P = 1.14 × 10−6) and several imputed SNPs (Supplementary Fig. 3g), confirmed by the replication study and overall meta-analysis (rs853308 P = 1.58 × 10−8, OR 1.20). TG encodes thyroglobulin, while SLA encodes Src-like adaptor protein, a regulator of antigen receptor signaling35. TG is associated with autoimmune thyroid disease36, which affects approximately 17 percent of vitiligo patients2, suggesting association of rs853308 with vitiligo might derive from patients with concomitant autoimmune thyroid disease. However, stratification showed association of rs853308 with vitiligo derives both from patients with (P = 2.43 × 10−3) and without (P = 3.98 × 10−7) autoimmune thyroid disease, with virtually identical ORs in the two subgroups (1.20 and 1.19, respectively). Moreover, directly comparing rs853308 in vitiligo patients with (N = 608) versus without (N = 2579) autoimmune thyroid disease showed no difference (P = 0.94, OR 1.01). It is not apparent what role thyroglobulin might play in vitiligo pathogenesis, suggesting association of vitiligo with the TG/SLA locus may derive from SLA, rather than TG. SLA might likewise account for reported association of TG with autoimmune thyroid disease36.
At 10q25.3, the GWAS-MA showed suggestive association of SNPs (nt 115439530-115492092) spanning CASP7, particularly rs3814231 (P = 1.20 × 10−5) (Supplementary Fig. 3h), confirmed by the replication study and overall meta-analysis (P = 3.56 × 10−8, OR 0.81). CASP7 encodes caspase 7, an executioner protein of apoptosis and inflammation37, associated with rheumatoid arthritis38, and suggested as a candidate gene for IDDM17 in type 1 diabetes39.
At 11p13, the GWAS-MA showed association of SNPs (nt 35242907-35375280) spanning portions of CD44 and SLC1A2, particularly rs736374 (P = 3.06 × 10−8) and rs10768122 (P = 6.13 × 10−8) (Supplementary Fig. 3i), confirmed by the replication study and overall meta-analysis (rs10768122 P = 1.78 × 10−9, OR 1.21). CD44 encodes a cell surface glycoprotein with various functions, including a role in T cell development40, and is associated with lupus41.
At 12q13.2, the GWAS-MA showed association with SNPs (nt 56369506-56535251) in a region including IKZF4 (Supplementary Fig. 3j), particularly rs1701704 (P = 1.53 × 10−9) and imputed SNP rs2456973 (P = 1.22 × 10−9), confirmed by the replication study and overall meta-analysis (rs2456983 P = 2.75 × 10−14, OR 1.29). IKZF4 encodes a regulator of T cell activation42, and is associated with type 1 diabetes43 and alopecia areata44.
At 12q24.12, the GWAS-MA showed association with SNPs (nt 111708458-112906415) within and near SH2B3, particularly rs3184504 (P = 1.32 × 10−11) and imputed SNP rs4766578 (P = 9.10 × 10−12), located downstream, within ATXN2 (Supplementary Fig. 3k). Many SNPs in this region achieved genome-wide significance, but logistic regression analysis indicated all reflect a single association signal. The replication study and overall meta-analysis confirmed association of both rs3184504 (P = 2.46 × 10−17, OR 0.76) and rs4766578 (P = 3.54 × 10−18, OR 0.76). ATXN encodes Ataxin-2, and is causal for spinocerebellar ataxia type 2. SH2B3 encodes adaptor protein LNK, regulating development of both B and T cells45, and associated with type 1 diabetes46, celiac disease47, rheumatoid arthritis48, multiple sclerosis49, and perhaps lupus25. SH2B3 thus seems more likely relevant to vitiligo susceptibility than ATXN2.
At 22q13.2, the GWAS-MA showed suggestive association with SNPs in a broad region (nt 41707054-42062822), particularly rs79008 (P = 1.44 × 10−6), upstream of TOB2, and several imputed SNPs, including rs4822024 (P = 1.02 × 10−7), between ZC3H7B and TEF (Supplementary Fig. 3l). The replication study and overall meta-analysis confirmed association of both SNPs, greatest with rs4822024 (P = 6.81 × 10−10, OR 0.78). TOB2 encodes a regulator of cell cycle progression involved in T cell tolerance50. However, the locus contains 12 genes, and assignment of TOB2 as causal remains uncertain.
Besides these 13 confirmed vitiligo-associated loci, an additional locus at 19p13.3 did not quite achieve genome-wide significance. The GWAS-MA showed suggestive association of SNPs (nt 4830628-4837557) spanning TICAM1, particularly rs6510827 (P = 6.98 × 10−6) and imputed SNP rs811383825 (P = 1.49 × 10−5) (Supplementary Fig. 3m), with association with rs6510827 confirmed by the replication study and near-significance in the overall meta-analysis (P = 8.80 × 10−8, OR 1.19). TICAM1 encodes toll-like receptor adaptor molecule 1, which mediates innate immune responses to viral pathogens51. SNPs at ten additional loci that appeared suggestive in the GWAS-MA were not confirmed by the replication study (Supplementary Table 3).
Altogether, the vitiligo susceptibility loci we identified account for approximately 10% of vitiligo risk in EUR individuals, and about 18% of vitiligo heritability (h2 ~ 0.75). Most encode immunoregulatory proteins or melanocyte proteins that likely mediate antigenic triggering and immune targeting of melanocytes, and bioinformatic network analysis indicated at least 26 comprise a functional network spanning from the melanocyte to the immune system (Supplementary Fig. 4). Many vitiligo susceptibility loci are shared with other autoimmune diseases, most sharing the same high-risk alleles, consistent with epidemiological associations among these diseases (Supplementary Fig. 5). Functional prediction of all genotyped and imputed missense and splice junction variants in all confirmed non-MHC vitiligo loci (Supplementary Table 4) identified predicted deleterious variants at PTPN22, IFIH1, SLA, CD44, TYR, OCA2, MC1R, UBASH3A, and C1QTNF6, and for TYR two common variants, S192Y and R402Q, confer protection from vitiligo, whereas HLA-A *02:01 confers risk6. Additional vitiligo susceptibility loci undoubtedly remain undiscovered; nevertheless, the genes and pathways already identified provide insights into vitiligo pathobiology, optimal use of existing treatments, and even novel therapeutics.
GWAS1 has been described previously3. GWAS2 included 450 unrelated generalized vitiligo patients (cases) of non-Hispanic/Latino European ancestry (EUR) from North America and Europe, who met strict clinical criteria for generalized vitiligo52. Controls for GWAS2 were 3182 EUR individuals not specifically known to have any autoimmune disease or malignant melanoma, for whom genome-wide genotypes were obtained from the database of Genotypes and Phenotypes (dbGaP; phs000092v1, phs000125v1, phs000138v2, phs000168v1, and phs000206v3), or from the Illumina iControlDB. The replication study included 1440 unrelated EUR generalized vitiligo cases and 1316 unrelated EUR controls from North America and Europe, principally spouses of vitiligo patients from the GWAS2 study and the replication study itself. There was no overlap of cases and controls between the GWAS1, GWAS2, and replication cohorts. Cases and controls provided clinical history regarding vitiligo and other autoimmune diseases as described previously for GWAS13, and controls having known relatives with vitiligo or reporting any known autoimmune diseases or melanoma were excluded. Eye color was by self-report. Written informed consent was obtained from all study subjects. This study was approved by each institutional review board and was conducted according to Declaration of Helsinki principles.
Genomic DNA was prepared from saliva specimens using a DNA self-collection kit per the manufacturer's instructions (Oragene, DNA Genotek). For genome-wide genotyping, DNA concentrations were assayed by both fluorescence staining (PicoGreen method, Invitrogen) and ultraviolet A260 spectrophotometry (Nanodrop, Thermo Scientific). For genotyping specific SNPs in the replication study, DNA concentrations were assayed by Nanodrop.
Genome-wide genotyping for GWAS2 cases was performed for 657,366 SNPs using Illumina Human660W-Quad BeadChips per the manufacturer's instructions. GWAS2 controls obtained from dbGaP and Illumina iControlDB had been genotyped using Illumina Human 610-Quad and Human 1Mv1 BeadChips. Genotyping of 46 SNPs in the replication study was performed using a custom Illumina GoldenGate assay per the manufacturer's instructions.
Quality control filtering of genome-wide genotype data in GWAS2 was carried out as described for GWAS13 using Illumina GenomeStudio, version 3 and PLINK53, version 1.07. Cases were excluded on the basis of SNP call rates < 98.5% (N = 16), discordance between reported and observed sex (N = 0), and/or inadvertent subject duplication (N = 0). Beyond prior quality control procedures, controls were excluded on the basis of SNP call rates < 95% (N = 0). Additional cases (N = 6) and controls (N = 356) were excluded on the basis of cryptic relatedness based on pair-wise identity-by-descent estimation (Pi-hat > 0.05) of the entire (GWAS1 + GWAS2) summary dataset, in which case the individual with lower SNP call rate was excluded. SNPs were excluded on the basis of genotype missing rate ≥ 2% overall (N = 97,301) in cases plus controls, observed minor allele frequency < 0.01 (N = 23,064), significant deviation (P < 10−5) from Hardy-Weinberg equilibrium in the control dataset (N = 5,241), and/or significant difference (P < 10−5) in genotype missing rate in cases versus controls (N = 35,939). Data for SNPs with P values < 10−15 were reviewed with respect to genotype clusters and allele calls in cases, and minor allele frequency in controls compared to data in public data sources, and were excluded if there were apparent data quality problems (N = 2). To control for population stratification, we performed principle components analysis using EIGENSOFT54, version 4.2, and excluded as outliers cases (N = 10) and controls (N = 16) whose ancestry was ≥ 6 standard deviations from the mean on one of the top ten eigenvectors.
In the replication study we successfully determined genotypes for 46 SNPs that showed suggestive (P < 10−4) or significant P values in the genome-wide meta-analysis of GWAS1 and GWAS2, using a custom Illumina GoldenGate assay. SNP genotypes were subjected to quality control filters similar to those in GWAS1 and GWAS2, as appropriate.
For GWAS2, after quality control filtering of subjects and SNP data and removal of genetic outliers, we compared allele frequencies of the remaining 495,821 SNPs in the final 418 cases and 2810 controls using the unadjusted Cochran-Armitage trend test implemented in PLINK53 and the adjusted Cochran-Armitage trend test implemented in EIGENSOFT54, in which both phenotypes and genotypes of subjects were adjusted for ancestry using the top ten eigenvectors. The unadjusted Cochran-Armitage trend tests and the adjusted Cochran-Armitage trend tests yielded genomic inflation factors of 1.054 and 1.050, respectively, indicating that residual population stratification was negligible. Odds ratios and 95% confidence limits were calculated by logistic regression analysis by use of PLINK53.
To fully assess association at the 24 novel suggestive loci, we imputed genotypes for each locus using MaCH55, ver.1.0 based on patterns of haplotype variation in the 1000 Genomes Project European ancestry samples (Release Aug 4, 2010). We retained imputed SNPs with r2 > 0.3 and minor allele frequency > 0.01 for further analyses.
For the replication study, after quality control filtering, we compared allele frequencies for genotyped SNPs in the remaining 1377 patients and 1284 controls using the Cochran-Armitage trend test. Odds ratios and 95% confidence limits were calculated by logistic regression analysis.
To obtain combined ORs and P values for GWAS1 and GWAS2, and for the combined GWAS1, GWAS2, and replication studies, we performed meta-analysis using a Cochran-Mantel-Haenszel test. A Breslow-Day test was used to test for heterogeneity of ORs of the same SNP in different study cohorts.
Calculation of linkage disequilibrium between SNPs in regions of association was carried out using Haploview56, version 4.2. As a test of the independent effect of a given locus conditioned on the effect of another locus, we compared the fit of a model containing both loci to a model containing only the conditioning locus, assuming a multiplicative genotypic effect for the high-risk allele of each locus. Analyses were performed using STATA, version 10.0.
We estimated the contribution of known vitiligo susceptibility loci to the total variance in vitiligo liability as the difference between the variance accounted for by all SNPs genome-wide and the variance remaining after removing SNPs in known vitiligo susceptibility loci, using GCTA57 to estimate the two components of variance. We estimated heritability of vitiligo liability58,59 assuming a vitiligo prevalence in the EUR population of 0.003860 and sibling risk 0.062.
To gain insights into potential functional relationships among proteins encoded by vitiligo risk loci, we carried out functional interaction network analysis using the Search Tool for the Retrieval of Interacting Genes (STRING) 9.061. Input data were all known vitiligo susceptibility loci (including TICAM1, and selecting BTNL2 to represent the MHC class II gene region3). In addition, we iteratively tested inclusion of all proteins encoded by genes in the 16q24.3 and 22q13.2 association regions as well as FAM76B as potentially representing the 11q21 association region as a means of possible gene identification.
We thank the many vitiligo patients and normal control individuals around the world who participated in this study. We thank the University of Colorado Cancer Center Genomics and Microarray Core for genomewide genotyping, and BodySync (Aurora, CO) for replication genotyping. Supported by grants R01AR045584, R01AR056292, and P30AR057212 from the U.S. National Institutes of Health.
URLs. 1000 Genomes Project, http://www.1000genomes.org/; 1000 Genomes Project data, http://www.sph.umich.edu/csg/abecasis/MACH/download/1000G-2010-08.html; CCDS database, http://www.ncbi.nlm.nih.gov/CCDS/CcdsBrowse.cgi; Functional Genome Server, http://gvs.gs.washington.edu/GVS/; Genomic Evolutionary Rate Profiling http://mendel.stanford.edu/SidowLab/downloads/gerp/index.html; Illumina iControlDB, http://www.illumina.com/science/icontroldb.ilmn; MACH, http://www.sph.umich.edu/csg/abecasis/MACH/download/; NIH Database of Genotypes and Phenotypes (dbGaP), http://www.ncbi.nlm.nih.gov/gap; PHylogenetic Analysis with Space/Time models Conservation, http://genome.ucsc.edu/goldenPath/help/phastCons.html; PLINK, http://pngu.mgh.harvard.edu/purcell/plink/; STATA, http://www.stata.com; STRING database, http://string-db.org
AUTHOR CONTRIBUTIONS Y.J. performed statistical analyses. K.G. managed computer databases and genotype data. T.M.F., S.B., S.A.B., S.L.R., and J.B.C. managed DNA samples and contributed to experimental procedures. P.J.H. managed subject coordination. S.A.B., D.C.B., R.L., A.W., J.P.v.d.V., M.R.W., W.T.M., E.H.K., D.J.G., A.P.W., M.P., G.L., A.T., T.J., K.E., N.v.G., J.L., A.O., A.H., R.M.H., and N.B.S. provided subject samples and phenotype information. P.R.F. and R.A.S. oversaw and managed all aspects of the study. R.A.S. wrote the first draft of the manuscript. All authors contributed to the final paper.
COMPETING FINANCIAL INTERESTS The authors declare no competing financial interests.