|Home | About | Journals | Submit | Contact Us | Français|
Vitiligo is an autoimmune disease in which depigmented skin results from destruction of melanocytes1, with epidemiologic association with other autoimmune diseases2. In previous linkage and genome-wide association studies (GWAS1, GWAS2), we identified 27 vitiligo susceptibility loci in patients of European (EUR) ancestry. We carried out a third GWAS (GWAS3) in EUR subjects, with augmented GWAS1 and GWAS2 controls, genome-wide imputation, and meta-analysis of all three GWAS, followed by an independent replication. The combined analyses, with 4,680 cases and 39,586 controls, identified 23 new loci and 7 suggestive loci, most encoding immune and apoptotic regulators, some also associated with other autoimmune diseases, as well as several melanocyte regulators. Bioinformatic analyses indicate a predominance of causal regulatory variation, some corresponding to eQTL at these loci. Together, the identified genes provide a framework for vitiligo genetic architecture and pathobiology, highlight relationships to other autoimmune diseases and melanoma, and offer potential targets for treatment.
In previous genome-wide linkage and association studies, we identified 27 vitiligo susceptibility loci3–6 in EUR subjects, principally encoding immunoregulatory proteins, many of which are associated with other autoimmune diseases7. Several other vitiligo-associated genes encode melanocyte components that regulate normal pigmentary variation8 and in some cases are major vitiligo autoimmune antigens, with an inverse association of variation at these loci with vitiligo versus malignant melanoma4,6. To detect additional vitiligo-associations with lower odds ratios (ORs), as well as uncommon risk alleles with higher ORs, we conducted a third GWAS (GWAS3) of EUR subjects. We augmented the number of population controls in our previous GWAS1 and GWAS2 and performed genome-wide imputation of all three EUR vitiligo GWAS. After quality control procedures, the augmented studies included 1,381 cases and 14,518 controls (GWAS1), 413 cases and 5,209 controls (GWAS2), and 1,059 cases and 17,678 controls (GWAS3), with genomic inflation factors 1.068, 1.059, and 1.013, respectively. We performed a fixed-effects meta-analysis of the three GWAS datasets for 8,966,411 markers (GWAS123; Online Methods). Replication used an additional 1,827 EUR vitiligo cases and 2,181 controls.
Results for the three individual GWAS, the meta-analysis, and the replication study are presented in Table 1, Supplementary Table 1, and Fig. 1. Twenty-three new loci achieved genome-wide significance (P < 5 × 10−8) for association with vitiligo and demonstrated subsequent replication; of these, 21 are completely novel (FASLG, PTPRC, PPP4R3B, BCL2L11, FARP2-STK25, UBE2E2, FBXO45-NRROS, PPP3CA, IRF4, SERPINB9, CPVL, NEK6, ARID5B, a multigenic segment that includes BAD, TNFSF11, KAT2A-HSPB9-RAB5C, TNFRSF11A, SCAF1-IRF3-BCL2L12, a multigenic segment that includes ASIP, PTPN1, and IL1RAPL1), while two, CTLA4 and TICAM1, were suggestive in our previous studies. One previously significant locus, CLNK, was no longer significant (Supplementary Table 1). Another potential new locus, PVT1, exceeded genome-wide significance in the discovery meta-analysis (P = 7.74 × 10−9), but could not be successfully genotyped in the replication study and so remains uncertain. Two other loci, FLI1 and LOC101060498, exceeded genome-wide significance in the discovery meta-analysis (P = 3.76 × 10−8 and P = 3.60 × 10−11, respectively), but did not demonstrate replication. Seven additional novel loci achieved suggestive significance (P < 10−5) in the discovery meta-analysis (STAT4, PPARGC1B, c7orf72, PARP12, FADS2, CBFA2T3, and a chr17 locus in the vicinity of AFMID) and gave evidence of replication, but failed to achieve genome-wide significance (Supplementary Table 1).
Together, the most significantly associated variants at the 48 loci (Table 1) identified by meta-analyses of the three GWAS account for 17.4% of vitiligo heritability (h2 ~ 0.75). To assess whether additional independent variants at these loci might account for additional vitiligo heritability, we performed logistic regression conditional on the most significant SNP at each locus. Eight loci (FARP2-STK25, IFIH1, IL2RA, LPP, MC1R, SLA/TG, TYR, UBASH3A) and the MHC showed evidence of additional independent associations, accounting for an additional 5.1% of vitiligo heritability, for a total of 22.5%. In general, the ORs for the 23 new confirmed loci were lower than those for loci detected previously6, 1.15 to 1.27, excepting CPVL (OR = 1.84), RALY-EIF252-ASIP-AHCY-ITCH (OR = 1.64), and IL1RAPL1 (OR = 1.77); for these three signals the associated alleles are uncommon (minor allele frequencies 0.03, 0.07, and 0.01, respectively) and thus were not detected in the previous GWAS due to power limitations.
To screen for functional relationships among proteins encoded at the 48 confirmed vitiligo-associated loci, we included all genes under the association peaks at these loci in unsupervised pathway analyses using g:PROFILER9, PANTHER10, and STRING11. PANTHER and gPROFILER identified an enriched network of BioGRID interactions, most significant for the GO categories immune response, immune system process, positive regulation of response to stimulus, positive regulation of biological process, and regulation of response to stimulus. STRING identified a large potential interaction network (Fig. 2), with a predominance of proteins involved in immunoregulation, T-cell receptor repertoire, apoptosis, antigen processing and presentation, and melanocyte function.
Considering proteins encoded at the 23 newly confirmed vitiligo candidate loci, at least twelve (CTLA4, TICAM1, PTPRC, FARP2, UBE2E2, NRROS, CPVL, ARID5B, PTPN1, TNFSF11, TNFRSF11A, IRF3, and perhaps also IL1RAPL1) play roles in immune regulation, and PPP3CA may regulate FOXP3 via NFATC2 and is associated with canine lupus12. Six (FASLG, BCL2L11, BCL2L12, SERPINB9, NEK6, BAD) are regulators of apoptosis, particularly involving immune cells. ASIP is a regulator of melanocyte gene expression, and IRF4 is a key transcription factor for both immune cells and melanocytes.
Strikingly, several vitiligo-associated genes encode proteins that interact physically and functionally. BCL2L11 and BAD are binding partners that promote apoptosis13. CD80 binds to CTLA4 to inhibit T cell activation14. BCL2L12 binds to and neutralizes caspase 7 (CASP7)15. SERPINB9 binds to and specifically inhibits granzyme B (GZMB)16. Eos (IKZF4) binds and is an obligatory co-repressor of FOXP3 in regulatory T cells17. RANK (TNFRSF11A) binds to RANKL (TNFSF11) to regulate many aspects of immune cell function, including interactions of T cells and dendritic cells and thymic tolerization18. Agouti signaling protein (ASIP) binds to the melanocortin-1 receptor (MC1R) to down-regulate production of brown-black eumelanin19. IRF4 cooperates with MITF to activate transcription of TYR20. And the vitiligo-associated HLA-A*02:01:01:01 subtype presents peptide antigens derived from several different melanocyte proteins, including tyrosinase (TYR), OCA2, and MC1R4,6,21. Together, these relationships appear to highlight key pathways of vitiligo pathogenesis that are beginning to coalesce.
An unexpected finding from vitiligo GWAS has been an inverse relationship between vitiligo and malignant melanoma risk for genes that encode melanocyte structural and regulatory proteins. TYR, OCA2, and MC1R, encode functional components of the melanocyte and are key vitiligo autoantigens. IRF4 encodes a transcription factor for melanocytes as well as lymphoid, myeloid, and dendritic cells22, controlled by alternative tissue-specific enhancers23. ASIP and PPARGC1B encode paracrine regulators of melanocyte gene expression. All six loci play important roles in normal pigmentary variation8,24, and for all six the specific SNPs associated with vitiligo risk are also associated with melanoma protection, and vice-versa25–27. The inverse genetic relationship of susceptibility to vitiligo versus melanoma suggests that vitiligo may represent enhanced immune surveillance against melanoma27,28, consistent with the threefold reduction in melanoma incidence among vitiligo patients29,30 and prolonged survival of melanoma patients who develop vitiligo during immunotherapy31.
Vitiligo is epidemiologically associated with several other autoimmune diseases, including autoimmune thyroid disease, pernicious anemia, rheumatoid arthritis, adult-onset type 1 diabetes, Addison’s disease, and lupus2,32. We searched the NHGRI-EBI GWAS Catalog and PubMed for the 48 genome-wide significant and 7 suggestive vitiligo susceptibility loci for associations with other autoimmune, inflammatory, and immune-related disorders. As shown in Fig. 3, of the 23 novel genome-wide significant vitiligo loci, FASLG has been associated with celiac disease33 and Crohn’s disease34; PTPRC with ulcerative colitis35; BCL2L11 with primary sclerosing cholangitis36; CTLA4 with alopecia areata37, rheumatoid arthritis38, autoimmune thyroid disease39,40, myasthenia gravis41, and type 1 diabetes autoantibody production42; TNFRSF11A with myasthenia gravis41; and ARID5B with systemic lupus erythematosus43. Of the seven suggestive loci, STAT4 has been associated with Behḉet’s disease44, Sjögren’s syndrome45, and lupus46; and c7orf72 with lupus47. These concordant associations for vitiligo and other autoimmune and inflammatory diseases add to those involving previously identified vitiligo susceptibility loci, which include RERE, PTPN22, IFIH1, CD80, LPP, BACH2, RNASET2-FGFR1OP-CCR6, TG/SLA, IL2RA, CD44, a chr11q21 gene desert, IKZF4, SH2B3-ATXN2, UBASH3A, and C1QTNF64,6. Nevertheless, in most cases it remains uncertain whether apparent shared locus associations for different autoimmune diseases reflect shared or different underlying causal variants.
A majority of loci associated with complex traits involve causal variants that are regulatory in nature48–52, often corresponding to apparent expression quantitative trait loci (eQTLs)52. For TYR21, GZMB53, and MC1R7, principal vitiligo risk derives from missense substitutions, whereas for OCA26 and the MHC class I54 and class II55 loci principal vitiligo risk is associated with causal variation in nearby transcriptional regulatory elements. To assess the fraction of vitiligo-associated loci for which causal variation is likely regulatory, we carried out conditional logistic regression analysis of all loci to define independent association signals, and for each signal we compiled all variants that could not be statistically distinguished. All variants were then annotated against all available ENCODE datasets for immune-related and melanocyte-related cells (Supplementary Table 2). Overall, at approximately 58% of loci, the most significant variants (or statistically indistinguishable variants) are within a transcriptional regulatory element predicted by ENCODE data56,57. Only about 15% are in coding regions, several resulting in missense substitutions. To further assess the general functional categories of apparent causal variants for vitiligo, we applied stratified LD score regression51 to the GWAS meta-analysis summary statistics. As shown in Fig. 4, greatest enrichment of heritability was observed for markers in regulatory functional categories, with considerably less enrichment of markers in protein coding regions.
We utilized two approaches to assess correspondence of vitiligo association signals with expression of genes in the vicinity. We used PrediXcan58 to predict expression of 11,553 genes in whole blood for each study subject and then tested association of predicted expression of each gene with vitiligo affection status. We used a Bayesian method to assess co-localization of cis eQTL signals in purified blood monocytes with the confirmed vitiligo association signals. The PrediXcan analysis found 83 genes with significant differential predicted expression in vitiligo cases versus controls after Bonferroni correction (Supplementary Table 3); of these, 75 were located within 1 Mb of one of the 48 confirmed vitiligo susceptibility loci, demonstrating highly significant enrichment compared with locations of genes non-significant for PrediXcan (P value < 0.00001). The eQTL analysis found that 8 of the confirmed vitiligo association signals showed significant co-localization with eQTL association signals identified in purified monocytes (Supplementary Fig. 1 and Supplementary Table 4). Of the confirmed vitiligo-associated genes that could be tested using both methods, 6 were significant in both analyses (CASP7, HERC2-OCA2, ZC3H7B-TEF, TICAM1, RERE, RNASET2-FGFR1OP-CCR6). For all of these except CASP7, one or more of the most associated SNPs not distinguishable by logistic regression was located within or very close to an ENCODE element likely to regulate gene expression in immune cell types, melanocytes, or both (Supplementary Table 2).
Like a jigsaw puzzle, the pieces of the vitiligo pathogenome are thus beginning to fit together, revealing a complex network of immunoregulatory proteins, apoptotic regulators, and melanocyte components that mediate both autoimmune targeting of melanocytes in vitiligo and susceptibility to melanoma. For vitiligo as for other complex diseases, there is enrichment of causal variation in regions that regulate gene expression. This may bode well for identifying potential therapeutic targets, as pharmacologic modulation of dysregulated biological pathways may prove more tractable than attempting to target proteins impacted by amino acid substitutions.
The genome-wide portion of this study included unrelated cases from our three generalized vitiligo GWAS: GWAS14 (n = 1514), GWAS26 (n = 450), and the current GWAS3 (n = 1090). All cases were of non-Hispanic-Latino European-derived white ancestry (EUR) from North America and Europe, and met strict clinical criteria for generalized vitiligo59. All controls were EUR individuals not specifically known to have any autoimmune disease or malignant melanoma, for whom genome-wide genotypes were obtained from the NCBI database of Genotypes and Phenotypes (dbGaP; phs000092.v1.p1, phs000125.v1.p1, phs000138.v2.p1, phs000142.v1.p1, phs000168.v1.p1, phs000169.v1.p1, phs000206.v3.p2, phs000237.v1.p1, phs000346.v1.p1, and phs000439.v1.p1 for GWAS1; phs000203.v1.p1, and phs000289.v2.p1 for GWAS2; phs000196.v2.p1, phs000303.v1.p1, phs000304.v1.p1, phs000368.v1.p1, phs000381.v1.p1, phs000387.v1.p1, phs000389.v1.p1, phs000395.v1.p1, phs000408.v1.p1, phs000421.v1.p1, phs000494.v1.p1, and phs000524.v1.p1 for GWAS3). Control datasets were matched to each of the three GWAS case datasets based on platforms used for genotyping. The independent replication study included 1827 unrelated EUR vitiligo cases and 2181 unrelated EUR controls not included in any of the GWAS. All subjects provided written informed consent. This study was carried out under the jurisdiction of each local IRB with overall oversight of the Colorado Multiple Institutional Review Board (COMIRB).
Saliva specimens were obtained using a DNA self-collection kit (Oragene, DNA Genotek), and DNA was prepared using either the Maxwell apparatus/16 LEV Blood DNA kit (Promega) or the DNA Genotek Oragene Purifier protocol. DNA concentrations were measured using either the Qubit dsDNA BR Assay kit and Qubit 2.0 Fluorometer (Invitrogen) or the Promega QuantiFluor ONE dsDNA kit and GloMax®-Multi+ Detection System (Promega).
Genome-wide genotyping for the GWAS3 cases was performed for 716,503 variants using Illumina Human OmniExpress BeadChips by the Center for Inherited Disease Research (CIDR). Genotype data for GWAS3 were deposited in dbGaP (phs000224.v3.p1). GWAS14 and GWAS26 have been described previously.
Quality control filtering of genome-wide genotype data was carried out using PLINK60, version 1.9. For each case/control dataset, DNA strand calls were reversed as needed. Cases were excluded on the basis of SNP call rates <98.5%, discordance between reported and observed sex, or inadvertent subject duplication, and controls were excluded on the basis of SNP call rates < 98%. SNPs were excluded on the basis of genotype missing rate > 2% for SNPs with observed minor allele frequency (MAF) ≥ 0.01, and for SNPs with MAF < 0.01 exclusion criteria were genotype missing rate >1% and < 5 minor alleles observed, or significant (P < 10−4) deviation from Hardy-Weinberg equilibrium. For X chromosome SNPs, Hardy-Weinberg equilibrium tests were performed in females, and SNPs with P < 10−4 were excluded from the final analysis. For each GWAS, only SNPs that existed in all case and control datasets were retained for imputation.
Within each GWAS, subjects were excluded based on cryptic relatedness identified by pairwise identity-by-descent estimations (pi-hat > 0.0625), in which case the individual with lower SNP call rate was excluded. For each of the three GWAS, the cleaned case dataset was combined with one cleaned control dataset at a time and the genotype data of 270 subjects of Phase I and II of the International HapMap Project from 4 populations, and principal components analysis (PCA) was performed with EIGENSOFT59 based on tag-SNPs (within which no pair were correlated with r2 >0.2) selected from genotyped SNPs. The first two eigenvectors were used to produce a PCA plot. A PCA plot was first made for cases and HapMap samples, and cases that were clearly separated from the main cluster of cases and HapMap EUR samples were excluded as outliers. A PCA plot of controls and HapMap samples was then made, and the same x and y coordinates that separated the case outliers from the main cluster of cases and HapMap EUR samples were used to identify control outliers.
After all QC procedures, the final number of genotyped SNPs remaining in GWAS1, GWAS2, and GWAS3 were 464,902, 494,043, and 483,609, respectively. For autosomal analyses, the final numbers of cases and controls in GWAS1, GWAS2, and GWAS3 were 1,381 and 14,518, 413 and 5,209, and 1,059 and 17,678, respectively, whereas for X chromosome analyses, the final numbers of cases and controls in GWAS1, GWAS2, and GWAS3 were 1,380 and 9,439, 413 and 5,209, and 1,059 and 14,220, respectively. This sample size provided at least 85% power to detect associations with OR ≥ 1.22 at genome-wide significance (P = 5 × 10−8) for MAF ≥ 0.25.
For each GWAS, we used SHAPEIT version2 to pre-phase genotypes to produce best-guess haplotypes, and then performed imputation with these estimated haplotypes using IMPUTE2 and the 1000 Genomes Project phase I integrated variant set version 3 (March, 2012) as the reference panel. All cryptic related individuals and outliers from each GWAS were included in the process to improve imputation accuracy, but were removed for the final analyses. Only genotypes with imputation INFO ≥ 0.5 were retained, which were combined with prior SNP genotype data. Imputed genotypes for variants with MAF ≥ 0.01 calculated from all 3 GWAS combined and without significant (P > 10−5) deviation from Hardy-Weinberg equilibrium were used in the final analysis, which included 8,721,242 autosome variants and 245,169 chromosome X variants.
For the replication study, genotyping was attempted for 379 variants using a custom Illumina Golden Gate array by CIDR. 71 SNPs were excluded on the basis of genotype missing rate > 2% (which includes apparent technical failures), or significant (P < 10−4) deviation from Hardy-Weinberg equilibrium. For X chromosome SNPs, Hardy-Weinberg equilibrium tests were performed in females. Subjects were excluded on the basis of SNP call rates <95%, or discordance between reported and observed sex. Unintended duplicate samples were identified by pairwise identity-by-descent estimations (pi-hat > 0.99), in which case the individual with lower SNP call rate was excluded. The final numbers of remaining cases and controls were 1,827 and 2,181, respectively, providing at least 80% power to replicate associations at P = 0.05 with Bonferroni correction for up to 48 independent tests for OR ≥ 1.23 for MAF ≥ 0.25.
To control for the effects of population stratification, we assigned cases and controls of each GWAS to homogenous clusters using GemTools60, and performed Cochran-Mantel- Haenszel (CMH) analysis to test for association for each GWAS and the combined GWAS data, with the cluster variable defined by the case-control clusters from each GWAS. After removing variants within the extended MHC, the genomic inflation factor for GWAS1, GWAS2, and GWAS3 was 1.068, 1.059, and 1.013, respectively. For the combined GWAS1–GWAS2–GWAS3 genotype data for shared SNPs, the genomic inflation factor was 1.019.
For the replication study, after quality control procedures we compared allele frequencies for the remaining 308 SNPs in the remaining 1,827 cases and 2,181 controls using the Cochran-Armitage trend test. ORs and 95% confidence limits were calculated by logistic regression analysis. We used CMH analysis to obtain ORs and P values for the combined GWAS plus the replication study data, with the cluster variable defined by the case-control clusters from each GWAS and the replication study data as one cluster. To analyze X chromosome SNPs, we assumed complete X-inactivation and similar effect size between males and females, with the effect of having an A allele in a male equal to the effect of having two A alleles in a female63. We thus coded males as homozygous for the allele carried for each variant and tested for association by CMH analysis to obtain ORs and P values for each GWAS, the combined GWAS, and the combined GWAS plus the replication study data, and by the Cochran- Armitage trend test for the replication study data.
To test heterogeneity of associations across the three GWAS and the replication study data, we performed the Cochran Q test. The analysis was done with PLINK, version 1.07, using the ORs and standard errors estimated from the CMH analysis of each GWAS, and from logistic regression analysis of the replication study data. The I2 statistic from the Q test quantifies heterogeneity and ranges from 0% to 100%64, with a value of 75% or greater typically taken to indicate a high degree of heterogeneity65.
To test for multiple independent signals at each locus, we performed logistic regression analysis of each locus conditional on the most significantly associated variant, including as covariates in the model the significant principal components for each GWAS derived from GemTools62 to control for population stratification, and used a stepwise procedure to select additional variants, one by one, until no additional variants showed conditional P values ≤ 1.0 × 10−5. If a tested variant and the conditional variant could not improve each other significantly (P ≥ 0.05 when comparing the two SNP model to a single SNP model), then both variants were considered to represent the same signal. We calculated the variance explained by a specific variant or a set of variants from the combined GWAS as the Pseudo R2 of a logistic regression model which included the specific variants tested.
To screen for functional relationships among the vitiligo candidate genes, we carried out pathway analysis of the protein products of all positional candidate genes at all 48 confirmed loci and the seven suggestive loci using g:PROFILER10, PANTHER11, and STRING12. To assess enrichment of association signals in different functional genomic categories contributing to vitiligo heritability, we applied stratified LD score regression51 to the combined CMH GWAS123 summary statistics. The regression model contained 24 overlapping functional categories, including coding, UTR, promoter and intronic regions, annotations for different histone marks, DNase I hypersensitivity site (DHS) regions, combined ChromHMM and Segway predictions, conserved regions in mammals, super-enhancers and FANTOM5 enhancers. For each of the 24 categories, a 500-bp window was used. Linkage disequilibrium data were provided by the LD score software, estimated from the EUR samples in the 1000 Genomes Project Phase 1. Enrichment per category was calculated by the ratio of the estimated proportion of heritability explained by the category over the proportion of the markers in the category.
We carried out a gene-based test of association of vitiligo with “imputed” expression profiles for 11,553 autosomal genes in whole blood using PrediXcan58. The analysis included 2,853 cases and 37,412 controls from the combined GWAS. Association testing between expression estimates for each gene and affection status for vitiligo was performed by generalized logistic regression. P values were adjusted for the number of genes tested (n = 11,553). NRROS, ZC3H7B, TNFRSF11A, BCL2L12, RALY, ASIP, OCA2, and TYR were excluded from the PrediXcan analysis due to poor prediction of gene expression in blood cells.
We derived expression quantitative trait loci (eQTLs) in peripheral blood monocytes from 414 EUR subjects with paired genotyping and gene expression data66. SHAPEIT version2 was used to pre-phase genotypes to produce best-guess haplotypes with imputation performed using IMPUTE2 and the 1000 Genomes Project phase I integrated variant set version 3 (March, 2012) as reference panel. We tested for co-localization of eQTL and vitiligo GWAS autosomal association patterns as described67,68. Vitiligo susceptibility loci were defined by windows of robust association plus an added 100 kb buffer on both sides. eQTL probes were selected by choosing probes that resided within these windows. Probe quality annotation was performed using ReMOAT69 and all probes with an annotation of “bad” were removed. After removing non-autosomal loci and duplicate probe IDs, a total of 904 probes remained. All vitiligo susceptibility loci contained at least one probe with the exception of the gene desert 3’ of TYR, for which the only probe that intersected the locus was excluded due to ReMOAT annotation of “bad”. Within each locus window, all SNPs were tested for association with all probes using linear regression. P values, MAF for each SNP and respective sample sizes were used as input to test for co- localization, simultaneously testing five mutually exclusive hypotheses by generating 5 corresponding posterior probabilities (PP):
Posterior probabilities were calculated using the R package “coloc” using default settings for prior probabilities of association. Co-localization was assessed as per Guo et al.68; significant co-localization was PP3+PP4 > 0.99 and PP4:PP3 > 5, and suggestive co-localization was PP3+PP4 > 0.95 and PP4:PP3 > 3.
We thank the thousands of vitiligo patients and normal control individuals around the world who participated in this study. We thank the Center for Inherited Disease Research (CIDR) for genotyping. This work utilized the Janus supercomputer, which is supported by the National Science Foundation (award number CNS-0821794), the University of Colorado Boulder, the University of Colorado Denver, and the National Center for Atmospheric Research. The Janus supercomputer is operated by the University of Colorado Boulder. This work was supported by grants R01AR045584, R01AR056292, X01HG007484, and P30AR057212 from the U.S. National Institutes of Health and by institutional research funding IUT20-46 from the Estonian Ministry of Education and Research.
URLs. 1000 Genomes Project, http://www.1000genomes.org/; 1000 Genomes Project data, http://www.sph.umich.edu/csg/abecasis/MACH/download/1000G-2010-08.html; NHGRI-EBI GWAS Catalog, http://www.ebi.ac.uk/gwas/; NIH Database of Genotypes and Phenotypes (dbGaP), http://www.ncbi.nlm.nih.gov/gapr; Online Mendelian Inheritance in Man (OMIM), http://www.ncbi.nlm.nih.gov/omim; PLINK, http://pngu.mgh.harvard.edu/purcell/plink/; STATA, http://www.stata.com; STRING database, http://string-db.org.
Accession Codes. Genotype and phenotype data for GWAS1, GWAS2, and GWAS 3 have been deposited with the NIH Database of Genotypes and Phenotypes (dbGaP) as phs000224.v1.p1, phs000224.v2.p1, and phs000224.v3.p1, respectively.
AUTHOR CONTRIBUTIONSY.J., G.A. and D.Y. performed statistical analyses. J.S. managed computer databases, software, and genotype data. T.M.F., S.B., G.A., and K.M.B. managed DNA samples and contributed to experimental procedures. P.J.H. managed subject coordination. S.A.B., A.H., A.L., R.M.L., A.W., J.P.W.vdV., N.vG., J.L., D.C.B., A.T., K.E., E.H.K., D.J.G., A.P.W., S.K., E.P., K.K., M.K., M.R.W., W.T.M., A.O., S.M., R.C., M.P., N.B.S., M.S., Y.V., I.K., M.B., H.L., I.H., L.Z., and Q.-S.M. provided subject samples and phenotype information. S.A.S., P.R.F. and R.A.S. conceived, oversaw, and managed all aspects of the study. R.A.S. wrote the first draft of the manuscript. All authors contributed to the final paper.
URLs. 1000 Genomes Project, http://www.1000genomes.org/; coloc, https://cran.r-project.org/web/packages/coloc/index.html; GemTools, http://wpicr.wpic.pitt.edu/WPICCompgen/GemTools/GemTools.htm; IMPUTE2, https://mathgen.stats.ox.ac.uk/impute/impute_v2.html; International HapMap Project, http://hapmap.ncbi.nlm.nih.gov/; PrediXcan, https://github.com/hriordan/PrediXcan; SHAPEIT, http://www.shapeit.fr/; REMOAT, http://remoat.sysbiol.cam.ac.uk/transcript.php.
Competing Financial Interests
The authors declare no competing financial interests.