The human leukocyte antigen (HLA) class II genes HLA-DRB1, -DQA1 and -DQB1 are the strongest genetic factors for type 1 diabetes (T1D). Additional loci in the major histocompatibility complex (MHC) are difficult to identify due to the region’s high gene density and complex linkage disequilibrium (LD). To facilitate the association analysis, two novel algorithms were implemented in this study: one for phasing the multi-allelic HLA genotypes in trio families, and one for partitioning the HLA strata in conditional testing. Screening and replication were performed on two large and independent datasets: the Wellcome Trust Case–Control Consortium (WTCCC) dataset of 2,000 cases and 1,504 controls, and the T1D Genetics Consortium (T1DGC) dataset of 2,300 nuclear families. After imputation, the two datasets have 1,941 common SNPs in the MHC, of which 22 were successfully tested and replicated based on the statistical testing stratifying on the detailed DRB1 and DQB1 genotypes. Further conditional tests using the combined dataset confirmed eight novel SNP associations around 31.3 Mb on chromosome 6 (rs3094663, p = 1.66 × 10−11 and rs2523619, p = 2.77 × 10−10 conditional on the DR/DQ genotypes). A subsequent LD analysis established TCF19, POU5F1, CCHCR1 and PSORS1C1 as potential causal genes for the observed association.
Electronic supplementary material
The online version of this article (doi:10.1007/s00439-010-0908-2) contains supplementary material, which is available to authorized users.
Although they have demonstrated success in searching for common variants for complex diseases, Genome-Wide Association (GWA) studies are less successful in detecting rare genetic variants because of the poor statistical power of most current methods. We developed a two-stage method that can be applied to GWA studies for detecting rare variants. Here we report the results of applying this two-stage method to the Wellcome Trust Case Control Consortium (WTCCC) dataset, which includes 7 complex diseases: Bipolar disorder, Cardiovascular disease, Hypertension, Rheumatoid Arthritis, Crohn’s disease, Type 1 Diabetes and Type 2 Diabetes. We identified 24 genes or regions that reach genome-wide significance. Eight of them are novel and were not reported in the WTCCC study. The cumulative risk (or protective) haplotype frequency for each of the 8 genes or regions is small, being at most 11%. For each of the novel genes, the risk (or protective) haplotype set cannot be tagged by the common SNPs available in chips (r² < 0.32). The gene identified in hypertension was further replicated in the Framingham Heart Study (FHS), and is also significantly associated with Type 2 Diabetes. Our analysis suggests that searching for rare genetic variants is feasible in current genome-wide association studies and candidate gene studies, and the results can serve as guides to future resequencing studies to identify the underlying rare functional variants.
The Type I Diabetes Genetics Consortium (T1DGC) is an international, multicenter research program with two primary goals. The first goal is to identify genomic regions and candidate genes whose variants modify an individual’s risk of type I diabetes (T1D) and help explain the clustering of the disease in families. The second goal is to make research data available to the research community and to establish resources that can be used by, and that are fully accessible to, the research community. To facilitate the access to these resources, the T1DGC has developed a Consortium Agreement (http://www.t1dgc.org) that specifies the rights and responsibilities of investigators who participate in Consortium activities. The T1DGC has assembled a resource of affected sib-pair families, parent–child trios, and case–control collections with banks of DNA, serum, plasma, and EBV-transformed cell lines. In addition, both candidate gene and genome-wide (linkage and association) studies have been performed and displayed in T1DBase (http://www.t1dbase.org) for all researchers to use in their own investigations. In this supplement, a subset of the T1DGC collection has been used to investigate earlier published candidate genes for T1D, to confirm the results from a genome-wide association scan for T1D, and to determine associations with candidate genes for other autoimmune diseases or with type II diabetes that may be involved with β-cell function.
type I diabetes; autoantibodies; HLA; families; linkage; association
The Wellcome Trust Case Control Consortium (WTCCC) primary genome-wide association (GWA) scan1 on seven diseases, including the multifactorial, autoimmune disease, type 1 diabetes (T1D), shows significant association (P < 5 × 10−7) between T1D and six chromosome regions: 12q24, 12q13, 16p13, 18p11, 12p13 and 4q27. Here, we attempted to validate these and six other top findings in 4,000 individuals with T1D, 5,000 controls and 2,997 family trios that were independent of the WTCCC study. We confirmed unequivocally the associations of 12q24, 12q13, 16p13 and 18p11 (Pfollow-up ≤ 1.35 × 10−9; Poverall ≤ 1.15 × 10−14), leaving eight regions with small effects or false-positive associations with T1D. We also obtained evidence for chromosome 18q22 (Poverall = 1.38 × 10−8) from a genome-wide association study of nonsynonymous SNPs. Several regions, including 18q22 and 18p11, showed association with autoimmune thyroid disease. This study increases the number of T1D loci with compelling evidence from six to at least ten.
The Type I Diabetes Genetics Consortium (T1DGC) is an international collaboration whose primary goal is to identify genes whose variants modify an individual’s risk of type I diabetes (T1D). An integral part of the T1DGC’s mission is the establishment of clinical and data resources that can be used by, and that are fully accessible to, the T1D research community (http://www.t1dgc.org). The T1DGC has organized the collection and analyses of study samples and conducted several major research projects focused on T1D gene discovery: a genome-wide linkage scan, an intensive evaluation of the human major histocompatibility complex, a detailed examination of published candidate genes, and a genome-wide association scan. These studies have provided important information to the scientific community regarding the function of specific genes or chromosomal regions on T1D risk. The results are continually being updated and displayed (http://www.t1dbase.org). The T1DGC welcomes all investigators interested in using these data for scientific endeavors on T1D. The T1DGC resources provide a framework for future research projects, including examination of structural variation, re-sequencing of candidate regions in a search for T1D-associated genes and causal variants, correlation of T1D risk genotypes with biomarkers obtained from T1DGC serum and plasma samples, and in-depth bioinformatics analyses.
type I diabetes; sequence analysis; HLA; structural variants; expression
OBJECTIVE— The Type 1 Diabetes Genetics Consortium (T1DGC) has assembled and genotyped a large collection of multiplex families for the purpose of mapping genomic regions linked to type 1 diabetes. In the current study, we tested for evidence of loci associated with type 1 diabetes utilizing genome-wide linkage scan data and family-based association methods.
RESEARCH DESIGN AND METHODS— A total of 2,496 multiplex families with type 1 diabetes were genotyped with a panel of 6,090 single nucleotide polymorphisms (SNPs). Evidence of association to disease was evaluated by the pedigree disequilibrium test. Significant results were followed up by genotyping and analyses in two independent sets of samples: 2,214 parent-affected child trio families and a panel of 7,721 case and 9,679 control subjects.
RESULTS— Three of the SNPs most strongly associated with type 1 diabetes localized to previously identified type 1 diabetes risk loci: INS, IFIH1, and KIAA0350. A fourth strongly associated SNP, rs876498 (P = 1.0 × 10−4), occurred in the sixth intron of the UBASH3A locus at chromosome 21q22.3. Support for this disease association was obtained in two additional independent sample sets: families with type 1 diabetes (odds ratio [OR] 1.06 [95% CI 1.00–1.11]; P = 0.023) and case and control subjects (1.14 [1.09–1.19]; P = 7.5 × 10−8).
CONCLUSIONS— The T1DGC 6K SNP scan and follow-up studies reported here confirm previously reported type 1 diabetes associations at INS, IFIH1, and KIAA0350 and identify an additional disease association on chromosome 21q22.3 in the UBASH3A locus (OR 1.10 [95% CI 1.07–1.13]; P = 4.4 × 10−12). This gene and its flanking regions are now validated targets for further resequencing, genotyping, and functional studies in type 1 diabetes.
Genome-wide association studies (GWAS) have emerged as a powerful approach for identifying susceptibility loci associated with polygenic diseases such as type 2 diabetes mellitus (T2DM). However, it is still a daunting task to prioritize single nucleotide polymorphisms (SNPs) from GWAS for further replication in different populations. Several recent studies have shown that genetic variation often affects gene expression at proximal (cis) as well as distal (trans) genomic locations by different mechanisms, such as altering the rate of transcription, splicing, or transcript stability.
To prioritize SNPs from GWAS, we combined results from two GWAS related to T2DM, the Diabetes Genetics Initiative (DGI) and the Wellcome Trust Case Control Consortium (WTCCC), with genome-wide expression data from pancreas, adipose tissue, liver and skeletal muscle of individuals with or without T2DM or animal models thereof to identify T2DM susceptibility loci.
We identified 1,170 SNPs associated with T2DM with P < 0.05 in both GWAS and 243 genes that were located in the vicinity of these SNPs. Out of these 243 genes, we identified 115 that were differentially expressed in publicly available gene expression profiling data. Notably, five of them, IGF2BP2, KCNJ11, NOTCH2, TCF7L2 and TSPAN8, have subsequently been shown to be associated with T2DM in different populations. To provide further validation of our approach, we reversed it and started with 26 known SNPs associated with T2DM and related traits. We could show that 12 (57%) (HHEX, HNF1B, IGF2BP2, IRS1, KCNJ11, KCNQ1, NOTCH2, PPARG, TCF7L2, THADA, TSPAN8 and WFS1) out of 21 genes located in the vicinity of these SNPs showed aberrant expression in T2DM in the gene expression profiling studies.
Utilizing gene expression profiling data from different tissues of individuals with or without T2DM, or animal models thereof, is a powerful tool for prioritizing SNPs from GWAS for further replication studies.
In the presence of epistasis, multilocus association tests of human complex traits can provide powerful methods to detect susceptibility variants. We undertook multilocus analyses in 1924 type 2 diabetes cases and 2938 controls from the Wellcome Trust Case Control Consortium (WTCCC). We performed a two-dimensional genome-wide association (GWA) scan using joint two-locus tests of association including main and epistatic effects in 70,236 markers tagging common variants. We found two-locus association at 79 SNP-pairs at a Bonferroni-corrected P-value = 0.05 (uncorrected P-value = 2.14 × 10−11). The 79 pair-wise results always contained rs11196205 in TCF7L2 paired with 79 variants including confirmed variants in FTO, TSPAN8, and CDKAL1, which are associated in the absence of epistasis. However, the majority (82%) of the 79 variants did not have compelling single-locus association signals (P-value = 5 × 10−4). Analyses conditional on the single-locus effects at TCF7L2 established that the joint two-locus results could be attributed to single-locus association at TCF7L2 alone. Interaction analyses among the peak 80 regions and among 23 previously established diabetes candidate genes identified five SNP-pairs with case-control and case-only epistatic signals. Our results demonstrate the feasibility of systematic scans in GWA data, but confirm that single-locus association can underlie and obscure multilocus findings.
Epistasis; simultaneous search; joint effects; genome-wide association
Recent genome-wide association studies have resulted in a dramatic increase in our knowledge of the genetic loci involved in type 2 diabetes. In a complementary approach to these single-marker studies, we attempted to identify biological pathways associated with type 2 diabetes. This approach could allow us to identify additional risk loci.
RESEARCH DESIGN AND METHODS
We used individual level genotype data generated from the Wellcome Trust Case Control Consortium (WTCCC) type 2 diabetes study, consisting of 393,143 autosomal SNPs, genotyped across 1,924 case subjects and 2,938 control subjects. We sought additional evidence from summary level data available from the Diabetes Genetics Initiative (DGI) and the Finland-United States Investigation of NIDDM Genetics (FUSION) studies. Statistical analysis of pathways was performed using a modification of the Gene Set Enrichment Algorithm (GSEA). A total of 439 pathways were analyzed from the Kyoto Encyclopedia of Genes and Genomes, Gene Ontology, and BioCarta databases.
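As a rough illustration of the kind of statistic that gene set enrichment methods build on, the sketch below computes the standard weighted running-sum enrichment score for one pathway. It is not the modified algorithm used in the study, whose details are not given here. The gene names, the per-gene scores (imagined as, for example, −log10 of a gene's best SNP P value) and the function name `enrichment_score` are all hypothetical.

```python
from typing import Sequence, Set


def enrichment_score(ranked_genes: Sequence[str],
                     scores: Sequence[float],
                     gene_set: Set[str],
                     weight: float = 1.0) -> float:
    """Running-sum enrichment score for one gene set.

    ranked_genes must already be sorted by decreasing score.
    Genes in the set push the running sum up in proportion to
    |score| ** weight; genes outside the set push it down by a
    constant step. The score is the largest deviation from zero.
    """
    in_set = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(in_set)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")

    hit_weights = [abs(s) ** weight if hit else 0.0
                   for s, hit in zip(scores, in_set)]
    total_hit_weight = sum(hit_weights)
    if total_hit_weight == 0.0:
        raise ValueError("all gene-set members have a zero score")

    running = 0.0
    best = 0.0
    for w, hit in zip(hit_weights, in_set):
        running += w / total_hit_weight if hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical toy ranking, not data from the study.
genes = ["A", "B", "C", "D", "E", "F"]
gene_scores = [3.0, 2.5, 2.0, 1.0, 0.5, 0.2]
es_top = enrichment_score(genes, gene_scores, {"A", "B"})
es_bottom = enrichment_score(genes, gene_scores, {"E", "F"})
print(es_top, es_bottom)
```

In practice the significance of such a score is judged against a null distribution, typically obtained by permuting case and control labels, and then corrected for the number of pathways tested.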
After correcting for the number of pathways tested, we found no strong evidence for any pathway showing association with type 2 diabetes (top Padj = 0.31). The candidate WNT-signaling pathway ranked top (nominal P = 0.0007, excluding TCF7L2; P = 0.002), containing a number of promising single gene associations. These include CCND2 (rs11833537; P = 0.003), SMAD3 (rs7178347; P = 0.0006), and PRICKLE1 (rs1796390; P = 0.001), all expressed in the pancreas.
Common variants involved in type 2 diabetes risk are likely to occur in or near genes in multiple pathways. Pathway-based approaches to genome-wide association data may be more successful for some complex traits than others, depending on the nature of the underlying disease physiology.
In addition to the HLA-locus, six genetic risk factors for primary biliary cirrhosis (PBC) have been identified in recent genome-wide association studies (GWAS). To identify additional loci, we carried out a GWAS using 1,840 cases from the UK PBC Consortium and 5,163 UK population controls as part of the Wellcome Trust Case Control Consortium 3 (WTCCC3). Twenty-eight loci were followed up in an additional UK cohort of 620 PBC cases and 2,514 population controls. We identified 12 novel risk loci (P<5×10−8) and replicated all previously associated loci. Three further novel loci were identified by meta-analysis of data from our study and previously published GWAS results. New candidate genes include STAT4, DENND1B, CD80, IL7R, CXCR5, TNFRSF1A, CLEC16A, and NFKB1. This study has considerably expanded our knowledge of the genetic architecture of PBC.
Type 1 diabetes arises from the actions of multiple genetic and environmental risk factors. Considerable success at identifying common genetic variants that contribute to type 1 diabetes risk has come from genetic association (primarily case-control) studies. However, such studies have limited power to detect genes containing multiple rare variants that contribute significantly to disease risk.
RESEARCH DESIGN AND METHODS
The Type 1 Diabetes Genetics Consortium (T1DGC) has assembled a collection of 2,496 multiplex type 1 diabetic families from nine geographical regions containing 2,658 affected sib-pairs (ASPs). We describe the results of a genome-wide scan for linkage to type 1 diabetes in the T1DGC family collection.
Significant evidence of linkage to type 1 diabetes was confirmed at the HLA region on chromosome 6p21.3 (logarithm of odds [LOD] = 213.2). There was further evidence of linkage to type 1 diabetes on 6q that could not be accounted for by the major linkage signal at the HLA class II loci on chromosome 6p21. Suggestive evidence of linkage (LOD ≥2.2) was observed near CTLA4 on chromosome 2q32.3 (LOD = 3.28) and near INS (LOD = 3.16) on chromosome 11p15.5. Some evidence for linkage was also detected at two regions on chromosome 19 (LOD = 2.84 and 2.54).
Five non–HLA chromosome regions showed some evidence of linkage to type 1 diabetes. A number of previously proposed type 1 diabetes susceptibility loci, based on smaller ASP numbers, showed limited or no evidence of linkage to disease. Low-frequency susceptibility variants or clusters of loci with common alleles could contribute to the linkage signals observed.
It has been postulated that multiple-marker methods may have added ability, over single-marker methods, to detect genetic variants associated with disease. The Wellcome Trust Case Control Consortium (WTCCC) provided the first successful large genome-wide association studies (GWAS) which included single-marker association analyses for seven common complex diseases. Of those signals detected, only one was associated with coronary artery disease (CAD), and none were identified for hypertension (HTN). Our objective was to find additional genetic associations and pathways for cardiovascular disease by examining the WTCCC data for variants associated with CAD and HTN using two-marker testing methods. We applied two-marker association testing to the WTCCC dataset, which includes ~2,000 affected individuals with each disorder, and a shared pool of ~3,000 controls, all genotyped using Affymetrix GeneChip 500 K arrays. For CAD, we detected single nucleotide polymorphisms (SNP) pairs in three genes showing genome-wide significance: HFE2, STK32B, and DIPC2. The most notable SNP pairs in a non-protein-coding region were at 9p21, a known major CAD-associated region. For HTN, we detected SNP pairs in five genes: GPR39, XRCC4, MYO6, ZFAT, and MACROD2. Four further associated SNP pair regions were at least 70 kb from any known gene. We have shown that novel, multiple-marker, statistical methods can be of use in finding variants in GWAS. We describe many new, associated variants for both CAD and HTN and describe their known genetic mechanisms.
Over 50 regions of the genome have been associated with type 1 diabetes risk, mainly using large case/control collections. In a recent genome-wide association (GWA) study, 18 novel susceptibility loci were identified and replicated, including replication evidence from 2,319 families. Here, we, the Type 1 Diabetes Genetics Consortium (T1DGC), aimed to exclude the possibility that any of the 18 loci were false-positives due to population stratification by significantly increasing the statistical power of our family study.
We genotyped the most disease-predicting single-nucleotide polymorphisms at the 18 susceptibility loci in 3,108 families and used existing genotype data for 2,319 families from the original study, providing 7,013 parent–child trios for analysis. We tested for association using the transmission disequilibrium test.
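For readers unfamiliar with the transmission disequilibrium test, a minimal sketch of its basic biallelic form follows. It assumes the counts of heterozygous parents who did and did not transmit the test allele have already been tallied from the trios; the counts shown are hypothetical and the function name `tdt` is ours, not the consortium's software.

```python
import math
from typing import Tuple


def tdt(transmitted: int, untransmitted: int) -> Tuple[float, float]:
    """Basic TDT: McNemar-type chi-square on heterozygous parents.

    transmitted   -- heterozygous parents who passed the test allele
                     to the affected child
    untransmitted -- heterozygous parents who passed the other allele
    Returns the chi-square statistic (1 degree of freedom) and p-value.
    """
    informative = transmitted + untransmitted
    if informative == 0:
        raise ValueError("no informative (heterozygous) parents")
    chi2 = (transmitted - untransmitted) ** 2 / informative
    # Upper tail of a 1-df chi-square: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts, not taken from the study.
chi2, p_value = tdt(transmitted=560, untransmitted=440)
print(f"chi2 = {chi2:.2f}, p = {p_value:.2e}")
```

Because only transmissions within families are compared, the test is not confounded by population stratification, which is why it suits the aim stated above.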
Seventeen of the 18 susceptibility loci reached nominal levels of significance (p < 0.05) in the expanded family collection, with 14q24.1 just falling short (p = 0.055). When we allowed for multiple testing, ten of the 17 nominally significant loci reached the required level of significance (p < 2.8 × 10−3). All susceptibility loci had consistent direction of effects with the original study.
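The multiple-testing threshold quoted above is consistent with a Bonferroni correction of a 0.05 significance level over the 18 loci. The short sketch below shows that arithmetic. That Bonferroni was the correction used is our inference from the numbers, and the function name is illustrative.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


threshold = bonferroni_threshold(alpha=0.05, n_tests=18)
print(f"{threshold:.1e}")  # 2.8e-03
```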
The results for the novel GWA study-identified loci are genuine and not due to population stratification. The next step, namely correlation of the most disease-associated genotypes with phenotypes, such as RNA and protein expression analyses for the candidate genes within or near each of the susceptibility regions, can now proceed.
Electronic supplementary material
The online version of this article (doi:10.1007/s00125-012-2450-3) contains peer-reviewed but unedited supplementary material, including a full list of members of the Type 1 Diabetes Genetics Consortium, which is available to authorised users.
Families; Population stratification bias; Power; Replication; Susceptibility; Type 1 diabetes
The Type I Diabetes Genetics Consortium (T1DGC) Rapid Response Workshop was established to evaluate published candidate gene associations in a large collection of affected sib-pair (ASP) families. We report on our quality control (QC) and preliminary family-based association analyses. A random sample of blind duplicates was analyzed for QC. Quality checks, including examination of plate-panel yield, marker yield, Hardy–Weinberg equilibrium, mismatch error rate, Mendelian error rate, and allele distribution across plates, were performed. Genotypes from 2324 families within nine cohorts were obtained from a panel of 21 candidate genes, including 384 single-nucleotide polymorphisms on two genotyping platforms performed at the Broad Institute Center for Genotyping and Analysis (Cambridge, MA, USA). The T1DGC Rapid Response project, following rigorous QC procedures, resulted in a database of 2297 families and 9688 genotyped individuals on a single candidate gene panel. The available data include 9005 individuals with genotype data from both platforms and 683 individuals genotyped (276 in Illumina; 407 in Sequenom) on only one platform.
type I diabetes; candidate gene; SNP; quality control; association
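One of the quality checks listed in the abstract above is Hardy–Weinberg equilibrium. A minimal sketch of the textbook chi-square version of that check for a single biallelic SNP is given below. The abstract does not say which test variant was used (exact tests are common for low counts), and the genotype counts and function name here are hypothetical.

```python
import math
from typing import Tuple


def hwe_chi2(n_aa: int, n_ab: int, n_bb: int) -> Tuple[float, float]:
    """Chi-square goodness-of-fit test for Hardy-Weinberg proportions.

    Takes observed counts of the three genotypes at a biallelic SNP
    and returns the statistic (1 degree of freedom) and its p-value.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    if min(expected) == 0.0:
        raise ValueError("SNP is monomorphic; test is undefined")
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a 1-df chi-square: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: the first SNP fits HWE, the second has too few
# heterozygotes, a pattern that can indicate genotyping error.
chi2_ok, p_ok = hwe_chi2(360, 480, 160)
chi2_bad, p_bad = hwe_chi2(400, 400, 200)
print(chi2_ok, p_ok)
print(chi2_bad, p_bad)
```

In family collections the check is usually restricted to unrelated individuals such as founders, since relatives' genotypes are not independent.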
Rheumatoid arthritis (RA) is an archetypal, common, complex autoimmune disease with both genetic and environmental contributions to disease aetiology. Two novel RA susceptibility loci have been reported from recent genome-wide and candidate gene association studies. We, therefore, investigated the evidence for association of the STAT4 and TRAF1/C5 loci with RA using imputed data from the Wellcome Trust Case Control Consortium (WTCCC). No evidence for association of variants mapping to the TRAF1/C5 gene was detected in the 1860 RA cases and 2930 control samples tested in that study. Variants mapping to the STAT4 gene did show evidence for association (rs7574865, P = 0.04). Given the association of the TRAF1/C5 locus in two previous large case–control series from populations of European descent and the evidence for association of the STAT4 locus in the WTCCC study, single nucleotide polymorphisms mapping to these loci were tested for association with RA in an independent UK series comprising DNA from >3000 cases with disease and >3000 controls and a combined analysis including the WTCCC data was undertaken. We confirm association of the STAT4 and the TRAF1/C5 loci with RA bringing to 5 the number of confirmed susceptibility loci. The effect sizes are less than those reported previously but are likely to be a more accurate reflection of the true effect size given the larger size of the cohort investigated in the current study.
Most pathway and gene-set enrichment methods prioritize genes by their main effect and do not account for variation due to interactions in the pathway. A portion of the presumed missing heritability in genome-wide association studies (GWAS) may be accounted for through gene–gene interactions and additive genetic variability. In this study, we prioritize genes for pathway enrichment in GWAS of bipolar disorder (BD) by aggregating gene–gene interaction information with main effect associations through a machine learning (evaporative cooling) feature selection and epistasis network centrality analysis. We validate this approach in a two-stage (discovery/replication) pathway analysis of GWAS of BD. The discovery cohort comes from the Wellcome Trust Case Control Consortium (WTCCC) GWAS of BD, and the replication cohort comes from the National Institute of Mental Health (NIMH) GWAS of BD in European Ancestry individuals. Epistasis network centrality yields replicated enrichment of Cadherin signaling pathway, whose genes have been hypothesized to have an important role in BD pathophysiology but have not demonstrated enrichment in previous analysis. Other enriched pathways include Wnt signaling, circadian rhythm pathway, axon guidance and neuroactive ligand-receptor interaction. In addition to pathway enrichment, the collective network approach elevates the importance of ANK3, DGKH and ODZ4 for BD susceptibility in the WTCCC GWAS, despite their weak single-locus effect in the data. These results provide evidence that numerous small interactions among common alleles may contribute to the diathesis for BD and demonstrate the importance of including information from the network of gene–gene interactions as well as main effects when prioritizing genes for pathway analysis.
eigenvector centrality; epistasis network; evaporative cooling machine learning feature selection; pathway enrichment analysis; regression-based genetic association interaction network (reGAIN); SNPrank
The advent of genome-wide association (GWA) studies has revolutionized the detection of disease loci and provided abundant evidence for previously undetected disease loci that can be pooled together in meta-analysis studies or used to design follow-up studies. A total of 1715 SNPs from the Wellcome Trust Case Control Consortium GWA study of type I diabetes (T1D) were selected and a follow-up study was conducted in 1410 affected sib-pair families assembled by the Type I Diabetes Genetics Consortium. In addition to the support for previously identified loci (PTPN22/1p13; ERBB3/12q13; SH2B3/12q24; CLEC16A/16p13; UBASH3A/21q22), evidence supporting two new and distinct chromosome locations associated with T1D was observed: FHOD3/18q12 (rs2644261, P=5.9×10−4) and Xp22 (rs5979785, P=6.8×10−3; http://www.T1DBase.org). There was independent support for both SNPs in a GWA meta-analysis of 7514 cases and 9045 controls (P values=5.0×10−3 and 6.7×10−6, respectively). The chromosome 18q12 region contains four genes, none of which are obvious functional candidate genes. In contrast, the Xp22 SNP is located 30 kb centromeric of TLR8 and TLR7. Both TLR8 and TLR7 are functional candidate genes owing to their key roles as pathogen recognition receptors and, in the case of TLR7, overexpression has been associated directly with murine autoimmune disease.
genome-wide association; type I diabetes; follow-up study; T1DGC
Candidate gene studies have long been the principal method for identification of susceptibility genes for type I diabetes (T1D), resulting in the discovery of HLA, INS, PTPN22, CTLA4, and IL2RA. However, many of the initial studies that relied on this strategy were largely underpowered, because of the limitations in genomic information and genotyping technology, as well as the limited size of available cohorts. The Type I Diabetes Genetic Consortium (T1DGC) has established resources to reevaluate earlier reported genes associated with T1D, using its collection of 2298 Caucasian affected sib-pair families (with 11 159 individuals). A total of 382 single-nucleotide polymorphisms (SNPs) located in 21 T1D candidate genes were selected for this study and genotyped in duplicate on two platforms, Illumina and Sequenom. The genes were chosen based on published literature as having been either ‘confirmed’ (replicated) or not (candidates). This study showed several important features of genetic association studies. First, it showed the major impact of small rates of genotyping errors on association statistics. Second, it confirmed associations at INS, PTPN22, IL2RA, IFIH1 (earlier confirmed genes), and CTLA4 (earlier confirmed, with distinct SNPs) loci. Third, it did not find evidence for an association with T1D at SUMO4, despite confirmed association in Asian populations, suggesting the potential for population-specific gene effects. Fourth, at PTPN22, there was evidence for a novel contribution to T1D risk, independent of the replicated effect of the R620W variant. Fifth, among the candidate genes selected for replication, the association of TCF7-P19T with T1D was newly replicated in this study. In summary, this study was able to replicate some genetic effects, reject others, and provide suggestions of association with several of the other candidate genes in stratified analyses (age at onset, HLA status, population of origin). 
These results have generated additional interesting functional hypotheses that will require further replication in independent cohorts.
type I diabetes; candidate genes; T1DGC; SNP selection
Psychiatric phenotypes are currently defined according to sets of descriptive criteria. Although many of these phenotypes are heritable, it would be useful to know whether any of the various diagnostic categories in current use identify cases that are particularly helpful for
To use genome-wide genetic association data to explore the relative genetic utility of seven different descriptive operational diagnostic categories relevant to bipolar illness within a large UK case–control bipolar
We analysed our previously published Wellcome Trust Case Control Consortium (WTCCC) bipolar disorder genome-wide association data-set, comprising 1868 individuals with bipolar disorder and 2938 controls genotyped for 276 122 single nucleotide polymorphisms (SNPs) that met stringent criteria for genotype quality. For each SNP we performed a test of association (bipolar disorder group v. control group) and used the number of associated independent SNPs statistically significant at P<0.00001 as a metric for the overall genetic signal in the sample. We next compared this metric with that obtained using each of seven diagnostic subsets of the group with bipolar disorder: Research Diagnostic Criteria (RDC): bipolar I disorder; manic disorder; bipolar II disorder; schizoaffective disorder, bipolar type; DSM–IV: bipolar I disorder; bipolar II disorder; schizoaffective disorder, bipolar type.
The RDC schizoaffective disorder, bipolar type (v. controls) stood out from the other diagnostic subsets as having a significant excess of independent association signals (P<0.003) compared with that expected in samples of the same size selected randomly from the total bipolar disorder group data-set. The strongest association in this subset of participants with bipolar disorder was at rs4818065 (P = 2.42×10–7). Biological systems implicated included gamma-aminobutyric acid (GABA)A receptors. Genes having at least one associated polymorphism at P<10–4 included B3GALTS, A2BP1, GABRB1, AUTS2, BSN, PTPRG, GIRK2 and
Our findings show that individuals with broadly defined bipolar schizoaffective features either have a particularly strong genetic contribution or, as a group, are genetically more homogeneous than the other phenotypes tested. The results point to the importance of using diagnostic approaches that recognise this group of individuals. Our approach can be applied to similar data-sets for other psychiatric and non-psychiatric
OBJECTIVE—This study examined how differences in the BMI distribution of type 2 diabetic case subjects affected genome-wide patterns of type 2 diabetes association and considered the implications for the etiological heterogeneity of type 2 diabetes.
RESEARCH DESIGN AND METHODS—We reanalyzed data from the Wellcome Trust Case Control Consortium genome-wide association scan (1,924 case subjects, 2,938 control subjects: 393,453 single-nucleotide polymorphisms [SNPs]) after stratifying case subjects (into “obese” and “nonobese”) according to median BMI (30.2 kg/m²). Replication of signals in which alternative case-ascertainment strategies generated marked effect size heterogeneity in type 2 diabetes association signal was sought in additional samples.
RESULTS—In the “obese-type 2 diabetes” scan, FTO variants had the strongest type 2 diabetes effect (rs8050136: relative risk [RR] 1.49 [95% CI 1.34–1.66], P = 1.3 × 10−13), with only weak evidence for TCF7L2 (rs7901695 RR 1.21 [1.09–1.35], P = 0.001). This situation was reversed in the “nonobese” scan, with FTO association undetectable (RR 1.07 [0.97–1.19], P = 0.19) and TCF7L2 predominant (RR 1.53 [1.37–1.71], P = 1.3 × 10−14). These patterns, confirmed by replication, generated strong combined evidence for between-stratum effect size heterogeneity (FTO: PDIFF = 1.4 × 10−7; TCF7L2: PDIFF = 4.0 × 10−6). Other signals displaying evidence of effect size heterogeneity in the genome-wide analyses (on chromosomes 3, 12, 15, and 18) did not replicate. Analysis of the current list of type 2 diabetes susceptibility variants revealed nominal evidence for effect size heterogeneity for the SLC30A8 locus alone (RRobese 1.08 [1.01–1.15]; RRnonobese 1.18 [1.10–1.27]: PDIFF = 0.04).
CONCLUSIONS—This study demonstrates the impact of differences in case ascertainment on the power to detect and replicate genetic associations in genome-wide association studies. These data reinforce the notion that there is substantial etiological heterogeneity within type 2 diabetes.
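As a rough illustration of the between-stratum comparison described above, the following minimal sketch recovers log relative risks and standard errors from the reported 95% confidence intervals for FTO and applies a two-sided z-test to their difference. The function names are hypothetical, and the test assumes the two estimates are independent, which is only an approximation here because both strata were compared against the same controls. It uses the scan-only estimates, so it is not expected to reproduce the reported PDIFF values, which combined scan and replication data.

```python
import math

def log_rr_and_se(rr, ci_low, ci_high, z_crit=1.959964):
    """Recover log(RR) and its standard error from a 95% confidence interval."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    return math.log(rr), se

def heterogeneity_z_test(est_a, est_b):
    """Two-sided z-test for a difference between two log relative risks.

    Assumes the two estimates are independent (an approximation when
    both strata share a control group).
    """
    (beta_a, se_a), (beta_b, se_b) = est_a, est_b
    z = (beta_a - beta_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return z, p

# FTO rs8050136 estimates quoted in the RESULTS paragraph
fto_obese = log_rr_and_se(1.49, 1.34, 1.66)
fto_nonobese = log_rr_and_se(1.07, 0.97, 1.19)
z, p = heterogeneity_z_test(fto_obese, fto_nonobese)
print(f"z = {z:.2f}, p = {p:.2g}")
```

Working on the log scale keeps the confidence interval symmetric around the estimate, which is what allows the standard error to be read off the interval width.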
The Type I Diabetes Genetics Consortium (T1DGC) has collected thousands of multiplex and simplex families with type I diabetes (T1D) with the goal of identifying genes involved in T1D susceptibility. These families have been genotyped for the HLA class I and class II loci and, recently, for a genome-wide panel of single-nucleotide polymorphisms (SNPs). In addition, multiple SNPs in specific candidate genes have been genotyped in these families in an attempt to evaluate previously reported T1D associations, including the C883A (Pro–Thr) polymorphism in exon 2 of TCF7, a T-cell transcription factor. The TCF7 883A allele had previously been reported to be associated with T1D in patients not carrying the high-risk HLA genotype DR3/DR4. A panel of 11 SNPs in TCF7 was genotyped in 2092 families from 9 cohorts of the T1DGC. SNPs at two positions in TCF7 were associated with T1D. One associated SNP, C883A (rs5742913), was reported earlier to have a T1D association. A second SNP, rs17653687, represents a novel T1D susceptibility allele in TCF7. After stratification on the high T1D risk DR3/DR4 genotype, the variant (A) allele of C883A was significantly associated with T1D among non-DR3/DR4 cases (transmission = 55.8%, P = 0.004; OR = 1.26) but was not significantly associated in the DR3/DR4 patient subgroup, replicating the earlier report. The reference A allele of intronic SNP rs17653687 was modestly associated with T1D in both DR3/DR4 strata (transmission = 54.4% in DR3/DR4, P = 0.03; transmission = 52.9% in non-DR3/DR4, P = 0.03). These results support the previously reported association of the non-synonymous Pro–Thr SNP in TCF7 with T1D, and suggest that other alleles at this locus may also confer risk.
polymorphism; transcription factor; Th1; type I diabetes
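The transmission percentages and odds ratio quoted above are linked by simple arithmetic, which a minimal sketch can make explicit. It assumes a standard McNemar-type transmission disequilibrium test (TDT) on transmissions from heterozygous parents; the abstract does not state which test was used. The function names and the counts (558 transmitted, 442 untransmitted) are hypothetical, chosen only to give a 55.8% transmission rate, and are not taken from the study.

```python
import math

def tdt(transmitted, untransmitted):
    """McNemar-type TDT on allele transmissions from heterozygous parents.

    Returns the chi-square statistic (1 df) and its p-value.
    """
    b, c = transmitted, untransmitted
    chi2 = (b - c) ** 2 / (b + c)
    p = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square with 1 df
    return chi2, p

def odds_ratio_from_transmission(fraction):
    """Odds ratio implied by a transmission fraction: T / U."""
    return fraction / (1 - fraction)

# Reported transmission rate for C883A in non-DR3/DR4 cases
or_c883a = odds_ratio_from_transmission(0.558)
print(f"OR implied by 55.8% transmission: {or_c883a:.2f}")

# Hypothetical counts, for illustration only (not from the study)
chi2, p = tdt(558, 442)
print(f"chi2 = {chi2:.3f}, p = {p:.2g}")
```

The implied odds ratio, 0.558 / 0.442, rounds to the reported 1.26; the p-value from the hypothetical counts is not comparable to the reported one because the true number of informative transmissions is not given.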
Expression quantitative trait loci (eQTL), or genetic variants associated with changes in gene expression, have the potential to assist in interpreting results of genome-wide association studies (GWAS). eQTLs also have varying degrees of tissue specificity. By correlating the statistical significance of eQTLs mapped in various tissue types to their odds ratios reported in a large GWAS by the Wellcome Trust Case Control Consortium (WTCCC), we discovered that there is a significant association between diseases studied genetically and their relevant tissues. This suggests that eQTL data sets can be used to determine tissues that play a role in the pathogenesis of a disease, thereby highlighting these tissue types for further post-GWAS functional studies.
Genome-wide association studies (GWAS) have identified single-nucleotide polymorphisms (SNPs) at multiple loci that are significantly associated with coronary artery disease (CAD) risk. In this study, we sought to determine and compare the predictive capabilities of 9p21.3 alone and a panel of SNPs identified and replicated through GWAS for CAD.
Methods and Results
We used the Ottawa Heart Genomics Study (OHGS) (3323 cases, 2319 control subjects) and the Wellcome Trust Case Control Consortium (WTCCC) (1926 cases, 2938 control subjects) data sets. We compared the predictive ability of three methods: allele counting, logistic regression, and support vector machines. Two sets of SNPs, 9p21.3 alone and a set of 12 SNPs identified by GWAS and through a model-fitting procedure, were considered. Performance was assessed by measuring area under the curve (AUC) for OHGS using 10-fold cross-validation and WTCCC as a replication set. AUC for logistic regression using OHGS increased significantly from 0.555 to 0.608 (P=3.59×10−14) for 9p21.3 versus the 12 SNPs, respectively. This difference remained when traditional risk factors were considered in a subgroup of OHGS (1388 cases, 2038 control subjects), with AUC increasing from 0.804 to 0.809 (P=0.037). The added predictive value over and above the traditional risk factors was not significant for 9p21.3 (AUC 0.801 versus 0.804, P=0.097) but was for the 12 SNPs (AUC 0.801 versus 0.809, P=0.0073). Performance was similar between OHGS and WTCCC. Logistic regression outperformed both support vector machines and allele counting.
Using the set of 12 SNPs confers significantly greater predictive capability for CAD than 9p21.3 alone, whether or not traditional risk factors are considered. More accurate models will probably evolve as additional CAD-associated SNPs are identified.
coronary disease; genetics; risk factors
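For readers unfamiliar with the simplest of the three predictors compared above, the following is a minimal sketch of an allele-counting score evaluated by AUC. It assumes each genotype is coded as the number of risk alleles (0, 1, or 2) and computes AUC as the probability that a randomly chosen case outscores a randomly chosen control, with ties counted as one half. The function names and the toy genotypes are hypothetical and are not taken from OHGS or WTCCC.

```python
def allele_count_score(genotypes):
    """Unweighted risk score: total number of risk alleles across SNPs."""
    return sum(genotypes)

def auc(case_scores, control_scores):
    """AUC as P(case score > control score), counting ties as 0.5."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# Toy data: each row is one subject's risk-allele counts at three SNPs
cases = [[2, 1, 0], [1, 1, 0]]
controls = [[0, 1, 0], [1, 0, 1]]

case_scores = [allele_count_score(g) for g in cases]
control_scores = [allele_count_score(g) for g in controls]
toy_auc = auc(case_scores, control_scores)
print(f"AUC = {toy_auc:.3f}")
```

Allele counting weights every SNP equally, whereas logistic regression estimates a separate weight per SNP, which is one plausible reason for the difference in performance reported above.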
To reassess earlier suggested type I diabetes (T1D) associations of the insulin receptor substrate 1 (IRS1) and paired box 4 (PAX4) genes, the Type I Diabetes Genetics Consortium (T1DGC) evaluated single-nucleotide polymorphisms (SNPs) covering the two genomic regions. Sixteen SNPs were evaluated for IRS1 and 10 for PAX4. Both genes are biological candidate genes for T1D. Genotyping was performed in 2300 T1D families on both Illumina and Sequenom genotyping platforms. Data quality and concordance between the platforms were assessed for each SNP. Neither transmission disequilibrium testing nor haplotype analysis showed T1D association of SNPs in the two genes. In conclusion, the earlier suggested associations of IRS1 and PAX4 with T1D were not supported, suggesting that they may have been false-positive results. This highlights the importance of thorough quality control, careful selection of tagging SNPs, use of more than one genotyping platform in high-throughput studies, and sufficient power to draw solid conclusions in genetic studies of human complex diseases.
autoimmune disease; genetic susceptibility; single-nucleotide polymorphism; type I diabetes; IRS1; PAX4
To develop novel methods for identifying new genes within the Major Histocompatibility Complex (MHC) region on chromosome 6 that contribute to the risk of developing type 1 diabetes, independently of the known linkage disequilibrium (LD) between the human leucocyte antigen (HLA)-DRB1, -DQA1 and -DQB1 genes.
We have developed a novel method that combines single nucleotide polymorphism (SNP) genotyping data with protein–protein interaction (ppi) networks to identify disease-associated network modules enriched for proteins encoded from the MHC region. Approximately 2500 SNPs located in the 4 Mb MHC region were analysed in 1000 affected offspring trios generated by the Type 1 Diabetes Genetics Consortium (T1DGC). The most associated SNP in each gene was chosen and genes were mapped to ppi networks for identification of interaction partners. The association testing and resulting interacting protein modules were statistically evaluated using permutation.
A total of 151 genes could be mapped to nodes within the protein interaction network and their interaction partners were identified. Five protein interaction modules reached statistical significance using this approach. The identified proteins are well known in the pathogenesis of T1D, but the modules also contain additional candidates that have been implicated in β-cell development and diabetic complications.
The extensive LD within the MHC region makes it important to develop new methods for analysing genotyping data for identification of additional risk genes for T1D. Combining genetic data with knowledge about functional pathways provides new insight into mechanisms underlying T1D.
genetic association; integrative genomics; major histocompatibility complex; protein interaction networks; type 1 diabetes
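The abstract above states that modules were evaluated by permutation but does not describe the procedure. The following minimal sketch shows one common simplification, under stated assumptions: each gene carries the p-value of its most associated SNP, a module is scored by the mean of -log10(p) over its genes, and the null distribution comes from random gene sets of the same size. This gene-set resampling is a swapped-in illustration, not the authors' method, which may instead have permuted the genotype data. All names and values are hypothetical.

```python
import math
import random

def module_score(genes, gene_p):
    """Mean -log10(p) over the genes in a module."""
    return sum(-math.log10(gene_p[g]) for g in genes) / len(genes)

def permutation_p(module, gene_p, n_perm=2000, seed=0):
    """Empirical p-value from random gene sets of the same size."""
    rng = random.Random(seed)
    observed = module_score(module, gene_p)
    pool = sorted(gene_p)  # fixed order so results are reproducible
    hits = sum(
        module_score(rng.sample(pool, len(module)), gene_p) >= observed
        for _ in range(n_perm)
    )
    # Add-one correction so the estimate is never exactly zero
    return observed, (hits + 1) / (n_perm + 1)

# Toy data: 20 hypothetical genes, three of them strongly associated
gene_p = {f"G{i}": 0.5 for i in range(20)}
for gene in ("G0", "G1", "G2"):
    gene_p[gene] = 1e-6

observed, p_value = permutation_p(["G0", "G1", "G2"], gene_p)
print(f"module score = {observed:.2f}, empirical p = {p_value:.4f}")
```

Resampling gene sets ignores LD between neighbouring genes, which is substantial in the MHC, so a genotype-level permutation would be the more conservative choice in practice.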