DTNBP1 is associated with schizophrenia in many studies, but the associated alleles and haplotypes vary between samples.
We assessed nine single nucleotide polymorphisms (SNPs) in this gene for association with schizophrenia in a new sample of 1021 cases and 626 controls from Ireland.
Four SNPs give evidence of association (0.000018<p<0.045), most strongly with the common allele at rs760761. A haplotype of the common alleles of five markers (including rs760761) and the minor allele of rs2619538 overlapping the 5′ end of the DTNBP1 gene also gives evidence for association (p=0.0002). Secondary analyses showed no difference in the association signal based on sex or family history. These results are in agreement with the most consistently observed association with common alleles and common-allele haplotypes, reported in a previous study of Irish cases and controls but not in an Irish high-density family sample. Our results do not support the prior report from a Swedish sample of increased association in cases with a family history of psychotic illness. Comparison of human, chimpanzee and rhesus sequence suggests that rs760761 is a particularly variable position in the primate lineage.
This study provides further evidence from a large case/control sample for association of common DTNBP1 alleles and haplotypes with schizophrenia.
Schizophrenia has a complex etiology and lifetime prevalence of ~1% worldwide. Family, twin and adoption studies consistently demonstrate a strong genetic risk component (Kendler and Diehl, 1993; Maki et al., 2005). Schizophrenia is linked to 6p24-p22 (Straub et al., 1995) and associated with DTNBP1 (Straub et al., 2002; van den Oord et al., 2003). Reported associations vary considerably, but cluster in 8 commonly typed single nucleotide polymorphisms (SNPs). Data from these 8 SNPs yield a stable cladistic solution of 6 common haplotypes numbered throughout as in Figure 1.
SNPs in DTNBP1 were associated with schizophrenia in 14 studies of 16 independent samples (Figure 2). Haplotype 2 was associated with schizophrenia in Irish high-density families (Straub et al., 2002; van den Oord et al., 2003) and Japanese case/controls (Numakawa et al., 2004). Swedish (Van Den Bogaert et al., 2003) and US (Funke et al., 2004) case/controls showed association with the closely related haplotype 5. Common alleles and haplotype 1 show association in 7 independent samples of families from Germany, Israel and Hungary (Schwab et al., 2003), China (Tang et al., 2003; Li et al., 2005), and Bulgaria (Kirov et al., 2004), and case/controls from the UK and Ireland (Williams et al., 2004) and Japan (Tochigi et al., 2006). Though initially negative, the Irish sample (Morris et al., 2003) was positive when additional SNPs were typed (Williams et al., 2004). In a small number of studies, other associations were observed: with the low frequency haplotype 6 in Scottish (Li et al., 2005) and Italian (Tosato et al., 2007) case/controls, and with rs3213207 and markers more 3′ in the gene in Spanish case/controls (Vilella et al., 2007) and mixed ancestry families (Duan et al., 2007).
No association was reported in 13 studies of 17 independent samples: in German or Polish (Van Den Bogaert et al., 2003), Australian (Holliday et al., 2006), Korean (Joo et al., 2006), European- or African-American (Pedrosa et al., 2007), European-American (Wood et al., 2007), Dutch (Bakker et al., 2007), UK (Datta et al., 2007), Australian (Peters et al., 2008) and mixed European-ancestry (Sanders et al., 2008) case/controls, or in US or Afrikaans (Hall et al., 2004), Canadian (DeLuca et al., 2005), Australian or Indian (Holliday et al., 2006), Taiwanese (Liu et al., 2007) or Finnish (Turunen et al., 2007) families.
We tested DTNBP1 for association with schizophrenia in a new Irish case/control sample. We also assessed whether association varied by family history of illness or by sex as previously reported in some studies (Van Den Bogaert et al., 2003; Chen et al., 2007). We further assessed human and primate sequence data to determine the ancestral alleles of the SNPs studied here.
Cases were ascertained from in-patient and out-patient psychiatric facilities in Ireland and Northern Ireland, using the Structured Clinical Interview for DSM III-R, Patient version with an expanded psychosis section (Spitzer et al., 1987). DSM-III-R criteria were used for consistency with our family collection (Kendler et al., 1996). Detailed personal interviews and hospital record rating forms were completed for each proband. Subjects with a field diagnosis of schizophrenia or poor-outcome schizoaffective disorder were eligible if all four grandparents were born in Ireland or the UK. Schizoaffective disorder was defined as poor outcome if the subject was found on interview to have significant psychosocial deterioration and negative symptoms, and a course of illness consistent with chronic schizophrenia. The ICCSS sample includes 1021 cases.
Controls (N=626) were recruited from donors at the Northern Ireland Blood Transfusion Service (N=554) and from the Irish national police (N=38) and army reserve (N=34). Controls were asked about past history of psychotic illness and were eligible if they reported no history of psychotic illness and all four grandparents born in Ireland or the UK. The mode of sample collection did not allow for more detailed assessment. All participants gave appropriate informed consent. Recorded sex was verified by X/Y genotypes; cases were 68% male and 32% female, controls were 55% male, 45% female. This sample has ≥90% power to detect effects with minor allele frequency (MAF) ≥20% and a genotype relative risk (GRR) ≥1.3 at an alpha of 0.001.
Ireland has had a stable, homogeneous population (Cavalli-Sforza et al., 1994; Relethford, 1983; Sunderland et al., 1973) for ~15,000 years (Hill et al., 2000; Semino et al., 2000; McEvoy et al., 2004), with a genetic structure minimally influenced by human migrations over the last three millennia (McEvoy et al., 2004). In the experience of the blood bank staff, non-Irish donors are very rare, in agreement with the history of minimal in-migration to Ireland, and would have been excluded on the basis of questions about their grandparents.
Family history research diagnostic criteria (FH-RDC) (Andreasen et al., 1977) were assessed by the FH-RDC interview; cases were asked to report on psychotic illness in relatives (Endicott et al., 1978). The FH-RDC (kappa 0.49±0.02) compares favorably with best-estimate diagnosis (kappa 0.43±0.02) from interview of family members. Validated against best-estimate diagnosis, sensitivity was 0.37 and specificity was 0.996 (Roy et al., 1996). The FH-RDC criteria and instrument thus miss true illness in relatives but almost never produce a false positive.
We defined positive family history (FH+) by report of 1 or more first-degree relatives with schizophrenia or unspecified functional psychosis. We restrict the definition to first-degree relatives because we expect greater reliability of reporting for immediate family members. Unspecified functional psychosis required positive report of one or more specific symptoms (delusions, hallucinations, incoherence, or bizarre behavior) not due to a mood disorder. FH information is available for 739 cases (72.4%). Under our definition, there are 196 FH+ cases. Because family history information was not collected for controls, we cannot use a logistic regression approach to assess SNP×FH interaction. Rather, we compare these 196 FH+ cases to the most conservatively defined 478 FH− cases reporting no psychotic illness in either first or second degree relatives. This approach has ≥65% power to detect differences between the FH positive and negative subsamples assuming minor allele frequency (MAF) ≥20%, allelic odds ratio (OR) ≥1.3 and an alpha of 0.05.
We genotyped 9 DTNBP1 SNPs (Table 1) allowing reconstruction of previously observed haplotypes (van den Oord et al., 2003). We included rs2619538 and rs2619539 because of their importance in defining association in a previous study of DTNBP1 in Irish case/controls (Williams et al., 2004). Markers were genotyped using fluorescence polarization detection of template-directed dye-terminator incorporation (FP-TDI) with appropriate AcycloPrime SNP detection kits for specific polymorphisms (PerkinElmer, Boston) and an automated allele scoring platform (van den Oord et al., 2003). Primer sequences are available on request. We genotyped 41 samples (11 cases and 30 controls) in duplicate, and used discordant genotypes in these duplicates to estimate genotyping error rates. DTNBP1 is transcribed in opposite orientation to the human genome sequence. All SNP data are presented 5′-3′ on the genomic (−) strand, in coding orientation for DTNBP1, with the exception of Figure S1, which is in genomic (+) orientation.
SNPs and haplotypes were analyzed in Haploview (Barrett et al., 2005). We used Haploview default settings and excluded any individual missing >50% of genotypes. We tested for significant (p<0.001) departures from Hardy-Weinberg Equilibrium (HWE) and compared linkage disequilibrium (LD) against HapMap and previous Irish family data. Primary haplotype analyses were performed within blocks defined by the default confidence-interval method (Gabriel et al., 2002). We assessed empirical significance with 100,000 permutations of case and control status because of the significance of the asymptotic p-value for rs760761/p1320.
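The Hardy-Weinberg screen described above can be illustrated with a minimal sketch. It assumes a 1-df goodness-of-fit chi-square on genotype counts, which may differ from the test Haploview applies internally; the function name `hwe_chi_square` and the example counts are hypothetical and are not data from this study.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Return (chi-square, p-value) for a 1-df Hardy-Weinberg goodness-of-fit test.

    Inputs are counts of the three genotypes at one biallelic marker.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)      # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:             # monomorphic marker: nothing to test
        return 0.0, 1.0
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical genotype counts; a marker is flagged if p < 0.001 as in the text
chi2, p_value = hwe_chi_square(400, 180, 36)
flagged = p_value < 0.001
```

A marker would be set aside only when `flagged` is true, mirroring the p<0.001 threshold stated above.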
Markers rs2619538 and rs2619539 have only modest LD with other markers assessed (Figure S1), but were important in defining the specific 3-marker haplotype (rs2619538/SNP A, rs3213207/p1635, rs2619539/p1655) associated in UK and Irish case/controls (Williams et al., 2004). We therefore directly tested haplotypes composed of these three markers; we assessed empirical significance with 5000 permutations of case and control status.
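Empirical significance by permutation of case and control status can be sketched for a single marker as follows. This is an illustration under stated assumptions, not the Haploview procedure: genotypes are coded as counts (0, 1 or 2) of one allele, the statistic is the 1-df allelic chi-square, and the names `allelic_chi2` and `permutation_p` are hypothetical.

```python
import random

def allelic_chi2(case_genos, control_genos):
    """1-df chi-square on the 2x2 table of allele counts by case/control status."""
    a = sum(case_genos)                    # copies of the coded allele in cases
    b = 2 * len(case_genos) - a            # copies of the other allele in cases
    c = sum(control_genos)
    d = 2 * len(control_genos) - c
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom

def permutation_p(case_genos, control_genos, n_perm=5000, seed=0):
    """Empirical p-value: share of label shuffles with a statistic >= the observed one."""
    rng = random.Random(seed)
    observed = allelic_chi2(case_genos, control_genos)
    pooled = list(case_genos) + list(control_genos)
    n_case = len(case_genos)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)                # reassign status to whole individuals
        if allelic_chi2(pooled[:n_case], pooled[n_case:]) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)       # add-one correction avoids p = 0
```

Shuffling whole individuals rather than single alleles keeps each person's two alleles together, so the null distribution does not depend on Hardy-Weinberg proportions.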
We also tested whether composite LD patterns differ between cases and controls using DPRIME (Zaykin et al., 2006) to compare differences between case and control LD matrices across a region. This test can detect differences in the extent of LD between cases and controls, which might be observed if interacting alleles are present in a gene, and has higher power than haplotypic tests under certain disease models. We analyzed both r² and D′ measures; because r² is sensitive to low MAFs, results from r² comparisons alone can be difficult to interpret. The significance of the LD difference is assessed by randomly permuting case/control status to create a series of null difference statistics against which the actual result is compared. We compared regional LD patterns between cases and controls across all 9 markers (rs2619538-rs2619539) and in a subset of 6 markers (rs2619538-rs2005976) in moderately high LD.
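The two pairwise LD measures named above, and the sensitivity of r² to allele frequency, can be illustrated with a short sketch. It assumes phased two-locus haplotype counts, whereas the composite LD test works from unphased genotypes; the function name `ld_measures` and the counts are hypothetical.

```python
def ld_measures(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, D', r^2) from counts of the four haplotypes at two biallelic loci."""
    n = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / n
    p_A = (n_AB + n_Ab) / n
    p_B = (n_AB + n_aB) / n
    d = p_AB - p_A * p_B                   # raw disequilibrium coefficient
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2

# Hypothetical counts: only three of four haplotypes occur and one locus has a
# rare allele, so D' is at its maximum while r^2 stays low.
d, d_prime, r2 = ld_measures(90, 0, 8, 2)
```

In this hypothetical case D′ is close to 1 while r² is below 0.2, which is the kind of divergence that makes r² comparisons alone hard to interpret when minor alleles are rare.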
To assess differences in association by sex, we coded each SNP in an additive framework and performed logistic regression analyses, including sex and a SNP×Sex interaction term. Given our sample size and assuming a MAF of 0.2, we have >80% power to detect interaction effects with a relative risk of >1.55 in the presence of no main effects beyond those induced by the interaction at a liberal alpha of 0.05. This is consistent with our actual results, where our largest interaction effect was roughly 1.45.
To test for differences in evidence for association between sets of cases defined by FH status, we tested for allelic (1df) association by assessing χ² between FH+ and FH− cases. This analysis was performed using SAS (SAS Institute, 2002).
In order to orient phylogenetic trees, cladistic analyses and other studies of haplotypes generally make the simplifying assumption that common and ancestral alleles are the same. In order to define these patterns without assumptions, we sought to determine the derived and ancestral alleles at the 8 SNP positions studied here by comparison with chimpanzee and rhesus genome sequence data. Human DTNBP1 sequence covering the region studied here and shown in Figure 1 was extracted from genomic contig AL022343, which includes the 5′ 92 kb of DTNBP1 and all SNP positions in this study. To infer the ancestral alleles of these key DTNBP1 SNPs in the primate lineage, human sequence was aligned with syntenic extracts of chimpanzee (NW 107908) and rhesus (NW 001116478) genomic contigs using the MegAlign package within Lasergene7 (DNASTAR, Madison, WI).
We used three bioinformatic approaches to assess evidence of functional significance for associated SNPs. First, we analyzed sequence conservation using VISTA (http://genome.lbl.gov/vista/index.shtml). Second, we searched for possible transcription factor binding sites (TFBSs) in 1000bp of sequence around a SNP of interest in the UCSC Genome Browser (http://genome.ucsc.edu/). Third, we examined overlap of expressed sequence tags (ESTs) with SNPs of interest.
In total, 14/1647 samples (0.85%, 4 cases, 10 controls) were excluded due to missing data. Results are based on 1017 cases and 616 controls (N=1633; 1090 samples (66.18%) missing zero, 387 (23.50%) missing one, 111 (6.74%) missing two, 38 (2.31%) missing three and 7 (0.43%) missing four genotypes). By marker, average genotyping completion was 94.7% (89.3–98.7%).
We genotyped 9 SNPs × 41 samples in duplicate (N=369 genotype pairs) and both genotypes were available for 344 (93.2%); 3 were discordant, estimating our genotyping error rate at 0.9%. All markers satisfied HWE criteria. LD patterns (Figure S1) are in close agreement with those observed previously in other samples. As previously observed, both rs2619538 (SNP A, 2 kb 5′ of start codon) and rs2619539 (p1655, intron 5) show reduced LD with all other markers studied.
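The duplicate-genotyping figures above follow from simple arithmetic, reproduced here as a check. The variable names are illustrative only; all numbers are taken from the text.

```python
# Figures reported in the text for the duplicate-genotyping experiment
n_snps = 9
n_duplicate_samples = 41
pairs_complete = 344        # pairs with both genotypes called
pairs_discordant = 3

pairs_attempted = n_snps * n_duplicate_samples
completion_pct = 100 * pairs_complete / pairs_attempted
error_rate_pct = 100 * pairs_discordant / pairs_complete

print(pairs_attempted, round(completion_pct, 1), round(error_rate_pct, 1))
```

Note that the error estimate is the share of discordant pairs among complete pairs, so its denominator is 344 rather than 369.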
The results of single marker analyses are shown in Table 1. We observed evidence of association between schizophrenia and 4 DTNBP1 SNPs at nominal significance levels of p<0.05. Common alleles were associated with schizophrenia for all but one marker. The most strongly associated SNP was rs760761 (p1320), with allele frequency 0.809 in cases and 0.742 in controls (χ²=18.404, p<1.8×10⁻⁵, allelic OR 1.47). Smaller signals were observed for the minor allele of rs2619538 and the major alleles of rs1474605 and rs3213207 with case/control allele frequency differences between 2.2 and 4.1%. Only the result from rs760761/p1320 remained significant (P=1.8×10⁻⁴) after 100,000 permutations.
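The allelic odds ratio and asymptotic p-value reported for rs760761 can be recovered from the published summary values. The sketch below assumes the standard allelic odds ratio and a chi-square statistic with 1 degree of freedom; the function names are hypothetical, and the inputs are the frequencies and test statistic quoted above.

```python
import math

def allelic_odds_ratio(freq_cases, freq_controls):
    """Odds of carrying the allele in cases divided by the odds in controls."""
    odds_cases = freq_cases / (1.0 - freq_cases)
    odds_controls = freq_controls / (1.0 - freq_controls)
    return odds_cases / odds_controls

def chi2_p_value_1df(chi2):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))

odds_ratio = allelic_odds_ratio(0.809, 0.742)   # allele frequencies from the text
p_value = chi2_p_value_1df(18.404)              # test statistic from the text
print(round(odds_ratio, 2), p_value)
```

The odds ratio rounds to 1.47 and the p-value is approximately 1.8×10⁻⁵, in line with the reported values.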
Results of haplotype analyses are shown in Table 2. Haplotypes were analyzed within the blocks defined by Haploview (Table 2A), and in the specific combination (rs2619538/SNP A, rs3213207/p1635, rs2619539/p1655, Table 2B) associated in UK and Irish case/controls (Williams et al., 2004). We observed association on the haplotype defined by the major alleles of rs1474605-rs2005976 (markers 2–6, χ²=17.919, p=2.31×10⁻⁵, Perm P=0.0003, haplotypic OR 1.44). The observed case/control difference in frequency of this haplotype (6.3%) is almost identical to the allele frequency difference observed for rs760761/p1320 in single marker analyses (6.7%). We observed no evidence for association in the ICCSS sample with any haplotype composed of alleles of rs2619538/SNP A, rs3213207/p1635, rs2619539/p1655 (Table 2B).
We compared LD patterns between cases and controls across all 9 markers (rs2619538-rs2619539) and 6 markers in moderate-high LD (rs2619538-rs2005976), using both r² and D′ measures. We used 1K–100K permutations to generate an appropriate null distribution. In the 9 marker comparison, we found a significant difference between cases and controls for the r² (Z2=0.0011; empirical P=0.01) but not the D′ based (Z2=0.0016; empirical P=0.22) composite LD comparisons. In the 9 marker comparison, the observed difference may result from the sensitivity of r² to MAF noted above. However, in the 6 marker window, we found significant differences comparing both D′ (Z2=0.0033; empirical P=5×10⁻⁴) and r² (Z2=0.0026; empirical P=1×10⁻⁵). These results indicate that there is a difference in the composite LD between cases and controls in DTNBP1, particularly in the 6 SNPs rs2619538-rs2005976 (where both tests were significant). Because both cases and controls are Irish, this difference seems unlikely to be due to population stratification and more likely reflects a real difference in the composite LD structure in this region between cases and controls.
In logistic regression tests of gender differences in association (Table S1), 4 markers gave nominally significant evidence of association, 3 of which were also associated in primary analyses. However, the SNP × Sex interaction term was not significant for any marker tested, providing no evidence for differential association depending on the sex of the case.
Based on the single marker results above, we limited FH analyses to the most significantly associated marker, rs760761 (Table S2). Using the categorical definitions described in Methods, there were 196 FH+ and 478 conservatively defined FH− cases (no illness reported in first or second degree family members) in the ICCSS sample. A genotype at rs760761 was missing for 47 individuals with family history (28 FH− and 19 FH+), leaving a total analyzed sample N=627 (1254 alleles), 450 FH− and 177 FH+. There was no evidence of any difference in association of rs760761 between FH+ and FH− cases (Table S2).
We sought to infer the ancestral allele at key DTNBP1 SNPs. We determined the alleles present in chimpanzee and rhesus reference sequences at the positions corresponding to the 8 haplotype-defining SNPs in Figure 1 (rs1474605-rs3213207, Table 3a). The data are consistent with the human major allele in European populations being ancestral in 3 cases (rs1474605/p1792, rs1018381/p1578, rs1011313/p1325) and derived in 2 cases (rs2619522/p1763, rs2619528/p1765). In 2 cases (rs2005976/p1757, rs3213207/p1635) the human major allele in European populations is present in the chimpanzee and minor allele present in the rhesus reference sequences. Neither human allele at rs760761 is present in chimpanzee and rhesus sequence.
Sequence around rs760761/p1320 in human, chimpanzee and rhesus is shown in Table 3b; the SNP is shown at position 21. Comparisons of the three primate sequences show that neither human allele (C/T) is present in the reference chimpanzee sequence (A). The sequence around this position is ambiguous (G or −) in rhesus due to a 3 bp deletion relative to the human and chimpanzee sequences. The sequence context and alignment are consistent with the deletion occurring at either positions 21–23 (with the position of rs760761 deleted and a rhesus-specific variant (G) at position 24) or positions 24–26 (as shown in Table 3b, with G at the site of rs760761). In either case, diversity is higher than average at the position of rs760761 based on the limited number of SNPs we have assessed so far, but rs760761 is intronic and there is as yet no evidence that this SNP has any functional significance.
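The ancestral-allele reasoning above amounts to a small decision rule, sketched here for illustration. The function name `classify_major_allele` and the category labels are hypothetical; the only alleles taken from the text are those at rs760761 (human C/T, chimpanzee A, rhesus G or a gap, depending on alignment), and which human allele is treated as major in the call does not affect that result.

```python
def classify_major_allele(major, minor, chimp, rhesus):
    """Classify the human major allele using two outgroup reference alleles."""
    human = {major, minor}
    outgroup = {chimp, rhesus}
    if outgroup == {major}:
        return "major allele ancestral"      # both outgroups carry the major allele
    if outgroup == {minor}:
        return "major allele derived"        # both outgroups carry the minor allele
    if outgroup == human:
        return "outgroups disagree"          # one human allele in each outgroup
    if not (outgroup & human):
        return "neither human allele in outgroups"
    return "unresolved"                      # mix of human and non-human alleles

# rs760761 under either rhesus alignment described in the text
with_g = classify_major_allele("C", "T", chimp="A", rhesus="G")
with_gap = classify_major_allele("C", "T", chimp="A", rhesus="-")
```

Both alignments give the same category for rs760761, which is why the choice between the two deletion placements does not change the conclusion about this position.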
The VISTA analysis shows that rs760761 is not conserved between human and mouse, and is located in a LINE repeat element, both arguing against any functional significance. No TFBS was identified anywhere in the 1000 bp of sequence surrounding rs760761, and no known EST overlaps this position. These results are in general agreement with prior bioinformatic analyses of the gene and of specific key SNPs (see Discussion).
We observed evidence for association of schizophrenia with SNPs in DTNBP1, particularly with the common allele of rs760761/p1320 (the only result to remain significant after permutation testing) and a haplotype composed of the common alleles of five markers (including rs760761) and the minor allele of rs2619538. Frequency differences observed for the haplotype (6.3%) and the best SNP rs760761 (6.7%) are similar. The pattern of LD and specific haplotypes were very similar to those observed previously. This large sample provides further support for the involvement of DTNBP1 in schizophrenia.
We observed no evidence for differential association of DTNBP1 by family history. Even with the reduced sample size in our case-only analysis, we examined a substantially larger sample than the 142 cases and 272 controls from Sweden previously reported (Van Den Bogaert et al., 2003). All definitions of positive family history in current use have fairly low sensitivity, as described above, so our non-replication of this finding is unlikely to be due to substantial differences in the sensitivity of the family history definition between studies. We also observed no difference in association by sex, in contrast to the sex difference reported for a locus on chromosome 5 in the ICCSS sample (Chen et al., 2007).
Our results agree with those reported previously in German families and Chinese and Japanese case/control samples. Few studies included rs2619538, so data for comparison are limited; there is no evidence for association of this marker in our data. LD between rs2619538 and other markers is consistently lower than between markers in the defined block. Otherwise, our results also agree with those reported in the study of 2 independent samples of UK and Irish case/controls.
The results of our bioinformatic analyses of rs760761 are in general agreement with the few published reports of such studies of DTNBP1. So far there has been no evidence for function ascribed to any of the intronic SNPs in the gene (including those studied here). A ChIP-chip study of fetal brain tissue identified only one promoter region immediately 5′ of the gene containing the specific histone modifications for which antibodies were included; the closest SNP studied, rs2619538, is ~1.5 kb outside of this putative promoter region (Pedrosa et al., 2009). A study of expression differences by SNP genotype and haplotype detected no evidence of any effect of rs760761 (Bray et al., 2005). Therefore, in common with most reported intronic or intergenic associations, LD between the associated SNPs and other functionally significant variation remains the most likely explanation for our results. Our results are also in keeping with a recent survey of 180 genomewide association studies (http://www.genome.gov/27529020), which showed that association of complex traits with SNPs in non-coding regions of the genome is commonly observed: among 782 unique SNPs with P < 10⁻⁶, only 67 (8.6%) were in coding or regulatory regions, while 340 (43.5%) were in intronic regions and 354 (45.3%) were in intergenic regions.
In comparative assessments of human and primate sequence, the pattern we observe is complex and does not support any simple explanation of the ages and origins of haplotypes observed in the region. There are 5 positions where chimpanzee and rhesus sequence are informative about the likely ancestral alleles with respect to human ones. Considering only these five positions, haplotypes 2 and 6 (Fig. 1) display four ancestral alleles, haplotypes 1, 4 and 5 contain three ancestral alleles, and haplotype 3 contains two ancestral alleles. Neither human allele at rs760761 is present in chimpanzee and rhesus sequence. This SNP was the most significantly associated marker in the present study, and it also gave significant evidence of heterogeneity in allelic association across samples in a recent meta-analysis (Maher et al., 2009) using new methodology suitable for detecting and quantifying between-sample differences (the “flip-flop” phenomenon; Lin et al., 2007) in the scale and/or direction of allelic association.
Neither of the two reported Irish case/control samples supports the results from the Irish high-density family sample, the outlier among samples from Ireland; this is perhaps more consistent with underlying differences in the risk variants between the two sample types. Multiplex schizophrenia families are relatively rare, and affected individuals in such families may carry different and perhaps more highly penetrant risk variants. Conversely, supportive data from a Japanese case/control sample argue against an allele specific to a set of Irish high-density families with a common founder. The results of these two studies along with those in the Swedish and US samples occur on related haplotypes sharing substantial variation. This close evolutionary relationship between haplotypes 2 and 5 raises the possibility of a variant common to both. If this were true, however, both haplotypes should be associated in the samples where one or other is; this has generally not been observed.
Rare variants and particularly structural variation in the human genome have been a focus of some interest in schizophrenia genetic studies recently. Three copy-number variants have been identified in the genomic interval around DTNBP1 (chromosome 6 variations 2620, 37543 and 7536 from the Database of Genomic Variation, http://projects.tcag.ca/variation/?source=hg18). Variants in this region were observed on only 1/540 chromosomes in the largest of the three studies reporting CNVs in this area (Redon et al., 2006), and none overlap the DTNBP1 gene. It thus seems unlikely that structural variation is a major contributor to or confounder of the association signals observed in this region.
The present study has some limitations. First, although our sample is among the largest in which DTNBP1 has been studied and our power is high for MAF ≥20% and/or OR ≥1.3, these values represent the largest effect sizes currently expected in complex traits. Effects below these thresholds would have lower probability of being detected, declining as either or both of the MAF and OR decrease. Second, we genotyped SNPs that had been included in prior studies because these should define the same haplotypes and haplotype blocks (barring any difference in the underlying LD in our sample) as previously reported. This makes our study data directly comparable with prior reports and provides reasonable power, but our genotyped markers do not include SNPs selected on functional grounds. Three nonsynonymous SNPs in DTNBP1 are reported in dbSNP, but all have low MAFs, in the range 1–6%, where the power of our sample is lower. Finally, as we note above, in common with most association studies, our results are likely to reflect LD between the markers typed (for which no functional role has been suggested) and currently unknown variation in the locus.
Fluctuation in association results is common in studies of complex traits, but may not necessarily indicate false positive findings (Gruber et al., 2007; Lin et al., 2007). Certainly, such results are not consistent with an underlying hypothesis of a single risk variant in a gene. The collective data from studies outlined above are most consistent with multiple liability variants in DTNBP1. The evolutionary distance between haplotype 1 and haplotypes 2 and 5 (Figure 1) is consistent with this idea. In conclusion, the results from the present study of DTNBP1 and schizophrenia in a large, ethnically homogeneous sample add further support for the association of common alleles and common haplotypes with schizophrenia.
Supplementary information is available at the Schizophrenia Research website.