The analysis of copy number variations (CNVs) is an emerging tool for identifying genetic factors underlying complex traits. In this chapter I will review studies showing that CNVs play a role in the development of two such complex traits: schizophrenia (SZ) and bipolar disorder (BD). There are two aspects to consider regarding the role of copy variations in these conditions. One is gene discovery, in which DNA from patients is analyzed for the purpose of identifying rare, patient-specific CNVs that may be informative to a larger population of affected individuals. The model for this concept is based on the emergence of DISC1 as a SZ candidate gene, which was discovered in a single informative family with a rare chromosomal translocation. Another aspect revolves around the idea that polymorphic CNVs found in the general population, many of which appear to disrupt previously identified SZ and BD candidate genes, contribute to disease pathogenesis. Here, gene-disrupting CNVs are viewed in the same manner as functional SNPs and analyzed for involvement in disease susceptibility using genetic association. Although the analysis of CNVs in patients with psychiatric disorders is in its infancy, informative new findings have already been made, suggesting that this is a very promising line of research.
Classic cytogenetic analysis has been part of mainstream genetic culture for decades and has been instrumental in gene discovery and understanding disease pathogenesis; from cancer to mental illness to stillbirths. The mindset created by classic cytogenetics is that, with few exceptions, chromosomal alterations are bad for your health; they are relatively rare and when found are almost always pathogenic, associated with some disease state or catastrophic syndrome. This way of thinking was biased in some respects because in order to detect chromosomal changes, they had to be large-scale, on the order of several megabases, involving many genes, thereby increasing the likelihood that a phenotype will emerge. The discovery of CNVs only a few short years ago has turned this notion upside down; chromosomal alterations, it turns out, abound in the human genome as polymorphic copy variations (Iafrate et al., 2004; Sebat et al., 2004; Pinkel and Albertson, 2005; Feuk et al., 2006; Estivill and Armengol, 2007; Kehrer-Sawatzki, 2007; Pinto et al., 2007). So far, ~25,000 CNVs and ins/del variants >100 bases have been cataloged, affecting a large fraction of annotated genes (see Database of Genomic Variants; http://projects.tcag.ca/variation/).
We cut our scientific teeth on the notion that there are two copies of every gene, except for those on chromosomes X and Y. As for the double dose of X-linked genes in women, these are neatly compensated by random inactivation of one X chromosome in every cell. This standard view of gene dosage has taken a series of hits over the past few years. Chromosomes and the genes they house are not fixed structures in which everyone expresses every gene that is programmed for activation in cells; they are dynamic entities with molecular tricks that affect expression in many ways. First, there was the discovery of imprinted loci in which single alleles are expressed in a parent of origin manner (Kelsey, 2007; Davies et al., 2008). Then came the discovery that X chromosome inactivation is imperfect – the sexes do indeed differ in the number of X-linked genes expressed in cells from females since some partially escape the inactivating effects of Xist (Ng et al., 2007; Johnston et al., 2008). Finally, several gene families are expressed in a monoallelic manner (Kaneko et al., 2006; Ribich et al., 2006).
However, none of these phenomena are as paradigm shifting as CNVs. Imprinted loci and monoallelic expression are standard operating procedure, part of the molecular homeostasis of cells that have passed the test of evolution. Not so for CNVs. While an individual might have several involving dozens of genes, his or her next-door neighbor could have a completely different repertoire of disrupted loci. What effect does this genomic heterogeneity have on phenotype? Many of the CNVs that affect coding regions and regulatory domains are neutral, with relatively inconsequential phenotypic effects. Their presence in the genome is maintained by genetic drift or other non-selective factors: olfactory receptor genes, which are scattered throughout the genome, are often affected by CNVs and may be an example of this phenomenon. Other CNVs result in the gain or loss of genes, or their regulatory domains, that could conceivably influence phenotype – both physical and behavioral. If this assumption is correct, copy number heterogeneity may contribute to the development of common complex traits. Indeed, CNVs may have as influential an effect on phenotype as functional SNPs, perhaps more so (Iafrate et al., 2004; Sebat et al., 2004; Goidts et al., 2006; Nguyen et al., 2006; Redon et al., 2006; McCarroll and Altshuler, 2007). Some CNVs are already known to influence phenotype. One affecting the CCL3L1 gene, for example, may be involved in HIV susceptibility, and CYP2D6 copy gains are known to increase the metabolism of certain drugs (Arenzana-Seisdedos and Parmentier, 2006; Zanger et al., 2007). In addition, deletions in the α-globin locus on chromosome 16 are responsible for most cases of α-thalassemia in Africa and Southeast Asia, heterozygosity for which improves survival against falciparum malaria (Williams, 2006).
Geneticists have been successfully cataloging CNVs for the past few years, and their rate of discovery greatly exceeds our current understanding of their phenotypic effects. It seems clear, though, that these dramatic alterations of the human genome will prove to be major players in phenotypic variability, both normal and pathological. In this chapter, I will consider their potential role in the susceptibility to develop schizophrenia (SZ) and bipolar disorder (BD), two of the more complex of complex traits.
The World Health Organization includes both SZ and BD in the top ten of the most disabling conditions in the world. Although they are far less common than cardiovascular disease and cancer, these psychiatric disorders are particularly incapacitating in terms of years lost to disability because they strike early in life, usually in adolescence, and generally persist for many decades, ending in suicide or the medical consequences of nicotine addiction, alcohol and drug abuse, and neglect. SZ and BD have been studied from a variety of perspectives, but the most active area of basic research, and arguably the one with the most potential for gaining an understanding into disease pathogenesis, has been the widespread effort to uncover their genetic basis. I will not review the genetics of SZ and BD, but will instead refer readers to several comprehensive reviews on the subject (Harrison and Weinberger, 2005; Straub and Weinberger, 2006; Kato, 2007; Smoller and Gardner-Schuster, 2007; Sun et al., 2008). The bottom line: there have been many advances, with more than a hundred candidate genes identified, although only a small handful can be viewed as very probable at this time. One of the main obstacles in interpreting genetic findings in SZ and BD is that extensive locus heterogeneity is a feature, as judged by the positive linkage signals that have been found on virtually every chromosome. It is also widely assumed that allelic heterogeneity is a factor as well, since association signals have been detected in several gene regions in some of the larger SZ candidate genes, such as NRG1 (Stefansson et al., 2002, 2004; Lachman et al., 2006; Thomson et al., 2007; Georgieva et al., 2008). Because of extensive heterogeneity, the effect sizes for individual markers in recently published genome wide association studies (GWAS) have been small. Consequently very large sample sizes and multiple replication studies will be needed for confirmation. 
Life for the psychiatric geneticist is also made worse by the difficulty investigators have had identifying functional alleles in candidate genes. The dearth of candidate functional alleles is a contributing factor to the small effect size of associated markers in GWAS; establishing associations based on LD with non-functional markers (usually tag-SNPs) can lead to underestimates of effect size in conditions associated with allelic heterogeneity. I believe that considering copy variations as disease-causing functional variants may address this problem.
There are two ways to approach the analysis of CNVs in mental disorders. One is as a gene discovery tool for identifying rare but informative patient-specific CNVs. The other is to consider polymorphic copy variants (also called copy number polymorphisms – CNPs) present in the general population as functional alleles that can contribute to disease susceptibility, analogous to analyzing the role of a functional SNP using genetic association. In this section, I will address the gene discovery aspect first.
Although linkage analysis and, more recently, genome wide association have been the mainstays of gene discovery in psychiatric genetics, classic cytogenetic analysis has also made an important contribution. The idea is that large-scale cytogenetic abnormalities, while quite rare, can be informative to a much larger patient population by targeting a specific gene or pathway. After some two decades of such research, two major discoveries have been made using this approach. One was the finding that a 3-megabase microdeletion on 22q11 involving some two dozen genes causes velocardiofacial syndrome (VCFS), a congenital disorder associated with heart disease and cleft palate, among other physical problems. VCFS is also associated with a very high rate of psychiatric disorders; approximately 30% of children with the condition develop SZ, ultra rapid cycling BD, and other psychiatric problems (Carlson et al., 1997; Papolos et al., 1998; Murphy, 2002; Bassett and Chow, 2008). The 22q11 deletion is one of the most potent genetic risk factors for SZ ever uncovered, increasing the risk by about 30- to 50-fold, similar to the risk of disease if one's monozygotic twin is affected. Although the gene responsible for the physical features of VCFS has been identified, the gene or genes involved in the psychiatric manifestations of the illness have not. Among the more notable genes in the region considered as candidates for the psychiatric features are COMT and PRODH. However, evidence favoring these genes is weak, although their analysis has sparked interesting and productive research (Lachman et al., 1996; Karayiorgou and Gogos, 2004; Paterlini et al., 2005; Li and Wu, 2007).
The other major cytogenetic discovery was a 1;11(q42.1;q14.3) translocation, which disrupts the DISC1 gene, found in a large Scottish family with SZ and other psychiatric problems (St Clair et al., 1990; Ekelund et al., 2001, 2004). The scientific windfall emanating from the 1;11(q42.1;q14.3) translocation discovery has been remarkable. DISC1, in fact, has been referred to as the Rosetta stone of SZ genetics (Ross et al., 2006; Brandon, 2007). It's not hyperbole. One outcome has been an increased focus on 1q42 in BD and SZ, which resulted in establishing linkage and association to DISC1 in a much wider subset of patients, who presumably have more subtle mutations underlying disease susceptibility (Ekelund et al., 2004; Craddock et al., 2005; Hamshere et al., 2005). In addition, studying the role of DISC1 protein in the brain has provided a tremendous amount of knowledge on disease pathogenesis. The 1;11 translocation generates a truncated DISC1 protein in which the carboxy-terminal end is deleted. DISC1 is expressed primarily in the hippocampus and binds to nudel, a protein involved in neuronal migration during fetal development (James et al., 2004; Callicott et al., 2005). A recent study revealed that DISC1 forms part of the microtubule-associated dynein motor complex and that the truncated variant acts as a dominant-negative inhibitor by causing dissociation of the DISC1-dynein complex from the centrosome, thereby impairing cortical development (Kamiya et al., 2006; Taya et al., 2007). Finally, DISC1 binds other proteins, the identification of which has uncovered new genetic pathways to disease. For example, a family with psychiatric disorders was found to harbor a 1;16(p31.2;q21) translocation involving PDE4B, which codes for a DISC1 binding protein (Millar et al., 2005). (As discussed below, PDE4B and other DISC1 binding proteins are disrupted by polymorphic CNVs.)
Thus, finding the rare 1;11(q42.1;q14.3) translocation in a single family led to the identification of new molecular and genetic pathways underlying SZ and BD. (Searching for novel, rare chromosomal abnormalities to help understand disease pathogenesis in a larger patient population has also been a productive research area in autism, dyslexia and speech and language disorders (Feuk et al., 2006).)
With this background in mind, investigators have just recently begun to analyze DNA from patients with SZ for submicroscopic chromosomal deletions and duplications (CNVs, in other words) hoping to find unique, informative structural defects (Cantor and Geschwind, 2008; Kirov et al., 2008a; Walsh et al., 2008). One such study using high-resolution tiling path BAC arrays in 93 patients was reported recently (Kirov et al., 2008a). This group found a novel, 250-kb deletion affecting the 5′ end of NRXN1. This is a particularly plausible candidate gene for SZ pathogenesis considering its key role, along with other members of the neurexin family, in GABAergic and glutamatergic synaptic differentiation (see Discussion). In addition, NRXN1 deletions, as well as deletions and point mutations affecting members of the neuroligin gene family, which code for the synaptic binding partners of neurexins, have been found in autism and mental retardation, two neuropsychiatric disorders related to SZ through some overlap of symptoms (Laumonnier et al., 2004; Tabuchi et al., 2007; Kim et al., 2008; Lawson-Yuen et al., 2008). A structural variant affecting NRXN1 was also found by Walsh et al., who carried out the most comprehensive study to date to assess copy variants in SZ (Walsh et al., 2008). They used array CGH at a 100-kb resolution and analyzed 150 individuals with SZ and 268 controls. Combining all rare variants, the investigators found that 15% of patients had novel CNVs compared with 5% of controls. The percentage of affected patients increased to 20% in the young-onset subgroup. One of the CNVs was a novel 115-kb deletion on 2q16.3 affecting NRXN1; the deletion was found in a pair of identical twins concordant for childhood onset SZ.
Similarly, Friedman et al. (2008) recently identified a copy variation affecting CNTNAP2 in three unrelated patients with epilepsy and SZ, but not in 512 controls. CNTNAP2 codes for a member of the neurexin superfamily and has recently been implicated in autism using classic genetic linkage and association analysis (Alarcon et al., 2008; Arking et al., 2008).
In addition to the CNVs affecting NRXN1, several others relevant to SZ pathogenesis were detected. Kirov et al. (2008a) found a patient with a duplication affecting APBA2, which codes for a member of the X11 family of adaptor scaffolding proteins involved in trafficking membrane proteins, including beta-amyloid precursor and the synaptic vesicle exocytosis machinery. There is also some evidence that X11 proteins interact with neurexins, which suggests convergence on a common neural or molecular pathway (Biederer and Sudhof, 2000; Mori et al., 2002). Also, Walsh et al. (2008) found a novel deletion affecting ERBB4, which codes for the tyrosine kinase receptor for NRG1, a multifactorial growth factor encoded by one of the most well established SZ candidate genes. The deletion in ERBB4 affects coding elements and is different from the CNVs reported in the gene that were detected in control populations, which affect introns (see table and Database of Genomic Variants). Pathway analysis of the genes affected by all of the novel CNVs found by Walsh et al. showed over-representation of a number of different pathways, most notably synaptic long-term potentiation, neuregulin signaling and nitric oxide signaling, all consistent with existing genetic and molecular models of SZ.
Recently, two very comprehensive studies were published in which copy variations were analyzed in patients with schizophrenia using high-density SNP arrays. In one, 66 de novo CNVs were identified by analyzing 9,878 transmissions from parents to affected offspring (Stefansson et al., 2008). These were subsequently tested for association in 4,718 schizophrenia patients and 41,199 controls. Rare deletions at 1q21.1, 15q11.2 and 15q13.3 were found to be significantly increased in patients. In the other study, a significant increase in CNVs greater than 100 kilobases in length was found in 3,391 patients compared with 3,181 controls (The International Schizophrenia Consortium, 2008). Similar to the findings by Stefansson et al. (2008), deletions involving 15q13.3 and 1q21.1 were significantly increased in patients. In addition, deletions at NRXN1 and CNTNAP2 were found in patients (Table 1).
Through linkage analysis, and more recently GWAS, a number of candidate genes for both SZ and BD have been mapped (reviewed by Straub and Weinberger, 2006; Wellcome Trust Case Control Consortium, 2007; Baum et al., 2008; Sklar et al., 2008; Sun et al., 2008). Among the more notable are NRG1, DTNBP1, DAOA, DAO, and DISC1 (although DISC1, as noted above, was initially identified by a chromosomal translocation). Investigators have reported potentially interesting functional SNPs in these and other candidate genes. However, no bona fide disease-causing allelic variants have been unequivocally identified so far. One reason, as described earlier in this review, could be that allelic and locus heterogeneity reduces the effect size of any individual variant, even a functional one, resulting in the need for very large sample sizes to establish statistical significance. Another is that disease-associated functional alleles are not necessarily the ‘usual suspects’ – those found in exons, intron-exon junctions and promoters. Instead, alterations in enhancers and other regulatory domains that affect gene expression in a quantitative or temporal/spatial manner, and which may be located tens, even hundreds of kilobases up or downstream of a gene's coding elements, or within introns, will need to be identified and analyzed (Lemon and Tjian, 2000; West and Fraser, 2005). In addition synonymous SNPs, which can occasionally disrupt gene expression through a variety of methods, could also be involved (Qu et al., 2006; Nielsen et al., 2007).
However, another possibility is that some of the relevant disease-associated allelic variants are polymorphic CNVs that disrupt candidate gene coding regions and/or regulatory elements. In this model, CNVs found in the general population are present at higher frequencies in patients and contribute to disease risk in combination with other CNVs and/or functional SNPs. This is a very intriguing idea considering the fact that CNVs could have easily escaped detection in the course of resequencing candidate genes using DNA from patients, a widespread practice once a gene of interest has been targeted. Copy variation, especially copy loss, would be expected to have dramatic effects on gene expression through reduced gene dosage, or by unmasking the phenotypic consequences of a recessive allele. Copy gains could also have an adverse effect by increasing gene expression or through a dominant negative action, if truncated transcripts are generated. Thus, functional copy variations affecting coding or regulatory elements in candidate genes are feasible candidate components of the genetic mix underlying SZ and BD susceptibility. In order to test this hypothesis, standard association studies need to be carried out to determine the frequency of polymorphic CNVs using case control comparisons or family-based association tests. Are there such CNVs to analyze? The answer, it turns out, after a perusal of the Database of Genomic Variants, is a categorical yes.
Candidate genes for SZ and BD were chosen for examination based on replicated linkage studies and molecular analysis (Straub and Weinberger, 2006), assessment following extensive literature review (Sun et al., 2008), a probabilistic analysis of genes that interact with pathways and networks of predicted candidate genes (Iossifov et al., 2008), and the strongest positive associations in published GWAS (Wellcome Trust Case Control Consortium, 2007; Baum et al., 2008; Kirov et al., 2008b; Sklar et al., 2008). In addition, given the high-ranking status of DISC1 as a candidate gene for both SZ and BD, genes coding for DISC1 protein binding partners were also assessed. Finally, several candidate genes encoding glutamate receptors, genes involved in controlling circadian rhythm, and protocadherins were analyzed (only those with supportive positive association studies were included in the search), as were COMT and ARVCF, candidate genes for the psychiatric manifestations of VCFS, which are affected by unique CNVs, aside from the classic 3-Mb deletion.
My analysis showed that the coding elements of 82 SZ and BD candidate genes appear to be disrupted by CNVs initially identified in control populations. I use the word ‘appear’ because the precise borders for most CNVs have not yet been determined with great accuracy, so the disruption of coding elements must be viewed as provisional at this time. (Also, I will refer to these as polymorphic CNVs, although technically, some are not polymorphic in the strictest genetic definition since their frequencies in the population are less than 1%.) Among these 82 are 32 that affect the top 75 candidate genes from the Sun et al. (2008) review. In addition to the CNVs that appear to disrupt coding elements, 39 genes have CNVs in introns and nearby 5′ and 3′ flanking regions, most notably DTNBP1, NRG1, ERBB4, FAT, and NRXN3 (bottom panel). The effect of copy variations involving non-coding regions, however, is far from clear, and molecular analysis would have to be carried out to establish functionality.
Remarkably, five of the top candidate genes (DGKH, SORCS2, DFNB31, A2BP1, and NXN) identified in the first published GWAS in BD (Baum et al., 2008) are affected by polymorphic CNVs. DGKH codes for diacylglycerol kinase eta, an enzyme intimately involved in PIP2-mediated signal transduction by terminating the action of diacylglycerol. PIP2 has long been considered a molecular target for understanding BD susceptibility based on early observations regarding lithium's inhibitory effect on a key enzyme in phosphoinositide recycling, myo-inositol 1-phosphatase (Berridge and Irvine, 1989). In addition to its effect on PKC, DGKH, along with DFNB31, A2BP1, and NXN, also influences the Wnt/GSK3β signaling pathway, another lithium-inhibited pathway. GSK3β itself is affected by several copy gain and copy loss variants (Gould et al., 2006). This convergence of genetic, molecular, and pharmacological findings makes these newly targeted candidate genes, and the polymorphic CNVs within them, compelling variants to analyze in BD susceptibility.
The second GWAS in BD (Wellcome Trust Case Control Consortium, 2007) also provided evidence for associations to several genes that are disrupted by copy variations. One is DPP10, which codes for a protein that binds to potassium channels; DPP10 is affected by several CNVs, two of which appear to disrupt exons. Another is RNPEPL1, which encodes an arginyl aminopeptidase family member and is affected by a rare CNV. Two other BD-associated genes found in this study, GABRB1 and GRM7, have CNVs within introns.
DISC1 itself is disrupted by two relatively rare CNVs – copy gains affecting several 5′ DISC1 exons and the promoter. These are interesting in view of the suggestion that the truncated protein resulting from the 1;11 translocation acts as a dominant-negative inhibitor (Kamiya et al., 2006; Taya et al., 2007). A copy gain of a portion of DISC1 could theoretically reduce gene expression through a similar dominant negative effect on the remaining intact copies. In addition, genes coding for DISC1-binding proteins are also disrupted by CNVs including PDE4B, which has a polymorphic copy loss that appears to obliterate the promoter and 5′ exons, and FEZ1, NDEL1, MFAP1, and ATF5.
Several 22q11-linked genes are disrupted by CNVs including COMT, the well-known SZ candidate gene, along with its immediate next-door neighbor, ARVCF, and PRODH. GRK3, which maps outside of the VCFS deleted region, but has also been targeted as a BD candidate, is affected by several CNVs (Barrett et al., 2007).
Finally, both NRXN1 and APBA2, which, as noted in the previous section, are disrupted by patient-specific deletions, are also affected by CNVs found in the general population. Two NRXN1 CNVs appear to disrupt coding elements. The frequency of each is low (<0.5% of controls have the variant). A large copy loss on chromosome 15 affecting APBA2 has been found in one out of 506 controls. These would be very interesting variants to analyze using genetic association, but obviously very large samples would be needed to determine significance. The chromosome 15 deletion is also of interest because it includes CHRFAM7A, a hybrid gene formed from a partial duplication involving the nicotinic receptor subunit gene, CHRNA7, and FAM7. Both CHRNA7 and CHRFAM7A are subject to smaller and more common copy gains and losses.
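To give a rough sense of what "very large samples" means for a variant this rare, the following sketch applies a textbook normal-approximation sample size formula for comparing two proportions. The control frequency of 1 in 506 is taken from the text; the threefold enrichment in patients, the 5% significance level, and the 80% power are hypothetical values chosen only for illustration, and the normal approximation is itself crude for events this rare.

```python
from statistics import NormalDist


def n_per_group(p_cases, p_controls, alpha=0.05, power=0.80):
    """Approximate subjects needed per group to detect a difference in
    carrier frequency (two-sided test, normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_cases + p_controls) / 2
    pooled = (2 * p_bar * (1 - p_bar)) ** 0.5
    unpooled = (p_cases * (1 - p_cases) + p_controls * (1 - p_controls)) ** 0.5
    return (z_alpha * pooled + z_beta * unpooled) ** 2 / (p_cases - p_controls) ** 2


control_freq = 1 / 506   # APBA2 copy loss frequency in controls (from the text)
assumed_ratio = 3.0      # hypothetical enrichment in patients, not a reported value
n_required = n_per_group(assumed_ratio * control_freq, control_freq)
print(f"Approximate subjects needed per group: {n_required:.0f}")
```

Under these assumptions the requirement runs to several thousand subjects per group, well beyond the sample sizes of the early CNV studies discussed in this chapter.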
Despite the promise, there have only been a few studies in which polymorphic CNVs were analyzed by genetic association in psychiatric and control populations. The first was a report by Flomen et al. (2006) who genotyped CHRFAM7A copy number in 208 patients with SZ, 217 with BD and 28 with other psychotic disorders, as well as 197 controls. They found a modest association (P < 0.04) between low CHRFAM7A copy number and psychosis, although the statistical significance was lost after the P value was adjusted for multiple testing.
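The loss of significance after correction is easy to see with a simple Bonferroni adjustment. This is only a sketch: the text does not state which correction method or how many tests were involved, so the numbers of tests below are hypothetical.

```python
def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted P value: the nominal P times the number of tests, capped at 1."""
    return min(1.0, p_value * n_tests)


nominal_p = 0.04  # upper bound on the nominal P value quoted in the text
for n_tests in (1, 2, 3, 5):  # hypothetical numbers of tests
    adjusted = bonferroni(nominal_p, n_tests)
    verdict = "significant" if adjusted < 0.05 else "not significant"
    print(f"{n_tests} test(s): adjusted P = {adjusted:.2f} ({verdict} at 0.05)")
```

With a nominal P value this close to 0.05, even a second test is enough to push the adjusted value above the conventional threshold.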
In the first report using CGH arrays in SZ, reported by Moon et al. (2006), a total of 35 CNVs were identified in 30 patients with SZ, 22 gains and 13 losses. All of the reported CNVs were present in 13–52% of subjects, suggesting that these are polymorphic in the general population, in which case their relationship to SZ would require a larger sample size and a control group. For example, they reported a copy loss variant on 1p21.1 involving AMY2A and AMY1A in 14% of their patients. Another affects SARDH. However, CNVs involving these genes are highly polymorphic in control populations and are found at frequencies similar to those reported by Moon et al. in their SZ subjects. They did not report rare, potentially patient-specific CNVs.
A subsequent study by Wilson et al. (2006) had similar issues. This group analyzed DNA from 105 brain samples equally divided among controls and patients with BD or SZ. They found some evidence for copy variations in genes involved in glutamate signaling, which is consistent with current dogma, using BAC arrays. Two of the genes they highlighted – AKAP5 and CACNG2 – are affected by CNVs found in ~3% of the general population. Thus, the significance of their finding of two patients out of 70 with copy variations affecting these genes cannot be assessed with the sample size analyzed.
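The point about sample size can be made concrete with a simple binomial calculation. As a simplifying assumption, the sketch below treats the ~3% population frequency as a single carrier frequency and asks how often two or more carriers would turn up among 70 subjects by chance alone.

```python
from math import comb


def prob_at_least(k, n, freq):
    """Probability of k or more carriers among n subjects,
    given a per-subject carrier frequency freq (binomial model)."""
    return 1.0 - sum(comb(n, i) * freq**i * (1 - freq) ** (n - i) for i in range(k))


n_patients = 70       # patients with BD or SZ in the brain sample series
carrier_freq = 0.03   # approximate population frequency quoted in the text
expected_carriers = n_patients * carrier_freq
p_two_or_more = prob_at_least(2, n_patients, carrier_freq)
print(f"Expected carriers by chance: {expected_carriers:.1f}")
print(f"P(2 or more carriers by chance): {p_two_or_more:.2f}")
```

Under this model about two carriers are expected by chance, and observing two or more is more likely than not, so the observation by itself carries no evidential weight.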
A recent study by Sutrala et al. (2007) reported an analysis of 85 patients with SZ for copy variations in 891 candidate genes using an allele quantitation method and DNA pooling. Unfortunately, they did not find evidence for any copy variations. However, since single SNPs were used to interrogate the genes, the method would miss copy gains or losses affecting other exons or relevant regulatory domains. In addition, since copy gain and loss often occur in the same region, copy variations could be missed in individual samples when DNA is pooled. The authors correctly pointed out the need to verify putative CNVs occurring in candidate genes for SZ and BD, before widespread association studies are carried out.
Finally, we recently reported a modest increase in copy variations affecting the GSK3β gene locus in BD using quantitative PCR as a genotyping tool (Lachman et al., 2007a). The sample size (n = 500; 225 patients, 275 controls) was an improvement over the above cited studies, but it was still modest by the standards established for complex trait genetics. Consequently, the results should be viewed as preliminary.
The analysis of CNVs in SZ and BD is still in its infancy. Yet, despite only a handful of published reports, there is, in my opinion, already an important finding; the identification of copy loss variants in NRXN1 in a few subjects with SZ. As a lone observation, this could easily have been passed over for want of more supportive evidence. However, there is a growing literature showing that neurexins are involved in several neuropsychiatric and behavioral disorders, including SZ and autism, as noted above. In addition, recent studies suggest that NRXN1 is involved in nicotine addiction (Bierut et al., 2007; Nussbaum et al., 2008). Similarly, another member of the neurexin family, NRXN3, has been implicated in opioid addiction and alcoholism (Hishimoto et al., 2007; Lachman et al., 2007b). These findings point to the strong likelihood that the rare NRXN1 copy deletions identified by Kirov et al. (2008a) and Walsh et al. (2008) are functionally significant variants involved in SZ pathogenesis even though they were detected in a small number of subjects. The findings strongly suggest that genetic variability in the neurexin gene family – both structural and subtle – should be analyzed as candidates for SZ and other neuropsychiatric and behavioral disorders.
Neurexins are also guilty by association because of the finding, as noted above, that structural variants and SNPs affecting various neuroligin family members are associated with autism and mental retardation (Tabuchi and Sudhof, 2002; Lawson-Yuen et al., 2008). Neuroligins and neurexins are synaptic adhesion proteins that make heterophilic contact, leading to the maturation and differentiation of GABAergic and glutamatergic synapses through bidirectional signaling (Ushkaryov and Sudhof, 1993; Graf et al., 2004; Lise and El-Husseini, 2006; Craig and Kang, 2007; Kang et al., 2008). The three neurexin genes code for more than 2,000 different products through alternative promoter usage, leading to alpha and beta isoforms, and by extensive alternative splicing at five different sites (Rowen et al., 2002). A popular model for the underlying neuronal basis of SZ posits that patients have a deficit in glutamatergic transmission. The model is supported by pharmacological observations, such as the SZ-like phenotype that occurs following exposure to phencyclidine (PCP), an NMDA receptor antagonist, and by genetic data showing a convergence of multiple SZ candidate genes on the NMDA receptor complex (Lewis and Lieberman, 2000; Harrison and Weinberger, 2005; Chen et al., 2006). Consequently, copy deletions in genes that play a key role in glutamatergic differentiation should be viewed very seriously as SZ susceptibility variants.
Continuing along this line of thinking, it should be noted that three out of four neuroligin genes (NLGN1, NLGN3 and NLGN4X) are affected by CNVs that appear to disrupt exons, including a copy loss variant in NLGN3 found in 10% of controls. Although not included in the original search of SZ and BD candidate genes, because they have not been viewed as such, these should also be analyzed as potential candidate variants.
It is expected that CNV analysis in SZ and BD will expand greatly in the next few years, especially with new, high density SNP arrays that, in addition to containing informative tagSNPs for GWAS, also contain non-polymorphic markers designed to interrogate copy variations more effectively. Standard SNP arrays contain markers chosen for Mendelian inheritance and Hardy-Weinberg equilibrium, conditions that may be violated in areas associated with polymorphic CNVs, making them less suited for interrogating CNVs. Improved computational analysis of allele intensities will also improve the efficiency of CNV genotyping. GWAS typically include thousands of subjects. Such studies carried out with improved arrays and computational approaches could lead to the identification of rare, patient-specific CNVs, as well as provide an adequate sample size for determining the significance of polymorphic CNVs. So far, there have been no reports of data mining for copy variation analysis in published GWAS in BD and SZ using standard SNP arrays.
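The reason polymorphic CNVs disturb Hardy-Weinberg equilibrium at a SNP can be illustrated with a small model. This is a sketch under stated assumptions: a SNP with alleles A and B lies inside a region that also carries a deletion allele; carriers of one deleted copy are miscalled as homozygotes; carriers of two deleted copies give no call and are dropped. The allele frequencies and sample size are hypothetical.

```python
def called_genotype_freqs(p_a, p_b, p_del):
    """Frequencies of AA, AB and BB calls when a deletion allele is present.

    A/del and B/del carriers are miscalled as AA and BB;
    del/del carriers produce no call and are excluded.
    """
    assert abs(p_a + p_b + p_del - 1.0) < 1e-9
    aa = p_a**2 + 2 * p_a * p_del
    ab = 2 * p_a * p_b
    bb = p_b**2 + 2 * p_b * p_del
    called = aa + ab + bb
    return aa / called, ab / called, bb / called


def hwe_chi_square(f_aa, f_ab, f_bb, n):
    """Chi-square statistic (1 df) for departure of called genotypes from
    Hardy-Weinberg proportions in n called subjects."""
    p = f_aa + f_ab / 2
    q = 1 - p
    observed = (f_aa * n, f_ab * n, f_bb * n)
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical frequencies: a 10% deletion allele versus no deletion at all.
with_deletion = hwe_chi_square(*called_genotype_freqs(0.45, 0.45, 0.10), n=1000)
without_deletion = hwe_chi_square(*called_genotype_freqs(0.50, 0.50, 0.00), n=1000)
print(f"Chi-square with deletion allele: {with_deletion:.1f}")
print(f"Chi-square without deletion allele: {without_deletion:.1f}")
```

The deletion produces an apparent deficit of heterozygotes, so the marker fails a standard equilibrium test (critical value 3.84 at P = 0.05) and would typically be filtered out during array design or quality control, which is why markers selected this way sample CNV regions poorly.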
Are there more rare, submicroscopic copy variations waiting to be uncovered that will be as informative as the 1;11 translocation involving DISC1? With the high resolution of CGH and SNP arrays and the high throughput capacity of array-based strategies, the answer is undoubtedly yes. Are polymorphic CNVs in existing candidate genes identified in control populations involved in disease pathogenesis? Probably, but clearly, extensive analysis in large data sets has to be carried out. The CNVs of interest have to be mapped more precisely and inexpensive and accurate means for genotyping them need to be implemented. In addition to SZ and BD, CNV analysis should also be a valuable tool for studying the genetics of other neuropsychiatric conditions, such as OCD, Tourette's, and major depression, and behavioral traits, such as homosexuality, which have underlying genetic contributing factors that have been difficult to identify. These complex traits may very well have their own Rosetta stones, carved in gene-disrupting CNVs, waiting to be discovered.
The author is supported by a grant from the Juvenile Bipolar Research Foundation and NIMH (R01MH073164).