Suggestive evidence for linkage was detected on chromosome 8p21 in multiple analyses: in nonparametric, dominant and recessive analyses of 707 European-ancestry families, and in nonparametric and dominant analyses of all 807 families.
This same region produced suggestive evidence for linkage (and the largest peak) in the independent Molecular Genetics of Schizophrenia (MGS) sample35
of 409 European-ancestry and African American families. Our peak results were between 45.9–46.8 cM (between rs1561817 and rs9797, 26.59–27.65 Mb; deCODE linkage map and genome build 36.3 physical locations). The MGS peak was at 43.3 cM for all families (near rs196886 at 24.79 Mb), while in European-ancestry families it was at 15.3 cM (8p23, near rs7834209 at 6.9 Mb), with a slightly smaller peak at 34.6 cM (8p21, near rs34393111 at 20.28 Mb), and suggestive evidence for linkage extended beyond our peak scores. Pulver and colleagues were the first to report preliminary36
and then strongly suggestive evidence10
for linkage of SCZ to chromosome 8p markers in much of the JHU sample that is included here. We previously reported support for 8p linkage in a study of microsatellite markers in a majority of the families in the present analysis13
, consistent with results in this enlarged sample.
The most widely-studied 8p candidate gene is NRG1 (neuregulin 1), found to be associated with SCZ by Stefansson et al.37
in a linkage disequilibrium mapping study of a suggestive linkage peak observed in Icelandic families (there were two 8p peaks in that analysis, with the second one closer to ours), with supportive evidence in some datasets.38
There are several indications that, if there is linkage on chromosome 8p, it is not entirely explained by NRG1. Here, lod scores within one unit of the maximum (1-lod interval) were observed between 21.37–29.36 Mb, whereas NRG1 is between 32.53–32.74 Mb. (The 1-lod interval is a reliable confidence interval in studies of Mendelian disorders, but not for complex disorders.) In the companion meta-analysis paper1
, the second “bin” on chromosome 8 (8.2, 28.1–56.2 cM, ~ 15.7–33 Mb) produced the strongest (suggestive) evidence for linkage in 22 European-ancestry datasets, and was ranked eighth in the analysis of all 32 datasets. NRG1 is at the centromeric edge of that bin (~ 55.7 cM), so one would expect that if it explained the linkage, the signal would extend equally in the centromeric and telomeric directions, but support for linkage was not observed in more centromeric bins (bin 8.3 in the primary analysis, from 56.2–84.3 cM; bin 8.4 in the “20 cM” analysis from 56.2–75 cM; or bin 8.3 in the “30 cM” shifted analysis from 42.15–70.25 cM).1
We hypothesize that there is weak linkage to SCZ on chromosome 8p, due to one or more loci in which there are multiple rare risk-associated SNPs and/or structural variants and/or multiple associated common SNPs. There are other candidate genes on 8p (see discussion in the meta-analysis paper1
), but it is not yet clear what accounts for the evidence for linkage in this region.
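The 1-lod interval referred to above (positions whose lod score lies within one unit of the maximum) can be illustrated with a minimal sketch. The function name `one_lod_interval`, the helper `overlaps`, and the lod curve are hypothetical and chosen only for illustration; they are not the data or software used in this study. Only the NRG1 coordinates (32.53–32.74 Mb) are taken from the text.

```python
def one_lod_interval(positions_mb, lod_scores, drop=1.0):
    """Return (start, end, peak) for the contiguous region around the
    maximum lod score in which every score is >= max - drop."""
    if not positions_mb or len(positions_mb) != len(lod_scores):
        raise ValueError("positions and lod scores must be non-empty and equal length")
    # Index of the maximum lod score (first occurrence if tied).
    peak_idx = max(range(len(lod_scores)), key=lod_scores.__getitem__)
    threshold = lod_scores[peak_idx] - drop
    # Walk outward from the peak while scores stay at or above the threshold.
    left = peak_idx
    while left > 0 and lod_scores[left - 1] >= threshold:
        left -= 1
    right = peak_idx
    while right < len(lod_scores) - 1 and lod_scores[right + 1] >= threshold:
        right += 1
    return positions_mb[left], positions_mb[right], positions_mb[peak_idx]


def overlaps(interval_start, interval_end, gene_start, gene_end):
    """True if the gene's span intersects the interval."""
    return not (gene_end < interval_start or gene_start > interval_end)


# Hypothetical lod curve (NOT the study's results), positions in Mb.
positions = [18.0, 20.0, 22.0, 24.0, 26.0, 28.0, 30.0, 32.0]
lods = [0.4, 0.9, 1.6, 2.1, 2.5, 1.8, 1.2, 0.6]

start, end, peak = one_lod_interval(positions, lods)
# NRG1 coordinates as given in the text (32.53-32.74 Mb).
nrg1_inside = overlaps(start, end, 32.53, 32.74)
print(start, end, peak, nrg1_inside)
```

With this made-up curve the interval does not reach the NRG1 coordinates, which mirrors the kind of comparison made in the text. As noted there, such an interval is not a reliable confidence interval for complex disorders.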
Suggestive evidence for linkage was observed on chromosome 9q in the dominant analysis of all families. Support for this region in other analyses was modest, but not substantially different from the evidence for 8p. This region is not supported by previous linkage findings or the meta-analysis.1
Genomewide significant evidence for linkage allowing for intersite heterogeneity was observed on chromosome 10p12 at 45.6 cM (21.28 Mb). We previously reported modest evidence for heterogeneity in this region14
, and in that report we also reviewed the evidence for 10p linkage reported previously in the NIMH-SGI, VCU/Ireland and part of the Bonn/Perth samples studied here. A significant signal is now seen in the present expanded sample, with a denser marker map, due to allele sharing in the Paris/CNRS, NIMH-SGI and (to a lesser degree) the VCU samples (online Supplementary Table 2
). There is no indication of a high-penetrance signal from a small subset of families: the NIMH sample includes small nuclear families from the general U.S. population; and although there are some large, extended pedigrees in the Paris/CNRS sample from La Réunion Island, most of the families with positive lod scores were small families from the general French population, and no single family had a lod score (Kong-Cox, dominant or recessive) greater than 1.4. Because we combined families from eight previously collected datasets, we do not have a consistent set of clinical ratings across samples to search for a possible clinical basis for linkage heterogeneity. The 10p peak is not supported by meta-analysis1
, and is far from the chromosome 10q peaks observed between 100–110 cM in two independent studies.39,40
Significant heterogeneity (but not linkage with heterogeneity) was seen on chromosome 22q at 15 Mb, adjacent to the typical region (17–21 Mb) of the 22q11 deletion syndromes whose manifestations include SCZ in approximately 20% of cases.41
This deletion was detected in less than 0.5% of SCZ cases in two recent large studies.42,43
No consistent association signals have been observed to date between SCZ and common SNPs in candidate genes within the deletion region.
Two other regions, on chromosomes 8q24.1 and 12q24.1, produced suggestive evidence for linkage in at least one analysis; both have previously been reported as linked to mood disorders rather than SCZ. On 8q, a combined analysis of genotypes from 11 linkage scans (1,067 families) produced a nonparametric lod score of 3.40 at 134.5 Mb, just telomeric to our 1-lod interval, in an analysis of bipolar-I and bipolar-II cases, but the signal was much smaller in an analysis of only bipolar-I.44
Given that by definition only bipolar-I can include psychosis (usually in around half of cases), one would not predict that the same locus in this region would account for linkage signals to bipolar disorder and SCZ. On chromosome 12q, there have been reports of linkage to major depressive45,46
and bipolar disorders (see review by Barden et al.47
) with peak locations ranging from 97.4 to 126.5 Mb (116–126 Mb in bipolar studies), close to our peak at 111 Mb. Neither region was supported by the SCZ linkage meta-analysis.1
In the linkage meta-analysis1
, genomewide significant evidence for linkage was detected on chromosome 2q (132–162 cM, 121–152 Mb), with some support for linkage across a broad region (118–176 cM and 206–235 cM). In the present study, we see a jagged line across chromosome 2q, reflecting diverse peaks in different samples, although without statistically significant evidence for heterogeneity. Our largest peak was in the nonparametric EUR analysis at 206.6 cM (210.87 Mb). Thus, in our data and in the meta-analysis of 32 datasets, linkage evidence on 2q is intriguing but poorly localized. It was recently reported that a SNP in ZNF804A, at 185 Mb on 2q, produced genomewide significant evidence for association when a large collaborative SCZ association sample was combined with bipolar disorder cases from the Wellcome Trust Case Control Consortium project.48
What is the relevance of linkage studies as the field moves on to GWAS and large-scale resequencing methods? Meta-analysis provides some support for quite modest linkage signals.1
Thus, no gene is likely to have a large effect on overall population risk. In this situation, GWAS methods have better power2
, but (currently) only for common SNPs. GWAS technologies can also detect some but not all copy number variants (CNVs). Recent studies suggest that rare deletions on chromosomes 1q and 15q (as well as 22q11) predispose to SCZ42,43,49,50
; and that SCZ cases also have a small but significant excess of very rare CNVs, some of which might therefore also be pathogenic. These findings support the more general hypothesis of multiple rare genomic events (SNPs, CNVs, other structural changes) influencing risk for a common disease.51–53
High-penetrance CNVs like those on 1q and 15q have effects such as mental retardation and/or autism, consistent with the observation that they reduce fertility and thus are usually de novo
mutations rather than transmitted in families. But most SCZ risk variants probably have smaller effects: the risk to probands’ siblings is around 5%20
, and if one allows for a small proportion of cases to be due to high-penetrance CNVs, the remaining risk should be due to lower-penetrance variants which would thus be transmitted in families. It is possible that weak SCZ linkage signals are in regions where there are multiple rare as well as common risk variants, whose aggregate frequency and effects are sufficient to produce a linkage signal, and whose effects on fertility are not too severe. We refer here both to deleterious transmitted and/or recurring sequence and structural polymorphisms with low population frequencies, and to very rare and thus very deleterious variants that segregate in different families, i.e., extreme allelic heterogeneity.
One approach to finding these variants would be high-throughput resequencing studies of linkage regions. For example, significant differences have been found in the proportions of high- and low-risk individuals carrying very rare non-synonymous coding SNPs for some diseases.54–55
This approach has not yet been attempted for schizophrenia in a large sample, so we lack information to predict the power or optimal design of such studies. If a region in fact contained a sufficient number of rare high-risk variants to produce a linkage signal, then it might be possible to detect them via resequencing, although success would depend on the proportion of subjects or families carrying such variants, and on the extent of locus heterogeneity; i.e., if a small proportion of cases carried rare risk variants at a large number of loci in a linkage region, studies of a feasible sample size might not detect them. It is not known whether it will prove most productive to resequence exons, entire genes with their nearby regulatory regions, or entire linkage regions (given that there are likely to be relevant unannotated intergenic regulatory sequences). Family-based samples might be particularly useful for resequencing studies of linkage peaks, if rare variants were contributing to the signal. But it is also possible that, if these variants are rare precisely because they reduce fertility, they could be more easily found in case-control samples, which are also larger. In our view, multiple strategies should be attempted.
It has also been suggested that the power of GWAS can be increased by upweighting evidence for association based on linkage scores (resulting in a small downweighting of other regions).56
Whether or not this formal approach is used, it would be reasonable to consider linkage findings when selecting genes and regions for dense LD mapping and large-scale resequencing studies.
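The idea of upweighting association evidence in linked regions can be illustrated with a minimal sketch of a weighted Bonferroni test, in which weights average to 1 so that upweighting some SNPs necessarily downweights the rest. The exponential weighting, the function names `linkage_weights` and `weighted_bonferroni`, the `strength` parameter, and all numbers below are assumptions made for illustration; they are not necessarily the scheme of the cited proposal, and they are not results from this study.

```python
import math


def linkage_weights(linkage_scores, strength=1.0):
    """Turn per-SNP linkage scores into positive weights with mean 1.

    Exponential weighting is an assumption made for this sketch."""
    raw = [math.exp(strength * score) for score in linkage_scores]
    mean_raw = sum(raw) / len(raw)
    return [r / mean_raw for r in raw]


def weighted_bonferroni(p_values, weights, alpha=0.05):
    """Reject SNP i when p_i / w_i <= alpha / m (weights must average 1)."""
    m = len(p_values)
    return [p / w <= alpha / m for p, w in zip(p_values, weights)]


# Hypothetical association p-values and linkage scores for four SNPs.
p_values = [0.01, 0.02, 0.5, 0.8]
linkage_scores = [0.0, 2.0, 0.0, 0.0]  # only the second SNP lies under a linkage peak

weights = linkage_weights(linkage_scores)
unweighted = weighted_bonferroni(p_values, [1.0] * len(p_values))
weighted = weighted_bonferroni(p_values, weights)
print(unweighted)
print(weighted)
```

In this toy example the SNP under the hypothetical linkage peak passes the weighted threshold although it fails the unweighted one, while the first SNP, which lies outside the peak, is slightly penalized. This is the trade-off described in the text.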