Large case–control genome-wide association studies primarily expose common variants contributing to disease pathogenesis with modest effects. Thus, alternative strategies are needed to tackle rare, possibly more penetrant alleles. One strategy is to use special populations with a founder effect and isolation, resulting in allelic enrichment. For multiple sclerosis (MS), such a unique setting is reported in Southern Ostrobothnia in Finland, where the prevalence and familial occurrence of MS are exceptionally high. Here, we have studied one of the best replicated MS loci, 5p, and monitored for haplotypes shared among 72 regional MS cases, the majority of which are genealogically distantly related. The haplotype analysis over the 45 Mb region, covering the linkage peak identified in Finnish MS families, revealed only modest association at IL7R (P = 0.04), recently implicated in MS, whereas the most significant association was found with one haplotype covering the C7-FLJ40243 locus (P = 0.0001), 5.1 Mb centromeric of IL7R. The finding was validated in an independent sample from the isolate and resulted in an odds ratio of 2.73 (P = 0.000003) in the combined data set. The identified relatively rare risk haplotype contains C7 (complement component 7), an important player of the innate immune system. Suggestive association with alleles of the region was also seen in more heterogeneous populations. Interestingly, complement activity also correlated with the identified risk haplotype. These results suggest that the MS predisposing locus on 5p is more complex than assumed and exemplify the power of population isolates in the identification of rare disease alleles.
Multiple sclerosis [MS (MIM #126200)] is a chronic inflammatory disease of the central nervous system (CNS) characterized by multifocal demyelination. MS is a multifactorial trait with complex inheritance and a putative autoimmune pathogenesis (1–5), presenting a fairly uniform prevalence of 100/10⁵ in populations of Northern European origin (6). The strongest and most consistent evidence of linkage and association with MS is found with the HLA class II region on 6p, which has been estimated to explain 15–40% of the genetic fraction of MS etiology (7–9). Recently, the first MS genome-wide association (GWA) study revealed two additional loci, IL2RA (MIM *147730) and IL7R (MIM *146661), associated with an increased risk of MS (10), the latter being simultaneously identified also through a candidate gene approach in independent and overlapping data sets (11,12). Interestingly, the IL7R gene is located in the chromosome 5p region, to which linkage has also been reported in MS families from several populations, including Finns (13–18), and which is syntenic to the experimental autoimmune encephalomyelitis (EAE) locus on mouse chromosome 15 (19).
The new loci identified in the recent International MS Genetics Consortium GWA scan revealed important new information about the pathogenesis of MS but were estimated to explain only 0.2% of the variance in the risk of the development of multiple sclerosis (10). Thus, it is evident that other loci remain to be identified for a more comprehensive understanding of the genetic susceptibility to MS. Since large GWA studies using common SNPs of HapMap identify common variants of disease loci, alternative strategies are needed to identify less common, possibly more penetrant variants contributing to the molecular background of MS. These relatively rare variants with at least moderate penetrance most probably give rise to a familial concentration of MS cases. One approach to tackle the discovery of relatively rare alleles is to use special populations with a founder effect and isolation, where one could hypothesize enrichment of the disease alleles (20). For MS, such a setting is provided by the MS high-risk region of Southern Ostrobothnia in Western Finland, where exceptionally high incidence (12/10⁵) and prevalence (200/10⁵) rates, as well as increased familial occurrence, have been observed in epidemiological studies over three decades (21–26).
As we have previously detected linkage to chromosome 5p in Finnish MS families, enriched for cases from the high-risk internal isolate, we first wanted to use the Finnish GWA data to study whether IL7R alone or possibly together with other loci in this chromosomal region could be identified as a susceptibility locus for MS in the isolate. A set of 72 MS cases originating from Southern Ostrobothnia, the majority of which could be genealogically traced to two founding couples living in this geographical area in the 15th and 16th centuries, was monitored for enriched haplotypes using the HumanHap300 Illumina panel.
We initially wanted to study whether the variants of the IL7R gene on 5p, one of the main findings in the recent international GWA study (10), contribute to MS susceptibility in Finland. The non-synonymous SNP rs6897932, which is the likely causative variant (10,12), and three additional SNPs (rs11567701, rs3194051 and rs6871748) from previous association studies (11,27) were genotyped in 922 Finnish MS cases, of which 197 originate from the Southern Ostrobothnian MS high-risk region, and 1392 population controls (Table 1, FIN-isolate1 + 2, FIN-OUT). There was a slight difference in SNP allele frequencies between cases and controls both in the study set from the high-risk isolate (FIN-isolate1 + 2) and in the study set from the rest of Finland (FIN-OUT), providing modest evidence for association in the combined Finnish study sample [Table 2, rs6897932 and rs6871748, P = 0.002, odds ratio (OR) 1.24 (95% CI: 1.09–1.41); the SNPs are in full linkage disequilibrium (LD) (r² = 1) with each other]. A test for heterogeneity between the data sets from the high-risk isolate and rest of Finland showed no evidence for stratification at this locus, permitting the combined analysis (Mantel–Haenszel corrected P = 0.0001). The OR corresponds to that observed in other studies. However, the associated allele C of the likely causal SNP rs6897932 is less common among Finnish MS cases (0.69) than among cases of other populations reported (Sweden: 0.74; UK and Belgium: 0.76 and USA: 0.78) (11,12).
As the single SNP association observed with markers in the IL7R gene was quite modest in the Finnish MS samples, we wanted to systematically monitor if a haplotype in the whole 45 Mb linked locus on 5p (between markers D5S667 and D5S407, 11.1–56.0 Mb, UCSC Genome Browser, hg18 assembly, March 2006) (Fig. 1A) would be enriched in the internal high-risk isolate and potentially contribute more substantially to MS susceptibility.
Haplotypes generated from the Illumina HumanHap300 SNP panel genotypes were monitored for association with MS in the genealogically linked sample from the internal isolate. Specifically, 72 MS cases (Table 1, FIN-isolate1), having either both parents born within the high-risk region of Southern Ostrobothnia (n = 64) or one Southern Ostrobothnia born parent as well as a family history of MS (n = 8), have been genotyped for the Finnish GWA study (E. Jakkula et al., manuscript in preparation), and these data were first utilized to screen the chromosome 5p-linked region. Forty-one of the 72 MS cases belonged to either one or both of the two large interconnected mega-pedigrees, which we were able to construct via genealogical studies. However, none of the MS cases included in the study were first-degree relatives of each other. We hypothesized that the relatively short history, with common ancestors only 14–16 generations ago (ancestors born in 1490 and 1594), might expose shared haplotypes between the distantly related MS cases. Thus, the linked interval on 5p was screened with a five-SNP sliding window haplotype association analysis using PLINK (28). Sixty-eight identity-by-state (IBS) matched Finns were used as controls (Table 1, FIN-isolate1). We used genome-wide SNP data and identity-by-sharing, multidimensional scaling and identity-by-descent analyses to select these controls so that their genetic background would be similar to that of the cases, as parental birthplace information was not available for all the controls (Supplementary Material, Fig. S1). The genomic inflation factor (λ), calculated using genome-wide single SNP data in PLINK, was 1.0758 for our GWA data set, suggesting that cases and controls are well matched and that there is no large-scale population stratification within our final study set.
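The genomic inflation factor quoted above is conventionally defined as the median of the observed 1-df χ² association statistics divided by the median of the null χ² distribution (about 0.455). The following Python sketch illustrates that definition only; it is not PLINK's code, and the function name and the synthetic input statistics are assumptions made for illustration.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 df, obtained as the squared
# 75th percentile of the standard normal (approximately 0.4549).
NULL_MEDIAN_CHI2_1DF = NormalDist().inv_cdf(0.75) ** 2


def genomic_inflation(chi2_stats):
    """Return lambda = median(observed chi2) / median(null chi2, 1 df)."""
    return median(chi2_stats) / NULL_MEDIAN_CHI2_1DF


# Synthetic statistics (hypothetical, not study data): evenly spaced
# quantiles of the null chi-square distribution with 1 df.
n = 1001
null_stats = [NormalDist().inv_cdf(0.5 + 0.5 * (i + 0.5) / n) ** 2 for i in range(n)]

# Multiplying every statistic by 1.0758 mimics a uniformly inflated scan.
inflated_stats = [1.0758 * x for x in null_stats]

lam_null = genomic_inflation(null_stats)
lam_inflated = genomic_inflation(inflated_stats)
```

Under this definition a value near 1 indicates that the bulk of the test statistics follow the null distribution, which is the sense in which the reported λ is read as evidence against large-scale stratification.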
In the sliding window analysis, only very modest association was observed with the IL7R haplotypes (single haplotype P = 0.043 and omnibus haplotype P = 0.084) (Supplementary Material, Tables S1 and S2). Instead, the haplotype analysis revealed one region (5p13.1 at 41.0 Mb region, P = 0.00029) with omnibus P-values ≤10⁻⁴ when all the haplotypes were compared between cases and controls (Fig. 1A; Supplementary Material, Table S1). This region is located at the C7-FLJ40243 gene locus, 5.1 Mb centromeric from IL7R. The same C7-FLJ40243 haplotype also provided the strongest evidence for association in a single haplotype analysis (P = 0.00017, flanked by SNPs rs1901167–rs10512754), whereas only two other single haplotypes within the MS-linked 5p locus provided a P-value of <0.001: an intergenic region in 5p15.2 at 13.6 Mb (P = 0.00085, flanked by SNPs rs3868287–rs10491318) and the FYB region (MIM *602731) in 5p13.1 at 39.2 Mb (P = 0.00088, flanked by SNPs rs1549690–rs66782) (Supplementary Material, Table S2).
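The five-SNP sliding window used in the scan can be pictured as every run of five consecutive markers, advanced one marker at a time, so that n ordered markers yield n − 4 windows. The sketch below only enumerates such windows; the haplotype phasing and association testing performed by PLINK are not reproduced, and the marker names are hypothetical.

```python
def sliding_windows(snps, size=5):
    """Yield every run of `size` consecutive SNPs, advancing one SNP at a time."""
    for start in range(len(snps) - size + 1):
        yield tuple(snps[start:start + size])


# Hypothetical marker list; real input would be the ordered SNPs of the
# linked 5p interval.
snps = [f"snp{i}" for i in range(1, 9)]
windows = list(sliding_windows(snps))

for window in windows:
    print(" ".join(window))
```

If the linked interval is treated as a single contiguous run of markers, the 3976 windows mentioned in the Bonferroni correction would correspond to 3980 ordered SNPs; that figure is an inference from the window count, not a number stated in the text.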
The C7-FLJ40243 haplotype was further extended with PLINK in both directions until reaching sites where the number of observed haplotypes increased, corresponding to potential recombination spots (shown in Fig. 1B as an increase in the degrees of freedom), resulting in an eight-SNP haplotype. This haplotype had a frequency of 0.183 in MS cases compared with 0.037 in controls in the GWA study sample (P = 0.00013) and it covers the 3′ ends of the C7 (complement component 7 precursor) and the FLJ40243 (hypothetical protein LOC133558) genes (Fig. 1C).
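As a rough plausibility check of the frequency difference reported above, the haplotype frequencies can be converted to approximate chromosome counts and compared with a Pearson χ² test on the resulting 2×2 table. This is a sketch under stated assumptions: the counts are back-calculated from the rounded frequencies and the sample sizes of 72 cases and 68 controls, whereas PLINK works with estimated haplotype probabilities, so the output should only be of the same order of magnitude as the published P-value.

```python
import math


def allelic_chi2(case_hap, case_other, ctrl_hap, ctrl_other):
    """Pearson chi-square (1 df, no continuity correction) for a 2x2 table."""
    a, b, c, d = case_hap, case_other, ctrl_hap, ctrl_other
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper-tail probability of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Assumption: chromosome counts reconstructed from the rounded frequencies.
case_chrom, ctrl_chrom = 2 * 72, 2 * 68
case_hap = round(0.183 * case_chrom)   # risk haplotypes among case chromosomes
ctrl_hap = round(0.037 * ctrl_chrom)   # risk haplotypes among control chromosomes

chi2, p_value = allelic_chi2(
    case_hap, case_chrom - case_hap, ctrl_hap, ctrl_chrom - ctrl_hap
)
print(f"chi2 = {chi2:.2f}, P = {p_value:.2g}")
```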
To confirm the initial finding and to study the genetic variation of the C7-FLJ40243 region in more detail, the region with strongest original association was tagged with 24 HapMap SNPs and analyzed in an independent set of 125 Finnish MS cases having at least one parent born within the Southern Ostrobothnian isolate and 365 population controls from the same geographical region (Table 1, FIN-isolate2). The initial C7-FLJ40243 haplotype association was validated in this larger independent sample from the internal isolate with a P-value of 4.0 × 10⁻⁴ (Fig. 1C, FIN-isolate2). To further estimate the effect size of the 59 kb risk haplotype in the MS high-risk region, the two study sets from the isolate were combined. The frequency of the risk haplotype in this combined study set was 0.12 among MS cases and 0.04 among controls, providing a P-value of 3.2 × 10⁻⁶ (permutation-based P = 5.0 × 10⁻⁵, P = 0.012 after Bonferroni correction for the 3976 haplotype windows tested) and an allelic OR of 2.73 (95% CI: 1.67–4.47) (Fig. 1C, FIN-isolate1 + 2). The main haplotypes of the region (freq ≥ 0.05) and their frequencies in the combined study set from the high-risk region are shown in Table 3.
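An allelic OR and its 95% CI of the kind quoted above can be illustrated from a 2×2 table of haplotype counts. The sketch below uses Woolf's logit method, which is an assumption, since the text does not state how the interval was computed; the counts are hypothetical toy values and are not study data.

```python
import math
from statistics import NormalDist


def allelic_odds_ratio(case_hap, case_other, ctrl_hap, ctrl_other, level=0.95):
    """Return (OR, lower, upper) using Woolf's logit confidence interval."""
    odds_ratio = (case_hap * ctrl_other) / (case_other * ctrl_hap)
    # Standard error of ln(OR) from the four cell counts.
    se_log_or = math.sqrt(
        1 / case_hap + 1 / case_other + 1 / ctrl_hap + 1 / ctrl_other
    )
    z = NormalDist().inv_cdf(0.5 + level / 2)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: 30 of 200 case chromosomes and 10 of 200 control
# chromosomes carry the haplotype of interest.
or_est, ci_low, ci_high = allelic_odds_ratio(30, 170, 10, 190)
print(f"OR = {or_est:.2f} (95% CI: {ci_low:.2f}-{ci_high:.2f})")
```

An interval that excludes 1, as in the combined isolate sample, is what supports reading the OR as a genuine increase in risk rather than sampling noise.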
Additional family members were available for 156 MS patients from the MS high-risk region (Table 1, FIN-isolate1 and FIN-isolate2). The 24 SNPs over the critical C7-FLJ40243 region were further analyzed in these families using the gamete competition test (29) to perform a family based association analysis that avoids any potential population stratification. Family based evidence for association with MS (P < 0.05) was also observed with eight SNPs, with SNP rs6860438 in C7 providing the strongest P-value of 0.006 (Supplementary Material, Table S3).
We next monitored the frequency of the identified C7-FLJ40243 risk haplotype in various populations utilizing the publicly available Human Diversity Panel SNP data. The MS risk haplotype, enriched among the MS cases of the Finnish isolate, was observed to be relatively rare globally: it was found in ~4% of alleles in the general European population, was almost absent in Africans, Southern Americans and Oceanians, and had its highest frequency of ~6% in the Eastern populations (Supplementary Material, Table S4).
To evaluate the role of the identified C7-FLJ40243 risk region in MS susceptibility in more heterogeneous populations, 14 of the 24 tagging SNPs (see Materials and Methods) were genotyped in four independent study sets: Finland outside the MS high-risk region (725 MS cases and 959 controls), Sweden (651 MS cases and 651 controls), Norway (359 MS cases and 471 controls) and USA (920 MS cases and 177 controls) (Table 1, FIN-OUT, SWE, NOR, US). Association of the main haplotypes (alleles with freq ≥ 0.05) was tested in each population sample. Frequencies and P-values for the haplotypes are shown in Table 3.
The relatively rare allele, which was observed to be enriched among MS cases from the high-risk region, had a comparable frequency in control samples of the isolate (0.04) and the other populations of Northern European origin (0.05 in Finland and Norway; 0.03 in Sweden and 0.06 in USA). No evidence for association was seen with this haplotype allele in other study sets. However, the Finnish MS cases were observed to carry another risk allele (freq MS 0.20, controls 0.16, P = 0.003), which was also marginally over-represented in the MS cases from Sweden and USA (Sweden: freq MS 0.18, controls 0.15; USA: freq MS 0.17, controls 0.14). A third allele was discerned to be over-represented among the Norwegian cases (freq MS 0.05, controls 0.03, P = 0.012). Further, single SNPs also provided marginal evidence for association with MS (best P-values: FIN-OUT P = 0.006; SWE P = 0.012; NOR P = 0.013; US P = 0.018), but again, there seems to be allelic heterogeneity within the region in more heterogeneous populations, and strongest association was detected with different SNPs in different populations (Supplementary Material, Table S5).
Notably, the identified 59 kb risk region can be divided into a low LD region (first-half) and a more clear block structure (end-half) in heterogeneous populations according to the HapMap CEU data (Fig. 1C), whereas it seems to cover only one haplotype block in the Finnish MS isolate (based on degrees of freedom in the GWA data, Fig. 1B).
To exclude structural alterations in the coding regions of the C7 and FLJ40243 genes in MS, the C7 promoter, all exons of C7 and the 3′ exons 25–42 of FLJ40243 (covered by the risk haplotype) were sequenced in 16 Finnish individuals from the high-risk region, of which four MS cases were known to carry two copies, four MS cases one copy and eight controls no copies of the C7-FLJ40243 risk haplotype.
Altogether 11 variants were observed, 2 of which were located in the promoter of C7, 4 in the coding sequence of C7 and 5 in the coding sequence of FLJ40243. The conserved alleles of the non-synonymous SNPs (rs2271708 in C7 exon 5, rs1063499 in C7 exon 10, rs13157656 and a novel SNP in C7 exon 14, rs10054110 in FLJ40243 exon 27 and rs2271704 in FLJ40243 exon 33) are present both in the MS risk haplotype and in most of the other common haplotypes. Thus, these SNPs are not likely candidates for the causative variant. Importantly, none of the variants identified was observed to be in tight LD with the risk haplotype identified. All the sequenced variants are shown in Table 4.
The function of FLJ40243 is unknown, but it is known to encode for a protein. Based on the human GNF Expression Atlas 2 Data (UCSC Genome Browser, hg18 assembly, March 2006), FLJ40243 is expressed at an extremely low level at least in ovary, testis, adrenal gland, pancreas, kidney, cerebellum, caudate nucleus, globus pallidus, olfactory bulb, ciliary ganglion and dorsal root ganglia.
No expression of FLJ40243 was observed in peripheral blood mononuclear cells (PBMCs) of which we had RNA available (data not shown). Thus, we could not test whether the identified risk haplotype has an effect on FLJ40243 expression in MS. We further monitored the expression of FLJ40243 in several tissues using human tissue cDNA panels (Clontech, Mountain View, CA, USA) and FLJ40243-specific PCR probes. The fetal (fetal brain, lung, liver, kidney, heart, spleen, thymus and skeletal muscle) and immune (adult spleen, lymph node, thymus, tonsil, bone marrow and peripheral blood leukocyte) cDNA panels were tested. The gene was observed to be expressed at very low level only in spleen, lymph node, fetal liver and fetal skeletal muscle (data not shown).
The C7 gene is an excellent functional candidate for an inflammatory disease like MS, for it encodes the seventh component of the complement system. C7 is mainly expressed in cell types impractical for RNA quantification studies (for example, hepatocytes and endothelial cells). No expression of C7 was observed in PBMCs (data not shown).
To evaluate whether the identified risk haplotype has an effect on complement levels, we collected serum and plasma samples from 20 Finnish MS cases and 32 unaffected controls. Eleven of the MS cases and 13 of the controls carried the C7-FLJ40243 risk allele, either as heterozygotes (MS n = 9 and Ctrl n = 11) or as homozygotes (MS n = 2 and Ctrl n = 2). Most of the plasma C7 protein levels were within the reference range. However, we observed a distinct trend, the C7 protein levels being higher in the C7-FLJ40243 risk allele carriers than in non-carriers (Fig. 2A), although the difference did not reach statistical significance in this small set of samples (P = 0.064). The difference was seen both in MS cases (mean no risk allele = 85% and mean risk allele = 94%) and in unaffected controls (mean no risk allele = 90% and mean risk allele = 95%).
C7 is a component of the terminal complement complex (TCC, C5b-9), which, when assembled on a cell membrane, forms the cytolytic membrane attack complex (MAC). Thus, the total complement activity was also studied in the same set of MS cases and controls by measuring the number of TCC formed as a consequence of activation of each pathway: classical (CP), alternative (AP) and lectin (MBL). Compared with the controls, the MS cases were observed to have a significantly increased number of functional TCC generated via the classical and the alternative pathway when the complement cascade was activated: CP mean Ctrl = 83%, mean MS = 104%, P < 0.0001; AP mean Ctrl = 85%, mean MS = 100%, P = 0.0001 (Fig. 2B). Importantly, the complement system was most active in MS cases carrying the identified C7-FLJ40243 risk allele (MS cases, AP: no risk allele = 96%, risk allele = 104%, P = 0.0049; MS cases, CP: no risk allele = 101%, risk allele = 106%, P > 0.05) (Fig. 2B).
The two reported MS associated loci, C7-FLJ40243 and IL7R, are 5.0 Mb apart from each other and there is no LD between the two loci. We looked for a potential sum effect of the risk alleles to show how these two independent MS risk loci contribute to the probability of developing the disease in an additive way (Fig. 3). Individuals carrying the C7-FLJ40243 risk allele seem to be at higher risk of developing the disease than non-carriers, and the risk is even higher when an additional IL7R risk allele is present. This was observed both in the study set from the Southern Ostrobothnia high-risk region (Fig. 3A) and in the combined study set from more heterogeneous populations, namely Finland outside the high-risk region and Sweden (Fig. 3B).
Identification of genes predisposing to MS has proven to be a daunting task. A large number of genome-wide linkage scans have been conducted, but the strong linkage and association with the HLA locus remained the only systematically replicated finding with genome-wide significance until 2007 (8). Finally, the first MS GWA study (10), together with other studies (11,12), revealed two additional loci, IL7R and IL2RA, associated with an increased risk of MS. These loci have now been replicated by independent groups of investigators (30–32) and are associated with MS susceptibility at a genome-wide level of significance. Despite the initial disappointment often faced with HLA-associated diseases, we are beginning to uncover the genetic architecture of this disease. However, the risk alleles in IL7R and IL2RA are estimated to explain only a small proportion (0.2%) of the genetic risk of MS (10,33). Many more such low penetrance alleles remain to be discovered using conventional GWA approaches, but a change of tack will also be needed to discover relatively rare alleles, enriched in some populations or familial study samples, and potentially having a stronger effect on disease susceptibility. Here, we have attempted to pinpoint a rare MS predisposing gene on the 5p linkage region, for which most of the linkage information emerges from an internal founder population. An effort was previously made to narrow the wide locus (34), but only the most ‘telomeric’ 20 cM of the original wide 40 cM-linked region was analyzed in the restriction step, as the LODs were highest (maximum LOD 3.4) in that area. Importantly, no gene in the restricted region was found to explain the linkage observed (data not shown). Here, the complete ~40 cM-linked region was scanned using the Illumina HumanHap300 panel.
In this linkage-guided association study, we initially used the haplotype association analysis results for ranking the most interesting regions for follow-up. Scanning of the complete linkage region of 45 Mb revealed strong association only with one haplotype flanking the C7 and FLJ40243 genes, and this finding was validated in an independent case–control set originating from the same high-risk geographical region. The same markers also provided nominal evidence for association in a family based analysis. The association analysis combining all the samples from the high-risk region resulted in a P-value of 3.2 × 10⁻⁶, which remained significant even after the conservative Bonferroni correction.
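The Bonferroni correction referred to here multiplies the uncorrected P-value by the number of tests, capped at 1. A minimal sketch, using the 3976 haplotype windows reported in the Results; the function name is illustrative.

```python
def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted P-value: p * m, capped at 1."""
    return min(1.0, p_value * n_tests)


corrected = bonferroni(3.2e-6, 3976)
print(f"Bonferroni-corrected P = {corrected:.4f}")
```

With the rounded inputs this comes to about 0.013, close to the 0.012 reported in the Results; the small gap may reflect rounding of the published uncorrected P-value. Either way the corrected value stays below 0.05.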
Eighty-nine percent of the MS cases used in the initial screen had both parents born within the MS high-risk region, and genealogical studies showed that the majority of the patients were distantly related. Thus, our study design was specifically aimed to enrich for relatively rare, potentially more penetrant variants. Indeed, the identified risk allele was found to be present only in ~4% of alleles in the general European population, whereas it has a slightly higher frequency of ~6% in the Asian populations and it is almost absent among Africans, Southern Americans and Oceanians. Importantly, it has evidently become enriched in the Southern Ostrobothnian high-risk MS region due to the founder effect and isolation, and is found in 12% of case alleles. Within the high-risk region, carriers of the C7-FLJ40243 risk allele have a higher risk of developing the disease than non-carriers (OR: 2.7), whereas the previously identified MS risk alleles in IL7R and IL2RA have been reported to have an OR of 1.2 (32). Suggestive evidence for association between C7-FLJ40243 alleles and MS was also seen in more heterogeneous populations, but as expected, there seems to be more allelic heterogeneity within the region, and the causative variant is most probably carried on diverse allelic backgrounds.
The advantages of the Finnish population and, especially, its subisolates are that most of the affected individuals typically share the same major risk allele and that the relatively rare variants can be exposed by the common HapMap markers due to the wide LD intervals (35). However, it is worth noting that these relatively rare variants identified using population isolates are much more challenging to detect with common HapMap SNPs in more heterogeneous populations having distinct LD patterns. Moreover, it is also possible that there are various causative mutations within the same gene in different populations. Thus, replication of the original allelic association in a more heterogeneous population might not be straightforward. This holds true, for example, for familial hypercholesterolemia [FH (MIM #143890)], an autosomal dominant disorder, where majority of mutations occur in the gene encoding low-density lipoprotein receptor (LDLR) and in two other genes (36). There is notable allelic heterogeneity in FH patients worldwide. Over 1000 mutations have been reported (37), and allelic heterogeneity is seen even within Finland in LDLR: there is a founder mutation (FH-North Karelia) found as the major mutation in a limited region in the Northeastern part of Finland and other mutations are found elsewhere in Finland (38).
C7 has previously been studied as an MS candidate gene (39). Three SNPs, C7-327 corresponding to rs7713884 in intron 13, C7-363 corresponding to rs13157656 in exon 14 and a SNP called C7-396, were genotyped in 227 MS families (including only 59 whole trio families) and 72 controls originating from UK. Suggestive evidence for linkage was observed (P = 0.0061). Further, SNP rs7713884 provided nominal evidence for association in the case–control analysis (P = 0.009), but family based association with the transmission disequilibrium test (TDT) was not seen with any of the SNPs, and the authors concluded that C7 does not confer susceptibility to MS. Notably the family based study was underpowered.
Both human C7 (C7) and FLJ40243 (RIKEN cDNA 4930455B06 gene) exist also in the corresponding location of the mouse chromosome 15 (Mouse Genome Informatics database, http://www.informatics.jax.org/). The genes coding for the sixth and ninth components of the complement cascade [C6 (MIM +217050), C9 (MIM +120940)] are also located in the chromosome 5-linked region, but no signal of association with MS was seen with those in the Finnish GWA study sample.
Based on our results, neither C7 nor FLJ40243 was expressed in PBMCs. Thus, expression levels of these genes could not be measured in RNA samples obtained from mononuclear cells of MS patients. However, we tested expression of FLJ40243 in human tissue cDNA panels. FLJ40243 was observed to be expressed in spleen, lymph node, fetal liver and fetal skeletal muscle, but no evidence for expression was seen in fetal brain, adult thymus, tonsil, bone marrow or peripheral blood leukocytes.
C7 is synthesized not only by hepatocytes, but also by endothelial cells, polymorphonuclear cells, macrophages, platelets, fibroblasts, synovial tissue and even in the CNS by astrocytes and oligodendrocytes (40–44), cells relevant for MS. We measured circulating C7 protein levels and total complement activity in MS patients and in unaffected controls. The causative variant of the C7-FLJ40243 risk region is as yet unknown, but based on our results, it may affect protein levels of complement component 7, which is an excellent functional candidate for MS. Importantly, the complement system was observed to be significantly more active among MS cases than among unaffected population controls, and the MS patients carrying the identified C7-FLJ40243 risk haplotype produced the most functional TCCs when the complement cascade was activated.
The complement system is a biochemical cascade of the innate immune system that helps to clear pathogens from the organism. In addition, the first components (C1–C4) of the classical and the lectin pathway are important in the clearance of apoptotic cells and cellular debris (45). Activation of C3 by the classical, lectin or alternative pathway leads to activation of the terminal components (C5–C9), which then form the TCC (C5b–C9), also called the MAC when assembled on a cell membrane. The MAC forms a transmembrane channel on the cell membrane of the target cell, resulting in the osmotic lysis of the target. The classical complement pathway requires antibodies for activation, whereas the alternative pathway can be activated directly by microbial surfaces and the lectin pathway by certain carbohydrates, such as mannan. In each case, the end result of the pathway is the formation of the MAC.
C7 is a single-chain plasma glycoprotein involved in the cytolytic phase of the complement activation cascade. C7 deficient individuals (MIM #610102) are prone to Neisseria meningitidis infections, because the MAC, which is important in destroying gram-negative bacteria by cytolysis, cannot be formed correctly. Importantly, C7 is a critical limiting factor of complement activation: only when the local expression of C7 is sufficient does C7 bind to preformed C5b6, after which the resulting C5b–C7 complex is able to insert into the phospholipid membrane to start the formation of the MAC (46). Oligodendrocytes are especially sensitive to complement-mediated injury because of their relative deficiency of complement regulatory proteins, which normally protect host cells from complement-mediated lysis (47).
Several publications support the hypothesis that complement activation and especially the MAC may play a key role in the pathogenesis of autoimmune demyelination. Increased levels of terminal complement complexes (C5b–9) have been detected in the cerebrospinal fluid of MS patients during relapses, and their levels have been shown to correlate with the Expanded Disability Status Scale score, which measures neurological disability (48). C6-deficient rats, unable to form the MAC, exhibit no demyelination and show a significantly reduced clinical score in the Ab-mediated demyelinating form of EAE, the animal model of MS (49). Moreover, C9 deposition (corresponding to the number of MACs), P-selectin expression and cellular infiltrates were observed to be reduced in the spinal cords of these rats compared with rats capable of forming the MAC (50). On the other hand, mice lacking the complement regulatory protein CD59, which normally protects host cells from complement-mediated damage by blocking the formation of the MAC, have more severe EAE and increased demyelination, inflammatory cell infiltration and axonal injury (51).
It is tempting to speculate that the causative variant of the C7-FLJ40243 region may lead to increased complement activation in the CNS and further to demyelination by damaging the defenceless myelin-forming oligodendrocytes. MS can be classified into four different pathological patterns based on the composition of active lesions (52). Pattern II, accounting for over 50% of patients, is characterized by immunoglobulin and MAC deposition at the areas of myelin destruction. Thus, the C7-FLJ40243 risk allele identified here may predispose especially to this type of disease.
To summarize, the C7-FLJ40243 risk region, here shown to contain a risk allele enriched in a population isolate, is an excellent candidate for the susceptibility of autoimmune demyelination and seems to be a second MS locus in this region on 5p, initially identified by linkage in multiple study samples. Our observations highlight the value of population isolates in the identification of rare disease alleles in common diseases and further demonstrate the complexity of the genome regions initially identified as potential risk loci.
We selected 72 MS patients, having either both parents born within the Southern Ostrobothnia high-risk region for MS (n = 64) or one high-risk region born parent as well as a family history of MS (n = 8), for our Finnish GWA study. In addition, 68 IBS-matched population controls from Finnish genome-wide studies (from a total of 227 control individuals with GWA data) were used as the control set (Table 1, FIN-isolate1). Fourteen of the controls were known to have both parents born in Southern Ostrobothnia (Supplementary Material, Fig. S1, SOB Ctrls), while we did not have the information of the place of birth of the parents for rest of the controls. However, 13 of the controls were known to live in Southern Ostrobothnia (Supplementary Material, Fig. S1, SOB living Ctrls), and 41 controls were part of the Health 2000 project (Supplementary Material, Fig. S1, H2000 selected Ctrls) and have been used as controls in a genome-wide study of schizophrenia (O. Pietiläinen et al., manuscript in preparation). Genealogical studies enabled us to construct two large interconnected mega-pedigrees, and 41 MS cases out of 72 were noticed to belong to either one (n = 14) or both (n = 27) of them.
An independent set of 125 MS cases, having at least one parent born within the Southern Ostrobothnia high-risk region (including 74 MS cases with both parents born in the high-risk region), and 365 population controls, collected from the same region in Southern Ostrobothnia, were used to validate the initially observed association (Table 1, FIN-isolate2). Nine first-degree relatives of the cases of the GWA study were included in this validation sample set.
Additional family members (33 MS patients and 392 parents or sibs) were available for 156 MS patients included in either the FIN-isolate1 or the FIN-isolate2 set. Forty-one of the MS cases in these case–control sets had no healthy family members available and were not included in the family-based analysis. We could construct 22 multiplex (two to six affected individuals per pedigree) and 134 nuclear (one affected individual per family) MS families originating from the Southern Ostrobothnia high-risk region to be used in the family-based association analysis. Of the nuclear families, 48 contained two parents, 57 contained one parent and 76 had one or more healthy siblings.
A total of 725 unrelated MS cases and 959 population controls from Finland outside the high-risk region were analyzed as one case–control study set from more heterogeneous populations (Table 1, FIN-OUT). In addition, 651 MS cases and 651 controls from Sweden, 359 MS cases and 471 controls from Norway and 920 MS cases and 177 controls from the United States (Table 1, SWE, NOR, US) were obtained from the collaborators of the Nordic MS Genetics Network and the Partners Multiple Sclerosis Center in Boston, MA, USA.
The diagnosis of MS has strictly followed Poser’s or McDonald’s diagnostic criteria (53,54). All the samples were of Northern European descent. All individuals have given their informed consent, and the study has been approved by the Ethics Committee for Ophthalmology, Otorhinolaryngology, Neurology and Neurosurgery in the Hospital District of Helsinki and Uusimaa (Decision 46/2002, Dnro 192/E9/02) and the Ethics Committees of the institutions involved.
The Illumina HumanHap300 SNP chip (Illumina, San Diego, CA, USA) was used to genotype the samples of the Finnish MS project. Genotyping was performed according to the manufacturer’s instructions at the Broad Institute, Cambridge, MA, USA and at the Finnish Genome Center, Helsinki, Finland. The samples for the schizophrenia study were genotyped with the Illumina HumanHap300 chip at DeCode Genetics, Reykjavik, Iceland. Alleles were called on TOP-strand orientation for each project. Only samples and markers with success rates >95% were included in the analyses.
SNPs used to screen the original linked region map to 11.1–56.0 Mb according to UCSC Genome Browser (hg18 assembly, March 2006, http://genome.ucsc.edu/). The identified eight-SNP haplotype region (40 996 921–41 039 556 according to UCSC Genome Browser, hg18 assembly, March 2006) was covered with tag SNPs using the HapMap tagger tool [r2 = 0.9, minor allele frequency (MAF) > 0.05] (http://hapmap.org/), resulting in 24 SNPs with an average inter-SNP interval of 2.4 kb: rs1901167, rs324061, rs1315656, rs3805221, rs3805226, rs3828511, rs6860438, rs1055021, rs7732104, rs4957361, rs13161930, rs2122564, rs1376174, rs7736582, rs896117, rs1530812, rs3817324, rs2167229, rs16870603, rs7712140, rs2305314, rs1450658, rs1450659 and rs10058056. All 24 SNPs were genotyped in the Finnish study set from the Southern Ostrobothnian isolate (Table 1, FIN-isolate2), whereas only 14 of these SNPs (rs1901167, rs324061, rs13157656, rs3805221, rs3805226, rs1055021, rs7732104, rs2122564, rs7736582, rs896117, rs3817324, rs2305314, rs1450658 and rs10058056) were genotyped in the other study materials (Table 1, FIN-OUT, SWE, NOR, US; controls for complement level assay), because they provide sufficient information to estimate the main C7-FLJ40243 haplotypes (freq ≥ 0.05) and capture 92% of the common variation with r2 > 0.5 and 75% of the variation with r2 > 0.8.
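The r2 thresholds quoted for tag-SNP selection refer to the squared allelic correlation between pairs of SNPs. The following is a minimal sketch of that quantity, assuming phased haplotype counts for two biallelic SNPs are available; the function name and the counts are hypothetical, and the Tagger tool itself works from HapMap genotype data.

```python
def ld_r2(n_AB, n_Ab, n_aB, n_ab):
    """Squared allelic correlation (r^2) between two biallelic SNPs.

    Arguments are counts of the four two-SNP haplotypes
    (A/a at the first SNP, B/b at the second SNP).
    """
    n = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / n                  # frequency of the A-B haplotype
    p_A = (n_AB + n_Ab) / n          # frequency of allele A
    p_B = (n_AB + n_aB) / n          # frequency of allele B
    d = p_AB - p_A * p_B             # linkage disequilibrium coefficient D
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("r^2 is undefined when either SNP is monomorphic")
    return d * d / denom


# Hypothetical counts, for illustration only.
r2_example = ld_r2(60, 5, 5, 30)
# A tag SNP would be accepted for this pair if r2_example >= 0.9 (assumed rule).
is_tagged = r2_example >= 0.9
```

Under this definition, two SNPs in complete correlation give r2 = 1 and two independent SNPs give r2 = 0.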
The SNPs tagging the C7-FLJ40243 risk region and the SNPs mapping to the IL7R gene were genotyped using the Sequenom’s MassARRAY iPLEX system (Sequenom, San Diego, CA, USA). The SNP assays were designed using Assay Designer 3.1 (Sequenom), and the PCR and extension reactions were done as specified by the manufacturer (specific details available upon request). We used 15 ng of genomic DNA/SNP-multiplex. Genotypes were both automatically called with the automatic caller of the software (Sequenom) and manually verified, as described in detail by Silander et al. (55).
Genotypes from controls of the schizophrenia project were merged with the MS study set, and genome-wide IBS sharing and multidimensional scaling analyses were performed using PLINK to select the best-matching controls (28), resulting in a total of 68 controls to be used in this study (Supplementary Material, Fig. S1). We then calculated pairwise IBD-sharing estimates for our study set, and no first-degree relative pairs were found in any pair combination (case–case, case–control and control–control). Three case–case pairs had IBD-sharing estimates consistent with second-degree relatives (sharing 0.25 of their genome by IBD), and at the third-degree level (sharing 0.125 of their genome by IBD) there were four case–case, two case–control and one control–control pairs. All the other relationships between all pair combinations were more distant. As the cases and controls were genotyped in different centers, and we had more genealogical data for our cases, we calculated the genomic inflation factor (λ) using genome-wide single-SNP data in PLINK (28). The genomic inflation factor, λ, was 1.0758 for our data set, which suggests that cases and controls are well matched and that there is no large-scale population stratification within our final study set.
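The genomic inflation factor is commonly defined as the median of the observed single-SNP χ2 statistics divided by the median expected under the null hypothesis (about 0.455 for one degree of freedom). A minimal sketch under that assumption follows; the function name and the example statistics are hypothetical and do not reproduce the PLINK implementation or the study data.

```python
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(chi2_stats):
    """Return lambda = median(observed chi-square) / median(null chi-square, 1 df)."""
    if not chi2_stats:
        raise ValueError("need at least one test statistic")
    return median(chi2_stats) / CHI2_1DF_MEDIAN


# Hypothetical per-SNP statistics, for illustration only.
example_stats = [0.02, 0.11, 0.30, 0.46, 0.52, 0.95, 1.80, 3.10, 5.20]
lam = genomic_inflation(example_stats)
```

Values of λ close to 1 indicate that the bulk of the test statistics follow the null distribution, which is the basis for the interpretation given above.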
SNPs mapping to 11.1–56.0 Mb according to UCSC Genome Browser (hg18 assembly, March 2006) and passing the quality control criteria in the genome-wide study (minimum of 95% call rate per individual and per SNP) were used in the sliding-window five-SNP haplotype analysis. All the analyses were performed with the PLINK program (version 1.00). Altogether, 3976 consecutive windows were analyzed for global haplotype association (omnibus P-value, with degrees of freedom equal to the number of haplotypes − 1). In addition to the global haplotype association, single-haplotype association P-values were calculated for each haplotype with a frequency of over 1.0%. All the results are available as Supplementary Material, Tables S1 and S2, and are presented as uncorrected P-values.
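The windowing scheme can be sketched as follows, assuming that consecutive windows overlap and advance by one SNP at a time (an assumption; the step size is not stated in the text). The function names are hypothetical. Under this assumption, a contiguous run of n SNPs yields n − 4 five-SNP windows.

```python
def sliding_windows(snps, size=5):
    """Yield consecutive, overlapping windows of `size` adjacent SNPs."""
    for start in range(len(snps) - size + 1):
        yield tuple(snps[start:start + size])


def omnibus_df(n_haplotypes):
    """Degrees of freedom of the global (omnibus) test for one window."""
    return n_haplotypes - 1


# Illustration with placeholder SNP labels (not real marker names).
markers = ["snp%d" % i for i in range(1, 9)]
windows = list(sliding_windows(markers))
```

Each window would then be passed to a haplotype association test; that step is left out here because it depends on haplotype phasing, which PLINK performs internally.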
SNP genotypes in the C7-FLJ40243 validation and IL7R analyses were checked for Hardy–Weinberg equilibrium using Pearson’s χ2 test. Allele and genotype frequencies were determined from the data. Only markers with MAF over 5%, an overall genotyping success rate >95% and no deviation from Hardy–Weinberg equilibrium in control samples were accepted for statistical analyses. Five SNPs used to form the C7-FLJ40243 haplotype were genotyped using both the Illumina HumanHap300 SNP chip and the Sequenom system in 77 individuals. No discrepancies were observed between the two genotyping methods. There were no discrepancies between the 100 duplicate samples, which were added to the sample plates and genotyped using the Sequenom system.
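A Pearson χ2 test for Hardy–Weinberg equilibrium compares the observed genotype counts of a biallelic SNP with those expected from the allele frequencies. A minimal sketch is given below, assuming one degree of freedom and no continuity correction; the function name and the example counts are hypothetical.

```python
import math


def hwe_chi2(n_aa, n_ab, n_bb):
    """Pearson chi-square test of Hardy-Weinberg equilibrium (1 df).

    Returns (chi-square statistic, P-value) for the genotype counts
    of a biallelic SNP.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)      # frequency of allele a
    q = 1.0 - p                          # frequency of allele b
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical control genotype counts, for illustration only.
chi2_stat, p_hwe = hwe_chi2(30, 45, 25)
```

A marker would be retained when the P-value in controls exceeds a chosen threshold; the threshold itself is not given in the text.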
The haplotype frequencies for the validation sample set (FIN-isolate2), the combined Southern Ostrobothnian study set (FIN-isolate1 + 2) and the more heterogeneous populations (FIN-OUT, SWE, NOR, US) were estimated using Haploview (version 4.0beta6) (56), and the case–control P-values were obtained from the Haploview custom association test. The P-value for the combined Southern Ostrobothnian study set (FIN-isolate1 + 2) was corrected for multiple testing using the permutation option available in Haploview (1 000 000 permutations). Both the uncorrected and the permuted P-values are shown in the text.
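Permutation-based correction for multiple testing can be illustrated with a max-statistic scheme: case–control labels are shuffled, the best statistic over all tested haplotypes is recorded in each permutation, and the corrected P-value is the fraction of permutations whose best statistic reaches the observed one. The sketch below assumes carrier status per individual as input and a 2 × 2 χ2 statistic; all names and data are hypothetical, and it is not a reproduction of the Haploview procedure.

```python
import random


def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic of a 2x2 table (no continuity correction)."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def carriage_chi2(is_case, is_carrier):
    """Chi-square for haplotype carriage versus case-control status."""
    pairs = list(zip(is_case, is_carrier))
    a = sum(1 for case, carrier in pairs if case and carrier)
    b = sum(1 for case, carrier in pairs if case and not carrier)
    c = sum(1 for case, carrier in pairs if not case and carrier)
    d = sum(1 for case, carrier in pairs if not case and not carrier)
    return chi2_2x2(a, b, c, d)


def permutation_corrected_p(is_case, carriage, target, n_perm=200, seed=0):
    """Max-statistic permutation P-value for haplotype `target`.

    `carriage` maps haplotype name -> list of carrier flags per individual.
    """
    rng = random.Random(seed)
    observed = carriage_chi2(is_case, carriage[target])
    shuffled = list(is_case)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        best = max(carriage_chi2(shuffled, flags) for flags in carriage.values())
        if best >= observed:
            hits += 1
    # Add-one estimate so the P-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical data: 20 cases followed by 20 controls.
labels = [True] * 20 + [False] * 20
carriage = {
    "H1": [True] * 15 + [False] * 5 + [True] * 2 + [False] * 18,
    "H2": [i % 2 == 0 for i in range(40)],
}
p_corrected = permutation_corrected_p(labels, carriage, "H1")
```

With a larger number of permutations, as used in the study, the smallest attainable corrected P-value becomes correspondingly smaller.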
LD between the SNPs flanking the C7 and FLJ40243 genes was measured using Haploview and the publicly available genotype data of Centre d’Etude du Polymorphisme Humain (CEPH) individuals of European origin (Fig. 1C).
To estimate the effect size of the identified risk haplotype in the Finnish isolate, the haplotypes were estimated with the Phase 2.1.1 program (57,58). Haplotypes with a probability of >0.6 were accepted for further analyses. Haplotype frequencies estimated by Haploview 4.0beta6 and the Phase 2.1.1 program were almost identical (MS: 0.12 Haploview, 0.12 Phase; Ctrls: 0.04 Haploview, 0.05 Phase). ORs for the C7-FLJ40243 risk haplotype and for the IL7R risk allele were calculated from allele carriage counts in cases and controls, and their significance was assessed with Pearson’s χ2 test (Fig. 1C and Table 2).
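An odds ratio based on carriage counts can be computed from a 2 × 2 table of carriers and non-carriers among cases and controls. The sketch below adds a Woolf (log-based) 95% confidence interval, which is an assumption for illustration and is not stated in the text; the function name and the counts are hypothetical.

```python
import math


def odds_ratio(case_carriers, case_noncarriers, ctrl_carriers, ctrl_noncarriers):
    """Odds ratio with a Woolf 95% confidence interval from carriage counts."""
    a, b, c, d = case_carriers, case_noncarriers, ctrl_carriers, ctrl_noncarriers
    if min(a, b, c, d) == 0:
        raise ValueError("all four cells must be non-zero for this estimator")
    or_value = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_value) - 1.96 * se_log_or)
    upper = math.exp(math.log(or_value) + 1.96 * se_log_or)
    return or_value, lower, upper


# Hypothetical counts, for illustration only (not the study data).
or_value, ci_low, ci_high = odds_ratio(20, 80, 10, 90)
```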
The SNPs in MS families were analyzed using the gamete competition option of the Mendel 7.0.0 program (29), which utilizes the genotype data of the whole pedigree and copes better with missing data than the classical TDT.
P-values for the C7 protein and total complement activity differences between carriers and non-carriers of the risk haplotype as well as for the cases and the controls were calculated with the non-parametric Mann–Whitney test, because the number of individuals studied was relatively small.
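The Mann–Whitney test compares two groups through the ranks of their pooled observations. The sketch below is a plain implementation of the U statistic with a tie-corrected normal approximation and no continuity correction; for groups as small as those studied here an exact P-value may be preferable, so this is an illustration only. The function name and the example values are hypothetical.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney test (normal approximation, tie-corrected).

    Returns (U statistic of the first sample, P-value).
    """
    n1, n2 = len(x), len(y)
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    tie_term = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0        # average of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = midrank
        t = j - i                          # size of this tie group
        tie_term += t ** 3 - t
        i = j
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    n = n1 + n2
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0
    z = (u_x - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return u_x, p_value


# Hypothetical activity values (%) for carriers and non-carriers.
carriers = [112.0, 105.0, 120.0, 98.0, 110.0]
non_carriers = [90.0, 101.0, 85.0, 95.0, 88.0]
u_stat, p_mw = mann_whitney_u(carriers, non_carriers)
```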
Risk alleles for both the C7-FLJ40243 region (population-specific risk haplotype, see Table 3) and IL7R (rs6897932 allele C) were available for 150 MS cases and 323 controls from the Southern Ostrobothnia high-risk region and for 1135 MS cases and 1402 controls of the combined study set from Finland outside the high-risk region and Sweden, and these individuals were analyzed to estimate the co-effect of the C7-FLJ40243 and IL7R risk alleles on MS predisposition (Fig. 3). IL7R genotypes for the Swedish individuals were obtained from the collaborators of the MS Nordic Genetics Network. IL7R genotypes were not available for the study sets from Norway and USA. Groups with two copies of the C7-FLJ40243 risk alleles were excluded from the analysis due to the small number of individuals (two C7-FLJ40243 risk alleles and zero IL7R risk alleles: Southern Ostrobothnia MS n = 0; other parts of Finland and Sweden MS n = 4).
A test of heterogeneity between the study sets from the high-risk isolate and rest of Finland was performed to allow for a combined analysis for IL7R. Mantel–Haenszel corrected P-values were calculated using PLINK (28).
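A Mantel–Haenszel analysis pools 2 × 2 tables across strata (here, study sets) into one odds ratio and one test statistic. The following minimal sketch assumes the Cochran–Mantel–Haenszel χ2 without continuity correction; it is not a reproduction of the PLINK implementation, and the function name and the counts are hypothetical.

```python
import math


def mantel_haenszel(strata):
    """Pooled odds ratio and CMH chi-square test across 2x2 strata.

    Each stratum is (a, b, c, d): exposed cases, unexposed cases,
    exposed controls, unexposed controls.
    Returns (pooled OR, chi-square, P-value with 1 df).
    """
    num = den = 0.0
    diff = var = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        num += a * d / n
        den += b * c / n
        diff += a - (a + b) * (a + c) / n             # observed minus expected
        var += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
    or_mh = num / den
    chi2 = diff * diff / var
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return or_mh, chi2, p_value


# Hypothetical strata, for illustration only (not the study data).
or_mh, chi2_mh, p_mh = mantel_haenszel([(20, 80, 10, 90), (30, 70, 20, 80)])
```

Pooling in this way is appropriate only when the stratum-specific odds ratios are similar, which is why a heterogeneity test precedes the combined analysis.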
A total of 890 bases of the C7 promoter region, all the exons of the C7 gene and exons 25–42 of FLJ40243 (3′ end, covered by the risk haplotype) were sequenced in four MS cases with two copies, four MS cases with one copy and eight controls with no copies of the C7-FLJ40243 risk haplotype of the high-risk region. Four of the controls had only the head or the tail of the C7-FLJ40243 risk haplotype, and these controls enabled us to sort out the variants that were in LD with only the head or the tail but were not inherited with the actual risk haplotype (head and tail). Coverage of the sequencing was 100%. All the individuals sequenced originated from the Finnish high-risk region. Sequencing was done at both the Broad Institute Center for Genotyping and Analysis (C7) and the National Public Health Institute of Finland (FLJ40243). Briefly, the primer design was done using Primer3 (http://primer3.sourceforge.net/), 10 µl PCRs were cleaned up using a SAP/Exo mix (Amersham Biosciences, Piscataway, NJ, USA) or SOPE Resin (Edge Biosystems, Gaithersburg, MD, USA), and detection was done using ABI 3730xl sequencers (Applied Biosystems, Foster City, CA, USA).
The primer design was done using Primer3. Primers (C7: 5′-GGCCAGATGTGGAGAAGATT-3′ and 5′-ATGCCATCCATCAGTACAGG-3′; FLJ40243: 5′-TGATTTCATTCCTTCTGCACCT-3′ and 5′-CACAGAATTGCCTGTAGAAATCC-3′) mapped to exons within the risk haplotype region.
Tissue expression of FLJ40243 was tested by PCR analysis using the Human Fetal and Immune cDNA panels (Clontech). Tissue-specific products were detected by agarose gel electrophoresis.
Expression of FLJ40243 and C7 was further tested in mononuclear cell samples using the same method. These mononuclear cells were collected from 10 Finnish MS patients. All donors have given their informed consent and the study has been approved by the Ethics Committee for Ophthalmology, Otorhinolaryngology, Neurology and Neurosurgery in the Hospital District of Helsinki and Uusimaa (Decision 46/2002, Dnro 192/E9/02). The PBMCs were collected using Vacutainer CPT tubes (Becton-Dickinson, Franklin Lakes, NJ, USA). The total cellular RNA was extracted using RNeasy Plus Mini Kit columns (Qiagen, Valencia, CA, USA) according to the manufacturer’s instructions. The quality of the RNA was assessed using the RNA 6000 Nano assay in the Bioanalyzer (Agilent, Foster City, CA, USA), monitoring for the ribosomal 28S/18S RNA ratio (acceptable 1.5–2.5), signs of degradation and DNA contamination. The concentration and the A260/A280 ratio of the samples were measured using a spectrophotometer, the acceptable ratio being 1.8–2.2. Oligo(dT) primer and the TaqMan Gold RT–PCR kit (Applied Biosystems) were used to perform first-strand cDNA synthesis starting from 1 µg of total RNA, according to the manufacturer’s instructions.
Serum and plasma samples of 20 Finnish MS cases from the Southern Ostrobothnia high-risk region were collected, and the total complement activity as well as the C7 protein levels were measured at the Department of Medical Microbiology and Immunology, University of Turku, Finland. All donors have given their informed consent and the study has been approved by the Ethics Committee for Ophthalmology, Otorhinolaryngology, Neurology and Neurosurgery in the Hospital District of Helsinki and Uusimaa (Decision 46/2002, Dnro 192/E9/02). Two of the MS cases had two C7-FLJ40243 risk alleles of the isolate, nine had one risk allele and nine had no risk alleles. The proportions of females (no risk alleles 63% and risk allele 55%) and of users of immunomodulatory drugs (cortisone during 6 months, interferon β or glatiramer acetate; no risk alleles 43% and risk allele 33%) were approximately the same in the different risk groups. Only one patient had a primary progressive form of MS (no risk allele). A Wielisa total complement system screen kit (COMPL 300, Wieslab, Lund, Sweden) was used for measuring the functional activity of the classical (CP), alternative (AP) and lectin (MBL) pathways of the complement system in serum (59). In brief, dilutions of sera were incubated at 37°C in microtiter wells precoated with IgM, LPS or mannan for activation of the CP, AP and MBL pathways, respectively. After washing, alkaline phosphatase-conjugated antihuman C5b-9 was added before incubation for 30 min at room temperature. After additional washings and incubation with a substrate, the absorbance was measured at 405 nm. Reference values are CP > 60%, AP > 40%, MBL > 10% (MBL deficiency < 10%, decreased 10–40%). Plasma C7 protein levels were measured by radial immunodiffusion (Mancini technique) using goat anti-human-C7 antiserum (A308, Quidel, San Diego, CA, USA). Reference values are 80–120%. The results are quantitative.
A set of 150 healthy Finnish individuals was obtained as described in more detail by Seppänen et al. (60). These control samples were collected in Helsinki and represent a mixed Finnish population. Thirteen of these controls were found to carry the C7-FLJ40243 risk allele of the FIN-OUT set. In addition to these 13 risk allele carriers, 19 non-carriers of the risk allele were selected for the complement level assays. The proportion of females in the control group was approximately the same as in the MS group (no risk alleles 68% and risk allele 54%). The total complement activity and the C7 protein levels were measured at the Department of Medical Microbiology and Immunology, University of Turku, Finland, using the same methods as described above.
This work was supported by National Institutes of Health (grant RO1 NS 43559), Center of Excellence for Disease Genetics of the Academy of Finland, the Sigrid Juselius Foundation, the Biocentrum Helsinki Foundation, Helsinki University Central Hospital Research Foundation, the Multiple Sclerosis Foundation of USA, a Harry Weaver Neuroscience Scholar Award from National Multiple Sclerosis Society (P.L.D.) and the Neuropromise EU project (grant LSHM-CT-2005-018637). The Broad Institute Center for Genotyping and Analysis is supported by the National Center for Research Resources (grant U54 RR020278). The genotyping of the Health 2000 controls was funded by the SGENE EU project (LSHM-CT-2006-037761) and Simons Foundation (R01MH71425-01A1).
We wish to thank all participating MS patients and families, the Finnish Genome Center, Dr Robert Onofrio and Dr Daniel Mirel for their help with sequencing, Mrs Anne Nyberg for her help in genotyping and Mrs Seija Laanti for expert laboratory help in complement determinations. The Norwegian Bone Marrow Donor Registry is acknowledged for collaboration in establishment of the Norwegian control material.
Conflict of Interest statement. None declared.