Probands, other affected family members and parents were ascertained primarily in orthopedic clinics in Dallas, Texas and St Louis, Missouri (see Materials and Methods and Supplementary Material, Table S1 for a description of study populations). Samples from Texas AIS family trios (probands and parental controls) were genotyped using the Illumina HumanCNV370-quad platform, which interrogates over 370 000 human polymorphisms. After applying stringent quality control, we performed transmission disequilibrium tests (TDT) for 326 498 SNPs using the PLINK analysis program (15) in two ways (see Materials and Methods for description of statistical methods). In the first analysis, genotypes for all 419 families (n = 1122) of all self-reported ethnicities were used, as the TDT statistic is robust to population stratification (16). The second analysis was restricted to the subset of self-reported non-Hispanic white families, our largest ethnic group. To define the latter group, we corrected possible stratification by performing identity-by-state (IBS) analysis of unrelated probands using PLINK. Plotting the first three dimensions of a multidimensional scaling analysis of pairwise IBS distances identified one outlier family that was removed from further analyses (Supplementary Material, Fig. S1). A plot of resulting −2log(e) P-values against expected results under the null hypothesis (quantile–quantile plot, Fig. ) suggested a modest excess of associations without evidence of stratification (λGC = 1.0025) within the non-Hispanic white cohort. TDT results for the two data sets are shown in the form of Manhattan plots in Figures and . We also examined cryptic relatedness that could produce inflated results. Using pairwise identity-by-descent (IBD) estimation, we did not detect closely related pairs, with πhat values <0.15 for all samples (see Materials and Methods).
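For readers unfamiliar with the statistics named above, the following is a minimal sketch of the standard TDT chi-square and of the genomic inflation factor λGC. It is not the PLINK implementation used in the study; the function names and the transmission counts in the usage lines are hypothetical, chosen only for illustration.

```python
import math
from statistics import median

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364231


def tdt(transmitted, untransmitted):
    """Transmission disequilibrium test for one biallelic SNP.

    transmitted / untransmitted: number of times heterozygous parents
    did / did not pass the test allele to an affected child.
    Returns (chi-square with 1 df, P-value, odds ratio).
    """
    b, c = transmitted, untransmitted
    chi2 = (b - c) ** 2 / (b + c)
    # Upper tail of a 1-df chi-square: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    odds_ratio = b / c
    return chi2, p_value, odds_ratio


def genomic_inflation(chi2_values):
    """lambda_GC: median observed 1-df chi-square over its null median."""
    return median(chi2_values) / CHI2_1DF_MEDIAN


# Hypothetical counts, not taken from the study.
chi2, p_value, odds_ratio = tdt(transmitted=60, untransmitted=40)
print(chi2, odds_ratio)
```

A λGC close to 1, such as the 1.0025 reported above, indicates that the median test statistic sits where the null distribution predicts.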
We estimated that our discovery cohort would provide (dependent on allele frequencies) ~90% power to detect disease associations with effect sizes [odds ratios (ORs)] ≥ 2.0 at a significance level of P = 5.0 × 10⁻⁸ (Supplementary Material, Fig. S2a), and 70% power to detect loci with an effect size of 1.8, but only 10% power to detect weaker effect sizes of 1.5 or less at P = 5.0 × 10⁻⁸ (17). However, this cohort was potentially enriched for genetic risk factors, as 21% of the cases were familial. Only three SNPs met or exceeded a significance threshold of P ≤ 1 × 10⁻⁵; however, genomic clustering suggested non-random association. Specifically, we noted that in the total data set, our most significant result was obtained for the SNP rs1400180 [OR = 1.92, 95% confidence interval (CI) = 1.48–2.49; P = 6.2 × 10⁻⁷], with nearby SNP rs10510181 among the top-ranked SNPs (OR = 1.88, 95% CI = 1.42–2.49; P = 7.1 × 10⁻⁶). These two SNPs are within 21 kb of each other at distal chromosome 3p and were ranked highest in the non-Hispanic white data set (Fig. and Supplementary Material, Table S2). The evidence for association for the chromosome 3 SNPs rs1400180 (OR = 2.13; P = 7.9 × 10⁻⁸) and rs10510181 (OR = 2.03; P = 2.6 × 10⁻⁶) increased in the non-Hispanic white cohort even though this subset contained 80 fewer families than the total (results for non-white families are given in Supplementary Material, Table S2).
We also imputed genotypes at untyped loci to potentially increase genome-wide coverage, given the relatively low density of the CNV370-quad platform. We imputed 2 271 581 genotypes in the non-Hispanic white families and tested these SNPs for association using the TDT statistic in PLINK as before. This analysis produced additional signals of interest (in terms of clustering and
P-values), in particular for loci on chromosomes 1, 6 and 21 (
Supplementary Material, Fig. S3). However, SNPs clustering in the region of rs1400180 and rs10510181 remained the most significant by imputation (
Supplementary Material, Tables S3 and S4 and
Fig. S4), and we prioritized this region for further study (Fig. ).
We subsequently tested SNPs rs1400180 and rs10510181 for evidence of allelic association with AIS in additional independent cohorts. In the first replication study, we genotyped samples from 375 Texas cases of non-Hispanic white ethnicity as defined by self-report as well as by IBS analyses using genotypes from 384 ancestry-informative markers (as described above and in Materials and Methods; Supplementary Material, Fig. S5). Close relationship within or between this cohort and the discovery cohort was unlikely per extensive review of pedigree and demographic information. For both SNPs, we observed that the same allele that was overtransmitted in families was over-represented in cases when compared with controls (i.e. the same direction of effect). Strongest results were obtained for rs10510181, where the allele frequency in cases was 0.37 compared with 0.32 in controls. In addition, a logistic regression analysis incorporating gender and age at onset as covariates yielded essentially the same results (Table ). In the second replication study, we genotyped samples from 187 cases ascertained in US clinics outside of Texas and 222 controls, and again observed the same direction of effect for both SNPs and strongest results for SNP rs10510181 (Table ). Combining the results of the two replication studies (562 cases, 666 controls) yielded OR = 1.36, 95% CI = 1.14–1.61, P = 0.0005 for rs10510181.
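As a worked illustration of how allele frequencies translate into an allelic odds ratio, the sketch below computes the point estimate implied by the two rs10510181 frequencies quoted above (0.37 in cases, 0.32 in controls). The confidence-interval helper uses Woolf's logit method, which is an assumption here because the text does not state how the intervals were derived, and the allele counts passed to it are hypothetical.

```python
import math


def allelic_odds_ratio(freq_cases, freq_controls):
    """Odds ratio for carrying the allele, from its two frequencies."""
    odds_cases = freq_cases / (1.0 - freq_cases)
    odds_controls = freq_controls / (1.0 - freq_controls)
    return odds_cases / odds_controls


def woolf_ci(case_risk, case_other, ctrl_risk, ctrl_other, z=1.96):
    """Odds ratio with a Woolf (logit) confidence interval from allele counts.

    z = 1.96 gives an approximate 95% interval.
    """
    odds_ratio = (case_risk * ctrl_other) / (case_other * ctrl_risk)
    se_log_or = math.sqrt(
        1.0 / case_risk + 1.0 / case_other + 1.0 / ctrl_risk + 1.0 / ctrl_other
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Frequencies quoted in the text for rs10510181.
or_from_freqs = allelic_odds_ratio(0.37, 0.32)
print(round(or_from_freqs, 2))  # 1.25

# Hypothetical allele counts (1000 chromosomes per group), for illustration only.
or_counts, ci_lower, ci_upper = woolf_ci(370, 630, 320, 680)
```

The estimates reported in the table come from the actual genotype counts and may differ slightly from this frequency-based figure.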
Table 1. AIS risk association results for chromosome 3p26.3 loci
Two overlapping genes, CHL1 and LOC642891, are nearest the region of association that we observed and are predicted to be transcribed in opposite orientation. CHL1 encodes Close Homologue of L1, a member of the family of immunoglobulin-class L1 neural cell adhesion molecules. Chl1 functions in axonal guidance and neuronal migration (18, 19); however, whether LOC642891 encodes a functional protein is unknown. We confirmed transcription of both genes in fetal and adult brain (Supplementary Material, Fig. S6 and S7). The CHL1 gene spans >212 kb and its putative promoter sequences are ~45 kb distal to the associated haplotype. The predicted LOC642891 gene spans ~1.5 kb, where its 5′-end lies within CHL1 intron 1 and its 3′-end lies within the putative CHL1 promoter (Fig. ). Forty-nine imputed SNPs from this region (50–500 kb at chromosome 3p26.3) including the CHL1 and LOC642891 genes produced P-values < 9 × 10⁻⁴ by TDT analysis (Supplementary Material, Table S4) of discovery data. To validate these findings, we selected four imputed SNPs (rs965084, rs1400182, rs9754850 and rs9754552) near rs1400180 and rs10510181 for genotyping in the discovery and Rep1 cohorts (Supplementary Material, Tables S4 and S5). SNPs rs9754850 and rs9754552 yielded evidence for association (P < 0.05) with the same direction of effect as in the discovery phase, but SNPs rs965084 and rs1400182 did not. SNPs rs9754850 and rs9754552 are within 800 bp of rs10510181 and are moderately correlated with this SNP: r² (rs9754850: rs10510181) = 0.56, 0.57, and r² (rs9754552: rs10510181) = 0.56, 0.58, respectively, in case–control and discovery cohorts. We obtained similar results for rs9754850 and rs9754552 in the Rep2 cohort (Table ).
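The r² values above measure linkage disequilibrium between pairs of SNPs. A minimal sketch of the textbook definition follows, assuming phased haplotype frequencies are available; the text does not say how r² was estimated, and the function name and input frequencies are hypothetical.

```python
def ld_r_squared(freq_ab, freq_a, freq_b):
    """Squared correlation between alleles A and B at two biallelic SNPs.

    freq_ab: frequency of the haplotype carrying both A and B
    freq_a, freq_b: frequencies of allele A and allele B
    """
    # D is the excess of the A-B haplotype over its frequency under independence.
    d = freq_ab - freq_a * freq_b
    return d * d / (freq_a * (1.0 - freq_a) * freq_b * (1.0 - freq_b))


# Hypothetical frequencies showing the two extremes.
complete_ld = ld_r_squared(0.4, 0.4, 0.4)     # A and B always travel together
no_ld = ld_r_squared(0.16, 0.4, 0.4)          # alleles are independent
print(complete_ld, no_ld)
```

Values near 0.56, as reported here, mean that each SNP explains a little over half of the allelic variance at the other, so the association signals are related but not interchangeable.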
In a third replication study (Rep 3), we compared our data to a separate GWAS of 137 AIS cases and 2126 controls ascertained at Children's Hospital of Philadelphia (CHOP). Allele frequencies for SNPs rs1400180, rs10510181 and rs9754850 did not differ in cases compared with controls in that study (Table ). To assess potential bias in our locally ascertained controls, we examined SNP risk allele frequencies in available data sets. We found that risk allele frequencies for the four SNPs rs1400180, rs9754850, rs9754552 and rs10510181 were similar between our controls and four control data sets (total n = 3917 individuals), but not HapMap CEU, a difference we attribute to the relatively few chromosomes represented in the HapMap data set (n = 60) (Supplementary Material, Table S6).
Taken together, these results suggest that a genomic region correlated with SNP rs10510181, proximal to the CHL1 and LOC642891 genes, is associated with increased AIS risk. The lack of replication in the CHOP GWAS may reflect issues of heterogeneity and power to detect modest effect sizes.
Many prior observations have indirectly linked scoliosis and neuropathology (4). Clear evidence that improper axonal targeting specifically can evoke scoliosis comes from the rare autosomal recessive disease horizontal gaze palsy with progressive scoliosis (HGPPS, MIM #607313), which is remarkable for absent horizontal eye movements and severe progressive scoliosis. This disease is caused by homozygous or compound heterozygous mutations in the ROBO3 gene encoding a transmembrane receptor that controls commissural axon guidance (20). Brain imaging and neurophysiologic studies of HGPPS patients have revealed hindbrain anomalies and improper motor and sensory axonal projections (21). We have noted with interest that Chl1 and Robo3 proteins belong to the same molecular (immunoglobulin transmembrane receptor) and functional (axon guidance and neurite outgrowth) classes. Thus, CHL1 is a plausible candidate gene for AIS susceptibility. We selected 90 families having multiple members affected with AIS for analysis of the CHL1 gene, with the rationale that such families could be more likely to harbor highly penetrant alleles (22). Analysis of four markers in the region produced suggestive evidence for linkage (HLOD = 1.93, P = 0.001 at rs1400180) (Supplementary Material, Table S7). We re-sequenced coding exons and flanking intronic regions of the CHL1 gene in 10 unrelated probands from AIS families with positive evidence of linkage to the region (LOD ≥ 1.0). We observed two coding changes that predict non-synonymous amino acid changes (rs2272522 and rs62230378) and are found in exons 3 and 17, respectively, of the CHL1 gene. We noted with interest that SNP rs2272522 was previously associated with susceptibility to schizophrenia in separate studies of Japanese and Han Chinese populations (23, 24). This SNP was present on the CNV370-quad beadchip and was informative in our population, but did not produce evidence for association with AIS. Further investigation of rs62230378 also did not yield evidence that this SNP was associated with AIS. We also observed nine non-coding variants, none of which predicted alterations of known functional elements, such as transcription factor binding sites or consensus splice sites (Materials and Methods, Supplementary Material, Table S8).
We observed additional associations in our discovery data with SNPs in axon guidance genes. Among the top results were several SNPs clustering in the DSCAM gene located on chromosome 21 within the Down syndrome critical region (OR = 0.56, 95% CI = 0.42–0.73; P = 2.26 × 10⁻⁵ for rs2222973) (see top 100 SNP associations, Supplementary Material, Table S9, and Figs and ). DSCAM encodes Down syndrome cell adhesion molecule, an immunoglobulin-class neural cell adhesion molecule in the same molecular class as Chl1 and Robo3. Dscam likewise functions in axon guidance, including commissural axon pathfinding, as observed in both vertebrate and invertebrate model systems (25, 26). We also noted rs11770843 in the CNTNAP2 gene within our top-associated SNPs (OR = 1.75, 95% CI = 1.32–2.30; P = 6.20 × 10⁻⁵) and some evidence for association with nearby loci, although SNP coverage in this region was poor (Supplementary Material, Table S10). This could be a random effect, given the size of the CNTNAP2 gene (2.3 Mb) and the number of tests that we performed (297 SNPs from the CNTNAP2 gene were genotyped). However, the evidence for association with rs11770843 remained significant after correction for the 297 tests. CNTNAP2 is of interest, as this gene was previously linked with AIS (27). The extent of overlap with our data is unclear, but we conclude that further study of CNTNAP2 in AIS cohorts is warranted. CNTNAP2 encodes contactin-associated protein 2, also called neurexin IV, that binds to the immunoglobulin-class neural cell adhesion molecule contactin-2 in cis and to the L1 family of neural cell adhesion molecules (possibly including Chl1) in trans (28). In Drosophila, neurexin IV was recently shown to interact with Robo (29). These data provide additional support for a role of variation in CNTNAP2 in AIS susceptibility. These and other top findings from our GWAS warrant further exploration in additional AIS cohorts.
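The gene-wide correction described for rs11770843 can be checked with simple arithmetic. The sketch below assumes a Bonferroni adjustment, which the text does not name explicitly, and applies it to the reported P-value and the 297 CNTNAP2 SNPs tested; the function name is hypothetical.

```python
def bonferroni_adjust(p_value, n_tests):
    """Bonferroni-adjusted P-value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Reported P-value for rs11770843 and number of CNTNAP2 SNPs genotyped.
adjusted = bonferroni_adjust(6.20e-5, 297)
print(f"{adjusted:.4f}")  # 0.0184
```

Under that assumption, the adjusted value stays below 0.05, which is consistent with the statement that the association remained significant at the gene level, although it is far from the genome-wide threshold of 5.0 × 10⁻⁸.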