In this paper, we perform genome-wide analyses of an isolated Chilean population affected by Specific Language Impairment (SLI). Homozygosity mapping and parametric linkage analyses did not identify any chromosome segments that co-segregate with SLI in this population, suggesting that a completely penetrant monogenic aetiology is unlikely. This hypothesis is further supported by the observed nature of the language impairments. Affected individuals do not present with a specific core phenotype, as might be predicted under a monogenic model, but instead show extensive heterogeneity in the severity and nature of impairment, as is typical of complex genetic forms of SLI.
The most consistent region of linkage extended across 48 Mb of chromosome 7q (chromosome position 111 965). This region reached a maximum NPL score of 6.73 (P = 4.0 × 10⁻¹¹) and achieved genome-wide significance in all three non-parametric analyses performed and overlapped with a peak of parametric linkage (recessive model max HLOD = 1.24) and two segments of homozygosity. Although these are not independent observations and a number of alternative analyses were performed, the reliability of the linkage in this region is consistent with that expected from a true positive.
Segregation analyses identified a two-SNP haplotype that was found at a marginally higher frequency in cases than in controls (P = 0.008). This haplotype fell across the NOBOX (OMIM no. 610934) and TPK1 (OMIM no. 606370) genes. NOBOX is a homeobox gene that is preferentially expressed in oocytes but has not been reported to be expressed in brain.28 TPK1 encodes the thiamine pyrophosphokinase 1 enzyme, which catalyses the conversion of thiamine to thiamine pyrophosphate. Thiamine (or vitamin B1) is essential for the metabolism of carbohydrates into glucose and acts as a co-enzyme in the production of acetylcholine. Thiamine deficiency forms part of numerous disorders including ataxia, confusion and impaired memory.29 Interestingly, a recent study suggested a link between thiamine deficiency and syntactic and lexical disorder.30
The chromosome 7 peak also overlaps with the AUTS1 locus of linkage to autism31 and includes both the FOXP2 genes, both of which have previously been associated with language disorders.9, 32
The genotyping panels utilised in this study were optimised for linkage investigations and thus involve a relatively sparse map of SNPs (~1 SNP every 500 kb). Fine mapping of these regions is therefore required to enable the identification of candidates in an unbiased manner. We found that the two-SNP haplotype on chromosome 7 showed moderate long-range linkage disequilibrium with a number of SNPs, indicating that further information would be required to narrow the linkage peak. Higher density SNP arrays would also enable the detection of smaller runs of homozygosity.
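The link between marker spacing and the smallest detectable run of homozygosity can be illustrated with a minimal sketch. The function below scans an ordered list of genotype calls and reports stretches of consecutive homozygous markers. The function name, the two-character genotype encoding, the `min_snps` threshold and the example data are all illustrative assumptions; they are not the parameters or data used in this study, and missing calls are assumed to be absent.

```python
from typing import List, Sequence, Tuple


def runs_of_homozygosity(
    positions_bp: Sequence[int],
    genotypes: Sequence[str],
    min_snps: int = 3,  # hypothetical threshold, chosen only for illustration
) -> List[Tuple[int, int, int]]:
    """Return (start_bp, end_bp, n_snps) for each run of consecutive homozygous calls.

    Genotypes are two-character strings such as "AA" or "AG"; a call counts as
    homozygous when both characters are identical.
    """
    if len(positions_bp) != len(genotypes):
        raise ValueError("positions and genotypes must have the same length")

    runs: List[Tuple[int, int, int]] = []
    start = None  # index of the first marker in the current run, if any
    for i, call in enumerate(genotypes):
        homozygous = len(call) == 2 and call[0] == call[1]
        if homozygous and start is None:
            start = i
        elif not homozygous and start is not None:
            if i - start >= min_snps:
                runs.append((positions_bp[start], positions_bp[i - 1], i - start))
            start = None
    # Close a run that extends to the last marker.
    if start is not None and len(genotypes) - start >= min_snps:
        runs.append((positions_bp[start], positions_bp[-1], len(genotypes) - start))
    return runs


# Hypothetical markers spaced 500 kb apart, mimicking a sparse linkage panel.
positions = [i * 500_000 for i in range(8)]
calls = ["AG", "AA", "GG", "CC", "TT", "AG", "CC", "CT"]
print(runs_of_homozygosity(positions, calls))
```

With these made-up calls the sketch reports a single run of four markers spanning 0.5 to 2.0 Mb. At a spacing of 500 kb and a three-marker minimum, no run shorter than 1 Mb can be reported, whereas denser arrays lower that floor in proportion to the marker spacing.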
We did not observe any linkage to chromosomes 16 or 19, which have previously been implicated in SLI.5, 6, 33 Again, this may be caused by the low density of markers investigated in the present study. Alternatively, as the loci on chromosomes 16 and 19 were identified by a quantitative genome screen of language-related measures, this may reflect differences in study design. As the Chilean quantitative linguistic data were collected only for subjects within a restricted age range (between 3 and 9 years), the current study utilised a binary affection status. This is similar to the approach applied by Bartlett et al (2002, 2004) in their genome screen for SLI, in which they identified a region of linkage on chromosome 13 (SLI3) that overlaps with that found by the present study. This region has also been linked to autism,34 a result which was strengthened by the selection of families on the basis of linguistic data.35
Our chromosome 13 linkage consisted of two adjacent peaks. The distal peak (34–48 Mb) overlapped with a segment of homozygosity and achieved a maximum NPL score of 4.8 (P = 8.0 × 10⁻⁷) using CEPH allele frequencies. The proximal peak (83–94 Mb) reached an NPL of 3.5 (P = 0.0002) under all non-parametric analyses performed and coincided with an area of marginal linkage under a recessive parametric model.
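As a rough consistency check on scores of this kind, an NPL Z-score can be converted to a one-sided P-value under a standard normal approximation. This approximation is an assumption made here for illustration; linkage software may derive P-values differently (for example, from an exact distribution), so small discrepancies with reported values are to be expected. The function name `npl_to_p` is hypothetical.

```python
import math


def npl_to_p(z: float) -> float:
    """One-sided (upper-tail) P-value for a Z-score under a standard normal."""
    # erfc gives the complementary error function; the upper tail of N(0, 1)
    # at z equals 0.5 * erfc(z / sqrt(2)).
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# The two chromosome 13 NPL scores quoted in the text.
for z in (3.5, 4.8):
    print(f"NPL = {z:.1f} -> approximate one-sided P = {npl_to_p(z):.1e}")
```

Under this approximation both chromosome 13 scores map to P-values close to those quoted above, which suggests (but does not establish) that the reported values follow a similar normal-tail calculation.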
In addition to the linkages on chromosomes 7 and 13, we also observed significant linkage (NPL > 4.08; P < 2.2 × 10⁻⁵) to chromosome 17 and highly significant linkages (NPL > 4.99; P < 3.0 × 10⁻⁷) to chromosomes 6q and 12. However, these peaks were only observed under a single non-parametric model and not in models using alternative expected allele frequencies. It is therefore likely that these divergent results are driven by differences in the allele frequencies of the control populations used and illustrate the importance of correctly estimating allele frequencies, especially for markers that are in linkage disequilibrium.36 Indeed, we found that the correlation of expected allele frequencies between the three different control groups was moderate (0.41–0.70 across all SNPs) and was lower than average across the conflicting regions of linkage on chromosomes 6 and 12 (as low as 0.29 and 0.09, respectively), but remained moderate across the region of linkage on chromosome 7 (0.48–0.67). Importantly, simulation studies indicate that although allele frequency misspecification can lead to false positives, this artefact is not expected to affect the power to detect true linkages.37 Thus, although the loci on chromosomes 6 and 12 reached a threshold of highly significant linkage, as these were observed with only one non-parametric analysis, we must recognise the possibility that they represent false positives, especially given the high number of tests performed. Instead, a more fruitful avenue of investigation may be provided by the examination of regions found to be consistently implicated across all three analyses performed, even in cases where this linkage did not reach genome-wide significance (eg, chromosomes 2, 6p, 8, 9, 15 and 17; Supplementary Figure 1).
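The comparison of expected allele frequencies between control groups can be sketched as a correlation computed over the SNPs in a region. The sketch below assumes Pearson's r, since the text does not state which correlation measure was used, and the panel names and frequency values are made-up placeholders rather than study data.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equal-length sequences of frequencies."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical frequencies of the same allele at five SNPs in two reference panels.
panel_a = [0.10, 0.35, 0.50, 0.72, 0.90]
panel_b = [0.15, 0.30, 0.55, 0.60, 0.95]
print(f"r = {pearson_r(panel_a, panel_b):.2f}")
```

Computing such a coefficient separately for each linkage region, and for each pair of control groups, would flag regions where the choice of reference frequencies is most likely to change the non-parametric result.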
In conclusion, this study has applied a genome-wide approach to identify loci which may contain genes underlying susceptibility to SLI in an isolated population. This study represents the first step in the detection of genetic variants that underlie the increased frequency of language impairments in this population. It is envisaged that the fine mapping of the identified loci will allow the detection of associated polymorphisms. It is likely that the variants identified by the further study of this population will have a significant role in furthering our understanding of the genetic basis of language impairments and language development.