Overall, in the final dataset we made an average of 3.5 CNV calls per subject with a median CNV length of 76.4 kb. Of these, 60% correspond to deletions and 40% to duplications (Figure S3).
We contrasted the total CNV burden between TS cases and controls, stratified by size into four categories: <10 kb, 10–100 kb, 100–500 kb and >500 kb. We found a statistically significant increase in the frequency of CNVs >500 kb in cases (27 or 0.15 per individual) compared to controls (15 or 0.06 per individual; p = 0.006). In total, 25 cases (14%) versus 15 controls (6.4%) were found to carry large CNVs, representing an excess of ~7.6% (95% CI: 1.6–13.6%; one-sided Fisher’s exact test p = 0.006). Of the 27 large CNVs found in cases, 24 occurred in regions free of CNVs in controls. Two of the TS cases had two large CNVs each, while no control carried more than one large CNV. Since no controls were available for the CVCR samples, we evaluated the effect of population stratification by testing the correlation of CNV burden with ancestry of the samples, evaluated using PCA. The presence of large CNVs was not correlated with ancestry (p > 0.05 for PCs 1 to 4). We also verified that odds ratio (OR) estimates for large CNVs are consistent whether the CVCR cases are included (95% CI: 1.27–4.96) or not (95% CI: 1.08–5.95), but as expected from a reduction in sample size, when the burden analysis is restricted to Antioquia the significance decreases (one-sided Fisher’s exact test p = 0.16). Because cases and controls were genotyped in two batches (one batch of CVCR cases and one batch of Antioquia cases and controls), we also tested for correlation of genotyping batch with the presence of large CNVs, but found no significant effect.
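As an illustration of the carrier comparison above, the following minimal sketch computes a one-sided Fisher's exact test and a Wald interval for the difference in carrier proportions. The carrier counts (25 cases, 15 controls) are taken from the text. The group sizes are assumptions inferred from figures reported elsewhere in this section (232 cases examined in total minus 53 additional cases gives 179; 234 controls). The text does not state how the confidence interval was computed, so the Wald interval is also an assumption, and the function names are illustrative only.

```python
from math import comb, sqrt


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a) under the hypergeometric null, where X is the
    number of carriers that fall in the first row (here: cases).
    """
    row1 = a + b          # size of the first group
    col1 = a + c          # total number of carriers
    n = a + b + c + d     # total sample size
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k)
               for k in range(a, upper + 1))
    return tail / comb(n, col1)


def risk_difference_ci(x1, n1, x2, n2, z=1.959964):
    """Difference in proportions with a Wald 95% interval (assumed method)."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff, diff - z * se, diff + z * se


# Group sizes are inferred from the text, not stated with the burden result.
N_CASES = 179      # assumption: 232 cases examined in total minus 53 added later
N_CONTROLS = 234   # Antioquian controls reported in the text
CARRIERS_CASES, CARRIERS_CONTROLS = 25, 15

p_value = fisher_one_sided(CARRIERS_CASES, N_CASES - CARRIERS_CASES,
                           CARRIERS_CONTROLS, N_CONTROLS - CARRIERS_CONTROLS)
diff, lo, hi = risk_difference_ci(CARRIERS_CASES, N_CASES,
                                  CARRIERS_CONTROLS, N_CONTROLS)

print(f"one-sided Fisher p = {p_value:.4f}")
print(f"excess = {diff:.1%} (95% CI {lo:.1%} to {hi:.1%})")
```

Under these assumed group sizes, the excess and its interval come out close to the reported ~7.6% (1.6–13.6%). Small differences are expected if a different interval method or denominator was used in the original analysis.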
CNV burden in TS cases and controls.
We next explored the potential involvement in TS of CNVs at specific genome regions, stratifying by size. We first examined the 24 (out of 27) regions with CNVs >500 kb that were detected only in the cases. Of these, 4 did not include exons of any annotated gene. The remaining 20 mapped to 15 different genomic regions. Two of these contain genes for uncharacterized proteins with no known functions (LOC284749). The remaining 18 large CNVs were located in 13 gene regions (Table S1). Of these regions, 10 presented rearrangements in a single case and some of these regions could be of potential relevance for TS (such as a region on 22q11 overlapping the DiGeorge syndrome critical region (Figure S4–43), which has been implicated in rare unusual TS cases and has also been found to be associated with schizophrenia). Three regions showed rearrangements in more than one TS case. A ~600 kb region on 3q12.1 (overlapping the COL8A1 gene) was duplicated in four cases. Two other regions on 2p22.3 and 5q21.1 (overlapping the BIRC6/TTC27/LTBP1 and the SLCO4C1/SLCO6A1 genes, respectively) were duplicated in two cases each. We also examined genome regions with CNVs <500 kb, focusing solely on those encompassing exons of the same gene in at least two TS cases but not in controls. We identified four such regions, each carrying a CNV in two patients. The largest rearrangements (two ~400 kb deletions) encompass exons 1–3 of the Neurexin1 (NRXN1) gene on 2p16.3 (Figure S4).
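The filter applied to the smaller CNVs (exons of the same gene hit in at least two cases and in no control) can be sketched as follows. This is a hypothetical illustration, not the pipeline used in the study: the record layout, the function names, the gene labels `GENE_A` and `GENE_B`, and all coordinates are invented for the example.

```python
from collections import defaultdict


def overlaps(a_start, a_end, b_start, b_end):
    """True if the closed intervals [a_start, a_end] and [b_start, b_end] intersect."""
    return a_start <= b_end and b_start <= a_end


def recurrent_case_only_genes(cnvs, exons, max_len=500_000, min_cases=2):
    """Genes whose exons are hit by CNVs shorter than max_len in at least
    min_cases distinct cases and in no control.

    cnvs  : list of dicts with keys sample, status ("case" or "control"),
            chrom, start, end
    exons : list of (gene, chrom, start, end) tuples
    """
    carriers = defaultdict(lambda: {"case": set(), "control": set()})
    for cnv in cnvs:
        if cnv["end"] - cnv["start"] >= max_len:
            continue  # large CNVs are analysed separately
        for gene, chrom, ex_start, ex_end in exons:
            if chrom == cnv["chrom"] and overlaps(cnv["start"], cnv["end"],
                                                  ex_start, ex_end):
                carriers[gene][cnv["status"]].add(cnv["sample"])
    return sorted(gene for gene, c in carriers.items()
                  if len(c["case"]) >= min_cases and not c["control"])


# Toy data: gene labels and coordinates are invented for illustration only.
exons = [
    ("GENE_A", "chr2", 1_000, 1_200),
    ("GENE_A", "chr2", 5_000, 5_200),
    ("GENE_B", "chr5", 100, 300),
]
cnvs = [
    {"sample": "case1", "status": "case", "chrom": "chr2", "start": 900, "end": 1_500},
    {"sample": "case2", "status": "case", "chrom": "chr2", "start": 4_000, "end": 6_000},
    {"sample": "case3", "status": "case", "chrom": "chr5", "start": 50, "end": 400},
    {"sample": "ctrl1", "status": "control", "chrom": "chr5", "start": 250, "end": 500},
    {"sample": "case4", "status": "case", "chrom": "chr2", "start": 0, "end": 600_000},
]

result = recurrent_case_only_genes(cnvs, exons)
print(result)
```

Counting distinct samples rather than CNV calls keeps a single individual with two calls in the same gene from being counted as a recurrence.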
Chromosomal regions harbouring large (>500 kb) CNVs overlapping annotated gene exons in at least two TS cases and not in controls.
Regions harbouring smaller CNVs (<500 kb) overlapping gene exons in at least two TS cases but not in controls.
We followed up the COL8A1 and NRXN1 findings using multiplex ligation-dependent probe amplification (MLPA; Methods S1) targeting exons 1 and 2 of COL8A1 and exons 1 to 4 of NRXN1 (with two additional probes 3′ and 5′ of this gene) (Table S2). We carried out MLPA in the Antioquian samples included in the SNP-based analysis for which DNA was available (92 cases and 142 controls). We validated the five SNP-based CNV calls (four on COL8A1 and one on NRXN1) made on these samples (Figure S5-1). MLPA identified an additional three COL8A1 deletions and two NRXN1 deletions not detected in the SNP-based CNV calls (Figures S5-2 and S5-3). No CNVs in COL8A1 were detected by MLPA in the controls. We also applied the COL8A1 MLPA assay to an additional set of 53 TS cases from Antioquia but did not detect further rearrangements in these individuals. Aggregating the results of the SNP-based CNV calls and MLPA, in a total of 232 cases examined we found 7 with rearrangements in COL8A1 (all from Antioquia) and 4 in NRXN1 (3 from Antioquia and 1 from the CVCR). None of the 234 Antioquian controls showed rearrangements in these two gene regions in the SNP-based calls or MLPA. To further support the notion that the CNVs observed here are not simply population polymorphisms, we checked the Database of Genomic Variants (DGV; http://dgvbeta.tcag.ca/dgv/app/home), a curated catalogue of human structural variation, for CNVs in the COL8A1 and NRXN1 gene regions. While there is a considerable number of CNVs in both regions, all of the CNVs that lie within the respective gene itself are between a few hundred bp and ~100 kb long, and therefore substantially shorter than the variants described here. More importantly, the majority of these variants do not affect any of the exons of the respective genes, the only exception being a 100 kb deletion affecting NRXN1 exons 7–9 (DGV Variation_2383). This variant affects a different region from the variants observed here; in addition, it was found in only one out of 540 chromosomes and is therefore also not likely to represent a common population polymorphism. Overall, the size and position of the variants identified here, both in COL8A1 and NRXN1, do not show any overlap with common population polymorphisms.
Number of TS cases and controls with CNVs affecting COL8A1 and NRXN1 detected using SNP-based calls, MLPA or both.
To evaluate the possibility that the COL8A1 and NRXN1 rearrangements detected in TS cases could represent de novo mutations, we applied the MLPA assay to the parents of TS cases with rearrangements in these two gene regions. We considered only the patients for whom DNA from both parents was available and confirmed relatedness in each trio. This included two cases with COL8A1 duplications and three cases with NRXN1 deletions (all from Antioquia). The same duplication was found in a parent in each of the two cases with COL8A1 duplications examined, indicating that this variant was inherited. This and the observation of similar boundaries for the COL8A1 duplications in the SNP-based CNV calls suggest that this variant is segregating in the Antioquian population. Deletion of NRXN1 5′ exons was found in the father of one of the cases with an NRXN1 deletion (GT64.1) but not in the parents of the two other cases with this deletion, indicating a de novo mutation in these two trios. The father of case GT64.1 has a diagnosis of OCD, a condition that shows significant co-morbidity with TS and may share common predisposing factors with it (interestingly, the paternal grandfather is reported to have suffered from OCD; however, his CNV status is unknown). One of the two de novo NRXN1 deletions identified occurred in a proband who had no family history of TS (case GT5.1, Figure S5-2a). The second case with a de novo NRXN1 deletion (GT34.1, Figure S5-2b) had a history of TS/OCD on the paternal side of his family.
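The trio logic described above reduces to a simple rule, sketched here under stated assumptions. A proband's CNV is called de novo only when both parents were tested, relatedness was confirmed, and neither parent carries the variant. The function name and the input layout are hypothetical; the carrier information for the three NRXN1 trios is as reported in the text.

```python
def classify_inheritance(carrier_parents, trio_confirmed=True):
    """Classify a proband's CNV from parental MLPA results.

    carrier_parents : set of parents ("mother", "father") carrying the CNV
    trio_confirmed  : True if both parents were tested and relatedness
                      was confirmed; otherwise no call is made
    """
    if not trio_confirmed:
        return "undetermined"
    return "inherited" if carrier_parents else "de novo"


# NRXN1 deletion trios as reported in the text (all with both parents tested).
nrxn1_trios = {
    "GT64.1": {"father"},  # deletion also found in the father
    "GT5.1": set(),        # not found in either parent
    "GT34.1": set(),       # not found in either parent
}

calls = {proband: classify_inheritance(parents)
         for proband, parents in nrxn1_trios.items()}
print(calls)
```

The explicit `trio_confirmed` flag reflects the restriction to patients with DNA from both parents and confirmed relatedness; without it, an untested or unrelated parent would be misread as evidence for a de novo event.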