Targeted regions: design strategy
Our primary goal was to develop an efficient mutation-screening strategy for the diagnosis of patients with phenotypes evocative of BBS, or of clinically overlapping ciliopathies. We chose a target enrichment approach coupled with NGS in order to focus the sequencing on genomic regions of interest. We targeted all exons (including 5′ and 3′ UTRs) of the 16 known BBS genes (). Because of the known clinical overlap, we also included coding exons of ALMS1
, and of all 12 known nephronophthisis genes (NPHP1
), since retinal degeneration can often be observed in this kidney-specific disease.9
Coding sequences of AHI1/JBTS3
, and of the proposed BBS-modifier CCDC28B/MGC1203
, were also targeted.18
Because some of these genes are associated with multiple phenotypes, our design includes 6 MKS, 7 JBTS and 4 Senior-Loken syndrome (SLSN) genes (see ).
With this first design, we wanted to investigate whether including intronic sequences could favour both, the detection and sizing of exon deletions. We therefore included baits-targeting intronic sequences of BBS1
. This choice was dictated by two observations: an apparent excess of patients heterozygous for the BBS1-recurrent mutation M390R with no second mutation detected,11
and multiple reports of BBS4
exon deletions in patients.4
A maximal threshold of 200 kb for cumulated targeted regions was set because of the manufacturer's pricing limits.
Presence of repeated sequences precluded bait tiling in 19.7% of initially targeted regions. This concerned, almost exclusively, introns of BBS1 and BBS4, besides a small number of 3′UTRs, and only 128 bp of protein coding regions (within first exons of ALMS1 and NPHP3; table S1).
Proof-of-principle and technical results
In our proof-of-principle experiment, we selected 16 DNA samples, of which 14 were with known BBS mutations. In this first trial, after barcoding the target-enriched libraries, we sequenced pools of four or eight libraries per lane of a GAIIx (see Supplementary Methods). This proof-of-principle analysis was carried blind, that is, without knowledge of implicated BBS genes and their associated mutations. A constellation of all mutation types (missenses, nonsenses, splice mutations, large deletions and complex rearrangements) at different allelic dosage was tested (figure S1). All 14 previously identified mutations, including two heterozygous BBS1-deletions (), were detected in their correct heterozygous/homozygous state (table S2). In particular, in patient AKE12, we could detect an abnormal local drop of coverage in BBS12 due to a rare mutation type (insertion of an Alu sequence, figure S1A) although the exact nature of the mutation could only be determined by Sanger sequencing. A similar drop in coverage was observed for a second patient, AHX91, with another complex mutation detected previously by Sanger sequencing (insertion/inversion in BBS5).
Figure 2 Detection of large deletions in three patients using a depth-of-coverage method. Black peaks: normalized depth of coverage from patients' DNA samples. Empty peaks: normalized mean depth of coverage across samples from the same sequencing lane. Grey squares: (more ...)
In this first experiment, we almost systematically reached the maximal theoretical coverage of 144x illustrated by a mean coverage of 127±4x after removal of duplicate reads (). Due to this global saturating coverage when considering unique reads, we used all reads, including duplicates, when applying our depth-based method for the detection of CNVs.
Sequencing statistics of both coverage (in captured regions) and capture efficiency
These promising depth-of-coverage results (, table S3) encouraged us to further increase the number of pooled samples. In the next experiments we used a single capture reaction for two barcoded libraries, allowing both cost and bench-time savings, and pooled 12 libraries per sequencing lane (maximum number of barcodes proposed at that time by Illumina).
This new protocol was performed on a second cohort of 36 patients with unknown mutations. Sequencing resulted in a mean coverage of 78±17× (283±153x before discarding duplicate reads) with 91.4±6.4% of targeted regions being covered more than 40x (). This relative drop of coverage appears to be a consequence of a lower capture efficiency that might be due to: (1) an input amount of individual library reduced by half, due to the pre-capture pooling and (2) the addition of barcodes before capture, leading to less efficient blocking and unspecific hybridisation. The resulting coverage still guarantees a reliable detection of variants and of their homozygous/heterozygous state.
A small proportion of targeted regions was weakly covered in some patients (ie, depth <10x after duplicates filtering), with very few of them in a systematic way in other patients (table S4). This only concerned 0.63±0.68% of protein coding regions, and mostly included intronic GC-rich sequences (GC content: 68.3±5% vs 40.2±10% across all targeted regions), or some first exons (tables S3 and S4).
Variant filtering: importance of databases and frequency data
In targeted regions, we detected, on average, 170 variants (Single Nucleotide Variants (SNVs) and indels) per patient. All were systematically analysed for putative effect on protein structure and splice sites using VaRank (, Supplementary Methods). About 130 of these variants were recorded in dbSNP134 (table S5), but only 20 were validated with at least two independent methods and, therefore, filtered out. Indeed, in the context of a rare recessive disorder, some true mutations can be present at very low frequency in a heterozygous state in controls.
Potential pathogenicity of the remaining 150 variants was assessed using bioinformatic tools and considering their allele frequency in a European-American population, as reported in the Exome Variant Server database (EVSdb). This yielded from zero to six interesting variants per sample, among which were obvious truncating or known mutations in some patients.
The new ‘clinical significance’ field introduced in dbSNP134 has to be considered with caution since established mutations can now be reported in the database but are not systematically flagged as pathogenic (example: rs179363897, p.R138H) mutation in BBS5
). Conversely, we detected some false-positive annotations: rs4784677 (p.N70S) in BBS2
—initially reported as a third allele according to the triallelic hypothesis—20
is flagged as pathogenic, but is too frequent to be a fully penetrant mutation (0.77% in EVSdb). Filters have to be carefully adapted to the disorder of interest, and to the constantly evolving updates of databases.
Detection of exon deletions
One advantage of NGS-based strategies, as opposed to Sanger sequencing, is the opportunity to detect—in addition to SNVs and small indels—CNVs affecting one or more exons ( and S2). In the proof-of-principle
experiment, two heterozygous deletions could be detected in BBS1
. Among the unknown samples, two homozygous deletions in BBS3/ARL6
were identified. To our knowledge, we provide here the first report of large deletions in BBS1
, while several deletions affecting BBS4
have been previously observed.4
Since we also targeted intronic sequences of BBS1
, we were able to narrow the boundaries of subsequent detected deletions (). For patient AMV5, by using coordinates of affected exons, the estimated size of BBS1
deletion would be between 466 and 4707 bp, while with our design, we could restrict it to 1862–3841 bp (, table S2). In patient P3, we could similarly reduce the assessed size of the BBS4
deletion from 4626–12 975 bp based on exon positions down to 9376–10 469 bp (, ). Lastly, since the BBS3/ARL6
deletion in patient ALG42 encompasses the first three exons of the gene (), we tested whether it may extend and affect EPHA6
located upstream, encoding an ephrin receptor. Direct PCR testing excluded such extended deletion (data not shown).
Table 3 Identified mutations and other potentially pathogenic variants in the 38 patients with previously unknown genotype. A) Patients with two clearly pathogenic variants in one gene; B) Patients with a single or no clear pathogenic variant in one gene. Mutations (more ...)
Thanks to this method, in the six patients in whom we detected a single heterozygous potentially pathogenic mutation, we can ascertain that no heterozygous deletion is present in trans, or at least none encompassing exonic sequences.
Distribution of detected BBS mutations in the 38 unknown patients
Of the 38 samples with unknown BBS mutations (36+2 from the proof-of-principle experiment) we detected clearly pathogenic biallelic mutations in 26 cases (68.4%; ). To our knowledge, seventeen of these mutations have not been reported previously, indicating that the BBS mutation spectrum is far from being saturated in spite of numerous BBS mutation reports. Homozygous mutations were found in 88% (23/26) of mutated patients, coherent with the large number of consanguineous probands included in the cohort (75%; 25/33). In two patients of consanguineous origin, the BBS mutation was located outside the homozygous regions detected by prior SNP array analysis and would, therefore, have been missed using a homozygosity mapping strategy (patients AGL23, AKX44; ).
Among the remaining 12 patients with no biallelic mutations identified (), one patient (AHR2) had a heterozygous clearly pathogenic splicing mutation in BBS3. Five patients had heterozygous missenses predicted to be damaging by SIFT, Polyphen2 and/or Mutation T@ster (AIY87, AIX45, AMO77, AMA28, AHL86) with the latter two carrying such variants in two different genes. One consanguineous Melanesian patient, AKE98, presented with classical BBS-inclusion features, including polydactyly. He carried two homozygous variants which initially appeared as potentially pathogenic: a distal frameshift in INVS/NPHP2 predicted to add 14 amino-acids to the C-terminus of the longer protein isoform, and a non-reported missense P2679L affecting a conserved residue in ALMS1 (figure S3). Subsequent segregation analysis ruled out their implication in the disease since both variants were heterozygous in a similarly affected brother.
In five patients, no potentially pathogenic variant could be identified in any of the 30 targeted genes. These patients are thus candidate for exome sequencing which might either help in identifying novel genes or in reconsidering the clinical diagnosis.
Mutation load in BBS and other targeted genes: importance of ALMS1
The mutation load among BBS genes in our cohort appears consistent with previous reports.7
Observed occurrences for BBS1
(7/38, 18.4%) and BBS2
mutations (5/38, 13.2%), the most frequently mutated genes in our study, are similar to the respective reported figures of 16.9% and 12%.7
Considering frequently mutated genes, our study was strongly biased against BBS1
, since two-thirds of the patients' DNAs were previously tested negative for BBS1
recurrent mutations, plus all BBS12
protein coding sequence. This explains the total absence of BBS12
mutations in our cohort, and the relatively low contribution of BBS10
(5/38, 13.2%) as compared with the literature (≥20%).7
The contribution of other BBS genes was low, with frequently only one proband involved.
We did not find any mutation in the ‘new’ BBS genes (BBS13
) suggesting that, cumulatively, they have a small contribution to the total mutation load. BBS13/MKS1
was indeed shown to be mostly implicated in MKS since only one BBS patient was reported with two heterozygous mutations p.[C492W]; [F371del] others carried only heterozygous missenses, sometimes in addition to homozygous truncating BBS1
Likewise, for BBS14/CEP290
, a homozygous truncating mutation (p.E1903*) was found in a single BBS patient,21
while other mutations are much more often implicated in Joubert, Senior Loken, Leber Congenital Amaurosis or Meckel syndromes. Similar observations can be made for BBS15/WDPCP
Like in all other studies of BBS cohorts, no mutation was identified in BBS11/TRIM32
raising the question of its real implication in BBS: only one homozygous missense mutation was described in a single consanguineous family, while several other mutations were identified in recessive forms of limb girdle muscular dystrophy.24
One noteworthy result is the finding of homozygous truncating ALMS1
mutations in 3/38 patients (AIA84, AKO26, ALB64; 7.9%). In particular, the nonsense found in AKO26 patient p.R3629* seems to be a recurrent ALMS1
mutation, since already reported in five other ALMS patients.25–27
The phenotypic overlap between BBS and ALMS seems to be larger than previously thought, as recently suggested with examples of Alström patients with mutations in BBS genes,5
and the reverse situation, such as in our study, of ALMS1
mutations in patient with suspected BBS.3
Lastly, no clearly pathogenic mutation was found in any NPHP or JBTS genes in the cohort.
Correlation between mutation detection efficiency and clinical phenotype
Comparison of clinical phenotype between patients with two clearly pathogenic mutated alleles (n=26) and those with either a single possible pathogenic variant or no suspicious variant detected (n=12) showed a clear correlation between the number of major BBS clinical features and the probability of detecting two BBS mutated alleles in patients (). Biallelic mutations were detected in 81% (CI (60% to 92%)) of patients meeting BBS inclusion criteria. In particular, in Tunisian patients recruited upon strict clinical criteria, mutations were found in 11/11 cases and in seven different BBS genes, ruling out a potential founder effect. On the contrary, for some of the 12 patients without clear mutations, BBS was only one suspected diagnosis among others. Furthermore, our initial selection of patients without recurrent mutations in BBS1 or BBS10, and without any mutation in BBS12 may have enriched our cohort in patients with non-typical BBS phenotypes. The current widely quoted estimation that known BBS genes account for only 70–75% of the total mutation load in BBS patients may thus be underestimated if considering only patients with strictly defined BBS phenotype.
Figure 3 Compliance with classical BBS phenotype is positively correlated to the efficiency to detect principal mutations in BBS genes. (A) The number of BBS diagnostic major inclusion criteria6 in patients is correlated to an efficient detection of BBS mutations. (more ...)
The distribution of BBS inclusion features appears different between patients with two BBS mutations, two ALMS1 mutations or no biallelic mutation identified (). Patients with no detected mutation presented with significantly less polydactyly, a major BBS clinical sign: only 25% versus 70% in patients with detected BBS mutations (p=0.029*). The other clinical features seem to follow the trend of classical BBS patients.
Report of major BBS clinical features in the 38 patients without previously known molecular diagnosis, with or without detected mutations
-mutated probands, 2/3 had been sent for suspected BBS (Prader-Willi or ALMS were also considered for AIA84 and AKO26, respectively) and satisfied BBS diagnostic criteria; the last one (ALB64) was addressed for syndromic retinal dystrophy ().6
AIA84 presents a classical BBS with retinal dystrophy, obesity, cognitive defects, hypogonadism and brachydactyly. AKO26 presents an atypical BBS with the same features along with abnormal severe deafness, specific for ALMS. Lastly, ALB64 presents a typical ALMS with severe deafness and retinal dystrophy. None of them presented with polydactyly. As previously suggested,3
both, the absence of polydactyly and the prevalence of deafness in ALMS1
-mutated patients, are keys for genotype-phenotype discrimination between ALMS and BBS mutated patients.
Assessment of oligogenism in BBS
The presence and potential effect of triallelism or oligogenism in BBS has been widely discussed and appears controversial (18
). In our approach, the simultaneous sequencing of all 16 BBS genes, and of 14 other genes involved in overlapping ciliopathies, allows the systematic detection of most additional potentially pathogenic variants in those genes and, consequently, an unbiased assessment of oligogenism.
Out of the 52 patients analysed, we found only one heterozygous truncating mutation (p.K1870Nfs*4) as a third allele in BBS14/CEP290
in patient AKR68 who carries a pathogenic missense mutation in BBS10
. Such a frequency is in fact in the range of what to expect by chance. We previously calculated the probability of carrying a true BBS-pathogenic mutation to be about 1:50.31
Since we also included in our design other ciliopathy genes, the probability to carry a pathogenic mutation in one of the 30 genes is rather between 1:20 and 1:30 (calculation based on each disease incidence and reported contributions of targeted genes in the mutation load). Potentially, pathogenic heterozygous missenses (not previously reported in patients or in the EVSdb) were also found in eight patients (three of the proof-of-principle
, table S2; five of the ‘unknown’ cohort, ). Such variants might act as modifiers, but it is unlikely that they are required for full expression of a classical BBS phenotype. Conversely, in some patients where a single clearly pathogenic mutation was found, variants in other genes of the same pathway (especially those encoding proteins of the same complex, such as the BBSome or the BBS-chaperonin complex)33
might contribute to the disease state in a digenic mode of inheritance proposed in few BBS families.20
Potential case of triallelism is illustrated by patient AIZ62, who is compound heterozygous or a nonsense p.E191* and a missense p.A242S in BBS6/MKKS. The pathogenicity of A242S variant has been a subject of discussion.37–39
Analysis in zebrafish indicated that it affects BBS6/MKKS function and suggested a dominant negative effect.40
EVSdb allows to infer its frequency at 0.59% (CI (0.43 to 0.80%)), higher than the most frequent BBS mutation, M390R in BBS1 (0.24%, CI (0.15 to 0.39%)). A242S cannot thus be a highly penetrant mutation since it should then be found more frequently in patients than the M390R mutation, which is not the case. In patient AIZ62, a third heterozygous variant was identified in BBS12 (p.Q620R, residue conserved in mammals, but not in more distant vertebrates) thus affecting another subunit of the BBS-chaperonin complex.34
We suggest that A242S is a hypomorphic allele that may lead to a phenotype when in trans
, with a complete null mutation, and could be further potentiated by a hypomorphic allele affecting another subunit of the same complex. Segregation analysis in AIZ62 family could not be performed to test this hypothesis.
Lastly, we looked in our cohort for the allelic frequency of the previously proposed BBS-modifier variant c.330C→T in CCDC28B/MGC1203
and found a frequency of 3.85% (CI (1.56 to 9.47%)), which is not significantly different (p=0.17) from the 2.07% (CI (1.76 to 2.43%)) observed in EVSdb.