Clinical evaluation of CNVs identified via techniques such as array comparative genomic hybridisation (aCGH) involves the inspection of lists of known and unknown duplications and deletions with the goal of distinguishing pathogenic from benign CNVs. A key step in this process is the comparison of the individual's phenotypic abnormalities with those associated with Mendelian disorders of the genes affected by the CNV. However, because little is often known about these human genes, model organism phenotype data could serve as an additional source of information. Currently, almost 6000 genes in mouse and zebrafish are, when knocked out, associated with a phenotype in the model organism, but no disease is known to be caused by mutations in the human ortholog. Yet, searching model organism databases and comparing model organism phenotypes with patient phenotypes for identifying novel disease genes and medical evaluation of CNVs is hindered by the difficulty in integrating phenotype information across species and the lack of appropriate software tools.
Here, we present an integrated ranking scheme based on phenotypic matching, degree of overlap with known benign or pathogenic CNVs and the haploinsufficiency score for the prioritisation of CNVs responsible for a patient's clinical findings.
We show that this scheme leads to significant improvements compared with rankings that do not exploit phenotypic information. We provide a software tool called PhenogramViz, which supports phenotype-driven interpretation of aCGH findings based on multiple data sources, including the integrated cross-species phenotype ontology Uberpheno, in order to visualise gene-to-phenotype relations.
Integrating and visualising cross-species phenotype information on the affected genes may help in routine diagnostics of CNVs.
Purpose and scope
The aim of this Position Statement is to provide recommendations for Canadian medical geneticists, clinical laboratory geneticists, genetic counsellors and other physicians regarding the use of genome-wide sequencing of germline DNA in the context of clinical genetic diagnosis. This statement has been developed to facilitate the clinical translation and development of best practices for clinical genome-wide sequencing for genetic diagnosis of monogenic diseases in Canada; it does not address the clinical application of this technology in other fields such as molecular investigation of cancer or for population screening of healthy individuals.
Methods of statement development
Two multidisciplinary groups consisting of medical geneticists, clinical laboratory geneticists, genetic counsellors, ethicists, lawyers and genetic researchers were assembled to review existing literature and guidelines on genome-wide sequencing for clinical genetic diagnosis in the context of monogenic diseases, and to make recommendations relevant to the Canadian context. The statement was circulated for comment to the Canadian College of Medical Geneticists (CCMG) membership-at-large and, following incorporation of feedback, approved by the CCMG Board of Directors. The CCMG is a Canadian organisation responsible for certifying medical geneticists and clinical laboratory geneticists, and for establishing professional and ethical standards for clinical genetics services in Canada.
Results and conclusions
Recommendations include (1) clinical genome-wide sequencing is an appropriate approach in the diagnostic assessment of a patient for whom there is suspicion of a significant monogenic disease that is associated with a high degree of genetic heterogeneity, or where specific genetic tests have failed to provide a diagnosis; (2) until the benefits of reporting incidental findings are established, we do not endorse the intentional clinical analysis of disease-associated genes other than those linked to the primary indication; and (3) clinicians should provide genetic counselling and obtain informed consent prior to undertaking clinical genome-wide sequencing. Counselling should include discussion of the limitations of testing, likelihood and implications of diagnosis and incidental findings, and the potential need for further analysis to facilitate clinical interpretation, including studies performed in a research setting. These recommendations will be routinely re-evaluated as knowledge of diagnostic and clinical utility of clinical genome-wide sequencing improves. While the document was developed to direct practice in Canada, the applicability of the statement is broader and will be of interest to clinicians and health jurisdictions internationally.
Genome-Wide Sequencing; Canadian Healthcare System; Return of Results; Position Statement
Multiple clinical scoring systems have been proposed for Silver-Russell syndrome (SRS). Here we aimed to test a clinical scoring system for SRS and to analyse the correlation between (epi)genotype and phenotype.
Subjects and methods
Sixty-nine patients were examined by two physicians. Clinical scores were generated for all patients, with a new, six-item scoring system: (1) small for gestational age, birth length and/or weight ≤−2SDS, (2) postnatal growth retardation (height ≤−2SDS), (3) relative macrocephaly at birth, (4) body asymmetry, (5) feeding difficulties and/or body mass index (BMI) ≤−2SDS in toddlers; (6) protruding forehead at the age of 1–3 years. Subjects were considered to have likely SRS if they met at least four of these six criteria. Molecular investigations were performed blind to the clinical data.
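The decision rule described above (likely SRS when at least four of the six criteria are met) can be sketched in a few lines. This is only an illustration: the class and field names are hypothetical, each criterion is assumed to have already been assessed clinically as present or absent, and the code is not the authors' implementation.

```python
from dataclasses import dataclass, astuple


@dataclass
class SrsCriteria:
    """One boolean per item of the six-item score (field names are hypothetical)."""
    small_for_gestational_age: bool        # (1) birth length and/or weight <= -2 SDS
    postnatal_growth_retardation: bool     # (2) height <= -2 SDS
    relative_macrocephaly_at_birth: bool   # (3)
    body_asymmetry: bool                   # (4)
    feeding_difficulties_or_low_bmi: bool  # (5) feeding difficulties and/or BMI <= -2 SDS
    protruding_forehead: bool              # (6) at the age of 1-3 years


# Threshold stated in the text: at least four of six criteria.
LIKELY_SRS_MIN_CRITERIA = 4


def clinical_score(criteria: SrsCriteria) -> int:
    """Count how many of the six criteria are met."""
    return sum(astuple(criteria))


def classify(criteria: SrsCriteria) -> str:
    """Return the group label used in the abstract."""
    if clinical_score(criteria) >= LIKELY_SRS_MIN_CRITERIA:
        return "Likely-SRS"
    return "Unlikely-SRS"


# Hypothetical patient meeting criteria 1, 2, 3 and 5.
patient = SrsCriteria(True, True, True, False, True, False)
print(clinical_score(patient), classify(patient))  # 4 Likely-SRS
```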
The 69 patients were classified into two groups (Likely-SRS (n=60), Unlikely-SRS (n=9)). Forty-six Likely-SRS patients (76.7%) displayed either 11p15 ICR1 hypomethylation (n=35; 58.3%) or maternal UPD of chromosome 7 (mUPD7) (n=11; 18.3%). Eight Unlikely-SRS patients had neither ICR1 hypomethylation nor mUPD7, whereas one patient had mUPD7. The clinical score and molecular results yielded four groups that differed significantly overall and for individual scoring system factors. Further molecular screening led to the identification of chromosomal abnormalities in the Likely-SRS-double-negative and Unlikely-SRS groups. Four Likely-SRS-double-negative patients carried a DLK1/GTL2 IG-DMR hypomethylation, a mUPD16, a mUPD20 and a de novo 1q21 microdeletion.
This new scoring system is very sensitive (98%) for the detection of patients with SRS with demonstrated molecular abnormalities. Given its clinical and molecular heterogeneity, SRS could be considered as a spectrum.
Russell Silver Syndrome; Silver Russell Spectrum; Clinical scoring system; ICR1 11p15 hypomethylation and mUPD7; phenotypic-genotypic correlation
Although BRCA1 and BRCA2 mutations account for only ∼27% of the familial aggregation of ovarian cancer (OvC), no OvC risk prediction model currently exists that considers the effects of BRCA1, BRCA2 and other familial factors. Therefore, a currently unresolved problem in clinical genetics is how to counsel women with family history of OvC but no identifiable BRCA1/2 mutations.
We used data from 1548 patients with OvC and their relatives from a population-based study, with known BRCA1/2 mutation status, to investigate OvC genetic susceptibility models, using segregation analysis methods.
The most parsimonious model included the effects of BRCA1/2 mutations, and the residual familial aggregation was accounted for by a polygenic component (SD 1.43, 95% CI 1.10 to 1.86), reflecting the multiplicative effects of a large number of genes with small contributions to the familial risk. We estimated that 1 in 630 individuals carries a BRCA1 mutation and 1 in 195 carries a BRCA2 mutation. We extended this model to incorporate the explicit effects of 17 common alleles that are associated with OvC risk. Based on our models, assuming all of the susceptibility genes could be identified we estimate that the half of the female population at highest genetic risk will account for 92% of all OvCs.
The resulting model can be used to obtain the risk of developing OvC on the basis of BRCA1/2, explicit family history and common alleles. This is the first model that accounts for all OvC familial aggregation and would be useful in the OvC genetic counselling process.
Genetic epidemiology; Ovarian Cancer; Risk prediction; Genome-wide; Genetic screening/counselling
The Canadian Open Genetics Repository is a collaborative effort for the collection, storage, sharing and robust analysis of variants reported by medical diagnostics laboratories across Canada. As clinical laboratories adopt modern genomics technologies, the need for this type of collaborative framework is increasingly important.
A survey to assess existing protocols for variant classification and reporting was delivered to clinical genetics laboratories across Canada. Based on feedback from this survey, a variant assessment tool was made available to all laboratories. Each participating laboratory was provided with an instance of GeneInsight, a software featuring versioning and approval processes for variant assessments and interpretations and allowing for variant data to be shared between instances. Guidelines were established for sharing data among clinical laboratories and in the final outreach phase, data will be made readily available to patient advocacy groups for general use.
The survey demonstrated the need for improved standardisation and data sharing across the country. A variant assessment template was made available to the community to aid with standardisation. Instances of the GeneInsight tool were provided to clinical diagnostic laboratories across Canada for the purpose of uploading, transferring, accessing and sharing variant data.
As an ongoing endeavour and a permanent resource, the Canadian Open Genetics Repository aims to serve as a focal point for the collaboration of Canadian laboratories with other countries in the development of tools that take full advantage of laboratory data in diagnosing, managing and treating genetic diseases.
Genetic screening/counselling; Clinical genetics; Diagnostics tests; Evidence Based Practice; Genetics
Over 40% of male and ~16% of female carriers of a premutation FMR1 allele (55-200 CGG repeats) will develop Fragile X-associated Tremor/Ataxia Syndrome (FXTAS), an adult-onset neurodegenerative disorder, while about 20% of female carriers will develop Fragile X-associated Primary Ovarian Insufficiency. Marked elevation in FMR1 mRNA transcript levels has been observed with premutation alleles, and RNA toxicity due to increased mRNA levels is the leading molecular mechanism proposed for these disorders. However, although the FMR1 gene undergoes alternative splicing, it is unknown if all or only some of the isoforms are overexpressed in premutation carriers and which isoforms may contribute to the premutation pathology.
To address this question we have applied a long-read sequencing approach using single molecule real time (SMRT) sequencing and qRT-PCR.
Our SMRT sequencing analysis performed on peripheral blood mononuclear cells, fibroblasts and brain tissue samples derived from premutation carriers and controls revealed the existence of 16 isoforms out of 24 predicted variants. Although the relative abundance of all isoforms was significantly increased in the premutation group, as expected based on the bulk increase in mRNA levels, there was a disproportionate (4-6 fold) increase, relative to the overall increase in mRNA, in the abundance of isoforms spliced at both exons 12 and 14, specifically Iso10 and Iso10b, containing the complete exon 15 and differing only in splicing in exon 17.
These findings suggest that RNA toxicity may arise from a relative increase of all FMR1 mRNA isoforms. Interestingly, the Iso10 and Iso10b mRNA isoforms, lacking the C-terminal functional sites for FMRP function, are the most increased in premutation carriers relative to normal, suggesting a functional relevance in the pathology of FMR1 associated disorders.
FMR1; isoforms; premutation; alternative splicing; RNA toxicity
Genotype-phenotype correlations are poorly characterized in arrhythmogenic right ventricular cardiomyopathy (ARVC). We investigated whether carriers of rare variants in desmosomal genes (DC) and titin gene (TTN) display different phenotypes and clinical outcomes, when compared to non-carriers (NT-ND).
Methods and Results
Thirty-nine ARVC families (173 subjects, 67 affected) with extensive follow up (mean 9 years), prospectively enrolled in the International Familial Cardiomyopathy Registry since 1991, were screened for rare variants in TTN and desmosomal genes (DSP, PKP2, DSG2, DSC2). Multiple clinical and outcome variables were compared between 3 genetic groups (TTN, DC, NT-ND) to define genotype-phenotype associations.
Of the 39 ARVC families, 13% (5/39) carried TTN rare variants (11 affected subjects), 13% (5/39) DC (8 affected), while 74% (29/39) were NT-ND (48 affected). Compared to NT-ND, DC had a higher prevalence of inverted T waves in V2-3 (75% vs. 31%, p=0.004), while TTN had more supraventricular arrhythmias (46% vs. 13%, p=0.013) and conduction disease (64% vs. 6% p<0.001). Compared to the NT-ND group, the DC group experienced a worse prognosis (67% vs. 11%, p=0.03) and exhibited a lower survival free from death or heart transplant (59% vs. 95% at 30 years, and 31% vs. 89% at 50 years, HR 9.66, p=0.006), while the TTN group showed an intermediate survival curve (HR 4.26, p=0.037).
TTN carriers display distinct phenotypic characteristics including a greater risk for supraventricular arrhythmias and conduction disease. Conversely, DC are characterized by negative T waves in anterior leads, severe prognosis, high mortality and morbidity.
Arrhythmogenic right ventricular cardiomyopathy; human genome; sudden cardiac death; desmosome; titin
In populations of European ancestry, the genetic contribution to body mass index (BMI) increases with age during childhood but then declines during adulthood, possibly due to the cumulative effects of environmental factors. How the effects of genetic factors on BMI change with age in other populations is unknown.
Subjects and methods
In a rural Gambian population (N=2535), we used a combined allele risk score, comprising genotypes at 28 ‘Caucasian adult BMI-associated’ single nucleotide polymorphisms (SNPs), as a marker of the genetic influence on body composition, and related this to internally-standardised z-scores for birthweight (zBW), weight-for-height (zWT-HT), weight-for-age (zWT), height-for-age (zHT), and zBMI cross-sectionally and longitudinally.
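A combined allele risk score of this kind is commonly computed by counting, for each individual, the number of risk alleles carried across the selected SNPs. The sketch below assumes an unweighted count and genotypes given as two-letter strings; the SNP identifiers, risk alleles and function name are hypothetical placeholders and do not come from the study.

```python
def allele_risk_score(genotypes: dict[str, str], risk_alleles: dict[str, str]) -> int:
    """Unweighted risk score: total number of risk alleles carried.

    genotypes    -- SNP id -> two-letter genotype, e.g. "AG" (assumed encoding)
    risk_alleles -- SNP id -> BMI-increasing allele, e.g. "A"
    Each SNP contributes 0, 1 or 2 to the score.
    """
    score = 0
    for snp, risk in risk_alleles.items():
        if snp not in genotypes:
            # A real analysis would need an explicit rule for missing genotypes.
            raise KeyError(f"missing genotype for {snp}")
        score += genotypes[snp].count(risk)
    return score


# Hypothetical three-SNP example (the study used 28 SNPs).
risk = {"snp_a": "A", "snp_b": "C", "snp_c": "G"}
person = {"snp_a": "AG", "snp_b": "CC", "snp_c": "TT"}
print(allele_risk_score(person, risk))  # 3
```

With 28 SNPs, a score built this way ranges from 0 to 56, which is why effects are naturally reported per allele.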
Cross-sectionally, the genetic score was positively associated with adult zWT (0.018±0.009 per allele, p=0.034, N=1426) and zWT-HT (0.025±0.009, p=0.006), but not with size at birth or childhood zWT-HT (0.008±0.005, p=0.11, N=2211). The effect of the genetic score on zWT-HT strengthened linearly with age from birth through to late adulthood (age interaction term: 0.0083 z-scores/allele/year; 95% CI 0.0048 to 0.0118, p=0.0000032).
Genetic variants for obesity in populations of European ancestry have direct relevance to bodyweight in nutritionally deprived African settings. In such settings, genetic obesity susceptibility appears to regulate change in weight status throughout the life course, which provides insight into its potential physiological role.
genetic risk score; BMI; Gambia; longitudinal analysis; cross-sectional analysis
Germline CDH1 mutations confer a high lifetime risk of developing diffuse gastric (DGC) and lobular breast cancer (LBC). A multidisciplinary workshop was organised to discuss genetic testing, surgery, surveillance strategies, pathology reporting and the patient's perspective on multiple aspects, including diet post gastrectomy. The updated guidelines include revised CDH1 testing criteria (taking into account first-degree and second-degree relatives): (1) families with two or more patients with gastric cancer at any age, one confirmed DGC; (2) individuals with DGC before the age of 40 and (3) families with diagnoses of both DGC and LBC (one diagnosis before the age of 50). Additionally, CDH1 testing could be considered in patients with bilateral or familial LBC before the age of 50, patients with DGC and cleft lip/palate, and those with precursor lesions for signet ring cell carcinoma. Given the high mortality associated with invasive disease, prophylactic total gastrectomy at a centre of expertise is advised for individuals with pathogenic CDH1 mutations. Breast cancer surveillance with annual breast MRI starting at age 30 for women with a CDH1 mutation is recommended. Standardised endoscopic surveillance in experienced centres is recommended for those opting not to have gastrectomy at the current time, those with CDH1 variants of uncertain significance and those that fulfil hereditary DGC criteria without germline CDH1 mutations. Expert histopathological confirmation of (early) signet ring cell carcinoma is recommended. The impact of gastrectomy and mastectomy should not be underestimated; these can have severe consequences on a psychological, physiological and metabolic level. Nutritional problems should be carefully monitored.
Cancer: gastric; Clinical genetics; Diagnostics; Cancer: breast; Stomach and duodenum
Usher syndrome (USH) is a clinically and genetically heterogeneous disease. The three recognised clinical phenotypes (types I, II and III; USH1, USH2 and USH3) are caused by mutations in nine different genes. USH2C is characterised by moderate to severe hearing loss, retinitis pigmentosa and normal vestibular function. One earlier report describes mutations in GPR98 (VLGR1) in four families segregating this phenotype.
To detect the disease-causing mutation in an Iranian family segregating USH2C. In this family, five members had a phenotype compatible with Usher syndrome, and two others had nonsyndromic hearing loss.
Mutation analysis of all 90 coding exons of GPR98.
Consistent with these clinical findings, the five subjects with USH carried a haplotype linked to the USH2C locus, whereas the two subjects with nonsyndromic hearing loss did not. We identified a new mutation in GPR98 segregating with USH2C in this family. The mutation is a large deletion g.371657_507673del of exons 84 and 85, presumably leading to a frameshift.
A large GPR98 deletion of 136 017 bp segregates with USH2C in an Iranian family. To our knowledge, this is only the second report of a GPR98 mutation, and the first report on male subjects with USH2C and a GPR98 mutation.
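The stated deletion size can be checked against the reported coordinates. In this deletion notation both end positions are part of the deleted interval, so the length is the difference plus one; the helper name below is hypothetical.

```python
def deletion_length(first_deleted: int, last_deleted: int) -> int:
    """Length in bp of a deletion whose coordinates are both inclusive."""
    return last_deleted - first_deleted + 1


# Coordinates from g.371657_507673del
print(deletion_length(371657, 507673))  # 136017
```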
Usher syndrome type II; USH2C; VLGR1
Constitutional DICER1 mutations have been associated with pleuropulmonary blastoma, cystic nephroma, Sertoli-Leydig tumours and multinodular goitres, while somatic DICER1 mutations have been reported in additional tumour types. Here we report a novel syndrome termed GLOW, an acronym for its core phenotypic findings, which include Global developmental delay, Lung cysts, Overgrowth and Wilms tumour caused by mutations in the RNase IIIb domain of DICER1.
Methods and results
We performed whole exome sequencing on peripheral mononuclear blood cells of an affected proband and identified a de novo missense mutation in the RNase IIIb domain of DICER1. We confirmed an additional de novo missense mutation in the same domain of an unrelated case by Sanger sequencing. These missense mutations in the RNase IIIb domain of DICER1 are suspected to affect one of four metal binding sites located within this domain. Pyrosequencing was used to determine the relative abundance of mutant alleles in various tissue types. The relative mutation abundance is highest in Wilms tumour and unaffected kidney samples when compared with blood, confirming that the mutation is mosaic. Finally, we performed bioinformatic analysis of microRNAs expressed in murine cells carrying specific Dicer1 RNase IIIb domain metal binding site-associated mutations. We have identified a subset of 3p microRNAs that are overexpressed whose target genes are over-represented in mTOR, MAPK and TGF-β signalling pathways.
We propose that mutations affecting the metal binding sites of the DICER1 RNase IIIb domain alter the balance of 3p and 5p microRNAs leading to deregulation of these growth signalling pathways, causing a novel human overgrowth syndrome.
Geleophysic dysplasia (GD, OMIM 231050) is an autosomal recessive disorder characterized by short stature, small hands and feet, stiff joints, and thick skin. Patients often present with a progressive cardiac valvular disease which can lead to an early death. In a previous study including six GD families, we have mapped the disease gene on chromosome 9q34.2 and identified mutations in the A Disintegrin And Metalloproteinase with Thrombospondin repeats-like 2 gene (ADAMTSL2).
Following this study, we have collected the samples of 30 additional GD families, including 33 patients and identified ADAMTSL2 mutations in 14/33 patients, comprising 13 novel mutations. The absence of mutation in 19 patients prompted us to compare the two groups of GD patients, namely group 1, patients with ADAMTSL2 mutations (n=20, also including the 6 patients from our previous study), and group 2, patients without ADAMTSL2 mutations (n=19).
The main discriminating features were facial dysmorphism and tip-toe walking, which were almost constantly observed in group 1. No differences were found concerning heart involvement, skin thickness, recurrent respiratory and ear infections, bronchopulmonary insufficiency, laryngo-tracheal stenosis, deafness, and radiographic features.
It is concluded that GD is a genetically heterogeneous condition. Ongoing studies will hopefully lead to the identification of another disease gene.
Mutations of SCN8A encoding the neuronal voltage-gated sodium channel NaV1.6 are associated with early-infantile epileptic encephalopathy type 13 (EIEE13) and intellectual disability. Using clinical exome sequencing, we have detected three novel de novo SCN8A mutations in patients with intellectual disabilities, and variable clinical features including seizures in two patients. To determine the causality of these SCN8A mutations in the disease of those three patients, we aimed to study the (dys)function of the mutant sodium channels.
The functional consequences of the three SCN8A mutations were assessed using electrophysiological analyses in transfected cells. Genotype–phenotype correlations of these and other cases were related to the functional analyses.
The first mutant displayed a 10 mV hyperpolarising shift in voltage dependence of activation (gain of function), the second did not form functional channels (loss of function), while the third mutation was functionally indistinguishable from the wildtype channel.
Comparison of the clinical features of these patients with those in the literature suggests that gain-of-function mutations are associated with severe EIEE, while heterozygous loss-of-function mutations cause intellectual disability with or without seizures. These data demonstrate that functional analysis of missense mutations detected by clinical exome sequencing, both inherited and de novo, is valuable for clinical interpretation in the age of massively parallel sequencing.
Epilepsy and seizures; Movement disorders (other than Parkinsons); intellectual disability; sodium channel; encephalopathy
Fabry disease results from deficient α-galactosidase A activity and globotriaosylceramide accumulation causing renal insufficiency, strokes, hypertrophic cardiomyopathy and early demise. We assessed the 10-year outcome of recombinant α-galactosidase A therapy.
The outcomes (severe clinical events, renal function, cardiac structure) of 52/58 patients with classic Fabry disease from the phase 3 clinical trial and extension study, and the Fabry Registry were evaluated. Disease progression rates for patients with low renal involvement (LRI, n=32) or high renal involvement (HRI, n=20) at baseline were assessed.
81% of patients (42/52) did not experience any severe clinical event during the treatment interval and 94% (49/52) were alive at the end of the study period. Ten patients reported a total of 16 events. Patients classified as LRI started therapy 13 years younger than HRI (mean 25 years vs 38 years). Mean slopes for estimated glomerular filtration rate for LRI and HRI were −1.89 mL/min/1.73 m2/year and −6.82 mL/min/1.73 m2/year, respectively. Overall, the mean left ventricular posterior wall thickness and interventricular septum thickness remained unchanged and normal. Patients who initiated treatment at age ≥40 years exhibited significant increase in left ventricular posterior wall thickness and interventricular septum thickness. Mean plasma globotriaosylceramide normalised within 6 months.
This 10-year study documents the effectiveness of agalsidase beta (1 mg/kg/2 weeks) in patients with Fabry disease. Most patients remained alive and event-free. Patients who initiated treatment at a younger age and with less kidney involvement benefited the most from therapy. Patients who initiated treatment at older ages and/or had advanced renal disease experienced disease progression.
Genetics; Metabolic disorders
Leucocyte telomere length (LTL) is a complex trait associated with ageing and longevity. LTL dynamics are defined by LTL and its age-dependent attrition. Strong, but indirect evidence suggests that LTL at birth and its attrition during childhood largely explains interindividual LTL variation among adults. A number of studies have estimated the heritability of LTL, but none has assessed the heritability of age-dependent LTL attrition.
We examined the heritability of LTL dynamics based on a longitudinal evaluation (an average follow-up of 12 years) in 355 monozygotic and 297 dizygotic same-sex twins (aged 19–64 years at baseline).
Heritability of LTL at baseline was estimated at 64% (95% CI 39% to 83%) with 22% (95% CI 6% to 49%) of shared environmental effects. Heritability of age-dependent LTL attrition rate was estimated at 28% (95% CI 16% to 44%). Individually unique environmental factors, estimated at 72% (95% CI 56% to 84%) affected LTL attrition rate with no indication of shared environmental effects.
This is the first study that estimated heritability of LTL and also its age-dependent attrition. As LTL attrition is much slower in adults than in children and given that having a long or a short LTL is largely determined before adulthood, our findings suggest that heritability and early life environment are the main determinants of LTL throughout the human life course. Thus, insights into factors that influence LTL at birth and its dynamics during childhood are crucial for understanding the role of telomere genetics in human ageing and longevity.
Rett Syndrome (RTT), a neurodevelopmental disorder that primarily affects girls, is characterized by a period of apparently normal development until 6–18 months of age, when motor and communication abilities regress. More than 95% of people with RTT have mutations in Methyl-CpG-binding protein 2 (MECP2), whose protein product modulates gene transcription. Surprisingly, although the disorder is caused by mutations in a single gene, disease severity in affected individuals can be quite variable. To explore the source of this phenotypic variability, we propose that specific MECP2 mutations lead to different degrees of disease severity. Using a database of 1052 participants assessed over 4940 unique visits, the largest cohort of both typical and atypical RTT patients studied to date, we examined the relationship between MECP2 mutation status and measures of growth, motor coordination, communicative abilities, respiratory function, autonomic symptoms, scoliosis, and seizures over time. In general agreement with previous studies, we found that particular mutations, such as p.Arg133Cys, p.Arg294X, p.Arg306Cys, 3′ Truncations, and Other Point Mutations, were relatively less severe in both typical and atypical RTT. In contrast, p.Arg106Trp, p.Arg168X, p.Arg255X, p.Arg270X, Splice Sites, Large Deletions, Insertions, and Deletions were significantly more severe. We also demonstrated that, for most mutation types, clinical severity increases with age. Furthermore, of the clinical features of RTT, ambulation, hand use, and age at onset of stereotypies are strongly linked to overall disease severity. Thus, we have confirmed that MECP2 mutation type is a strong predictor of disease severity. However, clinical severity continues to become progressively worse with advancing age regardless of initial severity. These findings will allow clinicians and families to anticipate and prepare better for the needs of individuals with RTT.
genotype-phenotype; MeCP2; Rett syndrome; RTT
Opitz G/BBB syndrome is a heterogeneous disorder characterised by variable expression of midline defects including cleft lip and palate, hypertelorism, laryngotracheoesophageal anomalies, congenital heart defects, and hypospadias. The X-linked form of the condition has been associated with mutations in the MID1 gene on Xp22. The autosomal dominant form has been linked to chromosome 22q11.2, although the causative gene has yet to be elucidated.
Methods and results
In this study, we performed whole exome sequencing on DNA samples from a three-generation family with characteristics of Opitz G/BBB syndrome with negative MID1 sequencing. We identified a heterozygous missense mutation c.1189A>C (p.Thr397Pro) in SPECC1L, located at chromosome 22q11.23. Mutation screening of an additional 19 patients with features of autosomal dominant Opitz G/BBB syndrome identified a c.3247G>A (p.Gly1083Ser) mutation segregating with the phenotype in another three-generation family.
Previously, SPECC1L was shown to be required for proper facial morphogenesis with disruptions identified in two patients with oblique facial clefts. Collectively, these data demonstrate that SPECC1L mutations can cause syndromic forms of facial clefting including some cases of autosomal dominant Opitz G/BBB syndrome and support the original linkage to chromosome 22q11.2.
Whole-genome sequencing (WGS) and whole-exome sequencing (WES) technologies are increasingly used to identify disease-contributing mutations in human genomic studies. It can be a significant challenge to process such data, especially when a large family or cohort is sequenced. Our objective was to develop a big data toolset to efficiently manipulate genome-wide variants, functional annotations and coverage, together with conducting family based sequencing data analysis.
Hadoop is a framework for reliable, scalable, distributed processing of large data sets using MapReduce programming models. Based on Hadoop and HBase, we developed SeqHBase, a big data-based toolset for analysing family based sequencing data to detect de novo, inherited homozygous, or compound heterozygous mutations that may contribute to disease manifestations. SeqHBase takes as input BAM files (for coverage at every site), variant call format (VCF) files (for variant calls) and functional annotations (for variant prioritisation).
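The three inheritance patterns named above can be illustrated on a single parent-child trio. The sketch below is plain Python and is not SeqHBase code: it uses no Hadoop or HBase, ignores coverage and genotype quality, and assumes genotypes are encoded as the number of alternate alleles (0, 1 or 2). All function names and gene labels are hypothetical.

```python
from typing import Optional


def classify_variant(child: int, father: int, mother: int) -> Optional[str]:
    """Classify one variant site in a trio; genotypes are alternate-allele counts."""
    if child > 0 and father == 0 and mother == 0:
        return "de_novo"  # present in the child, absent from both parents
    if child == 2 and father >= 1 and mother >= 1:
        return "inherited_homozygous"  # one alternate allele from each parent
    return None


def compound_het_genes(variants: list[tuple[str, int, int, int]]) -> list[str]:
    """Genes in which the child is heterozygous at two or more sites, with at
    least one variant seen only in the father and one seen only in the mother.

    variants -- (gene, child, father, mother) tuples
    """
    from_father: set[str] = set()
    from_mother: set[str] = set()
    for gene, child, father, mother in variants:
        if child != 1:
            continue
        if father >= 1 and mother == 0:
            from_father.add(gene)
        elif mother >= 1 and father == 0:
            from_mother.add(gene)
    return sorted(from_father & from_mother)


# Hypothetical trio data.
sites = [
    ("GENE_A", 1, 1, 0),  # paternal heterozygous variant
    ("GENE_A", 1, 0, 1),  # maternal heterozygous variant in the same gene
    ("GENE_B", 1, 1, 0),  # a single inherited variant: not compound heterozygous
]
print(classify_variant(1, 0, 0))  # de_novo
print(compound_het_genes(sites))  # ['GENE_A']
```

In practice, a site with no reads in a parent can look like a de novo call, which is presumably why per-site coverage from the BAM files is taken as an input alongside the variant calls.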
We applied SeqHBase to a 5-member nuclear family and a 10-member 3-generation family with WGS data, as well as a 4-member nuclear family with WES data. Analysis times were almost linearly scalable with the number of data nodes. With 20 data nodes, SeqHBase took about 5 s to analyse WES familial data and approximately 1 min to analyse WGS familial data.
These results demonstrate SeqHBase's high efficiency and scalability, which is necessary as WGS and WES are rapidly becoming standard methods to study the genetics of familial disorders.
whole-genome sequencing; whole-exome sequencing; big data; de novo mutations; inherited homozygous or compound heterozygous mutations
Inactivating germline mutations in the tumour suppressor gene BRCA1 are associated with a significantly increased risk of developing breast and ovarian cancer. A large number (>1500) of unique BRCA1 variants have been identified in the population and can be classified as pathogenic, non-pathogenic or as variants of unknown significance (VUS). Many VUS are rare missense variants leading to single amino acid changes. Their impact on protein function cannot be directly inferred from sequence information, precluding assessment of their pathogenicity. Thus, functional assays are critical to assess the impact of these VUS on protein activity. BRCA1 is a multifunctional protein and different assays have been used to assess the impact of variants on different biochemical activities and biological processes.
Methods and results
To facilitate VUS analysis, we have developed a visualisation resource that compiles and displays functional data on all documented BRCA1 missense variants. BRCA1 Circos is a web-based visualisation tool based on the freely available Circos software package. The BRCA1 Circos web tool (http://research.nhgri.nih.gov/bic/circos/) aggregates data from all published functional studies of BRCA1 missense variants, harmonises their results and presents various functionalities to search and interpret individual-level functional information for each BRCA1 missense variant.
This research visualisation tool will serve as a quick, publicly available one-stop reference for all the BRCA1 missense variants that have been functionally assessed. It will facilitate meta-analysis of functional data and improve assessment of pathogenicity of VUS.
Cancer: breast; Clinical genetics; Molecular genetics; BRCA1
Mutations in the sulfate transporter gene SLC26A2 (DTDST) cause a continuum of skeletal dysplasia phenotypes that includes achondrogenesis type 1B (ACG1B), atelosteogenesis type 2 (AO2), diastrophic dysplasia (DTD), and recessive multiple epiphyseal dysplasia (rMED). In 1972, de la Chapelle et al reported two siblings with a lethal skeletal dysplasia, which was denoted “neonatal osseous dysplasia” and “de la Chapelle dysplasia” (DLCD). It was suggested that DLCD might be part of the SLC26A2 spectrum of phenotypes, both because of the Finnish origin of the original family and of radiographic similarities to ACG1B and AO2.
To test the hypothesis that SLC26A2 mutations are responsible for DLCD.
We studied the DNA from the original DLCD family and from seven Finnish DTD patients in whom we had identified only one copy of IVS1+2T>C, the common Finnish mutation. A novel SLC26A2 mutation found in all subjects was introduced by site-directed mutagenesis into a vector harbouring the SLC26A2 cDNA and expressed in sulfate transport-deficient Chinese hamster ovary (CHO) cells to measure sulfate uptake activity.
We identified a hitherto undescribed SLC26A2 mutation, T512K, homozygous in the affected subjects and heterozygous in both parents and in the unaffected sister. T512K was then identified as the second pathogenic allele in the seven Finnish DTD subjects. Expression studies confirmed pathogenicity.
DLCD is indeed allelic to the other SLC26A2 disorders. T512K is a second rare “Finnish” mutation that results in DLCD at homozygosity and in DTD when compounded with the milder, common Finnish mutation.
Leucocyte telomere length (LTL), which is fashioned by multiple genes, has been linked to a host of human diseases, including sporadic melanoma. A number of genes associated with LTL have already been identified through genome-wide association studies. The main aim of this study was to establish whether DCAF4 (DDB1 and CUL4-associated factor 4) is associated with LTL. In addition, using ingenuity pathway analysis (IPA), we examined whether LTL-associated genes in the general population might partially explain the inherently longer LTL in patients with sporadic melanoma, the risk for which is increased with ultraviolet radiation (UVR).
Genome-wide association (GWA) meta-analysis and de novo genotyping of 20 022 individuals revealed a novel association (p=6.4×10⁻¹⁰) between LTL and rs2535913, which lies within DCAF4. Notably, eQTL analysis showed that rs2535913 is associated with a decline in DCAF4 expression in both lymphoblastoid cells and sun-exposed skin (p=4.1×10⁻³ and 2×10⁻³, respectively). Moreover, IPA revealed that LTL-associated genes, derived from GWA meta-analysis (N=9190), are over-represented among genes engaged in melanoma pathways. Meeting increasingly stringent p value thresholds (p<0.05, <0.01, <0.005, <0.001) in the LTL-GWA meta-analysis, these genes were jointly over-represented for melanoma at p values ranging from 1.97×10⁻¹⁶⁹ to 3.42×10⁻²⁴.
We uncovered a new locus associated with LTL in the general population. We also provided preliminary findings that suggest a link of LTL through genetic mechanisms with UVR and melanoma in the general population.
Complex traits; Telomere; cancer: skin; melanoma
Mutations in microtubule-regulating genes are associated with disorders of neuronal migration and microcephaly. Regulation of centriole length has been shown to underlie the pathogenesis of certain ciliopathy phenotypes. Using a next-generation sequencing approach, we identified mutations in a novel centriolar disease gene in a kindred with an embryonic lethal ciliopathy phenotype and in a patient with primary microcephaly.
Methods and results
Whole exome sequencing data from a non-consanguineous Caucasian kindred exhibiting mid-gestation lethality and ciliopathic malformations revealed two novel non-synonymous variants in CENPF, a microtubule-regulating gene. All four affected fetuses showed segregation for two mutated alleles [IVS5-2A>C, predicted to abolish the consensus splice-acceptor site from exon 6; c.1744G>T, p.E582X]. In a second unrelated patient exhibiting microcephaly, we identified two CENPF mutations [c.1744G>T, p.E582X; c.8692C>T, p.R2898X] by whole exome sequencing. We found that CENP-F colocalised with Ninein at the subdistal appendages of the mother centriole in mouse inner medullary collecting duct cells. Intraflagellar transport protein-88 (IFT-88) colocalised with CENP-F along the ciliary axonemes of renal epithelial cells in age-matched control human fetuses but did not in truncated cilia of mutant CENPF kidneys. Pairwise co-immunoprecipitation assays of mitotic and serum-starved HEK293T cells confirmed that IFT88 precipitates with endogenous CENP-F.
Our data identify CENPF as a new centriolar disease gene implicated in severe human ciliopathy and microcephaly related phenotypes. CENP-F has a novel putative function in ciliogenesis and cortical neurogenesis.
Clinical genetics; Molecular genetics; CENPF; Ciliopathy; Microcephaly
Congenital diaphragmatic hernia (CDH) is a common birth defect affecting 1 in 3,000 births. It is characterized by herniation of abdominal viscera through an incompletely formed diaphragm. Although chromosomal anomalies and mutations in several genes have been implicated, the cause for most patients is unknown.
We used whole exome sequencing in two families with CDH and congenital heart disease, and identified mutations in GATA6 in both.
In the first family, we identified a de novo missense mutation (c.1366C>T, p.R456C) in a sporadic CDH patient with tetralogy of Fallot. In the second, a nonsense mutation (c.712G>T, p.G238*) was identified in two siblings with CDH and a large ventricular septal defect. The G238* mutation was inherited from their mother, who was clinically affected with congenital absence of the pericardium, patent ductus arteriosus, and intestinal malrotation. Deep sequencing of blood and saliva derived DNA from the mother suggested somatic mosaicism as an explanation for her milder phenotype, with only approximately 15% mutant alleles. To determine the frequency of GATA6 mutations in CDH, we sequenced the gene in 378 patients with CDH. We identified one additional de novo mutation (c.1071delG, p.V358Cfs34*).
Mutations in GATA6 have previously been associated with pancreatic agenesis and congenital heart disease. We conclude that, in addition to the heart and the pancreas, GATA6 is involved in the development of two additional organs, the diaphragm and the pericardium. In addition, we have shown that de novo mutations can contribute to the development of CDH, a common birth defect.
Congenital diaphragmatic hernia; de novo mutation; GATA6; somatic mutation; whole exome sequencing
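The mosaicism estimate in the GATA6 abstract above can be related to the share of cells carrying the mutation with a short worked calculation. The sketch below rests on assumptions not stated in the abstract: the locus is diploid and autosomal, mutant cells are heterozygous, there is no copy number change, and sequencing samples both alleles without bias. Under those assumptions each mutant cell contributes one mutant allele out of two, so the mutant cell fraction is about twice the mutant allele fraction. The function name is hypothetical.

```python
def mutant_cell_fraction(variant_allele_fraction: float) -> float:
    """Estimate the fraction of cells carrying a heterozygous mutation.

    Assumes a diploid autosomal locus, heterozygous mutant cells,
    no copy number change and unbiased allele sampling.
    """
    if not 0.0 <= variant_allele_fraction <= 0.5:
        raise ValueError("allele fraction must lie between 0 and 0.5")
    return 2.0 * variant_allele_fraction


# Approximately 15% mutant alleles, as reported for the mother.
estimate = mutant_cell_fraction(0.15)
print(f"Estimated mutant cell fraction: {estimate:.0%}")
```

A germline heterozygous carrier would show an allele fraction near 50%, so a value near 15% is consistent with only a subset of cells carrying the mutation.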
When compared to the other mismatch repair genes involved in Lynch syndrome, the identification of mutations within PMS2 has been limited (<2% of all identified mutations), yet the immunohistochemical analysis of tumour samples indicates that approximately 5% of Lynch syndrome cases are caused by PMS2. This disparity is primarily due to complications in the study of this gene caused by interference from pseudogene sequences.
Using a recently developed method for detecting PMS2 specific mutations, we have screened 99 patients who are likely candidates for PMS2 mutations based on immunohistochemical analysis.
We have identified a frequently occurring frame-shift mutation (c.736_741del6ins11) in 12 ostensibly unrelated Lynch syndrome patients (20% of patients we have identified with a deleterious mutation in PMS2, n = 61). These individuals all display the rare allele (population frequency <0.05) at a single nucleotide polymorphism (SNP) in exon 11, and have been shown to possess a short common haplotype, allowing us to calculate that the mutation arose around 1625 years ago (65 generations; 95% confidence interval 22 to 120).
Ancestral analysis indicates that this mutation is enriched in individuals with British and Swedish ancestry. We estimate that there are >10 000 carriers of this mutation in the USA alone. The identification of both the mutation and the common haplotype in one Swedish control sample (n = 225), along with evidence that Lynch syndrome associated cancers are rarer than expected in the probands’ families, would suggest that this is a prevalent mutation with reduced penetrance.
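The arithmetic behind the age estimate and carrier proportion in the PMS2 abstract above can be checked with a brief sketch. The generation time of 25 years is an assumption inferred from the reported figures (1625 years over 65 generations), not a value stated in the abstract. Applying the same generation time to the confidence interval is likewise illustrative, and the function name is hypothetical.

```python
# Assumption: 25 years per generation, inferred from 1625 / 65.
YEARS_PER_GENERATION = 25


def generations_to_years(generations: int,
                         years_per_generation: int = YEARS_PER_GENERATION) -> int:
    """Convert a mutation age in generations to calendar years."""
    return generations * years_per_generation


point_estimate = generations_to_years(65)   # reported point estimate
lower_bound = generations_to_years(22)      # lower 95% CI bound
upper_bound = generations_to_years(120)     # upper 95% CI bound

# Share of PMS2 mutation carriers with the recurrent frame-shift (12 of 61).
carrier_share_percent = 100 * 12 / 61

print(point_estimate, lower_bound, upper_bound)
print(f"{carrier_share_percent:.1f}%")
```

The wide interval in years reflects how little recombination information a short shared haplotype carries, which is why the abstract reports the range in generations alongside the point estimate.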