To investigate the utility of whole-exome sequencing (WES) to define a molecular diagnosis in patients clinically diagnosed with congenital anomalies of kidney and urinary tract (CAKUT).
WES was performed in 62 families with CAKUT. WES data were analyzed for Single Nucleotide Variants (SNVs) in 35 known CAKUT genes, putatively deleterious sequence changes in new candidate genes, and potentially disease-associated copy-number variants (CNVs).
In approximately 5% of families, pathogenic SNVs were identified in PAX2, HNF1B, and EYA1. Observed phenotypes in these families expand the current understanding of the role of these genes in CAKUT. Four pathogenic CNVs were also identified using two CNV detection tools. In addition, we found one deleterious de novo SNV in FOXP1 among the 62 families with CAKUT. The database of the clinical BMGL laboratory was queried, and seven additional unrelated individuals with novel de novo SNVs in FOXP1 were identified. Six of these eight individuals with FOXP1 SNVs have syndromic urinary tract defects, implicating this gene in urinary tract development.
We conclude that WES can be used to identify the molecular etiology (SNVs, CNVs) in a subset of individuals with CAKUT. WES can also help identify novel CAKUT genes.
The diagnosis of congenital anomalies of the kidney and urinary tract (CAKUT) is based on the recognition of a broad spectrum of renal and urinary tract malformations which, in aggregate, constitute the most common cause of end-stage renal disease (ESRD) in children. 1,2 CAKUT may result in chronic kidney disease (CKD), which leads to severe impairment of physical and psychosocial development. 3 CAKUT also poses a substantial economic burden to families and health care systems. CAKUT is a clinically heterogeneous phenotype that encompasses renal agenesis, renal hypo/dysplasia (RHD), multicystic dysplastic kidney (MCDK), cross-fused ectopia, duplex renal collecting system, ureteropelvic junction obstruction (UPJO), mega-ureter, posterior urethral valves (PUV), and vesicoureteral reflux (VUR).
Multiple lines of evidence suggest that genetic factors contribute to CAKUT. This evidence includes familial segregation of CAKUT cases and the identification of causative genes. 4 Discovery of an underlying genetic etiology facilitates molecular diagnosis and can aid physicians and family members by clarifying associated risks and allowing improved genetic counseling.
Whereas in the past genetic diagnosis was limited to the analysis of individual candidate genes, whole-exome sequencing (WES) provides an opportunity to arrive at an accurate molecular diagnosis with a single test. 5,6 WES is able to identify Single Nucleotide Variants (SNVs); recently, however, it has also been used to uncover even small copy-number variants (CNVs) encompassing a single gene or even one exon. 7 Extraction of CNV information from WES data is challenging, partly because of potential artifacts introduced during the exon targeting and amplification steps of WES. 8 Moreover, WES can enhance the discovery of novel, potentially contributory genes.
Initial reports of clinical WES at a Clinical Laboratory Improvement Amendments (CLIA) certified laboratory indicated a molecular diagnostic rate of 25% for patients referred for genetic evaluation. 5,6 However, neurological phenotypes constituted 80% of the patient population in that study. The clinical utility of WES in common, sporadic birth defects is under active investigation. Current evaluations of CAKUT patients involve diagnostic imaging; however, involvement of other organs may go undiagnosed during such evaluations. Here we investigated the utility of WES to define a molecular diagnosis (SNVs and CNVs) in patients clinically diagnosed with CAKUT.
Patients and their families were recruited from the pediatric urology and renal diseases clinics at the Texas Children's Hospital, Houston, Texas, USA. Inclusion criteria included individuals with non-syndromic forms of CAKUT (as defined above) and individuals with syndromic forms of CAKUT for which a genetic etiology had not been identified. Exclusion criteria included individuals with syndromic forms of CAKUT in which an underlying genetic etiology was known and individuals with non-syndromic and non-familial forms of vesicoureteral reflux (VUR). Therefore, individuals with syndromic features without a known diagnosis were included in the study. The study protocol was approved by the Institutional Review Board for the Protection of Human Subjects at Baylor College of Medicine. Standard procedures were used to recruit subjects for this study. Demographics of the families and phenotypic details of subjects with CAKUT are summarized in Table 1 and Table S1. Blood samples or saliva-based specimens were collected by standard procedures, according to the families' wishes. DNA extraction was performed with a QIAamp kit (QIAGEN) per the manufacturer's instructions. DNA was quantified with a NanoDrop spectrophotometer, and 1 µg of DNA was used for WES. In familial multiplex cases, WES was performed on the available affected family members who were most distantly related in their respective pedigrees (see Figure 1). Among apparently isolated, singleton cases, we performed WES on the proband and the two apparently unaffected parents (case-parent trios in 20 families) when both parents were available. In all other cases, WES was performed only on the proband.
WES analysis started with conversion of raw sequencing data (bcl files) to the fastq format by Casava. The short reads were then mapped to a human genome reference sequence (GRCh37) with the Burrows-Wheeler Aligner (BWA). Subsequently, recalibration was done with GATK, 9 and variant calling was performed with the Atlas2 suite. 10 These steps are part of the Mercury pipeline, which is available in the cloud via DNAnexus (http://blog.dnanexus.com/2013-10-22-run-mercury-variant-calling-pipeline/).
After detection of all bi-allelic (homozygous or compound heterozygous) and de novo variants from WES data, we established an SNV prioritization workflow. This included sequential analysis of bi-allelic predicted loss-of-function variants (stopgain, frameshift indels, and splicing); bi-allelic missense variants; de novo truncating variants; and de novo missense variants. Finally, we examined the rare variants shared among affected family members and parents to detect potential mosaic variants in parents. This SNV prioritization workflow was followed by filtering of variants based on their minor allele frequencies (MAF <= 0.1%) in internal and external databases, including the Baylor-Hopkins Center for Mendelian Genomics (BHCMG), Exome Aggregation Consortium (ExAC), Exome Variant Server (ESP), 1000 Genomes Project, and Atherosclerosis Risk in Communities Study (ARIC) databases. To retrieve potentially deleterious and conserved missense changes, we used several bioinformatics tools, including the PhyloP conservation score and the MutationTaster, SIFT, and PolyPhen-2 prediction scores. Next, these potential rare causative variants were analyzed in terms of: 1) gene function and the associated phenotype in OMIM and PubMed; 2) gene-associated animal models; 3) tissue expression of the encoded protein; 4) association with already known gene/genes linked to the patient's phenotype in terms of i) gene networks; ii) gene families; iii) coexpression; iv) physical protein-protein interaction; v) predicted protein-protein interaction; vi) molecular pathways; and 5) location of the variant with respect to functional protein domains. The most promising candidate variants were then confirmed, and their segregation tested, by Sanger sequencing. Finally, the confirmed variants in candidate genes were interrogated in the BHCMG and Baylor Miraca Genetics Laboratories (BMGL) databases and/or through GeneMatcher to identify additional affected cases with similar phenotypes.
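The tiered ordering and frequency filter described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Variant` record, its field names, and the `tier` and `prioritize` helpers are hypothetical, and treating "truncating" de novo variants as the same effect set as the bi-allelic loss-of-function variants is an assumption made only for illustration.

```python
from dataclasses import dataclass

MAF_CUTOFF = 0.001  # 0.1%, the threshold stated in the text

# Effect labels taken from the text; grouping them this way is an assumption.
LOF_EFFECTS = {"stopgain", "frameshift", "splicing"}


@dataclass
class Variant:  # hypothetical record, not a format used by the authors
    gene: str
    effect: str        # e.g. "stopgain", "frameshift", "splicing", "missense"
    inheritance: str   # "biallelic" or "de_novo"
    max_maf: float     # highest allele frequency across the databases consulted


def tier(v):
    """Return the review tier (1 is examined first), or None if not prioritized."""
    is_lof = v.effect in LOF_EFFECTS
    if v.inheritance == "biallelic":
        if is_lof:
            return 1
        if v.effect == "missense":
            return 2
    elif v.inheritance == "de_novo":
        if is_lof:
            return 3
        if v.effect == "missense":
            return 4
    return None


def prioritize(variants):
    """Keep rare variants that fall into a tier, ordered by tier."""
    kept = [v for v in variants if v.max_maf <= MAF_CUTOFF and tier(v) is not None]
    return sorted(kept, key=tier)
```

In this sketch a common variant (frequency above 0.1%) is dropped regardless of its predicted effect, and the remaining variants are reviewed in the order the text lists them.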
To identify CNVs, our WES data were analyzed with CoNIFER 11 software and the CoNVex algorithm. 12 In CoNVex, read depth information was first extracted from the WES data. The generalized additive model (GAM) correction method was then applied to remove systematic bias from the read depth information. After this step, the Smith-Waterman algorithm was used to infer the CNV state and score the detected CNV regions. Each potential CNV region was assigned a confidence score. We further filtered CoNVex-detected CNV calls by selecting those with an associated confidence score >= 5 and a number of probes >= 5. These CoNVex calls were then overlapped with CNV calls detected by CoNIFER using the GRanges function in the R Bioconductor GenomicRanges package. CNVs overlapping those reported in previous studies were submitted for validation by array comparative genomic hybridization (aCGH). 13,14 Non-overlapping CNVs were also investigated. Among them, rare CNVs (absent or present at low frequency in the Database of Genomic Variants [http://dgv.tcag.ca]) involving genes potentially contributing to kidney abnormalities were selected for aCGH validation. The flowchart of CNV discovery from WES data is provided (Figure S1a).
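The filtering and overlap steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the R/GenomicRanges code used in the study: the `CNV` record and the helper names are hypothetical, coordinates are treated as inclusive, and any shared base is counted as an overlap because the text does not state the overlap criterion.

```python
from typing import NamedTuple


class CNV(NamedTuple):  # hypothetical record for one CNV call
    chrom: str
    start: int          # inclusive
    end: int            # inclusive
    score: float = 0.0  # CoNVex confidence score (unused for CoNIFER calls)
    probes: int = 0     # number of probes supporting the call


def passes_convex_filter(call, min_score=5, min_probes=5):
    """Thresholds from the text: confidence score >= 5 and probes >= 5."""
    return call.score >= min_score and call.probes >= min_probes


def overlaps(a, b):
    """True if two calls on the same chromosome share at least one base."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def concordant_calls(convex_calls, conifer_calls):
    """Filtered CoNVex calls that overlap at least one CoNIFER call."""
    kept = [c for c in convex_calls if passes_convex_filter(c)]
    return [c for c in kept if any(overlaps(c, d) for d in conifer_calls)]
```

Calls supported by both tools are the natural first candidates for validation; calls found by only one tool would still need the separate review the text describes.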
SNV interpretation was performed according to the most recent guidelines published by the American College of Medical Genetics and Genomics (ACMG). 15 Accordingly, only variants that met strict criteria were called pathogenic. CNV interpretation was based on size, gene content, overlap with known disease-associated regions, and phenotype overlap, according to ACMG guidelines for postnatal CNV calling. 16
In this study, Codified Genomics software (https://www.scienceexchange.com/labs/codified-genomics) was used to search WES data for pathogenic SNVs, variants of uncertain clinical significance (VUS), and benign SNVs in the following 19 dominantly inherited genes reported to be associated with CAKUT: BMP4, BMP7, CDC5L, CHD1L, DSTYK, EYA1, GATA3, HNF1B, KAL1, PAX2, RET, ROBO2, SALL1, SIX1, SIX2, SIX5, SOX17, TNXB, UPK3A. The following recessive CAKUT genes were interrogated for 2 SNVs: AGT, ACE, REN, AGTR1, FRAS1, FREM2, GRIP1, HPSE2, LRP4, and ROR2. In addition, we searched for pathogenic SNVs in the following 6 dominantly inherited genes: GLI3, JAG1, NOTCH2, TFAP2A, TBX18, and WNT4.
Codified Genomics software was used to annotate, filter, and prioritize variants. Variants were filtered as previously described. 5 Annotations were generated by ANNOVAR 18 and VEP 19 against the UCSC, RefSeq, and Ensembl gene models. Variants and genes were further annotated using the dbNSFP, 20 Illumina Body Map, UniProt, HPO, and OMIM databases, among others. Variants were prioritized based on patient phenotype similarity to known disease genes, mutation type, and, for nonsynonymous variants, predicted deleteriousness.
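As an illustration only, the 35-gene panel above can be written as a small lookup table. The sketch below is not part of the Codified Genomics software: the names `REQUIRED_ALLELES` and `genes_to_review` are hypothetical, and reading "interrogated for 2 SNVs" as a requirement for two candidate alleles in a recessive gene is an assumption.

```python
# Gene lists copied from the text.
DOMINANT_19 = ("BMP4 BMP7 CDC5L CHD1L DSTYK EYA1 GATA3 HNF1B KAL1 PAX2 "
               "RET ROBO2 SALL1 SIX1 SIX2 SIX5 SOX17 TNXB UPK3A").split()
RECESSIVE_10 = "AGT ACE REN AGTR1 FRAS1 FREM2 GRIP1 HPSE2 LRP4 ROR2".split()
DOMINANT_6 = "GLI3 JAG1 NOTCH2 TFAP2A TBX18 WNT4".split()

# Assumption: one candidate allele suffices for a dominant gene, two for a recessive gene.
REQUIRED_ALLELES = {gene: 1 for gene in DOMINANT_19 + DOMINANT_6}
REQUIRED_ALLELES.update({gene: 2 for gene in RECESSIVE_10})


def genes_to_review(allele_counts):
    """Panel genes whose rare candidate allele count meets the inheritance model.

    allele_counts maps a gene symbol to the number of rare candidate alleles
    observed in one individual (hypothetical input format).
    """
    return sorted(
        gene for gene, n in allele_counts.items()
        if gene in REQUIRED_ALLELES and n >= REQUIRED_ALLELES[gene]
    )
```

The three lists together contain 19 + 10 + 6 = 35 genes, matching the number of known CAKUT genes given in the abstract.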
We performed WES on 112 individuals from 62 families with CAKUT (Table 1 and Table S1). Probands were mostly children and young adults ranging in age from 2 months to 24 years. Thirty-one percent of the probands had more than one organ (other than the kidney and urinary tract) involved, which suggests that these patients potentially harbor a syndromic form of CAKUT. In approximately 16% of the families, WES was performed in a familial mode since more than one individual was affected with CAKUT. The most common phenotypic indications were ‘renal dysplasia’ and ‘agenesis/hypoplasia’.
WES results were interrogated for SNVs in known genes that cause CAKUT as described in methods. Pathogenic SNVs were identified in three known genes (EYA1, HNF1B, and PAX2) in 3 families (approximately 5%) (Table 2). Two of these variants were frameshift variants and one was a splice site variant, each suggesting a loss-of-function mechanism. The frameshift variant in HNF1B is a novel pathogenic allele. Pedigrees of these families are illustrated as Figure 1. Among the pathogenic SNVs identified in these three families, two were de novo from trio analyses, and one was inherited from an affected parent. All selected SNVs identified in the probands and their parents (when available) were confirmed by Sanger sequencing.
Among the three families with pathogenic SNVs, clinical assessment had not identified anomalies of any other organs prior to WES. Importantly, WES elicited further clinical assessment and the delineation of the additional organ system involvement in Families 1 and 2 retrospectively. In Family 3, defects in other organs have not been observed clinically.
The initial diagnosis of Family 1 with p.G24fs SNV in PAX2 was a familial form of renal dysplasia and membranous nephropathy. After the familial variant was identified and in recognition of the current understanding of the phenotype of patients with PAX2 variants (Renal-Coloboma syndrome, [MIM: 120330]), the first degree relatives were referred to an ophthalmologist with expertise in the diagnosis and management of genetic disorders. This clinical evaluation revealed optic nerve colobomata and other congenital optic nerve abnormalities in those first degree relatives who were proven to be variant carriers.
After the initial clinical diagnosis of cystic renal dysplasia (CRD) in Family 2, and in light of the WES results (p.Q378fs in HNF1B), the patient's clinical presentation was further reviewed. The patient had a recent diagnosis of gout and elevated liver function tests (LFTs). The patient also had increased echogenicity of the pancreas (one of the known signs of HNF1B variants) noted previously by abdominal ultrasound, the significance of which became apparent after the genetic analyses. A known recurrent de novo intronic variant (c.867+5G>A) was identified in EYA1 in the proband of Family 3, who has VUR and multicystic dysplastic kidney (MCDK). Both parents were negative for this variant based on both trio WES and Sanger sequencing.
In this study, we also defined variants of uncertain clinical significance (VUSs) in 19 dominantly inherited known CAKUT genes (Table S2). In this cohort, SNVs were not identified in SIX1, SOX17, GATA3, or UPK3A. Benign SNVs and VUSs were identified in BMP7, CDC5L, CHD1L, SALL1, SIX5, SIX2, ROBO2, BMP4, KAL1, TNXB, RET, PAX2, EYA1, and DSTYK (Table S2). Allele frequencies and prediction data for all SNVs identified in known CAKUT genes in this cohort are summarized in Table S3, which also provides the probability of loss-of-function intolerance (pLI) score. The closer the pLI score is to 1, the more intolerant to loss-of-function (LoF) variants the gene appears to be (http://exac.broadinstitute.org). We attempted to confirm all VUSs with Sanger sequencing. Details of confirmation are provided in Table S2.
Trio analyses, consisting of WES in the proband and both biological parents to evaluate for new mutations, were performed in 20 families. We confirmed relationship (paternity and maternity) in the trios by review of the de novo SNVs in each family. No proband had more than the expected number of de novo SNVs (>2) in the coding exonic region of the genome, consistent with the expected rate of 1.20 × 10⁻⁸ per nucleotide per generation. 21 We identified a de novo SNV (p.P225T) in FOXP1 [MIM: 605515] in a proband with hydrocephaly and unilateral renal agenesis (Family 38). This patient was enrolled initially into this study at age 4 months. Later, the patient manifested delay in gross motor and speech development. In addition, he was diagnosed with strabismus and left optic atrophy. The pedigree of this family is shown in Figure 1 (Family 38).
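To make the mutation-rate comparison above concrete, the following Python sketch converts the per-nucleotide rate into an expected number of coding de novo SNVs per proband. The size of the coding target (about 30 Mb per haploid genome) and the Poisson model are assumptions introduced here for illustration; neither is stated in the study.

```python
import math

MUTATION_RATE = 1.20e-8      # per nucleotide per generation (from the text)
CODING_BASES = 30_000_000    # ASSUMPTION: ~30 Mb of coding sequence per haploid genome
THRESHOLD = 2                # the text flags probands with more than 2 de novo SNVs

# Two haploid genomes are inherited, one from each parent.
expected_de_novo = MUTATION_RATE * 2 * CODING_BASES


def poisson_tail(lam, k):
    """P(X > k) for a Poisson variable with mean lam (modeling assumption)."""
    cumulative = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k + 1))
    return 1.0 - cumulative


prob_more_than_threshold = poisson_tail(expected_de_novo, THRESHOLD)
print(f"expected coding de novo SNVs per proband: {expected_de_novo:.2f}")
print(f"P(more than {THRESHOLD}): {prob_more_than_threshold:.3f}")
```

Under these assumptions the expected count is below one per proband, so a trio with many more apparent de novo calls would more plausibly point to a sample mix-up or misattributed parentage than to true new mutations, which is why the count doubles as a relationship check.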
We next attempted to identify other subjects/families with variants in FOXP1. The database of the Whole Genome Laboratory (WGL) at BMGL was queried for other de novo SNVs in FOXP1. We identified seven more de novo SNVs in this gene among approximately 5000 patients (Table 3). Relationships (paternity and maternity) were confirmed by inheritance of rare SNVs from each parent in cases 3-8. In case 2, paternity was confirmed by inheritance of rare SNVs from the father. Maternity, however, could not be genetically confirmed per consent and was verified by pregnancy history.
All eight individuals had neurodevelopmental phenotypes consistent with loss-of-function variants in FOXP1 [MIM: 613670]. However, four of the eight individuals also had upper urinary tract defects, and five had defects in the lower genitourinary (GU) tract, including undescended testis, hypospadias, and neurogenic bladder, inter alia. In addition, these patients have brain and heart involvement, which is consistent with the role of FOXP1 in the development of these two organs. 22–24 CNS malformations, including hydrocephaly, and cardiac defects are among the phenotypes of the patients in this study. The genotypes and phenotypes of these individuals (6/8 with upper or lower urinary tract defects) are summarized in Table 3. The pLI score of FOXP1 is 1, which suggests that this gene is intolerant to loss-of-function variants.
CNVs were inferred from WES data as described in Methods. Pathogenic CNVs and CNVs of uncertain clinical significance are summarized in Table 4. A de novo 22q11.1q11.21 triplication was identified in Family 34, in which the proband had syndromic VUR. This triplication is proximal to the DiGeorge region, consistent with the gain of genetic material seen with type I supernumerary inv dup(22)(q11), associated with cat eye syndrome 25 [MIM: 115470] (Family 34). This patient's phenotype overlaps with Goldenhar or Oculo-Auriculo-Vertebral spectrum (OAV, [MIM: 164210]) and VATER Association [MIM: 192350]. The patient was an 11-year-old Latin American male with short stature, imperforate anus, thumb anomaly, severe gastroesophageal reflux (GERD), VUR, neurogenic bladder, right renal hypoplasia with evidence of scarring/renal damage, bilateral ear tags, ocular Duane anomaly, and left microphthalmia. The patient had normal cognition, although with some difficulty in mathematics.
Three other pathogenic CNVs were found in regions associated with known syndromes, namely 16p11.2 deletion, 16p11.2 duplication, and 16p13.11 duplication. CNVs in all individuals in Table 4 were validated by aCGH. Parental studies were also performed by aCGH. In Family 34, in which samples were available from both parents, the CNV was found to be de novo. The flowchart of copy-number data inference (a), CoNIFER (b) and aCGH data (c) for de novo 22q11 triplication appears as Figure S1.
During the past four years, WES has become a powerful clinical test to define both recognized and previously undefined genes and potential variant susceptibilities to establish molecular diagnoses for birth defects. Clinical WES identifies pathogenic SNVs in approximately 25% of pediatric cases (mostly syndromic) that represent diagnostic dilemmas refractory to clinical diagnosis despite previous extensive medical evaluations 5,6. Nevertheless, the utility of WES for molecular diagnosis in isolated birth defects including CAKUT remains uncertain. This study shows that WES can be used in the diagnostic setting to define the molecular defects that underlie CAKUT and to reveal additional insights into the clinical presentation of the disorder. In addition, WES can be used for the identification of new candidate genes.
In Family 1 (Figure 1 and Table 2), the diagnosis of Renal Coloboma Syndrome (RCS, [MIM: 120330]) was possible only after the WES result became available. Prior to WES, the proband was on immunosuppressants for proteinuria; after the molecular diagnosis, however, management is being altered to a tapering dose of immunosuppressive therapy. This avoids unnecessary immunosuppression, since the etiology of kidney disease is not immunologic in this family. The molecular diagnosis of this family obtained by WES thus affected clinical decision making for the patient and the prognosis and management for family members. These data also expand the phenotype related to PAX2 pathogenic variants, as the proband has membranous nephropathy and other family members have proteinuria and renal dysplasia. To date, focal segmental glomerulosclerosis (FSGS) has been reported with PAX2 variants; 26 membranous nephropathy is a novel finding. In Family 2, a novel frameshift pathogenic SNV (p.Q378fs) was identified in HNF1B. This molecular diagnosis reached by WES further substantiated the clinical phenotype described in the Results, minimizing the need for additional diagnostic evaluation. The variant in this family is novel, which adds to our current knowledge of diseases related to the HNF1B gene.
The phenotype of Family 3 with the de novo EYA1 variant suggests that underlying genetic predisposition can lead to or at least exacerbate renal pathology in patients with VUR. Variants in EYA1 can cause Branchio-Oto-Renal syndrome (BOR, [MIM: 113650]), an autosomal dominant disorder characterized by sensorineural, conductive, or mixed hearing loss, structural defects of the outer, middle, and inner ear, branchial fistulas or cysts, and renal abnormalities ranging from mild hypoplasia to agenesis. The c.867+5G>A SNV does not affect the invariant splice site; nevertheless, RNA analysis of samples from patients with BOR showed that this SNV affects EYA1 splicing, producing an aberrant mRNA transcript that lacks exon 8 and results in premature termination in exon 9. 27 The proband in this family will be evaluated for hearing impairment, since the SNV in this individual causes BOR. This family provides another example of how WES can improve the clinical diagnosis of syndromic forms of CAKUT beyond clinical evaluation alone.
Families 1, 2, and 3 exemplify the effect of WES on the clinical management of the patients and families, since identification of the SNVs in PAX2, HNF1B, and EYA1 respectively warranted further investigations in other organ systems. We identified only a fraction of families (3/62 = 4.8%) with pathogenic SNVs, similar to a recent large study that evaluated 749 individuals from 650 families with CAKUT for variants in 17 known CAKUT genes (6.3%). 28
CAKUT is a clinically heterogeneous spectrum, and thus many more genes and causal variants are likely to be identified. Next generation sequencing (NGS), and specifically WES, have improved discovery of novel causative genes. 7,29–34 We have identified a novel gene (FOXP1) that likely contributes to the CAKUT phenotype. We found eight different novel de novo pathogenic SNVs in FOXP1, from both clinical and research WES, in unrelated individuals. As summarized in Table 3, the phenotypes observed in these individuals suggest a clinical pattern that may be recognizable. Structural brain anomalies (including hydrocephaly), intellectual disability, developmental delay, cardiac defects, hypotonia, behavior problems, and renal/GU defects are some of the more common features of this syndrome. Six of the eight individuals in this study (Table 3) have a known renal/GU phenotype in addition to involvement of other organs. Although de novo disruptions in FOXP1 were recently discovered to cause intellectual disability [OMIM# 613670], 23,24 here we define a new syndrome characterized by hydrocephalus/brain malformation, cognitive impairment, cardiac defects, and CAKUT, attributable to a single gene with pleiotropic effect. We recommend that patients with pathogenic or likely pathogenic variants in FOXP1 undergo renal ultrasound. Upper tract defects may remain undiagnosed if ultrasound is not performed.
Although FOXP1 has been shown to have important roles in the developmental process of key organs including lung, heart and brain, 22,24,35 there are no data regarding the role of this master transcription factor in kidney and urinary tract development. In this study we showed the role of FOXP1 in CAKUT and lower urinary tract defects. All of the FOXP1 SNVs identified in this study were de novo and novel variants. These variants included frameshift, as documented for other birth defects such as the megacystis microcolon intestinal hypoperistalsis syndrome due to de novo SNVs in ACTG2. 36
Based on previous investigations, CNVs account for about 16% of CAKUT cases. 13 We hypothesized that CNVs underlie a substantive fraction of birth defects in our families as well; therefore, we inferred CNV data with two detection tools. There are some known limitations to CNV discovery from WES data. 37 One primary limitation is a high false positive rate, particularly for small CNVs. We used a stringent approach to identify potentially pathogenic CNVs for validation by aCGH. Approximately 6.5% (4/62) of our cohort have pathogenic CNVs related to the patients’ phenotype.
The four pathogenic CNVs identified (Table 4) are in disease-associated regions and have been evaluated based on ACMG guidelines. Although the fraction of families with pathogenic CNVs (6.5%) is lower than in studies designed specifically to identify CNVs, we included only known pathogenic CNVs and not copy-number variants of uncertain significance.
Among the most intriguing CNVs identified in this study is the triplication of 22q11. Although the patient with the proximal 22q11 triplication did not have chromosome analysis to determine whether a marker chromosome was present, the gain of euchromatic genetic material is the same as that seen in cat eye syndrome. Urogenital malformations are present in ~70% of reported individuals with this syndrome and include male and female genital malformations, renal agenesis, hydronephrosis, VUR, dysplastic or cystic kidneys, and bladder defects. 38 Individuals with partial gains of the cat eye critical region and renal anomalies have been described, providing evidence that the distal portion of the region, including CECR2, SLC25A18, and ATP6V1E1, may be responsible for these features. 39 Our patient carried a clinical diagnosis of OAV; however, after the triplication was uncovered, most of his features were recognized as consistent with cat eye syndrome.
Our findings support the concept that WES could be an adjuvant diagnostic tool even in cases of non-syndromic CAKUT, since the involvement of other organs may be subtle or not manifest at the time of primary evaluation. WES may identify novel candidate genes, as exemplified here, and uncover underlying CNVs that contribute to the CAKUT spectrum.
This study reports the use of WES for molecular diagnosis of the genetic contribution to CAKUT. Nearly 5% of individuals with CAKUT have pathogenic SNVs in known key genes that can be uncovered by WES. In addition, 6.5% of these patients have pathogenic CNVs that were extracted from WES data. In some families, organ involvement beyond CAKUT was sought retrospectively, after the review of WES results. We identified previously unrecognized genes and genetic variants (both SNVs and CNVs) in this cohort and expanded the phenotype of several known genes. Pathogenic SNVs in FOXP1 in individuals with a GU/renal phenotype strongly suggest an important role for this gene in urinary tract development.
Supported in part by K12 DK083014 Multidisciplinary K12 Urologic Research Career Development Program scholar and R01 DK078121 from the National Institute of Diabetes and Digestive and Kidney Diseases awarded to DJL. This work was funded in part by grant R01 NS058529 (JRL) from the National Institute of Neurological Disorders and Stroke and U54 HG006542 from the National Human Genome Research Institute/National Heart Lung and Blood Institute to the Baylor Hopkins Center for Mendelian Genomics. Supported in part also by a National Human Genome Research Institute (NHGRI) grant (U54HG003273) awarded to RAG. RAL is a Senior Scientific Investigator of Research to Prevent Blindness, whose unrestricted funding supported part of this study.
DNA extraction was performed in the Laboratory of Translational Genomics (LTG) by Patricia Hernandez and Gladys Zapata. The analysis of the data was performed with software generously made available by Codified Genomics. We also express our sincere gratitude to our patients and their families for their willing participation in this study.
Supplementary Methods: Whole-Exome Sequencing, Validation of selected SNVs, Validation of selected CNVs by aCGH, References for known CAKUT genes, Accession numbers
Table S1. Description of the cohort (probands of 62 families) in detail with basic demographic, phenotypic information, type of sample (saliva or blood), the summary of the genetic findings, and the pattern of segregation.
Figure S1. Copy-number variants (CNVs) inference from WES data
Table S2. Benign SNVs and variants of uncertain significance (VUS) in 19 CAKUT known genes identified in 62 families with CAKUT using Whole-Exome Sequencing