Motivation: Ontologies provide a structured representation of the concepts of a domain of knowledge as well as the relations between them. Attribute ontologies are used to describe the characteristics of the items of a domain, such as the functions of proteins or the signs and symptoms of disease, which opens the possibility of searching a database of items for the best match to a list of observed or desired attributes. However, naive search methods do not perform well on realistic data because of noise in the data, imprecision in typical queries and the fact that individual items may not display all attributes of the category they belong to.
Results: We present a method for combining ontological analysis with Bayesian networks to deal with noise, imprecision and attribute frequencies and demonstrate an application of our method as a differential diagnostic support system for human genetics.
Availability: We provide an implementation for the algorithm and the benchmark at http://compbio.charite.de/boqa/.
Contact: Sebastian.Bauer@charite.de or Peter.Robinson@charite.de
Supplementary Material for this article is available at Bioinformatics online.
A primary immunodeficiency syndrome caused by loss-of-function mutations in the IL-21 receptor exhibits impaired B, T, and NK cell function.
Primary immunodeficiencies (PIDs) represent exquisite models for studying mechanisms of human host defense. In this study, we report on two unrelated kindreds, with two patients each, who had cryptosporidial infections associated with chronic cholangitis and liver disease. Using exome and candidate gene sequencing, we identified two distinct homozygous loss-of-function mutations in the interleukin-21 receptor gene (IL21R; c.G602T, p.Arg201Leu and c.240_245delCTGCCA, p.C81_H82del). The IL-21RArg201Leu mutation causes aberrant trafficking of the IL-21R to the plasma membrane, abrogates IL-21 ligand binding, and leads to defective phosphorylation of signal transducer and activator of transcription 1 (STAT1), STAT3, and STAT5. We observed impaired IL-21–induced proliferation and immunoglobulin class-switching in B cells, cytokine production in T cells, and NK cell cytotoxicity. Our study indicates that human IL-21R deficiency causes an immunodeficiency and highlights the need for early diagnosis and allogeneic hematopoietic stem cell transplantation in affected children.
Mouse phenotype data represent a valuable resource for the identification of disease-associated genes, especially where the molecular basis is unknown and there is no clue to the candidate gene’s function, pathway involvement or expression pattern. However, until recently these data have not been systematically used due to difficulties in mapping between clinical features observed in humans and mouse phenotype annotations. Here, we describe a semantic approach to solve this problem and demonstrate highly significant recall of known disease-gene associations and orthology relationships. A web application (MouseFinder; www.mousemodels.org) has been developed to allow users to search the results of our whole-phenome comparison of human and mouse. We demonstrate its use in identifying ARTN as a strong candidate gene within the 1p34.1-p32 mapped locus for a hereditary form of ptosis.
phenotype; candidate disease genes; model organism; mouse
Numerous disease syndromes are associated with regions of copy number variation (CNV) in the human genome and, in most cases, the pathogenicity of the CNV is thought to be related to altered dosage of the genes contained within the affected segment. However, establishing the contribution of individual genes to the overall pathogenicity of CNV syndromes is difficult and often relies on the identification of potential candidates through manual searches of the literature and online resources. We describe here the development of a computational framework to comprehensively search phenotypic information from model organisms and single-gene human hereditary disorders, and thus speed the interpretation of the complex phenotypes of CNV disorders. There are currently more than 5000 human genes about which nothing is known phenotypically but for which detailed phenotypic information for the mouse and/or zebrafish orthologs is available. Here, we present an ontology-based approach to identify similarities between human disease manifestations and the mutational phenotypes in characterized model organism genes; this approach can therefore be used even in cases where there is little or no information about the function of the human genes. We applied this algorithm to detect candidate genes for 27 recurrent CNV disorders and identified 802 gene-phenotype associations, approximately half of which involved genes that were previously reported to be associated with individual phenotypic features and half of which were novel candidates. A total of 431 associations were made solely on the basis of model organism phenotype data. Additionally, we observed a striking, statistically significant tendency for individual disease phenotypes to be associated with multiple genes located within a single CNV region, a phenomenon that we denote as pheno-clustering. 
Many of the clusters also display statistically significant similarities in protein function or vicinity within the protein-protein interaction network. Our results provide a basis for understanding previously un-interpretable genotype-phenotype correlations in pathogenic CNVs and for mobilizing the large amount of model organism phenotype data to provide insights into human genetic disorders.
Semantic Web technology can considerably catalyze translational genetics and genomics research in medicine, where the interchange of information between basic research and clinical levels becomes crucial. This exchange involves mapping abstract phenotype descriptions from research resources, such as knowledge databases and catalogs, to unstructured datasets produced through experimental methods and clinical practice. This is especially true for the construction of mutation databases. This paper presents a way of harmonizing abstract phenotype descriptions with patient data from clinical practice, and querying this dataset about relationships between phenotypes and genetic variants, at different levels of abstraction.
Due to the current availability of ontological and terminological resources that have already reached some consensus in biomedicine, a reuse-based ontology engineering approach was followed. The proposed approach uses the Web Ontology Language (OWL) to represent the phenotype ontology and the patient model, the Semantic Web Rule Language (SWRL) to bridge the gap between phenotype descriptions and clinical data, and the Semantic Query-Enhanced Web Rule Language (SQWRL) to query relevant phenotype-genotype bidirectional relationships. The work tests the use of semantic web technology in the biomedical research domain of cerebrotendinous xanthomatosis (CTX), using a real dataset and ontologies.
A framework to query relevant phenotype-genotype bidirectional relationships is provided. Phenotype descriptions and patient data were harmonized by defining 28 Horn-like rules in terms of the OWL concepts. In total, 24 patterns of SQWRL queries were designed following the initial list of competency questions. As the approach is based on OWL, the semantics of the framework adopt the standard logical model of an open world assumption.
This work demonstrates how semantic web technologies can be used to support flexible representation and computational inference mechanisms required to query patient datasets at different levels of abstraction. The open world assumption is especially good for describing only partially known phenotype-genotype relationships, in a way that is easily extensible. In future, this type of approach could offer researchers a valuable resource to infer new data from patient data for statistical analysis in translational research. In conclusion, phenotype description formalization and mapping to clinical data are two key elements for interchanging knowledge between basic and clinical research.
Neurofibromatosis type 1 (NF1) is a multi-system disease caused by mutations in the NF1 gene encoding a Ras-GAP protein, neurofibromin, which negatively regulates Ras signaling. Besides neuroectodermal malformations and tumors, the skeletal system is often affected (e.g. scoliosis and long bone dysplasia) demonstrating the importance of neurofibromin for development and maintenance of the musculoskeletal system. Here, we focus on the role of neurofibromin in skeletal muscle development. Nf1 gene inactivation in the early limb bud mesenchyme using Prx1-cre (Nf1Prx1) resulted in muscle dystrophy characterized by fibrosis, reduced number of muscle fibers and reduced muscle force. This was caused by an early defect in myogenesis affecting the terminal differentiation of myoblasts between E12.5 and E14.5. In parallel, the muscle connective tissue cells exhibited increased proliferation at E14.5 and an increase in the amount of connective tissue as early as E16.5. These changes were accompanied by excessive mitogen-activated protein kinase pathway activation. Satellite cells isolated from Nf1Prx1 mice showed normal self-renewal, but their differentiation was impaired as indicated by diminished myotube formation. Our results demonstrate a requirement of neurofibromin for muscle formation and maintenance. This previously unrecognized function of neurofibromin may contribute to the musculoskeletal problems in NF1 patients.
Recent work has used a family-based approach and whole-exome sequencing to identify de novo mutations in sporadic cases of mental retardation.
With the availability of next-generation sequencing (NGS) technology, it is expected that sequence variants may be called on a genomic scale. Here, we demonstrate that a deeper understanding of the distribution of the variant call frequencies at heterozygous loci in NGS data sets is a prerequisite for sensitive variant detection. We model the crucial steps in an NGS protocol as a stochastic branching process and derive a mathematical framework for the expected distribution of alleles at heterozygous loci before measurement, that is, sequencing. We confirm our theoretical results by analyzing technical replicates of human exome data and demonstrate that the variance of allele frequencies at heterozygous loci is higher than expected by a simple binomial distribution. Due to this high variance, mutation callers relying on binomially distributed priors are less sensitive for heterozygous variants that deviate strongly from the expected mean frequency. Our results also indicate that error rates can be reduced to a greater degree by technical replicates than by increasing sequencing depth.
Genotyping experiments are widely used in clinical and basic research laboratories to identify associations between genetic variations and normal/abnormal phenotypes. Genotyping assay techniques vary from single genomic regions that are interrogated using PCR reactions to high throughput assays examining genome-wide sequence and structural variation. The resulting genotype data may include millions of markers of thousands of individuals, requiring various statistical, modeling or other data analysis methodologies to interpret the results. To date, there are no standards for reporting genotyping experiments. Here we present the Minimum Information about a Genotyping Experiment (MIGen) standard, defining the minimum information required for reporting genotyping experiments. The MIGen standard covers experimental design, subject description, genotyping procedure, quality control and data analysis. MIGen is a registered project under MIBBI (Minimum Information for Biological and Biomedical Investigations) and is being developed by an interdisciplinary group of experts in basic biomedical science, clinical science, biostatistics and bioinformatics. To accommodate the wide variety of techniques and methodologies applied in current and future genotyping experiments, MIGen leverages foundational concepts from the Ontology for Biomedical Investigations (OBI) for the description of the various types of planned processes and implements a hierarchical document structure. The adoption of MIGen by the research community will facilitate consistent genotyping data interpretation and independent data validation. MIGen can also serve as a framework for the development of data models for capturing and storing genotyping results and experiment metadata in a structured way, to facilitate the exchange of metadata.
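The claim that allele counts at heterozygous loci vary more than a binomial model predicts can be illustrated with a small calculation. The sketch below does not reproduce the branching-process model of the abstract; as a stand-in it assumes that the allele fraction entering sequencing is itself random and follows a Beta distribution, which yields a beta-binomial read count. The function names and the parameter values are illustrative assumptions.

```python
def binomial_variance(depth: int, p: float = 0.5) -> float:
    """Variance of the alternate-allele read count under a plain binomial model."""
    return depth * p * (1.0 - p)


def beta_binomial_variance(depth: int, alpha: float, beta: float) -> float:
    """Variance of the read count when the allele fraction is Beta(alpha, beta).

    Assumption: the Beta distribution stands in for the pre-sequencing
    variability of the allele fraction; it is not the model from the abstract.
    """
    total = alpha + beta
    mean_fraction = alpha / total
    # Inflation factor relative to the binomial variance; equals 1 only in the
    # limit of an infinitely concentrated Beta distribution.
    inflation = (total + depth) / (total + 1.0)
    return depth * mean_fraction * (1.0 - mean_fraction) * inflation


if __name__ == "__main__":
    depth = 100
    print("binomial:     ", binomial_variance(depth))
    print("beta-binomial:", beta_binomial_variance(depth, alpha=10.0, beta=10.0))
```

Under these assumptions the inflation factor grows with sequencing depth, which is one way to see why a caller with a binomial prior becomes overconfident about the expected allele fraction at deeply covered heterozygous sites.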
Semantic similarity searches in ontologies are an important component of many bioinformatic algorithms, e.g., finding functionally related proteins with the Gene Ontology or phenotypically similar diseases with the Human Phenotype Ontology (HPO). We have recently shown that the performance of semantic similarity searches can be improved by ranking results according to the probability of obtaining a given score at random rather than by the scores themselves. However, to date, there are no algorithms for computing the exact distribution of semantic similarity scores, which is necessary for computing the exact P-value of a given score.
In this paper we consider the exact computation of score distributions for similarity searches in ontologies, and introduce a simple null hypothesis which can be used to compute a P-value for the statistical significance of similarity scores. We concentrate on measures based on Resnik's definition of ontological similarity. A new algorithm is proposed that collapses subgraphs of the ontology graph and thereby allows fast score distribution computation. The new algorithm is several orders of magnitude faster than the naive approach, as we demonstrate by computing score distributions for similarity searches in the HPO. It is shown that exact P-value calculation improves clinical diagnosis using the HPO compared to approaches based on sampling.
The new algorithm enables for the first time exact P-value calculation via exact score distribution computation for ontology similarity searches. The approach is applicable to any ontology for which the annotation-propagation rule holds and can improve any bioinformatic method that uses only the raw similarity scores. The algorithm was implemented in Java, supports any ontology in OBO format, and is available for non-commercial and academic usage under: https://compbio.charite.de/svn/hpo/trunk/src/tools/significance/
Summary: Gene Ontology and other forms of gene-category analysis play a major role in the evaluation of high-throughput experiments in molecular biology. Single-category enrichment analysis procedures such as Fisher's exact test tend to flag large numbers of redundant categories as significant, which can complicate interpretation. We have recently developed an approach called model-based gene set analysis (MGSA) that substantially reduces the number of redundant categories returned by the gene-category analysis. In this work, we present the Bioconductor package mgsa, which makes the MGSA algorithm available to users of the R language. Our package provides a simple and flexible application programming interface for applying the approach.
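To make the notion of an exact P-value for a Resnik-based score concrete, the following sketch enumerates the full score distribution on a tiny, made-up ontology. It corresponds to the naive approach that the abstract contrasts with the faster subgraph-collapsing algorithm, not to that algorithm itself. The ontology, the annotated items, the averaging of best-match scores and the null model (all term sets of the same size as the query, equally likely) are assumptions chosen for illustration.

```python
import math
from itertools import combinations

# Hypothetical toy ontology (child -> parents) and annotated items.
PARENTS = {"root": [], "A": ["root"], "B": ["root"],
           "A1": ["A"], "A2": ["A"], "B1": ["B"]}
ITEMS = {"item1": {"A1"}, "item2": {"A2"}, "item3": {"B1"}, "item4": {"A1", "B1"}}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen, stack = {term}, [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content():
    """IC(t) = -log(fraction of items annotated to t after propagation)."""
    counts = {t: 0 for t in PARENTS}
    for terms in ITEMS.values():
        for t in set().union(*(ancestors(t) for t in terms)):
            counts[t] += 1
    return {t: -math.log(c / len(ITEMS)) for t, c in counts.items() if c > 0}


IC = information_content()


def resnik(t1, t2):
    """Information content of the most informative common ancestor."""
    return max(IC[t] for t in ancestors(t1) & ancestors(t2))


def score(query, item_terms):
    """Average, over query terms, of the best Resnik match in the item."""
    return sum(max(resnik(q, t) for t in item_terms)
               for q in sorted(query)) / len(query)


def exact_p_value(query, item_terms):
    """Fraction of all equally sized term sets scoring at least as high."""
    observed = score(query, item_terms)
    null = [score(q, item_terms)
            for q in combinations(sorted(PARENTS), len(query))]
    return sum(s >= observed for s in null) / len(null)


if __name__ == "__main__":
    print(exact_p_value({"A2"}, ITEMS["item2"]))
```

Full enumeration is feasible only for toy inputs, because the number of candidate queries grows combinatorially with ontology and query size; that cost is the reason a faster exact method is of interest.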
Availability: The mgsa package has been made available as part of Bioconductor 2.8. It is released under the conditions of the Artistic license 2.0.
Contact: email@example.com; firstname.lastname@example.org
Motivation: Next-generation sequencing and exome-capture technologies are currently revolutionizing the way geneticists screen for disease-causing mutations in rare Mendelian disorders. However, the identification of causal mutations is challenging due to the sheer number of variants that are identified in individual exomes. Although databases such as dbSNP or HapMap can be used to reduce the plethora of candidate genes by filtering out common variants, the remaining candidate genes still number in the dozens.
Results: Our algorithm uses a non-homogeneous hidden Markov model that employs local recombination rates to identify chromosomal regions that are identical by descent (IBD = 2) in children of consanguineous or non-consanguineous parents solely based on genotype data of siblings derived from high-throughput sequencing platforms. Using simulated and real exome sequence data, we show that our algorithm is able to reduce the search space for the causative disease gene to a fifth or a tenth of the entire exome.
Availability: An R script and an accompanying tutorial are available at http://compbio.charite.de/index.php/ibd2.html.
Multicellular organismal development is controlled by a complex network of transcription factors, promoters and enhancers. Although reliable computational and experimental methods exist for enhancer detection, prediction of their target genes remains a major challenge. On the basis of available literature and ChIP-seq and ChIP-chip data for enhanceosome factor p300 and the transcriptional regulator Gli3, we found that genomic proximity and conserved synteny predict target genes with a relatively low recall of 12–27% within 2 Mb intervals centered at the enhancers. Here, we show that functional similarities between enhancer binding proteins and their transcriptional targets and proximity in the protein–protein interactome improve prediction of target genes. We used all four features to train random forest classifiers that predict target genes with a recall of 58% in 2 Mb intervals that may contain dozens of genes, representing a better than two-fold improvement over the performance of prediction based on single features alone. Genome-wide ChIP data is still relatively poorly understood, and it remains difficult to assign biological significance to binding events. Our study represents a first step in integrating various genomic features in order to elucidate the genomic network of long-range regulatory interactions.
The interpretation of data-driven experiments in genomics often involves a search for biological categories that are enriched for the responder genes identified by the experiments. However, knowledge bases such as the Gene Ontology (GO) contain hundreds or thousands of categories with very high overlap between categories. Thus, enrichment analysis performed on one category at a time frequently returns large numbers of correlated categories, leaving the choice of the most relevant ones to the user's interpretation.
Here we present model-based gene set analysis (MGSA) that analyzes all categories at once by embedding them in a Bayesian network, in which gene response is modeled as a function of the activation of biological categories. Probabilistic inference is used to identify the active categories. The Bayesian modeling approach naturally takes category overlap into account and avoids the need for multiple testing corrections met in single-category enrichment analysis. On simulated data, MGSA identifies active categories with up to 95% precision at a recall of 20% for moderate settings of noise, leading to a 10-fold precision improvement over single-category statistical enrichment analysis. Application to a gene expression data set in yeast demonstrates that the method provides high-level, summarized views of core biological processes and correctly eliminates confounding associations.
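For comparison, the single-category enrichment analysis that MGSA is measured against can be sketched as a one-sided hypergeometric (Fisher-type) test applied to one category at a time. The function name, its parameters and the example numbers are illustrative assumptions; this is the baseline procedure, not the MGSA model.

```python
from math import comb


def enrichment_p_value(population: int, category_size: int,
                       study_size: int, hits: int) -> float:
    """One-sided hypergeometric P-value, P(X >= hits).

    population    -- number of genes considered in total
    category_size -- genes annotated to the category
    study_size    -- responder genes identified by the experiment
    hits          -- responder genes that are annotated to the category
    """
    upper = min(category_size, study_size)
    total = comb(population, study_size)
    tail = sum(comb(category_size, k)
               * comb(population - category_size, study_size - k)
               for k in range(hits, upper + 1))
    return tail / total


if __name__ == "__main__":
    # Hypothetical numbers: 1000 genes, 40 in the category, 50 responders, 8 hits.
    print(enrichment_p_value(1000, 40, 50, 8))
```

Because each category is tested in isolation, a parent category and its overlapping child categories tend to be reported together, and the resulting P-values need a multiple testing correction. These are the two issues that the joint Bayesian treatment is meant to address.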
We describe a consanguineous Iraqi family in which affected siblings had mild mental retardation and congenital ataxia characterized by quadrupedal gait. Genome-wide linkage analysis identified a 5.8 Mb interval on chromosome 8q with shared homozygosity among the affected persons. Sequencing of genes contained in the interval revealed a homozygous mutation, S100P, in carbonic anhydrase related protein 8 (CA8), which is highly expressed in cerebellar Purkinje cells and influences inositol triphosphate (ITP) binding to its receptor ITPR1 on the endoplasmic reticulum and thereby modulates calcium signaling. We demonstrate that the mutation S100P is associated with proteasome-mediated degradation, and thus presumably represents a null mutation comparable to the Ca8 mutation underlying the previously described waddles mouse, which exhibits ataxia and appendicular dystonia. CA8 thus represents the third locus that has been associated with quadrupedal gait in humans, in addition to the VLDLR locus and a locus at chromosome 17p. Our findings underline the importance of ITP-mediated signaling in cerebellar function and provide suggestive evidence that congenital ataxia paired with cerebral dysfunction may, together with unknown contextual factors during development, predispose to quadrupedal gait in humans.
We identified a homozygous missense mutation (S100P) in the gene encoding carbonic anhydrase VIII in a consanguineous Iraqi family in which affected siblings had mild mental retardation and congenital ataxia characterized by quadrupedal gait. The affected persons walk on their hands and feet with their legs held straight with a “bear-like” gait. Our results show that the mutation S100P induces proteasome-mediated degradation with a severe reduction of the level of CA8 protein. The waddles (wdl) mouse, a spontaneous animal model with ataxia, was previously shown to harbor a 19-bp deletion in Ca8 that leads to an almost complete lack of detectable Ca8 protein, resulting in abnormalities in cerebellar synaptic transmission. Therefore, we speculate that the reduction in CA8 protein concentration associated with the S100P mutation could result in similar pathophysiological effects. With the current report, alterations at three gene loci (CA8, VLDLR, and a yet-to-be discovered gene on chromosome 17p) have been reported to be associated with quadrupedal gait. It is unknown whether quadrupedal gait is related to specific molecular abnormalities or is an adaptive response to ataxia in some circumstances. However, we note that ataxia associated with mutations at all three loci is congenital and also associated with mental retardation, which is not generally a feature of other hereditary ataxias.
Bioinformatics applications are now routinely used to analyze large amounts of data. Application development often requires many cycles of optimization, compiling, and testing. Repeatedly loading large datasets can significantly slow down the development process. We have incorporated HotSwap functionality into the protein workbench STRAP, allowing developers to create plugins using the Java HotSwap technique.
Users can load multiple protein sequences or structures into the main STRAP user interface, and simultaneously develop plugins using an editor of their choice such as Emacs. Saving changes to the Java file causes STRAP to recompile the plugin and automatically update its user interface without requiring recompilation of STRAP or reloading of protein data. This article presents a tutorial on how to develop HotSwap plugins. STRAP is available at and .
HotSwap is a useful and time-saving technique for bioinformatics developers. HotSwap can be used to efficiently develop bioinformatics applications that require loading large amounts of data into memory.
The identification of disease-causing mutations in next-generation sequencing (NGS) data requires efficient filtering techniques. In patients with rare recessive diseases, compound heterozygosity of pathogenic mutations is the most likely inheritance model if the parents are non-consanguineous. We developed a web-based compound heterozygous filter that is suited for data from NGS projects and that is easy to use for non-bioinformaticians. We analyzed the power of compound heterozygous mutation filtering by deriving background distributions for healthy individuals from different ethnicities and studied the effectiveness in trios as well as more complex pedigree structures. While usually more than 30 genes harbor potential compound heterozygotes in single exomes, this number can be markedly reduced with every additional member of the pedigree that is included in the analysis. In a real data set with exomes of four family members, two sisters affected by Mabry syndrome and their healthy parents, the disease-causing gene PIGO, which harbors the pathogenic compound heterozygous variants, could be readily identified. Compound heterozygous filtering is an efficient means to reduce the number of candidate mutations in studies aiming at identifying recessive disease genes in non-consanguineous families. A web-server is provided to make this filtering strategy available at www.gene-talk.de.
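The logic of a compound heterozygous filter in a parent-child trio can be sketched as follows. This is a minimal illustration under simplifying assumptions (biallelic variants, genotypes coded as alternate-allele counts 0, 1 or 2, no de novo variants, no quality or frequency filtering) and is not the implementation behind the web server. The record layout, the variant identifiers and the gene names are hypothetical.

```python
from collections import defaultdict


def compound_het_genes(variants):
    """Return genes with one maternal-only and one paternal-only het variant.

    Each variant is a dict with the keys "id", "gene", "child", "mother" and
    "father"; genotype values are alternate-allele counts (0, 1 or 2).
    """
    maternal = defaultdict(list)  # het in child and mother, absent in father
    paternal = defaultdict(list)  # het in child and father, absent in mother
    for v in variants:
        if v["child"] != 1:
            continue  # only heterozygous calls in the affected child qualify
        if v["mother"] == 1 and v["father"] == 0:
            maternal[v["gene"]].append(v["id"])
        elif v["father"] == 1 and v["mother"] == 0:
            paternal[v["gene"]].append(v["id"])
    # A candidate gene needs at least one variant from each parent.
    return {g: (maternal[g], paternal[g]) for g in maternal if g in paternal}


if __name__ == "__main__":
    trio = [  # hypothetical calls
        {"id": "v1", "gene": "GENE_A", "child": 1, "mother": 1, "father": 0},
        {"id": "v2", "gene": "GENE_A", "child": 1, "mother": 0, "father": 1},
        {"id": "v3", "gene": "GENE_B", "child": 1, "mother": 1, "father": 0},
        {"id": "v4", "gene": "GENE_B", "child": 1, "mother": 1, "father": 0},
    ]
    print(compound_het_genes(trio))
```

In this sketch GENE_B is rejected because both of its variants come from the mother and may therefore lie on the same haplotype. Genotypes of further relatives would add similar constraints, which fits the observation that each additional pedigree member reduces the candidate list.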
This paper describes an approach to providing computer-interpretable logical definitions for the terms of the Human Phenotype Ontology (HPO) using PATO, the ontology of phenotypic qualities, to link terms of the HPO to the anatomic and other entities that are affected by abnormal phenotypic qualities. This approach will allow improved computerized reasoning as well as a facility to compare phenotypes between different species. The PATO mapping will also provide direct links from phenotypic abnormalities and underlying anatomic structures encoded using the Foundational Model of Anatomy, which will be a valuable resource for computational investigations of the links between anatomical components and concepts representing diseases with abnormal phenotypes and associated genes.
Ontologies are widely used to represent knowledge in biomedicine. Systematic approaches for detecting errors and disagreements are needed for large ontologies with hundreds or thousands of terms and semantic relationships. A recent approach of defining terms using logical definitions is now increasingly being adopted as a method for quality control as well as for facilitating interoperability and data integration.
We show how automated reasoning over logical definitions of ontology terms can be used to improve ontology structure. We provide the Java software package GULO (Getting an Understanding of LOgical definitions), which allows fast and easy evaluation for any kind of logically decomposed ontology by generating a composite OWL ontology from appropriate subsets of the referenced ontologies and comparing the inferred relationships with the relationships asserted in the target ontology. As a case study we show how to use GULO to evaluate the logical definitions that have been developed for the Mammalian Phenotype Ontology (MPO).
Logical definitions of terms from biomedical ontologies represent an important resource for error and disagreement detection. GULO gives ontology curators a fast and simple tool for validation of their work.
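The comparison between inferred and asserted relationships can be illustrated with a deliberately small sketch. It does not use OWL or a reasoner and is not the GULO software; instead it assumes that each phenotype term is defined by a (quality, entity) pair and infers a subclass link whenever two terms share the quality and one entity lies below the other in an anatomy hierarchy. All term names, the hierarchy and the asserted link are hypothetical.

```python
# Hypothetical anatomy hierarchy: child entity -> parent entity.
ANATOMY_PARENT = {"femur": "bone", "bone": "skeletal element"}

# Hypothetical logical definitions: phenotype term -> (quality, affected entity).
DEFINITIONS = {
    "abnormal bone morphology": ("abnormal morphology", "bone"),
    "abnormal femur morphology": ("abnormal morphology", "femur"),
    "short femur": ("decreased length", "femur"),
}

# Hypothetical subclass links asserted by curators: (child, parent).
ASSERTED = {("short femur", "abnormal femur morphology")}


def entity_ancestors(entity):
    """Return all strict ancestors of an entity in the anatomy hierarchy."""
    result = set()
    while entity in ANATOMY_PARENT:
        entity = ANATOMY_PARENT[entity]
        result.add(entity)
    return result


def inferred_links():
    """Infer (child, parent) links from the logical definitions alone."""
    links = set()
    for child, (child_quality, child_entity) in DEFINITIONS.items():
        for parent, (parent_quality, parent_entity) in DEFINITIONS.items():
            if (child != parent and child_quality == parent_quality
                    and parent_entity in entity_ancestors(child_entity)):
                links.add((child, parent))
    return links


if __name__ == "__main__":
    inferred = inferred_links()
    print("inferred but not asserted:", inferred - ASSERTED)
    print("asserted but not inferred:", ASSERTED - inferred)
```

Both difference sets are worth a curator's attention: the first suggests links that may be missing from the ontology, while the second points to asserted links that the definitions do not support, for instance because the quality hierarchy is not modeled in this sketch.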
Marfan syndrome is an autosomal dominantly inherited disorder of connective tissue with prominent skeletal, ocular, and cardiovascular manifestations. Aortic aneurysm and dissection are the major determinants of premature death in untreated patients. In previous work, we showed that extracts of aortic tissues from the mgR mouse model of Marfan syndrome showed increased chemotactic stimulatory activity related to the elastin-binding protein. Aortic samples were collected from 6 patients with Marfan syndrome and 8 with isolated aneurysms of the ascending aorta. Control samples were obtained from 11 organ donors without known vascular or connective tissue diseases. Soluble proteins extracted from the aortic samples of the two patient groups were compared against buffer controls and against the aortic samples from controls with respect to the ability to induce macrophage chemotaxis as measured using a modified Boyden chamber, as well as the reactivity to a monoclonal antibody BA4 against bioactive elastin peptides using ELISA. Samples from Marfan patients displayed a statistically significant increase in chemotactic inductive activity compared to control samples. Additionally, reactivity to BA4 was significantly increased. Similar statistically significant increases were identified for the samples from patients with idiopathic thoracic aortic aneurysm. There was a significant correlation between the chemotactic index and BA4 reactivity, and the increases in chemotactic activity of extracts from Marfan patients could be inhibited by pretreatment with lactose, VGVAPG peptides, or BA4, which indicates the involvement of EBP in mediating the effects. Our results demonstrate that aortic extracts of patients with Marfan syndrome can elicit macrophage chemotaxis, similar to our previous study on aortic extracts of the mgR mouse model of Marfan syndrome (Guo et al., Circulation 2006; 114:1855-62).
The sheep is an important model organism for many types of medically relevant research, but molecular genetic experiments in the sheep have been limited by the lack of knowledge about ovine gene sequences.
Prior to our study, mRNA sequences for only 1,556 partial or complete ovine genes were publicly available. Therefore, we developed a composite de novo transcriptome assembly method for next-generation sequence data to combine known ovine mRNA and EST sequences, mRNA sequences from mouse and cow, and sequences assembled de novo from short read RNA-Seq data into a composite reference transcriptome, and identified transcripts from over 12 thousand previously undescribed ovine genes. Gene expression analysis based on these data revealed substantially different expression profiles in standard versus delayed bone healing in an ovine tibial osteotomy model. Hundreds of transcripts were differentially expressed between standard and delayed healing and between the time points of the standard and delayed healing groups. We used the sheep sequences to design quantitative RT-PCR assays with which we validated the differential expression of 26 genes that had been identified by RNA-seq analysis. A number of clusters of characteristic expression profiles could be identified, some of which showed striking differences between the standard and delayed healing groups. Gene Ontology (GO) analysis showed that the differentially expressed genes were enriched in terms including extracellular matrix, cartilage development, contractile fiber, and chemokine activity.
Our results provide a first atlas of gene expression profiles and differentially expressed genes in standard and delayed bone healing in a large-animal model and provide a number of clues as to the shifts in gene expression that underlie delayed bone healing. In the course of our study, we identified transcripts of 13,987 ovine genes, including 12,431 genes for which no sequence information was previously available. This information will provide a basis for future molecular research involving the sheep as a model organism.
Elastin production is characteristically turned off during the maturation of elastin-rich organs such as the aorta. MicroRNAs (miRNAs) are small regulatory RNAs that down-regulate target mRNAs by binding to miRNA regulatory elements (MREs) typically located in the 3′ UTR. Here we show a striking up-regulation of miR-29 and miR-15 family miRNAs during murine aortic development with commensurate down-regulation of targets including elastin and other extracellular matrix (ECM) genes. There were a total of 14 MREs for miR-29 in the coding sequences (CDS) and 3′ UTR of elastin, which was highly significant, and up to 22 miR-29 MREs were found in the CDS of multiple ECM genes including several collagens. This overrepresentation was conserved throughout mammalian evolution. Luciferase reporter assays showed synergistic effects of miR-29 and miR-15 family miRNAs on 3′ UTR and coding-sequence elastin constructs. Our results demonstrate that multiple miR-29 and miR-15 family MREs are characteristic for some ECM genes and suggest that miR-29 and miR-15 family miRNAs are involved in the down-regulation of elastin in the adult aorta.
Mutations in the FBN1 gene cause Marfan syndrome (MFS) and a wide range of overlapping phenotypes. The severe end of the spectrum is represented by neonatal MFS, the vast majority of probands carrying a mutation within exons 24-32. We previously showed that a mutation in exons 24-32 is predictive of a severe cardiovascular phenotype even in non-neonatal cases, and that mutations leading to premature truncation codons are under-represented in this region. To describe patients carrying a mutation in this so-called “neonatal” region, we studied the clinical and molecular characteristics of 198 probands with a mutation in exons 24-32 from a series of 1013 probands with an FBN1 mutation (20%). When comparing patients with mutations leading to a premature termination codon within exons 24-32 to patients with an in-frame mutation within the same region, a significantly higher probability of developing ectopia lentis and mitral insufficiency was found in the second group. Patients with a premature termination codon within exons 24-32 rarely displayed a neonatal or severe MFS presentation. We also found a higher probability of neonatal presentations associated with exon 25 mutations, as well as a higher probability of cardiovascular manifestations. A high phenotypic heterogeneity could be described for recurrent mutations, ranging from neonatal to classical MFS phenotype. In conclusion, even if the exon 24-32 location appears as a major cause of the severity of the phenotype in patients with a mutation in this region, other factors such as the type of mutation or modifier genes might also be relevant.
Codon, Nonsense; DNA Mutational Analysis; Ectopia Lentis/genetics; Exons/genetics; Humans; Marfan Syndrome/genetics; Microfilament Proteins/genetics/metabolism; Mutation; Phenotype