PMCCPMCCPMCC

Search tips
Search criteria 

Advanced

 
Logo of blackwellopenThis ArticleFor AuthorsLearn MoreSubmit
Annals of Neurology
 
Ann Neurol. 2017 January; 81(1): 9–16.
Published online 2017 January 24. doi:  10.1002/ana.24835
PMCID: PMC5347854

Leigh map: A novel computational diagnostic resource for mitochondrial disease

Joyeeta Rahman, BSc, 1 Alberto Noronha, MSc, 2 Ines Thiele, PhD, 2 and Shamima Rahman, FRCP, PhD 1 , 3

Mitochondrial disorders are among the most severe metabolic disorders wherein patients suffer from multisystemic phenotypes, often resulting in early death.1 Clinical, biochemical, and genetic heterogeneity among individuals, together with poor understanding of gene‐to‐phenotype relationships, pose significant diagnostic and therapeutic challenges for clinicians. In light of recent advances in next generation sequencing technologies, whole exome sequencing (WES) is emerging as the new global standard for the diagnosis of monogenic disorders, including mitochondrial diseases.2 However, owing to genetic heterogeneity of mitochondrial disorders and ongoing discovery of novel disease genes, WES data may not provide clinicians with enough certainty for a definitive diagnosis.

With these challenges in mind, we present the Leigh Map, a novel computational gene‐to‐phenotype network to be used as a diagnostic resource for mitochondrial disease, using Leigh syndrome (Mendelian Inheritance in Man 256000), the most genetically heterogeneous and most frequent phenotype of pediatric mitochondrial disease,3, 4 as a prototype. Leigh syndrome is a progressive neurodegenerative disorder defined neuropathologically by spongiform basal ganglia and brainstem lesions.4, 5 Clinical manifestations include psychomotor retardation, with regression, and progressive neurological abnormalities related to basal ganglia and/or brainstem dysfunction, often resulting in death within 2 years of initial presentation.4, 6 However, many patients may also present with multisystemic (eg, cardiac, hepatic, renal, or hematological) phenotypes. To date, there are 89 genes known to cause Leigh syndrome, the majority of which are difficult to definitively differentiate from each other, either biochemically or clinically. We hypothesized that these multisystemic features may help to distinguish different genetic subtypes of Leigh syndrome.

The Leigh Map (freely available at vmh.uni.lu/#leighmap), was built on the Molecular Interaction NEtwoRks VisuAlization (MINERVA) platform7 previously used to construct networks of Parkinson disease and human metabolism.8, 9, 10 The network comprises 89 genes and 236 phenotypes, expressed in Human Phenotypic Ontology (HPO) terms,11, 12 providing sufficient phenotypic and genetic variation to test the network's diagnostic capability. The Leigh Map aims to enhance the interpretation of WES data to aid clinicians in providing faster and more accurate diagnoses for patients so that appropriate measures can be taken for optimal management. The phenotypic components of the Leigh Map can be queried to generate a list of candidate genes. In addition, the genetic components of the Leigh Map may also be queried to browse a list of all reported phenotypes associated with a particular gene defect. We propose that this functionality can be used to enhance clinical surveillance of patients with an established genetic diagnosis. Blinded validation of test cases containing clinical and biochemical, but not genetic, data demonstrated that 2 independent testers were able to predict the correct causative gene using this method in 80% of cases. The success of the Leigh Map demonstrates the efficacy of computational networks as diagnostic aids for mitochondrial disease (Fig (Fig11).

Figure 1
Conceptualization of the Leigh Map. The Leigh Map is a novel computational resource that effectively integrates a large amount of phenotypic and genetic data from the literature and synthesizes it into a comprehensive resource that has the potential to ...

Creation of the Leigh Map

Systematic Literature Review

The genetic and phenotypic information gathered in this study came from an initial knowledgebase of >900 publications, collected from PubMed (latest search November 2016) and the senior author's personal archive. To facilitate data collection from this large breadth of literature associated with Leigh syndrome, we performed systematic literature mining with QDA Miner Lite (v1.4.2; Provalis Research, Montreal, Quebec, Canada) to generate a list of genes reported to cause Leigh syndrome or Leigh‐like syndromes, and their corresponding phenotypes. Phenotypic information was standardized by manually entering each reported phenotype into Phenomizer (compbio.charite.de/phenomizer),11, 12 a free online resource, which catalogues thousands of standardized human phenotypes, to obtain the appropriate HPO term and number. In addition to obtaining individual Leigh syndrome genes and phenotypes, we collected information on additional parameters that will give users further insight for an informed diagnosis. Such parameters include modes of inheritance, magnetic resonance imaging findings, and patient demographic information. These data were then organized into an Excel file. Although we aimed to rely solely on text mining to obtain these data, some publications required manual clarification, owing to formatting errors on QDA Miner, which were especially prevalent in publications with large tables. In total, we consulted >500 publications to create the Leigh Map. A simplified version of the gene‐to‐phenotype knowledgebase is provided in Tables 1 and 2.

Table 1
Leigh Syndrome Disease Genes and Phenotypes Associated with Metabolism
Table 2
Leigh Syndrome Disease Genes and Phenotypes Associated with Other Mitochondrial Functions

Structure and Functionality of Leigh Map

The Leigh Map was manually assembled using CellDesigner (v4.4)13 by incorporating phenotypic, genetic, and demographic data collected through literature mining. The map layout loosely follows mitochondrial structure. The outermost compartment represents the cytosol, where it is possible to find the nucleus and the mitochondrion. Three nuclear genes, nuclear envelope protein NUP62, nuclear export protein RANBP2, and adenosine deaminase ADAR, have been included in our network as genes causing a clinical and radiological phenotype closely resembling Leigh syndrome.14, 15, 16 The mitochondrion is visualized in its double membrane structure, and mitochondrial genes are grouped according to function and can be found in their submitochondrial location (eg, outer membrane, matrix). To represent gene‐to‐phenotype associations, a submap was created for each gene, displaying all phenotypes associated with any given gene defect. Also incorporated at this stage are links to external databases (eg, Uniprot17 and HGNC18) and modes of inheritance. This approach enables a modular overview of the map, avoiding overwhelming the user with the “hairball” effect caused by the high connectivity of the network. All submaps were integrated in the MINERVA framework,7 which makes use of the Google Maps application programming interface, enables content query, and allows a low‐latency interactive navigation of the network and its submodules simply by clicking a specific gene and opening the embedded submap window available on the interface.

Navigation through the network is similar to that of Google Maps, wherein the user can reveal increasingly specific components of information by zooming in on the different compartments (Fig (Fig2,2, Supplementary Figs 1–4). Additional data (patient demographics, modes of inheritance, external annotations, etc) can be accessed by clicking an element of the map. The corresponding data will be displayed in the left panel. The search functionality enables the query of multiple genes and phenotypes. The query results are displayed in the information panel and are also highlighted on the map. When searching for multiple phenotypes, all genes associated with each phenotype will be listed. Opening the submap for any given gene will display 1 or more of the highlighted phenotype elements, providing an immediate visual interpretation of the search results.

Figure 2
Schematic layout of the Leigh Map. The Leigh Map is a novel gene‐to‐phenotype network that can be used as a diagnostic resource for Leigh syndrome. The layout and navigation of the Leigh Map are similar to those of Google Maps, wherein ...

The Leigh Map provides data about 89 genes reported to cause Leigh syndrome and Leigh‐like syndromes, the highest number of Leigh syndrome genes that has been collated to date, as well as 236 associated phenotypes. The network consists of >1,700 interactions, all of which can be manually queried by the user. To facilitate access, causative Leigh syndrome genes are segregated according to gene function and arranged on a simplified schematic of the mitochondrion. Genes with similar functions are grouped together in subcategories. Examples of gene categories that can be found on the Leigh Map include genes involved in oxidative phosphorylation (eg, NDUFA1, SDHA) and genes that maintain mitochondrial DNA (eg, POLG, SUCLA2; see Fig Fig2).2). Expression of Leigh syndrome phenotypes in HPO terms11, 12 serves to normalize the network, thereby eliminating discrepancies in clinical jargon for phenotypes for which >1 synonym exists. “Leukodystrophy,” for example, can be described alternatively as “leukoencephalopathy” or “white matter changes.” The use of different nomenclature varies among clinicians and in different geographical regions; therefore, the use of a single HPO term (leukodystrophy; HP: 0002415) simplifies the Leigh Map and encourages its widespread utilization (Fig (Fig33).

Figure 3
Querying the Leigh Map. (A–C) All phenotypic and genetic components of the Leigh Map can be queried using the search function in the left‐hand panel. The user can query a particular gene by typing the name of the gene or any known alias ...

The Efficacy of the Leigh Map as a Diagnostic Resource

Blinded validation by 2 nonclinical investigators using a series of anonymized test cases revealed that the Leigh Map was able to identify the correct gene for 16 of 20 cases. The first and second authors, who both lack formal clinical expertise, acted as independent blinded testers of the network. The anonymized test cases were obtained from the senior author's clinical practice, a national mitochondrial disease clinic where patients with Leigh syndrome who have diverse clinical presentations and genetic causes are diagnosed and managed. The criteria for these test cases were patients who had a definitive genetic diagnosis of Leigh syndrome, confirmed by Sanger sequencing or WES. Testers were provided with clinical vignettes and biochemical data, without genetic information. All corresponding phenotypes identified from each test case were entered into the query box of the Leigh Map, each separated by a semicolon. The search tool then generated a list of candidate genes for each phenotype in individual panels, which were then manually browsed to establish a list of candidate genes (see Fig Fig3).3). We define "candidate genes" as those that include >50% of the queried phenotypes. Due to the immense number of phenotypes on the network, every test case generated a list of potentially causative genes. For 10 cases, the Leigh Map was able to identify the correct gene as the "top hit," that is, the gene corresponding to the highest number of matched phenotypes. The network also predicted the correct gene for an additional 6 cases, in which they were not the top hit. In the remaining 4 test cases, the Leigh Map failed to produce the correct gene as one of the generated candidate genes. In all cases, the Leigh Map produced a shortlist of no more than 8 candidate genes, effectively eliminating ~90% of the genes in the network. Multiple advanced search is not yet possible on this platform, so some manual deduction is required for the use of the Leigh Map at this time.

Future Prospects

Due to its high success rate in predicting causative genes by nonclinical testers, we conclude that the Leigh Map is an efficacious diagnostic resource that, in combination with WES data and metabolic testing, can be used by clinicians to provide patients with accurate diagnoses or to direct further biochemical investigation. Increased certainty of the genetic causes of mitochondrial disease has significant implications, because it could potentially attenuate the need for invasive diagnostic procedures, namely muscle biopsy with an attendant general anesthetic, which could pose risk to pediatric patients. It is important to iterate that we do not propose that the Leigh Map act as a substitute for WES data or other relevant functional studies, but rather as a supplement to these techniques.

The computational nature of the Leigh Map allows for the addition of novel disease genes or phenotypes with relative ease; thereby, clinicians have access to a database of all current causative genes, which can enhance the interpretation of WES data. Ideally, we will update both the phenotypic and genetic components of the Leigh Map concurrently with the literature and also develop a facility wherein experts can submit additional genetic or phenotypic information. This is especially beneficial within the context of mitochondrial diseases, because novel genes are constantly being identified. For Leigh syndrome specifically, one‐third of the causative genes were identified within the past 5 years.3

Currently, the most significant limitation of the Leigh Map is the lack of a multiple advanced search facility. Although the absence of this feature does not detract from the network's accuracy, it does reduce its ease of use. Future work aims to implement this feature into the network. Furthermore, the efficacy of the Leigh Map is affected by the breadth of literature available for individual genes. SURF1, one of the earliest mitochondrial disease genes to be identified and the most common nuclear genetic cause of Leigh syndrome, is the subject of numerous publications.19 Thus, SURF1 is associated with > 90 phenotypes in the Leigh Map, the largest number for any single gene. In contrast, the recently characterized complex I assembly gene C17ORF89 20 only features in a small section of a larger publication and accordingly is associated with only 2 phenotypes on the Leigh Map, although patients who harbor this mutation may display other phenotypes.

Expanding the current gene‐to‐phenotype binary of the Leigh Map is a future prospect that can further improve its usefulness as a diagnostic resource. Although there are no current curative therapies for mitochondrial disease, there are numerous compounds that are aimed at symptomatic management, including anticonvulsant drugs used to manage epilepsy and cofactor and vitamin supplements, such as coenzyme Q10, thiamine, and biotin, used to treat corresponding deficiencies. The addition of drug targets (a current feature of the MINERVA platform) to the Leigh Map could potentially provide insight into the effectiveness of various agents in treating mitochondrial disease in specific genetic contexts. For example, patients with SLC19A3 mutations respond dramatically to biotin and thiamine therapy,21 whereas those with HIBCH mutations may benefit from N‐acetyl cysteine.22 cDNA and protein mutations and annotations regarding animal models are also useful potential supplements to the Leigh Map. Leigh syndrome is a defined disorder5 wherein certain phenotypes appear almost ubiquitously, including hypotonia (91% of patients), developmental delay (82%), lactic acidosis (78%), and failure to thrive (61%). The failure to deduce the correct candidate genes for a minority of our test cases was due to the predominant presence of these common Leigh syndrome phenotypes and a lack of discriminating phenotypes. We found more success in "diagnosing" cases that presented with less frequently observed phenotypes such as cardiomyopathy (59%), optic atrophy (47%), or renal tubulopathy (15%). Therefore, the addition of these extra elements can be helpful in narrowing down a large list of candidate genes, thereby increasing the predictive power of the Leigh Map. An alternative approach to increase diagnostic power for common phenotypes is to incorporate a scoring system, which is a common element in other bioinformatics resources such as BLAST.23 In the context of our network, we propose "common" phenotypes be scored lower than less frequently observed phenotypes. The addition of a scoring system would complement the more sophisticated advanced search feature that we aim to implement in the future.

Progressive improvements in sequencing technologies and increased global cooperation have allowed for the generation of copious amounts of genetic and clinical information pertaining to mitochondrial disease. The Leigh Map effectively integrates these clinical and scientific data into an efficacious diagnostic resource for a genetically heterogeneous disorder, the success of which provides the basis for the construction of larger computational networks for a wider scope of mitochondrial and metabolic diseases.

Author Contributions

S.R. and I.T. were involved in the conception and design of the study. J.R. and A.N. acquired the data and created the network. All authors drafted the manuscript and the figures.

Potential Conflicts of Interest

Nothing to report.

Supporting information

Additional supporting information can be found in the online version of this article.

Supplementary Information

Supplementary Information

Supplementary Information

Supplementary Information

Supplementary Information

Acknowledgment

Funding for this study was provided by the British Inherited Metabolic Disease Group, an ATTRACT program grant (FNR/A12/01) from the Luxembourg National Research Fund, and a Great Ormond Street Hospital Children's Charity leadership award (V1260; S.R.).

References

1. Lightowlers RN, Taylor RW, Turnbull DM. Mutations causing mitochondrial disease: what is new and what challenges remain? Science 2015;349:1494. [PubMed]
2. Wortmann SB, Koolen DA, Smeitink JA, et al. Whole exome sequencing of suspected mitochondrial patients in clinical practice. J Inherit Metab Dis 2015;38:437–443. [PubMed]
3. Lake NJ, Compton AG, Rahman S, et al. Leigh syndrome: one disorder, more than 75 monogenic causes. Ann Neurol 2016;79:190–203. [PubMed]
4. Rahman S, Blok RB, Dahl H‐HM, et al. Leigh syndrome: clinical features and biochemical and DNA abnormalities. Ann Neurol 1996;39:343–351. [PubMed]
5. Leigh D. Subacute necrotizing encephalomyelopathy in an infant. J Neurol Neurosurg Psychiatry 1951;14:216–221. [PubMed]
6. Sofou K, De Coo IFM, Isohanni P, et al. A multicenter study on Leigh syndrome: disease course and predictors of survival. Orphanet J Rare Dis 2014;9:52. [PubMed]
7. Gawron P, Ostaszewski M, Satagopam V, et al. MINERVA—a platform for visualization and curation of molecular interaction networks. NPJ Syst Biol Appl 2016;2:16020.
8. Fujita KA, Ostaszewski M, Matsuoka Y, et al. Integrating pathways of Parkinson's disease in a molecular interaction map. Mol Neurobiol 2014;49:88–102. [PubMed]
9. Noronha A, Danielsdóttir AD, Jóhannsson F, et al. ReconMap: an interactive visualisation of human metabolism. arXiv 2016;1606.00042[q‐bio.MN].
10. Thiele I, Swainston N, Fleming RMT, et al. A community‐driven global reconstruction of human metabolism. Nat Biotech 2013;31:419–425. [PMC free article] [PubMed]
11. Kohler S, Schulz MH, Krawitz P, et al. Clinical diagnostics in human genetics with semantic similarity searches in ontologies. Am J Hum Genet 2009;85:457–464. [PubMed]
12. Kohler S, Doelken SC, Mungall CJ, et al. The Human Phenotype Ontology project: linking molecular biology and disease through phenotype data. Nucleic Acids Res 2014;42(Database issue):D966–D974. [PubMed]
13. Funahashi A, Matsuoka Y, Jouraku A, et al. CellDesigner 3.5: a versatile modeling tool for biochemical networks. Proc IEEE 2008;96:1254–1265.
14. Basel‐Vanagaite L, Muncher L, Straussberg R, et al. Mutated nup62 causes autosomal recessive infantile bilateral striatal necrosis. Ann Neurol 2006;60:214–222. [PubMed]
15. Singh RR, Sedani S, Lim M, et al. RANBP2 mutation and acute necrotizing encephalopathy: 2 cases and a literature review of the expanding clinico‐radiological phenotype. Eur J Paediatr Neurol 2015;19:106–113. [PubMed]
16. Livingston JH, Lin JP, Dale RC, et al. A type I interferon signature identifies bilateral striatal necrosis due to mutations in ADAR1. J Med Genet 2014;51:76–82. [PubMed]
17. UniProt Consortium. UniProt: a hub for protein information. Nucleic Acids Res 2015;43(Database issue):D204–D212. [PubMed]
18. Gray KA, Yates B, Seal RL, et al. Genenames.org: the HGNC resources in 2015. Nucleic Acids Res 2015;43(Database issue):D1079–D1085. [PubMed]
19. Wedatilake Y, Brown RM, McFarland R, et al. SURF1 deficiency: a multi‐centre natural history study. Orphanet J Rare Dis 2013;8:96. [PubMed]
20. Floyd B, Wilkerson E, Veling M, et al. Mitochondrial protein interaction mapping identifies regulators of respiratory chain function. Mol Cell 2016;63:621–632. [PubMed]
21. Fassone E, Wedatilake Y, DeVile C, et al. Treatable Leigh‐like encephalopathy presenting in adolescence. BMJ Case Rep 2013;2013:200838. [PMC free article] [PubMed]
22. Ferdinandusse S, Waterham HR, Heales SJ, et al. HIBCH mutations can cause Leigh‐like disease with combined deficiency of multiple mitochondrial respiratory chain enzymes and pyruvate dehydrogenase. Orphanet J Rare Dis 2013;8:188. [PubMed]
23. Altschul SF. A protein alignment scoring system sensitive at all evolutionary distances. J Mol Evol 1993;36:290–300. [PubMed]

Articles from Wiley-Blackwell Online Open are provided here courtesy of Wiley-Blackwell, John Wiley & Sons