Pantothenate kinase-associated neurodegeneration (PKAN) is a rare, inborn error of metabolism characterized by iron accumulation in the basal ganglia and by the presence of dystonia, dysarthria, and retinal degeneration. Mutations in pantothenate kinase 2 (PANK2), the rate-limiting enzyme in mitochondrial coenzyme A biosynthesis, represent the most common genetic cause of this disorder. How mutations in this core metabolic enzyme give rise to such a broad clinical spectrum of pathology remains a mystery. To systematically explore its pathogenesis, we performed global metabolic profiling on plasma from a cohort of 14 genetically defined patients and 18 controls. Notably, lactate is elevated in PKAN patients, suggesting dysfunctional mitochondrial metabolism. As predicted, but never previously reported, pantothenate levels are higher in patients with premature stop mutations in PANK2. Global metabolic profiling and follow-up studies in patient-derived fibroblasts also reveal defects in bile acid conjugation and lipid metabolism, pathways that require coenzyme A. These findings raise a novel therapeutic hypothesis, namely, that dietary fats and bile acid supplements may hold potential as disease-modifying interventions. Our study illustrates the value of metabolic profiling as a tool for systematically exploring the biochemical basis of inherited metabolic diseases.
PKAN; coenzyme A; mitochondria; mass spectrometry; cholesterol
Mitochondrial calcium uptake is present in nearly all vertebrate tissues and is believed to be critical in shaping calcium signaling, regulating ATP synthesis and controlling cell death. Calcium uptake occurs through a channel called the uniporter that resides in the inner mitochondrial membrane. Recently, we used comparative genomics to identify MICU1 and MCU as the key regulatory and putative pore-forming subunits of this channel, respectively. Using bioinformatics, we now report that the human genome encodes two additional paralogs of MICU1, which we call MICU2 and MICU3, each of which likely arose by gene duplication and exhibits distinct patterns of organ expression. We demonstrate that MICU1 and MICU2 are expressed in HeLa and HEK293T cells, and provide multiple lines of biochemical evidence that MCU, MICU1 and MICU2 reside within a complex and cross-stabilize each other's protein expression in a cell-type dependent manner. Using in vivo RNAi technology to silence MICU1, MICU2 or both proteins in mouse liver, we observe an additive impairment in calcium handling without adversely impacting mitochondrial respiration or membrane potential. The results identify MICU2 as a new component of the uniporter complex that may contribute to the tissue-specific regulation of this channel.
Metabolic reprogramming has been proposed to be a hallmark of cancer, yet we currently lack a systematic characterization of the metabolic pathways active in transformed cells. Using mass spectrometry, we measured the consumption and release (CORE) of 219 metabolites from media across the NCI-60 cancer cell lines, and integrated CORE profiles with a pre-existing atlas of gene expression. The integrated analysis identified glycine consumption and expression of the mitochondrial glycine biosynthetic pathway as strongly correlated with rates of proliferation across cancer cells. Antagonizing glycine uptake and its mitochondrial biosynthesis preferentially impaired rapidly proliferating cells. Moreover, higher expression of this pathway was associated with greater mortality in breast cancer patients. Increased reliance on glycine may represent a metabolic vulnerability for selectively targeting rapid cancer cell proliferation.
Advances in next-generation sequencing (NGS) promise to facilitate diagnosis of inherited disorders. While in research settings NGS has pinpointed causal alleles using segregation in large families, the key challenge for clinical diagnosis is application to single individuals. To explore its diagnostic utility, we performed targeted NGS in 42 unrelated infants with clinical and biochemical evidence of mitochondrial oxidative phosphorylation disease, who were refractory to traditional molecular diagnosis. These devastating mitochondrial disorders are characterized by phenotypic and genetic heterogeneity, with over 100 causal genes identified to date. We performed “MitoExome” sequencing of the mitochondrial DNA (mtDNA) and exons of ~1000 nuclear genes encoding mitochondrial proteins and prioritized rare mutations predicted to disrupt function. Since patients and controls harbored a comparable number of such heterozygous alleles, we could not prioritize dominant acting genes. However, patients showed a five-fold enrichment of genes with two such mutations that could underlie recessive disease. In total, 23/42 (55%) patients harbored such recessive genes or pathogenic mtDNA variants. Firm diagnoses were enabled in 10 patients (24%) who had mutations in genes previously linked to disease. 13 patients (31%) had mutations in nuclear genes never linked to disease. The pathogenicity of two such genes, NDUFB3 and AGK, was supported by cDNA complementation and evidence from multiple patients, respectively. The results underscore the immediate potential and challenges of deploying NGS in clinical settings.
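The recessive prioritization step described above can be illustrated with a small sketch. It assumes, purely for illustration, that each patient's rare, predicted-damaging variants have already been reduced to (gene, zygosity) pairs; the gene names and the `recessive_candidates` helper are hypothetical and are not the study's actual pipeline.

```python
from collections import defaultdict

def recessive_candidates(variants):
    """Return genes carrying at least two rare, predicted-damaging alleles.

    `variants` is an iterable of (gene, zygosity) pairs, where zygosity is
    "hom" (two copies of one allele) or "het" (one copy). A gene qualifies
    with one homozygous variant or two heterozygous variants; the phase of
    heterozygous variants is not checked in this sketch.
    """
    allele_count = defaultdict(int)
    for gene, zygosity in variants:
        allele_count[gene] += 2 if zygosity == "hom" else 1
    return sorted(gene for gene, n in allele_count.items() if n >= 2)

# Hypothetical variant calls for a single patient.
patient = [("GENE_A", "hom"), ("GENE_B", "het"), ("GENE_B", "het"), ("GENE_C", "het")]
print(recessive_candidates(patient))
```

A single heterozygous allele (GENE_C here) is deliberately not reported, mirroring the observation that heterozygous alleles were equally common in patients and controls.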
Calcium uptake into mitochondria occurs via a recently identified ion channel called the uniporter. Here, we characterize the phylogenomic distribution of the uniporter’s membrane-spanning pore subunit (MCU) and regulatory partner (MICU1). Homologs of both components tend to co-occur in all major branches of eukaryotic life, but both have been lost along certain protozoan and fungal lineages. Several bacterial genomes also contain putative MCU homologs that may represent prokaryotic calcium channels. The analyses indicate that the uniporter may have been an early feature of mitochondria.
Mitochondria from diverse organisms are capable of transporting large amounts of Ca2+ via a ruthenium-red-sensitive, membrane-potential-dependent mechanism called the uniporter [1–4]. Although the uniporter’s biophysical properties have been studied extensively, its molecular composition remains elusive. We recently used comparative proteomics to identify MICU1 (also known as CBARA1), an EF-hand-containing protein that serves as a putative regulator of the uniporter [5]. Here, we use whole-genome phylogenetic profiling, genome-wide RNA co-expression analysis and organelle-wide protein coexpression analysis to predict proteins functionally related to MICU1. All three methods converge on a novel predicted transmembrane protein, CCDC109A, that we now call ‘mitochondrial calcium uniporter’ (MCU). MCU forms oligomers in the mitochondrial inner membrane, physically interacts with MICU1, and resides within a large molecular weight complex. Silencing MCU in cultured cells or in vivo in mouse liver severely abrogates mitochondrial Ca2+ uptake, whereas mitochondrial respiration and membrane potential remain fully intact. MCU has two predicted transmembrane helices, which are separated by a highly conserved linker facing the intermembrane space. Acidic residues in this linker are required for its full activity. However, an S259A point mutation retains function but confers resistance to Ru360, the most potent inhibitor of the uniporter. Our genomic, physiological, biochemical and pharmacological data firmly establish MCU as an essential component of the mitochondrial Ca2+ uniporter.
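The idea behind phylogenetic profiling, one of the three prediction methods named above, can be sketched as follows. This is a minimal illustration that assumes each protein is summarized by a presence/absence vector across genomes and that similarity is measured by Hamming distance; the protein names, profiles, and distance measure are hypothetical choices, not the ones used in the study.

```python
def hamming(a, b):
    """Number of genomes in which two presence/absence profiles disagree."""
    return sum(x != y for x, y in zip(a, b))

def rank_by_profile(query, profiles):
    """Rank all other proteins by how closely their profile matches the query's."""
    target = profiles[query]
    scored = [(hamming(target, p), name) for name, p in profiles.items() if name != query]
    return [name for _, name in sorted(scored)]

# Hypothetical profiles over six genomes (1 = homolog present, 0 = absent).
profiles = {
    "QUERY":  (1, 1, 0, 1, 0, 1),
    "PROT_A": (1, 1, 0, 1, 0, 1),
    "PROT_B": (1, 0, 0, 1, 1, 1),
    "PROT_C": (0, 0, 1, 0, 1, 0),
}
print(rank_by_profile("QUERY", profiles))
```

Proteins that are gained and lost together across lineages rank first, which is the signal a co-occurrence analysis looks for.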
The metazoan mitochondrial translation machinery is unusual in having a single tRNAMet that fulfills the dual role of the initiator and elongator tRNAMet. A portion of the Met-tRNAMet pool is formylated by mitochondrial methionyl-tRNA formyltransferase (MTFMT) to generate N-formylmethionine-tRNAMet (fMet-tRNAMet), which is used for translation initiation; however, the requirement of formylation for initiation in human mitochondria is still under debate. Using targeted sequencing of the mtDNA and nuclear exons encoding the mitochondrial proteome (MitoExome), we identified compound heterozygous mutations in MTFMT in two unrelated children presenting with Leigh syndrome and combined OXPHOS deficiency. Patient fibroblasts exhibit severe defects in mitochondrial translation that can be rescued by exogenous expression of MTFMT. Furthermore, patient fibroblasts have dramatically reduced fMet-tRNAMet levels and an abnormal formylation profile of mitochondrially translated COX1. Our findings demonstrate that MTFMT is critical for efficient human mitochondrial translation and reveal a human disorder of Met-tRNAMet formylation.
The reduction of plasma low-density lipoprotein levels by HMG-CoA reductase inhibitors, or statins, has had a revolutionary impact in medicine, but muscle-related side effects remain a dose-limiting toxicity in many patients. We describe a chemical epistasis approach that can be useful in refining the mechanism of statin muscle toxicity, as well as in screening for agents that suppress muscle toxicity while preserving the ability of statins to increase the expression of the low-density lipoprotein receptor. Using this approach, we identified one compound that attenuates the muscle side effects in both cellular and animal models of statin toxicity, likely by influencing Rab prenylation. Our proof-of-concept screen lays the foundation for truly high-throughput screens that could help lead to the development of clinically useful adjuvants that can one day be co-administered with statins.
Piwi; Hop; Hsp90; canalization; phenotypic variation; Drosophila
The cellular content of mitochondria changes dynamically during development and in response to external stimuli, but the underlying mechanisms remain obscure. To systematically identify molecular probes and pathways that control mitochondrial abundance, we developed a high-throughput imaging assay that tracks both the per cell mitochondrial content and the cell size in confluent human umbilical vein endothelial cells. We screened 28,786 small molecules and observed that hundreds of small molecules are capable of increasing or decreasing the cellular content of mitochondria in a manner proportionate to cell size, revealing stereotyped control of these parameters. However, only a handful of compounds dissociate this relationship. We focus on one such compound, BRD6897, and demonstrate through secondary assays that it increases the cellular content of mitochondria as evidenced by fluorescence microscopy, mitochondrial protein content, and respiration, even after rigorous correction for cell size, cell volume, or total protein content. BRD6897 increases uncoupled respiration 1.6-fold in two different, non-dividing cell types. Based on electron microscopy, BRD6897 does not alter the percent of cytoplasmic area occupied by mitochondria, but instead, induces a striking increase in the electron density of existing mitochondria. The mechanism is independent of known transcriptional programs and is likely to be related to a blockade in the turnover of mitochondrial proteins. At present the molecular target of BRD6897 remains to be elucidated, but if identified, could reveal an important additional mechanism that governs mitochondrial biogenesis and turnover.
Defects in cellular energy metabolism represent an early feature in a variety of human neurodegenerative diseases. Recent studies have shown that targeting energy metabolism can protect against neuronal cell death in such diseases. Here, we show that meclizine, a clinically used drug that we have recently shown to silence oxidative metabolism, suppresses apoptotic cell death in a murine cellular model of polyglutamine (polyQ) toxicity. We further show that this protective effect extends to neuronal dystrophy and cell death in Caenorhabditis elegans and Drosophila melanogaster models of polyQ toxicity. Meclizine's mechanism of action is not attributable to its anti-histaminergic or anti-muscarinic activity, but rather, strongly correlates with its ability to suppress mitochondrial respiration. Since meclizine is an approved drug that crosses the blood–brain barrier, it may hold therapeutic potential in the treatment of polyQ toxicity disorders, such as Huntington's disease.
Mitochondrial diseases comprise a diverse set of clinical disorders that affect multiple organ systems with varying severity and age of onset. Due to their clinical and genetic heterogeneity, these diseases are difficult to diagnose. We have developed a targeted exome sequencing approach to improve our ability to properly diagnose mitochondrial diseases and apply it here to an individual patient. Our method targets mitochondrial DNA (mtDNA) and the exons of 1,600 nuclear genes involved in mitochondrial biology or Mendelian disorders with multi-system phenotypes, thereby allowing for simultaneous evaluation of multiple disease loci.
Targeted exome sequencing was performed on a patient initially suspected to have a mitochondrial disorder. The patient presented with diabetes mellitus, diffuse brain atrophy, autonomic neuropathy, optic nerve atrophy, and a severe amnestic syndrome. Further work-up revealed multiple heteroplasmic mtDNA deletions as well as profound thiamine deficiency without a clear nutritional cause. Targeted exome sequencing revealed a homozygous c.1672C>T (p.R558C) missense mutation in exon 8 of WFS1 that has previously been reported in a patient with Wolfram syndrome.
This case demonstrates how clinical application of next-generation sequencing technology can enhance the diagnosis of patients suspected to have rare genetic disorders. Furthermore, the finding of unexplained thiamine deficiency in a patient with Wolfram syndrome suggests a potential link between WFS1 biology and thiamine metabolism that has implications for the clinical management of Wolfram syndrome patients.
Motifs are patterns in biological sequences that are vital for understanding gene function, human disease, and drug design. They help locate transcriptional regulatory elements and transcription factor binding sites, among other features. Identifying motifs is therefore a crucial problem in biology.
Many facets of the motif search problem have been identified in the literature. One of them is (ℓ, d)-motif search, also known as Planted Motif Search (PMS). The PMS problem has been well investigated and shown to be NP-hard. Any algorithm for PMS that always finds all the (ℓ, d)-motifs on a given input set is called an exact algorithm. In this paper we focus on exact algorithms only. All known exact algorithms for PMS take time exponential in some of the underlying parameters in the worst case. This does not mean, however, that exact algorithms cannot solve practical instances within a reasonable amount of time. In this paper, we propose a fast algorithm that can solve the well-known challenging instances of PMS: (21, 8) and (23, 9). No prior exact algorithm could solve these instances. In particular, our proposed algorithm takes about 10 hours on the challenging instance (21, 8) and about 54 hours on the challenging instance (23, 9). The algorithm was run on a single 2.4 GHz PC with 3 GB RAM. The implementation of PMS5 is freely available on the web at http://www.pms.engr.uconn.edu/downloads/PMS5.zip.
We present an efficient algorithm, PMS5, that combines some novel ideas with the well-known algorithms PMS1 and PMSPrune. PMS5 can tackle the large challenging instances (21, 8) and (23, 9). We therefore hope that PMS5 will help biologists discover longer motifs in the future.
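To make the (ℓ, d)-motif search problem concrete, the sketch below shows a naive exact solver. It is not PMS5, PMS1, or PMSPrune; it simply assumes the standard problem definition (a motif is a string of length ℓ that occurs in every input sequence with at most d mismatches) and enumerates the d-neighborhood of every ℓ-mer of the first sequence. The function names and the DNA alphabet are illustrative assumptions, and the running time is exponential in d, so it is only usable on tiny inputs.

```python
from itertools import combinations, product

ALPHABET = "ACGT"  # assumed DNA alphabet

def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def neighbors(lmer, d):
    """All strings within Hamming distance <= d of lmer."""
    result = set()
    for k in range(d + 1):
        for positions in combinations(range(len(lmer)), k):
            for letters in product(ALPHABET, repeat=k):
                candidate = list(lmer)
                for pos, letter in zip(positions, letters):
                    candidate[pos] = letter
                result.add("".join(candidate))
    return result

def occurs_within(motif, seq, d):
    """True if some l-mer of seq is within distance d of motif."""
    l = len(motif)
    return any(hamming(motif, seq[i:i + l]) <= d for i in range(len(seq) - l + 1))

def planted_motifs(seqs, l, d):
    """Exhaustively find all (l, d)-motifs of the input sequences."""
    first, rest = seqs[0], seqs[1:]
    candidates = set()
    for i in range(len(first) - l + 1):
        candidates |= neighbors(first[i:i + l], d)
    return sorted(m for m in candidates if all(occurs_within(m, s, d) for s in rest))

print(planted_motifs(["AAACGT", "ACGTTT"], 4, 0))
```

Every motif must lie within distance d of some ℓ-mer of the first sequence, so restricting candidates to that neighborhood loses nothing; the cost is that the neighborhood grows rapidly with d, which is why instances such as (21, 8) need far more refined pruning.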
Flavin-linked sulfhydryl oxidases participate in the net generation of disulfide bonds during oxidative protein folding in the endoplasmic reticulum. Members of the Quiescin-sulfhydryl oxidase (QSOX) family catalyze the facile direct introduction of disulfide bonds into unfolded reduced proteins with the reduction of molecular oxygen to generate hydrogen peroxide. Current progress in dissecting the mechanism of QSOX enzymes is reviewed, with emphasis on the CxxC motifs in the thioredoxin and Erv/ALR domains and the involvement of the flavin prosthetic group. The tissue distribution and intra- and extracellular location of QSOX enzymes are discussed, and suggestions for the physiological role of these enzymes are presented. The review compares the substrate specificity and catalytic efficiency of the QSOX enzymes with members of the Ero1 family of flavin-dependent sulfhydryl oxidases: enzymes believed to play key roles in disulfide generation in yeast and higher eukaryotes. Finally, limitations of our current understanding of disulfide generation in metazoans are identified and questions posed for the future. Antioxid. Redox Signal. 13, 1217–1230.
Resurfacing submacular human Bruch's membrane with a cell-deposited extracellular matrix increases long-term survival of retinal pigment epithelial cells. This effect is most marked in submacular Bruch's membrane of aged Caucasians.
To determine whether resurfacing submacular human Bruch's membrane with a cell-deposited extracellular matrix (ECM) improves retinal pigment epithelial (RPE) survival.
Bovine corneal endothelial (BCE) cells were seeded onto the inner collagenous layer of submacular Bruch's membrane explants of human donor eyes to allow ECM deposition. Control explants from fellow eyes were cultured in medium only. The deposited ECM was exposed by removing BCE. Fetal RPE cells were then cultured on these explants for 1, 14, or 21 days. The explants were analyzed quantitatively by light microscopy and scanning electron microscopy. Surviving RPE cells from explants cultured for 21 days were harvested to compare bestrophin and RPE65 mRNA expression. Mass spectroscopy was performed on BCE-ECM to examine the protein composition.
The BCE-treated explants showed significantly higher RPE nuclear density than did the control explants at all time points. RPE expressed more differentiated features on BCE-treated explants than on untreated explants, but expressed very little mRNA for bestrophin or RPE65. The untreated young (<50 years) and African American submacular Bruch's membrane explants supported significantly higher RPE nuclear densities (NDs) than did the Caucasian explants. These differences were reduced or nonexistent in the BCE-ECM-treated explants. Proteins identified in the BCE-ECM included ECM proteins, ECM-associated proteins, cell membrane proteins, and intracellular proteins.
Increased RPE survival can be achieved on aged submacular human Bruch's membrane by resurfacing the latter with a cell-deposited ECM. Caucasian eyes seem to benefit the most, as cell survival is the worst on submacular Bruch's membrane in these eyes.
Most cells can dynamically shift their relative reliance on glycolytic versus oxidative metabolism in response to nutrient availability, during development, and in disease. Studies in model systems have shown that re-directing energy metabolism from respiration to glycolysis can suppress oxidative damage and cell death in ischemic injury. At present we have a limited set of drugs that safely toggle energy metabolism in humans. Here, we introduce a quantitative, nutrient sensitized screening strategy that can identify such compounds based on their ability to selectively impair growth and viability of cells grown in galactose versus glucose. We identify several FDA approved agents never before linked to energy metabolism, including meclizine, which blunts cellular respiration via a mechanism distinct from canonical inhibitors. We further show that meclizine pretreatment confers cardioprotection and neuroprotection against ischemia-reperfusion injury in murine models. Nutrient-sensitized screening may offer a useful framework for understanding gene function and drug action within the context of energy metabolism.
Emerging technologies allow the high-throughput profiling of metabolic status from a blood specimen (metabolomics). We investigated whether metabolite profiles could predict the development of diabetes. Among 2,422 normoglycemic individuals followed for 12 years, 201 developed diabetes. Amino acids, amines, and other polar metabolites were profiled in baseline specimens using liquid chromatography-tandem mass spectrometry. Cases and controls were matched for age, body mass index, and fasting glucose. Five branched-chain and aromatic amino acids had highly significant associations with future diabetes: isoleucine, leucine, valine, tyrosine, and phenylalanine. A combination of three amino acids predicted future diabetes (>5-fold higher risk for individuals in the top quartile). The results were replicated in an independent, prospective cohort. These findings underscore the potential importance of amino acid metabolism early in the pathogenesis of diabetes, and suggest that amino acid profiles could aid in diabetes risk assessment.
The NADH:quinone oxidoreductase (complex I) has evolved from a combination of smaller functional building blocks. Chloroplasts and cyanobacteria contain a complex I-like enzyme having only 11 subunits. This enzyme lacks the N-module which harbors the NADH binding site and the flavin and iron–sulfur cluster prosthetic groups. A complex I-homologous enzyme found in some archaea contains an F420 dehydrogenase subunit denoted as FpoF rather than the N-module. In the present study, all currently available whole genome sequences were used to survey the occurrence of the different types of complex I in the different kingdoms of life. Notably, the 11-subunit version of complex I was found to be widely distributed, both in the archaeal and in the eubacterial kingdoms, whereas the 14-subunit classical complex I was found only in certain eubacterial phyla. The FpoF-containing complex I was present in Euryarchaeota but not in Crenarchaeota, which contained the 11-subunit complex I. The 11-subunit enzymes showed a primary sequence variability as great or greater than the full-size 14-subunit complex I, but differed distinctly from the membrane-bound hydrogenases. We conclude that this type of compact 11-subunit complex I is ancestral to all present-day complex I enzymes. No designated partner protein, acting as an electron delivery device, could be found for the compact version of complex I. We propose that the primordial complex I, and many of the present-day 11-subunit versions of it, operate without a designated partner protein but are capable of interaction with several different electron donor or acceptor proteins.
Electronic supplementary material
The online version of this article (doi:10.1007/s00239-011-9447-2) contains supplementary material, which is available to authorized users.
NADH:quinone oxidoreductase; NiFe-hydrogenase; Formate:hydrogen lyase; Protein phylogeny; Functional modules; Antiporter
Discovering the molecular basis of mitochondrial respiratory chain disease is challenging given the large number of both mitochondrial and nuclear genes involved. We report a strategy of focused candidate gene prediction, high-throughput sequencing, and experimental validation to uncover the molecular basis of mitochondrial complex I (CI) disorders. We created five pools of DNA from a cohort of 103 patients and then performed deep sequencing of 103 candidate genes to spotlight 151 rare variants predicted to impact protein function. We used confirmatory experiments to establish genetic diagnoses in 22% of previously unsolved cases, and discovered that defects in NUBPL and FOXRED1 can cause CI deficiency. Our study illustrates how large-scale sequencing, coupled with functional prediction and experimental validation, can reveal novel disease-causing mutations in individual patients.
Under the shell of a chicken egg are two opposed proteinaceous disulfide-rich membranes. They are fabricated in the avian oviduct using fibers formed from proteins that are extensively coupled by irreversible lysine-derived crosslinks. The intractability of these eggshell membranes (ESM) has slowed their characterization and their protein composition remains uncertain. In this work, reductive alkylation of ESM followed by proteolytic digestion led to the identification of a cysteine rich ESM protein (abbreviated CREMP) that was similar to spore coat protein SP75 from cellular slime molds. Analysis of the cysteine repeats in partial sequences of CREMP reveals runs of remarkably repetitive patterns. Module a contains a C-X4-C-X5-C-X8-C-X6 pattern (where X represents intervening non-cysteine residues). These inter-cysteine amino acid residues are also strikingly conserved. The evolutionarily-related module b has the same cysteine spacing as a, but has 11 amino acid residues at its C-terminus. Different stretches of CREMP sequences in chicken genomic DNA fragments show diverse repeat patterns: e.g. all a modules; an alternation of a-b modules; or an a-b-b arrangement. Comparable CREMP proteins are found in contigs of the zebra finch (Taeniopygia guttata) and in the oviparous green anole lizard (Anolis carolinensis). In all these cases the long runs of highly conserved modular repeats have evidently led to difficulties in the assembly of full length DNA sequences. Hence the number, and the amino acid lengths, of CREMP proteins are currently unknown. A 118 amino acid fragment (representing an a-b-a-b pattern) from a chicken oviduct EST library expressed in Escherichia coli is a well folded, highly anisotropic, protein with a large chemical shift dispersion in 2D solution NMR spectra. Structure is completely lost on reduction of the 8 disulfide bonds of this protein fragment. Finally, solid state NMR spectra suggest a surprising degree of order in intact ESM fibers.
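The module a spacing described above, C-X4-C-X5-C-X8-C-X6 with X a non-cysteine residue, translates directly into a regular expression. The sketch below is only an illustration of how such repeats could be located in a protein sequence; the test sequence is synthetic (alanine stands in for every X) and is not a real CREMP fragment.

```python
import re

# C-X4-C-X5-C-X8-C-X6, where X is any residue other than cysteine.
MODULE_A = re.compile(r"C[^C]{4}C[^C]{5}C[^C]{8}C[^C]{6}")

def find_module_a(sequence):
    """Return (start, matched_text) for each non-overlapping module a match."""
    return [(m.start(), m.group()) for m in MODULE_A.finditer(sequence)]

# Synthetic example: two back-to-back modules after a two-residue prefix.
module = "C" + "A" * 4 + "C" + "A" * 5 + "C" + "A" * 8 + "C" + "A" * 6
hits = find_module_a("GG" + module + module)
print([start for start, _ in hits])
```

Each match spans 27 residues (4 cysteines plus 23 intervening residues), so tandem repeats show up as start positions spaced 27 apart.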
Mitochondrial calcium uptake plays a central role in cell physiology by stimulating ATP production, shaping cytosolic calcium transients, and regulating cell death. The biophysical properties of mitochondrial calcium uptake have been studied in detail, but the underlying proteins remain elusive. Here, we utilize an integrative strategy to predict human genes involved in mitochondrial calcium entry based on clues from comparative physiology, evolutionary genomics, and organelle proteomics. RNA interference against 13 top candidates highlighted one gene that we now call mitochondrial calcium uptake 1 (MICU1). Silencing MICU1 does not disrupt mitochondrial respiration or membrane potential but abolishes mitochondrial calcium entry in intact and permeabilized cells, and attenuates the metabolic coupling between cytosolic calcium transients and activation of matrix dehydrogenases. MICU1 is associated with the organelle’s inner membrane and has two canonical EF hands that are essential for its activity, suggesting a role in calcium sensing. MICU1 represents the founding member of a set of proteins required for high capacity mitochondrial calcium entry. Its discovery may lead to the complete molecular characterization of mitochondrial calcium uptake pathways, and offers genetic strategies for understanding their contribution to normal physiology and disease.
Quiescin-sulfhydryl oxidase (QSOX) flavoenzymes catalyze the direct, facile insertion of disulfide bonds into reduced unfolded proteins with the reduction of oxygen to hydrogen peroxide. To date, only QSOXs from vertebrates have been characterized enzymatically. These metazoan sulfhydryl oxidases have 4 recognizable domains: a redox-active thioredoxin (Trx) domain containing the first of three CxxC motifs (CI-CII), a second Trx domain with no obvious redox-active disulfide, a helix-rich domain, and then an Erv/ALR domain. This last domain contains the FAD moiety, a proximal CIII-CIV disulfide and a third CxxC of unknown function (CV-CVI). Plant and protist QSOXs lack the second Trx domain, but otherwise appear to contain the same complement of redox centers. This work presents the first characterization of a single-Trx QSOX. Trypanosoma brucei QSOX was expressed in Escherichia coli using a synthetic gene and found to be a stable, monomeric, FAD-containing protein. Although evidently lacking an entire domain, TbQSOX shows catalytic activity and substrate specificity similar to the vertebrate QSOXs examined previously. Unfolded reduced proteins are more than 200-fold more effective substrates on a per-thiol basis than glutathione, and some 10-fold better than the parasite bis-glutathione analog, trypanothione. These data are consistent with a role for the protist QSOX in oxidative protein folding. Site-directed mutagenesis of each of the 6 cysteine residues (to serines) shows that the CxxC motif in the single Trx domain is crucial for efficient catalysis of the oxidation of both reduced RNase and the model substrate dithiothreitol. As expected, the proximal disulfide CIII-CIV, which interacts with the flavin, is catalytically crucial. However, as observed with human QSOX1, the third CxxC motif shows no obvious catalytic role during the in vitro oxidation of reduced RNase or dithiothreitol.
Pre-steady state kinetics demonstrates that turnover in TbQSOX is limited by an internal redox step leading to 2-electron reduction of the FAD cofactor. In sum, the single-Trx domain QSOX studied here shows a striking similarity in enzymatic behavior to its double-Trx metazoan counterparts.
Assembling genomic sequences from a set of overlapping reads is one of the most fundamental problems in computational biology. Algorithms addressing the assembly problem fall into two broad categories based on the data structures they employ. The first class uses an overlap/string graph and the second uses a de Bruijn graph. With the recent advances in short read sequencing technology, de Bruijn graph based algorithms seem to play a vital role in practice. Efficient algorithms for building these massive de Bruijn graphs are essential in large sequencing projects based on short reads. In an earlier work, an O(n/p) time parallel algorithm was given for this problem, where n is the size of the input and p is the number of processors. This algorithm enumerates all possible bi-directed edges that can overlap with a node and ends up generating Θ(nΣ) messages (Σ being the size of the alphabet).
In this paper we present a Θ(n/p) time parallel algorithm with a communication complexity equal to that of parallel sorting and insensitive to Σ. The generality of our algorithm makes it easy to extend to the out-of-core model, where it has an optimal I/O complexity of Θ((n log(n/B)) / (B log(M/B))) (M being the main memory size and B being the size of the disk block). We demonstrate the scalability of our parallel algorithm on an SGI/Altix computer. A comparison with previous approaches reveals that our algorithm is faster, both asymptotically and in practice. We demonstrate the scalability of our sequential out-of-core algorithm by comparing it with the algorithm used by VELVET to build the bi-directed de Bruijn graph. Our experiments reveal that our algorithm can build the graph with a constant amount of memory, which clearly outperforms VELVET. We also provide efficient algorithms for the bi-directed chain compaction problem.
The bi-directed de Bruijn graph is a fundamental data structure for any sequence assembly program based on the Eulerian approach. Our algorithms for constructing bi-directed de Bruijn graphs are efficient in parallel and out-of-core settings. These algorithms can be used to build large-scale bi-directed de Bruijn graphs. Furthermore, our algorithms do not employ any all-to-all communications in a parallel setting and perform better than the prior algorithms. Finally, our out-of-core algorithm is extremely memory efficient and can replace the existing graph construction algorithm in VELVET.
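As a point of reference for the data structure itself, the sketch below builds a small bi-directed de Bruijn graph in memory. It is not the parallel or out-of-core algorithm described above: it assumes the common convention that a node is the lexicographically smaller of a k-mer and its reverse complement, records each (k+1)-mer as an edge with an orientation sign at both ends, and removes duplicates by sorting. The function names and the edge encoding are illustrative assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
FLIP = {"+": "-", "-": "+"}

def revcomp(s):
    """Reverse complement of a DNA string."""
    return s.translate(COMPLEMENT)[::-1]

def canonical(kmer):
    """Return (node, sign): the smaller of kmer and its reverse complement,
    with '+' if kmer itself is the node and '-' if its reverse complement is."""
    rc = revcomp(kmer)
    return (kmer, "+") if kmer <= rc else (rc, "-")

def bidirected_edges(reads, k):
    """Collect the distinct bi-directed edges induced by all (k+1)-mers."""
    edges = []
    for read in reads:
        for i in range(len(read) - k):
            u, u_sign = canonical(read[i:i + k])
            v, v_sign = canonical(read[i + 1:i + k + 1])
            edge = (u, u_sign, v, v_sign)
            # The same edge read from the opposite strand runs v -> u
            # with both orientation signs flipped; keep one representative.
            mirror = (v, FLIP[v_sign], u, FLIP[u_sign])
            edges.append(min(edge, mirror))
    return sorted(set(edges))

print(bidirected_edges(["ACGTT"], 3))
```

Because each edge is normalized against its mirror image, a read and its reverse complement yield the same edge set, which is the property that lets both strands share one graph.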
Mitochondrial dysfunction has been observed in skeletal muscle of people with diabetes and insulin-resistant individuals. Furthermore, inherited mutations in mitochondrial DNA can cause a rare form of diabetes. However, it is unclear whether mitochondrial dysfunction is a primary cause of the common form of diabetes. To date, common genetic variants robustly associated with type 2 diabetes (T2D) are not known to affect mitochondrial function. One possibility is that multiple mitochondrial genes contain modest genetic effects that collectively influence T2D risk. To test this hypothesis we developed a method named Meta-Analysis Gene-set Enrichment of variaNT Associations (MAGENTA; http://www.broadinstitute.org/mpg/magenta). MAGENTA, in analogy to Gene Set Enrichment Analysis, tests whether sets of functionally related genes are enriched for associations with a polygenic disease or trait. MAGENTA was specifically designed to exploit the statistical power of large genome-wide association (GWA) study meta-analyses whose individual genotypes are not available. This is achieved by combining variant association p-values into gene scores and then correcting for confounders, such as gene size, variant number, and linkage disequilibrium properties. Using simulations, we determined the range of parameters for which MAGENTA can detect associations likely missed by single-marker analysis. We verified MAGENTA's performance on empirical data by identifying known relevant pathways in lipid and lipoprotein GWA meta-analyses. We then tested our mitochondrial hypothesis by applying MAGENTA to three gene sets: nuclear regulators of mitochondrial genes, oxidative phosphorylation genes, and ∼1,000 nuclear-encoded mitochondrial genes. The analysis was performed using the most recent T2D GWA meta-analysis of 47,117 people and meta-analyses of seven diabetes-related glycemic traits (up to 46,186 non-diabetic individuals). 
This well-powered analysis found no significant enrichment of associations to T2D or any of the glycemic traits in any of the gene sets tested. These results suggest that common variants affecting nuclear-encoded mitochondrial genes have at most a small genetic contribution to T2D susceptibility.
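The two-stage logic described above (variant p-values are combined into gene scores, then a gene set is tested for enrichment) can be sketched as follows. This is a simplified illustration and not the MAGENTA implementation: it assumes the gene score is the smallest variant p-value in the gene, omits the correction for gene size, variant number, and linkage disequilibrium, and assumes enrichment is judged by counting set members among the top-scoring fraction of genes against random gene sets of the same size. All names, the cutoff, and the permutation count are hypothetical.

```python
import random

def gene_scores(variant_pvalues):
    """Map each gene to the smallest p-value among its variants (uncorrected)."""
    return {gene: min(pvals) for gene, pvals in variant_pvalues.items()}

def enrichment_pvalue(scores, gene_set, cutoff_fraction=0.05, n_perm=10000, seed=0):
    """Permutation test for over-representation of gene_set among top-scoring genes.

    Returns (observed_count, empirical_p), where observed_count is the number
    of set members in the top cutoff_fraction of genes by score.
    """
    genes = sorted(scores)
    ranked = sorted(genes, key=lambda g: scores[g])
    n_top = max(1, int(len(genes) * cutoff_fraction))
    top = set(ranked[:n_top])
    members = [g for g in gene_set if g in scores]
    observed = sum(g in top for g in members)
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(genes, len(members))
        if sum(g in top for g in sample) >= observed:
            hits += 1
    # Add-one correction keeps the empirical p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical data: 100 genes with evenly spaced scores.
scores = {f"g{i}": (i + 1) / 100 for i in range(100)}
print(enrichment_pvalue(scores, [f"g{i}" for i in range(5)], n_perm=2000))
```

A set-level test of this kind can detect many modest effects that no single variant reaches on its own, which is why a null result from a well-powered analysis is informative.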
Mitochondria play a crucial role in metabolic homeostasis, and alteration of mitochondrial function is a hallmark of diabetes. While mitochondrial activity is reduced in people with diabetes, it is unclear whether mitochondrial dysfunction is a cause or effect of type 2 diabetes. Genome-wide association studies for type 2 diabetes have explained ≈10% of the heritability of the disease, but none of the loci are known to affect mitochondrial activity. It is possible though that a mitochondrial contribution is hidden in the remaining 90%. Hence, we tested the hypothesis that multiple mitochondria-related genes encoded in the nucleus, each having a weak effect (hard to detect individually), can collectively influence type 2 diabetes. To address this, we developed a computational method (MAGENTA) that allowed us to adequately analyze large collective datasets of human genetic variation obtained from collaborative studies of type 2 diabetes and related glycemic traits. Despite the increased sensitivity of MAGENTA compared to single-DNA variant analysis, we found no support for a causal relationship between mitochondrial dysfunction and type 2 diabetes. These results may help steer future efforts in understanding the pathogenesis of the disease. MAGENTA is broadly applicable to testing associations between other biological pathways and common diseases or traits.