The EcoCyc database is an online scientific database which provides an integrated view of the metabolic and regulatory network of the bacterium Escherichia coli K-12 and facilitates computational exploration of this important model organism. We have analysed the occurrence of dead end metabolites within the database – these are metabolites which lack the requisite reactions (either metabolic or transport) that would account for their production or consumption within the metabolic network. 127 dead end metabolites were identified from the 995 compounds that are contained within the EcoCyc metabolic network. Their presence reflects either a deficit in our representation of the network or in our knowledge of E. coli metabolism. Extensive literature searches resulted in the addition of 38 transport reactions and 3 metabolic reactions to the database and led to an improved representation of the pathway for Vitamin B12 salvage. 39 dead end metabolites were identified as components of reactions that are not physiologically relevant to E. coli K-12 – these reactions are properties of purified enzymes in vitro that would not be expected to occur in vivo. Our analysis led to improvements in the software that underpins the database and to the program that finds dead end metabolites within EcoCyc. The remaining dead end metabolites in the EcoCyc database likely represent deficiencies in our knowledge of E. coli metabolism.
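The definition of a dead end metabolite given above can be illustrated with a small sketch. This is not the EcoCyc program itself; the reaction encoding, the function name `find_dead_ends`, and the toy network are assumptions made purely for illustration.

```python
def find_dead_ends(reactions):
    """Return metabolites that are only produced or only consumed.

    `reactions` maps a reaction name to (substrates, products, reversible).
    This encoding is hypothetical, not the EcoCyc schema.
    """
    produced = set()
    consumed = set()
    for substrates, products, reversible in reactions.values():
        consumed.update(substrates)
        produced.update(products)
        if reversible:
            # A reversible reaction (metabolic or transport) can run either way.
            consumed.update(products)
            produced.update(substrates)
    all_metabolites = produced | consumed
    return sorted(
        m for m in all_metabolites
        if not (m in produced and m in consumed)
    )


# Invented toy network: C is produced by R2 but nothing consumes it.
toy_network = {
    "R1": (["A"], ["B"], False),
    "R2": (["B"], ["C"], False),
    "T1": (["A_ext"], ["A"], True),  # reversible transport reaction
}
dead_ends = find_dead_ends(toy_network)
```

In this toy case, adding a transport or metabolic reaction that consumes C would remove it from the list, which mirrors the kind of curation step described in the abstract.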
Biological systems exhibit two structural features on many levels of organization: sparseness, in which only a small fraction of possible interactions between components actually occur; and modularity, the near decomposability of the system into modules with distinct functionality. Recent work suggests that modularity can evolve in a variety of circumstances, including goals that vary in time such that they share the same subgoals (modularly varying goals), or when connections are costly. Here, we studied the origin of modularity and sparseness, focusing on the nature of the mutation process rather than on connection cost or variations in the goal. We used simulations of evolution with different mutation rules. We found that commonly used sum-rule mutations, in which interactions are mutated by adding random numbers, do not lead to modularity or sparseness except in special situations. In contrast, product-rule mutations, in which interactions are mutated by multiplying by random numbers (a better model for the effects of biological mutations), led to sparseness naturally. When the goals of evolution are modular, in the sense that specific groups of inputs affect specific groups of outputs, product-rule mutations also lead to modular structure; sum-rule mutations do not. Product-rule mutations generate sparseness and modularity because they tend to reduce interactions, and to keep small interaction terms small.
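The two mutation rules can be contrasted in a few lines. This is a minimal sketch, not the simulation used in the study: the Gaussian distributions and the `sigma` value are assumptions, since the abstract only says that random numbers are added or multiplied.

```python
import random


def sum_rule(weight, rng, sigma=0.1):
    """Sum-rule mutation: add a random number to the interaction strength."""
    # Gaussian noise centred on 0 is an assumed choice of distribution.
    return weight + rng.gauss(0.0, sigma)


def product_rule(weight, rng, sigma=0.1):
    """Product-rule mutation: multiply the interaction strength by a random number."""
    # Gaussian factor centred on 1 is an assumed choice of distribution.
    return weight * rng.gauss(1.0, sigma)


rng = random.Random(42)
absent_interaction = 0.0

# Under the sum rule an absent interaction becomes non-zero after one mutation.
after_sum = sum_rule(absent_interaction, rng)

# Under the product rule an absent interaction stays exactly zero,
# and a small interaction stays small.
after_product = product_rule(absent_interaction, rng)
small_after_product = product_rule(1e-6, rng)
```

The sketch only shows why multiplicative changes preserve zeros and keep small terms small; it says nothing about selection, which the study handles through its evolutionary simulations.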
The potato rot nematode, Ditylenchus destructor, is a very destructive nematode pest of many agriculturally important crops worldwide, but the molecular characterization of its parasitism of plants has been limited. The effectors involved in plant parasitism by several sedentary endo-parasitic nematodes, such as Heterodera glycines, Globodera rostochiensis and Meloidogyne incognita, have been identified and extensively studied over the past two decades. Ditylenchus destructor, as a migratory plant-parasitic nematode, has a different feeding behavior, life cycle and host response. Comparing the transcriptome and parasitome among different types of plant-parasitic nematodes is the way to understand more fully the parasitic mechanism of plant nematodes. We undertook the approach of sequencing expressed sequence tags (ESTs) derived from a mixed-stage cDNA library of D. destructor. This is the first study of D. destructor ESTs. A total of 9800 ESTs were grouped into 5008 clusters including 3606 singletons and 1402 multi-member contigs, representing a catalog of D. destructor genes. Implementing a bioinformatics workflow, we found that 1391 clusters have no match in the available gene database; 31 clusters only have similarities to genes identified from D. africanus, the most closely related species to D. destructor; 1991 clusters were annotated using Gene Ontology (GO); 1550 clusters were assigned enzyme commission (EC) numbers; and 1211 clusters were mapped to 181 KEGG biochemical pathways. 22 ESTs had similarities to reported nematode effectors. Interestingly, most of the effectors identified in this study are involved in host cell wall degradation or modification, such as 1,4-beta-glucanase, 1,3-beta-glucanase, pectate lyase, chitinases and expansin, or host defense suppression, such as calreticulin, annexin and venom allergen-like protein. This result implies that the migratory plant-parasitic nematode D. destructor secretes similar effectors to those of sedentary plant nematodes. Finally, we further characterized the two D. destructor expansin proteins.
Macrotermitinae (fungus-cultivating termites) are major decomposers in tropical and subtropical areas of Asia and Africa. They have specifically evolved mutualistic associations with both a Termitomyces fungus on the nest and a gut microbiota, providing a model system for probing host-microbe interactions. Yet the symbiotic roles of gut microbes residing in their major feeding caste remain largely undefined. Here, by pyrosequencing the whole gut metagenome of adult workers of a fungus-cultivating termite (Odontotermes yunnanensis), we showed that it harbors a broad set of genes or gene modules encoding carbohydrate-active enzymes (CAZymes) relevant to plant fiber degradation, particularly debranching enzymes and oligosaccharide-processing enzymes. It also contained a considerable number of genes encoding chitinases and glycoprotein oligosaccharide-processing enzymes for fungal cell wall degradation. To investigate the metabolic divergence of higher termites of different feeding guilds, a SEED subsystem-based gene-centric comparative analysis of the data with that of a previously sequenced wood-feeding Nasutitermes hindgut microbiome was also attempted, revealing that SEED classifications of nitrogen metabolism, and motility and chemotaxis were significantly overrepresented in the wood-feeder hindgut metagenome, while Bacteroidales conjugative transposons and subsystems related to central aromatic compounds metabolism were apparently overrepresented here. This work fills gaps in our understanding of the functional capacities of fungus-cultivating termite gut microbiota, especially their roles in the symbiotic digestion of lignocelluloses and utilization of fungal biomass, both of which add substantially to the existing understanding of this peculiar symbiosis.
The protozoan Trypanosoma brucei causes African Trypanosomiasis or sleeping sickness in humans, which can be lethal if untreated. Most available pharmacological treatments for the disease have severe side-effects. The purpose of this analysis was to detect novel protein-protein interactions (PPIs), vital for the parasite, which could lead to the development of drugs that block these specific interactions. In this work, the Domain Fusion Analysis (Rosetta Stone method) was used to identify novel PPIs, by comparing T. brucei to 19 organisms covering all major lineages of the tree of life. Overall, 49 possible protein-protein interactions were detected, and classified based on (a) statistical significance (BLAST e-value, domain length etc.), (b) their involvement in crucial metabolic pathways, and (c) their evolutionary history, particularly focusing on whether a protein pair is split in T. brucei and fused in the human host. We also evaluated fusion events including hypothetical proteins, and suggest a possible molecular function or involvement in a certain biological process. This work has produced valuable results that could be further studied through structural biology or other experimental approaches to validate the protein-protein interactions proposed here. The evolutionary analysis of the proteins involved showed that gene fusion or gene fission events can happen in all organisms, while some protein domains are more prone to fusion and fission events and present complex evolutionary patterns.
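The core idea of domain fusion analysis is that two separate proteins in one organism are predicted to interact when both match distinct regions of a single fused protein in another organism. It can be sketched as follows. The hit tuples, protein names and the simple non-overlap test are illustrative assumptions; the study's actual filtering by BLAST e-value and domain length is not reproduced here.

```python
from itertools import combinations


def rosetta_stone_pairs(hits):
    """Predict interacting pairs from similarity hits against reference proteins.

    `hits` is a list of (query_protein, reference_protein, ref_start, ref_end),
    a hypothetical simplification of parsed similarity-search output.
    """
    by_reference = {}
    for query, ref, start, end in hits:
        by_reference.setdefault(ref, []).append((query, start, end))

    pairs = set()
    for ref, matches in by_reference.items():
        for (q1, s1, e1), (q2, s2, e2) in combinations(matches, 2):
            if q1 == q2:
                continue
            # Keep only pairs that align to non-overlapping regions of the
            # same reference protein (the candidate "fused" protein).
            if e1 < s2 or e2 < s1:
                first, second = sorted((q1, q2))
                pairs.add((first, second, ref))
    return sorted(pairs)


# Invented example: TbA and TbB cover separate halves of HsFused.
toy_hits = [
    ("TbA", "HsFused", 1, 200),
    ("TbB", "HsFused", 220, 450),
    ("TbC", "HsFused", 150, 300),  # overlaps both TbA and TbB, so no pair
    ("TbD", "HsOther", 1, 100),
]
predicted = rosetta_stone_pairs(toy_hits)
```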
Experimental variance is a major challenge when dealing with high-throughput sequencing data. This variance has several sources: sampling replication, technical replication, variability within biological conditions, and variability between biological conditions. The high per-sample cost of RNA-Seq often precludes the large number of experiments needed to partition observed variance into these categories as per standard ANOVA models. We show that the partitioning of within-condition to between-condition variation cannot reasonably be ignored, whether in single-organism RNA-Seq or in Meta-RNA-Seq experiments, and further find that commonly-used RNA-Seq analysis tools, as described in the literature, do not enforce the constraint that the sum of relative expression levels must be one, and thus report expression levels that are systematically distorted. These two factors lead to misleading inferences if not properly accommodated. As it is usually only the biological between-condition and within-condition differences that are of interest, we developed ALDEx, an ANOVA-like differential expression procedure, to identify genes with greater between- to within-condition differences. We show that the presence of differential expression and the magnitude of these comparative differences can be reasonably estimated with even very small sample sizes.
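Two ideas from this abstract can be made concrete: relative expression levels within a sample must sum to one, and a gene is of interest when its between-condition difference is large relative to its within-condition difference. The sketch below is not the ALDEx procedure. The log transform, the range-based within-condition measure, the function names and the read counts are all assumptions chosen for brevity.

```python
import math
from statistics import mean


def to_proportions(counts):
    """Convert one sample's read counts to relative expression levels (sum to 1)."""
    total = sum(counts)
    return [c / total for c in counts]


def between_within_ratio(cond_a, cond_b):
    """Ratio of between-condition to within-condition difference for one gene.

    The within-condition difference is taken as the larger replicate range,
    which is an assumed simplification.
    """
    between = abs(mean(cond_a) - mean(cond_b))
    within = max(max(cond_a) - min(cond_a), max(cond_b) - min(cond_b))
    return between / within if within > 0 else math.inf


# Invented read counts: rows are replicate samples, columns are genes.
condition_a = [[100, 50, 850], [120, 45, 835]]
condition_b = [[400, 52, 548], [380, 48, 572]]

log_a = [[math.log(p) for p in to_proportions(s)] for s in condition_a]
log_b = [[math.log(p) for p in to_proportions(s)] for s in condition_b]

n_genes = len(condition_a[0])
ratios = [
    between_within_ratio([s[g] for s in log_a], [s[g] for s in log_b])
    for g in range(n_genes)
]
```

Because the proportions are constrained to sum to one, a large change in one gene necessarily shifts the relative levels of the others, which is the distortion the abstract warns about.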
The bony shell of the turtle is an evolutionary novelty not found in any other group of animals; however, research into its formation has suggested that it has evolved through modification of conserved developmental mechanisms. Although these mechanisms have been extensively characterized in model organisms, the tools for characterizing them in non-model organisms such as turtles have been limited by a lack of genomic resources. We have used a next-generation sequencing approach to generate and assemble a transcriptome from stage 14 and 17 Trachemys scripta embryos, stages during which important events in shell development are known to take place. The transcriptome consists of 231,876 sequences with an N50 of 1,166 bp. GO terms and EC codes were assigned to the 61,643 unique predicted proteins identified in the transcriptome sequences. All major GO categories and metabolic pathways are represented in the transcriptome. Transcriptome sequences were used to amplify several cDNA fragments designed for use as RNA in situ probes. One of these, BMP5, was hybridized to a T. scripta embryo and exhibits both conserved and novel expression patterns. The transcriptome sequences should be of broad use for understanding the evolution and development of the turtle shell and for annotating any future T. scripta genome sequences.
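The N50 statistic quoted above is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled bases. A minimal sketch of the calculation, using invented lengths rather than the T. scripta assembly:

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half the total.
        if running >= half_total:
            return length


# Invented contig lengths (bp): total 1500, so half is 750.
toy_lengths = [100, 200, 300, 400, 500]
toy_n50 = n50(toy_lengths)  # 500 + 400 = 900 >= 750, so N50 is 400
```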
The giant virus Mimiviridae family includes 3 groups of viruses: group A (includes Acanthamoeba polyphaga Mimivirus), group B (includes Moumouvirus) and group C (includes Megavirus chilensis). Virophages have been isolated with both group A Mimiviridae (the Mamavirus strain) and the related Cafeteria roenbergensis virus, and they have also been described by bioinformatic analysis of the Phycodnavirus. Here, we found that the first two strains of virophages isolated with group A Mimiviridae can multiply easily in groups B and C and play a role in gene transfer among these virus subgroups. To isolate new virophages and their Mimiviridae host in the environment, we used PCR to identify a sample with a virophage and a group C Mimiviridae that failed to grow on amoeba. Moreover, we showed that virophages reduce the pathogenic effect of Mimivirus (plaque formation), establishing its parasitic role on Mimivirus. We therefore developed a co-culture procedure using Acanthamoeba polyphaga and Mimivirus to recover the detected virophage and then sequenced the virophage's genome. We present this technique as a novel approach to isolating virophages. We demonstrated that the newly identified virophages replicate in the viral factories of all three groups of Mimiviridae, suggesting that the spectrum of virophages is not limited to their initial host.
The high throughput and cost-effectiveness afforded by short-read sequencing technologies, in principle, enable researchers to perform 16S rRNA profiling of complex microbial communities at unprecedented depth and resolution. Existing Illumina sequencing protocols are, however, limited by the fraction of the 16S rRNA gene that is interrogated and therefore limit the resolution and quality of the profiling. To address this, we present the design of a novel protocol for shotgun Illumina sequencing of the bacterial 16S rRNA gene, optimized to amplify more than 90% of sequences in the Greengenes database and with the ability to distinguish nearly twice as many species-level OTUs compared to existing protocols. Using several in silico and experimental datasets, we demonstrate that despite the presence of multiple variable and conserved regions, the resulting shotgun sequences can be used to accurately quantify the constituents of complex microbial communities. The reconstruction of a significant fraction of the 16S rRNA gene also enabled high precision (>90%) in species-level identification thereby opening up potential application of this approach for clinical microbial characterization.
The ascomycete fungus Ophiostoma ulmi was responsible for the initial pandemic of the massively destructive Dutch elm disease in Europe and North America in early 1910. Dutch elm disease has ravaged the elm tree population globally and is a major threat to the remaining elm population. O. ulmi is also associated with valuable biomaterials applications. It was recently discovered that proteins from O. ulmi can be used for efficient transformation of amylose in the production of bioplastics.
We have sequenced the 31.5 Mb genome of O. ulmi using Illumina next-generation sequencing. Applying both de novo and comparative genome annotation methods, we predict a total of 8639 gene models. The quality of the predicted genes was validated using a variety of data sources consisting of EST data, mRNA-seq data and orthologs from related fungal species. Sequence-based computational methods were used to identify candidate virulence-related genes. Metabolic pathways were reconstructed and highlight specific enzymes that may play a role in virulence.
This genome sequence will be a useful resource for further research aimed at understanding the molecular mechanisms of pathogenicity by O. ulmi. It will also facilitate the identification of enzymes necessary for industrial biotransformation applications.
Effective comparative analysis of microbial genomes requires a consistent and complete view of biological data. Consistency regards the biological coherence of annotations, while completeness regards the extent and coverage of functional characterization for genomes. We have developed tools that allow scientists to assess and improve the consistency and completeness of microbial genome annotations in the context of the Integrated Microbial Genomes (IMG) family of systems. All publicly available microbial genomes are characterized in IMG using different functional annotation and pathway resources, thus providing a comprehensive framework for identifying and resolving annotation discrepancies. A rule-based system for predicting phenotypes in IMG provides a powerful mechanism for validating functional annotations, whereby the phenotypic traits of an organism are inferred based on the presence of certain metabolic reactions and pathways and compared to experimentally observed phenotypes. The IMG family of systems is available at http://img.jgi.doe.gov/.
Long-term memories are thought to depend upon the coordinated activation of a broad network of cortical and subcortical brain regions. However, the distributed nature of this representation has made it challenging to define the neural elements of the memory trace, and lesion and electrophysiological approaches provide only a narrow window into what is appreciated to be a much more global network. Here we used a global mapping approach to identify networks of brain regions activated following recall of long-term fear memories in mice. Analysis of Fos expression across 84 brain regions allowed us to identify regions that were co-active following memory recall. These analyses revealed that the functional organization of long-term fear memories depends on memory age and is altered in mutant mice that exhibit premature forgetting. Most importantly, these analyses indicate that long-term memory recall engages a network that has a distinct thalamic-hippocampal-cortical signature. This network is concurrently integrated and segregated and therefore has small-world properties, and contains hub-like regions in the prefrontal cortex and thalamus that may play privileged roles in memory expression.
Memory retrieval is thought to involve the coordinated activation of multiple regions of the brain, rather than localized activity in a specific region. In order to visualize networks of brain regions activated by recall of a fear memory in mice, we quantified expression of an activity-regulated gene (c-fos) that is induced by neural activity. This allowed us to identify collections of brain regions where Fos expression co-varies across mice, and presumably form components of a network that are co-active during recall of long-term fear memory. This analysis suggested that expression of a long-term fear memory is an emergent property of large scale neural network interactions. This network has a distinct thalamic-hippocampal-cortical signature and, like many real-world networks as well as other anatomical and functional brain networks, has small-world architecture with a subset of highly-connected hub nodes that may play more central roles in memory expression.
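Identifying regions whose Fos expression co-varies across mice can be sketched as a thresholded correlation network. The Pearson correlation, the 0.8 threshold, the region names and the expression values below are all assumptions for illustration; the study's own statistical criteria are not stated in the abstract.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of measurements."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def coactivation_edges(fos_by_region, threshold=0.8):
    """Connect two regions when their Fos counts co-vary strongly across mice."""
    edges = []
    for r1, r2 in combinations(sorted(fos_by_region), 2):
        if pearson(fos_by_region[r1], fos_by_region[r2]) >= threshold:
            edges.append((r1, r2))
    return edges


# Invented Fos counts: one value per mouse (four mice) for each region.
toy_fos = {
    "hippocampus": [10, 20, 30, 40],
    "thalamus": [12, 19, 33, 41],
    "cerebellum": [30, 10, 25, 15],
}
edges = coactivation_edges(toy_fos)
```

Graph measures such as node degree (to find hub-like regions) would then be computed on the resulting edge list.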
The trypanosomatid parasites Leishmania braziliensis, Leishmania major and Leishmania infantum are important human pathogens. Despite years of study and genome availability, an effective vaccine has not yet been developed, and chemotherapy is highly toxic. It is therefore clear that only interdisciplinary, integrated studies will succeed in the search for new targets for vaccine and drug development. An essential part of this rationale is the study of protein-protein interaction (PPI) networks, which can provide a better understanding of complex protein interactions in biological systems. Thus, we modeled PPIs for trypanosomatids through computational methods, using sequence comparison against public databases of protein or domain interactions for interaction prediction (Interolog Mapping), and developed a dedicated combined system score to address the robustness of the predictions. The confidence of the network prediction approach was evaluated using gold-standard positive and negative datasets, and the AUC value obtained was 0.94. As a result, 39,420, 43,531 and 45,235 interactions were predicted for L. braziliensis, L. major and L. infantum, respectively. For each predicted network, the top 20 proteins were ranked by the MCC topological index. In addition, information on immunological potential, degree of protein sequence conservation among orthologs and degree of identity compared to proteins of potential parasite hosts was integrated. This information integration provides a better understanding of the predicted networks and increases their usefulness for selecting new potential biological targets for drug and vaccine development. Network modularity analysis, which is key when one is interested in destabilizing PPIs for drug or vaccine purposes, was performed along with multiple alignments of the predicted PPIs, revealing patterns associated with protein turnover.
In addition, around 50% of the hypothetical proteins present in the networks received some degree of functional annotation, which represents an important contribution since approximately 60% of the predicted Leishmania proteome has no predicted function.
The cestode Echinococcus granulosus - the agent of cystic echinococcosis, a zoonosis affecting humans and domestic animals worldwide - is an excellent model for the study of host-parasite cross-talk that interfaces with two mammalian hosts. To develop the molecular analysis of these interactions, we carried out an EST survey of E. granulosus larval stages. We report the salient features of this study with a focus on genes reflecting physiological adaptations of different parasite stages.
We generated ∼10,000 ESTs from two sets of full-length enriched libraries (derived from oligo-capped and trans-spliced cDNAs) prepared with three parasite materials: hydatid cyst wall, larval worms (protoscoleces), and pepsin/H+-activated protoscoleces. The ESTs were clustered into 2700 distinct gene products. In the context of the biology of E. granulosus, our analyses reveal: (i) a diverse group of abundant long non-protein coding transcripts showing homology to a middle repetitive element (EgBRep) that could either be active molecular species or represent precursors of small RNAs (like piRNAs); (ii) an up-regulation of fermentative pathways in the tissue of the cyst wall; (iii) highly expressed thiol- and selenol-dependent antioxidant enzyme targets of thioredoxin glutathione reductase, the functional hub of redox metabolism in parasitic flatworms; (iv) candidate apomucins for the external layer of the tissue-dwelling hydatid cyst, a mucin-rich structure that is critical for survival in the intermediate host; (v) a set of tetraspanins, a protein family that appears to have expanded in the cestode lineage; and (vi) a set of platyhelminth-specific gene products that may offer targets for novel pan-platyhelminth drug development.
This survey has greatly increased the quality and the quantity of the molecular information on E. granulosus and constitutes a valuable resource for gene prediction on the parasite genome and for further genomic and proteomic analyses focused on cestodes and platyhelminths.
Cestodes are a neglected group of platyhelminth parasites, despite causing chronic infections to humans and domestic animals worldwide. We used Echinococcus granulosus as a model to study the molecular basis of the host-parasite cross-talk during cestode infections. For this purpose, we carried out a survey of the genes expressed by parasite larval stages interfacing with definitive and intermediate hosts. Sequencing from several high quality cDNA libraries provided numerous insights into the expression of genes involved in important aspects of E. granulosus biology, e.g. its metabolism (energy production and antioxidant defences) and the synthesis of key parasite structures (notably, the one exposed to humans and livestock intermediate hosts). Our results also uncovered the existence of an intriguing set of abundant repeat-associated non-protein coding transcripts that may participate in the regulation of gene expression in all surveyed stages. The dataset now generated constitutes a valuable resource for gene prediction on the parasite genome and for further genomic and proteomic studies focused on cestodes and platyhelminths. In particular, the detailed characterization of a range of newly discovered genes will contribute to a better understanding of the biology of cestode infections and, therefore, to the development of products allowing their efficient control.
The Toxoplasma gondii SRS gene superfamily is structurally related to SRS29B (formerly SAG1), a surface adhesin that binds host cells and stimulates host immunity. Comparative genomic analyses of three Toxoplasma strains identified 182 SRS genes distributed across 14 chromosomes at 57 genomic loci. Eight distinct SRS subfamilies were resolved. A core 69 functional gene orthologs were identified, and strain-specific expansions and pseudogenization were common. Gene expression profiling demonstrated differential expression of SRS genes in a developmental-stage- and strain-specific fashion and identified nine SRS genes as priority targets for gene deletion among the tissue-encysting coccidia. A Δsag1 ∆sag2A mutant was significantly attenuated in murine acute virulence and showed upregulated SRS29C (formerly SRS2) expression. Transgenic overexpression of SRS29C in the virulent RH parent was similarly attenuated. Together, these findings reveal SRS29C to be an important regulator of acute virulence in mice and demonstrate the power of integrated genomic analysis to guide experimental investigations.
Parasitic species employ large gene families to subvert host immunity to enable pathogen colonization and cause disease. Toxoplasma gondii contains a large surface coat gene superfamily that encodes adhesins and virulence factors that facilitate infection in susceptible hosts. We generated an integrated bioinformatic resource to predict which genes from within this 182-gene superfamily of adhesin-encoding genes play an essential role in the host-pathogen interaction. Targeted gene deletion experiments with predicted candidate surface antigens identified SRS29C as an important negative regulator of acute virulence in murine models of Toxoplasma infection. Our integrated computational and experimental approach provides a comprehensive framework, or road map, for the assembly and discovery of additional key pathogenesis genes contained within other large surface coat gene superfamilies from a broad array of eukaryotic pathogens.
Extracellular recombinant proteins are commonly produced using HEK293 cells as histidine-tagged proteins facilitating purification by immobilized metal affinity chromatography (IMAC). Based on gel analyses, this one-step purification typically produces proteins of high purity. Here, we analyzed the presence of TGF-β1 in such IMAC purifications using recombinant extracellular fibrillin-1 fragments as examples. Analysis of various purified recombinant fibrillin-1 fragments by ELISA consistently revealed the presence of picomolar concentrations of active and latent TGF-β1, but not of BMP-2. These quantities of TGF-β1 were not detectable by Western blotting and mass spectrometry. However, the amounts of TGF-β1 were sufficient to consistently trigger Smad2 phosphorylation in fibroblasts. The purification mechanism was analyzed to determine whether the presence of TGF-β1 in these protein preparations represents a specific or non-specific co-purification of TGF-β1 with fibrillin-1 fragments. Control purifications using conditioned medium from non-transfected 293 cells yielded similar amounts of TGF-β1 after IMAC. IMAC of purified TGF-β1 and the latency associated peptide showed that these proteins bound to the immobilized nickel ions. These data clearly demonstrate that TGF-β1 was co-purified by specific interactions with nickel, and not by specific interactions with fibrillin-1 fragments. Among various chromatographic methods tested for their ability to eliminate TGF-β1 from fibrillin-1 preparations, gel filtration under high salt conditions was highly effective. As various recombinant extracellular proteins purified in this fashion are frequently used for experiments that can be influenced by the presence of TGF-β1, these findings have far-reaching implications for the required chromatographic schemes and quality controls.
Cyanobacteria are an important group of photoautotrophic organisms that can synthesize valuable bio-products by harnessing solar energy. They are endowed with high photosynthetic efficiencies and diverse metabolic capabilities that confer the ability to convert solar energy into a variety of biofuels and their precursors. However, less well studied are the similarities and differences in metabolism of different species of cyanobacteria as they pertain to their suitability as microbial production chassis. Here we assemble, update and compare genome-scale models (iCyt773 and iSyn731) for two phylogenetically related cyanobacterial species, namely Cyanothece sp. ATCC 51142 and Synechocystis sp. PCC 6803. All reactions are elementally and charge balanced and localized into four different intracellular compartments (i.e., periplasm, cytosol, carboxysome and thylakoid lumen) and biomass descriptions are derived based on experimental measurements. Newly added reactions absent in earlier models (266 and 322, respectively) span most metabolic pathways with an emphasis on lipid biosynthesis. All thermodynamically infeasible loops are identified and eliminated from both models. Comparisons of model predictions against gene essentiality data reveal a specificity of 0.94 (94/100) and a sensitivity of 1 (19/19) for the Synechocystis iSyn731 model. The diurnal rhythm of Cyanothece 51142 metabolism is modeled by constructing separate (light/dark) biomass equations and introducing regulatory restrictions over light and dark phases. Specific metabolic pathway differences between the two cyanobacteria alluding to different bio-production potentials are reflected in both models.
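The specificity and sensitivity reported for iSyn731 follow from the standard definitions applied to the quoted counts (94 of 100 and 19 of 19). The sketch below only reproduces that arithmetic. Which gene class counts as "positive" in the essentiality comparison is not stated in the abstract, so the variable names are generic.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of actual positives that the model predicts correctly."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Fraction of actual negatives that the model predicts correctly."""
    return true_negatives / (true_negatives + false_positives)


# Counts implied by the abstract: 94 of 100 negatives and 19 of 19 positives correct.
isyn731_specificity = specificity(true_negatives=94, false_positives=6)
isyn731_sensitivity = sensitivity(true_positives=19, false_negatives=0)
```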
Most filarial parasites in the subfamilies Onchocercinae and Dirofilariinae depend on Wolbachia endobacteria to successfully carry out their life cycle. Recently published data indicate that the few Wolbachia-free species in these subfamilies were infected in the distant past and have subsequently shed their endosymbionts. We used an integrated transcriptomic and proteomic analysis of Onchocerca flexuosa to explore the molecular mechanisms that allow worms of this species to survive without a bacterial partner. Roche/454 sequencing of the adult transcriptome produced 16,814 isogroup and 47,252 singleton sequences that are estimated to represent approximately 41% of the complete gene set. Sequences similar to 97 Wolbachia genes were identified from the transcriptome, some of which appear on the same transcripts as sequences similar to nematode genes. Computationally predicted peptides, including those with similarity to Wolbachia proteins, were classified at the domain and pathway levels in order to assess the metabolic capabilities of O. flexuosa and compare against the Wolbachia-dependent model filaria, Brugia malayi. Transcript data further facilitated a shotgun proteomic analysis of O. flexuosa adult worm lysate, resulting in the identification of 1,803 proteins. Three of the peptides detected by mass spectroscopy map to two ABC transport-related proteins from Wolbachia. Antibodies raised to one of the Wolbachia-like peptides labeled a single 38 kDa band on Western blots of O. flexuosa lysate and stained specific worm tissues by immunohistology. Future studies will be required to determine the exact functions of Wolbachia-like peptides and proteins in O. flexuosa and to assess their roles in worm biology.
Elastin is a major structural component of elastic fibres that provide properties of stretch and recoil to tissues such as arteries, lung and skin. Remarkably, after initial deposition of elastin there is normally no subsequent turnover of this protein over the course of a lifetime. Consequently, elastic fibres must be extremely durable, able to withstand, for example in the human thoracic aorta, billions of cycles of stretch and recoil without mechanical failure. Major defects in the elastin gene (ELN) are associated with a number of disorders including Supravalvular aortic stenosis (SVAS), Williams-Beuren syndrome (WBS) and autosomal dominant cutis laxa (ADCL). Given the low turnover of elastin and the requirement for the long term durability of elastic fibres, we examined the possibility for more subtle polymorphisms in the human elastin gene to impact the assembly and long-term durability of the elastic matrix. Surveys of genetic variation resources identified 118 mutations in human ELN, 17 being non-synonymous. Introduction of two of these variants, G422S and K463R, in elastin-like polypeptides as well as full-length tropoelastin, resulted in changes in both their assembly and mechanical properties. Most notably G422S, which occurs in up to 40% of European populations, was found to enhance some elastomeric properties. These studies reveal that even apparently minor polymorphisms in human ELN can impact the assembly and mechanical properties of the elastic matrix, effects that over the course of a lifetime could result in altered susceptibility to cardiovascular disease.
We describe the reconstruction of a genome-scale metabolic model of the crenarchaeon Sulfolobus solfataricus, a hyperthermoacidophilic microorganism. It grows in terrestrial volcanic hot springs at pH 2–4 (optimum 3.5) and a temperature of 75–80°C (optimum 80°C). The genome of Sulfolobus solfataricus P2 contains 2,992,245 bp on a single circular chromosome and encodes 2,977 proteins and a number of RNAs. The network comprises 718 metabolic and 58 transport/exchange reactions and 705 unique metabolites, based on the annotated genome and available biochemical data. Using the model in conjunction with constraint-based methods, we simulated the metabolic fluxes induced by different environmental and genetic conditions. The predictions were compared to experimental measurements and phenotypes of S. solfataricus. Furthermore, the performance of the network for 35 different carbon sources known for S. solfataricus from the literature was simulated. Comparing the growth on different carbon sources revealed that glycerol is the carbon source with the highest biomass flux per imported carbon atom (75% higher than glucose). Experimental data were also used to fit the model to phenotypic observations. In addition to the commonly known heterotrophic growth of S. solfataricus, the crenarchaeon is also able to grow autotrophically using the hydroxypropionate-hydroxybutyrate cycle for bicarbonate fixation. We integrated this pathway into our model and compared bicarbonate fixation with growth on glucose as sole carbon source. Finally, we tested the robustness of the metabolism with respect to gene deletions using the method of Minimization of Metabolic Adjustment (MOMA), which predicted that 18% of all possible single gene deletions would be lethal for the organism.
Although there has been extensive debate about whether Trichuris suis and Trichuris trichiura are separate species, only one species of whipworm, T. trichiura, has been considered to infect humans and non-human primates. In order to investigate potential cross-infection of Trichuris sp. between baboons and humans in the Cape Peninsula, South Africa, we sequenced the ITS1-5.8S-ITS2 region of adult Trichuris sp. worms isolated from five baboons from three different troops, namely the Cape Peninsula troop, Groot Olifantsbos troop and Da Gama Park troop. This region was also sequenced from T. trichiura isolated from a human patient from central Africa (Cameroon) for comparison. By combining this dataset with GenBank records for Trichuris isolated from other humans, non-human primates and pigs from several different countries in Europe, Asia, and Africa, we confirmed the identification of two distinct Trichuris genotypes that infect primates. Trichuris sp. isolated from the Peninsula baboons fell into two distinct clades that were found to also infect human patients from Cameroon, Uganda and Jamaica (named the CP-GOB clade) and China, Thailand, the Czech Republic, and Uganda (named the DG clade), respectively. The divergence of these Trichuris clades is ancient and precedes the diversification of T. suis, which clustered closely with the CP-GOB clade. The identification of two distinct Trichuris genotypes infecting both humans and non-human primates is important for the ongoing treatment of Trichuris, which is estimated to infect 600 million people worldwide. Currently, baboons in the Cape Peninsula, which visit urban areas, pose a constant risk of infection to local communities. A reduction in spatial overlap between humans and baboons is thus an important measure to reduce both cross-transmission and zoonoses of helminths in Southern Africa.
Bacterial pathogens continue to threaten public health worldwide. Identification of bacterial virulence factors can help to find novel drug/vaccine targets against pathogenicity. It can also help to reveal the mechanisms of the related diseases at the molecular level. With the explosive growth in protein sequences generated in the postgenomic age, it is highly desirable to develop computational methods for rapidly and effectively identifying virulence factors according to their sequence information alone. In this study, based on the protein-protein interaction networks from the STRING database, a novel network-based method was proposed for identifying the virulence factors in the proteomes of UPEC 536, UPEC CFT073, P. aeruginosa PAO1, L. pneumophila Philadelphia 1, C. jejuni NCTC 11168 and M. tuberculosis H37Rv. Evaluated on the same benchmark datasets derived from the aforementioned species, the identification accuracies achieved by the network-based method were around 0.9, significantly higher than those achieved by sequence-based methods such as BLAST, feature selection and VirulentPred. Further analysis showed that functional associations such as gene neighborhood and co-occurrence were the primary associations between these virulence factors in the STRING database. The high success rates indicate that the network-based method is quite promising. The novel approach holds high potential because it can be easily extended to identify virulence factors in many other bacterial species, as long as the relevant statistical data are available for them.
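The abstract does not spell out the classification rule, so the following is only a minimal sketch of one common network-based scheme, assuming a weighted neighbour vote: a query protein is labelled by comparing the summed interaction scores of its known virulence-factor neighbours with those of its known non-virulence neighbours. The function names, protein identifiers and scores are hypothetical and are not taken from the study or from the STRING database.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected weighted adjacency map from (a, b, score) triples."""
    graph = defaultdict(dict)
    for a, b, score in edges:
        graph[a][b] = score
        graph[b][a] = score
    return graph


def predict_virulence(graph, known_labels, query):
    """Label `query` by a weighted vote of its labelled neighbours.

    known_labels maps protein id -> True (virulence factor) or False.
    Returns True or False, or None when no labelled neighbour exists.
    """
    votes_for = 0.0
    votes_against = 0.0
    for neighbour, score in graph.get(query, {}).items():
        if neighbour not in known_labels:
            continue  # unlabelled neighbours carry no vote
        if known_labels[neighbour]:
            votes_for += score
        else:
            votes_against += score
    if votes_for == 0.0 and votes_against == 0.0:
        return None
    return votes_for > votes_against


# Hypothetical toy network: interaction scores in [0, 1].
edges = [
    ("p1", "p2", 0.9),
    ("p1", "p3", 0.4),
    ("p4", "p3", 0.8),
    ("p4", "p2", 0.2),
]
labels = {"p2": True, "p3": False}
graph = build_graph(edges)

print(predict_virulence(graph, labels, "p1"))  # strongest evidence comes from p2
print(predict_virulence(graph, labels, "p4"))  # strongest evidence comes from p3
```

A vote of this kind only works for proteins that have at least one labelled interaction partner, which is one reason such methods depend on the coverage of the underlying interaction data.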
Alzheimer’s disease (AD) is a progressive neurodegenerative disease involving the alteration of gene expression at the whole genome level. Genome-wide transcriptional profiling of AD has been conducted by many groups on several relevant brain regions. However, identifying the most critical dysregulated genes has been challenging. In this work, we addressed this issue by deriving critical genes from perturbed subnetworks. Using a recent microarray dataset on six brain regions, we applied a heaviest induced subgraph algorithm with a modular scoring function to reveal the significantly perturbed subnetwork in each brain region. These perturbed subnetworks were found to overlap significantly with each other. Furthermore, the hub genes from these perturbed subnetworks formed a connected hub network consisting of 136 genes. Comparison between AD and several related diseases demonstrated that the hub network was robustly and specifically perturbed in AD. In addition, strong correlation between the expression level of these hub genes and indicators of AD severity suggested that this hub network can partially reflect AD progression. More importantly, this hub network reflected the adaptation of neurons to the AD-specific microenvironment through a variety of adjustments, including reduction of neuronal and synaptic activities and alteration of survival signaling. Therefore, it is potentially useful for the development of biomarkers and network medicine for AD.
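To illustrate what a heaviest induced subgraph search does, the sketch below uses a simple greedy peeling heuristic as a stand-in; it is not the algorithm or the modular scoring function used in the study. It assumes a score equal to the sum of node weights plus the sum of edge weights inside the chosen gene set, and it does not enforce connectivity. All gene names and weights are hypothetical.

```python
def subgraph_score(nodes, node_w, edge_w):
    """Score = node weights in `nodes` + weights of edges with both ends in `nodes`."""
    total = sum(node_w[n] for n in nodes)
    total += sum(w for (a, b), w in edge_w.items() if a in nodes and b in nodes)
    return total


def greedy_heaviest_subgraph(node_w, edge_w):
    """Greedy peeling: repeatedly drop the weakest node, keep the best set seen."""
    current = set(node_w)
    best_nodes = set(current)
    best_score = subgraph_score(current, node_w, edge_w)
    while len(current) > 1:

        def contribution(n):
            # Node weight plus weights of its edges to nodes still in `current`.
            incident = sum(
                w for (a, b), w in edge_w.items()
                if (a == n and b in current) or (b == n and a in current)
            )
            return node_w[n] + incident

        # sorted() makes tie-breaking deterministic.
        weakest = min(sorted(current), key=contribution)
        current.remove(weakest)
        score = subgraph_score(current, node_w, edge_w)
        if score > best_score:
            best_nodes, best_score = set(current), score
    return best_nodes, best_score


# Hypothetical perturbation scores for genes (nodes) and interactions (edges).
node_w = {"g1": 2.0, "g2": 1.5, "g3": 1.0, "g4": -3.0, "g5": -1.0}
edge_w = {
    ("g1", "g2"): 1.0,
    ("g2", "g3"): 0.5,
    ("g3", "g4"): 0.5,
    ("g4", "g5"): 0.2,
}

best_nodes, best_score = greedy_heaviest_subgraph(node_w, edge_w)
print(sorted(best_nodes), best_score)
```

Peeling is only a heuristic: it returns the best set encountered along one removal order and gives no optimality guarantee.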
Understanding cellular regulation of metabolism is a major challenge in systems biology. Thus far, the main assumption was that enzyme levels are key regulators in metabolic networks. However, regulation analysis recently showed that metabolism is rarely controlled via enzyme levels only, but through non-obvious combinations of hierarchical (gene and enzyme levels) and metabolic regulation (mass action and allosteric interaction). Quantitative analyses relating changes in metabolic fluxes to changes in transcript or protein levels have revealed a remarkable lack of understanding of the regulation of these networks. We study metabolic regulation via feasibility analysis (FA). Inspired by the constraint-based approach of Flux Balance Analysis, FA incorporates a model describing kinetic interactions between molecules. We enlarge the portfolio of cellular objectives by defining three main physiologically relevant ones: function, robustness and temporal responsiveness. We postulate that the cell assumes one or a combination of these objectives and search for the enzyme levels necessary to achieve them. We call the subspace of feasible enzyme levels the feasible enzyme space. Once this space is constructed, we can study how different objectives may (if possible) be combined, or evaluate the conditions under which the cells are faced with a trade-off among them. We apply FA to the experimental scenario of long-term carbon-limited chemostat cultivation of yeast cells, studying how metabolism evolves optimally. Cells employ a mixed strategy composed of increasing enzyme levels for glucose uptake and hexokinase and decreasing levels of the remaining enzymes. This trade-off renders the cells specialized in this low-carbon flux state to compete for the available glucose and get rid of overcapacity. Overall, we show that FA is a powerful tool for systems biologists to study regulation of metabolism, interpret experimental data and evaluate hypotheses.